feat(iteration-system): add knowledge base folder structure with seeded scanner files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Youssef Aitousarrah 2026-04-08 08:09:07 -07:00
parent ec2b3c2a45
commit 3fb82e8180
15 changed files with 210 additions and 0 deletions

@ -0,0 +1,20 @@
# Learnings Index
**Last analyzed run:** _(none yet — will be set by first /iterate run)_
| Domain | File | Last Updated | One-line Summary |
|--------|------|--------------|-----------------|
| options_flow | scanners/options_flow.md | — | No data yet |
| insider_buying | scanners/insider_buying.md | — | No data yet |
| volume_accumulation | scanners/volume_accumulation.md | — | No data yet |
| reddit_dd | scanners/reddit_dd.md | — | No data yet |
| reddit_trending | scanners/reddit_trending.md | — | No data yet |
| semantic_news | scanners/semantic_news.md | — | No data yet |
| market_movers | scanners/market_movers.md | — | No data yet |
| earnings_calendar | scanners/earnings_calendar.md | — | No data yet |
| analyst_upgrades | scanners/analyst_upgrades.md | — | No data yet |
| technical_breakout | scanners/technical_breakout.md | — | No data yet |
| sector_rotation | scanners/sector_rotation.md | — | No data yet |
| ml_signal | scanners/ml_signal.md | — | No data yet |
| minervini | scanners/minervini.md | — | No data yet |
| pipeline/scoring | pipeline/scoring.md | — | No data yet |

@ -0,0 +1,14 @@
# Pipeline Scoring & Ranking
## Current Understanding
LLM assigns a final_score (0-100) and confidence (1-10) to each candidate.
Score and confidence are correlated but not identical — a speculative setup
can score 80 with confidence 6. The ranker uses final_score as primary sort key.
No evidence yet on whether confidence or score is a better predictor of outcomes.
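
A minimal sketch of the ranking described above, assuming candidates are plain records carrying `final_score` and `confidence`. The `Candidate` class and the use of confidence as a tie-breaker are illustrative assumptions, not the pipeline's actual code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:  # hypothetical record shape
    ticker: str
    final_score: int  # 0-100, assigned by the LLM
    confidence: int   # 1-10, assigned by the LLM


def rank(candidates):
    # Primary key: final_score, descending. Breaking ties on confidence
    # is an assumption made for this sketch only.
    return sorted(candidates, key=lambda c: (-c.final_score, -c.confidence))


ranked = rank([
    Candidate("AAA", 80, 6),  # speculative setup: high score, middling confidence
    Candidate("BBB", 72, 9),
    Candidate("CCC", 80, 8),
])
print([c.ticker for c in ranked])
```

Swapping the two tuple elements in the sort key would give a confidence-first ranking, which is one way to compare the two predictors once outcome data exists.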
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Is confidence a better outcome predictor than final_score?
- [ ] Does a score threshold (e.g. only surfacing candidates with final_score >70) improve hit rate?

@ -0,0 +1,14 @@
# Analyst Upgrades Scanner
## Current Understanding
Detects analyst upgrades/price target increases. Most reliable when upgrade comes
from a top-tier firm (Goldman, Morgan Stanley, JPMorgan) and represents a meaningful
target increase (>15%). Short squeeze potential (high short interest) combined with
an upgrade is a historically strong setup.
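
The criteria above could be expressed as a simple filter. This is a sketch only: the function name and argument shapes are hypothetical, and the 20% short-interest cut-off is borrowed from the pending hypothesis rather than from evidence.

```python
TOP_TIER = {"Goldman", "Morgan Stanley", "JPMorgan"}  # firms named in the note


def is_strong_upgrade(firm, old_target, new_target, short_interest_pct=0.0):
    """Return (passes, squeeze_flag) under the thresholds in the note."""
    increase = (new_target - old_target) / old_target
    passes = firm in TOP_TIER and increase > 0.15
    # Squeeze flag: upgrade passes AND short interest is high (assumed >20%).
    squeeze_flag = passes and short_interest_pct > 20.0
    return passes, squeeze_flag
```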
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does analyst tier (bulge-bracket firm vs boutique) predict upgrade quality?
- [ ] Does short interest >20% combined with an upgrade produce outsized moves?

@ -0,0 +1,13 @@
# Earnings Calendar Scanner
## Current Understanding
Identifies stocks with earnings announcements in the next N days. Pre-earnings
setups work best when combined with options flow (IV expansion) or insider activity.
Standalone earnings calendar signal is too broad — nearly every stock has earnings
quarterly.
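
One way to narrow the calendar signal is to keep only tickers that another scanner has also flagged. The sketch below assumes a ticker-to-date mapping and a set of confirmed tickers; both shapes, the function name, and the default of 7 days for N are hypothetical.

```python
from datetime import date


def earnings_candidates(earnings_dates, confirmed_tickers, today, n_days=7):
    """Tickers reporting within n_days that another scanner also flagged."""
    return sorted(
        ticker
        for ticker, when in earnings_dates.items()
        if 0 <= (when - today).days <= n_days and ticker in confirmed_tickers
    )
```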
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does requiring options confirmation alongside earnings improve signal quality?

@ -0,0 +1,14 @@
# Insider Buying Scanner
## Current Understanding
Scrapes SEC Form 4 filings. CEO/CFO purchases >$100K are the most reliable signal.
Cluster detection (2+ insiders buying within 14 days) has historically been a
high-conviction setup. Transaction details (name, title, value) must be preserved
from scraper output and included in candidate context; dropping them loses signal clarity.
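
Cluster detection can be sketched as a sliding window over purchase dates. The input shape (a list of `(insider_name, purchase_date)` tuples for one ticker) is an assumption for illustration, not the scraper's actual output format.

```python
from datetime import date, timedelta


def has_cluster(purchases, window_days=14, min_insiders=2):
    """True if at least min_insiders distinct insiders bought within window_days."""
    ordered = sorted(purchases, key=lambda p: p[1])
    for i, (_, start) in enumerate(ordered):
        end = start + timedelta(days=window_days)
        # Distinct insiders buying in the window that opens at this purchase.
        names = {name for name, when in ordered[i:] if when <= end}
        if len(names) >= min_insiders:
            return True
    return False
```

Counting distinct names rather than filings means repeated purchases by the same insider do not register as a cluster.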
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does cluster detection (2+ insiders in 14 days) outperform single-insider signals?
- [ ] Is there a minimum transaction size below which signal quality degrades sharply?

@ -0,0 +1,13 @@
# Market Movers Scanner
## Current Understanding
Finds stocks that have already moved significantly. This is a reactive scanner —
it identifies momentum after it starts rather than predicting it. Useful for
continuation plays but not for early-stage entry. Best combined with volume
confirmation to distinguish breakouts from spikes.
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Is a volume confirmation filter (>1.5x average) useful for filtering out noise?

@ -0,0 +1,13 @@
# Minervini Scanner
## Current Understanding
Implements Mark Minervini's SEPA (Specific Entry Point Analysis) criteria: stage 2
uptrend, price above the 50/150/200-day SMAs with the averages stacked in order
(50 > 150 > 200), 52-week high proximity, RS line at new highs. Historically one of
the highest-conviction scanner setups.
Works best in bull market conditions; underperforms in choppy/bear markets.
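
A rough sketch of the moving-average and 52-week-high portion of these criteria. The function name is hypothetical, the 25% proximity threshold is an assumption (the note does not state one), and the RS-line check is omitted.

```python
def passes_trend_check(price, sma50, sma150, sma200, high_52w,
                       max_pct_below_high=25.0):
    # Price above all three averages, with the averages stacked 50 > 150 > 200.
    stacked = price > sma50 > sma150 > sma200
    # Distance below the 52-week high, in percent.
    pct_below_high = (high_52w - price) / high_52w * 100
    return stacked and pct_below_high <= max_pct_below_high
```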
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does adding a market condition filter (S&P 500 above 200 SMA) improve hit rate?

@ -0,0 +1,14 @@
# ML Signal Scanner
## Current Understanding
Uses a trained ML model to predict short-term price movement probability. Current
threshold of 35% win probability is worse than a coin flip — the model needs
retraining or the threshold needs raising to 55%+ to be useful. Signal quality
depends heavily on feature freshness; stale features degrade performance.
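
The effect of raising the threshold can be measured as a precision/recall trade-off. The helper below is a sketch; the probabilities and outcomes in the usage lines are made-up values for illustration, not results from the model.

```python
def precision_recall(probs, outcomes, threshold):
    """outcomes: 1 for a winning recommendation, 0 otherwise."""
    picked = [o for p, o in zip(probs, outcomes) if p >= threshold]
    total_wins = sum(outcomes)
    true_pos = sum(picked)
    precision = true_pos / len(picked) if picked else 0.0
    recall = true_pos / total_wins if total_wins else 0.0
    return precision, recall


# Illustrative inputs only.
probs = [0.30, 0.40, 0.50, 0.60, 0.70]
outcomes = [0, 0, 1, 1, 1]
low = precision_recall(probs, outcomes, 0.35)
high = precision_recall(probs, outcomes, 0.55)
```

On this toy data the higher threshold keeps fewer candidates but a larger share of them are winners, which is the pattern the first pending hypothesis asks about.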
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does raising the threshold to 55%+ improve precision at the cost of recall?
- [ ] Would retraining on the last 90 days of recommendations improve accuracy?

@ -0,0 +1,15 @@
# Options Flow Scanner
## Current Understanding
Scans for unusual options volume relative to open interest using Tradier API.
A put/call volume ratio below 0.1 (calls outnumbering puts by more than 10 to 1) is
a reliable bullish signal when combined with premium >$25K. The premium filter is
configured but must be explicitly applied.
Scanning only the nearest expiration misses institutional positioning in 30+ DTE
contracts — scanning up to 3 expirations improves signal quality.
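
A sketch of the ratio and premium filters applied together. It treats the ratio as put volume divided by call volume, so a low value means calls dominate; the function and parameter names are hypothetical and no Tradier API calls are shown.

```python
def is_bullish_flow(call_volume, put_volume, premium_usd,
                    max_put_call_ratio=0.1, min_premium_usd=25_000):
    if call_volume <= 0:
        return False  # no calls traded, so the ratio is undefined
    ratio = put_volume / call_volume
    # The premium filter is applied explicitly alongside the ratio check.
    return ratio < max_put_call_ratio and premium_usd > min_premium_usd
```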
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does scanning 3 expirations vs 1 meaningfully change hit rate?
- [ ] Is moneyness (ITM vs OTM) a useful signal filter?

@ -0,0 +1,14 @@
# Reddit DD Scanner
## Current Understanding
Scans r/investing, r/stocks, r/wallstreetbets for DD posts. LLM quality score is
computed but not used for filtering — using it (80+ = HIGH, 60-79 = MEDIUM, <60 = skip)
would reduce noise. Subreddit weighting matters: r/investing posts are more reliable
than r/pennystocks. Post title and LLM score should appear in candidate context.
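
The proposed score bands map directly onto a small helper. The function names and the context-line format are assumptions for illustration.

```python
def quality_tier(llm_score):
    """Map an LLM quality score to a tier; None means skip the post."""
    if llm_score >= 80:
        return "HIGH"
    if llm_score >= 60:
        return "MEDIUM"
    return None


def dd_context(title, llm_score):
    # Surface both the post title and the LLM score in candidate context.
    return f"{title} (LLM quality score: {llm_score})"
```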
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does filtering out posts with an LLM quality score below 60 meaningfully reduce false positives?
- [ ] Does subreddit weighting change hit rates?

@ -0,0 +1,12 @@
# Reddit Trending Scanner
## Current Understanding
Tracks mention velocity across subreddits. 50+ mentions in 6 hours = HIGH priority.
20-49 = MEDIUM. Mention count should appear in context ("47 mentions in 6hrs").
The signal is meant as an early indicator: it aims to catch momentum before price moves.
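
The priority bands and the context string can be sketched as follows. The function names are hypothetical, and treating fewer than 20 mentions as "no signal" is an assumption, since the note only defines HIGH and MEDIUM.

```python
def mention_priority(mentions_6h):
    if mentions_6h >= 50:
        return "HIGH"
    if mentions_6h >= 20:
        return "MEDIUM"
    return None  # below 20 mentions: assumed not surfaced


def mention_context(mentions_6h):
    return f"{mentions_6h} mentions in 6hrs"
```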
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does mention velocity (rate of increase) outperform raw mention count?

@ -0,0 +1,13 @@
# Sector Rotation Scanner
## Current Understanding
Detects money flowing between sectors using relative strength analysis. Most useful
as a macro filter rather than a primary signal — knowing which sectors are in favor
improves conviction in scanner candidates from those sectors. Standalone sector
rotation signals are too broad for individual stock selection.
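
Relative strength can be approximated as each sector's return minus a benchmark return over the same period. This is a simplified sketch with a hypothetical function name; the scanner's actual calculation is not described in the note.

```python
def rank_sectors(sector_returns, benchmark_return):
    """Order sectors from strongest to weakest relative to the benchmark."""
    relative = {s: r - benchmark_return for s, r in sector_returns.items()}
    return sorted(relative, key=relative.get, reverse=True)
```

Sectors near the top of the list would be the ones treated as "in favor" when weighing candidates from other scanners.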
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Can sector rotation data be used as a multiplier on other scanner scores?

@ -0,0 +1,14 @@
# Semantic News Scanner
## Current Understanding
Despite the name, extraction is currently regex-based, not semantic. Headline text is
not included in candidate context; the context only says "Mentioned in recent market
news", which is not informative. Catalyst classification from headline keywords
(upgrade/FDA/acquisition/earnings) would improve LLM scoring quality significantly.
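
A keyword-based catalyst classifier could look like the sketch below. The patterns, the first-match-wins ordering, and the `"other"` fallback are all assumptions made for illustration.

```python
import re

# Hypothetical keyword patterns for the four catalyst types named in the note.
CATALYST_PATTERNS = {
    "upgrade": re.compile(r"\bupgrad(e|es|ed)\b", re.IGNORECASE),
    "fda": re.compile(r"\bFDA\b", re.IGNORECASE),
    "acquisition": re.compile(r"\b(acquires?|acquired|acquisition)\b", re.IGNORECASE),
    "earnings": re.compile(r"\bearnings\b", re.IGNORECASE),
}


def classify_headline(headline):
    # First matching pattern wins; dict order sets the priority.
    for label, pattern in CATALYST_PATTERNS.items():
        if pattern.search(headline):
            return label
    return "other"


def news_context(headline):
    # Include the headline itself, not just a generic "mentioned in news" note.
    return f"[{classify_headline(headline)}] {headline}"
```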
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Would embedding-based semantic matching outperform keyword regex?
- [ ] Does catalyst classification (FDA vs earnings vs acquisition) affect hit rate?

@ -0,0 +1,13 @@
# Technical Breakout Scanner
## Current Understanding
Detects price breakouts above key resistance levels on above-average volume.
Minervini-style setups (stage 2 uptrend, tight base, volume-confirmed breakout)
tend to have the highest follow-through rate. False breakouts are common without
volume confirmation (>1.5x average on breakout day).
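
The volume-confirmation rule can be sketched as a single predicate. The function name and the way resistance and average volume are supplied are hypothetical; how the scanner derives them is not covered here.

```python
def is_confirmed_breakout(close, resistance, volume, avg_volume, min_ratio=1.5):
    """Breakout counts only if breakout-day volume exceeds min_ratio x average."""
    return (
        close > resistance
        and avg_volume > 0
        and volume > min_ratio * avg_volume
    )
```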
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does requiring volume confirmation on the breakout day reduce false positives?

@ -0,0 +1,14 @@
# Volume Accumulation Scanner
## Current Understanding
Detects stocks with volume >2x average. Key weakness: cannot distinguish buying from
selling — high volume on a down day is distribution, not accumulation. Multi-day mode
(3 of last 5 days >1.5x) is more reliable than single-day spikes. Price-change filter
(<3% absolute move) isolates quiet accumulation from momentum chasing.
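
The multi-day mode and the price-change filter can be combined as below. This sketch assumes the <3% filter is applied per day rather than over the whole window, which the note does not specify; the function name and input shapes are hypothetical.

```python
def is_quiet_accumulation(volumes, pct_changes, avg_volume,
                          window=5, min_days=3, ratio=1.5, max_abs_move=3.0):
    """volumes and pct_changes are daily series, oldest first."""
    recent = zip(volumes[-window:], pct_changes[-window:])
    # Count days with elevated volume AND a small absolute price move.
    hits = sum(
        1 for vol, pct in recent
        if vol > ratio * avg_volume and abs(pct) < max_abs_move
    )
    return hits >= min_days
```

Using the absolute move keeps the filter direction-agnostic, so it does not by itself address the buying-versus-selling weakness noted above.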
## Evidence Log
_(populated by /iterate runs)_
## Pending Hypotheses
- [ ] Does adding a price-direction filter (volume + flat/up price) improve hit rate?
- [ ] Is 3-of-5-day accumulation a stronger signal than single-day 2x volume?