From 3fb82e818072abb55523452e4f1aa4c43b27733e Mon Sep 17 00:00:00 2001
From: Youssef Aitousarrah
Date: Wed, 8 Apr 2026 08:09:07 -0700
Subject: [PATCH] feat(iteration-system): add knowledge base folder structure with seeded scanner files

Co-Authored-By: Claude Sonnet 4.6
---
 docs/iterations/LEARNINGS.md                  | 20 +++++++++++++++++++
 docs/iterations/pipeline/scoring.md           | 14 +++++++++++++
 docs/iterations/scanners/analyst_upgrades.md  | 14 +++++++++++++
 docs/iterations/scanners/earnings_calendar.md | 13 ++++++++++++
 docs/iterations/scanners/insider_buying.md    | 14 +++++++++++++
 docs/iterations/scanners/market_movers.md     | 13 ++++++++++++
 docs/iterations/scanners/minervini.md         | 13 ++++++++++++
 docs/iterations/scanners/ml_signal.md         | 14 +++++++++++++
 docs/iterations/scanners/options_flow.md      | 15 ++++++++++++++
 docs/iterations/scanners/reddit_dd.md         | 14 +++++++++++++
 docs/iterations/scanners/reddit_trending.md   | 12 +++++++++++
 docs/iterations/scanners/sector_rotation.md   | 13 ++++++++++++
 docs/iterations/scanners/semantic_news.md     | 14 +++++++++++++
 .../iterations/scanners/technical_breakout.md | 13 ++++++++++++
 .../scanners/volume_accumulation.md           | 14 +++++++++++++
 15 files changed, 210 insertions(+)
 create mode 100644 docs/iterations/LEARNINGS.md
 create mode 100644 docs/iterations/pipeline/scoring.md
 create mode 100644 docs/iterations/scanners/analyst_upgrades.md
 create mode 100644 docs/iterations/scanners/earnings_calendar.md
 create mode 100644 docs/iterations/scanners/insider_buying.md
 create mode 100644 docs/iterations/scanners/market_movers.md
 create mode 100644 docs/iterations/scanners/minervini.md
 create mode 100644 docs/iterations/scanners/ml_signal.md
 create mode 100644 docs/iterations/scanners/options_flow.md
 create mode 100644 docs/iterations/scanners/reddit_dd.md
 create mode 100644 docs/iterations/scanners/reddit_trending.md
 create mode 100644 docs/iterations/scanners/sector_rotation.md
 create mode 100644 docs/iterations/scanners/semantic_news.md
 create mode 100644 docs/iterations/scanners/technical_breakout.md
 create mode 100644 docs/iterations/scanners/volume_accumulation.md

diff --git a/docs/iterations/LEARNINGS.md b/docs/iterations/LEARNINGS.md
new file mode 100644
index 00000000..61c54c30
--- /dev/null
+++ b/docs/iterations/LEARNINGS.md
@@ -0,0 +1,20 @@
+# Learnings Index
+
+**Last analyzed run:** _(none yet — will be set by first /iterate run)_
+
+| Domain | File | Last Updated | One-line Summary |
+|--------|------|--------------|-----------------|
+| options_flow | scanners/options_flow.md | — | No data yet |
+| insider_buying | scanners/insider_buying.md | — | No data yet |
+| volume_accumulation | scanners/volume_accumulation.md | — | No data yet |
+| reddit_dd | scanners/reddit_dd.md | — | No data yet |
+| reddit_trending | scanners/reddit_trending.md | — | No data yet |
+| semantic_news | scanners/semantic_news.md | — | No data yet |
+| market_movers | scanners/market_movers.md | — | No data yet |
+| earnings_calendar | scanners/earnings_calendar.md | — | No data yet |
+| analyst_upgrades | scanners/analyst_upgrades.md | — | No data yet |
+| technical_breakout | scanners/technical_breakout.md | — | No data yet |
+| sector_rotation | scanners/sector_rotation.md | — | No data yet |
+| ml_signal | scanners/ml_signal.md | — | No data yet |
+| minervini | scanners/minervini.md | — | No data yet |
+| pipeline/scoring | pipeline/scoring.md | — | No data yet |
diff --git a/docs/iterations/pipeline/scoring.md b/docs/iterations/pipeline/scoring.md
new file mode 100644
index 00000000..54b00039
--- /dev/null
+++ b/docs/iterations/pipeline/scoring.md
@@ -0,0 +1,14 @@
+# Pipeline Scoring & Ranking
+
+## Current Understanding
+LLM assigns a final_score (0-100) and confidence (1-10) to each candidate.
+Score and confidence are correlated but not identical — a speculative setup
+can score 80 with confidence 6. The ranker uses final_score as primary sort key.
+No evidence yet on whether confidence or score is a better predictor of outcomes.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Is confidence a better outcome predictor than final_score?
+- [ ] Does score threshold (e.g. only surface candidates >70) improve hit rate?
diff --git a/docs/iterations/scanners/analyst_upgrades.md b/docs/iterations/scanners/analyst_upgrades.md
new file mode 100644
index 00000000..767dbc70
--- /dev/null
+++ b/docs/iterations/scanners/analyst_upgrades.md
@@ -0,0 +1,14 @@
+# Analyst Upgrades Scanner
+
+## Current Understanding
+Detects analyst upgrades/price target increases. Most reliable when upgrade comes
+from a top-tier firm (Goldman, Morgan Stanley, JPMorgan) and represents a meaningful
+target increase (>15%). Short squeeze potential (high short interest) combined with
+an upgrade is a historically strong setup.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does analyst tier (BB firm vs boutique) predict upgrade quality?
+- [ ] Does short interest >20% combined with an upgrade produce outsized moves?
diff --git a/docs/iterations/scanners/earnings_calendar.md b/docs/iterations/scanners/earnings_calendar.md
new file mode 100644
index 00000000..1550247b
--- /dev/null
+++ b/docs/iterations/scanners/earnings_calendar.md
@@ -0,0 +1,13 @@
+# Earnings Calendar Scanner
+
+## Current Understanding
+Identifies stocks with earnings announcements in the next N days. Pre-earnings
+setups work best when combined with options flow (IV expansion) or insider activity.
+Standalone earnings calendar signal is too broad — nearly every stock has earnings
+quarterly.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does requiring options confirmation alongside earnings improve signal quality?
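Illustration only, not part of the patch: the ranking rule described in pipeline/scoring.md (final_score as the primary sort key, with an optional score threshold as in the second pending hypothesis) might be sketched as below. The `Candidate` type, the `rank` function, and the use of confidence as a tie-breaker are assumptions; the notes name only the primary key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate record; field names follow the notes."""
    ticker: str
    final_score: int  # 0-100, assigned by the LLM
    confidence: int   # 1-10, assigned by the LLM


def rank(candidates, min_score=None):
    """Sort by final_score, highest first.

    Assumption: confidence only breaks ties. When min_score is given,
    only candidates scoring strictly above it are surfaced.
    """
    kept = [c for c in candidates
            if min_score is None or c.final_score > min_score]
    return sorted(kept, key=lambda c: (c.final_score, c.confidence),
                  reverse=True)


candidates = [
    Candidate("AAA", 80, 6),  # speculative setup: high score, middling confidence
    Candidate("BBB", 65, 9),
    Candidate("CCC", 80, 8),
]
ranked = rank(candidates)
surfaced = rank(candidates, min_score=70)
```

Keeping the threshold as a parameter lets an /iterate run compare hit rates with and without it, without touching the sort itself.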
diff --git a/docs/iterations/scanners/insider_buying.md b/docs/iterations/scanners/insider_buying.md
new file mode 100644
index 00000000..b77fc60b
--- /dev/null
+++ b/docs/iterations/scanners/insider_buying.md
@@ -0,0 +1,14 @@
+# Insider Buying Scanner
+
+## Current Understanding
+Scrapes SEC Form 4 filings. CEO/CFO purchases >$100K are the most reliable signal.
+Cluster detection (2+ insiders buying within 14 days) historically a high-conviction
+setup. Transaction details (name, title, value) must be preserved from scraper output
+and included in candidate context — dropping them loses signal clarity.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does cluster detection (2+ insiders in 14 days) outperform single-insider signals?
+- [ ] Is there a minimum transaction size below which signal quality degrades sharply?
diff --git a/docs/iterations/scanners/market_movers.md b/docs/iterations/scanners/market_movers.md
new file mode 100644
index 00000000..a4435309
--- /dev/null
+++ b/docs/iterations/scanners/market_movers.md
@@ -0,0 +1,13 @@
+# Market Movers Scanner
+
+## Current Understanding
+Finds stocks that have already moved significantly. This is a reactive scanner —
+it identifies momentum after it starts rather than predicting it. Useful for
+continuation plays but not for early-stage entry. Best combined with volume
+confirmation to distinguish breakouts from spikes.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Is a volume confirmation filter (>1.5x average) useful for filtering out noise?
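Illustration only, not part of the patch: the cluster rule in insider_buying.md (2+ insiders buying within 14 days) could be checked roughly as follows. `InsiderBuy`, `find_clusters`, and the sample rows are hypothetical; the actual scraper output format is not shown in this patch.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class InsiderBuy:
    """Hypothetical Form 4 purchase row; keeps name, title, and value."""
    ticker: str
    insider: str
    title: str
    value_usd: float
    filed: date


def find_clusters(buys, window_days=14, min_insiders=2):
    """Return tickers where at least `min_insiders` distinct insiders
    bought within some span of `window_days` days."""
    by_ticker = {}
    for buy in buys:
        by_ticker.setdefault(buy.ticker, []).append(buy)

    window = timedelta(days=window_days)
    clustered = set()
    for ticker, rows in by_ticker.items():
        rows.sort(key=lambda b: b.filed)
        for i, first in enumerate(rows):
            # Distinct insiders buying on or after `first`, inside the window.
            insiders = {b.insider for b in rows[i:]
                        if b.filed - first.filed <= window}
            if len(insiders) >= min_insiders:
                clustered.add(ticker)
                break
    return clustered


buys = [
    InsiderBuy("AAA", "J. Doe", "CEO", 250_000, date(2026, 3, 2)),
    InsiderBuy("AAA", "R. Roe", "CFO", 120_000, date(2026, 3, 10)),
    InsiderBuy("BBB", "A. Poe", "CEO", 500_000, date(2026, 3, 1)),
    InsiderBuy("BBB", "B. Low", "Director", 150_000, date(2026, 3, 20)),
]
clusters = find_clusters(buys)
```

Counting distinct insiders rather than transactions means one executive buying twice in a week does not register as a cluster, which seems closer to the intent of the note.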
diff --git a/docs/iterations/scanners/minervini.md b/docs/iterations/scanners/minervini.md
new file mode 100644
index 00000000..e2cc127c
--- /dev/null
+++ b/docs/iterations/scanners/minervini.md
@@ -0,0 +1,13 @@
+# Minervini Scanner
+
+## Current Understanding
+Implements Mark Minervini's SEPA (Specific Entry Point Analysis) criteria: stage 2
+uptrend, price above 50/150/200 SMA in the right order, 52-week high proximity,
+RS line at new highs. Historically one of the highest-conviction scanner setups.
+Works best in bull market conditions; underperforms in choppy/bear markets.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does adding a market condition filter (S&P 500 above 200 SMA) improve hit rate?
diff --git a/docs/iterations/scanners/ml_signal.md b/docs/iterations/scanners/ml_signal.md
new file mode 100644
index 00000000..c824eea8
--- /dev/null
+++ b/docs/iterations/scanners/ml_signal.md
@@ -0,0 +1,14 @@
+# ML Signal Scanner
+
+## Current Understanding
+Uses a trained ML model to predict short-term price movement probability. Current
+threshold of 35% win probability is worse than a coin flip — the model needs
+retraining or the threshold needs raising to 55%+ to be useful. Signal quality
+depends heavily on feature freshness; stale features degrade performance.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does raising the threshold to 55%+ improve precision at the cost of recall?
+- [ ] Would retraining on the last 90 days of recommendations improve accuracy?
diff --git a/docs/iterations/scanners/options_flow.md b/docs/iterations/scanners/options_flow.md
new file mode 100644
index 00000000..eb06170e
--- /dev/null
+++ b/docs/iterations/scanners/options_flow.md
@@ -0,0 +1,15 @@
+# Options Flow Scanner
+
+## Current Understanding
+Scans for unusual options volume relative to open interest using Tradier API.
+Call/put volume ratio below 0.1 is a reliable bullish signal when combined with
+premium >$25K. The premium filter is configured but must be explicitly applied.
+Scanning only the nearest expiration misses institutional positioning in 30+ DTE
+contracts — scanning up to 3 expirations improves signal quality.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does scanning 3 expirations vs 1 meaningfully change hit rate?
+- [ ] Is moneyness (ITM vs OTM) a useful signal filter?
diff --git a/docs/iterations/scanners/reddit_dd.md b/docs/iterations/scanners/reddit_dd.md
new file mode 100644
index 00000000..e3164ad0
--- /dev/null
+++ b/docs/iterations/scanners/reddit_dd.md
@@ -0,0 +1,14 @@
+# Reddit DD Scanner
+
+## Current Understanding
+Scans r/investing, r/stocks, r/wallstreetbets for DD posts. LLM quality score is
+computed but not used for filtering — using it (80+ = HIGH, 60-79 = MEDIUM, <60 = skip)
+would reduce noise. Subreddit weighting matters: r/investing posts are more reliable
+than r/pennystocks. Post title and LLM score should appear in candidate context.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does filtering by LLM quality score >60 meaningfully reduce false positives?
+- [ ] Does subreddit weighting change hit rates?
diff --git a/docs/iterations/scanners/reddit_trending.md b/docs/iterations/scanners/reddit_trending.md
new file mode 100644
index 00000000..fdb48f8d
--- /dev/null
+++ b/docs/iterations/scanners/reddit_trending.md
@@ -0,0 +1,12 @@
+# Reddit Trending Scanner
+
+## Current Understanding
+Tracks mention velocity across subreddits. 50+ mentions in 6 hours = HIGH priority.
+20-49 = MEDIUM. Mention count should appear in context ("47 mentions in 6hrs").
+Signal is early-indicator oriented — catches momentum before price moves.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does mention velocity (rate of increase) outperform raw mention count?
diff --git a/docs/iterations/scanners/sector_rotation.md b/docs/iterations/scanners/sector_rotation.md
new file mode 100644
index 00000000..47608168
--- /dev/null
+++ b/docs/iterations/scanners/sector_rotation.md
@@ -0,0 +1,13 @@
+# Sector Rotation Scanner
+
+## Current Understanding
+Detects money flowing between sectors using relative strength analysis. Most useful
+as a macro filter rather than a primary signal — knowing which sectors are in favor
+improves conviction in scanner candidates from those sectors. Standalone sector
+rotation signals are too broad for individual stock selection.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Can sector rotation data be used as a multiplier on other scanner scores?
diff --git a/docs/iterations/scanners/semantic_news.md b/docs/iterations/scanners/semantic_news.md
new file mode 100644
index 00000000..07fdd295
--- /dev/null
+++ b/docs/iterations/scanners/semantic_news.md
@@ -0,0 +1,14 @@
+# Semantic News Scanner
+
+## Current Understanding
+Currently regex-based extraction, not semantic. Headline text is not included in
+candidate context — the context just says "Mentioned in recent market news" which
+is not informative. Catalyst classification from headline keywords (upgrade/FDA/
+acquisition/earnings) would improve LLM scoring quality significantly.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Would embedding-based semantic matching outperform keyword regex?
+- [ ] Does catalyst classification (FDA vs earnings vs acquisition) affect hit rate?
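Illustration only, not part of the patch: the keyword-based catalyst classification proposed in semantic_news.md might look like the sketch below. The four categories come from the note; the specific keywords, the `classify_catalyst` and `build_context` names, and the context format are assumptions.

```python
import re

# Hypothetical keyword patterns, checked in insertion order.
CATALYST_PATTERNS = {
    "fda": re.compile(r"\bfda\b|\bapproval\b", re.IGNORECASE),
    "acquisition": re.compile(
        r"\bacqui(?:re[sd]?|sition)\b|\bmerger\b|\bbuyout\b", re.IGNORECASE),
    "upgrade": re.compile(r"\bupgrade[sd]?\b|\bprice target\b", re.IGNORECASE),
    "earnings": re.compile(r"\bearnings\b|\bguidance\b", re.IGNORECASE),
}


def classify_catalyst(headline):
    """Return the first matching catalyst label, or "other"."""
    for label, pattern in CATALYST_PATTERNS.items():
        if pattern.search(headline):
            return label
    return "other"


def build_context(headline):
    """Put the label and the headline text into the candidate context,
    instead of a generic "Mentioned in recent market news"."""
    return f"[{classify_catalyst(headline)}] {headline}"


headlines = [
    "FDA approves new therapy from Acme Bio",
    "Acme to acquire Widget Co in $2B deal",
    "Analyst upgrades Acme to buy",
    "Acme beats earnings estimates",
    "Acme opens new office",
]
labels = [classify_catalyst(h) for h in headlines]
```

This stays regex-based, so it speaks to the second pending hypothesis (does the catalyst type affect hit rate) and not the first (embedding-based matching).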
diff --git a/docs/iterations/scanners/technical_breakout.md b/docs/iterations/scanners/technical_breakout.md
new file mode 100644
index 00000000..138678b0
--- /dev/null
+++ b/docs/iterations/scanners/technical_breakout.md
@@ -0,0 +1,13 @@
+# Technical Breakout Scanner
+
+## Current Understanding
+Detects price breakouts above key resistance levels on above-average volume.
+Minervini-style setups (stage 2 uptrend, tight base, volume-confirmed breakout)
+tend to have the highest follow-through rate. False breakouts are common without
+volume confirmation (>1.5x average on breakout day).
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does requiring volume confirmation on the breakout day reduce false positives?
diff --git a/docs/iterations/scanners/volume_accumulation.md b/docs/iterations/scanners/volume_accumulation.md
new file mode 100644
index 00000000..76b99e3d
--- /dev/null
+++ b/docs/iterations/scanners/volume_accumulation.md
@@ -0,0 +1,14 @@
+# Volume Accumulation Scanner
+
+## Current Understanding
+Detects stocks with volume >2x average. Key weakness: cannot distinguish buying from
+selling — high volume on a down day is distribution, not accumulation. Multi-day mode
+(3 of last 5 days >1.5x) is more reliable than single-day spikes. Price-change filter
+(<3% absolute move) isolates quiet accumulation from momentum chasing.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does adding a price-direction filter (volume + flat/up price) improve hit rate?
+- [ ] Is 3-of-5-day accumulation a stronger signal than single-day 2x volume?
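Illustration only, not part of the patch: the multi-day mode and price-change filter described in volume_accumulation.md could be combined roughly as follows. The function name, the precomputed `avg_volume` baseline, and measuring the <3% move across the whole 5-day window are assumptions; the note does not say over which period the move is measured.

```python
def is_quiet_accumulation(volumes, closes, avg_volume, ratio=1.5,
                          min_days=3, lookback=5, max_abs_move=0.03):
    """True when at least `min_days` of the last `lookback` sessions traded
    above `ratio` times `avg_volume` while price stayed roughly flat.

    volumes: daily volumes, oldest first (at least `lookback` entries).
    closes: daily closes, oldest first (at least `lookback + 1` entries,
            so the close before the window is available).
    """
    recent = volumes[-lookback:]
    high_days = sum(1 for v in recent if v > ratio * avg_volume)

    # Assumption: the price move is measured from the close just before
    # the window to the latest close.
    start, end = closes[-lookback - 1], closes[-1]
    move = abs(end - start) / start

    return high_days >= min_days and move < max_abs_move


avg = 1_000_000
steady_volumes = [1_000_000, 1_700_000, 900_000, 1_600_000, 1_800_000]
spike_volumes = [1_000_000, 1_000_000, 1_000_000, 1_000_000, 2_500_000]
flat_closes = [50.0, 50.2, 50.1, 50.4, 50.3, 50.6]
rally_closes = [50.0, 51.0, 52.5, 53.0, 54.0, 55.0]

quiet = is_quiet_accumulation(steady_volumes, flat_closes, avg)
chasing = is_quiet_accumulation(steady_volumes, rally_closes, avg)
one_off = is_quiet_accumulation(spike_volumes, flat_closes, avg)
```

The absolute-move filter does not address the direction weakness the note calls out; a flat-or-up check on the same `closes` series would be the natural extension for the first pending hypothesis.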