feat(iteration-system): add knowledge base folder structure with seeded scanner files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 08:09:07 -07:00 · 2026-04-08 08:09:07 -07:00 · 3fb82e8180
parent ec2b3c2a45
commit 3fb82e8180
15 changed files with 210 additions and 0 deletions
--- a/docs/iterations/LEARNINGS.md
+++ b/docs/iterations/LEARNINGS.md
@@ -0,0 +1,20 @@
+# Learnings Index
+
+**Last analyzed run:** _(none yet — will be set by first /iterate run)_
+
+| Domain | File | Last Updated | One-line Summary |
+|--------|------|--------------|-----------------|
+| options_flow | scanners/options_flow.md | — | No data yet |
+| insider_buying | scanners/insider_buying.md | — | No data yet |
+| volume_accumulation | scanners/volume_accumulation.md | — | No data yet |
+| reddit_dd | scanners/reddit_dd.md | — | No data yet |
+| reddit_trending | scanners/reddit_trending.md | — | No data yet |
+| semantic_news | scanners/semantic_news.md | — | No data yet |
+| market_movers | scanners/market_movers.md | — | No data yet |
+| earnings_calendar | scanners/earnings_calendar.md | — | No data yet |
+| analyst_upgrades | scanners/analyst_upgrades.md | — | No data yet |
+| technical_breakout | scanners/technical_breakout.md | — | No data yet |
+| sector_rotation | scanners/sector_rotation.md | — | No data yet |
+| ml_signal | scanners/ml_signal.md | — | No data yet |
+| minervini | scanners/minervini.md | — | No data yet |
+| pipeline/scoring | pipeline/scoring.md | — | No data yet |
--- a/docs/iterations/pipeline/scoring.md
+++ b/docs/iterations/pipeline/scoring.md
@@ -0,0 +1,14 @@
+# Pipeline Scoring & Ranking
+
+## Current Understanding
+LLM assigns a final_score (0-100) and confidence (1-10) to each candidate.
+Score and confidence are correlated but not identical — a speculative setup
+can score 80 with confidence 6. The ranker uses final_score as primary sort key.
+No evidence yet on whether confidence or score is a better predictor of outcomes.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Is confidence a better outcome predictor than final_score?
+- [ ] Does a score threshold (e.g. surfacing only candidates >70) improve hit rate?
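
The ranking described in scoring.md could be sketched roughly as follows. The `Candidate` class, the `rank` helper, and the tie-break on confidence are illustrative assumptions; the source only states that `final_score` is the primary sort key.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    ticker: str
    final_score: int  # 0-100, assigned by the LLM
    confidence: int   # 1-10, assigned by the LLM


def rank(candidates, min_score=None):
    """Sort by final_score descending; optionally keep only scores above min_score."""
    kept = [c for c in candidates if min_score is None or c.final_score > min_score]
    # Breaking ties on confidence is an assumption, not documented behaviour.
    return sorted(kept, key=lambda c: (c.final_score, c.confidence), reverse=True)


candidates = [
    Candidate("AAA", 80, 6),  # speculative setup: high score, middling confidence
    Candidate("BBB", 65, 9),
    Candidate("CCC", 80, 8),
]
ranked = [c.ticker for c in rank(candidates)]
surfaced = [c.ticker for c in rank(candidates, min_score=70)]
```

Passing `min_score=70` corresponds to the second pending hypothesis: it drops `BBB` before sorting, so only candidates scoring above 70 are surfaced.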
--- a/docs/iterations/scanners/analyst_upgrades.md
+++ b/docs/iterations/scanners/analyst_upgrades.md
@@ -0,0 +1,14 @@
+# Analyst Upgrades Scanner
+
+## Current Understanding
+Detects analyst upgrades/price target increases. Most reliable when upgrade comes
+from a top-tier firm (Goldman, Morgan Stanley, JPMorgan) and represents a meaningful
+target increase (>15%). Short squeeze potential (high short interest) combined with
+an upgrade is a historically strong setup.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does analyst tier (bulge-bracket firm vs boutique) predict upgrade quality?
+- [ ] Does short interest >20% combined with an upgrade produce outsized moves?
--- a/docs/iterations/scanners/earnings_calendar.md
+++ b/docs/iterations/scanners/earnings_calendar.md
@@ -0,0 +1,13 @@
+# Earnings Calendar Scanner
+
+## Current Understanding
+Identifies stocks with earnings announcements in the next N days. Pre-earnings
+setups work best when combined with options flow (IV expansion) or insider activity.
+Standalone earnings calendar signal is too broad — nearly every stock has earnings
+quarterly.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does requiring options confirmation alongside earnings improve signal quality?
--- a/docs/iterations/scanners/insider_buying.md
+++ b/docs/iterations/scanners/insider_buying.md
@@ -0,0 +1,14 @@
+# Insider Buying Scanner
+
+## Current Understanding
+Scrapes SEC Form 4 filings. CEO/CFO purchases >$100K are the most reliable signal.
+Cluster detection (2+ insiders buying within 14 days) is historically a high-conviction
+setup. Transaction details (name, title, value) must be preserved from scraper output
+and included in candidate context — dropping them loses signal clarity.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does cluster detection (2+ insiders in 14 days) outperform single-insider signals?
+- [ ] Is there a minimum transaction size below which signal quality degrades sharply?
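
One way the cluster rule in insider_buying.md (2+ insiders buying within 14 days) might be checked is sketched below. The `Purchase` fields mirror the transaction details the note says must be preserved (name, title, value); the class and function names are hypothetical, not the scanner's actual interface.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Purchase:
    insider: str      # filer name from Form 4
    title: str        # e.g. "CEO", "CFO"
    value_usd: float
    filed: date


def has_cluster(purchases, window_days=14, min_insiders=2):
    """True if at least min_insiders distinct insiders bought within window_days."""
    ordered = sorted(purchases, key=lambda p: p.filed)
    window = timedelta(days=window_days)
    for i, first in enumerate(ordered):
        # Distinct insiders whose purchase falls inside the window opened by `first`.
        names = {p.insider for p in ordered[i:] if p.filed - first.filed <= window}
        if len(names) >= min_insiders:
            return True
    return False


clustered = [
    Purchase("A. Smith", "CEO", 250_000, date(2026, 3, 1)),
    Purchase("B. Jones", "CFO", 120_000, date(2026, 3, 11)),
]
spread_out = [
    Purchase("A. Smith", "CEO", 250_000, date(2026, 3, 1)),
    Purchase("B. Jones", "CFO", 120_000, date(2026, 3, 31)),
]
```

Counting distinct names rather than filings keeps repeated purchases by one insider from registering as a cluster.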
--- a/docs/iterations/scanners/market_movers.md
+++ b/docs/iterations/scanners/market_movers.md
@@ -0,0 +1,13 @@
+# Market Movers Scanner
+
+## Current Understanding
+Finds stocks that have already moved significantly. This is a reactive scanner —
+it identifies momentum after it starts rather than predicting it. Useful for
+continuation plays but not for early-stage entry. Best combined with volume
+confirmation to distinguish breakouts from spikes.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Is a volume confirmation filter (>1.5x average) useful for filtering out noise?
--- a/docs/iterations/scanners/minervini.md
+++ b/docs/iterations/scanners/minervini.md
@@ -0,0 +1,13 @@
+# Minervini Scanner
+
+## Current Understanding
+Implements Mark Minervini's SEPA (Specific Entry Point Analysis) criteria: stage 2
+uptrend, price above the 50/150/200-day SMAs (stacked 50 > 150 > 200), price near the
+52-week high, RS line at new highs. Historically one of the highest-conviction scanner setups.
+Works best in bull market conditions; underperforms in choppy/bear markets.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does adding a market condition filter (S&P 500 above 200 SMA) improve hit rate?
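
The moving-average and 52-week-high parts of the minervini.md criteria can be illustrated with a small check. This is a sketch only: the function names are hypothetical, the 25% distance-from-high default is an assumed parameter rather than the scanner's setting, and the RS-line and stage-analysis conditions are left out.

```python
def sma(closes, n):
    """Simple moving average of the last n closes."""
    if len(closes) < n:
        raise ValueError(f"need at least {n} closes, got {len(closes)}")
    return sum(closes[-n:]) / n


def passes_trend_check(closes, max_pct_below_high=25.0):
    """Price > 50 SMA > 150 SMA > 200 SMA, and price close to the 52-week high."""
    price = closes[-1]
    sma50, sma150, sma200 = sma(closes, 50), sma(closes, 150), sma(closes, 200)
    stacked = price > sma50 > sma150 > sma200
    high_52w = max(closes[-252:])  # roughly one trading year of daily closes
    near_high = price >= high_52w * (1 - max_pct_below_high / 100)
    return stacked and near_high


uptrend = [100 + 0.5 * i for i in range(260)]    # steadily rising closes
downtrend = [230 - 0.5 * i for i in range(260)]  # steadily falling closes
```

The market-condition filter from the pending hypothesis could reuse `sma` on index closes, for example requiring the index's last close to sit above its own 200-day average.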
--- a/docs/iterations/scanners/ml_signal.md
+++ b/docs/iterations/scanners/ml_signal.md
@@ -0,0 +1,14 @@
+# ML Signal Scanner
+
+## Current Understanding
+Uses a trained ML model to predict short-term price movement probability. Current
+threshold of 35% win probability is worse than a coin flip — the model needs
+retraining or the threshold needs raising to 55%+ to be useful. Signal quality
+depends heavily on feature freshness; stale features degrade performance.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does raising the threshold to 55%+ improve precision at the cost of recall?
+- [ ] Would retraining on the last 90 days of recommendations improve accuracy?
--- a/docs/iterations/scanners/options_flow.md
+++ b/docs/iterations/scanners/options_flow.md
@@ -0,0 +1,15 @@
+# Options Flow Scanner
+
+## Current Understanding
+Scans for unusual options volume relative to open interest using Tradier API.
+Put/call volume ratio below 0.1 is a reliable bullish signal when combined with
+premium >$25K. The premium filter is configured but must be explicitly applied.
+Scanning only the nearest expiration misses institutional positioning in 30+ DTE
+contracts — scanning up to 3 expirations improves signal quality.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does scanning 3 expirations vs 1 meaningfully change hit rate?
+- [ ] Is moneyness (ITM vs OTM) a useful signal filter?
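
Since options_flow.md notes that the premium filter is configured but must be applied explicitly, a sketch of applying it alongside a volume-to-open-interest check may help. The `Contract` fields and the `min_vol_oi` default are assumptions for illustration; they are not Tradier API field names or the scanner's real configuration.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    symbol: str
    volume: int
    open_interest: int
    last_price: float  # option price per share
    dte: int           # days to expiration


def premium_usd(c):
    # Standard US equity option contracts cover 100 shares each.
    return c.volume * c.last_price * 100


def unusual_contracts(chain, min_vol_oi=2.0, min_premium=25_000):
    """Keep contracts with high volume/open-interest AND premium above the floor."""
    return [
        c for c in chain
        if c.open_interest > 0
        and c.volume / c.open_interest >= min_vol_oi
        and premium_usd(c) > min_premium  # the filter that is easy to forget
    ]


chain = [
    Contract("X", 500, 100, 1.00, 35),  # unusual volume, $50K premium
    Contract("Y", 500, 100, 0.10, 7),   # unusual volume, only $5K premium
    Contract("Z", 50, 100, 10.00, 35),  # large premium, ordinary volume
]
kept = [c.symbol for c in unusual_contracts(chain)]
```

Passing contracts from several expirations in one `chain` is how the 30+ DTE positioning mentioned above would reach the filter.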
--- a/docs/iterations/scanners/reddit_dd.md
+++ b/docs/iterations/scanners/reddit_dd.md
@@ -0,0 +1,14 @@
+# Reddit DD Scanner
+
+## Current Understanding
+Scans r/investing, r/stocks, r/wallstreetbets for DD posts. LLM quality score is
+computed but not used for filtering — using it (80+ = HIGH, 60-79 = MEDIUM, <60 = skip)
+would reduce noise. Subreddit weighting matters: r/investing posts are more reliable
+than r/pennystocks. Post title and LLM score should appear in candidate context.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does filtering by LLM quality score (60+) meaningfully reduce false positives?
+- [ ] Does subreddit weighting change hit rates?
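
The score tiers proposed in reddit_dd.md (80+ = HIGH, 60-79 = MEDIUM, below 60 = skip) could be applied along these lines. The function names and the `(title, score)` input shape are hypothetical.

```python
def dd_priority(llm_score):
    """Map an LLM quality score (0-100) to a priority tier, or None to skip."""
    if llm_score >= 80:
        return "HIGH"
    if llm_score >= 60:
        return "MEDIUM"
    return None  # below 60: skip the post


def filter_posts(posts):
    """Keep posts scoring 60+; carry title and score into the candidate context."""
    out = []
    for title, score in posts:
        tier = dd_priority(score)
        if tier is not None:
            out.append({"title": title, "llm_score": score, "priority": tier})
    return out


posts = [("Deep dive on XYZ", 85), ("Quick thoughts", 42), ("ABC valuation", 60)]
filtered = filter_posts(posts)
```

Keeping `title` and `llm_score` on each result matches the note that both should appear in candidate context.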
--- a/docs/iterations/scanners/reddit_trending.md
+++ b/docs/iterations/scanners/reddit_trending.md
@@ -0,0 +1,12 @@
+# Reddit Trending Scanner
+
+## Current Understanding
+Tracks mention velocity across subreddits. 50+ mentions in 6 hours = HIGH priority.
+20-49 = MEDIUM. Mention count should appear in context ("47 mentions in 6hrs").
+Signal is early-indicator oriented — catches momentum before price moves.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does mention velocity (rate of increase) outperform raw mention count?
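
The mention tiers and context string from reddit_trending.md, plus one possible reading of "mention velocity" from the pending hypothesis, might look like the sketch below. All function names are hypothetical, returning `None` below 20 mentions is an assumption (the note does not say how those are handled), and velocity is defined here as relative change between two consecutive windows.

```python
def mention_priority(mentions):
    """50+ mentions in the window = HIGH, 20-49 = MEDIUM, otherwise no tier."""
    if mentions >= 50:
        return "HIGH"
    if mentions >= 20:
        return "MEDIUM"
    return None  # assumption: counts under 20 are not surfaced


def mention_context(mentions, window_hours=6):
    """Human-readable context string for the candidate."""
    return f"{mentions} mentions in {window_hours}hrs"


def mention_velocity(prev_window, curr_window):
    """Relative change in mentions between two consecutive windows."""
    if prev_window == 0:
        return None  # undefined when there is no baseline
    return (curr_window - prev_window) / prev_window
```

For example, `mention_velocity(20, 47)` describes a ticker whose mentions grew by 135% between windows while its raw count still sits in the MEDIUM tier.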
--- a/docs/iterations/scanners/sector_rotation.md
+++ b/docs/iterations/scanners/sector_rotation.md
@@ -0,0 +1,13 @@
+# Sector Rotation Scanner
+
+## Current Understanding
+Detects money flowing between sectors using relative strength analysis. Most useful
+as a macro filter rather than a primary signal — knowing which sectors are in favor
+improves conviction in scanner candidates from those sectors. Standalone sector
+rotation signals are too broad for individual stock selection.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Can sector rotation data be used as a multiplier on other scanner scores?
--- a/docs/iterations/scanners/semantic_news.md
+++ b/docs/iterations/scanners/semantic_news.md
@@ -0,0 +1,14 @@
+# Semantic News Scanner
+
+## Current Understanding
+Extraction is currently regex-based, not semantic. Headline text is not included in
+candidate context — the context just says "Mentioned in recent market news" which
+is not informative. Catalyst classification from headline keywords (upgrade/FDA/
+acquisition/earnings) would improve LLM scoring quality significantly.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Would embedding-based semantic matching outperform keyword regex?
+- [ ] Does catalyst classification (FDA vs earnings vs acquisition) affect hit rate?
--- a/docs/iterations/scanners/technical_breakout.md
+++ b/docs/iterations/scanners/technical_breakout.md
@@ -0,0 +1,13 @@
+# Technical Breakout Scanner
+
+## Current Understanding
+Detects price breakouts above key resistance levels on above-average volume.
+Minervini-style setups (stage 2 uptrend, tight base, volume-confirmed breakout)
+tend to have the highest follow-through rate. False breakouts are common without
+volume confirmation (>1.5x average on breakout day).
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does requiring volume confirmation on the breakout day reduce false positives?
--- a/docs/iterations/scanners/volume_accumulation.md
+++ b/docs/iterations/scanners/volume_accumulation.md
@@ -0,0 +1,14 @@
+# Volume Accumulation Scanner
+
+## Current Understanding
+Detects stocks with volume >2x average. Key weakness: cannot distinguish buying from
+selling — high volume on a down day is distribution, not accumulation. Multi-day mode
+(3 of last 5 days >1.5x) is more reliable than single-day spikes. Price-change filter
+(<3% absolute move) isolates quiet accumulation from momentum chasing.
+
+## Evidence Log
+_(populated by /iterate runs)_
+
+## Pending Hypotheses
+- [ ] Does adding a price-direction filter (volume + flat/up price) improve hit rate?
+- [ ] Is 3-of-5-day accumulation a stronger signal than single-day 2x volume?
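
The multi-day mode and price-change filter from volume_accumulation.md can be sketched as two small predicates. The function names are hypothetical, and how `avg_volume` is computed (lookback length, whether recent days are excluded) is left to the caller as an assumption.

```python
def multi_day_accumulation(volumes, avg_volume, days=5, min_hits=3, ratio=1.5):
    """True if at least min_hits of the last `days` sessions exceed ratio x average."""
    recent = volumes[-days:]
    if len(recent) < days:
        return False  # not enough history to evaluate the window
    hits = sum(1 for v in recent if v > ratio * avg_volume)
    return hits >= min_hits


def quiet_move(prev_close, close, max_abs_pct=3.0):
    """True if the absolute price change is under max_abs_pct percent."""
    return abs(close - prev_close) / prev_close * 100 < max_abs_pct


accumulating = [100, 160, 170, 90, 155]  # 3 of 5 sessions above 1.5x a 100 average
single_spike = [100, 100, 100, 90, 250]  # one large spike only
```

Note that `quiet_move` uses the absolute change, so it does not by itself separate buying from selling; the price-direction filter in the first hypothesis would need a signed comparison instead.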