Add scanner improvements design document
Documents the approved plan to fix signal quality issues in all 9 existing scanners and add 3 new scanners (analyst upgrades, technical breakout, sector rotation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 6342090a94
commit 2b74d298da

@@ -0,0 +1,98 @@

# Scanner Improvements Design

**Date:** 2026-02-18
**Status:** Approved

## Problem

Most scanners produce weak or broken signals:
- Insider buying drops transaction details (name, title, value)
- Options flow ignores its own premium filter and checks only the nearest expiration
- Volume accumulation can't distinguish buying from selling
- Reddit DD scores posts with an LLM, then ignores the score
- Semantic news is just regex extraction, not semantic
- Market movers flags stocks only after they have already moved
- ML signal threshold (35% win probability) is worse than a coin flip

Three useful scanner types are missing entirely: analyst upgrades, technical breakouts, sector rotation.

## Phase 1: Fix Existing Scanners

### 1. Insider Buying

- Preserve `insider_name`, `title`, `transaction_value`, `shares` from scraper
- Priority by significance: CEO/CFO >$100K = CRITICAL, director >$50K = HIGH, other = MEDIUM
- Cluster detection: 2+ insiders buying same stock within 14 days = CRITICAL
- Rich context: "CFO Jane Smith purchased $250K of shares"
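
The priority and cluster rules above could be sketched roughly as follows. This is a minimal illustration, not the scanner's actual code: the `InsiderBuy` record, the helper names, and the title matching (substring checks for "CEO", "CFO", "DIR") are assumptions; only the four preserved fields and the thresholds come from the plan.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InsiderBuy:
    """Hypothetical record holding the fields the plan says to preserve."""
    ticker: str
    insider_name: str
    title: str
    transaction_value: float
    shares: int
    trade_date: date


def base_priority(buy):
    """Priority by role and dollar value, per the thresholds in the plan."""
    title = buy.title.upper()
    if ("CEO" in title or "CFO" in title) and buy.transaction_value > 100_000:
        return "CRITICAL"
    if "DIR" in title and buy.transaction_value > 50_000:
        return "HIGH"
    return "MEDIUM"


def is_cluster(buys, ticker, window_days=14):
    """True if 2+ distinct insiders bought `ticker` within `window_days`."""
    trades = sorted((b for b in buys if b.ticker == ticker),
                    key=lambda b: b.trade_date)
    for i, first in enumerate(trades):
        names = {b.insider_name for b in trades[i:]
                 if (b.trade_date - first.trade_date).days <= window_days}
        if len(names) >= 2:
            return True
    return False


def priority(buy, buys):
    """Cluster buying overrides the per-transaction priority."""
    return "CRITICAL" if is_cluster(buys, buy.ticker) else base_priority(buy)


def context(buy):
    """Human-readable context string in the style shown in the plan."""
    return (f"{buy.title} {buy.insider_name} purchased "
            f"${buy.transaction_value / 1000:.0f}K of shares")
```

Counting distinct names rather than transactions keeps one insider filing several purchases from being mistaken for a cluster.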

### 2. Options Flow

- Apply the existing `min_premium` threshold ($25K), which is currently configured but never checked
- Scan up to 3 nearest expirations instead of 1
- Classify moneyness: ITM call buying (conviction) > OTM (speculative)
- Weight by expiration: 30+ DTE scored higher than weeklies
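
One way these four rules might fit together is sketched below. The `OptionTrade` record, the additive scoring weights, and the helper names are assumptions made for illustration; the plan only fixes the $25K premium floor, the 3-expiration limit, and the relative ordering (ITM above OTM, 30+ DTE above weeklies).

```python
from dataclasses import dataclass

MIN_PREMIUM = 25_000  # dollars, the configured `min_premium` threshold


@dataclass
class OptionTrade:
    """Hypothetical record for one unusual-activity row."""
    option_type: str         # "call" or "put"
    strike: float
    underlying_price: float
    premium: float           # total dollar premium paid
    dte: int                 # days to expiration


def nearest_expirations(expirations, limit=3):
    """Pick the `limit` nearest expirations from ISO date strings."""
    return sorted(expirations)[:limit]


def moneyness(trade):
    """Classify as ITM or OTM (at-the-money is treated as OTM here)."""
    if trade.option_type == "call":
        return "ITM" if trade.underlying_price > trade.strike else "OTM"
    return "ITM" if trade.underlying_price < trade.strike else "OTM"


def score(trade, min_premium=MIN_PREMIUM):
    """Return None below the premium floor, else an illustrative score."""
    if trade.premium < min_premium:
        return None              # the filter that was previously never applied
    value = 1.0
    if moneyness(trade) == "ITM":
        value += 1.0             # conviction over speculation
    if trade.dte >= 30:
        value += 1.0             # longer-dated over weeklies
    return value
```

The weights are placeholders; any scheme that preserves the stated ordering would satisfy the plan.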

### 3. Volume Accumulation

- Price-change filter: volume >2x AND absolute price change <3% (quiet accumulation only)
- Multi-day mode: 3 of last 5 days >1.5x average = sustained accumulation
- Exclude distribution: high volume + big price drop = skip
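
The filters above reduce to a few comparisons. The sketch below assumes the caller already has the day's volume, a trailing average volume, and the day's percentage price change; the function names and return labels are illustrative, not taken from the codebase.

```python
def classify_day(volume, avg_volume, price_change_pct,
                 vol_mult=2.0, max_move_pct=3.0):
    """Label a single day as 'accumulation', 'distribution', or 'none'."""
    if avg_volume <= 0 or volume <= vol_mult * avg_volume:
        return "none"                 # volume is not unusual
    if abs(price_change_pct) < max_move_pct:
        return "accumulation"         # heavy volume, quiet price
    if price_change_pct < 0:
        return "distribution"         # heavy volume, big drop: skip
    return "none"                     # heavy volume, big gain: no longer quiet


def sustained_accumulation(volumes, avg_volume,
                           days=5, min_hits=3, vol_mult=1.5):
    """True if at least `min_hits` of the last `days` exceed 1.5x average."""
    recent = volumes[-days:]
    hits = sum(1 for v in recent if v > vol_mult * avg_volume)
    return len(recent) == days and hits >= min_hits
```

Returning a label instead of a boolean lets the scanner log why a high-volume day was rejected.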

### 4. Reddit DD

- Use LLM quality score for priority: 80+ = HIGH, 60-79 = MEDIUM, <60 = skip
- Subreddit weighting: r/investing bonus, r/pennystocks penalty
- Include post title and LLM score in context
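
A small sketch of the score-to-priority mapping follows. The plan does not say how large the subreddit bonus and penalty are or whether they apply before the cutoffs, so the point values in `SUBREDDIT_ADJUSTMENT` and the "adjust, then threshold" order are assumptions.

```python
# Hypothetical adjustment sizes; the plan only says "bonus" and "penalty".
SUBREDDIT_ADJUSTMENT = {"investing": 5, "pennystocks": -10}


def dd_priority(llm_score, subreddit):
    """Map an LLM quality score (0-100) to a priority, or None to skip."""
    adjusted = llm_score + SUBREDDIT_ADJUSTMENT.get(subreddit.lower(), 0)
    if adjusted >= 80:
        return "HIGH"
    if adjusted >= 60:
        return "MEDIUM"
    return None


def dd_context(title, llm_score):
    """Context string carrying the post title and the raw LLM score."""
    return f'Reddit DD "{title}" (LLM quality score {llm_score})'
```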

### 5. Reddit Trending

- Add mention count to context: "47 mentions in 6hrs"
- Priority by volume: 50+ = HIGH, 20-49 = MEDIUM
- Basic sentiment check from available data

### 6. Semantic News

- Include actual headline text in context (not just "Mentioned in recent market news")
- Catalyst classification from headline keywords: upgrade/FDA/acquisition/earnings
- Priority based on catalyst type
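
Keyword-based catalyst classification could look something like the sketch below. The four catalyst types come from the plan; the specific keyword patterns, the first-match ordering, and the priority assigned to each type are assumptions chosen for illustration.

```python
import re

# Checked in order; the first matching pattern wins. Patterns are illustrative.
CATALYST_PATTERNS = [
    ("acquisition", re.compile(r"\b(acquir\w*|acquisition|merger|buyout|takeover)\b", re.I)),
    ("fda", re.compile(r"\bFDA\b", re.I)),
    ("upgrade", re.compile(r"\b(upgrad\w*|raises? price target)\b", re.I)),
    ("earnings", re.compile(r"\b(earnings|EPS|revenue|guidance)\b", re.I)),
]

# Hypothetical mapping; the plan only says priority depends on catalyst type.
CATALYST_PRIORITY = {
    "acquisition": "CRITICAL",
    "fda": "HIGH",
    "upgrade": "HIGH",
    "earnings": "MEDIUM",
}


def classify_headline(headline):
    """Return (catalyst, priority), or (None, None) if no keyword matches."""
    for name, pattern in CATALYST_PATTERNS:
        if pattern.search(headline):
            return name, CATALYST_PRIORITY[name]
    return None, None


def news_context(headline, catalyst):
    """Context that carries the actual headline text."""
    label = catalyst if catalyst else "news"
    return f"[{label}] {headline}"
```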

### 7. Earnings Calendar

- Add historical earnings reaction via `get_pre_earnings_accumulation_signal()`
- Include EPS/revenue estimates from `get_ticker_earnings_estimate()`
- Priority: proximity + accumulation signal = CRITICAL

### 8. Market Movers

- Market cap filter: exclude <$300M
- Volume validation: require avg volume >500K
- Include change percentage in context
- Cross-reference with news for catalyst attribution
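
The two filters and the context string can be sketched in a few lines. The function names and the exact wording of the context are assumptions, and the optional `headline` argument stands in for whatever the news cross-reference returns.

```python
def passes_mover_filters(market_cap, avg_volume,
                         min_cap=300_000_000, min_avg_volume=500_000):
    """Drop micro-caps and thinly traded names before emitting a candidate."""
    return market_cap >= min_cap and avg_volume > min_avg_volume


def mover_context(ticker, change_pct, headline=None):
    """Context with the signed change and, if found, a possible catalyst."""
    base = f"{ticker} moved {change_pct:+.1f}%"
    if headline:
        return f"{base}; possible catalyst: {headline}"
    return f"{base}; no catalyst found in news"
```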

### 9. ML Signal

- Raise `min_win_prob` default from 0.35 to 0.50
- Log model metadata (version, training date) if available
- Add feature importances to context when model exposes them
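
As a rough sketch of the gating and context changes: the code below assumes the model follows the scikit-learn convention of exposing `feature_importances_`, and that a `version` attribute may be present. Both attribute lookups are guarded because the plan says "if available"; the function name and output format are hypothetical.

```python
def ml_context(win_prob, model, feature_names, min_win_prob=0.50, top_n=3):
    """Return a context string, or None if the signal is below threshold."""
    if win_prob < min_win_prob:
        return None
    parts = [f"ML win probability {win_prob:.0%}"]

    # Hypothetical metadata attribute; skipped silently when absent.
    version = getattr(model, "version", None)
    if version:
        parts.append(f"model {version}")

    # scikit-learn tree ensembles expose `feature_importances_`.
    importances = getattr(model, "feature_importances_", None)
    if importances is not None:
        ranked = sorted(zip(feature_names, importances),
                        key=lambda pair: pair[1], reverse=True)
        top = ", ".join(name for name, _ in ranked[:top_n])
        parts.append(f"top features: {top}")
    return "; ".join(parts)
```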

## Phase 2: New Scanners

### 10. Analyst Upgrades Scanner

- Uses existing `get_analyst_rating_changes()` from `alpha_vantage_analysts.py`
- Filters for upgrades, initiations, price target increases from last 3 days
- Priority: upgrade with >20% target increase = HIGH, initiation = MEDIUM
- Strategy: `analyst_upgrade`
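
The filter and priority rules might be expressed as below. The return shape of `get_analyst_rating_changes()` is not shown in this document, so the sketch works on plain dicts with hypothetical keys (`action`, `old_target`, `new_target`, `date`). Cases the plan does not rank, such as an upgrade with a smaller target increase, return `None` rather than a guessed priority.

```python
from datetime import date


def target_increase_pct(change):
    """Percentage change in price target, or None if it cannot be computed."""
    old, new = change.get("old_target"), change.get("new_target")
    if not old or new is None:
        return None
    return (new - old) / old * 100.0


def is_candidate(change, today, lookback_days=3):
    """Keep upgrades, initiations, and target raises from the last 3 days."""
    age = (today - change["date"]).days
    if age < 0 or age > lookback_days:
        return False
    increase = target_increase_pct(change)
    return (change["action"] in ("upgrade", "initiation")
            or (increase is not None and increase > 0))


def analyst_priority(change):
    """HIGH / MEDIUM per the plan; None where the plan is silent."""
    increase = target_increase_pct(change)
    if change["action"] == "upgrade" and increase is not None and increase > 20:
        return "HIGH"
    if change["action"] == "initiation":
        return "MEDIUM"
    return None
```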

### 11. Technical Breakout Scanner

- Uses yfinance OHLCV data (no new APIs)
- Detects: volume-confirmed breakout above 20-day high, or 52-week high on 2x+ volume
- Priority: 3x+ volume at breakout = HIGH, 2x+ = MEDIUM
- Strategy: `momentum` (reuses existing enum)
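
The detection rule could be sketched on plain lists of daily values, as below. In practice the inputs would come from yfinance OHLCV history; here they are ordinary Python lists so the logic stands alone. Treating "volume-confirmed" as at least 2x the trailing average, and measuring the average over the same lookback window, are assumptions. A 52-week check would pass a longer `lookback` (roughly 252 trading days).

```python
def volume_ratio(volumes, lookback=20):
    """Latest volume divided by the average of the prior `lookback` days."""
    prior = volumes[-lookback - 1:-1]
    average = sum(prior) / len(prior)
    return volumes[-1] / average if average > 0 else 0.0


def breakout_signal(closes, highs, volumes, lookback=20):
    """Return 'HIGH', 'MEDIUM', or None for the most recent bar.

    Expects three equal-length lists ordered oldest to newest.
    """
    if len(closes) < lookback + 1:
        return None                           # not enough history
    prior_high = max(highs[-lookback - 1:-1])  # excludes the latest bar
    ratio = volume_ratio(volumes, lookback)
    if closes[-1] <= prior_high or ratio < 2.0:
        return None                           # no breakout, or unconfirmed
    return "HIGH" if ratio >= 3.0 else "MEDIUM"
```

Excluding the latest bar from `prior_high` matters: otherwise the close could never exceed a window that already contains its own high.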

### 12. Sector Rotation Scanner

- Compares sector ETF relative strength: 5-day vs 20-day periods
- Flags individual stocks in accelerating sectors that haven't moved yet
- Uses yfinance sector ETFs (XLK, XLF, XLE, etc.)
- Strategy: new `sector_rotation` enum value
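
One possible reading of "accelerating" is sketched below. The plan does not define relative strength precisely, so several choices here are assumptions: strength is measured against a broad-market benchmark series, the 5-day and 20-day figures are compared on a per-day basis, and "hasn't moved yet" means the stock's 5-day move is under 2%. Inputs are plain lists of closing prices, oldest first.

```python
def pct_return(closes, days):
    """Percentage return over the last `days` trading days."""
    return (closes[-1] / closes[-days - 1] - 1.0) * 100.0


def relative_strength(sector_closes, benchmark_closes, days):
    """Sector return minus benchmark return over the same window."""
    return pct_return(sector_closes, days) - pct_return(benchmark_closes, days)


def is_accelerating(sector_closes, benchmark_closes, short=5, long=20):
    """True if short-window outperformance per day beats the long window."""
    rs_short = relative_strength(sector_closes, benchmark_closes, short) / short
    rs_long = relative_strength(sector_closes, benchmark_closes, long) / long
    return rs_short > 0 and rs_short > rs_long


def has_not_moved(stock_closes, days=5, max_move_pct=2.0):
    """True if the stock's recent move is still small (hypothetical cutoff)."""
    return abs(pct_return(stock_closes, days)) < max_move_pct
```

Dividing by the window length keeps a steady uptrend from looking like deceleration simply because 20 days accumulate more return than 5.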

## Data Sources

All improvements use existing APIs:
- yfinance (free, no key)
- Alpha Vantage (existing key)
- Finnhub (existing key)
- OpenInsider scraping (existing)
- Reddit PRAW (existing)

No new API subscriptions required.