TradingAgents/docs/plans/2026-02-18-scanner-improvem...

99 lines
3.8 KiB
Markdown

# Scanner Improvements Design
**Date:** 2026-02-18
**Status:** Approved
## Problem
Most scanners produce weak or broken signals:
- Insider buying drops transaction details (name, title, value)
- Options flow ignores its own premium filter, only checks nearest expiration
- Volume accumulation can't distinguish buying from selling
- Reddit DD scores posts with an LLM then ignores the score
- Semantic news is just regex extraction, not semantic
- Market movers finds stocks after they moved
- ML signal threshold (35%) is worse than a coin flip
Three useful scanner types are missing entirely: analyst upgrades, technical breakouts, sector rotation.
## Phase 1: Fix Existing Scanners
### 1. Insider Buying
- Preserve `insider_name`, `title`, `transaction_value`, `shares` from scraper
- Priority by significance: CEO/CFO >$100K = CRITICAL, director >$50K = HIGH, other = MEDIUM
- Cluster detection: 2+ insiders buying same stock within 14 days = CRITICAL
- Rich context: "CFO Jane Smith purchased $250K of shares"
### 2. Options Flow
- Apply the existing `min_premium` threshold ($25K) — currently configured but never checked
- Scan up to 3 nearest expirations instead of 1
- Classify moneyness: ITM call buying (conviction) > OTM (speculative)
- Weight by expiration: 30+ DTE scored higher than weeklies
### 3. Volume Accumulation
- Price-change filter: volume >2x AND absolute price change <3% (quiet accumulation only)
- Multi-day mode: 3 of last 5 days >1.5x average = sustained accumulation
- Exclude distribution: high volume + big price drop = skip
### 4. Reddit DD
- Use LLM quality score for priority: 80+ = HIGH, 60-79 = MEDIUM, <60 = skip
- Subreddit weighting: r/investing bonus, r/pennystocks penalty
- Include post title and LLM score in context
### 5. Reddit Trending
- Add mention count to context: "47 mentions in 6hrs"
- Priority by volume: 50+ = HIGH, 20-49 = MEDIUM
- Basic sentiment check from available data
### 6. Semantic News
- Include actual headline text in context (not just "Mentioned in recent market news")
- Catalyst classification from headline keywords: upgrade/FDA/acquisition/earnings
- Priority based on catalyst type
### 7. Earnings Calendar
- Add historical earnings reaction via `get_pre_earnings_accumulation_signal()`
- Include EPS/revenue estimates from `get_ticker_earnings_estimate()`
- Priority: proximity + accumulation signal = CRITICAL
### 8. Market Movers
- Market cap filter: exclude <$300M
- Volume validation: require avg volume >500K
- Include change percentage in context
- Cross-reference with news for catalyst attribution
### 9. ML Signal
- Raise `min_win_prob` default from 0.35 to 0.50
- Log model metadata (version, training date) if available
- Add feature importances to context when model exposes them
## Phase 2: New Scanners
### 10. Analyst Upgrades Scanner
- Uses existing `get_analyst_rating_changes()` from `alpha_vantage_analysts.py`
- Filters for upgrades, initiations, price target increases from last 3 days
- Priority: upgrade with >20% target increase = HIGH, initiation = MEDIUM
- Strategy: `analyst_upgrade`
### 11. Technical Breakout Scanner
- Uses yfinance OHLCV data (no new APIs)
- Detects: volume-confirmed breakout above 20-day high, or 52-week high on 2x+ volume
- Priority: 3x+ volume at breakout = HIGH, 2x+ = MEDIUM
- Strategy: `momentum` (reuses existing enum)
### 12. Sector Rotation Scanner
- Compares sector ETF relative strength: 5-day vs 20-day periods
- Flags individual stocks in accelerating sectors that haven't moved yet
- Uses yfinance sector ETFs (XLK, XLF, XLE, etc.)
- Strategy: new `sector_rotation` enum value
## Data Sources
All improvements use existing APIs:
- yfinance (free, no key)
- Alpha Vantage (existing key)
- Finnhub (existing key)
- OpenInsider scraping (existing)
- Reddit PRAW (existing)
No new API subscriptions required.