TradingAgents/docs/plans/2026-02-18-scanner-improvem...

3.8 KiB

Scanner Improvements Design

Date: 2026-02-18 Status: Approved

Problem

Most scanners produce weak or broken signals:

  • Insider buying drops transaction details (name, title, value)
  • Options flow ignores its own premium filter, only checks nearest expiration
  • Volume accumulation can't distinguish buying from selling
  • Reddit DD scores posts with an LLM then ignores the score
  • Semantic news is just regex extraction, not semantic
  • Market movers finds stocks after they moved
  • ML signal threshold (35%) is worse than a coin flip

Three useful scanner types are missing entirely: analyst upgrades, technical breakouts, sector rotation.

Phase 1: Fix Existing Scanners

1. Insider Buying

  • Preserve insider_name, title, transaction_value, shares from scraper
  • Priority by significance: CEO/CFO >$100K = CRITICAL, director >$50K = HIGH, other = MEDIUM
  • Cluster detection: 2+ insiders buying same stock within 14 days = CRITICAL
  • Rich context: "CFO Jane Smith purchased $250K of shares"

2. Options Flow

  • Apply the existing min_premium threshold ($25K) — currently configured but never checked
  • Scan up to 3 nearest expirations instead of 1
  • Classify moneyness: ITM call buying (conviction) > OTM (speculative)
  • Weight by expiration: 30+ DTE scored higher than weeklies

3. Volume Accumulation

  • Price-change filter: volume >2x AND absolute price change <3% (quiet accumulation only)
  • Multi-day mode: 3 of last 5 days >1.5x average = sustained accumulation
  • Exclude distribution: high volume + big price drop = skip

4. Reddit DD

  • Use LLM quality score for priority: 80+ = HIGH, 60-79 = MEDIUM, <60 = skip
  • Subreddit weighting: r/investing bonus, r/pennystocks penalty
  • Include post title and LLM score in context
  • Add mention count to context: "47 mentions in 6hrs"
  • Priority by volume: 50+ = HIGH, 20-49 = MEDIUM
  • Basic sentiment check from available data

6. Semantic News

  • Include actual headline text in context (not just "Mentioned in recent market news")
  • Catalyst classification from headline keywords: upgrade/FDA/acquisition/earnings
  • Priority based on catalyst type

7. Earnings Calendar

  • Add historical earnings reaction via get_pre_earnings_accumulation_signal()
  • Include EPS/revenue estimates from get_ticker_earnings_estimate()
  • Priority: proximity + accumulation signal = CRITICAL

8. Market Movers

  • Market cap filter: exclude <$300M
  • Volume validation: require avg volume >500K
  • Include change percentage in context
  • Cross-reference with news for catalyst attribution

9. ML Signal

  • Raise min_win_prob default from 0.35 to 0.50
  • Log model metadata (version, training date) if available
  • Add feature importances to context when model exposes them

Phase 2: New Scanners

10. Analyst Upgrades Scanner

  • Uses existing get_analyst_rating_changes() from alpha_vantage_analysts.py
  • Filters for upgrades, initiations, price target increases from last 3 days
  • Priority: upgrade with >20% target increase = HIGH, initiation = MEDIUM
  • Strategy: analyst_upgrade

11. Technical Breakout Scanner

  • Uses yfinance OHLCV data (no new APIs)
  • Detects: volume-confirmed breakout above 20-day high, or 52-week high on 2x+ volume
  • Priority: 3x+ volume at breakout = HIGH, 2x+ = MEDIUM
  • Strategy: momentum (reuses existing enum)

12. Sector Rotation Scanner

  • Compares sector ETF relative strength: 5-day vs 20-day periods
  • Flags individual stocks in accelerating sectors that haven't moved yet
  • Uses yfinance sector ETFs (XLK, XLF, XLE, etc.)
  • Strategy: new sector_rotation enum value

Data Sources

All improvements use existing APIs:

  • yfinance (free, no key)
  • Alpha Vantage (existing key)
  • Finnhub (existing key)
  • OpenInsider scraping (existing)
  • Reddit PRAW (existing)

No new API subscriptions required.