Scanner Improvements Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Fix signal quality issues in all 9 existing discovery scanners and add 3 new scanners (analyst upgrades, technical breakout, sector rotation).
Architecture: Each scanner is a subclass of BaseScanner in tradingagents/dataflows/discovery/scanners/. Scanners register via SCANNER_REGISTRY.register() at import time. They return List[Dict] of candidate dicts with ticker, source, context, priority, strategy fields. The filter and ranker downstream consume these candidates.
Tech Stack: Python, yfinance, Alpha Vantage API, Finnhub API, OpenInsider scraping, PRAW (Reddit)
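To make the candidate contract concrete, here is a minimal sketch using stand-in classes. The real `BaseScanner` and `SCANNER_REGISTRY` live under `tradingagents/dataflows/discovery/` and carry more machinery; only the candidate dict shape is taken from the description above.

```python
from typing import Any, Dict, List

# Stand-in for the real BaseScanner in scanner_registry.py (assumption:
# the real class also handles config, enable flags, and limits).
class BaseScanner:
    name = "base"
    strategy = "none"

    def scan(self, state: Dict[str, Any]) -> List[Dict[str, Any]]:
        raise NotImplementedError

class ExampleScanner(BaseScanner):
    """Illustrates the candidate dict the filter and ranker consume."""
    name = "example"
    strategy = "demo"

    def scan(self, state: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Every candidate must carry these five fields.
        return [{
            "ticker": "AAPL",
            "source": self.name,
            "context": "Example candidate",
            "priority": "HIGH",
            "strategy": self.strategy,
        }]

candidates = ExampleScanner().scan({})
```

Scanners may attach extra fields (as the tasks below do), but downstream code relies on these five being present.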
Phase 1: Fix Existing Scanners
Task 1: Fix Insider Buying — Preserve Transaction Details
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/insider_buying.py`
Context: The scraper (finviz_scraper.py:get_finviz_insider_buying) returns structured dicts with insider, title, value_num, qty, price, trade_type when called with return_structured=True. But the scanner calls it with return_structured=False (markdown string) and then parses only the ticker from markdown rows, losing all transaction details.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/insider_buying.py fully to understand current logic.
Step 2: Rewrite the scan() method
Replace the scan method. Key changes:
- Call `get_finviz_insider_buying(lookback_days, min_transaction_value, return_structured=True)` to get structured data
- Preserve `insider_name`, `title`, `transaction_value`, `shares` in the candidate output
- Priority by significance: CEO/CFO title + value > $100K = CRITICAL; director + > $50K = HIGH; other = MEDIUM
- Cluster detection: if 2+ unique insiders bought the same ticker, boost to CRITICAL
- Rich context string: "CEO John Smith purchased $250K of AAPL shares"
```python
def scan(self, state: Dict[str, Any]) -> List[Dict[str, Any]]:
    if not self.is_enabled():
        return []
    logger.info("🔍 Scanning insider buying (OpenInsider)...")
    try:
        from tradingagents.dataflows.finviz_scraper import get_finviz_insider_buying

        transactions = get_finviz_insider_buying(
            lookback_days=self.lookback_days,
            min_transaction_value=self.min_transaction_value,
            return_structured=True,
        )
        if not transactions:
            logger.info("No insider buying transactions found")
            return []
        logger.info(f"Found {len(transactions)} insider transactions")

        # Group by ticker for cluster detection
        by_ticker: Dict[str, list] = {}
        for txn in transactions:
            ticker = txn.get("ticker", "").upper().strip()
            if not ticker:
                continue
            by_ticker.setdefault(ticker, []).append(txn)

        candidates = []
        for ticker, txns in by_ticker.items():
            # Use the largest transaction as primary
            txns.sort(key=lambda t: t.get("value_num", 0), reverse=True)
            primary = txns[0]
            insider_name = primary.get("insider", "Unknown")
            title = primary.get("title", "")
            value = primary.get("value_num", 0)
            value_str = primary.get("value_str", f"${value:,.0f}")
            # Count unique insiders, not raw transactions, for clusters
            num_insiders = len({t.get("insider", "") for t in txns})

            # Priority by significance
            title_lower = title.lower()
            is_c_suite = any(
                t in title_lower
                for t in ["ceo", "cfo", "coo", "cto", "president", "chairman"]
            )
            is_director = "director" in title_lower
            if num_insiders >= 2:
                priority = Priority.CRITICAL.value
            elif is_c_suite and value >= 100_000:
                priority = Priority.CRITICAL.value
            elif is_c_suite or (is_director and value >= 50_000):
                priority = Priority.HIGH.value
            elif value >= 50_000:
                priority = Priority.HIGH.value
            else:
                priority = Priority.MEDIUM.value

            # Build context
            if num_insiders > 1:
                context = (
                    f"Cluster: {num_insiders} insiders buying {ticker}. "
                    f"Largest: {title} {insider_name} purchased {value_str}"
                )
            else:
                context = f"{title} {insider_name} purchased {value_str} of {ticker}"

            candidates.append({
                "ticker": ticker,
                "source": self.name,
                "context": context,
                "priority": priority,
                "strategy": self.strategy,
                "insider_name": insider_name,
                "insider_title": title,
                "transaction_value": value,
                "num_insiders_buying": num_insiders,
            })
            if len(candidates) >= self.limit:
                break

        logger.info(f"Insider buying: {len(candidates)} candidates")
        return candidates
    except Exception as e:
        logger.error(f"Insider buying scan failed: {e}", exc_info=True)
        return []
```
Step 3: Run verification
```bash
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners.insider_buying
cls = SCANNER_REGISTRY.scanners['insider_buying']
print(f'name={cls.name}, strategy={cls.strategy}, pipeline={cls.pipeline}')
print('Has scan method:', hasattr(cls, 'scan'))
"
```
Step 4: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/insider_buying.py
git commit -m "fix(insider-buying): preserve transaction details, add cluster detection and smart priority"
```
Task 2: Fix Options Flow — Apply Premium Filter, Multi-Expiration
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/options_flow.py`
Context: self.min_premium is loaded at line 50 but never used. Only expirations[0] is scanned (line 104). Need to apply premium filter and scan up to 3 expirations.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/options_flow.py fully.
Step 2: Fix the _scan_ticker method
Key changes to _scan_ticker():
- Loop through up to 3 expirations instead of just `expirations[0]`
- Add premium filter: skip strikes where `volume * lastPrice * 100 < self.min_premium`
- Track which expiration had the most unusual activity
- Add a `days_to_expiry` classification in the output
Replace the inner scanning logic (the _scan_ticker method). The core change is:
```python
def _scan_ticker(self, ticker: str) -> Optional[Dict[str, Any]]:
    """Scan a single ticker for unusual options activity."""
    try:
        expirations = get_ticker_options(ticker)
        if not expirations:
            return None

        # Scan up to 3 nearest expirations
        max_expirations = min(3, len(expirations))
        total_unusual_calls = 0
        total_unusual_puts = 0
        total_call_vol = 0
        total_put_vol = 0
        best_expiration = None
        best_unusual_count = 0

        for exp in expirations[:max_expirations]:
            try:
                options = get_option_chain(ticker, exp)
            except Exception:
                continue
            if options is None:
                continue
            calls_df, puts_df = (None, None)
            if isinstance(options, tuple) and len(options) == 2:
                calls_df, puts_df = options
            elif hasattr(options, "calls") and hasattr(options, "puts"):
                calls_df, puts_df = options.calls, options.puts
            else:
                continue

            exp_unusual_calls = 0
            exp_unusual_puts = 0

            # Analyze calls
            if calls_df is not None and not calls_df.empty:
                for _, opt in calls_df.iterrows():
                    vol = opt.get("volume", 0) or 0
                    oi = opt.get("openInterest", 0) or 0
                    price = opt.get("lastPrice", 0) or 0
                    if vol < self.min_volume:
                        continue
                    # Premium filter (volume * price * 100 shares per contract)
                    if (vol * price * 100) < self.min_premium:
                        continue
                    if oi > 0 and (vol / oi) >= self.min_volume_oi_ratio:
                        exp_unusual_calls += 1
                        total_call_vol += vol

            # Analyze puts
            if puts_df is not None and not puts_df.empty:
                for _, opt in puts_df.iterrows():
                    vol = opt.get("volume", 0) or 0
                    oi = opt.get("openInterest", 0) or 0
                    price = opt.get("lastPrice", 0) or 0
                    if vol < self.min_volume:
                        continue
                    if (vol * price * 100) < self.min_premium:
                        continue
                    if oi > 0 and (vol / oi) >= self.min_volume_oi_ratio:
                        exp_unusual_puts += 1
                        total_put_vol += vol

            total_unusual_calls += exp_unusual_calls
            total_unusual_puts += exp_unusual_puts
            exp_total = exp_unusual_calls + exp_unusual_puts
            if exp_total > best_unusual_count:
                best_unusual_count = exp_total
                best_expiration = exp

        total_unusual = total_unusual_calls + total_unusual_puts
        if total_unusual == 0:
            return None

        # Calculate put/call ratio
        pc_ratio = total_put_vol / total_call_vol if total_call_vol > 0 else 999
        if pc_ratio < 0.7:
            sentiment = "bullish"
        elif pc_ratio > 1.3:
            sentiment = "bearish"
        else:
            sentiment = "neutral"
        priority = Priority.HIGH.value if sentiment == "bullish" else Priority.MEDIUM.value
        context = (
            f"Unusual options: {total_unusual} strikes across {max_expirations} exp, "
            f"P/C={pc_ratio:.2f} ({sentiment}), "
            f"{total_unusual_calls} unusual calls / {total_unusual_puts} unusual puts"
        )
        return {
            "ticker": ticker,
            "source": self.name,
            "context": context,
            "priority": priority,
            "strategy": self.strategy,
            "put_call_ratio": round(pc_ratio, 2),
            "unusual_calls": total_unusual_calls,
            "unusual_puts": total_unusual_puts,
            "best_expiration": best_expiration,
        }
    except Exception as e:
        logger.debug(f"Error scanning {ticker}: {e}")
        return None
```
Step 3: Verify
```bash
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners.options_flow
cls = SCANNER_REGISTRY.scanners['options_flow']
print(f'name={cls.name}, strategy={cls.strategy}')
"
```
Step 4: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/options_flow.py
git commit -m "fix(options-flow): apply premium filter, scan multiple expirations"
```
Task 3: Fix Volume Accumulation — Distinguish Accumulation from Distribution
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/volume_accumulation.py`
Context: Currently flags any unusual volume. Need to add price-change context and multi-day accumulation detection.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/volume_accumulation.py fully.
Step 2: Add price-change and multi-day enrichment
After the existing volume parsing, add enrichment using yfinance data. The key addition is a helper that checks whether the volume spike is accumulation (flat price) vs distribution (big drop):
```python
def _enrich_volume_candidate(self, ticker: str, cand: Dict[str, Any]) -> Dict[str, Any]:
    """Add price-change context to distinguish accumulation from distribution."""
    try:
        from tradingagents.dataflows.y_finance import download_history

        hist = download_history(ticker, period="10d", interval="1d", auto_adjust=True, progress=False)
        if hist.empty or len(hist) < 2:
            return cand

        # Today's price change
        latest_close = float(hist["Close"].iloc[-1])
        prev_close = float(hist["Close"].iloc[-2])
        day_change_pct = ((latest_close - prev_close) / prev_close) * 100
        cand["day_change_pct"] = round(day_change_pct, 2)

        # Multi-day volume pattern: count days with >1.5x avg volume in last 5 days
        if len(hist) >= 6:
            # Baseline excludes the most recent 5 sessions
            avg_vol = float(hist["Volume"].iloc[:-5].mean())
            if avg_vol > 0:
                recent_high_vol_days = sum(
                    1 for v in hist["Volume"].iloc[-5:] if float(v) > avg_vol * 1.5
                )
                cand["high_vol_days_5d"] = recent_high_vol_days
                if recent_high_vol_days >= 3:
                    cand["context"] += f" | Sustained: {recent_high_vol_days}/5 days above 1.5x avg"

        # Classify signal
        if abs(day_change_pct) < 3:
            # Quiet accumulation — the best signal
            cand["volume_signal"] = "accumulation"
            cand["context"] += f" | Price flat ({day_change_pct:+.1f}%) — quiet accumulation"
        elif day_change_pct < -5:
            # Distribution / panic selling
            cand["volume_signal"] = "distribution"
            cand["priority"] = Priority.LOW.value
            cand["context"] += f" | Price dropped {day_change_pct:+.1f}% — possible distribution"
        else:
            cand["volume_signal"] = "momentum"
    except Exception as e:
        logger.debug(f"Volume enrichment failed for {ticker}: {e}")
    return cand
```
Call this method for each candidate after the existing parsing loop, before appending to the final list. Skip (don't append) candidates with volume_signal == "distribution".
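That call-site loop can be sketched as follows, with a stub standing in for `_enrich_volume_candidate` (the real method is defined above; the stub only mimics its classification output so the filtering step is visible):

```python
from typing import Any, Dict

def enrich(cand: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for _enrich_volume_candidate: classify from a fake day change.
    cand["volume_signal"] = "distribution" if cand["day_change_pct"] < -5 else "accumulation"
    return cand

parsed = [
    {"ticker": "AAA", "day_change_pct": 0.8},   # flat price: quiet accumulation
    {"ticker": "BBB", "day_change_pct": -7.2},  # big drop: distribution
]
# Enrich each candidate, then drop distribution signals before the final list.
candidates = [
    c for c in (enrich(p) for p in parsed)
    if c.get("volume_signal") != "distribution"
]
# Only AAA survives; BBB is classified as distribution and dropped.
```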
Step 3: Verify
```bash
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners.volume_accumulation
print('volume_accumulation registered:', 'volume_accumulation' in SCANNER_REGISTRY.scanners)
"
```
Step 4: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/volume_accumulation.py
git commit -m "fix(volume): distinguish accumulation from distribution, add multi-day pattern"
```
Task 4: Fix Reddit DD — Use LLM Quality Score
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/reddit_dd.py`
Context: The LLM evaluates each DD post with a 0-100 quality score, but the scanner stores it as dd_score and uses Reddit upvotes for priority instead. Additionally, the tool "scan_reddit_dd" may not exist in the registry, causing the scanner to always fall back.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/reddit_dd.py fully, and check if "scan_reddit_dd" exists in tradingagents/tools/registry.py.
Step 2: Fix priority logic to use quality score
In the structured result parsing section (where dd posts are iterated), change the priority assignment:
Replace the existing priority logic with:
```python
dd_score = post.get("quality_score", post.get("score", 0))
if dd_score >= 80:
    priority = Priority.HIGH.value
elif dd_score >= 60:
    priority = Priority.MEDIUM.value
else:
    # Skip low-quality posts
    continue
```
Also preserve the score and post title in the context:
```python
title = post.get("title", "")[:100]
context = f"Reddit DD (score: {dd_score}/100): {title}"
```
And include in the candidate dict:
```python
"dd_quality_score": dd_score,
"dd_title": title,
```
If the "scan_reddit_dd" tool doesn't exist in the registry, add a fallback that calls get_reddit_undiscovered_dd() directly (imported from tradingagents.dataflows.reddit_api).
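The fallback wiring can be sketched generically. The registry-lookup shape and the `get_reddit_undiscovered_dd()` signature are assumptions to verify against `tradingagents/tools/registry.py` and `tradingagents/dataflows/reddit_api`; the sketch injects both as callables so the selection logic stands alone:

```python
from typing import Any, Callable, Dict, List

def fetch_dd_posts(
    registry: Dict[str, Callable[[], List[Dict[str, Any]]]],
    direct_fetch: Callable[[], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Prefer the registered tool; fall back to the direct API call."""
    tool = registry.get("scan_reddit_dd")
    if tool is not None:
        return tool()
    return direct_fetch()

# Stubbed usage: the registry lacks the tool, so the fallback fires.
posts = fetch_dd_posts({}, lambda: [{"title": "Deep dive on $XYZ", "quality_score": 82}])
```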
Step 3: Verify
```bash
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners.reddit_dd
print('reddit_dd registered:', 'reddit_dd' in SCANNER_REGISTRY.scanners)
"
```
Step 4: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/reddit_dd.py
git commit -m "fix(reddit-dd): use LLM quality score for priority, preserve post details"
```
Task 5: Fix Reddit Trending — Add Mention Count and Sentiment
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/reddit_trending.py`
Context: Currently all candidates get MEDIUM priority with a generic "Reddit trending discussion" context. No mention counts or sentiment info.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/reddit_trending.py fully.
Step 2: Enrich with mention counts
If the tool returns structured data (list of dicts), extract mention counts. If it returns text, count ticker occurrences. Use counts for priority:
```python
from collections import Counter

# After extracting tickers, count mentions
ticker_counts = Counter()
# ... count each ticker mention in result text/data

for ticker in unique_tickers:
    count = ticker_counts.get(ticker, 1)
    if count >= 50:
        priority = Priority.HIGH.value
    elif count >= 20:
        priority = Priority.MEDIUM.value
    else:
        priority = Priority.LOW.value
    context = f"Trending on Reddit: ~{count} mentions"
```
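For the text-fallback path, one way to fill in the counting step is a regex over the tool's raw output. The pattern below is an assumption (the scanner may already have its own ticker extraction) and will match common words in all caps, so pair it with a known-ticker allowlist in practice:

```python
import re
from collections import Counter

# Sample of what a text-mode tool result might look like (hypothetical).
text = "GME GME $GME to the moon, also watching AMC and $AMC"

# Count $TICKER and bare-uppercase mentions (2-5 letters).
ticker_counts = Counter(
    m.group(1) for m in re.finditer(r"\$?([A-Z]{2,5})\b", text)
)
# ticker_counts now maps GME -> 3, AMC -> 2
```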
Step 3: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/reddit_trending.py
git commit -m "fix(reddit-trending): add mention counts, scale priority by volume"
```
Task 6: Fix Semantic News — Include Headlines, Add Catalyst Classification
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/semantic_news.py`
Context: self.min_importance is loaded (line 23) but never used. Context is generic "Mentioned in recent market news" with no headline text. Scanner just regex-extracts uppercase words.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/semantic_news.py fully.
Step 2: Improve context and add catalyst classification
When creating candidates, include the actual headline text. Add simple keyword-based catalyst classification for priority:
```python
CATALYST_KEYWORDS = {
    Priority.CRITICAL.value: ["fda approval", "acquisition", "merger", "buyout", "takeover"],
    Priority.HIGH.value: ["upgrade", "initiated", "beat", "surprise", "contract win", "patent"],
    Priority.MEDIUM.value: ["downgrade", "miss", "lawsuit", "investigation", "recall"],
}

def _classify_catalyst(self, headline: str) -> str:
    """Classify a news headline by catalyst type and return a priority."""
    headline_lower = headline.lower()
    for priority, keywords in CATALYST_KEYWORDS.items():
        if any(kw in headline_lower for kw in keywords):
            return priority
    return Priority.MEDIUM.value
```
For each news item, preserve the headline and set priority by catalyst type:
```python
headline = news_item.get("title", "")[:150]
priority = self._classify_catalyst(headline)
context = f"News catalyst: {headline}" if headline else "Mentioned in recent market news"
```
Also store `news_context` as a list of headline dicts for the downstream ranker:
```python
"news_context": [{"news_title": headline, "news_summary": summary, "published_at": timestamp}]
```
Step 3: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/semantic_news.py
git commit -m "fix(semantic-news): include headlines, add catalyst classification"
```
Task 7: Fix Earnings Calendar — Add Accumulation Signal and Estimates
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/earnings_calendar.py`
Context: Currently a pure calendar. get_pre_earnings_accumulation_signal() and get_ticker_earnings_estimate() already exist in the codebase but aren't used.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/earnings_calendar.py fully.
Step 2: Add accumulation signal enrichment
After the existing candidate creation, add a post-processing step. For each candidate with days_until between 2 and 7, check for volume accumulation:
```python
def _enrich_earnings_candidate(self, cand: Dict[str, Any]) -> Dict[str, Any]:
    """Enrich earnings candidate with accumulation signal and estimates."""
    ticker = cand["ticker"]

    # Check pre-earnings volume accumulation
    try:
        from tradingagents.dataflows.y_finance import get_pre_earnings_accumulation_signal

        signal = get_pre_earnings_accumulation_signal(ticker)
        if signal and signal.get("signal"):
            vol_ratio = signal.get("volume_ratio", 0)
            cand["has_accumulation"] = True
            cand["accumulation_volume_ratio"] = vol_ratio
            cand["context"] += f" | Pre-earnings accumulation: {vol_ratio:.1f}x volume"
            # Boost priority if accumulation detected
            cand["priority"] = Priority.CRITICAL.value
    except Exception:
        pass

    # Add earnings estimates
    try:
        from tradingagents.dataflows.finnhub_api import get_ticker_earnings_estimate

        est = get_ticker_earnings_estimate(ticker)
        if est and est.get("has_upcoming_earnings"):
            eps = est.get("eps_estimate")
            if eps is not None:
                cand["eps_estimate"] = eps
                cand["context"] += f" | EPS est: ${eps:.2f}"
    except Exception:
        pass

    return cand
```
Call this for each candidate before appending to the final list. Limit enrichment to avoid API rate limits (only enrich top 10 by proximity).
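The rate-limit guard can be sketched as a small helper; `days_until` on each candidate is assumed from the existing calendar parsing, and the enrichment callable stands in for `_enrich_earnings_candidate`:

```python
from typing import Any, Callable, Dict, List

def enrich_top_by_proximity(
    candidates: List[Dict[str, Any]],
    enrich: Callable[[Dict[str, Any]], Any],
    top_n: int = 10,
) -> List[Dict[str, Any]]:
    """Enrich only the top_n candidates nearest their earnings date."""
    for cand in sorted(candidates, key=lambda c: c.get("days_until", 99))[:top_n]:
        enrich(cand)  # mutates the candidate in place
    return candidates

# Stubbed usage: only the two nearest candidates get API-backed enrichment.
cands = [
    {"ticker": "A", "days_until": 6},
    {"ticker": "B", "days_until": 2},
    {"ticker": "C", "days_until": 30},
]
enrich_top_by_proximity(cands, lambda c: c.update(enriched=True), top_n=2)
# A and B are enriched; C (30 days out) is skipped.
```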
Step 3: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/earnings_calendar.py
git commit -m "fix(earnings): add pre-earnings accumulation signal and EPS estimates"
```
Task 8: Fix Market Movers — Add Market Cap and Volume Filters
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/market_movers.py`
Context: Takes whatever Alpha Vantage returns with no filtering. Penny stocks with 400% gains on 100 shares get included.
Step 1: Read the current scanner
Read tradingagents/dataflows/discovery/scanners/market_movers.py fully.
Step 2: Add filtering configuration and validation
Add configurable filters in __init__:
```python
self.min_price = self.scanner_config.get("min_price", 5.0)
self.min_volume = self.scanner_config.get("min_volume", 500_000)
```
After parsing candidates from the tool result, validate each one:
```python
def _validate_mover(self, ticker: str) -> bool:
    """Quick validation: price and volume check."""
    try:
        from tradingagents.dataflows.y_finance import get_stock_price, get_ticker_info

        price = get_stock_price(ticker)
        if price is not None and price < self.min_price:
            return False
        info = get_ticker_info(ticker)
        avg_vol = info.get("averageVolume", 0) if info else 0
        if avg_vol and avg_vol < self.min_volume:
            return False
        return True
    except Exception:
        return True  # Don't filter on errors
```
Call _validate_mover() before appending each candidate. This removes penny stocks and illiquid names.
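The call-site shape can be sketched with a stub validator mirroring `_validate_mover`'s thresholds (the stub reads from a dict instead of hitting yfinance):

```python
from typing import Any, Dict, List

def validate(info: Dict[str, Any], min_price: float = 5.0, min_volume: int = 500_000) -> bool:
    """Stand-in for _validate_mover: price floor, then liquidity floor."""
    if info["price"] < min_price:
        return False
    return info["avg_volume"] >= min_volume

movers: List[Dict[str, Any]] = [
    {"ticker": "PENNY", "price": 0.42, "avg_volume": 90_000},     # 400% gain on no volume
    {"ticker": "LIQ", "price": 37.5, "avg_volume": 4_200_000},    # liquid mid-cap
]
candidates = [m for m in movers if validate(m)]
# Only LIQ passes; the penny stock is filtered out before ranking.
```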
Step 3: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/market_movers.py
git commit -m "fix(market-movers): add price and volume validation filters"
```
Task 9: Fix ML Signal — Raise Threshold
Files:
- Modify: `tradingagents/dataflows/discovery/scanners/ml_signal.py`
Context: Default min_win_prob is 0.35 (35%). This is barely better than random.
Step 1: Change default threshold
In __init__, change the default:
```python
# Change from:
self.min_win_prob = self.scanner_config.get("min_win_prob", 0.35)
# To:
self.min_win_prob = self.scanner_config.get("min_win_prob", 0.50)
```
Also adjust priority thresholds to match:
```python
# Change from:
if win_prob >= 0.50:
    priority = Priority.CRITICAL.value
elif win_prob >= 0.40:
    priority = Priority.HIGH.value
else:
    priority = Priority.MEDIUM.value

# To:
if win_prob >= 0.65:
    priority = Priority.CRITICAL.value
elif win_prob >= 0.55:
    priority = Priority.HIGH.value
else:
    priority = Priority.MEDIUM.value
```
Step 2: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/ml_signal.py
git commit -m "fix(ml-signal): raise min win probability to 50%, adjust priority tiers"
```
Phase 2: New Scanners
Task 10: Add Strategy Enum Values for New Scanners
Files:
- Modify: `tradingagents/dataflows/discovery/utils.py`
Step 1: Add new enum values
Add after the existing SOCIAL_DD entry:
```python
SECTOR_ROTATION = "sector_rotation"
TECHNICAL_BREAKOUT = "technical_breakout"
```
ANALYST_UPGRADE already exists in the enum.
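For reference, the resulting enum would have roughly this shape; the class name `Strategy`, the `str, Enum` base, and the surrounding members are assumptions, so match whatever `discovery/utils.py` actually defines:

```python
from enum import Enum

class Strategy(str, Enum):
    # Existing members (names here are illustrative)
    SOCIAL_DD = "social_dd"
    ANALYST_UPGRADE = "analyst_upgrade"
    # New members added by this task
    SECTOR_ROTATION = "sector_rotation"
    TECHNICAL_BREAKOUT = "technical_breakout"
```

The string values are what scanners put in the candidate `strategy` field, so `Strategy("sector_rotation")` round-trips back to the member.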
Step 2: Commit
```bash
git add tradingagents/dataflows/discovery/utils.py
git commit -m "feat: add sector_rotation and technical_breakout strategy enum values"
```
Task 11: Add Analyst Upgrades Scanner
Files:
- Create: `tradingagents/dataflows/discovery/scanners/analyst_upgrades.py`
- Modify: `tradingagents/dataflows/discovery/scanners/__init__.py`
Context: get_analyst_rating_changes(return_structured=True) already exists in alpha_vantage_analysts.py. Returns list of dicts with ticker, action, date, hours_old, headline, source, url.
Step 1: Create the scanner
```python
"""Analyst upgrade and initiation scanner."""
from typing import Any, Dict, List

from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY, BaseScanner
from tradingagents.dataflows.discovery.utils import Priority
from tradingagents.utils.logger import get_logger

logger = get_logger(__name__)


class AnalystUpgradeScanner(BaseScanner):
    """Scan for recent analyst upgrades and coverage initiations."""

    name = "analyst_upgrades"
    pipeline = "edge"
    strategy = "analyst_upgrade"

    def __init__(self, config: Dict[str, Any]):
        super().__init__(config)
        self.lookback_days = self.scanner_config.get("lookback_days", 3)
        self.max_hours_old = self.scanner_config.get("max_hours_old", 72)

    def scan(self, state: Dict[str, Any]) -> List[Dict[str, Any]]:
        if not self.is_enabled():
            return []
        logger.info("📊 Scanning analyst upgrades and initiations...")
        try:
            from tradingagents.dataflows.alpha_vantage_analysts import (
                get_analyst_rating_changes,
            )

            changes = get_analyst_rating_changes(
                lookback_days=self.lookback_days,
                change_types=["upgrade", "initiated"],
                top_n=self.limit * 2,
                return_structured=True,
            )
            if not changes:
                logger.info("No analyst upgrades found")
                return []

            candidates = []
            for change in changes:
                ticker = change.get("ticker", "").upper().strip()
                if not ticker:
                    continue
                action = change.get("action", "unknown")
                hours_old = change.get("hours_old", 999)
                headline = change.get("headline", "")
                source = change.get("source", "")
                if hours_old > self.max_hours_old:
                    continue

                # Priority by freshness and action type
                if action in ("upgrade", "initiated") and hours_old <= 24:
                    priority = Priority.HIGH.value
                elif hours_old <= 48:
                    priority = Priority.MEDIUM.value
                else:
                    priority = Priority.LOW.value

                context = (
                    f"Analyst {action}: {headline}"
                    if headline
                    else f"Analyst {action} ({source})"
                )
                candidates.append({
                    "ticker": ticker,
                    "source": self.name,
                    "context": context,
                    "priority": priority,
                    "strategy": self.strategy,
                    "analyst_action": action,
                    "hours_old": hours_old,
                })
                if len(candidates) >= self.limit:
                    break

            logger.info(f"Analyst upgrades: {len(candidates)} candidates")
            return candidates
        except Exception as e:
            logger.error(f"Analyst upgrades scan failed: {e}", exc_info=True)
            return []


SCANNER_REGISTRY.register(AnalystUpgradeScanner)
```
Step 2: Register in __init__.py
Add to the import block:
```python
analyst_upgrades,  # noqa: F401
```
Step 3: Verify
```bash
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners
print('analyst_upgrades' in SCANNER_REGISTRY.scanners)
cls = SCANNER_REGISTRY.scanners['analyst_upgrades']
print(f'name={cls.name}, strategy={cls.strategy}, pipeline={cls.pipeline}')
"
```
Step 4: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/analyst_upgrades.py tradingagents/dataflows/discovery/scanners/__init__.py
git commit -m "feat: add analyst upgrades scanner"
```
Task 12: Add Technical Breakout Scanner
Files:
- Create: `tradingagents/dataflows/discovery/scanners/technical_breakout.py`
- Modify: `tradingagents/dataflows/discovery/scanners/__init__.py`
Context: Uses yfinance OHLCV data. Detects volume-confirmed breakouts above recent resistance or 52-week highs. Scans same ticker universe as ML/options scanners.
Step 1: Create the scanner
```python
"""Technical breakout scanner — volume-confirmed price breakouts."""
from typing import Any, Dict, List, Optional

import pandas as pd

from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY, BaseScanner
from tradingagents.dataflows.discovery.utils import Priority
from tradingagents.utils.logger import get_logger

logger = get_logger(__name__)

DEFAULT_TICKER_FILE = "data/tickers.txt"


def _load_tickers_from_file(path: str) -> List[str]:
    """Load ticker symbols from a text file."""
    try:
        with open(path) as f:
            tickers = [
                line.strip().upper()
                for line in f
                if line.strip() and not line.strip().startswith("#")
            ]
        if tickers:
            logger.info(f"Breakout scanner: loaded {len(tickers)} tickers from {path}")
            return tickers
    except FileNotFoundError:
        logger.warning(f"Ticker file not found: {path}")
    except Exception as e:
        logger.warning(f"Failed to load ticker file {path}: {e}")
    return []


class TechnicalBreakoutScanner(BaseScanner):
    """Scan for volume-confirmed technical breakouts."""

    name = "technical_breakout"
    pipeline = "momentum"
    strategy = "technical_breakout"

    def __init__(self, config: Dict[str, Any]):
        super().__init__(config)
        self.ticker_file = self.scanner_config.get("ticker_file", DEFAULT_TICKER_FILE)
        self.max_tickers = self.scanner_config.get("max_tickers", 150)
        self.min_volume_multiple = self.scanner_config.get("min_volume_multiple", 2.0)
        self.lookback_days = self.scanner_config.get("lookback_days", 20)
        self.max_workers = self.scanner_config.get("max_workers", 8)

    def scan(self, state: Dict[str, Any]) -> List[Dict[str, Any]]:
        if not self.is_enabled():
            return []
        logger.info("📈 Scanning for technical breakouts...")
        tickers = _load_tickers_from_file(self.ticker_file)
        if not tickers:
            logger.warning("No tickers loaded for breakout scan")
            return []
        tickers = tickers[: self.max_tickers]

        # Batch download OHLCV
        from tradingagents.dataflows.y_finance import download_history

        try:
            data = download_history(
                tickers,
                period="3mo",
                interval="1d",
                auto_adjust=True,
                progress=False,
            )
        except Exception as e:
            logger.error(f"Batch download failed: {e}")
            return []
        if data.empty:
            return []

        candidates = []
        for ticker in tickers:
            result = self._check_breakout(ticker, data)
            if result:
                candidates.append(result)
                if len(candidates) >= self.limit:
                    break
        candidates.sort(key=lambda c: c.get("volume_multiple", 0), reverse=True)
        logger.info(f"Technical breakouts: {len(candidates)} candidates")
        return candidates[: self.limit]

    def _check_breakout(self, ticker: str, data: pd.DataFrame) -> Optional[Dict[str, Any]]:
        """Check if ticker has a volume-confirmed breakout."""
        try:
            # Extract single-ticker data from multi-ticker download
            if isinstance(data.columns, pd.MultiIndex):
                if ticker not in data.columns.get_level_values(1):
                    return None
                df = data.xs(ticker, axis=1, level=1).dropna()
            else:
                df = data.dropna()
            if len(df) < self.lookback_days + 5:
                return None

            close = df["Close"]
            volume = df["Volume"]
            high = df["High"]
            latest_close = float(close.iloc[-1])
            latest_vol = float(volume.iloc[-1])

            # 20-day lookback resistance (excluding the last day)
            lookback_high = float(high.iloc[-(self.lookback_days + 1) : -1].max())
            # Average volume over the lookback period
            avg_vol = float(volume.iloc[-(self.lookback_days + 1) : -1].mean())
            if avg_vol <= 0:
                return None
            vol_multiple = latest_vol / avg_vol

            # Breakout conditions:
            # 1. Price closed above the lookback-period high
            # 2. Volume is at least min_volume_multiple times average
            is_breakout = latest_close > lookback_high and vol_multiple >= self.min_volume_multiple
            if not is_breakout:
                return None

            # Check if near 52-week high for a bonus
            if len(df) >= 252:
                high_52w = float(high.iloc[-252:].max())
            else:
                high_52w = float(high.max())
            near_52w_high = latest_close >= high_52w * 0.95

            # Priority
            if vol_multiple >= 3.0 and near_52w_high:
                priority = Priority.CRITICAL.value
            elif vol_multiple >= 3.0 or near_52w_high:
                priority = Priority.HIGH.value
            else:
                priority = Priority.MEDIUM.value

            breakout_pct = ((latest_close - lookback_high) / lookback_high) * 100
            context = (
                f"Breakout: closed {breakout_pct:+.1f}% above {self.lookback_days}d high "
                f"on {vol_multiple:.1f}x volume"
            )
            if near_52w_high:
                context += " | Near 52-week high"
            return {
                "ticker": ticker,
                "source": self.name,
                "context": context,
                "priority": priority,
                "strategy": self.strategy,
                "volume_multiple": round(vol_multiple, 2),
                "breakout_pct": round(breakout_pct, 2),
                "near_52w_high": near_52w_high,
            }
        except Exception as e:
            logger.debug(f"Breakout check failed for {ticker}: {e}")
            return None


SCANNER_REGISTRY.register(TechnicalBreakoutScanner)
```
Step 2: Register in __init__.py
Add technical_breakout to imports.
Step 3: Verify
```bash
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners
print('technical_breakout' in SCANNER_REGISTRY.scanners)
"
```
Step 4: Commit
```bash
git add tradingagents/dataflows/discovery/scanners/technical_breakout.py tradingagents/dataflows/discovery/scanners/__init__.py
git commit -m "feat: add technical breakout scanner"
```
Task 13: Add Sector Rotation Scanner
Files:
- Create: `tradingagents/dataflows/discovery/scanners/sector_rotation.py`
- Modify: `tradingagents/dataflows/discovery/scanners/__init__.py`
Context: Compares sector ETF relative strength (5-day vs 20-day). Flags stocks in accelerating sectors that haven't moved yet. Uses yfinance — no new APIs.
Step 1: Create the scanner
"""Sector rotation scanner — finds laggards in accelerating sectors."""
from typing import Any, Dict, List, Optional
import pandas as pd
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY, BaseScanner
from tradingagents.dataflows.discovery.utils import Priority
from tradingagents.utils.logger import get_logger
logger = get_logger(__name__)
# SPDR Select Sector ETFs
SECTOR_ETFS = {
"XLK": "Technology",
"XLF": "Financials",
"XLE": "Energy",
"XLV": "Healthcare",
"XLI": "Industrials",
"XLY": "Consumer Discretionary",
"XLP": "Consumer Staples",
"XLU": "Utilities",
"XLB": "Materials",
"XLRE": "Real Estate",
"XLC": "Communication Services",
}
DEFAULT_TICKER_FILE = "data/tickers.txt"
def _load_tickers_from_file(path: str) -> List[str]:
"""Load ticker symbols from a text file."""
try:
with open(path) as f:
return [
line.strip().upper()
for line in f
if line.strip() and not line.strip().startswith("#")
]
except Exception:
return []
class SectorRotationScanner(BaseScanner):
"""Detect sector momentum shifts and find laggards in accelerating sectors."""
name = "sector_rotation"
pipeline = "momentum"
strategy = "sector_rotation"
def __init__(self, config: Dict[str, Any]):
super().__init__(config)
self.ticker_file = self.scanner_config.get("ticker_file", DEFAULT_TICKER_FILE)
self.max_tickers = self.scanner_config.get("max_tickers", 100)
self.min_sector_accel = self.scanner_config.get("min_sector_acceleration", 2.0)
def scan(self, state: Dict[str, Any]) -> List[Dict[str, Any]]:
if not self.is_enabled():
return []
logger.info("🔄 Scanning sector rotation...")
from tradingagents.dataflows.y_finance import download_history, get_ticker_info
# Step 1: Identify accelerating sectors
try:
etf_symbols = list(SECTOR_ETFS.keys())
etf_data = download_history(
etf_symbols, period="2mo", interval="1d", auto_adjust=True, progress=False
)
except Exception as e:
logger.error(f"Failed to download sector ETF data: {e}")
return []
if etf_data.empty:
return []
accelerating_sectors = self._find_accelerating_sectors(etf_data)
if not accelerating_sectors:
logger.info("No accelerating sectors detected")
return []
sector_names = [SECTOR_ETFS.get(etf, etf) for etf in accelerating_sectors]
logger.info(f"Accelerating sectors: {', '.join(sector_names)}")
# Step 2: Find laggard stocks in those sectors
tickers = _load_tickers_from_file(self.ticker_file)
if not tickers:
return []
tickers = tickers[: self.max_tickers]
candidates = []
for ticker in tickers:
result = self._check_sector_laggard(ticker, accelerating_sectors, get_ticker_info)
if result:
candidates.append(result)
if len(candidates) >= self.limit:
break
logger.info(f"Sector rotation: {len(candidates)} candidates")
return candidates
def _find_accelerating_sectors(self, data: pd.DataFrame) -> List[str]:
"""Find sectors where 5-day return is accelerating vs 20-day trend."""
accelerating = []
for etf in SECTOR_ETFS:
try:
if isinstance(data.columns, pd.MultiIndex):
if etf not in data.columns.get_level_values(1):
continue
close = data.xs(etf, axis=1, level=1)["Close"].dropna()
else:
close = data["Close"].dropna()
if len(close) < 21:
continue
ret_5d = (float(close.iloc[-1]) / float(close.iloc[-6]) - 1) * 100
ret_20d = (float(close.iloc[-1]) / float(close.iloc[-21]) - 1) * 100
                # Acceleration: the recent (5-day) average daily return
                # significantly beats the 20-day baseline daily rate
                daily_rate_5d = ret_5d / 5
                daily_rate_20d = ret_20d / 20
                if daily_rate_20d > 0:
                    acceleration = daily_rate_5d / daily_rate_20d
                elif daily_rate_5d > 0:
                    # Flat or falling baseline turning up: strong acceleration
                    acceleration = 10.0
                else:
                    acceleration = 0.0
if acceleration >= self.min_sector_accel and ret_5d > 0:
accelerating.append(etf)
logger.debug(
f"{etf} ({SECTOR_ETFS[etf]}): 5d={ret_5d:+.1f}%, "
f"20d={ret_20d:+.1f}%, accel={acceleration:.1f}x"
)
except Exception as e:
logger.debug(f"Error analyzing {etf}: {e}")
return accelerating
def _check_sector_laggard(
self, ticker: str, accelerating_sectors: List[str], get_info_fn
) -> Optional[Dict[str, Any]]:
"""Check if stock is in an accelerating sector but hasn't moved yet."""
try:
info = get_info_fn(ticker)
if not info:
return None
stock_sector = info.get("sector", "")
# Map stock sector to ETF
sector_to_etf = {v: k for k, v in SECTOR_ETFS.items()}
sector_etf = sector_to_etf.get(stock_sector)
if not sector_etf or sector_etf not in accelerating_sectors:
return None
# Check if stock is lagging its sector (hasn't caught up yet)
from tradingagents.dataflows.y_finance import download_history
hist = download_history(ticker, period="1mo", interval="1d", auto_adjust=True, progress=False)
if hist.empty or len(hist) < 6:
return None
close = hist["Close"] if "Close" in hist.columns else hist.iloc[:, 0]
ret_5d = (float(close.iloc[-1]) / float(close.iloc[-6]) - 1) * 100
            # Laggard: moved less than ~2% over 5 days while its sector ETF accelerates
            if ret_5d > 2.0:
                return None  # Already moved, not a laggard
context = (
f"Sector rotation: {stock_sector} sector accelerating, "
f"{ticker} lagging at {ret_5d:+.1f}% (5d)"
)
return {
"ticker": ticker,
"source": self.name,
"context": context,
"priority": Priority.MEDIUM.value,
"strategy": self.strategy,
"sector": stock_sector,
"sector_etf": sector_etf,
"stock_5d_return": round(ret_5d, 2),
}
except Exception as e:
logger.debug(f"Sector check failed for {ticker}: {e}")
return None
SCANNER_REGISTRY.register(SectorRotationScanner)
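The ticker-file loader's contract (symbols uppercased, blank lines and # comments skipped, empty list on a missing file) can be exercised in isolation. The load_tickers helper below mirrors _load_tickers_from_file outside the module:

```python
import os
import tempfile

def load_tickers(path: str) -> list:
    """Mirrors _load_tickers_from_file: uppercase symbols, skip blanks and comments."""
    try:
        with open(path) as f:
            return [
                line.strip().upper()
                for line in f
                if line.strip() and not line.strip().startswith("#")
            ]
    except Exception:
        return []

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("# watchlist\naapl\n\nMSFT\n")
    path = f.name
try:
    print(load_tickers(path))  # ['AAPL', 'MSFT']
finally:
    os.remove(path)
print(load_tickers("no/such/file"))  # []
```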
Step 2: Register in __init__.py
Add from . import sector_rotation to the scanner imports so the module-level SCANNER_REGISTRY.register() call runs at import time.
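Registration is a side effect of importing the module. A miniature sketch of that register-at-import pattern (illustrative names, not the real tradingagents API):

```python
# Hypothetical miniature of the registry pattern the plan relies on.
class Registry:
    def __init__(self):
        self.scanners = {}

    def register(self, cls):
        # Key scanners by their declared name attribute
        self.scanners[cls.name] = cls
        return cls

REGISTRY = Registry()

class SectorRotationScanner:
    name = "sector_rotation"

# In the real package, the equivalent call runs when __init__.py imports the module.
REGISTRY.register(SectorRotationScanner)

print(sorted(REGISTRY.scanners))  # ['sector_rotation']
```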
Step 3: Verify
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
import tradingagents.dataflows.discovery.scanners
for name in sorted(SCANNER_REGISTRY.scanners):
cls = SCANNER_REGISTRY.scanners[name]
print(f'{name:25s} pipeline={cls.pipeline:12s} strategy={cls.strategy}')
print(f'Total: {len(SCANNER_REGISTRY.scanners)} scanners')
"
Expected: 12 scanners total.
Step 4: Commit
git add tradingagents/dataflows/discovery/scanners/sector_rotation.py tradingagents/dataflows/discovery/scanners/__init__.py tradingagents/dataflows/discovery/utils.py
git commit -m "feat: add sector rotation scanner"
Task 14: Final Verification
Step 1: Run all scanner registration
python -c "
from tradingagents.dataflows.discovery.scanner_registry import SCANNER_REGISTRY
from tradingagents.dataflows.discovery.utils import Strategy
import tradingagents.dataflows.discovery.scanners
valid_strategies = {s.value for s in Strategy}
errors = []
for name, cls in SCANNER_REGISTRY.scanners.items():
if cls.strategy not in valid_strategies:
errors.append(f'{name}: strategy {cls.strategy!r} not in Strategy enum')
if errors:
print('ERRORS:')
for e in errors: print(f' {e}')
else:
print(f'All {len(SCANNER_REGISTRY.scanners)} scanners have valid strategies')
"
Step 2: Run existing tests
pytest tests/ -x -q
Step 3: Final commit if any cleanup needed
git add -A && git commit -m "chore: scanner improvements cleanup"