- tradingagents/dataflows/universe.py: single source of truth for the ticker
universe; all scanners now call load_universe(config) instead of each
duplicating the 3-level fallback chain with hardcoded "data/tickers.txt"
- scripts/prefetch_ohlcv.py: nightly script using existing ohlcv_cache.py
incremental logic; first run downloads 1y history, subsequent runs append
only new trading days
- .github/workflows/prefetch.yml: runs at 01:00 UTC daily, before all other
workflows; commits updated parquet to repo
- Updated 6 scanners: minervini, high_52w_breakout, ml_signal, options_flow,
sector_rotation, technical_breakout — removed duplicate DEFAULT_TICKER_FILE
constants and _load_tickers_from_file() functions
- minervini, high_52w_breakout, technical_breakout: replace yf.download()
with download_ohlcv_cached() — reads from prefetched cache instead of
hitting yfinance at discovery time
- default_config.py: added discovery.ohlcv_cache_dir config key
- data/ohlcv_cache/: initial 1y backfill (588 tickers, 5.4MB parquet)
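
For illustration only, the "first run backfills, later runs append" behaviour
described for scripts/prefetch_ohlcv.py could be sketched as below. This is
not the code in ohlcv_cache.py; the function name fetch_window and the
backfill_days parameter are hypothetical, and weekends and market holidays are
ignored on the assumption that the data provider returns no rows for them.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def fetch_window(
    last_cached: Optional[date],
    today: date,
    backfill_days: int = 365,  # assumed stand-in for the "1y" first-run history
) -> Optional[Tuple[date, date]]:
    """Return the (start, end) date range to download, or None if up to date."""
    if last_cached is None:
        # First run: nothing cached yet, so request the full backfill window.
        return today - timedelta(days=backfill_days), today

    # Subsequent runs: request only the days after the last cached bar.
    start = last_cached + timedelta(days=1)
    if start > today:
        return None  # cache already covers today; nothing to append
    return start, today
```

With a helper like this, a nightly job would skip the download entirely when
fetch_window returns None and otherwise append the returned range to the
existing parquet file.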
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. executor.shutdown(wait=True) still blocked after global timeout (critical)
The previous fix added timeout= to as_completed() but used `with
ThreadPoolExecutor() as executor`, whose __exit__ calls shutdown(wait=True).
This meant the process still hung waiting for stuck threads (ml_signal) even
after the TimeoutError was caught. Fixed by creating the executor explicitly
and calling shutdown(wait=False) in a finally block.
2. ml_signal hangs on every run — "Batch-downloading 592 tickers (1y)..." never
completes. Root cause: a single yfinance request for 592 tickers × 1 year of
daily OHLCV is a very large payload that regularly times out at the network
layer. Fixed by:
- Reducing default lookback from "1y" to "6mo" (roughly halves download size)
- Splitting downloads into 150-ticker chunks so a slow chunk doesn't kill
the whole scan (partial results are still returned)
3. C (Citigroup) and other single-letter NYSE tickers rejected as invalid.
validate_ticker_format used ^[A-Z]{2,5}$, which requires at least 2 letters,
but real tickers such as C, A, F, T, X, M are 1 letter. Fixed to ^[A-Z]{1,5}$.
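
A minimal sketch of the executor pattern from item 1, assuming scanners are
plain zero-argument callables. The names run_scanners, scanners and
global_timeout are hypothetical and not taken from the repo; only the
ThreadPoolExecutor, as_completed and shutdown calls are standard-library API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from typing import Any, Callable, Dict


def run_scanners(
    scanners: Dict[str, Callable[[], Any]], global_timeout: float
) -> Dict[str, Any]:
    """Run each scanner in a thread; return whatever finished before the timeout."""
    results: Dict[str, Any] = {}
    # Create the executor explicitly. A `with` block would call
    # shutdown(wait=True) on exit and block on any stuck worker.
    executor = ThreadPoolExecutor(max_workers=max(1, len(scanners)))
    try:
        futures = {executor.submit(fn): name for name, fn in scanners.items()}
        try:
            for future in as_completed(futures, timeout=global_timeout):
                name = futures[future]
                try:
                    results[name] = future.result()
                except Exception as exc:  # record the failure, keep other results
                    results[name] = exc
        except FuturesTimeout:
            pass  # global timeout hit: keep the partial results collected so far
    finally:
        # Return control to the caller without joining stuck worker threads.
        executor.shutdown(wait=False)
    return results
```

Note that shutdown(wait=False) only stops the caller from waiting. A worker
that is still running is not killed and can still delay interpreter exit,
which is why item 2 also addresses the underlying hang.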
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add ML signal scanner results table logging
- Add log_prompts_console config flag for prompt visibility control
- Expand ranker investment thesis to 4-6 sentence structured reasoning
- Linter auto-formatting across modified files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Major additions:
- ML win probability scanner: scans ticker universe using trained
LightGBM/TabPFN model, surfaces candidates with P(WIN) above threshold
- 30-feature engineering pipeline (20 base + 10 interaction features)
computed from OHLCV data via stockstats + pandas
- Triple-barrier labeling for training data generation
- Dataset builder and training script with calibration analysis
- Discovery enrichment: confluence scoring, short interest extraction,
earnings estimates, options signal normalization, quant pre-score
- Configurable prompt logging (log_prompts_console flag)
- Enhanced ranker investment thesis (4-6 sentence reasoning)
- Typed DiscoveryConfig dataclass for all discovery settings
- Console price charts for visual ticker analysis
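
For readers unfamiliar with triple-barrier labeling, a simplified sketch of
the general technique follows. It is not the labeling code added in this
commit: the function name, the barrier percentages and the holding period are
assumed values chosen for illustration, and it checks closing prices only.

```python
from typing import Sequence


def triple_barrier_label(
    closes: Sequence[float],
    entry: int,
    take_profit: float = 0.10,  # assumed upper barrier: +10% from entry
    stop_loss: float = 0.05,    # assumed lower barrier: -5% from entry
    max_holding: int = 10,      # assumed vertical barrier, in bars
) -> int:
    """Return 1 if the upper barrier is hit first, -1 for the lower, 0 on timeout."""
    entry_price = closes[entry]
    upper = entry_price * (1 + take_profit)
    lower = entry_price * (1 - stop_loss)
    # The vertical barrier is capped at the end of the available data.
    last = min(entry + max_holding, len(closes) - 1)

    for i in range(entry + 1, last + 1):
        if closes[i] >= upper:
            return 1   # profit target reached first
        if closes[i] <= lower:
            return -1  # stop loss reached first
    return 0  # neither price barrier touched within the holding period
```

A training set would then apply a function like this at each candidate entry
bar and treat label 1 as the WIN class.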
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>