TradingAgents

Commit Graph

Author	SHA1	Message	Date
Youssef Aitousarrah	c792b17ab6	fix(discovery): fix three scanner hang/validation bugs found in ranker_debug.log 1. executor.shutdown(wait=True) still blocked after global timeout (critical) The previous fix added timeout= to as_completed() but used `with ThreadPoolExecutor() as executor`, whose __exit__ calls shutdown(wait=True). This meant the process still hung waiting for stuck threads (ml_signal) even after the TimeoutError was caught. Fixed by creating the executor explicitly and calling shutdown(wait=False) in a finally block. 2. ml_signal hangs on every run — "Batch-downloading 592 tickers (1y)..." never completes. Root cause: a single yfinance request for 592 tickers × 1 year of daily OHLCV is a very large payload that regularly times out at the network layer. Fixed by: - Reducing default lookback from "1y" to "6mo" (halves download size) - Splitting downloads into 150-ticker chunks so a slow chunk doesn't kill the whole scan (partial results are still returned) 3. C (Citigroup) and other single-letter NYSE tickers rejected as invalid. validate_ticker_format used ^[A-Z]{2,5}$ requiring at least 2 letters. Real tickers like C, A, F, T, X, M are 1 letter. Fixed to ^[A-Z]{1,5}$. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 22:35:42 -08:00
Youssef Aitousarrah	ce2a6ef8fa	fix(discovery): fix infinite hang when a scanner thread blocks indefinitely Two issues caused the agent to get stuck after the last log message from a completed scanner (e.g. "✓ reddit_trending: 11 candidates"): 1. `as_completed()` had no global timeout. If a scanner thread blocked in a non-interruptible I/O call, `as_completed()` waited forever because it only yields a future once it has finished — the per-future `future.result(timeout=N)` call was never even reached. Fixed by passing `timeout=global_timeout` to `as_completed()` so the outer iterator raises TimeoutError after a capped wall-clock budget, then logs which scanners didn't complete and continues. 2. `SectorRotationScanner` called `get_ticker_info()` (one HTTP request per ticker) in a serial loop for up to 100 tickers from a 592-ticker file, easily exceeding the 30 s per-scanner budget. Fixed by batch-downloading close prices for all tickers in a single `download_history()` call, computing 5-day returns locally, and only calling `get_ticker_info()` for the small subset of laggard tickers (<2% 5d move) that actually need a sector label. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 22:14:53 -08:00
Youssef Aitousarrah	ec8309a34e	Update	2026-02-20 08:38:15 -08:00
Youssef Aitousarrah	1c20dc8c90	feat: improve all 9 scanners and add 3 new scanners Phase 1 - Fix existing scanners: - Options flow: apply min_premium filter, scan 3 expirations - Volume accumulation: distinguish accumulation vs distribution - Reddit DD: use LLM quality score for priority (skip <60) - Reddit trending: add mention counts, scale priority by volume - Semantic news: include headlines, add catalyst classification - Earnings calendar: add pre-earnings accumulation + EPS estimates - Market movers: add price ($5) and volume (500K) validation - ML signal: raise min_win_prob from 35% to 50% Phase 2 - New scanners: - Analyst upgrades: monitors rating changes via Alpha Vantage - Technical breakout: volume-confirmed breakouts above 20d high - Sector rotation: finds laggards in accelerating sectors All 12 scanners register with valid Strategy enum values. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 08:36:18 -08:00
Youssef Aitousarrah	573b756b4b	fix(insider-buying): preserve transaction details, add cluster detection and smart priority - Call get_finviz_insider_buying with return_structured=True and deduplicate=False to get all raw transaction dicts instead of parsing markdown - Group transactions by ticker for cluster detection (2+ unique insiders = CRITICAL) - Smart priority: CEO/CFO + >$100K = CRITICAL, director + >$50K = HIGH, etc. - Preserve insider_name, insider_title, transaction_value, num_insiders_buying in output - Rich context strings: "CEO John Smith purchased $250K of AAPL shares" - Update finviz_scraper alias to pass through return_structured and deduplicate params Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 08:36:18 -08:00
Youssef Aitousarrah	6831339b78	Remore unused code and improve the UI	2026-02-16 14:17:43 -08:00
Youssef Aitousarrah	8d3205043e	Update	2026-02-16 14:17:41 -08:00
Youssef Aitousarrah	f4aceef857	feat: add daily discovery workflow, recommendation history, and scanner improvements - Add GitHub Actions workflow for daily discovery (8:30 AM ET, weekdays) - Add headless run_daily_discovery.py script for scheduling - Expand options_flow scanner to use tickers.txt with parallel execution - Add recommendation history section to Performance page with filters and charts - Fix strategy name normalization (momentum/Momentum/Momentum-Hype → momentum) - Fix strategy metrics to count all recs, not just evaluated ones - Add error handling to Streamlit page rendering Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 22:07:02 -08:00
Youssef Aitousarrah	8ebb42114d	Add recommendations folder so that the UI can display it 4	2026-02-10 22:28:52 -08:00
Youssef Aitousarrah	cb5ae49501	chore: linter formatting + ML scanner logging, prompt control, ranker reasoning - Add ML signal scanner results table logging - Add log_prompts_console config flag for prompt visibility control - Expand ranker investment thesis to 4-6 sentence structured reasoning - Linter auto-formatting across modified files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 23:04:38 -08:00
Youssef Aitousarrah	43bdd6de11	feat: discovery pipeline enhancements with ML signal scanner Major additions: - ML win probability scanner: scans ticker universe using trained LightGBM/TabPFN model, surfaces candidates with P(WIN) above threshold - 30-feature engineering pipeline (20 base + 10 interaction features) computed from OHLCV data via stockstats + pandas - Triple-barrier labeling for training data generation - Dataset builder and training script with calibration analysis - Discovery enrichment: confluence scoring, short interest extraction, earnings estimates, options signal normalization, quant pre-score - Configurable prompt logging (log_prompts_console flag) - Enhanced ranker investment thesis (4-6 sentence reasoning) - Typed DiscoveryConfig dataclass for all discovery settings - Console price charts for visual ticker analysis Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 22:53:42 -08:00
Youssef Aitousarrah	f1178b4a57	refactor: organize discovery config into dedicated filter/enrichment sections - Created nested "filters" section for all filter-stage settings (min_average_volume, same-day movers, recent movers, etc.) - Created nested "enrichment" section for batch news settings - Updated CandidateFilter to read from new nested structure - Added backward compatibility fallback for old flat config - Improved config organization and clarity Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-06 08:22:39 -08:00
Youssef Aitousarrah	369f8c444b	feat: discovery system code quality improvements and concurrent execution Implement comprehensive code quality improvements and performance optimizations for the discovery pipeline based on code review findings. ## Key Improvements ### 1. Common Utilities (DRY Principle) - Created `tradingagents/dataflows/discovery/common_utils.py` - Extracted ticker parsing logic (eliminates 40+ lines of duplication) - Centralized stopwords list (71 common non-ticker words) - Added ReDoS protection (100KB text length limit) - Provides `validate_candidate_structure()` for output validation ### 2. Scanner Output Validation - Two-layer validation approach: - Registration-time: Check scanner class structure - Runtime: Validate each candidate dictionary - Added `scan_with_validation()` wrapper in BaseScanner - Validates required keys: ticker, source, context, priority - Graceful error handling with structured logging ### 3. Configuration-Driven Design - Moved magic numbers to `default_config.py`: - `ticker_universe`: Top 20 liquid options tickers - `min_volume`: 1000 (options flow threshold) - `min_transaction_value`: $25,000 (insider buying filter) - Fixed hardcoded absolute paths to relative paths - Improved portability across development environments ### 4. Concurrent Scanner Execution (37% Performance Gain) - Implemented ThreadPoolExecutor for parallel scanner execution - Configuration: `scanner_execution.concurrent`, `max_workers`, `timeout_seconds` - Performance: 42s vs 67s (37% faster with 8 scanners) - Thread-safe state management (each scanner gets copy) - Per-scanner timeout with graceful degradation - Error isolation (one failure doesn't stop others) ### 5. Error Handling Improvements - Changed bare `except:` to `except Exception:` (avoid catching KeyboardInterrupt) - Added structured logging with `exc_info=True` and extra fields - Implemented graceful degradation throughout pipeline ## Files Changed ### Core Implementation - `tradingagents/__init__.py` (NEW) - Package initialization - `tradingagents/default_config.py` - Scanner execution config, magic numbers - `tradingagents/graph/discovery_graph.py` - Concurrent execution logic - `tradingagents/dataflows/discovery/common_utils.py` (NEW) - Shared utilities - `tradingagents/dataflows/discovery/scanner_registry.py` - Validation wrapper - `tradingagents/dataflows/discovery/scanners/*.py` - Use common utilities ### Testing & Documentation - `tests/test_concurrent_scanners.py` (NEW) - Comprehensive test suite - `verify_concurrent_execution.py` (NEW) - Performance verification - `CONCURRENT_EXECUTION.md` (NEW) - Implementation documentation ## Test Results All tests passing (exit code 0): - ✅ Concurrent execution: 42s, 66-69 candidates - ✅ Sequential fallback: 56-67s, 65-68 candidates - ✅ Timeout handling: Graceful degradation with 1s timeout - ✅ Error isolation: Individual failures don't cascade ## Performance Impact - Scanner execution: 37% faster (42s vs 67s) - Time saved: ~25 seconds per discovery run - At scale: 4+ minutes saved daily in production - Same candidate quality (65-69 tickers in both modes) ## Breaking Changes None. Concurrent execution is opt-in via config flag. Sequential mode remains available as fallback. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-05 23:27:01 -08:00

13 Commits