Youssef Aitousarrah
6e43c7164a
fix(analytics): merge recommendations into existing dated file instead of overwriting
Multiple runs on the same day (scheduled discovery, hypothesis runner, manual
re-runs) were each clobbering the shared YYYY-MM-DD.json file. Now merges by
loading existing picks and upserting new ones by ticker — later run wins.
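A minimal sketch of the upsert-by-ticker merge described above. The function names and the `{"recommendations": [...]}` file layout are assumptions made for illustration, not the repository's actual code or schema.

```python
import json
from pathlib import Path


def merge_picks(existing, new):
    """Upsert new picks into existing ones by ticker; the later run wins."""
    merged = {pick["ticker"]: pick for pick in existing}
    for pick in new:
        merged[pick["ticker"]] = pick  # replaces a same-ticker entry, else appends
    return list(merged.values())


def save_recommendations(path, new_picks):
    """Merge new_picks into the dated file at `path` instead of overwriting it."""
    path = Path(path)
    existing = []
    if path.exists():
        # Hypothetical layout: a JSON object with a "recommendations" list.
        existing = json.loads(path.read_text()).get("recommendations", [])
    merged = merge_picks(existing, new_picks)
    path.write_text(json.dumps({"recommendations": merged}, indent=2))
    return merged
```

Because a dict keeps insertion order, a ticker that is updated stays in its original position and new tickers are appended at the end.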
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 11:04:52 -07:00
Youssef Aitousarrah
94d52df8b7
learn(iterate): 2026-04-12 — surface worst-performing strategies in ranker context; LLM now sees news_catalyst (0% 7d win rate) and social_hype (14.3%) as explicit penalties
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 18:04:04 -07:00
Youssef Aitousarrah
ec8309a34e
Update
2026-02-20 08:38:15 -08:00
Youssef Aitousarrah
8ebb42114d
Add recommendations folder so that the UI can display it 4
2026-02-10 22:28:52 -08:00
Youssef Aitousarrah
cb5ae49501
chore: linter formatting + ML scanner logging, prompt control, ranker reasoning
- Add ML signal scanner results table logging
- Add log_prompts_console config flag for prompt visibility control
- Expand ranker investment thesis to 4-6 sentence structured reasoning
- Linter auto-formatting across modified files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 23:04:38 -08:00
Youssef Aitousarrah
43bdd6de11
feat: discovery pipeline enhancements with ML signal scanner
Major additions:
- ML win probability scanner: scans ticker universe using trained
  LightGBM/TabPFN model, surfaces candidates with P(WIN) above threshold
- 30-feature engineering pipeline (20 base + 10 interaction features)
  computed from OHLCV data via stockstats + pandas
- Triple-barrier labeling for training data generation
- Dataset builder and training script with calibration analysis
- Discovery enrichment: confluence scoring, short interest extraction,
  earnings estimates, options signal normalization, quant pre-score
- Configurable prompt logging (log_prompts_console flag)
- Enhanced ranker investment thesis (4-6 sentence reasoning)
- Typed DiscoveryConfig dataclass for all discovery settings
- Console price charts for visual ticker analysis
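The triple-barrier labeling named in the list above can be sketched as follows. This is the general technique, not the repository's implementation: the function name, the barrier widths (5% profit-take, 3% stop-loss), and the 5-bar holding window are illustrative assumptions.

```python
def triple_barrier_label(prices, entry, profit_take=0.05, stop_loss=0.03, max_holding=5):
    """Label one entry point by whichever barrier is touched first.

    Returns 1 if the upper (profit-take) barrier is hit first, -1 if the
    lower (stop-loss) barrier is hit first, and 0 if neither is hit within
    `max_holding` bars (the vertical, time-based barrier).
    """
    entry_price = prices[entry]
    # Only bars after the entry, up to the holding limit, are considered.
    for price in prices[entry + 1 : entry + 1 + max_holding]:
        ret = price / entry_price - 1.0
        if ret >= profit_take:
            return 1
        if ret <= -stop_loss:
            return -1
    return 0
```

Applying the function to every candidate entry in an OHLCV close series yields the WIN/LOSS/TIMEOUT targets a classifier can be trained on.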
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 22:53:42 -08:00
Youssef Aitousarrah
369f8c444b
feat: discovery system code quality improvements and concurrent execution
Implement comprehensive code quality improvements and performance optimizations
for the discovery pipeline based on code review findings.
## Key Improvements
### 1. Common Utilities (DRY Principle)
- Created `tradingagents/dataflows/discovery/common_utils.py`
- Extracted ticker parsing logic (eliminates 40+ lines of duplication)
- Centralized stopwords list (71 common non-ticker words)
- Added ReDoS protection (100KB text length limit)
- Provides `validate_candidate_structure()` for output validation
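
A hedged sketch of what shared ticker parsing with a stopword list and a length cap could look like. The name `extract_tickers`, the regex, the stopword subset, and the reading of "100KB" as 100,000 characters are all assumptions, not the contents of `common_utils.py`.

```python
import re

# Assumption: the 100KB limit is applied as a character cap before matching.
MAX_TEXT_LENGTH = 100_000
# Illustrative subset only; the commit mentions a 71-word list.
STOPWORDS = {"A", "I", "THE", "AND", "FOR", "CEO", "USA", "IPO"}
# 1-5 uppercase letters as a whole word, optionally prefixed with "$".
TICKER_RE = re.compile(r"\$?\b([A-Z]{1,5})\b")


def extract_tickers(text):
    """Return unique ticker-like symbols in order of first appearance."""
    text = text[:MAX_TEXT_LENGTH]  # bound the regex's input size
    found = []
    for match in TICKER_RE.finditer(text):
        symbol = match.group(1)
        if symbol not in STOPWORDS and symbol not in found:
            found.append(symbol)
    return found
```

Truncating before matching keeps worst-case regex time bounded regardless of how large the scraped text is.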
### 2. Scanner Output Validation
- Two-layer validation approach:
  - Registration-time: Check scanner class structure
  - Runtime: Validate each candidate dictionary
- Added `scan_with_validation()` wrapper in BaseScanner
- Validates required keys: ticker, source, context, priority
- Graceful error handling with structured logging
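
The runtime layer described above might look roughly like this. The required keys come from the list; the signatures are assumptions, and `scan_with_validation` is written here as a free function taking a hypothetical `scan` callable rather than as the `BaseScanner` method.

```python
import logging

logger = logging.getLogger(__name__)

REQUIRED_KEYS = ("ticker", "source", "context", "priority")


def validate_candidate_structure(candidate):
    """True if candidate is a dict containing every required key."""
    return isinstance(candidate, dict) and all(k in candidate for k in REQUIRED_KEYS)


def scan_with_validation(scan, state):
    """Run a scanner callable and keep only well-formed candidates."""
    try:
        candidates = scan(state)
    except Exception:
        # Degrade gracefully: a failing scanner contributes no candidates.
        logger.error("scanner failed", exc_info=True)
        return []
    valid = []
    for candidate in candidates:
        if validate_candidate_structure(candidate):
            valid.append(candidate)
        else:
            logger.warning("dropping malformed candidate", extra={"candidate": repr(candidate)})
    return valid
```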
### 3. Configuration-Driven Design
- Moved magic numbers to `default_config.py`:
  - `ticker_universe`: Top 20 liquid options tickers
  - `min_volume`: 1000 (options flow threshold)
  - `min_transaction_value`: $25,000 (insider buying filter)
- Fixed hardcoded absolute paths to relative paths
- Improved portability across development environments
### 4. Concurrent Scanner Execution (37% Performance Gain)
- Implemented ThreadPoolExecutor for parallel scanner execution
- Configuration: `scanner_execution.concurrent`, `max_workers`, `timeout_seconds`
- Performance: 42s vs 67s (37% faster with 8 scanners)
- Thread-safe state management (each scanner gets copy)
- Per-scanner timeout with graceful degradation
- Error isolation (one failure doesn't stop others)
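
A minimal sketch of the pattern listed above: per-scanner state copies, a timeout on each result, and isolated failures. The `run_scanners` name, its signature, and the default values are assumptions and do not reproduce `discovery_graph.py`.

```python
import copy
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

logger = logging.getLogger(__name__)


def run_scanners(scanners, state, max_workers=8, timeout_seconds=30):
    """Run scanner callables in parallel.

    `scanners` maps a name to a callable taking a state dict and returning
    a list of candidates. A scanner that fails or times out yields [].
    """
    results = {}
    executor = ThreadPoolExecutor(max_workers=max_workers)
    # Each scanner gets its own deep copy, so no state is shared across threads.
    futures = {
        name: executor.submit(scan, copy.deepcopy(state))
        for name, scan in scanners.items()
    }
    for name, future in futures.items():
        try:
            results[name] = future.result(timeout=timeout_seconds)
        except FutureTimeout:
            logger.warning("scanner %s timed out", name)
            results[name] = []
        except Exception:
            logger.error("scanner %s failed", name, exc_info=True)
            results[name] = []
    # Do not block on stragglers; threads already running cannot be killed.
    executor.shutdown(wait=False)
    return results
```

Note that `future.result(timeout=...)` measures the timeout from the moment the caller starts waiting on that future, not from submission, so this is only an approximation of a strict per-scanner deadline.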
### 5. Error Handling Improvements
- Changed bare `except:` to `except Exception:` (avoid catching KeyboardInterrupt)
- Added structured logging with `exc_info=True` and extra fields
- Implemented graceful degradation throughout pipeline
## Files Changed
### Core Implementation
- `tradingagents/__init__.py` (NEW) - Package initialization
- `tradingagents/default_config.py` - Scanner execution config, magic numbers
- `tradingagents/graph/discovery_graph.py` - Concurrent execution logic
- `tradingagents/dataflows/discovery/common_utils.py` (NEW) - Shared utilities
- `tradingagents/dataflows/discovery/scanner_registry.py` - Validation wrapper
- `tradingagents/dataflows/discovery/scanners/*.py` - Use common utilities
### Testing & Documentation
- `tests/test_concurrent_scanners.py` (NEW) - Comprehensive test suite
- `verify_concurrent_execution.py` (NEW) - Performance verification
- `CONCURRENT_EXECUTION.md` (NEW) - Implementation documentation
## Test Results
All tests passing (exit code 0):
- ✅ Concurrent execution: 42s, 66-69 candidates
- ✅ Sequential fallback: 56-67s, 65-68 candidates
- ✅ Timeout handling: Graceful degradation with 1s timeout
- ✅ Error isolation: Individual failures don't cascade
## Performance Impact
- Scanner execution: up to 37% faster (42s concurrent vs 56-67s sequential)
- Time saved: ~25 seconds per discovery run
- At scale: 4+ minutes saved daily in production
- Same candidate quality (65-69 tickers in both modes)
## Breaking Changes
None. Concurrent execution is opt-in via config flag.
Sequential mode remains available as fallback.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-05 23:27:01 -08:00