Commit Graph

29 Commits

Author SHA1 Message Date
Youssef Aitousarrah f87197ef41 style: apply black/ruff formatting fixes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:21:13 -07:00
Youssef Aitousarrah e15e2df7a5 feat(cache): unified ticker universe + nightly OHLCV prefetch
- tradingagents/dataflows/universe.py: single source of truth for ticker
  universe; all scanners now call load_universe(config) instead of
  duplicating the 3-level fallback chain with hardcoded "data/tickers.txt"

- scripts/prefetch_ohlcv.py: nightly script using existing ohlcv_cache.py
  incremental logic; first run downloads 1y history, subsequent runs append
  only new trading days

- .github/workflows/prefetch.yml: runs at 01:00 UTC daily, before all other
  workflows; commits updated parquet to repo

- Updated 6 scanners: minervini, high_52w_breakout, ml_signal, options_flow,
  sector_rotation, technical_breakout — removed duplicate DEFAULT_TICKER_FILE
  constants and _load_tickers_from_file() functions

- minervini, high_52w_breakout, technical_breakout: replace yf.download()
  with download_ohlcv_cached() — reads from prefetched cache instead of
  hitting yfinance at discovery time

- default_config.py: added discovery.ohlcv_cache_dir config key

- data/ohlcv_cache/: initial 1y backfill (588 tickers, 5.4MB parquet)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:18:52 -07:00
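
The incremental behaviour described for scripts/prefetch_ohlcv.py (a 1y download on the first run, then only new trading days) can be pictured with the sketch below. It is not the code in ohlcv_cache.py: `next_fetch_window` and its parameters are hypothetical names, and a 365-day backfill is assumed to stand in for "1y history".

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def next_fetch_window(
    last_cached: Optional[date],
    today: date,
    backfill_days: int = 365,  # assumed stand-in for the "1y history" backfill
) -> Optional[Tuple[date, date]]:
    """Return the (start, end) dates to download, or None if the cache is current."""
    if last_cached is None:
        # First run: nothing cached yet, so request the full backfill.
        return today - timedelta(days=backfill_days), today
    start = last_cached + timedelta(days=1)
    if start > today:
        # Cache already covers today; nothing to append.
        return None
    # Later runs: request only the days after the last cached bar.
    return start, today
```

The window is expressed in calendar days; a data provider would normally return rows only for trading days inside it, so weekends and holidays add nothing to the cache.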
Youssef Aitousarrah 7ffbadca09 fix(hypotheses): prune concluded entries from active.json after each run
Concluded hypotheses already live in concluded/ — keeping them in active.json
causes the registry to grow unboundedly. Runner now removes them at the end
of each cycle. Also cleaned up the existing social_dd concluded entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 13:47:21 -07:00
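
A minimal sketch of the pruning step, assuming active.json holds a top-level "hypotheses" list whose entries carry a "status" field. Both the schema and the `prune_concluded` name are assumptions made for illustration; the real registry layout is not shown in this log.

```python
import json
from pathlib import Path


def prune_concluded(active_path: Path) -> int:
    """Remove concluded entries from the registry file; return how many were dropped."""
    registry = json.loads(active_path.read_text())
    entries = registry.get("hypotheses", [])  # assumed key name
    # Keep everything that has not reached the (assumed) "concluded" status.
    kept = [h for h in entries if h.get("status") != "concluded"]
    registry["hypotheses"] = kept
    active_path.write_text(json.dumps(registry, indent=2))
    return len(entries) - len(kept)
```

Calling this once at the end of each runner cycle would keep the file bounded by the number of hypotheses still in flight.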
Youssef Aitousarrah 662fdb5753 feat(hypotheses): uncap statistical hypotheses from max_active limit
Statistical hypotheses now conclude immediately on the next runner cycle
without counting toward max_active. Only implementation hypotheses occupy
runner slots. Added conclude_statistical_hypothesis() for instant analysis
against existing performance data with Gemini LLM enrichment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 12:40:33 -07:00
Youssef Aitousarrah 615107cada feat(hypotheses): use gemini-3-flash-preview for LLM analysis
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 17:40:01 -07:00
Youssef Aitousarrah 43fb186d0e feat(hypotheses): switch LLM analysis from Anthropic to Gemini
Uses google-genai SDK with gemini-2.5-flash-lite — same model already
used by the discovery pipeline, so no new secret needed (GOOGLE_API_KEY).
Removed ANTHROPIC_API_KEY from hypothesis-runner.yml.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 17:37:12 -07:00
Youssef Aitousarrah e2c3ae14c1 fix(hypotheses): skip weekends to avoid counting non-trading days
days_elapsed counts entries in picks_log, so running on weekends would
inflate the counter with noise picks. Exit early on Saturday/Sunday.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 15:48:17 -07:00
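
The early exit can be sketched as follows. `is_weekend` and `main` are hypothetical names, and the check covers Saturday and Sunday only, as in the commit message; market holidays are not handled here.

```python
from datetime import date
from typing import Optional


def is_weekend(day: date) -> bool:
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return day.weekday() >= 5


def main(today: Optional[date] = None) -> int:
    today = today or date.today()
    if is_weekend(today):
        # Exit before any picks are logged, so days_elapsed is not inflated.
        print(f"{today} is a weekend; skipping hypothesis run.")
        return 0
    # ... the normal runner cycle would go here ...
    return 0
```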
Youssef Aitousarrah 91311ad69d feat(hypotheses): detect baseline drift when scanner changes on main mid-experiment
Before concluding a hypothesis, check if the scanner's source file
changed on main since created_at. If it did, the baseline picks in
performance_database.json reflect the updated code for the later part
of the experiment, which can confound the comparison.

When drift is detected, a warning is embedded in:
- the concluded .md doc (blockquote below Decision)
- the PR comment (blockquote in the conclusion body)

The programmatic decision is not overridden — the warning is purely
informational, allowing the reviewer to judge whether the result is
trustworthy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 15:42:21 -07:00
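
One way to implement the drift check is to ask git whether any commit on main touched the scanner file after created_at. The sketch below assumes that approach; the function names, the injectable `run_git_log` parameter, and the warning wording are all hypothetical. Only `git log` options that exist (`--since`, `--format`, and a pathspec after `--`) are used.

```python
import subprocess
from typing import Callable, List


def _git_log(args: List[str]) -> str:
    """Run `git log` with the given arguments and return its stdout."""
    result = subprocess.run(
        ["git", "log", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def scanner_changed_since(
    scanner_path: str,
    created_at: str,
    branch: str = "main",
    run_git_log: Callable[[List[str]], str] = _git_log,
) -> bool:
    """True if any commit on `branch` touched `scanner_path` after `created_at`."""
    out = run_git_log(
        [branch, f"--since={created_at}", "--format=%H", "--", scanner_path]
    )
    return bool(out.strip())


def drift_warning(scanner_path: str) -> str:
    # Informational blockquote only; the accept/reject decision is left untouched.
    return (
        f"> Warning: {scanner_path} changed on main during this experiment; "
        "baseline picks may reflect more than one version of the scanner."
    )
```

Passing the git call in as a parameter keeps the check testable without a repository on disk.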
Youssef Aitousarrah 9e1c800f01 fix(hypotheses): symlink .env into worktree for local dev
load_dotenv() in tradingagents/config.py searches the cwd for .env.
Worktrees in /tmp/ don't have one, so symlink the main repo's .env
into the worktree root before running discovery.

In CI, secrets are passed as env vars directly — symlink is a no-op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 11:12:36 -07:00
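
A minimal sketch of the symlink step, assuming the caller knows both the main repo root and the worktree root. `link_env_file` is a hypothetical name. When the main repo has no .env (the CI case) the function does nothing.

```python
from pathlib import Path


def link_env_file(repo_root: Path, worktree_root: Path) -> bool:
    """Symlink repo_root/.env into the worktree; return True if a link was created."""
    source = repo_root / ".env"
    target = worktree_root / ".env"
    if not source.exists():
        # CI case: secrets arrive as environment variables, so there is nothing to link.
        return False
    if target.exists() or target.is_symlink():
        # Never overwrite a file or link that is already present.
        return False
    target.symlink_to(source)
    return True
```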
Youssef Aitousarrah 26df957e37 feat(hypotheses): add LLM analysis to hypothesis conclusion
When ANTHROPIC_API_KEY is set, conclude_hypothesis now:
- Loads the scanner domain file for context
- Calls claude-haiku-4-5-20251001 for a 3–5 sentence interpretation
- Embeds the analysis in the concluded .md doc and PR comment

The LLM enriches the conclusion with sample-size caveats, market
context, and a follow-up hypothesis suggestion — without overriding
the programmatic accept/reject decision.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 10:57:52 -07:00
Youssef Aitousarrah 49175e3b0a feat(hypotheses): post conclusion as PR comment instead of auto-merging
2026-04-10 10:52:00 -07:00
Youssef Aitousarrah 9562bb7cc0 fix(hypotheses): id validation, worktree prune, safe loop, 14d enrichment cutoff
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 09:56:22 -07:00
Youssef Aitousarrah fe5b8886c0 fix(hypotheses): only count successful discovery days in picks_log
2026-04-10 09:50:37 -07:00
Youssef Aitousarrah 1b782b1cd6 feat(hypotheses): add daily hypothesis runner workflow
2026-04-10 09:49:10 -07:00
Youssef Aitousarrah f8063f3596 fix(hypotheses): use correct 7-trading-day exit index in comparison
2026-04-10 09:31:07 -07:00
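
The exit-index fix concerns an easy off-by-one: with the entry close at index 0 of a list of trading-day closes, the close seven trading days later sits at index 7, not index 6. The sketch below illustrates that convention only. `seven_day_return` is a hypothetical helper, not the repository's compute_7d_return, and the percentage convention is an assumption.

```python
from typing import Optional, Sequence


def seven_day_return(
    closes: Sequence[float], entry_idx: int = 0, horizon: int = 7
) -> Optional[float]:
    """Percent return from the entry close to the close `horizon` trading days later.

    `closes` is assumed to contain one close per trading day, in date order.
    """
    exit_idx = entry_idx + horizon  # index 7 for an entry at index 0
    if exit_idx >= len(closes):
        # Not enough trading days have elapsed to evaluate the pick yet.
        return None
    entry = closes[entry_idx]
    return (closes[exit_idx] - entry) / entry * 100.0
```

Indexing by trading-day position rather than by calendar date also means weekends and holidays never shift the exit bar.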
Youssef Aitousarrah 2747ccddcd feat(hypotheses): add comparison + conclusion script
Implements compute_7d_return, compute_metrics, load_baseline_metrics,
and make_decision functions with full TDD coverage (11 tests passing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 09:29:22 -07:00
Youssef Aitousarrah 6c438f87e6 feat(hypotheses): add comparison + conclusion script
Implements compute_7d_return, compute_metrics, load_baseline_metrics,
and make_decision functions with full TDD coverage (11 tests passing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 09:29:08 -07:00
Youssef Aitousarrah b68a43ec0d feat(scanners): add minervini scanner to registry
minervini.py existed but was never committed. Without the file on the
remote, the __init__.py import added in the previous fix causes an
ImportError in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 13:51:42 -07:00
Youssef Aitousarrah ec8309a34e Update
2026-02-20 08:38:15 -08:00
Youssef Aitousarrah fd951be8bc Update
2026-02-17 12:07:07 -08:00
Youssef Aitousarrah 457d650e42 Update
2026-02-17 10:27:13 -08:00
Youssef Aitousarrah 8d3205043e Update
2026-02-16 14:17:41 -08:00
Youssef Aitousarrah f4aceef857 feat: add daily discovery workflow, recommendation history, and scanner improvements
- Add GitHub Actions workflow for daily discovery (8:30 AM ET, weekdays)
- Add headless run_daily_discovery.py script for scheduling
- Expand options_flow scanner to use tickers.txt with parallel execution
- Add recommendation history section to Performance page with filters and charts
- Fix strategy name normalization (momentum/Momentum/Momentum-Hype → momentum)
- Fix strategy metrics to count all recs, not just evaluated ones
- Add error handling to Streamlit page rendering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 22:07:02 -08:00
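
The strategy-name fix (momentum/Momentum/Momentum-Hype → momentum) can be sketched as a case-insensitive lookup with an alias table. `normalize_strategy` and `_ALIASES` are hypothetical names; the only alias taken from the commit message is Momentum-Hype, and any further aliases would have to come from the real data.

```python
# Alias table keyed by the lowercased name; only the mapping named in the
# commit message is included here.
_ALIASES = {"momentum-hype": "momentum"}


def normalize_strategy(name: str) -> str:
    """Collapse case and known aliases into one canonical strategy key."""
    key = name.strip().lower()
    return _ALIASES.get(key, key)
```

Normalizing before grouping keeps metrics for the same strategy from being split across differently spelled keys.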
Youssef Aitousarrah ab8d174990 Add recommendations folder so that the UI can display it 5
2026-02-10 22:43:46 -08:00
Youssef Aitousarrah 8ebb42114d Add recommendations folder so that the UI can display it 4
2026-02-10 22:28:52 -08:00
Youssef Aitousarrah cb5ae49501 chore: linter formatting + ML scanner logging, prompt control, ranker reasoning
- Add ML signal scanner results table logging
- Add log_prompts_console config flag for prompt visibility control
- Expand ranker investment thesis to 4-6 sentence structured reasoning
- Linter auto-formatting across modified files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 23:04:38 -08:00
Youssef Aitousarrah 43bdd6de11 feat: discovery pipeline enhancements with ML signal scanner
Major additions:
- ML win probability scanner: scans ticker universe using trained
  LightGBM/TabPFN model, surfaces candidates with P(WIN) above threshold
- 30-feature engineering pipeline (20 base + 10 interaction features)
  computed from OHLCV data via stockstats + pandas
- Triple-barrier labeling for training data generation
- Dataset builder and training script with calibration analysis
- Discovery enrichment: confluence scoring, short interest extraction,
  earnings estimates, options signal normalization, quant pre-score
- Configurable prompt logging (log_prompts_console flag)
- Enhanced ranker investment thesis (4-6 sentence reasoning)
- Typed DiscoveryConfig dataclass for all discovery settings
- Console price charts for visual ticker analysis

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 22:53:42 -08:00
Youssef Aitousarrah 369f8c444b feat: discovery system code quality improvements and concurrent execution
Implement comprehensive code quality improvements and performance optimizations
for the discovery pipeline based on code review findings.

## Key Improvements

### 1. Common Utilities (DRY Principle)
- Created `tradingagents/dataflows/discovery/common_utils.py`
- Extracted ticker parsing logic (eliminates 40+ lines of duplication)
- Centralized stopwords list (71 common non-ticker words)
- Added ReDoS protection (100KB text length limit)
- Provides `validate_candidate_structure()` for output validation

### 2. Scanner Output Validation
- Two-layer validation approach:
  - Registration-time: Check scanner class structure
  - Runtime: Validate each candidate dictionary
- Added `scan_with_validation()` wrapper in BaseScanner
- Validates required keys: ticker, source, context, priority
- Graceful error handling with structured logging
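
The runtime layer of this validation might look like the sketch below. It checks only the four required keys listed above; the function names are hypothetical and this is not the body of `validate_candidate_structure()` or `scan_with_validation()`.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)

REQUIRED_KEYS = ("ticker", "source", "context", "priority")


def missing_candidate_keys(candidate: Any) -> List[str]:
    """Return the required keys absent from a candidate (all of them for a non-dict)."""
    if not isinstance(candidate, dict):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in candidate]


def keep_valid_candidates(candidates: List[Any], scanner_name: str) -> List[Dict[str, Any]]:
    """Drop malformed candidates and log each rejection instead of raising."""
    valid = []
    for candidate in candidates:
        missing = missing_candidate_keys(candidate)
        if missing:
            logger.warning(
                "Dropping candidate from %s: missing keys %s", scanner_name, missing
            )
            continue
        valid.append(candidate)
    return valid
```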

### 3. Configuration-Driven Design
- Moved magic numbers to `default_config.py`:
  - `ticker_universe`: Top 20 liquid options tickers
  - `min_volume`: 1000 (options flow threshold)
  - `min_transaction_value`: $25,000 (insider buying filter)
- Fixed hardcoded absolute paths to relative paths
- Improved portability across development environments

### 4. Concurrent Scanner Execution (37% Performance Gain)
- Implemented ThreadPoolExecutor for parallel scanner execution
- Configuration: `scanner_execution.concurrent`, `max_workers`, `timeout_seconds`
- Performance: 42s concurrent vs 67s sequential with 8 scanners (about 37% less wall-clock time)
- Thread-safe state management (each scanner gets copy)
- Per-scanner timeout with graceful degradation
- Error isolation (one failure doesn't stop others)
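
The points above can be sketched with the standard library as follows. This is an illustration under assumptions, not the code in discovery_graph.py: `run_scanners` is a hypothetical name, each scanner is modelled as a plain callable that takes a state dict and returns a list of candidates, and the default values are placeholders.

```python
import copy
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)


def run_scanners(
    scanners: Dict[str, Callable[[dict], List[dict]]],
    state: dict,
    max_workers: int = 8,
    timeout_seconds: float = 30.0,
) -> List[dict]:
    """Run all scanners in parallel and merge whatever results arrive in time."""
    results: List[dict] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each scanner receives its own deep copy, so no state is shared across threads.
        futures = {
            name: pool.submit(fn, copy.deepcopy(state))
            for name, fn in scanners.items()
        }
        for name, future in futures.items():
            try:
                results.extend(future.result(timeout=timeout_seconds))
            except FutureTimeout:
                # Graceful degradation: skip this scanner's results and keep the rest.
                logger.warning("Scanner %s timed out after %ss", name, timeout_seconds)
            except Exception:
                # Error isolation: one failing scanner does not stop the others.
                logger.error("Scanner %s failed", name, exc_info=True)
    return results
```

Two simplifications are worth noting. The timeout bounds how long the caller waits for each result, measured from the moment that result is requested, and a timed-out thread is not killed: leaving the `with` block still waits for it to finish. A production version would need a different shutdown strategy if scanners can hang indefinitely.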

### 5. Error Handling Improvements
- Changed bare `except:` to `except Exception:` (avoid catching KeyboardInterrupt)
- Added structured logging with `exc_info=True` and extra fields
- Implemented graceful degradation throughout pipeline

## Files Changed

### Core Implementation
- `tradingagents/__init__.py` (NEW) - Package initialization
- `tradingagents/default_config.py` - Scanner execution config, magic numbers
- `tradingagents/graph/discovery_graph.py` - Concurrent execution logic
- `tradingagents/dataflows/discovery/common_utils.py` (NEW) - Shared utilities
- `tradingagents/dataflows/discovery/scanner_registry.py` - Validation wrapper
- `tradingagents/dataflows/discovery/scanners/*.py` - Use common utilities

### Testing & Documentation
- `tests/test_concurrent_scanners.py` (NEW) - Comprehensive test suite
- `verify_concurrent_execution.py` (NEW) - Performance verification
- `CONCURRENT_EXECUTION.md` (NEW) - Implementation documentation

## Test Results

All tests passing (exit code 0):
- Concurrent execution: 42s, 66-69 candidates
- Sequential fallback: 56-67s, 65-68 candidates
- Timeout handling: Graceful degradation with 1s timeout
- Error isolation: Individual failures don't cascade

## Performance Impact

- Scanner execution: about 37% less wall-clock time (42s concurrent vs 67s sequential)
- Time saved: ~25 seconds per discovery run
- At scale: 4+ minutes saved daily in production
- Same candidate quality (65-69 tickers in both modes)

## Breaking Changes

None. Concurrent execution is opt-in via config flag.
Sequential mode remains available as fallback.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-05 23:27:01 -08:00
Youssef Aitousarrah 5cf57e5d97 Update
2025-12-02 20:49:42 -08:00