Implements ShortSqueezeScanner wrapping existing get_short_interest() in finviz_scraper.py.
Research finding: raw high short interest predicts negative long-term returns (per the
academic literature); the edge is using SI as a squeeze-risk flag when combined with
earnings_calendar or options_flow catalysts.
Directly addresses earnings_calendar pending hypothesis (APLD 30.6% SI was strongest setup).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous implementation redirected sys.stderr to /dev/null using a
context manager. This is not thread-safe: 8 concurrent scanner threads each
mutate sys.stderr, and when one thread's context manager closes the devnull
file, another thread that captured devnull as its saved stderr attempts to
write to the closed fd and raises "I/O operation on closed file".
This corrupted sys.stderr state caused _fetch_batch_prices to fail and
all per-ticker get_stock_price fallback calls to return None, resulting in
every candidate being dropped with "no data available".
Fix by suppressing at the Python logging level instead of redirecting
sys.stderr. Logger.setLevel() is protected by internal locks and is safe
to call from concurrent threads.
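A minimal sketch of the logging-level approach, assuming the noisy output comes from a library logger (the logger name here is illustrative, not necessarily the one used in the repo):

```python
import logging

# Thread-safe alternative to redirecting sys.stderr: raise the level on the
# noisy library's logger. Logger configuration is guarded by the logging
# module's internal locks, so concurrent scanner threads can call this safely,
# and no process-global file object is ever swapped or closed.
def suppress_library_noise(name: str = "yfinance") -> None:
    logging.getLogger(name).setLevel(logging.CRITICAL)

suppress_library_noise()
```

Unlike a `redirect_stderr` context manager, nothing here is undone on exit, so there is no window where another thread holds a reference to a closed file descriptor.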
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs causing zero recommendations:
1. risk_metrics.py was untracked — importing it raised ModuleNotFoundError which
was caught by the outer try/except in filter.py, silently dropping all 32
candidates that reached the fundamental risk check stage.
2. Minervini scanner at max_tickers=200 took >5 min to download 200 tickers x 1y
of OHLCV data. ThreadPoolExecutor.cancel() cannot kill a running thread, so the
download kept running as a zombie thread for 20 more minutes after the pipeline
completed, holding the Python process alive until the 30-min workflow timeout
killed the entire job.
Reducing to 50 tickers brings the download to ~75s, well under the 300s global
scanner timeout.
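The zombie-thread behavior can be demonstrated in isolation — `Future.cancel()` only prevents a *pending* task from starting; once a worker thread has picked it up, `cancel()` returns False and the task runs to completion regardless (the sleep here stands in for a long batch download):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_download():
    time.sleep(0.5)  # stand-in for a long yfinance batch download
    return "done"

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(slow_download)
time.sleep(0.1)              # give the worker time to start the task
cancelled = future.cancel()  # False: already running, cannot be killed
executor.shutdown(wait=True)
assert cancelled is False and future.result() == "done"
```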
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The same-day mover filter used abs() to check intraday movement, which
filtered both gap-ups AND gap-downs. On volatile/crash days (e.g. 2026-04-07)
all stocks dropped >10% from open, causing every candidate to be filtered and
leaving zero recommendations.
The filter's purpose is to avoid chasing stocks that already ran up. A stock
down 20% intraday is not "stale" — it should be evaluated on its merits.
Changed threshold check from abs(pct) >= threshold to pct >= threshold so only
upside movers are filtered.
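A sketch of the one-sided check (function and threshold names are illustrative):

```python
# Only gap-ups count as "already ran" and get skipped; gap-downs pass
# through so they can be evaluated on their merits.
def is_stale_mover(intraday_pct: float, threshold: float = 10.0) -> bool:
    # Old, buggy check filtered both directions:
    #   return abs(intraday_pct) >= threshold
    return intraday_pct >= threshold

assert is_stale_mover(12.0) is True    # up 12%: skip, already chased
assert is_stale_mover(-20.0) is False  # down 20%: keep for evaluation
```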
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
yf.download(592 tickers, period=1y) takes 20+ minutes in CI, causing
the 30-minute job timeout to trigger. Add max_tickers=200 (configurable)
to limit the batch download to the first N tickers from the file. The
concurrent scanner pool already has a 5-min global timeout, but the hung
download thread monopolises network connections and starves the filter stage.
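The cap can be sketched as a simple truncation before the batch download; `load_universe` and the one-ticker-per-line file format are assumptions for illustration:

```python
# Truncate the ticker universe before the batch download so CI stays
# within the job timeout. max_tickers is the configurable cap.
def load_universe(path: str, max_tickers: int = 200) -> list[str]:
    with open(path) as f:
        tickers = [line.strip() for line in f if line.strip()]
    return tickers[:max_tickers]
```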
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
minervini.py existed but was never committed. Without the file on the
remote, the __init__.py import added in the previous fix causes an
ImportError in CI.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add permissions: contents: write so git push works (was failing with 403)
- Add continue-on-error: true on discovery step so partial output still commits
- Change all commit/tracking/position steps to if: always() so they run regardless of discovery outcome
- Use commit-then-pull-rebase-then-push pattern to handle branch divergence
- Fix minervini scanner missing from scanners/__init__.py (enabled in config but never loaded)
- Fix .gitignore: results/* + !results/discovery/ so CI run logs can be committed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Same issue as options_flow: early exit on candidate count discards strong
signals that happen to be later in iteration order.
insider_buying: Dict iteration order matched OpenInsider HTML scrape order,
not signal quality. Now scores by cluster buys + C-suite + dollar value,
then takes top N.
technical_breakout: Stopped at limit*2 in file order despite data already
being batch-downloaded (zero API cost to check all). Removed early exit,
scan full universe, sort by volume_multiple.
sector_rotation: Checked laggards in arbitrary dict order, spending API
calls on random tickers. Now sorts by most-negative 5d return first so
the strongest laggard candidates are checked before hitting the budget.
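The score-then-slice pattern shared by these fixes can be sketched as follows; the scoring weights are illustrative stand-ins, not the actual insider_buying formula:

```python
# Score every candidate first, then sort descending and apply the limit,
# so iteration order never decides which candidates survive.
def top_candidates(candidates: list[dict], limit: int) -> list[dict]:
    def score(c: dict) -> float:
        return (
            c.get("num_insiders_buying", 0) * 10       # cluster buys
            + (5 if c.get("is_c_suite") else 0)        # CEO/CFO premium
            + c.get("transaction_value", 0) / 100_000  # dollar size
        )
    return sorted(candidates, key=score, reverse=True)[:limit]
```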
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously the scanner stopped as soon as self.limit candidates were found
from as_completed() futures. Since futures complete in non-deterministic
network-latency order, this was equivalent to random sampling — fast-to-
respond tickers won regardless of how strong their options signal was.
Fix: collect all candidates from the full universe, then sort by options_score
(unusual strike count weighted 1.5x for calls to favor bullish flow) before
applying the limit. The top-N strongest signals are now always returned.
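A sketch of the drain-then-rank shape of the fix (the per-ticker scan function and score field are stand-ins; the 1.5x call weighting lives inside the real scoring):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Collect every candidate from the full universe before ranking, so
# network-latency completion order no longer determines which tickers
# survive the limit.
def scan_all_then_rank(tickers, scan_one, limit):
    results = []
    with ThreadPoolExecutor(max_workers=8) as executor:
        futures = {executor.submit(scan_one, t): t for t in tickers}
        for future in as_completed(futures):
            candidate = future.result()
            if candidate is not None:
                results.append(candidate)
    # Rank only after the full universe has been scanned.
    results.sort(key=lambda c: c["options_score"], reverse=True)
    return results[:limit]
```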
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tqdm writes to stderr immediately on __enter__, before any loop iteration.
In Streamlit's thread/subprocess context stderr can be a closed pipe, causing
'I/O operation on closed file' which _run_call catches and returns {} — so
the entire news enrichment step was silently skipped every run.
Replaced tqdm progress bars with logger.info() calls in:
- get_batch_stock_news_google() in openai.py
- get_batch_stock_news_openai() in openai.py
- Reddit DD parallel evaluation in reddit_api.py
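The replacement pattern looks roughly like this; unlike tqdm, logging touches no stream until a record is actually emitted, and by default a failing handler reports via `Handler.handleError` rather than raising into the calling code (the interval and helper name are illustrative):

```python
import logging

logger = logging.getLogger(__name__)

# Periodic logger.info() progress lines instead of a tqdm bar: safe even
# when stderr is a closed pipe in Streamlit's thread/subprocess context.
def enrich_with_progress(items, enrich_one, log_every=10):
    results = {}
    for i, item in enumerate(items, 1):
        results[item] = enrich_one(item)
        if i % log_every == 0 or i == len(items):
            logger.info("News enrichment: %d/%d complete", i, len(items))
    return results
```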
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. executor.shutdown(wait=True) still blocked after global timeout (critical)
The previous fix added timeout= to as_completed() but used `with
ThreadPoolExecutor() as executor`, whose __exit__ calls shutdown(wait=True).
This meant the process still hung waiting for stuck threads (ml_signal) even
after the TimeoutError was caught. Fixed by creating the executor explicitly
and calling shutdown(wait=False) in a finally block.
2. ml_signal hangs on every run — "Batch-downloading 592 tickers (1y)..." never
completes. Root cause: a single yfinance request for 592 tickers × 1 year of
daily OHLCV is a very large payload that regularly times out at the network
layer. Fixed by:
- Reducing default lookback from "1y" to "6mo" (halves download size)
- Splitting downloads into 150-ticker chunks so a slow chunk doesn't kill
the whole scan (partial results are still returned)
3. C (Citigroup) and other single-letter NYSE tickers rejected as invalid.
validate_ticker_format used ^[A-Z]{2,5}$ requiring at least 2 letters.
Real tickers like C, A, F, T, X, M are 1 letter. Fixed to ^[A-Z]{1,5}$.
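Fixes 1 and 3 can be sketched together; the worker count, timeout handling, and scanner map are illustrative, not the repo's exact code:

```python
import re
from concurrent.futures import ThreadPoolExecutor, as_completed, TimeoutError as FuturesTimeout

# Fix 1: create the executor explicitly so the `with` block's implicit
# shutdown(wait=True) can't re-block after a timeout; shutdown(wait=False)
# in finally abandons stuck worker threads instead of joining them.
def run_scanners(scanners, global_timeout=300):
    executor = ThreadPoolExecutor(max_workers=8)
    results = {}
    try:
        futures = {executor.submit(fn): name for name, fn in scanners.items()}
        for future in as_completed(futures, timeout=global_timeout):
            results[futures[future]] = future.result()
    except FuturesTimeout:
        pass  # keep partial results; stuck scanners are abandoned below
    finally:
        executor.shutdown(wait=False)  # do NOT join hung threads
    return results

# Fix 3: accept 1-letter NYSE tickers (C, A, F, T, X, M).
TICKER_RE = re.compile(r"^[A-Z]{1,5}$")
```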
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two issues caused the agent to get stuck after the last log message
from a completed scanner (e.g. "✓ reddit_trending: 11 candidates"):
1. `as_completed()` had no global timeout. If a scanner thread blocked
in a non-interruptible I/O call, `as_completed()` waited forever
because it only yields a future once it has finished — the per-future
`future.result(timeout=N)` call was never even reached.
Fixed by passing `timeout=global_timeout` to `as_completed()` so
the outer iterator raises TimeoutError after a capped wall-clock
budget, then logs which scanners didn't complete and continues.
2. `SectorRotationScanner` called `get_ticker_info()` (one HTTP request
per ticker) in a serial loop for up to 100 tickers from a 592-ticker
file, easily exceeding the 30 s per-scanner budget.
Fixed by batch-downloading close prices for all tickers in a single
`download_history()` call, computing 5-day returns locally, and only
calling `get_ticker_info()` for the small subset of laggard tickers
(<2% 5d move) that actually need a sector label.
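The local 5-day-return step can be sketched as below, assuming a close-price frame with tickers as columns (the column layout, helper name, and most-negative-first ordering are illustrative):

```python
import pandas as pd

# One batch close-price frame, local 5-day return computation; only the
# laggard subset (<2% absolute 5d move) goes on to per-ticker HTTP lookups.
def find_laggards(closes: pd.DataFrame, max_move: float = 0.02) -> list[str]:
    five_day = closes.iloc[-1] / closes.iloc[-6] - 1.0  # 5 trading days back
    laggards = five_day[five_day.abs() < max_move]
    # Most-negative first, so any API budget goes to the weakest names.
    return laggards.sort_values().index.tolist()
```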
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Call get_finviz_insider_buying with return_structured=True and deduplicate=False
to get all raw transaction dicts instead of parsing markdown
- Group transactions by ticker for cluster detection (2+ unique insiders = CRITICAL)
- Smart priority: CEO/CFO + >$100K = CRITICAL, director + >$50K = HIGH, etc.
- Preserve insider_name, insider_title, transaction_value, num_insiders_buying in output
- Rich context strings: "CEO John Smith purchased $250K of AAPL shares"
- Update finviz_scraper alias to pass through return_structured and deduplicate params
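The cluster-detection step might look like this sketch; field names follow the message above, while the priority thresholds here are simplified for illustration:

```python
from collections import defaultdict

# Group raw transaction dicts by ticker, count unique insiders, and tag
# 2+ distinct buyers as a CRITICAL cluster.
def detect_clusters(transactions: list[dict]) -> dict[str, dict]:
    by_ticker = defaultdict(list)
    for tx in transactions:
        by_ticker[tx["ticker"]].append(tx)
    out = {}
    for ticker, txs in by_ticker.items():
        insiders = {tx["insider_name"] for tx in txs}
        out[ticker] = {
            "num_insiders_buying": len(insiders),
            "total_value": sum(tx["transaction_value"] for tx in txs),
            "priority": "CRITICAL" if len(insiders) >= 2 else "HIGH",
        }
    return out
```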
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add GitHub Actions workflow for daily discovery (8:30 AM ET, weekdays)
- Add headless run_daily_discovery.py script for scheduling
- Expand options_flow scanner to use tickers.txt with parallel execution
- Add recommendation history section to Performance page with filters and charts
- Fix strategy name normalization (momentum/Momentum/Momentum-Hype → momentum)
- Fix strategy metrics to count all recs, not just evaluated ones
- Add error handling to Streamlit page rendering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add ML signal scanner results table logging
- Add log_prompts_console config flag for prompt visibility control
- Expand ranker investment thesis to 4-6 sentence structured reasoning
- Linter auto-formatting across modified files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Major additions:
- ML win probability scanner: scans ticker universe using trained
LightGBM/TabPFN model, surfaces candidates with P(WIN) above threshold
- 30-feature engineering pipeline (20 base + 10 interaction features)
computed from OHLCV data via stockstats + pandas
- Triple-barrier labeling for training data generation
- Dataset builder and training script with calibration analysis
- Discovery enrichment: confluence scoring, short interest extraction,
earnings estimates, options signal normalization, quant pre-score
- Configurable prompt logging (log_prompts_console flag)
- Enhanced ranker investment thesis (4-6 sentence reasoning)
- Typed DiscoveryConfig dataclass for all discovery settings
- Console price charts for visual ticker analysis
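Triple-barrier labeling, in outline: a trade is a WIN if price hits the upper barrier first, a LOSS at the lower barrier, and is otherwise labeled by the sign of the move at the time barrier. The barrier widths and horizon below are illustrative, not the repo's settings:

```python
def triple_barrier_label(prices, entry_idx, up=0.05, down=0.03, horizon=10):
    entry = prices[entry_idx]
    for price in prices[entry_idx + 1 : entry_idx + 1 + horizon]:
        ret = price / entry - 1.0
        if ret >= up:
            return 1      # upper barrier hit first: WIN
        if ret <= -down:
            return 0      # lower barrier hit first: LOSS
    # Time barrier: label by the sign of the final return in the window.
    last = prices[min(entry_idx + horizon, len(prices) - 1)]
    return 1 if last > entry else 0
```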
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Created nested "filters" section for all filter-stage settings
(min_average_volume, same-day movers, recent movers, etc.)
- Created nested "enrichment" section for batch news settings
- Updated CandidateFilter to read from new nested structure
- Added backward compatibility fallback for old flat config
- Improved config organization and clarity
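The backward-compatible read might be sketched like this (the helper itself is hypothetical; the section and key names follow the message above):

```python
# Prefer the new nested "filters" section, fall back to the old flat key.
def get_filter_setting(config: dict, key: str, default=None):
    nested = config.get("filters", {})
    if key in nested:
        return nested[key]
    return config.get(key, default)  # old flat layout still works

assert get_filter_setting({"filters": {"min_average_volume": 1e6}}, "min_average_volume") == 1e6
assert get_filter_setting({"min_average_volume": 5e5}, "min_average_volume") == 5e5
```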
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When multiple primary vendors are configured (e.g., 'reddit,alpha_vantage'),
the system now correctly stops after attempting all primary vendors instead
of continuing through all fallback vendors.
Changes:
- Track which primary vendors have been attempted in a list
- Add stopping condition when all primary vendors are attempted
- Preserve existing single-vendor behavior (stop after first success)
This prevents unnecessary API calls and ensures predictable behavior.
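A sketch of the stopping rule, with `fetch(vendor)` as a stand-in for the real per-vendor call (returning None on failure is an assumption here):

```python
# Try each configured primary vendor in turn; once every primary has been
# attempted, stop rather than falling through to the fallback chain.
def fetch_with_primaries(primaries, fallbacks, fetch):
    if primaries:
        for vendor in primaries:
            result = fetch(vendor)
            if result is not None:
                return result      # single-vendor behavior: stop on success
        return None                # all primaries attempted: skip fallbacks
    for vendor in fallbacks:       # fallback chain only when no primary set
        result = fetch(vendor)
        if result is not None:
            return result
    return None
```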
- Replace hardcoded column indices with column name lookup
- Add mapping for all supported indicators to their expected CSV column names
- Handle missing columns gracefully with descriptive error messages
- Strip whitespace from header parsing for reliability
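The lookup pattern might be sketched as follows; the indicator-to-column mapping and error text are illustrative, not the repo's actual tables:

```python
import csv, io

INDICATOR_COLUMNS = {"rsi": "RSI", "sma": "SMA", "macd": "MACD"}

# Resolve the column by name instead of a hardcoded index; strip
# whitespace from headers and fail with a descriptive error if missing.
def read_indicator(csv_text: str, indicator: str) -> list[float]:
    column = INDICATOR_COLUMNS[indicator]
    reader = csv.DictReader(io.StringIO(csv_text))
    reader.fieldnames = [h.strip() for h in reader.fieldnames]
    if column not in reader.fieldnames:
        raise KeyError(f"column '{column}' for indicator '{indicator}' not in CSV header")
    return [float(row[column]) for row in reader]
```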
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace FinnHub with Alpha Vantage API in README documentation
- Implement comprehensive Alpha Vantage modules:
- Stock data (daily OHLCV with date filtering)
- Technical indicators (SMA, EMA, MACD, RSI, Bollinger Bands, ATR)
- Fundamental data (overview, balance sheet, cashflow, income statement)
- News and sentiment data with insider transactions
- Update news analyst tools to use ticker-based news search
- Integrate Alpha Vantage vendor methods into interface routing
- Maintain backward compatibility with existing vendor system
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>