37 lines
2.0 KiB
Markdown
# Semantic News Scanner

## Current Understanding

Despite its name, the scanner currently uses regex-based keyword extraction, not
semantic matching. Headline text is included in the candidate context via the
`news_headline` field (an improvement over the prior version).
Catalyst classification maps headline keywords to a priority:
- CRITICAL: FDA approval, acquisition, merger, breakthrough
- HIGH: upgrade, beat, contract win, patent, guidance raise
- MEDIUM: downgrade, miss, lawsuit, investigation, recall, warning

P&L data shows `news_catalyst` is the worst-performing strategy: -17.5% avg 30d
return, 0% 7d win rate, 12.5% 1d win rate. Root cause: MEDIUM-priority candidates
(negative catalysts such as downgrades, lawsuits, and recalls) were included in the
candidate pool and frequently got through to recommendations with a bullish framing.
The scanner is now restricted to CRITICAL-only to eliminate this negative-catalyst
contamination.
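The keyword tiers and the CRITICAL-only restriction described above can be sketched roughly as follows. This is a minimal illustration, not the scanner's actual code: the function names `classify_headline` and `emit_candidate`, the `ticker` and `strategy` keys, the whole-word regex construction, and the sample headlines are all assumptions. Only the keyword lists, the `news_headline` field, and the `news_catalyst` strategy name come from these notes.

```python
import re
from typing import Optional

# Keyword tiers copied from the list above.
PRIORITY_KEYWORDS = {
    "CRITICAL": ["FDA approval", "acquisition", "merger", "breakthrough"],
    "HIGH": ["upgrade", "beat", "contract win", "patent", "guidance raise"],
    "MEDIUM": ["downgrade", "miss", "lawsuit", "investigation", "recall", "warning"],
}

# Assumption: one case-insensitive, whole-word pattern per tier.
PATTERNS = {
    tier: re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b", re.IGNORECASE)
    for tier, words in PRIORITY_KEYWORDS.items()
}


def classify_headline(headline: str) -> Optional[str]:
    """Return the highest-priority tier whose keywords appear in the headline."""
    for tier in ("CRITICAL", "HIGH", "MEDIUM"):
        if PATTERNS[tier].search(headline):
            return tier
    return None


def emit_candidate(ticker: str, headline: str) -> Optional[dict]:
    """Emit a candidate only for CRITICAL matches (the post-fix behaviour)."""
    if classify_headline(headline) != "CRITICAL":
        return None
    # Hypothetical candidate shape; only `news_headline` is named in these notes.
    return {"ticker": ticker, "strategy": "news_catalyst", "news_headline": headline}


# Hypothetical headlines, for illustration only.
print(classify_headline("XYZ receives FDA approval for new therapy"))
print(classify_headline("Broker issues downgrade on XYZ after earnings miss"))
print(classify_headline("Unrelated patent dispute rattles chip sector"))
```

Under these assumptions the three headlines classify as CRITICAL, MEDIUM, and HIGH respectively, so only the first would produce a candidate. The third shows how a keyword match alone cannot tell an incidental mention from a real catalyst.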
## Evidence Log

### 2026-04-11: P&L review
- 8 recommendations, 1d win rate 12.5% (1 of 8), 7d win rate 0% (worst of all strategies).
- Avg 30d return: -17.5%. Avg 1d return: -4.19%. Avg 7d return: -8.79%.
- Sample shows WTI (W&T Offshore) appearing twice (Apr 3 and Apr 6) as `news_catalyst`
  based on a geopolitical oil price spike, both marked as "high" risk. The spike
  reversed, consistent with the -17.5% 30d outcome.
- Root issue 1: MEDIUM-priority keywords include negative events (downgrade, miss,
  lawsuit) that generate candidates with an inherently negative thesis.
- Root issue 2: CRITICAL/HIGH-tier keywords (for example, the HIGH keywords "upgrade"
  and "patent") overlap with noise in global news feeds that mention these terms
  incidentally.
- Fix applied: only emit candidates when the headline matches CRITICAL-priority keywords.
  This eliminates the negative-catalyst false positives.
- Confidence: medium (8 data points; market downturn may amplify losses)
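To put the "8 data points" caveat in numbers, one option is a Wilson score interval on the win rates. This is a sketch of a standard statistical check, not something the review itself ran. The win counts are implied by the figures above (12.5% of 8 is 1 win at 1d, 0% of 8 is 0 wins at 7d), and the helper name `wilson_interval` is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)


# Win counts implied by the log entry above.
for label, wins in (("1d", 1), ("7d", 0)):
    low, high = wilson_interval(wins, 8)
    print(f"{label}: {wins}/8 wins, ~95% interval {low:.1%} to {high:.1%}")
```

With n = 8 the intervals are wide: roughly 2% to 47% for the 1d win rate and 0% to about 32% for the 7d win rate. That is consistent with treating the finding as medium confidence rather than settled.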
## Pending Hypotheses

- [ ] Would embedding-based semantic matching outperform keyword regex?
- [ ] Does catalyst classification (FDA vs earnings vs acquisition) affect hit rate?
- [ ] Do CRITICAL-only candidates (post-fix) outperform CRITICAL+HIGH baseline?
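Once enough closed recommendations exist, the second and third hypotheses could be checked by grouping outcomes by catalyst type or by priority tier. The sketch below is one possible shape for that check. The record fields (`catalyst`, `priority`, `ret_7d`), the helper name `hit_rate_by_group`, and the definition of a win as a positive return are all assumptions, and the sample records are hypothetical placeholders, not results from the log.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hit_rate_by_group(
    records: Iterable[dict], key: str, horizon: str
) -> Dict[str, Tuple[int, int, float]]:
    """Group recommendations by `key`; return (wins, total, win rate) per group."""
    tally = defaultdict(lambda: [0, 0])  # group -> [wins, total]
    for rec in records:
        tally[rec[key]][1] += 1
        if rec[horizon] > 0:  # assumption: a win means a positive return
            tally[rec[key]][0] += 1
    return {group: (w, n, w / n) for group, (w, n) in tally.items()}


# Hypothetical records, for illustration only.
records = [
    {"catalyst": "fda", "priority": "CRITICAL", "ret_7d": 0.04},
    {"catalyst": "fda", "priority": "CRITICAL", "ret_7d": -0.02},
    {"catalyst": "acquisition", "priority": "CRITICAL", "ret_7d": 0.01},
    {"catalyst": "earnings", "priority": "HIGH", "ret_7d": -0.05},
]

by_catalyst = hit_rate_by_group(records, key="catalyst", horizon="ret_7d")
by_priority = hit_rate_by_group(records, key="priority", horizon="ret_7d")
print(by_catalyst)
print(by_priority)
```

Switching `key` between `"catalyst"` and `"priority"` covers both questions with the same helper. With samples this small, per-group rates would still need an uncertainty estimate before drawing conclusions.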