TradingAgents/MISTAKES.md

# Mistakes & Lessons Learned

Documenting bugs and wrong assumptions to avoid repeating them.

---

## Mistake 1: Scanner agents had no tool execution

**What happened**: All 4 scanner agents (geopolitical, market movers, sector, industry) used `llm.bind_tools(tools)` but only checked `if len(result.tool_calls) == 0: report = result.content`. When the LLM chose to call tools (which it always does when tools are available), nobody executed them. Reports were always empty strings.

**Root cause**: Copied the pattern from existing analysts (`news_analyst.py`) without realizing that the trading graph has separate `ToolNode` graph nodes that handle tool execution in a routing loop. The scanner graph has no such nodes.

**Fix**: Created `tool_runner.py` with `run_tool_loop()` that executes tools inline within the agent node.

**Lesson**: When an LLM has `bind_tools`, there MUST be a tool execution mechanism — either graph-level `ToolNode` routing or inline execution. Always verify the tool execution path exists.

---

## Mistake 2: Assumed yfinance `Sector.overview` has performance data

**What happened**: Wrote `get_sector_performance_yfinance` using `yf.Sector("technology").overview["oneDay"]` etc. This field doesn't exist — `overview` only returns metadata (companies_count, market_cap, industries_count).

**Root cause**: Assumed the yfinance Sector API mirrors the Yahoo Finance website which shows performance data. It doesn't.

**Fix**: Switched to SPDR ETF proxy approach — download ETF prices and compute percentage changes.

**Lesson**: Always test data source APIs interactively before writing agent code. Run `python -c "import yfinance as yf; print(yf.Sector('technology').overview)"` to see actual data shape.

---

## Mistake 3: yfinance `top_companies` — ticker is the index, not a column

**What happened**: Used `row.get('symbol')` to get ticker from `top_companies` DataFrame. Always returned N/A.

**Root cause**: The DataFrame has `index.name = 'symbol'` — tickers are the index, not a column. The actual columns are `['name', 'rating', 'market weight']`.

**Fix**: Changed to `for symbol, row in top_companies.iterrows()`.

**Lesson**: Always inspect DataFrame structure with `.head()`, `.columns`, and `.index` before writing access code.

---

## Mistake 4: Hardcoded Ollama localhost URL

**What happened**: `openai_client.py` had `base_url = "http://localhost:11434/v1"` hardcoded for Ollama provider, ignoring the `self.base_url` config. User's Ollama runs on `192.168.50.76:11434`.

**Fix**: Changed to `host = self.base_url or "http://localhost:11434"` with `/v1` suffix appended.

**Lesson**: Never hardcode URLs. Always use the configured value with a sensible default.

---

## Mistake 5: Only caught `RateLimitError` in vendor fallback

**What happened**: `route_to_vendor()` only caught `RateLimitError`. Alpha Vantage demo key returns "Information" responses (not rate limit errors) and other `AlphaVantageError` subtypes. Fallback to yfinance never triggered.

**Fix**: Broadened catch to `AlphaVantageError` (base class).

**Lesson**: Fallback mechanisms should catch the broadest reasonable error class, not just specific subtypes.

---

## Mistake 6: AV scanner functions silently caught errors

**What happened**: `get_sector_performance_alpha_vantage` and `get_industry_performance_alpha_vantage` caught exceptions internally and embedded error strings in the output (e.g., `"Error: ..."` in the result dict). `route_to_vendor` never saw an exception, so it never fell back to yfinance.

**Fix**: Made both functions raise `AlphaVantageError` when ALL queries fail, while still tolerating partial failures.

**Lesson**: Functions used inside `route_to_vendor` MUST raise exceptions on total failure — embedding errors in return values defeats the fallback mechanism.

---

## Mistake 7: LangGraph concurrent write without reducer

**What happened**: Phase 1 runs 3 scanners in parallel. All write to `sender` (and other shared fields). LangGraph raised `INVALID_CONCURRENT_GRAPH_UPDATE` because `ScannerState` had no reducer for concurrent writes.

**Fix**: Added `_last_value` reducer via `Annotated[str, _last_value]` to all ScannerState fields.

**Lesson**: Any LangGraph state field written by parallel nodes MUST have a reducer. Use `Annotated[type, reducer_fn]`.

---

## Mistake 8: .env file had placeholder values in worktree

**What happened**: Created `.env` in worktree with template values (`your_openrouter_key_here`). User's real keys were only in main repo's `.env`. `load_dotenv()` loaded the worktree placeholder, so OpenRouter returned 401.

**Root cause**: Created `.env` template during setup without copying real keys. `load_dotenv()` with `override=False` (default) keeps the first value found.

**Fix**: Updated worktree `.env` with real keys. Also added fallback `load_dotenv()` call for project root.

**Lesson**: When creating `.env` files, always verify they have real values, not placeholders. When debugging auth errors, first check `os.environ.get('KEY')` to see what value is actually loaded.

---

## Mistake 10: Python 3.11 f-string backslash restriction

**What happened**: `ttm_analysis.py` used a backslash inside an f-string expression:
```python
f"| Debt / Equity | {f\"{ttm['debt_to_equity']:.2f}x\" if ...} |"
```
Python 3.11 raises `SyntaxError: f-string expression part cannot include a backslash`.

**Fix**: Pre-compute the string outside the f-string or use string concatenation:
```python
f"| Debt / Equity | {(str(round(ttm['debt_to_equity'], 2)) + 'x') if ... else 'N/A'} |"
```

**Lesson**: Python 3.11 does not allow backslashes inside f-string `{}` expressions. Extract to a variable or use string concatenation instead. (Python 3.12+ relaxes this restriction.)

---

## Mistake 11: Mock test data precision — threshold boundary failures

**What happened**: `test_risk_on_regime` failed because mock risk-on data scored only 2 (needed ≥3). Two signals were inadvertently near-threshold:
1. Flat VIX series → VIX trend signal = 0 (SMA5 == SMA20)
2. `_trending_series(80, 85, 250)` HYG → 21-day change was 0.499%, just under 0.5% threshold → credit spread = 0

**Fix**: Made mock data obviously far from thresholds: `_trending_series(30, 12, n)` for VIX (clearly falling), `_trending_series(75, 90, n)` for HYG (clearly improving).

**Lesson**: When writing signal/threshold tests, make mock data unmistakably one-sided. Near-threshold values cause brittle tests. The mock should test the regime, not the exact threshold boundary.

---

## Mistake 12: Remote naming — "origin" is the fork, not the upstream

**What happened**: Confusion about which remote is the "fork" vs "origin". The user said "you pushed to origin, not the fork" — but there is only one remote configured, and it points to `aguzererler/TradingAgents` which IS the fork.

**Setup**:
```
origin → http://127.0.0.1:46699/git/aguzererler/TradingAgents  ← this IS the fork
```
There is no separate `upstream` remote for the original/parent repo.

**Lesson**: In this project, `origin` = the user's fork (`aguzererler/TradingAgents`). Always push development branches to `origin`. If an `upstream` remote is ever added, never push feature branches to it — only fetch from it.

---

## Mistake 9: Removed top-level `llm_provider` but code still references it

**What happened**: Removed `llm_provider` from `default_config.py` (since we have per-tier providers). But `scanner_graph.py` line 78 does `self.config.get(f"{tier}_llm_provider") or self.config["llm_provider"]` — would crash if per-tier provider is ever None.

**Status**: Works currently because per-tier providers are always set. But it's a latent bug.

**TODO**: Add a safe fallback or remove the dead code path.