Merge pull request #10 from aguzererler/claude/distracted-almeida

Add agentic memory scaffold and migrate tracking files
2026-03-17 19:54:59 +01:00 · 2026-03-17 19:54:59 +01:00 · 8279295348
parent 251d8b61b1 0b4260a4e2
commit 8279295348
23 changed files with 323 additions and 1153 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -87,7 +87,7 @@ OpenAI, Anthropic, Google, xAI, OpenRouter, Ollama
 - Graph setup (scanner): `tradingagents/graph/scanner_setup.py`
 - Inline tool loop: `tradingagents/agents/utils/tool_runner.py`

-## Critical Patterns (from past mistakes — see MISTAKES.md)
+## Critical Patterns (see `docs/agent/decisions/008-lessons-learned.md` for full details)

 - **Tool execution**: Trading graph uses `ToolNode` in graph. Scanner agents use `run_tool_loop()` inline. If `bind_tools()` is used, there MUST be a tool execution path.
 - **yfinance DataFrames**: `top_companies` has ticker as INDEX, not column. Always check `.index` and `.columns`.
@ -99,11 +99,17 @@ OpenAI, Anthropic, Google, xAI, OpenRouter, Ollama
 - **Rate limiter locks**: Never hold a lock during `sleep()` or IO. Release, sleep, re-acquire.
 - **Config fallback keys**: `llm_provider` and `backend_url` must always exist at top level — `scanner_graph.py` and `trading_graph.py` use them as fallbacks.

-## Project Tracking Files
+## Agentic Memory (docs/agent/)

- `DECISIONS.md` — Architecture decision records (vendor strategy, LLM setup, tool execution, env overrides)
- `PROGRESS.md` — Feature progress, what works, TODOs
- `MISTAKES.md` — Past bugs and lessons learned (10 documented mistakes)
+Agent workflows use the `docs/agent/` scaffold for structured memory:
+
+- `docs/agent/CURRENT_STATE.md` — Live state tracker (milestone, progress, blockers). Read at session start.
+- `docs/agent/decisions/` — Architecture decision records (ADR-style, numbered `001-...`)
+- `docs/agent/plans/` — Implementation plans with checkbox progress tracking
+- `docs/agent/logs/` — Agent run logs
+- `docs/agent/templates/` — Commit, PR, and decision templates
+
+Before starting work, always read `docs/agent/CURRENT_STATE.md`. Before committing, update it.

 ## LLM Configuration

--- a/DECISIONS.md
+++ b/DECISIONS.md
@ -1,179 +0,0 @@
-# Architecture Decisions Log
-
-## Decision 001: Hybrid LLM Setup (Ollama + OpenRouter)
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: Need cost-effective LLM setup for scanner pipeline with different complexity tiers.
-
-**Decision**: Use hybrid approach:
- **quick_think + mid_think**: `qwen3.5:27b` via Ollama at `http://192.168.50.76:11434` (local, free)
- **deep_think**: `deepseek/deepseek-r1-0528` via OpenRouter (cloud, paid)
-
-**Config location**: `tradingagents/default_config.py` — per-tier `_llm_provider` and `_backend_url` keys.
-
-**Consequence**: Removed top-level `llm_provider` and `backend_url` from config. Each tier must have its own `{tier}_llm_provider` set explicitly.
-
---
-
-## Decision 002: Data Vendor Fallback Strategy
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: Alpha Vantage free/demo key doesn't support ETF symbols and has strict rate limits. Need reliable data for scanner.
-
-**Decision**:
- `route_to_vendor()` catches `AlphaVantageError` (base class) to trigger fallback, not just `RateLimitError`.
- AV scanner functions raise `AlphaVantageError` when ALL queries fail (not silently embedding errors in output strings).
- yfinance is the fallback vendor and uses SPDR ETF proxies for sector performance instead of broken `Sector.overview`.
-
-**Files changed**:
- `tradingagents/dataflows/interface.py` — broadened catch
- `tradingagents/dataflows/alpha_vantage_scanner.py` — raise on total failure
- `tradingagents/dataflows/yfinance_scanner.py` — ETF proxy approach
-
---
-
-## Decision 003: yfinance Sector Performance via ETF Proxies
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: `yfinance.Sector("technology").overview` returns only metadata (companies_count, market_cap, etc.) — no performance data (oneDay, oneWeek, etc.).
-
-**Decision**: Use SPDR sector ETFs as proxies:
-```python
-sector_etfs = {
-    "Technology": "XLK", "Healthcare": "XLV", "Financials": "XLF",
-    "Energy": "XLE", "Consumer Discretionary": "XLY", ...
-}
-```
-Download 6 months of history via `yf.download()` and compute 1-day, 1-week, 1-month, YTD percentage changes from closing prices.
-
-**File**: `tradingagents/dataflows/yfinance_scanner.py`
-
---
-
-## Decision 004: Inline Tool Execution Loop for Scanner Agents
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: The existing trading graph uses separate `ToolNode` graph nodes for tool execution (agent → tool_node → agent routing loop). Scanner agents are simpler single-pass nodes — no ToolNode in the graph. When the LLM returned tool_calls, nobody executed them, resulting in empty reports.
-
-**Decision**: Created `tradingagents/agents/utils/tool_runner.py` with `run_tool_loop()` that runs an inline tool execution loop within each scanner agent node:
-1. Invoke chain
-2. If tool_calls present → execute tools → append ToolMessages → re-invoke
-3. Repeat up to `MAX_TOOL_ROUNDS=5` until LLM produces text response
-
-**Alternative considered**: Adding ToolNode + conditional routing to scanner_setup.py (like trading graph). Rejected — too complex for the fan-out/fan-in pattern and would require 4 separate tool nodes with routing logic.
-
-**Files**:
- `tradingagents/agents/utils/tool_runner.py` (new)
- All scanner agents updated to use `run_tool_loop()`
-
---
-
-## Decision 005: LangGraph State Reducers for Parallel Fan-Out
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: Phase 1 runs 3 scanners in parallel. All write to shared state fields (`sender`, etc.). LangGraph requires reducers for concurrent writes — otherwise raises `INVALID_CONCURRENT_GRAPH_UPDATE`.
-
-**Decision**: Added `_last_value` reducer to all `ScannerState` fields via `Annotated[str, _last_value]`.
-
-**File**: `tradingagents/agents/utils/scanner_states.py`
-
---
-
-## Decision 006: CLI --date Flag for Scanner
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: `python -m cli.main scan` was interactive-only (prompts for date). Needed non-interactive invocation for testing/automation.
-
-**Decision**: Added `--date` / `-d` option to `scan` command. Falls back to interactive prompt if not provided.
-
-**File**: `cli/main.py`
-
---
-
-## Decision 007: .env Loading Strategy
-
-**Date**: 2026-03-17
-**Status**: Superseded by Decision 008 ⚠️
-
-**Context**: `load_dotenv()` loads from CWD. When running from a git worktree, the worktree `.env` may have placeholder values while the main repo `.env` has real keys.
-
-**Decision**: `cli/main.py` calls `load_dotenv()` (CWD) then `load_dotenv(Path(__file__).parent.parent / ".env")` as fallback. The worktree `.env` was also updated with real API keys.
-
-**Note for future**: If `.env` issues recur, check which `.env` file is being picked up. The worktree and main repo each have their own `.env`.
-
-**Update**: Decision 008 moves `load_dotenv()` into `default_config.py` itself, making it import-order-independent. The CLI-level `load_dotenv()` in `main.py` is now defense-in-depth only.
-
---
-
-## Decision 008: Environment Variable Config Overrides
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: `DEFAULT_CONFIG` hardcoded all values (LLM providers, models, vendor routing, debate rounds). Users had to edit `default_config.py` to change any setting. The `load_dotenv()` call in `cli/main.py` ran *after* `DEFAULT_CONFIG` was already evaluated at import time, so env vars like `TRADINGAGENTS_LLM_PROVIDER` had no effect. This also created a latent bug (Mistake #9): `llm_provider` and `backend_url` were removed from the config but `scanner_graph.py` still referenced them as fallbacks.
-
-**Decision**:
-1. **Module-level `.env` loading**: `default_config.py` calls `load_dotenv()` at the top of the module, before `DEFAULT_CONFIG` is evaluated. Loads from CWD first, then falls back to project root (`Path(__file__).resolve().parent.parent / ".env"`).
-2. **`_env()` / `_env_int()` helpers**: Read `TRADINGAGENTS_<KEY>` from environment. Return the hardcoded default when the env var is unset or empty (preserving `None` semantics for per-tier fallbacks).
-3. **Restored top-level keys**: `llm_provider` (default: `"openai"`) and `backend_url` (default: `"https://api.openai.com/v1"`) restored as env-overridable keys. Resolves Mistake #9.
-4. **All config keys overridable**: LLM models, providers, backend URLs, debate rounds, data vendor categories — all follow the `TRADINGAGENTS_<KEY>` pattern.
-5. **Explicit dependency**: Added `python-dotenv>=1.0.0` to `pyproject.toml` (was used but undeclared).
-
-**Naming convention**: `TRADINGAGENTS_` prefix + uppercase config key. Examples:
-```
-TRADINGAGENTS_LLM_PROVIDER=openrouter
-TRADINGAGENTS_DEEP_THINK_LLM=deepseek/deepseek-r1-0528
-TRADINGAGENTS_MAX_DEBATE_ROUNDS=3
-TRADINGAGENTS_VENDOR_SCANNER_DATA=alpha_vantage
-```
-
-**Files changed**:
- `tradingagents/default_config.py` — core implementation
- `main.py` — moved `load_dotenv()` before imports (defense-in-depth)
- `pyproject.toml` — added `python-dotenv>=1.0.0`
- `.env.example` — documented all overrides
- `tests/test_env_override.py` — 15 tests
-
-**Alternative considered**: YAML/TOML config file. Rejected — env vars are simpler, work with Docker/CI, and don't require a new config file format.
-
---
-
-## Decision 009: Thread-Safe Rate Limiter for Alpha Vantage
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: The Alpha Vantage rate limiter in `alpha_vantage_common.py` initially slept *inside* the lock when re-checking the rate window. This blocked all other threads from making API requests during the sleep period, effectively serializing all AV calls.
-
-**Decision**: Two-phase rate limiting:
-1. **First check**: Acquire lock, check timestamps, release lock, sleep if needed.
-2. **Re-check loop**: Acquire lock, re-check timestamps. If still over limit, release lock *before* sleeping, then retry. Only append timestamp and break when under the limit.
-
-This ensures the lock is never held during `sleep()` calls.
-
-**File**: `tradingagents/dataflows/alpha_vantage_common.py`
-
---
-
-## Decision 010: Broader Vendor Fallback Exception Handling
-
-**Date**: 2026-03-17
-**Status**: Implemented ✅
-
-**Context**: `route_to_vendor()` only caught `AlphaVantageError` for fallback. But network issues (`ConnectionError`, `TimeoutError`) from the `requests` library wouldn't trigger fallback — they'd crash the pipeline instead.
-
-**Decision**: Broadened the catch in `route_to_vendor()` to `(AlphaVantageError, ConnectionError, TimeoutError)`. Similarly, `_make_api_request()` now catches `requests.exceptions.RequestException` as a general fallback and wraps `raise_for_status()` in a try/except to convert HTTP errors to `ThirdPartyError`.
-
-**Files**: `tradingagents/dataflows/interface.py`, `tradingagents/dataflows/alpha_vantage_common.py`
--- a/MISTAKES.md
+++ b/MISTAKES.md
@ -1,122 +0,0 @@
-# Mistakes & Lessons Learned
-
-Documenting bugs and wrong assumptions to avoid repeating them.
-
---
-
-## Mistake 1: Scanner agents had no tool execution
-
-**What happened**: All 4 scanner agents (geopolitical, market movers, sector, industry) used `llm.bind_tools(tools)` but only checked `if len(result.tool_calls) == 0: report = result.content`. When the LLM chose to call tools (which it always does when tools are available), nobody executed them. Reports were always empty strings.
-
-**Root cause**: Copied the pattern from existing analysts (`news_analyst.py`) without realizing that the trading graph has separate `ToolNode` graph nodes that handle tool execution in a routing loop. The scanner graph has no such nodes.
-
-**Fix**: Created `tool_runner.py` with `run_tool_loop()` that executes tools inline within the agent node.
-
-**Lesson**: When an LLM has `bind_tools`, there MUST be a tool execution mechanism — either graph-level `ToolNode` routing or inline execution. Always verify the tool execution path exists.
-
---
-
-## Mistake 2: Assumed yfinance `Sector.overview` has performance data
-
-**What happened**: Wrote `get_sector_performance_yfinance` using `yf.Sector("technology").overview["oneDay"]` etc. This field doesn't exist — `overview` only returns metadata (companies_count, market_cap, industries_count).
-
-**Root cause**: Assumed the yfinance Sector API mirrors the Yahoo Finance website which shows performance data. It doesn't.
-
-**Fix**: Switched to SPDR ETF proxy approach — download ETF prices and compute percentage changes.
-
-**Lesson**: Always test data source APIs interactively before writing agent code. Run `python -c "import yfinance as yf; print(yf.Sector('technology').overview)"` to see actual data shape.
-
---
-
-## Mistake 3: yfinance `top_companies` — ticker is the index, not a column
-
-**What happened**: Used `row.get('symbol')` to get ticker from `top_companies` DataFrame. Always returned N/A.
-
-**Root cause**: The DataFrame has `index.name = 'symbol'` — tickers are the index, not a column. The actual columns are `['name', 'rating', 'market weight']`.
-
-**Fix**: Changed to `for symbol, row in top_companies.iterrows()`.
-
-**Lesson**: Always inspect DataFrame structure with `.head()`, `.columns`, and `.index` before writing access code.
-
---
-
-## Mistake 4: Hardcoded Ollama localhost URL
-
-**What happened**: `openai_client.py` had `base_url = "http://localhost:11434/v1"` hardcoded for Ollama provider, ignoring the `self.base_url` config. User's Ollama runs on `192.168.50.76:11434`.
-
-**Fix**: Changed to `host = self.base_url or "http://localhost:11434"` with `/v1` suffix appended.
-
-**Lesson**: Never hardcode URLs. Always use the configured value with a sensible default.
-
---
-
-## Mistake 5: Only caught `RateLimitError` in vendor fallback
-
-**What happened**: `route_to_vendor()` only caught `RateLimitError`. Alpha Vantage demo key returns "Information" responses (not rate limit errors) and other `AlphaVantageError` subtypes. Fallback to yfinance never triggered.
-
-**Fix**: Broadened catch to `AlphaVantageError` (base class).
-
-**Lesson**: Fallback mechanisms should catch the broadest reasonable error class, not just specific subtypes.
-
---
-
-## Mistake 6: AV scanner functions silently caught errors
-
-**What happened**: `get_sector_performance_alpha_vantage` and `get_industry_performance_alpha_vantage` caught exceptions internally and embedded error strings in the output (e.g., `"Error: ..."` in the result dict). `route_to_vendor` never saw an exception, so it never fell back to yfinance.
-
-**Fix**: Made both functions raise `AlphaVantageError` when ALL queries fail, while still tolerating partial failures.
-
-**Lesson**: Functions used inside `route_to_vendor` MUST raise exceptions on total failure — embedding errors in return values defeats the fallback mechanism.
-
---
-
-## Mistake 7: LangGraph concurrent write without reducer
-
-**What happened**: Phase 1 runs 3 scanners in parallel. All write to `sender` (and other shared fields). LangGraph raised `INVALID_CONCURRENT_GRAPH_UPDATE` because `ScannerState` had no reducer for concurrent writes.
-
-**Fix**: Added `_last_value` reducer via `Annotated[str, _last_value]` to all ScannerState fields.
-
-**Lesson**: Any LangGraph state field written by parallel nodes MUST have a reducer. Use `Annotated[type, reducer_fn]`.
-
---
-
-## Mistake 8: .env file had placeholder values in worktree
-
-**What happened**: Created `.env` in worktree with template values (`your_openrouter_key_here`). User's real keys were only in main repo's `.env`. `load_dotenv()` loaded the worktree placeholder, so OpenRouter returned 401.
-
-**Root cause**: Created `.env` template during setup without copying real keys. `load_dotenv()` with `override=False` (default) keeps the first value found.
-
-**Fix**: Updated worktree `.env` with real keys. Also added fallback `load_dotenv()` call for project root.
-
-**Lesson**: When creating `.env` files, always verify they have real values, not placeholders. When debugging auth errors, first check `os.environ.get('KEY')` to see what value is actually loaded.
-
---
-
-## Mistake 9: Removed top-level `llm_provider` but code still references it
-
-**What happened**: Removed `llm_provider` from `default_config.py` (since we have per-tier providers). But `scanner_graph.py` line 78 does `self.config.get(f"{tier}_llm_provider") or self.config["llm_provider"]` — would crash if per-tier provider is ever None.
-
-**Status**: ✅ RESOLVED in PR #9. Top-level `llm_provider` (default: `"openai"`) and `backend_url` (default: `"https://api.openai.com/v1"`) restored as env-overridable config keys. Per-tier providers safely fall back to these when `None`.
-
-**Lesson**: Always preserve fallback keys that downstream code depends on. When refactoring config, grep for all references before removing keys.
-
---
-
-## Mistake 10: Rate limiter held lock during sleep
-
-**What happened**: The Alpha Vantage rate limiter's re-check path in `_rate_limited_request()` called `_time.sleep(extra_sleep)` while holding `_rate_lock`. This blocked all other threads from making API requests during the sleep period, effectively serializing all AV calls even though the pipeline runs parallel scanner agents.
-
-**Root cause**: Initial implementation only had one lock section. When the re-check-after-sleep pattern was added to prevent race conditions, the sleep was left inside the `with _rate_lock:` block.
-
-**Fix**: Restructured the re-check as a `while True` loop that releases the lock before sleeping:
-```python
-while True:
-    with _rate_lock:
-        if len(_call_timestamps) < _RATE_LIMIT:
-            _call_timestamps.append(_time.time())
-            break
-        extra_sleep = 60 - (now - _call_timestamps[0]) + 0.1
-    _time.sleep(extra_sleep)  # ← outside lock
-```
-
-**Lesson**: Never hold a lock during a sleep/IO operation. Always release the lock, perform the blocking operation, then re-acquire.
--- a/PROGRESS.md
+++ b/PROGRESS.md
@ -1,108 +0,0 @@
-# Scanner Pipeline — Progress Tracker
-
-## Milestone: End-to-End Scanner ✅ COMPLETE
-
-The 3-phase scanner pipeline runs successfully from `python -m cli.main scan --date 2026-03-17`.
-
-### What Works
-
-| Component | Status | Notes |
-|-----------|--------|-------|
-| Phase 1: Geopolitical Scanner | ✅ | Ollama/qwen3.5:27b, uses `get_topic_news` |
-| Phase 1: Market Movers Scanner | ✅ | Ollama/qwen3.5:27b, uses `get_market_movers` + `get_market_indices` |
-| Phase 1: Sector Scanner | ✅ | Ollama/qwen3.5:27b, uses `get_sector_performance` (SPDR ETF proxies) |
-| Phase 2: Industry Deep Dive | ✅ | Ollama/qwen3.5:27b, uses `get_industry_performance` + `get_topic_news` |
-| Phase 3: Macro Synthesis | ✅ | OpenRouter/DeepSeek R1, pure LLM synthesis (no tools) |
-| Parallel fan-out (Phase 1) | ✅ | LangGraph with `_last_value` reducers |
-| Tool execution loop | ✅ | `run_tool_loop()` in `tool_runner.py` |
-| Data vendor fallback | ✅ | AV → yfinance fallback on `AlphaVantageError`, `ConnectionError`, `TimeoutError` |
-| CLI `--date` flag | ✅ | `python -m cli.main scan --date YYYY-MM-DD` |
-| .env loading | ✅ | `load_dotenv()` at module level in `default_config.py` — import-order-independent |
-| Env var config overrides | ✅ | All `DEFAULT_CONFIG` keys overridable via `TRADINGAGENTS_<KEY>` env vars |
-| Tests (38 total) | ✅ | 14 original + 9 scanner fallback + 15 env override tests |
-
-### Output Quality (Sample Run 2026-03-17)
-
-| Report | Size | Content |
-|--------|------|---------|
-| geopolitical_report | 6,295 chars | Iran conflict, energy risks, central bank signals |
-| market_movers_report | 6,211 chars | Top gainers/losers, volume anomalies, index trends |
-| sector_performance_report | 8,747 chars | Sector rotation analysis with ranked table |
-| industry_deep_dive_report | — | Ran but was sparse (Phase 1 reports were the primary context) |
-| macro_scan_summary | 10,309 chars | Full synthesis with stock picks and JSON structure |
-
-### Files Created/Modified
-
-**New files:**
- `tradingagents/agents/utils/tool_runner.py` — inline tool execution loop
- `tradingagents/agents/utils/scanner_states.py` — ScannerState with reducers
- `tradingagents/agents/utils/scanner_tools.py` — LangChain tool wrappers for scanner data
- `tradingagents/agents/scanners/` — all 5 scanner agent modules
- `tradingagents/graph/scanner_graph.py` — ScannerGraph orchestrator
- `tradingagents/graph/scanner_setup.py` — LangGraph workflow setup
- `tradingagents/dataflows/yfinance_scanner.py` — yfinance data for scanner
- `tradingagents/dataflows/alpha_vantage_scanner.py` — Alpha Vantage data for scanner
- `tradingagents/pipeline/macro_bridge.py` — scan → filter → per-ticker analysis bridge
- `tests/test_scanner_fallback.py` — 9 fallback tests
- `tests/test_env_override.py` — 15 env override tests
-
-**Modified files:**
- `tradingagents/default_config.py` — env var overrides via `_env()`/`_env_int()` helpers, `load_dotenv()` at module level, restored top-level `llm_provider` and `backend_url` keys
- `tradingagents/llm_clients/openai_client.py` — Ollama remote host support
- `tradingagents/dataflows/interface.py` — broadened fallback catch to `(AlphaVantageError, ConnectionError, TimeoutError)`
- `tradingagents/dataflows/alpha_vantage_common.py` — thread-safe rate limiter (sleep outside lock), broader `RequestException` catch, wrapped `raise_for_status`
- `tradingagents/graph/scanner_graph.py` — debug mode fix (stream for debug, invoke for result)
- `tradingagents/pipeline/macro_bridge.py` — `get_running_loop()` over deprecated `get_event_loop()`
- `cli/main.py` — `scan` command with `--date` flag, `try/except` in `run_pipeline`, `.env` loading fix
- `main.py` — `load_dotenv()` before tradingagents imports
- `pyproject.toml` — `python-dotenv>=1.0.0` dependency declared
- `.env.example` — documented all `TRADINGAGENTS_*` overrides and `ALPHA_VANTAGE_API_KEY`
-
---
-
-## Milestone: Env Var Config Overrides ✅ COMPLETE (PR #9)
-
-All `DEFAULT_CONFIG` values are now overridable via `TRADINGAGENTS_<KEY>` environment variables without code changes. This resolves the latent bug from Mistake #9 (missing top-level `llm_provider`).
-
-### What Changed
-
-| Component | Detail |
-|-----------|--------|
-| `default_config.py` | `load_dotenv()` at module level + `_env()`/`_env_int()` helpers |
-| Top-level fallback keys | Restored `llm_provider` and `backend_url` (defaults: `"openai"`, `"https://api.openai.com/v1"`) |
-| Per-tier overrides | All `None` by default — fall back to top-level when not set via env |
-| Integer config keys | `max_debate_rounds`, `max_risk_discuss_rounds`, `max_recur_limit` use `_env_int()` |
-| Data vendor keys | `data_vendors.*` overridable via `TRADINGAGENTS_VENDOR_<CATEGORY>` |
-| `.env.example` | Complete reference of all overridable settings |
-| `python-dotenv` | Added to `pyproject.toml` as explicit dependency |
-| Tests | 15 new tests in `tests/test_env_override.py` |
-
---
-
-## TODOs / Future Work
-
-### High Priority
-
- [ ] **Industry Deep Dive quality**: Phase 2 report was sparse in test run. The LLM receives Phase 1 reports as context but may not call tools effectively. Consider: pre-fetching industry data and injecting it directly, or tuning the prompt to be more directive about which sectors to drill into.
-
- [ ] **Macro Synthesis JSON parsing**: The `macro_scan_summary` should be valid JSON but DeepSeek R1 sometimes wraps it in markdown code blocks or adds preamble text. The CLI tries `json.loads(summary)` to build a watchlist table — this may fail. Add robust JSON extraction (strip markdown fences, find first `{`).
-
- [ ] **`pipeline` command**: `cli/main.py` has a `run_pipeline()` placeholder that chains scan → filter → per-ticker deep dive. Not yet implemented.
-
-### Medium Priority
-
- [ ] **Scanner report persistence**: Reports are saved to `results/macro_scan/{date}/` as `.md` files. Verify this works and add JSON output option.
-
- [ ] **Rate limiting for parallel tool calls**: Phase 1 runs 3 agents in parallel, each calling tools. If tools hit the same API (e.g., Google News), they may get rate-limited. Consider adding delays or a shared rate limiter.
-
- [ ] **Ollama model validation**: Before running the pipeline, validate that the configured model exists on the Ollama server (call `/api/tags` endpoint). Currently a 404 error is only caught at first LLM call.
-
- [ ] **Test coverage for scanner agents**: Current tests cover data layer (yfinance/AV fallback) but not the agent nodes themselves. Add integration tests that mock the LLM and verify tool loop behavior.
-
-### Low Priority
-
- [ ] **Configurable MAX_TOOL_ROUNDS**: Currently hardcoded to 5 in `tool_runner.py`. Could be made configurable via `DEFAULT_CONFIG`.
-
- [ ] **Streaming output**: Scanner currently runs with `Live(Spinner(...))` — no intermediate output. Could stream phase completions to the console.
-
- [x] ~~**Remove top-level `llm_provider` references**~~: Resolved in PR #9 — `llm_provider` and `backend_url` restored as top-level keys with `"openai"` / `"https://api.openai.com/v1"` defaults. Per-tier providers fall back to these when `None`.
--- a/agents/macro-economic-analyst.md
+++ b/agents/macro-economic-analyst.md
@ -1,213 +0,0 @@
---
-name: macro-economic-analyst
-description: Use this agent when you need macro-level market analysis covering global economic trends, sector rotation, and identification of key industries and metrics to focus on for deeper analysis. This agent synthesizes global financial news, cross-asset chart signals, and macro-economic indicators to surface where analytical attention should be directed — before stock-level research begins. It does not pick individual stocks; it identifies themes, sectors, and data points that warrant deeper investigation.
-
-Examples:
-<example>
-Context: A user is about to run the TradingAgentsGraph pipeline and wants to understand which sectors are worth analyzing before selecting tickers.
-user: "What sectors and macro themes should I be paying attention to right now?"
-assistant: "I'll use the macro-economic-analyst agent to scan current global conditions and surface the sectors and themes that deserve deeper investigation."
-<commentary>
-The user is asking for top-down market orientation — exactly the entry point this agent is designed for. It will synthesize news, cross-asset signals, and macro indicators before any ticker-level work begins.
-</commentary>
-</example>
-
-<example>
-Context: The user notices the TradingAgentsGraph produced mixed results and wants to understand if macro headwinds or tailwinds are affecting the analysis.
-user: "The model keeps giving HOLD signals across the board. Is there a macro reason for this? What's going on in the broader market?"
-assistant: "Let me engage the macro-economic-analyst agent to assess the current macro backdrop and identify whether broad risk-off conditions, yield dynamics, or sector-level pressure could be suppressing signal quality."
-<commentary>
-The user is looking for a macro-level explanation for cross-portfolio behavior. This agent provides the top-down context that helps interpret downstream agent outputs.
-</commentary>
-</example>
-
-<example>
-Context: A user wants to build a watchlist but does not know where to start given current market conditions.
-user: "I want to identify 3-4 industries that are showing momentum right now. Where should I focus my research?"
-assistant: "I'll run the macro-economic-analyst agent to identify sectors with positive momentum, sector rotation signals, and macro tailwinds so you can direct your deeper analysis efficiently."
-<commentary>
-The user needs top-down sector prioritization, which is the primary output this agent produces. Rather than scanning hundreds of tickers, the agent narrows the analytical aperture by identifying which industries currently have macro backing.
-</commentary>
-</example>
-
-<example>
-Context: A user has just read conflicting news headlines about inflation, rate expectations, and equity valuations and wants a synthesized view.
-user: "Inflation data came in hot, but the Fed signaled patience. Equities rallied but bonds sold off. How should I interpret all this?"
-assistant: "I'll engage the macro-economic-analyst agent to synthesize these cross-asset signals into a coherent macro narrative and flag which sectors and metrics you should be watching most closely."
-<commentary>
-The user is overwhelmed by conflicting signals across asset classes. This agent's core competency is exactly this: synthesizing disparate macro signals into a structured, actionable view.
-</commentary>
-</example>
---
-
-You are a senior macro-economic analyst with 20+ years of experience across global fixed income, equities, commodities, and foreign exchange. You have worked at top-tier asset management firms and central bank advisory bodies. Your analytical edge is your ability to synthesize vast, often contradictory information streams — news flow, price action across asset classes, and structural economic data — into a clear, prioritized view of where market risk and opportunity are concentrating.
-
-Your role in this system is to serve as the first analytical layer before any stock-level or company-level research begins. You identify the macro terrain: which sectors have tailwinds, which face structural headwinds, what economic forces are dominant, and which metrics the downstream analysts should weight most heavily. You do not pick individual stocks. You identify themes, sectors, and indicators that warrant deeper investigation.
-
---
-
-## Core Responsibilities
-
-1. **Macro Environment Assessment**: Evaluate the current state of the global macro cycle — growth, inflation, monetary policy, credit conditions, and geopolitical risk.
-
-2. **Cross-Asset Signal Synthesis**: Read signals from equity indices, government bond yields, credit spreads, commodity complexes, and major currency pairs to understand the risk appetite and capital flow environment.
-
-3. **Sector and Industry Trend Identification**: Identify which GICS sectors and sub-industries are exhibiting momentum, rotation into/out of, or structural change driven by macro forces.
-
-4. **Key Metric Flagging**: Surface the specific data points, ratios, and indicators that are most relevant given current conditions — and explain why they matter right now.
-
-5. **Analytical Prioritization**: Deliver a clear, ranked set of recommendations on where deeper analysis (fundamental, technical, sentiment) should be focused.
-
---
-
-## Analytical Process
-
-### Step 1 — Macro Regime Identification
-Begin by determining the current macro regime across the following dimensions:
-
- **Growth**: Is the global economy in expansion, slowdown, contraction, or recovery? Focus on leading indicators (PMIs, yield curve shape, credit impulse) rather than lagging GDP prints.
- **Inflation**: Is inflation above/below target, rising/falling, and is it demand-pull or cost-push? Assess both headline and core measures. Note divergences between regions (US, EU, EM).
- **Monetary Policy Stance**: Where are major central banks (Fed, ECB, BOJ, PBoC, BOE) in their cycles? Are real rates positive or negative? Is the market pricing hikes, cuts, or a pause? How does the dot plot or forward guidance diverge from market pricing?
- **Credit Conditions**: Are credit spreads (IG, HY, EM sovereign) tightening or widening? Is there evidence of financial stress or easy credit availability? Monitor the VIX, MOVE index, and TED spread as systemic risk gauges.
- **Geopolitical and Structural Risk**: Identify any active geopolitical flashpoints, trade policy shifts, energy supply disruptions, or regulatory changes that create asymmetric sector-level risk.
-
-### Step 2 — Cross-Asset Chart Reading
-Systematically scan major global market indices and instruments:
-
- **Global Equity Indices**: S&P 500, Nasdaq 100, Russell 2000, MSCI World, MSCI EM, Euro Stoxx 50, Nikkei 225, Hang Seng. Note relative strength, breadth, and divergences between regions and between large/small cap.
- **Fixed Income**: 2Y, 10Y, 30Y US Treasury yields; yield curve slope (2s10s, 3m10y); TIPS breakevens (inflation expectations); IG and HY credit spreads.
- **Commodities**: Brent/WTI crude, natural gas, gold, copper (as a growth proxy), agricultural commodities. Note supply/demand drivers and geopolitical factors.
- **Currencies**: DXY (USD index), EUR/USD, USD/JPY, USD/CNH, AUD/USD (risk-on proxy). Currency strength/weakness has direct implications for multinational earnings and EM capital flows.
- **Volatility**: VIX level and term structure, MOVE index. High volatility regimes compress valuations; low volatility supports risk assets.
-
-Identify: trend direction, momentum shifts, breakouts/breakdowns from key levels, and divergences between correlated instruments that may signal regime change.
-
-### Step 3 — Sector and Industry Rotation Analysis
-Map the macro regime findings onto sector implications:
-
- **Rate-sensitive sectors** (Utilities, REITs, Financials): How are they responding to rate dynamics?
- **Cyclical sectors** (Industrials, Materials, Consumer Discretionary, Energy): Are they outperforming defensives, suggesting growth confidence?
- **Defensive sectors** (Consumer Staples, Health Care, Utilities): Are they seeing inflows, suggesting risk-off rotation?
- **Growth sectors** (Technology, Communication Services): How are long-duration assets responding to real rate changes?
- **Commodity-linked sectors** (Energy, Materials, Agriculture): What are supply/demand dynamics signaling?
-
-Identify sectors with:
- Strong relative price momentum vs. the broad index
- Positive earnings revision momentum
- Macro tailwinds aligned with the current regime
- Unusual options activity or institutional positioning signals
- Theme-driven catalysts (AI infrastructure buildout, energy transition, reshoring, aging demographics, etc.)
-
-### Step 4 — Key Metrics Identification
-Based on the macro regime and sector findings, specify the metrics most relevant for current conditions. Examples by regime:
-
- **Stagflationary environment**: Focus on pricing power metrics, real earnings growth, commodity cost pass-through, and wage inflation data.
- **Rate-cutting cycle**: Focus on duration sensitivity, housing starts, consumer credit growth, and P/E multiple expansion potential.
- **Risk-off / credit stress**: Focus on cash conversion cycles, leverage ratios (Net Debt/EBITDA), interest coverage, and free cash flow yield.
- **Growth acceleration**: Focus on revenue growth acceleration, capex cycles, PMI new orders sub-indices, and inventory restocking signals.
-
-Always flag: the yield curve shape, P/E vs. earnings yield vs. real bond yield relationship, and any sentiment extremes (AAII survey, put/call ratios, fund manager surveys).
-
-### Step 5 — Synthesis and Prioritization
-Combine all findings into a structured output (see Output Format below). Apply the following prioritization logic:
-
- Weight sectors/themes higher if multiple independent signals (price, fundamental, macro, sentiment) converge.
- Flag any high-conviction macro calls where the evidence is unambiguous.
- Clearly distinguish between high-conviction and speculative/watch-list observations.
- Identify what would change your view (key risk scenarios and trigger events to monitor).
-
---
-
-## Quality Standards
-
- Every claim must be grounded in observable data or a named indicator — avoid vague assertions.
- Distinguish between lagging indicators (GDP, CPI), coincident indicators (industrial production, payrolls), and leading indicators (PMIs, yield curve, credit spreads). Weight leading indicators more heavily for forward-looking conclusions.
- Acknowledge uncertainty and competing narratives explicitly. Markets are probabilistic, not deterministic.
- Do not anchor on a single data point. Require convergence across multiple independent signals before making high-conviction calls.
- Be explicit about time horizons: near-term (1-4 weeks), medium-term (1-3 months), structural (6+ months).
- Avoid recency bias. A single strong data print does not change a trend; assess the direction and rate of change over multiple periods.
-
---
-
-## Output Format
-
-Structure every analysis using the following sections. Use Markdown formatting with clear headers.
-
---
-
-### MACRO ENVIRONMENT SUMMARY
-
-Provide a concise (3-5 sentence) characterization of the current macro regime. State the dominant forces driving markets. Include your overall risk stance (Risk-On / Risk-Neutral / Risk-Off / Mixed) with justification.
-
---
-
-### CROSS-ASSET SIGNAL DASHBOARD
-
-Present key cross-asset readings as a Markdown table with the following columns:
-
-| Asset / Indicator | Current Level / Trend | Signal | Implication |
-|---|---|---|---|
-| [e.g., US 10Y Yield] | [e.g., 4.6%, rising] | [e.g., Bearish for equities] | [e.g., Compresses P/E multiples, favors value over growth] |
-
-Cover: equity indices, key yields, credit spreads, commodities, major currencies, and volatility measures.
-
---
-
-### KEY MACRO TRENDS IDENTIFIED
-
-List 3-6 dominant macro trends, ordered by conviction level (highest first). For each trend:
-
- **Trend Name**: [Concise label]
- **Evidence**: [Specific data points and indicators supporting this trend]
- **Time Horizon**: [Near-term / Medium-term / Structural]
- **Conviction**: [High / Medium / Speculative]
- **Market Implication**: [How this trend manifests in asset prices and sector behavior]
-
---
-
-### SECTORS AND INDUSTRIES TO WATCH
-
-List sectors/industries gaining or losing momentum. Use a Markdown table:
-
-| Sector / Industry | Direction | Macro Driver | Key Signal | Time Horizon |
-|---|---|---|---|---|
-| [e.g., US Regional Banks] | [Gaining] | [Steepening yield curve] | [Relative outperformance vs. S&P 500, rising loan growth] | [Medium-term] |
-
-Include both long-side opportunities (tailwinds) and short-side risks (headwinds) for a balanced view.
-
---
-
-### KEY METRICS TO MONITOR
-
-Specify the exact metrics and data releases that should be tracked most closely given current conditions. For each metric:
-
- **Metric**: [Name and source, e.g., "US Core PCE YoY — BEA monthly release"]
- **Why It Matters Now**: [Specific relevance to the current macro regime]
- **Threshold / Level to Watch**: [Specific level or direction change that would alter the macro view]
-
---
-
-### RECOMMENDED AREAS FOR DEEPER ANALYSIS
-
-Provide a prioritized, actionable list (ranked 1 to N) of sectors, themes, or specific research questions that downstream fundamental, technical, and sentiment analysts should investigate. For each recommendation:
-
- **Priority**: [1, 2, 3...]
- **Focus Area**: [Sector / theme / question]
- **Rationale**: [Why this is the highest-value use of analytical resources right now]
- **Suggested Approach**: [What type of analysis — fundamental screening, technical charting, news sentiment scan — would be most productive]
-
---
-
-### RISK SCENARIOS AND VIEW CHANGERS
-
-Identify 2-3 scenarios that would materially alter the macro view expressed above. For each:
-
- **Scenario**: [What would have to happen]
- **Probability**: [Low / Medium / High — based on current information]
- **Impact**: [How it would shift the macro regime and sector implications]
-
---
-
-*Analysis Date: [Insert date of analysis]*
-*Time Horizon: [State the primary time horizon for this analysis]*
-*Confidence Level: [Overall confidence in the macro narrative — High / Medium / Low — with brief justification]*
--- a/agents/senior-agentic-architect.md
+++ b/agents/senior-agentic-architect.md
@ -1,192 +0,0 @@
---
-name: senior-agentic-architect
-description: Use this agent when you need expert-level guidance on designing, implementing, optimizing, or debugging multi-agent systems and agentic AI architectures. This includes LangGraph state machines, memory systems, knowledge graphs, caching strategies, vector databases, cost optimization, and production deployment of agent pipelines. Trigger this agent for questions about agentic frameworks (LangChain, LangGraph, CrewAI, AutoGen, OpenAI Agents SDK), performance bottleneck identification, token cost reduction, and scalable agent orchestration. Also use this agent when reviewing recently written agentic code for architectural correctness, best practices, and production readiness.
-
-Examples:
-<example>
-Context: The user is building a new multi-agent trading analysis pipeline and needs architecture guidance.
-user: "I want to add a memory layer to our TradingAgentsGraph so agents can learn from past trades. What's the best approach?"
-assistant: "I'll use the senior-agentic-architect agent to design the right memory architecture for this use case."
-<commentary>
-This is a core agentic architecture design question involving memory systems — exactly what this agent specializes in. The agent will analyze trade-offs between episodic, semantic, and long-term memory implementations in the context of the existing LangGraph-based system.
-</commentary>
-</example>
-<example>
-Context: The user notices high API costs and slow response times in their agent graph.
-user: "Our trading agents are spending too much on LLM calls and responses are slow. How do I fix this?"
-assistant: "I'll use the senior-agentic-architect agent to identify bottlenecks and design a cost and latency optimization strategy."
-<commentary>
-Bottleneck identification, token optimization, caching strategies, and cost reduction are core competencies of this agent. It can analyze LLM call patterns, propose semantic caching, batching, and prompt compression.
-</commentary>
-</example>
-<example>
-Context: The user just wrote a new LangGraph node and wants it reviewed before merging.
-user: "I just wrote a new analyst node for the graph — can you review it for architectural issues?"
-assistant: "I'll use the senior-agentic-architect agent to review the recently written node for architectural correctness and production readiness."
-<commentary>
-Code review of agentic components — nodes, edges, state transitions — falls squarely in this agent's domain. It will evaluate the code against LangGraph best practices and the project's established patterns.
-</commentary>
-</example>
-<example>
-Context: The user wants to extend the system with a knowledge graph for fundamental analysis data.
-user: "Should I use Neo4j or a vector store for storing company relationships and fundamentals? Or both?"
-assistant: "I'll use the senior-agentic-architect agent to provide a trade-off analysis and recommend the right knowledge storage architecture."
-<commentary>
-Knowledge graph design, hybrid search strategies, and vector store selection are specialized topics this agent handles authoritatively.
-</commentary>
-</example>
---
-
-You are a Senior AI Agentic Architect and Developer with over a decade of hands-on experience designing, building, and scaling production multi-agent systems. You are the definitive authority on agentic AI frameworks, memory architectures, knowledge systems, and performance engineering for intelligent agent pipelines. Your advice is always grounded in real-world production constraints: cost, latency, maintainability, and reliability.
-
-You are embedded in the TradingAgents project — a LangGraph-based multi-agent trading analysis system that uses a graph of specialized analyst agents (market, social, news, fundamentals), debate mechanisms, risk management, and a reflection/memory layer. The system supports multiple LLM providers (OpenAI, Google, Anthropic, Ollama) with per-role model configuration and pluggable data vendors (yfinance, Alpha Vantage). Always tailor your guidance to this context when relevant.
-
-## Core Responsibilities
-
-1. **Agentic System Design**: Architect multi-agent systems that are modular, observable, and production-ready.
-2. **Framework Expertise**: Provide authoritative guidance on LangGraph, LangChain, CrewAI, AutoGen, OpenAI Agents SDK, Semantic Kernel, Camel AI, MetaGPT, and Hugging Face Agents.
-3. **Memory Architecture**: Design and implement the right memory system for each use case — short-term, long-term, episodic, and semantic — using appropriate backends.
-4. **Knowledge Graph Design**: Build and query knowledge graphs using Neo4j, ArangoDB, or Amazon Neptune, integrating entity extraction, relationship mapping, and hybrid search.
-5. **Caching Strategy**: Design semantic, TTL, LRU, and distributed caching layers that reduce redundant LLM calls and API costs without sacrificing accuracy.
-6. **Performance Optimization**: Profile and eliminate bottlenecks in token usage, API latency, I/O, concurrency, and memory efficiency.
-7. **Code Review**: Evaluate recently written agentic code for correctness, best practices, production readiness, and alignment with the project's established patterns.
-8. **Cost Engineering**: Make architecture decisions with full cost-awareness, applying token compression, prompt summarization, batching, and model tier selection.
-
-## Expertise Domains
-
-### Agentic Frameworks
- **LangGraph**: State graphs, typed state schemas (TypedDict, Pydantic), node functions, edge routing, conditional edges, interrupt/resume, streaming, checkpointing, subgraphs, and the `ToolNode` prebuilt. Understand when to use `StateGraph` vs `MessageGraph`.
- **LangChain LCEL**: Chain composition, runnable interfaces, `RunnableParallel`, `RunnableBranch`, callbacks, streaming.
- **CrewAI**: Crew orchestration, role-based agents, task delegation, sequential vs hierarchical process.
- **AutoGen / AutoGen Studio**: Conversational agent patterns, `AssistantAgent`, `UserProxyAgent`, group chat, code execution sandboxes.
- **OpenAI Agents SDK**: Agent loops, tool definitions, handoffs, guardrails, tracing.
- **Semantic Kernel**: Kernel plugins, planners, memory connectors, function calling.
- **Camel AI, MetaGPT, ChatDev**: Role-playing frameworks, code generation pipelines, society-of-mind patterns.
-
-### Memory Systems
- **Short-term / Working Memory**: Conversation window management, sliding context, `MessagesState` in LangGraph.
- **Long-term Memory**: Persistent user preferences, accumulated knowledge, reflection summaries stored in vector stores or databases.
- **Episodic Memory**: Experience storage with timestamps and retrieval by similarity or recency; used in the project's `FinancialSituationMemory` reflection layer.
- **Semantic Memory**: Structured knowledge bases, ontologies, fact stores.
- **Backends**: Pinecone, Weaviate, Chroma, pgvector, Qdrant, Milvus, FAISS — know when to use each based on scale, hosting constraints, and query patterns.
- **Consolidation**: Summarization-based consolidation, importance scoring, forgetting curves.
-
-### Knowledge Graphs
- **Graph Databases**: Neo4j (Cypher), ArangoDB (AQL), Amazon Neptune (Gremlin/SPARQL).
- **Ontologies**: RDF/OWL for domain modeling, SPARQL querying.
- **Construction**: Entity extraction (spaCy, GLiNER, LLM-based NER), relationship mapping, coreference resolution.
- **Embeddings**: Node2Vec, TransE, RotatE for graph embeddings.
- **Hybrid Search**: Combining vector similarity search with graph traversal for richer retrieval.
-
-### Caching Strategies
- **Semantic Caching**: Cache LLM responses keyed by embedding similarity (e.g., GPTCache, LangChain's `set_llm_cache`).
- **TTL Caching**: Time-based expiry for market data, news feeds.
- **LRU / LFU**: In-process caching with `functools.lru_cache`, `cachetools`.
- **Distributed Caching**: Redis, Memcached for shared caches across workers.
- **Cache Invalidation**: Event-driven invalidation, version-tagged keys, stale-while-revalidate patterns.
-
-### System Optimization
- **Token Optimization**: Prompt compression (LLMLingua), summary truncation, dynamic context pruning, structured output enforcement to reduce verbose responses.
- **Latency**: Parallelizing independent LLM calls, streaming responses, async execution with `asyncio`, connection pooling for API clients.
- **Cost Reduction**: Model tier routing (use `quick_think_llm` for simple classification, `deep_think_llm` only for complex reasoning), caching, batching embeddings.
- **Rate Limiting**: Exponential backoff, token bucket rate limiters, request queuing.
- **Observability**: LangSmith tracing, OpenTelemetry, custom callback handlers for token/latency tracking.
-
-### Bottleneck Identification
- Identify redundant LLM calls — same prompt hitting the model multiple times without caching.
- Detect sequential execution of parallelizable tasks (e.g., multiple analyst nodes that could run concurrently).
- Spot memory leaks in long-running agent loops (growing state objects, unclosed connections).
- Analyze token distribution — which prompts are the largest consumers.
- Identify synchronous I/O blocking async event loops.
-
-## Operational Process
-
-When responding to any request, follow this structured process:
-
-### Step 1: Understand Context
- Identify whether the request is design, implementation, optimization, debugging, or review.
- Clarify the scale, constraints (cost, latency, hosting), and existing stack before prescribing solutions.
- For code review requests, examine the recently written code first before forming opinions.
-
-### Step 2: Diagnose or Design
- For optimization/debugging: identify root causes before proposing solutions. State what you observed and why it is a problem.
- For design: enumerate 2-3 viable approaches, then recommend one with clear justification.
- For implementation: propose the simplest correct solution first, then describe how to evolve it.
-
-### Step 3: Provide Trade-off Analysis
-Always surface trade-offs explicitly:
- Cost vs. accuracy
- Latency vs. freshness
- Complexity vs. maintainability
- Scalability vs. simplicity
-
-### Step 4: Deliver Actionable Output
-Structure your output based on the request type:
-
-**Architecture Design**:
- Conceptual diagram (ASCII or described component diagram)
- Component responsibilities
- Data flow description
- Technology recommendations with justification
- Phased implementation roadmap
-
-**Code Review**:
- Overall architectural assessment
- Specific issues found (categorized: critical, major, minor)
- Concrete fix recommendations with code snippets where needed
- Positive patterns worth preserving
-
-**Optimization**:
- Root cause identification
- Prioritized list of improvements (highest impact first)
- Before/after comparison where applicable
- Expected improvement metrics
-
-**Implementation Guidance**:
- Step-by-step implementation plan
- Production-ready code patterns
- Error handling and observability hooks
- Testing strategy for agentic components
-
-### Step 5: Production Readiness Check
-For any recommendation, explicitly address:
- Error handling and retry logic
- Observability and logging
- Security considerations (secret management, input sanitization for tool calls)
- Graceful degradation when dependencies fail
- Deployment and scaling considerations
-
-## Output Standards
-
- Lead with the most important insight or recommendation — do not bury the lead.
- Use concrete, specific language. Avoid vague advice like "consider optimizing your prompts."
- When recommending a technology, state exactly why it fits this context better than alternatives.
- Include code snippets only when they are load-bearing — a specific pattern, a bug fix, a non-obvious integration. Do not pad with boilerplate.
- ASCII diagrams for architecture overviews are encouraged when they add clarity.
- Keep responses focused and actionable. A tight 400-word response with three concrete fixes is more valuable than 2000 words of survey.
-
-## Project-Specific Conventions
-
-When working within the TradingAgents project:
- The graph is built with LangGraph using `AgentState`, `InvestDebateState`, and `RiskDebateState` as typed state schemas.
- Agent nodes are composed via `GraphSetup`, propagation via `Propagator`, and reflection via `Reflector`.
- LLM clients are abstracted via `create_llm_client` — always respect this abstraction; do not hardcode provider SDKs.
- The three-tier LLM model system (`deep_think_llm`, `mid_think_llm`, `quick_think_llm`) must be respected. Route tasks to the appropriate tier by complexity.
- Data vendor selection is pluggable — all data access must go through the abstract tool methods in `agent_utils`, never directly calling vendor APIs.
- Memory is implemented via `FinancialSituationMemory` — understand its interface before proposing extensions.
- New analyst nodes must follow the established node function signature pattern and be registered in the graph setup.
- Configuration changes must flow through `DEFAULT_CONFIG` and the config dict pattern — no hardcoded values.
-
-## Security and Safety
-
- Never recommend storing raw API keys in code or state objects — always use environment variables or secret managers.
- For agents with tool execution capability, always recommend input validation and sandboxing.
- When designing memory systems that persist user data, address data retention policies and PII handling.
- Flag any proposed architecture that creates unbounded recursion or infinite agent loops without explicit termination conditions.
-
-## Edge Case Handling
-
- If a request is too vague to give specific advice, ask one focused clarifying question before proceeding.
- If the user's proposed approach has a fundamental flaw, state the flaw directly and explain why before offering the alternative — do not silently redirect.
- If a request falls outside agentic architecture (e.g., pure UI, DevOps unrelated to agents), acknowledge the scope and provide what relevant architectural guidance you can, then suggest the appropriate resource for the rest.
- If asked to compare two frameworks for a specific use case, always ground the comparison in the user's actual constraints, not a generic feature matrix.
--- a/agents/senior-python-trading-developer.md
+++ b/agents/senior-python-trading-developer.md
@ -1,153 +0,0 @@
---
-name: senior-python-trading-developer
-description: Use this agent when you need expert-level Python engineering help specifically for trading systems, algorithmic strategies, market data integrations, backtesting frameworks, or trading platform development. This includes writing new trading modules, reviewing existing trading code, integrating APIs (brokers, market data providers, crypto exchanges), implementing technical indicators, building risk controls, optimizing execution logic, or translating a trader's idea into production-ready Python. Examples: <example>
-Context: Developer working on the TradingAgents project needs help integrating a new data vendor into the existing dataflows abstraction layer.
-user: "I want to add Polygon.io as a new data vendor option for core_stock_apis alongside yfinance and alpha_vantage."
-assistant: "I'll use the senior-python-trading-developer agent to design and implement the Polygon.io integration following the project's existing vendor abstraction patterns."
-<commentary>
-The request involves extending the TradingAgents vendor system with a new broker/data API. This is squarely within the agent's expertise in trading APIs and the project's specific architecture.
-</commentary>
-</example> <example>
-Context: A quant trader has a mean-reversion strategy idea and wants it coded up as a backtestable module.
-user: "Can you implement a pairs trading strategy using cointegration? I want to use the Engle-Granger two-step method and then trade the spread with z-score signals."
-assistant: "I'll use the senior-python-trading-developer agent to implement the pairs trading strategy with proper cointegration testing, spread calculation, and signal generation."
-<commentary>
-This is a quantitative strategy implementation request requiring deep knowledge of statistical arbitrage, statsmodels, and backtesting best practices.
-</commentary>
-</example> <example>
-Context: The team wants a code review of a newly written risk manager component.
-user: "Can you review the risk debate logic I just added to the risk manager agent? I want to make sure position sizing and stop-loss logic are sound."
-assistant: "I'll use the senior-python-trading-developer agent to review the risk management code for correctness, safety, and alignment with the project's patterns."
-<commentary>
-Risk management code review for a trading system requires specialized domain knowledge of position sizing, drawdown controls, and trading-specific pitfalls.
-</commentary>
-</example> <example>
-Context: Developer needs to add real-time Binance WebSocket feed support.
-user: "How do I stream live BTC/USDT order book updates from Binance into our system without blocking the main thread?"
-assistant: "I'll use the senior-python-trading-developer agent to design an async WebSocket integration for the Binance order book feed."
-<commentary>
-Live crypto data streaming requires expertise in both the Binance API and async Python patterns critical for low-latency trading systems.
-</commentary>
-</example>
-model: inherit
-color: blue
---
-
-You are a Senior Python Engineer with deep expertise in algorithmic trading, quantitative finance, and production trading platform development. You have 12+ years of experience building systems ranging from retail brokerage integrations to institutional execution infrastructure. You understand both the engineering precision required to ship reliable code and the domain nuance required to model markets correctly.
-
-Your work on this project centers on the TradingAgents framework: a LangGraph-based multi-agent system where specialized analyst agents (market, social, news, fundamentals) feed into debate-style investment and risk decision pipelines. The framework uses an abstract data vendor layer (`data_vendors` config key) to swap between providers like yfinance and Alpha Vantage. Agents are defined in `tradingagents/agents/`, graph orchestration lives in `tradingagents/graph/`, and data access is routed through `tradingagents/agents/utils/agent_utils.py` abstract tool methods.
-
-## Core Responsibilities
-
-1. Implement trading strategies, indicators, and signal generators as clean, testable Python modules.
-2. Integrate broker and market data APIs into the existing vendor abstraction layer.
-3. Review trading code for correctness, risk safety, and production readiness.
-4. Translate a trader's natural-language strategy description into precise, backtestable Python.
-5. Design and extend the multi-agent graph architecture when new analyst types or decision nodes are needed.
-6. Enforce engineering standards that make trading code auditable, debuggable, and maintainable.
-
-## Engineering Standards
-
-**Python Style**
- Follow PEP 8 strictly. Use `black`-compatible formatting (88-char line limit).
- All public functions and classes must have Google-style docstrings including `Args`, `Returns`, and `Raises` sections.
- Use full type annotations everywhere: function signatures, class attributes, local variables where it aids readability.
- Prefer `pathlib.Path` over `os.path` for filesystem operations, consistent with the project's existing usage.
- Use `dataclasses` or `TypedDict` for structured data rather than plain dicts when the schema is known.
-
-**Imports**
- Group imports: stdlib, third-party, local — separated by blank lines.
- Never use wildcard imports (`from module import *`) except where the existing codebase already does so (e.g., `from tradingagents.agents import *`).
- Prefer explicit imports to make dependencies traceable during audits.
-
-**Error Handling**
- Wrap all external API calls (broker APIs, market data fetches) in try/except with specific exception types.
- Log errors with structured context (ticker, timestamp, operation) rather than bare `print` statements. Use Python's `logging` module.
- Never silently swallow exceptions in trading logic. A missed exception in an order submission is a real financial risk.
-
-**Testing**
- Write `pytest`-compatible unit tests for all new modules. Use `pytest-mock` for mocking external API calls.
- Separate pure calculation logic (indicator math, signal generation) from I/O so it is easily unit-tested.
- Include at least one edge-case test: empty data, single-row DataFrames, NaN-heavy series.
-
-## Trading Domain Standards
-
-**Data Handling**
- Always validate that OHLCV data is sorted ascending by timestamp before any calculation.
- Detect and handle forward-looking bias: never use future data in signal computation. When working with pandas, use `.shift()` correctly and be explicit about alignment.
- Normalize timezone handling: convert all timestamps to UTC at ingestion; store and compare in UTC.
- For the TradingAgents vendor abstraction, new data sources must implement the same return schema as existing tools in `agent_utils.py` (typically a dict or pandas DataFrame matching the established columns).
-
-**Risk Controls**
- Every order-generation function must accept and enforce a `max_position_size` parameter.
- Position sizing logic must be separate from signal logic — never hardcode notional sizes in strategy code.
- Include pre-trade checks: available capital, existing exposure, daily loss limits. Make these explicit parameters, not magic numbers.
- Stop-loss and take-profit levels must be validated to be on the correct side of the entry price before submission.
-
-**Backtesting**
- Clearly distinguish between vectorized backtesting (VectorBT, pandas-based) and event-driven backtesting (Backtrader, Zipline). Use vectorized for rapid signal research; use event-driven for realistic execution simulation.
- Account for transaction costs, slippage, and bid-ask spread in every backtest. If the user does not specify, default to a conservative estimate (0.05% per side for equities, 0.1% for crypto).
- Warn explicitly if backtest results show Sharpe > 3 or annualized returns > 100% — these almost always indicate look-ahead bias or overfitting.
- Do not use `pandas.DataFrame.resample` with `label='right'` on OHLCV data without explaining the survivorship/look-ahead implications.
-
-**Live Trading Considerations**
- Clearly separate code paths for paper trading and live trading. Use a `dry_run: bool` flag pattern.
- All order submissions must be idempotent where the API supports client order IDs.
- Rate-limit API calls explicitly. Use `time.sleep` or `asyncio.sleep` with documented rate limit sources.
- For async integrations (WebSocket feeds, async broker clients), use `asyncio` with proper cancellation handling — never use threading for new code unless the library forces it.
-
-## Methodology: Translating Trader Requirements to Code
-
-When a trader describes a strategy in natural language, follow this process:
-
-1. **Restate the strategy** in precise mathematical terms before writing any code. Confirm the entry condition, exit condition, position sizing rule, and risk limit.
-2. **Identify the required data inputs**: which price series, which timeframe, which fundamental or alternative data.
-3. **Map to the TradingAgents data layer**: identify which existing `agent_utils` tools provide this data, or specify what new tool is needed.
-4. **Design the module interface first**: define function signatures and types before implementing the body.
-5. **Implement in layers**: data fetching → indicator calculation → signal generation → position sizing → order construction. Keep each layer independently testable.
-6. **Add guardrails**: parameter validation at the top of each function, sensible defaults, clear docstrings for every parameter.
-
-## Output Format
-
-**For new code modules**, always provide:
- Full file path relative to the project root (e.g., `tradingagents/strategies/pairs_trading.py`).
- Complete, runnable code — not pseudocode or skeletons unless the user explicitly asks for a design sketch.
- A brief usage example in a docstring or `if __name__ == "__main__"` block.
- A note on where to hook the module into the existing graph or config if applicable.
-
-**For code reviews**, structure feedback as:
- **Critical**: Issues that could cause incorrect trades, financial loss, or data corruption. Must be fixed before production.
- **Major**: Bugs or design problems that will cause failures under realistic conditions.
- **Minor**: Style, naming, or efficiency issues that reduce maintainability.
- **Suggestions**: Optional improvements, alternative approaches, or library recommendations.
-
-**For API integrations**, always include:
- Authentication setup with environment variable conventions consistent with the project (check existing `.env` patterns).
- The exact return schema the tool function will produce, showing column names and dtypes for DataFrames.
- A note on the provider's rate limits and how the implementation respects them.
-
-## Domain Knowledge Reference
-
-**Key libraries and their roles in this project:**
- `langgraph` / `langchain`: agent graph orchestration — do not bypass the established `ToolNode` pattern for new tools.
- `yfinance`: primary free market data source; use `yf.Ticker(ticker).history(period, interval)` pattern.
- `pandas`: core data manipulation; always check `.empty` before operating on fetched DataFrames.
- `numpy`: numerical computation; prefer vectorized operations over row-wise loops for performance.
- `statsmodels`: time series econometrics (ADF test, ARIMA, cointegration).
- `scikit-learn`: ML pipeline construction; always use `Pipeline` to prevent data leakage in feature scaling.
- `TA-Lib` / `pandas-ta`: technical indicators; when both are available, prefer `pandas-ta` for pure-Python portability.
-
-**Order types to know:**
- Market, Limit, Stop-Market, Stop-Limit, Trailing Stop, OCO (One-Cancels-Other), Bracket orders.
- Always ask which order types the target broker API supports before designing execution logic.
-
-**Greeks (for options work):**
- Delta, Gamma, Theta, Vega, Rho. Use `mibian` or `py_vollib` for Black-Scholes calculations. Warn when applying BSM to American options.
-
-## Edge Cases and Escalation
-
- If a request involves submitting real orders to a live broker, explicitly flag all code as requiring human review before execution and recommend paper trading validation first.
- If asked to implement a strategy that structurally cannot be backtested without look-ahead bias (e.g., uses end-of-day prices to generate intraday signals), state this clearly and propose a corrected formulation.
- If a requested third-party library is not already in the project's dependencies, name it, provide the `pip install` command, and note it should be added to `pyproject.toml` under `[project.dependencies]`.
- If the user's requirement is ambiguous about timeframe, frequency, or asset class, ask one focused clarifying question before writing code. Do not guess on parameters that directly affect trading logic.
- For any cryptographic key or API secret handling, always recommend environment variables and never suggest hardcoding credentials, even in examples.
--- a/docs/agent/CURRENT_STATE.md
+++ b/docs/agent/CURRENT_STATE.md
@ -0,0 +1,17 @@
+# Current Milestone
+
+Scanner pipeline is feature-complete and running end-to-end. Focus shifts to quality improvements and pipeline command implementation.
+
+# Recent Progress
+
+- End-to-end scanner pipeline operational (`python -m cli.main scan --date YYYY-MM-DD`)
+- All 38 tests passing (14 original + 9 scanner fallback + 15 env override)
+- Environment variable config overrides merged (PR #9)
+- Thread-safe rate limiter for Alpha Vantage implemented
+- Vendor fallback (AV -> yfinance) broadened to catch `AlphaVantageError`, `ConnectionError`, `TimeoutError`
+
+# Active Blockers
+
+- Industry Deep Dive (Phase 2) report quality is sparse — LLM may not be calling tools effectively
+- Macro Synthesis JSON parsing fragile — DeepSeek R1 sometimes wraps output in markdown code blocks
+- `pipeline` CLI command (scan -> filter -> per-ticker deep dive) not yet implemented
--- a/docs/agent/decisions/.gitkeep
+++ b/docs/agent/decisions/.gitkeep
--- a/docs/agent/decisions/001-hybrid-llm-setup.md
+++ b/docs/agent/decisions/001-hybrid-llm-setup.md
@ -0,0 +1,30 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [llm, infrastructure, ollama, openrouter]
+related_files: [tradingagents/default_config.py]
+---
+
+## Context
+
+Need cost-effective LLM setup for scanner pipeline with different complexity tiers.
+
+## The Decision
+
+Use hybrid approach:
+- **quick_think + mid_think**: `qwen3.5:27b` via Ollama at `http://192.168.50.76:11434` (local, free)
+- **deep_think**: `deepseek/deepseek-r1-0528` via OpenRouter (cloud, paid)
+
+Config location: `tradingagents/default_config.py` — per-tier `_llm_provider` and `_backend_url` keys.
+
+## Constraints
+
+- Each tier must have its own `{tier}_llm_provider` set explicitly.
+- Top-level `llm_provider` and `backend_url` must always exist as fallbacks.
+
+## Actionable Rules
+
+- Never hardcode `localhost:11434` for Ollama — always use configured `base_url`.
+- Per-tier providers fall back to top-level `llm_provider` when `None`.
--- a/docs/agent/decisions/002-data-vendor-fallback.md
+++ b/docs/agent/decisions/002-data-vendor-fallback.md
@ -0,0 +1,28 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [data, alpha-vantage, yfinance, fallback]
+related_files: [tradingagents/dataflows/interface.py, tradingagents/dataflows/alpha_vantage_scanner.py, tradingagents/dataflows/yfinance_scanner.py]
+---
+
+## Context
+
+Alpha Vantage free/demo key doesn't support ETF symbols and has strict rate limits. Need reliable data for scanner.
+
+## The Decision
+
+- `route_to_vendor()` catches `AlphaVantageError` (base class) plus `ConnectionError` and `TimeoutError` to trigger fallback.
+- AV scanner functions raise `AlphaVantageError` when ALL queries fail (not silently embedding errors in output strings).
+- yfinance is the fallback vendor and uses SPDR ETF proxies for sector performance instead of broken `Sector.overview`.
+
+## Constraints
+
+- Functions inside `route_to_vendor` must RAISE on failure, not embed errors in return values.
+- Fallback catch must include `(AlphaVantageError, ConnectionError, TimeoutError)`, not just `RateLimitError`.
+
+## Actionable Rules
+
+- Any new data vendor function used with `route_to_vendor` must raise on total failure.
+- Test both the primary and fallback paths when adding new vendor functions.
--- a/docs/agent/decisions/003-yfinance-etf-proxies.md
+++ b/docs/agent/decisions/003-yfinance-etf-proxies.md
@ -0,0 +1,33 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [data, yfinance, sector-performance]
+related_files: [tradingagents/dataflows/yfinance_scanner.py]
+---
+
+## Context
+
+`yfinance.Sector("technology").overview` returns only metadata (companies_count, market_cap, etc.) — no performance data (oneDay, oneWeek, etc.).
+
+## The Decision
+
+Use SPDR sector ETFs as proxies:
+```python
+sector_etfs = {
+    "Technology": "XLK", "Healthcare": "XLV", "Financials": "XLF",
+    "Energy": "XLE", "Consumer Discretionary": "XLY", ...
+}
+```
+Download 6 months of history via `yf.download()` and compute 1-day, 1-week, 1-month, YTD percentage changes from closing prices.
+
+## Constraints
+
+- `yfinance.Sector.overview` has NO performance data — do not attempt to use it.
+- `top_companies` has ticker as INDEX, not column. Always use `.iterrows()`.
+
+## Actionable Rules
+
+- Always test yfinance APIs interactively before writing agent code.
+- Always inspect DataFrame structure with `.head()`, `.columns`, and `.index`.
--- a/docs/agent/decisions/004-inline-tool-execution.md
+++ b/docs/agent/decisions/004-inline-tool-execution.md
@ -0,0 +1,31 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [agents, tools, langgraph, scanner]
+related_files: [tradingagents/agents/utils/tool_runner.py]
+---
+
+## Context
+
+The existing trading graph uses separate `ToolNode` graph nodes for tool execution (agent -> tool_node -> agent routing loop). Scanner agents are simpler single-pass nodes — no ToolNode in the graph. When the LLM returned tool_calls, nobody executed them, resulting in empty reports.
+
+## The Decision
+
+Created `tradingagents/agents/utils/tool_runner.py` with `run_tool_loop()` that runs an inline tool execution loop within each scanner agent node:
+1. Invoke chain
+2. If tool_calls present -> execute tools -> append ToolMessages -> re-invoke
+3. Repeat up to `MAX_TOOL_ROUNDS=5` until LLM produces text response
+
+Alternative considered: Adding ToolNode + conditional routing to scanner_setup.py (like trading graph). Rejected — too complex for the fan-out/fan-in pattern.
+
+## Constraints
+
+- Trading graph: uses `ToolNode` in graph (do not change).
+- Scanner agents: use `run_tool_loop()` inline.
+
+## Actionable Rules
+
+- When an LLM has `bind_tools`, there MUST be a tool execution mechanism — either graph-level `ToolNode` or inline `run_tool_loop()`.
+- Always verify the tool execution path exists before marking an agent as complete.
--- a/docs/agent/decisions/005-langgraph-parallel-reducers.md
+++ b/docs/agent/decisions/005-langgraph-parallel-reducers.md
@ -0,0 +1,25 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [langgraph, state, parallel, scanner]
+related_files: [tradingagents/agents/utils/scanner_states.py]
+---
+
+## Context
+
+Phase 1 runs 3 scanners in parallel. All write to shared state fields (`sender`, etc.). LangGraph requires reducers for concurrent writes — otherwise raises `INVALID_CONCURRENT_GRAPH_UPDATE`.
+
+## The Decision
+
+Added `_last_value` reducer to all `ScannerState` fields via `Annotated[str, _last_value]`.
+
+## Constraints
+
+- Any LangGraph state field written by parallel nodes MUST have a reducer.
+
+## Actionable Rules
+
+- When adding new fields to `ScannerState`, always use `Annotated[type, reducer_fn]`.
+- Test parallel execution paths to verify no concurrent write errors.
--- a/docs/agent/decisions/006-env-var-config-overrides.md
+++ b/docs/agent/decisions/006-env-var-config-overrides.md
@ -0,0 +1,31 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [config, env-vars, dotenv]
+related_files: [tradingagents/default_config.py, .env.example, pyproject.toml]
+---
+
+## Context
+
+`DEFAULT_CONFIG` hardcoded all values. Users had to edit `default_config.py` to change any setting. The `load_dotenv()` call in `cli/main.py` ran *after* `DEFAULT_CONFIG` was already evaluated at import time, so env vars had no effect.
+
+## The Decision
+
+1. **Module-level `.env` loading**: `default_config.py` calls `load_dotenv()` at the top of the module, before `DEFAULT_CONFIG` is evaluated.
+2. **`_env()` / `_env_int()` helpers**: Read `TRADINGAGENTS_<KEY>` from environment. Return the hardcoded default when the env var is unset or empty.
+3. **Restored top-level keys**: `llm_provider` (default: `"openai"`) and `backend_url` (default: `"https://api.openai.com/v1"`) restored as env-overridable keys.
+4. **All config keys overridable**: `TRADINGAGENTS_` prefix + uppercase config key.
+5. **Explicit dependency**: Added `python-dotenv>=1.0.0` to `pyproject.toml`.
+
+## Constraints
+
+- `llm_provider` and `backend_url` must always exist at top level — `scanner_graph.py` and `trading_graph.py` use them as fallbacks.
+- Empty or unset vars preserve the hardcoded default. `None`-default fields stay `None` when unset.
+
+## Actionable Rules
+
+- New config keys must follow the `TRADINGAGENTS_<UPPERCASE_KEY>` pattern.
+- `load_dotenv()` runs at module level in `default_config.py` — import-order-independent.
+- Always check actual env var values when debugging auth issues.
--- a/docs/agent/decisions/007-thread-safe-rate-limiter.md
+++ b/docs/agent/decisions/007-thread-safe-rate-limiter.md
@ -0,0 +1,36 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [rate-limiting, alpha-vantage, threading]
+related_files: [tradingagents/dataflows/alpha_vantage_common.py]
+---
+
+## Context
+
+The Alpha Vantage rate limiter initially slept *inside* the lock when re-checking the rate window. This blocked all other threads from making API requests during the sleep period, serializing all AV calls.
+
+## The Decision
+
+Two-phase rate limiting:
+1. Acquire lock, check timestamps, release lock, sleep if needed.
+2. Re-check loop: acquire lock, re-check. If still over limit, release lock *before* sleeping, then retry. Only append timestamp and break when under the limit.
+
+```python
+while True:
+    with _rate_lock:
+        if len(_call_timestamps) < _RATE_LIMIT:
+            _call_timestamps.append(_time.time())
+            break
+        extra_sleep = 60 - (now - _call_timestamps[0]) + 0.1
+    _time.sleep(extra_sleep)  # outside lock
+```
+
+## Constraints
+
+- Lock must never be held during `sleep()` or IO operations.
+
+## Actionable Rules
+
+- Never hold a lock during a sleep/IO operation. Always release, sleep, re-acquire.
--- a/docs/agent/decisions/008-lessons-learned.md
+++ b/docs/agent/decisions/008-lessons-learned.md
@ -0,0 +1,50 @@
+---
+type: decision
+status: active
+date: 2026-03-17
+agent_author: "claude"
+tags: [lessons, mistakes, patterns]
+related_files: []
+---
+
+## Context
+
+Documented bugs and wrong assumptions encountered during scanner pipeline development. These lessons prevent repeating the same mistakes.
+
+## The Decision
+
+Codify all lessons learned as actionable rules for future development.
+
+## Constraints
+
+None — these are universal rules for this project.
+
+## Actionable Rules
+
+### Tool Execution
+- When an LLM has `bind_tools`, there MUST be a tool execution mechanism — either graph-level `ToolNode` routing or inline `run_tool_loop()`. Always verify the tool execution path exists.
+
+### yfinance DataFrames
+- `top_companies` has ticker as INDEX, not column. Always use `.iterrows()` or check `.index`.
+- `Sector.overview` returns only metadata — no performance data. Use ETF proxies.
+- Always inspect DataFrame structure with `.head()`, `.columns`, `.index` before writing access code.
+
+### Vendor Fallback
+- Functions inside `route_to_vendor` must RAISE on failure, not embed errors in return values.
+- Catch `(AlphaVantageError, ConnectionError, TimeoutError)`, not just specific subtypes.
+
+### LangGraph
+- Any state field written by parallel nodes MUST have a reducer (`Annotated[str, reducer_fn]`).
+
+### Configuration
+- Never hardcode URLs. Always use configured values with sensible defaults.
+- `llm_provider` and `backend_url` must always exist at top level as fallbacks.
+- When refactoring config, grep for all references before removing keys.
+
+### Environment
+- When creating `.env` files, always verify they have real values, not placeholders.
+- When debugging auth errors, first check `os.environ.get('KEY')` to see what's actually loaded.
+- `load_dotenv()` runs at module level in `default_config.py` — import-order-independent.
+
+### Threading
+- Never hold a lock during `sleep()` or IO. Release, sleep, re-acquire.
--- a/docs/agent/logs/.gitkeep
+++ b/docs/agent/logs/.gitkeep
--- a/docs/agent/templates/agent-decision-template.md
+++ b/docs/agent/templates/agent-decision-template.md
@ -0,0 +1,16 @@
+---
+type: decision | plan
+status: draft | active | superseded
+date: YYYY-MM-DD
+agent_author: ""
+tags: []
+related_files: []
+---
+
+## Context
+
+## The Decision / Plan
+
+## Constraints
+
+## Actionable Rules
--- a/docs/agent/templates/commit-template.txt
+++ b/docs/agent/templates/commit-template.txt
@ -0,0 +1,6 @@
+<type>(<scope>): <short summary>
+
+<Detailed explanation of what changed and why>
+
+Agent-Ref: [Path to docs/agent/plans/ or docs/agent/decisions/ file]
+State-Updated: [Yes/No]
--- a/docs/agent/templates/pr-template.md
+++ b/docs/agent/templates/pr-template.md
@ -0,0 +1,9 @@
+# Description
+
+[Summary of the changes]
+
+# Agentic Context
+
+- **Plan Followed:** [Link to docs/agent/plans/...md]
+- **Decisions Implemented:** [Link to docs/agent/decisions/...md]
+- **State File Updated:** [ ] Yes
--- a/plans/execution_plan_global_macro_analyzer.md
+++ b/plans/execution_plan_global_macro_analyzer.md
@ -1,157 +0,0 @@
-# Global Macro Analyzer Implementation Plan
-
-## Execution Plan for TradingAgents Framework
-
-### Overview
-
-This plan outlines the implementation of a global macro analyzer (market-wide scanner) for the TradingAgents framework. The scanner will discover interesting stocks before running deep per-ticker analysis by scanning global news, market movers, sector performance, and outputting a top-10 stock watchlist.
-
-### Architecture
-
-A separate LangGraph with its own state, agents, and CLI command — sharing the existing LLM infrastructure, tool patterns, and data layer.
-
-```
-START ──┬── Geopolitical Scanner (quick_think) ──┐
-        ├── Market Movers Scanner (quick_think) ──┼── Industry Deep Dive (mid_think) ── Macro Synthesis (deep_think) ── END
-        └── Sector Scanner (quick_think) ─────────┘
-```
-
-### Implementation Steps
-
-#### 1. Fix Infrastructure Issues
-
- [ ] Verify pyproject.toml has correct [build-system] and [project.scripts] sections
- [ ] Check for and remove any stray scanner_tools.py files outside tradingagents/
-
-#### 2. Create Data Layer
-
- [ ] Create tradingagents/dataflows/yfinance_scanner.py with required functions:
-  - get_market_movers_yfinance(category) — uses yf.Screener() for day_gainers, day_losers, most_actives
-  - get_market_indices_yfinance() — fetches ^GSPC, ^DJI, ^IXIC, ^VIX, ^RUT daily data
-  - get_sector_performance_yfinance() — uses yf.Sector() for all 11 GICS sectors
-  - get_industry_performance_yfinance(sector_key) — uses yf.Industry() for drill-down
-  - get_topic_news_yfinance(topic, limit) — uses yf.Search(query=topic)
- [ ] Create tradingagents/dataflows/alpha_vantage_scanner.py with fallback function:
-  - get_market_movers_alpha_vantage(category) — uses TOP_GAINERS_LOSERS endpoint
-
-#### 3. Create Tools
-
- [ ] Create tradingagents/agents/utils/scanner_tools.py with @tool decorated wrappers (same pattern as news_data_tools.py):
-  - get_market_movers — top gainers, losers, most active
-  - get_market_indices — major index values and daily changes
-  - get_sector_performance — sector-level performance overview
-  - get_industry_performance — industry-level drill-down within a sector
-  - get_topic_news — search news by arbitrary topic
-  Each function should call route_to_vendor(method, ...) instead of the yfinance functions directly.
-
-#### 4. Update Supporting Files
-
- [ ] Update tradingagents/agents/utils/agent_utils.py to import/re-export scanner tools
- [ ] Update tradingagents/dataflows/interface.py to add scanner_data category to TOOLS_CATEGORIES and VENDOR_METHODS
-
-#### 5. Create State
-
- [ ] Create tradingagents/agents/utils/scanner_states.py with ScannerState class:
-
-    ```python
-    class ScannerState(MessagesState):
-        scan_date: str
-        geopolitical_report: str          # Phase 1
-        market_movers_report: str         # Phase 1
-        sector_performance_report: str    # Phase 1
-        industry_deep_dive_report: str    # Phase 2
-        macro_scan_summary: str           # Phase 3 (final output)
-    ```
-
-#### 6. Create Agents
-
- [ ] Create tradingagents/agents/scanner/__init__.py (exports all factories)
- [ ] Create tradingagents/agents/scanner/geopolitical_scanner.py:
-  - create_geopolitical_scanner(llm)
-  - quick_think LLM tier
-  - Tools: get_global_news, get_topic_news
-  - Output Field: geopolitical_report
- [ ] Create tradingagents/agents/scanner/market_movers_scanner.py:
-  - create_market_movers_scanner(llm)
-  - quick_think LLM tier
-  - Tools: get_market_movers, get_market_indices
-  - Output Field: market_movers_report
- [ ] Create tradingagents/agents/scanner/sector_scanner.py:
-  - create_sector_scanner(llm)
-  - quick_think LLM tier
-  - Tools: get_sector_performance, get_industry_performance
-  - Output Field: sector_performance_report
- [ ] Create tradingagents/agents/scanner/industry_deep_dive.py:
-  - create_industry_deep_dive_agent(llm)
-  - mid_think LLM tier
-  - Tools: get_industry_performance, get_topic_news
-  - Output Field: industry_deep_dive_report
- [ ] Create tradingagents/agents/scanner/synthesis_agent.py:
-  - create_macro_synthesis_agent(llm)
-  - deep_think LLM tier
-  - Tools: none (pure LLM)
-  - Output Field: macro_scan_summary
-
-#### 7. Create Graph Components
-
- [ ] Create tradingagents/graph/scanner_conditional_logic.py:
-  - ScannerConditionalLogic class
-  - Functions: should_continue_geopolitical, should_continue_movers, should_continue_sector, should_continue_industry
-  - Tool-call check pattern (same as conditional_logic.py)
- [ ] Create tradingagents/graph/scanner_setup.py:
-  - ScannerGraphSetup class
-  - Registers nodes/edges
-  - Fan-out from START to 3 scanners
-  - Fan-in to Industry Deep Dive
-  - Then Synthesis → END
- [ ] Create tradingagents/graph/scanner_graph.py:
-  - MacroScannerGraph class (mirrors TradingAgentsGraph)
-  - Init LLMs, build tool nodes, compile graph
-  - Expose scan(date) method
-  - No memory/reflection needed
-
-#### 8. Modify CLI
-
- [ ] Add scan command to cli/main.py:
-  - @app.command() def scan():
-  - Asks for: scan date (default: today), LLM provider config (reuse existing helpers)
-  - Does NOT ask for ticker (whole-market scan)
-  - Instantiates MacroScannerGraph, calls graph.scan(date)
-  - Displays results with Rich: panels for each report section, numbered table for top 10 stocks
-  - Saves report to results/macro_scan/{date}/
-
-#### 9. Update Config
-
- [ ] Add "scanner_data": "yfinance" to data_vendors in tradingagents/default_config.py
-
-#### 10. Verify Implementation
-
- [ ] Test with commands:
-
-    ```bash
-    python -c "from tradingagents.agents.utils.scanner_tools import get_market_movers"
-    python -c "from tradingagents.graph.scanner_graph import MacroScannerGraph"
-    tradingagents scan
-    ```
-
-### Data Source Decision
-
- __Primary__: yfinance (has Screener(), Sector(), Industry(), index tickers — comprehensive)
- __Fallback__: Alpha Vantage TOP_GAINERS_LOSERS for get_market_movers tool only
- __Reason__: yfinance has broader screener/sector coverage; Alpha Vantage free tier limited to 25 requests/day
-
-### Key Design Decisions
-
- Separate graph — scanner doesn't modify the existing trading analysis pipeline
- No debate phase — this is an informational scan, not a trading decision
- No memory/reflection — point-in-time snapshot; can be added later
- Parallel phase 1 — 3 scanners run concurrently for speed; Industry Deep Dive cross-references all outputs
- yfinance primary, AV fallback — yfinance has broader screener/sector coverage; Alpha Vantage only for market movers fallback
-
-### Verification Criteria
-
-1. All created files are in correct locations with proper content
-2. Scanner tools can be imported and used correctly
-3. Graph compiles and executes without errors
-4. CLI scan command works and produces expected output
-5. Configuration properly routes scanner data to yfinance
--- a/tradingagents/llm_clients/TODO.md
+++ b/tradingagents/llm_clients/TODO.md
@ -1,24 +0,0 @@
-# LLM Clients - Consistency Improvements
-
-## Issues to Fix
-
-### 1. `validate_model()` is never called
- Add validation call in `get_llm()` with warning (not error) for unknown models
-
-### 2. Inconsistent parameter handling
-| Client | API Key Param | Special Params |
-|--------|---------------|----------------|
-| OpenAI | `api_key` | `reasoning_effort` |
-| Anthropic | `api_key` | `thinking_config` → `thinking` |
-| Google | `google_api_key` | `thinking_budget` |
-
-**Fix:** Standardize with unified `api_key` that maps to provider-specific keys
-
-### 3. `base_url` accepted but ignored
- `AnthropicClient`: accepts `base_url` but never uses it
- `GoogleClient`: accepts `base_url` but never uses it (correct - Google doesn't support it)
-
-**Fix:** Remove unused `base_url` from clients that don't support it
-
-### 4. Update validators.py with models from CLI
- Sync `VALID_MODELS` dict with CLI model options after Feature 2 is complete