# TradingAgents Framework - Project Knowledge

## Project Overview

Multi-agent LLM trading framework using LangGraph for financial analysis and decision making.

## Development Environment

**Conda Environment**: `tradingagents`

Before starting any development work, activate the conda environment:

```bash
conda activate tradingagents
```

## Architecture

- **Agent Factory Pattern**: `create_X(llm)` → closure pattern
- **3-Tier LLM System**:
  - Quick thinking (fast responses)
  - Mid thinking (balanced analysis)
  - Deep thinking (complex reasoning)
- **Data Vendor Routing**: yfinance (primary), Alpha Vantage (fallback)
- **Graph-Based Workflows**: LangGraph for agent coordination

## Key Directories

- `tradingagents/agents/` - Agent implementations
- `tradingagents/graph/` - Workflow graphs and setup
- `tradingagents/dataflows/` - Data access layer
- `tradingagents/portfolio/` - Portfolio models, report stores, store factory
- `cli/` - Command-line interface
- `agent_os/backend/` - FastAPI backend (routes, engine, services)
- `agent_os/frontend/` - React + Chakra UI + ReactFlow dashboard

## Agent Flow (Existing Trading Analysis)

1. Analysts (parallel): Fundamentals, Market, News, Social Media
2. Bull/Bear Debate
3. Research Manager
4. Trader
5. Risk Debate
6. Risk Judge

## Scanner Flow (New Market-Wide Analysis)

```
START ──┬── Geopolitical Scanner (quick_think) ───┐
        ├── Market Movers Scanner (quick_think) ──┼── Industry Deep Dive (mid_think) ── Macro Synthesis (deep_think) ── END
        └── Sector Scanner (quick_think) ─────────┘
```

- Phase 1: Parallel execution of 3 scanners
- Phase 2: Industry Deep Dive cross-references all outputs
- Phase 3: Macro Synthesis produces top-10 watchlist

## Data Vendors

- **yfinance** (primary, free): Screener(), Sector(), Industry(), index tickers
- **Alpha Vantage** (alternative, API key required): TOP_GAINERS_LOSERS endpoint only (fallback for market movers)

## LLM Providers

OpenAI, Anthropic, Google, xAI, OpenRouter, Ollama

## CLI Entry Point

`cli/main.py` with Typer:

- `analyze` (per-ticker analysis)
- `scan` (new, market-wide scan)

## Configuration

`tradingagents/default_config.py`:

- LLM tiers configuration
- Vendor routing
- Debate rounds settings
- All values overridable via `TRADINGAGENTS_<KEY>` env vars (see `.env.example`)

## Patterns to Follow

- Agent creation (trading): `tradingagents/agents/analysts/news_analyst.py`
- Agent creation (scanner): `tradingagents/agents/scanners/geopolitical_scanner.py`
- Tools: `tradingagents/agents/utils/news_data_tools.py`
- Scanner tools: `tradingagents/agents/utils/scanner_tools.py`
- Graph setup (trading): `tradingagents/graph/setup.py`
- Graph setup (scanner): `tradingagents/graph/scanner_setup.py`
- Inline tool loop: `tradingagents/agents/utils/tool_runner.py`

## AgentOS - Storage, Events & Phase Re-run (see ADR 018 for full detail)

### Storage Layout

Reports are scoped by `flow_id` (8-char hex), NOT `run_id` (UUID):

```
reports/daily/{date}/{flow_id}/
  run_meta.json                     ← run metadata persisted on completion
  run_events.jsonl                  ← all WebSocket events, newline-delimited JSON
  {TICKER}/report/                  ← e.g. RIG/report/
    {ts}_complete_report.json
    {ts}_analysts_checkpoint.json   ← written after analysts phase
    {ts}_trader_checkpoint.json     ← written after trader phase
  market/report/                    ← scan output
  portfolio/report/                 ← PM decisions, execution results
```

- **`flow_id`** = stable disk key, shared across all sub-phases of one auto run
- **`run_id`** = ephemeral in-memory UUID (WebSocket endpoint key only)

### Store Factory - Always Use It

```python
from tradingagents.portfolio.store_factory import create_report_store

# Writing: always pass flow_id
writer = create_report_store(flow_id=flow_id)

# Reading / checkpoint lookup: always pass the ORIGINAL flow_id
reader = create_report_store(flow_id=original_flow_id)

# Reading latest (skip-if-exists checks): omit flow_id
reader = create_report_store()
```

**Never** instantiate `ReportStore()` or `MongoReportStore()` directly in engine code.

### Phase Re-run

Node → phase mapping lives in `NODE_TO_PHASE` (langgraph_engine.py):

| Nodes | Phase | Checkpoint loaded |
|-------|-------|-------------------|
| Market/News/Fundamentals/Social Analyst | `analysts` | none |
| Bull/Bear Researcher, Research Manager, Trader | `debate_and_trader` | analysts_checkpoint |
| Aggressive/Conservative/Neutral Analyst, Portfolio Manager | `risk` | trader_checkpoint |

- **Checkpoint lookup requires the original `flow_id`**: pass it through `rerun_params["flow_id"]`
- **Analysts checkpoint**: saved when `any()` analyst report is populated (Social Analyst is optional, so never use `all()`)
- **Selective event filtering**: re-run preserves events from other tickers and earlier phases; only clears nodes in the re-run scope
- **Cascade**: every phase re-run ends with a `run_portfolio()` call to update the PM decision

### WebSocket Event Flow

```
POST /api/run/{type}    → BackgroundTask drives engine → caches events in runs[run_id]
WS /ws/stream/{run_id}  → replays cached events (polling 50ms) → streams new ones
On reconnect (history)  → lazy-loads run_events.jsonl from disk if events == []
Orphaned "running" run with disk events → auto-marked "failed"
```

### MongoDB vs Local Storage

- **Local (default)**: development, single-machine, offline. Set via `TRADINGAGENTS_REPORTS_DIR`.
- **MongoDB**: multi-process, production, reflexion memory. Set `TRADINGAGENTS_MONGO_URI`.
- `DualReportStore` writes to both when Mongo is configured; reads Mongo first, falls back to disk.
- Mongo failures always fall back gracefully; never crash on missing Mongo.

## Critical Patterns (see `docs/agent/decisions/008-lessons-learned.md` for full details)

- **Tool execution**: Trading graph uses `ToolNode` in graph. Scanner agents use `run_tool_loop()` inline. If `bind_tools()` is used, there MUST be a tool execution path.
- **yfinance DataFrames**: `top_companies` has ticker as INDEX, not column. Always check `.index` and `.columns`.
- **yfinance Sector/Industry**: `Sector.overview` has NO performance data. Use ETF proxies for performance.
- **Vendor fallback**: Functions inside `route_to_vendor` must RAISE on failure, not embed errors in return values. Catch `(AlphaVantageError, ConnectionError, TimeoutError)`, not just `RateLimitError`.
- **LangGraph parallel writes**: Any state field written by parallel nodes MUST have a reducer (`Annotated[str, reducer_fn]`).
- **Ollama remote host**: Never hardcode `localhost:11434`. Use configured `base_url`.
- **.env loading**: `load_dotenv()` runs at module level in `default_config.py`, so it is import-order-independent. Check actual env var values when debugging auth.
- **Rate limiter locks**: Never hold a lock during `sleep()` or IO. Release, sleep, re-acquire.
- **LLM policy errors**: `_is_policy_error(exc)` detects 404 from any provider (checks `status_code` attribute or message content). `_build_fallback_config(config)` substitutes per-tier fallback models. Both live in `agent_os/backend/services/langgraph_engine.py`.
- **Config fallback keys**: `llm_provider` and `backend_url` must always exist at top level, because `scanner_graph.py` and `trading_graph.py` use them as fallbacks.
- **Report store writes**: always pass `flow_id` to `create_report_store(flow_id=…)`. Omitting it writes to the flat legacy path and overwrites across runs.
- **Checkpoint lookup on re-run**: pass the original run's `flow_id` (from `run.get("flow_id") or run.get("short_rid") or run["params"]["flow_id"]`). Without it, `_date_root()` falls back to flat layout and finds nothing.
- **Analysts checkpoint condition**: use `any()` not `all()` over analyst keys. Social Analyst is not in the default analysts list, so `sentiment_report` is empty in typical runs.
- **Re-run event filtering**: use `_filter_rerun_events(events, ticker, phase)`; never clear all events on re-run. Clearing all loses scan nodes and other tickers from the graph.

## Agentic Memory (docs/agent/)

Agent workflows use the `docs/agent/` scaffold for structured memory:

- `docs/agent/CURRENT_STATE.md` - Live state tracker (milestone, progress, blockers). Read at session start.
- `docs/agent/decisions/` - Architecture decision records (ADR-style, numbered `001-...`)
- `docs/agent/plans/` - Implementation plans with checkbox progress tracking
- `docs/agent/logs/` - Agent run logs
- `docs/agent/templates/` - Commit, PR, and decision templates

Before starting work, always read `docs/agent/CURRENT_STATE.md`. Before committing, update it.
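
Two of the Critical Patterns rules, the `flow_id` lookup chain and the `any()` checkpoint condition, can be sketched in isolation. This is a minimal illustration, not the engine's code: `resolve_flow_id`, `should_save_analysts_checkpoint`, and `ANALYST_REPORT_KEYS` are hypothetical names, and only `sentiment_report` is a report key named in this document (the other three keys are assumed). Unlike the literal chain above, the sketch returns `None` instead of raising `KeyError` when no `flow_id` is present.

```python
from typing import Any, Mapping, Optional

# Assumed report keys: only `sentiment_report` appears in this document;
# the other three are hypothetical placeholders for the remaining analysts.
ANALYST_REPORT_KEYS = (
    "market_report",
    "news_report",
    "fundamentals_report",
    "sentiment_report",
)


def resolve_flow_id(run: Mapping[str, Any]) -> Optional[str]:
    """Follow the lookup chain: flow_id, then short_rid, then params.flow_id."""
    params = run.get("params") or {}
    return run.get("flow_id") or run.get("short_rid") or params.get("flow_id")


def should_save_analysts_checkpoint(state: Mapping[str, Any]) -> bool:
    """True when at least one analyst report is populated (any(), not all())."""
    return any(state.get(key) for key in ANALYST_REPORT_KEYS)


if __name__ == "__main__":
    run = {"short_rid": "deadbeef", "params": {}}
    print(resolve_flow_id(run))
    # sentiment_report is empty because Social Analyst did not run.
    state = {"market_report": "summary text", "sentiment_report": ""}
    print(should_save_analysts_checkpoint(state))
```

With `all()` in place of `any()`, the second check would fail in a typical run, because the empty `sentiment_report` would block the checkpoint from being written.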

## LLM Configuration

Per-tier provider overrides in `tradingagents/default_config.py`:

- Each tier (`quick_think`, `mid_think`, `deep_think`) can have its own `<tier>_llm_provider` and `<tier>_backend_url`
- Falls back to top-level `llm_provider` and `backend_url` when per-tier values are None
- All config values overridable via `TRADINGAGENTS_<KEY>` env vars
- Keys for LLM providers: `.env` file (e.g., `OPENROUTER_API_KEY`, `ALPHA_VANTAGE_API_KEY`)

### Env Var Override Convention

```env
# Pattern: TRADINGAGENTS_<KEY>=value
TRADINGAGENTS_LLM_PROVIDER=openrouter
TRADINGAGENTS_DEEP_THINK_LLM=deepseek/deepseek-r1-0528
TRADINGAGENTS_MAX_DEBATE_ROUNDS=3
TRADINGAGENTS_VENDOR_SCANNER_DATA=alpha_vantage
```

Empty or unset vars preserve the hardcoded default. `None`-default fields (like `mid_think_llm`) stay `None` when unset, preserving fallback semantics.

### Per-Tier Fallback LLM

When a model returns HTTP 404 (blocked by provider guardrail/policy), the engine auto-detects it via `_is_policy_error()` and retries with a per-tier fallback:

```env
TRADINGAGENTS_QUICK_THINK_FALLBACK_LLM=gpt-5-mini
TRADINGAGENTS_QUICK_THINK_FALLBACK_LLM_PROVIDER=openai
TRADINGAGENTS_MID_THINK_FALLBACK_LLM=gpt-5-mini
TRADINGAGENTS_MID_THINK_FALLBACK_LLM_PROVIDER=openai
TRADINGAGENTS_DEEP_THINK_FALLBACK_LLM=gpt-5.2
TRADINGAGENTS_DEEP_THINK_FALLBACK_LLM_PROVIDER=openai
```

Leave unset to disable auto-retry (pipeline emits a clear actionable error instead).

## Running the Scanner

```bash
conda activate tradingagents
python -m cli.main scan --date 2026-03-17
```

## Running Tests

```bash
conda activate tradingagents
pytest tests/ -v
```
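
The env var override and per-tier fallback rules from the LLM Configuration section can be sketched as follows. This is a rough illustration under stated assumptions, not the contents of `default_config.py`: the helper names `env_override` and `resolve_tier` are hypothetical, the per-tier key names (`quick_think_llm_provider`, `deep_think_backend_url`, and so on) are assumed from the tier names, and the int casting is one plausible way to handle numeric settings such as `max_debate_rounds`.

```python
from typing import Any, Mapping, Optional, Tuple

ENV_PREFIX = "TRADINGAGENTS_"


def env_override(key: str, default: Any, environ: Mapping[str, str]) -> Any:
    """Return the TRADINGAGENTS_<KEY> value, or the default if empty or unset."""
    raw = environ.get(ENV_PREFIX + key.upper(), "")
    if raw.strip() == "":
        # Empty or unset: keep the hardcoded default (including None).
        return default
    # Assumption: integer defaults are cast so numeric settings stay numeric.
    if isinstance(default, int) and not isinstance(default, bool):
        return int(raw)
    return raw


def resolve_tier(config: Mapping[str, Any], tier: str) -> Tuple[str, Optional[str]]:
    """Pick per-tier provider and URL, falling back to the top-level keys on None."""
    provider = config.get(f"{tier}_llm_provider")
    if provider is None:
        provider = config["llm_provider"]
    backend_url = config.get(f"{tier}_backend_url")
    if backend_url is None:
        backend_url = config["backend_url"]
    return provider, backend_url


if __name__ == "__main__":
    env = {"TRADINGAGENTS_MAX_DEBATE_ROUNDS": "3"}
    config = {
        "llm_provider": env_override("llm_provider", "openai", env),
        "backend_url": "https://example.invalid/v1",  # placeholder URL
        "deep_think_llm_provider": "openrouter",
        "deep_think_backend_url": None,
        "max_debate_rounds": env_override("max_debate_rounds", 1, env),
    }
    print(resolve_tier(config, "quick_think"))
    print(resolve_tier(config, "deep_think"))
```

Indexing the top-level keys with `config[...]` rather than `.get()` makes a missing `llm_provider` or `backend_url` fail loudly, which matches the rule that both must always exist at top level.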