diff --git a/docs/proposals/ARCHITECTURE_OVERVIEW.md b/docs/proposals/ARCHITECTURE_OVERVIEW.md new file mode 100644 index 00000000..95d1e6e2 --- /dev/null +++ b/docs/proposals/ARCHITECTURE_OVERVIEW.md @@ -0,0 +1,545 @@ +# TradingAgents Architecture Overview + +> **Purpose:** Reference document mapping the current TradingAgents architecture. +> **Status:** Informational (no changes proposed here) +> **Related:** [RFC_AUTORESEARCH_INTRADAY.md](./RFC_AUTORESEARCH_INTRADAY.md) +> +> This document is submitted as context for the auto-research RFC. It captures +> the current architecture to ground the proposal in existing code. + +## Overview + +TradingAgents is a **multi-agent LLM system** that analyzes stocks using 12 AI agents organized in 4 layers: +1. **Analysis Layer** - 4 analysts gather data using tools +2. **Investment Debate Layer** - Bull vs Bear researchers debate, judge decides +3. **Trading Layer** - Trader creates execution plan +4. **Risk Management Layer** - 3 risk analysts debate, portfolio manager makes final call + +--- + +## Complete System Flow (High Level) + +```mermaid +flowchart TD + USER["User calls ta.propagate('NVDA', '2024-05-10')"] + + subgraph INIT["Initialization"] + MAIN["main.py"] --> CONFIG["default_config.py"] + CONFIG --> GRAPH["TradingAgentsGraph.__init__()"] + GRAPH --> LLM_FACTORY["create_llm_client() - factory.py"] + LLM_FACTORY --> DEEP["deep_thinking_llm"] + LLM_FACTORY --> QUICK["quick_thinking_llm"] + GRAPH --> MEM_INIT["Initialize 5 Memories
bull_memory, bear_memory, trader_memory,
invest_judge_memory, portfolio_manager_memory"] + end + + USER --> PROPAGATOR["Propagator
Creates initial state"] + + subgraph ANALYSTS["Layer 1: Analysis (Sequential)"] + MA["Market Analyst
tools: get_stock_data, get_indicators"] + SA["Social Media Analyst
tools: get_news"] + NA["News Analyst
tools: get_news, get_global_news"] + FA["Fundamentals Analyst
tools: get_fundamentals,
get_balance_sheet,
get_cashflow,
get_income_statement"] + MA --> SA --> NA --> FA + end + + subgraph DEBATE["Layer 2: Investment Debate"] + BULL["Bull Researcher
(BUY advocate + memory)"] + BEAR["Bear Researcher
(SELL advocate + memory)"] + BULL <-->|"max_debate_rounds"| BEAR + JUDGE["Research Manager
(Judge: BUY/SELL/HOLD)"] + BULL --> JUDGE + BEAR --> JUDGE + end + + subgraph TRADE["Layer 3: Trading"] + TRADER["Trader
(Execution strategy + memory)"] + end + + subgraph RISK["Layer 4: Risk Management Debate"] + AGG["Aggressive Analyst
(High risk, high reward)"] + CON["Conservative Analyst
(Low risk, protect assets)"] + NEU["Neutral Analyst
(Balanced approach)"] + AGG <-->|"max_risk_discuss_rounds"| CON + CON <-->|"max_risk_discuss_rounds"| NEU + PM["Portfolio Manager
(Final Judge)"] + AGG --> PM + CON --> PM + NEU --> PM + end + + subgraph OUTPUT["Final Output"] + SP["SignalProcessor
Extracts: BUY/OVERWEIGHT/HOLD/UNDERWEIGHT/SELL"] + end + + PROPAGATOR --> ANALYSTS + FA --> DEBATE + JUDGE --> TRADE + TRADER --> RISK + PM --> SP + + SP --> DECISION["Final Decision Returned to User"] + + style ANALYSTS fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style DEBATE fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style TRADE fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style RISK fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#880e4f + style OUTPUT fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#4a148c +``` + +--- + +## Data Flow: From APIs to Agent Reports + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '20px' }, + 'flowchart': { 'nodeSpacing': 100, 'rankSpacing': 140 } +}}%% +flowchart LR + subgraph EXTERNAL["External Data Sources"] + YF["yfinance API
(Free, no key)"] + AV["Alpha Vantage API
(Needs API key)"] + end + + subgraph DATAFLOWS["tradingagents/dataflows/"] + YF_PY["y_finance.py
get_YFin_data_online()"] + YF_NEWS["yfinance_news.py
get_news_yfinance()
get_global_news_yfinance()"] + AV_STOCK["alpha_vantage_stock.py"] + AV_FUND["alpha_vantage_fundamentals.py"] + AV_IND["alpha_vantage_indicator.py"] + AV_NEWS["alpha_vantage_news.py"] + + ROUTER["interface.py
route_to_vendor()

Decides: yfinance or alpha_vantage?
Auto-fallback on rate limit"] + end + + subgraph TOOLS["tradingagents/agents/utils/ (Tool Layer)"] + T1["core_stock_tools.py
get_stock_data()"] + T2["technical_indicators_tools.py
get_indicators()"] + T3["fundamental_data_tools.py
get_fundamentals()
get_balance_sheet()
get_cashflow()
get_income_statement()"] + T4["news_data_tools.py
get_news()
get_global_news()
get_insider_transactions()"] + end + + subgraph AGENTS["Analyst Agents"] + MA2["Market Analyst"] + SA2["Social Media Analyst"] + NA2["News Analyst"] + FA2["Fundamentals Analyst"] + end + + YF --> YF_PY + YF --> YF_NEWS + AV --> AV_STOCK + AV --> AV_FUND + AV --> AV_IND + AV --> AV_NEWS + + YF_PY --> ROUTER + YF_NEWS --> ROUTER + AV_STOCK --> ROUTER + AV_FUND --> ROUTER + AV_IND --> ROUTER + AV_NEWS --> ROUTER + + ROUTER --> T1 + ROUTER --> T2 + ROUTER --> T3 + ROUTER --> T4 + + T1 --> MA2 + T2 --> MA2 + T4 --> SA2 + T4 --> NA2 + T3 --> FA2 + + style EXTERNAL fill:#ffecb3,stroke:#f9a825,stroke-width:2px,color:#f57f17 + style DATAFLOWS fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style TOOLS fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style AGENTS fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#4a148c +``` + +--- + +## interface.py - The Router (Detailed) + +```mermaid +flowchart TD + CALL["Agent calls a tool
e.g., get_stock_data('NVDA', ...)"] + + ROUTE["route_to_vendor('get_stock_data', *args)"] + + CAT["get_category_for_method()
→ 'core_stock_apis'"] + + VENDOR["get_vendor(category, method)
1. Check tool_vendors config (highest priority)
2. Fall back to data_vendors config
3. Fall back to 'default'"] + + PRIMARY["Try PRIMARY vendor
(e.g., yfinance)"] + + SUCCESS{"Success?"} + + RATE_LIMIT{"Rate Limited?"} + + FALLBACK["Try FALLBACK vendor
(e.g., alpha_vantage)"] + + RETURN["Return data to agent"] + + CALL --> ROUTE --> CAT --> VENDOR --> PRIMARY --> SUCCESS + SUCCESS -->|"Yes"| RETURN + SUCCESS -->|"No"| RATE_LIMIT + RATE_LIMIT -->|"Yes"| FALLBACK + RATE_LIMIT -->|"No (other error)"| ERROR["Raise Error"] + FALLBACK --> RETURN + + style ROUTE fill:#bbdefb,stroke:#0277bd,stroke-width:2px,color:#01579b + style VENDOR fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 +``` + +--- + +## Tool Categories & Vendor Mapping + +```mermaid +%%{init: {'flowchart': {'nodeSpacing': 80, 'rankSpacing': 120}}}%% +flowchart TD + subgraph CATEGORIES["Tool Categories (from config)"] + C1["core_stock_apis"] + C2["technical_indicators"] + C3["fundamental_data"] + C4["news_data"] + end + + subgraph TOOLS_IN_CATS["Tools per Category"] + C1 --> T_STOCK["get_stock_data"] + C2 --> T_IND["get_indicators"] + C3 --> T_FUND["get_fundamentals"] + C3 --> T_BAL["get_balance_sheet"] + C3 --> T_CASH["get_cashflow"] + C3 --> T_INC["get_income_statement"] + C4 --> T_NEWS["get_news"] + C4 --> T_GNEWS["get_global_news"] + C4 --> T_INSIDER["get_insider_transactions"] + end + + subgraph VENDORS["Available Vendor Implementations"] + V_YF["yfinance
(Free, default)"] + V_AV["Alpha Vantage
(API key needed)"] + end + + T_STOCK --> V_YF + T_STOCK --> V_AV + T_IND --> V_YF + T_IND --> V_AV + T_FUND --> V_YF + T_FUND --> V_AV + T_BAL --> V_YF + T_BAL --> V_AV + T_CASH --> V_YF + T_CASH --> V_AV + T_INC --> V_YF + T_INC --> V_AV + T_NEWS --> V_YF + T_NEWS --> V_AV + T_GNEWS --> V_YF + T_GNEWS --> V_AV + T_INSIDER --> V_YF + + style CATEGORIES fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style VENDORS fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 +``` + +--- + +## Agent Detail: Who Has What Tools + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '20px' }, + 'flowchart': { 'nodeSpacing': 100, 'rankSpacing': 50 } +}}%% +flowchart LR + subgraph WITH_TOOLS["Agents WITH Tools (4)"] + MA3["Market Analyst"] + SA3["Social Media Analyst"] + NA3["News Analyst"] + FA3["Fundamentals Analyst"] + end + + subgraph NO_TOOLS["Agents WITHOUT Tools (8) - Pure LLM Reasoning"] + BULL3["Bull Researcher"] + BEAR3["Bear Researcher"] + RM3["Research Manager"] + TR3["Trader"] + AG3["Aggressive Analyst"] + CO3["Conservative Analyst"] + NE3["Neutral Analyst"] + PM3["Portfolio Manager"] + end + + MA3 -->|uses| T_S["get_stock_data
get_indicators"] + SA3 -->|uses| T_N1["get_news"] + NA3 -->|uses| T_N2["get_news
get_global_news"] + FA3 -->|uses| T_F["get_fundamentals
get_balance_sheet
get_cashflow
get_income_statement"] + + BULL3 -->|reads| REPORTS["All 4 Analyst Reports
+ Past Memories"] + BEAR3 -->|reads| REPORTS + RM3 -->|reads| DEBATE_HIST["Debate History"] + TR3 -->|reads| INV_PLAN["Investment Plan"] + AG3 -->|reads| TRADE_PLAN["Trader's Plan"] + CO3 -->|reads| TRADE_PLAN + NE3 -->|reads| TRADE_PLAN + PM3 -->|reads| RISK_HIST["Risk Debate History"] + + style WITH_TOOLS fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style NO_TOOLS fill:#ffecb3,stroke:#f9a825,stroke-width:2px,color:#f57f17 +``` + +--- + +## LangGraph Execution Flow (Detailed) + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '20px' }, + 'flowchart': { 'nodeSpacing': 80, 'rankSpacing': 80 } +}}%% +stateDiagram-v2 + [*] --> Propagator: propagate(ticker, date) + + Propagator --> MarketAnalyst: Initial state created + + state "Analyst Phase" as AP { + MarketAnalyst --> tools_market: Calls tools + tools_market --> MarketAnalyst: Returns data + MarketAnalyst --> MsgClearMarket: Report done + MsgClearMarket --> SocialAnalyst + SocialAnalyst --> tools_social: Calls tools + tools_social --> SocialAnalyst: Returns data + SocialAnalyst --> MsgClearSocial: Report done + MsgClearSocial --> NewsAnalyst + NewsAnalyst --> tools_news: Calls tools + tools_news --> NewsAnalyst: Returns data + NewsAnalyst --> MsgClearNews: Report done + MsgClearNews --> FundAnalyst + FundAnalyst --> tools_fund: Calls tools + tools_fund --> FundAnalyst: Returns data + FundAnalyst --> MsgClearFund: Report done + } + + state "Investment Debate" as ID { + BullResearcher --> BearResearcher: Bull case + BearResearcher --> BullResearcher: Bear counter + note right of BullResearcher: Loops max_debate_rounds times + BearResearcher --> ResearchManager: Debate ends + ResearchManager --> InvestmentPlan: BUY/SELL/HOLD + } + + state "Trading" as TR { + Trader --> TraderPlan: Execution strategy + } + + state "Risk Debate" as RD { + Aggressive --> Conservative: High-risk view + Conservative --> Neutral: Low-risk view + Neutral --> Aggressive: Balanced view + note right of Aggressive: Loops 
max_risk_discuss_rounds times + Neutral --> PortfolioManager: Debate ends + } + + MsgClearFund --> BullResearcher + InvestmentPlan --> Trader + TraderPlan --> Aggressive + PortfolioManager --> SignalProcessor + SignalProcessor --> [*]: BUY/OVERWEIGHT/HOLD/UNDERWEIGHT/SELL +``` + +--- + +## Memory System (BM25 Similarity Search) + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '20px' }, + 'flowchart': { 'nodeSpacing': 100, 'rankSpacing': 120 } +}}%% +flowchart TD + subgraph MEMORIES["5 Memory Instances"] + M1["bull_memory
FinancialSituationMemory"] + M2["bear_memory
FinancialSituationMemory"] + M3["trader_memory
FinancialSituationMemory"] + M4["invest_judge_memory"] + M5["portfolio_manager_memory"] + end + + subgraph WRITE_PATH["Writing to Memory (after trade results)"] + RESULT["Trade returns/losses"] + REFLECT["Reflector
reflection.py"] + REFLECT -->|"What went right/wrong?"| LESSONS["Lessons learned
(situation, recommendation) pairs"] + LESSONS --> M1 + LESSONS --> M2 + LESSONS --> M3 + LESSONS --> M4 + LESSONS --> M5 + end + + subgraph READ_PATH["Reading from Memory (during analysis)"] + CURRENT["Current market situation"] + BM25["BM25Okapi Search
memory.py"] + CURRENT --> BM25 + BM25 -->|"Top N similar past situations"| CONTEXT["Past lessons + recommendations"] + CONTEXT --> AGENTS2["Researchers & Managers
use past experience"] + end + + RESULT --> REFLECT + M1 --> BM25 + M2 --> BM25 + + style MEMORIES fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style WRITE_PATH fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style READ_PATH fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 +``` + +--- + +## LLM Client Architecture + +```mermaid +%%{init: { + 'themeVariables': { + 'fontSize': '18px' + }, + 'flowchart': { + 'nodeSpacing': 80, + 'rankSpacing': 120 + } +}}%% +flowchart TB + + %% Factory Layer + subgraph FACTORY["Factory Layer"] + CF["create_llm_client(provider, model)"] + end + + %% Base Layer + subgraph BASE["Base Class"] + BLC["BaseLLMClient
- get_llm()
- validate_model()
- warn_if_unknown_model()"] + end + + %% Provider Layer + subgraph CLIENTS["Provider Implementations"] + direction LR + OAI["OpenAIClient
(openai, ollama, openrouter, xai)"] + ANTH["AnthropicClient"] + GOOG["GoogleClient"] + end + + %% Flow (clean hierarchy) + CF --> BLC + BLC --> OAI + BLC --> ANTH + BLC --> GOOG + + %% Optional: show routing logic (lighter) + CF -.->|"openai"| OAI + CF -.->|"anthropic"| ANTH + CF -.->|"google"| GOOG + + %% Styles (cleaner contrast) + style FACTORY fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style BASE fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style CLIENTS fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 +``` + +--- + +## Complete File Structure + +``` +TradingAgents/ +├── main.py # Entry point +├── tradingagents/ +│ ├── default_config.py # All default settings +│ │ +│ ├── agents/ +│ │ ├── analysts/ +│ │ │ ├── market_analyst.py # Tools: get_stock_data, get_indicators +│ │ │ ├── social_media_analyst.py # Tools: get_news +│ │ │ ├── news_analyst.py # Tools: get_news, get_global_news +│ │ │ └── fundamentals_analyst.py # Tools: get_fundamentals, balance_sheet, cashflow, income +│ │ │ +│ │ ├── researchers/ +│ │ │ ├── bull_researcher.py # BUY advocate (with memory) +│ │ │ └── bear_researcher.py # SELL advocate (with memory) +│ │ │ +│ │ ├── managers/ +│ │ │ ├── research_manager.py # Judge for Bull/Bear debate +│ │ │ └── portfolio_manager.py # Judge for Risk debate (FINAL decision) +│ │ │ +│ │ ├── trader/ +│ │ │ └── trader.py # Execution strategy +│ │ │ +│ │ ├── risk_mgmt/ +│ │ │ ├── aggressive_debator.py # High risk advocate +│ │ │ ├── conservative_debator.py # Low risk advocate +│ │ │ └── neutral_debator.py # Balanced advocate +│ │ │ +│ │ └── utils/ +│ │ ├── agent_states.py # State definitions (AgentState) +│ │ ├── agent_utils.py # Helper utilities +│ │ ├── memory.py # BM25-based memory system +│ │ ├── core_stock_tools.py # Tool: get_stock_data +│ │ ├── technical_indicators_tools.py # Tool: get_indicators +│ │ ├── fundamental_data_tools.py # Tools: fundamentals, balance sheet, etc. 
+│ │ └── news_data_tools.py # Tools: news, global_news, insider_transactions +│ │ +│ ├── graph/ +│ │ ├── trading_graph.py # Main orchestrator class +│ │ ├── setup.py # LangGraph node/edge definitions +│ │ ├── conditional_logic.py # Flow control (debate rounds, routing) +│ │ ├── propagation.py # State initialization +│ │ ├── reflection.py # Post-trade learning +│ │ └── signal_processing.py # Extract final BUY/SELL/HOLD signal +│ │ +│ ├── dataflows/ +│ │ ├── interface.py # THE ROUTER: routes tools to vendors +│ │ ├── config.py # Data config getter/setter +│ │ ├── utils.py # Utility functions +│ │ ├── y_finance.py # yfinance data fetching +│ │ ├── yfinance_news.py # yfinance news fetching +│ │ ├── alpha_vantage_stock.py # Alpha Vantage stock data +│ │ ├── alpha_vantage_fundamentals.py # Alpha Vantage financials +│ │ ├── alpha_vantage_indicator.py # Alpha Vantage indicators +│ │ ├── alpha_vantage_news.py # Alpha Vantage news +│ │ ├── alpha_vantage_common.py # Shared AV utilities +│ │ └── stockstats_utils.py # Technical indicator calculations +│ │ +│ └── llm_clients/ +│ ├── factory.py # create_llm_client() factory function +│ ├── base_client.py # BaseLLMClient abstract class +│ ├── openai_client.py # OpenAI/Ollama/xAI/OpenRouter +│ ├── anthropic_client.py # Anthropic Claude +│ ├── google_client.py # Google Gemini +│ ├── validators.py # Model name validation +│ └── model_catalog.py # Known model lists +``` + +--- + +## State Object: What Data Flows Between Agents + +```mermaid +flowchart TD + subgraph STATE["AgentState (shared state object)"] + S1["messages: list - LLM conversation history"] + S2["company_of_interest: str - 'NVDA'"] + S3["trade_date: str - '2024-05-10'"] + S4["market_report: str - Market Analyst output"] + S5["sentiment_report: str - Social Analyst output"] + S6["news_report: str - News Analyst output"] + S7["fundamentals_report: str - Fundamentals Analyst output"] + S8["investment_debate_state: dict - Bull/Bear debate history + judge decision"] + 
 S9["investment_plan: str - Research Manager's plan"] + S10["trader_investment_plan: str - Trader's execution plan"] + S11["risk_debate_state: dict - Risk debate history"] + S12["final_trade_decision: str - Portfolio Manager's FINAL output"] + end + + style STATE fill:#f5f5f5 +``` diff --git a/docs/proposals/RFC_AUTORESEARCH_INTRADAY.md b/docs/proposals/RFC_AUTORESEARCH_INTRADAY.md new file mode 100644 index 00000000..3ce078cf --- /dev/null +++ b/docs/proposals/RFC_AUTORESEARCH_INTRADAY.md @@ -0,0 +1,568 @@ +# RFC: Auto-Research Loop for Intraday Prediction + +> **Status:** Draft, seeking feedback +> **Scope:** Additive module (no changes to existing files) +> **Related:** [ARCHITECTURE_OVERVIEW.md](./ARCHITECTURE_OVERVIEW.md) + +## TL;DR + +Add a `tradingagents/autoresearch/` module that runs walk-forward backtesting +on the existing `TradingAgentsGraph`, using the existing `reflect_and_remember()` +memory system to iteratively improve intraday predictions. No existing files +are modified. + +## The Core Idea + +Apply Andrej Karpathy-style iterative research methodology to the existing TradingAgents architecture: + +> **Take historical data → Predict next day → Check if right → Learn from mistakes → Predict again → Repeat** + +This is essentially **walk-forward backtesting with self-improvement**: a proven concept in quantitative finance, now powered by LLM agents instead of traditional ML models. + +--- + +## Design Tradeoffs + +### Strengths of this approach + +| Aspect | Why it works | +|---|---| +| **We already have the agents** | TradingAgents already does single-day analysis. 
We're just running it repeatedly | +| **We already have the data pipeline** | yfinance gives us free historical data — no new APIs needed | +| **Walk-forward is proven** | This is how quant funds actually test strategies | +| **Memory system exists** | `reflect_and_remember()` already learns from past trades | +| **Iterative learning** | Each wrong prediction improves the next one via memory | + +### Risks requiring careful design + +| Risk | Mitigation | +|---|---| +| **LLM API costs** | Each day = ~12 agent calls with LLM. 30 days = 360+ LLM calls. Reuse existing `quick_think_llm` (currently `gpt-5.4-mini` in `default_config.py`) for cheap agents; only use `deep_think_llm` where reasoning depth is required | +| **Overfitting to past data** | Don't tune prompts to specific dates — tune the APPROACH (which tools matter, what indicators to prioritize) | +| **Look-ahead bias** | When predicting day 11, the agents must ONLY see data up to day 10. Never leak future data | +| **Rate limits** | yfinance and Alpha Vantage have limits. Add delays between runs | +| **What "change everything" means** | Don't change model weights (we can't). Change: which analysts to use, debate rounds, indicator selection, prompt emphasis | + +### Key design decision: no same-day retries + +**Alternative considered:** If a prediction is wrong, retry the same day with a different approach. + +**Rejected because:** Retrying the same day with knowledge of the actual outcome introduces look-ahead bias, which invalidates backtesting results. + +**Recommended approach:** Move forward only — let memory accumulate lessons naturally. +1. Predict day 11 → Wrong → **Reflect and store lesson in memory** +2. Move to day 12 with the lesson learned +3. The memory system naturally improves future predictions +4. 
After all 30 days, analyze WHICH types of predictions failed and WHY + +Rationale: +- Retrying the same day with knowledge of the answer is look-ahead bias +- The existing memory system already handles "learning from mistakes" +- The approach (not individual predictions) is what should be tuned + +--- + +## How It Maps to Existing Architecture + +```mermaid +%%{init: {'flowchart': {'nodeSpacing': 80, 'rankSpacing': 100}}}%% +flowchart TD + subgraph EXISTING["What TradingAgents Already Does (Single Day)"] + E1["propagate('NVDA', '2024-05-10')"] + E2["4 Analysts gather data"] + E3["Bull vs Bear debate"] + E4["Trader + Risk debate"] + E5["Final: BUY/OVERWEIGHT/HOLD/UNDERWEIGHT/SELL"] + E1 --> E2 --> E3 --> E4 --> E5 + end + + subgraph NEW["What We're Adding (Auto-Research Loop)"] + N1["train.py
Run propagate() for each day in sequence"] + N2["evaluation.py
Compare prediction vs actual next-day price"] + N3["reflect_and_remember()
Store lessons when wrong"] + N4["model_harness.py
Manage the loop, configs, and results"] + N5["prompt.py
Define what we're looking for"] + N1 --> N2 --> N3 --> N4 + N4 -->|"Next day"| N1 + N5 --> N1 + end + + EXISTING -.->|"We call this repeatedly"| NEW + + style EXISTING fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style NEW fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 +``` + +--- + +## Time Horizon Configuration + +```mermaid +%%{init: {'flowchart': {'nodeSpacing': 80, 'rankSpacing': 120}}}%% +flowchart TD + USER["User selects time horizon"] + + USER -->|"1 day"| D1["Predict: Tomorrow
Training data: Last 1 month (30 days)
Evaluation: Compare with actual tomorrow"] + + USER -->|"1 week"| D2["Predict: Next 5 trading days
Training data: Last 3 months (90 calendar days, about 60 trading days)
Evaluation: Compare each day"] + + USER -->|"1 month"| D3["Predict: Next 20 trading days
Training data: Last 6 months (180 calendar days, about 120 trading days)
Evaluation: Compare each day"] + + subgraph LOGIC["How Training Window Works"] + L1["Take training window of historical data"] + L2["Split: first (N - test_window) days = context
last test_window days = walk-forward test
(test_window is configurable;
default ~20% of N, min 5 days)"] + L3["Predict day by day through test window"] + L4["After test: use full window to predict FUTURE"] + end + + D1 --> LOGIC + D2 --> LOGIC + D3 --> LOGIC + + %% Improved styles + style D1 fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style D2 fill:#fff9c4,stroke:#f9a825,stroke-width:2px,color:#f57f17 + style D3 fill:#ffccbc,stroke:#d84315,stroke-width:2px,color:#bf360c +``` + +--- + +## Complete Auto-Research Pipeline + +```mermaid +%%{init: {'flowchart': {'nodeSpacing': 80, 'rankSpacing': 120}}}%% +flowchart TD + subgraph SETUP["Phase 1: Setup"] + S1["User inputs:
- Ticker (e.g., NVDA)
- Time horizon (1 day / 1 week / 1 month)
- Start date"] + S2["prompt.py
Define analysis focus:
- What indicators matter?
- What news to prioritize?
- Risk tolerance?"] + S3["model_harness.py
Load config + initialize TradingAgentsGraph"] + S1 --> S3 + S2 --> S3 + end + + subgraph TRAIN["Phase 2: Walk-Forward Training (train.py)"] + T1["Load training window
(e.g., 30 days for 1-day horizon, test_window = 10)"] + T2["Day 1-20: Historical context
(agents can see this data)"] + T3["Day 21: First prediction target"] + + T4["Run propagate(ticker, day_20)
Get prediction for day 21"] + T5["evaluation.py:
Compare prediction vs actual day 21"] + + T6{"Prediction
correct?"} + T7["reflect_and_remember(positive_return)
Store: what worked"] + T8["reflect_and_remember(negative_return)
Store: what went wrong + why"] + + T9["Slide window: Add day 21 to context
Now predict day 22"] + + T1 --> T2 --> T3 --> T4 --> T5 --> T6 + T6 -->|"Yes"| T7 + T6 -->|"No"| T8 + T7 --> T9 + T8 --> T9 + T9 -->|"Repeat for days 22-30"| T4 + end + + subgraph EVAL["Phase 3: Evaluation Summary (evaluation.py)"] + EV1["Accuracy: X/10 days predicted correctly"] + EV2["Direction accuracy: Did we get UP/DOWN right?"] + EV3["Magnitude: How close was the prediction?"] + EV4["Best/worst performing indicators"] + EV5["Save results to Excel/CSV"] + end + + subgraph PREDICT["Phase 4: Future Prediction"] + P1["Use full 30-day window + learned memories"] + P2["Predict next 10-30 days (based on horizon)"] + P3["Save predictions to Excel"] + end + + subgraph VIZ["Phase 5: Visualization"] + V1["Left chart: Actual price history"] + V2["Right chart: Predicted prices"] + V3["Overlay: Where predictions matched/diverged"] + V4["Metrics dashboard: accuracy, returns, etc."] + end + + S3 --> T1 + T9 -->|"After all training days"| EV1 + EV1 --> EV2 --> EV3 --> EV4 --> EV5 + EV5 --> P1 --> P2 --> P3 + P3 --> V1 + V1 --> V2 --> V3 --> V4 + + %% FIXED STYLES (dark text + stronger borders) + style SETUP fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style TRAIN fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style EVAL fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style PREDICT fill:#fce4ec,stroke:#c2185b,stroke-width:2px,color:#880e4f + style VIZ fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#4a148c +``` + +--- + +## File Structure for the PR + +```mermaid +%%{init: { + 'themeVariables': { + 'fontSize': '20px', + 'fontFamily': 'Arial', + 'lineColor': '#ffffff' + }, + 'flowchart': { + 'nodeSpacing': 80, + 'rankSpacing': 120 + } +}}%% +flowchart TD + + subgraph NEW_FILES["New Files We'll Add"] + direction TB + PR["tradingagents/autoresearch/"] + PR --> TRAIN_PY["train.py
Walk-forward training loop"] + PR --> EVAL_PY["evaluation.py
Compare predictions vs actual"] + PR --> MODEL_PY["model.py
Wrapper around TradingAgentsGraph
for batch prediction"] + PR --> HARNESS["model_harness.py
Orchestrates the full pipeline:
setup → train → eval → predict → viz"] + PR --> PROMPT_PY["prompt.py
Configurable analysis prompts
and research focus areas"] + PR --> VIZ_PY["visualization.py
Side-by-side charts
(actual vs predicted)"] + end + + OUTPUTS_NOTE["All generated artifacts (Excel, CSV, charts)
are written to config['results_dir']
from default_config.py — NOT committed
inside the source package"] + HARNESS -.->|"writes outputs to"| OUTPUTS_NOTE + + subgraph EXISTING_USED["Existing Files We Use (Don't Modify)"] + EX1["tradingagents/graph/trading_graph.py
TradingAgentsGraph class"] + EX2["tradingagents/graph/reflection.py
reflect_and_remember()"] + EX3["tradingagents/agents/utils/memory.py
FinancialSituationMemory"] + EX4["tradingagents/dataflows/interface.py
Data routing"] + EX5["tradingagents/default_config.py
Configuration"] + end + + HARNESS -->|"calls"| EX1 + EVAL_PY -->|"triggers"| EX2 + EX2 -->|"stores in"| EX3 + MODEL_PY -->|"uses"| EX4 + HARNESS -->|"extends"| EX5 + + %% FIXED styles (contrast + borders) + style NEW_FILES fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style EXISTING_USED fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b +``` + +--- + +## Detailed: train.py Logic + +```mermaid +flowchart TD + START["train(ticker, horizon, start_date)"] + + WINDOW["Calculate training window
1 day → 30 days lookback
1 week → 90 days lookback
1 month → 180 days lookback"] + + FETCH["Fetch full historical data
yfinance: get_stock_data(ticker, start, end)"] + + SPLIT["Split data (configurable test_window):
context_days = window[:-test_window]
test_days = window[-test_window:]
Default: test_window = max(5, int(0.2 * N))"] + + INIT["Initialize TradingAgentsGraph
with fresh memories"] + + subgraph LOOP["Walk-Forward Loop (for each test day)"] + DAY_N["Current test day = day[i]"] + PROPAGATE["ta.propagate(ticker, day[i-1])
Predict what happens on day[i]"] + GET_ACTUAL["Get actual price on day[i]
from historical data"] + COMPARE["evaluation.compare(
predicted=decision,
actual=price_change
)"] + CORRECT{"Direction
correct?"} + POSITIVE["ta.reflect_and_remember(+return)
Memory: 'This approach worked
when indicators showed X'"] + NEGATIVE["ta.reflect_and_remember(-return)
Memory: 'This approach failed
when conditions were Y'"] + LOG["Log result to results list:
{date, predicted, actual, correct, return}"] + NEXT["i += 1"] + + DAY_N --> PROPAGATE --> GET_ACTUAL --> COMPARE --> CORRECT + CORRECT -->|"Yes"| POSITIVE --> LOG + CORRECT -->|"No"| NEGATIVE --> LOG + LOG --> NEXT + NEXT -->|"More days?"| DAY_N + end + + RETURN["Return results list + trained memory"] + + START --> WINDOW --> FETCH --> SPLIT --> INIT --> LOOP + NEXT -->|"Done"| RETURN + + style LOOP fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 +``` + +--- + +## Detailed: evaluation.py Logic + +```mermaid +flowchart TD + INPUT["Input: list of
{date, predicted, actual, correct, return}"] + + subgraph METRICS["Calculated Metrics"] + M1["Direction Accuracy
% of days where UP/DOWN was correct"] + M2["Signal Distribution
How many BUY vs HOLD vs SELL"] + M3["Cumulative Return
If you followed every signal"] + M4["Max Drawdown
Largest peak-to-trough decline in cumulative return"] + M5["Win Rate by Signal Type
BUY accuracy vs SELL accuracy"] + M6["Best/Worst Days
Biggest wins and losses"] + end + + subgraph OUTPUT["Output Files (written to config['results_dir'])"] + O1["{results_dir}/training_log.xlsx
Every prediction with details"] + O2["{results_dir}/metrics_summary.xlsx
All metrics in one sheet"] + O3["{results_dir}/memory_dump.json
What the agents learned"] + end + + INPUT --> METRICS + METRICS --> OUTPUT + + style METRICS fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style OUTPUT fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b +``` + +--- + +## Detailed: visualization.py Layout + +```mermaid +flowchart LR + subgraph LEFT["Left Panel: Actual Data"] + L1["Stock price line chart"] + L2["Volume bars below"] + L3["Key indicators overlay
(SMA 50, SMA 200, RSI)"] + L4["Green/Red markers:
Days where agents were right/wrong"] + end + + subgraph RIGHT["Right Panel: Predicted"] + R1["Agent's predicted direction
per day (arrows up/down)"] + R2["Confidence level
(BUY=high, OVERWEIGHT=medium, etc.)"] + R3["Decision breakdown:
Which agents agreed/disagreed"] + end + + subgraph BOTTOM["Bottom Panel: Comparison"] + B1["Overlay: actual vs predicted direction"] + B2["Running accuracy score"] + B3["Memory growth chart:
How many lessons stored over time"] + end + + style LEFT fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style RIGHT fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style BOTTOM fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b +``` + +--- + +## Detailed: model_harness.py (The Orchestrator) + +```mermaid +flowchart TD + subgraph CLI["User Interface"] + U1["python model_harness.py
--ticker NVDA
--horizon 1day
--start-date 2024-01-01"] + end + + subgraph HARNESS["model_harness.py Pipeline"] + H1["Parse arguments"] + H2["Load/extend config from default_config.py"] + H3["Initialize TradingAgentsGraph"] + + H4["Phase 1: TRAIN
train.run_walk_forward()"] + H5["Phase 2: EVALUATE
evaluation.generate_report()"] + H6["Phase 3: PREDICT
model.predict_future()"] + H7["Phase 4: VISUALIZE
visualization.create_dashboard()"] + + H8["Save all results to config['results_dir']"] + + H1 --> H2 --> H3 --> H4 --> H5 --> H6 --> H7 --> H8 + end + + subgraph CONFIG_OPTIONS["Configurable via prompt.py"] + C1["analysis_focus: 'intraday momentum'"] + C2["priority_indicators: ['RSI', 'MACD', 'VWAP']"] + C3["news_weight: 'high' or 'low'"] + C4["debate_rounds: 1-3"] + C5["risk_tolerance: 'aggressive' / 'moderate' / 'conservative'"] + end + + CLI --> HARNESS + CONFIG_OPTIONS --> H2 + + style CLI fill:#f3e5f5 + style HARNESS fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style CONFIG_OPTIONS fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 +``` + +--- + +## How prompt.py Works + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '18px' }, + 'flowchart': { 'nodeSpacing': 100, 'rankSpacing': 140 } +}}%% +flowchart TD + subgraph PROMPT["prompt.py - Research Focus Configuration"] + P1["RESEARCH_FOCUS = {
'mode': 'intraday',
'timeframe': '1day',
'focus_areas': [
'momentum indicators',
'volume analysis',
'news catalysts'
],
'avoid': [
'long-term fundamentals',
'quarterly earnings'
]
}"] + + P2["This gets injected into the
system prompts of each analyst"] + end + + subgraph EFFECT["How It Changes Agent Behavior"] + E1["Market Analyst
→ Prioritizes RSI, MACD, VWAP
→ Focuses on intraday patterns"] + E2["News Analyst
→ Looks for same-day catalysts
→ Ignores long-term trends"] + E3["Bull/Bear Researchers
→ Debate short-term momentum
→ Not long-term value"] + end + + P1 --> P2 --> E1 + P2 --> E2 + P2 --> E3 + + style PROMPT fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style EFFECT fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 +``` + +--- + +## Walk-Forward Example: 1-Day Horizon with NVDA + +```mermaid +gantt + title Walk-Forward Training: NVDA 1-Day Prediction + dateFormat YYYY-MM-DD + + section Context Window + Historical data (agents can see) :done, ctx, 2024-04-01, 20d + + section Test Window (predict one at a time) + Day 21 - Predict (first test) :active, d21, 2024-04-21, 1d + Day 22 - Predict :d22, 2024-04-22, 1d + Day 23 - Predict :d23, 2024-04-23, 1d + Day 24 - Predict :d24, 2024-04-24, 1d + Day 25 - Predict :d25, 2024-04-25, 1d + Day 26 - Predict :d26, 2024-04-28, 1d + Day 27 - Predict :d27, 2024-04-29, 1d + Day 28 - Predict :d28, 2024-04-30, 1d + Day 29 - Predict :d29, 2024-05-01, 1d + Day 30 - Predict (last test) :crit, d30, 2024-05-02, 1d + + section After Training + Predict FUTURE days :milestone, future, 2024-05-03, 0d +``` + +**Step-by-step for Day 21:** +1. Agents see data from Apr 1-20 only +2. `ta.propagate("NVDA", "2024-04-20")` → Predicts direction for Apr 21 +3. Check actual Apr 21 price: Was prediction right? +4. `ta.reflect_and_remember(actual_return)` → Store lesson +5. Now agents see Apr 1-21 → Predict Apr 22 +6. Repeat... + +--- + +## What "Adjusting the Approach" Actually Means + +When a prediction is wrong, here's what safely adjusts vs. what must remain fixed: + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '18px' }, + 'flowchart': { 'nodeSpacing': 100, 'rankSpacing': 140 } +}}%% +flowchart TD + WRONG["Prediction was WRONG"] + + subgraph AUTO_CHANGES["Automatic (via reflect_and_remember)"] + A1["Memory updated:
'When RSI was 72 and we said BUY,
the stock actually dropped 3%.
Next time: consider overbought signal.'"] + A2["Next prediction naturally
considers this lesson via
BM25 memory retrieval"] + end + + subgraph AFTER_TRAINING["After full training run (manual analysis)"] + B1["Check: Which analyst was most wrong?
→ Maybe disable social analyst for this stock"] + B2["Check: Which indicators helped most?
→ Update prompt.py focus_areas"] + B3["Check: Were debate rounds enough?
→ Increase max_debate_rounds"] + B4["Check: Was risk assessment accurate?
→ Adjust risk_tolerance in config"] + end + + subgraph NEVER_CHANGE["What We DON'T Change"] + N1["Don't retry the same day
(look-ahead bias = cheating)"] + N2["Don't modify the model weights
(weights stay fixed, learning happens in memory)"] + N3["Don't change data source mid-run
(inconsistent comparison)"] + end + + WRONG --> AUTO_CHANGES + WRONG --> AFTER_TRAINING + WRONG -.->|"AVOID"| NEVER_CHANGE + + style AUTO_CHANGES fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style AFTER_TRAINING fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#e65100 + style NEVER_CHANGE fill:#ffcdd2,stroke:#c62828,stroke-width:2px,color:#b71c1c +``` + +--- + +## Summary: What We're Building + +```mermaid +%%{init: { + 'themeVariables': { 'fontSize': '18px' }, + 'flowchart': { 'nodeSpacing': 100, 'rankSpacing': 140 } +}}%% +flowchart TD + subgraph PR_SCOPE["PR Scope: tradingagents/autoresearch/"] + F1["train.py - Walk-forward loop"] + F2["evaluation.py - Metrics + Excel output"] + F3["model.py - Batch prediction wrapper"] + F4["model_harness.py - Full pipeline orchestrator"] + F5["prompt.py - Intraday research focus config"] + F6["visualization.py - Actual vs Predicted charts"] + end + + subgraph USES["Uses Existing (No Modifications)"] + U1["TradingAgentsGraph.propagate()"] + U2["TradingAgentsGraph.reflect_and_remember()"] + U3["FinancialSituationMemory (BM25)"] + U4["All 12 agents + tools + dataflows"] + end + + subgraph OUTPUTS["What User Gets"] + O1["Excel: Day-by-day predictions vs actual"] + O2["Charts: Side-by-side actual vs predicted"] + O3["Metrics: Accuracy, returns, win rate"] + O4["Trained memory: Lessons for future use"] + end + + PR_SCOPE -->|"calls"| USES + PR_SCOPE -->|"produces"| OUTPUTS + + style PR_SCOPE fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20 + style USES fill:#e1f5fe,stroke:#0277bd,stroke-width:2px,color:#01579b + style OUTPUTS fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#4a148c +``` + +--- + +## Key Design Decisions + +| Decision | Choice | Why | +|---|---|---| +| Retry same day on failure? | **No** - move forward, learn via memory | Retrying once the outcome is known is look-ahead bias | +| Modify existing agent code? | **No** - only add new files | Clean PR, no risk of breaking existing functionality | +| Where does learning happen? | **reflect_and_remember()** - already built | Reuses the existing reflection mechanism instead of reinventing it | +| How to tune approach? | **prompt.py** config + post-training analysis | Separates "what to focus on" from "how it runs" | +| Output format? | **Excel + matplotlib charts** | Simple, shareable, no extra dependencies | +| Max prediction horizon? | **1 month (not 1 year)** | LLM-based analysis degrades over long horizons | + +--- + +## Questions for Reviewers + +1. **Is the approach sound?** Is walk-forward backtesting with memory-based learning acceptable, or does the team prefer an alternative approach? +2. **Module location:** Is `tradingagents/autoresearch/` OK, or would it fit better under `experiments/` or `research/`? +3. **API cost concern:** Training over 30 days is roughly 360 LLM calls (about 12 per day). Is this acceptable, or should the design include batch/cheap-model modes? +4. **Scope:** Start with just the `1day` horizon, or support all three (`1day`/`1week`/`1month`) in the first iteration? +5. **Merged feature or experimental branch?** Should this live in `main` or as a separate experimental track? + +## Next Steps + +If the approach is approved, a follow-up PR will implement the actual module according to the design above. This RFC is intentionally docs-only to gather feedback before implementation.
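As a rough illustration of what the walk-forward loop in that follow-up might look like, the sketch below mirrors the Day 21 walkthrough: predict each test day once, compare against the realized return, store the lesson, then move on without retrying. The names `walk_forward`, `StepResult`, `predict`, and `remember` are hypothetical and not part of the existing codebase. In the real module, `predict` would presumably wrap `ta.propagate(...)` and `remember` would wrap `ta.reflect_and_remember(...)`. The prices are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class StepResult:
    as_of: str            # last date the agents were allowed to see
    target: str           # date whose direction was predicted
    predicted_up: bool
    actual_return: float
    correct: bool


def walk_forward(
    dates: Sequence[str],
    closes: Dict[str, float],
    context_days: int,
    predict: Callable[[str], bool],      # hypothetical stand-in for the agent pipeline
    remember: Callable[[float], None],   # hypothetical stand-in for the reflection step
) -> List[StepResult]:
    """Predict each test day once, score it, store the lesson, move on."""
    results: List[StepResult] = []
    for i in range(context_days, len(dates)):
        as_of, target = dates[i - 1], dates[i]
        predicted_up = predict(as_of)    # single attempt per day, never retried
        actual_return = closes[target] / closes[as_of] - 1.0
        correct = predicted_up == (actual_return > 0)
        remember(actual_return)          # lesson is stored before the next prediction
        results.append(StepResult(as_of, target, predicted_up, actual_return, correct))
    return results


# Toy run with invented closing prices and a predictor that always says "up".
dates = ["2024-04-15", "2024-04-16", "2024-04-17", "2024-04-18", "2024-04-19"]
closes = dict(zip(dates, [100.0, 102.0, 101.0, 103.0, 104.0]))
lessons: List[float] = []

results = walk_forward(
    dates, closes, context_days=2,
    predict=lambda as_of: True,
    remember=lessons.append,
)
accuracy = sum(r.correct for r in results) / len(results)
print(f"accuracy over {len(results)} test days: {accuracy:.2f}")
```

Passing `predict` and `remember` as callables keeps the loop independent of the agent graph, so it can be exercised with stubs like the ones above before any LLM calls are spent.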