Scanner Pipeline — Progress Tracker
Milestone: End-to-End Scanner ✅ COMPLETE
The 3-phase scanner pipeline runs successfully end-to-end via python -m cli.main scan --date 2026-03-17.
What Works
| Component | Status | Notes |
|---|---|---|
| Phase 1: Geopolitical Scanner | ✅ | Ollama/qwen3.5:27b, uses get_topic_news |
| Phase 1: Market Movers Scanner | ✅ | Ollama/qwen3.5:27b, uses get_market_movers + get_market_indices |
| Phase 1: Sector Scanner | ✅ | Ollama/qwen3.5:27b, uses get_sector_performance (SPDR ETF proxies) |
| Phase 2: Industry Deep Dive | ✅ | Ollama/qwen3.5:27b, uses get_industry_performance + get_topic_news |
| Phase 3: Macro Synthesis | ✅ | OpenRouter/DeepSeek R1, pure LLM synthesis (no tools) |
| Parallel fan-out (Phase 1) | ✅ | LangGraph with _last_value reducers |
| Tool execution loop | ✅ | run_tool_loop() in tool_runner.py |
| Data vendor fallback | ✅ | AV → yfinance fallback on AlphaVantageError, ConnectionError, TimeoutError |
| CLI --date flag | ✅ | python -m cli.main scan --date YYYY-MM-DD |
| .env loading | ✅ | load_dotenv() at module level in default_config.py — import-order-independent |
| Env var config overrides | ✅ | All DEFAULT_CONFIG keys overridable via TRADINGAGENTS_<KEY> env vars |
| Tests (38 total) | ✅ | 14 original + 9 scanner fallback + 15 env override tests |
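The AV → yfinance fallback in the table above follows a simple try/except pattern. A minimal sketch, with illustrative stand-in functions (the project's real data-fetching identifiers are not reproduced here):

```python
# Sketch of the vendor fallback pattern: try Alpha Vantage first,
# fall back to yfinance on API errors or network failures.
# fetch_alpha_vantage/fetch_yfinance are hypothetical stand-ins.

class AlphaVantageError(Exception):
    """Raised when the Alpha Vantage API returns an error payload."""

def fetch_alpha_vantage(symbol: str) -> dict:
    # Stand-in for a real API call that fails (e.g., rate limited).
    raise AlphaVantageError("rate limit exceeded")

def fetch_yfinance(symbol: str) -> dict:
    # Stand-in for the yfinance-backed fallback path.
    return {"symbol": symbol, "source": "yfinance"}

def get_quote(symbol: str) -> dict:
    try:
        return fetch_alpha_vantage(symbol)
    except (AlphaVantageError, ConnectionError, TimeoutError):
        # Broad catch mirrors the broadened fallback described for
        # dataflows/interface.py: API errors AND transport failures.
        return fetch_yfinance(symbol)
```

Catching `ConnectionError` and `TimeoutError` alongside the vendor-specific exception is what makes the fallback kick in on network flakiness, not just on API-level errors.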
Output Quality (Sample Run 2026-03-17)
| Report | Size | Content |
|---|---|---|
| geopolitical_report | 6,295 chars | Iran conflict, energy risks, central bank signals |
| market_movers_report | 6,211 chars | Top gainers/losers, volume anomalies, index trends |
| sector_performance_report | 8,747 chars | Sector rotation analysis with ranked table |
| industry_deep_dive_report | — | Ran but was sparse (Phase 1 reports were the primary context) |
| macro_scan_summary | 10,309 chars | Full synthesis with stock picks and JSON structure |
Files Created/Modified
New files:
- tradingagents/agents/utils/tool_runner.py — inline tool execution loop
- tradingagents/agents/utils/scanner_states.py — ScannerState with reducers
- tradingagents/agents/utils/scanner_tools.py — LangChain tool wrappers for scanner data
- tradingagents/agents/scanners/ — all 5 scanner agent modules
- tradingagents/graph/scanner_graph.py — ScannerGraph orchestrator
- tradingagents/graph/scanner_setup.py — LangGraph workflow setup
- tradingagents/dataflows/yfinance_scanner.py — yfinance data for scanner
- tradingagents/dataflows/alpha_vantage_scanner.py — Alpha Vantage data for scanner
- tradingagents/pipeline/macro_bridge.py — scan → filter → per-ticker analysis bridge
- tests/test_scanner_fallback.py — 9 fallback tests
- tests/test_env_override.py — 15 env override tests
Modified files:
- tradingagents/default_config.py — env var overrides via _env()/_env_int() helpers, load_dotenv() at module level, restored top-level llm_provider and backend_url keys
- tradingagents/llm_clients/openai_client.py — Ollama remote host support
- tradingagents/dataflows/interface.py — broadened fallback catch to (AlphaVantageError, ConnectionError, TimeoutError)
- tradingagents/dataflows/alpha_vantage_common.py — thread-safe rate limiter (sleep outside lock), broader RequestException catch, wrapped raise_for_status
- tradingagents/graph/scanner_graph.py — debug mode fix (stream for debug, invoke for result)
- tradingagents/pipeline/macro_bridge.py — get_running_loop() over deprecated get_event_loop()
- cli/main.py — scan command with --date flag, try/except in run_pipeline, .env loading fix
- main.py — load_dotenv() before tradingagents imports
- pyproject.toml — python-dotenv>=1.0.0 dependency declared
- .env.example — documented all TRADINGAGENTS_* overrides and ALPHA_VANTAGE_API_KEY
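The get_running_loop() change noted for macro_bridge.py reflects a general asyncio deprecation: get_event_loop() no longer implicitly creates a loop in recent Python versions, while get_running_loop() fails fast when called outside a coroutine. A small illustrative example (not the project's actual bridge code):

```python
import asyncio

# get_running_loop() is only valid inside a running event loop, which is
# exactly where run_in_executor-style calls belong; get_event_loop() outside
# a loop is deprecated (DeprecationWarning since Python 3.10/3.12).
async def schedule_task() -> str:
    loop = asyncio.get_running_loop()  # safe: we are inside the loop
    # Offload a blocking callable to the default thread-pool executor.
    return await loop.run_in_executor(None, lambda: "done")

result = asyncio.run(schedule_task())
```

Calling `asyncio.get_running_loop()` at module import time (outside any coroutine) raises `RuntimeError` immediately, which is usually preferable to the silent loop creation the old API performed.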
Milestone: Env Var Config Overrides ✅ COMPLETE (PR #9)
All DEFAULT_CONFIG values are now overridable via TRADINGAGENTS_<KEY> environment variables without code changes. This resolves the latent bug from Mistake #9 (missing top-level llm_provider).
What Changed
| Component | Detail |
|---|---|
| default_config.py | load_dotenv() at module level + _env()/_env_int() helpers |
| Top-level fallback keys | Restored llm_provider and backend_url (defaults: "openai", "https://api.openai.com/v1") |
| Per-tier overrides | All None by default — fall back to top-level when not set via env |
| Integer config keys | max_debate_rounds, max_risk_discuss_rounds, max_recur_limit use _env_int() |
| Data vendor keys | data_vendors.* overridable via TRADINGAGENTS_VENDOR_<CATEGORY> |
| .env.example | Complete reference of all overridable settings |
| python-dotenv | Added to pyproject.toml as explicit dependency |
| Tests | 15 new tests in tests/test_env_override.py |
TODOs / Future Work
High Priority
- Industry Deep Dive quality: Phase 2 report was sparse in the test run. The LLM receives Phase 1 reports as context but may not call tools effectively. Consider pre-fetching industry data and injecting it directly, or tuning the prompt to be more directive about which sectors to drill into.
- Macro Synthesis JSON parsing: The macro_scan_summary should be valid JSON, but DeepSeek R1 sometimes wraps it in markdown code blocks or adds preamble text. The CLI tries json.loads(summary) to build a watchlist table — this may fail. Add robust JSON extraction (strip markdown fences, find the first {).
- pipeline command: cli/main.py has a run_pipeline() placeholder that chains scan → filter → per-ticker deep dive. Not yet implemented.
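The robust JSON extraction called for in the list above could be sketched like this (function name and approach are illustrative, not the project's implementation):

```python
import json
import re

def extract_json(text: str):
    """Best-effort JSON extraction from an LLM reply.

    Strips markdown code fences and preamble text, then parses from the
    first '{' onward, ignoring any trailing chatter after the object.
    """
    # Prefer the contents of a fenced block if one is present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in LLM output")
    # raw_decode stops at the end of the first valid object, so trailing
    # preamble/epilogue text does not break parsing.
    obj, _end = json.JSONDecoder().raw_decode(text[start:])
    return obj
```

`json.JSONDecoder().raw_decode` is the key trick: unlike `json.loads`, it tolerates trailing text after the object, which covers the "adds preamble text" failure mode when combined with the `find("{")` seek.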
Medium Priority
- Scanner report persistence: Reports are saved to results/macro_scan/{date}/ as .md files. Verify this works and add a JSON output option.
- Rate limiting for parallel tool calls: Phase 1 runs 3 agents in parallel, each calling tools. If tools hit the same API (e.g., Google News), they may get rate-limited. Consider adding delays or a shared rate limiter.
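A shared rate limiter for the parallel Phase 1 agents could follow the same "sleep outside the lock" discipline already used in alpha_vantage_common.py. A minimal sketch (class name and interval semantics are assumptions, not existing project code):

```python
import threading
import time

class SharedRateLimiter:
    """Enforce a minimum interval between calls across threads.

    Each acquire() reserves the next available slot under the lock,
    then sleeps OUTSIDE the lock so waiting threads don't serialize
    on the mutex while one of them sleeps.
    """
    def __init__(self, interval: float):
        self.interval = interval
        self._lock = threading.Lock()
        self._next_ok = 0.0  # monotonic timestamp of the next free slot

    def acquire(self) -> None:
        with self._lock:
            now = time.monotonic()
            wait = self._next_ok - now
            self._next_ok = max(now, self._next_ok) + self.interval
        if wait > 0:
            time.sleep(wait)  # sleep happens outside the lock
```

One instance shared by all three Phase 1 agents would space out their calls to a common backend (e.g., Google News) even though the agents themselves run concurrently.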
- Ollama model validation: Before running the pipeline, validate that the configured model exists on the Ollama server (call the /api/tags endpoint). Currently a 404 error is only caught at the first LLM call.
- Test coverage for scanner agents: Current tests cover the data layer (yfinance/AV fallback) but not the agent nodes themselves. Add integration tests that mock the LLM and verify tool-loop behavior.
Low Priority
- Configurable MAX_TOOL_ROUNDS: Currently hardcoded to 5 in tool_runner.py. Could be made configurable via DEFAULT_CONFIG.
- Streaming output: The scanner currently runs with Live(Spinner(...)) — no intermediate output. Could stream phase completions to the console.
- Remove top-level llm_provider references: Resolved in PR #9 — llm_provider and backend_url restored as top-level keys with "openai"/"https://api.openai.com/v1" defaults. Per-tier providers fall back to these when None.