Implemented `tests/portfolio/test_config.py` to test the `validate_config` function in `tradingagents/portfolio/config.py`.
Key improvements:
- Added `test_validate_config_max_positions_invalid` to verify `max_positions` boundary conditions (0, -1).
- Added happy path test `test_validate_config_valid`.
- Added tests for `max_position_pct`, `max_sector_pct`, `min_cash_pct`, and `default_budget` validation.
- Verified that tests correctly catch bugs by temporarily disabling validation logic.
These tests ensure the portfolio manager configuration is robustly validated before use.
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
- tradingagents/portfolio/risk_metrics.py: pure-Python computation of
Sharpe, Sortino, VaR, max drawdown, beta, sector concentration from
PortfolioSnapshot NAV history — no LLM, no external dependencies
- tradingagents/portfolio/__init__.py: export compute_risk_metrics
- tradingagents/agents/utils/portfolio_tools.py: 4 LangChain tools
wrapping Holding.enrich, Portfolio.enrich, ReportStore APIs, and
compute_risk_metrics so agents can access portfolio data without
reimplementing computations
- tests/portfolio/test_risk_metrics.py: 48 tests for risk metrics
- tests/unit/test_portfolio_tools.py: 19 tests for portfolio tools
- tests/portfolio/test_repository.py: fix pre-existing import error
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Replace supabase-py stubs with working psycopg2 implementation using
Supabase pooler connection string. Implement full business logic in
repository (avg cost basis, cash accounting, trade recording, snapshots).
Add 12 unit tests + 4 integration tests (51 total portfolio tests pass).
Fix cash_pct bug in models.py, update docs for psycopg2 + pooler pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Implement integration tests for scanner vendor routing, ensuring correct routing to Alpha Vantage and fallback to yfinance.
- Create comprehensive unit tests for TTM analysis, covering metrics computation and report formatting.
- Introduce fail-fast vendor routing tests to verify immediate failure for methods not in FALLBACK_ALLOWED.
- Develop extensive integration tests for the yfinance data layer, mocking external calls to validate functionality across various financial data retrieval methods.
* gitignore
* feat: unify report paths under reports/daily/{date}/ hierarchy
All generated artifacts now land under a single reports/ tree:
- reports/daily/{date}/market/ for scan results (was results/macro_scan/)
- reports/daily/{date}/{TICKER}/ for per-ticker analysis (was reports/{TICKER}_{timestamp}/)
- reports/daily/{date}/{TICKER}/eval/ for eval logs (was eval_results/{TICKER}/...)
Adds tradingagents/report_paths.py with centralized path helpers used by
CLI commands, trading graph, and pipeline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: structured observability logging for LLM, tool, and vendor calls
Add RunLogger (tradingagents/observability.py) that emits JSON-lines events
for every LLM call (model, agent, tokens in/out, latency), tool invocation
(tool name, args, success, latency), data vendor call (method, vendor,
success/failure, latency), and report save.
Integration points:
- route_to_vendor: log_vendor_call() on every try/catch
- run_tool_loop: log_tool_call() on every tool invoke
- ScannerGraph: new callbacks param, passes RunLogger.callback to all LLM tiers
- pipeline/macro_bridge: picks up RunLogger from thread-local, passes to TradingAgentsGraph
- cli/main.py: one RunLogger per command (analyze/scan/pipeline), write_log()
at end, summary line printed to console
Log files co-located with reports:
reports/daily/{date}/{TICKER}/run_log.jsonl (analyze)
reports/daily/{date}/market/run_log.jsonl (scan)
reports/daily/{date}/run_log.jsonl (pipeline)
Also fix test_long_response_no_nudge: update "A"*600 → "A"*2100 to match
MIN_REPORT_LENGTH=2000 threshold set in an earlier commit.
Update memory system context files (ARCHITECTURE, COMPONENTS, CONVENTIONS,
GLOSSARY, CURRENT_STATE) to document observability and report path systems.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add extract_json() utility for robust LLM JSON parsing
Handles DeepSeek R1 <think> blocks, markdown code fences, and
preamble/postamble text that LLMs wrap around JSON output.
Applied to macro_synthesis, macro_bridge, and CLI scan output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: opt-in vendor fallback — fail-fast by default (ADR 011)
Silent cross-vendor fallback corrupts signal quality when data contracts
differ (e.g., AV news has sentiment scores yfinance lacks). Only methods
with fungible data contracts (OHLCV, indices, sector/industry perf,
market movers) now get fallback. All others raise immediately.
- Add FALLBACK_ALLOWED whitelist to interface.py
- Rewrite route_to_vendor() with fail-fast/fallback branching
- Improve error messages with method name, vendors tried, and exception chaining
- Add 11 new tests in test_vendor_failfast.py
- Update ADRs 002 (superseded), 008, 010; create ADR 011
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Root causes fixed:
- test_config_wiring.py: `callable()` returns False on LangChain @tool
objects — replaced with `hasattr(x, "invoke")` check
- test_env_override.py: `load_dotenv()` in default_config.py re-reads
.env on importlib.reload(), leaking user's TRADINGAGENTS_* env vars
into isolation tests — mock env vars before reload
- test_scanner_comprehensive.py: LLM-calling test was not marked
@pytest.mark.integration — added marker so offline runs skip it
- test_scanner_fallback.py: assertions used stale `_output_files` list
from a previous run when output dir already existed — clear dir in
setUp; also fixed tool-availability check using hasattr(x, "invoke")
- test_scanner_graph.py: output-file path assertions used hardcoded
date string instead of fixture date; graph node assertions checked
for removed node names
Full offline suite: 388 passed, 70 deselected, 0 failures.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: add implementation plan for medium-term positioning upgrade
Covers 4 objectives: increased debate rounds, 8-quarter TTM fundamentals,
sector/peer relative performance, and macro regime classification.
https://claude.ai/code/session_01TuPpssTo83whKkNgSu57HH
* feat: medium-term positioning upgrade (debate rounds, TTM, peer comparison, macro regime)
## Changes
### Step 1: Agentic Debate Depth
- Increase `max_debate_rounds` and `max_risk_discuss_rounds` from 1 → 2 in `default_config.py`
- Fix bug in `trading_graph.py`: wire config values into `ConditionalLogic()` (was ignoring config, using hardcoded defaults)
### Step 2: 8-Quarter TTM Fundamental Analysis
- New `tradingagents/dataflows/ttm_analysis.py`: parses quarterly income/balance/cashflow CSV strings, computes TTM (sum of last 4 quarters), QoQ/YoY growth rates, margin trends across 8 quarters
- New `@tool get_ttm_analysis` in `fundamental_data_tools.py`
- Wire into fundamentals ToolNode; register in `TOOLS_CATEGORIES`
- Update fundamentals analyst prompt: "last 8 quarters (2 years)" focus
### Step 3: Sector & Peer Relative Performance
- New `tradingagents/dataflows/peer_comparison.py`: sector peer lookup, 1W/1M/3M/6M/YTD return ranking, alpha vs sector ETF
- New `@tool get_peer_comparison` and `@tool get_sector_relative`
- Wire into fundamentals ToolNode
### Step 4: Macro Regime Flag
- New `tradingagents/dataflows/macro_regime.py`: 6-signal classifier (VIX level/trend, credit spread HYG/LQD, yield curve TLT/SHY, market breadth SPX vs 200-SMA, sector rotation) → risk-on / transition / risk-off
- New `@tool get_macro_regime`; add `macro_regime_report` field to AgentState
- Wire into market ToolNode; feed into research_manager and risk_manager prompts
### Step 5: Tests (88 new unit tests, 0 integration)
- `tests/test_debate_rounds.py` (17 tests)
- `tests/test_ttm_analysis.py` (18 tests)
- `tests/test_peer_comparison.py` (11 tests)
- `tests/test_macro_regime.py` (16 tests)
- `tests/test_config_wiring.py` (12 tests)
All 88 new unit tests pass; no regressions in existing tests.
https://claude.ai/code/session_01TuPpssTo83whKkNgSu57HH
* test: mark live yfinance network tests as integration
TestYfinanceIndustryPerformance, TestRouteToVendorFallback, and TestFallbackRouting
all make live HTTP calls to yfinance (yfinance.Sector / market movers). Mark them
@pytest.mark.integration so they're skipped in standard offline runs.
https://claude.ai/code/session_01TuPpssTo83whKkNgSu57HH
* docs: update memory files for medium-term positioning upgrade
- PROGRESS.md: add milestone section with all new files and changes
- DECISIONS.md: add decisions 008-010 (macro regime, TTM data source, peer comparison)
- MISTAKES.md: add mistakes 10-11 (Python 3.11 f-string, mock data precision)
https://claude.ai/code/session_01TuPpssTo83whKkNgSu57HH
* docs: document git remote setup (origin = aguzererler fork)
origin points to aguzererler/TradingAgents which IS the fork.
No upstream remote configured. All feature branches push to origin.
https://claude.ai/code/session_01TuPpssTo83whKkNgSu57HH
* docs: redirect tracking files to memory system
Replace DECISIONS.md/MISTAKES.md/PROGRESS.md references in CLAUDE.md
with instructions to use /remember memory system. A PreToolUse hook
in ~/.claude/settings.json enforces this by blocking writes to those files.
https://claude.ai/code/session_01TuPpssTo83whKkNgSu57HH
* Initial plan
* Add integration tests for yfinance and Alpha Vantage APIs (78 tests, all passing)
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* Initial plan
* fix: allow .env variables to override DEFAULT_CONFIG values
Merged origin/main and resolved all 8 conflicting files:
- CLAUDE.md: merged MISTAKES.md ref + Project Tracking section + env override docs
- cli/main.py: kept top-level json import, kept try/except in run_pipeline
- tool_runner.py: kept descriptive comments for MAX_TOOL_ROUNDS
- alpha_vantage_common.py: kept thread-safe rate limiter, robust error handling
- interface.py: kept broader exception catch (AlphaVantageError + ConnectionError + TimeoutError)
- default_config.py: kept _env()/_env_int() env var overrides with load_dotenv() at module level
- scanner_graph.py: kept debug mode fix (stream for debug, invoke for result)
- macro_bridge.py: kept get_running_loop() over deprecated get_event_loop()
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* fix: move rate limiter sleep outside lock to avoid blocking threads
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* docs: update PROGRESS, DECISIONS, MISTAKES, CLAUDE with env override implementation
- PROGRESS.md: added env override milestone, updated test count (38 total),
marked Mistake #9 as resolved, added all new/modified files from PR #9
- DECISIONS.md: added Decision 008 (env var config overrides),
Decision 009 (thread-safe rate limiter), Decision 010 (broader
vendor fallback exceptions), updated Decision 007 status to superseded
- MISTAKES.md: updated Mistake #9 status to RESOLVED, added Mistake #10
(rate limiter held lock during sleep)
- CLAUDE.md: added env var override convention docs, updated critical
patterns with rate limiter and config fallback key lessons, updated
mistake count to 10
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* merge main into branch (-X theirs) and fix tests to pass against current main code
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* feat: add scanner tests, global demo key in conftest, remove 48 inline key patches
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* feat: add agentic memory scaffold and migrate tracking files to docs/agent/
Migrate DECISIONS.md, MISTAKES.md, PROGRESS.md, agents/, plans/, and
tradingagents/llm_clients/TODO.md into a structured docs/agent/ scaffold
with ADR-style decisions, plans, templates, and a live state tracker.
This gives agent workflows a standard memory structure for decisions,
plans, logs, and session continuity via CURRENT_STATE.md.
Agent-Ref: docs/agent/plans/global-macro-scanner.md
State-Updated: Yes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: improve Industry Deep Dive report quality with enriched data, sector routing, and tool-call nudge
* Initial plan
* Improve Industry Deep Dive quality: enrich tool data, explicit sector keys, tool-call nudge
- Enrich get_industry_performance_yfinance with 1-day/1-week/1-month price returns
via batched yf.download() for top 10 tickers (Step 1)
- Add VALID_SECTOR_KEYS, _DISPLAY_TO_KEY, _extract_top_sectors() to industry_deep_dive.py
to pre-extract top sectors from Phase 1 report and inject them into the prompt (Step 2)
- Add tool-call nudge to run_tool_loop: if first LLM response has no tool calls and is
under 500 chars, re-prompt with explicit instruction to call tools (Step 3)
- Update scanner_tools.py get_industry_performance docstring to list all valid sector keys (Step 4)
- Add 15 unit tests covering _extract_top_sectors, tool_runner nudge, and enriched output (Step 5)
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* Address code review: move cols[3] access into try block for IndexError safety
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* fix: align display row count with download count in get_industry_performance_yfinance
The enriched function downloads price data for top 10 tickers but displayed
20 rows, causing rows 11-20 to show N/A in all price columns. This broke
test_industry_perf_falls_back_to_yfinance which asserts N/A count < 5.
Now both download and display use head(10) for consistency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Co-authored-by: Ahmet Guzererler <guzererler@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update memory files after PR #13 (Industry Deep Dive quality fix)
- CURRENT_STATE.md: remove Industry Deep Dive blocker (resolved), update
test count 38 → 53, add PR #13 to Recent Progress, update milestone focus
- decisions/009-industry-deep-dive-quality.md: new ADR documenting the
three-pronged fix (enriched data, explicit sector routing, tool-call nudge)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add architecture-coordinator skill for mandatory ADR reading protocol
New Claude Code skill that enforces reading docs/agent/CURRENT_STATE.md,
decisions/, and plans/ before any code changes. Includes conflict resolution
protocol that stops work and quotes the violated ADR rule when user requests
conflict with established architectural decisions.
Files:
- .claude/skills/architecture-coordinator/SKILL.md
- .claude/skills/architecture-coordinator/references/adr-template.md
- .claude/skills/architecture-coordinator/references/reading-checklist.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* Initial plan
* Improve Industry Deep Dive quality: enrich tool data, explicit sector keys, tool-call nudge
- Enrich get_industry_performance_yfinance with 1-day/1-week/1-month price returns
via batched yf.download() for top 10 tickers (Step 1)
- Add VALID_SECTOR_KEYS, _DISPLAY_TO_KEY, _extract_top_sectors() to industry_deep_dive.py
to pre-extract top sectors from Phase 1 report and inject them into the prompt (Step 2)
- Add tool-call nudge to run_tool_loop: if first LLM response has no tool calls and is
under 500 chars, re-prompt with explicit instruction to call tools (Step 3)
- Update scanner_tools.py get_industry_performance docstring to list all valid sector keys (Step 4)
- Add 15 unit tests covering _extract_top_sectors, tool_runner nudge, and enriched output (Step 5)
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* Address code review: move cols[3] access into try block for IndexError safety
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
* fix: align display row count with download count in get_industry_performance_yfinance
The enriched function downloads price data for top 10 tickers but displayed
20 rows, causing rows 11-20 to show N/A in all price columns. This broke
test_industry_perf_falls_back_to_yfinance which asserts N/A count < 5.
Now both download and display use head(10) for consistency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Co-authored-by: Ahmet Guzererler <guzererler@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The scanner pipeline now runs end-to-end: Phase 1 (geopolitical, market
movers, sector scanners in parallel via Ollama), Phase 2 (industry deep
dive), Phase 3 (macro synthesis via OpenRouter/DeepSeek R1).
Key changes:
- Add tool_runner.py with run_tool_loop() for inline tool execution in
scanner agents (scanner graph has no ToolNode, unlike trading graph)
- Fix vendor fallback: catch AlphaVantageError base class, raise on
total failure instead of embedding errors in return values
- Rewrite yfinance sector perf to use SPDR ETF proxies (Sector.overview
has no performance data)
- Fix Ollama remote host support in openai_client.py
- Add LangGraph state reducers for parallel fan-out writes
- Add --date CLI flag for non-interactive scanner invocation
- Fix .env loading to find keys from both CWD and project root
- Add hybrid LLM config (per-tier provider/backend_url)
- Add project tracking: DECISIONS.md, PROGRESS.md, MISTAKES.md
- Add 9 new test files covering exceptions, fallback, and routing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Created new files for industry performance, market indices, market movers, sector performance, and topic news.
- Implemented end-to-end tests for scanner functionality, ensuring all tools return expected data formats and can save results to files.
- Added integration tests to verify scanner tools work seamlessly with the CLI scan command.
- Enhanced test coverage for individual scanner tools, validating output structure and content.
## Summary
The changes refactor the scanner tool invocation to use LangChain's StructuredTool `.invoke()` method consistently across the codebase. This includes updating the CLI scan command, rewriting tests to use the new invocation pattern, and correcting yfinance screener key mappings. The changes also add comprehensive end-to-end test suites for scanner functionality.
## Issues Found
| Severity | File:Line | Issue |
|----------|-----------|-------|
| WARNING | cli/main.py:1193-1218 | Inconsistent error handling - some tools check for "Error" prefix while others check for "No data" prefix, but the actual error messages from yfinance_scanner.py use different formats |
| WARNING | tradingagents/dataflows/yfinance_scanner.py:34 | The condition `if not data or 'quotes' not in data:` may not catch all error cases - yfinance screener can return empty data structures that evaluate to False but don't contain 'quotes' key |
| SUGGESTION | tests/test_scanner_tools.py:38-46 | Test could be more robust by checking for actual data content rather than just headers |
| SUGGESTION | cli/main.py:1193-1218 | Consider extracting the scanner tool invocation pattern into a helper function to reduce duplication |
## Detailed Findings
### File: cli/main.py:1193-1218
- **Confidence:** 85%
- **Problem:** The error handling checks for different prefixes ("Error" vs "No data") but the actual functions in yfinance_scanner.py return error messages with different formats (e.g., "Error fetching market movers for..."). This inconsistency could lead to improper error handling where error results are still saved to files.
- **Suggestion:** Standardize error checking by creating a helper function that checks if a result indicates an error, or modify the yfinance_scanner functions to return consistent error prefixes.
### File: tradingagents/dataflows/yfinance_scanner.py:34
- **Confidence:** 80%
- **Problem:** The condition `if not data or 'quotes' not in data:` assumes that if data exists, it will contain a 'quotes' key. However, yfinance screener might return data in different formats or empty objects that don't contain this key, leading to potential KeyError exceptions.
- **Suggestion:** Add more robust checking: `if not data or not isinstance(data, dict) or 'quotes' not in data:` to prevent attribute errors.
### File: tests/test_scanner_tools.py:38-46
- **Confidence:** 75%
- **Problem:** The test for market movers only checks that the result contains the expected header but doesn't verify that actual financial data is present in the table rows.
- **Suggestion:** Enhance the test to verify that data rows are present (e.g., check for table rows with actual data, not just headers).
### File: cli/main.py:1193-1218
- **Confidence:** 70%
- **Problem:** The scanner tool invocation pattern is repeated 5 times with only minor variations in arguments, violating the DRY principle.
- **Suggestion:** Extract this pattern into a helper function like `invoke_scanner_tool(tool, args, filename)` to reduce code duplication and improve maintainability.
## Recommendation
**APPROVE WITH SUGGESTIONS**
The changes are fundamentally sound and improve code consistency by standardizing on the StructuredTool `.invoke()` interface. The added test coverage is excellent. Addressing the minor issues noted above would further improve robustness and maintainability.
- Added market-wide analysis capabilities (movers, indices, sectors, industries, topic news)
- Implemented yfinance and Alpha Vantage data fetching modules
- Added LangChain tools for scanner functions
- Created scanner state definitions and graph components
- Integrated scan command into CLI
- Added configuration for scanner_data vendor routing
- Included test files for scanner components
This implements a new feature for global macro scanning to identify market-wide trends and opportunities.
- Added support for running CLI and Ollama server via Docker
- Introduced tests for local embeddings model and standalone Docker setup
- Enabled conditional Ollama server launch via LLM_PROVIDER