TradingAgents

Commit Graph


Author

SHA1

Message

Date

ahmet guzererler

a90f14c086

feat: unified report paths, structured observability logging, and memory system update (#22)

* gitignore

* feat: unify report paths under reports/daily/{date}/ hierarchy

All generated artifacts now land under a single reports/ tree:
- reports/daily/{date}/market/ for scan results (was results/macro_scan/)
- reports/daily/{date}/{TICKER}/ for per-ticker analysis (was reports/{TICKER}_{timestamp}/)
- reports/daily/{date}/{TICKER}/eval/ for eval logs (was eval_results/{TICKER}/...)

Adds tradingagents/report_paths.py with centralized path helpers used by
CLI commands, trading graph, and pipeline.
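
The layout above can be sketched with a few `pathlib` helpers. This is a minimal illustration, not the contents of tradingagents/report_paths.py: the function names and the `REPORTS_ROOT` constant are hypothetical, and only the directory shapes come from the commit message.

```python
from pathlib import Path

# Assumed root directory; the commit message only specifies the tree below it.
REPORTS_ROOT = Path("reports")


def daily_dir(date: str) -> Path:
    """reports/daily/{date}/"""
    return REPORTS_ROOT / "daily" / date


def market_dir(date: str) -> Path:
    """reports/daily/{date}/market/ (scan results)."""
    return daily_dir(date) / "market"


def ticker_dir(date: str, ticker: str) -> Path:
    """reports/daily/{date}/{TICKER}/ (per-ticker analysis)."""
    return daily_dir(date) / ticker


def eval_dir(date: str, ticker: str) -> Path:
    """reports/daily/{date}/{TICKER}/eval/ (eval logs)."""
    return ticker_dir(date, ticker) / "eval"
```

Keeping every path derivation in one module means a later layout change touches a single file rather than each CLI command.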

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: structured observability logging for LLM, tool, and vendor calls

Add RunLogger (tradingagents/observability.py) that emits JSON-lines events
for every LLM call (model, agent, tokens in/out, latency), tool invocation
(tool name, args, success, latency), data vendor call (method, vendor,
success/failure, latency), and report save.

Integration points:
- route_to_vendor: log_vendor_call() on every try/catch
- run_tool_loop: log_tool_call() on every tool invoke
- ScannerGraph: new callbacks param, passes RunLogger.callback to all LLM tiers
- pipeline/macro_bridge: picks up RunLogger from thread-local, passes to TradingAgentsGraph
- cli/main.py: one RunLogger per command (analyze/scan/pipeline), write_log()
  at end, summary line printed to console

Log files co-located with reports:
  reports/daily/{date}/{TICKER}/run_log.jsonl   (analyze)
  reports/daily/{date}/market/run_log.jsonl     (scan)
  reports/daily/{date}/run_log.jsonl            (pipeline)
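
A rough sketch of the JSON-lines idea, assuming one in-memory event list that is flushed at the end of a command. The class name `SketchRunLogger`, the method signatures, the field names, and the example tool and method names are assumptions for illustration; only the method names `log_tool_call()`, `log_vendor_call()`, and `write_log()` appear in the commit message.

```python
import json
import tempfile
import time
from pathlib import Path


class SketchRunLogger:
    """Collects events in memory and writes them as one JSON object per line."""

    def __init__(self):
        self.events = []

    def _emit(self, kind, **fields):
        # Every event carries a timestamp and an event type.
        self.events.append({"ts": time.time(), "event": kind, **fields})

    def log_tool_call(self, tool, args, success, latency_ms):
        self._emit("tool", tool=tool, args=args, success=success,
                   latency_ms=latency_ms)

    def log_vendor_call(self, method, vendor, success, latency_ms):
        self._emit("vendor", method=method, vendor=vendor, success=success,
                   latency_ms=latency_ms)

    def write_log(self, path):
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", encoding="utf-8") as fh:
            for event in self.events:
                fh.write(json.dumps(event) + "\n")
        return path


# Usage: write a scan-style log into a temporary reports tree and read it back.
with tempfile.TemporaryDirectory() as tmp:
    logger = SketchRunLogger()
    logger.log_tool_call("example_tool", {"category": "day_gainers"}, True, 12.5)
    logger.log_vendor_call("example_method", "yfinance", True, 80.0)
    log_path = logger.write_log(
        Path(tmp) / "reports" / "daily" / "2026-03-19" / "market" / "run_log.jsonl"
    )
    parsed = [json.loads(line)
              for line in log_path.read_text(encoding="utf-8").splitlines()]
```

One object per line keeps the file appendable and lets it be filtered with line-oriented tools.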

Also fix test_long_response_no_nudge: update "A"*600 → "A"*2100 to match
MIN_REPORT_LENGTH=2000 threshold set in an earlier commit.

Update memory system context files (ARCHITECTURE, COMPONENTS, CONVENTIONS,
GLOSSARY, CURRENT_STATE) to document observability and report path systems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-19 09:06:40 +01:00

Ahmet Guzererler

cf636232aa

fix: resolve 12 pre-existing test failures across 5 test files

Root causes fixed:
- test_config_wiring.py: `callable()` returns False on LangChain @tool
  objects — replaced with `hasattr(x, "invoke")` check
- test_env_override.py: `load_dotenv()` in default_config.py re-reads
  .env on importlib.reload(), leaking user's TRADINGAGENTS_* env vars
  into isolation tests — mock env vars before reload
- test_scanner_comprehensive.py: LLM-calling test was not marked
  @pytest.mark.integration — added marker so offline runs skip it
- test_scanner_fallback.py: assertions used stale `_output_files` list
  from a previous run when output dir already existed — clear dir in
  setUp; also fixed tool-availability check using hasattr(x, "invoke")
- test_scanner_graph.py: output-file path assertions used hardcoded
  date string instead of fixture date; graph node assertions checked
  for removed node names
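
The `callable()` versus `hasattr(x, "invoke")` fix can be illustrated without LangChain installed. `FakeTool` below is a hypothetical stand-in for an object that exposes `.invoke()` but defines no `__call__`; it is not the LangChain class itself.

```python
class FakeTool:
    """Stand-in for a tool object: has .invoke(), but no __call__."""

    def __init__(self, func):
        self._func = func

    def invoke(self, args: dict):
        return self._func(**args)


def is_tool_like(obj) -> bool:
    """Duck-typed availability check, as used in the fixed tests."""
    return hasattr(obj, "invoke")


tool = FakeTool(lambda ticker: f"data for {ticker}")

print(callable(tool))       # False: instances without __call__ are not callable
print(is_tool_like(tool))   # True
```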

Full offline suite: 388 passed, 70 deselected, 0 failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-18 11:11:00 +01:00

Ahmet Guzererler

7c95188bf0

Add comprehensive end-to-end tests and market analysis results for March 15, 2026

- Created new files for industry performance, market indices, market movers, sector performance, and topic news.
- Implemented end-to-end tests for scanner functionality, ensuring all tools return expected data formats and can save results to files.
- Added integration tests to verify scanner tools work seamlessly with the CLI scan command.
- Enhanced test coverage for individual scanner tools, validating output structure and content.

## Summary
The changes refactor the scanner tool invocation to use LangChain's StructuredTool `.invoke()` method consistently across the codebase. This includes updating the CLI scan command, rewriting tests to use the new invocation pattern, and correcting yfinance screener key mappings. The changes also add comprehensive end-to-end test suites for scanner functionality.

## Issues Found
| Severity | File:Line | Issue |
|----------|-----------|-------|
| WARNING | cli/main.py:1193-1218 | Inconsistent error handling - some tools check for "Error" prefix while others check for "No data" prefix, but the actual error messages from yfinance_scanner.py use different formats |
| WARNING | tradingagents/dataflows/yfinance_scanner.py:34 | The condition `if not data or 'quotes' not in data:` may not catch all error cases - it assumes `data` is a dict, but the yfinance screener could return a truthy non-dict value that the membership test handles incorrectly |
| SUGGESTION | tests/test_scanner_tools.py:38-46 | Test could be more robust by checking for actual data content rather than just headers |
| SUGGESTION | cli/main.py:1193-1218 | Consider extracting the scanner tool invocation pattern into a helper function to reduce duplication |

## Detailed Findings

### File: cli/main.py:1193-1218
- **Confidence:** 85%
- **Problem:** The error handling checks for different prefixes ("Error" vs "No data") but the actual functions in yfinance_scanner.py return error messages with different formats (e.g., "Error fetching market movers for..."). This inconsistency could lead to improper error handling where error results are still saved to files.
- **Suggestion:** Standardize error checking by creating a helper function that checks if a result indicates an error, or modify the yfinance_scanner functions to return consistent error prefixes.
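
One possible shape for such a helper, as a sketch only. The name `is_error_result` is hypothetical, the two prefixes are the ones named in this finding, and treating empty or non-string results as errors is an added assumption.

```python
# The two prefixes mentioned in the finding above.
ERROR_PREFIXES = ("Error", "No data")


def is_error_result(result) -> bool:
    """Return True if a scanner tool result should not be saved to a file."""
    # Assumption: anything that is not a non-empty string counts as an error.
    if not isinstance(result, str) or not result.strip():
        return True
    return result.lstrip().startswith(ERROR_PREFIXES)
```

Routing every tool result through one predicate keeps the CLI checks consistent even if the prefix list changes later.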

### File: tradingagents/dataflows/yfinance_scanner.py:34
- **Confidence:** 80%
- **Problem:** The condition `if not data or 'quotes' not in data:` assumes that any truthy `data` is a dict. If the yfinance screener returns a truthy non-dict value, the membership test can raise a `TypeError` (non-iterable values) or silently perform a substring check (strings), so a malformed response may crash the check or slip through it.
- **Suggestion:** Add a type guard: `if not data or not isinstance(data, dict) or 'quotes' not in data:` so that non-dict responses are treated as errors.
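
The difference between the two conditions can be checked in isolation. Both function names below are hypothetical wrappers around the expressions quoted in this finding; the inputs are made-up values, not real yfinance responses.

```python
def original_check(data) -> bool:
    """The condition currently in the code: True means 'treat as error'."""
    return not data or "quotes" not in data


def guarded_check(data) -> bool:
    """The suggested condition with the isinstance guard added."""
    return not data or not isinstance(data, dict) or "quotes" not in data


# A truthy string that happens to contain the word passes the original check,
# because `in` on a str is a substring test.
stray_text = "upstream message mentioning quotes"
print(original_check(stray_text))  # False: treated as valid
print(guarded_check(stray_text))   # True: treated as an error

# A truthy non-iterable makes the original check raise.
try:
    original_check(42)
except TypeError:
    print("original_check(42) raises TypeError")
print(guarded_check(42))           # True
```

Note that `{"quotes": []}` still passes both checks, so an empty quotes list needs separate handling if it should count as "no data".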

### File: tests/test_scanner_tools.py:38-46
- **Confidence:** 75%
- **Problem:** The test for market movers only checks that the result contains the expected header but doesn't verify that actual financial data is present in the table rows.
- **Suggestion:** Enhance the test to verify that data rows are present (e.g., check for table rows with actual data, not just headers).
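
A sketch of what a row-level assertion could look like, assuming the tool returns a markdown table. `table_data_rows` is a hypothetical helper, and the sample text uses made-up symbols and prices rather than real scanner output.

```python
def table_data_rows(markdown: str) -> list[list[str]]:
    """Return the cells of each data row in a markdown table (header excluded)."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the |---|---| separator row.
        if all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    return rows[1:]  # the first table row is the header


# Made-up sample, for illustration only.
sample = (
    "# Market Movers\n\n"
    "| Symbol | Price |\n"
    "|--------|-------|\n"
    "| AAA | 10.5 |\n"
    "| BBB | 20.0 |\n"
)
header_only = "| Symbol | Price |\n|--------|-------|\n"

rows = table_data_rows(sample)
assert len(rows) >= 1, "expected at least one data row"
assert all(len(r) == 2 for r in rows), "each row should have both columns"
```

A header-only response yields an empty list, so a test built on this helper fails where a header-substring check would pass.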

### File: cli/main.py:1193-1218
- **Confidence:** 70%
- **Problem:** The scanner tool invocation pattern is repeated 5 times with only minor variations in arguments, violating the DRY principle.
- **Suggestion:** Extract this pattern into a helper function like `invoke_scanner_tool(tool, args, filename)` to reduce code duplication and improve maintainability.
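
A minimal sketch of the suggested helper. The `out_dir` parameter, the return convention, the inline prefix check, and the `StubTool` class are assumptions added to make the example self-contained; the only thing taken from the review is the `invoke_scanner_tool(tool, args, filename)` idea and the `.invoke()` interface.

```python
import tempfile
from pathlib import Path


def invoke_scanner_tool(tool, args: dict, filename: str, out_dir: Path):
    """Invoke a tool, save a successful result, and return the saved path.

    Returns None when the result looks like an error, so nothing is written.
    """
    result = tool.invoke(args)
    if not isinstance(result, str) or result.startswith(("Error", "No data")):
        return None
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / filename
    path.write_text(result, encoding="utf-8")
    return path


class StubTool:
    """Hypothetical stand-in for a scanner tool with a fixed response."""

    def __init__(self, response):
        self.response = response

    def invoke(self, args):
        return self.response


with tempfile.TemporaryDirectory() as tmp:
    saved = invoke_scanner_tool(
        StubTool("| Symbol |\n|---|\n| AAA |"), {}, "market_movers.md", Path(tmp)
    )
    saved_name = saved.name if saved else None
    saved_exists = saved is not None and saved.exists()
    skipped = invoke_scanner_tool(
        StubTool("Error fetching data"), {}, "sector_performance.md", Path(tmp)
    )
```

With a helper like this, each of the five call sites reduces to one line that differs only in tool, arguments, and filename.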

## Recommendation
**APPROVE WITH SUGGESTIONS**

The changes are fundamentally sound and improve code consistency by standardizing on the StructuredTool `.invoke()` interface. The added test coverage is excellent. Addressing the minor issues noted above would further improve robustness and maintainability.

2026-03-15 11:34:54 +01:00

3 Commits