* feat: introduce flow_id with timestamp-based report versioning
Replace run_id with flow_id as the primary grouping concept (one flow =
one user analysis intent spanning scan + pipeline + portfolio). Reports
are now written as {timestamp}_{name}.json so load methods always return
the latest version by lexicographic sort, eliminating the latest.json
pointer pattern for new flows.
Key changes:
- report_paths.py: add generate_flow_id(), ts_now() (ms precision),
flow_id kwarg on all path helpers; keep run_id / pointer helpers for
backward compatibility
- ReportStore: dual-mode save/load — flow_id uses timestamped layout,
run_id uses legacy runs/{id}/ layout with latest.json
- MongoReportStore: add flow_id field and index; run_id stays for compat
- DualReportStore: expose flow_id property
- store_factory: accept flow_id as primary param, run_id as alias
- runs.py / langgraph_engine.py: generate and thread flow_id through all
trigger endpoints and run methods
- Tests: add flow_id coverage for all layers; 905 tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: load flow_id in FE to resume runs and fix max_tickers cap on continuation
- Add flow_id to RunParams interface and initial state
- loadRun() now restores flow_id + max_auto_tickers from history so the next
run continues in the same flow directory (Phase 1 scan skipped, already-done
tickers skipped via skip-if-exists logic)
- startRun() spreads flow_id into the request body when set, letting the backend
reuse the existing flow directory instead of generating a fresh flow_id
- After each run, params.flow_id is updated from the response so subsequent
runs automatically continue from the same flow
- max_auto_tickers restored from run.params.max_tickers ensures the ticker cap
matches the original run; scan_tickers[:max_t] on the backend then limits
the Phase 2 queue to the user's setting even when the existing scan has more
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(mongo): fast-fail timeout + lazy ensure_indexes to avoid 30s block on fallback
MongoClient previously used pymongo's 30-second serverSelectionTimeoutMS default,
causing store_factory to hang for 30s before falling back to the filesystem when
Atlas is unreachable. Also, ensure_indexes() was called eagerly in __init__,
making every store construction attempt block on a live network call.
- Set serverSelectionTimeoutMS=5_000 so fallback is triggered in ≤5s
- Move ensure_indexes() call out of __init__ — indexes are now created lazily
on the first _save() call via a guarded self._indexes_ensured flag
- ensure_indexes() is still idempotent and safe to call explicitly in tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(store): wrap all DualReportStore mongo calls in _try_mongo() for graceful degradation
Any MongoDB exception (SSL error, ServerSelectionTimeout, auth failure) was
propagating uncaught through DualReportStore and crashing the run. Reads
would return an error instead of falling back to local, and writes would
abort mid-run without saving anything.
Introduce a single _try_mongo(fn, default) helper that:
- Executes the Mongo callable
- Catches *any* exception, logs it as WARNING with type + message
- Returns the default value so the caller continues with local-only data
Pattern per method:
writes → try mongo (fire-and-forget); always return local result
reads → try mongo first; fall back to local on None or exception
lists → try mongo; fall back to local on empty/None
Runs now complete successfully even when Atlas is unreachable or returns SSL
errors. MongoDB sync resumes automatically once connectivity is restored.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(observability): non-blocking MongoDB inserts + 5s timeout in RunLogger
Every LLM and tool callback called _append() which synchronously called
insert_one() against MongoDB. When Atlas was unreachable this blocked the
entire LangGraph run for pymongo's 30-second default timeout per event,
effectively serializing all agent work behind MongoDB retries.
Two fixes:
1. serverSelectionTimeoutMS=5_000 on the RunLogger's MongoClient — consistent
with the same fix applied to MongoReportStore.
2. MongoDB inserts are now fire-and-forget via daemon threads — _append() spawns
a Thread(target=_insert, daemon=True) and returns immediately. LLM callbacks
and tool events are never delayed by MongoDB connectivity issues.
Failures are still reported via WARNING log from the background thread.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* revert(observability): restore synchronous MongoDB inserts in RunLogger
Root cause was an IP whitelist issue on Atlas causing SSL failures, not
insert volume. The background-thread approach added unnecessary complexity.
The 5s serverSelectionTimeoutMS is retained as a defensive safeguard.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Feature 1 - Configurable max_auto_tickers:
- Add max_auto_tickers config key (default 10) with TRADINGAGENTS_MAX_AUTO_TICKERS env override
- Macro synthesis agent accepts max_scan_tickers param, injects exact count into LLM prompt
- ScannerGraph passes config value to create_macro_synthesis()
- Backend engine applies safety cap on scan candidates (portfolio holdings always included)
- Frontend adds Max Tickers number input in params panel, sends max_tickers in auto run body
Feature 2 - Run persistence + phase-level node re-run:
- 2A: ReportStore + MongoReportStore gain save/load_run_meta, save/load_run_events,
list_run_metas methods; runs.py persists to disk in finally block; startup hydration
restores historical runs; lazy event loading on GET /{run_id}
- 2B: Analysts + trader checkpoint save/load methods in both stores; engine saves
checkpoints after pipeline completion alongside complete_report.json
- 2C: GraphSetup gains build_debate_subgraph() and build_risk_subgraph() for partial
re-runs; TradingAgentsGraph exposes debate_graph/risk_graph as lazy properties;
NODE_TO_PHASE mapping + run_pipeline_from_phase() engine method;
POST /api/run/rerun-node endpoint with _append_and_store helper
- 2D: Frontend history popover (loads GET /api/run/, sorts by created_at, click to load);
triggerNodeRerun() calls rerun-node endpoint; handleNodeRerun uses phase-level
re-run when active run is loaded
All 890 existing tests pass (10 skipped).
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Initial plan
* feat: include portfolio holdings in auto mode pipeline analysis
In run_auto (both AgentOS and CLI), Phase 2 now loads current portfolio
holdings and merges their tickers with scan candidates before running
the per-ticker pipeline. This ensures the portfolio manager has fresh
analysis for both new opportunities and existing positions.
Key changes:
- macro_bridge.py: add candidates_from_holdings() factory
- langgraph_engine.py run_auto: merge holding tickers with scan tickers
- cli/main.py auto: load holdings, create StockCandidates, pass to run_pipeline
- cli/main.py run_pipeline: accept optional holdings_candidates parameter
- 9 new unit tests covering holdings inclusion, dedup, and graceful fallback
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Agent-Logs-Url: https://github.com/aguzererler/TradingAgents/sessions/53065a07-d9f8-47be-9956-0eb4ee8c87da
* fix: normalize ticker case in dedup and clarify count display
Address code review feedback:
- Use .upper() for case-insensitive ticker comparison in run_pipeline
- Display accurate filtered scan count instead of raw candidate count
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Agent-Logs-Url: https://github.com/aguzererler/TradingAgents/sessions/53065a07-d9f8-47be-9956-0eb4ee8c87da
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
## Summary
- Adds `smart_money_scanner` as a new Phase 1b node that runs sequentially after `sector_scanner`, surfacing institutional footprints via Finviz screeners
- Introduces the **Golden Overlap** strategy in `macro_synthesis`: stocks confirmed by both top-down macro themes and bottom-up Finviz signals are labelled high-conviction
- Fixes model-name badge overflow in AgentGraph (long model IDs like OpenRouter paths were visually spilling into adjacent nodes)
- Completes all documentation: ADR-014, dataflow, architecture, components, glossary, current-state
## Key Decisions (see ADR-014)
- 3 zero-parameter tools (`get_insider_buying_stocks`, `get_unusual_volume_stocks`, `get_breakout_accumulation_stocks`) instead of 1 parameterised tool — prevents LLM hallucinations on string args
- Sequential after `sector_scanner` (not parallel fan-out) — gives access to `sector_performance_report` context and avoids `MAX_TOOL_ROUNDS=5` truncation in market_movers_scanner
- Graceful fallback: `_run_finviz_screen()` catches all exceptions and returns an error string — pipeline never hard-fails on web-scraper failure
- `breakout_accumulation` (52-wk high + 2x vol = O'Neil CAN SLIM institutional signal) replaces `oversold_bounces` (RSI<30 = retail contrarian, not smart money)
## Test Plan
- [x] 6 new mocked tests in `tests/unit/test_scanner_mocked.py` (happy path, empty DF, exception, sort order)
- [x] Fixed `tests/unit/test_scanner_graph.py` — added `smart_money_scanner` mock to compilation test
- [x] 2 pre-existing test failures excluded (verified baseline before changes)
- [x] AgentGraph badge: visually verified truncation with long OpenRouter model identifiers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
- ReportStore.clear_portfolio_stage(date, portfolio_id): deletes pm_decision
(.json + .md) and execution_result files for a given date/portfolio
- DELETE /api/run/portfolio-stage endpoint: calls clear_portfolio_stage
and returns list of deleted files
- Dashboard: 'Reset Decision' button calls the endpoint, then user can
run Auto to re-run Phase 3 from scratch while skipping Phase 1 & 2
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TRADINGAGENTS_REPORTS_DIR now controls where all reports land (scans,
analysis, portfolio artifacts). Both report_paths.REPORTS_ROOT and
ReportStore.data_dir read from the same env var so the entire
reports/daily/{date}/... tree is rooted at one configurable location.
PORTFOLIO_DATA_DIR still works as a portfolio-specific override.
Falls back to "reports" (relative to CWD) when neither is set.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extend run_portfolio to save Holding Reviews, Risk Metrics, PM Decision,
and Execution Result from final state
- Add run_trade_execution method for resuming trade execution from a saved
PM decision without re-running the full portfolio graph
- Update run_auto skip logic to check for execution result (not just decision)
and resume from saved decision when available
- Gitignore uv.lock and untrack it from version control
Co-Authored-By: Oz <oz-agent@warp.dev>
Added comprehensive unit tests for `fundamentals_analyst`, `market_analyst`,
`social_media_analyst`, and `news_analyst` to verify that they correctly
handle recursive tool calling via `run_tool_loop`. A MockLLM was created
to simulate a two-turn conversation (tool call request followed by a final
report generation) to ensure the `.invoke()` bug does not regress. Added
missing `build_instrument_context` imports to those agents to prevent
NameErrors.
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
This commit updates the `fundamentals_analyst`, `market_analyst`,
`social_media_analyst`, and `news_analyst` files to use `run_tool_loop`
instead of `.invoke()`. Using `.invoke()` resulted in the LLM execution
stopping immediately upon a tool call request without executing the tool,
returning an empty report or raw JSON. The `run_tool_loop` function
ensures tools are executed recursively and the final text content is
returned.
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Add effort parameter (high/medium/low) for Claude 4.5+ and 4.6 models,
consistent with OpenAI reasoning_effort and Google thinking_level.
Also add content normalization for Anthropic responses.
- Point requirements.txt to pyproject.toml as single source of truth
- Resolve welcome.txt path relative to module for CLI portability
- Include cli/static files in package build
- Extract shared normalize_content for OpenAI Responses API and
Gemini 3 list-format responses into base_client.py
- Update README install and CLI usage instructions
Enable use_responses_api for native OpenAI provider, which supports
reasoning_effort with function tools across all model families.
Removes the UnifiedChatOpenAI subclass workaround.
Closes#403
- y_finance.py: replace print() with logger.warning() in bulk-stats fallback
- macro_bridge.py: add elapsed_seconds field to TickerResult, populate in
run_ticker_analysis (success + error paths)
- cli/main.py: move inline 'import time as _time' and rich.progress imports
to module level; use result.elapsed_seconds for accurate per-ticker timing
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Agent-Logs-Url: https://github.com/aguzererler/TradingAgents/sessions/68fcf34c-8d55-4436-b743-f79fff68713f
Before this change, the pipeline showed a generic 'Analyzing...' spinner
for the entire multi-ticker run with no way to know which ticker was
processing or whether anything was actually working.
Changes:
- macro_bridge.py:
- run_ticker_analysis: logs '▶ Starting', '✓ complete in Xs', '✗ FAILED'
with elapsed time per ticker using logger.info/logger.error
- run_all_tickers: replaced asyncio.gather (swallows all progress) with
asyncio.as_completed + optional on_ticker_done(result, done, total)
callback; uses asyncio.Semaphore for max_concurrent control
- Added time and Callable imports
- cli/main.py run_pipeline:
- Replaced Live(Spinner) with Rich Progress bar (spinner + bar + counter
+ elapsed time)
- Prints '▷ Queued: TICKER' before analysis starts for each ticker
- on_ticker_done callback prints '✓ TICKER (N/M, Xs elapsed) → decision'
or '✗ TICKER failed ...' immediately as each ticker finishes
- Prints total elapsed time when all tickers complete
## Problem (Incident Post-mortem)
The pipeline was emitting hundreds of errors:
'Invalid number of return arguments after parsing column name: Date'
Root cause: after _clean_dataframe() lowercases all columns, stockstats.wrap()
promotes 'date' to the DataFrame index. Subsequent df['Date'] access caused
stockstats to try parsing 'Date' as a technical indicator name.
## Fixes
### 1. Fix df['Date'] stockstats bug (already shipped in prior commit)
- stockstats_utils.py + y_finance.py: use df.index.strftime() instead of
df['Date'] after wrap()
### 2. Extract _load_or_fetch_ohlcv() — single OHLCV authority
- Eliminates duplicated 30-line download+cache boilerplate in two places
- Cache filename is always derived from today's date — hardcoded stale date
'2015-01-01-2025-03-25' in local mode is gone
- Corruption/truncation detection: files <50 rows or unparseable are deleted
and re-fetched rather than silently returning bad data
- Drops on_bad_lines='skip' — malformed CSVs now raise instead of silently
dropping rows that would distort indicator calculations
### 3. YFinanceError typed exception
- Defined in stockstats_utils.py; raised instead of print()+return ''
- get_stockstats_indicator now raises YFinanceError on failure so errors
surface to callers rather than delivering empty strings to LLM agents
- interface.py route_to_vendor now catches YFinanceError alongside
AlphaVantageError and FinnhubError — failures appear in observability
telemetry and can trigger vendor fallback
### 4. Explicit date column discovery in alpha_vantage_common
- _filter_csv_by_date_range: replaced df.columns[0] positional assumption
with explicit search for 'time'/'timestamp'/'date' column
- ValueError re-raised (not swallowed) so bad API response shape is visible
### 5. Structured logging
- Replaced all print() calls in changed files with logging.getLogger()
- Added logging import + logger to alpha_vantage_common
## Tests
- tests/unit/test_incident_fixes.py: 12 new unit tests covering all fixes
(dynamic cache filename, corruption re-fetch, YFinanceError propagation,
explicit column lookup, empty download raises)
- tests/integration/test_stockstats_live.py: 11 live tests against real
yfinance API (all major indicators, weekend N/A, regression guard)
- All 70 tests pass (59 unit + 11 live integration)
- Removed unused `import time` from `tradingagents/agents/analysts/news_analyst.py`
- Verified file syntax with `py_compile`
- Confirmed that the import was successfully removed
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Replaces the O(N) database operations in the `TradeExecutor`'s
`execute_decisions` SELL loop with a single `batch_remove_holdings`
call to the repository. The new repository method calculates updates
in memory, resolves duplicate operations on the same ticker, and issues
the updates via newly implemented `psycopg2.extras.execute_batch`
routines on the `SupabaseClient`.
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>
Optimize the _percentile calculation to use heapq.nsmallest or heapq.nlargest when requesting small extreme percentiles (like 5% VaR) from large lists, falling back to sorted() only when necessary. This avoids fully sorting the entire array.
Co-authored-by: aguzererler <6199053+aguzererler@users.noreply.github.com>