TradingAgents

Commit Graph

Author	SHA1	Message	Date
陈少杰	eda9980729	feat(orchestrator): add comprehensive provider and timeout validation Add three layers of configuration validation to LLMRunner: 1. Provider × base_url matrix validation - Validates all 6 providers (anthropic, openai, google, xai, ollama, openrouter) - Uses precompiled regex patterns for efficiency - Detects mismatches before expensive graph initialization 2. Timeout configuration validation - Warns when analyst/research timeouts may be insufficient - Provides recommendations based on analyst count (1-4) - Non-blocking warnings logged at init time 3. Enhanced error classification - Distinguishes provider_mismatch from provider_auth_failed - Uses heuristic detection for auth failures - Simplified nested ternary expressions for readability Improvements: - Validation runs before cache check (prevents stale cache on config errors) - EAFP pattern for cache reading (more robust than TOCTOU) - Precompiled regex patterns (avoid recompilation overhead) - All 21 unit tests passing Documentation: - docs/architecture/orchestrator-validation.md - complete validation guide - orchestrator/examples/validation_examples.py - runnable examples Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 11:43:19 +08:00
陈少杰	64e3583f66	Unify research provenance extraction and persist it into state logs The earlier Phase 1-4 recovery left one unique worker-1 slice unrecovered: provenance extraction logic was still duplicated in the runner and the full-state log path still dropped the structured research fields. This change centralizes provenance extraction in agent state helpers, reuses it from the LLM runner, and writes the same structured fields into TradingAgents full-state logs with focused regression tests. Constraint: Preserve the existing debate-string output shape while making provenance reuse consistent across runner and state-log surfaces Rejected: Cherry-pick worker-1 auto-checkpoint wholesale \| it mixed duplicate A/B files and uv.lock churn with the useful provenance helper changes Confidence: high Scope-risk: narrow Directive: Keep research provenance extraction centralized; new consumers should call the helper instead of re-listing field names by hand Tested: python -m pytest -q tradingagents/tests/test_research_guard.py orchestrator/tests/test_trading_graph_config.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_profile_stage_chain.py orchestrator/tests/test_profile_ab.py orchestrator/tests/test_contract_v1alpha1.py orchestrator/tests/test_live_mode.py Tested: python -m compileall tradingagents/agents/utils/agent_states.py tradingagents/graph/trading_graph.py orchestrator/llm_runner.py orchestrator/tests/test_trading_graph_config.py tradingagents/tests/test_research_guard.py Not-tested: Live-provider end-to-end analysis run that emits a new full_states_log file	2026-04-14 13:34:25 +08:00
陈少杰	8c6da22f4f	Finish the A/B harness recovery without leaving conflict markers behind The worker-4 recovery brought in the trace-summary helper split and A/B harness updates, but the cherry-pick left conflict markers around build_trace_payload in profile_stage_chain.py. This follow-up keeps the merged import-based shape and records the cleanup as a standalone reversible step. Constraint: Preserve the recovered trace payload shape while removing only the cherry-pick residue Rejected: Re-run the cherry-pick from scratch \| unnecessary after the resolved file already passed targeted verification Confidence: high Scope-risk: narrow Directive: If profile_stage_chain.py is touched again, verify the file is marker-free before running compile/test to avoid silent recovery drift Tested: python -m pytest -q orchestrator/tests/test_contract_v1alpha1.py tradingagents/tests/test_research_guard.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_profile_stage_chain.py orchestrator/tests/test_profile_ab.py; python -m orchestrator.profile_stage_chain --help; python -m compileall orchestrator/profile_stage_chain.py orchestrator/profile_trace_utils.py orchestrator/profile_ab.py orchestrator/tests/test_profile_ab.py tradingagents/tests/test_research_guard.py Not-tested: Live-provider end-to-end profile_ab comparison on real traces	2026-04-14 05:15:21 +08:00
陈少杰	d34ad8d3ef	omx(team): auto-checkpoint worker-4 [unknown]	2026-04-14 05:14:01 +08:00
陈少杰	a81f825203	Make A/B trace comparisons easier to trust during profiling The minimal offline harness now carries forward source-file and trace-schema metadata, and it can break ties using error counts instead of only elapsed runtime and degraded-research totals. This keeps Phase 1-4 profile comparisons self-describing when multiple dumps are aggregated. Constraint: Keep the harness offline and avoid changing the default runtime path Rejected: Add a live dual-run executor \| would couple profiling to external LLM calls and increase risk Confidence: high Scope-risk: narrow Directive: Preserve the trace dump shape as the source of truth for future comparison tooling Tested: uv run python inline assertions for orchestrator.tests.test_profile_ab Tested: uv run python CLI smoke test for orchestrator.profile_ab with temp traces Tested: uv run python -m compileall orchestrator/profile_stage_chain.py orchestrator/profile_trace_utils.py orchestrator/profile_ab.py orchestrator/tests/test_profile_ab.py	2026-04-14 05:12:13 +08:00
陈少杰	909519ff17	omx(team): auto-checkpoint worker-2 [unknown]	2026-04-14 04:47:52 +08:00
陈少杰	addc4a1e9c	Keep research degradation visible while bounding researcher nodes Research provenance now rides with the debate state, cache metadata, live payloads, and trace dumps so degraded research no longer masquerades as a normal sample. Bull/Bear/Manager nodes also return explicit guarded fallbacks on timeout or exception, which gives the graph a real node budget boundary without rewriting the bull/bear output shape or removing debate. Constraint: Must preserve bull/bear debate structure and output shape while adding provenance and node guards Rejected: Skip bull/bear debate in compact mode \| would trade away analysis quality before A/B evidence exists Confidence: high Scope-risk: moderate Reversibility: clean Directive: Treat research_status and data_quality as rollout gates; do not collapse degraded research back into normal success samples Tested: python -m pytest tradingagents/tests/test_research_guard.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_live_mode.py web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q; python -m compileall tradingagents/graph/setup.py tradingagents/agents/utils/agent_states.py tradingagents/graph/propagation.py orchestrator/llm_runner.py orchestrator/live_mode.py orchestrator/profile_stage_chain.py; python orchestrator/profile_stage_chain.py --ticker 600519.SS --date 2026-04-10 --provider anthropic --model MiniMax-M2.7-highspeed --base-url https://api.minimaxi.com/anthropic --selected-analysts market --analysis-prompt-style compact --timeout 45 --max-retries 0 --overall-timeout 120 --dump-raw-on-failure Not-tested: Full successful live-provider completion through Portfolio Manager after the post-research connection failure	2026-04-14 03:49:33 +08:00
陈少杰	baf67dbd58	Trim the research phase before trusting profiling output The legacy path was already narrowed to market-only compact execution, but the research stage remained the slowest leg and the profiler lacked persistent raw event artifacts for comparison. This change further compresses the compact prompts for Bull Researcher, Bear Researcher, and Research Manager, adds durable raw event dumps to the graph profiler, and keeps profiling evidence out of the runtime contract itself. Constraint: No new dependencies and no runtime-contract pollution for profiling-only data Rejected: Add synthetic timing fields back into the subprocess protocol \| those timings are not real graph-stage boundaries and would mislead diagnosis Rejected: Skip raw event dump persistence and rely on console output \| makes multi-run comparison and regression tracking fragile Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep profiling as an external diagnostic surface; if stage timing ever enters contracts again, it must come from real graph boundaries Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q Tested: python -m compileall tradingagents/agents/researchers/bull_researcher.py tradingagents/agents/researchers/bear_researcher.py tradingagents/agents/managers/research_manager.py orchestrator/profile_stage_chain.py Tested: real provider profiling via orchestrator/profile_stage_chain.py with market-only compact settings; dump persisted to orchestrator/profile_runs/600519.SS_2026-04-10_20260413T184742Z.json Not-tested: browser/manual consumption of the persisted profiling dump	2026-04-14 02:51:07 +08:00
陈少杰	8a4f0ad540	Reduce the legacy execution path before profiling it for real The provider itself was healthy, but the legacy dashboard path still ran the heaviest graph shape by default and had no trustworthy stage profiling story. This change narrows the default legacy execution settings to the market-only compact path with conservative timeout/retry values, injects those settings through the unified request/runtime surface, and adds a standalone graph-update profiler so stage timing comes from real node completions rather than synthetic script labels. Constraint: Profiling evidence had to be grounded in the real provider path without adding new dependencies or polluting the runtime contract Rejected: Keep synthetic STAGE_TIMING in the subprocess protocol \| misattributes the heaviest work to the wrong phase and makes the profiling conclusion untrustworthy Rejected: Broaden the default legacy path and rely on longer timeouts \| raises cost and latency while obscuring the true bottleneck Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep operational profiling separate from runtime business contracts unless timings are sourced from real graph-stage boundaries Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q Tested: python -m compileall web_dashboard/backend orchestrator/profile_stage_chain.py Tested: real provider direct invoke returned OK against MiniMax anthropic-compatible endpoint Tested: real graph profiling via orchestrator/profile_stage_chain.py produced stage timings for 600519.SS on 2026-04-10 with selected_analysts=market and compact prompt Not-tested: legacy subprocess full end-to-end success case on the same provider path (current run still exits via protocol failure after upstream connection error)	2026-04-14 02:42:53 +08:00
陈少杰	eb2ab0afcf	Preserve diagnostics in live-mode failure payloads The previous hardening pass still dropped source diagnostics and data-quality context once live-mode serialized a dual-lane failure. Keep those fields when a structured CombinedSignalFailure reaches the websocket layer so consumers can distinguish provider mismatch, stale data, and other degraded cases even when no final signal exists. Constraint: Follow-on fix after `63858bf` should stay minimal and not reopen unrelated executor/calendar work Rejected: Fold this into a larger amend of the prior commit \| history is already shared and the delta is a single behavioral correction Confidence: high Scope-risk: narrow Reversibility: clean Directive: When failure exceptions carry structured diagnostics, live serializers must preserve them instead of flattening to a generic message Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py orchestrator/tests/test_market_calendar.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_application_service.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py -q Tested: python -m compileall orchestrator web_dashboard/backend Tested: npm run build (web_dashboard/frontend) Not-tested: real websocket consumers against provider-backed failure paths	2026-04-14 02:10:31 +08:00
陈少杰	a4def7aff9	Harden executor configuration and failure contracts before further rollout The rollout-ready branch still conflated dashboard auth with provider credentials, discarded diagnostics when both signal lanes degraded, and treated RESULT_META as optional even though downstream contracts now depend on it. This change separates provider runtime settings from request auth, preserves source diagnostics/data quality in full-failure contracts, requires RESULT_META in the subprocess protocol, and moves A-share holidays into an updateable calendar data source. Constraint: No external market-calendar dependency is available in env312 and dependency policy forbids adding one casually Rejected: Keep reading provider keys from request headers \| couples dashboard auth to execution and breaks non-anthropic providers Rejected: Leave both-signals-unavailable as a bare ValueError \| loses diagnostics before live/backend contracts can serialize them Rejected: Keep A-share holidays embedded in Python constants \| requires code edits every year and preserves the stopgap design Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep subprocess protocol fields explicit and fail closed when RESULT_META is missing; do not route provider credentials through dashboard auth again Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py orchestrator/tests/test_market_calendar.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_application_service.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py -q Tested: python -m compileall orchestrator web_dashboard/backend Not-tested: real provider-backed execution across openai/google providers Not-tested: browser/manual verification beyond existing frontend contract consumers	2026-04-14 01:54:44 +08:00
陈少杰	11cbb7ce85	Carry Phase 4 rollout-readiness work back into the mainline safely Team execution produced recoverable commits for market-holiday handling, live websocket contracts, regression coverage, and the remaining frontend contract-view polish. Recover those changes into main without waiting for terminal team shutdown, preserving the verified payload semantics while avoiding the worker auto-checkpoint noise. Constraint: Team workers were still in progress, so recovery had to avoid destructive shutdown and ignore the worker-3 uv.lock churn Rejected: Wait for terminal shutdown before recovery \| unnecessary delay once commits were already recoverable and verified Rejected: Cherry-pick worker-3 checkpoint wholesale \| would import unrelated uv.lock churn into main Confidence: high Scope-risk: moderate Reversibility: clean Directive: Treat team INTEGRATED mailbox messages as hints only; always inspect snapshot refs/worktrees before claiming the leader actually merged code Tested: python -m pytest orchestrator/tests/test_market_calendar.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_application_service.py orchestrator/tests/test_live_mode.py web_dashboard/backend/tests/test_api_smoke.py -q Tested: python -m compileall orchestrator web_dashboard/backend Tested: npm run build (web_dashboard/frontend) Not-tested: final team terminal completion after recovery Not-tested: real websocket clients or live provider-backed market holiday sessions	2026-04-14 01:15:18 +08:00
陈少杰	7cd9c4617a	Expose data-quality semantics before rolling contract-first further Phase 3 adds concrete data-quality states to the contract surface so weekend runs, stale market data, partial payloads, and provider/config mismatches stop collapsing into generic success or failure. The backend now carries those diagnostics from quant/llm runners through the legacy executor contract, while the frontend reads decision/confidence fields from result or compat instead of assuming legacy top-level payloads. Constraint: existing recommendation/task files and current dashboard routes must remain readable during migration Rejected: infer data quality only in the service layer \| loses source-specific evidence and violates the executor/orchestrator boundary Rejected: leave frontend on top-level decision fields \| breaks as soon as contract-first payloads become the default Confidence: high Scope-risk: moderate Reversibility: clean Directive: keep new data-quality states explicit in contract metadata and route all UI reads through result/compat helpers Tested: python -m pytest orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_signals.py orchestrator/tests/test_application_service.py orchestrator/tests/test_trading_graph_config.py web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py web_dashboard/backend/tests/test_main_api.py web_dashboard/backend/tests/test_portfolio_api.py -q Tested: python -m compileall orchestrator tradingagents web_dashboard/backend Tested: npm run build (web_dashboard/frontend) Not-tested: real exchange holiday calendars beyond weekend detection Not-tested: real provider-backed end-to-end runs for provider_mismatch and stale-data scenarios	2026-04-14 00:37:35 +08:00
陈少杰	b6e57d01e3	Stabilize TradingAgents contracts so orchestration and dashboard can converge This change set introduces a versioned result contract, shared config schema/loading, provider/data adapter seams, and a no-strategy application-service skeleton so the current research graph, orchestrator layer, and dashboard backend stop drifting further apart. It also keeps the earlier MiniMax compatibility and compact-prompt work aligned with the new contract shape and extends regression coverage so degradation, fallback, and service migration remain testable during the next phases. Constraint: Must preserve existing FastAPI entrypoints and fallback behavior while introducing an application-service seam Constraint: Must not turn application service into a new strategy or learning layer Rejected: Full backend rewrite to service-only execution now \| too risky before contract and fallback paths stabilize Rejected: Leave provider/data/config logic distributed across scripts and endpoints \| continues boundary drift and weakens verification Confidence: high Scope-risk: broad Directive: Keep future application-service changes orchestration-only; move any scoring, signal fusion, or learning logic to orchestrator or tradingagents instead Tested: python -m compileall orchestrator tradingagents web_dashboard/backend Tested: python -m pytest orchestrator/tests/test_signals.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_contract_v1alpha1.py orchestrator/tests/test_application_service.py orchestrator/tests/test_provider_adapter.py web_dashboard/backend/tests/test_main_api.py web_dashboard/backend/tests/test_portfolio_api.py web_dashboard/backend/tests/test_api_smoke.py web_dashboard/backend/tests/test_services_migration.py -q Not-tested: live MiniMax/provider execution against external services Not-tested: full dashboard/manual websocket flow against a running frontend Not-tested: omx team runtime end-to-end in the primary workspace	2026-04-13 17:25:07 +08:00
陈少杰	b50e5b4725	fix(review): hmac.compare_digest for API key, ws/orchestrator auth, SignalMerger per-signal cap logic	2026-04-09 23:00:20 +08:00
陈少杰	28a95f34a7	fix(review): api_key→anthropic_key bug, sync-in-async event loop block, orchestrator per-message re-init, dead code cleanup	2026-04-09 22:55:36 +08:00
陈少杰	ce2e6d32cc	feat(orchestrator): example scripts for backtest and live mode	2026-04-09 22:12:02 +08:00
陈少杰	480f0299b0	feat(orchestrator): LiveMode + /ws/orchestrator WebSocket endpoint	2026-04-09 22:10:15 +08:00
陈少杰	724c447720	feat(orchestrator): BacktestMode for historical signal collection	2026-04-09 22:09:38 +08:00
陈少杰	928f069184	test(orchestrator): unit tests for SignalMerger, LLMRunner._map_rating, QuantRunner._calc_confidence	2026-04-09 22:07:21 +08:00
陈少杰	14191abc29	feat(orchestrator): TradingOrchestrator main class with get_combined_signal	2026-04-09 22:05:03 +08:00
陈少杰	ba3297a696	fix(llm_runner): use stored direction/confidence on cache hit, sanitize ticker path	2026-04-09 22:03:17 +08:00
陈少杰	852b6c98e3	feat(orchestrator): implement LLMRunner with lazy graph init and JSON cache	2026-04-09 21:58:38 +08:00
陈少杰	29aae4bb18	feat(orchestrator): implement LLMRunner with caching and rating mapping	2026-04-09 21:54:48 +08:00
陈少杰	30d8f90467	fix(quant_runner): fix 3 critical issues and 2 important improvements - Critical 1: initialize orders=[] before loop to prevent NameError when df is empty - Critical 2: replace bare sqlite3 conn with context manager (with statement) in get_signal - Critical 3: remove ticker param from _load_best_params (table has no ticker col, params are global) - Important: extract db_path as self._db_path attribute in __init__ (DRY) - Important: add comment explaining lazy imports require sys.path set in __init__	2026-04-09 21:51:38 +08:00
陈少杰	7a03c29330	feat(orchestrator): implement QuantRunner with BollingerStrategy signal generation	2026-04-09 21:44:34 +08:00
陈少杰	dacb3316fa	fix(orchestrator): code quality fixes in config and signals - config: remove hardcoded absolute path for quant_backtest_path (now empty string) - config: add llm_solo_penalty (0.7) and quant_solo_penalty (0.8) fields - signals: SignalMerger now accepts OrchestratorConfig in __init__ - signals: use config.llm_solo_penalty / quant_solo_penalty instead of magic numbers - signals: apply quant_weight_cap / llm_weight_cap as confidence upper bounds - signals: both-None branch raises ValueError instead of returning ticker="" - signals: replace assert with explicit ValueError for llm-None-when-quant-None - signals: replace datetime.utcnow() with datetime.now(timezone.utc)	2026-04-09 21:39:23 +08:00
陈少杰	56dc76d44a	feat(orchestrator): add signals.py and config.py - Signal / FinalSignal dataclasses - SignalMerger with weighted merge, single-track fallbacks, and cancel-out HOLD - OrchestratorConfig with all required fields	2026-04-09 21:35:31 +08:00

28 Commits