Commit Graph

28 Commits

Author SHA1 Message Date
陈少杰 eda9980729 feat(orchestrator): add comprehensive provider and timeout validation
Add three layers of configuration validation to LLMRunner:

1. Provider × base_url matrix validation
   - Validates all 6 providers (anthropic, openai, google, xai, ollama, openrouter)
   - Uses precompiled regex patterns for efficiency
   - Detects mismatches before expensive graph initialization

2. Timeout configuration validation
   - Warns when analyst/research timeouts may be insufficient
   - Provides recommendations based on analyst count (1-4)
   - Non-blocking warnings logged at init time

3. Enhanced error classification
   - Distinguishes provider_mismatch from provider_auth_failed
   - Uses heuristic detection for auth failures
   - Simplified nested ternary expressions for readability

Improvements:
- Validation runs before cache check (prevents stale cache on config errors)
- EAFP pattern for cache reading (avoids the TOCTOU race of a check-then-read)
- Precompiled regex patterns (avoid recompilation overhead)
- All 21 unit tests passing

Documentation:
- docs/architecture/orchestrator-validation.md - complete validation guide
- orchestrator/examples/validation_examples.py - runnable examples

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 11:43:19 +08:00
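
Illustrative note on the provider × base_url check described in the commit above: a minimal sketch, assuming a lookup table of precompiled patterns keyed by provider. The real matrix in LLMRunner is not shown in this log, so the patterns, the function name `check_provider_base_url`, and the message format are hypothetical. The MiniMax Anthropic-compatible endpoint is accepted here only because it appears in a profiling command elsewhere in this history.

```python
import re

# Hypothetical patterns, compiled once at import time so repeated
# validation calls do not pay the compilation cost again.
_PROVIDER_URL_PATTERNS = {
    "anthropic": re.compile(
        r"^https://(api\.anthropic\.com|api\.minimaxi\.com/anthropic)(/|$)"
    ),
    "openai": re.compile(r"^https://api\.openai\.com(/|$)"),
    "ollama": re.compile(r"^https?://(localhost|127\.0\.0\.1)(:\d+)?(/|$)"),
}


def check_provider_base_url(provider, base_url):
    """Return None if the pair looks consistent, else a provider_mismatch message."""
    if base_url is None:
        # Assumption: no override means the provider's default endpoint is used.
        return None
    pattern = _PROVIDER_URL_PATTERNS.get(provider)
    if pattern is None:
        return f"provider_mismatch: unknown provider {provider!r}"
    if not pattern.search(base_url):
        return f"provider_mismatch: {base_url!r} does not look like a {provider} endpoint"
    return None
```

Running a check like this before graph construction means a misconfigured pair is reported without paying for initialization first.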
陈少杰 64e3583f66 Unify research provenance extraction and persist it into state logs
The earlier Phase 1-4 recovery left one unique worker-1 slice unrecovered: provenance extraction logic was still duplicated in the runner and the full-state log path still dropped the structured research fields. This change centralizes provenance extraction in agent state helpers, reuses it from the LLM runner, and writes the same structured fields into TradingAgents full-state logs with focused regression tests.

Constraint: Preserve the existing debate-string output shape while making provenance reuse consistent across runner and state-log surfaces
Rejected: Cherry-pick worker-1 auto-checkpoint wholesale | it mixed duplicate A/B files and uv.lock churn with the useful provenance helper changes
Confidence: high
Scope-risk: narrow
Directive: Keep research provenance extraction centralized; new consumers should call the helper instead of re-listing field names by hand
Tested: python -m pytest -q tradingagents/tests/test_research_guard.py orchestrator/tests/test_trading_graph_config.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_profile_stage_chain.py orchestrator/tests/test_profile_ab.py orchestrator/tests/test_contract_v1alpha1.py orchestrator/tests/test_live_mode.py
Tested: python -m compileall tradingagents/agents/utils/agent_states.py tradingagents/graph/trading_graph.py orchestrator/llm_runner.py orchestrator/tests/test_trading_graph_config.py tradingagents/tests/test_research_guard.py
Not-tested: Live-provider end-to-end analysis run that emits a new full_states_log file
2026-04-14 13:34:25 +08:00
陈少杰 8c6da22f4f Finish the A/B harness recovery without leaving conflict markers behind
The worker-4 recovery brought in the trace-summary helper split and A/B harness updates, but the cherry-pick left conflict markers around build_trace_payload in profile_stage_chain.py. This follow-up keeps the merged import-based shape and records the cleanup as a standalone reversible step.

Constraint: Preserve the recovered trace payload shape while removing only the cherry-pick residue
Rejected: Re-run the cherry-pick from scratch | unnecessary after the resolved file already passed targeted verification
Confidence: high
Scope-risk: narrow
Directive: If profile_stage_chain.py is touched again, verify the file is marker-free before running compile/test to avoid silent recovery drift
Tested: python -m pytest -q orchestrator/tests/test_contract_v1alpha1.py tradingagents/tests/test_research_guard.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_profile_stage_chain.py orchestrator/tests/test_profile_ab.py; python -m orchestrator.profile_stage_chain --help; python -m compileall orchestrator/profile_stage_chain.py orchestrator/profile_trace_utils.py orchestrator/profile_ab.py orchestrator/tests/test_profile_ab.py tradingagents/tests/test_research_guard.py
Not-tested: Live-provider end-to-end profile_ab comparison on real traces
2026-04-14 05:15:21 +08:00
陈少杰 d34ad8d3ef omx(team): auto-checkpoint worker-4 [unknown] 2026-04-14 05:14:01 +08:00
陈少杰 a81f825203 Make A/B trace comparisons easier to trust during profiling
The minimal offline harness now carries forward source-file and trace-schema
metadata, and it can break ties using error counts instead of only elapsed
runtime and degraded-research totals. This keeps Phase 1-4 profile comparisons
self-describing when multiple dumps are aggregated.

Constraint: Keep the harness offline and avoid changing the default runtime path
Rejected: Add a live dual-run executor | would couple profiling to external LLM calls and increase risk
Confidence: high
Scope-risk: narrow
Directive: Preserve the trace dump shape as the source of truth for future comparison tooling
Tested: uv run python inline assertions for orchestrator.tests.test_profile_ab
Tested: uv run python CLI smoke test for orchestrator.profile_ab with temp traces
Tested: uv run python -m compileall orchestrator/profile_stage_chain.py orchestrator/profile_trace_utils.py orchestrator/profile_ab.py orchestrator/tests/test_profile_ab.py
2026-04-14 05:12:13 +08:00
陈少杰 909519ff17 omx(team): auto-checkpoint worker-2 [unknown] 2026-04-14 04:47:52 +08:00
陈少杰 addc4a1e9c Keep research degradation visible while bounding researcher nodes
Research provenance now rides with the debate state, cache metadata, live payloads, and trace dumps so degraded research no longer masquerades as a normal sample. Bull/Bear/Manager nodes also return explicit guarded fallbacks on timeout or exception, which gives the graph a real node budget boundary without rewriting the bull/bear output shape or removing debate.

Constraint: Must preserve bull/bear debate structure and output shape while adding provenance and node guards
Rejected: Skip bull/bear debate in compact mode | would trade away analysis quality before A/B evidence exists
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Treat research_status and data_quality as rollout gates; do not collapse degraded research back into normal success samples
Tested: python -m pytest tradingagents/tests/test_research_guard.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_live_mode.py web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q; python -m compileall tradingagents/graph/setup.py tradingagents/agents/utils/agent_states.py tradingagents/graph/propagation.py orchestrator/llm_runner.py orchestrator/live_mode.py orchestrator/profile_stage_chain.py; python orchestrator/profile_stage_chain.py --ticker 600519.SS --date 2026-04-10 --provider anthropic --model MiniMax-M2.7-highspeed --base-url https://api.minimaxi.com/anthropic --selected-analysts market --analysis-prompt-style compact --timeout 45 --max-retries 0 --overall-timeout 120 --dump-raw-on-failure
Not-tested: Full successful live-provider completion through Portfolio Manager after the post-research connection failure
2026-04-14 03:49:33 +08:00
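
Illustrative note on the guarded fallbacks described in the commit above: a minimal sketch of wrapping a node so that a timeout or exception produces an explicit degraded result instead of propagating. Only the field name `research_status` comes from the commit message; the wrapper name `guarded_node`, the `"ok"`/`"degraded"` values, and the `output`/`reason` keys are assumptions, and the project's actual guard may work differently.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def guarded_node(node_fn, timeout_s, fallback_text):
    """Wrap node_fn so timeouts and exceptions yield an explicit degraded result."""

    def wrapper(state):
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(node_fn, state)
        try:
            result = dict(future.result(timeout=timeout_s))
            result.setdefault("research_status", "ok")
            return result
        except FutureTimeout:
            # The worker thread is not killed; it simply stops being waited on.
            return {"output": fallback_text, "research_status": "degraded",
                    "reason": "timeout"}
        except Exception as exc:
            return {"output": fallback_text, "research_status": "degraded",
                    "reason": type(exc).__name__}
        finally:
            pool.shutdown(wait=False)

    return wrapper
```

Because the fallback keeps the same top-level shape as a normal result, downstream consumers can gate on `research_status` rather than on missing keys.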
陈少杰 baf67dbd58 Trim the research phase before trusting profiling output
The legacy path was already narrowed to market-only compact execution, but the research stage remained the slowest leg and the profiler lacked persistent raw event artifacts for comparison. This change further compresses the compact prompts for Bull Researcher, Bear Researcher, and Research Manager, adds durable raw event dumps to the graph profiler, and keeps profiling evidence out of the runtime contract itself.

Constraint: No new dependencies and no runtime-contract pollution for profiling-only data
Rejected: Add synthetic timing fields back into the subprocess protocol | those timings are not real graph-stage boundaries and would mislead diagnosis
Rejected: Skip raw event dump persistence and rely on console output | makes multi-run comparison and regression tracking fragile
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep profiling as an external diagnostic surface; if stage timing ever enters contracts again, it must come from real graph boundaries
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q
Tested: python -m compileall tradingagents/agents/researchers/bull_researcher.py tradingagents/agents/researchers/bear_researcher.py tradingagents/agents/managers/research_manager.py orchestrator/profile_stage_chain.py
Tested: real provider profiling via orchestrator/profile_stage_chain.py with market-only compact settings; dump persisted to orchestrator/profile_runs/600519.SS_2026-04-10_20260413T184742Z.json
Not-tested: browser/manual consumption of the persisted profiling dump
2026-04-14 02:51:07 +08:00
陈少杰 8a4f0ad540 Reduce the legacy execution path before profiling it for real
The provider itself was healthy, but the legacy dashboard path still ran the heaviest graph shape by default and had no trustworthy stage profiling story. This change narrows the default legacy execution settings to the market-only compact path with conservative timeout/retry values, injects those settings through the unified request/runtime surface, and adds a standalone graph-update profiler so stage timing comes from real node completions rather than synthetic script labels.

Constraint: Profiling evidence had to be grounded in the real provider path without adding new dependencies or polluting the runtime contract
Rejected: Keep synthetic STAGE_TIMING in the subprocess protocol | misattributes the heaviest work to the wrong phase and makes the profiling conclusion untrustworthy
Rejected: Broaden the default legacy path and rely on longer timeouts | raises cost and latency while obscuring the true bottleneck
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep operational profiling separate from runtime business contracts unless timings are sourced from real graph-stage boundaries
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q
Tested: python -m compileall web_dashboard/backend orchestrator/profile_stage_chain.py
Tested: real provider direct invoke returned OK against MiniMax anthropic-compatible endpoint
Tested: real graph profiling via orchestrator/profile_stage_chain.py produced stage timings for 600519.SS on 2026-04-10 with selected_analysts=market and compact prompt
Not-tested: legacy subprocess full end-to-end success case on the same provider path (current run still exits via protocol failure after upstream connection error)
2026-04-14 02:42:53 +08:00
陈少杰 eb2ab0afcf Preserve diagnostics in live-mode failure payloads
The previous hardening pass still dropped source diagnostics and data-quality context once live-mode serialized a dual-lane failure. Keep those fields when a structured CombinedSignalFailure reaches the websocket layer so consumers can distinguish provider mismatch, stale data, and other degraded cases even when no final signal exists.

Constraint: Follow-on fix after 63858bf should stay minimal and not reopen unrelated executor/calendar work
Rejected: Fold this into a larger amend of the prior commit | history is already shared and the delta is a single behavioral correction
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When failure exceptions carry structured diagnostics, live serializers must preserve them instead of flattening to a generic message
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py orchestrator/tests/test_market_calendar.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_application_service.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py -q
Tested: python -m compileall orchestrator web_dashboard/backend
Tested: npm run build (web_dashboard/frontend)
Not-tested: real websocket consumers against provider-backed failure paths
2026-04-14 02:10:31 +08:00
陈少杰 a4def7aff9 Harden executor configuration and failure contracts before further rollout
The rollout-ready branch still conflated dashboard auth with provider credentials, discarded diagnostics when both signal lanes degraded, and treated RESULT_META as optional even though downstream contracts now depend on it. This change separates provider runtime settings from request auth, preserves source diagnostics/data quality in full-failure contracts, requires RESULT_META in the subprocess protocol, and moves A-share holidays into an updateable calendar data source.

Constraint: No external market-calendar dependency is available in env312 and dependency policy forbids adding one casually
Rejected: Keep reading provider keys from request headers | couples dashboard auth to execution and breaks non-anthropic providers
Rejected: Leave both-signals-unavailable as a bare ValueError | loses diagnostics before live/backend contracts can serialize them
Rejected: Keep A-share holidays embedded in Python constants | requires code edits every year and preserves the stopgap design
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep subprocess protocol fields explicit and fail closed when RESULT_META is missing; do not route provider credentials through dashboard auth again
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py orchestrator/tests/test_market_calendar.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_application_service.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py -q
Tested: python -m compileall orchestrator web_dashboard/backend
Not-tested: real provider-backed execution across openai/google providers
Not-tested: browser/manual verification beyond existing frontend contract consumers
2026-04-14 01:54:44 +08:00
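
Illustrative note on the "fail closed when RESULT_META is missing" directive in the commit above: a minimal sketch, assuming the subprocess writes a single stdout line of the form `RESULT_META:` followed by a JSON object. The wire format is not shown in this log, so the marker syntax, `ProtocolError`, and `parse_result_meta` are all hypothetical.

```python
import json


class ProtocolError(RuntimeError):
    """Raised when subprocess output violates the (assumed) protocol."""


def parse_result_meta(stdout, marker="RESULT_META:"):
    """Return the RESULT_META payload, or raise instead of guessing a default."""
    for line in stdout.splitlines():
        if line.startswith(marker):
            try:
                meta = json.loads(line[len(marker):])
            except json.JSONDecodeError as exc:
                raise ProtocolError("RESULT_META is not valid JSON") from exc
            if not isinstance(meta, dict):
                raise ProtocolError("RESULT_META must be a JSON object")
            return meta
    # Fail closed: a missing field is a protocol failure, not an empty dict.
    raise ProtocolError("RESULT_META missing from subprocess output")
```

Raising here, rather than returning `{}`, keeps a silent protocol regression from being mistaken for a result with no metadata.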
陈少杰 11cbb7ce85 Carry Phase 4 rollout-readiness work back into the mainline safely
Team execution produced recoverable commits for market-holiday handling, live websocket contracts, regression coverage, and the remaining frontend contract-view polish. Recover those changes into main without waiting for terminal team shutdown, preserving the verified payload semantics while avoiding the worker auto-checkpoint noise.

Constraint: Team workers were still in progress, so recovery had to avoid destructive shutdown and ignore the worker-3 uv.lock churn
Rejected: Wait for terminal shutdown before recovery | unnecessary delay once commits were already recoverable and verified
Rejected: Cherry-pick worker-3 checkpoint wholesale | would import unrelated uv.lock churn into main
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Treat team INTEGRATED mailbox messages as hints only; always inspect snapshot refs/worktrees before claiming the leader actually merged code
Tested: python -m pytest orchestrator/tests/test_market_calendar.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_application_service.py orchestrator/tests/test_live_mode.py web_dashboard/backend/tests/test_api_smoke.py -q
Tested: python -m compileall orchestrator web_dashboard/backend
Tested: npm run build (web_dashboard/frontend)
Not-tested: final team terminal completion after recovery
Not-tested: real websocket clients or live provider-backed market holiday sessions
2026-04-14 01:15:18 +08:00
陈少杰 7cd9c4617a Expose data-quality semantics before rolling contract-first further
Phase 3 adds concrete data-quality states to the contract surface so weekend runs, stale market data, partial payloads, and provider/config mismatches stop collapsing into generic success or failure. The backend now carries those diagnostics from quant/llm runners through the legacy executor contract, while the frontend reads decision/confidence fields from result or compat instead of assuming legacy top-level payloads.

Constraint: existing recommendation/task files and current dashboard routes must remain readable during migration
Rejected: infer data quality only in the service layer | loses source-specific evidence and violates the executor/orchestrator boundary
Rejected: leave frontend on top-level decision fields | breaks as soon as contract-first payloads become the default
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: keep new data-quality states explicit in contract metadata and route all UI reads through result/compat helpers
Tested: python -m pytest orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_signals.py orchestrator/tests/test_application_service.py orchestrator/tests/test_trading_graph_config.py web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py web_dashboard/backend/tests/test_main_api.py web_dashboard/backend/tests/test_portfolio_api.py -q
Tested: python -m compileall orchestrator tradingagents web_dashboard/backend
Tested: npm run build (web_dashboard/frontend)
Not-tested: real exchange holiday calendars beyond weekend detection
Not-tested: real provider-backed end-to-end runs for provider_mismatch and stale-data scenarios
2026-04-14 00:37:35 +08:00
陈少杰 b6e57d01e3 Stabilize TradingAgents contracts so orchestration and dashboard can converge
This change set introduces a versioned result contract, shared config schema/loading, provider/data adapter seams, and a no-strategy application-service skeleton so the current research graph, orchestrator layer, and dashboard backend stop drifting further apart. It also keeps the earlier MiniMax compatibility and compact-prompt work aligned with the new contract shape and extends regression coverage so degradation, fallback, and service migration remain testable during the next phases.

Constraint: Must preserve existing FastAPI entrypoints and fallback behavior while introducing an application-service seam
Constraint: Must not turn application service into a new strategy or learning layer
Rejected: Full backend rewrite to service-only execution now | too risky before contract and fallback paths stabilize
Rejected: Leave provider/data/config logic distributed across scripts and endpoints | continues boundary drift and weakens verification
Confidence: high
Scope-risk: broad
Directive: Keep future application-service changes orchestration-only; move any scoring, signal fusion, or learning logic to orchestrator or tradingagents instead
Tested: python -m compileall orchestrator tradingagents web_dashboard/backend
Tested: python -m pytest orchestrator/tests/test_signals.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_contract_v1alpha1.py orchestrator/tests/test_application_service.py orchestrator/tests/test_provider_adapter.py web_dashboard/backend/tests/test_main_api.py web_dashboard/backend/tests/test_portfolio_api.py web_dashboard/backend/tests/test_api_smoke.py web_dashboard/backend/tests/test_services_migration.py -q
Not-tested: live MiniMax/provider execution against external services
Not-tested: full dashboard/manual websocket flow against a running frontend
Not-tested: omx team runtime end-to-end in the primary workspace
2026-04-13 17:25:07 +08:00
陈少杰 b50e5b4725 fix(review): hmac.compare_digest for API key, ws/orchestrator auth, SignalMerger per-signal cap logic 2026-04-09 23:00:20 +08:00
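
Illustrative note on the `hmac.compare_digest` fix in the commit above: a minimal sketch of a constant-time API key comparison. The helper name `api_key_matches` is hypothetical, and how the project obtains the expected key is not shown in this log.

```python
import hmac


def api_key_matches(provided, expected):
    """Compare two API keys without leaking the mismatch position via timing."""
    if not provided or not expected:
        return False
    # Encode first: compare_digest rejects str arguments with non-ASCII characters.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

A plain `==` comparison can return as soon as the first differing character is found, which is the timing signal `compare_digest` is designed to avoid.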
陈少杰 28a95f34a7 fix(review): api_key→anthropic_key bug, sync-in-async event loop block, orchestrator per-message re-init, dead code cleanup 2026-04-09 22:55:36 +08:00
陈少杰 ce2e6d32cc feat(orchestrator): example scripts for backtest and live mode 2026-04-09 22:12:02 +08:00
陈少杰 480f0299b0 feat(orchestrator): LiveMode + /ws/orchestrator WebSocket endpoint 2026-04-09 22:10:15 +08:00
陈少杰 724c447720 feat(orchestrator): BacktestMode for historical signal collection 2026-04-09 22:09:38 +08:00
陈少杰 928f069184 test(orchestrator): unit tests for SignalMerger, LLMRunner._map_rating, QuantRunner._calc_confidence 2026-04-09 22:07:21 +08:00
陈少杰 14191abc29 feat(orchestrator): TradingOrchestrator main class with get_combined_signal 2026-04-09 22:05:03 +08:00
陈少杰 ba3297a696 fix(llm_runner): use stored direction/confidence on cache hit, sanitize ticker path 2026-04-09 22:03:17 +08:00
陈少杰 852b6c98e3 feat(orchestrator): implement LLMRunner with lazy graph init and JSON cache 2026-04-09 21:58:38 +08:00
陈少杰 29aae4bb18 feat(orchestrator): implement LLMRunner with caching and rating mapping 2026-04-09 21:54:48 +08:00
陈少杰 30d8f90467 fix(quant_runner): fix 3 critical issues and 2 important improvements
- Critical 1: initialize orders=[] before loop to prevent NameError when df is empty
- Critical 2: replace bare sqlite3 conn with context manager (with statement) in get_signal
- Critical 3: remove ticker param from _load_best_params (table has no ticker col, params are global)
- Important: extract db_path as self._db_path attribute in __init__ (DRY)
- Important: add comment explaining lazy imports require sys.path set in __init__
2026-04-09 21:51:38 +08:00
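
Illustrative note on the "Critical 2" item in the commit above: a minimal sketch of reading parameters through a managed connection. The table and column names (`best_params`, `window`, `num_std`) and both function names are hypothetical. One detail is worth knowing: `with conn:` on a `sqlite3` connection only commits or rolls back the transaction; it does not close the connection, so this sketch uses `contextlib.closing` to guarantee the close. Whether the project does the same is not shown in this log.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def load_best_params(db_path):
    """Read the most recent parameter row; the connection is always closed."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT window, num_std FROM best_params ORDER BY rowid DESC LIMIT 1"
        ).fetchone()
    return None if row is None else {"window": row[0], "num_std": row[1]}


def demo():
    """Build a throwaway database and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "params.db")
        with closing(sqlite3.connect(db_path)) as conn:
            conn.execute("CREATE TABLE best_params (window INTEGER, num_std REAL)")
            conn.execute("INSERT INTO best_params VALUES (20, 2.0)")
            conn.commit()
        return load_best_params(db_path)
```

The query takes no ticker argument, matching the commit's note that the parameters are global.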
陈少杰 7a03c29330 feat(orchestrator): implement QuantRunner with BollingerStrategy signal generation 2026-04-09 21:44:34 +08:00
陈少杰 dacb3316fa fix(orchestrator): code quality fixes in config and signals
- config: remove hardcoded absolute path for quant_backtest_path (now empty string)
- config: add llm_solo_penalty (0.7) and quant_solo_penalty (0.8) fields
- signals: SignalMerger now accepts OrchestratorConfig in __init__
- signals: use config.llm_solo_penalty / quant_solo_penalty instead of magic numbers
- signals: apply quant_weight_cap / llm_weight_cap as confidence upper bounds
- signals: both-None branch raises ValueError instead of returning ticker=""
- signals: replace assert with explicit ValueError for llm-None-when-quant-None
- signals: replace datetime.utcnow() with datetime.now(timezone.utc)
2026-04-09 21:39:23 +08:00
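
Illustrative note on the single-track fallback described in the commit above: a minimal sketch, assuming the solo penalty is applied first and the weight cap then bounds the result. The penalty values 0.7 and 0.8 come from the commit message; the cap values, the `direction` encoding, and the names `Config`, `Signal`, and `solo_fallback` are assumptions standing in for the real `OrchestratorConfig` and `SignalMerger`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Config:  # hypothetical stand-in for OrchestratorConfig
    llm_solo_penalty: float = 0.7    # value from the commit message
    quant_solo_penalty: float = 0.8  # value from the commit message
    llm_weight_cap: float = 0.9      # assumed value
    quant_weight_cap: float = 0.8    # assumed value


@dataclass
class Signal:
    ticker: str
    direction: int     # assumed encoding: +1 buy, -1 sell, 0 hold
    confidence: float


def solo_fallback(quant, llm, cfg):
    """Handle the case where exactly one of the two signals is available."""
    if quant is None and llm is None:
        # Explicit error instead of returning a signal with ticker="".
        raise ValueError("both signals unavailable")
    if quant is not None and llm is not None:
        raise ValueError("both signals present; use the weighted merge instead")
    if quant is None:
        src = llm
        conf = min(llm.confidence * cfg.llm_solo_penalty, cfg.llm_weight_cap)
    else:
        src = quant
        conf = min(quant.confidence * cfg.quant_solo_penalty, cfg.quant_weight_cap)
    return {
        "ticker": src.ticker,
        "direction": src.direction,
        "confidence": conf,
        # Timezone-aware replacement for the deprecated datetime.utcnow().
        "timestamp": datetime.now(timezone.utc),
    }
```

Passing the config in, rather than hardcoding 0.7 and 0.8, is what lets the penalties and caps be tuned without touching the merge logic.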
陈少杰 56dc76d44a feat(orchestrator): add signals.py and config.py
- Signal / FinalSignal dataclasses
- SignalMerger with weighted merge, single-track fallbacks, and cancel-out HOLD
- OrchestratorConfig with all required fields
2026-04-09 21:35:31 +08:00