The minimal offline harness now carries forward source-file and trace-schema
metadata, and it can break ties using error counts instead of only elapsed
runtime and degraded-research totals. This keeps Phase 1-4 profile comparisons
self-describing when multiple dumps are aggregated.
Constraint: Keep the harness offline and avoid changing the default runtime path
Rejected: Add a live dual-run executor | would couple profiling to external LLM calls and increase risk
Confidence: high
Scope-risk: narrow
Directive: Preserve the trace dump shape as the source of truth for future comparison tooling
Tested: uv run python inline assertions for orchestrator.tests.test_profile_ab
Tested: uv run python CLI smoke test for orchestrator.profile_ab with temp traces
Tested: uv run python -m compileall orchestrator/profile_stage_chain.py orchestrator/profile_trace_utils.py orchestrator/profile_ab.py orchestrator/tests/test_profile_ab.py
Research provenance now rides with the debate state, cache metadata, live payloads, and trace dumps so degraded research no longer masquerades as a normal sample. Bull/Bear/Manager nodes also return explicit guarded fallbacks on timeout or exception, which gives the graph a real node budget boundary without rewriting the bull/bear output shape or removing debate.\n\nConstraint: Must preserve bull/bear debate structure and output shape while adding provenance and node guards\nRejected: Skip bull/bear debate in compact mode | would trade away analysis quality before A/B evidence exists\nConfidence: high\nScope-risk: moderate\nReversibility: clean\nDirective: Treat research_status and data_quality as rollout gates; do not collapse degraded research back into normal success samples\nTested: python -m pytest tradingagents/tests/test_research_guard.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_live_mode.py web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q; python -m compileall tradingagents/graph/setup.py tradingagents/agents/utils/agent_states.py tradingagents/graph/propagation.py orchestrator/llm_runner.py orchestrator/live_mode.py orchestrator/profile_stage_chain.py; python orchestrator/profile_stage_chain.py --ticker 600519.SS --date 2026-04-10 --provider anthropic --model MiniMax-M2.7-highspeed --base-url https://api.minimaxi.com/anthropic --selected-analysts market --analysis-prompt-style compact --timeout 45 --max-retries 0 --overall-timeout 120 --dump-raw-on-failure\nNot-tested: Full successful live-provider completion through Portfolio Manager after the post-research connection failure
The legacy path was already narrowed to market-only compact execution, but the research stage remained the slowest leg and the profiler lacked persistent raw event artifacts for comparison. This change further compresses the compact prompts for Bull Researcher, Bear Researcher, and Research Manager, adds durable raw event dumps to the graph profiler, and keeps profiling evidence out of the runtime contract itself.
Constraint: No new dependencies and no runtime-contract pollution for profiling-only data
Rejected: Add synthetic timing fields back into the subprocess protocol | those timings are not real graph-stage boundaries and would mislead diagnosis
Rejected: Skip raw event dump persistence and rely on console output | makes multi-run comparison and regression tracking fragile
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep profiling as an external diagnostic surface; if stage timing ever enters contracts again, it must come from real graph boundaries
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q
Tested: python -m compileall tradingagents/agents/researchers/bull_researcher.py tradingagents/agents/researchers/bear_researcher.py tradingagents/agents/managers/research_manager.py orchestrator/profile_stage_chain.py
Tested: real provider profiling via orchestrator/profile_stage_chain.py with market-only compact settings; dump persisted to orchestrator/profile_runs/600519.SS_2026-04-10_20260413T184742Z.json
Not-tested: browser/manual consumption of the persisted profiling dump
The provider itself was healthy, but the legacy dashboard path still ran the heaviest graph shape by default and had no trustworthy stage profiling story. This change narrows the default legacy execution settings to the market-only compact path with conservative timeout/retry values, injects those settings through the unified request/runtime surface, and adds a standalone graph-update profiler so stage timing comes from real node completions rather than synthetic script labels.
Constraint: Profiling evidence had to be grounded in the real provider path without adding new dependencies or polluting the runtime contract
Rejected: Keep synthetic STAGE_TIMING in the subprocess protocol | misattributes the heaviest work to the wrong phase and makes the profiling conclusion untrustworthy
Rejected: Broaden the default legacy path and rely on longer timeouts | raises cost and latency while obscuring the true bottleneck
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep operational profiling separate from runtime business contracts unless timings are sourced from real graph-stage boundaries
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py -q
Tested: python -m compileall web_dashboard/backend orchestrator/profile_stage_chain.py
Tested: real provider direct invoke returned OK against MiniMax anthropic-compatible endpoint
Tested: real graph profiling via orchestrator/profile_stage_chain.py produced stage timings for 600519.SS on 2026-04-10 with selected_analysts=market and compact prompt
Not-tested: legacy subprocess full end-to-end success case on the same provider path (current run still exits via protocol failure after upstream connection error)
The previous hardening pass still dropped source diagnostics and data-quality context once live-mode serialized a dual-lane failure. Keep those fields when a structured CombinedSignalFailure reaches the websocket layer so consumers can distinguish provider mismatch, stale data, and other degraded cases even when no final signal exists.
Constraint: Follow-on fix after 63858bf should stay minimal and not reopen unrelated executor/calendar work
Rejected: Fold this into a larger amend of the prior commit | history is already shared and the delta is a single behavioral correction
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When failure exceptions carry structured diagnostics, live serializers must preserve them instead of flattening to a generic message
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py orchestrator/tests/test_market_calendar.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_application_service.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py -q
Tested: python -m compileall orchestrator web_dashboard/backend
Tested: npm run build (web_dashboard/frontend)
Not-tested: real websocket consumers against provider-backed failure paths
The rollout-ready branch still conflated dashboard auth with provider credentials, discarded diagnostics when both signal lanes degraded, and treated RESULT_META as optional even though downstream contracts now depend on it. This change separates provider runtime settings from request auth, preserves source diagnostics/data quality in full-failure contracts, requires RESULT_META in the subprocess protocol, and moves A-share holidays into an updateable calendar data source.
Constraint: No external market-calendar dependency is available in env312 and dependency policy forbids adding one casually
Rejected: Keep reading provider keys from request headers | couples dashboard auth to execution and breaks non-anthropic providers
Rejected: Leave both-signals-unavailable as a bare ValueError | loses diagnostics before live/backend contracts can serialize them
Rejected: Keep A-share holidays embedded in Python constants | requires code edits every year and preserves the stopgap design
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep subprocess protocol fields explicit and fail closed when RESULT_META is missing; do not route provider credentials through dashboard auth again
Tested: python -m pytest web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py orchestrator/tests/test_market_calendar.py orchestrator/tests/test_live_mode.py orchestrator/tests/test_application_service.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py -q
Tested: python -m compileall orchestrator web_dashboard/backend
Not-tested: real provider-backed execution across openai/google providers
Not-tested: browser/manual verification beyond existing frontend contract consumers
Team execution produced recoverable commits for market-holiday handling, live websocket contracts, regression coverage, and the remaining frontend contract-view polish. Recover those changes into main without waiting for terminal team shutdown, preserving the verified payload semantics while avoiding the worker auto-checkpoint noise.
Constraint: Team workers were still in progress, so recovery had to avoid destructive shutdown and ignore the worker-3 uv.lock churn
Rejected: Wait for terminal shutdown before recovery | unnecessary delay once commits were already recoverable and verified
Rejected: Cherry-pick worker-3 checkpoint wholesale | would import unrelated uv.lock churn into main
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Treat team INTEGRATED mailbox messages as hints only; always inspect snapshot refs/worktrees before claiming the leader actually merged code
Tested: python -m pytest orchestrator/tests/test_market_calendar.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_application_service.py orchestrator/tests/test_live_mode.py web_dashboard/backend/tests/test_api_smoke.py -q
Tested: python -m compileall orchestrator web_dashboard/backend
Tested: npm run build (web_dashboard/frontend)
Not-tested: final team terminal completion after recovery
Not-tested: real websocket clients or live provider-backed market holiday sessions
Phase 3 adds concrete data-quality states to the contract surface so weekend runs, stale market data, partial payloads, and provider/config mismatches stop collapsing into generic success or failure. The backend now carries those diagnostics from quant/llm runners through the legacy executor contract, while the frontend reads decision/confidence fields from result or compat instead of assuming legacy top-level payloads.
Constraint: existing recommendation/task files and current dashboard routes must remain readable during migration
Rejected: infer data quality only in the service layer | loses source-specific evidence and violates the executor/orchestrator boundary
Rejected: leave frontend on top-level decision fields | breaks as soon as contract-first payloads become the default
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: keep new data-quality states explicit in contract metadata and route all UI reads through result/compat helpers
Tested: python -m pytest orchestrator/tests/test_quant_runner.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_signals.py orchestrator/tests/test_application_service.py orchestrator/tests/test_trading_graph_config.py web_dashboard/backend/tests/test_executors.py web_dashboard/backend/tests/test_services_migration.py web_dashboard/backend/tests/test_api_smoke.py web_dashboard/backend/tests/test_main_api.py web_dashboard/backend/tests/test_portfolio_api.py -q
Tested: python -m compileall orchestrator tradingagents web_dashboard/backend
Tested: npm run build (web_dashboard/frontend)
Not-tested: real exchange holiday calendars beyond weekend detection
Not-tested: real provider-backed end-to-end runs for provider_mismatch and stale-data scenarios
This change set introduces a versioned result contract, shared config schema/loading, provider/data adapter seams, and a no-strategy application-service skeleton so the current research graph, orchestrator layer, and dashboard backend stop drifting further apart. It also keeps the earlier MiniMax compatibility and compact-prompt work aligned with the new contract shape and extends regression coverage so degradation, fallback, and service migration remain testable during the next phases.
Constraint: Must preserve existing FastAPI entrypoints and fallback behavior while introducing an application-service seam
Constraint: Must not turn application service into a new strategy or learning layer
Rejected: Full backend rewrite to service-only execution now | too risky before contract and fallback paths stabilize
Rejected: Leave provider/data/config logic distributed across scripts and endpoints | continues boundary drift and weakens verification
Confidence: high
Scope-risk: broad
Directive: Keep future application-service changes orchestration-only; move any scoring, signal fusion, or learning logic to orchestrator or tradingagents instead
Tested: python -m compileall orchestrator tradingagents web_dashboard/backend
Tested: python -m pytest orchestrator/tests/test_signals.py orchestrator/tests/test_llm_runner.py orchestrator/tests/test_quant_runner.py orchestrator/tests/test_contract_v1alpha1.py orchestrator/tests/test_application_service.py orchestrator/tests/test_provider_adapter.py web_dashboard/backend/tests/test_main_api.py web_dashboard/backend/tests/test_portfolio_api.py web_dashboard/backend/tests/test_api_smoke.py web_dashboard/backend/tests/test_services_migration.py -q
Not-tested: live MiniMax/provider execution against external services
Not-tested: full dashboard/manual websocket flow against a running frontend
Not-tested: omx team runtime end-to-end in the primary workspace
- Critical 1: initialize orders=[] before loop to prevent NameError when df is empty
- Critical 2: replace bare sqlite3 conn with context manager (with statement) in get_signal
- Critical 3: remove ticker param from _load_best_params (table has no ticker col, params are global)
- Important: extract db_path as self._db_path attribute in __init__ (DRY)
- Important: add comment explaining lazy imports require sys.path set in __init__
- Signal / FinalSignal dataclasses
- SignalMerger with weighted merge, single-track fallbacks, and cancel-out HOLD
- OrchestratorConfig with all required fields