Keep maintainer docs aligned with the current contract-first and provenance reality

The repository state has moved well past the oldest migration drafts: backend public payloads are already contract-first in several paths, research provenance now spans runner/live/full-state logs, and the offline trace/A-B toolchain is part of the normal maintainer workflow. This doc update records what is already true on mainline versus what remains target-state, so future changes stop treating stale design notes as the current architecture.

Constraint: Reflect only behavior that is already present on mainline; avoid documenting unrecovered worker-only experiments as current reality
Rejected: Collapse everything into README | maintainer-facing migration/provenance details would become harder to keep precise and reviewable
Confidence: high
Scope-risk: narrow
Directive: When changing backend public fields or profiling semantics, update AGENTS.md and the linked docs in the same change set so maintainer guidance does not drift behind code again
Tested: git diff --check on updated documentation set
Not-tested: No runtime/code-path changes in this docs-only commit
2026-04-14 15:20:39 +08:00 · 2026-04-14 15:20:39 +08:00 · 0ba4e40601
parent 64e3583f66
commit 0ba4e40601
4 changed files with 159 additions and 7 deletions
--- a/docs/architecture/application-boundary.md
+++ b/docs/architecture/application-boundary.md
@@ -4,6 +4,21 @@ Status: draft
 Audience: backend/dashboard/orchestrator maintainers
 Scope: define the boundary between HTTP/WebSocket delivery, application service orchestration, and the quant+LLM merge kernel
 ## Current status snapshot (2026-04)
 This document is still the **target boundary** document, but several convergence pieces have already landed on mainline:
 - `web_dashboard/backend/services/job_service.py` now owns public task/job projection logic;
 - `web_dashboard/backend/services/result_store.py` persists result contracts under `results/<task_id>/result.v1alpha1.json`;
 - `web_dashboard/backend/services/analysis_service.py` and `api/portfolio.py` already expose contract-first result payloads by default;
 - `/ws/analysis/{task_id}` and `/ws/orchestrator` already carry `contract_version = "v1alpha1"` and include result/degradation/data-quality metadata.
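As an illustration of the persisted-contract convention listed above, the following Python sketch resolves `results/<task_id>/result.v1alpha1.json` and checks the version marker. The helper names, the `results_root` parameter, and the presence of a top-level `contract_version` key inside the persisted file are assumptions made for this example; the actual `result_store.py` API may differ.

```python
import json
from pathlib import Path
from typing import Any

CONTRACT_VERSION = "v1alpha1"


def result_contract_path(results_root: Path, task_id: str) -> Path:
    """Build results/<task_id>/result.v1alpha1.json under a given root."""
    return results_root / task_id / f"result.{CONTRACT_VERSION}.json"


def load_result_contract(results_root: Path, task_id: str) -> dict[str, Any]:
    """Read a persisted contract and reject an unexpected version.

    Assumption: the persisted JSON repeats `contract_version` at the top level.
    """
    path = result_contract_path(results_root, task_id)
    payload = json.loads(path.read_text(encoding="utf-8"))
    version = payload.get("contract_version")
    if version != CONTRACT_VERSION:
        raise ValueError(f"unexpected contract_version {version!r} in {path}")
    return payload
```

Failing loudly on a version mismatch keeps a reader from silently interpreting a legacy-shaped file as a `v1alpha1` contract while both paths coexist.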
 What is **not** fully finished yet:
 - `web_dashboard/backend/main.py` still contains too much orchestration glue and transport-local logic;
 - route handlers are thinner than before, but the application layer has not fully absorbed every lifecycle branch;
 - migration flags/modes still coexist with legacy compatibility paths.
 ## 1. Why this document exists
 The current backend mixes three concerns inside `web_dashboard/backend/main.py`:
@@ -40,6 +55,12 @@ This is the correct place for quant/LLM merge semantics.
 This makes the transport layer hard to replace and makes result contracts implicit.
 At the same time, current mainline no longer matches the oldest “all logic sits in routes” description exactly. The codebase now sits in a **mid-migration** state:
 - merge semantics remain in `orchestrator/`;
 - public payload shaping has started moving into backend services;
 - legacy compatibility fields still exist for UI safety.
 ## 3. Target boundary
 ## 3.1 Layer model
@@ -193,3 +214,14 @@ A change respects this boundary if all are true:
 - application service owns task lifecycle and contract mapping;
 - `orchestrator/` remains the only owner of merge semantics;
 - domain dataclasses can still be tested without FastAPI or WebSocket context.
 ## 9. Current maintainer guidance
 When touching backend convergence code, treat these files as the current application-facing boundary:
 - `web_dashboard/backend/services/job_service.py`
 - `web_dashboard/backend/services/result_store.py`
 - `web_dashboard/backend/services/analysis_service.py`
 - `web_dashboard/backend/api/portfolio.py`
 If a change adds or removes externally visible fields, update `docs/contracts/result-contract-v1alpha1.md` in the same change set.
--- a/docs/architecture/research-provenance.md
+++ b/docs/architecture/research-provenance.md
@@ -4,6 +4,20 @@ Status: draft
 Audience: orchestrator, TradingAgents graph, verification
 Scope: document the Phase 1-4 provenance fields, Bull/Bear/Manager guard behavior, trace schema, and the smallest safe A/B workflow for verification
 ## Current implementation snapshot (2026-04)
 Mainline now has four distinct but connected pieces in place:
 1. `research provenance` fields are carried in `investment_debate_state`;
 2. the same provenance is reused by:
   - `orchestrator/llm_runner.py`
   - `orchestrator/live_mode.py`
   - `tradingagents/graph/trading_graph.py` full-state logs;
 3. `orchestrator/profile_stage_chain.py` emits node-level traces for offline analysis;
 4. `orchestrator/profile_ab.py` compares two trace cohorts offline without changing the production execution path.
 This document describes the **current mainline behavior**, not a future structured-memo design.
 ## 1. Why this document exists
 Phase 1-4 convergence added three closely related behaviors:
@@ -84,6 +98,10 @@ This is intentionally **string-first**, not schema-first, so the downstream plan
 - `metadata.data_quality`
 - `metadata.sample_quality`
 The extraction path is now centralized through:
 - `tradingagents/agents/utils/agent_states.py::extract_research_provenance()`
 Current conventions:
 - normal path: `data_quality.state = "ok"`, `sample_quality = "full_research"`;
@@ -98,13 +116,22 @@ Current conventions:
 This means consumers can inspect research degradation without parsing raw debate text.
 ### 4.3 Full-state log projection
 `tradingagents/graph/trading_graph.py::_log_state()` now also persists the same provenance subset into:
 - `results/<ticker>/TradingAgentsStrategy_logs/full_states_log_<trade_date>.json`
 This keeps the post-run JSON logs aligned with the runner/live metadata instead of silently dropping the structured fields.
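A minimal sketch of what "persisting the same provenance subset" could look like is shown below. It is not the repository's `extract_research_provenance()` or `_log_state()`; the helper name `project_provenance` is hypothetical, and the field list is an assumption based on the `research` object shown in the result contract examples.

```python
from typing import Any, Mapping

# Assumed field names, mirroring the `research` object in the contract examples.
PROVENANCE_FIELDS = (
    "research_status",
    "research_mode",
    "timed_out_nodes",
    "degraded_reason",
    "covered_dimensions",
    "manager_confidence",
)


def project_provenance(final_state: Mapping[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: copy the provenance subset out of the debate state.

    Missing fields become None so every log entry has the same keys.
    """
    debate_state = final_state.get("investment_debate_state") or {}
    return {name: debate_state.get(name) for name in PROVENANCE_FIELDS}
```

Projecting a fixed key set, rather than dumping the whole debate state, is one way to keep the log entry comparable with the runner/live metadata.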
 ## 5. Profiling trace schema
-`orchestrator/profile_stage_chain.py` is the current timing/provenance trace tool.
+`orchestrator/profile_stage_chain.py` is the current timing/provenance trace generator.
 `orchestrator/profile_trace_utils.py` holds the shared summary helper used by the offline A/B comparison path.
 ### 5.1 Top-level payload
-Successful runs write a JSON payload with:
+Successful runs currently write a JSON payload with:
 - `status`
 - `ticker`
@@ -124,7 +151,7 @@ Error payloads add:
 ### 5.2 `node_timings[]` entry schema
-Each node timing entry currently contains:
+Each `node_timings[]` entry currently contains:
 | Field | Meaning |
 | --- | --- |
@@ -143,9 +170,26 @@ Each node timing entry currently contains:
 This schema is intentionally **trace-oriented**, not a replacement for the application result contract.
-## 6. Minimal A/B harness guidance
+## 6. Offline A/B comparison helper
-Use `orchestrator/profile_stage_chain.py` when you want a small, explicit comparison harness without changing the production default path.
+`orchestrator/profile_ab.py` is the current offline comparison helper.
 It consumes one or more trace JSON files from cohort `A` and cohort `B`, then reports:
 - `median_total_elapsed_ms`
 - `median_event_count`
 - `median_phase_elapsed_ms`
 - `degraded_run_count`
 - `error_count`
 - `trace_schema_versions`
 - `source_files`
 - recommendation tie-breaks across elapsed time, degradation count, and error count
 This helper is intentionally offline-only: it does **not** re-run live providers or change the production runtime path.
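The kind of summary described above can be sketched in a few lines of Python. This is not the code in `orchestrator/profile_ab.py`: the trace keys `total_elapsed_ms` and `degraded`, the `"error"` status value, and the tie-break order (errors, then degraded runs, then median elapsed time) are all assumptions chosen for illustration.

```python
from statistics import median
from typing import Any, Mapping, Sequence


def summarize_cohort(traces: Sequence[Mapping[str, Any]]) -> dict[str, Any]:
    """Summarize already-loaded trace payloads for one cohort.

    Assumptions: each trace may carry a numeric `total_elapsed_ms`, a
    `status` equal to "error" on failure, and a boolean `degraded` flag.
    """
    elapsed = [t["total_elapsed_ms"] for t in traces if "total_elapsed_ms" in t]
    return {
        "median_total_elapsed_ms": median(elapsed) if elapsed else None,
        "degraded_run_count": sum(1 for t in traces if t.get("degraded")),
        "error_count": sum(1 for t in traces if t.get("status") == "error"),
    }


def recommend(a: Mapping[str, Any], b: Mapping[str, Any]) -> str:
    """Return "A", "B", or "tie" using an assumed tie-break order."""

    def sort_key(summary: Mapping[str, Any]) -> tuple:
        elapsed = summary["median_total_elapsed_ms"]
        # Fewer errors wins first, then fewer degraded runs, then lower latency.
        return (
            summary["error_count"],
            summary["degraded_run_count"],
            float("inf") if elapsed is None else elapsed,
        )

    key_a, key_b = sort_key(a), sort_key(b)
    if key_a == key_b:
        return "tie"
    return "A" if key_a < key_b else "B"
```

Because the functions only consume dictionaries that were already written to disk, nothing in this path touches live providers.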
 ## 7. Minimal A/B harness guidance
 Use `python -m orchestrator.profile_stage_chain` to generate traces, then `python -m orchestrator.profile_ab` to compare them.
 ### 7.1 Safe comparison knobs
@@ -167,7 +211,7 @@ Keep these fixed when doing an A/B comparison:
 - the same `--overall-timeout`
 - `max_debate_rounds = 1` and `max_risk_discuss_rounds = 1` as currently baked into the harness
-### 6.3 Example commands
+### 7.3 Example commands
 ```bash
 python -m orchestrator.profile_stage_chain \
@@ -181,6 +225,12 @@ python -m orchestrator.profile_stage_chain \
  --date 2026-04-11 \
  --selected-analysts market \
  --analysis-prompt-style detailed
 python -m orchestrator.profile_ab \
  --a orchestrator/profile_runs/compact \
  --b orchestrator/profile_runs/detailed \
  --label-a compact \
  --label-b detailed
 ```
 Compare the generated JSON dumps by focusing on:
@@ -190,7 +240,7 @@ Compare the generated JSON dumps by focusing on:
 - provenance changes (`research_status`, `degraded_reason`)
 - history/response growth (`history_len`, `response_len`)
-## 7. Review guardrails
+## 8. Review guardrails
 When modifying this area, keep these invariants intact unless a broader migration explicitly approves otherwise:
--- a/docs/contracts/result-contract-v1alpha1.md
+++ b/docs/contracts/result-contract-v1alpha1.md
@@ -4,6 +4,17 @@ Status: draft
 Audience: backend, desktop, frontend, verification
 Format: JSON-oriented contract notes with examples
 ## Current implementation snapshot (2026-04)
 Mainline backend behavior already matches parts of this draft:
 - `web_dashboard/backend/services/job_service.py` emits public task/job payloads with `contract_version = "v1alpha1"`;
 - `web_dashboard/backend/services/result_store.py` persists result contracts under `results/<task_id>/result.v1alpha1.json`;
 - `web_dashboard/backend/api/portfolio.py` and `/ws/orchestrator` already expose `v1alpha1` envelopes by default;
 - live signal payloads currently carry `data_quality`, `degradation`, and `research` as top-level contract fields in addition to `result` / `error`.
 This document is therefore a **working contract doc**, not a pure future sketch.
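To show how a consumer might use the top-level `data_quality`, `degradation`, and `error` fields without parsing free text, here is a small hedged sketch. The function name and the three-way `"ok"` / `"degraded"` / `"failed"` classification are illustrative assumptions, not part of the contract.

```python
from typing import Any, Mapping


def classify_signal(item: Mapping[str, Any]) -> str:
    """Return "failed", "degraded", or "ok" for one live signal entry."""
    # A structured error or an explicit failed status takes priority.
    if item.get("error") is not None or item.get("status") == "failed":
        return "failed"
    degradation = item.get("degradation") or {}
    data_quality = item.get("data_quality") or {}
    # Treat a missing data_quality state as "ok" (an assumption of this sketch).
    if degradation.get("degraded") or data_quality.get("state", "ok") != "ok":
        return "degraded"
    return "ok"
```

A dashboard could, for example, badge entries by this classification and show `degradation.reason_code` only for the non-`"ok"` cases.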
 ## 1. Goals
 `result-contract-v1alpha1` defines the stable shapes exchanged across:
@@ -169,6 +180,9 @@ This covers `/ws/orchestrator` style responses currently produced by `LiveMode`.
        "llm_direction": 1,
        "timestamp": "2026-04-13T12:00:11Z"
      },
      "degradation": null,
      "data_quality": {"state": "ok"},
      "research": null,
      "error": null
    },
    {
@@ -176,6 +190,19 @@ This covers `/ws/orchestrator` style responses currently produced by `LiveMode`.
      "date": "2026-04-13",
      "status": "failed",
      "result": null,
      "degradation": {
        "degraded": true,
        "reason_code": "provider_mismatch"
      },
      "data_quality": {"state": "provider_mismatch", "source": "llm"},
      "research": {
        "research_status": "failed",
        "research_mode": "degraded_synthesis",
        "timed_out_nodes": ["Bull Researcher"],
        "degraded_reason": "bull_researcher_connectionerror",
        "covered_dimensions": ["market"],
        "manager_confidence": null
      },
      "error": {
        "code": "live_signal_failed",
        "message": "both quant and llm signals are None",
@@ -216,6 +243,7 @@ Current backend fields in `web_dashboard/backend/main.py` map roughly as follows
 - `quant_signal` -> `result.signals.quant.rating`
 - `llm_signal` -> `result.signals.llm.rating`
 - `confidence` -> `result.confidence`
 - `result_ref` -> persisted result contract location under `results/<task_id>/result.v1alpha1.json`
 - top-level `error` string -> structured `error`
 - positional `stages[]` -> named `stages[]`
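The field mapping above can be sketched as a small projection function. This is only an illustration of the listed arrows, not the logic in `job_service.py`: the function name is hypothetical, the `"legacy_error"` code is a placeholder, and `result_ref` and `stages[]` are left out because their target shapes are not spelled out here.

```python
from typing import Any, Mapping


def legacy_to_contract(task: Mapping[str, Any]) -> dict[str, Any]:
    """Project a legacy flat task dict onto a v1alpha1-style envelope."""
    error = task.get("error")
    return {
        "contract_version": "v1alpha1",
        "result": {
            "signals": {
                "quant": {"rating": task.get("quant_signal")},
                "llm": {"rating": task.get("llm_signal")},
            },
            "confidence": task.get("confidence"),
        },
        # Top-level error string -> structured error; the code is a placeholder.
        "error": {"code": "legacy_error", "message": error} if error else None,
    }
```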
@@ -237,6 +265,10 @@ Do not freeze these until config-schema work lands:
 - raw metadata blobs from quant/LLM internals
 - report summary extraction fields
 Additional note:
 - trace/profiling payloads are **not** part of `result-contract-v1alpha1`; they use separate offline trace/A-B helper files under `orchestrator/`.
 ## 10. Open review questions
 - Should `rating` remain duplicated with `direction`, or should one be derived client-side?
--- a/docs/migration/rollback-notes.md
+++ b/docs/migration/rollback-notes.md
@@ -4,6 +4,23 @@ Status: draft
 Audience: backend/application maintainers
 Scope: migrate toward application-service boundary and result-contract-v1alpha1 with rollback safety
 ## Current progress snapshot (2026-04)
 Mainline has moved beyond pure planning, but it has not finished the full boundary migration:
 - `Phase 0` is effectively done: contract and architecture drafts exist.
 - `Phase 1-4` are **partially landed**:
  - backend services now project `v1alpha1`-style public payloads;
  - result contracts are persisted via `result_store.py`;
  - `/ws/analysis/{task_id}` and `/ws/orchestrator` already wrap payloads with `contract_version`;
  - recommendation and task-status reads already depend on application-layer shaping more than route-local reconstruction.
 - `Phase 5` is **not complete**:
  - `web_dashboard/backend/main.py` is still too large;
  - route-local orchestration has not been fully deleted;
  - compatibility fields still coexist with the newer contract-first path.
 Also note that the research provenance / node guard / profiling work has now landed on the orchestrator side. That effort complements the backend migration but should not be confused with “application boundary fully complete.”
 ## 1. Migration objective
 Move backend delivery code from route-local orchestration to an application-service layer without changing the quant+LLM merge kernel behavior.
@@ -60,6 +77,11 @@ Rollback:
 - route handlers can call old inline functions directly via feature flag or import switch
 Current status:
 - partially complete on mainline via `analysis_service.py`, `job_service.py`, and `result_store.py`
 - not complete enough yet to claim `main.py` is only a thin adapter
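The feature-flag rollback mentioned for this phase could look roughly like the sketch below. The flag name `USE_APPLICATION_SERVICE`, its default of "on", and both helper names are hypothetical; the real cutover switch may be an import toggle or a config mode instead.

```python
import os
from typing import Callable, Mapping

# Hypothetical flag name, used only for this illustration.
FLAG_NAME = "USE_APPLICATION_SERVICE"


def application_service_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Treat the flag as on unless it is explicitly set to a false-like value."""
    return env.get(FLAG_NAME, "1").strip().lower() not in {"0", "false", "no", "off"}


def pick_handler(
    service_handler: Callable,
    legacy_handler: Callable,
    env: Mapping[str, str] = os.environ,
) -> Callable:
    """Return the service-backed handler, or the legacy one when rolled back."""
    return service_handler if application_service_enabled(env) else legacy_handler
```

Keeping the decision in one function means a rollback is a single environment change rather than an edit to each route handler.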
 ## Phase 2: dual-read for task status
 Why:
@@ -116,6 +138,12 @@ Rollback:
 - restore websocket serializer to legacy shape
 - keep application service intact behind adapter
 Current status:
 - partially complete on mainline
 - `/ws/orchestrator` already emits `contract_version`, `data_quality`, `degradation`, and `research`
 - `/ws/analysis/{task_id}` already reads application-shaped task state
 ## Phase 5: remove route-local orchestration
 Actions:
@@ -186,3 +214,13 @@ A migration plan is acceptable only if it:
 - introduces feature-flagged cutover points
 - supports dual-read/dual-write only at application/persistence boundary
 - provides a one-step rollback path at each release phase
 ## 10. Maintainer note
 When updating migration status, keep these three documents aligned:
 - `docs/architecture/application-boundary.md`
 - `docs/contracts/result-contract-v1alpha1.md`
 - `docs/architecture/research-provenance.md`
 The first two describe backend/application convergence; the third describes orchestrator-side research degradation and profiling semantics that now feed those contracts.