Keep maintainer docs aligned with the current contract-first and provenance reality

The repository state has moved well past the oldest migration drafts: backend public payloads are already contract-first in several paths, research provenance now spans runner/live/full-state logs, and the offline trace/A-B toolchain is part of the normal maintainer workflow. This doc update records what is already true on mainline versus what remains target-state, so future changes stop treating stale design notes as the current architecture.

Constraint: Reflect only behavior that is already present on mainline; avoid documenting unrecovered worker-only experiments as current reality
Rejected: Collapse everything into README | maintainer-facing migration/provenance details would become harder to keep precise and reviewable
Confidence: high
Scope-risk: narrow
Directive: When changing backend public fields or profiling semantics, update AGENTS.md and the linked docs in the same change set so maintainer guidance does not drift behind code again
Tested: git diff --check on updated documentation set
Not-tested: No runtime/code-path changes in this docs-only commit
2026-04-14 15:20:39 +08:00 · 0ba4e40601
parent 64e3583f66
commit 0ba4e40601
4 changed files with 159 additions and 7 deletions
--- a/docs/architecture/application-boundary.md
+++ b/docs/architecture/application-boundary.md
@@ -4,6 +4,21 @@ Status: draft
 Audience: backend/dashboard/orchestrator maintainers
 Scope: define the boundary between HTTP/WebSocket delivery, application service orchestration, and the quant+LLM merge kernel

+## Current status snapshot (2026-04)
+
+This document is still the **target boundary** document, but several convergence pieces have already landed on mainline:
+
+- `web_dashboard/backend/services/job_service.py` now owns public task/job projection logic;
+- `web_dashboard/backend/services/result_store.py` persists result contracts under `results/<task_id>/result.v1alpha1.json`;
+- `web_dashboard/backend/services/analysis_service.py` and `web_dashboard/backend/api/portfolio.py` already expose contract-first result payloads by default;
+- `/ws/analysis/{task_id}` and `/ws/orchestrator` already carry `contract_version = "v1alpha1"` and include result/degradation/data-quality metadata.
+
+What is **not** fully finished yet:
+
+- `web_dashboard/backend/main.py` still contains too much orchestration glue and transport-local logic;
+- route handlers are thinner than before, but the application layer has not fully absorbed every lifecycle branch;
+- migration flags/modes still coexist with legacy compatibility paths.
+
 ## 1. Why this document exists

 The current backend mixes three concerns inside `web_dashboard/backend/main.py`:
@@ -40,6 +55,12 @@ This is the correct place for quant/LLM merge semantics.

 This makes the transport layer hard to replace and makes result contracts implicit.

+At the same time, current mainline no longer matches the oldest “all logic sits in routes” description exactly. The codebase now sits in a **mid-migration** state:
+
+- merge semantics remain in `orchestrator/`;
+- public payload shaping has started moving into backend services;
+- legacy compatibility fields still exist for UI safety.
+
 ## 3. Target boundary

 ## 3.1 Layer model
@@ -193,3 +214,14 @@ A change respects this boundary if all are true:
 - application service owns task lifecycle and contract mapping;
 - `orchestrator/` remains the only owner of merge semantics;
 - domain dataclasses can still be tested without FastAPI or WebSocket context.
+
+## 9. Current maintainer guidance
+
+When touching backend convergence code, treat these files as the current application-facing boundary:
+
+- `web_dashboard/backend/services/job_service.py`
+- `web_dashboard/backend/services/result_store.py`
+- `web_dashboard/backend/services/analysis_service.py`
+- `web_dashboard/backend/api/portfolio.py`
+
+If a change adds or removes externally visible fields, update `docs/contracts/result-contract-v1alpha1.md` in the same change set.
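
The persistence convention named in the hunks above (`results/<task_id>/result.v1alpha1.json`, tagged with `contract_version = "v1alpha1"`) can be sketched roughly as follows. This is an illustration only, not the code in `result_store.py`: the helper names, the `results_root` parameter, and the exact envelope keys other than `contract_version` are assumptions made for this example.

```python
import json
from pathlib import Path

CONTRACT_VERSION = "v1alpha1"


def result_contract_path(results_root: Path, task_id: str) -> Path:
    """Build the documented location: results/<task_id>/result.v1alpha1.json."""
    return results_root / task_id / f"result.{CONTRACT_VERSION}.json"


def save_result_contract(results_root: Path, task_id: str, result: dict) -> Path:
    """Hypothetical helper: wrap a result in a versioned envelope and persist it."""
    path = result_contract_path(results_root, task_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: the envelope carries task_id and result next to contract_version.
    envelope = {
        "contract_version": CONTRACT_VERSION,
        "task_id": task_id,
        "result": result,
    }
    path.write_text(json.dumps(envelope, indent=2), encoding="utf-8")
    return path


def load_result_contract(results_root: Path, task_id: str) -> dict:
    """Hypothetical helper: read an envelope back and reject unknown versions."""
    path = result_contract_path(results_root, task_id)
    envelope = json.loads(path.read_text(encoding="utf-8"))
    if envelope.get("contract_version") != CONTRACT_VERSION:
        raise ValueError(f"unexpected contract version in {path}")
    return envelope
```

Keeping the version in both the file name and the envelope lets a reader reject a mismatched file without guessing from the path alone.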
--- a/docs/architecture/research-provenance.md
+++ b/docs/architecture/research-provenance.md
@@ -4,6 +4,20 @@ Status: draft
 Audience: orchestrator, TradingAgents graph, verification
 Scope: document the Phase 1-4 provenance fields, Bull/Bear/Manager guard behavior, trace schema, and the smallest safe A/B workflow for verification

+## Current implementation snapshot (2026-04)
+
+Mainline now has four distinct but connected pieces in place:
+
+1. `research provenance` fields are carried in `investment_debate_state`;
+2. the same provenance is reused by:
+   - `orchestrator/llm_runner.py`
+   - `orchestrator/live_mode.py`
+   - `tradingagents/graph/trading_graph.py` full-state logs;
+3. `orchestrator/profile_stage_chain.py` emits node-level traces for offline analysis;
+4. `orchestrator/profile_ab.py` compares two trace cohorts offline without changing the production execution path.
+
+This document describes the **current mainline behavior**, not a future structured-memo design.
+
 ## 1. Why this document exists

 Phase 1-4 convergence added three closely related behaviors:
@@ -84,6 +98,10 @@ This is intentionally **string-first**, not schema-first, so the downstream plan
 - `metadata.data_quality`
 - `metadata.sample_quality`

+The extraction path is now centralized through:
+
+- `tradingagents/agents/utils/agent_states.py::extract_research_provenance()`
+
 Current conventions:

 - normal path: `data_quality.state = "ok"`, `sample_quality = "full_research"`;
@@ -98,13 +116,22 @@ Current conventions:

 This means consumers can inspect research degradation without parsing raw debate text.

+### 4.3 Full-state log projection
+
+`tradingagents/graph/trading_graph.py::_log_state()` now also persists the same provenance subset into:
+
+- `results/<ticker>/TradingAgentsStrategy_logs/full_states_log_<trade_date>.json`
+
+This keeps the post-run JSON logs aligned with the runner/live metadata instead of silently dropping the structured fields.
+
 ## 5. Profiling trace schema

-`orchestrator/profile_stage_chain.py` is the current timing/provenance trace tool.
+`orchestrator/profile_stage_chain.py` is the current timing/provenance trace generator.
+`orchestrator/profile_trace_utils.py` holds the shared summary helper used by the offline A/B comparison path.

 ### 5.1 Top-level payload

-Successful runs write a JSON payload with:
+Successful runs currently write a JSON payload with:

 - `status`
 - `ticker`
@@ -124,7 +151,7 @@ Error payloads add:

 ### 5.2 `node_timings[]` entry schema

-Each node timing entry currently contains:
+Each `node_timings[]` entry currently contains:

 | Field | Meaning |
 | --- | --- |
@@ -143,9 +170,26 @@ Each node timing entry currently contains:

 This schema is intentionally **trace-oriented**, not a replacement for the application result contract.

-## 6. Minimal A/B harness guidance
+## 6. Offline A/B comparison helper

-Use `orchestrator/profile_stage_chain.py` when you want a small, explicit comparison harness without changing the production default path.
+`orchestrator/profile_ab.py` is the current offline comparison helper.
+
+It consumes one or more trace JSON files from cohort `A` and cohort `B`, then reports:
+
+- `median_total_elapsed_ms`
+- `median_event_count`
+- `median_phase_elapsed_ms`
+- `degraded_run_count`
+- `error_count`
+- `trace_schema_versions`
+- `source_files`
+- recommendation tie-breaks across elapsed time, degradation count, and error count
+
+This helper is intentionally offline-only: it does **not** re-run live providers or change the production runtime path.
+
+## 7. Minimal A/B harness guidance
+
+Use `python -m orchestrator.profile_stage_chain` to generate traces, then `python -m orchestrator.profile_ab` to compare them.

 ### 6.1 Safe comparison knobs

@@ -167,7 +211,7 @@ Keep these fixed when doing an A/B comparison:
 - the same `--overall-timeout`
 - `max_debate_rounds = 1` and `max_risk_discuss_rounds = 1` as currently baked into the harness

-### 6.3 Example commands
+### 7.3 Example commands

 ```bash
 python -m orchestrator.profile_stage_chain \
@@ -181,6 +225,12 @@ python -m orchestrator.profile_stage_chain \
  --date 2026-04-11 \
  --selected-analysts market \
  --analysis-prompt-style detailed
+
+python -m orchestrator.profile_ab \
+  --a orchestrator/profile_runs/compact \
+  --b orchestrator/profile_runs/detailed \
+  --label-a compact \
+  --label-b detailed
 ```

 Compare the generated JSON dumps by focusing on:
@@ -190,7 +240,7 @@ Compare the generated JSON dumps by focusing on:
 - provenance changes (`research_status`, `degraded_reason`)
 - history/response growth (`history_len`, `response_len`)

-## 7. Review guardrails
+## 8. Review guardrails

 When modifying this area, keep these invariants intact unless a broader migration explicitly approves otherwise:

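The offline cohort comparison described in the `profile_ab.py` hunks above can be illustrated with a small sketch. It is not the helper's actual implementation. The output keys (`median_total_elapsed_ms`, `degraded_run_count`, `error_count`, `source_files`) come from the documented report; the input field `total_elapsed_ms`, the `status == "error"` convention, and the function name `summarize_cohort` are assumptions made for this example.

```python
import json
from pathlib import Path
from statistics import median


def summarize_cohort(trace_dir: Path) -> dict:
    """Summarize every *.json trace in one cohort directory (hypothetical helper)."""
    totals, degraded, errors, sources = [], 0, 0, []
    for path in sorted(trace_dir.glob("*.json")):
        trace = json.loads(path.read_text(encoding="utf-8"))
        sources.append(path.name)
        # Assumption: failed runs are marked with status == "error".
        if trace.get("status") == "error":
            errors += 1
            continue
        # Assumption: each trace stores its wall time as total_elapsed_ms.
        totals.append(trace["total_elapsed_ms"])
        # A run counts as degraded if any node reports a degraded_reason.
        if any(n.get("degraded_reason") for n in trace.get("node_timings", [])):
            degraded += 1
    return {
        "median_total_elapsed_ms": median(totals) if totals else None,
        "degraded_run_count": degraded,
        "error_count": errors,
        "source_files": sources,
    }
```

Because the sketch only reads files that already exist on disk, it mirrors the stated offline-only property: nothing here calls a live provider or touches the production runtime path.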
--- a/docs/contracts/result-contract-v1alpha1.md
+++ b/docs/contracts/result-contract-v1alpha1.md
@@ -4,6 +4,17 @@ Status: draft
 Audience: backend, desktop, frontend, verification
 Format: JSON-oriented contract notes with examples

+## Current implementation snapshot (2026-04)
+
+Mainline backend behavior already partially matches this draft:
+
+- `web_dashboard/backend/services/job_service.py` emits public task/job payloads with `contract_version = "v1alpha1"`;
+- `web_dashboard/backend/services/result_store.py` persists result contracts under `results/<task_id>/result.v1alpha1.json`;
+- `web_dashboard/backend/api/portfolio.py` and `/ws/orchestrator` already expose `v1alpha1` envelopes by default;
+- live signal payloads currently carry `data_quality`, `degradation`, and `research` as top-level contract fields in addition to `result` / `error`.
+
+This document is therefore a **working contract doc**, not a pure future sketch.
+
 ## 1. Goals

 `result-contract-v1alpha1` defines the stable shapes exchanged across:
@@ -169,6 +180,9 @@ This covers `/ws/orchestrator` style responses currently produced by `LiveMode`.
        "llm_direction": 1,
        "timestamp": "2026-04-13T12:00:11Z"
      },
+      "degradation": null,
+      "data_quality": {"state": "ok"},
+      "research": null,
      "error": null
    },
    {
@@ -176,6 +190,19 @@ This covers `/ws/orchestrator` style responses currently produced by `LiveMode`.
      "date": "2026-04-13",
      "status": "failed",
      "result": null,
+      "degradation": {
+        "degraded": true,
+        "reason_code": "provider_mismatch"
+      },
+      "data_quality": {"state": "provider_mismatch", "source": "llm"},
+      "research": {
+        "research_status": "failed",
+        "research_mode": "degraded_synthesis",
+        "timed_out_nodes": ["Bull Researcher"],
+        "degraded_reason": "bull_researcher_connectionerror",
+        "covered_dimensions": ["market"],
+        "manager_confidence": null
+      },
      "error": {
        "code": "live_signal_failed",
        "message": "both quant and llm signals are None",
@@ -216,6 +243,7 @@ Current backend fields in `web_dashboard/backend/main.py` map roughly as follows
 - `quant_signal` -> `result.signals.quant.rating`
 - `llm_signal` -> `result.signals.llm.rating`
 - `confidence` -> `result.confidence`
+- `result_ref` -> persisted result contract location under `results/<task_id>/result.v1alpha1.json`
 - top-level `error` string -> structured `error`
 - positional `stages[]` -> named `stages[]`

@@ -237,6 +265,10 @@ Do not freeze these until config-schema work lands:
 - raw metadata blobs from quant/LLM internals
 - report summary extraction fields

+Additional note:
+
+- trace/profiling payloads are **not** part of `result-contract-v1alpha1`; they use separate offline trace/A-B helper files under `orchestrator/`.
+
 ## 10. Open review questions

 - Should `rating` remain duplicated with `direction`, or should one be derived client-side?
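
The live signal entries shown in the hunks above carry `degradation`, `data_quality`, and `research` next to `result` / `error`. A consumer-side check for that shape might look like the sketch below. The function names are hypothetical, and treating all five keys as required is an assumption drawn from the two example entries, not a rule stated by the contract.

```python
# Assumption: every live signal entry carries these five top-level keys,
# as both example entries in the contract doc do.
REQUIRED_SIGNAL_KEYS = {"result", "degradation", "data_quality", "research", "error"}


def missing_signal_fields(entry: dict) -> set:
    """Return the expected top-level keys that are absent from an entry."""
    return REQUIRED_SIGNAL_KEYS - set(entry)


def is_degraded(entry: dict) -> bool:
    """True when the entry has a degradation object with degraded set."""
    degradation = entry.get("degradation")
    return bool(degradation and degradation.get("degraded"))


ok_entry = {
    "status": "completed",
    "result": {"llm_direction": 1},
    "degradation": None,
    "data_quality": {"state": "ok"},
    "research": None,
    "error": None,
}
failed_entry = {
    "status": "failed",
    "result": None,
    "degradation": {"degraded": True, "reason_code": "provider_mismatch"},
    "data_quality": {"state": "provider_mismatch", "source": "llm"},
    "research": {"research_status": "failed"},
    "error": {"code": "live_signal_failed"},
}
```

Checking for key presence rather than truthiness matters here, because `null` is a valid value for `degradation` and `research` on the normal path.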
--- a/docs/migration/rollback-notes.md
+++ b/docs/migration/rollback-notes.md
@@ -4,6 +4,23 @@ Status: draft
 Audience: backend/application maintainers
 Scope: migrate toward application-service boundary and result-contract-v1alpha1 with rollback safety

+## Current progress snapshot (2026-04)
+
+Mainline has moved beyond pure planning, but it has not finished the full boundary migration:
+
+- `Phase 0` is effectively done: contract and architecture drafts exist.
+- `Phase 1-4` are **partially landed**:
+  - backend services now project `v1alpha1`-style public payloads;
+  - result contracts are persisted via `result_store.py`;
+  - `/ws/analysis/{task_id}` and `/ws/orchestrator` already wrap payloads with `contract_version`;
+  - recommendation and task-status reads already depend on application-layer shaping more than route-local reconstruction.
+- `Phase 5` is **not complete**:
+  - `web_dashboard/backend/main.py` is still too large;
+  - route-local orchestration has not been fully deleted;
+  - compatibility fields still coexist with the newer contract-first path.
+
+Also note that research provenance / node guard / profiling work is now landed on the orchestrator side. That effort complements the backend migration but should not be confused with “application boundary fully complete.”
+
 ## 1. Migration objective

 Move backend delivery code from route-local orchestration to an application-service layer without changing the quant+LLM merge kernel behavior.
@@ -60,6 +77,11 @@ Rollback:

 - route handlers can call old inline functions directly via feature flag or import switch

+Current status:
+
+- partially complete on mainline via `analysis_service.py`, `job_service.py`, and `result_store.py`
+- not complete enough yet to claim `main.py` is only a thin adapter
+
 ## Phase 2: dual-read for task status

 Why:
@@ -116,6 +138,12 @@ Rollback:
 - restore websocket serializer to legacy shape
 - keep application service intact behind adapter

+Current status:
+
+- partially complete on mainline
+- `/ws/orchestrator` already emits `contract_version`, `data_quality`, `degradation`, and `research`
+- `/ws/analysis/{task_id}` already reads application-shaped task state
+
 ## Phase 5: remove route-local orchestration

 Actions:
@@ -186,3 +214,13 @@ A migration plan is acceptable only if it:
 - introduces feature-flagged cutover points
 - supports dual-read/dual-write only at application/persistence boundary
 - provides a one-step rollback path at each release phase
+
+## 10. Maintainer note
+
+When updating migration status, keep these three documents aligned:
+
+- `docs/architecture/application-boundary.md`
+- `docs/contracts/result-contract-v1alpha1.md`
+- `docs/architecture/research-provenance.md`
+
+The first two describe backend/application convergence; the third describes orchestrator-side research degradation and profiling semantics that now feed those contracts.