diff --git a/docs/architecture/application-boundary.md b/docs/architecture/application-boundary.md
index fd450d73..69d57a9f 100644
--- a/docs/architecture/application-boundary.md
+++ b/docs/architecture/application-boundary.md
@@ -4,6 +4,21 @@ Status: draft
 Audience: backend/dashboard/orchestrator maintainers
 Scope: define the boundary between HTTP/WebSocket delivery, application service orchestration, and the quant+LLM merge kernel
 
+## Current status snapshot (2026-04)
+
+This document is still the **target boundary** document, but several convergence pieces are already landed on the mainline:
+
+- `web_dashboard/backend/services/job_service.py` now owns public task/job projection logic;
+- `web_dashboard/backend/services/result_store.py` persists result contracts under `results//result.v1alpha1.json`;
+- `web_dashboard/backend/services/analysis_service.py` and `api/portfolio.py` already expose contract-first result payloads by default;
+- `/ws/analysis/{task_id}` and `/ws/orchestrator` already carry `contract_version = "v1alpha1"` and include result/degradation/data-quality metadata.
+
+What is **not** fully finished yet:
+
+- `web_dashboard/backend/main.py` still contains too much orchestration glue and transport-local logic;
+- route handlers are thinner than before, but the application layer has not fully absorbed every lifecycle branch;
+- migration flags/modes still coexist with legacy compatibility paths.
+
 ## 1. Why this document exists
 
 The current backend mixes three concerns inside `web_dashboard/backend/main.py`:
@@ -40,6 +55,12 @@ This is the correct place for quant/LLM merge semantics.
 
 This makes the transport layer hard to replace and makes result contracts implicit.
 
+At the same time, current mainline no longer matches the oldest “all logic sits in routes” description exactly.
The codebase now sits in a **mid-migration** state:
+
+- merge semantics remain in `orchestrator/`;
+- public payload shaping has started moving into backend services;
+- legacy compatibility fields still exist for UI safety.
+
 ## 3. Target boundary
 
 ## 3.1 Layer model
@@ -193,3 +214,14 @@ A change respects this boundary if all are true:
 - application service owns task lifecycle and contract mapping;
 - `orchestrator/` remains the only owner of merge semantics;
 - domain dataclasses can still be tested without FastAPI or WebSocket context.
+
+## 9. Current maintainer guidance
+
+When touching backend convergence code, treat these files as the current application-facing boundary:
+
+- `web_dashboard/backend/services/job_service.py`
+- `web_dashboard/backend/services/result_store.py`
+- `web_dashboard/backend/services/analysis_service.py`
+- `web_dashboard/backend/api/portfolio.py`
+
+If a change adds or removes externally visible fields, update `docs/contracts/result-contract-v1alpha1.md` in the same change set.
diff --git a/docs/architecture/research-provenance.md b/docs/architecture/research-provenance.md
index 3a74df9f..0775dd2d 100644
--- a/docs/architecture/research-provenance.md
+++ b/docs/architecture/research-provenance.md
@@ -4,6 +4,20 @@ Status: draft
 Audience: orchestrator, TradingAgents graph, verification
 Scope: document the Phase 1-4 provenance fields, Bull/Bear/Manager guard behavior, trace schema, and the smallest safe A/B workflow for verification
 
+## Current implementation snapshot (2026-04)
+
+Mainline now has four distinct but connected pieces in place:
+
+1. `research provenance` fields are carried in `investment_debate_state`;
+2. the same provenance is reused by:
+   - `orchestrator/llm_runner.py`
+   - `orchestrator/live_mode.py`
+   - `tradingagents/graph/trading_graph.py` full-state logs;
+3. `orchestrator/profile_stage_chain.py` emits node-level traces for offline analysis;
+4. 
`orchestrator/profile_ab.py` compares two trace cohorts offline without changing the production execution path.
+
+This document describes the **current mainline behavior**, not a future structured-memo design.
+
 ## 1. Why this document exists
 
 Phase 1-4 convergence added three closely related behaviors:
@@ -84,6 +98,10 @@ This is intentionally **string-first**, not schema-first, so the downstream plan
 - `metadata.data_quality`
 - `metadata.sample_quality`
 
+The extraction path is now centralized through:
+
+- `tradingagents/agents/utils/agent_states.py::extract_research_provenance()`
+
 Current conventions:
 
 - normal path: `data_quality.state = "ok"`, `sample_quality = "full_research"`;
@@ -98,13 +116,22 @@ Current conventions:
 
 This means consumers can inspect research degradation without parsing raw debate text.
 
+### 4.3 Full-state log projection
+
+`tradingagents/graph/trading_graph.py::_log_state()` now also persists the same provenance subset into:
+
+- `results//TradingAgentsStrategy_logs/full_states_log_.json`
+
+This keeps the post-run JSON logs aligned with the runner/live metadata instead of silently dropping the structured fields.
+
 ## 5. Profiling trace schema
 
-`orchestrator/profile_stage_chain.py` is the current timing/provenance trace tool.
+`orchestrator/profile_stage_chain.py` is the current timing/provenance trace generator.
+`orchestrator/profile_trace_utils.py` holds the shared summary helper used by the offline A/B comparison path.
 
 ### 5.1 Top-level payload
 
-Successful runs write a JSON payload with:
+Successful runs currently write a JSON payload with:
 
 - `status`
 - `ticker`
@@ -124,7 +151,7 @@ Error payloads add:
 
 ### 5.2 `node_timings[]` entry schema
 
-Each node timing entry currently contains:
+Each `node_timings[]` entry currently contains:
 
 | Field | Meaning |
 | --- | --- |
@@ -143,9 +170,26 @@ Each node timing entry currently contains:
 
 This schema is intentionally **trace-oriented**, not a replacement for the application result contract.
 
-## 6. 
Minimal A/B harness guidance
+## 6. Offline A/B comparison helper
 
-Use `orchestrator/profile_stage_chain.py` when you want a small, explicit comparison harness without changing the production default path.
+`orchestrator/profile_ab.py` is the current offline comparison helper.
+
+It consumes one or more trace JSON files from cohort `A` and cohort `B`, then reports:
+
+- `median_total_elapsed_ms`
+- `median_event_count`
+- `median_phase_elapsed_ms`
+- `degraded_run_count`
+- `error_count`
+- `trace_schema_versions`
+- `source_files`
+- recommendation tie-breaks across elapsed time, degradation count, and error count
+
+This helper is intentionally offline-only: it does **not** re-run live providers or change the production runtime path.
+
+## 7. Minimal A/B harness guidance
+
+Use `python -m orchestrator.profile_stage_chain` to generate traces, then `python -m orchestrator.profile_ab` to compare them.
 
 ### 6.1 Safe comparison knobs
 
@@ -167,7 +211,7 @@ Keep these fixed when doing an A/B comparison:
 - the same `--overall-timeout`
 - `max_debate_rounds = 1` and `max_risk_discuss_rounds = 1` as currently baked into the harness
 
-### 6.3 Example commands
+### 7.3 Example commands
 
 ```bash
 python -m orchestrator.profile_stage_chain \
@@ -181,6 +225,12 @@ python -m orchestrator.profile_stage_chain \
   --date 2026-04-11 \
   --selected-analysts market \
   --analysis-prompt-style detailed
+
+python -m orchestrator.profile_ab \
+  --a orchestrator/profile_runs/compact \
+  --b orchestrator/profile_runs/detailed \
+  --label-a compact \
+  --label-b detailed
 ```
 
 Compare the generated JSON dumps by focusing on:
@@ -190,7 +240,7 @@ Compare the generated JSON dumps by focusing on:
 - provenance changes (`research_status`, `degraded_reason`)
 - history/response growth (`history_len`, `response_len`)
 
-## 7. Review guardrails
+## 8. 
Review guardrails
 
 When modifying this area, keep these invariants intact unless a broader migration explicitly approves otherwise:
 
diff --git a/docs/contracts/result-contract-v1alpha1.md b/docs/contracts/result-contract-v1alpha1.md
index 8c54be3d..b3ad93dc 100644
--- a/docs/contracts/result-contract-v1alpha1.md
+++ b/docs/contracts/result-contract-v1alpha1.md
@@ -4,6 +4,17 @@ Status: draft
 Audience: backend, desktop, frontend, verification
 Format: JSON-oriented contract notes with examples
 
+## Current implementation snapshot (2026-04)
+
+Mainline backend behavior now partially matches this draft already:
+
+- `web_dashboard/backend/services/job_service.py` emits public task/job payloads with `contract_version = "v1alpha1"`;
+- `web_dashboard/backend/services/result_store.py` persists result contracts under `results//result.v1alpha1.json`;
+- `web_dashboard/backend/api/portfolio.py` and `/ws/orchestrator` already expose `v1alpha1` envelopes by default;
+- live signal payloads currently carry `data_quality`, `degradation`, and `research` as top-level contract fields in addition to `result` / `error`.
+
+This document is therefore a **working contract doc**, not a pure future sketch.
+
 ## 1. Goals
 
 `result-contract-v1alpha1` defines the stable shapes exchanged across:
@@ -169,6 +180,9 @@ This covers `/ws/orchestrator` style responses currently produced by `LiveMode`.
         "llm_direction": 1,
         "timestamp": "2026-04-13T12:00:11Z"
       },
+      "degradation": null,
+      "data_quality": {"state": "ok"},
+      "research": null,
       "error": null
     },
     {
@@ -176,6 +190,19 @@ This covers `/ws/orchestrator` style responses currently produced by `LiveMode`.
       "date": "2026-04-13",
       "status": "failed",
       "result": null,
+      "degradation": {
+        "degraded": true,
+        "reason_code": "provider_mismatch"
+      },
+      "data_quality": {"state": "provider_mismatch", "source": "llm"},
+      "research": {
+        "research_status": "failed",
+        "research_mode": "degraded_synthesis",
+        "timed_out_nodes": ["Bull Researcher"],
+        "degraded_reason": "bull_researcher_connectionerror",
+        "covered_dimensions": ["market"],
+        "manager_confidence": null
+      },
       "error": {
         "code": "live_signal_failed",
         "message": "both quant and llm signals are None",
@@ -216,6 +243,7 @@ Current backend fields in `web_dashboard/backend/main.py` map roughly as follows
 - `quant_signal` -> `result.signals.quant.rating`
 - `llm_signal` -> `result.signals.llm.rating`
 - `confidence` -> `result.confidence`
+- `result_ref` -> persisted result contract location under `results//result.v1alpha1.json`
 - top-level `error` string -> structured `error`
 - positional `stages[]` -> named `stages[]`
 
@@ -237,6 +265,10 @@ Do not freeze these until config-schema work lands:
 - raw metadata blobs from quant/LLM internals
 - report summary extraction fields
 
+Additional note:
+
+- trace/profiling payloads are **not** part of `result-contract-v1alpha1`; they use separate offline trace/A-B helper files under `orchestrator/`.
+
 ## 10. Open review questions
 
 - Should `rating` remain duplicated with `direction`, or should one be derived client-side?
diff --git a/docs/migration/rollback-notes.md b/docs/migration/rollback-notes.md
index 5f2f6b38..e973f24d 100644
--- a/docs/migration/rollback-notes.md
+++ b/docs/migration/rollback-notes.md
@@ -4,6 +4,23 @@ Status: draft
 Audience: backend/application maintainers
 Scope: migrate toward application-service boundary and result-contract-v1alpha1 with rollback safety
 
+## Current progress snapshot (2026-04)
+
+Mainline has moved beyond pure planning, but it has not finished the full boundary migration:
+
+- `Phase 0` is effectively done: contract and architecture drafts exist.
+- `Phase 1-4` are **partially landed**:
+  - backend services now project `v1alpha1`-style public payloads;
+  - result contracts are persisted via `result_store.py`;
+  - `/ws/analysis/{task_id}` and `/ws/orchestrator` already wrap payloads with `contract_version`;
+  - recommendation and task-status reads already depend on application-layer shaping more than route-local reconstruction.
+- `Phase 5` is **not complete**:
+  - `web_dashboard/backend/main.py` is still too large;
+  - route-local orchestration has not been fully deleted;
+  - compatibility fields still coexist with the newer contract-first path.
+
+Also note that research provenance / node guard / profiling work is now landed on the orchestrator side. That effort complements the backend migration but should not be confused with “application boundary fully complete.”
+
 ## 1. Migration objective
 
 Move backend delivery code from route-local orchestration to an application-service layer without changing the quant+LLM merge kernel behavior.
@@ -60,6 +77,11 @@ Rollback:
 
 - route handlers can call old inline functions directly via feature flag or import switch
 
+Current status:
+
+- partially complete on mainline via `analysis_service.py`, `job_service.py`, and `result_store.py`
+- not complete enough yet to claim `main.py` is only a thin adapter
+
 ## Phase 2: dual-read for task status
 
 Why:
@@ -116,6 +138,12 @@ Rollback:
 - restore websocket serializer to legacy shape
 - keep application service intact behind adapter
 
+Current status:
+
+- partially complete on mainline
+- `/ws/orchestrator` already emits `contract_version`, `data_quality`, `degradation`, and `research`
+- `/ws/analysis/{task_id}` already reads application-shaped task state
+
 ## Phase 5: remove route-local orchestration
 
 Actions:
@@ -186,3 +214,13 @@ A migration plan is acceptable only if it:
 - introduces feature-flagged cutover points
 - supports dual-read/dual-write only at application/persistence boundary
 - provides a one-step rollback path at each release phase
+
+## 10. Maintainer note
+
+When updating migration status, keep these three documents aligned:
+
+- `docs/architecture/application-boundary.md`
+- `docs/contracts/result-contract-v1alpha1.md`
+- `docs/architecture/research-provenance.md`
+
+The first two describe backend/application convergence; the third describes orchestrator-side research degradation and profiling semantics that now feed those contracts.
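
The Phase 4 rollback rule in the rollback notes (restore the websocket serializer to the legacy shape while keeping the application service intact behind an adapter) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the repository: `serialize_for_websocket`, `to_legacy_shape`, and the `contract_first` flag are hypothetical names, and the legacy field names are borrowed from the field mapping listed in `docs/contracts/result-contract-v1alpha1.md`.

```python
from typing import Any

CONTRACT_VERSION = "v1alpha1"


def to_legacy_shape(contract: dict[str, Any]) -> dict[str, Any]:
    """Flatten a contract-shaped payload back into the older flat fields.

    Hypothetical helper: the field names follow the legacy -> contract
    mapping in the contract doc, applied in reverse.
    """
    result = contract.get("result") or {}
    signals = result.get("signals") or {}
    error = contract.get("error")
    return {
        "quant_signal": (signals.get("quant") or {}).get("rating"),
        "llm_signal": (signals.get("llm") or {}).get("rating"),
        "confidence": result.get("confidence"),
        # Assumption: legacy consumers read a plain string, not a structured error.
        "error": error.get("message") if error else None,
    }


def serialize_for_websocket(
    contract: dict[str, Any],
    *,
    contract_first: bool,  # hypothetical feature flag; False is the rollback path
) -> dict[str, Any]:
    """Pick the wire shape at the transport edge only.

    The application service always produces the contract-shaped dict, so
    flipping the flag is a one-step rollback that leaves the service untouched.
    """
    if contract_first:
        return {**contract, "contract_version": CONTRACT_VERSION}
    return to_legacy_shape(contract)
```

Keeping the switch inside the serializer means the rollback never reaches into the application service or `orchestrator/`; only the transport adapter changes shape.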