# TradingAgents backend migration and rollback notes

Status: draft
Audience: backend/application maintainers
Scope: migrate toward an application-service boundary and result-contract-v1alpha1 with rollback safety

## Current progress snapshot (2026-04)

Mainline has moved beyond pure planning, but it has not finished the full boundary migration:

- `Phase 0` is effectively done: contract and architecture drafts exist.
- `Phase 1-4` are **partially landed**:
  - backend services now project `v1alpha1`-style public payloads;
  - result contracts are persisted via `result_store.py`;
  - `/ws/analysis/{task_id}` and `/ws/orchestrator` already wrap payloads with `contract_version`;
  - recommendation and task-status reads already depend on application-layer shaping more than on route-local reconstruction.
- `Phase 5` is **not complete**:
  - `web_dashboard/backend/main.py` is still too large;
  - route-local orchestration has not been fully deleted;
  - compatibility fields still coexist with the newer contract-first path.

Also note that research provenance, node guard, and profiling work has now landed on the orchestrator side. That effort complements the backend migration but should not be confused with "application boundary fully complete."

## 1. Migration objective

Move backend delivery code from route-local orchestration to an application-service layer without changing the behavior of the quant+LLM merge kernel.

Target outcomes:

- stable result contract (`v1alpha1`)
- thin FastAPI transport
- application-owned task lifecycle and mapping
- rollback-safe migration using dual-read/dual-write where useful

## 2. Current coupling hotspots

Primary hotspot: `web_dashboard/backend/main.py`

It currently combines:

- route handlers
- task persistence
- subprocess creation and monitoring
- progress/stage state mutation
- result projection into API fields
- report export concerns

This file is the first migration target.

## 3. Recommended migration sequence
### Phase 0: contract freeze (draft)

Deliverables:

- agree on `docs/contracts/result-contract-v1alpha1.md`
- agree on the application boundary in `docs/architecture/application-boundary.md`

Rollback:

- none needed; documentation only

### Phase 1: introduce an application service behind existing routes

Actions:

- add backend application modules for analysis status, live signals, and report reads
- keep existing route URLs unchanged
- move mapping logic out of route functions into services/mappers

Compatibility tactic:

- routes still return the current payload shape if the frontend depends on it
- the internal service also emits `v1alpha1` DTOs for verification comparison

Rollback:

- route handlers can call the old inline functions directly via a feature flag or import switch

Current status:

- partially complete on mainline via `analysis_service.py`, `job_service.py`, and `result_store.py`
- not yet complete enough to claim `main.py` is only a thin adapter

### Phase 2: dual-read for task status

Why: task status currently lives in memory plus `data/task_status/*.json`. During migration, the new service storage and the old persisted shape may diverge.

Recommended strategy:

- read preference: new application store first
- fallback read: legacy JSON task status
- compare key fields during the shadow period: `status`, `progress`, `current_stage`, `decision`, `error`

Rollback:

- switch the read preference back to legacy JSON only
- leave the new store populated for debugging, but non-authoritative

### Phase 3: dual-write for task results

Why: to avoid breaking status pages and historical tooling during rollout.
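A dual-write helper along these lines might look like the sketch below. This is a minimal illustration, not existing TradingAgents code: `dual_write_result`, the store arguments, and the default directory are all assumed names; the legacy projection is whatever the old code path still produces.

```python
"""Illustrative dual-write sketch for Phase 3 (all names are assumptions)."""

import json
import logging
from pathlib import Path
from typing import Any

log = logging.getLogger("migration.dual_write")

# Fields worth diffing between the new contract projection and the legacy shape.
COMPARE_KEYS = ("status", "progress", "current_stage", "decision", "error")


def dual_write_result(
    task_id: str,
    new_payload: dict[str, Any],
    legacy_payload: dict[str, Any],
    new_store: dict[str, dict],
    legacy_results: dict[str, dict],
    legacy_dir: Path = Path("data/task_status"),
) -> None:
    """Authoritative write to the new store, compatibility write to the
    legacy in-memory map plus JSON file, and a diff log on disagreement."""
    # 1. authoritative write: new application store
    new_store[task_id] = new_payload

    # 2. compatibility write: legacy app.state-style dict + data/task_status/*.json
    legacy_results[task_id] = legacy_payload
    legacy_dir.mkdir(parents=True, exist_ok=True)
    (legacy_dir / f"{task_id}.json").write_text(json.dumps(legacy_payload))

    # 3. emit a diff log when the two projections disagree on key fields
    for key in COMPARE_KEYS:
        if new_payload.get(key) != legacy_payload.get(key):
            log.warning(
                "dual-write mismatch task=%s field=%s new=%r legacy=%r",
                task_id, key, new_payload.get(key), legacy_payload.get(key),
            )
```

Because both payloads are produced independently (new contract path vs. legacy serializer), the mismatch log doubles as the shadow-period parity signal.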
Recommended strategy:

- authoritative write: new application store
- compatibility write: legacy `app.state.task_results` + `data/task_status/*.json`
- emit diff logs when new-vs-legacy projections disagree

Guardrails:

- dual-write only application-layer payloads
- do not dual-write alternate domain semantics into `orchestrator/`

Rollback:

- disable new-store writes
- continue legacy writes only

### Phase 4: websocket and live signal migration

Actions:

- make `/ws/analysis/{task_id}` and `/ws/orchestrator` render application contracts
- keep websocket wrapper fields stable while migrating the internal body shape

Suggested compatibility step:

- send the legacy event envelope with an embedded `contract_version`
- update frontend consumers before removing legacy-only fields

Rollback:

- restore the websocket serializer to the legacy shape
- keep the application service intact behind an adapter

Current status:

- partially complete on mainline
- `/ws/orchestrator` already emits `contract_version`, `data_quality`, `degradation`, and `research`
- `/ws/analysis/{task_id}` already reads application-shaped task state

### Phase 5: remove route-local orchestration

Actions:

- delete dead inline task mutation helpers from `main.py`
- keep routes as a thin adapter layer
- preserve report retrieval behavior

Rollback:

- only safe after shadow metrics show parity
- otherwise revert to Phase 3 dual-write mode, not direct deletion

## 4. Suggested feature flags

Environment-variable style examples:

- `TA_APP_SERVICE_ENABLED=1`
- `TA_RESULT_CONTRACT_VERSION=v1alpha1`
- `TA_TASKSTORE_DUAL_READ=1`
- `TA_TASKSTORE_DUAL_WRITE=1`
- `TA_WS_V1ALPHA1_ENABLED=0`

These names are placeholders; exact naming can be chosen during implementation.

## 5. Verification checkpoints per phase
For each migration phase, verify that:

- the same task ids are returned for the same route behavior
- stage transitions remain monotonic
- completed tasks persist `decision`, `confidence`, and degraded-path outcomes
- the failure path still preserves actionable error text
- live websocket payloads preserve ticker/date ordering expectations

## 6. Rollback triggers

Roll back immediately if any of these happen:

- task status disappears after a backend restart
- WebSocket clients stop receiving progress updates
- a completed analysis loses its `decision` or confidence fields
- degraded single-lane signals are reclassified incorrectly
- report export or historical report retrieval cannot find prior artifacts

## 7. Explicit non-goals during migration

- do not rewrite the `orchestrator/signals.py` merge math as part of the boundary migration
- do not rework provider/model selection semantics in the same change set
- do not force a frontend redesign before contract shadowing proves parity
- do not implement a new strategy layer inside the application service

## 8. Minimal rollback playbook

If production or local verification fails after the migration cutover:

1. disable the application-service read path
2. disable dual-write to the new store if it corrupts parity checks
3. restore legacy route-local serializers
4. keep generated comparison logs/artifacts for diff analysis
5. re-run backend tests and one end-to-end manual analysis flow

## 9. Review checklist

A migration plan is acceptable only if it:

- preserves orchestrator ownership of quant+LLM merge semantics
- introduces feature-flagged cutover points
- supports dual-read/dual-write only at the application/persistence boundary
- provides a one-step rollback path at each release phase

## 10. Maintainer note
When updating migration status, keep these three documents aligned:

- `docs/architecture/application-boundary.md`
- `docs/contracts/result-contract-v1alpha1.md`
- `docs/architecture/research-provenance.md`

The first two describe backend/application convergence; the third describes orchestrator-side research degradation and profiling semantics that now feed those contracts.
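As a closing illustration, the Phase 2 dual-read preference and the placeholder flags from section 4 might combine as sketched below. This is a hedged sketch under stated assumptions: `read_task_status`, the store shapes, and the default legacy directory are invented for illustration, and the flag name follows the placeholder convention above, not a finalized setting.

```python
"""Illustrative dual-read sketch for Phase 2 (names are assumptions)."""

import json
import os
from pathlib import Path
from typing import Any, Optional


def _flag(name: str, default: str = "0") -> bool:
    """Read a placeholder feature flag (section 4 names are not final)."""
    return os.environ.get(name, default) == "1"


def read_task_status(
    task_id: str,
    new_store: dict[str, dict],
    legacy_dir: Path = Path("data/task_status"),
) -> Optional[dict[str, Any]]:
    """Prefer the new application store; fall back to legacy JSON task
    status only while dual-read is enabled."""
    # read preference: new application store first
    status = new_store.get(task_id)
    if status is not None:
        return status
    # fallback read: legacy JSON task status
    if _flag("TA_TASKSTORE_DUAL_READ"):
        legacy_path = legacy_dir / f"{task_id}.json"
        if legacy_path.exists():
            return json.loads(legacy_path.read_text())
    return None
```

With this shape, the Phase 2 rollback is a configuration change: flip the flag off and route reads back to the legacy path without touching code.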