docs: add DECISIONS.md recording deliberate "won't do" choices

Document 5 features we evaluated and chose not to implement
(handoffs, checkpointing, A2A, MCP, dashboard) to maintain
our "simplest multi-agent framework" positioning.

Closes #17, #20.
JackChen 2026-04-03 03:02:56 +08:00
parent 8d27c6a1fe
commit d8a217106f
1 changed file with 43 additions and 0 deletions

DECISIONS.md Normal file

@@ -0,0 +1,43 @@
# Architecture Decisions
This document records deliberate "won't do" decisions for the project. These are features we evaluated and chose NOT to implement — not because they're bad ideas, but because they conflict with our positioning as the **simplest multi-agent framework**.
If you're considering a PR in any of these areas, please open a discussion first.
## Won't Do
### 1. Agent Handoffs
**What**: Agent A transfers an in-progress conversation to Agent B (like `handoff()` in the OpenAI Agents SDK).
**Why not**: Handoffs are a different paradigm from our task-based model. Our tasks have clear boundaries — one agent, one task, one result. Handoffs blur those boundaries and add state-transfer complexity. Users who need handoffs likely need a different framework (OpenAI Agents SDK is purpose-built for this).
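The "one agent, one task, one result" boundary can be pictured as a data shape. The sketch below is illustrative only: `Task`, `TaskResult`, `AgentFn`, and `runTasks` are hypothetical names, not the framework's API, and the agents are synchronous stand-ins for what would be asynchronous LLM calls. The point it tries to show is that an agent sees only the finished results of upstream tasks, never another agent's in-progress conversation.

```typescript
// Hypothetical types for illustration; the framework's real task model may differ.
interface Task {
  id: string;
  agent: string;       // exactly one agent owns the task
  prompt: string;
  dependsOn: string[]; // ids of tasks whose results this task may read
}

interface TaskResult {
  taskId: string;
  agent: string;
  output: string;      // exactly one result per task
}

// Synchronous stand-in for an agent; a real agent would call an LLM.
type AgentFn = (prompt: string, upstream: TaskResult[]) => string;

function runTasks(tasks: Task[], agents: Record<string, AgentFn>): TaskResult[] {
  const done = new Map<string, TaskResult>();
  const pending = [...tasks];
  while (pending.length > 0) {
    // Pick any task whose dependencies have all finished.
    const idx = pending.findIndex((t) => t.dependsOn.every((d) => done.has(d)));
    if (idx === -1) throw new Error("unresolvable task dependencies");
    const [task] = pending.splice(idx, 1);
    const run = agents[task.agent];
    if (!run) throw new Error(`unknown agent: ${task.agent}`);
    // Only completed results cross the task boundary.
    const upstream = task.dependsOn.map((d) => done.get(d)!);
    done.set(task.id, {
      taskId: task.id,
      agent: task.agent,
      output: run(task.prompt, upstream),
    });
  }
  // Return results in the order the tasks were declared.
  return tasks.map((t) => done.get(t.id)!);
}
```

Under this shape, work that a handoff would pass along mid-conversation is instead expressed as a second task that depends on the first.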
### 2. State Persistence / Checkpointing
**What**: Save workflow state to a database so long-running workflows can resume after crashes (like LangGraph checkpointing).
**Why not**: Requires a storage backend (SQLite, Redis, Postgres), schema migrations, and serialization logic. This is enterprise infrastructure that would greatly enlarge the complexity surface. Our target users run workflows that complete in seconds to minutes, not hours. If you need checkpointing, LangGraph is the right tool.
**Related**: Closing #20 with this rationale.
### 3. A2A Protocol (Agent-to-Agent)
**What**: Google's open protocol for agents on different servers to discover and communicate with each other.
**Why not**: Too early — the spec is still evolving and adoption is minimal. Our users run agents in a single process, not across distributed services. If A2A matures and there's real demand, we can revisit. Today it would add complexity for zero practical benefit.
### 4. MCP Integration (Model Context Protocol)
**What**: Anthropic's protocol for connecting LLMs to external tools and data sources.
**Why not**: MCP is valuable but targets a different layer. Our `defineTool()` API already lets users wrap any external service as a tool in ~10 lines of code. Adding MCP would mean maintaining protocol compatibility, transport layers, and tool discovery — complexity that serves tool platform builders, not our target users who just want to run agent teams.
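As a rough illustration of the "wrap any external service" claim, here is a sketch of what such a wrapper might look like. The `defineTool` below is a local stand-in written for this example, so the real signature may well differ; `ToolSpec`, `HttpGetJson`, `weatherUrl`, `makeWeatherTool`, and the `api.example.com` endpoint are likewise hypothetical. The HTTP transport is passed in as a function so the sketch does not depend on any particular client.

```typescript
// Hypothetical stand-in for the framework's defineTool(); the real API may differ.
interface ToolSpec<I, O> {
  name: string;
  description: string;
  execute: (input: I) => Promise<O>;
}

function defineTool<I, O>(spec: ToolSpec<I, O>): ToolSpec<I, O> {
  return spec; // a real helper would presumably also validate input and register the tool
}

// Injected transport: any function that GETs a URL and resolves to parsed JSON.
type HttpGetJson = (url: string) => Promise<unknown>;

// Hypothetical endpoint, for illustration only.
const WEATHER_API = "https://api.example.com/weather";

function weatherUrl(city: string): string {
  return `${WEATHER_API}?city=${encodeURIComponent(city)}`;
}

// The tool definition itself stays at roughly ten lines.
function makeWeatherTool(getJson: HttpGetJson): ToolSpec<{ city: string }, unknown> {
  return defineTool({
    name: "get_weather",
    description: "Look up the current weather for a city.",
    execute: (input: { city: string }) => getJson(weatherUrl(input.city)),
  });
}
```

In practice `getJson` could be a thin wrapper over whichever HTTP client the project already uses.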
### 5. Dashboard / Visualization
**What**: Built-in web UI to visualize task DAGs, agent activity, and token usage.
**Why not**: We expose data; we don't build UI. The `onProgress` callback and the upcoming `onTrace` (#18) give users all the raw data. They can pipe it into Grafana, build a custom dashboard, or use console logs. Shipping a web UI means owning a frontend stack, which is outside our scope.
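A minimal sketch of the "pipe it yourself" approach follows. The `ProgressEvent` shape is an assumption made for this example, since the actual `onProgress` payload is not specified here, and `createUsageCollector` is a hypothetical helper. It writes one JSON line per event, which log shippers can forward to a dashboard, and keeps a running token total per agent.

```typescript
// Hypothetical event shape; the real onProgress payload may carry different fields.
interface ProgressEvent {
  type: "task_start" | "task_complete" | "error";
  agent: string;
  task: string;
  tokens?: number; // assumed to be reported on completion, if at all
}

function createUsageCollector() {
  const tokensByAgent = new Map<string, number>();

  const onProgress = (event: ProgressEvent): void => {
    // One JSON object per line is easy to ship to a log pipeline.
    console.log(JSON.stringify({ ts: Date.now(), ...event }));
    if (event.tokens !== undefined) {
      const previous = tokensByAgent.get(event.agent) ?? 0;
      tokensByAgent.set(event.agent, previous + event.tokens);
    }
  };

  return { onProgress, tokensByAgent };
}
```

The returned `onProgress` function would be passed wherever the framework accepts its progress callback, and `tokensByAgent` can be read after the run finishes.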
---
*Last updated: 2026-04-03*