Compare commits

86 Commits

| Author | SHA1 | Date |
|---|---|---|
| | 99d9d7f52e | |
| | 075ea3dbc9 | |
| | 647aeff8f4 | |
| | a33622bdf1 | |
| | 9d5345f15d | |
| | 912e0eae03 | |
| | 6c31e3dd95 | |
| | db4fa79894 | |
| | e5c82bb0d1 | |
| | 7adb065ea3 | |
| | ffec13e915 | |
| | b857c001a8 | |
| | eea87e2b81 | |
| | 2708531b1d | |
| | b4b38ffdd8 | |
| | 4714dd1d4c | |
| | 6de7bbd41f | |
| | 696269c924 | |
| | a6b5181c74 | |
| | c3ead26677 | |
| | d78af2787f | |
| | cd57c2ee31 | |
| | b6ee63bda0 | |
| | 1e154b22fd | |
| | 0485bfb82f | |
| | fa4533e8d0 | |
| | 5305cb2321 | |
| | 34b911825a | |
| | 1cc43eba6e | |
| | 8ecfc1504c | |
| | 0170e43c4e | |
| | 93795db09f | |
| | 9b487ca368 | |
| | 38a88df144 | |
| | 017e0f42f6 | |
| | c0ddcfc7aa | |
| | cdec60e7ad | |
| | dfe46721a5 | |
| | 0f16e81ae6 | |
| | 5804a54898 | |
| | 252419e1f8 | |
| | 6ea66afab5 | |
| | 97c5e457dd | |
| | 9b04fbf2e5 | |
| | 9a446b8796 | |
| | dc88232885 | |
| | ced1d90a93 | |
| | 0fb8a38284 | |
| | 629d9c8253 | |
| | 167085c3a7 | |
| | 12dd802ad8 | |
| | 1fbed196ca | |
| | a220b6ecc5 | |
| | 89311dc4d4 | |
| | 06cc415ddf | |
| | aa5fab59fa | |
| | 7aa1bb7b5d | |
| | eb484d9bbf | |
| | f1c7477a26 | |
| | 664bed987f | |
| | 2022882bfb | |
| | 0b57ffe3e9 | |
| | 03dc897929 | |
| | cb11020c65 | |
| | 91494bcca9 | |
| | faf24aaffa | |
| | d8c3808851 | |
| | 54bfe2ed2d | |
| | 40f13a09a6 | |
| | 30369b0597 | |
| | dc8cbe0262 | |
| | 97c39b316c | |
| | 48fbec6659 | |
| | 9463dbb28e | |
| | cfbbd24601 | |
| | 0fd18d8a19 | |
| | 34ca8602d0 | |
| | 607ba57a69 | |
| | 5a67d559a3 | |
| | a29d87f384 | |
| | 73b2454c2f | |
| | 60fb2b142e | |
| | 1e3bd1013e | |
| | 336d94e50d | |
| | d59898ce3d | |
| | c23a20bb6c | |
@@ -6,6 +6,17 @@ labels: enhancement
assignees: ''
---

## Source

**Where did this idea come from?** (Pick one — helps maintainers triage and prioritize.)

- [ ] **Real use case** — I'm using open-multi-agent and hit this limit. Describe the use case in "Problem" below.
- [ ] **Competitive reference** — Another framework has this (LangChain, AutoGen, CrewAI, Mastra, XCLI, etc.). Please name or link it.
- [ ] **Systematic gap** — A missing piece in the framework matrix (provider not supported, tool not covered, etc.).
- [ ] **Discussion / inspiration** — Came up in a tweet, Reddit post, Discord, or AI conversation. Please link or paste the source if possible.

> **Maintainer note**: after triage, label with one of `community-feedback`, `source:competitive`, `source:analysis`, `source:owner` (multiple OK if the source is mixed — e.g. competitive analysis + user feedback).

## Problem

A clear description of the problem or limitation you're experiencing.
@@ -21,3 +21,19 @@ jobs:
      - run: rm -f package-lock.json && npm install
      - run: npm run lint
      - run: npm test

  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: rm -f package-lock.json && npm install
      - run: npm run test:coverage
      - uses: codecov/codecov-action@v5
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
          files: ./coverage/lcov.info
          fail_ci_if_error: false
@@ -3,5 +3,4 @@ dist/
coverage/
*.tgz
.DS_Store
promo-*.md
non-tech_*/
oma-dashboards/
21 CLAUDE.md
@@ -10,9 +10,10 @@ npm run dev   # Watch mode compilation
npm run lint          # Type-check only (tsc --noEmit)
npm test              # Run all tests (vitest run)
npm run test:watch    # Vitest watch mode
node dist/cli/oma.js help   # After build: shell/CI CLI (`oma` when installed via npm bin)
```

-Tests live in `tests/` (vitest). Examples in `examples/` are standalone scripts requiring API keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`).
+Tests live in `tests/` (vitest). Examples in `examples/` are standalone scripts requiring API keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`). CLI usage and JSON schemas: `docs/cli.md`.

## Architecture
@@ -55,7 +56,7 @@ This is the framework's key feature. When `runTeam()` is called:

### Concurrency Control

-Two independent semaphores: `AgentPool` (max concurrent agent runs, default 5) and `ToolExecutor` (max concurrent tool calls, default 4).
+Three semaphore layers: `AgentPool` pool-level (max concurrent agent runs, default 5), `AgentPool` per-agent mutex (serializes concurrent runs on the same `Agent` instance), and `ToolExecutor` (max concurrent tool calls, default 4).

### Structured Output
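The pool-level limits in the hunk above behave like counting semaphores. A minimal illustrative sketch follows; this is not the framework's actual implementation, just the mechanism the defaults (5 agent runs, 4 tool calls) rely on:

```typescript
// Minimal counting semaphore: the mechanism behind pool-level concurrency caps.
// Hypothetical sketch, not the framework's real classes.
class Semaphore {
  private waiters: (() => void)[] = []
  constructor(private slots: number) {}

  async acquire(): Promise<void> {
    if (this.slots > 0) {
      this.slots--
      return
    }
    // No free slot: park until a release hands one over
    await new Promise<void>((resolve) => this.waiters.push(resolve))
  }

  release(): void {
    const next = this.waiters.shift()
    if (next) next() // pass the slot directly to the next waiter
    else this.slots++
  }
}

// Usage: cap concurrent work at 2, the way AgentPool caps agent runs at 5.
async function demo(): Promise<number> {
  const sem = new Semaphore(2)
  let running = 0
  let peak = 0
  await Promise.all(
    [1, 2, 3, 4, 5].map(async () => {
      await sem.acquire()
      running++
      peak = Math.max(peak, running)
      await new Promise((r) => setTimeout(r, 10)) // simulated work
      running--
      sem.release()
    }),
  )
  return peak
}
```

With five tasks and two slots, `demo()` resolves to a peak concurrency of 2: waiters only resume when a slot is released.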
@@ -73,7 +74,21 @@ Optional `maxRetries`, `retryDelayMs`, `retryBackoff` on task config (used via `

### Built-in Tools

-`bash`, `file_read`, `file_write`, `file_edit`, `grep` — registered via `registerBuiltInTools(registry)`.
+`bash`, `file_read`, `file_write`, `file_edit`, `grep`, `glob` — registered via `registerBuiltInTools(registry)`. `delegate_to_agent` is opt-in (`registerBuiltInTools(registry, { includeDelegateTool: true })`) and only wired up inside pool workers by `runTeam`/`runTasks` — see "Agent Delegation" below.

### Agent Delegation

`delegate_to_agent` (in `src/tool/built-in/delegate.ts`) lets an agent synchronously hand a sub-prompt to another roster agent and receive its final output as a tool result. Only active during orchestrated runs; standalone `runAgent` and the `runTeam` short-circuit path (`isSimpleGoal` hit) do not inject it.

Guards (all enforced in the tool itself, before `runDelegatedAgent` is called):

- **Self-delegation:** rejected (`target === context.agent.name`)
- **Unknown agent:** rejected (target not in team roster)
- **Cycle detection:** rejected if target already in `TeamInfo.delegationChain` (prevents `A → B → A` from burning tokens up to the depth cap)
- **Depth cap:** `OrchestratorConfig.maxDelegationDepth` (default 3)
- **Pool deadlock:** rejected when `AgentPool.availableRunSlots < 1`, without calling the pool

The delegated run's `AgentRunResult.tokenUsage` is surfaced via `ToolResult.metadata.tokenUsage`; the runner accumulates it into `totalUsage` before the next `maxTokenBudget` check, so delegation cannot silently bypass the parent's budget. Delegation tool_result blocks are exempt from `compressToolResults` and the `compact` context strategy so the parent agent retains the full sub-agent output across turns. Best-effort SharedMemory audit writes at `{caller}/delegation:{target}:{timestamp}-{rand}` if the team has shared memory enabled.

### Adding an LLM Adapter
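The delegation guard order above can be sketched as one plain function. The shapes below (`DelegateContext` and its field names) are illustrative only, not the framework's real types:

```typescript
// Illustrative sketch of the delegation guard order described above.
// DelegateContext and its fields are hypothetical, not the framework's API.
interface DelegateContext {
  callerName: string
  roster: string[]            // agent names in the team
  delegationChain: string[]   // agents already on the delegation path
  maxDelegationDepth: number  // cf. OrchestratorConfig.maxDelegationDepth (default 3)
  availableRunSlots: number   // cf. AgentPool.availableRunSlots
}

// Returns a rejection reason, or null when all guards pass.
function checkDelegationGuards(target: string, ctx: DelegateContext): string | null {
  if (target === ctx.callerName) return 'self-delegation rejected'
  if (!ctx.roster.includes(target)) return 'unknown agent rejected'
  if (ctx.delegationChain.includes(target)) return 'cycle detected' // blocks A -> B -> A
  if (ctx.delegationChain.length >= ctx.maxDelegationDepth) return 'depth cap reached'
  if (ctx.availableRunSlots < 1) return 'pool deadlock: no free run slots'
  return null // safe to run the delegated agent
}
```

Running the checks before touching the pool is what makes the deadlock guard cheap: a rejected delegation costs a string comparison, not a queued agent run.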
30 DECISIONS.md
@@ -1,11 +1,11 @@
# Architecture Decisions

-This document records deliberate "won't do" decisions for the project. These are features we evaluated and chose NOT to implement — not because they're bad ideas, but because they conflict with our positioning as the **simplest multi-agent framework**.
-If you're considering a PR in any of these areas, please open a discussion first.
+This document records our architectural decisions — both what we choose NOT to build, and what we're actively working toward. Our goal is to be the **simplest multi-agent framework**, but simplicity doesn't mean closed. We believe the long-term value of a framework isn't its feature checklist — it's the size of the network it connects to.

## Won't Do

These are paradigms we evaluated and deliberately chose not to implement, because they conflict with our core model.

### 1. Agent Handoffs

**What**: Agent A transfers an in-progress conversation to Agent B (like OpenAI Agents SDK `handoff()`).
@@ -20,24 +20,30 @@ If you're considering a PR in any of these areas, please open a discussion first

**Related**: Closing #20 with this rationale.

-### 3. A2A Protocol (Agent-to-Agent)
-**What**: Google's open protocol for agents on different servers to discover and communicate with each other.
-**Why not**: Too early — the spec is still evolving and adoption is minimal. Our users run agents in a single process, not across distributed services. If A2A matures and there's real demand, we can revisit. Today it would add complexity for zero practical benefit.
+## Open to Adoption
+
+These are protocols we see strategic value in and are actively tracking. We're waiting for the right moment — not the right feature spec, but the right network density.
+
+> **Our thesis**: Framework competition on features (DAG scheduling, shared memory, zero-dependency) is a race that can always be caught. Network competition — where the value of the framework grows with every agent published to it — creates a fundamentally different moat. MCP and A2A are the protocols that turn a framework from a build tool into a registry.

-### 4. MCP Integration (Model Context Protocol)
+### 3. MCP Integration (Model Context Protocol)

**What**: Anthropic's protocol for connecting LLMs to external tools and data sources.

-**Why not**: MCP is valuable but targets a different layer. Our `defineTool()` API already lets users wrap any external service as a tool in ~10 lines of code. Adding MCP would mean maintaining protocol compatibility, transport layers, and tool discovery — complexity that serves tool platform builders, not our target users who just want to run agent teams.
+**Status**: **Next up.** MCP has crossed the adoption threshold — Cursor, Windsurf, Claude Code all ship with built-in support, and many services now provide MCP servers directly. Asking users to re-wrap each one via `defineTool()` creates unnecessary friction.
+
+**Approach**: Optional peer dependency (`@modelcontextprotocol/sdk`). Zero impact on the core — if you don't use MCP, you don't pay for it. This preserves our minimal-dependency principle while connecting to the broader tool ecosystem.
+
+**Tracking**: #86

-### 5. Dashboard / Visualization
-**What**: Built-in web UI to visualize task DAGs, agent activity, and token usage.
-**Why not**: We expose data, we don't build UI. The `onProgress` callback and upcoming `onTrace` (#18) give users all the raw data. They can pipe it into Grafana, build a custom dashboard, or use console logs. Shipping a web UI means owning a frontend stack, which is outside our scope.
+### 4. A2A Protocol (Agent-to-Agent)
+
+**What**: Google's open protocol for agents on different servers to discover and communicate with each other.
+
+**Status**: **Watching.** The spec is still evolving and production adoption is minimal. But we recognize A2A's potential to enable the network effect we care about — if 1,000 developers publish agent services using open-multi-agent, the 1,001st developer isn't just choosing an API, they're choosing which ecosystem has the most agents they can call.
+
+**When we'll move**: When A2A adoption reaches a tipping point where the protocol connects real, production agent services — not just demos. We'll prioritize a lightweight integration that lets agents be both consumers and providers of A2A services.

---

-*Last updated: 2026-04-03*
+*Last updated: 2026-04-09*
344 README.md
@@ -1,29 +1,61 @@
# Open Multi-Agent

-TypeScript framework for multi-agent orchestration. One `runTeam()` call from goal to result — the framework decomposes it into tasks, resolves dependencies, and runs agents in parallel.
+The lightweight multi-agent orchestration engine for TypeScript. Three runtime dependencies, zero config, goal to result in one `runTeam()` call.

-3 runtime dependencies · 33 source files · Deploys anywhere Node.js runs · Mentioned in [Latent Space](https://www.latent.space/p/ainews-a-quiet-april-fools) AI News
+CrewAI is Python. LangGraph makes you draw the graph by hand. `open-multi-agent` is the `npm install` you drop into an existing Node.js backend when you need a team of agents to work on a goal together. Nothing more, nothing less.

[](https://www.npmjs.com/package/@jackchen_me/open-multi-agent)
[](https://github.com/JackChen-me/open-multi-agent/stargazers)
[](./LICENSE)
[](https://www.typescriptlang.org/)
[](https://github.com/JackChen-me/open-multi-agent/actions)
[](https://github.com/JackChen-me/open-multi-agent/blob/main/package.json)
[](https://codecov.io/gh/JackChen-me/open-multi-agent)

**English** | [中文](./README_zh.md)

-## Why Open Multi-Agent?
+## What you actually get

-- **Goal In, Result Out** — `runTeam(team, "Build a REST API")`. A coordinator agent auto-decomposes the goal into a task DAG with dependencies and assignees, runs independent tasks in parallel, and synthesizes the final output. No manual task definitions or graph wiring required.
-- **TypeScript-Native** — Built for the Node.js ecosystem. `npm install`, import, run. No Python runtime, no subprocess bridge, no sidecar services. Embed in Express, Next.js, serverless functions, or CI/CD pipelines.
-- **Auditable and Lightweight** — 3 runtime dependencies (`@anthropic-ai/sdk`, `openai`, `zod`). 33 source files. The entire codebase is readable in an afternoon.
-- **Model Agnostic** — Claude, GPT, Gemma 4, and local models (Ollama, vLLM, LM Studio, llama.cpp server) in the same team. Swap models per agent via `baseURL`.
-- **Multi-Agent Collaboration** — Agents with different roles, tools, and models collaborate through a message bus and shared memory.
-- **Structured Output** — Add `outputSchema` (Zod) to any agent. Output is parsed as JSON, validated, and auto-retried once on failure. Access typed results via `result.structured`.
-- **Task Retry** — Set `maxRetries` on tasks for automatic retry with exponential backoff. Failed attempts accumulate token usage for accurate billing.
-- **Human-in-the-Loop** — Optional `onApproval` callback on `runTasks()`. After each batch of tasks completes, your callback decides whether to proceed or abort remaining work.
-- **Lifecycle Hooks** — `beforeRun` / `afterRun` on `AgentConfig`. Intercept the prompt before execution or post-process results after. Throw from either hook to abort.
-- **Loop Detection** — `loopDetection` on `AgentConfig` catches stuck agents repeating the same tool calls or text output. Configurable action: warn (default), terminate, or custom callback.
-- **Observability** — Optional `onTrace` callback emits structured spans for every LLM call, tool execution, task, and agent run — with timing, token usage, and a shared `runId` for correlation. Zero overhead when not subscribed, zero extra dependencies.
+- **Goal to result in one call.** `runTeam(team, "Build a REST API")` kicks off a coordinator agent that decomposes the goal into a task DAG, resolves dependencies, runs independent tasks in parallel, and synthesizes the final output. No graph to draw, no tasks to wire up.
+- **TypeScript-native, three runtime dependencies.** `@anthropic-ai/sdk`, `openai`, `zod`. That is the whole runtime. Embed in Express, Next.js, serverless functions, or CI/CD pipelines. No Python runtime, no subprocess bridge, no cloud sidecar.
+- **Multi-model teams.** Claude, GPT, Gemini, Grok, MiniMax, DeepSeek, Copilot, or any OpenAI-compatible local model (Ollama, vLLM, LM Studio, llama.cpp) in the same team. Run the architect on Opus 4.6, the developer on GPT-5.4, the reviewer on local Gemma 4, all in one `runTeam()` call. Gemini ships as an optional peer dependency: `npm install @google/genai` to enable.
+
+Other features (MCP integration, context strategies, structured output, task retry, human-in-the-loop, lifecycle hooks, loop detection, observability) live below the fold and in [`examples/`](./examples/).

## Philosophy: what we build, what we don't

Our goal is to be the simplest multi-agent framework for TypeScript. Simplicity does not mean closed. We believe the long-term value of a framework is the size of the network it connects to, not its feature checklist.

**We build:**
- A coordinator that decomposes a goal into a task DAG.
- A task queue that runs independent tasks in parallel and cascades failures to dependents.
- A shared memory and message bus so agents can see each other's output.
- Multi-model teams where each agent can use a different LLM provider.

**We don't build:**
- **Agent handoffs.** If agent A needs to transfer mid-conversation to agent B, use [OpenAI Agents SDK](https://github.com/openai/openai-agents-python). In our model, each agent owns one task end-to-end, with no mid-conversation transfers.
- **State persistence / checkpointing.** Not planned for now. Adding a storage backend would break the three-dependency promise, and our workflows run in seconds to minutes, not hours. If real usage shifts toward long-running workflows, we will revisit.

**Tracking:**
- **A2A protocol.** Watching; we will move when production adoption is real.

See [`DECISIONS.md`](./DECISIONS.md) for the full rationale.

## How is this different from X?

**vs. [LangGraph JS](https://github.com/langchain-ai/langgraphjs).** LangGraph is declarative graph orchestration: you define nodes, edges, and conditional routing, then `compile()` and `invoke()`. `open-multi-agent` is goal-driven: you declare a team and a goal, and a coordinator decomposes it into a task DAG at runtime. LangGraph gives you total control of topology (great for fixed production workflows). This gives you less typing and faster iteration (great for exploratory multi-agent work). LangGraph also has mature checkpointing; we do not.

**vs. [CrewAI](https://github.com/crewAIInc/crewAI).** CrewAI is the mature Python choice. If your stack is Python, use CrewAI. `open-multi-agent` is TypeScript-native: three runtime dependencies, and it embeds directly in Node.js without a subprocess bridge. Roughly comparable capability on the orchestration side. Choose on language fit.

**vs. [Vercel AI SDK](https://github.com/vercel/ai).** AI SDK is the LLM call layer: a unified TypeScript client for 60+ providers with streaming, tool calls, and structured outputs. It does not orchestrate multi-agent teams. `open-multi-agent` sits on top when you need that. They compose: use AI SDK for single-agent work, reach for this when you need a team.

## Used by

`open-multi-agent` is a new project (launched 2026-04-01, MIT, 5,500+ stars). The ecosystem is still forming, so the list below is short and honest:

- **[temodar-agent](https://github.com/xeloxa/temodar-agent)** (~50 stars). WordPress security analysis platform by [Ali Sünbül](https://github.com/xeloxa). Uses our built-in tools (`bash`, `file_*`, `grep`) directly in its Docker runtime. Confirmed production use.
- **Cybersecurity SOC (home lab).** A private setup running Qwen 2.5 + DeepSeek Coder entirely offline via Ollama, building an autonomous SOC pipeline on Wazuh + Proxmox. Early user, not yet public.

Using `open-multi-agent` in production or a side project? [Open a discussion](https://github.com/JackChen-me/open-multi-agent/discussions) and we will list it here.

## Quick Start
@@ -33,14 +65,21 @@ Requires Node.js >= 18.
npm install @jackchen_me/open-multi-agent
```

-Set the API key for your provider. Local models via Ollama require no API key — see [example 06](examples/06-local-model.ts).
+Set the API key for your provider. Local models via Ollama require no API key. See [`providers/ollama`](examples/providers/ollama.ts).

- `ANTHROPIC_API_KEY`
- `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT` (for Azure OpenAI; the deployment is an optional fallback when `model` is blank)
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `XAI_API_KEY` (for Grok)
- `MINIMAX_API_KEY` (for MiniMax)
- `MINIMAX_BASE_URL` (for MiniMax; optional, selects the endpoint)
- `DEEPSEEK_API_KEY` (for DeepSeek)
- `GITHUB_TOKEN` (for Copilot)

-Three agents, one goal — the framework handles the rest:
+**CLI (`oma`).** For shell and CI, the package exposes a JSON-first binary. See [docs/cli.md](./docs/cli.md) for `oma run`, `oma task`, `oma provider`, exit codes, and file formats.
+
+Three agents, one goal. The framework handles the rest:

```typescript
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
@@ -53,19 +92,8 @@ const architect: AgentConfig = {
  tools: ['file_write'],
}

-const developer: AgentConfig = {
-  name: 'developer',
-  model: 'claude-sonnet-4-6',
-  systemPrompt: 'You implement what the architect designs.',
-  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
-}
-
-const reviewer: AgentConfig = {
-  name: 'reviewer',
-  model: 'claude-sonnet-4-6',
-  systemPrompt: 'You review code for correctness and clarity.',
-  tools: ['file_read', 'grep'],
-}
+const developer: AgentConfig = { /* same shape, tools: ['bash', 'file_read', 'file_write', 'file_edit'] */ }
+const reviewer: AgentConfig = { /* same shape, tools: ['file_read', 'grep'] */ }

const orchestrator = new OpenMultiAgent({
  defaultModel: 'claude-sonnet-4-6',
@@ -78,7 +106,7 @@ const team = orchestrator.createTeam('api-team', {
  sharedMemory: true,
})

-// Describe a goal — the framework breaks it into tasks and orchestrates execution
+// Describe a goal. The framework breaks it into tasks and orchestrates execution
const result = await orchestrator.runTeam(team, 'Create a REST API for a todo list in /tmp/todo-api/')

console.log(`Success: ${result.success}`)
@@ -94,8 +122,8 @@ task_complete architect
-task_start developer
-task_complete developer
-task_start reviewer // unblocked after implementation
+task_start developer    // independent tasks run in parallel
+task_complete developer
+task_start reviewer     // unblocked after implementation
task_complete reviewer
agent_complete coordinator // synthesizes final result
Success: true
@@ -106,33 +134,25 @@ Tokens: 12847 output tokens

| Mode | Method | When to use |
|------|--------|-------------|
-| Single agent | `runAgent()` | One agent, one prompt — simplest entry point |
+| Single agent | `runAgent()` | One agent, one prompt. Simplest entry point |
| Auto-orchestrated team | `runTeam()` | Give a goal, framework plans and executes |
| Explicit pipeline | `runTasks()` | You define the task graph and assignments |

+For MapReduce-style fan-out without task dependencies, use `AgentPool.runParallel()` directly. See [`patterns/fan-out-aggregate`](examples/patterns/fan-out-aggregate.ts).

## Examples

-All examples are runnable scripts in [`examples/`](./examples/). Run any of them with `npx tsx`:
-
-```bash
-npx tsx examples/01-single-agent.ts
-```
+[`examples/`](./examples/) is organized by category: basics, providers, patterns, integrations, and production. See [`examples/README.md`](./examples/README.md) for the full index. Highlights:
+
+- [`basics/team-collaboration`](examples/basics/team-collaboration.ts): the `runTeam()` coordinator pattern.
+- [`patterns/structured-output`](examples/patterns/structured-output.ts): any agent returns Zod-validated JSON.
+- [`patterns/agent-handoff`](examples/patterns/agent-handoff.ts): synchronous sub-agent delegation via `delegate_to_agent`.
+- [`integrations/trace-observability`](examples/integrations/trace-observability.ts): `onTrace` spans for LLM calls, tools, and tasks.
+- [`integrations/mcp-github`](examples/integrations/mcp-github.ts): expose an MCP server's tools to an agent via `connectMCPTools()`.
+- [`integrations/with-vercel-ai-sdk`](examples/integrations/with-vercel-ai-sdk/): a Next.js app combining OMA `runTeam()` with AI SDK `useChat` streaming.
+- **Provider examples**: eight three-agent teams (one per supported provider) under [`examples/providers/`](examples/providers/).

-| Example | What it shows |
-|---------|---------------|
-| [01 — Single Agent](examples/01-single-agent.ts) | `runAgent()` one-shot, `stream()` streaming, `prompt()` multi-turn |
-| [02 — Team Collaboration](examples/02-team-collaboration.ts) | `runTeam()` auto-orchestration with coordinator pattern |
-| [03 — Task Pipeline](examples/03-task-pipeline.ts) | `runTasks()` explicit dependency graph (design → implement → test + review) |
-| [04 — Multi-Model Team](examples/04-multi-model-team.ts) | `defineTool()` custom tools, mixed Anthropic + OpenAI providers, `AgentPool` |
-| [05 — Copilot](examples/05-copilot-test.ts) | GitHub Copilot as an LLM provider |
-| [06 — Local Model](examples/06-local-model.ts) | Ollama + Claude in one pipeline via `baseURL` (works with vLLM, LM Studio, etc.) |
-| [07 — Fan-Out / Aggregate](examples/07-fan-out-aggregate.ts) | `runParallel()` MapReduce — 3 analysts in parallel, then synthesize |
-| [08 — Gemma 4 Local](examples/08-gemma4-local.ts) | `runTasks()` + `runTeam()` with local Gemma 4 via Ollama — zero API cost |
-| [09 — Structured Output](examples/09-structured-output.ts) | `outputSchema` (Zod) on AgentConfig — validated JSON via `result.structured` |
-| [10 — Task Retry](examples/10-task-retry.ts) | `maxRetries` / `retryDelayMs` / `retryBackoff` with `task_retry` progress events |
-| [11 — Trace Observability](examples/11-trace-observability.ts) | `onTrace` callback — structured spans for LLM calls, tools, tasks, and agents |
-| [12 — Grok](examples/12-grok.ts) | Same as example 02 (`runTeam()` collaboration) with Grok (`XAI_API_KEY`) |
-| [13 — Gemini](examples/13-gemini.ts) | Gemini adapter smoke test with `gemini-2.5-flash` (`GEMINI_API_KEY`) |
+Run scripts with `npx tsx examples/basics/team-collaboration.ts`.

## Architecture
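The MapReduce-style fan-out mentioned in the hunk above is a generic pattern: run one worker per input under a concurrency cap, then reduce the results. A framework-free sketch of that pattern follows (`fanOutAggregate` is an illustrative helper, not the library's `AgentPool.runParallel()` API):

```typescript
// Generic fan-out / aggregate: the pattern behind MapReduce-style agent runs.
// Illustrative sketch only; the framework's AgentPool.runParallel() covers the
// fan-out half with real agents instead of plain async workers.
async function fanOutAggregate<T, R>(
  inputs: T[],
  worker: (input: T) => Promise<R>,
  aggregate: (results: R[]) => R,
  maxConcurrent = 3,
): Promise<R> {
  const results: R[] = new Array(inputs.length)
  let next = 0
  // Each "lane" pulls the next unclaimed input until none remain,
  // so at most maxConcurrent workers run at once.
  async function lane(): Promise<void> {
    while (next < inputs.length) {
      const i = next++
      results[i] = await worker(inputs[i])
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(maxConcurrent, inputs.length) }, lane),
  )
  return aggregate(results)
}
```

Usage mirrors the fan-out-aggregate example's shape: run several analysts in parallel (`worker`), then synthesize (`aggregate`), e.g. `fanOutAggregate(docs, analyze, synthesize)`.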
@@ -168,12 +188,14 @@ npx tsx examples/01-single-agent.ts
│  │ - CopilotAdapter     │
│  │ - GeminiAdapter      │
│  │ - GrokAdapter        │
+│  │ - MiniMaxAdapter     │
+│  │ - DeepSeekAdapter    │
│  └──────────────────────┘
┌────────▼──────────┐
│ AgentRunner       │    ┌──────────────────────┐
│ - conversation    │───►│ ToolRegistry         │
│   loop            │    │ - defineTool()       │
-│ - tool dispatch   │    │ - 5 built-in tools   │
+│ - tool dispatch   │    │ - 6 built-in tools   │
└───────────────────┘    └──────────────────────┘
```
@ -186,6 +208,157 @@ npx tsx examples/01-single-agent.ts
|
|||
| `file_write` | Write or create a file. Auto-creates parent directories. |
|
||||
| `file_edit` | Edit a file by replacing an exact string match. |
|
||||
| `grep` | Search file contents with regex. Uses ripgrep when available, falls back to Node.js. |
|
||||
| `glob` | Find files by glob pattern. Returns matching paths sorted by modification time. |
|
||||
|
||||
## Tool Configuration
|
||||
|
||||
Agents can be configured with fine-grained tool access control using presets, allowlists, and denylists.
|
||||
|
||||
### Tool Presets
|
||||
|
||||
Predefined tool sets for common use cases:
|
||||
|
||||
```typescript
|
||||
const readonlyAgent: AgentConfig = {
|
||||
name: 'reader',
|
||||
model: 'claude-sonnet-4-6',
|
||||
toolPreset: 'readonly', // file_read, grep, glob
|
||||
}
|
||||
|
||||
const readwriteAgent: AgentConfig = {
|
||||
name: 'editor',
|
||||
model: 'claude-sonnet-4-6',
|
||||
toolPreset: 'readwrite', // file_read, file_write, file_edit, grep, glob
|
||||
}
|
||||
|
||||
const fullAgent: AgentConfig = {
|
||||
name: 'executor',
|
||||
model: 'claude-sonnet-4-6',
|
||||
toolPreset: 'full', // file_read, file_write, file_edit, grep, glob, bash
|
||||
}
|
||||
```
|
||||
|
||||
### Advanced Filtering
|
||||
|
||||
Combine presets with allowlists and denylists for precise control:
|
||||
|
||||
```typescript
|
||||
const customAgent: AgentConfig = {
|
||||
name: 'custom',
|
||||
model: 'claude-sonnet-4-6',
|
||||
toolPreset: 'readwrite', // Start with: file_read, file_write, file_edit, grep, glob
|
||||
tools: ['file_read', 'grep'], // Allowlist: intersect with preset = file_read, grep
|
||||
disallowedTools: ['grep'], // Denylist: subtract = file_read only
|
||||
}
|
||||
```
|
||||
|
||||
**Resolution order:** preset → allowlist → denylist → framework safety rails.
|
||||
|
||||
### Custom Tools
|
||||
|
||||
Two ways to give an agent a tool that is not in the built-in set.
|
||||
|
||||
**Inject at config time** via `customTools` on `AgentConfig`. Good when the orchestrator wires up tools centrally. Tools defined here bypass preset/allowlist filtering but still respect `disallowedTools`.
|
||||
|
||||
```typescript
|
||||
import { defineTool } from '@jackchen_me/open-multi-agent'
|
||||
import { z } from 'zod'
|
||||
|
||||
const weatherTool = defineTool({
|
||||
name: 'get_weather',
|
||||
description: 'Look up current weather for a city.',
|
||||
schema: z.object({ city: z.string() }),
|
||||
execute: async ({ city }) => ({ content: await fetchWeather(city) }),
|
||||
})
|
||||
|
||||
const agent: AgentConfig = {
|
||||
name: 'assistant',
|
||||
model: 'claude-sonnet-4-6',
|
||||
customTools: [weatherTool],
|
||||
}
|
||||
```
|
||||
|
||||
**Register at runtime** via `agent.addTool(tool)`. Tools added this way are always available, regardless of filtering.
|
||||
|
||||
### Tool Output Control
|
||||
|
||||
Long tool outputs can blow up conversation size and cost. Two controls work together.
|
||||
|
||||
**Truncation.** Cap an individual tool result to a head + tail excerpt with a marker in between:
|
||||
|
||||
```typescript
|
||||
const agent: AgentConfig = {
|
||||
// ...
|
||||
maxToolOutputChars: 10_000, // applies to every tool this agent runs
|
||||
}
|
||||
|
||||
// Per-tool override (takes priority over AgentConfig.maxToolOutputChars):
|
||||
const bigQueryTool = defineTool({
|
||||
// ...
|
||||
maxOutputChars: 50_000,
|
||||
})
|
||||
```
|
||||
|
||||
**Post-consumption compression.** Once the agent has acted on a tool result, compress older copies in the transcript so they stop costing input tokens on every subsequent turn. Error results are never compressed.
|
||||
|
||||
```typescript
|
||||
const agent: AgentConfig = {
|
||||
// ...
|
||||
compressToolResults: true, // default threshold: 500 chars
|
||||
// or: compressToolResults: { minChars: 2_000 }
|
||||
}
|
||||
```
### MCP Tools (Model Context Protocol)

`open-multi-agent` can connect to any MCP server and expose its tools directly to agents.

```typescript
import { connectMCPTools } from '@jackchen_me/open-multi-agent/mcp'

const { tools, disconnect } = await connectMCPTools({
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-github'],
  env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
  namePrefix: 'github',
})

// Register each MCP tool in your ToolRegistry, then include their names in AgentConfig.tools
// Don't forget cleanup when done
await disconnect()
```

Notes:

- `@modelcontextprotocol/sdk` is an optional peer dependency, only needed when using MCP.
- Current transport support is stdio.
- MCP input validation is delegated to the MCP server (`inputSchema` is `z.any()`).

See [`integrations/mcp-github`](examples/integrations/mcp-github.ts) for a full runnable setup.
## Context Management

Long-running agents can hit input token ceilings fast. Set `contextStrategy` on `AgentConfig` to control how the conversation shrinks as it grows:

```typescript
const agent: AgentConfig = {
  name: 'long-runner',
  model: 'claude-sonnet-4-6',
  // Pick one:
  contextStrategy: { type: 'sliding-window', maxTurns: 20 },
  // contextStrategy: { type: 'summarize', maxTokens: 80_000, summaryModel: 'claude-haiku-4-5' },
  // contextStrategy: { type: 'compact', maxTokens: 100_000, preserveRecentTurns: 4 },
  // contextStrategy: { type: 'custom', compress: (messages, estimatedTokens, ctx) => ... },
}
```

| Strategy | When to reach for it |
|----------|----------------------|
| `sliding-window` | Cheapest. Keep the last N turns, drop the rest. |
| `summarize` | Send old turns to a summary model; keep the summary in place of the originals. |
| `compact` | Rule-based: truncate large assistant text blocks and tool results, keep recent turns intact. No extra LLM call. |
| `custom` | Supply your own `compress(messages, estimatedTokens, ctx)` function. |

Pairs well with `compressToolResults` and `maxToolOutputChars` above.
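To make the `custom` shape concrete, here is a sketch that reimplements a simple sliding window using the documented `(messages, estimatedTokens, ctx)` signature; the `Message` type and the "one turn = one user/assistant pair" convention are assumptions for the sketch, not the library's definitions.

```typescript
// Hypothetical message shape for illustration.
type Message = { role: 'user' | 'assistant'; content: string };

// A 'custom' compress function that keeps only the last `maxTurns` turns.
const slidingWindow =
  (maxTurns: number) =>
  (messages: Message[], _estimatedTokens: number): Message[] => {
    const keep = maxTurns * 2; // one turn ≈ one user/assistant pair (assumption)
    return messages.length <= keep ? messages : messages.slice(-keep);
  };

// Usage in config:
// contextStrategy: { type: 'custom', compress: slidingWindow(20) }
```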
## Supported Providers

@ -193,15 +366,22 @@

| Provider | Config | API key(s) | Status |
|----------|--------|---------|--------|
| Anthropic (Claude) | `provider: 'anthropic'` | `ANTHROPIC_API_KEY` | Verified |
| OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | Verified |
| Azure OpenAI | `provider: 'azure-openai'` | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` (+ optional `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`) | Verified |
| Grok (xAI) | `provider: 'grok'` | `XAI_API_KEY` | Verified |
| MiniMax (global) | `provider: 'minimax'` | `MINIMAX_API_KEY` | Verified |
| MiniMax (China) | `provider: 'minimax'` + `MINIMAX_BASE_URL` | `MINIMAX_API_KEY` | Verified |
| DeepSeek | `provider: 'deepseek'` | `DEEPSEEK_API_KEY` | Verified |
| GitHub Copilot | `provider: 'copilot'` | `GITHUB_TOKEN` | Verified |
| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | Verified |
| Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | none | Verified |
| Groq | `provider: 'openai'` + `baseURL` | `GROQ_API_KEY` | Verified |
| llama.cpp server | `provider: 'openai'` + `baseURL` | none | Verified |

Gemini requires `npm install @google/genai` (optional peer dependency).

Verified local models with tool-calling: **Gemma 4** (see [`providers/gemma4-local`](examples/providers/gemma4-local.ts)).

Any OpenAI-compatible API should work via `provider: 'openai'` + `baseURL` (Mistral, Qwen, Moonshot, Doubao, etc.). Groq is now verified in [`providers/groq`](examples/providers/groq.ts). **Grok, MiniMax, and DeepSeek now have first-class support** via `provider: 'grok'`, `provider: 'minimax'`, and `provider: 'deepseek'`.
### Local Model Tool-Calling

@ -227,7 +407,7 @@ const localAgent: AgentConfig = {

**Troubleshooting:**

- Model not calling tools? Ensure it appears in Ollama's [Tools category](https://ollama.com/search?c=tools). Not all models support tool-calling.
- Using Ollama? Update to the latest version (`ollama update`). Older versions have known tool-calling bugs.
- Proxy interfering? Use `no_proxy=localhost` when running against local servers.

### LLM Configuration Examples
@ -241,39 +421,61 @@ const grokAgent: AgentConfig = {
}
```

(Set your `XAI_API_KEY` environment variable; no `baseURL` needed.)
```typescript
const minimaxAgent: AgentConfig = {
  name: 'minimax-agent',
  provider: 'minimax',
  model: 'MiniMax-M2.7',
  systemPrompt: 'You are a helpful assistant.',
}
```

Set `MINIMAX_API_KEY`. The adapter selects the endpoint via `MINIMAX_BASE_URL`:

- `https://api.minimax.io/v1` (global, default)
- `https://api.minimaxi.com/v1` (China mainland endpoint)

You can also pass `baseURL` directly in `AgentConfig` to override the env var.

```typescript
const deepseekAgent: AgentConfig = {
  name: 'deepseek-agent',
  provider: 'deepseek',
  model: 'deepseek-chat',
  systemPrompt: 'You are a helpful assistant.',
}
```

Set `DEEPSEEK_API_KEY`. Available models: `deepseek-chat` (DeepSeek-V3, recommended for coding) and `deepseek-reasoner` (thinking mode).
## Contributing

Issues, feature requests, and PRs are welcome. Some areas where contributions would be especially valuable:

- **Provider integrations.** Verify and document OpenAI-compatible providers (DeepSeek, Groq, Qwen, MiniMax, etc.) via `baseURL`; see [#25](https://github.com/JackChen-me/open-multi-agent/issues/25). For providers that are not OpenAI-compatible (e.g. Gemini), a new `LLMAdapter` implementation is welcome; the interface requires just two methods: `chat()` and `stream()`.
- **Production examples.** Real-world end-to-end workflows. See [`examples/production/README.md`](./examples/production/README.md) for the acceptance criteria and submission format.
- **Documentation.** Guides, tutorials, and API docs.
## Contributors

<a href="https://github.com/JackChen-me/open-multi-agent/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=JackChen-me/open-multi-agent&max=20&v=20260419" />
</a>

## Star History

<a href="https://star-history.com/#JackChen-me/open-multi-agent&Date">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date" />
  </picture>
</a>

## Translations

Help translate this README. [Open a PR](https://github.com/JackChen-me/open-multi-agent/pulls).

## License

README_zh.md

@ -1,29 +1,58 @@
# Open Multi-Agent

TypeScript 里的轻量多智能体编排引擎。3 个运行时依赖,零配置,一次 `runTeam()` 从目标拿到结果。

CrewAI 是 Python。LangGraph 要你自己画图。`open-multi-agent` 是你现有 Node.js 后端里 `npm install` 一下就能用的那一层:一支 agent 团队围绕一个目标协作,就这些。

[](https://www.npmjs.com/package/@jackchen_me/open-multi-agent)
[](https://github.com/JackChen-me/open-multi-agent/stargazers)
[](./LICENSE)
[](https://www.typescriptlang.org/)
[](https://github.com/JackChen-me/open-multi-agent/actions)
[](https://github.com/JackChen-me/open-multi-agent/blob/main/package.json)
[](https://codecov.io/gh/JackChen-me/open-multi-agent)

[English](./README.md) | **中文**
## 核心能力

- `runTeam(team, "构建一个 REST API")` 下去,协调者 agent 会把目标拆成任务 DAG,独立任务并行跑,再把结果合起来。不用画图,不用手动连依赖。
- 运行时依赖就三个:`@anthropic-ai/sdk`、`openai`、`zod`。能直接塞进 Express、Next.js、Serverless 或 CI/CD,不起 Python 进程,也不跑云端 sidecar。
- 同一个团队里的 agent 能挂不同模型:架构师用 Opus 4.6、开发用 GPT-5.4、评审跑本地 Gemma 4 都行。支持 Claude、GPT、Gemini、Grok、MiniMax、DeepSeek、Copilot,以及 OpenAI 兼容的本地模型(Ollama、vLLM、LM Studio、llama.cpp)。用 Gemini 要额外装 `@google/genai`。

还有 MCP、上下文策略、结构化输出、任务重试、human-in-the-loop、生命周期 hook、循环检测、可观测性等,下面章节或 [`examples/`](./examples/) 里都有。
## 做什么,不做什么

**做的事:**

- 一个协调者,把目标拆成任务 DAG。
- 一个任务队列,独立任务并行跑,失败级联到下游。
- 共享内存和消息总线,让 agent 之间能看到彼此的输出。
- 多模型团队,每个 agent 可以挂不同的 LLM provider。

**不做的事:**

- **Agent handoffs**:agent A 对话中途把控制权交给 agent B 这种模式不做。要这个用 [OpenAI Agents SDK](https://github.com/openai/openai-agents-python)。我们这边一个 agent 从头到尾负责一个任务。
- **状态持久化 / 检查点**:暂时不做。加存储后端会破坏 3 个依赖的承诺,而且我们的工作流是秒到分钟级,不是小时级。真有长时间工作流的需求再说。

A2A 协议在跟踪,观望中,等有人真用再跟。

完整理由见 [`DECISIONS.md`](./DECISIONS.md)。

## 和其他框架怎么选

如果你在看 [LangGraph JS](https://github.com/langchain-ai/langgraphjs):它是声明式图编排,自己定义节点、边、路由,`compile()` + `invoke()`。`open-multi-agent` 反过来,目标驱动:给一个团队和一个目标,协调者在运行时拆 DAG。想完全控拓扑、流程定下来的用 LangGraph;想写得少、迭代快、还在探索的选这个。LangGraph 有成熟 checkpoint,我们没做。

Python 栈直接用 [CrewAI](https://github.com/crewAIInc/crewAI) 就行,编排层能力差不多。`open-multi-agent` 的定位是 TypeScript 原生:3 个依赖、直接进 Node.js、不用子进程桥接。按语言选。

和 [Vercel AI SDK](https://github.com/vercel/ai) 不冲突。AI SDK 是 LLM 调用层,统一的 TypeScript 客户端,60+ provider,带流式、tool call、结构化输出,但不做多智能体编排。要多 agent,把 `open-multi-agent` 叠在 AI SDK 上面就行。单 agent 用 AI SDK,多 agent 用这个。

## 谁在用

项目 2026-04-01 发布,目前 5,500+ stars,MIT 协议。目前能确认在用的:

- **[temodar-agent](https://github.com/xeloxa/temodar-agent)**(约 50 stars)。WordPress 安全分析平台,作者 [Ali Sünbül](https://github.com/xeloxa)。在 Docker runtime 里直接用我们的内置工具(`bash`、`file_*`、`grep`)。已确认生产环境使用。
- **家用服务器 Cybersecurity SOC。** 本地完全离线跑 Qwen 2.5 + DeepSeek Coder(通过 Ollama),在 Wazuh + Proxmox 上搭自主 SOC 流水线。早期用户,未公开。

如果你在生产或 side project 里用了 `open-multi-agent`,[请开个 Discussion](https://github.com/JackChen-me/open-multi-agent/discussions),我加上来。

## 快速开始
@ -33,15 +62,21 @@

npm install @jackchen_me/open-multi-agent
```

根据用的 provider 设对应 API key。通过 Ollama 跑本地模型不用 key,见 [`providers/ollama`](examples/providers/ollama.ts)。

- `ANTHROPIC_API_KEY`
- `AZURE_OPENAI_API_KEY`、`AZURE_OPENAI_ENDPOINT`、`AZURE_OPENAI_API_VERSION`、`AZURE_OPENAI_DEPLOYMENT`(Azure OpenAI;当 `model` 为空时可用 deployment 环境变量兜底)
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `XAI_API_KEY`(Grok)
- `MINIMAX_API_KEY`(MiniMax)
- `MINIMAX_BASE_URL`(MiniMax,可选,选接入端点)
- `DEEPSEEK_API_KEY`(DeepSeek)
- `GITHUB_TOKEN`(Copilot)

包里还自带一个叫 `oma` 的命令行工具,给 shell 和 CI 场景用,输出都是 JSON。`oma run`、`oma task`、`oma provider`、退出码、文件格式都在 [docs/cli.md](./docs/cli.md) 里。

下面用三个 agent 协作做一个 REST API:
```typescript
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
@ -54,19 +89,8 @@ const architect: AgentConfig = {
  tools: ['file_write'],
}

const developer: AgentConfig = { /* 同样结构,tools: ['bash', 'file_read', 'file_write', 'file_edit'] */ }
const reviewer: AgentConfig = { /* 同样结构,tools: ['file_read', 'grep'] */ }

const orchestrator = new OpenMultiAgent({
  defaultModel: 'claude-sonnet-4-6',
@ -79,11 +103,11 @@ const team = orchestrator.createTeam('api-team', {
  sharedMemory: true,
})

// 描述一个目标,框架负责拆解成任务并编排执行
const result = await orchestrator.runTeam(team, 'Create a REST API for a todo list in /tmp/todo-api/')

console.log(`Success: ${result.success}`)
console.log(`Tokens: ${result.totalTokenUsage.output_tokens} output tokens`)
```

执行过程:
@ -95,8 +119,8 @@ task_complete architect
task_start developer // 无依赖的任务并行执行
task_complete developer
task_start reviewer // 实现完成后自动解锁
task_complete reviewer
agent_complete coordinator // 综合所有结果
Success: true
@ -107,33 +131,25 @@ Tokens: 12847 output tokens

| 模式 | 方法 | 适用场景 |
|------|------|----------|
| 单智能体 | `runAgent()` | 一个智能体,一个提示词,最简入口 |
| 自动编排团队 | `runTeam()` | 给一个目标,框架自动规划和执行 |
| 显式任务管线 | `runTasks()` | 你自己定义任务图和分配 |

要 MapReduce 风格的 fan-out 但不需要任务依赖,直接用 `AgentPool.runParallel()`。例子见 [`patterns/fan-out-aggregate`](examples/patterns/fan-out-aggregate.ts)。

## 示例

[`examples/`](./examples/) 按类别分了 basics、providers、patterns、integrations、production。完整索引见 [`examples/README.md`](./examples/README.md),几个值得先看的:

- [`basics/team-collaboration`](examples/basics/team-collaboration.ts):`runTeam()` 协调者模式。
- [`patterns/structured-output`](examples/patterns/structured-output.ts):任意 agent 产出 Zod 校验过的 JSON。
- [`patterns/agent-handoff`](examples/patterns/agent-handoff.ts):`delegate_to_agent` 同步子智能体委派。
- [`integrations/trace-observability`](examples/integrations/trace-observability.ts):`onTrace` 回调,给 LLM 调用、工具、任务发结构化 span。
- [`integrations/mcp-github`](examples/integrations/mcp-github.ts):用 `connectMCPTools()` 把 MCP 服务器的工具暴露给 agent。
- [`integrations/with-vercel-ai-sdk`](examples/integrations/with-vercel-ai-sdk/):Next.js 应用,OMA `runTeam()` 配合 AI SDK `useChat` 流式输出。
- **Provider 示例**:8 个三智能体团队示例,每个 provider 一个,见 [`examples/providers/`](examples/providers/)。
跑脚本用 `npx tsx examples/basics/team-collaboration.ts`。

## 架构
@ -169,12 +185,14 @@
│ │ - CopilotAdapter     │
│ │ - GeminiAdapter      │
│ │ - GrokAdapter        │
│ │ - MiniMaxAdapter     │
│ │ - DeepSeekAdapter    │
│ └──────────────────────┘
┌────────▼──────────┐
│ AgentRunner       │    ┌──────────────────────┐
│ - conversation    │───►│ ToolRegistry         │
│   loop            │    │ - defineTool()       │
│ - tool dispatch   │    │ - 6 built-in tools   │
└───────────────────┘    └──────────────────────┘
```
@ -182,11 +200,160 @@

| 工具 | 说明 |
|------|------|
| `bash` | 跑 Shell 命令。返回 stdout + stderr。支持超时和工作目录设置。 |
| `file_read` | 按绝对路径读文件。支持偏移量和行数限制,能读大文件。 |
| `file_write` | 写入或创建文件。自动创建父目录。 |
| `file_edit` | 按精确字符串匹配改文件。 |
| `grep` | 用正则搜文件内容。优先走 ripgrep,没有就 fallback 到 Node.js。 |
| `glob` | 按 glob 模式查找文件。返回按修改时间排序的匹配路径。 |
## 工具配置

三层叠起来用:preset(预设)、tools(白名单)、disallowedTools(黑名单)。

### 工具预设

三种内置 preset:

```typescript
const readonlyAgent: AgentConfig = {
  name: 'reader',
  model: 'claude-sonnet-4-6',
  toolPreset: 'readonly', // file_read, grep, glob
}

const readwriteAgent: AgentConfig = {
  name: 'editor',
  model: 'claude-sonnet-4-6',
  toolPreset: 'readwrite', // file_read, file_write, file_edit, grep, glob
}

const fullAgent: AgentConfig = {
  name: 'executor',
  model: 'claude-sonnet-4-6',
  toolPreset: 'full', // file_read, file_write, file_edit, grep, glob, bash
}
```

### 高级过滤

```typescript
const customAgent: AgentConfig = {
  name: 'custom',
  model: 'claude-sonnet-4-6',
  toolPreset: 'readwrite', // 起点:file_read, file_write, file_edit, grep, glob
  tools: ['file_read', 'grep'], // 白名单:与预设取交集 = file_read, grep
  disallowedTools: ['grep'], // 黑名单:再减去 = 只剩 file_read
}
```

**解析顺序:** preset → allowlist → denylist → 框架安全护栏。
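上面的解析顺序可以用一个独立小示意表达(`resolveTools`、`PRESETS` 都是示意用的假想名字,不是库的真实实现):

```typescript
// 示意:preset 给初始集合,白名单取交集,黑名单再做减法。
const PRESETS: Record<string, string[]> = {
  readonly: ['file_read', 'grep', 'glob'],
  readwrite: ['file_read', 'file_write', 'file_edit', 'grep', 'glob'],
  full: ['file_read', 'file_write', 'file_edit', 'grep', 'glob', 'bash'],
};

function resolveTools(preset: string, allow?: string[], deny?: string[]): string[] {
  let tools = PRESETS[preset] ?? [];
  if (allow) tools = tools.filter((t) => allow.includes(t)); // 白名单:取交集
  if (deny) tools = tools.filter((t) => !deny.includes(t)); // 黑名单:再减去
  return tools;
}

console.log(resolveTools('readwrite', ['file_read', 'grep'], ['grep'])); // → [ 'file_read' ]
```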
### 自定义工具

装一个不在内置集里的工具,有两种方式。

**配置时注入。** 通过 `AgentConfig.customTools` 传入。编排层统一挂工具的时候用这个。这里定义的工具会绕过 preset 和白名单,但仍受 `disallowedTools` 限制。

```typescript
import { defineTool } from '@jackchen_me/open-multi-agent'
import { z } from 'zod'

const weatherTool = defineTool({
  name: 'get_weather',
  description: '查询某城市当前天气。',
  schema: z.object({ city: z.string() }),
  execute: async ({ city }) => ({ content: await fetchWeather(city) }),
})

const agent: AgentConfig = {
  name: 'assistant',
  model: 'claude-sonnet-4-6',
  customTools: [weatherTool],
}
```

**运行时注册。** `agent.addTool(tool)`。这种方式加的工具始终可用,不受任何过滤规则影响。
### 工具输出控制

工具返回太长会快速撑大对话和成本。两个开关配合着用。

**截断。** 把单次工具结果压成 head + tail 摘要(中间放一个标记):

```typescript
const agent: AgentConfig = {
  // ...
  maxToolOutputChars: 10_000, // 该 agent 所有工具的默认上限
}

// 单工具覆盖(优先级高于 AgentConfig.maxToolOutputChars):
const bigQueryTool = defineTool({
  // ...
  maxOutputChars: 50_000,
})
```

**消费后压缩。** agent 用完某个工具结果之后,把历史副本压缩掉,后续每轮就不再重复消耗输入 token。错误结果不压缩。

```typescript
const agent: AgentConfig = {
  // ...
  compressToolResults: true, // 默认阈值 500 字符
  // 或:compressToolResults: { minChars: 2_000 }
}
```
### MCP 工具(Model Context Protocol)

可以连任意 MCP 服务器,把它的工具直接给 agent 用。

```typescript
import { connectMCPTools } from '@jackchen_me/open-multi-agent/mcp'

const { tools, disconnect } = await connectMCPTools({
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-github'],
  env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
  namePrefix: 'github',
})

// 把每个 MCP 工具注册进你的 ToolRegistry,然后在 AgentConfig.tools 里引用它们的名字
// 用完别忘了清理
await disconnect()
```

注意事项:

- `@modelcontextprotocol/sdk` 是 optional peer dependency,只在用 MCP 时才要装。
- 当前只支持 stdio transport。
- MCP 的入参校验交给 MCP 服务器自己(`inputSchema` 是 `z.any()`)。

完整例子见 [`integrations/mcp-github`](examples/integrations/mcp-github.ts)。
## 上下文管理

长时间运行的 agent 很容易撞上输入 token 上限。在 `AgentConfig` 里设 `contextStrategy`,控制对话变长时怎么收缩:

```typescript
const agent: AgentConfig = {
  name: 'long-runner',
  model: 'claude-sonnet-4-6',
  // 选一种:
  contextStrategy: { type: 'sliding-window', maxTurns: 20 },
  // contextStrategy: { type: 'summarize', maxTokens: 80_000, summaryModel: 'claude-haiku-4-5' },
  // contextStrategy: { type: 'compact', maxTokens: 100_000, preserveRecentTurns: 4 },
  // contextStrategy: { type: 'custom', compress: (messages, estimatedTokens, ctx) => ... },
}
```

| 策略 | 什么时候用 |
|------|------------|
| `sliding-window` | 最省事。只保留最近 N 轮,其余丢弃。 |
| `summarize` | 老对话发给摘要模型,用摘要替代原文。 |
| `compact` | 基于规则:截断过长的 assistant 文本块和 tool 结果,保留最近若干轮。不额外调用 LLM。 |
| `custom` | 传入自己的 `compress(messages, estimatedTokens, ctx)` 函数。 |

和上面的 `compressToolResults`、`maxToolOutputChars` 搭着用效果更好。

## 支持的 Provider
@ -194,25 +361,32 @@

| Provider | 配置 | API key | 状态 |
|----------|------|----------|------|
| Anthropic (Claude) | `provider: 'anthropic'` | `ANTHROPIC_API_KEY` | 已验证 |
| OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | 已验证 |
| Azure OpenAI | `provider: 'azure-openai'` | `AZURE_OPENAI_API_KEY`、`AZURE_OPENAI_ENDPOINT`(可选:`AZURE_OPENAI_API_VERSION`、`AZURE_OPENAI_DEPLOYMENT`) | 已验证 |
| Grok (xAI) | `provider: 'grok'` | `XAI_API_KEY` | 已验证 |
| MiniMax(全球) | `provider: 'minimax'` | `MINIMAX_API_KEY` | 已验证 |
| MiniMax(国内) | `provider: 'minimax'` + `MINIMAX_BASE_URL` | `MINIMAX_API_KEY` | 已验证 |
| DeepSeek | `provider: 'deepseek'` | `DEEPSEEK_API_KEY` | 已验证 |
| GitHub Copilot | `provider: 'copilot'` | `GITHUB_TOKEN` | 已验证 |
| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | 已验证 |
| Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | 无 | 已验证 |
| Groq | `provider: 'openai'` + `baseURL` | `GROQ_API_KEY` | 已验证 |
| llama.cpp server | `provider: 'openai'` + `baseURL` | 无 | 已验证 |

Gemini 需要 `npm install @google/genai`(optional peer dependency)。

已验证支持 tool-calling 的本地模型:**Gemma 4**(见 [`providers/gemma4-local`](examples/providers/gemma4-local.ts))。

OpenAI 兼容的 API 都能用 `provider: 'openai'` + `baseURL` 接(Mistral、Qwen、Moonshot、Doubao 等)。Groq 在 [`providers/groq`](examples/providers/groq.ts) 里验证过。Grok、MiniMax、DeepSeek 直接用 `provider: 'grok'`、`provider: 'minimax'`、`provider: 'deepseek'`,不用配 `baseURL`。
### 本地模型 Tool-Calling

Ollama、vLLM、LM Studio、llama.cpp 跑的本地模型也能 tool-calling,走的是这些服务自带的 OpenAI 兼容接口。

**已验证模型:** Gemma 4、Llama 3.1、Qwen 3、Mistral、Phi-4。完整列表见 [ollama.com/search?c=tools](https://ollama.com/search?c=tools)。

**兜底提取:** 本地模型如果以文本形式返回工具调用,而不是 `tool_calls` 协议格式(thinking 模型或配置不对的服务常见),框架会自动从文本里提取。

**超时设置。** 本地推理可能慢。在 `AgentConfig` 里设 `timeoutMs`,避免一直卡住:

```typescript
const localAgent: AgentConfig = {
@ -227,9 +401,9 @@ const localAgent: AgentConfig = {
```

**常见问题:**

- 模型不调用工具?先确认它在 Ollama 的 [Tools 分类](https://ollama.com/search?c=tools) 里,不是所有模型都支持。
- 把 Ollama 升到最新版(`ollama update`),旧版本有 tool-calling bug。
- 代理挡住了?本地服务用 `no_proxy=localhost` 跳过代理。

### LLM 配置示例
@ -242,33 +416,55 @@ const grokAgent: AgentConfig = {
}
```

(设好 `XAI_API_KEY` 就行,不用配 `baseURL`。)
```typescript
const minimaxAgent: AgentConfig = {
  name: 'minimax-agent',
  provider: 'minimax',
  model: 'MiniMax-M2.7',
  systemPrompt: 'You are a helpful assistant.',
}
```

设好 `MINIMAX_API_KEY`。端点用 `MINIMAX_BASE_URL` 选:

- `https://api.minimax.io/v1`(全球端点,默认)
- `https://api.minimaxi.com/v1`(中国大陆端点)

也可以直接在 `AgentConfig` 里传 `baseURL`,覆盖环境变量。

```typescript
const deepseekAgent: AgentConfig = {
  name: 'deepseek-agent',
  provider: 'deepseek',
  model: 'deepseek-chat',
  systemPrompt: '你是一个有用的助手。',
}
```

设好 `DEEPSEEK_API_KEY`。两个模型:`deepseek-chat`(DeepSeek-V3,写代码选这个)和 `deepseek-reasoner`(思考模式)。
## 参与贡献

Issue、feature request、PR 都欢迎。特别想要:

- **Provider 集成。** 验证并文档化 OpenAI 兼容 Provider(DeepSeek、Groq、Qwen、MiniMax 等)通过 `baseURL` 接入,详见 [#25](https://github.com/JackChen-me/open-multi-agent/issues/25)。非 OpenAI 兼容的 Provider 欢迎贡献新的 `LLMAdapter` 实现,接口只需两个方法:`chat()` 和 `stream()`。
- **生产级示例。** 端到端跑通的真实场景工作流。收录条件和提交格式见 [`examples/production/README.md`](./examples/production/README.md)。
- **文档。** 指南、教程、API 文档。
## 贡献者

<a href="https://github.com/JackChen-me/open-multi-agent/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=JackChen-me/open-multi-agent&max=20&v=20260419" />
</a>

## Star 趋势

<a href="https://star-history.com/#JackChen-me/open-multi-agent&Date">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date" />
  </picture>
</a>
@ -0,0 +1 @@
comment: false

@ -0,0 +1,260 @@
# Command-line interface (`oma`)
The package ships a small binary **`oma`** that exposes the same primitives as the TypeScript API: `runTeam`, `runTasks`, plus a static provider reference. It is meant for **shell scripts and CI** (JSON on stdout, stable exit codes).
|
||||
|
||||
It does **not** provide an interactive REPL, working-directory injection into tools, human approval gates, or session persistence. Those stay in application code.
|
||||
|
||||
## Installation and invocation
|
||||
|
||||
After installing the package, the binary is on `PATH` when using `npx` or a local `node_modules/.bin`:
|
||||
|
||||
```bash
|
||||
npm install @jackchen_me/open-multi-agent
|
||||
npx oma help
|
||||
```
|
||||
|
||||
From a clone of the repository you need a build first:
|
||||
|
||||
```bash
|
||||
npm run build
|
||||
node dist/cli/oma.js help
|
||||
```
|
||||
|
||||
Set the usual provider API keys in the environment (see [README](../README.md#quick-start)); the CLI does not read secrets from flags. MiniMax additionally reads `MINIMAX_BASE_URL` to select the global (`https://api.minimax.io/v1`) or China (`https://api.minimaxi.com/v1`) endpoint.
|
||||
|
||||
---
|
## Commands

### `oma run`

Runs **`OpenMultiAgent.runTeam(team, goal)`**: coordinator decomposition, task queue, optional synthesis.

When invoked with `--dashboard`, the **`oma` CLI** writes a static post-execution DAG dashboard (HTML) to `oma-dashboards/runTeam-<timestamp>.html` under the current working directory. The library itself does not write files; to produce the dashboard outside the CLI, call `renderTeamRunDashboard(result)` in application code (see `src/dashboard/render-team-run-dashboard.ts`).

The dashboard page loads **Tailwind CSS** (Play CDN) and **Google Fonts** (Space Grotesk, Inter, and Material Symbols) from the network at view time. Opening the HTML file therefore requires an **online** environment unless you host or inline those assets yourself (a future improvement).

| Argument | Required | Description |
|----------|----------|-------------|
| `--goal` | Yes | Natural-language goal passed to the team run. |
| `--team` | Yes | Path to JSON (see [Team file](#team-file)). |
| `--orchestrator` | No | Path to JSON merged into `new OpenMultiAgent(...)` after any orchestrator fragment from the team file. |
| `--coordinator` | No | Path to JSON passed as `runTeam(..., { coordinator })` (`CoordinatorConfig`). |
| `--dashboard` | No | Write a post-execution DAG dashboard HTML to `oma-dashboards/runTeam-<timestamp>.html`. |

Global flags: [`--pretty`](#output-flags), [`--include-messages`](#output-flags).

### `oma task`

Runs **`OpenMultiAgent.runTasks(team, tasks)`** with a fixed task list (no coordinator decomposition).

| Argument | Required | Description |
|----------|----------|-------------|
| `--file` | Yes | Path to [tasks file](#tasks-file). |
| `--team` | No | Path to JSON `TeamConfig`. When set, overrides the `team` object inside `--file`. |

Global flags: [`--pretty`](#output-flags), [`--include-messages`](#output-flags).

### `oma provider`

Read-only helper for wiring JSON configs and env vars.

- **`oma provider`** or **`oma provider list`** — Prints JSON: built-in provider ids, API key environment variable names, whether `baseURL` is supported, and short notes (e.g. OpenAI-compatible servers, Copilot in CI).
- **`oma provider template <provider>`** — Prints a JSON object with example `orchestrator` and `agent` fields plus placeholder `env` entries. `<provider>` is one of: `anthropic`, `openai`, `gemini`, `grok`, `minimax`, `deepseek`, `copilot`.

Supports `--pretty`.

### `oma`, `oma help`, `oma -h`, `oma --help`

Prints usage text to stdout and exits **0**.

---
## Configuration files

Shapes match the library types `TeamConfig`, `OrchestratorConfig`, `CoordinatorConfig`, and the task objects accepted by `runTasks()`.

### Team file

Used with **`oma run --team`** (and optionally **`oma task --team`**).

**Option A — plain `TeamConfig`**

```json
{
  "name": "api-team",
  "agents": [
    {
      "name": "architect",
      "model": "claude-sonnet-4-6",
      "provider": "anthropic",
      "systemPrompt": "You design APIs.",
      "tools": ["file_read", "file_write"],
      "maxTurns": 6
    }
  ],
  "sharedMemory": true
}
```

**Option B — team plus default orchestrator settings**

```json
{
  "team": {
    "name": "api-team",
    "agents": [{ "name": "worker", "model": "claude-sonnet-4-6", "systemPrompt": "…" }]
  },
  "orchestrator": {
    "defaultModel": "claude-sonnet-4-6",
    "defaultProvider": "anthropic",
    "maxConcurrency": 3
  }
}
```

Validation rules enforced by the CLI:

- Root (or `team`) must be an object.
- `team.name` is a non-empty string.
- `team.agents` is a non-empty array; each agent must have non-empty `name` and `model`.

Any other fields are passed through to the library as in TypeScript.
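A local pre-flight check of the three rules above can save a round trip when you generate team files in CI. The sketch below is this document's own example, not the CLI's validator; the team file it writes is a throwaway:

```bash
# Write an example team file (Option A shape) and check the documented rules.
cat > /tmp/team.json <<'JSON'
{ "name": "api-team", "agents": [{ "name": "architect", "model": "claude-sonnet-4-6" }] }
JSON
verdict=$(python3 - <<'PY'
import json
doc = json.load(open("/tmp/team.json"))
team = doc.get("team", doc)  # accept Option A (plain) or Option B ({"team": ...})
ok = (
    isinstance(team, dict)
    and isinstance(team.get("name"), str) and team["name"]
    and isinstance(team.get("agents"), list) and team["agents"]
    and all(a.get("name") and a.get("model") for a in team["agents"])
)
print("team file looks valid" if ok else "team file is invalid")
PY
)
echo "$verdict"   # → team file looks valid
```

If the check fails locally, the CLI would reject the same file with a `validation` error before any API call is made.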
### Tasks file

Used with **`oma task --file`**.

```json
{
  "orchestrator": {
    "defaultModel": "claude-sonnet-4-6"
  },
  "team": {
    "name": "pipeline",
    "agents": [
      { "name": "designer", "model": "claude-sonnet-4-6", "systemPrompt": "…" },
      { "name": "builder", "model": "claude-sonnet-4-6", "systemPrompt": "…" }
    ],
    "sharedMemory": true
  },
  "tasks": [
    {
      "title": "Design",
      "description": "Produce a short spec for the feature.",
      "assignee": "designer"
    },
    {
      "title": "Implement",
      "description": "Implement from the design.",
      "assignee": "builder",
      "dependsOn": ["Design"]
    }
  ]
}
```

- **`dependsOn`** — Task titles (not internal ids), same convention as the coordinator output in the library.
- Optional per-task fields: `memoryScope` (`"dependencies"` \| `"all"`), `maxRetries`, `retryDelayMs`, `retryBackoff`.
- **`tasks`** must be a non-empty array; each item needs string `title` and `description`.

If **`--team path.json`** is passed, the file’s top-level `team` property is ignored and the external file is used instead (useful when the same team definition is shared across several pipeline files).
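Because `dependsOn` entries are matched by title, a typo silently points at a nonexistent task. A cheap local cross-check before handing the file to `oma task` (again a sketch of this document's own, not part of the CLI; the pipeline file is a throwaway example):

```bash
cat > /tmp/pipeline.json <<'JSON'
{
  "team": { "name": "pipeline", "agents": [{ "name": "worker", "model": "claude-sonnet-4-6" }] },
  "tasks": [
    { "title": "Design", "description": "Spec the feature." },
    { "title": "Implement", "description": "Build it.", "dependsOn": ["Design"] }
  ]
}
JSON
check=$(python3 - <<'PY'
import json
doc = json.load(open("/tmp/pipeline.json"))
titles = {t["title"] for t in doc["tasks"]}
# Every dependsOn entry must name an existing task title.
missing = sorted({d for t in doc["tasks"] for d in t.get("dependsOn", []) if d not in titles})
print("dependsOn ok" if not missing else "unknown dependencies: " + ", ".join(missing))
PY
)
echo "$check"   # → dependsOn ok
```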
### Orchestrator and coordinator JSON

These files are arbitrary JSON objects merged into **`OrchestratorConfig`** and **`CoordinatorConfig`**. Function-valued options (`onProgress`, `onApproval`, etc.) cannot appear in JSON and are not supported by the CLI.

---
## Output

### Stdout

Every invocation prints **one JSON document** to stdout, followed by a newline.

**Successful `run` / `task`**

```json
{
  "command": "run",
  "success": true,
  "totalTokenUsage": { "input_tokens": 0, "output_tokens": 0 },
  "agentResults": {
    "architect": {
      "success": true,
      "output": "…",
      "tokenUsage": { "input_tokens": 0, "output_tokens": 0 },
      "toolCalls": [],
      "structured": null,
      "loopDetected": false,
      "budgetExceeded": false
    }
  }
}
```

`agentResults` keys are agent names. When an agent ran multiple tasks, the library merges results; the CLI mirrors the merged `AgentRunResult` fields.

**Errors (usage, validation, I/O, runtime)**

```json
{
  "error": {
    "kind": "usage",
    "message": "--goal and --team are required"
  }
}
```

`kind` is one of: `usage`, `validation`, `io`, `runtime`, or `internal` (uncaught errors in the outer handler).
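Scripts can branch on `kind` without pulling in jq. A sketch (the JSON literal mirrors the documented error shape; the advice strings are this example's own, not CLI output):

```bash
result='{"error":{"kind":"validation","message":"team.agents must be a non-empty array"}}'
kind=$(printf '%s' "$result" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("error", {}).get("kind", "none"))')
case "$kind" in
  usage|validation) advice="fix flags or config files" ;;
  io)               advice="fix file paths or permissions" ;;
  runtime|internal) advice="inspect the message; possibly retry" ;;
  *)                advice="no error envelope present" ;;
esac
echo "$kind: $advice"   # → validation: fix flags or config files
```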
### Output flags

| Flag | Effect |
|------|--------|
| `--pretty` | Pretty-print JSON with indentation. |
| `--include-messages` | Include each agent’s full `messages` array in `agentResults`. **Very large** for long runs; default is to omit it. |

There is no separate progress stream; for rich telemetry use the TypeScript API with `onProgress` / `onTrace`.
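For pipeline steps that only need a few fields of the success envelope, python3 is enough. Another sketch of this document's own (the JSON literal mirrors the documented success shape and its numbers are made up):

```bash
result='{"command":"run","success":true,"totalTokenUsage":{"input_tokens":120,"output_tokens":80},"agentResults":{"architect":{"success":true}}}'
summary=$(printf '%s' "$result" | python3 -c '
import json, sys
doc = json.load(sys.stdin)
u = doc["totalTokenUsage"]
agents = ",".join(sorted(doc["agentResults"]))  # agentResults keys are agent names
print("success=%s in=%d out=%d agents=%s" % (str(doc["success"]).lower(), u["input_tokens"], u["output_tokens"], agents))
')
echo "$summary"   # → success=true in=120 out=80 agents=architect
```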
---

## Exit codes

| Code | Meaning |
|------|---------|
| **0** | Success: `run`/`task` finished with `success === true`, or help / `provider` completed normally. |
| **1** | Run finished but **`success === false`** (agent or task failure as reported by the library). |
| **2** | Usage errors, validation errors, unreadable JSON, or file access issues (e.g. missing file). |
| **3** | Unexpected error, including typical LLM/API failures surfaced as thrown errors. |

In scripts:

```bash
npx oma run --goal "Summarize README" --team team.json > result.json
code=$?
case $code in
  0) echo "OK" ;;
  1) echo "Run reported failure — inspect result.json" ;;
  2) echo "Bad inputs or files" ;;
  3) echo "Crash or API error" ;;
esac
```

---
## Argument parsing

- Long options only: `--goal`, `--team`, `--file`, etc.
- Values may be attached with `=`: `--team=./team.json`.
- Boolean flags (`--pretty`, `--include-messages`) take no value. For a value-taking option, the token after it is consumed as its value when that token does not start with `--` (standard `getopt`-style pairing).
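To make the precedence of these rules concrete, here is a toy re-implementation of the pairing logic (this document's own sketch, not the CLI's actual parser): `=` wins, boolean flags never consume a token, and other options consume the next token unless it starts with `--`.

```bash
parsed=$(python3 - <<'PY'
BOOLEAN = {"--pretty", "--include-messages"}

def parse(argv):
    out, i = {}, 0
    while i < len(argv):
        tok = argv[i]
        if tok.startswith("--") and "=" in tok:   # --team=./team.json
            key, _, value = tok.partition("=")
            out[key] = value
        elif tok in BOOLEAN:                      # boolean flag: takes no value
            out[tok] = True
        elif i + 1 < len(argv) and not argv[i + 1].startswith("--"):
            out[tok] = argv[i + 1]                # pair with the next token
            i += 1
        else:
            out[tok] = True
        i += 1
    return out

print(parse(["--team=./team.json", "--pretty", "--goal", "Summarize README"]))
PY
)
echo "$parsed"   # → {'--team': './team.json', '--pretty': True, '--goal': 'Summarize README'}
```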
---

## Limitations (by design)

- No TTY session, history, or `stdin` goal input.
- No built-in **`cwd`** or metadata passed into `ToolUseContext` (tools use process cwd unless the library adds other hooks later).
- No **`onApproval`** from JSON; non-interactive batch only.
- Coordinator **`runTeam`** path still requires network and API keys like any other run.
@@ -1,49 +0,0 @@
/**
 * Quick smoke test for the Copilot adapter.
 *
 * Run:
 *   npx tsx examples/05-copilot-test.ts
 *
 * If GITHUB_COPILOT_TOKEN is not set, the adapter will start an interactive
 * OAuth2 device flow — you'll be prompted to sign in via your browser.
 */

import { OpenMultiAgent } from '../src/index.js'
import type { OrchestratorEvent } from '../src/types.js'

const orchestrator = new OpenMultiAgent({
  defaultModel: 'gpt-4o',
  defaultProvider: 'copilot',
  onProgress: (event: OrchestratorEvent) => {
    if (event.type === 'agent_start') {
      console.log(`[start] agent=${event.agent}`)
    } else if (event.type === 'agent_complete') {
      console.log(`[complete] agent=${event.agent}`)
    }
  },
})

console.log('Testing Copilot adapter with gpt-4o...\n')

const result = await orchestrator.runAgent(
  {
    name: 'assistant',
    model: 'gpt-4o',
    provider: 'copilot',
    systemPrompt: 'You are a helpful assistant. Keep answers brief.',
    maxTurns: 1,
    maxTokens: 256,
  },
  'What is 2 + 2? Reply in one sentence.',
)

if (result.success) {
  console.log('\nAgent output:')
  console.log('─'.repeat(60))
  console.log(result.output)
  console.log('─'.repeat(60))
  console.log(`\nTokens: input=${result.tokenUsage.input_tokens}, output=${result.tokenUsage.output_tokens}`)
} else {
  console.error('Agent failed:', result.output)
  process.exit(1)
}
@@ -1,48 +0,0 @@
/**
 * Quick smoke test for the Gemini adapter.
 *
 * Run:
 *   npx tsx examples/13-gemini.ts
 *
 * If GEMINI_API_KEY is not set, the adapter will not work.
 */

import { OpenMultiAgent } from '../src/index.js'
import type { OrchestratorEvent } from '../src/types.js'

const orchestrator = new OpenMultiAgent({
  defaultModel: 'gemini-2.5-flash',
  defaultProvider: 'gemini',
  onProgress: (event: OrchestratorEvent) => {
    if (event.type === 'agent_start') {
      console.log(`[start] agent=${event.agent}`)
    } else if (event.type === 'agent_complete') {
      console.log(`[complete] agent=${event.agent}`)
    }
  },
})

console.log('Testing Gemini adapter with gemini-2.5-flash...\n')

const result = await orchestrator.runAgent(
  {
    name: 'assistant',
    model: 'gemini-2.5-flash',
    provider: 'gemini',
    systemPrompt: 'You are a helpful assistant. Keep answers brief.',
    maxTurns: 1,
    maxTokens: 256,
  },
  'What is 2 + 2? Reply in one sentence.',
)

if (result.success) {
  console.log('\nAgent output:')
  console.log('─'.repeat(60))
  console.log(result.output)
  console.log('─'.repeat(60))
  console.log(`\nTokens: input=${result.tokenUsage.input_tokens}, output=${result.tokenUsage.output_tokens}`)
} else {
  console.error('Agent failed:', result.output)
  process.exit(1)
}
@@ -0,0 +1,89 @@
# Examples

Runnable scripts demonstrating `open-multi-agent`. Organized by category — pick one that matches what you're trying to do.

All scripts run with `npx tsx examples/<category>/<name>.ts` and require the corresponding API key in your environment.

---

## basics — start here

The four core execution modes. Read these first.

| Example | What it shows |
|---------|---------------|
| [`basics/single-agent`](basics/single-agent.ts) | One agent with bash + file tools, then streaming via the `Agent` class. |
| [`basics/team-collaboration`](basics/team-collaboration.ts) | `runTeam()` coordinator pattern — goal in, results out. |
| [`basics/task-pipeline`](basics/task-pipeline.ts) | `runTasks()` with explicit task DAG and dependencies. |
| [`basics/multi-model-team`](basics/multi-model-team.ts) | Different models per agent in one team. |

## providers — model & adapter examples

One example per supported provider. All follow the same three-agent (architect / developer / reviewer) shape so they're easy to compare.

| Example | Provider | Env var |
|---------|----------|---------|
| [`providers/ollama`](providers/ollama.ts) | Ollama (local) + Claude | `ANTHROPIC_API_KEY` |
| [`providers/gemma4-local`](providers/gemma4-local.ts) | Gemma 4 via Ollama (100% local) | — |
| [`providers/copilot`](providers/copilot.ts) | GitHub Copilot (GPT-4o + Claude) | `GITHUB_TOKEN` |
| [`providers/azure-openai`](providers/azure-openai.ts) | Azure OpenAI | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` (+ optional `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`) |
| [`providers/grok`](providers/grok.ts) | xAI Grok | `XAI_API_KEY` |
| [`providers/gemini`](providers/gemini.ts) | Google Gemini | `GEMINI_API_KEY` |
| [`providers/minimax`](providers/minimax.ts) | MiniMax M2.7 | `MINIMAX_API_KEY` |
| [`providers/deepseek`](providers/deepseek.ts) | DeepSeek Chat | `DEEPSEEK_API_KEY` |
| [`providers/groq`](providers/groq.ts) | Groq (OpenAI-compatible) | `GROQ_API_KEY` |

## patterns — orchestration patterns

Reusable shapes for common multi-agent problems.

| Example | Pattern |
|---------|---------|
| [`patterns/fan-out-aggregate`](patterns/fan-out-aggregate.ts) | MapReduce-style fan-out via `AgentPool.runParallel()`. |
| [`patterns/structured-output`](patterns/structured-output.ts) | Zod-validated JSON output from an agent. |
| [`patterns/task-retry`](patterns/task-retry.ts) | Per-task retry with exponential backoff. |
| [`patterns/multi-perspective-code-review`](patterns/multi-perspective-code-review.ts) | Multiple reviewer agents in parallel, then synthesis. |
| [`patterns/research-aggregation`](patterns/research-aggregation.ts) | Multi-source research collated by a synthesis agent. |
| [`patterns/agent-handoff`](patterns/agent-handoff.ts) | Synchronous sub-agent delegation via `delegate_to_agent`. |

## cookbook — use-case recipes

End-to-end examples framed around a concrete problem (meeting summarization, translation QA, competitive monitoring, etc.) rather than a single orchestration primitive. Lighter bar than `production/`: no tests or pinned model versions required. Good entry point if you want to see how the patterns compose on a real task.

| Example | Problem solved |
|---------|----------------|
| [`cookbook/meeting-summarizer`](cookbook/meeting-summarizer.ts) | Fan-out post-processing of a transcript into summary, structured action items, and sentiment. |

## integrations — external systems

Hooking the framework up to external tooling.

| Example | Integrates with |
|---------|-----------------|
| [`integrations/trace-observability`](integrations/trace-observability.ts) | `onTrace` spans for LLM calls, tools, and tasks. |
| [`integrations/mcp-github`](integrations/mcp-github.ts) | An MCP server's tools exposed to an agent via `connectMCPTools()`. |
| [`integrations/with-vercel-ai-sdk/`](integrations/with-vercel-ai-sdk/) | Next.js app — OMA `runTeam()` + AI SDK `useChat` streaming. |

## production — real-world use cases

End-to-end examples wired to real workflows. Higher bar than the categories above. See [`production/README.md`](production/README.md) for the acceptance criteria and how to contribute.

---

## Adding a new example

| You're adding… | Goes in… | Filename |
|----------------|----------|----------|
| A new model provider | `providers/` | `<provider-name>.ts` (lowercase, hyphenated) |
| A reusable orchestration pattern | `patterns/` | `<pattern-name>.ts` |
| A use-case-driven example (problem-first, uses one or more patterns) | `cookbook/` | `<use-case>.ts` |
| Integration with an outside system (MCP server, observability backend, framework, app) | `integrations/` | `<system>.ts` or `<system>/` for multi-file |
| A real-world end-to-end use case, production-grade | `production/` | `<use-case>/` directory with its own README |

Conventions:

- **No numeric prefixes.** Folders signal category; reading order is set by this README.
- **File header docstring** with one-line title, `Run:` block, and prerequisites.
- **Imports** should resolve as `from '../../src/index.js'` (one level deeper than the old flat layout).
- **Match the provider template** when adding a provider: three-agent team (architect / developer / reviewer) building a small REST API. Keeps comparisons honest.
- **Add a row** to the table in this file for the corresponding category.
@@ -1,5 +1,5 @@
 /**
- * Example 04 — Multi-Model Team with Custom Tools
+ * Multi-Model Team with Custom Tools
  *
  * Demonstrates:
  * - Mixing Anthropic and OpenAI models in the same team
@@ -8,7 +8,7 @@
  * - Running a team goal that uses the custom tools
  *
  * Run:
- *   npx tsx examples/04-multi-model-team.ts
+ *   npx tsx examples/basics/multi-model-team.ts
  *
  * Prerequisites:
  *   ANTHROPIC_API_KEY and OPENAI_API_KEY env vars must be set.
@@ -16,8 +16,8 @@
  */

 import { z } from 'zod'
-import { OpenMultiAgent, defineTool } from '../src/index.js'
-import type { AgentConfig, OrchestratorEvent } from '../src/types.js'
+import { OpenMultiAgent, defineTool } from '../../src/index.js'
+import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Custom tools — defined with defineTool() + Zod schemas
@@ -113,7 +113,7 @@ const formatCurrencyTool = defineTool({
 // directly through AgentPool rather than through the OpenMultiAgent high-level API.
 // ---------------------------------------------------------------------------

-import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../src/index.js'
+import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'

 /**
  * Build an Agent with both built-in and custom tools registered.
@@ -1,18 +1,18 @@
 /**
- * Example 01 — Single Agent
+ * Single Agent
  *
  * The simplest possible usage: one agent with bash and file tools, running
  * a coding task. Then shows streaming output using the Agent class directly.
  *
  * Run:
- *   npx tsx examples/01-single-agent.ts
+ *   npx tsx examples/basics/single-agent.ts
  *
  * Prerequisites:
  *   ANTHROPIC_API_KEY env var must be set.
  */

-import { OpenMultiAgent, Agent, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../src/index.js'
-import type { OrchestratorEvent } from '../src/types.js'
+import { OpenMultiAgent, Agent, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
+import type { OrchestratorEvent } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Part 1: Single agent via OpenMultiAgent (simplest path)
@@ -114,6 +114,8 @@ const conversationAgent = new Agent(
     model: 'claude-sonnet-4-6',
     systemPrompt: 'You are a TypeScript tutor. Give short, direct answers.',
     maxTurns: 2,
+    // Keep only the most recent turn in long prompt() conversations.
+    contextStrategy: { type: 'sliding-window', maxTurns: 1 },
   },
   new ToolRegistry(), // no tools needed for this conversation
   new ToolExecutor(new ToolRegistry()),
@@ -1,19 +1,21 @@
 /**
- * Example 03 — Explicit Task Pipeline with Dependencies
+ * Explicit Task Pipeline with Dependencies
  *
  * Demonstrates how to define tasks with explicit dependency chains
  * (design → implement → test → review) using runTasks(). The TaskQueue
  * automatically blocks downstream tasks until their dependencies complete.
+ * Prompt context is dependency-scoped by default: each task sees only its own
+ * description plus direct dependency results (not unrelated team outputs).
  *
  * Run:
- *   npx tsx examples/03-task-pipeline.ts
+ *   npx tsx examples/basics/task-pipeline.ts
  *
  * Prerequisites:
  *   ANTHROPIC_API_KEY env var must be set.
  */

-import { OpenMultiAgent } from '../src/index.js'
-import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
+import { OpenMultiAgent } from '../../src/index.js'
+import type { AgentConfig, OrchestratorEvent, Task } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Agents
@@ -116,6 +118,7 @@ const tasks: Array<{
   description: string
   assignee?: string
   dependsOn?: string[]
+  memoryScope?: 'dependencies' | 'all'
 }> = [
   {
     title: 'Design: URL shortener data model',
@@ -162,6 +165,9 @@ Produce a structured code review with sections:
 - Verdict: SHIP or NEEDS WORK`,
     assignee: 'reviewer',
     dependsOn: ['Implement: URL shortener'], // runs in parallel with Test after Implement completes
+    // Optional override: reviewers can opt into full shared memory when needed.
+    // Remove this line to keep strict dependency-only context.
+    memoryScope: 'all',
   },
 ]

@@ -1,19 +1,19 @@
 /**
- * Example 02 — Multi-Agent Team Collaboration
+ * Multi-Agent Team Collaboration
  *
  * Three specialised agents (architect, developer, reviewer) collaborate on a
  * shared goal. The OpenMultiAgent orchestrator breaks the goal into tasks, assigns
  * them to the right agents, and collects the results.
  *
  * Run:
- *   npx tsx examples/02-team-collaboration.ts
+ *   npx tsx examples/basics/team-collaboration.ts
  *
  * Prerequisites:
  *   ANTHROPIC_API_KEY env var must be set.
  */

-import { OpenMultiAgent } from '../src/index.js'
-import type { AgentConfig, OrchestratorEvent } from '../src/types.js'
+import { OpenMultiAgent } from '../../src/index.js'
+import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Agent definitions
|
@ -0,0 +1,284 @@
|
|||
/**
|
||||
* Meeting Summarizer (Parallel Post-Processing)
|
||||
*
|
||||
* Demonstrates:
|
||||
* - Fan-out of three specialized agents on the same meeting transcript
|
||||
* - Structured output (Zod schemas) for action items and sentiment
|
||||
* - Parallel timing check: wall time vs sum of per-agent durations
|
||||
* - Aggregator merges into a single Markdown report
|
||||
*
|
||||
* Run:
|
||||
* npx tsx examples/patterns/meeting-summarizer.ts
|
||||
*
|
||||
* Prerequisites:
|
||||
* ANTHROPIC_API_KEY env var must be set.
|
||||
*/
|
||||
|
||||
import { readFileSync } from 'node:fs'
|
||||
import { fileURLToPath } from 'node:url'
|
||||
import path from 'node:path'
|
||||
import { z } from 'zod'
|
||||
import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
|
||||
import type { AgentConfig, AgentRunResult } from '../../src/types.js'
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Load the transcript fixture
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const __dirname = path.dirname(fileURLToPath(import.meta.url))
|
||||
const TRANSCRIPT = readFileSync(
|
||||
path.join(__dirname, '../fixtures/meeting-transcript.txt'),
|
||||
'utf-8',
|
||||
)
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Zod schemas for structured agents
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const ActionItemList = z.object({
|
||||
items: z.array(
|
||||
z.object({
|
||||
task: z.string().describe('The action to be taken'),
|
||||
owner: z.string().describe('Name of the person responsible'),
|
||||
due_date: z.string().optional().describe('ISO date or human-readable due date if mentioned'),
|
||||
}),
|
||||
),
|
||||
})
|
||||
type ActionItemList = z.infer<typeof ActionItemList>
|
||||
|
||||
const SentimentReport = z.object({
|
||||
participants: z.array(
|
||||
z.object({
|
||||
participant: z.string().describe('Name as it appears in the transcript'),
|
||||
tone: z.enum(['positive', 'neutral', 'negative', 'mixed']),
|
||||
evidence: z.string().describe('Direct quote or brief paraphrase supporting the tone'),
|
||||
}),
|
||||
),
|
||||
})
|
||||
type SentimentReport = z.infer<typeof SentimentReport>
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Agent configs
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const summaryConfig: AgentConfig = {
|
||||
name: 'summary',
|
||||
model: 'claude-sonnet-4-6',
|
||||
systemPrompt: `You are a meeting note-taker. Given a transcript, produce a
|
||||
three-paragraph summary:
|
||||
|
||||
1. What was discussed (the agenda).
|
||||
2. Decisions made.
|
||||
3. Notable context or risk the team should remember.
|
||||
|
||||
Plain prose. No bullet points. 200-300 words total.`,
|
||||
maxTurns: 1,
|
||||
temperature: 0.3,
|
||||
}
|
||||
|
||||
const actionItemsConfig: AgentConfig = {
|
||||
name: 'action-items',
|
||||
model: 'claude-sonnet-4-6',
|
||||
systemPrompt: `You extract action items from meeting transcripts. An action
|
||||
item is a concrete task with a clear owner. Skip vague intentions ("we should
|
||||
think about X"). Include due dates only when the speaker named one explicitly.
|
||||
|
||||
Return JSON matching the schema.`,
|
||||
maxTurns: 1,
|
||||
temperature: 0.1,
|
||||
outputSchema: ActionItemList,
|
||||
}
|
||||
|
||||
const sentimentConfig: AgentConfig = {
|
||||
name: 'sentiment',
|
||||
model: 'claude-sonnet-4-6',
|
||||
systemPrompt: `You analyze the tone of each participant in a meeting. For
|
||||
every named speaker, classify their overall tone as positive, neutral,
|
||||
negative, or mixed, and include one short quote or paraphrase as evidence.
|
||||
|
||||
Return JSON matching the schema.`,
|
||||
maxTurns: 1,
|
||||
temperature: 0.2,
|
||||
outputSchema: SentimentReport,
|
||||
}
|
||||
|
||||
const aggregatorConfig: AgentConfig = {
|
||||
name: 'aggregator',
|
||||
model: 'claude-sonnet-4-6',
|
||||
systemPrompt: `You are a report writer. You receive three pre-computed
|
||||
analyses of the same meeting: a summary, an action-item list, and a sentiment
|
||||
report. Your job is to merge them into a single Markdown report.
|
||||
|
||||
Output structure — use exactly these four H2 headings, in order:
|
||||
|
||||
## Summary
|
||||
## Action Items
|
||||
## Sentiment
|
||||
## Next Steps
|
||||
|
||||
Under "Action Items" render a Markdown table with columns: Task, Owner, Due.
|
||||
Under "Sentiment" render one bullet per participant.
|
||||
Under "Next Steps" synthesize 3-5 concrete follow-ups based on the other
|
||||
sections. Do not invent action items that are not grounded in the other data.`,
|
||||
maxTurns: 1,
|
||||
temperature: 0.3,
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Build agents
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
function buildAgent(config: AgentConfig): Agent {
|
||||
const registry = new ToolRegistry()
|
||||
registerBuiltInTools(registry)
|
||||
const executor = new ToolExecutor(registry)
|
||||
return new Agent(config, registry, executor)
|
||||
}
|
||||
|
||||
const summary = buildAgent(summaryConfig)
|
||||
const actionItems = buildAgent(actionItemsConfig)
|
||||
const sentiment = buildAgent(sentimentConfig)
|
||||
const aggregator = buildAgent(aggregatorConfig)
|
||||
|
||||
const pool = new AgentPool(3) // three specialists can run concurrently
|
||||
pool.add(summary)
|
||||
pool.add(actionItems)
|
||||
pool.add(sentiment)
|
||||
pool.add(aggregator)
|
||||
|
||||
console.log('Meeting Summarizer (Parallel Post-Processing)')
|
||||
console.log('='.repeat(60))
|
||||
console.log(`\nTranscript: ${TRANSCRIPT.split('\n')[0]}`)
|
||||
console.log(`Length: ${TRANSCRIPT.split(/\s+/).length} words\n`)
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Step 1: Parallel fan-out with per-agent timing
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
console.log('[Step 1] Running 3 agents in parallel...\n')
|
||||
|
||||
const specialists = ['summary', 'action-items', 'sentiment'] as const
|
||||
|
||||
// Kick off all three concurrently and record each one's own wall duration.
|
||||
// Sum-of-per-agent beats a separate serial pass: half the LLM cost, and the
|
||||
// sum is the work parallelism saved.
|
const parallelStart = performance.now()
const timed = await Promise.all(
  specialists.map(async (name) => {
    const t = performance.now()
    const result = await pool.run(name, TRANSCRIPT)
    return { name, result, durationMs: performance.now() - t }
  }),
)
const parallelElapsed = performance.now() - parallelStart

const byName = new Map<string, AgentRunResult>()
const serialSum = timed.reduce((acc, r) => {
  byName.set(r.name, r.result)
  return acc + r.durationMs
}, 0)

for (const { name, result, durationMs } of timed) {
  const status = result.success ? 'OK' : 'FAILED'
  console.log(
    `  ${name.padEnd(14)} [${status}] — ${Math.round(durationMs)}ms, ${result.tokenUsage.output_tokens} out tokens`,
  )
}
console.log()

for (const { name, result } of timed) {
  if (!result.success) {
    console.error(`Specialist '${name}' failed: ${result.output}`)
    process.exit(1)
  }
}

const actionData = byName.get('action-items')!.structured as ActionItemList | undefined
const sentimentData = byName.get('sentiment')!.structured as SentimentReport | undefined

if (!actionData || !sentimentData) {
  console.error('Structured output missing: action-items or sentiment failed schema validation')
  process.exit(1)
}

// ---------------------------------------------------------------------------
// Step 2: Parallelism assertion
// ---------------------------------------------------------------------------

console.log('[Step 2] Parallelism check')
console.log(`  Parallel wall time: ${Math.round(parallelElapsed)}ms`)
console.log(`  Serial sum (per-agent): ${Math.round(serialSum)}ms`)
console.log(`  Speedup: ${(serialSum / parallelElapsed).toFixed(2)}x\n`)

if (parallelElapsed >= serialSum * 0.7) {
  console.error(
    `ASSERTION FAILED: parallel wall time (${Math.round(parallelElapsed)}ms) is not ` +
      `less than 70% of serial sum (${Math.round(serialSum)}ms). Expected substantial ` +
      `speedup from fan-out.`,
  )
  process.exit(1)
}

// ---------------------------------------------------------------------------
// Step 3: Aggregate into Markdown report
// ---------------------------------------------------------------------------

console.log('[Step 3] Aggregating into Markdown report...\n')

const aggregatorPrompt = `Merge the three analyses below into a single Markdown report.

--- SUMMARY (prose) ---
${byName.get('summary')!.output}

--- ACTION ITEMS (JSON) ---
${JSON.stringify(actionData, null, 2)}

--- SENTIMENT (JSON) ---
${JSON.stringify(sentimentData, null, 2)}

Produce the Markdown report per the system instructions.`

const reportResult = await pool.run('aggregator', aggregatorPrompt)

if (!reportResult.success) {
  console.error('Aggregator failed:', reportResult.output)
  process.exit(1)
}

// ---------------------------------------------------------------------------
// Final output
// ---------------------------------------------------------------------------

console.log('='.repeat(60))
console.log('MEETING REPORT')
console.log('='.repeat(60))
console.log()
console.log(reportResult.output)
console.log()
console.log('-'.repeat(60))

// ---------------------------------------------------------------------------
// Token usage summary
// ---------------------------------------------------------------------------

console.log('\nToken Usage Summary:')
console.log('-'.repeat(60))

let totalInput = 0
let totalOutput = 0
for (const { name, result } of timed) {
  totalInput += result.tokenUsage.input_tokens
  totalOutput += result.tokenUsage.output_tokens
  console.log(
    `  ${name.padEnd(14)} — input: ${result.tokenUsage.input_tokens}, output: ${result.tokenUsage.output_tokens}`,
  )
}
totalInput += reportResult.tokenUsage.input_tokens
totalOutput += reportResult.tokenUsage.output_tokens
console.log(
  `  ${'aggregator'.padEnd(14)} — input: ${reportResult.tokenUsage.input_tokens}, output: ${reportResult.tokenUsage.output_tokens}`,
)
console.log('-'.repeat(60))
console.log(`  ${'TOTAL'.padEnd(14)} — input: ${totalInput}, output: ${totalOutput}`)

console.log('\nDone.')

@ -0,0 +1,21 @@
Weekly Engineering Standup — 2026-04-18
Attendees: Maya (Eng Manager), Raj (Senior Backend), Priya (Frontend Lead), Dan (SRE)

Maya: Quick round-table. Raj, where are we on the billing-v2 migration?
Raj: Cutover is scheduled for Tuesday the 28th. I want to get the shadow-write harness deployed by Friday so we have a full weekend of production traffic comparisons before the cutover. I'll own that. Concerned about the reconciliation query taking 40 seconds on the biggest accounts; I'll look into adding a covering index before cutover.

Maya: Good. Priya, the checkout redesign?
Priya: Ship-ready. I finished the accessibility audit yesterday, all high-priority items landed. Two medium items on the backlog I'll tackle next sprint. Planning to flip the feature flag for 5% of traffic on Thursday the 23rd and ramp from there. I've been heads-down on this for three weeks and honestly feeling pretty good about where it landed.

Maya: Great. Dan, Sunday's incident — what's the status on the retro?
Dan: Retro doc is up. Root cause was the failover script assuming a single-region topology after we moved to multi-region in Q1. The script hasn't been exercised in production since February. I'm frustrated that nobody caught it in review — the change was obvious if you read the diff, but it's twenty pages of YAML. I'm going to propose a rule that multi-region changes need a second reviewer on the SRE team. That's an action for me before the next postmortem, I'll have it drafted by Monday the 27th.

Maya: Reasonable. Anything else? Dan, how are you holding up? You've been on call a lot.
Dan: Honestly? Tired. The back-to-back incidents took the wind out of me. I'd like to hand off primary next rotation. I'll work with Raj on the handoff doc.

Maya: Noted. Let's make that happen. Priya, anything blocking you?
Priya: Nope, feeling good.

Raj: Just flagging — I saw the Slack thread about the authz refactor. If we're doing that this quarter, it conflicts with billing-v2 timelines. Can we park it until May?

Maya: Yes, I'll follow up with Len and reply in the thread. Thanks everyone.

@ -0,0 +1,59 @@
/**
 * MCP GitHub Tools
 *
 * Connect an MCP server over stdio and register all exposed MCP tools as
 * standard open-multi-agent tools.
 *
 * Run:
 *   npx tsx examples/integrations/mcp-github.ts
 *
 * Prerequisites:
 *   - GEMINI_API_KEY
 *   - GITHUB_TOKEN
 *   - @modelcontextprotocol/sdk installed
 */

import { Agent, ToolExecutor, ToolRegistry, registerBuiltInTools } from '../../src/index.js'
import { connectMCPTools } from '../../src/mcp.js'

if (!process.env.GITHUB_TOKEN?.trim()) {
  console.error('Missing GITHUB_TOKEN: set a GitHub personal access token in the environment.')
  process.exit(1)
}

const { tools, disconnect } = await connectMCPTools({
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-github'],
  env: {
    ...process.env,
    GITHUB_TOKEN: process.env.GITHUB_TOKEN,
  },
  namePrefix: 'github',
})

const registry = new ToolRegistry()
registerBuiltInTools(registry)
for (const tool of tools) registry.register(tool)
const executor = new ToolExecutor(registry)

const agent = new Agent(
  {
    name: 'github-agent',
    model: 'gemini-2.5-flash',
    provider: 'gemini',
    tools: tools.map((tool) => tool.name),
    systemPrompt: 'Use GitHub MCP tools to answer repository questions.',
  },
  registry,
  executor,
)

try {
  const result = await agent.run(
    'List the last 3 open issues in JackChen-me/open-multi-agent with title and number.',
  )

  console.log(result.output)
} finally {
  await disconnect()
}

@ -1,5 +1,5 @@
 /**
- * Example 11 — Trace Observability
+ * Trace Observability
  *
  * Demonstrates the `onTrace` callback for lightweight observability. Every LLM
  * call, tool execution, task lifecycle, and agent run emits a structured trace

@ -11,14 +11,14 @@
  * dashboard.
  *
  * Run:
- *   npx tsx examples/11-trace-observability.ts
+ *   npx tsx examples/integrations/trace-observability.ts
  *
  * Prerequisites:
  *   ANTHROPIC_API_KEY env var must be set.
  */

-import { OpenMultiAgent } from '../src/index.js'
-import type { AgentConfig, TraceEvent } from '../src/types.js'
+import { OpenMultiAgent } from '../../src/index.js'
+import type { AgentConfig, TraceEvent } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Agents

@ -0,0 +1,5 @@
node_modules/
.next/
.env
.env.local
*.tsbuildinfo

@ -0,0 +1,59 @@
# with-vercel-ai-sdk

A Next.js demo showing **open-multi-agent** (OMA) and **Vercel AI SDK** working together:

- **OMA** orchestrates a research team (researcher agent + writer agent) via `runTeam()`
- **AI SDK** streams the result to a chat UI via `useChat` + `streamText`

## How it works

```
User message
     │
     ▼
API route (app/api/chat/route.ts)
     │
     ├─ Phase 1: OMA runTeam()
     │    coordinator decomposes goal → researcher gathers info → writer drafts article
     │
     └─ Phase 2: AI SDK streamText()
          streams the team's output to the browser
              │
              ▼
Chat UI (app/page.tsx) — useChat hook renders streamed response
```

## Setup

```bash
# 1. From repo root, install OMA dependencies
cd ../../..
npm install

# 2. Back to this example
cd examples/integrations/with-vercel-ai-sdk
npm install

# 3. Set your API key
export DEEPSEEK_API_KEY=sk-...

# 4. Run
npm run dev
```

`npm run dev` automatically builds OMA before starting Next.js (via the `predev` script).

Open [http://localhost:3000](http://localhost:3000), type a topic, and watch the research team work.

## Prerequisites

- Node.js >= 18
- `DEEPSEEK_API_KEY` environment variable (used by both OMA and the AI SDK, which call DeepSeek's OpenAI-compatible API in `app/api/chat/route.ts`)

## Key files

| File | Role |
|------|------|
| `app/api/chat/route.ts` | Backend — OMA orchestration + AI SDK streaming |
| `app/page.tsx` | Frontend — chat UI with `useChat` hook |
| `package.json` | References OMA via `file:../../` (local link) |

@ -0,0 +1,91 @@
import { streamText, convertToModelMessages, type UIMessage } from 'ai'
import { createOpenAICompatible } from '@ai-sdk/openai-compatible'
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
import type { AgentConfig } from '@jackchen_me/open-multi-agent'

export const maxDuration = 120

// --- DeepSeek via OpenAI-compatible API ---
const DEEPSEEK_BASE_URL = 'https://api.deepseek.com'
const DEEPSEEK_MODEL = 'deepseek-chat'

const deepseek = createOpenAICompatible({
  name: 'deepseek',
  baseURL: `${DEEPSEEK_BASE_URL}/v1`,
  apiKey: process.env.DEEPSEEK_API_KEY,
})

const researcher: AgentConfig = {
  name: 'researcher',
  model: DEEPSEEK_MODEL,
  provider: 'openai',
  baseURL: DEEPSEEK_BASE_URL,
  apiKey: process.env.DEEPSEEK_API_KEY,
  systemPrompt: `You are a research specialist. Given a topic, provide thorough, factual research
with key findings, relevant data points, and important context.
Be concise but comprehensive. Output structured notes, not prose.`,
  maxTurns: 3,
  temperature: 0.2,
}

const writer: AgentConfig = {
  name: 'writer',
  model: DEEPSEEK_MODEL,
  provider: 'openai',
  baseURL: DEEPSEEK_BASE_URL,
  apiKey: process.env.DEEPSEEK_API_KEY,
  systemPrompt: `You are an expert writer. Using research from team members (available in shared memory),
write a well-structured, engaging article with clear headings and concise paragraphs.
Do not repeat raw research — synthesize it into readable prose.`,
  maxTurns: 3,
  temperature: 0.4,
}

function extractText(message: UIMessage): string {
  return message.parts
    .filter((p): p is { type: 'text'; text: string } => p.type === 'text')
    .map((p) => p.text)
    .join('')
}

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json()
  const lastText = extractText(messages.at(-1)!)

  // --- Phase 1: OMA multi-agent orchestration ---
  const orchestrator = new OpenMultiAgent({
    defaultModel: DEEPSEEK_MODEL,
    defaultProvider: 'openai',
    defaultBaseURL: DEEPSEEK_BASE_URL,
    defaultApiKey: process.env.DEEPSEEK_API_KEY,
  })

  const team = orchestrator.createTeam('research-writing', {
    name: 'research-writing',
    agents: [researcher, writer],
    sharedMemory: true,
  })

  const teamResult = await orchestrator.runTeam(
    team,
    `Research and write an article about: ${lastText}`,
  )

  const teamOutput = teamResult.agentResults.get('coordinator')?.output ?? ''

  // --- Phase 2: Stream result via Vercel AI SDK ---
  const result = streamText({
    model: deepseek(DEEPSEEK_MODEL),
    system: `You are presenting research from a multi-agent team (researcher + writer).
The team has already done the work. Your only job is to relay their output to the user
in a well-formatted way. Keep the content faithful to the team output below.
At the very end, add a one-line note that this was produced by a researcher agent
and a writer agent collaborating via open-multi-agent.

## Team Output
${teamOutput}`,
    messages: await convertToModelMessages(messages),
  })

  return result.toUIMessageStreamResponse()
}

@ -0,0 +1,14 @@
import type { Metadata } from 'next'

export const metadata: Metadata = {
  title: 'OMA + Vercel AI SDK',
  description: 'Multi-agent research team powered by open-multi-agent, streamed via Vercel AI SDK',
}

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="en">
      <body style={{ margin: 0, background: '#fafafa' }}>{children}</body>
    </html>
  )
}

@ -0,0 +1,97 @@
'use client'

import { useState } from 'react'
import { useChat } from '@ai-sdk/react'

export default function Home() {
  const { messages, sendMessage, status, error } = useChat()
  const [input, setInput] = useState('')

  const isLoading = status === 'submitted' || status === 'streaming'

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault()
    if (!input.trim() || isLoading) return
    const text = input
    setInput('')
    await sendMessage({ text })
  }

  return (
    <main
      style={{
        maxWidth: 720,
        margin: '0 auto',
        padding: '32px 16px',
        fontFamily: 'system-ui, -apple-system, sans-serif',
      }}
    >
      <h1 style={{ fontSize: 22, marginBottom: 4 }}>Research Team</h1>
      <p style={{ color: '#666', fontSize: 14, marginBottom: 28 }}>
        Enter a topic. A <strong>researcher</strong> agent gathers information, a{' '}
        <strong>writer</strong> agent composes an article — orchestrated by
        open-multi-agent, streamed via Vercel AI SDK.
      </p>

      <div style={{ minHeight: 120 }}>
        {messages.map((m) => (
          <div key={m.id} style={{ marginBottom: 24, lineHeight: 1.7 }}>
            <div style={{ fontWeight: 600, fontSize: 13, color: '#999', marginBottom: 4 }}>
              {m.role === 'user' ? 'You' : 'Research Team'}
            </div>
            <div style={{ whiteSpace: 'pre-wrap', fontSize: 15 }}>
              {m.parts
                .filter((part): part is { type: 'text'; text: string } => part.type === 'text')
                .map((part) => part.text)
                .join('')}
            </div>
          </div>
        ))}

        {isLoading && status === 'submitted' && (
          <div style={{ color: '#888', fontSize: 14, padding: '8px 0' }}>
            Agents are collaborating — this may take a minute...
          </div>
        )}

        {error && (
          <div style={{ color: '#c00', fontSize: 14, padding: '8px 0' }}>
            Error: {error.message}
          </div>
        )}
      </div>

      <form onSubmit={handleSubmit} style={{ display: 'flex', gap: 8, marginTop: 32 }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Enter a topic to research..."
          disabled={isLoading}
          style={{
            flex: 1,
            padding: '10px 14px',
            borderRadius: 8,
            border: '1px solid #ddd',
            fontSize: 15,
            outline: 'none',
          }}
        />
        <button
          type="submit"
          disabled={isLoading || !input.trim()}
          style={{
            padding: '10px 20px',
            borderRadius: 8,
            border: 'none',
            background: isLoading ? '#ccc' : '#111',
            color: '#fff',
            cursor: isLoading ? 'not-allowed' : 'pointer',
            fontSize: 15,
          }}
        >
          Send
        </button>
      </form>
    </main>
  )
}

@ -0,0 +1,6 @@
/// <reference types="next" />
/// <reference types="next/image-types/global" />
import "./.next/dev/types/routes.d.ts";

// NOTE: This file should not be edited
// see https://nextjs.org/docs/app/api-reference/config/typescript for more information.

@ -0,0 +1,7 @@
import type { NextConfig } from 'next'

const nextConfig: NextConfig = {
  serverExternalPackages: ['@jackchen_me/open-multi-agent'],
}

export default nextConfig

File diff suppressed because it is too large
@ -0,0 +1,25 @@
{
  "name": "with-vercel-ai-sdk",
  "private": true,
  "scripts": {
    "predev": "cd ../.. && npm run build",
    "dev": "next dev",
    "build": "next build",
    "start": "next start"
  },
  "dependencies": {
    "@ai-sdk/openai-compatible": "^2.0.41",
    "@ai-sdk/react": "^3.0.0",
    "@jackchen_me/open-multi-agent": "file:../../",
    "ai": "^6.0.0",
    "next": "^16.0.0",
    "react": "^19.0.0",
    "react-dom": "^19.0.0"
  },
  "devDependencies": {
    "@types/node": "^22.0.0",
    "@types/react": "^19.0.0",
    "@types/react-dom": "^19.0.0",
    "typescript": "^5.6.0"
  }
}

@ -0,0 +1,41 @@
{
  "compilerOptions": {
    "target": "ES2022",
    "lib": [
      "dom",
      "dom.iterable",
      "ES2022"
    ],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "ESNext",
    "moduleResolution": "bundler",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "react-jsx",
    "incremental": true,
    "plugins": [
      {
        "name": "next"
      }
    ],
    "paths": {
      "@/*": [
        "./*"
      ]
    }
  },
  "include": [
    "next-env.d.ts",
    "**/*.ts",
    "**/*.tsx",
    ".next/types/**/*.ts",
    ".next/dev/types/**/*.ts"
  ],
  "exclude": [
    "node_modules"
  ]
}

@ -0,0 +1,64 @@
/**
 * Synchronous agent handoff via `delegate_to_agent`
 *
 * During `runTeam` / `runTasks`, pool agents register the built-in
 * `delegate_to_agent` tool so one specialist can run a sub-prompt on another
 * roster agent and read the answer in the same conversation turn.
 *
 * Whitelist `delegate_to_agent` in `tools` when you want the model to see it;
 * standalone `runAgent()` does not register this tool by default.
 *
 * Run:
 *   npx tsx examples/patterns/agent-handoff.ts
 *
 * Prerequisites:
 *   ANTHROPIC_API_KEY
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig } from '../../src/types.js'

const researcher: AgentConfig = {
  name: 'researcher',
  model: 'claude-sonnet-4-6',
  provider: 'anthropic',
  systemPrompt:
    'You answer factual questions briefly. When the user asks for a second opinion ' +
    'from the analyst, use delegate_to_agent to ask the analyst agent, then summarize both views.',
  tools: ['delegate_to_agent'],
  maxTurns: 6,
}

const analyst: AgentConfig = {
  name: 'analyst',
  model: 'claude-sonnet-4-6',
  provider: 'anthropic',
  systemPrompt: 'You give short, skeptical analysis of claims. Push back when evidence is weak.',
  tools: [],
  maxTurns: 4,
}

async function main(): Promise<void> {
  const orchestrator = new OpenMultiAgent({ maxConcurrency: 2 })
  const team = orchestrator.createTeam('handoff-demo', {
    name: 'handoff-demo',
    agents: [researcher, analyst],
    sharedMemory: true,
  })

  const goal =
    'In one paragraph: state a simple fact about photosynthesis. ' +
    'Then ask the analyst (via delegate_to_agent) for a one-sentence critique of overstated claims in popular science. ' +
    'Merge both into a final short answer.'

  const result = await orchestrator.runTeam(team, goal)
  console.log('Success:', result.success)
  for (const [name, ar] of result.agentResults) {
    console.log(`\n--- ${name} ---\n${ar.output.slice(0, 2000)}`)
  }
}

main().catch((err) => {
  console.error(err)
  process.exit(1)
})

@ -1,5 +1,5 @@
 /**
- * Example 07 — Fan-Out / Aggregate (MapReduce) Pattern
+ * Fan-Out / Aggregate (MapReduce) Pattern
  *
  * Demonstrates:
  * - Fan-out: send the same question to N "analyst" agents in parallel

@ -9,14 +9,14 @@
  * - No tools needed — pure LLM reasoning to keep the focus on the pattern
  *
  * Run:
- *   npx tsx examples/07-fan-out-aggregate.ts
+ *   npx tsx examples/patterns/fan-out-aggregate.ts
  *
  * Prerequisites:
  *   ANTHROPIC_API_KEY env var must be set.
  */

-import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../src/index.js'
-import type { AgentConfig, AgentRunResult } from '../src/types.js'
+import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
+import type { AgentConfig, AgentRunResult } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Analysis topic

@ -0,0 +1,188 @@
/**
 * Multi-Perspective Code Review
 *
 * Demonstrates:
 * - Dependency chain: generator produces code, three reviewers depend on it
 * - Parallel execution: security, performance, and style reviewers run concurrently
 * - Shared memory: each agent's output is automatically stored and injected
 *   into downstream agents' prompts by the framework
 *
 * Flow:
 *   generator → [security-reviewer, performance-reviewer, style-reviewer] (parallel) → synthesizer
 *
 * Run:
 *   npx tsx examples/patterns/multi-perspective-code-review.ts
 *
 * Prerequisites:
 *   ANTHROPIC_API_KEY env var must be set.
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// API spec to implement
// ---------------------------------------------------------------------------

const API_SPEC = `POST /users endpoint that:
- Accepts JSON body with name (string, required), email (string, required), age (number, optional)
- Validates all fields
- Inserts into a PostgreSQL database
- Returns 201 with the created user or 400/500 on error`

// ---------------------------------------------------------------------------
// Agents
// ---------------------------------------------------------------------------

const generator: AgentConfig = {
  name: 'generator',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a Node.js backend developer. Given an API spec, write a complete
Express route handler. Include imports, validation, database query, and error handling.
Output only the code, no explanation. Keep it under 80 lines.`,
  maxTurns: 2,
}

const securityReviewer: AgentConfig = {
  name: 'security-reviewer',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a security reviewer. Review the code provided in context and check
for OWASP top 10 vulnerabilities: SQL injection, XSS, broken authentication,
sensitive data exposure, etc. Write your findings as a markdown checklist.
Keep it to 150-200 words.`,
  maxTurns: 2,
}

const performanceReviewer: AgentConfig = {
  name: 'performance-reviewer',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a performance reviewer. Review the code provided in context and check
for N+1 queries, memory leaks, blocking calls, missing connection pooling, and
inefficient patterns. Write your findings as a markdown checklist.
Keep it to 150-200 words.`,
  maxTurns: 2,
}

const styleReviewer: AgentConfig = {
  name: 'style-reviewer',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a code style reviewer. Review the code provided in context and check
naming conventions, function structure, readability, error message clarity, and
consistency. Write your findings as a markdown checklist.
Keep it to 150-200 words.`,
  maxTurns: 2,
}

const synthesizer: AgentConfig = {
  name: 'synthesizer',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a lead engineer synthesizing code review feedback. Review all
the feedback and original code provided in context. Produce a unified report with:

1. Critical issues (must fix before merge)
2. Recommended improvements (should fix)
3. Minor suggestions (nice to have)

Deduplicate overlapping feedback. Keep the report to 200-300 words.`,
  maxTurns: 2,
}

// ---------------------------------------------------------------------------
// Orchestrator + team
// ---------------------------------------------------------------------------

function handleProgress(event: OrchestratorEvent): void {
  if (event.type === 'task_start') {
    console.log(`  [START] ${event.task ?? '?'} → ${event.agent ?? '?'}`)
  }
  if (event.type === 'task_complete') {
    const success = (event.data as { success?: boolean })?.success ?? true
    console.log(`  [DONE] ${event.task ?? '?'} (${success ? 'OK' : 'FAIL'})`)
  }
}

const orchestrator = new OpenMultiAgent({
  defaultModel: 'claude-sonnet-4-6',
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('code-review-team', {
  name: 'code-review-team',
  agents: [generator, securityReviewer, performanceReviewer, styleReviewer, synthesizer],
  sharedMemory: true,
})

// ---------------------------------------------------------------------------
// Tasks
// ---------------------------------------------------------------------------

const tasks = [
  {
    title: 'Generate code',
    description: `Write a Node.js Express route handler for this API spec:\n\n${API_SPEC}`,
    assignee: 'generator',
  },
  {
    title: 'Security review',
    description: 'Review the generated code for security vulnerabilities.',
    assignee: 'security-reviewer',
    dependsOn: ['Generate code'],
  },
  {
    title: 'Performance review',
    description: 'Review the generated code for performance issues.',
    assignee: 'performance-reviewer',
    dependsOn: ['Generate code'],
  },
  {
    title: 'Style review',
    description: 'Review the generated code for style and readability.',
    assignee: 'style-reviewer',
    dependsOn: ['Generate code'],
  },
  {
    title: 'Synthesize feedback',
    description: 'Synthesize all review feedback and the original code into a unified, prioritized action item report.',
    assignee: 'synthesizer',
    dependsOn: ['Security review', 'Performance review', 'Style review'],
  },
]

// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------

console.log('Multi-Perspective Code Review')
console.log('='.repeat(60))
console.log(`Spec: ${API_SPEC.split('\n')[0]}`)
console.log('Pipeline: generator → 3 reviewers (parallel) → synthesizer')
console.log('='.repeat(60))
console.log()

const result = await orchestrator.runTasks(team, tasks)

// ---------------------------------------------------------------------------
// Output
// ---------------------------------------------------------------------------

console.log('\n' + '='.repeat(60))
console.log(`Overall success: ${result.success}`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log()

for (const [name, r] of result.agentResults) {
  const icon = r.success ? 'OK ' : 'FAIL'
  const tokens = `in:${r.tokenUsage.input_tokens} out:${r.tokenUsage.output_tokens}`
  console.log(`  [${icon}] ${name.padEnd(22)} ${tokens}`)
}

const synthResult = result.agentResults.get('synthesizer')
if (synthResult?.success) {
  console.log('\n' + '='.repeat(60))
  console.log('UNIFIED REVIEW REPORT')
  console.log('='.repeat(60))
  console.log()
  console.log(synthResult.output)
}

console.log('\nDone.')

@ -0,0 +1,169 @@
/**
 * Multi-Source Research Aggregation
 *
 * Demonstrates runTasks() with explicit dependency chains:
 * - Parallel execution: three analyst agents research the same topic independently
 * - Dependency chain via dependsOn: synthesizer waits for all analysts to finish
 * - Automatic shared memory: agent output flows to downstream agents via the framework
 *
 * Compare with example 07 (fan-out-aggregate) which uses AgentPool.runParallel()
 * for the same 3-analysts + synthesizer pattern. This example shows the runTasks()
 * API with explicit dependsOn declarations instead.
 *
 * Flow:
 *   [technical-analyst, market-analyst, community-analyst] (parallel) → synthesizer
 *
 * Run:
 *   npx tsx examples/patterns/research-aggregation.ts
 *
 * Prerequisites:
 *   ANTHROPIC_API_KEY env var must be set.
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Topic
// ---------------------------------------------------------------------------

const TOPIC = 'WebAssembly adoption in 2026'

// ---------------------------------------------------------------------------
// Agents — three analysts + one synthesizer
// ---------------------------------------------------------------------------

const technicalAnalyst: AgentConfig = {
  name: 'technical-analyst',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a technical analyst. Given a topic, research its technical
capabilities, limitations, performance characteristics, and architectural patterns.
Write your findings as structured markdown. Keep it to 200-300 words.`,
  maxTurns: 2,
}

const marketAnalyst: AgentConfig = {
  name: 'market-analyst',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a market analyst. Given a topic, research industry adoption
rates, key companies using the technology, market size estimates, and competitive
landscape. Write your findings as structured markdown. Keep it to 200-300 words.`,
  maxTurns: 2,
}

const communityAnalyst: AgentConfig = {
  name: 'community-analyst',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a developer community analyst. Given a topic, research
developer sentiment, ecosystem maturity, learning resources, community size,
and conference/meetup activity. Write your findings as structured markdown.
Keep it to 200-300 words.`,
  maxTurns: 2,
}

const synthesizer: AgentConfig = {
  name: 'synthesizer',
  model: 'claude-sonnet-4-6',
  systemPrompt: `You are a research director who synthesizes multiple analyst reports
into a single cohesive document. You will receive all prior analyst outputs
automatically. Then:

1. Cross-reference claims across reports - flag agreements and contradictions
2. Identify the 3 most important insights
3. Produce a structured report with: Executive Summary, Key Findings,
   Areas of Agreement, Open Questions, and Recommendation

Keep the final report to 300-400 words.`,
  maxTurns: 2,
}

// ---------------------------------------------------------------------------
// Orchestrator + team
// ---------------------------------------------------------------------------

function handleProgress(event: OrchestratorEvent): void {
  if (event.type === 'task_start') {
    console.log(` [START] ${event.task ?? ''} → ${event.agent ?? ''}`)
  }
  if (event.type === 'task_complete') {
    console.log(` [DONE] ${event.task ?? ''}`)
  }
}

const orchestrator = new OpenMultiAgent({
  defaultModel: 'claude-sonnet-4-6',
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('research-team', {
  name: 'research-team',
  agents: [technicalAnalyst, marketAnalyst, communityAnalyst, synthesizer],
  sharedMemory: true,
})

// ---------------------------------------------------------------------------
// Tasks — three analysts run in parallel, synthesizer depends on all three
// ---------------------------------------------------------------------------

const tasks = [
  {
    title: 'Technical analysis',
    description: `Research the technical aspects of ${TOPIC}. Focus on capabilities, limitations, performance, and architecture.`,
    assignee: 'technical-analyst',
  },
  {
    title: 'Market analysis',
    description: `Research the market landscape for ${TOPIC}. Focus on adoption rates, key players, market size, and competition.`,
    assignee: 'market-analyst',
  },
  {
    title: 'Community analysis',
    description: `Research the developer community around ${TOPIC}. Focus on sentiment, ecosystem maturity, learning resources, and community activity.`,
    assignee: 'community-analyst',
  },
  {
    title: 'Synthesize report',
    description: `Cross-reference all analyst findings, identify key insights, flag contradictions, and produce a unified research report.`,
    assignee: 'synthesizer',
    dependsOn: ['Technical analysis', 'Market analysis', 'Community analysis'],
  },
]

// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------

console.log('Multi-Source Research Aggregation')
console.log('='.repeat(60))
console.log(`Topic: ${TOPIC}`)
console.log('Pipeline: 3 analysts (parallel) → synthesizer')
console.log('='.repeat(60))
console.log()

const result = await orchestrator.runTasks(team, tasks)

// ---------------------------------------------------------------------------
// Output
// ---------------------------------------------------------------------------

console.log('\n' + '='.repeat(60))
console.log(`Overall success: ${result.success}`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log()

for (const [name, r] of result.agentResults) {
  const icon = r.success ? 'OK ' : 'FAIL'
  const tokens = `in:${r.tokenUsage.input_tokens} out:${r.tokenUsage.output_tokens}`
  console.log(` [${icon}] ${name.padEnd(20)} ${tokens}`)
}

const synthResult = result.agentResults.get('synthesizer')
if (synthResult?.success) {
  console.log('\n' + '='.repeat(60))
  console.log('SYNTHESIZED REPORT')
  console.log('='.repeat(60))
  console.log()
  console.log(synthResult.output)
}

console.log('\nDone.')
@ -1,5 +1,5 @@
/**
 * Example 09 — Structured Output
 * Structured Output
 *
 * Demonstrates `outputSchema` on AgentConfig. The agent's response is
 * automatically parsed as JSON and validated against a Zod schema.
@ -8,15 +8,15 @@
 * The validated result is available via `result.structured`.
 *
 * Run:
 *   npx tsx examples/09-structured-output.ts
 *   npx tsx examples/patterns/structured-output.ts
 *
 * Prerequisites:
 *   ANTHROPIC_API_KEY env var must be set.
 */

import { z } from 'zod'
import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig } from '../src/types.js'
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Define a Zod schema for the expected output
@ -1,5 +1,5 @@
/**
 * Example 10 — Task Retry with Exponential Backoff
 * Task Retry with Exponential Backoff
 *
 * Demonstrates `maxRetries`, `retryDelayMs`, and `retryBackoff` on task config.
 * When a task fails, the framework automatically retries with exponential
@ -10,14 +10,14 @@
 * to retry on failure, and the second task (analysis) depends on it.
 *
 * Run:
 *   npx tsx examples/10-task-retry.ts
 *   npx tsx examples/patterns/task-retry.ts
 *
 * Prerequisites:
 *   ANTHROPIC_API_KEY env var must be set.
 */

import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../src/types.js'
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agents
@ -0,0 +1,38 @@
# Production Examples

End-to-end examples that demonstrate `open-multi-agent` running on real-world use cases — not toy demos.

The other example categories (`basics/`, `providers/`, `patterns/`, `integrations/`) optimize for clarity and small surface area. This directory optimizes for **showing the framework solving an actual problem**, with the operational concerns that come with it.

## Acceptance criteria

A submission belongs in `production/` if it meets all of:

1. **Real use case.** Solves a concrete problem someone would actually pay for or use daily — not "build me a TODO API".
2. **Error handling.** Handles LLM failures, tool failures, and partial team failures gracefully. No bare `await` chains that crash on the first error.
3. **Documentation.** Each example lives in its own subdirectory with a `README.md` covering:
   - What problem it solves
   - Architecture diagram or task DAG description
   - Required env vars / external services
   - How to run locally
   - Expected runtime and approximate token cost
4. **Reproducible.** Pinned model versions; no reliance on private datasets or unpublished APIs.
5. **Tested.** At least one test or smoke check that verifies the example still runs after framework updates.

If a submission falls short on (2)–(5), it probably belongs in `patterns/` or `integrations/` instead.
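The error-handling criterion can be sketched as a small guard around the team run. This is an illustrative pattern, not framework API: `runWithGuard` and `TeamResultLike` are hypothetical names, and `TeamResultLike` only mirrors the `success` flag used in the provider examples.

```typescript
// Hypothetical guard: a provider or tool failure becomes a reported error
// instead of crashing the process on the first rejected await.
interface TeamResultLike {
  success: boolean
}

async function runWithGuard(
  run: () => Promise<TeamResultLike>,
): Promise<{ ok: boolean; error?: string }> {
  try {
    const result = await run()
    if (!result.success) return { ok: false, error: 'team run reported failure' }
    return { ok: true }
  } catch (err) {
    // LLM/tool errors surface here as data, not as an unhandled rejection.
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  }
}
```

A production example would wrap its `orchestrator.runTeam(...)` call in something like this and decide whether to retry, degrade, or exit with a useful message.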
## Layout

```
production/
└── <use-case>/
    ├── README.md   # required
    ├── index.ts    # entry point
    ├── agents/     # AgentConfig definitions
    ├── tools/      # custom tools, if any
    └── tests/      # smoke test or e2e test
```
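A `tests/` smoke check can be as small as a pure function over the run result. The shapes below (`AgentResultLike`, `TeamResultLike`, `smokeCheck`) are illustrative stand-ins for the framework's real result types, shown here only to make the "Tested" criterion concrete:

```typescript
// Hypothetical smoke check: assert the run succeeded and each required
// agent produced non-empty output. Adapt the shapes to the real types.
interface AgentResultLike {
  success: boolean
  output: string
}

interface TeamResultLike {
  success: boolean
  agentResults: Map<string, AgentResultLike>
}

// Returns human-readable failures; an empty array means the check passed.
function smokeCheck(result: TeamResultLike, requiredAgents: string[]): string[] {
  const failures: string[] = []
  if (!result.success) failures.push('overall run failed')
  for (const name of requiredAgents) {
    const r = result.agentResults.get(name)
    if (!r) failures.push(`missing result for agent "${name}"`)
    else if (!r.success) failures.push(`agent "${name}" failed`)
    else if (r.output.trim().length === 0) failures.push(`agent "${name}" produced empty output`)
  }
  return failures
}
```

Running this after the example's entry point and failing the process on any returned message is enough to catch regressions after framework updates.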

## Submitting

Open a PR. In the PR description, address each of the five acceptance criteria above.
@ -0,0 +1,179 @@
/**
 * Multi-Agent Team Collaboration with Azure OpenAI
 *
 * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
 * to build a minimal Express.js REST API. Every agent uses Azure-hosted OpenAI models.
 *
 * Run:
 *   npx tsx examples/providers/azure-openai.ts
 *
 * Prerequisites:
 *   AZURE_OPENAI_API_KEY     — Your Azure OpenAI API key (required)
 *   AZURE_OPENAI_ENDPOINT    — Your Azure endpoint URL (required)
 *                              Example: https://my-resource.openai.azure.com
 *   AZURE_OPENAI_API_VERSION — API version (optional, defaults to 2024-10-21)
 *   AZURE_OPENAI_DEPLOYMENT  — Deployment name fallback when model is blank (optional)
 *
 * Important Note on Model Field:
 *   The 'model' field in agent configs should contain your Azure DEPLOYMENT NAME,
 *   not the underlying model name. For example, if you deployed GPT-4 with the
 *   deployment name "my-gpt4-prod", use `model: 'my-gpt4-prod'` in the agent config.
 *
 *   You can find your deployment names in the Azure Portal under:
 *   Azure OpenAI → Your Resource → Model deployments
 *
 * Example Setup:
 *   If you have these Azure deployments:
 *   - "gpt-4" (your GPT-4 deployment)
 *   - "gpt-35-turbo" (your GPT-3.5 Turbo deployment)
 *
 *   Then use those exact names in the model field below.
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agent definitions (using Azure OpenAI deployments)
// ---------------------------------------------------------------------------

/**
 * IMPORTANT: Replace 'gpt-4' and 'gpt-35-turbo' below with YOUR actual
 * Azure deployment names. These are just examples.
 */

const architect: AgentConfig = {
  name: 'architect',
  model: 'gpt-4', // Replace with your Azure GPT-4 deployment name
  provider: 'azure-openai',
  systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown — no unnecessary prose.`,
  tools: ['bash', 'file_write'],
  maxTurns: 5,
  temperature: 0.2,
}

const developer: AgentConfig = {
  name: 'developer',
  model: 'gpt-4', // Replace with your Azure GPT-4 or GPT-3.5 deployment name
  provider: 'azure-openai',
  systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
  maxTurns: 12,
  temperature: 0.1,
}

const reviewer: AgentConfig = {
  name: 'reviewer',
  model: 'gpt-4', // Replace with your Azure GPT-4 deployment name
  provider: 'azure-openai',
  systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
  tools: ['bash', 'file_read', 'grep'],
  maxTurns: 5,
  temperature: 0.3,
}

// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()

function handleProgress(event: OrchestratorEvent): void {
  const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
  switch (event.type) {
    case 'agent_start':
      startTimes.set(event.agent ?? '', Date.now())
      console.log(`[${ts}] AGENT START → ${event.agent}`)
      break
    case 'agent_complete': {
      const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
      console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
      break
    }
    case 'task_start':
      console.log(`[${ts}] TASK START ↓ ${event.task}`)
      break
    case 'task_complete':
      console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
      break
    case 'message':
      console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
      break
    case 'error':
      console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
      if (event.data instanceof Error) console.error(`  ${event.data.message}`)
      break
  }
}

// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
  defaultModel: 'gpt-4', // Replace with your default Azure deployment name
  defaultProvider: 'azure-openai',
  maxConcurrency: 1, // sequential for readable output
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemory: true,
  maxConcurrency: 1,
})

console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))

const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`

const result = await orchestrator.runTeam(team, goal)

console.log('\n' + '='.repeat(60))

// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)

console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
  const status = agentResult.success ? 'OK' : 'FAILED'
  const tools = agentResult.toolCalls.length
  console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
  if (!agentResult.success) {
    console.log(`  Error: ${agentResult.output.slice(0, 120)}`)
  }
}

// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
  console.log('\nDeveloper output (last 600 chars):')
  console.log('─'.repeat(60))
  const out = developerResult.output
  console.log(out.length > 600 ? '...' + out.slice(-600) : out)
  console.log('─'.repeat(60))
}

const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
  console.log('\nReviewer output:')
  console.log('─'.repeat(60))
  console.log(reviewerResult.output)
  console.log('─'.repeat(60))
}
@ -0,0 +1,163 @@
/**
 * Multi-Agent Team Collaboration with GitHub Copilot
 *
 * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
 * to build a minimal Express.js REST API. Routes through GitHub Copilot's OpenAI-compatible
 * endpoint, mixing GPT-4o (architect/reviewer) and Claude Sonnet (developer) in one team.
 *
 * Run:
 *   npx tsx examples/providers/copilot.ts
 *
 * Authentication (one of):
 *   GITHUB_COPILOT_TOKEN env var (preferred)
 *   GITHUB_TOKEN env var (fallback)
 *   Otherwise: an interactive OAuth2 device flow starts on first run and prompts
 *   you to sign in via your browser. Requires an active Copilot subscription.
 *
 * Available models (subset):
 *   gpt-4o            — included, no premium request
 *   claude-sonnet-4.5 — premium, 1x multiplier
 *   claude-sonnet-4.6 — premium, 1x multiplier
 *   See src/llm/copilot.ts for the full model list and premium multipliers.
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agent definitions (mixing GPT-4o and Claude Sonnet, both via Copilot)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
  name: 'architect',
  model: 'gpt-4o',
  provider: 'copilot',
  systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown — no unnecessary prose.`,
  tools: ['bash', 'file_write'],
  maxTurns: 5,
  temperature: 0.2,
}

const developer: AgentConfig = {
  name: 'developer',
  model: 'claude-sonnet-4.5',
  provider: 'copilot',
  systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
  maxTurns: 12,
  temperature: 0.1,
}

const reviewer: AgentConfig = {
  name: 'reviewer',
  model: 'gpt-4o',
  provider: 'copilot',
  systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
  tools: ['bash', 'file_read', 'grep'],
  maxTurns: 5,
  temperature: 0.3,
}

// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()

function handleProgress(event: OrchestratorEvent): void {
  const ts = new Date().toISOString().slice(11, 23)
  switch (event.type) {
    case 'agent_start':
      startTimes.set(event.agent ?? '', Date.now())
      console.log(`[${ts}] AGENT START → ${event.agent}`)
      break
    case 'agent_complete': {
      const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
      console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
      break
    }
    case 'task_start':
      console.log(`[${ts}] TASK START ↓ ${event.task}`)
      break
    case 'task_complete':
      console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
      break
    case 'message':
      console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
      break
    case 'error':
      console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
      if (event.data instanceof Error) console.error(`  ${event.data.message}`)
      break
  }
}

// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
  defaultModel: 'gpt-4o',
  defaultProvider: 'copilot',
  maxConcurrency: 1,
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemory: true,
  maxConcurrency: 1,
})

console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))

const goal = `Create a minimal Express.js REST API in /tmp/copilot-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`

const result = await orchestrator.runTeam(team, goal)

console.log('\n' + '='.repeat(60))

// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)

console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
  const status = agentResult.success ? 'OK' : 'FAILED'
  const tools = agentResult.toolCalls.length
  console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
  if (!agentResult.success) {
    console.log(`  Error: ${agentResult.output.slice(0, 120)}`)
  }
}

const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
  console.log('\nDeveloper output (last 600 chars):')
  console.log('─'.repeat(60))
  const out = developerResult.output
  console.log(out.length > 600 ? '...' + out.slice(-600) : out)
  console.log('─'.repeat(60))
}

const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
  console.log('\nReviewer output:')
  console.log('─'.repeat(60))
  console.log(reviewerResult.output)
  console.log('─'.repeat(60))
}
@ -0,0 +1,158 @@
/**
 * Multi-Agent Team Collaboration with DeepSeek
 *
 * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
 * to build a minimal Express.js REST API. All agents run on DeepSeek models.
 *
 * Run:
 *   npx tsx examples/providers/deepseek.ts
 *
 * Prerequisites:
 *   DEEPSEEK_API_KEY environment variable must be set.
 *
 * Available models:
 *   deepseek-chat     — DeepSeek-V3 (non-thinking mode, recommended for coding tasks)
 *   deepseek-reasoner — DeepSeek-V3 (thinking mode, for complex reasoning)
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agent definitions (deepseek-reasoner for the architect, deepseek-chat for the rest)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
  name: 'architect',
  model: 'deepseek-reasoner',
  provider: 'deepseek',
  systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown — no unnecessary prose.`,
  tools: ['bash', 'file_write'],
  maxTurns: 5,
  temperature: 0.2,
}

const developer: AgentConfig = {
  name: 'developer',
  model: 'deepseek-chat',
  provider: 'deepseek',
  systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
  maxTurns: 12,
  temperature: 0.1,
}

const reviewer: AgentConfig = {
  name: 'reviewer',
  model: 'deepseek-chat',
  provider: 'deepseek',
  systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
  tools: ['bash', 'file_read', 'grep'],
  maxTurns: 5,
  temperature: 0.3,
}

// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()

function handleProgress(event: OrchestratorEvent): void {
  const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
  switch (event.type) {
    case 'agent_start':
      startTimes.set(event.agent ?? '', Date.now())
      console.log(`[${ts}] AGENT START → ${event.agent}`)
      break
    case 'agent_complete': {
      const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
      console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
      break
    }
    case 'task_start':
      console.log(`[${ts}] TASK START ↓ ${event.task}`)
      break
    case 'task_complete':
      console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
      break
    case 'message':
      console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
      break
    case 'error':
      console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
      if (event.data instanceof Error) console.error(`  ${event.data.message}`)
      break
  }
}

// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
  defaultModel: 'deepseek-chat',
  defaultProvider: 'deepseek',
  maxConcurrency: 1, // sequential for readable output
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemory: true,
  maxConcurrency: 1,
})

console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))

const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`

const result = await orchestrator.runTeam(team, goal)

console.log('\n' + '='.repeat(60))

// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)

console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
  const status = agentResult.success ? 'OK' : 'FAILED'
  const tools = agentResult.toolCalls.length
  console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
  if (!agentResult.success) {
    console.log(`  Error: ${agentResult.output.slice(0, 120)}`)
  }
}

// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
  console.log('\nDeveloper output (last 600 chars):')
  console.log('─'.repeat(60))
  const out = developerResult.output
  console.log(out.length > 600 ? '...' + out.slice(-600) : out)
  console.log('─'.repeat(60))
}

const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
  console.log('\nReviewer output:')
  console.log('─'.repeat(60))
  console.log(reviewerResult.output)
  console.log('─'.repeat(60))
}
@@ -0,0 +1,161 @@
/**
 * Multi-Agent Team Collaboration with Google Gemini
 *
 * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
 * to build a minimal Express.js REST API. Every agent uses Google's Gemini models
 * via the official `@google/genai` SDK.
 *
 * Run:
 *   npx tsx examples/providers/gemini.ts
 *
 * Prerequisites:
 *   GEMINI_API_KEY environment variable must be set.
 *   `@google/genai` is an optional peer dependency — install it first:
 *     npm install @google/genai
 *
 * Available models (subset):
 *   gemini-2.5-flash — fast & cheap, good for routine coding tasks
 *   gemini-2.5-pro — more capable, higher latency, larger context
 *   See https://ai.google.dev/gemini-api/docs/models for the full list.
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agent definitions (mixing pro and flash for a cost/capability balance)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
  name: 'architect',
  model: 'gemini-2.5-pro',
  provider: 'gemini',
  systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown — no unnecessary prose.`,
  tools: ['bash', 'file_write'],
  maxTurns: 5,
  temperature: 0.2,
}

const developer: AgentConfig = {
  name: 'developer',
  model: 'gemini-2.5-flash',
  provider: 'gemini',
  systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
  maxTurns: 12,
  temperature: 0.1,
}

const reviewer: AgentConfig = {
  name: 'reviewer',
  model: 'gemini-2.5-flash',
  provider: 'gemini',
  systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
  tools: ['bash', 'file_read', 'grep'],
  maxTurns: 5,
  temperature: 0.3,
}

// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()

function handleProgress(event: OrchestratorEvent): void {
  const ts = new Date().toISOString().slice(11, 23)
  switch (event.type) {
    case 'agent_start':
      startTimes.set(event.agent ?? '', Date.now())
      console.log(`[${ts}] AGENT START → ${event.agent}`)
      break
    case 'agent_complete': {
      const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
      console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
      break
    }
    case 'task_start':
      console.log(`[${ts}] TASK START ↓ ${event.task}`)
      break
    case 'task_complete':
      console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
      break
    case 'message':
      console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
      break
    case 'error':
      console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
      if (event.data instanceof Error) console.error(` ${event.data.message}`)
      break
  }
}

// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
  defaultModel: 'gemini-2.5-flash',
  defaultProvider: 'gemini',
  maxConcurrency: 1,
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemory: true,
  maxConcurrency: 1,
})

console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))

const goal = `Create a minimal Express.js REST API in /tmp/gemini-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`

const result = await orchestrator.runTeam(team, goal)

console.log('\n' + '='.repeat(60))

// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)

console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
  const status = agentResult.success ? 'OK' : 'FAILED'
  const tools = agentResult.toolCalls.length
  console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
  if (!agentResult.success) {
    console.log(` Error: ${agentResult.output.slice(0, 120)}`)
  }
}

const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
  console.log('\nDeveloper output (last 600 chars):')
  console.log('─'.repeat(60))
  const out = developerResult.output
  console.log(out.length > 600 ? '...' + out.slice(-600) : out)
  console.log('─'.repeat(60))
}

const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
  console.log('\nReviewer output:')
  console.log('─'.repeat(60))
  console.log(reviewerResult.output)
  console.log('─'.repeat(60))
}
@@ -1,5 +1,5 @@
 /**
- * Example 08 — Gemma 4 Local (100% Local, Zero API Cost)
+ * Gemma 4 Local (100% Local, Zero API Cost)
  *
  * Demonstrates both execution modes with a fully local Gemma 4 model via
  * Ollama. No cloud API keys needed — everything runs on your machine.
@@ -13,7 +13,7 @@
  * Gemma 4 e2b (5.1B params) handles both reliably.
  *
  * Run:
- *   no_proxy=localhost npx tsx examples/08-gemma4-local.ts
+ *   no_proxy=localhost npx tsx examples/providers/gemma4-local.ts
  *
  * Prerequisites:
  *   1. Ollama >= 0.20.0 installed and running: https://ollama.com
@@ -26,8 +26,8 @@
  * through the proxy.
  */

-import { OpenMultiAgent } from '../src/index.js'
-import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
+import { OpenMultiAgent } from '../../src/index.js'
+import type { AgentConfig, OrchestratorEvent, Task } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Configuration — change this to match your Ollama setup
@@ -1,18 +1,18 @@
 /**
- * Example 12 — Multi-Agent Team Collaboration with Grok (xAI)
+ * Multi-Agent Team Collaboration with Grok (xAI)
  *
  * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
  * to build a minimal Express.js REST API. Every agent uses Grok's coding-optimized model.
  *
  * Run:
- *   npx tsx examples/12-grok.ts
+ *   npx tsx examples/providers/grok.ts
  *
  * Prerequisites:
  *   XAI_API_KEY environment variable must be set.
  */

-import { OpenMultiAgent } from '../src/index.js'
-import type { AgentConfig, OrchestratorEvent } from '../src/types.js'
+import { OpenMultiAgent } from '../../src/index.js'
+import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Agent definitions (all using grok-code-fast-1)
@@ -0,0 +1,164 @@
/**
 * Multi-Agent Team Collaboration with Groq
 *
 * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
 * to build a minimal Express.js REST API. Every agent uses Groq via the OpenAI-compatible adapter.
 *
 * Run:
 *   npx tsx examples/providers/groq.ts
 *
 * Prerequisites:
 *   GROQ_API_KEY environment variable must be set.
 *
 * Available models:
 *   llama-3.3-70b-versatile — Groq production model (recommended for coding tasks)
 *   deepseek-r1-distill-llama-70b — Groq reasoning model
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agent definitions (all using Groq via the OpenAI-compatible adapter)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
  name: 'architect',
  model: 'deepseek-r1-distill-llama-70b',
  provider: 'openai',
  baseURL: 'https://api.groq.com/openai/v1',
  apiKey: process.env.GROQ_API_KEY,
  systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown — no unnecessary prose.`,
  tools: ['bash', 'file_write'],
  maxTurns: 5,
  temperature: 0.2,
}

const developer: AgentConfig = {
  name: 'developer',
  model: 'llama-3.3-70b-versatile',
  provider: 'openai',
  baseURL: 'https://api.groq.com/openai/v1',
  apiKey: process.env.GROQ_API_KEY,
  systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
  maxTurns: 12,
  temperature: 0.1,
}

const reviewer: AgentConfig = {
  name: 'reviewer',
  model: 'llama-3.3-70b-versatile',
  provider: 'openai',
  baseURL: 'https://api.groq.com/openai/v1',
  apiKey: process.env.GROQ_API_KEY,
  systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
  tools: ['bash', 'file_read', 'grep'],
  maxTurns: 5,
  temperature: 0.3,
}

// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()

function handleProgress(event: OrchestratorEvent): void {
  const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
  switch (event.type) {
    case 'agent_start':
      startTimes.set(event.agent ?? '', Date.now())
      console.log(`[${ts}] AGENT START → ${event.agent}`)
      break
    case 'agent_complete': {
      const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
      console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
      break
    }
    case 'task_start':
      console.log(`[${ts}] TASK START ↓ ${event.task}`)
      break
    case 'task_complete':
      console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
      break
    case 'message':
      console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
      break
    case 'error':
      console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
      if (event.data instanceof Error) console.error(` ${event.data.message}`)
      break
  }
}

// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
  defaultModel: 'llama-3.3-70b-versatile',
  defaultProvider: 'openai',
  maxConcurrency: 1, // sequential for readable output
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemory: true,
  maxConcurrency: 1,
})

console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))

const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`

const result = await orchestrator.runTeam(team, goal)

console.log('\n' + '='.repeat(60))

// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)

console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
  const status = agentResult.success ? 'OK' : 'FAILED'
  const tools = agentResult.toolCalls.length
  console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
  if (!agentResult.success) {
    console.log(` Error: ${agentResult.output.slice(0, 120)}`)
  }
}

// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
  console.log('\nDeveloper output (last 600 chars):')
  console.log('─'.repeat(60))
  const out = developerResult.output
  console.log(out.length > 600 ? '...' + out.slice(-600) : out)
  console.log('─'.repeat(60))
}

const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
  console.log('\nReviewer output:')
  console.log('─'.repeat(60))
  console.log(reviewerResult.output)
  console.log('─'.repeat(60))
}
@@ -0,0 +1,159 @@
/**
 * Multi-Agent Team Collaboration with MiniMax
 *
 * Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
 * to build a minimal Express.js REST API. Every agent uses MiniMax's flagship model.
 *
 * Run:
 *   npx tsx examples/providers/minimax.ts
 *
 * Prerequisites:
 *   MINIMAX_API_KEY environment variable must be set.
 *   MINIMAX_BASE_URL environment variable can be set to switch to the China mainland endpoint if needed.
 *
 * Endpoints:
 *   Global (default): https://api.minimax.io/v1
 *   China mainland: https://api.minimaxi.com/v1 (set MINIMAX_BASE_URL)
 */

import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'

// ---------------------------------------------------------------------------
// Agent definitions (all using MiniMax-M2.7)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
  name: 'architect',
  model: 'MiniMax-M2.7',
  provider: 'minimax',
  systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown — no unnecessary prose.`,
  tools: ['bash', 'file_write'],
  maxTurns: 5,
  temperature: 0.2,
}

const developer: AgentConfig = {
  name: 'developer',
  model: 'MiniMax-M2.7',
  provider: 'minimax',
  systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
  tools: ['bash', 'file_read', 'file_write', 'file_edit'],
  maxTurns: 12,
  temperature: 0.1,
}

const reviewer: AgentConfig = {
  name: 'reviewer',
  model: 'MiniMax-M2.7',
  provider: 'minimax',
  systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
  tools: ['bash', 'file_read', 'grep'],
  maxTurns: 5,
  temperature: 0.3,
}

// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()

function handleProgress(event: OrchestratorEvent): void {
  const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
  switch (event.type) {
    case 'agent_start':
      startTimes.set(event.agent ?? '', Date.now())
      console.log(`[${ts}] AGENT START → ${event.agent}`)
      break
    case 'agent_complete': {
      const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
      console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
      break
    }
    case 'task_start':
      console.log(`[${ts}] TASK START ↓ ${event.task}`)
      break
    case 'task_complete':
      console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
      break
    case 'message':
      console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
      break
    case 'error':
      console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
      if (event.data instanceof Error) console.error(` ${event.data.message}`)
      break
  }
}

// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
  defaultModel: 'MiniMax-M2.7',
  defaultProvider: 'minimax',
  maxConcurrency: 1, // sequential for readable output
  onProgress: handleProgress,
})

const team = orchestrator.createTeam('api-team', {
  name: 'api-team',
  agents: [architect, developer, reviewer],
  sharedMemory: true,
  maxConcurrency: 1,
})

console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))

const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`

const result = await orchestrator.runTeam(team, goal)

console.log('\n' + '='.repeat(60))

// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)

console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
  const status = agentResult.success ? 'OK' : 'FAILED'
  const tools = agentResult.toolCalls.length
  console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
  if (!agentResult.success) {
    console.log(` Error: ${agentResult.output.slice(0, 120)}`)
  }
}

// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
  console.log('\nDeveloper output (last 600 chars):')
  console.log('─'.repeat(60))
  const out = developerResult.output
  console.log(out.length > 600 ? '...' + out.slice(-600) : out)
  console.log('─'.repeat(60))
}

const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
  console.log('\nReviewer output:')
  console.log('─'.repeat(60))
  console.log(reviewerResult.output)
  console.log('─'.repeat(60))
}
@@ -1,5 +1,5 @@
 /**
- * Example 06 — Local Model + Cloud Model Team (Ollama + Claude)
+ * Local Model + Cloud Model Team (Ollama + Claude)
  *
  * Demonstrates mixing a local model served by Ollama with a cloud model
  * (Claude) in the same task pipeline. The key technique is using
@@ -14,7 +14,7 @@
  * Just change the baseURL and model name below.
  *
  * Run:
- *   npx tsx examples/06-local-model.ts
+ *   npx tsx examples/providers/ollama.ts
  *
  * Prerequisites:
  *   1. Ollama installed and running: https://ollama.com
@@ -22,8 +22,8 @@
  *   3. ANTHROPIC_API_KEY env var must be set.
  */

-import { OpenMultiAgent } from '../src/index.js'
-import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
+import { OpenMultiAgent } from '../../src/index.js'
+import type { AgentConfig, OrchestratorEvent, Task } from '../../src/types.js'

 // ---------------------------------------------------------------------------
 // Agents
File diff suppressed because it is too large

package.json
@@ -1,14 +1,27 @@
 {
   "name": "@jackchen_me/open-multi-agent",
-  "version": "1.0.0",
-  "description": "Production-grade multi-agent orchestration framework. Model-agnostic, supports team collaboration, task scheduling, and inter-agent communication.",
+  "version": "1.2.0",
+  "description": "TypeScript multi-agent framework — one runTeam() call from goal to result. Auto task decomposition, parallel execution. 3 dependencies, deploys anywhere Node.js runs.",
+  "files": [
+    "dist",
+    "docs",
+    "README.md",
+    "LICENSE"
+  ],
   "type": "module",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",
+  "bin": {
+    "oma": "dist/cli/oma.js"
+  },
   "exports": {
     ".": {
       "types": "./dist/index.d.ts",
       "import": "./dist/index.js"
     },
+    "./mcp": {
+      "types": "./dist/mcp.d.ts",
+      "import": "./dist/mcp.js"
+    }
   },
   "scripts": {
@@ -16,7 +29,9 @@
     "dev": "tsc --watch",
     "test": "vitest run",
     "test:watch": "vitest",
+    "test:coverage": "vitest run --coverage",
     "lint": "tsc --noEmit",
+    "test:e2e": "RUN_E2E=1 vitest run tests/e2e/",
     "prepublishOnly": "npm run build"
   },
   "keywords": [
@@ -42,15 +57,20 @@
     "zod": "^3.23.0"
   },
   "peerDependencies": {
-    "@google/genai": "^1.48.0"
+    "@google/genai": "^1.48.0",
+    "@modelcontextprotocol/sdk": "^1.18.0"
   },
   "peerDependenciesMeta": {
     "@google/genai": {
       "optional": true
     },
+    "@modelcontextprotocol/sdk": {
+      "optional": true
+    }
   },
   "devDependencies": {
     "@google/genai": "^1.48.0",
+    "@modelcontextprotocol/sdk": "^1.18.0",
     "@types/node": "^22.0.0",
     "@vitest/coverage-v8": "^2.1.9",
     "tsx": "^4.21.0",
@@ -146,10 +146,15 @@ export class Agent {
      maxTurns: this.config.maxTurns,
      maxTokens: this.config.maxTokens,
      temperature: this.config.temperature,
      toolPreset: this.config.toolPreset,
      allowedTools: this.config.tools,
      disallowedTools: this.config.disallowedTools,
      agentName: this.name,
      agentRole: this.config.systemPrompt?.slice(0, 50) ?? 'assistant',
      loopDetection: this.config.loopDetection,
      maxTokenBudget: this.config.maxTokenBudget,
      contextStrategy: this.config.contextStrategy,
      compressToolResults: this.config.compressToolResults,
    }

    this.runner = new AgentRunner(
@@ -260,7 +265,7 @@ export class Agent {
    * The tool becomes available to the next LLM call — no restart required.
    */
   addTool(tool: FrameworkToolDefinition): void {
-    this._toolRegistry.register(tool)
+    this._toolRegistry.register(tool, { runtimeAdded: true })
   }

   /**
@@ -328,6 +333,16 @@ export class Agent {
    const result = await runner.run(messages, runOptions)
    this.state.tokenUsage = addUsage(this.state.tokenUsage, result.tokenUsage)

    if (result.budgetExceeded) {
      let budgetResult = this.toAgentRunResult(result, false)
      if (this.config.afterRun) {
        budgetResult = await this.config.afterRun(budgetResult)
      }
      this.transitionTo('completed')
      this.emitAgentTrace(callerOptions, agentStartMs, budgetResult)
      return budgetResult
    }

    // --- Structured output validation ---
    if (this.config.outputSchema) {
      let validated = await this.validateStructuredOutput(
@@ -461,6 +476,7 @@ export class Agent {
          tokenUsage: mergedTokenUsage,
          toolCalls: mergedToolCalls,
          structured: validated,
          ...(retryResult.budgetExceeded ? { budgetExceeded: true } : {}),
        }
      } catch {
        // Retry also failed
@@ -472,6 +488,7 @@ export class Agent {
          tokenUsage: mergedTokenUsage,
          toolCalls: mergedToolCalls,
          structured: undefined,
          ...(retryResult.budgetExceeded ? { budgetExceeded: true } : {}),
        }
      }
    }
@@ -502,7 +519,7 @@ export class Agent {
     const result = event.data as import('./runner.js').RunResult
     this.state.tokenUsage = addUsage(this.state.tokenUsage, result.tokenUsage)

-    let agentResult = this.toAgentRunResult(result, true)
+    let agentResult = this.toAgentRunResult(result, !result.budgetExceeded)
     if (this.config.afterRun) {
       agentResult = await this.config.afterRun(agentResult)
     }
@@ -598,6 +615,7 @@ export class Agent {
      toolCalls: result.toolCalls,
      structured,
      ...(result.loopDetected ? { loopDetected: true } : {}),
      ...(result.budgetExceeded ? { budgetExceeded: true } : {}),
    }
  }
@@ -58,6 +58,14 @@ export interface PoolStatus {
export class AgentPool {
  private readonly agents: Map<string, Agent> = new Map()
  private readonly semaphore: Semaphore
  /**
   * Per-agent mutex (Semaphore(1)) to serialize concurrent runs on the same
   * Agent instance. Without this, two tasks assigned to the same agent could
   * race on mutable instance state (`status`, `messages`, `tokenUsage`).
   *
   * @see https://github.com/anthropics/open-multi-agent/issues/72
   */
  private readonly agentLocks: Map<string, Semaphore> = new Map()
  /** Cursor used by `runAny` for round-robin dispatch. */
  private roundRobinIndex = 0
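The serialization guarantee the per-agent mutex provides can be illustrated with a tiny standalone sketch. The `Semaphore` class below is a minimal assumed implementation written for this illustration (the framework's real `Semaphore` is not part of this diff); the point is the ordering a `Semaphore(1)` lock gives two overlapping runs on the same agent:

```typescript
// Minimal promise-based semaphore (assumed shape, for illustration only).
class Semaphore {
  private readonly queue: Array<() => void> = []
  private count: number
  constructor(max: number) {
    this.count = max
  }
  async acquire(): Promise<void> {
    if (this.count > 0) {
      this.count--
      return
    }
    await new Promise<void>(resolve => this.queue.push(resolve))
  }
  release(): void {
    const next = this.queue.shift()
    if (next) next() // hand the slot directly to the next waiter
    else this.count++
  }
}

// With a Semaphore(1) per agent, two overlapping runs on the same agent are
// serialized: the second caller blocks in acquire() until the first releases,
// so mutable per-agent state never interleaves mid-run.
export async function serializedRuns(): Promise<string[]> {
  const lock = new Semaphore(1)
  const order: string[] = []
  const runOnAgent = async (id: string) => {
    await lock.acquire()
    try {
      order.push(`${id}:start`)
      await new Promise(r => setTimeout(r, 10)) // simulated LLM turn
      order.push(`${id}:end`)
    } finally {
      lock.release()
    }
  }
  await Promise.all([runOnAgent('a'), runOnAgent('b')])
  return order
}

serializedRuns().then(order => console.log(order.join(' ')))
// prints "a:start a:end b:start b:end" — never interleaved
```

Without the lock, the two simulated turns would interleave (`a:start b:start a:end b:end`), which is exactly the race on `status`/`messages`/`tokenUsage` the docstring above describes.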
@@ -69,6 +77,16 @@ export class AgentPool {
    this.semaphore = new Semaphore(maxConcurrency)
  }

  /**
   * Pool semaphore slots not currently held (`maxConcurrency - active`).
   * Used to avoid deadlocks when a nested `run()` would wait forever for a slot
   * held by the parent run. Best-effort only if multiple nested runs start in
   * parallel after the same synchronous check.
   */
  get availableRunSlots(): number {
    return this.maxConcurrency - this.semaphore.active
  }

  // -------------------------------------------------------------------------
  // Registry operations
  // -------------------------------------------------------------------------
@@ -86,6 +104,7 @@ export class AgentPool {
      )
    }
    this.agents.set(agent.name, agent)
    this.agentLocks.set(agent.name, new Semaphore(1))
  }

  /**
@@ -98,6 +117,7 @@ export class AgentPool {
      throw new Error(`AgentPool: agent '${name}' is not registered.`)
    }
    this.agents.delete(name)
    this.agentLocks.delete(name)
  }

  /**
@@ -130,7 +150,41 @@ export class AgentPool {
    runOptions?: Partial<RunOptions>,
  ): Promise<AgentRunResult> {
    const agent = this.requireAgent(agentName)
    const agentLock = this.agentLocks.get(agentName)!

    // Acquire per-agent lock first so the second call for the same agent waits
    // here without consuming a pool slot. Then acquire the pool semaphore.
    await agentLock.acquire()
    try {
      await this.semaphore.acquire()
      try {
        return await agent.run(prompt, runOptions)
      } finally {
        this.semaphore.release()
      }
    } finally {
      agentLock.release()
    }
  }

  /**
   * Run a prompt on a caller-supplied Agent instance, acquiring only the pool
   * semaphore — no per-agent lock, no registry lookup.
   *
   * Designed for delegation: each delegated call should use a **fresh** Agent
   * instance (matching `delegate_to_agent`'s "runs in a fresh conversation"
   * semantics), so the per-agent mutex used by {@link run} would be dead
   * weight and, worse, a deadlock vector for mutual delegation (A→B while
   * B→A, each caller holding its own `run`'s agent lock).
   *
   * The caller is responsible for constructing the Agent; {@link AgentPool}
   * does not register or track it.
   */
  async runEphemeral(
    agent: Agent,
    prompt: string,
    runOptions?: Partial<RunOptions>,
  ): Promise<AgentRunResult> {
    await this.semaphore.acquire()
    try {
      return await agent.run(prompt, runOptions)
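The acquire order in `run` above (per-agent lock before pool semaphore) can be sketched in isolation. The snippet below is illustrative only: `TinySemaphore` and `runGuarded` are not part of the library, they merely model how a second call for the same agent queues on the agent lock without holding one of the pool's limited slots.

```typescript
// Minimal promise-based semaphore, enough to model the locking above.
class TinySemaphore {
  private queue: Array<() => void> = []
  active = 0
  constructor(private readonly max: number) {}
  acquire(): Promise<void> {
    if (this.active < this.max) { this.active++; return Promise.resolve() }
    return new Promise(resolve => this.queue.push(resolve))
  }
  release(): void {
    const next = this.queue.shift()
    if (next) next() // hand the slot directly to a waiter; `active` unchanged
    else this.active--
  }
}

const pool = new TinySemaphore(2)      // models maxConcurrency = 2
const agentLock = new TinySemaphore(1) // models one lock per agent name

// Same shape as AgentPool.run: agent lock first, then a pool slot.
async function runGuarded(work: () => Promise<string>): Promise<string> {
  await agentLock.acquire() // same-agent calls serialize here, slot-free
  try {
    await pool.acquire()    // only now consume a pool slot
    try { return await work() } finally { pool.release() }
  } finally {
    agentLock.release()
  }
}
```

Two concurrent calls for the same agent thus run strictly one after the other, and neither waiter ever pins a pool slot while queued.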

@@ -200,11 +254,18 @@ export class AgentPool {
    const agent = allAgents[this.roundRobinIndex]!
    this.roundRobinIndex = (this.roundRobinIndex + 1) % allAgents.length

-   await this.semaphore.acquire()
+   const agentLock = this.agentLocks.get(agent.name)!
+
+   await agentLock.acquire()
    try {
-     return await agent.run(prompt)
+     await this.semaphore.acquire()
+     try {
+       return await agent.run(prompt)
+     } finally {
+       this.semaphore.release()
+     }
    } finally {
-     this.semaphore.release()
+     agentLock.release()
    }
  }

@@ -23,17 +23,38 @@ import type {
  StreamEvent,
  ToolResult,
  ToolUseContext,
  TeamInfo,
  LLMAdapter,
  LLMChatOptions,
  TraceEvent,
  LoopDetectionConfig,
  LoopDetectionInfo,
  LLMToolDef,
  ContextStrategy,
} from '../types.js'
import { TokenBudgetExceededError } from '../errors.js'
import { LoopDetector } from './loop-detector.js'
import { emitTrace } from '../utils/trace.js'
import { estimateTokens } from '../utils/tokens.js'
import type { ToolRegistry } from '../tool/framework.js'
import type { ToolExecutor } from '../tool/executor.js'

// ---------------------------------------------------------------------------
// Tool presets
// ---------------------------------------------------------------------------

/** Predefined tool sets for common agent use cases. */
export const TOOL_PRESETS = {
  readonly: ['file_read', 'grep', 'glob'],
  readwrite: ['file_read', 'file_write', 'file_edit', 'grep', 'glob'],
  full: ['file_read', 'file_write', 'file_edit', 'grep', 'glob', 'bash'],
} as const satisfies Record<string, readonly string[]>

/** Framework-level disallowed tools for safety rails. */
export const AGENT_FRAMEWORK_DISALLOWED: readonly string[] = [
  // Empty for now, infrastructure for future built-in tools
]

// ---------------------------------------------------------------------------
// Public interfaces
// ---------------------------------------------------------------------------

@@ -59,17 +80,30 @@ export interface RunnerOptions {
  /** AbortSignal that cancels any in-flight adapter call and stops the loop. */
  readonly abortSignal?: AbortSignal
  /**
-  * Whitelist of tool names this runner is allowed to use.
-  * When provided, only tools whose name appears in this list are sent to the
-  * LLM. When omitted, all registered tools are available.
+  * Tool access control configuration.
+  * - `toolPreset`: Predefined tool sets for common use cases
+  * - `allowedTools`: Whitelist of tool names (allowlist)
+  * - `disallowedTools`: Blacklist of tool names (denylist)
+  * Tools are resolved in order: preset → allowlist → denylist
   */
+ readonly toolPreset?: 'readonly' | 'readwrite' | 'full'
  readonly allowedTools?: readonly string[]
+ readonly disallowedTools?: readonly string[]
  /** Display name of the agent driving this runner (used in tool context). */
  readonly agentName?: string
  /** Short role description of the agent (used in tool context). */
  readonly agentRole?: string
  /** Loop detection configuration. When set, detects stuck agent loops. */
  readonly loopDetection?: LoopDetectionConfig
  /** Maximum cumulative tokens (input + output) allowed for this run. */
  readonly maxTokenBudget?: number
  /** Optional context compression strategy for long multi-turn runs. */
  readonly contextStrategy?: ContextStrategy
  /**
   * Compress tool results that the agent has already processed.
   * See {@link AgentConfig.compressToolResults} for details.
   */
  readonly compressToolResults?: boolean | { readonly minChars?: number }
}

/**
@@ -101,6 +135,11 @@ export interface RunOptions {
   * {@link RunnerOptions.abortSignal}. Useful for per-run timeouts.
   */
  readonly abortSignal?: AbortSignal
  /**
   * Team context for built-in tools such as `delegate_to_agent`.
   * Injected by the orchestrator during `runTeam` / `runTasks` pool runs.
   */
  readonly team?: TeamInfo
}

/** The aggregated result returned when a full run completes. */
@@ -117,6 +156,8 @@ export interface RunResult {
  readonly turns: number
  /** True when the run was terminated or warned due to loop detection. */
  readonly loopDetected?: boolean
  /** True when the run was terminated due to token budget limits. */
  readonly budgetExceeded?: boolean
}

// ---------------------------------------------------------------------------
@@ -146,6 +187,34 @@ function addTokenUsage(a: TokenUsage, b: TokenUsage): TokenUsage {

const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }

/** Default minimum content length before tool result compression kicks in. */
const DEFAULT_MIN_COMPRESS_CHARS = 500

/**
 * Prepends synthetic framing text to the first user message so we never emit
 * consecutive `user` turns (Bedrock) and summaries do not concatenate onto
 * the original user prompt (direct API). If there is no user message yet,
 * inserts a single assistant text preamble.
 */
function prependSyntheticPrefixToFirstUser(
  messages: LLMMessage[],
  prefix: string,
): LLMMessage[] {
  const userIdx = messages.findIndex(m => m.role === 'user')
  if (userIdx < 0) {
    return [{
      role: 'assistant',
      content: [{ type: 'text', text: prefix.trimEnd() }],
    }, ...messages]
  }
  const target = messages[userIdx]!
  const merged: LLMMessage = {
    role: 'user',
    content: [{ type: 'text', text: prefix }, ...target.content],
  }
  return [...messages.slice(0, userIdx), merged, ...messages.slice(userIdx + 1)]
}
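To make the merge rule concrete, here is a standalone sketch of the same logic with simplified local message types; `Block` and `Msg` are illustrative stand-ins for the real `LLMMessage` shapes, not library exports.

```typescript
// Simplified model of prependSyntheticPrefixToFirstUser above.
type Block = { type: 'text'; text: string }
type Msg = { role: 'user' | 'assistant'; content: Block[] }

function prependPrefix(messages: Msg[], prefix: string): Msg[] {
  const i = messages.findIndex(m => m.role === 'user')
  if (i < 0) {
    // No user message yet: insert a lone assistant preamble instead.
    return [
      { role: 'assistant', content: [{ type: 'text', text: prefix.trimEnd() }] },
      ...messages,
    ]
  }
  // Merge the prefix into the first user message as a leading text block,
  // so no consecutive `user` turns are ever emitted.
  const merged: Msg = {
    role: 'user',
    content: [{ type: 'text', text: prefix }, ...messages[i]!.content],
  }
  return [...messages.slice(0, i), merged, ...messages.slice(i + 1)]
}

const out = prependPrefix(
  [{ role: 'user', content: [{ type: 'text', text: 'original prompt' }] }],
  '[Conversation summary]\nkey decisions...\n\n',
)
```

The result is still a single user message, now carrying two text blocks: the synthetic prefix followed by the original content.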

// ---------------------------------------------------------------------------
// AgentRunner
// ---------------------------------------------------------------------------

@@ -165,6 +234,10 @@ const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }
 */
export class AgentRunner {
  private readonly maxTurns: number
  private summarizeCache: {
    oldSignature: string
    summaryPrefix: string
  } | null = null

  constructor(
    private readonly adapter: LLMAdapter,
@@ -175,6 +248,242 @@ export class AgentRunner {
    this.maxTurns = options.maxTurns ?? 10
  }

  private serializeMessage(message: LLMMessage): string {
    return JSON.stringify(message)
  }

  private truncateToSlidingWindow(messages: LLMMessage[], maxTurns: number): LLMMessage[] {
    if (maxTurns <= 0) {
      return messages
    }

    const firstUserIndex = messages.findIndex(m => m.role === 'user')
    const firstUser = firstUserIndex >= 0 ? messages[firstUserIndex]! : null
    const afterFirst = firstUserIndex >= 0
      ? messages.slice(firstUserIndex + 1)
      : messages.slice()

    if (afterFirst.length <= maxTurns * 2) {
      return messages
    }

    const kept = afterFirst.slice(-maxTurns * 2)
    const result: LLMMessage[] = []

    if (firstUser !== null) {
      result.push(firstUser)
    }

    const droppedPairs = Math.floor((afterFirst.length - kept.length) / 2)
    if (droppedPairs > 0) {
      const notice =
        `[Earlier conversation history truncated — ${droppedPairs} turn(s) removed]\n\n`
      result.push(...prependSyntheticPrefixToFirstUser(kept, notice))
      return result
    }

    result.push(...kept)
    return result
  }
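The window arithmetic above is easy to misread, so here is a reduced model of which messages survive; `M` and `slidingWindow` are illustrative stand-ins, keeping only the index logic (first user message pinned, last `maxTurns * 2` later messages kept, middle dropped) and omitting the truncation notice.

```typescript
// Reduced model of truncateToSlidingWindow's keep/drop decision.
type M = { role: 'user' | 'assistant'; text: string }

function slidingWindow(messages: M[], maxTurns: number): M[] {
  const first = messages.findIndex(m => m.role === 'user')
  const rest = first >= 0 ? messages.slice(first + 1) : messages.slice()
  if (rest.length <= maxTurns * 2) return messages // everything fits
  const kept = rest.slice(-maxTurns * 2)           // last N pairs only
  const head = first >= 0 ? [messages[first]!] : []
  return [...head, ...kept]
}

// 1 pinned user message + 3 later pairs, window of 2 pairs:
// the oldest pair (t1/r1) is dropped, 5 messages survive.
const msgs: M[] = [
  { role: 'user', text: 'task' },
  { role: 'assistant', text: 't1' }, { role: 'user', text: 'r1' },
  { role: 'assistant', text: 't2' }, { role: 'user', text: 'r2' },
  { role: 'assistant', text: 't3' }, { role: 'user', text: 'r3' },
]
const windowed = slidingWindow(msgs, 2)
```

Slicing in multiples of two is what keeps assistant/user pairs intact, the same invariant the summarize path relies on below.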

  private async summarizeMessages(
    messages: LLMMessage[],
    maxTokens: number,
    summaryModel: string | undefined,
    baseChatOptions: LLMChatOptions,
    turns: number,
    options: RunOptions,
  ): Promise<{ messages: LLMMessage[]; usage: TokenUsage }> {
    const estimated = estimateTokens(messages)
    if (estimated <= maxTokens || messages.length < 4) {
      return { messages, usage: ZERO_USAGE }
    }

    const firstUserIndex = messages.findIndex(m => m.role === 'user')
    if (firstUserIndex < 0 || firstUserIndex === messages.length - 1) {
      return { messages, usage: ZERO_USAGE }
    }

    const firstUser = messages[firstUserIndex]!
    const rest = messages.slice(firstUserIndex + 1)
    if (rest.length < 2) {
      return { messages, usage: ZERO_USAGE }
    }

    // Split on an even boundary so we never separate a tool_use assistant turn
    // from its tool_result user message (rest is user/assistant pairs).
    const splitAt = Math.max(2, Math.floor(rest.length / 4) * 2)
    const oldPortion = rest.slice(0, splitAt)
    const recentPortion = rest.slice(splitAt)

    const oldSignature = oldPortion.map(m => this.serializeMessage(m)).join('\n')
    if (this.summarizeCache !== null && this.summarizeCache.oldSignature === oldSignature) {
      const mergedRecent = prependSyntheticPrefixToFirstUser(
        recentPortion,
        `${this.summarizeCache.summaryPrefix}\n\n`,
      )
      return { messages: [firstUser, ...mergedRecent], usage: ZERO_USAGE }
    }

    const summaryPrompt = [
      'Summarize the following conversation history for an LLM.',
      '- Preserve user goals, constraints, and decisions.',
      '- Keep key tool outputs and unresolved questions.',
      '- Use concise bullets.',
      '- Do not fabricate details.',
    ].join('\n')

    const summaryInput: LLMMessage[] = [
      {
        role: 'user',
        content: [
          { type: 'text', text: summaryPrompt },
          { type: 'text', text: `\n\nConversation:\n${oldSignature}` },
        ],
      },
    ]

    const summaryOptions: LLMChatOptions = {
      ...baseChatOptions,
      model: summaryModel ?? this.options.model,
      tools: undefined,
    }

    const summaryStartMs = Date.now()
    const summaryResponse = await this.adapter.chat(summaryInput, summaryOptions)
    if (options.onTrace) {
      const summaryEndMs = Date.now()
      emitTrace(options.onTrace, {
        type: 'llm_call',
        runId: options.runId ?? '',
        taskId: options.taskId,
        agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
        model: summaryOptions.model,
        phase: 'summary',
        turn: turns,
        tokens: summaryResponse.usage,
        startMs: summaryStartMs,
        endMs: summaryEndMs,
        durationMs: summaryEndMs - summaryStartMs,
      })
    }

    const summaryText = extractText(summaryResponse.content).trim()
    const summaryPrefix = summaryText.length > 0
      ? `[Conversation summary]\n${summaryText}`
      : '[Conversation summary unavailable]'

    this.summarizeCache = { oldSignature, summaryPrefix }
    const mergedRecent = prependSyntheticPrefixToFirstUser(
      recentPortion,
      `${summaryPrefix}\n\n`,
    )
    return {
      messages: [firstUser, ...mergedRecent],
      usage: summaryResponse.usage,
    }
  }

  private async applyContextStrategy(
    messages: LLMMessage[],
    strategy: ContextStrategy,
    baseChatOptions: LLMChatOptions,
    turns: number,
    options: RunOptions,
  ): Promise<{ messages: LLMMessage[]; usage: TokenUsage }> {
    if (strategy.type === 'sliding-window') {
      return { messages: this.truncateToSlidingWindow(messages, strategy.maxTurns), usage: ZERO_USAGE }
    }

    if (strategy.type === 'summarize') {
      return this.summarizeMessages(
        messages,
        strategy.maxTokens,
        strategy.summaryModel,
        baseChatOptions,
        turns,
        options,
      )
    }

    if (strategy.type === 'compact') {
      return { messages: this.compactMessages(messages, strategy), usage: ZERO_USAGE }
    }

    const estimated = estimateTokens(messages)
    const compressed = await strategy.compress(messages, estimated)
    if (!Array.isArray(compressed) || compressed.length === 0) {
      throw new Error('contextStrategy.custom.compress must return a non-empty LLMMessage[]')
    }
    return { messages: compressed, usage: ZERO_USAGE }
  }
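The dispatch above implies the shape of the strategy union. The sketch below reconstructs that union from the field accesses in this diff (`maxTurns`, `maxTokens`, `summaryModel`, the compact knobs used further down, and the custom `compress` fallthrough); it is an assumption about the exported `ContextStrategy` type, not a copy of it.

```typescript
// Inferred shape of the strategy union consumed by applyContextStrategy.
// Field names are taken from the dispatch code; this is a sketch, not the
// library's exported type.
type Msg = unknown

type Strategy =
  | { type: 'sliding-window'; maxTurns: number }
  | { type: 'summarize'; maxTokens: number; summaryModel?: string }
  | {
      type: 'compact'
      maxTokens: number
      preserveRecentTurns?: number
      minToolResultChars?: number
      minTextBlockChars?: number
      textBlockExcerptChars?: number
    }
  | { type: 'custom'; compress: (messages: Msg[], estimatedTokens: number) => Promise<Msg[]> }

// Example configs, one per built-in branch of the dispatch.
const strategies: Strategy[] = [
  { type: 'sliding-window', maxTurns: 8 },
  { type: 'summarize', maxTokens: 30_000 },
  { type: 'compact', maxTokens: 30_000, preserveRecentTurns: 4 },
]
```

Discriminating on `type` is what lets each branch of `applyContextStrategy` narrow to its own fields without casts.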

  // -------------------------------------------------------------------------
  // Tool resolution
  // -------------------------------------------------------------------------

  /**
   * Resolve the final set of tools available to this agent based on the
   * three-layer configuration: preset → allowlist → denylist → framework safety.
   *
   * Returns LLMToolDef[] for direct use with LLM adapters.
   */
  private resolveTools(): LLMToolDef[] {
    // Validate configuration for contradictions
    if (this.options.toolPreset && this.options.allowedTools) {
      console.warn(
        'AgentRunner: both toolPreset and allowedTools are set. ' +
          'Final tool access will be the intersection of both.'
      )
    }

    if (this.options.allowedTools && this.options.disallowedTools) {
      const overlap = this.options.allowedTools.filter(tool =>
        this.options.disallowedTools!.includes(tool)
      )
      if (overlap.length > 0) {
        console.warn(
          `AgentRunner: tools [${overlap.map(name => `"${name}"`).join(', ')}] appear in both allowedTools and disallowedTools. ` +
            'This is contradictory and may lead to unexpected behavior.'
        )
      }
    }

    const allTools = this.toolRegistry.toToolDefs()
    const runtimeCustomTools = this.toolRegistry.toRuntimeToolDefs()
    const runtimeCustomToolNames = new Set(runtimeCustomTools.map(t => t.name))
    let filteredTools = allTools.filter(t => !runtimeCustomToolNames.has(t.name))

    // 1. Apply preset filter if set
    if (this.options.toolPreset) {
      const presetTools = new Set(TOOL_PRESETS[this.options.toolPreset] as readonly string[])
      filteredTools = filteredTools.filter(t => presetTools.has(t.name))
    }

    // 2. Apply allowlist filter if set
    if (this.options.allowedTools) {
      filteredTools = filteredTools.filter(t => this.options.allowedTools!.includes(t.name))
    }

    // 3. Apply denylist filter if set
    const denied = this.options.disallowedTools
      ? new Set(this.options.disallowedTools)
      : undefined
    if (denied) {
      filteredTools = filteredTools.filter(t => !denied.has(t.name))
    }

    // 4. Apply framework-level safety rails
    const frameworkDenied = new Set(AGENT_FRAMEWORK_DISALLOWED)
    filteredTools = filteredTools.filter(t => !frameworkDenied.has(t.name))

    // Runtime-added custom tools bypass preset / allowlist but respect denylist.
    const finalRuntime = denied
      ? runtimeCustomTools.filter(t => !denied.has(t.name))
      : runtimeCustomTools
    return [...filteredTools, ...finalRuntime]
  }
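A minimal model of the filter order in `resolveTools` above, operating on plain tool names rather than `LLMToolDef` objects; `PRESETS` mirrors `TOOL_PRESETS` from this diff, while `resolve` is an illustrative helper (the framework-rails step is omitted since `AGENT_FRAMEWORK_DISALLOWED` is empty today, and runtime custom tools are ignored).

```typescript
// Mirrors TOOL_PRESETS from the diff above.
const PRESETS: Record<string, readonly string[]> = {
  readonly: ['file_read', 'grep', 'glob'],
  readwrite: ['file_read', 'file_write', 'file_edit', 'grep', 'glob'],
  full: ['file_read', 'file_write', 'file_edit', 'grep', 'glob', 'bash'],
}

// Illustrative three-layer filter: preset -> allowlist -> denylist.
function resolve(
  registered: readonly string[],
  opts: {
    preset?: keyof typeof PRESETS
    allowed?: readonly string[]
    denied?: readonly string[]
  },
): string[] {
  let tools = [...registered]
  if (opts.preset) {
    const p = new Set(PRESETS[opts.preset])
    tools = tools.filter(t => p.has(t)) // 1. intersect with preset
  }
  if (opts.allowed) {
    tools = tools.filter(t => opts.allowed!.includes(t)) // 2. allowlist
  }
  if (opts.denied) {
    const d = new Set(opts.denied)
    tools = tools.filter(t => !d.has(t)) // 3. denylist always wins last
  }
  return tools
}

// readwrite preset intersected with the registry, then file_write denied.
const tools = resolve(
  ['file_read', 'file_write', 'grep', 'bash'],
  { preset: 'readwrite', denied: ['file_write'] },
)
```

Because the denylist runs last, a denied tool is excluded even when a preset or allowlist would otherwise admit it, which is why the overlap warning above matters.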

  // -------------------------------------------------------------------------
  // Public API
  // -------------------------------------------------------------------------

@@ -204,6 +513,8 @@ export class AgentRunner {
    for await (const event of this.stream(messages, options)) {
      if (event.type === 'done') {
        Object.assign(accumulated, event.data)
      } else if (event.type === 'error') {
        throw event.data
      }
    }

@@ -217,6 +528,7 @@ export class AgentRunner {
   * - `{ type: 'text', data: string }` for each text delta
   * - `{ type: 'tool_use', data: ToolUseBlock }` when the model requests a tool
   * - `{ type: 'tool_result', data: ToolResultBlock }` after each execution
   * - `{ type: 'budget_exceeded', data: TokenBudgetExceededError }` on budget trip
   * - `{ type: 'done', data: RunResult }` at the very end
   * - `{ type: 'error', data: Error }` on unrecoverable failure
   */
@@ -225,21 +537,18 @@ export class AgentRunner {
    options: RunOptions = {},
  ): AsyncGenerator<StreamEvent> {
    // Working copy of the conversation — mutated as turns progress.
-   const conversationMessages: LLMMessage[] = [...initialMessages]
+   let conversationMessages: LLMMessage[] = [...initialMessages]

    // Accumulated state across all turns.
    let totalUsage: TokenUsage = ZERO_USAGE
    const allToolCalls: ToolCallRecord[] = []
    let finalOutput = ''
    let turns = 0
    let budgetExceeded = false

    // Build the stable LLM options once; model / tokens / temp don't change.
-   // toToolDefs() returns LLMToolDef[] (inputSchema, camelCase) — matches
-   // LLMChatOptions.tools from types.ts directly.
-   const allDefs = this.toolRegistry.toToolDefs()
-   const toolDefs = this.options.allowedTools
-     ? allDefs.filter(d => this.options.allowedTools!.includes(d.name))
-     : allDefs
+   // resolveTools() returns LLMToolDef[] with three-layer filtering applied.
+   const toolDefs = this.resolveTools()

    // Per-call abortSignal takes precedence over the static one.
    const effectiveAbortSignal = options.abortSignal ?? this.options.abortSignal
@@ -278,6 +587,25 @@ export class AgentRunner {

      turns++

      // Compress consumed tool results before context strategy (lightweight,
      // no LLM calls) so the strategy operates on already-reduced messages.
      if (this.options.compressToolResults && turns > 1) {
        conversationMessages = this.compressConsumedToolResults(conversationMessages)
      }

      // Optionally compact context before each LLM call after the first turn.
      if (this.options.contextStrategy && turns > 1) {
        const compacted = await this.applyContextStrategy(
          conversationMessages,
          this.options.contextStrategy,
          baseChatOptions,
          turns,
          options,
        )
        conversationMessages = compacted.messages
        totalUsage = addTokenUsage(totalUsage, compacted.usage)
      }

      // ------------------------------------------------------------------
      // Step 1: Call the LLM and collect the full response for this turn.
      // ------------------------------------------------------------------
@@ -291,6 +619,7 @@ export class AgentRunner {
        taskId: options.taskId,
        agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
        model: this.options.model,
        phase: 'turn',
        turn: turns,
        tokens: response.usage,
        startMs: llmStartMs,
@@ -318,6 +647,21 @@ export class AgentRunner {
        yield { type: 'text', data: turnText } satisfies StreamEvent
      }

      const totalTokens = totalUsage.input_tokens + totalUsage.output_tokens
      if (this.options.maxTokenBudget !== undefined && totalTokens > this.options.maxTokenBudget) {
        budgetExceeded = true
        finalOutput = turnText
        yield {
          type: 'budget_exceeded',
          data: new TokenBudgetExceededError(
            this.options.agentName ?? 'unknown',
            totalTokens,
            this.options.maxTokenBudget,
          ),
        } satisfies StreamEvent
        break
      }

      // Extract tool-use blocks for detection and execution.
      const toolUseBlocks = extractToolUseBlocks(response.content)

@@ -395,11 +739,12 @@ export class AgentRunner {
      // Parallel execution is critical for multi-tool responses where the
      // tools are independent (e.g. reading several files at once).
      // ------------------------------------------------------------------
-     const toolContext: ToolUseContext = this.buildToolContext()
+     const toolContext: ToolUseContext = this.buildToolContext(options)

      const executionPromises = toolUseBlocks.map(async (block): Promise<{
        resultBlock: ToolResultBlock
        record: ToolCallRecord
+       delegationUsage?: TokenUsage
      }> => {
        options.onToolCall?.(block.name, block.input)

@@ -451,12 +796,30 @@ export class AgentRunner {
          is_error: result.isError,
        }

-       return { resultBlock, record }
+       return {
+         resultBlock,
+         record,
+         ...(result.metadata?.tokenUsage !== undefined
+           ? { delegationUsage: result.metadata.tokenUsage }
+           : {}),
+       }
      })

      // Wait for every tool in this turn to finish.
      const executions = await Promise.all(executionPromises)

+     // Roll up any nested-run token usage surfaced via ToolResult.metadata
+     // (e.g. from delegate_to_agent) so it counts against this agent's budget.
+     let delegationTurnUsage: TokenUsage | undefined
+     for (const ex of executions) {
+       if (ex.delegationUsage !== undefined) {
+         totalUsage = addTokenUsage(totalUsage, ex.delegationUsage)
+         delegationTurnUsage = delegationTurnUsage === undefined
+           ? ex.delegationUsage
+           : addTokenUsage(delegationTurnUsage, ex.delegationUsage)
+       }
+     }

      // ------------------------------------------------------------------
      // Step 5: Accumulate results and build the user message that carries
      // them back to the LLM in the next turn.
@@ -490,6 +853,27 @@ export class AgentRunner {
      conversationMessages.push(toolResultMessage)
      options.onMessage?.(toolResultMessage)

      // Budget check is deferred until tool_result events have been yielded
      // and the tool_result user message has been appended, so stream
      // consumers see matched tool_use/tool_result pairs and the returned
      // `messages` remain resumable against the Anthropic/OpenAI APIs.
      if (delegationTurnUsage !== undefined && this.options.maxTokenBudget !== undefined) {
        const totalAfterDelegation = totalUsage.input_tokens + totalUsage.output_tokens
        if (totalAfterDelegation > this.options.maxTokenBudget) {
          budgetExceeded = true
          finalOutput = turnText
          yield {
            type: 'budget_exceeded',
            data: new TokenBudgetExceededError(
              this.options.agentName ?? 'unknown',
              totalAfterDelegation,
              this.options.maxTokenBudget,
            ),
          } satisfies StreamEvent
          break
        }
      }

      // Loop back to Step 1 — send updated conversation to the LLM.
    }
  } catch (err) {
@@ -516,6 +900,7 @@ export class AgentRunner {
      tokenUsage: totalUsage,
      turns,
      ...(loopDetected ? { loopDetected: true } : {}),
      ...(budgetExceeded ? { budgetExceeded: true } : {}),
    }

    yield { type: 'done', data: runResult } satisfies StreamEvent
@@ -525,18 +910,233 @@ export class AgentRunner {
  // Private helpers
  // -------------------------------------------------------------------------

  /**
   * Rule-based selective context compaction (no LLM calls).
   *
   * Compresses old turns while preserving the conversation skeleton:
   * - tool_use blocks (decisions) are always kept
   * - Long tool_result content is replaced with a compact marker
   * - Long assistant text blocks are truncated with an excerpt
   * - Error tool_results are never compressed
   * - Recent turns (within `preserveRecentTurns`) are kept intact
   */
  private compactMessages(
    messages: LLMMessage[],
    strategy: Extract<ContextStrategy, { type: 'compact' }>,
  ): LLMMessage[] {
    const estimated = estimateTokens(messages)
    if (estimated <= strategy.maxTokens) {
      return messages
    }

    const preserveRecent = strategy.preserveRecentTurns ?? 4
    const minToolResultChars = strategy.minToolResultChars ?? 200
    const minTextBlockChars = strategy.minTextBlockChars ?? 2000
    const textBlockExcerptChars = strategy.textBlockExcerptChars ?? 200

    // Find the first user message — it is always preserved as-is.
    const firstUserIndex = messages.findIndex(m => m.role === 'user')
    if (firstUserIndex < 0 || firstUserIndex === messages.length - 1) {
      return messages
    }

    // Walk backward to find the boundary between old and recent turns.
    // A "turn pair" is an assistant message followed by a user message.
    let boundary = messages.length
    let pairsFound = 0
    for (let i = messages.length - 1; i > firstUserIndex && pairsFound < preserveRecent; i--) {
      if (messages[i]!.role === 'user' && i > 0 && messages[i - 1]!.role === 'assistant') {
        pairsFound++
        boundary = i - 1
      }
    }

    // If all turns fit within the recent window, nothing to compact.
    if (boundary <= firstUserIndex + 1) {
      return messages
    }

    // Build a tool_use_id → tool name lookup from old assistant messages.
    const toolNameMap = new Map<string, string>()
    for (let i = firstUserIndex + 1; i < boundary; i++) {
      const msg = messages[i]!
      if (msg.role !== 'assistant') continue
      for (const block of msg.content) {
        if (block.type === 'tool_use') {
          toolNameMap.set(block.id, block.name)
        }
      }
    }

    // Process old messages (between first user and boundary).
    let anyChanged = false
    const result: LLMMessage[] = []

    for (let i = 0; i < messages.length; i++) {
      // First user message and recent messages: keep intact.
      if (i <= firstUserIndex || i >= boundary) {
        result.push(messages[i]!)
        continue
      }

      const msg = messages[i]!
      let msgChanged = false
      const newContent = msg.content.map((block): ContentBlock => {
        if (msg.role === 'assistant') {
          // tool_use blocks: always preserve (decisions).
          if (block.type === 'tool_use') return block
          // Long text blocks: truncate with excerpt.
          if (block.type === 'text' && block.text.length >= minTextBlockChars) {
            msgChanged = true
            return {
              type: 'text',
              text: `${block.text.slice(0, textBlockExcerptChars)}... [truncated — ${block.text.length} chars total]`,
            } satisfies TextBlock
          }
          // Image blocks in old turns: replace with marker.
          if (block.type === 'image') {
            msgChanged = true
            return { type: 'text', text: '[Image compacted]' } satisfies TextBlock
          }
          return block
        }

        // User messages in old zone.
        if (block.type === 'tool_result') {
          // Error results: always preserve.
          if (block.is_error) return block
          // Already compressed by compressToolResults or a prior compact pass.
          if (
            block.content.startsWith('[Tool output compressed') ||
            block.content.startsWith('[Tool result:')
          ) {
            return block
          }
          // Short results: preserve.
          if (block.content.length < minToolResultChars) return block
          const toolName = toolNameMap.get(block.tool_use_id) ?? 'unknown'
          // Delegation results: preserve — parent agent may still reason over them.
          if (toolName === 'delegate_to_agent') return block
          // Compress.
          msgChanged = true
          return {
            type: 'tool_result',
            tool_use_id: block.tool_use_id,
            content: `[Tool result: ${toolName} — ${block.content.length} chars, compacted]`,
          } satisfies ToolResultBlock
        }
        return block
      })

      if (msgChanged) {
        anyChanged = true
        result.push({ role: msg.role, content: newContent } as LLMMessage)
      } else {
        result.push(msg)
      }
    }

    return anyChanged ? result : messages
  }

  /**
   * Replace consumed tool results with compact markers.
   *
   * A tool_result is "consumed" when the assistant has produced a response
   * after seeing it (i.e. there is an assistant message following the user
   * message that contains the tool_result). The most recent user message
   * with tool results is always kept intact — the LLM is about to see it.
   *
   * Error results and results shorter than `minChars` are never compressed.
   */
  private compressConsumedToolResults(messages: LLMMessage[]): LLMMessage[] {
    const config = this.options.compressToolResults
    if (!config) return messages

    const minChars = typeof config === 'object'
      ? (config.minChars ?? DEFAULT_MIN_COMPRESS_CHARS)
      : DEFAULT_MIN_COMPRESS_CHARS

    // Find the last user message that carries tool_result blocks.
    let lastToolResultUserIdx = -1
    for (let i = messages.length - 1; i >= 0; i--) {
      if (
        messages[i]!.role === 'user' &&
        messages[i]!.content.some(b => b.type === 'tool_result')
      ) {
        lastToolResultUserIdx = i
        break
      }
    }

    // Nothing to compress if there's at most one tool-result user message.
    if (lastToolResultUserIdx <= 0) return messages

    // Build a tool_use_id → tool name map so we can exempt delegation results,
    // whose full output the parent agent may need to re-read in later turns.
    const toolNameMap = new Map<string, string>()
    for (const msg of messages) {
      if (msg.role !== 'assistant') continue
      for (const block of msg.content) {
        if (block.type === 'tool_use') toolNameMap.set(block.id, block.name)
      }
    }

    let anyChanged = false
    const result = messages.map((msg, idx) => {
      // Only compress user messages that appear before the last one.
      if (msg.role !== 'user' || idx >= lastToolResultUserIdx) return msg

      const hasToolResult = msg.content.some(b => b.type === 'tool_result')
      if (!hasToolResult) return msg

      let msgChanged = false
      const newContent = msg.content.map((block): ContentBlock => {
        if (block.type !== 'tool_result') return block

        // Never compress error results — they carry diagnostic value.
        if (block.is_error) return block

        // Never compress delegation results — the parent agent relies on the full sub-agent output.
        if (toolNameMap.get(block.tool_use_id) === 'delegate_to_agent') return block

        // Skip already-compressed results — avoid re-compression with wrong char count.
        if (block.content.startsWith('[Tool output compressed')) return block

        // Skip short results — the marker itself has overhead.
        if (block.content.length < minChars) return block

        msgChanged = true
        return {
          type: 'tool_result',
          tool_use_id: block.tool_use_id,
          content: `[Tool output compressed — ${block.content.length} chars, already processed]`,
        } satisfies ToolResultBlock
      })

      if (msgChanged) {
        anyChanged = true
        return { role: msg.role, content: newContent } as LLMMessage
      }
      return msg
    })

    return anyChanged ? result : messages
  }
|
||||
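Outside the class, the retention rules above can be sketched standalone. `compressConsumed`, `Result`, and `MIN` below are simplified stand-ins for illustration only (not the library's `LLMMessage`/`ToolResultBlock` types), keeping just the last-result, error, already-compressed, and short-result exemptions:

```typescript
// Minimal sketch of the compression rule, with hypothetical stand-in types.
const MIN = 200

interface Result { isError: boolean; content: string }

function compressConsumed(results: Result[]): Result[] {
  // Everything except the last result counts as "consumed"; the last one is
  // kept intact because the model is about to read it.
  return results.map((r, i) => {
    if (i === results.length - 1) return r                        // still live
    if (r.isError) return r                                       // keep diagnostics
    if (r.content.startsWith('[Tool output compressed')) return r // idempotent
    if (r.content.length < MIN) return r                          // marker overhead not worth it
    return { isError: false, content: `[Tool output compressed — ${r.content.length} chars, already processed]` }
  })
}
```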
|
||||
  /**
   * Build the {@link ToolUseContext} passed to every tool execution.
   * Identifies this runner as the invoking agent.
   */
  private buildToolContext(options: RunOptions = {}): ToolUseContext {
    return {
      agent: {
        name: this.options.agentName ?? 'runner',
        role: this.options.agentRole ?? 'assistant',
        model: this.options.model,
      },
      abortSignal: options.abortSignal ?? this.options.abortSignal,
      ...(options.team !== undefined ? { team: options.team } : {}),
    }
  }
}
@@ -0,0 +1,470 @@
#!/usr/bin/env node
/**
 * Thin shell/CI wrapper over OpenMultiAgent — no interactive session, cwd binding,
 * approvals, or persistence.
 *
 * Exit codes:
 *   0 — finished; team run succeeded
 *   1 — finished; team run reported failure (agents/tasks)
 *   2 — invalid usage, I/O, or JSON validation
 *   3 — unexpected runtime error (including LLM errors)
 */

import { mkdir, writeFile } from 'node:fs/promises'
import { readFileSync } from 'node:fs'
import { join, resolve } from 'node:path'
import { fileURLToPath } from 'node:url'

import { OpenMultiAgent } from '../orchestrator/orchestrator.js'
import { renderTeamRunDashboard } from '../dashboard/render-team-run-dashboard.js'
import type { SupportedProvider } from '../llm/adapter.js'
import type { AgentRunResult, CoordinatorConfig, OrchestratorConfig, TeamConfig, TeamRunResult } from '../types.js'

// ---------------------------------------------------------------------------
// Exit codes
// ---------------------------------------------------------------------------

export const EXIT = {
  SUCCESS: 0,
  RUN_FAILED: 1,
  USAGE: 2,
  INTERNAL: 3,
} as const

class OmaValidationError extends Error {
  override readonly name = 'OmaValidationError'
  constructor(message: string) {
    super(message)
  }
}

// ---------------------------------------------------------------------------
// Provider helper (static reference data)
// ---------------------------------------------------------------------------

const PROVIDER_REFERENCE: ReadonlyArray<{
  id: SupportedProvider
  apiKeyEnv: readonly string[]
  baseUrlSupported: boolean
  notes?: string
}> = [
  { id: 'anthropic', apiKeyEnv: ['ANTHROPIC_API_KEY'], baseUrlSupported: true },
  { id: 'azure-openai', apiKeyEnv: ['AZURE_OPENAI_API_KEY', 'AZURE_OPENAI_ENDPOINT', 'AZURE_OPENAI_DEPLOYMENT'], baseUrlSupported: true, notes: 'Azure OpenAI requires endpoint URL (e.g., https://my-resource.openai.azure.com) and API key. Optional: AZURE_OPENAI_API_VERSION (defaults to 2024-10-21). Prefer setting deployment on agent.model; AZURE_OPENAI_DEPLOYMENT is a fallback when model is blank.' },
  { id: 'openai', apiKeyEnv: ['OPENAI_API_KEY'], baseUrlSupported: true, notes: 'Set baseURL for Ollama / vLLM / LM Studio; apiKey may be a placeholder.' },
  { id: 'gemini', apiKeyEnv: ['GEMINI_API_KEY', 'GOOGLE_API_KEY'], baseUrlSupported: false },
  { id: 'grok', apiKeyEnv: ['XAI_API_KEY'], baseUrlSupported: true },
  { id: 'minimax', apiKeyEnv: ['MINIMAX_API_KEY'], baseUrlSupported: true, notes: 'Global endpoint: https://api.minimax.io/v1 (default). China endpoint: https://api.minimaxi.com/v1. Set MINIMAX_BASE_URL to choose, or pass baseURL in agent config.' },
  { id: 'deepseek', apiKeyEnv: ['DEEPSEEK_API_KEY'], baseUrlSupported: true, notes: 'OpenAI-compatible endpoint at https://api.deepseek.com/v1. Models: deepseek-chat (V3), deepseek-reasoner (thinking).' },
  {
    id: 'copilot',
    apiKeyEnv: ['GITHUB_COPILOT_TOKEN', 'GITHUB_TOKEN'],
    baseUrlSupported: false,
    notes: 'If no token env is set, Copilot adapter may start an interactive OAuth device flow (avoid in CI).',
  },
]

// ---------------------------------------------------------------------------
// argv / JSON helpers
// ---------------------------------------------------------------------------

export function parseArgs(argv: string[]): {
  _: string[]
  flags: Set<string>
  kv: Map<string, string>
} {
  const _ = argv.slice(2)
  const flags = new Set<string>()
  const kv = new Map<string, string>()
  let i = 0
  while (i < _.length) {
    const a = _[i]!
    if (a === '--') {
      break
    }
    if (a.startsWith('--')) {
      const eq = a.indexOf('=')
      if (eq !== -1) {
        kv.set(a.slice(2, eq), a.slice(eq + 1))
        i++
        continue
      }
      const key = a.slice(2)
      const next = _[i + 1]
      if (next !== undefined && !next.startsWith('--')) {
        kv.set(key, next)
        i += 2
      } else {
        flags.add(key)
        i++
      }
      continue
    }
    i++
  }
  return { _, flags, kv }
}
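The grammar parseArgs accepts (`--key=value`, `--key value`, bare `--flag`, and a `--` terminator) can be exercised with a condensed re-implementation. `parseCliArgs` below is a hypothetical stand-in that operates on already-shifted args rather than the raw `process.argv`, not an import of the export above:

```typescript
// Condensed stand-in for parseArgs: same flag grammar, no argv.slice(2) shift.
function parseCliArgs(args: string[]): { flags: Set<string>; kv: Map<string, string> } {
  const flags = new Set<string>()
  const kv = new Map<string, string>()
  for (let i = 0; i < args.length; i++) {
    const a = args[i]!
    if (a === '--') break              // stop at the terminator
    if (!a.startsWith('--')) continue  // positionals are ignored in this sketch
    const eq = a.indexOf('=')
    if (eq !== -1) { kv.set(a.slice(2, eq), a.slice(eq + 1)); continue }
    const next = args[i + 1]
    // "--key value" consumes the next token unless it looks like another option.
    if (next !== undefined && !next.startsWith('--')) { kv.set(a.slice(2), next); i++ }
    else flags.add(a.slice(2))
  }
  return { flags, kv }
}
```

Note the same caveat as the real parser: a value that itself starts with `--` (e.g. a negative-looking option) is treated as a new option, turning the previous key into a bare flag.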
function getOpt(kv: Map<string, string>, flags: Set<string>, key: string): string | undefined {
  if (flags.has(key)) return ''
  return kv.get(key)
}

function readJson(path: string): unknown {
  const abs = resolve(path)
  const raw = readFileSync(abs, 'utf8')
  try {
    return JSON.parse(raw) as unknown
  } catch (e) {
    if (e instanceof SyntaxError) {
      throw new Error(`Invalid JSON in ${abs}: ${e.message}`)
    }
    throw e
  }
}

function isObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
}

function asTeamConfig(v: unknown, label: string): TeamConfig {
  if (!isObject(v)) throw new OmaValidationError(`${label}: expected a JSON object`)
  const name = v['name']
  const agents = v['agents']
  if (typeof name !== 'string' || !name) throw new OmaValidationError(`${label}.name: non-empty string required`)
  if (!Array.isArray(agents) || agents.length === 0) {
    throw new OmaValidationError(`${label}.agents: non-empty array required`)
  }
  for (const a of agents) {
    if (!isObject(a)) throw new OmaValidationError(`${label}.agents[]: each agent must be an object`)
    if (typeof a['name'] !== 'string' || !a['name']) throw new OmaValidationError(`agent.name required`)
    if (typeof a['model'] !== 'string' || !a['model']) {
      throw new OmaValidationError(`agent.model required for "${String(a['name'])}"`)
    }
  }
  return v as unknown as TeamConfig
}

function asOrchestratorPartial(v: unknown, label: string): OrchestratorConfig {
  if (!isObject(v)) throw new OmaValidationError(`${label}: expected a JSON object`)
  return v as OrchestratorConfig
}

function asCoordinatorPartial(v: unknown, label: string): CoordinatorConfig {
  if (!isObject(v)) throw new OmaValidationError(`${label}: expected a JSON object`)
  return v as CoordinatorConfig
}

function asTaskSpecs(v: unknown, label: string): ReadonlyArray<{
  title: string
  description: string
  assignee?: string
  dependsOn?: string[]
  memoryScope?: 'dependencies' | 'all'
  maxRetries?: number
  retryDelayMs?: number
  retryBackoff?: number
}> {
  if (!Array.isArray(v)) throw new OmaValidationError(`${label}: expected a JSON array`)
  const out: Array<{
    title: string
    description: string
    assignee?: string
    dependsOn?: string[]
    memoryScope?: 'dependencies' | 'all'
    maxRetries?: number
    retryDelayMs?: number
    retryBackoff?: number
  }> = []
  let i = 0
  for (const item of v) {
    if (!isObject(item)) throw new OmaValidationError(`${label}[${i}]: object expected`)
    if (typeof item['title'] !== 'string' || typeof item['description'] !== 'string') {
      throw new OmaValidationError(`${label}[${i}]: title and description strings required`)
    }
    const row: (typeof out)[0] = {
      title: item['title'],
      description: item['description'],
    }
    if (typeof item['assignee'] === 'string') row.assignee = item['assignee']
    if (Array.isArray(item['dependsOn'])) {
      row.dependsOn = item['dependsOn'].filter((x): x is string => typeof x === 'string')
    }
    if (item['memoryScope'] === 'all' || item['memoryScope'] === 'dependencies') {
      row.memoryScope = item['memoryScope']
    }
    if (typeof item['maxRetries'] === 'number') row.maxRetries = item['maxRetries']
    if (typeof item['retryDelayMs'] === 'number') row.retryDelayMs = item['retryDelayMs']
    if (typeof item['retryBackoff'] === 'number') row.retryBackoff = item['retryBackoff']
    out.push(row)
    i++
  }
  return out
}

export interface CliJsonOptions {
  readonly pretty: boolean
  readonly includeMessages: boolean
}

export function serializeAgentResult(r: AgentRunResult, includeMessages: boolean): Record<string, unknown> {
  const base: Record<string, unknown> = {
    success: r.success,
    output: r.output,
    tokenUsage: r.tokenUsage,
    toolCalls: r.toolCalls,
    structured: r.structured,
    loopDetected: r.loopDetected,
    budgetExceeded: r.budgetExceeded,
  }
  if (includeMessages) base['messages'] = r.messages
  return base
}

export function serializeTeamRunResult(result: TeamRunResult, opts: CliJsonOptions): Record<string, unknown> {
  const agentResults: Record<string, unknown> = {}
  for (const [k, v] of result.agentResults) {
    agentResults[k] = serializeAgentResult(v, opts.includeMessages)
  }
  return {
    success: result.success,
    goal: result.goal,
    tasks: result.tasks,
    totalTokenUsage: result.totalTokenUsage,
    agentResults,
  }
}

function printJson(data: unknown, pretty: boolean): void {
  const s = pretty ? JSON.stringify(data, null, 2) : JSON.stringify(data)
  process.stdout.write(`${s}\n`)
}

function help(): string {
  return [
    'open-multi-agent CLI (oma)',
    '',
    'Usage:',
    '  oma run --goal <text> --team <team.json> [--orchestrator <orch.json>] [--coordinator <coord.json>]',
    '  oma task --file <tasks.json> [--team <team.json>]',
    '  oma provider [list | template <provider>]',
    '',
    'Flags:',
    '  --pretty             Pretty-print JSON to stdout',
    '  --include-messages   Include full LLM message arrays in run output (large)',
    '  --dashboard          Write team-run DAG HTML dashboard to oma-dashboards/',
    '',
    'team.json may be a TeamConfig object, or { "team": TeamConfig, "orchestrator": { ... } }.',
    'tasks.json: { "team": TeamConfig, "tasks": [ ... ], "orchestrator"?: { ... } }.',
    '  Optional --team overrides the embedded team object.',
    '',
    'Exit codes: 0 success, 1 run failed, 2 usage/validation, 3 internal',
  ].join('\n')
}

const DEFAULT_MODEL_HINT: Record<SupportedProvider, string> = {
  anthropic: 'claude-opus-4-6',
  'azure-openai': 'gpt-4',
  openai: 'gpt-4o',
  gemini: 'gemini-2.0-flash',
  grok: 'grok-2-latest',
  copilot: 'gpt-4o',
  minimax: 'MiniMax-M2.7',
  deepseek: 'deepseek-chat',
}

async function cmdProvider(sub: string | undefined, arg: string | undefined, pretty: boolean): Promise<number> {
  if (sub === undefined || sub === 'list') {
    printJson({ providers: PROVIDER_REFERENCE }, pretty)
    return EXIT.SUCCESS
  }
  if (sub === 'template') {
    const id = arg as SupportedProvider | undefined
    const row = PROVIDER_REFERENCE.find((p) => p.id === id)
    if (!id || !row) {
      printJson(
        {
          error: {
            kind: 'usage',
            message: `usage: oma provider template <${PROVIDER_REFERENCE.map((p) => p.id).join('|')}>`,
          },
        },
        pretty,
      )
      return EXIT.USAGE
    }
    printJson(
      {
        orchestrator: {
          defaultProvider: id,
          defaultModel: DEFAULT_MODEL_HINT[id],
        },
        agent: {
          name: 'worker',
          model: DEFAULT_MODEL_HINT[id],
          provider: id,
          systemPrompt: 'You are a helpful assistant.',
        },
        env: Object.fromEntries(row.apiKeyEnv.map((k) => [k, `<set ${k} in environment>`])),
        notes: row.notes,
      },
      pretty,
    )
    return EXIT.SUCCESS
  }
  printJson({ error: { kind: 'usage', message: `unknown provider subcommand: ${sub}` } }, pretty)
  return EXIT.USAGE
}

function mergeOrchestrator(base: OrchestratorConfig, ...partials: OrchestratorConfig[]): OrchestratorConfig {
  let o: OrchestratorConfig = { ...base }
  for (const p of partials) {
    o = { ...o, ...p }
  }
  return o
}
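Note that this merge is shallow and last-write-wins per top-level key: a nested object from an earlier partial is replaced wholesale, not deep-merged. A quick sketch, with plain records standing in for `OrchestratorConfig` and hypothetical key names:

```typescript
// Last-write-wins shallow merge, mirroring mergeOrchestrator's spread chain.
function mergeShallow(
  base: Record<string, unknown>,
  ...partials: Record<string, unknown>[]
): Record<string, unknown> {
  let out = { ...base }
  for (const p of partials) out = { ...out, ...p } // later partials override earlier ones
  return out
}
```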
async function writeRunTeamDashboardFile(html: string): Promise<string> {
  const directory = join(process.cwd(), 'oma-dashboards')
  await mkdir(directory, { recursive: true })
  const stamp = new Date().toISOString().replaceAll(':', '-').replace('.', '-')
  const filePath = join(directory, `runTeam-${stamp}.html`)
  await writeFile(filePath, html, 'utf8')
  return filePath
}

async function main(): Promise<number> {
  const argv = parseArgs(process.argv)
  const cmd = argv._[0]
  const pretty = argv.flags.has('pretty')
  const includeMessages = argv.flags.has('include-messages')
  const dashboard = argv.flags.has('dashboard')

  if (cmd === undefined || cmd === 'help' || cmd === '-h' || cmd === '--help') {
    process.stdout.write(`${help()}\n`)
    return EXIT.SUCCESS
  }

  if (cmd === 'provider') {
    return cmdProvider(argv._[1], argv._[2], pretty)
  }

  const jsonOpts: CliJsonOptions = { pretty, includeMessages }

  try {
    if (cmd === 'run') {
      const goal = getOpt(argv.kv, argv.flags, 'goal')
      const teamPath = getOpt(argv.kv, argv.flags, 'team')
      const orchPath = getOpt(argv.kv, argv.flags, 'orchestrator')
      const coordPath = getOpt(argv.kv, argv.flags, 'coordinator')
      if (!goal || !teamPath) {
        printJson({ error: { kind: 'usage', message: '--goal and --team are required' } }, pretty)
        return EXIT.USAGE
      }

      const teamRaw = readJson(teamPath)
      let teamCfg: TeamConfig
      const orchParts: OrchestratorConfig[] = []
      if (isObject(teamRaw) && teamRaw['team'] !== undefined) {
        teamCfg = asTeamConfig(teamRaw['team'], 'team')
        if (teamRaw['orchestrator'] !== undefined) {
          orchParts.push(asOrchestratorPartial(teamRaw['orchestrator'], 'orchestrator'))
        }
      } else {
        teamCfg = asTeamConfig(teamRaw, 'team')
      }
      if (orchPath) {
        orchParts.push(asOrchestratorPartial(readJson(orchPath), 'orchestrator file'))
      }

      const orchestrator = new OpenMultiAgent(mergeOrchestrator({}, ...orchParts))
      const team = orchestrator.createTeam(teamCfg.name, teamCfg)
      let coordinator: CoordinatorConfig | undefined
      if (coordPath) {
        coordinator = asCoordinatorPartial(readJson(coordPath), 'coordinator file')
      }
      const result = await orchestrator.runTeam(team, goal, coordinator ? { coordinator } : undefined)
      if (dashboard) {
        const html = renderTeamRunDashboard(result)
        try {
          await writeRunTeamDashboardFile(html)
        } catch (err) {
          process.stderr.write(
            `oma: failed to write runTeam dashboard: ${err instanceof Error ? err.message : String(err)}\n`,
          )
        }
      }
      await orchestrator.shutdown()
      const payload = { command: 'run' as const, ...serializeTeamRunResult(result, jsonOpts) }
      printJson(payload, pretty)
      return result.success ? EXIT.SUCCESS : EXIT.RUN_FAILED
    }

    if (cmd === 'task') {
      const file = getOpt(argv.kv, argv.flags, 'file')
      const teamOverride = getOpt(argv.kv, argv.flags, 'team')
      if (!file) {
        printJson({ error: { kind: 'usage', message: '--file is required' } }, pretty)
        return EXIT.USAGE
      }
      const doc = readJson(file)
      if (!isObject(doc)) {
        throw new OmaValidationError('tasks file root must be an object')
      }
      const orchParts: OrchestratorConfig[] = []
      if (doc['orchestrator'] !== undefined) {
        orchParts.push(asOrchestratorPartial(doc['orchestrator'], 'orchestrator'))
      }
      const teamCfg = teamOverride
        ? asTeamConfig(readJson(teamOverride), 'team (--team)')
        : asTeamConfig(doc['team'], 'team')

      const tasks = asTaskSpecs(doc['tasks'], 'tasks')
      if (tasks.length === 0) {
        throw new OmaValidationError('tasks array must not be empty')
      }

      const orchestrator = new OpenMultiAgent(mergeOrchestrator({}, ...orchParts))
      const team = orchestrator.createTeam(teamCfg.name, teamCfg)
      const result = await orchestrator.runTasks(team, tasks)
      await orchestrator.shutdown()
      const payload = { command: 'task' as const, ...serializeTeamRunResult(result, jsonOpts) }
      printJson(payload, pretty)
      return result.success ? EXIT.SUCCESS : EXIT.RUN_FAILED
    }

    printJson({ error: { kind: 'usage', message: `unknown command: ${cmd}` } }, pretty)
    return EXIT.USAGE
  } catch (e) {
    const message = e instanceof Error ? e.message : String(e)
    const { kind, exit } = classifyCliError(e, message)
    printJson({ error: { kind, message } }, pretty)
    return exit
  }
}

function classifyCliError(e: unknown, message: string): { kind: string; exit: number } {
  if (e instanceof OmaValidationError) return { kind: 'validation', exit: EXIT.USAGE }
  if (message.includes('Invalid JSON')) return { kind: 'validation', exit: EXIT.USAGE }
  if (message.includes('ENOENT') || message.includes('EACCES')) return { kind: 'io', exit: EXIT.USAGE }
  return { kind: 'runtime', exit: EXIT.INTERNAL }
}
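The classification is an `instanceof` check plus message sniffing, so a wrapped error that loses its original message also loses its category and falls through to `runtime`. A standalone sketch of the same heuristics, where the local `ValidationErr` class and literal exit codes stand in for `OmaValidationError` and `EXIT`:

```typescript
// Mirrors classifyCliError's heuristics with stand-in names.
class ValidationErr extends Error {}

function classify(e: unknown): { kind: string; exit: number } {
  const message = e instanceof Error ? e.message : String(e)
  if (e instanceof ValidationErr) return { kind: 'validation', exit: 2 }
  if (message.includes('Invalid JSON')) return { kind: 'validation', exit: 2 }       // readJson wrapper
  if (message.includes('ENOENT') || message.includes('EACCES')) return { kind: 'io', exit: 2 }
  return { kind: 'runtime', exit: 3 }                                                // anything else
}
```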
const isMain = (() => {
  const argv1 = process.argv[1]
  if (!argv1) return false
  try {
    return fileURLToPath(import.meta.url) === resolve(argv1)
  } catch {
    return false
  }
})()

if (isMain) {
  main()
    .then((code) => process.exit(code))
    .catch((e) => {
      const message = e instanceof Error ? e.message : String(e)
      process.stdout.write(`${JSON.stringify({ error: { kind: 'internal', message } })}\n`)
      process.exit(EXIT.INTERNAL)
    })
}
@@ -0,0 +1,98 @@
/**
 * Pure DAG layout for the team-run dashboard (mirrors the browser algorithm).
 */

export interface LayoutTaskInput {
  readonly id: string
  readonly dependsOn?: readonly string[]
}

export interface LayoutTasksResult {
  readonly positions: ReadonlyMap<string, { readonly x: number; readonly y: number }>
  readonly width: number
  readonly height: number
  readonly nodeW: number
  readonly nodeH: number
}

/**
 * Assigns each task to a column by longest path from roots (topological level),
 * then stacks rows within each column. Used by the dashboard canvas sizing.
 */
export function layoutTasks<T extends LayoutTaskInput>(taskList: readonly T[]): LayoutTasksResult {
  const byId = new Map(taskList.map((task) => [task.id, task]))
  const children = new Map<string, string[]>(taskList.map((task) => [task.id, []]))
  const indegree = new Map<string, number>()

  for (const task of taskList) {
    const deps = (task.dependsOn ?? []).filter((dep) => byId.has(dep))
    indegree.set(task.id, deps.length)
    for (const depId of deps) {
      children.get(depId)!.push(task.id)
    }
  }

  const levels = new Map<string, number>()
  const queue: string[] = []
  let processed = 0
  for (const task of taskList) {
    if ((indegree.get(task.id) ?? 0) === 0) {
      levels.set(task.id, 0)
      queue.push(task.id)
    }
  }

  while (queue.length > 0) {
    const currentId = queue.shift()!
    processed += 1
    const baseLevel = levels.get(currentId) ?? 0
    for (const childId of children.get(currentId) ?? []) {
      const nextLevel = Math.max(levels.get(childId) ?? 0, baseLevel + 1)
      levels.set(childId, nextLevel)
      indegree.set(childId, (indegree.get(childId) ?? 1) - 1)
      if ((indegree.get(childId) ?? 0) === 0) {
        queue.push(childId)
      }
    }
  }

  if (processed !== taskList.length) {
    throw new Error('Task dependency graph contains a cycle')
  }

  for (const task of taskList) {
    if (!levels.has(task.id)) levels.set(task.id, 0)
  }

  const cols = new Map<number, T[]>()
  for (const task of taskList) {
    const level = levels.get(task.id) ?? 0
    if (!cols.has(level)) cols.set(level, [])
    cols.get(level)!.push(task)
  }

  const sortedLevels = Array.from(cols.keys()).sort((a, b) => a - b)
  const nodeW = 256
  const nodeH = 142
  const colGap = 96
  const rowGap = 72
  const padX = 120
  const padY = 100
  const positions = new Map<string, { x: number; y: number }>()
  let maxRows = 1
  for (const level of sortedLevels) maxRows = Math.max(maxRows, cols.get(level)!.length)

  for (const level of sortedLevels) {
    const colTasks = cols.get(level)!
    colTasks.forEach((task, idx) => {
      positions.set(task.id, {
        x: padX + level * (nodeW + colGap),
        y: padY + idx * (nodeH + rowGap),
      })
    })
  }

  const width = Math.max(1600, padX * 2 + sortedLevels.length * (nodeW + colGap))
  const height = Math.max(700, padY * 2 + maxRows * (nodeH + rowGap))
  return { positions, width, height, nodeW, nodeH }
}
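The column rule is longest path from the roots: a task's level is one more than the maximum level of its known dependencies. A recursive sketch that computes only the levels (no pixel geometry), assuming acyclic input; the real function instead uses Kahn's algorithm and throws on cycles:

```typescript
// Longest-path levelling as used by layoutTasks: a node's column is
// max(level of its in-graph deps) + 1; roots sit in column 0.
// Assumes the dependency graph is acyclic (no cycle guard in this sketch).
interface Node { id: string; dependsOn?: string[] }

function levelOf(tasks: Node[]): Map<string, number> {
  const levels = new Map<string, number>()
  const resolve = (id: string): number => {
    if (levels.has(id)) return levels.get(id)!
    const node = tasks.find((t) => t.id === id)!
    // Unknown dep ids are ignored, matching the byId.has(dep) filter above.
    const deps = (node.dependsOn ?? []).filter((d) => tasks.some((t) => t.id === d))
    const level = deps.length === 0 ? 0 : Math.max(...deps.map(resolve)) + 1
    levels.set(id, level)
    return level
  }
  for (const t of tasks) resolve(t.id)
  return levels
}
```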
@@ -0,0 +1,460 @@
/**
 * Pure HTML renderer for the post-run team task DAG dashboard (no filesystem or network I/O).
 */

import type { TeamRunResult } from '../types.js'
import { layoutTasks } from './layout-tasks.js'

/**
 * Escape serialized JSON so it can be embedded in HTML without closing a {@code <script>} tag.
 * The HTML tokenizer ends a script on {@code </script>} even for {@code type="application/json"}.
 */
export function escapeJsonForHtmlScript(json: string): string {
  return json.replace(/<\/script/gi, '<\\/script')
}
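Only the literal `</script` token needs breaking: the substitution lands inside a JSON string, where `\/` decodes back to `/`, so the payload round-trips losslessly. A self-contained copy of the one-liner to make that concrete:

```typescript
// Break "</script" so the HTML tokenizer cannot end the data block early;
// "\/" inside a JSON string still parses back to "/".
function escapeForScriptTag(json: string): string {
  return json.replace(/<\/script/gi, '<\\/script')
}
```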
export function renderTeamRunDashboard(result: TeamRunResult): string {
  const generatedAt = new Date().toISOString()
  const tasks = result.tasks ?? []
  const layout = layoutTasks(tasks)
  const serializedPositions = Object.fromEntries(layout.positions)
  const payload = {
    generatedAt,
    goal: result.goal ?? '',
    tasks,
    layout: {
      positions: serializedPositions,
      width: layout.width,
      height: layout.height,
      nodeW: layout.nodeW,
      nodeH: layout.nodeH,
    },
  }
  const dataJson = escapeJsonForHtmlScript(JSON.stringify(payload))

  return `<!DOCTYPE html>
<html class="dark" lang="en">
<head>
  <meta charset="utf-8" />
  <meta content="width=device-width, initial-scale=1.0" name="viewport" />
  <title>Open Multi Agent</title>
  <script src="https://cdn.tailwindcss.com?plugins=forms,container-queries"></script>
  <link
    href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@300;400;500;600;700&family=Inter:wght@400;500;600&display=swap"
    rel="stylesheet" />
  <link
    href="https://fonts.googleapis.com/css2?family=Material+Symbols+Outlined:wght,FILL@100..700,0..1&display=swap"
    rel="stylesheet" />
  <script id="tailwind-config">
    tailwind.config = {
      darkMode: "class",
      theme: {
        extend: {
          "colors": {
            "inverse-surface": "#faf8ff",
            "secondary-dim": "#ecb200",
            "on-primary": "#005762",
            "on-tertiary-fixed-variant": "#006827",
            "primary-fixed-dim": "#00d4ec",
            "tertiary-container": "#5cfd80",
            "secondary": "#fdc003",
            "primary-dim": "#00d4ec",
            "surface-container": "#0f1930",
            "on-secondary": "#553e00",
            "surface": "#060e20",
            "on-surface": "#dee5ff",
            "surface-container-highest": "#192540",
            "on-secondary-fixed-variant": "#674c00",
            "on-tertiary-container": "#005d22",
            "secondary-fixed-dim": "#f7ba00",
            "surface-variant": "#192540",
            "surface-container-low": "#091328",
            "secondary-container": "#785900",
            "tertiary-fixed-dim": "#4bee74",
            "on-primary-fixed-variant": "#005762",
            "primary-container": "#00e3fd",
            "surface-dim": "#060e20",
            "error-container": "#9f0519",
            "on-error-container": "#ffa8a3",
            "primary-fixed": "#00e3fd",
            "tertiary-dim": "#4bee74",
            "surface-container-high": "#141f38",
            "background": "#060e20",
            "surface-bright": "#1f2b49",
            "error-dim": "#d7383b",
            "on-primary-container": "#004d57",
            "outline": "#6d758c",
            "error": "#ff716c",
            "on-secondary-container": "#fff6ec",
            "on-primary-fixed": "#003840",
            "inverse-on-surface": "#4d556b",
            "secondary-fixed": "#ffca4d",
            "tertiary-fixed": "#5cfd80",
            "on-tertiary-fixed": "#004819",
            "surface-tint": "#81ecff",
            "tertiary": "#b8ffbb",
            "outline-variant": "#40485d",
            "on-error": "#490006",
            "on-surface-variant": "#a3aac4",
            "surface-container-lowest": "#000000",
            "on-tertiary": "#006727",
            "primary": "#81ecff",
            "on-secondary-fixed": "#443100",
            "inverse-primary": "#006976",
            "on-background": "#dee5ff"
          },
          "borderRadius": {
            "DEFAULT": "0px",
            "lg": "0px",
            "xl": "0px",
            "full": "9999px"
          },
          "fontFamily": {
            "headline": ["Space Grotesk"],
            "body": ["Inter"],
            "label": ["Space Grotesk"]
          }
        },
      },
    }
  </script>
  <style>
    .material-symbols-outlined {
      font-variation-settings: 'FILL' 0, 'wght' 400, 'GRAD' 0, 'opsz' 24;
    }

    .grid-pattern {
      background-image: radial-gradient(circle, #40485d 1px, transparent 1px);
      background-size: 24px 24px;
    }

    .node-active-glow {
      box-shadow: 0 0 15px rgba(129, 236, 255, 0.15);
    }
  </style>
</head>
<body class="bg-surface text-on-surface font-body selection:bg-primary selection:text-on-primary">
  <main class="p-8 min-h-[calc(100vh-64px)] grid-pattern relative overflow-hidden flex flex-col lg:flex-row gap-6">
    <div id="viewport" class="flex-1 relative min-h-[600px] overflow-hidden cursor-grab">
      <div id="canvas" class="absolute inset-0 origin-top-left">
        <svg id="edgesLayer" class="absolute inset-0 w-full h-full pointer-events-none" xmlns="http://www.w3.org/2000/svg"></svg>
        <div id="nodesLayer"></div>
      </div>
    </div>
    <aside id="detailsPanel" class="hidden w-full lg:w-[400px] bg-surface-container-high p-6 flex flex-col gap-8 border-l border-outline-variant/10">
      <div>
        <h2 class="font-headline font-black text-lg tracking-widest mb-6 text-primary flex items-center gap-2">
          <span class="material-symbols-outlined" data-icon="info">info</span>
          NODE_DETAILS
        </h2>
        <button id="closePanel" class="absolute top-4 right-4 text-on-surface-variant hover:text-primary">
          <span class="material-symbols-outlined">close</span>
        </button>
        <div class="space-y-6">
          <div class="flex flex-col gap-2">
            <label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Goal</label>
            <p id="goalText" class="text-xs bg-surface-container p-3 border-b border-outline-variant/20"></p>
          </div>
          <div class="flex flex-col gap-1">
            <label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Assigned Agent</label>
            <div class="flex items-center gap-4 bg-surface-container p-3">
              <div>
                <p id="selectedAssignee" class="text-sm font-bold text-on-surface">-</p>
                <p id="selectedState" class="text-[10px] font-mono text-secondary">ACTIVE STATE: -</p>
              </div>
            </div>
          </div>
          <div class="grid grid-cols-2 gap-4">
            <div class="flex flex-col gap-1">
              <label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Execution Start</label>
              <p id="selectedStart" class="text-xs font-mono bg-surface-container p-2 border-b border-outline-variant/20">-</p>
            </div>
            <div class="flex flex-col gap-1">
              <label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Execution End</label>
              <p id="selectedEnd" class="text-xs font-mono bg-surface-container p-2 border-b border-outline-variant/20 text-on-surface-variant">-</p>
            </div>
          </div>
          <div class="flex flex-col gap-1">
            <label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Token Breakdown</label>
            <div class="space-y-2 bg-surface-container p-4">
              <div class="flex justify-between text-xs font-mono">
                <span class="text-on-surface-variant">PROMPT:</span>
                <span id="selectedPromptTokens" class="text-on-surface">0</span>
              </div>
              <div class="flex justify-between text-xs font-mono">
                <span class="text-on-surface-variant">COMPLETION:</span>
                <span id="selectedCompletionTokens" class="text-on-surface text-secondary">0</span>
              </div>
              <div class="w-full h-1 bg-surface-variant mt-2">
                <div id="selectedTokenRatio" class="bg-primary h-full w-0"></div>
              </div>
            </div>
          </div>
          <div class="flex flex-col gap-1">
            <label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Tool Calls</label>
            <p id="selectedToolCalls" class="text-xs font-mono bg-surface-container p-2 border-b border-outline-variant/20">0</p>
          </div>
        </div>
      </div>
      <div class="flex-1 flex flex-col min-h-[200px]">
        <h2 class="font-headline font-black text-[10px] tracking-widest mb-4 text-on-surface-variant">LIVE_AGENT_OUTPUT</h2>
        <div id="liveOutput" class="bg-surface-container-lowest flex-1 p-3 font-mono text-[10px] leading-relaxed overflow-y-auto space-y-1">
        </div>
      </div>
    </aside>
  </main>
  <div class="fixed left-0 top-0 w-1 h-screen bg-gradient-to-b from-primary via-secondary to-tertiary z-[60] opacity-30"></div>
  <script type="application/json" id="oma-data">${dataJson}</script>
  <script>
    const dataEl = document.getElementById("oma-data");
    const payload = JSON.parse(dataEl.textContent);
    const panel = document.getElementById("detailsPanel");
    const closeBtn = document.getElementById("closePanel");
    const canvas = document.getElementById("canvas");
    const viewport = document.getElementById("viewport");
    const edgesLayer = document.getElementById("edgesLayer");
    const nodesLayer = document.getElementById("nodesLayer");
    const goalText = document.getElementById("goalText");
    const liveOutput = document.getElementById("liveOutput");
    const selectedAssignee = document.getElementById("selectedAssignee");
    const selectedState = document.getElementById("selectedState");
    const selectedStart = document.getElementById("selectedStart");
    const selectedToolCalls = document.getElementById("selectedToolCalls");
    const selectedEnd = document.getElementById("selectedEnd");
    const selectedPromptTokens = document.getElementById("selectedPromptTokens");
    const selectedCompletionTokens = document.getElementById("selectedCompletionTokens");
    const selectedTokenRatio = document.getElementById("selectedTokenRatio");
|
||||
const svgNs = "http://www.w3.org/2000/svg";
|
||||
|
||||
let scale = 1;
|
||||
let translate = { x: 0, y: 0 };
|
||||
|
||||
let isDragging = false;
|
||||
let last = { x: 0, y: 0 };
|
||||
|
||||
function updateTransform() {
|
||||
canvas.style.transform = \`
|
||||
translate(\${translate.x}px, \${translate.y}px)
|
||||
scale(\${scale})
|
||||
\`;
|
||||
}
|
||||
|
||||
viewport.addEventListener("wheel", (e) => {
|
||||
e.preventDefault();
|
||||
|
||||
const zoomIntensity = 0.0015;
|
||||
const delta = -e.deltaY * zoomIntensity;
|
||||
const newScale = Math.min(Math.max(0.4, scale + delta), 2.5);
|
||||
|
||||
const rect = viewport.getBoundingClientRect();
|
||||
const mouseX = e.clientX - rect.left;
|
||||
const mouseY = e.clientY - rect.top;
|
||||
const dx = mouseX - translate.x;
|
||||
const dy = mouseY - translate.y;
|
||||
|
||||
translate.x -= dx * (newScale / scale - 1);
|
||||
translate.y -= dy * (newScale / scale - 1);
|
||||
scale = newScale;
|
||||
updateTransform();
|
||||
});
|
||||
|
||||
viewport.addEventListener("mousedown", (e) => {
|
||||
isDragging = true;
|
||||
last = { x: e.clientX, y: e.clientY };
|
||||
viewport.classList.add("cursor-grabbing");
|
||||
});
|
||||
|
||||
window.addEventListener("mousemove", (e) => {
|
||||
if (!isDragging) return;
|
||||
|
||||
const dx = e.clientX - last.x;
|
||||
const dy = e.clientY - last.y;
|
||||
translate.x += dx;
|
||||
translate.y += dy;
|
||||
last = { x: e.clientX, y: e.clientY };
|
||||
updateTransform();
|
||||
});
|
||||
|
||||
window.addEventListener("mouseup", () => {
|
||||
isDragging = false;
|
||||
viewport.classList.remove("cursor-grabbing");
|
||||
});
|
||||
|
||||
updateTransform();
|
||||
|
||||
closeBtn.addEventListener("click", () => {
|
||||
panel.classList.add("hidden");
|
||||
});
|
||||
|
||||
document.addEventListener("click", (e) => {
|
||||
const isClickInsidePanel = panel.contains(e.target);
|
||||
const isNode = e.target.closest(".node");
|
||||
|
||||
if (!isClickInsidePanel && !isNode) {
|
||||
panel.classList.add("hidden");
|
||||
}
|
||||
});
|
||||
|
||||
const tasks = Array.isArray(payload.tasks) ? payload.tasks : [];
|
||||
goalText.textContent = payload.goal ?? "";
|
||||
|
||||
const statusStyles = {
|
||||
completed: { border: "border-tertiary", icon: "check_circle", iconColor: "text-tertiary", container: "bg-surface-container-lowest node-active-glow", statusColor: "text-on-surface-variant", chip: "STABLE" },
|
||||
failed: { border: "border-error", icon: "error", iconColor: "text-error", container: "bg-surface-container-lowest", statusColor: "text-error", chip: "FAILED" },
|
||||
blocked: { border: "border-outline", icon: "lock", iconColor: "text-outline", container: "bg-surface-container-low opacity-60 grayscale", statusColor: "text-on-surface-variant", chip: "BLOCKED" },
|
||||
skipped: { border: "border-outline", icon: "skip_next", iconColor: "text-outline", container: "bg-surface-container-low opacity-60", statusColor: "text-on-surface-variant", chip: "SKIPPED" },
|
||||
in_progress: { border: "border-secondary", icon: "sync", iconColor: "text-secondary", container: "bg-surface-container-low node-active-glow border border-outline-variant/20 shadow-[0_0_20px_rgba(253,192,3,0.1)]", statusColor: "text-secondary", chip: "ACTIVE_STREAM", spin: true },
|
||||
pending: { border: "border-outline", icon: "hourglass_empty", iconColor: "text-outline", container: "bg-surface-container-low opacity-60 grayscale", statusColor: "text-on-surface-variant", chip: "WAITING" },
|
||||
};
|
||||
|
||||
function durationText(task) {
|
||||
const ms = task?.metrics?.durationMs ?? 0;
|
||||
const seconds = Math.max(0, ms / 1000).toFixed(1);
|
||||
return task.status === "completed" ? "DONE (" + seconds + "s)" : task.status.toUpperCase();
|
||||
}
|
||||
|
||||
function renderLiveOutput(taskList) {
|
||||
liveOutput.innerHTML = "";
|
||||
const finished = taskList.every((task) => ["completed", "failed", "skipped", "blocked"].includes(task.status));
|
||||
const header = document.createElement("p");
|
||||
header.className = "text-tertiary";
|
||||
header.textContent = finished ? "[SYSTEM] Task graph execution finished." : "[SYSTEM] Task graph execution in progress.";
|
||||
liveOutput.appendChild(header);
|
||||
|
||||
taskList.forEach((task) => {
|
||||
const p = document.createElement("p");
|
||||
p.className = task.status === "completed" ? "text-on-surface-variant" : task.status === "failed" ? "text-error" : "text-on-surface-variant";
|
||||
p.textContent = "[" + (task.assignee || "UNASSIGNED").toUpperCase() + "] " + task.title + " -> " + task.status.toUpperCase();
|
||||
liveOutput.appendChild(p);
|
||||
});
|
||||
}
|
||||
|
||||
function renderDetails(task) {
|
||||
const metrics = task?.metrics ?? {};
|
||||
const statusLabel = (statusStyles[task.status] || statusStyles.pending).chip;
|
||||
const usage = metrics.tokenUsage ?? { input_tokens: 0, output_tokens: 0 };
|
||||
const inTokens = usage.input_tokens ?? 0;
|
||||
const outTokens = usage.output_tokens ?? 0;
|
||||
const total = inTokens + outTokens;
|
||||
const ratio = total > 0 ? Math.round((inTokens / total) * 100) : 0;
|
||||
|
||||
selectedAssignee.textContent = task?.assignee || "UNASSIGNED";
|
||||
|
||||
selectedState.textContent = "STATE: " + statusLabel;
|
||||
selectedStart.textContent = metrics.startMs ? new Date(metrics.startMs).toISOString() : "-";
|
||||
selectedEnd.textContent = metrics.endMs ? new Date(metrics.endMs).toISOString() : "-";
|
||||
|
||||
selectedToolCalls.textContent = (metrics.toolCalls ?? []).length.toString();
|
||||
|
||||
selectedPromptTokens.textContent = inTokens.toLocaleString();
|
||||
selectedCompletionTokens.textContent = outTokens.toLocaleString();
|
||||
selectedTokenRatio.style.width = ratio + "%";
|
||||
}
|
||||
|
||||
function makeEdgePath(x1, y1, x2, y2) {
|
||||
return "M " + x1 + " " + y1 + " C " + (x1 + 42) + " " + y1 + ", " + (x2 - 42) + " " + y2 + ", " + x2 + " " + y2;
|
||||
}
|
||||
|
||||
function renderDag(taskList) {
|
||||
const rawLayout = payload.layout ?? {};
|
||||
const positions = new Map(Object.entries(rawLayout.positions ?? {}));
|
||||
const width = Number(rawLayout.width ?? 1600);
|
||||
const height = Number(rawLayout.height ?? 700);
|
||||
const nodeW = Number(rawLayout.nodeW ?? 256);
|
||||
const nodeH = Number(rawLayout.nodeH ?? 142);
|
||||
canvas.style.width = width + "px";
|
||||
canvas.style.height = height + "px";
|
||||
|
||||
edgesLayer.setAttribute("viewBox", "0 0 " + width + " " + height);
|
||||
edgesLayer.innerHTML = "";
|
||||
const defs = document.createElementNS(svgNs, "defs");
|
||||
const marker = document.createElementNS(svgNs, "marker");
|
||||
marker.setAttribute("id", "arrow");
|
||||
marker.setAttribute("markerWidth", "8");
|
||||
marker.setAttribute("markerHeight", "8");
|
||||
marker.setAttribute("refX", "7");
|
||||
marker.setAttribute("refY", "4");
|
||||
marker.setAttribute("orient", "auto");
|
||||
const markerPath = document.createElementNS(svgNs, "path");
|
||||
markerPath.setAttribute("d", "M0,0 L8,4 L0,8 z");
|
||||
markerPath.setAttribute("fill", "#40485d");
|
||||
marker.appendChild(markerPath);
|
||||
defs.appendChild(marker);
|
||||
edgesLayer.appendChild(defs);
|
||||
|
||||
taskList.forEach((task) => {
|
||||
const to = positions.get(task.id);
|
||||
(task.dependsOn || []).forEach((depId) => {
|
||||
const from = positions.get(depId);
|
||||
if (!from || !to) return;
|
||||
const edge = document.createElementNS(svgNs, "path");
|
||||
edge.setAttribute("d", makeEdgePath(from.x + nodeW, from.y + nodeH / 2, to.x, to.y + nodeH / 2));
|
||||
edge.setAttribute("fill", "none");
|
||||
edge.setAttribute("stroke", "#40485d");
|
||||
edge.setAttribute("stroke-width", "2");
|
||||
edge.setAttribute("marker-end", "url(#arrow)");
|
||||
edgesLayer.appendChild(edge);
|
||||
});
|
||||
});
|
||||
|
||||
nodesLayer.innerHTML = "";
|
||||
taskList.forEach((task, idx) => {
|
||||
const pos = positions.get(task.id);
|
||||
const status = statusStyles[task.status] || statusStyles.pending;
|
||||
const nodeId = "#NODE_" + String(idx + 1).padStart(3, "0");
|
||||
const chips = [task.assignee ? task.assignee.toUpperCase() : "UNASSIGNED", status.chip];
|
||||
|
||||
const node = document.createElement("div");
|
||||
node.className = "node absolute w-64 border-l-2 p-4 cursor-pointer " + status.border + " " + status.container;
|
||||
node.style.left = pos.x + "px";
|
||||
node.style.top = pos.y + "px";
|
||||
|
||||
const rowTop = document.createElement("div");
|
||||
rowTop.className = "flex justify-between items-start mb-4";
|
||||
const nodeIdSpan = document.createElement("span");
|
||||
nodeIdSpan.className = "text-[10px] font-mono " + status.iconColor;
|
||||
nodeIdSpan.textContent = nodeId;
|
||||
const iconSpan = document.createElement("span");
|
||||
iconSpan.className = "material-symbols-outlined " + status.iconColor + " text-lg " + (status.spin ? "animate-spin" : "");
|
||||
iconSpan.textContent = status.icon;
|
||||
iconSpan.setAttribute("data-icon", status.icon);
|
||||
rowTop.appendChild(nodeIdSpan);
|
||||
rowTop.appendChild(iconSpan);
|
||||
|
||||
const titleEl = document.createElement("h3");
|
||||
titleEl.className = "font-headline font-bold text-sm tracking-tight mb-1";
|
||||
titleEl.textContent = task.title;
|
||||
|
||||
const statusLine = document.createElement("p");
|
||||
statusLine.className = "text-xs " + status.statusColor + " mb-4";
|
||||
statusLine.textContent = "STATUS: " + durationText(task);
|
||||
|
||||
const chipRow = document.createElement("div");
|
||||
chipRow.className = "flex gap-2";
|
||||
chips.forEach((chip) => {
|
||||
const chipEl = document.createElement("span");
|
||||
chipEl.className = "px-2 py-0.5 bg-surface-variant text-[9px] font-mono text-on-surface-variant";
|
||||
chipEl.textContent = chip;
|
||||
chipRow.appendChild(chipEl);
|
||||
});
|
||||
|
||||
node.appendChild(rowTop);
|
||||
node.appendChild(titleEl);
|
||||
node.appendChild(statusLine);
|
||||
node.appendChild(chipRow);
|
||||
|
||||
node.addEventListener("click", () => {
|
||||
renderDetails(task);
|
||||
panel.classList.remove("hidden");
|
||||
});
|
||||
nodesLayer.appendChild(node);
|
||||
});
|
||||
|
||||
renderLiveOutput(taskList);
|
||||
}
|
||||
|
||||
renderDag(tasks);
|
||||
</script>
|
||||
</body>
|
||||
</html>`
|
||||
}
|
||||
|
|
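The wheel handler above adjusts `translate` so that the canvas point under the cursor stays fixed while zooming. A standalone sketch of that math (a pure-function rewrite of the handler's mutation, not the dashboard's actual code) can verify the invariant:

```typescript
// Pure-function version of the zoom-about-cursor adjustment from the wheel
// handler: after rescaling, translate shifts so the canvas point under the
// mouse stays under the mouse.
interface View { scale: number; tx: number; ty: number }

function zoomAt(view: View, mouseX: number, mouseY: number, newScale: number): View {
  const dx = mouseX - view.tx
  const dy = mouseY - view.ty
  return {
    scale: newScale,
    tx: view.tx - dx * (newScale / view.scale - 1),
    ty: view.ty - dy * (newScale / view.scale - 1),
  }
}

// The canvas-space point under the cursor, before and after zooming:
const v: View = { scale: 1, tx: 20, ty: 10 }
const before = { x: (100 - v.tx) / v.scale, y: (50 - v.ty) / v.scale }
const z = zoomAt(v, 100, 50, 2)
const after = { x: (100 - z.tx) / z.scale, y: (50 - z.ty) / z.scale }
console.log(before, after) // the same canvas point both times
```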
@@ -0,0 +1,19 @@
/**
 * @fileoverview Framework-specific error classes.
 */

/**
 * Raised when an agent or orchestrator run exceeds its configured token budget.
 */
export class TokenBudgetExceededError extends Error {
  readonly code = 'TOKEN_BUDGET_EXCEEDED'

  constructor(
    readonly agent: string,
    readonly tokensUsed: number,
    readonly budget: number,
  ) {
    super(`Agent "${agent}" exceeded token budget: ${tokensUsed} tokens used (budget: ${budget})`)
    this.name = 'TokenBudgetExceededError'
  }
}
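Downstream code can branch on `instanceof` or on the stable `code` field. A self-contained sketch (the class body is copied from the diff above; `checkBudget` and the agent name are hypothetical illustrations, not framework API):

```typescript
// Mirror of the TokenBudgetExceededError from the diff above.
class TokenBudgetExceededError extends Error {
  readonly code = 'TOKEN_BUDGET_EXCEEDED'

  constructor(
    readonly agent: string,
    readonly tokensUsed: number,
    readonly budget: number,
  ) {
    super(`Agent "${agent}" exceeded token budget: ${tokensUsed} tokens used (budget: ${budget})`)
    this.name = 'TokenBudgetExceededError'
  }
}

// Hypothetical guard a run loop might call after each turn.
function checkBudget(agent: string, used: number, budget: number): void {
  if (used > budget) throw new TokenBudgetExceededError(agent, used, budget)
}

try {
  checkBudget('researcher', 12_500, 10_000)
} catch (err) {
  if (err instanceof TokenBudgetExceededError) {
    // Structured fields survive, so callers can log or retry with a larger budget.
    console.log(`${err.code}: ${err.agent} used ${err.tokensUsed}/${err.budget}`)
  }
}
```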
16 src/index.ts

@@ -58,6 +58,8 @@ export { OpenMultiAgent, executeWithRetry, computeRetryDelay } from './orchestra
export { Scheduler } from './orchestrator/scheduler.js'
export type { SchedulingStrategy } from './orchestrator/scheduler.js'

+export { renderTeamRunDashboard } from './dashboard/render-team-run-dashboard.js'
+
// ---------------------------------------------------------------------------
// Agent layer
// ---------------------------------------------------------------------------

@@ -89,17 +91,21 @@ export type { TaskQueueEvent } from './task/queue.js'
// ---------------------------------------------------------------------------

export { defineTool, ToolRegistry, zodToJsonSchema } from './tool/framework.js'
-export { ToolExecutor } from './tool/executor.js'
+export { ToolExecutor, truncateToolOutput } from './tool/executor.js'
export type { ToolExecutorOptions, BatchToolCall } from './tool/executor.js'
export {
  registerBuiltInTools,
  BUILT_IN_TOOLS,
  ALL_BUILT_IN_TOOLS_WITH_DELEGATE,
  bashTool,
  delegateToAgentTool,
  fileReadTool,
  fileWriteTool,
  fileEditTool,
  globTool,
  grepTool,
} from './tool/built-in/index.js'
export type { RegisterBuiltInToolsOptions } from './tool/built-in/index.js'

// ---------------------------------------------------------------------------
// LLM adapters

@@ -107,6 +113,7 @@ export {

export { createAdapter } from './llm/adapter.js'
export type { SupportedProvider } from './llm/adapter.js'
+export { TokenBudgetExceededError } from './errors.js'

// ---------------------------------------------------------------------------
// Memory

@@ -143,6 +150,7 @@ export type {
  ToolUseContext,
  AgentInfo,
  TeamInfo,
+  DelegationPoolView,

  // Agent
  AgentConfig,

@@ -152,11 +160,16 @@ export type {
  ToolCallRecord,
  LoopDetectionConfig,
  LoopDetectionInfo,
+  ContextStrategy,

  // Team
  TeamConfig,
  TeamRunResult,

+  // Dashboard (static HTML)
+  TaskExecutionMetrics,
+  TaskExecutionRecord,
+
  // Task
  Task,
  TaskStatus,

@@ -164,6 +177,7 @@ export type {
  // Orchestrator
  OrchestratorConfig,
  OrchestratorEvent,
+  CoordinatorConfig,

  // Trace
  TraceEventType,
@@ -38,19 +38,22 @@ import type { LLMAdapter } from '../types.js'
 * Additional providers can be integrated by implementing {@link LLMAdapter}
 * directly and bypassing this factory.
 */
-export type SupportedProvider = 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
+export type SupportedProvider = 'anthropic' | 'azure-openai' | 'copilot' | 'deepseek' | 'grok' | 'minimax' | 'openai' | 'gemini'

/**
 * Instantiate the appropriate {@link LLMAdapter} for the given provider.
 *
 * API keys fall back to the standard environment variables when not supplied
 * explicitly:
- * - `anthropic` → `ANTHROPIC_API_KEY`
- * - `openai` → `OPENAI_API_KEY`
- * - `gemini` → `GEMINI_API_KEY` / `GOOGLE_API_KEY`
- * - `grok` → `XAI_API_KEY`
- * - `copilot` → `GITHUB_COPILOT_TOKEN` / `GITHUB_TOKEN`, or interactive
- *   OAuth2 device flow if neither is set
+ * - `anthropic` → `ANTHROPIC_API_KEY`
+ * - `azure-openai` → `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`
+ * - `openai` → `OPENAI_API_KEY`
+ * - `gemini` → `GEMINI_API_KEY` / `GOOGLE_API_KEY`
+ * - `grok` → `XAI_API_KEY`
+ * - `minimax` → `MINIMAX_API_KEY`
+ * - `deepseek` → `DEEPSEEK_API_KEY`
+ * - `copilot` → `GITHUB_COPILOT_TOKEN` / `GITHUB_TOKEN`, or interactive
+ *   OAuth2 device flow if neither is set
 *
 * Adapters are imported lazily so that projects using only one provider
 * are not forced to install the SDK for the other.

@@ -89,6 +92,20 @@ export async function createAdapter(
      const { GrokAdapter } = await import('./grok.js')
      return new GrokAdapter(apiKey, baseURL)
    }
+    case 'minimax': {
+      const { MiniMaxAdapter } = await import('./minimax.js')
+      return new MiniMaxAdapter(apiKey, baseURL)
+    }
+    case 'deepseek': {
+      const { DeepSeekAdapter } = await import('./deepseek.js')
+      return new DeepSeekAdapter(apiKey, baseURL)
+    }
+    case 'azure-openai': {
+      // For azure-openai, the `baseURL` parameter serves as the Azure endpoint URL.
+      // To override the API version, set AZURE_OPENAI_API_VERSION env var.
+      const { AzureOpenAIAdapter } = await import('./azure-openai.js')
+      return new AzureOpenAIAdapter(apiKey, baseURL)
+    }
    default: {
      // The `never` cast here makes TypeScript enforce exhaustiveness.
      const _exhaustive: never = provider
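The `never` cast in the `default` branch is what makes adding a new member to `SupportedProvider` a compile error until every switch handles it. A standalone sketch of the same pattern (using a toy two-member union and a hypothetical `envVarFor` helper, not the framework's real types):

```typescript
// Toy union standing in for SupportedProvider.
type Provider = 'openai' | 'deepseek'

function envVarFor(provider: Provider): string {
  switch (provider) {
    case 'openai':
      return 'OPENAI_API_KEY'
    case 'deepseek':
      return 'DEEPSEEK_API_KEY'
    default: {
      // If a member is added to `Provider` without a matching case, `provider`
      // no longer narrows to `never` here and this assignment fails to compile,
      // so the switch can never silently fall through at runtime.
      const _exhaustive: never = provider
      throw new Error(`Unsupported provider: ${String(_exhaustive)}`)
    }
  }
}

console.log(envVarFor('deepseek')) // DEEPSEEK_API_KEY
```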
@@ -0,0 +1,313 @@
/**
 * @fileoverview Azure OpenAI adapter implementing {@link LLMAdapter}.
 *
 * Azure OpenAI uses regional deployment endpoints and API versioning that differ
 * from standard OpenAI:
 *
 * - Endpoint: `https://{resource-name}.openai.azure.com`
 * - API version: Query parameter (e.g., `?api-version=2024-10-21`)
 * - Model/Deployment: Users deploy models with custom names; the `model` field
 *   in agent config should contain the Azure deployment name, not the underlying
 *   model name (e.g., `model: 'my-gpt4-deployment'`)
 *
 * The OpenAI SDK provides an `AzureOpenAI` client class that handles these
 * Azure-specific requirements. This adapter uses that client while reusing all
 * message conversion logic from `openai-common.ts`.
 *
 * Environment variable resolution order:
 * 1. Constructor arguments
 * 2. `AZURE_OPENAI_API_KEY` environment variable
 * 3. `AZURE_OPENAI_ENDPOINT` environment variable
 * 4. `AZURE_OPENAI_API_VERSION` environment variable (defaults to '2024-10-21')
 * 5. `AZURE_OPENAI_DEPLOYMENT` as an optional fallback when `model` is blank
 *
 * Note: Azure introduced a next-generation v1 API (August 2025) that uses the standard
 * OpenAI() client with baseURL set to `{endpoint}/openai/v1/` and requires no api-version.
 * That path is not yet supported by this adapter. To use it, pass `provider: 'openai'`
 * with `baseURL: 'https://{resource}.openai.azure.com/openai/v1/'` in your agent config.
 *
 * @example
 * ```ts
 * import { AzureOpenAIAdapter } from './azure-openai.js'
 *
 * const adapter = new AzureOpenAIAdapter()
 * const response = await adapter.chat(messages, {
 *   model: 'my-gpt4-deployment', // Azure deployment name, not 'gpt-4'
 *   maxTokens: 1024,
 * })
 * ```
 */

import { AzureOpenAI } from 'openai'
import type {
  ChatCompletionChunk,
} from 'openai/resources/chat/completions/index.js'

import type {
  ContentBlock,
  LLMAdapter,
  LLMChatOptions,
  LLMMessage,
  LLMResponse,
  LLMStreamOptions,
  StreamEvent,
  TextBlock,
  ToolUseBlock,
} from '../types.js'

import {
  toOpenAITool,
  fromOpenAICompletion,
  normalizeFinishReason,
  buildOpenAIMessageList,
} from './openai-common.js'
import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'

// ---------------------------------------------------------------------------
// Adapter implementation
// ---------------------------------------------------------------------------

const DEFAULT_AZURE_OPENAI_API_VERSION = '2024-10-21'

function resolveAzureDeploymentName(model: string): string {
  const explicitModel = model.trim()
  if (explicitModel.length > 0) return explicitModel

  const fallbackDeployment = process.env['AZURE_OPENAI_DEPLOYMENT']?.trim()
  if (fallbackDeployment !== undefined && fallbackDeployment.length > 0) {
    return fallbackDeployment
  }

  throw new Error(
    'Azure OpenAI deployment is required. Set agent model to your deployment name, or set AZURE_OPENAI_DEPLOYMENT.',
  )
}

/**
 * LLM adapter backed by Azure OpenAI Chat Completions API.
 *
 * Thread-safe — a single instance may be shared across concurrent agent runs.
 */
export class AzureOpenAIAdapter implements LLMAdapter {
  readonly name: string = 'azure-openai'

  readonly #client: AzureOpenAI

  /**
   * @param apiKey - Azure OpenAI API key (falls back to AZURE_OPENAI_API_KEY env var)
   * @param endpoint - Azure endpoint URL (falls back to AZURE_OPENAI_ENDPOINT env var)
   * @param apiVersion - API version string (falls back to AZURE_OPENAI_API_VERSION, defaults to '2024-10-21')
   */
  constructor(apiKey?: string, endpoint?: string, apiVersion?: string) {
    this.#client = new AzureOpenAI({
      apiKey: apiKey ?? process.env['AZURE_OPENAI_API_KEY'],
      endpoint: endpoint ?? process.env['AZURE_OPENAI_ENDPOINT'],
      apiVersion: apiVersion ?? process.env['AZURE_OPENAI_API_VERSION'] ?? DEFAULT_AZURE_OPENAI_API_VERSION,
    })
  }

  // -------------------------------------------------------------------------
  // chat()
  // -------------------------------------------------------------------------

  /**
   * Send a synchronous (non-streaming) chat request and return the complete
   * {@link LLMResponse}.
   *
   * Throws an `AzureOpenAI.APIError` on non-2xx responses. Callers should catch and
   * handle these (e.g. rate limits, context length exceeded, deployment not found).
   */
  async chat(messages: LLMMessage[], options: LLMChatOptions): Promise<LLMResponse> {
    const deploymentName = resolveAzureDeploymentName(options.model)
    const openAIMessages = buildOpenAIMessageList(messages, options.systemPrompt)

    const completion = await this.#client.chat.completions.create(
      {
        model: deploymentName,
        messages: openAIMessages,
        max_tokens: options.maxTokens,
        temperature: options.temperature,
        tools: options.tools ? options.tools.map(toOpenAITool) : undefined,
        stream: false,
      },
      {
        signal: options.abortSignal,
      },
    )

    const toolNames = options.tools?.map(t => t.name)
    return fromOpenAICompletion(completion, toolNames)
  }

  // -------------------------------------------------------------------------
  // stream()
  // -------------------------------------------------------------------------

  /**
   * Send a streaming chat request and yield {@link StreamEvent}s incrementally.
   *
   * Sequence guarantees match {@link OpenAIAdapter.stream}:
   * - Zero or more `text` events
   * - Zero or more `tool_use` events (emitted once per tool call, after
   *   arguments have been fully assembled)
   * - Exactly one terminal event: `done` or `error`
   */
  async *stream(
    messages: LLMMessage[],
    options: LLMStreamOptions,
  ): AsyncIterable<StreamEvent> {
    const deploymentName = resolveAzureDeploymentName(options.model)
    const openAIMessages = buildOpenAIMessageList(messages, options.systemPrompt)

    // We request usage in the final chunk so we can include it in the `done` event.
    const streamResponse = await this.#client.chat.completions.create(
      {
        model: deploymentName,
        messages: openAIMessages,
        max_tokens: options.maxTokens,
        temperature: options.temperature,
        tools: options.tools ? options.tools.map(toOpenAITool) : undefined,
        stream: true,
        stream_options: { include_usage: true },
      },
      {
        signal: options.abortSignal,
      },
    )

    // Accumulate state across chunks.
    let completionId = ''
    let completionModel = ''
    let finalFinishReason: string = 'stop'
    let inputTokens = 0
    let outputTokens = 0

    // tool_calls are streamed piecemeal; key = tool call index
    const toolCallBuffers = new Map<
      number,
      { id: string; name: string; argsJson: string }
    >()

    // Full text accumulator for the `done` response.
    let fullText = ''

    try {
      for await (const chunk of streamResponse) {
        completionId = chunk.id
        completionModel = chunk.model

        // Usage is only populated in the final chunk when stream_options.include_usage is set.
        if (chunk.usage !== null && chunk.usage !== undefined) {
          inputTokens = chunk.usage.prompt_tokens
          outputTokens = chunk.usage.completion_tokens
        }

        const choice: ChatCompletionChunk.Choice | undefined = chunk.choices[0]
        if (choice === undefined) continue

        const delta = choice.delta

        // --- text delta ---
        if (delta.content !== null && delta.content !== undefined) {
          fullText += delta.content
          const textEvent: StreamEvent = { type: 'text', data: delta.content }
          yield textEvent
        }

        // --- tool call delta ---
        for (const toolCallDelta of delta.tool_calls ?? []) {
          const idx = toolCallDelta.index

          if (!toolCallBuffers.has(idx)) {
            toolCallBuffers.set(idx, {
              id: toolCallDelta.id ?? '',
              name: toolCallDelta.function?.name ?? '',
              argsJson: '',
            })
          }

          const buf = toolCallBuffers.get(idx)
          // buf is guaranteed to exist: we just set it above.
          if (buf !== undefined) {
            if (toolCallDelta.id) buf.id = toolCallDelta.id
            if (toolCallDelta.function?.name) buf.name = toolCallDelta.function.name
            if (toolCallDelta.function?.arguments) {
              buf.argsJson += toolCallDelta.function.arguments
            }
          }
        }

        if (choice.finish_reason !== null && choice.finish_reason !== undefined) {
          finalFinishReason = choice.finish_reason
        }
      }

      // Emit accumulated tool_use events after the stream ends.
      const finalToolUseBlocks: ToolUseBlock[] = []
      for (const buf of toolCallBuffers.values()) {
        let parsedInput: Record<string, unknown> = {}
        try {
          const parsed: unknown = JSON.parse(buf.argsJson)
          if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
            parsedInput = parsed as Record<string, unknown>
          }
        } catch {
          // Malformed JSON — surface as empty object.
        }

        const toolUseBlock: ToolUseBlock = {
          type: 'tool_use',
          id: buf.id,
          name: buf.name,
          input: parsedInput,
        }
        finalToolUseBlocks.push(toolUseBlock)
        const toolUseEvent: StreamEvent = { type: 'tool_use', data: toolUseBlock }
        yield toolUseEvent
      }

      // Build the complete content array for the done response.
      const doneContent: ContentBlock[] = []
      if (fullText.length > 0) {
        const textBlock: TextBlock = { type: 'text', text: fullText }
        doneContent.push(textBlock)
      }
      doneContent.push(...finalToolUseBlocks)

      // Fallback: extract tool calls from text when streaming produced no
      // native tool_calls (same logic as fromOpenAICompletion).
      if (finalToolUseBlocks.length === 0 && fullText.length > 0 && options.tools) {
        const toolNames = options.tools.map(t => t.name)
        const extracted = extractToolCallsFromText(fullText, toolNames)
        if (extracted.length > 0) {
          doneContent.push(...extracted)
          for (const block of extracted) {
            yield { type: 'tool_use', data: block } satisfies StreamEvent
          }
        }
      }

      const hasToolUseBlocks = doneContent.some(b => b.type === 'tool_use')
      const resolvedStopReason = hasToolUseBlocks && finalFinishReason === 'stop'
        ? 'tool_use'
        : normalizeFinishReason(finalFinishReason)

      const finalResponse: LLMResponse = {
        id: completionId,
        content: doneContent,
        model: completionModel,
        stop_reason: resolvedStopReason,
        usage: { input_tokens: inputTokens, output_tokens: outputTokens },
      }

      const doneEvent: StreamEvent = { type: 'done', data: finalResponse }
      yield doneEvent
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err))
      const errorEvent: StreamEvent = { type: 'error', data: error }
      yield errorEvent
    }
  }
}
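The core of `stream()` is the per-index buffering: the API delivers each tool call's metadata once and its arguments as JSON fragments spread across many chunks. A toy reproduction of that merge (the delta shape here is a simplified stand-in for the OpenAI SDK chunk types, not the real ones):

```typescript
// Simplified stand-in for the streamed tool_call delta shape.
interface ToolCallDelta {
  index: number
  id?: string
  name?: string
  args?: string // fragment of the arguments JSON
}

function mergeToolCallDeltas(
  deltas: ToolCallDelta[],
): { id: string; name: string; input: unknown }[] {
  // key = tool call index, mirroring the adapter's toolCallBuffers map.
  const buffers = new Map<number, { id: string; name: string; argsJson: string }>()
  for (const d of deltas) {
    let buf = buffers.get(d.index)
    if (buf === undefined) {
      buf = { id: '', name: '', argsJson: '' }
      buffers.set(d.index, buf)
    }
    if (d.id) buf.id = d.id
    if (d.name) buf.name = d.name
    if (d.args) buf.argsJson += d.args
  }
  return [...buffers.values()].map(b => {
    let input: unknown = {}
    try {
      input = JSON.parse(b.argsJson)
    } catch {
      // Malformed JSON surfaces as an empty object, matching the adapter.
    }
    return { id: b.id, name: b.name, input }
  })
}

const calls = mergeToolCallDeltas([
  { index: 0, id: 'call_1', name: 'grep' },
  { index: 0, args: '{"pattern":' },
  { index: 0, args: '"TODO"}' },
])
console.log(calls[0]) // fragments reassembled into { pattern: "TODO" }
```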
@@ -0,0 +1,29 @@
/**
 * @fileoverview DeepSeek adapter.
 *
 * Thin wrapper around OpenAIAdapter that hard-codes the official DeepSeek
 * OpenAI-compatible endpoint and DEEPSEEK_API_KEY environment variable fallback.
 */

import { OpenAIAdapter } from './openai.js'

/**
 * LLM adapter for DeepSeek models (deepseek-chat, deepseek-reasoner, and future models).
 *
 * Thread-safe. Can be shared across agents.
 *
 * Usage:
 *   provider: 'deepseek'
 *   model: 'deepseek-chat' (or 'deepseek-reasoner' for the thinking model)
 */
export class DeepSeekAdapter extends OpenAIAdapter {
  readonly name = 'deepseek'

  constructor(apiKey?: string, baseURL?: string) {
    // Allow override of baseURL (for proxies or future changes) but default to official DeepSeek endpoint.
    super(
      apiKey ?? process.env['DEEPSEEK_API_KEY'],
      baseURL ?? 'https://api.deepseek.com/v1'
    )
  }
}
@@ -163,6 +163,7 @@ function buildConfig(
    toolConfig: options.tools
      ? { functionCallingConfig: { mode: FunctionCallingConfigMode.AUTO } }
      : undefined,
+    abortSignal: options.abortSignal,
  }
}
@@ -0,0 +1,29 @@
/**
 * @fileoverview MiniMax adapter.
 *
 * Thin wrapper around OpenAIAdapter that hard-codes the official MiniMax
 * OpenAI-compatible endpoint and MINIMAX_API_KEY environment variable fallback.
 */

import { OpenAIAdapter } from './openai.js'

/**
 * LLM adapter for MiniMax models (MiniMax-M2.7 series and future models).
 *
 * Thread-safe. Can be shared across agents.
 *
 * Usage:
 *   provider: 'minimax'
 *   model: 'MiniMax-M2.7' (or any current MiniMax model name)
 */
export class MiniMaxAdapter extends OpenAIAdapter {
  readonly name = 'minimax'

  constructor(apiKey?: string, baseURL?: string) {
    // Allow override of baseURL (for proxies or future changes) but default to official MiniMax endpoint.
    super(
      apiKey ?? process.env['MINIMAX_API_KEY'],
      baseURL ?? process.env['MINIMAX_BASE_URL'] ?? 'https://api.minimax.io/v1'
    )
  }
}
@ -0,0 +1,5 @@
|
|||
export type {
|
||||
ConnectMCPToolsConfig,
|
||||
ConnectedMCPTools,
|
||||
} from './tool/mcp.js'
|
||||
export { connectMCPTools } from './tool/mcp.js'
|
||||
|
|
@@ -124,8 +124,18 @@ export class SharedMemory {
   * - plan: Implement feature X using const type params
   * ```
   */
  async getSummary(): Promise<string> {
    const all = await this.store.list()
  async getSummary(filter?: { taskIds?: string[] }): Promise<string> {
    let all = await this.store.list()
    if (filter?.taskIds && filter.taskIds.length > 0) {
      const taskIds = new Set(filter.taskIds)
      all = all.filter((entry) => {
        const slashIdx = entry.key.indexOf('/')
        const localKey = slashIdx === -1 ? entry.key : entry.key.slice(slashIdx + 1)
        if (!localKey.startsWith('task:') || !localKey.endsWith(':result')) return false
        const taskId = localKey.slice('task:'.length, localKey.length - ':result'.length)
        return taskIds.has(taskId)
      })
    }
    if (all.length === 0) return ''

    // Group entries by agent name.
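The filter above parses shared-memory keys to decide which entries belong to the requested tasks. A sketch of that predicate in isolation, assuming the key layout the parsing code implies (`<agent>/task:<taskId>:result`):

```typescript
// Key-filtering rule from getSummary above, in isolation. Assumes keys of the
// form "<agent>/task:<taskId>:result" (layout inferred from the parsing code).
function matchesTaskFilter(key: string, taskIds: Set<string>): boolean {
  const slashIdx = key.indexOf('/')
  const localKey = slashIdx === -1 ? key : key.slice(slashIdx + 1)
  if (!localKey.startsWith('task:') || !localKey.endsWith(':result')) return false
  const taskId = localKey.slice('task:'.length, localKey.length - ':result'.length)
  return taskIds.has(taskId)
}
```

Note the default-deny shape: anything that is not a task-result key (plain notes, plans) is dropped whenever a task filter is supplied.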
@@ -44,11 +44,15 @@
import type {
  AgentConfig,
  AgentRunResult,
  CoordinatorConfig,
  OrchestratorConfig,
  OrchestratorEvent,
  Task,
  TaskExecutionMetrics,
  TaskExecutionRecord,
  TaskStatus,
  TeamConfig,
  TeamInfo,
  TeamRunResult,
  TokenUsage,
} from '../types.js'

@@ -63,6 +67,8 @@ import { Team } from '../team/team.js'
import { TaskQueue } from '../task/queue.js'
import { createTask } from '../task/task.js'
import { Scheduler } from './scheduler.js'
import { TokenBudgetExceededError } from '../errors.js'
import { extractKeywords, keywordScore } from '../utils/keywords.js'

// ---------------------------------------------------------------------------
// Internal constants
@@ -70,8 +76,122 @@ import { Scheduler } from './scheduler.js'

const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }
const DEFAULT_MAX_CONCURRENCY = 5
const DEFAULT_MAX_DELEGATION_DEPTH = 3
const DEFAULT_MODEL = 'claude-opus-4-6'

// ---------------------------------------------------------------------------
// Short-circuit helpers (exported for testability)
// ---------------------------------------------------------------------------

/**
 * Regex patterns that indicate a goal requires multi-agent coordination.
 *
 * Each pattern targets a distinct complexity signal:
 * - Sequencing: "first … then", "step 1 / step 2", numbered lists
 * - Coordination: "collaborate", "coordinate", "review each other"
 * - Parallel work: "in parallel", "at the same time", "concurrently"
 * - Multi-phase: "phase", "stage", multiple distinct action verbs joined by connectives
 */
const COMPLEXITY_PATTERNS: RegExp[] = [
  // Explicit sequencing
  /\bfirst\b.{3,60}\bthen\b/i,
  /\bstep\s*\d/i,
  /\bphase\s*\d/i,
  /\bstage\s*\d/i,
  /^\s*\d+[\.\)]/m, // numbered list items ("1. …", "2) …")

  // Coordination language — must be an imperative directive aimed at the agents
  // ("collaborate with X", "coordinate the team", "agents should coordinate"),
  // not a descriptive use ("how does X coordinate with Y" / "what does collaboration mean").
  // Match either an explicit preposition or a noun-phrase that names a group.
  /\bcollaborat(?:e|ing)\b\s+(?:with|on|to)\b/i,
  /\bcoordinat(?:e|ing)\b\s+(?:with|on|across|between|the\s+(?:team|agents?|workers?|effort|work))\b/i,
  /\breview\s+each\s+other/i,
  /\bwork\s+together\b/i,

  // Parallel execution
  /\bin\s+parallel\b/i,
  /\bconcurrently\b/i,
  /\bat\s+the\s+same\s+time\b/i,

  // Multiple deliverables joined by connectives
  // Matches patterns like "build X, then deploy Y and test Z"
  /\b(?:build|create|implement|design|write|develop)\b.{5,80}\b(?:and|then)\b.{5,80}\b(?:build|create|implement|design|write|develop|test|review|deploy)\b/i,
]

/**
 * Maximum goal length (in characters) below which a goal *may* be simple.
 *
 * Goals longer than this threshold almost always contain enough detail to
 * warrant multi-agent decomposition. The value is generous — short-circuit
 * is meant for genuinely simple, single-action goals.
 */
const SIMPLE_GOAL_MAX_LENGTH = 200

/**
 * Determine whether a goal is simple enough to skip coordinator decomposition.
 *
 * A goal is considered "simple" when ALL of the following hold:
 * 1. Its length is ≤ {@link SIMPLE_GOAL_MAX_LENGTH}.
 * 2. It does not match any {@link COMPLEXITY_PATTERNS}.
 *
 * The complexity patterns are deliberately conservative — they only fire on
 * imperative coordination directives (e.g. "collaborate with the team",
 * "coordinate the workers"), so descriptive uses ("how do pods coordinate
 * state", "explain microservice collaboration") remain classified as simple.
 *
 * Exported for unit testing.
 */
export function isSimpleGoal(goal: string): boolean {
  if (goal.length > SIMPLE_GOAL_MAX_LENGTH) return false
  return !COMPLEXITY_PATTERNS.some((re) => re.test(goal))
}
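The classifier above reduces to a length cap plus a pattern scan. A trimmed-down, self-contained sketch (the real `COMPLEXITY_PATTERNS` set is larger than the four shown here):

```typescript
// Standalone sketch of the short-circuit classifier, using a reduced pattern
// list; the library's full COMPLEXITY_PATTERNS set covers more signals.
const PATTERNS: RegExp[] = [
  /\bfirst\b.{3,60}\bthen\b/i,
  /\bstep\s*\d/i,
  /\bin\s+parallel\b/i,
  /\bcollaborat(?:e|ing)\b\s+(?:with|on|to)\b/i,
]
const MAX_LEN = 200

function isSimpleGoalSketch(goal: string): boolean {
  if (goal.length > MAX_LEN) return false
  return !PATTERNS.some((re) => re.test(goal))
}
```

Keeping the threshold check first means pathological long inputs never reach the regex scan.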

/**
 * Select the best-matching agent for a goal using keyword affinity scoring.
 *
 * The scoring logic mirrors {@link Scheduler}'s `capability-match` strategy
 * exactly, including its asymmetric use of the agent's `model` field:
 *
 * - `agentKeywords` is computed from `name + systemPrompt + model` so that
 *   a goal which mentions a model name (e.g. "haiku") can boost an agent
 *   bound to that model.
 * - `agentText` (used for the reverse direction) is computed from
 *   `name + systemPrompt` only — model names should not bias the
 *   text-vs-goal-keywords match.
 *
 * The two-direction sum (`scoreA + scoreB`) ensures both "agent describes
 * goal" and "goal mentions agent capability" contribute to the final score.
 *
 * Exported for unit testing.
 */
export function selectBestAgent(goal: string, agents: AgentConfig[]): AgentConfig {
  if (agents.length <= 1) return agents[0]!

  const goalKeywords = extractKeywords(goal)

  let bestAgent = agents[0]!
  let bestScore = -1

  for (const agent of agents) {
    const agentText = `${agent.name} ${agent.systemPrompt ?? ''}`
    // Mirror Scheduler.capability-match: include `model` here only.
    const agentKeywords = extractKeywords(`${agent.name} ${agent.systemPrompt ?? ''} ${agent.model}`)

    const scoreA = keywordScore(agentText, goalKeywords)
    const scoreB = keywordScore(goal, agentKeywords)
    const score = scoreA + scoreB

    if (score > bestScore) {
      bestScore = score
      bestAgent = agent
    }
  }

  return bestAgent
}

// ---------------------------------------------------------------------------
// Internal helpers
// ---------------------------------------------------------------------------
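The two-direction scoring can be sketched end-to-end with naive keyword helpers. `extractKeywordsSketch` and `keywordScoreSketch` below are hypothetical stand-ins for the library's `extractKeywords` / `keywordScore`, whose exact tokenization is not shown in this diff:

```typescript
// Sketch of the two-direction affinity scoring; the keyword helpers are
// hypothetical stand-ins, not the library's real implementations.
interface AgentLike { name: string; systemPrompt?: string; model: string }

function extractKeywordsSketch(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z]{3,}/g) ?? [])
}

// Count how many of the given keywords appear in the text.
function keywordScoreSketch(text: string, keywords: Set<string>): number {
  const words = extractKeywordsSketch(text)
  let score = 0
  for (const kw of keywords) if (words.has(kw)) score++
  return score
}

function selectBestAgentSketch(goal: string, agents: AgentLike[]): AgentLike {
  if (agents.length <= 1) return agents[0]!
  const goalKeywords = extractKeywordsSketch(goal)
  let best = agents[0]!
  let bestScore = -1
  for (const agent of agents) {
    const agentText = `${agent.name} ${agent.systemPrompt ?? ''}`
    // Asymmetry from the real code: `model` only feeds the agent-keyword side.
    const agentKeywords = extractKeywordsSketch(`${agentText} ${agent.model}`)
    const score =
      keywordScoreSketch(agentText, goalKeywords) + keywordScoreSketch(goal, agentKeywords)
    if (score > bestScore) {
      bestScore = score
      best = agent
    }
  }
  return best
}
```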
@@ -83,14 +203,32 @@ function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
  }
}

function resolveTokenBudget(primary?: number, fallback?: number): number | undefined {
  if (primary === undefined) return fallback
  if (fallback === undefined) return primary
  return Math.min(primary, fallback)
}

/**
 * Build a minimal {@link Agent} with its own fresh registry/executor.
 * Registers all built-in tools so coordinator/worker agents can use them.
 * Pool workers pass `includeDelegateTool` so `delegate_to_agent` is available during `runTeam` / `runTasks`.
 */
function buildAgent(config: AgentConfig): Agent {
function buildAgent(
  config: AgentConfig,
  toolRegistration?: { readonly includeDelegateTool?: boolean },
): Agent {
  const registry = new ToolRegistry()
  registerBuiltInTools(registry)
  const executor = new ToolExecutor(registry)
  registerBuiltInTools(registry, toolRegistration)
  if (config.customTools) {
    for (const tool of config.customTools) {
      registry.register(tool, { runtimeAdded: true })
    }
  }
  const executor = new ToolExecutor(registry, {
    ...(config.maxToolOutputChars !== undefined
      ? { maxToolOutputChars: config.maxToolOutputChars }
      : {}),
  })
  return new Agent(config, registry, executor)
}
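The budget-resolution rule above is small but worth pinning down: the effective budget is the stricter (smaller) of the per-agent and orchestrator-wide limits, and `undefined` means "no limit on that side". Reproduced in isolation:

```typescript
// resolveTokenBudget from the hunk above, in isolation: take the stricter of
// two optional limits, where undefined means "unlimited" on that side.
function resolveTokenBudgetSketch(primary?: number, fallback?: number): number | undefined {
  if (primary === undefined) return fallback
  if (fallback === undefined) return primary
  return Math.min(primary, fallback)
}
```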
@@ -202,6 +340,10 @@ interface ParsedTaskSpec {
  description: string
  assignee?: string
  dependsOn?: string[]
  memoryScope?: 'dependencies' | 'all'
  maxRetries?: number
  retryDelayMs?: number
  retryBackoff?: number
}

/**

@@ -240,6 +382,10 @@ function parseTaskSpecs(raw: string): ParsedTaskSpec[] | null {
      dependsOn: Array.isArray(obj['dependsOn'])
        ? (obj['dependsOn'] as unknown[]).filter((x): x is string => typeof x === 'string')
        : undefined,
      memoryScope: obj['memoryScope'] === 'all' ? 'all' : undefined,
      maxRetries: typeof obj['maxRetries'] === 'number' ? obj['maxRetries'] : undefined,
      retryDelayMs: typeof obj['retryDelayMs'] === 'number' ? obj['retryDelayMs'] : undefined,
      retryBackoff: typeof obj['retryBackoff'] === 'number' ? obj['retryBackoff'] : undefined,
    })
  }
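The parsing hunk above type-checks every optional field from the coordinator's JSON before copying it into the spec, so a malformed value degrades to `undefined` instead of poisoning the task queue. A trimmed sketch of that narrowing (`TaskSpecSketch` is a reduced stand-in for `ParsedTaskSpec`):

```typescript
// Defensive narrowing of untrusted coordinator JSON, as in parseTaskSpecs
// above; TaskSpecSketch is a reduced stand-in for ParsedTaskSpec.
interface TaskSpecSketch {
  dependsOn?: string[]
  memoryScope?: 'dependencies' | 'all'
  maxRetries?: number
}

function narrowSpec(obj: Record<string, unknown>): TaskSpecSketch {
  return {
    // Keep only string elements; non-arrays become undefined entirely.
    dependsOn: Array.isArray(obj['dependsOn'])
      ? (obj['dependsOn'] as unknown[]).filter((x): x is string => typeof x === 'string')
      : undefined,
    // Anything other than the literal 'all' falls back to the default scope.
    memoryScope: obj['memoryScope'] === 'all' ? 'all' : undefined,
    maxRetries: typeof obj['maxRetries'] === 'number' ? obj['maxRetries'] : undefined,
  }
}
```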
@@ -264,6 +410,98 @@ interface RunContext {
  readonly config: OrchestratorConfig
  /** Trace run ID, present when `onTrace` is configured. */
  readonly runId?: string
  /** AbortSignal for run-level cancellation. Checked between task dispatch rounds. */
  readonly abortSignal?: AbortSignal
  cumulativeUsage: TokenUsage
  readonly maxTokenBudget?: number
  budgetExceededTriggered: boolean
  budgetExceededReason?: string
  readonly taskMetrics: Map<string, TaskExecutionMetrics>
}

/**
 * Build {@link TeamInfo} for tool context, including nested `runDelegatedAgent`
 * that respects pool capacity to avoid semaphore deadlocks.
 *
 * Delegation always builds a **fresh** Agent instance for the target and runs
 * it via `pool.runEphemeral` — the pool semaphore still gates total concurrency,
 * but the per-agent lock is bypassed. This matches `delegate_to_agent`'s "runs
 * in a fresh conversation for this prompt only" contract and prevents mutual
 * delegation (A→B while B→A) from deadlocking on each other's agent locks.
 */
function buildTaskAgentTeamInfo(
  ctx: RunContext,
  taskId: string,
  traceBase: Partial<RunOptions>,
  delegationDepth: number,
  delegationChain: readonly string[],
): TeamInfo {
  const sharedMem = ctx.team.getSharedMemoryInstance()
  const maxDepth = ctx.config.maxDelegationDepth
  const agentConfigs = ctx.team.getAgents()
  const agentNames = agentConfigs.map((a) => a.name)

  const runDelegatedAgent = async (targetAgent: string, prompt: string): Promise<AgentRunResult> => {
    const pool = ctx.pool
    if (pool.availableRunSlots < 1) {
      return {
        success: false,
        output:
          'Agent pool has no free concurrency slot for a delegated run (would deadlock). ' +
          'Increase maxConcurrency or reduce parallel delegation.',
        messages: [],
        tokenUsage: ZERO_USAGE,
        toolCalls: [],
      }
    }

    const targetConfig = agentConfigs.find((a) => a.name === targetAgent)
    if (!targetConfig) {
      return {
        success: false,
        output: `Unknown agent "${targetAgent}" — not in team roster [${agentNames.join(', ')}].`,
        messages: [],
        tokenUsage: ZERO_USAGE,
        toolCalls: [],
      }
    }

    // Apply orchestrator-level defaults just like buildPool, then construct a
    // one-shot Agent for this delegation only.
    const effective: AgentConfig = {
      ...targetConfig,
      provider: targetConfig.provider ?? ctx.config.defaultProvider,
      baseURL: targetConfig.baseURL ?? ctx.config.defaultBaseURL,
      apiKey: targetConfig.apiKey ?? ctx.config.defaultApiKey,
    }
    const tempAgent = buildAgent(effective, { includeDelegateTool: true })

    const nestedTeam = buildTaskAgentTeamInfo(
      ctx,
      taskId,
      traceBase,
      delegationDepth + 1,
      [...delegationChain, targetAgent],
    )
    const childOpts: Partial<RunOptions> = {
      ...traceBase,
      traceAgent: targetAgent,
      taskId,
      team: nestedTeam,
    }
    return pool.runEphemeral(tempAgent, prompt, childOpts)
  }

  return {
    name: ctx.team.name,
    agents: agentNames,
    ...(sharedMem ? { sharedMemory: sharedMem.getStore() } : {}),
    delegationDepth,
    maxDelegationDepth: maxDepth,
    delegationPool: ctx.pool,
    delegationChain,
    runDelegatedAgent,
  }
}

/**
@@ -295,6 +533,12 @@ async function executeQueue(
    : undefined

  while (true) {
    // Check for cancellation before each dispatch round.
    if (ctx.abortSignal?.aborted) {
      queue.skipRemaining('Skipped: run aborted.')
      break
    }

    // Re-run auto-assignment each iteration so tasks that were unblocked since
    // the last round (and thus have no assignee yet) get assigned before dispatch.
    scheduler.autoAssign(queue, team.getAgents())

@@ -355,19 +599,31 @@ async function executeQueue(
        data: task,
      } satisfies OrchestratorEvent)

      // Build the prompt: inject shared memory context + task description
      const prompt = await buildTaskPrompt(task, team)
      // Build the prompt: task description + dependency-only context by default.
      const prompt = await buildTaskPrompt(task, team, queue)

      // Build trace context for this task's agent run
      const traceOptions: Partial<RunOptions> | undefined = config.onTrace
        ? { onTrace: config.onTrace, runId: ctx.runId ?? '', taskId: task.id, traceAgent: assignee }
        : undefined
      // Trace + abort + team tool context (delegate_to_agent)
      const traceBase: Partial<RunOptions> = {
        ...(config.onTrace
          ? {
              onTrace: config.onTrace,
              runId: ctx.runId ?? '',
              taskId: task.id,
              traceAgent: assignee,
            }
          : {}),
        ...(ctx.abortSignal ? { abortSignal: ctx.abortSignal } : {}),
      }
      const runOptions: Partial<RunOptions> = {
        ...traceBase,
        team: buildTaskAgentTeamInfo(ctx, task.id, traceBase, 0, [assignee]),
      }

      const taskStartMs = config.onTrace ? Date.now() : 0
      const taskStartMs = Date.now()
      let retryCount = 0

      const result = await executeWithRetry(
        () => pool.run(assignee, prompt, traceOptions),
        () => pool.run(assignee, prompt, runOptions),
        task,
        (retryData) => {
          retryCount++

@@ -380,9 +636,10 @@ async function executeQueue(
        },
      )

      const taskEndMs = Date.now()

      // Emit task trace
      if (config.onTrace) {
        const taskEndMs = Date.now()
        emitTrace(config.onTrace, {
          type: 'task',
          runId: ctx.runId ?? '',
@@ -399,6 +656,31 @@ async function executeQueue(

      ctx.agentResults.set(`${assignee}:${task.id}`, result)

      ctx.taskMetrics.set(task.id, {
        startMs: taskStartMs,
        endMs: taskEndMs,
        durationMs: Math.max(0, taskEndMs - taskStartMs),
        tokenUsage: result.tokenUsage,
        toolCalls: result.toolCalls,
      })
      ctx.cumulativeUsage = addUsage(ctx.cumulativeUsage, result.tokenUsage)
      const totalTokens = ctx.cumulativeUsage.input_tokens + ctx.cumulativeUsage.output_tokens
      if (
        !ctx.budgetExceededTriggered
        && ctx.maxTokenBudget !== undefined
        && totalTokens > ctx.maxTokenBudget
      ) {
        ctx.budgetExceededTriggered = true
        const err = new TokenBudgetExceededError('orchestrator', totalTokens, ctx.maxTokenBudget)
        ctx.budgetExceededReason = err.message
        config.onProgress?.({
          type: 'budget_exceeded',
          agent: assignee,
          task: task.id,
          data: err,
        } satisfies OrchestratorEvent)
      }

      if (result.success) {
        // Persist result into shared memory so other agents can read it
        const sharedMem = team.getSharedMemoryInstance()
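The accounting above accumulates usage across tasks and fires the budget event exactly once, on the first task that pushes `input + output` tokens past the cap. A self-contained sketch of that once-only trigger:

```typescript
// Cumulative token-budget accounting as in the hunk above: usage accumulates
// per task, and the exceeded event fires at most once.
interface Usage { input_tokens: number; output_tokens: number }

function addUsageSketch(a: Usage, b: Usage): Usage {
  return {
    input_tokens: a.input_tokens + b.input_tokens,
    output_tokens: a.output_tokens + b.output_tokens,
  }
}

// Returns how many 'budget_exceeded' events would be emitted for this run.
function runBudgetCheck(perTask: Usage[], maxTokenBudget: number): number {
  let cumulative: Usage = { input_tokens: 0, output_tokens: 0 }
  let triggered = false
  let events = 0
  for (const usage of perTask) {
    cumulative = addUsageSketch(cumulative, usage)
    const total = cumulative.input_tokens + cumulative.output_tokens
    if (!triggered && total > maxTokenBudget) {
      triggered = true
      events++ // the real code emits a 'budget_exceeded' progress event here
    }
  }
  return events
}
```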
@@ -435,6 +717,10 @@ async function executeQueue(

    // Wait for the entire parallel batch before checking for newly-unblocked tasks.
    await Promise.all(dispatchPromises)
    if (ctx.budgetExceededTriggered) {
      queue.skipRemaining(ctx.budgetExceededReason ?? 'Skipped: token budget exceeded.')
      break
    }

    // --- Approval gate ---
    // After the batch completes, check if the caller wants to approve
@@ -468,22 +754,37 @@ async function executeQueue(
 *
 * Injects:
 * - Task title and description
 * - Dependency results from shared memory (if available)
 * - Direct dependency task results by default (clean slate when none)
 * - Optional full shared-memory context when `task.memoryScope === 'all'`
 * - Any messages addressed to this agent from the team bus
 */
async function buildTaskPrompt(task: Task, team: Team): Promise<string> {
async function buildTaskPrompt(task: Task, team: Team, queue: TaskQueue): Promise<string> {
  const lines: string[] = [
    `# Task: ${task.title}`,
    '',
    task.description,
  ]

  // Inject shared memory summary so the agent sees its teammates' work
  const sharedMem = team.getSharedMemoryInstance()
  if (sharedMem) {
    const summary = await sharedMem.getSummary()
    if (summary) {
      lines.push('', summary)
  if (task.memoryScope === 'all') {
    // Explicit opt-in for full visibility (legacy/shared-memory behavior).
    const sharedMem = team.getSharedMemoryInstance()
    if (sharedMem) {
      const summary = await sharedMem.getSummary()
      if (summary) {
        lines.push('', summary)
      }
    }
  } else if (task.dependsOn && task.dependsOn.length > 0) {
    // Default-deny: inject only explicit prerequisite outputs.
    const depResults: string[] = []
    for (const depId of task.dependsOn) {
      const depTask = queue.get(depId)
      if (depTask?.status === 'completed' && depTask.result) {
        depResults.push(`### ${depTask.title} (by ${depTask.assignee ?? 'unknown'})\n${depTask.result}`)
      }
    }
    if (depResults.length > 0) {
      lines.push('', '## Context from prerequisite tasks', '', ...depResults)
    }
  }
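The default-deny branch above only appends the results of completed prerequisite tasks to the prompt. A sketch of that context injection, with `TaskSketch` as a trimmed stand-in for the library's `Task` type and a plain `Map` in place of the `TaskQueue`:

```typescript
// Dependency-scoped prompt building, as in the default-deny branch above.
// TaskSketch is a trimmed stand-in for the library's Task type.
interface TaskSketch {
  id: string
  title: string
  description: string
  dependsOn?: string[]
  status: 'pending' | 'completed' | 'failed'
  assignee?: string
  result?: string
}

function buildTaskPromptSketch(task: TaskSketch, byId: Map<string, TaskSketch>): string {
  const lines = [`# Task: ${task.title}`, '', task.description]
  const depResults: string[] = []
  for (const depId of task.dependsOn ?? []) {
    const dep = byId.get(depId)
    // Only completed prerequisites with an actual result are injected.
    if (dep?.status === 'completed' && dep.result) {
      depResults.push(`### ${dep.title} (by ${dep.assignee ?? 'unknown'})\n${dep.result}`)
    }
  }
  if (depResults.length > 0) {
    lines.push('', '## Context from prerequisite tasks', '', ...depResults)
  }
  return lines.join('\n')
}
```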
@@ -513,8 +814,8 @@ async function buildTaskPrompt(task: Task, team: Team): Promise<string> {
 */
export class OpenMultiAgent {
  private readonly config: Required<
    Omit<OrchestratorConfig, 'onApproval' | 'onProgress' | 'onTrace' | 'defaultBaseURL' | 'defaultApiKey'>
  > & Pick<OrchestratorConfig, 'onApproval' | 'onProgress' | 'onTrace' | 'defaultBaseURL' | 'defaultApiKey'>
    Omit<OrchestratorConfig, 'onApproval' | 'onProgress' | 'onTrace' | 'defaultBaseURL' | 'defaultApiKey' | 'maxTokenBudget'>
  > & Pick<OrchestratorConfig, 'onApproval' | 'onProgress' | 'onTrace' | 'defaultBaseURL' | 'defaultApiKey' | 'maxTokenBudget'>

  private readonly teams: Map<string, Team> = new Map()
  private completedTaskCount = 0

@@ -524,16 +825,19 @@ export class OpenMultiAgent {
   *
   * Sensible defaults:
   * - `maxConcurrency`: 5
   * - `maxDelegationDepth`: 3
   * - `defaultModel`: `'claude-opus-4-6'`
   * - `defaultProvider`: `'anthropic'`
   */
  constructor(config: OrchestratorConfig = {}) {
    this.config = {
      maxConcurrency: config.maxConcurrency ?? DEFAULT_MAX_CONCURRENCY,
      maxDelegationDepth: config.maxDelegationDepth ?? DEFAULT_MAX_DELEGATION_DEPTH,
      defaultModel: config.defaultModel ?? DEFAULT_MODEL,
      defaultProvider: config.defaultProvider ?? 'anthropic',
      defaultBaseURL: config.defaultBaseURL,
      defaultApiKey: config.defaultApiKey,
      maxTokenBudget: config.maxTokenBudget,
      onApproval: config.onApproval,
      onProgress: config.onProgress,
      onTrace: config.onTrace,

@@ -580,12 +884,18 @@ export class OpenMultiAgent {
   * @param config - Agent configuration.
   * @param prompt - The user prompt to send.
   */
  async runAgent(config: AgentConfig, prompt: string): Promise<AgentRunResult> {
  async runAgent(
    config: AgentConfig,
    prompt: string,
    options?: { abortSignal?: AbortSignal },
  ): Promise<AgentRunResult> {
    const effectiveBudget = resolveTokenBudget(config.maxTokenBudget, this.config.maxTokenBudget)
    const effective: AgentConfig = {
      ...config,
      provider: config.provider ?? this.config.defaultProvider,
      baseURL: config.baseURL ?? this.config.defaultBaseURL,
      apiKey: config.apiKey ?? this.config.defaultApiKey,
      maxTokenBudget: effectiveBudget,
    }
    const agent = buildAgent(effective)
    this.config.onProgress?.({

@@ -594,11 +904,34 @@ export class OpenMultiAgent {
      data: { prompt },
    })

    const traceOptions: Partial<RunOptions> | undefined = this.config.onTrace
      ? { onTrace: this.config.onTrace, runId: generateRunId(), traceAgent: config.name }
      : undefined
    // Build run-time options: trace + optional abort signal. RunOptions has
    // readonly fields, so we assemble the literal in one shot.
    const traceFields = this.config.onTrace
      ? {
          onTrace: this.config.onTrace,
          runId: generateRunId(),
          traceAgent: config.name,
        }
      : null
    const abortFields = options?.abortSignal ? { abortSignal: options.abortSignal } : null
    const runOptions: Partial<RunOptions> | undefined =
      traceFields || abortFields
        ? { ...(traceFields ?? {}), ...(abortFields ?? {}) }
        : undefined

    const result = await agent.run(prompt, traceOptions)
    const result = await agent.run(prompt, runOptions)

    if (result.budgetExceeded) {
      this.config.onProgress?.({
        type: 'budget_exceeded',
        agent: config.name,
        data: new TokenBudgetExceededError(
          config.name,
          result.tokenUsage.input_tokens + result.tokenUsage.output_tokens,
          effectiveBudget ?? 0,
        ),
      })
    }

    this.config.onProgress?.({
      type: 'agent_complete',
@@ -638,20 +971,116 @@ export class OpenMultiAgent {
   * @param team - A team created via {@link createTeam} (or `new Team(...)`).
   * @param goal - High-level natural-language goal for the team.
   */
  async runTeam(team: Team, goal: string): Promise<TeamRunResult> {
  async runTeam(
    team: Team,
    goal: string,
    options?: { abortSignal?: AbortSignal; coordinator?: CoordinatorConfig },
  ): Promise<TeamRunResult> {
    const agentConfigs = team.getAgents()
    const coordinatorOverrides = options?.coordinator

    // ------------------------------------------------------------------
    // Short-circuit: skip coordinator for simple, single-action goals.
    //
    // When the goal is short and contains no multi-step / coordination
    // signals, dispatching it to a single agent is faster and cheaper
    // than spinning up a coordinator for decomposition + synthesis.
    //
    // The best-matching agent is selected via keyword affinity scoring
    // (same algorithm as the `capability-match` scheduler strategy).
    // ------------------------------------------------------------------
    if (agentConfigs.length > 0 && isSimpleGoal(goal)) {
      const bestAgent = selectBestAgent(goal, agentConfigs)

      // Use buildAgent() + agent.run() directly instead of this.runAgent()
      // to avoid duplicate progress events and double completedTaskCount.
      // Events are emitted here; counting is handled by buildTeamRunResult().
      const effectiveBudget = resolveTokenBudget(bestAgent.maxTokenBudget, this.config.maxTokenBudget)
      const effective: AgentConfig = {
        ...bestAgent,
        provider: bestAgent.provider ?? this.config.defaultProvider,
        baseURL: bestAgent.baseURL ?? this.config.defaultBaseURL,
        apiKey: bestAgent.apiKey ?? this.config.defaultApiKey,
        maxTokenBudget: effectiveBudget,
      }
      const agent = buildAgent(effective)

      this.config.onProgress?.({
        type: 'agent_start',
        agent: bestAgent.name,
        data: { phase: 'short-circuit', goal },
      })

      const traceFields = this.config.onTrace
        ? { onTrace: this.config.onTrace, runId: generateRunId(), traceAgent: bestAgent.name }
        : null
      const abortFields = options?.abortSignal ? { abortSignal: options.abortSignal } : null
      const runOptions: Partial<RunOptions> | undefined =
        traceFields || abortFields
          ? { ...(traceFields ?? {}), ...(abortFields ?? {}) }
          : undefined

      const scStartMs = Date.now()
      const result = await agent.run(goal, runOptions)
      const scEndMs = Date.now()

      if (result.budgetExceeded) {
        this.config.onProgress?.({
          type: 'budget_exceeded',
          agent: bestAgent.name,
          data: new TokenBudgetExceededError(
            bestAgent.name,
            result.tokenUsage.input_tokens + result.tokenUsage.output_tokens,
            effectiveBudget ?? 0,
          ),
        })
      }

      this.config.onProgress?.({
        type: 'agent_complete',
        agent: bestAgent.name,
        data: { phase: 'short-circuit', result },
      })

      const agentResults = new Map<string, AgentRunResult>()
      agentResults.set(bestAgent.name, result)

      const tasks: readonly TaskExecutionRecord[] = [{
        id: 'short-circuit',
        title: `Short-circuit: ${bestAgent.name}`,
        assignee: bestAgent.name,
        status: result.success ? 'completed' : 'failed',
        dependsOn: [],
        metrics: {
          startMs: scStartMs,
          endMs: scEndMs,
          durationMs: Math.max(0, scEndMs - scStartMs),
          tokenUsage: result.tokenUsage,
          toolCalls: result.toolCalls,
        },
      }]
      return this.buildTeamRunResult(agentResults, goal, tasks)
    }

    // ------------------------------------------------------------------
    // Step 1: Coordinator decomposes goal into tasks
    // ------------------------------------------------------------------
    const coordinatorConfig: AgentConfig = {
      name: 'coordinator',
      model: this.config.defaultModel,
      provider: this.config.defaultProvider,
      baseURL: this.config.defaultBaseURL,
      apiKey: this.config.defaultApiKey,
      systemPrompt: this.buildCoordinatorSystemPrompt(agentConfigs),
      maxTurns: 3,
      model: coordinatorOverrides?.model ?? this.config.defaultModel,
      provider: coordinatorOverrides?.provider ?? this.config.defaultProvider,
      baseURL: coordinatorOverrides?.baseURL ?? this.config.defaultBaseURL,
      apiKey: coordinatorOverrides?.apiKey ?? this.config.defaultApiKey,
      systemPrompt: this.buildCoordinatorPrompt(agentConfigs, coordinatorOverrides),
      maxTurns: coordinatorOverrides?.maxTurns ?? 3,
      maxTokens: coordinatorOverrides?.maxTokens,
      temperature: coordinatorOverrides?.temperature,
      toolPreset: coordinatorOverrides?.toolPreset,
      tools: coordinatorOverrides?.tools,
      disallowedTools: coordinatorOverrides?.disallowedTools,
      loopDetection: coordinatorOverrides?.loopDetection,
      timeoutMs: coordinatorOverrides?.timeoutMs,
    }

    const decompositionPrompt = this.buildDecompositionPrompt(goal, agentConfigs)
@@ -665,11 +1094,29 @@ export class OpenMultiAgent {
    })

    const decompTraceOptions: Partial<RunOptions> | undefined = this.config.onTrace
      ? { onTrace: this.config.onTrace, runId: runId ?? '', traceAgent: 'coordinator' }
      : undefined
      ? { onTrace: this.config.onTrace, runId: runId ?? '', traceAgent: 'coordinator', abortSignal: options?.abortSignal }
      : options?.abortSignal ? { abortSignal: options.abortSignal } : undefined
    const decompositionResult = await coordinatorAgent.run(decompositionPrompt, decompTraceOptions)
    const agentResults = new Map<string, AgentRunResult>()
    agentResults.set('coordinator:decompose', decompositionResult)
    const maxTokenBudget = this.config.maxTokenBudget
    let cumulativeUsage = addUsage(ZERO_USAGE, decompositionResult.tokenUsage)

    if (
      maxTokenBudget !== undefined
      && cumulativeUsage.input_tokens + cumulativeUsage.output_tokens > maxTokenBudget
    ) {
      this.config.onProgress?.({
        type: 'budget_exceeded',
        agent: 'coordinator',
        data: new TokenBudgetExceededError(
          'coordinator',
          cumulativeUsage.input_tokens + cumulativeUsage.output_tokens,
          maxTokenBudget,
        ),
      })
      return this.buildTeamRunResult(agentResults, goal, [])
    }

    // ------------------------------------------------------------------
    // Step 2: Parse tasks from coordinator output

@@ -678,6 +1125,7 @@ export class OpenMultiAgent {

    const queue = new TaskQueue()
    const scheduler = new Scheduler('dependency-first')
    const taskMetrics = new Map<string, TaskExecutionMetrics>()

    if (taskSpecs && taskSpecs.length > 0) {
      // Map title-based dependsOn references to real task IDs so we can
@@ -712,19 +1160,55 @@ export class OpenMultiAgent {
      agentResults,
      config: this.config,
      runId,
      abortSignal: options?.abortSignal,
      cumulativeUsage,
      maxTokenBudget,
      budgetExceededTriggered: false,
      budgetExceededReason: undefined,
      taskMetrics,
    }

    await executeQueue(queue, ctx)
    cumulativeUsage = ctx.cumulativeUsage
    const taskRecords: readonly TaskExecutionRecord[] = queue.list().map((task) => ({
      id: task.id,
      title: task.title,
      assignee: task.assignee,
      status: task.status,
      dependsOn: task.dependsOn ?? [],
      metrics: taskMetrics.get(task.id),
    }))

    // ------------------------------------------------------------------
    // Step 5: Coordinator synthesises final result
    // ------------------------------------------------------------------
    if (
      maxTokenBudget !== undefined
      && cumulativeUsage.input_tokens + cumulativeUsage.output_tokens > maxTokenBudget
    ) {
      return this.buildTeamRunResult(agentResults, goal, taskRecords)
    }
    const synthesisPrompt = await this.buildSynthesisPrompt(goal, queue.list(), team)
    const synthTraceOptions: Partial<RunOptions> | undefined = this.config.onTrace
      ? { onTrace: this.config.onTrace, runId: runId ?? '', traceAgent: 'coordinator' }
      : undefined
    const synthesisResult = await coordinatorAgent.run(synthesisPrompt, synthTraceOptions)
    agentResults.set('coordinator', synthesisResult)
    cumulativeUsage = addUsage(cumulativeUsage, synthesisResult.tokenUsage)
    if (
      maxTokenBudget !== undefined
      && cumulativeUsage.input_tokens + cumulativeUsage.output_tokens > maxTokenBudget
    ) {
      this.config.onProgress?.({
        type: 'budget_exceeded',
        agent: 'coordinator',
        data: new TokenBudgetExceededError(
          'coordinator',
          cumulativeUsage.input_tokens + cumulativeUsage.output_tokens,
          maxTokenBudget,
        ),
      })
    }

    this.config.onProgress?.({
      type: 'agent_complete',

@@ -736,7 +1220,7 @@ export class OpenMultiAgent {
    // Only actual user tasks (non-coordinator keys) are counted in
    // buildTeamRunResult, so we do not increment completedTaskCount here.

    return this.buildTeamRunResult(agentResults)
    return this.buildTeamRunResult(agentResults, goal, taskRecords)
  }

  // -------------------------------------------------------------------------

@@ -760,10 +1244,12 @@ export class OpenMultiAgent {
    description: string
    assignee?: string
    dependsOn?: string[]
    memoryScope?: 'dependencies' | 'all'
    maxRetries?: number
    retryDelayMs?: number
retryBackoff?: number
|
||||
}>,
|
||||
options?: { abortSignal?: AbortSignal },
|
||||
): Promise<TeamRunResult> {
|
||||
const agentConfigs = team.getAgents()
|
||||
const queue = new TaskQueue()
|
||||
|
|
@ -775,6 +1261,7 @@ export class OpenMultiAgent {
|
|||
description: t.description,
|
||||
assignee: t.assignee,
|
||||
dependsOn: t.dependsOn,
|
||||
memoryScope: t.memoryScope,
|
||||
maxRetries: t.maxRetries,
|
||||
retryDelayMs: t.retryDelayMs,
|
||||
retryBackoff: t.retryBackoff,
|
||||
|
|
@ -794,11 +1281,26 @@ export class OpenMultiAgent {
|
|||
agentResults,
|
||||
config: this.config,
|
||||
runId: this.config.onTrace ? generateRunId() : undefined,
|
||||
abortSignal: options?.abortSignal,
|
||||
cumulativeUsage: ZERO_USAGE,
|
||||
maxTokenBudget: this.config.maxTokenBudget,
|
||||
budgetExceededTriggered: false,
|
||||
budgetExceededReason: undefined,
|
||||
taskMetrics: new Map<string, TaskExecutionMetrics>(),
|
||||
}
|
||||
|
||||
await executeQueue(queue, ctx)
|
||||
|
||||
return this.buildTeamRunResult(agentResults)
|
||||
const taskRecords: readonly TaskExecutionRecord[] = queue.list().map((task) => ({
|
||||
id: task.id,
|
||||
title: task.title,
|
||||
assignee: task.assignee,
|
||||
status: task.status,
|
||||
dependsOn: task.dependsOn ?? [],
|
||||
metrics: ctx.taskMetrics.get(task.id),
|
||||
}))
|
||||
|
||||
return this.buildTeamRunResult(agentResults, undefined, taskRecords)
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
|
|
@ -845,6 +1347,47 @@ export class OpenMultiAgent {
|
|||
|
||||
/** Build the system prompt given to the coordinator agent. */
|
||||
private buildCoordinatorSystemPrompt(agents: AgentConfig[]): string {
|
||||
return [
|
||||
'You are a task coordinator responsible for decomposing high-level goals',
|
||||
'into concrete, actionable tasks and assigning them to the right team members.',
|
||||
'',
|
||||
this.buildCoordinatorRosterSection(agents),
|
||||
'',
|
||||
this.buildCoordinatorOutputFormatSection(),
|
||||
'',
|
||||
this.buildCoordinatorSynthesisSection(),
|
||||
].join('\n')
|
||||
}
|
||||
|
||||
/** Build coordinator system prompt with optional caller overrides. */
|
||||
private buildCoordinatorPrompt(agents: AgentConfig[], config?: CoordinatorConfig): string {
|
||||
if (config?.systemPrompt) {
|
||||
return [
|
||||
config.systemPrompt,
|
||||
'',
|
||||
this.buildCoordinatorRosterSection(agents),
|
||||
'',
|
||||
this.buildCoordinatorOutputFormatSection(),
|
||||
'',
|
||||
this.buildCoordinatorSynthesisSection(),
|
||||
].join('\n')
|
||||
}
|
||||
|
||||
const base = this.buildCoordinatorSystemPrompt(agents)
|
||||
if (!config?.instructions) {
|
||||
return base
|
||||
}
|
||||
|
||||
return [
|
||||
base,
|
||||
'',
|
||||
'## Additional Instructions',
|
||||
config.instructions,
|
||||
].join('\n')
|
||||
}
|
||||
|
||||
/** Build the coordinator team roster section. */
|
||||
private buildCoordinatorRosterSection(agents: AgentConfig[]): string {
|
||||
const roster = agents
|
||||
.map(
|
||||
(a) =>
|
||||
|
|
@ -853,12 +1396,14 @@ export class OpenMultiAgent {
|
|||
.join('\n')
|
||||
|
||||
return [
|
||||
'You are a task coordinator responsible for decomposing high-level goals',
|
||||
'into concrete, actionable tasks and assigning them to the right team members.',
|
||||
'',
|
||||
'## Team Roster',
|
||||
roster,
|
||||
'',
|
||||
].join('\n')
|
||||
}
|
||||
|
||||
/** Build the coordinator JSON output-format section. */
|
||||
private buildCoordinatorOutputFormatSection(): string {
|
||||
return [
|
||||
'## Output Format',
|
||||
'When asked to decompose a goal, respond ONLY with a JSON array of task objects.',
|
||||
'Each task must have:',
|
||||
|
|
@ -869,7 +1414,12 @@ export class OpenMultiAgent {
|
|||
'',
|
||||
'Wrap the JSON in a ```json code fence.',
|
||||
'Do not include any text outside the code fence.',
|
||||
'',
|
||||
].join('\n')
|
||||
}
|
||||
|
||||
/** Build the coordinator synthesis guidance section. */
|
||||
private buildCoordinatorSynthesisSection(): string {
|
||||
return [
|
||||
'## When synthesising results',
|
||||
'You will be given completed task outputs and asked to synthesise a final answer.',
|
||||
'Write a clear, comprehensive response that addresses the original goal.',
|
||||
|
|
@ -943,6 +1493,7 @@ export class OpenMultiAgent {
|
|||
*/
|
||||
private loadSpecsIntoQueue(
|
||||
specs: ReadonlyArray<ParsedTaskSpec & {
|
||||
memoryScope?: 'dependencies' | 'all'
|
||||
maxRetries?: number
|
||||
retryDelayMs?: number
|
||||
retryBackoff?: number
|
||||
|
|
@ -963,6 +1514,7 @@ export class OpenMultiAgent {
|
|||
assignee: spec.assignee && agentNames.has(spec.assignee)
|
||||
? spec.assignee
|
||||
: undefined,
|
||||
memoryScope: spec.memoryScope,
|
||||
maxRetries: spec.maxRetries,
|
||||
retryDelayMs: spec.retryDelayMs,
|
||||
retryBackoff: spec.retryBackoff,
|
||||
|
|
@ -1011,7 +1563,7 @@ export class OpenMultiAgent {
|
|||
baseURL: config.baseURL ?? this.config.defaultBaseURL,
|
||||
apiKey: config.apiKey ?? this.config.defaultApiKey,
|
||||
}
|
||||
pool.add(buildAgent(effective))
|
||||
pool.add(buildAgent(effective, { includeDelegateTool: true }))
|
||||
}
|
||||
return pool
|
||||
}
|
||||
|
|
@ -1027,6 +1579,8 @@ export class OpenMultiAgent {
|
|||
*/
|
||||
private buildTeamRunResult(
|
||||
agentResults: Map<string, AgentRunResult>,
|
||||
goal?: string,
|
||||
tasks?: readonly TaskExecutionRecord[],
|
||||
): TeamRunResult {
|
||||
let totalUsage: TokenUsage = ZERO_USAGE
|
||||
let overallSuccess = true
|
||||
|
|
@ -1064,6 +1618,8 @@ export class OpenMultiAgent {
|
|||
|
||||
return {
|
||||
success: overallSuccess,
|
||||
goal,
|
||||
tasks,
|
||||
agentResults: collapsed,
|
||||
totalTokenUsage: totalUsage,
|
||||
}
|
||||
|
|
|
|||
|
|
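The budget check repeated around each coordinator step above can be exercised in isolation. This sketch assumes the `TokenUsage` field names shown in the diff; `addUsage`, `ZERO_USAGE`, and `budgetExceeded` here are minimal stand-ins for the library's helpers, not its actual API.

```typescript
// Minimal model of the cumulative token-budget check, assuming the
// { input_tokens, output_tokens } usage shape seen in the diff.
interface TokenUsage {
  input_tokens: number
  output_tokens: number
}

const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }

// Assumed counterpart of the library's addUsage helper: field-wise sum.
function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
  return {
    input_tokens: a.input_tokens + b.input_tokens,
    output_tokens: a.output_tokens + b.output_tokens,
  }
}

// True only when a budget is configured and total spend strictly exceeds it,
// mirroring the `maxTokenBudget !== undefined && input + output > budget` guard.
function budgetExceeded(usage: TokenUsage, maxTokenBudget: number | undefined): boolean {
  return (
    maxTokenBudget !== undefined
    && usage.input_tokens + usage.output_tokens > maxTokenBudget
  )
}

const cumulative = addUsage(ZERO_USAGE, { input_tokens: 900, output_tokens: 200 })
```

Note the guard is strict (`>`), so landing exactly on the budget does not trip it, and an `undefined` budget disables the check entirely.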
@@ -15,6 +15,7 @@

import type { AgentConfig, Task } from '../types.js'
import type { TaskQueue } from '../task/queue.js'
+import { extractKeywords, keywordScore } from '../utils/keywords.js'

// ---------------------------------------------------------------------------
// Public types

@@ -74,38 +75,6 @@ function countBlockedDependents(taskId: string, allTasks: Task[]): number {
  return visited.size
}

-/**
- * Compute a simple keyword-overlap score between `text` and `keywords`.
- *
- * Both the text and keywords are normalised to lower-case before comparison.
- * Each keyword that appears in the text contributes +1 to the score.
- */
-function keywordScore(text: string, keywords: string[]): number {
-  const lower = text.toLowerCase()
-  return keywords.reduce((acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0), 0)
-}
-
-/**
- * Extract a list of meaningful keywords from a string for capability matching.
- *
- * Strips common stop-words so that incidental matches (e.g. "the", "and") do
- * not inflate scores. Returns unique words longer than three characters.
- */
-function extractKeywords(text: string): string[] {
-  const STOP_WORDS = new Set([
-    'the', 'and', 'for', 'that', 'this', 'with', 'are', 'from', 'have',
-    'will', 'your', 'you', 'can', 'all', 'each', 'when', 'then', 'they',
-    'them', 'their', 'about', 'into', 'more', 'also', 'should', 'must',
-  ])
-
-  return [...new Set(
-    text
-      .toLowerCase()
-      .split(/\W+/)
-      .filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
-  )]
-}
-
// ---------------------------------------------------------------------------
// Scheduler
// ---------------------------------------------------------------------------

|
|||
return this.list().filter((t) => t.status === status)
|
||||
}
|
||||
|
||||
/** Returns a task by ID, if present. */
|
||||
get(taskId: string): Task | undefined {
|
||||
return this.tasks.get(taskId)
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns `true` when every task in the queue has reached a terminal state
|
||||
* (`'completed'`, `'failed'`, or `'skipped'`), **or** the queue is empty.
|
||||
|
|
|
|||
|
|
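The settled-queue rule documented in this hunk can be stated as a standalone predicate. The status union here is a hypothetical minimal subset; only the three terminal values come from the doc comment above.

```typescript
// Hypothetical minimal status union; 'completed' | 'failed' | 'skipped'
// are the terminal states named in the TaskQueue doc comment.
type TaskStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'skipped'

const TERMINAL: ReadonlySet<TaskStatus> = new Set(['completed', 'failed', 'skipped'])

// True when every task is terminal, or the list is empty (Array.every on
// an empty array is vacuously true, matching the "or the queue is empty" rule).
function allSettled(statuses: TaskStatus[]): boolean {
  return statuses.every((s) => TERMINAL.has(s))
}
```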
@@ -31,6 +31,7 @@ export function createTask(input: {
  description: string
  assignee?: string
  dependsOn?: string[]
+ memoryScope?: 'dependencies' | 'all'
  maxRetries?: number
  retryDelayMs?: number
  retryBackoff?: number

@@ -43,6 +44,7 @@ export function createTask(input: {
    status: 'pending' as TaskStatus,
    assignee: input.assignee,
    dependsOn: input.dependsOn ? [...input.dependsOn] : undefined,
+   memoryScope: input.memoryScope,
    result: undefined,
    createdAt: now,
    updatedAt: now,

@@ -0,0 +1,109 @@
/**
 * @fileoverview Built-in `delegate_to_agent` tool for synchronous handoff to a roster agent.
 */

import { z } from 'zod'
import type { ToolDefinition, ToolResult, ToolUseContext } from '../../types.js'

const inputSchema = z.object({
  target_agent: z.string().min(1).describe('Name of the team agent to run the sub-task.'),
  prompt: z.string().min(1).describe('Instructions / question for the target agent.'),
})

/**
 * Delegates a sub-task to another agent on the team and returns that agent's final text output.
 *
 * Only available when the orchestrator injects {@link ToolUseContext.team} with
 * `runDelegatedAgent` (pool-backed `runTeam` / `runTasks`). Standalone `runAgent`
 * does not register this tool by default.
 *
 * Nested {@link AgentRunResult.tokenUsage} from the delegated run is surfaced via
 * {@link ToolResult.metadata} so the parent runner can aggregate it into its total
 * (keeps `maxTokenBudget` accurate across delegation chains).
 */
export const delegateToAgentTool: ToolDefinition<z.infer<typeof inputSchema>> = {
  name: 'delegate_to_agent',
  description:
    'Run a sub-task on another agent from this team and return that agent\'s final answer as the tool result. ' +
    'Use when you need a specialist teammate to produce output you will incorporate. ' +
    'The target agent runs in a fresh conversation for this prompt only.',
  inputSchema,
  async execute(
    { target_agent: targetAgent, prompt },
    context: ToolUseContext,
  ): Promise<ToolResult> {
    const team = context.team
    if (!team?.runDelegatedAgent) {
      return {
        data:
          'delegate_to_agent is only available during orchestrated team runs with the delegation tool enabled. ' +
          'Use SharedMemory or explicit tasks instead.',
        isError: true,
      }
    }

    if (targetAgent === context.agent.name) {
      return {
        data: 'Cannot delegate to yourself; use another team member.',
        isError: true,
      }
    }

    if (!team.agents.includes(targetAgent)) {
      return {
        data: `Unknown agent "${targetAgent}". Roster: ${team.agents.join(', ')}`,
        isError: true,
      }
    }

    const chain = team.delegationChain ?? []
    if (chain.includes(targetAgent)) {
      return {
        data:
          `Delegation cycle detected: ${[...chain, targetAgent].join(' -> ')}. ` +
          'Pick a different target or restructure the plan.',
        isError: true,
      }
    }

    const depth = team.delegationDepth ?? 0
    const maxDepth = team.maxDelegationDepth ?? 3
    if (depth >= maxDepth) {
      return {
        data: `Maximum delegation depth (${maxDepth}) reached; cannot delegate further.`,
        isError: true,
      }
    }

    if (team.delegationPool !== undefined && team.delegationPool.availableRunSlots < 1) {
      return {
        data:
          'Agent pool has no free concurrency slot for a delegated run (nested run would block indefinitely). ' +
          'Increase orchestrator maxConcurrency, wait for parallel work to finish, or avoid delegating while the pool is saturated.',
        isError: true,
      }
    }

    const result = await team.runDelegatedAgent(targetAgent, prompt)

    if (team.sharedMemory) {
      const suffix = `${Date.now()}-${Math.random().toString(36).slice(2, 10)}`
      const key = `delegation:${targetAgent}:${suffix}`
      try {
        await team.sharedMemory.set(`${context.agent.name}/${key}`, result.output, {
          agent: context.agent.name,
          delegatedTo: targetAgent,
          success: String(result.success),
        })
      } catch {
        // Audit is best-effort; do not fail the tool on store errors.
      }
    }

    return {
      data: result.output,
      isError: !result.success,
      metadata: { tokenUsage: result.tokenUsage },
    }
  },
}

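The guard sequence in `execute` above (self, roster, cycle, depth) is pure input validation, so it can be modelled without the tool machinery. `DelegationState` and `delegationError` below are hypothetical names for this sketch; the real tool reads the equivalent fields from `ToolUseContext.team`.

```typescript
// Hypothetical minimal projection of the fields delegate_to_agent checks.
interface DelegationState {
  self: string        // context.agent.name
  roster: string[]    // team.agents
  chain: string[]     // team.delegationChain ?? []
  depth: number       // team.delegationDepth ?? 0
  maxDepth: number    // team.maxDelegationDepth ?? 3
}

// Returns the first failing guard's message, or undefined when delegation
// may proceed, in the same order the tool applies them.
function delegationError(target: string, s: DelegationState): string | undefined {
  if (target === s.self) {
    return 'Cannot delegate to yourself; use another team member.'
  }
  if (!s.roster.includes(target)) {
    return `Unknown agent "${target}". Roster: ${s.roster.join(', ')}`
  }
  if (s.chain.includes(target)) {
    return `Delegation cycle detected: ${[...s.chain, target].join(' -> ')}.`
  }
  if (s.depth >= s.maxDepth) {
    return `Maximum delegation depth (${s.maxDepth}) reached; cannot delegate further.`
  }
  return undefined
}
```

Checking the chain before the depth means an A → B → A loop is reported as a cycle rather than burning depth budget first, which gives the model a more actionable error.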
@@ -0,0 +1,97 @@
/**
 * Shared recursive directory walk for built-in file tools.
 *
 * Used by {@link grepTool} and {@link globTool} so glob filtering and skip
 * rules stay consistent.
 */

import { readdir, stat } from 'fs/promises'
import { join } from 'path'

/** Directories that are almost never useful to traverse for code search. */
export const SKIP_DIRS = new Set([
  '.git',
  '.svn',
  '.hg',
  'node_modules',
  '.next',
  'dist',
  'build',
])

export interface CollectFilesOptions {
  /** When set, stop collecting once this many paths are gathered. */
  readonly maxFiles?: number
}

/**
 * Recursively walk `dir` and return file paths, honouring {@link SKIP_DIRS}
 * and an optional filename glob pattern.
 */
export async function collectFiles(
  dir: string,
  glob: string | undefined,
  signal: AbortSignal | undefined,
  options?: CollectFilesOptions,
): Promise<string[]> {
  const results: string[] = []
  await walk(dir, glob, results, signal, options?.maxFiles)
  return results
}

async function walk(
  dir: string,
  glob: string | undefined,
  results: string[],
  signal: AbortSignal | undefined,
  maxFiles: number | undefined,
): Promise<void> {
  if (signal?.aborted === true) return
  if (maxFiles !== undefined && results.length >= maxFiles) return

  let entryNames: string[]
  try {
    entryNames = await readdir(dir, { encoding: 'utf8' })
  } catch {
    return
  }

  for (const entryName of entryNames) {
    if (signal !== undefined && signal.aborted) return
    if (maxFiles !== undefined && results.length >= maxFiles) return

    const fullPath = join(dir, entryName)

    let entryInfo: Awaited<ReturnType<typeof stat>>
    try {
      entryInfo = await stat(fullPath)
    } catch {
      continue
    }

    if (entryInfo.isDirectory()) {
      if (!SKIP_DIRS.has(entryName)) {
        await walk(fullPath, glob, results, signal, maxFiles)
      }
    } else if (entryInfo.isFile()) {
      if (glob === undefined || matchesGlob(entryName, glob)) {
        results.push(fullPath)
      }
    }
  }
}

/**
 * Minimal glob match supporting `*.ext` and `**\/<pattern>` forms.
 */
export function matchesGlob(filename: string, glob: string): boolean {
  const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
  const regexSource = pattern
    .replace(/[.+^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '.*')
    .replace(/\?/g, '.')
  const re = new RegExp(`^${regexSource}$`, 'i')
  return re.test(filename)
}

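`matchesGlob` is given in full above, so its behaviour is easy to pin down with a few cases. This standalone copy only exercises it; note that the regex-escape pass runs before the `*` and `?` substitutions, so literal dots in the pattern stay literal.

```typescript
// Verbatim copy of matchesGlob from fs-walk, for demonstration only.
function matchesGlob(filename: string, glob: string): boolean {
  // A leading **/ is stripped because the walker already recurses everywhere.
  const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
  const regexSource = pattern
    .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters first
    .replace(/\*/g, '.*')                 // * -> .*
    .replace(/\?/g, '.')                  // ? -> .
  const re = new RegExp(`^${regexSource}$`, 'i')
  return re.test(filename)
}
```

Because the pattern is anchored (`^…$`) and matched against the bare filename, `*.ts` does not match `index.tsx`, and the `i` flag makes matching case-insensitive.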
@@ -0,0 +1,99 @@
/**
 * Built-in glob tool.
 *
 * Lists file paths under a directory matching an optional filename glob.
 * Does not read file contents — use {@link grepTool} to search inside files.
 */

import { stat } from 'fs/promises'
import { basename, relative } from 'path'
import { z } from 'zod'
import type { ToolResult } from '../../types.js'
import { collectFiles, matchesGlob } from './fs-walk.js'
import { defineTool } from '../framework.js'

const DEFAULT_MAX_FILES = 500

export const globTool = defineTool({
  name: 'glob',
  description:
    'List file paths under a directory that match an optional filename glob. ' +
    'Does not read file contents — use `grep` to search inside files. ' +
    'Skips common bulky directories (node_modules, .git, dist, etc.). ' +
    'Paths in the result are relative to the process working directory. ' +
    'Results are capped by `maxFiles`.',

  inputSchema: z.object({
    path: z
      .string()
      .optional()
      .describe(
        'Directory to list files under. Defaults to the current working directory.',
      ),
    pattern: z
      .string()
      .optional()
      .describe(
        'Filename glob (e.g. "*.ts", "**/*.json"). When omitted, every file ' +
        'under the directory is listed (subject to maxFiles and skipped dirs).',
      ),
    maxFiles: z
      .number()
      .int()
      .positive()
      .optional()
      .describe(
        `Maximum number of file paths to return. Defaults to ${DEFAULT_MAX_FILES}.`,
      ),
  }),

  execute: async (input, context): Promise<ToolResult> => {
    const root = input.path ?? process.cwd()
    const maxFiles = input.maxFiles ?? DEFAULT_MAX_FILES
    const signal = context.abortSignal

    let linesOut: string[]
    let truncated = false

    try {
      const info = await stat(root)
      if (info.isFile()) {
        const name = basename(root)
        if (
          input.pattern !== undefined &&
          !matchesGlob(name, input.pattern)
        ) {
          return { data: 'No files matched.', isError: false }
        }
        linesOut = [relative(process.cwd(), root) || root]
      } else {
        const collected = await collectFiles(root, input.pattern, signal, {
          maxFiles: maxFiles + 1,
        })
        truncated = collected.length > maxFiles
        const capped = collected.slice(0, maxFiles)
        linesOut = capped.map((f) => relative(process.cwd(), f) || f)
      }
    } catch (err) {
      const message = err instanceof Error ? err.message : 'Unknown error'
      return {
        data: `Cannot access path "${root}": ${message}`,
        isError: true,
      }
    }

    if (linesOut.length === 0) {
      return { data: 'No files matched.', isError: false }
    }

    const sorted = [...linesOut].sort((a, b) => a.localeCompare(b))
    const truncationNote = truncated
      ? `\n\n(listing capped at ${maxFiles} paths; raise maxFiles for more)`
      : ''

    return {
      data: sorted.join('\n') + truncationNote,
      isError: false,
    }
  },
})

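Worth noting in the `execute` above: the tool asks `collectFiles` for `maxFiles + 1` paths, so a single surplus result proves the listing was capped without walking the whole tree. The idiom in isolation, over plain arrays:

```typescript
// Over-fetch-by-one capping: request max + 1 items from a producer, then
// use the surplus element (if any) purely as a "there was more" signal.
function capWithTruncationFlag<T>(
  collected: T[], // at most max + 1 items from the producer
  max: number,
): { items: T[]; truncated: boolean } {
  return {
    items: collected.slice(0, max),
    truncated: collected.length > max,
  }
}
```

The alternative, counting every match just to set a flag, would defeat the purpose of the walker's early-exit `maxFiles` check.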
@@ -8,28 +8,18 @@
 */

import { spawn } from 'child_process'
-import { readdir, readFile, stat } from 'fs/promises'
-// Note: readdir is used with { encoding: 'utf8' } to return string[] directly.
-import { join, relative } from 'path'
+import { readFile, stat } from 'fs/promises'
+import { relative } from 'path'
import { z } from 'zod'
import type { ToolResult } from '../../types.js'
import { defineTool } from '../framework.js'
+import { collectFiles } from './fs-walk.js'

// ---------------------------------------------------------------------------
// Constants
// ---------------------------------------------------------------------------

const DEFAULT_MAX_RESULTS = 100
-// Directories that are almost never useful to search inside
-const SKIP_DIRS = new Set([
-  '.git',
-  '.svn',
-  '.hg',
-  'node_modules',
-  '.next',
-  'dist',
-  'build',
-])

// ---------------------------------------------------------------------------
// Tool definition

@@ -42,6 +32,7 @@ export const grepTool = defineTool({
    'Returns matching lines with their file paths and 1-based line numbers. ' +
    'Use the `glob` parameter to restrict the search to specific file types ' +
    '(e.g. "*.ts"). ' +
+   'To list matching file paths without reading contents, use the `glob` tool. ' +
    'Results are capped by `maxResults` to keep the response manageable.',

  inputSchema: z.object({

@@ -270,79 +261,6 @@ async function runNodeSearch(
  }
}

-// ---------------------------------------------------------------------------
-// File collection with glob filtering
-// ---------------------------------------------------------------------------
-
-/**
- * Recursively walk `dir` and return file paths, honouring `SKIP_DIRS` and an
- * optional glob pattern.
- */
-async function collectFiles(
-  dir: string,
-  glob: string | undefined,
-  signal: AbortSignal | undefined,
-): Promise<string[]> {
-  const results: string[] = []
-  await walk(dir, glob, results, signal)
-  return results
-}
-
-async function walk(
-  dir: string,
-  glob: string | undefined,
-  results: string[],
-  signal: AbortSignal | undefined,
-): Promise<void> {
-  if (signal?.aborted === true) return
-
-  let entryNames: string[]
-  try {
-    // Read as plain strings so we don't have to deal with Buffer Dirent variants.
-    entryNames = await readdir(dir, { encoding: 'utf8' })
-  } catch {
-    return
-  }
-
-  for (const entryName of entryNames) {
-    if (signal !== undefined && signal.aborted) return
-
-    const fullPath = join(dir, entryName)
-
-    let entryInfo: Awaited<ReturnType<typeof stat>>
-    try {
-      entryInfo = await stat(fullPath)
-    } catch {
-      continue
-    }
-
-    if (entryInfo.isDirectory()) {
-      if (!SKIP_DIRS.has(entryName)) {
-        await walk(fullPath, glob, results, signal)
-      }
-    } else if (entryInfo.isFile()) {
-      if (glob === undefined || matchesGlob(entryName, glob)) {
-        results.push(fullPath)
-      }
-    }
-  }
-}
-
-/**
- * Minimal glob match supporting `*.ext` and `**\/<pattern>` forms.
- */
-function matchesGlob(filename: string, glob: string): boolean {
-  // Strip leading **/ prefix — we already recurse into all directories
-  const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
-  // Convert shell glob characters to regex equivalents
-  const regexSource = pattern
-    .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape special regex chars first
-    .replace(/\*/g, '.*') // * -> .*
-    .replace(/\?/g, '.') // ? -> .
-  const re = new RegExp(`^${regexSource}$`, 'i')
-  return re.test(filename)
-}
-
// ---------------------------------------------------------------------------
// ripgrep availability check (cached per process)
// ---------------------------------------------------------------------------

@@ -8,12 +8,23 @@
import type { ToolDefinition } from '../../types.js'
import { ToolRegistry } from '../framework.js'
import { bashTool } from './bash.js'
+import { delegateToAgentTool } from './delegate.js'
import { fileEditTool } from './file-edit.js'
import { fileReadTool } from './file-read.js'
import { fileWriteTool } from './file-write.js'
+import { globTool } from './glob.js'
import { grepTool } from './grep.js'

-export { bashTool, fileEditTool, fileReadTool, fileWriteTool, grepTool }
+export { bashTool, delegateToAgentTool, fileEditTool, fileReadTool, fileWriteTool, globTool, grepTool }

+/** Options for {@link registerBuiltInTools}. */
+export interface RegisterBuiltInToolsOptions {
+  /**
+   * When true, registers `delegate_to_agent` (team orchestration handoff).
+   * Default false so standalone agents and `runAgent` do not expose a tool that always errors.
+   */
+  readonly includeDelegateTool?: boolean
+}

/**
 * The ordered list of all built-in tools. Import this when you need to

@@ -29,6 +40,13 @@ export const BUILT_IN_TOOLS: ToolDefinition<any>[] = [
  fileWriteTool,
  fileEditTool,
  grepTool,
+ globTool,
]

+/** All built-ins including `delegate_to_agent` (for team registry setup). */
+export const ALL_BUILT_IN_TOOLS_WITH_DELEGATE: ToolDefinition<any>[] = [
+  ...BUILT_IN_TOOLS,
+  delegateToAgentTool,
+]

/**

@@ -43,8 +61,14 @@ export const BUILT_IN_TOOLS: ToolDefinition<any>[] = [
 * registerBuiltInTools(registry)
 * ```
 */
-export function registerBuiltInTools(registry: ToolRegistry): void {
+export function registerBuiltInTools(
+  registry: ToolRegistry,
+  options?: RegisterBuiltInToolsOptions,
+): void {
  for (const tool of BUILT_IN_TOOLS) {
    registry.register(tool)
  }
+ if (options?.includeDelegateTool) {
+   registry.register(delegateToAgentTool)
+ }
}

@@ -24,6 +24,11 @@ export interface ToolExecutorOptions {
   * Defaults to 4.
   */
  maxConcurrency?: number
+ /**
+  * Agent-level default for maximum tool output length in characters.
+  * Per-tool `maxOutputChars` takes priority over this value.
+  */
+ maxToolOutputChars?: number
}

/** Describes one call in a batch. */

@@ -47,10 +52,12 @@ export interface BatchToolCall {
export class ToolExecutor {
  private readonly registry: ToolRegistry
  private readonly semaphore: Semaphore
+ private readonly maxToolOutputChars?: number

  constructor(registry: ToolRegistry, options: ToolExecutorOptions = {}) {
    this.registry = registry
    this.semaphore = new Semaphore(options.maxConcurrency ?? 4)
+   this.maxToolOutputChars = options.maxToolOutputChars
  }

  // -------------------------------------------------------------------------

@@ -156,7 +163,7 @@ export class ToolExecutor {
    // --- Execute ---
    try {
      const result = await tool.execute(parseResult.data, context)
-     return result
+     return this.maybeTruncate(tool, result)
    } catch (err) {
      const message =
        err instanceof Error

@@ -164,10 +171,26 @@ export class ToolExecutor {
          : typeof err === 'string'
            ? err
            : JSON.stringify(err)
-     return this.errorResult(`Tool "${tool.name}" threw an error: ${message}`)
+     return this.maybeTruncate(tool, this.errorResult(`Tool "${tool.name}" threw an error: ${message}`))
    }
  }

+ /**
+  * Apply truncation to a tool result if a character limit is configured.
+  * Priority: per-tool `maxOutputChars` > agent-level `maxToolOutputChars`.
+  */
+ private maybeTruncate(
+   // eslint-disable-next-line @typescript-eslint/no-explicit-any
+   tool: ToolDefinition<any>,
+   result: ToolResult,
+ ): ToolResult {
+   const maxChars = tool.maxOutputChars ?? this.maxToolOutputChars
+   if (maxChars === undefined || maxChars <= 0 || result.data.length <= maxChars) {
+     return result
+   }
+   return { ...result, data: truncateToolOutput(result.data, maxChars) }
+ }

  /** Construct an error ToolResult. */
  private errorResult(message: string): ToolResult {
    return {

@@ -176,3 +199,37 @@ export class ToolExecutor {
    }
  }
}

+// ---------------------------------------------------------------------------
+// Truncation helper
+// ---------------------------------------------------------------------------
+
+/**
+ * Truncate tool output to fit within `maxChars`, preserving the head (~70%)
+ * and tail (~30%) with a marker indicating how many characters were removed.
+ *
+ * The marker itself is counted against the budget so the returned string
+ * never exceeds `maxChars`. When `maxChars` is too small to fit any
+ * content alongside the marker, a marker-only string is returned.
+ */
+export function truncateToolOutput(data: string, maxChars: number): string {
+  if (data.length <= maxChars) return data
+
+  // Estimate marker length (digit count may shrink after subtracting content,
+  // but using data.length gives a safe upper-bound for the digit count).
+  const markerTemplate = '\n\n[...truncated characters...]\n\n'
+  const markerOverhead = markerTemplate.length + String(data.length).length
+
+  // When maxChars is too small to fit any content alongside the marker,
+  // fall back to a hard slice so the result never exceeds maxChars.
+  if (maxChars <= markerOverhead) {
+    return data.slice(0, maxChars)
+  }
+
+  const available = maxChars - markerOverhead
+  const headChars = Math.floor(available * 0.7)
+  const tailChars = available - headChars
+  const truncatedCount = data.length - headChars - tailChars
+
+  return `${data.slice(0, headChars)}\n\n[...truncated ${truncatedCount} characters...]\n\n${data.slice(-tailChars)}`
+}

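`truncateToolOutput` appears in full in the hunk above, so its budget arithmetic can be checked concretely. With `maxChars = 100` and 1000 characters of input, the marker overhead estimate is 36 (the 32-character template plus 4 digits for `String(1000)`), leaving 64 content characters split 44 head / 20 tail, with 936 reported as removed. This copy only exercises that behaviour:

```typescript
// Verbatim copy of the truncation helper, for demonstration only.
function truncateToolOutput(data: string, maxChars: number): string {
  if (data.length <= maxChars) return data

  // Upper-bound estimate of the marker size, using the full data length's
  // digit count (the real truncated count can only have fewer digits).
  const markerTemplate = '\n\n[...truncated characters...]\n\n'
  const markerOverhead = markerTemplate.length + String(data.length).length

  // Degenerate budgets get a hard head slice instead of head + marker + tail.
  if (maxChars <= markerOverhead) {
    return data.slice(0, maxChars)
  }

  const available = maxChars - markerOverhead
  const headChars = Math.floor(available * 0.7)   // ~70% head
  const tailChars = available - headChars         // ~30% tail
  const truncatedCount = data.length - headChars - tailChars

  return `${data.slice(0, headChars)}\n\n[...truncated ${truncatedCount} characters...]\n\n${data.slice(-tailChars)}`
}
```

Keeping both ends matters for tool output: the head usually carries the command or query echo, while the tail carries exit status or summary lines.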
@@ -72,12 +72,28 @@ export function defineTool<TInput>(config: {
  name: string
  description: string
  inputSchema: ZodSchema<TInput>
+ /**
+  * Optional JSON Schema for the LLM (bypasses Zod → JSON Schema conversion).
+  */
+ llmInputSchema?: Record<string, unknown>
+ /**
+  * Per-tool maximum output length in characters. When set, tool output
+  * exceeding this limit is truncated (head + tail with a marker in between).
+  * Takes priority over agent-level `maxToolOutputChars`.
+  */
+ maxOutputChars?: number
  execute: (input: TInput, context: ToolUseContext) => Promise<ToolResult>
}): ToolDefinition<TInput> {
  return {
    name: config.name,
    description: config.description,
    inputSchema: config.inputSchema,
+   ...(config.llmInputSchema !== undefined
+     ? { llmInputSchema: config.llmInputSchema }
+     : {}),
+   ...(config.maxOutputChars !== undefined
+     ? { maxOutputChars: config.maxOutputChars }
+     : {}),
    execute: config.execute,
  }
}

@@ -93,13 +109,17 @@ export function defineTool<TInput>(config: {
export class ToolRegistry {
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  private readonly tools = new Map<string, ToolDefinition<any>>()
+ private readonly runtimeToolNames = new Set<string>()

  /**
   * Add a tool to the registry. Throws if a tool with the same name has
   * already been registered — prevents silent overwrites.
   */
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
- register(tool: ToolDefinition<any>): void {
+ register(
+   tool: ToolDefinition<any>,
+   options?: { runtimeAdded?: boolean },
+ ): void {
    if (this.tools.has(tool.name)) {
      throw new Error(
        `ToolRegistry: a tool named "${tool.name}" is already registered. ` +

@@ -107,6 +127,9 @@ export class ToolRegistry {
      )
    }
    this.tools.set(tool.name, tool)
    if (options?.runtimeAdded === true) {
|
||||
this.runtimeToolNames.add(tool.name)
|
||||
}
|
||||
}
|
||||
|
||||
/** Return a tool by name, or `undefined` if not found. */
|
||||
|
|
@ -147,11 +170,12 @@ export class ToolRegistry {
|
|||
*/
|
||||
unregister(name: string): void {
|
||||
this.tools.delete(name)
|
||||
this.runtimeToolNames.delete(name)
|
||||
}
|
||||
|
||||
/** Alias for {@link unregister} — available for symmetry with `register`. */
|
||||
deregister(name: string): void {
|
||||
this.tools.delete(name)
|
||||
this.unregister(name)
|
||||
}
|
||||
|
||||
/**
|
||||
|
|
@ -161,7 +185,8 @@ export class ToolRegistry {
|
|||
*/
|
||||
toToolDefs(): LLMToolDef[] {
|
||||
return Array.from(this.tools.values()).map((tool) => {
|
||||
const schema = zodToJsonSchema(tool.inputSchema)
|
||||
const schema =
|
||||
tool.llmInputSchema ?? zodToJsonSchema(tool.inputSchema)
|
||||
return {
|
||||
name: tool.name,
|
||||
description: tool.description,
|
||||
|
|
@ -170,6 +195,14 @@ export class ToolRegistry {
|
|||
})
|
||||
}
|
||||
|
||||
/**
|
||||
* Return only tools that were added dynamically at runtime (e.g. via
|
||||
* `agent.addTool()`), in LLM definition format.
|
||||
*/
|
||||
toRuntimeToolDefs(): LLMToolDef[] {
|
||||
return this.toToolDefs().filter(tool => this.runtimeToolNames.has(tool.name))
|
||||
}
|
||||
|
||||
/**
|
||||
* Convert all registered tools to the Anthropic-style `input_schema`
|
||||
* format. Prefer {@link toToolDefs} for normal use; this method is exposed
|
||||
|
|
@ -178,13 +211,20 @@ export class ToolRegistry {
|
|||
toLLMTools(): Array<{
|
||||
name: string
|
||||
description: string
|
||||
input_schema: {
|
||||
type: 'object'
|
||||
properties: Record<string, JSONSchemaProperty>
|
||||
required?: string[]
|
||||
}
|
||||
/** Anthropic-style tool input JSON Schema (`type` is usually `object`). */
|
||||
input_schema: Record<string, unknown>
|
||||
}> {
|
||||
return Array.from(this.tools.values()).map((tool) => {
|
||||
if (tool.llmInputSchema !== undefined) {
|
||||
return {
|
||||
name: tool.name,
|
||||
description: tool.description,
|
||||
input_schema: {
|
||||
type: 'object' as const,
|
||||
...(tool.llmInputSchema as Record<string, unknown>),
|
||||
},
|
||||
}
|
||||
}
|
||||
const schema = zodToJsonSchema(tool.inputSchema)
|
||||
return {
|
||||
name: tool.name,
|
||||
|
|
|
|||
|
|
@@ -0,0 +1,296 @@
import { z } from 'zod'
import { defineTool } from './framework.js'
import type { ToolDefinition } from '../types.js'

interface MCPToolDescriptor {
  name: string
  description?: string
  /** MCP tool JSON Schema; same shape LLM APIs expect for object parameters. */
  inputSchema?: Record<string, unknown>
}

interface MCPListToolsResponse {
  tools?: MCPToolDescriptor[]
  nextCursor?: string
}

interface MCPCallToolResponse {
  content?: Array<Record<string, unknown>>
  structuredContent?: unknown
  isError?: boolean
  toolResult?: unknown
}

interface MCPClientLike {
  connect(transport: unknown, options?: { timeout?: number; signal?: AbortSignal }): Promise<void>
  listTools(
    params?: { cursor?: string },
    options?: { timeout?: number; signal?: AbortSignal },
  ): Promise<MCPListToolsResponse>
  callTool(
    request: { name: string; arguments: Record<string, unknown> },
    resultSchema?: unknown,
    options?: { timeout?: number; signal?: AbortSignal },
  ): Promise<MCPCallToolResponse>
  close?: () => Promise<void>
}

type MCPClientConstructor = new (
  info: { name: string; version: string },
  options: { capabilities: Record<string, unknown> },
) => MCPClientLike

type StdioTransportConstructor = new (config: {
  command: string
  args?: string[]
  env?: Record<string, string | undefined>
  cwd?: string
}) => { close?: () => Promise<void> }

interface MCPModules {
  Client: MCPClientConstructor
  StdioClientTransport: StdioTransportConstructor
}

const DEFAULT_MCP_REQUEST_TIMEOUT_MS = 60_000

async function loadMCPModules(): Promise<MCPModules> {
  const [{ Client }, { StdioClientTransport }] = await Promise.all([
    import('@modelcontextprotocol/sdk/client/index.js') as Promise<{
      Client: MCPClientConstructor
    }>,
    import('@modelcontextprotocol/sdk/client/stdio.js') as Promise<{
      StdioClientTransport: StdioTransportConstructor
    }>,
  ])
  return { Client, StdioClientTransport }
}

export interface ConnectMCPToolsConfig {
  command: string
  args?: string[]
  env?: Record<string, string | undefined>
  cwd?: string
  /**
   * Optional segment prepended to MCP tool names for the framework tool (and LLM) name.
   * Example: prefix `github` + MCP tool `search_issues` → `github_search_issues`.
   */
  namePrefix?: string
  /**
   * Timeout (ms) for MCP connect and each `tools/list` page. Defaults to 60000.
   */
  requestTimeoutMs?: number
  /**
   * Client metadata sent to the MCP server.
   */
  clientName?: string
  clientVersion?: string
}

export interface ConnectedMCPTools {
  tools: ToolDefinition[]
  disconnect: () => Promise<void>
}

/**
 * Build an LLM-safe tool name: MCP and prior examples used `prefix/name`, but
 * Anthropic and other providers reject `/` in tool names.
 */
function normalizeToolName(rawName: string, namePrefix?: string): string {
  const trimmedPrefix = namePrefix?.trim()
  const base =
    trimmedPrefix !== undefined && trimmedPrefix !== ''
      ? `${trimmedPrefix}_${rawName}`
      : rawName
  return base.replace(/\//g, '_')
}
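The naming rules are simple to sanity-check in isolation. This is a standalone copy of `normalizeToolName` from the listing above, exercised against the cases the docstrings describe:

```typescript
// Standalone copy of normalizeToolName from the listing above.
function normalizeToolName(rawName: string, namePrefix?: string): string {
  const trimmedPrefix = namePrefix?.trim()
  const base =
    trimmedPrefix !== undefined && trimmedPrefix !== ''
      ? `${trimmedPrefix}_${rawName}`
      : rawName
  return base.replace(/\//g, '_')
}

console.log(normalizeToolName('search_issues', 'github')) // github_search_issues
console.log(normalizeToolName('fs/read'))                 // fs_read
console.log(normalizeToolName('list', '  '))              // list (whitespace-only prefix ignored)
```

Note the slash replacement runs after prefixing, so a slash inside the prefix itself would also be rewritten to an underscore.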

/** MCP `tools/list` JSON Schema; forwarded to the LLM as-is (runtime validation stays `z.any()`). */
function mcpLlmInputSchema(
  schema: Record<string, unknown> | undefined,
): Record<string, unknown> {
  if (schema !== undefined && typeof schema === 'object' && !Array.isArray(schema)) {
    return schema
  }
  return { type: 'object' }
}

function contentBlockToText(block: Record<string, unknown>): string | undefined {
  const typ = block.type
  if (typ === 'text' && typeof block.text === 'string') {
    return block.text
  }
  if (typ === 'image' && typeof block.data === 'string') {
    const mime =
      typeof block.mimeType === 'string' ? block.mimeType : 'image/*'
    return `[image ${mime}; base64 length=${block.data.length}]`
  }
  if (typ === 'audio' && typeof block.data === 'string') {
    const mime =
      typeof block.mimeType === 'string' ? block.mimeType : 'audio/*'
    return `[audio ${mime}; base64 length=${block.data.length}]`
  }
  if (
    typ === 'resource' &&
    block.resource !== null &&
    typeof block.resource === 'object'
  ) {
    const r = block.resource as Record<string, unknown>
    const uri = typeof r.uri === 'string' ? r.uri : ''
    if (typeof r.text === 'string') {
      return `[resource ${uri}]\n${r.text}`
    }
    if (typeof r.blob === 'string') {
      const mime = typeof r.mimeType === 'string' ? r.mimeType : ''
      return `[resource ${uri}; mimeType=${mime}; blob base64 length=${r.blob.length}]`
    }
    return `[resource ${uri}]`
  }
  if (typ === 'resource_link') {
    const uri = typeof block.uri === 'string' ? block.uri : ''
    const name = typeof block.name === 'string' ? block.name : ''
    const desc =
      typeof block.description === 'string' ? block.description : ''
    const head = `[resource_link name=${JSON.stringify(name)} uri=${JSON.stringify(uri)}]`
    return desc === '' ? head : `${head}\n${desc}`
  }
  return undefined
}

function toToolResultData(result: MCPCallToolResponse): string {
  if ('toolResult' in result && result.toolResult !== undefined) {
    try {
      return JSON.stringify(result.toolResult, null, 2)
    } catch {
      return String(result.toolResult)
    }
  }

  const lines: string[] = []
  for (const block of result.content ?? []) {
    if (block === null || typeof block !== 'object') continue
    const rec = block as Record<string, unknown>
    const line = contentBlockToText(rec)
    if (line !== undefined) {
      lines.push(line)
      continue
    }
    try {
      lines.push(
        `[${String(rec.type ?? 'unknown')}]\n${JSON.stringify(rec, null, 2)}`,
      )
    } catch {
      lines.push('[mcp content block]')
    }
  }

  if (lines.length > 0) {
    return lines.join('\n')
  }

  if (result.structuredContent !== undefined) {
    try {
      return JSON.stringify(result.structuredContent, null, 2)
    } catch {
      return String(result.structuredContent)
    }
  }

  try {
    return JSON.stringify(result)
  } catch {
    return 'MCP tool completed with non-text output.'
  }
}

async function listAllMcpTools(
  client: MCPClientLike,
  requestOpts: { timeout: number },
): Promise<MCPToolDescriptor[]> {
  const acc: MCPToolDescriptor[] = []
  let cursor: string | undefined
  do {
    const page = await client.listTools(
      cursor !== undefined ? { cursor } : {},
      requestOpts,
    )
    acc.push(...(page.tools ?? []))
    cursor =
      typeof page.nextCursor === 'string' && page.nextCursor !== ''
        ? page.nextCursor
        : undefined
  } while (cursor !== undefined)
  return acc
}
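The cursor loop in `listAllMcpTools` keeps fetching until a page returns no (or an empty) `nextCursor`. A minimal sketch of the same draining loop against a stubbed two-page listing (the `ToolDesc`/`ListPage` shapes and stub data here are illustrative, not part of the package):

```typescript
interface ToolDesc { name: string }
interface ListPage { tools?: ToolDesc[]; nextCursor?: string }

// Same cursor-draining shape as listAllMcpTools, against a stub lister.
async function listAll(
  listTools: (params: { cursor?: string }) => Promise<ListPage>,
): Promise<ToolDesc[]> {
  const acc: ToolDesc[] = []
  let cursor: string | undefined
  do {
    const page = await listTools(cursor !== undefined ? { cursor } : {})
    acc.push(...(page.tools ?? []))
    cursor =
      typeof page.nextCursor === 'string' && page.nextCursor !== ''
        ? page.nextCursor
        : undefined
  } while (cursor !== undefined)
  return acc
}

// Stub server: the first page hands back a cursor, the second ends with ''.
const pages: Record<string, ListPage> = {
  start: { tools: [{ name: 'search' }], nextCursor: 'p2' },
  p2: { tools: [{ name: 'fetch' }], nextCursor: '' },
}
listAll(async ({ cursor }) => pages[cursor ?? 'start']).then((tools) =>
  console.log(tools.map((t) => t.name)), // ['search', 'fetch']
)
```

Treating an empty-string cursor the same as an absent one matters: a server that terminates with `nextCursor: ''` would otherwise loop forever.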

/**
 * Connect to an MCP server over stdio and convert exposed MCP tools into
 * open-multi-agent ToolDefinitions.
 */
export async function connectMCPTools(
  config: ConnectMCPToolsConfig,
): Promise<ConnectedMCPTools> {
  const { Client, StdioClientTransport } = await loadMCPModules()

  const transport = new StdioClientTransport({
    command: config.command,
    args: config.args ?? [],
    env: config.env,
    cwd: config.cwd,
  })

  const client = new Client(
    {
      name: config.clientName ?? 'open-multi-agent',
      version: config.clientVersion ?? '0.0.0',
    },
    { capabilities: {} },
  )

  const requestOpts = {
    timeout: config.requestTimeoutMs ?? DEFAULT_MCP_REQUEST_TIMEOUT_MS,
  }

  await client.connect(transport, requestOpts)

  const mcpTools = await listAllMcpTools(client, requestOpts)

  const tools: ToolDefinition[] = mcpTools.map((tool) =>
    defineTool({
      name: normalizeToolName(tool.name, config.namePrefix),
      description: tool.description ?? `MCP tool: ${tool.name}`,
      inputSchema: z.any(),
      llmInputSchema: mcpLlmInputSchema(tool.inputSchema),
      execute: async (input: Record<string, unknown>) => {
        try {
          const result = await client.callTool(
            {
              name: tool.name,
              arguments: input,
            },
            undefined,
            requestOpts,
          )
          return {
            data: toToolResultData(result),
            isError: result.isError === true,
          }
        } catch (error) {
          const message =
            error instanceof Error ? error.message : String(error)
          return {
            data: `MCP tool "${tool.name}" failed: ${message}`,
            isError: true,
          }
        }
      },
    }),
  )

  return {
    tools,
    disconnect: async () => {
      await client.close?.()
    },
  }
}
202 src/types.ts
@@ -65,6 +65,31 @@ export interface LLMMessage {
  readonly content: ContentBlock[]
}

/** Context management strategy for long-running agent conversations. */
export type ContextStrategy =
  | { type: 'sliding-window'; maxTurns: number }
  | { type: 'summarize'; maxTokens: number; summaryModel?: string }
  | {
      type: 'compact'
      /** Estimated token threshold that triggers compaction. Compaction is skipped when below this. */
      maxTokens: number
      /** Number of recent turn pairs (assistant+user) to keep intact. Default: 4. */
      preserveRecentTurns?: number
      /** Minimum chars in a tool_result content to qualify for compaction. Default: 200. */
      minToolResultChars?: number
      /** Minimum chars in an assistant text block to qualify for truncation. Default: 2000. */
      minTextBlockChars?: number
      /** Maximum chars to keep from a truncated text block (head excerpt). Default: 200. */
      textBlockExcerptChars?: number
    }
  | {
      type: 'custom'
      compress: (
        messages: LLMMessage[],
        estimatedTokens: number,
      ) => Promise<LLMMessage[]> | LLMMessage[]
    }
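The `custom` variant hands the whole conversation to user code, so any policy is expressible. The simplest possible hook is a sliding window that keeps the original task plus the most recent messages. This sketch uses a stripped-down local message shape for illustration only; the real hook receives the package's `LLMMessage` type:

```typescript
// Minimal local message shape for illustration only.
interface Msg { role: 'user' | 'assistant'; content: string }

// A 'custom'-style compress hook: keep the first message (the original task)
// plus the most recent `keep` messages, dropping everything in between.
function makeWindowCompress(keep: number) {
  return (messages: Msg[], _estimatedTokens: number): Msg[] => {
    if (messages.length <= keep + 1) return messages
    return [messages[0], ...messages.slice(-keep)]
  }
}

const compress = makeWindowCompress(2)
const history: Msg[] = [
  { role: 'user', content: 'task' },
  { role: 'assistant', content: 'step 1' },
  { role: 'user', content: 'feedback' },
  { role: 'assistant', content: 'step 2' },
]
console.log(compress(history, 9999).map((m) => m.content)) // ['task', 'feedback', 'step 2']
```

A real hook would likely consult `estimatedTokens` before dropping anything; this one ignores it to keep the shape obvious.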

/** Token accounting for a single API call. */
export interface TokenUsage {
  readonly input_tokens: number

@@ -90,11 +115,12 @@ export interface LLMResponse {
 * - `text` — incremental text delta
 * - `tool_use` — the model has begun or completed a tool-use block
 * - `tool_result` — a tool result has been appended to the stream
 * - `budget_exceeded` — token budget threshold reached for this run
 * - `done` — the stream has ended; `data` is the final {@link LLMResponse}
 * - `error` — an unrecoverable error occurred; `data` is an `Error`
 */
export interface StreamEvent {
- readonly type: 'text' | 'tool_use' | 'tool_result' | 'loop_detected' | 'done' | 'error'
+ readonly type: 'text' | 'tool_use' | 'tool_result' | 'loop_detected' | 'budget_exceeded' | 'done' | 'error'
  readonly data: unknown
}

@@ -152,29 +178,78 @@ export interface AgentInfo {
  readonly model: string
}

- /** Descriptor for a team of agents with shared memory. */
+ /**
+  * Minimal pool surface used by `delegate_to_agent` to detect nested-run capacity.
+  * {@link AgentPool} satisfies this structurally via {@link AgentPool.availableRunSlots}.
+  */
+ export interface DelegationPoolView {
+   readonly availableRunSlots: number
+ }
+
+ /** Descriptor for a team of agents (orchestrator-injected into tool context). */
export interface TeamInfo {
  readonly name: string
  readonly agents: readonly string[]
- readonly sharedMemory: MemoryStore
+ /** When the team has shared memory enabled; used for delegation audit writes. */
+ readonly sharedMemory?: MemoryStore
  /** Zero-based depth of nested delegation from the root task run. */
  readonly delegationDepth?: number
  readonly maxDelegationDepth?: number
  readonly delegationPool?: DelegationPoolView
  /**
   * Ordered chain of agent names from the root task to the current agent.
   * Used to block `A -> B -> A` cycles before they burn turns against `maxDelegationDepth`.
   */
  readonly delegationChain?: readonly string[]
  /**
   * Run another roster agent to completion and return its result.
   * Only set during orchestrated pool execution (`runTeam` / `runTasks`).
   */
  readonly runDelegatedAgent?: (targetAgent: string, prompt: string) => Promise<AgentRunResult>
}

/**
 * Optional side-channel metadata a tool may attach to its result.
 * Not shown to the LLM — the runner reads it for accounting purposes.
 */
export interface ToolResultMetadata {
  /**
   * Token usage consumed inside the tool execution itself (e.g. nested LLM
   * calls from `delegate_to_agent`). Accumulated into the parent runner's
   * total so budgets/cost tracking stay accurate across delegation.
   */
  readonly tokenUsage?: TokenUsage
}

/** Value returned by a tool's `execute` function. */
export interface ToolResult {
  readonly data: string
  readonly isError?: boolean
  readonly metadata?: ToolResultMetadata
}

/**
 * A tool registered with the framework.
 *
 * `inputSchema` is a Zod schema used for validation before `execute` is called.
- * At API call time it is converted to JSON Schema via {@link LLMToolDef}.
+ * At API call time it is converted to JSON Schema for {@link LLMToolDef}, unless
+ * `llmInputSchema` is set (e.g. MCP tools ship JSON Schema from the server).
 */
export interface ToolDefinition<TInput = Record<string, unknown>> {
  readonly name: string
  readonly description: string
  readonly inputSchema: ZodSchema<TInput>
  /**
   * When present, used as {@link LLMToolDef.inputSchema} as-is instead of
   * deriving JSON Schema from `inputSchema` (Zod).
   */
  readonly llmInputSchema?: Record<string, unknown>
  /**
   * Per-tool maximum output length in characters. When set, tool output
   * exceeding this limit is truncated (head + tail with a marker in between).
   * Takes priority over {@link AgentConfig.maxToolOutputChars}.
   */
  readonly maxOutputChars?: number
  execute(input: TInput, context: ToolUseContext): Promise<ToolResult>
}

@@ -204,10 +279,28 @@ export interface AgentConfig {
  /** API key override; falls back to the provider's standard env var. */
  readonly apiKey?: string
  readonly systemPrompt?: string
  /**
   * Custom tool definitions to register alongside built-in tools.
   * Created via `defineTool()`. Custom tools bypass `tools` (allowlist)
   * and `toolPreset` filtering, but can still be blocked by `disallowedTools`.
   *
   * Tool names must not collide with built-in tool names; a duplicate name
   * will throw at registration time.
   */
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  readonly customTools?: readonly ToolDefinition<any>[]
  /** Names of tools (from the tool registry) available to this agent. */
  readonly tools?: readonly string[]
  /** Names of tools explicitly disallowed for this agent. */
  readonly disallowedTools?: readonly string[]
  /** Predefined tool preset for common use cases. */
  readonly toolPreset?: 'readonly' | 'readwrite' | 'full'
  readonly maxTurns?: number
  readonly maxTokens?: number
  /** Maximum cumulative tokens (input + output) allowed for this run. */
  readonly maxTokenBudget?: number
  /** Optional context compression policy to control input growth across turns. */
  readonly contextStrategy?: ContextStrategy
  readonly temperature?: number
  /**
   * Maximum wall-clock time (in milliseconds) for the entire agent run.

@@ -220,6 +313,28 @@ export interface AgentConfig {
   * calls and text outputs to detect stuck loops before `maxTurns` is reached.
   */
  readonly loopDetection?: LoopDetectionConfig
  /**
   * Maximum tool output length in characters for all tools used by this agent.
   * When set, tool outputs exceeding this limit are truncated (head + tail
   * with a marker in between). Per-tool {@link ToolDefinition.maxOutputChars}
   * takes priority over this value.
   */
  readonly maxToolOutputChars?: number
  /**
   * Compress tool results that the agent has already processed.
   *
   * In multi-turn runs, tool results persist in the conversation even after the
   * agent has acted on them. When enabled, consumed tool results (those followed
   * by an assistant response) are replaced with a short marker before the next
   * LLM call, freeing context budget for new reasoning.
   *
   * - `true` — enable with default threshold (500 chars)
   * - `{ minChars: N }` — only compress results longer than N characters
   * - `false` / `undefined` — disabled (default)
   *
   * Error tool results are never compressed.
   */
  readonly compressToolResults?: boolean | { readonly minChars?: number }
  /**
   * Optional Zod schema for structured output. When set, the agent's final
   * output is parsed as JSON and validated against this schema. A single

@@ -307,6 +422,8 @@ export interface AgentRunResult {
  readonly structured?: unknown
  /** True when the run was terminated or warned due to loop detection. */
  readonly loopDetected?: boolean
  /** True when the run stopped because token budget was exceeded. */
  readonly budgetExceeded?: boolean
}

// ---------------------------------------------------------------------------

@@ -324,6 +441,8 @@ export interface TeamConfig {
/** Aggregated result for a full team run. */
export interface TeamRunResult {
  readonly success: boolean
  readonly goal?: string
  readonly tasks?: readonly TaskExecutionRecord[]
  /** Keyed by agent name. */
  readonly agentResults: Map<string, AgentRunResult>
  readonly totalTokenUsage: TokenUsage

@@ -336,6 +455,28 @@ export interface TeamRunResult {
/** Valid states for a {@link Task}. */
export type TaskStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'blocked' | 'skipped'

/**
 * Metrics shown in the team-run dashboard detail panel for a single task.
 * Mirrors execution data collected during orchestration.
 */
export interface TaskExecutionMetrics {
  readonly startMs: number
  readonly endMs: number
  readonly durationMs: number
  readonly tokenUsage: TokenUsage
  readonly toolCalls: AgentRunResult['toolCalls']
}

/** Serializable task snapshot embedded in the static HTML dashboard. */
export interface TaskExecutionRecord {
  readonly id: string
  readonly title: string
  readonly assignee?: string
  readonly status: TaskStatus
  readonly dependsOn: readonly string[]
  readonly metrics?: TaskExecutionMetrics
}

/** A discrete unit of work tracked by the orchestrator. */
export interface Task {
  readonly id: string

@@ -346,6 +487,12 @@ export interface Task {
  assignee?: string
  /** IDs of tasks that must complete before this one can start. */
  dependsOn?: readonly string[]
  /**
   * Controls what prior team context is injected into this task's prompt.
   * - `dependencies` (default): only direct dependency task results
   * - `all`: full shared-memory summary
   */
  readonly memoryScope?: 'dependencies' | 'all'
  result?: string
  readonly createdAt: Date
  updatedAt: Date

@@ -375,6 +522,7 @@ export interface OrchestratorEvent {
    | 'task_complete'
    | 'task_skipped'
    | 'task_retry'
    | 'budget_exceeded'
    | 'message'
    | 'error'
  readonly agent?: string

@@ -385,6 +533,13 @@ export interface OrchestratorEvent {
/** Top-level configuration for the orchestrator. */
export interface OrchestratorConfig {
  readonly maxConcurrency?: number
  /**
   * Maximum depth of `delegate_to_agent` chains from a task run (default `3`).
   * Depth is per nested delegated run, not per team.
   */
  readonly maxDelegationDepth?: number
  /** Maximum cumulative tokens (input + output) allowed per orchestrator run. */
  readonly maxTokenBudget?: number
  readonly defaultModel?: string
  readonly defaultProvider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
  readonly defaultBaseURL?: string

@@ -410,6 +565,43 @@ export interface OrchestratorConfig {
  readonly onApproval?: (completedTasks: readonly Task[], nextTasks: readonly Task[]) => Promise<boolean>
}

/**
 * Optional overrides for the temporary coordinator agent created by `runTeam`.
 *
 * All fields are optional. Unset fields fall back to orchestrator defaults
 * (or coordinator built-in defaults where applicable).
 */
export interface CoordinatorConfig {
  /** Coordinator model. Defaults to `OrchestratorConfig.defaultModel`. */
  readonly model?: string
  readonly provider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
  readonly baseURL?: string
  readonly apiKey?: string
  /**
   * Full system prompt override. When set, this replaces the default
   * coordinator preamble and decomposition guidance.
   *
   * Team roster, output format, and synthesis sections are still appended.
   */
  readonly systemPrompt?: string
  /**
   * Additional instructions appended to the default coordinator prompt.
   * Ignored when `systemPrompt` is provided.
   */
  readonly instructions?: string
  readonly maxTurns?: number
  readonly maxTokens?: number
  readonly temperature?: number
  /** Predefined tool preset for common coordinator use cases. */
  readonly toolPreset?: 'readonly' | 'readwrite' | 'full'
  /** Tool names available to the coordinator. */
  readonly tools?: readonly string[]
  /** Tool names explicitly denied to the coordinator. */
  readonly disallowedTools?: readonly string[]
  readonly loopDetection?: LoopDetectionConfig
  readonly timeoutMs?: number
}

// ---------------------------------------------------------------------------
// Trace events — lightweight observability spans
// ---------------------------------------------------------------------------

@@ -438,6 +630,8 @@ export interface TraceEventBase {
export interface LLMCallTrace extends TraceEventBase {
  readonly type: 'llm_call'
  readonly model: string
  /** Distinguishes normal turn calls from context-summary calls. */
  readonly phase?: 'turn' | 'summary'
  readonly turn: number
  readonly tokens: TokenUsage
}
@@ -0,0 +1,39 @@
/**
 * Shared keyword-affinity helpers used by capability-match scheduling
 * and short-circuit agent selection. Kept in one place so behaviour
 * can't drift between Scheduler and Orchestrator.
 */

export const STOP_WORDS: ReadonlySet<string> = new Set([
  'the', 'and', 'for', 'that', 'this', 'with', 'are', 'from', 'have',
  'will', 'your', 'you', 'can', 'all', 'each', 'when', 'then', 'they',
  'them', 'their', 'about', 'into', 'more', 'also', 'should', 'must',
])

/**
 * Tokenise `text` into a deduplicated set of lower-cased keywords.
 * Words shorter than 4 characters and entries in {@link STOP_WORDS}
 * are filtered out.
 */
export function extractKeywords(text: string): string[] {
  return [
    ...new Set(
      text
        .toLowerCase()
        .split(/\W+/)
        .filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
    ),
  ]
}

/**
 * Count how many `keywords` appear (case-insensitively) in `text`.
 * Each keyword contributes at most 1 to the score.
 */
export function keywordScore(text: string, keywords: readonly string[]): number {
  const lower = text.toLowerCase()
  return keywords.reduce(
    (acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0),
    0,
  )
}
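A round-trip through both helpers shows how a task description turns into an affinity score. The copies below mirror the listing above, with the stop-word list trimmed to a handful of entries purely to keep the example short:

```typescript
// Standalone copies of the helpers above (stop-word list abbreviated).
const STOP_WORDS = new Set(['the', 'and', 'for', 'that', 'this', 'with'])

function extractKeywords(text: string): string[] {
  return [
    ...new Set(
      text
        .toLowerCase()
        .split(/\W+/)
        .filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
    ),
  ]
}

function keywordScore(text: string, keywords: readonly string[]): number {
  const lower = text.toLowerCase()
  return keywords.reduce((acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0), 0)
}

const task = 'Review the database schema and write migration scripts'
const kws = extractKeywords(task)
console.log(kws) // ['review', 'database', 'schema', 'write', 'migration', 'scripts']
console.log(keywordScore('Agent specialised in database migration work', kws)) // 2
```

Because `keywordScore` uses substring matching, each keyword counts at most once regardless of how often it recurs in the scored text.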

@@ -34,6 +34,11 @@ export class Semaphore {
    }
  }

  /** Maximum concurrent holders configured for this semaphore. */
  get limit(): number {
    return this.max
  }

  /**
   * Acquire a slot. Resolves immediately when one is free, or waits until a
   * holder calls `release()`.
@@ -0,0 +1,27 @@
import type { LLMMessage } from '../types.js'

/**
 * Estimate token count using a lightweight character heuristic.
 * This intentionally avoids model-specific tokenizer dependencies.
 */
export function estimateTokens(messages: LLMMessage[]): number {
  let chars = 0

  for (const message of messages) {
    for (const block of message.content) {
      if (block.type === 'text') {
        chars += block.text.length
      } else if (block.type === 'tool_result') {
        chars += block.content.length
      } else if (block.type === 'tool_use') {
        chars += JSON.stringify(block.input).length
      } else if (block.type === 'image') {
        // Account for non-text payloads with a small fixed cost.
        chars += 64
      }
    }
  }

  // Conservative English heuristic: ~4 chars per token.
  return Math.ceil(chars / 4)
}
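The heuristic reduces to "sum the character lengths, divide by four, round up". A sketch with stripped-down local block shapes (the real types live in `src/types.ts`; only the two text-bearing block kinds are modelled here):

```typescript
// Minimal local shapes for illustration; the package's ContentBlock has more variants.
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; content: string }
type Msg = { role: 'user' | 'assistant'; content: Block[] }

// Same character heuristic as estimateTokens, for the two text-bearing blocks.
function estimateTokens(messages: Msg[]): number {
  let chars = 0
  for (const m of messages) {
    for (const b of m.content) {
      chars += b.type === 'text' ? b.text.length : b.content.length
    }
  }
  return Math.ceil(chars / 4) // ~4 chars per token (conservative for English)
}

const msgs: Msg[] = [
  { role: 'user', content: [{ type: 'text', text: 'x'.repeat(30) }] },
  { role: 'assistant', content: [{ type: 'tool_result', content: 'y'.repeat(11) }] },
]
console.log(estimateTokens(msgs)) // ceil(41 / 4) = 11
```

The `Math.ceil` matters for budget checks: rounding up means the estimate errs toward triggering compaction slightly early rather than slightly late.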
@@ -0,0 +1,279 @@
/**
 * Targeted tests for abort signal propagation fixes (#99, #100, #101).
 *
 * - #99: Per-call abortSignal must reach tool execution context
 * - #100: Abort path in executeQueue must skip blocked tasks and emit events
 * - #101: Gemini adapter must forward abortSignal to the SDK
 */

import { describe, it, expect, vi, beforeEach } from 'vitest'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import { TaskQueue } from '../src/task/queue.js'
import { createTask } from '../src/task/task.js'
import { z } from 'zod'
import type { LLMAdapter, LLMMessage, ToolUseContext } from '../src/types.js'

// ---------------------------------------------------------------------------
// #99 — Per-call abortSignal propagated to tool context
// ---------------------------------------------------------------------------

describe('Per-call abortSignal reaches tool context (#99)', () => {
  it('tool receives per-call abortSignal, not static runner signal', async () => {
    // Track the abortSignal passed to the tool
    let receivedSignal: AbortSignal | undefined

    const spy = defineTool({
      name: 'spy',
      description: 'Captures the abort signal from context.',
      inputSchema: z.object({}),
      execute: async (_input, context) => {
        receivedSignal = context.abortSignal
        return { data: 'ok', isError: false }
      },
    })

    const registry = new ToolRegistry()
    registry.register(spy)
    const executor = new ToolExecutor(registry)

    // Adapter returns one tool_use then end_turn
    const adapter: LLMAdapter = {
      name: 'mock',
      chat: vi.fn()
        .mockResolvedValueOnce({
          id: '1',
          content: [{ type: 'tool_use', id: 'call-1', name: 'spy', input: {} }],
          model: 'mock',
          stop_reason: 'tool_use',
          usage: { input_tokens: 0, output_tokens: 0 },
        })
        .mockResolvedValueOnce({
          id: '2',
          content: [{ type: 'text', text: 'done' }],
          model: 'mock',
          stop_reason: 'end_turn',
          usage: { input_tokens: 0, output_tokens: 0 },
        }),
      async *stream() { /* unused */ },
    }

    const perCallController = new AbortController()

    // Runner created WITHOUT a static abortSignal
    const runner = new AgentRunner(adapter, registry, executor, {
      model: 'mock',
      agentName: 'test',
    })

    const messages: LLMMessage[] = [
      { role: 'user', content: [{ type: 'text', text: 'go' }] },
    ]

    await runner.run(messages, { abortSignal: perCallController.signal })

    // The tool must have received the per-call signal, not undefined
    expect(receivedSignal).toBe(perCallController.signal)
  })

  it('tool receives static signal when no per-call signal is provided', async () => {
    let receivedSignal: AbortSignal | undefined

    const spy = defineTool({
      name: 'spy',
      description: 'Captures the abort signal from context.',
      inputSchema: z.object({}),
      execute: async (_input, context) => {
        receivedSignal = context.abortSignal
        return { data: 'ok', isError: false }
      },
    })

    const registry = new ToolRegistry()
    registry.register(spy)
    const executor = new ToolExecutor(registry)

    const staticController = new AbortController()

    const adapter: LLMAdapter = {
      name: 'mock',
      chat: vi.fn()
        .mockResolvedValueOnce({
          id: '1',
          content: [{ type: 'tool_use', id: 'call-1', name: 'spy', input: {} }],
          model: 'mock',
          stop_reason: 'tool_use',
          usage: { input_tokens: 0, output_tokens: 0 },
        })
        .mockResolvedValueOnce({
          id: '2',
          content: [{ type: 'text', text: 'done' }],
          model: 'mock',
          stop_reason: 'end_turn',
          usage: { input_tokens: 0, output_tokens: 0 },
        }),
      async *stream() { /* unused */ },
    }

    // Runner created WITH a static abortSignal, no per-call signal
    const runner = new AgentRunner(adapter, registry, executor, {
      model: 'mock',
      agentName: 'test',
      abortSignal: staticController.signal,
    })

    const messages: LLMMessage[] = [
      { role: 'user', content: [{ type: 'text', text: 'go' }] },
    ]

    await runner.run(messages)

    expect(receivedSignal).toBe(staticController.signal)
  })
})

// ---------------------------------------------------------------------------
// #100 — Abort path skips blocked tasks and emits events
// ---------------------------------------------------------------------------

describe('Abort path skips blocked tasks and emits events (#100)', () => {
  function task(id: string, opts: { dependsOn?: string[]; assignee?: string } = {}) {
    const t = createTask({ title: id, description: `task ${id}`, assignee: opts.assignee })
    return { ...t, id, dependsOn: opts.dependsOn } as ReturnType<typeof createTask>
  }

  it('skipRemaining transitions blocked tasks to skipped', () => {
    const q = new TaskQueue()
    q.add(task('a'))
    q.add(task('b', { dependsOn: ['a'] }))

    // 'b' should be blocked because it depends on 'a'
    expect(q.getByStatus('blocked').length).toBe(1)

    q.skipRemaining('Skipped: run aborted.')

    // Both tasks should be skipped — including the blocked one
    const all = q.list()
    expect(all.every(t => t.status === 'skipped')).toBe(true)
    expect(q.getByStatus('blocked').length).toBe(0)
  })

  it('skipRemaining emits task:skipped for every non-terminal task', () => {
    const q = new TaskQueue()
    q.add(task('a'))
    q.add(task('b', { dependsOn: ['a'] }))

    const handler = vi.fn()
    q.on('task:skipped', handler)

    q.skipRemaining('Skipped: run aborted.')

    // Both pending 'a' and blocked 'b' must trigger events
    expect(handler).toHaveBeenCalledTimes(2)
    const ids = handler.mock.calls.map((c: any[]) => c[0].id)
    expect(ids).toContain('a')
    expect(ids).toContain('b')
  })

  it('skipRemaining fires all:complete after skipping', () => {
    const q = new TaskQueue()
    q.add(task('a'))
    q.add(task('b', { dependsOn: ['a'] }))

    const completeHandler = vi.fn()
    q.on('all:complete', completeHandler)

    q.skipRemaining('Skipped: run aborted.')

    expect(completeHandler).toHaveBeenCalledTimes(1)
    expect(q.isComplete()).toBe(true)
  })
})

// ---------------------------------------------------------------------------
// #101 — Gemini adapter forwards abortSignal to SDK config
// ---------------------------------------------------------------------------

const mockGenerateContent = vi.hoisted(() => vi.fn())
const mockGenerateContentStream = vi.hoisted(() => vi.fn())
const GoogleGenAIMock = vi.hoisted(() =>
  vi.fn(() => ({
    models: {
      generateContent: mockGenerateContent,
      generateContentStream: mockGenerateContentStream,
    },
  })),
)

vi.mock('@google/genai', () => ({
  GoogleGenAI: GoogleGenAIMock,
  FunctionCallingConfigMode: { AUTO: 'AUTO' },
}))

import { GeminiAdapter } from '../src/llm/gemini.js'

describe('Gemini adapter forwards abortSignal (#101)', () => {
  let adapter: GeminiAdapter

  function makeGeminiResponse(parts: Array<Record<string, unknown>>) {
    return {
      candidates: [{
        content: { parts },
        finishReason: 'STOP',
      }],
      usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 5 },
    }
  }

  async function* asyncGen<T>(items: T[]): AsyncGenerator<T> {
    for (const item of items) yield item
  }

  beforeEach(() => {
    vi.clearAllMocks()
    adapter = new GeminiAdapter('test-key')
  })

  it('chat() passes abortSignal in config', async () => {
    mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'hi' }]))

    const controller = new AbortController()
    await adapter.chat(
      [{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
      { model: 'gemini-2.5-flash', abortSignal: controller.signal },
    )

    const callArgs = mockGenerateContent.mock.calls[0][0]
    expect(callArgs.config.abortSignal).toBe(controller.signal)
  })

  it('chat() does not include abortSignal when not provided', async () => {
    mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'hi' }]))

    await adapter.chat(
      [{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
      { model: 'gemini-2.5-flash' },
    )

    const callArgs = mockGenerateContent.mock.calls[0][0]
    expect(callArgs.config.abortSignal).toBeUndefined()
  })

  it('stream() passes abortSignal in config', async () => {
    const chunk = makeGeminiResponse([{ text: 'hi' }])
    mockGenerateContentStream.mockResolvedValue(asyncGen([chunk]))

    const controller = new AbortController()
    const events: unknown[] = []
    for await (const e of adapter.stream(
      [{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
      { model: 'gemini-2.5-flash', abortSignal: controller.signal },
    )) {
      events.push(e)
    }

    const callArgs = mockGenerateContentStream.mock.calls[0][0]
    expect(callArgs.config.abortSignal).toBe(controller.signal)
  })
})
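The first two tests above pin down a simple precedence rule for abort signals. A minimal sketch of that rule (the helper name `resolveAbortSignal` is invented for illustration, not the runner's actual API):

```typescript
// Per-call abortSignal wins over the runner's static one; either may be absent.
function resolveAbortSignal(
  staticSignal?: AbortSignal,
  perCallSignal?: AbortSignal,
): AbortSignal | undefined {
  return perCallSignal ?? staticSignal
}
```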
@@ -0,0 +1,107 @@
import { describe, it, expect, vi } from 'vitest'
import { OpenMultiAgent } from '../src/orchestrator/orchestrator.js'
import { Team } from '../src/team/team.js'

describe('AbortSignal support for runTeam and runTasks', () => {
  it('runTeam should accept an abortSignal option', async () => {
    const orchestrator = new OpenMultiAgent({
      defaultModel: 'test-model',
      defaultProvider: 'openai',
    })

    // Verify the API accepts the option without throwing
    const controller = new AbortController()
    const team = new Team({
      name: 'test',
      agents: [
        { name: 'agent1', model: 'test-model', systemPrompt: 'test' },
      ],
    })

    // Abort immediately so the run won't actually execute LLM calls
    controller.abort()

    // runTeam should return gracefully (no unhandled rejection)
    const result = await orchestrator.runTeam(team, 'test goal', {
      abortSignal: controller.signal,
    })

    // With immediate abort, coordinator may or may not have run,
    // but the function should not throw.
    expect(result).toBeDefined()
    expect(result.agentResults).toBeInstanceOf(Map)
  })

  it('runTasks should accept an abortSignal option', async () => {
    const orchestrator = new OpenMultiAgent({
      defaultModel: 'test-model',
      defaultProvider: 'openai',
    })

    const controller = new AbortController()
    const team = new Team({
      name: 'test',
      agents: [
        { name: 'agent1', model: 'test-model', systemPrompt: 'test' },
      ],
    })

    controller.abort()

    const result = await orchestrator.runTasks(team, [
      { title: 'task1', description: 'do something', assignee: 'agent1' },
    ], { abortSignal: controller.signal })

    expect(result).toBeDefined()
    expect(result.agentResults).toBeInstanceOf(Map)
  })

  it('pre-aborted signal should skip pending tasks', async () => {
    const orchestrator = new OpenMultiAgent({
      defaultModel: 'test-model',
      defaultProvider: 'openai',
    })

    const controller = new AbortController()
    controller.abort()

    const team = new Team({
      name: 'test',
      agents: [
        { name: 'agent1', model: 'test-model', systemPrompt: 'test' },
      ],
    })

    const result = await orchestrator.runTasks(team, [
      { title: 'task1', description: 'first', assignee: 'agent1' },
      { title: 'task2', description: 'second', assignee: 'agent1' },
    ], { abortSignal: controller.signal })

    // No agent runs should complete since signal was already aborted
    expect(result).toBeDefined()
  })

  it('runTeam and runTasks work without abortSignal (backward compat)', async () => {
    const orchestrator = new OpenMultiAgent({
      defaultModel: 'test-model',
      defaultProvider: 'openai',
    })

    const team = new Team({
      name: 'test',
      agents: [
        { name: 'agent1', model: 'test-model', systemPrompt: 'test' },
      ],
    })

    // These should not throw even without abortSignal
    const promise1 = orchestrator.runTeam(team, 'goal')
    const promise2 = orchestrator.runTasks(team, [
      { title: 'task1', description: 'do something', assignee: 'agent1' },
    ])

    // Both return promises (won't resolve without real LLM, but API is correct)
    expect(promise1).toBeInstanceOf(Promise)
    expect(promise2).toBeInstanceOf(Promise)
  })
})
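The pre-aborted-signal test above implies an up-front guard rather than a thrown error. A hypothetical sketch of that shape (`runWithAbort` is an invented name, not the orchestrator's API):

```typescript
// If the signal is already aborted, return a skip result instead of doing work.
async function runWithAbort<T>(
  signal: AbortSignal | undefined,
  work: () => Promise<T>,
  skipped: T,
): Promise<T> {
  if (signal?.aborted) return skipped
  return work()
}
```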
@@ -4,7 +4,7 @@ import { Agent } from '../src/agent/agent.js'
 import { AgentRunner } from '../src/agent/runner.js'
 import { ToolRegistry } from '../src/tool/framework.js'
 import { ToolExecutor } from '../src/tool/executor.js'
-import type { AgentConfig, AgentRunResult, LLMAdapter, LLMMessage, LLMResponse } from '../src/types.js'
+import type { AgentConfig, AgentRunResult, LLMAdapter, LLMMessage, LLMResponse, StreamEvent } from '../src/types.js'

 // ---------------------------------------------------------------------------
 // Mock helpers
@@ -243,7 +243,7 @@ describe('Agent hooks — beforeRun / afterRun', () => {
     }
     const { agent, calls } = buildMockAgent(config, 'streamed')

-    const events = []
+    const events: StreamEvent[] = []
     for await (const event of agent.stream('original')) {
       events.push(event)
     }
@@ -263,7 +263,7 @@ describe('Agent hooks — beforeRun / afterRun', () => {
     }
     const { agent } = buildMockAgent(config, 'original')

-    const events = []
+    const events: StreamEvent[] = []
     for await (const event of agent.stream('hi')) {
       events.push(event)
     }
@@ -280,7 +280,7 @@ describe('Agent hooks — beforeRun / afterRun', () => {
     }
     const { agent } = buildMockAgent(config, 'unreachable')

-    const events = []
+    const events: StreamEvent[] = []
     for await (const event of agent.stream('hi')) {
       events.push(event)
     }
@@ -297,7 +297,7 @@ describe('Agent hooks — beforeRun / afterRun', () => {
     }
     const { agent } = buildMockAgent(config, 'streamed output')

-    const events = []
+    const events: StreamEvent[] = []
     for await (const event of agent.stream('hi')) {
       events.push(event)
     }
@@ -178,6 +178,89 @@ describe('AgentPool', () => {
    })
  })

  describe('per-agent serialization (#72)', () => {
    it('serializes concurrent runs on the same agent', async () => {
      const executionLog: string[] = []

      const agent = createMockAgent('dev')
      ;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async (prompt: string) => {
        executionLog.push(`start:${prompt}`)
        await new Promise(r => setTimeout(r, 50))
        executionLog.push(`end:${prompt}`)
        return SUCCESS_RESULT
      })

      const pool = new AgentPool(5)
      pool.add(agent)

      // Fire two runs for the same agent concurrently
      await Promise.all([
        pool.run('dev', 'task1'),
        pool.run('dev', 'task2'),
      ])

      // With per-agent serialization, runs must not overlap:
      // [start:task1, end:task1, start:task2, end:task2] (or reverse order)
      // i.e. no interleaving like [start:task1, start:task2, ...]
      expect(executionLog).toHaveLength(4)
      expect(executionLog[0]).toMatch(/^start:/)
      expect(executionLog[1]).toMatch(/^end:/)
      expect(executionLog[2]).toMatch(/^start:/)
      expect(executionLog[3]).toMatch(/^end:/)
    })

    it('allows different agents to run in parallel', async () => {
      let concurrent = 0
      let maxConcurrent = 0

      const makeTimedAgent = (name: string): Agent => {
        const agent = createMockAgent(name)
        ;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async () => {
          concurrent++
          maxConcurrent = Math.max(maxConcurrent, concurrent)
          await new Promise(r => setTimeout(r, 50))
          concurrent--
          return SUCCESS_RESULT
        })
        return agent
      }

      const pool = new AgentPool(5)
      pool.add(makeTimedAgent('a'))
      pool.add(makeTimedAgent('b'))

      await Promise.all([
        pool.run('a', 'x'),
        pool.run('b', 'y'),
      ])

      // Different agents should run concurrently
      expect(maxConcurrent).toBe(2)
    })

    it('releases agent lock even when run() throws', async () => {
      const agent = createMockAgent('dev')
      let callCount = 0
      ;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async () => {
        callCount++
        if (callCount === 1) throw new Error('first run fails')
        return SUCCESS_RESULT
      })

      const pool = new AgentPool(5)
      pool.add(agent)

      // First run fails, second should still execute (not deadlock)
      const results = await Promise.allSettled([
        pool.run('dev', 'will-fail'),
        pool.run('dev', 'should-succeed'),
      ])

      expect(results[0]!.status).toBe('rejected')
      expect(results[1]!.status).toBe('fulfilled')
    })
  })

  describe('concurrency', () => {
    it('respects maxConcurrency limit', async () => {
      let concurrent = 0
@@ -208,5 +291,93 @@ describe('AgentPool', () => {

      expect(maxConcurrent).toBeLessThanOrEqual(2)
    })

    it('availableRunSlots matches maxConcurrency when idle', () => {
      const pool = new AgentPool(3)
      pool.add(createMockAgent('a'))
      expect(pool.availableRunSlots).toBe(3)
    })

    it('availableRunSlots is zero while a run holds the pool slot', async () => {
      const pool = new AgentPool(1)
      const agent = createMockAgent('solo')
      pool.add(agent)

      let finishRun!: (value: AgentRunResult) => void
      const holdPromise = new Promise<AgentRunResult>((resolve) => {
        finishRun = resolve
      })
      vi.mocked(agent.run).mockReturnValue(holdPromise)

      const runPromise = pool.run('solo', 'hold-slot')
      await Promise.resolve()
      await Promise.resolve()
      expect(pool.availableRunSlots).toBe(0)

      finishRun(SUCCESS_RESULT)
      await runPromise
      expect(pool.availableRunSlots).toBe(1)
    })

    it('runEphemeral runs a caller-supplied Agent without touching the agentLock', async () => {
      // Registered agent's lock is held by a pending pool.run — a second
      // pool.run() against the same name would queue on the agent lock.
      // runEphemeral on a fresh Agent instance must NOT block on that lock.
      const pool = new AgentPool(3)
      const registered = createMockAgent('alice')
      pool.add(registered)

      let releaseRegistered!: (v: AgentRunResult) => void
      vi.mocked(registered.run).mockReturnValue(
        new Promise<AgentRunResult>((resolve) => {
          releaseRegistered = resolve
        }),
      )
      const heldRun = pool.run('alice', 'long running')
      await Promise.resolve()
      await Promise.resolve()

      const ephemeral = createMockAgent('alice') // same name, fresh instance
      const ephemeralResult = await pool.runEphemeral(ephemeral, 'quick task')

      expect(ephemeralResult).toBe(SUCCESS_RESULT)
      expect(ephemeral.run).toHaveBeenCalledWith('quick task', undefined)

      releaseRegistered(SUCCESS_RESULT)
      await heldRun
    })

    it('runEphemeral still respects pool semaphore', async () => {
      const pool = new AgentPool(1)
      const holder = createMockAgent('holder')
      pool.add(holder)

      let releaseHolder!: (v: AgentRunResult) => void
      vi.mocked(holder.run).mockReturnValue(
        new Promise<AgentRunResult>((resolve) => {
          releaseHolder = resolve
        }),
      )
      const heldRun = pool.run('holder', 'hold-slot')
      await Promise.resolve()
      await Promise.resolve()
      expect(pool.availableRunSlots).toBe(0)

      // Ephemeral agent should queue on the semaphore, not run immediately.
      const ephemeral = createMockAgent('ephemeral')
      let ephemeralResolved = false
      const ephemeralRun = pool.runEphemeral(ephemeral, 'p').then((r) => {
        ephemeralResolved = true
        return r
      })
      await Promise.resolve()
      await Promise.resolve()
      expect(ephemeralResolved).toBe(false)

      releaseHolder(SUCCESS_RESULT)
      await heldRun
      await ephemeralRun
      expect(ephemeralResolved).toBe(true)
    })
  })
})
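The per-agent serialization tests above can be satisfied by a per-key promise chain. A minimal sketch, assuming a keyed-lock design (the `KeyedLock` name and internals are illustrative, not the pool's actual implementation):

```typescript
// Runs for the same key chain on a stored promise tail, so they never overlap;
// different keys proceed independently.
class KeyedLock {
  private tails = new Map<string, Promise<unknown>>()

  run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const tail = this.tails.get(key) ?? Promise.resolve()
    // Chain after the current tail; swallow the predecessor's error so a
    // failed run still releases the lock for the next caller.
    const next = tail.catch(() => {}).then(fn)
    // Store a settled-safe tail so later runs wait even if this one throws.
    this.tails.set(key, next.catch(() => {}))
    return next
  }
}
```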
@@ -0,0 +1,436 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { textMsg, toolUseMsg, toolResultMsg, imageMsg, chatOpts, toolDef, collectEvents } from './helpers/llm-fixtures.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../src/types.js'

// ---------------------------------------------------------------------------
// Mock the Anthropic SDK
// ---------------------------------------------------------------------------

const mockCreate = vi.hoisted(() => vi.fn())
const mockStream = vi.hoisted(() => vi.fn())

vi.mock('@anthropic-ai/sdk', () => {
  const AnthropicMock = vi.fn(() => ({
    messages: {
      create: mockCreate,
      stream: mockStream,
    },
  }))
  return { default: AnthropicMock, Anthropic: AnthropicMock }
})

import { AnthropicAdapter } from '../src/llm/anthropic.js'

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

function makeAnthropicResponse(overrides: Record<string, unknown> = {}) {
  return {
    id: 'msg_test123',
    content: [{ type: 'text', text: 'Hello' }],
    model: 'claude-sonnet-4',
    stop_reason: 'end_turn',
    usage: { input_tokens: 10, output_tokens: 5 },
    ...overrides,
  }
}

function makeStreamMock(events: Array<Record<string, unknown>>, finalMsg: Record<string, unknown>) {
  return {
    [Symbol.asyncIterator]: async function* () {
      for (const event of events) yield event
    },
    finalMessage: vi.fn().mockResolvedValue(finalMsg),
  }
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

describe('AnthropicAdapter', () => {
  let adapter: AnthropicAdapter

  beforeEach(() => {
    vi.clearAllMocks()
    adapter = new AnthropicAdapter('test-key')
  })

  // =========================================================================
  // chat()
  // =========================================================================

  describe('chat()', () => {
    it('converts a text message and returns LLMResponse', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      // Verify the SDK was called with correct shape
      const callArgs = mockCreate.mock.calls[0]
      expect(callArgs[0]).toMatchObject({
        model: 'test-model',
        max_tokens: 1024,
        messages: [{ role: 'user', content: [{ type: 'text', text: 'Hi' }] }],
      })

      // Verify response transformation
      expect(result).toEqual({
        id: 'msg_test123',
        content: [{ type: 'text', text: 'Hello' }],
        model: 'claude-sonnet-4',
        stop_reason: 'end_turn',
        usage: { input_tokens: 10, output_tokens: 5 },
      })
    })

    it('converts tool_use blocks to Anthropic format', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      await adapter.chat(
        [toolUseMsg('call_1', 'search', { query: 'test' })],
        chatOpts(),
      )

      const sentMessages = mockCreate.mock.calls[0][0].messages
      expect(sentMessages[0].content[0]).toEqual({
        type: 'tool_use',
        id: 'call_1',
        name: 'search',
        input: { query: 'test' },
      })
    })

    it('converts tool_result blocks to Anthropic format', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      await adapter.chat(
        [toolResultMsg('call_1', 'result data', false)],
        chatOpts(),
      )

      const sentMessages = mockCreate.mock.calls[0][0].messages
      expect(sentMessages[0].content[0]).toEqual({
        type: 'tool_result',
        tool_use_id: 'call_1',
        content: 'result data',
        is_error: false,
      })
    })

    it('converts image blocks to Anthropic format', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      await adapter.chat([imageMsg('image/png', 'base64data')], chatOpts())

      const sentMessages = mockCreate.mock.calls[0][0].messages
      expect(sentMessages[0].content[0]).toEqual({
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/png',
          data: 'base64data',
        },
      })
    })

    it('passes system prompt as top-level parameter', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      await adapter.chat(
        [textMsg('user', 'Hi')],
        chatOpts({ systemPrompt: 'You are helpful.' }),
      )

      expect(mockCreate.mock.calls[0][0].system).toBe('You are helpful.')
    })

    it('converts tools to Anthropic format', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())
      const tool = toolDef('search', 'Search the web')

      await adapter.chat(
        [textMsg('user', 'Hi')],
        chatOpts({ tools: [tool] }),
      )

      const sentTools = mockCreate.mock.calls[0][0].tools
      expect(sentTools[0]).toEqual({
        name: 'search',
        description: 'Search the web',
        input_schema: {
          type: 'object',
          properties: { query: { type: 'string' } },
          required: ['query'],
        },
      })
    })

    it('passes temperature through', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      await adapter.chat(
        [textMsg('user', 'Hi')],
        chatOpts({ temperature: 0.5 }),
      )

      expect(mockCreate.mock.calls[0][0].temperature).toBe(0.5)
    })

    it('passes abortSignal to SDK request options', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())
      const controller = new AbortController()

      await adapter.chat(
        [textMsg('user', 'Hi')],
        chatOpts({ abortSignal: controller.signal }),
      )

      expect(mockCreate.mock.calls[0][1]).toEqual({ signal: controller.signal })
    })

    it('defaults max_tokens to 4096 when unset', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse())

      await adapter.chat(
        [textMsg('user', 'Hi')],
        { model: 'test-model' },
      )

      expect(mockCreate.mock.calls[0][0].max_tokens).toBe(4096)
    })

    it('converts tool_use response blocks from Anthropic', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse({
        content: [
          { type: 'tool_use', id: 'call_1', name: 'search', input: { q: 'test' } },
        ],
        stop_reason: 'tool_use',
      }))

      const result = await adapter.chat([textMsg('user', 'search')], chatOpts())

      expect(result.content[0]).toEqual({
        type: 'tool_use',
        id: 'call_1',
        name: 'search',
        input: { q: 'test' },
      })
      expect(result.stop_reason).toBe('tool_use')
    })

    it('gracefully degrades unknown block types to text', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse({
        content: [{ type: 'thinking', thinking: 'hmm...' }],
      }))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.content[0]).toEqual({
        type: 'text',
        text: '[unsupported block type: thinking]',
      })
    })

    it('defaults stop_reason to end_turn when null', async () => {
      mockCreate.mockResolvedValue(makeAnthropicResponse({ stop_reason: null }))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.stop_reason).toBe('end_turn')
    })

    it('propagates SDK errors', async () => {
      mockCreate.mockRejectedValue(new Error('Rate limited'))

      await expect(
        adapter.chat([textMsg('user', 'Hi')], chatOpts()),
      ).rejects.toThrow('Rate limited')
    })
  })

  // =========================================================================
  // stream()
  // =========================================================================

  describe('stream()', () => {
    it('yields text events from text_delta', async () => {
      const streamObj = makeStreamMock(
        [
          { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Hello' } },
          { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: ' world' } },
        ],
        makeAnthropicResponse({ content: [{ type: 'text', text: 'Hello world' }] }),
      )
      mockStream.mockReturnValue(streamObj)

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const textEvents = events.filter(e => e.type === 'text')
      expect(textEvents).toEqual([
        { type: 'text', data: 'Hello' },
        { type: 'text', data: ' world' },
      ])
    })

    it('accumulates tool input JSON and emits tool_use on content_block_stop', async () => {
      const streamObj = makeStreamMock(
        [
          {
            type: 'content_block_start',
            index: 0,
            content_block: { type: 'tool_use', id: 'call_1', name: 'search' },
          },
          {
            type: 'content_block_delta',
            index: 0,
            delta: { type: 'input_json_delta', partial_json: '{"qu' },
          },
          {
            type: 'content_block_delta',
            index: 0,
            delta: { type: 'input_json_delta', partial_json: 'ery":"test"}' },
          },
          { type: 'content_block_stop', index: 0 },
        ],
        makeAnthropicResponse({
          content: [{ type: 'tool_use', id: 'call_1', name: 'search', input: { query: 'test' } }],
          stop_reason: 'tool_use',
        }),
      )
      mockStream.mockReturnValue(streamObj)

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const toolEvents = events.filter(e => e.type === 'tool_use')
      expect(toolEvents).toHaveLength(1)
      const block = toolEvents[0].data as ToolUseBlock
      expect(block).toEqual({
        type: 'tool_use',
        id: 'call_1',
        name: 'search',
        input: { query: 'test' },
      })
    })

    it('handles malformed tool JSON gracefully (defaults to empty object)', async () => {
      const streamObj = makeStreamMock(
        [
          {
            type: 'content_block_start',
            index: 0,
            content_block: { type: 'tool_use', id: 'call_1', name: 'broken' },
          },
          {
            type: 'content_block_delta',
            index: 0,
            delta: { type: 'input_json_delta', partial_json: '{invalid' },
          },
          { type: 'content_block_stop', index: 0 },
        ],
        makeAnthropicResponse({
          content: [{ type: 'tool_use', id: 'call_1', name: 'broken', input: {} }],
        }),
      )
      mockStream.mockReturnValue(streamObj)

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const toolEvents = events.filter(e => e.type === 'tool_use')
      expect((toolEvents[0].data as ToolUseBlock).input).toEqual({})
    })

    it('yields done event with complete LLMResponse', async () => {
      const final = makeAnthropicResponse({
        content: [{ type: 'text', text: 'Done' }],
      })
      const streamObj = makeStreamMock([], final)
      mockStream.mockReturnValue(streamObj)

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const doneEvents = events.filter(e => e.type === 'done')
      expect(doneEvents).toHaveLength(1)
      const response = doneEvents[0].data as LLMResponse
      expect(response.id).toBe('msg_test123')
      expect(response.content).toEqual([{ type: 'text', text: 'Done' }])
      expect(response.usage).toEqual({ input_tokens: 10, output_tokens: 5 })
    })

    it('yields error event when stream throws', async () => {
|
||||
const streamObj = {
|
||||
[Symbol.asyncIterator]: async function* () {
|
||||
throw new Error('Stream failed')
|
||||
},
|
||||
finalMessage: vi.fn(),
|
||||
}
|
||||
mockStream.mockReturnValue(streamObj)
|
||||
|
||||
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
|
||||
|
||||
const errorEvents = events.filter(e => e.type === 'error')
|
||||
expect(errorEvents).toHaveLength(1)
|
||||
expect((errorEvents[0].data as Error).message).toBe('Stream failed')
|
||||
})
|
||||
|
||||
it('passes system prompt and tools to stream call', async () => {
|
||||
const streamObj = makeStreamMock([], makeAnthropicResponse())
|
||||
mockStream.mockReturnValue(streamObj)
|
||||
const tool = toolDef('search')
|
||||
|
||||
await collectEvents(
|
||||
adapter.stream(
|
||||
[textMsg('user', 'Hi')],
|
||||
chatOpts({ systemPrompt: 'Be helpful', tools: [tool] }),
|
||||
),
|
||||
)
|
||||
|
||||
const callArgs = mockStream.mock.calls[0][0]
|
||||
expect(callArgs.system).toBe('Be helpful')
|
||||
expect(callArgs.tools[0].name).toBe('search')
|
||||
})
|
||||
|
||||
it('passes abortSignal to stream request options', async () => {
|
||||
const streamObj = makeStreamMock([], makeAnthropicResponse())
|
||||
mockStream.mockReturnValue(streamObj)
|
||||
const controller = new AbortController()
|
||||
|
||||
await collectEvents(
|
||||
adapter.stream(
|
||||
[textMsg('user', 'Hi')],
|
||||
chatOpts({ abortSignal: controller.signal }),
|
||||
),
|
||||
)
|
||||
|
||||
expect(mockStream.mock.calls[0][1]).toEqual({ signal: controller.signal })
|
||||
})
|
||||
|
||||
it('handles multiple tool calls in one stream', async () => {
|
||||
const streamObj = makeStreamMock(
|
||||
[
|
||||
{ type: 'content_block_start', index: 0, content_block: { type: 'tool_use', id: 'c1', name: 'search' } },
|
||||
{ type: 'content_block_delta', index: 0, delta: { type: 'input_json_delta', partial_json: '{"q":"a"}' } },
|
||||
{ type: 'content_block_stop', index: 0 },
|
||||
{ type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: 'c2', name: 'read' } },
|
||||
{ type: 'content_block_delta', index: 1, delta: { type: 'input_json_delta', partial_json: '{"path":"b"}' } },
|
||||
{ type: 'content_block_stop', index: 1 },
|
||||
],
|
||||
makeAnthropicResponse({
|
||||
content: [
|
||||
{ type: 'tool_use', id: 'c1', name: 'search', input: { q: 'a' } },
|
||||
{ type: 'tool_use', id: 'c2', name: 'read', input: { path: 'b' } },
|
||||
],
|
||||
}),
|
||||
)
|
||||
mockStream.mockReturnValue(streamObj)
|
||||
|
||||
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
|
||||
|
||||
const toolEvents = events.filter(e => e.type === 'tool_use')
|
||||
expect(toolEvents).toHaveLength(2)
|
||||
expect((toolEvents[0].data as ToolUseBlock).name).toBe('search')
|
||||
expect((toolEvents[1].data as ToolUseBlock).name).toBe('read')
|
||||
})
|
||||
})
|
||||
})
|
||||
|
|
@@ -0,0 +1,383 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { chatOpts, collectEvents, textMsg, toolDef } from './helpers/llm-fixtures.js'
import type { LLMResponse, ToolUseBlock } from '../src/types.js'

// ---------------------------------------------------------------------------
// Mock AzureOpenAI constructor (must be hoisted for Vitest)
// ---------------------------------------------------------------------------
const AzureOpenAIMock = vi.hoisted(() => vi.fn())
const createCompletionMock = vi.hoisted(() => vi.fn())

vi.mock('openai', () => ({
  AzureOpenAI: AzureOpenAIMock,
}))

import { AzureOpenAIAdapter } from '../src/llm/azure-openai.js'
import { createAdapter } from '../src/llm/adapter.js'

function makeCompletion(overrides: Record<string, unknown> = {}) {
  return {
    id: 'chatcmpl-123',
    model: 'gpt-4o',
    choices: [{
      index: 0,
      message: {
        role: 'assistant',
        content: 'Hello',
        tool_calls: undefined,
      },
      finish_reason: 'stop',
    }],
    usage: { prompt_tokens: 10, completion_tokens: 5 },
    ...overrides,
  }
}

async function* makeChunks(chunks: Array<Record<string, unknown>>) {
  for (const chunk of chunks) yield chunk
}

function textChunk(text: string, finish_reason: string | null = null, usage: Record<string, number> | null = null) {
  return {
    id: 'chatcmpl-123',
    model: 'gpt-4o',
    choices: [{
      index: 0,
      delta: { content: text },
      finish_reason,
    }],
    usage,
  }
}

function toolCallChunk(
  index: number,
  id: string | undefined,
  name: string | undefined,
  args: string,
  finish_reason: string | null = null,
) {
  return {
    id: 'chatcmpl-123',
    model: 'gpt-4o',
    choices: [{
      index: 0,
      delta: {
        tool_calls: [{
          index,
          id,
          function: {
            name,
            arguments: args,
          },
        }],
      },
      finish_reason,
    }],
    usage: null,
  }
}

// ---------------------------------------------------------------------------
// AzureOpenAIAdapter tests
// ---------------------------------------------------------------------------

describe('AzureOpenAIAdapter', () => {
  beforeEach(() => {
    AzureOpenAIMock.mockClear()
    createCompletionMock.mockReset()
    AzureOpenAIMock.mockImplementation(() => ({
      chat: {
        completions: {
          create: createCompletionMock,
        },
      },
    }))
  })

  it('has name "azure-openai"', () => {
    const adapter = new AzureOpenAIAdapter()
    expect(adapter.name).toBe('azure-openai')
  })

  it('uses AZURE_OPENAI_API_KEY by default', () => {
    const originalKey = process.env['AZURE_OPENAI_API_KEY']
    const originalEndpoint = process.env['AZURE_OPENAI_ENDPOINT']
    process.env['AZURE_OPENAI_API_KEY'] = 'azure-test-key-123'
    process.env['AZURE_OPENAI_ENDPOINT'] = 'https://test.openai.azure.com'

    try {
      new AzureOpenAIAdapter()
      expect(AzureOpenAIMock).toHaveBeenCalledWith(
        expect.objectContaining({
          apiKey: 'azure-test-key-123',
          endpoint: 'https://test.openai.azure.com',
        })
      )
    } finally {
      if (originalKey === undefined) {
        delete process.env['AZURE_OPENAI_API_KEY']
      } else {
        process.env['AZURE_OPENAI_API_KEY'] = originalKey
      }
      if (originalEndpoint === undefined) {
        delete process.env['AZURE_OPENAI_ENDPOINT']
      } else {
        process.env['AZURE_OPENAI_ENDPOINT'] = originalEndpoint
      }
    }
  })

  it('uses AZURE_OPENAI_ENDPOINT by default', () => {
    const originalEndpoint = process.env['AZURE_OPENAI_ENDPOINT']
    process.env['AZURE_OPENAI_ENDPOINT'] = 'https://my-resource.openai.azure.com'

    try {
      new AzureOpenAIAdapter('some-key')
      expect(AzureOpenAIMock).toHaveBeenCalledWith(
        expect.objectContaining({
          apiKey: 'some-key',
          endpoint: 'https://my-resource.openai.azure.com',
        })
      )
    } finally {
      if (originalEndpoint === undefined) {
        delete process.env['AZURE_OPENAI_ENDPOINT']
      } else {
        process.env['AZURE_OPENAI_ENDPOINT'] = originalEndpoint
      }
    }
  })

  it('uses default API version when not set', () => {
    new AzureOpenAIAdapter('some-key', 'https://test.openai.azure.com')
    expect(AzureOpenAIMock).toHaveBeenCalledWith(
      expect.objectContaining({
        apiKey: 'some-key',
        endpoint: 'https://test.openai.azure.com',
        apiVersion: '2024-10-21',
      })
    )
  })

  it('uses AZURE_OPENAI_API_VERSION env var when set', () => {
    const originalVersion = process.env['AZURE_OPENAI_API_VERSION']
    process.env['AZURE_OPENAI_API_VERSION'] = '2024-03-01-preview'

    try {
      new AzureOpenAIAdapter('some-key', 'https://test.openai.azure.com')
      expect(AzureOpenAIMock).toHaveBeenCalledWith(
        expect.objectContaining({
          apiKey: 'some-key',
          endpoint: 'https://test.openai.azure.com',
          apiVersion: '2024-03-01-preview',
        })
      )
    } finally {
      if (originalVersion === undefined) {
        delete process.env['AZURE_OPENAI_API_VERSION']
      } else {
        process.env['AZURE_OPENAI_API_VERSION'] = originalVersion
      }
    }
  })

  it('allows overriding apiKey, endpoint, and apiVersion', () => {
    new AzureOpenAIAdapter(
      'custom-key',
      'https://custom.openai.azure.com',
      '2024-04-01-preview'
    )
    expect(AzureOpenAIMock).toHaveBeenCalledWith(
      expect.objectContaining({
        apiKey: 'custom-key',
        endpoint: 'https://custom.openai.azure.com',
        apiVersion: '2024-04-01-preview',
      })
    )
  })

  it('createAdapter("azure-openai") returns AzureOpenAIAdapter instance', async () => {
    const adapter = await createAdapter('azure-openai')
    expect(adapter).toBeInstanceOf(AzureOpenAIAdapter)
  })

  it('chat() calls SDK with expected parameters', async () => {
    createCompletionMock.mockResolvedValue(makeCompletion())
    const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
    const tool = toolDef('search', 'Search')

    const result = await adapter.chat(
      [textMsg('user', 'Hi')],
      chatOpts({
        model: 'my-deployment',
        tools: [tool],
        temperature: 0.3,
      }),
    )

    const callArgs = createCompletionMock.mock.calls[0][0]
    expect(callArgs).toMatchObject({
      model: 'my-deployment',
      stream: false,
      max_tokens: 1024,
      temperature: 0.3,
    })
    expect(callArgs.tools[0]).toEqual({
      type: 'function',
      function: {
        name: 'search',
        description: 'Search',
        parameters: tool.inputSchema,
      },
    })
    expect(result).toEqual({
      id: 'chatcmpl-123',
      content: [{ type: 'text', text: 'Hello' }],
      model: 'gpt-4o',
      stop_reason: 'end_turn',
      usage: { input_tokens: 10, output_tokens: 5 },
    })
  })

  it('chat() maps native tool_calls to tool_use blocks', async () => {
    createCompletionMock.mockResolvedValue(makeCompletion({
      choices: [{
        index: 0,
        message: {
          role: 'assistant',
          content: null,
          tool_calls: [{
            id: 'call_1',
            type: 'function',
            function: { name: 'search', arguments: '{"q":"test"}' },
          }],
        },
        finish_reason: 'tool_calls',
      }],
    }))
    const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')

    const result = await adapter.chat(
      [textMsg('user', 'Hi')],
      chatOpts({ model: 'my-deployment', tools: [toolDef('search')] }),
    )

    expect(result.content[0]).toEqual({
      type: 'tool_use',
      id: 'call_1',
      name: 'search',
      input: { q: 'test' },
    })
    expect(result.stop_reason).toBe('tool_use')
  })

  it('chat() uses AZURE_OPENAI_DEPLOYMENT when model is blank', async () => {
    const originalDeployment = process.env['AZURE_OPENAI_DEPLOYMENT']
    process.env['AZURE_OPENAI_DEPLOYMENT'] = 'env-deployment'
    createCompletionMock.mockResolvedValue({
      id: 'cmpl-1',
      model: 'gpt-4',
      choices: [
        {
          finish_reason: 'stop',
          message: { content: 'ok' },
        },
      ],
      usage: { prompt_tokens: 1, completion_tokens: 1 },
    })

    try {
      const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
      await adapter.chat([], { model: ' ' })

      expect(createCompletionMock).toHaveBeenCalledWith(
        expect.objectContaining({ model: 'env-deployment', stream: false }),
        expect.any(Object),
      )
    } finally {
      if (originalDeployment === undefined) {
        delete process.env['AZURE_OPENAI_DEPLOYMENT']
      } else {
        process.env['AZURE_OPENAI_DEPLOYMENT'] = originalDeployment
      }
    }
  })

  it('chat() throws when both model and AZURE_OPENAI_DEPLOYMENT are blank', async () => {
    const originalDeployment = process.env['AZURE_OPENAI_DEPLOYMENT']
    delete process.env['AZURE_OPENAI_DEPLOYMENT']
    const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')

    try {
      await expect(adapter.chat([], { model: ' ' })).rejects.toThrow(
        'Azure OpenAI deployment is required',
      )
      expect(createCompletionMock).not.toHaveBeenCalled()
    } finally {
      if (originalDeployment !== undefined) {
        process.env['AZURE_OPENAI_DEPLOYMENT'] = originalDeployment
      }
    }
  })

  it('stream() sends stream options and emits done usage', async () => {
    createCompletionMock.mockResolvedValue(makeChunks([
      textChunk('Hi', 'stop'),
      { id: 'chatcmpl-123', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 10, completion_tokens: 2 } },
    ]))
    const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')

    const events = await collectEvents(
      adapter.stream([textMsg('user', 'Hi')], chatOpts({ model: 'my-deployment' })),
    )

    const callArgs = createCompletionMock.mock.calls[0][0]
    expect(callArgs.stream).toBe(true)
    expect(callArgs.stream_options).toEqual({ include_usage: true })

    const done = events.find(e => e.type === 'done')
    const response = done?.data as LLMResponse
    expect(response.usage).toEqual({ input_tokens: 10, output_tokens: 2 })
    expect(response.model).toBe('gpt-4o')
  })

  it('stream() accumulates tool call deltas and emits tool_use', async () => {
    createCompletionMock.mockResolvedValue(makeChunks([
      toolCallChunk(0, 'call_1', 'search', '{"q":'),
      toolCallChunk(0, undefined, undefined, '"test"}', 'tool_calls'),
      { id: 'chatcmpl-123', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 10, completion_tokens: 5 } },
    ]))
    const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')

    const events = await collectEvents(
      adapter.stream([textMsg('user', 'Hi')], chatOpts({ model: 'my-deployment' })),
    )

    const toolEvents = events.filter(e => e.type === 'tool_use')
    expect(toolEvents).toHaveLength(1)
    expect(toolEvents[0]?.data as ToolUseBlock).toEqual({
      type: 'tool_use',
      id: 'call_1',
      name: 'search',
      input: { q: 'test' },
    })
  })

  it('stream() yields error event when iterator throws', async () => {
    createCompletionMock.mockResolvedValue(
      (async function* () {
        throw new Error('Stream exploded')
      })(),
    )
    const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')

    const events = await collectEvents(
      adapter.stream([textMsg('user', 'Hi')], chatOpts({ model: 'my-deployment' })),
    )

    const errorEvents = events.filter(e => e.type === 'error')
    expect(errorEvents).toHaveLength(1)
    expect((errorEvents[0]?.data as Error).message).toBe('Stream exploded')
  })
})
@@ -1,4 +1,4 @@
-import { describe, it, expect, beforeEach, afterEach } from 'vitest'
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'
import { mkdtemp, rm, writeFile, readFile } from 'fs/promises'
import { join } from 'path'
import { tmpdir } from 'os'

@@ -6,10 +6,16 @@ import { fileReadTool } from '../src/tool/built-in/file-read.js'
import { fileWriteTool } from '../src/tool/built-in/file-write.js'
import { fileEditTool } from '../src/tool/built-in/file-edit.js'
import { bashTool } from '../src/tool/built-in/bash.js'
import { globTool } from '../src/tool/built-in/glob.js'
import { grepTool } from '../src/tool/built-in/grep.js'
-import { registerBuiltInTools, BUILT_IN_TOOLS } from '../src/tool/built-in/index.js'
+import {
+  registerBuiltInTools,
+  BUILT_IN_TOOLS,
+  delegateToAgentTool,
+} from '../src/tool/built-in/index.js'
import { ToolRegistry } from '../src/tool/framework.js'
-import type { ToolUseContext } from '../src/types.js'
+import { InMemoryStore } from '../src/memory/store.js'
+import type { AgentRunResult, ToolUseContext } from '../src/types.js'

// ---------------------------------------------------------------------------
// Helpers

@@ -34,7 +40,7 @@ afterEach(async () => {
// ===========================================================================

describe('registerBuiltInTools', () => {
-  it('registers all 5 built-in tools', () => {
+  it('registers all 6 built-in tools', () => {
    const registry = new ToolRegistry()
    registerBuiltInTools(registry)

@@ -43,10 +49,18 @@ describe('registerBuiltInTools', () => {
    expect(registry.get('file_write')).toBeDefined()
    expect(registry.get('file_edit')).toBeDefined()
    expect(registry.get('grep')).toBeDefined()
    expect(registry.get('glob')).toBeDefined()
+    expect(registry.get('delegate_to_agent')).toBeUndefined()
  })

+  it('registers delegate_to_agent when includeDelegateTool is set', () => {
+    const registry = new ToolRegistry()
+    registerBuiltInTools(registry, { includeDelegateTool: true })
+    expect(registry.get('delegate_to_agent')).toBeDefined()
+  })
+
  it('BUILT_IN_TOOLS has correct length', () => {
-    expect(BUILT_IN_TOOLS).toHaveLength(5)
+    expect(BUILT_IN_TOOLS).toHaveLength(6)
  })
})

@@ -305,6 +319,102 @@ describe('bash', () => {
  })
})

// ===========================================================================
// glob
// ===========================================================================

describe('glob', () => {
  it('lists files matching a pattern without reading contents', async () => {
    await writeFile(join(tmpDir, 'a.ts'), 'SECRET_CONTENT_SHOULD_NOT_APPEAR')
    await writeFile(join(tmpDir, 'b.md'), 'also secret')

    const result = await globTool.execute(
      { path: tmpDir, pattern: '*.ts' },
      defaultContext,
    )

    expect(result.isError).toBe(false)
    expect(result.data).toContain('.ts')
    expect(result.data).not.toContain('SECRET')
    expect(result.data).not.toContain('b.md')
  })

  it('lists all files when pattern is omitted', async () => {
    await writeFile(join(tmpDir, 'x.txt'), 'x')
    await writeFile(join(tmpDir, 'y.txt'), 'y')

    const result = await globTool.execute({ path: tmpDir }, defaultContext)

    expect(result.isError).toBe(false)
    expect(result.data).toContain('x.txt')
    expect(result.data).toContain('y.txt')
  })

  it('lists a single file when path is a file', async () => {
    const filePath = join(tmpDir, 'only.ts')
    await writeFile(filePath, 'body')

    const result = await globTool.execute({ path: filePath }, defaultContext)

    expect(result.isError).toBe(false)
    expect(result.data).toContain('only.ts')
  })

  it('returns no match when single file does not match pattern', async () => {
    const filePath = join(tmpDir, 'readme.md')
    await writeFile(filePath, '# doc')

    const result = await globTool.execute(
      { path: filePath, pattern: '*.ts' },
      defaultContext,
    )

    expect(result.isError).toBe(false)
    expect(result.data).toContain('No files matched')
  })

  it('recurses into subdirectories', async () => {
    const sub = join(tmpDir, 'nested')
    const { mkdir } = await import('fs/promises')
    await mkdir(sub, { recursive: true })
    await writeFile(join(sub, 'deep.ts'), '')

    const result = await globTool.execute(
      { path: tmpDir, pattern: '*.ts' },
      defaultContext,
    )

    expect(result.isError).toBe(false)
    expect(result.data).toContain('deep.ts')
  })

  it('errors on inaccessible path', async () => {
    const result = await globTool.execute(
      { path: '/nonexistent/path/xyz' },
      defaultContext,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toContain('Cannot access path')
  })

  it('notes truncation when maxFiles is exceeded', async () => {
    for (let i = 0; i < 5; i++) {
      await writeFile(join(tmpDir, `f${i}.txt`), '')
    }

    const result = await globTool.execute(
      { path: tmpDir, pattern: '*.txt', maxFiles: 3 },
      defaultContext,
    )

    expect(result.isError).toBe(false)
    const lines = (result.data as string).split('\n').filter((l) => l.endsWith('.txt'))
    expect(lines).toHaveLength(3)
    expect(result.data).toContain('capped at 3')
  })
})

// ===========================================================================
// grep (Node.js fallback — tests do not depend on ripgrep availability)
// ===========================================================================

@@ -391,3 +501,241 @@ describe('grep', () => {
    expect(result.data.toLowerCase()).toContain('no such file')
  })
})

// ===========================================================================
// delegate_to_agent
// ===========================================================================

const DELEGATE_OK: AgentRunResult = {
  success: true,
  output: 'research done',
  messages: [],
  tokenUsage: { input_tokens: 1, output_tokens: 2 },
  toolCalls: [],
}

describe('delegate_to_agent', () => {
  it('returns delegated agent output on success', async () => {
    const runDelegatedAgent = vi.fn().mockResolvedValue(DELEGATE_OK)
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        delegationDepth: 0,
        maxDelegationDepth: 3,
        delegationPool: { availableRunSlots: 2 },
        runDelegatedAgent,
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'bob', prompt: 'Summarize X.' },
      ctx,
    )

    expect(result.isError).toBe(false)
    expect(result.data).toBe('research done')
    expect(runDelegatedAgent).toHaveBeenCalledWith('bob', 'Summarize X.')
  })

  it('errors when delegation would form a cycle (A -> B -> A)', async () => {
    const ctx: ToolUseContext = {
      agent: { name: 'bob', role: 'worker', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        delegationDepth: 1,
        maxDelegationDepth: 5,
        delegationChain: ['alice', 'bob'],
        delegationPool: { availableRunSlots: 2 },
        runDelegatedAgent: vi.fn().mockResolvedValue(DELEGATE_OK),
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'alice', prompt: 'loop back' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toMatch(/Delegation cycle detected: alice -> bob -> alice/)
    expect(ctx.team!.runDelegatedAgent).not.toHaveBeenCalled()
  })

  it('surfaces delegated run tokenUsage via ToolResult.metadata', async () => {
    const runDelegatedAgent = vi.fn().mockResolvedValue({
      success: true,
      output: 'answer',
      messages: [],
      tokenUsage: { input_tokens: 123, output_tokens: 45 },
      toolCalls: [],
    } satisfies AgentRunResult)
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        delegationPool: { availableRunSlots: 2 },
        runDelegatedAgent,
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'bob', prompt: 'Hi' },
      ctx,
    )

    expect(result.metadata?.tokenUsage).toEqual({ input_tokens: 123, output_tokens: 45 })
  })

  it('errors when delegation is not configured', async () => {
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: { name: 't', agents: ['alice', 'bob'] },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'bob', prompt: 'Hi' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toMatch(/only available during orchestrated team runs/i)
  })

  it('errors for unknown target agent', async () => {
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        runDelegatedAgent: vi.fn(),
        delegationPool: { availableRunSlots: 1 },
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'charlie', prompt: 'Hi' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toMatch(/Unknown agent/)
  })

  it('errors on self-delegation', async () => {
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        runDelegatedAgent: vi.fn(),
        delegationPool: { availableRunSlots: 1 },
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'alice', prompt: 'Hi' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toMatch(/yourself/)
  })

  it('errors when delegation depth limit is reached', async () => {
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        delegationDepth: 3,
        maxDelegationDepth: 3,
        runDelegatedAgent: vi.fn(),
        delegationPool: { availableRunSlots: 1 },
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'bob', prompt: 'Hi' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toMatch(/Maximum delegation depth/)
  })

  it('errors fast when pool has no free slots without calling runDelegatedAgent', async () => {
    const runDelegatedAgent = vi.fn()
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        delegationPool: { availableRunSlots: 0 },
        runDelegatedAgent,
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'bob', prompt: 'Hi' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toMatch(/no free concurrency slot/i)
    expect(runDelegatedAgent).not.toHaveBeenCalled()
  })

  it('writes unique SharedMemory audit keys for repeated delegations', async () => {
    const store = new InMemoryStore()
    const runDelegatedAgent = vi.fn().mockResolvedValue(DELEGATE_OK)
    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        sharedMemory: store,
        delegationPool: { availableRunSlots: 2 },
        runDelegatedAgent,
      },
    }

    await delegateToAgentTool.execute({ target_agent: 'bob', prompt: 'a' }, ctx)
    await delegateToAgentTool.execute({ target_agent: 'bob', prompt: 'b' }, ctx)

    const keys = (await store.list()).map((e) => e.key)
    const delegationKeys = keys.filter((k) => k.includes('delegation:bob:'))
    expect(delegationKeys).toHaveLength(2)
    expect(delegationKeys[0]).not.toBe(delegationKeys[1])
  })

  it('returns isError when delegated run reports success false', async () => {
    const runDelegatedAgent = vi.fn().mockResolvedValue({
      success: false,
      output: 'delegated agent failed',
      messages: [],
      tokenUsage: { input_tokens: 0, output_tokens: 0 },
      toolCalls: [],
    } satisfies AgentRunResult)

    const ctx: ToolUseContext = {
      agent: { name: 'alice', role: 'lead', model: 'test' },
      team: {
        name: 't',
        agents: ['alice', 'bob'],
        delegationPool: { availableRunSlots: 1 },
        runDelegatedAgent,
      },
    }

    const result = await delegateToAgentTool.execute(
      { target_agent: 'bob', prompt: 'Hi' },
      ctx,
    )

    expect(result.isError).toBe(true)
    expect(result.data).toBe('delegated agent failed')
  })
})
@ -0,0 +1,69 @@
import { describe, expect, it } from 'vitest'
import {
  EXIT,
  parseArgs,
  serializeAgentResult,
  serializeTeamRunResult,
} from '../src/cli/oma.js'
import type { AgentRunResult, TeamRunResult } from '../src/types.js'

describe('parseArgs', () => {
  it('parses flags, key=value, and key value', () => {
    const a = parseArgs(['node', 'oma', 'run', '--goal', 'hello', '--team=x.json', '--pretty'])
    expect(a._[0]).toBe('run')
    expect(a.kv.get('goal')).toBe('hello')
    expect(a.kv.get('team')).toBe('x.json')
    expect(a.flags.has('pretty')).toBe(true)
  })
})

describe('serializeTeamRunResult', () => {
  it('maps agentResults to a plain object', () => {
    const ar: AgentRunResult = {
      success: true,
      output: 'ok',
      messages: [],
      tokenUsage: { input_tokens: 1, output_tokens: 2 },
      toolCalls: [],
    }
    const tr: TeamRunResult = {
      success: true,
      agentResults: new Map([['alice', ar]]),
      totalTokenUsage: { input_tokens: 1, output_tokens: 2 },
    }
    const json = serializeTeamRunResult(tr, { pretty: false, includeMessages: false })
    expect(json.success).toBe(true)
    expect((json.agentResults as Record<string, unknown>)['alice']).toMatchObject({
      success: true,
      output: 'ok',
    })
    expect((json.agentResults as Record<string, unknown>)['alice']).not.toHaveProperty('messages')
  })

  it('includes messages when requested', () => {
    const ar: AgentRunResult = {
      success: true,
      output: 'x',
      messages: [{ role: 'user', content: [{ type: 'text', text: 'hi' }] }],
      tokenUsage: { input_tokens: 0, output_tokens: 0 },
      toolCalls: [],
    }
    const tr: TeamRunResult = {
      success: true,
      agentResults: new Map([['bob', ar]]),
      totalTokenUsage: { input_tokens: 0, output_tokens: 0 },
    }
    const json = serializeTeamRunResult(tr, { pretty: false, includeMessages: true })
    expect(serializeAgentResult(ar, true).messages).toHaveLength(1)
    expect((json.agentResults as Record<string, unknown>)['bob']).toHaveProperty('messages')
  })
})

describe('EXIT', () => {
  it('uses stable numeric codes', () => {
    expect(EXIT.SUCCESS).toBe(0)
    expect(EXIT.RUN_FAILED).toBe(1)
    expect(EXIT.USAGE).toBe(2)
    expect(EXIT.INTERNAL).toBe(3)
  })
})
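The `parseArgs` contract exercised above (positional subcommand, `--key value`, `--key=value`, and bare `--flag`) can be sketched as follows. This is an illustrative reimplementation of that contract, not the actual `src/cli/oma.ts` code:

```typescript
// Illustrative sketch of the argv contract the test exercises; the real
// parseArgs in src/cli/oma.ts may differ in details.
interface ParsedArgs {
  _: string[]              // positional arguments (e.g. the subcommand)
  kv: Map<string, string>  // --key value and --key=value pairs
  flags: Set<string>       // bare --flag switches
}

function parseArgsSketch(argv: string[]): ParsedArgs {
  const out: ParsedArgs = { _: [], kv: new Map(), flags: new Set() }
  const args = argv.slice(2) // skip the node executable and script path
  for (let i = 0; i < args.length; i++) {
    const arg = args[i]!
    if (arg.startsWith('--')) {
      const body = arg.slice(2)
      const eq = body.indexOf('=')
      if (eq !== -1) {
        out.kv.set(body.slice(0, eq), body.slice(eq + 1)) // --key=value
      } else if (i + 1 < args.length && !args[i + 1]!.startsWith('--')) {
        out.kv.set(body, args[++i]!) // --key value (consume the next token)
      } else {
        out.flags.add(body) // bare --flag
      }
    } else {
      out._.push(arg)
    }
  }
  return out
}
```

The ambiguity worth noting: a bare `--flag` followed by another `--option` is only distinguishable from `--key value` by peeking at the next token, which is what the `--pretty` case in the test relies on.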
@ -0,0 +1,626 @@
import { describe, it, expect, vi } from 'vitest'
import { z } from 'zod'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import type { LLMAdapter, LLMChatOptions, LLMMessage, LLMResponse, TraceEvent } from '../src/types.js'

function textResponse(text: string): LLMResponse {
  return {
    id: `resp-${Math.random().toString(36).slice(2)}`,
    content: [{ type: 'text', text }],
    model: 'mock-model',
    stop_reason: 'end_turn',
    usage: { input_tokens: 10, output_tokens: 20 },
  }
}

function toolUseResponse(toolName: string, input: Record<string, unknown>): LLMResponse {
  return {
    id: `resp-${Math.random().toString(36).slice(2)}`,
    content: [{
      type: 'tool_use',
      id: `tu-${Math.random().toString(36).slice(2)}`,
      name: toolName,
      input,
    }],
    model: 'mock-model',
    stop_reason: 'tool_use',
    usage: { input_tokens: 15, output_tokens: 25 },
  }
}

function buildRegistryAndExecutor(): { registry: ToolRegistry; executor: ToolExecutor } {
  const registry = new ToolRegistry()
  registry.register(
    defineTool({
      name: 'echo',
      description: 'Echo input',
      inputSchema: z.object({ message: z.string() }),
      async execute({ message }) {
        return { data: message }
      },
    }),
  )
  return { registry, executor: new ToolExecutor(registry) }
}

describe('AgentRunner contextStrategy', () => {
  it('keeps baseline behavior when contextStrategy is not set', async () => {
    const calls: LLMMessage[][] = []
    const adapter: LLMAdapter = {
      name: 'mock',
      async chat(messages) {
        calls.push(messages.map(m => ({ role: m.role, content: m.content })))
        return calls.length === 1
          ? toolUseResponse('echo', { message: 'hello' })
          : textResponse('done')
      },
      async *stream() {
        /* unused */
      },
    }
    const { registry, executor } = buildRegistryAndExecutor()
    const runner = new AgentRunner(adapter, registry, executor, {
      model: 'mock-model',
      allowedTools: ['echo'],
      maxTurns: 4,
    })

    await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
    expect(calls).toHaveLength(2)
    expect(calls[0]).toHaveLength(1)
    expect(calls[1]!.length).toBeGreaterThan(calls[0]!.length)
  })
  it('sliding-window truncates old turns and preserves the first user message', async () => {
    const calls: LLMMessage[][] = []
    const responses = [
      toolUseResponse('echo', { message: 't1' }),
      toolUseResponse('echo', { message: 't2' }),
      toolUseResponse('echo', { message: 't3' }),
      textResponse('done'),
    ]
    let idx = 0
    const adapter: LLMAdapter = {
      name: 'mock',
      async chat(messages) {
        calls.push(messages.map(m => ({ role: m.role, content: m.content })))
        return responses[idx++]!
      },
      async *stream() {
        /* unused */
      },
    }
    const { registry, executor } = buildRegistryAndExecutor()
    const runner = new AgentRunner(adapter, registry, executor, {
      model: 'mock-model',
      allowedTools: ['echo'],
      maxTurns: 8,
      contextStrategy: { type: 'sliding-window', maxTurns: 1 },
    })

    await runner.run([{ role: 'user', content: [{ type: 'text', text: 'original prompt' }] }])

    const laterCall = calls[calls.length - 1]!
    const firstUserText = laterCall[0]!.content[0]
    expect(firstUserText).toMatchObject({ type: 'text', text: 'original prompt' })
    const flattenedText = laterCall.flatMap(m => m.content.filter(c => c.type === 'text'))
    expect(flattenedText.some(c => c.type === 'text' && c.text.includes('truncated'))).toBe(true)
  })

  it('summarize strategy replaces old context and emits summary trace call', async () => {
    const calls: Array<{ messages: LLMMessage[]; options: LLMChatOptions }> = []
    const traces: TraceEvent[] = []
    const responses = [
      toolUseResponse('echo', { message: 'first turn payload '.repeat(20) }),
      toolUseResponse('echo', { message: 'second turn payload '.repeat(20) }),
      textResponse('This is a concise summary.'),
      textResponse('final answer'),
    ]
    let idx = 0
    const adapter: LLMAdapter = {
      name: 'mock',
      async chat(messages, options) {
        calls.push({ messages: messages.map(m => ({ role: m.role, content: m.content })), options })
        return responses[idx++]!
      },
      async *stream() {
        /* unused */
      },
    }
    const { registry, executor } = buildRegistryAndExecutor()
    const runner = new AgentRunner(adapter, registry, executor, {
      model: 'mock-model',
      allowedTools: ['echo'],
      maxTurns: 8,
      contextStrategy: { type: 'summarize', maxTokens: 20 },
    })

    const result = await runner.run(
      [{ role: 'user', content: [{ type: 'text', text: 'start' }] }],
      { onTrace: (e) => { traces.push(e) }, runId: 'run-summary', traceAgent: 'context-agent' },
    )

    const summaryCall = calls.find(c => c.messages.length === 1 && c.options.tools === undefined)
    expect(summaryCall).toBeDefined()
    const llmTraces = traces.filter(t => t.type === 'llm_call')
    expect(llmTraces.some(t => t.type === 'llm_call' && t.phase === 'summary')).toBe(true)

    // Summary adapter usage must count toward RunResult.tokenUsage (maxTokenBudget).
    expect(result.tokenUsage.input_tokens).toBe(15 + 15 + 10 + 10)
    expect(result.tokenUsage.output_tokens).toBe(25 + 25 + 20 + 20)

    // After compaction, summary text is folded into the next user turn (not a
    // standalone user message), preserving user/assistant alternation.
    const turnAfterSummary = calls.find(
      c => c.messages.some(
        m => m.role === 'user' && m.content.some(
          b => b.type === 'text' && b.text.includes('[Conversation summary]'),
        ),
      ),
    )
    expect(turnAfterSummary).toBeDefined()
    const rolesAfterFirstUser = turnAfterSummary!.messages.map(m => m.role).join(',')
    expect(rolesAfterFirstUser).not.toMatch(/^user,user/)
  })
  it('custom strategy calls compress callback and uses returned messages', async () => {
    const compress = vi.fn((messages: LLMMessage[]) => messages.slice(-1))
    const calls: LLMMessage[][] = []
    const responses = [
      toolUseResponse('echo', { message: 'hello' }),
      textResponse('done'),
    ]
    let idx = 0
    const adapter: LLMAdapter = {
      name: 'mock',
      async chat(messages) {
        calls.push(messages.map(m => ({ role: m.role, content: m.content })))
        return responses[idx++]!
      },
      async *stream() {
        /* unused */
      },
    }
    const { registry, executor } = buildRegistryAndExecutor()
    const runner = new AgentRunner(adapter, registry, executor, {
      model: 'mock-model',
      allowedTools: ['echo'],
      maxTurns: 4,
      contextStrategy: {
        type: 'custom',
        compress,
      },
    })

    await runner.run([{ role: 'user', content: [{ type: 'text', text: 'custom prompt' }] }])

    expect(compress).toHaveBeenCalledOnce()
    expect(calls[1]).toHaveLength(1)
  })

  // -------------------------------------------------------------------------
  // compact strategy
  // -------------------------------------------------------------------------
  describe('compact strategy', () => {
    const longText = 'x'.repeat(3000)
    const longToolResult = 'result-data '.repeat(100) // ~1200 chars

    function buildMultiTurnAdapter(
      responseCount: number,
      calls: LLMMessage[][],
    ): LLMAdapter {
      const responses: LLMResponse[] = []
      for (let i = 0; i < responseCount - 1; i++) {
        responses.push(toolUseResponse('echo', { message: `turn-${i}` }))
      }
      responses.push(textResponse('done'))
      let idx = 0
      return {
        name: 'mock',
        async chat(messages) {
          calls.push(messages.map(m => ({ role: m.role, content: m.content })))
          return responses[idx++]!
        },
        async *stream() { /* unused */ },
      }
    }

    /** Build a registry with an echo tool that returns a fixed result string. */
    function buildEchoRegistry(result: string): { registry: ToolRegistry; executor: ToolExecutor } {
      const registry = new ToolRegistry()
      registry.register(
        defineTool({
          name: 'echo',
          description: 'Echo input',
          inputSchema: z.object({ message: z.string() }),
          async execute() {
            return { data: result }
          },
        }),
      )
      return { registry, executor: new ToolExecutor(registry) }
    }

    it('does not activate below maxTokens threshold', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(3, calls)
      const { registry, executor } = buildEchoRegistry('short')
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: { type: 'compact', maxTokens: 999999 },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      // On the 3rd call (turn 3), all previous messages should still be intact
      // because estimated tokens are way below the threshold.
      const lastCall = calls[calls.length - 1]!
      const allToolResults = lastCall.flatMap(m =>
        m.content.filter(b => b.type === 'tool_result'),
      )
      for (const tr of allToolResults) {
        if (tr.type === 'tool_result') {
          expect(tr.content).not.toContain('compacted')
        }
      }
    })
    it('compresses old tool_result blocks when tokens exceed threshold', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(4, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20, // very low to always trigger
          preserveRecentTurns: 1, // only protect the most recent turn
          minToolResultChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      // On the last call, old tool results should have compact markers.
      const lastCall = calls[calls.length - 1]!
      const toolResults = lastCall.flatMap(m =>
        m.content.filter(b => b.type === 'tool_result'),
      )
      const compacted = toolResults.filter(
        b => b.type === 'tool_result' && b.content.includes('compacted'),
      )
      expect(compacted.length).toBeGreaterThan(0)
      // Marker should include tool name.
      for (const tr of compacted) {
        if (tr.type === 'tool_result') {
          expect(tr.content).toMatch(/\[Tool result: echo/)
        }
      }
    })

    it('preserves the first user message', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(4, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'original prompt' }] }])

      const lastCall = calls[calls.length - 1]!
      const firstUser = lastCall.find(m => m.role === 'user')!
      expect(firstUser.content[0]).toMatchObject({ type: 'text', text: 'original prompt' })
    })

    it('preserves tool_use blocks in old turns', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(4, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      // Every assistant message should still have its tool_use block.
      const lastCall = calls[calls.length - 1]!
      const assistantMsgs = lastCall.filter(m => m.role === 'assistant')
      for (const msg of assistantMsgs) {
        const toolUses = msg.content.filter(b => b.type === 'tool_use')
        // The last assistant message is "done" (text only), others have tool_use.
        if (msg.content.some(b => b.type === 'text' && b.text === 'done')) continue
        expect(toolUses.length).toBeGreaterThan(0)
      }
    })
    it('preserves error tool_result blocks', async () => {
      const calls: LLMMessage[][] = []
      const responses: LLMResponse[] = [
        toolUseResponse('echo', { message: 'will-fail' }),
        toolUseResponse('echo', { message: 'ok' }),
        textResponse('done'),
      ]
      let idx = 0
      const adapter: LLMAdapter = {
        name: 'mock',
        async chat(messages) {
          calls.push(messages.map(m => ({ role: m.role, content: m.content })))
          return responses[idx++]!
        },
        async *stream() { /* unused */ },
      }
      // Tool that fails on first call, succeeds on second.
      let callCount = 0
      const registry = new ToolRegistry()
      registry.register(
        defineTool({
          name: 'echo',
          description: 'Echo input',
          inputSchema: z.object({ message: z.string() }),
          async execute() {
            callCount++
            if (callCount === 1) {
              throw new Error('deliberate error '.repeat(40))
            }
            return { data: longToolResult }
          },
        }),
      )
      const executor = new ToolExecutor(registry)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 50,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      const lastCall = calls[calls.length - 1]!
      const errorResults = lastCall.flatMap(m =>
        m.content.filter(b => b.type === 'tool_result' && b.is_error),
      )
      // Error results should still have their original content (not compacted).
      for (const er of errorResults) {
        if (er.type === 'tool_result') {
          expect(er.content).not.toContain('compacted')
          expect(er.content).toContain('deliberate error')
        }
      }
    })

    it('does not re-compress markers from compressToolResults', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(4, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        compressToolResults: { minChars: 100 },
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 10,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      const lastCall = calls[calls.length - 1]!
      const allToolResults = lastCall.flatMap(m =>
        m.content.filter(b => b.type === 'tool_result'),
      )
      // No result should contain nested markers.
      for (const tr of allToolResults) {
        if (tr.type === 'tool_result') {
          // Should not have a compact marker wrapping another marker.
          const markerCount = (tr.content.match(/\[Tool/g) || []).length
          expect(markerCount).toBeLessThanOrEqual(1)
        }
      }
    })
    it('truncates long assistant text blocks in old turns', async () => {
      const calls: LLMMessage[][] = []
      const responses: LLMResponse[] = [
        // First turn: assistant with long text + tool_use
        {
          id: 'r1',
          content: [
            { type: 'text', text: longText },
            { type: 'tool_use', id: 'tu-1', name: 'echo', input: { message: 'hi' } },
          ],
          model: 'mock-model',
          stop_reason: 'tool_use',
          usage: { input_tokens: 10, output_tokens: 20 },
        },
        toolUseResponse('echo', { message: 'turn2' }),
        textResponse('done'),
      ]
      let idx = 0
      const adapter: LLMAdapter = {
        name: 'mock',
        async chat(messages) {
          calls.push(messages.map(m => ({ role: m.role, content: m.content })))
          return responses[idx++]!
        },
        async *stream() { /* unused */ },
      }
      const { registry, executor } = buildEchoRegistry('short')
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minTextBlockChars: 500,
          textBlockExcerptChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      const lastCall = calls[calls.length - 1]!
      // The first assistant message (old zone) should have its text truncated.
      const firstAssistant = lastCall.find(m => m.role === 'assistant')!
      const textBlocks = firstAssistant.content.filter(b => b.type === 'text')
      const truncated = textBlocks.find(
        b => b.type === 'text' && b.text.includes('truncated'),
      )
      expect(truncated).toBeDefined()
      if (truncated && truncated.type === 'text') {
        expect(truncated.text.length).toBeLessThan(longText.length)
        expect(truncated.text).toContain(`${longText.length} chars total`)
      }
    })

    it('keeps recent turns intact within preserveRecentTurns', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(4, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      // The most recent tool_result (last user message with tool_result) should
      // still contain the original long content.
      const lastCall = calls[calls.length - 1]!
      const userMsgs = lastCall.filter(m => m.role === 'user')
      const lastUserWithToolResult = [...userMsgs]
        .reverse()
        .find(m => m.content.some(b => b.type === 'tool_result'))
      expect(lastUserWithToolResult).toBeDefined()
      const recentTr = lastUserWithToolResult!.content.find(b => b.type === 'tool_result')
      if (recentTr && recentTr.type === 'tool_result') {
        expect(recentTr.content).not.toContain('compacted')
        expect(recentTr.content).toContain('result-data')
      }
    })

    it('does not compact when all turns fit in preserveRecentTurns', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(3, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 10, // way more than actual turns
          minToolResultChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      // All tool results should still have original content.
      const lastCall = calls[calls.length - 1]!
      const toolResults = lastCall.flatMap(m =>
        m.content.filter(b => b.type === 'tool_result'),
      )
      for (const tr of toolResults) {
        if (tr.type === 'tool_result') {
          expect(tr.content).not.toContain('compacted')
        }
      }
    })

    it('maintains correct role alternation after compaction', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(5, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 10,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 100,
        },
      })

      await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])

      // Check all LLM calls for role alternation.
      for (const callMsgs of calls) {
        for (let i = 1; i < callMsgs.length; i++) {
          expect(callMsgs[i]!.role).not.toBe(callMsgs[i - 1]!.role)
        }
      }
    })

    it('returns ZERO_USAGE (no LLM cost from compaction)', async () => {
      const calls: LLMMessage[][] = []
      const adapter = buildMultiTurnAdapter(4, calls)
      const { registry, executor } = buildEchoRegistry(longToolResult)
      const runner = new AgentRunner(adapter, registry, executor, {
        model: 'mock-model',
        allowedTools: ['echo'],
        maxTurns: 8,
        contextStrategy: {
          type: 'compact',
          maxTokens: 20,
          preserveRecentTurns: 1,
          minToolResultChars: 100,
        },
      })

      const result = await runner.run([
        { role: 'user', content: [{ type: 'text', text: 'start' }] },
      ])

      // Token usage should only reflect the 4 actual LLM calls (no extra from compaction).
      // Each toolUseResponse: input=15, output=25. textResponse: input=10, output=20.
      // 3 tool calls + 1 final = (15*3 + 10) input, (25*3 + 20) output.
      expect(result.tokenUsage.input_tokens).toBe(15 * 3 + 10)
      expect(result.tokenUsage.output_tokens).toBe(25 * 3 + 20)
    })
  })
})
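The compact-strategy rules asserted above (replace an oversized old `tool_result` body with a `[Tool result: <name> ...]` marker containing the word "compacted", leave error results and already-compacted markers untouched, and skip results below `minToolResultChars`) can be sketched roughly per block like this. The `compactToolResult` helper and the exact marker wording are illustrative assumptions; the real logic lives inside the runner's `compact` context strategy:

```typescript
// Illustrative per-block sketch of the compaction rule the tests assert;
// the actual implementation also handles turn zones (preserveRecentTurns).
interface ToolResultBlock {
  type: 'tool_result'
  toolName: string
  content: string
  is_error?: boolean
}

function compactToolResult(
  block: ToolResultBlock,
  minToolResultChars: number,
): ToolResultBlock {
  if (block.is_error) return block                            // keep error details intact
  if (block.content.startsWith('[Tool result:')) return block // never re-compact a marker
  if (block.content.length < minToolResultChars) return block // too small to bother
  return {
    ...block,
    content: `[Tool result: ${block.toolName} (${block.content.length} chars, compacted)]`,
  }
}
```

Because a marker starts with `[Tool result:`, running the pass twice is idempotent, which is exactly what the "does not re-compress markers" test checks.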
@ -0,0 +1,405 @@
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'
import { textMsg, chatOpts, toolDef, collectEvents } from './helpers/llm-fixtures.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../src/types.js'

// ---------------------------------------------------------------------------
// Mock OpenAI SDK (Copilot uses it under the hood)
// ---------------------------------------------------------------------------

const mockCreate = vi.hoisted(() => vi.fn())
const OpenAIMock = vi.hoisted(() =>
  vi.fn(() => ({
    chat: { completions: { create: mockCreate } },
  })),
)

vi.mock('openai', () => ({
  default: OpenAIMock,
  OpenAI: OpenAIMock,
}))

// ---------------------------------------------------------------------------
// Mock global fetch for token management
// ---------------------------------------------------------------------------

const originalFetch = globalThis.fetch

function mockFetchForToken(sessionToken = 'cop_session_abc', expiresAt?: number) {
  const exp = expiresAt ?? Math.floor(Date.now() / 1000) + 3600
  return vi.fn().mockResolvedValue({
    ok: true,
    json: () => Promise.resolve({ token: sessionToken, expires_at: exp }),
    text: () => Promise.resolve(''),
  })
}

import { CopilotAdapter, getCopilotMultiplier, formatCopilotMultiplier } from '../src/llm/copilot.js'

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

function makeCompletion(overrides: Record<string, unknown> = {}) {
  return {
    id: 'chatcmpl-cop',
    model: 'claude-sonnet-4',
    choices: [{
      index: 0,
      message: { role: 'assistant', content: 'Hello from Copilot', tool_calls: undefined },
      finish_reason: 'stop',
    }],
    usage: { prompt_tokens: 8, completion_tokens: 4 },
    ...overrides,
  }
}

async function* makeChunks(chunks: Array<Record<string, unknown>>) {
  for (const chunk of chunks) yield chunk
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe('CopilotAdapter', () => {
  let savedEnv: Record<string, string | undefined>

  beforeEach(() => {
    vi.clearAllMocks()
    savedEnv = {
      GITHUB_COPILOT_TOKEN: process.env['GITHUB_COPILOT_TOKEN'],
      GITHUB_TOKEN: process.env['GITHUB_TOKEN'],
    }
    delete process.env['GITHUB_COPILOT_TOKEN']
    delete process.env['GITHUB_TOKEN']
  })

  afterEach(() => {
    globalThis.fetch = originalFetch
    for (const [key, val] of Object.entries(savedEnv)) {
      if (val === undefined) delete process.env[key]
      else process.env[key] = val
    }
  })

  // =========================================================================
  // Constructor & token resolution
  // =========================================================================

  describe('constructor', () => {
    it('accepts string apiKey as first argument', () => {
      const adapter = new CopilotAdapter('gh_token_123')
      expect(adapter.name).toBe('copilot')
    })

    it('accepts options object with apiKey', () => {
      const adapter = new CopilotAdapter({ apiKey: 'gh_token_456' })
      expect(adapter.name).toBe('copilot')
    })

    it('falls back to GITHUB_COPILOT_TOKEN env var', () => {
      process.env['GITHUB_COPILOT_TOKEN'] = 'env_copilot_token'
      const adapter = new CopilotAdapter()
      expect(adapter.name).toBe('copilot')
    })

    it('falls back to GITHUB_TOKEN env var', () => {
      process.env['GITHUB_TOKEN'] = 'env_gh_token'
      const adapter = new CopilotAdapter()
      expect(adapter.name).toBe('copilot')
    })
  })

  // =========================================================================
  // Token management
  // =========================================================================

  describe('token management', () => {
    it('exchanges GitHub token for Copilot session token', async () => {
      const fetchMock = mockFetchForToken('session_xyz')
      globalThis.fetch = fetchMock
      const adapter = new CopilotAdapter('gh_token')
      mockCreate.mockResolvedValue(makeCompletion())

      await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      // fetch was called to exchange token
      expect(fetchMock).toHaveBeenCalledWith(
        'https://api.github.com/copilot_internal/v2/token',
        expect.objectContaining({
          method: 'GET',
          headers: expect.objectContaining({
            Authorization: 'token gh_token',
          }),
        }),
      )

      // OpenAI client was created with session token
      expect(OpenAIMock).toHaveBeenCalledWith(
        expect.objectContaining({
          apiKey: 'session_xyz',
          baseURL: 'https://api.githubcopilot.com',
        }),
      )
    })

    it('caches session token and reuses on second call', async () => {
      const fetchMock = mockFetchForToken()
      globalThis.fetch = fetchMock
      const adapter = new CopilotAdapter('gh_token')
      mockCreate.mockResolvedValue(makeCompletion())

      await adapter.chat([textMsg('user', 'Hi')], chatOpts())
      await adapter.chat([textMsg('user', 'Hi again')], chatOpts())

      // fetch should only be called once (cached)
      expect(fetchMock).toHaveBeenCalledTimes(1)
    })

    it('refreshes token when near expiry (within 60s)', async () => {
      const nowSec = Math.floor(Date.now() / 1000)
      // First call: token expires in 30 seconds (within 60s grace)
      let callCount = 0
      globalThis.fetch = vi.fn().mockImplementation(() => {
        callCount++
        return Promise.resolve({
          ok: true,
          json: () => Promise.resolve({
            token: `session_${callCount}`,
            expires_at: callCount === 1 ? nowSec + 30 : nowSec + 3600,
          }),
          text: () => Promise.resolve(''),
        })
      })

      const adapter = new CopilotAdapter('gh_token')
      mockCreate.mockResolvedValue(makeCompletion())

      await adapter.chat([textMsg('user', 'Hi')], chatOpts())
      // Token is within 60s of expiry, should refresh
      await adapter.chat([textMsg('user', 'Hi again')], chatOpts())

      expect(callCount).toBe(2)
    })

    it('concurrent requests share a single refresh promise', async () => {
      let resolveToken: ((v: unknown) => void) | undefined
      const slowFetch = vi.fn().mockImplementation(() => {
        return new Promise((resolve) => {
          resolveToken = resolve
        })
      })
      globalThis.fetch = slowFetch

      const adapter = new CopilotAdapter('gh_token')
      mockCreate.mockResolvedValue(makeCompletion())

      // Fire two concurrent requests
      const p1 = adapter.chat([textMsg('user', 'A')], chatOpts())
      const p2 = adapter.chat([textMsg('user', 'B')], chatOpts())

      // Resolve the single in-flight fetch
      resolveToken!({
        ok: true,
        json: () => Promise.resolve({
          token: 'shared_session',
          expires_at: Math.floor(Date.now() / 1000) + 3600,
        }),
        text: () => Promise.resolve(''),
      })

      await Promise.all([p1, p2])

      // fetch was called only once (mutex prevented double refresh)
      expect(slowFetch).toHaveBeenCalledTimes(1)
    })

    it('throws on failed token exchange', async () => {
      globalThis.fetch = vi.fn().mockResolvedValue({
|
||||
ok: false,
|
||||
status: 401,
|
||||
text: () => Promise.resolve('Unauthorized'),
|
||||
statusText: 'Unauthorized',
|
||||
})
|
||||
|
||||
const adapter = new CopilotAdapter('bad_token')
|
||||
mockCreate.mockResolvedValue(makeCompletion())
|
||||
|
||||
await expect(
|
||||
adapter.chat([textMsg('user', 'Hi')], chatOpts()),
|
||||
).rejects.toThrow('Copilot token exchange failed')
|
||||
})
|
||||
})
|
||||
|
||||
// =========================================================================
|
||||
// chat()
|
||||
// =========================================================================
|
||||
|
||||
describe('chat()', () => {
|
||||
let adapter: CopilotAdapter
|
||||
|
||||
beforeEach(() => {
|
||||
globalThis.fetch = mockFetchForToken()
|
||||
adapter = new CopilotAdapter('gh_token')
|
||||
})
|
||||
|
||||
it('creates OpenAI client with Copilot-specific headers and baseURL', async () => {
|
||||
mockCreate.mockResolvedValue(makeCompletion())
|
||||
|
||||
await adapter.chat([textMsg('user', 'Hi')], chatOpts())
|
||||
|
||||
expect(OpenAIMock).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
baseURL: 'https://api.githubcopilot.com',
|
||||
defaultHeaders: expect.objectContaining({
|
||||
'Copilot-Integration-Id': 'vscode-chat',
|
||||
'Editor-Version': 'vscode/1.100.0',
|
||||
}),
|
||||
}),
|
||||
)
|
||||
})
|
||||
|
||||
it('returns LLMResponse from completion', async () => {
|
||||
mockCreate.mockResolvedValue(makeCompletion())
|
||||
|
||||
const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())
|
||||
|
||||
expect(result).toEqual({
|
||||
id: 'chatcmpl-cop',
|
||||
content: [{ type: 'text', text: 'Hello from Copilot' }],
|
||||
model: 'claude-sonnet-4',
|
||||
stop_reason: 'end_turn',
|
||||
usage: { input_tokens: 8, output_tokens: 4 },
|
||||
})
|
||||
})
|
||||
|
||||
it('passes tools and temperature through', async () => {
|
||||
mockCreate.mockResolvedValue(makeCompletion())
|
||||
const tool = toolDef('search')
|
||||
|
||||
await adapter.chat(
|
||||
[textMsg('user', 'Hi')],
|
||||
chatOpts({ tools: [tool], temperature: 0.5 }),
|
||||
)
|
||||
|
||||
const callArgs = mockCreate.mock.calls[0][0]
|
||||
expect(callArgs.tools[0].function.name).toBe('search')
|
||||
expect(callArgs.temperature).toBe(0.5)
|
||||
expect(callArgs.stream).toBe(false)
|
||||
})
|
||||
})
|
||||
|
||||
// =========================================================================
|
||||
// stream()
|
||||
// =========================================================================
|
||||
|
||||
describe('stream()', () => {
|
||||
let adapter: CopilotAdapter
|
||||
|
||||
beforeEach(() => {
|
||||
globalThis.fetch = mockFetchForToken()
|
||||
adapter = new CopilotAdapter('gh_token')
|
||||
})
|
||||
|
||||
it('yields text and done events', async () => {
|
||||
mockCreate.mockResolvedValue(makeChunks([
|
||||
{ id: 'c1', model: 'gpt-4o', choices: [{ index: 0, delta: { content: 'Hi' }, finish_reason: null }], usage: null },
|
||||
{ id: 'c1', model: 'gpt-4o', choices: [{ index: 0, delta: {}, finish_reason: 'stop' }], usage: null },
|
||||
{ id: 'c1', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 5, completion_tokens: 2 } },
|
||||
]))
|
||||
|
||||
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
|
||||
|
||||
expect(events.filter(e => e.type === 'text')).toEqual([
|
||||
{ type: 'text', data: 'Hi' },
|
||||
])
|
||||
const done = events.find(e => e.type === 'done')
|
||||
expect((done!.data as LLMResponse).usage).toEqual({ input_tokens: 5, output_tokens: 2 })
|
||||
})
|
||||
|
||||
it('yields tool_use events from streamed tool calls', async () => {
|
||||
mockCreate.mockResolvedValue(makeChunks([
|
||||
{
|
||||
id: 'c1', model: 'gpt-4o',
|
||||
choices: [{ index: 0, delta: { tool_calls: [{ index: 0, id: 'call_1', function: { name: 'search', arguments: '{"q":"x"}' } }] }, finish_reason: null }],
|
||||
usage: null,
|
||||
},
|
||||
{ id: 'c1', model: 'gpt-4o', choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }], usage: null },
|
||||
{ id: 'c1', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 5, completion_tokens: 3 } },
|
||||
]))
|
||||
|
||||
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
|
||||
|
||||
const toolEvents = events.filter(e => e.type === 'tool_use')
|
||||
expect(toolEvents).toHaveLength(1)
|
||||
expect((toolEvents[0].data as ToolUseBlock).name).toBe('search')
|
||||
})
|
||||
|
||||
it('yields error event on failure', async () => {
|
||||
mockCreate.mockResolvedValue(
|
||||
(async function* () { throw new Error('Copilot down') })(),
|
||||
)
|
||||
|
||||
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
|
||||
|
||||
expect(events.filter(e => e.type === 'error')).toHaveLength(1)
|
||||
})
|
||||
})
|
||||
|
||||
// =========================================================================
|
||||
// getCopilotMultiplier()
|
||||
// =========================================================================
|
||||
|
||||
describe('getCopilotMultiplier()', () => {
|
||||
it('returns 0 for included models', () => {
|
||||
expect(getCopilotMultiplier('gpt-4.1')).toBe(0)
|
||||
expect(getCopilotMultiplier('gpt-4o')).toBe(0)
|
||||
expect(getCopilotMultiplier('gpt-5-mini')).toBe(0)
|
||||
})
|
||||
|
||||
it('returns 0.25 for grok models', () => {
|
||||
expect(getCopilotMultiplier('grok-code-fast-1')).toBe(0.25)
|
||||
})
|
||||
|
||||
it('returns 0.33 for haiku, gemini-3-flash, etc.', () => {
|
||||
expect(getCopilotMultiplier('claude-haiku-4.5')).toBe(0.33)
|
||||
expect(getCopilotMultiplier('gemini-3-flash')).toBe(0.33)
|
||||
})
|
||||
|
||||
it('returns 1 for sonnet, gemini-pro, gpt-5.x', () => {
|
||||
expect(getCopilotMultiplier('claude-sonnet-4')).toBe(1)
|
||||
expect(getCopilotMultiplier('gemini-2.5-pro')).toBe(1)
|
||||
expect(getCopilotMultiplier('gpt-5.1')).toBe(1)
|
||||
})
|
||||
|
||||
it('returns 3 for claude-opus (non-fast)', () => {
|
||||
expect(getCopilotMultiplier('claude-opus-4.5')).toBe(3)
|
||||
})
|
||||
|
||||
it('returns 30 for claude-opus fast', () => {
|
||||
expect(getCopilotMultiplier('claude-opus-4.6-fast')).toBe(30)
|
||||
})
|
||||
|
||||
it('returns 1 for unknown models', () => {
|
||||
expect(getCopilotMultiplier('some-new-model')).toBe(1)
|
||||
})
|
||||
})
|
||||
|
||||
// =========================================================================
|
||||
// formatCopilotMultiplier()
|
||||
// =========================================================================
|
||||
|
||||
describe('formatCopilotMultiplier()', () => {
|
||||
it('returns "included (0\u00d7)" for 0', () => {
|
||||
expect(formatCopilotMultiplier(0)).toBe('included (0\u00d7)')
|
||||
})
|
||||
|
||||
it('returns "1\u00d7 premium request" for 1', () => {
|
||||
expect(formatCopilotMultiplier(1)).toBe('1\u00d7 premium request')
|
||||
})
|
||||
|
||||
it('returns "0.33\u00d7 premium request" for 0.33', () => {
|
||||
expect(formatCopilotMultiplier(0.33)).toBe('0.33\u00d7 premium request')
|
||||
})
|
||||
})
|
||||
})
|
||||
|
|
@ -0,0 +1,46 @@
import { describe, expect, it } from 'vitest'
import { layoutTasks } from '../src/dashboard/layout-tasks.js'

describe('layoutTasks', () => {
  it('assigns increasing columns along a dependency chain (topological levels)', () => {
    const tasks = [
      { id: 'a', dependsOn: [] as const },
      { id: 'b', dependsOn: ['a'] as const },
      { id: 'c', dependsOn: ['b'] as const },
    ]
    const { positions } = layoutTasks(tasks)
    expect(positions.get('a')!.x).toBeLessThan(positions.get('b')!.x)
    expect(positions.get('b')!.x).toBeLessThan(positions.get('c')!.x)
  })

  it('places a merge node after all of its dependencies (diamond)', () => {
    const tasks = [
      { id: 'root', dependsOn: [] as const },
      { id: 'left', dependsOn: ['root'] as const },
      { id: 'right', dependsOn: ['root'] as const },
      { id: 'merge', dependsOn: ['left', 'right'] as const },
    ]
    const { positions } = layoutTasks(tasks)
    const mx = positions.get('merge')!.x
    expect(mx).toBeGreaterThan(positions.get('left')!.x)
    expect(mx).toBeGreaterThan(positions.get('right')!.x)
  })

  it('orders independent roots in the same column with distinct rows', () => {
    const tasks = [
      { id: 'a', dependsOn: [] as const },
      { id: 'b', dependsOn: [] as const },
    ]
    const { positions } = layoutTasks(tasks)
    expect(positions.get('a')!.x).toBe(positions.get('b')!.x)
    expect(positions.get('a')!.y).not.toBe(positions.get('b')!.y)
  })

  it('throws when task dependencies contain a cycle', () => {
    const tasks = [
      { id: 'a', dependsOn: ['b'] as const },
      { id: 'b', dependsOn: ['a'] as const },
    ]
    expect(() => layoutTasks(tasks)).toThrow('Task dependency graph contains a cycle')
  })
})
@ -0,0 +1,92 @@
import { describe, expect, it } from 'vitest'
import { renderTeamRunDashboard } from '../src/dashboard/render-team-run-dashboard.js'

describe('renderTeamRunDashboard', () => {
  it('does not embed unescaped script terminators in the JSON payload and keeps XSS payloads out of HTML markup', () => {
    const malicious = '"</script><img src=x onerror=alert(1)>"'
    const html = renderTeamRunDashboard({
      success: true,
      goal: 'safe-goal',
      tasks: [
        {
          id: 't1',
          title: malicious,
          status: 'pending',
          dependsOn: [],
        },
      ],
      agentResults: new Map(),
      totalTokenUsage: { input_tokens: 0, output_tokens: 0 },
    })

    const dataOpen = 'id="oma-data">'
    const start = html.indexOf(dataOpen)
    expect(start).toBeGreaterThan(-1)
    const contentStart = start + dataOpen.length
    const end = html.indexOf('</script>', contentStart)
    expect(end).toBeGreaterThan(contentStart)
    const jsonSlice = html.slice(contentStart, end)
    expect(jsonSlice.toLowerCase()).not.toContain('</script')

    const parsed = JSON.parse(jsonSlice) as { tasks: { title: string }[] }
    expect(parsed.tasks[0]!.title).toBe(malicious)

    const beforeData = html.slice(0, start)
    expect(beforeData).not.toContain(malicious)
    expect(beforeData.toLowerCase()).not.toMatch(/\sonerror\s*=/)
  })

  it('keeps task description text in JSON payload', () => {
    const description = 'danger: </script><svg onload=alert(1)>'
    const html = renderTeamRunDashboard({
      success: true,
      goal: 'safe-goal',
      tasks: [
        {
          id: 't1',
          title: 'task',
          description,
          status: 'pending',
          dependsOn: [],
        } as { id: string; title: string; description: string; status: 'pending'; dependsOn: string[] },
      ],
      agentResults: new Map(),
      totalTokenUsage: { input_tokens: 0, output_tokens: 0 },
    })

    const start = html.indexOf('id="oma-data">')
    const contentStart = start + 'id="oma-data">'.length
    const end = html.indexOf('</script>', contentStart)
    const parsed = JSON.parse(html.slice(contentStart, end)) as {
      tasks: Array<{ description?: string }>
    }
    expect(parsed.tasks[0]!.description).toBe(description)
  })

  it('keeps task result text in JSON payload', () => {
    const result = 'final output </script><img src=x onerror=alert(1)>'
    const html = renderTeamRunDashboard({
      success: true,
      goal: 'safe-goal',
      tasks: [
        {
          id: 't1',
          title: 'task',
          result,
          status: 'completed',
          dependsOn: [],
        } as { id: string; title: string; result: string; status: 'completed'; dependsOn: string[] },
      ],
      agentResults: new Map(),
      totalTokenUsage: { input_tokens: 0, output_tokens: 0 },
    })

    const start = html.indexOf('id="oma-data">')
    const contentStart = start + 'id="oma-data">'.length
    const end = html.indexOf('</script>', contentStart)
    const parsed = JSON.parse(html.slice(contentStart, end)) as {
      tasks: Array<{ result?: string }>
    }
    expect(parsed.tasks[0]!.result).toBe(result)
  })
})
@ -0,0 +1,74 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'

// ---------------------------------------------------------------------------
// Mock OpenAI constructor (must be hoisted for Vitest)
// ---------------------------------------------------------------------------
const OpenAIMock = vi.hoisted(() => vi.fn())

vi.mock('openai', () => ({
  default: OpenAIMock,
}))

import { DeepSeekAdapter } from '../src/llm/deepseek.js'
import { createAdapter } from '../src/llm/adapter.js'

// ---------------------------------------------------------------------------
// DeepSeekAdapter tests
// ---------------------------------------------------------------------------

describe('DeepSeekAdapter', () => {
  beforeEach(() => {
    OpenAIMock.mockClear()
  })

  it('has name "deepseek"', () => {
    const adapter = new DeepSeekAdapter()
    expect(adapter.name).toBe('deepseek')
  })

  it('uses DEEPSEEK_API_KEY by default', () => {
    const original = process.env['DEEPSEEK_API_KEY']
    process.env['DEEPSEEK_API_KEY'] = 'deepseek-test-key-123'

    try {
      new DeepSeekAdapter()
      expect(OpenAIMock).toHaveBeenCalledWith(
        expect.objectContaining({
          apiKey: 'deepseek-test-key-123',
          baseURL: 'https://api.deepseek.com/v1',
        })
      )
    } finally {
      if (original === undefined) {
        delete process.env['DEEPSEEK_API_KEY']
      } else {
        process.env['DEEPSEEK_API_KEY'] = original
      }
    }
  })

  it('uses official DeepSeek baseURL by default', () => {
    new DeepSeekAdapter('some-key')
    expect(OpenAIMock).toHaveBeenCalledWith(
      expect.objectContaining({
        apiKey: 'some-key',
        baseURL: 'https://api.deepseek.com/v1',
      })
    )
  })

  it('allows overriding apiKey and baseURL', () => {
    new DeepSeekAdapter('custom-key', 'https://custom.endpoint/v1')
    expect(OpenAIMock).toHaveBeenCalledWith(
      expect.objectContaining({
        apiKey: 'custom-key',
        baseURL: 'https://custom.endpoint/v1',
      })
    )
  })

  it('createAdapter("deepseek") returns DeepSeekAdapter instance', async () => {
    const adapter = await createAdapter('deepseek')
    expect(adapter).toBeInstanceOf(DeepSeekAdapter)
  })
})
@ -0,0 +1,114 @@
import { describe, it, expect } from 'vitest'
import { z } from 'zod'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import type { LLMAdapter, LLMMessage, LLMResponse, StreamEvent, ToolUseBlock, ToolResultBlock } from '../src/types.js'

function toolUseResponse(toolName: string, input: Record<string, unknown>): LLMResponse {
  return {
    id: `resp-${Math.random().toString(36).slice(2)}`,
    content: [{
      type: 'tool_use',
      id: `tu-${Math.random().toString(36).slice(2)}`,
      name: toolName,
      input,
    }],
    model: 'mock-model',
    stop_reason: 'tool_use',
    usage: { input_tokens: 5, output_tokens: 5 },
  }
}

function textResponse(text: string): LLMResponse {
  return {
    id: `resp-${Math.random().toString(36).slice(2)}`,
    content: [{ type: 'text', text }],
    model: 'mock-model',
    stop_reason: 'end_turn',
    usage: { input_tokens: 5, output_tokens: 5 },
  }
}

describe('delegation-triggered budget_exceeded', () => {
  it('yields tool_result events and appends tool_result message before break', async () => {
    // Parent turn 1: LLM asks for a delegation.
    // Tool returns metadata.tokenUsage that alone pushes totalUsage past the budget.
    // Expectation: stream yields tool_use AND tool_result, and the returned
    // `messages` contains the user tool_result message, so downstream consumers
    // can resume without API "tool_use without tool_result" errors.
    const responses = [
      toolUseResponse('delegate_to_agent', { target_agent: 'bob', prompt: 'work' }),
      textResponse('should not be reached'),
    ]
    let idx = 0
    const adapter: LLMAdapter = {
      name: 'mock',
      async chat() {
        return responses[idx++]!
      },
      async *stream() { /* unused */ },
    }

    const registry = new ToolRegistry()
    registry.register(
      defineTool({
        name: 'delegate_to_agent',
        description: 'Fake delegation for test',
        inputSchema: z.object({ target_agent: z.string(), prompt: z.string() }),
        async execute() {
          return {
            data: 'delegated output',
            metadata: { tokenUsage: { input_tokens: 500, output_tokens: 500 } },
          }
        },
      }),
    )

    const runner = new AgentRunner(adapter, registry, new ToolExecutor(registry), {
      model: 'mock-model',
      allowedTools: ['delegate_to_agent'],
      maxTurns: 5,
      maxTokenBudget: 100, // 10 (parent LLM) + 1000 (delegation) ≫ 100
      agentName: 'parent',
    })

    const events: StreamEvent[] = []
    for await (const ev of runner.stream([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])) {
      events.push(ev)
    }

    const toolUseEvents = events.filter((e): e is StreamEvent & { type: 'tool_use'; data: ToolUseBlock } => e.type === 'tool_use')
    const toolResultEvents = events.filter((e): e is StreamEvent & { type: 'tool_result'; data: ToolResultBlock } => e.type === 'tool_result')
    const budgetEvents = events.filter(e => e.type === 'budget_exceeded')
    const doneEvents = events.filter((e): e is StreamEvent & { type: 'done'; data: { messages: LLMMessage[]; budgetExceeded?: boolean } } => e.type === 'done')

    // 1. Every tool_use event has a matching tool_result event.
    expect(toolUseEvents).toHaveLength(1)
    expect(toolResultEvents).toHaveLength(1)
    expect(toolResultEvents[0]!.data.tool_use_id).toBe(toolUseEvents[0]!.data.id)

    // 2. Budget event fires and the run terminates with budgetExceeded=true.
    expect(budgetEvents).toHaveLength(1)
    expect(doneEvents).toHaveLength(1)
    expect(doneEvents[0]!.data.budgetExceeded).toBe(true)

    // 3. Returned messages contain the tool_result user message so the
    //    conversation is API-resumable.
    const messages = doneEvents[0]!.data.messages
    const lastMsg = messages[messages.length - 1]!
    expect(lastMsg.role).toBe('user')
    const hasMatchingToolResult = lastMsg.content.some(
      b => b.type === 'tool_result' && b.tool_use_id === toolUseEvents[0]!.data.id,
    )
    expect(hasMatchingToolResult).toBe(true)

    // 4. Ordering: tool_result event is emitted before budget_exceeded.
    const toolResultIdx = events.findIndex(e => e.type === 'tool_result')
    const budgetIdx = events.findIndex(e => e.type === 'budget_exceeded')
    expect(toolResultIdx).toBeLessThan(budgetIdx)

    // 5. LLM was only called once; we broke before a second turn.
    expect(idx).toBe(1)
  })
})
@ -0,0 +1,131 @@
import { describe, it, expect, vi } from 'vitest'
import { OpenMultiAgent } from '../src/orchestrator/orchestrator.js'
import type { AgentConfig, LLMChatOptions, LLMMessage, LLMResponse } from '../src/types.js'

// Single shared mock adapter, routed by systemPrompt + first-turn user text.
vi.mock('../src/llm/adapter.js', () => ({
  createAdapter: async () => ({
    name: 'mock',
    async chat(messages: LLMMessage[], options: LLMChatOptions): Promise<LLMResponse> {
      const sys = options.systemPrompt ?? ''
      const firstUserText = extractText(messages[0]?.content ?? [])
      const onlyOneMessage = messages.length === 1

      // Root parent task (turn 1) emits a delegation tool_use.
      // Task description strings are set to 'ROOT-A' / 'ROOT-B' so we can
      // distinguish the parent's first turn from the ephemeral delegate's
      // first turn (which sees 'ping-A' / 'ping-B' as its user prompt).
      if (onlyOneMessage && firstUserText.includes('ROOT-A')) {
        return toolUseResponse('delegate_to_agent', { target_agent: 'B', prompt: 'ping-B' })
      }
      if (onlyOneMessage && firstUserText.includes('ROOT-B')) {
        return toolUseResponse('delegate_to_agent', { target_agent: 'A', prompt: 'ping-A' })
      }

      // Ephemeral delegate's first (and only) turn: return plain text so it
      // terminates cleanly without another delegation.
      if (onlyOneMessage) {
        const who = sys.startsWith('A-') ? 'A' : 'B'
        return textResponse(`${who} nested done`)
      }

      // Root parent turn 2, after tool_result. Return text to end the loop.
      const who = sys.startsWith('A-') ? 'A' : 'B'
      return textResponse(`${who} parent done`)
    },
    async *stream() { yield { type: 'done' as const, data: {} } },
  }),
}))

function textResponse(text: string): LLMResponse {
  return {
    id: `r-${Math.random().toString(36).slice(2)}`,
    content: [{ type: 'text', text }],
    model: 'mock-model',
    stop_reason: 'end_turn',
    usage: { input_tokens: 5, output_tokens: 5 },
  }
}

function toolUseResponse(toolName: string, input: Record<string, unknown>): LLMResponse {
  return {
    id: `r-${Math.random().toString(36).slice(2)}`,
    content: [{
      type: 'tool_use',
      id: `tu-${Math.random().toString(36).slice(2)}`,
      name: toolName,
      input,
    }],
    model: 'mock-model',
    stop_reason: 'tool_use',
    usage: { input_tokens: 5, output_tokens: 5 },
  }
}

function extractText(content: readonly { type: string; text?: string }[]): string {
  return content
    .filter((b): b is { type: 'text'; text: string } => b.type === 'text' && typeof b.text === 'string')
    .map((b) => b.text)
    .join(' ')
}

function agentA(): AgentConfig {
  return {
    name: 'A',
    model: 'mock-model',
    provider: 'openai',
    // systemPrompt prefix used by the mock to disambiguate A vs B.
    systemPrompt: 'A-agent. You are agent A. Delegate to B when asked.',
    tools: ['delegate_to_agent'],
    maxTurns: 4,
  }
}

function agentB(): AgentConfig {
  return {
    name: 'B',
    model: 'mock-model',
    provider: 'openai',
    systemPrompt: 'B-agent. You are agent B. Delegate to A when asked.',
    tools: ['delegate_to_agent'],
    maxTurns: 4,
  }
}

describe('mutual delegation (A↔B) completes without agent-lock deadlock', () => {
  it('two parallel root tasks both finish when each delegates to the other', async () => {
    // Previously: pool.run('B') inside A's tool call waited on B's agent lock
    // (held by the parent B task), while pool.run('A') inside B's tool call
    // waited on A's agent lock: classic mutual deadlock.
    // After the fix: delegation uses runEphemeral on a fresh Agent instance,
    // so neither call touches the per-agent lock.
    const oma = new OpenMultiAgent({
      defaultModel: 'mock-model',
      defaultProvider: 'openai',
      // Need room for 2 parent runs + 2 ephemeral delegates.
      maxConcurrency: 4,
    })
    const team = oma.createTeam('mutual', {
      name: 'mutual',
      agents: [agentA(), agentB()],
      sharedMemory: false,
    })

    // Race against a 10s timeout so a regression surfaces as a test failure
    // rather than a hanging CI job.
    const runPromise = oma.runTasks(team, [
      { title: 'Task A', description: 'ROOT-A', assignee: 'A' },
      { title: 'Task B', description: 'ROOT-B', assignee: 'B' },
    ])
    const timeout = new Promise((_resolve, reject) =>
      setTimeout(() => reject(new Error('mutual delegation deadlock (timeout)')), 10_000),
    )

    const result = (await Promise.race([runPromise, timeout])) as Awaited<typeof runPromise>

    expect(result.success).toBe(true)
    const agentOutputs = [...result.agentResults.values()].map((r) => r.output)
    expect(agentOutputs.some((o) => o.includes('A parent done'))).toBe(true)
    expect(agentOutputs.some((o) => o.includes('B parent done'))).toBe(true)
  })
})
@ -0,0 +1,83 @@
/**
 * E2E tests for AnthropicAdapter against the real API.
 *
 * Skipped by default. Run with: npm run test:e2e
 * Requires: ANTHROPIC_API_KEY environment variable
 */
import { describe, it, expect } from 'vitest'
import { AnthropicAdapter } from '../../src/llm/anthropic.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../../src/types.js'

const describeE2E = process.env['RUN_E2E'] ? describe : describe.skip

describeE2E('AnthropicAdapter E2E', () => {
  const adapter = new AnthropicAdapter()
  const model = 'claude-haiku-4-5-20251001'

  const weatherTool = {
    name: 'get_weather',
    description: 'Get the weather for a city',
    inputSchema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  }

  it('chat() returns a text response', async () => {
    const result = await adapter.chat(
      [{ role: 'user', content: [{ type: 'text', text: 'Say "hello" and nothing else.' }] }],
      { model, maxTokens: 50, temperature: 0 },
    )

    expect(result.id).toBeTruthy()
    expect(result.content.length).toBeGreaterThan(0)
    expect(result.content[0].type).toBe('text')
    expect(result.usage.input_tokens).toBeGreaterThan(0)
    expect(result.stop_reason).toBe('end_turn')
  }, 30_000)

  it('chat() handles tool use', async () => {
    const result = await adapter.chat(
      [{ role: 'user', content: [{ type: 'text', text: 'What is the weather in Tokyo? Use the get_weather tool.' }] }],
      { model, maxTokens: 100, temperature: 0, tools: [weatherTool] },
    )

    const toolBlocks = result.content.filter(b => b.type === 'tool_use')
    expect(toolBlocks.length).toBeGreaterThan(0)
    expect((toolBlocks[0] as ToolUseBlock).name).toBe('get_weather')
    expect(result.stop_reason).toBe('tool_use')
  }, 30_000)

  it('stream() yields text events and a done event', async () => {
    const events: StreamEvent[] = []
    for await (const event of adapter.stream(
      [{ role: 'user', content: [{ type: 'text', text: 'Say "hi".' }] }],
      { model, maxTokens: 50, temperature: 0 },
    )) {
      events.push(event)
    }

    const textEvents = events.filter(e => e.type === 'text')
    expect(textEvents.length).toBeGreaterThan(0)

    const doneEvents = events.filter(e => e.type === 'done')
    expect(doneEvents).toHaveLength(1)
    const response = doneEvents[0].data as LLMResponse
    expect(response.usage.input_tokens).toBeGreaterThan(0)
  }, 30_000)

  it('stream() handles tool use', async () => {
    const events: StreamEvent[] = []
    for await (const event of adapter.stream(
      [{ role: 'user', content: [{ type: 'text', text: 'Get weather in Paris. Use the tool.' }] }],
      { model, maxTokens: 100, temperature: 0, tools: [weatherTool] },
    )) {
      events.push(event)
    }

    const toolEvents = events.filter(e => e.type === 'tool_use')
    expect(toolEvents.length).toBeGreaterThan(0)
    expect((toolEvents[0].data as ToolUseBlock).name).toBe('get_weather')
  }, 30_000)
})
@ -0,0 +1,65 @@
/**
 * E2E tests for GeminiAdapter against the real API.
 *
 * Skipped by default. Run with: npm run test:e2e
 * Requires: GEMINI_API_KEY or GOOGLE_API_KEY environment variable
 */
import { describe, it, expect } from 'vitest'
import { GeminiAdapter } from '../../src/llm/gemini.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../../src/types.js'

const describeE2E = process.env['RUN_E2E'] ? describe : describe.skip

describeE2E('GeminiAdapter E2E', () => {
  const adapter = new GeminiAdapter()
  const model = 'gemini-2.0-flash'

  const weatherTool = {
    name: 'get_weather',
    description: 'Get the weather for a city',
    inputSchema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  }

  it('chat() returns a text response', async () => {
    const result = await adapter.chat(
      [{ role: 'user', content: [{ type: 'text', text: 'Say "hello" and nothing else.' }] }],
      { model, maxTokens: 50, temperature: 0 },
    )

    expect(result.id).toBeTruthy()
    expect(result.content.length).toBeGreaterThan(0)
    expect(result.content[0].type).toBe('text')
  }, 30_000)

  it('chat() handles tool use', async () => {
    const result = await adapter.chat(
      [{ role: 'user', content: [{ type: 'text', text: 'What is the weather in Tokyo? Use the get_weather tool.' }] }],
      { model, maxTokens: 100, temperature: 0, tools: [weatherTool] },
    )

    const toolBlocks = result.content.filter(b => b.type === 'tool_use')
    expect(toolBlocks.length).toBeGreaterThan(0)
    expect((toolBlocks[0] as ToolUseBlock).name).toBe('get_weather')
    expect(result.stop_reason).toBe('tool_use')
  }, 30_000)

  it('stream() yields text events and a done event', async () => {
    const events: StreamEvent[] = []
    for await (const event of adapter.stream(
      [{ role: 'user', content: [{ type: 'text', text: 'Say "hi".' }] }],
      { model, maxTokens: 50, temperature: 0 },
    )) {
      events.push(event)
    }

    const textEvents = events.filter(e => e.type === 'text')
    expect(textEvents.length).toBeGreaterThan(0)

    const doneEvents = events.filter(e => e.type === 'done')
    expect(doneEvents).toHaveLength(1)
  }, 30_000)
})
@@ -0,0 +1,81 @@
/**
 * E2E tests for OpenAIAdapter against the real API.
 *
 * Skipped by default. Run with: npm run test:e2e
 * Requires: OPENAI_API_KEY environment variable
 */
import { describe, it, expect } from 'vitest'
import { OpenAIAdapter } from '../../src/llm/openai.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../../src/types.js'

const describeE2E = process.env['RUN_E2E'] ? describe : describe.skip

describeE2E('OpenAIAdapter E2E', () => {
  const adapter = new OpenAIAdapter()
  const model = 'gpt-4o-mini'

  const weatherTool = {
    name: 'get_weather',
    description: 'Get the weather for a city',
    inputSchema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  }

  it('chat() returns a text response', async () => {
    const result = await adapter.chat(
      [{ role: 'user', content: [{ type: 'text', text: 'Say "hello" and nothing else.' }] }],
      { model, maxTokens: 50, temperature: 0 },
    )

    expect(result.id).toBeTruthy()
    expect(result.content.length).toBeGreaterThan(0)
    expect(result.content[0].type).toBe('text')
    expect(result.usage.input_tokens).toBeGreaterThan(0)
  }, 30_000)

  it('chat() handles tool use', async () => {
    const result = await adapter.chat(
      [{ role: 'user', content: [{ type: 'text', text: 'What is the weather in Tokyo? Use the get_weather tool.' }] }],
      { model, maxTokens: 100, temperature: 0, tools: [weatherTool] },
    )

    const toolBlocks = result.content.filter(b => b.type === 'tool_use')
    expect(toolBlocks.length).toBeGreaterThan(0)
    expect((toolBlocks[0] as ToolUseBlock).name).toBe('get_weather')
  }, 30_000)

  it('stream() yields text events and a done event', async () => {
    const events: StreamEvent[] = []
    for await (const event of adapter.stream(
      [{ role: 'user', content: [{ type: 'text', text: 'Say "hi".' }] }],
      { model, maxTokens: 50, temperature: 0 },
    )) {
      events.push(event)
    }

    const textEvents = events.filter(e => e.type === 'text')
    expect(textEvents.length).toBeGreaterThan(0)

    const doneEvents = events.filter(e => e.type === 'done')
    expect(doneEvents).toHaveLength(1)
    const response = doneEvents[0].data as LLMResponse
    expect(response.usage.input_tokens).toBeGreaterThan(0)
  }, 30_000)

  it('stream() handles tool use', async () => {
    const events: StreamEvent[] = []
    for await (const event of adapter.stream(
      [{ role: 'user', content: [{ type: 'text', text: 'Get weather in Paris. Use the tool.' }] }],
      { model, maxTokens: 100, temperature: 0, tools: [weatherTool] },
    )) {
      events.push(event)
    }

    const toolEvents = events.filter(e => e.type === 'tool_use')
    expect(toolEvents.length).toBeGreaterThan(0)
    expect((toolEvents[0].data as ToolUseBlock).name).toBe('get_weather')
  }, 30_000)
})
@@ -0,0 +1,359 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { textMsg, toolUseMsg, toolResultMsg, imageMsg, chatOpts, toolDef, collectEvents } from './helpers/llm-fixtures.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../src/types.js'

// ---------------------------------------------------------------------------
// Mock GoogleGenAI
// ---------------------------------------------------------------------------

const mockGenerateContent = vi.hoisted(() => vi.fn())
const mockGenerateContentStream = vi.hoisted(() => vi.fn())
const GoogleGenAIMock = vi.hoisted(() =>
  vi.fn(() => ({
    models: {
      generateContent: mockGenerateContent,
      generateContentStream: mockGenerateContentStream,
    },
  })),
)

vi.mock('@google/genai', () => ({
  GoogleGenAI: GoogleGenAIMock,
  FunctionCallingConfigMode: { AUTO: 'AUTO' },
}))

import { GeminiAdapter } from '../src/llm/gemini.js'

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

function makeGeminiResponse(parts: Array<Record<string, unknown>>, overrides: Record<string, unknown> = {}) {
  return {
    candidates: [{
      content: { parts },
      finishReason: 'STOP',
      ...overrides,
    }],
    usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 5 },
  }
}

async function* asyncGen<T>(items: T[]): AsyncGenerator<T> {
  for (const item of items) yield item
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

describe('GeminiAdapter (contract)', () => {
  let adapter: GeminiAdapter

  beforeEach(() => {
    vi.clearAllMocks()
    adapter = new GeminiAdapter('test-key')
  })

  // =========================================================================
  // chat() — message conversion
  // =========================================================================

  describe('chat() message conversion', () => {
    it('converts text messages with correct role mapping', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'Hi' }]))

      await adapter.chat(
        [textMsg('user', 'Hello'), textMsg('assistant', 'Hi')],
        chatOpts(),
      )

      const callArgs = mockGenerateContent.mock.calls[0][0]
      expect(callArgs.contents[0]).toMatchObject({ role: 'user', parts: [{ text: 'Hello' }] })
      expect(callArgs.contents[1]).toMatchObject({ role: 'model', parts: [{ text: 'Hi' }] })
    })

    it('converts tool_use blocks to functionCall parts', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))

      await adapter.chat(
        [toolUseMsg('call_1', 'search', { query: 'test' })],
        chatOpts(),
      )

      const parts = mockGenerateContent.mock.calls[0][0].contents[0].parts
      expect(parts[0].functionCall).toEqual({
        id: 'call_1',
        name: 'search',
        args: { query: 'test' },
      })
    })

    it('converts tool_result blocks to functionResponse parts with name lookup', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))

      await adapter.chat(
        [
          toolUseMsg('call_1', 'search', { query: 'test' }),
          toolResultMsg('call_1', 'found it'),
        ],
        chatOpts(),
      )

      const resultParts = mockGenerateContent.mock.calls[0][0].contents[1].parts
      expect(resultParts[0].functionResponse).toMatchObject({
        id: 'call_1',
        name: 'search',
        response: { content: 'found it', isError: false },
      })
    })

    it('falls back to tool_use_id as name when no matching tool_use found', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))

      await adapter.chat(
        [toolResultMsg('unknown_id', 'data')],
        chatOpts(),
      )

      const parts = mockGenerateContent.mock.calls[0][0].contents[0].parts
      expect(parts[0].functionResponse.name).toBe('unknown_id')
    })

    it('converts image blocks to inlineData parts', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))

      await adapter.chat([imageMsg('image/png', 'base64data')], chatOpts())

      const parts = mockGenerateContent.mock.calls[0][0].contents[0].parts
      expect(parts[0].inlineData).toEqual({
        mimeType: 'image/png',
        data: 'base64data',
      })
    })
  })

  // =========================================================================
  // chat() — tools & config
  // =========================================================================

  describe('chat() tools & config', () => {
    it('converts tools to Gemini format with parametersJsonSchema', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))
      const tool = toolDef('search', 'Search')

      await adapter.chat([textMsg('user', 'Hi')], chatOpts({ tools: [tool] }))

      const config = mockGenerateContent.mock.calls[0][0].config
      expect(config.tools[0].functionDeclarations[0]).toEqual({
        name: 'search',
        description: 'Search',
        parametersJsonSchema: tool.inputSchema,
      })
      expect(config.toolConfig).toEqual({
        functionCallingConfig: { mode: 'AUTO' },
      })
    })

    it('passes systemInstruction, maxOutputTokens, temperature', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))

      await adapter.chat(
        [textMsg('user', 'Hi')],
        chatOpts({ systemPrompt: 'Be helpful', temperature: 0.7, maxTokens: 2048 }),
      )

      const config = mockGenerateContent.mock.calls[0][0].config
      expect(config.systemInstruction).toBe('Be helpful')
      expect(config.temperature).toBe(0.7)
      expect(config.maxOutputTokens).toBe(2048)
    })

    it('omits tools/toolConfig when no tools provided', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }]))

      await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      const config = mockGenerateContent.mock.calls[0][0].config
      expect(config.tools).toBeUndefined()
      expect(config.toolConfig).toBeUndefined()
    })
  })

  // =========================================================================
  // chat() — response conversion
  // =========================================================================

  describe('chat() response conversion', () => {
    it('converts text parts to TextBlock', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'Hello' }]))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.content[0]).toEqual({ type: 'text', text: 'Hello' })
    })

    it('converts functionCall parts to ToolUseBlock with existing id', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([
        { functionCall: { id: 'call_1', name: 'search', args: { q: 'test' } } },
      ]))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.content[0]).toEqual({
        type: 'tool_use',
        id: 'call_1',
        name: 'search',
        input: { q: 'test' },
      })
    })

    it('fabricates ID when functionCall has no id field', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([
        { functionCall: { name: 'search', args: { q: 'test' } } },
      ]))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      const block = result.content[0] as ToolUseBlock
      expect(block.type).toBe('tool_use')
      expect(block.id).toMatch(/^gemini-\d+-[a-z0-9]+$/)
      expect(block.name).toBe('search')
    })

    it('maps STOP finishReason to end_turn', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'ok' }], { finishReason: 'STOP' }))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.stop_reason).toBe('end_turn')
    })

    it('maps MAX_TOKENS finishReason to max_tokens', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'trunc' }], { finishReason: 'MAX_TOKENS' }))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.stop_reason).toBe('max_tokens')
    })

    it('maps to tool_use when response contains functionCall (even with STOP)', async () => {
      mockGenerateContent.mockResolvedValue(makeGeminiResponse(
        [{ functionCall: { id: 'c1', name: 'search', args: {} } }],
        { finishReason: 'STOP' },
      ))

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.stop_reason).toBe('tool_use')
    })

    it('handles missing usageMetadata (defaults to 0)', async () => {
      mockGenerateContent.mockResolvedValue({
        candidates: [{ content: { parts: [{ text: 'ok' }] }, finishReason: 'STOP' }],
      })

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.usage).toEqual({ input_tokens: 0, output_tokens: 0 })
    })

    it('handles empty candidates gracefully', async () => {
      mockGenerateContent.mockResolvedValue({ candidates: [{ content: {} }] })

      const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())

      expect(result.content).toEqual([])
    })
  })

  // =========================================================================
  // stream()
  // =========================================================================

  describe('stream()', () => {
    it('yields text events for text parts', async () => {
      mockGenerateContentStream.mockResolvedValue(
        asyncGen([
          makeGeminiResponse([{ text: 'Hello' }]),
          makeGeminiResponse([{ text: ' world' }]),
        ]),
      )

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const textEvents = events.filter(e => e.type === 'text')
      expect(textEvents).toEqual([
        { type: 'text', data: 'Hello' },
        { type: 'text', data: ' world' },
      ])
    })

    it('yields tool_use events for functionCall parts', async () => {
      mockGenerateContentStream.mockResolvedValue(
        asyncGen([
          makeGeminiResponse([{ functionCall: { id: 'c1', name: 'search', args: { q: 'test' } } }]),
        ]),
      )

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const toolEvents = events.filter(e => e.type === 'tool_use')
      expect(toolEvents).toHaveLength(1)
      expect((toolEvents[0].data as ToolUseBlock).name).toBe('search')
    })

    it('accumulates token counts from usageMetadata', async () => {
      mockGenerateContentStream.mockResolvedValue(
        asyncGen([
          { candidates: [{ content: { parts: [{ text: 'Hi' }] } }], usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 2 } },
          { candidates: [{ content: { parts: [{ text: '!' }] }, finishReason: 'STOP' }], usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 5 } },
        ]),
      )

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const done = events.find(e => e.type === 'done')
      const response = done!.data as LLMResponse
      expect(response.usage).toEqual({ input_tokens: 10, output_tokens: 5 })
    })

    it('yields done event with correct stop_reason', async () => {
      mockGenerateContentStream.mockResolvedValue(
        asyncGen([makeGeminiResponse([{ text: 'ok' }], { finishReason: 'MAX_TOKENS' })]),
      )

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const done = events.find(e => e.type === 'done')
      expect((done!.data as LLMResponse).stop_reason).toBe('max_tokens')
    })

    it('yields error event when stream throws', async () => {
      mockGenerateContentStream.mockResolvedValue(
        (async function* () { throw new Error('Gemini error') })(),
      )

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const errorEvents = events.filter(e => e.type === 'error')
      expect(errorEvents).toHaveLength(1)
      expect((errorEvents[0].data as Error).message).toBe('Gemini error')
    })

    it('handles chunks with no candidates', async () => {
      mockGenerateContentStream.mockResolvedValue(
        asyncGen([
          { candidates: undefined, usageMetadata: { promptTokenCount: 5, candidatesTokenCount: 0 } },
          makeGeminiResponse([{ text: 'ok' }]),
        ]),
      )

      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))

      const textEvents = events.filter(e => e.type === 'text')
      expect(textEvents).toHaveLength(1)
      expect(textEvents[0].data).toBe('ok')
    })
  })
})
@@ -0,0 +1,80 @@
/**
 * Shared fixture builders for LLM adapter contract tests.
 */

import type {
  ContentBlock,
  LLMChatOptions,
  LLMMessage,
  LLMToolDef,
  ImageBlock,
  TextBlock,
  ToolResultBlock,
  ToolUseBlock,
} from '../../src/types.js'

// ---------------------------------------------------------------------------
// Message builders
// ---------------------------------------------------------------------------

export function textMsg(role: 'user' | 'assistant', text: string): LLMMessage {
  return { role, content: [{ type: 'text', text }] }
}

export function toolUseMsg(id: string, name: string, input: Record<string, unknown>): LLMMessage {
  return {
    role: 'assistant',
    content: [{ type: 'tool_use', id, name, input }],
  }
}

export function toolResultMsg(toolUseId: string, content: string, isError = false): LLMMessage {
  return {
    role: 'user',
    content: [{ type: 'tool_result', tool_use_id: toolUseId, content, is_error: isError }],
  }
}

export function imageMsg(mediaType: string, data: string): LLMMessage {
  return {
    role: 'user',
    content: [{ type: 'image', source: { type: 'base64', media_type: mediaType, data } }],
  }
}

// ---------------------------------------------------------------------------
// Options & tool def builders
// ---------------------------------------------------------------------------

export function chatOpts(overrides: Partial<LLMChatOptions> = {}): LLMChatOptions {
  return {
    model: 'test-model',
    maxTokens: 1024,
    ...overrides,
  }
}

export function toolDef(name: string, description = 'A test tool'): LLMToolDef {
  return {
    name,
    description,
    inputSchema: {
      type: 'object',
      properties: { query: { type: 'string' } },
      required: ['query'],
    },
  }
}

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

/** Collect all events from an async iterable. */
export async function collectEvents<T>(iterable: AsyncIterable<T>): Promise<T[]> {
  const events: T[] = []
  for await (const event of iterable) {
    events.push(event)
  }
  return events
}
@@ -0,0 +1,75 @@
import { describe, it, expect } from 'vitest'
import { STOP_WORDS, extractKeywords, keywordScore } from '../src/utils/keywords.js'

// Regression coverage for the shared keyword helpers extracted from
// orchestrator.ts and scheduler.ts (PR #70 review point 1).
//
// These tests pin behaviour so future drift between Scheduler and the
// short-circuit selector is impossible — any edit must update the shared
// module and these tests at once.

describe('utils/keywords', () => {
  describe('STOP_WORDS', () => {
    it('contains all 26 stop words', () => {
      // Sanity-check the canonical list — if anyone adds/removes a stop word
      // they should also update this assertion.
      expect(STOP_WORDS.size).toBe(26)
    })

    it('includes "then" and "and" so they cannot dominate scoring', () => {
      expect(STOP_WORDS.has('then')).toBe(true)
      expect(STOP_WORDS.has('and')).toBe(true)
    })
  })

  describe('extractKeywords', () => {
    it('lowercases and dedupes', () => {
      const out = extractKeywords('TypeScript typescript TYPESCRIPT')
      expect(out).toEqual(['typescript'])
    })

    it('drops words shorter than 4 characters', () => {
      const out = extractKeywords('a bb ccc dddd eeeee')
      expect(out).toEqual(['dddd', 'eeeee'])
    })

    it('drops stop words', () => {
      const out = extractKeywords('the cat and the dog have meals')
      // 'cat', 'dog', 'have' filtered: 'cat'/'dog' too short, 'have' is a stop word
      expect(out).toEqual(['meals'])
    })

    it('splits on non-word characters', () => {
      const out = extractKeywords('hello,world!writer-mode')
      expect(out.sort()).toEqual(['hello', 'mode', 'world', 'writer'])
    })

    it('returns empty array for empty input', () => {
      expect(extractKeywords('')).toEqual([])
    })
  })

  describe('keywordScore', () => {
    it('counts each keyword at most once', () => {
      // 'code' appears twice in the text but contributes 1
      expect(keywordScore('code review code style', ['code'])).toBe(1)
    })

    it('is case-insensitive', () => {
      expect(keywordScore('TYPESCRIPT', ['typescript'])).toBe(1)
      expect(keywordScore('typescript', ['TYPESCRIPT'])).toBe(1)
    })

    it('returns 0 when no keywords match', () => {
      expect(keywordScore('hello world', ['rust', 'go'])).toBe(0)
    })

    it('sums distinct keyword hits', () => {
      expect(keywordScore('write typescript code for the api', ['typescript', 'code', 'rust'])).toBe(2)
    })

    it('returns 0 for empty keywords array', () => {
      expect(keywordScore('any text', [])).toBe(0)
    })
  })
})
Some files were not shown because too many files have changed in this diff.