Compare commits

No commits in common. "main" and "v0.2.0" have entirely different histories.
main ... v0.2.0

133 changed files with 1640 additions and 24878 deletions

@ -6,17 +6,6 @@ labels: enhancement
assignees: ''
---
## Source
**Where did this idea come from?** (Pick one — helps maintainers triage and prioritize.)
- [ ] **Real use case** — I'm using open-multi-agent and hit this limit. Describe the use case in "Problem" below.
- [ ] **Competitive reference** — Another framework has this (LangChain, AutoGen, CrewAI, Mastra, XCLI, etc.). Please name or link it.
- [ ] **Systematic gap** — A missing piece in the framework matrix (provider not supported, tool not covered, etc.).
- [ ] **Discussion / inspiration** — Came up in a tweet, Reddit post, Discord, or AI conversation. Please link or paste the source if possible.
> **Maintainer note**: after triage, label with one of `community-feedback`, `source:competitive`, `source:analysis`, `source:owner` (multiple OK if the source is mixed — e.g. competitive analysis + user feedback).
## Problem
A clear description of the problem or limitation you're experiencing.

@ -18,22 +18,6 @@ jobs:
with:
node-version: ${{ matrix.node-version }}
cache: npm
- run: rm -f package-lock.json && npm install
- run: npm ci
- run: npm run lint
- run: npm test
coverage:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
cache: npm
- run: rm -f package-lock.json && npm install
- run: npm run test:coverage
- uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: ./coverage/lcov.info
fail_ci_if_error: false

.gitignore (vendored), 4 changed lines

@ -1,6 +1,6 @@
node_modules/
dist/
coverage/
*.tgz
.DS_Store
oma-dashboards/
promo-*.md
non-tech_*/

@ -1,95 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Commands
```bash
npm run build # Compile TypeScript (src/ → dist/)
npm run dev # Watch mode compilation
npm run lint # Type-check only (tsc --noEmit)
npm test # Run all tests (vitest run)
npm run test:watch # Vitest watch mode
node dist/cli/oma.js help # After build: shell/CI CLI (`oma` when installed via npm bin)
```
Tests live in `tests/` (vitest). Examples in `examples/` are standalone scripts requiring API keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`). CLI usage and JSON schemas: `docs/cli.md`.
## Architecture
ES module TypeScript framework for multi-agent orchestration. Three runtime dependencies: `@anthropic-ai/sdk`, `openai`, `zod`.
### Core Execution Flow
**`OpenMultiAgent`** (`src/orchestrator/orchestrator.ts`) is the top-level public API with three execution modes:
1. **`runAgent(config, prompt)`** — single agent, one-shot
2. **`runTeam(team, goal)`** — automatic orchestration: a temporary "coordinator" agent decomposes the goal into a task DAG via LLM call, then tasks execute in dependency order
3. **`runTasks(team, tasks)`** — explicit task pipeline with user-defined dependencies
### The Coordinator Pattern (runTeam)
This is the framework's key feature. When `runTeam()` is called:
1. A coordinator agent receives the goal + agent roster and produces a JSON task array (title, description, assignee, dependsOn)
2. `TaskQueue` resolves dependencies topologically — independent tasks run in parallel, dependent tasks wait
3. `Scheduler` auto-assigns any unassigned tasks (strategies: `dependency-first` default, `round-robin`, `least-busy`, `capability-match`)
4. Each task result is written to `SharedMemory` so subsequent agents see prior results
5. The coordinator synthesizes all task results into a final output
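The dependency resolution in step 2 can be sketched as grouping tasks into "waves", where every task in a wave only depends on tasks from earlier waves. This is a minimal illustration under stated assumptions, not the framework's `TaskQueue`: `SketchTask` and `executionWaves` are hypothetical names, and only the `title` and `dependsOn` fields mirror the task shape described above.

```typescript
// Hypothetical task shape; only `title` and `dependsOn` come from the text above.
interface SketchTask {
  title: string
  dependsOn: string[]
}

// Group tasks into waves. Every task in a wave has all of its dependencies
// satisfied by earlier waves, so tasks within one wave could run in parallel.
function executionWaves(tasks: SketchTask[]): string[][] {
  const done = new Set<string>()
  let pending = [...tasks]
  const waves: string[][] = []

  while (pending.length > 0) {
    const ready = pending.filter((t) => t.dependsOn.every((d) => done.has(d)))
    if (ready.length === 0) {
      // Nothing is runnable: a cycle, or a dependency on an unknown task.
      throw new Error('Unresolvable dependencies: ' + pending.map((t) => t.title).join(', '))
    }
    waves.push(ready.map((t) => t.title))
    for (const t of ready) done.add(t.title)
    pending = pending.filter((t) => !done.has(t.title))
  }
  return waves
}

const waves = executionWaves([
  { title: 'design', dependsOn: [] },
  { title: 'implement', dependsOn: ['design'] },
  { title: 'test', dependsOn: ['implement'] },
  { title: 'review', dependsOn: ['implement'] },
])
console.log(waves) // [['design'], ['implement'], ['test', 'review']]
```

A real queue would start each task as soon as its own dependencies finish rather than waiting for a whole wave, but the wave view makes the ordering easy to inspect.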
### Layer Map
| Layer | Files | Responsibility |
|-------|-------|----------------|
| Orchestrator | `orchestrator/orchestrator.ts`, `orchestrator/scheduler.ts` | Top-level API, task decomposition, coordinator pattern |
| Team | `team/team.ts`, `team/messaging.ts` | Agent roster, MessageBus (point-to-point + broadcast), SharedMemory binding |
| Agent | `agent/agent.ts`, `agent/runner.ts`, `agent/pool.ts`, `agent/structured-output.ts` | Agent lifecycle (idle→running→completed/error), conversation loop, concurrency pool with Semaphore, structured output validation |
| Task | `task/queue.ts`, `task/task.ts` | Dependency-aware queue, auto-unblock on completion, cascade failure to dependents |
| Tool | `tool/framework.ts`, `tool/executor.ts`, `tool/built-in/` | `defineTool()` with Zod schemas, ToolRegistry, parallel batch execution with concurrency semaphore |
| LLM | `llm/adapter.ts`, `llm/anthropic.ts`, `llm/openai.ts` | `LLMAdapter` interface (`chat` + `stream`), factory `createAdapter()` |
| Memory | `memory/shared.ts`, `memory/store.ts` | Namespaced key-value store (`agentName/key`), markdown summary injection into prompts |
| Types | `types.ts` | All interfaces in one file to avoid circular deps |
| Exports | `index.ts` | Public API surface |
### Agent Conversation Loop (AgentRunner)
`AgentRunner.run()`: send messages → extract tool-use blocks → execute tools in parallel batch → append results → loop until `end_turn` or `maxTurns` exhausted. Accumulates `TokenUsage` across all turns.
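The loop described above might look roughly like the following sketch. All types here (`ToolCall`, `ModelReply`, `Model`, `Tools`) and the `runLoop` function are hypothetical simplifications for illustration; the framework's real message and usage types are not shown in this document, and the transcript is reduced to plain strings.

```typescript
// Hypothetical, simplified shapes; not the framework's actual types.
interface ToolCall { name: string; input: string }
interface ModelReply {
  stopReason: 'end_turn' | 'tool_use'
  text: string
  toolCalls: ToolCall[]
  outputTokens: number
}
type Model = (transcript: string[]) => Promise<ModelReply>
type Tools = Record<string, (input: string) => Promise<string>>

async function runLoop(model: Model, tools: Tools, prompt: string, maxTurns: number) {
  const transcript = [`user: ${prompt}`]
  let outputTokens = 0
  let finalText = ''

  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(transcript)
    outputTokens += reply.outputTokens // accumulate usage across all turns
    finalText = reply.text
    transcript.push(`assistant: ${reply.text}`)
    if (reply.stopReason === 'end_turn' || reply.toolCalls.length === 0) break

    // Run the whole batch of tool calls in parallel; failures become results.
    const results = await Promise.all(
      reply.toolCalls.map(async (call) => {
        try {
          const tool = tools[call.name]
          if (!tool) throw new Error(`unknown tool ${call.name}`)
          return `tool ${call.name}: ${await tool(call.input)}`
        } catch (err) {
          return `tool ${call.name} error: ${String(err)}`
        }
      }),
    )
    transcript.push(...results)
  }
  return { finalText, outputTokens, transcript }
}

// A scripted fake model: one tool-use turn, then a final answer.
let modelCalls = 0
const fakeModel: Model = async () => {
  modelCalls++
  return modelCalls === 1
    ? { stopReason: 'tool_use', text: 'checking', toolCalls: [{ name: 'echo', input: 'hi' }], outputTokens: 5 }
    : { stopReason: 'end_turn', text: 'done', toolCalls: [], outputTokens: 3 }
}
const loopResult = runLoop(fakeModel, { echo: async (s) => s.toUpperCase() }, 'say hi', 5)
```

Catching tool errors inside the batch matches the rule stated under "Error Handling": a failing tool produces an error result for the model to read instead of aborting the run.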
### Concurrency Control
Three semaphore layers: `AgentPool` pool-level (max concurrent agent runs, default 5), `AgentPool` per-agent mutex (serializes concurrent runs on the same `Agent` instance), and `ToolExecutor` (max concurrent tool calls, default 4).
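All three layers can be built from one counting-semaphore primitive; a mutex is simply a semaphore with a single slot. The class below is a minimal sketch, assuming a FIFO wait queue. `SketchSemaphore` is a hypothetical name, not the framework's own class, and only the limits (5, 1, 4) come from the text above.

```typescript
// Minimal counting semaphore. Illustrative only; not the framework's class.
class SketchSemaphore {
  private inUse = 0
  private readonly waiters: Array<() => void> = []
  private readonly max: number

  constructor(max: number) {
    this.max = max
  }

  // Resolves once a slot is free. A free slot is claimed synchronously.
  acquire(): Promise<void> {
    if (this.inUse < this.max) {
      this.inUse++
      return Promise.resolve()
    }
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve)
    })
  }

  // Hands the slot directly to the oldest waiter, or frees it if nobody waits.
  release(): void {
    const next = this.waiters.shift()
    if (next) next()
    else if (this.inUse > 0) this.inUse--
  }

  get available(): number {
    return this.max - this.inUse
  }

  get waiting(): number {
    return this.waiters.length
  }

  // Run `fn` while holding a slot, releasing it even if `fn` throws.
  async run<T>(fn: () => Promise<T>): Promise<T> {
    await this.acquire()
    try {
      return await fn()
    } finally {
      this.release()
    }
  }
}

const poolSlots = new SketchSemaphore(5) // pool-level: concurrent agent runs
const agentMutex = new SketchSemaphore(1) // per-agent: one run at a time
const toolSlots = new SketchSemaphore(4) // tool executor: concurrent tool calls
```

Passing the slot straight to the next waiter in `release()` keeps the in-use count stable and avoids a window in which a newcomer could jump the queue.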
### Structured Output
Optional `outputSchema` (Zod) on `AgentConfig`. When set, the agent's final output is parsed as JSON and validated. On validation failure, one retry with error feedback is attempted. Validated data is available via `result.structured`. Logic lives in `agent/structured-output.ts`, wired into `Agent.executeRun()`.
### Task Retry
Optional `maxRetries`, `retryDelayMs`, `retryBackoff` on task config (used via `runTasks()`). `executeWithRetry()` in `orchestrator.ts` handles the retry loop with exponential backoff (capped at 30s). Token usage is accumulated across all attempts. Emits `task_retry` event via `onProgress`.
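A capped exponential backoff of this kind can be sketched as follows. The exact formula is an assumption (base delay multiplied by the backoff factor once per prior retry); `retryDelay` and `withRetry` are hypothetical helpers, not the framework's `executeWithRetry()`. Only the option names and the 30s cap come from the text above.

```typescript
const MAX_DELAY_MS = 30_000 // the 30s cap mentioned above

// Delay before retry number `attempt` (1-based). The formula is an assumption.
function retryDelay(attempt: number, retryDelayMs: number, retryBackoff: number): number {
  return Math.min(retryDelayMs * Math.pow(retryBackoff, attempt - 1), MAX_DELAY_MS)
}

// Generic retry loop: one initial try, then up to `maxRetries` further tries.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  retryDelayMs: number,
  retryBackoff: number,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (attempt < maxRetries) {
        const delay = retryDelay(attempt + 1, retryDelayMs, retryBackoff)
        await new Promise<void>((resolve) => {
          setTimeout(resolve, delay)
        })
      }
    }
  }
  throw lastError
}

const delays = [1, 2, 3, 4, 5, 6, 7].map((n) => retryDelay(n, 1000, 2))
console.log(delays) // [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

With a 1s base and a factor of 2, the cap takes effect from the sixth retry onward.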
### Error Handling
- Tool errors → caught, returned as `ToolResult(isError: true)`, never thrown
- Task failures → retry if `maxRetries > 0`, then cascade to all dependents; independent tasks continue
- LLM API errors → propagate to caller
### Built-in Tools
`bash`, `file_read`, `file_write`, `file_edit`, `grep`, `glob` — registered via `registerBuiltInTools(registry)`. `delegate_to_agent` is opt-in (`registerBuiltInTools(registry, { includeDelegateTool: true })`) and only wired up inside pool workers by `runTeam`/`runTasks` — see "Agent Delegation" below.
### Agent Delegation
`delegate_to_agent` (in `src/tool/built-in/delegate.ts`) lets an agent synchronously hand a sub-prompt to another roster agent and receive its final output as a tool result. Only active during orchestrated runs; standalone `runAgent` and the `runTeam` short-circuit path (`isSimpleGoal` hit) do not inject it.
Guards (all enforced in the tool itself, before `runDelegatedAgent` is called):
- **Self-delegation:** rejected (`target === context.agent.name`)
- **Unknown agent:** rejected (target not in team roster)
- **Cycle detection:** rejected if target already in `TeamInfo.delegationChain` (prevents `A → B → A` from burning tokens up to the depth cap)
- **Depth cap:** `OrchestratorConfig.maxDelegationDepth` (default 3)
- **Pool deadlock:** rejected when `AgentPool.availableRunSlots < 1`, without calling the pool
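The guards above amount to a short chain of checks that runs before any sub-agent is started. The sketch below is illustrative: `DelegationRequest` and `checkDelegation` are hypothetical names, the rejection strings are made up, and measuring depth as the length of the delegation chain is an assumption.

```typescript
// Hypothetical inputs; the real tool reads these from its execution context.
interface DelegationRequest {
  caller: string
  target: string
  roster: string[]
  delegationChain: string[] // agents already involved in this delegation chain
  maxDelegationDepth: number
  availableRunSlots: number
}

// Returns a rejection reason, or null when the delegation may proceed.
function checkDelegation(req: DelegationRequest): string | null {
  if (req.target === req.caller) return 'self-delegation'
  if (!req.roster.includes(req.target)) return 'unknown agent'
  if (req.delegationChain.includes(req.target)) return 'cycle'
  // Assumption: depth is the number of agents already in the chain.
  if (req.delegationChain.length >= req.maxDelegationDepth) return 'depth cap reached'
  if (req.availableRunSlots < 1) return 'no free pool slot'
  return null
}

const baseRequest: DelegationRequest = {
  caller: 'architect',
  target: 'developer',
  roster: ['architect', 'developer', 'reviewer'],
  delegationChain: ['architect'],
  maxDelegationDepth: 3,
  availableRunSlots: 2,
}
console.log(checkDelegation(baseRequest)) // null
console.log(checkDelegation({ ...baseRequest, target: 'architect' })) // 'self-delegation'
```

Running the cheap identity checks first means a rejected call never touches the pool, which is the point of the pool-deadlock guard.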
The delegated run's `AgentRunResult.tokenUsage` is surfaced via `ToolResult.metadata.tokenUsage`; the runner accumulates it into `totalUsage` before the next `maxTokenBudget` check, so delegation cannot silently bypass the parent's budget. Delegation `tool_result` blocks are exempt from `compressToolResults` and the `compact` context strategy so the parent agent retains the full sub-agent output across turns. If the team has shared memory enabled, a best-effort audit entry is written to SharedMemory at `{caller}/delegation:{target}:{timestamp}-{rand}`.
### Adding an LLM Adapter
Implement `LLMAdapter` interface with `chat(messages, options)` and `stream(messages, options)`, then register in `createAdapter()` factory in `src/llm/adapter.ts`.

@ -1,11 +1,11 @@
# Architecture Decisions
This document records our architectural decisions — both what we choose NOT to build, and what we're actively working toward. Our goal is to be the **simplest multi-agent framework**, but simplicity doesn't mean closed. We believe the long-term value of a framework isn't its feature checklist — it's the size of the network it connects to.
This document records deliberate "won't do" decisions for the project. These are features we evaluated and chose NOT to implement — not because they're bad ideas, but because they conflict with our positioning as the **simplest multi-agent framework**.
If you're considering a PR in any of these areas, please open a discussion first.
## Won't Do
These are paradigms we evaluated and deliberately chose not to implement, because they conflict with our core model.
### 1. Agent Handoffs
**What**: Agent A transfers an in-progress conversation to Agent B (like OpenAI Agents SDK `handoff()`).
@ -20,30 +20,24 @@ These are paradigms we evaluated and deliberately chose not to implement, becaus
**Related**: Closing #20 with this rationale.
## Open to Adoption
These are protocols we see strategic value in and are actively tracking. We're waiting for the right moment — not the right feature spec, but the right network density.
> **Our thesis**: Framework competition on features (DAG scheduling, shared memory, zero-dependency) is a race that can always be caught. Network competition — where the value of the framework grows with every agent published to it — creates a fundamentally different moat. MCP and A2A are the protocols that turn a framework from a build tool into a registry.
### 3. MCP Integration (Model Context Protocol)
**What**: Anthropic's protocol for connecting LLMs to external tools and data sources.
**Status**: **Next up.** MCP has crossed the adoption threshold — Cursor, Windsurf, Claude Code all ship with built-in support, and many services now provide MCP servers directly. Asking users to re-wrap each one via `defineTool()` creates unnecessary friction.
**Approach**: Optional peer dependency (`@modelcontextprotocol/sdk`). Zero impact on the core — if you don't use MCP, you don't pay for it. This preserves our minimal-dependency principle while connecting to the broader tool ecosystem.
**Tracking**: #86
### 4. A2A Protocol (Agent-to-Agent)
### 3. A2A Protocol (Agent-to-Agent)
**What**: Google's open protocol for agents on different servers to discover and communicate with each other.
**Status**: **Watching.** The spec is still evolving and production adoption is minimal. But we recognize A2A's potential to enable the network effect we care about — if 1,000 developers publish agent services using open-multi-agent, the 1,001st developer isn't just choosing an API, they're choosing which ecosystem has the most agents they can call.
**Why not**: Too early — the spec is still evolving and adoption is minimal. Our users run agents in a single process, not across distributed services. If A2A matures and there's real demand, we can revisit. Today it would add complexity for zero practical benefit.
**When we'll move**: When A2A adoption reaches a tipping point where the protocol connects real, production agent services — not just demos. We'll prioritize a lightweight integration that lets agents be both consumers and providers of A2A services.
### 4. MCP Integration (Model Context Protocol)
**What**: Anthropic's protocol for connecting LLMs to external tools and data sources.
**Why not**: MCP is valuable but targets a different layer. Our `defineTool()` API already lets users wrap any external service as a tool in ~10 lines of code. Adding MCP would mean maintaining protocol compatibility, transport layers, and tool discovery — complexity that serves tool platform builders, not our target users who just want to run agent teams.
### 5. Dashboard / Visualization
**What**: Built-in web UI to visualize task DAGs, agent activity, and token usage.
**Why not**: We expose data, we don't build UI. The `onProgress` callback and upcoming `onTrace` (#18) give users all the raw data. They can pipe it into Grafana, build a custom dashboard, or use console logs. Shipping a web UI means owning a frontend stack, which is outside our scope.
---
*Last updated: 2026-04-09*
*Last updated: 2026-04-03*

README.md, 393 changed lines

@ -1,56 +1,24 @@
# Open Multi-Agent
The lightweight multi-agent orchestration engine for TypeScript. Three runtime dependencies, zero config, goal to result in one `runTeam()` call.
Build AI agent teams that decompose goals into tasks automatically. Define agents with roles and tools, describe a goal — the framework plans the task graph, schedules dependencies, and runs everything in parallel.
CrewAI is Python. LangGraph makes you draw the graph by hand. `open-multi-agent` is the `npm install` you drop into an existing Node.js backend when you need a team of agents to work on a goal together. Nothing more, nothing less.
3 runtime dependencies. 27 source files. One `runTeam()` call from goal to result.
[![npm version](https://img.shields.io/npm/v/@jackchen_me/open-multi-agent)](https://www.npmjs.com/package/@jackchen_me/open-multi-agent)
[![GitHub stars](https://img.shields.io/github/stars/JackChen-me/open-multi-agent)](https://github.com/JackChen-me/open-multi-agent/stargazers)
[![license](https://img.shields.io/github/license/JackChen-me/open-multi-agent)](./LICENSE)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.6-blue)](https://www.typescriptlang.org/)
[![runtime deps](https://img.shields.io/badge/runtime_deps-3-brightgreen)](https://github.com/JackChen-me/open-multi-agent/blob/main/package.json)
[![codecov](https://codecov.io/gh/JackChen-me/open-multi-agent/graph/badge.svg)](https://codecov.io/gh/JackChen-me/open-multi-agent)
**English** | [中文](./README_zh.md)
## What you actually get
## Why Open Multi-Agent?
- **Goal to result in one call.** `runTeam(team, "Build a REST API")` kicks off a coordinator agent that decomposes the goal into a task DAG, resolves dependencies, runs independent tasks in parallel, and synthesizes the final output. No graph to draw, no tasks to wire up.
- **TypeScript-native, three runtime dependencies.** `@anthropic-ai/sdk`, `openai`, `zod`. That is the whole runtime. Embed in Express, Next.js, serverless functions, or CI/CD pipelines. No Python runtime, no subprocess bridge, no cloud sidecar.
- **Multi-model teams.** Claude, GPT, Gemini, Grok, MiniMax, DeepSeek, Copilot, or any OpenAI-compatible local model (Ollama, vLLM, LM Studio, llama.cpp) in the same team. Run the architect on Opus 4.7, the developer on GPT-5.4, the reviewer on local Gemma 4, all in one `runTeam()` call. Gemini ships as an optional peer dependency: `npm install @google/genai` to enable.
Other features (MCP integration, context strategies, structured output, task retry, human-in-the-loop, lifecycle hooks, loop detection, observability) live below the fold and in [`examples/`](./examples/).
## How is this different from X?
**vs. [LangGraph JS](https://github.com/langchain-ai/langgraphjs).** LangGraph is declarative graph orchestration: you define nodes, edges, and conditional routing, then `compile()` and `invoke()`. `open-multi-agent` is goal-driven: you declare a team and a goal, a coordinator decomposes it into a task DAG at runtime. LangGraph gives you total control of topology (great for fixed production workflows). This gives you less typing and faster iteration (great for exploratory multi-agent work). LangGraph also has mature checkpointing; we do not.
**vs. [CrewAI](https://github.com/crewAIInc/crewAI).** CrewAI is the mature Python choice. If your stack is Python, use CrewAI. `open-multi-agent` is TypeScript-native: three runtime dependencies, embeds directly in Node.js without a subprocess bridge. Roughly comparable capability on the orchestration side. Choose on language fit.
**vs. [Vercel AI SDK](https://github.com/vercel/ai).** AI SDK is the LLM call layer: a unified TypeScript client for 60+ providers with streaming, tool calls, and structured outputs. It does not orchestrate multi-agent teams. `open-multi-agent` sits on top when you need that. They compose: use AI SDK for single-agent work, reach for this when you need a team.
## Ecosystem
`open-multi-agent` is a new project (launched 2026-04-01, MIT). The ecosystem is still forming, so the lists below are short and honest.
### In production
- **[temodar-agent](https://github.com/xeloxa/temodar-agent)** (~50 stars). WordPress security analysis platform by [Ali Sünbül](https://github.com/xeloxa). Uses our built-in tools (`bash`, `file_*`, `grep`) directly in its Docker runtime. Confirmed production use.
- **Cybersecurity SOC (home lab).** A private setup running Qwen 2.5 + DeepSeek Coder entirely offline via Ollama, building an autonomous SOC pipeline on Wazuh + Proxmox. Early user, not yet public.
Using `open-multi-agent` in production or a side project? [Open a discussion](https://github.com/JackChen-me/open-multi-agent/discussions) and we will list it here.
### Integrations (free)
- **[Engram](https://www.engram-memory.com)** — "Git for AI memory." Syncs knowledge across agents instantly and flags conflicts. ([repo](https://github.com/Agentscreator/engram-memory))
Built an integration? [Open a discussion](https://github.com/JackChen-me/open-multi-agent/discussions) to get listed.
### Featured Partner ($3,000 / year)
12 months of prominent placement: logo, 100-word description, and a maintainer endorsement quote. For products or platforms already integrated with `open-multi-agent`.
[Inquire about Featured Partner](https://github.com/JackChen-me/open-multi-agent/issues/new?title=Featured+Partner+Inquiry&labels=featured-partner-inquiry)
- **Auto Task Decomposition** — Describe a goal in plain text. A built-in coordinator agent breaks it into a task DAG with dependencies and assignees — no manual orchestration needed.
- **Multi-Agent Teams** — Define agents with different roles, tools, and even different models. They collaborate through a message bus and shared memory.
- **Task DAG Scheduling** — Tasks have dependencies. The framework resolves them topologically — dependent tasks wait, independent tasks run in parallel.
- **Model Agnostic** — Claude, GPT, Gemma 4, and local models (Ollama, vLLM, LM Studio) in the same team. Swap models per agent via `baseURL`.
- **Structured Output** — Add `outputSchema` (Zod) to any agent. Output is parsed as JSON, validated, and auto-retried once on failure. Access typed results via `result.structured`.
- **Task Retry** — Set `maxRetries` on tasks for automatic retry with exponential backoff. Token usage from failed attempts is included in the total, so billing stays accurate.
- **In-Process Execution** — No subprocess overhead. Everything runs in one Node.js process. Deploy to serverless, Docker, CI/CD.
## Quick Start
@ -60,23 +28,9 @@ Requires Node.js >= 18.
npm install @jackchen_me/open-multi-agent
```
Set the API key for your provider. Local models via Ollama require no API key. See [`providers/ollama`](examples/providers/ollama.ts).
Set `ANTHROPIC_API_KEY` (and optionally `OPENAI_API_KEY` or `GITHUB_TOKEN` for Copilot) in your environment. Local models via Ollama require no API key — see [example 06](examples/06-local-model.ts).
- `ANTHROPIC_API_KEY`
- `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT` (for Azure OpenAI; the deployment is an optional fallback when `model` is blank)
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `XAI_API_KEY` (for Grok)
- `MINIMAX_API_KEY` (for MiniMax)
- `MINIMAX_BASE_URL` (for MiniMax, optional, selects endpoint)
- `DEEPSEEK_API_KEY` (for DeepSeek)
- `GITHUB_TOKEN` (for Copilot)
### CLI (`oma`)
For shell and CI, the package exposes a JSON-first binary. See [docs/cli.md](./docs/cli.md) for `oma run`, `oma task`, `oma provider`, exit codes, and file formats.
Three agents, one goal. The framework handles the rest:
Three agents, one goal — the framework handles the rest:
```typescript
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
@ -89,8 +43,19 @@ const architect: AgentConfig = {
tools: ['file_write'],
}
const developer: AgentConfig = { /* same shape, tools: ['bash', 'file_read', 'file_write', 'file_edit'] */ }
const reviewer: AgentConfig = { /* same shape, tools: ['file_read', 'grep'] */ }
const developer: AgentConfig = {
name: 'developer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You implement what the architect designs.',
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You review code for correctness and clarity.',
tools: ['file_read', 'grep'],
}
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
@ -103,7 +68,7 @@ const team = orchestrator.createTeam('api-team', {
sharedMemory: true,
})
// Describe a goal. The framework breaks it into tasks and orchestrates execution
// Describe a goal — the framework breaks it into tasks and orchestrates execution
const result = await orchestrator.runTeam(team, 'Create a REST API for a todo list in /tmp/todo-api/')
console.log(`Success: ${result.success}`)
@ -119,8 +84,8 @@ task_complete architect
task_start developer
task_start developer // independent tasks run in parallel
task_complete developer
task_complete developer
task_start reviewer // unblocked after implementation
task_complete developer
task_complete reviewer
agent_complete coordinator // synthesizes final result
Success: true
@ -131,25 +96,35 @@ Tokens: 12847 output tokens
| Mode | Method | When to use |
|------|--------|-------------|
| Single agent | `runAgent()` | One agent, one prompt. Simplest entry point |
| Single agent | `runAgent()` | One agent, one prompt — simplest entry point |
| Auto-orchestrated team | `runTeam()` | Give a goal, framework plans and executes |
| Explicit pipeline | `runTasks()` | You define the task graph and assignments |
For MapReduce-style fan-out without task dependencies, use `AgentPool.runParallel()` directly. See [`patterns/fan-out-aggregate`](examples/patterns/fan-out-aggregate.ts).
## Contributors
<a href="https://github.com/JackChen-me/open-multi-agent/graphs/contributors">
<img src="https://contrib.rocks/image?repo=JackChen-me/open-multi-agent" />
</a>
## Examples
[`examples/`](./examples/) is organized by category: basics, providers, patterns, integrations, and production. See [`examples/README.md`](./examples/README.md) for the full index. Highlights:
All examples are runnable scripts in [`examples/`](./examples/). Run any of them with `npx tsx`:
- [`basics/team-collaboration`](examples/basics/team-collaboration.ts): `runTeam()` coordinator pattern.
- [`patterns/structured-output`](examples/patterns/structured-output.ts): any agent returns Zod-validated JSON.
- [`patterns/agent-handoff`](examples/patterns/agent-handoff.ts): synchronous sub-agent delegation via `delegate_to_agent`.
- [`integrations/trace-observability`](examples/integrations/trace-observability.ts): `onTrace` spans for LLM calls, tools, and tasks.
- [`integrations/mcp-github`](examples/integrations/mcp-github.ts): expose an MCP server's tools to an agent via `connectMCPTools()`.
- [`integrations/with-vercel-ai-sdk`](examples/integrations/with-vercel-ai-sdk/): Next.js app combining OMA `runTeam()` with AI SDK `useChat` streaming.
- **Provider examples**: eight three-agent teams (one per supported provider) under [`examples/providers/`](examples/providers/).
```bash
npx tsx examples/01-single-agent.ts
```
Run scripts with `npx tsx examples/basics/team-collaboration.ts`.
| Example | What it shows |
|---------|---------------|
| [01 — Single Agent](examples/01-single-agent.ts) | `runAgent()` one-shot, `stream()` streaming, `prompt()` multi-turn |
| [02 — Team Collaboration](examples/02-team-collaboration.ts) | `runTeam()` auto-orchestration with coordinator pattern |
| [03 — Task Pipeline](examples/03-task-pipeline.ts) | `runTasks()` explicit dependency graph (design → implement → test + review) |
| [04 — Multi-Model Team](examples/04-multi-model-team.ts) | `defineTool()` custom tools, mixed Anthropic + OpenAI providers, `AgentPool` |
| [05 — Copilot](examples/05-copilot-test.ts) | GitHub Copilot as an LLM provider |
| [06 — Local Model](examples/06-local-model.ts) | Ollama + Claude in one pipeline via `baseURL` (works with vLLM, LM Studio, etc.) |
| [07 — Fan-Out / Aggregate](examples/07-fan-out-aggregate.ts) | `runParallel()` MapReduce — 3 analysts in parallel, then synthesize |
| [08 — Gemma 4 Local](examples/08-gemma4-local.ts) | Pure-local Gemma 4 agent team with tool-calling — zero API cost |
| [09 — Gemma 4 Auto-Orchestration](examples/09-gemma4-auto-orchestration.ts) | `runTeam()` with Gemma 4 as coordinator — auto task decomposition, fully local |
## Architecture
@ -178,22 +153,17 @@ Run scripts with `npx tsx examples/basics/team-collaboration.ts`.
│ └───────────────────────┘
┌────────▼──────────┐
│ Agent │
│ - run() │ ┌────────────────────────┐
│ - prompt() │───►│ LLMAdapter │
│ - stream() │ │ - AnthropicAdapter │
└────────┬──────────┘ │ - OpenAIAdapter │
│ │ - AzureOpenAIAdapter │
│ │ - CopilotAdapter │
│ │ - GeminiAdapter │
│ │ - GrokAdapter │
│ │ - MiniMaxAdapter │
│ │ - DeepSeekAdapter │
│ └────────────────────────┘
│ - run() │ ┌──────────────────────┐
│ - prompt() │───►│ LLMAdapter │
│ - stream() │ │ - AnthropicAdapter │
└────────┬──────────┘ │ - OpenAIAdapter │
│ │ - CopilotAdapter │
│ └──────────────────────┘
┌────────▼──────────┐
│ AgentRunner │ ┌──────────────────────┐
│ - conversation │───►│ ToolRegistry │
│ loop │ │ - defineTool() │
│ - tool dispatch │ │ - 6 built-in tools │
│ - tool dispatch │ │ - 5 built-in tools │
└───────────────────┘ └──────────────────────┘
```
@ -206,157 +176,6 @@ Run scripts with `npx tsx examples/basics/team-collaboration.ts`.
| `file_write` | Write or create a file. Auto-creates parent directories. |
| `file_edit` | Edit a file by replacing an exact string match. |
| `grep` | Search file contents with regex. Uses ripgrep when available, falls back to Node.js. |
| `glob` | Find files by glob pattern. Returns matching paths sorted by modification time. |
## Tool Configuration
Agents can be configured with fine-grained tool access control using presets, allowlists, and denylists.
### Tool Presets
Predefined tool sets for common use cases:
```typescript
const readonlyAgent: AgentConfig = {
name: 'reader',
model: 'claude-sonnet-4-6',
toolPreset: 'readonly', // file_read, grep, glob
}
const readwriteAgent: AgentConfig = {
name: 'editor',
model: 'claude-sonnet-4-6',
toolPreset: 'readwrite', // file_read, file_write, file_edit, grep, glob
}
const fullAgent: AgentConfig = {
name: 'executor',
model: 'claude-sonnet-4-6',
toolPreset: 'full', // file_read, file_write, file_edit, grep, glob, bash
}
```
### Advanced Filtering
Combine presets with allowlists and denylists for precise control:
```typescript
const customAgent: AgentConfig = {
name: 'custom',
model: 'claude-sonnet-4-6',
toolPreset: 'readwrite', // Start with: file_read, file_write, file_edit, grep, glob
tools: ['file_read', 'grep'], // Allowlist: intersect with preset = file_read, grep
disallowedTools: ['grep'], // Denylist: subtract = file_read only
}
```
**Resolution order:** preset → allowlist → denylist → framework safety rails.
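The first three stages of that order can be sketched as plain set operations. This is an illustration, not the framework's resolver: `resolveTools` and `ToolFilter` are hypothetical names, the preset contents are copied from the comments in the examples above, and treating a missing preset as "all built-in tools" is an assumption. The framework safety rails are not modelled here.

```typescript
type PresetName = 'readonly' | 'readwrite' | 'full'

// Preset contents as listed in the comments of the examples above.
const PRESETS: Record<PresetName, string[]> = {
  readonly: ['file_read', 'grep', 'glob'],
  readwrite: ['file_read', 'file_write', 'file_edit', 'grep', 'glob'],
  full: ['file_read', 'file_write', 'file_edit', 'grep', 'glob', 'bash'],
}

interface ToolFilter {
  toolPreset?: PresetName
  tools?: string[]
  disallowedTools?: string[]
}

function resolveTools(filter: ToolFilter): string[] {
  // 1. Preset (assumption: no preset means every built-in tool is a candidate).
  let result = [...PRESETS[filter.toolPreset ?? 'full']]

  // 2. Allowlist: keep only the candidates that are also listed in `tools`.
  const allow = filter.tools
  if (allow) result = result.filter((name) => allow.includes(name))

  // 3. Denylist: remove anything listed in `disallowedTools`.
  const deny = filter.disallowedTools
  if (deny) result = result.filter((name) => !deny.includes(name))

  return result
}

const resolved = resolveTools({
  toolPreset: 'readwrite',
  tools: ['file_read', 'grep'],
  disallowedTools: ['grep'],
})
console.log(resolved) // ['file_read']
```

Because the denylist is applied last, a tool named in both `tools` and `disallowedTools` is always removed.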
### Custom Tools
Two ways to give an agent a tool that is not in the built-in set.
**Inject at config time** via `customTools` on `AgentConfig`. Good when the orchestrator wires up tools centrally. Tools defined here bypass preset/allowlist filtering but still respect `disallowedTools`.
```typescript
import { defineTool } from '@jackchen_me/open-multi-agent'
import { z } from 'zod'
const weatherTool = defineTool({
name: 'get_weather',
description: 'Look up current weather for a city.',
schema: z.object({ city: z.string() }),
execute: async ({ city }) => ({ content: await fetchWeather(city) }),
})
const agent: AgentConfig = {
name: 'assistant',
model: 'claude-sonnet-4-6',
customTools: [weatherTool],
}
```
**Register at runtime** via `agent.addTool(tool)`. Tools added this way are always available, regardless of filtering.
### Tool Output Control
Long tool outputs can blow up conversation size and cost. Two controls work together.
**Truncation.** Cap an individual tool result to a head + tail excerpt with a marker in between:
```typescript
const agent: AgentConfig = {
// ...
maxToolOutputChars: 10_000, // applies to every tool this agent runs
}
// Per-tool override (takes priority over AgentConfig.maxToolOutputChars):
const bigQueryTool = defineTool({
// ...
maxOutputChars: 50_000,
})
```
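A head + tail excerpt of the kind described above could be produced along these lines. This is a sketch under assumptions: `truncateMiddle` is a hypothetical helper, the even head/tail split and the marker wording are made up, and the marker itself is not counted against the limit.

```typescript
// Keep the start and end of a long output and replace the middle with a marker.
function truncateMiddle(output: string, maxChars: number): string {
  if (output.length <= maxChars) return output

  const budget = Math.max(0, maxChars)
  const head = Math.ceil(budget / 2)
  const tail = Math.floor(budget / 2)
  const omitted = output.length - head - tail

  // Slice the tail from an explicit start index so that tail === 0 yields ''.
  return (
    output.slice(0, head) +
    `\n[... ${omitted} chars omitted ...]\n` +
    output.slice(output.length - tail)
  )
}

const longOutput = 'a'.repeat(50) + 'b'.repeat(50)
console.log(truncateMiddle(longOutput, 20))
```

Keeping both ends is useful for command output, where the invocation appears at the top and the error or exit status usually appears at the bottom.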
**Post-consumption compression.** Once the agent has acted on a tool result, compress older copies in the transcript so they stop costing input tokens on every subsequent turn. Error results are never compressed.
```typescript
const agent: AgentConfig = {
// ...
compressToolResults: true, // default threshold: 500 chars
// or: compressToolResults: { minChars: 2_000 }
}
```
### MCP Tools (Model Context Protocol)
`open-multi-agent` can connect to any MCP server and expose its tools directly to agents.
```typescript
import { connectMCPTools } from '@jackchen_me/open-multi-agent/mcp'
const { tools, disconnect } = await connectMCPTools({
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-github'],
env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
namePrefix: 'github',
})
// Register each MCP tool in your ToolRegistry, then include their names in AgentConfig.tools
// Don't forget cleanup when done
await disconnect()
```
Notes:
- `@modelcontextprotocol/sdk` is an optional peer dependency, only needed when using MCP.
- Current transport support is stdio.
- MCP input validation is delegated to the MCP server (`inputSchema` is `z.any()`).
See [`integrations/mcp-github`](examples/integrations/mcp-github.ts) for a full runnable setup.
## Context Management
Long-running agents can hit input token ceilings fast. Set `contextStrategy` on `AgentConfig` to control how the conversation shrinks as it grows:
```typescript
const agent: AgentConfig = {
name: 'long-runner',
model: 'claude-sonnet-4-6',
// Pick one:
contextStrategy: { type: 'sliding-window', maxTurns: 20 },
// contextStrategy: { type: 'summarize', maxTokens: 80_000, summaryModel: 'claude-haiku-4-5' },
// contextStrategy: { type: 'compact', maxTokens: 100_000, preserveRecentTurns: 4 },
// contextStrategy: { type: 'custom', compress: (messages, estimatedTokens, ctx) => ... },
}
```
| Strategy | When to reach for it |
|----------|----------------------|
| `sliding-window` | Cheapest. Keep the last N turns, drop the rest. |
| `summarize` | Send old turns to a summary model; keep the summary in place of the originals. |
| `compact` | Rule-based: truncate large assistant text blocks and tool results, keep recent turns intact. No extra LLM call. |
| `custom` | Supply your own `compress(messages, estimatedTokens, ctx)` function. |
Pairs well with `compressToolResults` and `maxToolOutputChars` above.
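As an illustration of the cheapest strategy, a sliding window over turns might be implemented roughly as below. `SketchMessage` and `slidingWindow` are hypothetical names, and the definition of a turn (each user message opens a new turn) is an assumption; the framework may count turns differently.

```typescript
// Hypothetical, simplified message shape.
interface SketchMessage {
  role: 'user' | 'assistant'
  content: string
}

// Keep only the messages that belong to the last `maxTurns` turns.
function slidingWindow(messages: SketchMessage[], maxTurns: number): SketchMessage[] {
  if (maxTurns <= 0) return []

  // Indices where a turn starts (assumption: each user message opens a turn).
  const starts: number[] = []
  messages.forEach((m, i) => {
    if (m.role === 'user') starts.push(i)
  })
  if (starts.length <= maxTurns) return messages

  return messages.slice(starts[starts.length - maxTurns])
}

const history: SketchMessage[] = [
  { role: 'user', content: 'q1' },
  { role: 'assistant', content: 'a1' },
  { role: 'user', content: 'q2' },
  { role: 'assistant', content: 'a2' },
  { role: 'user', content: 'q3' },
  { role: 'assistant', content: 'a3' },
]
console.log(slidingWindow(history, 2).map((m) => m.content)) // ['q2', 'a2', 'q3', 'a3']
```

Cutting on turn boundaries rather than on raw message counts keeps each retained answer next to the question that produced it.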
## Supported Providers
@ -364,115 +183,31 @@ Pairs well with `compressToolResults` and `maxToolOutputChars` above.
|----------|--------|---------|--------|
| Anthropic (Claude) | `provider: 'anthropic'` | `ANTHROPIC_API_KEY` | Verified |
| OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | Verified |
| Azure OpenAI | `provider: 'azure-openai'` | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` (+ optional `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`) | Verified |
| Grok (xAI) | `provider: 'grok'` | `XAI_API_KEY` | Verified |
| MiniMax (global) | `provider: 'minimax'` | `MINIMAX_API_KEY` | Verified |
| MiniMax (China) | `provider: 'minimax'` + `MINIMAX_BASE_URL` | `MINIMAX_API_KEY` | Verified |
| DeepSeek | `provider: 'deepseek'` | `DEEPSEEK_API_KEY` | Verified |
| GitHub Copilot | `provider: 'copilot'` | `GITHUB_TOKEN` | Verified |
| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | Verified |
| Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | none | Verified |
| Groq | `provider: 'openai'` + `baseURL` | `GROQ_API_KEY` | Verified |
| llama.cpp server | `provider: 'openai'` + `baseURL` | none | Verified |
| Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | — | Verified |
Gemini requires `npm install @google/genai` (optional peer dependency).
Verified local models with tool-calling: **Gemma 4** (see [example 08](examples/08-gemma4-local.ts)).
Any OpenAI-compatible API should work via `provider: 'openai'` + `baseURL` (Mistral, Qwen, Moonshot, Doubao, etc.). Groq is now verified in [`providers/groq`](examples/providers/groq.ts). **Grok, MiniMax, and DeepSeek now have first-class support** via `provider: 'grok'`, `provider: 'minimax'`, and `provider: 'deepseek'`.
### Local Model Tool-Calling
The framework supports tool-calling with local models served by Ollama, vLLM, LM Studio, or llama.cpp. Tool-calling is handled natively by these servers via the OpenAI-compatible API.
**Verified models:** Gemma 4, Llama 3.1, Qwen 3, Mistral, Phi-4. See the full list at [ollama.com/search?c=tools](https://ollama.com/search?c=tools).
**Fallback extraction:** If a local model returns tool calls as text instead of using the `tool_calls` wire format (common with thinking models or misconfigured servers), the framework automatically extracts them from the text output.
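To make the idea concrete, here is a minimal sketch of text-based extraction. It is not the framework's implementation: the `<tool_call>…</tool_call>` wrapper is an assumed convention (some models emit it), and `extractToolCallsFromText` and `ExtractedToolCall` are hypothetical names.

```typescript
// Hypothetical result shape; the framework's internal type is not documented here.
interface ExtractedToolCall {
  name: string
  arguments: Record<string, unknown>
}

// Scan free text for <tool_call>{...}</tool_call> blocks and parse each body as JSON.
function extractToolCallsFromText(text: string): ExtractedToolCall[] {
  const calls: ExtractedToolCall[] = []
  const pattern = /<tool_call>([\s\S]*?)<\/tool_call>/g
  for (const match of text.matchAll(pattern)) {
    try {
      const parsed: unknown = JSON.parse(match[1] ?? '')
      if (typeof parsed !== 'object' || parsed === null) continue
      const candidate = parsed as { name?: unknown; arguments?: unknown }
      if (
        typeof candidate.name === 'string' &&
        typeof candidate.arguments === 'object' &&
        candidate.arguments !== null
      ) {
        calls.push({
          name: candidate.name,
          arguments: candidate.arguments as Record<string, unknown>,
        })
      }
    } catch {
      // Body was not valid JSON: ignore this candidate and keep scanning.
    }
  }
  return calls
}
```

Malformed candidates are skipped rather than raised, on the assumption that a model's plain-text answer should still pass through when no tool call can be recovered.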
**Timeout:** Local inference can be slow. Use `timeoutMs` on `AgentConfig` to prevent indefinite hangs:
```typescript
const localAgent: AgentConfig = {
name: 'local',
model: 'llama3.1',
provider: 'openai',
baseURL: 'http://localhost:11434/v1',
apiKey: 'ollama',
tools: ['bash', 'file_read'],
timeoutMs: 120_000, // abort after 2 minutes
}
```
**Troubleshooting:**
- Model not calling tools? Ensure it appears in Ollama's [Tools category](https://ollama.com/search?c=tools). Not all models support tool-calling.
- Using Ollama? Update to the latest release (re-run the installer or upgrade through your package manager). Older versions have known tool-calling bugs.
- Proxy interfering? Use `no_proxy=localhost` when running against local servers.
### LLM Configuration Examples
```typescript
const grokAgent: AgentConfig = {
name: 'grok-agent',
provider: 'grok',
model: 'grok-4',
systemPrompt: 'You are a helpful assistant.',
}
```
Set the `XAI_API_KEY` environment variable; no `baseURL` is needed.
```typescript
const minimaxAgent: AgentConfig = {
name: 'minimax-agent',
provider: 'minimax',
model: 'MiniMax-M2.7',
systemPrompt: 'You are a helpful assistant.',
}
```
Set `MINIMAX_API_KEY`. The adapter selects the endpoint via `MINIMAX_BASE_URL`:
- `https://api.minimax.io/v1`: global endpoint (default)
- `https://api.minimaxi.com/v1`: mainland China endpoint
You can also pass `baseURL` directly in `AgentConfig` to override the env var.
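The precedence described above can be sketched as follows. This is an assumption drawn from the text (explicit `baseURL`, then `MINIMAX_BASE_URL`, then the global default), not the adapter's actual code; `resolveMiniMaxBaseURL` is a hypothetical helper.

```typescript
const MINIMAX_GLOBAL_ENDPOINT = 'https://api.minimax.io/v1'

// Assumed precedence: AgentConfig.baseURL > MINIMAX_BASE_URL > global default.
function resolveMiniMaxBaseURL(
  configBaseURL: string | undefined,
  env: Record<string, string | undefined>,
): string {
  return configBaseURL ?? env['MINIMAX_BASE_URL'] ?? MINIMAX_GLOBAL_ENDPOINT
}
```

Passing the environment in as a parameter (rather than reading `process.env` inside the function) keeps the sketch easy to test.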
```typescript
const deepseekAgent: AgentConfig = {
name: 'deepseek-agent',
provider: 'deepseek',
model: 'deepseek-chat',
systemPrompt: 'You are a helpful assistant.',
}
```
Set `DEEPSEEK_API_KEY`. Available models: `deepseek-chat` (DeepSeek-V3, recommended for coding) and `deepseek-reasoner` (thinking mode).
Any OpenAI-compatible API should work via `provider: 'openai'` + `baseURL` (DeepSeek, Groq, Mistral, Qwen, MiniMax, etc.). These providers have not been fully verified yet — contributions welcome via [#25](https://github.com/JackChen-me/open-multi-agent/issues/25).
## Contributing
Issues, feature requests, and PRs are welcome. Some areas where contributions would be especially valuable:
- **Production examples.** Real-world end-to-end workflows. See [`examples/production/README.md`](./examples/production/README.md) for the acceptance criteria and submission format.
- **Documentation.** Guides, tutorials, and API docs.
## Contributors
<a href="https://github.com/JackChen-me/open-multi-agent/graphs/contributors">
<img src="https://contrib.rocks/image?repo=JackChen-me/open-multi-agent&max=20&v=20260423" />
</a>
- **Provider integrations** — Verify and document OpenAI-compatible providers (DeepSeek, Groq, Qwen, MiniMax, etc.) via `baseURL`. See [#25](https://github.com/JackChen-me/open-multi-agent/issues/25). For providers that are NOT OpenAI-compatible (e.g. Gemini), a new `LLMAdapter` implementation is welcome — the interface requires just two methods: `chat()` and `stream()`.
- **Examples** — Real-world workflows and use cases.
- **Documentation** — Guides, tutorials, and API docs.
## Star History
<a href="https://star-history.com/#JackChen-me/open-multi-agent&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&theme=dark&v=20260423" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260423" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260423" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&theme=dark&v=20260403" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260403" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260403" />
</picture>
</a>
## Translations
Help translate this README. [Open a PR](https://github.com/JackChen-me/open-multi-agent/pulls).
## License
MIT

View File

@ -1,56 +1,24 @@
# Open Multi-Agent
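A lightweight multi-agent orchestration engine for TypeScript. Three runtime dependencies, zero configuration, and a single `runTeam()` call takes you from goal to result.
Build AI agent teams that decompose goals automatically. Define each agent's role and tools, describe a goal, and the framework plans the task graph, schedules dependencies, and runs tasks in parallel.
CrewAI is Python. LangGraph makes you draw the graph yourself. `open-multi-agent` is the layer you `npm install` into an existing Node.js backend: a team of agents collaborating on one goal, and nothing more.
Three runtime dependencies, 27 source files, and one `runTeam()` call from goal to result.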
[![npm version](https://img.shields.io/npm/v/@jackchen_me/open-multi-agent)](https://www.npmjs.com/package/@jackchen_me/open-multi-agent)
[![GitHub stars](https://img.shields.io/github/stars/JackChen-me/open-multi-agent)](https://github.com/JackChen-me/open-multi-agent/stargazers)
[![license](https://img.shields.io/github/license/JackChen-me/open-multi-agent)](./LICENSE)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.6-blue)](https://www.typescriptlang.org/)
[![runtime deps](https://img.shields.io/badge/runtime_deps-3-brightgreen)](https://github.com/JackChen-me/open-multi-agent/blob/main/package.json)
[![codecov](https://codecov.io/gh/JackChen-me/open-multi-agent/graph/badge.svg)](https://codecov.io/gh/JackChen-me/open-multi-agent)
[English](./README.md) | **中文**
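## Core Capabilities
## Why Open Multi-Agent
- Call `runTeam(team, "Build a REST API")` and the coordinator agent breaks the goal into a task DAG, runs independent tasks in parallel, and merges the results. No graph to draw, no dependencies to wire by hand.
- There are only three runtime dependencies: `@anthropic-ai/sdk`, `openai`, and `zod`. It drops straight into Express, Next.js, serverless functions, or CI/CD, with no Python process and no cloud sidecar.
- Agents in the same team can run on different models: the architect on Opus 4.7, the developer on GPT-5.4, and the reviewer on a local Gemma 4. Supported: Claude, GPT, Gemini, Grok, MiniMax, DeepSeek, Copilot, and OpenAI-compatible local models (Ollama, vLLM, LM Studio, llama.cpp). Gemini requires installing `@google/genai` separately.
There is also MCP, context strategies, structured output, task retries, human-in-the-loop, lifecycle hooks, loop detection, observability, and more, covered in the sections below or in [`examples/`](./examples/).
## Choosing Between Frameworks
If you are looking at [LangGraph JS](https://github.com/langchain-ai/langgraphjs): it is declarative graph orchestration, where you define nodes, edges, and routing yourself, then `compile()` + `invoke()`. `open-multi-agent` takes the opposite, goal-driven approach: give it a team and a goal, and the coordinator builds the DAG at runtime. Pick LangGraph when you want full control of the topology and the flow is settled; pick this when you want less code, faster iteration, and are still exploring. LangGraph has mature checkpointing; we do not.
For a Python stack, just use [CrewAI](https://github.com/crewAIInc/crewAI); the orchestration capabilities are comparable. `open-multi-agent` is positioned as TypeScript-native: three dependencies, runs directly in Node.js, no subprocess bridge. Choose by language.
It does not conflict with [Vercel AI SDK](https://github.com/vercel/ai). AI SDK is the LLM call layer: a unified TypeScript client with 60+ providers, streaming, tool calls, and structured output, but no multi-agent orchestration. For multiple agents, layer `open-multi-agent` on top of AI SDK. Use AI SDK for a single agent and this for multi-agent work.
## Ecosystem
The project launched on 2026-04-01 under the MIT license. The ecosystem is still taking shape, so the list below is short, but every entry is real.
### Used in Production
- **[temodar-agent](https://github.com/xeloxa/temodar-agent)** (about 50 stars). A WordPress security analysis platform by [Ali Sünbül](https://github.com/xeloxa). It uses our built-in tools (`bash`, `file_*`, `grep`) directly inside a Docker runtime. Confirmed production use.
- **Home-server cybersecurity SOC.** Runs Qwen 2.5 + DeepSeek Coder fully offline (via Ollama) to build an autonomous SOC pipeline on Wazuh + Proxmox. Early user, not public.
If you use `open-multi-agent` in production or in a side project, [please open a Discussion](https://github.com/JackChen-me/open-multi-agent/discussions) and I will add it.
### Integrations (free)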
- **[Engram](https://www.engram-memory.com)** — "Git for AI memory." Syncs knowledge across agents instantly and flags conflicts. ([repo](https://github.com/Agentscreator/engram-memory))
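Built an `open-multi-agent` integration? [Open a Discussion](https://github.com/JackChen-me/open-multi-agent/discussions) and I will add it.
### Featured Partner ($3,000 / year)
Twelve months of prominent placement: logo, a 100-word description, and a maintainer endorsement quote. For products or platforms that already integrate `open-multi-agent`.
[Ask about Featured Partner](https://github.com/JackChen-me/open-multi-agent/issues/new?title=Featured+Partner+Inquiry&labels=featured-partner-inquiry)
- **Automatic task decomposition**: Describe a goal in natural language and the built-in coordinator agent breaks it into a task graph with dependencies and assignees, with no manual orchestration.
- **Multi-agent teams**: Define agents with different roles, tools, and even different models. They collaborate through a message bus and shared memory.
- **Task DAG scheduling**: Tasks can depend on each other. The framework sorts them topologically: dependent tasks wait, independent tasks run in parallel.
- **Model-agnostic**: Claude, GPT, Gemma 4, and local models (Ollama, vLLM, LM Studio) can work in the same team. Any OpenAI-compatible service plugs in via `baseURL`.
- **Structured output**: Add an `outputSchema` (Zod) to any agent; output is parsed as JSON and validated, with one automatic retry on validation failure. Read the typed result from `result.structured`.
- **Task retries**: Set `maxRetries` on a task to retry failures with exponential backoff. Token usage is accumulated across all attempts so billing stays accurate.
- **In-process execution**: No subprocess overhead. Everything runs in one Node.js process and deploys to serverless, Docker, or CI/CD.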
## Quick Start
@ -60,23 +28,9 @@ Python 栈直接用 [CrewAI](https://github.com/crewAIInc/crewAI) 就行,编
npm install @jackchen_me/open-multi-agent
```
Set the API key for the provider you use. Local models served through Ollama need no key; see [`providers/ollama`](examples/providers/ollama.ts).
Set `ANTHROPIC_API_KEY` in the environment (and optionally `OPENAI_API_KEY`, or `GITHUB_TOKEN` for Copilot). Local models through Ollama need no API key; see [example 06](examples/06-local-model.ts).
- `ANTHROPIC_API_KEY`
- `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT` (Azure OpenAI; the deployment variable is the fallback when `model` is empty)
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `XAI_API_KEY` (Grok)
- `MINIMAX_API_KEY` (MiniMax)
- `MINIMAX_BASE_URL` (MiniMax, optional; selects the endpoint)
- `DEEPSEEK_API_KEY` (DeepSeek)
- `GITHUB_TOKEN` (Copilot)
### CLI (`oma`)
The package also ships a command-line tool called `oma` for shell and CI use; all output is JSON. `oma run`, `oma task`, `oma provider`, exit codes, and file formats are documented in [docs/cli.md](./docs/cli.md).
Below, three agents collaborate to build a REST API:
Three agents, one goal, and the framework handles the rest:
```typescript
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
@ -89,8 +43,19 @@ const architect: AgentConfig = {
tools: ['file_write'],
}
const developer: AgentConfig = { /* same structure, with tools: ['bash', 'file_read', 'file_write', 'file_edit'] */ }
const reviewer: AgentConfig = { /* same structure, with tools: ['file_read', 'grep'] */ }
const developer: AgentConfig = {
name: 'developer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You implement what the architect designs.',
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You review code for correctness and clarity.',
tools: ['file_read', 'grep'],
}
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
@ -103,11 +68,11 @@ const team = orchestrator.createTeam('api-team', {
sharedMemory: true,
})
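// Describe a goal; the framework decomposes it into tasks and orchestrates execution
// Describe a goal, and the framework breaks it into tasks and orchestrates them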
const result = await orchestrator.runTeam(team, 'Create a REST API for a todo list in /tmp/todo-api/')
console.log(`Success: ${result.success}`)
console.log(`Tokens: ${result.totalTokenUsage.output_tokens} output tokens`)
console.log(`成功: ${result.success}`)
console.log(`Token 用量: ${result.totalTokenUsage.output_tokens} output tokens`)
```
What happens during a run:
@ -119,37 +84,51 @@ task_complete architect
task_start developer
task_start developer // independent tasks run in parallel
task_complete developer
task_complete developer
task_start reviewer // unlocked automatically once implementation completes
task_complete developer
task_complete reviewer
agent_complete coordinator // synthesizes all results
Success: true
Tokens: 12847 output tokens
```
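## Author
> JackChen: former WPS product manager, now an independent founder. Follow [「杰克西|硅基杠杆」](https://www.xiaohongshu.com/user/profile/5a1bdc1e4eacab4aa39ea6d6) on Xiaohongshu for my ongoing views on AI agents.
## Three Run Modes
| Mode | Method | When to use |
|------|------|----------|
| Single agent | `runAgent()` | One agent, one prompt. The simplest entry point |
| Single agent | `runAgent()` | One agent, one prompt: the simplest entry point |
| Auto-orchestrated team | `runTeam()` | Give it a goal; the framework plans and executes |
| Explicit task pipeline | `runTasks()` | You define the task graph and assignments yourself |
For MapReduce-style fan-out without task dependencies, use `AgentPool.runParallel()` directly. See [`patterns/fan-out-aggregate`](examples/patterns/fan-out-aggregate.ts).
## Contributors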
<a href="https://github.com/JackChen-me/open-multi-agent/graphs/contributors">
<img src="https://contrib.rocks/image?repo=JackChen-me/open-multi-agent" />
</a>
## Examples
[`examples/`](./examples/) is organized into basics, providers, patterns, integrations, and production. The full index is in [`examples/README.md`](./examples/README.md); a few worth reading first:
All examples are runnable scripts in the [`examples/`](./examples/) directory. Run them with `npx tsx`:
- [`basics/team-collaboration`](examples/basics/team-collaboration.ts): the `runTeam()` coordinator pattern.
- [`patterns/structured-output`](examples/patterns/structured-output.ts): any agent produces Zod-validated JSON.
- [`patterns/agent-handoff`](examples/patterns/agent-handoff.ts): synchronous sub-agent delegation with `delegate_to_agent`.
- [`integrations/trace-observability`](examples/integrations/trace-observability.ts): the `onTrace` callback emits structured spans for LLM calls, tools, and tasks.
- [`integrations/mcp-github`](examples/integrations/mcp-github.ts): expose an MCP server's tools to agents with `connectMCPTools()`.
- [`integrations/with-vercel-ai-sdk`](examples/integrations/with-vercel-ai-sdk/): a Next.js app combining OMA `runTeam()` with AI SDK `useChat` streaming.
- **Provider examples**: eight three-agent team examples, one per provider, in [`examples/providers/`](examples/providers/).
```bash
npx tsx examples/01-single-agent.ts
```
Run a script with `npx tsx examples/basics/team-collaboration.ts`.
| Example | What it shows |
|------|----------|
| [01 - Single Agent](examples/01-single-agent.ts) | `runAgent()` one-shot calls, `stream()` streaming output, `prompt()` multi-turn conversation |
| [02 - Team Collaboration](examples/02-team-collaboration.ts) | `runTeam()` auto-orchestration with the coordinator pattern |
| [03 - Task Pipeline](examples/03-task-pipeline.ts) | `runTasks()` explicit dependency graph (design → implement → test + review) |
| [04 - Multi-Model Team](examples/04-multi-model-team.ts) | `defineTool()` custom tools, mixed Anthropic + OpenAI, `AgentPool` |
| [05 - Copilot](examples/05-copilot-test.ts) | GitHub Copilot as the LLM provider |
| [06 - Local Model](examples/06-local-model.ts) | Ollama + Claude hybrid pipeline via `baseURL` (also works with vLLM, LM Studio, etc.) |
| [07 - Fan-Out / Aggregate](examples/07-fan-out-aggregate.ts) | `runParallel()` MapReduce: three analysts in parallel, then synthesis |
| [08 - Gemma 4 Local](examples/08-gemma4-local.ts) | Fully local Gemma 4 agent team with tool-calling, at zero API cost |
| [09 - Gemma 4 Auto-Orchestration](examples/09-gemma4-auto-orchestration.ts) | `runTeam()` with Gemma 4 as coordinator: automatic task decomposition, fully local |
## Architecture
@ -178,22 +157,17 @@ Tokens: 12847 output tokens
│ └───────────────────────┘
┌────────▼──────────┐
│ Agent │
│ - run() │ ┌────────────────────────┐
│ - prompt() │───►│ LLMAdapter │
│ - stream() │ │ - AnthropicAdapter │
└────────┬──────────┘ │ - OpenAIAdapter │
│ │ - AzureOpenAIAdapter │
│ │ - CopilotAdapter │
│ │ - GeminiAdapter │
│ │ - GrokAdapter │
│ │ - MiniMaxAdapter │
│ │ - DeepSeekAdapter │
│ └────────────────────────┘
│ - run() │ ┌──────────────────────┐
│ - prompt() │───►│ LLMAdapter │
│ - stream() │ │ - AnthropicAdapter │
└────────┬──────────┘ │ - OpenAIAdapter │
│ │ - CopilotAdapter │
│ └──────────────────────┘
┌────────▼──────────┐
│ AgentRunner │ ┌──────────────────────┐
│ - conversation │───►│ ToolRegistry │
│ loop │ │ - defineTool() │
│ - tool dispatch │ │ - 6 built-in tools │
│ - tool dispatch │ │ - 5 built-in tools │
└───────────────────┘ └──────────────────────┘
```
@ -201,160 +175,11 @@ Tokens: 12847 output tokens
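| Tool | Description |
|------|------|
| `bash` | Shell commands. Returns stdout + stderr. Supports a timeout and a working directory. |
| `file_read` | Reads a file by absolute path. Supports offset and line limits, so it can handle large files. |
| `bash` | Executes shell commands. Returns stdout + stderr. Supports timeout and working-directory settings. |
| `file_read` | Reads file contents at the given absolute path. Supports offset and line limits for large files. |
| `file_write` | Writes or creates a file. Creates parent directories automatically. |
| `file_edit` | Edits a file by exact string match. |
| `grep` | Searches file contents with a regular expression. Prefers ripgrep and falls back to Node.js when it is unavailable. |
| `glob` | Finds files by glob pattern. Returns matching paths sorted by modification time. |
## Tool Configuration
Three layers stack: the preset, `tools` (allowlist), and `disallowedTools` (denylist).
### Tool Presets
Three built-in presets: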
```typescript
const readonlyAgent: AgentConfig = {
name: 'reader',
model: 'claude-sonnet-4-6',
toolPreset: 'readonly', // file_read, grep, glob
}
const readwriteAgent: AgentConfig = {
name: 'editor',
model: 'claude-sonnet-4-6',
toolPreset: 'readwrite', // file_read, file_write, file_edit, grep, glob
}
const fullAgent: AgentConfig = {
name: 'executor',
model: 'claude-sonnet-4-6',
toolPreset: 'full', // file_read, file_write, file_edit, grep, glob, bash
}
```
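### Advanced Filtering
```typescript
const customAgent: AgentConfig = {
  name: 'custom',
  model: 'claude-sonnet-4-6',
  toolPreset: 'readwrite',      // starting point: file_read, file_write, file_edit, grep, glob
  tools: ['file_read', 'grep'], // allowlist: intersect with the preset = file_read, grep
  disallowedTools: ['grep'],    // denylist: then subtract = only file_read remains
}
```
**Resolution order:** preset → allowlist → denylist → framework safety guardrails.
### Custom Tools
There are two ways to add a tool that is not in the built-in set.
**Inject at configuration time.** Pass it through `AgentConfig.customTools`. Use this when the orchestration layer attaches tools centrally. Tools defined here bypass the preset and allowlist but are still subject to `disallowedTools`.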
```typescript
import { defineTool } from '@jackchen_me/open-multi-agent'
import { z } from 'zod'
const weatherTool = defineTool({
name: 'get_weather',
description: 'Look up the current weather for a city.',
schema: z.object({ city: z.string() }),
execute: async ({ city }) => ({ content: await fetchWeather(city) }),
})
const agent: AgentConfig = {
name: 'assistant',
model: 'claude-sonnet-4-6',
customTools: [weatherTool],
}
```
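**Register at runtime.** `agent.addTool(tool)`. Tools added this way are always available and are not affected by any filter rule.
### Tool Output Control
Long tool results quickly inflate the conversation and the cost. Two switches work together.
**Truncation.** Compress a single tool result into a head + tail excerpt (with a marker in the middle):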
```typescript
const agent: AgentConfig = {
// ...
maxToolOutputChars: 10_000, // default cap for all of this agent's tools
}
// Per-tool override (takes precedence over AgentConfig.maxToolOutputChars)
const bigQueryTool = defineTool({
// ...
maxOutputChars: 50_000,
})
```
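**Post-consumption compression.** Once the agent has used a tool result, the copy kept in history is compressed so later turns do not keep paying input tokens for it. Error results are not compressed.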
```typescript
const agent: AgentConfig = {
// ...
compressToolResults: true, // default threshold: 500 characters
// or: compressToolResults: { minChars: 2_000 }
}
```
### MCP Tools (Model Context Protocol)
You can connect to any MCP server and hand its tools directly to agents.
```typescript
import { connectMCPTools } from '@jackchen_me/open-multi-agent/mcp'
const { tools, disconnect } = await connectMCPTools({
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-github'],
env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
namePrefix: 'github',
})
// Register each MCP tool in your ToolRegistry, then reference their names in AgentConfig.tools
// Remember to clean up when you are done
await disconnect()
```
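Notes:
- `@modelcontextprotocol/sdk` is an optional peer dependency; install it only when you use MCP.
- Only the stdio transport is supported for now.
- MCP input validation is delegated to the MCP server (`inputSchema` is `z.any()`).
See [`integrations/mcp-github`](examples/integrations/mcp-github.ts) for a complete example.
## Context Management
Long-running agents easily hit the input token limit. Set `contextStrategy` on `AgentConfig` to control how the conversation shrinks as it grows: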
```typescript
const agent: AgentConfig = {
name: 'long-runner',
model: 'claude-sonnet-4-6',
// Pick one:
contextStrategy: { type: 'sliding-window', maxTurns: 20 },
// contextStrategy: { type: 'summarize', maxTokens: 80_000, summaryModel: 'claude-haiku-4-5' },
// contextStrategy: { type: 'compact', maxTokens: 100_000, preserveRecentTurns: 4 },
// contextStrategy: { type: 'custom', compress: (messages, estimatedTokens, ctx) => ... },
}
```
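| Strategy | When to use it |
|------|------------|
| `sliding-window` | Least effort. Keeps only the last N turns and drops the rest. |
| `summarize` | Sends old turns to a summary model and replaces the originals with the summary. |
| `compact` | Rule-based: truncates oversized assistant text blocks and tool results and keeps the most recent turns. No extra LLM call. |
| `custom` | Pass your own `compress(messages, estimatedTokens, ctx)` function. |
Works even better combined with `compressToolResults` and `maxToolOutputChars` above.
| `file_edit` | Edits a file via exact string matching. |
| `grep` | Searches file contents with a regular expression. Prefers ripgrep and falls back to a Node.js implementation. |
## Supported Providers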
@ -362,108 +187,28 @@ const agent: AgentConfig = {
|----------|------|----------|------|
| Anthropic (Claude) | `provider: 'anthropic'` | `ANTHROPIC_API_KEY` | Verified |
| OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | Verified |
| Azure OpenAI | `provider: 'azure-openai'` | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` (optional: `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`) | Verified |
| Grok (xAI) | `provider: 'grok'` | `XAI_API_KEY` | Verified |
| MiniMax (global) | `provider: 'minimax'` | `MINIMAX_API_KEY` | Verified |
| MiniMax (China) | `provider: 'minimax'` + `MINIMAX_BASE_URL` | `MINIMAX_API_KEY` | Verified |
| DeepSeek | `provider: 'deepseek'` | `DEEPSEEK_API_KEY` | Verified |
| GitHub Copilot | `provider: 'copilot'` | `GITHUB_TOKEN` | Verified |
| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | Verified |
| Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | none | Verified |
| Groq | `provider: 'openai'` + `baseURL` | `GROQ_API_KEY` | Verified |
| llama.cpp server | `provider: 'openai'` + `baseURL` | none | Verified |
| Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | - | Verified |
Gemini requires `npm install @google/genai` (optional peer dependency).
Local models verified with tool-calling: **Gemma 4** (see [example 08](examples/08-gemma4-local.ts)).
Any OpenAI-compatible API works via `provider: 'openai'` + `baseURL` (Mistral, Qwen, Moonshot, Doubao, etc.). Groq is verified in [`providers/groq`](examples/providers/groq.ts). Grok, MiniMax, and DeepSeek use `provider: 'grok'`, `provider: 'minimax'`, and `provider: 'deepseek'` directly, with no `baseURL` needed.
### Local Model Tool-Calling
Local models served by Ollama, vLLM, LM Studio, or llama.cpp can also do tool-calling, through the OpenAI-compatible API those servers provide.
**Verified models:** Gemma 4, Llama 3.1, Qwen 3, Mistral, Phi-4. See the full list at [ollama.com/search?c=tools](https://ollama.com/search?c=tools).
**Fallback extraction:** If a local model returns tool calls as text rather than in the `tool_calls` wire format (common with thinking models or misconfigured servers), the framework extracts them from the text automatically.
**Timeout.** Local inference can be slow. Set `timeoutMs` on `AgentConfig` to avoid hanging indefinitely:
```typescript
const localAgent: AgentConfig = {
name: 'local',
model: 'llama3.1',
provider: 'openai',
baseURL: 'http://localhost:11434/v1',
apiKey: 'ollama',
tools: ['bash', 'file_read'],
timeoutMs: 120_000, // abort after 2 minutes
}
```
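**Troubleshooting:**
- Model not calling tools? First confirm it appears in Ollama's [Tools category](https://ollama.com/search?c=tools); not every model supports tool-calling.
- Upgrade Ollama to the latest version; older versions have tool-calling bugs.
- Proxy in the way? Use `no_proxy=localhost` to bypass the proxy for local servers.
### LLM Configuration Examples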
```typescript
const grokAgent: AgentConfig = {
name: 'grok-agent',
provider: 'grok',
model: 'grok-4',
systemPrompt: 'You are a helpful assistant.',
}
```
Set `XAI_API_KEY`; no `baseURL` is needed.
```typescript
const minimaxAgent: AgentConfig = {
name: 'minimax-agent',
provider: 'minimax',
model: 'MiniMax-M2.7',
systemPrompt: 'You are a helpful assistant.',
}
```
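Set `MINIMAX_API_KEY`. Select the endpoint with `MINIMAX_BASE_URL`:
- `https://api.minimax.io/v1`: global endpoint (default)
- `https://api.minimaxi.com/v1`: mainland China endpoint
You can also pass `baseURL` directly in `AgentConfig` to override the environment variable.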
```typescript
const deepseekAgent: AgentConfig = {
name: 'deepseek-agent',
provider: 'deepseek',
model: 'deepseek-chat',
systemPrompt: 'You are a helpful assistant.',
}
```
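Set `DEEPSEEK_API_KEY`. Two models are available: `deepseek-chat` (DeepSeek-V3, the pick for coding) and `deepseek-reasoner` (thinking mode).
Any OpenAI-compatible API can be connected via `provider: 'openai'` + `baseURL` (DeepSeek, Groq, Mistral, Qwen, MiniMax, etc.). These providers have not been fully verified yet; verification contributions are welcome via [#25](https://github.com/JackChen-me/open-multi-agent/issues/25).
## Contributing
Issues, feature requests, and PRs are all welcome. Especially wanted:
Issues, feature requests, and PRs are welcome. Contributions in these areas are especially valuable:
- **Production examples.** Real-world workflows that run end to end. See [`examples/production/README.md`](./examples/production/README.md) for acceptance criteria and submission format.
- **Documentation.** Guides, tutorials, and API docs.
## Contributors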
<a href="https://github.com/JackChen-me/open-multi-agent/graphs/contributors">
<img src="https://contrib.rocks/image?repo=JackChen-me/open-multi-agent&max=20&v=20260423" />
</a>
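- **Provider integrations**: Verify and document OpenAI-compatible providers (DeepSeek, Groq, Qwen, MiniMax, etc.) connected via `baseURL`. See [#25](https://github.com/JackChen-me/open-multi-agent/issues/25). For providers that are not OpenAI-compatible (such as Gemini), a new `LLMAdapter` implementation is welcome; the interface needs just two methods: `chat()` and `stream()`.
- **Examples**: Real-world workflows and use cases.
- **Documentation**: Guides, tutorials, and API docs.
## Star History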
<a href="https://star-history.com/#JackChen-me/open-multi-agent&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&theme=dark&v=20260423" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260423" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260423" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&theme=dark&v=20260403" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260403" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JackChen-me/open-multi-agent&type=Date&v=20260403" />
</picture>
</a>

View File

@ -1 +0,0 @@
comment: false

View File

@ -1,260 +0,0 @@
# Command-line interface (`oma`)
The package ships a small binary **`oma`** that exposes the same primitives as the TypeScript API: `runTeam`, `runTasks`, plus a static provider reference. It is meant for **shell scripts and CI** (JSON on stdout, stable exit codes).
It does **not** provide an interactive REPL, working-directory injection into tools, human approval gates, or session persistence. Those stay in application code.
## Installation and invocation
After installing the package, the binary is on `PATH` when using `npx` or a local `node_modules/.bin`:
```bash
npm install @jackchen_me/open-multi-agent
npx oma help
```
From a clone of the repository you need a build first:
```bash
npm run build
node dist/cli/oma.js help
```
Set the usual provider API keys in the environment (see [README](../README.md#quick-start)); the CLI does not read secrets from flags. MiniMax additionally reads `MINIMAX_BASE_URL` to select the global (`https://api.minimax.io/v1`) or China (`https://api.minimaxi.com/v1`) endpoint.
---
## Commands
### `oma run`
Runs **`OpenMultiAgent.runTeam(team, goal)`**: coordinator decomposition, task queue, optional synthesis.
When invoked with `--dashboard`, the **`oma` CLI** writes a static post-execution DAG dashboard HTML to `oma-dashboards/runTeam-<timestamp>.html` under the current working directory (the library does not write files itself; if you want this outside the CLI, call `renderTeamRunDashboard(result)` in application code — see `src/dashboard/render-team-run-dashboard.ts`).
The dashboard page loads **Tailwind CSS** (Play CDN) and **Google Fonts** (Space Grotesk, Inter, Material Symbols) from the network at view time. Opening the HTML file therefore requires an **online** environment unless you host or inline those assets yourself (a possible future improvement).
| Argument | Required | Description |
|----------|----------|-------------|
| `--goal` | Yes | Natural-language goal passed to the team run. |
| `--team` | Yes | Path to JSON (see [Team file](#team-file)). |
| `--orchestrator` | No | Path to JSON merged into `new OpenMultiAgent(...)` after any orchestrator fragment from the team file. |
| `--coordinator` | No | Path to JSON passed as `runTeam(..., { coordinator })` (`CoordinatorConfig`). |
| `--dashboard` | No | Write a post-execution DAG dashboard HTML to `oma-dashboards/runTeam-<timestamp>.html`. |
Global flags: [`--pretty`](#output-flags), [`--include-messages`](#output-flags).
### `oma task`
Runs **`OpenMultiAgent.runTasks(team, tasks)`** with a fixed task list (no coordinator decomposition).
| Argument | Required | Description |
|----------|----------|-------------|
| `--file` | Yes | Path to [tasks file](#tasks-file). |
| `--team` | No | Path to JSON `TeamConfig`. When set, overrides the `team` object inside `--file`. |
Global flags: [`--pretty`](#output-flags), [`--include-messages`](#output-flags).
### `oma provider`
Read-only helper for wiring JSON configs and env vars.
- **`oma provider`** or **`oma provider list`** — Prints JSON: built-in provider ids, API key environment variable names, whether `baseURL` is supported, and short notes (e.g. OpenAI-compatible servers, Copilot in CI).
- **`oma provider template <provider>`** — Prints a JSON object with example `orchestrator` and `agent` fields plus placeholder `env` entries. `<provider>` is one of: `anthropic`, `openai`, `gemini`, `grok`, `minimax`, `deepseek`, `copilot`.
Supports `--pretty`.
### `oma`, `oma help`, `oma -h`, `oma --help`
Prints usage text to stdout and exits **0**.
---
## Configuration files
Shapes match the library types `TeamConfig`, `OrchestratorConfig`, `CoordinatorConfig`, and the task objects accepted by `runTasks()`.
### Team file
Used with **`oma run --team`** (and optionally **`oma task --team`**).
**Option A — plain `TeamConfig`**
```json
{
"name": "api-team",
"agents": [
{
"name": "architect",
"model": "claude-sonnet-4-6",
"provider": "anthropic",
"systemPrompt": "You design APIs.",
"tools": ["file_read", "file_write"],
"maxTurns": 6
}
],
"sharedMemory": true
}
```
**Option B — team plus default orchestrator settings**
```json
{
"team": {
"name": "api-team",
"agents": [{ "name": "worker", "model": "claude-sonnet-4-6", "systemPrompt": "…" }]
},
"orchestrator": {
"defaultModel": "claude-sonnet-4-6",
"defaultProvider": "anthropic",
"maxConcurrency": 3
}
}
```
Validation rules enforced by the CLI:
- Root (or `team`) must be an object.
- `team.name` is a non-empty string.
- `team.agents` is a non-empty array; each agent must have non-empty `name` and `model`.
Any other fields are passed through to the library as in TypeScript.
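The three rules above could be checked with something like the following sketch. It is not the CLI's actual validator: `validateTeamFile` and its error strings are hypothetical, and it assumes Option B is detected by the presence of a top-level `team` key.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === 'string' && value.length > 0
}

// Returns a list of problems; an empty list means the file passes the three rules.
function validateTeamFile(root: unknown): string[] {
  if (!isRecord(root)) return ['root must be an object']
  // Option B nests the team under `team`; Option A is the TeamConfig itself.
  const team: unknown = 'team' in root ? root['team'] : root
  if (!isRecord(team)) return ['team must be an object']

  const errors: string[] = []
  if (!isNonEmptyString(team['name'])) errors.push('team.name must be a non-empty string')

  const agents: unknown = team['agents']
  if (!Array.isArray(agents) || agents.length === 0) {
    errors.push('team.agents must be a non-empty array')
    return errors
  }
  agents.forEach((agent: unknown, index: number) => {
    if (!isRecord(agent) || !isNonEmptyString(agent['name']) || !isNonEmptyString(agent['model'])) {
      errors.push(`team.agents[${index}] needs a non-empty name and model`)
    }
  })
  return errors
}
```

Collecting every problem instead of failing on the first one makes a single CI run more informative, at the cost of a slightly longer function.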
### Tasks file
Used with **`oma task --file`**.
```json
{
"orchestrator": {
"defaultModel": "claude-sonnet-4-6"
},
"team": {
"name": "pipeline",
"agents": [
{ "name": "designer", "model": "claude-sonnet-4-6", "systemPrompt": "…" },
{ "name": "builder", "model": "claude-sonnet-4-6", "systemPrompt": "…" }
],
"sharedMemory": true
},
"tasks": [
{
"title": "Design",
"description": "Produce a short spec for the feature.",
"assignee": "designer"
},
{
"title": "Implement",
"description": "Implement from the design.",
"assignee": "builder",
"dependsOn": ["Design"]
}
]
}
```
- **`dependsOn`** — Task titles (not internal ids), same convention as the coordinator output in the library.
- Optional per-task fields: `memoryScope` (`"dependencies"` \| `"all"`), `maxRetries`, `retryDelayMs`, `retryBackoff`.
- **`tasks`** must be a non-empty array; each item needs string `title` and `description`.
If **`--team path.json`** is passed, the file's top-level `team` property is ignored and the external file is used instead (useful when the same team definition is shared across several pipeline files).
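To see how title-based `dependsOn` entries translate into an execution order, here is a small sketch that groups tasks into waves that could run in parallel. It is an illustration of the dependency convention only; `SketchTask` and `executionWaves` are hypothetical names, and the library's real scheduler may order or batch tasks differently.

```typescript
// Minimal task shape for illustration; real task objects carry more fields.
interface SketchTask {
  title: string
  dependsOn?: string[]
}

// Group task titles into waves: every task in a wave has all dependencies in earlier waves.
function executionWaves(tasks: SketchTask[]): string[][] {
  const remaining = new Map<string, Set<string>>()
  for (const task of tasks) remaining.set(task.title, new Set(task.dependsOn ?? []))

  const waves: string[][] = []
  while (remaining.size > 0) {
    const ready = [...remaining].filter(([, deps]) => deps.size === 0).map(([title]) => title)
    if (ready.length === 0) {
      // Nothing can start: either a cycle or a dependsOn title that matches no task.
      throw new Error(`Unresolvable dependencies among: ${[...remaining.keys()].join(', ')}`)
    }
    for (const title of ready) remaining.delete(title)
    for (const deps of remaining.values()) {
      for (const title of ready) deps.delete(title)
    }
    waves.push(ready)
  }
  return waves
}
```

For the file above, this yields one wave containing `Design` followed by one containing `Implement`.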
### Orchestrator and coordinator JSON
These files are arbitrary JSON objects merged into **`OrchestratorConfig`** and **`CoordinatorConfig`**. Function-valued options (`onProgress`, `onApproval`, etc.) cannot appear in JSON and are not supported by the CLI.
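The reason callbacks cannot travel through these files is a property of JSON itself, which the short demonstration below shows. The config object here is made up for the demo and only borrows field names mentioned in this document.

```typescript
// A config object that mixes plain data with a callback.
const configWithCallback = {
  maxConcurrency: 3,
  onProgress: (event: unknown): void => {
    console.log(event)
  },
}

// JSON.stringify omits function-valued properties, so the callback does not survive.
const roundTripped: Record<string, unknown> = JSON.parse(JSON.stringify(configWithCallback))
console.log(Object.keys(roundTripped)) // only 'maxConcurrency' remains
```

If you need `onProgress` or `onApproval`, load the JSON in application code and attach the functions there before constructing the orchestrator.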
---
## Output
### Stdout
Every invocation prints **one JSON document** to stdout, followed by a newline.
**Successful `run` / `task`**
```json
{
"command": "run",
"success": true,
"totalTokenUsage": { "input_tokens": 0, "output_tokens": 0 },
"agentResults": {
"architect": {
"success": true,
"output": "…",
"tokenUsage": { "input_tokens": 0, "output_tokens": 0 },
"toolCalls": [],
"structured": null,
"loopDetected": false,
"budgetExceeded": false
}
}
}
```
`agentResults` keys are agent names. When an agent ran multiple tasks, the library merges results; the CLI mirrors the merged `AgentRunResult` fields.
**Errors (usage, validation, I/O, runtime)**
```json
{
"error": {
"kind": "usage",
"message": "--goal and --team are required"
}
}
```
`kind` is one of: `usage`, `validation`, `io`, `runtime`, or `internal` (uncaught errors in the outer handler).
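A consumer of this output might distinguish the two document shapes as sketched below. The interfaces mirror only the fields shown in the samples above; `summarizeOmaOutput` and the type names are hypothetical, and real output may carry additional fields.

```typescript
type OmaErrorKind = 'usage' | 'validation' | 'io' | 'runtime' | 'internal'

interface OmaErrorOutput {
  error: { kind: OmaErrorKind; message: string }
}

// Subset of the success document; only the fields this sketch reads.
interface OmaRunOutput {
  command: string
  success: boolean
  agentResults: Record<string, { success: boolean; output: string }>
}

function isOmaError(doc: unknown): doc is OmaErrorOutput {
  if (typeof doc !== 'object' || doc === null || !('error' in doc)) return false
  const err = (doc as { error: unknown }).error
  return typeof err === 'object' && err !== null && typeof (err as { kind?: unknown }).kind === 'string'
}

// Turn one stdout document into a one-line summary for logs.
function summarizeOmaOutput(stdout: string): string {
  const doc: unknown = JSON.parse(stdout)
  if (isOmaError(doc)) return `error(${doc.error.kind}): ${doc.error.message}`
  const run = doc as OmaRunOutput
  if (run.success) return 'ok'
  const failed = Object.entries(run.agentResults)
    .filter(([, result]) => !result.success)
    .map(([name]) => name)
  return `failed agents: ${failed.join(', ')}`
}
```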
### Output flags
| Flag | Effect |
|------|--------|
| `--pretty` | Pretty-print JSON with indentation. |
| `--include-messages` | Include each agent's full `messages` array in `agentResults`. **Very large** for long runs; omitted by default. |
There is no separate progress stream; for rich telemetry use the TypeScript API with `onProgress` / `onTrace`.
---
## Exit codes
| Code | Meaning |
|------|---------|
| **0** | Success: `run`/`task` finished with `success === true`, or help / `provider` completed normally. |
| **1** | Run finished but **`success === false`** (agent or task failure as reported by the library). |
| **2** | Usage, validation, readable JSON errors, or file access issues (e.g. missing file). |
| **3** | Unexpected error, including typical LLM/API failures surfaced as thrown errors. |
In scripts:
```bash
npx oma run --goal "Summarize README" --team team.json > result.json
code=$?
case $code in
0) echo "OK" ;;
1) echo "Run reported failure — inspect result.json" ;;
2) echo "Bad inputs or files" ;;
3) echo "Crash or API error" ;;
esac
```
---
## Argument parsing
- Long options only: `--goal`, `--team`, `--file`, etc.
- Values may be attached with `=`: `--team=./team.json`.
- Boolean-style flags (`--pretty`, `--include-messages`) take no value; if the next token does not start with `--`, it is treated as the value of the previous option (standard `getopt`-style pairing).
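
The pairing rules above can be sketched as a small parser. This is not the CLI's actual implementation: `parseArgs` and `BOOLEAN_FLAGS` are hypothetical names, and the sketch assumes that boolean-style flags never consume the following token, which is one reading of the last rule.

```typescript
// Illustrative sketch of the pairing rules; not the CLI's real parser.
// BOOLEAN_FLAGS is an assumption based on the flags named in this document.
const BOOLEAN_FLAGS = new Set(['pretty', 'include-messages'])

function parseArgs(argv: string[]): Record<string, string | true> {
  const out: Record<string, string | true> = {}
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i]
    // Positional tokens (e.g. the command name) are skipped in this sketch.
    if (token === undefined || !token.startsWith('--')) continue
    const body = token.slice(2)
    const eq = body.indexOf('=')
    const next = argv[i + 1]
    if (eq !== -1) {
      // --team=./team.json
      out[body.slice(0, eq)] = body.slice(eq + 1)
    } else if (BOOLEAN_FLAGS.has(body)) {
      // --pretty takes no value
      out[body] = true
    } else if (next !== undefined && !next.startsWith('--')) {
      // --goal "Summarize README": the next token becomes the value
      out[body] = next
      i++
    } else {
      // An option with nothing usable after it is recorded as present.
      out[body] = true
    }
  }
  return out
}

console.log(parseArgs(['run', '--goal', 'Summarize README', '--team=./team.json', '--pretty']))
```

A real parser would also need to reject unknown options and report a `usage` error when a required value is missing; the sketch omits both.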
---
## Limitations (by design)
- No TTY session, history, or `stdin` goal input.
- No built-in **`cwd`** or metadata passed into `ToolUseContext` (tools use process cwd unless the library adds other hooks later).
- No **`onApproval`** from JSON; non-interactive batch only.
- Coordinator **`runTeam`** path still requires network and API keys like any other run.


@ -1,18 +1,18 @@
/**
* Single Agent
* Example 01: Single Agent
*
* The simplest possible usage: one agent with bash and file tools, running
* a coding task. Then shows streaming output using the Agent class directly.
*
* Run:
* npx tsx examples/basics/single-agent.ts
* npx tsx examples/01-single-agent.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent, Agent, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
import type { OrchestratorEvent } from '../../src/types.js'
import { OpenMultiAgent, Agent, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../src/index.js'
import type { OrchestratorEvent } from '../src/types.js'
// ---------------------------------------------------------------------------
// Part 1: Single agent via OpenMultiAgent (simplest path)
@ -114,8 +114,6 @@ const conversationAgent = new Agent(
model: 'claude-sonnet-4-6',
systemPrompt: 'You are a TypeScript tutor. Give short, direct answers.',
maxTurns: 2,
// Keep only the most recent turn in long prompt() conversations.
contextStrategy: { type: 'sliding-window', maxTurns: 1 },
},
new ToolRegistry(), // no tools needed for this conversation
new ToolExecutor(new ToolRegistry()),


@ -1,19 +1,19 @@
/**
* Multi-Agent Team Collaboration
* Example 02: Multi-Agent Team Collaboration
*
* Three specialised agents (architect, developer, reviewer) collaborate on a
* shared goal. The OpenMultiAgent orchestrator breaks the goal into tasks, assigns
* them to the right agents, and collects the results.
*
* Run:
* npx tsx examples/basics/team-collaboration.ts
* npx tsx examples/02-team-collaboration.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions


@ -1,21 +1,19 @@
/**
* Explicit Task Pipeline with Dependencies
* Example 03: Explicit Task Pipeline with Dependencies
*
* Demonstrates how to define tasks with explicit dependency chains
* (design → implement → test → review) using runTasks(). The TaskQueue
* automatically blocks downstream tasks until their dependencies complete.
* Prompt context is dependency-scoped by default: each task sees only its own
* description plus direct dependency results (not unrelated team outputs).
*
* Run:
* npx tsx examples/basics/task-pipeline.ts
* npx tsx examples/03-task-pipeline.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../../src/types.js'
import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
// ---------------------------------------------------------------------------
// Agents
@ -118,7 +116,6 @@ const tasks: Array<{
description: string
assignee?: string
dependsOn?: string[]
memoryScope?: 'dependencies' | 'all'
}> = [
{
title: 'Design: URL shortener data model',
@ -165,9 +162,6 @@ Produce a structured code review with sections:
- Verdict: SHIP or NEEDS WORK`,
assignee: 'reviewer',
dependsOn: ['Implement: URL shortener'], // runs in parallel with Test after Implement completes
// Optional override: reviewers can opt into full shared memory when needed.
// Remove this line to keep strict dependency-only context.
memoryScope: 'all',
},
]


@ -1,5 +1,5 @@
/**
* Multi-Model Team with Custom Tools
* Example 04: Multi-Model Team with Custom Tools
*
* Demonstrates:
* - Mixing Anthropic and OpenAI models in the same team
@ -8,7 +8,7 @@
* - Running a team goal that uses the custom tools
*
* Run:
* npx tsx examples/basics/multi-model-team.ts
* npx tsx examples/04-multi-model-team.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY and OPENAI_API_KEY env vars must be set.
@ -16,8 +16,8 @@
*/
import { z } from 'zod'
import { OpenMultiAgent, defineTool } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
import { OpenMultiAgent, defineTool } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../src/types.js'
// ---------------------------------------------------------------------------
// Custom tools — defined with defineTool() + Zod schemas
@ -113,7 +113,7 @@ const formatCurrencyTool = defineTool({
// directly through AgentPool rather than through the OpenMultiAgent high-level API.
// ---------------------------------------------------------------------------
import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../src/index.js'
/**
* Build an Agent with both built-in and custom tools registered.


@ -0,0 +1,49 @@
/**
* Quick smoke test for the Copilot adapter.
*
* Run:
* npx tsx examples/05-copilot-test.ts
*
* If GITHUB_COPILOT_TOKEN is not set, the adapter will start an interactive
* OAuth2 device flow, and you'll be prompted to sign in via your browser.
*/
import { OpenMultiAgent } from '../src/index.js'
import type { OrchestratorEvent } from '../src/types.js'
const orchestrator = new OpenMultiAgent({
defaultModel: 'gpt-4o',
defaultProvider: 'copilot',
onProgress: (event: OrchestratorEvent) => {
if (event.type === 'agent_start') {
console.log(`[start] agent=${event.agent}`)
} else if (event.type === 'agent_complete') {
console.log(`[complete] agent=${event.agent}`)
}
},
})
console.log('Testing Copilot adapter with gpt-4o...\n')
const result = await orchestrator.runAgent(
{
name: 'assistant',
model: 'gpt-4o',
provider: 'copilot',
systemPrompt: 'You are a helpful assistant. Keep answers brief.',
maxTurns: 1,
maxTokens: 256,
},
'What is 2 + 2? Reply in one sentence.',
)
if (result.success) {
console.log('\nAgent output:')
console.log('─'.repeat(60))
console.log(result.output)
console.log('─'.repeat(60))
console.log(`\nTokens: input=${result.tokenUsage.input_tokens}, output=${result.tokenUsage.output_tokens}`)
} else {
console.error('Agent failed:', result.output)
process.exit(1)
}


@ -1,5 +1,5 @@
/**
* Local Model + Cloud Model Team (Ollama + Claude)
* Example 06: Local Model + Cloud Model Team (Ollama + Claude)
*
* Demonstrates mixing a local model served by Ollama with a cloud model
* (Claude) in the same task pipeline. The key technique is using
@ -14,7 +14,7 @@
* Just change the baseURL and model name below.
*
* Run:
* npx tsx examples/providers/ollama.ts
* npx tsx examples/06-local-model.ts
*
* Prerequisites:
* 1. Ollama installed and running: https://ollama.com
@ -22,8 +22,8 @@
* 3. ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../../src/types.js'
import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
// ---------------------------------------------------------------------------
// Agents
@ -64,7 +64,6 @@ Your review MUST include these sections:
Be specific and constructive. Reference line numbers or function names when possible.`,
tools: ['file_read'],
maxTurns: 4,
timeoutMs: 120_000, // 2 min — local models can be slow
}
// ---------------------------------------------------------------------------


@ -1,5 +1,5 @@
/**
* Fan-Out / Aggregate (MapReduce) Pattern
* Example 07: Fan-Out / Aggregate (MapReduce) Pattern
*
* Demonstrates:
* - Fan-out: send the same question to N "analyst" agents in parallel
@ -9,14 +9,14 @@
* - No tools needed: pure LLM reasoning, to keep the focus on the pattern
*
* Run:
* npx tsx examples/patterns/fan-out-aggregate.ts
* npx tsx examples/07-fan-out-aggregate.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
import type { AgentConfig, AgentRunResult } from '../../src/types.js'
import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../src/index.js'
import type { AgentConfig, AgentRunResult } from '../src/types.js'
// ---------------------------------------------------------------------------
// Analysis topic

examples/08-gemma4-local.ts (new file, 203 lines)

@ -0,0 +1,203 @@
/**
* Example 08: Gemma 4 Local Agent Team (100% Local, Zero API Cost)
*
* Demonstrates a fully local multi-agent team using Google's Gemma 4 via
* Ollama. No cloud API keys needed; everything runs on your machine.
*
* Two agents collaborate through a task pipeline:
* - researcher: uses bash + file_write to gather system info and write a report
* - summarizer: uses file_read to read the report and produce a concise summary
*
* This pattern works with any Ollama model that supports tool-calling.
* Gemma 4 (released 2026-04-02) has native tool-calling support.
*
* Run:
* no_proxy=localhost npx tsx examples/08-gemma4-local.ts
*
* Prerequisites:
* 1. Ollama >= 0.20.0 installed and running: https://ollama.com
* 2. Pull the model: ollama pull gemma4:e2b
* (or gemma4:e4b for better quality on machines with more RAM)
* 3. No API keys needed!
*
* Note: The no_proxy=localhost prefix is needed if you have an HTTP proxy
* configured, since the OpenAI SDK would otherwise route Ollama requests
* through the proxy.
*/
import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
// ---------------------------------------------------------------------------
// Configuration — change this to match your Ollama setup
// ---------------------------------------------------------------------------
// See available tags at https://ollama.com/library/gemma4
const OLLAMA_MODEL = 'gemma4:e2b' // or 'gemma4:e4b', 'gemma4:26b'
const OLLAMA_BASE_URL = 'http://localhost:11434/v1'
const OUTPUT_DIR = '/tmp/gemma4-demo'
// ---------------------------------------------------------------------------
// Agents — both use Gemma 4 locally
// ---------------------------------------------------------------------------
/**
* Researcher gathers system information using shell commands.
*/
const researcher: AgentConfig = {
name: 'researcher',
model: OLLAMA_MODEL,
provider: 'openai',
baseURL: OLLAMA_BASE_URL,
apiKey: 'ollama', // placeholder — Ollama ignores this, but the OpenAI SDK requires a non-empty value
systemPrompt: `You are a system researcher. Your job is to gather information
about the current machine using shell commands and write a structured report.
Use the bash tool to run commands like: uname -a, df -h, uptime, and similar
non-destructive read-only commands.
On macOS you can also use: sw_vers, sysctl -n hw.memsize.
On Linux you can also use: cat /etc/os-release, free -h.
Then use file_write to save a Markdown report to ${OUTPUT_DIR}/system-report.md.
The report should have sections: OS, Hardware, Disk, and Uptime.
Be concise: one or two lines per section is enough.`,
tools: ['bash', 'file_write'],
maxTurns: 8,
}
/**
* Summarizer reads the report and writes a one-paragraph executive summary.
*/
const summarizer: AgentConfig = {
name: 'summarizer',
model: OLLAMA_MODEL,
provider: 'openai',
baseURL: OLLAMA_BASE_URL,
apiKey: 'ollama',
systemPrompt: `You are a technical writer. Read the system report file provided,
then produce a concise one-paragraph executive summary (3-5 sentences).
Focus on the key highlights: what OS, how much RAM, disk status, and uptime.`,
tools: ['file_read'],
maxTurns: 4,
}
// ---------------------------------------------------------------------------
// Progress handler
// ---------------------------------------------------------------------------
const taskTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23)
switch (event.type) {
case 'task_start': {
taskTimes.set(event.task ?? '', Date.now())
const task = event.data as Task | undefined
console.log(`[${ts}] TASK START "${task?.title ?? event.task}" → ${task?.assignee ?? '?'}`)
break
}
case 'task_complete': {
const elapsed = Date.now() - (taskTimes.get(event.task ?? '') ?? Date.now())
console.log(`[${ts}] TASK DONE "${event.task}" in ${(elapsed / 1000).toFixed(1)}s`)
break
}
case 'agent_start':
console.log(`[${ts}] AGENT START ${event.agent}`)
break
case 'agent_complete':
console.log(`[${ts}] AGENT DONE ${event.agent}`)
break
case 'error':
console.error(`[${ts}] ERROR ${event.agent ?? ''} task=${event.task ?? '?'}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrator + Team
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: OLLAMA_MODEL,
maxConcurrency: 1, // run agents sequentially — local model can only serve one at a time
onProgress: handleProgress,
})
const team = orchestrator.createTeam('gemma4-team', {
name: 'gemma4-team',
agents: [researcher, summarizer],
sharedMemory: true,
})
// ---------------------------------------------------------------------------
// Task pipeline: research → summarize
// ---------------------------------------------------------------------------
const tasks: Array<{
title: string
description: string
assignee?: string
dependsOn?: string[]
}> = [
{
title: 'Gather system information',
description: `Use bash to run system info commands (uname -a, sw_vers, sysctl, df -h, uptime).
Then write a structured Markdown report to ${OUTPUT_DIR}/system-report.md with sections:
OS, Hardware, Disk, and Uptime.`,
assignee: 'researcher',
},
{
title: 'Summarize the report',
description: `Read the file at ${OUTPUT_DIR}/system-report.md.
Produce a concise one-paragraph executive summary of the system information.`,
assignee: 'summarizer',
dependsOn: ['Gather system information'],
},
]
// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------
console.log('Gemma 4 Local Agent Team — Zero API Cost')
console.log('='.repeat(60))
console.log(` model → ${OLLAMA_MODEL} via Ollama`)
console.log(` researcher → bash + file_write`)
console.log(` summarizer → file_read`)
console.log(` output dir → ${OUTPUT_DIR}`)
console.log()
console.log('Pipeline: researcher gathers info → summarizer writes summary')
console.log('='.repeat(60))
const start = Date.now()
const result = await orchestrator.runTasks(team, tasks)
const totalTime = Date.now() - start
// ---------------------------------------------------------------------------
// Summary
// ---------------------------------------------------------------------------
console.log('\n' + '='.repeat(60))
console.log('Pipeline complete.\n')
console.log(`Overall success: ${result.success}`)
console.log(`Total time: ${(totalTime / 1000).toFixed(1)}s`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [name, r] of result.agentResults) {
const icon = r.success ? 'OK ' : 'FAIL'
const tools = r.toolCalls.map(c => c.toolName).join(', ')
console.log(` [${icon}] ${name.padEnd(12)} tools: ${tools || '(none)'}`)
}
// Print the summarizer's output
const summary = result.agentResults.get('summarizer')
if (summary?.success) {
console.log('\nExecutive Summary (from local Gemma 4):')
console.log('-'.repeat(60))
console.log(summary.output)
console.log('-'.repeat(60))
}
console.log('\nAll processing done locally. $0 API cost.')


@ -0,0 +1,162 @@
/**
* Example 09: Gemma 4 Auto-Orchestration (runTeam, 100% Local)
*
* Demonstrates the framework's key feature: automatic task decomposition
* powered entirely by a local Gemma 4 model. No cloud API needed.
*
* What happens:
* 1. A Gemma 4 "coordinator" receives the goal + agent roster
* 2. It outputs a structured JSON task array (title, description, assignee, dependsOn)
* 3. The framework resolves dependencies, schedules tasks, and runs agents
* 4. The coordinator synthesises all task results into a final answer
*
* This is the hardest test for a local model: it must produce valid JSON
* for task decomposition AND do tool-calling for actual task execution.
* Gemma 4 e2b (5.1B params) handles both reliably.
*
* Run:
* no_proxy=localhost npx tsx examples/09-gemma4-auto-orchestration.ts
*
* Prerequisites:
* 1. Ollama >= 0.20.0 installed and running: https://ollama.com
* 2. Pull the model: ollama pull gemma4:e2b
* 3. No API keys needed!
*
* Note: The no_proxy=localhost prefix is needed if you have an HTTP proxy
* configured, since the OpenAI SDK would otherwise route Ollama requests
* through the proxy.
*/
import { OpenMultiAgent } from '../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../src/types.js'
// ---------------------------------------------------------------------------
// Configuration
// ---------------------------------------------------------------------------
// See available tags at https://ollama.com/library/gemma4
const OLLAMA_MODEL = 'gemma4:e2b' // or 'gemma4:e4b', 'gemma4:26b'
const OLLAMA_BASE_URL = 'http://localhost:11434/v1'
// ---------------------------------------------------------------------------
// Agents — the coordinator is created automatically by runTeam()
// ---------------------------------------------------------------------------
const researcher: AgentConfig = {
name: 'researcher',
model: OLLAMA_MODEL,
provider: 'openai',
baseURL: OLLAMA_BASE_URL,
apiKey: 'ollama',
systemPrompt: `You are a system researcher. Use bash to run non-destructive,
read-only commands and report the results concisely.`,
tools: ['bash'],
maxTurns: 4,
}
const writer: AgentConfig = {
name: 'writer',
model: OLLAMA_MODEL,
provider: 'openai',
baseURL: OLLAMA_BASE_URL,
apiKey: 'ollama',
systemPrompt: `You are a technical writer. Use file_write to create clear,
structured Markdown reports based on the information provided.`,
tools: ['file_write'],
maxTurns: 4,
}
// ---------------------------------------------------------------------------
// Progress handler
// ---------------------------------------------------------------------------
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23)
switch (event.type) {
case 'task_start': {
const task = event.data as Task | undefined
console.log(`[${ts}] TASK START "${task?.title ?? event.task}" → ${task?.assignee ?? '?'}`)
break
}
case 'task_complete':
console.log(`[${ts}] TASK DONE "${event.task}"`)
break
case 'agent_start':
console.log(`[${ts}] AGENT START ${event.agent}`)
break
case 'agent_complete':
console.log(`[${ts}] AGENT DONE ${event.agent}`)
break
case 'error':
console.error(`[${ts}] ERROR ${event.agent ?? ''} task=${event.task ?? '?'}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrator — defaultModel is used for the coordinator agent
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: OLLAMA_MODEL,
defaultProvider: 'openai',
defaultBaseURL: OLLAMA_BASE_URL,
defaultApiKey: 'ollama',
maxConcurrency: 1, // local model serves one request at a time
onProgress: handleProgress,
})
const team = orchestrator.createTeam('gemma4-auto', {
name: 'gemma4-auto',
agents: [researcher, writer],
sharedMemory: true,
})
// ---------------------------------------------------------------------------
// Give a goal — the framework handles the rest
// ---------------------------------------------------------------------------
const goal = `Check this machine's Node.js version, npm version, and OS info,
then write a short Markdown summary report to /tmp/gemma4-auto/report.md`
console.log('Gemma 4 Auto-Orchestration — Zero API Cost')
console.log('='.repeat(60))
console.log(` model → ${OLLAMA_MODEL} via Ollama (all agents + coordinator)`)
console.log(` researcher → bash`)
console.log(` writer → file_write`)
console.log(` coordinator → auto-created by runTeam()`)
console.log()
console.log(`Goal: ${goal.replace(/\n/g, ' ').trim()}`)
console.log('='.repeat(60))
const start = Date.now()
const result = await orchestrator.runTeam(team, goal)
const totalTime = Date.now() - start
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\n' + '='.repeat(60))
console.log('Pipeline complete.\n')
console.log(`Overall success: ${result.success}`)
console.log(`Total time: ${(totalTime / 1000).toFixed(1)}s`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [name, r] of result.agentResults) {
const icon = r.success ? 'OK ' : 'FAIL'
const tools = r.toolCalls.length > 0 ? r.toolCalls.map(c => c.toolName).join(', ') : '(none)'
console.log(` [${icon}] ${name.padEnd(24)} tools: ${tools}`)
}
// Print the coordinator's final synthesis
const coordResult = result.agentResults.get('coordinator')
if (coordResult?.success) {
console.log('\nFinal synthesis (from local Gemma 4 coordinator):')
console.log('-'.repeat(60))
console.log(coordResult.output)
console.log('-'.repeat(60))
}
console.log('\nAll processing done locally. $0 API cost.')


@ -1,89 +0,0 @@
# Examples
Runnable scripts demonstrating `open-multi-agent`. Organized by category — pick one that matches what you're trying to do.
All scripts run with `npx tsx examples/<category>/<name>.ts` and require the corresponding API key in your environment.
---
## basics — start here
The four core execution modes. Read these first.
| Example | What it shows |
|---------|---------------|
| [`basics/single-agent`](basics/single-agent.ts) | One agent with bash + file tools, then streaming via the `Agent` class. |
| [`basics/team-collaboration`](basics/team-collaboration.ts) | `runTeam()` coordinator pattern — goal in, results out. |
| [`basics/task-pipeline`](basics/task-pipeline.ts) | `runTasks()` with explicit task DAG and dependencies. |
| [`basics/multi-model-team`](basics/multi-model-team.ts) | Different models per agent in one team. |
## providers — model & adapter examples
One example per supported provider. All follow the same three-agent (architect / developer / reviewer) shape so they're easy to compare.
| Example | Provider | Env var |
|---------|----------|---------|
| [`providers/ollama`](providers/ollama.ts) | Ollama (local) + Claude | `ANTHROPIC_API_KEY` |
| [`providers/gemma4-local`](providers/gemma4-local.ts) | Gemma 4 via Ollama (100% local) | — |
| [`providers/copilot`](providers/copilot.ts) | GitHub Copilot (GPT-4o + Claude) | `GITHUB_TOKEN` |
| [`providers/azure-openai`](providers/azure-openai.ts) | Azure OpenAI | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` (+ optional `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`) |
| [`providers/grok`](providers/grok.ts) | xAI Grok | `XAI_API_KEY` |
| [`providers/gemini`](providers/gemini.ts) | Google Gemini | `GEMINI_API_KEY` |
| [`providers/minimax`](providers/minimax.ts) | MiniMax M2.7 | `MINIMAX_API_KEY` |
| [`providers/deepseek`](providers/deepseek.ts) | DeepSeek Chat | `DEEPSEEK_API_KEY` |
| [`providers/groq`](providers/groq.ts) | Groq (OpenAI-compatible) | `GROQ_API_KEY` |
## patterns — orchestration patterns
Reusable shapes for common multi-agent problems.
| Example | Pattern |
|---------|---------|
| [`patterns/fan-out-aggregate`](patterns/fan-out-aggregate.ts) | MapReduce-style fan-out via `AgentPool.runParallel()`. |
| [`patterns/structured-output`](patterns/structured-output.ts) | Zod-validated JSON output from an agent. |
| [`patterns/task-retry`](patterns/task-retry.ts) | Per-task retry with exponential backoff. |
| [`patterns/multi-perspective-code-review`](patterns/multi-perspective-code-review.ts) | Multiple reviewer agents in parallel, then synthesis. |
| [`patterns/research-aggregation`](patterns/research-aggregation.ts) | Multi-source research collated by a synthesis agent. |
| [`patterns/agent-handoff`](patterns/agent-handoff.ts) | Synchronous sub-agent delegation via `delegate_to_agent`. |
## cookbook — use-case recipes
End-to-end examples framed around a concrete problem (meeting summarization, translation QA, competitive monitoring, etc.) rather than a single orchestration primitive. Lighter bar than `production/`: no tests or pinned model versions required. Good entry point if you want to see how the patterns compose on a real task.
| Example | Problem solved |
|---------|----------------|
| [`cookbook/meeting-summarizer`](cookbook/meeting-summarizer.ts) | Fan-out post-processing of a transcript into summary, structured action items, and sentiment. |
## integrations — external systems
Hooking the framework up to outside-the-box tooling.
| Example | Integrates with |
|---------|-----------------|
| [`integrations/trace-observability`](integrations/trace-observability.ts) | `onTrace` spans for LLM calls, tools, and tasks. |
| [`integrations/mcp-github`](integrations/mcp-github.ts) | An MCP server's tools exposed to an agent via `connectMCPTools()`. |
| [`integrations/with-vercel-ai-sdk/`](integrations/with-vercel-ai-sdk/) | Next.js app — OMA `runTeam()` + AI SDK `useChat` streaming. |
## production — real-world use cases
End-to-end examples wired to real workflows. Higher bar than the categories above. See [`production/README.md`](production/README.md) for the acceptance criteria and how to contribute.
---
## Adding a new example
| You're adding… | Goes in… | Filename |
|----------------|----------|----------|
| A new model provider | `providers/` | `<provider-name>.ts` (lowercase, hyphenated) |
| A reusable orchestration pattern | `patterns/` | `<pattern-name>.ts` |
| A use-case-driven example (problem-first, uses one or more patterns) | `cookbook/` | `<use-case>.ts` |
| Integration with an outside system (MCP server, observability backend, framework, app) | `integrations/` | `<system>.ts` or `<system>/` for multi-file |
| A real-world end-to-end use case, production-grade | `production/` | `<use-case>/` directory with its own README |
Conventions:
- **No numeric prefixes.** Folders signal category; reading order is set by this README.
- **File header docstring** with one-line title, `Run:` block, and prerequisites.
- **Imports** should resolve as `from '../../src/index.js'` (one level deeper than the old flat layout).
- **Match the provider template** when adding a provider: three-agent team (architect / developer / reviewer) building a small REST API. Keeps comparisons honest.
- **Add a row** to the table in this file for the corresponding category.


@ -1,284 +0,0 @@
/**
* Meeting Summarizer (Parallel Post-Processing)
*
* Demonstrates:
* - Fan-out of three specialized agents on the same meeting transcript
* - Structured output (Zod schemas) for action items and sentiment
* - Parallel timing check: wall time vs sum of per-agent durations
* - Aggregator merges into a single Markdown report
*
* Run:
* npx tsx examples/cookbook/meeting-summarizer.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { readFileSync } from 'node:fs'
import { fileURLToPath } from 'node:url'
import path from 'node:path'
import { z } from 'zod'
import { Agent, AgentPool, ToolRegistry, ToolExecutor, registerBuiltInTools } from '../../src/index.js'
import type { AgentConfig, AgentRunResult } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Load the transcript fixture
// ---------------------------------------------------------------------------
const __dirname = path.dirname(fileURLToPath(import.meta.url))
const TRANSCRIPT = readFileSync(
path.join(__dirname, '../fixtures/meeting-transcript.txt'),
'utf-8',
)
// ---------------------------------------------------------------------------
// Zod schemas for structured agents
// ---------------------------------------------------------------------------
const ActionItemList = z.object({
items: z.array(
z.object({
task: z.string().describe('The action to be taken'),
owner: z.string().describe('Name of the person responsible'),
due_date: z.string().optional().describe('ISO date or human-readable due date if mentioned'),
}),
),
})
type ActionItemList = z.infer<typeof ActionItemList>
const SentimentReport = z.object({
participants: z.array(
z.object({
participant: z.string().describe('Name as it appears in the transcript'),
tone: z.enum(['positive', 'neutral', 'negative', 'mixed']),
evidence: z.string().describe('Direct quote or brief paraphrase supporting the tone'),
}),
),
})
type SentimentReport = z.infer<typeof SentimentReport>
// ---------------------------------------------------------------------------
// Agent configs
// ---------------------------------------------------------------------------
const summaryConfig: AgentConfig = {
name: 'summary',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a meeting note-taker. Given a transcript, produce a
three-paragraph summary:
1. What was discussed (the agenda).
2. Decisions made.
3. Notable context or risk the team should remember.
Plain prose. No bullet points. 200-300 words total.`,
maxTurns: 1,
temperature: 0.3,
}
const actionItemsConfig: AgentConfig = {
name: 'action-items',
model: 'claude-sonnet-4-6',
systemPrompt: `You extract action items from meeting transcripts. An action
item is a concrete task with a clear owner. Skip vague intentions ("we should
think about X"). Include due dates only when the speaker named one explicitly.
Return JSON matching the schema.`,
maxTurns: 1,
temperature: 0.1,
outputSchema: ActionItemList,
}
const sentimentConfig: AgentConfig = {
name: 'sentiment',
model: 'claude-sonnet-4-6',
systemPrompt: `You analyze the tone of each participant in a meeting. For
every named speaker, classify their overall tone as positive, neutral,
negative, or mixed, and include one short quote or paraphrase as evidence.
Return JSON matching the schema.`,
maxTurns: 1,
temperature: 0.2,
outputSchema: SentimentReport,
}
const aggregatorConfig: AgentConfig = {
name: 'aggregator',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a report writer. You receive three pre-computed
analyses of the same meeting: a summary, an action-item list, and a sentiment
report. Your job is to merge them into a single Markdown report.
Output structure: use exactly these four H2 headings, in order:
## Summary
## Action Items
## Sentiment
## Next Steps
Under "Action Items" render a Markdown table with columns: Task, Owner, Due.
Under "Sentiment" render one bullet per participant.
Under "Next Steps" synthesize 3-5 concrete follow-ups based on the other
sections. Do not invent action items that are not grounded in the other data.`,
maxTurns: 1,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Build agents
// ---------------------------------------------------------------------------
function buildAgent(config: AgentConfig): Agent {
const registry = new ToolRegistry()
registerBuiltInTools(registry)
const executor = new ToolExecutor(registry)
return new Agent(config, registry, executor)
}
const summary = buildAgent(summaryConfig)
const actionItems = buildAgent(actionItemsConfig)
const sentiment = buildAgent(sentimentConfig)
const aggregator = buildAgent(aggregatorConfig)
const pool = new AgentPool(3) // three specialists can run concurrently
pool.add(summary)
pool.add(actionItems)
pool.add(sentiment)
pool.add(aggregator)
console.log('Meeting Summarizer (Parallel Post-Processing)')
console.log('='.repeat(60))
console.log(`\nTranscript: ${TRANSCRIPT.split('\n')[0]}`)
console.log(`Length: ${TRANSCRIPT.split(/\s+/).length} words\n`)
// ---------------------------------------------------------------------------
// Step 1: Parallel fan-out with per-agent timing
// ---------------------------------------------------------------------------
console.log('[Step 1] Running 3 agents in parallel...\n')
const specialists = ['summary', 'action-items', 'sentiment'] as const
// Kick off all three concurrently and record each one's own wall duration.
// Summing the per-agent durations stands in for a separate serial pass at half
// the LLM cost; the gap between that sum and the parallel wall time is what
// fan-out saved.
const parallelStart = performance.now()
const timed = await Promise.all(
specialists.map(async (name) => {
const t = performance.now()
const result = await pool.run(name, TRANSCRIPT)
return { name, result, durationMs: performance.now() - t }
}),
)
const parallelElapsed = performance.now() - parallelStart
const byName = new Map<string, AgentRunResult>()
const serialSum = timed.reduce((acc, r) => {
byName.set(r.name, r.result)
return acc + r.durationMs
}, 0)
for (const { name, result, durationMs } of timed) {
const status = result.success ? 'OK' : 'FAILED'
console.log(
` ${name.padEnd(14)} [${status}] — ${Math.round(durationMs)}ms, ${result.tokenUsage.output_tokens} out tokens`,
)
}
console.log()
for (const { name, result } of timed) {
if (!result.success) {
console.error(`Specialist '${name}' failed: ${result.output}`)
process.exit(1)
}
}
const actionData = byName.get('action-items')!.structured as ActionItemList | undefined
const sentimentData = byName.get('sentiment')!.structured as SentimentReport | undefined
if (!actionData || !sentimentData) {
console.error('Structured output missing: action-items or sentiment failed schema validation')
process.exit(1)
}
// ---------------------------------------------------------------------------
// Step 2: Parallelism assertion
// ---------------------------------------------------------------------------
console.log('[Step 2] Parallelism check')
console.log(` Parallel wall time: ${Math.round(parallelElapsed)}ms`)
console.log(` Serial sum (per-agent): ${Math.round(serialSum)}ms`)
console.log(` Speedup: ${(serialSum / parallelElapsed).toFixed(2)}x\n`)
if (parallelElapsed >= serialSum * 0.7) {
console.error(
`ASSERTION FAILED: parallel wall time (${Math.round(parallelElapsed)}ms) is not ` +
`less than 70% of serial sum (${Math.round(serialSum)}ms). Expected substantial ` +
`speedup from fan-out.`,
)
process.exit(1)
}
// ---------------------------------------------------------------------------
// Step 3: Aggregate into Markdown report
// ---------------------------------------------------------------------------
console.log('[Step 3] Aggregating into Markdown report...\n')
const aggregatorPrompt = `Merge the three analyses below into a single Markdown report.
--- SUMMARY (prose) ---
${byName.get('summary')!.output}
--- ACTION ITEMS (JSON) ---
${JSON.stringify(actionData, null, 2)}
--- SENTIMENT (JSON) ---
${JSON.stringify(sentimentData, null, 2)}
Produce the Markdown report per the system instructions.`
const reportResult = await pool.run('aggregator', aggregatorPrompt)
if (!reportResult.success) {
console.error('Aggregator failed:', reportResult.output)
process.exit(1)
}
// ---------------------------------------------------------------------------
// Final output
// ---------------------------------------------------------------------------
console.log('='.repeat(60))
console.log('MEETING REPORT')
console.log('='.repeat(60))
console.log()
console.log(reportResult.output)
console.log()
console.log('-'.repeat(60))
// ---------------------------------------------------------------------------
// Token usage summary
// ---------------------------------------------------------------------------
console.log('\nToken Usage Summary:')
console.log('-'.repeat(60))
let totalInput = 0
let totalOutput = 0
for (const { name, result } of timed) {
totalInput += result.tokenUsage.input_tokens
totalOutput += result.tokenUsage.output_tokens
console.log(
` ${name.padEnd(14)} — input: ${result.tokenUsage.input_tokens}, output: ${result.tokenUsage.output_tokens}`,
)
}
totalInput += reportResult.tokenUsage.input_tokens
totalOutput += reportResult.tokenUsage.output_tokens
console.log(
` ${'aggregator'.padEnd(14)} — input: ${reportResult.tokenUsage.input_tokens}, output: ${reportResult.tokenUsage.output_tokens}`,
)
console.log('-'.repeat(60))
console.log(` ${'TOTAL'.padEnd(14)} — input: ${totalInput}, output: ${totalOutput}`)
console.log('\nDone.')
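
The Step 2 check above can be read as a small pure function. The sketch below is illustrative only: `checkParallelism` and `ParallelismReport` are hypothetical names, not part of open-multi-agent, and the 0.7 default simply mirrors the constant used in the script.

```typescript
// Hypothetical helper (not a library API): isolates the Step 2 parallelism check.
interface ParallelismReport {
  serialSumMs: number
  speedup: number
  ok: boolean
}

function checkParallelism(
  perAgentMs: readonly number[],
  wallMs: number,
  threshold = 0.7,
): ParallelismReport {
  // Sum of each agent's own duration approximates a serial run.
  const serialSumMs = perAgentMs.reduce((acc, ms) => acc + ms, 0)
  return {
    serialSumMs,
    speedup: wallMs > 0 ? serialSumMs / wallMs : Infinity,
    // Same rule as the script: wall time must be under 70% of the serial sum.
    ok: wallMs < serialSumMs * threshold,
  }
}

// Made-up timings for illustration.
const overlapped = checkParallelism([900, 1100, 1000], 1150)
const serialized = checkParallelism([900, 1100, 1000], 2950)
console.log(overlapped.ok, overlapped.speedup.toFixed(2))
console.log(serialized.ok)
```

With a single agent the sum equals the wall time, so the check fails by design; it is only meaningful for a fan-out of two or more agents.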

@ -1,328 +0,0 @@
/**
* Translation + Backtranslation Quality Check (Cross-Model)
*
* Demonstrates:
* - Agent A: translate EN -> target language with Claude
* - Agent B: back-translate -> EN with a different provider family
* - Agent C: compare original vs. backtranslation and flag semantic drift
* - Structured output with Zod schemas
*
* Run:
* npx tsx examples/cookbook/translation-backtranslation.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY must be set
* and at least one of OPENAI_API_KEY / GEMINI_API_KEY must be set
*/
import { z } from 'zod'
import {
Agent,
AgentPool,
ToolRegistry,
ToolExecutor,
registerBuiltInTools,
} from '../../src/index.js'
import type { AgentConfig } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Inline sample text (3-5 technical paragraphs, per issue requirement)
// ---------------------------------------------------------------------------
const SAMPLE_TEXT = `
Modern CI/CD pipelines rely on deterministic builds and reproducible environments.
A deployment may fail even when the application code is correct if the runtime,
dependency graph, or container image differs from what engineers tested locally.

Observability should combine logs, metrics, and traces rather than treating them
as separate debugging tools. Metrics show that something is wrong, logs provide
local detail, and traces explain how a request moved across services.

Schema validation is especially important in LLM systems. A response may sound
reasonable to a human reader but still break automation if the JSON structure,
field names, or enum values do not match the downstream contract.

Cross-model verification can reduce self-confirmation bias. When one model
produces a translation and a different provider family performs the
backtranslation, semantic drift becomes easier to detect.
`.trim()
// ---------------------------------------------------------------------------
// Zod schemas
// ---------------------------------------------------------------------------
const ParagraphInput = z.object({
index: z.number().int().positive(),
original: z.string(),
})
type ParagraphInput = z.infer<typeof ParagraphInput>
const TranslationBatch = z.object({
target_language: z.string(),
items: z.array(
z.object({
index: z.number().int().positive(),
translation: z.string(),
}),
),
})
type TranslationBatch = z.infer<typeof TranslationBatch>
const BacktranslationBatch = z.object({
items: z.array(
z.object({
index: z.number().int().positive(),
backtranslation: z.string(),
}),
),
})
type BacktranslationBatch = z.infer<typeof BacktranslationBatch>
const DriftRow = z.object({
original: z.string(),
translation: z.string(),
backtranslation: z.string(),
drift_severity: z.enum(['none', 'minor', 'major']),
notes: z.string(),
})
type DriftRow = z.infer<typeof DriftRow>
const DriftTable = z.array(DriftRow)
type DriftTable = z.infer<typeof DriftTable>
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function buildAgent(config: AgentConfig): Agent {
const registry = new ToolRegistry()
registerBuiltInTools(registry)
const executor = new ToolExecutor(registry)
return new Agent(config, registry, executor)
}
function splitParagraphs(text: string): ParagraphInput[] {
return text
.split(/\n\s*\n/)
.map((p, i) => ({
index: i + 1,
original: p.trim(),
}))
.filter((p) => p.original.length > 0)
}
// ---------------------------------------------------------------------------
// Provider selection
// ---------------------------------------------------------------------------
const hasAnthropic = Boolean(process.env.ANTHROPIC_API_KEY)
const hasOpenAI = Boolean(process.env.OPENAI_API_KEY)
const hasGemini = Boolean(process.env.GEMINI_API_KEY)
if (!hasAnthropic || (!hasGemini && !hasOpenAI)) {
console.log(
'[skip] This example needs ANTHROPIC_API_KEY plus GEMINI_API_KEY or OPENAI_API_KEY.',
)
process.exit(0)
}
// Prefer native Gemini when GEMINI_API_KEY is available.
// Fall back to OpenAI otherwise.
const backProvider: 'gemini' | 'openai' = hasGemini ? 'gemini' : 'openai'
const backModel =
backProvider === 'gemini'
? 'gemini-2.5-pro'
: (process.env.OPENAI_MODEL || 'gpt-5.4')
// ---------------------------------------------------------------------------
// Agent configs
// ---------------------------------------------------------------------------
// Agent A ---------------------------------------------------------------
// Use Claude to translate English -> target language
const translatorConfig: AgentConfig = {
name: 'translator',
provider: 'anthropic',
model: 'claude-sonnet-4-6',
systemPrompt: `You are Agent A, a technical translator.
Translate English paragraphs into Simplified Chinese.
Preserve meaning, terminology, paragraph boundaries, and index numbers.
Do not merge paragraphs.
Return JSON only, matching the schema exactly.`,
maxTurns: 1,
temperature: 0,
outputSchema: TranslationBatch,
}
// Agent B ---------------------------------------------------------------
// Use a different provider family to back-translate target language -> English
const backtranslatorConfig: AgentConfig = {
name: 'backtranslator',
provider: backProvider,
model: backModel,
baseURL: backProvider === 'openai' ? process.env.OPENAI_BASE_URL : undefined,
systemPrompt: `You are Agent B, a back-translation specialist.
Back-translate the provided Simplified Chinese paragraphs into English.
Preserve meaning as literally as possible.
Do not merge paragraphs.
Keep the same index numbers.
Return JSON only, matching the schema exactly.`,
maxTurns: 1,
temperature: 0,
outputSchema: BacktranslationBatch,
}
// Agent C ---------------------------------------------------------------
// Compare the original with the backtranslation and judge semantic drift
const reviewerConfig: AgentConfig = {
name: 'reviewer',
provider: 'anthropic',
model: 'claude-sonnet-4-6',
systemPrompt: `You are Agent C, a semantic drift reviewer.
You will receive:
- the original English paragraph
- the translated paragraph
- the backtranslated English paragraph
For each paragraph, judge drift_severity using only:
- none: meaning preserved
- minor: slight wording drift, but no important meaning change
- major: material semantic change, omission, contradiction, or mistranslation
Return JSON only.
The final output must be an array where each item contains:
original, translation, backtranslation, drift_severity, notes.`,
maxTurns: 1,
temperature: 0,
outputSchema: DriftTable,
}
// ---------------------------------------------------------------------------
// Build agents
// ---------------------------------------------------------------------------
const translator = buildAgent(translatorConfig)
const backtranslator = buildAgent(backtranslatorConfig)
const reviewer = buildAgent(reviewerConfig)
const pool = new AgentPool(1)
pool.add(translator)
pool.add(backtranslator)
pool.add(reviewer)
// ---------------------------------------------------------------------------
// Run pipeline
// ---------------------------------------------------------------------------
const paragraphs = splitParagraphs(SAMPLE_TEXT)
console.log('Translation + Backtranslation Quality Check')
console.log('='.repeat(60))
console.log(`Paragraphs: ${paragraphs.length}`)
console.log(`Translator provider: anthropic (claude-sonnet-4-6)`)
console.log(`Backtranslator provider: ${backProvider} (${backModel})`)
console.log()
// Step 1: Agent A translates
console.log('[1/3] Agent A translating EN -> zh-CN...\n')
const translationPrompt = `Target language: Simplified Chinese
Translate the following paragraphs.
Return exactly one translated item per paragraph.
Input:
${JSON.stringify(paragraphs, null, 2)}`
const translationResult = await pool.run('translator', translationPrompt)
if (!translationResult.success || !translationResult.structured) {
console.error('Agent A failed:', translationResult.output)
process.exit(1)
}
const translated = translationResult.structured as TranslationBatch
// Step 2: Agent B back-translates
console.log('[2/3] Agent B back-translating zh-CN -> EN...\n')
const backtranslationPrompt = `Back-translate the following paragraphs into English.
Keep the same indexes.
Input:
${JSON.stringify(translated.items, null, 2)}`
const backtranslationResult = await pool.run('backtranslator', backtranslationPrompt)
if (!backtranslationResult.success || !backtranslationResult.structured) {
console.error('Agent B failed:', backtranslationResult.output)
process.exit(1)
}
const backtranslated = backtranslationResult.structured as BacktranslationBatch
// Step 3: Agent C reviews semantic drift
console.log('[3/3] Agent C reviewing semantic drift...\n')
const mergedInput = paragraphs.map((p) => ({
index: p.index,
original: p.original,
translation: translated.items.find((x) => x.index === p.index)?.translation ?? '',
backtranslation:
backtranslated.items.find((x) => x.index === p.index)?.backtranslation ?? '',
}))
const reviewPrompt = `Compare the original English against the backtranslated English.
Important:
- Evaluate semantic drift paragraph by paragraph
- Do not judge style differences as major unless meaning changed
- Return only the final JSON array
Input:
${JSON.stringify(mergedInput, null, 2)}`
const reviewResult = await pool.run('reviewer', reviewPrompt)
if (!reviewResult.success || !reviewResult.structured) {
console.error('Agent C failed:', reviewResult.output)
process.exit(1)
}
const driftTable = reviewResult.structured as DriftTable
// ---------------------------------------------------------------------------
// Final output
// ---------------------------------------------------------------------------
console.log('='.repeat(60))
console.log('FINAL DRIFT TABLE')
console.log('='.repeat(60))
console.log(JSON.stringify(driftTable, null, 2))
console.log()
console.log('Token Usage Summary')
console.log('-'.repeat(60))
console.log(
`Agent A (translator) — input: ${translationResult.tokenUsage.input_tokens}, output: ${translationResult.tokenUsage.output_tokens}`,
)
console.log(
`Agent B (backtranslator) — input: ${backtranslationResult.tokenUsage.input_tokens}, output: ${backtranslationResult.tokenUsage.output_tokens}`,
)
console.log(
`Agent C (reviewer) — input: ${reviewResult.tokenUsage.input_tokens}, output: ${reviewResult.tokenUsage.output_tokens}`,
)
const totalInput =
translationResult.tokenUsage.input_tokens +
backtranslationResult.tokenUsage.input_tokens +
reviewResult.tokenUsage.input_tokens
const totalOutput =
translationResult.tokenUsage.output_tokens +
backtranslationResult.tokenUsage.output_tokens +
reviewResult.tokenUsage.output_tokens
console.log('-'.repeat(60))
console.log(`TOTAL — input: ${totalInput}, output: ${totalOutput}`)
console.log('\nDone.')
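
Step 3 pairs paragraphs by `index` and falls back to an empty string when a model omits one, so the reviewer would see an empty translation rather than an error. A stricter variant, sketched here on the assumption that failing fast is preferable, detects dropped indexes before Agent C is called. `missingIndexes` is a hypothetical helper; neither the example nor the library defines it.

```typescript
interface Indexed {
  index: number
}

// Hypothetical helper: indexes present in `expected` but absent from `actual`.
function missingIndexes(
  expected: readonly Indexed[],
  actual: readonly Indexed[],
): number[] {
  const seen = new Set(actual.map((item) => item.index))
  return expected.map((item) => item.index).filter((index) => !seen.has(index))
}

// Illustrative data only: the model returned items 3 and 1 and dropped 2.
const sourceParagraphs: Indexed[] = [{ index: 1 }, { index: 2 }, { index: 3 }]
const returnedItems: Indexed[] = [{ index: 3 }, { index: 1 }]

const dropped = missingIndexes(sourceParagraphs, returnedItems)
if (dropped.length > 0) {
  console.error(`Missing paragraph indexes: ${dropped.join(', ')}`)
}
```

In the pipeline above, such a check could run once after Agent A and once after Agent B, before `mergedInput` is built.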

@ -1,21 +0,0 @@
Weekly Engineering Standup — 2026-04-18
Attendees: Maya (Eng Manager), Raj (Senior Backend), Priya (Frontend Lead), Dan (SRE)
Maya: Quick round-table. Raj, where are we on the billing-v2 migration?
Raj: Cutover is scheduled for Tuesday the 28th. I want to get the shadow-write harness deployed by Friday so we have a full weekend of production traffic comparisons before the cutover. I'll own that. Concerned about the reconciliation query taking 40 seconds on the biggest accounts; I'll look into adding a covering index before cutover.
Maya: Good. Priya, the checkout redesign?
Priya: Ship-ready. I finished the accessibility audit yesterday, all high-priority items landed. Two medium items on the backlog I'll tackle next sprint. Planning to flip the feature flag for 5% of traffic on Thursday the 23rd and ramp from there. I've been heads-down on this for three weeks and honestly feeling pretty good about where it landed.
Maya: Great. Dan, Sunday's incident — what's the status on the retro?
Dan: Retro doc is up. Root cause was the failover script assuming a single-region topology after we moved to multi-region in Q1. The script hasn't been exercised in production since February. I'm frustrated that nobody caught it in review — the change was obvious if you read the diff, but it's twenty pages of YAML. I'm going to propose a rule that multi-region changes need a second reviewer on the SRE team. That's an action for me before the next postmortem, I'll have it drafted by Monday the 27th.
Maya: Reasonable. Anything else? Dan, how are you holding up? You've been on call a lot.
Dan: Honestly? Tired. The back-to-back incidents took the wind out of me. I'd like to hand off primary next rotation. I'll work with Raj on the handoff doc.
Maya: Noted. Let's make that happen. Priya, anything blocking you?
Priya: Nope, feeling good.
Raj: Just flagging — I saw the Slack thread about the authz refactor. If we're doing that this quarter, it conflicts with billing-v2 timelines. Can we park it until May?
Maya: Yes, I'll follow up with Len and reply in the thread. Thanks everyone.

@ -1,59 +0,0 @@
/**
* MCP GitHub Tools
*
* Connect an MCP server over stdio and register all exposed MCP tools as
* standard open-multi-agent tools.
*
* Run:
* npx tsx examples/integrations/mcp-github.ts
*
* Prerequisites:
* - GEMINI_API_KEY
* - GITHUB_TOKEN
* - @modelcontextprotocol/sdk installed
*/
import { Agent, ToolExecutor, ToolRegistry, registerBuiltInTools } from '../../src/index.js'
import { connectMCPTools } from '../../src/mcp.js'
if (!process.env.GITHUB_TOKEN?.trim()) {
console.error('Missing GITHUB_TOKEN: set a GitHub personal access token in the environment.')
process.exit(1)
}
const { tools, disconnect } = await connectMCPTools({
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-github'],
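env: {
...process.env,
GITHUB_TOKEN: process.env.GITHUB_TOKEN,
// The reference GitHub MCP server documents GITHUB_PERSONAL_ACCESS_TOKEN as its
// token variable, so pass the same token under that name as well.
GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN,
},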
namePrefix: 'github',
})
const registry = new ToolRegistry()
registerBuiltInTools(registry)
for (const tool of tools) registry.register(tool)
const executor = new ToolExecutor(registry)
const agent = new Agent(
{
name: 'github-agent',
model: 'gemini-2.5-flash',
provider: 'gemini',
tools: tools.map((tool) => tool.name),
systemPrompt: 'Use GitHub MCP tools to answer repository questions.',
},
registry,
executor,
)
try {
const result = await agent.run(
'List the last 3 open issues in JackChen-me/open-multi-agent with title and number.',
)
console.log(result.output)
} finally {
await disconnect()
}

@ -1,133 +0,0 @@
/**
* Trace Observability
*
* Demonstrates the `onTrace` callback for lightweight observability. Every LLM
* call, tool execution, task lifecycle, and agent run emits a structured trace
* event with timing data and token usage, giving you full visibility into
* what's happening inside a multi-agent run.
*
* Trace events share a `runId` for correlation, so you can reconstruct the
* full execution timeline. Pipe them into your own logging, OpenTelemetry, or
* dashboard.
*
* Run:
* npx tsx examples/integrations/trace-observability.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, TraceEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agents
// ---------------------------------------------------------------------------
const researcher: AgentConfig = {
name: 'researcher',
model: 'claude-sonnet-4-6',
systemPrompt: 'You are a research assistant. Provide concise, factual answers.',
maxTurns: 2,
}
const writer: AgentConfig = {
name: 'writer',
model: 'claude-sonnet-4-6',
systemPrompt: 'You are a technical writer. Summarize research into clear prose.',
maxTurns: 2,
}
// ---------------------------------------------------------------------------
// Trace handler — log every span with timing
// ---------------------------------------------------------------------------
function handleTrace(event: TraceEvent): void {
const dur = `${event.durationMs}ms`.padStart(7)
switch (event.type) {
case 'llm_call':
console.log(
` [LLM] ${dur} agent=${event.agent} model=${event.model} turn=${event.turn}` +
` tokens=${event.tokens.input_tokens}in/${event.tokens.output_tokens}out`,
)
break
case 'tool_call':
console.log(
` [TOOL] ${dur} agent=${event.agent} tool=${event.tool}` +
` error=${event.isError}`,
)
break
case 'task':
console.log(
` [TASK] ${dur} task="${event.taskTitle}" agent=${event.agent}` +
` success=${event.success} retries=${event.retries}`,
)
break
case 'agent':
console.log(
` [AGENT] ${dur} agent=${event.agent} turns=${event.turns}` +
` tools=${event.toolCalls} tokens=${event.tokens.input_tokens}in/${event.tokens.output_tokens}out`,
)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrator + team
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
onTrace: handleTrace,
})
const team = orchestrator.createTeam('trace-demo', {
name: 'trace-demo',
agents: [researcher, writer],
sharedMemory: true,
})
// ---------------------------------------------------------------------------
// Tasks — researcher first, then writer summarizes
// ---------------------------------------------------------------------------
const tasks = [
{
title: 'Research topic',
description: 'List 5 key benefits of TypeScript for large codebases. Be concise.',
assignee: 'researcher',
},
{
title: 'Write summary',
description: 'Read the research from shared memory and write a 3-sentence summary.',
assignee: 'writer',
dependsOn: ['Research topic'],
},
]
// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------
console.log('Trace Observability Example')
console.log('='.repeat(60))
console.log('Pipeline: research → write (with full trace output)')
console.log('='.repeat(60))
console.log()
const result = await orchestrator.runTasks(team, tasks)
// ---------------------------------------------------------------------------
// Summary
// ---------------------------------------------------------------------------
console.log('\n' + '='.repeat(60))
console.log(`Overall success: ${result.success}`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
for (const [name, r] of result.agentResults) {
const icon = r.success ? 'OK ' : 'FAIL'
console.log(` [${icon}] ${name}`)
console.log(` ${r.output.slice(0, 200)}`)
}
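
Because every event carries timing and token counts, a handler can accumulate totals instead of only printing them. The sketch below shows one way to do that under an assumed, reduced event shape: `LlmCallEvent` is a local stand-in limited to fields the handler above reads, and `totalsByAgent` is a hypothetical helper. Neither is the library's own type or API.

```typescript
// Local stand-in for an `llm_call` trace event (assumed shape, not the library type).
interface LlmCallEvent {
  type: 'llm_call'
  runId: string
  agent: string
  durationMs: number
  tokens: { input_tokens: number; output_tokens: number }
}

interface AgentTotals {
  calls: number
  durationMs: number
  inputTokens: number
  outputTokens: number
}

// Hypothetical helper: fold events into per-agent totals.
function totalsByAgent(events: readonly LlmCallEvent[]): Map<string, AgentTotals> {
  const totals = new Map<string, AgentTotals>()
  for (const event of events) {
    const current = totals.get(event.agent) ?? {
      calls: 0,
      durationMs: 0,
      inputTokens: 0,
      outputTokens: 0,
    }
    totals.set(event.agent, {
      calls: current.calls + 1,
      durationMs: current.durationMs + event.durationMs,
      inputTokens: current.inputTokens + event.tokens.input_tokens,
      outputTokens: current.outputTokens + event.tokens.output_tokens,
    })
  }
  return totals
}

// Made-up events for illustration; real values would come from the onTrace callback.
const sampleEvents: LlmCallEvent[] = [
  { type: 'llm_call', runId: 'r1', agent: 'researcher', durationMs: 800, tokens: { input_tokens: 120, output_tokens: 300 } },
  { type: 'llm_call', runId: 'r1', agent: 'researcher', durationMs: 400, tokens: { input_tokens: 450, output_tokens: 90 } },
  { type: 'llm_call', runId: 'r1', agent: 'writer', durationMs: 600, tokens: { input_tokens: 500, output_tokens: 150 } },
]

const perAgent = totalsByAgent(sampleEvents)
perAgent.forEach((t, agent) => {
  console.log(`${agent}: ${t.calls} calls, ${t.durationMs}ms, ${t.inputTokens}in/${t.outputTokens}out`)
})
```

A real handler would narrow on `event.type === 'llm_call'` first, as the `switch` in `handleTrace` does, and push matching events into an array for later aggregation.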

@ -1,5 +0,0 @@
node_modules/
.next/
.env
.env.local
*.tsbuildinfo

@ -1,59 +0,0 @@
# with-vercel-ai-sdk
A Next.js demo showing **open-multi-agent** (OMA) and **Vercel AI SDK** working together:
- **OMA** orchestrates a research team (researcher agent + writer agent) via `runTeam()`
- **AI SDK** streams the result to a chat UI via `useChat` + `streamText`
## How it works
```
User message
  ↓
API route (app/api/chat/route.ts)
  ├─ Phase 1: OMA runTeam()
  │    coordinator decomposes goal → researcher gathers info → writer drafts article
  └─ Phase 2: AI SDK streamText()
       streams the team's output to the browser
  ↓
Chat UI (app/page.tsx): useChat hook renders streamed response
```
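
The two phases run strictly in sequence: nothing is streamed until the team has finished. A minimal sketch of that control flow follows, using stand-ins rather than the real APIs. `runTeamStub` and `streamChunks` are hypothetical placeholders for OMA's `runTeam()` and AI SDK's `streamText()`, and the chunk size is arbitrary.

```typescript
// Placeholder for Phase 1; the real route calls OMA's runTeam().
async function runTeamStub(goal: string): Promise<string> {
  return `Article about ${goal}`
}

// Placeholder for Phase 2; the real route calls AI SDK's streamText().
function* streamChunks(text: string, size: number): Generator<string> {
  for (let i = 0; i < text.length; i += size) {
    yield text.slice(i, i + size)
  }
}

async function handleMessage(goal: string): Promise<string[]> {
  // Phase 1 must resolve before any chunk is produced.
  const teamOutput = await runTeamStub(goal)
  // Phase 2 only starts once the full team output is available.
  return Array.from(streamChunks(teamOutput, 8))
}

handleMessage('TypeScript').then((chunks) => console.log(chunks.join('|')))
```

This ordering is why the chat UI shows a waiting message while the request is in the `submitted` state: the first streamed chunk arrives only after the whole team run returns.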
## Setup
```bash
# 1. Go to the repo root and install OMA dependencies
cd ../../..
npm install
# 2. Back to this example
cd examples/integrations/with-vercel-ai-sdk
npm install
# 3. Set your API key
export ANTHROPIC_API_KEY=sk-ant-...
# 4. Run
npm run dev
```
`npm run dev` automatically builds OMA before starting Next.js (via the `predev` script).
Open [http://localhost:3000](http://localhost:3000), type a topic, and watch the research team work.
## Prerequisites
- Node.js >= 18
- `ANTHROPIC_API_KEY` environment variable (used by both OMA and AI SDK)
## Key files
| File | Role |
|------|------|
| `app/api/chat/route.ts` | Backend — OMA orchestration + AI SDK streaming |
| `app/page.tsx` | Frontend — chat UI with `useChat` hook |
| `package.json` | References OMA via `file:../../` (local link) |

@ -1,91 +0,0 @@
import { streamText, convertToModelMessages, type UIMessage } from 'ai'
import { createOpenAICompatible } from '@ai-sdk/openai-compatible'
import { OpenMultiAgent } from '@jackchen_me/open-multi-agent'
import type { AgentConfig } from '@jackchen_me/open-multi-agent'
export const maxDuration = 120
// --- DeepSeek via OpenAI-compatible API ---
const DEEPSEEK_BASE_URL = 'https://api.deepseek.com'
const DEEPSEEK_MODEL = 'deepseek-chat'
const deepseek = createOpenAICompatible({
name: 'deepseek',
baseURL: `${DEEPSEEK_BASE_URL}/v1`,
apiKey: process.env.DEEPSEEK_API_KEY,
})
const researcher: AgentConfig = {
name: 'researcher',
model: DEEPSEEK_MODEL,
provider: 'openai',
baseURL: DEEPSEEK_BASE_URL,
apiKey: process.env.DEEPSEEK_API_KEY,
systemPrompt: `You are a research specialist. Given a topic, provide thorough, factual research
with key findings, relevant data points, and important context.
Be concise but comprehensive. Output structured notes, not prose.`,
maxTurns: 3,
temperature: 0.2,
}
const writer: AgentConfig = {
name: 'writer',
model: DEEPSEEK_MODEL,
provider: 'openai',
baseURL: DEEPSEEK_BASE_URL,
apiKey: process.env.DEEPSEEK_API_KEY,
systemPrompt: `You are an expert writer. Using research from team members (available in shared memory),
write a well-structured, engaging article with clear headings and concise paragraphs.
Do not repeat raw research; synthesize it into readable prose.`,
maxTurns: 3,
temperature: 0.4,
}
function extractText(message: UIMessage): string {
return message.parts
.filter((p): p is { type: 'text'; text: string } => p.type === 'text')
.map((p) => p.text)
.join('')
}
export async function POST(req: Request) {
const { messages }: { messages: UIMessage[] } = await req.json()
const lastText = extractText(messages.at(-1)!)
// --- Phase 1: OMA multi-agent orchestration ---
const orchestrator = new OpenMultiAgent({
defaultModel: DEEPSEEK_MODEL,
defaultProvider: 'openai',
defaultBaseURL: DEEPSEEK_BASE_URL,
defaultApiKey: process.env.DEEPSEEK_API_KEY,
})
const team = orchestrator.createTeam('research-writing', {
name: 'research-writing',
agents: [researcher, writer],
sharedMemory: true,
})
const teamResult = await orchestrator.runTeam(
team,
`Research and write an article about: ${lastText}`,
)
const teamOutput = teamResult.agentResults.get('coordinator')?.output ?? ''
// --- Phase 2: Stream result via Vercel AI SDK ---
const result = streamText({
model: deepseek(DEEPSEEK_MODEL),
system: `You are presenting research from a multi-agent team (researcher + writer).
The team has already done the work. Your only job is to relay their output to the user
in a well-formatted way. Keep the content faithful to the team output below.
At the very end, add a one-line note that this was produced by a researcher agent
and a writer agent collaborating via open-multi-agent.
## Team Output
${teamOutput}`,
messages: await convertToModelMessages(messages),
})
return result.toUIMessageStreamResponse()
}

@ -1,14 +0,0 @@
import type { Metadata } from 'next'
export const metadata: Metadata = {
title: 'OMA + Vercel AI SDK',
description: 'Multi-agent research team powered by open-multi-agent, streamed via Vercel AI SDK',
}
export default function RootLayout({ children }: { children: React.ReactNode }) {
return (
<html lang="en">
<body style={{ margin: 0, background: '#fafafa' }}>{children}</body>
</html>
)
}

@ -1,97 +0,0 @@
'use client'
import { useState } from 'react'
import { useChat } from '@ai-sdk/react'
export default function Home() {
const { messages, sendMessage, status, error } = useChat()
const [input, setInput] = useState('')
const isLoading = status === 'submitted' || status === 'streaming'
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault()
if (!input.trim() || isLoading) return
const text = input
setInput('')
await sendMessage({ text })
}
return (
<main
style={{
maxWidth: 720,
margin: '0 auto',
padding: '32px 16px',
fontFamily: 'system-ui, -apple-system, sans-serif',
}}
>
<h1 style={{ fontSize: 22, marginBottom: 4 }}>Research Team</h1>
<p style={{ color: '#666', fontSize: 14, marginBottom: 28 }}>
Enter a topic. A <strong>researcher</strong> agent gathers information, a{' '}
<strong>writer</strong> agent composes an article &mdash; orchestrated by
open-multi-agent, streamed via Vercel AI SDK.
</p>
<div style={{ minHeight: 120 }}>
{messages.map((m) => (
<div key={m.id} style={{ marginBottom: 24, lineHeight: 1.7 }}>
<div style={{ fontWeight: 600, fontSize: 13, color: '#999', marginBottom: 4 }}>
{m.role === 'user' ? 'You' : 'Research Team'}
</div>
<div style={{ whiteSpace: 'pre-wrap', fontSize: 15 }}>
{m.parts
.filter((part): part is { type: 'text'; text: string } => part.type === 'text')
.map((part) => part.text)
.join('')}
</div>
</div>
))}
{isLoading && status === 'submitted' && (
<div style={{ color: '#888', fontSize: 14, padding: '8px 0' }}>
Agents are collaborating &mdash; this may take a minute...
</div>
)}
{error && (
<div style={{ color: '#c00', fontSize: 14, padding: '8px 0' }}>
Error: {error.message}
</div>
)}
</div>
<form onSubmit={handleSubmit} style={{ display: 'flex', gap: 8, marginTop: 32 }}>
<input
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Enter a topic to research..."
disabled={isLoading}
style={{
flex: 1,
padding: '10px 14px',
borderRadius: 8,
border: '1px solid #ddd',
fontSize: 15,
outline: 'none',
}}
/>
<button
type="submit"
disabled={isLoading || !input.trim()}
style={{
padding: '10px 20px',
borderRadius: 8,
border: 'none',
background: isLoading ? '#ccc' : '#111',
color: '#fff',
cursor: isLoading ? 'not-allowed' : 'pointer',
fontSize: 15,
}}
>
Send
</button>
</form>
</main>
)
}

@ -1,6 +0,0 @@
/// <reference types="next" />
/// <reference types="next/image-types/global" />
import "./.next/dev/types/routes.d.ts";
// NOTE: This file should not be edited
// see https://nextjs.org/docs/app/api-reference/config/typescript for more information.

@ -1,7 +0,0 @@
import type { NextConfig } from 'next'
const nextConfig: NextConfig = {
serverExternalPackages: ['@jackchen_me/open-multi-agent'],
}
export default nextConfig

File diff suppressed because it is too large

@ -1,25 +0,0 @@
{
"name": "with-vercel-ai-sdk",
"private": true,
"scripts": {
"predev": "cd ../.. && npm run build",
"dev": "next dev",
"build": "next build",
"start": "next start"
},
"dependencies": {
"@ai-sdk/openai-compatible": "^2.0.41",
"@ai-sdk/react": "^3.0.0",
"@jackchen_me/open-multi-agent": "file:../../",
"ai": "^6.0.0",
"next": "^16.0.0",
"react": "^19.0.0",
"react-dom": "^19.0.0"
},
"devDependencies": {
"@types/node": "^22.0.0",
"@types/react": "^19.0.0",
"@types/react-dom": "^19.0.0",
"typescript": "^5.6.0"
}
}

@ -1,41 +0,0 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": [
"dom",
"dom.iterable",
"ES2022"
],
"allowJs": true,
"skipLibCheck": true,
"strict": true,
"noEmit": true,
"esModuleInterop": true,
"module": "ESNext",
"moduleResolution": "bundler",
"resolveJsonModule": true,
"isolatedModules": true,
"jsx": "react-jsx",
"incremental": true,
"plugins": [
{
"name": "next"
}
],
"paths": {
"@/*": [
"./*"
]
}
},
"include": [
"next-env.d.ts",
"**/*.ts",
"**/*.tsx",
".next/types/**/*.ts",
".next/dev/types/**/*.ts"
],
"exclude": [
"node_modules"
]
}

@ -1,64 +0,0 @@
/**
* Synchronous agent handoff via `delegate_to_agent`
*
* During `runTeam` / `runTasks`, pool agents register the built-in
* `delegate_to_agent` tool so one specialist can run a sub-prompt on another
* roster agent and read the answer in the same conversation turn.
*
* Whitelist `delegate_to_agent` in `tools` when you want the model to see it;
* standalone `runAgent()` does not register this tool by default.
*
* Run:
* npx tsx examples/patterns/agent-handoff.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig } from '../../src/types.js'
const researcher: AgentConfig = {
name: 'researcher',
model: 'claude-sonnet-4-6',
provider: 'anthropic',
systemPrompt:
'You answer factual questions briefly. When the user asks for a second opinion ' +
'from the analyst, use delegate_to_agent to ask the analyst agent, then summarize both views.',
tools: ['delegate_to_agent'],
maxTurns: 6,
}
const analyst: AgentConfig = {
name: 'analyst',
model: 'claude-sonnet-4-6',
provider: 'anthropic',
systemPrompt: 'You give short, skeptical analysis of claims. Push back when evidence is weak.',
tools: [],
maxTurns: 4,
}
async function main(): Promise<void> {
const orchestrator = new OpenMultiAgent({ maxConcurrency: 2 })
const team = orchestrator.createTeam('handoff-demo', {
name: 'handoff-demo',
agents: [researcher, analyst],
sharedMemory: true,
})
const goal =
'In one paragraph: state a simple fact about photosynthesis. ' +
'Then ask the analyst (via delegate_to_agent) for a one-sentence critique of overstated claims in popular science. ' +
'Merge both into a final short answer.'
const result = await orchestrator.runTeam(team, goal)
console.log('Success:', result.success)
for (const [name, ar] of result.agentResults) {
console.log(`\n--- ${name} ---\n${ar.output.slice(0, 2000)}`)
}
}
main().catch((err) => {
console.error(err)
process.exit(1)
})

@ -1,188 +0,0 @@
/**
* Multi-Perspective Code Review
*
* Demonstrates:
* - Dependency chain: generator produces code, three reviewers depend on it
* - Parallel execution: security, performance, and style reviewers run concurrently
* - Shared memory: each agent's output is automatically stored and injected
* into downstream agents' prompts by the framework
*
* Flow:
* generator → [security-reviewer, performance-reviewer, style-reviewer] (parallel) → synthesizer
*
* Run:
* npx tsx examples/patterns/multi-perspective-code-review.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// API spec to implement
// ---------------------------------------------------------------------------
const API_SPEC = `POST /users endpoint that:
- Accepts JSON body with name (string, required), email (string, required), age (number, optional)
- Validates all fields
- Inserts into a PostgreSQL database
- Returns 201 with the created user or 400/500 on error`
// ---------------------------------------------------------------------------
// Agents
// ---------------------------------------------------------------------------
const generator: AgentConfig = {
name: 'generator',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a Node.js backend developer. Given an API spec, write a complete
Express route handler. Include imports, validation, database query, and error handling.
Output only the code, no explanation. Keep it under 80 lines.`,
maxTurns: 2,
}
const securityReviewer: AgentConfig = {
name: 'security-reviewer',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a security reviewer. Review the code provided in context and check
for OWASP top 10 vulnerabilities: SQL injection, XSS, broken authentication,
sensitive data exposure, etc. Write your findings as a markdown checklist.
Keep it to 150-200 words.`,
maxTurns: 2,
}
const performanceReviewer: AgentConfig = {
name: 'performance-reviewer',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a performance reviewer. Review the code provided in context and check
for N+1 queries, memory leaks, blocking calls, missing connection pooling, and
inefficient patterns. Write your findings as a markdown checklist.
Keep it to 150-200 words.`,
maxTurns: 2,
}
const styleReviewer: AgentConfig = {
name: 'style-reviewer',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a code style reviewer. Review the code provided in context and check
naming conventions, function structure, readability, error message clarity, and
consistency. Write your findings as a markdown checklist.
Keep it to 150-200 words.`,
maxTurns: 2,
}
const synthesizer: AgentConfig = {
name: 'synthesizer',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a lead engineer synthesizing code review feedback. Review all
the feedback and original code provided in context. Produce a unified report with:
1. Critical issues (must fix before merge)
2. Recommended improvements (should fix)
3. Minor suggestions (nice to have)
Deduplicate overlapping feedback. Keep the report to 200-300 words.`,
maxTurns: 2,
}
// ---------------------------------------------------------------------------
// Orchestrator + team
// ---------------------------------------------------------------------------
function handleProgress(event: OrchestratorEvent): void {
if (event.type === 'task_start') {
console.log(` [START] ${event.task ?? '?'} → ${event.agent ?? '?'}`)
}
if (event.type === 'task_complete') {
const success = (event.data as { success?: boolean })?.success ?? true
console.log(` [DONE] ${event.task ?? '?'} (${success ? 'OK' : 'FAIL'})`)
}
}
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
onProgress: handleProgress,
})
const team = orchestrator.createTeam('code-review-team', {
name: 'code-review-team',
agents: [generator, securityReviewer, performanceReviewer, styleReviewer, synthesizer],
sharedMemory: true,
})
// ---------------------------------------------------------------------------
// Tasks
// ---------------------------------------------------------------------------
const tasks = [
{
title: 'Generate code',
description: `Write a Node.js Express route handler for this API spec:\n\n${API_SPEC}`,
assignee: 'generator',
},
{
title: 'Security review',
description: 'Review the generated code for security vulnerabilities.',
assignee: 'security-reviewer',
dependsOn: ['Generate code'],
},
{
title: 'Performance review',
description: 'Review the generated code for performance issues.',
assignee: 'performance-reviewer',
dependsOn: ['Generate code'],
},
{
title: 'Style review',
description: 'Review the generated code for style and readability.',
assignee: 'style-reviewer',
dependsOn: ['Generate code'],
},
{
title: 'Synthesize feedback',
description: 'Synthesize all review feedback and the original code into a unified, prioritized action item report.',
assignee: 'synthesizer',
dependsOn: ['Security review', 'Performance review', 'Style review'],
},
]
// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------
console.log('Multi-Perspective Code Review')
console.log('='.repeat(60))
console.log(`Spec: ${API_SPEC.split('\n')[0]}`)
console.log('Pipeline: generator → 3 reviewers (parallel) → synthesizer')
console.log('='.repeat(60))
console.log()
const result = await orchestrator.runTasks(team, tasks)
// ---------------------------------------------------------------------------
// Output
// ---------------------------------------------------------------------------
console.log('\n' + '='.repeat(60))
console.log(`Overall success: ${result.success}`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log()
for (const [name, r] of result.agentResults) {
const icon = r.success ? 'OK ' : 'FAIL'
const tokens = `in:${r.tokenUsage.input_tokens} out:${r.tokenUsage.output_tokens}`
console.log(` [${icon}] ${name.padEnd(22)} ${tokens}`)
}
const synthResult = result.agentResults.get('synthesizer')
if (synthResult?.success) {
console.log('\n' + '='.repeat(60))
console.log('UNIFIED REVIEW REPORT')
console.log('='.repeat(60))
console.log()
console.log(synthResult.output)
}
console.log('\nDone.')
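
The `dependsOn` declarations above imply an execution order: one generator, then three reviewers that can run side by side, then the synthesizer. The sketch below shows one way such an order could be derived from task titles alone. It is an illustration under stated assumptions, not the framework's scheduler: `TaskSpec` and `planWaves` are hypothetical names, and only `title` and `dependsOn` mirror fields used in the example.

```typescript
// Hypothetical task shape: only `title` and `dependsOn` mirror the fields
// used in the task list above; nothing here is the framework's scheduler.
interface TaskSpec {
  title: string
  dependsOn?: string[]
}

// Groups tasks into waves. Every task in a wave has all of its dependencies
// completed by earlier waves, so the members of one wave could run in parallel.
function planWaves(tasks: TaskSpec[]): string[][] {
  // Reject references to titles that do not exist.
  const known = new Set(tasks.map((t) => t.title))
  for (const t of tasks) {
    for (const dep of t.dependsOn ?? []) {
      if (!known.has(dep)) {
        throw new Error(`Task "${t.title}" depends on unknown task "${dep}"`)
      }
    }
  }

  const done = new Set<string>()
  const waves: string[][] = []
  let remaining = [...tasks]

  while (remaining.length > 0) {
    // A task is ready once every dependency has finished in an earlier wave.
    const ready = remaining.filter((t) => (t.dependsOn ?? []).every((d) => done.has(d)))
    if (ready.length === 0) throw new Error('Dependency cycle detected')
    waves.push(ready.map((t) => t.title))
    for (const t of ready) done.add(t.title)
    remaining = remaining.filter((t) => !done.has(t.title))
  }
  return waves
}

const reviewWaves = planWaves([
  { title: 'Generate code' },
  { title: 'Security review', dependsOn: ['Generate code'] },
  { title: 'Performance review', dependsOn: ['Generate code'] },
  { title: 'Style review', dependsOn: ['Generate code'] },
  { title: 'Synthesize feedback', dependsOn: ['Security review', 'Performance review', 'Style review'] },
])
console.log(reviewWaves)
```

For the five tasks of this example the function yields three waves: the generator alone, the three reviews together, and the synthesis last.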

@ -1,169 +0,0 @@
/**
* Multi-Source Research Aggregation
*
* Demonstrates runTasks() with explicit dependency chains:
* - Parallel execution: three analyst agents research the same topic independently
* - Dependency chain via dependsOn: synthesizer waits for all analysts to finish
* - Automatic shared memory: agent output flows to downstream agents via the framework
*
* Compare with example 07 (fan-out-aggregate) which uses AgentPool.runParallel()
* for the same 3-analysts + synthesizer pattern. This example shows the runTasks()
* API with explicit dependsOn declarations instead.
*
* Flow:
* [technical-analyst, market-analyst, community-analyst] (parallel) → synthesizer
*
* Run:
* npx tsx examples/patterns/research-aggregation.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Topic
// ---------------------------------------------------------------------------
const TOPIC = 'WebAssembly adoption in 2026'
// ---------------------------------------------------------------------------
// Agents — three analysts + one synthesizer
// ---------------------------------------------------------------------------
const technicalAnalyst: AgentConfig = {
name: 'technical-analyst',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a technical analyst. Given a topic, research its technical
capabilities, limitations, performance characteristics, and architectural patterns.
Write your findings as structured markdown. Keep it to 200-300 words.`,
maxTurns: 2,
}
const marketAnalyst: AgentConfig = {
name: 'market-analyst',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a market analyst. Given a topic, research industry adoption
rates, key companies using the technology, market size estimates, and competitive
landscape. Write your findings as structured markdown. Keep it to 200-300 words.`,
maxTurns: 2,
}
const communityAnalyst: AgentConfig = {
name: 'community-analyst',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a developer community analyst. Given a topic, research
developer sentiment, ecosystem maturity, learning resources, community size,
and conference/meetup activity. Write your findings as structured markdown.
Keep it to 200-300 words.`,
maxTurns: 2,
}
const synthesizer: AgentConfig = {
name: 'synthesizer',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a research director who synthesizes multiple analyst reports
into a single cohesive document. You will receive all prior analyst outputs
automatically. Then:
1. Cross-reference claims across reports - flag agreements and contradictions
2. Identify the 3 most important insights
3. Produce a structured report with: Executive Summary, Key Findings,
Areas of Agreement, Open Questions, and Recommendation
Keep the final report to 300-400 words.`,
maxTurns: 2,
}
// ---------------------------------------------------------------------------
// Orchestrator + team
// ---------------------------------------------------------------------------
function handleProgress(event: OrchestratorEvent): void {
if (event.type === 'task_start') {
console.log(` [START] ${event.task ?? ''} → ${event.agent ?? ''}`)
}
if (event.type === 'task_complete') {
console.log(` [DONE] ${event.task ?? ''}`)
}
}
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
onProgress: handleProgress,
})
const team = orchestrator.createTeam('research-team', {
name: 'research-team',
agents: [technicalAnalyst, marketAnalyst, communityAnalyst, synthesizer],
sharedMemory: true,
})
// ---------------------------------------------------------------------------
// Tasks — three analysts run in parallel, synthesizer depends on all three
// ---------------------------------------------------------------------------
const tasks = [
{
title: 'Technical analysis',
description: `Research the technical aspects of ${TOPIC}. Focus on capabilities, limitations, performance, and architecture.`,
assignee: 'technical-analyst',
},
{
title: 'Market analysis',
description: `Research the market landscape for ${TOPIC}. Focus on adoption rates, key players, market size, and competition.`,
assignee: 'market-analyst',
},
{
title: 'Community analysis',
description: `Research the developer community around ${TOPIC}. Focus on sentiment, ecosystem maturity, learning resources, and community activity.`,
assignee: 'community-analyst',
},
{
title: 'Synthesize report',
description: `Cross-reference all analyst findings, identify key insights, flag contradictions, and produce a unified research report.`,
assignee: 'synthesizer',
dependsOn: ['Technical analysis', 'Market analysis', 'Community analysis'],
},
]
// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------
console.log('Multi-Source Research Aggregation')
console.log('='.repeat(60))
console.log(`Topic: ${TOPIC}`)
console.log('Pipeline: 3 analysts (parallel) → synthesizer')
console.log('='.repeat(60))
console.log()
const result = await orchestrator.runTasks(team, tasks)
// ---------------------------------------------------------------------------
// Output
// ---------------------------------------------------------------------------
console.log('\n' + '='.repeat(60))
console.log(`Overall success: ${result.success}`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log()
for (const [name, r] of result.agentResults) {
const icon = r.success ? 'OK ' : 'FAIL'
const tokens = `in:${r.tokenUsage.input_tokens} out:${r.tokenUsage.output_tokens}`
console.log(` [${icon}] ${name.padEnd(20)} ${tokens}`)
}
const synthResult = result.agentResults.get('synthesizer')
if (synthResult?.success) {
console.log('\n' + '='.repeat(60))
console.log('SYNTHESIZED REPORT')
console.log('='.repeat(60))
console.log()
console.log(synthResult.output)
}
console.log('\nDone.')

@ -1,73 +0,0 @@
/**
* Structured Output
*
* Demonstrates `outputSchema` on AgentConfig. The agent's response is
* automatically parsed as JSON and validated against a Zod schema.
* On validation failure, the framework retries once with error feedback.
*
* The validated result is available via `result.structured`.
*
* Run:
* npx tsx examples/patterns/structured-output.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { z } from 'zod'
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Define a Zod schema for the expected output
// ---------------------------------------------------------------------------
const ReviewAnalysis = z.object({
summary: z.string().describe('One-sentence summary of the review'),
sentiment: z.enum(['positive', 'negative', 'neutral']),
confidence: z.number().min(0).max(1).describe('How confident the analysis is'),
keyTopics: z.array(z.string()).describe('Main topics mentioned in the review'),
})
type ReviewAnalysis = z.infer<typeof ReviewAnalysis>
// ---------------------------------------------------------------------------
// Agent with outputSchema
// ---------------------------------------------------------------------------
const analyst: AgentConfig = {
name: 'analyst',
model: 'claude-sonnet-4-6',
systemPrompt: 'You are a product review analyst. Analyze the given review and extract structured insights.',
outputSchema: ReviewAnalysis,
}
// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({ defaultModel: 'claude-sonnet-4-6' })
const reviews = [
'This keyboard is amazing! The mechanical switches feel incredible and the RGB lighting is stunning. Build quality is top-notch. Only downside is the price.',
'Terrible experience. The product arrived broken, customer support was unhelpful, and the return process took 3 weeks.',
'It works fine. Nothing special, nothing bad. Does what it says on the box.',
]
console.log('Analyzing product reviews with structured output...\n')
for (const review of reviews) {
const result = await orchestrator.runAgent(analyst, `Analyze this review: "${review}"`)
if (result.structured) {
const data = result.structured as ReviewAnalysis
console.log(`Sentiment: ${data.sentiment} (confidence: ${data.confidence})`)
console.log(`Summary: ${data.summary}`)
console.log(`Topics: ${data.keyTopics.join(', ')}`)
} else {
console.log(`Validation failed. Raw output: ${result.output.slice(0, 100)}`)
}
console.log(`Tokens: ${result.tokenUsage.input_tokens} in / ${result.tokenUsage.output_tokens} out`)
console.log('---')
}
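
The header comment of this example describes a parse, validate, retry-once cycle. The sketch below illustrates that cycle in isolation. It is not the framework's implementation: `Ask`, `askStructured`, and `validateReview` are hypothetical names, the `Review` shape is a trimmed version of the schema above, and a hand-written check stands in for Zod so the block runs without dependencies.

```typescript
type Sentiment = 'positive' | 'negative' | 'neutral'

// Trimmed, hypothetical version of the schema in the example above.
interface Review {
  summary: string
  sentiment: Sentiment
  confidence: number
}

// Hand-written check standing in for a Zod schema.
// Returns a description of the first problem found, or null when valid.
function validateReview(value: unknown): string | null {
  if (typeof value !== 'object' || value === null) return 'expected a JSON object'
  const v = value as Record<string, unknown>
  if (typeof v.summary !== 'string') return 'summary must be a string'
  if (v.sentiment !== 'positive' && v.sentiment !== 'negative' && v.sentiment !== 'neutral') {
    return 'sentiment must be positive, negative, or neutral'
  }
  if (typeof v.confidence !== 'number' || v.confidence < 0 || v.confidence > 1) {
    return 'confidence must be a number between 0 and 1'
  }
  return null
}

// `Ask` is a hypothetical stand-in for a model call.
type Ask = (prompt: string) => Promise<string>

// Makes at most two attempts: the original prompt, then one retry that
// includes the validation error as feedback.
async function askStructured(ask: Ask, prompt: string): Promise<Review | undefined> {
  let currentPrompt = prompt
  for (let attempt = 0; attempt < 2; attempt++) {
    const raw = await ask(currentPrompt)
    let problem: string | null = null
    try {
      const parsed: unknown = JSON.parse(raw)
      problem = validateReview(parsed)
      if (problem === null) return parsed as Review
    } catch {
      problem = 'response was not valid JSON'
    }
    currentPrompt = `${prompt}\n\nYour previous reply was rejected: ${problem}. Reply with JSON only.`
  }
  return undefined
}

// Fake model: the first reply is invalid, the second one passes validation.
const replies = ['not json', '{"summary":"Works fine","sentiment":"neutral","confidence":0.8}']
let calls = 0
const fakeAsk: Ask = async () => replies[calls++] ?? ''
const structuredDemo = askStructured(fakeAsk, 'Analyze this review: "It works fine."')
structuredDemo.then((r) => console.log(r ?? 'validation failed twice'))
```

Returning `undefined` after the second failure matches how the example treats a missing `result.structured`: the caller falls back to the raw output.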

@ -1,132 +0,0 @@
/**
* Task Retry with Exponential Backoff
*
* Demonstrates `maxRetries`, `retryDelayMs`, and `retryBackoff` on task config.
* When a task fails, the framework automatically retries with exponential
* backoff. The `onProgress` callback receives `task_retry` events so you can
* log retry attempts in real time.
*
* Scenario: a two-step pipeline where the first task (data fetch) is configured
* to retry on failure, and the second task (analysis) depends on it.
*
* Run:
* npx tsx examples/patterns/task-retry.ts
*
* Prerequisites:
* ANTHROPIC_API_KEY env var must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agents
// ---------------------------------------------------------------------------
const fetcher: AgentConfig = {
name: 'fetcher',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a data-fetching agent. When given a topic, produce a short
JSON summary with 3-5 key facts. Output ONLY valid JSON, no markdown fences.
Example: {"topic":"...", "facts":["fact1","fact2","fact3"]}`,
maxTurns: 2,
}
const analyst: AgentConfig = {
name: 'analyst',
model: 'claude-sonnet-4-6',
systemPrompt: `You are a data analyst. Read the fetched data from shared memory
and produce a brief analysis (3-4 sentences) highlighting trends or insights.`,
maxTurns: 2,
}
// ---------------------------------------------------------------------------
// Progress handler — watch for task_retry events
// ---------------------------------------------------------------------------
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23)
switch (event.type) {
case 'task_start':
console.log(`[${ts}] TASK START "${event.task}" (agent: ${event.agent})`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE "${event.task}"`)
break
case 'task_retry': {
const d = event.data as { attempt: number; maxAttempts: number; error: string; nextDelayMs: number }
console.log(`[${ts}] TASK RETRY "${event.task}" — attempt ${d.attempt}/${d.maxAttempts}, next in ${d.nextDelayMs}ms`)
console.log(` error: ${d.error.slice(0, 120)}`)
break
}
case 'error':
console.log(`[${ts}] ERROR "${event.task}" agent=${event.agent}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrator + team
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'claude-sonnet-4-6',
onProgress: handleProgress,
})
const team = orchestrator.createTeam('retry-demo', {
name: 'retry-demo',
agents: [fetcher, analyst],
sharedMemory: true,
})
// ---------------------------------------------------------------------------
// Tasks — fetcher has retry config, analyst depends on it
// ---------------------------------------------------------------------------
const tasks = [
{
title: 'Fetch data',
description: 'Fetch key facts about the adoption of TypeScript in open-source projects as of 2024. Output a JSON object with a "topic" and "facts" array.',
assignee: 'fetcher',
// Retry config: up to 2 retries, 500ms base delay, 2x backoff (500ms, 1000ms)
maxRetries: 2,
retryDelayMs: 500,
retryBackoff: 2,
},
{
title: 'Analyze data',
description: 'Read the fetched data from shared memory and produce a 3-4 sentence analysis of TypeScript adoption trends.',
assignee: 'analyst',
dependsOn: ['Fetch data'],
// No retry — if analysis fails, just report the error
},
]
// ---------------------------------------------------------------------------
// Run
// ---------------------------------------------------------------------------
console.log('Task Retry Example')
console.log('='.repeat(60))
console.log('Pipeline: fetch (with retry) → analyze')
console.log(`Retry config: maxRetries=2, delay=500ms, backoff=2x`)
console.log('='.repeat(60))
console.log()
const result = await orchestrator.runTasks(team, tasks)
// ---------------------------------------------------------------------------
// Summary
// ---------------------------------------------------------------------------
console.log('\n' + '='.repeat(60))
console.log(`Overall success: ${result.success}`)
console.log(`Tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
for (const [name, r] of result.agentResults) {
const icon = r.success ? 'OK ' : 'FAIL'
console.log(` [${icon}] ${name}`)
console.log(` ${r.output.slice(0, 200)}`)
}
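
The retry settings in this example (`maxRetries: 2`, `retryDelayMs: 500`, `retryBackoff: 2`) are annotated with the delays 500ms and 1000ms. The sketch below shows a delay formula consistent with that annotation, plus a generic retry loop built on it. It assumes the delay is `retryDelayMs * retryBackoff^(attempt - 1)`. `retryDelay` and `withRetry` are hypothetical helpers, not framework functions, and the `sleep` parameter exists only so the demo does not actually wait.

```typescript
interface RetryOptions {
  maxRetries: number
  retryDelayMs: number
  retryBackoff: number
}

// Delay before retry number `attempt` (1-based), assuming
// delay = retryDelayMs * retryBackoff^(attempt - 1).
function retryDelay(attempt: number, opts: RetryOptions): number {
  return opts.retryDelayMs * Math.pow(opts.retryBackoff, attempt - 1)
}

// Runs `fn` once and then up to `maxRetries` more times, sleeping between
// attempts. Rethrows the last error when every attempt fails.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (attempt === opts.maxRetries) break
      await sleep(retryDelay(attempt + 1, opts))
    }
  }
  throw lastError
}

// Demo: a task that fails twice and then succeeds. The injected sleep
// records the requested delays instead of waiting.
const slept: number[] = []
let tries = 0
const retryDemo = withRetry(
  async () => {
    tries++
    if (tries < 3) throw new Error('transient failure')
    return 'ok'
  },
  { maxRetries: 2, retryDelayMs: 500, retryBackoff: 2 },
  async (ms) => {
    slept.push(ms)
  },
)
retryDemo.then((value) => console.log(value, slept)) // ok [ 500, 1000 ]
```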

@ -1,38 +0,0 @@
# Production Examples
End-to-end examples that demonstrate `open-multi-agent` running on real-world use cases — not toy demos.
The other example categories (`basics/`, `providers/`, `patterns/`, `integrations/`) optimize for clarity and small surface area. This directory optimizes for **showing the framework solving an actual problem**, with the operational concerns that come with it.
## Acceptance criteria
A submission belongs in `production/` if it meets all of:
1. **Real use case.** Solves a concrete problem someone would actually pay for or use daily — not "build me a TODO API".
2. **Error handling.** Handles LLM failures, tool failures, and partial team failures gracefully. No bare `await` chains that crash on the first error.
3. **Documentation.** Each example lives in its own subdirectory with a `README.md` covering:
- What problem it solves
- Architecture diagram or task DAG description
- Required env vars / external services
- How to run locally
- Expected runtime and approximate token cost
4. **Reproducible.** Pinned model versions; no reliance on private datasets or unpublished APIs.
5. **Tested.** At least one test or smoke check that verifies the example still runs after framework updates.
If a submission falls short on (2) through (5), it probably belongs in `patterns/` or `integrations/` instead.
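
As a rough illustration of criterion 2, the sketch below runs several independent steps and records each failure instead of letting the first rejected promise end the run. It is only one possible shape under stated assumptions: `runSteps` and `StepOutcome` are hypothetical names, and the steps are plain async functions rather than framework calls.

```typescript
// Hypothetical per-step record; not a framework type.
interface StepOutcome<T> {
  name: string
  ok: boolean
  value?: T
  error?: string
}

// Runs every step and captures failures per step, so one failing step
// does not prevent the others from reporting their results.
async function runSteps<T>(
  steps: Record<string, () => Promise<T>>,
): Promise<StepOutcome<T>[]> {
  return Promise.all(
    Object.entries(steps).map(async ([name, run]): Promise<StepOutcome<T>> => {
      try {
        return { name, ok: true, value: await run() }
      } catch (err) {
        return { name, ok: false, error: err instanceof Error ? err.message : String(err) }
      }
    }),
  )
}

// Demo: one step succeeds and one fails; both are reported.
const outcomesDemo = runSteps<string>({
  fetch: async () => 'raw data',
  analyze: async () => {
    throw new Error('model timeout')
  },
})

outcomesDemo.then((outcomes) => {
  for (const o of outcomes) {
    console.log(o.ok ? `[OK]   ${o.name}` : `[FAIL] ${o.name}: ${o.error}`)
  }
})
```

A production example would likely go further, for instance by deciding which failures are fatal and which allow a degraded result.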
## Layout
```
production/
└── <use-case>/
├── README.md # required
├── index.ts # entry point
├── agents/ # AgentConfig definitions
├── tools/ # custom tools, if any
└── tests/ # smoke test or e2e test
```
## Submitting
Open a PR. In the PR description, address each of the five acceptance criteria above.

@ -1,179 +0,0 @@
/**
* Multi-Agent Team Collaboration with Azure OpenAI
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. Every agent uses Azure-hosted OpenAI models.
*
* Run:
* npx tsx examples/providers/azure-openai.ts
*
* Prerequisites:
* AZURE_OPENAI_API_KEY Your Azure OpenAI API key (required)
* AZURE_OPENAI_ENDPOINT Your Azure endpoint URL (required)
* Example: https://my-resource.openai.azure.com
* AZURE_OPENAI_API_VERSION API version (optional, defaults to 2024-10-21)
* AZURE_OPENAI_DEPLOYMENT Deployment name fallback when model is blank (optional)
*
* Important Note on Model Field:
* The 'model' field in agent configs should contain your Azure DEPLOYMENT NAME,
* not the underlying model name. For example, if you deployed GPT-4 with the
* deployment name "my-gpt4-prod", use `model: 'my-gpt4-prod'` in the agent config.
*
* You can find your deployment names in the Azure Portal under:
* Azure OpenAI → Your Resource → Model deployments
*
* Example Setup:
* If you have these Azure deployments:
* - "gpt-4" (your GPT-4 deployment)
* - "gpt-35-turbo" (your GPT-3.5 Turbo deployment)
*
* Then use those exact names in the model field below.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (using Azure OpenAI deployments)
// ---------------------------------------------------------------------------
/**
* IMPORTANT: Replace 'gpt-4' and 'gpt-35-turbo' below with YOUR actual
* Azure deployment names. These are just examples.
*/
const architect: AgentConfig = {
name: 'architect',
model: 'gpt-4', // Replace with your Azure GPT-4 deployment name
provider: 'azure-openai',
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'gpt-4', // Replace with your Azure GPT-4 or GPT-3.5 deployment name
provider: 'azure-openai',
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'gpt-4', // Replace with your Azure GPT-4 deployment name
provider: 'azure-openai',
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'gpt-4', // Replace with your default Azure deployment name
defaultProvider: 'azure-openai',
maxConcurrency: 1, // sequential for readable output
onProgress: handleProgress,
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users → returns a hardcoded array of 2 user objects
- POST /users → accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}

@ -1,163 +0,0 @@
/**
* Multi-Agent Team Collaboration with GitHub Copilot
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. Routes through GitHub Copilot's OpenAI-compatible
* endpoint, mixing GPT-4o (architect/reviewer) and Claude Sonnet (developer) in one team.
*
* Run:
* npx tsx examples/providers/copilot.ts
*
* Authentication (one of):
* GITHUB_COPILOT_TOKEN env var (preferred)
* GITHUB_TOKEN env var (fallback)
* Otherwise: an interactive OAuth2 device flow starts on first run and prompts
* you to sign in via your browser. Requires an active Copilot subscription.
*
* Available models (subset):
* gpt-4o included, no premium request
* claude-sonnet-4.5 premium, 1x multiplier
* claude-sonnet-4.6 premium, 1x multiplier
* See src/llm/copilot.ts for the full model list and premium multipliers.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (mixing GPT-4o and Claude Sonnet, both via Copilot)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
name: 'architect',
model: 'gpt-4o',
provider: 'copilot',
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'claude-sonnet-4.5',
provider: 'copilot',
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'gpt-4o',
provider: 'copilot',
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23)
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'gpt-4o',
defaultProvider: 'copilot',
maxConcurrency: 1,
onProgress: handleProgress,
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/copilot-api/ with:
- GET /health returns { status: "ok" }
- GET /users returns a hardcoded array of 2 user objects
- POST /users accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}
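
The authentication order described in the Copilot example's header comment (GITHUB_COPILOT_TOKEN first, then GITHUB_TOKEN, otherwise the interactive device flow) can be sketched as a small resolver. This is only an illustration of the lookup order, not the adapter's actual code in src/llm/copilot.ts: `resolveCopilotAuth` and `CopilotAuth` are hypothetical names, and the device flow is signalled rather than implemented.

```typescript
// Hypothetical sketch of the credential lookup order; names are not from the library.
type CopilotAuth =
  | { kind: 'token'; source: 'GITHUB_COPILOT_TOKEN' | 'GITHUB_TOKEN'; token: string }
  | { kind: 'device-flow' }

function resolveCopilotAuth(env: Record<string, string | undefined>): CopilotAuth {
  // Preferred: a dedicated Copilot token.
  const preferred = env['GITHUB_COPILOT_TOKEN']
  if (preferred) return { kind: 'token', source: 'GITHUB_COPILOT_TOKEN', token: preferred }

  // Fallback: a general GitHub token.
  const fallback = env['GITHUB_TOKEN']
  if (fallback) return { kind: 'token', source: 'GITHUB_TOKEN', token: fallback }

  // Neither variable is set (or both are empty): the caller would start the OAuth2 device flow.
  return { kind: 'device-flow' }
}
```

Passing the environment in as a parameter (for example `resolveCopilotAuth(process.env)`) keeps the lookup easy to test without touching real credentials.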


@ -1,158 +0,0 @@
/**
* Multi-Agent Team Collaboration with DeepSeek
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. The architect uses deepseek-reasoner; the
* developer and reviewer use deepseek-chat.
*
* Run:
* npx tsx examples/providers/deepseek.ts
*
* Prerequisites:
* DEEPSEEK_API_KEY environment variable must be set.
*
* Available models:
* deepseek-chat: DeepSeek-V3 (non-thinking mode, recommended for coding tasks)
* deepseek-reasoner: DeepSeek-V3 (thinking mode, for complex reasoning)
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (deepseek-reasoner for the architect, deepseek-chat for the rest)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
name: 'architect',
model: 'deepseek-reasoner',
provider: 'deepseek',
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'deepseek-chat',
provider: 'deepseek',
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'deepseek-chat',
provider: 'deepseek',
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'deepseek-chat',
defaultProvider: 'deepseek',
maxConcurrency: 1, // sequential for readable output
onProgress: handleProgress,
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health returns { status: "ok" }
- GET /users returns a hardcoded array of 2 user objects
- POST /users accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}
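
The progress handler in these examples relies on two small idioms: slicing an ISO timestamp down to HH:MM:SS.mmm, and keeping per-agent start times in a Map to report elapsed milliseconds. A minimal standalone sketch of both follows; `clockStamp` and `createElapsedTracker` are hypothetical helper names, and the injectable `now` parameter is an assumption added to make the timing deterministic in tests.

```typescript
// Hypothetical helpers mirroring the timestamp and elapsed-time pattern in handleProgress.
function clockStamp(date: Date): string {
  // ISO strings look like 2024-01-02T03:04:05.678Z, so characters 11..22
  // are HH:MM:SS.mmm. Note that toISOString() always reports UTC.
  return date.toISOString().slice(11, 23)
}

function createElapsedTracker(now: () => number = Date.now) {
  const startTimes = new Map<string, number>()
  return {
    start(agent: string): void {
      startTimes.set(agent, now())
    },
    // Returns 0 when no start was recorded, matching the `?? Date.now()` fallback.
    elapsed(agent: string): number {
      const current = now()
      return current - (startTimes.get(agent) ?? current)
    },
  }
}

console.log(clockStamp(new Date(Date.UTC(2024, 0, 2, 3, 4, 5, 678)))) // prints "03:04:05.678"
```

Because the timestamp is UTC, it will differ from local wall-clock time in most time zones; that is usually acceptable for relative ordering in a log.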


@ -1,161 +0,0 @@
/**
* Multi-Agent Team Collaboration with Google Gemini
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. Every agent uses Google's Gemini models
* via the official `@google/genai` SDK.
*
* Run:
* npx tsx examples/providers/gemini.ts
*
* Prerequisites:
* GEMINI_API_KEY environment variable must be set.
* `@google/genai` is an optional peer dependency; install it first:
* npm install @google/genai
*
* Available models (subset):
* gemini-2.5-flash: fast & cheap, good for routine coding tasks
* gemini-2.5-pro: more capable, higher latency, larger context
* See https://ai.google.dev/gemini-api/docs/models for the full list.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (mixing pro and flash for a cost/capability balance)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
name: 'architect',
model: 'gemini-2.5-pro',
provider: 'gemini',
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'gemini-2.5-flash',
provider: 'gemini',
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'gemini-2.5-flash',
provider: 'gemini',
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23)
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'gemini-2.5-flash',
defaultProvider: 'gemini',
maxConcurrency: 1,
onProgress: handleProgress,
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/gemini-api/ with:
- GET /health returns { status: "ok" }
- GET /users returns a hardcoded array of 2 user objects
- POST /users accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}


@ -1,192 +0,0 @@
/**
* Gemma 4 Local (100% Local, Zero API Cost)
*
* Demonstrates both execution modes with a fully local Gemma 4 model via
* Ollama. No cloud API keys are needed; everything runs on your machine.
*
* Part 1, runTasks(): explicit task pipeline (researcher, then summarizer)
* Part 2, runTeam(): auto-orchestration where Gemma 4 acts as coordinator,
* decomposes the goal into tasks, and synthesises the final result
*
* This is the hardest test for a local model: runTeam() requires it to
* produce valid JSON for task decomposition AND do tool-calling for execution.
* Gemma 4 e2b (5.1B params) handles both reliably.
*
* Run:
* no_proxy=localhost npx tsx examples/providers/gemma4-local.ts
*
* Prerequisites:
* 1. Ollama >= 0.20.0 installed and running: https://ollama.com
* 2. Pull the model: ollama pull gemma4:e2b
* (or gemma4:e4b for better quality on machines with more RAM)
* 3. No API keys needed!
*
* Note: The no_proxy=localhost prefix is needed if you have an HTTP proxy
* configured, since the OpenAI SDK would otherwise route Ollama requests
* through the proxy.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent, Task } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Configuration — change this to match your Ollama setup
// ---------------------------------------------------------------------------
// See available tags at https://ollama.com/library/gemma4
const OLLAMA_MODEL = 'gemma4:e2b' // or 'gemma4:e4b', 'gemma4:26b'
const OLLAMA_BASE_URL = 'http://localhost:11434/v1'
const OUTPUT_DIR = '/tmp/gemma4-demo'
// ---------------------------------------------------------------------------
// Agents
// ---------------------------------------------------------------------------
const researcher: AgentConfig = {
name: 'researcher',
model: OLLAMA_MODEL,
provider: 'openai',
baseURL: OLLAMA_BASE_URL,
apiKey: 'ollama', // placeholder — Ollama ignores this, but the OpenAI SDK requires a non-empty value
systemPrompt: `You are a system researcher. Use bash to run non-destructive,
read-only commands (uname -a, sw_vers, df -h, uptime, etc.) and report results.
Use file_write to save reports when asked.`,
tools: ['bash', 'file_write'],
maxTurns: 8,
}
const summarizer: AgentConfig = {
name: 'summarizer',
model: OLLAMA_MODEL,
provider: 'openai',
baseURL: OLLAMA_BASE_URL,
apiKey: 'ollama',
systemPrompt: `You are a technical writer. Read files and produce concise,
structured Markdown summaries. Use file_write to save reports when asked.`,
tools: ['file_read', 'file_write'],
maxTurns: 4,
}
// ---------------------------------------------------------------------------
// Progress handler
// ---------------------------------------------------------------------------
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23)
switch (event.type) {
case 'task_start': {
const task = event.data as Task | undefined
console.log(`[${ts}] TASK START "${task?.title ?? event.task}" → ${task?.assignee ?? '?'}`)
break
}
case 'task_complete':
console.log(`[${ts}] TASK DONE "${event.task}"`)
break
case 'agent_start':
console.log(`[${ts}] AGENT START ${event.agent}`)
break
case 'agent_complete':
console.log(`[${ts}] AGENT DONE ${event.agent}`)
break
case 'error':
console.error(`[${ts}] ERROR ${event.agent ?? ''} task=${event.task ?? '?'}`)
break
}
}
// ═══════════════════════════════════════════════════════════════════════════
// Part 1: runTasks() — Explicit task pipeline
// ═══════════════════════════════════════════════════════════════════════════
console.log('Part 1: runTasks() — Explicit Pipeline')
console.log('='.repeat(60))
console.log(` model → ${OLLAMA_MODEL} via Ollama`)
console.log(` pipeline → researcher gathers info → summarizer writes summary`)
console.log()
const orchestrator1 = new OpenMultiAgent({
defaultModel: OLLAMA_MODEL,
maxConcurrency: 1, // local model serves one request at a time
onProgress: handleProgress,
})
const team1 = orchestrator1.createTeam('explicit', {
name: 'explicit',
agents: [researcher, summarizer],
sharedMemory: true,
})
const tasks = [
{
title: 'Gather system information',
description: `Use bash to run system info commands (uname -a, sw_vers, sysctl, df -h, uptime).
Then write a structured Markdown report to ${OUTPUT_DIR}/system-report.md with sections:
OS, Hardware, Disk, and Uptime.`,
assignee: 'researcher',
},
{
title: 'Summarize the report',
description: `Read the file at ${OUTPUT_DIR}/system-report.md.
Produce a concise one-paragraph executive summary of the system information.`,
assignee: 'summarizer',
dependsOn: ['Gather system information'],
},
]
const start1 = Date.now()
const result1 = await orchestrator1.runTasks(team1, tasks)
console.log(`\nSuccess: ${result1.success} Time: ${((Date.now() - start1) / 1000).toFixed(1)}s`)
console.log(`Tokens — input: ${result1.totalTokenUsage.input_tokens}, output: ${result1.totalTokenUsage.output_tokens}`)
const summary = result1.agentResults.get('summarizer')
if (summary?.success) {
console.log('\nSummary (from local Gemma 4):')
console.log('-'.repeat(60))
console.log(summary.output)
console.log('-'.repeat(60))
}
// ═══════════════════════════════════════════════════════════════════════════
// Part 2: runTeam() — Auto-orchestration (Gemma 4 as coordinator)
// ═══════════════════════════════════════════════════════════════════════════
console.log('\n\nPart 2: runTeam() — Auto-Orchestration')
console.log('='.repeat(60))
console.log(` coordinator → auto-created by runTeam(), also Gemma 4`)
console.log(` goal → given in natural language, framework plans everything`)
console.log()
const orchestrator2 = new OpenMultiAgent({
defaultModel: OLLAMA_MODEL,
defaultProvider: 'openai',
defaultBaseURL: OLLAMA_BASE_URL,
defaultApiKey: 'ollama',
maxConcurrency: 1,
onProgress: handleProgress,
})
const team2 = orchestrator2.createTeam('auto', {
name: 'auto',
agents: [researcher, summarizer],
sharedMemory: true,
})
const goal = `Check this machine's Node.js version, npm version, and OS info,
then write a short Markdown summary report to /tmp/gemma4-auto/report.md`
const start2 = Date.now()
const result2 = await orchestrator2.runTeam(team2, goal)
console.log(`\nSuccess: ${result2.success} Time: ${((Date.now() - start2) / 1000).toFixed(1)}s`)
console.log(`Tokens — input: ${result2.totalTokenUsage.input_tokens}, output: ${result2.totalTokenUsage.output_tokens}`)
const coordResult = result2.agentResults.get('coordinator')
if (coordResult?.success) {
console.log('\nFinal synthesis (from local Gemma 4 coordinator):')
console.log('-'.repeat(60))
console.log(coordResult.output)
console.log('-'.repeat(60))
}
console.log('\nAll processing done locally. $0 API cost.')
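
Part 1 above declares ordering through `dependsOn`, which references tasks by title. How the framework schedules these internally is not shown in this file; the sketch below is one plausible approach (Kahn's topological sort), using a hypothetical `TaskSpec` type and `executionOrder` function that are not part of the library.

```typescript
// Hypothetical sketch: order tasks so each one runs after everything it depends on.
interface TaskSpec {
  title: string
  assignee: string
  dependsOn?: string[]
}

function executionOrder(tasks: TaskSpec[]): string[] {
  const titles = new Set(tasks.map(t => t.title))
  const remaining = new Map<string, number>()   // unmet dependency count per task
  const dependents = new Map<string, string[]>() // dependency title -> tasks waiting on it

  for (const task of tasks) {
    const deps = task.dependsOn ?? []
    remaining.set(task.title, deps.length)
    for (const dep of deps) {
      if (!titles.has(dep)) throw new Error(`Unknown dependency: ${dep}`)
      dependents.set(dep, [...(dependents.get(dep) ?? []), task.title])
    }
  }

  // Start with tasks that have no dependencies, in declaration order.
  const ready = tasks.filter(t => remaining.get(t.title) === 0).map(t => t.title)
  const order: string[] = []
  while (ready.length > 0) {
    const title = ready.shift() as string
    order.push(title)
    for (const next of dependents.get(title) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1
      remaining.set(next, left)
      if (left === 0) ready.push(next)
    }
  }

  // Any task never released is part of a cycle.
  if (order.length !== tasks.length) throw new Error('Dependency cycle detected')
  return order
}
```

With the two tasks from Part 1, this yields "Gather system information" before "Summarize the report" regardless of the order in which they are declared.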


@ -1,154 +0,0 @@
/**
* Multi-Agent Team Collaboration with Grok (xAI)
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. Every agent uses Grok's coding-optimized model.
*
* Run:
* npx tsx examples/providers/grok.ts
*
* Prerequisites:
* XAI_API_KEY environment variable must be set.
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (all using grok-code-fast-1)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
name: 'architect',
model: 'grok-code-fast-1',
provider: 'grok',
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'grok-code-fast-1',
provider: 'grok',
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'grok-code-fast-1',
provider: 'grok',
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'grok-code-fast-1',
defaultProvider: 'grok',
maxConcurrency: 1, // sequential for readable output
onProgress: handleProgress,
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health returns { status: "ok" }
- GET /users returns a hardcoded array of 2 user objects
- POST /users accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}


@ -1,164 +0,0 @@
/**
* Multi-Agent Team Collaboration with Groq
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. Every agent uses Groq via the OpenAI-compatible adapter.
*
* Run:
* npx tsx examples/providers/groq.ts
*
* Prerequisites:
* GROQ_API_KEY environment variable must be set.
*
* Available models:
* llama-3.3-70b-versatile: Groq production model (recommended for coding tasks)
* deepseek-r1-distill-llama-70b: Groq reasoning model
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (all using Groq via the OpenAI-compatible adapter)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
name: 'architect',
model: 'deepseek-r1-distill-llama-70b',
provider: 'openai',
baseURL: 'https://api.groq.com/openai/v1',
apiKey: process.env.GROQ_API_KEY,
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'llama-3.3-70b-versatile',
provider: 'openai',
baseURL: 'https://api.groq.com/openai/v1',
apiKey: process.env.GROQ_API_KEY,
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'llama-3.3-70b-versatile',
provider: 'openai',
baseURL: 'https://api.groq.com/openai/v1',
apiKey: process.env.GROQ_API_KEY,
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
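const orchestrator = new OpenMultiAgent({
defaultModel: 'llama-3.3-70b-versatile',
defaultProvider: 'openai',
// Point the defaults at Groq as well, so the coordinator created by runTeam()
// does not fall back to the stock OpenAI endpoint.
defaultBaseURL: 'https://api.groq.com/openai/v1',
defaultApiKey: process.env.GROQ_API_KEY,
maxConcurrency: 1, // sequential for readable output
onProgress: handleProgress,
})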
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users returns a hardcoded array of 2 user objects
- POST /users accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}

View File

@ -1,159 +0,0 @@
/**
* Multi-Agent Team Collaboration with MiniMax
*
* Three specialized agents (architect, developer, reviewer) collaborate via `runTeam()`
* to build a minimal Express.js REST API. Every agent uses MiniMax's flagship model.
*
* Run:
* npx tsx examples/providers/minimax.ts
*
* Prerequisites:
* MINIMAX_API_KEY environment variable must be set.
* MINIMAX_BASE_URL environment variable can be set to switch to the China mainland endpoint if needed.
*
* Endpoints:
* Global (default): https://api.minimax.io/v1
* China mainland: https://api.minimaxi.com/v1 (set MINIMAX_BASE_URL)
*/
import { OpenMultiAgent } from '../../src/index.js'
import type { AgentConfig, OrchestratorEvent } from '../../src/types.js'
// ---------------------------------------------------------------------------
// Agent definitions (all using MiniMax-M2.7)
// ---------------------------------------------------------------------------
const architect: AgentConfig = {
name: 'architect',
model: 'MiniMax-M2.7',
provider: 'minimax',
systemPrompt: `You are a software architect with deep experience in Node.js and REST API design.
Your job is to design clear, production-quality API contracts and file/directory structures.
Output concise plans in markdown, with no unnecessary prose.`,
tools: ['bash', 'file_write'],
maxTurns: 5,
temperature: 0.2,
}
const developer: AgentConfig = {
name: 'developer',
model: 'MiniMax-M2.7',
provider: 'minimax',
systemPrompt: `You are a TypeScript/Node.js developer. You implement what the architect specifies.
Write clean, runnable code with proper error handling. Use the tools to write files and run tests.`,
tools: ['bash', 'file_read', 'file_write', 'file_edit'],
maxTurns: 12,
temperature: 0.1,
}
const reviewer: AgentConfig = {
name: 'reviewer',
model: 'MiniMax-M2.7',
provider: 'minimax',
systemPrompt: `You are a senior code reviewer. Review code for correctness, security, and clarity.
Provide a structured review with: LGTM items, suggestions, and any blocking issues.
Read files using the tools before reviewing.`,
tools: ['bash', 'file_read', 'grep'],
maxTurns: 5,
temperature: 0.3,
}
// ---------------------------------------------------------------------------
// Progress tracking
// ---------------------------------------------------------------------------
const startTimes = new Map<string, number>()
function handleProgress(event: OrchestratorEvent): void {
const ts = new Date().toISOString().slice(11, 23) // HH:MM:SS.mmm
switch (event.type) {
case 'agent_start':
startTimes.set(event.agent ?? '', Date.now())
console.log(`[${ts}] AGENT START → ${event.agent}`)
break
case 'agent_complete': {
const elapsed = Date.now() - (startTimes.get(event.agent ?? '') ?? Date.now())
console.log(`[${ts}] AGENT DONE ← ${event.agent} (${elapsed}ms)`)
break
}
case 'task_start':
console.log(`[${ts}] TASK START ↓ ${event.task}`)
break
case 'task_complete':
console.log(`[${ts}] TASK DONE ↑ ${event.task}`)
break
case 'message':
console.log(`[${ts}] MESSAGE • ${event.agent} → (team)`)
break
case 'error':
console.error(`[${ts}] ERROR ✗ agent=${event.agent} task=${event.task}`)
if (event.data instanceof Error) console.error(` ${event.data.message}`)
break
}
}
// ---------------------------------------------------------------------------
// Orchestrate
// ---------------------------------------------------------------------------
const orchestrator = new OpenMultiAgent({
defaultModel: 'MiniMax-M2.7',
defaultProvider: 'minimax',
maxConcurrency: 1, // sequential for readable output
onProgress: handleProgress,
})
const team = orchestrator.createTeam('api-team', {
name: 'api-team',
agents: [architect, developer, reviewer],
sharedMemory: true,
maxConcurrency: 1,
})
console.log(`Team "${team.name}" created with agents: ${team.getAgents().map(a => a.name).join(', ')}`)
console.log('\nStarting team run...\n')
console.log('='.repeat(60))
const goal = `Create a minimal Express.js REST API in /tmp/express-api/ with:
- GET /health → { status: "ok" }
- GET /users returns a hardcoded array of 2 user objects
- POST /users accepts { name, email } body, logs it, returns 201
- Proper error handling middleware
- The server should listen on port 3001
- Include a package.json with the required dependencies`
const result = await orchestrator.runTeam(team, goal)
console.log('\n' + '='.repeat(60))
// ---------------------------------------------------------------------------
// Results
// ---------------------------------------------------------------------------
console.log('\nTeam run complete.')
console.log(`Success: ${result.success}`)
console.log(`Total tokens — input: ${result.totalTokenUsage.input_tokens}, output: ${result.totalTokenUsage.output_tokens}`)
console.log('\nPer-agent results:')
for (const [agentName, agentResult] of result.agentResults) {
const status = agentResult.success ? 'OK' : 'FAILED'
const tools = agentResult.toolCalls.length
console.log(` ${agentName.padEnd(12)} [${status}] tool_calls=${tools}`)
if (!agentResult.success) {
console.log(` Error: ${agentResult.output.slice(0, 120)}`)
}
}
// Sample outputs
const developerResult = result.agentResults.get('developer')
if (developerResult?.success) {
console.log('\nDeveloper output (last 600 chars):')
console.log('─'.repeat(60))
const out = developerResult.output
console.log(out.length > 600 ? '...' + out.slice(-600) : out)
console.log('─'.repeat(60))
}
const reviewerResult = result.agentResults.get('reviewer')
if (reviewerResult?.success) {
console.log('\nReviewer output:')
console.log('─'.repeat(60))
console.log(reviewerResult.output)
console.log('─'.repeat(60))
}
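
The header comment of this example names two MiniMax endpoints and a `MINIMAX_BASE_URL` override. A minimal sketch of that selection follows, assuming the override simply replaces the global default. The helper name `resolveMiniMaxBaseUrl` is hypothetical and is not part of the framework; how the adapter actually reads the variable is not shown in this diff.

```typescript
// Hypothetical helper for illustration only; not part of open-multi-agent.
const MINIMAX_GLOBAL_URL = 'https://api.minimax.io/v1'

function resolveMiniMaxBaseUrl(env: Record<string, string | undefined>): string {
  // An unset or blank override falls back to the global endpoint.
  const override = env['MINIMAX_BASE_URL']?.trim()
  return override ? override : MINIMAX_GLOBAL_URL
}
```

In a script this would typically be called as `resolveMiniMaxBaseUrl(process.env)`, with `MINIMAX_BASE_URL=https://api.minimaxi.com/v1` exported to target the China mainland endpoint.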

3586
package-lock.json generated

File diff suppressed because it is too large

View File

@ -1,27 +1,14 @@
{
"name": "@jackchen_me/open-multi-agent",
"version": "1.2.0",
"description": "TypeScript multi-agent framework — one runTeam() call from goal to result. Auto task decomposition, parallel execution. 3 dependencies, deploys anywhere Node.js runs.",
"files": [
"dist",
"docs",
"README.md",
"LICENSE"
],
"version": "0.2.0",
"description": "Production-grade multi-agent orchestration framework. Model-agnostic, supports team collaboration, task scheduling, and inter-agent communication.",
"type": "module",
"main": "dist/index.js",
"types": "dist/index.d.ts",
"bin": {
"oma": "dist/cli/oma.js"
},
"exports": {
".": {
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
},
"./mcp": {
"types": "./dist/mcp.d.ts",
"import": "./dist/mcp.js"
}
},
"scripts": {
@ -29,9 +16,7 @@
"dev": "tsc --watch",
"test": "vitest run",
"test:watch": "vitest",
"test:coverage": "vitest run --coverage",
"lint": "tsc --noEmit",
"test:e2e": "RUN_E2E=1 vitest run tests/e2e/",
"prepublishOnly": "npm run build"
},
"keywords": [
@ -48,14 +33,6 @@
],
"author": "",
"license": "MIT",
"repository": {
"type": "git",
"url": "git+https://github.com/JackChen-me/open-multi-agent.git"
},
"homepage": "https://github.com/JackChen-me/open-multi-agent#readme",
"bugs": {
"url": "https://github.com/JackChen-me/open-multi-agent/issues"
},
"engines": {
"node": ">=18.0.0"
},
@ -64,23 +41,8 @@
"openai": "^4.73.0",
"zod": "^3.23.0"
},
"peerDependencies": {
"@google/genai": "^1.48.0",
"@modelcontextprotocol/sdk": "^1.18.0"
},
"peerDependenciesMeta": {
"@google/genai": {
"optional": true
},
"@modelcontextprotocol/sdk": {
"optional": true
}
},
"devDependencies": {
"@google/genai": "^1.48.0",
"@modelcontextprotocol/sdk": "^1.18.0",
"@types/node": "^22.0.0",
"@vitest/coverage-v8": "^2.1.9",
"tsx": "^4.21.0",
"typescript": "^5.6.0",
"vitest": "^2.1.0"

View File

@ -27,13 +27,11 @@ import type {
AgentConfig,
AgentState,
AgentRunResult,
BeforeRunHookContext,
LLMMessage,
StreamEvent,
TokenUsage,
ToolUseContext,
} from '../types.js'
import { emitTrace, generateRunId } from '../utils/trace.js'
import type { ToolDefinition as FrameworkToolDefinition, ToolRegistry } from '../tool/framework.js'
import type { ToolExecutor } from '../tool/executor.js'
import { createAdapter } from '../llm/adapter.js'
@ -50,19 +48,6 @@ import {
const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }
/**
* Combine two {@link AbortSignal}s so that aborting either one cancels the
* returned signal. Works on Node 18+ (no `AbortSignal.any` required).
*/
function mergeAbortSignals(a: AbortSignal, b: AbortSignal): AbortSignal {
const controller = new AbortController()
if (a.aborted || b.aborted) { controller.abort(); return controller.signal }
const abort = () => controller.abort()
a.addEventListener('abort', abort, { once: true })
b.addEventListener('abort', abort, { once: true })
return controller.signal
}
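
A brief usage sketch of the signal-merging helper above, assuming a Node 18+ runtime where `AbortController` is a global. The function body repeats `mergeAbortSignals` from this diff so the block runs on its own. The two controllers are stand-ins for a timeout source and a caller-supplied signal.

```typescript
function mergeAbortSignals(a: AbortSignal, b: AbortSignal): AbortSignal {
  const controller = new AbortController()
  // If either input is already aborted, the merged signal starts aborted.
  if (a.aborted || b.aborted) { controller.abort(); return controller.signal }
  const abort = () => controller.abort()
  a.addEventListener('abort', abort, { once: true })
  b.addEventListener('abort', abort, { once: true })
  return controller.signal
}

// Stand-ins for the two cancellation sources (timeout and caller).
const timeoutSource = new AbortController()
const callerSource = new AbortController()
const merged = mergeAbortSignals(timeoutSource.signal, callerSource.signal)

const beforeAbort = merged.aborted // nothing has been cancelled yet
callerSource.abort()               // cancelling either source...
const afterAbort = merged.aborted  // ...is reflected on the merged signal
```

The merged signal only reports that a cancellation happened. This version does not forward the `reason` of whichever source fired.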
function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
return {
input_tokens: a.input_tokens + b.input_tokens,
@ -146,15 +131,9 @@ export class Agent {
maxTurns: this.config.maxTurns,
maxTokens: this.config.maxTokens,
temperature: this.config.temperature,
toolPreset: this.config.toolPreset,
allowedTools: this.config.tools,
disallowedTools: this.config.disallowedTools,
agentName: this.name,
agentRole: this.config.systemPrompt?.slice(0, 50) ?? 'assistant',
loopDetection: this.config.loopDetection,
maxTokenBudget: this.config.maxTokenBudget,
contextStrategy: this.config.contextStrategy,
compressToolResults: this.config.compressToolResults,
}
this.runner = new AgentRunner(
@ -179,12 +158,12 @@ export class Agent {
*
* Use this for one-shot queries where past context is irrelevant.
*/
async run(prompt: string, runOptions?: Partial<RunOptions>): Promise<AgentRunResult> {
async run(prompt: string): Promise<AgentRunResult> {
const messages: LLMMessage[] = [
{ role: 'user', content: [{ type: 'text', text: prompt }] },
]
return this.executeRun(messages, runOptions)
return this.executeRun(messages)
}
/**
@ -195,7 +174,6 @@ export class Agent {
*
* Use this for multi-turn interactions.
*/
// TODO(#18): accept optional RunOptions to forward trace context
async prompt(message: string): Promise<AgentRunResult> {
const userMessage: LLMMessage = {
role: 'user',
@ -219,7 +197,6 @@ export class Agent {
*
* Like {@link run}, this does not use or update the persistent history.
*/
// TODO(#18): accept optional RunOptions to forward trace context
async *stream(prompt: string): AsyncGenerator<StreamEvent> {
const messages: LLMMessage[] = [
{ role: 'user', content: [{ type: 'text', text: prompt }] },
@ -265,7 +242,7 @@ export class Agent {
* The tool becomes available to the next LLM call; no restart required.
*/
addTool(tool: FrameworkToolDefinition): void {
this._toolRegistry.register(tool, { runtimeAdded: true })
this._toolRegistry.register(tool)
}
/**
@ -289,91 +266,37 @@ export class Agent {
* Shared execution path used by both `run` and `prompt`.
* Handles state transitions and error wrapping.
*/
private async executeRun(
messages: LLMMessage[],
callerOptions?: Partial<RunOptions>,
): Promise<AgentRunResult> {
private async executeRun(messages: LLMMessage[]): Promise<AgentRunResult> {
this.transitionTo('running')
const agentStartMs = Date.now()
try {
// --- beforeRun hook ---
if (this.config.beforeRun) {
const hookCtx = this.buildBeforeRunHookContext(messages)
const modified = await this.config.beforeRun(hookCtx)
this.applyHookContext(messages, modified, hookCtx.prompt)
}
const runner = await this.getRunner()
const internalOnMessage = (msg: LLMMessage) => {
this.state.messages.push(msg)
callerOptions?.onMessage?.(msg)
}
// Auto-generate runId when onTrace is provided but runId is missing
const needsRunId = callerOptions?.onTrace && !callerOptions.runId
// Create a fresh timeout signal per run (not per runner) so that
// each run() / prompt() call gets its own timeout window.
const timeoutSignal = this.config.timeoutMs !== undefined && this.config.timeoutMs > 0
? AbortSignal.timeout(this.config.timeoutMs)
: undefined
// Merge caller-provided abortSignal with the timeout signal so that
// either cancellation source is respected.
const callerAbort = callerOptions?.abortSignal
const effectiveAbort = timeoutSignal && callerAbort
? mergeAbortSignals(timeoutSignal, callerAbort)
: timeoutSignal ?? callerAbort
const runOptions: RunOptions = {
...callerOptions,
onMessage: internalOnMessage,
...(needsRunId ? { runId: generateRunId() } : undefined),
...(effectiveAbort ? { abortSignal: effectiveAbort } : undefined),
onMessage: msg => {
this.state.messages.push(msg)
},
}
const result = await runner.run(messages, runOptions)
this.state.tokenUsage = addUsage(this.state.tokenUsage, result.tokenUsage)
if (result.budgetExceeded) {
let budgetResult = this.toAgentRunResult(result, false)
if (this.config.afterRun) {
budgetResult = await this.config.afterRun(budgetResult)
}
this.transitionTo('completed')
this.emitAgentTrace(callerOptions, agentStartMs, budgetResult)
return budgetResult
}
// --- Structured output validation ---
if (this.config.outputSchema) {
let validated = await this.validateStructuredOutput(
return this.validateStructuredOutput(
messages,
result,
runner,
runOptions,
)
// --- afterRun hook ---
if (this.config.afterRun) {
validated = await this.config.afterRun(validated)
}
this.emitAgentTrace(callerOptions, agentStartMs, validated)
return validated
}
let agentResult = this.toAgentRunResult(result, true)
// --- afterRun hook ---
if (this.config.afterRun) {
agentResult = await this.config.afterRun(agentResult)
}
this.transitionTo('completed')
this.emitAgentTrace(callerOptions, agentStartMs, agentResult)
return agentResult
return this.toAgentRunResult(result, true)
} catch (err) {
const error = err instanceof Error ? err : new Error(String(err))
this.transitionToError(error)
const errorResult: AgentRunResult = {
return {
success: false,
output: error.message,
messages: [],
@ -381,33 +304,9 @@ export class Agent {
toolCalls: [],
structured: undefined,
}
this.emitAgentTrace(callerOptions, agentStartMs, errorResult)
return errorResult
}
}
/** Emit an `agent` trace event if `onTrace` is provided. */
private emitAgentTrace(
options: Partial<RunOptions> | undefined,
startMs: number,
result: AgentRunResult,
): void {
if (!options?.onTrace) return
const endMs = Date.now()
emitTrace(options.onTrace, {
type: 'agent',
runId: options.runId ?? '',
taskId: options.taskId,
agent: options.traceAgent ?? this.name,
turns: result.messages.filter(m => m.role === 'assistant').length,
tokens: result.tokenUsage,
toolCalls: result.toolCalls.length,
startMs,
endMs,
durationMs: endMs - startMs,
})
}
/**
* Validate agent output against the configured `outputSchema`.
* On first validation failure, retry once with error feedback.
@ -476,7 +375,6 @@ export class Agent {
tokenUsage: mergedTokenUsage,
toolCalls: mergedToolCalls,
structured: validated,
...(retryResult.budgetExceeded ? { budgetExceeded: true } : {}),
}
} catch {
// Retry also failed
@ -488,7 +386,6 @@ export class Agent {
tokenUsage: mergedTokenUsage,
toolCalls: mergedToolCalls,
structured: undefined,
...(retryResult.budgetExceeded ? { budgetExceeded: true } : {}),
}
}
}
@ -501,31 +398,13 @@ export class Agent {
this.transitionTo('running')
try {
// --- beforeRun hook ---
if (this.config.beforeRun) {
const hookCtx = this.buildBeforeRunHookContext(messages)
const modified = await this.config.beforeRun(hookCtx)
this.applyHookContext(messages, modified, hookCtx.prompt)
}
const runner = await this.getRunner()
// Fresh timeout per stream call, same as executeRun.
const timeoutSignal = this.config.timeoutMs !== undefined && this.config.timeoutMs > 0
? AbortSignal.timeout(this.config.timeoutMs)
: undefined
for await (const event of runner.stream(messages, timeoutSignal ? { abortSignal: timeoutSignal } : {})) {
for await (const event of runner.stream(messages)) {
if (event.type === 'done') {
const result = event.data as import('./runner.js').RunResult
this.state.tokenUsage = addUsage(this.state.tokenUsage, result.tokenUsage)
let agentResult = this.toAgentRunResult(result, !result.budgetExceeded)
if (this.config.afterRun) {
agentResult = await this.config.afterRun(agentResult)
}
this.transitionTo('completed')
yield { type: 'done', data: agentResult } satisfies StreamEvent
continue
} else if (event.type === 'error') {
const error = event.data instanceof Error
? event.data
@ -542,50 +421,6 @@ export class Agent {
}
}
// -------------------------------------------------------------------------
// Hook helpers
// -------------------------------------------------------------------------
/** Extract the prompt text from the last user message to build hook context. */
private buildBeforeRunHookContext(messages: LLMMessage[]): BeforeRunHookContext {
let prompt = ''
for (let i = messages.length - 1; i >= 0; i--) {
if (messages[i]!.role === 'user') {
prompt = messages[i]!.content
.filter((b): b is import('../types.js').TextBlock => b.type === 'text')
.map(b => b.text)
.join('')
break
}
}
// Strip hook functions to avoid circular self-references in the context
const { beforeRun, afterRun, ...agentInfo } = this.config
return { prompt, agent: agentInfo as AgentConfig }
}
/**
* Apply a (possibly modified) hook context back to the messages array.
*
* Only text blocks in the last user message are replaced; non-text content
* (images, tool results) is preserved. The array element is replaced (not
* mutated in place) so that shallow copies of the original array (e.g. from
* `prompt()`) are not affected.
*/
private applyHookContext(messages: LLMMessage[], ctx: BeforeRunHookContext, originalPrompt: string): void {
if (ctx.prompt === originalPrompt) return
for (let i = messages.length - 1; i >= 0; i--) {
if (messages[i]!.role === 'user') {
const nonTextBlocks = messages[i]!.content.filter(b => b.type !== 'text')
messages[i] = {
role: 'user',
content: [{ type: 'text', text: ctx.prompt }, ...nonTextBlocks],
}
break
}
}
}
// -------------------------------------------------------------------------
// State transition helpers
// -------------------------------------------------------------------------
@ -614,8 +449,6 @@ export class Agent {
tokenUsage: result.tokenUsage,
toolCalls: result.toolCalls,
structured,
...(result.loopDetected ? { loopDetected: true } : {}),
...(result.budgetExceeded ? { budgetExceeded: true } : {}),
}
}
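
The doc comment on `applyHookContext` in this diff says the last user message is replaced rather than mutated, so shallow copies of the array are unaffected. A self-contained sketch of that behaviour follows. The `Message` and block types are simplified stand-ins for the framework's `LLMMessage`, and `replaceLastUserText` is a hypothetical name.

```typescript
// Simplified stand-ins for the framework's message types (assumed shapes).
type TextBlock = { type: 'text'; text: string }
type ImageBlock = { type: 'image'; url: string }
type ContentBlock = TextBlock | ImageBlock
interface Message { role: 'user' | 'assistant'; content: ContentBlock[] }

// Replace the text of the last user message, keeping non-text blocks.
function replaceLastUserText(messages: Message[], newPrompt: string): void {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i]
    if (msg !== undefined && msg.role === 'user') {
      const nonText = msg.content.filter(b => b.type !== 'text')
      // Assign a new element instead of mutating `msg` in place.
      messages[i] = { role: 'user', content: [{ type: 'text', text: newPrompt }, ...nonText] }
      return
    }
  }
}

const original: Message = {
  role: 'user',
  content: [{ type: 'text', text: 'old' }, { type: 'image', url: 'x.png' }],
}
const history: Message[] = [original]
const copy = [...history] // shallow copy, as a caller such as prompt() might hold
replaceLastUserText(copy, 'new')
```

After the call, `copy` holds the rewritten message while `history` still points at the untouched `original` object. That is the property the doc comment describes.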

View File

@ -1,137 +0,0 @@
/**
* @fileoverview Sliding-window loop detector for the agent conversation loop.
*
* Tracks tool-call signatures and text outputs across turns to detect when an
* agent is stuck repeating the same actions. Used by {@link AgentRunner} when
* {@link LoopDetectionConfig} is provided.
*/
import type { LoopDetectionConfig, LoopDetectionInfo } from '../types.js'
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/**
* Recursively sort object keys so that `{b:1, a:2}` and `{a:2, b:1}` produce
* the same JSON string.
*/
function sortKeys(value: unknown): unknown {
if (value === null || typeof value !== 'object') return value
if (Array.isArray(value)) return value.map(sortKeys)
const sorted: Record<string, unknown> = {}
for (const key of Object.keys(value as Record<string, unknown>).sort()) {
sorted[key] = sortKeys((value as Record<string, unknown>)[key])
}
return sorted
}
// ---------------------------------------------------------------------------
// LoopDetector
// ---------------------------------------------------------------------------
export class LoopDetector {
private readonly maxRepeats: number
private readonly windowSize: number
private readonly toolSignatures: string[] = []
private readonly textOutputs: string[] = []
constructor(config: LoopDetectionConfig = {}) {
this.maxRepeats = config.maxRepetitions ?? 3
const requestedWindow = config.loopDetectionWindow ?? 4
// Window must be >= threshold, otherwise detection can never trigger.
this.windowSize = Math.max(requestedWindow, this.maxRepeats)
}
/**
* Record a turn's tool calls. Returns detection info when a loop is found.
*/
recordToolCalls(
blocks: ReadonlyArray<{ name: string; input: Record<string, unknown> }>,
): LoopDetectionInfo | null {
if (blocks.length === 0) return null
const signature = this.computeToolSignature(blocks)
this.push(this.toolSignatures, signature)
const count = this.consecutiveRepeats(this.toolSignatures)
if (count >= this.maxRepeats) {
const names = blocks.map(b => b.name).join(', ')
return {
kind: 'tool_repetition',
repetitions: count,
detail:
`Tool call "${names}" with identical arguments has repeated ` +
`${count} times consecutively. The agent appears to be stuck in a loop.`,
}
}
return null
}
/**
* Record a turn's text output. Returns detection info when a loop is found.
*/
recordText(text: string): LoopDetectionInfo | null {
const normalised = text.trim().replace(/\s+/g, ' ')
if (normalised.length === 0) return null
this.push(this.textOutputs, normalised)
const count = this.consecutiveRepeats(this.textOutputs)
if (count >= this.maxRepeats) {
return {
kind: 'text_repetition',
repetitions: count,
detail:
`The agent has produced the same text response ${count} times ` +
`consecutively. It appears to be stuck in a loop.`,
}
}
return null
}
// -------------------------------------------------------------------------
// Private
// -------------------------------------------------------------------------
/**
* Deterministic JSON signature for a set of tool calls.
* Sorts calls by name (for multi-tool turns) and keys within each input.
*/
private computeToolSignature(
blocks: ReadonlyArray<{ name: string; input: Record<string, unknown> }>,
): string {
const items = blocks
.map(b => ({ name: b.name, input: sortKeys(b.input) }))
.sort((a, b) => {
const cmp = a.name.localeCompare(b.name)
if (cmp !== 0) return cmp
return JSON.stringify(a.input).localeCompare(JSON.stringify(b.input))
})
return JSON.stringify(items)
}
/** Push an entry and trim the buffer to `windowSize`. */
private push(buffer: string[], entry: string): void {
buffer.push(entry)
while (buffer.length > this.windowSize) {
buffer.shift()
}
}
/**
* Count how many consecutive identical entries exist at the tail of `buffer`.
* Returns 1 when the last entry is unique.
*/
private consecutiveRepeats(buffer: string[]): number {
if (buffer.length === 0) return 0
const last = buffer[buffer.length - 1]
let count = 0
for (let i = buffer.length - 1; i >= 0; i--) {
if (buffer[i] === last) count++
else break
}
return count
}
}
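
The detection rule in this file can be exercised in isolation: build an order-independent signature per turn, then count identical entries at the tail of the buffer. The sketch below reuses the `sortKeys` logic from the diff and a standalone `tailRepeats` function, a hypothetical name mirroring `consecutiveRepeats`. The sample tool calls are invented for illustration.

```typescript
// Recursively sort object keys so key order does not affect the signature.
function sortKeys(value: unknown): unknown {
  if (value === null || typeof value !== 'object') return value
  if (Array.isArray(value)) return value.map(sortKeys)
  const sorted: Record<string, unknown> = {}
  for (const key of Object.keys(value as Record<string, unknown>).sort()) {
    sorted[key] = sortKeys((value as Record<string, unknown>)[key])
  }
  return sorted
}

// Count identical entries at the tail of the buffer (0 for an empty buffer).
function tailRepeats(buffer: readonly string[]): number {
  if (buffer.length === 0) return 0
  const last = buffer[buffer.length - 1]
  let count = 0
  for (let i = buffer.length - 1; i >= 0 && buffer[i] === last; i--) count++
  return count
}

const maxRepetitions = 3 // the default used by LoopDetector in this diff
const calls = [
  { name: 'grep', input: { pattern: 'TODO', path: 'src' } },
  { name: 'grep', input: { path: 'src', pattern: 'TODO' } }, // same call, different key order
  { name: 'grep', input: { pattern: 'TODO', path: 'src' } },
]
const signatures = calls.map(c => JSON.stringify({ name: c.name, input: sortKeys(c.input) }))
const loopDetected = tailRepeats(signatures) >= maxRepetitions
```

Because keys are sorted before serialisation, the second call counts as a repeat even though its arguments were written in a different order.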

View File

@ -21,7 +21,6 @@
*/
import type { AgentRunResult } from '../types.js'
import type { RunOptions } from './runner.js'
import type { Agent } from './agent.js'
import { Semaphore } from '../utils/semaphore.js'
@ -58,14 +57,6 @@ export interface PoolStatus {
export class AgentPool {
private readonly agents: Map<string, Agent> = new Map()
private readonly semaphore: Semaphore
/**
* Per-agent mutex (Semaphore(1)) to serialize concurrent runs on the same
* Agent instance. Without this, two tasks assigned to the same agent could
* race on mutable instance state (`status`, `messages`, `tokenUsage`).
*
* @see https://github.com/anthropics/open-multi-agent/issues/72
*/
private readonly agentLocks: Map<string, Semaphore> = new Map()
/** Cursor used by `runAny` for round-robin dispatch. */
private roundRobinIndex = 0
@ -77,16 +68,6 @@ export class AgentPool {
this.semaphore = new Semaphore(maxConcurrency)
}
/**
* Pool semaphore slots not currently held (`maxConcurrency - active`).
* Used to avoid deadlocks when a nested `run()` would wait forever for a slot
* held by the parent run. Best-effort only if multiple nested runs start in
* parallel after the same synchronous check.
*/
get availableRunSlots(): number {
return this.maxConcurrency - this.semaphore.active
}
// -------------------------------------------------------------------------
// Registry operations
// -------------------------------------------------------------------------
@ -104,7 +85,6 @@ export class AgentPool {
)
}
this.agents.set(agent.name, agent)
this.agentLocks.set(agent.name, new Semaphore(1))
}
/**
@ -117,7 +97,6 @@ export class AgentPool {
throw new Error(`AgentPool: agent '${name}' is not registered.`)
}
this.agents.delete(name)
this.agentLocks.delete(name)
}
/**
@ -144,50 +123,12 @@ export class AgentPool {
*
* @throws {Error} If the agent name is not found.
*/
async run(
agentName: string,
prompt: string,
runOptions?: Partial<RunOptions>,
): Promise<AgentRunResult> {
async run(agentName: string, prompt: string): Promise<AgentRunResult> {
const agent = this.requireAgent(agentName)
const agentLock = this.agentLocks.get(agentName)!
// Acquire per-agent lock first so the second call for the same agent waits
// here without consuming a pool slot. Then acquire the pool semaphore.
await agentLock.acquire()
try {
await this.semaphore.acquire()
try {
return await agent.run(prompt, runOptions)
} finally {
this.semaphore.release()
}
} finally {
agentLock.release()
}
}
/**
* Run a prompt on a caller-supplied Agent instance, acquiring only the pool
* semaphore: no per-agent lock, no registry lookup.
*
* Designed for delegation: each delegated call should use a **fresh** Agent
* instance (matching `delegate_to_agent`'s "runs in a fresh conversation"
* semantics), so the per-agent mutex used by {@link run} would be dead
* weight and, worse, a deadlock vector for mutual delegation (A→B while
* B→A, each caller holding its own `run`'s agent lock).
*
* The caller is responsible for constructing the Agent; {@link AgentPool}
* does not register or track it.
*/
async runEphemeral(
agent: Agent,
prompt: string,
runOptions?: Partial<RunOptions>,
): Promise<AgentRunResult> {
await this.semaphore.acquire()
try {
return await agent.run(prompt, runOptions)
return await agent.run(prompt)
} finally {
this.semaphore.release()
}
@ -203,7 +144,6 @@ export class AgentPool {
*
* @param tasks - Array of `{ agent, prompt }` descriptors.
*/
// TODO(#18): accept RunOptions per task to forward trace context
async runParallel(
tasks: ReadonlyArray<{ readonly agent: string; readonly prompt: string }>,
): Promise<Map<string, AgentRunResult>> {
@ -242,7 +182,6 @@ export class AgentPool {
*
* @throws {Error} If the pool is empty.
*/
// TODO(#18): accept RunOptions to forward trace context
async runAny(prompt: string): Promise<AgentRunResult> {
const allAgents = this.list()
if (allAgents.length === 0) {
@ -254,18 +193,11 @@ export class AgentPool {
const agent = allAgents[this.roundRobinIndex]!
this.roundRobinIndex = (this.roundRobinIndex + 1) % allAgents.length
const agentLock = this.agentLocks.get(agent.name)!
await agentLock.acquire()
await this.semaphore.acquire()
try {
await this.semaphore.acquire()
try {
return await agent.run(prompt)
} finally {
this.semaphore.release()
}
return await agent.run(prompt)
} finally {
agentLock.release()
this.semaphore.release()
}
}
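
The lock ordering used by `run()` in this diff (per-agent lock first, then the pool semaphore) can be sketched on its own. The `Semaphore` class below is a minimal stand-in written for this illustration; the framework's `../utils/semaphore.js` may differ. `runLocked` is a hypothetical name.

```typescript
// Minimal counting semaphore (illustrative stand-in, not the framework's class).
class Semaphore {
  private readonly max: number
  private current = 0
  private readonly queue: Array<() => void> = []
  constructor(max: number) { this.max = max }
  get active(): number { return this.current }
  get waiting(): number { return this.queue.length }
  acquire(): Promise<void> {
    if (this.current < this.max) { this.current++; return Promise.resolve() }
    return new Promise<void>(resolve => { this.queue.push(resolve) })
  }
  release(): void {
    const next = this.queue.shift()
    if (next) next()      // hand the slot straight to the next waiter
    else this.current--
  }
}

// Agent lock first, pool slot second; release in reverse order.
async function runLocked<T>(agentLock: Semaphore, pool: Semaphore, job: () => Promise<T>): Promise<T> {
  await agentLock.acquire()
  try {
    await pool.acquire()
    try { return await job() } finally { pool.release() }
  } finally {
    agentLock.release()
  }
}

const pool = new Semaphore(2)
const lock = new Semaphore(1) // one lock per agent name
const order: string[] = []
const job = (label: string) => async () => { order.push(label); return label }

const first = runLocked(lock, pool, job('first'))
const second = runLocked(lock, pool, job('second'))
const queuedOnAgentLock = lock.waiting // the second call is parked on the agent lock
```

With this ordering, a second call for the same agent waits on the agent lock without taking a pool slot, which is the reason the diff's comment gives for acquiring it first.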

View File

@ -23,38 +23,12 @@ import type {
StreamEvent,
ToolResult,
ToolUseContext,
TeamInfo,
LLMAdapter,
LLMChatOptions,
TraceEvent,
LoopDetectionConfig,
LoopDetectionInfo,
LLMToolDef,
ContextStrategy,
} from '../types.js'
import { TokenBudgetExceededError } from '../errors.js'
import { LoopDetector } from './loop-detector.js'
import { emitTrace } from '../utils/trace.js'
import { estimateTokens } from '../utils/tokens.js'
import type { ToolRegistry } from '../tool/framework.js'
import type { ToolExecutor } from '../tool/executor.js'
// ---------------------------------------------------------------------------
// Tool presets
// ---------------------------------------------------------------------------
/** Predefined tool sets for common agent use cases. */
export const TOOL_PRESETS = {
readonly: ['file_read', 'grep', 'glob'],
readwrite: ['file_read', 'file_write', 'file_edit', 'grep', 'glob'],
full: ['file_read', 'file_write', 'file_edit', 'grep', 'glob', 'bash'],
} as const satisfies Record<string, readonly string[]>
/** Framework-level disallowed tools for safety rails. */
export const AGENT_FRAMEWORK_DISALLOWED: readonly string[] = [
// Empty for now, infrastructure for future built-in tools
]
// ---------------------------------------------------------------------------
// Public interfaces
// ---------------------------------------------------------------------------
@ -80,30 +54,15 @@ export interface RunnerOptions {
/** AbortSignal that cancels any in-flight adapter call and stops the loop. */
readonly abortSignal?: AbortSignal
/**
* Tool access control configuration.
* - `toolPreset`: Predefined tool sets for common use cases
* - `allowedTools`: Whitelist of tool names (allowlist)
* - `disallowedTools`: Blacklist of tool names (denylist)
* Tools are resolved in order: preset → allowlist → denylist
* Whitelist of tool names this runner is allowed to use.
* When provided, only tools whose name appears in this list are sent to the
* LLM. When omitted, all registered tools are available.
*/
readonly toolPreset?: 'readonly' | 'readwrite' | 'full'
readonly allowedTools?: readonly string[]
readonly disallowedTools?: readonly string[]
/** Display name of the agent driving this runner (used in tool context). */
readonly agentName?: string
/** Short role description of the agent (used in tool context). */
readonly agentRole?: string
/** Loop detection configuration. When set, detects stuck agent loops. */
readonly loopDetection?: LoopDetectionConfig
/** Maximum cumulative tokens (input + output) allowed for this run. */
readonly maxTokenBudget?: number
/** Optional context compression strategy for long multi-turn runs. */
readonly contextStrategy?: ContextStrategy
/**
* Compress tool results that the agent has already processed.
* See {@link AgentConfig.compressToolResults} for details.
*/
readonly compressToolResults?: boolean | { readonly minChars?: number }
}
/**
@ -117,29 +76,6 @@ export interface RunOptions {
readonly onToolResult?: (name: string, result: ToolResult) => void
/** Fired after each complete {@link LLMMessage} is appended. */
readonly onMessage?: (message: LLMMessage) => void
/**
* Fired when the runner detects a potential configuration issue.
* For example, when a model appears to ignore tool definitions.
*/
readonly onWarning?: (message: string) => void
/** Trace callback for observability spans. Async callbacks are safe. */
readonly onTrace?: (event: TraceEvent) => void | Promise<void>
/** Run ID for trace correlation. */
readonly runId?: string
/** Task ID for trace correlation. */
readonly taskId?: string
/** Agent name for trace correlation (overrides RunnerOptions.agentName). */
readonly traceAgent?: string
/**
* Per-call abort signal. When set, takes precedence over the static
* {@link RunnerOptions.abortSignal}. Useful for per-run timeouts.
*/
readonly abortSignal?: AbortSignal
/**
* Team context for built-in tools such as `delegate_to_agent`.
* Injected by the orchestrator during `runTeam` / `runTasks` pool runs.
*/
readonly team?: TeamInfo
}
/** The aggregated result returned when a full run completes. */
@ -154,10 +90,6 @@ export interface RunResult {
readonly tokenUsage: TokenUsage
/** Total number of LLM turns (including tool-call follow-ups). */
readonly turns: number
/** True when the run was terminated or warned due to loop detection. */
readonly loopDetected?: boolean
/** True when the run was terminated due to token budget limits. */
readonly budgetExceeded?: boolean
}
// ---------------------------------------------------------------------------
@ -187,34 +119,6 @@ function addTokenUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }
/** Default minimum content length before tool result compression kicks in. */
const DEFAULT_MIN_COMPRESS_CHARS = 500
/**
* Prepends synthetic framing text to the first user message so we never emit
* consecutive `user` turns (Bedrock) and summaries do not concatenate onto
* the original user prompt (direct API). If there is no user message yet,
* inserts a single assistant text preamble.
*/
function prependSyntheticPrefixToFirstUser(
messages: LLMMessage[],
prefix: string,
): LLMMessage[] {
const userIdx = messages.findIndex(m => m.role === 'user')
if (userIdx < 0) {
return [{
role: 'assistant',
content: [{ type: 'text', text: prefix.trimEnd() }],
}, ...messages]
}
const target = messages[userIdx]!
const merged: LLMMessage = {
role: 'user',
content: [{ type: 'text', text: prefix }, ...target.content],
}
return [...messages.slice(0, userIdx), merged, ...messages.slice(userIdx + 1)]
}
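A minimal usage sketch of the helper above, assuming a simplified, hypothetical `Msg` type that carries only text blocks (the real `LLMMessage` supports more block kinds). The function is re-declared as `prependPrefix` so the snippet runs on its own; it exercises both branches: merging into the first `user` message, and falling back to an assistant preamble when no `user` message exists.

```typescript
// Hypothetical stand-ins for the real LLMMessage / TextBlock types.
type TextBlock = { type: 'text'; text: string }
type Msg = { role: 'user' | 'assistant'; content: TextBlock[] }

function prependPrefix(messages: Msg[], prefix: string): Msg[] {
  const userIdx = messages.findIndex(m => m.role === 'user')
  if (userIdx < 0) {
    // No user message yet: emit the prefix as a standalone assistant preamble.
    const preamble: Msg = {
      role: 'assistant',
      content: [{ type: 'text', text: prefix.trimEnd() }],
    }
    return [preamble, ...messages]
  }
  // Merge the prefix into the first user message instead of adding a new
  // user turn, so no two user turns end up adjacent.
  const target = messages[userIdx]!
  const merged: Msg = {
    role: 'user',
    content: [{ type: 'text', text: prefix }, ...target.content],
  }
  return [...messages.slice(0, userIdx), merged, ...messages.slice(userIdx + 1)]
}

const history: Msg[] = [
  { role: 'assistant', content: [{ type: 'text', text: 'calling tool' }] },
  { role: 'user', content: [{ type: 'text', text: 'tool output' }] },
]

// Branch 1: the summary text becomes the first block of the existing user message.
const framed = prependPrefix(history, '[Conversation summary]\n- goal: fix bug\n\n')

// Branch 2: no user message present, so a preamble message is inserted.
const noUser = prependPrefix([history[0]!], '[note]\n\n')
```

In the first call the message count is unchanged and only the user message gains a leading text block; in the second call the list grows by one message.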
// ---------------------------------------------------------------------------
// AgentRunner
// ---------------------------------------------------------------------------
@ -234,10 +138,6 @@ function prependSyntheticPrefixToFirstUser(
*/
export class AgentRunner {
private readonly maxTurns: number
private summarizeCache: {
oldSignature: string
summaryPrefix: string
} | null = null
constructor(
private readonly adapter: LLMAdapter,
@ -248,242 +148,6 @@ export class AgentRunner {
this.maxTurns = options.maxTurns ?? 10
}
private serializeMessage(message: LLMMessage): string {
return JSON.stringify(message)
}
private truncateToSlidingWindow(messages: LLMMessage[], maxTurns: number): LLMMessage[] {
if (maxTurns <= 0) {
return messages
}
const firstUserIndex = messages.findIndex(m => m.role === 'user')
const firstUser = firstUserIndex >= 0 ? messages[firstUserIndex]! : null
const afterFirst = firstUserIndex >= 0
? messages.slice(firstUserIndex + 1)
: messages.slice()
if (afterFirst.length <= maxTurns * 2) {
return messages
}
const kept = afterFirst.slice(-maxTurns * 2)
const result: LLMMessage[] = []
if (firstUser !== null) {
result.push(firstUser)
}
const droppedPairs = Math.floor((afterFirst.length - kept.length) / 2)
if (droppedPairs > 0) {
const notice =
`[Earlier conversation history truncated — ${droppedPairs} turn(s) removed]\n\n`
result.push(...prependSyntheticPrefixToFirstUser(kept, notice))
return result
}
result.push(...kept)
return result
}
private async summarizeMessages(
messages: LLMMessage[],
maxTokens: number,
summaryModel: string | undefined,
baseChatOptions: LLMChatOptions,
turns: number,
options: RunOptions,
): Promise<{ messages: LLMMessage[]; usage: TokenUsage }> {
const estimated = estimateTokens(messages)
if (estimated <= maxTokens || messages.length < 4) {
return { messages, usage: ZERO_USAGE }
}
const firstUserIndex = messages.findIndex(m => m.role === 'user')
if (firstUserIndex < 0 || firstUserIndex === messages.length - 1) {
return { messages, usage: ZERO_USAGE }
}
const firstUser = messages[firstUserIndex]!
const rest = messages.slice(firstUserIndex + 1)
if (rest.length < 2) {
return { messages, usage: ZERO_USAGE }
}
// Split on an even boundary so we never separate a tool_use assistant turn
// from its tool_result user message (rest is user/assistant pairs).
const splitAt = Math.max(2, Math.floor(rest.length / 4) * 2)
const oldPortion = rest.slice(0, splitAt)
const recentPortion = rest.slice(splitAt)
const oldSignature = oldPortion.map(m => this.serializeMessage(m)).join('\n')
if (this.summarizeCache !== null && this.summarizeCache.oldSignature === oldSignature) {
const mergedRecent = prependSyntheticPrefixToFirstUser(
recentPortion,
`${this.summarizeCache.summaryPrefix}\n\n`,
)
return { messages: [firstUser, ...mergedRecent], usage: ZERO_USAGE }
}
const summaryPrompt = [
'Summarize the following conversation history for an LLM.',
'- Preserve user goals, constraints, and decisions.',
'- Keep key tool outputs and unresolved questions.',
'- Use concise bullets.',
'- Do not fabricate details.',
].join('\n')
const summaryInput: LLMMessage[] = [
{
role: 'user',
content: [
{ type: 'text', text: summaryPrompt },
{ type: 'text', text: `\n\nConversation:\n${oldSignature}` },
],
},
]
const summaryOptions: LLMChatOptions = {
...baseChatOptions,
model: summaryModel ?? this.options.model,
tools: undefined,
}
const summaryStartMs = Date.now()
const summaryResponse = await this.adapter.chat(summaryInput, summaryOptions)
if (options.onTrace) {
const summaryEndMs = Date.now()
emitTrace(options.onTrace, {
type: 'llm_call',
runId: options.runId ?? '',
taskId: options.taskId,
agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
model: summaryOptions.model,
phase: 'summary',
turn: turns,
tokens: summaryResponse.usage,
startMs: summaryStartMs,
endMs: summaryEndMs,
durationMs: summaryEndMs - summaryStartMs,
})
}
const summaryText = extractText(summaryResponse.content).trim()
const summaryPrefix = summaryText.length > 0
? `[Conversation summary]\n${summaryText}`
: '[Conversation summary unavailable]'
this.summarizeCache = { oldSignature, summaryPrefix }
const mergedRecent = prependSyntheticPrefixToFirstUser(
recentPortion,
`${summaryPrefix}\n\n`,
)
return {
messages: [firstUser, ...mergedRecent],
usage: summaryResponse.usage,
}
}
private async applyContextStrategy(
messages: LLMMessage[],
strategy: ContextStrategy,
baseChatOptions: LLMChatOptions,
turns: number,
options: RunOptions,
): Promise<{ messages: LLMMessage[]; usage: TokenUsage }> {
if (strategy.type === 'sliding-window') {
return { messages: this.truncateToSlidingWindow(messages, strategy.maxTurns), usage: ZERO_USAGE }
}
if (strategy.type === 'summarize') {
return this.summarizeMessages(
messages,
strategy.maxTokens,
strategy.summaryModel,
baseChatOptions,
turns,
options,
)
}
if (strategy.type === 'compact') {
return { messages: this.compactMessages(messages, strategy), usage: ZERO_USAGE }
}
const estimated = estimateTokens(messages)
const compressed = await strategy.compress(messages, estimated)
if (!Array.isArray(compressed) || compressed.length === 0) {
throw new Error('contextStrategy.custom.compress must return a non-empty LLMMessage[]')
}
return { messages: compressed, usage: ZERO_USAGE }
}
// -------------------------------------------------------------------------
// Tool resolution
// -------------------------------------------------------------------------
/**
* Resolve the final set of tools available to this agent based on the
* three-layer configuration: preset → allowlist → denylist → framework safety.
*
* Returns LLMToolDef[] for direct use with LLM adapters.
*/
private resolveTools(): LLMToolDef[] {
// Validate configuration for contradictions
if (this.options.toolPreset && this.options.allowedTools) {
console.warn(
'AgentRunner: both toolPreset and allowedTools are set. ' +
'Final tool access will be the intersection of both.'
)
}
if (this.options.allowedTools && this.options.disallowedTools) {
const overlap = this.options.allowedTools.filter(tool =>
this.options.disallowedTools!.includes(tool)
)
if (overlap.length > 0) {
console.warn(
`AgentRunner: tools [${overlap.map(name => `"${name}"`).join(', ')}] appear in both allowedTools and disallowedTools. ` +
'This is contradictory and may lead to unexpected behavior.'
)
}
}
const allTools = this.toolRegistry.toToolDefs()
const runtimeCustomTools = this.toolRegistry.toRuntimeToolDefs()
const runtimeCustomToolNames = new Set(runtimeCustomTools.map(t => t.name))
let filteredTools = allTools.filter(t => !runtimeCustomToolNames.has(t.name))
// 1. Apply preset filter if set
if (this.options.toolPreset) {
const presetTools = new Set(TOOL_PRESETS[this.options.toolPreset] as readonly string[])
filteredTools = filteredTools.filter(t => presetTools.has(t.name))
}
// 2. Apply allowlist filter if set
if (this.options.allowedTools) {
filteredTools = filteredTools.filter(t => this.options.allowedTools!.includes(t.name))
}
// 3. Apply denylist filter if set
const denied = this.options.disallowedTools
? new Set(this.options.disallowedTools)
: undefined
if (denied) {
filteredTools = filteredTools.filter(t => !denied.has(t.name))
}
// 4. Apply framework-level safety rails
const frameworkDenied = new Set(AGENT_FRAMEWORK_DISALLOWED)
filteredTools = filteredTools.filter(t => !frameworkDenied.has(t.name))
// Runtime-added custom tools bypass preset / allowlist but respect denylist.
const finalRuntime = denied
? runtimeCustomTools.filter(t => !denied.has(t.name))
: runtimeCustomTools
return [...filteredTools, ...finalRuntime]
}
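The layered filtering in `resolveTools()` can be sketched on plain tool names. This is an illustration under stated assumptions, not the framework's code: the `PRESETS` table, the `FRAMEWORK_DENIED` list, and every tool name below are hypothetical placeholders, since the real `TOOL_PRESETS` and `AGENT_FRAMEWORK_DISALLOWED` contents are not shown in this diff.

```typescript
// Hypothetical preset table (stand-in for TOOL_PRESETS).
const PRESETS: Record<string, readonly string[]> = {
  readonly: ['file_read', 'grep'],
  full: ['file_read', 'grep', 'file_write', 'bash'],
}
// Hypothetical stand-in for AGENT_FRAMEWORK_DISALLOWED.
const FRAMEWORK_DENIED: readonly string[] = ['delegate_to_agent']

interface ToolAccess {
  preset?: string
  allowed?: readonly string[]
  denied?: readonly string[]
}

function resolveToolNames(
  registered: readonly string[],
  runtimeAdded: readonly string[],
  access: ToolAccess,
): string[] {
  // Runtime-added tools are handled separately at the end.
  const runtime = new Set(runtimeAdded)
  let names = registered.filter(n => !runtime.has(n))

  // 1. Preset filter.
  if (access.preset !== undefined) {
    const preset = new Set(PRESETS[access.preset] ?? [])
    names = names.filter(n => preset.has(n))
  }
  // 2. Allowlist filter (intersects with the preset when both are set).
  if (access.allowed) {
    const allowed = new Set(access.allowed)
    names = names.filter(n => allowed.has(n))
  }
  // 3. Denylist filter.
  const denied = new Set(access.denied ?? [])
  names = names.filter(n => !denied.has(n))
  // 4. Framework-level safety rails.
  const frameworkDenied = new Set(FRAMEWORK_DENIED)
  names = names.filter(n => !frameworkDenied.has(n))

  // Runtime-added tools skip the preset and allowlist but still honor the denylist.
  const runtimeKept = runtimeAdded.filter(n => !denied.has(n))
  return [...names, ...runtimeKept]
}

const resolved = resolveToolNames(
  ['file_read', 'grep', 'file_write', 'bash', 'delegate_to_agent', 'my_tool'],
  ['my_tool'],
  { preset: 'full', allowed: ['file_read', 'bash', 'delegate_to_agent'], denied: ['bash'] },
)
```

With these inputs `resolved` is `['file_read', 'my_tool']`: `bash` is removed by the denylist even though it is allowlisted, `delegate_to_agent` never passes the preset, and `my_tool` survives because it was added at runtime and is not denied.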
// -------------------------------------------------------------------------
// Public API
// -------------------------------------------------------------------------
@ -502,7 +166,13 @@ export class AgentRunner {
options: RunOptions = {},
): Promise<RunResult> {
// Collect everything yielded by the internal streaming loop.
const accumulated: RunResult = {
const accumulated: {
messages: LLMMessage[]
output: string
toolCalls: ToolCallRecord[]
tokenUsage: TokenUsage
turns: number
} = {
messages: [],
output: '',
toolCalls: [],
@ -512,9 +182,12 @@ export class AgentRunner {
for await (const event of this.stream(messages, options)) {
if (event.type === 'done') {
Object.assign(accumulated, event.data)
} else if (event.type === 'error') {
throw event.data
const result = event.data as RunResult
accumulated.messages = result.messages
accumulated.output = result.output
accumulated.toolCalls = result.toolCalls
accumulated.tokenUsage = result.tokenUsage
accumulated.turns = result.turns
}
}
@ -528,7 +201,6 @@ export class AgentRunner {
* - `{ type: 'text', data: string }` for each text delta
* - `{ type: 'tool_use', data: ToolUseBlock }` when the model requests a tool
* - `{ type: 'tool_result', data: ToolResultBlock }` after each execution
* - `{ type: 'budget_exceeded', data: TokenBudgetExceededError }` on budget trip
* - `{ type: 'done', data: RunResult }` at the very end
* - `{ type: 'error', data: Error }` on unrecoverable failure
*/
@ -537,21 +209,21 @@ export class AgentRunner {
options: RunOptions = {},
): AsyncGenerator<StreamEvent> {
// Working copy of the conversation — mutated as turns progress.
let conversationMessages: LLMMessage[] = [...initialMessages]
const conversationMessages: LLMMessage[] = [...initialMessages]
// Accumulated state across all turns.
let totalUsage: TokenUsage = ZERO_USAGE
const allToolCalls: ToolCallRecord[] = []
let finalOutput = ''
let turns = 0
let budgetExceeded = false
// Build the stable LLM options once; model / tokens / temp don't change.
// resolveTools() returns LLMToolDef[] with three-layer filtering applied.
const toolDefs = this.resolveTools()
// Per-call abortSignal takes precedence over the static one.
const effectiveAbortSignal = options.abortSignal ?? this.options.abortSignal
// toToolDefs() returns LLMToolDef[] (inputSchema, camelCase) — matches
// LLMChatOptions.tools from types.ts directly.
const allDefs = this.toolRegistry.toToolDefs()
const toolDefs = this.options.allowedTools
? allDefs.filter(d => this.options.allowedTools!.includes(d.name))
: allDefs
const baseChatOptions: LLMChatOptions = {
model: this.options.model,
@ -559,24 +231,16 @@ export class AgentRunner {
maxTokens: this.options.maxTokens,
temperature: this.options.temperature,
systemPrompt: this.options.systemPrompt,
abortSignal: effectiveAbortSignal,
abortSignal: this.options.abortSignal,
}
// Loop detection state — only allocated when configured.
const detector = this.options.loopDetection
? new LoopDetector(this.options.loopDetection)
: null
let loopDetected = false
let loopWarned = false
const loopAction = this.options.loopDetection?.onLoopDetected ?? 'warn'
try {
// -----------------------------------------------------------------
// Main agentic loop — `while (true)` until end_turn or maxTurns
// -----------------------------------------------------------------
while (true) {
// Respect abort before each LLM call.
if (effectiveAbortSignal?.aborted) {
if (this.options.abortSignal?.aborted) {
break
}
@ -587,46 +251,10 @@ export class AgentRunner {
turns++
// Compress consumed tool results before context strategy (lightweight,
// no LLM calls) so the strategy operates on already-reduced messages.
if (this.options.compressToolResults && turns > 1) {
conversationMessages = this.compressConsumedToolResults(conversationMessages)
}
// Optionally compact context before each LLM call after the first turn.
if (this.options.contextStrategy && turns > 1) {
const compacted = await this.applyContextStrategy(
conversationMessages,
this.options.contextStrategy,
baseChatOptions,
turns,
options,
)
conversationMessages = compacted.messages
totalUsage = addTokenUsage(totalUsage, compacted.usage)
}
// ------------------------------------------------------------------
// Step 1: Call the LLM and collect the full response for this turn.
// ------------------------------------------------------------------
const llmStartMs = Date.now()
const response = await this.adapter.chat(conversationMessages, baseChatOptions)
if (options.onTrace) {
const llmEndMs = Date.now()
emitTrace(options.onTrace, {
type: 'llm_call',
runId: options.runId ?? '',
taskId: options.taskId,
agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
model: this.options.model,
phase: 'turn',
turn: turns,
tokens: response.usage,
startMs: llmStartMs,
endMs: llmEndMs,
durationMs: llmEndMs - llmStartMs,
})
}
totalUsage = addTokenUsage(totalUsage, response.usage)
@ -647,104 +275,32 @@ export class AgentRunner {
yield { type: 'text', data: turnText } satisfies StreamEvent
}
const totalTokens = totalUsage.input_tokens + totalUsage.output_tokens
if (this.options.maxTokenBudget !== undefined && totalTokens > this.options.maxTokenBudget) {
budgetExceeded = true
finalOutput = turnText
yield {
type: 'budget_exceeded',
data: new TokenBudgetExceededError(
this.options.agentName ?? 'unknown',
totalTokens,
this.options.maxTokenBudget,
),
} satisfies StreamEvent
break
}
// Extract tool-use blocks for detection and execution.
// Announce each tool-use block the model requested.
const toolUseBlocks = extractToolUseBlocks(response.content)
// ------------------------------------------------------------------
// Step 2.5: Loop detection — check before yielding tool_use events
// so that terminate mode doesn't emit orphaned tool_use without
// matching tool_result.
// ------------------------------------------------------------------
let injectWarning = false
let injectWarningKind: 'tool_repetition' | 'text_repetition' = 'tool_repetition'
if (detector && toolUseBlocks.length > 0) {
const toolInfo = detector.recordToolCalls(toolUseBlocks)
const textInfo = turnText.length > 0 ? detector.recordText(turnText) : null
const info = toolInfo ?? textInfo
if (info) {
yield { type: 'loop_detected', data: info } satisfies StreamEvent
options.onWarning?.(info.detail)
const action = typeof loopAction === 'function'
? await loopAction(info)
: loopAction
if (action === 'terminate') {
loopDetected = true
finalOutput = turnText
break
} else if (action === 'warn' || action === 'inject') {
if (loopWarned) {
// Second detection after a warning — force terminate.
loopDetected = true
finalOutput = turnText
break
}
loopWarned = true
injectWarning = true
injectWarningKind = info.kind
// Fall through to execute tools, then inject warning.
}
// 'continue' — do nothing, let the loop proceed normally.
} else {
// No loop detected this turn — agent has recovered, so reset
// the warning state. A future loop gets a fresh warning cycle.
loopWarned = false
}
for (const block of toolUseBlocks) {
yield { type: 'tool_use', data: block } satisfies StreamEvent
}
// ------------------------------------------------------------------
// Step 3: Decide whether to continue looping.
// ------------------------------------------------------------------
if (toolUseBlocks.length === 0) {
// Warn on first turn if tools were provided but model didn't use them.
if (turns === 1 && toolDefs.length > 0 && options.onWarning) {
const agentName = this.options.agentName ?? 'unknown'
options.onWarning(
`Agent "${agentName}" has ${toolDefs.length} tool(s) available but the model ` +
`returned no tool calls. If using a local model, verify it supports tool calling ` +
`(see https://ollama.com/search?c=tools).`,
)
}
// No tools requested — this is the terminal assistant turn.
finalOutput = turnText
break
}
// Announce each tool-use block the model requested (after loop
// detection, so terminate mode never emits unpaired events).
for (const block of toolUseBlocks) {
yield { type: 'tool_use', data: block } satisfies StreamEvent
}
// ------------------------------------------------------------------
// Step 4: Execute all tool calls in PARALLEL.
//
// Parallel execution is critical for multi-tool responses where the
// tools are independent (e.g. reading several files at once).
// ------------------------------------------------------------------
const toolContext: ToolUseContext = this.buildToolContext(options)
const toolContext: ToolUseContext = this.buildToolContext()
const executionPromises = toolUseBlocks.map(async (block): Promise<{
resultBlock: ToolResultBlock
record: ToolCallRecord
delegationUsage?: TokenUsage
}> => {
options.onToolCall?.(block.name, block.input)
@ -763,25 +319,10 @@ export class AgentRunner {
result = { data: message, isError: true }
}
const endTime = Date.now()
const duration = endTime - startTime
const duration = Date.now() - startTime
options.onToolResult?.(block.name, result)
if (options.onTrace) {
emitTrace(options.onTrace, {
type: 'tool_call',
runId: options.runId ?? '',
taskId: options.taskId,
agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
tool: block.name,
isError: result.isError ?? false,
startMs: startTime,
endMs: endTime,
durationMs: duration,
})
}
const record: ToolCallRecord = {
toolName: block.name,
input: block.input,
@ -796,30 +337,12 @@ export class AgentRunner {
is_error: result.isError,
}
return {
resultBlock,
record,
...(result.metadata?.tokenUsage !== undefined
? { delegationUsage: result.metadata.tokenUsage }
: {}),
}
return { resultBlock, record }
})
// Wait for every tool in this turn to finish.
const executions = await Promise.all(executionPromises)
// Roll up any nested-run token usage surfaced via ToolResult.metadata
// (e.g. from delegate_to_agent) so it counts against this agent's budget.
let delegationTurnUsage: TokenUsage | undefined
for (const ex of executions) {
if (ex.delegationUsage !== undefined) {
totalUsage = addTokenUsage(totalUsage, ex.delegationUsage)
delegationTurnUsage = delegationTurnUsage === undefined
? ex.delegationUsage
: addTokenUsage(delegationTurnUsage, ex.delegationUsage)
}
}
// ------------------------------------------------------------------
// Step 5: Accumulate results and build the user message that carries
// them back to the LLM in the next turn.
@ -831,20 +354,6 @@ export class AgentRunner {
yield { type: 'tool_result', data: resultBlock } satisfies StreamEvent
}
// Inject a loop-detection warning into the tool-result message so
// the LLM sees it alongside the results (avoids two consecutive user
// messages which violates the alternating-role constraint).
if (injectWarning) {
const warningText = injectWarningKind === 'text_repetition'
? 'WARNING: You appear to be generating the same response repeatedly. ' +
'This suggests you are stuck in a loop. Please try a different approach ' +
'or provide new information.'
: 'WARNING: You appear to be repeating the same tool calls with identical arguments. ' +
'This suggests you are stuck in a loop. Please try a different approach, use different ' +
'parameters, or explain what you are trying to accomplish.'
toolResultBlocks.push({ type: 'text' as const, text: warningText })
}
const toolResultMessage: LLMMessage = {
role: 'user',
content: toolResultBlocks,
@ -853,27 +362,6 @@ export class AgentRunner {
conversationMessages.push(toolResultMessage)
options.onMessage?.(toolResultMessage)
// Budget check is deferred until tool_result events have been yielded
// and the tool_result user message has been appended, so stream
// consumers see matched tool_use/tool_result pairs and the returned
// `messages` remain resumable against the Anthropic/OpenAI APIs.
if (delegationTurnUsage !== undefined && this.options.maxTokenBudget !== undefined) {
const totalAfterDelegation = totalUsage.input_tokens + totalUsage.output_tokens
if (totalAfterDelegation > this.options.maxTokenBudget) {
budgetExceeded = true
finalOutput = turnText
yield {
type: 'budget_exceeded',
data: new TokenBudgetExceededError(
this.options.agentName ?? 'unknown',
totalAfterDelegation,
this.options.maxTokenBudget,
),
} satisfies StreamEvent
break
}
}
// Loop back to Step 1 — send updated conversation to the LLM.
}
} catch (err) {
@ -899,8 +387,6 @@ export class AgentRunner {
toolCalls: allToolCalls,
tokenUsage: totalUsage,
turns,
...(loopDetected ? { loopDetected: true } : {}),
...(budgetExceeded ? { budgetExceeded: true } : {}),
}
yield { type: 'done', data: runResult } satisfies StreamEvent
@ -910,233 +396,18 @@ export class AgentRunner {
// Private helpers
// -------------------------------------------------------------------------
/**
* Rule-based selective context compaction (no LLM calls).
*
* Compresses old turns while preserving the conversation skeleton:
* - tool_use blocks (decisions) are always kept
* - Long tool_result content is replaced with a compact marker
* - Long assistant text blocks are truncated with an excerpt
* - Error tool_results are never compressed
* - Recent turns (within `preserveRecentTurns`) are kept intact
*/
private compactMessages(
messages: LLMMessage[],
strategy: Extract<ContextStrategy, { type: 'compact' }>,
): LLMMessage[] {
const estimated = estimateTokens(messages)
if (estimated <= strategy.maxTokens) {
return messages
}
const preserveRecent = strategy.preserveRecentTurns ?? 4
const minToolResultChars = strategy.minToolResultChars ?? 200
const minTextBlockChars = strategy.minTextBlockChars ?? 2000
const textBlockExcerptChars = strategy.textBlockExcerptChars ?? 200
// Find the first user message — it is always preserved as-is.
const firstUserIndex = messages.findIndex(m => m.role === 'user')
if (firstUserIndex < 0 || firstUserIndex === messages.length - 1) {
return messages
}
// Walk backward to find the boundary between old and recent turns.
// A "turn pair" is an assistant message followed by a user message.
let boundary = messages.length
let pairsFound = 0
for (let i = messages.length - 1; i > firstUserIndex && pairsFound < preserveRecent; i--) {
if (messages[i]!.role === 'user' && i > 0 && messages[i - 1]!.role === 'assistant') {
pairsFound++
boundary = i - 1
}
}
// If all turns fit within the recent window, nothing to compact.
if (boundary <= firstUserIndex + 1) {
return messages
}
// Build a tool_use_id → tool name lookup from old assistant messages.
const toolNameMap = new Map<string, string>()
for (let i = firstUserIndex + 1; i < boundary; i++) {
const msg = messages[i]!
if (msg.role !== 'assistant') continue
for (const block of msg.content) {
if (block.type === 'tool_use') {
toolNameMap.set(block.id, block.name)
}
}
}
// Process old messages (between first user and boundary).
let anyChanged = false
const result: LLMMessage[] = []
for (let i = 0; i < messages.length; i++) {
// First user message and recent messages: keep intact.
if (i <= firstUserIndex || i >= boundary) {
result.push(messages[i]!)
continue
}
const msg = messages[i]!
let msgChanged = false
const newContent = msg.content.map((block): ContentBlock => {
if (msg.role === 'assistant') {
// tool_use blocks: always preserve (decisions).
if (block.type === 'tool_use') return block
// Long text blocks: truncate with excerpt.
if (block.type === 'text' && block.text.length >= minTextBlockChars) {
msgChanged = true
return {
type: 'text',
text: `${block.text.slice(0, textBlockExcerptChars)}... [truncated — ${block.text.length} chars total]`,
} satisfies TextBlock
}
// Image blocks in old turns: replace with marker.
if (block.type === 'image') {
msgChanged = true
return { type: 'text', text: '[Image compacted]' } satisfies TextBlock
}
return block
}
// User messages in old zone.
if (block.type === 'tool_result') {
// Error results: always preserve.
if (block.is_error) return block
// Already compressed by compressToolResults or a prior compact pass.
if (
block.content.startsWith('[Tool output compressed') ||
block.content.startsWith('[Tool result:')
) {
return block
}
// Short results: preserve.
if (block.content.length < minToolResultChars) return block
const toolName = toolNameMap.get(block.tool_use_id) ?? 'unknown'
// Delegation results: preserve — parent agent may still reason over them.
if (toolName === 'delegate_to_agent') return block
// Compress.
msgChanged = true
return {
type: 'tool_result',
tool_use_id: block.tool_use_id,
content: `[Tool result: ${toolName} - ${block.content.length} chars, compacted]`,
} satisfies ToolResultBlock
}
return block
})
if (msgChanged) {
anyChanged = true
result.push({ role: msg.role, content: newContent } as LLMMessage)
} else {
result.push(msg)
}
}
return anyChanged ? result : messages
}
/**
* Replace consumed tool results with compact markers.
*
* A tool_result is "consumed" when the assistant has produced a response
* after seeing it (i.e. there is an assistant message following the user
* message that contains the tool_result). The most recent user message
* with tool results is always kept intact, since the LLM is about to see it.
*
* Error results and results shorter than `minChars` are never compressed.
*/
private compressConsumedToolResults(messages: LLMMessage[]): LLMMessage[] {
const config = this.options.compressToolResults
if (!config) return messages
const minChars = typeof config === 'object'
? (config.minChars ?? DEFAULT_MIN_COMPRESS_CHARS)
: DEFAULT_MIN_COMPRESS_CHARS
// Find the last user message that carries tool_result blocks.
let lastToolResultUserIdx = -1
for (let i = messages.length - 1; i >= 0; i--) {
if (
messages[i]!.role === 'user' &&
messages[i]!.content.some(b => b.type === 'tool_result')
) {
lastToolResultUserIdx = i
break
}
}
// Nothing to compress if there's at most one tool-result user message.
if (lastToolResultUserIdx <= 0) return messages
// Build a tool_use_id → tool name map so we can exempt delegation results,
// whose full output the parent agent may need to re-read in later turns.
const toolNameMap = new Map<string, string>()
for (const msg of messages) {
if (msg.role !== 'assistant') continue
for (const block of msg.content) {
if (block.type === 'tool_use') toolNameMap.set(block.id, block.name)
}
}
let anyChanged = false
const result = messages.map((msg, idx) => {
// Only compress user messages that appear before the last one.
if (msg.role !== 'user' || idx >= lastToolResultUserIdx) return msg
const hasToolResult = msg.content.some(b => b.type === 'tool_result')
if (!hasToolResult) return msg
let msgChanged = false
const newContent = msg.content.map((block): ContentBlock => {
if (block.type !== 'tool_result') return block
// Never compress error results — they carry diagnostic value.
if (block.is_error) return block
// Never compress delegation results — the parent agent relies on the full sub-agent output.
if (toolNameMap.get(block.tool_use_id) === 'delegate_to_agent') return block
// Skip already-compressed results — avoid re-compression with wrong char count.
if (block.content.startsWith('[Tool output compressed')) return block
// Skip short results — the marker itself has overhead.
if (block.content.length < minChars) return block
msgChanged = true
return {
type: 'tool_result',
tool_use_id: block.tool_use_id,
content: `[Tool output compressed — ${block.content.length} chars, already processed]`,
} satisfies ToolResultBlock
})
if (msgChanged) {
anyChanged = true
return { role: msg.role, content: newContent } as LLMMessage
}
return msg
})
return anyChanged ? result : messages
}
/**
* Build the {@link ToolUseContext} passed to every tool execution.
* Identifies this runner as the invoking agent.
*/
private buildToolContext(options: RunOptions = {}): ToolUseContext {
private buildToolContext(): ToolUseContext {
return {
agent: {
name: this.options.agentName ?? 'runner',
role: this.options.agentRole ?? 'assistant',
model: this.options.model,
},
abortSignal: options.abortSignal ?? this.options.abortSignal,
...(options.team !== undefined ? { team: options.team } : {}),
abortSignal: this.options.abortSignal,
}
}
}
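The consumed-tool-result compression in `compressConsumedToolResults` can be illustrated with a reduced sketch. It assumes simplified, hypothetical `Block` and `Msg` types, a plain `minChars` parameter, and a shortened marker string; it also leaves out the `delegate_to_agent` exemption that the method above applies.

```typescript
// Hypothetical, reduced message types for illustration only.
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; tool_use_id: string; content: string; is_error?: boolean }
type Msg = { role: 'user' | 'assistant'; content: Block[] }

function compressConsumed(messages: Msg[], minChars = 500): Msg[] {
  // Locate the last user message carrying tool results; it stays intact
  // because the model has not responded to it yet.
  let last = -1
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i]!
    if (m.role === 'user' && m.content.some(b => b.type === 'tool_result')) {
      last = i
      break
    }
  }
  if (last <= 0) return messages

  return messages.map((msg, idx): Msg => {
    if (msg.role !== 'user' || idx >= last) return msg
    const content = msg.content.map((b): Block => {
      // Keep non-tool blocks and error results untouched.
      if (b.type !== 'tool_result' || b.is_error) return b
      // Keep short results and results that already carry a marker.
      if (b.content.length < minChars) return b
      if (b.content.startsWith('[Tool output compressed')) return b
      return {
        type: 'tool_result',
        tool_use_id: b.tool_use_id,
        content: `[Tool output compressed, ${b.content.length} chars]`,
      }
    })
    return { role: msg.role, content }
  })
}

const conversation: Msg[] = [
  { role: 'user', content: [{ type: 'text', text: 'task' }] },
  { role: 'assistant', content: [{ type: 'text', text: 'reading file' }] },
  { role: 'user', content: [{ type: 'tool_result', tool_use_id: 't1', content: 'x'.repeat(600) }] },
  { role: 'assistant', content: [{ type: 'text', text: 'reading another file' }] },
  { role: 'user', content: [{ type: 'tool_result', tool_use_id: 't2', content: 'y'.repeat(700) }] },
]
const compressed = compressConsumed(conversation)
```

Here the 600-character result for `t1` is replaced by a marker because an assistant turn already followed it, while the 700-character result for `t2` is left as-is because it is the most recent tool-result message.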


@ -1,470 +0,0 @@
#!/usr/bin/env node
/**
* Thin shell/CI wrapper over OpenMultiAgent: no interactive session, cwd binding,
* approvals, or persistence.
*
* Exit codes:
* 0 finished; team run succeeded
* 1 finished; team run reported failure (agents/tasks)
* 2 invalid usage, I/O, or JSON validation
* 3 unexpected runtime error (including LLM errors)
*/
import { mkdir, writeFile } from 'node:fs/promises'
import { readFileSync } from 'node:fs'
import { join, resolve } from 'node:path'
import { fileURLToPath } from 'node:url'
import { OpenMultiAgent } from '../orchestrator/orchestrator.js'
import { renderTeamRunDashboard } from '../dashboard/render-team-run-dashboard.js'
import type { SupportedProvider } from '../llm/adapter.js'
import type { AgentRunResult, CoordinatorConfig, OrchestratorConfig, TeamConfig, TeamRunResult } from '../types.js'
// ---------------------------------------------------------------------------
// Exit codes
// ---------------------------------------------------------------------------
export const EXIT = {
SUCCESS: 0,
RUN_FAILED: 1,
USAGE: 2,
INTERNAL: 3,
} as const
class OmaValidationError extends Error {
override readonly name = 'OmaValidationError'
constructor(message: string) {
super(message)
}
}
// ---------------------------------------------------------------------------
// Provider helper (static reference data)
// ---------------------------------------------------------------------------
const PROVIDER_REFERENCE: ReadonlyArray<{
id: SupportedProvider
apiKeyEnv: readonly string[]
baseUrlSupported: boolean
notes?: string
}> = [
{ id: 'anthropic', apiKeyEnv: ['ANTHROPIC_API_KEY'], baseUrlSupported: true },
{ id: 'azure-openai', apiKeyEnv: ['AZURE_OPENAI_API_KEY', 'AZURE_OPENAI_ENDPOINT', 'AZURE_OPENAI_DEPLOYMENT'], baseUrlSupported: true, notes: 'Azure OpenAI requires endpoint URL (e.g., https://my-resource.openai.azure.com) and API key. Optional: AZURE_OPENAI_API_VERSION (defaults to 2024-10-21). Prefer setting deployment on agent.model; AZURE_OPENAI_DEPLOYMENT is a fallback when model is blank.' },
{ id: 'openai', apiKeyEnv: ['OPENAI_API_KEY'], baseUrlSupported: true, notes: 'Set baseURL for Ollama / vLLM / LM Studio; apiKey may be a placeholder.' },
{ id: 'gemini', apiKeyEnv: ['GEMINI_API_KEY', 'GOOGLE_API_KEY'], baseUrlSupported: false },
{ id: 'grok', apiKeyEnv: ['XAI_API_KEY'], baseUrlSupported: true },
{ id: 'minimax', apiKeyEnv: ['MINIMAX_API_KEY'], baseUrlSupported: true, notes: 'Global endpoint: https://api.minimax.io/v1 (default). China endpoint: https://api.minimaxi.com/v1. Set MINIMAX_BASE_URL to choose, or pass baseURL in agent config.' },
{ id: 'deepseek', apiKeyEnv: ['DEEPSEEK_API_KEY'], baseUrlSupported: true, notes: 'OpenAI-compatible endpoint at https://api.deepseek.com/v1. Models: deepseek-chat (V3), deepseek-reasoner (thinking).' },
{
id: 'copilot',
apiKeyEnv: ['GITHUB_COPILOT_TOKEN', 'GITHUB_TOKEN'],
baseUrlSupported: false,
notes: 'If no token env is set, Copilot adapter may start an interactive OAuth device flow (avoid in CI).',
},
]
// ---------------------------------------------------------------------------
// argv / JSON helpers
// ---------------------------------------------------------------------------
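/**
* Minimal argv parser. `_` holds every token after `node <script>` (options included),
* `kv` holds `--key=value` and `--key value` pairs, and `flags` holds bare `--key` switches.
* Parsing stops at a literal `--`. Note that a switch directly followed by a token that
* does not start with `--` is read as a key/value pair, not as a flag.
*/
export function parseArgs(argv: string[]): {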
_: string[]
flags: Set<string>
kv: Map<string, string>
} {
const _ = argv.slice(2)
const flags = new Set<string>()
const kv = new Map<string, string>()
let i = 0
while (i < _.length) {
const a = _[i]!
if (a === '--') {
break
}
if (a.startsWith('--')) {
const eq = a.indexOf('=')
if (eq !== -1) {
kv.set(a.slice(2, eq), a.slice(eq + 1))
i++
continue
}
const key = a.slice(2)
const next = _[i + 1]
if (next !== undefined && !next.startsWith('--')) {
kv.set(key, next)
i += 2
} else {
flags.add(key)
i++
}
continue
}
i++
}
return { _, flags, kv }
}
function getOpt(kv: Map<string, string>, flags: Set<string>, key: string): string | undefined {
if (flags.has(key)) return ''
return kv.get(key)
}
function readJson(path: string): unknown {
const abs = resolve(path)
const raw = readFileSync(abs, 'utf8')
try {
return JSON.parse(raw) as unknown
} catch (e) {
if (e instanceof SyntaxError) {
throw new Error(`Invalid JSON in ${abs}: ${e.message}`)
}
throw e
}
}
function isObject(v: unknown): v is Record<string, unknown> {
return typeof v === 'object' && v !== null && !Array.isArray(v)
}
function asTeamConfig(v: unknown, label: string): TeamConfig {
if (!isObject(v)) throw new OmaValidationError(`${label}: expected a JSON object`)
const name = v['name']
const agents = v['agents']
if (typeof name !== 'string' || !name) throw new OmaValidationError(`${label}.name: non-empty string required`)
if (!Array.isArray(agents) || agents.length === 0) {
throw new OmaValidationError(`${label}.agents: non-empty array required`)
}
for (const a of agents) {
if (!isObject(a)) throw new OmaValidationError(`${label}.agents[]: each agent must be an object`)
if (typeof a['name'] !== 'string' || !a['name']) {
throw new OmaValidationError(`${label}.agents[]: agent.name must be a non-empty string`)
}
if (typeof a['model'] !== 'string' || !a['model']) {
throw new OmaValidationError(`${label}.agents[]: agent.model required for "${String(a['name'])}"`)
}
}
return v as unknown as TeamConfig
}
function asOrchestratorPartial(v: unknown, label: string): OrchestratorConfig {
if (!isObject(v)) throw new OmaValidationError(`${label}: expected a JSON object`)
return v as OrchestratorConfig
}
function asCoordinatorPartial(v: unknown, label: string): CoordinatorConfig {
if (!isObject(v)) throw new OmaValidationError(`${label}: expected a JSON object`)
return v as CoordinatorConfig
}
function asTaskSpecs(v: unknown, label: string): ReadonlyArray<{
title: string
description: string
assignee?: string
dependsOn?: string[]
memoryScope?: 'dependencies' | 'all'
maxRetries?: number
retryDelayMs?: number
retryBackoff?: number
}> {
if (!Array.isArray(v)) throw new OmaValidationError(`${label}: expected a JSON array`)
const out: Array<{
title: string
description: string
assignee?: string
dependsOn?: string[]
memoryScope?: 'dependencies' | 'all'
maxRetries?: number
retryDelayMs?: number
retryBackoff?: number
}> = []
let i = 0
for (const item of v) {
if (!isObject(item)) throw new OmaValidationError(`${label}[${i}]: object expected`)
if (typeof item['title'] !== 'string' || typeof item['description'] !== 'string') {
throw new OmaValidationError(`${label}[${i}]: title and description strings required`)
}
const row: (typeof out)[0] = {
title: item['title'],
description: item['description'],
}
if (typeof item['assignee'] === 'string') row.assignee = item['assignee']
if (Array.isArray(item['dependsOn'])) {
row.dependsOn = item['dependsOn'].filter((x): x is string => typeof x === 'string')
}
if (item['memoryScope'] === 'all' || item['memoryScope'] === 'dependencies') {
row.memoryScope = item['memoryScope']
}
if (typeof item['maxRetries'] === 'number') row.maxRetries = item['maxRetries']
if (typeof item['retryDelayMs'] === 'number') row.retryDelayMs = item['retryDelayMs']
if (typeof item['retryBackoff'] === 'number') row.retryBackoff = item['retryBackoff']
out.push(row)
i++
}
return out
}
export interface CliJsonOptions {
readonly pretty: boolean
readonly includeMessages: boolean
}
export function serializeAgentResult(r: AgentRunResult, includeMessages: boolean): Record<string, unknown> {
const base: Record<string, unknown> = {
success: r.success,
output: r.output,
tokenUsage: r.tokenUsage,
toolCalls: r.toolCalls,
structured: r.structured,
loopDetected: r.loopDetected,
budgetExceeded: r.budgetExceeded,
}
if (includeMessages) base['messages'] = r.messages
return base
}
export function serializeTeamRunResult(result: TeamRunResult, opts: CliJsonOptions): Record<string, unknown> {
const agentResults: Record<string, unknown> = {}
for (const [k, v] of result.agentResults) {
agentResults[k] = serializeAgentResult(v, opts.includeMessages)
}
return {
success: result.success,
goal: result.goal,
tasks: result.tasks,
totalTokenUsage: result.totalTokenUsage,
agentResults,
}
}
function printJson(data: unknown, pretty: boolean): void {
const s = pretty ? JSON.stringify(data, null, 2) : JSON.stringify(data)
process.stdout.write(`${s}\n`)
}
function help(): string {
return [
'open-multi-agent CLI (oma)',
'',
'Usage:',
' oma run --goal <text> --team <team.json> [--orchestrator <orch.json>] [--coordinator <coord.json>]',
' oma task --file <tasks.json> [--team <team.json>]',
' oma provider [list | template <provider>]',
'',
'Flags:',
' --pretty Pretty-print JSON to stdout',
' --include-messages Include full LLM message arrays in run output (large)',
' --dashboard Write team-run DAG HTML dashboard to oma-dashboards/',
'',
'team.json may be a TeamConfig object, or { "team": TeamConfig, "orchestrator": { ... } }.',
'tasks.json: { "team": TeamConfig, "tasks": [ ... ], "orchestrator"?: { ... } }.',
' Optional --team overrides the embedded team object.',
'',
'Exit codes: 0 success, 1 run failed, 2 usage/validation, 3 internal',
].join('\n')
}
const DEFAULT_MODEL_HINT: Record<SupportedProvider, string> = {
anthropic: 'claude-opus-4-6',
'azure-openai': 'gpt-4',
openai: 'gpt-4o',
gemini: 'gemini-2.0-flash',
grok: 'grok-2-latest',
copilot: 'gpt-4o',
minimax: 'MiniMax-M2.7',
deepseek: 'deepseek-chat',
}
async function cmdProvider(sub: string | undefined, arg: string | undefined, pretty: boolean): Promise<number> {
if (sub === undefined || sub === 'list') {
printJson({ providers: PROVIDER_REFERENCE }, pretty)
return EXIT.SUCCESS
}
if (sub === 'template') {
const id = arg as SupportedProvider | undefined
const row = PROVIDER_REFERENCE.find((p) => p.id === id)
if (!id || !row) {
printJson(
{
error: {
kind: 'usage',
message: `usage: oma provider template <${PROVIDER_REFERENCE.map((p) => p.id).join('|')}>`,
},
},
pretty,
)
return EXIT.USAGE
}
printJson(
{
orchestrator: {
defaultProvider: id,
defaultModel: DEFAULT_MODEL_HINT[id],
},
agent: {
name: 'worker',
model: DEFAULT_MODEL_HINT[id],
provider: id,
systemPrompt: 'You are a helpful assistant.',
},
env: Object.fromEntries(row.apiKeyEnv.map((k) => [k, `<set ${k} in environment>`])),
notes: row.notes,
},
pretty,
)
return EXIT.SUCCESS
}
printJson({ error: { kind: 'usage', message: `unknown provider subcommand: ${sub}` } }, pretty)
return EXIT.USAGE
}
function mergeOrchestrator(base: OrchestratorConfig, ...partials: OrchestratorConfig[]): OrchestratorConfig {
let o: OrchestratorConfig = { ...base }
for (const p of partials) {
o = { ...o, ...p }
}
return o
}
async function writeRunTeamDashboardFile(html: string): Promise<string> {
const directory = join(process.cwd(), 'oma-dashboards')
await mkdir(directory, { recursive: true })
const stamp = new Date().toISOString().replaceAll(':', '-').replace('.', '-')
const filePath = join(directory, `runTeam-${stamp}.html`)
await writeFile(filePath, html, 'utf8')
return filePath
}
async function main(): Promise<number> {
const argv = parseArgs(process.argv)
const cmd = argv._[0]
const pretty = argv.flags.has('pretty')
const includeMessages = argv.flags.has('include-messages')
const dashboard = argv.flags.has('dashboard')
if (cmd === undefined || cmd === 'help' || cmd === '-h' || cmd === '--help') {
process.stdout.write(`${help()}\n`)
return EXIT.SUCCESS
}
if (cmd === 'provider') {
return cmdProvider(argv._[1], argv._[2], pretty)
}
const jsonOpts: CliJsonOptions = { pretty, includeMessages }
try {
if (cmd === 'run') {
const goal = getOpt(argv.kv, argv.flags, 'goal')
const teamPath = getOpt(argv.kv, argv.flags, 'team')
const orchPath = getOpt(argv.kv, argv.flags, 'orchestrator')
const coordPath = getOpt(argv.kv, argv.flags, 'coordinator')
if (!goal || !teamPath) {
printJson({ error: { kind: 'usage', message: '--goal and --team are required' } }, pretty)
return EXIT.USAGE
}
const teamRaw = readJson(teamPath)
let teamCfg: TeamConfig
const orchParts: OrchestratorConfig[] = []
if (isObject(teamRaw) && teamRaw['team'] !== undefined) {
teamCfg = asTeamConfig(teamRaw['team'], 'team')
if (teamRaw['orchestrator'] !== undefined) {
orchParts.push(asOrchestratorPartial(teamRaw['orchestrator'], 'orchestrator'))
}
} else {
teamCfg = asTeamConfig(teamRaw, 'team')
}
if (orchPath) {
orchParts.push(asOrchestratorPartial(readJson(orchPath), 'orchestrator file'))
}
const orchestrator = new OpenMultiAgent(mergeOrchestrator({}, ...orchParts))
const team = orchestrator.createTeam(teamCfg.name, teamCfg)
let coordinator: CoordinatorConfig | undefined
if (coordPath) {
coordinator = asCoordinatorPartial(readJson(coordPath), 'coordinator file')
}
const result = await orchestrator.runTeam(team, goal, coordinator ? { coordinator } : undefined)
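if (dashboard) {
// Rendering can throw (for example on a cyclic task graph), so it is guarded
// together with the write: a dashboard failure must not fail the run itself.
try {
const html = renderTeamRunDashboard(result)
await writeRunTeamDashboardFile(html)
} catch (err) {
process.stderr.write(
`oma: failed to render or write runTeam dashboard: ${err instanceof Error ? err.message : String(err)}\n`,
)
}
}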
await orchestrator.shutdown()
const payload = { command: 'run' as const, ...serializeTeamRunResult(result, jsonOpts) }
printJson(payload, pretty)
return result.success ? EXIT.SUCCESS : EXIT.RUN_FAILED
}
if (cmd === 'task') {
const file = getOpt(argv.kv, argv.flags, 'file')
const teamOverride = getOpt(argv.kv, argv.flags, 'team')
if (!file) {
printJson({ error: { kind: 'usage', message: '--file is required' } }, pretty)
return EXIT.USAGE
}
const doc = readJson(file)
if (!isObject(doc)) {
throw new OmaValidationError('tasks file root must be an object')
}
const orchParts: OrchestratorConfig[] = []
if (doc['orchestrator'] !== undefined) {
orchParts.push(asOrchestratorPartial(doc['orchestrator'], 'orchestrator'))
}
const teamCfg = teamOverride
? asTeamConfig(readJson(teamOverride), 'team (--team)')
: asTeamConfig(doc['team'], 'team')
const tasks = asTaskSpecs(doc['tasks'], 'tasks')
if (tasks.length === 0) {
throw new OmaValidationError('tasks array must not be empty')
}
const orchestrator = new OpenMultiAgent(mergeOrchestrator({}, ...orchParts))
const team = orchestrator.createTeam(teamCfg.name, teamCfg)
const result = await orchestrator.runTasks(team, tasks)
await orchestrator.shutdown()
const payload = { command: 'task' as const, ...serializeTeamRunResult(result, jsonOpts) }
printJson(payload, pretty)
return result.success ? EXIT.SUCCESS : EXIT.RUN_FAILED
}
printJson({ error: { kind: 'usage', message: `unknown command: ${cmd}` } }, pretty)
return EXIT.USAGE
} catch (e) {
const message = e instanceof Error ? e.message : String(e)
const { kind, exit } = classifyCliError(e, message)
printJson({ error: { kind, message } }, pretty)
return exit
}
}
function classifyCliError(e: unknown, message: string): { kind: string; exit: number } {
if (e instanceof OmaValidationError) return { kind: 'validation', exit: EXIT.USAGE }
if (message.includes('Invalid JSON')) return { kind: 'validation', exit: EXIT.USAGE }
if (message.includes('ENOENT') || message.includes('EACCES')) return { kind: 'io', exit: EXIT.USAGE }
return { kind: 'runtime', exit: EXIT.INTERNAL }
}
const isMain = (() => {
const argv1 = process.argv[1]
if (!argv1) return false
try {
return fileURLToPath(import.meta.url) === resolve(argv1)
} catch {
return false
}
})()
if (isMain) {
main()
.then((code) => process.exit(code))
.catch((e) => {
const message = e instanceof Error ? e.message : String(e)
process.stdout.write(`${JSON.stringify({ error: { kind: 'internal', message } })}\n`)
process.exit(EXIT.INTERNAL)
})
}
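
The argument parser above reads any `--key` followed by a token that does not start with `--` as a key/value pair, so a boolean switch placed directly before a positional word swallows that word. The following is a minimal sketch of that behavior using a condensed re-implementation; `parseArgsSketch` and the sample argument lists are assumptions made for illustration, not part of the CLI.

```typescript
// Condensed, illustrative copy of the CLI's parsing rules.
// `parseArgsSketch` is a hypothetical name used only for this example.
function parseArgsSketch(argv: string[]): { rest: string[]; flags: Set<string>; kv: Map<string, string> } {
  const rest = argv.slice(2) // drop `node` and the script path
  const flags = new Set<string>()
  const kv = new Map<string, string>()
  for (let i = 0; i < rest.length; i++) {
    const a = rest[i]!
    if (a === '--') break // stop parsing options
    if (!a.startsWith('--')) continue // positional token
    const eq = a.indexOf('=')
    if (eq !== -1) {
      kv.set(a.slice(2, eq), a.slice(eq + 1)) // --key=value
      continue
    }
    const next = rest[i + 1]
    if (next !== undefined && !next.startsWith('--')) {
      kv.set(a.slice(2), next) // --key value
      i++ // skip the consumed value
    } else {
      flags.add(a.slice(2)) // bare switch
    }
  }
  return { rest, flags, kv }
}

// Switch at the end: parsed as a flag.
const ok = parseArgsSketch(['node', 'oma', 'run', '--goal=ship it', '--team', 'team.json', '--pretty'])
// Switch before a plain word: the word becomes the switch's value.
const swallowed = parseArgsSketch(['node', 'oma', 'run', '--pretty', 'team.json'])

console.log(ok.kv.get('goal'), ok.kv.get('team'), ok.flags.has('pretty'))
console.log(swallowed.flags.has('pretty'), swallowed.kv.get('pretty'))
```

In practice this suggests putting switches such as `--pretty` last, or directly before another `--` option.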

View File

@ -1,98 +0,0 @@
/**
* Pure DAG layout for the team-run dashboard (mirrors the browser algorithm).
*/
export interface LayoutTaskInput {
readonly id: string
readonly dependsOn?: readonly string[]
}
export interface LayoutTasksResult {
readonly positions: ReadonlyMap<string, { readonly x: number; readonly y: number }>
readonly width: number
readonly height: number
readonly nodeW: number
readonly nodeH: number
}
/**
* Assigns each task to a column by longest path from roots (topological level),
* then stacks rows within each column. Used by the dashboard canvas sizing.
*/
export function layoutTasks<T extends LayoutTaskInput>(taskList: readonly T[]): LayoutTasksResult {
const byId = new Map(taskList.map((task) => [task.id, task]))
const children = new Map<string, string[]>(taskList.map((task) => [task.id, []]))
const indegree = new Map<string, number>()
for (const task of taskList) {
const deps = (task.dependsOn ?? []).filter((dep) => byId.has(dep))
indegree.set(task.id, deps.length)
for (const depId of deps) {
children.get(depId)!.push(task.id)
}
}
const levels = new Map<string, number>()
const queue: string[] = []
let processed = 0
for (const task of taskList) {
if ((indegree.get(task.id) ?? 0) === 0) {
levels.set(task.id, 0)
queue.push(task.id)
}
}
while (queue.length > 0) {
const currentId = queue.shift()!
processed += 1
const baseLevel = levels.get(currentId) ?? 0
for (const childId of children.get(currentId) ?? []) {
const nextLevel = Math.max(levels.get(childId) ?? 0, baseLevel + 1)
levels.set(childId, nextLevel)
indegree.set(childId, (indegree.get(childId) ?? 1) - 1)
if ((indegree.get(childId) ?? 0) === 0) {
queue.push(childId)
}
}
}
if (processed !== taskList.length) {
throw new Error('Task dependency graph contains a cycle')
}
for (const task of taskList) {
if (!levels.has(task.id)) levels.set(task.id, 0)
}
const cols = new Map<number, T[]>()
for (const task of taskList) {
const level = levels.get(task.id) ?? 0
if (!cols.has(level)) cols.set(level, [])
cols.get(level)!.push(task)
}
const sortedLevels = Array.from(cols.keys()).sort((a, b) => a - b)
const nodeW = 256
const nodeH = 142
const colGap = 96
const rowGap = 72
const padX = 120
const padY = 100
const positions = new Map<string, { x: number; y: number }>()
let maxRows = 1
for (const level of sortedLevels) maxRows = Math.max(maxRows, cols.get(level)!.length)
for (const level of sortedLevels) {
const colTasks = cols.get(level)!
colTasks.forEach((task, idx) => {
positions.set(task.id, {
x: padX + level * (nodeW + colGap),
y: padY + idx * (nodeH + rowGap),
})
})
}
const width = Math.max(1600, padX * 2 + sortedLevels.length * (nodeW + colGap))
const height = Math.max(700, padY * 2 + maxRows * (nodeH + rowGap))
return { positions, width, height, nodeW, nodeH }
}
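
The column assignment above is a longest-path leveling over a topological order (Kahn's algorithm): a task sits one column to the right of its deepest dependency. The following sketch isolates that step under simplified assumptions; `TaskNode`, `levelsOf`, and the sample task ids are hypothetical and not exports of the dashboard module.

```typescript
// Illustrative sketch of longest-path leveling; names here are hypothetical.
interface TaskNode {
  id: string
  dependsOn?: string[]
}

function levelsOf(nodes: TaskNode[]): Map<string, number> {
  const ids = new Set(nodes.map((n) => n.id))
  const children = new Map<string, string[]>()
  const indegree = new Map<string, number>()
  for (const n of nodes) children.set(n.id, [])
  for (const n of nodes) {
    // Dependencies on unknown ids are ignored, as in the layout code.
    const deps = (n.dependsOn ?? []).filter((d) => ids.has(d))
    indegree.set(n.id, deps.length)
    for (const d of deps) children.get(d)!.push(n.id)
  }
  const levels = new Map<string, number>()
  const queue = nodes.filter((n) => indegree.get(n.id) === 0).map((n) => n.id)
  for (const id of queue) levels.set(id, 0) // roots go in column 0
  let processed = 0
  while (queue.length > 0) {
    const id = queue.shift()!
    processed++
    for (const child of children.get(id) ?? []) {
      // Keep the deepest level seen so far for this child.
      levels.set(child, Math.max(levels.get(child) ?? 0, levels.get(id)! + 1))
      const left = indegree.get(child)! - 1
      indegree.set(child, left)
      if (left === 0) queue.push(child)
    }
  }
  if (processed !== nodes.length) throw new Error('cycle detected')
  return levels
}

const sampleLevels = levelsOf([
  { id: 'fetch' },
  { id: 'parse', dependsOn: ['fetch'] },
  { id: 'summarize', dependsOn: ['parse'] },
  { id: 'report', dependsOn: ['fetch', 'summarize'] },
])
console.log(sampleLevels.get('report')) // 3: the longest path wins over the direct edge from fetch
```

With the constants used in the layout (`padX` 120, `nodeW` 256, `colGap` 96), a task at level 3 would be placed at x = 120 + 3 × 352 = 1176.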

View File

@ -1,460 +0,0 @@
/**
* Pure HTML renderer for the post-run team task DAG dashboard (no filesystem or network I/O).
*/
import type { TeamRunResult } from '../types.js'
import { layoutTasks } from './layout-tasks.js'
/**
* Escape serialized JSON so it can be embedded in HTML without closing a `<script>` tag.
* The HTML tokenizer ends a script on `</script>` even for `type="application/json"`.
*/
export function escapeJsonForHtmlScript(json: string): string {
return json.replace(/<\/script/gi, '<\\/script')
}
export function renderTeamRunDashboard(result: TeamRunResult): string {
const generatedAt = new Date().toISOString()
const tasks = result.tasks ?? []
const layout = layoutTasks(tasks)
const serializedPositions = Object.fromEntries(layout.positions)
const payload = {
generatedAt,
goal: result.goal ?? '',
tasks,
layout: {
positions: serializedPositions,
width: layout.width,
height: layout.height,
nodeW: layout.nodeW,
nodeH: layout.nodeH,
},
}
const dataJson = escapeJsonForHtmlScript(JSON.stringify(payload))
return `<!DOCTYPE html>
<html class="dark" lang="en">
<head>
<meta charset="utf-8" />
<meta content="width=device-width, initial-scale=1.0" name="viewport" />
<title>Open Multi Agent</title>
<script src="https://cdn.tailwindcss.com?plugins=forms,container-queries"></script>
<link
href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@300;400;500;600;700&amp;family=Inter:wght@400;500;600&amp;display=swap"
rel="stylesheet" />
<link
href="https://fonts.googleapis.com/css2?family=Material+Symbols+Outlined:wght,FILL@100..700,0..1&amp;display=swap"
rel="stylesheet" />
<script id="tailwind-config">
tailwind.config = {
darkMode: "class",
theme: {
extend: {
"colors": {
"inverse-surface": "#faf8ff",
"secondary-dim": "#ecb200",
"on-primary": "#005762",
"on-tertiary-fixed-variant": "#006827",
"primary-fixed-dim": "#00d4ec",
"tertiary-container": "#5cfd80",
"secondary": "#fdc003",
"primary-dim": "#00d4ec",
"surface-container": "#0f1930",
"on-secondary": "#553e00",
"surface": "#060e20",
"on-surface": "#dee5ff",
"surface-container-highest": "#192540",
"on-secondary-fixed-variant": "#674c00",
"on-tertiary-container": "#005d22",
"secondary-fixed-dim": "#f7ba00",
"surface-variant": "#192540",
"surface-container-low": "#091328",
"secondary-container": "#785900",
"tertiary-fixed-dim": "#4bee74",
"on-primary-fixed-variant": "#005762",
"primary-container": "#00e3fd",
"surface-dim": "#060e20",
"error-container": "#9f0519",
"on-error-container": "#ffa8a3",
"primary-fixed": "#00e3fd",
"tertiary-dim": "#4bee74",
"surface-container-high": "#141f38",
"background": "#060e20",
"surface-bright": "#1f2b49",
"error-dim": "#d7383b",
"on-primary-container": "#004d57",
"outline": "#6d758c",
"error": "#ff716c",
"on-secondary-container": "#fff6ec",
"on-primary-fixed": "#003840",
"inverse-on-surface": "#4d556b",
"secondary-fixed": "#ffca4d",
"tertiary-fixed": "#5cfd80",
"on-tertiary-fixed": "#004819",
"surface-tint": "#81ecff",
"tertiary": "#b8ffbb",
"outline-variant": "#40485d",
"on-error": "#490006",
"on-surface-variant": "#a3aac4",
"surface-container-lowest": "#000000",
"on-tertiary": "#006727",
"primary": "#81ecff",
"on-secondary-fixed": "#443100",
"inverse-primary": "#006976",
"on-background": "#dee5ff"
},
"borderRadius": {
"DEFAULT": "0px",
"lg": "0px",
"xl": "0px",
"full": "9999px"
},
"fontFamily": {
"headline": ["Space Grotesk"],
"body": ["Inter"],
"label": ["Space Grotesk"]
}
},
},
}
</script>
<style>
.material-symbols-outlined {
font-variation-settings: 'FILL' 0, 'wght' 400, 'GRAD' 0, 'opsz' 24;
}
.grid-pattern {
background-image: radial-gradient(circle, #40485d 1px, transparent 1px);
background-size: 24px 24px;
}
.node-active-glow {
box-shadow: 0 0 15px rgba(129, 236, 255, 0.15);
}
</style>
</head>
<body class="bg-surface text-on-surface font-body selection:bg-primary selection:text-on-primary">
<main class="p-8 min-h-[calc(100vh-64px)] grid-pattern relative overflow-hidden flex flex-col lg:flex-row gap-6">
<div id="viewport" class="flex-1 relative min-h-[600px] overflow-hidden cursor-grab">
<div id="canvas" class="absolute inset-0 origin-top-left">
<svg id="edgesLayer" class="absolute inset-0 w-full h-full pointer-events-none" xmlns="http://www.w3.org/2000/svg"></svg>
<div id="nodesLayer"></div>
</div>
</div>
<aside id="detailsPanel" class="hidden w-full lg:w-[400px] bg-surface-container-high p-6 flex flex-col gap-8 border-l border-outline-variant/10">
<div>
<h2 class="font-headline font-black text-lg tracking-widest mb-6 text-primary flex items-center gap-2">
<span class="material-symbols-outlined" data-icon="info">info</span>
NODE_DETAILS
</h2>
<button id="closePanel" class="absolute top-4 right-4 text-on-surface-variant hover:text-primary">
<span class="material-symbols-outlined">close</span>
</button>
<div class="space-y-6">
<div class="flex flex-col gap-2">
<label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Goal</label>
<p id="goalText" class="text-xs bg-surface-container p-3 border-b border-outline-variant/20"></p>
</div>
<div class="flex flex-col gap-1">
<label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Assigned Agent</label>
<div class="flex items-center gap-4 bg-surface-container p-3">
<div>
<p id="selectedAssignee" class="text-sm font-bold text-on-surface">-</p>
<p id="selectedState" class="text-[10px] font-mono text-secondary">ACTIVE STATE: -</p>
</div>
</div>
</div>
<div class="grid grid-cols-2 gap-4">
<div class="flex flex-col gap-1">
<label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Execution Start</label>
<p id="selectedStart" class="text-xs font-mono bg-surface-container p-2 border-b border-outline-variant/20">-</p>
</div>
<div class="flex flex-col gap-1">
<label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Execution End</label>
<p id="selectedEnd" class="text-xs font-mono bg-surface-container p-2 border-b border-outline-variant/20 text-on-surface-variant">-</p>
</div>
</div>
<div class="flex flex-col gap-1">
<label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Token Breakdown</label>
<div class="space-y-2 bg-surface-container p-4">
<div class="flex justify-between text-xs font-mono">
<span class="text-on-surface-variant">PROMPT:</span>
<span id="selectedPromptTokens" class="text-on-surface">0</span>
</div>
<div class="flex justify-between text-xs font-mono">
<span class="text-on-surface-variant">COMPLETION:</span>
<span id="selectedCompletionTokens" class="text-on-surface text-secondary">0</span>
</div>
<div class="w-full h-1 bg-surface-variant mt-2">
<div id="selectedTokenRatio" class="bg-primary h-full w-0"></div>
</div>
</div>
</div>
<div class="flex flex-col gap-1">
<label class="text-[10px] font-headline uppercase tracking-widest text-on-surface-variant">Tool Calls</label>
<p id="selectedToolCalls" class="text-xs font-mono bg-surface-container p-2 border-b border-outline-variant/20">0</p>
</div>
</div>
</div>
<div class="flex-1 flex flex-col min-h-[200px]">
<h2 class="font-headline font-black text-[10px] tracking-widest mb-4 text-on-surface-variant">LIVE_AGENT_OUTPUT</h2>
<div id="liveOutput" class="bg-surface-container-lowest flex-1 p-3 font-mono text-[10px] leading-relaxed overflow-y-auto space-y-1">
</div>
</div>
</aside>
</main>
<div class="fixed left-0 top-0 w-1 h-screen bg-gradient-to-b from-primary via-secondary to-tertiary z-[60] opacity-30"></div>
<script type="application/json" id="oma-data">${dataJson}</script>
<script>
const dataEl = document.getElementById("oma-data");
const payload = JSON.parse(dataEl.textContent);
const panel = document.getElementById("detailsPanel");
const closeBtn = document.getElementById("closePanel");
const canvas = document.getElementById("canvas");
const viewport = document.getElementById("viewport");
const edgesLayer = document.getElementById("edgesLayer");
const nodesLayer = document.getElementById("nodesLayer");
const goalText = document.getElementById("goalText");
const liveOutput = document.getElementById("liveOutput");
const selectedAssignee = document.getElementById("selectedAssignee");
const selectedState = document.getElementById("selectedState");
const selectedStart = document.getElementById("selectedStart");
const selectedToolCalls = document.getElementById("selectedToolCalls");
const selectedEnd = document.getElementById("selectedEnd");
const selectedPromptTokens = document.getElementById("selectedPromptTokens");
const selectedCompletionTokens = document.getElementById("selectedCompletionTokens");
const selectedTokenRatio = document.getElementById("selectedTokenRatio");
const svgNs = "http://www.w3.org/2000/svg";
let scale = 1;
let translate = { x: 0, y: 0 };
let isDragging = false;
let last = { x: 0, y: 0 };
function updateTransform() {
canvas.style.transform = \`
translate(\${translate.x}px, \${translate.y}px)
scale(\${scale})
\`;
}
viewport.addEventListener("wheel", (e) => {
e.preventDefault();
const zoomIntensity = 0.0015;
const delta = -e.deltaY * zoomIntensity;
const newScale = Math.min(Math.max(0.4, scale + delta), 2.5);
const rect = viewport.getBoundingClientRect();
const mouseX = e.clientX - rect.left;
const mouseY = e.clientY - rect.top;
const dx = mouseX - translate.x;
const dy = mouseY - translate.y;
translate.x -= dx * (newScale / scale - 1);
translate.y -= dy * (newScale / scale - 1);
scale = newScale;
updateTransform();
});
viewport.addEventListener("mousedown", (e) => {
isDragging = true;
last = { x: e.clientX, y: e.clientY };
viewport.classList.add("cursor-grabbing");
});
window.addEventListener("mousemove", (e) => {
if (!isDragging) return;
const dx = e.clientX - last.x;
const dy = e.clientY - last.y;
translate.x += dx;
translate.y += dy;
last = { x: e.clientX, y: e.clientY };
updateTransform();
});
window.addEventListener("mouseup", () => {
isDragging = false;
viewport.classList.remove("cursor-grabbing");
});
updateTransform();
closeBtn.addEventListener("click", () => {
panel.classList.add("hidden");
});
document.addEventListener("click", (e) => {
const isClickInsidePanel = panel.contains(e.target);
const isNode = e.target.closest(".node");
if (!isClickInsidePanel && !isNode) {
panel.classList.add("hidden");
}
});
const tasks = Array.isArray(payload.tasks) ? payload.tasks : [];
goalText.textContent = payload.goal ?? "";
const statusStyles = {
completed: { border: "border-tertiary", icon: "check_circle", iconColor: "text-tertiary", container: "bg-surface-container-lowest node-active-glow", statusColor: "text-on-surface-variant", chip: "STABLE" },
failed: { border: "border-error", icon: "error", iconColor: "text-error", container: "bg-surface-container-lowest", statusColor: "text-error", chip: "FAILED" },
blocked: { border: "border-outline", icon: "lock", iconColor: "text-outline", container: "bg-surface-container-low opacity-60 grayscale", statusColor: "text-on-surface-variant", chip: "BLOCKED" },
skipped: { border: "border-outline", icon: "skip_next", iconColor: "text-outline", container: "bg-surface-container-low opacity-60", statusColor: "text-on-surface-variant", chip: "SKIPPED" },
in_progress: { border: "border-secondary", icon: "sync", iconColor: "text-secondary", container: "bg-surface-container-low node-active-glow border border-outline-variant/20 shadow-[0_0_20px_rgba(253,192,3,0.1)]", statusColor: "text-secondary", chip: "ACTIVE_STREAM", spin: true },
pending: { border: "border-outline", icon: "hourglass_empty", iconColor: "text-outline", container: "bg-surface-container-low opacity-60 grayscale", statusColor: "text-on-surface-variant", chip: "WAITING" },
};
function durationText(task) {
const ms = task?.metrics?.durationMs ?? 0;
const seconds = Math.max(0, ms / 1000).toFixed(1);
return task.status === "completed" ? "DONE (" + seconds + "s)" : task.status.toUpperCase();
}
function renderLiveOutput(taskList) {
liveOutput.innerHTML = "";
const finished = taskList.every((task) => ["completed", "failed", "skipped", "blocked"].includes(task.status));
const header = document.createElement("p");
header.className = "text-tertiary";
header.textContent = finished ? "[SYSTEM] Task graph execution finished." : "[SYSTEM] Task graph execution in progress.";
liveOutput.appendChild(header);
taskList.forEach((task) => {
const p = document.createElement("p");
p.className = task.status === "failed" ? "text-error" : "text-on-surface-variant";
p.textContent = "[" + (task.assignee || "UNASSIGNED").toUpperCase() + "] " + task.title + " -> " + task.status.toUpperCase();
liveOutput.appendChild(p);
});
}
function renderDetails(task) {
const metrics = task?.metrics ?? {};
const statusLabel = (statusStyles[task.status] || statusStyles.pending).chip;
const usage = metrics.tokenUsage ?? { input_tokens: 0, output_tokens: 0 };
const inTokens = usage.input_tokens ?? 0;
const outTokens = usage.output_tokens ?? 0;
const total = inTokens + outTokens;
const ratio = total > 0 ? Math.round((inTokens / total) * 100) : 0;
selectedAssignee.textContent = task?.assignee || "UNASSIGNED";
selectedState.textContent = "STATE: " + statusLabel;
selectedStart.textContent = metrics.startMs ? new Date(metrics.startMs).toISOString() : "-";
selectedEnd.textContent = metrics.endMs ? new Date(metrics.endMs).toISOString() : "-";
selectedToolCalls.textContent = (metrics.toolCalls ?? []).length.toString();
selectedPromptTokens.textContent = inTokens.toLocaleString();
selectedCompletionTokens.textContent = outTokens.toLocaleString();
selectedTokenRatio.style.width = ratio + "%";
}
function makeEdgePath(x1, y1, x2, y2) {
return "M " + x1 + " " + y1 + " C " + (x1 + 42) + " " + y1 + ", " + (x2 - 42) + " " + y2 + ", " + x2 + " " + y2;
}
function renderDag(taskList) {
const rawLayout = payload.layout ?? {};
const positions = new Map(Object.entries(rawLayout.positions ?? {}));
const width = Number(rawLayout.width ?? 1600);
const height = Number(rawLayout.height ?? 700);
const nodeW = Number(rawLayout.nodeW ?? 256);
const nodeH = Number(rawLayout.nodeH ?? 142);
canvas.style.width = width + "px";
canvas.style.height = height + "px";
edgesLayer.setAttribute("viewBox", "0 0 " + width + " " + height);
edgesLayer.innerHTML = "";
const defs = document.createElementNS(svgNs, "defs");
const marker = document.createElementNS(svgNs, "marker");
marker.setAttribute("id", "arrow");
marker.setAttribute("markerWidth", "8");
marker.setAttribute("markerHeight", "8");
marker.setAttribute("refX", "7");
marker.setAttribute("refY", "4");
marker.setAttribute("orient", "auto");
const markerPath = document.createElementNS(svgNs, "path");
markerPath.setAttribute("d", "M0,0 L8,4 L0,8 z");
markerPath.setAttribute("fill", "#40485d");
marker.appendChild(markerPath);
defs.appendChild(marker);
edgesLayer.appendChild(defs);
taskList.forEach((task) => {
const to = positions.get(task.id);
(task.dependsOn || []).forEach((depId) => {
const from = positions.get(depId);
if (!from || !to) return;
const edge = document.createElementNS(svgNs, "path");
edge.setAttribute("d", makeEdgePath(from.x + nodeW, from.y + nodeH / 2, to.x, to.y + nodeH / 2));
edge.setAttribute("fill", "none");
edge.setAttribute("stroke", "#40485d");
edge.setAttribute("stroke-width", "2");
edge.setAttribute("marker-end", "url(#arrow)");
edgesLayer.appendChild(edge);
});
});
nodesLayer.innerHTML = "";
taskList.forEach((task, idx) => {
const pos = positions.get(task.id);
if (!pos) return;
const status = statusStyles[task.status] || statusStyles.pending;
const nodeId = "#NODE_" + String(idx + 1).padStart(3, "0");
const chips = [task.assignee ? task.assignee.toUpperCase() : "UNASSIGNED", status.chip];
const node = document.createElement("div");
node.className = "node absolute w-64 border-l-2 p-4 cursor-pointer " + status.border + " " + status.container;
node.style.left = pos.x + "px";
node.style.top = pos.y + "px";
const rowTop = document.createElement("div");
rowTop.className = "flex justify-between items-start mb-4";
const nodeIdSpan = document.createElement("span");
nodeIdSpan.className = "text-[10px] font-mono " + status.iconColor;
nodeIdSpan.textContent = nodeId;
const iconSpan = document.createElement("span");
iconSpan.className = "material-symbols-outlined " + status.iconColor + " text-lg " + (status.spin ? "animate-spin" : "");
iconSpan.textContent = status.icon;
iconSpan.setAttribute("data-icon", status.icon);
rowTop.appendChild(nodeIdSpan);
rowTop.appendChild(iconSpan);
const titleEl = document.createElement("h3");
titleEl.className = "font-headline font-bold text-sm tracking-tight mb-1";
titleEl.textContent = task.title;
const statusLine = document.createElement("p");
statusLine.className = "text-xs " + status.statusColor + " mb-4";
statusLine.textContent = "STATUS: " + durationText(task);
const chipRow = document.createElement("div");
chipRow.className = "flex gap-2";
chips.forEach((chip) => {
const chipEl = document.createElement("span");
chipEl.className = "px-2 py-0.5 bg-surface-variant text-[9px] font-mono text-on-surface-variant";
chipEl.textContent = chip;
chipRow.appendChild(chipEl);
});
node.appendChild(rowTop);
node.appendChild(titleEl);
node.appendChild(statusLine);
node.appendChild(chipRow);
node.addEventListener("click", () => {
renderDetails(task);
panel.classList.remove("hidden");
});
nodesLayer.appendChild(node);
});
renderLiveOutput(taskList);
}
renderDag(tasks);
</script>
</body>
</html>`
}

@ -1,19 +0,0 @@
/**
* @fileoverview Framework-specific error classes.
*/
/**
* Raised when an agent or orchestrator run exceeds its configured token budget.
*/
export class TokenBudgetExceededError extends Error {
readonly code = 'TOKEN_BUDGET_EXCEEDED'
constructor(
readonly agent: string,
readonly tokensUsed: number,
readonly budget: number,
) {
super(`Agent "${agent}" exceeded token budget: ${tokensUsed} tokens used (budget: ${budget})`)
this.name = 'TokenBudgetExceededError'
}
}
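
A caller might handle this error as sketched below. The class is copied from the hunk above so the snippet is self-contained; `isTokenBudgetExceeded` and `describeFailure` are hypothetical helpers, not part of the framework. The guard checks the `code` discriminant rather than relying only on `instanceof`, which is an assumption about how callers may prefer to match the error.

```typescript
// Local copy of the class from the hunk above, so the sketch runs on its own.
class TokenBudgetExceededError extends Error {
  readonly code = 'TOKEN_BUDGET_EXCEEDED'
  constructor(
    readonly agent: string,
    readonly tokensUsed: number,
    readonly budget: number,
  ) {
    super(`Agent "${agent}" exceeded token budget: ${tokensUsed} tokens used (budget: ${budget})`)
    this.name = 'TokenBudgetExceededError'
  }
}

// Hypothetical type guard: matches on the stable `code` field.
function isTokenBudgetExceeded(err: unknown): err is TokenBudgetExceededError {
  const candidate = err as { code?: unknown } | null | undefined
  return err instanceof Error && candidate?.code === 'TOKEN_BUDGET_EXCEEDED'
}

// Hypothetical helper: turns any thrown value into a short log line.
function describeFailure(err: unknown): string {
  if (isTokenBudgetExceeded(err)) {
    const overshoot = err.tokensUsed - err.budget
    return `${err.code}: ${err.agent} over budget by ${overshoot} tokens`
  }
  return err instanceof Error ? err.message : String(err)
}
```

In a `catch` block, `describeFailure(err)` would let budget overruns be logged differently from other failures without rethrowing.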

@ -58,14 +58,11 @@ export { OpenMultiAgent, executeWithRetry, computeRetryDelay } from './orchestra
export { Scheduler } from './orchestrator/scheduler.js'
export type { SchedulingStrategy } from './orchestrator/scheduler.js'
export { renderTeamRunDashboard } from './dashboard/render-team-run-dashboard.js'
// ---------------------------------------------------------------------------
// Agent layer
// ---------------------------------------------------------------------------
export { Agent } from './agent/agent.js'
export { LoopDetector } from './agent/loop-detector.js'
export { buildStructuredOutputInstruction, extractJSON, validateOutput } from './agent/structured-output.js'
export { AgentPool, Semaphore } from './agent/pool.js'
export type { PoolStatus } from './agent/pool.js'
@ -91,21 +88,17 @@ export type { TaskQueueEvent } from './task/queue.js'
// ---------------------------------------------------------------------------
export { defineTool, ToolRegistry, zodToJsonSchema } from './tool/framework.js'
export { ToolExecutor, truncateToolOutput } from './tool/executor.js'
export { ToolExecutor } from './tool/executor.js'
export type { ToolExecutorOptions, BatchToolCall } from './tool/executor.js'
export {
registerBuiltInTools,
BUILT_IN_TOOLS,
ALL_BUILT_IN_TOOLS_WITH_DELEGATE,
bashTool,
delegateToAgentTool,
fileReadTool,
fileWriteTool,
fileEditTool,
globTool,
grepTool,
} from './tool/built-in/index.js'
export type { RegisterBuiltInToolsOptions } from './tool/built-in/index.js'
// ---------------------------------------------------------------------------
// LLM adapters
@ -113,7 +106,6 @@ export type { RegisterBuiltInToolsOptions } from './tool/built-in/index.js'
export { createAdapter } from './llm/adapter.js'
export type { SupportedProvider } from './llm/adapter.js'
export { TokenBudgetExceededError } from './errors.js'
// ---------------------------------------------------------------------------
// Memory
@ -150,26 +142,17 @@ export type {
ToolUseContext,
AgentInfo,
TeamInfo,
DelegationPoolView,
// Agent
AgentConfig,
AgentState,
AgentRunResult,
BeforeRunHookContext,
ToolCallRecord,
LoopDetectionConfig,
LoopDetectionInfo,
ContextStrategy,
// Team
TeamConfig,
TeamRunResult,
// Dashboard (static HTML)
TaskExecutionMetrics,
TaskExecutionRecord,
// Task
Task,
TaskStatus,
@ -177,20 +160,8 @@ export type {
// Orchestrator
OrchestratorConfig,
OrchestratorEvent,
CoordinatorConfig,
// Trace
TraceEventType,
TraceEventBase,
TraceEvent,
LLMCallTrace,
ToolCallTrace,
TaskTrace,
AgentTrace,
// Memory
MemoryEntry,
MemoryStore,
} from './types.js'
export { generateRunId } from './utils/trace.js'

@ -11,7 +11,6 @@
*
* const anthropic = createAdapter('anthropic')
* const openai = createAdapter('openai', process.env.OPENAI_API_KEY)
* const gemini = createAdapter('gemini', process.env.GEMINI_API_KEY)
* ```
*/
@ -38,22 +37,17 @@ import type { LLMAdapter } from '../types.js'
* Additional providers can be integrated by implementing {@link LLMAdapter}
* directly and bypassing this factory.
*/
export type SupportedProvider = 'anthropic' | 'azure-openai' | 'copilot' | 'deepseek' | 'grok' | 'minimax' | 'openai' | 'gemini'
export type SupportedProvider = 'anthropic' | 'copilot' | 'openai'
/**
* Instantiate the appropriate {@link LLMAdapter} for the given provider.
*
* API keys fall back to the standard environment variables when not supplied
* explicitly:
* - `anthropic`: `ANTHROPIC_API_KEY`
* - `azure-openai`: `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT`
* - `openai`: `OPENAI_API_KEY`
* - `gemini`: `GEMINI_API_KEY` / `GOOGLE_API_KEY`
* - `grok`: `XAI_API_KEY`
* - `minimax`: `MINIMAX_API_KEY`
* - `deepseek`: `DEEPSEEK_API_KEY`
* - `copilot`: `GITHUB_COPILOT_TOKEN` / `GITHUB_TOKEN`, or interactive
* OAuth2 device flow if neither is set
* - `anthropic`: `ANTHROPIC_API_KEY`
* - `openai`: `OPENAI_API_KEY`
* - `copilot`: `GITHUB_COPILOT_TOKEN` / `GITHUB_TOKEN`, or interactive
* OAuth2 device flow if neither is set
*
* Adapters are imported lazily so that projects using only one provider
* are not forced to install the SDKs for the others.
@ -80,32 +74,10 @@ export async function createAdapter(
const { CopilotAdapter } = await import('./copilot.js')
return new CopilotAdapter(apiKey)
}
case 'gemini': {
const { GeminiAdapter } = await import('./gemini.js')
return new GeminiAdapter(apiKey)
}
case 'openai': {
const { OpenAIAdapter } = await import('./openai.js')
return new OpenAIAdapter(apiKey, baseURL)
}
case 'grok': {
const { GrokAdapter } = await import('./grok.js')
return new GrokAdapter(apiKey, baseURL)
}
case 'minimax': {
const { MiniMaxAdapter } = await import('./minimax.js')
return new MiniMaxAdapter(apiKey, baseURL)
}
case 'deepseek': {
const { DeepSeekAdapter } = await import('./deepseek.js')
return new DeepSeekAdapter(apiKey, baseURL)
}
case 'azure-openai': {
// For azure-openai, the `baseURL` parameter serves as the Azure endpoint URL.
// To override the API version, set AZURE_OPENAI_API_VERSION env var.
const { AzureOpenAIAdapter } = await import('./azure-openai.js')
return new AzureOpenAIAdapter(apiKey, baseURL)
}
default: {
// The `never` cast here makes TypeScript enforce exhaustiveness.
const _exhaustive: never = provider
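
The `never` assignment in the `default` branch can be seen in isolation in the sketch below. `Provider` and `primaryEnvVar` are hypothetical names used only for illustration; the environment variable names are the ones listed in the doc comment above.

```typescript
type Provider = 'anthropic' | 'copilot' | 'openai'

// Hypothetical lookup: returns the first environment variable listed for a provider.
function primaryEnvVar(provider: Provider): string {
  switch (provider) {
    case 'anthropic':
      return 'ANTHROPIC_API_KEY'
    case 'copilot':
      return 'GITHUB_COPILOT_TOKEN'
    case 'openai':
      return 'OPENAI_API_KEY'
    default: {
      // If a member is added to Provider without a matching case, `provider`
      // is no longer of type `never` here and this assignment fails to compile.
      const _exhaustive: never = provider
      throw new Error(`Unsupported provider: ${String(_exhaustive)}`)
    }
  }
}
```

The runtime `throw` still matters: a value that bypasses the type system, for example one read from untyped config, reaches the `default` branch and fails loudly.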

@ -1,313 +0,0 @@
/**
* @fileoverview Azure OpenAI adapter implementing {@link LLMAdapter}.
*
* Azure OpenAI uses regional deployment endpoints and API versioning that differ
* from standard OpenAI:
*
* - Endpoint: `https://{resource-name}.openai.azure.com`
* - API version: Query parameter (e.g., `?api-version=2024-10-21`)
* - Model/Deployment: Users deploy models with custom names; the `model` field
* in agent config should contain the Azure deployment name, not the underlying
* model name (e.g., `model: 'my-gpt4-deployment'`)
*
* The OpenAI SDK provides an `AzureOpenAI` client class that handles these
* Azure-specific requirements. This adapter uses that client while reusing all
* message conversion logic from `openai-common.ts`.
*
* Configuration resolution (constructor arguments take precedence over
* environment variables):
* - API key: `apiKey` argument, then `AZURE_OPENAI_API_KEY`
* - Endpoint: `endpoint` argument, then `AZURE_OPENAI_ENDPOINT`
* - API version: `apiVersion` argument, then `AZURE_OPENAI_API_VERSION`,
* then the default '2024-10-21'
* - Deployment: the `model` option, then `AZURE_OPENAI_DEPLOYMENT` when `model` is blank
*
* Note: Azure introduced a next-generation v1 API (August 2025) that uses the standard
* OpenAI() client with baseURL set to `{endpoint}/openai/v1/` and requires no api-version.
* That path is not yet supported by this adapter. To use it, pass `provider: 'openai'`
* with `baseURL: 'https://{resource}.openai.azure.com/openai/v1/'` in your agent config.
*
* @example
* ```ts
* import { AzureOpenAIAdapter } from './azure-openai.js'
*
* const adapter = new AzureOpenAIAdapter()
* const response = await adapter.chat(messages, {
* model: 'my-gpt4-deployment', // Azure deployment name, not 'gpt-4'
* maxTokens: 1024,
* })
* ```
*/
import { AzureOpenAI } from 'openai'
import type {
ChatCompletionChunk,
} from 'openai/resources/chat/completions/index.js'
import type {
ContentBlock,
LLMAdapter,
LLMChatOptions,
LLMMessage,
LLMResponse,
LLMStreamOptions,
StreamEvent,
TextBlock,
ToolUseBlock,
} from '../types.js'
import {
toOpenAITool,
fromOpenAICompletion,
normalizeFinishReason,
buildOpenAIMessageList,
} from './openai-common.js'
import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'
// ---------------------------------------------------------------------------
// Adapter implementation
// ---------------------------------------------------------------------------
const DEFAULT_AZURE_OPENAI_API_VERSION = '2024-10-21'
function resolveAzureDeploymentName(model: string): string {
const explicitModel = model.trim()
if (explicitModel.length > 0) return explicitModel
const fallbackDeployment = process.env['AZURE_OPENAI_DEPLOYMENT']?.trim()
if (fallbackDeployment !== undefined && fallbackDeployment.length > 0) {
return fallbackDeployment
}
throw new Error(
'Azure OpenAI deployment is required. Set agent model to your deployment name, or set AZURE_OPENAI_DEPLOYMENT.',
)
}
/**
* LLM adapter backed by Azure OpenAI Chat Completions API.
*
* Thread-safe: a single instance may be shared across concurrent agent runs.
*/
export class AzureOpenAIAdapter implements LLMAdapter {
readonly name: string = 'azure-openai'
readonly #client: AzureOpenAI
/**
* @param apiKey - Azure OpenAI API key (falls back to AZURE_OPENAI_API_KEY env var)
* @param endpoint - Azure endpoint URL (falls back to AZURE_OPENAI_ENDPOINT env var)
* @param apiVersion - API version string (falls back to AZURE_OPENAI_API_VERSION, defaults to '2024-10-21')
*/
constructor(apiKey?: string, endpoint?: string, apiVersion?: string) {
this.#client = new AzureOpenAI({
apiKey: apiKey ?? process.env['AZURE_OPENAI_API_KEY'],
endpoint: endpoint ?? process.env['AZURE_OPENAI_ENDPOINT'],
apiVersion: apiVersion ?? process.env['AZURE_OPENAI_API_VERSION'] ?? DEFAULT_AZURE_OPENAI_API_VERSION,
})
}
// -------------------------------------------------------------------------
// chat()
// -------------------------------------------------------------------------
/**
* Send a non-streaming chat request and resolve with the complete
* {@link LLMResponse}.
*
* Throws an `AzureOpenAI.APIError` on non-2xx responses. Callers should catch and
* handle these (e.g. rate limits, context length exceeded, deployment not found).
*/
async chat(messages: LLMMessage[], options: LLMChatOptions): Promise<LLMResponse> {
const deploymentName = resolveAzureDeploymentName(options.model)
const openAIMessages = buildOpenAIMessageList(messages, options.systemPrompt)
const completion = await this.#client.chat.completions.create(
{
model: deploymentName,
messages: openAIMessages,
max_tokens: options.maxTokens,
temperature: options.temperature,
tools: options.tools ? options.tools.map(toOpenAITool) : undefined,
stream: false,
},
{
signal: options.abortSignal,
},
)
const toolNames = options.tools?.map(t => t.name)
return fromOpenAICompletion(completion, toolNames)
}
// -------------------------------------------------------------------------
// stream()
// -------------------------------------------------------------------------
/**
* Send a streaming chat request and yield {@link StreamEvent}s incrementally.
*
* Sequence guarantees match {@link OpenAIAdapter.stream}:
* - Zero or more `text` events
* - Zero or more `tool_use` events (emitted once per tool call, after
* arguments have been fully assembled)
* - Exactly one terminal event: `done` or `error`
*/
async *stream(
messages: LLMMessage[],
options: LLMStreamOptions,
): AsyncIterable<StreamEvent> {
const deploymentName = resolveAzureDeploymentName(options.model)
const openAIMessages = buildOpenAIMessageList(messages, options.systemPrompt)
// We request usage in the final chunk so we can include it in the `done` event.
const streamResponse = await this.#client.chat.completions.create(
{
model: deploymentName,
messages: openAIMessages,
max_tokens: options.maxTokens,
temperature: options.temperature,
tools: options.tools ? options.tools.map(toOpenAITool) : undefined,
stream: true,
stream_options: { include_usage: true },
},
{
signal: options.abortSignal,
},
)
// Accumulate state across chunks.
let completionId = ''
let completionModel = ''
let finalFinishReason: string = 'stop'
let inputTokens = 0
let outputTokens = 0
// tool_calls are streamed piecemeal; key = tool call index
const toolCallBuffers = new Map<
number,
{ id: string; name: string; argsJson: string }
>()
// Full text accumulator for the `done` response.
let fullText = ''
try {
for await (const chunk of streamResponse) {
completionId = chunk.id
completionModel = chunk.model
// Usage is only populated in the final chunk when stream_options.include_usage is set.
if (chunk.usage !== null && chunk.usage !== undefined) {
inputTokens = chunk.usage.prompt_tokens
outputTokens = chunk.usage.completion_tokens
}
const choice: ChatCompletionChunk.Choice | undefined = chunk.choices[0]
if (choice === undefined) continue
const delta = choice.delta
// --- text delta ---
if (delta.content !== null && delta.content !== undefined) {
fullText += delta.content
const textEvent: StreamEvent = { type: 'text', data: delta.content }
yield textEvent
}
// --- tool call delta ---
for (const toolCallDelta of delta.tool_calls ?? []) {
const idx = toolCallDelta.index
if (!toolCallBuffers.has(idx)) {
toolCallBuffers.set(idx, {
id: toolCallDelta.id ?? '',
name: toolCallDelta.function?.name ?? '',
argsJson: '',
})
}
const buf = toolCallBuffers.get(idx)
// buf is guaranteed to exist: we just set it above.
if (buf !== undefined) {
if (toolCallDelta.id) buf.id = toolCallDelta.id
if (toolCallDelta.function?.name) buf.name = toolCallDelta.function.name
if (toolCallDelta.function?.arguments) {
buf.argsJson += toolCallDelta.function.arguments
}
}
}
if (choice.finish_reason !== null && choice.finish_reason !== undefined) {
finalFinishReason = choice.finish_reason
}
}
// Emit accumulated tool_use events after the stream ends.
const finalToolUseBlocks: ToolUseBlock[] = []
for (const buf of toolCallBuffers.values()) {
let parsedInput: Record<string, unknown> = {}
try {
const parsed: unknown = JSON.parse(buf.argsJson)
if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
parsedInput = parsed as Record<string, unknown>
}
} catch {
// Malformed JSON — surface as empty object.
}
const toolUseBlock: ToolUseBlock = {
type: 'tool_use',
id: buf.id,
name: buf.name,
input: parsedInput,
}
finalToolUseBlocks.push(toolUseBlock)
const toolUseEvent: StreamEvent = { type: 'tool_use', data: toolUseBlock }
yield toolUseEvent
}
// Build the complete content array for the done response.
const doneContent: ContentBlock[] = []
if (fullText.length > 0) {
const textBlock: TextBlock = { type: 'text', text: fullText }
doneContent.push(textBlock)
}
doneContent.push(...finalToolUseBlocks)
// Fallback: extract tool calls from text when streaming produced no
// native tool_calls (same logic as fromOpenAICompletion).
if (finalToolUseBlocks.length === 0 && fullText.length > 0 && options.tools) {
const toolNames = options.tools.map(t => t.name)
const extracted = extractToolCallsFromText(fullText, toolNames)
if (extracted.length > 0) {
doneContent.push(...extracted)
for (const block of extracted) {
yield { type: 'tool_use', data: block } satisfies StreamEvent
}
}
}
const hasToolUseBlocks = doneContent.some(b => b.type === 'tool_use')
const resolvedStopReason = hasToolUseBlocks && finalFinishReason === 'stop'
? 'tool_use'
: normalizeFinishReason(finalFinishReason)
const finalResponse: LLMResponse = {
id: completionId,
content: doneContent,
model: completionModel,
stop_reason: resolvedStopReason,
usage: { input_tokens: inputTokens, output_tokens: outputTokens },
}
const doneEvent: StreamEvent = { type: 'done', data: finalResponse }
yield doneEvent
} catch (err) {
const error = err instanceof Error ? err : new Error(String(err))
const errorEvent: StreamEvent = { type: 'error', data: error }
yield errorEvent
}
}
}
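
The tool-call buffering inside `stream()` can be sketched on its own as follows. `ToolCallDelta` is a simplified, assumed shape modelled on the fields the adapter reads (`index`, `id`, `function.name`, `function.arguments`); `assembleToolCalls` is a hypothetical helper, not a function in the adapter.

```typescript
// Assumed, simplified shape of one streamed tool-call fragment.
interface ToolCallDelta {
  index: number
  id?: string
  function?: { name?: string; arguments?: string }
}

interface AssembledToolCall {
  id: string
  name: string
  input: Record<string, unknown>
}

// Hypothetical helper: merges fragments by index, then parses the joined JSON.
function assembleToolCalls(deltas: readonly ToolCallDelta[]): AssembledToolCall[] {
  const buffers = new Map<number, { id: string; name: string; argsJson: string }>()
  for (const delta of deltas) {
    let buf = buffers.get(delta.index)
    if (buf === undefined) {
      buf = { id: '', name: '', argsJson: '' }
      buffers.set(delta.index, buf)
    }
    if (delta.id) buf.id = delta.id
    if (delta.function?.name) buf.name = delta.function.name
    if (delta.function?.arguments) buf.argsJson += delta.function.arguments
  }

  const calls: AssembledToolCall[] = []
  buffers.forEach((buf) => {
    let input: Record<string, unknown> = {}
    try {
      const parsed: unknown = JSON.parse(buf.argsJson)
      if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
        input = parsed as Record<string, unknown>
      }
    } catch {
      // Malformed or empty JSON falls back to an empty input object.
    }
    calls.push({ id: buf.id, name: buf.name, input })
  })
  return calls
}
```

Keying the buffers by `index` rather than `id` mirrors the adapter: in this sketch, as in the code above, later fragments may omit the `id` and carry only more argument text.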

@ -313,8 +313,7 @@ export class CopilotAdapter implements LLMAdapter {
},
)
const toolNames = options.tools?.map(t => t.name)
return fromOpenAICompletion(completion, toolNames)
return fromOpenAICompletion(completion)
}
// -------------------------------------------------------------------------

@ -1,29 +0,0 @@
/**
* @fileoverview DeepSeek adapter.
*
* Thin wrapper around OpenAIAdapter that hard-codes the official DeepSeek
* OpenAI-compatible endpoint and DEEPSEEK_API_KEY environment variable fallback.
*/
import { OpenAIAdapter } from './openai.js'
/**
* LLM adapter for DeepSeek models (deepseek-chat, deepseek-reasoner, and future models).
*
* Thread-safe. Can be shared across agents.
*
* Usage:
* provider: 'deepseek'
* model: 'deepseek-chat' (or 'deepseek-reasoner' for the thinking model)
*/
export class DeepSeekAdapter extends OpenAIAdapter {
readonly name = 'deepseek'
constructor(apiKey?: string, baseURL?: string) {
// Allow override of baseURL (for proxies or future changes) but default to official DeepSeek endpoint.
super(
apiKey ?? process.env['DEEPSEEK_API_KEY'],
baseURL ?? 'https://api.deepseek.com/v1'
)
}
}

@ -1,379 +0,0 @@
/**
* @fileoverview Google Gemini adapter implementing {@link LLMAdapter}.
*
* Built for `@google/genai` (the unified Google Gen AI SDK, v1.x), NOT the
* legacy `@google/generative-ai` package.
*
* Converts between the framework's internal {@link ContentBlock} types and the
* `@google/genai` SDK's wire format, handling tool definitions, system prompts,
* and both batch and streaming response paths.
*
* API key resolution order:
* 1. `apiKey` constructor argument
* 2. `GEMINI_API_KEY` environment variable
* 3. `GOOGLE_API_KEY` environment variable
*
* @example
* ```ts
* import { GeminiAdapter } from './gemini.js'
*
* const adapter = new GeminiAdapter()
* const response = await adapter.chat(messages, {
* model: 'gemini-2.5-flash',
* maxTokens: 1024,
* })
* ```
*/
import {
GoogleGenAI,
FunctionCallingConfigMode,
type Content,
type FunctionDeclaration,
type GenerateContentConfig,
type GenerateContentResponse,
type Part,
type Tool as GeminiTool,
} from '@google/genai'
import type {
ContentBlock,
LLMAdapter,
LLMChatOptions,
LLMMessage,
LLMResponse,
LLMStreamOptions,
LLMToolDef,
StreamEvent,
ToolUseBlock,
} from '../types.js'
// ---------------------------------------------------------------------------
// Internal helpers
// ---------------------------------------------------------------------------
/**
* Map framework role names to Gemini role names.
*
* Gemini uses `"model"` instead of `"assistant"`.
*/
function toGeminiRole(role: 'user' | 'assistant'): string {
return role === 'assistant' ? 'model' : 'user'
}
/**
* Convert framework messages into Gemini's {@link Content}[] format.
*
* Key differences from Anthropic:
* - Gemini uses `"model"` instead of `"assistant"`.
* - `functionResponse` parts (tool results) must appear in `"user"` turns.
* - `functionCall` parts appear in `"model"` turns.
* - We build a name lookup map from tool_use blocks so tool_result blocks
* can resolve the function name required by Gemini's `functionResponse`.
*/
function toGeminiContents(messages: LLMMessage[]): Content[] {
// First pass: build id → name map for resolving tool results.
const toolNameById = new Map<string, string>()
for (const msg of messages) {
for (const block of msg.content) {
if (block.type === 'tool_use') {
toolNameById.set(block.id, block.name)
}
}
}
return messages.map((msg): Content => {
const parts: Part[] = msg.content.map((block): Part => {
switch (block.type) {
case 'text':
return { text: block.text }
case 'tool_use':
return {
functionCall: {
id: block.id,
name: block.name,
args: block.input,
},
}
case 'tool_result': {
const name = toolNameById.get(block.tool_use_id) ?? block.tool_use_id
return {
functionResponse: {
id: block.tool_use_id,
name,
response: {
content:
typeof block.content === 'string'
? block.content
: JSON.stringify(block.content),
isError: block.is_error ?? false,
},
},
}
}
case 'image':
return {
inlineData: {
mimeType: block.source.media_type,
data: block.source.data,
},
}
default: {
const _exhaustive: never = block
throw new Error(`Unhandled content block type: ${JSON.stringify(_exhaustive)}`)
}
}
})
return { role: toGeminiRole(msg.role), parts }
})
}
/**
* Convert framework {@link LLMToolDef}s into a Gemini `tools` config array.
*
* In `@google/genai`, function declarations use `parametersJsonSchema` (not
* `parameters` or `input_schema`). All declarations are grouped under a single
* tool entry.
*/
function toGeminiTools(tools: readonly LLMToolDef[]): GeminiTool[] {
const functionDeclarations: FunctionDeclaration[] = tools.map((t) => ({
name: t.name,
description: t.description,
parametersJsonSchema: t.inputSchema as Record<string, unknown>,
}))
return [{ functionDeclarations }]
}
/**
* Build the {@link GenerateContentConfig} shared by chat() and stream().
*/
function buildConfig(
options: LLMChatOptions | LLMStreamOptions,
): GenerateContentConfig {
return {
maxOutputTokens: options.maxTokens ?? 4096,
temperature: options.temperature,
systemInstruction: options.systemPrompt,
tools: options.tools ? toGeminiTools(options.tools) : undefined,
toolConfig: options.tools
? { functionCallingConfig: { mode: FunctionCallingConfigMode.AUTO } }
: undefined,
abortSignal: options.abortSignal,
}
}
/**
* Generate a pseudo-random ID string for tool use blocks.
*
* Gemini may not always return call IDs (especially in streaming), so we
* fabricate them when absent to satisfy the framework's {@link ToolUseBlock}
* contract.
*/
function generateId(): string {
return `gemini-${Date.now()}-${Math.random().toString(36).slice(2, 9)}`
}
/**
* Extract the function call ID from a Gemini part, or generate one.
*
* The `id` field exists in newer API versions but may be absent in older
* responses, so we cast conservatively and fall back to a generated ID.
*/
function getFunctionCallId(part: Part): string {
return (part.functionCall as { id?: string } | undefined)?.id ?? generateId()
}
/**
* Convert a Gemini {@link GenerateContentResponse} into a framework
* {@link LLMResponse}.
*/
function fromGeminiResponse(
response: GenerateContentResponse,
id: string,
model: string,
): LLMResponse {
const candidate = response.candidates?.[0]
const content: ContentBlock[] = []
for (const part of candidate?.content?.parts ?? []) {
if (part.text !== undefined && part.text !== '') {
content.push({ type: 'text', text: part.text })
} else if (part.functionCall !== undefined) {
content.push({
type: 'tool_use',
id: getFunctionCallId(part),
name: part.functionCall.name ?? '',
input: (part.functionCall.args ?? {}) as Record<string, unknown>,
})
}
// inlineData echoes and other part types are silently ignored.
}
// Map Gemini finish reasons to framework stop_reason vocabulary.
const finishReason = candidate?.finishReason as string | undefined
let stop_reason: LLMResponse['stop_reason'] = 'end_turn'
if (finishReason === 'MAX_TOKENS') {
stop_reason = 'max_tokens'
} else if (content.some((b) => b.type === 'tool_use')) {
// Gemini may report STOP even when it returned function calls.
stop_reason = 'tool_use'
}
const usage = response.usageMetadata
return {
id,
content,
model,
stop_reason,
usage: {
input_tokens: usage?.promptTokenCount ?? 0,
output_tokens: usage?.candidatesTokenCount ?? 0,
},
}
}
// ---------------------------------------------------------------------------
// Adapter implementation
// ---------------------------------------------------------------------------
/**
* LLM adapter backed by the Google Gemini API via `@google/genai`.
*
* Thread-safe: a single instance may be shared across concurrent agent runs.
* The underlying SDK client is stateless across requests.
*/
export class GeminiAdapter implements LLMAdapter {
readonly name = 'gemini'
readonly #client: GoogleGenAI
constructor(apiKey?: string) {
this.#client = new GoogleGenAI({
apiKey: apiKey ?? process.env['GEMINI_API_KEY'] ?? process.env['GOOGLE_API_KEY'],
})
}
// -------------------------------------------------------------------------
// chat()
// -------------------------------------------------------------------------
/**
* Send a non-streaming chat request and resolve with the complete
* {@link LLMResponse}.
*
* Uses `ai.models.generateContent()` with the full conversation as `contents`,
* which is the idiomatic pattern for `@google/genai`.
*/
async chat(messages: LLMMessage[], options: LLMChatOptions): Promise<LLMResponse> {
const id = generateId()
const contents = toGeminiContents(messages)
const response = await this.#client.models.generateContent({
model: options.model,
contents,
config: buildConfig(options),
})
return fromGeminiResponse(response, id, options.model)
}
// -------------------------------------------------------------------------
// stream()
// -------------------------------------------------------------------------
/**
* Send a streaming chat request and yield {@link StreamEvent}s as they
* arrive from the API.
*
* Uses `ai.models.generateContentStream()` which returns an
* `AsyncGenerator<GenerateContentResponse>`. Each yielded chunk has the same
* shape as a full response but contains only the delta for that chunk.
*
* Because `@google/genai` doesn't expose a `finalMessage()` helper like the
* Anthropic SDK, we accumulate content and token counts as we stream so that
* the terminal `done` event carries a complete and accurate {@link LLMResponse}.
*
* Sequence guarantees (matching the Anthropic adapter):
* - Zero or more `text` events with incremental deltas
* - Zero or more `tool_use` events (one per call; Gemini doesn't stream args)
* - Exactly one terminal event: `done` or `error`
*/
async *stream(
messages: LLMMessage[],
options: LLMStreamOptions,
): AsyncIterable<StreamEvent> {
const id = generateId()
const contents = toGeminiContents(messages)
try {
const streamResponse = await this.#client.models.generateContentStream({
model: options.model,
contents,
config: buildConfig(options),
})
// Accumulators for building the done payload.
const accumulatedContent: ContentBlock[] = []
let inputTokens = 0
let outputTokens = 0
let lastFinishReason: string | undefined
for await (const chunk of streamResponse) {
const candidate = chunk.candidates?.[0]
// Accumulate token counts — the API emits these on the final chunk.
if (chunk.usageMetadata) {
inputTokens = chunk.usageMetadata.promptTokenCount ?? inputTokens
outputTokens = chunk.usageMetadata.candidatesTokenCount ?? outputTokens
}
if (candidate?.finishReason) {
lastFinishReason = candidate.finishReason as string
}
for (const part of candidate?.content?.parts ?? []) {
if (part.text) {
accumulatedContent.push({ type: 'text', text: part.text })
yield { type: 'text', data: part.text } satisfies StreamEvent
} else if (part.functionCall) {
const toolId = getFunctionCallId(part)
const toolUseBlock: ToolUseBlock = {
type: 'tool_use',
id: toolId,
name: part.functionCall.name ?? '',
input: (part.functionCall.args ?? {}) as Record<string, unknown>,
}
accumulatedContent.push(toolUseBlock)
yield { type: 'tool_use', data: toolUseBlock } satisfies StreamEvent
}
}
}
// Determine stop_reason from the accumulated response.
const hasToolUse = accumulatedContent.some((b) => b.type === 'tool_use')
let stop_reason: LLMResponse['stop_reason'] = 'end_turn'
if (lastFinishReason === 'MAX_TOKENS') {
stop_reason = 'max_tokens'
} else if (hasToolUse) {
stop_reason = 'tool_use'
}
const finalResponse: LLMResponse = {
id,
content: accumulatedContent,
model: options.model,
stop_reason,
usage: { input_tokens: inputTokens, output_tokens: outputTokens },
}
yield { type: 'done', data: finalResponse } satisfies StreamEvent
} catch (err) {
const error = err instanceof Error ? err : new Error(String(err))
yield { type: 'error', data: error } satisfies StreamEvent
}
}
}
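
The two-pass name lookup in `toGeminiContents` can be illustrated with reduced types. `Block`, `Message`, and `resolveToolResultNames` below are hypothetical stand-ins for the framework types, kept just large enough to show how a `tool_result` finds its function name and what happens when no matching `tool_use` exists.

```typescript
// Reduced stand-ins for the framework's content block and message types.
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string }
  | { type: 'tool_result'; tool_use_id: string }

interface Message {
  role: 'user' | 'assistant'
  content: Block[]
}

// Hypothetical helper: returns the function name resolved for each tool_result.
function resolveToolResultNames(messages: readonly Message[]): string[] {
  // First pass: record which name each tool_use id belongs to.
  const toolNameById = new Map<string, string>()
  for (const msg of messages) {
    for (const block of msg.content) {
      if (block.type === 'tool_use') toolNameById.set(block.id, block.name)
    }
  }
  // Second pass: resolve each result, falling back to the raw id when unknown.
  const names: string[] = []
  for (const msg of messages) {
    for (const block of msg.content) {
      if (block.type === 'tool_result') {
        names.push(toolNameById.get(block.tool_use_id) ?? block.tool_use_id)
      }
    }
  }
  return names
}
```

The fallback to the raw id matches the adapter's `?? block.tool_use_id`; it keeps the request well-formed when history has been truncated, though the resulting name is only a placeholder.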

@ -1,29 +0,0 @@
/**
* @fileoverview Grok (xAI) adapter.
*
* Thin wrapper around OpenAIAdapter that hard-codes the official xAI endpoint
* and XAI_API_KEY environment variable fallback.
*/
import { OpenAIAdapter } from './openai.js'
/**
* LLM adapter for Grok models (grok-4 series and future models).
*
* Thread-safe. Can be shared across agents.
*
* Usage:
* provider: 'grok'
* model: 'grok-4' (or any current Grok model name)
*/
export class GrokAdapter extends OpenAIAdapter {
readonly name = 'grok'
constructor(apiKey?: string, baseURL?: string) {
// Allow override of baseURL (for proxies or future changes) but default to official xAI endpoint.
super(
apiKey ?? process.env['XAI_API_KEY'],
baseURL ?? 'https://api.x.ai/v1'
)
}
}

@ -1,29 +0,0 @@
/**
* @fileoverview MiniMax adapter.
*
* Thin wrapper around OpenAIAdapter that hard-codes the official MiniMax
* OpenAI-compatible endpoint and MINIMAX_API_KEY environment variable fallback.
*/
import { OpenAIAdapter } from './openai.js'
/**
* LLM adapter for MiniMax models (MiniMax-M2.7 series and future models).
*
* Thread-safe. Can be shared across agents.
*
* Usage:
* provider: 'minimax'
* model: 'MiniMax-M2.7' (or any current MiniMax model name)
*/
export class MiniMaxAdapter extends OpenAIAdapter {
readonly name = 'minimax'
constructor(apiKey?: string, baseURL?: string) {
// Allow override of baseURL (for proxies or future changes) but default to official MiniMax endpoint.
super(
apiKey ?? process.env['MINIMAX_API_KEY'],
baseURL ?? process.env['MINIMAX_BASE_URL'] ?? 'https://api.minimax.io/v1'
)
}
}
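
The fallback order used by the MiniMax constructor can be sketched as a pure function. `resolveMiniMaxEndpoint` and `ResolvedEndpoint` are hypothetical names, and the environment is passed in as a plain record (an assumption made so the sketch does not depend on `process.env`).

```typescript
interface ResolvedEndpoint {
  apiKey: string | undefined
  baseURL: string
}

// Hypothetical helper mirroring the constructor's order:
// explicit argument, then environment variable, then the built-in default.
function resolveMiniMaxEndpoint(
  env: Record<string, string | undefined>,
  apiKey?: string,
  baseURL?: string,
): ResolvedEndpoint {
  return {
    apiKey: apiKey ?? env['MINIMAX_API_KEY'],
    baseURL: baseURL ?? env['MINIMAX_BASE_URL'] ?? 'https://api.minimax.io/v1',
  }
}
```

Because `??` only skips `null` and `undefined`, an environment variable set to an empty string is used as-is instead of falling through to the default.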

@ -25,7 +25,6 @@ import type {
TextBlock,
ToolUseBlock,
} from '../types.js'
import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'
// ---------------------------------------------------------------------------
// Framework → OpenAI
@ -167,18 +166,8 @@ function toOpenAIAssistantMessage(msg: LLMMessage): ChatCompletionAssistantMessa
*
* Takes only the first choice (index 0), consistent with how the framework
* is designed for single-output agents.
*
* @param completion - The raw OpenAI completion.
* @param knownToolNames - Optional whitelist of tool names. When the model
* returns no `tool_calls` but the text contains JSON
* that looks like a tool call, the fallback extractor
* uses this list to validate matches. Pass the names
* of tools sent in the request for best results.
*/
export function fromOpenAICompletion(
completion: ChatCompletion,
knownToolNames?: string[],
): LLMResponse {
export function fromOpenAICompletion(completion: ChatCompletion): LLMResponse {
const choice = completion.choices[0]
if (choice === undefined) {
throw new Error('OpenAI returned a completion with no choices')
@ -212,35 +201,7 @@ export function fromOpenAICompletion(
content.push(toolUseBlock)
}
// ---------------------------------------------------------------------------
// Fallback: extract tool calls from text when native tool_calls is empty.
//
// Some local models (Ollama thinking models, misconfigured vLLM) return tool
// calls as plain text instead of using the tool_calls wire format. When we
// have text but no tool_calls, try to extract them from the text.
// ---------------------------------------------------------------------------
const hasNativeToolCalls = (message.tool_calls ?? []).length > 0
if (
!hasNativeToolCalls &&
knownToolNames !== undefined &&
knownToolNames.length > 0 &&
message.content !== null &&
message.content !== undefined &&
message.content.length > 0
) {
const extracted = extractToolCallsFromText(message.content, knownToolNames)
if (extracted.length > 0) {
content.push(...extracted)
}
}
const hasToolUseBlocks = content.some(b => b.type === 'tool_use')
const rawStopReason = choice.finish_reason ?? 'stop'
// If we extracted tool calls from text but the finish_reason was 'stop',
// correct it to 'tool_use' so the agent runner continues the loop.
const stopReason = hasToolUseBlocks && rawStopReason === 'stop'
? 'tool_use'
: normalizeFinishReason(rawStopReason)
const stopReason = normalizeFinishReason(choice.finish_reason ?? 'stop')
return {
id: completion.id,
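
The text fallback that exists on one side of this hunk can be pictured with the sketch below. `extractToolCallsFromText` lives in `../tool/text-tool-extractor.js`, which this diff does not show, so `extractJsonToolCalls` is a hypothetical stand-in. It assumes the whole message is a single JSON object of the form `{"name": ..., "arguments": {...}}`; the block shape and the `text-call-0` id are assumptions too.

```typescript
// Hypothetical stand-in for the framework's ToolUseBlock type.
interface ToolUseBlock {
  type: 'tool_use'
  id: string
  name: string
  input: Record<string, unknown>
}

// Assumed text format: the whole message is one JSON object
// {"name": "<tool>", "arguments": {...}}. The real extractor may accept more.
function extractJsonToolCalls(text: string, knownToolNames: string[]): ToolUseBlock[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(text.trim())
  } catch {
    return [] // not JSON: treat the message as ordinary text
  }
  if (typeof parsed !== 'object' || parsed === null) return []
  const { name, arguments: args } = parsed as { name?: unknown; arguments?: unknown }
  // Only names that were sent in the request count as tool calls.
  if (typeof name !== 'string' || !knownToolNames.includes(name)) return []
  const input =
    typeof args === 'object' && args !== null && !Array.isArray(args)
      ? (args as Record<string, unknown>)
      : {}
  return [{ type: 'tool_use', id: 'text-call-0', name, input }]
}

// Mirrors the stop-reason correction in the hunk: a 'stop' finish that carries
// extracted tool calls is reported as 'tool_use' so the agent loop continues.
// (The diff additionally normalizes the raw reason; that step is omitted here.)
function resolveStopReason(hasToolUse: boolean, finishReason: string): string {
  return hasToolUse && finishReason === 'stop' ? 'tool_use' : finishReason
}

const calls = extractJsonToolCalls('{"name":"grep","arguments":{"pattern":"TODO"}}', ['grep'])
const stopReason = resolveStopReason(calls.length > 0, 'stop')
```

Validating against the names sent in the request keeps ordinary JSON in a model's answer from being mistaken for a tool call.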

View File

@ -54,7 +54,6 @@ import {
normalizeFinishReason,
buildOpenAIMessageList,
} from './openai-common.js'
import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'
// ---------------------------------------------------------------------------
// Adapter implementation
@ -66,7 +65,7 @@ import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'
* Thread-safe: a single instance may be shared across concurrent agent runs.
*/
export class OpenAIAdapter implements LLMAdapter {
readonly name: string = 'openai'
readonly name = 'openai'
readonly #client: OpenAI
@ -105,8 +104,7 @@ export class OpenAIAdapter implements LLMAdapter {
},
)
const toolNames = options.tools?.map(t => t.name)
return fromOpenAICompletion(completion, toolNames)
return fromOpenAICompletion(completion)
}
// -------------------------------------------------------------------------
@ -243,29 +241,11 @@ export class OpenAIAdapter implements LLMAdapter {
}
doneContent.push(...finalToolUseBlocks)
// Fallback: extract tool calls from text when streaming produced no
// native tool_calls (same logic as fromOpenAICompletion).
if (finalToolUseBlocks.length === 0 && fullText.length > 0 && options.tools) {
const toolNames = options.tools.map(t => t.name)
const extracted = extractToolCallsFromText(fullText, toolNames)
if (extracted.length > 0) {
doneContent.push(...extracted)
for (const block of extracted) {
yield { type: 'tool_use', data: block } satisfies StreamEvent
}
}
}
const hasToolUseBlocks = doneContent.some(b => b.type === 'tool_use')
const resolvedStopReason = hasToolUseBlocks && finalFinishReason === 'stop'
? 'tool_use'
: normalizeFinishReason(finalFinishReason)
const finalResponse: LLMResponse = {
id: completionId,
content: doneContent,
model: completionModel,
stop_reason: resolvedStopReason,
stop_reason: normalizeFinishReason(finalFinishReason),
usage: { input_tokens: inputTokens, output_tokens: outputTokens },
}

View File

@ -1,5 +0,0 @@
export type {
ConnectMCPToolsConfig,
ConnectedMCPTools,
} from './tool/mcp.js'
export { connectMCPTools } from './tool/mcp.js'

View File

@ -124,18 +124,8 @@ export class SharedMemory {
* - plan: Implement feature X using const type params
* ```
*/
async getSummary(filter?: { taskIds?: string[] }): Promise<string> {
let all = await this.store.list()
if (filter?.taskIds && filter.taskIds.length > 0) {
const taskIds = new Set(filter.taskIds)
all = all.filter((entry) => {
const slashIdx = entry.key.indexOf('/')
const localKey = slashIdx === -1 ? entry.key : entry.key.slice(slashIdx + 1)
if (!localKey.startsWith('task:') || !localKey.endsWith(':result')) return false
const taskId = localKey.slice('task:'.length, localKey.length - ':result'.length)
return taskIds.has(taskId)
})
}
async getSummary(): Promise<string> {
const all = await this.store.list()
if (all.length === 0) return ''
// Group entries by agent name.
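
The `taskIds` filter in this hunk depends on a key convention of the form `<agent>/task:<taskId>:result`. A minimal sketch of that parsing, with the hypothetical helper names `taskIdFromKey` and `filterByTaskIds` and plain key strings in place of store entries:

```typescript
// Extract the task id from a key such as "coder/task:t1:result".
// Returns undefined for keys that are not task results.
function taskIdFromKey(key: string): string | undefined {
  const slashIdx = key.indexOf('/')
  const localKey = slashIdx === -1 ? key : key.slice(slashIdx + 1)
  if (!localKey.startsWith('task:') || !localKey.endsWith(':result')) return undefined
  return localKey.slice('task:'.length, localKey.length - ':result'.length)
}

// Keep only the keys whose task id is in `taskIds`.
// An empty list means "no filtering", as in the diff.
function filterByTaskIds(keys: string[], taskIds: string[]): string[] {
  if (taskIds.length === 0) return keys
  const wanted = new Set(taskIds)
  return keys.filter((key) => {
    const id = taskIdFromKey(key)
    return id !== undefined && wanted.has(id)
  })
}

const keys = ['coder/task:t1:result', 'coder/plan', 'reviewer/task:t2:result']
const kept = filterByTaskIds(keys, ['t2'])
```

Once a filter is active, entries that are not task results (such as `coder/plan`) are dropped as well, which matches the early `return false` in the hunk.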

File diff suppressed because it is too large

View File

@ -15,7 +15,6 @@
import type { AgentConfig, Task } from '../types.js'
import type { TaskQueue } from '../task/queue.js'
import { extractKeywords, keywordScore } from '../utils/keywords.js'
// ---------------------------------------------------------------------------
// Public types
@ -75,6 +74,38 @@ function countBlockedDependents(taskId: string, allTasks: Task[]): number {
return visited.size
}
/**
* Compute a simple keyword-overlap score between `text` and `keywords`.
*
* Both the text and keywords are normalised to lower-case before comparison.
* Each keyword that appears in the text contributes +1 to the score.
*/
function keywordScore(text: string, keywords: string[]): number {
const lower = text.toLowerCase()
return keywords.reduce((acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0), 0)
}
/**
* Extract a list of meaningful keywords from a string for capability matching.
*
* Strips common stop-words so that incidental matches (e.g. "the", "and") do
* not inflate scores. Returns unique words longer than three characters.
*/
function extractKeywords(text: string): string[] {
const STOP_WORDS = new Set([
'the', 'and', 'for', 'that', 'this', 'with', 'are', 'from', 'have',
'will', 'your', 'you', 'can', 'all', 'each', 'when', 'then', 'they',
'them', 'their', 'about', 'into', 'more', 'also', 'should', 'must',
])
return [...new Set(
text
.toLowerCase()
.split(/\W+/)
.filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
)]
}
// ---------------------------------------------------------------------------
// Scheduler
// ---------------------------------------------------------------------------
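
A small usage sketch of the two helpers in this hunk. The function bodies follow the diff, except that the stop-word list is shortened and `Array.from` replaces the spread; the sample strings are made up for illustration.

```typescript
// Shortened stop-word list; the diff's version is longer.
const STOP_WORDS = new Set(['the', 'and', 'for', 'that', 'this', 'with', 'from', 'have'])

// Unique lower-case words longer than three characters, minus stop-words.
function extractKeywords(text: string): string[] {
  return Array.from(new Set(
    text
      .toLowerCase()
      .split(/\W+/)
      .filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
  ))
}

// +1 for every keyword that occurs as a substring of the text.
function keywordScore(text: string, keywords: string[]): number {
  const lower = text.toLowerCase()
  return keywords.reduce((acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0), 0)
}

const keywords = extractKeywords('Implement the authentication module with tests')
const score = keywordScore('Backend engineer: authentication and database modules', keywords)
```

Here `keywords` is `['implement', 'authentication', 'module', 'tests']` and `score` is 2. Matching is by substring, so `module` also counts against `modules`.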

View File

@ -18,7 +18,6 @@ export type TaskQueueEvent =
| 'task:ready'
| 'task:complete'
| 'task:failed'
| 'task:skipped'
| 'all:complete'
/** Handler for `'task:ready' | 'task:complete' | 'task:failed'` events. */
@ -157,51 +156,6 @@ export class TaskQueue {
return failed
}
/**
* Marks `taskId` as `'skipped'` and records `reason` in the `result` field.
*
* Fires `'task:skipped'` for the skipped task and cascades to every
* downstream task that transitively depended on it, even if the dependent
* has other dependencies that are still pending or completed. A skipped
* upstream is treated as permanently unsatisfiable, mirroring `fail()`.
*
* @throws {Error} when `taskId` is not found.
*/
skip(taskId: string, reason: string): Task {
const skipped = this.update(taskId, { status: 'skipped', result: reason })
this.emit('task:skipped', skipped)
this.cascadeSkip(taskId)
if (this.isComplete()) {
this.emitAllComplete()
}
return skipped
}
/**
* Marks all non-terminal tasks as `'skipped'`.
*
* Used when an approval gate rejects continuation: every pending, blocked,
* or in-progress task is skipped with the given reason.
*
* **Important:** Call only when no tasks are actively executing. The
* orchestrator invokes this after `await Promise.all()`, so no tasks are
* in-flight. Calling while agents are running may mark an in-progress task
* as skipped while its agent continues executing.
*/
skipRemaining(reason = 'Skipped: approval rejected.'): void {
// Snapshot first — update() mutates the live map, which is unsafe to
// iterate over during modification.
const snapshot = Array.from(this.tasks.values())
for (const task of snapshot) {
if (task.status === 'completed' || task.status === 'failed' || task.status === 'skipped') continue
const skipped = this.update(task.id, { status: 'skipped', result: reason })
this.emit('task:skipped', skipped)
}
if (this.isComplete()) {
this.emitAllComplete()
}
}
/**
* Recursively marks all tasks that (transitively) depend on `failedTaskId`
* as `'failed'` with an informative message, firing `'task:failed'` for each.
@ -224,24 +178,6 @@ export class TaskQueue {
}
}
/**
* Recursively marks all tasks that (transitively) depend on `skippedTaskId`
* as `'skipped'`, firing `'task:skipped'` for each.
*/
private cascadeSkip(skippedTaskId: string): void {
for (const task of this.tasks.values()) {
if (task.status !== 'blocked' && task.status !== 'pending') continue
if (!task.dependsOn?.includes(skippedTaskId)) continue
const cascaded = this.update(task.id, {
status: 'skipped',
result: `Skipped: dependency "${skippedTaskId}" was skipped.`,
})
this.emit('task:skipped', cascaded)
this.cascadeSkip(task.id)
}
}
// ---------------------------------------------------------------------------
// Queries
// ---------------------------------------------------------------------------
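
The cascade performed by `skip()` and `cascadeSkip()` can be sketched on a plain map as below. `MiniTask` and `skipWithCascade` are hypothetical simplifications: no events are emitted and no `result` text is recorded.

```typescript
type Status = 'pending' | 'blocked' | 'in_progress' | 'completed' | 'failed' | 'skipped'

interface MiniTask {
  id: string
  status: Status
  dependsOn?: string[]
}

// Skip `taskId`, then recursively skip every pending or blocked task that
// lists an already-skipped task in `dependsOn`. Returns ids in skip order.
function skipWithCascade(tasks: Map<string, MiniTask>, taskId: string): string[] {
  const skipped: string[] = []
  const visit = (id: string): void => {
    const task = tasks.get(id)
    if (task === undefined) throw new Error(`Task "${id}" not found`)
    task.status = 'skipped'
    skipped.push(id)
    // Snapshot the values so the loop is independent of later map changes.
    for (const candidate of Array.from(tasks.values())) {
      if (candidate.status !== 'blocked' && candidate.status !== 'pending') continue
      if (!candidate.dependsOn?.includes(id)) continue
      visit(candidate.id)
    }
  }
  visit(taskId)
  return skipped
}

const tasks = new Map<string, MiniTask>([
  ['a', { id: 'a', status: 'pending' }],
  ['b', { id: 'b', status: 'blocked', dependsOn: ['a'] }],
  ['c', { id: 'c', status: 'blocked', dependsOn: ['b'] }],
  ['d', { id: 'd', status: 'completed', dependsOn: ['a'] }],
])
const order = skipWithCascade(tasks, 'a')
```

Skipping `a` cascades to `b` and then `c`, while the already completed `d` keeps its status, because only pending and blocked tasks are eligible for the cascade.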
@ -289,18 +225,13 @@ export class TaskQueue {
return this.list().filter((t) => t.status === status)
}
/** Returns a task by ID, if present. */
get(taskId: string): Task | undefined {
return this.tasks.get(taskId)
}
/**
* Returns `true` when every task in the queue has reached a terminal state
* (`'completed'`, `'failed'`, or `'skipped'`), **or** the queue is empty.
* (`'completed'` or `'failed'`), **or** the queue is empty.
*/
isComplete(): boolean {
for (const task of this.tasks.values()) {
if (task.status !== 'completed' && task.status !== 'failed' && task.status !== 'skipped') return false
if (task.status !== 'completed' && task.status !== 'failed') return false
}
return true
}
@ -318,14 +249,12 @@ export class TaskQueue {
total: number
completed: number
failed: number
skipped: number
inProgress: number
pending: number
blocked: number
} {
let completed = 0
let failed = 0
let skipped = 0
let inProgress = 0
let pending = 0
let blocked = 0
@ -338,9 +267,6 @@ export class TaskQueue {
case 'failed':
failed++
break
case 'skipped':
skipped++
break
case 'in_progress':
inProgress++
break
@ -357,7 +283,6 @@ export class TaskQueue {
total: this.tasks.size,
completed,
failed,
skipped,
inProgress,
pending,
blocked,
@ -445,7 +370,7 @@ export class TaskQueue {
}
}
private emit(event: 'task:ready' | 'task:complete' | 'task:failed' | 'task:skipped', task: Task): void {
private emit(event: 'task:ready' | 'task:complete' | 'task:failed', task: Task): void {
const map = this.listeners.get(event)
if (!map) return
for (const handler of map.values()) {

View File

@ -31,7 +31,6 @@ export function createTask(input: {
description: string
assignee?: string
dependsOn?: string[]
memoryScope?: 'dependencies' | 'all'
maxRetries?: number
retryDelayMs?: number
retryBackoff?: number
@ -44,7 +43,6 @@ export function createTask(input: {
status: 'pending' as TaskStatus,
assignee: input.assignee,
dependsOn: input.dependsOn ? [...input.dependsOn] : undefined,
memoryScope: input.memoryScope,
result: undefined,
createdAt: now,
updatedAt: now,

View File

@ -1,109 +0,0 @@
/**
* @fileoverview Built-in `delegate_to_agent` tool for synchronous handoff to a roster agent.
*/
import { z } from 'zod'
import type { ToolDefinition, ToolResult, ToolUseContext } from '../../types.js'
const inputSchema = z.object({
target_agent: z.string().min(1).describe('Name of the team agent to run the sub-task.'),
prompt: z.string().min(1).describe('Instructions / question for the target agent.'),
})
/**
* Delegates a sub-task to another agent on the team and returns that agent's final text output.
*
* Only available when the orchestrator injects {@link ToolUseContext.team} with
* `runDelegatedAgent` (pool-backed `runTeam` / `runTasks`). Standalone `runAgent`
* does not register this tool by default.
*
* Nested {@link AgentRunResult.tokenUsage} from the delegated run is surfaced via
* {@link ToolResult.metadata} so the parent runner can aggregate it into its total
* (keeps `maxTokenBudget` accurate across delegation chains).
*/
export const delegateToAgentTool: ToolDefinition<z.infer<typeof inputSchema>> = {
name: 'delegate_to_agent',
description:
'Run a sub-task on another agent from this team and return that agent\'s final answer as the tool result. ' +
'Use when you need a specialist teammate to produce output you will incorporate. ' +
'The target agent runs in a fresh conversation for this prompt only.',
inputSchema,
async execute(
{ target_agent: targetAgent, prompt },
context: ToolUseContext,
): Promise<ToolResult> {
const team = context.team
if (!team?.runDelegatedAgent) {
return {
data:
'delegate_to_agent is only available during orchestrated team runs with the delegation tool enabled. ' +
'Use SharedMemory or explicit tasks instead.',
isError: true,
}
}
if (targetAgent === context.agent.name) {
return {
data: 'Cannot delegate to yourself; use another team member.',
isError: true,
}
}
if (!team.agents.includes(targetAgent)) {
return {
data: `Unknown agent "${targetAgent}". Roster: ${team.agents.join(', ')}`,
isError: true,
}
}
const chain = team.delegationChain ?? []
if (chain.includes(targetAgent)) {
return {
data:
`Delegation cycle detected: ${[...chain, targetAgent].join(' -> ')}. ` +
'Pick a different target or restructure the plan.',
isError: true,
}
}
const depth = team.delegationDepth ?? 0
const maxDepth = team.maxDelegationDepth ?? 3
if (depth >= maxDepth) {
return {
data: `Maximum delegation depth (${maxDepth}) reached; cannot delegate further.`,
isError: true,
}
}
if (team.delegationPool !== undefined && team.delegationPool.availableRunSlots < 1) {
return {
data:
'Agent pool has no free concurrency slot for a delegated run (nested run would block indefinitely). ' +
'Increase orchestrator maxConcurrency, wait for parallel work to finish, or avoid delegating while the pool is saturated.',
isError: true,
}
}
const result = await team.runDelegatedAgent(targetAgent, prompt)
if (team.sharedMemory) {
const suffix = `${Date.now()}-${Math.random().toString(36).slice(2, 10)}`
const key = `delegation:${targetAgent}:${suffix}`
try {
await team.sharedMemory.set(`${context.agent.name}/${key}`, result.output, {
agent: context.agent.name,
delegatedTo: targetAgent,
success: String(result.success),
})
} catch {
// Audit is best-effort; do not fail the tool on store errors.
}
}
return {
data: result.output,
isError: !result.success,
metadata: { tokenUsage: result.tokenUsage },
}
},
}
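
The guard sequence in `delegate_to_agent` (self-delegation, unknown agent, cycle, depth limit) can be condensed into one pure function. This is a sketch only: `DelegationState` and `delegationError` are hypothetical names, the pool-slot check is left out, and the example assumes the chain includes the current agent, which the diff does not confirm.

```typescript
interface DelegationState {
  self: string        // name of the agent making the call
  roster: string[]    // all agents on the team
  chain: string[]     // agents already on the delegation path (assumed to include self)
  depth: number       // current nesting depth
  maxDepth: number    // limit; the diff defaults this to 3
}

// Returns an error message when delegation must be refused, otherwise undefined.
// The checks run in the same order as in the tool.
function delegationError(target: string, s: DelegationState): string | undefined {
  if (target === s.self) return 'Cannot delegate to yourself; use another team member.'
  if (!s.roster.includes(target)) {
    return `Unknown agent "${target}". Roster: ${s.roster.join(', ')}`
  }
  if (s.chain.includes(target)) {
    return `Delegation cycle detected: ${[...s.chain, target].join(' -> ')}.`
  }
  if (s.depth >= s.maxDepth) {
    return `Maximum delegation depth (${s.maxDepth}) reached; cannot delegate further.`
  }
  return undefined
}

const state: DelegationState = {
  self: 'coder',
  roster: ['planner', 'coder', 'reviewer'],
  chain: ['planner', 'coder'],
  depth: 2,
  maxDepth: 3,
}
const cycle = delegationError('planner', state)
const allowed = delegationError('reviewer', state)
const tooDeep = delegationError('reviewer', { ...state, depth: 3 })
```

Returning a message instead of throwing mirrors the tool's approach of reporting refusals as `isError` results that the model can read and react to.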

View File

@ -1,97 +0,0 @@
/**
* Shared recursive directory walk for built-in file tools.
*
* Used by {@link grepTool} and {@link globTool} so glob filtering and skip
* rules stay consistent.
*/
import { readdir, stat } from 'fs/promises'
import { join } from 'path'
/** Directories that are almost never useful to traverse for code search. */
export const SKIP_DIRS = new Set([
'.git',
'.svn',
'.hg',
'node_modules',
'.next',
'dist',
'build',
])
export interface CollectFilesOptions {
/** When set, stop collecting once this many paths are gathered. */
readonly maxFiles?: number
}
/**
* Recursively walk `dir` and return file paths, honouring {@link SKIP_DIRS}
* and an optional filename glob pattern.
*/
export async function collectFiles(
dir: string,
glob: string | undefined,
signal: AbortSignal | undefined,
options?: CollectFilesOptions,
): Promise<string[]> {
const results: string[] = []
await walk(dir, glob, results, signal, options?.maxFiles)
return results
}
async function walk(
dir: string,
glob: string | undefined,
results: string[],
signal: AbortSignal | undefined,
maxFiles: number | undefined,
): Promise<void> {
if (signal?.aborted === true) return
if (maxFiles !== undefined && results.length >= maxFiles) return
let entryNames: string[]
try {
entryNames = await readdir(dir, { encoding: 'utf8' })
} catch {
return
}
for (const entryName of entryNames) {
if (signal !== undefined && signal.aborted) return
if (maxFiles !== undefined && results.length >= maxFiles) return
const fullPath = join(dir, entryName)
let entryInfo: Awaited<ReturnType<typeof stat>>
try {
entryInfo = await stat(fullPath)
} catch {
continue
}
if (entryInfo.isDirectory()) {
if (!SKIP_DIRS.has(entryName)) {
await walk(fullPath, glob, results, signal, maxFiles)
}
} else if (entryInfo.isFile()) {
if (glob === undefined || matchesGlob(entryName, glob)) {
results.push(fullPath)
}
}
}
}
/**
* Minimal glob match supporting `*.ext` and `**\/<pattern>` forms.
*
*/
export function matchesGlob(filename: string, glob: string): boolean {
const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
const regexSource = pattern
.replace(/[.+^${}()|[\]\\]/g, '\\$&')
.replace(/\*/g, '.*')
.replace(/\?/g, '.')
const re = new RegExp(`^${regexSource}$`, 'i')
return re.test(filename)
}
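
A few calls make the limits of `matchesGlob` concrete. The function body is copied from the hunk above; the sample filenames are illustrative.

```typescript
function matchesGlob(filename: string, glob: string): boolean {
  // A leading **/ is dropped because the walker already recurses everywhere.
  const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
  const regexSource = pattern
    .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters first
    .replace(/\*/g, '.*')                 // * matches any run of characters
    .replace(/\?/g, '.')                  // ? matches exactly one character
  return new RegExp(`^${regexSource}$`, 'i').test(filename)
}

const results = {
  tsFile: matchesGlob('index.ts', '*.ts'),
  tsxFile: matchesGlob('index.tsx', '*.ts'),
  recursiveJson: matchesGlob('package.json', '**/*.json'),
  caseInsensitive: matchesGlob('README.MD', '*.md'),
  singleChar: matchesGlob('a.ts', '?.ts'),
  withDirectory: matchesGlob('index.ts', 'src/*.ts'),
}
```

All entries are `true` except `tsxFile` and `withDirectory`. The walker passes only the entry name, not the full path, so a pattern containing a directory segment such as `src/*.ts` never matches.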

View File

@ -1,99 +0,0 @@
/**
* Built-in glob tool.
*
* Lists file paths under a directory matching an optional filename glob.
* Does not read file contents; use {@link grepTool} to search inside files.
*/
import { stat } from 'fs/promises'
import { basename, relative } from 'path'
import { z } from 'zod'
import type { ToolResult } from '../../types.js'
import { collectFiles, matchesGlob } from './fs-walk.js'
import { defineTool } from '../framework.js'
const DEFAULT_MAX_FILES = 500
export const globTool = defineTool({
name: 'glob',
description:
'List file paths under a directory that match an optional filename glob. ' +
'Does not read file contents — use `grep` to search inside files. ' +
'Skips common bulky directories (node_modules, .git, dist, etc.). ' +
'Paths in the result are relative to the process working directory. ' +
'Results are capped by `maxFiles`.',
inputSchema: z.object({
path: z
.string()
.optional()
.describe(
'Directory to list files under. Defaults to the current working directory.',
),
pattern: z
.string()
.optional()
.describe(
'Filename glob (e.g. "*.ts", "**/*.json"). When omitted, every file ' +
'under the directory is listed (subject to maxFiles and skipped dirs).',
),
maxFiles: z
.number()
.int()
.positive()
.optional()
.describe(
`Maximum number of file paths to return. Defaults to ${DEFAULT_MAX_FILES}.`,
),
}),
execute: async (input, context): Promise<ToolResult> => {
const root = input.path ?? process.cwd()
const maxFiles = input.maxFiles ?? DEFAULT_MAX_FILES
const signal = context.abortSignal
let linesOut: string[]
let truncated = false
try {
const info = await stat(root)
if (info.isFile()) {
const name = basename(root)
if (
input.pattern !== undefined &&
!matchesGlob(name, input.pattern)
) {
return { data: 'No files matched.', isError: false }
}
linesOut = [relative(process.cwd(), root) || root]
} else {
const collected = await collectFiles(root, input.pattern, signal, {
maxFiles: maxFiles + 1,
})
truncated = collected.length > maxFiles
const capped = collected.slice(0, maxFiles)
linesOut = capped.map((f) => relative(process.cwd(), f) || f)
}
} catch (err) {
const message = err instanceof Error ? err.message : 'Unknown error'
return {
data: `Cannot access path "${root}": ${message}`,
isError: true,
}
}
if (linesOut.length === 0) {
return { data: 'No files matched.', isError: false }
}
const sorted = [...linesOut].sort((a, b) => a.localeCompare(b))
const truncationNote = truncated
? `\n\n(listing capped at ${maxFiles} paths; raise maxFiles for more)`
: ''
return {
data: sorted.join('\n') + truncationNote,
isError: false,
}
},
})
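
The glob tool asks the walker for `maxFiles + 1` paths so that a single extra result signals truncation without walking the whole tree. A minimal sketch of that idea over any iterable; `collectUpTo` and `listCapped` are hypothetical names, not part of the tool:

```typescript
// Gather at most `limit` items, stopping as soon as the limit is reached.
function collectUpTo(source: Iterable<string>, limit: number): string[] {
  const out: string[] = []
  for (const item of source) {
    if (out.length >= limit) break
    out.push(item)
  }
  return out
}

// Request one item more than the cap: if it arrives, the listing was cut short.
function listCapped(
  source: Iterable<string>,
  maxFiles: number,
): { paths: string[]; truncated: boolean } {
  const collected = collectUpTo(source, maxFiles + 1)
  return {
    paths: collected.slice(0, maxFiles),
    truncated: collected.length > maxFiles,
  }
}

const cut = listCapped(['a.ts', 'b.ts', 'c.ts'], 2)
const exact = listCapped(['a.ts', 'b.ts'], 2)
```

With exactly `maxFiles` entries available, `truncated` stays `false`, so the cap notice appears only when something was actually left out.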

View File

@ -8,18 +8,28 @@
*/
import { spawn } from 'child_process'
import { readFile, stat } from 'fs/promises'
import { relative } from 'path'
import { readdir, readFile, stat } from 'fs/promises'
// Note: readdir is used with { encoding: 'utf8' } to return string[] directly.
import { join, relative } from 'path'
import { z } from 'zod'
import type { ToolResult } from '../../types.js'
import { defineTool } from '../framework.js'
import { collectFiles } from './fs-walk.js'
// ---------------------------------------------------------------------------
// Constants
// ---------------------------------------------------------------------------
const DEFAULT_MAX_RESULTS = 100
// Directories that are almost never useful to search inside
const SKIP_DIRS = new Set([
'.git',
'.svn',
'.hg',
'node_modules',
'.next',
'dist',
'build',
])
// ---------------------------------------------------------------------------
// Tool definition
@ -32,7 +42,6 @@ export const grepTool = defineTool({
'Returns matching lines with their file paths and 1-based line numbers. ' +
'Use the `glob` parameter to restrict the search to specific file types ' +
'(e.g. "*.ts"). ' +
'To list matching file paths without reading contents, use the `glob` tool. ' +
'Results are capped by `maxResults` to keep the response manageable.',
inputSchema: z.object({
@ -261,6 +270,79 @@ async function runNodeSearch(
}
}
// ---------------------------------------------------------------------------
// File collection with glob filtering
// ---------------------------------------------------------------------------
/**
* Recursively walk `dir` and return file paths, honouring `SKIP_DIRS` and an
* optional glob pattern.
*/
async function collectFiles(
dir: string,
glob: string | undefined,
signal: AbortSignal | undefined,
): Promise<string[]> {
const results: string[] = []
await walk(dir, glob, results, signal)
return results
}
async function walk(
dir: string,
glob: string | undefined,
results: string[],
signal: AbortSignal | undefined,
): Promise<void> {
if (signal?.aborted === true) return
let entryNames: string[]
try {
// Read as plain strings so we don't have to deal with Buffer Dirent variants.
entryNames = await readdir(dir, { encoding: 'utf8' })
} catch {
return
}
for (const entryName of entryNames) {
if (signal !== undefined && signal.aborted) return
const fullPath = join(dir, entryName)
let entryInfo: Awaited<ReturnType<typeof stat>>
try {
entryInfo = await stat(fullPath)
} catch {
continue
}
if (entryInfo.isDirectory()) {
if (!SKIP_DIRS.has(entryName)) {
await walk(fullPath, glob, results, signal)
}
} else if (entryInfo.isFile()) {
if (glob === undefined || matchesGlob(entryName, glob)) {
results.push(fullPath)
}
}
}
}
/**
* Minimal glob match supporting `*.ext` and `**\/<pattern>` forms.
*/
function matchesGlob(filename: string, glob: string): boolean {
// Strip leading **/ prefix — we already recurse into all directories
const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
// Convert shell glob characters to regex equivalents
const regexSource = pattern
.replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape special regex chars first
.replace(/\*/g, '.*') // * -> .*
.replace(/\?/g, '.') // ? -> .
const re = new RegExp(`^${regexSource}$`, 'i')
return re.test(filename)
}
// ---------------------------------------------------------------------------
// ripgrep availability check (cached per process)
// ---------------------------------------------------------------------------

View File

@ -8,23 +8,12 @@
import type { ToolDefinition } from '../../types.js'
import { ToolRegistry } from '../framework.js'
import { bashTool } from './bash.js'
import { delegateToAgentTool } from './delegate.js'
import { fileEditTool } from './file-edit.js'
import { fileReadTool } from './file-read.js'
import { fileWriteTool } from './file-write.js'
import { globTool } from './glob.js'
import { grepTool } from './grep.js'
export { bashTool, delegateToAgentTool, fileEditTool, fileReadTool, fileWriteTool, globTool, grepTool }
/** Options for {@link registerBuiltInTools}. */
export interface RegisterBuiltInToolsOptions {
/**
* When true, registers `delegate_to_agent` (team orchestration handoff).
* Default false so standalone agents and `runAgent` do not expose a tool that always errors.
*/
readonly includeDelegateTool?: boolean
}
export { bashTool, fileEditTool, fileReadTool, fileWriteTool, grepTool }
/**
* The ordered list of all built-in tools. Import this when you need to
@ -40,13 +29,6 @@ export const BUILT_IN_TOOLS: ToolDefinition<any>[] = [
fileWriteTool,
fileEditTool,
grepTool,
globTool,
]
/** All built-ins including `delegate_to_agent` (for team registry setup). */
export const ALL_BUILT_IN_TOOLS_WITH_DELEGATE: ToolDefinition<any>[] = [
...BUILT_IN_TOOLS,
delegateToAgentTool,
]
/**
@ -61,14 +43,8 @@ export const ALL_BUILT_IN_TOOLS_WITH_DELEGATE: ToolDefinition<any>[] = [
* registerBuiltInTools(registry)
* ```
*/
export function registerBuiltInTools(
registry: ToolRegistry,
options?: RegisterBuiltInToolsOptions,
): void {
export function registerBuiltInTools(registry: ToolRegistry): void {
for (const tool of BUILT_IN_TOOLS) {
registry.register(tool)
}
if (options?.includeDelegateTool) {
registry.register(delegateToAgentTool)
}
}

View File

@ -24,11 +24,6 @@ export interface ToolExecutorOptions {
* Defaults to 4.
*/
maxConcurrency?: number
/**
* Agent-level default for maximum tool output length in characters.
* Per-tool `maxOutputChars` takes priority over this value.
*/
maxToolOutputChars?: number
}
/** Describes one call in a batch. */
@ -52,12 +47,10 @@ export interface BatchToolCall {
export class ToolExecutor {
private readonly registry: ToolRegistry
private readonly semaphore: Semaphore
private readonly maxToolOutputChars?: number
constructor(registry: ToolRegistry, options: ToolExecutorOptions = {}) {
this.registry = registry
this.semaphore = new Semaphore(options.maxConcurrency ?? 4)
this.maxToolOutputChars = options.maxToolOutputChars
}
// -------------------------------------------------------------------------
@ -163,7 +156,7 @@ export class ToolExecutor {
// --- Execute ---
try {
const result = await tool.execute(parseResult.data, context)
return this.maybeTruncate(tool, result)
return result
} catch (err) {
const message =
err instanceof Error
@ -171,26 +164,10 @@ export class ToolExecutor {
: typeof err === 'string'
? err
: JSON.stringify(err)
return this.maybeTruncate(tool, this.errorResult(`Tool "${tool.name}" threw an error: ${message}`))
return this.errorResult(`Tool "${tool.name}" threw an error: ${message}`)
}
}
/**
* Apply truncation to a tool result if a character limit is configured.
* Priority: per-tool `maxOutputChars` > agent-level `maxToolOutputChars`.
*/
private maybeTruncate(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
tool: ToolDefinition<any>,
result: ToolResult,
): ToolResult {
const maxChars = tool.maxOutputChars ?? this.maxToolOutputChars
if (maxChars === undefined || maxChars <= 0 || result.data.length <= maxChars) {
return result
}
return { ...result, data: truncateToolOutput(result.data, maxChars) }
}
/** Construct an error ToolResult. */
private errorResult(message: string): ToolResult {
return {
@ -199,37 +176,3 @@ export class ToolExecutor {
}
}
}
// ---------------------------------------------------------------------------
// Truncation helper
// ---------------------------------------------------------------------------
/**
* Truncate tool output to fit within `maxChars`, preserving the head (~70%)
* and tail (~30%) with a marker indicating how many characters were removed.
*
* The marker itself is counted against the budget so the returned string
* never exceeds `maxChars`. When `maxChars` is too small to fit any
* content alongside the marker, the output falls back to a hard slice of
* the first `maxChars` characters.
*/
export function truncateToolOutput(data: string, maxChars: number): string {
if (data.length <= maxChars) return data
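// Estimate marker length (digit count may shrink after subtracting content,
// but using data.length gives a safe upper-bound for the digit count).
// The prefix and suffix each carry the space that surrounds the count in the
// final marker, so the estimate is never one character short.
const markerPrefix = '\n\n[...truncated '
const markerSuffix = ' characters...]\n\n'
const markerOverhead = markerPrefix.length + String(data.length).length + markerSuffix.length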
// When maxChars is too small to fit any content alongside the marker,
// fall back to a hard slice so the result never exceeds maxChars.
if (maxChars <= markerOverhead) {
return data.slice(0, maxChars)
}
const available = maxChars - markerOverhead
const headChars = Math.floor(available * 0.7)
const tailChars = available - headChars
const truncatedCount = data.length - headChars - tailChars
return `${data.slice(0, headChars)}\n\n[...truncated ${truncatedCount} characters...]\n\n${data.slice(-tailChars)}`
}
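
The head-and-tail truncation can be checked with a standalone variant. `truncateMiddle` is a hypothetical rewrite, not the function from the diff: it measures the marker by rendering it with the largest possible count, so the budget holds for every input length.

```typescript
// Render the marker for a given number of removed characters.
function marker(removed: number): string {
  return `\n\n[...truncated ${removed} characters...]\n\n`
}

// Keep roughly 70% head and 30% tail of the budget left after the marker.
function truncateMiddle(data: string, maxChars: number): string {
  if (data.length <= maxChars) return data
  // Upper bound: the real count is never larger than data.length.
  const overhead = marker(data.length).length
  // Too small for marker plus content: fall back to a hard slice.
  if (maxChars <= overhead) return data.slice(0, maxChars)
  const available = maxChars - overhead
  const headChars = Math.floor(available * 0.7)
  const tailChars = available - headChars
  return (
    data.slice(0, headChars) +
    marker(data.length - available) +
    data.slice(data.length - tailChars)
  )
}

const long = 'a'.repeat(1999)
const out = truncateMiddle(long, 500)
const tiny = truncateMiddle(long, 10)
```

For the 1999-character input and a 500-character budget, the result is exactly 500 characters long and reports 1536 truncated characters. The 10-character budget takes the hard-slice path.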

View File

@ -72,28 +72,12 @@ export function defineTool<TInput>(config: {
name: string
description: string
inputSchema: ZodSchema<TInput>
/**
* Optional JSON Schema for the LLM (bypasses Zod JSON Schema conversion).
*/
llmInputSchema?: Record<string, unknown>
/**
* Per-tool maximum output length in characters. When set, tool output
* exceeding this limit is truncated (head + tail with a marker in between).
* Takes priority over agent-level `maxToolOutputChars`.
*/
maxOutputChars?: number
execute: (input: TInput, context: ToolUseContext) => Promise<ToolResult>
}): ToolDefinition<TInput> {
return {
name: config.name,
description: config.description,
inputSchema: config.inputSchema,
...(config.llmInputSchema !== undefined
? { llmInputSchema: config.llmInputSchema }
: {}),
...(config.maxOutputChars !== undefined
? { maxOutputChars: config.maxOutputChars }
: {}),
execute: config.execute,
}
}
@ -109,17 +93,13 @@ export function defineTool<TInput>(config: {
export class ToolRegistry {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
private readonly tools = new Map<string, ToolDefinition<any>>()
private readonly runtimeToolNames = new Set<string>()
/**
* Add a tool to the registry. Throws if a tool with the same name has
* already been registered; this prevents silent overwrites.
*/
// eslint-disable-next-line @typescript-eslint/no-explicit-any
register(
tool: ToolDefinition<any>,
options?: { runtimeAdded?: boolean },
): void {
register(tool: ToolDefinition<any>): void {
if (this.tools.has(tool.name)) {
throw new Error(
`ToolRegistry: a tool named "${tool.name}" is already registered. ` +
@ -127,9 +107,6 @@ export class ToolRegistry {
)
}
this.tools.set(tool.name, tool)
if (options?.runtimeAdded === true) {
this.runtimeToolNames.add(tool.name)
}
}
/** Return a tool by name, or `undefined` if not found. */
@ -170,12 +147,11 @@ export class ToolRegistry {
*/
unregister(name: string): void {
this.tools.delete(name)
this.runtimeToolNames.delete(name)
}
/** Alias for {@link unregister} — available for symmetry with `register`. */
deregister(name: string): void {
this.unregister(name)
this.tools.delete(name)
}
/**
@ -185,8 +161,7 @@ export class ToolRegistry {
*/
toToolDefs(): LLMToolDef[] {
return Array.from(this.tools.values()).map((tool) => {
const schema =
tool.llmInputSchema ?? zodToJsonSchema(tool.inputSchema)
const schema = zodToJsonSchema(tool.inputSchema)
return {
name: tool.name,
description: tool.description,
@ -195,14 +170,6 @@ export class ToolRegistry {
})
}
/**
* Return only tools that were added dynamically at runtime (e.g. via
* `agent.addTool()`), in LLM definition format.
*/
toRuntimeToolDefs(): LLMToolDef[] {
return this.toToolDefs().filter(tool => this.runtimeToolNames.has(tool.name))
}
/**
* Convert all registered tools to the Anthropic-style `input_schema`
* format. Prefer {@link toToolDefs} for normal use; this method is exposed
@ -211,20 +178,13 @@ export class ToolRegistry {
toLLMTools(): Array<{
name: string
description: string
/** Anthropic-style tool input JSON Schema (`type` is usually `object`). */
input_schema: Record<string, unknown>
input_schema: {
type: 'object'
properties: Record<string, JSONSchemaProperty>
required?: string[]
}
}> {
return Array.from(this.tools.values()).map((tool) => {
if (tool.llmInputSchema !== undefined) {
return {
name: tool.name,
description: tool.description,
input_schema: {
type: 'object' as const,
...(tool.llmInputSchema as Record<string, unknown>),
},
}
}
const schema = zodToJsonSchema(tool.inputSchema)
return {
name: tool.name,
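
The runtime-tool bookkeeping on one side of these hunks pairs the tool map with a set of names. A reduced sketch, assuming a hypothetical `MiniTool` shape with no schemas or execution:

```typescript
interface MiniTool {
  name: string
  description: string
}

class MiniRegistry {
  private readonly tools = new Map<string, MiniTool>()
  private readonly runtimeToolNames = new Set<string>()

  // Throws on duplicate names so an existing tool is never silently replaced.
  register(tool: MiniTool, options?: { runtimeAdded?: boolean }): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`a tool named "${tool.name}" is already registered`)
    }
    this.tools.set(tool.name, tool)
    if (options?.runtimeAdded === true) this.runtimeToolNames.add(tool.name)
  }

  // Both collections are cleared together so they cannot drift apart.
  unregister(name: string): void {
    this.tools.delete(name)
    this.runtimeToolNames.delete(name)
  }

  list(): MiniTool[] {
    return Array.from(this.tools.values())
  }

  // Only the tools that were flagged as added at runtime.
  listRuntime(): MiniTool[] {
    return this.list().filter((tool) => this.runtimeToolNames.has(tool.name))
  }
}

const registry = new MiniRegistry()
registry.register({ name: 'grep', description: 'built-in search' })
registry.register({ name: 'lookup', description: 'added later' }, { runtimeAdded: true })
const runtimeNames = registry.listRuntime().map((t) => t.name)
```

Routing `deregister` through `unregister`, as one side of the diff does, keeps this cleanup in a single place.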

View File

@ -1,296 +0,0 @@
import { z } from 'zod'
import { defineTool } from './framework.js'
import type { ToolDefinition } from '../types.js'
interface MCPToolDescriptor {
name: string
description?: string
/** MCP tool JSON Schema; same shape LLM APIs expect for object parameters. */
inputSchema?: Record<string, unknown>
}
interface MCPListToolsResponse {
tools?: MCPToolDescriptor[]
nextCursor?: string
}
interface MCPCallToolResponse {
content?: Array<Record<string, unknown>>
structuredContent?: unknown
isError?: boolean
toolResult?: unknown
}
interface MCPClientLike {
connect(transport: unknown, options?: { timeout?: number; signal?: AbortSignal }): Promise<void>
listTools(
params?: { cursor?: string },
options?: { timeout?: number; signal?: AbortSignal },
): Promise<MCPListToolsResponse>
callTool(
request: { name: string; arguments: Record<string, unknown> },
resultSchema?: unknown,
options?: { timeout?: number; signal?: AbortSignal },
): Promise<MCPCallToolResponse>
close?: () => Promise<void>
}
type MCPClientConstructor = new (
info: { name: string; version: string },
options: { capabilities: Record<string, unknown> },
) => MCPClientLike
type StdioTransportConstructor = new (config: {
command: string
args?: string[]
env?: Record<string, string | undefined>
cwd?: string
}) => { close?: () => Promise<void> }
interface MCPModules {
Client: MCPClientConstructor
StdioClientTransport: StdioTransportConstructor
}
const DEFAULT_MCP_REQUEST_TIMEOUT_MS = 60_000
async function loadMCPModules(): Promise<MCPModules> {
const [{ Client }, { StdioClientTransport }] = await Promise.all([
import('@modelcontextprotocol/sdk/client/index.js') as Promise<{
Client: MCPClientConstructor
}>,
import('@modelcontextprotocol/sdk/client/stdio.js') as Promise<{
StdioClientTransport: StdioTransportConstructor
}>,
])
return { Client, StdioClientTransport }
}
export interface ConnectMCPToolsConfig {
command: string
args?: string[]
env?: Record<string, string | undefined>
cwd?: string
/**
* Optional segment prepended to MCP tool names for the framework tool (and LLM) name.
* Example: prefix `github` + MCP tool `search_issues` -> `github_search_issues`.
*/
namePrefix?: string
/**
* Timeout (ms) for MCP connect, each `tools/list` page, and each tool call. Defaults to 60000.
*/
requestTimeoutMs?: number
/**
* Client metadata sent to the MCP server.
*/
clientName?: string
clientVersion?: string
}
export interface ConnectedMCPTools {
tools: ToolDefinition[]
disconnect: () => Promise<void>
}
/**
* Build an LLM-safe tool name: MCP and prior examples used `prefix/name`, but
* Anthropic and other providers reject `/` in tool names.
*/
function normalizeToolName(rawName: string, namePrefix?: string): string {
const trimmedPrefix = namePrefix?.trim()
const base =
trimmedPrefix !== undefined && trimmedPrefix !== ''
? `${trimmedPrefix}_${rawName}`
: rawName
return base.replace(/\//g, '_')
}
/** MCP `tools/list` JSON Schema; forwarded to the LLM as-is (runtime validation stays `z.any()`). */
function mcpLlmInputSchema(
schema: Record<string, unknown> | undefined,
): Record<string, unknown> {
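// Server data is untyped at runtime, so also reject `null` (typeof null === 'object').
if (schema !== undefined && schema !== null && typeof schema === 'object' && !Array.isArray(schema)) {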
return schema
}
return { type: 'object' }
}
function contentBlockToText(block: Record<string, unknown>): string | undefined {
const typ = block.type
if (typ === 'text' && typeof block.text === 'string') {
return block.text
}
if (typ === 'image' && typeof block.data === 'string') {
const mime =
typeof block.mimeType === 'string' ? block.mimeType : 'image/*'
return `[image ${mime}; base64 length=${block.data.length}]`
}
if (typ === 'audio' && typeof block.data === 'string') {
const mime =
typeof block.mimeType === 'string' ? block.mimeType : 'audio/*'
return `[audio ${mime}; base64 length=${block.data.length}]`
}
if (
typ === 'resource' &&
block.resource !== null &&
typeof block.resource === 'object'
) {
const r = block.resource as Record<string, unknown>
const uri = typeof r.uri === 'string' ? r.uri : ''
if (typeof r.text === 'string') {
return `[resource ${uri}]\n${r.text}`
}
if (typeof r.blob === 'string') {
const mime = typeof r.mimeType === 'string' ? r.mimeType : ''
return `[resource ${uri}; mimeType=${mime}; blob base64 length=${r.blob.length}]`
}
return `[resource ${uri}]`
}
if (typ === 'resource_link') {
const uri = typeof block.uri === 'string' ? block.uri : ''
const name = typeof block.name === 'string' ? block.name : ''
const desc =
typeof block.description === 'string' ? block.description : ''
const head = `[resource_link name=${JSON.stringify(name)} uri=${JSON.stringify(uri)}]`
return desc === '' ? head : `${head}\n${desc}`
}
return undefined
}
function toToolResultData(result: MCPCallToolResponse): string {
if ('toolResult' in result && result.toolResult !== undefined) {
try {
return JSON.stringify(result.toolResult, null, 2)
} catch {
return String(result.toolResult)
}
}
const lines: string[] = []
for (const block of result.content ?? []) {
if (block === null || typeof block !== 'object') continue
const rec = block as Record<string, unknown>
const line = contentBlockToText(rec)
if (line !== undefined) {
lines.push(line)
continue
}
try {
lines.push(
`[${String(rec.type ?? 'unknown')}]\n${JSON.stringify(rec, null, 2)}`,
)
} catch {
lines.push('[mcp content block]')
}
}
if (lines.length > 0) {
return lines.join('\n')
}
if (result.structuredContent !== undefined) {
try {
return JSON.stringify(result.structuredContent, null, 2)
} catch {
return String(result.structuredContent)
}
}
try {
return JSON.stringify(result)
} catch {
return 'MCP tool completed with non-text output.'
}
}
async function listAllMcpTools(
client: MCPClientLike,
requestOpts: { timeout: number },
): Promise<MCPToolDescriptor[]> {
const acc: MCPToolDescriptor[] = []
let cursor: string | undefined
do {
const page = await client.listTools(
cursor !== undefined ? { cursor } : {},
requestOpts,
)
acc.push(...(page.tools ?? []))
cursor =
typeof page.nextCursor === 'string' && page.nextCursor !== ''
? page.nextCursor
: undefined
} while (cursor !== undefined)
return acc
}
/**
* Connect to an MCP server over stdio and convert exposed MCP tools into
* open-multi-agent ToolDefinitions.
*/
export async function connectMCPTools(
config: ConnectMCPToolsConfig,
): Promise<ConnectedMCPTools> {
const { Client, StdioClientTransport } = await loadMCPModules()
const transport = new StdioClientTransport({
command: config.command,
args: config.args ?? [],
env: config.env,
cwd: config.cwd,
})
const client = new Client(
{
name: config.clientName ?? 'open-multi-agent',
version: config.clientVersion ?? '0.0.0',
},
{ capabilities: {} },
)
const requestOpts = {
timeout: config.requestTimeoutMs ?? DEFAULT_MCP_REQUEST_TIMEOUT_MS,
}
await client.connect(transport, requestOpts)
const mcpTools = await listAllMcpTools(client, requestOpts)
const tools: ToolDefinition[] = mcpTools.map((tool) =>
defineTool({
name: normalizeToolName(tool.name, config.namePrefix),
description: tool.description ?? `MCP tool: ${tool.name}`,
inputSchema: z.any(),
llmInputSchema: mcpLlmInputSchema(tool.inputSchema),
execute: async (input: Record<string, unknown>) => {
try {
const result = await client.callTool(
{
name: tool.name,
arguments: input,
},
undefined,
requestOpts,
)
return {
data: toToolResultData(result),
isError: result.isError === true,
}
} catch (error) {
const message =
error instanceof Error ? error.message : String(error)
return {
data: `MCP tool "${tool.name}" failed: ${message}`,
isError: true,
}
}
},
}),
)
return {
tools,
disconnect: async () => {
await client.close?.()
},
}
}
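
The name normalization and cursor pagination in this file can be tried without an MCP server. The sketch below is illustrative only: `normalizeToolName` is copied from the file, while `demoClient`, `DemoToolPage`, and `listAllToolNames` are hypothetical stand-ins and not part of the MCP SDK.

```typescript
// normalizeToolName is copied from the file above; everything else is hypothetical.
function normalizeToolName(rawName: string, namePrefix?: string): string {
  const trimmedPrefix = namePrefix?.trim()
  const base =
    trimmedPrefix !== undefined && trimmedPrefix !== ''
      ? `${trimmedPrefix}_${rawName}`
      : rawName
  return base.replace(/\//g, '_')
}

interface DemoToolPage {
  tools?: { name: string }[]
  nextCursor?: string
}

// Hypothetical in-memory client that serves two pages of tools.
const demoClient = {
  async listTools(params: { cursor?: string }): Promise<DemoToolPage> {
    return params.cursor === undefined
      ? { tools: [{ name: 'search_issues' }], nextCursor: 'page-2' }
      : { tools: [{ name: 'repos/list' }] }
  },
}

// Follow nextCursor until the server stops returning one.
async function listAllToolNames(namePrefix?: string): Promise<string[]> {
  const names: string[] = []
  let cursor: string | undefined
  do {
    const page = await demoClient.listTools(cursor !== undefined ? { cursor } : {})
    for (const tool of page.tools ?? []) {
      names.push(normalizeToolName(tool.name, namePrefix))
    }
    cursor =
      typeof page.nextCursor === 'string' && page.nextCursor !== ''
        ? page.nextCursor
        : undefined
  } while (cursor !== undefined)
  return names
}

// A prefix is joined with `_`; a whitespace-only prefix is ignored; `/` is replaced.
const prefixed = normalizeToolName('search_issues', 'github')
const slashFree = normalizeToolName('repos/list', '  ')
```

With this fake client, `listAllToolNames('github')` resolves to `['github_search_issues', 'github_repos_list']`, since the second page has no `nextCursor` and ends the loop.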

View File

@ -1,219 +0,0 @@
/**
* @fileoverview Fallback tool-call extractor for local models.
*
* When a local model (Ollama, vLLM, LM Studio) returns tool calls as plain
* text instead of using the OpenAI `tool_calls` wire format, this module
* attempts to extract them from the text output.
*
* Common scenarios:
* - Ollama thinking-model bug: tool call JSON ends up inside unclosed `<think>` tags
* - Model outputs raw JSON tool calls without the server parsing them
* - Model wraps tool calls in markdown code fences
* - Hermes-format `<tool_call>` tags
*
* This is a **safety net**, not the primary path. Native `tool_calls` from
* the server are always preferred.
*/
import type { ToolUseBlock } from '../types.js'
// ---------------------------------------------------------------------------
// ID generation
// ---------------------------------------------------------------------------
let callCounter = 0
/** Generate a unique tool-call ID for extracted calls. */
function generateToolCallId(): string {
return `extracted_call_${Date.now()}_${++callCounter}`
}
// ---------------------------------------------------------------------------
// Internal parsers
// ---------------------------------------------------------------------------
/**
* Try to parse a single JSON object as a tool call.
*
* Accepted shapes:
* ```json
* { "name": "bash", "arguments": { "command": "ls" } }
* { "name": "bash", "parameters": { "command": "ls" } }
* { "function": { "name": "bash", "arguments": { "command": "ls" } } }
* ```
*/
function parseToolCallJSON(
json: unknown,
knownToolNames: ReadonlySet<string>,
): ToolUseBlock | null {
if (json === null || typeof json !== 'object' || Array.isArray(json)) {
return null
}
const obj = json as Record<string, unknown>
// Shape: { function: { name, arguments } }
if (typeof obj['function'] === 'object' && obj['function'] !== null) {
const fn = obj['function'] as Record<string, unknown>
return parseFlat(fn, knownToolNames)
}
// Shape: { name, arguments|parameters }
return parseFlat(obj, knownToolNames)
}
function parseFlat(
obj: Record<string, unknown>,
knownToolNames: ReadonlySet<string>,
): ToolUseBlock | null {
const name = obj['name']
if (typeof name !== 'string' || name.length === 0) return null
// Whitelist check — don't treat arbitrary JSON as a tool call
if (knownToolNames.size > 0 && !knownToolNames.has(name)) return null
let input: Record<string, unknown> = {}
const args = obj['arguments'] ?? obj['parameters'] ?? obj['input']
if (args !== null && args !== undefined) {
if (typeof args === 'string') {
try {
const parsed = JSON.parse(args)
if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
input = parsed as Record<string, unknown>
}
} catch {
// Malformed — use empty input
}
} else if (typeof args === 'object' && !Array.isArray(args)) {
input = args as Record<string, unknown>
}
}
return {
type: 'tool_use',
id: generateToolCallId(),
name,
input,
}
}
// ---------------------------------------------------------------------------
// JSON extraction from text
// ---------------------------------------------------------------------------
/**
* Find all top-level JSON objects in a string by tracking brace depth.
* Returns the parsed objects (not sub-objects).
*/
function extractJSONObjects(text: string): unknown[] {
const results: unknown[] = []
let depth = 0
let start = -1
let inString = false
let escape = false
for (let i = 0; i < text.length; i++) {
const ch = text[i]!
if (escape) {
escape = false
continue
}
if (ch === '\\' && inString) {
escape = true
continue
}
if (ch === '"') {
inString = !inString
continue
}
if (inString) continue
if (ch === '{') {
if (depth === 0) start = i
depth++
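} else if (ch === '}') {
// A stray closing brace outside any object must not drive depth negative,
// otherwise every later object in the text would be missed.
if (depth === 0) continue
depth--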
if (depth === 0 && start !== -1) {
const candidate = text.slice(start, i + 1)
try {
results.push(JSON.parse(candidate))
} catch {
// Not valid JSON — skip
}
start = -1
}
}
}
return results
}
// ---------------------------------------------------------------------------
// Hermes format: <tool_call>...</tool_call>
// ---------------------------------------------------------------------------
function extractHermesToolCalls(
text: string,
knownToolNames: ReadonlySet<string>,
): ToolUseBlock[] {
const results: ToolUseBlock[] = []
for (const match of text.matchAll(/<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g)) {
const inner = match[1]!.trim()
try {
const parsed: unknown = JSON.parse(inner)
const block = parseToolCallJSON(parsed, knownToolNames)
if (block !== null) results.push(block)
} catch {
// Malformed hermes content — skip
}
}
return results
}
// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------
/**
* Attempt to extract tool calls from a model's text output.
*
* Tries multiple strategies in order:
* 1. Hermes `<tool_call>` tags
* 2. JSON objects in text (bare or inside code fences)
*
* @param text - The model's text output.
* @param knownToolNames - Whitelist of registered tool names. When non-empty,
* only JSON objects whose `name` matches a known tool
* are treated as tool calls.
* @returns Extracted {@link ToolUseBlock}s, or an empty array if none found.
*/
export function extractToolCallsFromText(
text: string,
knownToolNames: string[],
): ToolUseBlock[] {
if (text.length === 0) return []
const nameSet = new Set(knownToolNames)
// Strategy 1: Hermes format
const hermesResults = extractHermesToolCalls(text, nameSet)
if (hermesResults.length > 0) return hermesResults
// Strategy 2: Strip code fences, then extract JSON objects
const stripped = text.replace(/```(?:json)?\s*\n?([\s\S]*?)\n?\s*```/g, '$1')
const jsonObjects = extractJSONObjects(stripped)
const results: ToolUseBlock[] = []
for (const obj of jsonObjects) {
const block = parseToolCallJSON(obj, nameSet)
if (block !== null) results.push(block)
}
return results
}
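
To make the Hermes strategy and the whitelist check concrete, here is a reduced sketch. It is not the module above: `DemoCall`, `toCall`, and `extractHermes` are hypothetical names, string-encoded arguments are not handled, and no call IDs are generated.

```typescript
interface DemoCall {
  name: string
  input: Record<string, unknown>
}

// Simplified stand-in for parseToolCallJSON: accepts the flat shape and { function: {...} }.
function toCall(json: unknown, known: ReadonlySet<string>): DemoCall | null {
  if (json === null || typeof json !== 'object' || Array.isArray(json)) return null
  const outer = json as Record<string, unknown>
  const fn = outer['function']
  const obj = typeof fn === 'object' && fn !== null ? (fn as Record<string, unknown>) : outer
  const name = obj['name']
  if (typeof name !== 'string' || name.length === 0) return null
  // Whitelist check: unknown names are not treated as tool calls.
  if (known.size > 0 && !known.has(name)) return null
  const args = obj['arguments'] ?? obj['parameters'] ?? obj['input']
  const input =
    typeof args === 'object' && args !== null && !Array.isArray(args)
      ? (args as Record<string, unknown>)
      : {}
  return { name, input }
}

// Scan for <tool_call>...</tool_call> tags and parse the JSON inside each one.
function extractHermes(text: string, knownToolNames: string[]): DemoCall[] {
  const known = new Set(knownToolNames)
  const calls: DemoCall[] = []
  const tag = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g
  let match: RegExpExecArray | null
  while ((match = tag.exec(text)) !== null) {
    try {
      const call = toCall(JSON.parse(match[1] ?? ''), known)
      if (call !== null) calls.push(call)
    } catch {
      // Malformed JSON inside the tag: skip it.
    }
  }
  return calls
}

const demoCalls = extractHermes(
  'Let me check.\n<tool_call>{"name":"bash","arguments":{"command":"ls"}}</tool_call>\n' +
    '<tool_call>{"name":"unknown_tool","arguments":{}}</tool_call>',
  ['bash'],
)
```

Only the `bash` call survives here; the second tag parses as JSON but is dropped by the whitelist.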

View File

@ -65,31 +65,6 @@ export interface LLMMessage {
readonly content: ContentBlock[]
}
/** Context management strategy for long-running agent conversations. */
export type ContextStrategy =
| { type: 'sliding-window'; maxTurns: number }
| { type: 'summarize'; maxTokens: number; summaryModel?: string }
| {
type: 'compact'
/** Estimated token threshold that triggers compaction. Compaction is skipped when below this. */
maxTokens: number
/** Number of recent turn pairs (assistant+user) to keep intact. Default: 4. */
preserveRecentTurns?: number
/** Minimum chars in a tool_result content to qualify for compaction. Default: 200. */
minToolResultChars?: number
/** Minimum chars in an assistant text block to qualify for truncation. Default: 2000. */
minTextBlockChars?: number
/** Maximum chars to keep from a truncated text block (head excerpt). Default: 200. */
textBlockExcerptChars?: number
}
| {
type: 'custom'
compress: (
messages: LLMMessage[],
estimatedTokens: number,
) => Promise<LLMMessage[]> | LLMMessage[]
}
/** Token accounting for a single API call. */
export interface TokenUsage {
readonly input_tokens: number
@ -115,12 +90,11 @@ export interface LLMResponse {
* - `text`: incremental text delta
* - `tool_use`: the model has begun or completed a tool-use block
* - `tool_result`: a tool result has been appended to the stream
* - `budget_exceeded`: token budget threshold reached for this run
* - `done`: the stream has ended; `data` is the final {@link LLMResponse}
* - `error`: an unrecoverable error occurred; `data` is an `Error`
*/
export interface StreamEvent {
readonly type: 'text' | 'tool_use' | 'tool_result' | 'loop_detected' | 'budget_exceeded' | 'done' | 'error'
readonly type: 'text' | 'tool_use' | 'tool_result' | 'done' | 'error'
readonly data: unknown
}
@ -178,78 +152,29 @@ export interface AgentInfo {
readonly model: string
}
/**
* Minimal pool surface used by `delegate_to_agent` to detect nested-run capacity.
* {@link AgentPool} satisfies this structurally via {@link AgentPool.availableRunSlots}.
*/
export interface DelegationPoolView {
readonly availableRunSlots: number
}
/** Descriptor for a team of agents (orchestrator-injected into tool context). */
/** Descriptor for a team of agents with shared memory. */
export interface TeamInfo {
readonly name: string
readonly agents: readonly string[]
/** When the team has shared memory enabled; used for delegation audit writes. */
readonly sharedMemory?: MemoryStore
/** Zero-based depth of nested delegation from the root task run. */
readonly delegationDepth?: number
readonly maxDelegationDepth?: number
readonly delegationPool?: DelegationPoolView
/**
* Ordered chain of agent names from the root task to the current agent.
* Used to block `A -> B -> A` cycles before they burn turns against `maxDelegationDepth`.
*/
readonly delegationChain?: readonly string[]
/**
* Run another roster agent to completion and return its result.
* Only set during orchestrated pool execution (`runTeam` / `runTasks`).
*/
readonly runDelegatedAgent?: (targetAgent: string, prompt: string) => Promise<AgentRunResult>
}
/**
* Optional side-channel metadata a tool may attach to its result.
* Not shown to the LLM; the runner reads it for accounting purposes.
*/
export interface ToolResultMetadata {
/**
* Token usage consumed inside the tool execution itself (e.g. nested LLM
* calls from `delegate_to_agent`). Accumulated into the parent runner's
* total so budgets/cost tracking stay accurate across delegation.
*/
readonly tokenUsage?: TokenUsage
readonly sharedMemory: MemoryStore
}
/** Value returned by a tool's `execute` function. */
export interface ToolResult {
readonly data: string
readonly isError?: boolean
readonly metadata?: ToolResultMetadata
}
/**
* A tool registered with the framework.
*
* `inputSchema` is a Zod schema used for validation before `execute` is called.
* At API call time it is converted to JSON Schema for {@link LLMToolDef}, unless
* `llmInputSchema` is set (e.g. MCP tools ship JSON Schema from the server).
* At API call time it is converted to JSON Schema via {@link LLMToolDef}.
*/
export interface ToolDefinition<TInput = Record<string, unknown>> {
readonly name: string
readonly description: string
readonly inputSchema: ZodSchema<TInput>
/**
* When present, used as {@link LLMToolDef.inputSchema} as-is instead of
* deriving JSON Schema from `inputSchema` (Zod).
*/
readonly llmInputSchema?: Record<string, unknown>
/**
* Per-tool maximum output length in characters. When set, tool output
* exceeding this limit is truncated (head + tail with a marker in between).
* Takes priority over {@link AgentConfig.maxToolOutputChars}.
*/
readonly maxOutputChars?: number
execute(input: TInput, context: ToolUseContext): Promise<ToolResult>
}
@ -257,19 +182,11 @@ export interface ToolDefinition<TInput = Record<string, unknown>> {
// Agent
// ---------------------------------------------------------------------------
/** Context passed to the {@link AgentConfig.beforeRun} hook. */
export interface BeforeRunHookContext {
/** The user prompt text. */
readonly prompt: string
/** The agent's static configuration. */
readonly agent: AgentConfig
}
/** Static configuration for a single agent. */
export interface AgentConfig {
readonly name: string
readonly model: string
readonly provider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
readonly provider?: 'anthropic' | 'copilot' | 'openai'
/**
* Custom base URL for OpenAI-compatible APIs (Ollama, vLLM, LM Studio, etc.).
* Note: local servers that don't require auth still need `apiKey` set to a
@ -279,115 +196,17 @@ export interface AgentConfig {
/** API key override; falls back to the provider's standard env var. */
readonly apiKey?: string
readonly systemPrompt?: string
/**
* Custom tool definitions to register alongside built-in tools.
* Created via `defineTool()`. Custom tools bypass `tools` (allowlist)
* and `toolPreset` filtering, but can still be blocked by `disallowedTools`.
*
* Tool names must not collide with built-in tool names; a duplicate name
* will throw at registration time.
*/
// eslint-disable-next-line @typescript-eslint/no-explicit-any
readonly customTools?: readonly ToolDefinition<any>[]
/** Names of tools (from the tool registry) available to this agent. */
readonly tools?: readonly string[]
/** Names of tools explicitly disallowed for this agent. */
readonly disallowedTools?: readonly string[]
/** Predefined tool preset for common use cases. */
readonly toolPreset?: 'readonly' | 'readwrite' | 'full'
readonly maxTurns?: number
readonly maxTokens?: number
/** Maximum cumulative tokens (input + output) allowed for this run. */
readonly maxTokenBudget?: number
/** Optional context compression policy to control input growth across turns. */
readonly contextStrategy?: ContextStrategy
readonly temperature?: number
/**
* Maximum wall-clock time (in milliseconds) for the entire agent run.
* When exceeded, the run is aborted via `AbortSignal.timeout()`.
* Useful for local models where inference can be unpredictably slow.
*/
readonly timeoutMs?: number
/**
* Loop detection configuration. When set, the agent tracks repeated tool
* calls and text outputs to detect stuck loops before `maxTurns` is reached.
*/
readonly loopDetection?: LoopDetectionConfig
/**
* Maximum tool output length in characters for all tools used by this agent.
* When set, tool outputs exceeding this limit are truncated (head + tail
* with a marker in between). Per-tool {@link ToolDefinition.maxOutputChars}
* takes priority over this value.
*/
readonly maxToolOutputChars?: number
/**
* Compress tool results that the agent has already processed.
*
* In multi-turn runs, tool results persist in the conversation even after the
* agent has acted on them. When enabled, consumed tool results (those followed
* by an assistant response) are replaced with a short marker before the next
* LLM call, freeing context budget for new reasoning.
*
* - `true`: enable with default threshold (500 chars)
* - `{ minChars: N }`: only compress results longer than N characters
* - `false` / `undefined`: disabled (default)
*
* Error tool results are never compressed.
*/
readonly compressToolResults?: boolean | { readonly minChars?: number }
/**
* Optional Zod schema for structured output. When set, the agent's final
* output is parsed as JSON and validated against this schema. A single
* retry with error feedback is attempted on validation failure.
*/
readonly outputSchema?: ZodSchema
/**
* Called before each agent run. Receives the prompt and agent config.
* Return a (possibly modified) context to continue, or throw to abort the run.
* Only `prompt` from the returned context is applied; `agent` is read-only informational.
*/
readonly beforeRun?: (context: BeforeRunHookContext) => Promise<BeforeRunHookContext> | BeforeRunHookContext
/**
* Called after each agent run completes successfully. Receives the run result.
* Return a (possibly modified) result, or throw to mark the run as failed.
* Not called when the run throws. For error observation, handle errors at the call site.
*/
readonly afterRun?: (result: AgentRunResult) => Promise<AgentRunResult> | AgentRunResult
}
// ---------------------------------------------------------------------------
// Loop detection
// ---------------------------------------------------------------------------
/** Configuration for agent loop detection. */
export interface LoopDetectionConfig {
/**
* Maximum consecutive times the same tool call (name + args) or text
* output can repeat before detection triggers. Default: `3`.
*/
readonly maxRepetitions?: number
/**
* Number of recent turns to track for repetition analysis. Default: `4`.
*/
readonly loopDetectionWindow?: number
/**
* Action to take when a loop is detected.
* - `'warn'`: inject a "you appear stuck" message, give the LLM one
* more chance; terminate if the loop persists (default)
* - `'terminate'`: stop the run immediately
* - `function`: custom callback (sync or async); return `'continue'`,
* `'inject'`, or `'terminate'` to control the outcome
*/
readonly onLoopDetected?: 'warn' | 'terminate' | ((info: LoopDetectionInfo) => 'continue' | 'inject' | 'terminate' | Promise<'continue' | 'inject' | 'terminate'>)
}
/** Diagnostic payload emitted when a loop is detected. */
export interface LoopDetectionInfo {
readonly kind: 'tool_repetition' | 'text_repetition'
/** Number of consecutive identical occurrences observed. */
readonly repetitions: number
/** Human-readable description of the detected loop. */
readonly detail: string
}
/** Lifecycle state tracked during an agent run. */
@ -420,10 +239,6 @@ export interface AgentRunResult {
* failed after retry.
*/
readonly structured?: unknown
/** True when the run was terminated or warned due to loop detection. */
readonly loopDetected?: boolean
/** True when the run stopped because token budget was exceeded. */
readonly budgetExceeded?: boolean
}
// ---------------------------------------------------------------------------
@ -441,8 +256,6 @@ export interface TeamConfig {
/** Aggregated result for a full team run. */
export interface TeamRunResult {
readonly success: boolean
readonly goal?: string
readonly tasks?: readonly TaskExecutionRecord[]
/** Keyed by agent name. */
readonly agentResults: Map<string, AgentRunResult>
readonly totalTokenUsage: TokenUsage
@ -453,29 +266,7 @@ export interface TeamRunResult {
// ---------------------------------------------------------------------------
/** Valid states for a {@link Task}. */
export type TaskStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'blocked' | 'skipped'
/**
* Metrics shown in the team-run dashboard detail panel for a single task.
* Mirrors execution data collected during orchestration.
*/
export interface TaskExecutionMetrics {
readonly startMs: number
readonly endMs: number
readonly durationMs: number
readonly tokenUsage: TokenUsage
readonly toolCalls: AgentRunResult['toolCalls']
}
/** Serializable task snapshot embedded in the static HTML dashboard. */
export interface TaskExecutionRecord {
readonly id: string
readonly title: string
readonly assignee?: string
readonly status: TaskStatus
readonly dependsOn: readonly string[]
readonly metrics?: TaskExecutionMetrics
}
export type TaskStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'blocked'
/** A discrete unit of work tracked by the orchestrator. */
export interface Task {
@ -487,12 +278,6 @@ export interface Task {
assignee?: string
/** IDs of tasks that must complete before this one can start. */
dependsOn?: readonly string[]
/**
* Controls what prior team context is injected into this task's prompt.
* - `dependencies` (default): only direct dependency task results
* - `all`: full shared-memory summary
*/
readonly memoryScope?: 'dependencies' | 'all'
result?: string
readonly createdAt: Date
updatedAt: Date
@ -508,21 +293,14 @@ export interface Task {
// Orchestrator
// ---------------------------------------------------------------------------
/**
* Progress event emitted by the orchestrator during a run.
*
* **v0.3 addition:** `'task_skipped'`. Consumers with exhaustive switches
* on `type` will need to add a case for this variant.
*/
/** Progress event emitted by the orchestrator during a run. */
export interface OrchestratorEvent {
readonly type:
| 'agent_start'
| 'agent_complete'
| 'task_start'
| 'task_complete'
| 'task_skipped'
| 'task_retry'
| 'budget_exceeded'
| 'message'
| 'error'
readonly agent?: string
@ -533,136 +311,13 @@ export interface OrchestratorEvent {
/** Top-level configuration for the orchestrator. */
export interface OrchestratorConfig {
readonly maxConcurrency?: number
/**
* Maximum depth of `delegate_to_agent` chains from a task run (default `3`).
* Depth is per nested delegated run, not per team.
*/
readonly maxDelegationDepth?: number
/** Maximum cumulative tokens (input + output) allowed per orchestrator run. */
readonly maxTokenBudget?: number
readonly defaultModel?: string
readonly defaultProvider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
readonly defaultProvider?: 'anthropic' | 'copilot' | 'openai'
readonly defaultBaseURL?: string
readonly defaultApiKey?: string
readonly onProgress?: (event: OrchestratorEvent) => void
readonly onTrace?: (event: TraceEvent) => void | Promise<void>
/**
* Optional approval gate called between task execution rounds.
*
* After a batch of tasks completes, this callback receives all
* completed {@link Task}s from that round and the list of tasks about
* to start next. Return `true` to continue or `false` to abort;
* remaining tasks will be marked `'skipped'`.
*
* Not called when:
* - No tasks succeeded in the round (all failed).
* - No pending tasks remain after the round (final batch).
*
* **Note:** Do not mutate the {@link Task} objects passed to this
* callback; they are live references to queue state. Mutation is
* undefined behavior.
*/
readonly onApproval?: (completedTasks: readonly Task[], nextTasks: readonly Task[]) => Promise<boolean>
onProgress?: (event: OrchestratorEvent) => void
}
/**
* Optional overrides for the temporary coordinator agent created by `runTeam`.
*
* All fields are optional. Unset fields fall back to orchestrator defaults
* (or coordinator built-in defaults where applicable).
*/
export interface CoordinatorConfig {
/** Coordinator model. Defaults to `OrchestratorConfig.defaultModel`. */
readonly model?: string
readonly provider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
readonly baseURL?: string
readonly apiKey?: string
/**
* Full system prompt override. When set, this replaces the default
* coordinator preamble and decomposition guidance.
*
* Team roster, output format, and synthesis sections are still appended.
*/
readonly systemPrompt?: string
/**
* Additional instructions appended to the default coordinator prompt.
* Ignored when `systemPrompt` is provided.
*/
readonly instructions?: string
readonly maxTurns?: number
readonly maxTokens?: number
readonly temperature?: number
/** Predefined tool preset for common coordinator use cases. */
readonly toolPreset?: 'readonly' | 'readwrite' | 'full'
/** Tool names available to the coordinator. */
readonly tools?: readonly string[]
/** Tool names explicitly denied to the coordinator. */
readonly disallowedTools?: readonly string[]
readonly loopDetection?: LoopDetectionConfig
readonly timeoutMs?: number
}
// ---------------------------------------------------------------------------
// Trace events — lightweight observability spans
// ---------------------------------------------------------------------------
/** Trace event type discriminants. */
export type TraceEventType = 'llm_call' | 'tool_call' | 'task' | 'agent'
/** Shared fields present on every trace event. */
export interface TraceEventBase {
/** Unique identifier for the entire run (runTeam / runTasks / runAgent call). */
readonly runId: string
readonly type: TraceEventType
/** Unix epoch ms when the span started. */
readonly startMs: number
/** Unix epoch ms when the span ended. */
readonly endMs: number
/** Wall-clock duration in milliseconds (`endMs - startMs`). */
readonly durationMs: number
/** Agent name associated with this span. */
readonly agent: string
/** Task ID associated with this span. */
readonly taskId?: string
}
/** Emitted for each LLM API call (one per agent turn). */
export interface LLMCallTrace extends TraceEventBase {
readonly type: 'llm_call'
readonly model: string
/** Distinguishes normal turn calls from context-summary calls. */
readonly phase?: 'turn' | 'summary'
readonly turn: number
readonly tokens: TokenUsage
}
/** Emitted for each tool execution. */
export interface ToolCallTrace extends TraceEventBase {
readonly type: 'tool_call'
readonly tool: string
readonly isError: boolean
}
/** Emitted when a task completes (wraps the full retry sequence). */
export interface TaskTrace extends TraceEventBase {
readonly type: 'task'
readonly taskId: string
readonly taskTitle: string
readonly success: boolean
readonly retries: number
}
/** Emitted when an agent run completes (wraps the full conversation loop). */
export interface AgentTrace extends TraceEventBase {
readonly type: 'agent'
readonly turns: number
readonly tokens: TokenUsage
readonly toolCalls: number
}
/** Discriminated union of all trace event types. */
export type TraceEvent = LLMCallTrace | ToolCallTrace | TaskTrace | AgentTrace
// ---------------------------------------------------------------------------
// Memory
// ---------------------------------------------------------------------------

View File

@ -1,39 +0,0 @@
/**
* Shared keyword-affinity helpers used by capability-match scheduling
* and short-circuit agent selection. Kept in one place so behaviour
* can't drift between Scheduler and Orchestrator.
*/
export const STOP_WORDS: ReadonlySet<string> = new Set([
'the', 'and', 'for', 'that', 'this', 'with', 'are', 'from', 'have',
'will', 'your', 'you', 'can', 'all', 'each', 'when', 'then', 'they',
'them', 'their', 'about', 'into', 'more', 'also', 'should', 'must',
])
/**
* Tokenise `text` into a deduplicated set of lower-cased keywords.
* Words shorter than 4 characters and entries in {@link STOP_WORDS}
* are filtered out.
*/
export function extractKeywords(text: string): string[] {
return [
...new Set(
text
.toLowerCase()
.split(/\W+/)
.filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
),
]
}
/**
* Count how many `keywords` appear (case-insensitively) in `text`.
* Each keyword contributes at most 1 to the score.
*/
export function keywordScore(text: string, keywords: readonly string[]): number {
const lower = text.toLowerCase()
return keywords.reduce(
(acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0),
0,
)
}
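
A short usage sketch for the two helpers, assuming a task description is scored against an agent description. The functions are adapted copies of the ones above (`Array.from` instead of spread, a shortened stop-word list), and the sample strings are invented for illustration.

```typescript
// Adapted copies of the helpers above; the stop-word list is shortened for the demo.
const STOP_WORDS: ReadonlySet<string> = new Set(['the', 'and', 'for', 'that', 'this', 'with'])

function extractKeywords(text: string): string[] {
  return Array.from(
    new Set(
      text
        .toLowerCase()
        .split(/\W+/)
        .filter((w) => w.length > 3 && !STOP_WORDS.has(w)),
    ),
  )
}

function keywordScore(text: string, keywords: readonly string[]): number {
  const lower = text.toLowerCase()
  return keywords.reduce((acc, kw) => acc + (lower.includes(kw.toLowerCase()) ? 1 : 0), 0)
}

// Duplicates ("database") collapse and short words ("the", "and") are dropped.
const keywords = extractKeywords('Implement the database migration and test the database schema')

// "database", "migration" (inside "migrations"), and "schema" match.
const score = keywordScore('Backend engineer: owns database schema and migrations', keywords)

// Matching is by substring, so "test" also matches inside "latest".
const falsePositive = keywordScore('Ship the latest build', ['test'])
```

The last line shows a property worth keeping in mind when reading scores: because `keywordScore` uses `includes`, a keyword can match inside an unrelated longer word.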

View File

@ -34,11 +34,6 @@ export class Semaphore {
}
}
/** Maximum concurrent holders configured for this semaphore. */
get limit(): number {
return this.max
}
/**
* Acquire a slot. Resolves immediately when one is free, or waits until a
* holder calls `release()`.

View File

@ -1,27 +0,0 @@
import type { LLMMessage } from '../types.js'
/**
* Estimate token count using a lightweight character heuristic.
* This intentionally avoids model-specific tokenizer dependencies.
*/
export function estimateTokens(messages: LLMMessage[]): number {
let chars = 0
for (const message of messages) {
for (const block of message.content) {
if (block.type === 'text') {
chars += block.text.length
} else if (block.type === 'tool_result') {
chars += block.content.length
} else if (block.type === 'tool_use') {
chars += JSON.stringify(block.input).length
} else if (block.type === 'image') {
// Account for non-text payloads with a small fixed cost.
chars += 64
}
}
}
// Conservative English heuristic: ~4 chars per token.
return Math.ceil(chars / 4)
}
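
To make the heuristic concrete, here is a worked example under simplified assumptions. The `Block` and `Message` types are stand-ins for the project's `LLMMessage` content blocks (their exact shapes are assumed, not copied), and the estimator body follows the logic shown above.

```typescript
// Simplified stand-ins for the project's content-block types (assumed shapes).
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; content: string }
  | { type: 'tool_use'; input: unknown }
  | { type: 'image' }

interface Message {
  role: 'user' | 'assistant'
  content: Block[]
}

function estimateTokens(messages: Message[]): number {
  let chars = 0
  for (const message of messages) {
    for (const block of message.content) {
      if (block.type === 'text') chars += block.text.length
      else if (block.type === 'tool_result') chars += block.content.length
      else if (block.type === 'tool_use') chars += JSON.stringify(block.input).length
      else chars += 64 // fixed cost for image blocks
    }
  }
  // Roughly 4 characters per token, rounded up.
  return Math.ceil(chars / 4)
}

const estimate = estimateTokens([
  { role: 'user', content: [{ type: 'text', text: 'hello world' }, { type: 'image' }] },
  { role: 'assistant', content: [{ type: 'tool_use', input: { a: 1 } }] },
])

// 11 (text) + 64 (image) + 7 ('{"a":1}') = 82 characters, so ceil(82 / 4) = 21.
console.log(estimate)
```

The result is only an approximation: text in other languages, or code-heavy content, can tokenize quite differently from four characters per token.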

View File

@ -1,34 +0,0 @@
/**
* @fileoverview Trace emission utilities for the observability layer.
*/
import { randomUUID } from 'node:crypto'
import type { TraceEvent } from '../types.js'
/**
* Safely emit a trace event. Swallows callback errors so a broken
* subscriber never crashes agent execution.
*/
export function emitTrace(
fn: ((event: TraceEvent) => void | Promise<void>) | undefined,
event: TraceEvent,
): void {
if (!fn) return
try {
// Guard async callbacks: if fn returns a Promise, swallow its rejection
// so an async onTrace never produces an unhandled promise rejection.
const result = fn(event) as unknown
if (result && typeof (result as Promise<unknown>).catch === 'function') {
;(result as Promise<unknown>).catch(noop)
}
} catch {
// Intentionally swallowed — observability must never break execution.
}
}
function noop() {}
/** Generate a unique run ID for trace correlation. */
export function generateRunId(): string {
return randomUUID()
}
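
The following sketch exercises the four cases `emitTrace` is written to handle: no subscriber, a healthy subscriber, a subscriber that throws synchronously, and an async subscriber that rejects. `DemoEvent` is a simplified placeholder for the real `TraceEvent` union, and the function body follows the logic shown above.

```typescript
// Simplified event shape; the real TraceEvent union lives in ../types.js.
type DemoEvent = { type: string; runId: string }
type Listener = (event: DemoEvent) => void | Promise<void>

function noop(): void {}

function emitTrace(fn: Listener | undefined, event: DemoEvent): void {
  if (!fn) return
  try {
    const result = fn(event) as unknown
    // If the callback returned a promise-like value, swallow its rejection.
    if (result && typeof (result as Promise<unknown>).catch === 'function') {
      ;(result as Promise<unknown>).catch(noop)
    }
  } catch {
    // Swallowed on purpose: tracing must not break the caller.
  }
}

const event: DemoEvent = { type: 'agent', runId: 'run-1' }
const seen: string[] = []

// No subscriber: nothing happens.
emitTrace(undefined, event)
// Healthy subscriber: the event is recorded.
emitTrace((e) => { seen.push(e.type) }, event)
// Synchronous throw: caught inside emitTrace.
emitTrace(() => { throw new Error('sync failure') }, event)
// Async rejection: handled by the attached catch, so no unhandled rejection.
emitTrace(async () => { throw new Error('async failure') }, event)

console.log(seen)
```

One consequence of this design is that subscriber failures are silent; a caller that wants to know about them has to log inside its own callback.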

View File

@ -1,279 +0,0 @@
/**
* Targeted tests for abort signal propagation fixes (#99, #100, #101).
*
* - #99: Per-call abortSignal must reach tool execution context
* - #100: Abort path in executeQueue must skip blocked tasks and emit events
* - #101: Gemini adapter must forward abortSignal to the SDK
*/
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import { TaskQueue } from '../src/task/queue.js'
import { createTask } from '../src/task/task.js'
import { z } from 'zod'
import type { LLMAdapter, LLMMessage, ToolUseContext } from '../src/types.js'
// ---------------------------------------------------------------------------
// #99 — Per-call abortSignal propagated to tool context
// ---------------------------------------------------------------------------
describe('Per-call abortSignal reaches tool context (#99)', () => {
it('tool receives per-call abortSignal, not static runner signal', async () => {
// Track the abortSignal passed to the tool
let receivedSignal: AbortSignal | undefined
const spy = defineTool({
name: 'spy',
description: 'Captures the abort signal from context.',
inputSchema: z.object({}),
execute: async (_input, context) => {
receivedSignal = context.abortSignal
return { data: 'ok', isError: false }
},
})
const registry = new ToolRegistry()
registry.register(spy)
const executor = new ToolExecutor(registry)
// Adapter returns one tool_use then end_turn
const adapter: LLMAdapter = {
name: 'mock',
chat: vi.fn()
.mockResolvedValueOnce({
id: '1',
content: [{ type: 'tool_use', id: 'call-1', name: 'spy', input: {} }],
model: 'mock',
stop_reason: 'tool_use',
usage: { input_tokens: 0, output_tokens: 0 },
})
.mockResolvedValueOnce({
id: '2',
content: [{ type: 'text', text: 'done' }],
model: 'mock',
stop_reason: 'end_turn',
usage: { input_tokens: 0, output_tokens: 0 },
}),
async *stream() { /* unused */ },
}
const perCallController = new AbortController()
// Runner created WITHOUT a static abortSignal
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock',
agentName: 'test',
})
const messages: LLMMessage[] = [
{ role: 'user', content: [{ type: 'text', text: 'go' }] },
]
await runner.run(messages, { abortSignal: perCallController.signal })
// The tool must have received the per-call signal, not undefined
expect(receivedSignal).toBe(perCallController.signal)
})
it('tool receives static signal when no per-call signal is provided', async () => {
let receivedSignal: AbortSignal | undefined
const spy = defineTool({
name: 'spy',
description: 'Captures the abort signal from context.',
inputSchema: z.object({}),
execute: async (_input, context) => {
receivedSignal = context.abortSignal
return { data: 'ok', isError: false }
},
})
const registry = new ToolRegistry()
registry.register(spy)
const executor = new ToolExecutor(registry)
const staticController = new AbortController()
const adapter: LLMAdapter = {
name: 'mock',
chat: vi.fn()
.mockResolvedValueOnce({
id: '1',
content: [{ type: 'tool_use', id: 'call-1', name: 'spy', input: {} }],
model: 'mock',
stop_reason: 'tool_use',
usage: { input_tokens: 0, output_tokens: 0 },
})
.mockResolvedValueOnce({
id: '2',
content: [{ type: 'text', text: 'done' }],
model: 'mock',
stop_reason: 'end_turn',
usage: { input_tokens: 0, output_tokens: 0 },
}),
async *stream() { /* unused */ },
}
// Runner created WITH a static abortSignal, no per-call signal
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock',
agentName: 'test',
abortSignal: staticController.signal,
})
const messages: LLMMessage[] = [
{ role: 'user', content: [{ type: 'text', text: 'go' }] },
]
await runner.run(messages)
expect(receivedSignal).toBe(staticController.signal)
})
})
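
The two tests above pin down a fallback rule: a per-call signal is used when one is given, otherwise the runner's static signal applies. A minimal sketch of that rule follows. `resolveAbortSignal` is a hypothetical helper written for illustration, not a function taken from the runner, and the case where both signals are present is not covered by the tests; the sketch assumes the per-call signal wins.

```typescript
// Hypothetical helper, not part of the runner's source.
function resolveAbortSignal(
  perCall: AbortSignal | undefined,
  fallback: AbortSignal | undefined,
): AbortSignal | undefined {
  // Prefer the signal passed to this call; fall back to the static one.
  return perCall ?? fallback
}

const staticController = new AbortController()
const perCallController = new AbortController()

// Only a per-call signal (first test above).
const onlyPerCall = resolveAbortSignal(perCallController.signal, undefined)
// Only a static signal (second test above).
const onlyStatic = resolveAbortSignal(undefined, staticController.signal)
// Both present: assumed here to resolve to the per-call signal.
const both = resolveAbortSignal(perCallController.signal, staticController.signal)

console.log(
  onlyPerCall === perCallController.signal,
  onlyStatic === staticController.signal,
  both === perCallController.signal,
)
```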
// ---------------------------------------------------------------------------
// #100 — Abort path skips blocked tasks and emits events
// ---------------------------------------------------------------------------
describe('Abort path skips blocked tasks and emits events (#100)', () => {
function task(id: string, opts: { dependsOn?: string[]; assignee?: string } = {}) {
const t = createTask({ title: id, description: `task ${id}`, assignee: opts.assignee })
return { ...t, id, dependsOn: opts.dependsOn } as ReturnType<typeof createTask>
}
it('skipRemaining transitions blocked tasks to skipped', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
// 'b' should be blocked because it depends on 'a'
expect(q.getByStatus('blocked').length).toBe(1)
q.skipRemaining('Skipped: run aborted.')
// Both tasks should be skipped — including the blocked one
const all = q.list()
expect(all.every(t => t.status === 'skipped')).toBe(true)
expect(q.getByStatus('blocked').length).toBe(0)
})
it('skipRemaining emits task:skipped for every non-terminal task', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
const handler = vi.fn()
q.on('task:skipped', handler)
q.skipRemaining('Skipped: run aborted.')
// Both pending 'a' and blocked 'b' must trigger events
expect(handler).toHaveBeenCalledTimes(2)
const ids = handler.mock.calls.map((c: any[]) => c[0].id)
expect(ids).toContain('a')
expect(ids).toContain('b')
})
it('skipRemaining fires all:complete after skipping', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
const completeHandler = vi.fn()
q.on('all:complete', completeHandler)
q.skipRemaining('Skipped: run aborted.')
expect(completeHandler).toHaveBeenCalledTimes(1)
expect(q.isComplete()).toBe(true)
})
})
// ---------------------------------------------------------------------------
// #101 — Gemini adapter forwards abortSignal to SDK config
// ---------------------------------------------------------------------------
const mockGenerateContent = vi.hoisted(() => vi.fn())
const mockGenerateContentStream = vi.hoisted(() => vi.fn())
const GoogleGenAIMock = vi.hoisted(() =>
vi.fn(() => ({
models: {
generateContent: mockGenerateContent,
generateContentStream: mockGenerateContentStream,
},
})),
)
vi.mock('@google/genai', () => ({
GoogleGenAI: GoogleGenAIMock,
FunctionCallingConfigMode: { AUTO: 'AUTO' },
}))
import { GeminiAdapter } from '../src/llm/gemini.js'
describe('Gemini adapter forwards abortSignal (#101)', () => {
let adapter: GeminiAdapter
function makeGeminiResponse(parts: Array<Record<string, unknown>>) {
return {
candidates: [{
content: { parts },
finishReason: 'STOP',
}],
usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 5 },
}
}
async function* asyncGen<T>(items: T[]): AsyncGenerator<T> {
for (const item of items) yield item
}
beforeEach(() => {
vi.clearAllMocks()
adapter = new GeminiAdapter('test-key')
})
it('chat() passes abortSignal in config', async () => {
mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'hi' }]))
const controller = new AbortController()
await adapter.chat(
[{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
{ model: 'gemini-2.5-flash', abortSignal: controller.signal },
)
const callArgs = mockGenerateContent.mock.calls[0][0]
expect(callArgs.config.abortSignal).toBe(controller.signal)
})
it('chat() does not include abortSignal when not provided', async () => {
mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'hi' }]))
await adapter.chat(
[{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
{ model: 'gemini-2.5-flash' },
)
const callArgs = mockGenerateContent.mock.calls[0][0]
expect(callArgs.config.abortSignal).toBeUndefined()
})
it('stream() passes abortSignal in config', async () => {
const chunk = makeGeminiResponse([{ text: 'hi' }])
mockGenerateContentStream.mockResolvedValue(asyncGen([chunk]))
const controller = new AbortController()
const events: unknown[] = []
for await (const e of adapter.stream(
[{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
{ model: 'gemini-2.5-flash', abortSignal: controller.signal },
)) {
events.push(e)
}
const callArgs = mockGenerateContentStream.mock.calls[0][0]
expect(callArgs.config.abortSignal).toBe(controller.signal)
})
})

View File

@ -1,107 +0,0 @@
import { describe, it, expect, vi } from 'vitest'
import { OpenMultiAgent } from '../src/orchestrator/orchestrator.js'
import { Team } from '../src/team/team.js'
describe('AbortSignal support for runTeam and runTasks', () => {
it('runTeam should accept an abortSignal option', async () => {
const orchestrator = new OpenMultiAgent({
defaultModel: 'test-model',
defaultProvider: 'openai',
})
// Verify the API accepts the option without throwing
const controller = new AbortController()
const team = new Team({
name: 'test',
agents: [
{ name: 'agent1', model: 'test-model', systemPrompt: 'test' },
],
})
// Abort immediately so the run won't actually execute LLM calls
controller.abort()
// runTeam should return gracefully (no unhandled rejection)
const result = await orchestrator.runTeam(team, 'test goal', {
abortSignal: controller.signal,
})
// With immediate abort, coordinator may or may not have run,
// but the function should not throw.
expect(result).toBeDefined()
expect(result.agentResults).toBeInstanceOf(Map)
})
it('runTasks should accept an abortSignal option', async () => {
const orchestrator = new OpenMultiAgent({
defaultModel: 'test-model',
defaultProvider: 'openai',
})
const controller = new AbortController()
const team = new Team({
name: 'test',
agents: [
{ name: 'agent1', model: 'test-model', systemPrompt: 'test' },
],
})
controller.abort()
const result = await orchestrator.runTasks(team, [
{ title: 'task1', description: 'do something', assignee: 'agent1' },
], { abortSignal: controller.signal })
expect(result).toBeDefined()
expect(result.agentResults).toBeInstanceOf(Map)
})
it('pre-aborted signal should skip pending tasks', async () => {
const orchestrator = new OpenMultiAgent({
defaultModel: 'test-model',
defaultProvider: 'openai',
})
const controller = new AbortController()
controller.abort()
const team = new Team({
name: 'test',
agents: [
{ name: 'agent1', model: 'test-model', systemPrompt: 'test' },
],
})
const result = await orchestrator.runTasks(team, [
{ title: 'task1', description: 'first', assignee: 'agent1' },
{ title: 'task2', description: 'second', assignee: 'agent1' },
], { abortSignal: controller.signal })
// No agent runs should complete since signal was already aborted
expect(result).toBeDefined()
})
it('runTeam and runTasks work without abortSignal (backward compat)', async () => {
const orchestrator = new OpenMultiAgent({
defaultModel: 'test-model',
defaultProvider: 'openai',
})
const team = new Team({
name: 'test',
agents: [
{ name: 'agent1', model: 'test-model', systemPrompt: 'test' },
],
})
// These should not throw even without abortSignal
const promise1 = orchestrator.runTeam(team, 'goal')
const promise2 = orchestrator.runTasks(team, [
{ title: 'task1', description: 'do something', assignee: 'agent1' },
])
// Both return promises (won't resolve without real LLM, but API is correct)
expect(promise1).toBeInstanceOf(Promise)
expect(promise2).toBeInstanceOf(Promise)
})
})

View File

@ -1,473 +0,0 @@
import { describe, it, expect, vi } from 'vitest'
import { z } from 'zod'
import { Agent } from '../src/agent/agent.js'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import type { AgentConfig, AgentRunResult, LLMAdapter, LLMMessage, LLMResponse, StreamEvent } from '../src/types.js'
// ---------------------------------------------------------------------------
// Mock helpers
// ---------------------------------------------------------------------------
/**
* Create a mock adapter that records every `chat()` call's messages
* and returns a fixed text response.
*/
function mockAdapter(responseText: string) {
const calls: LLMMessage[][] = []
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push([...messages])
return {
id: 'mock-1',
content: [{ type: 'text' as const, text: responseText }],
model: 'mock-model',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 20 },
} satisfies LLMResponse
},
async *stream() {
/* unused */
},
}
return { adapter, calls }
}
/** Build an Agent with a mocked LLM, bypassing createAdapter. */
function buildMockAgent(config: AgentConfig, responseText: string) {
const { adapter, calls } = mockAdapter(responseText)
const registry = new ToolRegistry()
const executor = new ToolExecutor(registry)
const agent = new Agent(config, registry, executor)
const runner = new AgentRunner(adapter, registry, executor, {
model: config.model,
systemPrompt: config.systemPrompt,
maxTurns: config.maxTurns,
maxTokens: config.maxTokens,
temperature: config.temperature,
agentName: config.name,
})
;(agent as any).runner = runner
return { agent, calls }
}
const baseConfig: AgentConfig = {
name: 'test-agent',
model: 'mock-model',
systemPrompt: 'You are a test agent.',
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe('Agent hooks — beforeRun / afterRun', () => {
// -----------------------------------------------------------------------
// Baseline — no hooks
// -----------------------------------------------------------------------
it('works normally without hooks', async () => {
const { agent } = buildMockAgent(baseConfig, 'hello')
const result = await agent.run('ping')
expect(result.success).toBe(true)
expect(result.output).toBe('hello')
})
// -----------------------------------------------------------------------
// beforeRun
// -----------------------------------------------------------------------
it('beforeRun can modify the prompt', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => ({ ...ctx, prompt: 'modified prompt' }),
}
const { agent, calls } = buildMockAgent(config, 'response')
await agent.run('original prompt')
// The adapter should have received the modified prompt.
const lastUserMsg = calls[0]!.find(m => m.role === 'user')
const textBlock = lastUserMsg!.content.find(b => b.type === 'text')
expect((textBlock as any).text).toBe('modified prompt')
})
it('beforeRun that returns context unchanged does not alter prompt', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => ctx,
}
const { agent, calls } = buildMockAgent(config, 'response')
await agent.run('keep this')
const lastUserMsg = calls[0]!.find(m => m.role === 'user')
const textBlock = lastUserMsg!.content.find(b => b.type === 'text')
expect((textBlock as any).text).toBe('keep this')
})
it('beforeRun throwing aborts the run with failure', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: () => { throw new Error('budget exceeded') },
}
const { agent, calls } = buildMockAgent(config, 'should not reach')
const result = await agent.run('hi')
expect(result.success).toBe(false)
expect(result.output).toContain('budget exceeded')
// No LLM call should have been made.
expect(calls).toHaveLength(0)
})
it('async beforeRun works', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: async (ctx) => {
await Promise.resolve()
return { ...ctx, prompt: 'async modified' }
},
}
const { agent, calls } = buildMockAgent(config, 'ok')
await agent.run('original')
const lastUserMsg = calls[0]!.find(m => m.role === 'user')
const textBlock = lastUserMsg!.content.find(b => b.type === 'text')
expect((textBlock as any).text).toBe('async modified')
})
// -----------------------------------------------------------------------
// afterRun
// -----------------------------------------------------------------------
it('afterRun can modify the result', async () => {
const config: AgentConfig = {
...baseConfig,
afterRun: (result) => ({ ...result, output: 'modified output' }),
}
const { agent } = buildMockAgent(config, 'original output')
const result = await agent.run('hi')
expect(result.success).toBe(true)
expect(result.output).toBe('modified output')
})
it('afterRun throwing marks run as failed', async () => {
const config: AgentConfig = {
...baseConfig,
afterRun: () => { throw new Error('content violation') },
}
const { agent } = buildMockAgent(config, 'bad content')
const result = await agent.run('hi')
expect(result.success).toBe(false)
expect(result.output).toContain('content violation')
})
it('async afterRun works', async () => {
const config: AgentConfig = {
...baseConfig,
afterRun: async (result) => {
await Promise.resolve()
return { ...result, output: result.output.toUpperCase() }
},
}
const { agent } = buildMockAgent(config, 'hello')
const result = await agent.run('hi')
expect(result.output).toBe('HELLO')
})
// -----------------------------------------------------------------------
// Both hooks together
// -----------------------------------------------------------------------
it('beforeRun and afterRun compose correctly', async () => {
const hookOrder: string[] = []
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => {
hookOrder.push('before')
return { ...ctx, prompt: 'injected prompt' }
},
afterRun: (result) => {
hookOrder.push('after')
return { ...result, output: `[processed] ${result.output}` }
},
}
const { agent, calls } = buildMockAgent(config, 'raw output')
const result = await agent.run('original')
expect(hookOrder).toEqual(['before', 'after'])
const lastUserMsg = calls[0]!.find(m => m.role === 'user')
const textBlock = lastUserMsg!.content.find(b => b.type === 'text')
expect((textBlock as any).text).toBe('injected prompt')
expect(result.output).toBe('[processed] raw output')
})
// -----------------------------------------------------------------------
// prompt() multi-turn mode
// -----------------------------------------------------------------------
it('hooks fire on prompt() calls', async () => {
const beforeSpy = vi.fn((ctx) => ctx)
const afterSpy = vi.fn((result) => result)
const config: AgentConfig = {
...baseConfig,
beforeRun: beforeSpy,
afterRun: afterSpy,
}
const { agent } = buildMockAgent(config, 'reply')
await agent.prompt('hello')
expect(beforeSpy).toHaveBeenCalledOnce()
expect(afterSpy).toHaveBeenCalledOnce()
expect(beforeSpy.mock.calls[0]![0].prompt).toBe('hello')
})
// -----------------------------------------------------------------------
// stream() mode
// -----------------------------------------------------------------------
it('beforeRun fires in stream mode', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => ({ ...ctx, prompt: 'stream modified' }),
}
const { agent, calls } = buildMockAgent(config, 'streamed')
const events: StreamEvent[] = []
for await (const event of agent.stream('original')) {
events.push(event)
}
const lastUserMsg = calls[0]!.find(m => m.role === 'user')
const textBlock = lastUserMsg!.content.find(b => b.type === 'text')
expect((textBlock as any).text).toBe('stream modified')
// The stream should still finish with a done event.
expect(events.some(e => e.type === 'done')).toBe(true)
})
it('afterRun fires in stream mode and modifies done event', async () => {
const config: AgentConfig = {
...baseConfig,
afterRun: (result) => ({ ...result, output: 'stream modified output' }),
}
const { agent } = buildMockAgent(config, 'original')
const events: StreamEvent[] = []
for await (const event of agent.stream('hi')) {
events.push(event)
}
const doneEvent = events.find(e => e.type === 'done')
expect(doneEvent).toBeDefined()
expect((doneEvent!.data as AgentRunResult).output).toBe('stream modified output')
})
it('beforeRun throwing in stream mode yields error event', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: () => { throw new Error('stream abort') },
}
const { agent } = buildMockAgent(config, 'unreachable')
const events: StreamEvent[] = []
for await (const event of agent.stream('hi')) {
events.push(event)
}
const errorEvent = events.find(e => e.type === 'error')
expect(errorEvent).toBeDefined()
expect((errorEvent!.data as Error).message).toContain('stream abort')
})
it('afterRun throwing in stream mode yields error event', async () => {
const config: AgentConfig = {
...baseConfig,
afterRun: () => { throw new Error('stream content violation') },
}
const { agent } = buildMockAgent(config, 'streamed output')
const events: StreamEvent[] = []
for await (const event of agent.stream('hi')) {
events.push(event)
}
// Text events may have been yielded before the error.
const errorEvent = events.find(e => e.type === 'error')
expect(errorEvent).toBeDefined()
expect((errorEvent!.data as Error).message).toContain('stream content violation')
// No done event should be present since afterRun rejected it.
expect(events.find(e => e.type === 'done')).toBeUndefined()
})
// -----------------------------------------------------------------------
// prompt() history integrity
// -----------------------------------------------------------------------
it('beforeRun modifying prompt preserves non-text content blocks', async () => {
// Simulate a multi-turn history in which an earlier user message has mixed content
// (text + image). beforeRun should only replace the text of the latest user message
// and must not strip other blocks.
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => ({ ...ctx, prompt: 'modified' }),
}
const { adapter, calls } = mockAdapter('ok')
const registry = new ToolRegistry()
const executor = new ToolExecutor(registry)
const agent = new Agent(config, registry, executor)
const runner = new AgentRunner(adapter, registry, executor, {
model: config.model,
agentName: config.name,
})
;(agent as any).runner = runner
// run() only builds a single text-only user message, so mixed content cannot be
// exercised through it, and executeRun is private.
// Instead, seed the history with a mixed-content message and go through prompt().
;(agent as any).messageHistory = [
{
role: 'user' as const,
content: [
{ type: 'text' as const, text: 'original' },
{ type: 'image' as const, source: { type: 'base64' as const, media_type: 'image/png', data: 'abc' } },
],
},
]
// prompt() appends a new user message then calls executeRun with full history
await agent.prompt('follow up')
// The last user message sent to the LLM should have modified text
const sentMessages = calls[0]!
const lastUser = [...sentMessages].reverse().find(m => m.role === 'user')!
const textBlock = lastUser.content.find(b => b.type === 'text')
expect((textBlock as any).text).toBe('modified')
// The earlier user message (with the image) should be untouched
const firstUser = sentMessages.find(m => m.role === 'user')!
const imageBlock = firstUser.content.find(b => b.type === 'image')
expect(imageBlock).toBeDefined()
})
it('beforeRun modifying prompt does not corrupt messageHistory', async () => {
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => ({ ...ctx, prompt: 'hook-modified' }),
}
const { agent, calls } = buildMockAgent(config, 'reply')
await agent.prompt('original message')
// The LLM should have received the modified prompt.
const lastUserMsg = calls[0]!.find(m => m.role === 'user')
expect((lastUserMsg!.content[0] as any).text).toBe('hook-modified')
// But the persistent history should retain the original message.
const history = agent.getHistory()
const firstUserInHistory = history.find(m => m.role === 'user')
expect((firstUserInHistory!.content[0] as any).text).toBe('original message')
})
// -----------------------------------------------------------------------
// afterRun NOT called on error
// -----------------------------------------------------------------------
it('afterRun is not called when executeRun throws', async () => {
const afterSpy = vi.fn((result) => result)
const config: AgentConfig = {
...baseConfig,
// Use beforeRun to trigger an error inside executeRun's try block,
// before afterRun would normally run.
beforeRun: () => { throw new Error('rejected by policy') },
afterRun: afterSpy,
}
const { agent } = buildMockAgent(config, 'should not reach')
const result = await agent.run('hi')
expect(result.success).toBe(false)
expect(result.output).toContain('rejected by policy')
expect(afterSpy).not.toHaveBeenCalled()
})
// -----------------------------------------------------------------------
// outputSchema + afterRun
// -----------------------------------------------------------------------
it('afterRun fires after structured output validation', async () => {
const schema = z.object({ answer: z.string() })
const config: AgentConfig = {
...baseConfig,
outputSchema: schema,
afterRun: (result) => ({ ...result, output: '[post-processed] ' + result.output }),
}
// Return valid JSON matching the schema
const { agent } = buildMockAgent(config, '{"answer":"42"}')
const result = await agent.run('what is the answer?')
expect(result.success).toBe(true)
expect(result.output).toBe('[post-processed] {"answer":"42"}')
expect(result.structured).toEqual({ answer: '42' })
})
// -----------------------------------------------------------------------
// ctx.agent does not contain hook self-references
// -----------------------------------------------------------------------
it('beforeRun context.agent has correct config without hook self-references', async () => {
let receivedAgent: AgentConfig | undefined
const config: AgentConfig = {
...baseConfig,
beforeRun: (ctx) => {
receivedAgent = ctx.agent
return ctx
},
}
const { agent } = buildMockAgent(config, 'ok')
await agent.run('test')
expect(receivedAgent).toBeDefined()
expect(receivedAgent!.name).toBe('test-agent')
expect(receivedAgent!.model).toBe('mock-model')
// Hook functions should be stripped to avoid circular references
expect(receivedAgent!.beforeRun).toBeUndefined()
expect(receivedAgent!.afterRun).toBeUndefined()
})
// -----------------------------------------------------------------------
// Multiple prompt() turns fire hooks each time
// -----------------------------------------------------------------------
it('hooks fire on every prompt() call', async () => {
const beforeSpy = vi.fn((ctx) => ctx)
const afterSpy = vi.fn((result) => result)
const config: AgentConfig = {
...baseConfig,
beforeRun: beforeSpy,
afterRun: afterSpy,
}
const { agent } = buildMockAgent(config, 'reply')
await agent.prompt('turn 1')
await agent.prompt('turn 2')
expect(beforeSpy).toHaveBeenCalledTimes(2)
expect(afterSpy).toHaveBeenCalledTimes(2)
expect(beforeSpy.mock.calls[0]![0].prompt).toBe('turn 1')
expect(beforeSpy.mock.calls[1]![0].prompt).toBe('turn 2')
})
})

View File

@ -1,383 +0,0 @@
import { describe, it, expect, vi } from 'vitest'
import { AgentPool } from '../src/agent/pool.js'
import type { Agent } from '../src/agent/agent.js'
import type { AgentRunResult, AgentState } from '../src/types.js'
// ---------------------------------------------------------------------------
// Mock Agent factory
// ---------------------------------------------------------------------------
const SUCCESS_RESULT: AgentRunResult = {
success: true,
output: 'done',
messages: [],
tokenUsage: { input_tokens: 10, output_tokens: 20 },
toolCalls: [],
}
function createMockAgent(
name: string,
opts?: { runResult?: AgentRunResult; state?: AgentState['status'] },
): Agent {
const state: AgentState = {
status: opts?.state ?? 'idle',
messages: [],
tokenUsage: { input_tokens: 0, output_tokens: 0 },
}
return {
name,
config: { name, model: 'test' },
run: vi.fn().mockResolvedValue(opts?.runResult ?? SUCCESS_RESULT),
getState: vi.fn().mockReturnValue(state),
reset: vi.fn(),
} as unknown as Agent
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe('AgentPool', () => {
describe('registry: add / remove / get / list', () => {
it('adds and retrieves an agent', () => {
const pool = new AgentPool()
const agent = createMockAgent('alice')
pool.add(agent)
expect(pool.get('alice')).toBe(agent)
expect(pool.list()).toHaveLength(1)
})
it('throws on duplicate add', () => {
const pool = new AgentPool()
pool.add(createMockAgent('alice'))
expect(() => pool.add(createMockAgent('alice'))).toThrow('already registered')
})
it('removes an agent', () => {
const pool = new AgentPool()
pool.add(createMockAgent('alice'))
pool.remove('alice')
expect(pool.get('alice')).toBeUndefined()
expect(pool.list()).toHaveLength(0)
})
it('throws on remove of unknown agent', () => {
const pool = new AgentPool()
expect(() => pool.remove('unknown')).toThrow('not registered')
})
it('get returns undefined for unknown agent', () => {
const pool = new AgentPool()
expect(pool.get('unknown')).toBeUndefined()
})
})
describe('run', () => {
it('runs a prompt on a named agent', async () => {
const pool = new AgentPool()
const agent = createMockAgent('alice')
pool.add(agent)
const result = await pool.run('alice', 'hello')
expect(result.success).toBe(true)
expect(agent.run).toHaveBeenCalledWith('hello', undefined)
})
it('throws on unknown agent name', async () => {
const pool = new AgentPool()
await expect(pool.run('unknown', 'hello')).rejects.toThrow('not registered')
})
})
describe('runParallel', () => {
it('runs multiple agents in parallel', async () => {
const pool = new AgentPool(5)
pool.add(createMockAgent('a'))
pool.add(createMockAgent('b'))
const results = await pool.runParallel([
{ agent: 'a', prompt: 'task a' },
{ agent: 'b', prompt: 'task b' },
])
expect(results.size).toBe(2)
expect(results.get('a')!.success).toBe(true)
expect(results.get('b')!.success).toBe(true)
})
it('handles agent failures gracefully', async () => {
const pool = new AgentPool()
const failAgent = createMockAgent('fail')
;(failAgent.run as ReturnType<typeof vi.fn>).mockRejectedValue(new Error('boom'))
pool.add(failAgent)
const results = await pool.runParallel([
{ agent: 'fail', prompt: 'will fail' },
])
expect(results.get('fail')!.success).toBe(false)
expect(results.get('fail')!.output).toContain('boom')
})
})
describe('runAny', () => {
it('round-robins across agents', async () => {
const pool = new AgentPool()
const a = createMockAgent('a')
const b = createMockAgent('b')
pool.add(a)
pool.add(b)
await pool.runAny('first')
await pool.runAny('second')
expect(a.run).toHaveBeenCalledTimes(1)
expect(b.run).toHaveBeenCalledTimes(1)
})
it('throws on empty pool', async () => {
const pool = new AgentPool()
await expect(pool.runAny('hello')).rejects.toThrow('empty pool')
})
})
describe('getStatus', () => {
it('reports agent states', () => {
const pool = new AgentPool()
pool.add(createMockAgent('idle1', { state: 'idle' }))
pool.add(createMockAgent('idle2', { state: 'idle' }))
pool.add(createMockAgent('running', { state: 'running' }))
pool.add(createMockAgent('done', { state: 'completed' }))
pool.add(createMockAgent('err', { state: 'error' }))
const status = pool.getStatus()
expect(status.total).toBe(5)
expect(status.idle).toBe(2)
expect(status.running).toBe(1)
expect(status.completed).toBe(1)
expect(status.error).toBe(1)
})
})
describe('shutdown', () => {
it('resets all agents', async () => {
const pool = new AgentPool()
const a = createMockAgent('a')
const b = createMockAgent('b')
pool.add(a)
pool.add(b)
await pool.shutdown()
expect(a.reset).toHaveBeenCalled()
expect(b.reset).toHaveBeenCalled()
})
})
describe('per-agent serialization (#72)', () => {
it('serializes concurrent runs on the same agent', async () => {
const executionLog: string[] = []
const agent = createMockAgent('dev')
;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async (prompt: string) => {
executionLog.push(`start:${prompt}`)
await new Promise(r => setTimeout(r, 50))
executionLog.push(`end:${prompt}`)
return SUCCESS_RESULT
})
const pool = new AgentPool(5)
pool.add(agent)
// Fire two runs for the same agent concurrently
await Promise.all([
pool.run('dev', 'task1'),
pool.run('dev', 'task2'),
])
// With per-agent serialization, runs must not overlap:
// [start:task1, end:task1, start:task2, end:task2] (or reverse order)
// i.e. no interleaving like [start:task1, start:task2, ...]
expect(executionLog).toHaveLength(4)
expect(executionLog[0]).toMatch(/^start:/)
expect(executionLog[1]).toMatch(/^end:/)
expect(executionLog[2]).toMatch(/^start:/)
expect(executionLog[3]).toMatch(/^end:/)
})
it('allows different agents to run in parallel', async () => {
let concurrent = 0
let maxConcurrent = 0
const makeTimedAgent = (name: string): Agent => {
const agent = createMockAgent(name)
;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async () => {
concurrent++
maxConcurrent = Math.max(maxConcurrent, concurrent)
await new Promise(r => setTimeout(r, 50))
concurrent--
return SUCCESS_RESULT
})
return agent
}
const pool = new AgentPool(5)
pool.add(makeTimedAgent('a'))
pool.add(makeTimedAgent('b'))
await Promise.all([
pool.run('a', 'x'),
pool.run('b', 'y'),
])
// Different agents should run concurrently
expect(maxConcurrent).toBe(2)
})
it('releases agent lock even when run() throws', async () => {
const agent = createMockAgent('dev')
let callCount = 0
;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async () => {
callCount++
if (callCount === 1) throw new Error('first run fails')
return SUCCESS_RESULT
})
const pool = new AgentPool(5)
pool.add(agent)
// First run fails, second should still execute (not deadlock)
const results = await Promise.allSettled([
pool.run('dev', 'will-fail'),
pool.run('dev', 'should-succeed'),
])
expect(results[0]!.status).toBe('rejected')
expect(results[1]!.status).toBe('fulfilled')
})
})
describe('concurrency', () => {
it('respects maxConcurrency limit', async () => {
let concurrent = 0
let maxConcurrent = 0
const makeAgent = (name: string): Agent => {
const agent = createMockAgent(name)
;(agent.run as ReturnType<typeof vi.fn>).mockImplementation(async () => {
concurrent++
maxConcurrent = Math.max(maxConcurrent, concurrent)
await new Promise(r => setTimeout(r, 50))
concurrent--
return SUCCESS_RESULT
})
return agent
}
const pool = new AgentPool(2) // max 2 concurrent
pool.add(makeAgent('a'))
pool.add(makeAgent('b'))
pool.add(makeAgent('c'))
await pool.runParallel([
{ agent: 'a', prompt: 'x' },
{ agent: 'b', prompt: 'y' },
{ agent: 'c', prompt: 'z' },
])
expect(maxConcurrent).toBeLessThanOrEqual(2)
})
it('availableRunSlots matches maxConcurrency when idle', () => {
const pool = new AgentPool(3)
pool.add(createMockAgent('a'))
expect(pool.availableRunSlots).toBe(3)
})
it('availableRunSlots is zero while a run holds the pool slot', async () => {
const pool = new AgentPool(1)
const agent = createMockAgent('solo')
pool.add(agent)
let finishRun!: (value: AgentRunResult) => void
const holdPromise = new Promise<AgentRunResult>((resolve) => {
finishRun = resolve
})
vi.mocked(agent.run).mockReturnValue(holdPromise)
const runPromise = pool.run('solo', 'hold-slot')
// Yield to the microtask queue twice so pool.run() can claim its slot before the assertion.
await Promise.resolve()
await Promise.resolve()
expect(pool.availableRunSlots).toBe(0)
finishRun(SUCCESS_RESULT)
await runPromise
expect(pool.availableRunSlots).toBe(1)
})
it('runEphemeral runs a caller-supplied Agent without touching the agentLock', async () => {
// Registered agent's lock is held by a pending pool.run — a second
// pool.run() against the same name would queue on the agent lock.
// runEphemeral on a fresh Agent instance must NOT block on that lock.
const pool = new AgentPool(3)
const registered = createMockAgent('alice')
pool.add(registered)
let releaseRegistered!: (v: AgentRunResult) => void
vi.mocked(registered.run).mockReturnValue(
new Promise<AgentRunResult>((resolve) => {
releaseRegistered = resolve
}),
)
const heldRun = pool.run('alice', 'long running')
await Promise.resolve()
await Promise.resolve()
const ephemeral = createMockAgent('alice') // same name, fresh instance
const ephemeralResult = await pool.runEphemeral(ephemeral, 'quick task')
expect(ephemeralResult).toBe(SUCCESS_RESULT)
expect(ephemeral.run).toHaveBeenCalledWith('quick task', undefined)
releaseRegistered(SUCCESS_RESULT)
await heldRun
})
it('runEphemeral still respects pool semaphore', async () => {
const pool = new AgentPool(1)
const holder = createMockAgent('holder')
pool.add(holder)
let releaseHolder!: (v: AgentRunResult) => void
vi.mocked(holder.run).mockReturnValue(
new Promise<AgentRunResult>((resolve) => {
releaseHolder = resolve
}),
)
const heldRun = pool.run('holder', 'hold-slot')
await Promise.resolve()
await Promise.resolve()
expect(pool.availableRunSlots).toBe(0)
// Ephemeral agent should queue on the semaphore, not run immediately.
const ephemeral = createMockAgent('ephemeral')
let ephemeralResolved = false
const ephemeralRun = pool.runEphemeral(ephemeral, 'p').then((r) => {
ephemeralResolved = true
return r
})
// Give the ephemeral run a chance to settle; it should still be waiting on the semaphore.
await Promise.resolve()
await Promise.resolve()
expect(ephemeralResolved).toBe(false)
releaseHolder(SUCCESS_RESULT)
await heldRun
await ephemeralRun
expect(ephemeralResolved).toBe(true)
})
})
})
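
The serialization and concurrency tests above describe two layers of locking: runs on the same agent never overlap, while the pool as a whole caps how many runs are in flight. A minimal sketch of that idea follows. `Semaphore` and `MiniPool` are hypothetical names chosen for illustration; this is not the `AgentPool` source, and the acquisition order (agent lock first, then pool slot) is an assumption.

```typescript
// Hypothetical sketch: MiniPool and Semaphore are illustrative names,
// not the AgentPool implementation under test.
class Semaphore {
  private readonly max: number
  private active = 0
  private readonly waiters: Array<() => void> = []

  constructor(max: number) {
    this.max = max
  }

  get available(): number {
    return this.max - this.active
  }

  acquire(): Promise<void> {
    if (this.active < this.max) {
      this.active++
      return Promise.resolve()
    }
    // No free slot: park until release() hands one over.
    return new Promise<void>((resolve) => this.waiters.push(resolve))
  }

  release(): void {
    const next = this.waiters.shift()
    if (next) next() // the slot passes straight to the next waiter (FIFO)
    else this.active--
  }
}

class MiniPool {
  private readonly slots: Semaphore
  private readonly locks = new Map<string, Semaphore>()

  constructor(maxConcurrency: number) {
    this.slots = new Semaphore(maxConcurrency)
  }

  get availableRunSlots(): number {
    return this.slots.available
  }

  // Assumed order: per-agent lock first, then a pool-wide slot.
  async run<T>(name: string, work: () => Promise<T>): Promise<T> {
    let lock = this.locks.get(name)
    if (!lock) {
      lock = new Semaphore(1)
      this.locks.set(name, lock)
    }
    await lock.acquire()
    try {
      await this.slots.acquire()
      try {
        return await work()
      } finally {
        this.slots.release() // released even when work() throws
      }
    } finally {
      lock.release()
    }
  }
}
```

Because both releases sit in `finally` blocks, a rejected run still frees the agent lock and the pool slot, which is the property the "releases agent lock even when run() throws" test checks.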

@ -1,436 +0,0 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { textMsg, toolUseMsg, toolResultMsg, imageMsg, chatOpts, toolDef, collectEvents } from './helpers/llm-fixtures.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../src/types.js'
// ---------------------------------------------------------------------------
// Mock the Anthropic SDK
// ---------------------------------------------------------------------------
const mockCreate = vi.hoisted(() => vi.fn())
const mockStream = vi.hoisted(() => vi.fn())
vi.mock('@anthropic-ai/sdk', () => {
const AnthropicMock = vi.fn(() => ({
messages: {
create: mockCreate,
stream: mockStream,
},
}))
return { default: AnthropicMock, Anthropic: AnthropicMock }
})
import { AnthropicAdapter } from '../src/llm/anthropic.js'
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function makeAnthropicResponse(overrides: Record<string, unknown> = {}) {
return {
id: 'msg_test123',
content: [{ type: 'text', text: 'Hello' }],
model: 'claude-sonnet-4',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 5 },
...overrides,
}
}
function makeStreamMock(events: Array<Record<string, unknown>>, finalMsg: Record<string, unknown>) {
return {
[Symbol.asyncIterator]: async function* () {
for (const event of events) yield event
},
finalMessage: vi.fn().mockResolvedValue(finalMsg),
}
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe('AnthropicAdapter', () => {
let adapter: AnthropicAdapter
beforeEach(() => {
vi.clearAllMocks()
adapter = new AnthropicAdapter('test-key')
})
// =========================================================================
// chat()
// =========================================================================
describe('chat()', () => {
it('converts a text message and returns LLMResponse', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())
// Verify the SDK was called with correct shape
const callArgs = mockCreate.mock.calls[0]
expect(callArgs[0]).toMatchObject({
model: 'test-model',
max_tokens: 1024,
messages: [{ role: 'user', content: [{ type: 'text', text: 'Hi' }] }],
})
// Verify response transformation
expect(result).toEqual({
id: 'msg_test123',
content: [{ type: 'text', text: 'Hello' }],
model: 'claude-sonnet-4',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 5 },
})
})
it('converts tool_use blocks to Anthropic format', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
await adapter.chat(
[toolUseMsg('call_1', 'search', { query: 'test' })],
chatOpts(),
)
const sentMessages = mockCreate.mock.calls[0][0].messages
expect(sentMessages[0].content[0]).toEqual({
type: 'tool_use',
id: 'call_1',
name: 'search',
input: { query: 'test' },
})
})
it('converts tool_result blocks to Anthropic format', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
await adapter.chat(
[toolResultMsg('call_1', 'result data', false)],
chatOpts(),
)
const sentMessages = mockCreate.mock.calls[0][0].messages
expect(sentMessages[0].content[0]).toEqual({
type: 'tool_result',
tool_use_id: 'call_1',
content: 'result data',
is_error: false,
})
})
it('converts image blocks to Anthropic format', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
await adapter.chat([imageMsg('image/png', 'base64data')], chatOpts())
const sentMessages = mockCreate.mock.calls[0][0].messages
expect(sentMessages[0].content[0]).toEqual({
type: 'image',
source: {
type: 'base64',
media_type: 'image/png',
data: 'base64data',
},
})
})
it('passes system prompt as top-level parameter', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({ systemPrompt: 'You are helpful.' }),
)
expect(mockCreate.mock.calls[0][0].system).toBe('You are helpful.')
})
it('converts tools to Anthropic format', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
const tool = toolDef('search', 'Search the web')
await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({ tools: [tool] }),
)
const sentTools = mockCreate.mock.calls[0][0].tools
expect(sentTools[0]).toEqual({
name: 'search',
description: 'Search the web',
input_schema: {
type: 'object',
properties: { query: { type: 'string' } },
required: ['query'],
},
})
})
it('passes temperature through', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({ temperature: 0.5 }),
)
expect(mockCreate.mock.calls[0][0].temperature).toBe(0.5)
})
it('passes abortSignal to SDK request options', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
const controller = new AbortController()
await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({ abortSignal: controller.signal }),
)
expect(mockCreate.mock.calls[0][1]).toEqual({ signal: controller.signal })
})
it('defaults max_tokens to 4096 when unset', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse())
await adapter.chat(
[textMsg('user', 'Hi')],
{ model: 'test-model' },
)
expect(mockCreate.mock.calls[0][0].max_tokens).toBe(4096)
})
it('converts tool_use response blocks from Anthropic', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse({
content: [
{ type: 'tool_use', id: 'call_1', name: 'search', input: { q: 'test' } },
],
stop_reason: 'tool_use',
}))
const result = await adapter.chat([textMsg('user', 'search')], chatOpts())
expect(result.content[0]).toEqual({
type: 'tool_use',
id: 'call_1',
name: 'search',
input: { q: 'test' },
})
expect(result.stop_reason).toBe('tool_use')
})
it('gracefully degrades unknown block types to text', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse({
content: [{ type: 'thinking', thinking: 'hmm...' }],
}))
const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())
expect(result.content[0]).toEqual({
type: 'text',
text: '[unsupported block type: thinking]',
})
})
it('defaults stop_reason to end_turn when null', async () => {
mockCreate.mockResolvedValue(makeAnthropicResponse({ stop_reason: null }))
const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())
expect(result.stop_reason).toBe('end_turn')
})
it('propagates SDK errors', async () => {
mockCreate.mockRejectedValue(new Error('Rate limited'))
await expect(
adapter.chat([textMsg('user', 'Hi')], chatOpts()),
).rejects.toThrow('Rate limited')
})
})
// =========================================================================
// stream()
// =========================================================================
describe('stream()', () => {
it('yields text events from text_delta', async () => {
const streamObj = makeStreamMock(
[
{ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Hello' } },
{ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: ' world' } },
],
makeAnthropicResponse({ content: [{ type: 'text', text: 'Hello world' }] }),
)
mockStream.mockReturnValue(streamObj)
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
const textEvents = events.filter(e => e.type === 'text')
expect(textEvents).toEqual([
{ type: 'text', data: 'Hello' },
{ type: 'text', data: ' world' },
])
})
it('accumulates tool input JSON and emits tool_use on content_block_stop', async () => {
const streamObj = makeStreamMock(
[
{
type: 'content_block_start',
index: 0,
content_block: { type: 'tool_use', id: 'call_1', name: 'search' },
},
{
type: 'content_block_delta',
index: 0,
delta: { type: 'input_json_delta', partial_json: '{"qu' },
},
{
type: 'content_block_delta',
index: 0,
delta: { type: 'input_json_delta', partial_json: 'ery":"test"}' },
},
{ type: 'content_block_stop', index: 0 },
],
makeAnthropicResponse({
content: [{ type: 'tool_use', id: 'call_1', name: 'search', input: { query: 'test' } }],
stop_reason: 'tool_use',
}),
)
mockStream.mockReturnValue(streamObj)
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
const toolEvents = events.filter(e => e.type === 'tool_use')
expect(toolEvents).toHaveLength(1)
const block = toolEvents[0].data as ToolUseBlock
expect(block).toEqual({
type: 'tool_use',
id: 'call_1',
name: 'search',
input: { query: 'test' },
})
})
it('handles malformed tool JSON gracefully (defaults to empty object)', async () => {
const streamObj = makeStreamMock(
[
{
type: 'content_block_start',
index: 0,
content_block: { type: 'tool_use', id: 'call_1', name: 'broken' },
},
{
type: 'content_block_delta',
index: 0,
delta: { type: 'input_json_delta', partial_json: '{invalid' },
},
{ type: 'content_block_stop', index: 0 },
],
makeAnthropicResponse({
content: [{ type: 'tool_use', id: 'call_1', name: 'broken', input: {} }],
}),
)
mockStream.mockReturnValue(streamObj)
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
const toolEvents = events.filter(e => e.type === 'tool_use')
expect((toolEvents[0].data as ToolUseBlock).input).toEqual({})
})
it('yields done event with complete LLMResponse', async () => {
const final = makeAnthropicResponse({
content: [{ type: 'text', text: 'Done' }],
})
const streamObj = makeStreamMock([], final)
mockStream.mockReturnValue(streamObj)
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
const doneEvents = events.filter(e => e.type === 'done')
expect(doneEvents).toHaveLength(1)
const response = doneEvents[0].data as LLMResponse
expect(response.id).toBe('msg_test123')
expect(response.content).toEqual([{ type: 'text', text: 'Done' }])
expect(response.usage).toEqual({ input_tokens: 10, output_tokens: 5 })
})
it('yields error event when stream throws', async () => {
const streamObj = {
[Symbol.asyncIterator]: async function* () {
throw new Error('Stream failed')
},
finalMessage: vi.fn(),
}
mockStream.mockReturnValue(streamObj)
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
const errorEvents = events.filter(e => e.type === 'error')
expect(errorEvents).toHaveLength(1)
expect((errorEvents[0].data as Error).message).toBe('Stream failed')
})
it('passes system prompt and tools to stream call', async () => {
const streamObj = makeStreamMock([], makeAnthropicResponse())
mockStream.mockReturnValue(streamObj)
const tool = toolDef('search')
await collectEvents(
adapter.stream(
[textMsg('user', 'Hi')],
chatOpts({ systemPrompt: 'Be helpful', tools: [tool] }),
),
)
const callArgs = mockStream.mock.calls[0][0]
expect(callArgs.system).toBe('Be helpful')
expect(callArgs.tools[0].name).toBe('search')
})
it('passes abortSignal to stream request options', async () => {
const streamObj = makeStreamMock([], makeAnthropicResponse())
mockStream.mockReturnValue(streamObj)
const controller = new AbortController()
await collectEvents(
adapter.stream(
[textMsg('user', 'Hi')],
chatOpts({ abortSignal: controller.signal }),
),
)
expect(mockStream.mock.calls[0][1]).toEqual({ signal: controller.signal })
})
it('handles multiple tool calls in one stream', async () => {
const streamObj = makeStreamMock(
[
{ type: 'content_block_start', index: 0, content_block: { type: 'tool_use', id: 'c1', name: 'search' } },
{ type: 'content_block_delta', index: 0, delta: { type: 'input_json_delta', partial_json: '{"q":"a"}' } },
{ type: 'content_block_stop', index: 0 },
{ type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: 'c2', name: 'read' } },
{ type: 'content_block_delta', index: 1, delta: { type: 'input_json_delta', partial_json: '{"path":"b"}' } },
{ type: 'content_block_stop', index: 1 },
],
makeAnthropicResponse({
content: [
{ type: 'tool_use', id: 'c1', name: 'search', input: { q: 'a' } },
{ type: 'tool_use', id: 'c2', name: 'read', input: { path: 'b' } },
],
}),
)
mockStream.mockReturnValue(streamObj)
const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
const toolEvents = events.filter(e => e.type === 'tool_use')
expect(toolEvents).toHaveLength(2)
expect((toolEvents[0].data as ToolUseBlock).name).toBe('search')
expect((toolEvents[1].data as ToolUseBlock).name).toBe('read')
})
})
})
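
The streaming tests above expect `input_json_delta` fragments to be concatenated per block index, parsed when `content_block_stop` arrives, and replaced by an empty object when the JSON is malformed. A minimal sketch of that accumulation step, under stated assumptions: `SketchEvent`, `SketchToolUse`, and `collectToolUses` are hypothetical names, the event shapes are copied from the test fixtures, and this is not the `AnthropicAdapter` source.

```typescript
// Hypothetical sketch: these types and collectToolUses are illustrative,
// modelled on the event shapes used in the test fixtures.
type SketchEvent =
  | { type: 'content_block_start'; index: number; content_block: { type: string; id?: string; name?: string } }
  | { type: 'content_block_delta'; index: number; delta: { type: string; partial_json?: string; text?: string } }
  | { type: 'content_block_stop'; index: number }

interface SketchToolUse {
  type: 'tool_use'
  id: string
  name: string
  input: Record<string, unknown>
}

function collectToolUses(events: SketchEvent[]): SketchToolUse[] {
  // One JSON buffer per content block index.
  const pending = new Map<number, { id: string; name: string; json: string }>()
  const out: SketchToolUse[] = []

  for (const event of events) {
    if (event.type === 'content_block_start') {
      const block = event.content_block
      if (block.type === 'tool_use') {
        pending.set(event.index, { id: block.id ?? '', name: block.name ?? '', json: '' })
      }
    } else if (event.type === 'content_block_delta') {
      const entry = pending.get(event.index)
      if (entry && event.delta.type === 'input_json_delta') {
        entry.json += event.delta.partial_json ?? ''
      }
    } else {
      // content_block_stop: parse whatever was buffered for this index.
      const entry = pending.get(event.index)
      if (!entry) continue
      pending.delete(event.index)
      let input: Record<string, unknown> = {}
      try {
        const parsed: unknown = JSON.parse(entry.json)
        if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
          input = parsed as Record<string, unknown>
        }
      } catch {
        // Malformed or empty JSON falls back to an empty input object.
      }
      out.push({ type: 'tool_use', id: entry.id, name: entry.name, input })
    }
  }
  return out
}
```

Keying the buffers by `index` is what lets several tool calls in one stream stay separate, as in the "handles multiple tool calls in one stream" test.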

@ -1,464 +0,0 @@
import { describe, it, expect, vi } from 'vitest'
import { TaskQueue } from '../src/task/queue.js'
import { createTask } from '../src/task/task.js'
import { OpenMultiAgent } from '../src/orchestrator/orchestrator.js'
import { Agent } from '../src/agent/agent.js'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import { AgentPool } from '../src/agent/pool.js'
import type { AgentConfig, LLMAdapter, LLMResponse, Task } from '../src/types.js'
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function task(id: string, opts: { dependsOn?: string[]; assignee?: string } = {}) {
const t = createTask({ title: id, description: `task ${id}`, assignee: opts.assignee })
return { ...t, id, dependsOn: opts.dependsOn } as ReturnType<typeof createTask>
}
function mockAdapter(responseText: string): LLMAdapter {
return {
name: 'mock',
async chat() {
return {
id: 'mock-1',
content: [{ type: 'text' as const, text: responseText }],
model: 'mock-model',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 20 },
} satisfies LLMResponse
},
async *stream() {
/* unused */
},
}
}
function buildMockAgent(config: AgentConfig, responseText: string): Agent {
const registry = new ToolRegistry()
const executor = new ToolExecutor(registry)
const agent = new Agent(config, registry, executor)
const runner = new AgentRunner(mockAdapter(responseText), registry, executor, {
model: config.model,
systemPrompt: config.systemPrompt,
maxTurns: config.maxTurns,
maxTokens: config.maxTokens,
temperature: config.temperature,
agentName: config.name,
})
// Inject the mock-backed runner; the `as any` cast sidesteps type checks on the field.
;(agent as any).runner = runner
return agent
}
// ---------------------------------------------------------------------------
// TaskQueue: skip / skipRemaining
// ---------------------------------------------------------------------------
describe('TaskQueue — skip', () => {
it('marks a task as skipped', () => {
const q = new TaskQueue()
q.add(task('a'))
q.skip('a', 'user rejected')
expect(q.list()[0].status).toBe('skipped')
expect(q.list()[0].result).toBe('user rejected')
})
it('fires task:skipped event with updated task object', () => {
const q = new TaskQueue()
const handler = vi.fn()
q.on('task:skipped', handler)
q.add(task('a'))
q.skip('a', 'rejected')
expect(handler).toHaveBeenCalledTimes(1)
const emitted = handler.mock.calls[0][0]
expect(emitted.id).toBe('a')
expect(emitted.status).toBe('skipped')
expect(emitted.result).toBe('rejected')
})
it('cascades skip to dependent tasks', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
q.add(task('c', { dependsOn: ['b'] }))
q.skip('a', 'rejected')
expect(q.list().find((t) => t.id === 'a')!.status).toBe('skipped')
expect(q.list().find((t) => t.id === 'b')!.status).toBe('skipped')
expect(q.list().find((t) => t.id === 'c')!.status).toBe('skipped')
})
it('does not cascade to independent tasks', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b'))
q.add(task('c', { dependsOn: ['a'] }))
q.skip('a', 'rejected')
expect(q.list().find((t) => t.id === 'b')!.status).toBe('pending')
expect(q.list().find((t) => t.id === 'c')!.status).toBe('skipped')
})
it('throws when skipping a non-existent task', () => {
const q = new TaskQueue()
expect(() => q.skip('nope', 'reason')).toThrow('not found')
})
it('isComplete() treats skipped as terminal', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b'))
q.complete('a', 'done')
expect(q.isComplete()).toBe(false)
q.skip('b', 'rejected')
expect(q.isComplete()).toBe(true)
})
it('getProgress() counts skipped tasks', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b'))
q.add(task('c'))
q.complete('a', 'done')
q.skip('b', 'rejected')
const progress = q.getProgress()
expect(progress.completed).toBe(1)
expect(progress.skipped).toBe(1)
expect(progress.pending).toBe(1)
})
})
describe('TaskQueue — skipRemaining', () => {
it('marks all non-terminal tasks as skipped', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b'))
q.add(task('c', { dependsOn: ['a'] }))
q.complete('a', 'done')
q.skipRemaining('approval rejected')
expect(q.list().find((t) => t.id === 'a')!.status).toBe('completed')
expect(q.list().find((t) => t.id === 'b')!.status).toBe('skipped')
expect(q.list().find((t) => t.id === 'c')!.status).toBe('skipped')
})
it('leaves failed tasks untouched', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b'))
q.fail('a', 'error')
q.skipRemaining()
expect(q.list().find((t) => t.id === 'a')!.status).toBe('failed')
expect(q.list().find((t) => t.id === 'b')!.status).toBe('skipped')
})
it('emits task:skipped with the updated task object (not stale)', () => {
const q = new TaskQueue()
const handler = vi.fn()
q.on('task:skipped', handler)
q.add(task('a'))
q.add(task('b'))
q.skipRemaining('reason')
expect(handler).toHaveBeenCalledTimes(2)
// Every emitted task must have status 'skipped'
for (const call of handler.mock.calls) {
expect(call[0].status).toBe('skipped')
expect(call[0].result).toBe('reason')
}
})
it('fires all:complete after skipRemaining', () => {
const q = new TaskQueue()
const handler = vi.fn()
q.on('all:complete', handler)
q.add(task('a'))
q.add(task('b'))
q.complete('a', 'done')
expect(handler).not.toHaveBeenCalled()
q.skipRemaining()
expect(handler).toHaveBeenCalledTimes(1)
})
})
// ---------------------------------------------------------------------------
// Orchestrator: onApproval integration
// ---------------------------------------------------------------------------
describe('onApproval integration', () => {
function patchPool(orchestrator: OpenMultiAgent, agents: Map<string, Agent>) {
;(orchestrator as any).buildPool = () => {
const pool = new AgentPool(5)
for (const [, agent] of agents) {
pool.add(agent)
}
return pool
}
}
function setup(onApproval?: (tasks: readonly Task[], next: readonly Task[]) => Promise<boolean>) {
const agentA: AgentConfig = { name: 'agent-a', model: 'mock', systemPrompt: 'You are agent A.' }
const agentB: AgentConfig = { name: 'agent-b', model: 'mock', systemPrompt: 'You are agent B.' }
const orchestrator = new OpenMultiAgent({
defaultModel: 'mock',
...(onApproval ? { onApproval } : {}),
})
const team = orchestrator.createTeam('test', {
name: 'test',
agents: [agentA, agentB],
})
const mockAgents = new Map<string, Agent>()
mockAgents.set('agent-a', buildMockAgent(agentA, 'result from A'))
mockAgents.set('agent-b', buildMockAgent(agentB, 'result from B'))
patchPool(orchestrator, mockAgents)
return { orchestrator, team }
}
it('approve all — all tasks complete normally', async () => {
const approvalSpy = vi.fn().mockResolvedValue(true)
const { orchestrator, team } = setup(approvalSpy)
const result = await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b', dependsOn: ['task-1'] },
])
expect(result.success).toBe(true)
expect(result.agentResults.has('agent-a')).toBe(true)
expect(result.agentResults.has('agent-b')).toBe(true)
// onApproval called once (between round 1 and round 2)
expect(approvalSpy).toHaveBeenCalledTimes(1)
})
it('reject mid-pipeline — remaining tasks skipped', async () => {
const approvalSpy = vi.fn().mockResolvedValue(false)
const { orchestrator, team } = setup(approvalSpy)
const result = await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b', dependsOn: ['task-1'] },
])
expect(approvalSpy).toHaveBeenCalledTimes(1)
// Only agent-a's output present (task-2 was skipped, never ran)
expect(result.agentResults.has('agent-a')).toBe(true)
expect(result.agentResults.has('agent-b')).toBe(false)
})
it('no callback — tasks flow without interruption', async () => {
const { orchestrator, team } = setup(/* no onApproval */)
const result = await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b', dependsOn: ['task-1'] },
])
expect(result.success).toBe(true)
expect(result.agentResults.has('agent-a')).toBe(true)
expect(result.agentResults.has('agent-b')).toBe(true)
})
it('callback receives correct arguments — completedTasks array and nextTasks', async () => {
const approvalSpy = vi.fn().mockResolvedValue(true)
const { orchestrator, team } = setup(approvalSpy)
await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b', dependsOn: ['task-1'] },
])
// First arg: array of completed tasks from this round
const completedTasks = approvalSpy.mock.calls[0][0]
expect(completedTasks).toHaveLength(1)
expect(completedTasks[0].title).toBe('task-1')
expect(completedTasks[0].status).toBe('completed')
// Second arg: the next tasks about to run
const nextTasks = approvalSpy.mock.calls[0][1]
expect(nextTasks).toHaveLength(1)
expect(nextTasks[0].title).toBe('task-2')
})
it('callback throwing an error skips remaining tasks gracefully', async () => {
const approvalSpy = vi.fn().mockRejectedValue(new Error('network timeout'))
const { orchestrator, team } = setup(approvalSpy)
// Should not throw — error is caught and remaining tasks are skipped
const result = await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b', dependsOn: ['task-1'] },
])
expect(approvalSpy).toHaveBeenCalledTimes(1)
expect(result.agentResults.has('agent-a')).toBe(true)
expect(result.agentResults.has('agent-b')).toBe(false)
})
it('parallel batch — completedTasks contains all tasks from the round', async () => {
const approvalSpy = vi.fn().mockResolvedValue(true)
const agentA: AgentConfig = { name: 'agent-a', model: 'mock', systemPrompt: 'A' }
const agentB: AgentConfig = { name: 'agent-b', model: 'mock', systemPrompt: 'B' }
const agentC: AgentConfig = { name: 'agent-c', model: 'mock', systemPrompt: 'C' }
const orchestrator = new OpenMultiAgent({
defaultModel: 'mock',
onApproval: approvalSpy,
})
const team = orchestrator.createTeam('test', {
name: 'test',
agents: [agentA, agentB, agentC],
})
const mockAgents = new Map<string, Agent>()
mockAgents.set('agent-a', buildMockAgent(agentA, 'A done'))
mockAgents.set('agent-b', buildMockAgent(agentB, 'B done'))
mockAgents.set('agent-c', buildMockAgent(agentC, 'C done'))
patchPool(orchestrator, mockAgents)
// task-1 and task-2 are independent (run in parallel), task-3 depends on both
await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b' },
{ title: 'task-3', description: 'third', assignee: 'agent-c', dependsOn: ['task-1', 'task-2'] },
])
// Approval called once between the parallel batch and task-3
expect(approvalSpy).toHaveBeenCalledTimes(1)
const completedTasks = approvalSpy.mock.calls[0][0] as Task[]
// Both task-1 and task-2 completed in the same round
expect(completedTasks).toHaveLength(2)
const titles = completedTasks.map((t: Task) => t.title).sort()
expect(titles).toEqual(['task-1', 'task-2'])
})
it('single batch with no second round — callback never fires', async () => {
const approvalSpy = vi.fn().mockResolvedValue(true)
const { orchestrator, team } = setup(approvalSpy)
const result = await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b' },
])
expect(result.success).toBe(true)
// No second round → callback never called
expect(approvalSpy).not.toHaveBeenCalled()
})
it('mixed success/failure in batch — completedTasks only contains succeeded tasks', async () => {
const approvalSpy = vi.fn().mockResolvedValue(true)
const agentA: AgentConfig = { name: 'agent-a', model: 'mock', systemPrompt: 'A' }
const agentB: AgentConfig = { name: 'agent-b', model: 'mock', systemPrompt: 'B' }
const agentC: AgentConfig = { name: 'agent-c', model: 'mock', systemPrompt: 'C' }
const orchestrator = new OpenMultiAgent({
defaultModel: 'mock',
onApproval: approvalSpy,
})
const team = orchestrator.createTeam('test', {
name: 'test',
agents: [agentA, agentB, agentC],
})
const mockAgents = new Map<string, Agent>()
mockAgents.set('agent-a', buildMockAgent(agentA, 'A done'))
mockAgents.set('agent-b', buildMockAgent(agentB, 'B done'))
mockAgents.set('agent-c', buildMockAgent(agentC, 'C done'))
// Patch buildPool so that pool.run for agent-b returns a failure result
;(orchestrator as any).buildPool = () => {
const pool = new AgentPool(5)
for (const [, agent] of mockAgents) pool.add(agent)
const originalRun = pool.run.bind(pool)
pool.run = async (agentName: string, prompt: string, opts?: any) => {
if (agentName === 'agent-b') {
return {
success: false,
output: 'simulated failure',
messages: [],
tokenUsage: { input_tokens: 0, output_tokens: 0 },
toolCalls: [],
}
}
return originalRun(agentName, prompt, opts)
}
return pool
}
// task-1 (success) and task-2 (fail) run in parallel, task-3 depends on task-1
await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b' },
{ title: 'task-3', description: 'third', assignee: 'agent-c', dependsOn: ['task-1'] },
])
expect(approvalSpy).toHaveBeenCalledTimes(1)
const completedTasks = approvalSpy.mock.calls[0][0] as Task[]
// Only task-1 succeeded — task-2 failed, so it should not appear
expect(completedTasks).toHaveLength(1)
expect(completedTasks[0].title).toBe('task-1')
expect(completedTasks[0].status).toBe('completed')
})
it('onProgress receives task_skipped events when approval is rejected', async () => {
const progressSpy = vi.fn()
const agentA: AgentConfig = { name: 'agent-a', model: 'mock', systemPrompt: 'A' }
const agentB: AgentConfig = { name: 'agent-b', model: 'mock', systemPrompt: 'B' }
const orchestrator = new OpenMultiAgent({
defaultModel: 'mock',
onApproval: vi.fn().mockResolvedValue(false),
onProgress: progressSpy,
})
const team = orchestrator.createTeam('test', {
name: 'test',
agents: [agentA, agentB],
})
const mockAgents = new Map<string, Agent>()
mockAgents.set('agent-a', buildMockAgent(agentA, 'A done'))
mockAgents.set('agent-b', buildMockAgent(agentB, 'B done'))
;(orchestrator as any).buildPool = () => {
const pool = new AgentPool(5)
for (const [, agent] of mockAgents) pool.add(agent)
return pool
}
await orchestrator.runTasks(team, [
{ title: 'task-1', description: 'first', assignee: 'agent-a' },
{ title: 'task-2', description: 'second', assignee: 'agent-b', dependsOn: ['task-1'] },
])
const skippedEvents = progressSpy.mock.calls
.map((c: any) => c[0])
.filter((e: any) => e.type === 'task_skipped')
expect(skippedEvents).toHaveLength(1)
expect(skippedEvents[0].data.status).toBe('skipped')
})
})

@@ -1,383 +0,0 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { chatOpts, collectEvents, textMsg, toolDef } from './helpers/llm-fixtures.js'
import type { LLMResponse, ToolUseBlock } from '../src/types.js'
// ---------------------------------------------------------------------------
// Mock AzureOpenAI constructor (must be hoisted for Vitest)
// ---------------------------------------------------------------------------
const AzureOpenAIMock = vi.hoisted(() => vi.fn())
const createCompletionMock = vi.hoisted(() => vi.fn())
vi.mock('openai', () => ({
AzureOpenAI: AzureOpenAIMock,
}))
import { AzureOpenAIAdapter } from '../src/llm/azure-openai.js'
import { createAdapter } from '../src/llm/adapter.js'
/** Builds a non-streaming chat completion payload; `overrides` replaces top-level fields. */
function makeCompletion(overrides: Record<string, unknown> = {}) {
  return {
    id: 'chatcmpl-123',
    model: 'gpt-4o',
    choices: [{
      index: 0,
      message: {
        role: 'assistant',
        content: 'Hello',
        tool_calls: undefined,
      },
      finish_reason: 'stop',
    }],
    usage: { prompt_tokens: 10, completion_tokens: 5 },
    ...overrides,
  }
}

/** Wraps an array of chunks in an async generator to imitate a streamed response. */
async function* makeChunks(chunks: Array<Record<string, unknown>>) {
  for (const chunk of chunks) yield chunk
}

/** Builds a streaming chunk carrying a text delta, optionally with a finish reason and usage. */
function textChunk(text: string, finish_reason: string | null = null, usage: Record<string, number> | null = null) {
  return {
    id: 'chatcmpl-123',
    model: 'gpt-4o',
    choices: [{
      index: 0,
      delta: { content: text },
      finish_reason,
    }],
    usage,
  }
}

/**
 * Builds a streaming chunk carrying one tool-call delta.
 * `id` and `name` may be undefined for continuation fragments of the same call.
 */
function toolCallChunk(
  index: number,
  id: string | undefined,
  name: string | undefined,
  args: string,
  finish_reason: string | null = null,
) {
  return {
    id: 'chatcmpl-123',
    model: 'gpt-4o',
    choices: [{
      index: 0,
      delta: {
        tool_calls: [{
          index,
          id,
          function: {
            name,
            arguments: args,
          },
        }],
      },
      finish_reason,
    }],
    usage: null,
  }
}
// ---------------------------------------------------------------------------
// AzureOpenAIAdapter tests
// ---------------------------------------------------------------------------
describe('AzureOpenAIAdapter', () => {
beforeEach(() => {
AzureOpenAIMock.mockClear()
createCompletionMock.mockReset()
AzureOpenAIMock.mockImplementation(() => ({
chat: {
completions: {
create: createCompletionMock,
},
},
}))
})
it('has name "azure-openai"', () => {
const adapter = new AzureOpenAIAdapter()
expect(adapter.name).toBe('azure-openai')
})
it('uses AZURE_OPENAI_API_KEY by default', () => {
const originalKey = process.env['AZURE_OPENAI_API_KEY']
const originalEndpoint = process.env['AZURE_OPENAI_ENDPOINT']
process.env['AZURE_OPENAI_API_KEY'] = 'azure-test-key-123'
process.env['AZURE_OPENAI_ENDPOINT'] = 'https://test.openai.azure.com'
try {
new AzureOpenAIAdapter()
expect(AzureOpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'azure-test-key-123',
endpoint: 'https://test.openai.azure.com',
})
)
} finally {
if (originalKey === undefined) {
delete process.env['AZURE_OPENAI_API_KEY']
} else {
process.env['AZURE_OPENAI_API_KEY'] = originalKey
}
if (originalEndpoint === undefined) {
delete process.env['AZURE_OPENAI_ENDPOINT']
} else {
process.env['AZURE_OPENAI_ENDPOINT'] = originalEndpoint
}
}
})
it('uses AZURE_OPENAI_ENDPOINT by default', () => {
const originalEndpoint = process.env['AZURE_OPENAI_ENDPOINT']
process.env['AZURE_OPENAI_ENDPOINT'] = 'https://my-resource.openai.azure.com'
try {
new AzureOpenAIAdapter('some-key')
expect(AzureOpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'some-key',
endpoint: 'https://my-resource.openai.azure.com',
})
)
} finally {
if (originalEndpoint === undefined) {
delete process.env['AZURE_OPENAI_ENDPOINT']
} else {
process.env['AZURE_OPENAI_ENDPOINT'] = originalEndpoint
}
}
})
it('uses default API version when not set', () => {
new AzureOpenAIAdapter('some-key', 'https://test.openai.azure.com')
expect(AzureOpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'some-key',
endpoint: 'https://test.openai.azure.com',
apiVersion: '2024-10-21',
})
)
})
it('uses AZURE_OPENAI_API_VERSION env var when set', () => {
const originalVersion = process.env['AZURE_OPENAI_API_VERSION']
process.env['AZURE_OPENAI_API_VERSION'] = '2024-03-01-preview'
try {
new AzureOpenAIAdapter('some-key', 'https://test.openai.azure.com')
expect(AzureOpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'some-key',
endpoint: 'https://test.openai.azure.com',
apiVersion: '2024-03-01-preview',
})
)
} finally {
if (originalVersion === undefined) {
delete process.env['AZURE_OPENAI_API_VERSION']
} else {
process.env['AZURE_OPENAI_API_VERSION'] = originalVersion
}
}
})
it('allows overriding apiKey, endpoint, and apiVersion', () => {
new AzureOpenAIAdapter(
'custom-key',
'https://custom.openai.azure.com',
'2024-04-01-preview'
)
expect(AzureOpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'custom-key',
endpoint: 'https://custom.openai.azure.com',
apiVersion: '2024-04-01-preview',
})
)
})
it('createAdapter("azure-openai") returns AzureOpenAIAdapter instance', async () => {
const adapter = await createAdapter('azure-openai')
expect(adapter).toBeInstanceOf(AzureOpenAIAdapter)
})
it('chat() calls SDK with expected parameters', async () => {
createCompletionMock.mockResolvedValue(makeCompletion())
const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
const tool = toolDef('search', 'Search')
const result = await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({
model: 'my-deployment',
tools: [tool],
temperature: 0.3,
}),
)
const callArgs = createCompletionMock.mock.calls[0][0]
expect(callArgs).toMatchObject({
model: 'my-deployment',
stream: false,
max_tokens: 1024,
temperature: 0.3,
})
expect(callArgs.tools[0]).toEqual({
type: 'function',
function: {
name: 'search',
description: 'Search',
parameters: tool.inputSchema,
},
})
expect(result).toEqual({
id: 'chatcmpl-123',
content: [{ type: 'text', text: 'Hello' }],
model: 'gpt-4o',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 5 },
})
})
it('chat() maps native tool_calls to tool_use blocks', async () => {
createCompletionMock.mockResolvedValue(makeCompletion({
choices: [{
index: 0,
message: {
role: 'assistant',
content: null,
tool_calls: [{
id: 'call_1',
type: 'function',
function: { name: 'search', arguments: '{"q":"test"}' },
}],
},
finish_reason: 'tool_calls',
}],
}))
const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
const result = await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({ model: 'my-deployment', tools: [toolDef('search')] }),
)
expect(result.content[0]).toEqual({
type: 'tool_use',
id: 'call_1',
name: 'search',
input: { q: 'test' },
})
expect(result.stop_reason).toBe('tool_use')
})
it('chat() uses AZURE_OPENAI_DEPLOYMENT when model is blank', async () => {
const originalDeployment = process.env['AZURE_OPENAI_DEPLOYMENT']
process.env['AZURE_OPENAI_DEPLOYMENT'] = 'env-deployment'
createCompletionMock.mockResolvedValue({
id: 'cmpl-1',
model: 'gpt-4',
choices: [
{
finish_reason: 'stop',
message: { content: 'ok' },
},
],
usage: { prompt_tokens: 1, completion_tokens: 1 },
})
try {
const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
await adapter.chat([], { model: ' ' })
expect(createCompletionMock).toHaveBeenCalledWith(
expect.objectContaining({ model: 'env-deployment', stream: false }),
expect.any(Object),
)
} finally {
if (originalDeployment === undefined) {
delete process.env['AZURE_OPENAI_DEPLOYMENT']
} else {
process.env['AZURE_OPENAI_DEPLOYMENT'] = originalDeployment
}
}
})
  it('chat() throws when both model and AZURE_OPENAI_DEPLOYMENT are blank', async () => {
    const originalDeployment = process.env['AZURE_OPENAI_DEPLOYMENT']
    delete process.env['AZURE_OPENAI_DEPLOYMENT']
    try {
      // Construct inside the try block so the env var is restored even if the constructor throws.
      const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
      await expect(adapter.chat([], { model: ' ' })).rejects.toThrow(
        'Azure OpenAI deployment is required',
      )
      expect(createCompletionMock).not.toHaveBeenCalled()
    } finally {
      if (originalDeployment !== undefined) {
        process.env['AZURE_OPENAI_DEPLOYMENT'] = originalDeployment
      }
    }
  })
it('stream() sends stream options and emits done usage', async () => {
createCompletionMock.mockResolvedValue(makeChunks([
textChunk('Hi', 'stop'),
{ id: 'chatcmpl-123', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 10, completion_tokens: 2 } },
]))
const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
const events = await collectEvents(
adapter.stream([textMsg('user', 'Hi')], chatOpts({ model: 'my-deployment' })),
)
const callArgs = createCompletionMock.mock.calls[0][0]
expect(callArgs.stream).toBe(true)
expect(callArgs.stream_options).toEqual({ include_usage: true })
const done = events.find(e => e.type === 'done')
const response = done?.data as LLMResponse
expect(response.usage).toEqual({ input_tokens: 10, output_tokens: 2 })
expect(response.model).toBe('gpt-4o')
})
it('stream() accumulates tool call deltas and emits tool_use', async () => {
createCompletionMock.mockResolvedValue(makeChunks([
toolCallChunk(0, 'call_1', 'search', '{"q":'),
toolCallChunk(0, undefined, undefined, '"test"}', 'tool_calls'),
{ id: 'chatcmpl-123', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 10, completion_tokens: 5 } },
]))
const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
const events = await collectEvents(
adapter.stream([textMsg('user', 'Hi')], chatOpts({ model: 'my-deployment' })),
)
const toolEvents = events.filter(e => e.type === 'tool_use')
expect(toolEvents).toHaveLength(1)
expect(toolEvents[0]?.data as ToolUseBlock).toEqual({
type: 'tool_use',
id: 'call_1',
name: 'search',
input: { q: 'test' },
})
})
it('stream() yields error event when iterator throws', async () => {
createCompletionMock.mockResolvedValue(
(async function* () {
throw new Error('Stream exploded')
})(),
)
const adapter = new AzureOpenAIAdapter('k', 'https://test.openai.azure.com')
const events = await collectEvents(
adapter.stream([textMsg('user', 'Hi')], chatOpts({ model: 'my-deployment' })),
)
const errorEvents = events.filter(e => e.type === 'error')
expect(errorEvents).toHaveLength(1)
expect((errorEvents[0]?.data as Error).message).toBe('Stream exploded')
})
})

@@ -1,741 +0,0 @@
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'
import { mkdtemp, rm, writeFile, readFile } from 'fs/promises'
import { join } from 'path'
import { tmpdir } from 'os'
import { fileReadTool } from '../src/tool/built-in/file-read.js'
import { fileWriteTool } from '../src/tool/built-in/file-write.js'
import { fileEditTool } from '../src/tool/built-in/file-edit.js'
import { bashTool } from '../src/tool/built-in/bash.js'
import { globTool } from '../src/tool/built-in/glob.js'
import { grepTool } from '../src/tool/built-in/grep.js'
import {
registerBuiltInTools,
BUILT_IN_TOOLS,
delegateToAgentTool,
} from '../src/tool/built-in/index.js'
import { ToolRegistry } from '../src/tool/framework.js'
import { InMemoryStore } from '../src/memory/store.js'
import type { AgentRunResult, ToolUseContext } from '../src/types.js'
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
const defaultContext: ToolUseContext = {
agent: { name: 'test-agent', role: 'tester', model: 'test' },
}
// Each test gets a fresh temporary directory, which is removed again afterwards.
let tmpDir: string

beforeEach(async () => {
  tmpDir = await mkdtemp(join(tmpdir(), 'oma-test-'))
})

afterEach(async () => {
  await rm(tmpDir, { recursive: true, force: true })
})
// ===========================================================================
// registerBuiltInTools
// ===========================================================================
describe('registerBuiltInTools', () => {
it('registers all 6 built-in tools', () => {
const registry = new ToolRegistry()
registerBuiltInTools(registry)
expect(registry.get('bash')).toBeDefined()
expect(registry.get('file_read')).toBeDefined()
expect(registry.get('file_write')).toBeDefined()
expect(registry.get('file_edit')).toBeDefined()
expect(registry.get('grep')).toBeDefined()
expect(registry.get('glob')).toBeDefined()
expect(registry.get('delegate_to_agent')).toBeUndefined()
})
it('registers delegate_to_agent when includeDelegateTool is set', () => {
const registry = new ToolRegistry()
registerBuiltInTools(registry, { includeDelegateTool: true })
expect(registry.get('delegate_to_agent')).toBeDefined()
})
it('BUILT_IN_TOOLS has correct length', () => {
expect(BUILT_IN_TOOLS).toHaveLength(6)
})
})
// ===========================================================================
// file_read
// ===========================================================================
describe('file_read', () => {
it('reads a file with line numbers', async () => {
const filePath = join(tmpDir, 'test.txt')
await writeFile(filePath, 'line one\nline two\nline three\n')
const result = await fileReadTool.execute({ path: filePath }, defaultContext)
expect(result.isError).toBe(false)
expect(result.data).toContain('1\tline one')
expect(result.data).toContain('2\tline two')
expect(result.data).toContain('3\tline three')
})
it('reads a slice with offset and limit', async () => {
const filePath = join(tmpDir, 'test.txt')
await writeFile(filePath, 'a\nb\nc\nd\ne\n')
const result = await fileReadTool.execute(
{ path: filePath, offset: 2, limit: 2 },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('2\tb')
expect(result.data).toContain('3\tc')
expect(result.data).not.toContain('1\ta')
})
it('errors on non-existent file', async () => {
const result = await fileReadTool.execute(
{ path: join(tmpDir, 'nope.txt') },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('Could not read file')
})
it('errors when offset is beyond end of file', async () => {
const filePath = join(tmpDir, 'short.txt')
await writeFile(filePath, 'one line\n')
const result = await fileReadTool.execute(
{ path: filePath, offset: 100 },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('beyond the end')
})
it('shows truncation note when not reading entire file', async () => {
const filePath = join(tmpDir, 'multi.txt')
await writeFile(filePath, 'a\nb\nc\nd\ne\n')
const result = await fileReadTool.execute(
{ path: filePath, limit: 2 },
defaultContext,
)
expect(result.data).toContain('showing lines')
})
})
// ===========================================================================
// file_write
// ===========================================================================
describe('file_write', () => {
it('creates a new file', async () => {
const filePath = join(tmpDir, 'new-file.txt')
const result = await fileWriteTool.execute(
{ path: filePath, content: 'hello world' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('Created')
const content = await readFile(filePath, 'utf8')
expect(content).toBe('hello world')
})
it('overwrites an existing file', async () => {
const filePath = join(tmpDir, 'existing.txt')
await writeFile(filePath, 'old content')
const result = await fileWriteTool.execute(
{ path: filePath, content: 'new content' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('Updated')
const content = await readFile(filePath, 'utf8')
expect(content).toBe('new content')
})
it('creates parent directories', async () => {
const filePath = join(tmpDir, 'deep', 'nested', 'file.txt')
const result = await fileWriteTool.execute(
{ path: filePath, content: 'deep file' },
defaultContext,
)
expect(result.isError).toBe(false)
const content = await readFile(filePath, 'utf8')
expect(content).toBe('deep file')
})
  it('reports line and byte counts', async () => {
    const filePath = join(tmpDir, 'counted.txt')
    const result = await fileWriteTool.execute(
      { path: filePath, content: 'line1\nline2\nline3' },
      defaultContext,
    )
    // Only the line count is asserted here; the byte count is not checked.
    expect(result.data).toContain('3 lines')
  })
})
// ===========================================================================
// file_edit
// ===========================================================================
describe('file_edit', () => {
it('replaces a unique string', async () => {
const filePath = join(tmpDir, 'edit.txt')
await writeFile(filePath, 'hello world\ngoodbye world\n')
const result = await fileEditTool.execute(
{ path: filePath, old_string: 'hello', new_string: 'hi' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('Replaced 1 occurrence')
const content = await readFile(filePath, 'utf8')
expect(content).toContain('hi world')
expect(content).toContain('goodbye world')
})
it('errors when old_string not found', async () => {
const filePath = join(tmpDir, 'edit.txt')
await writeFile(filePath, 'hello world\n')
const result = await fileEditTool.execute(
{ path: filePath, old_string: 'nonexistent', new_string: 'x' },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('not found')
})
it('errors on ambiguous match without replace_all', async () => {
const filePath = join(tmpDir, 'edit.txt')
await writeFile(filePath, 'foo bar foo\n')
const result = await fileEditTool.execute(
{ path: filePath, old_string: 'foo', new_string: 'baz' },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('2 times')
})
it('replaces all when replace_all is true', async () => {
const filePath = join(tmpDir, 'edit.txt')
await writeFile(filePath, 'foo bar foo\n')
const result = await fileEditTool.execute(
{ path: filePath, old_string: 'foo', new_string: 'baz', replace_all: true },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('Replaced 2 occurrences')
const content = await readFile(filePath, 'utf8')
expect(content).toBe('baz bar baz\n')
})
it('errors on non-existent file', async () => {
const result = await fileEditTool.execute(
{ path: join(tmpDir, 'nope.txt'), old_string: 'x', new_string: 'y' },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('Could not read')
})
})
// ===========================================================================
// bash
// ===========================================================================
describe('bash', () => {
it('executes a simple command', async () => {
const result = await bashTool.execute(
{ command: 'echo "hello bash"' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('hello bash')
})
  it('captures stderr on failed command', async () => {
    // `2>&1` folds stderr into stdout; this test only asserts that the failure is flagged.
    const result = await bashTool.execute(
      { command: 'ls /nonexistent/path/xyz 2>&1' },
      defaultContext,
    )
    expect(result.isError).toBe(true)
  })
it('supports custom working directory', async () => {
const result = await bashTool.execute(
{ command: 'pwd', cwd: tmpDir },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain(tmpDir)
})
it('returns exit code for failing commands', async () => {
const result = await bashTool.execute(
{ command: 'exit 42' },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('42')
})
it('handles commands with no output', async () => {
const result = await bashTool.execute(
{ command: 'true' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('command completed with no output')
})
})
// ===========================================================================
// glob
// ===========================================================================
describe('glob', () => {
it('lists files matching a pattern without reading contents', async () => {
await writeFile(join(tmpDir, 'a.ts'), 'SECRET_CONTENT_SHOULD_NOT_APPEAR')
await writeFile(join(tmpDir, 'b.md'), 'also secret')
const result = await globTool.execute(
{ path: tmpDir, pattern: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('.ts')
expect(result.data).not.toContain('SECRET')
expect(result.data).not.toContain('b.md')
})
it('lists all files when pattern is omitted', async () => {
await writeFile(join(tmpDir, 'x.txt'), 'x')
await writeFile(join(tmpDir, 'y.txt'), 'y')
const result = await globTool.execute({ path: tmpDir }, defaultContext)
expect(result.isError).toBe(false)
expect(result.data).toContain('x.txt')
expect(result.data).toContain('y.txt')
})
it('lists a single file when path is a file', async () => {
const filePath = join(tmpDir, 'only.ts')
await writeFile(filePath, 'body')
const result = await globTool.execute({ path: filePath }, defaultContext)
expect(result.isError).toBe(false)
expect(result.data).toContain('only.ts')
})
it('returns no match when single file does not match pattern', async () => {
const filePath = join(tmpDir, 'readme.md')
await writeFile(filePath, '# doc')
const result = await globTool.execute(
{ path: filePath, pattern: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('No files matched')
})
it('recurses into subdirectories', async () => {
const sub = join(tmpDir, 'nested')
const { mkdir } = await import('fs/promises')
await mkdir(sub, { recursive: true })
await writeFile(join(sub, 'deep.ts'), '')
const result = await globTool.execute(
{ path: tmpDir, pattern: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('deep.ts')
})
it('errors on inaccessible path', async () => {
const result = await globTool.execute(
{ path: '/nonexistent/path/xyz' },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('Cannot access path')
})
it('notes truncation when maxFiles is exceeded', async () => {
for (let i = 0; i < 5; i++) {
await writeFile(join(tmpDir, `f${i}.txt`), '')
}
const result = await globTool.execute(
{ path: tmpDir, pattern: '*.txt', maxFiles: 3 },
defaultContext,
)
expect(result.isError).toBe(false)
const lines = (result.data as string).split('\n').filter((l) => l.endsWith('.txt'))
expect(lines).toHaveLength(3)
expect(result.data).toContain('capped at 3')
})
})
// ===========================================================================
// grep (Node.js fallback — tests do not depend on ripgrep availability)
// ===========================================================================
describe('grep', () => {
it('finds matching lines in a file', async () => {
const filePath = join(tmpDir, 'search.txt')
await writeFile(filePath, 'apple\nbanana\napricot\ncherry\n')
const result = await grepTool.execute(
{ pattern: 'ap', path: filePath },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('apple')
expect(result.data).toContain('apricot')
expect(result.data).not.toContain('cherry')
})
it('returns "No matches found" when nothing matches', async () => {
const filePath = join(tmpDir, 'search.txt')
await writeFile(filePath, 'hello world\n')
const result = await grepTool.execute(
{ pattern: 'zzz', path: filePath },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('No matches found')
})
it('errors on invalid regex', async () => {
const result = await grepTool.execute(
{ pattern: '[invalid', path: tmpDir },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('Invalid regular expression')
})
it('searches recursively in a directory', async () => {
const subDir = join(tmpDir, 'sub')
await writeFile(join(tmpDir, 'a.txt'), 'findme here\n')
// Create subdir and file
const { mkdir } = await import('fs/promises')
await mkdir(subDir, { recursive: true })
await writeFile(join(subDir, 'b.txt'), 'findme there\n')
const result = await grepTool.execute(
{ pattern: 'findme', path: tmpDir },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('findme here')
expect(result.data).toContain('findme there')
})
it('respects glob filter', async () => {
await writeFile(join(tmpDir, 'code.ts'), 'const x = 1\n')
await writeFile(join(tmpDir, 'readme.md'), 'const y = 2\n')
const result = await grepTool.execute(
{ pattern: 'const', path: tmpDir, glob: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('code.ts')
expect(result.data).not.toContain('readme.md')
})
it('errors on inaccessible path', async () => {
const result = await grepTool.execute(
{ pattern: 'test', path: '/nonexistent/path/xyz' },
defaultContext,
)
expect(result.isError).toBe(true)
// May hit ripgrep path or Node fallback — both report an error
expect(result.data.toLowerCase()).toContain('no such file')
})
})
// ===========================================================================
// delegate_to_agent
// ===========================================================================
const DELEGATE_OK: AgentRunResult = {
success: true,
output: 'research done',
messages: [],
tokenUsage: { input_tokens: 1, output_tokens: 2 },
toolCalls: [],
}
describe('delegate_to_agent', () => {
it('returns delegated agent output on success', async () => {
const runDelegatedAgent = vi.fn().mockResolvedValue(DELEGATE_OK)
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
delegationDepth: 0,
maxDelegationDepth: 3,
delegationPool: { availableRunSlots: 2 },
runDelegatedAgent,
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'bob', prompt: 'Summarize X.' },
ctx,
)
expect(result.isError).toBe(false)
expect(result.data).toBe('research done')
expect(runDelegatedAgent).toHaveBeenCalledWith('bob', 'Summarize X.')
})
it('errors when delegation would form a cycle (A -> B -> A)', async () => {
const ctx: ToolUseContext = {
agent: { name: 'bob', role: 'worker', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
delegationDepth: 1,
maxDelegationDepth: 5,
delegationChain: ['alice', 'bob'],
delegationPool: { availableRunSlots: 2 },
runDelegatedAgent: vi.fn().mockResolvedValue(DELEGATE_OK),
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'alice', prompt: 'loop back' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toMatch(/Delegation cycle detected: alice -> bob -> alice/)
expect(ctx.team!.runDelegatedAgent).not.toHaveBeenCalled()
})
it('surfaces delegated run tokenUsage via ToolResult.metadata', async () => {
const runDelegatedAgent = vi.fn().mockResolvedValue({
success: true,
output: 'answer',
messages: [],
tokenUsage: { input_tokens: 123, output_tokens: 45 },
toolCalls: [],
} satisfies AgentRunResult)
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
delegationPool: { availableRunSlots: 2 },
runDelegatedAgent,
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'bob', prompt: 'Hi' },
ctx,
)
expect(result.metadata?.tokenUsage).toEqual({ input_tokens: 123, output_tokens: 45 })
})
it('errors when delegation is not configured', async () => {
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: { name: 't', agents: ['alice', 'bob'] },
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'bob', prompt: 'Hi' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toMatch(/only available during orchestrated team runs/i)
})
it('errors for unknown target agent', async () => {
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
runDelegatedAgent: vi.fn(),
delegationPool: { availableRunSlots: 1 },
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'charlie', prompt: 'Hi' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toMatch(/Unknown agent/)
})
it('errors on self-delegation', async () => {
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
runDelegatedAgent: vi.fn(),
delegationPool: { availableRunSlots: 1 },
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'alice', prompt: 'Hi' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toMatch(/yourself/)
})
it('errors when delegation depth limit is reached', async () => {
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
delegationDepth: 3,
maxDelegationDepth: 3,
runDelegatedAgent: vi.fn(),
delegationPool: { availableRunSlots: 1 },
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'bob', prompt: 'Hi' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toMatch(/Maximum delegation depth/)
})
it('errors fast when pool has no free slots without calling runDelegatedAgent', async () => {
const runDelegatedAgent = vi.fn()
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
delegationPool: { availableRunSlots: 0 },
runDelegatedAgent,
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'bob', prompt: 'Hi' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toMatch(/no free concurrency slot/i)
expect(runDelegatedAgent).not.toHaveBeenCalled()
})
it('writes unique SharedMemory audit keys for repeated delegations', async () => {
const store = new InMemoryStore()
const runDelegatedAgent = vi.fn().mockResolvedValue(DELEGATE_OK)
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
sharedMemory: store,
delegationPool: { availableRunSlots: 2 },
runDelegatedAgent,
},
}
await delegateToAgentTool.execute({ target_agent: 'bob', prompt: 'a' }, ctx)
await delegateToAgentTool.execute({ target_agent: 'bob', prompt: 'b' }, ctx)
const keys = (await store.list()).map((e) => e.key)
const delegationKeys = keys.filter((k) => k.includes('delegation:bob:'))
expect(delegationKeys).toHaveLength(2)
expect(delegationKeys[0]).not.toBe(delegationKeys[1])
})
it('returns isError when delegated run reports success false', async () => {
const runDelegatedAgent = vi.fn().mockResolvedValue({
success: false,
output: 'delegated agent failed',
messages: [],
tokenUsage: { input_tokens: 0, output_tokens: 0 },
toolCalls: [],
} satisfies AgentRunResult)
const ctx: ToolUseContext = {
agent: { name: 'alice', role: 'lead', model: 'test' },
team: {
name: 't',
agents: ['alice', 'bob'],
delegationPool: { availableRunSlots: 1 },
runDelegatedAgent,
},
}
const result = await delegateToAgentTool.execute(
{ target_agent: 'bob', prompt: 'Hi' },
ctx,
)
expect(result.isError).toBe(true)
expect(result.data).toBe('delegated agent failed')
})
})
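
The two fail-fast cases above (depth limit reached, no free pool slot) can be pictured as a pre-flight guard that runs before `runDelegatedAgent` is ever invoked. The following is a minimal sketch under stated assumptions, not the project's `delegateToAgentTool`: `checkDelegationSketch`, `TeamStateSketch`, the default depth of 3, and the exact error strings are all hypothetical, chosen only so they would satisfy the patterns the tests match.

```typescript
// Hypothetical sketch of pre-flight delegation checks; names and messages are assumptions.
interface TeamStateSketch {
  delegationDepth?: number
  maxDelegationDepth?: number
  delegationPool?: { availableRunSlots: number }
}

interface GuardResult {
  isError: boolean
  data: string
}

// Returns an error result when delegation must be refused, or null when it may proceed.
function checkDelegationSketch(team: TeamStateSketch): GuardResult | null {
  const depth = team.delegationDepth ?? 0
  const maxDepth = team.maxDelegationDepth ?? 3 // assumed default, not taken from the source
  if (depth >= maxDepth) {
    return { isError: true, data: `Maximum delegation depth (${maxDepth}) reached` }
  }
  // Refuse immediately instead of queueing when the pool has no capacity.
  if (team.delegationPool !== undefined && team.delegationPool.availableRunSlots <= 0) {
    return { isError: true, data: 'No free concurrency slot in the delegation pool' }
  }
  return null
}
```

Returning early from a guard like this is what lets the second test assert that `runDelegatedAgent` was never called: the refusal happens before any run is scheduled.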


@ -1,69 +0,0 @@
import { describe, expect, it } from 'vitest'
import {
EXIT,
parseArgs,
serializeAgentResult,
serializeTeamRunResult,
} from '../src/cli/oma.js'
import type { AgentRunResult, TeamRunResult } from '../src/types.js'
describe('parseArgs', () => {
it('parses flags, key=value, and key value', () => {
const a = parseArgs(['node', 'oma', 'run', '--goal', 'hello', '--team=x.json', '--pretty'])
expect(a._[0]).toBe('run')
expect(a.kv.get('goal')).toBe('hello')
expect(a.kv.get('team')).toBe('x.json')
expect(a.flags.has('pretty')).toBe(true)
})
})
describe('serializeTeamRunResult', () => {
it('maps agentResults to a plain object', () => {
const ar: AgentRunResult = {
success: true,
output: 'ok',
messages: [],
tokenUsage: { input_tokens: 1, output_tokens: 2 },
toolCalls: [],
}
const tr: TeamRunResult = {
success: true,
agentResults: new Map([['alice', ar]]),
totalTokenUsage: { input_tokens: 1, output_tokens: 2 },
}
const json = serializeTeamRunResult(tr, { pretty: false, includeMessages: false })
expect(json.success).toBe(true)
expect((json.agentResults as Record<string, unknown>)['alice']).toMatchObject({
success: true,
output: 'ok',
})
expect((json.agentResults as Record<string, unknown>)['alice']).not.toHaveProperty('messages')
})
it('includes messages when requested', () => {
const ar: AgentRunResult = {
success: true,
output: 'x',
messages: [{ role: 'user', content: [{ type: 'text', text: 'hi' }] }],
tokenUsage: { input_tokens: 0, output_tokens: 0 },
toolCalls: [],
}
const tr: TeamRunResult = {
success: true,
agentResults: new Map([['bob', ar]]),
totalTokenUsage: { input_tokens: 0, output_tokens: 0 },
}
const json = serializeTeamRunResult(tr, { pretty: false, includeMessages: true })
expect(serializeAgentResult(ar, true).messages).toHaveLength(1)
expect((json.agentResults as Record<string, unknown>)['bob']).toHaveProperty('messages')
})
})
describe('EXIT', () => {
it('uses stable numeric codes', () => {
expect(EXIT.SUCCESS).toBe(0)
expect(EXIT.RUN_FAILED).toBe(1)
expect(EXIT.USAGE).toBe(2)
expect(EXIT.INTERNAL).toBe(3)
})
})
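
The `parseArgs` test above expects three shapes from one argv: positionals in `_`, `--key value` and `--key=value` pairs in `kv`, and bare switches in `flags`. A parser with that contract might look roughly like the sketch below. This is an assumption-laden illustration, not the code in `src/cli/oma.js`: `parseArgsSketch` and `ParsedArgsSketch` are hypothetical names, and the rule that a `--key` followed by a non-flag token consumes it as a value is a guess at the behavior.

```typescript
// Hypothetical sketch of an argv parser matching the shape asserted in the test above.
interface ParsedArgsSketch {
  _: string[] // positional arguments
  kv: Map<string, string> // --key value and --key=value pairs
  flags: Set<string> // bare --flag switches
}

function parseArgsSketch(argv: string[]): ParsedArgsSketch {
  const out: ParsedArgsSketch = { _: [], kv: new Map(), flags: new Set() }
  const args = argv.slice(2) // skip the runtime path and the script name
  for (let i = 0; i < args.length; i++) {
    const arg = args[i]!
    if (!arg.startsWith('--')) {
      out._.push(arg)
      continue
    }
    const body = arg.slice(2)
    const eq = body.indexOf('=')
    if (eq !== -1) {
      // --key=value form
      out.kv.set(body.slice(0, eq), body.slice(eq + 1))
      continue
    }
    const next = args[i + 1]
    if (next !== undefined && !next.startsWith('--')) {
      // --key value form: consume the following token as the value
      out.kv.set(body, next)
      i++
    } else {
      out.flags.add(body)
    }
  }
  return out
}
```

One consequence of this design is that a bare switch followed by a positional (for example `--pretty run`) would be read as a key/value pair, so switches are safest at the end of the command line, as in the test input.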


@ -1,626 +0,0 @@
import { describe, it, expect, vi } from 'vitest'
import { z } from 'zod'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import type { LLMAdapter, LLMChatOptions, LLMMessage, LLMResponse, TraceEvent } from '../src/types.js'
function textResponse(text: string): LLMResponse {
return {
id: `resp-${Math.random().toString(36).slice(2)}`,
content: [{ type: 'text', text }],
model: 'mock-model',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 20 },
}
}
function toolUseResponse(toolName: string, input: Record<string, unknown>): LLMResponse {
return {
id: `resp-${Math.random().toString(36).slice(2)}`,
content: [{
type: 'tool_use',
id: `tu-${Math.random().toString(36).slice(2)}`,
name: toolName,
input,
}],
model: 'mock-model',
stop_reason: 'tool_use',
usage: { input_tokens: 15, output_tokens: 25 },
}
}
function buildRegistryAndExecutor(): { registry: ToolRegistry; executor: ToolExecutor } {
const registry = new ToolRegistry()
registry.register(
defineTool({
name: 'echo',
description: 'Echo input',
inputSchema: z.object({ message: z.string() }),
async execute({ message }) {
return { data: message }
},
}),
)
return { registry, executor: new ToolExecutor(registry) }
}
describe('AgentRunner contextStrategy', () => {
it('keeps baseline behavior when contextStrategy is not set', async () => {
const calls: LLMMessage[][] = []
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return calls.length === 1
? toolUseResponse('echo', { message: 'hello' })
: textResponse('done')
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 4,
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
expect(calls).toHaveLength(2)
expect(calls[0]).toHaveLength(1)
expect(calls[1]!.length).toBeGreaterThan(calls[0]!.length)
})
it('sliding-window truncates old turns and preserves the first user message', async () => {
const calls: LLMMessage[][] = []
const responses = [
toolUseResponse('echo', { message: 't1' }),
toolUseResponse('echo', { message: 't2' }),
toolUseResponse('echo', { message: 't3' }),
textResponse('done'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: { type: 'sliding-window', maxTurns: 1 },
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'original prompt' }] }])
const laterCall = calls[calls.length - 1]!
const firstUserText = laterCall[0]!.content[0]
expect(firstUserText).toMatchObject({ type: 'text', text: 'original prompt' })
const flattenedText = laterCall.flatMap(m => m.content.filter(c => c.type === 'text'))
expect(flattenedText.some(c => c.type === 'text' && c.text.includes('truncated'))).toBe(true)
})
it('summarize strategy replaces old context and emits summary trace call', async () => {
const calls: Array<{ messages: LLMMessage[]; options: LLMChatOptions }> = []
const traces: TraceEvent[] = []
const responses = [
toolUseResponse('echo', { message: 'first turn payload '.repeat(20) }),
toolUseResponse('echo', { message: 'second turn payload '.repeat(20) }),
textResponse('This is a concise summary.'),
textResponse('final answer'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages, options) {
calls.push({ messages: messages.map(m => ({ role: m.role, content: m.content })), options })
return responses[idx++]!
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: { type: 'summarize', maxTokens: 20 },
})
const result = await runner.run(
[{ role: 'user', content: [{ type: 'text', text: 'start' }] }],
{ onTrace: (e) => { traces.push(e) }, runId: 'run-summary', traceAgent: 'context-agent' },
)
const summaryCall = calls.find(c => c.messages.length === 1 && c.options.tools === undefined)
expect(summaryCall).toBeDefined()
const llmTraces = traces.filter(t => t.type === 'llm_call')
expect(llmTraces.some(t => t.type === 'llm_call' && t.phase === 'summary')).toBe(true)
// Summary adapter usage must count toward RunResult.tokenUsage (maxTokenBudget).
expect(result.tokenUsage.input_tokens).toBe(15 + 15 + 10 + 10)
expect(result.tokenUsage.output_tokens).toBe(25 + 25 + 20 + 20)
// After compaction, summary text is folded into the next user turn (not a
// standalone user message), preserving user/assistant alternation.
const turnAfterSummary = calls.find(
c => c.messages.some(
m => m.role === 'user' && m.content.some(
b => b.type === 'text' && b.text.includes('[Conversation summary]'),
),
),
)
expect(turnAfterSummary).toBeDefined()
const roleSequence = turnAfterSummary!.messages.map(m => m.role).join(',')
expect(roleSequence).not.toMatch(/^user,user/)
})
it('custom strategy calls compress callback and uses returned messages', async () => {
const compress = vi.fn((messages: LLMMessage[]) => messages.slice(-1))
const calls: LLMMessage[][] = []
const responses = [
toolUseResponse('echo', { message: 'hello' }),
textResponse('done'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 4,
contextStrategy: {
type: 'custom',
compress,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'custom prompt' }] }])
expect(compress).toHaveBeenCalledOnce()
expect(calls[1]).toHaveLength(1)
})
// ---------------------------------------------------------------------------
// compact strategy
// ---------------------------------------------------------------------------
describe('compact strategy', () => {
const longText = 'x'.repeat(3000)
const longToolResult = 'result-data '.repeat(100) // ~1200 chars
function buildMultiTurnAdapter(
responseCount: number,
calls: LLMMessage[][],
): LLMAdapter {
const responses: LLMResponse[] = []
for (let i = 0; i < responseCount - 1; i++) {
responses.push(toolUseResponse('echo', { message: `turn-${i}` }))
}
responses.push(textResponse('done'))
let idx = 0
return {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() { /* unused */ },
}
}
/** Build a registry with an echo tool that returns a fixed result string. */
function buildEchoRegistry(result: string): { registry: ToolRegistry; executor: ToolExecutor } {
const registry = new ToolRegistry()
registry.register(
defineTool({
name: 'echo',
description: 'Echo input',
inputSchema: z.object({ message: z.string() }),
async execute() {
return { data: result }
},
}),
)
return { registry, executor: new ToolExecutor(registry) }
}
it('does not activate below maxTokens threshold', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(3, calls)
const { registry, executor } = buildEchoRegistry('short')
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: { type: 'compact', maxTokens: 999999 },
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
// On the 3rd call (turn 3), all previous messages should still be intact
// because estimated tokens are way below the threshold.
const lastCall = calls[calls.length - 1]!
const allToolResults = lastCall.flatMap(m =>
m.content.filter(b => b.type === 'tool_result'),
)
for (const tr of allToolResults) {
if (tr.type === 'tool_result') {
expect(tr.content).not.toContain('compacted')
}
}
})
it('compresses old tool_result blocks when tokens exceed threshold', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(4, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20, // very low to always trigger
preserveRecentTurns: 1, // only protect the most recent turn
minToolResultChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
// On the last call, old tool results should have compact markers.
const lastCall = calls[calls.length - 1]!
const toolResults = lastCall.flatMap(m =>
m.content.filter(b => b.type === 'tool_result'),
)
const compacted = toolResults.filter(
b => b.type === 'tool_result' && b.content.includes('compacted'),
)
expect(compacted.length).toBeGreaterThan(0)
// Marker should include tool name.
for (const tr of compacted) {
if (tr.type === 'tool_result') {
expect(tr.content).toMatch(/\[Tool result: echo/)
}
}
})
it('preserves the first user message', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(4, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'original prompt' }] }])
const lastCall = calls[calls.length - 1]!
const firstUser = lastCall.find(m => m.role === 'user')!
expect(firstUser.content[0]).toMatchObject({ type: 'text', text: 'original prompt' })
})
it('preserves tool_use blocks in old turns', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(4, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
// Every assistant message should still have its tool_use block.
const lastCall = calls[calls.length - 1]!
const assistantMsgs = lastCall.filter(m => m.role === 'assistant')
for (const msg of assistantMsgs) {
const toolUses = msg.content.filter(b => b.type === 'tool_use')
// The last assistant message is "done" (text only), others have tool_use.
if (msg.content.some(b => b.type === 'text' && b.text === 'done')) continue
expect(toolUses.length).toBeGreaterThan(0)
}
})
it('preserves error tool_result blocks', async () => {
const calls: LLMMessage[][] = []
const responses: LLMResponse[] = [
toolUseResponse('echo', { message: 'will-fail' }),
toolUseResponse('echo', { message: 'ok' }),
textResponse('done'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() { /* unused */ },
}
// Tool that fails on first call, succeeds on second.
let callCount = 0
const registry = new ToolRegistry()
registry.register(
defineTool({
name: 'echo',
description: 'Echo input',
inputSchema: z.object({ message: z.string() }),
async execute() {
callCount++
if (callCount === 1) {
throw new Error('deliberate error '.repeat(40))
}
return { data: longToolResult }
},
}),
)
const executor = new ToolExecutor(registry)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 50,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
const lastCall = calls[calls.length - 1]!
const errorResults = lastCall.flatMap(m =>
m.content.filter(b => b.type === 'tool_result' && b.is_error),
)
// Error results should still have their original content (not compacted).
for (const er of errorResults) {
if (er.type === 'tool_result') {
expect(er.content).not.toContain('compacted')
expect(er.content).toContain('deliberate error')
}
}
})
it('does not re-compress markers from compressToolResults', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(4, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
compressToolResults: { minChars: 100 },
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 10,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
const lastCall = calls[calls.length - 1]!
const allToolResults = lastCall.flatMap(m =>
m.content.filter(b => b.type === 'tool_result'),
)
// No result should contain nested markers.
for (const tr of allToolResults) {
if (tr.type === 'tool_result') {
// Should not have a compact marker wrapping another marker.
const markerCount = (tr.content.match(/\[Tool/g) || []).length
expect(markerCount).toBeLessThanOrEqual(1)
}
}
})
it('truncates long assistant text blocks in old turns', async () => {
const calls: LLMMessage[][] = []
const responses: LLMResponse[] = [
// First turn: assistant with long text + tool_use
{
id: 'r1',
content: [
{ type: 'text', text: longText },
{ type: 'tool_use', id: 'tu-1', name: 'echo', input: { message: 'hi' } },
],
model: 'mock-model',
stop_reason: 'tool_use',
usage: { input_tokens: 10, output_tokens: 20 },
},
toolUseResponse('echo', { message: 'turn2' }),
textResponse('done'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() { /* unused */ },
}
const { registry, executor } = buildEchoRegistry('short')
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minTextBlockChars: 500,
textBlockExcerptChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
const lastCall = calls[calls.length - 1]!
// The first assistant message (old zone) should have its text truncated.
const firstAssistant = lastCall.find(m => m.role === 'assistant')!
const textBlocks = firstAssistant.content.filter(b => b.type === 'text')
const truncated = textBlocks.find(
b => b.type === 'text' && b.text.includes('truncated'),
)
expect(truncated).toBeDefined()
if (truncated && truncated.type === 'text') {
expect(truncated.text.length).toBeLessThan(longText.length)
expect(truncated.text).toContain(`${longText.length} chars total`)
}
})
it('keeps recent turns intact within preserveRecentTurns', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(4, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
// The most recent tool_result (last user message with tool_result) should
// still contain the original long content.
const lastCall = calls[calls.length - 1]!
const userMsgs = lastCall.filter(m => m.role === 'user')
const lastUserWithToolResult = [...userMsgs]
.reverse()
.find(m => m.content.some(b => b.type === 'tool_result'))
expect(lastUserWithToolResult).toBeDefined()
const recentTr = lastUserWithToolResult!.content.find(b => b.type === 'tool_result')
if (recentTr && recentTr.type === 'tool_result') {
expect(recentTr.content).not.toContain('compacted')
expect(recentTr.content).toContain('result-data')
}
})
it('does not compact when all turns fit in preserveRecentTurns', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(3, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 10, // way more than actual turns
minToolResultChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
// All tool results should still have original content.
const lastCall = calls[calls.length - 1]!
const toolResults = lastCall.flatMap(m =>
m.content.filter(b => b.type === 'tool_result'),
)
for (const tr of toolResults) {
if (tr.type === 'tool_result') {
expect(tr.content).not.toContain('compacted')
}
}
})
it('maintains correct role alternation after compaction', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(5, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 10,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 100,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
// Check all LLM calls for role alternation.
for (const callMsgs of calls) {
for (let i = 1; i < callMsgs.length; i++) {
expect(callMsgs[i]!.role).not.toBe(callMsgs[i - 1]!.role)
}
}
})
it('returns ZERO_USAGE (no LLM cost from compaction)', async () => {
const calls: LLMMessage[][] = []
const adapter = buildMultiTurnAdapter(4, calls)
const { registry, executor } = buildEchoRegistry(longToolResult)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: {
type: 'compact',
maxTokens: 20,
preserveRecentTurns: 1,
minToolResultChars: 100,
},
})
const result = await runner.run([
{ role: 'user', content: [{ type: 'text', text: 'start' }] },
])
// Token usage should only reflect the 4 actual LLM calls (no extra from compaction).
// Each toolUseResponse: input=15, output=25. textResponse: input=10, output=20.
// 3 tool calls + 1 final = (15*3 + 10) input, (25*3 + 20) output.
expect(result.tokenUsage.input_tokens).toBe(15 * 3 + 10)
expect(result.tokenUsage.output_tokens).toBe(25 * 3 + 20)
})
})
})
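
Taken together, the `compact` tests describe a rule-based pass: old `tool_result` blocks above a size threshold are replaced by a short marker naming the tool, while error results, existing markers, `tool_use` blocks, the first user message, and the most recent turns are left alone. The sketch below illustrates that idea under stated assumptions. It is not the `AgentRunner` implementation: `compactToolResultsSketch`, the block and option types, the "one assistant message starts one turn" rule, and the marker wording are all hypothetical.

```typescript
// Hypothetical sketch of rule-based compaction; types, names, and marker text are assumptions.
type BlockSketch =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string }
  | { type: 'tool_result'; tool_use_id: string; content: string; is_error?: boolean }

interface MessageSketch {
  role: 'user' | 'assistant'
  content: BlockSketch[]
}

interface CompactOptionsSketch {
  preserveRecentTurns: number
  minToolResultChars: number
}

function compactToolResultsSketch(
  messages: MessageSketch[],
  opts: CompactOptionsSketch,
): MessageSketch[] {
  // Map tool_use ids to tool names so each marker can say which tool produced the result.
  const toolNames = new Map<string, string>()
  for (const m of messages) {
    for (const b of m.content) {
      if (b.type === 'tool_use') toolNames.set(b.id, b.name)
    }
  }
  // Assume each assistant message opens a turn; everything from keepFrom onward is "recent".
  const assistantIdx: number[] = []
  messages.forEach((m, i) => {
    if (m.role === 'assistant') assistantIdx.push(i)
  })
  const keepFrom =
    opts.preserveRecentTurns > 0
      ? (assistantIdx[assistantIdx.length - opts.preserveRecentTurns] ?? 0)
      : messages.length

  return messages.map((m, i): MessageSketch => {
    if (i >= keepFrom) return m
    return {
      role: m.role,
      content: m.content.map((b): BlockSketch => {
        if (b.type !== 'tool_result') return b // text and tool_use blocks are untouched here
        if (b.is_error === true) return b // keep error details intact
        if (b.content.length < opts.minToolResultChars) return b
        if (b.content.startsWith('[Tool')) return b // already a marker, do not nest
        const name = toolNames.get(b.tool_use_id) ?? 'unknown'
        return { ...b, content: `[Tool result: ${name}, ${b.content.length} chars, compacted]` }
      }),
    }
  })
}
```

Because the pass only rewrites block contents and never adds or removes messages, role alternation is preserved by construction, and no LLM call is needed, which is consistent with the zero-usage expectation in the last test.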


@ -1,478 +0,0 @@
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'
import { textMsg, chatOpts, toolDef, collectEvents } from './helpers/llm-fixtures.js'
import type { LLMResponse, StreamEvent, ToolUseBlock } from '../src/types.js'
// ---------------------------------------------------------------------------
// Mock OpenAI SDK (Copilot uses it under the hood)
// ---------------------------------------------------------------------------
const mockCreate = vi.hoisted(() => vi.fn())
const OpenAIMock = vi.hoisted(() =>
vi.fn(() => ({
chat: { completions: { create: mockCreate } },
})),
)
vi.mock('openai', () => ({
default: OpenAIMock,
OpenAI: OpenAIMock,
}))
// ---------------------------------------------------------------------------
// Mock global fetch for token management
// ---------------------------------------------------------------------------
const originalFetch = globalThis.fetch
function mockFetchForToken(sessionToken = 'cop_session_abc', expiresAt?: number) {
const exp = expiresAt ?? Math.floor(Date.now() / 1000) + 3600
return vi.fn().mockResolvedValue({
ok: true,
json: () => Promise.resolve({ token: sessionToken, expires_at: exp }),
text: () => Promise.resolve(''),
})
}
import { CopilotAdapter, getCopilotMultiplier, formatCopilotMultiplier } from '../src/llm/copilot.js'
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function makeCompletion(overrides: Record<string, unknown> = {}) {
return {
id: 'chatcmpl-cop',
model: 'claude-sonnet-4',
choices: [{
index: 0,
message: { role: 'assistant', content: 'Hello from Copilot', tool_calls: undefined },
finish_reason: 'stop',
}],
usage: { prompt_tokens: 8, completion_tokens: 4 },
...overrides,
}
}
async function* makeChunks(chunks: Array<Record<string, unknown>>) {
for (const chunk of chunks) yield chunk
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe('CopilotAdapter', () => {
let savedEnv: Record<string, string | undefined>
beforeEach(() => {
vi.clearAllMocks()
savedEnv = {
GITHUB_COPILOT_TOKEN: process.env['GITHUB_COPILOT_TOKEN'],
GITHUB_TOKEN: process.env['GITHUB_TOKEN'],
}
delete process.env['GITHUB_COPILOT_TOKEN']
delete process.env['GITHUB_TOKEN']
})
afterEach(() => {
globalThis.fetch = originalFetch
for (const [key, val] of Object.entries(savedEnv)) {
if (val === undefined) delete process.env[key]
else process.env[key] = val
}
})
// =========================================================================
// Constructor & token resolution
// =========================================================================
describe('constructor', () => {
it('accepts string apiKey as first argument', () => {
const adapter = new CopilotAdapter('gh_token_123')
expect(adapter.name).toBe('copilot')
})
it('accepts options object with apiKey', () => {
const adapter = new CopilotAdapter({ apiKey: 'gh_token_456' })
expect(adapter.name).toBe('copilot')
})
it('falls back to GITHUB_COPILOT_TOKEN env var', () => {
process.env['GITHUB_COPILOT_TOKEN'] = 'env_copilot_token'
const adapter = new CopilotAdapter()
expect(adapter.name).toBe('copilot')
})
it('falls back to GITHUB_TOKEN env var', () => {
process.env['GITHUB_TOKEN'] = 'env_gh_token'
const adapter = new CopilotAdapter()
expect(adapter.name).toBe('copilot')
})
})
// =========================================================================
// Token management
// =========================================================================
describe('token management', () => {
it('uses the device flow when no GitHub token is available', async () => {
vi.useFakeTimers()
const onDeviceCode = vi.fn()
globalThis.fetch = vi.fn()
.mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve({
device_code: 'device-code',
user_code: 'ABCD-EFGH',
verification_uri: 'https://github.com/login/device',
interval: 0,
expires_in: 600,
}),
})
.mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve({ access_token: 'oauth_token' }),
})
.mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve({
token: 'session_from_device_flow',
expires_at: Math.floor(Date.now() / 1000) + 3600,
}),
text: () => Promise.resolve(''),
})
const adapter = new CopilotAdapter({ onDeviceCode })
mockCreate.mockResolvedValue(makeCompletion())
const responsePromise = adapter.chat([textMsg('user', 'Hi')], chatOpts())
await vi.runAllTimersAsync()
await responsePromise
expect(onDeviceCode).toHaveBeenCalledWith(
'https://github.com/login/device',
'ABCD-EFGH',
)
expect(globalThis.fetch).toHaveBeenNthCalledWith(
3,
'https://api.github.com/copilot_internal/v2/token',
expect.objectContaining({
headers: expect.objectContaining({
Authorization: 'token oauth_token',
}),
}),
)
expect(OpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'session_from_device_flow',
}),
)
vi.useRealTimers()
})
it('exchanges GitHub token for Copilot session token', async () => {
const fetchMock = mockFetchForToken('session_xyz')
globalThis.fetch = fetchMock
const adapter = new CopilotAdapter('gh_token')
mockCreate.mockResolvedValue(makeCompletion())
await adapter.chat([textMsg('user', 'Hi')], chatOpts())
// fetch was called to exchange token
expect(fetchMock).toHaveBeenCalledWith(
'https://api.github.com/copilot_internal/v2/token',
expect.objectContaining({
method: 'GET',
headers: expect.objectContaining({
Authorization: 'token gh_token',
}),
}),
)
// OpenAI client was created with session token
expect(OpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
apiKey: 'session_xyz',
baseURL: 'https://api.githubcopilot.com',
}),
)
})
it('caches session token and reuses on second call', async () => {
const fetchMock = mockFetchForToken()
globalThis.fetch = fetchMock
const adapter = new CopilotAdapter('gh_token')
mockCreate.mockResolvedValue(makeCompletion())
await adapter.chat([textMsg('user', 'Hi')], chatOpts())
await adapter.chat([textMsg('user', 'Hi again')], chatOpts())
// fetch should only be called once (cached)
expect(fetchMock).toHaveBeenCalledTimes(1)
})
it('refreshes token when near expiry (within 60s)', async () => {
const nowSec = Math.floor(Date.now() / 1000)
// First call: token expires in 30 seconds (within 60s grace)
let callCount = 0
globalThis.fetch = vi.fn().mockImplementation(() => {
callCount++
return Promise.resolve({
ok: true,
json: () => Promise.resolve({
token: `session_${callCount}`,
expires_at: callCount === 1 ? nowSec + 30 : nowSec + 3600,
}),
text: () => Promise.resolve(''),
})
})
const adapter = new CopilotAdapter('gh_token')
mockCreate.mockResolvedValue(makeCompletion())
await adapter.chat([textMsg('user', 'Hi')], chatOpts())
// Token is within 60s of expiry, should refresh
await adapter.chat([textMsg('user', 'Hi again')], chatOpts())
expect(callCount).toBe(2)
})
it('concurrent requests share a single refresh promise', async () => {
let resolveToken: ((v: unknown) => void) | undefined
const slowFetch = vi.fn().mockImplementation(() => {
return new Promise((resolve) => {
resolveToken = resolve
})
})
globalThis.fetch = slowFetch
const adapter = new CopilotAdapter('gh_token')
mockCreate.mockResolvedValue(makeCompletion())
// Fire two concurrent requests
const p1 = adapter.chat([textMsg('user', 'A')], chatOpts())
const p2 = adapter.chat([textMsg('user', 'B')], chatOpts())
// Resolve the single in-flight fetch
resolveToken!({
ok: true,
json: () => Promise.resolve({
token: 'shared_session',
expires_at: Math.floor(Date.now() / 1000) + 3600,
}),
text: () => Promise.resolve(''),
})
await Promise.all([p1, p2])
// fetch was called only once (mutex prevented double refresh)
expect(slowFetch).toHaveBeenCalledTimes(1)
})
it('throws on failed token exchange', async () => {
globalThis.fetch = vi.fn().mockResolvedValue({
ok: false,
status: 401,
text: () => Promise.resolve('Unauthorized'),
statusText: 'Unauthorized',
})
const adapter = new CopilotAdapter('bad_token')
mockCreate.mockResolvedValue(makeCompletion())
await expect(
adapter.chat([textMsg('user', 'Hi')], chatOpts()),
).rejects.toThrow('Copilot token exchange failed')
})
})
// =========================================================================
// chat()
// =========================================================================
describe('chat()', () => {
let adapter: CopilotAdapter
beforeEach(() => {
globalThis.fetch = mockFetchForToken()
adapter = new CopilotAdapter('gh_token')
})
it('creates OpenAI client with Copilot-specific headers and baseURL', async () => {
mockCreate.mockResolvedValue(makeCompletion())
await adapter.chat([textMsg('user', 'Hi')], chatOpts())
expect(OpenAIMock).toHaveBeenCalledWith(
expect.objectContaining({
baseURL: 'https://api.githubcopilot.com',
defaultHeaders: expect.objectContaining({
'Copilot-Integration-Id': 'vscode-chat',
'Editor-Version': 'vscode/1.100.0',
}),
}),
)
})
it('returns LLMResponse from completion', async () => {
mockCreate.mockResolvedValue(makeCompletion())
const result = await adapter.chat([textMsg('user', 'Hi')], chatOpts())
expect(result).toEqual({
id: 'chatcmpl-cop',
content: [{ type: 'text', text: 'Hello from Copilot' }],
model: 'claude-sonnet-4',
stop_reason: 'end_turn',
usage: { input_tokens: 8, output_tokens: 4 },
})
})
it('passes tools and temperature through', async () => {
mockCreate.mockResolvedValue(makeCompletion())
const tool = toolDef('search')
await adapter.chat(
[textMsg('user', 'Hi')],
chatOpts({ tools: [tool], temperature: 0.5 }),
)
const callArgs = mockCreate.mock.calls[0][0]
expect(callArgs.tools[0].function.name).toBe('search')
expect(callArgs.temperature).toBe(0.5)
expect(callArgs.stream).toBe(false)
})
})

  // =========================================================================
  // stream()
  // =========================================================================
  describe('stream()', () => {
    let adapter: CopilotAdapter

    beforeEach(() => {
      globalThis.fetch = mockFetchForToken()
      adapter = new CopilotAdapter('gh_token')
    })

    it('yields text and done events', async () => {
      mockCreate.mockResolvedValue(makeChunks([
        { id: 'c1', model: 'gpt-4o', choices: [{ index: 0, delta: { content: 'Hi' }, finish_reason: null }], usage: null },
        { id: 'c1', model: 'gpt-4o', choices: [{ index: 0, delta: {}, finish_reason: 'stop' }], usage: null },
        { id: 'c1', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 5, completion_tokens: 2 } },
      ]))
      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
      expect(events.filter(e => e.type === 'text')).toEqual([
        { type: 'text', data: 'Hi' },
      ])
      const done = events.find(e => e.type === 'done')
      expect((done!.data as LLMResponse).usage).toEqual({ input_tokens: 5, output_tokens: 2 })
    })

    it('yields tool_use events from streamed tool calls', async () => {
      mockCreate.mockResolvedValue(makeChunks([
        {
          id: 'c1', model: 'gpt-4o',
          choices: [{ index: 0, delta: { tool_calls: [{ index: 0, id: 'call_1', function: { name: 'search', arguments: '{"q":"x"}' } }] }, finish_reason: null }],
          usage: null,
        },
        { id: 'c1', model: 'gpt-4o', choices: [{ index: 0, delta: {}, finish_reason: 'tool_calls' }], usage: null },
        { id: 'c1', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 5, completion_tokens: 3 } },
      ]))
      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
      const toolEvents = events.filter(e => e.type === 'tool_use')
      expect(toolEvents).toHaveLength(1)
      expect((toolEvents[0].data as ToolUseBlock).name).toBe('search')
    })

    it('yields error event on failure', async () => {
      mockCreate.mockResolvedValue(
        (async function* () { throw new Error('Copilot down') })(),
      )
      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
      expect(events.filter(e => e.type === 'error')).toHaveLength(1)
    })

    it('handles malformed streamed tool arguments JSON', async () => {
      mockCreate.mockResolvedValue(makeChunks([
        {
          id: 'c1', model: 'gpt-4o',
          choices: [{ index: 0, delta: { tool_calls: [{ index: 0, id: 'call_1', function: { name: 'search', arguments: '{broken' } }] }, finish_reason: 'tool_calls' }],
          usage: null,
        },
        { id: 'c1', model: 'gpt-4o', choices: [], usage: { prompt_tokens: 5, completion_tokens: 3 } },
      ]))
      const events = await collectEvents(adapter.stream([textMsg('user', 'Hi')], chatOpts()))
      const toolEvents = events.filter(e => e.type === 'tool_use')
      expect(toolEvents).toHaveLength(1)
      expect((toolEvents[0].data as ToolUseBlock).input).toEqual({})
    })
  })

  // =========================================================================
  // getCopilotMultiplier()
  // =========================================================================
  describe('getCopilotMultiplier()', () => {
    it('returns 0 for included models', () => {
      expect(getCopilotMultiplier('gpt-4.1')).toBe(0)
      expect(getCopilotMultiplier('gpt-4o')).toBe(0)
      expect(getCopilotMultiplier('gpt-5-mini')).toBe(0)
    })

    it('returns 0.25 for grok models', () => {
      expect(getCopilotMultiplier('grok-code-fast-1')).toBe(0.25)
    })

    it('returns 0.33 for haiku, gemini-3-flash, etc.', () => {
      expect(getCopilotMultiplier('claude-haiku-4.5')).toBe(0.33)
      expect(getCopilotMultiplier('gemini-3-flash')).toBe(0.33)
    })

    it('returns 1 for sonnet, gemini-pro, gpt-5.x', () => {
      expect(getCopilotMultiplier('claude-sonnet-4')).toBe(1)
      expect(getCopilotMultiplier('gemini-2.5-pro')).toBe(1)
      expect(getCopilotMultiplier('gpt-5.1')).toBe(1)
    })

    it('returns 3 for claude-opus (non-fast)', () => {
      expect(getCopilotMultiplier('claude-opus-4.5')).toBe(3)
    })

    it('returns 30 for claude-opus fast', () => {
      expect(getCopilotMultiplier('claude-opus-4.6-fast')).toBe(30)
    })

    it('returns 1 for unknown models', () => {
      expect(getCopilotMultiplier('some-new-model')).toBe(1)
    })
  })

  // =========================================================================
  // formatCopilotMultiplier()
  // =========================================================================
  describe('formatCopilotMultiplier()', () => {
    it('returns "included (0\u00d7)" for 0', () => {
      expect(formatCopilotMultiplier(0)).toBe('included (0\u00d7)')
    })

    it('returns "1\u00d7 premium request" for 1', () => {
      expect(formatCopilotMultiplier(1)).toBe('1\u00d7 premium request')
    })

    it('returns "0.33\u00d7 premium request" for 0.33', () => {
      expect(formatCopilotMultiplier(0.33)).toBe('0.33\u00d7 premium request')
    })
  })
})
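
The multiplier tests above pin down a small lookup: some models are included at 0×, a few families carry fractional or elevated multipliers, and anything unrecognized defaults to 1×. The sketch below is one way such a lookup could be written so that it satisfies those expectations. It is not the adapter's actual implementation, which this diff does not show; the names `copilotMultiplierSketch`, `formatMultiplierSketch`, and `INCLUDED_MODELS`, and the substring-matching rules, are assumptions inferred only from the tested inputs and outputs.

```typescript
// Hypothetical sketch: names and matching rules are inferred from the test
// expectations, not copied from the real Copilot adapter source.
const INCLUDED_MODELS = new Set(['gpt-4.1', 'gpt-4o', 'gpt-5-mini'])

function copilotMultiplierSketch(model: string): number {
  const id = model.toLowerCase()
  // Models bundled with the plan cost no premium requests.
  if (INCLUDED_MODELS.has(id)) return 0
  if (id.includes('grok')) return 0.25
  if (id.includes('haiku') || id.startsWith('gemini-3-flash')) return 0.33
  // Opus is checked before the default so its "fast" variant can be told apart.
  if (id.includes('opus')) return id.includes('fast') ? 30 : 3
  // Unknown models fall back to one premium request.
  return 1
}

function formatMultiplierSketch(multiplier: number): string {
  // \u00d7 is the multiplication sign used in the expected strings.
  return multiplier === 0
    ? 'included (0\u00d7)'
    : `${multiplier}\u00d7 premium request`
}
```

Defaulting unknown models to 1 mirrors the "returns 1 for unknown models" case; whether the real code matches by exact ID, prefix, or substring is not visible here.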

View File

@ -1,46 +0,0 @@
import { describe, expect, it } from 'vitest'
import { layoutTasks } from '../src/dashboard/layout-tasks.js'

describe('layoutTasks', () => {
  it('assigns increasing columns along a dependency chain (topological levels)', () => {
    const tasks = [
      { id: 'a', dependsOn: [] as const },
      { id: 'b', dependsOn: ['a'] as const },
      { id: 'c', dependsOn: ['b'] as const },
    ]
    const { positions } = layoutTasks(tasks)
    expect(positions.get('a')!.x).toBeLessThan(positions.get('b')!.x)
    expect(positions.get('b')!.x).toBeLessThan(positions.get('c')!.x)
  })

  it('places a merge node after all of its dependencies (diamond)', () => {
    const tasks = [
      { id: 'root', dependsOn: [] as const },
      { id: 'left', dependsOn: ['root'] as const },
      { id: 'right', dependsOn: ['root'] as const },
      { id: 'merge', dependsOn: ['left', 'right'] as const },
    ]
    const { positions } = layoutTasks(tasks)
    const mx = positions.get('merge')!.x
    expect(mx).toBeGreaterThan(positions.get('left')!.x)
    expect(mx).toBeGreaterThan(positions.get('right')!.x)
  })

  it('orders independent roots in the same column with distinct rows', () => {
    const tasks = [
      { id: 'a', dependsOn: [] as const },
      { id: 'b', dependsOn: [] as const },
    ]
    const { positions } = layoutTasks(tasks)
    expect(positions.get('a')!.x).toBe(positions.get('b')!.x)
    expect(positions.get('a')!.y).not.toBe(positions.get('b')!.y)
  })

  it('throws when task dependencies contain a cycle', () => {
    const tasks = [
      { id: 'a', dependsOn: ['b'] as const },
      { id: 'b', dependsOn: ['a'] as const },
    ]
    expect(() => layoutTasks(tasks)).toThrow('Task dependency graph contains a cycle')
  })
})
})
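
The `layoutTasks` tests above describe a layered layout: a task's column is its topological level (one more than the deepest dependency), tasks sharing a column get distinct rows, and a cycle raises an error. A minimal sketch of that technique follows. The real `src/dashboard/layout-tasks.ts` is not part of this excerpt, so `layoutTasksSketch`, the `TaskNode` and `Position` types, the use of integer grid coordinates, and the choice to ignore dependencies on unknown IDs are all assumptions made for illustration.

```typescript
// Hypothetical sketch of a layered (topological-level) layout; the types and
// function name are illustrative, not the project's actual API.
interface TaskNode {
  id: string
  dependsOn: readonly string[]
}

interface Position {
  x: number // column = topological level
  y: number // row within that column
}

function layoutTasksSketch(
  tasks: readonly TaskNode[],
): { positions: Map<string, Position> } {
  const byId = new Map(tasks.map(t => [t.id, t] as const))
  const level = new Map<string, number>()
  const visiting = new Set<string>() // IDs on the current DFS path

  const levelOf = (id: string): number => {
    const cached = level.get(id)
    if (cached !== undefined) return cached
    // Re-entering a node that is still on the DFS path means a cycle.
    if (visiting.has(id)) throw new Error('Task dependency graph contains a cycle')
    visiting.add(id)
    let depth = 0
    for (const dep of byId.get(id)?.dependsOn ?? []) {
      // Assumption: dependencies on unknown task IDs are ignored.
      if (byId.has(dep)) depth = Math.max(depth, levelOf(dep) + 1)
    }
    visiting.delete(id)
    level.set(id, depth)
    return depth
  }

  const rowsUsed = new Map<number, number>() // next free row per column
  const positions = new Map<string, Position>()
  for (const task of tasks) {
    const x = levelOf(task.id)
    const y = rowsUsed.get(x) ?? 0
    rowsUsed.set(x, y + 1)
    positions.set(task.id, { x, y })
  }
  return { positions }
}
```

Memoizing levels keeps the walk linear in the number of tasks plus dependency edges, and the `visiting` set catches the two-node cycle used in the last test before recursion can run away.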

Some files were not shown because too many files have changed in this diff