Compare commits

...

15 Commits

Author SHA1 Message Date
JackChen 0f16e81ae6
Merge pull request #105 from JackChen-me/chore/feature-request-source-field
chore: add source field to feature request template
2026-04-12 23:25:51 +08:00
JackChen 5804a54898 chore: add source field to feature request issue template
Helps maintainers triage by requiring contributors to indicate where
the idea originated (real use case, competitive reference, systematic
gap, or external discussion).
2026-04-12 23:23:42 +08:00
JackChen 252419e1f8
Merge pull request #104 from JackChen-me/fix/issue-99-100-101-abort-signal-propagation
fix: propagate AbortSignal through tools, Gemini adapter, and abort queue path
2026-04-12 23:22:15 +08:00
JackChen 6ea66afab5 fix: propagate AbortSignal through tool execution, Gemini adapter, and abort queue path (#99, #100, #101)
- #99: pass per-call effectiveAbortSignal to buildToolContext() so tools
  receive the correct signal instead of the static runner-level one
- #100: replace manual pending-task loop with queue.skipRemaining() on
  abort, fixing blocked tasks left non-terminal and missing events
- #101: forward abortSignal in Gemini adapter's buildConfig() so the
  SDK can cancel in-flight API calls
- Add 8 targeted tests for all three fixes
2026-04-12 23:20:30 +08:00
JackChen 97c5e457dd
Merge pull request #103 from JackChen-me/fix/issue-98-run-error-propagation
fix: propagate error events in AgentRunner.run()
2026-04-12 22:22:22 +08:00
JackChen 9b04fbf2e5
Merge pull request #102 from ibrahimkzmv/feat.add-glob-tool
feat: add glob tool
2026-04-12 22:21:43 +08:00
JackChen 9a446b8796 fix: propagate error events in AgentRunner.run() (#98)
run() only handled 'done' events from stream(), silently dropping
'error' events. This caused failed LLM calls to return an empty
RunResult that the caller treated as successful.
2026-04-12 22:16:33 +08:00
Ibrahim Kazimov dc88232885 feat: add glob tool 2026-04-12 16:59:20 +03:00
JackChen ced1d90a93
Merge pull request #89 from ibrahimkzmv/feat.mcp-tool-integration
feat: add connectMCPTools() to register MCP server tools as standard agent tools
2026-04-12 17:05:31 +08:00
JackChen 0fb8a38284
Merge pull request #88 from ibrahimkzmv/feat.context-strategy
feat: Add contextStrategy to control conversation growth and prevent token explosion in agent runs
2026-04-12 16:54:41 +08:00
MrAvalonApple 629d9c8253 feat: implement synthetic framing for user messages and enhance context strategy handling 2026-04-12 00:18:36 +03:00
Ibrahim Kazimov 167085c3a7
Merge branch 'main' into feat.mcp-tool-integration 2026-04-12 00:03:18 +03:00
MrAvalonApple 12dd802ad8 feat: update MCP GitHub example and added llmInputSchema 2026-04-12 00:01:22 +03:00
MrAvalonApple 7aa1bb7b5d feat: add connectMCPTools() to register MCP server tools as agent tools 2026-04-09 20:05:20 +03:00
MrAvalonApple eb484d9bbf feat: add context management strategies (sliding-window, summarize, custom) to prevent unbounded conversation growth 2026-04-09 19:40:15 +03:00
26 changed files with 2863 additions and 115 deletions


@@ -6,6 +6,17 @@ labels: enhancement
assignees: ''
---
## Source
**Where did this idea come from?** (Pick one — helps maintainers triage and prioritize.)
- [ ] **Real use case** — I'm using open-multi-agent and hit this limit. Describe the use case in "Problem" below.
- [ ] **Competitive reference** — Another framework has this (LangChain, AutoGen, CrewAI, Mastra, XCLI, etc.). Please name or link it.
- [ ] **Systematic gap** — A missing piece in the framework matrix (provider not supported, tool not covered, etc.).
- [ ] **Discussion / inspiration** — Came up in a tweet, Reddit post, Discord, or AI conversation. Please link or paste the source if possible.
> **Maintainer note**: after triage, label with one of `community-feedback`, `source:competitive`, `source:analysis`, `source:owner` (multiple OK if the source is mixed — e.g. competitive analysis + user feedback).
## Problem
A clear description of the problem or limitation you're experiencing.


@@ -138,7 +138,7 @@ For MapReduce-style fan-out without task dependencies, use `AgentPool.runParalle
## Examples
15 runnable scripts in [`examples/`](./examples/). Start with these four:
16 runnable scripts in [`examples/`](./examples/). Start with these four:
- [02 — Team Collaboration](examples/02-team-collaboration.ts): `runTeam()` coordinator pattern.
- [06 — Local Model](examples/06-local-model.ts): Ollama and Claude in one pipeline via `baseURL`.
@@ -248,6 +248,30 @@ const customAgent: AgentConfig = {
Tools added via `agent.addTool()` are always available regardless of filtering.
### MCP Tools (Model Context Protocol)
`open-multi-agent` can connect to any MCP server and expose its tools directly to agents.
```typescript
import { connectMCPTools } from '@jackchen_me/open-multi-agent/mcp'
const { tools, disconnect } = await connectMCPTools({
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-github'],
env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
namePrefix: 'github',
})
// Register each MCP tool in your ToolRegistry, then include their names in AgentConfig.tools
// Don't forget cleanup when done
await disconnect()
```
Notes:
- `@modelcontextprotocol/sdk` is an optional peer dependency, only needed when using MCP.
- Current transport support is stdio.
- MCP input validation is delegated to the MCP server (`inputSchema` is `z.any()`).
## Supported Providers
| Provider | Config | Env var | Status |


@@ -114,6 +114,8 @@ const conversationAgent = new Agent(
model: 'claude-sonnet-4-6',
systemPrompt: 'You are a TypeScript tutor. Give short, direct answers.',
maxTurns: 2,
// Keep only the most recent turn in long prompt() conversations.
contextStrategy: { type: 'sliding-window', maxTurns: 1 },
},
new ToolRegistry(), // no tools needed for this conversation
new ToolExecutor(new ToolRegistry()),
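The sliding-window line above keeps only the latest turn; the same changeset also adds a token-based `summarize` strategy. A hedged sketch of its config shape — field names (`type`, `maxTokens`, `summaryModel`) are taken from `applyContextStrategy()` in the runner diff; the values are illustrative, not recommendations:

```typescript
// Sketch: the token-based alternative to sliding-window.
const contextStrategy = {
  type: 'summarize' as const,
  maxTokens: 4000,                  // compact once estimated prompt tokens exceed this
  summaryModel: 'gemini-2.5-flash', // optional cheaper model for the summary call
}
console.log(contextStrategy.type) // summarize
```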

examples/16-mcp-github.ts (new file, 59 lines)

@@ -0,0 +1,59 @@
/**
* Example 16: MCP GitHub Tools
*
* Connect an MCP server over stdio and register all exposed MCP tools as
* standard open-multi-agent tools.
*
* Run:
* npx tsx examples/16-mcp-github.ts
*
* Prerequisites:
* - GEMINI_API_KEY
* - GITHUB_TOKEN
* - @modelcontextprotocol/sdk installed
*/
import { Agent, ToolExecutor, ToolRegistry, registerBuiltInTools } from '../src/index.js'
import { connectMCPTools } from '../src/mcp.js'
if (!process.env.GITHUB_TOKEN?.trim()) {
console.error('Missing GITHUB_TOKEN: set a GitHub personal access token in the environment.')
process.exit(1)
}
const { tools, disconnect } = await connectMCPTools({
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-github'],
env: {
...process.env,
GITHUB_TOKEN: process.env.GITHUB_TOKEN,
},
namePrefix: 'github',
})
const registry = new ToolRegistry()
registerBuiltInTools(registry)
for (const tool of tools) registry.register(tool)
const executor = new ToolExecutor(registry)
const agent = new Agent(
{
name: 'github-agent',
model: 'gemini-2.5-flash',
provider: 'gemini',
tools: tools.map((tool) => tool.name),
systemPrompt: 'Use GitHub MCP tools to answer repository questions.',
},
registry,
executor,
)
try {
const result = await agent.run(
'List the last 3 open issues in JackChen-me/open-multi-agent with title and number.',
)
console.log(result.output)
} finally {
await disconnect()
}

package-lock.json (generated, 1036 lines): diff suppressed because it is too large.


@@ -14,6 +14,10 @@
".": {
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
},
"./mcp": {
"types": "./dist/mcp.d.ts",
"import": "./dist/mcp.js"
}
},
"scripts": {
@@ -48,15 +52,20 @@
"zod": "^3.23.0"
},
"peerDependencies": {
"@google/genai": "^1.48.0"
"@google/genai": "^1.48.0",
"@modelcontextprotocol/sdk": "^1.18.0"
},
"peerDependenciesMeta": {
"@google/genai": {
"optional": true
},
"@modelcontextprotocol/sdk": {
"optional": true
}
},
"devDependencies": {
"@google/genai": "^1.48.0",
"@modelcontextprotocol/sdk": "^1.18.0",
"@types/node": "^22.0.0",
"@vitest/coverage-v8": "^2.1.9",
"tsx": "^4.21.0",


@@ -153,6 +153,7 @@ export class Agent {
agentRole: this.config.systemPrompt?.slice(0, 50) ?? 'assistant',
loopDetection: this.config.loopDetection,
maxTokenBudget: this.config.maxTokenBudget,
contextStrategy: this.config.contextStrategy,
}
this.runner = new AgentRunner(


@@ -29,10 +29,12 @@ import type {
LoopDetectionConfig,
LoopDetectionInfo,
LLMToolDef,
ContextStrategy,
} from '../types.js'
import { TokenBudgetExceededError } from '../errors.js'
import { LoopDetector } from './loop-detector.js'
import { emitTrace } from '../utils/trace.js'
import { estimateTokens } from '../utils/tokens.js'
import type { ToolRegistry } from '../tool/framework.js'
import type { ToolExecutor } from '../tool/executor.js'
@@ -94,6 +96,8 @@ export interface RunnerOptions {
readonly loopDetection?: LoopDetectionConfig
/** Maximum cumulative tokens (input + output) allowed for this run. */
readonly maxTokenBudget?: number
/** Optional context compression strategy for long multi-turn runs. */
readonly contextStrategy?: ContextStrategy
}
/**
@@ -172,6 +176,31 @@ function addTokenUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }
/**
* Prepends synthetic framing text to the first user message so we never emit
* consecutive `user` turns (Bedrock) and summaries do not concatenate onto
* the original user prompt (direct API). If there is no user message yet,
* inserts a single assistant text preamble.
*/
function prependSyntheticPrefixToFirstUser(
messages: LLMMessage[],
prefix: string,
): LLMMessage[] {
const userIdx = messages.findIndex(m => m.role === 'user')
if (userIdx < 0) {
return [{
role: 'assistant',
content: [{ type: 'text', text: prefix.trimEnd() }],
}, ...messages]
}
const target = messages[userIdx]!
const merged: LLMMessage = {
role: 'user',
content: [{ type: 'text', text: prefix }, ...target.content],
}
return [...messages.slice(0, userIdx), merged, ...messages.slice(userIdx + 1)]
}
// ---------------------------------------------------------------------------
// AgentRunner
// ---------------------------------------------------------------------------
@@ -191,6 +220,10 @@ const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }
*/
export class AgentRunner {
private readonly maxTurns: number
private summarizeCache: {
oldSignature: string
summaryPrefix: string
} | null = null
constructor(
private readonly adapter: LLMAdapter,
@@ -201,6 +234,172 @@
this.maxTurns = options.maxTurns ?? 10
}
private serializeMessage(message: LLMMessage): string {
return JSON.stringify(message)
}
private truncateToSlidingWindow(messages: LLMMessage[], maxTurns: number): LLMMessage[] {
if (maxTurns <= 0) {
return messages
}
const firstUserIndex = messages.findIndex(m => m.role === 'user')
const firstUser = firstUserIndex >= 0 ? messages[firstUserIndex]! : null
const afterFirst = firstUserIndex >= 0
? messages.slice(firstUserIndex + 1)
: messages.slice()
if (afterFirst.length <= maxTurns * 2) {
return messages
}
const kept = afterFirst.slice(-maxTurns * 2)
const result: LLMMessage[] = []
if (firstUser !== null) {
result.push(firstUser)
}
const droppedPairs = Math.floor((afterFirst.length - kept.length) / 2)
if (droppedPairs > 0) {
const notice =
`[Earlier conversation history truncated — ${droppedPairs} turn(s) removed]\n\n`
result.push(...prependSyntheticPrefixToFirstUser(kept, notice))
return result
}
result.push(...kept)
return result
}
private async summarizeMessages(
messages: LLMMessage[],
maxTokens: number,
summaryModel: string | undefined,
baseChatOptions: LLMChatOptions,
turns: number,
options: RunOptions,
): Promise<{ messages: LLMMessage[]; usage: TokenUsage }> {
const estimated = estimateTokens(messages)
if (estimated <= maxTokens || messages.length < 4) {
return { messages, usage: ZERO_USAGE }
}
const firstUserIndex = messages.findIndex(m => m.role === 'user')
if (firstUserIndex < 0 || firstUserIndex === messages.length - 1) {
return { messages, usage: ZERO_USAGE }
}
const firstUser = messages[firstUserIndex]!
const rest = messages.slice(firstUserIndex + 1)
if (rest.length < 2) {
return { messages, usage: ZERO_USAGE }
}
// Split on an even boundary so we never separate a tool_use assistant turn
// from its tool_result user message (rest is user/assistant pairs).
const splitAt = Math.max(2, Math.floor(rest.length / 4) * 2)
const oldPortion = rest.slice(0, splitAt)
const recentPortion = rest.slice(splitAt)
const oldSignature = oldPortion.map(m => this.serializeMessage(m)).join('\n')
if (this.summarizeCache !== null && this.summarizeCache.oldSignature === oldSignature) {
const mergedRecent = prependSyntheticPrefixToFirstUser(
recentPortion,
`${this.summarizeCache.summaryPrefix}\n\n`,
)
return { messages: [firstUser, ...mergedRecent], usage: ZERO_USAGE }
}
const summaryPrompt = [
'Summarize the following conversation history for an LLM.',
'- Preserve user goals, constraints, and decisions.',
'- Keep key tool outputs and unresolved questions.',
'- Use concise bullets.',
'- Do not fabricate details.',
].join('\n')
const summaryInput: LLMMessage[] = [
{
role: 'user',
content: [
{ type: 'text', text: summaryPrompt },
{ type: 'text', text: `\n\nConversation:\n${oldSignature}` },
],
},
]
const summaryOptions: LLMChatOptions = {
...baseChatOptions,
model: summaryModel ?? this.options.model,
tools: undefined,
}
const summaryStartMs = Date.now()
const summaryResponse = await this.adapter.chat(summaryInput, summaryOptions)
if (options.onTrace) {
const summaryEndMs = Date.now()
emitTrace(options.onTrace, {
type: 'llm_call',
runId: options.runId ?? '',
taskId: options.taskId,
agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
model: summaryOptions.model,
phase: 'summary',
turn: turns,
tokens: summaryResponse.usage,
startMs: summaryStartMs,
endMs: summaryEndMs,
durationMs: summaryEndMs - summaryStartMs,
})
}
const summaryText = extractText(summaryResponse.content).trim()
const summaryPrefix = summaryText.length > 0
? `[Conversation summary]\n${summaryText}`
: '[Conversation summary unavailable]'
this.summarizeCache = { oldSignature, summaryPrefix }
const mergedRecent = prependSyntheticPrefixToFirstUser(
recentPortion,
`${summaryPrefix}\n\n`,
)
return {
messages: [firstUser, ...mergedRecent],
usage: summaryResponse.usage,
}
}
private async applyContextStrategy(
messages: LLMMessage[],
strategy: ContextStrategy,
baseChatOptions: LLMChatOptions,
turns: number,
options: RunOptions,
): Promise<{ messages: LLMMessage[]; usage: TokenUsage }> {
if (strategy.type === 'sliding-window') {
return { messages: this.truncateToSlidingWindow(messages, strategy.maxTurns), usage: ZERO_USAGE }
}
if (strategy.type === 'summarize') {
return this.summarizeMessages(
messages,
strategy.maxTokens,
strategy.summaryModel,
baseChatOptions,
turns,
options,
)
}
const estimated = estimateTokens(messages)
const compressed = await strategy.compress(messages, estimated)
if (!Array.isArray(compressed) || compressed.length === 0) {
throw new Error('contextStrategy.custom.compress must return a non-empty LLMMessage[]')
}
return { messages: compressed, usage: ZERO_USAGE }
}
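For the `custom` branch, the contract shown above is: `compress(messages, estimatedTokens)` must resolve to a non-empty array, or the runner throws. A hedged sketch of a user-supplied strategy honouring that contract (types simplified; the 8 000-token threshold is arbitrary):

```typescript
type Msg = { role: string; content: unknown[] }

// Keep the first message plus the last four once the estimate crosses a threshold.
const customStrategy = {
  type: 'custom' as const,
  compress: async (messages: Msg[], estimatedTokens: number): Promise<Msg[]> => {
    if (estimatedTokens < 8_000 || messages.length <= 5) return messages
    return [messages[0]!, ...messages.slice(-4)]
  },
}

const history: Msg[] = Array.from({ length: 10 }, (_, i) => ({
  role: i % 2 === 0 ? 'user' : 'assistant',
  content: [{ type: 'text', text: `turn ${i}` }],
}))
const compacted = await customStrategy.compress(history, 10_000)
console.log(compacted.length) // 5
```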
// -------------------------------------------------------------------------
// Tool resolution
// -------------------------------------------------------------------------
@@ -291,6 +490,8 @@ export class AgentRunner {
for await (const event of this.stream(messages, options)) {
if (event.type === 'done') {
Object.assign(accumulated, event.data)
} else if (event.type === 'error') {
throw event.data
}
}
@@ -313,7 +514,7 @@
options: RunOptions = {},
): AsyncGenerator<StreamEvent> {
// Working copy of the conversation — mutated as turns progress.
const conversationMessages: LLMMessage[] = [...initialMessages]
let conversationMessages: LLMMessage[] = [...initialMessages]
// Accumulated state across all turns.
let totalUsage: TokenUsage = ZERO_USAGE
@@ -363,6 +564,19 @@
turns++
// Optionally compact context before each LLM call after the first turn.
if (this.options.contextStrategy && turns > 1) {
const compacted = await this.applyContextStrategy(
conversationMessages,
this.options.contextStrategy,
baseChatOptions,
turns,
options,
)
conversationMessages = compacted.messages
totalUsage = addTokenUsage(totalUsage, compacted.usage)
}
// ------------------------------------------------------------------
// Step 1: Call the LLM and collect the full response for this turn.
// ------------------------------------------------------------------
@@ -376,6 +590,7 @@
taskId: options.taskId,
agent: options.traceAgent ?? this.options.agentName ?? 'unknown',
model: this.options.model,
phase: 'turn',
turn: turns,
tokens: response.usage,
startMs: llmStartMs,
@@ -495,7 +710,7 @@
// Parallel execution is critical for multi-tool responses where the
// tools are independent (e.g. reading several files at once).
// ------------------------------------------------------------------
const toolContext: ToolUseContext = this.buildToolContext()
const toolContext: ToolUseContext = this.buildToolContext(effectiveAbortSignal)
const executionPromises = toolUseBlocks.map(async (block): Promise<{
resultBlock: ToolResultBlock
@@ -630,14 +845,14 @@
* Build the {@link ToolUseContext} passed to every tool execution.
* Identifies this runner as the invoking agent.
*/
private buildToolContext(): ToolUseContext {
private buildToolContext(abortSignal?: AbortSignal): ToolUseContext {
return {
agent: {
name: this.options.agentName ?? 'runner',
role: this.options.agentRole ?? 'assistant',
model: this.options.model,
},
abortSignal: this.options.abortSignal,
abortSignal,
}
}
}


@@ -98,6 +98,7 @@ export {
fileReadTool,
fileWriteTool,
fileEditTool,
globTool,
grepTool,
} from './tool/built-in/index.js'
@@ -153,6 +154,7 @@ export type {
ToolCallRecord,
LoopDetectionConfig,
LoopDetectionInfo,
ContextStrategy,
// Team
TeamConfig,


@@ -163,6 +163,7 @@ function buildConfig(
toolConfig: options.tools
? { functionCallingConfig: { mode: FunctionCallingConfigMode.AUTO } }
: undefined,
abortSignal: options.abortSignal,
}
}

src/mcp.ts (new file, 5 lines)

@@ -0,0 +1,5 @@
export type {
ConnectMCPToolsConfig,
ConnectedMCPTools,
} from './tool/mcp.js'
export { connectMCPTools } from './tool/mcp.js'


@@ -433,10 +433,7 @@ async function executeQueue(
while (true) {
// Check for cancellation before each dispatch round.
if (ctx.abortSignal?.aborted) {
// Mark all remaining pending tasks as skipped.
for (const t of queue.getByStatus('pending')) {
queue.update(t.id, { status: 'skipped' as TaskStatus })
}
queue.skipRemaining('Skipped: run aborted.')
break
}


@@ -0,0 +1,97 @@
/**
* Shared recursive directory walk for built-in file tools.
*
* Used by {@link grepTool} and {@link globTool} so glob filtering and skip
* rules stay consistent.
*/
import { readdir, stat } from 'fs/promises'
import { join } from 'path'
/** Directories that are almost never useful to traverse for code search. */
export const SKIP_DIRS = new Set([
'.git',
'.svn',
'.hg',
'node_modules',
'.next',
'dist',
'build',
])
export interface CollectFilesOptions {
/** When set, stop collecting once this many paths are gathered. */
readonly maxFiles?: number
}
/**
* Recursively walk `dir` and return file paths, honouring {@link SKIP_DIRS}
* and an optional filename glob pattern.
*/
export async function collectFiles(
dir: string,
glob: string | undefined,
signal: AbortSignal | undefined,
options?: CollectFilesOptions,
): Promise<string[]> {
const results: string[] = []
await walk(dir, glob, results, signal, options?.maxFiles)
return results
}
async function walk(
dir: string,
glob: string | undefined,
results: string[],
signal: AbortSignal | undefined,
maxFiles: number | undefined,
): Promise<void> {
if (signal?.aborted === true) return
if (maxFiles !== undefined && results.length >= maxFiles) return
let entryNames: string[]
try {
entryNames = await readdir(dir, { encoding: 'utf8' })
} catch {
return
}
for (const entryName of entryNames) {
if (signal !== undefined && signal.aborted) return
if (maxFiles !== undefined && results.length >= maxFiles) return
const fullPath = join(dir, entryName)
let entryInfo: Awaited<ReturnType<typeof stat>>
try {
entryInfo = await stat(fullPath)
} catch {
continue
}
if (entryInfo.isDirectory()) {
if (!SKIP_DIRS.has(entryName)) {
await walk(fullPath, glob, results, signal, maxFiles)
}
} else if (entryInfo.isFile()) {
if (glob === undefined || matchesGlob(entryName, glob)) {
results.push(fullPath)
}
}
}
}
/**
 * Minimal glob match supporting `*.ext` and `**\/<pattern>` forms.
 */
export function matchesGlob(filename: string, glob: string): boolean {
const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
const regexSource = pattern
.replace(/[.+^${}()|[\]\\]/g, '\\$&')
.replace(/\*/g, '.*')
.replace(/\?/g, '.')
const re = new RegExp(`^${regexSource}$`, 'i')
return re.test(filename)
}
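A few sample calls against the matcher above (self-contained copy; note that regex metacharacters are escaped before wildcard expansion, and matching is case-insensitive):

```typescript
// Self-contained copy of matchesGlob from fs-walk.ts, with sample calls.
function matchesGlob(filename: string, glob: string): boolean {
  // Strip a leading **/ since the walker already recurses into every directory.
  const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
  const regexSource = pattern
    .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape regex metachars first (not * or ?)
    .replace(/\*/g, '.*')                 // * matches any run of characters
    .replace(/\?/g, '.')                  // ? matches exactly one character
  return new RegExp(`^${regexSource}$`, 'i').test(filename)
}

console.log(matchesGlob('index.ts', '*.ts'))         // true
console.log(matchesGlob('index.tsx', '*.ts'))        // false (no partial suffix match)
console.log(matchesGlob('config.json', '**/*.json')) // true
console.log(matchesGlob('README.TS', '*.ts'))        // true (case-insensitive)
```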

src/tool/built-in/glob.ts (new file, 99 lines)

@@ -0,0 +1,99 @@
/**
* Built-in glob tool.
*
* Lists file paths under a directory matching an optional filename glob.
* Does not read file contents; use {@link grepTool} to search inside files.
*/
import { stat } from 'fs/promises'
import { basename, relative } from 'path'
import { z } from 'zod'
import type { ToolResult } from '../../types.js'
import { collectFiles, matchesGlob } from './fs-walk.js'
import { defineTool } from '../framework.js'
const DEFAULT_MAX_FILES = 500
export const globTool = defineTool({
name: 'glob',
description:
'List file paths under a directory that match an optional filename glob. ' +
'Does not read file contents — use `grep` to search inside files. ' +
'Skips common bulky directories (node_modules, .git, dist, etc.). ' +
'Paths in the result are relative to the process working directory. ' +
'Results are capped by `maxFiles`.',
inputSchema: z.object({
path: z
.string()
.optional()
.describe(
'Directory to list files under. Defaults to the current working directory.',
),
pattern: z
.string()
.optional()
.describe(
'Filename glob (e.g. "*.ts", "**/*.json"). When omitted, every file ' +
'under the directory is listed (subject to maxFiles and skipped dirs).',
),
maxFiles: z
.number()
.int()
.positive()
.optional()
.describe(
`Maximum number of file paths to return. Defaults to ${DEFAULT_MAX_FILES}.`,
),
}),
execute: async (input, context): Promise<ToolResult> => {
const root = input.path ?? process.cwd()
const maxFiles = input.maxFiles ?? DEFAULT_MAX_FILES
const signal = context.abortSignal
let linesOut: string[]
let truncated = false
try {
const info = await stat(root)
if (info.isFile()) {
const name = basename(root)
if (
input.pattern !== undefined &&
!matchesGlob(name, input.pattern)
) {
return { data: 'No files matched.', isError: false }
}
linesOut = [relative(process.cwd(), root) || root]
} else {
const collected = await collectFiles(root, input.pattern, signal, {
maxFiles: maxFiles + 1,
})
truncated = collected.length > maxFiles
const capped = collected.slice(0, maxFiles)
linesOut = capped.map((f) => relative(process.cwd(), f) || f)
}
} catch (err) {
const message = err instanceof Error ? err.message : 'Unknown error'
return {
data: `Cannot access path "${root}": ${message}`,
isError: true,
}
}
if (linesOut.length === 0) {
return { data: 'No files matched.', isError: false }
}
const sorted = [...linesOut].sort((a, b) => a.localeCompare(b))
const truncationNote = truncated
? `\n\n(listing capped at ${maxFiles} paths; raise maxFiles for more)`
: ''
return {
data: sorted.join('\n') + truncationNote,
isError: false,
}
},
})


@@ -8,28 +8,18 @@
*/
import { spawn } from 'child_process'
import { readdir, readFile, stat } from 'fs/promises'
// Note: readdir is used with { encoding: 'utf8' } to return string[] directly.
import { join, relative } from 'path'
import { readFile, stat } from 'fs/promises'
import { relative } from 'path'
import { z } from 'zod'
import type { ToolResult } from '../../types.js'
import { defineTool } from '../framework.js'
import { collectFiles } from './fs-walk.js'
// ---------------------------------------------------------------------------
// Constants
// ---------------------------------------------------------------------------
const DEFAULT_MAX_RESULTS = 100
// Directories that are almost never useful to search inside
const SKIP_DIRS = new Set([
'.git',
'.svn',
'.hg',
'node_modules',
'.next',
'dist',
'build',
])
// ---------------------------------------------------------------------------
// Tool definition
@@ -42,6 +32,7 @@ export const grepTool = defineTool({
'Returns matching lines with their file paths and 1-based line numbers. ' +
'Use the `glob` parameter to restrict the search to specific file types ' +
'(e.g. "*.ts"). ' +
'To list matching file paths without reading contents, use the `glob` tool. ' +
'Results are capped by `maxResults` to keep the response manageable.',
inputSchema: z.object({
@@ -270,79 +261,6 @@ async function runNodeSearch(
}
}
// ---------------------------------------------------------------------------
// File collection with glob filtering
// ---------------------------------------------------------------------------
/**
* Recursively walk `dir` and return file paths, honouring `SKIP_DIRS` and an
* optional glob pattern.
*/
async function collectFiles(
dir: string,
glob: string | undefined,
signal: AbortSignal | undefined,
): Promise<string[]> {
const results: string[] = []
await walk(dir, glob, results, signal)
return results
}
async function walk(
dir: string,
glob: string | undefined,
results: string[],
signal: AbortSignal | undefined,
): Promise<void> {
if (signal?.aborted === true) return
let entryNames: string[]
try {
// Read as plain strings so we don't have to deal with Buffer Dirent variants.
entryNames = await readdir(dir, { encoding: 'utf8' })
} catch {
return
}
for (const entryName of entryNames) {
if (signal !== undefined && signal.aborted) return
const fullPath = join(dir, entryName)
let entryInfo: Awaited<ReturnType<typeof stat>>
try {
entryInfo = await stat(fullPath)
} catch {
continue
}
if (entryInfo.isDirectory()) {
if (!SKIP_DIRS.has(entryName)) {
await walk(fullPath, glob, results, signal)
}
} else if (entryInfo.isFile()) {
if (glob === undefined || matchesGlob(entryName, glob)) {
results.push(fullPath)
}
}
}
}
/**
* Minimal glob match supporting `*.ext` and `**\/<pattern>` forms.
*/
function matchesGlob(filename: string, glob: string): boolean {
// Strip leading **/ prefix — we already recurse into all directories
const pattern = glob.startsWith('**/') ? glob.slice(3) : glob
// Convert shell glob characters to regex equivalents
const regexSource = pattern
.replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape special regex chars first
.replace(/\*/g, '.*') // * -> .*
.replace(/\?/g, '.') // ? -> .
const re = new RegExp(`^${regexSource}$`, 'i')
return re.test(filename)
}
// ---------------------------------------------------------------------------
// ripgrep availability check (cached per process)
// ---------------------------------------------------------------------------


@ -11,9 +11,10 @@ import { bashTool } from './bash.js'
import { fileEditTool } from './file-edit.js'
import { fileReadTool } from './file-read.js'
import { fileWriteTool } from './file-write.js'
import { globTool } from './glob.js'
import { grepTool } from './grep.js'
export { bashTool, fileEditTool, fileReadTool, fileWriteTool, grepTool }
export { bashTool, fileEditTool, fileReadTool, fileWriteTool, globTool, grepTool }
/**
* The ordered list of all built-in tools. Import this when you need to
@@ -29,6 +30,7 @@ export const BUILT_IN_TOOLS: ToolDefinition<any>[] = [
fileWriteTool,
fileEditTool,
grepTool,
globTool,
]
/**


@@ -72,12 +72,19 @@ export function defineTool<TInput>(config: {
name: string
description: string
inputSchema: ZodSchema<TInput>
/**
* Optional JSON Schema for the LLM (bypasses Zod JSON Schema conversion).
*/
llmInputSchema?: Record<string, unknown>
execute: (input: TInput, context: ToolUseContext) => Promise<ToolResult>
}): ToolDefinition<TInput> {
return {
name: config.name,
description: config.description,
inputSchema: config.inputSchema,
...(config.llmInputSchema !== undefined
? { llmInputSchema: config.llmInputSchema }
: {}),
execute: config.execute,
}
}
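A self-contained sketch of how the new `llmInputSchema` pass-through behaves (simplified copy of `defineTool` from this diff; the `echo` tool is hypothetical, and the empty-object `inputSchema` stands in for `z.any()`):

```typescript
type ToolConfig = {
  name: string
  description: string
  inputSchema: unknown // z.any() in the real framework
  llmInputSchema?: Record<string, unknown>
  execute: (input: unknown) => Promise<{ data: string; isError: boolean }>
}

// Simplified copy of defineTool: the conditional spread only adds the key
// when llmInputSchema was supplied.
function defineTool(config: ToolConfig) {
  return {
    name: config.name,
    description: config.description,
    inputSchema: config.inputSchema,
    ...(config.llmInputSchema !== undefined ? { llmInputSchema: config.llmInputSchema } : {}),
    execute: config.execute,
  }
}

const echoTool = defineTool({
  name: 'echo',
  description: 'Echo back the given text.',
  inputSchema: {},
  llmInputSchema: { properties: { text: { type: 'string' } }, required: ['text'] },
  execute: async (input) => ({ data: String((input as { text: string }).text), isError: false }),
})
const plainTool = defineTool({
  name: 'noop',
  description: 'Do nothing.',
  inputSchema: {},
  execute: async () => ({ data: '', isError: false }),
})
console.log('llmInputSchema' in echoTool)  // true
console.log('llmInputSchema' in plainTool) // false
```

With the key absent rather than `undefined`, the registry's `tool.llmInputSchema ?? zodToJsonSchema(...)` fallback behaves the same either way, but serialized tool definitions stay clean.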
@@ -169,7 +176,8 @@
*/
toToolDefs(): LLMToolDef[] {
return Array.from(this.tools.values()).map((tool) => {
const schema = zodToJsonSchema(tool.inputSchema)
const schema =
tool.llmInputSchema ?? zodToJsonSchema(tool.inputSchema)
return {
name: tool.name,
description: tool.description,
@@ -194,13 +202,20 @@
toLLMTools(): Array<{
name: string
description: string
input_schema: {
type: 'object'
properties: Record<string, JSONSchemaProperty>
required?: string[]
}
/** Anthropic-style tool input JSON Schema (`type` is usually `object`). */
input_schema: Record<string, unknown>
}> {
return Array.from(this.tools.values()).map((tool) => {
if (tool.llmInputSchema !== undefined) {
return {
name: tool.name,
description: tool.description,
input_schema: {
type: 'object' as const,
...(tool.llmInputSchema as Record<string, unknown>),
},
}
}
const schema = zodToJsonSchema(tool.inputSchema)
return {
name: tool.name,

src/tool/mcp.ts (new file, 296 lines)

@@ -0,0 +1,296 @@
import { z } from 'zod'
import { defineTool } from './framework.js'
import type { ToolDefinition } from '../types.js'
interface MCPToolDescriptor {
name: string
description?: string
/** MCP tool JSON Schema; same shape LLM APIs expect for object parameters. */
inputSchema?: Record<string, unknown>
}
interface MCPListToolsResponse {
tools?: MCPToolDescriptor[]
nextCursor?: string
}
interface MCPCallToolResponse {
content?: Array<Record<string, unknown>>
structuredContent?: unknown
isError?: boolean
toolResult?: unknown
}
interface MCPClientLike {
connect(transport: unknown, options?: { timeout?: number; signal?: AbortSignal }): Promise<void>
listTools(
params?: { cursor?: string },
options?: { timeout?: number; signal?: AbortSignal },
): Promise<MCPListToolsResponse>
callTool(
request: { name: string; arguments: Record<string, unknown> },
resultSchema?: unknown,
options?: { timeout?: number; signal?: AbortSignal },
): Promise<MCPCallToolResponse>
close?: () => Promise<void>
}
type MCPClientConstructor = new (
info: { name: string; version: string },
options: { capabilities: Record<string, unknown> },
) => MCPClientLike
type StdioTransportConstructor = new (config: {
command: string
args?: string[]
env?: Record<string, string | undefined>
cwd?: string
}) => { close?: () => Promise<void> }
interface MCPModules {
Client: MCPClientConstructor
StdioClientTransport: StdioTransportConstructor
}
const DEFAULT_MCP_REQUEST_TIMEOUT_MS = 60_000
async function loadMCPModules(): Promise<MCPModules> {
const [{ Client }, { StdioClientTransport }] = await Promise.all([
import('@modelcontextprotocol/sdk/client/index.js') as Promise<{
Client: MCPClientConstructor
}>,
import('@modelcontextprotocol/sdk/client/stdio.js') as Promise<{
StdioClientTransport: StdioTransportConstructor
}>,
])
return { Client, StdioClientTransport }
}
export interface ConnectMCPToolsConfig {
command: string
args?: string[]
env?: Record<string, string | undefined>
cwd?: string
/**
* Optional segment prepended to MCP tool names for the framework tool (and LLM) name.
* Example: prefix `github` + MCP tool `search_issues` → `github_search_issues`.
*/
namePrefix?: string
/**
* Timeout (ms) for MCP connect and each `tools/list` page. Defaults to 60000.
*/
requestTimeoutMs?: number
/**
* Client metadata sent to the MCP server.
*/
clientName?: string
clientVersion?: string
}
export interface ConnectedMCPTools {
tools: ToolDefinition[]
disconnect: () => Promise<void>
}
/**
* Build an LLM-safe tool name: MCP and prior examples used `prefix/name`, but
* Anthropic and other providers reject `/` in tool names.
*/
function normalizeToolName(rawName: string, namePrefix?: string): string {
const trimmedPrefix = namePrefix?.trim()
const base =
trimmedPrefix !== undefined && trimmedPrefix !== ''
? `${trimmedPrefix}_${rawName}`
: rawName
return base.replace(/\//g, '_')
}
/** MCP `tools/list` JSON Schema; forwarded to the LLM as-is (runtime validation stays `z.any()`). */
function mcpLlmInputSchema(
schema: Record<string, unknown> | undefined,
): Record<string, unknown> {
if (schema !== undefined && typeof schema === 'object' && !Array.isArray(schema)) {
return schema
}
return { type: 'object' }
}
function contentBlockToText(block: Record<string, unknown>): string | undefined {
const typ = block.type
if (typ === 'text' && typeof block.text === 'string') {
return block.text
}
if (typ === 'image' && typeof block.data === 'string') {
const mime =
typeof block.mimeType === 'string' ? block.mimeType : 'image/*'
return `[image ${mime}; base64 length=${block.data.length}]`
}
if (typ === 'audio' && typeof block.data === 'string') {
const mime =
typeof block.mimeType === 'string' ? block.mimeType : 'audio/*'
return `[audio ${mime}; base64 length=${block.data.length}]`
}
if (
typ === 'resource' &&
block.resource !== null &&
typeof block.resource === 'object'
) {
const r = block.resource as Record<string, unknown>
const uri = typeof r.uri === 'string' ? r.uri : ''
if (typeof r.text === 'string') {
return `[resource ${uri}]\n${r.text}`
}
if (typeof r.blob === 'string') {
const mime = typeof r.mimeType === 'string' ? r.mimeType : ''
return `[resource ${uri}; mimeType=${mime}; blob base64 length=${r.blob.length}]`
}
return `[resource ${uri}]`
}
if (typ === 'resource_link') {
const uri = typeof block.uri === 'string' ? block.uri : ''
const name = typeof block.name === 'string' ? block.name : ''
const desc =
typeof block.description === 'string' ? block.description : ''
const head = `[resource_link name=${JSON.stringify(name)} uri=${JSON.stringify(uri)}]`
return desc === '' ? head : `${head}\n${desc}`
}
return undefined
}
function toToolResultData(result: MCPCallToolResponse): string {
if ('toolResult' in result && result.toolResult !== undefined) {
try {
return JSON.stringify(result.toolResult, null, 2)
} catch {
return String(result.toolResult)
}
}
const lines: string[] = []
for (const block of result.content ?? []) {
if (block === null || typeof block !== 'object') continue
const rec = block as Record<string, unknown>
const line = contentBlockToText(rec)
if (line !== undefined) {
lines.push(line)
continue
}
try {
lines.push(
`[${String(rec.type ?? 'unknown')}]\n${JSON.stringify(rec, null, 2)}`,
)
} catch {
lines.push('[mcp content block]')
}
}
if (lines.length > 0) {
return lines.join('\n')
}
if (result.structuredContent !== undefined) {
try {
return JSON.stringify(result.structuredContent, null, 2)
} catch {
return String(result.structuredContent)
}
}
try {
return JSON.stringify(result)
} catch {
return 'MCP tool completed with non-text output.'
}
}
async function listAllMcpTools(
client: MCPClientLike,
requestOpts: { timeout: number },
): Promise<MCPToolDescriptor[]> {
const acc: MCPToolDescriptor[] = []
let cursor: string | undefined
do {
const page = await client.listTools(
cursor !== undefined ? { cursor } : {},
requestOpts,
)
acc.push(...(page.tools ?? []))
cursor =
typeof page.nextCursor === 'string' && page.nextCursor !== ''
? page.nextCursor
: undefined
} while (cursor !== undefined)
return acc
}
/**
* Connect to an MCP server over stdio and convert exposed MCP tools into
* open-multi-agent ToolDefinitions.
*/
export async function connectMCPTools(
config: ConnectMCPToolsConfig,
): Promise<ConnectedMCPTools> {
const { Client, StdioClientTransport } = await loadMCPModules()
const transport = new StdioClientTransport({
command: config.command,
args: config.args ?? [],
env: config.env,
cwd: config.cwd,
})
const client = new Client(
{
name: config.clientName ?? 'open-multi-agent',
version: config.clientVersion ?? '0.0.0',
},
{ capabilities: {} },
)
const requestOpts = {
timeout: config.requestTimeoutMs ?? DEFAULT_MCP_REQUEST_TIMEOUT_MS,
}
await client.connect(transport, requestOpts)
const mcpTools = await listAllMcpTools(client, requestOpts)
const tools: ToolDefinition[] = mcpTools.map((tool) =>
defineTool({
name: normalizeToolName(tool.name, config.namePrefix),
description: tool.description ?? `MCP tool: ${tool.name}`,
inputSchema: z.any(),
llmInputSchema: mcpLlmInputSchema(tool.inputSchema),
execute: async (input: Record<string, unknown>) => {
try {
const result = await client.callTool(
{
name: tool.name,
arguments: input,
},
undefined,
requestOpts,
)
return {
data: toToolResultData(result),
isError: result.isError === true,
}
} catch (error) {
const message =
error instanceof Error ? error.message : String(error)
return {
data: `MCP tool "${tool.name}" failed: ${message}`,
isError: true,
}
}
},
}),
)
return {
tools,
disconnect: async () => {
await client.close?.()
},
}
}
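The naming contract above (prefix join plus `/` → `_` replacement) can be sanity-checked with a standalone sketch. The helper is restated here for illustration only; the module does not export `normalizeToolName`:

```typescript
// Standalone restatement of normalizeToolName from src/tool/mcp.ts
// (illustrative copy, not an import).
function normalizeToolName(rawName: string, namePrefix?: string): string {
  const trimmedPrefix = namePrefix?.trim()
  const base =
    trimmedPrefix !== undefined && trimmedPrefix !== ''
      ? `${trimmedPrefix}_${rawName}`
      : rawName
  return base.replace(/\//g, '_')
}

// Prefixed MCP tool names become provider-safe identifiers.
console.log(normalizeToolName('search_issues', 'github')) // github_search_issues
// Slashes in raw MCP names are rewritten even without a prefix.
console.log(normalizeToolName('fs/read_file')) // fs_read_file
```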


@@ -65,6 +65,18 @@ export interface LLMMessage {
readonly content: ContentBlock[]
}
/** Context management strategy for long-running agent conversations. */
export type ContextStrategy =
| { type: 'sliding-window'; maxTurns: number }
| { type: 'summarize'; maxTokens: number; summaryModel?: string }
| {
type: 'custom'
compress: (
messages: LLMMessage[],
estimatedTokens: number,
) => Promise<LLMMessage[]> | LLMMessage[]
}
/** Token accounting for a single API call. */
export interface TokenUsage {
readonly input_tokens: number
@@ -170,12 +182,18 @@ export interface ToolResult {
* A tool registered with the framework.
*
* `inputSchema` is a Zod schema used for validation before `execute` is called.
* At API call time it is converted to JSON Schema via {@link LLMToolDef}.
* At API call time it is converted to JSON Schema for {@link LLMToolDef}, unless
* `llmInputSchema` is set (e.g. MCP tools ship JSON Schema from the server).
*/
export interface ToolDefinition<TInput = Record<string, unknown>> {
readonly name: string
readonly description: string
readonly inputSchema: ZodSchema<TInput>
/**
* When present, used as {@link LLMToolDef.inputSchema} as-is instead of
* deriving JSON Schema from `inputSchema` (Zod).
*/
readonly llmInputSchema?: Record<string, unknown>
execute(input: TInput, context: ToolUseContext): Promise<ToolResult>
}
@@ -215,6 +233,8 @@ export interface AgentConfig {
readonly maxTokens?: number
/** Maximum cumulative tokens (input + output) allowed for this run. */
readonly maxTokenBudget?: number
/** Optional context compression policy to control input growth across turns. */
readonly contextStrategy?: ContextStrategy
readonly temperature?: number
/**
* Maximum wall-clock time (in milliseconds) for the entire agent run.
@@ -493,6 +513,8 @@ export interface TraceEventBase {
export interface LLMCallTrace extends TraceEventBase {
readonly type: 'llm_call'
readonly model: string
/** Distinguishes normal turn calls from context-summary calls. */
readonly phase?: 'turn' | 'summary'
readonly turn: number
readonly tokens: TokenUsage
}
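For illustration, the `ContextStrategy` union above admits configurations like the following. All concrete values (turn counts, token thresholds, the model id) are placeholders, not recommendations from this PR:

```typescript
// Hypothetical config fragments exercising each ContextStrategy variant.
const slidingWindow = { type: 'sliding-window', maxTurns: 6 } as const

const summarize = {
  type: 'summarize',
  maxTokens: 50_000,
  summaryModel: 'some-cheap-model', // hypothetical model id
} as const

// A custom strategy: keep the first user message plus the last four messages.
const custom = {
  type: 'custom',
  compress: (messages: Array<{ role: string }>, _estimatedTokens: number) =>
    messages.length <= 5 ? messages : [messages[0], ...messages.slice(-4)],
}
```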

src/utils/tokens.ts Normal file

@@ -0,0 +1,27 @@
import type { LLMMessage } from '../types.js'
/**
* Estimate token count using a lightweight character heuristic.
* This intentionally avoids model-specific tokenizer dependencies.
*/
export function estimateTokens(messages: LLMMessage[]): number {
let chars = 0
for (const message of messages) {
for (const block of message.content) {
if (block.type === 'text') {
chars += block.text.length
} else if (block.type === 'tool_result') {
chars += block.content.length
} else if (block.type === 'tool_use') {
chars += JSON.stringify(block.input).length
} else if (block.type === 'image') {
// Account for non-text payloads with a small fixed cost.
chars += 64
}
}
}
// Conservative English heuristic: ~4 chars per token.
return Math.ceil(chars / 4)
}
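As a worked check of the heuristic (arithmetic only, mirroring the text and tool_use branches above):

```typescript
// A turn with one text block and one tool_use block, accounted as in
// estimateTokens: text contributes its length, tool_use contributes
// JSON.stringify(input).length.
const textChars = 'hello world'.length                   // 11
const toolUseChars = JSON.stringify({ q: 'bug' }).length // '{"q":"bug"}' -> 11
const estimated = Math.ceil((textChars + toolUseChars) / 4)
console.log(estimated) // 6
```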


@@ -0,0 +1,279 @@
/**
* Targeted tests for abort signal propagation fixes (#99, #100, #101).
*
* - #99: Per-call abortSignal must reach tool execution context
* - #100: Abort path in executeQueue must skip blocked tasks and emit events
* - #101: Gemini adapter must forward abortSignal to the SDK
*/
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import { TaskQueue } from '../src/task/queue.js'
import { createTask } from '../src/task/task.js'
import { z } from 'zod'
import type { LLMAdapter, LLMMessage, ToolUseContext } from '../src/types.js'
// ---------------------------------------------------------------------------
// #99 — Per-call abortSignal propagated to tool context
// ---------------------------------------------------------------------------
describe('Per-call abortSignal reaches tool context (#99)', () => {
it('tool receives per-call abortSignal, not static runner signal', async () => {
// Track the abortSignal passed to the tool
let receivedSignal: AbortSignal | undefined
const spy = defineTool({
name: 'spy',
description: 'Captures the abort signal from context.',
inputSchema: z.object({}),
execute: async (_input, context) => {
receivedSignal = context.abortSignal
return { data: 'ok', isError: false }
},
})
const registry = new ToolRegistry()
registry.register(spy)
const executor = new ToolExecutor(registry)
// Adapter returns one tool_use then end_turn
const adapter: LLMAdapter = {
name: 'mock',
chat: vi.fn()
.mockResolvedValueOnce({
id: '1',
content: [{ type: 'tool_use', id: 'call-1', name: 'spy', input: {} }],
model: 'mock',
stop_reason: 'tool_use',
usage: { input_tokens: 0, output_tokens: 0 },
})
.mockResolvedValueOnce({
id: '2',
content: [{ type: 'text', text: 'done' }],
model: 'mock',
stop_reason: 'end_turn',
usage: { input_tokens: 0, output_tokens: 0 },
}),
async *stream() { /* unused */ },
}
const perCallController = new AbortController()
// Runner created WITHOUT a static abortSignal
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock',
agentName: 'test',
})
const messages: LLMMessage[] = [
{ role: 'user', content: [{ type: 'text', text: 'go' }] },
]
await runner.run(messages, { abortSignal: perCallController.signal })
// The tool must have received the per-call signal, not undefined
expect(receivedSignal).toBe(perCallController.signal)
})
it('tool receives static signal when no per-call signal is provided', async () => {
let receivedSignal: AbortSignal | undefined
const spy = defineTool({
name: 'spy',
description: 'Captures the abort signal from context.',
inputSchema: z.object({}),
execute: async (_input, context) => {
receivedSignal = context.abortSignal
return { data: 'ok', isError: false }
},
})
const registry = new ToolRegistry()
registry.register(spy)
const executor = new ToolExecutor(registry)
const staticController = new AbortController()
const adapter: LLMAdapter = {
name: 'mock',
chat: vi.fn()
.mockResolvedValueOnce({
id: '1',
content: [{ type: 'tool_use', id: 'call-1', name: 'spy', input: {} }],
model: 'mock',
stop_reason: 'tool_use',
usage: { input_tokens: 0, output_tokens: 0 },
})
.mockResolvedValueOnce({
id: '2',
content: [{ type: 'text', text: 'done' }],
model: 'mock',
stop_reason: 'end_turn',
usage: { input_tokens: 0, output_tokens: 0 },
}),
async *stream() { /* unused */ },
}
// Runner created WITH a static abortSignal, no per-call signal
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock',
agentName: 'test',
abortSignal: staticController.signal,
})
const messages: LLMMessage[] = [
{ role: 'user', content: [{ type: 'text', text: 'go' }] },
]
await runner.run(messages)
expect(receivedSignal).toBe(staticController.signal)
})
})
// ---------------------------------------------------------------------------
// #100 — Abort path skips blocked tasks and emits events
// ---------------------------------------------------------------------------
describe('Abort path skips blocked tasks and emits events (#100)', () => {
function task(id: string, opts: { dependsOn?: string[]; assignee?: string } = {}) {
const t = createTask({ title: id, description: `task ${id}`, assignee: opts.assignee })
return { ...t, id, dependsOn: opts.dependsOn } as ReturnType<typeof createTask>
}
it('skipRemaining transitions blocked tasks to skipped', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
// 'b' should be blocked because it depends on 'a'
expect(q.getByStatus('blocked').length).toBe(1)
q.skipRemaining('Skipped: run aborted.')
// Both tasks should be skipped — including the blocked one
const all = q.list()
expect(all.every(t => t.status === 'skipped')).toBe(true)
expect(q.getByStatus('blocked').length).toBe(0)
})
it('skipRemaining emits task:skipped for every non-terminal task', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
const handler = vi.fn()
q.on('task:skipped', handler)
q.skipRemaining('Skipped: run aborted.')
// Both pending 'a' and blocked 'b' must trigger events
expect(handler).toHaveBeenCalledTimes(2)
const ids = handler.mock.calls.map((c: any[]) => c[0].id)
expect(ids).toContain('a')
expect(ids).toContain('b')
})
it('skipRemaining fires all:complete after skipping', () => {
const q = new TaskQueue()
q.add(task('a'))
q.add(task('b', { dependsOn: ['a'] }))
const completeHandler = vi.fn()
q.on('all:complete', completeHandler)
q.skipRemaining('Skipped: run aborted.')
expect(completeHandler).toHaveBeenCalledTimes(1)
expect(q.isComplete()).toBe(true)
})
})
// ---------------------------------------------------------------------------
// #101 — Gemini adapter forwards abortSignal to SDK config
// ---------------------------------------------------------------------------
const mockGenerateContent = vi.hoisted(() => vi.fn())
const mockGenerateContentStream = vi.hoisted(() => vi.fn())
const GoogleGenAIMock = vi.hoisted(() =>
vi.fn(() => ({
models: {
generateContent: mockGenerateContent,
generateContentStream: mockGenerateContentStream,
},
})),
)
vi.mock('@google/genai', () => ({
GoogleGenAI: GoogleGenAIMock,
FunctionCallingConfigMode: { AUTO: 'AUTO' },
}))
import { GeminiAdapter } from '../src/llm/gemini.js'
describe('Gemini adapter forwards abortSignal (#101)', () => {
let adapter: GeminiAdapter
function makeGeminiResponse(parts: Array<Record<string, unknown>>) {
return {
candidates: [{
content: { parts },
finishReason: 'STOP',
}],
usageMetadata: { promptTokenCount: 10, candidatesTokenCount: 5 },
}
}
async function* asyncGen<T>(items: T[]): AsyncGenerator<T> {
for (const item of items) yield item
}
beforeEach(() => {
vi.clearAllMocks()
adapter = new GeminiAdapter('test-key')
})
it('chat() passes abortSignal in config', async () => {
mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'hi' }]))
const controller = new AbortController()
await adapter.chat(
[{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
{ model: 'gemini-2.5-flash', abortSignal: controller.signal },
)
const callArgs = mockGenerateContent.mock.calls[0][0]
expect(callArgs.config.abortSignal).toBe(controller.signal)
})
it('chat() does not include abortSignal when not provided', async () => {
mockGenerateContent.mockResolvedValue(makeGeminiResponse([{ text: 'hi' }]))
await adapter.chat(
[{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
{ model: 'gemini-2.5-flash' },
)
const callArgs = mockGenerateContent.mock.calls[0][0]
expect(callArgs.config.abortSignal).toBeUndefined()
})
it('stream() passes abortSignal in config', async () => {
const chunk = makeGeminiResponse([{ text: 'hi' }])
mockGenerateContentStream.mockResolvedValue(asyncGen([chunk]))
const controller = new AbortController()
const events: unknown[] = []
for await (const e of adapter.stream(
[{ role: 'user', content: [{ type: 'text' as const, text: 'hello' }] }],
{ model: 'gemini-2.5-flash', abortSignal: controller.signal },
)) {
events.push(e)
}
const callArgs = mockGenerateContentStream.mock.calls[0][0]
expect(callArgs.config.abortSignal).toBe(controller.signal)
})
})


@@ -6,6 +6,7 @@ import { fileReadTool } from '../src/tool/built-in/file-read.js'
import { fileWriteTool } from '../src/tool/built-in/file-write.js'
import { fileEditTool } from '../src/tool/built-in/file-edit.js'
import { bashTool } from '../src/tool/built-in/bash.js'
import { globTool } from '../src/tool/built-in/glob.js'
import { grepTool } from '../src/tool/built-in/grep.js'
import { registerBuiltInTools, BUILT_IN_TOOLS } from '../src/tool/built-in/index.js'
import { ToolRegistry } from '../src/tool/framework.js'
@@ -34,7 +35,7 @@ afterEach(async () => {
// ===========================================================================
describe('registerBuiltInTools', () => {
it('registers all 5 built-in tools', () => {
it('registers all 6 built-in tools', () => {
const registry = new ToolRegistry()
registerBuiltInTools(registry)
@@ -43,10 +44,11 @@ describe('registerBuiltInTools', () => {
expect(registry.get('file_write')).toBeDefined()
expect(registry.get('file_edit')).toBeDefined()
expect(registry.get('grep')).toBeDefined()
expect(registry.get('glob')).toBeDefined()
})
it('BUILT_IN_TOOLS has correct length', () => {
expect(BUILT_IN_TOOLS).toHaveLength(5)
expect(BUILT_IN_TOOLS).toHaveLength(6)
})
})
@@ -305,6 +307,102 @@ describe('bash', () => {
})
})
// ===========================================================================
// glob
// ===========================================================================
describe('glob', () => {
it('lists files matching a pattern without reading contents', async () => {
await writeFile(join(tmpDir, 'a.ts'), 'SECRET_CONTENT_SHOULD_NOT_APPEAR')
await writeFile(join(tmpDir, 'b.md'), 'also secret')
const result = await globTool.execute(
{ path: tmpDir, pattern: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('.ts')
expect(result.data).not.toContain('SECRET')
expect(result.data).not.toContain('b.md')
})
it('lists all files when pattern is omitted', async () => {
await writeFile(join(tmpDir, 'x.txt'), 'x')
await writeFile(join(tmpDir, 'y.txt'), 'y')
const result = await globTool.execute({ path: tmpDir }, defaultContext)
expect(result.isError).toBe(false)
expect(result.data).toContain('x.txt')
expect(result.data).toContain('y.txt')
})
it('lists a single file when path is a file', async () => {
const filePath = join(tmpDir, 'only.ts')
await writeFile(filePath, 'body')
const result = await globTool.execute({ path: filePath }, defaultContext)
expect(result.isError).toBe(false)
expect(result.data).toContain('only.ts')
})
it('returns no match when single file does not match pattern', async () => {
const filePath = join(tmpDir, 'readme.md')
await writeFile(filePath, '# doc')
const result = await globTool.execute(
{ path: filePath, pattern: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('No files matched')
})
it('recurses into subdirectories', async () => {
const sub = join(tmpDir, 'nested')
const { mkdir } = await import('fs/promises')
await mkdir(sub, { recursive: true })
await writeFile(join(sub, 'deep.ts'), '')
const result = await globTool.execute(
{ path: tmpDir, pattern: '*.ts' },
defaultContext,
)
expect(result.isError).toBe(false)
expect(result.data).toContain('deep.ts')
})
it('errors on inaccessible path', async () => {
const result = await globTool.execute(
{ path: '/nonexistent/path/xyz' },
defaultContext,
)
expect(result.isError).toBe(true)
expect(result.data).toContain('Cannot access path')
})
it('notes truncation when maxFiles is exceeded', async () => {
for (let i = 0; i < 5; i++) {
await writeFile(join(tmpDir, `f${i}.txt`), '')
}
const result = await globTool.execute(
{ path: tmpDir, pattern: '*.txt', maxFiles: 3 },
defaultContext,
)
expect(result.isError).toBe(false)
const lines = (result.data as string).split('\n').filter((l) => l.endsWith('.txt'))
expect(lines).toHaveLength(3)
expect(result.data).toContain('capped at 3')
})
})
// ===========================================================================
// grep (Node.js fallback — tests do not depend on ripgrep availability)
// ===========================================================================


@@ -0,0 +1,202 @@
import { describe, it, expect, vi } from 'vitest'
import { z } from 'zod'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry, defineTool } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import type { LLMAdapter, LLMChatOptions, LLMMessage, LLMResponse, TraceEvent } from '../src/types.js'
function textResponse(text: string): LLMResponse {
return {
id: `resp-${Math.random().toString(36).slice(2)}`,
content: [{ type: 'text', text }],
model: 'mock-model',
stop_reason: 'end_turn',
usage: { input_tokens: 10, output_tokens: 20 },
}
}
function toolUseResponse(toolName: string, input: Record<string, unknown>): LLMResponse {
return {
id: `resp-${Math.random().toString(36).slice(2)}`,
content: [{
type: 'tool_use',
id: `tu-${Math.random().toString(36).slice(2)}`,
name: toolName,
input,
}],
model: 'mock-model',
stop_reason: 'tool_use',
usage: { input_tokens: 15, output_tokens: 25 },
}
}
function buildRegistryAndExecutor(): { registry: ToolRegistry; executor: ToolExecutor } {
const registry = new ToolRegistry()
registry.register(
defineTool({
name: 'echo',
description: 'Echo input',
inputSchema: z.object({ message: z.string() }),
async execute({ message }) {
return { data: message }
},
}),
)
return { registry, executor: new ToolExecutor(registry) }
}
describe('AgentRunner contextStrategy', () => {
it('keeps baseline behavior when contextStrategy is not set', async () => {
const calls: LLMMessage[][] = []
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return calls.length === 1
? toolUseResponse('echo', { message: 'hello' })
: textResponse('done')
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 4,
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'start' }] }])
expect(calls).toHaveLength(2)
expect(calls[0]).toHaveLength(1)
expect(calls[1]!.length).toBeGreaterThan(calls[0]!.length)
})
it('sliding-window truncates old turns and preserves the first user message', async () => {
const calls: LLMMessage[][] = []
const responses = [
toolUseResponse('echo', { message: 't1' }),
toolUseResponse('echo', { message: 't2' }),
toolUseResponse('echo', { message: 't3' }),
textResponse('done'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: { type: 'sliding-window', maxTurns: 1 },
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'original prompt' }] }])
const laterCall = calls[calls.length - 1]!
const firstUserText = laterCall[0]!.content[0]
expect(firstUserText).toMatchObject({ type: 'text', text: 'original prompt' })
const flattenedText = laterCall.flatMap(m => m.content.filter(c => c.type === 'text'))
expect(flattenedText.some(c => c.type === 'text' && c.text.includes('truncated'))).toBe(true)
})
it('summarize strategy replaces old context and emits summary trace call', async () => {
const calls: Array<{ messages: LLMMessage[]; options: LLMChatOptions }> = []
const traces: TraceEvent[] = []
const responses = [
toolUseResponse('echo', { message: 'first turn payload '.repeat(20) }),
toolUseResponse('echo', { message: 'second turn payload '.repeat(20) }),
textResponse('This is a concise summary.'),
textResponse('final answer'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages, options) {
calls.push({ messages: messages.map(m => ({ role: m.role, content: m.content })), options })
return responses[idx++]!
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 8,
contextStrategy: { type: 'summarize', maxTokens: 20 },
})
const result = await runner.run(
[{ role: 'user', content: [{ type: 'text', text: 'start' }] }],
{ onTrace: (e) => { traces.push(e) }, runId: 'run-summary', traceAgent: 'context-agent' },
)
const summaryCall = calls.find(c => c.messages.length === 1 && c.options.tools === undefined)
expect(summaryCall).toBeDefined()
const llmTraces = traces.filter(t => t.type === 'llm_call')
expect(llmTraces.some(t => t.type === 'llm_call' && t.phase === 'summary')).toBe(true)
// Summary adapter usage must count toward RunResult.tokenUsage (maxTokenBudget).
expect(result.tokenUsage.input_tokens).toBe(15 + 15 + 10 + 10)
expect(result.tokenUsage.output_tokens).toBe(25 + 25 + 20 + 20)
// After compaction, summary text is folded into the next user turn (not a
// standalone user message), preserving user/assistant alternation.
const turnAfterSummary = calls.find(
c => c.messages.some(
m => m.role === 'user' && m.content.some(
b => b.type === 'text' && b.text.includes('[Conversation summary]'),
),
),
)
expect(turnAfterSummary).toBeDefined()
const rolesAfterFirstUser = turnAfterSummary!.messages.map(m => m.role).join(',')
expect(rolesAfterFirstUser).not.toMatch(/^user,user/)
})
it('custom strategy calls compress callback and uses returned messages', async () => {
const compress = vi.fn((messages: LLMMessage[]) => messages.slice(-1))
const calls: LLMMessage[][] = []
const responses = [
toolUseResponse('echo', { message: 'hello' }),
textResponse('done'),
]
let idx = 0
const adapter: LLMAdapter = {
name: 'mock',
async chat(messages) {
calls.push(messages.map(m => ({ role: m.role, content: m.content })))
return responses[idx++]!
},
async *stream() {
/* unused */
},
}
const { registry, executor } = buildRegistryAndExecutor()
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
allowedTools: ['echo'],
maxTurns: 4,
contextStrategy: {
type: 'custom',
compress,
},
})
await runner.run([{ role: 'user', content: [{ type: 'text', text: 'custom prompt' }] }])
expect(compress).toHaveBeenCalledOnce()
expect(calls[1]).toHaveLength(1)
})
})

tests/mcp-tools.test.ts Normal file

@@ -0,0 +1,211 @@
import { describe, it, expect, beforeEach, vi } from 'vitest'
import type { ToolUseContext } from '../src/types.js'
import { ToolRegistry } from '../src/tool/framework.js'
const listToolsMock = vi.fn()
const callToolMock = vi.fn()
const connectMock = vi.fn()
const clientCloseMock = vi.fn()
const transportCloseMock = vi.fn()
class MockClient {
async connect(
transport: unknown,
_options?: { timeout?: number },
): Promise<void> {
connectMock(transport)
}
async listTools(
params?: { cursor?: string },
options?: { timeout?: number },
): Promise<{
tools: Array<{
name: string
description: string
inputSchema?: Record<string, unknown>
}>
nextCursor?: string
}> {
return listToolsMock(params, options)
}
async callTool(
request: { name: string; arguments: Record<string, unknown> },
resultSchema?: unknown,
options?: { timeout?: number },
): Promise<{
content?: Array<Record<string, unknown>>
structuredContent?: unknown
isError?: boolean
toolResult?: unknown
}> {
return callToolMock(request, resultSchema, options)
}
async close(): Promise<void> {
clientCloseMock()
}
}
class MockStdioTransport {
readonly config: unknown
constructor(config: unknown) {
this.config = config
}
async close(): Promise<void> {
transportCloseMock()
}
}
vi.mock('@modelcontextprotocol/sdk/client/index.js', () => ({
Client: MockClient,
}))
vi.mock('@modelcontextprotocol/sdk/client/stdio.js', () => ({
StdioClientTransport: MockStdioTransport,
}))
const context: ToolUseContext = {
agent: { name: 'test-agent', role: 'tester', model: 'test-model' },
}
beforeEach(() => {
vi.clearAllMocks()
})
describe('connectMCPTools', () => {
it('connects, discovers tools, and executes MCP calls', async () => {
listToolsMock.mockResolvedValue({
tools: [
{
name: 'search_issues',
description: 'Search repository issues.',
inputSchema: {
type: 'object',
properties: { q: { type: 'string' } },
required: ['q'],
},
},
],
})
callToolMock.mockResolvedValue({
content: [{ type: 'text', text: 'found 2 issues' }],
isError: false,
})
const { connectMCPTools } = await import('../src/tool/mcp.js')
const connected = await connectMCPTools({
command: 'npx',
args: ['-y', 'mock-mcp-server'],
env: { GITHUB_TOKEN: 'token' },
namePrefix: 'github',
})
expect(connectMock).toHaveBeenCalledTimes(1)
expect(connected.tools).toHaveLength(1)
expect(connected.tools[0].name).toBe('github_search_issues')
const registry = new ToolRegistry()
registry.register(connected.tools[0])
const defs = registry.toToolDefs()
expect(defs[0].inputSchema).toMatchObject({
type: 'object',
properties: { q: { type: 'string' } },
required: ['q'],
})
const result = await connected.tools[0].execute({ q: 'bug' }, context)
expect(callToolMock).toHaveBeenCalledWith(
{
name: 'search_issues',
arguments: { q: 'bug' },
},
undefined,
expect.objectContaining({ timeout: expect.any(Number) }),
)
expect(result.isError).toBe(false)
expect(result.data).toContain('found 2 issues')
await connected.disconnect()
expect(clientCloseMock).toHaveBeenCalledTimes(1)
expect(transportCloseMock).not.toHaveBeenCalled()
})
it('aggregates paginated listTools results', async () => {
listToolsMock.mockImplementation(
async (params?: { cursor?: string }) => {
if (params?.cursor === 'c1') {
return {
tools: [
{ name: 'b', description: 'B', inputSchema: { type: 'object' } },
],
}
}
return {
tools: [
{ name: 'a', description: 'A', inputSchema: { type: 'object' } },
],
nextCursor: 'c1',
}
},
)
callToolMock.mockResolvedValue({ content: [{ type: 'text', text: 'ok' }] })
const { connectMCPTools } = await import('../src/tool/mcp.js')
const connected = await connectMCPTools({
command: 'npx',
args: ['-y', 'mock-mcp-server'],
})
expect(listToolsMock).toHaveBeenCalledTimes(2)
expect(listToolsMock.mock.calls[1][0]).toEqual({ cursor: 'c1' })
expect(connected.tools).toHaveLength(2)
expect(connected.tools.map((t) => t.name)).toEqual(['a', 'b'])
})
it('serializes non-text MCP content blocks', async () => {
listToolsMock.mockResolvedValue({
tools: [{ name: 'pic', description: 'Pic', inputSchema: { type: 'object' } }],
})
callToolMock.mockResolvedValue({
content: [
{
type: 'image',
data: 'AAA',
mimeType: 'image/png',
},
],
isError: false,
})
const { connectMCPTools } = await import('../src/tool/mcp.js')
const connected = await connectMCPTools({ command: 'npx', args: ['x'] })
const result = await connected.tools[0].execute({}, context)
expect(result.data).toContain('image')
expect(result.data).toContain('base64 length=3')
})
it('marks tool result as error when MCP returns isError', async () => {
listToolsMock.mockResolvedValue({
tools: [{ name: 'danger', description: 'Dangerous op.', inputSchema: {} }],
})
callToolMock.mockResolvedValue({
content: [{ type: 'text', text: 'permission denied' }],
isError: true,
})
const { connectMCPTools } = await import('../src/tool/mcp.js')
const connected = await connectMCPTools({
command: 'npx',
args: ['-y', 'mock-mcp-server'],
})
const result = await connected.tools[0].execute({}, context)
expect(result.isError).toBe(true)
expect(result.data).toContain('permission denied')
})
})


@@ -0,0 +1,85 @@
import { describe, it, expect } from 'vitest'
import { Agent } from '../src/agent/agent.js'
import { AgentRunner } from '../src/agent/runner.js'
import { ToolRegistry } from '../src/tool/framework.js'
import { ToolExecutor } from '../src/tool/executor.js'
import type { AgentConfig, LLMAdapter, LLMMessage } from '../src/types.js'
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/** Adapter whose chat() always throws. */
function errorAdapter(error: Error): LLMAdapter {
return {
name: 'error-mock',
async chat(_messages: LLMMessage[]) {
throw error
},
async *stream() {
/* unused */
},
}
}
function buildAgentWithAdapter(config: AgentConfig, adapter: LLMAdapter) {
const registry = new ToolRegistry()
const executor = new ToolExecutor(registry)
const agent = new Agent(config, registry, executor)
const runner = new AgentRunner(adapter, registry, executor, {
model: config.model,
systemPrompt: config.systemPrompt,
maxTurns: config.maxTurns,
agentName: config.name,
})
;(agent as any).runner = runner
return agent
}
const baseConfig: AgentConfig = {
name: 'test-agent',
model: 'mock-model',
systemPrompt: 'You are a test agent.',
}
// ---------------------------------------------------------------------------
// Tests — #98: AgentRunner.run() must propagate errors from stream()
// ---------------------------------------------------------------------------
describe('AgentRunner.run() error propagation (#98)', () => {
it('LLM adapter error surfaces as success:false in AgentRunResult', async () => {
const apiError = new Error('API 500: internal server error')
const agent = buildAgentWithAdapter(baseConfig, errorAdapter(apiError))
const result = await agent.run('hello')
expect(result.success).toBe(false)
expect(result.output).toContain('API 500')
})
it('AgentRunner.run() throws when adapter errors', async () => {
const apiError = new Error('network timeout')
const adapter = errorAdapter(apiError)
const registry = new ToolRegistry()
const executor = new ToolExecutor(registry)
const runner = new AgentRunner(adapter, registry, executor, {
model: 'mock-model',
systemPrompt: 'test',
agentName: 'test',
})
await expect(
runner.run([{ role: 'user', content: [{ type: 'text', text: 'hi' }] }]),
).rejects.toThrow('network timeout')
})
it('agent transitions to error state on LLM failure', async () => {
const agent = buildAgentWithAdapter(baseConfig, errorAdapter(new Error('boom')))
await agent.run('hello')
expect(agent.getState().status).toBe('error')
})
})
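The behavior these tests lock in (per the #98 commit message) is that `run()` must react to `'error'` events from `stream()` rather than handling only `'done'`. A hedged sketch of the consuming loop, with the event union assumed for illustration rather than taken from the project's actual type definitions:

```typescript
// Illustrative terminal-event handling for run(). Dropping the 'error' branch
// is the #98 bug: callers received an empty result that looked successful.
type StreamEvent =
  | { type: 'done'; output: string }
  | { type: 'error'; error: Error }

async function consumeRun(
  events: AsyncIterable<StreamEvent>,
): Promise<{ success: true; output: string }> {
  for await (const event of events) {
    if (event.type === 'error') {
      // Propagate instead of silently continuing.
      throw event.error
    }
    if (event.type === 'done') {
      return { success: true, output: event.output }
    }
  }
  throw new Error('stream ended without a terminal event')
}
```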


@@ -61,6 +61,13 @@ function createTestTools() {
execute: async () => ({ data: 'matches', isError: false }),
}))
registry.register(defineTool({
name: 'glob',
description: 'List paths',
inputSchema: z.object({ path: z.string().optional() }),
execute: async () => ({ data: 'paths', isError: false }),
}))
registry.register(defineTool({
name: 'bash',
description: 'Run shell command',
@@ -110,7 +117,15 @@ describe('Tool filtering', () => {
const tools = (runner as any).resolveTools() as LLMToolDef[]
const toolNames = tools.map((t: LLMToolDef) => t.name).sort()
expect(toolNames).toEqual(['bash', 'custom_tool', 'file_edit', 'file_read', 'file_write', 'grep'])
expect(toolNames).toEqual([
'bash',
'custom_tool',
'file_edit',
'file_read',
'file_write',
'glob',
'grep',
])
})
})
@@ -124,7 +139,7 @@ describe('Tool filtering', () => {
const tools = (runner as any).resolveTools() as LLMToolDef[]
const toolNames = tools.map((t: LLMToolDef) => t.name).sort()
expect(toolNames).toEqual(['custom_tool', 'file_read', 'grep'])
expect(toolNames).toEqual(['custom_tool', 'file_read', 'glob', 'grep'])
})
it('readwrite preset filters correctly', () => {
@@ -136,7 +151,14 @@ describe('Tool filtering', () => {
const tools = (runner as any).resolveTools() as LLMToolDef[]
const toolNames = tools.map((t: LLMToolDef) => t.name).sort()
expect(toolNames).toEqual(['custom_tool', 'file_edit', 'file_read', 'file_write', 'grep'])
expect(toolNames).toEqual([
'custom_tool',
'file_edit',
'file_read',
'file_write',
'glob',
'grep',
])
})
it('full preset filters correctly', () => {
@@ -148,7 +170,15 @@ describe('Tool filtering', () => {
const tools = (runner as any).resolveTools() as LLMToolDef[]
const toolNames = tools.map((t: LLMToolDef) => t.name).sort()
expect(toolNames).toEqual(['bash', 'custom_tool', 'file_edit', 'file_read', 'file_write', 'grep'])
expect(toolNames).toEqual([
'bash',
'custom_tool',
'file_edit',
'file_read',
'file_write',
'glob',
'grep',
])
})
})
@@ -186,7 +216,14 @@ describe('Tool filtering', () => {
const tools = (runner as any).resolveTools() as LLMToolDef[]
const toolNames = tools.map((t: LLMToolDef) => t.name).sort()
expect(toolNames).toEqual(['custom_tool', 'file_edit', 'file_read', 'file_write', 'grep'])
expect(toolNames).toEqual([
'custom_tool',
'file_edit',
'file_read',
'file_write',
'glob',
'grep',
])
})
it('empty denylist returns all tools', () => {
@@ -196,13 +233,13 @@ describe('Tool filtering', () => {
})
const tools = (runner as any).resolveTools()
expect(tools).toHaveLength(6) // All registered tools
expect(tools).toHaveLength(7) // All registered tools
})
})
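The combined-filtering tests below spell out the order in comments: preset narrows first, then the allowlist intersects, then the denylist subtracts. A minimal standalone sketch of that pipeline (not the actual `resolveTools` implementation; the preset is taken here as an explicit name list rather than the named presets the runner uses):

```typescript
// Order matters: applying the allowlist before the preset could admit tools
// the preset never exposes.
function filterToolNames(
  registered: string[],
  opts: { preset?: string[]; allowlist?: string[]; denylist?: string[] },
): string[] {
  let names = registered
  if (opts.preset) names = names.filter((n) => opts.preset!.includes(n))
  if (opts.allowlist) names = names.filter((n) => opts.allowlist!.includes(n))
  if (opts.denylist) names = names.filter((n) => !opts.denylist!.includes(n))
  return names.sort()
}
```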
describe('resolveTools - combined filtering (preset + allowlist + denylist)', () => {
it('preset + allowlist + denylist work together', () => {
// Start with readwrite preset: ['file_read', 'file_write', 'file_edit', 'grep']
// Start with readwrite preset: ['file_read', 'file_write', 'file_edit', 'grep', 'glob']
// Then allowlist: intersect with ['file_read', 'file_write', 'grep'] = ['file_read', 'file_write', 'grep']
// Then denylist: subtract ['file_write'] = ['file_read', 'grep']
const runner = new AgentRunner(mockAdapter, registry, executor, {
@@ -219,7 +256,7 @@ describe('Tool filtering', () => {
})
it('preset filters first, then allowlist intersects, then denylist subtracts', () => {
// Start with readonly preset: ['file_read', 'grep']
// Start with readonly preset: ['file_read', 'grep', 'glob']
// Allowlist intersect with ['file_read', 'bash']: ['file_read']
// Denylist subtract ['file_read']: []
const runner = new AgentRunner(mockAdapter, registry, executor, {