merge: resolve conflicts with main (timeoutMs, abortSignal, gemini)

2026-04-05 13:01:44 +08:00 · 2026-04-05 13:01:44 +08:00 · a19143389a
parent b18cb39525 ed3753c1f4
commit a19143389a
19 changed files with 2170 additions and 812 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -18,6 +18,6 @@ jobs:
        with:
          node-version: ${{ matrix.node-version }}
          cache: npm
-      - run: npm ci
+      - run: rm -f package-lock.json && npm install
      - run: npm run lint
      - run: npm test
--- a/README.md
+++ b/README.md
@@ -29,7 +29,12 @@ Requires Node.js >= 18.
 npm install @jackchen_me/open-multi-agent
 ```

-Set `ANTHROPIC_API_KEY` (and optionally `OPENAI_API_KEY` or `GITHUB_TOKEN` for Copilot) in your environment. Local models via Ollama require no API key — see [example 06](examples/06-local-model.ts).
+Set the API key for your provider. Local models via Ollama require no API key — see [example 06](examples/06-local-model.ts).
+
+- `ANTHROPIC_API_KEY`
+- `OPENAI_API_KEY`
+- `GEMINI_API_KEY`
+- `GITHUB_TOKEN` (for Copilot)

 Three agents, one goal — the framework handles the rest:

@@ -156,6 +161,7 @@ npx tsx examples/01-single-agent.ts
 │  - stream()       │    │  - AnthropicAdapter  │
 └────────┬──────────┘    │  - OpenAIAdapter     │
         │               │  - CopilotAdapter    │
+         │               │  - GeminiAdapter     │
         │               └──────────────────────┘
 ┌────────▼──────────┐
 │  AgentRunner      │    ┌──────────────────────┐
@@ -183,6 +189,7 @@ npx tsx examples/01-single-agent.ts
 | OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | Verified |
 | Grok (xAI)   | `provider: 'grok'` | `XAI_API_KEY` | Verified |
 | GitHub Copilot | `provider: 'copilot'` | `GITHUB_TOKEN` | Verified |
+| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | Verified |
 | Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | — | Verified |
 | llama.cpp server | `provider: 'openai'` + `baseURL` | — | Verified |

@@ -190,6 +197,33 @@ Verified local models with tool-calling: **Gemma 4** (see [example 08](examples/

 Any OpenAI-compatible API should work via `provider: 'openai'` + `baseURL` (DeepSeek, Groq, Mistral, Qwen, MiniMax, etc.). **Grok now has first-class support** via `provider: 'grok'`.

+### Local Model Tool-Calling
+
+The framework supports tool-calling with local models served by Ollama, vLLM, LM Studio, or llama.cpp. Tool-calling is handled natively by these servers via the OpenAI-compatible API.
+
+**Verified models:** Gemma 4, Llama 3.1, Qwen 3, Mistral, Phi-4. See the full list at [ollama.com/search?c=tools](https://ollama.com/search?c=tools).
+
+**Fallback extraction:** If a local model returns tool calls as text instead of using the `tool_calls` wire format (common with thinking models or misconfigured servers), the framework automatically extracts them from the text output.
+
+**Timeout:** Local inference can be slow. Use `timeoutMs` on `AgentConfig` to prevent indefinite hangs:
+
+```typescript
+const localAgent: AgentConfig = {
+  name: 'local',
+  model: 'llama3.1',
+  provider: 'openai',
+  baseURL: 'http://localhost:11434/v1',
+  apiKey: 'ollama',
+  tools: ['bash', 'file_read'],
+  timeoutMs: 120_000, // abort after 2 minutes
+}
+```
+
+**Troubleshooting:**
+- Model not calling tools? Ensure it appears in Ollama's [Tools category](https://ollama.com/search?c=tools). Not all models support tool-calling.
+- Using Ollama? Update to the latest release; older versions have known tool-calling bugs.
+- Proxy interfering? Use `no_proxy=localhost` when running against local servers.
+
 ### LLM Configuration Examples

 ```typescript
--- a/README_zh.md
+++ b/README_zh.md
@@ -155,6 +155,7 @@ npx tsx examples/01-single-agent.ts
 │  - stream()       │    │  - AnthropicAdapter  │
 └────────┬──────────┘    │  - OpenAIAdapter     │
         │               │  - CopilotAdapter    │
+         │               │  - GeminiAdapter     │
         │               └──────────────────────┘
 ┌────────▼──────────┐
 │  AgentRunner      │    ┌──────────────────────┐
@@ -181,6 +182,7 @@ npx tsx examples/01-single-agent.ts
 | Anthropic (Claude) | `provider: 'anthropic'` | `ANTHROPIC_API_KEY` | 已验证 |
 | OpenAI (GPT) | `provider: 'openai'` | `OPENAI_API_KEY` | 已验证 |
 | GitHub Copilot | `provider: 'copilot'` | `GITHUB_TOKEN` | 已验证 |
+| Gemini | `provider: 'gemini'` | `GEMINI_API_KEY` | 已验证 |
 | Ollama / vLLM / LM Studio | `provider: 'openai'` + `baseURL` | — | 已验证 |

 已验证支持 tool-calling 的本地模型：**Gemma 4**（见[示例 08](examples/08-gemma4-local.ts)）。
--- a/examples/06-local-model.ts
+++ b/examples/06-local-model.ts
@@ -64,6 +64,7 @@ Your review MUST include these sections:
 Be specific and constructive. Reference line numbers or function names when possible.`,
  tools: ['file_read'],
  maxTurns: 4,
+  timeoutMs: 120_000, // 2 min — local models can be slow
 }

 // ---------------------------------------------------------------------------
--- a/examples/13-gemini.ts
+++ b/examples/13-gemini.ts
@@ -0,0 +1,48 @@
+/**
+ * Quick smoke test for the Gemini adapter.
+ *
+ * Run:
+ *   npx tsx examples/13-gemini.ts
+ *
+ * Requires GEMINI_API_KEY or GOOGLE_API_KEY in the environment; the adapter will not work without one.
+ */
+
+import { OpenMultiAgent } from '../src/index.js'
+import type { OrchestratorEvent } from '../src/types.js'
+
+const orchestrator = new OpenMultiAgent({
+  defaultModel: 'gemini-2.5-flash',
+  defaultProvider: 'gemini',
+  onProgress: (event: OrchestratorEvent) => {
+    if (event.type === 'agent_start') {
+      console.log(`[start]    agent=${event.agent}`)
+    } else if (event.type === 'agent_complete') {
+      console.log(`[complete] agent=${event.agent}`)
+    }
+  },
+})
+
+console.log('Testing Gemini adapter with gemini-2.5-flash...\n')
+
+const result = await orchestrator.runAgent(
+  {
+    name: 'assistant',
+    model: 'gemini-2.5-flash',
+    provider: 'gemini',
+    systemPrompt: 'You are a helpful assistant. Keep answers brief.',
+    maxTurns: 1,
+    maxTokens: 256,
+  },
+  'What is 2 + 2? Reply in one sentence.',
+)
+
+if (result.success) {
+  console.log('\nAgent output:')
+  console.log('─'.repeat(60))
+  console.log(result.output)
+  console.log('─'.repeat(60))
+  console.log(`\nTokens: input=${result.tokenUsage.input_tokens}, output=${result.tokenUsage.output_tokens}`)
+} else {
+  console.error('Agent failed:', result.output)
+  process.exit(1)
+}
--- a/package-lock.json
+++ b/package-lock.json
--- a/package.json
+++ b/package.json
@@ -41,7 +41,16 @@
    "openai": "^4.73.0",
    "zod": "^3.23.0"
  },
+  "peerDependencies": {
+    "@google/genai": "^1.48.0"
+  },
+  "peerDependenciesMeta": {
+    "@google/genai": {
+      "optional": true
+    }
+  },
  "devDependencies": {
+    "@google/genai": "^1.48.0",
    "@types/node": "^22.0.0",
    "tsx": "^4.21.0",
    "typescript": "^5.6.0",
--- a/src/agent/agent.ts
+++ b/src/agent/agent.ts
@@ -50,6 +50,19 @@ import {

 const ZERO_USAGE: TokenUsage = { input_tokens: 0, output_tokens: 0 }

+/**
+ * Combine two {@link AbortSignal}s so that aborting either one cancels the
+ * returned signal.  Works on Node 18+ (no `AbortSignal.any` required).
+ */
+function mergeAbortSignals(a: AbortSignal, b: AbortSignal): AbortSignal {
+  const controller = new AbortController()
+  if (a.aborted || b.aborted) { controller.abort(); return controller.signal }
+  const abort = () => controller.abort()
+  a.addEventListener('abort', abort, { once: true })
+  b.addEventListener('abort', abort, { once: true })
+  return controller.signal
+}
+
 function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
  return {
    input_tokens: a.input_tokens + b.input_tokens,
@@ -294,10 +307,22 @@ export class Agent {
      }
      // Auto-generate runId when onTrace is provided but runId is missing
      const needsRunId = callerOptions?.onTrace && !callerOptions.runId
+      // Create a fresh timeout signal per run (not per runner) so that
+      // each run() / prompt() call gets its own timeout window.
+      const timeoutSignal = this.config.timeoutMs !== undefined && this.config.timeoutMs > 0
+        ? AbortSignal.timeout(this.config.timeoutMs)
+        : undefined
+      // Merge caller-provided abortSignal with the timeout signal so that
+      // either cancellation source is respected.
+      const callerAbort = callerOptions?.abortSignal
+      const effectiveAbort = timeoutSignal && callerAbort
+        ? mergeAbortSignals(timeoutSignal, callerAbort)
+        : timeoutSignal ?? callerAbort
      const runOptions: RunOptions = {
        ...callerOptions,
        onMessage: internalOnMessage,
        ...(needsRunId ? { runId: generateRunId() } : undefined),
+        ...(effectiveAbort ? { abortSignal: effectiveAbort } : undefined),
      }

      const result = await runner.run(messages, runOptions)
@@ -467,8 +492,12 @@ export class Agent {
      }

      const runner = await this.getRunner()
+      // Fresh timeout per stream call, same as executeRun.
+      const timeoutSignal = this.config.timeoutMs !== undefined && this.config.timeoutMs > 0
+        ? AbortSignal.timeout(this.config.timeoutMs)
+        : undefined

-      for await (const event of runner.stream(messages)) {
+      for await (const event of runner.stream(messages, timeoutSignal ? { abortSignal: timeoutSignal } : {})) {
        if (event.type === 'done') {
          const result = event.data as import('./runner.js').RunResult
          this.state.tokenUsage = addUsage(this.state.tokenUsage, result.tokenUsage)
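Note on the merged signal: `mergeAbortSignals` above calls `controller.abort()` with no argument, so the combined signal carries a default abort reason rather than the reason of whichever source fired. A minimal sketch of a variant that forwards the reason follows. `combineSignals` is a hypothetical name, not part of this commit, and two plain `AbortController`s stand in for the timeout signal and the caller's signal.

```typescript
// Hypothetical variant of mergeAbortSignals that also forwards the abort
// reason, so a consumer can tell a timeout from a manual cancel.
function combineSignals(a: AbortSignal, b: AbortSignal): AbortSignal {
  const controller = new AbortController()
  const forward = (source: AbortSignal): void => controller.abort(source.reason)

  // If either input is already aborted, return an aborted signal right away.
  if (a.aborted) { forward(a); return controller.signal }
  if (b.aborted) { forward(b); return controller.signal }

  // Otherwise abort as soon as either input fires. A second abort() call on
  // the same controller is a no-op, so the first reason wins.
  a.addEventListener('abort', () => forward(a), { once: true })
  b.addEventListener('abort', () => forward(b), { once: true })
  return controller.signal
}

// Stand-ins for AbortSignal.timeout(...) and callerOptions.abortSignal.
const timeoutLike = new AbortController()
const caller = new AbortController()
const merged = combineSignals(timeoutLike.signal, caller.signal)

const before = merged.aborted
caller.abort(new Error('cancelled by caller'))
const after = merged.aborted
const reason = merged.reason as Error
console.log(before, after, reason.message)
```

Whether forwarding the reason is worth the extra lines depends on whether downstream code inspects `signal.reason`; the diff above only checks `aborted`.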
--- a/src/agent/runner.ts
+++ b/src/agent/runner.ts
@@ -83,6 +83,11 @@ export interface RunOptions {
  readonly onToolResult?: (name: string, result: ToolResult) => void
  /** Fired after each complete {@link LLMMessage} is appended. */
  readonly onMessage?: (message: LLMMessage) => void
+  /**
+   * Fired when the runner detects a potential configuration issue.
+   * For example, when a model appears to ignore tool definitions.
+   */
+  readonly onWarning?: (message: string) => void
  /** Trace callback for observability spans. Async callbacks are safe. */
  readonly onTrace?: (event: TraceEvent) => void | Promise<void>
  /** Run ID for trace correlation. */
@@ -92,10 +97,10 @@ export interface RunOptions {
  /** Agent name for trace correlation (overrides RunnerOptions.agentName). */
  readonly traceAgent?: string
  /**
-   * Fired when the runner detects a potential issue (e.g. loop detection,
-   * model ignoring tool definitions).
+   * Per-call abort signal. When set, takes precedence over the static
+   * {@link RunnerOptions.abortSignal}. Useful for per-run timeouts.
   */
-  readonly onWarning?: (message: string) => void
+  readonly abortSignal?: AbortSignal
 }

 /** The aggregated result returned when a full run completes. */
@@ -236,13 +241,16 @@ export class AgentRunner {
      ? allDefs.filter(d => this.options.allowedTools!.includes(d.name))
      : allDefs

+    // Per-call abortSignal takes precedence over the static one.
+    const effectiveAbortSignal = options.abortSignal ?? this.options.abortSignal
+
    const baseChatOptions: LLMChatOptions = {
      model: this.options.model,
      tools: toolDefs.length > 0 ? toolDefs : undefined,
      maxTokens: this.options.maxTokens,
      temperature: this.options.temperature,
      systemPrompt: this.options.systemPrompt,
-      abortSignal: this.options.abortSignal,
+      abortSignal: effectiveAbortSignal,
    }

    // Loop detection state — only allocated when configured.
@@ -259,7 +267,7 @@ export class AgentRunner {
      // -----------------------------------------------------------------
      while (true) {
        // Respect abort before each LLM call.
-        if (this.options.abortSignal?.aborted) {
+        if (effectiveAbortSignal?.aborted) {
          break
        }

@@ -361,6 +369,15 @@ export class AgentRunner {
        // Step 3: Decide whether to continue looping.
        // ------------------------------------------------------------------
        if (toolUseBlocks.length === 0) {
+          // Warn on first turn if tools were provided but model didn't use them.
+          if (turns === 1 && toolDefs.length > 0 && options.onWarning) {
+            const agentName = this.options.agentName ?? 'unknown'
+            options.onWarning(
+              `Agent "${agentName}" has ${toolDefs.length} tool(s) available but the model ` +
+              `returned no tool calls. If using a local model, verify it supports tool calling ` +
+              `(see https://ollama.com/search?c=tools).`,
+            )
+          }
          // No tools requested — this is the terminal assistant turn.
          finalOutput = turnText
          break
--- a/src/llm/adapter.ts
+++ b/src/llm/adapter.ts
@@ -11,6 +11,7 @@
 *
 * const anthropic = createAdapter('anthropic')
 * const openai    = createAdapter('openai', process.env.OPENAI_API_KEY)
+ * const gemini    = createAdapter('gemini', process.env.GEMINI_API_KEY)
 * ```
 */

@@ -37,7 +38,7 @@ import type { LLMAdapter } from '../types.js'
 * Additional providers can be integrated by implementing {@link LLMAdapter}
 * directly and bypassing this factory.
 */
-export type SupportedProvider = 'anthropic' | 'copilot' | 'grok' | 'openai'
+export type SupportedProvider = 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'

 /**
 * Instantiate the appropriate {@link LLMAdapter} for the given provider.
@@ -46,6 +47,7 @@ export type SupportedProvider = 'anthropic' | 'copilot' | 'grok' | 'openai'
 * explicitly:
 * - `anthropic` → `ANTHROPIC_API_KEY`
 * - `openai`    → `OPENAI_API_KEY`
+ * - `gemini`    → `GEMINI_API_KEY` / `GOOGLE_API_KEY`
 * - `grok`      → `XAI_API_KEY`
 * - `copilot`   → `GITHUB_COPILOT_TOKEN` / `GITHUB_TOKEN`, or interactive
 *                  OAuth2 device flow if neither is set
@@ -75,6 +77,10 @@ export async function createAdapter(
      const { CopilotAdapter } = await import('./copilot.js')
      return new CopilotAdapter(apiKey)
    }
+    case 'gemini': {
+      const { GeminiAdapter } = await import('./gemini.js')
+      return new GeminiAdapter(apiKey)
+    }
    case 'openai': {
      const { OpenAIAdapter } = await import('./openai.js')
      return new OpenAIAdapter(apiKey, baseURL)
--- a/src/llm/copilot.ts
+++ b/src/llm/copilot.ts
@@ -313,7 +313,8 @@ export class CopilotAdapter implements LLMAdapter {
      },
    )

-    return fromOpenAICompletion(completion)
+    const toolNames = options.tools?.map(t => t.name)
+    return fromOpenAICompletion(completion, toolNames)
  }

  // -------------------------------------------------------------------------
--- a/src/llm/gemini.ts
+++ b/src/llm/gemini.ts
@@ -0,0 +1,378 @@
+/**
+ * @fileoverview Google Gemini adapter implementing {@link LLMAdapter}.
+ *
+ * Built for `@google/genai` (the unified Google Gen AI SDK, v1.x), NOT the
+ * legacy `@google/generative-ai` package.
+ *
+ * Converts between the framework's internal {@link ContentBlock} types and the
+ * `@google/genai` SDK's wire format, handling tool definitions, system prompts,
+ * and both batch and streaming response paths.
+ *
+ * API key resolution order:
+ *   1. `apiKey` constructor argument
+ *   2. `GEMINI_API_KEY` environment variable
+ *   3. `GOOGLE_API_KEY` environment variable
+ *
+ * @example
+ * ```ts
+ * import { GeminiAdapter } from './gemini.js'
+ *
+ * const adapter = new GeminiAdapter()
+ * const response = await adapter.chat(messages, {
+ *   model: 'gemini-2.5-flash',
+ *   maxTokens: 1024,
+ * })
+ * ```
+ */
+
+import {
+  GoogleGenAI,
+  FunctionCallingConfigMode,
+  type Content,
+  type FunctionDeclaration,
+  type GenerateContentConfig,
+  type GenerateContentResponse,
+  type Part,
+  type Tool as GeminiTool,
+} from '@google/genai'
+
+import type {
+  ContentBlock,
+  LLMAdapter,
+  LLMChatOptions,
+  LLMMessage,
+  LLMResponse,
+  LLMStreamOptions,
+  LLMToolDef,
+  StreamEvent,
+  ToolUseBlock,
+} from '../types.js'
+
+// ---------------------------------------------------------------------------
+// Internal helpers
+// ---------------------------------------------------------------------------
+
+/**
+ * Map framework role names to Gemini role names.
+ *
+ * Gemini uses `"model"` instead of `"assistant"`.
+ */
+function toGeminiRole(role: 'user' | 'assistant'): string {
+  return role === 'assistant' ? 'model' : 'user'
+}
+
+/**
+ * Convert framework messages into Gemini's {@link Content}[] format.
+ *
+ * Key differences from Anthropic:
+ * - Gemini uses `"model"` instead of `"assistant"`.
+ * - `functionResponse` parts (tool results) must appear in `"user"` turns.
+ * - `functionCall` parts appear in `"model"` turns.
+ * - We build a name lookup map from tool_use blocks so tool_result blocks
+ *   can resolve the function name required by Gemini's `functionResponse`.
+ */
+function toGeminiContents(messages: LLMMessage[]): Content[] {
+  // First pass: build id → name map for resolving tool results.
+  const toolNameById = new Map<string, string>()
+  for (const msg of messages) {
+    for (const block of msg.content) {
+      if (block.type === 'tool_use') {
+        toolNameById.set(block.id, block.name)
+      }
+    }
+  }
+
+  return messages.map((msg): Content => {
+    const parts: Part[] = msg.content.map((block): Part => {
+      switch (block.type) {
+        case 'text':
+          return { text: block.text }
+
+        case 'tool_use':
+          return {
+            functionCall: {
+              id: block.id,
+              name: block.name,
+              args: block.input,
+            },
+          }
+
+        case 'tool_result': {
+          const name = toolNameById.get(block.tool_use_id) ?? block.tool_use_id
+          return {
+            functionResponse: {
+              id: block.tool_use_id,
+              name,
+              response: {
+                content:
+                  typeof block.content === 'string'
+                    ? block.content
+                    : JSON.stringify(block.content),
+                isError: block.is_error ?? false,
+              },
+            },
+          }
+        }
+
+        case 'image':
+          return {
+            inlineData: {
+              mimeType: block.source.media_type,
+              data: block.source.data,
+            },
+          }
+
+        default: {
+          const _exhaustive: never = block
+          throw new Error(`Unhandled content block type: ${JSON.stringify(_exhaustive)}`)
+        }
+      }
+    })
+
+    return { role: toGeminiRole(msg.role), parts }
+  })
+}
+
+/**
+ * Convert framework {@link LLMToolDef}s into a Gemini `tools` config array.
+ *
+ * In `@google/genai`, function declarations use `parametersJsonSchema` (not
+ * `parameters` or `input_schema`). All declarations are grouped under a single
+ * tool entry.
+ */
+function toGeminiTools(tools: readonly LLMToolDef[]): GeminiTool[] {
+  const functionDeclarations: FunctionDeclaration[] = tools.map((t) => ({
+    name: t.name,
+    description: t.description,
+    parametersJsonSchema: t.inputSchema as Record<string, unknown>,
+  }))
+  return [{ functionDeclarations }]
+}
+
+/**
+ * Build the {@link GenerateContentConfig} shared by chat() and stream().
+ */
+function buildConfig(
+  options: LLMChatOptions | LLMStreamOptions,
+): GenerateContentConfig {
+  return {
+    maxOutputTokens: options.maxTokens ?? 4096,
+    temperature: options.temperature,
+    systemInstruction: options.systemPrompt,
+    tools: options.tools ? toGeminiTools(options.tools) : undefined,
+    toolConfig: options.tools
+      ? { functionCallingConfig: { mode: FunctionCallingConfigMode.AUTO } }
+      : undefined,
+  }
+}
+
+/**
+ * Generate a pseudo-random ID string (timestamp plus random suffix) for tool use blocks.
+ *
+ * Gemini may not always return call IDs (especially in streaming), so we
+ * fabricate them when absent to satisfy the framework's {@link ToolUseBlock}
+ * contract.
+ */
+function generateId(): string {
+  return `gemini-${Date.now()}-${Math.random().toString(36).slice(2, 9)}`
+}
+
+/**
+ * Extract the function call ID from a Gemini part, or generate one.
+ *
+ * The `id` field exists in newer API versions but may be absent in older
+ * responses, so we cast conservatively and fall back to a generated ID.
+ */
+function getFunctionCallId(part: Part): string {
+  return (part.functionCall as { id?: string } | undefined)?.id ?? generateId()
+}
+
+/**
+ * Convert a Gemini {@link GenerateContentResponse} into a framework
+ * {@link LLMResponse}.
+ */
+function fromGeminiResponse(
+  response: GenerateContentResponse,
+  id: string,
+  model: string,
+): LLMResponse {
+  const candidate = response.candidates?.[0]
+  const content: ContentBlock[] = []
+
+  for (const part of candidate?.content?.parts ?? []) {
+    if (part.text !== undefined && part.text !== '') {
+      content.push({ type: 'text', text: part.text })
+    } else if (part.functionCall !== undefined) {
+      content.push({
+        type: 'tool_use',
+        id: getFunctionCallId(part),
+        name: part.functionCall.name ?? '',
+        input: (part.functionCall.args ?? {}) as Record<string, unknown>,
+      })
+    }
+    // inlineData echoes and other part types are silently ignored.
+  }
+
+  // Map Gemini finish reasons to framework stop_reason vocabulary.
+  const finishReason = candidate?.finishReason as string | undefined
+  let stop_reason: LLMResponse['stop_reason'] = 'end_turn'
+  if (finishReason === 'MAX_TOKENS') {
+    stop_reason = 'max_tokens'
+  } else if (content.some((b) => b.type === 'tool_use')) {
+    // Gemini may report STOP even when it returned function calls.
+    stop_reason = 'tool_use'
+  }
+
+  const usage = response.usageMetadata
+  return {
+    id,
+    content,
+    model,
+    stop_reason,
+    usage: {
+      input_tokens: usage?.promptTokenCount ?? 0,
+      output_tokens: usage?.candidatesTokenCount ?? 0,
+    },
+  }
+}
+
+// ---------------------------------------------------------------------------
+// Adapter implementation
+// ---------------------------------------------------------------------------
+
+/**
+ * LLM adapter backed by the Google Gemini API via `@google/genai`.
+ *
+ * Thread-safe — a single instance may be shared across concurrent agent runs.
+ * The underlying SDK client is stateless across requests.
+ */
+export class GeminiAdapter implements LLMAdapter {
+  readonly name = 'gemini'
+
+  readonly #client: GoogleGenAI
+
+  constructor(apiKey?: string) {
+    this.#client = new GoogleGenAI({
+      apiKey: apiKey ?? process.env['GEMINI_API_KEY'] ?? process.env['GOOGLE_API_KEY'],
+    })
+  }
+
+  // -------------------------------------------------------------------------
+  // chat()
+  // -------------------------------------------------------------------------
+
+  /**
+   * Send a non-streaming chat request and return the complete
+   * {@link LLMResponse}.
+   *
+   * Uses `ai.models.generateContent()` with the full conversation as `contents`,
+   * which is the idiomatic pattern for `@google/genai`.
+   */
+  async chat(messages: LLMMessage[], options: LLMChatOptions): Promise<LLMResponse> {
+    const id = generateId()
+    const contents = toGeminiContents(messages)
+
+    const response = await this.#client.models.generateContent({
+      model: options.model,
+      contents,
+      config: buildConfig(options),
+    })
+
+    return fromGeminiResponse(response, id, options.model)
+  }
+
+  // -------------------------------------------------------------------------
+  // stream()
+  // -------------------------------------------------------------------------
+
+  /**
+   * Send a streaming chat request and yield {@link StreamEvent}s as they
+   * arrive from the API.
+   *
+   * Uses `ai.models.generateContentStream()` which returns an
+   * `AsyncGenerator<GenerateContentResponse>`. Each yielded chunk has the same
+   * shape as a full response but contains only the delta for that chunk.
+   *
+   * Because `@google/genai` doesn't expose a `finalMessage()` helper like the
+   * Anthropic SDK, we accumulate content and token counts as we stream so that
+   * the terminal `done` event carries a complete and accurate {@link LLMResponse}.
+   *
+   * Sequence guarantees (matching the Anthropic adapter):
+   * - Zero or more `text` events with incremental deltas
+   * - Zero or more `tool_use` events (one per call; Gemini doesn't stream args)
+   * - Exactly one terminal event: `done` or `error`
+   */
+  async *stream(
+    messages: LLMMessage[],
+    options: LLMStreamOptions,
+  ): AsyncIterable<StreamEvent> {
+    const id = generateId()
+    const contents = toGeminiContents(messages)
+
+    try {
+      const streamResponse = await this.#client.models.generateContentStream({
+        model: options.model,
+        contents,
+        config: buildConfig(options),
+      })
+
+      // Accumulators for building the done payload.
+      const accumulatedContent: ContentBlock[] = []
+      let inputTokens = 0
+      let outputTokens = 0
+      let lastFinishReason: string | undefined
+
+      for await (const chunk of streamResponse) {
+        const candidate = chunk.candidates?.[0]
+
+        // Keep the latest token counts; the API emits these on the final chunk.
+        if (chunk.usageMetadata) {
+          inputTokens = chunk.usageMetadata.promptTokenCount ?? inputTokens
+          outputTokens = chunk.usageMetadata.candidatesTokenCount ?? outputTokens
+        }
+        if (candidate?.finishReason) {
+          lastFinishReason = candidate.finishReason as string
+        }
+
+        for (const part of candidate?.content?.parts ?? []) {
+          if (part.text) {
+            accumulatedContent.push({ type: 'text', text: part.text })
+            yield { type: 'text', data: part.text } satisfies StreamEvent
+          } else if (part.functionCall) {
+            const toolId = getFunctionCallId(part)
+            const toolUseBlock: ToolUseBlock = {
+              type: 'tool_use',
+              id: toolId,
+              name: part.functionCall.name ?? '',
+              input: (part.functionCall.args ?? {}) as Record<string, unknown>,
+            }
+            accumulatedContent.push(toolUseBlock)
+            yield { type: 'tool_use', data: toolUseBlock } satisfies StreamEvent
+          }
+        }
+      }
+
+      // Determine stop_reason from the accumulated response.
+      const hasToolUse = accumulatedContent.some((b) => b.type === 'tool_use')
+      let stop_reason: LLMResponse['stop_reason'] = 'end_turn'
+      if (lastFinishReason === 'MAX_TOKENS') {
+        stop_reason = 'max_tokens'
+      } else if (hasToolUse) {
+        stop_reason = 'tool_use'
+      }
+
+      const finalResponse: LLMResponse = {
+        id,
+        content: accumulatedContent,
+        model: options.model,
+        stop_reason,
+        usage: { input_tokens: inputTokens, output_tokens: outputTokens },
+      }
+
+      yield { type: 'done', data: finalResponse } satisfies StreamEvent
+    } catch (err) {
+      const error = err instanceof Error ? err : new Error(String(err))
+      yield { type: 'error', data: error } satisfies StreamEvent
+    }
+  }
+}
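The two-pass lookup in `toGeminiContents` can be illustrated in isolation. The sketch below uses simplified stand-in types (`Block`, `Message`) and hypothetical helper names (`buildToolNameIndex`, `resolveFunctionName`). It does not touch the `@google/genai` SDK and only mirrors the id-to-name resolution needed to fill the `name` field of a `functionResponse`.

```typescript
// Simplified stand-ins for the framework's content blocks (assumed shapes).
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: Record<string, unknown> }
  | { type: 'tool_result'; tool_use_id: string; content: string }

interface Message {
  role: 'user' | 'assistant'
  content: Block[]
}

// First pass: remember which function name each tool_use id belongs to.
function buildToolNameIndex(messages: Message[]): Map<string, string> {
  const index = new Map<string, string>()
  for (const msg of messages) {
    for (const block of msg.content) {
      if (block.type === 'tool_use') index.set(block.id, block.name)
    }
  }
  return index
}

// Second pass: resolve the name for a tool_result, falling back to the raw
// id when no matching tool_use block was seen (same fallback as the adapter).
function resolveFunctionName(index: Map<string, string>, toolUseId: string): string {
  return index.get(toolUseId) ?? toolUseId
}

const history: Message[] = [
  { role: 'user', content: [{ type: 'text', text: 'List the files.' }] },
  {
    role: 'assistant',
    content: [{ type: 'tool_use', id: 'call_1', name: 'bash', input: { command: 'ls' } }],
  },
  { role: 'user', content: [{ type: 'tool_result', tool_use_id: 'call_1', content: 'README.md' }] },
]

const index = buildToolNameIndex(history)
const resolved = resolveFunctionName(index, 'call_1')
const unresolved = resolveFunctionName(index, 'call_missing')
console.log(resolved, unresolved)
```

Building the index over the whole history first means a tool result resolves correctly regardless of where its matching call sits in the message list.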
--- a/src/llm/openai-common.ts
+++ b/src/llm/openai-common.ts
@@ -25,6 +25,7 @@ import type {
  TextBlock,
  ToolUseBlock,
 } from '../types.js'
+import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'

 // ---------------------------------------------------------------------------
 // Framework → OpenAI
@@ -166,8 +167,18 @@ function toOpenAIAssistantMessage(msg: LLMMessage): ChatCompletionAssistantMessa
 *
 * Takes only the first choice (index 0), consistent with how the framework
 * is designed for single-output agents.
+ *
+ * @param completion      - The raw OpenAI completion.
+ * @param knownToolNames  - Optional whitelist of tool names. When the model
+ *                          returns no `tool_calls` but the text contains JSON
+ *                          that looks like a tool call, the fallback extractor
+ *                          uses this list to validate matches. Pass the names
+ *                          of tools sent in the request for best results.
 */
-export function fromOpenAICompletion(completion: ChatCompletion): LLMResponse {
+export function fromOpenAICompletion(
+  completion: ChatCompletion,
+  knownToolNames?: string[],
+): LLMResponse {
  const choice = completion.choices[0]
  if (choice === undefined) {
    throw new Error('OpenAI returned a completion with no choices')
@@ -201,7 +212,35 @@ export function fromOpenAICompletion(completion: ChatCompletion): LLMResponse {
    content.push(toolUseBlock)
  }

-  const stopReason = normalizeFinishReason(choice.finish_reason ?? 'stop')
+  // ---------------------------------------------------------------------------
+  // Fallback: extract tool calls from text when native tool_calls is empty.
+  //
+  // Some local models (Ollama thinking models, misconfigured vLLM) return tool
+  // calls as plain text instead of using the tool_calls wire format.  When we
+  // have text but no tool_calls, try to extract them from the text.
+  // ---------------------------------------------------------------------------
+  const hasNativeToolCalls = (message.tool_calls ?? []).length > 0
+  if (
+    !hasNativeToolCalls &&
+    knownToolNames !== undefined &&
+    knownToolNames.length > 0 &&
+    message.content !== null &&
+    message.content !== undefined &&
+    message.content.length > 0
+  ) {
+    const extracted = extractToolCallsFromText(message.content, knownToolNames)
+    if (extracted.length > 0) {
+      content.push(...extracted)
+    }
+  }
+
+  const hasToolUseBlocks = content.some(b => b.type === 'tool_use')
+  const rawStopReason = choice.finish_reason ?? 'stop'
+  // If we extracted tool calls from text but the finish_reason was 'stop',
+  // correct it to 'tool_use' so the agent runner continues the loop.
+  const stopReason = hasToolUseBlocks && rawStopReason === 'stop'
+    ? 'tool_use'
+    : normalizeFinishReason(rawStopReason)

  return {
    id: completion.id,
--- a/src/llm/openai.ts
+++ b/src/llm/openai.ts
@@ -54,6 +54,7 @@ import {
  normalizeFinishReason,
  buildOpenAIMessageList,
 } from './openai-common.js'
+import { extractToolCallsFromText } from '../tool/text-tool-extractor.js'

 // ---------------------------------------------------------------------------
 // Adapter implementation
@@ -104,7 +105,8 @@ export class OpenAIAdapter implements LLMAdapter {
      },
    )

-    return fromOpenAICompletion(completion)
+    const toolNames = options.tools?.map(t => t.name)
+    return fromOpenAICompletion(completion, toolNames)
  }

  // -------------------------------------------------------------------------
@@ -241,11 +243,29 @@ export class OpenAIAdapter implements LLMAdapter {
      }
      doneContent.push(...finalToolUseBlocks)

+      // Fallback: extract tool calls from text when streaming produced no
+      // native tool_calls (same logic as fromOpenAICompletion).
+      if (finalToolUseBlocks.length === 0 && fullText.length > 0 && options.tools) {
+        const toolNames = options.tools.map(t => t.name)
+        const extracted = extractToolCallsFromText(fullText, toolNames)
+        if (extracted.length > 0) {
+          doneContent.push(...extracted)
+          for (const block of extracted) {
+            yield { type: 'tool_use', data: block } satisfies StreamEvent
+          }
+        }
+      }
+
+      const hasToolUseBlocks = doneContent.some(b => b.type === 'tool_use')
+      const resolvedStopReason = hasToolUseBlocks && finalFinishReason === 'stop'
+        ? 'tool_use'
+        : normalizeFinishReason(finalFinishReason)
+
      const finalResponse: LLMResponse = {
        id: completionId,
        content: doneContent,
        model: completionModel,
-        stop_reason: normalizeFinishReason(finalFinishReason),
+        stop_reason: resolvedStopReason,
        usage: { input_tokens: inputTokens, output_tokens: outputTokens },
      }

--- a/src/tool/text-tool-extractor.ts
+++ b/src/tool/text-tool-extractor.ts
@ -0,0 +1,219 @@
+/**
+ * @fileoverview Fallback tool-call extractor for local models.
+ *
+ * When a local model (Ollama, vLLM, LM Studio) returns tool calls as plain
+ * text instead of using the OpenAI `tool_calls` wire format, this module
+ * attempts to extract them from the text output.
+ *
+ * Common scenarios:
+ * - Ollama thinking-model bug: tool call JSON ends up inside unclosed `<think>` tags
+ * - Model outputs raw JSON tool calls without the server parsing them
+ * - Model wraps tool calls in markdown code fences
+ * - Hermes-format `<tool_call>` tags
+ *
+ * This is a **safety net**, not the primary path. Native `tool_calls` from
+ * the server are always preferred.
+ */
+
+import type { ToolUseBlock } from '../types.js'
+
+// ---------------------------------------------------------------------------
+// ID generation
+// ---------------------------------------------------------------------------
+
+let callCounter = 0
+
+/** Generate a unique tool-call ID for extracted calls. */
+function generateToolCallId(): string {
+  return `extracted_call_${Date.now()}_${++callCounter}`
+}
+
+// ---------------------------------------------------------------------------
+// Internal parsers
+// ---------------------------------------------------------------------------
+
+/**
+ * Try to parse a single JSON object as a tool call.
+ *
+ * Accepted shapes (an `input` key is also accepted in place of `arguments`):
+ * ```json
+ * { "name": "bash", "arguments": { "command": "ls" } }
+ * { "name": "bash", "parameters": { "command": "ls" } }
+ * { "function": { "name": "bash", "arguments": { "command": "ls" } } }
+ * ```
+ */
+function parseToolCallJSON(
+  json: unknown,
+  knownToolNames: ReadonlySet<string>,
+): ToolUseBlock | null {
+  if (json === null || typeof json !== 'object' || Array.isArray(json)) {
+    return null
+  }
+
+  const obj = json as Record<string, unknown>
+
+  // Shape: { function: { name, arguments } }
+  if (typeof obj['function'] === 'object' && obj['function'] !== null) {
+    const fn = obj['function'] as Record<string, unknown>
+    return parseFlat(fn, knownToolNames)
+  }
+
+  // Shape: { name, arguments|parameters }
+  return parseFlat(obj, knownToolNames)
+}
+
+function parseFlat(
+  obj: Record<string, unknown>,
+  knownToolNames: ReadonlySet<string>,
+): ToolUseBlock | null {
+  const name = obj['name']
+  if (typeof name !== 'string' || name.length === 0) return null
+
+  // Whitelist check — don't treat arbitrary JSON as a tool call
+  if (knownToolNames.size > 0 && !knownToolNames.has(name)) return null
+
+  let input: Record<string, unknown> = {}
+  const args = obj['arguments'] ?? obj['parameters'] ?? obj['input']
+  if (args !== null && args !== undefined) {
+    if (typeof args === 'string') {
+      try {
+        const parsed = JSON.parse(args)
+        if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
+          input = parsed as Record<string, unknown>
+        }
+      } catch {
+        // Malformed — use empty input
+      }
+    } else if (typeof args === 'object' && !Array.isArray(args)) {
+      input = args as Record<string, unknown>
+    }
+  }
+
+  return {
+    type: 'tool_use',
+    id: generateToolCallId(),
+    name,
+    input,
+  }
+}
+
+// ---------------------------------------------------------------------------
+// JSON extraction from text
+// ---------------------------------------------------------------------------
+
+/**
+ * Find all top-level JSON objects in a string by tracking brace depth.
+ * Returns the parsed objects (not sub-objects).
+ */
+function extractJSONObjects(text: string): unknown[] {
+  const results: unknown[] = []
+  let depth = 0
+  let start = -1
+  let inString = false
+  let escape = false
+
+  for (let i = 0; i < text.length; i++) {
+    const ch = text[i]!
+
+    if (escape) {
+      escape = false
+      continue
+    }
+
+    if (ch === '\\' && inString) {
+      escape = true
+      continue
+    }
+
+    if (ch === '"' && depth > 0) { // quotes in prose outside an object are ignored
+      inString = !inString
+      continue
+    }
+
+    if (inString) continue
+
+    if (ch === '{') {
+      if (depth === 0) start = i
+      depth++
+    } else if (ch === '}' && depth > 0) { // a stray '}' in prose must not drive depth negative
+      depth--
+      if (depth === 0 && start !== -1) {
+        const candidate = text.slice(start, i + 1)
+        try {
+          results.push(JSON.parse(candidate))
+        } catch {
+          // Not valid JSON — skip
+        }
+        start = -1
+      }
+    }
+  }
+
+  return results
+}
+
+// ---------------------------------------------------------------------------
+// Hermes format: <tool_call>...</tool_call>
+// ---------------------------------------------------------------------------
+
+function extractHermesToolCalls(
+  text: string,
+  knownToolNames: ReadonlySet<string>,
+): ToolUseBlock[] {
+  const results: ToolUseBlock[] = []
+
+  for (const match of text.matchAll(/<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g)) {
+    const inner = match[1]!.trim()
+    try {
+      const parsed: unknown = JSON.parse(inner)
+      const block = parseToolCallJSON(parsed, knownToolNames)
+      if (block !== null) results.push(block)
+    } catch {
+      // Malformed hermes content — skip
+    }
+  }
+
+  return results
+}
+
+// ---------------------------------------------------------------------------
+// Public API
+// ---------------------------------------------------------------------------
+
+/**
+ * Attempt to extract tool calls from a model's text output.
+ *
+ * Tries multiple strategies in order:
+ * 1. Hermes `<tool_call>` tags
+ * 2. JSON objects in text (bare or inside code fences)
+ *
+ * @param text           - The model's text output.
+ * @param knownToolNames - Whitelist of registered tool names. When non-empty,
+ *                         only JSON objects whose `name` matches a known tool
+ *                         are treated as tool calls; when empty, any named object is accepted.
+ * @returns Extracted {@link ToolUseBlock}s, or an empty array if none found.
+ */
+export function extractToolCallsFromText(
+  text: string,
+  knownToolNames: string[],
+): ToolUseBlock[] {
+  if (text.length === 0) return []
+
+  const nameSet = new Set(knownToolNames)
+
+  // Strategy 1: Hermes format
+  const hermesResults = extractHermesToolCalls(text, nameSet)
+  if (hermesResults.length > 0) return hermesResults
+
+  // Strategy 2: Strip code fences, then extract JSON objects
+  const stripped = text.replace(/```(?:json)?\s*\n?([\s\S]*?)\n?\s*```/g, '$1')
+  const jsonObjects = extractJSONObjects(stripped)
+
+  const results: ToolUseBlock[] = []
+  for (const obj of jsonObjects) {
+    const block = parseToolCallJSON(obj, nameSet)
+    if (block !== null) results.push(block)
+  }
+
+  return results
+}
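
Review note: the brace-depth scan used by `extractJSONObjects` can be tried in isolation. The sketch below is a standalone illustration, not the committed code: `scanTopLevelObjects` is a hypothetical name, and this variant tracks string state only while inside an object and ignores closing braces that have no matching opener.

```typescript
// Hypothetical standalone sketch of top-level JSON object scanning.
// `scanTopLevelObjects` is an illustrative name, not part of the commit.
function scanTopLevelObjects(text: string): unknown[] {
  const found: unknown[] = []
  let depth = 0        // current brace nesting level
  let start = -1       // index of the '{' that opened the current top-level object
  let inString = false // true while inside a JSON string literal
  let escaped = false  // true when the previous character was a backslash

  for (let i = 0; i < text.length; i++) {
    const ch = text[i]

    if (inString) {
      // Inside a string: only look for the closing quote, honoring escapes.
      if (escaped) escaped = false
      else if (ch === '\\') escaped = true
      else if (ch === '"') inString = false
      continue
    }

    if (ch === '"' && depth > 0) {
      inString = true
    } else if (ch === '{') {
      if (depth === 0) start = i
      depth++
    } else if (ch === '}' && depth > 0) {
      depth--
      if (depth === 0) {
        try {
          found.push(JSON.parse(text.slice(start, i + 1)))
        } catch {
          // Balanced braces but not valid JSON: skip this candidate.
        }
      }
    }
  }
  return found
}

// Prose quotes, a brace inside a string value, and a trailing stray brace.
const sample = 'He said "run it": {"name": "bash", "arguments": {"command": "echo }"}} done }'
console.log(scanTopLevelObjects(sample))
```

Only the outermost object is returned; the nested `arguments` object is part of it rather than a separate result, which is what the whitelist check in `parseFlat` relies on.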
--- a/src/types.ts
+++ b/src/types.ts
@ -194,7 +194,7 @@ export interface BeforeRunHookContext {
 export interface AgentConfig {
  readonly name: string
  readonly model: string
-  readonly provider?: 'anthropic' | 'copilot' | 'grok' | 'openai'
+  readonly provider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
  /**
   * Custom base URL for OpenAI-compatible APIs (Ollama, vLLM, LM Studio, etc.).
   * Note: local servers that don't require auth still need `apiKey` set to a
@ -209,6 +209,12 @@ export interface AgentConfig {
  readonly maxTurns?: number
  readonly maxTokens?: number
  readonly temperature?: number
+  /**
+   * Maximum wall-clock time (in milliseconds) for the entire agent run.
+   * When exceeded, the run is aborted via `AbortSignal.timeout()`.
+   * Useful for local models where inference can be unpredictably slow.
+   */
+  readonly timeoutMs?: number
  /**
   * Loop detection configuration. When set, the agent tracks repeated tool
   * calls and text outputs to detect stuck loops before `maxTurns` is reached.
@ -380,7 +386,7 @@ export interface OrchestratorEvent {
 export interface OrchestratorConfig {
  readonly maxConcurrency?: number
  readonly defaultModel?: string
-  readonly defaultProvider?: 'anthropic' | 'copilot' | 'grok' | 'openai'
+  readonly defaultProvider?: 'anthropic' | 'copilot' | 'grok' | 'openai' | 'gemini'
  readonly defaultBaseURL?: string
  readonly defaultApiKey?: string
  readonly onProgress?: (event: OrchestratorEvent) => void
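
Review note: the `timeoutMs` doc comment says the run is aborted via `AbortSignal.timeout()`, but this section does not show how the runner merges that with a caller-supplied abort signal. One possible approach, offered as an assumption rather than the committed implementation, is sketched below. `resolveRunSignal` is a hypothetical helper; it deliberately avoids `AbortSignal.any()` because that needs a newer Node.js than the README's stated minimum of 18.

```typescript
// Hypothetical helper: combine an optional timeout with an optional caller signal.
// Neither the name nor the merging strategy is taken from the commit.
function resolveRunSignal(timeoutMs?: number, external?: AbortSignal): AbortSignal | undefined {
  const timeout = timeoutMs === undefined ? undefined : AbortSignal.timeout(timeoutMs)
  if (timeout === undefined) return external
  if (external === undefined) return timeout

  // Both present: forward whichever aborts first to a fresh controller.
  const controller = new AbortController()
  for (const source of [timeout, external]) {
    if (source.aborted) {
      controller.abort(source.reason)
      break
    }
    source.addEventListener('abort', () => controller.abort(source.reason), { once: true })
  }
  return controller.signal
}

// Usage: a 30 s budget that the caller can also cancel early.
const user = new AbortController()
const runSignal = resolveRunSignal(30_000, user.signal)
user.abort(new Error('cancelled by caller'))
console.log(runSignal?.aborted)
```

The returned signal can then be passed to any API that accepts an `AbortSignal`, so the timeout and the manual cancel share one code path.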
--- a/tests/gemini-adapter.test.ts
+++ b/tests/gemini-adapter.test.ts
@ -0,0 +1,97 @@
+import { describe, it, expect, vi, beforeEach } from 'vitest'
+
+// ---------------------------------------------------------------------------
+// Mock GoogleGenAI constructor (must be hoisted for Vitest)
+// ---------------------------------------------------------------------------
+const GoogleGenAIMock = vi.hoisted(() => vi.fn())
+
+vi.mock('@google/genai', () => ({
+  GoogleGenAI: GoogleGenAIMock,
+  FunctionCallingConfigMode: { AUTO: 'AUTO' },
+}))
+
+import { GeminiAdapter } from '../src/llm/gemini.js'
+import { createAdapter } from '../src/llm/adapter.js'
+
+// ---------------------------------------------------------------------------
+// GeminiAdapter tests
+// ---------------------------------------------------------------------------
+
+describe('GeminiAdapter', () => {
+  beforeEach(() => {
+    GoogleGenAIMock.mockClear()
+  })
+
+  it('has name "gemini"', () => {
+    const adapter = new GeminiAdapter()
+    expect(adapter.name).toBe('gemini')
+  })
+
+  it('uses GEMINI_API_KEY by default', () => {
+    const originalGemini = process.env['GEMINI_API_KEY']
+    const originalGoogle = process.env['GOOGLE_API_KEY']
+    process.env['GEMINI_API_KEY'] = 'gemini-env-key'
+    delete process.env['GOOGLE_API_KEY']
+
+    try {
+      new GeminiAdapter()
+      expect(GoogleGenAIMock).toHaveBeenCalledWith(
+        expect.objectContaining({
+          apiKey: 'gemini-env-key',
+        }),
+      )
+    } finally {
+      if (originalGemini === undefined) {
+        delete process.env['GEMINI_API_KEY']
+      } else {
+        process.env['GEMINI_API_KEY'] = originalGemini
+      }
+      if (originalGoogle === undefined) {
+        delete process.env['GOOGLE_API_KEY']
+      } else {
+        process.env['GOOGLE_API_KEY'] = originalGoogle
+      }
+    }
+  })
+
+  it('falls back to GOOGLE_API_KEY when GEMINI_API_KEY is unset', () => {
+    const originalGemini = process.env['GEMINI_API_KEY']
+    const originalGoogle = process.env['GOOGLE_API_KEY']
+    delete process.env['GEMINI_API_KEY']
+    process.env['GOOGLE_API_KEY'] = 'google-env-key'
+
+    try {
+      new GeminiAdapter()
+      expect(GoogleGenAIMock).toHaveBeenCalledWith(
+        expect.objectContaining({
+          apiKey: 'google-env-key',
+        }),
+      )
+    } finally {
+      if (originalGemini === undefined) {
+        delete process.env['GEMINI_API_KEY']
+      } else {
+        process.env['GEMINI_API_KEY'] = originalGemini
+      }
+      if (originalGoogle === undefined) {
+        delete process.env['GOOGLE_API_KEY']
+      } else {
+        process.env['GOOGLE_API_KEY'] = originalGoogle
+      }
+    }
+  })
+
+  it('allows overriding apiKey explicitly', () => {
+    new GeminiAdapter('explicit-key')
+    expect(GoogleGenAIMock).toHaveBeenCalledWith(
+      expect.objectContaining({
+        apiKey: 'explicit-key',
+      }),
+    )
+  })
+
+  it('createAdapter("gemini") returns GeminiAdapter instance', async () => {
+    const adapter = await createAdapter('gemini')
+    expect(adapter).toBeInstanceOf(GeminiAdapter)
+  })
+})
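
Review note: taken together, the three key-related tests above imply a resolution order of explicit argument, then `GEMINI_API_KEY`, then `GOOGLE_API_KEY`. The adapter source is not part of this section, so the following is only a sketch of that order; `resolveGeminiApiKey` is a hypothetical function, and it takes the environment as a parameter so it can be checked without mutating `process.env`.

```typescript
// Hypothetical sketch of the key precedence the tests imply.
type Env = Record<string, string | undefined>

function resolveGeminiApiKey(explicit: string | undefined, env: Env): string | undefined {
  // An explicit key wins; otherwise prefer GEMINI_API_KEY over GOOGLE_API_KEY.
  return explicit ?? env['GEMINI_API_KEY'] ?? env['GOOGLE_API_KEY']
}

console.log(resolveGeminiApiKey(undefined, { GOOGLE_API_KEY: 'google-env-key' }))
```

Passing the environment in also removes the need for the save-and-restore `try`/`finally` blocks when testing the precedence logic on its own.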
--- a/tests/openai-fallback.test.ts
+++ b/tests/openai-fallback.test.ts
@ -0,0 +1,159 @@
+import { describe, it, expect } from 'vitest'
+import { fromOpenAICompletion } from '../src/llm/openai-common.js'
+import type { ChatCompletion } from 'openai/resources/chat/completions/index.js'
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+function makeCompletion(overrides: {
+  content?: string | null
+  tool_calls?: ChatCompletion.Choice['message']['tool_calls']
+  finish_reason?: string
+}): ChatCompletion {
+  return {
+    id: 'chatcmpl-test',
+    object: 'chat.completion',
+    created: Date.now(),
+    model: 'test-model',
+    choices: [
+      {
+        index: 0,
+        message: {
+          role: 'assistant',
+          content: overrides.content ?? null,
+          tool_calls: overrides.tool_calls,
+          refusal: null,
+        },
+        finish_reason: (overrides.finish_reason ?? 'stop') as 'stop' | 'tool_calls',
+        logprobs: null,
+      },
+    ],
+    usage: {
+      prompt_tokens: 10,
+      completion_tokens: 20,
+      total_tokens: 30,
+    },
+  }
+}
+
+const TOOL_NAMES = ['bash', 'file_read', 'file_write']
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+describe('fromOpenAICompletion fallback extraction', () => {
+  it('returns normal tool_calls when present (no fallback)', () => {
+    const completion = makeCompletion({
+      content: 'Let me run a command.',
+      tool_calls: [
+        {
+          id: 'call_123',
+          type: 'function',
+          function: {
+            name: 'bash',
+            arguments: '{"command": "ls"}',
+          },
+        },
+      ],
+      finish_reason: 'tool_calls',
+    })
+
+    const response = fromOpenAICompletion(completion, TOOL_NAMES)
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    expect(toolBlocks).toHaveLength(1)
+    expect(toolBlocks[0]!.type === 'tool_use' && toolBlocks[0]!.name).toBe('bash')
+    expect(toolBlocks[0]!.type === 'tool_use' && toolBlocks[0]!.id).toBe('call_123')
+    expect(response.stop_reason).toBe('tool_use')
+  })
+
+  it('extracts tool calls from text when tool_calls is absent', () => {
+    const completion = makeCompletion({
+      content: 'I will run this:\n{"name": "bash", "arguments": {"command": "pwd"}}',
+      finish_reason: 'stop',
+    })
+
+    const response = fromOpenAICompletion(completion, TOOL_NAMES)
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    expect(toolBlocks).toHaveLength(1)
+    expect(toolBlocks[0]!.type === 'tool_use' && toolBlocks[0]!.name).toBe('bash')
+    expect(toolBlocks[0]!.type === 'tool_use' && toolBlocks[0]!.input).toEqual({ command: 'pwd' })
+    // stop_reason should be corrected to tool_use
+    expect(response.stop_reason).toBe('tool_use')
+  })
+
+  it('does not fall back when knownToolNames is not provided', () => {
+    const completion = makeCompletion({
+      content: '{"name": "bash", "arguments": {"command": "ls"}}',
+      finish_reason: 'stop',
+    })
+
+    const response = fromOpenAICompletion(completion)
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    expect(toolBlocks).toHaveLength(0)
+    expect(response.stop_reason).toBe('end_turn')
+  })
+
+  it('does not fall back when knownToolNames is empty', () => {
+    const completion = makeCompletion({
+      content: '{"name": "bash", "arguments": {"command": "ls"}}',
+      finish_reason: 'stop',
+    })
+
+    const response = fromOpenAICompletion(completion, [])
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    expect(toolBlocks).toHaveLength(0)
+    expect(response.stop_reason).toBe('end_turn')
+  })
+
+  it('returns plain text when no tool calls found in text', () => {
+    const completion = makeCompletion({
+      content: 'Hello! How can I help you today?',
+      finish_reason: 'stop',
+    })
+
+    const response = fromOpenAICompletion(completion, TOOL_NAMES)
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    expect(toolBlocks).toHaveLength(0)
+    expect(response.stop_reason).toBe('end_turn')
+  })
+
+  it('preserves text block alongside extracted tool blocks', () => {
+    const completion = makeCompletion({
+      content: 'Let me check:\n{"name": "file_read", "arguments": {"path": "/tmp/x"}}',
+      finish_reason: 'stop',
+    })
+
+    const response = fromOpenAICompletion(completion, TOOL_NAMES)
+    const textBlocks = response.content.filter(b => b.type === 'text')
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    expect(textBlocks).toHaveLength(1)
+    expect(toolBlocks).toHaveLength(1)
+  })
+
+  it('does not double-extract when native tool_calls already present', () => {
+    // Text also contains a tool call JSON, but native tool_calls is populated.
+    // The fallback should NOT run.
+    const completion = makeCompletion({
+      content: '{"name": "file_read", "arguments": {"path": "/tmp/y"}}',
+      tool_calls: [
+        {
+          id: 'call_native',
+          type: 'function',
+          function: {
+            name: 'bash',
+            arguments: '{"command": "ls"}',
+          },
+        },
+      ],
+      finish_reason: 'tool_calls',
+    })
+
+    const response = fromOpenAICompletion(completion, TOOL_NAMES)
+    const toolBlocks = response.content.filter(b => b.type === 'tool_use')
+    // Should only have the native one, not the text-extracted one
+    expect(toolBlocks).toHaveLength(1)
+    expect(toolBlocks[0]!.type === 'tool_use' && toolBlocks[0]!.id).toBe('call_native')
+  })
+})
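
Review note: several of these tests assert that `stop_reason` becomes `tool_use` when the server reported `stop` but tool calls were recovered from text. The rule can be sketched on its own as below. `normalizeFinishReasonSketch` is a stand-in for the real `normalizeFinishReason`, whose body is not shown in this section; only the `stop` and `tool_calls` mappings are taken from the tests, and passing other values through unchanged is an assumption.

```typescript
// Stand-in for normalizeFinishReason; only the two mappings below are
// confirmed by the tests, the pass-through default is an assumption.
function normalizeFinishReasonSketch(raw: string): string {
  if (raw === 'stop') return 'end_turn'
  if (raw === 'tool_calls') return 'tool_use'
  return raw
}

// Mirrors the ternary added in the streaming path: a plain "stop" is
// upgraded to "tool_use" when the response actually carries tool calls.
function resolveStopReason(raw: string, hasToolUseBlocks: boolean): string {
  return hasToolUseBlocks && raw === 'stop'
    ? 'tool_use'
    : normalizeFinishReasonSketch(raw)
}

console.log(resolveStopReason('stop', true), resolveStopReason('stop', false))
```

Without the upgrade, an agent loop that decides whether to execute tools based on `stop_reason` would treat a text-extracted call as a final answer.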
--- a/tests/text-tool-extractor.test.ts
+++ b/tests/text-tool-extractor.test.ts
@ -0,0 +1,170 @@
+import { describe, it, expect } from 'vitest'
+import { extractToolCallsFromText } from '../src/tool/text-tool-extractor.js'
+
+const TOOLS = ['bash', 'file_read', 'file_write']
+
+describe('extractToolCallsFromText', () => {
+  // -------------------------------------------------------------------------
+  // No tool calls
+  // -------------------------------------------------------------------------
+
+  it('returns empty array for empty text', () => {
+    expect(extractToolCallsFromText('', TOOLS)).toEqual([])
+  })
+
+  it('returns empty array for plain text with no JSON', () => {
+    expect(extractToolCallsFromText('Hello, I am a helpful assistant.', TOOLS)).toEqual([])
+  })
+
+  it('returns empty array for JSON that does not match any known tool', () => {
+    const text = '{"name": "unknown_tool", "arguments": {"x": 1}}'
+    expect(extractToolCallsFromText(text, TOOLS)).toEqual([])
+  })
+
+  // -------------------------------------------------------------------------
+  // Bare JSON
+  // -------------------------------------------------------------------------
+
+  it('extracts a bare JSON tool call with "arguments"', () => {
+    const text = 'I will run this command:\n{"name": "bash", "arguments": {"command": "ls -la"}}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.type).toBe('tool_use')
+    expect(result[0]!.name).toBe('bash')
+    expect(result[0]!.input).toEqual({ command: 'ls -la' })
+    expect(result[0]!.id).toMatch(/^extracted_call_/)
+  })
+
+  it('extracts a bare JSON tool call with "parameters"', () => {
+    const text = '{"name": "file_read", "parameters": {"path": "/tmp/test.txt"}}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('file_read')
+    expect(result[0]!.input).toEqual({ path: '/tmp/test.txt' })
+  })
+
+  it('extracts a bare JSON tool call with "input"', () => {
+    const text = '{"name": "bash", "input": {"command": "pwd"}}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('bash')
+    expect(result[0]!.input).toEqual({ command: 'pwd' })
+  })
+
+  it('extracts { function: { name, arguments } } shape', () => {
+    const text = '{"function": {"name": "bash", "arguments": {"command": "echo hi"}}}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('bash')
+    expect(result[0]!.input).toEqual({ command: 'echo hi' })
+  })
+
+  it('handles string-encoded arguments', () => {
+    const text = '{"name": "bash", "arguments": "{\\"command\\": \\"ls\\"}"}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.input).toEqual({ command: 'ls' })
+  })
+
+  // -------------------------------------------------------------------------
+  // Multiple tool calls
+  // -------------------------------------------------------------------------
+
+  it('extracts multiple tool calls from text', () => {
+    const text = `Let me do two things:
+{"name": "bash", "arguments": {"command": "ls"}}
+And then:
+{"name": "file_read", "arguments": {"path": "/tmp/x"}}`
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(2)
+    expect(result[0]!.name).toBe('bash')
+    expect(result[1]!.name).toBe('file_read')
+  })
+
+  // -------------------------------------------------------------------------
+  // Code fence wrapped
+  // -------------------------------------------------------------------------
+
+  it('extracts tool call from markdown code fence', () => {
+    const text = 'Here is the tool call:\n```json\n{"name": "bash", "arguments": {"command": "whoami"}}\n```'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('bash')
+    expect(result[0]!.input).toEqual({ command: 'whoami' })
+  })
+
+  it('extracts tool call from code fence without language tag', () => {
+    const text = '```\n{"name": "file_write", "arguments": {"path": "/tmp/a.txt", "content": "hi"}}\n```'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('file_write')
+  })
+
+  // -------------------------------------------------------------------------
+  // Hermes format
+  // -------------------------------------------------------------------------
+
+  it('extracts tool call from <tool_call> tags', () => {
+    const text = '<tool_call>\n{"name": "bash", "arguments": {"command": "date"}}\n</tool_call>'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('bash')
+    expect(result[0]!.input).toEqual({ command: 'date' })
+  })
+
+  it('extracts multiple hermes tool calls', () => {
+    const text = `<tool_call>{"name": "bash", "arguments": {"command": "ls"}}</tool_call>
+Some text in between
+<tool_call>{"name": "file_read", "arguments": {"path": "/tmp/x"}}</tool_call>`
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(2)
+    expect(result[0]!.name).toBe('bash')
+    expect(result[1]!.name).toBe('file_read')
+  })
+
+  // -------------------------------------------------------------------------
+  // Edge cases
+  // -------------------------------------------------------------------------
+
+  it('skips malformed JSON gracefully', () => {
+    const text = '{"name": "bash", "arguments": {invalid json}}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toEqual([])
+  })
+
+  it('skips JSON objects without a name field', () => {
+    const text = '{"command": "ls", "arguments": {"x": 1}}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toEqual([])
+  })
+
+  it('works with empty knownToolNames (no whitelist filtering)', () => {
+    const text = '{"name": "anything", "arguments": {"x": 1}}'
+    const result = extractToolCallsFromText(text, [])
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('anything')
+  })
+
+  it('generates unique IDs for each extracted call', () => {
+    const text = `{"name": "bash", "arguments": {"command": "a"}}
+{"name": "bash", "arguments": {"command": "b"}}`
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(2)
+    expect(result[0]!.id).not.toBe(result[1]!.id)
+  })
+
+  it('handles tool call with no arguments', () => {
+    const text = '{"name": "bash"}'
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.input).toEqual({})
+  })
+
+  it('ignores JSON objects in the text that are not tool calls', () => {
+    const text = `Here is some config: {"port": 3000, "host": "localhost"}
+And a tool call: {"name": "bash", "arguments": {"command": "ls"}}`
+    const result = extractToolCallsFromText(text, TOOLS)
+    expect(result).toHaveLength(1)
+    expect(result[0]!.name).toBe('bash')
+  })
+})
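
Review note: the code-fence and Hermes cases above both come down to one regular expression each. The sketch below reuses the two patterns from `text-tool-extractor.ts` in standalone form so their captures can be inspected directly. The helper names `stripFences` and `hermesPayloads` are hypothetical, and the fence marker is built with `repeat` only to keep literal triple backticks out of this example.

```typescript
// Three backticks, built indirectly so this example can live inside a fence.
const FENCE = '`'.repeat(3)

// Same pattern as the extractor's fence stripper, assembled from a string.
const fencePattern = new RegExp(FENCE + '(?:json)?\\s*\\n?([\\s\\S]*?)\\n?\\s*' + FENCE, 'g')

// Hypothetical helper: replace each fenced block with its inner content.
function stripFences(text: string): string {
  return text.replace(fencePattern, '$1')
}

// Hypothetical helper: return the raw payload of every <tool_call> tag pair.
function hermesPayloads(text: string): string[] {
  const matches = text.matchAll(/<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g)
  return Array.from(matches, m => m[1] ?? '')
}

const fenced = 'x ' + FENCE + 'json\n{"a":1}\n' + FENCE + ' y'
console.log(stripFences(fenced))
console.log(hermesPayloads('<tool_call>\n{"name":"bash"}\n</tool_call>'))
```

Both captures are lazy, so two fenced blocks or two tag pairs in the same text are matched separately rather than merged into one span.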