Examples grew to 19 flat files mixing basics, provider demos, orchestration
patterns, and integrations, with two files colliding on the number 16.
Reorganized into category folders so the structure scales as new providers
and patterns get added.
Layout:
  examples/basics/        core execution modes (4 files)
  examples/providers/     one example per supported model provider (8 files)
  examples/patterns/      reusable orchestration patterns (6 files)
  examples/integrations/  MCP, observability, AI SDK (3 entries)
  examples/production/    placeholder for end-to-end use cases
Notable changes:
- Dropped numeric prefixes; folder + filename now signal category and intent.
- Rewrote former smoke-test scripts (copilot, gemini) into proper three-agent
team examples matching the deepseek/grok/minimax/groq template. Adapter
unit tests in tests/ already cover correctness, so this only improves
documentation quality.
- Added examples/README.md as the categorized index plus maintenance rules
for new submissions.
- Added examples/production/README.md with acceptance criteria for the new
production category.
- Updated all internal npx tsx paths and import paths (../src/ to ../../src/).
- Updated README.md and README_zh.md links.
- Fixed stale cd paths inside examples/integrations/with-vercel-ai-sdk/README.md.
Implements the `delegate_to_agent` built-in tool (closes #63). Registration is opt-in via `includeDelegateTool` and is only wired up by `runTeam` / `runTasks` for pool workers. Guards: self-delegation, unknown target, cycle detection via `delegationChain`, depth cap (`maxDelegationDepth`, default 3), pool deadlock.
Delegation runs on ephemeral Agent instances via `AgentPool.runEphemeral` (pool semaphore only, no per-agent lock) so mutual delegation (A→B while B→A) can't deadlock. Delegated run `tokenUsage` surfaces via `ToolResult.metadata` and rolls into the parent runner's total before the next budget check; delegation tool_result blocks are exempt from `compressToolResults` and the `compact` strategy. Best-effort SharedMemory audit writes at `{caller}/delegation:{target}:{ts}-{rand}`.
Picks up @NamelessNATM's work from #84 and adds cycle detection, token aggregation, compression exemption, mutual-delegation deadlock fix (Codex P1), and tool_result-preservation on budget-exceeded (Codex P2).
Co-authored-by: NamelessNATM <hamzarstar@gmail.com>
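The delegation guards listed above can be illustrated with a minimal sketch. The `DelegationContext` shape, the `checkDelegation` helper, and the convention that `delegationChain` includes the caller are assumptions made for illustration; only `delegationChain` and `maxDelegationDepth` are names taken from this change. The pool-deadlock guard is not modeled here.

```typescript
// Hypothetical types for illustration; not the project's actual definitions.
interface DelegationContext {
  caller: string;
  // Agents on the current delegation path, oldest first, caller last (assumed).
  delegationChain: string[];
  maxDelegationDepth: number;
  knownAgents: Set<string>;
}

type GuardResult = { ok: true } | { ok: false; reason: string };

function checkDelegation(ctx: DelegationContext, target: string): GuardResult {
  if (target === ctx.caller) {
    return { ok: false, reason: "self-delegation" };
  }
  if (!ctx.knownAgents.has(target)) {
    return { ok: false, reason: "unknown target" };
  }
  // A target already on the path would close a loop (A -> B -> A).
  if (ctx.delegationChain.includes(target)) {
    return { ok: false, reason: "cycle" };
  }
  // Delegations made so far = chain length minus the original agent.
  const depth = ctx.delegationChain.length - 1;
  if (depth >= ctx.maxDelegationDepth) {
    return { ok: false, reason: "depth cap" };
  }
  return { ok: true };
}
```

Under these assumptions, with a cap of 3 the path A, B, C may still delegate to D, but D may not delegate further.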
- Document AgentConfig.customTools alongside agent.addTool()
- Add Tool Output Control section (maxToolOutputChars, maxOutputChars, compressToolResults)
- Add Context Management section covering all four strategies (sliding-window, summarize, compact, custom)
- Bump examples count 18 to 19 and add example 16 (MCP) and example 19 (Groq) to curated list
- Sync README_zh with README: add CLI (oma) note, full MCP Tools section, Groq row in providers table
- Drop stale rentech-quant-platform entry from Used by
* feat: add rule-based compact context strategy (#111)
Add `contextStrategy: 'compact'` as a zero-LLM-cost alternative to `summarize`.
Instead of making an LLM call to compress everything into prose, it selectively
compresses old turns using structural rules:
- Preserve tool_use blocks (agent decisions) and error tool_results
- Replace long tool_result content with compact markers including tool name
- Truncate long assistant text blocks with head excerpts
- Keep recent turns (configurable via preserveRecentTurns) fully intact
- Detect already-compressed markers from compressToolResults to avoid double-processing
Closes #111
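The structural rules above can be sketched roughly as follows. This is an illustrative approximation, not the shipped implementation: the `Block` and `Message` shapes, the marker text, the `maxChars` option, and the assumption that one turn equals two messages are all hypothetical. Only `preserveRecentTurns` is a name from this change.

```typescript
// Hypothetical message shapes for illustration.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

interface CompactOptions {
  preserveRecentTurns: number;
  maxChars: number; // assumed threshold for "long" content
}

const MARKER_PREFIX = "[compressed"; // assumed marker prefix

function compact(messages: Message[], opts: CompactOptions): Message[] {
  // Map tool_use ids to tool names so markers can mention the tool.
  const toolNames = new Map<string, string>();
  for (const m of messages) {
    for (const b of m.content) {
      if (b.type === "tool_use") toolNames.set(b.id, b.name);
    }
  }
  // Assumption: one turn is a pair of messages.
  const cutoff = Math.max(0, messages.length - opts.preserveRecentTurns * 2);

  return messages.map((m, i): Message => {
    if (i >= cutoff) return m; // recent turns stay fully intact
    const content = m.content.map((b): Block => {
      if (b.type === "tool_use") return b; // agent decisions are preserved
      if (b.type === "tool_result") {
        const keep =
          b.is_error === true ||
          b.content.startsWith(MARKER_PREFIX) || // already compressed
          b.content.length <= opts.maxChars;
        if (keep) return b;
        const name = toolNames.get(b.tool_use_id) ?? "unknown";
        return { ...b, content: `${MARKER_PREFIX} ${name} result, ${b.content.length} chars]` };
      }
      if (m.role === "assistant" && b.text.length > opts.maxChars) {
        // Keep only a head excerpt of long assistant text.
        return { ...b, text: `${b.text.slice(0, opts.maxChars)} [truncated]` };
      }
      return b;
    });
    return { role: m.role, content };
  });
}
```

Because every rule is a cheap string or type check, the pass makes no LLM call, which is the trade-off against `summarize` described above.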
* fix: remove redundant length guard and fix compact type indentation
When minChars is set low, compressed markers could be re-compressed
with incorrect char counts. Skip blocks whose content already starts
with the compression prefix.
Replace consumed tool results with compact markers before each LLM call,
freeing context budget in multi-turn agent runs. A tool result is
"consumed" once the assistant has produced a response after seeing it.
- Add `compressToolResults` option to AgentConfig / RunnerOptions
- Runs before contextStrategy (lightweight, no LLM calls)
- Error results and short results (< minChars, default 500) are skipped
- 9 test cases covering default off, compression, parallel tools,
4+ turn compounding, error exemption, custom threshold, and
contextStrategy coexistence
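A minimal sketch of the "consumed" rule described above, including the prefix check that keeps markers from being re-compressed when `minChars` is small. The message shapes, the marker wording, and the `compressConsumed` name are assumptions for illustration; the real option is `compressToolResults`.

```typescript
// Hypothetical message shapes for illustration.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };

interface ChatMessage {
  role: "user" | "assistant";
  content: ContentBlock[];
}

const COMPRESSED_PREFIX = "[tool result compressed"; // assumed marker prefix

function compressConsumed(messages: ChatMessage[], minChars = 500): ChatMessage[] {
  // A tool result counts as consumed if an assistant message follows it.
  let lastAssistant = -1;
  messages.forEach((m, i) => {
    if (m.role === "assistant") lastAssistant = i;
  });

  return messages.map((m, i): ChatMessage => {
    if (i >= lastAssistant) return m; // not yet seen by the assistant
    const content = m.content.map((b): ContentBlock => {
      if (b.type !== "tool_result") return b;
      if (b.is_error === true) return b; // errors are exempt
      if (b.content.length < minChars) return b; // short results are skipped
      if (b.content.startsWith(COMPRESSED_PREFIX)) return b; // already a marker
      return { ...b, content: `${COMPRESSED_PREFIX}: ${b.content.length} chars]` };
    });
    return { role: m.role, content };
  });
}
```

Running the function before every LLM call is safe under these assumptions because a second pass leaves existing markers unchanged.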
* feat: add tool output auto-truncation at framework level (#110)
Prevent context blowup from large tool outputs by adding opt-in
character-based truncation (head 70% + tail 30% with marker).
Agent-level `maxToolOutputChars` and per-tool `maxOutputChars`
with per-tool taking priority. Marker overhead is budgeted so
the result never exceeds the configured limit.
* fix: truncateToolOutput may exceed maxChars when limit < marker overhead
- Fall back to hard slice when maxChars is too small to fit the marker
- Fix misplaced JSDoc for outputSchema in AgentConfig
- Tighten test assertion to verify length <= maxChars
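The head/tail split and the hard-slice fallback can be sketched as below. The marker wording and both function names are assumptions; the actual `truncateToolOutput` may differ in detail. The only properties taken from the description are the 70/30 split, the marker being counted against the limit, the fallback when the limit is smaller than the marker, and per-tool `maxOutputChars` taking priority over agent-level `maxToolOutputChars`.

```typescript
// Per-tool limit wins over the agent-level limit; undefined means no truncation.
function resolveLimit(maxOutputChars?: number, maxToolOutputChars?: number): number | undefined {
  return maxOutputChars ?? maxToolOutputChars;
}

function truncateOutput(output: string, maxChars: number): string {
  if (output.length <= maxChars) return output;

  // Assumed marker text; it depends only on the original length,
  // so its size is known before the budget is split.
  const marker = `\n[... output truncated, original length ${output.length} chars ...]\n`;
  const budget = maxChars - marker.length;

  // Limit too small to fit the marker: fall back to a hard slice.
  if (budget <= 0) return output.slice(0, maxChars);

  const head = Math.floor(budget * 0.7);
  const tail = budget - head;
  // Use an explicit start index: slice(-0) would return the whole string.
  return output.slice(0, head) + marker + output.slice(output.length - tail);
}
```

With this budgeting the truncated result has exactly `maxChars` characters whenever the marker fits, and never more than `maxChars` otherwise.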
* feat: add customTools support to AgentConfig for orchestrator-level tool injection
Users can now pass custom ToolDefinition objects via AgentConfig.customTools,
which are registered alongside built-in tools in all orchestrator paths
(runAgent, runTeam, runTasks). Custom tools bypass allowlist/preset filtering
but can still be blocked by disallowedTools.
Ref #108
* test: add disallowedTools blocking custom tool test
* fix: apply disallowedTools filtering to runtime-added custom tools
Previously runtime-added tools bypassed all filtering including
disallowedTools, contradicting the documented behavior. Now custom
tools still bypass preset/allowlist but respect the denylist.
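The resulting filter order can be sketched as follows. `customTools` and `disallowedTools` are names from this change; `allowedTools`, the `ToolDef` shape, and `resolveTools` are hypothetical stand-ins for the allowlist/preset mechanism.

```typescript
// Hypothetical shapes for illustration.
interface ToolDef {
  name: string;
}

interface ToolFilterConfig {
  allowedTools?: string[]; // assumed name for the allowlist
  disallowedTools?: string[];
  customTools?: ToolDef[];
}

function resolveTools(builtIns: ToolDef[], cfg: ToolFilterConfig): ToolDef[] {
  const denied = new Set(cfg.disallowedTools ?? []);
  const allowed = cfg.allowedTools ? new Set(cfg.allowedTools) : undefined;

  // Built-in tools pass through the allowlist, then the denylist.
  const keptBuiltIns = builtIns.filter(
    (t) => (allowed === undefined || allowed.has(t.name)) && !denied.has(t.name),
  );
  // Custom tools skip the allowlist but still respect the denylist.
  const keptCustom = (cfg.customTools ?? []).filter((t) => !denied.has(t.name));

  return [...keptBuiltIns, ...keptCustom];
}
```

The design choice this illustrates: a user who injects a tool should not also have to allowlist it, while the denylist remains a hard stop for every tool regardless of origin.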
Helps maintainers triage by requiring contributors to indicate where
the idea originated (real use case, competitive reference, systematic
gap, or external discussion).
- #99: pass per-call effectiveAbortSignal to buildToolContext() so tools
receive the correct signal instead of the static runner-level one
- #100: replace manual pending-task loop with queue.skipRemaining() on
abort, fixing blocked tasks left non-terminal and missing events
- #101: forward abortSignal in Gemini adapter's buildConfig() so the
SDK can cancel in-flight API calls
- Add 8 targeted tests for all three fixes
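As an illustration of the #100 fix, a queue-level `skipRemaining()` might look like the sketch below. The `TaskQueueSketch` class, the status names, and the listener API are assumptions; the point is only that blocked tasks are moved to a terminal state together with pending ones, and that each transition emits an event.

```typescript
// Hypothetical status set and task shape for illustration.
type TaskStatus = "pending" | "blocked" | "in_progress" | "completed" | "failed" | "skipped";

interface TaskSketch {
  id: string;
  status: TaskStatus;
}

class TaskQueueSketch {
  private readonly tasks: TaskSketch[];
  private readonly listeners: Array<(task: TaskSketch) => void> = [];

  constructor(tasks: TaskSketch[]) {
    this.tasks = tasks;
  }

  onSkipped(listener: (task: TaskSketch) => void): void {
    this.listeners.push(listener);
  }

  // Move every task that has not started to a terminal state and notify listeners.
  skipRemaining(): void {
    for (const task of this.tasks) {
      if (task.status === "pending" || task.status === "blocked") {
        task.status = "skipped";
        for (const listener of this.listeners) listener(task);
      }
    }
  }
}
```

A manual loop that only looks at pending tasks would leave the blocked ones non-terminal, which is the failure mode described above.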
run() only handled 'done' events from stream(), silently dropping
'error' events. This caused failed LLM calls to return an empty
RunResult that the caller treated as successful.
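A sketch of the corrected control flow, assuming `run()` is a thin wrapper that drains `stream()`. The event and result shapes are hypothetical, and whether the real fix throws or returns a failed result is not stated here; this version throws.

```typescript
// Hypothetical event and result shapes for illustration.
type StreamEvent =
  | { type: "text"; data: string }
  | { type: "done"; data: { output: string } }
  | { type: "error"; data: Error };

interface RunResultSketch {
  success: boolean;
  output: string;
}

async function runToCompletion(stream: AsyncIterable<StreamEvent>): Promise<RunResultSketch> {
  for await (const event of stream) {
    if (event.type === "done") {
      return { success: true, output: event.data.output };
    }
    if (event.type === "error") {
      // Previously ignored; surfacing it keeps failures from looking like success.
      throw event.data;
    }
  }
  // A stream that ends without 'done' is also treated as a failure.
  throw new Error("stream ended without a 'done' event");
}
```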
Strategic rewrite following docs/project-evaluation-2026-04-09.md.
README.md and README_zh.md updated in lockstep.
Top fold changes:
- New tagline positioning against CrewAI and LangGraph
- Replace 11-bullet "Why" with 3 bullets (runTeam / 3 deps / multi-model)
- New Philosophy section with "we build / we don't build / tracking"
- "We don't build" limited to handoffs and checkpointing (softened);
Cloud/Studio bullet dropped to preserve future Hosted option
- New "How is this different from X?" FAQ covering LangGraph JS, CrewAI,
and Vercel AI SDK
- New "Used by" section with three early-stage integrations, framed
honestly for a new project (temodar-agent, rentech-quant-platform,
cybersecurity SOC home lab)
Examples section:
- Shrink 15-row catalog table to 4 featured entries + link to examples/
- Featured: 02 team collaboration, 06 local model, 09 structured output,
11 trace observability
- Eliminates maintenance debt of updating the table on every new example
Refinements during alignment pass:
- Launch date corrected to 2026-04-01 (matches first commit timestamp)
- Surface Gemini @google/genai peer dep in top fold and Providers table
- Rephrase "Agent handoffs" bullet to avoid reading as single-agent framework
- Update prose example to Opus 4.6 / GPT-5.4 / local Gemma 4
- Quick Start code example shortened ~30% (developer/reviewer collapsed
to stubs, still demonstrates multi-agent team shape)
- Remove CrewAI endorsement stats (48K stars / Andrew Ng / $18M) to keep
comparisons technical
- Drop Star History cache-buster since growth has stabilized; bump
contributors cache-buster to max=20 so all 8 contributors render
- Delete Author section; shrink Contributing to Examples + Documentation
Small carry-over fixes:
- Fix duplicated task_complete line in Quick Start output sample
- Add AgentPool.runParallel() note to Three Ways to Run
- Update source file count 33 → 35
Kept unchanged per scope:
- Architecture diagram, Built-in Tools, Supported Providers
Does not touch source code or package.json.
Split decisions into "Won't Do" (handoffs, checkpointing) and "Open to
Adoption" (MCP, A2A). Feature parity is a lead that competitors can close;
network effects from protocol adoption create a different kind of moat.
- MCP marked as "Next up" with optional peer dependency approach
- A2A marked as "Watching" with clear adoption trigger criteria