Strategic rewrite following docs/project-evaluation-2026-04-09.md.
README.md and README_zh.md updated in lockstep.
Top fold changes:
- New tagline positioning against CrewAI and LangGraph
- Replace 11-bullet "Why" with 3 bullets (runTeam / 3 deps / multi-model)
- New Philosophy section with "we build / we don't build / tracking"
- "We don't build" limited to handoffs and checkpointing (softened);
Cloud/Studio bullet dropped to preserve future Hosted option
- New "How is this different from X?" FAQ covering LangGraph JS, CrewAI,
and Vercel AI SDK
- New "Used by" section with three early-stage integrations, framed
honestly for a new project (temodar-agent, rentech-quant-platform,
cybersecurity SOC home lab)
Examples section:
- Shrink 15-row catalog table to 4 featured entries + link to examples/
- Featured: 02 team collaboration, 06 local model, 09 structured output,
11 trace observability
- Eliminates maintenance debt of updating the table on every new example
Refinements during alignment pass:
- Launch date corrected to 2026-04-01 (matches first commit timestamp)
- Surface Gemini @google/genai peer dep in top fold and Providers table
- Rephrase "Agent handoffs" bullet to avoid reading as single-agent framework
- Update prose example to Opus 4.6 / GPT-5.4 / local Gemma 4
- Quick Start code example shortened ~30% (developer/reviewer collapsed
to stubs, still demonstrates multi-agent team shape)
- Remove CrewAI endorsement stats (48K stars / Andrew Ng / $18M) to keep
comparisons technical
- Drop Star History cache-buster since growth has stabilized; bump
contributors cache-buster to max=20 so all 8 contributors render
- Delete Author section; shrink Contributing to Examples + Documentation
Small carry-over fixes:
- Fix duplicated task_complete line in Quick Start output sample
- Add AgentPool.runParallel() note to Three Ways to Run
- Update source file count 33 → 35
Kept unchanged per scope:
- Architecture diagram, Built-in Tools, Supported Providers
Does not touch source code or package.json.
Split decisions into "Won't Do" (handoffs, checkpointing) and "Open to
Adoption" (MCP, A2A). Feature parity is a race that can be caught;
network effects from protocol adoption create a different kind of moat.
- MCP marked as "Next up" with optional peer dependency approach
- A2A marked as "Watching" with clear adoption trigger criteria
The short-circuit block in runTeam() called this.runAgent(), which emits
its own agent_start/agent_complete events and increments completedTaskCount.
The short-circuit block then emitted the same events again, and
buildTeamRunResult() incremented the count a second time.
Fix: call buildAgent() + agent.run() directly, bypassing runAgent().
Events and counting are handled once by the short-circuit block and
buildTeamRunResult() respectively.
The agent system prompts and task descriptions implied agents could
explicitly read/write shared memory keys, but the framework handles
this automatically. Simplified to match actual behavior.
- Use 'task_start'/'task_complete' (underscores) instead of colons
- Use event.task/event.agent instead of non-existent taskTitle/agentName
- Remove Task import; runTasks() accepts a lighter inline type
Addresses all five review points from @JackChen-me on PR #70:
1. Extract shared keyword helpers into src/utils/keywords.ts so the
short-circuit selector and Scheduler.capability-match cannot drift.
Both orchestrator.ts and scheduler.ts now import the same module.
2. selectBestAgent now mirrors Scheduler.capability-match exactly,
including the asymmetric use of agent.model: agentKeywords includes
model, agentText does not. This restores parity with the documented
capability-match behaviour.
3. Remove isSimpleGoal and selectBestAgent from the public barrel
(src/index.ts). They remain exported from orchestrator.ts for unit
tests but are no longer part of the package API surface.
4. Forward the AbortSignal from runTeam(options) through the
short-circuit path. runAgent() now accepts an optional
{ abortSignal } argument; runTeam's short-circuit branch passes
the caller's signal so cancellation works for simple goals too.
5. Tighten the collaborate/coordinate complexity regexes so they only
fire on imperative directives ("collaborate with X", "coordinate
the team") and not on descriptive uses ("explain how pods
coordinate", "what is microservice collaboration").
Also fixes a pre-existing test failure in token-budget.test.ts:
"enforces orchestrator budget in runTeam" was using "Do work" as its
goal which now short-circuits, so the coordinator path the test was
exercising never ran. Switched to a multi-step goal.
Adds 60 new tests across short-circuit.test.ts and the new
keywords.test.ts covering all five fixes.
Co-Authored-By: Claude <noreply@anthropic.com>
When a goal is short (<200 chars) and contains no multi-step or
coordination signals, runTeam() now dispatches directly to the
best-matching agent — skipping the coordinator decomposition and
synthesis round-trips. This saves ~2 LLM calls worth of tokens
and latency for genuinely simple goals.
Complexity detection uses regex patterns for sequencing markers
(first...then, step N, numbered lists), coordination language
(collaborate, coordinate, work together), parallel execution
signals, and multi-deliverable patterns.
Agent selection reuses the same keyword-affinity scoring as the
capability-match scheduler strategy to pick the most relevant
agent from the team roster.
- Add isSimpleGoal() and selectBestAgent() (exported for testing)
- Add 35 unit tests covering heuristic edge cases and integration
- Update existing runTeam tests to use complex goals
Co-Authored-By: Claude <noreply@anthropic.com>
Adds an example showing a generator agent producing code, then
three reviewer agents (security, performance, style) analyzing it
in parallel, followed by a synthesizer merging all feedback into
a prioritized action item report.
Closes#75
Demonstrates parallel analyst execution with dependency-based
synthesis using runTasks(), sharedMemory, and dependsOn:
1. Three analyst agents research the same topic in parallel
(technical, market, community perspectives)
2. Synthesizer waits for all analysts via dependsOn, reads
shared memory, cross-references findings, and produces
a unified report
Fixes#76
AgentPool now maintains a per-agent Semaphore(1) that serializes
concurrent run() calls targeting the same Agent. This prevents
shared-state races on Agent.state (status, messages, tokenUsage)
when multiple independent tasks are assigned to the same agent.
Lock acquisition order: per-agent lock first, then pool semaphore,
so queued tasks don't waste pool slots while waiting.
Fixes#61
Thread AbortSignal from the top-level API through RunContext to
executeQueue(), enabling graceful cancellation in Express, Next.js,
serverless, and CLI scenarios.
Changes:
- Added optional to RunContext interface
- now accepts
- now accepts
- executeQueue() checks signal.aborted before each dispatch round
and skips remaining tasks when cancelled
- Signal is forwarded to coordinator's run() and per-task pool.run()
so in-flight LLM calls are also cancelled
- Full backward compatibility: both methods work without options
The abort infrastructure already existed at lower layers
(AgentRunner, Agent, AgentPool) — this commit bridges the last gap
at the orchestrator level.
Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
- Add contract tests for Anthropic, OpenAI, Gemini, Copilot adapters
- Add optional E2E test suite (tests/e2e/, run with npm run test:e2e)
- Add shared test fixtures (tests/helpers/llm-fixtures.ts)
- Configure vitest to exclude e2e tests by default
- Add "files" field to package.json to reduce npm package size by 50%
- Align npm description with GitHub repo description
- Bump version to 1.0.1
* feat(agent): add smart loop detection for stuck agents (#16)
Detect when agents repeat the same tool calls or text outputs in a
sliding window. Three modes: warn (inject nudge, terminate on 2nd hit),
terminate (immediate stop), or custom callback. Fully opt-in via
`loopDetection` on AgentConfig — zero overhead when unconfigured.
* fix(agent): support async onLoopDetected callbacks and prevent orphaned tool_use events
- Await onLoopDetected callback result so async functions work correctly
instead of silently falling through to 'continue'
- Move loop detection before yielding tool_use events so terminate mode
never emits tool_use without a matching tool_result
* fix(agent): reset loopWarned on recovery and rename maxRepeatedToolCalls to maxRepetitions
- Reset loopWarned flag when the agent stops repeating, so a future
loop gets a fresh warning cycle instead of immediate termination
- Rename maxRepeatedToolCalls → maxRepetitions since the threshold
applies to both tool call and text output repetition detection
* test(agent): add tests for async callback, warn recovery, and injected warning text
- Verify async onLoopDetected callback is awaited correctly
- Verify loopWarned resets after recovery, giving fresh warning cycle
- Verify WARNING TextBlock is injected into user message content
- Add @google/genai to devDependencies so types are available for
lint/test in CI (stays as optional peerDependency for consumers)
- Delete package-lock.json in CI before npm install to avoid
Mac-generated lockfile missing Linux platform-specific rollup binaries
npm ci fails on Linux CI when package-lock.json was generated on macOS,
because platform-specific optional deps (@rollup/rollup-linux-x64-gnu)
are missing from the lockfile. This is a known npm bug (#4828).
When both timeoutMs and a caller-provided abortSignal were set, the
timeout signal silently replaced the caller's signal. Now they are
combined via mergeAbortSignals() so either source can cancel the run.
Also removes dead array-handling branch in text-tool-extractor.ts
(extractJSONObjects only returns objects, never arrays).
Local models (Ollama, vLLM) sometimes return tool calls as text instead
of using the native tool_calls wire format. This adds a safety-net
extractor that parses tool calls from model text output when native
tool_calls is empty.
- Add text-tool-extractor with support for bare JSON, code fences,
and Hermes <tool_call> tags
- Wire fallback into OpenAI adapter chat() and stream() paths
- Add onWarning callback when model ignores configured tools
- Add timeoutMs on AgentConfig for per-run abort (local models can
be slow)
- Add 26 tests for extractor and fallback behavior
- Document local model compatibility in README
* feat(orchestrator): add onApproval callback for human-in-the-loop (#32)
Add an optional `onApproval` callback to OrchestratorConfig that gates
between task execution rounds. After each batch of parallel tasks
completes, the callback receives the completed tasks and the tasks about
to start, returning true to continue or false to abort gracefully.
Key changes:
- Add 'skipped' to TaskStatus for user-initiated abort (distinct from 'failed')
- Add skip(), skipRemaining(), cascadeSkip() to TaskQueue
- Add 'task_skipped' to OrchestratorEvent for progress monitoring
- Approval gate in executeQueue() with try/catch for callback errors
- Synthesis prompt now includes skipped tasks section
- 17 new tests covering queue skip operations and orchestrator integration
Closes#32
* docs: clarify onApproval contract and add missing test scenarios
- Document skip() cascade semantics, skipRemaining() in-flight constraint,
and onApproval trigger conditions / mutation warning
- Add concurrency safety comment on completedThisRound
- Note task_skipped as breaking union addition on OrchestratorEvent
- Add 3 test scenarios: single-batch no-callback, mixed success/failure
batch, and onProgress task_skipped event relay