* feat(agent): add smart loop detection for stuck agents (#16)
Detect when agents repeat the same tool calls or text outputs in a
sliding window. Three modes: warn (inject nudge, terminate on 2nd hit),
terminate (immediate stop), or custom callback. Fully opt-in via
`loopDetection` on AgentConfig — zero overhead when unconfigured.
* fix(agent): support async onLoopDetected callbacks and prevent orphaned tool_use events
- Await onLoopDetected callback result so async functions work correctly
instead of silently falling through to 'continue'
- Move loop detection before yielding tool_use events so terminate mode
never emits tool_use without a matching tool_result
* fix(agent): reset loopWarned on recovery and rename maxRepeatedToolCalls to maxRepetitions
- Reset loopWarned flag when the agent stops repeating, so a future
loop gets a fresh warning cycle instead of immediate termination
- Rename maxRepeatedToolCalls → maxRepetitions since the threshold
applies to both tool call and text output repetition detection
* test(agent): add tests for async callback, warn recovery, and injected warning text
- Verify async onLoopDetected callback is awaited correctly
- Verify loopWarned resets after recovery, giving fresh warning cycle
- Verify WARNING TextBlock is injected into user message content
Local models (Ollama, vLLM) sometimes return tool calls as text instead
of using the native tool_calls wire format. This adds a safety-net
extractor that parses tool calls from model text output when native
tool_calls is empty.
- Add text-tool-extractor with support for bare JSON, code fences,
and Hermes <tool_call> tags
- Wire fallback into OpenAI adapter chat() and stream() paths
- Add onWarning callback when model ignores configured tools
- Add timeoutMs on AgentConfig for per-run abort (local models can
be slow)
- Add 26 tests for extractor and fallback behavior
- Document local model compatibility in README
Add lightweight onTrace callback to OrchestratorConfig that emits
structured span events (llm_call, tool_call, task, agent) with timing,
token usage, and runId correlation. Zero overhead when not subscribed.
Closes#18