* feat(orchestrator): add onApproval callback for human-in-the-loop (#32)
Add an optional `onApproval` callback to OrchestratorConfig that acts as
a gate between task execution rounds. After each batch of parallel tasks
completes, the callback receives the completed tasks and the tasks about
to start, and returns true to continue or false to abort gracefully.
Key changes:
- Add 'skipped' to TaskStatus for user-initiated abort (distinct from 'failed')
- Add skip(), skipRemaining(), cascadeSkip() to TaskQueue
- Add 'task_skipped' to OrchestratorEvent for progress monitoring
- Add approval gate in executeQueue(), wrapped in try/catch to handle callback errors
- Include a skipped-tasks section in the synthesis prompt
- 17 new tests covering queue skip operations and orchestrator integration
Closes #32
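
A minimal sketch of what an approval callback might look like, based only on the description above. The `TaskSummary` shape, the name `approveUnlessFailed`, and the exact callback signature are assumptions for illustration; the real callback may also be allowed to return a Promise.

```typescript
// Hypothetical shapes for illustration; the real task type likely has more fields.
type TaskStatus = 'pending' | 'completed' | 'failed' | 'skipped';
interface TaskSummary {
  id: string;
  status: TaskStatus;
}

// Assumed signature: (completedTasks, nextTasks) => boolean.
// Returns true to continue with the next round, false to abort gracefully.
function approveUnlessFailed(completed: TaskSummary[], next: TaskSummary[]): boolean {
  // Nothing left to start, so there is nothing to veto.
  if (next.length === 0) return true;
  // Stop the run as soon as any task in the finished batch has failed.
  return completed.every((t) => t.status !== 'failed');
}

// Assumed wiring (not verified against the real config type):
// const config = { onApproval: approveUnlessFailed };
```

Under this sketch, returning false would leave the remaining tasks to be marked 'skipped' rather than 'failed'.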
* docs: clarify onApproval contract and add missing test scenarios
- Document skip() cascade semantics, skipRemaining() in-flight constraint,
  and onApproval trigger conditions / mutation warning
- Add concurrency safety comment on completedThisRound
- Note task_skipped as breaking union addition on OrchestratorEvent
- Add 3 test scenarios: single-batch no-callback, mixed success/failure
  batch, and onProgress task_skipped event relay
* feat: add task-level retry with exponential backoff
Add `maxRetries`, `retryDelayMs`, and `retryBackoff` to task config.
When a task fails and retries remain, the orchestrator waits with
exponential backoff and re-runs the task with a fresh agent conversation.
A `task_retry` event is emitted via `onProgress` for observability.
Cascade failure only occurs after all retries are exhausted.
Closes #30
* fix: address review — extract executeWithRetry, add delay cap, fix tests
- Extract `executeWithRetry()` as a testable exported function
- Add `computeRetryDelay()` with 30s max cap (prevents runaway backoff)
- Remove retry fields from `ParsedTaskSpec` (dead code for runTeam path)
- Deduplicate retry event emission (single code path for both error types)
- Injectable delay function for test determinism
- Rewrite tests to call the real `executeWithRetry`, not a copy
- 15 tests covering: success, retry+success, retry+failure, backoff
  calculation, delay cap, delay function injection, no-retry default
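
For illustration, a capped exponential backoff can be sketched as follows. The function name `computeRetryDelaySketch`, its parameter order, and the 1-based `attempt` convention are assumptions; only the 30s cap and the exponential growth come from the notes above.

```typescript
// 30 second ceiling, matching the cap described above.
const MAX_DELAY_MS = 30_000;

// Hypothetical signature: attempt is 1-based, so the first retry waits baseDelayMs.
function computeRetryDelaySketch(baseDelayMs: number, backoff: number, attempt: number): number {
  const uncapped = baseDelayMs * Math.pow(backoff, attempt - 1);
  // The cap prevents runaway waits when backoff or attempt is large.
  return Math.min(uncapped, MAX_DELAY_MS);
}
```

With a 1000 ms base and a backoff of 2, the delays grow as 1000, 2000, 4000 ms and so on, until they hit the 30000 ms ceiling.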
* fix: clamp negative maxRetries/retryBackoff to safe values
- maxRetries clamped to >= 0 (negative values treated as no retry)
- retryBackoff clamped to >= 1 (prevents zero/negative delay oscillation)
- retryDelayMs clamped to >= 0
- Add tests for negative maxRetries and negative backoff
Addresses Codex review P1 on #37
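
The clamping rules above could look roughly like this. `RetryOptions`, `clampRetryOptions`, and the default values for `retryDelayMs` and `retryBackoff` are hypothetical; only the lower bounds (0, 0, and 1) and the no-retry default are taken from this commit log.

```typescript
interface RetryOptions {
  maxRetries?: number;
  retryDelayMs?: number;
  retryBackoff?: number;
}

// Normalizes user-supplied retry settings to safe values.
function clampRetryOptions(opts: RetryOptions): Required<RetryOptions> {
  return {
    // Negative values are treated as "no retry".
    maxRetries: Math.max(0, opts.maxRetries ?? 0),
    // Assumed default of 1000 ms; a negative delay makes no sense.
    retryDelayMs: Math.max(0, opts.retryDelayMs ?? 1000),
    // Assumed default of 2; a factor below 1 would shrink or flip the delay.
    retryBackoff: Math.max(1, opts.retryBackoff ?? 2),
  };
}
```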
* fix: accumulate token usage across retry attempts
Previously only the final attempt's tokenUsage was returned, causing
under-reporting of actual model consumption when retries occurred.
Now all attempts' token counts are summed in the returned result.
Addresses Codex review P2 (token usage) on #37
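
A small sketch of the accumulation idea. The `TokenUsage` field names and the helper `sumTokenUsage` are assumptions, not the framework's actual types.

```typescript
// Hypothetical usage shape; the real field names may differ.
interface TokenUsage {
  input_tokens: number;
  output_tokens: number;
}

// Sums usage over every attempt, including the failed ones.
function sumTokenUsage(attempts: TokenUsage[]): TokenUsage {
  return attempts.reduce<TokenUsage>(
    (total, u) => ({
      input_tokens: total.input_tokens + u.input_tokens,
      output_tokens: total.output_tokens + u.output_tokens,
    }),
    { input_tokens: 0, output_tokens: 0 },
  );
}
```

Returning only the last element of `attempts` would reproduce the under-reporting described above.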
crypto.randomUUID() is not globally available in Node 18. Import
randomUUID from node:crypto explicitly so the framework works on
all supported Node versions (>=18).
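
The explicit import looks like this. The wrapper name `newId` is hypothetical and only there to show a call site.

```typescript
import { randomUUID } from 'node:crypto';

// Uses the named import instead of relying on a global `crypto` object.
function newId(): string {
  return randomUUID();
}
```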
isTaskReady() rejects non-pending tasks on its first line, but
unblockDependents() passed blocked tasks directly to it. This meant
dependent tasks stayed blocked forever after their dependencies
completed, breaking any workflow with task dependencies.
Fix: pass a pending-status copy so isTaskReady only checks the
dependency condition.
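
A simplified sketch of the bug and the fix. The `Task` shape and both function bodies are reduced stand-ins for illustration, not the real implementations.

```typescript
type Status = 'pending' | 'blocked' | 'completed';
interface Task {
  id: string;
  status: Status;
  dependsOn: string[];
}

// Rejects non-pending tasks up front, then checks the dependencies.
function isTaskReady(task: Task, completedIds: Set<string>): boolean {
  if (task.status !== 'pending') return false;
  return task.dependsOn.every((dep) => completedIds.has(dep));
}

function unblockDependents(tasks: Task[], completedIds: Set<string>): void {
  for (const task of tasks) {
    if (task.status !== 'blocked') continue;
    // Passing `task` directly would always return false because it is 'blocked'.
    // A pending-status copy lets isTaskReady evaluate only the dependency condition.
    if (isTaskReady({ ...task, status: 'pending' }, completedIds)) {
      task.status = 'pending';
    }
  }
}
```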