ChatClaudeAgent is a plain Runnable rather than a BaseChatModel, so
LangChain's callback system never fired on_chat_model_start / on_llm_end
for it — leaving the CLI TUI stuck on "LLM: 0" and "Tokens: --" during
runs. Pop callbacks out of the LLM kwargs, invoke them manually around
each SDK call, and attach usage_metadata extracted from the SDK's
ResultMessage (input, output, total — including cached input) to the
returned AIMessage so downstream handlers pick it up.
Tool callbacks now also fire through the MCP wrapper: forward the
callback list into each wrapped LangChain tool's invocation config so
StatsCallbackHandler sees on_tool_start/on_tool_end when the SDK loop
calls a tool.
Verified via direct StatsCallbackHandler round-trip on both Shape A
(ChatClaudeAgent.invoke) and Shape B (run_sdk_analyst): llm_calls,
tool_calls, tokens_in, and tokens_out all increment as expected.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>