- Increase max_tokens from 4096 to 16384 to accommodate reasoning tokens
- Increase timeout from 90s to 180s for thinking model latency
- Add logging for response diagnostics (content length, reasoning presence, finish reason)
- Improve the error message when the model exhausts its token budget on reasoning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
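
A minimal sketch of what the changes above could look like, assuming a Python caller built on the official `anthropic` SDK; the client setup, model id, helper name, and thinking budget are assumptions, not the repo's actual code:

```python
import logging

import anthropic

logger = logging.getLogger(__name__)

# Timeout raised from 90s to 180s to cover thinking-model latency.
client = anthropic.Anthropic(timeout=180.0)


def ask(prompt: str) -> str:
    response = client.messages.create(
        model="claude-opus-4-6",  # hypothetical model id
        max_tokens=16384,  # raised from 4096 to leave room for reasoning tokens
        thinking={"type": "enabled", "budget_tokens": 8192},  # assumed budget
        messages=[{"role": "user", "content": prompt}],
    )

    # Response diagnostics: content length, reasoning presence, finish reason.
    text = "".join(b.text for b in response.content if b.type == "text")
    has_reasoning = any(b.type == "thinking" for b in response.content)
    logger.info(
        "content_length=%d reasoning=%s finish_reason=%s",
        len(text), has_reasoning, response.stop_reason,
    )

    # Clearer error when the model spends its whole budget on reasoning
    # and never emits any visible output.
    if response.stop_reason == "max_tokens" and not text:
        raise RuntimeError(
            "Model exhausted max_tokens on reasoning before producing output; "
            "consider raising max_tokens or lowering the thinking budget."
        )
    return text
```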