Add a new admin-only feature that allows the platform owner to benchmark
the production AI model against up to 2 alternate models (any OpenAI-compatible
API) using real tenant data, without impacting users.
Backend:
- Shared AI caller utility (ai-caller.ts) for OpenAI-compatible endpoints
- Shadow AI module with service, controller, and 3 entities
- 6 admin API endpoints for model config CRUD, run trigger, and history
- Auto-creates shadow_ai_models, shadow_runs, shadow_run_results tables
- Exposes health-scores and investment-planning prompt builders for reuse
Frontend:
- New admin page at /admin/shadow-ai with 3 tabs:
- Model Configuration (production + 2 alternate slots)
- Run Comparison (tenant select, feature select, side-by-side results)
- History (filterable run log with detail drill-down)
- Full side-by-side output display with diff highlighting
- Sidebar navigation link for AI Benchmarking
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>