The production stack no longer runs nginx in a Docker container. Instead,
the host-level nginx handles both SSL termination and request routing:
/api/* → 127.0.0.1:3000 (backend)
/* → 127.0.0.1:3001 (frontend)
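Roughly, the host-side routing; server_name and certificate paths below are
placeholders, not the real values:

    server {
        listen 443 ssl;
        server_name example.com;                  # placeholder

        ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;  # placeholder
        ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;    # placeholder

        # API traffic goes to the backend container over loopback.
        location /api/ {
            proxy_pass http://127.0.0.1:3000;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # Everything else goes to the frontend container.
        location / {
            proxy_pass http://127.0.0.1:3001;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }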
Changes:
- docker-compose.prod.yml: set nginx replicas to 0, expose backend and
  frontend on 127.0.0.1 only (loopback; sketched after this list)
- nginx/host-production.conf: new ready-to-copy host nginx config with
  SSL, rate limiting, proxy buffering, and AI endpoint timeouts (sketched
  after this list)
- docs/DEPLOYMENT.md: rewritten production deployment and SSL sections
to reflect the simplified single-nginx architecture
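The loopback-only exposure is just the compose port binding; service names
here are illustrative:

    services:
      backend:
        ports:
          - "127.0.0.1:3000:3000"   # reachable only via the host nginx
      frontend:
        ports:
          - "127.0.0.1:3001:3001"

And the rate-limiting/buffering pieces of the host config look roughly like
this; zone name, rate, and buffer sizes are illustrative, not the shipped
values:

    # http context
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

    upstream backend_api {
        server 127.0.0.1:3000;
        keepalive 32;                          # reuse upstream connections
    }

    # inside the server block sketched above
    location /api/ {
        limit_req zone=api burst=20 nodelay;   # absorb short bursts, reject floods
        proxy_pass http://backend_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";        # required for upstream keepalive
        proxy_buffering on;                    # frees backend workers from slow clients
        proxy_buffers 16 16k;
    }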
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Frontend container nginx listens on 3001 instead of 80 to avoid
  conflicts with the host-level reverse proxy (see the sketch below)
- Removed certbot service, volumes, and SSL config from
docker-compose.prod.yml — SSL/certbot is managed at the host level
- Updated nginx/production.conf: HTTP-only (host handles TLS),
upstream frontend points to port 3001
- Updated nginx/ssl.conf frontend upstream to 3001 for consistency
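A sketch of the frontend container's nginx config after these changes; the
html root is the stock nginx path and may differ from the real config:

    server {
        listen 3001;                           # was 80; keeps clear of the host proxy
        root /usr/share/nginx/html;            # assumed path to the built assets

        # SPA routing: serve the file if it exists, else fall back to index.html
        location / {
            try_files $uri $uri/ /index.html;
        }
    }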
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Root cause of the 502 errors seen at 30 concurrent users: the production
server was running dev-mode infrastructure (Vite dev server, NestJS --watch,
no DB connection pooling, a single Node.js process).
Changes:
- backend/Dockerfile: multi-stage prod build (compiled JS, no devDeps)
- frontend/Dockerfile: multi-stage prod build (static assets served by nginx)
- frontend/nginx.conf: SPA routing config for frontend container
- docker-compose.prod.yml: production overlay with tuned Postgres, memory
limits, health checks, restart policies
- nginx/production.conf: keepalive upstreams, proxy buffering, rate limiting
- backend/src/main.ts: Node.js clustering (1 worker per CPU, up to 4),
  conditional request logging, production CORS (clustering sketched below)
- backend/src/app.module.ts: TypeORM connection pool (max 30, min 5)
- docs/DEPLOYMENT.md: new Production Deployment section
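A sketch of the clustering wrapper in backend/src/main.ts; bootstrap() below
is a stub standing in for the app's real Nest bootstrap:

    import cluster from 'node:cluster';
    import os from 'node:os';

    // Stand-in for the real bootstrap (NestFactory.create + app.listen on 3000).
    async function bootstrap(): Promise<void> {
      /* create and start the Nest app */
    }

    const workers = Math.min(os.cpus().length, 4);  // one worker per CPU, capped at 4

    if (cluster.isPrimary && process.env.NODE_ENV === 'production') {
      for (let i = 0; i < workers; i += 1) cluster.fork();
      cluster.on('exit', () => cluster.fork());     // replace any worker that dies
    } else {
      bootstrap();                                  // worker process, or dev mode
    }

The pool change in app.module.ts is plain configuration: with the default pg
driver, the max: 30 / min: 5 settings go under the extra key of
TypeOrmModule.forRoot(...), which TypeORM hands through to the node-postgres
pool.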
Deploy with: docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The AI recommendation endpoint calls a large language model (397B params)
that can take 60-120 seconds to respond. Nginx's default 60s proxy_read_timeout
was closing the connection before the response arrived. Added a dedicated
location block with a 180s timeout for the recommendations endpoint.
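The fix, roughly; the location path is illustrative, the real one is whatever
route the recommendations endpoint uses:

    # Give the LLM-backed endpoint room to respond; everything else keeps the 60s default.
    location /api/recommendations {
        proxy_pass http://127.0.0.1:3000;
        proxy_read_timeout 180s;    # covers the observed 60-120s responses with headroom
        proxy_send_timeout 180s;
    }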
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>