The previous approach relied on Docker's container health status, but
Docker's healthcheck (start_period:30s + 3x15s retries = ~75s) marks
the container "unhealthy" before NestJS finishes cold-starting after a
fresh image build (New Relic + TypeORM + Redis + BullMQ init can take
2-3 minutes).
Changes:
- Primary check is now a direct wget to localhost:3000/api from the host (sketched below)
- Docker health status used only for informational logging
- Total timeout increased from 130s to 190s (~3 min) for cold starts
- Early exit if container has stopped/exited (no point waiting)
- More backend log lines (30 vs 20) shown on failure for diagnostics
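A minimal sketch of the host-side check, assuming the backend container is
named "backend" (the real name comes from docker-compose.prod.yml) and GNU
wget on the host:

```bash
# Wait up to ~190s for the API to answer from the host.
DEADLINE=$((SECONDS + 190))
while (( SECONDS < DEADLINE )); do
  # Early exit: no point waiting if the container has stopped/exited.
  state=$(docker inspect -f '{{.State.Status}}' backend 2>/dev/null || echo "missing")
  if [[ "$state" != "running" ]]; then
    echo "Container state is '$state', aborting health check"
    break
  fi
  # Primary signal: the API answers on localhost:3000/api.
  if wget -q -O /dev/null --timeout=5 http://localhost:3000/api; then
    echo "Backend healthy"
    exit 0
  fi
  # Docker's health status is logged for information only.
  docker inspect -f 'docker health: {{.State.Health.Status}}' backend 2>/dev/null || true
  sleep 5
done
echo "Health check failed; last 30 backend log lines:"
docker logs --tail 30 backend
exit 1
```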
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The health check used curl, which is not installed on the prod server.
Replace it with a dual approach (sketched below):
1. Primary: check Docker's own container health status (already running
via the docker-compose.prod.yml healthcheck with wget inside the container)
2. Secondary: wget from the host as a fallback signal
Also add diagnostic logging (container status + recent backend logs)
before triggering rollback on health check failure.
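A short sketch of the intent, under the same assumptions about the container
name and endpoint as in the sketch above (both are illustrative, not taken
from the script):

```bash
# Primary: Docker's own health status, set by the in-container wget healthcheck.
health=$(docker inspect -f '{{.State.Health.Status}}' backend 2>/dev/null || echo "unknown")
if [[ "$health" != "healthy" ]]; then
  # Secondary: try the API from the host as a fallback signal.
  if ! wget -q -O /dev/null http://localhost:3000/api; then
    # Diagnostics before rollback: container status + recent backend logs.
    docker ps --all --filter name=backend
    docker logs --tail 20 backend
    rollback   # hypothetical function defined elsewhere in deploy-prod.sh
  fi
fi
```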
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The APPLIED_MIGRATIONS associative array triggered an "unbound variable"
error under set -u when empty (first run / seed-existing). Fix by initializing
it with =() and using a safe helper function with ${:-} default syntax
(sketched below).
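A minimal sketch of the pattern (the helper name and migration name are
illustrative, not the script's actual identifiers):

```bash
set -u

# Initialize with =() so the array exists even before any migration is recorded.
declare -A APPLIED_MIGRATIONS=()

# Safe lookup: ${...:-} substitutes an empty string when the key is missing,
# so set -u no longer aborts with "unbound variable".
is_applied() {
  [[ -n "${APPLIED_MIGRATIONS[$1]:-}" ]]
}

if ! is_applied "20240101_example"; then
  echo "20240101_example not yet applied"
fi
```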
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add automated production deployment pipeline:
- scripts/deploy-prod.sh: Full deployment script with pre/post DB backups,
migration tracking via the shared.schema_migrations table (sketched after
this list), health checks, and automatic rollback on failure (restores the
DB, reverts code, rebuilds)
- .gitea/workflows/deploy.yml: Manual-trigger Gitea Actions workflow for
intentional production deployments with an optional --seed-existing flag
- scripts/db-backup.sh: Add --yes/-y flag to skip interactive confirmation
prompts, enabling automated restore during rollback
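A sketch of the migration-tracking idea only; the psql invocation, connection
via $DATABASE_URL, and the column names are assumptions, with just the
shared.schema_migrations table name taken from this change:

```bash
# Illustrative only: connection and column names are assumptions.
psql "$DATABASE_URL" -c "CREATE TABLE IF NOT EXISTS shared.schema_migrations (
  name       text PRIMARY KEY,
  applied_at timestamptz NOT NULL DEFAULT now()
);"

# A migration runs only if its name is not yet recorded; it is recorded once it succeeds.
if [[ -z "$(psql "$DATABASE_URL" -tAc \
      "SELECT 1 FROM shared.schema_migrations WHERE name = 'example_migration'")" ]]; then
  echo "example_migration has not been applied yet"
fi
```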
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>