
# HOA LedgerIQ — Deployment Guide
**Version:** 2026.3.2 (beta)
**Last updated:** 2026-03-02
---
## Table of Contents
1. [Prerequisites](#prerequisites)
2. [Deploy to a Fresh Docker Server](#deploy-to-a-fresh-docker-server)
3. [Production Deployment](#production-deployment)
4. [SSL with Certbot (Let's Encrypt)](#ssl-with-certbot-lets-encrypt)
5. [Backup the Local Test Database](#backup-the-local-test-database)
6. [Restore a Backup into the Staged Environment](#restore-a-backup-into-the-staged-environment)
7. [Running Migrations on the Staged Environment](#running-migrations-on-the-staged-environment)
8. [Verifying the Deployment](#verifying-the-deployment)
9. [Environment Variable Reference](#environment-variable-reference)
10. [Architecture Overview](#architecture-overview)
---
## Prerequisites
On the **target server**, ensure the following are installed:
| Tool | Minimum Version |
|-----------------|-----------------|
| Docker Engine | 24+ |
| Docker Compose | v2+ |
| Git | 2.x |
| `psql` (client) | 15+ *(optional, for manual DB work)* |
The app runs five containers — nginx, backend (NestJS), frontend (Vite/React),
PostgreSQL 15, and Redis 7. Total memory footprint is roughly **1–2 GB** idle.
For SSL, the server must also have:
- A **public hostname** with a DNS A record pointing to the server's IP
(e.g., `staging.yourdomain.com → 203.0.113.10`)
- **Ports 80 and 443** open in any firewall / security group
---
## Deploy to a Fresh Docker Server
### 1. Clone the repository
```bash
ssh your-staging-server
git clone <repo-url> /opt/hoa-ledgeriq
cd /opt/hoa-ledgeriq
```
### 2. Create the environment file
Copy the example and fill in real values:
```bash
cp .env.example .env
nano .env # or vi, your choice
```
**Required changes from defaults:**
```dotenv
# --- CHANGE THESE ---
POSTGRES_PASSWORD=<strong-random-password>
JWT_SECRET=<random-64-char-string>
# Database URL must match the password above
DATABASE_URL=postgresql://hoafinance:<same-password>@postgres:5432/hoafinance
# AI features (get a key from build.nvidia.com)
AI_API_KEY=nvapi-xxxxxxxxxxxx
# --- Usually fine as-is ---
POSTGRES_USER=hoafinance
POSTGRES_DB=hoafinance
REDIS_URL=redis://redis:6379
NODE_ENV=development # keep as development for staging
AI_API_URL=https://integrate.api.nvidia.com/v1
AI_MODEL=qwen/qwen3.5-397b-a17b
AI_DEBUG=false
```
> **Tip:** Generate secrets quickly:
> ```bash
> openssl rand -hex 32 # good for JWT_SECRET
> openssl rand -base64 24 # good for POSTGRES_PASSWORD
> ```
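Because `DATABASE_URL` embeds the Postgres password, the two values must be generated together. The snippet below is a convenience sketch (variable names come from the `.env` above; the script itself is not part of the repo) that prints a matching set:

```shell
# Generate a matching set of secrets for .env.
# Hex output avoids characters like '/' and '+' that would need
# URL-encoding inside DATABASE_URL.
JWT_SECRET="$(openssl rand -hex 32)"
DB_PASSWORD="$(openssl rand -hex 24)"

cat <<EOF
JWT_SECRET=${JWT_SECRET}
POSTGRES_PASSWORD=${DB_PASSWORD}
DATABASE_URL=postgresql://hoafinance:${DB_PASSWORD}@postgres:5432/hoafinance
EOF
```

Paste the printed lines into `.env` in place of the placeholders shown above.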
### 3. Build and start the stack
```bash
docker compose up -d --build
```
This will:
- Build the backend and frontend images
- Pull `postgres:15-alpine`, `redis:7-alpine`, and `nginx:alpine`
- Initialize the PostgreSQL database with the shared schema (`db/init/00-init.sql`)
- Start all five services on the `hoanet` bridge network
### 4. Wait for healthy services
```bash
docker compose ps
```
All five containers should show `Up` (postgres and redis should also show
`(healthy)`). If the backend is restarting, check logs:
```bash
docker compose logs backend --tail=50
```
### 5. (Optional) Seed with demo data
If deploying a fresh environment for testing and you want the Sunrise Valley
HOA demo tenant:
```bash
docker compose exec -T postgres psql -U hoafinance -d hoafinance < db/seed/seed.sql
```
This creates:
- Platform admin: `admin@hoaledgeriq.com` / `password123`
- Tenant admin: `admin@sunrisevalley.org` / `password123`
- Tenant viewer: `viewer@sunrisevalley.org` / `password123`
### 6. Access the application
| Service | URL |
|-----------|--------------------------------|
| App (UI) | `http://<server-ip>` |
| API | `http://<server-ip>/api` |
| Postgres | `<server-ip>:5432` (direct) |
> At this point the app is running over **plain HTTP** in development mode.
> For any environment that will serve real traffic, continue to the Production
> Deployment section.
---
## Production Deployment
The base `docker-compose.yml` runs everything in **development mode** (Vite
dev server, NestJS in watch mode, no connection pooling). This is fine for
local development but will fail under even light production load.
`docker-compose.prod.yml` provides a production overlay that fixes this:
| Component | Dev mode | Production mode |
|-----------|----------|-----------------|
| Frontend | Vite dev server (single-threaded, HMR) | Static build served by nginx |
| Backend | `nest start --watch` (ts-node, file watcher) | Compiled JS, clustered across CPU cores |
| DB pooling | None (new connection per query) | Pool of 30 reusable connections |
| Postgres | Default config (100 connections) | Tuned: 200 connections, optimized buffers |
| Nginx | Basic proxy | Keepalive upstreams, buffering, rate limiting |
| Restart | None | `unless-stopped` on all services |
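Before deploying, it is worth catching the most common mistake: a `NODE_ENV=development` left over from staging. A minimal pre-flight sketch (not part of the repo):

```shell
# Warn if .env is still configured for development.
ENV_FILE=".env"
if grep -q '^NODE_ENV=production' "$ENV_FILE" 2>/dev/null; then
  echo "OK: NODE_ENV is production"
else
  echo "WARNING: set NODE_ENV=production in ${ENV_FILE} before deploying"
fi
```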
### Deploy for production
```bash
cd /opt/hoa-ledgeriq
# Ensure .env has NODE_ENV=production and strong secrets
nano .env
# Build and start with the production overlay
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build
```
To add SSL on top of the production stack:
```bash
docker compose \
-f docker-compose.yml \
-f docker-compose.prod.yml \
-f docker-compose.ssl.yml \
up -d --build
```
> **Tip:** Create a shell alias to avoid typing the compose files every time:
> ```bash
> echo 'alias dc="docker compose -f docker-compose.yml -f docker-compose.prod.yml"' >> ~/.bashrc
> source ~/.bashrc
> dc up -d --build
> ```
### What the production overlay does
**Backend (`backend/Dockerfile`)**
- Multi-stage build: compiles TypeScript once, runs `node dist/main`
- No dev dependencies shipped (smaller image, faster startup)
- Node.js clustering: forks one worker per CPU core (up to 4)
- Connection pool: 30 reusable PostgreSQL connections shared across workers
**Frontend (`frontend/Dockerfile`)**
- Multi-stage build: `npm run build` produces optimized static assets
- Served by a lightweight nginx container (not Vite)
- Static assets cached with immutable headers (Vite filename hashing)
**Nginx (`nginx/production.conf`)**
- Keepalive connections to upstream services (connection reuse)
- Proxy buffering to prevent 502s during slow responses
- Rate limiting on API routes (10 req/s per IP, burst 30)
- Proper timeouts tuned per endpoint type
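The rate-limiting piece uses nginx's standard `limit_req` module. A representative sketch (rate and burst values from the list above; the upstream name is illustrative, and the real `nginx/production.conf` may differ):

```nginx
# 10 req/s per client IP, tracked in a 10 MB shared zone
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

upstream backend_upstream {
    server backend:3000;
    keepalive 32;                     # reuse upstream connections
}

server {
    location /api/ {
        # allow bursts of 30 before returning errors
        limit_req zone=api_limit burst=30 nodelay;

        proxy_pass http://backend_upstream;
        proxy_http_version 1.1;
        proxy_set_header Connection "";   # required for keepalive upstreams
        proxy_buffering on;
    }
}
```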
**PostgreSQL**
- `max_connections=200` (up from default 100)
- `shared_buffers=256MB`, `effective_cache_size=512MB`
- Tuned checkpoint, WAL, and memory settings
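Settings like these are typically passed as server flags in the compose overlay. A representative fragment (values from the list above; the actual `docker-compose.prod.yml` may be structured differently):

```yaml
services:
  postgres:
    command:
      - postgres
      - -c
      - max_connections=200
      - -c
      - shared_buffers=256MB
      - -c
      - effective_cache_size=512MB
```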
### Capacity guidelines
With the production stack on a 2-core / 4GB server:
| Metric | Expected capacity |
|--------|-------------------|
| Concurrent users | 50–100 |
| API requests/sec | ~200 |
| DB connections | 30 per backend worker × workers |
| Frontend serving | Static files, effectively unlimited |
For higher loads, scale the backend horizontally with Docker Swarm or
Kubernetes replicas.
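The connection numbers above can be sanity-checked with simple arithmetic, worth redoing whenever you change the worker count or pool size (the values below are the overlay's defaults):

```shell
WORKERS=4              # cluster workers (1 per CPU, capped at 4)
POOL_MAX=30            # TypeORM pool max per worker
PG_MAX_CONNECTIONS=200 # Postgres max_connections in the overlay
RESERVED=20            # headroom for psql sessions, migrations, monitoring

PEAK=$((WORKERS * POOL_MAX))
echo "peak backend connections: ${PEAK}"

if [ "$PEAK" -le $((PG_MAX_CONNECTIONS - RESERVED)) ]; then
  echo "OK: fits under max_connections=${PG_MAX_CONNECTIONS}"
else
  echo "WARNING: shrink the pool or raise max_connections"
fi
```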
---
## SSL with Certbot (Let's Encrypt)
This section walks through enabling HTTPS using the included nginx container
and a Certbot sidecar. The process has three phases: obtain the certificate,
switch nginx to SSL, and set up auto-renewal.
### Files involved
| File | Purpose |
|------|---------|
| `nginx/default.conf` | HTTP-only config (development / initial state) |
| `nginx/certbot-init.conf` | Temporary config used only during initial cert request |
| `nginx/ssl.conf` | Full SSL config (HTTP→HTTPS redirect + TLS termination) |
| `docker-compose.ssl.yml` | Compose override that adds port 443, certbot volumes, and renewal service |
### Overview
```
Phase 1 — Obtain certificate
┌────────────────────────────────────────────────┐
│ nginx serves certbot-init.conf on port 80      │
│ certbot answers ACME challenge → gets cert     │
└────────────────────────────────────────────────┘

Phase 2 — Enable SSL
┌────────────────────────────────────────────────┐
│ nginx switches to ssl.conf                     │
│ port 80  → 301 redirect to 443                 │
│ port 443 → TLS termination → backend/frontend  │
└────────────────────────────────────────────────┘

Phase 3 — Auto-renewal
┌────────────────────────────────────────────────┐
│ certbot container checks every 12 hours        │
│ renews if cert expires within 30 days          │
│ nginx reloads via cron to pick up new cert     │
└────────────────────────────────────────────────┘
```
---
### Phase 1 — Obtain the certificate
Throughout this section, replace `staging.example.com` with your actual
hostname and `you@example.com` with your real email.
#### Step 1: Edit `nginx/ssl.conf` with your hostname
```bash
cd /opt/hoa-ledgeriq
sed -i 's/staging.example.com/YOUR_HOSTNAME/g' nginx/ssl.conf
```
This updates the three places where the hostname appears: the `server_name`
directive and the two certificate paths.
#### Step 2: Swap in the temporary certbot-init config
Nginx can't start with `ssl.conf` yet because the certificates don't exist.
Use the minimal HTTP-only config that only serves the ACME challenge:
```bash
# Back up the current config
cp nginx/default.conf nginx/default.conf.bak
# Use the certbot init config temporarily
cp nginx/certbot-init.conf nginx/default.conf
```
#### Step 3: Start the stack with the SSL compose override
```bash
docker compose -f docker-compose.yml -f docker-compose.ssl.yml up -d --build
```
This starts all services plus the certbot container, exposes port 443,
and creates the shared volumes for certificates and ACME challenges.
#### Step 4: Request the certificate
```bash
docker compose -f docker-compose.yml -f docker-compose.ssl.yml \
run --rm certbot certonly \
--webroot \
--webroot-path /var/www/certbot \
--email you@example.com \
--agree-tos \
--no-eff-email \
-d YOUR_HOSTNAME
```
You should see output ending with:
```
Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/YOUR_HOSTNAME/fullchain.pem
Key is saved at: /etc/letsencrypt/live/YOUR_HOSTNAME/privkey.pem
```
> **Troubleshooting:** If certbot fails with a connection error, ensure:
> - DNS for `YOUR_HOSTNAME` resolves to this server's public IP
> - Port 80 is open (firewall, security group, cloud provider)
> - No other process is bound to port 80
---
### Phase 2 — Switch to full SSL
#### Step 5: Activate the SSL nginx config
```bash
cp nginx/ssl.conf nginx/default.conf
```
#### Step 6: Restart nginx to load the certificate
```bash
docker compose -f docker-compose.yml -f docker-compose.ssl.yml \
exec nginx nginx -s reload
```
#### Step 7: Verify HTTPS
```bash
# Should follow redirect and return HTML
curl -I https://YOUR_HOSTNAME
# Should return 301 redirect
curl -I http://YOUR_HOSTNAME
```
Expected `http://` response:
```
HTTP/1.1 301 Moved Permanently
Location: https://YOUR_HOSTNAME/
```
The app is now accessible at `https://YOUR_HOSTNAME` with a valid
Let's Encrypt certificate. All HTTP requests redirect to HTTPS.
---
### Phase 3 — Auto-renewal
The `certbot` service defined in `docker-compose.ssl.yml` already runs a
renewal loop (checks every 12 hours, renews if < 30 days remain).
However, nginx needs to be told to reload when a new certificate is issued.
Add a host-level cron job that reloads nginx daily:
```bash
# Add to the server's crontab (as root)
crontab -e
```
Add this line:
```cron
0 4 * * * cd /opt/hoa-ledgeriq && docker compose -f docker-compose.yml -f docker-compose.ssl.yml exec -T nginx nginx -s reload > /dev/null 2>&1
```
This reloads nginx at 4 AM daily. Since reload is graceful (no downtime),
this is safe to run even when no new cert was issued.
> **Alternative:** If you prefer not to use cron, you can add a
> `--deploy-hook` to the certbot command that reloads nginx only when a
> renewal actually happens:
> ```bash
> docker compose -f docker-compose.yml -f docker-compose.ssl.yml \
> run --rm certbot certonly \
> --webroot \
> --webroot-path /var/www/certbot \
> --email you@example.com \
> --agree-tos \
> --no-eff-email \
> --deploy-hook "nginx -s reload" \
> -d YOUR_HOSTNAME
> ```
---
### Day-to-day usage with SSL
Once SSL is set up, **always** use both compose files when running commands:
```bash
# Start / restart
docker compose -f docker-compose.yml -f docker-compose.ssl.yml up -d
# Rebuild after code changes
docker compose -f docker-compose.yml -f docker-compose.ssl.yml up -d --build
# View logs
docker compose -f docker-compose.yml -f docker-compose.ssl.yml logs -f nginx
# Shortcut: create an alias
echo 'alias dc="docker compose -f docker-compose.yml -f docker-compose.ssl.yml"' >> ~/.bashrc
source ~/.bashrc
dc up -d --build # much easier
```
### Reverting to HTTP-only (development)
If you need to go back to plain HTTP (e.g., local development):
```bash
# Restore the original HTTP-only config
cp nginx/default.conf.bak nginx/default.conf
# Run without the SSL override
docker compose up -d
```
---
### Quick reference: SSL file map
```
nginx/
├── default.conf       ← active config (copy ssl.conf here for HTTPS)
├── default.conf.bak   ← backup of original HTTP-only config
├── certbot-init.conf  ← temporary, used only during initial cert request
└── ssl.conf           ← full SSL config (edit hostname before using)

docker-compose.yml     ← base stack (HTTP only, port 80)
docker-compose.ssl.yml ← SSL overlay (adds 443, certbot, volumes)
```
---
## Backup the Local Test Database
### Full database dump (recommended)
From your **local development machine** where the app is currently running:
```bash
cd /path/to/HOA_Financial_Platform
# Dump the entire database (all schemas, roles, data)
docker compose exec -T postgres pg_dump \
-U hoafinance \
-d hoafinance \
--no-owner \
--no-privileges \
--format=custom \
-f /tmp/hoafinance_backup.dump
# Copy the dump file out of the container
docker compose cp postgres:/tmp/hoafinance_backup.dump ./hoafinance_backup.dump
```
The `--format=custom` flag produces a compressed binary format that supports
selective restore. The file is typically 50–80% smaller than plain SQL.
### Alternative: Plain SQL dump
If you prefer a human-readable SQL file:
```bash
docker compose exec -T postgres pg_dump \
-U hoafinance \
-d hoafinance \
--no-owner \
--no-privileges \
> hoafinance_backup.sql
```
### Backup a single tenant schema
To export just one tenant (e.g., Pine Creek HOA):
```bash
docker compose exec -T postgres pg_dump \
-U hoafinance \
-d hoafinance \
--no-owner \
--no-privileges \
--schema=tenant_pine_creek_hoa_q33i \
> pine_creek_backup.sql
```
> **Finding a tenant's schema name:**
> ```bash
> docker compose exec -T postgres psql -U hoafinance -d hoafinance \
> -c "SELECT name, schema_name FROM shared.organizations WHERE status = 'active';"
> ```
---
## Restore a Backup into the Staged Environment
### 1. Transfer the backup to the staging server
```bash
scp hoafinance_backup.dump user@staging-server:/opt/hoa-ledgeriq/
```
### 2. Ensure the stack is running
```bash
cd /opt/hoa-ledgeriq
docker compose up -d
```
### 3. Drop and recreate the database (clean slate)
```bash
# Connect to postgres and reset the database
docker compose exec -T postgres psql -U hoafinance -d postgres -c "
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'hoafinance' AND pid <> pg_backend_pid();
"
docker compose exec -T postgres dropdb -U hoafinance hoafinance
docker compose exec -T postgres createdb -U hoafinance hoafinance
```
### 4a. Restore from custom-format dump
```bash
# Copy the dump into the container
docker compose cp hoafinance_backup.dump postgres:/tmp/hoafinance_backup.dump
# Restore
docker compose exec -T postgres pg_restore \
-U hoafinance \
-d hoafinance \
--no-owner \
--no-privileges \
/tmp/hoafinance_backup.dump
```
### 4b. Restore from plain SQL dump
```bash
docker compose exec -T postgres psql \
-U hoafinance \
-d hoafinance \
< hoafinance_backup.sql
```
### 5. Restart the backend
After restoring, restart the backend so NestJS re-establishes its connection
pool and picks up the restored schemas:
```bash
docker compose restart backend
```
---
## Running Migrations on the Staged Environment
Migrations live in `db/migrations/` and are numbered sequentially. After
restoring an older backup, you may need to apply newer migrations.
Check which migrations exist:
```bash
ls -la db/migrations/
```
Apply them in order:
```bash
# Run all migrations sequentially
for f in db/migrations/*.sql; do
echo "Applying $f ..."
docker compose exec -T postgres psql \
-U hoafinance \
-d hoafinance \
< "$f"
done
```
Or apply a specific migration:
```bash
docker compose exec -T postgres psql \
-U hoafinance \
-d hoafinance \
< db/migrations/010-health-scores.sql
```
> **Note:** Migrations are idempotent where possible (`IF NOT EXISTS`,
> `DO $$ ... $$` blocks), so re-running one that has already been applied
> is generally safe.
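For illustration, an idempotent migration typically looks like this (the column and constraint names here are hypothetical, not taken from the repo):

```sql
-- Safe to re-run: ADD COLUMN is guarded by IF NOT EXISTS
ALTER TABLE shared.organizations
  ADD COLUMN IF NOT EXISTS health_score integer;

-- A DO block guards operations that lack an IF NOT EXISTS form,
-- such as ADD CONSTRAINT
DO $$
BEGIN
  IF NOT EXISTS (
    SELECT 1 FROM pg_constraint WHERE conname = 'chk_health_score_range'
  ) THEN
    ALTER TABLE shared.organizations
      ADD CONSTRAINT chk_health_score_range
      CHECK (health_score BETWEEN 0 AND 100);
  END IF;
END $$;
```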
---
## Verifying the Deployment
### Quick health checks
```bash
# Backend is responding (use https:// if SSL is enabled)
curl -s http://localhost/api/auth/login | head -c 100
# Database is accessible
docker compose exec -T postgres psql -U hoafinance -d hoafinance \
-c "SELECT count(*) AS tenants FROM shared.organizations WHERE status = 'active';"
# Redis is working
docker compose exec -T redis redis-cli ping
```
### Full smoke test
1. Open `https://YOUR_HOSTNAME` (or `http://<server-ip>`) in a browser
2. Log in with a known account
3. Navigate to Dashboard → verify health scores load
4. Navigate to Capital Planning → verify Kanban columns render
5. Navigate to Projects → verify project list loads
6. Check the Settings page → version should read **2026.3.2 (beta)**
### Verify SSL (if enabled)
```bash
# Check certificate details
echo | openssl s_client -connect YOUR_HOSTNAME:443 -servername YOUR_HOSTNAME 2>/dev/null \
| openssl x509 -noout -subject -issuer -dates
# Check that HTTP redirects to HTTPS
curl -sI http://YOUR_HOSTNAME | grep -E 'HTTP|Location'
```
### View logs
```bash
docker compose logs -f # all services
docker compose logs -f backend # backend only
docker compose logs -f postgres # database only
docker compose logs -f nginx # nginx access/error log
```
---
## Environment Variable Reference
| Variable | Required | Description |
|-------------------|----------|----------------------------------------------------|
| `POSTGRES_USER` | Yes | PostgreSQL username |
| `POSTGRES_PASSWORD`| Yes | PostgreSQL password (**change from default**) |
| `POSTGRES_DB` | Yes | Database name |
| `DATABASE_URL` | Yes | Full connection string for the backend |
| `REDIS_URL` | Yes | Redis connection string |
| `JWT_SECRET` | Yes | Secret for signing JWT tokens (**change from default**) |
| `NODE_ENV` | Yes | `development` or `production` |
| `AI_API_URL` | Yes | OpenAI-compatible inference endpoint |
| `AI_API_KEY` | Yes | API key for AI provider (Nvidia) |
| `AI_MODEL` | Yes | Model identifier for AI calls |
| `AI_DEBUG` | No | Set `true` to log raw AI prompts/responses |
---
## Architecture Overview
```
                 ┌──────────────────┐
Browser ───────► │  nginx :80/:443  │
                 └────────┬─────────┘
               ┌──────────┴─────────┐
               ▼                    ▼
        ┌──────────────┐     ┌──────────────┐
        │ backend :3000│     │frontend :5173│
        │   (NestJS)   │     │ (Vite/React) │
        └──────┬───────┘     └──────────────┘
        ┌──────┴────────┐
        ▼               ▼
┌──────────────┐  ┌───────────┐  ┌───────────┐
│postgres :5432│  │redis :6379│  │  certbot  │
│   (PG 15)    │  │ (Redis 7) │  │ (renewal) │
└──────────────┘  └───────────┘  └───────────┘
```
**Multi-tenant isolation:** Each HOA organization gets its own PostgreSQL
schema (e.g., `tenant_pine_creek_hoa_q33i`). The `shared` schema holds
cross-tenant tables (users, organizations, market rates). Tenant context
is resolved from the JWT token on every API request.
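The naming convention implied by the example above can be sketched as follows; this is a hypothetical reconstruction for illustration ("Pine Creek HOA" → `tenant_pine_creek_hoa_<suffix>`), and the real logic lives in the backend:

```shell
# Lowercase the org name and replace anything outside [a-z0-9] with '_'
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '_'
}

# 4 random characters for uniqueness (the real suffix alphabet may differ)
SUFFIX="$(openssl rand -hex 2)"
echo "tenant_$(slugify "Pine Creek HOA")_${SUFFIX}"
```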