STR_Optimization_Manager/str-optimization-manager-spec.md
olsch01 d4c714fadc Initial commit: STR Optimization Manager MVP
Full-stack short-term rental management platform with:
- React/Vite frontend with dark theme dashboard, performance, pricing,
  reservations, experiments, and settings pages
- Fastify API server with auth, platform management, performance tracking,
  pricing, reservations, experiments, and weekly report endpoints
- Playwright-based scraper service with Airbnb adapter (login with MFA,
  performance metrics, reservations, calendar pricing, price changes)
- VRBO adapter scaffold and mock adapter for development
- PostgreSQL with Drizzle ORM, migrations, and seed scripts
- Job queue with worker for async scraping tasks
- AES-256-GCM credential encryption for platform credentials
- Session cookie persistence for scraper browser sessions
- Docker Compose for PostgreSQL database

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 15:04:12 -04:00


# STR Optimization Manager — Claude Code Requirements Specification
## 1. Project Overview
A self-hosted, Dockerized web application for managing and optimizing a short-term rental (STR) property listed on multiple platforms (initially Airbnb and VRBO). The system automates daily performance data collection via browser automation, stores historical metrics in a local database, enables bulk pricing management across platforms, tracks pricing experiments (A/B style), and syncs reservation data for long-term record keeping.
**Design philosophy:** Modular by platform and by property. Every platform integration is an isolated adapter. Adding a new platform or a second property should require only a new adapter and config entry, not architectural changes.
---
## 2. Goals & Non-Goals
### Goals
- Automated daily (and on-demand) scraping of performance metrics from Airbnb and VRBO
- Local time-series database of all collected metrics
- Dashboard with filterable, date-ranged charts for performance analysis
- Bulk pricing management across platforms (with preview/diff before commit)
- Pricing change log with experiment tagging and correlation to booking outcomes
- Full reservation sync and local storage
- Weekly performance summary email report
- Docker Compose deployment, runs on Mac (dev) and Debian (prod)
- Responsive UI: desktop and mobile
### Non-Goals (explicitly out of scope for v1)
- Multi-property support (architecture should allow it later, but not built now)
- Cleaning/turnover scheduling
- Guest messaging
- Expense tracking / P&L
- Tax reporting
- Public platform APIs (all data collection is via authenticated browser sessions)
---
## 3. Architecture Overview
```
┌────────────────────────────────────────────────────────────────┐
│                         Docker Compose                         │
│                                                                │
│  ┌──────────────┐   ┌──────────────┐   ┌────────────────┐      │
│  │   Frontend   │   │  API Server  │   │ Scraper Worker │      │
│  │  (React/TS)  │◄──│  (Node/TS +  │◄──│  (Playwright + │      │
│  │   Vite SPA   │   │   Fastify)   │   │    adapters)   │      │
│  └──────────────┘   └──────┬───────┘   └───────┬────────┘      │
│                            │                   │               │
│                    ┌───────▼───────────────────▼───────┐       │
│                    │        PostgreSQL Database        │       │
│                    └───────────────────────────────────┘       │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │               Scheduler (node-cron in API)               │  │
│  └──────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘
```
### Services
| Service | Image | Purpose |
|---|---|---|
| `frontend` | Node (build) → nginx | React SPA served via nginx |
| `api` | Node 20 Alpine | Fastify REST API + cron scheduler |
| `scraper` | Node 20 + Playwright | Browser automation workers |
| `db` | PostgreSQL 16 Alpine | Primary data store |
---
## 4. Tech Stack
| Layer | Technology |
|---|---|
| Backend API | Node.js 20 + TypeScript + Fastify |
| Browser Automation | Playwright (headless Chromium) |
| Database | PostgreSQL 16 |
| ORM / Migrations | Drizzle ORM + drizzle-kit |
| Job Scheduling | node-cron |
| Frontend | React 18 + TypeScript + Vite |
| Charting | Recharts |
| UI Components | shadcn/ui + Tailwind CSS |
| Email | Nodemailer (SMTP) |
| Auth (app login) | Simple single-user session auth with bcrypt + JWT stored in httpOnly cookie |
| Container | Docker + Docker Compose v2 |
| Secrets | `.env` file (never committed), credentials encrypted at rest in DB using AES-256 |
---
## 5. Platform Adapter Interface
Every platform integration implements this TypeScript interface. This is the core abstraction that keeps the system extensible.
```typescript
interface PlatformAdapter {
  readonly platformId: string; // e.g. 'airbnb' | 'vrbo'
  readonly displayName: string;

  // Session management
  login(credentials: Credentials): Promise<void>;
  isSessionValid(): Promise<boolean>;
  saveSession(store: SessionStore): Promise<void>;
  restoreSession(store: SessionStore): Promise<boolean>;

  // Data collection
  scrapePerformanceMetrics(): Promise<PerformanceSnapshot>;
  scrapeReservations(): Promise<Reservation[]>;
  scrapePricing(dateRange: DateRange): Promise<DailyPrice[]>;

  // Pricing mutations
  previewPriceChanges(changes: PriceChange[]): Promise<PriceChangeDiff>;
  applyPriceChanges(changes: PriceChange[]): Promise<PriceChangeResult>;

  // Adapter health
  selfTest(): Promise<AdapterHealthStatus>;
}
```
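Section 15 calls for a mock adapter to drive frontend and API development before any real scraper exists. A minimal sketch of what that could look like — the stub types below are assumptions standing in for the shared-types package, and the canned values are arbitrary:

```typescript
// Stub types — placeholders for the real definitions in packages/shared-types.
type Credentials = { username: string; password: string };
type SessionStore = { save(data: string): void; load(): string | null };
type DateRange = { from: string; to: string };
type PerformanceSnapshot = { viewsSearch: number; viewsListing: number; bookingsCount: number };
type Reservation = { id: string; checkIn: string; checkOut: string };
type DailyPrice = { date: string; price: number; isAvailable: boolean };
type PriceChange = { date: string; price: number };
type PriceChangeDiff = { date: string; before: number; after: number }[];
type PriceChangeResult = { success: string[]; failed: string[] };
type AdapterHealthStatus = { ok: boolean; message: string };

// In-memory mock: no browser, no network — returns canned data so the API
// and UI can be built end-to-end before the Airbnb adapter exists.
class MockAdapter {
  readonly platformId = 'mock';
  readonly displayName = 'Mock Platform';
  private loggedIn = false;
  private prices = new Map<string, number>();

  async login(_credentials: Credentials): Promise<void> { this.loggedIn = true; }
  async isSessionValid(): Promise<boolean> { return this.loggedIn; }
  async saveSession(store: SessionStore): Promise<void> {
    store.save(JSON.stringify({ loggedIn: this.loggedIn }));
  }
  async restoreSession(store: SessionStore): Promise<boolean> {
    const raw = store.load();
    if (!raw) return false;
    this.loggedIn = JSON.parse(raw).loggedIn;
    return this.loggedIn;
  }
  async scrapePerformanceMetrics(): Promise<PerformanceSnapshot> {
    return { viewsSearch: 1200, viewsListing: 180, bookingsCount: 6 };
  }
  async scrapeReservations(): Promise<Reservation[]> {
    return [{ id: 'MOCK-1', checkIn: '2026-04-01', checkOut: '2026-04-04' }];
  }
  async scrapePricing(_range: DateRange): Promise<DailyPrice[]> {
    return [...this.prices].map(([date, price]) => ({ date, price, isAvailable: true }));
  }
  async previewPriceChanges(changes: PriceChange[]): Promise<PriceChangeDiff> {
    return changes.map((c) => ({ date: c.date, before: this.prices.get(c.date) ?? 0, after: c.price }));
  }
  async applyPriceChanges(changes: PriceChange[]): Promise<PriceChangeResult> {
    for (const c of changes) this.prices.set(c.date, c.price);
    return { success: changes.map((c) => c.date), failed: [] };
  }
  async selfTest(): Promise<AdapterHealthStatus> {
    return { ok: true, message: 'mock adapter always healthy' };
  }
}
```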
### Initial Adapters
- `AirbnbAdapter` — targets Airbnb Host dashboard
- `VrboAdapter` — targets VRBO Owner dashboard
### Adapter File Structure
```
src/
  adapters/
    base/
      PlatformAdapter.ts    ← interface + shared types
      SessionStore.ts       ← encrypted session persistence
      AdapterRegistry.ts    ← registers all adapters
    airbnb/
      AirbnbAdapter.ts      ← main adapter class
      airbnb.selectors.ts   ← ALL CSS selectors isolated here
      airbnb.flows.ts       ← login, nav, scraping flows
    vrbo/
      VrboAdapter.ts
      vrbo.selectors.ts
      vrbo.flows.ts
```
**Critical pattern:** All CSS selectors and XPaths live in `*.selectors.ts` files only. When a platform updates their UI, only that file needs updating — never the business logic.
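A sketch of what such a selectors file could look like. The selector values here are placeholders, not real Airbnb DOM selectors — the actual values must be discovered against the live host dashboard:

```typescript
// airbnb.selectors.ts — every selector the adapter touches, in one place.
// All values below are illustrative placeholders.
export const AirbnbSelectors = {
  login: {
    emailInput: 'input[name="email"]',
    passwordInput: 'input[name="password"]',
    submitButton: 'button[type="submit"]',
  },
  performance: {
    searchViews: '[data-testid="search-views-value"]',
    listingViews: '[data-testid="listing-views-value"]',
  },
  calendar: {
    // Parameterized selectors are plain functions, still in this file.
    dayCell: (isoDate: string) => `[data-date="${isoDate}"]`,
    priceInput: 'input[data-testid="price-input"]',
  },
} as const;
```

Flow code then imports `AirbnbSelectors` and never contains a selector literal, so a UI change on the platform side is a one-file fix.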
---
## 6. Database Schema
### `platforms`
| Column | Type | Notes |
|---|---|---|
| id | varchar PK | e.g. 'airbnb' |
| display_name | varchar | |
| credentials_encrypted | text | AES-256 encrypted JSON |
| session_data_encrypted | text | Stored browser session |
| last_scrape_at | timestamptz | |
| is_active | boolean | |
### `performance_snapshots`
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| platform_id | varchar FK | |
| captured_at | timestamptz | When this row was written |
| period_label | varchar | e.g. 'last_30_days' |
| views_search | integer | Times appeared in search |
| views_listing | integer | Times listing was clicked |
| conversion_rate | numeric | views_listing / views_search |
| bookings_count | integer | |
| occupancy_rate | numeric | % of available days booked |
| avg_daily_rate | numeric | |
| revenue_total | numeric | |
| raw_json | jsonb | Full raw payload for future parsing |
### `daily_prices`
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| platform_id | varchar FK | |
| date | date | The night in question |
| price | numeric | |
| is_available | boolean | |
| min_stay_nights | integer | |
| synced_at | timestamptz | |
### `price_changes`
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| platform_id | varchar FK | |
| date | date | Night being changed |
| price_before | numeric | |
| price_after | numeric | |
| changed_at | timestamptz | |
| changed_by | varchar | 'scheduled' or 'manual' |
| note | text | User-provided reason |
| experiment_id | uuid FK nullable | |
### `experiments`
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| name | varchar | e.g. "Lower weekend rate Jan test" |
| hypothesis | text | What you expect to happen |
| start_date | date | |
| end_date | date | |
| status | varchar | 'active' \| 'completed' \| 'cancelled' |
| created_at | timestamptz | |
| conclusion | text | Notes written at end |
### `reservations`
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| platform_id | varchar FK | |
| platform_reservation_id | varchar | Native ID from platform |
| guest_name | varchar | |
| check_in | date | |
| check_out | date | |
| nights | integer | |
| guests_count | integer | |
| nightly_rate | numeric | |
| cleaning_fee | numeric | |
| platform_fee | numeric | |
| total_payout | numeric | |
| status | varchar | 'confirmed' \| 'cancelled' \| 'completed' |
| booked_at | timestamptz | |
| synced_at | timestamptz | |
| raw_json | jsonb | |
### `scrape_jobs`
| Column | Type | Notes |
|---|---|---|
| id | uuid PK | |
| platform_id | varchar FK | |
| job_type | varchar | 'performance' \| 'pricing' \| 'reservations' |
| triggered_by | varchar | 'schedule' \| 'manual' |
| status | varchar | 'pending' \| 'running' \| 'success' \| 'failed' |
| started_at | timestamptz | |
| completed_at | timestamptz | |
| error_message | text | |
| rows_collected | integer | |
---
## 7. API Endpoints
All endpoints are prefixed `/api/v1`. Auth required on all except `/api/v1/auth/login`.
### Auth
| Method | Path | Description |
|---|---|---|
| POST | `/auth/login` | App login (single user) |
| POST | `/auth/logout` | Clear session |
| GET | `/auth/me` | Current session info |
### Platforms
| Method | Path | Description |
|---|---|---|
| GET | `/platforms` | List platforms and status |
| PUT | `/platforms/:id/credentials` | Update stored credentials |
| POST | `/platforms/:id/test` | Test login + adapter health |
| POST | `/platforms/:id/scrape` | Trigger on-demand scrape (all types) |
| GET | `/platforms/:id/scrape-jobs` | Recent job history |
### Performance
| Method | Path | Description |
|---|---|---|
| GET | `/performance/snapshots` | Query snapshots, supports `?platform=&from=&to=` |
| GET | `/performance/summary` | Aggregated summary across platforms |
| GET | `/performance/trends` | Time-series data for charts |
### Pricing
| Method | Path | Description |
|---|---|---|
| GET | `/pricing/calendar` | All daily prices `?platform=&from=&to=` |
| POST | `/pricing/preview` | Dry-run bulk price changes, returns diff |
| POST | `/pricing/apply` | Apply previewed changes after confirmation |
| GET | `/pricing/changes` | Price change log `?platform=&from=&to=&experiment_id=` |
### Experiments
| Method | Path | Description |
|---|---|---|
| GET | `/experiments` | List all experiments |
| POST | `/experiments` | Create new experiment |
| PUT | `/experiments/:id` | Update (add conclusion, change status) |
| GET | `/experiments/:id/analysis` | Correlation: price changes → bookings/views |
### Reservations
| Method | Path | Description |
|---|---|---|
| GET | `/reservations` | List reservations `?platform=&status=&from=&to=` |
| GET | `/reservations/summary` | Occupancy, revenue totals by month/year |
### Reports
| Method | Path | Description |
|---|---|---|
| POST | `/reports/weekly/send` | Manually trigger weekly email report |
| GET | `/reports/weekly/preview` | Preview this week's report as JSON |
---
## 8. Frontend — Pages & Views
### Navigation Structure
```
/ (Dashboard)
/performance
/pricing
/experiments
/reservations
/settings
```
### Dashboard (`/`)
- KPI cards: occupancy rate, avg daily rate, total revenue MTD, search views (last 30d)
- Side-by-side platform comparison for all KPIs
- Booking trend sparkline (last 90 days)
- Recent reservations list (last 5)
- Scraper job status indicators (last run time per platform, success/fail badge)
- "Run Scrape Now" button per platform
### Performance (`/performance`)
- Date range picker (presets: 7d, 30d, 90d, YTD, custom)
- Platform filter toggle (All / Airbnb / VRBO)
- Charts (all use Recharts):
- Search views over time (line)
- Listing click-through rate over time (line)
- Bookings per week (bar)
- Occupancy rate over time (area)
- Avg daily rate over time (line, overlaid with booking events)
- Data table below charts: raw snapshot history, exportable to CSV
### Pricing (`/pricing`)
- Calendar grid view: each day shows price per platform, color-coded by deviation from base rate
- Sidebar panel: select date range + enter new price → generates preview
- Preview modal: shows diff table (date | platform | old price | new price) before any changes go live. Requires explicit "Confirm & Apply" button.
- Pricing change log table: filterable by platform, date range, experiment
- "Link to Experiment" action on any change or group of changes
### Experiments (`/experiments`)
- List view: all experiments with status badge, date range, linked price changes count
- Create experiment modal: name, hypothesis, date range, initial notes
- Experiment detail page:
- Linked price changes table
- Performance chart for the experiment date range (views, bookings, occupancy)
- Before/after comparison: avg metrics N days before vs during experiment
- Conclusion text field (editable when status = completed)
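The before/after comparison above reduces to averaging a metric over the N days before the start date versus the experiment window itself. A sketch with assumed daily-metric inputs (one value per day, e.g. search views from `performance_snapshots`):

```typescript
// Average a daily metric over the lookback window before the experiment
// vs the experiment window itself.
type DailyMetric = { date: string; value: number };

function avg(values: number[]): number {
  return values.length ? values.reduce((a, b) => a + b, 0) / values.length : 0;
}

function beforeAfter(
  metrics: DailyMetric[],
  startDate: string,
  endDate: string,
  lookbackDays: number,
): { beforeAvg: number; duringAvg: number } {
  const start = new Date(startDate).getTime();
  const end = new Date(endDate).getTime();
  const lookbackStart = start - lookbackDays * 86_400_000;
  const at = (m: DailyMetric) => new Date(m.date).getTime();
  const before = metrics.filter((m) => at(m) >= lookbackStart && at(m) < start);
  const during = metrics.filter((m) => at(m) >= start && at(m) <= end);
  return { beforeAvg: avg(before.map((m) => m.value)), duringAvg: avg(during.map((m) => m.value)) };
}
```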
### Reservations (`/reservations`)
- Table: all reservations, sortable, filterable by platform/status/date range
- Monthly occupancy heatmap calendar
- Revenue by month bar chart
- YoY comparison once data spans 12+ months
### Settings (`/settings`)
- Platform credentials (masked, update form per platform)
- App login password change
- Scrape schedule configuration (time of day for daily run)
- SMTP configuration for weekly report email
- Adapter health check panel: "Test Connection" per platform with live output log
---
## 9. Scraper Worker — Detailed Behavior
### Session Management
- On first run, performs full login flow (email → password → handle any MFA prompt interactively via a special "needs attention" UI state)
- After successful login, saves browser storage state (cookies + localStorage) encrypted to DB
- On subsequent runs, restores session state and verifies validity before scraping
- If session invalid, re-triggers login flow and flags for user attention if MFA required
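The encrypted-at-rest requirement can be met with Node's built-in `node:crypto` AES-256-GCM primitives — a sketch of the helper pair (the payload layout of iv + tag + ciphertext is one reasonable convention, not mandated by the spec):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// AES-256-GCM helpers for credentials / session state at rest.
// `key` must be exactly 32 bytes (decoded from ENCRYPTION_KEY).
export function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // standard 96-bit GCM nonce
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Pack iv + auth tag + ciphertext into one base64 string for DB storage.
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

export function decrypt(payload: string, key: Buffer): string {
  const raw = Buffer.from(payload, 'base64');
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28); // GCM auth tag is 16 bytes
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

GCM's auth tag means a tampered or wrongly-keyed payload fails loudly at decrypt time rather than returning garbage.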
### Scrape Job Flow
```
1. Job queued (by scheduler or API trigger)
2. Worker picks up job
3. Restore session → validate → re-login if needed
4. Navigate to performance dashboard → extract metrics → insert performance_snapshot row
5. Navigate to calendar/pricing → extract N days of pricing → upsert daily_prices rows
6. Navigate to reservations → extract all reservations → upsert reservations rows
7. Update scrape_jobs row with status + counts
8. Emit websocket event → UI updates in real time
```
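To keep `scrape_jobs` rows honest, the worker might guard status updates so a job can never move backwards (e.g. `success` → `running`). A small sketch using the status values from section 6:

```typescript
// Allowed forward transitions for scrape_jobs.status.
type JobStatus = 'pending' | 'running' | 'success' | 'failed';

const ALLOWED: Record<JobStatus, JobStatus[]> = {
  pending: ['running'],
  running: ['success', 'failed'],
  success: [], // terminal
  failed: [],  // terminal
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return ALLOWED[from].includes(to);
}
```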
### Error Handling
- Retry up to 3 times on transient failures (network, selector not found)
- On persistent failure: mark job as failed, store error message, surface in UI
- Never silently swallow errors — all failures logged to scrape_jobs table
- Screenshot on failure: save to `/data/screenshots/` volume for debugging
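The retry rule above can be wrapped in one generic helper — a sketch with exponential backoff; the attempt count and base delay are assumptions matching the "up to 3 times" rule:

```typescript
// Retry a scraper step up to `attempts` times with exponential backoff.
// Rethrows the last error so the caller can mark the job failed and
// take the failure screenshot.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 1000): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i)); // 1s, 2s, 4s…
      }
    }
  }
  throw lastError;
}
```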
### Anti-Detection Considerations (document in spec, implement in adapters)
- Randomized delays between actions (200–800 ms)
- Human-like mouse movement patterns via Playwright's `mouse.move()`
- Persist and reuse sessions to minimize login frequency
- Run during off-peak hours by default (configurable)
- User-agent set to current stable Chrome
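The randomized-delay bullet is a one-liner in practice — a sketch, with the 200–800 ms bounds from above as defaults:

```typescript
// Uniformly random delay in [min, max] milliseconds.
function randomDelayMs(min = 200, max = 800): number {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Human-like pause between Playwright actions.
async function humanPause(min = 200, max = 800): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, randomDelayMs(min, max)));
}
```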
---
## 10. Pricing Change Flow (Safety-First)
This flow must never apply changes without explicit user confirmation.
```
1. User selects dates + enters new price
2. POST /pricing/preview
3. System queries current prices from daily_prices table
4. Returns diff: [{date, platform, currentPrice, newPrice, delta}]
5. UI renders preview modal with full diff table
6. User reviews → clicks "Confirm & Apply"
7. POST /pricing/apply (idempotency key from preview response)
8. Scraper worker opens browser → navigates to platform calendar
9. Applies changes date by date (with verification reads after each)
10. Writes price_changes rows for each date changed
11. Returns result: {success: [], failed: []}
12. UI shows success/failure summary
```
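The preview step is pure data work against `daily_prices`. A sketch of the diff construction (field names mirror the diff shape in the flow; the zero fallback for unknown nights is an assumption):

```typescript
// Build the preview diff from known current prices and the requested change.
type CurrentPrice = { date: string; platform: string; price: number };
type DiffRow = { date: string; platform: string; currentPrice: number; newPrice: number; delta: number };

function buildPreviewDiff(
  current: CurrentPrice[],
  dates: string[],
  platform: string,
  newPrice: number,
): DiffRow[] {
  return dates.map((date) => {
    const row = current.find((c) => c.date === date && c.platform === platform);
    const currentPrice = row ? row.price : 0; // 0 = no synced price yet for that night
    return { date, platform, currentPrice, newPrice, delta: newPrice - currentPrice };
  });
}
```

The `/pricing/apply` handler would then accept only diffs it issued itself (via the idempotency key), never a client-constructed change list.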
---
## 11. Weekly Report Email
Sent every Monday at 8am (configurable). Contains:
- **This week vs last week:** views, clicks, CTR, bookings
- **MTD vs same period last month:** revenue, occupancy
- **Upcoming 30 days:** occupancy %, revenue booked
- **Active experiments:** name, days running, early metric movement
- **Pricing changes this week:** count, avg delta
- **Any scraper failures** from the past 7 days
Format: HTML email with inline styles (no external CSS). Plain-text fallback included.
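The "this week vs last week" header rows reduce to percentage deltas — a sketch of that computation (the metric field names are assumptions; `null` signals "no prior data" rather than a fake 0% change):

```typescript
// Week-over-week percentage deltas for the report header.
type WeekMetrics = { views: number; clicks: number; bookings: number };

function weekOverWeek(thisWeek: WeekMetrics, lastWeek: WeekMetrics) {
  // null when the previous week had no data, so the template can print "n/a".
  const pct = (now: number, prev: number): number | null =>
    prev === 0 ? null : ((now - prev) / prev) * 100;
  return {
    views: pct(thisWeek.views, lastWeek.views),
    clicks: pct(thisWeek.clicks, lastWeek.clicks),
    ctr: pct(thisWeek.clicks / Math.max(thisWeek.views, 1), lastWeek.clicks / Math.max(lastWeek.views, 1)),
    bookings: pct(thisWeek.bookings, lastWeek.bookings),
  };
}
```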
---
## 12. Docker Compose — Full Stack Definition
```yaml
# docker-compose.yml
version: '3.9'

services:
  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: str_manager
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5

  api:
    build:
      context: ./apps/api
      dockerfile: Dockerfile
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://${DB_USER}:${DB_PASSWORD}@db:5432/str_manager
      JWT_SECRET: ${JWT_SECRET}
      ENCRYPTION_KEY: ${ENCRYPTION_KEY}
      SMTP_HOST: ${SMTP_HOST}
      SMTP_PORT: ${SMTP_PORT}
      SMTP_USER: ${SMTP_USER}
      SMTP_PASS: ${SMTP_PASS}
      REPORT_EMAIL_TO: ${REPORT_EMAIL_TO}
      SCRAPER_URL: http://scraper:3001
    ports:
      - "3000:3000"

  scraper:
    build:
      context: ./apps/scraper
      dockerfile: Dockerfile
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://${DB_USER}:${DB_PASSWORD}@db:5432/str_manager
      ENCRYPTION_KEY: ${ENCRYPTION_KEY}
      PLAYWRIGHT_HEADLESS: "true"
    volumes:
      - scraper_screenshots:/app/screenshots
    shm_size: '2gb' # Required for Chromium in Docker

  frontend:
    build:
      context: ./apps/frontend
      dockerfile: Dockerfile
    restart: unless-stopped
    depends_on:
      - api
    ports:
      - "80:80"
      - "443:443"

volumes:
  postgres_data:
  scraper_screenshots:
```
### `.env.example`
```env
# Database
DB_USER=str_manager
DB_PASSWORD=changeme_strong_password
# App Security
JWT_SECRET=changeme_64_char_random_string
ENCRYPTION_KEY=changeme_32_char_aes_key
# Email (weekly report)
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=you@gmail.com
SMTP_PASS=your_app_password
REPORT_EMAIL_TO=you@gmail.com
# App login
APP_USERNAME=admin
APP_PASSWORD_HASH=bcrypt_hash_of_your_password
```
---
## 13. Monorepo Structure
```
str-optimization-manager/
├── docker-compose.yml
├── docker-compose.dev.yml    ← mounts source for hot reload
├── .env.example
├── .gitignore                ← must include .env, screenshots/
├── README.md
├── apps/
│   ├── api/
│   │   ├── Dockerfile
│   │   ├── package.json
│   │   ├── tsconfig.json
│   │   └── src/
│   │       ├── index.ts          ← Fastify server entry
│   │       ├── routes/           ← one file per route group
│   │       ├── services/         ← business logic
│   │       ├── db/
│   │       │   ├── schema.ts     ← Drizzle schema (single source of truth)
│   │       │   └── migrations/
│   │       ├── scheduler/
│   │       │   └── cron.ts
│   │       └── email/
│   │           └── weeklyReport.ts
│   ├── scraper/
│   │   ├── Dockerfile
│   │   ├── package.json
│   │   ├── tsconfig.json
│   │   └── src/
│   │       ├── index.ts          ← HTTP server (receives jobs from api)
│   │       ├── adapters/         ← see section 5
│   │       ├── queue/            ← in-memory job queue
│   │       └── utils/
│   │           ├── browser.ts    ← Playwright browser factory
│   │           └── encryption.ts ← AES-256 helpers
│   └── frontend/
│       ├── Dockerfile
│       ├── nginx.conf
│       ├── package.json
│       ├── vite.config.ts
│       └── src/
│           ├── main.tsx
│           ├── App.tsx
│           ├── pages/
│           ├── components/
│           ├── hooks/
│           └── lib/
│               ├── api.ts        ← typed API client
│               └── utils.ts
└── packages/
    └── shared-types/             ← shared TypeScript types between api/scraper/frontend
        └── src/
            └── index.ts
```
---
## 14. UI Design Direction
**Aesthetic:** Clean, data-dense, utilitarian dashboard. Think Vercel analytics meets a Bloomberg terminal — dark mode default, monospaced numbers, high contrast data visualizations. Not a generic SaaS template.
**Color palette:**
- Background: `#0a0a0a` (near-black)
- Surface: `#141414`
- Border: `#262626`
- Primary accent: `#22c55e` (green — for positive metrics, occupancy, revenue)
- Warning: `#f59e0b`
- Danger: `#ef4444`
- Text primary: `#fafafa`
- Text muted: `#737373`
- Chart colors: distinct, accessible palette (never encode meaning with red vs. green alone, for color-vision accessibility)
**Typography:**
- Numbers/data: `JetBrains Mono` or `IBM Plex Mono` — monospaced for alignment
- UI labels: `Geist` or `DM Sans` — clean, modern, not generic
**Interaction patterns:**
- All data tables have column sorting
- All date range pickers have keyboard support
- Loading states: skeleton screens (not spinners)
- Real-time job status via WebSocket (SSE acceptable as simpler alternative)
- Mobile: bottom tab navigation, cards stack vertically, charts scroll horizontally
---
## 15. Key Implementation Notes for Claude Code
1. **Start with the database schema and Drizzle migrations** — everything else derives from this
2. **Build the adapter interface and a mock adapter first** — use the mock for all frontend/API development before real scrapers are needed
3. **The Airbnb adapter is higher priority** than VRBO — build and test it first
4. **All selector strings must be constants** — never inline a CSS selector in logic code
5. **Preview-before-apply is non-negotiable** — the `/pricing/apply` endpoint must reject requests without a valid preview token
6. **Session encryption is day-one, not a later hardening step** — credentials never touch disk unencrypted
7. **The weekly email must work with any standard SMTP provider** — no vendor lock-in (no SendGrid dependency)
8. **Write a `docker-compose.dev.yml`** that mounts source volumes and enables hot reload for both api and frontend
9. **Include a `/api/v1/health` endpoint** that checks DB connectivity and returns scraper worker status
10. **The README must include** first-run setup steps, how to update platform credentials, how to add a new platform adapter, and how to run outside Docker for local development
---
## 16. Acceptance Criteria
The following must all be true for v1 to be considered complete:
- [ ] `docker compose up` from a fresh clone brings the full stack online
- [ ] App login works (single user, password-protected)
- [ ] Airbnb credentials can be entered, tested, and stored encrypted
- [ ] VRBO credentials can be entered, tested, and stored encrypted
- [ ] Manual "scrape now" triggers all three data collection types per platform
- [ ] Daily cron scrape runs at configured time
- [ ] Performance dashboard renders with real data from at least one platform
- [ ] All performance charts filter correctly by platform and date range
- [ ] Pricing calendar shows current prices per platform per day
- [ ] Bulk price change goes through preview → confirm → apply flow with no way to skip preview
- [ ] Every applied price change is recorded in price_changes table
- [ ] An experiment can be created, price changes linked to it, and the analysis view shows before/after metric comparison
- [ ] All reservations sync and display in reservations table
- [ ] Weekly report email sends successfully via configured SMTP
- [ ] UI is usable on a 390px wide mobile screen
- [ ] Scraper failure is visible in the UI within 60 seconds of occurrence
- [ ] `.env.example` covers every required environment variable