HOA_Financial_Platform/scripts
olsch01 c12ad94b7f fix: rewrite Bankrate scraper to extract actual bank names from offer cards
The previous scraper was picking up Bankrate's summary table
(.wealth-product-rate-list) which only has "best rates" per term with
no bank names, resulting in entries like "Top CD Rate - 1 year".

Now targets the actual bank offer cards in .wrt-RateSections-sponsoredoffers
and .wrt-RateSections-additionaloffers sections. Key changes:

- Extract bank names from img[alt] (logo) with text-based fallbacks
- Fix APY parsing to avoid Bankrate score leaking in (e.g. "4.5" score
  concatenated with "4.00%" APY was parsed as 0.4%)
- Handle both "Min. deposit" (CDs) and "Min. balance for APY" (savings/MM)
- Parse abbreviated terms from Bankrate (e.g. "1yr", "14mo")
- Strip product suffixes from bank names (e.g. "Synchrony Bank CD" → "Synchrony Bank")
- Filter out entries that aren't real banks (terms, dollar amounts)
- Keep a fallback strategy for future Bankrate layout changes
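
The term parsing, suffix stripping, and non-bank filtering listed above could look roughly like the following TypeScript sketch. The function names (`parseTermMonths`, `stripProductSuffix`, `looksLikeBankName`) and the suffix list are assumptions made for illustration, not the code in fetch-cd-rates.ts.

```typescript
// Hypothetical helpers; names and the suffix list are illustrative assumptions.

/** Convert an abbreviated term such as "1yr" or "14mo" to months, or null if unrecognized. */
function parseTermMonths(raw: string): number | null {
  const m = raw
    .trim()
    .toLowerCase()
    .match(/^(\d+(?:\.\d+)?)\s*(y(?:ea)?rs?|mo(?:nth)?s?)$/);
  if (!m) return null;
  const value = parseFloat(m[1]);
  // Units starting with "y" are years; everything else matched here is months.
  return Math.round(m[2].startsWith("y") ? value * 12 : value);
}

/** Remove a trailing product label so "Synchrony Bank CD" becomes "Synchrony Bank". */
function stripProductSuffix(name: string): string {
  return name.replace(/\s+(?:CD|Money Market Account|Savings Account)$/i, "").trim();
}

/** Reject strings that look like terms or dollar amounts rather than bank names. */
function looksLikeBankName(name: string): boolean {
  const s = name.trim();
  if (s.length === 0) return false;
  if (parseTermMonths(s) !== null) return false; // e.g. "1yr"
  if (/^\$[\d,]+(?:\.\d+)?$/.test(s)) return false; // e.g. "$1,000"
  return true;
}
```

Keeping these as small pure functions makes it possible to check them against captured card text without launching a browser.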

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 10:44:58 -05:00

HOA LedgerIQ - Scripts

Standalone scripts for data fetching, maintenance, and automation tasks.

CD Rate Fetcher

Scrapes the top 25 CD rates from Bankrate.com and stores them in the shared.cd_rates PostgreSQL table.
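
As a rough illustration of how a top-25 list might be produced, the sketch below sorts scraped entries by APY and keeps the first 25. The `CdRate` shape and the `topRates` helper are hypothetical, and ranking by APY is an assumption; the actual script and the shared.cd_rates columns may differ.

```typescript
// Hypothetical record shape; the real shared.cd_rates columns may differ.
interface CdRate {
  bankName: string;
  apy: number; // annual percentage yield in percent, e.g. 4.0
  termMonths: number;
}

/** Return the `limit` highest-APY rates, highest first, without mutating the input. */
function topRates(rates: CdRate[], limit = 25): CdRate[] {
  return [...rates].sort((a, b) => b.apy - a.apy).slice(0, limit);
}
```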

Note: Bankrate renders rate data dynamically via JavaScript, so this script uses Puppeteer (headless Chrome) to fully render the page before extracting data.
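
A minimal sketch of that render-then-extract flow is shown below, assuming the `puppeteer` package installed by `npm install`. The section selectors come from the commit notes and may change. The `fetchLogoAlts` helper, the timeouts, and the `networkidle2` wait condition are illustrative choices, not necessarily what fetch-cd-rates.ts does.

```typescript
// Section selectors taken from the commit notes; Bankrate may change them.
const OFFER_SECTIONS = [
  ".wrt-RateSections-sponsoredoffers",
  ".wrt-RateSections-additionaloffers",
];

// One combined selector for the logo images inside either section.
const LOGO_SELECTOR = OFFER_SECTIONS.map((s) => `${s} img[alt]`).join(", ");

// Minimal structural type for the elements handled inside the page.
type AltSource = { getAttribute(name: string): string | null };

/** Render `url` in headless Chrome and return the logo alt texts (hypothetical helper). */
async function fetchLogoAlts(url: string): Promise<string[]> {
  // Loaded lazily so this file can be imported without launching a browser.
  const moduleName = "puppeteer";
  const { default: puppeteer } = await import(moduleName);
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait for network activity to settle so client-side rendering can finish.
    await page.goto(url, { waitUntil: "networkidle2", timeout: 60_000 });
    await page.waitForSelector(LOGO_SELECTOR, { timeout: 30_000 });
    // Runs inside the page: read each logo image's alt attribute.
    const alts: string[] = await page.$$eval(LOGO_SELECTOR, (imgs: AltSource[]) =>
      imgs.map((img) => img.getAttribute("alt") ?? ""),
    );
    return alts.filter((alt) => alt.trim().length > 0);
  } finally {
    // Always release the browser, even if navigation or extraction fails.
    await browser.close();
  }
}
```

Calling `await fetchLogoAlts(url)` with the rates page URL would return candidate bank names. If `waitForSelector` times out, that usually points to a layout change rather than a network problem.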

Prerequisites

  • Node.js 20+
  • PostgreSQL with the shared.cd_rates table (created by db/init/00-init.sql or db/migrations/005-cd-rates.sql)
  • A .env file at the project root with DATABASE_URL

Manual Execution

cd scripts
npm install
npx tsx fetch-cd-rates.ts

Cron Setup

To run daily at 6:00 AM (in the system's local time zone):

# Edit crontab
crontab -e

# Add this line (adjust path to your project directory):
0 6 * * * cd /path/to/HOA_Financial_Platform/scripts && /usr/local/bin/npx tsx fetch-cd-rates.ts >> /var/log/hoa-cd-rates.log 2>&1

For Docker-based deployments, you can use a host cron job that runs the script inside the container with docker exec:

0 6 * * * docker exec hoa-backend sh -c "cd /app/scripts && npx tsx fetch-cd-rates.ts" >> /var/log/hoa-cd-rates.log 2>&1

Troubleshooting

  • 0 rates extracted: Bankrate has likely changed its page structure. Inspect the page DOM in a browser and update the CSS selectors in fetch-cd-rates.ts.
  • Database connection error: Verify DATABASE_URL in .env points to the correct PostgreSQL instance. For local development (outside Docker), use localhost:5432 instead of postgres:5432.
  • Puppeteer launch error: Ensure Chromium's system dependencies are installed. On Ubuntu (as root or with sudo): apt-get install -y libnss3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxrandr2 libgbm1 libpango-1.0-0 libasound2. On newer Ubuntu releases, libasound2 may need to be replaced with libasound2t64.
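
For the database connection case above, a small diagnostic such as the following sketch can show which host and port DATABASE_URL resolves to without printing the password. The `describeDatabaseUrl` helper is hypothetical and not part of the scripts.

```typescript
/** Summarize a PostgreSQL connection URL without exposing the password (hypothetical helper). */
function describeDatabaseUrl(raw: string | undefined): string {
  if (!raw) return "DATABASE_URL is not set";
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return "DATABASE_URL is not a valid URL";
  }
  const port = url.port || "5432"; // PostgreSQL's default port
  const database = url.pathname.replace(/^\//, "") || "(none)";
  // "postgres" is the Docker host name mentioned in the troubleshooting note.
  const hint =
    url.hostname === "postgres"
      ? " (Docker service name; use localhost when running outside Docker)"
      : "";
  return `host=${url.hostname} port=${port} database=${database}${hint}`;
}
```

Logging `describeDatabaseUrl(process.env.DATABASE_URL)` at startup would make a wrong host visible in /var/log/hoa-cd-rates.log before any connection attempt.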