The previous scraper was picking up Bankrate's summary table (.wealth-product-rate-list) which only has "best rates" per term with no bank names, resulting in entries like "Top CD Rate - 1 year". Now targets the actual bank offer cards in .wrt-RateSections-sponsoredoffers and .wrt-RateSections-additionaloffers sections.

Key changes:
- Extract bank names from img[alt] (logo) with text-based fallbacks
- Fix APY parsing to avoid Bankrate score leaking in (e.g. "4.5" score concatenated with "4.00%" APY was parsed as 0.4%)
- Handle both "Min. deposit" (CDs) and "Min. balance for APY" (savings/MM)
- Parse abbreviated terms from Bankrate (e.g. "1yr", "14mo")
- Strip product suffixes from bank names (e.g. "Synchrony Bank CD" → "Synchrony Bank")
- Filter out entries that aren't real banks (terms, dollar amounts)
- Keep a fallback strategy for future Bankrate layout changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
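Three of the parsing changes listed above can be illustrated with small pure functions. This is a minimal sketch, not the code in fetch-cd-rates.ts: the function names, the regular expressions, and the list of product suffixes are all assumptions made for illustration. The APY helper assumes the card's text is available as separate fragments (one per text node), which is one way to keep a neighbouring score from fusing with the percentage.

```typescript
// Hypothetical helpers; names and patterns are illustrative only.

/** Pick the APY from separate text fragments so a nearby score cannot fuse with it. */
function parseApy(fragments: string[]): number | null {
  for (const fragment of fragments) {
    // Accept only a fragment that is a percentage on its own, e.g. "4.00%".
    const match = fragment.trim().match(/^(\d{1,2}(?:\.\d{1,2})?)\s*%$/);
    if (match) return parseFloat(match[1]);
  }
  return null;
}

/** Convert abbreviated terms such as "1yr" or "14mo" to a number of months. */
function parseTermMonths(term: string): number | null {
  // A leading number followed by a unit starting with "y" (years) or "m" (months).
  const match = term.trim().match(/^(\d+(?:\.\d+)?)\s*(y|m)/i);
  if (!match) return null;
  const value = parseFloat(match[1]);
  return match[2].toLowerCase() === "y" ? Math.round(value * 12) : Math.round(value);
}

/** Strip a trailing product label, e.g. "Synchrony Bank CD" -> "Synchrony Bank". */
function cleanBankName(raw: string): string {
  // The suffix list is an assumption; extend it to match what the page shows.
  return raw.trim().replace(/\s+(?:CDs?|Savings|Money Market)(?:\s+Account)?$/i, "").trim();
}
```

Because the score fragment "4.5" carries no percent sign, parseApy skips it and reads the APY only from "4.00%".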
HOA LedgerIQ - Scripts
Standalone scripts for data fetching, maintenance, and automation tasks.
CD Rate Fetcher
Scrapes the top 25 CD rates from Bankrate.com and stores them in the shared.cd_rates PostgreSQL table.
Note: Bankrate renders rate data dynamically via JavaScript, so this script uses Puppeteer (headless Chrome) to fully render the page before extracting data.
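As a rough illustration of the "top 25" selection, the sketch below sorts scraped offers by APY and keeps the first 25. It assumes that "top" means highest APY; the CdRate shape and the topRates name are hypothetical and may not match the fields the script actually stores.

```typescript
// Hypothetical shape of one scraped offer; the real script's fields may differ.
interface CdRate {
  bankName: string;
  apy: number; // percent, e.g. 4.25
  termMonths: number;
}

/** Return the `limit` highest-APY offers without mutating the input array. */
function topRates(rates: CdRate[], limit = 25): CdRate[] {
  // Copy first so the caller's array keeps its original order.
  return [...rates].sort((a, b) => b.apy - a.apy).slice(0, limit);
}
```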
Prerequisites
- Node.js 20+
- PostgreSQL with the shared.cd_rates table (created by db/init/00-init.sql or db/migrations/005-cd-rates.sql)
- A .env file at the project root with DATABASE_URL
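A quick preflight check on DATABASE_URL can catch a missing or malformed value before the scraper starts. The sketch below is an assumption about how such a check might look, not part of the script; describeDatabaseUrl is a hypothetical name, and it relies only on the standard URL class available in Node.js.

```typescript
/** Parse a PostgreSQL connection string and report the host, port, and database it targets. */
function describeDatabaseUrl(
  databaseUrl: string | undefined
): { host: string; port: string; database: string } {
  if (!databaseUrl) {
    throw new Error("DATABASE_URL is not set; add it to the .env file at the project root");
  }
  const url = new URL(databaseUrl); // throws a TypeError on malformed input
  if (url.protocol !== "postgres:" && url.protocol !== "postgresql:") {
    throw new Error(`Unexpected protocol in DATABASE_URL: ${url.protocol}`);
  }
  return {
    host: url.hostname,
    port: url.port || "5432", // fall back to PostgreSQL's default port when none is given
    database: url.pathname.replace(/^\//, ""),
  };
}
```

Calling it with process.env.DATABASE_URL and logging the returned host makes it easy to see whether the script is pointed at localhost or at a Docker service name.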
Manual Execution
cd scripts
npm install
npx tsx fetch-cd-rates.ts
Cron Setup
To run daily at 6:00 AM:
# Edit crontab
crontab -e
# Add this line (adjust path to your project directory):
0 6 * * * cd /path/to/HOA_Financial_Platform/scripts && /usr/local/bin/npx tsx fetch-cd-rates.ts >> /var/log/hoa-cd-rates.log 2>&1
For Docker-based deployments, you can use a host cron job that runs the script inside the container via docker exec:
0 6 * * * docker exec hoa-backend sh -c "cd /app/scripts && npx tsx fetch-cd-rates.ts" >> /var/log/hoa-cd-rates.log 2>&1
Troubleshooting
- 0 rates extracted: Bankrate likely changed their page structure. Inspect the page DOM in a browser and update the CSS selectors in fetch-cd-rates.ts.
- Database connection error: Verify DATABASE_URL in .env points to the correct PostgreSQL instance. For local development (outside Docker), use localhost:5432 instead of postgres:5432.
- Puppeteer launch error: Ensure Chromium dependencies are installed. On Ubuntu:
apt-get install -y libnss3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxrandr2 libgbm1 libpango-1.0-0 libasound2
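For the "0 rates extracted" case, one way to make the failure obvious in the cron log is to try several extraction strategies in order and throw when all of them come back empty. This is a minimal sketch under assumptions: the Strategy type and extractWithFallback name are hypothetical, and the strategies are synchronous here for brevity, whereas real page extraction with Puppeteer would be asynchronous.

```typescript
// A strategy is any named function that returns the rows it managed to extract.
type Strategy<T> = { name: string; run: () => T[] };

/** Try each strategy in order and return the first non-empty result. */
function extractWithFallback<T>(strategies: Strategy<T>[]): { used: string; rows: T[] } {
  for (const strategy of strategies) {
    const rows = strategy.run();
    if (rows.length > 0) return { used: strategy.name, rows };
  }
  // Nothing matched: fail loudly so the log points at the selectors.
  throw new Error(
    "0 rates extracted: the page structure may have changed; check the CSS selectors in fetch-cd-rates.ts"
  );
}
```

Logging the returned used value shows which strategy produced the data, which helps spot when the primary selectors have silently stopped matching.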