fix: improve AI health score accuracy and consistency

Address 4 issues identified in AI feature audit:

1. Reduce temperature from 0.3 to 0.1 for health score calculations
   to reduce 16-40 point score volatility across runs

2. Add explicit cash runway classification rules to operating prompt
   preventing the model from rating sub-3-month runway as "positive"

3. Pre-compute total special assessment income in both operating and
   reserve prompts, eliminating per-unit vs total confusion ($300
   vs $20,100)

4. Make YTD budget comparison actuals-aware: only compare months with
   posted journal entries, show current month budget separately, and
   add prompt guidance about month-end posting cadence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:44:12 -05:00
parent a0b366e94a
commit 07d15001ae
5 changed files with 790 additions and 1132 deletions

docs/AI_FEATURE_AUDIT.md Normal file

@@ -0,0 +1,545 @@
# AI Feature Audit Report
**Audit Date:** 2026-03-05
**Tenant Under Test:** Pine Creek HOA (`tenant_pine_creek_hoa_q33i`)
**AI Model:** Qwen 3.5-397B-A17B via NVIDIA NIM (Temperature: 0.3)
**Auditor:** Claude Opus 4.6 (automated)
**Data Snapshot Date:** 2026-03-04
---
## Executive Summary
Three AI-powered features were audited against ground-truth database records: **Operating Fund Health**, **Reserve Fund Health**, and **Investment Recommendations**. Overall, the AI demonstrates strong financial reasoning and produces actionable, fiduciary-appropriate recommendations. However, score consistency across runs is a concern (a 16-point spread on operating even after excluding an outlier first run, and a 23-point spread on reserve), and several specific data interpretation issues were identified.
| Feature | Latest Score/Grade | Concurrence | Verdict |
|---|---|---|---|
| Operating Fund Health | 88 / Good | **72%** | Score ~10-15 pts high; cash runway below its own "Good" threshold |
| Reserve Fund Health | 45 / Needs Attention | **85%** | Well-calibrated; minor data misquote on annual contributions |
| Investment Recommendations | 6 recommendations | **88%** | Excellent specificity; all market rates verified accurate |
---
## Data Foundation (Ground Truth)
### Financial Position
| Metric | Value | Source |
|---|---|---|
| Operating Cash (Checking) | $27,418.81 | GL balance |
| Reserve Cash (Savings) | $10,688.45 | GL balance |
| Reserve CD #1a (FCB) | $10,000 @ 3.67%, matures 6/19/26 | `investment_accounts` |
| Reserve CD #2a (FCB) | $8,000 @ 3.60%, matures 4/14/26 | `investment_accounts` |
| Reserve CD #3a (FCB) | $10,000 @ 3.67%, matures 8/18/26 | `investment_accounts` |
| Total Reserve Fund | $38,688.45 | Cash + Investments |
| Total Assets | $66,107.26 | Operating + Reserve |
### Budget (FY2026)
| Category | Annual Total |
|---|---|
| Operating Income | $184,207.40 |
| Operating Expense | $139,979.95 |
| **Net Operating Surplus** | **$44,227.45** |
| Monthly Expense Run Rate | $11,665.00 |
| Reserve Interest Income | $1,449.96 |
| Reserve Disbursements | $22,000.00 (Mar $13K, Apr $9K) |
### Assessment Structure
- **67 units** at $2,328.14/year regular + $300.00/year special (annual frequency)
- Total annual regular assessments: ~$155,985
- Total annual special assessments: ~$20,100
- Budget timing: assessments front-loaded in Mar-Jun
### Actuals (YTD through March 4, 2026)
| Metric | Value |
|---|---|
| YTD Income | $88.16 (ARC fees $100 - $50 adj + $38.16 interest) |
| YTD Expenses | $1,850.42 (January only) |
| Delinquent Invoices | 0 ($0.00) |
| Journal Entries Posted | 4 (Jan actuals + Feb adjusting + Feb opening balances) |
### Capital Projects (from `projects` table, 26 total)
| Project | Cost | Target | Funded % |
|---|---|---|---|
| Pond Spillway | $7,000 | Mar 2026 | 0% |
| Tuscany Drain Box | $5,500 | May 2026 | 0% |
| Front Entrance Power Washing | $1,500 | Mar 2027 | 0% |
| Irrigation Pump Replacement | $1,500 | Jun 2027 | 0% |
| **Road Sealing - All Roads** | **$80,000** | **Jun 2029** | **0%** |
| Asphalt Repair - Creek Stone Dr | $43,000 | TBD | 0% |
| Pavilion & Vineyard Structures | $7,000 | Jun 2035 | 0% |
| 16 placeholder items | $1.00 each | TBD | 0% |
| **Total Planned** | **$152,016** | | **0%** |
### Reserve Components
- **0 components tracked** (empty `reserve_components` table)
### Market Rates (fetched 2026-03-04)
| Type | Top Rate | Bank | Term |
|---|---|---|---|
| CD | 4.10% | E*TRADE / Synchrony | 12-14 mo |
| High-Yield Savings | 4.09% | Openbank | Liquid |
| Money Market | 4.03% | Vio Bank | Liquid |
---
## 1. Operating Fund Health Score
**Latest Score:** 88 (Good) — Generated 2026-03-04T19:24:36Z
**Score History:** 48 → 78 → 72 → 78 → 72 → **88** (6 runs, March 2-4)
**Overall Concurrence: 72%**
### Factor-by-Factor Analysis
#### Factor 1: "Projected Cash Flow" — Impact: Positive
> "12-month forecast shows consistent positive liquidity, with cash balances never dipping below the starting $27,419 and peaking at $142,788 in June."
| Check | Result |
|---|---|
| Budget surplus ($184K income vs $140K expense) | **Verified** ✅ |
| Assessments front-loaded Mar-Jun | **Verified** ✅ (budget shows $48K Mar, $64K Apr, $32K May, $16K Jun) |
| Peak of ~$142K in June | **Plausible** ✅ ($27K + cumulative income through June) |
| Cash never below starting $27K | **Plausible** ✅ (expenses < income by month) |
**Concurrence: 95%** Forecast logic is sound. The only risk is the assumption that assessments are collected on the exact budget schedule.
---
#### Factor 2: "Delinquency Rate" — Impact: Positive
> "$0.00 in overdue invoices and a 0.0% delinquency rate."
**Concurrence: 100%** Database confirms zero delinquent invoices.
---
#### Factor 3: "Budget Performance (Timing)" — Impact: Neutral
> "YTD income is 99.8% below budget ($55k variance) primarily due to the timing of the large Special Assessment ($20,700) and regular assessments appearing in future projected months."
| Check | Result |
|---|---|
| YTD income $88.16 | **Verified** |
| Budget includes March ($55K) in YTD calc | **Accurate**: AI uses month 3 of 12 and includes the full March budget |
| Timing explanation | **Reasonable**: we're only 4 days into March |
| Rating as "neutral" vs "negative" | **Appropriate**: correctly avoids penalizing for calendar timing |
**Concurrence: 80%** The variance is accurately computed but presenting a $55K "variance" when we're 4 days into March could alarm a board member. The YTD window through month 3 includes all of March's budget despite only 4 days having elapsed. Consider computing YTD budget pro-rata or through the prior complete month.
**🔧 Tuning Suggestion:** Add a note to the prompt about pro-rating the current month's budget, or instruct the AI to note "X days into the current month" when the variance is driven by incomplete-month timing.
---
#### Factor 4: "Cash Reserves" — Impact: Positive
> "Current operating cash of $27,419 provides 2.4 months of runway based on the annual expense run rate."
| Check | Result |
|---|---|
| $27,419 / ($139,980 / 12) = 2.35 months | **Math verified** |
| Rated as "positive" | **Questionable** |
**Concurrence: 60%** The math is correct, but rating 2.4 months as "positive" contradicts the scoring guidelines which state 2-3 months = "Fair" (60-74) and 3-6 months = "Good" (75-89). This factor should be "neutral" at best, and the overall score should reflect that the HOA is *below* the "Good" threshold for cash reserves.
**🔧 Tuning Suggestion:** Add explicit guidance in the prompt: "If cash runway is below 3 months, this factor MUST be neutral or negative, regardless of projected future inflows."
---
#### Factor 5: "Expense Management" — Impact: Positive
> "YTD expenses are $36,313 under budget (4.8% of annual budget spent vs 25% of year elapsed)."
| Check | Result |
|---|---|
| YTD expenses $1,850.42 | **Verified** |
| Budget YTD (3 months): ~$38,164 | **Correct** |
| $1,850 / $38,164 = 4.85% | **Math verified** |
| "25% of year elapsed" | **Correct** (month 3 of 12) |
| Phrasing "of annual budget" | **Misleading**: it's actually 4.8% of YTD budget, not annual |
**Concurrence: 70%** The percentage is correctly calculated against YTD budget, but the phrasing "of annual budget" is incorrect. Also, the low spend is not necessarily positive: only January actuals exist, and February hasn't been posted yet, which the AI partially acknowledges with "or delayed billing cycles."
---
### Recommendation Assessment
| # | Recommendation | Priority | Concurrence |
|---|---|---|---|
| 1 | "Verify the posting schedule for the $20,700 Special Assessment" | Low | **90%**: Valid; assessments are annual, collection timing matters |
| 2 | "Investigate the low YTD expense recognition ($1,850 vs $38,164)" | Medium | **95%**: Excellent catch; Feb expenses not posted yet |
| 3 | "Consider moving excess cash over $100K in Q2 to interest-bearing account" | Low | **85%**: Sound advice; aligns with HY Savings at 4.09% |
**Recommendation Concurrence: 90%** All three recommendations are actionable and data-backed.
---
### Score Assessment
**Is 88 (Good) the right score?**
| Scoring Criterion | Guidelines Say | Actual | Alignment |
|---|---|---|---|
| Cash reserves | 3-6 months for "Good" | 2.4 months | Below threshold |
| Income vs expenses | "Roughly matching" for Good | $184K vs $140K (surplus) | Exceeds |
| Delinquency | "Manageable" for Good | 0% | Excellent |
| Budget performance | No major overruns for Good | Under budget (timing) | Positive |
| Projected cash flow | Not explicitly in guidelines | Strong positive trajectory | Positive |
The cash runway of 2.4 months is below the stated "Good" (75-89) threshold of 3-6 months and technically falls in the "Fair" (60-74) range of 2-3 months. Earlier AI runs scored this 72-78, which better aligns with the guidelines. The 88 appears to overweight the projected future cash flow (which is speculative) vs the current actual position.
**Suggested correct score: 74-80** (high end of Fair to low end of Good)
---
### Score Consistency Concern
| Run Date | Score | Label |
|---|---|---|
| Mar 2 15:07 | 48 | Needs Attention |
| Mar 2 15:12 | 78 | Good |
| Mar 2 15:36 | 72 | Fair |
| Mar 2 17:09 | 78 | Good |
| Mar 3 02:03 | 72 | Fair |
| Mar 4 19:24 | 88 | Good |
A **40-point spread** (48-88) across 6 runs with essentially the same data is concerning. Even excluding the outlier first run (which noted a data config issue with "1 units"), the remaining 5 runs span 72-88 (16 points). At temperature 0.3, this suggests the model is not deterministic enough for financial scoring.
**🔧 Tuning Suggestion:** Consider lowering temperature to 0.1 for health score calculations to improve consistency. Alternatively, implement a moving average of the last 3 scores to smooth volatility.
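The spread figures and the moving-average idea above can be sketched as follows. This is a minimal illustration, not the health score service's code: the helper names `score_spread` and `smoothed_score` and the default window of 3 are assumptions, and the input list is the six operating scores from the table above in timestamp order.

```python
from statistics import mean

def score_spread(scores):
    """Return the difference between the highest and lowest score."""
    return max(scores) - min(scores)

def smoothed_score(scores, window=3):
    """Average the most recent `window` scores (fewer if history is short)."""
    if not scores:
        raise ValueError("no scores to smooth")
    recent = scores[-window:]
    return round(mean(recent))

# Operating scores in chronological order, taken from the table above.
operating_runs = [48, 78, 72, 78, 72, 88]

print(score_spread(operating_runs))      # 40 (all six runs)
print(score_spread(operating_runs[1:]))  # 16 (outlier first run excluded)
print(smoothed_score(operating_runs))    # 79 (mean of 78, 72, 88)
```

A displayed value of 79 would sit inside the 74-80 range suggested in the Score Assessment, although smoothing only hides run-to-run variance and does not correct a systematic bias in any single run.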
---
## 2. Reserve Fund Health Score
**Latest Score:** 45 (Needs Attention), generated 2026-03-04T19:24:50Z
**Score History:** 25 → 25 → 48 → 42 → 45 → 35 → **45** (7 runs, March 2-4)
**Overall Concurrence: 85%**
### Factor-by-Factor Analysis
#### Factor 1: "Funded Ratio" — Impact: Negative
> "Calculated at 0% because no reserve components have been inventoried or assigned replacement costs, making it impossible to measure true funding health against the $152,016 in planned projects."
| Check | Result |
|---|---|
| 0 reserve components in DB | **Verified** |
| $152,016 in planned projects | **Verified** (sum of all `projects` rows) |
| 0% funded ratio | **Technically accurate** (no denominator from components) |
| Distinction between components and projects | **Well articulated** |
**Concurrence: 95%** The AI correctly identifies that the 0% is an artifact of missing reserve study data, not a literal lack of funds. It appropriately flags this as a governance failure.
---
#### Factor 2: "Projected Cash Flow" — Impact: Positive
> "Strong immediate liquidity; cash balance is projected to rise from $10,688 to over $49,000 by May 2026 due to special assessment income covering the $12,500 in urgent 2026 project costs."
| Check | Result |
|---|---|
| Starting reserve cash $10,688 | **Verified** |
| 2026 project costs: $7K (Mar) + $5.5K (May) = $12,500 | **Verified** |
| Special assessment: $300 × 67 = $20,100/year | **Verified** |
| CD maturities: $8K (Apr), $10K (Jun), $10K (Aug) | **Verified** |
| Projected rise to $49K by May | **Plausible** (income + maturities - project costs) |
**Concurrence: 85%** Math is directionally correct. However, the assessment is annual frequency so the full $20,100 may arrive in a single payment, not spread monthly. The timing assumption is critical.
---
#### Factor 3: "Component Tracking" — Impact: Negative
> "Critical failure in governance: 'No reserve components tracked' means the association is flying blind on the condition and remaining useful life of major assets like roads and irrigation."
**Concurrence: 100%** Database confirms 0 rows in `reserve_components`. This is objectively a critical gap.
---
#### Factor 4: "Annual Contributions" — Impact: Negative
> "Recurring annual reserve income is only $300 (plus minimal interest), which is grossly insufficient to fund the $80,000 road sealing project due in 2029."
| Check | Result |
|---|---|
| Reserve budget income: $1,449.96/yr (interest only) | **Verified** |
| Special assessment: $300/unit × 67 = $20,100/yr | **Verified** |
| "$300" cited as annual reserve income | **Incorrect** |
| Road Sealing $80K in June 2029 | **Verified** |
**Concurrence: 65%** The concern about insufficient contributions is valid, but the "$300" figure appears to confuse the per-unit special assessment amount ($300/unit) with the total annual reserve income. Actual annual reserve income = $1,450 (interest) + $20,100 (special assessments) = **$21,550/yr**. Even at $21,550/yr, the 3 years until Road Sealing would accumulate ~$64,650, still short of $80K. So the directional concern is correct, but the magnitude is significantly misstated.
**🔧 Tuning Suggestion:** The prompt should explicitly label the special assessment income total (not per-unit) in the data context. Currently the data says "$300.00/unit × 67 units (annual)"; the AI should compute $20,100 but sometimes fixates on the $300 per-unit figure. Consider pre-computing and passing the total.
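The contribution arithmetic in this factor can be checked directly. The sketch below uses only figures from the Data Foundation section; the constant names are illustrative, and the 3-year horizon is the audit's own round figure rather than an exact month count to June 2029.

```python
from decimal import Decimal

UNITS = 67
SPECIAL_PER_UNIT = Decimal("300.00")    # per-unit special assessment, annual
RESERVE_INTEREST = Decimal("1449.96")   # budgeted reserve interest income
ROAD_SEALING_COST = Decimal("80000")
YEARS_TO_PROJECT = 3                    # round figure used in the audit text

# Total special assessment income, not the per-unit amount.
special_total = SPECIAL_PER_UNIT * UNITS
annual_reserve_income = special_total + RESERVE_INTEREST
accumulated = annual_reserve_income * YEARS_TO_PROJECT
shortfall = ROAD_SEALING_COST - accumulated

print(special_total)          # 20100.00
print(annual_reserve_income)  # 21549.96
print(accumulated)            # 64649.88
print(shortfall)              # 15350.12
```

The exact figures round to the $21,550/yr and ~$64,650 quoted above, which leaves roughly $15,350 of the Road Sealing cost uncovered under these assumptions.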
---
### Recommendation Assessment
| # | Recommendation | Priority | Concurrence |
|---|---|---|---|
| 1 | "Commission a professional Reserve Study to inventory assets and establish funded ratio" | High | **100%**: Critical and universally correct |
| 2 | "Develop a long-term funding plan for the $80,000 Road Sealing project (2029)" | High | **90%**: Verified project exists; $80K with 0% funded |
| 3 | "Formalize collection of special assessments into the reserve fund vs operating" | Medium | **95%**: Budget shows special assessments in operating income section |
**Recommendation Concurrence: 95%** All recommendations are actionable, appropriately prioritized, and backed by database evidence.
---
### Score Assessment
**Is 45 (Needs Attention) the right score?**
| Scoring Criterion | Guidelines Say | Actual | Alignment |
|---|---|---|---|
| Percent funded | 20-30% for "Needs Attention" | 0% (no components) | Worse than threshold |
| Contributions | "Inadequate" for Needs Attention | $21,550/yr for $152K in projects | Borderline |
| Component tracking | "Multiple urgent unfunded" | 0 tracked, 2 due in 2026 | Critical gap |
| Investments | Not scored negatively | 3 CDs earning 3.6-3.67% | Positive |
| Capital readiness | | $12.5K due soon, only $10.7K cash | Tight |
A score of 45 is reasonable. The 0% funded ratio technically suggests "At Risk" (20-39), but the presence of real assets ($38.7K), active investments, and manageable near-term liquidity justifies bumping it into the "Needs Attention" band. The AI's balancing of the artificial 0% metric against actual fund health shows good judgment.
**Suggested correct score: 40-50** (the AI's 45 is well-calibrated)
---
### Score Consistency Concern
| Run Date | Score | Label |
|---|---|---|
| Mar 2 15:06 | 25 | At Risk |
| Mar 2 15:13 | 25 | At Risk |
| Mar 2 15:37 | 48 | Needs Attention |
| Mar 2 17:10 | 42 | Needs Attention |
| Mar 3 02:04 | 45 | Needs Attention |
| Mar 4 18:49 | 35 | At Risk |
| Mar 4 19:24 | 45 | Needs Attention |
A **23-point spread** (25-48) across 7 runs. The scores oscillate between "At Risk" and "Needs Attention"; the model cannot consistently decide which band this falls into. The most recent 3 runs (35, 45, 45) are more stable.
**🔧 Tuning Suggestion:** Add boundary guidance to the prompt: "When the score falls within ±5 points of a threshold (40, 60, 75, 90), explicitly justify which side of the boundary the HOA falls on."
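The suggestion above targets the prompt, but the same ±5 rule could also be applied as a post-check on returned scores. The following is a hypothetical sketch: `nearest_boundary` is an invented helper, and only the threshold values (40, 60, 75, 90) and the 5-point margin come from the suggestion itself.

```python
THRESHOLDS = (40, 60, 75, 90)  # band boundaries quoted in the suggestion above

def nearest_boundary(score, margin=5):
    """Return the threshold within `margin` points of `score`, or None."""
    for threshold in THRESHOLDS:
        if abs(score - threshold) <= margin:
            return threshold
    return None

# Reserve scores from the consistency table above, in timestamp order.
reserve_runs = [25, 25, 48, 42, 45, 35, 45]
flagged = [s for s in reserve_runs if nearest_boundary(s) is not None]
print(flagged)  # [42, 45, 35, 45]
```

Under this rule four of the seven reserve runs sit close enough to the 40-point boundary that a justification would be required.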
---
## 3. AI Investment Recommendations
**Latest Run:** 2026-03-04T19:28:22Z (3 runs saved)
**Overall Concurrence: 88%**
### Overall Assessment
> "The HOA has a healthy long-term cash flow outlook with significant surpluses projected by mid-2026, but faces an immediate liquidity pinch in the Reserve Fund for March/April capital projects. The current investment strategy relies on older, lower-yielding CDs (3.60-3.67%) that are maturing soon."
**Concurrence: 92%** Every claim verified:
- CDs are at 3.60-3.67% vs market 4.10% (verified)
- March project ($7K) vs reserve cash ($10.7K) is tight (verified)
- Long-term surplus projected from assessment income (verified from budget)
---
### Recommendation-by-Recommendation Analysis
#### Rec 1: "Critical Reserve Shortfall for March Project" — HIGH / Liquidity Warning
| Claim | Database Value | Match |
|---|---|---|
| Reserve cash = $10,688 | $10,688.45 | Exact |
| $7,000 Pond Spillway project due March | Projects table: $7,000, Mar 2026 | Exact |
| Shortfall risk | $10,688 - $7,000 = $3,688 remaining; tight but feasible | |
| Suggested action: expedite special assessment or transfer from operating | Sound advice | |
**Concurrence: 90%** The liquidity concern is real. After paying the $7K project, only $3.7K would remain in reserve cash before the $5.5K May project. The AI correctly flags the timing risk even though the fund is technically solvent.
---
#### Rec 2: "Reinvest Maturing CD #2a at Higher Rate" — HIGH / Maturity Action
| Claim | Database Value | Match |
|---|---|---|
| CD #2a = $8,000 | $8,000.00 | Exact |
| Current rate = 3.60% | 3.60% | Exact |
| Maturity = April 14, 2026 | 2026-04-14 | Exact |
| Market rate = 4.10% (E*TRADE) | CD rates: E*TRADE 4.10%, 1 year, $0 min | Exact |
| Additional yield: ~$40/year per $8K | $8K × 0.50% = $40 | Math correct |
**Concurrence: 95%** Textbook-correct recommendation. Every data point verified. The 50 bps improvement is risk-free income.
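The yield figure in the table can be reproduced with a one-line calculation. This sketch assumes simple annual interest with no compounding and no early-withdrawal penalty; `extra_annual_yield` is an illustrative name, not part of the audited system.

```python
from decimal import Decimal

def extra_annual_yield(principal, old_rate_pct, new_rate_pct):
    """Additional simple interest per year from moving to the new rate."""
    return principal * (new_rate_pct - old_rate_pct) / 100

# CD #2a: $8,000 moving from 3.60% to the 4.10% market rate.
print(extra_annual_yield(Decimal("8000"), Decimal("3.60"), Decimal("4.10")))  # 40.00
```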
---
#### Rec 3: "Establish 12-Month CD Ladder for Reserves" — MEDIUM / CD Ladder
| Claim | Database Value | Match |
|---|---|---|
| ~$38K total reserve portfolio | $38,688.45 | Exact |
| Suggest 4-rung ladder (3/6/9/12 mo) | Standard strategy | |
| Rates up to 4.10% | Market data confirmed | |
| $9K matures every quarter | $38K / 4 = $9.5K per rung | Approximate |
**Concurrence: 75%** Strategy is sound in principle, but the recommendation overlooks two constraints:
1. **Immediate project costs ($12.5K in 2026)** must be reserved first, leaving ~$26K for laddering
2. **Investing the entire $38K** is aggressive; some cash buffer should remain liquid
**🔧 Tuning Suggestion:** Add a constraint to the prompt: "When recommending CD ladders, always subtract upcoming project costs (next 12 months) and a minimum emergency reserve (1 month of budgeted reserve expenses) before calculating the investable amount."
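The constraint above amounts to a small calculation before any ladder is sized. The sketch below is one possible reading, not the recommendation engine's logic: `ladder_plan` is a hypothetical helper, and the emergency reserve of $1,833.33 assumes that "1 month of budgeted reserve expenses" means the $22,000 in budgeted reserve disbursements divided by 12.

```python
from decimal import Decimal, ROUND_DOWN

def ladder_plan(reserve_total, upcoming_projects, emergency_reserve, rungs=4):
    """Return (investable amount, amount per rung) after required holdbacks."""
    investable = max(
        reserve_total - upcoming_projects - emergency_reserve, Decimal("0")
    )
    # Round each rung down to the cent so the rungs never exceed the total.
    per_rung = (investable / rungs).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
    return investable, per_rung

investable, per_rung = ladder_plan(
    reserve_total=Decimal("38688.45"),     # total reserve fund
    upcoming_projects=Decimal("12500"),    # Pond Spillway + Tuscany Drain Box
    emergency_reserve=Decimal("1833.33"),  # assumed: $22,000 / 12
)
print(investable)  # 24355.12
print(per_rung)    # 6088.78
```

Under these assumptions each rung of a 4-rung ladder is about $6.1K, well below the $9K per quarter in the AI's recommendation. The sketch also ignores that most of the reserve is already locked in CDs until their maturity dates.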
---
#### Rec 4: "Deploy Excess Operating Cash to High-Yield Savings" — MEDIUM / New Investment
| Claim | Database Value | Match |
|---|---|---|
| Operating cash = $27,418 | $27,418.81 | Exact |
| 3-month buffer = ~$35,000 | $11,665 × 3 = $34,995 | Math correct |
| Current cash below buffer | $27.4K < $35K | Correctly identified |
| Openbank 4.09% APY | Market data: Openbank 4.09%, $0.01 min | Exact |
| Trigger: "As soon as balance exceeds $35K" | Sound deferred recommendation | |
**Concurrence: 90%** The AI correctly identifies the current shortfall and provides a forward-looking trigger. Well-structured advice that respects the liquidity constraint.
---
#### Rec 5: "Optimize Reserve Cash Yield Post-Project" — LOW / Reallocation
| Claim | Database Value | Match |
|---|---|---|
| Vio Bank Money Market at 4.03% | Market data: Vio Bank 4.03%, $0 min | Exact |
| Post-project reserve cash deployment | Appropriate timing | |
| T+1 liquidity for emergencies | Correct MM account characteristic | |
**Concurrence: 85%** Reasonable low-priority optimization. Correctly uses market data.
---
#### Rec 6: "Formalize Special Assessment Collection for Reserves" — LOW / General
| Claim | Database Value | Match |
|---|---|---|
| $300/unit special assessment | Assessment groups: $300.00 special | Exact |
| Risk of commingling with operating | Budget shows special assessments in operating income | Identified |
**Concurrence: 90%** Important governance recommendation. The budget structure does show special assessments as operating income, which could lead to improper fund commingling.
---
### Risk Notes Assessment
| Risk Note | Verified | Concurrence |
|---|---|---|
| "Reserve cash ($10.6K) barely sufficient for $7K + $5.5K projects" | $10,688 vs $12,500 in projects | **95%** |
| "Concentration risk: CDs maturing in 4-month window (Apr-Aug)" | All 3 CDs mature Apr-Aug 2026 | **100%** |
| "Operating cash ballooning to $140K+ without investment plan" | Budget shows large Q2 surplus | **85%** |
| "Road Sealing $80K in 2029 needs dedicated savings plan" | Project exists, 0% funded | **95%** |
**Risk Notes Concurrence: 94%** All risk items are data-backed and appropriately flagged.
---
### Cross-Run Consistency (Investment Recommendations)
Three runs were compared. Key observations:
- **Core recommendations are highly consistent** across runs: CD reinvestment, HY savings for operating, CD ladder for reserves
- **Dollar amounts match exactly** across all runs (same data inputs)
- **Bank name recommendations vary slightly** (E*TRADE vs "Top CD Rate"); this is cosmetic, not substantive
- **Priority levels are stable** (HIGH for liquidity warnings, MEDIUM for optimization)
**Consistency Grade: A-** Investment recommendations show much better consistency than health scores, likely because the structured data (specific CDs, specific rates) constrains the output more than the subjective health scoring.
---
## Cross-Cutting Issues
### Issue 1: Score Volatility (MEDIUM Priority)
Health scores vary significantly across runs despite identical input data:
- Operating: 40-point spread (48-88)
- Reserve: 23-point spread (25-48)
**Root Cause:** Temperature 0.3 allows too much variance for numerical scoring. The model interprets guidelines subjectively.
**Recommended Fix:**
1. Reduce temperature to **0.1** for health score calculations
2. Implement a **3-run moving average** to smooth individual run variance
3. Add explicit **boundary justification** requirements to prompts
### Issue 2: YTD Budget Calculation Includes Incomplete Month (LOW Priority)
The operating health score computes YTD budget through the current month (March), but actual data may only cover a few days. This creates alarming income variances (e.g., "$55K variance") that are pure timing artifacts.
**Recommended Fix:**
- Compute YTD budget through the **prior completed month** (February)
- OR pro-rate the current month's budget by days elapsed
- Add a note to the prompt: "If the variance is driven by the current incomplete month, flag it as 'timing' and weight it minimally."
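The first two options above can be sketched as a single helper. This is a minimal illustration under stated assumptions: `ytd_budget` and its `mode` parameter are hypothetical names, the fiscal year is assumed to match the calendar year, and the sample budget uses made-up figures rather than the Pine Creek data.

```python
import calendar
from datetime import date

def ytd_budget(monthly_budget, as_of, mode="prior_month"):
    """Return the year-to-date budget as of `as_of`.

    monthly_budget: 12 amounts, index 0 = January (assumes the fiscal
    year matches the calendar year).
    mode="prior_month": count only fully completed months.
    mode="pro_rata": also count the elapsed share of the current month.
    """
    if len(monthly_budget) != 12:
        raise ValueError("expected 12 monthly budget amounts")
    completed = sum(monthly_budget[: as_of.month - 1])
    if mode == "prior_month":
        return completed
    if mode == "pro_rata":
        days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
        current = monthly_budget[as_of.month - 1]
        return completed + current * as_of.day / days_in_month
    raise ValueError(f"unknown mode: {mode!r}")

# Illustrative figures only, not the Pine Creek budget.
example_budget = [1000, 1000, 31000] + [0] * 9
as_of = date(2026, 3, 4)
print(ytd_budget(example_budget, as_of))              # 2000
print(ytd_budget(example_budget, as_of, "pro_rata"))  # 6000.0
```

With a front-loaded March budget, either mode keeps most of the month's budgeted income out of the comparison on March 4, which is the timing artifact this issue describes.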
### Issue 3: Per-Unit vs Total Confusion on Special Assessments (LOW Priority)
The AI sometimes quotes "$300" as the annual reserve income instead of $300 × 67 = $20,100. The data passed says "$300.00/unit × 67 units (annual)" but the model occasionally fixates on the per-unit figure.
**Recommended Fix:**
- Pre-compute and include the total in the data: "Total Annual Special Assessment Income: $20,100.00"
- Keep the per-unit breakdown for context but lead with the total
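One way to lead with the total while keeping the per-unit detail is sketched below. The function name and the exact layout of the second line are assumptions; only the first-line label is taken from the recommended fix above.

```python
def special_assessment_context(per_unit, units):
    """Build the prompt data lines: total first, per-unit breakdown second."""
    total = per_unit * units
    return (
        f"Total Annual Special Assessment Income: ${total:,.2f}\n"
        f"  (${per_unit:,.2f}/unit × {units} units, annual)"
    )

print(special_assessment_context(300.0, 67))
# Total Annual Special Assessment Income: $20,100.00
#   ($300.00/unit × 67 units, annual)
```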
### Issue 4: Cash Runway Classification Inconsistency (MEDIUM Priority)
The operating health score rates 2.4 months of cash runway as "positive" despite the scoring guidelines defining 2-3 months as "Fair" territory. This inflates the overall score.
**Recommended Fix:**
- Add explicit prompt guidance: "Cash runway categorization: <2 months = negative, 2-3 months = neutral, 3-6 months = positive, 6+ months = strongly positive. Do NOT rate below-threshold runway as positive based on projected future inflows."
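The mapping in this guidance is simple enough to compute deterministically and pass to the model as a fixed input. The sketch below is an assumption about how that could look: the function names are hypothetical, and because the guidance leaves the exact boundaries ambiguous, each band is treated as inclusive of its lower bound.

```python
def runway_months(operating_cash, annual_expenses):
    """Months of runway at the budgeted monthly expense rate."""
    return operating_cash / (annual_expenses / 12)

def runway_impact(months):
    """Map runway to an impact label (lower bound inclusive, by assumption)."""
    if months < 2:
        return "negative"
    if months < 3:
        return "neutral"
    if months < 6:
        return "positive"
    return "strongly positive"

months = runway_months(27418.81, 139979.95)
print(round(months, 2))       # 2.35
print(runway_impact(months))  # neutral
```

With the audited figures the factor resolves to "neutral", matching the Factor 4 conclusion that 2.4 months should not be rated "positive".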
### Issue 5: Dual Project Tables (INFORMATIONAL)
The schema contains both `capital_projects` (empty) and `projects` (26 rows). The health score service correctly queries `projects`, but auditors initially checked `capital_projects` and found no data. This dual-table pattern could confuse future developers.
**Recommended Fix:**
- Consolidate into a single table, OR
- Add a comment/documentation clarifying the canonical source
---
## Concurrence Summary by Recommendation
### Operating Fund Health — Recommendations
| Recommendation | Concurrence |
|---|---|
| Verify posting schedule for $20,700 Special Assessment | 90% |
| Investigate low YTD expense recognition | 95% |
| Move excess cash to interest-bearing account | 85% |
| **Average** | **90%** |
### Reserve Fund Health — Recommendations
| Recommendation | Concurrence |
|---|---|
| Commission professional Reserve Study | 100% |
| Develop funding plan for $80K Road Sealing | 90% |
| Formalize special assessment collection for reserves | 95% |
| **Average** | **95%** |
### Investment Planning — Recommendations
| Recommendation | Concurrence |
|---|---|
| Critical Reserve Shortfall for March Project | 90% |
| Reinvest Maturing CD #2a at Higher Rate | 95% |
| Establish 12-Month CD Ladder | 75% |
| Deploy Operating Cash to HY Savings | 90% |
| Optimize Reserve Cash Post-Project | 85% |
| Formalize Special Assessment Collection | 90% |
| **Average** | **88%** |
---
## Final Grades
| Feature | Score Accuracy | Recommendation Quality | Data Fidelity | Consistency | **Overall** |
|---|---|---|---|---|---|
| Operating Fund Health | C+ (score ~15 pts high) | A (90%) | B+ (minor math phrasing) | C (16-pt spread) | **72% — B-** |
| Reserve Fund Health | A- (well-calibrated) | A (95%) | B (per-unit confusion) | B- (23-pt spread) | **85% — B+** |
| Investment Recommendations | N/A (no single score) | A (88%) | A (exact data matches) | A- (stable across runs) | **88% — A-** |
---
## Priority Action Items for Tuning
1. **[HIGH]** Reduce AI temperature from 0.3 → 0.1 for health score calculations to reduce score volatility
2. **[MEDIUM]** Add explicit cash-runway-to-impact mapping in operating prompt to prevent misclassification
3. **[MEDIUM]** Pre-compute total special assessment income in data context (not just per-unit)
4. **[LOW]** Adjust YTD budget calculation to use prior completed month or pro-rate current month
5. **[LOW]** Add boundary justification requirement to scoring prompts
6. **[LOW]** Consider implementing 3-run moving average for displayed health scores
---
*Generated by Claude Opus 4.6 — Automated AI Feature Audit*