HOA_Financial_Platform/docs/AI_FEATURE_AUDIT.md
olsch01 07d15001ae fix: improve AI health score accuracy and consistency
Address 4 issues identified in AI feature audit:

1. Reduce temperature from 0.3 to 0.1 for health score calculations
   to reduce 16-40 point score volatility across runs

2. Add explicit cash runway classification rules to operating prompt
   preventing the model from rating sub-3-month runway as "positive"

3. Pre-compute total special assessment income in both operating and
   reserve prompts, eliminating per-unit vs total confusion ($300
   vs $20,100)

4. Make YTD budget comparison actuals-aware: only compare months with
   posted journal entries, show current month budget separately, and
   add prompt guidance about month-end posting cadence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:44:12 -05:00

# AI Feature Audit Report
**Audit Date:** 2026-03-05
**Tenant Under Test:** Pine Creek HOA (`tenant_pine_creek_hoa_q33i`)
**AI Model:** Qwen 3.5-397B-A17B via NVIDIA NIM (Temperature: 0.3)
**Auditor:** Claude Opus 4.6 (automated)
**Data Snapshot Date:** 2026-03-04
---
## Executive Summary
Three AI-powered features were audited against ground-truth database records: **Operating Fund Health**, **Reserve Fund Health**, and **Investment Recommendations**. Overall, the AI demonstrates strong financial reasoning and produces actionable, fiduciary-appropriate recommendations. However, score consistency across runs is a concern (a 16-point spread on operating even after excluding an outlier first run, and a 23-point spread on reserve), and several specific data interpretation issues were identified.
| Feature | Latest Score/Grade | Concurrence | Verdict |
|---|---|---|---|
| Operating Fund Health | 88 / Good | **72%** | Score ~10-15 pts high; cash runway below its own "Good" threshold |
| Reserve Fund Health | 45 / Needs Attention | **85%** | Well-calibrated; minor data misquote on annual contributions |
| Investment Recommendations | 6 recommendations | **88%** | Excellent specificity; all market rates verified accurate |
---
## Data Foundation (Ground Truth)
### Financial Position
| Metric | Value | Source |
|---|---|---|
| Operating Cash (Checking) | $27,418.81 | GL balance |
| Reserve Cash (Savings) | $10,688.45 | GL balance |
| Reserve CD #1a (FCB) | $10,000 @ 3.67%, matures 6/19/26 | `investment_accounts` |
| Reserve CD #2a (FCB) | $8,000 @ 3.60%, matures 4/14/26 | `investment_accounts` |
| Reserve CD #3a (FCB) | $10,000 @ 3.67%, matures 8/18/26 | `investment_accounts` |
| Total Reserve Fund | $38,688.45 | Cash + Investments |
| Total Assets | $66,107.26 | Operating + Reserve |
### Budget (FY2026)
| Category | Annual Total |
|---|---|
| Operating Income | $184,207.40 |
| Operating Expense | $139,979.95 |
| **Net Operating Surplus** | **$44,227.45** |
| Monthly Expense Run Rate | $11,665.00 |
| Reserve Interest Income | $1,449.96 |
| Reserve Disbursements | $22,000.00 (Mar $13K, Apr $9K) |
### Assessment Structure
- **67 units** at $2,328.14/year regular + $300.00/year special (annual frequency)
- Total annual regular assessments: ~$155,985
- Total annual special assessments: ~$20,100
- Budget timing: assessments front-loaded in Mar-Jun
### Actuals (YTD through March 4, 2026)
| Metric | Value |
|---|---|
| YTD Income | $88.16 (ARC fees $100 - $50 adj + $38.16 interest) |
| YTD Expenses | $1,850.42 (January only) |
| Delinquent Invoices | 0 ($0.00) |
| Journal Entries Posted | 4 (Jan actuals + Feb adjusting + Feb opening balances) |
### Capital Projects (from `projects` table, 26 total)
| Project | Cost | Target | Funded % |
|---|---|---|---|
| Pond Spillway | $7,000 | Mar 2026 | 0% |
| Tuscany Drain Box | $5,500 | May 2026 | 0% |
| Front Entrance Power Washing | $1,500 | Mar 2027 | 0% |
| Irrigation Pump Replacement | $1,500 | Jun 2027 | 0% |
| **Road Sealing - All Roads** | **$80,000** | **Jun 2029** | **0%** |
| Asphalt Repair - Creek Stone Dr | $43,000 | TBD | 0% |
| Pavilion & Vineyard Structures | $7,000 | Jun 2035 | 0% |
| 16 placeholder items | $1.00 each | TBD | 0% |
| **Total Planned** | **$152,016** | | **0%** |
### Reserve Components
- **0 components tracked** (empty `reserve_components` table)
### Market Rates (fetched 2026-03-04)
| Type | Top Rate | Bank | Term |
|---|---|---|---|
| CD | 4.10% | E*TRADE / Synchrony | 12-14 mo |
| High-Yield Savings | 4.09% | Openbank | Liquid |
| Money Market | 4.03% | Vio Bank | Liquid |
---
## 1. Operating Fund Health Score
**Latest Score:** 88 (Good) — Generated 2026-03-04T19:24:36Z
**Score History:** 48 → 78 → 72 → 78 → 72 → **88** (6 runs, March 2-4)
**Overall Concurrence: 72%**
### Factor-by-Factor Analysis
#### Factor 1: "Projected Cash Flow" — Impact: Positive
> "12-month forecast shows consistent positive liquidity, with cash balances never dipping below the starting $27,419 and peaking at $142,788 in June."
| Check | Result |
|---|---|
| Budget surplus ($184K income vs $140K expense) | **Verified** ✅ |
| Assessments front-loaded Mar-Jun | **Verified** ✅ (budget shows $48K Mar, $64K Apr, $32K May, $16K Jun) |
| Peak of ~$142K in June | **Plausible** ✅ ($27K + cumulative income through June) |
| Cash never below starting $27K | **Plausible** ✅ (expenses < income by month) |
**Concurrence: 95%** Forecast logic is sound. The only risk is the assumption that assessments are collected on the exact budget schedule.
---
#### Factor 2: "Delinquency Rate" — Impact: Positive
> "$0.00 in overdue invoices and a 0.0% delinquency rate."
**Concurrence: 100%** Database confirms zero delinquent invoices.
---
#### Factor 3: "Budget Performance (Timing)" — Impact: Neutral
> "YTD income is 99.8% below budget ($55k variance) primarily due to the timing of the large Special Assessment ($20,700) and regular assessments appearing in future projected months."
| Check | Result |
|---|---|
| YTD income $88.16 | **Verified** |
| Budget includes March ($55K) in YTD calc | **Accurate**: AI uses month 3 of 12 and includes the full March budget |
| Timing explanation | **Reasonable**: we're only 4 days into March |
| Rating as "neutral" vs "negative" | **Appropriate**: correctly avoids penalizing for calendar timing |
**Concurrence: 80%** The variance is accurately computed but presenting a $55K "variance" when we're 4 days into March could alarm a board member. The YTD window through month 3 includes all of March's budget despite only 4 days having elapsed. Consider computing YTD budget pro-rata or through the prior complete month.
**🔧 Tuning Suggestion:** Add a note to the prompt about pro-rating the current month's budget, or instruct the AI to note "X days into the current month" when the variance is driven by incomplete-month timing.
---
#### Factor 4: "Cash Reserves" — Impact: Positive
> "Current operating cash of $27,419 provides 2.4 months of runway based on the annual expense run rate."
| Check | Result |
|---|---|
| $27,419 / ($139,980 / 12) = 2.35 months | **Math verified** |
| Rated as "positive" | **Questionable** |
**Concurrence: 60%** The math is correct, but rating 2.4 months as "positive" contradicts the scoring guidelines which state 2-3 months = "Fair" (60-74) and 3-6 months = "Good" (75-89). This factor should be "neutral" at best, and the overall score should reflect that the HOA is *below* the "Good" threshold for cash reserves.
**🔧 Tuning Suggestion:** Add explicit guidance in the prompt: "If cash runway is below 3 months, this factor MUST be neutral or negative, regardless of projected future inflows."
---
#### Factor 5: "Expense Management" — Impact: Positive
> "YTD expenses are $36,313 under budget (4.8% of annual budget spent vs 25% of year elapsed)."
| Check | Result |
|---|---|
| YTD expenses $1,850.42 | **Verified** |
| Budget YTD (3 months): ~$38,164 | **Correct** |
| $1,850 / $38,164 = 4.85% | **Math verified** |
| "25% of year elapsed" | **Correct** (month 3 of 12) |
| Phrasing "of annual budget" | **Misleading**: it is actually 4.8% of YTD budget, not annual |
**Concurrence: 70%** The percentage is correctly calculated against YTD budget, but the phrasing "of annual budget" is incorrect. The low spend is also not necessarily positive: only January actuals exist and February has not been posted yet, which the AI partially acknowledges with "or delayed billing cycles."
---
### Recommendation Assessment
| # | Recommendation | Priority | Concurrence |
|---|---|---|---|
| 1 | "Verify the posting schedule for the $20,700 Special Assessment" | Low | **90%** Valid; assessments are annual, collection timing matters |
| 2 | "Investigate the low YTD expense recognition ($1,850 vs $38,164)" | Medium | **95%** Excellent catch; Feb expenses not posted yet |
| 3 | "Consider moving excess cash over $100K in Q2 to interest-bearing account" | Low | **85%** Sound advice; aligns with HY Savings at 4.09% |
**Recommendation Concurrence: 90%** All three recommendations are actionable and data-backed.
---
### Score Assessment
**Is 88 (Good) the right score?**
| Scoring Criterion | Guidelines Say | Actual | Alignment |
|---|---|---|---|
| Cash reserves | 3-6 months for "Good" | 2.4 months | Below threshold |
| Income vs expenses | "Roughly matching" for Good | $184K vs $140K (surplus) | Exceeds |
| Delinquency | "Manageable" for Good | 0% | Excellent |
| Budget performance | No major overruns for Good | Under budget (timing) | Positive |
| Projected cash flow | Not explicitly in guidelines | Strong positive trajectory | Positive |
The cash runway of 2.4 months is below the stated "Good" (75-89) threshold of 3-6 months and technically falls in the "Fair" (60-74) range of 2-3 months. Earlier AI runs scored this 72-78, which better aligns with the guidelines. The 88 appears to overweight the projected future cash flow (which is speculative) vs the current actual position.
**Suggested correct score: 74-80** (high end of Fair to low end of Good)
---
### Score Consistency Concern
| Run Date | Score | Label |
|---|---|---|
| Mar 2 15:07 | 48 | Needs Attention |
| Mar 2 15:12 | 78 | Good |
| Mar 2 15:36 | 72 | Fair |
| Mar 2 17:09 | 78 | Good |
| Mar 3 02:03 | 72 | Fair |
| Mar 4 19:24 | 88 | Good |
A **40-point spread** (48-88) across 6 runs with essentially the same data is concerning. Even excluding the outlier first run (which noted a data config issue with "1 units"), the remaining 5 runs span 72-88 (16 points). At temperature 0.3, this suggests the model is not deterministic enough for financial scoring.
**🔧 Tuning Suggestion:** Consider lowering temperature to 0.1 for health score calculations to improve consistency. Alternatively, implement a moving average of the last 3 scores to smooth volatility.
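
The smoothing option can be illustrated with the six operating scores from the run table above. This is a minimal sketch, not the platform's implementation: the function names `spread` and `moving_average` are hypothetical, and a trailing 3-run window is assumed. On this data the smoothed series is `[66.0, 76.0, 74.0, 79.3]`, which narrows the range but still carries the outlier first run in its first window.

```python
# Operating fund scores from the six audited runs, in chronological order.
operating_scores = [48, 78, 72, 78, 72, 88]


def spread(scores):
    """Return the max-min range of a list of scores."""
    return max(scores) - min(scores)


def moving_average(scores, window=3):
    """Trailing moving average; emits one value per full window."""
    return [
        round(sum(scores[i - window + 1 : i + 1]) / window, 1)
        for i in range(window - 1, len(scores))
    ]


smoothed = moving_average(operating_scores)
print("raw spread:", spread(operating_scores))
print("spread without first run:", spread(operating_scores[1:]))
print("smoothed scores:", smoothed)
```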
---
## 2. Reserve Fund Health Score
**Latest Score:** 45 (Needs Attention), generated 2026-03-04T19:24:50Z
**Score History:** 25 → 25 → 48 → 42 → 45 → 35 → **45** (7 runs, March 2-4)
**Overall Concurrence: 85%**
### Factor-by-Factor Analysis
#### Factor 1: "Funded Ratio" — Impact: Negative
> "Calculated at 0% because no reserve components have been inventoried or assigned replacement costs, making it impossible to measure true funding health against the $152,016 in planned projects."
| Check | Result |
|---|---|
| 0 reserve components in DB | **Verified** |
| $152,016 in planned projects | **Verified** (sum of all `projects` rows) |
| 0% funded ratio | **Technically accurate** (no denominator from components) |
| Distinction between components and projects | **Well articulated** |
**Concurrence: 95%** The AI correctly identifies that the 0% is an artifact of missing reserve study data, not a literal lack of funds. It appropriately flags this as a governance failure.
---
#### Factor 2: "Projected Cash Flow" — Impact: Positive
> "Strong immediate liquidity; cash balance is projected to rise from $10,688 to over $49,000 by May 2026 due to special assessment income covering the $12,500 in urgent 2026 project costs."
| Check | Result |
|---|---|
| Starting reserve cash $10,688 | **Verified** |
| 2026 project costs: $7K (Mar) + $5.5K (May) = $12,500 | **Verified** |
| Special assessment: $300 × 67 = $20,100/year | **Verified** |
| CD maturities: $8K (Apr), $10K (Jun), $10K (Aug) | **Verified** |
| Projected rise to $49K by May | **Plausible** (income + maturities - project costs) |
**Concurrence: 85%** Math is directionally correct. However, the assessment is annual frequency so the full $20,100 may arrive in a single payment, not spread monthly. The timing assumption is critical.
---
#### Factor 3: "Component Tracking" — Impact: Negative
> "Critical failure in governance: 'No reserve components tracked' means the association is flying blind on the condition and remaining useful life of major assets like roads and irrigation."
**Concurrence: 100%** Database confirms 0 rows in `reserve_components`. This is objectively a critical gap.
---
#### Factor 4: "Annual Contributions" — Impact: Negative
> "Recurring annual reserve income is only $300 (plus minimal interest), which is grossly insufficient to fund the $80,000 road sealing project due in 2029."
| Check | Result |
|---|---|
| Reserve budget income: $1,449.96/yr (interest only) | **Verified** |
| Special assessment: $300/unit × 67 = $20,100/yr | **Verified** |
| "$300" cited as annual reserve income | **Incorrect** |
| Road Sealing $80K in June 2029 | **Verified** |
**Concurrence: 65%** The concern about insufficient contributions is valid, but the "$300" figure appears to confuse the per-unit special assessment amount ($300/unit) with the total annual reserve income. Actual annual reserve income = $1,450 (interest) + $20,100 (special assessments) = **$21,550/yr**. Even at $21,550/yr, the 3 years until Road Sealing would accumulate ~$64,650, still short of $80K. So the directional concern is correct, but the magnitude is significantly misstated.
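
The arithmetic behind the corrected figure can be reproduced as follows. This is a sketch under the audit's own simplifications: three full years of contributions, no interest compounding, and the existing reserve balance not counted toward the project. The constant names are illustrative and do not come from the codebase.

```python
UNITS = 67
SPECIAL_PER_UNIT = 300.00      # annual special assessment per unit
RESERVE_INTEREST = 1_449.96    # budgeted annual reserve interest income
ROAD_SEALING_COST = 80_000.00  # Road Sealing - All Roads, target Jun 2029
YEARS_UNTIL_DUE = 3            # assumption used in the audit text

# Total annual reserve income: special assessments plus interest.
annual_income = UNITS * SPECIAL_PER_UNIT + RESERVE_INTEREST

# Simple accumulation, ignoring compounding and the current balance.
accumulated = annual_income * YEARS_UNTIL_DUE
shortfall = ROAD_SEALING_COST - accumulated

print(f"annual reserve income: ${annual_income:,.2f}")
print(f"accumulated by due date: ${accumulated:,.2f}")
print(f"shortfall vs project cost: ${shortfall:,.2f}")
```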
**🔧 Tuning Suggestion:** The prompt should explicitly label the special assessment income total (not per-unit) in the data context. Currently the data says "$300.00/unit × 67 units (annual)". The AI should compute $20,100 but sometimes fixates on the $300 per-unit figure. Consider pre-computing and passing the total.
---
### Recommendation Assessment
| # | Recommendation | Priority | Concurrence |
|---|---|---|---|
| 1 | "Commission a professional Reserve Study to inventory assets and establish funded ratio" | High | **100%** Critical and universally correct |
| 2 | "Develop a long-term funding plan for the $80,000 Road Sealing project (2029)" | High | **90%** Verified project exists; $80K with 0% funded |
| 3 | "Formalize collection of special assessments into the reserve fund vs operating" | Medium | **95%** Budget shows special assessments in operating income section |
**Recommendation Concurrence: 95%** All recommendations are actionable, appropriately prioritized, and backed by database evidence.
---
### Score Assessment
**Is 45 (Needs Attention) the right score?**
| Scoring Criterion | Guidelines Say | Actual | Alignment |
|---|---|---|---|
| Percent funded | 20-30% for "Needs Attention" | 0% (no components) | Worse than threshold |
| Contributions | "Inadequate" for Needs Attention | $21,550/yr for $152K in projects | Borderline |
| Component tracking | "Multiple urgent unfunded" | 0 tracked, 2 due in 2026 | Critical gap |
| Investments | Not scored negatively | 3 CDs earning 3.6-3.67% | Positive |
| Capital readiness | | $12.5K due soon, only $10.7K cash | Tight |
A score of 45 is reasonable. The 0% funded ratio technically suggests "At Risk" (20-39), but the presence of real assets ($38.7K), active investments, and manageable near-term liquidity justifies bumping it into the "Needs Attention" band. The AI's balancing of the artificial 0% metric against actual fund health shows good judgment.
**Suggested correct score: 40-50.** The AI's 45 is well-calibrated.
---
### Score Consistency Concern
| Run Date | Score | Label |
|---|---|---|
| Mar 2 15:06 | 25 | At Risk |
| Mar 2 15:13 | 25 | At Risk |
| Mar 2 15:37 | 48 | Needs Attention |
| Mar 2 17:10 | 42 | Needs Attention |
| Mar 3 02:04 | 45 | Needs Attention |
| Mar 4 18:49 | 35 | At Risk |
| Mar 4 19:24 | 45 | Needs Attention |
A **23-point spread** (25-48) across 7 runs. The scores oscillate between "At Risk" and "Needs Attention"; the model cannot consistently decide which band this falls into. The most recent 3 runs (45, 35, 45) are more stable, though they still straddle the band boundary at 40.
**🔧 Tuning Suggestion:** Add boundary guidance to the prompt: "When the score falls within ±5 points of a threshold (40, 60, 75, 90), explicitly justify which side of the boundary the HOA falls on."
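
The ±5-point rule can also be checked mechanically, for example to decide when a run should carry an explicit boundary justification. The sketch below assumes the thresholds quoted in the suggestion (40, 60, 75, 90) and uses a hypothetical helper name, `near_boundary`. Applied to the seven reserve runs, it flags 35, 42, and 45 as sitting near the 40 boundary.

```python
THRESHOLDS = (40, 60, 75, 90)  # band boundaries quoted in the suggestion


def near_boundary(score, margin=5):
    """Return the thresholds that lie within `margin` points of `score`."""
    return [t for t in THRESHOLDS if abs(score - t) <= margin]


# Reserve fund scores from the seven audited runs, in chronological order.
reserve_scores = [25, 25, 48, 42, 45, 35, 45]

flagged = {s: near_boundary(s) for s in sorted(set(reserve_scores))}
for score, hits in flagged.items():
    status = f"near {hits}" if hits else "clear of all boundaries"
    print(score, status)
```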
---
## 3. AI Investment Recommendations
**Latest Run:** 2026-03-04T19:28:22Z (3 runs saved)
**Overall Concurrence: 88%**
### Overall Assessment
> "The HOA has a healthy long-term cash flow outlook with significant surpluses projected by mid-2026, but faces an immediate liquidity pinch in the Reserve Fund for March/April capital projects. The current investment strategy relies on older, lower-yielding CDs (3.60-3.67%) that are maturing soon."
**Concurrence: 92%** Every claim verified:
- CDs are at 3.60-3.67% vs market 4.10% (verified)
- March project ($7K) vs reserve cash ($10.7K) is tight (verified)
- Long-term surplus projected from assessment income (verified from budget)
---
### Recommendation-by-Recommendation Analysis
#### Rec 1: "Critical Reserve Shortfall for March Project" — HIGH / Liquidity Warning
| Claim | Database Value | Match |
|---|---|---|
| Reserve cash = $10,688 | $10,688.45 | Exact |
| $7,000 Pond Spillway project due March | Projects table: $7,000, Mar 2026 | Exact |
| Shortfall risk | $10,688 - $7,000 = $3,688 remaining; tight but feasible | |
| Suggested action: expedite special assessment or transfer from operating | Sound advice | |
**Concurrence: 90%** The liquidity concern is real. After paying the $7K project, only $3.7K would remain in reserve cash before the $5.5K May project. The AI correctly flags the timing risk even though the fund is technically solvent.
---
#### Rec 2: "Reinvest Maturing CD #2a at Higher Rate" — HIGH / Maturity Action
| Claim | Database Value | Match |
|---|---|---|
| CD #2a = $8,000 | $8,000.00 | Exact |
| Current rate = 3.60% | 3.60% | Exact |
| Maturity = April 14, 2026 | 2026-04-14 | Exact |
| Market rate = 4.10% (E*TRADE) | CD rates: E*TRADE 4.10%, 1 year, $0 min | Exact |
| Additional yield: ~$40/year per $8K | $8K × 0.50% = $40 | Math correct |
**Concurrence: 95%** Textbook-correct recommendation. Every data point verified. The 50 bps improvement is risk-free income.
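
The yield comparison extends naturally to all three CDs in the ground-truth table. The sketch below assumes simple annual interest and that the 4.10% market rate is still available when each CD matures; neither assumption is guaranteed, and the helper name `added_annual_yield` is illustrative.

```python
MARKET_CD_RATE_PCT = 4.10  # top CD rate fetched 2026-03-04

# (name, principal, current rate in percent) from the investment_accounts data
holdings = [
    ("CD #1a", 10_000.00, 3.67),
    ("CD #2a", 8_000.00, 3.60),
    ("CD #3a", 10_000.00, 3.67),
]


def added_annual_yield(principal, current_rate_pct, new_rate_pct):
    """Extra simple annual interest from moving principal to the new rate."""
    return round(principal * (new_rate_pct - current_rate_pct) / 100, 2)


gains = {
    name: added_annual_yield(principal, rate, MARKET_CD_RATE_PCT)
    for name, principal, rate in holdings
}
for name, gain in gains.items():
    print(f"{name}: +${gain:.2f}/year")
```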
---
#### Rec 3: "Establish 12-Month CD Ladder for Reserves" — MEDIUM / CD Ladder
| Claim | Database Value | Match |
|---|---|---|
| ~$38K total reserve portfolio | $38,688.45 | Exact |
| Suggest 4-rung ladder (3/6/9/12 mo) | Standard strategy | |
| Rates up to 4.10% | Market data confirmed | |
| $9K matures every quarter | $38K / 4 = $9.5K per rung | Approximate |
**Concurrence: 75%** Strategy is sound in principle, but the recommendation overlooks two constraints:
1. **Immediate project costs ($12.5K in 2026)** must be reserved first, leaving ~$26K for laddering
2. **Investing the entire $38K** is aggressive; some cash buffer should remain liquid
**🔧 Tuning Suggestion:** Add a constraint to the prompt: "When recommending CD ladders, always subtract upcoming project costs (next 12 months) and a minimum emergency reserve (1 month of budgeted reserve expenses) before calculating the investable amount."
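
The constraint amounts to a subtraction before sizing the ladder. In the sketch below, the emergency reserve is taken as one twelfth of the $22,000 budgeted reserve disbursements; that reading of "1 month of budgeted reserve expenses" is an assumption, as is the function name `investable_amount`.

```python
def investable_amount(total_reserve, upcoming_project_costs, emergency_reserve):
    """Reserve funds left for laddering after near-term obligations."""
    remaining = total_reserve - sum(upcoming_project_costs) - emergency_reserve
    return max(0.0, remaining)


TOTAL_RESERVE = 38_688.45            # reserve cash plus three CDs
PROJECTS_NEXT_12_MONTHS = [7_000.00, 5_500.00]  # Pond Spillway, Tuscany Drain Box
EMERGENCY_RESERVE = 22_000.00 / 12   # assumed: one month of budgeted disbursements
RUNGS = 4                            # 3/6/9/12-month ladder

investable = investable_amount(
    TOTAL_RESERVE, PROJECTS_NEXT_12_MONTHS, EMERGENCY_RESERVE
)
per_rung = investable / RUNGS

print(f"investable: ${investable:,.2f}")
print(f"per rung:   ${per_rung:,.2f}")
```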
---
#### Rec 4: "Deploy Excess Operating Cash to High-Yield Savings" — MEDIUM / New Investment
| Claim | Database Value | Match |
|---|---|---|
| Operating cash = $27,418 | $27,418.81 | Exact |
| 3-month buffer = ~$35,000 | $11,665 × 3 = $34,995 | Math correct |
| Current cash below buffer | $27.4K < $35K | Correctly identified |
| Openbank 4.09% APY | Market data: Openbank 4.09%, $0.01 min | Exact |
| Trigger: "As soon as balance exceeds $35K" | Sound deferred recommendation | |
**Concurrence: 90%** The AI correctly identifies the current shortfall and provides a forward-looking trigger. Well-structured advice that respects the liquidity constraint.
---
#### Rec 5: "Optimize Reserve Cash Yield Post-Project" — LOW / Reallocation
| Claim | Database Value | Match |
|---|---|---|
| Vio Bank Money Market at 4.03% | Market data: Vio Bank 4.03%, $0 min | Exact |
| Post-project reserve cash deployment | Appropriate timing | |
| T+1 liquidity for emergencies | Correct MM account characteristic | |
**Concurrence: 85%** Reasonable low-priority optimization. Correctly uses market data.
---
#### Rec 6: "Formalize Special Assessment Collection for Reserves" — LOW / General
| Claim | Database Value | Match |
|---|---|---|
| $300/unit special assessment | Assessment groups: $300.00 special | Exact |
| Risk of commingling with operating | Budget shows special assessments in operating income | Identified |
**Concurrence: 90%** Important governance recommendation. The budget structure does show special assessments as operating income, which could lead to improper fund commingling.
---
### Risk Notes Assessment
| Risk Note | Verified | Concurrence |
|---|---|---|
| "Reserve cash ($10.6K) barely sufficient for $7K + $5.5K projects" | $10,688 vs $12,500 in projects | **95%** |
| "Concentration risk: CDs maturing in 4-month window (Apr-Aug)" | All 3 CDs mature Apr-Aug 2026 | **100%** |
| "Operating cash ballooning to $140K+ without investment plan" | Budget shows large Q2 surplus | **85%** |
| "Road Sealing $80K in 2029 needs dedicated savings plan" | Project exists, 0% funded | **95%** |
**Risk Notes Concurrence: 94%** All risk items are data-backed and appropriately flagged.
---
### Cross-Run Consistency (Investment Recommendations)
Three runs were compared. Key observations:
- **Core recommendations are highly consistent** across runs: CD reinvestment, HY savings for operating, CD ladder for reserves
- **Dollar amounts match exactly** across all runs (same data inputs)
- **Bank name recommendations vary slightly** (E*TRADE vs "Top CD Rate"); this is cosmetic, not substantive
- **Priority levels are stable** (HIGH for liquidity warnings, MEDIUM for optimization)
**Consistency Grade: A-** Investment recommendations show much better consistency than health scores, likely because the structured data (specific CDs, specific rates) constrains the output more than the subjective health scoring.
---
## Cross-Cutting Issues
### Issue 1: Score Volatility (MEDIUM Priority)
Health scores vary significantly across runs despite identical input data:
- Operating: 40-point spread (48-88)
- Reserve: 23-point spread (25-48)
**Root Cause:** Temperature 0.3 allows too much variance for numerical scoring. The model interprets guidelines subjectively.
**Recommended Fix:**
1. Reduce temperature to **0.1** for health score calculations
2. Implement a **3-run moving average** to smooth individual run variance
3. Add explicit **boundary justification** requirements to prompts
### Issue 2: YTD Budget Calculation Includes Incomplete Month (LOW Priority)
The operating health score computes YTD budget through the current month (March), but actual data may only cover a few days. This creates alarming income variances (e.g., "$55K variance") that are pure timing artifacts.
**Recommended Fix:**
- Compute YTD budget through the **prior completed month** (February)
- OR pro-rate the current month's budget by days elapsed
- Add a note to the prompt: "If the variance is driven by the current incomplete month, flag it as 'timing' and weight it minimally."
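
Both budget-window options can be sketched in a few lines. The monthly figures below are placeholders chosen for easy arithmetic, not values from the FY2026 budget, and the function names are hypothetical. The sketch also assumes the fiscal year matches the calendar year.

```python
import calendar
from datetime import date


def ytd_budget_through_prior_month(monthly_budget, as_of):
    """Sum the budget for months fully completed before `as_of`."""
    return sum(monthly_budget[: as_of.month - 1])


def ytd_budget_prorated(monthly_budget, as_of):
    """Completed months plus the current month scaled by days elapsed."""
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    current = monthly_budget[as_of.month - 1] * as_of.day / days_in_month
    return ytd_budget_through_prior_month(monthly_budget, as_of) + current


# Placeholder budget: Jan, Feb, Mar populated; remaining months zero.
monthly_budget = [1_000.0, 1_000.0, 31_000.0] + [0.0] * 9
as_of = date(2026, 3, 4)  # the audit's data snapshot date

print(ytd_budget_through_prior_month(monthly_budget, as_of))  # Jan + Feb only
print(ytd_budget_prorated(monthly_budget, as_of))             # plus 4/31 of March
```

With the snapshot taken 4 days into a 31-day month, the pro-rated window counts 4/31 of March's budget instead of all of it, which is what removes the timing-driven variance.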
### Issue 3: Per-Unit vs Total Confusion on Special Assessments (LOW Priority)
The AI sometimes quotes "$300" as the annual reserve income instead of $300 × 67 = $20,100. The data passed says "$300.00/unit × 67 units (annual)" but the model occasionally fixates on the per-unit figure.
**Recommended Fix:**
- Pre-compute and include the total in the data: "Total Annual Special Assessment Income: $20,100.00"
- Keep the per-unit breakdown for context but lead with the total
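
A minimal sketch of the pre-computation, leading with the total and keeping the per-unit breakdown as context. The function name and the exact wording of the second line are illustrative; only the "Total Annual Special Assessment Income" label comes from the recommendation above.

```python
def special_assessment_context(units, per_unit_annual):
    """Build prompt context that leads with the pre-computed total."""
    total = units * per_unit_annual
    return (
        f"Total Annual Special Assessment Income: ${total:,.2f}\n"
        f"  (breakdown: ${per_unit_annual:,.2f}/unit x {units} units, annual)"
    )


context = special_assessment_context(units=67, per_unit_annual=300.00)
print(context)
```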
### Issue 4: Cash Runway Classification Inconsistency (MEDIUM Priority)
The operating health score rates 2.4 months of cash runway as "positive" despite the scoring guidelines defining 2-3 months as "Fair" territory. This inflates the overall score.
**Recommended Fix:**
- Add explicit prompt guidance: "Cash runway categorization: <2 months = negative, 2-3 months = neutral, 3-6 months = positive, 6+ months = strongly positive. Do NOT rate below-threshold runway as positive based on projected future inflows."
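
The mapping in that guidance is simple enough to express directly, which also makes it easy to check a model's factor rating against the rule. The sketch below assumes each band includes its lower bound (exactly 3.0 months counts as "positive"); the guidance does not state how boundary values are handled, and the function names are hypothetical.

```python
def runway_months(operating_cash, annual_expense):
    """Months of runway at the budgeted monthly expense run rate."""
    return operating_cash / (annual_expense / 12)


def classify_runway(months):
    """Map runway to an impact label; lower bounds assumed inclusive."""
    if months < 2:
        return "negative"
    if months < 3:
        return "neutral"
    if months < 6:
        return "positive"
    return "strongly positive"


months = runway_months(operating_cash=27_418.81, annual_expense=139_979.95)
print(f"{months:.2f} months -> {classify_runway(months)}")
```

For the audited tenant this yields roughly 2.35 months, which the rule classifies as "neutral" and not the "positive" rating the model assigned.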
### Issue 5: Dual Project Tables (INFORMATIONAL)
The schema contains both `capital_projects` (empty) and `projects` (26 rows). The health score service correctly queries `projects`, but auditors initially checked `capital_projects` and found no data. This dual-table pattern could confuse future developers.
**Recommended Fix:**
- Consolidate into a single table, OR
- Add a comment/documentation clarifying the canonical source
---
## Concurrence Summary by Recommendation
### Operating Fund Health — Recommendations
| Recommendation | Concurrence |
|---|---|
| Verify posting schedule for $20,700 Special Assessment | 90% |
| Investigate low YTD expense recognition | 95% |
| Move excess cash to interest-bearing account | 85% |
| **Average** | **90%** |
### Reserve Fund Health — Recommendations
| Recommendation | Concurrence |
|---|---|
| Commission professional Reserve Study | 100% |
| Develop funding plan for $80K Road Sealing | 90% |
| Formalize special assessment collection for reserves | 95% |
| **Average** | **95%** |
### Investment Planning — Recommendations
| Recommendation | Concurrence |
|---|---|
| Critical Reserve Shortfall for March Project | 90% |
| Reinvest Maturing CD #2a at Higher Rate | 95% |
| Establish 12-Month CD Ladder | 75% |
| Deploy Operating Cash to HY Savings | 90% |
| Optimize Reserve Cash Post-Project | 85% |
| Formalize Special Assessment Collection | 90% |
| **Average** | **88%** |
---
## Final Grades
| Feature | Score Accuracy | Recommendation Quality | Data Fidelity | Consistency | **Overall** |
|---|---|---|---|---|---|
| Operating Fund Health | C+ (score ~10-15 pts high) | A (90%) | B+ (minor math phrasing) | C (16-pt spread) | **72% — B-** |
| Reserve Fund Health | A- (well-calibrated) | A (95%) | B (per-unit confusion) | B- (23-pt spread) | **85% — B+** |
| Investment Recommendations | N/A (no single score) | A (88%) | A (exact data matches) | A- (stable across runs) | **88% — A-** |
---
## Priority Action Items for Tuning
1. **[HIGH]** Reduce AI temperature from 0.3 to 0.1 for health score calculations to reduce score volatility
2. **[MEDIUM]** Add explicit cash-runway-to-impact mapping in operating prompt to prevent misclassification
3. **[MEDIUM]** Pre-compute total special assessment income in data context (not just per-unit)
4. **[LOW]** Adjust YTD budget calculation to use prior completed month or pro-rate current month
5. **[LOW]** Add boundary justification requirement to scoring prompts
6. **[LOW]** Consider implementing 3-run moving average for displayed health scores
---
*Generated by Claude Opus 4.6 — Automated AI Feature Audit*