olsch01 07d15001ae fix: improve AI health score accuracy and consistency

Address 4 issues identified in AI feature audit:

1. Reduce temperature from 0.3 to 0.1 for health score calculations
   to reduce 16-40 point score volatility across runs

2. Add explicit cash runway classification rules to operating prompt
   preventing the model from rating sub-3-month runway as "positive"

3. Pre-compute total special assessment income in both operating and
   reserve prompts, eliminating per-unit vs total confusion ($300
   vs $20,100)

4. Make YTD budget comparison actuals-aware: only compare months with
   posted journal entries, show current month budget separately, and
   add prompt guidance about month-end posting cadence

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-06 12:44:12 -05:00

AI Feature Audit Report

Audit Date: 2026-03-05
Tenant Under Test: Pine Creek HOA (tenant_pine_creek_hoa_q33i)
AI Model: Qwen 3.5-397B-A17B via NVIDIA NIM (Temperature: 0.3)
Auditor: Claude Opus 4.6 (automated)
Data Snapshot Date: 2026-03-04

Executive Summary

Three AI-powered features were audited against ground-truth database records: Operating Fund Health, Reserve Fund Health, and Investment Recommendations. Overall, the AI demonstrates strong financial reasoning and produces actionable, fiduciary-appropriate recommendations. However, score consistency across runs is a concern (a 16-point spread on operating once the outlier first run is excluded, 40 points with it; a 23-point spread on reserve), and several specific data interpretation issues were identified.

Feature	Latest Score/Grade	Concurrence	Verdict
Operating Fund Health	88 / Good	72%	Score ~10-15 pts high; cash runway below its own "Good" threshold
Reserve Fund Health	45 / Needs Attention	85%	Well-calibrated; minor data misquote on annual contributions
Investment Recommendations	6 recommendations	88%	Excellent specificity; all market rates verified accurate

Data Foundation (Ground Truth)

Financial Position

Metric	Value	Source
Operating Cash (Checking)	$27,418.81	GL balance
Reserve Cash (Savings)	$10,688.45	GL balance
Reserve CD #1a (FCB)	$10,000 @ 3.67%, matures 6/19/26	`investment_accounts`
Reserve CD #2a (FCB)	$8,000 @ 3.60%, matures 4/14/26	`investment_accounts`
Reserve CD #3a (FCB)	$10,000 @ 3.67%, matures 8/18/26	`investment_accounts`
Total Reserve Fund	$38,688.45	Cash + Investments
Total Assets	$66,107.26	Operating + Reserve

Budget (FY2026)

Category	Annual Total
Operating Income	$184,207.40
Operating Expense	$139,979.95
Net Operating Surplus	$44,227.45
Monthly Expense Run Rate	$11,665.00
Reserve Interest Income	$1,449.96
Reserve Disbursements	$22,000.00 (Mar $13K, Apr $9K)

Assessment Structure

67 units at $2,328.14/year regular + $300.00/year special (annual frequency)
Total annual regular assessments: ~$155,985
Total annual special assessments: ~$20,100
Budget timing: assessments front-loaded in Mar-Jun

Actuals (YTD through March 4, 2026)

Metric	Value
YTD Income	$88.16 (ARC fees $100 - $50 adj + $38.16 interest)
YTD Expenses	$1,850.42 (January only)
Delinquent Invoices	0 ($0.00)
Journal Entries Posted	4 (Jan actuals + Feb adjusting + Feb opening balances)

Capital Projects (from `projects` table, 26 total)

Project	Cost	Target	Funded %
Pond Spillway	$7,000	Mar 2026	0%
Tuscany Drain Box	$5,500	May 2026	0%
Front Entrance Power Washing	$1,500	Mar 2027	0%
Irrigation Pump Replacement	$1,500	Jun 2027	0%
Road Sealing - All Roads	$80,000	Jun 2029	0%
Asphalt Repair - Creek Stone Dr	$43,000	TBD	0%
Pavilion & Vineyard Structures	$7,000	Jun 2035	0%
16 placeholder items	$1.00 each	TBD	0%
Total Planned	$152,016		0%

Reserve Components

0 components tracked (empty `reserve_components` table)

Market Rates (fetched 2026-03-04)

Type	Top Rate	Bank	Term
CD	4.10%	E*TRADE / Synchrony	12-14 mo
High-Yield Savings	4.09%	Openbank	Liquid
Money Market	4.03%	Vio Bank	Liquid

1. Operating Fund Health Score

Latest Score: 88 (Good), generated 2026-03-04T19:24:36Z
Score History: 48 → 72 → 78 → 72 → 78 → 88 (6 runs, March 2-4)
Overall Concurrence: 72%

Factor-by-Factor Analysis

Factor 1: "Projected Cash Flow" — Impact: Positive

"12-month forecast shows consistent positive liquidity, with cash balances never dipping below the starting $27,419 and peaking at $142,788 in June."

Check	Result
Budget surplus ($184K income vs $140K expense)	Verified ✅
Assessments front-loaded Mar-Jun	Verified ✅ (budget shows $48K Mar, $64K Apr, $32K May, $16K Jun)
Peak of ~$142K in June	Plausible ✅ ($27K + cumulative income through June)
Cash never below starting $27K	Plausible ✅ (expenses < income by month)

Concurrence: 95% — Forecast logic is sound. The only risk is the assumption that assessments are collected on the exact budget schedule.

Factor 2: "Delinquency Rate" — Impact: Positive

"$0.00 in overdue invoices and a 0.0% delinquency rate."

Concurrence: 100% ✅ — Database confirms zero delinquent invoices.

Factor 3: "Budget Performance (Timing)" — Impact: Neutral

"YTD income is 99.8% below budget ($55k variance) primarily due to the timing of the large Special Assessment ($20,700) and regular assessments appearing in future projected months."

Check	Result
YTD income $88.16	Verified ✅
Budget includes March ($55K) in YTD calc	Accurate — AI uses month 3 of 12, includes full March budget
Timing explanation	Reasonable — we're only 4 days into March
Rating as "neutral" vs "negative"	Appropriate ✅ — correctly avoids penalizing for calendar timing

Concurrence: 80% — The variance is accurately computed but presenting a $55K "variance" when we're 4 days into March could alarm a board member. The YTD window through month 3 includes all of March's budget despite only 4 days having elapsed. Consider computing YTD budget pro-rata or through the prior complete month.

🔧 Tuning Suggestion: Add a note to the prompt about pro-rating the current month's budget, or instruct the AI to note "X days into the current month" when the variance is driven by incomplete-month timing.

Factor 4: "Cash Reserves" — Impact: Positive

"Current operating cash of $27,419 provides 2.4 months of runway based on the annual expense run rate."

Check	Result
$27,419 / ($139,980 / 12) = 2.35 months	Math verified ✅
Rated as "positive"	Questionable ⚠️

Concurrence: 60% — The math is correct, but rating 2.4 months as "positive" contradicts the scoring guidelines which state 2-3 months = "Fair" (60-74) and 3-6 months = "Good" (75-89). This factor should be "neutral" at best, and the overall score should reflect that the HOA is below the "Good" threshold for cash reserves.

🔧 Tuning Suggestion: Add explicit guidance in the prompt: "If cash runway is below 3 months, this factor MUST be neutral or negative, regardless of projected future inflows."

Factor 5: "Expense Management" — Impact: Positive

"YTD expenses are $36,313 under budget (4.8% of annual budget spent vs 25% of year elapsed)."

Check	Result
YTD expenses $1,850.42	Verified ✅
Budget YTD (3 months): ~$38,164	Correct ✅
$1,850 / $38,164 = 4.85%	Math verified ✅
"25% of year elapsed"	Correct (month 3 of 12)
Phrasing "of annual budget"	Misleading ⚠️ — it's actually 4.8% of YTD budget, not annual

Concurrence: 70% — The percentage is correctly calculated against YTD budget, but the phrasing "of annual budget" is incorrect. Also, the low spend is not necessarily positive — only January actuals exist; February hasn't been posted yet, which the AI partially acknowledges with "or delayed billing cycles."
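The phrasing issue is easiest to see by computing both ratios side by side. The following is a minimal sketch using only the figures quoted in this section; the variable names are illustrative and not taken from the health score service.

```python
# Figures quoted in this report; names are illustrative only.
ytd_expenses = 1_850.42      # YTD actuals (January only)
ytd_budget = 38_164.00       # approximate YTD budget through month 3
annual_budget = 139_979.95   # FY2026 operating expense budget

pct_of_ytd_budget = ytd_expenses / ytd_budget * 100
pct_of_annual_budget = ytd_expenses / annual_budget * 100

print(f"{pct_of_ytd_budget:.2f}% of YTD budget")        # the figure the AI reported
print(f"{pct_of_annual_budget:.2f}% of annual budget")  # what "of annual budget" would mean
```

The two results differ by a factor of roughly 3.7, which is why the label attached to the percentage matters.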

Recommendation Assessment

#	Recommendation	Priority	Concurrence
1	"Verify the posting schedule for the $20,700 Special Assessment"	Low	90% ✅ Valid; assessments are annual, collection timing matters
2	"Investigate the low YTD expense recognition ($1,850 vs $38,164)"	Medium	95% ✅ Excellent catch; Feb expenses not posted yet
3	"Consider moving excess cash over $100K in Q2 to interest-bearing account"	Low	85% ✅ Sound advice; aligns with HY Savings at 4.09%

Recommendation Concurrence: 90% — All three recommendations are actionable and data-backed.

Score Assessment

Is 88 (Good) the right score?

Scoring Criterion	Guidelines Say	Actual	Alignment
Cash reserves	3-6 months for "Good"	2.4 months	❌ Below threshold
Income vs expenses	"Roughly matching" for Good	$184K vs $140K (surplus)	✅ Exceeds
Delinquency	"Manageable" for Good	0%	✅ Excellent
Budget performance	No major overruns for Good	Under budget (timing)	✅ Positive
Projected cash flow	Not explicitly in guidelines	Strong positive trajectory	✅ Positive

The cash runway of 2.4 months is below the stated "Good" (75-89) threshold of 3-6 months and technically falls in the "Fair" (60-74) range of 2-3 months. Earlier AI runs scored this 72-78, which better aligns with the guidelines. The 88 appears to overweight the projected future cash flow (which is speculative) vs the current actual position.

Suggested correct score: 74-80 (high end of Fair to low end of Good)

Score Consistency Concern

Run Date	Score	Label
Mar 2 15:07	48	Needs Attention
Mar 2 15:12	78	Good
Mar 2 15:36	72	Fair
Mar 2 17:09	78	Good
Mar 3 02:03	72	Fair
Mar 4 19:24	88	Good

A 40-point spread (48-88) across 6 runs with essentially the same data is concerning. Even excluding the outlier first run (which noted a data config issue with "1 units"), the remaining 5 runs span 72-88 (16 points). At temperature 0.3, this suggests the model is not deterministic enough for financial scoring.

🔧 Tuning Suggestion: Consider lowering temperature to 0.1 for health score calculations to improve consistency. Alternatively, implement a moving average of the last 3 scores to smooth volatility.
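The spread figures above can be reproduced directly from the run table. This is a small sketch; `spread` is a hypothetical helper, and the score list is copied from the table in run order.

```python
# Operating Fund Health scores, in the order listed in the run table.
operating_scores = [48, 78, 72, 78, 72, 88]


def spread(scores):
    """Return the gap between the highest and lowest score."""
    return max(scores) - min(scores)


full_spread = spread(operating_scores)               # all 6 runs
spread_without_first = spread(operating_scores[1:])  # drop the outlier first run

print(full_spread, spread_without_first)
```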

2. Reserve Fund Health Score

Latest Score: 45 (Needs Attention), generated 2026-03-04T19:24:50Z
Score History: 25 → 48 → 42 → 25 → 45 → 35 → 45 (7 runs, March 2-4)
Overall Concurrence: 85%

Factor-by-Factor Analysis

Factor 1: "Funded Ratio" — Impact: Negative

"Calculated at 0% because no reserve components have been inventoried or assigned replacement costs, making it impossible to measure true funding health against the $152,016 in planned projects."

Check	Result
0 reserve components in DB	Verified ✅
$152,016 in planned projects	Verified ✅ (sum of all `projects` rows)
0% funded ratio	Technically accurate ✅ (no denominator from components)
Distinction between components and projects	Well articulated ✅

Concurrence: 95% — The AI correctly identifies that the 0% is an artifact of missing reserve study data, not a literal lack of funds. It appropriately flags this as a governance failure.

Factor 2: "Projected Cash Flow" — Impact: Positive

"Strong immediate liquidity; cash balance is projected to rise from $10,688 to over $49,000 by May 2026 due to special assessment income covering the $12,500 in urgent 2026 project costs."

Check	Result
Starting reserve cash $10,688	Verified ✅
2026 project costs: $7K (Mar) + $5.5K (May) = $12,500	Verified ✅
Special assessment: $300 × 67 = $20,100/year	Verified ✅
CD maturities: $8K (Apr), $10K (Jun), $10K (Aug)	Verified ✅
Projected rise to $49K by May	Plausible ✅ (income + maturities - project costs)

Concurrence: 85% — Math is directionally correct. However, the assessment is annual frequency so the full $20,100 may arrive in a single payment, not spread monthly. The timing assumption is critical.

Factor 3: "Component Tracking" — Impact: Negative

"Critical failure in governance: 'No reserve components tracked' means the association is flying blind on the condition and remaining useful life of major assets like roads and irrigation."

Concurrence: 100% ✅ — Database confirms 0 rows in reserve_components. This is objectively a critical gap.

Factor 4: "Annual Contributions" — Impact: Negative

"Recurring annual reserve income is only $300 (plus minimal interest), which is grossly insufficient to fund the $80,000 road sealing project due in 2029."

Check	Result
Reserve budget income: $1,449.96/yr (interest only)	Verified ✅
Special assessment: $300/unit × 67 = $20,100/yr	Verified ✅
"$300" cited as annual reserve income	Incorrect ⚠️
Road Sealing $80K in June 2029	Verified ✅

Concurrence: 65% — The concern about insufficient contributions is valid, but the "$300" figure appears to confuse the per-unit special assessment amount ($300/unit) with the total annual reserve income. Actual annual reserve income = $1,450 (interest) + $20,100 (special assessments) = $21,550/yr. Even at $21,550/yr, the 3 years until Road Sealing would accumulate ~$64,650, still short of $80K. So the directional concern is correct, but the magnitude is significantly misstated.
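The arithmetic behind this correction, written out as a sketch. It uses the rounded figures from the paragraph above, assumes contributions stay flat for three years, and ignores interest growth and existing reserve balances.

```python
# Rounded figures from this report; a flat 3-year projection is an assumption.
interest_income = 1_450            # budgeted reserve interest, rounded
special_assessment_total = 300 * 67  # $300/unit across 67 units
annual_reserve_income = interest_income + special_assessment_total

years_until_road_sealing = 3
accumulated = annual_reserve_income * years_until_road_sealing
shortfall = 80_000 - accumulated   # against the Road Sealing cost

print(annual_reserve_income, accumulated, shortfall)
```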

🔧 Tuning Suggestion: The prompt should explicitly label the special assessment income total (not per-unit) in the data context. Currently the data says "$300.00/unit × 67 units (annual)" — the AI should compute $20,100 but sometimes fixates on the $300 per-unit figure. Consider pre-computing and passing the total.

Recommendation Assessment

#	Recommendation	Priority	Concurrence
1	"Commission a professional Reserve Study to inventory assets and establish funded ratio"	High	100% ✅ Critical and universally correct
2	"Develop a long-term funding plan for the $80,000 Road Sealing project (2029)"	High	90% ✅ Verified project exists; $80K with 0% funded
3	"Formalize collection of special assessments into the reserve fund vs operating"	Medium	95% ✅ Budget shows special assessments in operating income section

Recommendation Concurrence: 95% — All recommendations are actionable, appropriately prioritized, and backed by database evidence.

Score Assessment

Is 45 (Needs Attention) the right score?

Scoring Criterion	Guidelines Say	Actual	Alignment
Percent funded	20-30% for "Needs Attention"	0% (no components)	⬇️ Worse than threshold
Contributions	"Inadequate" for Needs Attention	$21,550/yr for $152K in projects	⚠️ Borderline
Component tracking	"Multiple urgent unfunded"	0 tracked, 2 due in 2026	❌ Critical gap
Investments	Not scored negatively	3 CDs earning 3.6-3.67%	✅ Positive
Capital readiness		$12.5K due soon, only $10.7K cash	⚠️ Tight

A score of 45 is reasonable. The 0% funded ratio technically suggests "At Risk" (20-39), but the presence of real assets ($38.7K), active investments, and manageable near-term liquidity justifies bumping it into the "Needs Attention" band. The AI's balancing of the artificial 0% metric against actual fund health shows good judgment.

Suggested correct score: 40-50 — the AI's 45 is well-calibrated.

Score Consistency Concern

Run Date	Score	Label
Mar 2 15:06	25	At Risk
Mar 2 15:13	25	At Risk
Mar 2 15:37	48	Needs Attention
Mar 2 17:10	42	Needs Attention
Mar 3 02:04	45	Needs Attention
Mar 4 18:49	35	At Risk
Mar 4 19:24	45	Needs Attention

A 23-point spread (25-48) across 7 runs. The scores oscillate between "At Risk" and "Needs Attention": the model cannot consistently decide which band this falls into. The most recent 3 runs (45, 35, 45) are more stable.

🔧 Tuning Suggestion: Add boundary guidance to the prompt: "When the score falls within ±5 points of a threshold (40, 60, 75, 90), explicitly justify which side of the boundary the HOA falls on."
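The same boundary rule could also be checked in code after the model responds, for example to flag runs that need a justification. This is a sketch only: the thresholds and the ±5 margin come from the suggestion above, while `near_threshold` is a hypothetical helper that does not exist in the audited service.

```python
# Band thresholds named in the tuning suggestion.
THRESHOLDS = (40, 60, 75, 90)


def near_threshold(score, margin=5):
    """Return the threshold within `margin` points of `score`, or None."""
    for threshold in THRESHOLDS:
        if abs(score - threshold) <= margin:
            return threshold
    return None


for score in (25, 35, 45, 48):
    print(score, near_threshold(score))
```

Of the reserve scores in the table, 35, 42, and 45 all sit within five points of the 40 boundary, which is consistent with the oscillation between bands.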

3. AI Investment Recommendations

Latest Run: 2026-03-04T19:28:22Z (3 runs saved)
Overall Concurrence: 88%

Overall Assessment

"The HOA has a healthy long-term cash flow outlook with significant surpluses projected by mid-2026, but faces an immediate liquidity pinch in the Reserve Fund for March/April capital projects. The current investment strategy relies on older, lower-yielding CDs (3.60-3.67%) that are maturing soon."

Concurrence: 92% ✅ — Every claim verified:

CDs are at 3.60-3.67% vs market 4.10% (verified)
March project ($7K) vs reserve cash ($10.7K) is tight (verified)
Long-term surplus projected from assessment income (verified from budget)

Recommendation-by-Recommendation Analysis

Rec 1: "Critical Reserve Shortfall for March Project" — HIGH / Liquidity Warning

Claim	Database Value	Match
Reserve cash = $10,688	$10,688.45	✅ Exact
$7,000 Pond Spillway project due March	Projects table: $7,000, Mar 2026	✅ Exact
Shortfall risk	$10,688 - $7,000 = $3,688 remaining — tight but feasible	✅
Suggested action: expedite special assessment or transfer from operating	Sound advice	✅

Concurrence: 90% — The liquidity concern is real. After paying the $7K project, only $3.7K would remain in reserve cash before the $5.5K May project. The AI correctly flags the timing risk even though the fund is technically solvent.

Rec 2: "Reinvest Maturing CD #2a at Higher Rate" — HIGH / Maturity Action

Claim	Database Value	Match
CD #2a = $8,000	$8,000.00	✅ Exact
Current rate = 3.60%	3.60%	✅ Exact
Maturity = April 14, 2026	2026-04-14	✅ Exact
Market rate = 4.10% (E*TRADE)	CD rates: E*TRADE 4.10%, 1 year, $0 min	✅ Exact
Additional yield: ~$40/year per $8K	$8K × 0.50% = $40	✅ Math correct

Concurrence: 95% ✅ — Textbook-correct recommendation. Every data point verified. The 50 bps improvement is risk-free income.
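The yield check in the table is a simple-interest approximation over one year. A minimal sketch with `Decimal` to avoid floating-point noise follows; compounding and early-withdrawal terms are deliberately ignored.

```python
from decimal import Decimal

principal = Decimal("8000.00")    # CD #2a balance
current_rate = Decimal("0.0360")  # existing CD rate
market_rate = Decimal("0.0410")   # top 12-month CD rate fetched 2026-03-04

# One year of simple interest on the rate difference (50 bps).
extra_interest = principal * (market_rate - current_rate)

print(f"${extra_interest:.2f} additional interest per year")
```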

Rec 3: "Establish 12-Month CD Ladder for Reserves" — MEDIUM / CD Ladder

Claim	Database Value	Match
~$38K total reserve portfolio	$38,688.45	✅ Exact
Suggest 4-rung ladder (3/6/9/12 mo)	Standard strategy	✅
Rates up to 4.10%	Market data confirmed	✅
$9K matures every quarter	$38K / 4 = $9.5K per rung	✅ Approximate

Concurrence: 75% — Strategy is sound in principle, but the recommendation overlooks two constraints:

Immediate project costs ($12.5K in 2026) must be reserved first, leaving ~$26K for laddering
Investing the entire $38K is aggressive — some cash buffer should remain liquid

🔧 Tuning Suggestion: Add a constraint to the prompt: "When recommending CD ladders, always subtract upcoming project costs (next 12 months) and a minimum emergency reserve (1 month of budgeted reserve expenses) before calculating the investable amount."
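One way to make that constraint concrete is to compute the investable amount before sizing the rungs. The sketch below is illustrative: the function names are hypothetical, and the emergency reserve is left as a parameter because this report does not state its value.

```python
from decimal import Decimal


def investable_for_ladder(total_reserve, upcoming_project_costs, emergency_reserve):
    """Reserve balance left after near-term projects and a liquid buffer."""
    investable = total_reserve - upcoming_project_costs - emergency_reserve
    return max(investable, Decimal("0"))


def rung_size(investable, rungs=4):
    """Split the investable amount evenly across ladder rungs."""
    return (investable / rungs).quantize(Decimal("0.01"))


total_reserve = Decimal("38688.45")
projects_2026 = Decimal("7000") + Decimal("5500")  # Pond Spillway + Tuscany Drain Box

# With no emergency buffer this reproduces the ~$26K figure noted above.
investable = investable_for_ladder(total_reserve, projects_2026, Decimal("0"))
print(investable, rung_size(investable))
```

Passing a non-zero `emergency_reserve` lowers the investable amount further, so each rung would be smaller than the $9K to $9.5K the recommendation assumed.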

Rec 4: "Deploy Excess Operating Cash to High-Yield Savings" — MEDIUM / New Investment

Claim	Database Value	Match
Operating cash = $27,418	$27,418.81	✅ Exact
3-month buffer = ~$35,000	$11,665 × 3 = $34,995	✅ Math correct
Current cash below buffer	$27.4K < $35K	✅ Correctly identified
Openbank 4.09% APY	Market data: Openbank 4.09%, $0.01 min	✅ Exact
Trigger: "As soon as balance exceeds $35K"	Sound deferred recommendation	✅

Concurrence: 90% ✅ — The AI correctly identifies the current shortfall and provides a forward-looking trigger. Well-structured advice that respects the liquidity constraint.
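The deferred trigger amounts to a threshold check against a three-month expense buffer. A minimal sketch, with a hypothetical `deployable_excess` helper and the figures from the table:

```python
monthly_expense_run_rate = 11_665.00
buffer = 3 * monthly_expense_run_rate  # three months of operating expenses


def deployable_excess(operating_cash, buffer):
    """Cash above the buffer that could move to high-yield savings."""
    return max(0.0, operating_cash - buffer)


print(buffer)                               # 34995.0
print(deployable_excess(27_418.81, buffer))  # 0.0 today: cash is below the buffer
```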

Rec 5: "Optimize Reserve Cash Yield Post-Project" — LOW / Reallocation

Claim	Database Value	Match
Vio Bank Money Market at 4.03%	Market data: Vio Bank 4.03%, $0 min	✅ Exact
Post-project reserve cash deployment	Appropriate timing	✅
T+1 liquidity for emergencies	Correct MM account characteristic	✅

Concurrence: 85% ✅ — Reasonable low-priority optimization. Correctly uses market data.

Rec 6: "Formalize Special Assessment Collection for Reserves" — LOW / General

Claim	Database Value	Match
$300/unit special assessment	Assessment groups: $300.00 special	✅ Exact
Risk of commingling with operating	Budget shows special assessments in operating income	✅ Identified

Concurrence: 90% ✅ — Important governance recommendation. The budget structure does show special assessments as operating income, which could lead to improper fund commingling.

Risk Notes Assessment

Risk Note	Verified	Concurrence
"Reserve cash ($10.6K) barely sufficient for $7K + $5.5K projects"	✅ $10,688 vs $12,500 in projects	95%
"Concentration risk: CDs maturing in 4-month window (Apr-Aug)"	✅ All 3 CDs mature Apr-Aug 2026	100%
"Operating cash ballooning to $140K+ without investment plan"	✅ Budget shows large Q2 surplus	85%
"Road Sealing $80K in 2029 needs dedicated savings plan"	✅ Project exists, 0% funded	95%

Risk Notes Concurrence: 94% — All risk items are data-backed and appropriately flagged.

Cross-Run Consistency (Investment Recommendations)

Three runs were compared. Key observations:

Core recommendations are highly consistent across runs: CD reinvestment, HY savings for operating, CD ladder for reserves
Dollar amounts match exactly across all runs (same data inputs)
Bank name recommendations vary slightly (E*TRADE vs "Top CD Rate") — cosmetic, not substantive
Priority levels are stable (HIGH for liquidity warnings, MEDIUM for optimization)

Consistency Grade: A- — Investment recommendations show much better consistency than health scores, likely because the structured data (specific CDs, specific rates) constrains the output more than the subjective health scoring.

Cross-Cutting Issues

Issue 1: Score Volatility (MEDIUM Priority)

Health scores vary significantly across runs despite identical input data:

Operating: 40-point spread (48-88)
Reserve: 23-point spread (25-48)

Root Cause: Temperature 0.3 allows too much variance for numerical scoring. The model interprets guidelines subjectively.

Recommended Fix:

Reduce temperature to 0.1 for health score calculations
Implement a 3-run moving average to smooth individual run variance
Add explicit boundary justification requirements to prompts
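The moving-average option might look like the following sketch. It assumes a trailing window that uses fewer runs until three are available; `moving_average` is a hypothetical helper, and the input is the operating score list from the run table.

```python
def moving_average(scores, window=3):
    """Trailing average over up to `window` most recent scores."""
    smoothed = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        smoothed.append(round(sum(chunk) / len(chunk), 1))
    return smoothed


operating_scores = [48, 78, 72, 78, 72, 88]
print(moving_average(operating_scores))
# [48.0, 63.0, 66.0, 76.0, 74.0, 79.3]
```

Once the window is full, the displayed value moves within 74.0 to 79.3 rather than 72 to 88, at the cost of lagging behind a genuine change in the underlying data.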

Issue 2: YTD Budget Calculation Includes Incomplete Month (LOW Priority)

The operating health score computes YTD budget through the current month (March), but actual data may only cover a few days. This creates alarming income variances (e.g., "$55K variance") that are pure timing artifacts.

Recommended Fix:

Compute YTD budget through the prior completed month (February)
OR pro-rate the current month's budget by days elapsed
Add a note to the prompt: "If the variance is driven by the current incomplete month, flag it as 'timing' and weight it minimally."
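Both calculation options can be sketched in a few lines. Everything here is illustrative: `ytd_budget` is a hypothetical function, and the January and February budget values are placeholders, since only March's roughly $55K figure appears in this report.

```python
import calendar
from datetime import date


def ytd_budget(monthly_budget, as_of, mode="prior_month"):
    """YTD budget through the last completed month, optionally pro-rating the current one."""
    completed = sum(monthly_budget[: as_of.month - 1])
    if mode == "prior_month":
        return completed
    if mode == "pro_rata":
        days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
        current = monthly_budget[as_of.month - 1]
        return completed + current * as_of.day / days_in_month
    raise ValueError(f"unknown mode: {mode}")


# Placeholder Jan/Feb values; March approximates the ~$55K noted in this report.
monthly_income_budget = [100.0, 100.0, 55_000.0] + [0.0] * 9
snapshot = date(2026, 3, 4)

print(ytd_budget(monthly_income_budget, snapshot, "prior_month"))
print(round(ytd_budget(monthly_income_budget, snapshot, "pro_rata"), 2))
```

Under either mode, four days into March no longer pulls the full March budget into the comparison, so the headline variance shrinks to what has actually had time to post.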

Issue 3: Per-Unit vs Total Confusion on Special Assessments (LOW Priority)

The AI sometimes quotes "$300" as the annual reserve income instead of $300 × 67 = $20,100. The data passed says "$300.00/unit × 67 units (annual)" but the model occasionally fixates on the per-unit figure.

Recommended Fix:

Pre-compute and include the total in the data: "Total Annual Special Assessment Income: $20,100.00"
Keep the per-unit breakdown for context but lead with the total
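A sketch of how the data context line could be built so the total leads and the per-unit figure follows. The function name and exact wording are assumptions; only the label text and the numbers come from this report.

```python
def special_assessment_context(per_unit, units):
    """Format the prompt data line with the pre-computed total first."""
    total = per_unit * units
    return (
        f"Total Annual Special Assessment Income: ${total:,.2f} "
        f"(${per_unit:,.2f}/unit × {units} units, annual)"
    )


line = special_assessment_context(300.00, 67)
print(line)
```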

Issue 4: Cash Runway Classification Inconsistency (MEDIUM Priority)

The operating health score rates 2.4 months of cash runway as "positive" despite the scoring guidelines defining 2-3 months as "Fair" territory. This inflates the overall score.

Recommended Fix:

Add explicit prompt guidance: "Cash runway categorization: <2 months = negative, 2-3 months = neutral, 3-6 months = positive, 6+ months = strongly positive. Do NOT rate below-threshold runway as positive based on projected future inflows."
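The same mapping can be expressed deterministically, which would let the service validate or override the model's factor rating. This is a sketch under stated assumptions: the function names are hypothetical, and treating exactly 3 and exactly 6 months as the start of the higher band is a choice the guidance above does not specify.

```python
def runway_months(operating_cash, annual_expense_budget):
    """Months of expenses covered by current operating cash."""
    return operating_cash / (annual_expense_budget / 12)


def classify_runway(months):
    """Map runway to the impact categories in the proposed prompt guidance."""
    if months < 2:
        return "negative"
    if months < 3:
        return "neutral"
    if months < 6:  # assumption: exactly 3 months counts as positive
        return "positive"
    return "strongly positive"


months = runway_months(27_418.81, 139_979.95)
print(round(months, 2), classify_runway(months))
```

For the audited snapshot this yields a neutral rating, matching the audit's view that Factor 4 should not have been rated positive.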

Issue 5: Dual Project Tables (INFORMATIONAL)

The schema contains both `capital_projects` (empty) and `projects` (26 rows). The health score service correctly queries `projects`, but auditors initially checked `capital_projects` and found no data. This dual-table pattern could confuse future developers.

Recommended Fix:

Consolidate into a single table, OR
Add a comment/documentation clarifying the canonical source

Concurrence Summary by Recommendation

Operating Fund Health — Recommendations

Recommendation	Concurrence
Verify posting schedule for $20,700 Special Assessment	90%
Investigate low YTD expense recognition	95%
Move excess cash to interest-bearing account	85%
Average	90%

Reserve Fund Health — Recommendations

Recommendation	Concurrence
Commission professional Reserve Study	100%
Develop funding plan for $80K Road Sealing	90%
Formalize special assessment collection for reserves	95%
Average	95%

Investment Planning — Recommendations

Recommendation	Concurrence
Critical Reserve Shortfall for March Project	90%
Reinvest Maturing CD #2a at Higher Rate	95%
Establish 12-Month CD Ladder	75%
Deploy Operating Cash to HY Savings	90%
Optimize Reserve Cash Post-Project	85%
Formalize Special Assessment Collection	90%
Average	88%

Final Grades

Feature	Score Accuracy	Recommendation Quality	Data Fidelity	Consistency	Overall
Operating Fund Health	C+ (score ~15 pts high)	A (90%)	B+ (minor math phrasing)	C (16-pt spread)	72% — B-
Reserve Fund Health	A- (well-calibrated)	A (95%)	B (per-unit confusion)	B- (23-pt spread)	85% — B+
Investment Recommendations	N/A (no single score)	A (88%)	A (exact data matches)	A- (stable across runs)	88% — A-

Priority Action Items for Tuning

[HIGH] Reduce AI temperature from 0.3 → 0.1 for health score calculations to reduce score volatility
[MEDIUM] Add explicit cash-runway-to-impact mapping in operating prompt to prevent misclassification
[MEDIUM] Pre-compute total special assessment income in data context (not just per-unit)
[LOW] Adjust YTD budget calculation to use prior completed month or pro-rate current month
[LOW] Add boundary justification requirement to scoring prompts
[LOW] Consider implementing 3-run moving average for displayed health scores

Generated by Claude Opus 4.6 — Automated AI Feature Audit
