A look inside Apollo — the supervised sale-probability model that replaces Alpha, 8020REI's legacy 25-signal heuristic. It scores every residential parcel across five counties on its odds of selling in the next six months.
Apollo asks a different question than the permit models: not “will this home get work done?” but “will it sell?” Its thesis is that the owner’s situation (distress, leverage, absentee status, tenure) drives a sale more than the property itself, so its largest feature family by far is owner-and-distress (48 of ~96). It scores 4.05 million parcels and, at the top 1% of each county, beats Alpha by a geomean 3.03× across all five counties, and by 5.49× across the three where Alpha was weakest.
Driver clusters · LightGBM gain, aggregated by signal
What the model actually weights.
Owner+distress · 42%
Property · 20%
Valuation · 16%
Macro/local · 14%
Dates
Approximate share of clustered gain across the top features · the five named clusters ≈ by tier of total gain
01
Owner & distress (Tier B)
48 feats
The largest family by design — 23 distress trajectories, absentee level (owner-occupied / in-state / out-of-state), leverage, ownership length. Apollo’s core bet: who owns it and what situation they’re in beats what the house is.
Property
Silver parcel attributes: use type, beds/baths, sqft, lot, flood zone, cash-buyer flag. The rank-1 single feature (property age in Miami) lives here, but it is a buy-box separator, not a deal-motivation signal.
Macro/local
National rates and indices (FRED mortgage rate, Fed funds, HPI, CPI, unemployment) plus local context (county unemployment, ACS income, FHFA state HPI). Sets the tide every property floats on.
Dates
Mortgage age, prev-sale recency (2 features, under leakage audit) and optional permit signals (toggle). Small and carefully bounded.
mortgage_age · prev_sale_recency · permit_signals
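A minimal sketch of how date-diff features like these could be computed as of T0. The input field names (`mortgage_date`, `prev_sale_date`) and the rule of treating any date after T0 as missing are assumptions made for illustration, not Apollo's confirmed pipeline.

```python
from datetime import date

def years_since(event_date, t0):
    """Elapsed years from event_date to T0; None if missing or dated after T0."""
    if event_date is None or event_date > t0:
        # Assumption: a date after T0 is future information, so drop it.
        return None
    return (t0 - event_date).days / 365.25

def date_features(parcel, t0):
    """Build the two date-diff features from a parcel record (hypothetical fields)."""
    return {
        "mortgage_age": years_since(parcel.get("mortgage_date"), t0),
        "prev_sale_recency": years_since(parcel.get("prev_sale_date"), t0),
    }

# Illustrative record: the sale date falls after T0, so it is masked out.
example = date_features(
    {"mortgage_date": date(2020, 1, 1), "prev_sale_date": date(2025, 1, 1)},
    t0=date(2024, 1, 1),
)
```

Masking post-T0 dates is one simple guard against the kind of leakage these two features are being audited for; the real audit may apply stricter rules.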
Read by tier, not feature. Apollo’s features partition into seven tiers (A–G); the bar shares above are indicative weights by tier emphasis, not a single LightGBM gain dump (per-county models weight tiers differently). Owner-and-distress is the largest family in every county. 25 of the features are hand-engineered synthetics.
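The roll-up from per-feature gain to cluster shares can be pictured with the sketch below. It is illustrative only: the feature names, gain values, and the `FEATURE_CLUSTER` mapping are hypothetical stand-ins, not Apollo's features or numbers.

```python
from collections import defaultdict

# Hypothetical feature -> cluster mapping (illustrative names only).
FEATURE_CLUSTER = {
    "distress_trend_12m": "owner_distress",
    "absentee_level": "owner_distress",
    "property_age": "property",
    "est_value": "valuation",
    "mortgage_rate_30y": "macro_local",
    "mortgage_age": "dates",
}

def cluster_shares(feature_gain, feature_cluster):
    """Sum per-feature gain by cluster and normalise to shares of total gain."""
    totals = defaultdict(float)
    for feature, gain in feature_gain.items():
        totals[feature_cluster[feature]] += gain
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("total gain is zero")
    return {cluster: total / grand_total for cluster, total in totals.items()}

# Made-up gain values for one county model.
example_gain = {
    "distress_trend_12m": 30.0,
    "absentee_level": 10.0,
    "property_age": 20.0,
    "est_value": 20.0,
    "mortgage_rate_30y": 15.0,
    "mortgage_age": 5.0,
}
shares = cluster_shares(example_gain, FEATURE_CLUSTER)
```

With a trained LightGBM booster, the per-feature gains would typically come from `Booster.feature_importance(importance_type="gain")` paired with `Booster.feature_name()`. Because per-county models weight tiers differently, a roll-up like this would be run once per county rather than on a pooled dump.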
How the model finds the next sale
Field Guide · 1
May 2026
Headline numbers
How well it actually works.
3.03× · Geomean lift vs Alpha · top 1%, all 5 counties
5.49× · Lift vs Alpha · signal-3 · Maricopa, Harris, Jackson
4.05M · Parcels scored · T0 = 2025-09
~96 · Features in training · 117 in production
Apollo replaces Alpha, a hand-weighted 25-signal heuristic. Across the five counties it is 3.03× more efficient than Alpha at the top 1%; where Alpha was already strong (Miami, Philly) the edge is small, but where Alpha was weak (Jackson 7.9×, Harris 6.9×, Maricopa 3.4×) the model is several times better.
Different target, different baseline. Apollo predicts sales, not permits — its baseline is Alpha (the incumbent heuristic), not a random draw. The arms-length filter is intentionally off this phase (every sale counts). The locked March-2025 head-to-head test is untouched until sign-off; numbers here are walk-forward.
Per-county edge vs Alpha
Biggest gains where Alpha was weakest.
Model ÷ Alpha lift ratio at top 1%, per county. Significant (>2×) on Maricopa, Harris, Jackson; Miami and Philly within noise — Alpha was already strong there.
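A rough sketch of how a per-county Model ÷ Alpha ratio and its geometric mean could be computed. It assumes "lift at top 1%" means sales captured in the top 1% of parcels by score; since both rankings share the same slice size and base rate, the ratio of lifts reduces to a ratio of hit counts. The data below is a toy example, not Apollo output.

```python
import math

def top_k_hits(scores, sold, fraction=0.01):
    """Count actual sales among the top `fraction` of parcels ranked by score."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(sold[i] for i in ranked[:k])

def lift_ratio(model_scores, alpha_scores, sold, fraction=0.01):
    """Model / Alpha: ratio of sales captured in each ranking's top slice."""
    model_hits = top_k_hits(model_scores, sold, fraction)
    alpha_hits = top_k_hits(alpha_scores, sold, fraction)
    if alpha_hits == 0:
        raise ValueError("Alpha captured no sales in the top slice; ratio undefined")
    return model_hits / alpha_hits

def geomean(ratios):
    """Geometric mean of positive ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Toy county: 200 parcels, so the top 1% is 2 parcels; parcels 0 and 1 sold.
n = 200
sold = [1, 1] + [0] * (n - 2)
model_scores = [n - i for i in range(n)]   # ranks parcels 0 and 1 on top
alpha_scores = list(model_scores)
alpha_scores[1] = 0                        # Alpha misses parcel 1
toy_ratio = lift_ratio(model_scores, alpha_scores, sold)
```

A geometric mean is the natural average for ratios: a county at 2× and one at 8× average to 4×, so a single large ratio does not dominate the way it would in an arithmetic mean.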
Audit findings
What we checked, changed, and left open.
Clean
Walk-forward folds enforce the T0 boundary; features computable from the T0 snapshot alone. 69/69 sanity checks pass.
Validated
Triple audit (2026-04-23) closed 9 of 10 ship-blockers. Feature set cut from 481 raw candidates to ~96 in training.
Pending
Tier D date-diffs (mortgage age, prev-sale recency) under leakage audit. Tier F local-market features not wired by default. Arms-length filter is a planned second pass.
Closed
Locked March-2025 head-to-head is the Phase-4 gate — untouched until Eduardo + Camilo sign off in writing.
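For readers unfamiliar with walk-forward validation, the sketch below shows one way a T0 boundary can be enforced. The six-month label window comes from the overview; the cutoff dates, the half-open window convention, and the function names are assumptions for illustration and are not taken from Apollo's code.

```python
from datetime import date

HORIZON_MONTHS = 6  # six-month sale window, as stated in the overview

def add_months(d, months):
    """Shift a first-of-month date forward by whole months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

def sold_within_horizon(sale_dates, t0):
    """Label: 1 if any sale falls in [T0, T0 + horizon), else 0."""
    end = add_months(t0, HORIZON_MONTHS)
    return int(any(t0 <= s < end for s in sale_dates))

def walk_forward_folds(cutoffs, min_train=1):
    """Yield (train_cutoffs, test_cutoff) pairs.

    A training cutoff is kept only if its label window has closed on or
    before the test T0, so no outcome observed after the test T0 can
    reach training.
    """
    cutoffs = sorted(cutoffs)
    for test_t0 in cutoffs:
        train = [t for t in cutoffs if add_months(t, HORIZON_MONTHS) <= test_t0]
        if len(train) >= min_train:
            yield train, test_t0

# Hypothetical half-yearly cutoffs, chosen only to exercise the logic.
cutoffs = [date(2023, 1, 1), date(2023, 7, 1), date(2024, 1, 1), date(2024, 7, 1)]
folds = list(walk_forward_folds(cutoffs))
```

The earliest cutoff never appears as a test fold because nothing precedes it; each later fold trains only on cutoffs whose outcomes were fully known by its own T0.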
Anti-signals
What pulls a score down.
Lowers sale odds: Owner-occupied, low distress · Long stable tenure · Low leverage / high equity
Market headwind: High mortgage rates · Falling local HPI · Low listing activity
Not in scope: Outside the 5 counties · Non-residential parcel · Arms-length-only (deferred)
Beats Alpha, not random. Apollo’s job is to out-rank the incumbent heuristic on sale probability. Directions above are indicative (per-county models differ). The win condition is top-decile recall ≥ Alpha and ≥ Camilo with 30/60/90-day calibration, on the locked test.
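The two ingredients of the win condition can be sketched as follows. This is a generic illustration with toy data: the equal-width binning scheme and the function names are assumptions, and how Apollo defines its 30/60/90-day calibration in detail is not specified here.

```python
def top_decile_recall(scores, sold):
    """Share of all actual sales that fall in the top 10% of parcels by score."""
    total_sales = sum(sold)
    if total_sales == 0:
        raise ValueError("no sales in the evaluation set")
    k = max(1, len(scores) // 10)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(sold[i] for i in ranked[:k]) / total_sales

def calibration_table(probs, sold, n_bins=5):
    """Per non-empty bin: (mean predicted probability, observed sale rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, sold):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if members:
            n = len(members)
            mean_p = sum(p for p, _ in members) / n
            rate = sum(y for _, y in members) / n
            rows.append((mean_p, rate, n))
    return rows

# Toy data: 20 parcels, 4 sales, 2 of them ranked in the top decile.
scores = [20 - i for i in range(20)]
sold = [1, 1, 0, 0, 0, 1] + [0] * 13 + [1]
recall = top_decile_recall(scores, sold)
table = calibration_table([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
```

Under this reading, the comparison against Alpha would run `top_decile_recall` on both rankings over the same locked test set, and the calibration table would be built once per horizon (30, 60, and 90 days) with that horizon's sale labels.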
Bottom line
Apollo is roughly 3× more efficient than Alpha across five counties (5.5× where Alpha was weakest), built on the bet that owner situation — distress, leverage, absentee status — drives a sale more than the property does. What it can’t see: off-market owner intent, and any market outside its five counties. Active build — the locked head-to-head test is the gate that decides it.
Apollo · 5-county · ~96 feat · T0 2025-09 · vs Alpha