Model card distilled from JOURNEY — expand as needed.
Model card — REI · Apollo
Plain-language explainer. Numbers from the 5×4 ablation matrix + JOURNEY distillation. Full lineage: changelog.
What it predicts
For each property, the probability the owner sells within 6 months (T0+1..T0+6). This is the supervised replacement for Alpha at step 4 of the Gaia ETL. The output contract is identical to Alpha's: a 0–100 score per property within county. Status: building, pending written sign-off from Eduardo and Camilo on the locked March 2025 head-to-head.
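A minimal sketch of the label window, assuming T0+1..T0+6 means the six calendar months after the T0 month. The function names and the month-offset interpretation are illustrative assumptions, not taken from the Apollo codebase.

```python
from datetime import date
from typing import Optional

HORIZON_MONTHS = 6  # T0+1 .. T0+6, per the prediction target


def month_index(d: date) -> int:
    """Count months on a single axis so offsets work across year boundaries."""
    return d.year * 12 + (d.month - 1)


def label_sale(t0: date, sale_date: Optional[date]) -> int:
    """Return 1 if the sale falls in months T0+1..T0+6, else 0.

    Assumption: a sale inside the T0 month itself (offset 0) is not a positive.
    """
    if sale_date is None:
        return 0
    offset = month_index(sale_date) - month_index(t0)
    return 1 if 1 <= offset <= HORIZON_MONTHS else 0
```

With a hypothetical T0 of March 2025, a sale in April through September 2025 gives y=1, and a sale in March or October 2025 gives y=0.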
Who it scores (universe)
Owner-occupied and non-owner-occupied SFH + multi-family across 5 pilot counties (Maricopa AZ · Harris TX · Miami-Dade FL · Philadelphia PA · Jackson MO). The arm's-length filter is intentionally OFF during the "predict sales overall" phase, so all transactions count as y=1. The CRM-leak guard drops rows with is_crm_matched_anywindow=1 from training.
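The universe and the CRM-leak guard could be applied as a row filter along these lines. This is a sketch only: `county_fips` is a hypothetical column name and the row-as-dict layout is assumed. Only `is_crm_matched_anywindow` and the FIPS codes come from this card.

```python
from typing import Any, Iterable, Iterator, Mapping

# FIPS codes of the 5 pilot counties, as listed in the performance table.
PILOT_COUNTY_FIPS = {"04013", "48201", "12086", "42101", "29095"}


def training_rows(rows: Iterable[Mapping[str, Any]]) -> Iterator[Mapping[str, Any]]:
    """Yield rows that are eligible for training.

    Keeps pilot-county rows and drops any row flagged by the CRM-leak guard.
    No arm's-length filtering is applied, matching the current phase.
    """
    for row in rows:
        if row["county_fips"] not in PILOT_COUNTY_FIPS:  # hypothetical column
            continue
        if row.get("is_crm_matched_anywindow") == 1:
            continue
        yield row
```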
Architecture
| Component | Choice | Why |
|---|---|---|
| Algorithm | HistGradientBoosting | Wins 3 of 5 counties in 5×4 ablation; never loses by a meaningful margin vs LightGBM |
| Calibration | Isotonic regression | Held-out non-downsampled slice (~60K rows/county) |
| Folds | Walk-forward 6 folds | T0 boundary strictly enforced; no future leakage |
| Features | 177 (481 raw → 117 curated in v8) | Sparse/constant/leaky dropped |
| Downsampling | Case-control, prior-corrected | Class imbalance; true-prior correction at calibration |
Performance vs Alpha (as of 2026-05-08)
| County | Apollo Lift @top-1% | Status |
|---|---|---|
| Jackson (29095) | 7.87× | Statistically distinguishable |
| Harris (48201) | 6.92× | Statistically distinguishable |
| Maricopa (04013) | 3.43× | Statistically distinguishable |
| Miami-Dade (12086) | 1.22× ± 0.27 | Inside 95% CI of 1.0× |
| Philadelphia (42101) | 1.12× ± 0.21 | Inside 95% CI of 1.0× |

vs Alpha: 3.03× geomean across the 5 counties.
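
The 3.03× geomean can be reproduced from the five per-county lifts in the table, and the two "inside 95% CI" rows can be checked the same way. The sketch below assumes the ± values are 95% CI half-widths. The helper names are illustrative.

```python
import math
from typing import Iterable

# Per-county lift at top 1%, copied from the table above.
lift = {
    "Jackson": 7.87,
    "Harris": 6.92,
    "Maricopa": 3.43,
    "Miami-Dade": 1.22,
    "Philadelphia": 1.12,
}
# Assumed to be 95% CI half-widths (the "±" values in the table).
half_width = {"Miami-Dade": 0.27, "Philadelphia": 0.21}


def geomean(values: Iterable[float]) -> float:
    """Geometric mean computed as exp of the mean log."""
    vals = list(values)
    return math.exp(sum(math.log(v) for v in vals) / len(vals))


def interval_contains_parity(point: float, hw: float, parity: float = 1.0) -> bool:
    """True if parity (1.0x) lies inside point ± hw."""
    return point - hw <= parity <= point + hw


print(round(geomean(lift.values()), 2))  # 3.03
for county, hw in half_width.items():
    print(county, interval_contains_parity(lift[county], hw))  # True for both
```

The geometric mean is the natural summary here because lifts are ratios: the two strongest counties pull the average up multiplicatively, so the 3.03× figure should be read alongside the per-county rows rather than in place of them.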
Open audit items
- Scenario A (FLAG): Recency-feature leakage on embargoed Fold 5 — pending V2 ablation.
- Head-to-head gate: Locked March 2025 cohort gated on written sign-off from Eduardo and Camilo.
- Miami / Philly: Lift inside noise floor — root cause under investigation (feature coverage vs Alpha's distress signals).
Run / artifacts
scripts/train_fold.py (single fold + county). Multi-arch sweep: scripts/train_fold_arch.py. Feature builder: src/new_model/features.py. Full lineage: changelog.