04 - Variable Catalog: What Actually Predicts NFL Games
TL;DR. Ranked by out-of-sample predictive value (what matters for forecasting future games, not describing past ones): **EPA/play (pass-weighted) > QB identity > ANY/A > opponent-adjusted point differential / Pythagorean > DVOA + success rate > pace/PPD/field position > wind > rest/HFA > turnovers / red-zone / fumble luck (regress hard) > special teams / O-line / coaching (mostly captured indirectly). The recurring theme: offense is more stable than defense, and the noisy stuff (turnovers, RZ TD%, fumble recoveries, close-game records) must be regressed toward the mean. Use it to flag regression, not to project.**
This is the menu of inputs. For each: what it is, how predictive, and the caveats that govern how much to trust it.
Tier 1: Highest predictive value
1. EPA per play (offense & defense, pass/rush split)
The most-cited single predictor. Passing EPA carries far more weight than rushing EPA; it is the backbone of modern power ratings.
- Offense is more stable than defense. Year-to-year correlation: off EPA/play r ≈ 0.377 vs. def EPA/play r ≈ 0.322. Every defensive correlation is lower than its offensive counterpart, so weight offense more and regress defense more.
- Caveat: it's a team/unit metric (can't cleanly isolate a player); it needs sample to stabilize; and the EP baseline drifts yearly with the scoring environment.
2. QB identity / QB injury
The largest discrete swing variable. Elite QB ≈ 5–7 pts on the spread; non-elite starter 3–4; backup downgrade swings 3–7 pts depending on drop-off.
- Caveat: once announced, the market prices it in seconds, so there is no residual edge from the obvious move. The edge is in anticipating it and correctly valuing the backup.
3. ANY/A (Adjusted Net Yards per Attempt)
Best simple passing/QB metric. ANY/A = (yds + 20·TD − 45·INT − sack_yds) / (att + sacks).
Correlation with team wins ≈ 0.67; the higher-ANY/A team scored more in ~87% of games one season. Caveat: descriptive correlation, partly an EPA proxy.
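As a minimal sketch, the ANY/A formula above can be written as a small Python function. The function and argument names are illustrative assumptions, not names from our engine, and the season line in the example is made up.

```python
def any_a(pass_yards, pass_tds, interceptions, sack_yards, attempts, sacks):
    """Adjusted Net Yards per Attempt, following the formula in this section."""
    dropbacks = attempts + sacks
    if dropbacks <= 0:
        raise ValueError("need at least one pass attempt or sack")
    # TDs earn a 20-yard bonus, INTs cost 45 yards, sack yardage is subtracted.
    adjusted_yards = pass_yards + 20 * pass_tds - 45 * interceptions - sack_yards
    return adjusted_yards / dropbacks


# Hypothetical season line, for illustration only.
example = any_a(pass_yards=4000, pass_tds=30, interceptions=10,
                sack_yards=200, attempts=550, sacks=30)
print(round(example, 2))  # 3950 / 580, about 6.81
```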
4. Pythagorean expectation
Point differential predicts future wins better than W-L record. NFL exponent
2.37: PF^2.37 / (PF^2.37 + PA^2.37). Teams whose record outran their
differential regress down ~2 wins; underperformers gain ~1.2.
Caveat: adjust point differential for opponent and garbage time first.
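The Pythagorean formula above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the `games=17` default is an assumption about the current regular-season length, and the inputs should already be opponent- and garbage-time-adjusted per the caveat.

```python
def pythagorean_win_pct(points_for, points_against, exponent=2.37):
    """Expected win fraction from points scored and allowed (NFL exponent 2.37)."""
    if points_for < 0 or points_against < 0:
        raise ValueError("point totals must be non-negative")
    pf = points_for ** exponent
    pa = points_against ** exponent
    if pf + pa == 0:
        raise ValueError("need at least some points scored or allowed")
    return pf / (pf + pa)


def win_gap(actual_wins, points_for, points_against, games=17):
    """Actual wins minus Pythagorean expected wins.

    A large positive gap flags a team whose record outran its differential.
    """
    expected = games * pythagorean_win_pct(points_for, points_against)
    return actual_wins - expected


# Hypothetical team: 12 wins on an even point differential.
print(win_gap(actual_wins=12, points_for=350, points_against=350))  # 3.5
```

A positive gap is a regression flag, in line with the note above that teams whose record outran their differential tend to fall back.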
Tier 2: Strong, with structure/adjustment required
5. DVOA (opponent-adjusted efficiency)
Per-play value vs. a baseline for the same down, distance, and field position, then opponent-adjusted; RZ and late-close plays are weighted more.
- Success baselines: 1st down ≥45% of needed yds, 2nd ≥60%, 3rd/4th = convert.
- Companions: DAVE (blends a preseason prior with in-season DVOA early in the year), Weighted DVOA (recency-weighted).
- Caveat: opponent adjustments are unreliable until ~Week 4–6; lean on the prior before then. (We already ingest DVOA into the prediction engine.)
6. Success rate
Binary "stayed on schedule" per play; lower variance than EPA, so it stabilizes faster and makes a good early-season signal. Best paired with EPA (frequency vs. magnitude).
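A minimal sketch of the binary success flag, using the success baselines listed under DVOA above (1st down ≥45% of needed yards, 2nd ≥60%, 3rd/4th must convert). The play representation as `(down, yards_to_go, yards_gained)` tuples is an assumption for illustration, not our engine's schema.

```python
def is_success(down, yards_to_go, yards_gained):
    """Return True if the play kept the offense on schedule."""
    if down == 1:
        return yards_gained >= 0.45 * yards_to_go
    if down == 2:
        return yards_gained >= 0.60 * yards_to_go
    if down in (3, 4):
        # Late downs only count if they convert.
        return yards_gained >= yards_to_go
    raise ValueError(f"invalid down: {down}")


def success_rate(plays):
    """Fraction of successful plays; plays are (down, yards_to_go, yards_gained)."""
    if not plays:
        raise ValueError("no plays supplied")
    return sum(is_success(*play) for play in plays) / len(plays)


# Hypothetical four-play drive: two of the four plays succeed.
drive = [(1, 10, 5), (2, 5, 2), (3, 3, 4), (1, 10, 1)]
print(success_rate(drive))  # 0.5
```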
7. Strength of schedule / opponent adjustment
Not predictive alone, but a necessary correction to every raw efficiency stat. Unadjusted EPA/DVOA/point-diff overrate teams that played weak slates.
8. Rest / travel / situational
Small and mostly priced. Bye edge collapsed post-2011: +2.21 → +0.31 pts. Short-week (<6 days) road after Week 6: 43.9% win / 47.4% ATS. West→East time-zone shifts affect cognition (hard to quantify). (Full table in Doc 2 §8.)
9. Home-field advantage
Shrunk: ~2.5 pts long-run, ~1.5 in recent seasons; home win rate fell from ~57–60% to ~52–53% since 2019. Use a current, decaying, ideally venue-specific constant, not the historical 3. (Doc 2 §6.)
Tier 3: Real but high-variance / regress heavily / second-order
10. Turnovers / turnover margin: the classic trap
Important for past outcomes, nearly useless for prediction.
- Year-to-year turnover-margin correlation ≈ 0.10 (R² ≈ 0.01). A +20 team
projects to just +2.2 next year.
- ~46% skill / 54% luck; fumble recovery is ~50/50 with ~zero carryover skill.
- Rule: regress turnover margin hard toward zero. Use it to identify regression
candidates, not to project.
- Our engine: a turnovers_scale of 0.2, cap 0.8, which is already modest and
appropriate. Don't increase it. Confidence also leans on turnover differential;
consider down-weighting given its noise.
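The regression rule above can be sketched as simple shrinkage toward the league mean. Treating the year-to-year correlation as the carryover fraction is a simplifying assumption, and `regress_to_mean` is a hypothetical helper, not part of our engine. The +20 to +2.2 example implies a carryover of about 0.11, consistent with the rounded r ≈ 0.10.

```python
def regress_to_mean(observed, carryover, league_mean=0.0):
    """Shrink an observed stat toward the league mean.

    carryover is the fraction of the deviation expected to persist
    (assumed here to equal the year-to-year correlation).
    """
    if not 0.0 <= carryover <= 1.0:
        raise ValueError("carryover must be between 0 and 1")
    return league_mean + carryover * (observed - league_mean)


# Turnover margin: a +20 team keeps only about +2.2 of it.
print(round(regress_to_mean(20, carryover=0.11), 2))  # 2.2

# The same helper with the EPA/play correlations quoted in Tier 1:
# a hypothetical +0.10 deviation keeps more on offense than on defense.
print(round(regress_to_mean(0.10, carryover=0.377), 4))  # 0.0377
print(round(regress_to_mean(0.10, carryover=0.322), 4))  # 0.0322
```

The point of one shared helper is that only the carryover differs by stat: near zero for turnover margin, much larger for offensive EPA.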
11. Red-zone efficiency
Strong descriptively, but regresses ~11–12%/yr for top teams. Use as a
finishing-drives adjustment with heavy regression, not a raw input.
(Our redzone_scale is 1.0; make sure the underlying input is regressed.)
12. Special teams / field position
Small "hidden" yardage: ~0.03 EP/yard; +1 net punt yard ≈ +7.3 pts/season. Biggest single-drive predictor is starting field position. Non-trivial in close games; small share of total team value.
13. O-line / non-QB skill / individual defenders
Modest spread impact: skill players ~0.5–2.5 pts, defenders ~0.5–1 pt. O-line matters mainly via QB efficiency / sack rate, not as a standalone input, which is the right way to model it (we have line-play metrics).
14. Coaching / situational tendencies
Pace philosophy, pass-rate-over-expectation, 4th-down aggressiveness, blitz rate.
Hard to quantify directly; best captured indirectly through the efficiency and
pace metrics they produce. (We have a coaching weight at 0.6; keep it small.)
Cross-cutting: recency weighting & priors
- Recency weighting is standard. Recent games carry more weight (injuries, scheme, personnel). Methods: exponential decay, weight = 1/(weeks_ago + 0.4), or linear (7, 6, 5, …). One optimization found the most recent ~5 weeks of spreads minimized MSE.
- Early-season priors. Blend current results with a long-run prior (prior 3 seasons' ratings + market-implied odds), à la DAVE; the first ~4 weeks are too noisy to trust raw.
- Regression to the mean governs trust per stat: strongest for turnovers, fumble recoveries, RZ TD%, close-game records (regress aggressively); weakest (most "real") for offensive pass EPA and ANY/A (trust more).
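The three recency-weighting schemes named above can be sketched side by side. Several details are assumptions for illustration: `weeks_ago` is taken to be 0 for the most recent game, the exponential `decay=0.9` is a placeholder rather than a tuned value, and all function names are hypothetical.

```python
def exponential_weights(n_games, decay=0.9):
    """Weight decay**weeks_ago; index 0 is the most recent game."""
    return [decay ** weeks_ago for weeks_ago in range(n_games)]


def reciprocal_weights(n_games, offset=0.4):
    """Weight 1 / (weeks_ago + 0.4), as quoted in the text."""
    return [1.0 / (weeks_ago + offset) for weeks_ago in range(n_games)]


def linear_weights(n_games, top=7):
    """Weights 7, 6, 5, ... floored at zero for older games."""
    return [max(top - weeks_ago, 0) for weeks_ago in range(n_games)]


def weighted_average(values, weights):
    """Recency-weighted mean of per-game values (most recent first)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical per-game EPA/play, most recent game first.
epa = [0.15, 0.05, -0.02, 0.10, 0.00]
for scheme in (exponential_weights, reciprocal_weights, linear_weights):
    print(scheme.__name__, round(weighted_average(epa, scheme(len(epa))), 4))
```

The reciprocal scheme front-loads the latest game far more than the other two, so it reacts fastest to injuries or scheme changes and is also the noisiest.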
The ranked summary (out-of-sample predictive value)
| Rank | Variable | Use it for | Regress? |
|---|---|---|---|
| 1 | EPA/play, pass-weighted (off > def) | core team strength | mild; regress def more |
| 2 | QB identity / starter-backup delta | spread (3–7 pts) | n/a (it's an event) |
| 3 | ANY/A | simple passing strength | mild |
| 4 | Opp-adj point diff / Pythagorean (exp 2.37) | record-independent strength | adjust opp + garbage time |
| 5 | DVOA + success rate | efficiency + stability | lean on prior pre-Wk 6 |
| 6 | Pace / PPD / field position | totals core | mild |
| 7 | Wind ≥15–20 mph | totals (−2.7 at 20+) | n/a (forecast) |
| 8 | Rest / HFA | small spread nudges | use current values |
| 9 | Turnovers / RZ TD% / fumble luck | regression flags only | hard |
| 10 | ST / O-line / non-QB skill / coaching | second-order, via efficiency | n/a |
Net guidance for our weights file. Our biggest missing inputs are #2 (QB delta) and a proper #1/#4 opponent-adjusted rating spine. Our existing situational weights (turnovers, rest, divisional, coaching) are appropriately small and should stay small. The mistake to avoid is over-trusting Tier-3 noise.
Sources: EPA stability, ANY/A, Pythagorean, DVOA methods, turnover randomness, turnover margin, red zone regression, field position / ST, QB/player values, recency/priors, bye-week academic.
→ Continue to Doc 5: Modeling Methods.