How To Be the Book: NFL Spread & Total Handicapping Reference

A research-backed series on how Las Vegas oddsmakers and professional quant bettors actually price and predict NFL games — written specifically so we can upgrade this repo's prediction engine to perform like the people who do it for a living.

Why this series exists

Reading this after the recent rework? Doc 7's "Where we are today" section and its phase statuses are reconciled with it (Phase 4 is now partially done). Start there for the current state and the still-open priorities (QB lever, the probability layer, de-vig/CLV).

Our current prediction engine is an additive points-adjustment model: an opponent-aware, defense-weighted "envelope" baseline plus a rush/pass scheme matchup and ~15 small capped nudges (home edge, coaching, rookies, rest, DVOA, power rank, line play, etc.), with a score-dispersion knob and per-defense clamps. That is a solid handicapper's worksheet, and recent work already moved it toward this series (opponent-aware, defense-weighted, variance-calibrated). But it is still not how the market is built, and it lacks the three things that separate real operators from hobbyists:

A power-rating → spread identity with opponent adjustment and a proper home-field constant.
A probability layer: converting a predicted margin or total into win / cover / over probabilities using the empirical distribution of NFL outcomes (margin σ ≈ 13.5) and the key-number spikes at 3 and 7.
A market-aware evaluation loop: measuring ourselves against the closing line (CLV), not against the scoreboard, and only betting when we disagree with the no-vig market by enough to clear the vig.
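As a rough illustration of the first two items, the sketch below turns two power ratings into a predicted margin and then into win and cover probabilities. It is a minimal sketch, not the engine's code: the function names are hypothetical, the home-field constant of 2.0 is an assumed value inside the ~1.5–2.5 range quoted in this README, and the Normal approximation deliberately ignores pushes and the key-number spikes at 3 and 7 that an empirical distribution would capture.

```python
from statistics import NormalDist

MARGIN_SD = 13.5  # SD of NFL game margin, as quoted in this README
HFA = 2.0         # ASSUMPTION: illustrative home-field constant, in points


def predicted_margin(home_rating, away_rating, hfa=HFA):
    """Rating -> spread identity: expected home-minus-away margin in points."""
    return home_rating - away_rating + hfa


def win_probability(margin, sd=MARGIN_SD):
    """P(home wins) if the actual margin is Normal(margin, sd)."""
    return 1.0 - NormalDist(mu=margin, sigma=sd).cdf(0.0)


def cover_probability(margin, home_spread, sd=MARGIN_SD):
    """P(home covers), where home_spread is the home line (-3.0 = home favored by 3).

    The home team covers when actual margin + home_spread > 0.
    Pushes are ignored because the Normal model is continuous.
    """
    return 1.0 - NormalDist(mu=margin, sigma=sd).cdf(-home_spread)


if __name__ == "__main__":
    m = predicted_margin(home_rating=3.0, away_rating=1.0)  # 4.0 points
    print(m, win_probability(m), cover_probability(m, home_spread=-3.0))
```

With these assumptions, a 4-point predicted margin maps to a win probability of roughly 0.62. A predicted margin exactly equal to the line gives a cover probability of 0.5.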

The market is one of the most efficient prediction systems in the world. The goal is not to "beat Vegas at predicting scores" — it's to (a) reproduce the market's accuracy as a baseline, then (b) find the small, specific spots where we have incremental information the closing line hasn't absorbed yet.
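The market-aware loop described above can be sketched in a few lines. This is a hedged illustration only: the function names are hypothetical, the de-vig shown is the simple proportional method (one of several in use), and CLV is measured here in points of spread rather than in price.

```python
def implied_probability(american_odds):
    """Raw implied probability of an American price (still includes the vig)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)


def no_vig_probabilities(odds_a, odds_b):
    """Proportional de-vig: rescale both sides so the probabilities sum to 1."""
    pa = implied_probability(odds_a)
    pb = implied_probability(odds_b)
    total = pa + pb
    return pa / total, pb / total


def has_edge(model_prob, offered_odds, min_edge=0.0):
    """True when the model probability beats the break-even probability of the
    offered price by more than min_edge (a hypothetical safety margin)."""
    return model_prob - implied_probability(offered_odds) > min_edge


def clv_points(bet_spread, closing_spread):
    """Closing Line Value in points for the side we bet.

    Both arguments are that side's line (e.g. -2.5 or +3.5).
    Positive means we got a better number than the close.
    """
    return bet_spread - closing_spread


if __name__ == "__main__":
    print(no_vig_probabilities(-110, -110))   # fair 50/50 market
    print(has_edge(0.55, -110))               # clears the 52.38% bar
    print(clv_points(-2.5, -3.5))             # bet -2.5, closed -3.5
```

In this sketch a bet at −2.5 that closes at −3.5 has +1.0 points of CLV, regardless of whether the game itself won or lost.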

The documents

#	Doc	What it covers
1	`01-bookmaking-fundamentals.md`	How books originate and move lines, the vig, balanced-book myth, sharp vs. square money, Closing Line Value (CLV), market efficiency
2	`02-predicting-spreads.md`	Power ratings, the rating→spread identity, Elo (538 spec), home-field advantage, key numbers, spread↔probability conversion, schedule/QB adjustments
3	`03-predicting-totals.md`	Drives × points-per-drive, pace, efficiency, weather (wind is king), game script, scoring-era shifts
4	`04-variable-catalog.md`	Ranked catalog of every predictive input (EPA, DVOA, ANY/A, Pythagorean, turnovers, rest, HFA…) with predictive value and regression caveats
5	`05-modeling-methods.md`	Ridge power ratings, Elo/Glicko, Bayesian state-space, ML pitfalls, walk-forward validation, calibration, market-aware blending
6	`06-betting-strategy-bankroll.md`	Edge identification, de-vigging, Kelly & fractional Kelly, CLV tracking, why accuracy ≠ profit
7	`07-engine-improvement-roadmap.md`	Concrete, phased plan mapping all of the above to our actual files, weights, and backtester
8	`08-clv-and-odds-data.md`	Closing Line Value: where to get opening/closing odds (free + paid), and the capture / import / CLV-backtest tools

The five numbers to memorize

Quantity	Value	Why it matters
Break-even win rate at −110	52.38%	Below this you lose money long-term; this is the bar
SD of NFL game margin	σ ≈ 13.5 (13.45–13.86)	Converts a predicted margin into win/cover probability
SD of NFL game total	≈ 10	Converts a predicted total into over/under probability
Most common margin	3 pts (~15%), then 7 (~9%)	Key numbers; half-points across 3 & 7 are worth ~3–4% each
Modern home-field edge	~1.5–2.5 pts (was ~3)	Our weight cap of 2.5 is at the high end; market prices ~1.5
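To show how two of these numbers combine, the sketch below computes the −110 break-even rate and an over probability for a total. It assumes a Normal model with the SD of about 10 from the table; the names `over_probability` and `clears_bar` are hypothetical and are not taken from the engine.

```python
from statistics import NormalDist

TOTAL_SD = 10.0               # SD of NFL game total, as listed in the table
BREAK_EVEN = 110.0 / 210.0    # risk 110 to win 100 -> 0.5238...


def over_probability(predicted_total, line, sd=TOTAL_SD):
    """P(actual total > line) if the total is Normal(predicted_total, sd)."""
    return 1.0 - NormalDist(mu=predicted_total, sigma=sd).cdf(line)


def clears_bar(prob, bar=BREAK_EVEN):
    """True when a probability beats the break-even win rate at -110."""
    return prob > bar


if __name__ == "__main__":
    for predicted in (45.0, 47.0):
        p = over_probability(predicted, line=44.5)
        print(predicted, round(p, 3), clears_bar(p))
```

Under these assumptions, a half-point disagreement with a 44.5 total gives an over probability of about 0.52, which does not clear the bar. A 2.5-point disagreement gives about 0.60, which does.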

How to read this as an engineer

Each doc ends with an "Implications for our engine" section that translates theory into specific code changes. Doc 7 consolidates those into a build sequence. Start with the README and Doc 7 if you want the action plan; read 1–6 for the why behind each change.

Scope / disclaimer. This is a modeling and analytics reference for our own prediction engine and educational use. It documents how legal, regulated sportsbooks price markets and how public research evaluates those markets. It is not gambling advice.

Source quality note

The strongest claims (margin distributions, σ ≈ 13.5, the 52.38% bar, bye-week edge collapse, EPA/turnover stability) come from academic papers (arXiv, JASA, Frontiers) and established analytics outlets. Mechanics (vig, sharp/square, CLV, market-maker structure) are corroborated across multiple industry sources and the canonical books. Each doc cites its sources inline. Vendor claims of the "58% ATS" (against the spread) type are flagged as marketing, not audited results.

Canonical books referenced throughout

Ed Miller & Matthew Davidow — The Logic of Sports Betting (2019) — how books actually work, price discovery, CLV.
Stanford Wong — Sharp Sports Betting (2001) — point spreads, totals, key numbers, half-point values.
Wayne Winston — Mathletics (2009) — margin ≈ Normal(mean = line, σ = 13.86), least-squares power ratings.
King Yao — Weighing the Odds in Sports Betting (2007) — line movement, scalping/middling, hedging.
Joseph Buchdahl — Squares & Sharps, Suckers & Sharks (2016) — efficiency, luck vs. skill, wisdom of crowds.

Foundational papers

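Glickman & Stern (1998), A State-Space Model for National Football League Scores, JASA — Bayesian dynamic ratings; the σ ≈ 13.86 figure itself traces to Stern (1991), On the Probability of Winning a Football Game, The American Statistician.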
Dixon & Coles (1997), Modelling Association Football Scores…, JRSS-C — score modeling + market inefficiency template.
Levitt (2004), Why are gambling markets organised so differently…, Economic Journal — books shade lines to public bias.
Benter (1994), Computer Based Horse Race Handicapping… — market-aware blending + Kelly; the blueprint.
Kelly (1956), A New Interpretation of Information Rate — the Kelly criterion.