01 — Bookmaking Fundamentals: How Lines Are Made and Moved

TL;DR. A sportsbook does not try to predict the score perfectly. It posts a price, lets informed money discover the true number, charges a ~4.5–4.8% commission (the vig), and manages risk by position-taking — not by perfectly balancing both sides. The closing line is the most accurate public forecast of a game that exists, because by kickoff it has absorbed all information and the highest limits. Beating that closing number — Closing Line Value (CLV) — is the single best measure of betting skill. Everything in this series serves that one idea.

1. How an opening line ("opener") is born

Every oddsmaker maintains numerical power ratings — one strength number per team in points, derived from point differential, efficiency (often EPA-based), roster talent, and strength of schedule. The opener is generated by the core identity (developed fully in Doc 2):

Projected spread = (Rating_home − Rating_away) + Home-Field Advantage

Example. Team A rates +5.2, Team B rates +2.4, A is home, HFA = 2.0: (5.2 − 2.4) + 2.0 = 4.8. A projects 4.8 points better than B, and favorites are quoted with a minus sign, so the opener is roughly A −5 (or A −4.5).
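The identity and worked example above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's actual code: the function names are hypothetical, and rounding to the nearest half point is an assumption about how a projection becomes a posted number.

```python
def projected_home_margin(rating_home: float, rating_away: float, hfa: float) -> float:
    """Expected home margin in points: rating gap plus home-field advantage."""
    return (rating_home - rating_away) + hfa


def to_posted_spread(home_margin: float) -> float:
    """Round to the nearest half point and flip the sign (favorites are quoted negative).

    Assumption: half-point rounding. Real openers are also shaded around key numbers.
    """
    return -round(home_margin * 2) / 2


margin = projected_home_margin(5.2, 2.4, 2.0)
print(round(margin, 1))          # 4.8
print(to_posted_spread(margin))  # -5.0
```

A negative result means the home team is the favorite; a positive one means the home team is getting points.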

The opener's job is accuracy and risk-minimization, not balanced action.

Look-ahead lines

"Look-ahead" (or "lookahead") lines are early numbers for next week, posted with low limits. They have a known weakness — they understate next-week injury uncertainty — which is exactly the inefficiency sharps hunt. The Westgate SuperBook is a classic originator of NFL look-aheads and the marquee prop/futures menus.

Who actually originates the number

Modern line origination is "follow the leader."

Market-maker (sharp) books originate and discover prices: Pinnacle (primary for NFL/EPL), BetCRIS, Circa, Westgate SuperBook. They post a tight opener at low limits and let informed bettors hammer it into place — price discovery.
Retail / "soft" books (DraftKings, FanDuel, most US apps) mostly copy the market-maker number off the screen and manage risk via lower limits, higher hold, and limiting/banning winners. They are not the source of truth.

Lifecycle: (1) market-makers post openers at low limits, (2) early sharp action drives discovery and rapid movement, (3) other books copy the settled line, (4) limits rise through the week (NFL steps up notably on Thursday).

Sources: Action Network, Unabated — Who Sets the Line, SCCG — Market Maker vs Retail, Covers — Lookahead Lines.

2. The vig (juice / hold / overround)

A fair coin-flip bet pays +100 each side. At the standard spread/total price of −110, you risk $110 to win $100.

Implied probability of −110 = 110 / 210 = 52.38%.
Two sides at −110 sum to 52.38% + 52.38% = 104.76%. That 4.76% overround is the vig.
Break-even win rate at −110 = 52.38%. Memorize this. It is the bar every strategy must clear.
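Theoretical hold ≈ 4.55% of handle: with one bettor on each side at −110, the book collects $220 and pays the winner $210, keeping $10 / $220. Vig (overround) and hold are related but not identical: overround is measured against fair probability, hold against money wagered. Retail books widen it with −115/−120 pricing; futures hold far more.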
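The arithmetic in this list is easy to reproduce. The sketch below uses hypothetical helper names and assumes a two-way market with balanced action when computing hold.

```python
def implied_prob(american: int) -> float:
    """Implied probability of an American price, vig included."""
    if american < 0:
        return abs(american) / (abs(american) + 100)
    return 100 / (american + 100)


def overround(price_a: int, price_b: int) -> float:
    """Amount by which the two implied probabilities exceed 100%."""
    return implied_prob(price_a) + implied_prob(price_b) - 1.0


def theoretical_hold(price_a: int, price_b: int) -> float:
    """Share of handle the book keeps if action is balanced (an assumption)."""
    total = implied_prob(price_a) + implied_prob(price_b)
    return 1.0 - 1.0 / total


print(f"{implied_prob(-110):.2%}")            # 52.38%
print(f"{overround(-110, -110):.2%}")         # 4.76%
print(f"{theoretical_hold(-110, -110):.2%}")  # 4.55%
```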

De-vigging — recovering the "fair" probability

To get the market's true probability estimate, strip the vig: convert both sides to implied probability, then normalize so they sum to 100%.

For -110 / -110:   0.5238 / (0.5238 + 0.5238) = 0.500  → fair 50/50
For -150 / +130:   p_fav = 0.600, p_dog = 0.4348, sum = 1.0348
                   fair_fav = 0.600 / 1.0348 = 57.98%
                   fair_dog = 0.4348 / 1.0348 = 42.02%

This is the number we compare our model against — never the raw quoted line. (See Doc 6 for why this matters for edge.)
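A minimal de-vig sketch, assuming the proportional normalization described above (divide each implied probability by their sum). Other de-vig methods exist and can give slightly different answers on lopsided prices; the function names here are hypothetical.

```python
def implied_prob(american: int) -> float:
    """Implied probability of an American price, vig included."""
    if american < 0:
        return abs(american) / (abs(american) + 100)
    return 100 / (american + 100)


def devig(price_a: int, price_b: int) -> tuple[float, float]:
    """Proportional de-vig: normalize both implied probabilities to sum to 1."""
    pa, pb = implied_prob(price_a), implied_prob(price_b)
    total = pa + pb
    return pa / total, pb / total


fav, dog = devig(-150, 130)
print(f"{fav:.2%} {dog:.2%}")  # 57.98% 42.02%
```

The returned pair always sums to 1, so either value can be compared directly against a model probability.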

Sources: BettingUSA — Vig, OddsJam — Vig Calculator.

3. The balanced-book myth

The textbook story — "books move lines purely to balance money so they collect risk-free vig" — is largely false in modern markets.

Marco Blume (ex-Pinnacle head trader) described their markets as "100% data science and 0% committed money": they move on information, not on which side the money is on. The logic: a book that shaded its line off the true probability just to balance action would be offering a worse number than the efficient line and inviting arbitrage. Instead sharp books take positions — they hold their number when confident, willingly carry one-sided liability against square money, and rely on the law of large numbers across many games (not balance on any one game) to realize their margin.

Implication for us: the line is best read as the market's estimate of the true outcome, lightly shaded toward public biases (Levitt 2004) — not as a balance point. Our model should treat the de-vigged close as a near-truth prior.

Sources: Underdog Chance — Market-Making vs Market-Taking, Levitt (2004), Economic Journal.

4. Sharp money vs. square money, and how lines move

Square (public) money: recreational, emotional — piles onto favorites, overs, and popular national teams. Small individual bets; large cumulative volume, especially near kickoff.
Sharp money: professional/syndicate — research-driven, bankroll-managed, value-seeking. Books track which accounts are "respected" and weight who is betting, not just how much. A line moves on a modest respected wager but may ignore large square volume.

Signals to recognize (and to engineer features from):

Signal	What it looks like	What it means
Steam move	Sudden, large, coordinated move across many books at once	Syndicate action; the market is repricing fast
Reverse line movement (RLM)	Line moves against the side with most tickets (75% of bets on A, line moves toward B)	Sharp money (fewer tickets, bigger dollars) is on the unpopular side
Line freeze	Heavy one-sided public action, line doesn't move	Book is comfortable with its number / already balanced against the public
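Two of these signals, RLM and line freeze, can be flagged from an open/close pair plus a ticket split. The sketch below is illustrative only: the function name, the 65% "heavy public" threshold, and the half-point minimum move are all assumptions, and spreads are expressed from the home side (home favorite is negative). Steam detection is omitted because it needs timestamped quotes from several books.

```python
def classify_move(ticket_share_home: float, open_spread_home: float,
                  close_spread_home: float, heavy: float = 0.65,
                  min_move: float = 0.5) -> str:
    """Label a game as 'rlm', 'line_freeze', 'with_public', or 'no_signal'."""
    # Positive move: home spread rose (e.g. -3 to -2), so the line moved toward the away team.
    move = close_spread_home - open_spread_home

    if ticket_share_home >= heavy:
        public_is_home = True
    elif ticket_share_home <= 1 - heavy:
        public_is_home = False
    else:
        return "no_signal"  # tickets are not lopsided enough to read anything into

    if abs(move) < min_move:
        return "line_freeze"  # heavy one-sided tickets, line did not move

    moved_toward_away = move > 0
    # RLM: public on home but line moved toward away, or the mirror case.
    if public_is_home == moved_toward_away:
        return "rlm"
    return "with_public"


print(classify_move(0.75, -3.0, -2.0))  # rlm
print(classify_move(0.75, -3.0, -3.0))  # line_freeze
```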

Sources: Prime — Sharp vs Public, OddsIndex — Steam Moves.

5. Closing Line Value (CLV) — the gold standard

Definition. CLV measures whether the number/price you got beat the closing line at kickoff — independent of whether the bet won. You took +3.5 and it closed +2.5? Positive CLV. You took −110 and it closed −130? Positive CLV.
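Both examples in the definition can be expressed numerically. The sketch below uses hypothetical function names and two assumed conventions: spread lines are quoted from the bettor's side (so +3.5 and −2.5 are both "your" number), and price CLV is the ratio of decimal odds taken to closing decimal odds, minus one. The price version does not remove the vig from the close.

```python
def american_to_decimal(american: int) -> float:
    """Convert an American price to decimal odds (total return per unit staked)."""
    if american < 0:
        return 1 + 100 / abs(american)
    return 1 + american / 100


def clv_points(bet_line: float, close_line: float) -> float:
    """Points of CLV on a spread, both lines from the bettor's side. Positive is good."""
    return bet_line - close_line


def clv_price(bet_price: int, close_price: int) -> float:
    """Fractional CLV on price at the same number. Positive is good."""
    return american_to_decimal(bet_price) / american_to_decimal(close_price) - 1


print(clv_points(3.5, 2.5))              # 1.0
print(f"{clv_price(-110, -130):.1%}")    # 7.9%
```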

Why it's the gold standard. By kickoff, the closing line has absorbed all information — injuries, weather, public and sharp money, the aggregate analytic effort of every participant — and the week's highest limits flow in late, sharpening it further. So:

Consistently beating the close means you extracted value before the market reached its most efficient state — the statistical signature of skill.
CLV separates skill from variance. Win/loss over small samples is noise; persistent positive CLV reliably predicts long-run profitability. Bettors who track CLV report ROI ~2–3× those who track win rate alone.

This reframes our whole project. Success is not "did our predicted score match the final?" It is "did our number beat the closing number?" Doc 5 and Doc 7 build CLV into the backtester as a first-class metric.

Sources: VSiN — CLV, Pinnacle Odds Dropper — CLV.

6. How efficient is the NFL market, really?

NFL spread/total markets are among the most efficient betting markets in the world (high liquidity, heavy sharp participation):

Academic studies repeatedly find the closing line is not statistically different from the actual margin on average — an essentially unbiased predictor. (~50% of final margins land above the spread, ~50% below; the line behaves like the median outcome.)
Books are well-calibrated in moneyline win probabilities.

Realistic edge. Because you must clear 52.38%:

A genuine long-term sharp wins about 53–55% ATS; 55%+ is strong, ~65% is fantasy over any large sample.
A representative pro earns roughly 5% ROI over thousands of bets; sustained +2–3% CLV is excellent.
Even good ML models top out around ~55% ATS (Doc 5) — and "the market is hard to beat" is the single most replicated finding in the literature.
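The link between ATS win rate and ROI at a flat −110 is simple arithmetic, sketched here with a hypothetical helper. It assumes unit stakes and ignores pushes.

```python
def roi_at_win_rate(win_rate: float, american: int = -110) -> float:
    """Expected profit per unit staked at a fixed price (pushes ignored)."""
    # Profit on a winning one-unit bet at the given American price.
    win_profit = 100 / abs(american) if american < 0 else american / 100
    return win_rate * win_profit - (1 - win_rate)


for rate in (0.53, 0.55):
    print(f"{rate:.0%} ATS -> {roi_at_win_rate(rate):.2%} ROI")
# 53% ATS -> 1.18% ROI
# 55% ATS -> 5.00% ROI
```

At the 52.38% break-even rate the same function returns approximately zero, which is why a gap of two or three points of win rate separates a losing bettor from a strong one.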

Wisdom of crowds. The efficiency is a crowd-aggregation result: the close pools the dispersed information and capital of thousands of participants (with sharps overweighted), converging on a near-true probability no single forecaster reliably beats.

Sources: arXiv 1211.4000 — Performance of NFL Betting Lines, Skidmore — NFL Market Efficiency, nfelo — Margin Probabilities.

7. The line vs. the "true" expected margin

The spread predicts the central tendency of the margin — effectively the median (≈50/50 split), very close to the mean since the margin distribution is roughly symmetric around the number.
The total predicts expected combined points, centered so ~50% of games go over and ~50% under at the no-vig number.
But the distribution is not a clean normal curve — it spikes on key numbers (3, 7…). That nonlinearity is the subject of Doc 2 §4 and is why a half-point near 3 is worth ~3–4%.

8. Implications for our engine

Add a market-truth prior. Ingest the closing (and opening) line per game. Treat the de-vigged close as the best available estimate of the true spread/total. We already scrape Vegas odds via the odds-data pipeline — we must also store the closing number and the price, not just an early line.
Build a de-vig utility. A small function: two American prices → normalized fair probabilities → fair spread/total. Used everywhere we compare model vs. market.
Make CLV a backtest metric. For every flagged bet, record the number we "bet" vs. the closing number, and report aggregate CLV alongside ROI and ATS win%. This is the truest scorecard.
Stop optimizing for raw score accuracy alone. The Optuna tuning currently targets prediction error; it should also (or instead) optimize for CLV / ROI vs. the close (Doc 5, Doc 7).
Engineer line-movement features where data allows: open→close move size, RLM flags, steam. These are among the few signals correlated with sharp information.
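One possible shape for the backtest scorecard described above is sketched below. The record fields, class name, and function name are hypothetical, not the existing backtester's schema; lines are assumed to be quoted from the bettor's side, stakes are one unit, and the sample records are made up purely to exercise the code.

```python
from dataclasses import dataclass


@dataclass
class BetRecord:
    bet_line: float    # spread we "bet", from our side
    close_line: float  # closing spread, same side
    price: int         # American price we took
    result: str        # "win", "loss", or "push"


def summarize(bets: list[BetRecord]) -> dict[str, float]:
    """Report ATS win%, ROI per bet, mean CLV in points, and share of bets beating the close."""
    if not bets:
        raise ValueError("no bets to summarize")
    decided = [b for b in bets if b.result != "push"]
    wins = sum(b.result == "win" for b in decided)
    profit = 0.0
    for b in bets:
        if b.result == "win":
            profit += 100 / abs(b.price) if b.price < 0 else b.price / 100
        elif b.result == "loss":
            profit -= 1.0  # pushes return the stake, so they add nothing
    clv = [b.bet_line - b.close_line for b in bets]
    return {
        "ats_win_pct": wins / len(decided) if decided else float("nan"),
        "roi": profit / len(bets),
        "mean_clv_points": sum(clv) / len(clv),
        "beat_close_pct": sum(x > 0 for x in clv) / len(clv),
    }


# Made-up records for illustration only, not real results.
sample = [
    BetRecord(3.5, 2.5, -110, "win"),
    BetRecord(-2.5, -3.0, -110, "loss"),
    BetRecord(7.0, 7.0, -110, "push"),
]
print(summarize(sample))
```

Reporting CLV next to ROI and win rate makes it visible when a profitable backtest is riding variance rather than beating the close.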

→ Continue to Doc 2 — Predicting Spreads.