What win-probability math really recommends on fourth down — and why teams still punt too much.
Published June 6, 2026 · NFL Analytics
No corner of NFL analytics has spilled onto television more than fourth down. For decades, coaches punted almost reflexively on fourth-and-short, and for decades the math quietly said they were leaving wins on the field. The "fourth-down revolution" is the slow, public shift toward decisions grounded in win probability rather than gut feel, helped along by the bots and charts that put that math in everyone's hands.
The framework is straightforward to state: on any fourth down, you have up to three options - go for it, punt, or kick a field goal. Each option leads to a distribution of outcomes, and each outcome has a value. The right call is the one with the highest expected value, usually measured in win probability (WP) or sometimes expected points (EP).
To compare options you weight each possible outcome by how likely it is and how valuable it is. Conceptually, the value of going for it looks like this:
WP(go) = P(convert) × WP(first down) + [1 − P(convert)] × WP(turnover on downs)
And you compare that against the alternatives:
WP(punt) ≈ WP(opponent gets ball at expected punt spot)
WP(kick) = P(make FG) × WP(made FG) + [1 − P(make FG)] × WP(missed FG)
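The three formulas above can be combined into one small comparison. What follows is a minimal sketch, not any published model: the class, the function names, and every number in the example are hypothetical placeholders chosen only to show the arithmetic. A real tool would estimate each input from play-by-play data.

```python
from dataclasses import dataclass


@dataclass
class FourthDownInputs:
    """Hypothetical inputs; in practice each comes from a fitted model."""
    p_convert: float      # P(convert)
    wp_first_down: float  # WP(first down)
    wp_turnover: float    # WP(turnover on downs)
    wp_punt: float        # WP(opponent gets ball at expected punt spot)
    p_make_fg: float      # P(make FG)
    wp_made_fg: float     # WP(made FG)
    wp_missed_fg: float   # WP(missed FG)


def wp_go(x: FourthDownInputs) -> float:
    # WP(go) = P(convert) * WP(first down) + [1 - P(convert)] * WP(turnover on downs)
    return x.p_convert * x.wp_first_down + (1 - x.p_convert) * x.wp_turnover


def wp_kick(x: FourthDownInputs) -> float:
    # WP(kick) = P(make FG) * WP(made FG) + [1 - P(make FG)] * WP(missed FG)
    return x.p_make_fg * x.wp_made_fg + (1 - x.p_make_fg) * x.wp_missed_fg


def best_option(x: FourthDownInputs):
    """Return the option with the highest win probability, plus all three values."""
    options = {"go": wp_go(x), "punt": x.wp_punt, "kick": wp_kick(x)}
    return max(options, key=options.get), options


# Illustrative numbers only, NOT real NFL estimates.
example = FourthDownInputs(
    p_convert=0.65, wp_first_down=0.58, wp_turnover=0.44,
    wp_punt=0.50,
    p_make_fg=0.55, wp_made_fg=0.54, wp_missed_fg=0.45,
)
choice, values = best_option(example)
print(choice, values)
```

With these made-up inputs, going for it edges out both alternatives. The point is the structure of the comparison, not the specific values.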
The single most important input is conversion probability by distance. Short-yardage attempts convert far more often than long ones, and that gradient is what drives most of the conclusions.
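Conversion rates by distance have to be estimated from past fourth-down attempts. Here is a minimal sketch of that tabulation, assuming a hypothetical list of (yards to go, converted) records. The toy records are invented purely to exercise the function and say nothing about real NFL rates.

```python
from collections import defaultdict


def conversion_rate_by_distance(attempts):
    """Tabulate conversion rate for each yards-to-go value.

    attempts: iterable of (yards_to_go, converted) pairs (hypothetical format).
    """
    tries = defaultdict(int)
    successes = defaultdict(int)
    for yards_to_go, converted in attempts:
        tries[yards_to_go] += 1
        successes[yards_to_go] += int(converted)
    return {d: successes[d] / tries[d] for d in sorted(tries)}


# Toy records, NOT real data: four attempts at 1 yard, four at 5 yards.
toy_attempts = [
    (1, True), (1, True), (1, False), (1, True),
    (5, True), (5, False), (5, False), (5, False),
]
rates = conversion_rate_by_distance(toy_attempts)
print(rates)
```

A production model would likely smooth these rates across distances and adjust for field position and game state rather than use raw per-distance averages.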
The headline, well-documented finding is that NFL teams historically punted too often on fourth-and-short, and the mistake was worst in plus territory (past midfield), where a punt buys almost nothing and a failed conversion still leaves the opponent with most of the field to cover. The expected-value math repeatedly favored going for it in spots where convention said to punt or kick.
What pushed this from research papers into the mainstream was a lineage of public tools. The original "4th down bot" demonstrated the decisions live and made the gaps between coaching and math impossible to ignore. That tradition continued with open models such as Ben Baldwin's nfl4th, built on public play-by-play data, which lets anyone evaluate a real decision with a transparent win-probability model.
Because these models are open, you can inspect their assumptions instead of taking a broadcast graphic on faith - a major reason fourth-down analysis became trusted. Our own tools page includes a fourth-down helper in the same spirit.
The same expected-value thinking powers the famous go-for-two chart. After a touchdown you can kick the extra point or go for two, and you simply compare the points each is expected to produce:
The arithmetic: an extra point yields about 1 × 0.94 = 0.94 points on average, while a two-point try yields about 2 × 0.475 = 0.95 points. On raw expectation the two are remarkably close, which is exactly why the decision so often comes down to game state. When you trail by a margin where two points change the number of scores you need, the chart says go for two; because the raw expectations are nearly break-even, the situation is what tips the call.
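The same arithmetic as a short sketch. The success rates (0.94 and 0.475) are the ones quoted above; the function names are illustrative, and the breakeven helper is simply the two-point success rate at which both choices produce the same expected points.

```python
def expected_points(points: int, p_success: float) -> float:
    """Expected points from a try worth `points` that succeeds with p_success."""
    return points * p_success


def two_point_breakeven(p_extra_point: float) -> float:
    """Two-point success rate where 2 * p equals 1 * p_extra_point."""
    return p_extra_point / 2


xp = expected_points(1, 0.94)           # extra point
two = expected_points(2, 0.475)         # two-point try
breakeven = two_point_breakeven(0.94)   # rate needed to match the kick

print(round(xp, 3), round(two, 3), round(breakeven, 3))
```

On raw points alone, a team that converts two-point tries above the breakeven rate comes out slightly ahead by going for two. Game state can still override that small edge in either direction.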
Win-probability models are calibrated on league-average behavior. A team with a dominant short-yardage rushing attack should go for it more than the baseline suggests; a team that cannot get a yard should go less.
A great kicker extends sensible field-goal range; a shaky one shrinks it. The same fourth down can have different right answers for different rosters.
Wind, cold, and a poor field surface all change conversion and kicking odds in ways a generic model may not fully price for a specific game.
Late-game situations, timeouts remaining, and which team you would rather have the ball can shift the call away from the season-long expected-value answer.
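One way to make these adjustments concrete is to solve the go-for-it formula for the conversion probability at which going for it exactly matches the best alternative, then compare a specific team's estimated rate against that threshold. This is a sketch under stated assumptions: the function names and all win-probability values are hypothetical, not output from any real model.

```python
def breakeven_conversion(wp_first_down: float, wp_turnover: float,
                         wp_alternative: float) -> float:
    """Conversion probability at which WP(go) equals WP(alternative).

    Solves p * WP(first down) + (1 - p) * WP(turnover) = WP(alternative) for p,
    clamped to the interval [0, 1].
    """
    spread = wp_first_down - wp_turnover
    if spread <= 0:
        raise ValueError("converting must be worth more than failing")
    p = (wp_alternative - wp_turnover) / spread
    return min(max(p, 0.0), 1.0)


def should_go(team_p_convert: float, wp_first_down: float,
              wp_turnover: float, wp_alternative: float) -> bool:
    """True when the team's own conversion rate clears the breakeven."""
    threshold = breakeven_conversion(wp_first_down, wp_turnover, wp_alternative)
    return team_p_convert >= threshold


# Illustrative values only: same situation, two different rosters.
threshold = breakeven_conversion(0.58, 0.44, 0.50)
strong_line = should_go(0.60, 0.58, 0.44, 0.50)  # good short-yardage team
weak_line = should_go(0.35, 0.58, 0.44, 0.50)    # struggles to get a yard
print(round(threshold, 3), strong_line, weak_line)
```

The same situation produces opposite answers for the two hypothetical teams. Weather or kicker quality can be handled the same way by changing the inputs that feed the alternative's win probability.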
Fourth-down decisions come down to comparing the win probability (or expected points) of going for it, punting, and kicking - and the engine is conversion probability by distance. The well-documented finding is that teams historically punted far too often on fourth-and-short, especially in plus territory, where a punt gains almost nothing. Public, open models like the "4th down bot" lineage and nfl4th made that math impossible to ignore. The same logic explains the two-point chart: with an extra point worth about 0.94 points and a two-point try (around 47-48% success) worth roughly 0.95, the decision tips on game state. Just remember the models assume league-average teams - personnel, weather, and situation can all move the right answer.