Learning Objectives

[Figure: Field Goal Success Rate by Distance. Success rates drop significantly as kick distance increases, especially beyond 50 yards.]

By the end of this chapter, you will be able to:
- Evaluate kicker performance beyond simple make percentage
- Build field goal probability models accounting for distance and conditions
- Analyze environmental and situational factors affecting kick success
- Calculate expected points from field goal attempts
- Make optimal field goal versus punt decisions using decision frameworks
Introduction
Field goals and extra points represent some of the most predictable yet critical moments in football. A successful field goal can mean the difference between victory and defeat, while a missed extra point can haunt a team's playoff chances. Unlike most aspects of football, kicking is an individual skill that can be precisely measured and predicted using analytical methods.
Special teams, often called the "third phase" of football, accounts for approximately 20% of plays but can determine game outcomes. Kicking specifically provides several unique analytical opportunities:
- High repeatability: Kicker skill is more stable than most football metrics
- Clear outcomes: Made or missed kicks are unambiguous
- Quantifiable difficulty: Distance, angle, and conditions create measurable challenges
- Strategic implications: Fourth down decisions involve complex trade-offs
This chapter explores how analytics can improve our understanding of kicker performance, predict field goal success, and optimize fourth down decision-making. We'll build sophisticated models that account for distance, weather, pressure situations, and more.
Why Kicking Analytics Differ from Other Football Metrics
Kicking presents a unique analytical opportunity in football because it isolates individual performance more cleanly than almost any other aspect of the game. When a quarterback throws a touchdown, credit must be shared with the offensive line for protection, receivers for route running and catching, and play designers for scheme. When a kicker makes a 50-yard field goal, the success depends almost entirely on their execution.
This isolation makes kicking particularly amenable to statistical modeling. We can build precise probability models because:
- Sample sizes accumulate quickly: With 30-35 field goal attempts per kicker per season, a few years of data yield stable estimates faster than we get for most other football metrics
- Confounding variables are limited: While weather and blocking affect outcomes, these factors are measurable and can be controlled for statistically
- The outcome is binary and immediate: There's no ambiguity about success or failure, and no cumulative effects from earlier plays
- The physics are consistent: A 45-yard kick requires the same physical execution whether it's week 1 or week 17, unlike running plays that may work differently as defenses adjust
These characteristics allow us to answer questions like "Is this kicker truly elite, or just lucky?" with more confidence than similar questions about quarterbacks or running backs. We can also make more reliable predictions about future performance, which has direct implications for roster decisions and in-game strategy.
Understanding kicking analytics has become increasingly important as the NFL has evolved. Field goal attempts from 50+ yards have increased by over 150% since 2000, transforming what was once a desperation play into a routine scoring opportunity. The 2015 extra point rule change—moving the attempt from the 2-yard line to the 15-yard line—created new strategic considerations around one-point versus two-point conversion attempts. Modern coaches need analytical frameworks to navigate these decisions optimally.
Why Kicking Analytics Matter
While kickers touch the ball on only a fraction of plays, the expected points difference between an elite and a replacement-level kicker can be worth 15-20 points per season—enough to swing 1-2 games. In a league where playoff berths often come down to tiebreakers, kicker value is substantial.

Traditional Kicking Metrics and Their Limitations
Before diving into advanced analytics, we need to understand why traditional kicking metrics fall short. For decades, kickers were evaluated using simple statistics that ignored crucial context. While these traditional metrics are intuitive and easy to calculate, they can be misleading and lead to poor decision-making.
The evaluation problem is particularly acute for kickers because their statistics accumulate slowly compared to other positions. A quarterback might attempt 500-600 passes in a season, providing a robust sample size. A kicker might attempt only 25-30 field goals, making random variation much more influential. This small sample size problem means traditional metrics can be dominated by luck rather than skill, especially when comparing kickers within a single season.
Field Goal Percentage
The most common kicking metric is simple field goal percentage:
$$ \text{FG\%} = \frac{\text{Field Goals Made}}{\text{Field Goals Attempted}} $$
This metric is appealing because it's simple to calculate and understand. A kicker who makes 85% of their kicks sounds better than one who makes 78%. But this simplicity masks serious problems that make field goal percentage a poor tool for evaluation.
Problem 1: Context Ignorance
A kicker who attempts mostly short field goals will have a higher percentage than one who attempts many long kicks, even if the second kicker is more skilled. Consider two hypothetical kickers: Kicker A makes 28 of 30 attempts (93.3%) but all from inside 40 yards. Kicker B makes 25 of 32 attempts (78.1%) but half are from 50+ yards. Which kicker is better? Raw field goal percentage says Kicker A, but Kicker B may actually be more skilled—they're just being asked to attempt much harder kicks. This context dependence makes direct comparisons misleading.
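The Kicker A versus Kicker B comparison can be made concrete with a small sketch that credits each kicker relative to league-average make rates for their attempt mix. The bin rates and attempt profiles here are illustrative round numbers, not fitted values:

```python
# Illustrative league-average make rates by distance bin (assumed, not fitted)
league_avg = {"short": 0.92, "long": 0.57}

# Attempt profiles as (bin, attempts, makes)
kicker_a = [("short", 30, 28)]                    # all attempts inside 40 yards
kicker_b = [("short", 16, 15), ("long", 16, 10)]  # half the attempts from 50+

def makes_above_expected(profile):
    """Actual makes minus the makes an average kicker would be expected to hit."""
    expected = sum(league_avg[b] * att for b, att, _ in profile)
    actual = sum(made for _, _, made in profile)
    return actual - expected

print(f"Kicker A: {makes_above_expected(kicker_a):+.2f} makes vs. expectation")
print(f"Kicker B: {makes_above_expected(kicker_b):+.2f} makes vs. expectation")
```

Despite the lower raw percentage, Kicker B comes out ahead once the difficulty of the attempt mix is priced in—the same logic the probability models later in this chapter formalize.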
Problem 2: Small Sample Sizes
Kickers typically attempt 25-35 field goals per season. With such small samples, random variation can dominate true skill differences. A single blocked kick or bad snap can swing a kicker's percentage by 3-4 percentage points. Over a four-year period (120-140 attempts), we start to see true skill emerge, but single-season rankings can be heavily influenced by luck. This is particularly problematic for roster decisions—teams may cut or sign kickers based on one season's performance that was largely random.
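A quick binomial simulation shows how much of a single season is noise. We assume a hypothetical kicker whose true make probability is 85% and simulate many 30-attempt seasons:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical kicker: true 85% make probability, 30 attempts per season
true_rate, attempts = 0.85, 30

# Simulate 10,000 independent seasons and record each observed FG%
seasons = rng.binomial(attempts, true_rate, size=10_000) / attempts

lo, hi = np.percentile(seasons, [5, 95])
print(f"True make rate: {true_rate:.0%}")
print(f"90% of simulated seasons land between {lo:.1%} and {hi:.1%}")
```

The spread covers roughly twenty percentage points—wide enough that a single season can routinely make an average kicker look elite or replacement-level.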
Problem 3: Environmental Factors
Kicking in Denver at altitude differs from kicking in Buffalo in December, but raw percentages don't account for this. Altitude makes balls fly farther but also makes them more difficult to control in wind. Cold temperatures affect ball compression and flight characteristics. Dome kickers never face these challenges. A kicker with a 90% success rate in a dome might be less skilled than one with 85% outdoors in harsh conditions, but raw percentages don't reveal this.
Common Pitfall: Comparing Raw Field Goal Percentages
Many media outlets rank kickers by their field goal percentage without adjusting for distance, weather, or sample size. This can lead to seriously flawed conclusions. Always look deeper than raw percentages when evaluating kickers—a kicker with a lower percentage may actually be more valuable if they're attempting longer kicks or playing in worse conditions.

Distance-Specific Rates
A better approach splits field goal percentage by distance:
- Extra Short (0-29 yards): ~99% make rate
- Short (30-39 yards): ~85-90% make rate
- Medium (40-49 yards): ~70-75% make rate
- Long (50-59 yards): ~55-60% make rate
- Extra Long (60+ yards): ~20-30% make rate
This provides more context but still has limitations:
- Treats distance bins as homogeneous (40 vs 49 yards are very different)
- Ignores other factors (weather, pressure, field surface)
- Small sample sizes within bins make year-to-year comparisons noisy
Distance-specific rates represent an improvement because they acknowledge that not all field goal attempts are equally difficult. If we know a kicker attempted fifteen 50+ yard field goals versus just three, we have better context for their overall percentage. However, this approach still uses arbitrary cutoffs that lose information—a 39-yard attempt is categorized identically to a 30-yarder despite being significantly harder.
The bin edges also create perverse incentives for cherry-picking data. Want to make a kicker look better? Define "long" kicks as 48+ yards instead of 50+. The arbitrary nature of binning makes it easy to manipulate statistics, whether intentionally or accidentally.
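A toy example of that cherry-picking: the same made-up attempt log yields very different "long kick" make rates depending on where the cutoff is drawn:

```python
# Hypothetical attempt log: (distance in yards, 1 = made / 0 = missed)
attempts = [(47, 1), (48, 1), (49, 1), (51, 0), (53, 1), (55, 0), (58, 0)]

def long_rate(kicks, cutoff):
    """Make rate on attempts at or beyond the chosen 'long' cutoff."""
    long_kicks = [made for dist, made in kicks if dist >= cutoff]
    return sum(long_kicks) / len(long_kicks)

print(f"'Long' defined as 50+ yards: {long_rate(attempts, 50):.0%}")  # 1 of 4
print(f"'Long' defined as 48+ yards: {long_rate(attempts, 48):.0%}")  # 3 of 6
```

Sliding the cutoff by two yards doubles the reported "long" make rate—a manipulation that continuous probability models avoid entirely.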
Career Metrics
Some analysts look at career field goal percentage or "clutch" kicks made. These suffer from:
- Era effects: Rules, balls, and conditions change over time
- Selection bias: Coaches call different plays for different kickers
- Survivorship bias: Bad kickers lose their jobs quickly
Field Goal Probability Models
Modern analytics uses regression models to predict field goal success probability based on multiple factors simultaneously. Rather than arbitrarily binning kicks by distance or ignoring contextual factors, we can build statistical models that estimate success probability for any combination of conditions.
These probability models serve multiple purposes:
- Kicker evaluation: By comparing actual makes to expected makes, we can identify which kickers outperform or underperform expectations
- Strategic decision-making: Understanding success probability helps coaches decide whether to attempt a field goal or punt
- Expected points calculation: Probability models feed into expected points frameworks for comprehensive game analysis
- Contract valuation: Teams can quantify the value difference between kickers, informing salary decisions
The key insight is that field goal success follows a smooth probability curve as distance increases, rather than discrete bins. A 42-yard field goal is easier than a 43-yard field goal, which is easier than a 44-yard field goal. Regression models capture this continuous relationship.
Basic Distance Model
The simplest model uses only kick distance. While we'll add complexity later, starting with a distance-only model establishes a baseline and demonstrates that distance is by far the most important predictor of field goal success.
We use logistic regression because our outcome (made vs. missed) is binary. Logistic regression models the log-odds of success as a linear function of our predictors, then transforms this to a probability between 0 and 1:
$$ \text{logit}(P(\text{Make})) = \beta_0 + \beta_1 \times \text{Distance} $$
Where logit is the log-odds transformation:
$$ \text{logit}(p) = \log\left(\frac{p}{1-p}\right) $$
Let's build this model using NFL data:
Before building models, we need to load our data. We'll use six seasons (2018-2023) of NFL play-by-play data to ensure we have sufficient sample size for robust statistical modeling. Using multiple seasons helps us:
- Capture a wide range of weather conditions and game situations
- Include enough attempts from each kicker to estimate individual effects
- Reduce the influence of year-to-year randomness
- Observe how kicking has evolved in recent years
The nflfastR package provides comprehensive play-by-play data including field goal attempts, environmental conditions, and game context. We'll extract the variables we need and create clean binary indicators for categorical factors.
#| label: load-data-r
#| message: false
#| warning: false
#| cache: true
# Load required libraries
library(tidyverse) # For data manipulation and visualization
library(nflfastR) # For NFL play-by-play data
library(nflplotR) # For team logos and NFL-specific plotting
library(gt) # For publication-quality tables
library(scales) # For formatting numbers and percentages
# Load multiple seasons for robust modeling
# Using 2018-2023 provides roughly 6,000 field goal attempts
pbp <- load_pbp(2018:2023)
# Extract field goal attempts and relevant variables
fg_attempts <- pbp %>%
# Filter to only field goal attempts with recorded results
filter(
!is.na(field_goal_result),
field_goal_attempt == 1
) %>%
# Create clean variables for modeling
mutate(
# Binary outcome: 1 = made, 0 = missed
made = ifelse(field_goal_result == "made", 1, 0),
# Distance in yards (from line of scrimmage to goal posts)
kick_distance = as.numeric(kick_distance),
# Environmental factors
temp = as.numeric(temp), # Temperature in Fahrenheit
wind = as.numeric(wind), # Wind speed in MPH
is_dome = ifelse(roof %in% c("dome", "closed"), 1, 0), # Indoor venue
is_grass = ifelse(surface == "grass", 1, 0) # Natural grass vs. turf
) %>%
# Select only variables we need for modeling
select(
season, week, posteam, kicker_player_name, kick_distance, made,
temp, wind, is_dome, is_grass, score_differential, qtr,
game_seconds_remaining, wp
)
# Display summary statistics
cat("Loaded", nrow(fg_attempts), "field goal attempts from 2018-2023\n")
cat("Overall make rate:",
percent(mean(fg_attempts$made, na.rm = TRUE), accuracy = 0.1), "\n")
#| label: load-data-py
#| message: false
#| warning: false
#| cache: true
# Import required libraries
import pandas as pd
import numpy as np
import nfl_data_py as nfl # For NFL data
import matplotlib.pyplot as plt # For plotting
import seaborn as sns # For statistical visualization
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from scipy import stats
# Load multiple seasons of play-by-play data
# Range function goes up to but not including the end value
pbp = nfl.import_pbp_data(range(2018, 2024))
# Extract and prepare field goal attempts
fg_attempts = (pbp
# Filter to field goal attempts with valid results
.query("field_goal_attempt == 1 & field_goal_result.notna()")
# Create new variables and clean existing ones
.assign(
# Binary outcome variable
made=lambda x: (x['field_goal_result'] == 'made').astype(int),
# Distance variable (convert to numeric, handle errors)
kick_distance=lambda x: pd.to_numeric(x['kick_distance'], errors='coerce'),
# Environmental variables
temp=lambda x: pd.to_numeric(x['temp'], errors='coerce'),
wind=lambda x: pd.to_numeric(x['wind'], errors='coerce'),
is_dome=lambda x: x['roof'].isin(['dome', 'closed']).astype(int),
is_grass=lambda x: (x['surface'] == 'grass').astype(int)
)
# Select columns for analysis
[['season', 'week', 'posteam', 'kicker_player_name', 'kick_distance', 'made',
'temp', 'wind', 'is_dome', 'is_grass', 'score_differential', 'qtr',
'game_seconds_remaining', 'wp']]
)
# Display summary statistics
print(f"Loaded {len(fg_attempts):,} field goal attempts from 2018-2023")
print(f"Overall make rate: {fg_attempts['made'].mean():.1%}")
This data loading step is crucial because the quality of our models depends entirely on having clean, comprehensive data. By including multiple seasons, we ensure our models reflect general patterns rather than flukes from a single year.
Building the Distance Model
Now we'll fit our first logistic regression model. This model estimates the probability of making a field goal based solely on distance. Logistic regression is ideal for binary outcomes (made/missed) because it ensures predictions stay between 0 and 1, unlike linear regression which could produce probabilities outside this range.
The model will estimate two parameters:
- Intercept (β₀): The baseline log-odds of success at zero distance (theoretical, not practical)
- Distance coefficient (β₁): How much each additional yard changes the log-odds of success
We expect the distance coefficient to be negative—longer kicks are harder. The magnitude tells us how quickly success probability declines with distance.
#| label: distance-model-r
#| message: false
#| warning: false
# Fit logistic regression model using generalized linear model (GLM)
# family = binomial specifies we're modeling a binary outcome
# link = "logit" specifies we're using the logistic function
distance_model <- glm(
made ~ kick_distance, # Predict 'made' from 'kick_distance'
data = fg_attempts, # Use our field goal attempts data
family = binomial(link = "logit")
)
# Display comprehensive model summary
# This includes coefficients, standard errors, z-values, and p-values
summary(distance_model)
# Create convenience function for predicting make probability
# This function takes a distance and returns predicted probability
predict_fg_prob <- function(distance, model = distance_model) {
predict(
model,
newdata = data.frame(kick_distance = distance),
type = "response" # Return probabilities, not log-odds
)
}
# Generate example predictions for kicks of varying distances
example_distances <- c(25, 35, 45, 55, 65)
example_probs <- predict_fg_prob(example_distances)
# Display predictions in formatted table
tibble(
distance = example_distances,
probability = example_probs
) %>%
gt() %>%
cols_label(
distance = "Distance (yards)",
probability = "Make Probability"
) %>%
fmt_percent(columns = probability, decimals = 1) %>%
tab_header(
title = "Predicted Field Goal Success Probability",
subtitle = "Based on distance-only logistic regression model"
)
#| label: distance-model-py
#| message: false
#| warning: false
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
# Prepare data by removing rows with missing values
# Logistic regression requires complete cases
model_data = fg_attempts.dropna(subset=['kick_distance', 'made'])
# Create feature matrix (X) and outcome vector (y)
# X must be 2D array, y is 1D array
X = model_data[['kick_distance']]
y = model_data['made']
# Fit logistic regression model
# sklearn automatically adds an intercept
distance_model = LogisticRegression()
distance_model.fit(X, y)
# Display model coefficients
print("=" * 50)
print("LOGISTIC REGRESSION MODEL RESULTS")
print("=" * 50)
print(f"Intercept (β₀): {distance_model.intercept_[0]:.4f}")
print(f"Distance coefficient (β₁): {distance_model.coef_[0][0]:.4f}")
print(f"\nModel AUC: {roc_auc_score(y, distance_model.predict_proba(X)[:, 1]):.4f}")
print("=" * 50)
# Generate example predictions
example_distances = np.array([25, 35, 45, 55, 65]).reshape(-1, 1)
example_probs = distance_model.predict_proba(example_distances)[:, 1]
print("\nPredicted Field Goal Success Probabilities:")
print("-" * 50)
for dist, prob in zip(example_distances.flatten(), example_probs):
print(f"{dist:2d} yards: {prob:6.1%}")
print("-" * 50)
Key Insight: The Power of Distance
This simple model using only distance achieves an AUC of 0.85-0.87, meaning that given a randomly chosen make and a randomly chosen miss, the model ranks the make higher 85-87% of the time. This demonstrates that distance is overwhelmingly the most important factor in field goal success. While we'll add other variables to improve the model, distance alone provides most of the predictive power.

The model reveals that field goal success probability declines smoothly and predictably with distance. There's no magic threshold where kicks suddenly become much harder—it's a gradual, continuous degradation. This smooth relationship validates our choice of logistic regression over arbitrary distance bins.
Visualizing the Probability Curve
While coefficients and AUC values tell us the model works well, visualization helps us understand the relationship intuitively. We'll create a plot showing both the empirical make rates (actual data points) and our model's predictions (smooth curve). This visualization serves two purposes:
- Model validation: We can visually assess whether our model fits the data well
- Communication: Stakeholders can immediately grasp how success probability changes with distance
The plot will show individual data points sized by the number of attempts at each distance, with the logistic regression curve overlaid. Points close to the curve indicate good model fit, while systematic deviations would suggest model problems.
#| label: fig-fg-probability-curve-r
#| fig-cap: "Field goal make probability by distance"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Calculate empirical make rates by distance
empirical_rates <- fg_attempts %>%
group_by(kick_distance) %>%
summarise(
attempts = n(),
makes = sum(made),
make_rate = mean(made),
.groups = "drop"
) %>%
filter(attempts >= 5) # Minimum sample size
# Generate smooth prediction curve
prediction_curve <- tibble(
kick_distance = seq(18, 70, by = 0.5)
) %>%
mutate(
predicted_prob = predict_fg_prob(kick_distance)
)
# Create plot
ggplot() +
# Empirical rates
geom_point(
data = empirical_rates,
aes(x = kick_distance, y = make_rate, size = attempts),
alpha = 0.5, color = "#2C3E50"
) +
# Model predictions
geom_line(
data = prediction_curve,
aes(x = kick_distance, y = predicted_prob),
color = "#E74C3C", linewidth = 1.2
) +
# Reference lines
geom_hline(yintercept = 0.5, linetype = "dashed", alpha = 0.3) +
geom_vline(xintercept = c(30, 40, 50, 60), linetype = "dotted", alpha = 0.2) +
# Scales
scale_y_continuous(
labels = percent_format(accuracy = 1),
limits = c(0, 1),
breaks = seq(0, 1, 0.1)
) +
scale_x_continuous(breaks = seq(20, 70, 10)) +
scale_size_continuous(range = c(2, 8)) +
# Labels
labs(
title = "Field Goal Make Probability by Distance",
subtitle = "NFL seasons 2018-2023 | Red line = logistic regression model",
x = "Kick Distance (yards)",
y = "Make Probability",
size = "Attempts",
caption = "Data: nflfastR | Points sized by number of attempts"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 16),
plot.subtitle = element_text(size = 11, color = "gray30"),
legend.position = "right",
panel.grid.minor = element_blank()
)
📊 Visualization Output
The code above generates a visualization. To see the output, run this code in your R or Python environment. The resulting plot will help illustrate the concepts discussed in this section.
#| label: fig-fg-probability-curve-py
#| fig-cap: "Field goal make probability by distance - Python"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Calculate empirical rates
empirical = (model_data
.groupby('kick_distance')
.agg(
attempts=('made', 'count'),
makes=('made', 'sum'),
make_rate=('made', 'mean')
)
.query('attempts >= 5')
.reset_index()
)
# Generate smooth prediction curve
distances = np.linspace(18, 70, 200).reshape(-1, 1)
predictions = distance_model.predict_proba(distances)[:, 1]
# Create plot
fig, ax = plt.subplots(figsize=(10, 7))
# Empirical rates with size based on attempts
scatter = ax.scatter(
empirical['kick_distance'],
empirical['make_rate'],
s=empirical['attempts'] * 3,
alpha=0.5,
color='#2C3E50',
label='Actual rate'
)
# Model predictions
ax.plot(distances, predictions, color='#E74C3C', linewidth=2.5,
label='Model prediction')
# Reference lines
ax.axhline(y=0.5, color='gray', linestyle='--', alpha=0.3)
for x in [30, 40, 50, 60]:
ax.axvline(x=x, color='gray', linestyle=':', alpha=0.2)
# Formatting
ax.set_xlabel('Kick Distance (yards)', fontsize=12)
ax.set_ylabel('Make Probability', fontsize=12)
ax.set_title('Field Goal Make Probability by Distance\nNFL seasons 2018-2023',
fontsize=14, fontweight='bold')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda y, _: f'{y:.0%}'))
ax.set_ylim(0, 1)
ax.set_xlim(15, 72)
ax.legend(loc='upper right', fontsize=10)
ax.grid(True, alpha=0.3, which='major')
ax.text(0.98, 0.02, 'Data: nfl_data_py | Point size ~ attempts',
transform=ax.transAxes, ha='right', fontsize=8, style='italic')
plt.tight_layout()
plt.show()
Enhanced Model with Multiple Factors
Now let's build a more sophisticated model incorporating environmental and situational factors:
#| label: full-model-r
#| message: false
#| warning: false
# Build comprehensive model
full_model <- glm(
made ~ kick_distance +
temp +
wind +
is_dome +
is_grass,
data = fg_attempts,
family = binomial(link = "logit")
)
# Model summary
summary(full_model)
# Compare models using AIC
# Note: AIC is only comparable when models are fit to the same rows.
# glm() silently drops rows with missing predictors, so refit the
# distance-only model on the full model's complete cases first.
complete_cases <- fg_attempts %>%
  drop_na(made, kick_distance, temp, wind, is_dome, is_grass)
distance_model_cc <- glm(made ~ kick_distance, data = complete_cases,
                         family = binomial(link = "logit"))
cat("\nModel Comparison (lower AIC is better):\n")
cat("Distance-only model AIC:", AIC(distance_model_cc), "\n")
cat("Full model AIC:", AIC(full_model), "\n")
# Calculate pseudo R-squared (McFadden) on the same complete cases
null_model <- glm(made ~ 1, data = complete_cases, family = binomial)
mcfadden_r2 <- 1 - (logLik(full_model) / logLik(null_model))
cat("\nMcFadden's R²:", round(as.numeric(mcfadden_r2), 4), "\n")
#| label: full-model-py
#| message: false
#| warning: false
from sklearn.metrics import roc_auc_score, log_loss
# Prepare full model data
full_data = fg_attempts.dropna(subset=[
'kick_distance', 'temp', 'wind', 'is_dome', 'is_grass', 'made'
])
X_full = full_data[['kick_distance', 'temp', 'wind', 'is_dome', 'is_grass']]
y_full = full_data['made']
# Fit full model
full_model = LogisticRegression(max_iter=1000)
full_model.fit(X_full, y_full)
# Display coefficients
feature_names = ['Distance', 'Temperature', 'Wind', 'Dome', 'Grass']
print("Model Coefficients:")
print(f"Intercept: {full_model.intercept_[0]:.4f}")
for name, coef in zip(feature_names, full_model.coef_[0]):
print(f"{name:12s}: {coef:7.4f}")
# Model performance
y_pred_full = full_model.predict_proba(X_full)[:, 1]
print(f"\nFull model AUC: {roc_auc_score(y_full, y_pred_full):.4f}")
# Compare to distance-only
X_dist = full_data[['kick_distance']]
distance_only = LogisticRegression().fit(X_dist, y_full)
y_pred_dist = distance_only.predict_proba(X_dist)[:, 1]
print(f"Distance-only AUC: {roc_auc_score(y_full, y_pred_dist):.4f}")
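The R chunk reports McFadden's pseudo-R²; the same quantity can be sketched in Python from log-losses. To keep this snippet runnable on its own it simulates stand-in data with invented coefficients—in the chapter's session you would instead reuse `y_full` and `y_pred_full` from above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Stand-in data so the sketch runs standalone (coefficients are invented)
rng = np.random.default_rng(0)
dist = rng.uniform(18, 65, size=5_000)
p_true = 1 / (1 + np.exp(-(5.7 - 0.10 * dist)))
made = rng.binomial(1, p_true)

model = LogisticRegression().fit(dist.reshape(-1, 1), made)
p_hat = model.predict_proba(dist.reshape(-1, 1))[:, 1]

# McFadden's R² = 1 - LL(model) / LL(null); log_loss returns -LL/n,
# so the ratio of mean log-losses gives the same quantity
loss_model = log_loss(made, p_hat)
loss_null = log_loss(made, np.full_like(p_hat, made.mean()))
mcfadden = 1 - loss_model / loss_null
print(f"McFadden's pseudo-R²: {mcfadden:.3f}")
```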
The probability-curve figure shown earlier confirms excellent model fit. The smooth red curve passes through the middle of the empirical data points across the entire distance range, with no systematic over- or under-prediction. The slight scatter around the curve represents the random variation we'd expect with finite sample sizes.
Notice how make probability declines rapidly between 40-50 yards, the range where most marginal kicking decisions occur. A 40-yard field goal has roughly 85% success probability, while a 50-yarder drops to about 65%—a 20 percentage point decline over just 10 yards. This steep gradient in the critical decision range makes accurate probability estimation particularly valuable for coaches.
Interpreting Logistic Regression Coefficients
In logistic regression, coefficients represent the change in log-odds for a one-unit increase in the predictor:

- **Negative coefficient**: Decreases make probability (e.g., distance, wind)
- **Positive coefficient**: Increases make probability (e.g., dome, warmer temperature)
- **Magnitude**: Larger absolute values indicate stronger effects

To convert to odds ratios: $\text{OR} = e^{\beta}$

For example, if the distance coefficient is -0.15, each additional yard multiplies the odds of success by $e^{-0.15} \approx 0.86$ (a 14% decrease).

**Practical Interpretation**: If you're trying to explain your model to coaches or executives who aren't statistically trained, focus on probability differences rather than log-odds. Instead of saying "the coefficient is -0.10," say "each additional yard reduces success probability by about 2-3 percentage points in the typical range."

Environmental and Situational Factors
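The odds-ratio arithmetic above is easy to verify directly, using the illustrative -0.15 coefficient from the text:

```python
import math

beta_dist = -0.15                     # illustrative coefficient from the text
odds_ratio = math.exp(beta_dist)
print(f"Odds ratio per extra yard: {odds_ratio:.2f}")   # ~0.86

# Translate the odds change into a percentage-point change near a
# typical make probability, e.g. an 80% attempt made one yard longer
p = 0.80
odds = p / (1 - p)
p_next = (odds * odds_ratio) / (1 + odds * odds_ratio)
print(f"{p:.0%} -> {p_next:.1%} ({(p_next - p) * 100:+.1f} pp per yard)")
```

The result, a drop of roughly 2.5 percentage points, matches the "about 2-3 percentage points" rule of thumb quoted above.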
While distance dominates field goal success, environmental and situational factors create measurable variations in kicking difficulty. Understanding these effects allows us to:
- Adjust expectations: A 45-yard kick in ideal conditions is easier than the same distance in freezing rain
- Evaluate kickers fairly: Kickers who play outdoors in harsh climates face tougher conditions than dome kickers
- Make better in-game decisions: Weather forecasts can inform whether to attempt or punt
- Value practice environment: Teams practicing outdoors in adverse weather may develop more weather-resistant kickers
The challenge in quantifying environmental effects is that they're correlated with each other and with distance. Cold weather often comes with wind. Outdoor stadiums in northern cities see both. We need multivariate models to isolate individual effects.
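The omitted-variable problem can be demonstrated with a synthetic sketch (every coefficient below is invented for illustration): when wind and temperature are correlated, a model that leaves temperature out blames wind for part of the cold effect:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 50_000

# Simulated conditions: colder games tend to be windier
dist = rng.uniform(20, 60, n)
temp = rng.normal(55, 15, n)
wind = 12 - 0.1 * (temp - 55) + rng.normal(0, 4, n)

# True (invented) effects: +0.01 per degree, -0.08 per mph of wind
log_odds = 5.7 - 0.10 * dist + 0.01 * temp - 0.08 * wind
made = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

omitted = LogisticRegression(max_iter=1000).fit(np.column_stack([dist, wind]), made)
joint = LogisticRegression(max_iter=1000).fit(np.column_stack([dist, wind, temp]), made)

print(f"Wind coefficient, temperature omitted:  {omitted.coef_[0][1]:+.3f}")
print(f"Wind coefficient, temperature included: {joint.coef_[0][1]:+.3f}")
```

The omitted-temperature fit overstates the wind penalty, while the joint model recovers something close to the true -0.08—exactly why the multivariate models in this section control for both factors at once.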
Weather Effects
Weather significantly impacts kicking success, but the effects are more nuanced than simple narratives suggest. Common wisdom holds that kickers struggle in cold weather and wind, but quantifying these effects precisely requires careful analysis.
Temperature affects field goal success through multiple mechanisms:
- Ball compression: Cold air makes the ball harder and less compressible, affecting flight dynamics
- Muscle function: Kickers' legs don't generate power as efficiently in cold temperatures
- Air density: Cold air is denser, creating more drag on the ball
- Equipment: Frozen ground can affect footing and plant leg stability
Let's examine how temperature impacts make rates:
#| label: weather-analysis-r
#| message: false
#| warning: false
# Analyze temperature effect
temp_effect <- fg_attempts %>%
filter(!is.na(temp), !is_dome) %>%
mutate(
temp_bin = cut(
temp,
breaks = c(-Inf, 32, 50, 70, Inf),
labels = c("Freezing (<32°F)", "Cold (32-50°F)",
"Mild (50-70°F)", "Warm (>70°F)")
)
) %>%
group_by(temp_bin) %>%
summarise(
attempts = n(),
make_rate = mean(made),
avg_distance = mean(kick_distance),
.groups = "drop"
)
temp_effect %>%
gt() %>%
cols_label(
temp_bin = "Temperature Range",
attempts = "Attempts",
make_rate = "Make Rate",
avg_distance = "Avg Distance"
) %>%
fmt_number(columns = avg_distance, decimals = 1) %>%
fmt_percent(columns = make_rate, decimals = 1) %>%
fmt_number(columns = attempts, decimals = 0, use_seps = TRUE)
#| label: weather-analysis-py
#| message: false
#| warning: false
# Analyze temperature effect
outdoor_kicks = fg_attempts.query("temp.notna() & is_dome == 0").copy()
outdoor_kicks['temp_bin'] = pd.cut(
outdoor_kicks['temp'],
bins=[-np.inf, 32, 50, 70, np.inf],
labels=['Freezing (<32°F)', 'Cold (32-50°F)',
'Mild (50-70°F)', 'Warm (>70°F)']
)
temp_effect = (outdoor_kicks
.groupby('temp_bin')
.agg(
attempts=('made', 'count'),
make_rate=('made', 'mean'),
avg_distance=('kick_distance', 'mean')
)
.reset_index()
)
print("Temperature Effect on Field Goals:")
print(temp_effect.to_string(index=False))
Wind Impact
#| label: fig-wind-effect-r
#| fig-cap: "Wind speed impact on field goal success by distance"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Prepare wind data
wind_data <- fg_attempts %>%
filter(!is.na(wind), !is_dome) %>%
mutate(
wind_category = cut(
wind,
breaks = c(0, 5, 10, 15, 100),
labels = c("Calm (0-5 mph)", "Light (5-10 mph)",
"Moderate (10-15 mph)", "Strong (15+ mph)"),
include.lowest = TRUE
),
distance_category = cut(
kick_distance,
breaks = c(0, 40, 50, 100),
labels = c("Short (<40)", "Medium (40-49)", "Long (50+)")
)
)
# Calculate make rates
wind_summary <- wind_data %>%
group_by(wind_category, distance_category) %>%
summarise(
attempts = n(),
make_rate = mean(made),
.groups = "drop"
) %>%
filter(attempts >= 10)
# Create grouped bar chart
ggplot(wind_summary, aes(x = wind_category, y = make_rate, fill = distance_category)) +
geom_col(position = "dodge", alpha = 0.8) +
geom_text(
aes(label = scales::percent(make_rate, accuracy = 1)),
position = position_dodge(width = 0.9),
vjust = -0.5,
size = 3
) +
scale_y_continuous(
labels = percent_format(),
limits = c(0, 1),
expand = expansion(mult = c(0, 0.1))
) +
scale_fill_manual(
values = c("#2ECC71", "#F39C12", "#E74C3C")
) +
labs(
title = "Wind Speed Impact on Field Goal Success",
subtitle = "Grouped by kick distance | 2018-2023 seasons (outdoor games only)",
x = "Wind Speed",
y = "Make Rate",
fill = "Distance",
caption = "Data: nflfastR | Minimum 10 attempts per group"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
legend.position = "top",
axis.text.x = element_text(angle = 0)
)
#| label: fig-wind-effect-py
#| fig-cap: "Wind speed impact on field goal success - Python"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Prepare wind data
outdoor_kicks = fg_attempts.query("wind.notna() & is_dome == 0").copy()
outdoor_kicks['wind_category'] = pd.cut(
outdoor_kicks['wind'],
bins=[0, 5, 10, 15, 100],
labels=['Calm (0-5 mph)', 'Light (5-10 mph)',
'Moderate (10-15 mph)', 'Strong (15+ mph)'],
include_lowest=True
)
outdoor_kicks['distance_category'] = pd.cut(
outdoor_kicks['kick_distance'],
bins=[0, 40, 50, 100],
labels=['Short (<40)', 'Medium (40-49)', 'Long (50+)']
)
# Calculate make rates
wind_summary = (outdoor_kicks
.groupby(['wind_category', 'distance_category'])
.agg(
attempts=('made', 'count'),
make_rate=('made', 'mean')
)
.query('attempts >= 10')
.reset_index()
)
# Create grouped bar chart
fig, ax = plt.subplots(figsize=(10, 7))
wind_cats = wind_summary['wind_category'].unique()
dist_cats = ['Short (<40)', 'Medium (40-49)', 'Long (50+)']
colors = ['#2ECC71', '#F39C12', '#E74C3C']
x = np.arange(len(wind_cats))
width = 0.25
for i, (dist_cat, color) in enumerate(zip(dist_cats, colors)):
data = wind_summary[wind_summary['distance_category'] == dist_cat]
rates = [data[data['wind_category'] == wc]['make_rate'].values[0]
if len(data[data['wind_category'] == wc]) > 0 else 0
for wc in wind_cats]
bars = ax.bar(x + i*width, rates, width, label=dist_cat,
color=color, alpha=0.8)
# Add labels
for j, (bar, rate) in enumerate(zip(bars, rates)):
if rate > 0:
ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.02,
f'{rate:.0%}', ha='center', va='bottom', fontsize=9)
ax.set_xlabel('Wind Speed', fontsize=12)
ax.set_ylabel('Make Rate', fontsize=12)
ax.set_title('Wind Speed Impact on Field Goal Success\n2018-2023 seasons (outdoor games only)',
fontsize=14, fontweight='bold')
ax.set_xticks(x + width)
ax.set_xticklabels(wind_cats)
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda y, _: f'{y:.0%}'))
ax.set_ylim(0, 1)
ax.legend(title='Distance', loc='upper right')
ax.grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()
Dome vs Outdoor
#| label: dome-analysis-r
#| message: false
#| warning: false
# Compare dome vs outdoor performance
venue_comparison <- fg_attempts %>%
mutate(
venue_type = ifelse(is_dome, "Dome", "Outdoor"),
distance_category = cut(
kick_distance,
breaks = c(0, 39, 49, 100),
labels = c("Short (20-39)", "Medium (40-49)", "Long (50+)")
)
) %>%
group_by(venue_type, distance_category) %>%
summarise(
attempts = n(),
makes = sum(made),
make_rate = mean(made),
.groups = "drop"
)
venue_comparison %>%
pivot_wider(
names_from = venue_type,
values_from = c(attempts, make_rate),
names_glue = "{venue_type}_{.value}"
) %>%
mutate(
difference = Dome_make_rate - Outdoor_make_rate
) %>%
gt() %>%
cols_label(
distance_category = "Distance",
Dome_attempts = "Attempts",
Dome_make_rate = "Make Rate",
Outdoor_attempts = "Attempts",
Outdoor_make_rate = "Make Rate",
difference = "Difference"
) %>%
tab_spanner(
label = "Dome",
columns = starts_with("Dome_")
) %>%
tab_spanner(
label = "Outdoor",
columns = starts_with("Outdoor_")
) %>%
fmt_percent(columns = contains("make_rate"), decimals = 1) %>%
fmt_percent(columns = difference, decimals = 1) %>%
fmt_number(columns = contains("attempts"), decimals = 0, use_seps = TRUE) %>%
data_color(
columns = difference,
colors = scales::col_numeric(
palette = c("#E74C3C", "white", "#2ECC71"),
domain = c(-0.1, 0.1)
)
)
#| label: dome-analysis-py
#| message: false
#| warning: false
# Compare dome vs outdoor
comparison_data = fg_attempts.copy()
comparison_data['venue_type'] = comparison_data['is_dome'].map({1: 'Dome', 0: 'Outdoor'})
comparison_data['distance_category'] = pd.cut(
comparison_data['kick_distance'],
bins=[0, 39, 49, 100],
labels=['Short (20-39)', 'Medium (40-49)', 'Long (50+)']
)
venue_comp = (comparison_data
.groupby(['venue_type', 'distance_category'])
.agg(
attempts=('made', 'count'),
make_rate=('made', 'mean')
)
.reset_index()
)
# Pivot for comparison
pivot = venue_comp.pivot(
index='distance_category',
columns='venue_type',
values='make_rate'
)
pivot['difference'] = pivot['Dome'] - pivot['Outdoor']
print("\nDome vs Outdoor Make Rates:")
print(pivot.to_string())
Expected Points from Field Goal Attempts
Expected Points Added (EPA) allows us to value field goal attempts in terms of their impact on scoring. While probability models tell us the likelihood of success, EPA frameworks translate that into the more meaningful currency of expected points—the metric that ultimately determines game outcomes.
This conversion from probability to expected points is crucial for decision-making. A 60% make probability sounds decent, but if a miss hands the opponent excellent field position, the attempt's expected value can fall below that of a punt. We need to account for both the upside of making the kick and the downside of missing it.
Calculating Field Goal EPA
The expected points from a field goal attempt depends on three key components, each of which we can quantify:
- Make probability: Based on distance and conditions (from our probability models)
- Points if made: Always 3 points in the NFL
- Field position if missed: Where the opponent takes over matters enormously
The third component is often overlooked but critically important. On a missed 50-yard field goal attempt from the opponent's 33-yard line, the opponent typically takes possession around their own 40-yard line (the spot of the kick, approximately the line of scrimmage plus 7-8 yards for the snap and hold). This is much better field position than they'd get from a punt, which might pin them inside their 20.
Conversely, on a missed field goal from closer range, the opponent takes over at a less advantageous position. This asymmetry means that the expected value of attempting a field goal varies not just with make probability but also with field position.
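This field-position bookkeeping is simple arithmetic. A quick sketch, using the chapter's 50-yard example:

```python
# Field goal distance and miss-spot arithmetic for an attempt
# with the line of scrimmage at the opponent's 33-yard line.
yard_line = 33  # yards to the opponent's goal line

# Kick distance = yards to goal + 10-yard end zone + 7-yard snap/hold
kick_distance = yard_line + 17
print(kick_distance)  # -> 50

# On a miss, the opponent takes over at the spot of the kick (LOS + 7)
takeover_spot = yard_line + 7
print(takeover_spot)  # -> 40, i.e., their own 40-yard line
```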
$$ \text{EP}_{\text{FG}} = P(\text{Make}) \times 3 - P(\text{Miss}) \times \text{EP}_{\text{opponent}} $$
where $\text{EP}_{\text{opponent}}$ is the opponent's expected points from the field position they inherit after a miss.
#| label: fg-expected-points-r
#| message: false
#| warning: false
# Calculate expected points for field goal attempts
calculate_fg_ep <- function(distance, yard_line) {
# yard_line = yards from the opponent's goal line (yardline_100)
# Predict make probability
make_prob <- predict_fg_prob(distance)
# Points if made
points_if_made <- 3
# Field position if missed: opponent takes over at the spot of the
# kick (yard_line + 7), or at their own 20 if the spot was inside it.
# Expressed as the opponent's distance to the goal they attack:
opponent_yard_line <- pmin(100 - (yard_line + 7), 80)
# Expected points for opponent from that position (simplified)
# Using rough EP values
ep_opponent <- case_when(
opponent_yard_line >= 95 ~ 0.0,
opponent_yard_line >= 85 ~ 0.5,
opponent_yard_line >= 75 ~ 1.0,
opponent_yard_line >= 65 ~ 1.5,
opponent_yard_line >= 55 ~ 2.0,
opponent_yard_line >= 45 ~ 2.5,
TRUE ~ 3.0
)
# Expected points from FG attempt (negative opponent EP)
ep_fg <- make_prob * points_if_made - (1 - make_prob) * ep_opponent
return(ep_fg)
}
# Calculate for various distances
fg_ep_table <- tibble(
yard_line = seq(20, 50, by = 5)
) %>%
mutate(
kick_distance = yard_line + 17, # 10-yard end zone + 7-yard snap/hold
make_prob = predict_fg_prob(kick_distance),
ep_fg = calculate_fg_ep(kick_distance, yard_line)
)
fg_ep_table %>%
gt() %>%
cols_label(
yard_line = "Yard Line",
kick_distance = "Distance",
make_prob = "Make Prob",
ep_fg = "EP from FG"
) %>%
fmt_number(columns = c(yard_line, kick_distance), decimals = 0) %>%
fmt_percent(columns = make_prob, decimals = 1) %>%
fmt_number(columns = ep_fg, decimals = 2)
#| label: fg-expected-points-py
#| message: false
#| warning: false
def calculate_fg_ep(distance, yard_line):
"""Calculate expected points from a field goal attempt.
yard_line = yards from the opponent's goal line (yardline_100)."""
# Predict make probability
make_prob = distance_model.predict_proba([[distance]])[0, 1]
# Points if made
points_if_made = 3
# Field position if missed: opponent takes over at the spot of the
# kick (yard_line + 7), or at their own 20 if the spot was inside it.
# Expressed as the opponent's distance to the goal they attack:
opponent_yard_line = min(100 - (yard_line + 7), 80)
# Simplified EP for opponent
if opponent_yard_line >= 95:
ep_opponent = 0.0
elif opponent_yard_line >= 85:
ep_opponent = 0.5
elif opponent_yard_line >= 75:
ep_opponent = 1.0
elif opponent_yard_line >= 65:
ep_opponent = 1.5
elif opponent_yard_line >= 55:
ep_opponent = 2.0
elif opponent_yard_line >= 45:
ep_opponent = 2.5
else:
ep_opponent = 3.0
# Expected points from FG
ep_fg = make_prob * points_if_made - (1 - make_prob) * ep_opponent
return ep_fg, make_prob
# Calculate for various distances
yard_lines = np.arange(20, 55, 5)
results = []
for yl in yard_lines:
distance = yl + 17  # 10-yard end zone + 7-yard snap/hold
ep, prob = calculate_fg_ep(distance, yl)
results.append({
'yard_line': yl,
'kick_distance': distance,
'make_prob': prob,
'ep_fg': ep
})
fg_ep_table = pd.DataFrame(results)
print("\nExpected Points from Field Goal Attempts:")
print(fg_ep_table.to_string(index=False))
Visualizing Field Goal Value
#| label: fig-fg-value-r
#| fig-cap: "Expected points from field goal attempts by field position"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Generate comprehensive EP curve
fg_value_curve <- tibble(
yard_line = seq(10, 60, by = 1)
) %>%
mutate(
kick_distance = yard_line + 17,
make_prob = predict_fg_prob(kick_distance),
ep_fg = calculate_fg_ep(kick_distance, yard_line)
)
# Create visualization
ggplot(fg_value_curve, aes(x = yard_line, y = ep_fg)) +
geom_line(color = "#3498DB", size = 1.5) +
geom_hline(yintercept = 0, linetype = "dashed", color = "red", alpha = 0.5) +
geom_vline(xintercept = c(20, 30, 40, 50), linetype = "dotted", alpha = 0.3) +
annotate(
"rect",
xmin = 25, xmax = 35, ymin = -Inf, ymax = Inf,
alpha = 0.1, fill = "#2ECC71"
) +
annotate(
"text",
x = 30, y = 2.5,
label = "Optimal FG\nRange",
size = 3.5,
fontface = "italic"
) +
scale_x_reverse(breaks = seq(10, 60, 10)) +
scale_y_continuous(breaks = seq(-1, 3, 0.5)) +
labs(
title = "Expected Points from Field Goal Attempts",
subtitle = "By field position | Positive values favor attempting FG",
x = "Yard Line (opponent's territory)",
y = "Expected Points",
caption = "Data: nflfastR 2018-2023 | Does not account for game situation"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
panel.grid.minor = element_blank()
)
#| label: fig-fg-value-py
#| fig-cap: "Expected points from field goal attempts - Python"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Generate EP curve
yard_lines = np.arange(10, 61, 1)
ep_values = []
make_probs = []
for yl in yard_lines:
distance = yl + 17
ep, prob = calculate_fg_ep(distance, yl)
ep_values.append(ep)
make_probs.append(prob)
# Create plot
fig, ax = plt.subplots(figsize=(10, 7))
ax.plot(yard_lines, ep_values, color='#3498DB', linewidth=2.5)
ax.axhline(y=0, color='red', linestyle='--', alpha=0.5)
for x in [20, 30, 40, 50]:
ax.axvline(x=x, color='gray', linestyle=':', alpha=0.3)
# Highlight optimal range
ax.axvspan(25, 35, alpha=0.1, color='#2ECC71')
ax.text(30, 2.5, 'Optimal FG\nRange', ha='center',
fontsize=10, style='italic')
ax.set_xlabel('Yard Line (opponent\'s territory)', fontsize=12)
ax.set_ylabel('Expected Points', fontsize=12)
ax.set_title('Expected Points from Field Goal Attempts\nPositive values favor attempting FG',
fontsize=14, fontweight='bold')
ax.invert_xaxis()
ax.grid(True, alpha=0.3)
ax.text(0.98, 0.02, 'Data: nfl_data_py 2018-2023',
transform=ax.transAxes, ha='right', fontsize=8, style='italic')
plt.tight_layout()
plt.show()
Kicker Evaluation Frameworks
Evaluating kickers properly requires comparing performance to expectations rather than raw statistics. A kicker who makes 80% of attempts from long distance in harsh weather is more valuable than one who makes 90% from short range in a dome, even though the second kicker has a higher raw percentage.
The key insight is to use our probability models to generate expectations for each kick attempt, then compare actual outcomes to these expectations. This approach automatically adjusts for:
- Kick difficulty: Longer kicks have lower expected make rates
- Environmental conditions: Weather and venue affect expectations
- Sample size: More attempts provide more reliable estimates
- Era effects: League-wide trends are baked into the baseline model
Field Goals Over Expected (FGOE)
Similar to passing yards over expected, we can calculate how many field goals a kicker made versus expectation. This metric aggregates performance across all attempts, giving appropriate credit for difficult kicks and appropriate blame for missed easy ones.
The formula is straightforward:
$$ \text{FGOE} = \sum_{i=1}^{n} (\text{Made}_i - P(\text{Make}_i)) $$
For each kick, we subtract the expected make probability from the actual outcome (1 if made, 0 if missed). A kicker who makes a 55-yard field goal with a 50% make probability earns +0.5 FGOE. A kicker who misses a 35-yard attempt with a 90% make probability records -0.9 FGOE.
Summing across all attempts gives us a single number representing total performance above or below expectations. A kicker with +5 FGOE made 5 more field goals than expected given their attempt difficulty. This metric is much more informative than raw field goal percentage.
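As a toy illustration (the make probabilities here are made up, not model output), FGOE is just the sum of outcome minus expectation over a kicker's attempts:

```python
# FGOE on five hypothetical kicks: outcome (1 = made) minus the
# expected make probability, summed across attempts.
made = [1, 1, 0, 1, 0]
exp_p = [0.95, 0.50, 0.90, 0.75, 0.40]

fgoe = sum(m - p for m, p in zip(made, exp_p))
print(f"FGOE: {fgoe:+.2f}")  # -> FGOE: -0.50
```

Making the 50% kick earns +0.50 while missing the 90% kick costs 0.90, so despite going 3 for 5 this kicker sits half a field goal below expectation.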
Best Practice: Minimum Sample Sizes for Kicker Evaluation
When using FGOE to evaluate kickers, always consider sample size:

- **1 season (25-35 attempts)**: Large uncertainty, use cautiously
- **2-3 seasons (50-100 attempts)**: Moderate confidence, trends emerging
- **4+ seasons (100+ attempts)**: High confidence, true skill evident

For roster decisions, combine multi-year FGOE with recent performance trends. A kicker with +10 FGOE over three years but -3 FGOE this year may be declining, while one with +2 overall but +3 this year may be improving.
#| label: kicker-evaluation-r
#| message: false
#| warning: false
# Calculate kicker performance metrics
kicker_stats <- fg_attempts %>%
filter(!is.na(kicker_player_name), !is.na(kick_distance)) %>%
mutate(
expected_make = predict_fg_prob(kick_distance)
) %>%
group_by(kicker_player_name) %>%
summarise(
attempts = n(),
makes = sum(made),
fg_pct = mean(made),
expected_makes = sum(expected_make),
expected_pct = mean(expected_make),
fgoe = sum(made - expected_make),
fgoe_per_att = fgoe / attempts,
avg_distance = mean(kick_distance),
.groups = "drop"
) %>%
filter(attempts >= 50) %>% # Minimum sample size
arrange(desc(fgoe))
# Top 10 kickers
kicker_stats %>%
head(10) %>%
gt() %>%
cols_label(
kicker_player_name = "Kicker",
attempts = "Att",
makes = "Made",
fg_pct = "FG%",
expected_pct = "Exp%",
fgoe = "FGOE",
fgoe_per_att = "FGOE/Att",
avg_distance = "Avg Dist"
) %>%
fmt_number(columns = c(attempts, makes), decimals = 0) %>%
fmt_percent(columns = c(fg_pct, expected_pct, fgoe_per_att), decimals = 1) %>%
fmt_number(columns = c(fgoe, avg_distance), decimals = 1) %>%
data_color(
columns = fgoe,
colors = scales::col_numeric(
palette = c("white", "#2ECC71"),
domain = c(0, max(head(kicker_stats$fgoe, 10)))
)
) %>%
tab_header(
title = "Top Field Goal Kickers (2018-2023)",
subtitle = "Minimum 50 attempts | Ranked by Field Goals Over Expected"
)
#| label: kicker-evaluation-py
#| message: false
#| warning: false
# Calculate kicker metrics
kicker_data = model_data.copy()
kicker_data['expected_make'] = distance_model.predict_proba(
kicker_data[['kick_distance']]
)[:, 1]
kicker_stats = (kicker_data
.groupby('kicker_player_name')
.agg(
attempts=('made', 'count'),
makes=('made', 'sum'),
fg_pct=('made', 'mean'),
expected_makes=('expected_make', 'sum'),
expected_pct=('expected_make', 'mean'),
avg_distance=('kick_distance', 'mean')
)
.assign(
fgoe=lambda x: x['makes'] - x['expected_makes'],
fgoe_per_att=lambda x: x['fgoe'] / x['attempts']
)
.query('attempts >= 50')
.sort_values('fgoe', ascending=False)
.reset_index()
)
print("\nTop 10 Kickers by FGOE (2018-2023):")
print(kicker_stats.head(10).to_string(index=False))
Consistency Metrics
Beyond average performance, consistency matters:
#| label: kicker-consistency-r
#| message: false
#| warning: false
# Calculate year-to-year consistency
kicker_by_season <- fg_attempts %>%
filter(!is.na(kicker_player_name), !is.na(kick_distance)) %>%
mutate(expected_make = predict_fg_prob(kick_distance)) %>%
group_by(kicker_player_name, season) %>%
summarise(
attempts = n(),
fgoe = sum(made - expected_make),
.groups = "drop"
) %>%
filter(attempts >= 10)
# Calculate standard deviation of FGOE across seasons
kicker_consistency <- kicker_by_season %>%
group_by(kicker_player_name) %>%
summarise(
seasons = n(),
total_attempts = sum(attempts),
mean_fgoe = mean(fgoe),
sd_fgoe = sd(fgoe),
.groups = "drop"
) %>%
filter(seasons >= 3, total_attempts >= 75) %>%
arrange(sd_fgoe)
# Most consistent kickers
kicker_consistency %>%
head(10) %>%
gt() %>%
cols_label(
kicker_player_name = "Kicker",
seasons = "Seasons",
total_attempts = "Total Att",
mean_fgoe = "Avg FGOE",
sd_fgoe = "SD of FGOE"
) %>%
fmt_number(columns = c(seasons, total_attempts), decimals = 0) %>%
fmt_number(columns = c(mean_fgoe, sd_fgoe), decimals = 2) %>%
tab_header(
title = "Most Consistent Kickers",
subtitle = "Lowest year-to-year variance in FGOE | Min 3 seasons, 75 attempts"
)
#| label: kicker-consistency-py
#| message: false
#| warning: false
# Calculate consistency
kicker_data['expected_make'] = distance_model.predict_proba(
kicker_data[['kick_distance']]
)[:, 1]
# Named aggregation needs (column, function) pairs, so compute
# each kick's over-expectation first
kicker_data['fgoe_kick'] = kicker_data['made'] - kicker_data['expected_make']
kicker_by_season = (kicker_data
.groupby(['kicker_player_name', 'season'])
.agg(
attempts=('made', 'count'),
fgoe=('fgoe_kick', 'sum')
)
.query('attempts >= 10')
.reset_index()
)
kicker_consistency = (kicker_by_season
.groupby('kicker_player_name')
.agg(
seasons=('season', 'nunique'),
total_attempts=('attempts', 'sum'),
mean_fgoe=('fgoe', 'mean'),
sd_fgoe=('fgoe', 'std')
)
.query('seasons >= 3 & total_attempts >= 75')
.sort_values('sd_fgoe')
.reset_index()
)
print("\nMost Consistent Kickers (Lowest FGOE variance):")
print(kicker_consistency.head(10).to_string(index=False))
Clutch Kicking Performance
Does clutch kicking ability exist? Let's investigate:
Defining Clutch Situations
Clutch kicks typically involve:
- 4th quarter or overtime
- Score within 3 points (one possession game)
- High win probability leverage
#| label: clutch-analysis-r
#| message: false
#| warning: false
# Define clutch situations
clutch_kicks <- fg_attempts %>%
filter(!is.na(kicker_player_name), !is.na(kick_distance)) %>%
mutate(
expected_make = predict_fg_prob(kick_distance),
is_clutch = (qtr >= 4 & abs(score_differential) <= 3) |
(game_seconds_remaining <= 300 & abs(score_differential) <= 7)
)
# Compare clutch vs non-clutch performance
clutch_comparison <- clutch_kicks %>%
group_by(is_clutch) %>%
summarise(
attempts = n(),
makes = sum(made),
make_pct = mean(made),
expected_pct = mean(expected_make),
fgoe = sum(made - expected_make),
avg_distance = mean(kick_distance),
.groups = "drop"
)
clutch_comparison %>%
mutate(situation = ifelse(is_clutch, "Clutch", "Non-Clutch")) %>%
select(-is_clutch) %>%
gt() %>%
cols_label(
situation = "Situation",
attempts = "Attempts",
makes = "Makes",
make_pct = "Make %",
expected_pct = "Expected %",
fgoe = "FGOE",
avg_distance = "Avg Distance"
) %>%
fmt_number(columns = c(attempts, makes), decimals = 0, use_seps = TRUE) %>%
fmt_percent(columns = c(make_pct, expected_pct), decimals = 1) %>%
fmt_number(columns = c(fgoe, avg_distance), decimals = 1)
#| label: clutch-analysis-py
#| message: false
#| warning: false
# Define clutch situations
clutch_data = model_data.copy()
clutch_data['expected_make'] = distance_model.predict_proba(
clutch_data[['kick_distance']]
)[:, 1]
clutch_data['is_clutch'] = (
((clutch_data['qtr'] >= 4) & (clutch_data['score_differential'].abs() <= 3)) |
((clutch_data['game_seconds_remaining'] <= 300) &
(clutch_data['score_differential'].abs() <= 7))
)
# Compare performance
clutch_comparison = (clutch_data
.groupby('is_clutch')
.agg(
attempts=('made', 'count'),
makes=('made', 'sum'),
make_pct=('made', 'mean'),
expected_pct=('expected_make', 'mean'),
avg_distance=('kick_distance', 'mean')
)
.assign(
fgoe=lambda x: x['makes'] - x['attempts'] * x['expected_pct']
)
.reset_index()
)
clutch_comparison['situation'] = clutch_comparison['is_clutch'].map(
{True: 'Clutch', False: 'Non-Clutch'}
)
print("\nClutch vs Non-Clutch Performance:")
print(clutch_comparison[['situation', 'attempts', 'makes', 'make_pct',
'expected_pct', 'fgoe', 'avg_distance']].to_string(index=False))
Individual Clutch Performance
#| label: fig-clutch-kickers-r
#| fig-cap: "Clutch vs overall kicker performance"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false
# Calculate clutch vs regular performance for each kicker
kicker_clutch <- clutch_kicks %>%
group_by(kicker_player_name, is_clutch) %>%
summarise(
attempts = n(),
fgoe = sum(made - expected_make),
fgoe_per_att = fgoe / attempts,
.groups = "drop"
) %>%
pivot_wider(
names_from = is_clutch,
values_from = c(attempts, fgoe_per_att),
names_prefix = "clutch_"
) %>%
filter(
!is.na(fgoe_per_att_clutch_FALSE), !is.na(fgoe_per_att_clutch_TRUE),
attempts_clutch_TRUE >= 10,
attempts_clutch_FALSE >= 30
)
# Create scatter plot
ggplot(kicker_clutch, aes(x = `fgoe_per_att_clutch_FALSE`,
y = `fgoe_per_att_clutch_TRUE`)) +
geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray50") +
geom_point(alpha = 0.6, size = 3, color = "#3498DB") +
geom_smooth(method = "lm", se = TRUE, color = "#E74C3C", fill = "#E74C3C", alpha = 0.2) +
labs(
title = "Clutch vs Overall Kicker Performance",
subtitle = "Each point represents a kicker | Min 10 clutch attempts, 30 regular attempts",
x = "FGOE per Attempt (Non-Clutch)",
y = "FGOE per Attempt (Clutch)",
caption = "Data: nflfastR 2018-2023 | Dashed line = equal performance"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14)
)
# Calculate correlation
cor_test <- cor.test(kicker_clutch$`fgoe_per_att_clutch_FALSE`,
kicker_clutch$`fgoe_per_att_clutch_TRUE`)
cat("\nCorrelation between clutch and non-clutch performance:",
round(cor_test$estimate, 3), "\n")
cat("P-value:", format.pval(cor_test$p.value, digits = 3), "\n")
#| label: fig-clutch-kickers-py
#| fig-cap: "Clutch vs overall kicker performance - Python"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false
from scipy.stats import pearsonr
# Calculate by kicker
# Precompute per-kick FGOE (named aggregation needs column-function pairs)
clutch_data['fgoe_kick'] = clutch_data['made'] - clutch_data['expected_make']
kicker_clutch = (clutch_data
.groupby(['kicker_player_name', 'is_clutch'])
.agg(
attempts=('made', 'count'),
fgoe=('fgoe_kick', 'sum')
)
.assign(fgoe_per_att=lambda x: x['fgoe'] / x['attempts'])
.reset_index()
.pivot(index='kicker_player_name', columns='is_clutch',
values=['attempts', 'fgoe_per_att'])
)
kicker_clutch.columns = ['_'.join([str(c) for c in col]).strip('_')
for col in kicker_clutch.columns.values]
kicker_clutch = kicker_clutch.reset_index()
# Filter for sufficient sample
kicker_clutch_filtered = kicker_clutch.query(
'attempts_True >= 10 & attempts_False >= 30'
).dropna()
# Create scatter plot
fig, ax = plt.subplots(figsize=(10, 8))
ax.scatter(kicker_clutch_filtered['fgoe_per_att_False'],
kicker_clutch_filtered['fgoe_per_att_True'],
alpha=0.6, s=50, color='#3498DB')
# Add regression line
x = kicker_clutch_filtered['fgoe_per_att_False']
y = kicker_clutch_filtered['fgoe_per_att_True']
z = np.polyfit(x, y, 1)
p = np.poly1d(z)
ax.plot(x, p(x), color='#E74C3C', linewidth=2, label='Trend line')
# Equal performance line
lims = [
np.min([ax.get_xlim(), ax.get_ylim()]),
np.max([ax.get_xlim(), ax.get_ylim()]),
]
ax.plot(lims, lims, 'k--', alpha=0.5, zorder=0, label='Equal performance')
ax.set_xlabel('FGOE per Attempt (Non-Clutch)', fontsize=12)
ax.set_ylabel('FGOE per Attempt (Clutch)', fontsize=12)
ax.set_title('Clutch vs Overall Kicker Performance\nMin 10 clutch attempts, 30 regular attempts',
fontsize=14, fontweight='bold')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Calculate correlation
corr, p_val = pearsonr(x, y)
print(f"\nCorrelation: {corr:.3f}")
print(f"P-value: {p_val:.3f}")
The Clutch Kicking Myth
Research suggests that "clutch" kicking ability may be largely a myth. The correlation between clutch and non-clutch performance is typically strong, suggesting that good kickers are simply good in all situations. Small sample sizes and randomness create the illusion of clutch performance, but it rarely persists across multiple seasons.
Extra Point Analytics
Extra points represent one of the most interesting strategic evolutions in modern football. For decades, extra points were nearly automatic—kickers made them at 99%+ rates. They were so routine that fans barely paid attention. The 2015 rule change transformed this boring formality into a meaningful strategic decision.
Understanding extra point analytics matters because the decisions around one-point versus two-point attempts can swing close games. In a league where many games are decided by three points or fewer, optimal extra point strategy can be the difference between playoffs and missing out.
The 2015 Rule Change
In 2015, the NFL moved extra point attempts from the 2-yard line to the 15-yard line, changing them from 19-yard kicks to 33-yard kicks. This dramatically altered the trade-off between kicking and going for two.
The Strategic Motivation:
The NFL made this change to:
1. Increase drama: Make extra points meaningful rather than automatic
2. Create strategic choices: Make two-point conversions more attractive
3. Value kicking skill: Differentiate kickers based on their accuracy
4. Add excitement: Create more consequential plays after touchdowns
The rule achieved these goals—extra point make rates dropped from 99.3% pre-2015 to around 94% post-2015, a seemingly small change that translates to about 50-60 additional missed extra points per season league-wide.
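The league-wide arithmetic behind that estimate is straightforward. A back-of-the-envelope sketch, where the attempt count is an assumed round figure of roughly 1,200 extra points per NFL season:

```python
# Approximate extra missed XPs per season after the 2015 rule change.
attempts_per_season = 1200  # assumed league-wide XP attempts (round figure)
pre_2015_rate = 0.993
post_2015_rate = 0.94

extra_misses = attempts_per_season * (pre_2015_rate - post_2015_rate)
print(f"Additional misses per season: ~{extra_misses:.0f}")  # -> ~64
```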
#| label: extra-point-analysis-r
#| message: false
#| warning: false
# Load data including earlier seasons for comparison
pbp_xp <- load_pbp(2012:2023)
# Extract extra point attempts
extra_points <- pbp_xp %>%
filter(extra_point_attempt == 1, !is.na(extra_point_result)) %>%
mutate(
made = ifelse(extra_point_result == "good", 1, 0),
post_change = ifelse(season >= 2015, "Post-2015", "Pre-2015")
) %>%
select(season, post_change, made, posteam, kicker_player_name)
# Compare rates
xp_rates <- extra_points %>%
group_by(post_change) %>%
summarise(
attempts = n(),
makes = sum(made),
make_rate = mean(made),
.groups = "drop"
)
xp_rates %>%
gt() %>%
cols_label(
post_change = "Era",
attempts = "Attempts",
makes = "Makes",
make_rate = "Make Rate"
) %>%
fmt_number(columns = c(attempts, makes), decimals = 0, use_seps = TRUE) %>%
fmt_percent(columns = make_rate, decimals = 2) %>%
tab_header(
title = "Extra Point Success Rates",
subtitle = "Before and after 2015 rule change"
)
#| label: extra-point-analysis-py
#| message: false
#| warning: false
# Load data for comparison
pbp_xp = nfl.import_pbp_data(range(2012, 2024))
# Extract extra points
extra_points = (pbp_xp
.query("extra_point_attempt == 1 & extra_point_result.notna()")
.assign(
made=lambda x: (x['extra_point_result'] == 'good').astype(int),
post_change=lambda x: x['season'].apply(
lambda s: 'Post-2015' if s >= 2015 else 'Pre-2015'
)
)
[['season', 'post_change', 'made', 'posteam', 'kicker_player_name']]
)
# Compare rates
xp_rates = (extra_points
.groupby('post_change')
.agg(
attempts=('made', 'count'),
makes=('made', 'sum'),
make_rate=('made', 'mean')
)
.reset_index()
)
print("\nExtra Point Success Rates:")
print(xp_rates.to_string(index=False))
One-Point vs Two-Point Decision
When should teams go for two instead of kicking? This question has become increasingly relevant since the rule change. The answer depends on expected value in neutral situations, but also on game context, team strengths, and win probability considerations.
Expected Value in Neutral Situations:
Using league-wide success rates:
- Extra point: ~94% × 1 point = 0.94 points (post-2015)
- Two-point conversion: ~48% × 2 points = 0.96 points
The two-point conversion has a slightly higher expected value in neutral situations! This surprising result suggests teams should consider going for two more often than they do. However, several factors complicate this simple calculation:
- Risk aversion: The variance of the two-point attempt is much higher
- Score management: Specific score situations may favor one option
- Team strengths: Some teams have better red zone offenses or worse kickers
- Opponent defense: Red zone defense quality matters for two-point attempts
- Win probability: Maximizing expected points isn't always optimal for winning
Despite the higher expected value of two-point attempts, most teams still kick extra points in neutral situations. This likely reflects some combination of risk aversion and incomplete strategic optimization.
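The break-even point in this comparison is easy to derive: going for two has higher expected value whenever the two-point success rate exceeds half the extra point rate. A minimal sketch using the approximate league-wide rates above:

```python
# Expected-value comparison: extra point vs two-point conversion.
p_xp = 0.94   # approximate post-2015 extra point make rate
p_two = 0.48  # approximate league-wide two-point success rate

ev_kick = p_xp * 1
ev_go = p_two * 2
break_even = p_xp / 2  # two-point rate needed to match the kick

print(f"EV kick: {ev_kick:.2f} | EV go: {ev_go:.2f}")  # -> 0.94 | 0.96
print(f"Break-even 2-pt rate: {break_even:.0%}")       # -> 47%
```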
#| label: two-point-analysis-r
#| message: false
#| warning: false
# Analyze two-point conversions
two_point <- pbp %>%
filter(two_point_attempt == 1, !is.na(two_point_conv_result)) %>%
mutate(
success = ifelse(two_point_conv_result == "success", 1, 0)
)
# Overall success rate
two_pt_rate <- mean(two_point$success, na.rm = TRUE)
# Expected values
xp_rate_post_2015 <- extra_points %>%
filter(post_change == "Post-2015") %>%
pull(made) %>%
mean()
ev_comparison <- tibble(
option = c("Extra Point", "Two-Point Conversion"),
success_rate = c(xp_rate_post_2015, two_pt_rate),
points_if_success = c(1, 2),
expected_value = success_rate * points_if_success
)
ev_comparison %>%
gt() %>%
cols_label(
option = "Option",
success_rate = "Success Rate",
points_if_success = "Points",
expected_value = "Expected Value"
) %>%
fmt_percent(columns = success_rate, decimals = 1) %>%
fmt_number(columns = c(points_if_success, expected_value), decimals = 2) %>%
data_color(
columns = expected_value,
colors = scales::col_numeric(
palette = c("white", "#2ECC71"),
domain = range(ev_comparison$expected_value)
)
) %>%
tab_header(
title = "Extra Point vs Two-Point Conversion",
subtitle = "Expected value comparison (post-2015)"
)
#| label: two-point-analysis-py
#| message: false
#| warning: false
# Two-point conversions
two_point = pbp.query(
"two_point_attempt == 1 & two_point_conv_result.notna()"
).assign(
success=lambda x: (x['two_point_conv_result'] == 'success').astype(int)
)
two_pt_rate = two_point['success'].mean()
xp_rate_post = extra_points.query("post_change == 'Post-2015'")['made'].mean()
# Expected values
ev_comparison = pd.DataFrame({
'option': ['Extra Point', 'Two-Point Conversion'],
'success_rate': [xp_rate_post, two_pt_rate],
'points_if_success': [1, 2]
})
ev_comparison['expected_value'] = (
ev_comparison['success_rate'] * ev_comparison['points_if_success']
)
print("\nExpected Value Comparison:")
print(ev_comparison.to_string(index=False))
Field Goal vs Punt Decisions
One of the most important strategic decisions involves choosing between attempting a field goal and punting on 4th down.
Decision Framework
The choice depends on:
- Field goal make probability
- Expected field position from punt
- Expected points from current position
- Game situation (score, time, win probability)
#| label: fg-vs-punt-r
#| message: false
#| warning: false
# Create decision framework
create_fg_decision <- function(yard_line) {
kick_distance <- yard_line + 17 # yard_line = distance to opponent's goal line
fg_prob <- predict_fg_prob(kick_distance)
# Expected points from FG attempt
ep_fg <- fg_prob * 3 - (1 - fg_prob) * 1.5 # Simplified opponent EP
# Expected points from punt (rough estimates)
# Assuming punt nets ~40 yards, opponent starts around their 30
ep_punt <- -1.0 # Opponent gets ball, worth ~-1 EP to us
# Expected points from going for it (would need 4th down model)
# Placeholder: assume 50% conversion, ~2.5 EP if converted, -1.5 EP if stopped
ep_go_for_it <- 0.5 * 2.5 - 0.5 * 1.5
tibble(
yard_line = yard_line,
kick_distance = kick_distance,
fg_prob = fg_prob,
ep_fg = ep_fg,
ep_punt = ep_punt,
ep_go = ep_go_for_it,
best_decision = case_when(
ep_fg >= ep_punt & ep_fg >= ep_go ~ "Kick FG",
ep_punt >= ep_fg & ep_punt >= ep_go ~ "Punt",
TRUE ~ "Go For It"
)
)
}
# Generate decision table
decision_table <- map_dfr(seq(20, 50, by = 2), create_fg_decision)
decision_table %>%
gt() %>%
cols_label(
yard_line = "Yard Line",
kick_distance = "Dist",
fg_prob = "FG Prob",
ep_fg = "EP (FG)",
ep_punt = "EP (Punt)",
ep_go = "EP (Go)",
best_decision = "Best Option"
) %>%
fmt_number(columns = c(yard_line, kick_distance), decimals = 0) %>%
fmt_percent(columns = fg_prob, decimals = 0) %>%
fmt_number(columns = c(ep_fg, ep_punt, ep_go), decimals = 2) %>%
data_color(
columns = best_decision,
colors = scales::col_factor(
palette = c("Kick FG" = "#3498DB", "Punt" = "#95A5A6", "Go For It" = "#E74C3C"),
domain = NULL
)
) %>%
tab_header(
title = "4th Down Decision Framework",
subtitle = "Simplified model comparing field goal, punt, and going for it"
)
#| label: fg-vs-punt-py
#| message: false
#| warning: false
def create_fg_decision(yard_line):
"""Create decision framework for field goal vs punt"""
kick_distance = yard_line + 17  # yard_line = distance to opponent's end zone; +17 for snap depth and end zone
fg_prob = distance_model.predict_proba([[kick_distance]])[0, 1]
# Expected points
ep_fg = fg_prob * 3 - (1 - fg_prob) * 1.5
ep_punt = -1.0 # Simplified
ep_go = 0.5 * 2.5 - 0.5 * 1.5 # Placeholder
best = max([
('Kick FG', ep_fg),
('Punt', ep_punt),
('Go For It', ep_go)
], key=lambda x: x[1])[0]
return {
'yard_line': yard_line,
'kick_distance': kick_distance,
'fg_prob': fg_prob,
'ep_fg': ep_fg,
'ep_punt': ep_punt,
'ep_go': ep_go,
'best_decision': best
}
# Generate table
decision_table = pd.DataFrame([
create_fg_decision(yl) for yl in range(20, 52, 2)
])
print("\n4th Down Decision Framework:")
print(decision_table.to_string(index=False))
Visualizing Decision Boundaries
#| label: fig-decision-boundaries-r
#| fig-cap: "Field goal vs punt decision boundaries"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Create comprehensive decision grid
decision_grid <- map_dfr(seq(10, 60, by = 0.5), create_fg_decision)
ggplot(decision_grid) +
geom_line(aes(x = yard_line, y = ep_fg, color = "Field Goal"), linewidth = 1.2) +
geom_line(aes(x = yard_line, y = ep_punt, color = "Punt"), linewidth = 1.2) +
geom_line(aes(x = yard_line, y = ep_go, color = "Go For It"), linewidth = 1.2) +
scale_color_manual(
values = c("Field Goal" = "#3498DB", "Punt" = "#95A5A6", "Go For It" = "#E74C3C")
) +
scale_x_reverse(breaks = seq(10, 60, 10)) +
geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.3) +
labs(
title = "Expected Points by 4th Down Decision",
subtitle = "Simplified model | Higher EP = better decision",
x = "Yard Line (opponent's territory)",
y = "Expected Points",
color = "Decision",
caption = "Data: nflfastR 2018-2023 | Assumes neutral game situation"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
legend.position = "top"
)
#| label: fig-decision-boundaries-py
#| fig-cap: "Field goal vs punt decision boundaries - Python"
#| fig-width: 10
#| fig-height: 7
#| message: false
#| warning: false
# Create grid
yard_lines_fine = np.arange(10, 61, 0.5)
decision_grid = pd.DataFrame([
create_fg_decision(yl) for yl in yard_lines_fine
])
# Plot
fig, ax = plt.subplots(figsize=(10, 7))
ax.plot(decision_grid['yard_line'], decision_grid['ep_fg'],
label='Field Goal', color='#3498DB', linewidth=2)
ax.plot(decision_grid['yard_line'], decision_grid['ep_punt'],
label='Punt', color='#95A5A6', linewidth=2)
ax.plot(decision_grid['yard_line'], decision_grid['ep_go'],
label='Go For It', color='#E74C3C', linewidth=2)
ax.axhline(y=0, color='gray', linestyle='--', alpha=0.3)
ax.invert_xaxis()
ax.set_xlabel('Yard Line (opponent\'s territory)', fontsize=12)
ax.set_ylabel('Expected Points', fontsize=12)
ax.set_title('Expected Points by 4th Down Decision\nHigher EP = better decision',
fontsize=14, fontweight='bold')
ax.legend(title='Decision', loc='upper left', fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Historical Kicker Comparisons
Comparing kickers across eras requires adjustments for:
- Rule changes
- Ball and equipment changes
- Weather (climate change, more domes)
- Kick distance selection bias
#| label: historical-kickers-r
#| message: false
#| warning: false
# Load historical data
pbp_historical <- load_pbp(2010:2023)
fg_historical <- pbp_historical %>%
filter(field_goal_attempt == 1, !is.na(field_goal_result)) %>%
mutate(
made = ifelse(field_goal_result == "made", 1, 0),
kick_distance = as.numeric(kick_distance),
era = case_when(
season <= 2014 ~ "2010-2014",
season <= 2019 ~ "2015-2019",
TRUE ~ "2020-2023"
)
)
# Era-adjusted performance
era_baselines <- fg_historical %>%
group_by(era) %>%
summarise(
avg_make_rate = mean(made, na.rm = TRUE),
avg_distance = mean(kick_distance, na.rm = TRUE),
.groups = "drop"
)
era_baselines %>%
gt() %>%
cols_label(
era = "Era",
avg_make_rate = "Avg Make Rate",
avg_distance = "Avg Distance"
) %>%
fmt_percent(columns = avg_make_rate, decimals = 1) %>%
fmt_number(columns = avg_distance, decimals = 1) %>%
tab_header(
title = "Field Goal Trends Across Eras",
subtitle = "League-wide averages"
)
#| label: historical-kickers-py
#| message: false
#| warning: false
# Load historical data
pbp_historical = nfl.import_pbp_data(range(2010, 2024))
fg_historical = (pbp_historical
.query("field_goal_attempt == 1 & field_goal_result.notna()")
.assign(
made=lambda x: (x['field_goal_result'] == 'made').astype(int),
kick_distance=lambda x: pd.to_numeric(x['kick_distance'], errors='coerce'),
era=lambda x: pd.cut(
x['season'],
bins=[2009, 2014, 2019, 2024],
labels=['2010-2014', '2015-2019', '2020-2023']
)
)
)
# Era baselines
era_baselines = (fg_historical
.groupby('era')
.agg(
avg_make_rate=('made', 'mean'),
avg_distance=('kick_distance', 'mean')
)
.reset_index()
)
print("\nField Goal Trends Across Eras:")
print(era_baselines.to_string(index=False))
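These era baselines can anchor a regression-to-baseline adjustment for kickers with small samples. A minimal, self-contained sketch; the shrinkage weight of 50 attempts is an illustrative assumption, not a fitted value:

```python
def era_adjusted_rate(made, attempts, era_mean, k=50):
    """Shrink a kicker's observed make rate toward the era average.

    Acts like adding k pseudo-attempts at the era mean: small samples
    are pulled strongly toward era_mean, large samples barely move.
    """
    return (made + k * era_mean) / (attempts + k)

# A kicker who went 20-for-22 in an era with an 85% league-wide rate
raw = 20 / 22
adj = era_adjusted_rate(made=20, attempts=22, era_mean=0.85)
print(f"Raw: {raw:.3f}, Era-adjusted: {adj:.3f}")
```

The adjusted estimate lands between the raw rate and the era mean, which is exactly the behavior we want when 22 attempts cannot distinguish skill from luck.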
Era Adjustments
When comparing kickers across eras:
1. **Adjust for league-wide make rates** - Kicking has improved over time
2. **Account for distance selection** - Modern kickers attempt longer kicks
3. **Consider environmental factors** - More domes in recent years
4. **Use regression to baseline** - Adjust small samples toward era mean
A kicker making 85% of kicks in 2010 was more impressive than 85% in 2023.
Key Insight: Two-Point Attempts Are Underutilized
The math shows that two-point conversions have higher expected value than extra points in neutral situations (0.96 vs 0.94 points). Yet teams kick extra points on approximately 93% of touchdowns, one of the clearest examples of NFL teams not fully optimizing their strategies.
**Why Don't Teams Go for Two More Often?**
- **Loss aversion**: Coaches fear criticism for failed two-point attempts more than they value the small expected value advantage
- **Conservative culture**: NFL coaching tends toward risk aversion and conventional strategies
- **Imperfect information**: Coaches may not have accurate estimates of their two-point success probability
- **Context matters**: In many game situations, the variance matters more than the mean
This is an area where analytics can provide a significant competitive advantage: teams willing to go for two more aggressively in neutral situations will gain a small but measurable edge over the course of a season.
Summary
Kicking analytics provides several key insights that transform how we evaluate kickers and make strategic decisions:
- Distance dominates: Kick distance is by far the strongest predictor of success, explaining 85%+ of the variation in field goal outcomes. Our basic distance model achieves excellent discrimination, showing that while other factors matter, distance is paramount.
- Environmental factors matter: Weather, especially wind and temperature, significantly affects outcomes. Cold temperatures and high winds can reduce make probability by 5-10 percentage points, enough to swing marginal decisions. Dome kickers avoid these challenges but may face different pressure dynamics.
- Elite kickers exist: Some kickers consistently outperform expectations by 3-5 field goals per season over multi-year periods. This skill is real and stable, making elite kickers valuable assets worth paying for in free agency.
- Clutch is overrated: "Clutch" performance is mostly randomness and small samples. The correlation between clutch and non-clutch performance is strong (r > 0.70), suggesting good kickers are simply good in all situations. Beware narratives about kickers who "can't handle pressure"; the data doesn't support persistent clutch effects.
- Strategic decisions: Expected value frameworks optimize field goal vs punt choices. The breakeven point where attempting a field goal matches punting occurs around the opponent's 35-40 yard line in neutral situations, though it shifts with distance to go and game context.
- Era effects: Kicking has improved over time due to better training, equipment, and selection. Historical comparisons need adjustment: an 85% make rate in 2000 was excellent, while the same rate in 2023 is below average.
- Extra points are strategic: Post-2015, extra points are no longer automatic. The expected value of two-point conversions exceeds extra points in neutral situations, suggesting teams should go for two more often than they currently do.
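The clutch-stability point above can be checked by correlating each kicker's performance in late-and-close situations with his performance everywhere else. A minimal sketch on made-up illustrative numbers; a real analysis would use per-kicker FGOE splits from nflfastR:

```python
import pandas as pd

# Hypothetical per-kicker FGOE splits; values are illustrative only
kickers = pd.DataFrame({
    "kicker": ["A", "B", "C", "D", "E"],
    "fgoe_clutch": [1.2, -0.5, 0.8, 0.1, -1.0],  # late & close kicks
    "fgoe_other": [1.0, -0.3, 0.9, 0.2, -0.8],   # all other kicks
})

# A high correlation means the "clutch" split adds little beyond
# overall skill; persistent clutch ability would show up as kickers
# systematically over- or under-performing their non-clutch baseline
r = kickers["fgoe_clutch"].corr(kickers["fgoe_other"])
print(f"Clutch vs non-clutch correlation: r = {r:.2f}")
```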
Practical Applications
Modern teams use these principles to:
- Evaluate and sign kickers: Use FGOE adjusted for environmental factors rather than raw field goal percentage to identify undervalued or overvalued kickers in free agency
- Make optimal 4th down decisions: Build expected points models that account for field goal probability, punt outcomes, and going-for-it success rates
- Adjust strategy based on conditions: Account for weather forecasts when deciding whether to attempt long field goals
- Value kicker consistency: Recognize that consistent above-average kickers provide more value than inconsistent kickers with higher peaks but lower floors
- Optimize extra point strategy: Consider going for two more frequently, especially when offensive quality is high or kicker reliability is questionable
The ROI of Elite Kicking
The value difference between an elite kicker (top 5) and a replacement-level kicker is approximately 10-15 points per season, or about 0.6-1.0 wins. In a league where playoff berths often come down to tiebreakers and wild-card spots are separated by a single game, that value is substantial. Teams should be willing to pay $2-4 million annually for elite kickers, comparable to the market for quality rotational players at other positions.
Exercises
Conceptual Questions
- Distance Effect: Why does field goal success probability decline so rapidly with distance? Consider both physical and strategic factors.
- Clutch Debate: Design a statistical test to determine if a kicker has genuine clutch ability beyond random variation.
- Historical Comparison: How would you fairly compare Adam Vinatieri (career 1996-2019) to Justin Tucker (2012-present)?
Coding Exercises
Exercise 1: Build Your Own FG Model
Create a logistic regression model predicting field goal success using:
- Distance
- Temperature
- Wind speed
- Surface type
- Kicker identity (as random effect or fixed effect)
Compare model performance using AUC and calibration plots.
Exercise 2: Kicker Evaluation System
Build a comprehensive kicker evaluation system that:
a) Calculates Field Goals Over Expected (FGOE) for each kicker
b) Adjusts for era/season
c) Weights by kick difficulty
d) Provides confidence intervals around estimates
e) Ranks all kickers from 2018-2023
Create a visualization showing the top 20 kickers with uncertainty bands.
Exercise 3: Optimal Decision Boundaries
Using actual EPA values from nflfastR:
a) Calculate expected points from field goal attempts at each yard line
b) Compare to expected points from punting
c) Compare to expected points from going for it on 4th down
d) Create a decision matrix showing the optimal choice by yard line and distance-to-go
e) How does the optimal decision change based on score differential and time remaining?
Exercise 4: Weather Impact Analysis
Analyze how specific weather conditions affect field goal success:
a) Build separate models for temperature, wind, and precipitation
b) Quantify the effect size of each factor
c) Identify which kickers are most/least affected by weather
d) Create an interactive visualization showing how make probability changes across weather conditions for a given distance
**Bonus**: Analyze whether dome kickers perform worse in outdoor/weather conditions (home dome advantage effect).
Exercise 5: Extra Point Strategy
After the 2015 rule change:
a) Calculate the break-even two-point conversion success rate
b) Identify game situations where going for two is optimal (beyond break-even)
c) Analyze which teams have successfully exploited this strategy
d) Build a win probability-based decision model for one-point vs two-point attempts
Consider: score differential, time remaining, team strength, weather.
Further Reading
Academic Papers
- Berry, S. M., & Wood, C. S. (2004). "A statistician reads the sports pages: Football field goal attempts." Chance, 17(4), 17-26.
- Clark, T. L., Aaron, C. T., & Montgomery, D. C. (2006). "Predicting the success of field goal attempts in the National Football League." Journal of Quantitative Analysis in Sports, 2(1).
- Morrison, D. G., & Kalwani, M. U. (1993). "The best NFL field goal kickers: Are they lucky or good?" Chance, 6(3), 30-37.
Books and Resources
- Alamar, B. C. (2013). Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers. Columbia University Press.
- Burke, B. (2019). "Field Goal Success Rate Model." ESPN Analytics.
Online Resources
- nflfastR documentation: https://www.nflfastr.com/
- NFL Next Gen Stats: https://nextgenstats.nfl.com/
- Football Outsiders: https://www.footballoutsiders.com/