Learning Objectives

By the end of this chapter, you will be able to:

  1. Evaluate head coach performance systematically using win-loss records in context
  2. Measure coordinator impact on offensive and defensive efficiency
  3. Analyze in-game decision quality for 4th downs, timeouts, and challenges
  4. Study player development and coaching effects on performance trajectories
  5. Quantify coaching value above replacement (CVAR)

Introduction

Coaching is perhaps the most visible yet difficult-to-measure aspect of football performance. When a team wins, the head coach receives praise; when they lose, the coach faces criticism. But how much credit or blame should coaches receive? How do we separate coaching skill from player talent, luck, and schedule strength?

Evaluating coaching performance requires moving beyond simple win-loss records to understand context, decision quality, player development, and strategic adjustments. A coach who goes 8-9 with a rebuilding roster might be performing better than one who goes 10-7 with a championship-caliber team. A coordinator whose unit ranks 15th in yards but 5th in EPA might be doing excellent work despite middling traditional stats.

This chapter explores coaching analytics from multiple angles. We'll analyze head coach performance relative to expectations, evaluate coordinator efficiency, measure in-game decision quality, quantify player development effects, examine halftime adjustments, and estimate overall coaching value above replacement.

What Makes Coaching Evaluation Challenging?

Coaching impact is difficult to isolate because:

  • Confounding factors: Player talent, injuries, and schedule strength all affect outcomes
  • Sample size: NFL seasons are only 17 games, creating high variance
  • Attribution: Is success due to the head coach, coordinators, or position coaches?
  • Luck: Random variation plays a significant role in short-season outcomes
  • Context: Team situation (rebuilding vs. contending) matters enormously

Effective coaching evaluation must account for all of these factors.
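
The sample-size point is easy to make concrete with a quick simulation. Under a simple coin-flip model (an illustrative assumption, not a claim about real team strength), even a true .500 team lands on a "good" or "bad" record surprisingly often in a 17-game season:

```python
import random

random.seed(42)

def simulate_seasons(true_win_prob, games=17, n_sims=100_000):
    """Simulate many seasons for a team with a fixed true win probability.

    Returns the share of seasons with 10+ wins and with 7 or fewer wins.
    """
    double_digit = 0
    losing = 0
    for _ in range(n_sims):
        wins = sum(random.random() < true_win_prob for _ in range(games))
        double_digit += wins >= 10
        losing += wins <= 7
    return double_digit / n_sims, losing / n_sims

p10, p7 = simulate_seasons(0.5)
print(f"True .500 team: {p10:.1%} of seasons at 10+ wins, {p7:.1%} at 7 or fewer")
```

Roughly three seasons in ten, a genuinely average team posts a double-digit win total (and just as often a losing one), which is exactly why raw records are a noisy signal of coaching quality.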

Win-Loss Records in Context

The Problem with Raw Win-Loss Records

Consider these coaching records:

  • Coach A: 10-7 with the league's highest-paid roster
  • Coach B: 8-9 with the youngest roster and $40M in cap space
  • Coach C: 11-6 with an easy schedule (opponents' win rate: 0.400)
  • Coach D: 9-8 with a brutal schedule (opponents' win rate: 0.580)

Raw records tell an incomplete story. Context matters.

Expected Wins Framework

We can estimate expected wins based on:

  1. Roster talent: Aggregate player value (from Approximate Value, PFF grades, or contract value)
  2. Schedule strength: Opponent quality and difficulty
  3. Pythagorean expectation: Points scored and allowed
  4. Close game luck: Performance in one-score games

The difference between actual and expected wins reveals coaching impact.
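
Before running the Pythagorean formula on real data, it can be sanity-checked by hand. A minimal sketch (the point totals below are hypothetical):

```python
def pythagorean_wins(points_for, points_against, games=17, exponent=2.37):
    """Pythagorean expected wins: G * PF^k / (PF^k + PA^k)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)

# A hypothetical team scoring 450 and allowing 380 over a 17-game season
exp_wins = pythagorean_wins(450, 380)
print(f"Expected wins: {exp_wins:.2f}")
```

A team outscoring opponents 450-380 projects to a little over 10 wins; an actual 12-5 record with those totals would suggest either strong situational coaching or good fortune in close games.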

Calculating Expected Wins

#| label: setup-r
#| message: false
#| warning: false

library(tidyverse)
library(nflfastR)
library(nflplotR)
library(gt)

# Load team metadata (load_teams() returns one row per franchise;
# it has no season column to filter on)
teams <- load_teams()

# Load schedules and results
schedules <- load_schedules(2015:2023) %>%
  filter(game_type == "REG")

# Calculate team records and point differential
# Stack home and away perspectives so each game appears once per team;
# result is the home-team margin, so flip its sign for the away rows
team_records <- bind_rows(
  schedules %>%
    select(
      season, week, team = home_team, opponent = away_team,
      points = home_score, points_allowed = away_score, result
    ),
  schedules %>%
    select(
      season, week, team = away_team, opponent = home_team,
      points = away_score, points_allowed = home_score, result
    ) %>%
    mutate(result = -result)
) %>%
  filter(!is.na(points)) %>%
  group_by(season, team) %>%
  summarise(
    games = n(),
    wins = sum(result > 0),
    losses = sum(result < 0),
    ties = sum(result == 0),
    points = sum(points),
    points_allowed = sum(points_allowed),
    point_diff = points - points_allowed,
    .groups = "drop"
  )

# Calculate Pythagorean expected wins
# Formula: W_exp = G * (PF^2.37 / (PF^2.37 + PA^2.37))
team_records <- team_records %>%
  mutate(
    pythag_wins = games * (points^2.37 / (points^2.37 + points_allowed^2.37)),
    pythag_diff = wins - pythag_wins,
    win_pct = wins / games
  )

# Display recent examples
team_records %>%
  filter(season == 2023) %>%
  arrange(desc(pythag_diff)) %>%
  select(team, wins, losses, pythag_wins, pythag_diff) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    team = "Team",
    wins = "Wins",
    losses = "Losses",
    pythag_wins = "Expected Wins",
    pythag_diff = "Over/Under"
  ) %>%
  fmt_number(columns = c(pythag_wins, pythag_diff), decimals = 2) %>%
  tab_header(
    title = "Teams Exceeding Pythagorean Expectation",
    subtitle = "2023 Regular Season"
  )
#| label: setup-py
#| message: false
#| warning: false

import pandas as pd
import numpy as np
import nfl_data_py as nfl
import matplotlib.pyplot as plt
import seaborn as sns

# Load schedules
schedules = nfl.import_schedules(range(2015, 2024))
schedules = schedules[schedules['game_type'] == 'REG'].copy()

# Create team-level records
home_games = schedules[schedules['home_score'].notna()].copy()
home_games['team'] = home_games['home_team']
home_games['opponent'] = home_games['away_team']
home_games['points'] = home_games['home_score']
home_games['points_allowed'] = home_games['away_score']
home_games['win'] = (home_games['home_score'] > home_games['away_score']).astype(int)

away_games = schedules[schedules['away_score'].notna()].copy()
away_games['team'] = away_games['away_team']
away_games['opponent'] = away_games['home_team']
away_games['points'] = away_games['away_score']
away_games['points_allowed'] = away_games['home_score']
away_games['win'] = (away_games['away_score'] > away_games['home_score']).astype(int)

# Combine
all_games = pd.concat([
    home_games[['season', 'week', 'team', 'opponent', 'points', 'points_allowed', 'win']],
    away_games[['season', 'week', 'team', 'opponent', 'points', 'points_allowed', 'win']]
])

# Calculate team records
team_records = all_games.groupby(['season', 'team']).agg({
    'win': ['sum', 'count'],
    'points': 'sum',
    'points_allowed': 'sum'
}).reset_index()

team_records.columns = ['season', 'team', 'wins', 'games', 'points', 'points_allowed']
team_records['losses'] = team_records['games'] - team_records['wins']
team_records['point_diff'] = team_records['points'] - team_records['points_allowed']

# Calculate Pythagorean expected wins
exponent = 2.37
team_records['pythag_wins'] = (
    team_records['games'] *
    (team_records['points']**exponent /
     (team_records['points']**exponent + team_records['points_allowed']**exponent))
)
team_records['pythag_diff'] = team_records['wins'] - team_records['pythag_wins']
team_records['win_pct'] = team_records['wins'] / team_records['games']

# Display 2023 results
results_2023 = team_records[team_records['season'] == 2023].sort_values(
    'pythag_diff', ascending=False
).head(10)

print("\nTeams Exceeding Pythagorean Expectation (2023):")
print(results_2023[['team', 'wins', 'losses', 'pythag_wins', 'pythag_diff']].to_string(index=False))

Schedule Strength Adjustment

#| label: schedule-strength-r
#| message: false
#| warning: false

# Calculate opponent win percentage for each team
schedule_strength <- schedules %>%
  filter(!is.na(result)) %>%
  # For each game, get opponent's season record
  left_join(
    team_records %>% select(season, team, win_pct),
    by = c("season", "away_team" = "team")
  ) %>%
  rename(away_win_pct = win_pct) %>%
  left_join(
    team_records %>% select(season, team, win_pct),
    by = c("season", "home_team" = "team")
  ) %>%
  rename(home_win_pct = win_pct) %>%
  # Calculate schedule strength for each team
  group_by(season, home_team) %>%
  summarise(
    sos_home = mean(away_win_pct, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  rename(team = home_team) %>%
  bind_rows(
    schedules %>%
      filter(!is.na(result)) %>%
      left_join(
        team_records %>% select(season, team, win_pct),
        by = c("season", "home_team" = "team")
      ) %>%
      rename(home_win_pct = win_pct) %>%
      group_by(season, away_team) %>%
      summarise(
        sos_away = mean(home_win_pct, na.rm = TRUE),
        .groups = "drop"
      ) %>%
      rename(team = away_team)
  ) %>%
  group_by(season, team) %>%
  summarise(
    sos = mean(c(sos_home, sos_away), na.rm = TRUE),
    .groups = "drop"
  )

# Join with team records
team_records_adj <- team_records %>%
  left_join(schedule_strength, by = c("season", "team")) %>%
  mutate(
    # Adjust wins for schedule strength
    # Teams with harder schedules get bonus
    sos_adj_wins = wins + (sos - 0.500) * games,
    sos_diff = sos_adj_wins - wins
  )

# Show 2023 schedule strength
team_records_adj %>%
  filter(season == 2023) %>%
  arrange(desc(sos)) %>%
  select(team, wins, losses, sos, sos_adj_wins) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    team = "Team",
    wins = "Actual Wins",
    losses = "Losses",
    sos = "SOS",
    sos_adj_wins = "Adjusted Wins"
  ) %>%
  fmt_number(columns = c(sos, sos_adj_wins), decimals = 2) %>%
  tab_header(
    title = "Schedule Strength Adjustments",
    subtitle = "2023 Regular Season - Hardest Schedules"
  )
#| label: schedule-strength-py
#| message: false
#| warning: false

# Calculate opponent win percentage
schedules_with_records = schedules.merge(
    team_records[['season', 'team', 'win_pct']],
    left_on=['season', 'away_team'],
    right_on=['season', 'team'],
    suffixes=('', '_away')
).merge(
    team_records[['season', 'team', 'win_pct']],
    left_on=['season', 'home_team'],
    right_on=['season', 'team'],
    suffixes=('_away', '_home')
)

# Calculate SOS for each team (as home and away)
sos_home = schedules_with_records.groupby(['season', 'home_team'])['win_pct_away'].mean().reset_index()
sos_home.columns = ['season', 'team', 'sos']

sos_away = schedules_with_records.groupby(['season', 'away_team'])['win_pct_home'].mean().reset_index()
sos_away.columns = ['season', 'team', 'sos']

# Combine
sos_combined = pd.concat([sos_home, sos_away]).groupby(['season', 'team'])['sos'].mean().reset_index()

# Join with team records
team_records_adj = team_records.merge(sos_combined, on=['season', 'team'], how='left')
team_records_adj['sos_adj_wins'] = (
    team_records_adj['wins'] +
    (team_records_adj['sos'] - 0.500) * team_records_adj['games']
)
team_records_adj['sos_diff'] = team_records_adj['sos_adj_wins'] - team_records_adj['wins']

# Show 2023 results
sos_2023 = team_records_adj[team_records_adj['season'] == 2023].sort_values(
    'sos', ascending=False
).head(10)

print("\nSchedule Strength Adjustments (2023 - Hardest Schedules):")
print(sos_2023[['team', 'wins', 'losses', 'sos', 'sos_adj_wins']].to_string(index=False))

Visualizing Expected vs Actual Wins

#| label: fig-expected-wins-r
#| fig-cap: "Actual wins vs Pythagorean expected wins by team (2023)"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

team_records_adj %>%
  filter(season == 2023) %>%
  ggplot(aes(x = pythag_wins, y = wins)) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "gray40") +
  geom_nfl_logos(aes(team_abbr = team), width = 0.05, alpha = 0.8) +
  scale_x_continuous(breaks = seq(0, 17, 2), limits = c(0, 17)) +
  scale_y_continuous(breaks = seq(0, 17, 2), limits = c(0, 17)) +
  labs(
    title = "Actual Wins vs Pythagorean Expected Wins",
    subtitle = "2023 NFL Regular Season | Teams above line exceeded expectations",
    x = "Pythagorean Expected Wins",
    y = "Actual Wins",
    caption = "Data: nflfastR | Teams above diagonal exceeded expectations based on points scored/allowed"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank()
  )
#| label: fig-expected-wins-py
#| fig-cap: "Actual wins vs Pythagorean expected wins - Python (2023)"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

# Filter 2023 data
plot_data = team_records_adj[team_records_adj['season'] == 2023].copy()

# Create plot
fig, ax = plt.subplots(figsize=(10, 8))

# Diagonal line
ax.plot([0, 17], [0, 17], 'k--', alpha=0.3, linewidth=1)

# Scatter plot
for _, row in plot_data.iterrows():
    ax.scatter(row['pythag_wins'], row['wins'], s=100, alpha=0.7)
    ax.text(row['pythag_wins'], row['wins'], row['team'],
            fontsize=8, ha='center', va='center')

ax.set_xlim(0, 17)
ax.set_ylim(0, 17)
ax.set_xlabel('Pythagorean Expected Wins', fontsize=12)
ax.set_ylabel('Actual Wins', fontsize=12)
ax.set_title('Actual Wins vs Pythagorean Expected Wins\n2023 NFL Regular Season',
             fontsize=14, fontweight='bold')
ax.text(0.98, 0.02, 'Data: nfl_data_py\nTeams above line exceeded expectations',
        transform=ax.transAxes, ha='right', va='bottom', fontsize=8, style='italic')

plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

In-Game Decision Quality

Fourth Down Decision Analysis

Fourth down decisions are the most scrutinized coaching choices. We can evaluate decision quality by comparing each call against model recommendations built on expected points and win probability.
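
The underlying logic can be sketched with a simple expected-points break-even: going for it beats kicking when the conversion probability is high enough that the expected value of the attempt exceeds the best kicking alternative. The EP values below are illustrative placeholders, not model estimates:

```python
def go_for_it_ev(p_convert, ep_success, ep_fail):
    """Expected points of a 4th-down attempt."""
    return p_convert * ep_success + (1 - p_convert) * ep_fail

def breakeven_prob(ep_success, ep_fail, ep_kick):
    """Conversion probability at which going for it matches the kicking option."""
    return (ep_kick - ep_fail) / (ep_success - ep_fail)

# Illustrative 4th-and-2 near midfield (EP values are made up for the example)
ep_success, ep_fail, ep_punt = 2.2, -1.5, -0.3
p_star = breakeven_prob(ep_success, ep_fail, ep_punt)
print(f"Break-even conversion probability: {p_star:.1%}")
```

With these placeholder values, any conversion chance above roughly one in three favors going for it, which is why aggressive 4th-and-short calls are usually defensible. Production models like nfl4th do this calculation with situation-specific win probability rather than hand-picked EP values.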

#| label: fourth-down-analysis-r
#| message: false
#| warning: false
#| cache: true

# Load play-by-play data
pbp <- load_pbp(2023)

# go_boost (win probability added by going for it, in percentage points)
# comes from the nfl4th package, not base nflfastR
fourth_pbp <- nfl4th::load_4th_pbp(2023)

# Analyze 4th down decisions
fourth_downs <- fourth_pbp %>%
  filter(down == 4, !is.na(ydstogo)) %>%
  mutate(
    decision = case_when(
      punt_attempt == 1 ~ "Punt",
      field_goal_attempt == 1 ~ "FG Attempt",
      play_type %in% c("pass", "run") ~ "Go For It",
      TRUE ~ "Other"
    ),
    # The model recommends going when go_boost exceeds 0.5 WP points
    should_go = !is.na(go_boost) & go_boost > 0.5,
    correct_decision = case_when(
      decision == "Go For It" & should_go ~ TRUE,
      decision != "Go For It" & !should_go ~ TRUE,
      TRUE ~ FALSE
    )
  ) %>%
  filter(decision != "Other")

# Calculate decision quality by team
team_4th_down <- fourth_downs %>%
  group_by(posteam) %>%
  summarise(
    fourth_downs = n(),
    go_attempts = sum(decision == "Go For It"),
    go_rate = mean(decision == "Go For It"),
    should_go_situations = sum(should_go),
    correct_decisions = sum(correct_decision),
    decision_rate = mean(correct_decision),
    .groups = "drop"
  ) %>%
  arrange(desc(decision_rate))

# Display top decision-makers
team_4th_down %>%
  head(10) %>%
  gt() %>%
  cols_label(
    posteam = "Team",
    fourth_downs = "4th Downs",
    go_attempts = "Go Attempts",
    go_rate = "Go Rate",
    correct_decisions = "Correct",
    decision_rate = "Correct %"
  ) %>%
  fmt_percent(columns = c(go_rate, decision_rate), decimals = 1) %>%
  tab_header(
    title = "Fourth Down Decision Quality",
    subtitle = "2023 Regular Season - Best Decision Makers"
  )
#| label: fourth-down-analysis-py
#| message: false
#| warning: false
#| cache: true

# Load play-by-play data
pbp = nfl.import_pbp_data([2023])

# Analyze 4th down decisions
fourth_downs = pbp[
    (pbp['down'] == 4) &
    (pbp['ydstogo'].notna())
].copy()

# Classify decisions
fourth_downs['decision'] = 'Other'
fourth_downs.loc[fourth_downs['punt_attempt'] == 1, 'decision'] = 'Punt'
fourth_downs.loc[fourth_downs['field_goal_attempt'] == 1, 'decision'] = 'FG Attempt'
fourth_downs.loc[
    fourth_downs['play_type'].isin(['pass', 'run']),
    'decision'
] = 'Go For It'

# Filter valid decisions
fourth_downs = fourth_downs[fourth_downs['decision'] != 'Other'].copy()

# Determine optimal decision. go_boost comes from the nfl4th model and is
# generally not present in nfl_data_py output, hence the guard below
if 'go_boost' in fourth_downs.columns:
    fourth_downs['should_go'] = fourth_downs['go_boost'] > 0.5
    fourth_downs['correct_decision'] = (
        ((fourth_downs['decision'] == 'Go For It') & fourth_downs['should_go']) |
        ((fourth_downs['decision'] != 'Go For It') & ~fourth_downs['should_go'])
    )

    # Calculate by team
    team_4th_down = fourth_downs.groupby('posteam').agg({
        'play_id': 'count',
        'decision': lambda x: (x == 'Go For It').sum(),
        'should_go': 'sum',
        'correct_decision': 'sum'
    }).reset_index()

    team_4th_down.columns = ['team', 'fourth_downs', 'go_attempts', 'should_go_situations', 'correct_decisions']
    team_4th_down['go_rate'] = team_4th_down['go_attempts'] / team_4th_down['fourth_downs']
    team_4th_down['decision_rate'] = team_4th_down['correct_decisions'] / team_4th_down['fourth_downs']

    # Sort and display
    team_4th_down = team_4th_down.sort_values('decision_rate', ascending=False)

    print("\nFourth Down Decision Quality (2023 - Best Decision Makers):")
    print(team_4th_down.head(10)[['team', 'fourth_downs', 'go_attempts', 'go_rate', 'correct_decisions', 'decision_rate']].to_string(index=False))
else:
    print("\ngo_boost metric not available in dataset")

Challenge Success Rates

#| label: challenge-analysis-r
#| message: false
#| warning: false

# Analyze replay challenges
# Note: Challenge data is limited in pbp data
challenges <- pbp %>%
  filter(!is.na(replay_or_challenge), replay_or_challenge == 1) %>%
  mutate(
    # A challenge succeeds when the call is reversed; "upheld"/"confirmed"
    # (or a missing result) count as unsuccessful
    challenge_successful = !is.na(replay_or_challenge_result) &
      str_detect(tolower(replay_or_challenge_result), "reversed")
  )

# Summarize by team (possessing team when challenge occurred)
team_challenges <- challenges %>%
  filter(!is.na(posteam)) %>%
  group_by(posteam) %>%
  summarise(
    challenges = n(),
    successful = sum(challenge_successful, na.rm = TRUE),
    success_rate = mean(challenge_successful, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(challenges >= 3) %>%  # Minimum sample
  arrange(desc(success_rate))

if(nrow(team_challenges) > 0) {
  team_challenges %>%
    head(10) %>%
    gt() %>%
    cols_label(
      posteam = "Team",
      challenges = "Challenges",
      successful = "Successful",
      success_rate = "Success Rate"
    ) %>%
    fmt_percent(columns = success_rate, decimals = 1) %>%
    tab_header(
      title = "Challenge Success Rates",
      subtitle = "2023 Regular Season (min. 3 challenges)"
    )
} else {
  cat("Challenge data not available in sufficient detail\n")
}
#| label: challenge-analysis-py
#| message: false
#| warning: false

# Analyze replay challenges
if 'replay_or_challenge' in pbp.columns:
    challenges = pbp[
        (pbp['replay_or_challenge'].notna()) &
        (pbp['replay_or_challenge'] == 1)
    ].copy()

    # Determine success: a reversed call means the challenge succeeded;
    # missing results count as unsuccessful
    if 'replay_or_challenge_result' in challenges.columns:
        challenges['challenge_successful'] = challenges['replay_or_challenge_result'].str.lower().str.contains(
            'reversed', na=False
        )

        # Summarize by team
        team_challenges = challenges.groupby('posteam').agg({
            'play_id': 'count',
            'challenge_successful': ['sum', 'mean']
        }).reset_index()

        team_challenges.columns = ['team', 'challenges', 'successful', 'success_rate']
        team_challenges = team_challenges[team_challenges['challenges'] >= 3].sort_values(
            'success_rate', ascending=False
        )

        print("\nChallenge Success Rates (2023 - min. 3 challenges):")
        print(team_challenges.head(10).to_string(index=False))
    else:
        print("\nChallenge result data not available")
else:
    print("\nChallenge data not available in dataset")

Timeout Management

#| label: timeout-analysis-r
#| message: false
#| warning: false

# Analyze timeout usage
timeout_analysis <- pbp %>%
  filter(!is.na(timeout), timeout == 1) %>%
  mutate(
    # Categorize timeouts by situation; half_seconds_remaining counts
    # down within each half (game_seconds_remaining spans the full game)
    timeout_category = case_when(
      qtr <= 2 & half_seconds_remaining < 120 ~ "End of 1st Half",
      qtr == 4 & game_seconds_remaining < 120 ~ "End of Game",
      abs(score_differential) <= 3 ~ "Close Game",
      TRUE ~ "Other"
    )
  )

# Count timeouts by team and situation
team_timeouts <- timeout_analysis %>%
  filter(!is.na(timeout_team)) %>%
  group_by(timeout_team, timeout_category) %>%
  summarise(
    timeouts = n(),
    .groups = "drop"
  ) %>%
  pivot_wider(
    names_from = timeout_category,
    values_from = timeouts,
    values_fill = 0
  ) %>%
  mutate(
    total_timeouts = `Close Game` + `End of 1st Half` + `End of Game` + Other
  ) %>%
  arrange(desc(total_timeouts))

# Display
team_timeouts %>%
  head(10) %>%
  gt() %>%
  cols_label(
    timeout_team = "Team",
    `Close Game` = "Close Game",
    `End of 1st Half` = "End Half",
    `End of Game` = "End Game",
    Other = "Other",
    total_timeouts = "Total"
  ) %>%
  tab_header(
    title = "Timeout Usage by Situation",
    subtitle = "2023 Regular Season"
  )
#| label: timeout-analysis-py
#| message: false
#| warning: false

# Analyze timeout usage
if 'timeout' in pbp.columns:
    timeouts = pbp[(pbp['timeout'].notna()) & (pbp['timeout'] == 1)].copy()

    # Categorize timeouts; assign the broad 'Close Game' bucket first so
    # the more specific end-of-half/end-of-game labels take precedence
    timeouts['timeout_category'] = 'Other'
    timeouts.loc[
        timeouts['score_differential'].abs() <= 3,
        'timeout_category'
    ] = 'Close Game'
    timeouts.loc[
        (timeouts['qtr'] <= 2) & (timeouts['half_seconds_remaining'] < 120),
        'timeout_category'
    ] = 'End of 1st Half'
    timeouts.loc[
        (timeouts['qtr'] == 4) & (timeouts['game_seconds_remaining'] < 120),
        'timeout_category'
    ] = 'End of Game'

    # Count by team and category
    if 'timeout_team' in timeouts.columns:
        team_timeouts = timeouts.groupby(
            ['timeout_team', 'timeout_category']
        )['play_id'].count().reset_index()

        team_timeouts_pivot = team_timeouts.pivot(
            index='timeout_team',
            columns='timeout_category',
            values='play_id'
        ).fillna(0).astype(int)

        team_timeouts_pivot['Total'] = team_timeouts_pivot.sum(axis=1)
        team_timeouts_pivot = team_timeouts_pivot.sort_values('Total', ascending=False)

        print("\nTimeout Usage by Situation (2023):")
        print(team_timeouts_pivot.head(10))
    else:
        print("\nTimeout team data not available")
else:
    print("\nTimeout data not available in dataset")

Coordinator Impact Analysis

Offensive Coordinator Efficiency

#| label: oc-analysis-r
#| message: false
#| warning: false

# Calculate offensive efficiency metrics
offense_metrics <- pbp %>%
  filter(!is.na(posteam), !is.na(epa), play_type %in% c("pass", "run")) %>%
  group_by(season, posteam) %>%
  summarise(
    plays = n(),
    epa_per_play = mean(epa),
    success_rate = mean(epa > 0),
    explosive_rate = mean(epa > 1.5),
    # Situational efficiency
    third_down_conv = mean(third_down_converted[down == 3], na.rm = TRUE),
    red_zone_td_rate = mean(
      touchdown[yardline_100 <= 20],
      na.rm = TRUE
    ),
    .groups = "drop"
  )

# Add team colors for visualization
offense_metrics <- offense_metrics %>%
  left_join(
    load_teams() %>%
      select(team_abbr, team_color, team_color2) %>%
      distinct(),
    by = c("posteam" = "team_abbr")
  )

# Display 2023 top offenses
offense_metrics %>%
  filter(season == 2023) %>%
  arrange(desc(epa_per_play)) %>%
  select(posteam, plays, epa_per_play, success_rate, explosive_rate,
         third_down_conv, red_zone_td_rate) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    posteam = "Team",
    plays = "Plays",
    epa_per_play = "EPA/Play",
    success_rate = "Success Rate",
    explosive_rate = "Explosive Rate",
    third_down_conv = "3rd Down Conv",
    red_zone_td_rate = "RZ TD Rate"
  ) %>%
  fmt_number(columns = epa_per_play, decimals = 3) %>%
  fmt_percent(columns = c(success_rate, explosive_rate, third_down_conv, red_zone_td_rate),
              decimals = 1) %>%
  tab_header(
    title = "Offensive Coordinator Efficiency",
    subtitle = "2023 Regular Season - Top 10 Offenses by EPA"
  )
#| label: oc-analysis-py
#| message: false
#| warning: false

# Calculate offensive efficiency metrics
offense_data = pbp[
    (pbp['posteam'].notna()) &
    (pbp['epa'].notna()) &
    (pbp['play_type'].isin(['pass', 'run']))
].copy()

offense_metrics = offense_data.groupby(['season', 'posteam']).agg({
    'play_id': 'count',
    'epa': ['mean', lambda x: (x > 0).mean(), lambda x: (x > 1.5).mean()]
}).reset_index()

offense_metrics.columns = ['season', 'team', 'plays', 'epa_per_play', 'success_rate', 'explosive_rate']

# Third down conversions
third_downs = offense_data[offense_data['down'] == 3].copy()
third_down_conv = third_downs.groupby(['season', 'posteam'])['third_down_converted'].mean().reset_index()
third_down_conv.columns = ['season', 'team', 'third_down_conv']

# Red zone touchdowns
red_zone = offense_data[offense_data['yardline_100'] <= 20].copy()
red_zone_td = red_zone.groupby(['season', 'posteam'])['touchdown'].mean().reset_index()
red_zone_td.columns = ['season', 'team', 'red_zone_td_rate']

# Merge
offense_metrics = offense_metrics.merge(
    third_down_conv, on=['season', 'team'], how='left'
).merge(
    red_zone_td, on=['season', 'team'], how='left'
)

# Display 2023 top offenses
offense_2023 = offense_metrics[offense_metrics['season'] == 2023].sort_values(
    'epa_per_play', ascending=False
).head(10)

print("\nOffensive Coordinator Efficiency (2023 - Top 10 by EPA):")
print(offense_2023.to_string(index=False))

Defensive Coordinator Efficiency

#| label: dc-analysis-r
#| message: false
#| warning: false

# Calculate defensive efficiency metrics
defense_metrics <- pbp %>%
  filter(!is.na(defteam), !is.na(epa), play_type %in% c("pass", "run")) %>%
  group_by(season, defteam) %>%
  summarise(
    plays = n(),
    epa_allowed = mean(epa),  # Lower is better
    success_rate_allowed = mean(epa > 0),  # Lower is better
    explosive_rate_allowed = mean(epa > 1.5),  # Lower is better
    # Pressure and turnovers (nflfastR flags QB hits and sacks; the
    # public play-by-play data has no hurry column)
    pressure_rate = mean(qb_hit == 1 | sack == 1, na.rm = TRUE),
    turnover_rate = mean(interception == 1 | fumble_lost == 1, na.rm = TRUE),
    # Situational
    third_down_stop_rate = mean(!third_down_converted[down == 3], na.rm = TRUE),
    red_zone_td_rate = mean(touchdown[yardline_100 <= 20], na.rm = TRUE),
    .groups = "drop"
  )

# Display 2023 top defenses
defense_metrics %>%
  filter(season == 2023) %>%
  arrange(epa_allowed) %>%  # Lower is better
  select(defteam, plays, epa_allowed, success_rate_allowed,
         pressure_rate, turnover_rate, third_down_stop_rate) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    defteam = "Team",
    plays = "Plays",
    epa_allowed = "EPA Allowed",
    success_rate_allowed = "Success Rate",
    pressure_rate = "Pressure Rate",
    turnover_rate = "Turnover Rate",
    third_down_stop_rate = "3rd Down Stops"
  ) %>%
  fmt_number(columns = epa_allowed, decimals = 3) %>%
  fmt_percent(columns = c(success_rate_allowed, pressure_rate, turnover_rate,
                          third_down_stop_rate), decimals = 1) %>%
  tab_header(
    title = "Defensive Coordinator Efficiency",
    subtitle = "2023 Regular Season - Top 10 Defenses by EPA Allowed"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = epa_allowed, rows = epa_allowed < -0.05)
  )
#| label: dc-analysis-py
#| message: false
#| warning: false

# Calculate defensive efficiency metrics
defense_data = pbp[
    (pbp['defteam'].notna()) &
    (pbp['epa'].notna()) &
    (pbp['play_type'].isin(['pass', 'run']))
].copy()

defense_metrics = defense_data.groupby(['season', 'defteam']).agg({
    'play_id': 'count',
    'epa': ['mean', lambda x: (x > 0).mean(), lambda x: (x > 1.5).mean()]
}).reset_index()

defense_metrics.columns = ['season', 'team', 'plays', 'epa_allowed', 'success_rate_allowed', 'explosive_rate_allowed']

# Pressure rate (nfl_data_py pbp has qb_hit and sack flags, no hurry column)
if all(col in defense_data.columns for col in ['qb_hit', 'sack']):
    pressure = defense_data.groupby(['season', 'defteam']).apply(
        lambda x: ((x['qb_hit'] == 1) | (x['sack'] == 1)).mean()
    ).reset_index()
    pressure.columns = ['season', 'team', 'pressure_rate']
    defense_metrics = defense_metrics.merge(pressure, on=['season', 'team'], how='left')

# Turnover rate
if all(col in defense_data.columns for col in ['interception', 'fumble_lost']):
    turnovers = defense_data.groupby(['season', 'defteam']).apply(
        lambda x: ((x['interception'] == 1) | (x['fumble_lost'] == 1)).mean()
    ).reset_index()
    turnovers.columns = ['season', 'team', 'turnover_rate']
    defense_metrics = defense_metrics.merge(turnovers, on=['season', 'team'], how='left')

# Third down stops
third_downs_def = defense_data[defense_data['down'] == 3].copy()
third_down_stops = third_downs_def.groupby(['season', 'defteam']).apply(
    lambda x: (x['third_down_converted'] == 0).mean()  # ~ fails on float dtype
).reset_index()
third_down_stops.columns = ['season', 'team', 'third_down_stop_rate']
defense_metrics = defense_metrics.merge(third_down_stops, on=['season', 'team'], how='left')

# Display 2023 top defenses
defense_2023 = defense_metrics[defense_metrics['season'] == 2023].sort_values(
    'epa_allowed', ascending=True
).head(10)

print("\nDefensive Coordinator Efficiency (2023 - Top 10 by EPA Allowed):")
print(defense_2023.to_string(index=False))

Coordinator EPA Scatter Plot

#| label: fig-coordinator-scatter-r
#| fig-cap: "Offensive and defensive EPA by team (2023)"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

# Combine offense and defense metrics
combined_metrics <- offense_metrics %>%
  filter(season == 2023) %>%
  select(posteam, off_epa = epa_per_play) %>%
  inner_join(
    defense_metrics %>%
      filter(season == 2023) %>%
      select(defteam, def_epa = epa_allowed),
    by = c("posteam" = "defteam")
  )

# Create scatter plot
combined_metrics %>%
  ggplot(aes(x = off_epa, y = -def_epa)) +  # Negative def_epa so up is better
  geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
  geom_vline(xintercept = 0, linetype = "dashed", alpha = 0.5) +
  geom_nfl_logos(aes(team_abbr = posteam), width = 0.05, alpha = 0.8) +
  labs(
    title = "Offensive and Defensive Coordinator Efficiency",
    subtitle = "2023 NFL Regular Season | Top-right quadrant = elite on both sides",
    x = "Offensive EPA per Play (Better →)",
    y = "Defensive EPA per Play (Better ↑)",
    caption = "Data: nflfastR | Defensive EPA inverted so up = better defense"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank()
  )


#| label: fig-coordinator-scatter-py
#| fig-cap: "Offensive and defensive EPA - Python (2023)"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

# Combine offense and defense for all teams (offense_2023/defense_2023
# were trimmed to the top 10, so rebuild from the full season tables)
off_all = offense_metrics[offense_metrics['season'] == 2023]
def_all = defense_metrics[defense_metrics['season'] == 2023]
combined_2023 = off_all[['team', 'epa_per_play']].merge(
    def_all[['team', 'epa_allowed']],
    on='team',
    how='inner'
)

# Create plot
fig, ax = plt.subplots(figsize=(10, 8))

# Reference lines
ax.axhline(y=0, color='gray', linestyle='--', alpha=0.5)
ax.axvline(x=0, color='gray', linestyle='--', alpha=0.5)

# Scatter plot
for _, row in combined_2023.iterrows():
    ax.scatter(row['epa_per_play'], -row['epa_allowed'], s=100, alpha=0.7)
    ax.text(row['epa_per_play'], -row['epa_allowed'], row['team'],
            fontsize=8, ha='center', va='center')

ax.set_xlabel('Offensive EPA per Play (Better →)', fontsize=12)
ax.set_ylabel('Defensive EPA per Play (Better ↑)', fontsize=12)
ax.set_title('Offensive and Defensive Coordinator Efficiency\n2023 NFL Regular Season',
             fontsize=14, fontweight='bold')
ax.text(0.98, 0.02, 'Data: nfl_data_py\nDefensive EPA inverted (up = better)',
        transform=ax.transAxes, ha='right', va='bottom', fontsize=8, style='italic')

plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Player Development Metrics

Measuring Player Improvement

#| label: player-development-r
#| message: false
#| warning: false
#| cache: true

# Load multiple seasons
pbp_multi <- load_pbp(2021:2023)

# Calculate QB development (EPA improvement by experience)
qb_development <- pbp_multi %>%
  filter(
    play_type %in% c("pass", "run"),
    !is.na(passer_player_id),
    !is.na(epa)
  ) %>%
  group_by(season, passer_player_id, passer_player_name) %>%
  summarise(
    dropbacks = n(),
    epa_per_play = mean(epa),
    success_rate = mean(epa > 0),
    .groups = "drop"
  ) %>%
  filter(dropbacks >= 100) %>%  # Minimum sample
  # Calculate year-over-year improvement
  group_by(passer_player_id) %>%
  arrange(season) %>%
  mutate(
    seasons_played = row_number(),
    epa_change = epa_per_play - lag(epa_per_play),
    success_change = success_rate - lag(success_rate)
  ) %>%
  ungroup()

# Identify top improvers (2022 to 2023)
top_improvers <- qb_development %>%
  filter(season == 2023, !is.na(epa_change)) %>%
  arrange(desc(epa_change)) %>%
  select(passer_player_name, season, dropbacks, epa_per_play, epa_change)

# Display
top_improvers %>%
  head(10) %>%
  gt() %>%
  cols_label(
    passer_player_name = "Quarterback",
    season = "Season",
    dropbacks = "Dropbacks",
    epa_per_play = "EPA/Play",
    epa_change = "EPA Change"
  ) %>%
  fmt_number(columns = c(epa_per_play, epa_change), decimals = 3) %>%
  tab_header(
    title = "Quarterback Development",
    subtitle = "2023 Season - Biggest YoY Improvements in EPA"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = epa_change, rows = epa_change > 0.05)
  )
#| label: player-development-py
#| message: false
#| warning: false
#| cache: true

# Load multiple seasons
pbp_multi = nfl.import_pbp_data([2021, 2022, 2023])

# Calculate QB development
qb_data = pbp_multi[
    (pbp_multi['play_type'].isin(['pass', 'run'])) &
    (pbp_multi['passer_player_id'].notna()) &
    (pbp_multi['epa'].notna())
].copy()

qb_development = qb_data.groupby(['season', 'passer_player_id', 'passer_player_name']).agg({
    'play_id': 'count',
    'epa': ['mean', lambda x: (x > 0).mean()]
}).reset_index()

qb_development.columns = ['season', 'player_id', 'player_name', 'dropbacks', 'epa_per_play', 'success_rate']
qb_development = qb_development[qb_development['dropbacks'] >= 100].copy()

# Calculate year-over-year changes
qb_development = qb_development.sort_values(['player_id', 'season'])
qb_development['epa_change'] = qb_development.groupby('player_id')['epa_per_play'].diff()
qb_development['success_change'] = qb_development.groupby('player_id')['success_rate'].diff()

# Top improvers 2023
top_improvers = qb_development[
    (qb_development['season'] == 2023) &
    (qb_development['epa_change'].notna())
].sort_values('epa_change', ascending=False).head(10)

print("\nQuarterback Development (2023 - Biggest YoY EPA Improvements):")
print(top_improvers[['player_name', 'season', 'dropbacks', 'epa_per_play', 'epa_change']].to_string(index=False))

Team-Level Player Development

#| label: team-development-r
#| message: false
#| warning: false

# Calculate team-level player development
# Focus on young players (first 3 seasons)
young_player_development <- pbp_multi %>%
  filter(
    play_type %in% c("pass", "run"),
    !is.na(epa)
  ) %>%
  # Get passer or rusher
  mutate(
    player_id = coalesce(passer_player_id, rusher_player_id),
    player_name = coalesce(passer_player_name, rusher_player_name)
  ) %>%
  filter(!is.na(player_id)) %>%
  group_by(season, player_id, player_name, posteam) %>%
  summarise(
    plays = n(),
    epa_per_play = mean(epa),
    .groups = "drop"
  ) %>%
  filter(plays >= 50) %>%
  # Calculate improvement
  group_by(player_id) %>%
  arrange(season) %>%
  mutate(
    seasons = n(),
    epa_change = epa_per_play - lag(epa_per_play)
  ) %>%
  filter(seasons >= 2, seasons <= 3, !is.na(epa_change)) %>%
  ungroup()

# Aggregate by team
team_development <- young_player_development %>%
  group_by(posteam) %>%
  summarise(
    players = n(),
    avg_improvement = mean(epa_change),
    positive_improvement_rate = mean(epa_change > 0),
    .groups = "drop"
  ) %>%
  filter(players >= 5) %>%
  arrange(desc(avg_improvement))

# Display
team_development %>%
  head(10) %>%
  gt() %>%
  cols_label(
    posteam = "Team",
    players = "Players",
    avg_improvement = "Avg EPA Improvement",
    positive_improvement_rate = "% Improving"
  ) %>%
  fmt_number(columns = avg_improvement, decimals = 3) %>%
  fmt_percent(columns = positive_improvement_rate, decimals = 1) %>%
  tab_header(
    title = "Team Player Development",
    subtitle = "2021-2023 | Young Players (Years 1-3)"
  )
#| label: team-development-py
#| message: false
#| warning: false

# Calculate team-level player development
player_data = pbp_multi[
    (pbp_multi['play_type'].isin(['pass', 'run'])) &
    (pbp_multi['epa'].notna())
].copy()

# Get player ID (passer or rusher)
player_data['player_id'] = player_data['passer_player_id'].fillna(player_data['rusher_player_id'])
player_data['player_name'] = player_data['passer_player_name'].fillna(player_data['rusher_player_name'])

player_data = player_data[player_data['player_id'].notna()].copy()

# Aggregate by player-season
player_seasons = player_data.groupby(['season', 'player_id', 'player_name', 'posteam']).agg({
    'play_id': 'count',
    'epa': 'mean'
}).reset_index()

player_seasons.columns = ['season', 'player_id', 'player_name', 'team', 'plays', 'epa_per_play']
player_seasons = player_seasons[player_seasons['plays'] >= 50].copy()

# Calculate improvement
player_seasons = player_seasons.sort_values(['player_id', 'season'])
player_seasons['seasons_played'] = player_seasons.groupby('player_id').cumcount() + 1
player_seasons['epa_change'] = player_seasons.groupby('player_id')['epa_per_play'].diff()

# Filter to young players with improvement data
young_players = player_seasons[
    (player_seasons['seasons_played'] >= 2) &
    (player_seasons['seasons_played'] <= 3) &
    (player_seasons['epa_change'].notna())
].copy()

# Aggregate by team
team_development = young_players.groupby('team').agg({
    'player_id': 'count',
    'epa_change': ['mean', lambda x: (x > 0).mean()]
}).reset_index()

team_development.columns = ['team', 'players', 'avg_improvement', 'positive_improvement_rate']
team_development = team_development[team_development['players'] >= 5].sort_values(
    'avg_improvement', ascending=False
)

print("\nTeam Player Development (2021-2023 - Young Players):")
print(team_development.head(10).to_string(index=False))

Halftime Adjustments

Second Half Performance

#| label: halftime-adjustments-r
#| message: false
#| warning: false

# Analyze first half vs second half performance (regulation only; exclude OT)
halftime_analysis <- pbp %>%
  filter(!is.na(epa), play_type %in% c("pass", "run"), qtr <= 4) %>%
  mutate(
    half = if_else(qtr <= 2, "First Half", "Second Half")
  ) %>%
  group_by(season, posteam, half) %>%
  summarise(
    plays = n(),
    epa_per_play = mean(epa),
    success_rate = mean(epa > 0),
    .groups = "drop"
  ) %>%
  pivot_wider(
    names_from = half,
    values_from = c(plays, epa_per_play, success_rate),
    names_glue = "{half}_{.value}"
  ) %>%
  mutate(
    epa_improvement = `Second Half_epa_per_play` - `First Half_epa_per_play`,
    success_improvement = `Second Half_success_rate` - `First Half_success_rate`
  )

# Display 2023 best adjusters
halftime_analysis %>%
  filter(season == 2023) %>%
  arrange(desc(epa_improvement)) %>%
  select(posteam, `First Half_epa_per_play`, `Second Half_epa_per_play`,
         epa_improvement, success_improvement) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    posteam = "Team",
    `First Half_epa_per_play` = "1st Half EPA",
    `Second Half_epa_per_play` = "2nd Half EPA",
    epa_improvement = "EPA Improvement",
    success_improvement = "Success Rate Δ"
  ) %>%
  fmt_number(columns = c(`First Half_epa_per_play`, `Second Half_epa_per_play`,
                         epa_improvement), decimals = 3) %>%
  fmt_number(columns = success_improvement, decimals = 3) %>%
  tab_header(
    title = "Halftime Adjustments",
    subtitle = "2023 Regular Season - Best Second Half Improvements"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = epa_improvement, rows = epa_improvement > 0.05)
  )
#| label: halftime-adjustments-py
#| message: false
#| warning: false

# Analyze halftime adjustments (regulation only; exclude overtime)
halftime_data = pbp[
    (pbp['epa'].notna()) &
    (pbp['play_type'].isin(['pass', 'run'])) &
    (pbp['qtr'] <= 4)
].copy()

halftime_data['half'] = halftime_data['qtr'].apply(
    lambda x: 'First Half' if x <= 2 else 'Second Half'
)

# Aggregate by team and half
halftime_agg = halftime_data.groupby(['season', 'posteam', 'half']).agg({
    'play_id': 'count',
    'epa': ['mean', lambda x: (x > 0).mean()]
}).reset_index()

halftime_agg.columns = ['season', 'team', 'half', 'plays', 'epa_per_play', 'success_rate']

# Pivot to compare halves
halftime_pivot = halftime_agg.pivot_table(
    index=['season', 'team'],
    columns='half',
    values=['epa_per_play', 'success_rate']
).reset_index()

halftime_pivot.columns = ['season', 'team', 'first_half_epa', 'second_half_epa',
                          'first_half_success', 'second_half_success']

halftime_pivot['epa_improvement'] = halftime_pivot['second_half_epa'] - halftime_pivot['first_half_epa']
halftime_pivot['success_improvement'] = halftime_pivot['second_half_success'] - halftime_pivot['first_half_success']

# 2023 best adjusters
halftime_2023 = halftime_pivot[halftime_pivot['season'] == 2023].sort_values(
    'epa_improvement', ascending=False
).head(10)

print("\nHalftime Adjustments (2023 - Best Second Half Improvements):")
print(halftime_2023[['team', 'first_half_epa', 'second_half_epa', 'epa_improvement']].to_string(index=False))

Adjustment Visualization

#| label: fig-halftime-viz-r
#| fig-cap: "First half vs second half EPA by team (2023)"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

halftime_analysis %>%
  filter(season == 2023) %>%
  ggplot(aes(x = `First Half_epa_per_play`, y = `Second Half_epa_per_play`)) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "gray40") +
  geom_nfl_logos(aes(team_abbr = posteam), width = 0.05, alpha = 0.8) +
  labs(
    title = "Halftime Adjustments: First vs Second Half Performance",
    subtitle = "2023 NFL Regular Season | Teams above line improved in second half",
    x = "First Half EPA per Play",
    y = "Second Half EPA per Play",
    caption = "Data: nflfastR"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    plot.subtitle = element_text(size = 11),
    panel.grid.minor = element_blank()
  )
#| label: fig-halftime-viz-py
#| fig-cap: "First half vs second half EPA - Python (2023)"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

# Create plot
fig, ax = plt.subplots(figsize=(10, 8))

# Use all 2023 teams (not just the top-10 table) for the axis range and points
plot_data = halftime_pivot[halftime_pivot['season'] == 2023]

# Get min and max for equal scales
all_epa = pd.concat([plot_data['first_half_epa'], plot_data['second_half_epa']])
epa_min, epa_max = all_epa.min(), all_epa.max()

# Diagonal line
ax.plot([epa_min, epa_max], [epa_min, epa_max], 'k--', alpha=0.3, linewidth=1)

# Scatter plot
for _, row in plot_data.iterrows():
    ax.scatter(row['first_half_epa'], row['second_half_epa'], s=100, alpha=0.7)
    ax.text(row['first_half_epa'], row['second_half_epa'], row['team'],
            fontsize=8, ha='center', va='center')

ax.set_xlabel('First Half EPA per Play', fontsize=12)
ax.set_ylabel('Second Half EPA per Play', fontsize=12)
ax.set_title('Halftime Adjustments: First vs Second Half Performance\n2023 NFL Regular Season',
             fontsize=14, fontweight='bold')
ax.text(0.98, 0.02, 'Data: nfl_data_py\nTeams above line improved in 2nd half',
        transform=ax.transAxes, ha='right', va='bottom', fontsize=8, style='italic')

plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Coaching Value Above Replacement

Estimating CVAR

Coaching Value Above Replacement (CVAR) estimates the win contribution of a coach relative to a replacement-level coach.
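
The arithmetic behind this estimate can be sketched as a small function. The 0.400 replacement baseline and 0.5 schedule weight mirror the code in this section; the example inputs are hypothetical, not real team data:

```python
def cvar_estimate(wins, games, pythag_diff, sos_diff,
                  replacement_win_pct=0.400, sos_weight=0.5):
    """Wins above a replacement-level coach, adjusted for Pythagorean
    over/under-performance and a schedule-strength bonus."""
    replacement_wins = games * replacement_win_pct
    war = wins - replacement_wins
    return war + pythag_diff + sos_weight * sos_diff

# Hypothetical 12-win team over 17 games, +1.5 Pythagorean wins, +0.8 SOS:
# 12 - 6.8 + 1.5 + 0.4 = 7.1
print(round(cvar_estimate(12, 17, 1.5, 0.8), 2))
```

A coach who exactly matches the replacement baseline with neutral Pythagorean and schedule adjustments scores zero, which is the intended anchor for the metric.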

#| label: cvar-calculation-r
#| message: false
#| warning: false

# Calculate CVAR components
cvar_data <- team_records_adj %>%
  filter(season >= 2015, season <= 2023) %>%
  mutate(
    # Win components
    pythag_over = pythag_diff,
    sos_boost = sos_diff,

    # Expected wins for replacement coach (0.400)
    replacement_wins = games * 0.400,

    # Wins above replacement
    war = wins - replacement_wins,

    # Adjusted for context
    cvar_estimate = war + pythag_over + (sos_boost * 0.5)
  )

# Aggregate by team (proxy for coaching staff)
team_cvar <- cvar_data %>%
  group_by(team) %>%
  summarise(
    seasons = n(),
    total_wins = sum(wins),
    avg_wins = mean(wins),
    total_cvar = sum(cvar_estimate),
    avg_cvar = mean(cvar_estimate),
    .groups = "drop"
  ) %>%
  arrange(desc(avg_cvar))

# Display top coaches (by team)
team_cvar %>%
  head(10) %>%
  gt() %>%
  cols_label(
    team = "Team",
    seasons = "Seasons",
    total_wins = "Total Wins",
    avg_wins = "Avg Wins",
    total_cvar = "Total CVAR",
    avg_cvar = "CVAR/Season"
  ) %>%
  fmt_number(columns = c(avg_wins, total_cvar, avg_cvar), decimals = 2) %>%
  tab_header(
    title = "Coaching Value Above Replacement",
    subtitle = "2015-2023 Regular Seasons"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = avg_cvar, rows = avg_cvar > 2)
  )
#| label: cvar-calculation-py
#| message: false
#| warning: false

# Calculate CVAR
cvar_data = team_records_adj[
    (team_records_adj['season'] >= 2015) &
    (team_records_adj['season'] <= 2023)
].copy()

# CVAR components
cvar_data['pythag_over'] = cvar_data['pythag_diff']
cvar_data['sos_boost'] = cvar_data['sos_diff']
cvar_data['replacement_wins'] = cvar_data['games'] * 0.400
cvar_data['war'] = cvar_data['wins'] - cvar_data['replacement_wins']
cvar_data['cvar_estimate'] = cvar_data['war'] + cvar_data['pythag_over'] + (cvar_data['sos_boost'] * 0.5)

# Aggregate by team
team_cvar = cvar_data.groupby('team').agg({
    'season': 'count',
    'wins': ['sum', 'mean'],
    'cvar_estimate': ['sum', 'mean']
}).reset_index()

team_cvar.columns = ['team', 'seasons', 'total_wins', 'avg_wins', 'total_cvar', 'avg_cvar']
team_cvar = team_cvar.sort_values('avg_cvar', ascending=False)

print("\nCoaching Value Above Replacement (2015-2023):")
print(team_cvar.head(10).to_string(index=False))

CVAR Visualization

#| label: fig-cvar-r
#| fig-cap: "Coaching Value Above Replacement by team (2015-2023)"
#| fig-width: 10
#| fig-height: 10
#| message: false
#| warning: false

team_cvar %>%
  mutate(team = fct_reorder(team, avg_cvar)) %>%
  ggplot(aes(x = avg_cvar, y = team)) +
  geom_col(aes(fill = avg_cvar > 0), show.legend = FALSE) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
  scale_fill_manual(values = c("TRUE" = "darkgreen", "FALSE" = "darkred")) +
  labs(
    title = "Coaching Value Above Replacement (CVAR)",
    subtitle = "2015-2023 Regular Seasons | Average CVAR per Season",
    x = "Average CVAR per Season",
    y = NULL,
    caption = "Data: nflfastR | Positive = Better than replacement coach"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    plot.subtitle = element_text(size = 11),
    panel.grid.major.y = element_blank(),
    panel.grid.minor = element_blank()
  )


#| label: fig-cvar-py
#| fig-cap: "Coaching Value Above Replacement - Python (2015-2023)"
#| fig-width: 10
#| fig-height: 10
#| message: false
#| warning: false

# Sort by CVAR
team_cvar_sorted = team_cvar.sort_values('avg_cvar')

# Create plot
fig, ax = plt.subplots(figsize=(10, 10))

# Color based on positive/negative
colors = ['darkgreen' if x > 0 else 'darkred' for x in team_cvar_sorted['avg_cvar']]

ax.barh(team_cvar_sorted['team'], team_cvar_sorted['avg_cvar'], color=colors, alpha=0.7)
ax.axvline(x=0, color='black', linestyle='--', linewidth=1)

ax.set_xlabel('Average CVAR per Season', fontsize=12)
ax.set_title('Coaching Value Above Replacement (CVAR)\n2015-2023 Regular Seasons',
             fontsize=14, fontweight='bold')
ax.text(0.98, 0.02, 'Data: nfl_data_py\nPositive = Better than replacement',
        transform=ax.transAxes, ha='right', va='bottom', fontsize=8, style='italic')

plt.grid(True, axis='x', alpha=0.3)
plt.tight_layout()
plt.show()


Advanced Coaching Metrics

Situational Coaching

#| label: situational-coaching-r
#| message: false
#| warning: false

# Analyze coaching in key situations
situational_performance <- pbp %>%
  filter(!is.na(epa), play_type %in% c("pass", "run")) %>%
  mutate(
    situation = case_when(
      down == 3 & ydstogo <= 3 ~ "3rd & Short",
      down == 3 & ydstogo > 3 ~ "3rd & Long",
      yardline_100 <= 10 ~ "Red Zone",
      score_differential >= -8 & score_differential <= 8 &
        game_seconds_remaining <= 300 ~ "Close Late",
      TRUE ~ "Other"
    )
  ) %>%
  filter(situation != "Other") %>%
  group_by(season, posteam, situation) %>%
  summarise(
    plays = n(),
    epa_per_play = mean(epa),
    success_rate = mean(epa > 0),
    .groups = "drop"
  )

# Display 2023 red zone performance
situational_performance %>%
  filter(season == 2023, situation == "Red Zone") %>%
  arrange(desc(epa_per_play)) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    posteam = "Team",
    plays = "Plays",
    epa_per_play = "EPA/Play",
    success_rate = "Success Rate"
  ) %>%
  fmt_number(columns = epa_per_play, decimals = 3) %>%
  fmt_percent(columns = success_rate, decimals = 1) %>%
  tab_header(
    title = "Red Zone Coaching Performance",
    subtitle = "2023 Regular Season - Best Red Zone Offenses"
  )
#| label: situational-coaching-py
#| message: false
#| warning: false

# Analyze situational performance
situational_data = pbp[
    (pbp['epa'].notna()) &
    (pbp['play_type'].isin(['pass', 'run']))
].copy()

# Categorize situations. Labels are applied from lowest to highest
# priority so later assignments overwrite, matching the first-match
# ordering of the R case_when above.
situational_data['situation'] = 'Other'
situational_data.loc[
    (situational_data['score_differential'].abs() <= 8) &
    (situational_data['game_seconds_remaining'] <= 300),
    'situation'
] = 'Close Late'
situational_data.loc[
    situational_data['yardline_100'] <= 10,
    'situation'
] = 'Red Zone'
situational_data.loc[
    (situational_data['down'] == 3) & (situational_data['ydstogo'] > 3),
    'situation'
] = '3rd & Long'
situational_data.loc[
    (situational_data['down'] == 3) & (situational_data['ydstogo'] <= 3),
    'situation'
] = '3rd & Short'

# Aggregate
situational_agg = situational_data[
    situational_data['situation'] != 'Other'
].groupby(['season', 'posteam', 'situation']).agg({
    'play_id': 'count',
    'epa': ['mean', lambda x: (x > 0).mean()]
}).reset_index()

situational_agg.columns = ['season', 'team', 'situation', 'plays', 'epa_per_play', 'success_rate']

# Display 2023 red zone
red_zone_2023 = situational_agg[
    (situational_agg['season'] == 2023) &
    (situational_agg['situation'] == 'Red Zone')
].sort_values('epa_per_play', ascending=False).head(10)

print("\nRed Zone Coaching Performance (2023 - Best Red Zone Offenses):")
print(red_zone_2023[['team', 'plays', 'epa_per_play', 'success_rate']].to_string(index=False))

Roster Management Efficiency

#| label: roster-management-r
#| message: false
#| warning: false

# Analyze playing time distribution
# Focus on offensive snaps
snap_distribution <- pbp %>%
  filter(play_type %in% c("pass", "run"), !is.na(posteam)) %>%
  # Count distinct passers and rushers used per game
  group_by(season, posteam, game_id) %>%
  summarise(
    total_plays = n(),
    unique_passers = n_distinct(passer_player_id[!is.na(passer_player_id)]),
    unique_rushers = n_distinct(rusher_player_id[!is.na(rusher_player_id)]),
    .groups = "drop"
  ) %>%
  group_by(season, posteam) %>%
  summarise(
    games = n_distinct(game_id),
    avg_plays = mean(total_plays),
    avg_passers = mean(unique_passers),
    avg_rushers = mean(unique_rushers),
    .groups = "drop"
  )

# Display 2023
snap_distribution %>%
  filter(season == 2023) %>%
  mutate(
    personnel_diversity = avg_passers + avg_rushers
  ) %>%
  arrange(desc(personnel_diversity)) %>%
  select(posteam, avg_plays, avg_passers, avg_rushers, personnel_diversity) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    posteam = "Team",
    avg_plays = "Avg Plays",
    avg_passers = "Avg Passers",
    avg_rushers = "Avg Rushers",
    personnel_diversity = "Diversity Score"
  ) %>%
  fmt_number(columns = c(avg_plays, avg_passers, avg_rushers, personnel_diversity),
             decimals = 1) %>%
  tab_header(
    title = "Roster Management: Personnel Diversity",
    subtitle = "2023 Regular Season - Average per Game"
  )
#| label: roster-management-py
#| message: false
#| warning: false

# Analyze snap distribution
snap_data = pbp[
    (pbp['play_type'].isin(['pass', 'run'])) &
    (pbp['posteam'].notna())
].copy()

# Aggregate by game
snap_dist = snap_data.groupby(['season', 'posteam', 'game_id']).agg({
    'play_id': 'count',
    'passer_player_id': lambda x: x.nunique(),
    'rusher_player_id': lambda x: x.nunique()
}).reset_index()

snap_dist.columns = ['season', 'team', 'game_id', 'total_plays', 'unique_passers', 'unique_rushers']

# Average per team-season
snap_summary = snap_dist.groupby(['season', 'team']).agg({
    'game_id': 'count',
    'total_plays': 'mean',
    'unique_passers': 'mean',
    'unique_rushers': 'mean'
}).reset_index()

snap_summary.columns = ['season', 'team', 'games', 'avg_plays', 'avg_passers', 'avg_rushers']
snap_summary['personnel_diversity'] = snap_summary['avg_passers'] + snap_summary['avg_rushers']

# Display 2023
snap_2023 = snap_summary[snap_summary['season'] == 2023].sort_values(
    'personnel_diversity', ascending=False
).head(10)

print("\nRoster Management: Personnel Diversity (2023):")
print(snap_2023[['team', 'avg_plays', 'avg_passers', 'avg_rushers', 'personnel_diversity']].to_string(index=False))

Coaching Stability and Tenure

Tenure Analysis

#| label: coaching-tenure-r
#| message: false
#| warning: false

# Analyze relationship between wins and consistency
tenure_performance <- team_records %>%
  arrange(team, season) %>%
  group_by(team) %>%
  mutate(
    # Calculate rolling averages
    win_pct_3yr = zoo::rollmean(win_pct, k = 3, fill = NA, align = "right"),
    win_volatility = zoo::rollapply(win_pct, width = 3, FUN = sd,
                                     fill = NA, align = "right")
  ) %>%
  filter(!is.na(win_pct_3yr))

# Summarize by team
team_stability <- tenure_performance %>%
  group_by(team) %>%
  summarise(
    seasons = n(),
    avg_win_pct = mean(win_pct),
    avg_3yr_win_pct = mean(win_pct_3yr),
    volatility = mean(win_volatility, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(avg_win_pct))

# Display most stable successful teams
team_stability %>%
  filter(avg_win_pct > 0.500) %>%
  arrange(volatility) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    team = "Team",
    seasons = "Seasons",
    avg_win_pct = "Avg Win %",
    avg_3yr_win_pct = "3-Yr Avg Win %",
    volatility = "Volatility"
  ) %>%
  fmt_percent(columns = c(avg_win_pct, avg_3yr_win_pct), decimals = 1) %>%
  fmt_number(columns = volatility, decimals = 3) %>%
  tab_header(
    title = "Coaching Stability",
    subtitle = "Most Consistent Winning Teams (2015-2023)"
  )
#| label: coaching-tenure-py
#| message: false
#| warning: false

# Calculate rolling metrics
team_records_sorted = team_records.sort_values(['team', 'season'])

# Calculate 3-year rolling average and volatility
team_records_sorted['win_pct_3yr'] = team_records_sorted.groupby('team')['win_pct'].transform(
    lambda x: x.rolling(window=3, min_periods=3).mean()
)
team_records_sorted['win_volatility'] = team_records_sorted.groupby('team')['win_pct'].transform(
    lambda x: x.rolling(window=3, min_periods=3).std()
)

# Filter valid data
tenure_data = team_records_sorted[team_records_sorted['win_pct_3yr'].notna()].copy()

# Summarize by team
team_stability = tenure_data.groupby('team').agg({
    'season': 'count',
    'win_pct': 'mean',
    'win_pct_3yr': 'mean',
    'win_volatility': 'mean'
}).reset_index()

team_stability.columns = ['team', 'seasons', 'avg_win_pct', 'avg_3yr_win_pct', 'volatility']

# Most stable successful teams
stable_winners = team_stability[team_stability['avg_win_pct'] > 0.500].sort_values('volatility')

print("\nCoaching Stability - Most Consistent Winning Teams (2015-2023):")
print(stable_winners.head(10).to_string(index=False))

Summary

Coaching evaluation requires a nuanced, multi-faceted approach that goes beyond simple win-loss records. In this chapter, we explored:

Win-Loss Context: We calculated Pythagorean expected wins and adjusted for schedule strength to understand whether teams exceeded or fell short of expectations.
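
The Pythagorean expectation mentioned above can be computed directly; the 2.37 exponent is a commonly cited football value and should be treated as a tunable assumption:

```python
def pythagorean_wins(points_for, points_against, games=17, exponent=2.37):
    """Expected wins implied by points scored and allowed
    (football-style Pythagorean expectation)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)

# Hypothetical team: 450 points scored, 350 allowed over 17 games
print(round(pythagorean_wins(450, 350), 2))
```

Comparing these expected wins to actual wins gives the over/under-performance term used in the CVAR calculation earlier in the chapter.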

Decision Quality: We analyzed fourth down decisions, challenge success rates, and timeout management to evaluate in-game coaching acumen.
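
The fourth-down logic summarized here reduces to comparing expected values. A minimal sketch, with illustrative (not model-derived) conversion probabilities and EPA values:

```python
def go_for_it_value(p_convert, epa_success, epa_failure):
    """Expected EPA of attempting the fourth-down conversion."""
    return p_convert * epa_success + (1 - p_convert) * epa_failure

def best_decision(p_convert, epa_success, epa_failure, epa_kick):
    """Pick the higher expected-EPA option: go for it vs punt/kick."""
    go_value = go_for_it_value(p_convert, epa_success, epa_failure)
    return ("go", go_value) if go_value > epa_kick else ("kick", epa_kick)

# Hypothetical 4th-and-1 near midfield: high conversion odds favor going
print(best_decision(0.70, 1.2, -1.5, -0.3))
```

A production model would estimate `p_convert` and each EPA term from play-by-play data; the comparison itself stays this simple.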

Coordinator Impact: We measured offensive and defensive coordinator efficiency using EPA, success rates, and situational performance metrics.

Player Development: We tracked year-over-year player improvement to identify coaching staffs that excel at developing talent.

Halftime Adjustments: We compared first-half and second-half performance to evaluate coaching staffs' ability to make effective adjustments.

Coaching Value Above Replacement (CVAR): We developed a comprehensive metric that estimates coaching contribution above a replacement-level coach.

Situational Performance: We examined red zone efficiency, third down performance, and late-game execution.

Roster Management: We analyzed personnel diversity and playing time distribution as indicators of coaching philosophy.

Stability and Tenure: We measured performance volatility to identify consistently successful coaching staffs.

Key Insights

  1. **Context matters enormously**: Raw win-loss records can be misleading without considering roster talent, schedule strength, and injury luck
  2. **Decision quality is measurable**: Fourth down decisions, challenges, and timeout usage provide objective evidence of coaching acumen
  3. **Coordinators drive results**: Offensive and defensive coordinator performance can be evaluated independently using EPA and situational metrics
  4. **Player development is a differentiator**: Coaching staffs that consistently improve young players create sustainable competitive advantages
  5. **Adjustments separate good from great**: The ability to make effective halftime adjustments distinguishes elite coaching staffs
  6. **Small samples require caution**: Single-season coaching evaluations suffer from high variance and should be interpreted carefully

Exercises

Conceptual Questions

  1. Context in Evaluation: Why is it problematic to evaluate coaches solely on win-loss records? What contextual factors should be considered?

  2. Decision Quality: How would you design a comprehensive "coaching decision quality" score that incorporates fourth down decisions, challenge usage, and timeout management?

  3. Attribution Problem: How do you separate the impact of a head coach from coordinators and position coaches when evaluating team success?

  4. Sample Size: Given that NFL seasons are only 17 games, what statistical methods would you use to reduce noise in single-season coaching evaluations?
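
For Question 4, one standard noise-reduction approach is shrinking single-season values toward the league mean in proportion to sample size. A minimal empirical-Bayes-style sketch (the 17-game prior weight is an illustrative assumption):

```python
def shrink_toward_mean(observed, n_games, league_mean, prior_games=17):
    """Weight an observed team value against a league-average prior;
    larger samples keep more of the raw observation."""
    w = n_games / (n_games + prior_games)
    return w * observed + (1 - w) * league_mean

# One 17-game season of +0.10 EPA/play against a 0.00 league mean
# is pulled halfway back toward the mean:
print(shrink_toward_mean(0.10, 17, 0.0))
```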

Coding Exercises

Exercise 1: Expected Wins Model

Build a comprehensive expected wins model that predicts team wins based on:

  a) Pythagorean expectation (points scored and allowed)
  b) Schedule strength (opponent win percentage)
  c) Point differential in close games (within 1 score)
  d) Previous season performance

Calculate expected vs actual wins for each team in 2023.

**Hint**: Use multiple regression with these features to predict wins.

Exercise 2: Fourth Down Decision Dashboard

Create a comprehensive fourth down decision analysis:

  a) Calculate the "optimal" decision for each 4th down situation using EPA
  b) Compare each team's actual decisions to optimal decisions
  c) Compute a "decision quality score" for each team
  d) Identify which coaches are most/least aggressive on 4th downs

**Bonus**: Analyze whether aggressive 4th down strategies correlate with team success.

Exercise 3: Coordinator Efficiency Rankings

Build a comprehensive coordinator ranking system:

  a) Calculate offensive coordinator rankings using EPA, success rate, and explosive play rate
  b) Calculate defensive coordinator rankings using EPA allowed, pressure rate, and turnover rate
  c) Adjust for opponent strength
  d) Create visualizations comparing offensive and defensive coordinator efficiency

**Advanced**: Track coordinator performance when they change teams.

Exercise 4: Player Development Score

Develop a "Player Development Score" for each team:

  a) Identify players in their first 3 seasons
  b) Calculate year-over-year EPA improvement for these players
  c) Aggregate by team to create a team-level development score
  d) Analyze whether teams with better development scores have more sustainable success

**Hint**: Focus on positions with sufficient sample sizes (QB, RB, WR).

Exercise 5: CVAR Model

Build your own Coaching Value Above Replacement model:

  a) Calculate expected wins based on multiple factors (roster talent, schedule, etc.)
  b) Compute wins above replacement (assume replacement = 0.400)
  c) Adjust for Pythagorean over/under performance
  d) Create a comprehensive CVAR metric

**Validation**: Check if CVAR correlates with coaching tenure and future success.

Exercise 6: Halftime Adjustment Analysis

Analyze halftime adjustments comprehensively:

  a) Compare first-half and second-half EPA for all teams
  b) Break down by game situation (leading, trailing, tied)
  c) Identify which teams make the best adjustments when trailing
  d) Create visualizations showing adjustment patterns

**Advanced**: Analyze whether adjustment ability predicts playoff success.

Further Reading

Academic Research

  • Burke, B. (2019). "The Power of Expected Points." ESPN Analytics.
  • Eagleman, A. M. & Rodgers, K. (2019). "'Fake News' in Sports: An Examination of the Relationship Between Media, Analytics, and Team Performance in Professional Football." Communication & Sport, 7(5), 566-581.
  • Lopez, M. J., Matthews, G. J., & Baumer, B. S. (2018). "How Often Does the Best Team Win? A Unified Approach to Understanding Randomness in North American Sport." Annals of Applied Statistics, 12(4), 2483-2516.

Practitioner Resources

  • Football Outsiders: DVOA and coaching efficiency metrics
  • ESPN Analytics: Win probability and coaching decision analysis
  • The Athletic: In-depth coaching performance analysis
  • Sharp Football Analysis: Fourth down and coaching decision tracking

Books

  • Carroll, B., Palmer, P., & Thorn, J. (1988). The Hidden Game of Football. New York: Warner Books.
  • Alamar, B. C. (2013). Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers. Columbia University Press.

Methodological References

  • Pythagorean Expectation: Bill James (Baseball), adapted for football by various analysts
  • Fourth Down Decision Models: Based on EPA framework from nflfastR
  • Win Probability Models: Burke, B. (Advanced NFL Stats)

References

:::