Learning Objectives
By the end of this chapter, you will be able to:
- Analyze play-calling tendencies and patterns across different situations
- Identify optimal play-call sequences using data-driven methods
- Understand how defenses adjust to offensive tendencies
- Apply game theory principles to play-calling strategy
- Measure and evaluate play-caller efficiency and effectiveness
Introduction
Play-calling is one of the most scrutinized aspects of football. Every Sunday, millions of fans second-guess offensive coordinators, debating whether they should have run or passed, whether they were too conservative or too aggressive. Yet until recently, these debates were largely based on gut feelings rather than rigorous analysis.
Modern analytics provides unprecedented insight into play-calling patterns. We can now quantify tendencies, measure predictability, assess effectiveness across situations, and even identify optimal strategies using game theory. This chapter explores the data science of play-calling, from basic tendency analysis to sophisticated optimization techniques.
The art and science of play-calling represents one of football's most fascinating strategic challenges. An offensive coordinator must balance competing objectives: maintaining unpredictability to prevent defensive adjustments, optimizing play selection for each specific situation, leveraging personnel matchups, executing the game plan, and managing risk. Each play call occurs within layers of strategic consideration—what worked earlier in the game, what the defense has shown, what the down-and-distance situation demands, and what the score and clock dictate.
What makes play-calling analysis particularly compelling is that we can now measure what was previously invisible. Before the analytics era, evaluating play-calling required subjective judgment: Did it "feel" like the coordinator was too predictable? Were they "aggressive enough"? Today, we can quantify these qualities. We can measure exactly how predictable a coordinator is in specific situations, calculate the cost of that predictability, and even determine optimal play frequencies using game theory principles.
This chapter approaches play-calling through multiple lenses. We'll start with descriptive analysis—measuring tendencies across situations, personnel groupings, and game states. We'll examine how down and distance, field position, score differential, and time remaining influence play selection. We'll analyze personnel groupings and their associated tendencies, revealing how defenses read formation tells. Then we'll move to prescriptive analysis, using game theory to understand the value of unpredictability and building recommendation systems for optimal play-calling. Throughout, we'll connect statistical findings to football strategy, explaining not just what the numbers show but why they matter for winning games.
What is Play-Calling Analysis?
Play-calling analysis examines the strategic decisions offensive coordinators make regarding play selection. It encompasses the study of run-pass balance, situational tendencies, personnel groupings, down-and-distance patterns, and the effectiveness of different play-calling approaches.
The Importance of Play-Calling Analytics
Effective play-calling requires balancing several competing objectives:
- Unpredictability: Avoiding patterns that defenses can exploit
- Situation optimization: Calling the right plays for each down and distance
- Personnel matching: Leveraging favorable matchups
- Script execution: Following game plan while adapting to circumstances
- Risk management: Balancing aggression with ball security
Analytics helps coordinators achieve this balance by:
- Identifying their own tendencies and blind spots
- Detecting opponent tendencies to exploit
- Measuring effectiveness across different situations
- Quantifying the cost of predictability
- Providing data-driven optimization recommendations
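One way to make "quantifying the cost of predictability" concrete is Shannon entropy of the play-call distribution: a coordinator who passes 90% of the time in a situation carries far less uncertainty for the defense than one sitting near 55%. A minimal sketch in Python (the situation labels and pass rates here are hypothetical):

```python
import numpy as np

def play_call_entropy(pass_rate: float) -> float:
    """Shannon entropy (bits) of a binary run/pass distribution.

    1.0 = perfectly unpredictable (50/50); 0.0 = fully predictable.
    """
    p = np.clip(pass_rate, 1e-12, 1 - 1e-12)  # guard against log(0)
    return float(-(p * np.log2(p) + (1 - p) * np.log2(1 - p)))

# Hypothetical situational pass rates for one coordinator
situations = {
    "1st & 10":    0.52,  # nearly balanced -> high entropy
    "3rd & short": 0.45,
    "3rd & long":  0.90,  # heavily pass -> low entropy, exploitable
}

for situation, rate in situations.items():
    print(f"{situation:12s} pass rate {rate:.2f} -> "
          f"entropy {play_call_entropy(rate):.3f} bits")
```

Entropy is only a descriptive measure—low entropy on third-and-long may be perfectly rational—but tracking it by situation is a quick way to flag where a coordinator has drifted toward predictability.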
The Predictability Problem
A perfectly balanced 50-50 run-pass offense isn't necessarily optimal. The goal is to call plays that maximize expected points while maintaining enough unpredictability to prevent defensive adjustments. Game theory provides the framework for finding this optimal balance.
Measuring Play-Calling Tendencies
Basic Tendency Metrics
The foundation of play-calling analysis is measuring frequencies across different contexts. Before we can optimize play-calling or identify exploitable patterns, we need to establish baseline measurements: How often does a team pass versus run? How does this balance vary by team, situation, and context?
Understanding basic tendencies serves multiple purposes. For offensive coordinators, it reveals their own blind spots—situations where they've become too predictable. For defensive coordinators, it identifies patterns to exploit. For analysts, it provides the foundation for more sophisticated analyses of efficiency, predictability, and optimization.
At its simplest level, tendency analysis calculates frequency distributions: In situation X, how often does team Y call play type Z? But even this basic analysis reveals fascinating patterns. Some teams pass on 60-65% of plays, while others stay closer to 50-50. These differences reflect philosophical approaches, personnel strengths, and strategic choices.
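The frequency-distribution question above maps directly onto a cross-tabulation. A toy sketch with a handful of made-up plays (the column names mirror the nflfastR fields used later in this chapter, but the data is hypothetical):

```python
import pandas as pd

# Hypothetical plays: which team called which play type on which down
plays = pd.DataFrame({
    "posteam":   ["KC", "KC", "KC", "SF", "SF", "SF", "SF", "KC"],
    "down":      [1, 1, 3, 1, 2, 3, 3, 2],
    "play_type": ["pass", "run", "pass", "run", "run", "pass", "pass", "pass"],
})

# In situation X (down), how often does team Y call play type Z?
rates = pd.crosstab(
    [plays["posteam"], plays["down"]],  # rows: team x situation
    plays["play_type"],                 # columns: play type
    normalize="index",                  # convert counts to row proportions
)
print(rates)
```

With real play-by-play data, the same one-liner scales to any situational split—just swap the row variables for down-and-distance buckets, personnel groupings, or game-script categories.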
The methodology for calculating tendencies is straightforward but requires careful attention to data filtering. We must exclude plays that don't represent true offensive play-calling decisions: penalties before the snap, special teams plays, quarterback kneels, and spikes are all excluded because they don't reflect strategic offensive choices. We focus on plays where the offense genuinely chose between running and passing.
When measuring team tendencies, we calculate several key metrics simultaneously. First, we count the raw frequencies—how many passes versus runs did each team call? Second, we convert these counts into rates (proportions) to make teams comparable regardless of how many total plays they ran. Third, we measure the efficiency of each play type using EPA, allowing us to assess not just what teams did but how well it worked.
Let's start by calculating the most basic tendency metric: overall run-pass balance by team. This gives us a sense of each team's offensive identity and philosophy:
#| label: setup-r
#| message: false
#| warning: false
library(tidyverse)
library(nflfastR)
library(nflplotR)
library(gt)
library(ggridges)
# Load 2023 season data
pbp_2023 <- load_pbp(2023)
#| label: basic-tendencies-r
#| message: false
#| warning: false
# Calculate basic play-calling tendencies by team
# This analysis reveals each team's overall offensive philosophy and approach
# We're measuring both frequency (how often) and efficiency (how well)
team_tendencies <- pbp_2023 %>%
# Filter to valid offensive plays with known teams
filter(
!is.na(down), # Must have a valid down
!is.na(posteam), # Must know which team has the ball
play_type %in% c("run", "pass") # Only standard run/pass plays
) %>%
# Group all plays by the offensive team
group_by(posteam) %>%
# Calculate summary statistics for each team
summarise(
total_plays = n(), # Total offensive plays
pass_plays = sum(play_type == "pass"), # Count of pass attempts
run_plays = sum(play_type == "run"), # Count of run attempts
pass_rate = pass_plays / total_plays, # Pass frequency (0-1)
run_rate = run_plays / total_plays, # Run frequency (0-1)
# Calculate average EPA for each play type
avg_epa_pass = mean(epa[play_type == "pass"], na.rm = TRUE), # Pass efficiency
avg_epa_run = mean(epa[play_type == "run"], na.rm = TRUE), # Run efficiency
.groups = "drop"
) %>%
# Sort from most pass-heavy to most run-heavy
arrange(desc(pass_rate))
# Display results
team_tendencies %>%
head(10) %>%
gt() %>%
cols_label(
posteam = "Team",
total_plays = "Total Plays",
pass_plays = "Pass Plays",
run_plays = "Run Plays",
pass_rate = "Pass Rate",
run_rate = "Run Rate",
avg_epa_pass = "Pass EPA",
avg_epa_run = "Run EPA"
) %>%
fmt_number(
columns = c(pass_rate, run_rate),
decimals = 3
) %>%
fmt_number(
columns = c(avg_epa_pass, avg_epa_run),
decimals = 3
) %>%
fmt_number(
columns = c(total_plays, pass_plays, run_plays),
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = pass_rate,
colors = scales::col_numeric(
palette = c("#F8766D", "#FFFFFF", "#00BFC4"),
domain = c(0.4, 0.7)
)
) %>%
tab_header(
title = "Team Play-Calling Tendencies",
subtitle = "2023 NFL Regular Season"
)
#| label: setup-py
#| message: false
#| warning: false
import pandas as pd
import numpy as np
import nfl_data_py as nfl
import matplotlib.pyplot as plt
import seaborn as sns
# Load 2023 season data
pbp_2023 = nfl.import_pbp_data([2023])
#| label: basic-tendencies-py
#| message: false
#| warning: false
# Calculate basic play-calling tendencies by team
# This analysis reveals each team's overall offensive philosophy and approach
# We're measuring both frequency (how often) and efficiency (how well)
team_tendencies = (pbp_2023
# Filter to valid offensive plays only
.query("down.notna() & posteam.notna() & play_type.isin(['run', 'pass'])")
# Group by team
.groupby('posteam')
# Calculate aggregate statistics
.agg(
total_plays=('play_type', 'count'), # Total plays
pass_plays=('play_type', lambda x: (x == 'pass').sum()), # Pass count
run_plays=('play_type', lambda x: (x == 'run').sum()), # Run count
# Calculate EPA separately for pass and run plays
avg_epa_pass=('epa', lambda x: x[pbp_2023.loc[x.index, 'play_type'] == 'pass'].mean()),
avg_epa_run=('epa', lambda x: x[pbp_2023.loc[x.index, 'play_type'] == 'run'].mean())
)
# Add calculated columns for rates
.assign(
pass_rate=lambda x: x['pass_plays'] / x['total_plays'], # Pass frequency
run_rate=lambda x: x['run_plays'] / x['total_plays'] # Run frequency
)
# Sort from most pass-heavy to most run-heavy
.sort_values('pass_rate', ascending=False)
.reset_index()
)
# Display top 10 teams
print("Team Play-Calling Tendencies (2023 NFL Regular Season)")
print("=" * 90)
print(team_tendencies.head(10).to_string(index=False))
These results reveal significant variation in offensive philosophy across the NFL. The most pass-heavy teams typically have pass rates around 60-65%, while the most run-heavy teams might be around 45-50%. This 15-20 percentage point spread represents fundamentally different approaches to offense.
Why Teams Vary in Pass Rate: Several factors drive these differences:
- Personnel Strengths: Teams with elite quarterbacks (like the Dolphins with Tua Tagovailoa or the Cowboys with Dak Prescott) tend to pass more frequently. Teams with strong offensive lines and running backs may run more.
- Offensive Philosophy: Some coordinators believe in establishing the run to set up play action and control the clock. Others believe in attacking through the air to maximize efficiency.
- Game Script Effects: Teams that frequently play with leads run more to protect those leads and drain the clock. Teams that trail frequently are forced to pass more.
- Defensive Opponent Strength: A team facing strong run defenses all season might pass more by necessity.
When we examine the EPA metrics alongside pass rate, we often find an intriguing pattern: passing is more efficient (higher EPA) than running for nearly every team, yet most teams run the ball 40-45% of the time. This apparent paradox—why don't teams always use the more efficient play type?—is central to understanding play-calling strategy and will be explored in depth later in this chapter.
Interpreting Pass Rate in Context
A team's pass rate alone doesn't tell you if their play-calling is optimal. A team might have a 60% pass rate because they're always trailing (forced to pass), or because they have an elite quarterback (choosing to pass). Always consider game script, personnel, and opponent strength when evaluating tendencies.
Situational Tendencies
Play-calling tendencies vary dramatically by situation. Let's examine tendencies across down and distance:
While overall team pass rates provide a useful baseline, they mask enormous situational variation. A team might pass 55% of the time overall, but that could mean passing 45% on first down, 50% on second-and-short, and 85% on third-and-long. Understanding these situational tendencies is crucial for both offensive and defensive strategy.
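A team's overall pass rate is simply the snap-weighted average of its situational rates, which is why the same headline number can hide very different profiles. A quick arithmetic check using the hypothetical situational rates above (the snap shares are also made up for illustration):

```python
import numpy as np

# Hypothetical situational pass rates and each situation's share of snaps
pass_rates  = np.array([0.45, 0.50, 0.85])  # 1st down, 2nd-and-short, 3rd-and-long
play_shares = np.array([0.50, 0.30, 0.20])  # fraction of offensive snaps (sums to 1)

# Overall rate = snap-weighted average of situational rates
overall = float(np.dot(pass_rates, play_shares))
print(f"Overall pass rate: {overall:.3f}")  # ~0.545, despite a 45% first-down rate
```

Two teams with identical 55% overall rates can therefore be wildly different opponents: one balanced everywhere, the other predictable in every individual situation.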
Down and distance create the fundamental strategic structure of football. First down offers maximum flexibility—nearly any play type can work. Second down's optimal strategy depends heavily on first-down results: after a successful first-down play, teams have flexibility; after a failed first-down play, teams face pressure. Third down becomes increasingly pass-heavy as distance increases, with third-and-long situations approaching 90% pass rate.
These tendencies aren't arbitrary—they reflect the constraints and opportunities each situation presents. On third-and-15, running the ball rarely converts the first down, so teams pass despite the predictability. On third-and-1, defenses know teams might run, but stopping a power running play is still difficult. The key question is whether teams deviate from optimal strategy due to excessive predictability. Let's measure these patterns:
Common Mistake: Ignoring Neutral Game Script
When analyzing situational tendencies, always filter for neutral game script (score differential within ±7) to avoid confounding effects. Teams trailing by 14 will pass even on third-and-short, skewing your tendency analysis if you don't account for score.
#| label: situational-tendencies-r
#| message: false
#| warning: false
# Calculate tendencies by down and distance categories
situational_tendencies <- pbp_2023 %>%
filter(
!is.na(down),
down <= 3,
play_type %in% c("run", "pass")
) %>%
mutate(
distance_category = case_when(
ydstogo <= 3 ~ "Short (1-3)",
ydstogo <= 7 ~ "Medium (4-7)",
ydstogo <= 10 ~ "Long (8-10)",
TRUE ~ "Very Long (11+)"
),
distance_category = factor(
distance_category,
levels = c("Short (1-3)", "Medium (4-7)", "Long (8-10)", "Very Long (11+)")
)
) %>%
group_by(down, distance_category) %>%
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
avg_epa = mean(epa, na.rm = TRUE),
success_rate = mean(epa > 0, na.rm = TRUE),
.groups = "drop"
)
# Display as table
situational_tendencies %>%
gt() %>%
cols_label(
down = "Down",
distance_category = "Distance",
plays = "Plays",
pass_rate = "Pass Rate",
avg_epa = "Avg EPA",
success_rate = "Success Rate"
) %>%
fmt_number(
columns = c(pass_rate, avg_epa, success_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = pass_rate,
colors = scales::col_numeric(
palette = c("#F8766D", "#FFFFFF", "#00BFC4"),
domain = c(0, 1)
)
) %>%
tab_header(
title = "Play-Calling Tendencies by Down and Distance",
subtitle = "2023 NFL Regular Season"
)
#| label: situational-tendencies-py
#| message: false
#| warning: false
# Calculate tendencies by down and distance categories
def categorize_distance(yards):
if yards <= 3:
return "Short (1-3)"
elif yards <= 7:
return "Medium (4-7)"
elif yards <= 10:
return "Long (8-10)"
else:
return "Very Long (11+)"
situational_tendencies = (pbp_2023
.query("down.notna() & down <= 3 & play_type.isin(['run', 'pass'])")
.assign(distance_category=lambda x: x['ydstogo'].apply(categorize_distance))
.groupby(['down', 'distance_category'])
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean()),
avg_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean())
)
.reset_index()
)
# Sort by down and distance order
distance_order = ["Short (1-3)", "Medium (4-7)", "Long (8-10)", "Very Long (11+)"]
situational_tendencies['distance_category'] = pd.Categorical(
situational_tendencies['distance_category'],
categories=distance_order,
ordered=True
)
situational_tendencies = situational_tendencies.sort_values(['down', 'distance_category'])
print("\nPlay-Calling Tendencies by Down and Distance")
print("=" * 80)
print(situational_tendencies.to_string(index=False))
Interpreting Situational Tendencies
Notice how pass rate increases dramatically on 3rd down, especially in long-yardage situations. This predictability creates opportunities for defenses to adjust. Later in this chapter, we'll explore whether this conventional wisdom is optimal.
The heat map visualization makes these patterns immediately clear. On first down, pass rates stay relatively balanced across all distances (45-55%). On second down, we see the bifurcation between second-and-short (more balanced, 50-60% pass) and second-and-long (heavy pass, 70-80%). Third down shows the most dramatic gradient, with third-and-short maintaining some run-pass balance (40-50% pass) while third-and-long becomes almost exclusively pass-oriented (85-95%).
What's particularly interesting is identifying the "inflection points" where tendencies shift dramatically. Around 7 yards to go, pass rate jumps significantly—this represents the threshold where teams lose confidence in their ability to convert with a run. On third down, this inflection occurs around 3-4 yards; on second down, it's closer to 7-8 yards. These inflection points represent strategic boundaries that coordinators and defenses both recognize and exploit.
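These inflection points can be located programmatically by scanning pass rate as a function of yards to go for the largest jump between adjacent distances. A sketch on synthetic third-down rates (with real data you would feed in the down/distance aggregate computed from play-by-play data, as in the heat-map code below):

```python
import pandas as pd

# Synthetic third-down pass rates by yards to go (illustrative, not real data)
rates = pd.Series(
    [0.42, 0.48, 0.55, 0.74, 0.78, 0.82, 0.86, 0.88],
    index=pd.Index([1, 2, 3, 4, 5, 6, 7, 8], name="ydstogo"),
    name="pass_rate",
)

jumps = rates.diff()         # change in pass rate vs the previous distance
inflection = jumps.idxmax()  # distance where the largest jump occurs
print(f"Largest jump: {jumps.max():.2f} at {inflection} yards to go")
```

In this toy series the biggest jump lands between 3 and 4 yards to go, matching the third-down threshold described above; running the same scan per down (or per team) turns a visual impression from the heat map into a measurable boundary.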
Using Tendency Data for Defensive Game Planning
Defensive coordinators use these tendency patterns to gain pre-snap advantages. On third-and-long, defenses can play obvious passing downs with rush packages, extra defensive backs, and aggressive coverage. The challenge is knowing when offenses will "go against" tendency—the 15% of third-and-long runs designed to exploit defensive over-aggression.
Visualizing Play-Calling Patterns
Heat maps are excellent for visualizing play-calling tendencies across multiple dimensions:
#| label: fig-tendency-heatmap-r
#| fig-cap: "Play-calling heat map by down and distance"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false
# Create detailed down/distance grid
heatmap_data <- pbp_2023 %>%
filter(
!is.na(down),
down <= 3,
ydstogo <= 20,
play_type %in% c("run", "pass")
) %>%
group_by(down, ydstogo) %>%
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
.groups = "drop"
) %>%
filter(plays >= 100) # Minimum sample size
# Create heat map
ggplot(heatmap_data, aes(x = ydstogo, y = factor(down), fill = pass_rate)) +
geom_tile(color = "white", linewidth = 0.5) +
geom_text(aes(label = sprintf("%.0f%%", pass_rate * 100)),
color = "white", size = 3, fontface = "bold") +
scale_fill_gradient2(
low = "#D55E00",
mid = "#F0E442",
high = "#0072B2",
midpoint = 0.5,
limits = c(0, 1),
labels = scales::percent
) +
scale_x_continuous(breaks = seq(0, 20, 2)) +
scale_y_discrete(labels = c("1st", "2nd", "3rd")) +
labs(
title = "NFL Play-Calling Tendencies Heat Map",
subtitle = "Pass rate by down and yards to go (2023 season)",
x = "Yards to Go",
y = "Down",
fill = "Pass Rate",
caption = "Data: nflfastR | Minimum 100 plays per cell"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
plot.subtitle = element_text(size = 11),
legend.position = "right",
panel.grid = element_blank()
)
#| label: fig-tendency-heatmap-py
#| fig-cap: "Play-calling heat map by down and distance - Python"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false
# Create detailed down/distance grid
heatmap_data = (pbp_2023
.query("down.notna() & down <= 3 & ydstogo <= 20 & play_type.isin(['run', 'pass'])")
.groupby(['down', 'ydstogo'])
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean())
)
.reset_index()
.query("plays >= 100") # Minimum sample size
)
# Pivot for heatmap
heatmap_pivot = heatmap_data.pivot(index='down', columns='ydstogo', values='pass_rate')
# Create heat map
plt.figure(figsize=(12, 6))
sns.heatmap(
heatmap_pivot,
cmap='RdYlBu',
center=0.5,
vmin=0,
vmax=1,
annot=True,
fmt='.0%',
cbar_kws={'label': 'Pass Rate'},
linewidths=0.5,
linecolor='white'
)
plt.xlabel('Yards to Go', fontsize=12)
plt.ylabel('Down', fontsize=12)
plt.title('NFL Play-Calling Tendencies Heat Map\nPass rate by down and yards to go (2023 season)',
fontsize=14, fontweight='bold', pad=20)
plt.yticks(ticks=[0.5, 1.5, 2.5], labels=['1st', '2nd', '3rd'], rotation=0)
plt.tight_layout()
plt.show()
Run-Pass Balance Optimization
The Efficiency Paradox
One of the most robust findings in football analytics is that passing is generally more efficient than rushing:
#| label: efficiency-comparison-r
#| message: false
#| warning: false
# Compare efficiency by play type
efficiency_comparison <- pbp_2023 %>%
filter(
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3
) %>%
group_by(play_type) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
median_epa = median(epa),
success_rate = mean(epa > 0),
explosive_rate = mean(epa > 1.5),
.groups = "drop"
)
efficiency_comparison %>%
gt() %>%
cols_label(
play_type = "Play Type",
plays = "Plays",
mean_epa = "Mean EPA",
median_epa = "Median EPA",
success_rate = "Success Rate",
explosive_rate = "Explosive Rate"
) %>%
fmt_number(
columns = c(mean_epa, median_epa, success_rate, explosive_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "Pass vs Run Efficiency",
subtitle = "Overall EPA comparison (2023 season)"
)
#| label: efficiency-comparison-py
#| message: false
#| warning: false
# Compare efficiency by play type
efficiency_comparison = (pbp_2023
.query("epa.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.groupby('play_type')
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
median_epa=('epa', 'median'),
success_rate=('epa', lambda x: (x > 0).mean()),
explosive_rate=('epa', lambda x: (x > 1.5).mean())
)
.reset_index()
)
print("\nPass vs Run Efficiency (2023 season)")
print("=" * 70)
print(efficiency_comparison.to_string(index=False))
Despite passing being more efficient, teams typically run the ball 40-45% of the time. Why? Several factors explain this apparent paradox:
- Clock Management: Running plays keep the clock moving
- Ball Security: Interceptions are costly negative plays
- Defensive Adjustment: Heavy passing invites pass rush and coverage adjustments
- Personnel Fatigue: Pass protection is physically demanding
- Game Theory: Some unpredictability in play-calling is valuable
The Run-Pass Efficiency Gap: Understanding the Paradox
The persistent gap between pass and run efficiency—typically 0.10-0.15 EPA per play—represents one of football's most important strategic puzzles. If passing is more efficient, why don't teams pass more? The answer involves equilibrium: if teams passed on 80% of plays, defenses would adjust their strategies (more defensive backs, more pass rush, less respect for the run), which would reduce passing efficiency. The current run-pass balance represents a rough equilibrium where further shifts toward passing would trigger defensive adjustments that eliminate the efficiency gains.
This efficiency paradox also reveals something important about play-calling evaluation: we cannot simply maximize EPA on every individual play. The run game serves strategic purposes beyond immediate efficiency—it creates play-action opportunities, controls tempo, manages the clock, and keeps defensive fronts honest. A team that abandons the run might see short-term EPA gains but long-term strategic losses as defenses optimize purely for pass defense.
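The equilibrium intuition can be made concrete with a toy two-strategy game: the offense picks run or pass, the defense picks run-stop or pass-stop, and the payoff is offensive EPA. At the mixed-strategy equilibrium, the offense randomizes so that the defense is indifferent between its two responses. All EPA payoffs below are invented for illustration:

```python
import numpy as np

# Offense EPA payoffs (invented numbers):
# rows = offense call (run, pass); cols = defense call (run-stop, pass-stop)
payoffs = np.array([
    [-0.20, 0.10],  # run  vs run-stop, pass-stop
    [ 0.25, 0.05],  # pass vs run-stop, pass-stop
])

# Offense passes with probability q chosen so the defense is indifferent:
#   (1-q)*r1 + q*p1 == (1-q)*r2 + q*p2   ->   solve for q
(r1, r2), (p1, p2) = payoffs
q = (r2 - r1) / ((r2 - r1) + (p1 - p2))
value = (1 - q) * r1 + q * p1  # offense EPA per play at equilibrium

print(f"Equilibrium pass rate: {q:.2f}, EPA per play: {value:.3f}")
```

With these payoffs the equilibrium pass rate is 60%, not 100%, even though passing dominates running against either defensive call in isolation: passing more would let the defense shift toward pass-stop and erase the gains, which is exactly the equilibrium logic described above.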
Optimal Balance by Game Script
The optimal run-pass balance depends on game situation:
#| label: script-analysis-r
#| message: false
#| warning: false
# Analyze efficiency by score differential
script_analysis <- pbp_2023 %>%
filter(
!is.na(epa),
!is.na(score_differential),
play_type %in% c("run", "pass"),
down <= 3
) %>%
mutate(
script = case_when(
score_differential <= -10 ~ "Losing big (10+)",
score_differential <= -4 ~ "Losing (4-9)",
score_differential <= 3 ~ "Close game",
score_differential <= 9 ~ "Winning (4-9)",
TRUE ~ "Winning big (10+)"
),
script = factor(
script,
levels = c("Losing big (10+)", "Losing (4-9)", "Close game",
"Winning (4-9)", "Winning big (10+)")
)
) %>%
group_by(script, play_type) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
.groups = "drop"
) %>%
arrange(script, desc(play_type))
# Display results
script_analysis %>%
gt() %>%
cols_label(
script = "Game Script",
play_type = "Play Type",
plays = "Plays",
mean_epa = "Mean EPA",
success_rate = "Success Rate"
) %>%
fmt_number(
columns = c(mean_epa, success_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "Play-Calling Efficiency by Game Script",
subtitle = "EPA analysis by score differential (2023)"
)
#| label: script-analysis-py
#| message: false
#| warning: false
# Analyze efficiency by score differential
def categorize_script(diff):
if diff <= -10:
return "Losing big (10+)"
elif diff <= -4:
return "Losing (4-9)"
elif diff <= 3:
return "Close game"
elif diff <= 9:
return "Winning (4-9)"
else:
return "Winning big (10+)"
script_analysis = (pbp_2023
.query("epa.notna() & score_differential.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.assign(script=lambda x: x['score_differential'].apply(categorize_script))
.groupby(['script', 'play_type'])
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean())
)
.reset_index()
)
# Sort by script order
script_order = ["Losing big (10+)", "Losing (4-9)", "Close game", "Winning (4-9)", "Winning big (10+)"]
script_analysis['script'] = pd.Categorical(
script_analysis['script'],
categories=script_order,
ordered=True
)
script_analysis = script_analysis.sort_values(['script', 'play_type'], ascending=[True, False])
print("\nPlay-Calling Efficiency by Game Script (2023)")
print("=" * 70)
print(script_analysis.to_string(index=False))
The Script Effect
Notice how pass efficiency remains relatively stable across game scripts, while run efficiency varies. Teams winning big run more often despite lower efficiency, prioritizing clock management over EPA maximization. This represents a strategic choice beyond pure play-level optimization.
Play Action Effectiveness
Play action passes—where the quarterback fakes a handoff before passing—are designed to exploit defensive reactions to run tendencies:
#| label: play-action-analysis-r
#| message: false
#| warning: false
# Analyze play action effectiveness
play_action_analysis <- pbp_2023 %>%
filter(
play_type == "pass",
!is.na(epa),
!is.na(pass_length),
down <= 3
) %>%
mutate(
play_action = if_else(is.na(play_action), 0, play_action)
) %>%
group_by(play_action, pass_length) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
completion_pct = mean(complete_pass, na.rm = TRUE),
yards_per_attempt = mean(yards_gained, na.rm = TRUE),
.groups = "drop"
) %>%
mutate(
play_action = if_else(play_action == 1, "Play Action", "Standard")
)
# Display results
play_action_analysis %>%
gt() %>%
cols_label(
play_action = "Pass Type",
pass_length = "Depth",
plays = "Plays",
mean_epa = "Mean EPA",
success_rate = "Success Rate",
completion_pct = "Comp %",
yards_per_attempt = "YPA"
) %>%
fmt_number(
columns = c(mean_epa, success_rate, completion_pct),
decimals = 3
) %>%
fmt_number(
columns = yards_per_attempt,
decimals = 1
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "Play Action vs Standard Pass Efficiency",
subtitle = "Comparison by pass depth (2023)"
)
#| label: play-action-analysis-py
#| message: false
#| warning: false
# Analyze play action effectiveness
play_action_analysis = (pbp_2023
.query("play_type == 'pass' & epa.notna() & pass_length.notna() & down <= 3")
.assign(play_action=lambda x: x['play_action'].fillna(0))
.groupby(['play_action', 'pass_length'])
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean()),
completion_pct=('complete_pass', 'mean'),
yards_per_attempt=('yards_gained', 'mean')
)
.reset_index()
.assign(play_action=lambda x: x['play_action'].map({1: "Play Action", 0: "Standard"}))
)
print("\nPlay Action vs Standard Pass Efficiency (2023)")
print("=" * 80)
print(play_action_analysis.to_string(index=False))
Visualizing Play Action Impact
#| label: fig-play-action-r
#| fig-cap: "Play action pass effectiveness comparison"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false
# Prepare data for visualization
pa_viz_data <- pbp_2023 %>%
filter(
play_type == "pass",
!is.na(epa),
down <= 3
) %>%
mutate(
play_action = if_else(is.na(play_action), 0, play_action),
play_action_label = if_else(play_action == 1, "Play Action", "Standard Pass")
)
# Create comparison plot
ggplot(pa_viz_data, aes(x = epa, fill = play_action_label)) +
geom_density(alpha = 0.6) +
geom_vline(xintercept = 0, linetype = "dashed", color = "black") +
scale_fill_manual(
values = c("Play Action" = "#00BA38", "Standard Pass" = "#619CFF")
) +
scale_x_continuous(limits = c(-5, 10)) +
labs(
title = "Play Action Pass EPA Distribution",
subtitle = "Comparing play action to standard dropback passes (2023)",
x = "Expected Points Added",
y = "Density",
fill = "Pass Type",
caption = "Data: nflfastR"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
plot.subtitle = element_text(size = 11),
legend.position = "top"
)
#| label: fig-play-action-py
#| fig-cap: "Play action pass effectiveness comparison - Python"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false
# Prepare data for visualization
pa_viz_data = (pbp_2023
.query("play_type == 'pass' & epa.notna() & down <= 3")
.assign(
play_action=lambda x: x['play_action'].fillna(0),
play_action_label=lambda x: x['play_action'].map({1: "Play Action", 0: "Standard Pass"})
)
)
# Create comparison plot
plt.figure(figsize=(10, 6))
for label, color in [('Play Action', '#00BA38'), ('Standard Pass', '#619CFF')]:
data = pa_viz_data[pa_viz_data['play_action_label'] == label]['epa']
data = data[(data >= -5) & (data <= 10)]
plt.hist(data, bins=50, alpha=0.6, label=label, color=color, density=True)
plt.axvline(x=0, color='black', linestyle='--', alpha=0.7)
plt.xlabel('Expected Points Added', fontsize=12)
plt.ylabel('Density', fontsize=12)
plt.title('Play Action Pass EPA Distribution\nComparing play action to standard dropback passes (2023)',
fontsize=14, fontweight='bold')
plt.legend(title='Pass Type', loc='upper right')
plt.xlim(-5, 10)
plt.text(0.98, 0.02, 'Data: nfl_data_py',
transform=plt.gca().transAxes,
ha='right', fontsize=8, style='italic')
plt.tight_layout()
plt.show()
Play Action Effectiveness
Play action passes typically generate 0.15-0.25 higher EPA than standard dropbacks. However, they require credible run threats and appropriate personnel groupings to be effective. Teams that never run the ball find play action less effective.
The distribution chart clearly shows play action's advantage—the entire EPA distribution shifts right. Not only is the mean EPA higher for play action, but the tail of explosive plays (EPA > 2.0) is thicker as well. This occurs because play action manipulates defensive flow and creates coverage conflicts. Linebackers step toward the line of scrimmage to stop the fake run, creating space underneath. Safeties hesitate before rotating into coverage, allowing receivers to get behind them.
Interestingly, recent research suggests that play action effectiveness doesn't actually correlate strongly with run game success. Teams with poor rushing attacks still benefit significantly from play action, as long as they use it consistently enough that defenses must respect the fake. This finding challenges conventional wisdom—you don't need to "establish the run" to make play action work; you just need to threaten it credibly.
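The per-team quantity behind that research finding—play-action EPA uplift, meaning play-action EPA minus standard-dropback EPA—is simple to compute from a play-by-play frame; you would then correlate it with each team's rushing EPA to test the claim. A sketch on a tiny hypothetical frame standing in for pbp_2023 (team labels and EPA values are made up):

```python
import pandas as pd

def play_action_uplift(df: pd.DataFrame) -> pd.Series:
    """Per-team EPA uplift of play-action passes over standard dropbacks."""
    passes = df[df["play_type"] == "pass"]
    by = passes.groupby(["posteam", "play_action"])["epa"].mean().unstack()
    return by[1] - by[0]  # play action mean EPA minus standard mean EPA

# Tiny hypothetical frame with the same columns used earlier in the chapter
toy = pd.DataFrame({
    "posteam":     ["KC", "KC", "KC", "KC", "SF", "SF", "SF", "SF"],
    "play_type":   ["pass"] * 8,
    "play_action": [1, 1, 0, 0, 1, 1, 0, 0],
    "epa":         [0.5, 0.3, 0.1, 0.1, 0.6, 0.0, 0.2, 0.0],
})
print(play_action_uplift(toy))
```

With 32 teams of real data, a scatter of rushing EPA against this uplift (or a simple correlation coefficient) is enough to see whether strong run games actually buy better play action—the research cited above suggests they largely don't.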
Play Action Usage Rates and Optimization
The average NFL team uses play action on 25-30% of pass attempts. Analytics suggests this might be too low—even teams in the 95th percentile of play action usage (35-40% of passes) still see positive EPA gains from play action. The marginal value of play action appears not to diminish until usage rates exceed 40-45%, suggesting most teams could benefit from calling more play action.
Personnel Groupings and Play-Calling
Personnel groupings significantly influence play-calling tendencies and effectiveness:
#| label: personnel-analysis-r
#| message: false
#| warning: false
# Analyze play-calling by personnel grouping
personnel_analysis <- pbp_2023 %>%
filter(
!is.na(personnel_o),
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3
) %>%
# Focus on common personnel groupings
filter(personnel_o %in% c("1 RB, 1 TE, 3 WR", "1 RB, 2 TE, 2 WR",
"2 RB, 1 TE, 2 WR", "1 RB, 0 TE, 4 WR")) %>%
group_by(personnel_o, play_type) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
.groups = "drop"
) %>%
arrange(personnel_o, desc(play_type))
# Calculate pass rate by personnel
personnel_pass_rate <- pbp_2023 %>%
filter(
!is.na(personnel_o),
play_type %in% c("run", "pass"),
down <= 3
) %>%
filter(personnel_o %in% c("1 RB, 1 TE, 3 WR", "1 RB, 2 TE, 2 WR",
"2 RB, 1 TE, 2 WR", "1 RB, 0 TE, 4 WR")) %>%
group_by(personnel_o) %>%
summarise(
total_plays = n(),
pass_rate = mean(play_type == "pass"),
.groups = "drop"
)
# Display results
personnel_analysis %>%
left_join(personnel_pass_rate, by = "personnel_o") %>%
select(personnel_o, play_type, plays, mean_epa, success_rate, pass_rate) %>%
gt() %>%
cols_label(
personnel_o = "Personnel",
play_type = "Play Type",
plays = "Plays",
mean_epa = "Mean EPA",
success_rate = "Success Rate",
pass_rate = "Overall Pass Rate"
) %>%
fmt_number(
columns = c(mean_epa, success_rate, pass_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "Play-Calling by Personnel Grouping",
subtitle = "Common offensive personnel packages (2023)"
)
#| label: personnel-analysis-py
#| message: false
#| warning: false
# Analyze play-calling by personnel grouping
common_personnel = ["1 RB, 1 TE, 3 WR", "1 RB, 2 TE, 2 WR",
"2 RB, 1 TE, 2 WR", "1 RB, 0 TE, 4 WR"]
personnel_analysis = (pbp_2023
.query("personnel_o.notna() & epa.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.query("personnel_o in @common_personnel")
.groupby(['personnel_o', 'play_type'])
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean())
)
.reset_index()
)
# Calculate pass rate by personnel
personnel_pass_rate = (pbp_2023
.query("personnel_o.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.query("personnel_o in @common_personnel")
.groupby('personnel_o')
.agg(
total_plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean())
)
.reset_index()
)
# Merge and display
personnel_combined = personnel_analysis.merge(personnel_pass_rate, on='personnel_o')
print("\nPlay-Calling by Personnel Grouping (2023)")
print("=" * 90)
print(personnel_combined.to_string(index=False))
Personnel Tells
Notice how "11 personnel" (1 RB, 1 TE, 3 WR) shows a high pass rate, while "12 personnel" (1 RB, 2 TE, 2 WR) is more balanced. Defenses use these tendencies to predict play type, which is why some teams deliberately run from spread personnel to maintain unpredictability.
Understanding Personnel Grouping Nomenclature
Personnel groupings in football are described using a two-digit system that indicates the number of running backs and tight ends on the field. The first digit represents running backs, the second represents tight ends, and the remaining skill-position players (five in total, after the five offensive linemen and the quarterback) are wide receivers. Understanding this system is crucial for analyzing play-calling patterns because personnel groupings create strong signals about offensive intent.
Common Personnel Groupings:
- 11 Personnel (1 RB, 1 TE, 3 WR): The most common grouping in modern NFL offenses, accounting for 60-70% of plays for pass-heavy teams. This "spread" grouping prioritizes speed and space, making it easier to pass but still allowing for zone-running schemes. Teams average 65-70% pass rate from this grouping.
- 12 Personnel (1 RB, 2 TE, 2 WR): A more balanced grouping that suggests either run or play-action pass. The second tight end can serve as an extra blocker for runs, a receiver on passes, or a play-action decoy. Teams typically pass 55-60% from this grouping, making it less predictable.
- 21 Personnel (2 RB, 1 TE, 2 WR): Increasingly rare in modern NFL play, this heavier grouping strongly signals run intent. Teams pass only 30-40% from 21 personnel, making it highly predictable. Some teams use it specifically to exploit this expectation with play-action.
- 13 Personnel (1 RB, 3 TE, 1 WR): An extreme "heavy" grouping used primarily in short-yardage and goal-line situations. Pass rate from 13 personnel is typically below 20%, but the passes thrown are often highly effective because defenses load the box expecting run.
- 10 Personnel (1 RB, 0 TE, 4 WR): The ultimate spread grouping, used by aggressive passing offenses in obvious passing situations. Teams pass 85-90% from this grouping. Some innovative offenses run successfully from 10 personnel precisely because defenses expect pass.
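The two-digit code maps mechanically from the roster strings used in the analysis above. A small hypothetical helper (the function name and regex are this book's sketch, not an nflfastR utility) converts `personnel_o`-style strings into the code:

```python
import re

def personnel_code(personnel: str) -> str:
    """Convert a string like '1 RB, 2 TE, 2 WR' to its two-digit code ('12'):
    first digit = running backs, second digit = tight ends."""
    counts = {pos: int(n) for n, pos in re.findall(r"(\d+)\s+(RB|TE|WR)", personnel)}
    return f"{counts.get('RB', 0)}{counts.get('TE', 0)}"

print(personnel_code("1 RB, 1 TE, 3 WR"))  # 11
print(personnel_code("1 RB, 0 TE, 4 WR"))  # 10
```

Mapping raw personnel strings to codes like this makes it easy to group plays by "11 personnel" versus "12 personnel" in downstream tendency analysis.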
The strategic challenge for offensive coordinators is balancing the matchup advantages of specific personnel groupings against the predictability they create. Using 11 personnel on every play maximizes flexibility and creates favorable matchups for the passing game, but it also signals to defenses that a pass is likely coming. Conversely, shifting to 12 or 21 personnel telegraphs run intent, allowing defenses to adjust their fronts and add defenders to the box.
Advanced offenses address this problem in several ways. First, they develop robust running games from spread personnel (11 personnel), maintaining enough run threat to keep defenses honest. Second, they use personnel groupings that maintain optionality—12 personnel, for instance, can effectively run or pass. Third, they strategically "go against" tendencies, running from 10 personnel or passing from 13 personnel, to exploit defensive over-adjustment.
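One way to put a number on the predictability trade-off described above is the Shannon entropy of a grouping's run/pass mix: 1.0 bit means a perfectly balanced, unreadable tendency; 0.0 means a fully telegraphed one. This is an illustrative sketch, not a metric computed from the chapter's dataset:

```python
import math

def play_call_entropy(pass_rate: float) -> float:
    """Shannon entropy (bits) of a binary run/pass mix.
    1.0 = perfectly balanced, 0.0 = completely predictable."""
    if pass_rate in (0.0, 1.0):
        return 0.0
    run_rate = 1.0 - pass_rate
    return -(pass_rate * math.log2(pass_rate) + run_rate * math.log2(run_rate))

print(play_call_entropy(0.5))            # 1.0  (balanced, e.g. 12 personnel)
print(round(play_call_entropy(0.9), 3))  # 0.469 (a strong tell, e.g. 10 personnel)
```

Averaging this entropy across personnel groupings, weighted by snaps, gives a single team-level unpredictability score that complements the deviation-based measure used later in the chapter.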
Formation vs. Personnel
Don't confuse personnel groupings with formations. Personnel refers to which players are on the field (how many RBs, TEs, WRs). Formation refers to where those players line up (shotgun, under center, trips, etc.). A team can run many different formations from the same personnel grouping, providing additional strategic options.
Down and Distance Tendencies
Understanding down-and-distance tendencies is crucial for both play-calling and defensive game planning. Down and distance create the fundamental strategic framework for every offensive play call. Each combination of down and yards-to-go presents different constraints, opportunities, and risk-reward trade-offs.
First down offers maximum optionality—with three downs to gain ten yards, offenses can pursue various strategies. They might establish the run, test the defense with a deep shot, or use play action to create big plays. The key strategic question on first down is efficiency versus explosiveness: do you prioritize staying ahead of the chains with a high-probability short gain, or do you attack downfield to maximize EPA?
Second down's strategic landscape depends entirely on first-down results. After a successful first-down play (gaining 4+ yards), second down remains relatively open, with both run and pass viable. After a failed first-down play (gaining 0-3 yards), second down becomes critical—the offense faces pressure to avoid third-and-long, while the defense knows the offense is somewhat predictable. Second-and-long situations (8+ yards) see teams pass 70-80% of the time as they try to get back on track.
Third down represents the most critical play-calling situation in football. The distance required to convert the first down overwhelmingly determines play selection. On third-and-short (1-3 yards), teams maintain balance, running 40-50% of the time. On third-and-medium (4-6 yards), pass rate climbs to 75-80%. On third-and-long (7+ yards), teams pass 85-95% of the time. This predictability creates a strategic dilemma: passing is the correct decision given the distance required, but the predictability allows defenses to optimize their coverage schemes and pass rush.
The following analysis examines these patterns in detail, measuring not just tendency but also efficiency. We'll identify which situations create the largest gaps between run and pass efficiency, revealing where conventional play-calling wisdom might leave expected points on the table:
#| label: down-distance-detail-r
#| message: false
#| warning: false
# Detailed analysis of key down-distance situations
key_situations <- pbp_2023 %>%
filter(
!is.na(down),
play_type %in% c("run", "pass"),
down <= 3
) %>%
mutate(
situation = case_when(
down == 1 & ydstogo == 10 ~ "1st & 10",
down == 2 & ydstogo >= 8 ~ "2nd & Long (8+)",
down == 2 & ydstogo >= 4 & ydstogo <= 7 ~ "2nd & Medium (4-7)",
down == 2 & ydstogo <= 3 ~ "2nd & Short (1-3)",
down == 3 & ydstogo >= 7 ~ "3rd & Long (7+)",
down == 3 & ydstogo >= 4 & ydstogo <= 6 ~ "3rd & Medium (4-6)",
down == 3 & ydstogo <= 3 ~ "3rd & Short (1-3)",
TRUE ~ "Other"
)
) %>%
filter(situation != "Other") %>%
group_by(situation) %>%
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
run_epa = mean(epa[play_type == "run"], na.rm = TRUE),
pass_epa = mean(epa[play_type == "pass"], na.rm = TRUE),
run_success = mean(epa[play_type == "run"] > 0, na.rm = TRUE),
pass_success = mean(epa[play_type == "pass"] > 0, na.rm = TRUE),
.groups = "drop"
) %>%
mutate(
epa_advantage = pass_epa - run_epa,
success_advantage = pass_success - run_success
)
# Display results
key_situations %>%
gt() %>%
cols_label(
situation = "Situation",
plays = "Plays",
pass_rate = "Pass Rate",
run_epa = "Run EPA",
pass_epa = "Pass EPA",
run_success = "Run Success",
pass_success = "Pass Success",
epa_advantage = "EPA Adv.",
success_advantage = "Success Adv."
) %>%
fmt_number(
columns = c(pass_rate, run_epa, pass_epa, run_success,
pass_success, epa_advantage, success_advantage),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = epa_advantage,
colors = scales::col_numeric(
palette = c("#D55E00", "#FFFFFF", "#0072B2"),
domain = c(-0.5, 0.5)
)
) %>%
tab_header(
title = "Play-Calling Analysis by Key Situations",
subtitle = "Pass vs Run efficiency (2023)"
)
#| label: down-distance-detail-py
#| message: false
#| warning: false
# Detailed analysis of key down-distance situations
def categorize_situation(row):
    down = row['down']
    ydstogo = row['ydstogo']
    if down == 1 and ydstogo == 10:
        return "1st & 10"
    elif down == 2 and ydstogo >= 8:
        return "2nd & Long (8+)"
    elif down == 2 and 4 <= ydstogo <= 7:
        return "2nd & Medium (4-7)"
    elif down == 2 and ydstogo <= 3:
        return "2nd & Short (1-3)"
    elif down == 3 and ydstogo >= 7:
        return "3rd & Long (7+)"
    elif down == 3 and 4 <= ydstogo <= 6:
        return "3rd & Medium (4-6)"
    elif down == 3 and ydstogo <= 3:
        return "3rd & Short (1-3)"
    else:
        return "Other"
key_situations_data = (pbp_2023
.query("down.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.assign(situation=lambda x: x.apply(categorize_situation, axis=1))
.query("situation != 'Other'")
)
# Calculate metrics by situation and play type
run_stats = (key_situations_data
.query("play_type == 'run'")
.groupby('situation')
.agg(
run_plays=('epa', 'count'),
run_epa=('epa', 'mean'),
run_success=('epa', lambda x: (x > 0).mean())
)
)
pass_stats = (key_situations_data
.query("play_type == 'pass'")
.groupby('situation')
.agg(
pass_plays=('epa', 'count'),
pass_epa=('epa', 'mean'),
pass_success=('epa', lambda x: (x > 0).mean())
)
)
overall_stats = (key_situations_data
.groupby('situation')
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean())
)
)
# Combine all statistics
key_situations = (overall_stats
.join(run_stats)
.join(pass_stats)
.assign(
epa_advantage=lambda x: x['pass_epa'] - x['run_epa'],
success_advantage=lambda x: x['pass_success'] - x['run_success']
)
.reset_index()
)
print("\nPlay-Calling Analysis by Key Situations (2023)")
print("=" * 100)
print(key_situations.to_string(index=False))
Formation Analysis and Play-Calling
The Strategic Role of Formations
While personnel groupings determine which players are on the field, formations determine where those players line up. Formation analysis provides another layer of insight into play-calling tendencies and effectiveness. Certain formations strongly correlate with specific play types, creating additional signals that defenses can read and exploit.
Modern offenses use formations strategically in several ways. First, formations can create structural advantages—bunching receivers on one side creates natural pick routes, while spreading them out creates isolation matchups. Second, formations can create information advantages or disadvantages—some formations maintain run-pass balance while others telegraph intent. Third, formations interact with personnel groupings to create the overall offensive picture that defenses must process pre-snap.
The shotgun formation has become dominant in modern NFL play-calling. In shotgun, the quarterback stands 5-7 yards behind the center at the snap, providing better vision of the defense and more time to throw. Though historically considered a passing formation, the shotgun has become viable for running as well thanks to modern zone-running schemes. Today, teams line up in shotgun on 70-80% of all plays, fundamentally changing the strategic landscape of play-calling.
Under center formations, where the quarterback takes the snap directly from the center, now signal specific intentions. Teams go under center primarily for power running plays, particularly in short-yardage situations, or for play-action passes designed to exploit aggressive run defenses. The act of going under center itself creates a tendency—teams run approximately 55-60% of the time when under center, compared to passing 65-70% from shotgun.
Specific shotgun variations create additional strategic dimensions. Empty formations (all five eligible receivers split out, no one in the backfield) almost always indicate pass—95%+ pass rate. Trips formations (three receivers to one side) are used for both run and pass but create specific defensive challenges in coverage. Bunch formations (receivers aligned close together) are primarily used for passing, creating natural pick routes and confusion in man coverage.
Motion and Formation Flexibility
Pre-snap motion adds another strategic layer to formation analysis. Teams can show one formation, then shift or motion into another, disguising their intentions and forcing defenses to adjust. Advanced analytics increasingly track motion rates and their correlation with play-calling success.
Measuring Formation Tendencies
Let's analyze play-calling tendencies and efficiency across different formation types. We'll focus on the key distinction between shotgun and under center, as well as examining specific formations like empty and trips:
#| label: formation-analysis-r
#| message: false
#| warning: false
# Analyze play-calling tendencies by formation type
# This reveals how formations signal offensive intent and effectiveness
formation_analysis <- pbp_2023 %>%
filter(
!is.na(shotgun), # Must have formation data
!is.na(epa), # Must have EPA calculated
play_type %in% c("run", "pass"), # Only offensive plays
down <= 3 # Exclude 4th down (special situations)
) %>%
# Create formation categories based on shotgun status
mutate(
formation_type = if_else(shotgun == 1, "Shotgun", "Under Center")
) %>%
# Calculate statistics by formation and play type
group_by(formation_type, play_type) %>%
summarise(
plays = n(), # Total plays from this formation-play combo
mean_epa = mean(epa), # Average EPA (efficiency)
success_rate = mean(epa > 0), # Success rate (% positive EPA)
explosive_rate = mean(epa > 1.5), # Big play rate (explosive plays)
yards_per_play = mean(yards_gained, na.rm = TRUE), # Average yards gained
.groups = "drop"
)
# Calculate overall pass rate by formation
formation_rates <- pbp_2023 %>%
filter(
!is.na(shotgun),
play_type %in% c("run", "pass"),
down <= 3
) %>%
mutate(formation_type = if_else(shotgun == 1, "Shotgun", "Under Center")) %>%
group_by(formation_type) %>%
summarise(
total_plays = n(),
pass_rate = mean(play_type == "pass"),
.groups = "drop"
)
# Combine and display
formation_analysis %>%
left_join(formation_rates, by = "formation_type") %>%
select(formation_type, play_type, plays, pass_rate,
mean_epa, success_rate, explosive_rate) %>%
gt() %>%
cols_label(
formation_type = "Formation",
play_type = "Play Type",
plays = "Plays",
pass_rate = "Overall Pass Rate",
mean_epa = "EPA/Play",
success_rate = "Success %",
explosive_rate = "Explosive %"
) %>%
fmt_number(
columns = c(pass_rate, mean_epa, success_rate, explosive_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = mean_epa,
colors = scales::col_numeric(
palette = c("#D55E00", "#FFFFFF", "#0072B2"),
domain = c(-0.2, 0.3)
)
) %>%
tab_header(
title = "Formation Analysis: Shotgun vs Under Center",
subtitle = "Play-calling tendencies and efficiency by formation (2023)"
)
#| label: formation-analysis-py
#| message: false
#| warning: false
# Analyze play-calling tendencies by formation type
formation_data = (pbp_2023
.query("shotgun.notna() & epa.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.assign(formation_type=lambda x: x['shotgun'].map({1: "Shotgun", 0: "Under Center"}))
)
# Calculate statistics by formation and play type
formation_analysis = (formation_data
.groupby(['formation_type', 'play_type'])
.agg(
plays=('epa', 'count'), # Total plays
mean_epa=('epa', 'mean'), # Average EPA
success_rate=('epa', lambda x: (x > 0).mean()), # Success rate
explosive_rate=('epa', lambda x: (x > 1.5).mean()), # Big play rate
yards_per_play=('yards_gained', 'mean') # Average yards
)
.reset_index()
)
# Calculate overall pass rate by formation
formation_rates = (formation_data
.groupby('formation_type')
.agg(
total_plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean())
)
.reset_index()
)
# Combine results
formation_combined = formation_analysis.merge(formation_rates, on='formation_type')
print("\nFormation Analysis: Shotgun vs Under Center (2023)")
print("=" * 100)
print(formation_combined[['formation_type', 'play_type', 'plays', 'pass_rate',
'mean_epa', 'success_rate', 'explosive_rate']].to_string(index=False))
The results reveal several important patterns. First, while shotgun formations do show higher pass rates, modern offenses run effectively from shotgun—zone-running schemes from shotgun often generate similar EPA to running from under center. Second, the predictability of under-center formations (higher run rate) may cost efficiency—defenses can load the box expecting run. Third, passing from under center (often play action) can be highly efficient precisely because it exploits defensive run expectation.
These findings have strategic implications for offensive coordinators. Using primarily shotgun formations maintains flexibility and prevents defensive box-loading. However, going under center strategically—particularly for play action on first down or second-and-short—can exploit defensive adjustments and create explosive passing opportunities. The key is using under-center formations frequently enough that play-action remains credible, without becoming so predictable that defenses ignore the run threat.
Pistol Formation: The Hybrid Approach
Some offenses use the pistol formation, where the quarterback stands 3-4 yards behind center (between shotgun and under center). This "hybrid" formation attempts to maintain run-pass balance while giving the quarterback a better view than under center. Analysis of pistol formation tendencies shows it achieves the goal—roughly 50-50 run-pass split—but is used on only 5-10% of plays league-wide.
Time and Score Situation Analysis
Game Script and Play-Calling Adaptation
Beyond down, distance, and formation, game situation dramatically influences play-calling. The score differential and time remaining create powerful constraints on strategy. Teams protecting leads prioritize ball security and clock management, running more frequently even at the cost of EPA efficiency. Teams trailing must pass to stop the clock and move quickly down the field, even in situations where running might be more efficient.
Game script analysis examines how play-calling changes based on the interaction between score and time. Early in games, even large leads or deficits have limited impact on play-calling—teams stick closer to their base tendencies. As games progress into the fourth quarter, score differential increasingly dominates play-calling decisions. A team leading by 10 points with 5 minutes remaining will run on 70-80% of plays, regardless of down and distance. A team trailing by 10 will pass 80-90% of the time.
This creates an interesting analytical challenge: when we measure play-calling efficiency, we must account for game script. A team that runs frequently in the fourth quarter while protecting a lead will show lower rushing EPA than we might expect, not because their running game is ineffective, but because they're running in situations where defenses know run is coming and can optimize their fronts accordingly. Conversely, a team that trails frequently and passes in obvious situations will show deflated passing EPA.
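A common way to handle this confound is to restrict tendency and efficiency calculations to "neutral" game scripts before comparing teams. The filter below is one conventional definition (win probability between 20% and 80%, first three quarters); `wp` and `qtr` are standard nflfastR columns, while the thresholds and demo data are this sketch's assumptions:

```python
import pandas as pd

def neutral_script(plays: pd.DataFrame) -> pd.DataFrame:
    """Keep plays from roughly neutral game scripts: win probability
    between 20% and 80%, and not in the fourth quarter."""
    return plays.query("wp >= 0.2 and wp <= 0.8 and qtr <= 3")

# Illustrative rows (made-up values)
demo = pd.DataFrame({
    "wp":  [0.55, 0.95, 0.30, 0.10],
    "qtr": [2, 4, 3, 4],
    "play_type": ["pass", "run", "pass", "pass"],
})
neutral = neutral_script(demo)
print(len(demo), len(neutral))  # 4 2
```

Computing pass rates or rushing EPA on the filtered frame yields tendency estimates that are less distorted by blowouts and end-of-game clock management.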
Understanding game script also helps us evaluate coordinator decision-making. Did a coordinator make good process decisions that were undermined by poor execution or bad luck? Or did they make suboptimal strategic choices? Separating process quality from outcome requires controlling for the constraints that game script imposed on decision-making.
Let's examine how play-calling changes across different game scripts, focusing specifically on how time and score interact to shape decisions:
#| label: time-score-analysis-r
#| message: false
#| warning: false
# Analyze play-calling by time remaining and score differential
# This reveals how game context shapes strategic decisions
game_script_detail <- pbp_2023 %>%
filter(
!is.na(score_differential), # Must have score context
!is.na(half_seconds_remaining), # Must have time context
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3,
qtr %in% c(2, 4) # Focus on end of halves (most interesting)
) %>%
# Categorize game situations
mutate(
# Score categories
score_situation = case_when(
score_differential >= 9 ~ "Leading by 9+",
score_differential >= 4 ~ "Leading by 4-8",
abs(score_differential) <= 3 ~ "Tied/Close (±3)",
score_differential <= -4 & score_differential >= -8 ~ "Trailing by 4-8",
score_differential <= -9 ~ "Trailing by 9+",
TRUE ~ "Other"
),
# Time categories (for end of half/game)
time_situation = case_when(
half_seconds_remaining > 300 ~ "5+ minutes",
half_seconds_remaining > 120 ~ "2-5 minutes",
TRUE ~ "Under 2 minutes"
),
# Combine for detailed game script
game_script = paste0(score_situation, " (", time_situation, ")")
) %>%
# Calculate tendencies by game script
group_by(score_situation, time_situation) %>%
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
.groups = "drop"
) %>%
# Order factors for logical display
mutate(
score_situation = factor(score_situation,
levels = c("Leading by 9+", "Leading by 4-8", "Tied/Close (±3)",
"Trailing by 4-8", "Trailing by 9+")
),
time_situation = factor(time_situation,
levels = c("5+ minutes", "2-5 minutes", "Under 2 minutes")
)
) %>%
filter(plays >= 50) %>% # Minimum sample size
arrange(time_situation, score_situation)
# Display results
game_script_detail %>%
gt() %>%
cols_label(
score_situation = "Score Situation",
time_situation = "Time Remaining",
plays = "Plays",
pass_rate = "Pass Rate",
mean_epa = "EPA/Play",
success_rate = "Success Rate"
) %>%
fmt_number(
columns = c(pass_rate, mean_epa, success_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = pass_rate,
colors = scales::col_numeric(
palette = c("#D55E00", "#F0E442", "#0072B2"),
domain = c(0.3, 0.9)
)
) %>%
tab_header(
title = "Game Script Impact on Play-Calling",
subtitle = "Pass rate by score and time (2nd & 4th quarters, 2023)"
) %>%
tab_footnote(
footnote = "Minimum 50 plays per situation",
locations = cells_column_labels(columns = plays)
)
#| label: time-score-analysis-py
#| message: false
#| warning: false
# Analyze play-calling by time remaining and score differential
def categorize_score_situation(diff):
    if diff >= 9:
        return "Leading by 9+"
    elif diff >= 4:
        return "Leading by 4-8"
    elif abs(diff) <= 3:
        return "Tied/Close (±3)"
    elif -8 <= diff <= -4:
        return "Trailing by 4-8"
    elif diff <= -9:
        return "Trailing by 9+"
    return "Other"

def categorize_time_situation(seconds):
    if seconds > 300:
        return "5+ minutes"
    elif seconds > 120:
        return "2-5 minutes"
    else:
        return "Under 2 minutes"
game_script_detail = (pbp_2023
.query("score_differential.notna() & half_seconds_remaining.notna() & "
"epa.notna() & play_type.isin(['run', 'pass']) & down <= 3 & qtr.isin([2, 4])")
.assign(
score_situation=lambda x: x['score_differential'].apply(categorize_score_situation),
time_situation=lambda x: x['half_seconds_remaining'].apply(categorize_time_situation)
)
.groupby(['score_situation', 'time_situation'])
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean()),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean())
)
.reset_index()
.query("plays >= 50") # Minimum sample size
)
# Order categories logically
score_order = ["Leading by 9+", "Leading by 4-8", "Tied/Close (±3)",
"Trailing by 4-8", "Trailing by 9+"]
time_order = ["5+ minutes", "2-5 minutes", "Under 2 minutes"]
game_script_detail['score_situation'] = pd.Categorical(
game_script_detail['score_situation'],
categories=score_order,
ordered=True
)
game_script_detail['time_situation'] = pd.Categorical(
game_script_detail['time_situation'],
categories=time_order,
ordered=True
)
game_script_detail = game_script_detail.sort_values(['time_situation', 'score_situation'])
print("\nGame Script Impact on Play-Calling (2nd & 4th quarters, 2023)")
print("=" * 90)
print(game_script_detail.to_string(index=False))
The results dramatically illustrate how game script overrides base tendencies. With 5+ minutes remaining, score differential has moderate impact—teams leading by 9+ pass about 55% of the time, while teams trailing by 9+ pass about 75%. As time decreases, these tendencies intensify. Under 2 minutes, teams with big leads run 70-80% of the time (prioritizing clock), while teams with big deficits pass 85-95% (prioritizing points and stopping clock).
This analysis reveals both strategic constraints and potential opportunities. The constraints are real—a team trailing by 14 with 3 minutes left must pass to have any chance of winning, even though the defense knows it's coming. But opportunities exist at the margins. Teams might over-adjust to game script, becoming too conservative with leads or too aggressive when trailing. A team leading by 7 with 8 minutes remaining might run too much, allowing the defense to crowd the box and forcing three-and-outs that give the ball back to opponents. Finding the optimal balance between situational necessity and maintaining unpredictability represents an ongoing analytical challenge.
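Over-adjustment of this kind is usually studied with pass rate over expected (PROE): a team's actual pass rate minus what a situation model predicts. nflfastR ships a model-based `pass_oe` column; the sketch below substitutes a crude league-average-by-bucket expectation and made-up data, purely to show the shape of the calculation:

```python
import pandas as pd

def pass_rate_over_expected(plays: pd.DataFrame) -> pd.Series:
    """Per-team pass rate minus the league-average pass rate for the
    same situation bucket (a stand-in for a proper expected-pass model)."""
    expected = plays.groupby("situation")["is_pass"].transform("mean")
    over_expected = plays["is_pass"].astype(float) - expected
    return over_expected.groupby(plays["team"]).mean()

# Illustrative plays (made-up)
demo = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "situation": ["3rd & Long", "1st & 10", "3rd & Long", "1st & 10"],
    "is_pass": [True, True, True, False],
})
proe = pass_rate_over_expected(demo)
print(proe["A"], proe["B"])  # 0.25 -0.25
```

A positive PROE says a team passes more than its situations dictate; a strongly negative PROE in lead-protection situations is one signature of the over-conservative behavior described above.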
The "Prevent Offense" Problem
Just as prevent defenses often prevent winning, prevent offenses (running clock at all costs) can be counterproductive. Teams that become too conservative when protecting leads in the 4th quarter often go three-and-out repeatedly, giving opponents more possessions. Analytics increasingly suggests staying aggressive longer, only shifting to heavy clock control in the final 2-3 minutes.
Game Theory and Unpredictability
The Value of Mixed Strategies
Game theory provides a framework for optimal play-calling. The key insight is that perfect predictability allows defenses to optimize their strategy:
Example: If a team always passes on 3rd & long, the defense can commit fully to pass defense, reducing the effectiveness of those passes.
The solution is a mixed strategy—calling plays probabilistically rather than deterministically.
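For a 2x2 zero-sum version of this game (the offense calls run or pass, the defense aligns to stop one), the optimal mix follows from making the defense indifferent between its alignments. The EPA payoffs below are hypothetical numbers chosen only to illustrate the algebra:

```python
def optimal_pass_rate(epa: list[list[float]]) -> float:
    """Mixed-strategy pass probability for a 2x2 zero-sum run/pass game.

    epa[i][j] = offense's expected EPA; rows = offense call (run, pass),
    columns = defense alignment (run-stop, pass-stop). Assumes neither
    side has a dominant strategy, so an interior mix exists."""
    (a, b), (c, d) = epa
    # Choose pass probability p so both defensive alignments yield the
    # same offense EPA:  (1-p)*a + p*c  ==  (1-p)*b + p*d
    return (a - b) / (a - b - c + d)

epa = [[-0.2, 0.1],    # run  vs run-stop, run  vs pass-stop
       [0.4, -0.1]]    # pass vs run-stop, pass vs pass-stop
print(round(optimal_pass_rate(epa), 3))  # 0.375
```

At this mix the offense earns the same expected EPA (about 0.025 per play with these payoffs) no matter how the defense aligns, which is exactly what makes the strategy unexploitable.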
#| label: predictability-analysis-r
#| message: false
#| warning: false
# Calculate predictability and its cost for each team
team_predictability <- pbp_2023 %>%
filter(
!is.na(posteam),
play_type %in% c("run", "pass"),
down <= 3
) %>%
# Calculate tendencies by team and situation
group_by(posteam, down, ydstogo) %>%
mutate(
n_situation = n()
) %>%
filter(n_situation >= 10) %>% # Minimum sample
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
epa = mean(epa, na.rm = TRUE),
.groups = "drop"
) %>%
# Calculate deviation from 50-50
mutate(
deviation = abs(pass_rate - 0.5)
) %>%
# Aggregate to team level
group_by(posteam) %>%
summarise(
situations = n(),
avg_deviation = mean(deviation),
max_deviation = max(deviation),
avg_epa = mean(epa),
.groups = "drop"
) %>%
arrange(avg_deviation)
# Display most and least predictable teams
bind_rows(
team_predictability %>% head(5) %>% mutate(category = "Least Predictable"),
team_predictability %>% tail(5) %>% mutate(category = "Most Predictable")
) %>%
select(category, posteam, avg_deviation, max_deviation, avg_epa) %>%
gt() %>%
cols_label(
category = "Category",
posteam = "Team",
avg_deviation = "Avg Deviation",
max_deviation = "Max Deviation",
avg_epa = "Avg EPA"
) %>%
fmt_number(
columns = c(avg_deviation, max_deviation, avg_epa),
decimals = 3
) %>%
tab_header(
title = "Team Predictability Analysis",
subtitle = "Deviation from 50-50 run-pass balance by situation (2023)"
)
#| label: predictability-analysis-py
#| message: false
#| warning: false
# Calculate predictability and its cost for each team
situation_stats = (pbp_2023
.query("posteam.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.groupby(['posteam', 'down', 'ydstogo'])
.filter(lambda x: len(x) >= 10) # Minimum sample
.groupby(['posteam', 'down', 'ydstogo'])
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean()),
epa=('epa', 'mean')
)
.assign(deviation=lambda x: abs(x['pass_rate'] - 0.5))
.reset_index()
)
team_predictability = (situation_stats
.groupby('posteam')
.agg(
situations=('plays', 'count'),
avg_deviation=('deviation', 'mean'),
max_deviation=('deviation', 'max'),
avg_epa=('epa', 'mean')
)
.sort_values('avg_deviation')
.reset_index()
)
# Display most and least predictable teams
print("\nLeast Predictable Teams:")
print("=" * 70)
print(team_predictability.head(5).to_string(index=False))
print("\n\nMost Predictable Teams:")
print("=" * 70)
print(team_predictability.tail(5).to_string(index=False))
Measuring the Cost of Predictability
We can estimate the cost of predictability by comparing EPA in situations where teams are more vs. less predictable:
#| label: predictability-cost-r
#| message: false
#| warning: false
# Calculate EPA by predictability level
predictability_cost <- pbp_2023 %>%
filter(
!is.na(posteam),
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3
) %>%
# Join with team tendency data
group_by(posteam, down, ydstogo) %>%
mutate(
situation_plays = n(),
situation_pass_rate = mean(play_type == "pass")
) %>%
ungroup() %>%
filter(situation_plays >= 10) %>%
mutate(
predictability = case_when(
situation_pass_rate >= 0.75 | situation_pass_rate <= 0.25 ~ "High",
situation_pass_rate >= 0.60 | situation_pass_rate <= 0.40 ~ "Medium",
TRUE ~ "Low"
),
predictability = factor(predictability, levels = c("Low", "Medium", "High"))
) %>%
# Check if play matches tendency
mutate(
matches_tendency = case_when(
situation_pass_rate >= 0.6 & play_type == "pass" ~ TRUE,
situation_pass_rate <= 0.4 & play_type == "run" ~ TRUE,
TRUE ~ FALSE
)
) %>%
group_by(predictability, matches_tendency) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
.groups = "drop"
)
predictability_cost %>%
mutate(
matches_tendency = if_else(matches_tendency, "Matches Tendency", "Against Tendency")
) %>%
gt() %>%
cols_label(
predictability = "Predictability",
matches_tendency = "Play Type",
plays = "Plays",
mean_epa = "Mean EPA",
success_rate = "Success Rate"
) %>%
fmt_number(
columns = c(mean_epa, success_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "The Cost of Predictability",
subtitle = "EPA when matching vs defying tendencies (2023)"
)
#| label: predictability-cost-py
#| message: false
#| warning: false
# Calculate situation-level tendencies
situation_tendencies = (pbp_2023
.query("posteam.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.groupby(['posteam', 'down', 'ydstogo'])
.agg(
situation_plays=('play_type', 'count'),
situation_pass_rate=('play_type', lambda x: (x == 'pass').mean())
)
.reset_index()
)
# Join with play-level data
predictability_data = (pbp_2023
.query("posteam.notna() & epa.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.merge(situation_tendencies, on=['posteam', 'down', 'ydstogo'])
.query("situation_plays >= 10")
)
# Categorize predictability
def categorize_predictability(rate):
if rate >= 0.75 or rate <= 0.25:
return "High"
elif rate >= 0.60 or rate <= 0.40:
return "Medium"
else:
return "Low"
def matches_tendency(row):
if row['situation_pass_rate'] >= 0.6 and row['play_type'] == 'pass':
return True
elif row['situation_pass_rate'] <= 0.4 and row['play_type'] == 'run':
return True
else:
return False
predictability_cost = (predictability_data
.assign(
predictability=lambda x: x['situation_pass_rate'].apply(categorize_predictability),
matches_tendency=lambda x: x.apply(matches_tendency, axis=1)
)
.groupby(['predictability', 'matches_tendency'])
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean())
)
.reset_index()
.assign(matches_tendency=lambda x: x['matches_tendency'].map({True: "Matches Tendency", False: "Against Tendency"}))
)
# Order predictability levels
predictability_cost['predictability'] = pd.Categorical(
predictability_cost['predictability'],
categories=['Low', 'Medium', 'High'],
ordered=True
)
predictability_cost = predictability_cost.sort_values(['predictability', 'matches_tendency'])
print("\nThe Cost of Predictability (2023)")
print("=" * 70)
print(predictability_cost.to_string(index=False))
Key Insight: Diminishing Returns
While predictability has costs, perfect unpredictability (50-50 in every situation) isn't optimal either. The best play-callers lean toward their more efficient play types while maintaining enough unpredictability to prevent defensive exploitation.

These results quantify what coaches intuitively understand: predictability hurts efficiency. When teams call plays that match their strong tendencies (passing when they usually pass, running when they usually run), they face prepared defenses and see reduced EPA. When they occasionally go against tendency, they catch defenses in suboptimal alignments and see improved EPA—even for the "less efficient" play type.
The magnitude of the predictability cost varies by situation. In highly predictable situations (>75% toward one play type), the cost can reach 0.1-0.15 EPA per play. Over a full game, excessive predictability in just a few key situations could cost 1-2 expected points—meaningful in a league where average margins are often smaller than a field goal.
Optimal play-calling therefore involves a delicate balance: calling your more efficient play type more often (perhaps 60-70% in neutral situations rather than 50%), but maintaining enough unpredictability (30-40% of the time going against the tendency) to prevent defensive over-adjustment. Game theory helps us formalize this balance, suggesting that the optimal mixed strategy involves biasing toward the more efficient option but never becoming completely predictable.
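This balance can be made concrete with a toy 2x2 zero-sum game between offense and defense, solved analytically for its mixed-strategy equilibrium. The EPA payoffs below are hypothetical, chosen only to illustrate the "biased randomization" result, not estimated from league data:

```python
import numpy as np

def solve_2x2_zero_sum(payoff):
    """Mixed-strategy equilibrium of a 2x2 zero-sum game.

    payoff[i, j] is the offense's expected EPA when the offense plays
    row i (0 = pass, 1 = run) and the defense plays column j
    (0 = defend pass, 1 = defend run). Assumes an interior equilibrium
    (no dominant pure strategy).
    """
    (a, b), (c, d) = payoff
    denom = a - b - c + d
    p_pass = (d - c) / denom          # offense mixes so the defense is indifferent
    q_pass_d = (d - b) / denom        # defense mixes so the offense is indifferent
    value = (a * d - b * c) / denom   # game value: EPA per play at equilibrium
    return p_pass, q_pass_d, value

# Hypothetical payoffs: each play type fares worse when the defense keys on it
epa = np.array([[0.10, 0.30],    # pass vs (pass-D, run-D)
                [0.20, -0.10]])  # run  vs (pass-D, run-D)
p, q, v = solve_2x2_zero_sum(epa)
print(f"pass {p:.0%}, defense keys pass {q:.0%}, value {v:+.3f} EPA/play")
```

With these payoffs the offense passes 60% of the time—biased toward its more efficient option, but never fully predictable—and either defensive choice yields the same expected EPA, which is exactly the equilibrium property.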
Nash Equilibrium in Play-Calling
Game theory's Nash equilibrium concept applies directly to play-calling. At equilibrium, neither the offense nor the defense can improve by unilaterally changing strategy. If offenses become too predictable, defenses adjust to exploit them, creating an incentive for offenses to change. If offenses become perfectly random, they call too many inefficient plays, again creating an incentive to adjust. The equilibrium involves biased randomization—favoring efficient play types while maintaining unpredictability.

Script vs Reactive Play-Calling
Many coordinators begin games with scripted play sequences—predetermined plays designed to test defenses and establish rhythm. The practice of scripting the opening drive (or sometimes the first 10-15 plays) has roots in coaching legend Bill Walsh's West Coast offense. Walsh believed that preparing specific play sequences in advance provided several advantages: it forced the coaching staff to create a coherent game plan, it removed the pressure of in-game decision-making on the opening drive, and it systematically tested defensive coverages and fronts to gather information for later in the game.
Modern analytics allows us to test whether scripted plays perform differently than reactive calls. If scripting works as theorized, we should see either higher efficiency on scripted plays (because they're better prepared) or at minimum no efficiency loss despite reduced ability to react to game situations. If scripting is suboptimal, we'd see lower efficiency as coordinators stick to predetermined calls that may not match the game flow.
The strategic value of scripting extends beyond just those specific plays. By planning the opening sequence, offensive coordinators can install plays that set up later concepts—for instance, showing a formation and running from it early to establish play action from that same formation later. They can also use the scripted sequence to gather defensive information: how does this defense play Cover 2? Do they play man or zone on third-and-medium? This intelligence gathering helps inform reactive play-calling later in the game.
However, scripting also has costs. The primary cost is reduced flexibility—if the game situation changes (unexpected score, injury, defensive adjustment), the coordinator may feel pressure to stick to the script anyway. Additionally, defenses know about scripting tendencies too; if a coordinator always scripts aggressively on the opening drive, defenses can prepare specifically for that.
Let's examine whether scripted plays (proxied by looking at opening drive and early possession plays) show different efficiency patterns than reactive play-calling later in games:
#| label: script-detection-r
#| message: false
#| warning: false
# Analyze first drive vs rest of game
script_analysis <- pbp_2023 %>%
filter(
!is.na(posteam),
!is.na(epa),
play_type %in% c("run", "pass"),
!is.na(drive),
qtr <= 2 # First half only
) %>%
group_by(game_id, posteam) %>%
mutate(
team_drive_num = dense_rank(drive)
) %>%
ungroup() %>%
mutate(
phase = case_when(
team_drive_num == 1 ~ "Opening Drive",
team_drive_num <= 3 ~ "Drives 2-3",
TRUE ~ "Later Drives"
),
phase = factor(phase, levels = c("Opening Drive", "Drives 2-3", "Later Drives"))
) %>%
group_by(phase, play_type) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
explosive_rate = mean(epa > 1.5),
.groups = "drop"
)
script_analysis %>%
gt() %>%
cols_label(
phase = "Game Phase",
play_type = "Play Type",
plays = "Plays",
mean_epa = "Mean EPA",
success_rate = "Success Rate",
explosive_rate = "Explosive Rate"
) %>%
fmt_number(
columns = c(mean_epa, success_rate, explosive_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "Scripted vs Reactive Play-Calling",
subtitle = "Efficiency by game phase (2023, first half)"
)
#| label: script-detection-py
#| message: false
#| warning: false
# Analyze first drive vs rest of game
script_data = (pbp_2023
.query("posteam.notna() & epa.notna() & play_type.isin(['run', 'pass']) & drive.notna() & qtr <= 2")
.sort_values(['game_id', 'posteam', 'drive'])
.assign(
team_drive_num=lambda x: x.groupby(['game_id', 'posteam'])['drive'].transform(
lambda y: pd.factorize(y)[0] + 1
)
)
)
def categorize_phase(drive_num):
if drive_num == 1:
return "Opening Drive"
elif drive_num <= 3:
return "Drives 2-3"
else:
return "Later Drives"
script_analysis = (script_data
.assign(phase=lambda x: x['team_drive_num'].apply(categorize_phase))
.groupby(['phase', 'play_type'])
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean()),
explosive_rate=('epa', lambda x: (x > 1.5).mean())
)
.reset_index()
)
# Order phases
phase_order = ["Opening Drive", "Drives 2-3", "Later Drives"]
script_analysis['phase'] = pd.Categorical(
script_analysis['phase'],
categories=phase_order,
ordered=True
)
script_analysis = script_analysis.sort_values(['phase', 'play_type'], ascending=[True, False])
print("\nScripted vs Reactive Play-Calling (2023, first half)")
print("=" * 80)
print(script_analysis.to_string(index=False))
Coordinator Evaluation
Evaluating play-caller performance requires accounting for personnel, opponent strength, and game situation:
#| label: coordinator-eval-r
#| message: false
#| warning: false
# Note: true coordinator evaluation requires mapping each team-season to its
# actual play-caller; for this example we analyze by team as a proxy
coordinator_performance <- pbp_2023 %>%
filter(
!is.na(posteam),
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3
) %>%
group_by(posteam) %>%
summarise(
plays = n(),
# Overall metrics
mean_epa = mean(epa),
success_rate = mean(epa > 0),
# Run metrics
run_rate = mean(play_type == "run"),
run_epa = mean(epa[play_type == "run"]),
run_success = mean(epa[play_type == "run"] > 0),
# Pass metrics
pass_epa = mean(epa[play_type == "pass"]),
pass_success = mean(epa[play_type == "pass"] > 0),
# Situational metrics
first_down_epa = mean(epa[down == 1]),
third_down_conv = mean(epa[down == 3] > 0),
# Play action usage (requires a play_action indicator in the data, e.g. joined charting data)
play_action_rate = mean(play_action == 1, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(mean_epa))
# Display top 10 offenses
coordinator_performance %>%
head(10) %>%
select(posteam, plays, mean_epa, success_rate, run_rate,
first_down_epa, third_down_conv) %>%
gt() %>%
cols_label(
posteam = "Team",
plays = "Plays",
mean_epa = "EPA/Play",
success_rate = "Success Rate",
run_rate = "Run Rate",
first_down_epa = "1st Down EPA",
third_down_conv = "3rd Down Conv %"
) %>%
fmt_number(
columns = c(mean_epa, success_rate, run_rate, first_down_epa, third_down_conv),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
tab_header(
title = "Top Offensive Coordinator Performance",
subtitle = "EPA-based rankings (2023 season)"
)
#| label: coordinator-eval-py
#| message: false
#| warning: false
# Evaluate coordinator performance by team
base_stats = (pbp_2023
.query("posteam.notna() & epa.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.groupby('posteam')
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean()),
run_rate=('play_type', lambda x: (x == 'run').mean())
)
)
# Run-specific stats
run_stats = (pbp_2023
.query("posteam.notna() & epa.notna() & play_type == 'run' & down <= 3")
.groupby('posteam')
.agg(
run_epa=('epa', 'mean'),
run_success=('epa', lambda x: (x > 0).mean())
)
)
# Pass-specific stats
pass_stats = (pbp_2023
.query("posteam.notna() & epa.notna() & play_type == 'pass' & down <= 3")
.groupby('posteam')
.agg(
pass_epa=('epa', 'mean'),
pass_success=('epa', lambda x: (x > 0).mean())
)
)
# First down stats
first_down_stats = (pbp_2023
.query("posteam.notna() & epa.notna() & down == 1 & play_type.isin(['run', 'pass'])")
.groupby('posteam')
.agg(first_down_epa=('epa', 'mean'))
)
# Third down conversion
third_down_stats = (pbp_2023
.query("posteam.notna() & epa.notna() & down == 3 & play_type.isin(['run', 'pass'])")
.groupby('posteam')
.agg(third_down_conv=('epa', lambda x: (x > 0).mean()))
)
# Combine all statistics
coordinator_performance = (base_stats
.join(run_stats)
.join(pass_stats)
.join(first_down_stats)
.join(third_down_stats)
.sort_values('mean_epa', ascending=False)
.reset_index()
)
print("\nTop Offensive Coordinator Performance (2023)")
print("=" * 90)
print(coordinator_performance.head(10)[['posteam', 'plays', 'mean_epa', 'success_rate',
'run_rate', 'first_down_epa', 'third_down_conv']].to_string(index=False))
Advanced Coordinator Metrics
#| label: fig-coordinator-scatter-r
#| fig-cap: "Coordinator efficiency vs predictability trade-off"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false
# Calculate efficiency and predictability for scatter plot
coord_scatter_data <- pbp_2023 %>%
filter(
!is.na(posteam),
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3
) %>%
group_by(posteam, down, ydstogo) %>%
filter(n() >= 10) %>%
summarise(
situation_pass_rate = mean(play_type == "pass"),
situation_epa = mean(epa),
.groups = "drop"
) %>%
mutate(deviation = abs(situation_pass_rate - 0.5)) %>%
group_by(posteam) %>%
summarise(
avg_epa = mean(situation_epa),
predictability = mean(deviation),
.groups = "drop"
)
# Create scatter plot
ggplot(coord_scatter_data, aes(x = predictability, y = avg_epa)) +
geom_point(size = 3, alpha = 0.6, color = "#0072B2") +
geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
geom_smooth(method = "lm", se = TRUE, color = "#D55E00", fill = "#D55E00", alpha = 0.2) +
nflplotR::geom_nfl_logos(aes(team_abbr = posteam), width = 0.04, alpha = 0.8) +
labs(
title = "Offensive Coordinator Efficiency vs Predictability",
subtitle = "Does unpredictability correlate with success? (2023 season)",
x = "Predictability (Average deviation from 50-50)",
y = "Average EPA per Play",
caption = "Data: nflfastR | Each point represents a team's offensive coordinator"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold", size = 14),
plot.subtitle = element_text(size = 11)
)
#| label: fig-coordinator-scatter-py
#| fig-cap: "Coordinator efficiency vs predictability trade-off - Python"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false
# Calculate efficiency and predictability for scatter plot
situation_data = (pbp_2023
.query("posteam.notna() & epa.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.groupby(['posteam', 'down', 'ydstogo'])
.filter(lambda x: len(x) >= 10)
.groupby(['posteam', 'down', 'ydstogo'])
.agg(
situation_pass_rate=('play_type', lambda x: (x == 'pass').mean()),
situation_epa=('epa', 'mean')
)
.assign(deviation=lambda x: abs(x['situation_pass_rate'] - 0.5))
.reset_index()
)
coord_scatter_data = (situation_data
.groupby('posteam')
.agg(
avg_epa=('situation_epa', 'mean'),
predictability=('deviation', 'mean')
)
.reset_index()
)
# Create scatter plot
plt.figure(figsize=(10, 8))
plt.scatter(coord_scatter_data['predictability'], coord_scatter_data['avg_epa'],
s=100, alpha=0.6, color='#0072B2')
# Add trend line
z = np.polyfit(coord_scatter_data['predictability'], coord_scatter_data['avg_epa'], 1)
p = np.poly1d(z)
x_trend = np.linspace(coord_scatter_data['predictability'].min(),
coord_scatter_data['predictability'].max(), 100)
plt.plot(x_trend, p(x_trend), color='#D55E00', linewidth=2, alpha=0.7, label='Trend')
plt.axhline(y=0, color='black', linestyle='--', alpha=0.5)
plt.xlabel('Predictability (Average deviation from 50-50)', fontsize=12)
plt.ylabel('Average EPA per Play', fontsize=12)
plt.title('Offensive Coordinator Efficiency vs Predictability\nDoes unpredictability correlate with success? (2023 season)',
fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Optimal Play-Call Recommendations
Using all the analysis above, we can build a recommendation system for play-calling:
#| label: optimal-recommendations-r
#| message: false
#| warning: false
# Build recommendation system based on EPA analysis
recommendation_system <- pbp_2023 %>%
filter(
!is.na(epa),
play_type %in% c("run", "pass"),
down <= 3,
!is.na(ydstogo),
!is.na(yardline_100)
) %>%
mutate(
field_zone = case_when(
yardline_100 >= 80 ~ "Own Territory",
yardline_100 >= 50 ~ "Midfield",
yardline_100 >= 20 ~ "Opponent Territory",
TRUE ~ "Red Zone"
),
down_distance = case_when(
down == 1 ~ "1st Down",
down == 2 & ydstogo <= 3 ~ "2nd & Short",
down == 2 & ydstogo >= 7 ~ "2nd & Long",
down == 2 ~ "2nd & Medium",
down == 3 & ydstogo <= 3 ~ "3rd & Short",
down == 3 & ydstogo >= 7 ~ "3rd & Long",
TRUE ~ "3rd & Medium"
)
) %>%
group_by(field_zone, down_distance, play_type) %>%
summarise(
plays = n(),
mean_epa = mean(epa),
success_rate = mean(epa > 0),
.groups = "drop"
) %>%
filter(plays >= 50) %>%
group_by(field_zone, down_distance) %>%
mutate(
best_play = play_type[which.max(mean_epa)],
epa_advantage = mean_epa - mean(mean_epa)
) %>%
ungroup() %>%
filter(play_type == best_play) %>%
select(field_zone, down_distance, best_play, mean_epa, success_rate, plays)
recommendation_system %>%
gt() %>%
cols_label(
field_zone = "Field Zone",
down_distance = "Situation",
best_play = "Recommended",
mean_epa = "EPA",
success_rate = "Success Rate",
plays = "Sample Size"
) %>%
fmt_number(
columns = c(mean_epa, success_rate),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = best_play,
colors = scales::col_factor(
palette = c("pass" = "#00BFC4", "run" = "#F8766D"),
domain = c("pass", "run")
)
) %>%
tab_header(
title = "Optimal Play-Call Recommendations",
subtitle = "Highest EPA play type by situation (2023)"
)
#| label: optimal-recommendations-py
#| message: false
#| warning: false
# Build recommendation system based on EPA analysis
def categorize_field_zone(yards):
if yards >= 80:
return "Own Territory"
elif yards >= 50:
return "Midfield"
elif yards >= 20:
return "Opponent Territory"
else:
return "Red Zone"
def categorize_down_distance(row):
down = row['down']
ydstogo = row['ydstogo']
if down == 1:
return "1st Down"
elif down == 2 and ydstogo <= 3:
return "2nd & Short"
elif down == 2 and ydstogo >= 7:
return "2nd & Long"
elif down == 2:
return "2nd & Medium"
elif down == 3 and ydstogo <= 3:
return "3rd & Short"
elif down == 3 and ydstogo >= 7:
return "3rd & Long"
else:
return "3rd & Medium"
recommendation_data = (pbp_2023
.query("epa.notna() & play_type.isin(['run', 'pass']) & down <= 3 & ydstogo.notna() & yardline_100.notna()")
.assign(
field_zone=lambda x: x['yardline_100'].apply(categorize_field_zone),
down_distance=lambda x: x.apply(categorize_down_distance, axis=1)
)
.groupby(['field_zone', 'down_distance', 'play_type'])
.agg(
plays=('epa', 'count'),
mean_epa=('epa', 'mean'),
success_rate=('epa', lambda x: (x > 0).mean())
)
.reset_index()
.query("plays >= 50")
)
# Find best play for each situation
best_plays = (recommendation_data
.sort_values('mean_epa', ascending=False)
.groupby(['field_zone', 'down_distance'])
.first()
.reset_index()
[['field_zone', 'down_distance', 'play_type', 'mean_epa', 'success_rate', 'plays']]
.rename(columns={'play_type': 'best_play'})
.sort_values(['field_zone', 'down_distance'])
)
print("\nOptimal Play-Call Recommendations (2023)")
print("=" * 90)
print(best_plays.to_string(index=False))
Context Matters
These recommendations represent league-wide averages. Individual teams should adjust based on their personnel strengths, opponent weaknesses, and game situation. A team with an elite running back might deviate from these recommendations profitably.

This recommendation system provides a data-driven baseline for play-calling decisions. However, it represents historical averages rather than team-specific optimal strategies. A team with Derrick Henry at running back should weight run plays more heavily than suggested by league-wide EPA averages. A team with an elite quarterback and weak offensive line might pass even more than recommended. The art of play-calling involves understanding these baseline recommendations, then adjusting for your specific personnel, opponent, and strategic situation.
Furthermore, these recommendations assume you want to maximize EPA on each individual play. But as we've discussed, play-calling involves strategic considerations beyond individual play efficiency: maintaining unpredictability, setting up later plays, controlling tempo, managing risk. An optimal play-calling system would layer these strategic considerations on top of the baseline efficiency recommendations.
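One simple way to move from league-wide baselines toward a team-specific view is to blend league EPA with the team's own EPA by play type. The sketch below uses a tiny synthetic play-by-play frame (team abbreviations and EPA values are illustrative only); with real data you would use `pbp_2023` filtered to the relevant situations, and let the blending weight grow with the team's sample size:

```python
import pandas as pd

# Synthetic play-by-play rows (hypothetical numbers, for illustration only)
pbp = pd.DataFrame({
    "posteam":   ["BUF"] * 4 + ["TEN"] * 4,
    "play_type": ["pass", "pass", "run", "run"] * 2,
    "epa":       [0.25, 0.15, -0.05, 0.05,    # BUF: efficient passing
                  -0.10, 0.00, 0.20, 0.10],   # TEN: efficient running
})

league = pbp.groupby("play_type")["epa"].mean()            # league baseline
team = pbp.groupby(["posteam", "play_type"])["epa"].mean()

def recommend(team_abbr, weight=0.5):
    """Blend league and team EPA by play type; recommend the higher one.

    `weight` is how much to trust the team-specific estimate; with real
    data it should depend on sample size (a shrinkage estimator).
    """
    blended = weight * team[team_abbr] + (1 - weight) * league
    return blended.idxmax()

print(recommend("BUF"))  # → pass
print(recommend("TEN"))  # → run
```

The same blending idea extends to opponent adjustment: compute the defense's EPA allowed by play type and fold it into the blend before taking the maximum.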
Building Team-Specific Recommendation Systems
To create team-specific play-call recommendations, calculate EPA separately for your team's plays rather than league-wide averages. This accounts for your personnel strengths. Then adjust for opponent-specific factors: how does this specific defense perform against run vs pass in different situations? This creates a dynamic recommendation system that updates for each opponent.

Advanced Predictability Metrics
Beyond Simple Pass Rate: Measuring True Unpredictability
While pass rate deviation from 50-50 provides a useful first approximation of predictability, more sophisticated metrics can better capture the multi-dimensional nature of play-calling tendencies. True unpredictability involves not just run-pass balance but also play design variety, formation diversity, personnel usage, and the ability to disguise intentions.
One advanced metric is entropy, borrowed from information theory. Entropy measures the uncertainty in a probability distribution. For play-calling, higher entropy means greater unpredictability. If a team calls passes 50% of the time and runs 50% of the time, they have maximum entropy (1.0 in binary). If they pass 90% of the time, entropy drops to about 0.47. If they pass 100% of the time, entropy is zero—perfect predictability.
Extending entropy beyond binary run-pass decisions provides even richer insights. We can calculate entropy across play types (run left, run right, short pass, medium pass, deep pass), across formations, across personnel groupings, or across specific play concepts. A truly unpredictable offense maintains high entropy across multiple dimensions simultaneously.
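Both the binary and multi-class calculations reduce to one small function. The five-category play mix below is illustrative, not league data:

```python
import numpy as np

def shannon_entropy(probs, normalize=False):
    """Shannon entropy in bits; optionally normalized to [0, 1]."""
    p = np.asarray(probs, dtype=float)
    nonzero = p[p > 0]                 # treat 0 * log2(0) as 0
    h = -np.sum(nonzero * np.log2(nonzero))
    if normalize:
        h /= np.log2(len(p))           # divide by the maximum possible entropy
    return h

# Binary run-pass entropy: a 90% pass rate is far less uncertain than 50-50
print(round(shannon_entropy([0.9, 0.1]), 3))   # 0.469

# Five play categories (illustrative mix: run L/R, short/medium/deep pass)
mix = [0.30, 0.25, 0.20, 0.15, 0.10]
print(round(shannon_entropy(mix, normalize=True), 3))
```

Normalizing by `log2(k)` puts distributions with different numbers of categories on a common 0-to-1 scale, which makes entropy comparable across dimensions (play types vs formations vs personnel groupings).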
Another sophisticated metric is sequence analysis—examining patterns in consecutive play calls. Do teams show tendencies to run after passing, or pass after running? Do they avoid calling the same play type multiple times in a row? Sequential patterns create additional predictability that simple frequency analysis misses. For instance, a team might have a balanced 50-50 run-pass split but always alternate run-pass-run-pass, making them perfectly predictable once you know the first call.
#| label: entropy-analysis-r
#| message: false
#| warning: false
# Calculate play-calling entropy by team
# Higher entropy indicates more unpredictable play-calling
library(DescTools) # For entropy calculation
# Calculate entropy across multiple dimensions
entropy_analysis <- pbp_2023 %>%
filter(
!is.na(posteam),
play_type %in% c("run", "pass"),
down <= 3
) %>%
# Calculate by team and situation
group_by(posteam, down, ydstogo) %>%
filter(n() >= 20) %>% # Minimum sample per situation
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
# Calculate binary entropy (run vs pass)
entropy = -pass_rate * log2(pass_rate) - (1 - pass_rate) * log2(1 - pass_rate),
.groups = "drop"
) %>%
# Aggregate to team level
group_by(posteam) %>%
summarise(
situations_analyzed = n(),
avg_entropy = mean(entropy, na.rm = TRUE),
min_entropy = min(entropy, na.rm = TRUE),
max_entropy = max(entropy, na.rm = TRUE),
# Standard deviation of entropy across situations
entropy_consistency = sd(entropy, na.rm = TRUE),
.groups = "drop"
) %>%
arrange(desc(avg_entropy))
# Display top and bottom teams
bind_rows(
entropy_analysis %>% head(6) %>% mutate(group = "Most Unpredictable"),
entropy_analysis %>% tail(6) %>% mutate(group = "Most Predictable")
) %>%
gt() %>%
cols_label(
group = "Category",
posteam = "Team",
situations_analyzed = "Situations",
avg_entropy = "Avg Entropy",
min_entropy = "Min Entropy",
max_entropy = "Max Entropy",
entropy_consistency = "Std Dev"
) %>%
fmt_number(
columns = c(avg_entropy, min_entropy, max_entropy, entropy_consistency),
decimals = 3
) %>%
fmt_number(
columns = situations_analyzed,
decimals = 0
) %>%
data_color(
columns = avg_entropy,
colors = scales::col_numeric(
palette = c("#D55E00", "#F0E442", "#00BA38"),
domain = c(0.7, 1.0)
)
) %>%
tab_header(
title = "Play-Calling Entropy Analysis",
subtitle = "Measuring unpredictability across situations (2023)"
) %>%
tab_footnote(
footnote = "Entropy ranges from 0 (perfectly predictable) to 1 (maximum uncertainty)",
locations = cells_column_labels(columns = avg_entropy)
)
#| label: entropy-analysis-py
#| message: false
#| warning: false
import numpy as np
# Calculate binary entropy
def binary_entropy(p):
"""Calculate binary entropy for probability p"""
if p == 0 or p == 1:
return 0
return -p * np.log2(p) - (1 - p) * np.log2(1 - p)
# Calculate entropy by situation
situation_entropy = (pbp_2023
.query("posteam.notna() & play_type.isin(['run', 'pass']) & down <= 3")
.groupby(['posteam', 'down', 'ydstogo'])
.filter(lambda x: len(x) >= 20) # Minimum sample
.groupby(['posteam', 'down', 'ydstogo'])
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean())
)
.assign(entropy=lambda x: x['pass_rate'].apply(binary_entropy))
.reset_index()
)
# Aggregate to team level
entropy_analysis = (situation_entropy
.groupby('posteam')
.agg(
situations_analyzed=('plays', 'count'),
avg_entropy=('entropy', 'mean'),
min_entropy=('entropy', 'min'),
max_entropy=('entropy', 'max'),
entropy_consistency=('entropy', 'std')
)
.sort_values('avg_entropy', ascending=False)
.reset_index()
)
print("\nMost Unpredictable Teams (Highest Entropy):")
print("=" * 80)
print(entropy_analysis.head(6).to_string(index=False))
print("\n\nMost Predictable Teams (Lowest Entropy):")
print("=" * 80)
print(entropy_analysis.tail(6).to_string(index=False))
The entropy analysis reveals that the best coordinators maintain high entropy across diverse situations. They don't just achieve 50-50 balance overall; they maintain 45-55% balance in first-and-10, second-and-short, second-and-long, and other key situations. This consistent unpredictability is harder for defenses to exploit than overall balance that masks wild situational swings (e.g., 25% pass on first down but 75% pass on second-and-long).
Entropy also helps us identify specific situations where teams become too predictable. A team might have high average entropy (0.92) but minimum entropy of 0.45 in certain situations—revealing specific down-distance combinations where they telegraph their intentions. Defensive coordinators can identify these low-entropy situations and optimize their game plans accordingly.
Entropy vs. Efficiency: The Trade-Off
Maximum entropy (perfect unpredictability) isn't the goal. As we've discussed, optimal play-calling means biasing toward more efficient options while maintaining enough unpredictability to prevent defensive exploitation. The "right" entropy level depends on the efficiency gap between run and pass in each situation. Larger efficiency gaps justify lower entropy (tilting more predictably toward the efficient option); smaller gaps suggest maintaining higher entropy.

Sequential Pattern Analysis
Beyond measuring what teams call, analyzing the sequence of calls reveals additional patterns. Do teams tend to run after incomplete passes to "reset"? Do they alternate run and pass to maintain balance? Do they call the same play type multiple consecutive times, or avoid repetition?
Sequential patterns matter because defenses learn and adjust. If a team always runs on second-and-short after an incomplete pass on first down, defenses will recognize this pattern and load the box in that specific sequence. Analyzing these sequential dependencies helps identify exploitable patterns that simple frequency analysis would miss.
We can measure sequential tendencies using conditional probabilities: What's the probability of a pass on second down, given that first down was a pass? Given it was a run? If these probabilities are identical, there's no sequential dependency. If they differ substantially, it reveals a pattern that defenses can exploit.
#| label: sequential-analysis-r
#| message: false
#| warning: false
# Analyze sequential play-calling patterns
# Do teams show tendencies based on previous play?
sequential_analysis <- pbp_2023 %>%
filter(
!is.na(posteam),
play_type %in% c("run", "pass"),
down <= 2 # Focus on 1st and 2nd down for sequential analysis
) %>%
# Sort by game and play order
arrange(game_id, posteam, play_id) %>%
# Get previous play type for each team's drives
group_by(game_id, posteam, drive) %>%
mutate(
prev_play_type = lag(play_type),
prev_success = lag(epa > 0)
) %>%
ungroup() %>%
filter(!is.na(prev_play_type)) %>%
# Analyze current play type based on previous play
group_by(prev_play_type, prev_success, down) %>%
summarise(
plays = n(),
pass_rate = mean(play_type == "pass"),
mean_epa = mean(epa, na.rm = TRUE),
.groups = "drop"
) %>%
mutate(
prev_result = if_else(prev_success, "Success", "Failure"),
down_label = paste0("Down ", down)
) %>%
select(prev_play_type, prev_result, down_label, plays, pass_rate, mean_epa)
# Display results
sequential_analysis %>%
gt() %>%
cols_label(
prev_play_type = "Previous Play",
prev_result = "Previous Result",
down_label = "Current Down",
plays = "Plays",
pass_rate = "Pass Rate",
mean_epa = "EPA/Play"
) %>%
fmt_number(
columns = c(pass_rate, mean_epa),
decimals = 3
) %>%
fmt_number(
columns = plays,
decimals = 0,
use_seps = TRUE
) %>%
data_color(
columns = pass_rate,
colors = scales::col_numeric(
palette = c("#F8766D", "#FFFFFF", "#00BFC4"),
domain = c(0.4, 0.7)
)
) %>%
tab_header(
title = "Sequential Play-Calling Patterns",
subtitle = "How previous play affects current play selection (2023)"
)
#| label: sequential-analysis-py
#| message: false
#| warning: false
# Analyze sequential play-calling patterns
sequential_data = (pbp_2023
.query("posteam.notna() & play_type.isin(['run', 'pass']) & down <= 2")
.sort_values(['game_id', 'posteam', 'play_id'])
.assign(
# Get previous play within same drive
prev_play_type=lambda x: x.groupby(['game_id', 'posteam', 'drive'])['play_type'].shift(1),
prev_success=lambda x: x.groupby(['game_id', 'posteam', 'drive'])['epa'].shift(1) > 0
)
.dropna(subset=['prev_play_type'])
)
sequential_analysis = (sequential_data
.groupby(['prev_play_type', 'prev_success', 'down'])
.agg(
plays=('play_type', 'count'),
pass_rate=('play_type', lambda x: (x == 'pass').mean()),
mean_epa=('epa', 'mean')
)
.reset_index()
.assign(
prev_result=lambda x: x['prev_success'].map({True: "Success", False: "Failure"}),
down_label=lambda x: "Down " + x['down'].astype(str)
)
[['prev_play_type', 'prev_result', 'down_label', 'plays', 'pass_rate', 'mean_epa']]
)
print("\nSequential Play-Calling Patterns (2023)")
print("=" * 90)
print(sequential_analysis.to_string(index=False))
The sequential analysis reveals interesting patterns. Teams show clear tendencies to alternate play types—after running on first down, they're more likely to pass on second down (especially after successful runs). After passing on first down, they're more likely to run on second down (particularly after incomplete passes, as a "safe" play to avoid second-and-long).
These patterns make some strategic sense: varying play types prevents defenses from keying on one aspect of the offense. However, they also create exploitable predictability. A defense that recognizes a team's tendency to run on second-and-short after an incomplete pass can prepare accordingly, loading the box and daring the offense to pass again.
The most sophisticated offenses show less sequential dependency—their second-down play calling doesn't heavily depend on first-down results. They maintain flexibility and prevent defenses from using previous plays to predict current calls. This represents another dimension of unpredictability beyond simple run-pass balance.
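The sequential dependency described above can be quantified directly. A minimal sketch, using a tiny made-up sample rather than real play-by-play data: compare each pass rate conditional on the previous play type against the unconditional baseline, and summarize the average deviation. A score near zero means the current call does not depend on the previous one.

```python
import pandas as pd

# Hypothetical sample of consecutive plays: previous play type and current call
plays = pd.DataFrame({
    "prev_play_type": ["run", "run", "pass", "pass", "run", "pass", "run", "pass"],
    "play_type":      ["pass", "pass", "run", "run", "pass", "run", "run", "pass"],
})

# Unconditional pass rate (the baseline an unpredictable offense would match)
overall_pass_rate = (plays["play_type"] == "pass").mean()

# Pass rate conditional on what the previous play was
conditional = plays.groupby("prev_play_type")["play_type"].apply(
    lambda s: (s == "pass").mean()
)

# Sequential dependency: mean absolute deviation of conditional rates
# from the baseline; zero would indicate no sequential tendency at all
dependency = (conditional - overall_pass_rate).abs().mean()
print(conditional)
print(f"Sequential dependency score: {dependency:.3f}")
```

On this toy sample the offense passes 75% of the time after a run but only 25% after a pass, against a 50% baseline, giving a dependency score of 0.25. Applied to real data, the same calculation per team (and per down) would identify the offenses whose second-down calls are least predictable from first-down results.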
Exploiting Sequential Patterns in Game Planning
Defensive coordinators can exploit sequential patterns by tracking opponent tendencies. If a team runs 65% of the time on second down after incomplete passes, the defense can prepare run-heavy personnel packages and fronts for those specific sequences. This is one reason why studying opponent tendencies involves not just overall frequencies but conditional frequencies based on previous plays, game situations, and other contextual factors.
Summary
Play-calling analysis combines statistical analysis with game theory to understand and optimize offensive strategy. Key takeaways:
- Tendencies are measurable: We can quantify play-calling patterns across any dimension
- Passing is generally more efficient: But optimal strategy isn't simply "pass always"
- Predictability has costs: But perfect unpredictability isn't optimal either
- Context matters: Down, distance, field position, and game script all influence optimal calls
- Play action works: But requires credible run threats to maximize effectiveness
- Personnel groupings signal intent: Creating exploitable tendencies
- Game theory provides framework: For balancing efficiency with unpredictability
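The game-theoretic framework in the last point can be illustrated with a toy mixed-strategy calculation. Assuming hypothetical EPA payoffs (the numbers below are invented for illustration, not league estimates), the equilibrium pass rate is the one that leaves the defense indifferent between loading against the run and defending the pass:

```python
# Hypothetical EPA payoffs from the offense's perspective:
# epa[offense_call][defense_scheme]
epa = {
    "pass": {"run_d": 0.3, "pass_d": 0.0},
    "run":  {"run_d": -0.1, "pass_d": 0.1},
}

# The equilibrium pass rate p makes the defense indifferent:
#   p*epa["pass"]["run_d"] + (1-p)*epa["run"]["run_d"]
#     == p*epa["pass"]["pass_d"] + (1-p)*epa["run"]["pass_d"]
# Solving that linear equation for p:
num = epa["run"]["pass_d"] - epa["run"]["run_d"]
den = ((epa["pass"]["run_d"] - epa["run"]["run_d"])
       - (epa["pass"]["pass_d"] - epa["run"]["pass_d"]))
p_star = num / den
print(f"Equilibrium pass rate: {p_star:.2f}")
```

With these payoffs the offense should pass 40% of the time, even though passing has the higher EPA against either scheme in isolation; passing more often would let the defense profitably shift toward pass coverage. Estimating the payoff matrix from situational EPA data is the hard part in practice.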
Modern play-callers increasingly use analytics to:
- Identify their own blind spots
- Exploit opponent tendencies
- Optimize situational decision-making
- Quantify the value of unpredictability
- Make data-driven adjustments
Exercises
Conceptual Questions
- Predictability Paradox: Explain why a team might rationally maintain predictable tendencies in certain situations (e.g., always passing on 3rd & 15) despite the theoretical cost of predictability.
- Personnel Balance: Why might a team deliberately run the ball from "11 personnel" (3 WR sets) even though passing is more efficient from that formation?
- Script vs Reactive: What are the advantages and disadvantages of heavily scripting the first 15 plays of a game?
Coding Exercises
Exercise 1: Team Tendency Report
Create a comprehensive tendency report for a team of your choice that includes: a) overall run-pass balance, b) tendencies by down and distance, c) tendencies by personnel grouping, d) tendency changes by game script, and e) a predictability score. Compare their tendencies to league average and identify potential exploitable patterns.
Exercise 2: Play Action Deep Dive
Analyze play action effectiveness in detail: a) calculate play action usage rate by team, b) compare EPA of play action vs standard dropbacks, c) examine whether play action effectiveness correlates with run game success, d) identify which teams use play action most/least effectively, and e) visualize your findings. **Hint**: Consider filtering for different pass depths and down-distance situations.
Exercise 3: Optimal Balance Calculator
Build a function that recommends optimal run-pass balance for a given situation: a) input: down, distance, field position, score differential; b) calculate historical EPA for run vs pass in that situation; c) account for predictability costs; d) output: a recommended pass rate with confidence interval. Test your function on various scenarios and compare to actual NFL decision-making.
Exercise 4: Coordinator Comparison
Compare two offensive coordinators of your choice: a) overall EPA and success rate, b) situational tendencies (1st down, 3rd down, red zone), c) play action usage, d) personnel grouping preferences, e) predictability metrics, and f) adaptation to game script. Create visualizations comparing their approaches and effectiveness.
Exercise 5: Predictability Cost Analysis
Investigate the relationship between predictability and performance: a) calculate a predictability score for each team, b) measure EPA in highly predictable vs unpredictable situations, c) test whether more predictable teams perform worse, d) consider confounding factors (talent, opponent strength), and e) visualize the relationship and draw conclusions. **Advanced**: Build a regression model controlling for team quality and opponent strength.
Further Reading
- Yurko, R., Ventura, S., & Horowitz, M. (2019). "nflWAR: a reproducible method for offensive player evaluation in football." Journal of Quantitative Analysis in Sports, 15(3), 163-183.
- Burke, B. (2014). "The Play-Calling Problem." Advanced Football Analytics.
- Lopez, M. (2019). "Bigger data, better questions, and a return to fourth down behavior." ESPN Analytics.
- Alamar, B. (2013). Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers. Columbia University Press.
- Thorn, J., Palmer, P., & Reuther, D. (1988). The Hidden Game of Football. Grand Central Publishing.
- Moskowitz, T., & Wertheim, L. J. (2011). Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won. Crown Archetype.
References
:::