Chapter 28: Salary Cap Analytics | Football Analytics Textbook

Learning Objectives

By the end of this chapter, you will be able to:

Understand salary cap mechanics and structure
Analyze contract value and optimization
Evaluate positional spending strategies
Study market efficiency and value contracts
Build roster cost optimization models

Introduction

The NFL salary cap represents one of the most complex and consequential constraints in professional sports. Unlike the NBA, which pairs a soft cap with a luxury tax, or MLB, which has a luxury tax but no cap, the NFL enforces a hard salary cap that requires teams to make difficult choices about resource allocation.

In 2023, the salary cap was set at $224.8 million per team, and it rose to $255.4 million in 2024. While these seem like large amounts, 53-man rosters, practice squads, and guaranteed contracts mean that teams must carefully manage every dollar to remain competitive.

The salary cap creates a unique analytical challenge: teams must simultaneously optimize for on-field performance (winning games), financial efficiency (maximizing value per dollar), and long-term sustainability (maintaining flexibility for future seasons). This multi-objective optimization problem requires sophisticated data science approaches that combine financial analysis, statistical modeling, and strategic planning.

Consider the strategic dilemma facing teams with elite quarterbacks on expensive contracts. Patrick Mahomes, for example, signed a 10-year, $450 million extension with the Kansas City Chiefs in 2020. While Mahomes is undoubtedly worth the investment, his cap hit limits spending elsewhere on the roster. The Chiefs must find value contracts at other positions—often through the draft or by identifying undervalued veterans—to maintain a competitive roster. This is the essence of salary cap analytics: making optimal resource allocation decisions under severe budget constraints.

What is Salary Cap Analytics?

Salary cap analytics applies financial analysis, optimization theory, and statistical modeling to evaluate player contracts, identify market inefficiencies, and construct optimal rosters under budget constraints. It's where sports analytics meets corporate finance, combining elements of:

- **Portfolio optimization**: Balancing risk and return across player investments
- **Market efficiency analysis**: Identifying mispriced assets (undervalued players)
- **Forecasting**: Predicting future cap situations and player values
- **Constraint optimization**: Maximizing performance within hard budget limits

This chapter explores how teams can use data science to navigate these challenges effectively. We'll cover both the practical mechanics of the salary cap and the analytical techniques used to gain competitive advantages in roster construction.

Salary Cap Structure and Mechanics

Understanding salary cap analytics requires first understanding how the cap itself works. The NFL salary cap is far more complex than simply "teams can't spend more than $X million." The rules around contract structure, bonus proration, dead money, and roster accounting create opportunities for creative cap management—and analytical advantages for teams that understand these nuances.

The Basics

The NFL salary cap is determined by a formula based on league revenues, which are split between owners and players according to the Collective Bargaining Agreement (CBA). The current CBA, negotiated in 2020, specifies that players receive approximately 48% of total league revenue.

$$ \text{Salary Cap} = \frac{\text{Total League Revenue} \times \text{Player Share}}{32 \text{ teams}} $$

This formula creates a direct link between league financial health and team spending capacity. When league revenues grow—through new broadcast deals, expanded playoffs, or increased sponsorship—the salary cap rises, giving teams more flexibility. Conversely, events like the COVID-19 pandemic that reduce revenues can constrain team spending.
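To make the revenue link concrete, the simplified formula above can be evaluated for a few revenue levels. This is a minimal sketch: the function name `salary_cap_per_team` and the revenue figures are hypothetical, and the actual CBA calculation includes adjustments (such as player benefits) that this version ignores.

```python
def salary_cap_per_team(total_revenue, player_share=0.48, n_teams=32):
    """Per-team cap implied by the simplified formula in the text."""
    return total_revenue * player_share / n_teams


# Hypothetical league revenue figures, chosen only to show sensitivity
for revenue in (14e9, 15e9, 16e9):
    cap = salary_cap_per_team(revenue)
    print(f"Revenue ${revenue / 1e9:.0f}B -> cap ${cap / 1e6:.1f}M per team")
```

Under these assumptions, each additional $1 billion in league revenue adds $15 million of cap space per team.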

During the offseason, the cap is applied under the "Top 51" rule: only a team's 51 highest cap hits count against the cap until the final roster is set. This allows teams to sign draft picks and maintain larger rosters during training camp without immediately counting every contract against the cap.
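The effect of the Top-51 rule can be sketched in a few lines. The roster below is hypothetical, and the helper `offseason_cap_charge` is an illustrative name; the sketch counts only the 51 largest cap hits and ignores other charges, such as dead money, that a full cap ledger would include.

```python
def offseason_cap_charge(cap_hits, top_n=51):
    """Sum only the top_n largest cap hits (simplified Top-51 accounting)."""
    return sum(sorted(cap_hits, reverse=True)[:top_n])


# Hypothetical 90-man offseason roster: 10 veterans and 80 low-cost deals
roster = [15_000_000] * 10 + [1_000_000] * 80

top51_charge = offseason_cap_charge(roster)
full_charge = sum(roster)

print(f"Top-51 charge:       ${top51_charge / 1e6:.0f}M")
print(f"All 90 contracts:    ${full_charge / 1e6:.0f}M")
print(f"Excluded in Top-51:  ${(full_charge - top51_charge) / 1e6:.0f}M")
```

The gap between the two totals is the cap room that disappears once every rostered contract counts.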

Key Cap Concepts

Several critical concepts separate cash spending from cap accounting, creating opportunities for financial engineering:

Cash vs. Cap: The actual cash paid to a player in a given year can differ significantly from the cap hit due to signing bonuses and guarantees. Teams often use this distinction to manipulate their cap situation, paying players large upfront bonuses (cash) that prorate over multiple years (cap), creating short-term cap savings at the expense of future flexibility.

Dead Money: Cap charges that remain when a player is released, typically from prorated signing bonuses. When a team releases a player, any remaining prorated bonus money "accelerates" to the current year (with some exceptions for post-June 1 designations). Dead money represents perhaps the most visible cost of roster mistakes—teams paying millions in cap space for players no longer on the roster.

Cap Carry Over: Unused cap space from one season can be carried to the next, creating a strategic decision: should teams use all available cap space to maximize current season competitiveness, or preserve some for future flexibility? This decision depends on championship window considerations and upcoming contract situations.

Performance Escalators: Contract provisions that increase salary based on playing time or performance. These allow teams to bet on player development while protecting against overpayment if the player doesn't meet expectations. For example, a contract might escalate by $2 million if a player makes the Pro Bowl.

Contract Structure Basics

A $100M/5-year contract with a $20M signing bonus demonstrates the cash vs. cap distinction:

- **Signing bonus**: $20M (paid immediately, prorates to $4M/year over 5 years)
- **Year 1**: $10M salary + $4M proration = $14M cap hit (but $30M cash)
- **Year 2**: $15M salary + $4M proration = $19M cap hit ($15M cash)
- **Year 3**: $18M salary + $4M proration = $22M cap hit ($18M cash)
- **Year 4**: $20M salary + $4M proration = $24M cap hit ($20M cash)
- **Year 5**: $17M salary + $4M proration = $21M cap hit ($17M cash)

Notice how the team pays $30M in Year 1 (salary + bonus) but only counts $14M against the cap. This structure helps teams manage short-term cap situations but creates future obligations. If the team releases the player after Year 2, the remaining $12M in bonus proration ($4M × 3 years) accelerates to Year 3 as dead money.
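The arithmetic in this example can be reproduced with a short sketch. The function names `cap_schedule` and `dead_money` are illustrative, amounts are in millions of dollars, and the sketch assumes a pre-June 1 release with no guaranteed base salary, so the only dead money is unamortized signing bonus.

```python
def cap_schedule(salaries, signing_bonus, max_proration_years=5):
    """Return a (cash, cap_hit) tuple for each contract year.

    The signing bonus is paid in Year 1 but spread evenly across the
    contract for cap purposes, over at most max_proration_years.
    """
    proration_years = min(len(salaries), max_proration_years)
    proration = signing_bonus / proration_years
    schedule = []
    for i, salary in enumerate(salaries):
        cash = salary + (signing_bonus if i == 0 else 0)
        cap_hit = salary + (proration if i < proration_years else 0)
        schedule.append((cash, cap_hit))
    return schedule


def dead_money(signing_bonus, contract_years, seasons_played,
               max_proration_years=5):
    """Unamortized bonus that accelerates if the player is released
    after seasons_played seasons (simplified: bonus proration only)."""
    proration_years = min(contract_years, max_proration_years)
    proration = signing_bonus / proration_years
    return proration * max(proration_years - seasons_played, 0)


salaries = [10, 15, 18, 20, 17]  # base salaries in $M from the example
schedule = cap_schedule(salaries, signing_bonus=20)

for year, (cash, cap_hit) in enumerate(schedule, start=1):
    print(f"Year {year}: cash ${cash:.0f}M, cap hit ${cap_hit:.0f}M")

print(f"Dead money if released after Year 2: ${dead_money(20, 5, 2):.0f}M")
```

Total cash and total cap charges both sum to $100M; the structure changes only when the charges land.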

These cap mechanics create a complex optimization problem. Teams must balance immediate competitiveness (using cap space now) against future flexibility (preserving cap space for later). The best analytical approaches model these trade-offs explicitly, considering multiple future scenarios and their probabilities.

Positional Cap Rules

Several special rules, some of them position-dependent, create additional strategic considerations:

Top-51 Rule: During offseason, only top 51 cap hits count. This creates an interesting strategic element: teams can sign extra players to compete during training camp without immediately facing cap consequences. However, when final rosters are set, all contracts count, potentially creating cap crunches for teams that aren't careful with their planning.

Minimum Salary Benefit: Veteran minimum salaries have cap advantages through the Veteran Salary Benefit program. When teams sign veteran players to minimum salary contracts, only a portion counts against the cap, with the league funding the difference. This creates value opportunities for teams willing to sign older players to short-term, minimum contracts.

Franchise Tag: In simplified terms, the tag salary is the greater of the average of the top five salaries at the player's position or 120% of the player's previous salary (the exact CBA formula differs between the exclusive and non-exclusive tags). The franchise tag allows teams to retain players for one year without negotiating a long-term deal, but at a premium price. This creates strategic decisions: is it better to sign a player to a long-term deal with lower annual value, or keep them on the tag while maintaining future flexibility?
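The simplified franchise tag rule described above (the greater of the top-five average or 120% of the prior salary) can be sketched as follows. The function name `franchise_tag_estimate` and all salary figures are hypothetical, and the real CBA calculation is more involved, so treat the output as a rough estimate only.

```python
def franchise_tag_estimate(position_salaries, prior_year_salary, top_n=5):
    """Greater of the top_n average at the position or 120% of prior salary."""
    top = sorted(position_salaries, reverse=True)[:top_n]
    top_n_average = sum(top) / len(top)
    return max(top_n_average, 1.2 * prior_year_salary)


# Hypothetical salaries (in $M) for the highest-paid players at one position
position_salaries = [30, 28, 27, 25, 24, 20]

# Player earning well below the top of the market: top-five average applies
print(franchise_tag_estimate(position_salaries, prior_year_salary=18))

# Player already near the top of the market: the 120% rule applies
print(franchise_tag_estimate(position_salaries, prior_year_salary=25))
```

The second case shows why tagging an already highly paid player, or tagging the same player in consecutive years, becomes expensive quickly.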

Understanding these rules is essential for cap analytics because they create opportunities for arbitrage—finding loopholes or inefficiencies in the cap system that allow teams to gain competitive advantages. The best cap managers combine deep knowledge of these rules with analytical rigor to construct optimal rosters.

Setting Up the Environment

Before we dive into salary cap analysis, we need to set up our analytical environment with the necessary packages and libraries. In a real-world scenario, you would access contract data from sources like Over The Cap or Spotrac through their APIs. For this chapter, we'll use simulated contract data that mirrors the structure of real NFL contracts, allowing us to demonstrate analytical techniques without requiring API access.

The data we'll work with includes key contract elements: player names, positions, teams, contract length, total value, guaranteed money, and annual cap hits. We'll also incorporate player performance data from nflfastR to evaluate contract efficiency—the relationship between what teams pay and what they receive in on-field production.

R Setup
Python Setup

#| eval: false
#| echo: true

# Install required packages for salary cap analysis
# tidyverse: data manipulation and visualization
# nflfastR: NFL play-by-play data for performance metrics
# nflreadr: efficient data reading from nflverse
# gt/gtExtras: creating publication-quality tables
# scales: formatting numbers and currencies
# glmnet: regularized regression for market value models
# randomForest: ensemble models for contract prediction

install.packages("tidyverse")
install.packages("nflfastR")
install.packages("nflreadr")
install.packages("gt")
install.packages("gtExtras")
install.packages("scales")
install.packages("glmnet")
install.packages("randomForest")

#| message: false
#| warning: false

# Load packages into R session
library(tidyverse)      # For data manipulation and visualization
library(nflfastR)       # For NFL play-by-play data
library(nflreadr)       # For efficient data loading
library(gt)             # For creating formatted tables
library(gtExtras)       # For enhanced table features
library(scales)         # For formatting numbers and scales

cat("✓ R packages loaded successfully\n")

#| eval: false
#| echo: true

# Install required packages for salary cap analysis
# pandas: data manipulation and analysis
# numpy: numerical computing and array operations
# nfl-data-py: NFL data access including play-by-play
# matplotlib/seaborn: data visualization
# scikit-learn: machine learning for contract models
# scipy: optimization algorithms for roster construction

# Run this in a terminal (or prefix with ! inside a notebook cell)
pip install pandas numpy nfl-data-py matplotlib seaborn scikit-learn scipy

#| message: false
#| warning: false

# Import Python packages
import pandas as pd                           # Data manipulation
import numpy as np                            # Numerical operations
import nfl_data_py as nfl                     # NFL data access
import matplotlib.pyplot as plt               # Plotting
import seaborn as sns                         # Statistical visualization
from sklearn.linear_model import LinearRegression, Ridge  # Regression models
from sklearn.ensemble import RandomForestRegressor        # Ensemble models
from sklearn.model_selection import train_test_split     # Model validation
from scipy.optimize import minimize                       # Optimization

print("✓ Python packages loaded successfully")

Both R and Python provide excellent tools for salary cap analysis. R's tidyverse ecosystem offers intuitive data manipulation, while Python's scikit-learn provides robust machine learning capabilities. Choose based on your team's existing infrastructure and expertise.

Loading and Preparing Salary Cap Data

Contract data forms the foundation of salary cap analytics. In production environments, this data typically comes from specialized sources like Over The Cap, Spotrac, or proprietary team databases. These sources track detailed contract information including signing bonuses, yearly salaries, guaranteed amounts, roster bonuses, workout bonuses, and option clauses.

For our analysis, we'll simulate contract data that mirrors real NFL contract structures. This simulation includes realistic distributions of contract values by position, appropriate ranges for guaranteed percentages, and sensible relationships between age, experience, and contract value.

Contract Data Sources

Real-world contract data requires careful validation because publicly available information may be incomplete or inaccurate. Teams often structure contracts with complex clauses and incentives that aren't immediately apparent in public databases. When building production analytics systems, always cross-reference multiple sources and validate against official league cap reports.

R
Python

#| label: load-contract-data-r
#| message: false
#| warning: false
#| cache: true

# Simulate contract data for demonstration purposes
# In practice, this would come from Over The Cap, Spotrac, or similar sources
# Real data would include: signing bonuses, yearly breakdowns, option years,
# incentives, guarantees for injury vs. skill, and detailed cap accounting

set.seed(2024)  # Set seed for reproducible random generation

# Create simulated contract dataset
contracts <- tibble(
  # Generate player identifiers (in practice, would be actual names)
  player = paste("Player", 1:500),

  # Position assignment - uniform across position groups for simplicity
  # (a real roster would weight toward positions with more players)
  position = sample(c("QB", "RB", "WR", "TE", "OL",
                     "DL", "LB", "CB", "S", "K/P"),
                   500, replace = TRUE),

  # Team assignment - sample from current NFL team abbreviations
  # (nflreadr::load_teams() returns one row per team)
  team = sample(load_teams()$team_abbr, 500, replace = TRUE),

  # Contract length - drawn uniformly from 1-6 years for simplicity
  # (real contracts cluster around 3-5 years)
  years = sample(1:6, 500, replace = TRUE),

  # Total contract value in dollars - log-normal distribution for realistic skew
  # exp(rnorm(mean = 16, sd = 1.5)) has a median of exp(16), about $8.9M,
  # so no further scaling is needed
  total_value = exp(rnorm(500, mean = 16, sd = 1.5)),

  # Guaranteed percentage - varies by position and player quality
  # Range from 20% (prove-it deals) to 80% (star players)
  guaranteed = runif(500, 0.2, 0.8),

  # Player age - realistic age range for NFL players
  age = sample(22:35, 500, replace = TRUE),

  # Years of experience in NFL
  experience = sample(0:15, 500, replace = TRUE),

  # Calculate average annual value (APY)
  apy = total_value / years
) %>%
  mutate(
    # Calculate guaranteed money in dollars
    guaranteed_money = total_value * guaranteed,

    # Cap hit for 2024 - varies around APY due to signing bonus proration
    # Real contracts would have exact yearly cap hits
    cap_hit_2024 = apy * runif(500, 0.8, 1.2),

    # Adjust QB contracts upward - QBs command premium salaries
    # Multiply by 2.5 to reflect QB market (top QBs make ~$50M APY)
    total_value = ifelse(position == "QB", total_value * 2.5, total_value),

    # Recalculate derived values after QB adjustment
    apy = total_value / years,
    guaranteed_money = total_value * guaranteed,
    cap_hit_2024 = apy * runif(500, 0.8, 1.2)
  )

# Display summary statistics
cat("Loaded", nrow(contracts), "player contracts\n")
cat("Total contract value:", scales::dollar(sum(contracts$total_value)), "\n")
cat("Average APY:", scales::dollar(mean(contracts$apy)), "\n")
cat("Position breakdown:\n")
print(table(contracts$position))

#| label: load-contract-data-py
#| message: false
#| warning: false
#| cache: true

# Simulate contract data structure
# In production, would load from API or database
np.random.seed(2024)

# Get team abbreviations from nfl_data_py
teams = nfl.import_team_desc()[['team_abbr']].dropna()['team_abbr'].unique()

# Define position groups
positions = ['QB', 'RB', 'WR', 'TE', 'OL', 'DL', 'LB', 'CB', 'S', 'K/P']

# Number of contracts to simulate
n = 500

# Create contract dataframe with realistic structure
contracts = pd.DataFrame({
    # Player identifiers
    'player': [f'Player {i}' for i in range(1, n+1)],

    # Random position assignment
    'position': np.random.choice(positions, n),

    # Random team assignment from actual NFL teams
    'team': np.random.choice(teams, n),

    # Contract length - drawn uniformly from 1-6 years for simplicity
    'years': np.random.randint(1, 7, n),

    # Total value in dollars - log-normal distribution for realistic skew
    # Median is exp(16), about $8.9M, so no further scaling is needed
    'total_value': np.exp(np.random.normal(16, 1.5, n)),

    # Guaranteed percentage - varies widely
    'guaranteed': np.random.uniform(0.2, 0.8, n),

    # Player age - realistic NFL range
    'age': np.random.randint(22, 36, n),

    # Years in NFL
    'experience': np.random.randint(0, 16, n)
})

# Calculate derived contract values
contracts['apy'] = contracts['total_value'] / contracts['years']
contracts['guaranteed_money'] = contracts['total_value'] * contracts['guaranteed']
contracts['cap_hit_2024'] = contracts['apy'] * np.random.uniform(0.8, 1.2, n)

# Adjust QB contracts to reflect market premium
qb_mask = contracts['position'] == 'QB'
contracts.loc[qb_mask, 'total_value'] *= 2.5

# Recalculate derived values after QB adjustment
contracts['apy'] = contracts['total_value'] / contracts['years']
contracts['guaranteed_money'] = contracts['total_value'] * contracts['guaranteed']
contracts.loc[qb_mask, 'cap_hit_2024'] = contracts.loc[qb_mask, 'apy'] * np.random.uniform(0.8, 1.2, qb_mask.sum())

# Display summary
print(f"Loaded {len(contracts):,} player contracts")
print(f"Total contract value: ${contracts['total_value'].sum():,.0f}")
print(f"Average APY: ${contracts['apy'].mean():,.0f}")
print(f"\nPosition breakdown:")
print(contracts['position'].value_counts().sort_index())

This simulated data provides a workable foundation for our analyses. The key variables (position, team, contract value, guarantees, age, and experience) mirror the fields found in actual NFL contract data. Note what the simulation does and does not encode: QB contracts are scaled up to reflect the quarterback premium, but contract length, guarantee percentage, age, and experience are drawn independently of position. In real data, contract length varies by position, and guaranteed percentages differ based on player leverage and risk.

Data Quality Considerations

When working with real contract data, be aware of several data quality challenges:

1. **Reporting lags**: Contract details may not be immediately available after signing
2. **Restructures**: Teams frequently restructure contracts, changing cap hits mid-season
3. **Incentives**: LTBE (Likely To Be Earned) vs. NLTBE (Not Likely To Be Earned) incentives affect the cap differently
4. **Voidable years**: Some contracts include "fake years" that void automatically
5. **Guaranteed money definitions**: "Fully guaranteed" vs. "guaranteed for injury" vs. "guaranteed at signing"

Always validate data against multiple sources and update regularly to reflect restructures and releases.
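One way to act on this advice is to compare the same field across two sources and flag disagreements for manual review. The following is a minimal sketch: the helper `find_discrepancies`, the 1% tolerance, and both sets of cap figures are hypothetical and not drawn from any real provider.

```python
def find_discrepancies(source_a, source_b, tolerance=0.01):
    """Compare cap hits keyed by player name across two sources.

    Flags players missing from either source and values whose relative
    difference exceeds tolerance.
    """
    issues = []
    for player in sorted(set(source_a) | set(source_b)):
        a, b = source_a.get(player), source_b.get(player)
        if a is None or b is None:
            issues.append((player, "missing", a, b))
        elif abs(a - b) > tolerance * max(abs(a), abs(b)):
            issues.append((player, "mismatch", a, b))
    return issues


# Hypothetical cap hits reported by two different sources
source_a = {"Player 1": 14_000_000, "Player 2": 5_250_000, "Player 3": 1_100_000}
source_b = {"Player 1": 14_000_000, "Player 2": 4_000_000, "Player 4": 900_000}

issues = find_discrepancies(source_a, source_b)
for player, kind, a, b in issues:
    print(f"{player}: {kind} (source A: {a}, source B: {b})")
```

A mismatch like the one for Player 2 often signals a restructure that only one source has recorded, which is exactly the case worth checking by hand.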

Loading Performance Data

To evaluate contract efficiency—whether teams are getting good value for their spending—we need performance metrics to pair with contract data. The most comprehensive approach combines multiple data sources: nflfastR for play-by-play EPA, PFF grades for subjective evaluation, and Next Gen Stats for tracking data.

For this analysis, we'll focus on EPA-based metrics from nflfastR, which provide objective, outcome-based measures of player contribution. We'll calculate position-specific performance metrics that can be directly related to contract value.

R
Python

#| label: load-performance-r
#| message: false
#| warning: false
#| cache: true

# Load recent season data for performance metrics
# This provides play-by-play EPA data we can aggregate by player
pbp_2023 <- load_pbp(2023)

# Calculate QB performance metrics
# Focus on efficiency metrics (EPA per play) rather than volume
qb_performance <- pbp_2023 %>%
  # Filter to plays with valid EPA and identified passer
  filter(!is.na(epa), !is.na(passer_player_name)) %>%

  # Group by QB and team to calculate performance metrics
  group_by(passer_player_name, posteam) %>%
  summarise(
    # Sample size - minimum 100 dropbacks for reliable estimates
    dropbacks = n(),

    # Expected Points Added per play - primary efficiency metric
    epa_per_play = mean(epa),

    # Completion percentage over expectation - adjusts for difficulty
    cpoe = mean(cpoe, na.rm = TRUE),

    # Success rate - percentage of positive EPA plays
    success_rate = mean(epa > 0),

    .groups = "drop"
  ) %>%

  # Filter to QBs with sufficient sample size
  filter(dropbacks >= 100) %>%

  # Rename for joining with contract data
  rename(player = passer_player_name, team = posteam)

# Calculate RB performance metrics
# Running backs evaluated on rushing efficiency
rb_performance <- pbp_2023 %>%
  filter(!is.na(epa), play_type == "run", !is.na(rusher_player_name)) %>%
  group_by(rusher_player_name, posteam) %>%
  summarise(
    rushes = n(),
    epa_per_rush = mean(epa),
    success_rate = mean(epa > 0),
    yards_per_rush = mean(yards_gained, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(rushes >= 50) %>%
  rename(player = rusher_player_name, team = posteam)

# Calculate WR/TE performance metrics
# Receivers evaluated on per-target efficiency
wr_performance <- pbp_2023 %>%
  filter(!is.na(epa), play_type == "pass", !is.na(receiver_player_name)) %>%
  group_by(receiver_player_name, posteam) %>%
  summarise(
    targets = n(),
    epa_per_target = mean(epa),
    success_rate = mean(epa > 0),
    yards_per_target = mean(yards_gained, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(targets >= 30) %>%
  rename(player = receiver_player_name, team = posteam)

cat("Performance metrics calculated for",
    nrow(qb_performance), "QBs,",
    nrow(rb_performance), "RBs,",
    nrow(wr_performance), "WRs\n")

#| label: load-performance-py
#| message: false
#| warning: false
#| cache: true

# Load 2023 season play-by-play data
pbp_2023 = nfl.import_pbp_data([2023])

# Calculate QB performance metrics
# Group by passer and aggregate efficiency metrics
qb_performance = (pbp_2023
    .query("epa.notna() & passer_player_name.notna()")
    .groupby(['passer_player_name', 'posteam'])
    .agg(
        # Count total dropbacks (sample size)
        dropbacks=('epa', 'count'),

        # Mean EPA per dropback (efficiency metric)
        epa_per_play=('epa', 'mean'),

        # Completion percentage over expectation
        cpoe=('cpoe', 'mean'),

        # Success rate (% of positive EPA plays)
        success_rate=('epa', lambda x: (x > 0).mean())
    )
    # Filter to QBs with meaningful sample size
    .query("dropbacks >= 100")
    .reset_index()
    .rename(columns={'passer_player_name': 'player', 'posteam': 'team'})
)

# Calculate RB performance metrics
rb_performance = (pbp_2023
    .query("epa.notna() & play_type == 'run' & rusher_player_name.notna()")
    .groupby(['rusher_player_name', 'posteam'])
    .agg(
        rushes=('epa', 'count'),
        epa_per_rush=('epa', 'mean'),
        success_rate=('epa', lambda x: (x > 0).mean()),
        yards_per_rush=('yards_gained', 'mean')
    )
    .query("rushes >= 50")
    .reset_index()
    .rename(columns={'rusher_player_name': 'player', 'posteam': 'team'})
)

# Calculate WR/TE performance metrics
wr_performance = (pbp_2023
    .query("epa.notna() & play_type == 'pass' & receiver_player_name.notna()")
    .groupby(['receiver_player_name', 'posteam'])
    .agg(
        targets=('epa', 'count'),
        epa_per_target=('epa', 'mean'),
        success_rate=('epa', lambda x: (x > 0).mean()),
        yards_per_target=('yards_gained', 'mean')
    )
    .query("targets >= 30")
    .reset_index()
    .rename(columns={'receiver_player_name': 'player', 'posteam': 'team'})
)

print(f"Performance metrics calculated for {len(qb_performance)} QBs, "
      f"{len(rb_performance)} RBs, {len(wr_performance)} WRs")

These performance metrics provide the foundation for evaluating contract efficiency. By combining per-play efficiency metrics (EPA per play) with contract data (APY), we can identify which players provide the most value per dollar—a critical insight for cap-constrained teams.

The key insight here is using efficiency metrics (per-play EPA) rather than volume metrics (total yards or touchdowns). Volume metrics conflate opportunity with production—a player who gets more plays will accumulate more total stats regardless of efficiency. Efficiency metrics isolate the player's contribution per opportunity, which better reflects true value and translates across different usage levels.
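A small sketch shows how volume and efficiency can rank players differently, and how efficiency might be related to cost. Both running backs and all of their numbers are hypothetical, and "EPA per rush per $1M of APY" is one simple ratio chosen for illustration, not an established metric; it also behaves poorly when EPA is negative.

```python
# Hypothetical running backs: volume, total EPA, and APY in dollars
players = [
    {"player": "RB A", "rushes": 300, "total_epa": 6.0, "apy": 12_000_000},
    {"player": "RB B", "rushes": 100, "total_epa": 5.0, "apy": 2_000_000},
]


def add_value_metrics(rows):
    """Add per-rush efficiency and efficiency per $1M of APY to each row."""
    out = []
    for r in rows:
        epa_per_rush = r["total_epa"] / r["rushes"]
        out.append({
            **r,
            "epa_per_rush": epa_per_rush,
            "epa_per_rush_per_million": epa_per_rush / (r["apy"] / 1e6),
        })
    return out


metrics = add_value_metrics(players)

by_volume = max(metrics, key=lambda r: r["total_epa"])["player"]
by_efficiency = max(metrics, key=lambda r: r["epa_per_rush"])["player"]
by_value = max(metrics, key=lambda r: r["epa_per_rush_per_million"])["player"]

print(f"Best by total EPA:      {by_volume}")
print(f"Best by EPA per rush:   {by_efficiency}")
print(f"Best by EPA per $1M:    {by_value}")
```

RB A leads on volume because of three times the carries, while RB B leads on both efficiency and efficiency per dollar.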

Contract Valuation Analysis

Understanding what drives contract values is fundamental to salary cap analytics. Contract valuation isn't purely about performance—market timing, positional scarcity, leverage in negotiations, and team cap situations all influence final contract terms. By building statistical models that account for these factors, we can identify when teams are paying above or below market rate for players.

This section develops market value models that predict expected contract value based on observable player characteristics. The residuals from these models—the difference between actual and predicted contract value—reveal which players represent value contracts (underpaid) or overpays.
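The residual idea can be illustrated with a deliberately small stand-in for a market value model: a one-feature ordinary least squares fit of APY on a performance score. The players, scores, and salaries are hypothetical, and `fit_line` is an illustrative helper; a realistic model would use several features and a held-out validation set.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical players: (name, performance score, actual APY in $M)
players = [
    ("Player A", 1.0, 2.0),
    ("Player B", 2.0, 5.0),
    ("Player C", 3.0, 6.0),
    ("Player D", 4.0, 7.0),
    ("Player E", 5.0, 10.0),
]

scores = [p[1] for p in players]
apys = [p[2] for p in players]
slope, intercept = fit_line(scores, apys)

# Residual = actual APY minus model-predicted APY
residuals = {
    name: apy - (intercept + slope * score) for name, score, apy in players
}

for name, resid in residuals.items():
    label = "underpaid" if resid < 0 else "overpaid or at market"
    print(f"{name}: residual {resid:+.1f}M ({label})")

best_value = min(residuals, key=residuals.get)
print(f"Largest value contract: {best_value}")
```

A negative residual means the player is paid less than the model expects for that level of performance, which is the working definition of a value contract here.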

Average Annual Value (APY) by Position

The most basic metric for contract comparison is average per year (APY), also called average annual value (AAV), calculated by dividing total contract value by contract length. While simple, APY has limitations: it doesn't account for guarantees, signing bonus structure, or actual year-by-year cap accounting. However, it provides a useful starting point for cross-contract comparisons.

$$ \text{APY} = \frac{\text{Total Contract Value}}{\text{Contract Years}} $$
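To see why APY alone can mislead, consider two contracts with identical APY but different guarantee structures. This is a minimal sketch with hypothetical contract terms; `contract_summary` is an illustrative helper name.

```python
def contract_summary(total_value, years, guaranteed_money):
    """Return APY alongside two guarantee-based measures."""
    return {
        "apy": total_value / years,
        "guaranteed_pct": guaranteed_money / total_value,
        "guaranteed_per_year": guaranteed_money / years,
    }


# Hypothetical contracts (dollars): same APY, very different security
contract_a = contract_summary(60_000_000, 3, 45_000_000)
contract_b = contract_summary(100_000_000, 5, 40_000_000)

for label, c in (("A", contract_a), ("B", contract_b)):
    print(f"Contract {label}: APY ${c['apy'] / 1e6:.0f}M, "
          f"{c['guaranteed_pct']:.0%} guaranteed, "
          f"${c['guaranteed_per_year'] / 1e6:.0f}M guaranteed per year")
```

Both deals report a $20M APY, yet Contract A commits nearly twice as much guaranteed money per year as Contract B.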

Different positions command dramatically different APY values in the NFL market. Quarterbacks earn far more than any other position, reflecting their disproportionate impact on winning. Within non-QB positions, there's a clear hierarchy: elite pass rushers and left tackles command premiums, while running backs and off-ball linebackers earn less.

#| label: positional-apy-r
#| message: false
#| warning: false

# Calculate comprehensive APY statistics by position
# This analysis reveals the positional market structure
apy_by_position <- contracts %>%
  group_by(position) %>%
  summarise(
    # Sample size for each position
    n = n(),

    # Mean APY - affected by star contracts and outliers
    mean_apy = mean(apy),

    # Median APY - more robust to outliers, better represents "typical" contract
    median_apy = median(apy),

    # 75th percentile - represents "good" contracts for starters
    q75_apy = quantile(apy, 0.75),

    # Maximum APY - represents elite player contracts
    max_apy = max(apy),

    # Average percentage of contract guaranteed
    pct_guaranteed = mean(guaranteed * 100),

    .groups = "drop"
  ) %>%
  # Sort by mean APY to see positional hierarchy
  arrange(desc(mean_apy))

# Display as formatted table with proper currency formatting
apy_by_position %>%
  gt() %>%
  cols_label(
    position = "Position",
    n = "N",
    mean_apy = "Mean APY",
    median_apy = "Median APY",
    q75_apy = "75th Pct",
    max_apy = "Max APY",
    pct_guaranteed = "% Guaranteed"
  ) %>%
  fmt_currency(
    columns = c(mean_apy, median_apy, q75_apy, max_apy),
    decimals = 0
  ) %>%
  fmt_number(
    columns = pct_guaranteed,
    decimals = 1
  ) %>%
  tab_header(
    title = "Contract APY by Position",
    subtitle = "Average Annual Value and Guarantee Rates"
  ) %>%
  tab_source_note("Sample contract data")

#| label: positional-apy-py
#| message: false
#| warning: false

# Calculate APY statistics by position
# Provides comprehensive view of positional markets
apy_by_position = (contracts
    .groupby('position')
    .agg(
        n=('apy', 'count'),
        mean_apy=('apy', 'mean'),
        median_apy=('apy', 'median'),
        q75_apy=('apy', lambda x: x.quantile(0.75)),
        max_apy=('apy', 'max'),
        pct_guaranteed=('guaranteed', lambda x: x.mean() * 100)
    )
    .sort_values('mean_apy', ascending=False)
    .reset_index()
)

print("\nContract APY by Position:")
print("="*80)
for _, row in apy_by_position.iterrows():
    print(f"{row['position']:>4} | N={row['n']:>3} | "
          f"Mean: ${row['mean_apy']/1e6:>5.1f}M | "
          f"Median: ${row['median_apy']/1e6:>5.1f}M | "
          f"Max: ${row['max_apy']/1e6:>5.1f}M | "
          f"Guaranteed: {row['pct_guaranteed']:>4.1f}%")

This code calculates position-level contract statistics using group-by operations. In R, we use `dplyr`'s `group_by()` and `summarise()` pattern, while Python uses pandas' `groupby()` and `agg()` methods. The analysis computes multiple summary statistics because each reveals different aspects of the market:

- **Mean APY**: Shows average spending but is sensitive to outliers (star contracts)
- **Median APY**: More robust measure of typical contracts
- **75th percentile**: Represents what teams pay for quality starters
- **Max APY**: Shows ceiling for elite players
- **% Guaranteed**: Reflects position-specific risk and leverage

The `arrange(desc(mean_apy))` in R and `sort_values('mean_apy', ascending=False)` in Python sort positions from highest to lowest paid, revealing the market hierarchy.

When we examine these positional APY statistics, the clearest pattern is that quarterbacks command dramatically higher APY than any other position. In our simulated data the gap is roughly 2.5 times in expectation, because we applied that multiplier to QB contracts by construction; in the real market, the top of the QB market also sits well above every other position. This reflects the quarterback's unique importance to team success. Research consistently shows QB play explains more variance in team wins than any other position.

Within non-QB positions, real contract data shows a clear hierarchy (our simulation draws all non-QB positions from the same distribution, so any differences among them in the table above are sampling noise). Positions that impact the passing game, such as wide receivers, cornerbacks, and edge rushers, command higher APY than positions primarily involved in the run game, such as running backs and off-ball linebackers. This reflects the modern NFL's pass-heavy offensive approach. Since passing is more efficient than running (as we discussed in earlier chapters), players who excel in the passing game command market premiums.

The guarantee percentage also varies significantly by position. Quarterbacks and left tackles typically receive higher guarantee percentages because (1) they have more leverage in negotiations due to positional scarcity, and (2) teams are more confident in their evaluation of these positions. Running backs receive lower guarantees because teams worry about injury risk and rapid performance decline with age.

Positional Value and Market Compensation

The market doesn't always perfectly reflect positional value. Some positions may be systematically undervalued or overvalued relative to their impact on winning. For example:

- **Running backs**: May be overvalued relative to actual contribution to wins
- **Offensive line**: May be undervalued because impact is harder to measure
- **Interior defensive line**: Market value increasing as analytics reveal their importance
- **Off-ball linebackers**: Often overvalued in traditional evaluation relative to what analytics suggests

Teams with superior analytics can exploit these market inefficiencies by spending more on undervalued positions and less on overvalued ones.

Guaranteed Money Analysis

While APY provides a useful summary, guaranteed money often matters more for understanding true contract value. Guaranteed money represents the minimum a player will earn regardless of injury or performance decline. From the player's perspective, guarantees provide financial security. From the team's perspective, guarantees represent committed capital that limits future flexibility.

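The relationship between contract length and guarantee percentage reveals interesting patterns about risk allocation between teams and players. Generally, longer contracts have lower guarantee percentages because teams become less confident in player performance further into the future. However, this relationship varies by position and player quality. Because our simulated guarantee percentages are drawn independently of contract length, the trend lines in the plot that follows will be roughly flat; the downward slope appears only in real contract data.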

R
Python

#| label: fig-guaranteed-analysis-r
#| fig-cap: "Guaranteed money percentage by position and contract length"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false

# Visualize relationship between contract length and guarantees
# Shows how teams manage risk across different contract structures
contracts %>%
  # Focus on major positions with sufficient sample size
  filter(position %in% c("QB", "WR", "RB", "OL", "DL", "LB", "CB", "S")) %>%

  # Create scatter plot with trend lines
  ggplot(aes(x = years, y = guaranteed * 100, color = position)) +

  # Individual contract points with transparency to show density
  geom_point(alpha = 0.4, size = 2) +

  # LOESS smoothing to show non-linear trends
  # se = FALSE removes confidence bands for cleaner visualization
  geom_smooth(method = "loess", se = FALSE, linewidth = 1) +

  # Format y-axis as percentages
  scale_y_continuous(labels = scales::percent_format(scale = 1)) +

  # Show all contract lengths from 1-6 years
  scale_x_continuous(breaks = 1:6) +

  # Use color palette that works for colorblind viewers
  scale_color_brewer(palette = "Set2") +

  labs(
    title = "Contract Guarantee Rates by Position and Length",
    subtitle = "Longer contracts typically have lower guarantee percentages",
    x = "Contract Length (Years)",
    y = "Percentage Guaranteed",
    color = "Position"
  ) +

  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "right"
  )

#| label: fig-guaranteed-analysis-py
#| fig-cap: "Guaranteed money percentage by position and contract length - Python"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false

# Filter to positions with sufficient data
positions = ['QB', 'WR', 'RB', 'OL', 'DL', 'LB', 'CB', 'S']
plot_data = contracts[contracts['position'].isin(positions)].copy()

# Create figure
plt.figure(figsize=(10, 6))

# Plot each position separately to create colored groups
for pos in positions:
    pos_data = plot_data[plot_data['position'] == pos]

    # Scatter plot with transparency
    plt.scatter(pos_data['years'], pos_data['guaranteed'] * 100,
                alpha=0.4, label=pos, s=50)

plt.xlabel('Contract Length (Years)', fontsize=12)
plt.ylabel('Percentage Guaranteed (%)', fontsize=12)
plt.title('Contract Guarantee Rates by Position and Length\n'
          'Longer contracts typically have lower guarantee percentages',
          fontsize=14, fontweight='bold')
plt.legend(title='Position', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

📊 Visualization Output

The code above generates a visualization. To see the output, run this code in your R or Python environment. The resulting plot will help illustrate the concepts discussed in this section.

This visualization reveals several important patterns in how teams structure guarantees. First, we see a general downward trend: longer contracts tend to have lower guarantee percentages. This makes intuitive sense—teams are less willing to commit guaranteed money five years into the future when uncertainty about player performance is high.

Second, we observe position-specific patterns. Quarterbacks tend to maintain high guarantee percentages even on longer contracts, reflecting their unique value and leverage. Running backs show the opposite pattern—even short contracts often have relatively low guarantees due to injury concerns and positional replaceability.

Third, there's significant variation within positions. Some QBs sign contracts with 90%+ guarantees (typically elite players with leverage), while others receive only 40-50% guarantees (often younger players or those with injury histories). This variation represents differences in player quality, team cap situations, and negotiating leverage.
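The same patterns can be checked numerically rather than by eye. The sketch below is one possible way to do that, assuming a table shaped like the chapter's `contracts` data with `position`, `years`, and `guaranteed` columns, where `guaranteed` is taken to be a 0-1 share. The rows in `toy_contracts` are made-up placeholder values, not real contracts, and the helper names are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows standing in for the chapter's `contracts` table;
# `guaranteed` is assumed to be a 0-1 share of total contract value.
toy_contracts = pd.DataFrame({
    "position":   ["QB", "QB", "QB", "QB", "RB", "RB", "RB", "RB"],
    "years":      [2, 3, 5, 6, 1, 2, 3, 4],
    "guaranteed": [0.90, 0.80, 0.65, 0.60, 0.50, 0.40, 0.30, 0.25],
})

def guarantee_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Median guarantee share (in percent) by position and contract length."""
    return (
        df.assign(guaranteed_pct=df["guaranteed"] * 100)
          .pivot_table(index="position", columns="years",
                       values="guaranteed_pct", aggfunc="median")
    )

def length_guarantee_rank_corr(df: pd.DataFrame) -> pd.Series:
    """Rank correlation between contract length and guarantee share within
    each position (negative means longer deals guarantee a smaller share)."""
    return pd.Series({
        pos: grp["years"].rank().corr(grp["guaranteed"].rank())
        for pos, grp in df.groupby("position")
    })

summary = guarantee_summary(toy_contracts)
rank_corr = length_guarantee_rank_corr(toy_contracts)
print(summary.round(1))
print(rank_corr.round(2))
```

A rank correlation is used here because contract length takes only a handful of integer values, so a measure that depends only on ordering is a reasonable first check before fitting anything more elaborate.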

Guarantee Structures

NFL contracts use several types of guarantees, each with different implications:

- **Fully Guaranteed**: Money paid regardless of injury, performance, or roster status. Extremely rare in the NFL except for signing bonuses and elite player contracts.
- **Guaranteed for Injury**: Player receives the money if he is released while unable to play because of a football injury. Common in veteran deals. If the player is healthy but cut for performance, the guarantee doesn't apply.
- **Guaranteed for Skill**: Player receives the money if he is released because the team judges that his performance has declined. Protects against performance-based cuts but not injury.
- **Non-Guaranteed**: Team can release the player without financial obligation. Common in later years of contracts, giving teams flexibility.
- **Rolling Guarantees**: Money becomes guaranteed at a certain date (often March of each year). Allows teams to evaluate the player before committing.

Understanding these distinctions is critical for contract analysis. A "$100M guaranteed" contract might have only $40M fully guaranteed at signing, with the rest guaranteed on roster dates in future years.
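To make the gap between a headline guarantee and the money actually committed concrete, here is a minimal sketch. The `Tranche` class, the `owed_if_cut_healthy` helper, and the split of the $100M into $40M, $35M, and $25M slices are all hypothetical illustrations, not a real contract or an official accounting rule.

```python
from dataclasses import dataclass

@dataclass
class Tranche:
    amount: float        # dollars in this slice of the headline guarantee
    kind: str            # "full" (at signing), "rolling", or "injury"
    vests_year: int = 1  # contract year whose roster date makes a rolling slice full

def owed_if_cut_healthy(tranches, cut_before_year):
    """Guaranteed money owed if a healthy player is released before the
    roster date of `cut_before_year` (contract years are numbered from 1)."""
    owed = 0.0
    for t in tranches:
        if t.kind == "full":
            owed += t.amount
        elif t.kind == "rolling" and t.vests_year < cut_before_year:
            owed += t.amount  # this slice already vested on an earlier roster date
        # "injury" slices pay nothing when a healthy player is cut
    return owed

# Hypothetical "$100M guaranteed" deal: $40M full at signing, rest on roster dates.
deal = [
    Tranche(40e6, "full"),
    Tranche(35e6, "rolling", vests_year=2),
    Tranche(25e6, "rolling", vests_year=3),
]

headline = sum(t.amount for t in deal)
print(f"Headline guarantee: ${headline / 1e6:.0f}M")
for year in (2, 3, 4):
    owed = owed_if_cut_healthy(deal, year)
    print(f"Cut before year-{year} roster date: ${owed / 1e6:.0f}M owed")
```

Under these assumptions the team's real exposure grows in steps, which is why the vesting dates matter as much as the headline number when comparing contracts.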

Contract Length and Age Dynamics

Beyond just the guarantee percentage, the interaction between player age and contract length creates interesting dynamics. Teams generally prefer to avoid paying for "decline years"—ages when player performance typically deteriorates. This creates age-specific contract length patterns.

Younger players in their early-to-mid 20s often receive longer contracts because teams expect sustained performance. Players in their late 20s at peak performance may also receive long-term deals, though often structured with outs in later years. Players in their 30s typically receive shorter contracts as teams become wary of age-related decline.

The optimal contract structure balances team risk management with player compensation preferences. Players generally prefer longer contracts with more guarantees (security), while teams prefer shorter contracts with fewer guarantees (flexibility). The final terms reflect relative negotiating leverage, which varies by position, player quality, and market conditions.

Positional Spending Patterns

Understanding how teams allocate cap resources across positions reveals strategic priorities and potential inefficiencies. Some teams invest heavily in quarterback and offensive line, betting that a clean pocket produces efficient offense. Others spread resources more evenly across the roster. Analyzing these allocation patterns helps us understand different roster construction philosophies and identify which approaches correlate with success.

Position-level spending analysis also reveals market inefficiencies. If analytics show that interior defensive line play drives wins more than the market prices it at, teams can gain advantages by over-investing at that position. Conversely, if running backs contribute less to winning than their salaries suggest, teams should spend less at RB.

Team-Level Positional Allocation

To understand team construction strategies, we need to examine how much cap space each team dedicates to each position. This reveals whether teams are following conventional wisdom or taking contrarian approaches that might provide competitive advantages.

R
Python

#| label: team-positional-spend-r
#| message: false
#| warning: false

# Calculate team spending by position
# This analysis reveals team-level strategic priorities
team_position_spend <- contracts %>%
  group_by(team, position) %>%
  summarise(
    # Total cap hit for this team-position combination
    total_cap = sum(cap_hit_2024),

    # Number of players at this position for this team
    n_players = n(),

    .groups = "drop"
  ) %>%

  # Calculate percentage of team's total cap at each position
  group_by(team) %>%
  mutate(
    # Each position's share of team's total cap spending
    pct_of_cap = total_cap / sum(total_cap) * 100
  ) %>%
  ungroup()

# Display QB spending as example
# QB spending varies widely - some teams have expensive veterans,
# others have cheap rookies, creating massive cap flexibility differences
team_position_spend %>%
  filter(position == "QB") %>%
  arrange(desc(pct_of_cap)) %>%
  head(10) %>%
  gt() %>%
  cols_label(
    team = "Team",
    position = "Position",
    total_cap = "Total Cap Hit",
    n_players = "Players",
    pct_of_cap = "% of Cap"
  ) %>%
  fmt_currency(
    columns = total_cap,
    decimals = 0
  ) %>%
  fmt_number(
    columns = pct_of_cap,
    decimals = 1
  ) %>%
  tab_header(
    title = "Top 10 Teams by QB Cap Spending",
    subtitle = "2024 Season"
  )

#| label: team-positional-spend-py
#| message: false
#| warning: false

# Calculate team spending by position
team_position_spend = (contracts
    .groupby(['team', 'position'])
    .agg(
        total_cap=('cap_hit_2024', 'sum'),
        n_players=('player', 'count')
    )
    .reset_index()
)

# Calculate percentage of each team's total cap
team_totals = (team_position_spend
    .groupby('team')['total_cap']
    .sum()
    .reset_index()
    .rename(columns={'total_cap': 'team_total'})
)

# Merge team totals back to calculate percentages
team_position_spend = team_position_spend.merge(team_totals, on='team')
team_position_spend['pct_of_cap'] = (
    team_position_spend['total_cap'] / team_position_spend['team_total'] * 100
)

# Show QB spending as example
qb_spend = (team_position_spend
    .query("position == 'QB'")
    .sort_values('pct_of_cap', ascending=False)
    .head(10)
)

print("\nTop 10 Teams by QB Cap Spending (2024):")
print("="*70)
for _, row in qb_spend.iterrows():
    print(f"{row['team']:>3} | ${row['total_cap']/1e6:>6.2f}M | "
          f"{row['pct_of_cap']:>5.1f}% of cap | "
          f"{row['n_players']:>2} players")

This analysis reveals one of the most important factors in NFL team building: quarterback cap allocation. Teams with expensive veteran quarterbacks might dedicate 15-20% of their cap to the position, while teams with quarterbacks on rookie contracts might spend only 2-3%. This gap of 12 to 18 percentage points (roughly $30-45 million under a $255 million cap) represents a massive competitive advantage for teams with cheap QB play.
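The arithmetic behind that gap is easy to reproduce. The sketch below converts cap shares into dollars, using the two cap figures quoted in the chapter introduction ($224.8 million for 2023 and roughly $255 million projected for 2024); the function names are hypothetical, and the 15-20% and 2-3% ranges are the illustrative figures from the paragraph above rather than measured values.

```python
CAP_2023 = 224.8e6       # cap figure quoted in the chapter introduction
CAP_2024_PROJ = 255.0e6  # approximate 2024 projection quoted in the introduction

def cap_share_to_dollars(pct_of_cap: float, cap: float) -> float:
    """Convert a percentage of the cap into dollars."""
    return pct_of_cap / 100 * cap

def qb_savings_range(cap, veteran_pct=(15, 20), rookie_pct=(2, 3)):
    """Smallest and largest dollar gap between veteran and rookie QB spending.

    The smallest gap pairs the cheapest veteran share with the priciest
    rookie share; the largest gap pairs the opposite extremes.
    """
    low = cap_share_to_dollars(veteran_pct[0] - rookie_pct[1], cap)
    high = cap_share_to_dollars(veteran_pct[1] - rookie_pct[0], cap)
    return low, high

for label, cap in [("2023", CAP_2023), ("2024 (projected)", CAP_2024_PROJ)]:
    low, high = qb_savings_range(cap)
    print(f"{label}: ${low / 1e6:.1f}M to ${high / 1e6:.1f}M")
```

Because the gap is a share of the cap, its dollar value grows whenever the cap grows, so the rookie-contract advantage gets larger in absolute terms each year even if the percentages stay the same.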

The Kansas City Chiefs' success from 2018-2020 illustrates this advantage. While Patrick Mahomes was on his rookie contract, the Chiefs could afford elite weapons like Tyreek Hill and Travis Kelce, plus a strong defense. After Mahomes' extension kicked in, the Chiefs had to make difficult choices, letting some players leave and finding value in other areas.

This pattern creates strategic implications for team building:

Teams with rookie QBs: Should maximize competitive window by spending aggressively at other positions
Teams extending QBs: Must find value contracts and draft well to maintain roster quality
Teams with expensive veteran QBs: Need extreme efficiency in non-QB spending

Championship Window Management

Teams face a critical decision when a rookie QB plays well: when to extend them? Waiting maximizes the rookie contract advantage but risks the QB demanding more in the open market. Extending early provides cost certainty but eliminates cap flexibility sooner. The optimal strategy depends on:

- QB's trajectory (is he still improving?)
- Current roster quality (can we win now?)
- Market conditions (are QB salaries rising fast?)
- Rookie contract details (4-year deal vs. 5th year option)

Teams increasingly extend QBs after Year 3 or 4, balancing these considerations.

Positional Spending Heat Map

To visualize team allocation strategies across all positions simultaneously, we can create a heat map showing each team's spending distribution. This allows us to identify teams with unusual allocation patterns that might represent contrarian strategies or inefficient spending.

R
Python

#| label: fig-position-heatmap-r
#| fig-cap: "Team positional spending patterns (% of salary cap)"
#| fig-width: 12
#| fig-height: 8
#| message: false
#| warning: false

# Create heatmap of positional spending
# Shows team-specific allocation strategies
spending_matrix <- team_position_spend %>%
  # Focus on major positions for cleaner visualization
  filter(position %in% c("QB", "RB", "WR", "TE", "OL", "DL", "LB", "CB", "S")) %>%

  # Reshape to wide format for heatmap
  select(team, position, pct_of_cap) %>%
  pivot_wider(names_from = position, values_from = pct_of_cap, values_fill = 0) %>%
  arrange(team)

# Sample 16 teams for cleaner visualization
# Full 32-team heatmap would be too dense
set.seed(42)
sample_teams <- sample(spending_matrix$team, 16)

spending_matrix %>%
  filter(team %in% sample_teams) %>%

  # Convert back to long format for ggplot
  pivot_longer(-team, names_to = "position", values_to = "pct") %>%

  ggplot(aes(x = position, y = team, fill = pct)) +

  # Tile geometry creates heatmap
  geom_tile(color = "white", linewidth = 0.5) +

  # Color scale: white (low) to dark blue (high spending)
  scale_fill_gradient2(
    low = "white",
    mid = "lightblue",
    high = "darkblue",
    midpoint = 10,  # Typical position spending around 10%
    name = "% of Cap"
  ) +

  # Add percentage text in each cell
  geom_text(aes(label = sprintf("%.1f", pct)), size = 3) +

  labs(
    title = "Team Positional Spending Patterns",
    subtitle = "Percentage of salary cap allocated to each position (Sample of 16 teams)",
    x = "Position",
    y = "Team"
  ) +

  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    axis.text.x = element_text(angle = 0, hjust = 0.5)
  )

#| label: fig-position-heatmap-py
#| fig-cap: "Team positional spending patterns - Python (% of salary cap)"
#| fig-width: 12
#| fig-height: 8
#| message: false
#| warning: false

# Create heatmap of positional spending
positions = ['QB', 'RB', 'WR', 'TE', 'OL', 'DL', 'LB', 'CB', 'S']

# Pivot to wide format for heatmap
spending_matrix = (team_position_spend
    .query("position in @positions")
    .pivot(index='team', columns='position', values='pct_of_cap')
    .fillna(0)
)

# Sample 16 teams for cleaner visualization
np.random.seed(42)
sample_teams = np.random.choice(spending_matrix.index, 16, replace=False)
spending_sample = spending_matrix.loc[sample_teams, positions]

# Create heatmap using seaborn
plt.figure(figsize=(12, 8))
sns.heatmap(spending_sample,
            annot=True,           # Show values in cells
            fmt='.1f',            # Format as one decimal place
            cmap='Blues',         # Color scheme
            cbar_kws={'label': '% of Cap'},
            linewidths=0.5)       # Add gridlines

plt.title('Team Positional Spending Patterns\n'
          'Percentage of salary cap allocated to each position (Sample of 16 teams)',
          fontsize=14, fontweight='bold', pad=20)
plt.xlabel('Position', fontsize=12)
plt.ylabel('Team', fontsize=12)
plt.tight_layout()
plt.show()

The heatmap visualization reveals interesting patterns in team construction. Some teams show heavy concentration in one or two positions (high-risk, high-reward strategies), while others spread spending more evenly (conservative, balanced approaches).

Looking across teams, we can identify different strategic archetypes:

Pass-First Teams: High spending at QB, WR, and OL (pass protection). Lower spending on RB and run-stopping positions. These teams bet on passing efficiency driving wins.

Balanced Teams: Relatively even distribution across positions. These teams don't make strong bets on specific positions being more valuable.

Defense-First Teams: Higher spending on DL, CB, and S. These teams bet on defense winning championships and controlling games.

Contrarian Teams: Unusual spending patterns that deviate from league norms. These might represent analytical insights (investing in undervalued positions) or strategic mistakes (overpaying positions with limited impact).
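One rough way to flag contrarian allocations is to measure how far each team's spending profile sits from league norms. The sketch below assumes a wide table of cap shares with one row per team and one column per position, like the `spending_matrix` built for the heat map; the team codes and numbers are invented for illustration, and the root-mean-square z-score is just one of several reasonable distance measures.

```python
import numpy as np
import pandas as pd

def allocation_deviation(spending: pd.DataFrame) -> pd.Series:
    """Root-mean-square z-score of each team's positional cap shares.

    Larger values mean the team's allocation is further from the league
    average, position by position.
    """
    league_mean = spending.mean(axis=0)
    # Guard against positions where every team spends the same share.
    league_sd = spending.std(axis=0, ddof=0).replace(0, np.nan)
    z = (spending - league_mean) / league_sd
    return np.sqrt((z ** 2).mean(axis=1)).sort_values(ascending=False)

# Hypothetical cap shares (% of cap) for four made-up teams.
toy_spending = pd.DataFrame(
    {"QB": [15, 14, 16, 3], "RB": [4, 5, 3, 12], "WR": [12, 13, 11, 12]},
    index=["AAA", "BBB", "CCC", "DDD"],
)

deviation = allocation_deviation(toy_spending)
print(deviation.round(2))
```

A high score only says that a team is unusual. Whether the unusual allocation reflects insight or a mistake still has to be judged against results and context.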

The key question for analysts: which spending patterns correlate with winning? Teams that can identify undervalued positions and allocate accordingly gain systematic advantages in roster construction.

Spending Allocation and Team Success

Research on the relationship between position spending and winning shows:

1. **QB spending**: Weak correlation with wins because it's endogenous: teams pay QBs based on past performance, so good QBs cost more but aren't necessarily better values.
2. **Pass rush spending**: Positive correlation with wins. Elite pass rushers are rare and valuable, justifying high investment.
3. **RB spending**: Negative or neutral correlation with wins. Suggests RBs may be systematically overvalued.
4. **OL spending**: Positive correlation with wins, especially pass protection. Often undervalued relative to impact.
5. **CB spending**: Positive correlation with wins. Elite coverage allows aggressive pass rushing.

These patterns suggest market inefficiencies teams can exploit through contrarian position allocation.
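A first pass at this kind of analysis can be sketched as a simple correlation between each position's cap share and team wins. The example below assumes a wide table of cap shares per team plus a matching series of win totals; both are fabricated toy values here, and a real analysis would need several seasons of data and some way to handle the endogeneity noted above.

```python
import pandas as pd

def spend_win_correlations(spending: pd.DataFrame, wins: pd.Series) -> pd.Series:
    """Pearson correlation between each position's cap share and team wins."""
    return spending.corrwith(wins).sort_values(ascending=False)

# Hypothetical teams, cap shares (% of cap), and win totals.
teams = ["A", "B", "C", "D", "E", "F"]
toy_spending = pd.DataFrame(
    {"OL": [18, 16, 15, 13, 12, 10], "RB": [3, 4, 5, 6, 8, 9]},
    index=teams,
)
toy_wins = pd.Series([12, 10, 9, 7, 5, 4], index=teams, name="wins")

correlations = spend_win_correlations(toy_spending, toy_wins)
print(correlations.round(3))
```

The toy numbers were chosen so that OL spending rises with wins and RB spending falls, which mirrors the direction of the patterns listed above without claiming anything about their real size. A correlation of this kind describes association only and cannot by itself show that shifting dollars between positions would change results.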

Market Value Models

Building statistical models to predict market value allows us to identify systematically over- or under-paid players. These models use player characteristics (age, experience, position) and performance metrics to estimate expected contract value. Players earning significantly less than expected represent value contracts, while those earning more than expected are potential overpays.

Market value modeling serves several purposes:

Contract negotiations: Provides objective basis for offer amounts
Value identification: Highlights players providing excess value
Market tracking: Shows how positional values evolve over time
Decision support: Informs extension timing and free agent targeting

The key challenge is separating player quality from market conditions. A player might be "overpaid" not because he's bad, but because he signed at market peak or had leverage from multiple bidders. Conversely, "value contracts" often reflect signing timing (before breakout) rather than analytical genius.

Building a Market Value Model

We start with a simple linear regression model predicting log(APY) based on player characteristics. The logarithmic transformation helps because contract values are strongly right-skewed and roughly log-normal (most players earn modest amounts, few earn enormous amounts), so the log scale brings the response closer to the symmetric errors that linear regression assumes.

$$ \log(\text{APY}_i) = \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Age}^2_i + \beta_3 \text{Experience}_i + \beta_4 \text{Guaranteed}_i + \epsilon_i $$

The model includes age and age-squared to capture non-linear age effects (value peaks in late 20s, then declines). Experience captures market seniority effects separate from age. Guaranteed percentage proxies for player quality—better players negotiate higher guarantees.
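The quadratic age terms imply a specific peak age, which can be derived directly from the model. Treating experience and guarantees as fixed (a simplification, since experience rises with age in practice), differentiate with respect to age and set the slope to zero:

$$ \frac{\partial \log(\text{APY}_i)}{\partial \text{Age}_i} = \beta_1 + 2\beta_2 \text{Age}_i = 0 \quad \Rightarrow \quad \text{Age}^{*} = -\frac{\beta_1}{2\beta_2} $$

This is a maximum only when $\beta_2 < 0$, which is the inverted-U shape the model is meant to capture. As a purely illustrative case, $\beta_1 = 0.58$ and $\beta_2 = -0.01$ would give $\text{Age}^{*} = 0.58 / 0.02 = 29$.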

R
Python

#| label: market-value-model-r
#| message: false
#| warning: false

# Build market value model for quarterbacks
# QBs chosen because position is homogeneous (same role across teams)
# and market is liquid (frequent transactions provide price discovery)

qb_model_data <- contracts %>%
  filter(position == "QB") %>%
  mutate(
    # Log transform APY for better model fit
    # Log transformation handles right-skewed distribution
    log_apy = log(apy),

    # Age-squared term captures non-linear age effects
    # Peak performance typically late 20s, decline after 30
    age_squared = age^2
  )

# Fit linear model
# Predicts log(APY) from age, experience, guarantees
qb_market_model <- lm(
  log_apy ~ age + age_squared + experience + guaranteed,
  data = qb_model_data
)

# Display model summary
# Look for significant coefficients and R-squared
summary(qb_market_model)

# Generate predictions and calculate value over market
qb_model_data <- qb_model_data %>%
  mutate(
    # Predicted log(APY) from model
    # newdata keeps one prediction per row even if lm() dropped rows with NAs
    predicted_log_apy = predict(qb_market_model, newdata = qb_model_data),

    # Transform back to dollar scale
    predicted_apy = exp(predicted_log_apy),

    # Difference between actual and predicted (in dollars)
    value_over_market = apy - predicted_apy,

    # Percentage difference (easier to interpret)
    value_pct = (apy - predicted_apy) / predicted_apy * 100
  )

cat("\nModel Performance:\n")
cat("R-squared:", summary(qb_market_model)$r.squared, "\n")
cat("RMSE:", sqrt(mean((qb_model_data$log_apy - qb_model_data$predicted_log_apy)^2)), "\n")

#| label: market-value-model-py
#| message: false
#| warning: false

from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error

# Build market value model for QBs
qb_model_data = contracts[contracts['position'] == 'QB'].copy()

# Transform and create features
qb_model_data['log_apy'] = np.log(qb_model_data['apy'])
qb_model_data['age_squared'] = qb_model_data['age'] ** 2

# Prepare feature matrix and target
X = qb_model_data[['age', 'age_squared', 'experience', 'guaranteed']]
y = qb_model_data['log_apy']

# Fit linear regression model
qb_market_model = LinearRegression()
qb_market_model.fit(X, y)

# Generate predictions
qb_model_data['predicted_log_apy'] = qb_market_model.predict(X)
qb_model_data['predicted_apy'] = np.exp(qb_model_data['predicted_log_apy'])
qb_model_data['value_over_market'] = (qb_model_data['apy'] -
                                       qb_model_data['predicted_apy'])
qb_model_data['value_pct'] = (qb_model_data['value_over_market'] /
                               qb_model_data['predicted_apy'] * 100)

# Model performance metrics
r2 = r2_score(y, qb_model_data['predicted_log_apy'])
rmse = np.sqrt(mean_squared_error(y, qb_model_data['predicted_log_apy']))

print(f"\nQB Market Value Model:")
print(f"R-squared: {r2:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"\nCoefficients:")
for feat, coef in zip(X.columns, qb_market_model.coef_):
    print(f"  {feat}: {coef:.4f}")
print(f"  Intercept: {qb_market_model.intercept_:.4f}")

The model reveals several interesting patterns in QB market pricing. The age coefficient shows how age affects market value, while age-squared captures the inverted-U shape of age curves. Experience has a separate effect from age—two 28-year-olds with different NFL experience might command different salaries.
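Because the response is log(APY), the coefficients are easier to read once converted to percentage effects. The sketch below shows one way to do that. The coefficient values are hypothetical placeholders chosen for illustration, not output from the fitted model, and `guaranteed` is assumed to be stored as a 0-1 fraction.

```python
import math

def pct_change_in_apy(coef: float, delta: float = 1.0) -> float:
    """Percent change in predicted APY when a predictor rises by `delta`,
    for a model fitted on log(APY) with other predictors held fixed."""
    return (math.exp(coef * delta) - 1) * 100

def age_slope(b_age: float, b_age_sq: float, age: float) -> float:
    """Slope of log(APY) in age at a given age: b_age + 2 * b_age_sq * age."""
    return b_age + 2 * b_age_sq * age

# Hypothetical coefficients, chosen only for illustration (not fitted values).
b_age, b_age_sq, b_guaranteed = 0.58, -0.01, 1.2

# Approximate effect of one more year of age, evaluated at three ages.
for age in (24, 29, 34):
    slope = age_slope(b_age, b_age_sq, age)
    print(f"age {age}: {pct_change_in_apy(slope):+.1f}% per year")

# Effect of a guarantee share that is 10 percentage points higher
# (delta = 0.10 because `guaranteed` is assumed to be a 0-1 fraction).
print(f"{pct_change_in_apy(b_guaranteed, delta=0.10):+.1f}% per 10 points guaranteed")
```

With a quadratic in age, there is no single "age effect": the sign and size of the slope depend on where the player sits on the curve, which is why the example evaluates it at several ages.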

The R-squared value indicates how much of contract variation the model explains. A moderate R-squared (0.3-0.5) suggests other factors beyond age, experience, and guarantees drive contracts—things like recent performance, playoff success, team cap situations, and negotiation leverage. This residual variation is where we find value contracts and overpays.

One limitation of this simple model: it doesn't include performance metrics. A more sophisticated approach would incorporate EPA per play, completion percentage over expectation, or other objective performance measures. However, even this simple model provides useful insights by establishing baseline market expectations.

Causality vs. Correlation in Value Models

Market value models identify correlations, not causal relationships. A player earning less than predicted isn't necessarily a "value"—they might have injury concerns, character issues, or other factors not in the model that justify lower compensation. Similarly, "overpays" might reflect factors the model misses: leadership value, playoff experience, fit with specific schemes, or competitive bidding situations. Use model residuals as starting points for investigation, not final conclusions. Combine quantitative analysis with qualitative evaluation and contextual understanding.

Identifying Value Contracts and Overpays

The residuals from our market value model—the difference between actual and predicted APY—identify players earning significantly more or less than market expectations. Large negative residuals indicate value contracts (player earning less than expected), while large positive residuals indicate potential overpays.

R
Python

#| label: value-contracts-r
#| message: false
#| warning: false

# Identify biggest value contracts (most underpaid)
# Negative value_pct means earning less than predicted
value_contracts <- qb_model_data %>%
  filter(!is.na(value_pct)) %>%
  arrange(value_pct) %>%
  select(player, team, age, apy, predicted_apy, value_over_market, value_pct) %>%
  head(10)

# Identify biggest overpays (most overpaid)
# Positive value_pct means earning more than predicted
overpays <- qb_model_data %>%
  filter(!is.na(value_pct)) %>%
  arrange(desc(value_pct)) %>%
  select(player, team, age, apy, predicted_apy, value_over_market, value_pct) %>%
  head(10)

# Display value contracts in formatted table
value_contracts %>%
  gt() %>%
  cols_label(
    player = "Player",
    team = "Team",
    age = "Age",
    apy = "Actual APY",
    predicted_apy = "Expected APY",
    value_over_market = "Value Over Market",
    value_pct = "% Over Market"
  ) %>%
  fmt_currency(
    columns = c(apy, predicted_apy, value_over_market),
    decimals = 0
  ) %>%
  fmt_number(
    columns = value_pct,
    decimals = 1
  ) %>%
  tab_header(
    title = "Top 10 Value QB Contracts",
    subtitle = "Most underpaid relative to market model"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightgreen"),
    locations = cells_body(columns = value_pct)
  )

#| label: value-contracts-py
#| message: false
#| warning: false

# Identify value contracts (underpaid players)
value_contracts = (qb_model_data
    .dropna(subset=['value_pct'])
    .nsmallest(10, 'value_pct')
    [['player', 'team', 'age', 'apy', 'predicted_apy',
      'value_over_market', 'value_pct']]
)

# Identify overpays (overpaid players)
overpays = (qb_model_data
    .dropna(subset=['value_pct'])
    .nlargest(10, 'value_pct')
    [['player', 'team', 'age', 'apy', 'predicted_apy',
      'value_over_market', 'value_pct']]
)

print("\nTop 10 Value QB Contracts (Underpaid):")
print("="*90)
for _, row in value_contracts.iterrows():
    print(f"{row['player']:>10} ({row['team']}) | "
          f"Age: {row['age']:>2} | "
          f"Actual: ${row['apy']/1e6:>5.1f}M | "
          f"Expected: ${row['predicted_apy']/1e6:>5.1f}M | "
          f"Value: {row['value_pct']:>6.1f}%")

Value contracts represent competitive advantages. A quarterback earning $20M below market rate provides the team with $20M in cap space to invest elsewhere—roughly enough to sign an elite pass rusher or top cornerback. This is why teams with rookie quarterbacks have such an advantage: they're getting elite QB play at a fraction of market cost.

The Buffalo Bills from 2018-2020 provide a clear example. By 2020, Josh Allen on his rookie contract ($6M APY) was producing like a $35M QB, providing roughly $29M in excess value. The Bills used this excess cap space to build a strong defense and surround Allen with weapons, reaching the AFC Championship Game in the 2020 season. Once Allen's extension kicked in at $43M APY, the Bills faced tougher roster decisions.

Overpays also matter strategically. A quarterback earning $15M more than expected represents $15M that can't be spent improving the roster. However, some "overpays" are strategic necessity—if a team has a franchise QB, they must pay market rate (or above) to retain him, because replacing QB play is extremely difficult.

Finding Market Inefficiencies

Value contracts typically arise from five scenarios:

1. **Timing**: Player signed before breakout performance. Example: Josh Allen extended after Year 3, before MVP-caliber Year 4.
2. **Rookie contracts**: Slotted values bear no relationship to actual production. Elite QBs on rookie deals provide massive value.
3. **Market conditions**: Players signing when QB market is depressed, or when no other teams bid aggressively.
4. **Injury recovery**: Player returning from injury signs "prove-it" deal at discount, then performs at previous level.
5. **Positional inefficiency**: Player at a systematically undervalued position (for example, offensive line, where impact is harder to measure).

Teams can exploit these patterns by: extending players early (before market-setting deals), drafting well at valuable positions, and targeting specific free agent profiles (injury recovery, scheme fits).

Market Efficiency Analysis

Examining market efficiency across positions reveals whether certain position markets are systematically mispriced. If models consistently over-predict or under-predict certain positions, it suggests the market values those positions differently than observable characteristics would predict.

This analysis has important strategic implications. If analytics suggest interior defensive linemen are undervalued relative to their impact on winning, teams should invest more heavily at that position. If running backs are overvalued, teams should minimize RB spending and find cheap alternatives.
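One way to check for systematic over- or under-prediction is to fit a single pooled model across positions and then average the residuals within each position. The sketch below does this with ordinary least squares on simulated data; the column names follow the chapter's `contracts` table, but every value is generated, and the built-in QB premium exists only so the example has something to detect.

```python
import numpy as np
import pandas as pd

def positional_residuals(df: pd.DataFrame) -> pd.Series:
    """Fit one pooled log(APY) model and return the mean log-residual per position.

    Positive values mean the position is paid more than age, experience,
    and guarantees alone would predict.
    """
    features = df[["age", "experience", "guaranteed"]].to_numpy(dtype=float)
    X = np.column_stack([np.ones(len(df)), features])  # add intercept column
    y = np.log(df["apy"].to_numpy(dtype=float))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return (pd.Series(resid, index=df.index)
              .groupby(df["position"])
              .mean()
              .sort_values())

# Simulated data (not real contracts) with a built-in premium for QBs.
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "position": np.repeat(["QB", "RB"], n // 2),
    "age": rng.integers(22, 35, size=n),
    "guaranteed": rng.uniform(0.2, 0.9, size=n),
})
toy["experience"] = toy["age"] - 22 + rng.integers(0, 3, size=n)
premium = np.where(toy["position"] == "QB", 1.0, 0.0)
toy["apy"] = np.exp(14 + 0.02 * toy["age"] + 0.5 * toy["guaranteed"]
                    + premium + rng.normal(0, 0.2, size=n))

mean_resid = positional_residuals(toy)
print(mean_resid.round(3))
```

A nonzero positional residual shows only that the market prices the position differently from what the pooled characteristics predict. Deciding whether that gap is a mispricing requires comparing it with the position's actual contribution to winning.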

Positional Market Efficiency

We can build market value models for each position and compare model fit. Positions where models fit poorly suggest more market noise and potential opportunities for analytical advantage.

R
Python

#| label: fig-market-efficiency-r
#| fig-cap: "Market value vs actual value by position"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

# Build market value models for each major position
# Compare model fit across positions to identify market efficiency
position_models <- contracts %>%
  # Focus on positions with sufficient sample size and clear roles
  filter(position %in% c("QB", "WR", "RB", "OL", "DL", "LB", "CB")) %>%

  # Log-transform APY for modeling
  mutate(log_apy = log(apy)) %>%

  # Group by position to build position-specific models
  group_by(position) %>%

  # Only model positions with 20+ observations
  filter(n() >= 20) %>%

  # Nest data for each position
  nest() %>%

  # Fit model for each position
  mutate(
    # Linear regression: log(APY) ~ age + experience + guarantees
    model = map(data, ~lm(log_apy ~ age + experience + guaranteed, data = .x)),

    # Generate predictions for each position
    predictions = map2(model, data, ~{
      .y %>%
        mutate(
          # newdata keeps predictions aligned with rows even if lm() dropped NAs
          predicted_log_apy = predict(.x, newdata = .y),
          predicted_apy = exp(predicted_log_apy),
          residual = apy - predicted_apy
        )
    })
  ) %>%

  # Unnest results
  select(position, predictions) %>%
  unnest(predictions)

# Visualize actual vs predicted for each position
# Points along diagonal indicate good model fit
# Scatter indicates market inefficiency or unmeasured factors
position_models %>%
  ggplot(aes(x = predicted_apy, y = apy)) +

  # Scatter plot of actual vs predicted
  geom_point(alpha = 0.5, color = "steelblue") +

  # Diagonal line represents perfect prediction
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "red") +

  # Separate panel for each position
  facet_wrap(~position, scales = "free", ncol = 3) +

  # Format axes as millions
  scale_x_continuous(labels = scales::dollar_format(scale = 1e-6, suffix = "M")) +
  scale_y_continuous(labels = scales::dollar_format(scale = 1e-6, suffix = "M")) +

  labs(
    title = "Market Value Model Fit by Position",
    subtitle = "Actual APY vs Predicted APY (red line = perfect prediction)",
    x = "Predicted APY",
    y = "Actual APY"
  ) +

  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    strip.text = element_text(face = "bold")
  )

#| label: fig-market-efficiency-py
#| fig-cap: "Market value vs actual value by position - Python"
#| fig-width: 10
#| fig-height: 8
#| message: false
#| warning: false

# Build position-specific market models
positions_to_model = ['QB', 'WR', 'RB', 'OL', 'DL', 'LB', 'CB']
position_predictions = []

for pos in positions_to_model:
    # Filter to position
    pos_data = contracts[contracts['position'] == pos].copy()

    # Skip if insufficient sample
    if len(pos_data) < 20:
        continue

    # Prepare features
    pos_data['log_apy'] = np.log(pos_data['apy'])
    X = pos_data[['age', 'experience', 'guaranteed']]
    y = pos_data['log_apy']

    # Fit model
    model = LinearRegression()
    model.fit(X, y)

    # Generate predictions
    pos_data['predicted_log_apy'] = model.predict(X)
    pos_data['predicted_apy'] = np.exp(pos_data['predicted_log_apy'])
    pos_data['residual'] = pos_data['apy'] - pos_data['predicted_apy']

    position_predictions.append(pos_data)

# Combine all position predictions
all_predictions = pd.concat(position_predictions, ignore_index=True)

# Create subplot for each position
fig, axes = plt.subplots(3, 3, figsize=(12, 10))
axes = axes.flatten()

for i, pos in enumerate(positions_to_model):
    if i >= len(axes):
        break

    pos_data = all_predictions[all_predictions['position'] == pos]

    # Scatter plot
    axes[i].scatter(pos_data['predicted_apy']/1e6, pos_data['apy']/1e6,
                    alpha=0.5, color='steelblue')

    # Perfect prediction line
    min_val = min(pos_data['predicted_apy'].min(), pos_data['apy'].min()) / 1e6
    max_val = max(pos_data['predicted_apy'].max(), pos_data['apy'].max()) / 1e6
    axes[i].plot([min_val, max_val], [min_val, max_val],
                 'r--', alpha=0.7, label='Perfect prediction')

    axes[i].set_xlabel('Predicted APY ($M)', fontsize=9)
    axes[i].set_ylabel('Actual APY ($M)', fontsize=9)
    axes[i].set_title(pos, fontsize=11, fontweight='bold')
    axes[i].grid(True, alpha=0.3)

# Remove extra subplots
for i in range(len(positions_to_model), len(axes)):
    fig.delaxes(axes[i])

plt.suptitle('Market Value Model Fit by Position\n'
             'Actual APY vs Predicted APY (red line = perfect prediction)',
             fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

These visualizations reveal interesting differences in market efficiency across positions. Some positions show tight clustering around the diagonal (good model fit), while others show more scatter (poor model fit).

Positions with good model fit suggest efficient markets: contracts are predictable from the model's inputs of age, experience, and guarantee level (a rough proxy for negotiating leverage). Positions with poor model fit suggest either (1) important unmeasured factors affecting contracts, or (2) market inefficiency creating opportunity.

For example, if offensive line contracts show high scatter, it might indicate teams struggle to value OL properly, creating opportunities for teams with superior OL evaluation. If quarterback contracts cluster tightly, it suggests the QB market is efficient with less room for analytical arbitrage.

The residual patterns also matter. If all underpaid players at a position are young and overpaid players are old, it suggests the market undervalues youth. If a specific team consistently gets value contracts at a position, they might have superior evaluation or development.
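One way to probe these residual patterns is to average residuals by age band and by team. The sketch below is a minimal illustration on a small synthetic table: the values, team codes, and the `residual_summary` helper are assumptions made for the example, not results. With real data, the same helper could be applied to a frame such as `all_predictions`, provided it carries `age`, `team`, `predicted_apy`, and `residual` columns.

```python
import pandas as pd

def residual_summary(df, by):
    """Mean residual, mean percent residual, and count per group (hypothetical helper)."""
    return (
        df.assign(pct_resid=df["residual"] / df["predicted_apy"] * 100)
          .groupby(by, observed=True)
          .agg(mean_residual=("residual", "mean"),
               mean_pct_resid=("pct_resid", "mean"),
               n=("residual", "size"))
          .reset_index()
    )

# Synthetic illustration only: residual = actual APY - predicted APY, in dollars
demo = pd.DataFrame({
    "team": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC"],
    "age": [23, 31, 24, 29, 26, 33],
    "predicted_apy": [4e6, 8e6, 5e6, 10e6, 6e6, 7e6],
    "residual": [-1e6, 2e6, -5e5, 1e6, 0.0, 1.4e6],
})

# Left-closed bins so the labels match the ages they contain
demo["age_band"] = pd.cut(demo["age"], bins=[0, 25, 30, 100], right=False,
                          labels=["<25", "25-29", "30+"])

by_age = residual_summary(demo, "age_band")
by_team = residual_summary(demo, "team")
print(by_age)
print(by_team)
```

A negative mean residual marks a group paid below the model's prediction. Small groups will produce noisy averages, so the `n` column should be read alongside the means.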

Position Market Premium

Beyond modeling individual contracts, we can quantify the market premium for each position while controlling for player characteristics. This reveals which positions command the highest salaries independent of age, experience, or guarantees.

R
Python

#| label: position-premium-r
#| message: false
#| warning: false

# Calculate market premium by position controlling for age and experience
# Uses regression with position dummy variables
position_premium <- contracts %>%
  filter(position %in% c("QB", "WR", "RB", "TE", "OL", "DL", "LB", "CB", "S")) %>%
  mutate(log_apy = log(apy))

# Fit model with position indicators
# Position coefficients show premium relative to reference position
premium_model <- lm(
  log_apy ~ position + age + experience + guaranteed,
  data = position_premium
)

# Extract and interpret position effects
position_effects <- broom::tidy(premium_model) %>%
  filter(str_detect(term, "position")) %>%
  mutate(
    # Clean position names
    position = str_remove(term, "position"),

    # Convert log coefficient to percentage premium
    # exp(coef) - 1 gives percentage change
    premium_pct = (exp(estimate) - 1) * 100
  ) %>%

  # Add reference category (omitted from regression)
  bind_rows(
    tibble(
      term = "position(Reference)",
      position = "CB",  # Reference category
      estimate = 0,
      premium_pct = 0
    )
  ) %>%
  arrange(desc(premium_pct))

# Display position premiums
position_effects %>%
  select(position, premium_pct) %>%
  gt() %>%
  cols_label(
    position = "Position",
    premium_pct = "Market Premium (%)"
  ) %>%
  fmt_number(
    columns = premium_pct,
    decimals = 1
  ) %>%
  tab_header(
    title = "Position Market Premium",
    subtitle = "% premium over reference position (controlling for age, experience, guarantees)"
  )

#| label: position-premium-py
#| message: false
#| warning: false

from sklearn.linear_model import LinearRegression

# Filter to positions to analyze
positions_to_analyze = ['QB', 'WR', 'RB', 'TE', 'OL', 'DL', 'LB', 'CB', 'S']
position_premium = contracts[contracts['position'].isin(positions_to_analyze)].copy()
position_premium['log_apy'] = np.log(position_premium['apy'])

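# Create dummy variables for positions
# drop_first=True omits the alphabetically first position so that it serves
# as the reference category, matching R's default factor coding. Without it,
# the dummies are collinear with the intercept and the coefficients cannot
# be read as premiums over a reference position.
reference_pos = sorted(position_premium['position'].unique())[0]
position_dummies = pd.get_dummies(
    position_premium['position'], prefix='pos', drop_first=True
)

# Combine position dummies with other features
X = pd.concat([
    position_dummies,
    position_premium[['age', 'experience', 'guaranteed']]
], axis=1)
y = position_premium['log_apy']

# Fit regression model
premium_model = LinearRegression()
premium_model.fit(X, y)

# Extract position effects from coefficients
# Position dummies come first in X, so coef_[i] lines up with position_cols[i]
position_cols = [col for col in X.columns if col.startswith('pos_')]

# The reference position has a premium of zero by construction
position_effects = [
    {'position': reference_pos, 'coefficient': 0.0, 'premium_pct': 0.0}
]

for i, col in enumerate(position_cols):
    pos_name = col.replace('pos_', '')
    coef = premium_model.coef_[i]

    # Convert log coefficient to percentage
    premium_pct = (np.exp(coef) - 1) * 100

    position_effects.append({
        'position': pos_name,
        'coefficient': coef,
        'premium_pct': premium_pct
    })

# Sort by premium
position_effects_df = (pd.DataFrame(position_effects)
    .sort_values('premium_pct', ascending=False)
)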

print("\nPosition Market Premium:")
print("(% premium controlling for age, experience, guarantees)")
print("="*50)
for _, row in position_effects_df.iterrows():
    print(f"{row['position']:>3} | {row['premium_pct']:>+6.1f}%")

This analysis quantifies exactly how much premium each position commands after accounting for player age, experience, and guarantees. A position with a +50% premium means players at that position earn 50% more than the reference position, all else equal.
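The conversion from a log-scale coefficient to a percentage premium can be checked with a few lines of arithmetic. The sketch below is only a worked illustration; the coefficient values are arbitrary examples, not estimates from the contract data, and `log_coef_to_pct` is a hypothetical helper name.

```python
import math

def log_coef_to_pct(coef):
    """Convert a coefficient from a log-outcome regression to a percent change."""
    return (math.exp(coef) - 1) * 100

# Arbitrary example coefficients, chosen to show how the mapping behaves
for coef in (0.10, math.log(1.5), math.log(2.0), -0.22):
    print(f"coef = {coef:+.3f} -> premium = {log_coef_to_pct(coef):+.1f}%")
```

For small coefficients the percentage is close to the coefficient times 100, but the gap widens quickly: a coefficient of about 0.405 corresponds to a 50% premium, and about 0.693 to a doubling.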

The results typically show:

Quarterbacks: Massive premium (100%+) over other positions
Pass rushers/LTs: Significant premium (30-50%) reflecting scarcity
WR/CB: Moderate premium (10-30%) for pass-game importance
RB/LB: Neutral or negative premium despite traditional importance

These premiums reveal market consensus about positional value. Teams can gain advantages by identifying gaps between market premiums and actual value to wins. If the market pays a 40% premium for position X, but analytics show position Y contributes equally to winning, smart teams invest more in position Y.

Cap Space Allocation Optimization

The ultimate application of salary cap analytics is roster optimization: constructing the best possible team within cap constraints. This is a classic constrained optimization problem—maximize expected performance subject to budget and roster constraints.

While real NFL roster construction involves many factors beyond raw optimization (chemistry, coaching fit, injury risk, future considerations), the optimization framework provides a useful baseline for evaluating allocation decisions.

Portfolio Optimization Approach

We can frame roster construction as a portfolio optimization problem of the kind familiar from finance. Instead of assets (stocks, bonds), we select players. Instead of returns, we maximize expected performance (EPA). Instead of an investment budget, the binding constraint is the salary cap.

The objective function:

$$ \max_{\mathbf{x}} \sum_{i=1}^{N} x_i \cdot \text{EPA}_i $$

Subject to constraints:

$$ \sum_{i=1}^{N} x_i \cdot \text{Salary}_i \leq \text{Cap} $$

$$ \sum_{i \in P} x_i = \text{Required}_P \quad \forall P \in \text{Positions} $$

Where $x_i \in \{0, 1\}$ indicates whether to select player $i$. Additional constraints would include roster size limits, position minimums/maximums, and possibly chemistry or scheme fit requirements.
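To make the formulation concrete before turning to a solver, the sketch below enumerates every candidate roster for a tiny pool and keeps the best feasible one. The six players, their cap hits, and their EPA projections are invented for illustration, and `best_roster` is a hypothetical helper. Exhaustive search is only workable at this toy scale; it is shown because it maps directly onto the objective and constraints above.

```python
from itertools import combinations

# Hypothetical pool: (name, position, cap hit in $M, projected EPA)
players = [
    ("QB_A", "QB", 30.0, 60.0),
    ("QB_B", "QB", 5.0, 35.0),
    ("WR_A", "WR", 20.0, 30.0),
    ("WR_B", "WR", 8.0, 22.0),
    ("WR_C", "WR", 2.0, 10.0),
    ("WR_D", "WR", 12.0, 25.0),
]
required = {"QB": 1, "WR": 2}   # Required_P for each position P
cap = 40.0                      # Budget in $M

def best_roster(players, required, cap):
    """Return the feasible roster with maximum total EPA, by enumeration."""
    size = sum(required.values())
    best, best_epa = None, float("-inf")
    for combo in combinations(players, size):
        # Position constraint: counts must equal the requirement exactly
        counts = {}
        for _, pos, _, _ in combo:
            counts[pos] = counts.get(pos, 0) + 1
        if counts != required:
            continue
        # Budget constraint: total salary must not exceed the cap
        if sum(p[2] for p in combo) > cap:
            continue
        epa = sum(p[3] for p in combo)
        if epa > best_epa:
            best, best_epa = combo, epa
    return best, best_epa

roster, total_epa = best_roster(players, required, cap)
print([p[0] for p in roster], total_epa)
```

In this toy pool the expensive quarterback only fits alongside the two cheapest receivers, which is the kind of trade-off the constraints are meant to expose.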

R
Python

#| label: roster-optimization-r
#| message: false
#| warning: false
#| eval: false

# Simplified roster optimization example
# Real NFL optimization would include many more constraints and considerations
# This demonstrates the basic approach

library(lpSolve)

# Create player pool for starting offense
# In practice, would use much larger pool with real performance data
player_pool <- contracts %>%
  filter(position %in% c("QB", "RB", "WR", "TE", "OL")) %>%
  sample_n(50) %>%
  mutate(
    # Simulate EPA contribution
    # Better values for cheaper players (creating value)
    # In practice, use actual EPA data from play-by-play
    epa_contribution = rnorm(n(), mean = 20, sd = 10) *
                       (mean(cap_hit_2024) / cap_hit_2024)
  )

# Define optimization problem
n_players <- nrow(player_pool)

# Objective: maximize total EPA contribution
objective <- player_pool$epa_contribution

# Budget constraint
budget_constraint <- player_pool$cap_hit_2024

# Position requirements (simplified 11-man offense)
# 1 QB, 2 RB, 3 WR, 1 TE, 4 OL
position_constraints <- rbind(
  as.numeric(player_pool$position == "QB"),
  as.numeric(player_pool$position == "RB"),
  as.numeric(player_pool$position == "WR"),
  as.numeric(player_pool$position == "TE"),
  as.numeric(player_pool$position == "OL")
)

# Combine constraints into matrix
const_mat <- rbind(
  budget_constraint,
  position_constraints
)

# Constraint directions and right-hand sides
const_dir <- c("<=", "=", "=", "=", "=", "=")
const_rhs <- c(
  50e6,  # $50M budget for these positions
  1,     # 1 QB
  2,     # 2 RB
  3,     # 3 WR
  1,     # 1 TE
  4      # 4 OL
)

# Solve integer programming problem
solution <- lp(
  "max",              # Maximize objective
  objective,          # EPA contribution
  const_mat,          # Constraint matrix
  const_dir,          # Constraint directions
  const_rhs,          # Constraint values
  all.bin = TRUE      # Binary variables (select or not)
)

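# Extract selected players
# status 0 means lp() found an optimal solution; a random 50-player pool
# may lack enough players at a position, making the problem infeasible
if (solution$status != 0) stop("No feasible lineup for this pool and budget")
selected_players <- player_pool[solution$solution > 0.5, ]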

cat("Optimal Lineup:\n")
cat("Total EPA:", solution$objval, "\n")
cat("Total Cost:", scales::dollar(sum(selected_players$cap_hit_2024)), "\n\n")

# Display lineup
selected_players %>%
  select(player, position, team, cap_hit_2024, epa_contribution) %>%
  arrange(position) %>%
  print()

#| label: roster-optimization-py
#| message: false
#| warning: false

from scipy.optimize import linprog

# Simplified roster optimization
# Demonstrates approach, not production-ready system
np.random.seed(42)

# Create player pool
player_pool = contracts[
    contracts['position'].isin(['QB', 'RB', 'WR', 'TE', 'OL'])
].sample(50).copy()

# Simulate EPA contribution (inverse to cost creates value opportunities)
player_pool['epa_contribution'] = (
    np.random.normal(20, 10, len(player_pool)) *
    (player_pool['cap_hit_2024'].mean() / player_pool['cap_hit_2024'])
)

# Objective: maximize EPA (negative for minimization algorithm)
c = -player_pool['epa_contribution'].values

# Inequality constraint: budget
A_ub = [player_pool['cap_hit_2024'].values]
b_ub = [50e6]  # $50M budget

# Equality constraints: position requirements
A_eq = [
    (player_pool['position'] == 'QB').astype(int).values,
    (player_pool['position'] == 'RB').astype(int).values,
    (player_pool['position'] == 'WR').astype(int).values,
    (player_pool['position'] == 'TE').astype(int).values,
    (player_pool['position'] == 'OL').astype(int).values,
]
b_eq = [1, 2, 3, 1, 4]  # Position requirements

# Bounds keep each decision variable in [0, 1]
bounds = [(0, 1) for _ in range(len(player_pool))]

# Solve as an integer program
# integrality=1 forces each variable to be an integer, so together with the
# bounds every x_i is binary (requires SciPy >= 1.9 and method='highs').
# Without it, linprog can return fractional selections, and rounding them
# may violate the budget or position constraints.
result = linprog(
    c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
    bounds=bounds, integrality=np.ones(len(player_pool)), method='highs'
)

if result.success:
    # Round only to clean up floating-point noise in the 0/1 solution
    selected = np.round(result.x)
    selected_players = player_pool[selected == 1].copy()

    print("\nOptimal Lineup:")
    print("="*70)
    print(f"Total EPA: {-result.fun:.1f}")
    print(f"Total Cost: ${selected_players['cap_hit_2024'].sum():,.0f}")
    print(f"\nSelected Players:")
    print("-"*70)

    # Display by position
    for pos in ['QB', 'RB', 'WR', 'TE', 'OL']:
        pos_players = selected_players[selected_players['position'] == pos]
        if len(pos_players) > 0:
            print(f"\n{pos}:")
            for _, p in pos_players.iterrows():
                print(f"  {p['player']:>12} | "
                      f"${p['cap_hit_2024']/1e6:>5.1f}M | "
                      f"EPA: {p['epa_contribution']:>5.1f}")

This optimization approach provides a quantitative framework for roster construction decisions. While simplified, it demonstrates how mathematical optimization can identify efficient allocations that maximize performance within budget constraints.

Real NFL roster optimization would include many additional considerations:

Multiple years: Optimize across 2-3 year window, not just current season
Uncertainty: Model player performance uncertainty and injury risk
Depth: Include requirements for backup players
Chemistry: Account for position combinations that work well together
Draft: Incorporate draft picks as cheap talent sources
Development: Model player improvement/decline trajectories

Despite these complexities, the basic framework remains valuable. It forces explicit articulation of goals (maximize performance), constraints (cap, roster spots), and trade-offs (spending here means not spending there).

Real-World Optimization Considerations

Production roster optimization systems must address several practical challenges:

1. **Roster constraints**: 53-man roster limits, practice squad, injured reserve, position flex
2. **Position flexibility**: Players who can play multiple positions increase optimization complexity
3. **Injury risk**: Stochastic modeling of injury probability and impact
4. **Chemistry**: Subjective factors like locker room leadership, coaching fit
5. **Draft integration**: Rookie contracts provide systematic value, must model draft choices
6. **Multi-year planning**: Current decisions affect future cap situations through dead money
7. **Performance uncertainty**: Players don't produce fixed EPA, need probabilistic models
8. **Market dynamics**: Other teams' actions affect free agent prices and availability

Advanced analytics groups use stochastic programming, Monte Carlo simulation, and machine learning to address these complexities while maintaining the optimization framework's discipline.
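As a rough illustration of the Monte Carlo idea, the sketch below simulates total EPA for a fixed lineup when each player's output is uncertain and injuries are possible. Every number here is an assumption made for the example (means, standard deviations, injury probabilities, and the share of EPA retained in an injured season); none comes from real player data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical four-player lineup
mean_epa = np.array([60.0, 22.0, 10.0, 15.0])   # expected EPA per player
sd_epa = np.array([15.0, 8.0, 5.0, 6.0])        # uncertainty around that expectation
p_injury = np.array([0.10, 0.20, 0.15, 0.25])   # assumed chance of an injury season
injured_share = 0.5                             # assumed share of EPA kept when injured

n_sims = 10_000
n_players = mean_epa.size

# Draw one EPA outcome per player per simulated season
epa = rng.normal(mean_epa, sd_epa, size=(n_sims, n_players))

# Apply injuries independently per player and season
injured = rng.random((n_sims, n_players)) < p_injury
epa = np.where(injured, epa * injured_share, epa)

total = epa.sum(axis=1)
print(f"Mean total EPA: {total.mean():.1f}")
print(f"5th to 95th percentile: {np.percentile(total, 5):.1f} to "
      f"{np.percentile(total, 95):.1f}")
```

Comparing candidate lineups on the lower percentiles as well as the mean is one simple way to fold risk into an otherwise deterministic optimization.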

Age and Contract Length Analysis

Player age fundamentally shapes contract structure and value. Understanding age curves—how performance changes with age—is essential for optimal contract length decisions. Sign a player too long into their decline years, and the team pays for diminished production. Sign too short, and the player leaves during prime years.

Age Curves by Position

Different positions age differently. Quarterbacks tend to maintain performance longer than running backs. Offensive linemen peak later than skill position players. These position-specific age curves should inform contract length decisions.

R
Python

#| label: fig-age-curves-r
#| fig-cap: "Contract APY by age and position"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false

# Visualize how contract value varies with age across positions
# Shows market perception of age-based value curves
contracts %>%
  filter(position %in% c("QB", "WR", "RB", "OL", "DL", "LB")) %>%
  filter(age >= 22, age <= 35) %>%

  ggplot(aes(x = age, y = apy / 1e6, color = position)) +

  # Individual contracts
  geom_point(alpha = 0.3) +

  # LOESS smoothing to show age trends
  # se = TRUE adds confidence bands
  geom_smooth(method = "loess", se = TRUE, linewidth = 1.2) +

  # Format y-axis as millions
  scale_y_continuous(labels = scales::dollar_format(suffix = "M")) +

  scale_color_brewer(palette = "Set2") +

  labs(
    title = "Contract Value by Age and Position",
    subtitle = "APY tends to peak in late 20s, then decline",
    x = "Age",
    y = "Average Annual Value",
    color = "Position"
  ) +

  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "right"
  )


#| label: fig-age-curves-py
#| fig-cap: "Contract APY by age and position - Python"
#| fig-width: 10
#| fig-height: 6
#| message: false
#| warning: false

positions = ['QB', 'WR', 'RB', 'OL', 'DL', 'LB']
plot_data = contracts[
    (contracts['position'].isin(positions)) &
    (contracts['age'] >= 22) &
    (contracts['age'] <= 35)
].copy()

plt.figure(figsize=(10, 6))

# Plot each position
for pos in positions:
    pos_data = plot_data[plot_data['position'] == pos]

    # Scatter plot
    plt.scatter(pos_data['age'], pos_data['apy'] / 1e6,
                alpha=0.3, label=pos, s=30)

    # Polynomial trend line
    if len(pos_data) > 10:
        z = np.polyfit(pos_data['age'], pos_data['apy'] / 1e6, 2)
        p = np.poly1d(z)
        age_range = np.linspace(pos_data['age'].min(), pos_data['age'].max(), 100)
        plt.plot(age_range, p(age_range), linewidth=2, alpha=0.7)

plt.xlabel('Age', fontsize=12)
plt.ylabel('Average Annual Value ($M)', fontsize=12)
plt.title('Contract Value by Age and Position\n'
          'APY tends to peak in late 20s, then decline',
          fontsize=14, fontweight='bold')
plt.legend(title='Position', loc='upper right')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

The age curves reveal fascinating patterns in how the market values aging. Most positions show peak contract values around ages 26-29, with decline thereafter. This reflects the balance between experience (which increases with age) and physical decline (which also increases with age).

Quarterbacks show the flattest age curve, maintaining high contract values into their mid-30s. This reflects QB positional uniqueness: the mental aspects of QB play (reading defenses, adjustments, leadership) improve with experience, offsetting physical decline.

Running backs show the steepest age curve, with values declining sharply after age 27. This reflects the physically punishing nature of the position and data showing RB production typically peaks early and declines rapidly.
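When a quadratic trend is fitted to APY by age, as in the Python plot above, the implied peak age is the vertex of the parabola, $-b / (2a)$ for a fit $a \cdot \text{age}^2 + b \cdot \text{age} + c$. The sketch below shows that calculation on a synthetic curve; `peak_age` is a hypothetical helper and the data are constructed so that the true peak is known.

```python
import numpy as np

def peak_age(ages, values):
    """Fit values = a*age^2 + b*age + c and return the vertex age.

    Returns None when the fitted curve is not concave (a >= 0),
    since such a curve has no interior maximum.
    """
    a, b, _ = np.polyfit(ages, values, 2)
    if a >= 0:
        return None
    return -b / (2 * a)

# Synthetic APY curve in $M with a known peak at age 27.5
ages = np.arange(22, 36)
apy = 12.0 - 0.15 * (ages - 27.5) ** 2

print(peak_age(ages, apy))
```

Applied per position, this gives a single comparable number for each curve, though a quadratic forces symmetry around the peak and may understate how sharply some positions decline.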

Market Age Curves vs. Performance Age Curves

These age curves reflect market beliefs about aging, not necessarily true performance aging. The market might systematically over- or underestimate age-related decline. Research comparing market age curves to actual performance age curves reveals:

- **QBs**: Market accurately prices gradual decline; QBs remain effective into mid-30s
- **RBs**: Market may overvalue RBs in late 20s; performance often declines faster than salaries
- **WRs**: Market undervalues veteran WRs; elite WRs maintain performance longer than contracts suggest
- **OL**: Market undervalues young OL; tackles often improve significantly through age 28-30

Teams with superior age curve understanding can exploit market inefficiencies by extending players before expected improvement or avoiding contracts covering decline years.

Optimal Contract Length by Age

The interaction between player age and contract length creates strategic considerations. How long should contracts be for players at different ages?

R
Python

#| label: contract-length-analysis-r
#| message: false
#| warning: false

# Analyze contract length patterns by position and age
# Reveals market preferences for contract structure
length_analysis <- contracts %>%
  filter(position %in% c("QB", "WR", "RB", "CB", "S")) %>%

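  # Create age groups for analysis
  # right = FALSE makes the intervals [22,25), [25,28), [28,31), [31,Inf),
  # so each bin matches its label (the default would put age 25 in "22-24")
  mutate(age_group = cut(age, breaks = c(22, 25, 28, 31, Inf),
                         labels = c("22-24", "25-27", "28-30", "31+"),
                         right = FALSE)) %>%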

  group_by(position, age_group) %>%
  summarise(
    # Average contract length for this age group
    avg_years = mean(years),

    # Average guarantee percentage
    avg_guaranteed_pct = mean(guaranteed) * 100,

    # Sample size
    n = n(),

    .groups = "drop"
  ) %>%
  filter(!is.na(age_group))

# Display formatted table
length_analysis %>%
  gt() %>%
  cols_label(
    position = "Position",
    age_group = "Age Group",
    avg_years = "Avg Contract Length",
    avg_guaranteed_pct = "Avg % Guaranteed",
    n = "N"
  ) %>%
  fmt_number(
    columns = c(avg_years, avg_guaranteed_pct),
    decimals = 1
  ) %>%
  tab_header(
    title = "Contract Length and Guarantees by Age",
    subtitle = "Average contract length and guarantee share by position and age group"
  ) %>%
  data_color(
    columns = avg_years,
    colors = scales::col_numeric(
      palette = c("white", "lightblue"),
      domain = NULL
    )
  )

#| label: contract-length-analysis-py
#| message: false
#| warning: false

# Analyze contract length by position and age
positions = ['QB', 'WR', 'RB', 'CB', 'S']
length_data = contracts[contracts['position'].isin(positions)].copy()

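# Create age groups
# right=False makes the intervals [22, 25), [25, 28), [28, 31), [31, inf),
# so each bin matches its label (the default would put age 25 in '22-24')
length_data['age_group'] = pd.cut(
    length_data['age'],
    bins=[22, 25, 28, 31, np.inf],
    labels=['22-24', '25-27', '28-30', '31+'],
    right=False
)

# Calculate summary statistics
# observed=True keeps only position/age-group combinations present in the data
length_analysis = (length_data
    .groupby(['position', 'age_group'], observed=True)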
    .agg(
        avg_years=('years', 'mean'),
        avg_guaranteed_pct=('guaranteed', lambda x: x.mean() * 100),
        n=('player', 'count')
    )
    .reset_index()
    .dropna()
)

print("\nContract Length and Guarantees by Age:")
print("="*75)

for pos in positions:
    print(f"\n{pos}:")
    pos_data = length_analysis[length_analysis['position'] == pos]

    for _, row in pos_data.iterrows():
        print(f"  {row['age_group']:>6} | "
              f"Avg Length: {row['avg_years']:.1f} yrs | "
              f"Guaranteed: {row['avg_guaranteed_pct']:>4.1f}% | "
              f"N={row['n']:>3}")

This analysis reveals clear patterns in how teams structure contracts based on player age:

Prime-Age Players (25-28): Receive longest contracts with highest guarantees. Teams are most confident in sustained performance, willing to commit long-term.

Young Players (22-24): Moderate contract length. Teams want to secure value before players reach free agency but aren't yet confident enough for very long deals.

Veteran Players (31+): Short contracts with lower guarantees. Teams acknowledge decline risk, preferring flexibility to opt out if performance drops.

The strategy implications are significant. For players, getting extended during prime years (25-28) maximizes earning potential through long, highly guaranteed contracts. For teams, extending players just before their prime maximizes value—locking in elite years at pre-prime prices.

Contract Length Strategy

Optimal contract length strategy varies by stakeholder:

**For Players:**

- **Prime years (25-28)**: Maximize contract length and guarantees
- **Young players (22-24)**: Accept shorter prove-it deals to bet on self
- **Veterans (31+)**: Accept one-year deals with performance bonuses

**For Teams:**

- **Prime players**: Extend before free agency to avoid bidding wars
- **Young players**: Balance team control vs. extension cost
- **Veterans**: Short deals with team options, avoid guaranteed decline years
- **Position-specific**: Shorter deals for RBs (steep decline), longer for QBs (gradual decline)

**Market Timing:**

- Extend before comparable player contracts reset market upward
- Avoid free agency years with competitive bidding
- Structure deals with team flexibility (options, de-escalators)

Team Cap Management Evaluation

Evaluating team-level cap management reveals which organizations effectively navigate the cap's complexities. Some teams consistently field competitive rosters while maintaining cap flexibility, while others struggle with dead money and poor allocation decisions.

Cap Efficiency Metrics

We can develop metrics to evaluate overall team cap management quality, including total cap usage, allocation efficiency, dead money percentage, and cap concentration (how much spending is concentrated in a few players).

R
Python

#| label: team-cap-efficiency-r
#| message: false
#| warning: false

# Calculate comprehensive team-level cap metrics
# Reveals management quality and strategic approaches
team_cap_summary <- contracts %>%
  group_by(team) %>%
  summarise(
    # Total cap hit for all players
    total_cap = sum(cap_hit_2024),

    # Number of players under contract
    n_players = n(),

    # Average age of roster (older = less future flexibility)
    avg_age = mean(age),

    # Average guarantee percentage
    avg_guaranteed_pct = mean(guaranteed) * 100,

    # Cap concentration: % of cap in top 5 players
    # High concentration = risk if injuries/poor performance
    top_5_cap_pct = sum(head(sort(cap_hit_2024, decreasing = TRUE), 5)) /
                    sum(cap_hit_2024) * 100,

    .groups = "drop"
  ) %>%

  # Categorize concentration level
  mutate(
    cap_concentration = case_when(
      top_5_cap_pct >= 50 ~ "High",
      top_5_cap_pct >= 40 ~ "Medium",
      TRUE ~ "Low"
    )
  ) %>%
  arrange(desc(total_cap))

# Display top teams
team_cap_summary %>%
  head(15) %>%
  gt() %>%
  cols_label(
    team = "Team",
    total_cap = "Total Cap",
    n_players = "Players",
    avg_age = "Avg Age",
    avg_guaranteed_pct = "Avg % Guar",
    top_5_cap_pct = "Top 5 %",
    cap_concentration = "Concentration"
  ) %>%
  fmt_currency(
    columns = total_cap,
    decimals = 0
  ) %>%
  fmt_number(
    columns = c(avg_age, avg_guaranteed_pct, top_5_cap_pct),
    decimals = 1
  ) %>%
  tab_header(
    title = "Team Cap Management Summary",
    subtitle = "Total cap allocation and concentration metrics"
  )

#| label: team-cap-efficiency-py
#| message: false
#| warning: false

# Calculate team-level cap efficiency metrics
# Named aggregation keeps n_players as an integer and avoids groupby.apply
team_cap_summary = (contracts
    .groupby('team')
    .agg(
        total_cap=('cap_hit_2024', 'sum'),
        n_players=('cap_hit_2024', 'size'),
        avg_age=('age', 'mean'),
        avg_guaranteed_pct=('guaranteed', lambda x: x.mean() * 100),
        # Cap concentration: % of team cap in the five largest cap hits
        top_5_cap_pct=('cap_hit_2024',
                       lambda x: x.nlargest(5).sum() / x.sum() * 100)
    )
    .reset_index()
    .sort_values('total_cap', ascending=False)
)

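# Categorize concentration
# right=False gives [0, 40) Low, [40, 50) Medium, [50, inf) High,
# matching the >= thresholds used in the R version
team_cap_summary['cap_concentration'] = pd.cut(
    team_cap_summary['top_5_cap_pct'],
    bins=[0, 40, 50, np.inf],
    labels=['Low', 'Medium', 'High'],
    right=False
)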

print("\nTeam Cap Management Summary (Top 15):")
print("="*95)

for _, row in team_cap_summary.head(15).iterrows():
    print(f"{row['team']:>3} | "
          f"${row['total_cap']/1e6:>6.1f}M | "
          f"{row['n_players']:>3} players | "
          f"Avg Age: {row['avg_age']:>4.1f} | "
          f"Guar: {row['avg_guaranteed_pct']:>4.1f}% | "
          f"Top 5: {row['top_5_cap_pct']:>4.1f}% | "
          f"{row['cap_concentration']}")

These team-level metrics reveal different cap management philosophies:

High Concentration Teams: Invest heavily in a few star players. This creates clear hierarchy and elite talent but increases risk—if top players underperform or get injured, the team struggles with little cap flexibility to fix problems.

Low Concentration Teams: Spread spending across more players. This provides depth and injury insurance but may lack elite talent at key positions.

Young vs. Old Rosters: Average age reveals temporal strategy. Young rosters suggest rebuilding or championship window preparation. Old rosters suggest win-now approaches, often with less future flexibility.

Guarantee Levels: High guarantee percentages suggest either (1) many elite players with leverage, or (2) questionable contract negotiations giving away guarantees unnecessarily.

Combining these metrics with team performance (wins, playoff success) reveals which cap management approaches work best. Do teams with balanced allocation outperform concentrated teams? Do young rosters outperform old ones? These are empirically testable questions.
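A first pass at these questions can be as simple as correlating the cap metrics with wins and comparing average wins across concentration tiers. The sketch below uses randomly generated stand-in data, so its output carries no football meaning; the column names follow the summary table above, while the `wins` column and the team codes are assumptions. In practice, the team summary would be merged with actual standings.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic stand-in for 32 teams (random values, for illustration only)
teams = pd.DataFrame({
    "team": [f"T{i:02d}" for i in range(32)],
    "top_5_cap_pct": rng.uniform(30, 60, 32),
    "avg_age": rng.uniform(25, 28, 32),
    "wins": rng.integers(3, 15, 32),
})

# Pearson correlation of each cap metric with wins
corr = teams[["top_5_cap_pct", "avg_age", "wins"]].corr()["wins"].drop("wins")

# Average wins by concentration tier (same thresholds as the summary: 40 and 50)
teams["concentration"] = pd.cut(
    teams["top_5_cap_pct"], bins=[0, 40, 50, np.inf],
    right=False, labels=["Low", "Medium", "High"]
)
wins_by_tier = (teams.groupby("concentration", observed=True)["wins"]
                     .agg(["mean", "size"]))

print(corr)
print(wins_by_tier)
```

With only 32 teams per season, single-year correlations are noisy; pooling several seasons and controlling for quarterback quality would be natural next steps.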

Evaluating Cap Management Quality

Superior cap management shows several characteristics:

1. **Appropriate concentration**: Invest heavily in QB and 2-3 elite non-QBs, fill rest with value
2. **Age distribution**: Mix of young players (future), prime players (now), few veterans (leadership)
3. **Dead money minimization**: Rarely forced to cut players with significant dead money
4. **Draft success**: Regular influx of productive rookie contracts
5. **Market timing**: Sign free agents when position markets are depressed
6. **Extension timing**: Extend players before market-resetting contracts
7. **Flexibility**: Maintain ability to make moves, not painted into corners

Teams excelling at cap management (Patriots, Packers historically) share these characteristics.

Dead Money Analysis

Dead money—cap charges for players no longer on the roster—represents the most visible cost of poor cap management. High dead money indicates either (1) bad contract decisions requiring early player releases, or (2) strategic restructures that pushed money into future years.

R
Python

#| label: dead-money-analysis-r
#| message: false
#| warning: false

# Simulate dead money scenarios
# In practice, would use actual contract details and release dates
set.seed(2024)

team_dead_money <- contracts %>%
  group_by(team) %>%
  summarise(
    # Active cap from players on roster
    active_cap = sum(cap_hit_2024),

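    # Simulate dead money from released players
    # Assumes each contract has a 20% chance of producing dead money equal to
    # 0-30% of its guaranteed money (mirrors the Python version)
    # In reality, calculate from actual signing bonus proration
    dead_money = sum(guaranteed_money * rbinom(n(), 1, 0.2) * runif(n(), 0, 0.3)),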

    # Total cap used (active + dead)
    total_cap_used = active_cap + dead_money,

    # Dead money as percentage of total
    dead_money_pct = dead_money / total_cap_used * 100,

    .groups = "drop"
  ) %>%
  arrange(desc(dead_money))

# Display teams with most dead money
team_dead_money %>%
  head(10) %>%
  gt() %>%
  cols_label(
    team = "Team",
    active_cap = "Active Cap",
    dead_money = "Dead Money",
    total_cap_used = "Total Cap Used",
    dead_money_pct = "Dead % of Cap"
  ) %>%
  fmt_currency(
    columns = c(active_cap, dead_money, total_cap_used),
    decimals = 0
  ) %>%
  fmt_number(
    columns = dead_money_pct,
    decimals = 1
  ) %>%
  tab_header(
    title = "Teams with Most Dead Money",
    subtitle = "Cap space consumed by players no longer on roster"
  ) %>%
  tab_style(
    style = cell_fill(color = "lightcoral"),
    locations = cells_body(
      columns = dead_money_pct,
      rows = dead_money_pct > 5
    )
  )

#| label: dead-money-analysis-py
#| message: false
#| warning: false

# Simulate dead money for each team
np.random.seed(2024)
team_dead_money_list = []

for team in contracts['team'].unique():
    team_contracts = contracts[contracts['team'] == team]
    active_cap = team_contracts['cap_hit_2024'].sum()

    # Simulate dead money from releases
    dead_money = sum([
        0 if np.random.rand() > 0.2 else
        team_contracts.iloc[i]['guaranteed_money'] * np.random.uniform(0, 0.3)
        for i in range(len(team_contracts))
    ])

    total_cap_used = active_cap + dead_money
    dead_money_pct = (dead_money / total_cap_used * 100) if total_cap_used > 0 else 0

    team_dead_money_list.append({
        'team': team,
        'active_cap': active_cap,
        'dead_money': dead_money,
        'total_cap_used': total_cap_used,
        'dead_money_pct': dead_money_pct
    })

team_dead_money = (pd.DataFrame(team_dead_money_list)
    .sort_values('dead_money', ascending=False)
)

print("\nTeams with Most Dead Money (Top 10):")
print("="*85)

for _, row in team_dead_money.head(10).iterrows():
    print(f"{row['team']:>3} | "
          f"Active: ${row['active_cap']/1e6:>6.1f}M | "
          f"Dead: ${row['dead_money']/1e6:>5.1f}M | "
          f"Total: ${row['total_cap_used']/1e6:>6.1f}M | "
          f"Dead %: {row['dead_money_pct']:>4.1f}%")

High dead money significantly constrains team building. A team with $20M in dead money effectively has $20M less cap space to sign players for the current season. This forces difficult choices: release additional players to create cap room, restructure contracts (creating future dead money problems), or field an undermanned roster.

Dead money typically results from:

Releases: Cutting underperforming players with remaining bonus proration
Trades: Trading players creates dead money from accelerated bonuses
Retirements: Unexpected retirements accelerate the remaining signing bonus proration
Failed contracts: Players who never pan out but had large signing bonuses
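The acceleration behind most of these cases can be sketched in a few lines. This is a minimal illustration, not a full cap model: the function name `dead_money_on_release` and the example figures are hypothetical, and it assumes a standard release where all remaining signing bonus proration lands on the current year's cap, with no guaranteed salary left on the deal.

```python
def dead_money_on_release(signing_bonus, proration_years, seasons_completed):
    """Unamortized signing bonus that accelerates onto the cap at release.

    Simplifying assumptions (not from the chapter): a standard release with
    all remaining proration hitting the current year, and no guaranteed
    salary left on the deal.
    """
    if not 1 <= proration_years <= 5:
        raise ValueError("signing bonuses prorate over 1 to 5 years")
    if not 0 <= seasons_completed <= proration_years:
        raise ValueError("seasons_completed must be between 0 and proration_years")
    annual_proration = signing_bonus / proration_years
    return annual_proration * (proration_years - seasons_completed)


# Hypothetical player: $20M signing bonus on a 4-year deal, released after 2 seasons
example_dead_money = dead_money_on_release(20e6, 4, 2)
print(f"Dead money at release: ${example_dead_money / 1e6:.1f}M")
```

In this example, two of the four $5M proration charges have already counted against the cap, so the remaining $10M accelerates when the player is released.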

Teams can minimize dead money through several strategies:

Careful contract structuring: Limit signing bonuses, use roster bonuses instead
Player evaluation: Better evaluation reduces bad contracts requiring release
Trade timing: Trade players before signing bonus acceleration becomes prohibitive
Extension strategy: Extend players before contracts end to avoid cap spikes

Advanced Topics

Multi-Year Cap Planning

Smart teams don't just manage the current year's cap—they plan 3-5 years ahead, considering future contract expirations, extension needs, and projected cap growth.

Multi-Year Cap Strategy

Effective multi-year planning considers:

1. **Current Year**: Competitive roster construction, maximize win probability
2. **Year 2-3**: Plan for contract extensions and free agent needs
3. **Year 4+**: Account for rookie contract expirations and potential exodus
4. **Cap smoothing**: Use restructures and extensions to manage year-to-year spikes
5. **Cap projections**: Model multiple scenarios for cap growth and player performance

Teams maintain spreadsheets projecting cap situations through future years, identifying upcoming crunches and planning solutions (extensions, restructures, releases) in advance.
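The scenario-based projection described above can be sketched as follows. Everything numeric here is an assumption for illustration: the committed cap hits and the growth rates are hypothetical, and the starting cap is only a round approximation of the 2024 figure cited in the introduction. The helper name `project_cap_space` is likewise invented for this sketch.

```python
BASE_YEAR = 2024
BASE_CAP = 255e6  # round approximation of the 2024 cap (assumption)

# Hypothetical cap hits already committed in each season (illustrative only)
committed = {2024: 240e6, 2025: 210e6, 2026: 150e6, 2027: 90e6}

# Hypothetical annual cap growth scenarios
growth_scenarios = {"low": 0.03, "base": 0.06, "high": 0.09}


def project_cap_space(base_cap, base_year, committed_hits, growth):
    """Projected cap space per season under a constant annual growth rate."""
    space = {}
    for year, hit in sorted(committed_hits.items()):
        projected_cap = base_cap * (1 + growth) ** (year - base_year)
        space[year] = projected_cap - hit
    return space


projections = {
    name: project_cap_space(BASE_CAP, BASE_YEAR, committed, growth)
    for name, growth in growth_scenarios.items()
}

# Print cap space (in $M) by season and scenario
print(f"{'Year':<6}" + "".join(f"{name:>10}" for name in projections))
for year in sorted(committed):
    cells = "".join(f"{projections[name][year] / 1e6:>9.1f}M" for name in projections)
    print(f"{year:<6}" + cells)
```

A table like this makes upcoming crunches visible early: a season where projected space is thin even under the high-growth scenario is a candidate for an extension, restructure, or release decision well in advance.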

Rookie Contract Value

Rookie contracts, with slotted values based on draft position, create systematic value opportunities. Elite rookies drastically outperform contract value, while busts still consume significant cap space.

R
Python

#| label: rookie-value-r
#| message: false
#| warning: false

# Simulate rookie contract values by draft position
# Loosely modeled on the rookie wage scale; values are illustrative,
# not actual slot figures
rookie_contracts <- tibble(
  pick = 1:260,
  round = ceiling(pick / 32),

  # Slot value falls within each tier; the tiers are rough and do not
  # join smoothly at the boundaries (picks 10, 32, and 64)
  slot_value = case_when(
    pick == 1 ~ 40e6,
    pick <= 10 ~ 30e6 - (pick - 1) * 2e6,
    pick <= 32 ~ 15e6 - (pick - 10) * 0.5e6,
    pick <= 64 ~ 8e6 - (pick - 32) * 0.2e6,
    TRUE ~ 5e6 - (pick - 64) * 0.02e6
  ),

  # Contract length: 4 years for all
  years = 4,

  # 5th year option for 1st rounders
  fifth_year_option = round == 1,

  # APY
  apy = slot_value / years
) %>%
  filter(pick <= 100)

# Compare to veteran market
position_avg <- contracts %>%
  filter(position %in% c("QB", "WR", "RB", "OL", "DL")) %>%
  group_by(position) %>%
  summarise(avg_apy = mean(apy))

cat("Rookie Contract Value Advantage:\n")
cat("Pick 1 APY:", scales::dollar(rookie_contracts$apy[1]), "\n")
cat("Pick 32 APY:", scales::dollar(rookie_contracts$apy[32]), "\n")
cat("\nAverage veteran QB APY:", scales::dollar(
  position_avg$avg_apy[position_avg$position == "QB"]
), "\n")
cat("Value advantage for elite rookie QB: ~",
    scales::dollar(position_avg$avg_apy[position_avg$position == "QB"] -
                   rookie_contracts$apy[1]), "\n")

#| label: rookie-value-py
#| message: false
#| warning: false

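# Simulate rookie contract slot values (illustrative, not actual slot figures)
picks = np.arange(1, 261)
rounds = np.ceil(picks / 32).astype(int)  # approximation: 32 picks per round

def calculate_slot_value(pick):
    """Rough piecewise-linear slot value by pick; tiers do not join smoothly."""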
    if pick == 1:
        return 40e6
    elif pick <= 10:
        return 30e6 - (pick - 1) * 2e6
    elif pick <= 32:
        return 15e6 - (pick - 10) * 0.5e6
    elif pick <= 64:
        return 8e6 - (pick - 32) * 0.2e6
    else:
        return 5e6 - (pick - 64) * 0.02e6

rookie_contracts = pd.DataFrame({
    'pick': picks,
    'round': rounds,
    'slot_value': [calculate_slot_value(p) for p in picks],
})

rookie_contracts['years'] = 4
rookie_contracts['fifth_year_option'] = rookie_contracts['round'] == 1
rookie_contracts['apy'] = rookie_contracts['slot_value'] / rookie_contracts['years']

# Focus on first 100 picks
rookie_sample = rookie_contracts[rookie_contracts['pick'] <= 100]

# Compare to veteran contracts
position_avg = (contracts
    .query("position in ['QB', 'WR', 'RB', 'OL', 'DL']")
    .groupby('position')['apy']
    .mean()
)

print("\nRookie Contract Value Advantage:")
print("="*60)
print(f"Pick 1 APY: ${rookie_contracts.loc[0, 'apy']/1e6:.1f}M")
print(f"Pick 32 APY: ${rookie_contracts.loc[31, 'apy']/1e6:.1f}M")
print(f"\nAverage veteran QB APY: ${position_avg['QB']/1e6:.1f}M")
print(f"Average veteran WR APY: ${position_avg['WR']/1e6:.1f}M")
print(f"\nValue advantage for elite rookie QB: ~${(position_avg['QB'] - rookie_contracts.loc[0, 'apy'])/1e6:.1f}M")

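The rookie wage scale creates enormous value opportunities. An elite quarterback on a rookie contract (Pick 1: ~$10M APY) produces at the level of a veteran making $45M+, creating roughly $35M in excess value annually. Over the four-year rookie contract, that is about $140M in surplus. Extending the same rate through the 5th-year option would bring the total to $175M, although the option year carries a much higher salary than the rookie deal, so the realized figure is somewhat lower.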

This explains why teams prioritize drafting well, especially at valuable positions. A team that drafts an elite QB, pass rusher, or tackle gains years of elite production at below-market rates, creating cap space to build around that player.
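The surplus arithmetic above can be sketched as a small helper. The $45M market rate and ~$10M rookie APY come from the text; the function name `rookie_surplus` and the $25M fifth-year option salary are hypothetical placeholders chosen only to show how a more expensive option year changes the total.

```python
def rookie_surplus(market_apy, rookie_apy, rookie_years=4, option_salary=None):
    """Total surplus value: market-rate production minus what the team pays.

    option_salary is the cost of the fifth-year option, if exercised.
    """
    surplus = (market_apy - rookie_apy) * rookie_years
    if option_salary is not None:
        surplus += market_apy - option_salary
    return surplus


MARKET_QB_APY = 45e6   # veteran market rate used in the text
ROOKIE_QB_APY = 10e6   # approximate Pick 1 APY used in the text

four_year = rookie_surplus(MARKET_QB_APY, ROOKIE_QB_APY)
# Upper bound: treat the option year as if it cost the rookie APY
option_at_rookie_rate = rookie_surplus(MARKET_QB_APY, ROOKIE_QB_APY,
                                       option_salary=ROOKIE_QB_APY)
# Hypothetical, more realistic option salary (assumption for illustration)
option_at_25m = rookie_surplus(MARKET_QB_APY, ROOKIE_QB_APY, option_salary=25e6)

print(f"Four-year surplus:            ${four_year / 1e6:.0f}M")
print(f"With option at rookie rate:   ${option_at_rookie_rate / 1e6:.0f}M")
print(f"With hypothetical $25M option: ${option_at_25m / 1e6:.0f}M")
```

The gap between the last two figures shows how sensitive the five-year total is to the option-year salary, which is why the four-year number is the safer baseline.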

Contract Restructuring Analysis

Restructuring converts salary to signing bonus, creating immediate cap relief but increasing future cap hits.

Restructure Mechanics

Example restructure:

**Before:**
- 2024 salary: $20M
- 2024 cap hit: $20M

**After:**
- Convert $16M salary to signing bonus
- New 2024 salary: $4M
- Signing bonus proration: $16M / 4 years = $4M/year
- New 2024 cap hit: $4M + $4M = $8M

**Result:**
- 2024 cap savings: $12M
- 2025-2027 cap increase: +$4M per year

**Strategic implications:**
- Creates immediate cap space for win-now moves
- Increases future cap obligations
- Acceptable when in championship window
- Dangerous if player performance declines or injury occurs
- Creates dead money if player later released
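The same restructure arithmetic can be expressed as a short function. This is a simplified sketch: the name `restructure` and its return fields are hypothetical, the player's pre-restructure cap hit is assumed to consist of base salary only (as in the example), and rules such as minimum salary floors are ignored.

```python
def restructure(base_salary, amount_converted, years_remaining):
    """Cap effect of converting base salary into a prorated signing bonus.

    years_remaining counts the current season. Proration is capped at
    5 years. Assumes the current cap hit is base salary only.
    """
    if not 0 < amount_converted <= base_salary:
        raise ValueError("amount_converted must be positive and at most base_salary")
    if years_remaining < 1:
        raise ValueError("years_remaining must include the current season")
    proration_years = min(years_remaining, 5)
    annual_proration = amount_converted / proration_years
    new_cap_hit = (base_salary - amount_converted) + annual_proration
    return {
        "new_cap_hit": new_cap_hit,
        "current_year_savings": base_salary - new_cap_hit,
        "future_increase_per_year": annual_proration,
        "future_years_affected": proration_years - 1,
    }


# Figures from the example: $20M salary, $16M converted, 4 years remaining
result = restructure(20e6, 16e6, 4)
for field, value in result.items():
    if field == "future_years_affected":
        print(f"{field}: {value}")
    else:
        print(f"{field}: ${value / 1e6:.1f}M")
```

Note that the total charged to the cap is unchanged ($12M saved now equals three future years of $4M), which is why a restructure is a timing decision and not a discount.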

Restructures represent borrowing from the future to invest in the present. Teams in championship windows (strong roster, QB in prime) should restructure aggressively. Teams rebuilding should avoid restructures to maintain future flexibility.

Case Studies

Case Study 1: Kansas City Chiefs - Sustained Excellence

The Chiefs demonstrate excellent cap management through their championship run:

2017-2020: Rookie QB Window
- Patrick Mahomes on rookie contract (~$5M APY)
- Invested in weapons: Hill, Kelce, Watkins
- Strong defense possible with QB savings
- Won Super Bowl LIV (2019 season)

2020: Extension Decision
- Extended Mahomes: 10 years, $450M ($45M APY)
- Structured with team flexibility and escalators
- Preserved ability to compete despite large QB cap hit

2021-Present: Post-Extension Adjustment
- Moved on from expensive veterans (Hill traded in 2022, Watkins not re-signed)
- Found value contracts (JuJu Smith-Schuster, Kadarius Toney)
- Drafted well to maintain talent influx
- Continued success: 2 more Super Bowl wins

Key Lessons:
1. Maximize rookie QB window
2. Structure extensions with flexibility
3. Find value replacements for departed veterans
4. Draft well to maintain rookie contract value

Case Study 2: Los Angeles Rams - All-In Strategy

The Rams used aggressive cap management to win Super Bowl LVI:

Strategy:
- Traded draft picks for established stars (Stafford, Ramsey, Miller)
- Restructured repeatedly to push money into future
- Used voidable years to minimize current cap hits
- Prioritized 2-3 year window over long-term sustainability

Results:
- Won Super Bowl LVI (2021 season)
- Faced severe cap constraints 2022-2023
- Limited ability to retain players or add talent
- Rebuilding phase required

Key Lessons:
1. An all-in strategy can win a championship
2. Requires accepting future cap pain
3. Works best with an aging core and a narrow window
4. Not a sustainable long-term approach

Case Study 3: Value Contracts Driving Success

Teams finding value contracts gain competitive edges:

San Francisco 49ers with Brock Purdy:
- Elite QB play on minimum salary ($1M APY vs. $45M market)
- $44M annual savings allows elite roster around him
- Challenge: Extend before market resets or lose advantage?

Philadelphia Eagles with Jalen Hurts (before extension):
- MVP-caliber season on rookie deal
- Invested in OL, weapons, defense
- Won NFC Championship

Dallas Cowboys with Micah Parsons:
- Elite defender on rookie contract
- Value allows investment elsewhere
- Must extend before rookie deal expires

Pattern: Teams with elite players on rookie deals have systematic advantages. The challenge is transitioning when those deals expire—some teams maintain success through superior drafting and value identification, while others regress.

Summary

Salary cap management represents one of the most complex analytical challenges in football, combining financial analysis, statistical modeling, and strategic planning. Success requires understanding cap mechanics, building market value models, identifying inefficiencies, and optimizing allocation decisions.

Key Takeaways:

Market Value Models: Statistical models can identify over/underpaid players relative to age, experience, and position markets. Use these models to find value contracts and avoid overpays.
Positional Allocation: Different positions command different market premiums. Invest more heavily in positions where analytics suggest market undervaluation relative to actual impact.
Age Curves: Contract value and performance both vary with age. Structure contracts to capture prime years while avoiding paying for decline.
Optimization: Roster construction can be framed as constrained optimization—maximize expected performance subject to cap and roster constraints.
Multi-Year Planning: Cap management requires a 3-5 year planning horizon, considering contract expirations, extension needs, and cap growth.
Rookie Value: Rookie contracts provide massive cap advantages when elite players are drafted. This creates championship windows that must be maximized.
Flexibility: Dead money and long-term obligations limit flexibility. Maintain optionality through careful contract structuring and strategic extensions.

Analytical Approaches:

Regression models for market value prediction
Portfolio optimization for roster construction
Time series analysis for multi-year cap projection
Market efficiency analysis for identifying systematic mispricings
Age curve modeling for optimal contract length decisions

The best cap management combines deep understanding of NFL contract rules with rigorous analytical methods and strategic thinking about championship windows and organizational timelines.

Exercises

Conceptual Questions

Guaranteed Money: Why do some positions (QB) get higher guarantee percentages than others (RB)? Consider both supply/demand factors and position-specific characteristics.
Restructure Trade-offs: What are the pros and cons of restructuring contracts to create immediate cap space? Under what circumstances should teams restructure aggressively vs. avoid restructures?
Market Inefficiency: Identify three potential sources of market inefficiency in NFL contract markets where analytically-sophisticated teams could gain advantages.
Positional Allocation: Should a team invest more in offensive or defensive line? Build a framework for answering this question using both analytical evidence and strategic considerations.
Extension Timing: When should teams extend players—early in contracts (more team-friendly but less information) or late (more expensive but more certainty)? How does this vary by position and player quality?

Coding Exercises

Exercise 1: Build Position Market Value Model

Using the contract data:
a) Build a regression model predicting APY for wide receivers based on age, experience, and performance metrics (if available)
b) Identify the top 5 value contracts (most underpaid) at the position
c) Calculate the total cap savings from these value contracts vs. market rate
d) Create a visualization showing actual vs. predicted APY with highlighted value contracts

**Hint**: Consider log-transforming APY for better model fit. Include quadratic age terms to capture non-linear aging effects.

Exercise 2: Team Cap Allocation Analysis

For each team:
a) Calculate the percentage of cap allocated to each position group (QB, OL, DL, Secondary, Skill positions, etc.)
b) Identify which teams are outliers in their allocation strategies using clustering or distance metrics
c) Create a heatmap showing positional allocation across all teams
d) Calculate cap concentration (% of cap in top 5 contracts) and correlate with allocation strategy

**Bonus**: If team performance data is available, correlate allocation strategies with team success (wins, playoff appearances). Which allocation approaches work best?

Exercise 3: Age and Contract Optimization

Analyze the relationship between age and contract structure:
a) For each position, create age curves showing average APY by age
b) Calculate the "peak age" for each position (highest average APY)
c) Analyze how contract length varies by age within positions
d) Build a model predicting optimal contract length given position and age

**Extension**: Include performance trajectories if data available. Do age-based contract lengths align with performance-based optimal lengths?

Exercise 4: Roster Optimization Challenge

Build a simplified roster optimization model:
a) Define a 22-player starting lineup (11 offense, 11 defense)
b) Create position requirements (1 QB, 2 RB, 3 WR, 5 OL, 4 DL, 3 LB, 2 CB, 2 S)
c) Set a total cap budget ($120M for starters)
d) Maximize total expected EPA (or other performance metric) using optimization
e) Compare optimal allocation to actual NFL team allocations. Where do teams deviate from optimal?

**Advanced**: Include constraints for depth (backup players), age diversity, or positional flexibility. How do additional constraints change optimal allocation?

Exercise 5: Multi-Year Cap Planning

Create a multi-year cap projection tool:
a) Start with a team's current cap situation (contracts, cap space)
b) Project cap hits for next 3 years based on existing contracts
c) Identify contracts that expire and need replacement at market rate
d) Model different extension/restructure scenarios and their cap impact
e) Evaluate cap flexibility under different scenarios. Which provides the best balance of current competitiveness and future flexibility?

**Visualization**: Create a waterfall chart showing cap evolution over time with different decision scenarios.

