Frequently Asked Questions

Common questions about NFL analytics, our tools, and getting started with football data science

Getting Started

NFL analytics is the application of data science, statistics, and mathematical modeling to American football. It involves analyzing play-by-play data to gain insights about team performance, player evaluation, game strategy, and betting markets. Modern analytics uses metrics like EPA (Expected Points Added), win probability, and CPOE (Completion Percentage Over Expected) to measure performance beyond traditional stats.

While coding (especially R or Python) greatly enhances what you can do with NFL data, you can start learning analytics concepts without coding. Our tutorials explain the theory and interpretation of metrics. However, to do original analysis or build models, learning R (with nflfastR) or Python (with nfl_data_py) is highly recommended. Both languages have excellent free learning resources.

Both R and Python are excellent choices. R is the more popular of the two in the NFL analytics community, with the nflfastR package being the gold standard for play-by-play data. Python is more versatile and has a broader machine learning ecosystem. We recommend starting with R if you're focused solely on football analytics, or Python if you want broader data science skills. Our code examples are provided in both languages.

The best free sources are nflfastR (R) and nfl_data_py (Python), which provide play-by-play data with advanced metrics going back to 1999. Pro Football Reference offers free historical statistics. NFL Next Gen Stats provides player tracking data for recent seasons. Our Resources page has a complete list of data sources.

You can understand basic concepts like EPA and success rate in a few hours of reading. Learning to code basic analyses takes a few weeks of practice. Building sophisticated models and developing genuine expertise takes months to years of consistent work. Our 45-chapter tutorial series is designed to be completed over several months, building skills progressively.

Understanding Metrics

EPA measures the value of a play by comparing the expected points before and after the play. For example, if a team has 2.5 expected points before a play and 3.5 after, that play was worth +1.0 EPA. Because expected points depend on down, distance, field position, and game situation, EPA is a context-aware metric. EPA/play is one of the most widely used efficiency metrics in modern football analytics.
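The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function name `epa` is ours, not part of nflfastR, and both expected-points values are assumed to be measured from the offense's point of view.

```python
def epa(ep_before: float, ep_after: float) -> float:
    """EPA for one play: expected points after the play minus before it.

    Assumption: both values are from the perspective of the team that
    had the ball when the play started.
    """
    return ep_after - ep_before


# The example from the text: 2.5 expected points before, 3.5 after.
play_epa = epa(2.5, 3.5)
print(f"EPA: {play_epa:+.1f}")  # EPA: +1.0
```

In real play-by-play data you would not compute this yourself; the expected-points values come from a fitted model and the EPA column is already provided.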

Success rate is the percentage of plays that count as "successful." Under the definition used here, a play is successful if it gains at least 50% of the yards needed on 1st down, 70% on 2nd down, or 100% on 3rd/4th down. Because each play counts only as a success or a failure, success rate is more stable than EPA, which a few big plays can swing. A high success rate with low EPA suggests a consistent but not explosive offense.
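As a rough sketch of the down-based thresholds described above, assuming each play is given as a (down, yards to go, yards gained) tuple. The names `REQUIRED_PCT`, `is_successful`, and `success_rate` are illustrative, and the sample plays are made up.

```python
# Percent of the needed yardage a play must gain, by down (from the text).
REQUIRED_PCT = {1: 50, 2: 70, 3: 100, 4: 100}


def is_successful(down: int, yards_to_go: int, yards_gained: int) -> bool:
    """True if the play gained the required share of yards for its down."""
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    return yards_gained * 100 >= REQUIRED_PCT[down] * yards_to_go


def success_rate(plays) -> float:
    """Fraction of plays that were successful (0.0 for an empty list)."""
    if not plays:
        return 0.0
    return sum(is_successful(*play) for play in plays) / len(plays)


# Hypothetical plays: (down, yards_to_go, yards_gained)
plays = [(1, 10, 5), (2, 5, 3), (3, 2, 2), (1, 10, 3)]
print(f"Success rate: {success_rate(plays):.0%}")  # Success rate: 50%
```

Here the 1st-and-10 gain of 5 and the 3rd-and-2 conversion count as successes, while the other two plays fall short of their thresholds.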

CPOE measures how much better (or worse) a quarterback's completion percentage is compared to what's expected based on the difficulty of their passes. It relies on a model that predicts the completion probability of each pass from factors like air yards; models built on player tracking data can also use receiver separation and pressure. A positive CPOE means the QB is completing passes at a higher rate than expected.
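Once each pass has a predicted completion probability, the CPOE calculation itself is simple. The sketch below assumes those probabilities already exist (here they are made-up numbers, not output from any real model), and the function name `cpoe` is illustrative.

```python
def cpoe(passes) -> float:
    """Completion percentage over expected, in percentage points.

    `passes` is a list of (expected_completion_probability, completed)
    pairs. The probabilities are assumed to come from a separate model.
    """
    n = len(passes)
    actual_rate = sum(1 for _, completed in passes if completed) / n
    expected_rate = sum(prob for prob, _ in passes) / n
    return 100 * (actual_rate - expected_rate)


# Hypothetical passes: (model's completion probability, was it completed?)
passes = [(0.80, True), (0.60, True), (0.40, False), (0.70, True)]
print(f"CPOE: {cpoe(passes):+.1f}")  # CPOE: +12.5
```

This quarterback completed 75% of passes that the model expected to be completed 62.5% of the time, so the CPOE is +12.5 percentage points.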

Win probability estimates a team's chances of winning based on the current game state (score, time remaining, field position, down/distance, timeouts). It's calculated using historical data from similar situations. Win Probability Added (WPA) measures how much a single play changed the team's chances of winning.

EPA (from nflfastR) is calculated for every play and is publicly available. DVOA (created at Football Outsiders and now published by FTN) is a proprietary metric that compares performance to league average on a per-play basis, expressed as a percentage. Both measure efficiency, but DVOA includes opponent adjustments. EPA is more accessible; DVOA requires a subscription for detailed data.

On average, rushing plays generate about +0.02 EPA/play while passing generates about +0.09 EPA/play. That gap adds up over the dozens of plays an offense runs each game. However, this doesn't mean teams shouldn't run at all - rushing has strategic value (clock management, play-action), and some teams/situations favor it. The point is that raw rushing yards are often misleading about team quality.
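EPA/play figures like these are simply averages of per-play EPA within each play type. A minimal sketch, assuming plays are given as (play type, EPA) pairs; the sample values are invented for illustration and are not real NFL data, and `epa_per_play` is our own name.

```python
from collections import defaultdict


def epa_per_play(plays) -> dict:
    """Average EPA for each play type in a list of (play_type, epa) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for play_type, play_epa in plays:
        totals[play_type] += play_epa
        counts[play_type] += 1
    return {play_type: totals[play_type] / counts[play_type] for play_type in totals}


# Hypothetical plays, not real data.
sample = [("pass", 0.5), ("pass", -0.25), ("run", 0.25), ("run", -0.25), ("pass", 0.5)]
for play_type, avg in epa_per_play(sample).items():
    print(f"{play_type}: {avg:+.2f} EPA/play")
```

With real play-by-play data, the same grouping is usually done with a data frame library rather than by hand.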

Using the Site

Our Tools page has several calculators, including the EPA Calculator, Fourth Down Decision Tool, and Win Probability Calculator. Simply enter the required inputs (down, distance, field position, etc.) and the tool calculates the result. Each tool includes explanations of how to interpret the output.

Yes! All code examples on our Code Examples page have a "Copy" button. Each example includes both R and Python versions. You're free to use, modify, and share the code for any purpose - just make sure you have the required packages installed.

Tutorial content is updated periodically to reflect new developments in analytics. Team analytics and playoff predictions are updated throughout the NFL season. Code examples are updated when packages have breaking changes. Check the "Updated" date on each page for the most recent revision.

Yes, we save your tutorial progress using your browser's local storage. This means your progress is tied to your browser - if you switch browsers or clear your data, progress will be reset. No account is required, and we don't collect any personal information.

Betting & Fantasy

Analytics can give you an edge, but betting markets are efficient and beating them consistently is extremely difficult. The key is finding +EV (positive expected value) bets where your estimated probability exceeds the implied probability from the odds. Even skilled bettors typically win only 52-55% against the spread. Our Betting section covers strategies, but remember: the house always has an edge.

EV is the average profit or loss per bet over the long run. A +EV bet has positive expected value - if you made this bet many times, you'd profit on average. EV = (Win Probability × Profit if you win) - (Loss Probability × Stake). At -110 odds you risk 110 to win 100, so you need to win 110/210 = 52.4% of your bets to break even. Our EV Calculator runs this calculation for you.
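The EV formula and the break-even figure can be checked with a short sketch. The function names are illustrative, odds are assumed to be in American format, and the 55% win probability is a made-up input rather than a realistic estimate.

```python
def profit_per_unit(american_odds: int) -> float:
    """Net profit per 1 unit staked if the bet wins, from American odds."""
    if american_odds < 0:
        return 100 / -american_odds
    return american_odds / 100


def break_even_probability(american_odds: int) -> float:
    """Win probability at which the bet's expected value is exactly zero."""
    return 1 / (1 + profit_per_unit(american_odds))


def expected_value(win_prob: float, american_odds: int, stake: float = 1.0) -> float:
    """EV = (win probability x profit) - (loss probability x stake)."""
    profit = stake * profit_per_unit(american_odds)
    return win_prob * profit - (1 - win_prob) * stake


print(f"Break-even at -110: {break_even_probability(-110):.1%}")  # 52.4%
# Hypothetical bet: risk 110 at -110 with an estimated 55% chance to win.
print(f"EV: {expected_value(0.55, -110, stake=110):.2f}")  # EV: 5.50
```

At exactly the break-even probability the EV is zero; anything above it is a +EV bet, provided your probability estimate is accurate.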

Absolutely! Analytics can help with player evaluation (target share, snap counts, efficiency metrics), matchup analysis (opponent defense rankings), and trade decisions. Key metrics include target share, air yards share, and opportunity-based projections. However, remember that fantasy also involves factors like weather, game script, and variance.

The Kelly Criterion is a formula for optimal bet sizing that maximizes long-term bankroll growth. It calculates what percentage of your bankroll to bet based on your edge. The formula is: Kelly % = (bp - q) / b, where b = decimal odds - 1 (the net profit per unit staked), p = win probability, and q = 1 - p (loss probability). A zero or negative result means there is no edge and no bet. Most bettors use "fractional Kelly" (half or quarter) to reduce variance.
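Here is a small sketch of the Kelly formula as stated above. The function name `kelly_fraction` and its `multiplier` parameter (for fractional Kelly) are our own, and the inputs are hypothetical.

```python
def kelly_fraction(decimal_odds: float, win_prob: float, multiplier: float = 1.0) -> float:
    """Fraction of bankroll to stake: (b*p - q) / b, scaled by `multiplier`.

    Returns 0.0 when the formula is negative (no edge, so no bet).
    Use multiplier=0.5 for half Kelly or 0.25 for quarter Kelly.
    """
    b = decimal_odds - 1  # net profit per unit staked
    q = 1 - win_prob      # loss probability
    full_kelly = (b * win_prob - q) / b
    return max(0.0, full_kelly) * multiplier


# Hypothetical bet: even money (decimal odds 2.0) with a 55% win probability.
print(f"Full Kelly: {kelly_fraction(2.0, 0.55):.1%}")       # Full Kelly: 10.0%
print(f"Half Kelly: {kelly_fraction(2.0, 0.55, 0.5):.1%}")  # Half Kelly: 5.0%
```

The result is only as good as the win probability you feed in, which is one reason fractional Kelly is popular: it limits the damage when your estimated edge is too optimistic.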

Technical Questions

In R, run: install.packages("nflfastR"). Then load it with library(nflfastR). For the full ecosystem including plotting functions, install nflverse: install.packages("nflverse"). Python users should use: pip install nfl_data_py.

Play-by-play data files are large (100MB+ per season). Tips for speed: (1) Load only the seasons you need, (2) Use load_pbp() instead of build_nflfastR_pbp(), (3) Filter data early in your pipeline, (4) Use data.table or arrow for large datasets, (5) Consider caching processed data locally.

The nflfastR EPA model is well-calibrated and widely used by NFL teams and media. It is fit to historical play-by-play data and estimates the expected points for each game situation. Limitations include: (1) It doesn't know personnel or scheme, (2) It's based on league averages, not team-specific, (3) Special situations (weather, altitude) aren't modeled. Despite these limitations, it's among the best publicly available efficiency metrics.

nflfastR data is free for any use, including commercial. However, NFL trademarks and logos are protected. If you're building a commercial product, consult with a lawyer about intellectual property. Academic research and personal projects are generally fine. Always cite your data sources.

Learning Path

We recommend: (1) Start with Chapters 1-5 for foundations, (2) Learn EPA and efficiency metrics (Chapters 6-12), (3) Study defense and special teams (Chapters 13-21), (4) Explore game theory and strategy (Chapters 22-25), (5) Dive into advanced methods (Chapters 31-35) when comfortable. Our tutorials are designed to build on each other progressively.

The best way to learn is by doing. Start by reproducing analyses you see on Twitter/X or in articles. Participate in NFL Big Data Bowl competitions on Kaggle. Build your own models and test predictions. Share your work on social media - the analytics community is supportive and provides feedback.

Yes! The nflverse Discord is very active and welcoming to beginners. The r/NFLstatheads subreddit has good discussions. Twitter/X has a vibrant NFL analytics community - follow accounts like @benbbaldwin, @LeeSharpeNFL, and @naborforce to get started.

NFL teams do hire analytics staff, but positions are competitive. Build a strong portfolio of public work (blog posts, Twitter threads, Kaggle competitions). Network at conferences like the Sloan Sports Analytics Conference. Many analysts start in consulting, media, or other sports before moving to the NFL. Programming skills and communication ability are essential.

Still Have Questions?

Can't find what you're looking for? Join the nflverse Discord community where experts and beginners alike discuss NFL analytics.