Value Over Replacement Statistic: Analyzing ESPN’s Total QB Rating

Beau Brace
September 28, 2011

It is easy, says the conventional wisdom, for a fan to sit in his armchair and criticize his star quarterback’s mechanics. The presentation of professional football as theater tends to make critics of most people. Naturally, this criticism is not all qualitative speculation based on one’s idea of what, for example, a good defense does.

The point spread on a game, for example, represents an attempt to quantify the difference in the quality of two teams based on a projected margin of victory. Spreads, over/unders and their ilk have been around for some time, but they do not represent a bottom-up reimagining of football analysis.

Statistical analysis depends on discrete units of observation. The sabermetrics made famous in Michael Lewis’ seminal book Moneyball (and its film adaptation) depend largely on isolating an individual player and controlling the impact of all other players on the field at a given time. In baseball, where each pitch is essentially one set piece with extensive statistics kept and each player has a discrete task, it is easy to quantify an individual player’s worth to the team. So, a third baseman who is above average defensively and who is able to generate offense in a variety of ways and in important situations could be worth more than a defensively incompetent slugger.

In football, it is more difficult to isolate individual players. A quarterback throwing to mid-2000s Terrell Owens could count on excellent yards after the catch; a defensive end who plays opposite Julius Peppers could be expected to benefit from less attention. In both cases, the interconnectedness of the team works against would-be statisticians.

There are two logical ways to account for the team element. One is to look only at similar plays, determine what one would expect the typical offense – or a player in the typical offense – to gain, and then rank players and teams based on their value over average in a given situation. This method is a crude description of Football Outsiders’ excellent Defense-Adjusted Value Over Average (DVOA) stat. The other, slightly less statistically sound, way is to assume that one player (for example, the quarterback) is particularly crucial to a team’s success on offense and to attempt to better assess how well that player performs. Thus, we arrive at ESPN’s vaunted Total Quarterback Rating (TQBR).
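To make the first approach concrete, here is a minimal sketch of a value-over-average calculation. It is not Football Outsiders’ formula: it skips the defensive adjustment entirely, measures plays in raw yards, and uses an invented play log and invented situation buckets, all of which are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical play log: (quarterback, down, yards_to_go, yards_gained).
# Every name and number here is invented for illustration.
PLAYS = [
    ("QB A", 1, 10, 6),
    ("QB A", 3, 2, 3),
    ("QB A", 3, 8, 4),
    ("QB B", 1, 10, 2),
    ("QB B", 3, 2, 1),
    ("QB B", 3, 8, 12),
]


def situation(down, to_go):
    """Bucket a play so that only similar plays are compared (assumed buckets)."""
    distance = "short" if to_go <= 3 else "long"
    return (down, distance)


def situation_averages(plays):
    """Average yards gained by all quarterbacks in each situation."""
    totals = defaultdict(lambda: [0, 0])  # situation -> [yards, play count]
    for _, down, to_go, gained in plays:
        bucket = totals[situation(down, to_go)]
        bucket[0] += gained
        bucket[1] += 1
    return {key: yards / count for key, (yards, count) in totals.items()}


def value_over_average(plays):
    """Mean yards per play above (or below) the average for that situation."""
    averages = situation_averages(plays)
    diffs = defaultdict(list)
    for qb, down, to_go, gained in plays:
        diffs[qb].append(gained - averages[situation(down, to_go)])
    return {qb: sum(d) / len(d) for qb, d in diffs.items()}
```

With these invented numbers, calling value_over_average(PLAYS) puts QB B about a third of a yard per play above average and QB A the same amount below, even though both gained 13 and 15 total yards respectively on identical situations. The ranking comes from comparing each play only to its own kind.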

ESPN, in an unsurprising feat of self-promotion, declared 2011 to be the “Year of the Quarterback.” Whether or not this was just an ex post facto pretext for it to fawn over its favorite stars more than usual is of secondary importance to the fact that the network seemed to be up to something slightly different. ESPN has quietly embraced advanced sports analytics in certain contexts. Seeing Buster Olney with a text box listing a player’s WAR (wins above replacement player) onscreen was jarring for anyone who has heard baseball lifers bemoan what they perceive to be an intrusion of geeks into a jock’s game. When the self-anointed Worldwide Leader in Sports deigned to commission a reimagining of the flawed Quarterback Rating, it was not terribly surprising: ESPN was executing an elegant coup to capture both statisticians and those interested in new and innovative ways to argue about which quarterback is better than all the others.

TQBR was revealed and discussed in a special on ESPN. The statistic attempts to quantify expected points from a pass on a given play, and adjusts these based on win probability using the condescendingly named Clutch Index. TQBR also accounts for overthrows, underthrows and a variety of other factors that affect a quarterback’s expected points on a given play by dividing credit for the play using a variety of methods. The statistic also, apparently, takes into account key plays the quarterback makes as the leader of the offense (e.g. recovering a fumble on a non-pass play). A fuller discussion of the statistic is available on ESPN’s website. TQBR, unlike DVOA, is not adjusted for opponents’ strength and is normalized on a 0-100 scale, with a score of 50 representing an average play.

TQBR does adjust for some important elements, such as the yards a pass covers in the air versus yards after the catch, a factor the old quarterback rating ignores. ESPN should also get credit for at least attempting to use the breadth of its reach to drive innovation in sports analysis. Effort, though, is only part of the equation.

What sticks out most in an analysis of TQBR is the Clutch Index. The CI modifies expected points on the premise that a third-and-goal from the four-yard line down six points with nine seconds to go is generally more valuable than a play from midfield, ceteris paribus. At face value, the CI inflates or deflates a quarterback’s rating based on the contribution of his leadership and skills to a team’s success. That said, it would appear that the statistic fails to take into account, for example, an elective safety that a quarterback takes to better his team’s field position late in the game, or a two-yard pass completed in bounds to take time off the clock late in the fourth quarter.

Alex Koenig of Harvard University’s Sports Analysis Collective (HSAC) also identifies the Clutch Index as being problematic.

“The Clutch Index is extremely oversimplified. It doesn’t necessarily indicate how good a quarterback is. And that’s been the problem with the quarterback rating to start with. It doesn’t make a lot of sense to me.”

Koenig said the power of the CI as a direct multiplier to expected points has flaws that could make a particularly putrid outing look respectable, or vice versa, depending on a variety of factors. Yet another problem arises in games where an offense dominates an opponent. Even the quarterback of a team that won 61-10 with five passing touchdowns could, conceivably, have a lower game TQBR than another quarterback who scored two touchdowns in the closing minutes to win a game despite a mediocre outing with interceptions and ineffective play.
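Koenig’s objection can be illustrated with a toy calculation. The sketch below treats the Clutch Index as a simple per-play multiplier on expected points added, which is how it is described above; the play values and clutch weights are invented for illustration and are not ESPN’s actual figures or formula.

```python
# Hypothetical (expected_points_added, clutch_weight) pairs, one per play.
# The blowout quarterback plays well all game, but most of his plays come
# after the outcome is settled, so they carry low (assumed) clutch weights.
BLOWOUT = [(2.0, 1.0), (2.0, 0.5), (2.0, 0.3), (2.0, 0.3), (2.0, 0.3)]

# The comeback quarterback plays poorly early, then makes two big plays
# in the closing minutes, which carry high (assumed) clutch weights.
COMEBACK = [(-1.5, 1.0), (-1.5, 1.0), (0.5, 1.0), (3.0, 2.5), (3.0, 2.5)]


def raw_total(plays):
    """Sum of expected points added, ignoring game situation."""
    return sum(epa for epa, _ in plays)


def clutch_weighted_total(plays):
    """Sum of expected points added, each play scaled by its clutch weight."""
    return sum(epa * weight for epa, weight in plays)
```

On raw expected points, the blowout quarterback is far ahead (10.0 to 3.5). Once each play is multiplied by its clutch weight, the order flips (roughly 4.8 to 12.5). The weights alone are enough to turn the weaker outing into the stronger one.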

This disconnect speaks to the fundamental question with rating a quarterback: is the most important statistic the win? Should Tom Brady’s dominant Week 2 performance in a romp against the Chargers (TQBR: 91.1) be considered better than Ryan Fitzpatrick’s inspired play in a comeback against the Raiders (TQBR: 82.2)? The answer is, of course, open to question.

A look at the 2008-2010 TQBR tables shows the usual suspects at the top of the list. Peyton Manning owns the top two spots, with his 2009 campaign ranking first among all quarterbacks with a season TQBR of 82.3. Michael Vick’s entertaining 2010 season ranks 18th and, fittingly, underscores the divide between statistical analysis of a player’s performance and qualitative analysis. It is easy to look at Michael Vick’s performance and say that he is truly great, but, while his season TQBR of 66.6 suggests he would be a Pro Bowler, it does not put him among the likes of Brady, Manning and Drew Brees.

It is easy to pick holes in a statistic from an outsider’s perspective, but it is hard not to give ESPN credit for at least trying. As Koenig said, “It’s good that we’re even having this conversation now.”

It’s easy to accuse ESPN of oversimplifying the statistic for the sake of allowing it to appeal to a wide audience. But according to HSAC president John Ezekowitz, this isn’t necessarily a bad thing.

“The most important thing a statistician can do is to craft his message to appeal to those without a stats background,” he said.

Indeed, ESPN’s publishing its methodology in language that is easily understandable is a significant step in the right direction. ESPN should be lauded for its role in appealing to those who eschew the raw numbers of Football Outsiders’ statistics in favor of cleaner, simpler approaches. However, one hopes the network will assume its viewers understand more.

It is easy for armchair statisticians to sit back and criticize TQBR for its flaws. It is more difficult to truly drive innovation. But, by giving exposure to a new statistic, ESPN could inspire the next generation of sports analysts. Perhaps the Worldwide Leader is, indeed, leading.

