We are now multiple decades into the sabermetric “revolution,” and it is still unclear how best to determine how good a pitcher is. Metrics have come into and then gone out of vogue, but there isn’t a consistent approach to these measurements the way there has been for offensive statistics. The earliest and most basic pitching stats, such as ERA and wins, assumed that everything was within a pitcher’s control and penalized or credited him based solely on the scoreboard outcome. Then, defense-independent pitching statistics (DIPS) theory assumed that pitchers had no control over what occurred once a ball left the bat. Finally, the most recent iterations of pitching stats assume that both pitcher skill and luck play some part in batted-ball outcomes.
Baseball Prospectus’s DRA is the best and most recent attempt to evaluate pitchers. As its name (“Deserved Run Average”) suggests, however, it has a backward-looking component. It attempts to apportion credit and blame for what happened while a pitcher was on the mound. Context-based Fielding Independent Pitching (cFIP), which is another recent BP statistic, attempts to measure only true talent. (A summary of DRA and cFIP is available here.) Both are valuable resources, but they measure pitcher effectiveness in different ways.
Chase Anderson was good last year by any publicly available measurement, though. His ERA was 2.74, his FIP was 3.59, his cFIP was 93 (where 95-to-105 is average, and a lower number is better), and his DRA- was 81.9 (where 100 is average, and a lower number is better). These were all career-best marks, so there were two possible interpretations: either Anderson had developed and taken a step forward, or 2017 was one of those unrepeatable career years that players occasionally have.
In 2018, Anderson has not been as good. All of his cumulative numbers have regressed, and both his DRA- and cFIP are worse than his career averages. The public pitching stats, forward-looking and backward-looking alike, agree that he has not been good this year, just as they all agreed that he was good last season.
It is this conundrum I find the most interesting. In broad strokes, we can break pitching metrics down into two categories: forward-looking and backward-looking. FIP and cFIP are prospective, while ERA is retrospective. DRA is somewhere between the two, but seeks to explain past performance. And both sets believed Anderson was good last year in terms of underlying performance and run prevention.
But that performance has not carried over into this year. The traditional regression examples are pitchers who have good ERAs but bad cFIPs or DRAs, which indicates that they just got lucky and there was no uptick in performance. Those pitchers are expected to not be as good the next year. By contrast, good peripherals (as taken into account by cFIP) that match good run prevention numbers are supposed to indicate that someone is able to sustain that performance going forward. Anderson breaks that model, though.
There are possible explanations for this that don’t require metrics to have missed. Anderson could be pitching while hurt this year, or he could have made a mechanical adjustment that has not worked. But we don’t know whether that has occurred, so I am assuming he isn’t dealing with any physical issues beyond the general fatigue we expect big league players to battle through. He could also just be an outlier in these numbers; a sample size of one is insufficient to draw overarching conclusions about the validity of the stats, and that is not what I am attempting to do here.
Anderson provides the fulcrum for this discussion, but he is just a part of a larger question about how to measure pitcher performance. There has been increased focus on how accurate our defensive metrics are because they have not kept up with shift tendencies, but pitching metrics, although not seen as being as reliable as offensive numbers, have not received similar scrutiny. I don’t believe anyone is suggesting that the current pitching metrics are perfect, but I wonder what else is missing that can be incorporated to help fix this type of blind spot.
The most basic analysis of Anderson’s season last year would have been that it was just a career year and he was likely to return to being the type of fourth starter he had been previously. DRA, cFIP, and similar metrics provided possible justifications for Anderson having made substantive improvements that would carry forward. To this point, though, those possibilities have not panned out.
I do not intend to just point to pitchers whom public metrics cannot comprehend. Instead, Anderson demonstrates a specific phenomenon: he had a career year last season, but it was backed up by improved peripherals in a way that we don’t normally see in fluky performances. The statistics that try to remove luck from the equation thought that Anderson had improved, but, at least to this point, it appears he had not.