In the most recent episode of FanGraphs Audio, host Carson Cistulli speculated on the merits and possibilities of illustrating, graphically and with all emphasis on brevity, the very concept of “regression to the mean”. Below, in at least 50% graphical format, is a chart meant to convey this idea.

The problem, as discussed, is that fans of Team Y or Player X will assume a year of poor performance is an outlier while an especially good year is representative of true talent level. What this chart shows is that, as with most things, reality is more complicated than that: a player can be both better than their worst year and worse than their best year, and assuming as much is healthy.

Michael "OK at Baseball" Saunders

A graphical representation of the concept of regression to the mean, as illustrated using Seattle Mariners CF Michael Saunders’ 2013 ZiPS projections

To illustrate this concept I chose Seattle Mariners CF Michael Saunders, simply because

A) the comments on the FanGraphs ZiPS projections for the Mariners featured an especially dimwitted response about Saunders’ 2012 performance compared against his 2013 projection

and

B) I can think of no things in god’s creation worth getting upset about less than the projected 2013 line of Seattle CF Michael Saunders, who as Szymborski noted could reasonably be expected to perform within a margin of error of +/- 10% of this line, based on, one would assume, circumstances.

and, finally,

C) compared to his performance over 4 years of data, Saunders’ 2013 projection reveals a trend of general improvement: close to, but not exactly at, the level of the previous year, and certainly ahead of, say, the year before.

For these reasons, Saunders is a good representation of ‘regression to the mean’, because his case shows that the mean isn’t simply the average across X time, but the average across X time with each year weighted so that the most recent performance counts for more. And also because once you stack things against 4 years of data you’re no longer comparing A to B, but A to B, C, and D, and having more to reference is hardly a point to complain about.
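For the arithmetically inclined, here is a minimal sketch of that weighted-average idea in Python. To be loud about the assumptions: the four season values are made-up placeholders and not Saunders’ real numbers, the 1-2-3-4 weights are hypothetical, and none of this is how ZiPS actually works. It only shows how weighting recent years more heavily pulls the result toward the latest season without ever reaching it.

```python
def weighted_mean(values, weights):
    """Average of values, each counted in proportion to its weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Placeholder numbers for some rate stat, oldest season first.
# These are invented for illustration and are NOT Saunders' real stats.
seasons = [0.280, 0.270, 0.250, 0.330]

# Hypothetical weights: the most recent season counts four times
# as much as the oldest one. ZiPS's real weighting is not known here.
recency_weights = [1, 2, 3, 4]

plain = sum(seasons) / len(seasons)
weighted = weighted_mean(seasons, recency_weights)

print(f"plain average:    {plain:.4f}")
print(f"weighted average: {weighted:.4f}")
print(f"latest season:    {seasons[-1]:.4f}")
```

With these placeholder values, the weighted figure lands above the plain average and above the second-most-recent season, but below the most recent one, which is the shape described in point C.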
