The regression toward the mean/moving target mindset makes it almost impossible for a judge to score a gymnast outrageously far from the “average” score on that apparatus, regardless of how outrageously good or bad the performance was. Mediocre routines are judged as if they are just that…mediocre. No really harsh deductions, and no really lenient ones. Conversely, near-perfect routines force the judges to bring out their MICROSCOPES and look for any possible minute error they can find…and sometimes even become compelled to make up deductions out of thin air (the 2008 Anna Pavlova Floor Tragedy is a perfect example). Horrible routines are judged as if they simply aren’t in the same league as the near-perfect routines, and thus very minor deductions are ignored and only obvious ones are taken – often leniently.
If you flip a coin six times, are you going to get 50% heads? Possibly…but you could also very easily get 4 heads and 2 tails, or even 5 tails and 1 head, which would make the final percentage of heads very different from the expected 50%. But what if you flipped the coin SIX HUNDRED TIMES? Regression toward the mean takes over, and the overall percentage of heads is going to end up very, very close to 50%. This is because the greater the sample size, the closer the observed percentage tends to sit to the expected one. Regression toward the mean has much more of an effect as the number of events increases. Unfortunately, this is exactly why the regression toward the mean phenomenon has much more power in today’s gymnastics than ever before.
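For anyone who wants to check the coin-flip arithmetic, here is a minimal Python sketch. It assumes a fair coin and independent flips, and the function names and the "within 10 percentage points of 50%" cutoff are my own illustrative choices, not anything from a judging manual. Strictly speaking, the textbook name for this particular convergence is the law of large numbers, but the numbers illustrate the same point about sample size.

```python
from fractions import Fraction
from math import comb


def prob_exact_heads(flips: int, heads: int) -> float:
    """Chance of exactly `heads` heads in `flips` tosses of a fair coin."""
    return comb(flips, heads) / 2 ** flips


def prob_share_near_half(flips: int, tolerance: Fraction = Fraction(1, 10)) -> float:
    """Chance that the share of heads lands within `tolerance` of 50%.

    The 10-point tolerance is an arbitrary, illustrative cutoff.
    Fractions keep the boundary comparison exact (no float rounding).
    """
    half = Fraction(1, 2)
    favorable = sum(
        comb(flips, heads)
        for heads in range(flips + 1)
        if abs(Fraction(heads, flips) - half) <= tolerance
    )
    return favorable / 2 ** flips


if __name__ == "__main__":
    print("4 heads, 2 tails in 6 flips:", prob_exact_heads(6, 4))
    print("1 head, 5 tails in 6 flips: ", prob_exact_heads(6, 1))
    print("Share within 40-60%, 6 flips:  ", prob_share_near_half(6))
    print("Share within 40-60%, 600 flips:", prob_share_near_half(600))
```

With six flips, the only way to land between 40% and 60% heads is exactly 3 heads, which happens 20 times out of 64. With six hundred flips, landing in that same band is a near certainty.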
Today’s gymnastics routines on both the men’s and women’s sides are longer and more skill-loaded than they have ever been. On one hand, it seems that this should create a WIDER range of scores than ever before because there are so many more skills by which to separate the gymnasts. With routines often twice as long as they once were and with so many more difficult skills than ever before, shouldn’t we be seeing scores range from around a 6.2 to a 9.8? Instead, however, in most competitions we’re seeing a much narrower range than ever before…from a 7.6 to an 8.8, for example. WHY? Because the larger number of skills represents a larger sample size, and more opportunities for the regression toward the mean phenomenon to take over. In other words, the more instances of “judging” that take place in a routine, the more areas the judge has to manipulate the score (whether consciously or subconsciously) and bring it closer to average.
Is it possible to change the regression toward the mean mindset in gymnastics judging? Well, addressing it is at least a great first step. This phenomenon is actually exactly why I suggested the radical idea of giving “general impression” execution scores in my proposed code of points last year. I feel that allowing the judges to simply sit back and watch the routine would actually make them MORE accurate in evaluating relative qualities of execution among various performances. It would allow them to use their brains and look at the overall picture rather than have their pen and paper taken over subconsciously by the regression toward the mean/moving target phenomena.
Perhaps making judges more aware that this phenomenon exists might help them evaluate their own judging and become more honest with themselves about how much sense their scores actually make when the chalk dust settles. Perhaps judges DO need to be reminded from time to time that it’s OKAY to give extreme scores for extreme performances, and that in fact that’s exactly what the judges are there to decide. We mustn’t forget that the whole point of judging gymnastics routines is to see where all the performances fall in relation to an absolute standard, not to create some “moving target” that attempts to make them all fall as close as possible to an arbitrary “mean” that we create in our minds.
I know it will probably be a while before we see any drastic changes to either the code of points, the competition formats, the age requirements, or the nature of judging altogether…but it is an important goal of mine to bring issues to the forefront that I often feel are neglected, misunderstood, or simply overlooked. The paradoxical battles that occur between intentions and reality in gymnastics today are universally obvious, but the sources of these conflicts are not always so conspicuous. If we are to bring artistry, fairness, and the connection with the fans back into gymnastics, we MUST address the issue of whether the perfect ten is truly being used as the standard for execution…even if it is one baby step at a time.
All things in moderation, right?
General impression is a very bad idea. The sport has a well-known history of national cheating in judging. Introducing more subjectivity opens the door for the criminals and will draw increased criticism from those who follow normal sports.
Instead, publish the exact details of which judge scored which routine, and how. They are supposed to follow the code. So fine. Be transparent. If they are judging imperfectly, no big deal, but show what happened. And I BET they will become more consistent and better if under more scrutiny (not perfect, but noticeably better).