Tuesday, December 27

The importance of "selection bias" in statistics

The importance of "selection bias" in statistics:
During WWII, statistician Abraham Wald was asked to help the British decide where to add armor to their bombers. After analyzing the records, he recommended adding more armor to the places where there was no damage! This seems backward at first, but Wald realized his data came from bombers that survived. That is, the British were only able to analyze the bombers that returned to England; those that were shot down over enemy territory were not part of their sample. These bombers’ wounds showed where they could afford to be hit. Said another way, the undamaged areas on the survivors showed where the lost planes must have been hit because the planes hit in those areas did not return from their missions. Wald assumed that the bullets were fired randomly, that no one could accurately aim for a particular part of the bomber. Instead they aimed in the general direction of the plane and sometimes got lucky. So, for example, if Wald saw that more bombers in his sample had bullet holes in the middle of the wings, he did not conclude that Nazis liked to aim for the middle of wings. He assumed that there must have been about as many bombers with bullet holes in every other part of the plane but that those with holes elsewhere were not part of his sample because they had been shot down.
I suppose one could argue about whether the bullets were fired randomly, but the analytic principle involved is still an important one for students to learn. Text credit John D. Cook at The Endeavor, via Neatorama; image via Mother Jones.

No comments:

Post a Comment