Correlation and Causation: A new look

26 06 2008

Correlation does not imply causation. You commit that logical fallacy whenever you draw a cause-and-effect conclusion from nothing more than a correlation, whether the correlation turns up in your observation of the world, in your data, or in your study findings.
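To make the fallacy concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the variable names (temperature, ice cream sales, sunburns), the noise levels, and the sample size are my assumptions, not data from any study. A hidden factor drives two quantities that never influence each other, yet the two come out strongly correlated.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Hypothetical data: temperature drives both other series."""
    rng = random.Random(seed)
    temperature = [rng.gauss(0, 1) for _ in range(n)]
    # Neither series depends on the other, only on temperature plus noise.
    ice_cream = [t + rng.gauss(0, 0.5) for t in temperature]
    sunburns = [t + rng.gauss(0, 0.5) for t in temperature]
    return temperature, ice_cream, sunburns


temperature, ice_cream, sunburns = simulate()

# Raw correlation between two variables with no causal link between them.
r_raw = pearson(ice_cream, sunburns)

# Remove the hidden cause. The simulation uses a coefficient of 1 for
# temperature, so subtracting it leaves only the independent noise.
resid_ice = [i - t for i, t in zip(ice_cream, temperature)]
resid_burn = [s - t for s, t in zip(sunburns, temperature)]
r_controlled = pearson(resid_ice, resid_burn)

print(f"raw correlation:        {r_raw:.2f}")
print(f"after removing the cause: {r_controlled:.2f}")
```

The raw correlation should come out strongly positive and the controlled one close to zero. In real data you rarely know the hidden factor, let alone its exact coefficient, which is why a correlation on its own cannot settle the question.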

Chris Anderson, editor-in-chief of Wired and author of The Long Tail, has argued that the deluge of data makes the scientific method obsolete: that correlation is sufficient in the Petabyte Age, as he calls it, and that the pursuit of causation only leads us to realize that we understand even less about the world.

Jackson West rightly points out the dangers of throwing the scientific method under the bus for the sake of applied mathematics and relegating the creation of models of the world to high and mighty, “pansy” theoreticians. And he’s generally right to be critical of a loss of strict rationality and too much focus on faith.

Data can be manipulated to show what you want. Not always with evil intent, but the methodology and the very collection of the data can affect the outcome. You can get what you ask for or, as Mr. West concludes, perhaps what you deserve.

Take it further and a strict reliance on data leaves room for dangerous regimes. Tyranny of the majority, self-fulfilling prophecies: these are the dangers that arise when strict rationality goes out the window.

Yet Anderson is certainly not suggesting this, only pointing out a real observation about the world in which we live. And yes, applied mathematics can be profited from; theoretical mathematics, not so much. Ian Ayres, a Yale professor, is the author of Super Crunchers, whose very subtitle argues that number crunching is the way to be smart.

The examples are largely the same: information (Google), airline fare predictions (Farecast), legal discovery, predicting the outcome of an election. The last example is where manipulation is easiest to imagine.

Oh well, so be it. I’ve written that blind faith is basically idiotic. A strict reliance on numbers may not establish causation, but perhaps correlation is enough. And for those who operate in a gray area where luck plays a part, where luck is better described as a set of circumstances beyond your control, Anderson’s piece could be comforting.

But more importantly, it outlines a real path to being smart and succeeding. And so you might be wise to heed the words that correlation is enough, and that correlation supersedes causation in a world dictated by petabytes of information ready to be crunched.




3 07 2008
