One of the biggest misconceptions about data is that a correlation
implies a meaningful relationship. We are quick to assume that when two variables seem related (e.g. a rise in temperature coinciding with a
rise in the murder rate), one must cause the other. This is all too common!
Oftentimes the relationship is merely coincidental; if
you look at enough combinations of variables, eventually (by chance)
you will find a few that happen to correlate. Other
times, the two variables really are related, but a
“confounding” variable is the true “cause”.
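To see the coincidence effect in action, here is a minimal sketch in Python (my own illustration, not taken from any real data set). It generates columns of pure random noise, so no variable has anything to do with any other, and then reports the strongest correlation found across all pairs. The function names and the parameters `n_variables`, `n_points`, and `seed` are hypothetical choices made for this example.

```python
import itertools
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_variables=40, n_points=10, seed=0):
    """Return (|r|, i, j) for the most strongly correlated pair of noise columns."""
    rng = random.Random(seed)
    # Every variable is independent Gaussian noise: there is no real relationship.
    variables = [
        [rng.gauss(0, 1) for _ in range(n_points)] for _ in range(n_variables)
    ]
    pairs = itertools.combinations(range(n_variables), 2)
    return max((abs(pearson(variables[i], variables[j])), i, j) for i, j in pairs)


r, i, j = strongest_chance_correlation()
print(f"strongest |r| among unrelated variables: {r:.2f} (variables {i} and {j})")
```

With 40 unrelated variables there are 780 pairs to compare, so a strong-looking correlation tends to turn up even though every column is noise. The exact value depends on the seed.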
As a made-up example, you might find a negative
relationship between years of education and life expectancy, implying that
the more schooling you attend, the younger you will die, when in fact the confounding factor may be that those with more education, such as doctors, tend to work
longer hours and carry more stress in their careers.
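Confounding can be sketched with simulated data too. In the hedged example below (again my own illustration, with made-up numbers), a hidden variable `z` pushes `x` up and `y` down, while `x` and `y` never influence each other directly. The raw correlation between `x` and `y` looks strong, but it is expected to shrink toward zero once `z` is accounted for using the standard first-order partial correlation formula. The names `x`, `y`, `z`, and `partial_correlation` are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs (first-order partial)."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


rng = random.Random(1)
n = 2000

# z is the confounder; x and y each depend on z plus their own noise,
# but neither one has any direct effect on the other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [-zi + rng.gauss(0, 0.5) for zi in z]

raw = pearson(x, y)
adjusted = partial_correlation(x, y, z)
print(f"raw correlation of x and y:       {raw:.2f}")
print(f"after controlling for confounder: {adjusted:.2f}")
```

The catch in practice is that you can only control for a confounder you have thought of and measured, which is exactly why a raw correlation on its own proves so little.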
Although my favorite is #10, this list brings together a multitude
of examples demonstrating why you should never infer a meaningful
relationship from a graph alone, or from a suggested relationship in general!
P.S. Make sure to Google the “flying spaghetti monster”.
Anyone have any good examples of implied causation by correlation?