【Read in 3 Minutes!】Comprehensive Explanation of Correlation, Causation, and Spurious Correlation! We'll also explain an easy-to-understand method for detecting causation!

Sunday, February 4, 2024



Statistics has become increasingly common in academic research, policy, and business in recent years. Do you have a clear understanding of correlation and causation? This article explains both in an easy-to-understand way and highlights the points you need to watch in order to avoid spurious correlation and draw reliable causal conclusions!


Difference between Correlation and Causation


Correlation:

Correlation indicates that two variables are statistically related, meaning they tend to change together. By itself, it says nothing about whether one influences the other. For example, data might show that "umbrella sales increase when it rains." The correlation only tells us that the two move together. Concluding that rain is what drives the sales takes additional reasoning beyond the numbers (here it is easy, since selling umbrellas obviously cannot make it rain). A correlation can appear whether or not a causal relationship exists, so the data alone cannot tell us which case we are in.

Causation:

Causation exists when one variable directly influences another, establishing a cause-and-effect relationship. For instance, the statement "the sun rises, causing daylight" represents causation. The rising of the sun is the cause, and daylight is the effect.

In summary, correlation does not necessarily imply causation. Even if statistical data shows correlation, it doesn't directly indicate a causal relationship. It's crucial to be aware of this distinction.

Spurious Correlation


One thing to be particularly cautious about when inferring causation is spurious correlation. Spurious correlation is a phenomenon where two variables appear to be correlated because of unseen factors, even though neither one causes the other. Let's illustrate this with an example:

Example: Ice Cream Sales and Drowning Incidents

In summer, ice cream sales increase, and at the same time, the number of drowning incidents rises. Looking only at these two numbers, it might seem that ice cream sales and drowning incidents are linked. In reality, during hot summer weather, people buy more ice cream and also spend more time in pools or at the beach. The common factor here is "temperature," and if we ignore its impact, a spurious correlation emerges. In such cases, the apparent correlation does not reflect a causal relationship between the two variables; both are being influenced by another factor.
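
To make this concrete, here is a minimal simulation sketch of the ice cream example. All of the numbers are made up for illustration (the temperature range, the coefficients, and the noise levels are assumptions, not real data). Temperature drives both series, and neither series affects the other. The sketch then compares the correlation over all days with the correlation over days that have nearly the same temperature.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data (hypothetical numbers): temperature drives both series,
# and neither series has any effect on the other.
temperature = [rng.uniform(10, 35) for _ in range(5000)]
ice_cream_sales = [20 * t + rng.gauss(0, 60) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]  # arbitrary units

r_all_days = pearson(ice_cream_sales, drownings)

# Hold temperature roughly constant: keep only days between 24 and 26 degrees.
similar_days = [
    (s, d)
    for t, s, d in zip(temperature, ice_cream_sales, drownings)
    if 24 <= t <= 26
]
r_similar_days = pearson(
    [s for s, _ in similar_days],
    [d for _, d in similar_days],
)

print(f"all days:          r = {r_all_days:.2f}")
print(f"24-26 degree days: r = {r_similar_days:.2f}")
```

Over all days the correlation should come out strongly positive, while among days with similar temperatures it should be close to zero. That is the pattern you would expect when the link between the two variables comes from a common factor.
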

Causes of Spurious Correlation

Spurious correlation generally arises when something outside the two variables shapes the data and makes it look as if they are related. The main causes include:

    Confounding Factors:

    When a third factor influences both variables, it can make them move together and cause spurious correlation. For example, if there is a high correlation between ice cream sales and drowning incidents, temperature may be acting as a confounding factor.

    Sample Size Influence:

    With a small sample, two unrelated variables can appear correlated purely by chance. Such correlations are unreliable and tend to disappear when a larger dataset is used.

    Observation Bias:

    If some kinds of observations are more likely to be recorded or selected than others, the data become biased, and that bias can result in spurious correlation.

By carefully considering these factors, we can avoid being misled by spurious correlation and interpret causation appropriately.
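
The effect of sample size can be checked with a small experiment. The sketch below generates two variables that are independent by construction, many times over, and counts how often their correlation still looks "strong." The threshold of 0.5, the sample sizes, and the number of trials are arbitrary choices for illustration.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def share_of_strong_correlations(sample_size, trials=1000, threshold=0.5, seed=0):
    """Fraction of trials where two independent samples have |r| > threshold."""
    rng = random.Random(seed)
    strong = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]  # independent of xs
        if abs(pearson(xs, ys)) > threshold:
            strong += 1
    return strong / trials

shares = {n: share_of_strong_correlations(n) for n in (5, 50, 500)}
for n, share in shares.items():
    print(f"sample size {n:>3}: {share:.1%} of trials show |r| > 0.5")
```

With only 5 data points, a sizeable share of trials should show a "strong" correlation even though the variables are unrelated. With 50 or 500 points, this should almost never happen.
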

Ensuring Accurate Causation


To draw accurate conclusions about causation, it's crucial to keep the following in mind:

    Consider the Background of the Data:

    Consider not only the results of the investigation or data but also the background and context. Changes in specific conditions or environments may be influencing the data.

    Consider Other Factors:

    Check for factors other than the two things under consideration that might be influencing the outcome. Ignoring other factors can lead to apparent causation that is not actual causation.

    Look at Surrounding Data:

    Examining other data or research results on the same theme makes it easier to check whether a suspected causal relationship holds up.

    Ensure an Adequate Sample Size:

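    It's difficult to judge causation reliably without sufficient data. A large dataset reduces the risk of coincidental correlations, although size alone does not remove the influence of confounding factors.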

    Utilize Common Sense and Domain Knowledge:

    Applying knowledge of the field and common sense can be beneficial. Domain knowledge helps you identify which factors can plausibly influence the outcome, and in which direction.
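
One simple way to put "Consider Other Factors" into practice, when the suspected factor has actually been measured, is to remove its linear effect from both variables and correlate what is left (often called a partial correlation). The sketch below is a rough illustration on synthetic data, not a complete method. The helper names `residuals` and `partial_correlation` are made up for this example, and the approach only accounts for a linear effect of one measured factor. Factors that were never measured remain a risk.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (
        sum((z - mz) * (y - my) for z, y in zip(zs, ys))
        / sum((z - mz) ** 2 for z in zs)
    )
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]

def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    return pearson(residuals(xs, zs), residuals(ys, zs))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: z is the "other factor"; x and y each depend on z only.
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_raw = pearson(x, y)
r_partial = partial_correlation(x, y, z)

print(f"raw correlation:            r = {r_raw:.2f}")
print(f"after adjusting for factor: r = {r_partial:.2f}")
```

If the adjusted correlation drops to around zero, as it should here, the original correlation was most likely produced by the other factor. If it stays high, that factor alone does not explain the relationship, though this still does not prove causation.
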

Conclusion


I hope this article has clarified the difference between correlation and causation, along with the points to watch when inferring causation. If it has deepened your understanding even a little, I'm delighted.

Thank you for reading until the end!
