Image credit: SkipsterUK (CC BY-NC-ND 2.0)
Preamble
In the previous post in this series, I explained how to use causal diagrams to set up multivariate regressions so that statistical confounding is eliminated.
In this post, I'll give a short, simple example of a case where statistical confounding can't be eliminated because an important variable is unavailable. This problem, known as omitted variable bias, can't be fixed after the fact, and it is bound to happen sometimes in observational statistical analyses, because there are influencing variables that we don't anticipate and therefore don't collect.
Here's the entire 'statistical confounding' series:
Part 1: Statistical confounding: why it matters
- on the many ways that confounding affects statistical analyses.
Part 2: Simpson's Paradox: extreme statistical confounding
- understanding how statistical confounding can cause you to draw exactly the wrong conclusion.
Part 3: Linear regression is trickier than you think
- a discussion of multivariate linear regression models.
Part 4: A gentle introduction to causal diagrams
- a causal analysis of fake data relating COVID-19 incidence to wearing protective glasses.
Part 5: How to eliminate confounding in multivariate regression
- how to use causal analysis to eliminate confounding in your regression analyses.
Part 6: A simple example of omitted variable bias (this post)
- an example of statistical confounding that can't be fixed, using only 4 variables.