Using Regression to Get Causal Effects: Unconfoundedness: Causal Inference Bootcamp

[MUSIC]. So we’ve talked about how to use
regression to describe correlations in the data. But we really care about causal effects,
not necessarily just correlations, and we know that correlations don’t necessarily imply
causality. So one approach to learning about causality
using regressions is to, well, just assume that the correlations actually are causal
– that is, that the correlations do imply causality, by assumption. So why might this make sense? Let’s think
about our definition of causality. Let’s remember that we say that we take all the variables
that could possibly affect our outcome variable. And then we’ve got our treatment variable.
So we’ve got all these variables. We are going to hold everything constant except for the
treatment variable and then vary that. If we see that our outcome variable changes,
then there is a causal effect of the treatment on the outcome variable. So if we observed every single variable that
could affect outcomes, then we could just compare units that had the same values of
all these other variables over here, but different values of the treatment variable, and if their
outcomes changed, then we know that there's a causal effect. Now, of course, in practice the problem is
that we usually do not literally observe every single relevant variable. In fact,
we know this isn't the case, because if we did, we could just look at two units or
two people who had the same values of every variable, including the treatment, and they
would have to have the same value of the outcome variable too, if we've literally observed everything. But that never happens. So we know that in
social science there are some variables we're not observing, that we're missing. But suppose
we assume that we've observed enough variables so that when we compare people who had the
same values of these variables but different treatments, the treatment was as good
as randomly assigned. Maybe that's the case. Let's just suppose
that. Well that’s called the Selection On Observables Assumption or the Unconfoundness
Assumption. Remember confounders happen when we don’t observe all the variables that are
relevant, and there are these unobserved confounding variables that are causing changes in the
outcome but that are correlated with the treatment variable, so that when we look at the correlation
between treatment and outcomes we see a correlation, but that might be picking up the confounders.
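To see the problem concretely, here's a toy simulation (my own illustration, not from the lecture): a single confounder z drives both the treatment t and the outcome y, so the naive slope of y on t overstates the true causal effect of 2.0 that we built into the simulated data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers are arbitrary choices):
# z is a confounder that drives both the treatment and the outcome.
z = rng.normal(size=n)
t = 0.8 * z + rng.normal(size=n)             # treatment is correlated with z
y = 2.0 * t + 3.0 * z + rng.normal(size=n)   # true causal effect of t is 2.0

# Naive regression slope of y on t, ignoring z:
naive = np.cov(t, y)[0, 1] / np.var(t)
print(naive)  # well above the true effect of 2.0 -- it picks up z's effect too
```

The naive slope is biased upward here because z pushes t and y in the same direction; the correlation between treatment and outcome is partly z's doing.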
So if we don’t have any confounders by assumption, then that’s the Unconfoundedness Assumption. So the idea is basically just that we’ve measured
all possible variables that could have been confounders and we observed them. So under
this assumption any correlation we see between treatment and outcomes – once we’ve held all
these confounders constant – is actually causal. So when we do a regression, we hold all the
other variables constant and look at the correlation in the data between
the treatment and the outcome; under this assumption, the correlation we see through our regression analysis
actually is a causal effect. So that's all there is to it: to use regression
to learn about causality, you just assume that we've observed enough variables that,
once you condition on them, treatment is as good as randomly assigned. [MUSIC].
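As a sketch of how this plays out in practice, here's a toy simulation of my own (the data-generating process and its coefficients are illustrative assumptions, not from the lecture): when the confounder z is observed and included in the regression, the coefficient on the treatment recovers the true causal effect, while leaving z out gives a biased answer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical setup: z is an *observed* confounder, t the treatment.
z = rng.normal(size=n)
t = 0.8 * z + rng.normal(size=n)             # treatment is correlated with z
y = 2.0 * t + 3.0 * z + rng.normal(size=n)   # true causal effect of t is 2.0

# Regression of y on t alone vs. y on t holding z constant:
X_naive = np.column_stack([np.ones(n), t])
X_full = np.column_stack([np.ones(n), t, z])
b_naive = np.linalg.lstsq(X_naive, y, rcond=None)[0]
b_full = np.linalg.lstsq(X_full, y, rcond=None)[0]

print(b_naive[1])  # biased: absorbs z's effect on y
print(b_full[1])   # close to the true causal effect, 2.0
```

This is the selection-on-observables assumption in action: once every confounder is in the regression, the remaining variation in treatment is as good as random, so the coefficient on t has a causal interpretation.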
