# Using Regression to Get Causal Effects: Unconfoundedness: Causal Inference Bootcamp

[MUSIC]. So we’ve talked about how to use

regression to describe correlations in the data. But we really care about causal effects,

not necessarily just correlations, and we know that correlations don’t necessarily imply

causality. So one approach to learning about causality

using regressions is to, well, just assume that the correlations actually are causal

– that is, that the correlations do imply causality, by assumption. So why might this make sense? Let’s think

about our definition of causality. Remember that we take all the variables

that could possibly affect our outcome variable. And then we’ve got our treatment variable.

So we’ve got all these variables. We are going to hold everything constant except for the

treatment variable and then vary that. If we see that our outcome variable changes,

then there is a causal effect of the treatment on the outcome variable. So if we observed every single variable that

could affect outcomes, then we could just compare units that had the same values of

all these other variables over here, but different values of the treatment variable, and if their

outcomes changed then we know that there’s a causal effect. Now, of course in practice the problem is

that we usually do not literally observe every single relevant variable. And in fact

we know that this isn’t the case because if we did, we could just look at two units or

two people who had the same values of every variable including the treatment, and they

would have to have the same value of the outcome variable too if we’ve literally observed everything. But that never happens. So we know that in

social science there are some variables we’re not observing, that we’re missing. But if

we assume that we’ve observed enough variables, then when we compare people who have the

same values of these variables but different treatments, the treatment is as good

as randomly assigned. Maybe that’s the case. Let’s just suppose

that. Well, that’s called the Selection On Observables Assumption, or the Unconfoundedness

Assumption. Remember confounders happen when we don’t observe all the variables that are

relevant, and there are these unobserved confounding variables that are causing changes in the

outcome but that are correlated with the treatment variable, so that when we look at the correlation

between treatment and outcomes we see a correlation, but that might be picking up the confounders.

So if we don’t have any confounders by assumption, then that’s the Unconfoundedness Assumption. So the idea is basically just that we’ve measured

all possible variables that could have been confounders. So under

this assumption any correlation we see between treatment and outcomes – once we’ve held all

these confounders constant – is actually causal. So when we do a regression, we hold all the

other variables constant, and the correlation in the data between

the treatment and the outcome that we see through our regression analysis actually

is a causal effect under this assumption. So that’s all there is to using regression

to learn about causality. You just assume that we’ve observed enough variables that,

when you condition on them, treatment is as good as randomly assigned. [MUSIC].
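The logic above can be sketched in a short simulation (the setup here is illustrative, not from the video: a single confounder `z`, a true causal effect of 2.0). The naive regression of outcome on treatment is biased by the confounder, but once we hold `z` constant, as unconfoundedness licenses, the regression recovers the causal effect.

```python
# Illustrative sketch: z confounds the treatment d and the outcome y.
# The true causal effect of d on y is 2.0 by construction.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(size=n)                        # confounder (observed, by assumption)
d = z + rng.normal(size=n)                    # treatment depends on the confounder
y = 2.0 * d + 3.0 * z + rng.normal(size=n)    # true effect of d on y is 2.0

# Naive regression of y on d alone: the slope also picks up z's effect.
X_naive = np.column_stack([np.ones(n), d])
b_naive = np.linalg.lstsq(X_naive, y, rcond=None)[0]

# Regression that also holds z constant: consistent under unconfoundedness.
X_full = np.column_stack([np.ones(n), d, z])
b_full = np.linalg.lstsq(X_full, y, rcond=None)[0]

print(f"naive slope on d:        {b_naive[1]:.2f}")   # biased upward, about 3.5
print(f"slope on d, holding z:   {b_full[1]:.2f}")    # close to the true 2.0
```

Here the confounder is observed, so adding it to the regression removes the bias; the whole point of the assumption is that every such `z` is in our data.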
