Instrumental Variables in Action: Education and Wages (graphs): Causal Inference Bootcamp

[MUSIC] What is the causal effect of education
on wages? If you go to school for an extra year, will you make $1,000 more? $10,000 more
or $100,000 more? Or maybe you’re going to lose money, maybe you’ll make less money,
because you’re just sitting around watching basketball games all the time. Who knows? To answer that question, a lot
of people have used something called Instrumental Variables Analysis. And to see how that works,
let’s look at an example where I’ve generated some fake data. So in my fake data, here I’ve
got the relationship between years of education and how much money you make. We see in the data that there is a positive
relationship, people with more education tend to make more money. Does that mean there’s
a causal effect of education on wages? No, not necessarily because education is not randomly
assigned, people choose how much education they get. So, what could be going on? Well, this positive
relationship might just be picking up an unobserved confounder. What’s one possible
unobserved confounder? Well, maybe your ability, your innate ability, deep down inside you.
So, over here I’ve plotted ability versus years of education, so people with higher
ability tend to get more education. Seems plausible. Maybe people of higher ability also tend to
make more money, because their ability, independent of how much schooling they have, directly affects
how effective they are at their job and makes them more productive for some reason. So,
if in your data you don’t observe this variable, this innate ability, then you can’t draw these
plots. I can draw them because I’ve generated the fake data, which is why I generated the
fake data. But in real life you often don’t see the unobserved
confounders, that’s what makes them unobserved, so you don’t know if this kind of thing is
going on behind the scenes. So, in this situation, the unobserved ability could be
driving the observed correlation between education and hourly wage, rather than a true causal effect of education
on wage. So, what are we gonna do? That’s where the
instrument comes in. One instrument that has been proposed is how far a student’s home during high
school is from the nearest college or university. So it’s called the Distance to
College Instrument. So why would this make sense? Well, if a student lives closer to
a university when they’re in high school then the benefits from going to university might
be more salient to them, they might recognize, oh, look at all these students who’ve gone
to university and they’re having a great time and they’re really getting a lot of benefits,
I should go too. The cost might be lower for them to attend
because they could attend university in their home town, they wouldn’t have to pay for an
apartment or travel out of state or somewhere else. So, this suggests that distance to college
has a causal effect on whether you actually go to college or not. So, it has a causal effect on how much education
you get overall, that was one of our assumptions from before. Another thing you might think
about this instrument is that it’s not necessarily related to ability. Okay, so why would high-ability
people live closer to universities? If people were just born randomly across the country,
some people have high ability, some people have low ability, then there’s no reason to
expect that ability will be related to distance to college. Now, that was another assumption we needed,
and in practice this is somewhat contentious, and usually is the most contentious assumption
in an instrumental variables analysis. Because some people have argued that, well, universities
often have university faculty or other staff members living near them and those people
may have some innate ability that they pass on to their children who then live near a
university, and that would mean that actually living close to the university is correlated
with having high ability, and that would be a problem for using distance to college as
an instrument. But putting that issue aside, we’re just going to go with it for now for
the sake of illustration. So the final assumption that we need for instrumental
variables analysis is that there is no direct effect of the instrument on the outcome
variable, so how close you live to a university doesn’t affect your wages except through your education. So, this
kind of makes sense because why would an employer care about where you lived when you were in
high school? Why would they care if you lived close to a university or not? They don’t,
they just care about things like your education and how good you are at doing the job. So, with that said we’ve sort of justified
that this one instrument can satisfy all three of the assumptions we needed, so let’s
see what it looks like in the data. So over here, I’ve plotted distance to college versus
years of education. So people who live closer to a college over here tend to get more education,
so that was our first assumption, that there is a causal effect of distance to college
on years of education. Seems to be what’s going on in this plot.
In this next plot I’ve plotted ability versus distance to college. And you can see there isn’t any
relationship, it’s just a big cloud of points. So that was our next assumption, that distance
to college is assigned to people in a way that’s unrelated to possible confounders
like ability. So finally what do we do? Now that we’ve justified
our instrument, we go and we look at the correlation between the instrument, distance to college,
and the outcome variable, the hourly wage, and we see that there is a correlation. Here,
there’s a negative relationship between distance to college and hourly wage. People who live closer tend to make more money.
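The fake data behind these plots isn’t shown in the video, so here is a minimal Python sketch of how data like this could be generated. Every coefficient, the sample size, and the helper names `simulate` and `corr` are assumptions invented for illustration; they are not the values used in the video. The script prints the correlation behind each of the plots described so far.

```python
import numpy as np


def simulate(n=5000, true_effect=2.0, seed=0):
    """Generate hypothetical fake data; every coefficient below is invented."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)            # the confounder, unobserved in real life
    distance = rng.uniform(0, 100, size=n)  # distance to college, drawn independently of ability
    # Education rises with ability and falls with distance to college.
    education = 14 + 1.5 * ability - 0.03 * distance + rng.normal(size=n)
    # Wages depend on education and ability, but not directly on distance.
    wage = 5 + true_effect * education + 4 * ability + rng.normal(scale=2, size=n)
    return ability, distance, education, wage


def corr(x, y):
    """Sample correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


ability, distance, education, wage = simulate()

print("education vs wage:    ", round(corr(education, wage), 2))     # positive
print("ability vs education: ", round(corr(ability, education), 2))  # positive
print("distance vs education:", round(corr(distance, education), 2)) # negative
print("ability vs distance:  ", round(corr(ability, distance), 2))   # close to zero
print("distance vs wage:     ", round(corr(distance, wage), 2))      # negative
```

Under these made-up settings the signs should match the plots: positive for education and wage, negative for distance and education, roughly zero for ability and distance, and negative for distance and wage.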
So, because we’ve said that this instrument satisfies those three assumptions, we can
conclude from this plot that this correlation reflects a causal effect
of years of education on hourly wage. We learn about the effect of the treatment on the outcome by looking at the
correlation between the instrument and the outcome. That’s the main idea of instrumental variables
analysis, and one important thing to take away from this is that in my fake data I had
ability there as a potential confounder, but in real life we don’t observe these potential
confounders, and that’s okay for instrumental variables analysis. We don’t need to observe the unobserved confounder.
All we need is our instrument, our treatment and the outcome variable, and with those variables
alone and our assumptions on the instrument, we can get at the causal effect of treatments
on outcomes, or in this case the effect of going to school on wages. [MUSIC].
on outcomes, or in this case the effect of going to school on wages. [MUSIC].
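
To put a number on that idea, here is a hedged Python sketch that compares a naive regression slope with an instrumental variables estimate on simulated data. The video does not spell out an estimator; the ratio used here (the slope of wage on distance divided by the slope of education on distance) is the standard textbook form for a single instrument. The data-generating coefficients, `TRUE_EFFECT`, and the helper `slope` are assumptions made up for illustration. Note that `ability` is used only to create the data and never enters either estimate.

```python
import numpy as np

TRUE_EFFECT = 2.0  # assumed causal effect of one year of education on hourly wage
rng = np.random.default_rng(1)
n = 20_000

# Hypothetical fake data; all coefficients are invented.
ability = rng.normal(size=n)            # unobserved confounder
distance = rng.uniform(0, 100, size=n)  # instrument, independent of ability
education = 14 + 1.5 * ability - 0.03 * distance + rng.normal(size=n)
wage = 5 + TRUE_EFFECT * education + 4 * ability + rng.normal(scale=2, size=n)


def slope(x, y):
    """Slope from a simple regression of y on x: cov(x, y) / var(x)."""
    c = np.cov(x, y)
    return float(c[0, 1] / c[0, 0])


# Naive approach: regress wage on education, ignoring the confounder.
naive = slope(education, wage)

# Instrumental variables: scale the instrument-outcome relationship
# by the instrument-treatment relationship.
first_stage = slope(distance, education)  # effect of distance on education
reduced_form = slope(distance, wage)      # relationship between distance and wage
iv_estimate = reduced_form / first_stage

print("true effect:", TRUE_EFFECT)
print("naive slope:", round(naive, 2))
print("IV estimate:", round(iv_estimate, 2))
```

With these invented settings the naive slope should overstate the effect, because it also picks up ability, while the IV estimate should land close to the assumed true effect, up to sampling noise.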

