Selection Bias

MIT’s Josh Angrist returns from his mountaintop meditation to guide us in our next econometrics lesson: selection bias! Graduates of private universities make 14% more than graduates of public universities. Does that mean attending a private university causes your wages to go up? In this video, Master Joshway shows how randomized trials and regression can be used to address this question. Think econometrics is boring? So it was once, but will be no more! Skipping theoretical tedium, we use real empirical questions to bring the numbers to life. Does an expensive private university education pay off with higher earnings? Does health insurance really make you healthier? Does legal drinking cost lives?

Transcript

Welcome back. Today we continue our pursuit of causal knowledge. Recall that private university alumni earn wages 14% higher on average than the wages earned by public university grads. Does that mean private university education causes your wages to go up? As with most of the questions we ponder, the facts are not in dispute; rather, it's the interpretation that's contentious.

Let's compare private school graduates with those who attended public schools. Private university grads differ in a number of ways. For example, they have higher SAT scores. Attendees of private universities score 120 points higher on average. These SAT stars sport orange sweatshirts. Private university grads also come from families that are 13% wealthier than those of public university students. The rich kids have green pants.

It would seem that public/private comparisons are not apples to apples. Perhaps their 14% wage gain is caused by pre-existing differences in earnings potential rather than by private university attendance. Like many who have walked before us in search of causal knowledge, we are waylaid here by selection bias. 

Selection bias misleads us into interpreting naive comparisons as causal effects. Here we see selection bias tricking us by directing traffic. Those with higher test scores go to the left towards private universities. Those with lower family incomes go to the right towards public universities. Public/private comparisons have causal force only when the groups compared are otherwise identical on average. For then, we can happily say ceteris paribus. But private schools are typically more selective and more expensive than their public university counterparts. So those going left are not comparable to those going right. This is how selection bias bewitches us. 
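The traffic-directing picture can also be written as a short decomposition. The notation is an assumption, not something used in the video: D_i = 1 if student i attended a private university, Y_{1i} is what i would earn after attending private, and Y_{0i} is what the same student would earn after attending public. Under those definitions, a sketch of the argument is:

```latex
\begin{align*}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{naive comparison}}
  &= \underbrace{E[Y_{1i} - Y_{0i} \mid D_i = 1]}_{\text{average causal effect on private grads}} \\
  &\quad + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\end{align*}
```

The last term is the earnings gap that would exist even if nobody attended a private university. It is zero only when the two groups are otherwise the same on average, which is the ceteris paribus condition.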

Although colleges really do select their applicants, the term selection bias refers to any comparison plagued by systematic differences between groups other than the difference we're focused on. When the groups being compared differ in many ways, we've lost ceteris paribus. Selection bias is the principal enemy facing metrics students and metrics masters alike. Our five most important weapons in the fight against it are the Furious Five of Econometrics. 

Selection bias is insidious and pervasive, but our weapons are powerful, for we do not need to ensure that the individuals to be compared are identical. We don't need virtual clones. Rather, we need only ensure that the groups compared be the same on average. Our most powerful weapon, strong and dependable, is random assignment of group membership. Imagine a secret experiment in which applicants to public and private colleges are randomly assigned to attend one or the other. Seems only fair. And maybe we'll learn something from this too. In the interest of science, I have proposed such an experiment at MIT where I teach econometrics. I'd like to replace our skilled but well-paid admissions officers with a coin toss. Random assignment of college admission ensures that when it comes to cross school comparisons, ceteris is paribus on average. Unfortunately, for science, I have not yet convinced MIT admissions to replace its staff with a stack of pennies. 
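To see why a coin toss is such a strong weapon, here is a minimal simulation sketch. Everything in it is made up for illustration: the `ability` variable, the way applicants sort themselves, and the earnings units are assumptions, not data from the video. The true effect of private attendance is set to zero, so any gap that appears under self-selection is pure selection bias.

```python
import random
import statistics


def earnings_gaps(n=20000, true_effect=0.0, seed=0):
    """Return (gap under self-selection, gap under coin-toss assignment)."""
    rng = random.Random(seed)
    selected = {0: [], 1: []}    # applicants sort themselves by ability
    randomized = {0: [], 1: []}  # a coin toss decides who goes private

    for _ in range(n):
        ability = rng.gauss(0, 1)  # hypothetical earnings potential
        luck = rng.gauss(0, 1)     # everything else that moves earnings

        # Self-selection: higher ability makes private attendance more likely.
        went_private = 1 if ability + rng.gauss(0, 1) > 0 else 0
        # Random assignment: attendance is unrelated to ability.
        coin = 1 if rng.random() < 0.5 else 0

        selected[went_private].append(ability + luck + true_effect * went_private)
        randomized[coin].append(ability + luck + true_effect * coin)

    def gap(groups):
        # Private mean minus public mean.
        return statistics.fmean(groups[1]) - statistics.fmean(groups[0])

    return gap(selected), gap(randomized)


if __name__ == "__main__":
    naive, experimental = earnings_gaps()
    print(f"gap with self-selection:   {naive:.3f}")
    print(f"gap with random assignment: {experimental:.3f}")
```

With self-selection the private group earns noticeably more even though attendance does nothing in this toy world. With the coin toss the two groups have the same ability on average, so the gap stays close to the true effect.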

As we'll discuss later, random assignment is often impossible or impractical so we must look for practical and inexpensive strategies that have the same ceteris paribus inducing power as random assignment. Kamal, where should we look? I don't know. If we could somehow control for... Correct. 

Metrics masters are control freaks. We implement statistical strategies that make the groups choosing different paths as similar as possible. Rather than simply comparing wages of public and private alums, we look within sets of alumni that have similar ability and background. Within these sets, we make public/private comparisons but not across them. This strategy moves us one giant step closer to ceteris paribus and apples to apples comparisons. Let's look again at the Furious Five. Our principal tool in the struggle for control is regression. Regression is a neat way to compare two groups while simultaneously holding many differences between those in the groups fixed. 

Do regression estimates show that private university education is worth paying for? Using regression to adjust for applicant ability and family background and a few demographic characteristics like race and sex reduces the private college premium from 14% to 9%. Nine percent still seems pretty good. But do we have true ceteris paribus? Camilla? Um... I'm not sure we controlled for everything. Maybe private school students are more ambitious or smarter in ways not fully captured by test scores. If so, comparisons are not apples to apples even after the adjustments you speak of. Worrying indeed. The possibility that the variables we've adjusted for using regression do not fully account for group differences is called omitted variables bias or OVB. OVB is selection bias in a regression context. 
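The link between the regression we've got and the regression we want can be sketched numerically. The example below uses simulated data with made-up numbers (it does not reproduce the 14% or 9% figures), and it assumes a single omitted control called `ability`. The short regression leaves ability out; the long regression includes it. In any sample, the short coefficient equals the long coefficient plus the effect of the omitted variable (`gamma`) times the coefficient from regressing the omitted variable on private attendance (`delta`).

```python
import random


def cov(xs, ys):
    """Sample covariance (divide-by-n version; the scaling cancels below)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def short_and_long(private, ability, log_wage):
    """Return (short, long, gamma, delta) from two OLS regressions."""
    s_dd, s_aa = cov(private, private), cov(ability, ability)
    s_da = cov(private, ability)
    s_dy, s_ay = cov(private, log_wage), cov(ability, log_wage)

    # Short regression: log wage on private only.
    short = s_dy / s_dd
    # Long regression: log wage on private and ability (2x2 normal equations).
    det = s_dd * s_aa - s_da ** 2
    long_ = (s_dy * s_aa - s_da * s_ay) / det
    gamma = (s_dd * s_ay - s_da * s_dy) / det
    # Regression of the omitted variable on private attendance.
    delta = s_da / s_dd
    return short, long_, gamma, delta


def simulate(n=20000, seed=0):
    """Hypothetical data: ability raises wages; private attendance does not."""
    rng = random.Random(seed)
    private, ability, log_wage = [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        d = 1.0 if a + rng.gauss(0, 1) > 0 else 0.0
        y = 0.10 * a + rng.gauss(0, 0.3)  # true private effect is zero
        private.append(d)
        ability.append(a)
        log_wage.append(y)
    return private, ability, log_wage


if __name__ == "__main__":
    short, long_, gamma, delta = short_and_long(*simulate())
    print(f"short: {short:.3f}  long: {long_:.3f}")
    print(f"long + gamma * delta: {long_ + gamma * delta:.3f}")
```

If ambition cannot be measured, the long regression cannot be run, and the `gamma * delta` term stays hidden inside the short estimate. That hidden term is the omitted variables bias.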

We suffer the effects of OVB when the regression we've got is not the one we want. The regression we want, the regression of our dreams, has more and better controls than the one we've got. How can we control for something like ambition? Is there an ambition index? It's not easy to produce ceteris paribus comparisons. Regression is a tool. It's not magic. Yet sometimes the results unearthed by this tool are striking. Masters of Econometrics professor Stacy Dale and Alan Krueger faced the challenge of selection bias and OVB. In a famous academic study, Dale and Krueger controlled for the many possible differences between students who've attended different types of schools. They had the insight that selection bias in this context originates in two forces: student ambition and college opportunity. 

Most students have a pretty good sense of their own aptitude, inclination, and motivation for school work. These forces are summarized by the type of schools they apply to. At the same time, college admissions offices invest massive hours and energy into ascertaining who will succeed on campus. They evaluate and select for academic ability and college commitment. What if we compare the outcomes of those who had the exact same acceptances and rejections? Compare two high school students, Maya and Mariana. Both were admitted to UNC and Duke but not to Yale. Similarly ambitious and judged similarly capable by these three schools' admissions offices, Maya opts for Duke because a friend is going there, while Mariana heads to UNC in Chapel Hill.

Maya and Mariana are not clones of course, and they've chosen different types of schools for personal reasons. But otherwise, they have a lot in common. The personal factors that drive them to choose between the schools on their menu might not be closely related to their future earning power. Pooling many such comparisons moves us one giant step closer to ceteris paribus. Remarkably, a regression model that controls for the sets of schools to which applicants have applied and been admitted shows almost no difference in earnings between public and private graduates. In other words, averaging over a large number of Maya to Mariana type comparisons, the private school premium falls to nothing. Maya might have enjoyed her expensive Duke education, but on average at least students like her will do no better in the labor market than comparable public school peers. That's quite a change from our initial 14% earnings gap favoring elite school alumni. Regression has the power to turn a clouded statistical night into a clear causal day. But you'll need to know a little more before you can regress with skill and confidence.
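Pooling many Maya-to-Mariana comparisons can be sketched as a within-group estimate. This is an illustration on simulated data, not Dale and Krueger's data or code: the five `group` values stand in for hypothetical sets of schools applied to and admitted by, and all numbers are made up. Demeaning earnings and private attendance inside each group gives the same private coefficient as a regression that includes a dummy variable for every group.

```python
import random
from collections import defaultdict


def simulate(n=20000, true_effect=0.0, seed=0):
    """Hypothetical applicants: the group drives both school type and wages."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.randrange(5)  # stand-in for an application/admission set
        # More ambitious groups are more likely to choose private schools...
        private = 1.0 if rng.random() < 0.1 + 0.2 * group else 0.0
        # ...and earn more regardless of which school they attend.
        log_wage = 0.10 * group + true_effect * private + rng.gauss(0, 0.5)
        rows.append((group, private, log_wage))
    return rows


def naive_gap(rows):
    """Private mean minus public mean, ignoring groups."""
    priv = [y for _, d, y in rows if d == 1.0]
    pub = [y for _, d, y in rows if d == 0.0]
    return sum(priv) / len(priv) - sum(pub) / len(pub)


def within_group_estimate(rows):
    """Private coefficient from a regression with one dummy per group."""
    by_group = defaultdict(list)
    for group, d, y in rows:
        by_group[group].append((d, y))

    num = den = 0.0
    for members in by_group.values():
        d_bar = sum(d for d, _ in members) / len(members)
        y_bar = sum(y for _, y in members) / len(members)
        for d, y in members:
            # Only variation inside the group contributes to the estimate.
            num += (d - d_bar) * (y - y_bar)
            den += (d - d_bar) ** 2
    return num / den


if __name__ == "__main__":
    rows = simulate()
    print(f"naive gap:             {naive_gap(rows):.3f}")
    print(f"within-group estimate: {within_group_estimate(rows):.3f}")
```

In this toy world the naive gap is clearly positive while the within-group estimate sits near zero, because comparisons are made only among applicants with the same menu of schools.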

Creative Commons

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
The third party material as seen in this video is subject to third party copyright and is used here pursuant
to the fair use doctrine as stipulated in Section 107 of the Copyright Act. We grant no rights and make no
warranties with regard to the third party material depicted in the video and your use of this video may
require additional clearances and licenses. We advise consulting with clearance counsel before relying
on the fair use doctrine.