American Economic Review 2013, 103(5): 1797–1829
http://dx.doi.org/10.1257/aer.103.5.1797
Does Disability Insurance Receipt Discourage Work?
Using Examiner Assignment to Estimate Causal Effects
of SSDI Receipt
By N M, K J. M,  A S*
We present the first causal estimates of the effect of Social Security
Disability Insurance benefit receipt on labor supply using all program
applicants. We use administrative data to match applications to
disability examiners and exploit variation in examiners’ allowance
rates as an instrument for benefit receipt. We find that among the
estimated 23 percent of applicants on the margin of program entry,
employment would have been 28 percentage points higher had they
not received benefits. The effect is heterogeneous, ranging from
no effect for those with more severe impairments to 50 percentage
points for entrants with relatively less severe impairments. (JEL
H55, J14, J22)
US disability program expenditures have increased dramatically over the last several
decades. As a result, the Social Security system’s long-run fiscal imbalance is
now an immediate problem: the Social Security Disability Insurance (SSDI) Trust
Fund is expected to exhaust its assets by 2016, after which time expected tax revenues
will cover only 79 percent of SSDI benefits (Board of Trustees 2012).
* Maestas: RAND Corporation, 1776 Main Street, Santa Monica, CA 90401 (e-mail: [email protected]);
Mullen: RAND Corporation, 1776 Main Street, Santa Monica, CA 90401 (e-mail: [email protected]); Strand:
Social Security Administration, 500 E Street, ninth floor, Washington, DC 20254 (e-mail: Alexander.Strand@ssa.gov).
We thank Josh Angrist, David Autor, John Bound, Raj Chetty, Eli Donkar, Kirk Doran, Joe Doyle, Terry
Flannery, Eric French, Steve Goss, Larry Katz, Jeff Kling, Lee Lockwood, Erin Johnson, Day Manoli, Paco
Martorell, David Pattison, Heather Royer, Juan Saavedra, David Stapleton, Till von Wachter, Heidi Williams,
Gema Zamarro, participants of the 2010 and 2011 MRRC Researcher Workshops, the 2010 All-California Labor
Economics Conference at University of California-Santa Barbara, and the 2011 American Economic Association
meetings, NBER Spring Public Economics Meeting, Society of Labor Economics conference, NBER Summer
Institute Social Security Workshop, and seminar participants at the Center for Business and Public Policy at the
University of Illinois, Urbana-Champaign, RAND, MIT, Harvard, Center for Retirement Research at Boston
College, and the Social Security Administration for helpful comments and suggestions. This research was supported
by a grant from the US Social Security Administration (SSA) through the Michigan Retirement Research
Center (MRRC). The opinions and conclusions expressed are solely those of the authors and do not represent the
opinions or policy of SSA or any agency of the federal government. SSA reviewed a draft of the paper for accuracy
and disclosure of personally identifiable information before it was circulated.
Go to http://dx.doi.org/10.1257/aer.103.5.1797 to visit the article page for additional materials and author
disclosure statement(s).
As the SSDI caseload has grown, the employment of disabled workers has
steadily declined. As Figure 1 shows, during the last two decades alone, the SSDI
caseload grew threefold while the relative employment rate of disabled workers fell
by one-half. This occurred despite the facts that the Americans with Disabilities Act
(ADA) created new employment protections for disabled workers, jobs became less
physically demanding (Autor, Levy, and Murnane 2003), life expectancy increased
(Christensen et al. 2009), and the health of the working-age population remained
stable (Houtenville et al. 2009; Burkhauser and Daly 2011).
Although demographic and economic factors have hastened the program’s
expansion, the consensus in the research literature is that the structure of the SSDI
program itself has been a major force behind the decline in employment and concomitant
program growth (Duggan and Imberman 2009; Burkhauser and Daly
2011). Indeed, many have attributed the long-running structural decline in employment
and earnings among low-skilled men to the SSDI program, at least in part
(see, e.g., Parsons 1980, 1982; Haveman and Wolfe 1984a, 1984b; Bound 1989;
and Bound and Waidmann 1992). More recently, Autor and Duggan (2003) show
that declining labor demand for low-skilled workers during the late 1970s through
1990s, combined with a rising benefit-to-earnings replacement rate and an expansion
in program eligibility in the mid-1980s, contributed to falling employment and
dramatic program growth during that period.
While few would deny that the SSDI program has adversely affected employment,
there is little consensus about the magnitude of SSDI’s impact. Progress on
this question has been hampered by the difficulty of separating the causal effect
of SSDI from confounding factors that also affect the relative attractiveness of
employment versus SSDI participation, such as trends in employment, earnings,
and health. In this article, we overcome a critical barrier to progress, the lack of
exogenous variation in SSDI receipt, and present the first causal estimate of the effect
of SSDI receipt on labor supply using the entire population of SSDI applicants in
the present-day system.
Figure 1. The Rise in SSDI and the Decline in Employment of the Disabled
[Figure: SSDI caseload (millions) and disabled/nondisabled employment ratio, by year, 1989–2010.]
Source: Authors’ tabulations of Current Population Survey; Annual Statistical Report on the
Social Security Disability Insurance Program, 2010, Table 3.
The precise nature of our contribution is best understood in the context of the
prior literature. Bound (1989) first proposed the approach of using denied applicants
as a control group for SSDI beneficiaries, as an alternative to earlier analyses that
related labor supply outcomes to nonrandom variation in SSDI replacement rates
(see, e.g., Parsons 1980). These earlier analyses, which attributed the entire decline
in employment among older men to the SSDI program, overstated the impact of
SSDI on labor supply because SSDI replacement rates confound benefit generosity
with variation in lifetime earnings and labor force attachment (Haveman and Wolfe
1984a, 1984b; Bound 1989; Gruber 2000). Bound argued that, in observational data,
the postdecision labor supply of denied applicants was an upper bound estimate of
the potential labor supply of SSDI beneficiaries had they not received SSDI benefits.
He contended it was an upper bound because denied applicants should have less
severe impairments (and, hence, greater employment potential) than allowed applicants
on average. He concluded that the employment of older (ages 45–64) male
SSDI beneficiaries during the 1970s would have been at most around 35 percentage
points higher had they never received SSDI benefits, and therefore SSDI accounted
for only about one-quarter of the historic decline in employment among older men.
Parsons (1991) argued that Bound’s estimate was not an upper bound on the total
program effect if the application process also reduces the labor supply of applicants.
All applicants must demonstrate that they are not working before they may apply,
and initially denied applicants must remain out of the labor force to maintain eligibility
for benefits while they pursue appeals or reapply. This initial withdrawal from
and time spent out of the labor force while seeking benefits may diminish human
capital, making it harder for all applicants to reenter the labor force.¹ Moreover, if
the application process affects the postdecision labor supply of ultimately allowed
and denied applicants differentially, then Bound’s approach would not even give
an upper bound on the effect of benefit receipt. More generally, Bound’s approach
would understate the employment potential of SSDI beneficiaries to the degree that
denied applicants have lower employment potential than allowed applicants.²
Von Wachter, Song, and Manchester (2011) applied Bound’s comparison approach
to observational data for the 1980s and 1990s. Using administrative records that
permit the tracking of appeals outcomes, they conclude that the overall employment
potential of the SSDI caseload has risen significantly since the 1970s owing
to an increase in beneficiaries who are younger and who have nonterminal impairments
such as mental health and musculoskeletal conditions. Still, Von Wachter,
Song, and Manchester lacked an exogenous source of variation in SSDI receipt, and
thus were unable to resolve the fundamental issue of causal magnitude.
¹ New evidence from Autor et al. (2011) suggests that the SSDI program may indeed affect employment outcomes
through channels other than benefit receipt, such as processing time.
² This is certainly possible. Allowance decisions for applicants with less severe impairments explicitly take
account of vocational factors that are inversely correlated with lifetime labor force attachment such as age, education,
and work experience, and later we show that denied applicants in fact have lower prior labor force attachment
than allowed applicants.
Two other studies improved on the observational approach by using policy variation
in initial allowance rates to mitigate the confounding with lifetime earnings and
impairment severity. Gruber and Kubik (1997) used differential decreases in state-level
initial allowance rates during the late 1970s that resulted from federal policy
changes aimed at reducing program growth. Their reduced form estimates imply
that labor force participation among older men would have been 8.9 percentage
points greater had they not been initially allowed benefits.³ Gruber and Kubik did
not estimate the “first-stage” relationship between the state-level initial allowance
rate and an individual’s ultimate allowance probability (which, again, would
account for appeals/reapplication outcomes) so it is not possible to scale up the
reduced form effect to find the implied causal effect of SSDI receipt on labor supply.
Chen and van der Klaauw (2008) used variation in initial allowance rates arising
from program rules that relax the vocational eligibility standard discontinuously at
ages 45, 50, and 55 using survey data covering the 1990s. Their estimates are difficult
to compare to the prior literature because of sample differences; they necessarily
analyzed only vocational applicants (those who did not qualify on the basis of
medical criteria alone), but also included SSI applicants and women. Their preferred
estimate suggests that the employment of older vocational beneficiaries would have
been 20 percentage points higher in the absence of initial allowance, but like Gruber
and Kubik (1997) their estimates are reduced form estimates because they could
not track appeals outcomes in their data. If appropriately scaled up by the first-stage
relationship between initial and ultimate allowance, their estimate implies substantial
work capacity among vocational beneficiaries, perhaps even more capacity than
Bound’s estimate implies.⁴
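The 8.9 percentage point figure attributed to Gruber and Kubik (1997) can be reproduced from the two numbers given in the accompanying footnote (a 0.48 percentage point response to a 10 percent rise in the denial rate, and a mean denial rate of 53.8 percent). The sketch below does that arithmetic and then shows how a reduced-form effect would be divided by a first-stage coefficient. The function names are ours, and the first-stage value of 0.5 is purely hypothetical, since the text notes that no first stage was estimated in that study.

```python
def implied_full_denial_effect(effect_pp, pct_increase, mean_denial_rate):
    """Scale a reduced-form effect to a 100-point change in the denial rate.

    effect_pp: change in the outcome (percentage points) associated with a
               relative rise of `pct_increase` in the denial rate.
    mean_denial_rate: baseline denial rate as a fraction (0.538 = 53.8 percent).
    """
    # A 10 percent rise from a 53.8 percent base is a 5.38 point rise in denials.
    denial_change_pp = pct_increase * mean_denial_rate * 100
    # Effect per percentage point of denial, scaled up to 100 points.
    return effect_pp / denial_change_pp * 100


def scale_by_first_stage(reduced_form_pp, first_stage):
    """Wald-style scaling: reduced-form effect divided by the first stage."""
    if not 0 < first_stage <= 1:
        raise ValueError("first_stage must lie in (0, 1]")
    return reduced_form_pp / first_stage


reduced_form = implied_full_denial_effect(0.48, 0.10, 0.538)
print(round(reduced_form, 1))  # 8.9

# HYPOTHETICAL first stage of 0.5, used only to illustrate the scaling step.
print(round(scale_by_first_stage(reduced_form, 0.5), 1))
```

Because the first stage is at most one, the scaled effect of benefit receipt is never smaller in magnitude than the reduced-form effect of initial allowance.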
³ Gruber and Kubik (1997) report that a 10 percent increase in the denial rate led to a 0.48 percentage point
decrease in labor force nonparticipation. The mean denial rate in 1977 was 53.8 percent.
⁴ Chen and van der Klaauw (2008) also apply Bound’s comparison approach to their data and find that their
quasi-experimental estimates are consistently larger than the upper bound observational estimate. This implies that
a comparison of allowed and denied applicants in observational data does not necessarily yield an upper bound on
the potential labor supply of beneficiaries. Rather, this pattern suggests that for older vocational applicants (39 percent
of all applicants in their sample) unobserved differences in labor force attachment between allowed and denied
applicants may dominate unobserved differences in impairment severity. This is a reasonable interpretation since
vocational applicants are more likely to have been economically induced to apply for SSDI benefits, and allowance
decisions for vocational applicants depend on factors that are negatively correlated with labor force attachment.
Chen and van der Klaauw’s use of exogenous variation in allowance rates naturally resolves the downward bias in
the observational estimate.
Our study advances this literature in several ways. Our first contribution is the
development of a research design that had not previously been used in this literature,
but which enables us to estimate the causal effect of SSDI receipt on labor
supply along a margin of major policy interest: program entry. To do this, we
compare the subsequent labor supply of otherwise similar applicants who were
allowed or denied benefits only because their applications were randomly assigned
to disability examiners with different allowance propensities at the initial point in
the disability determination process. This assignment mechanism generates exogenous
variation in decision outcomes that is unrelated to unobserved impairment
severity or labor force attachment. We implement this research design in a unique
administrative dataset that has been underused by researchers and contains disability
examiner identification codes for the universe of SSDI applications in a given
year. Because the disability determination process allows for multiple levels of
appeal, we employ an intention-to-treat framework (Angrist, Imbens, and Rubin
1996) in which we use an examiner’s allowance propensity in the initial determination
stage as an instrumental variable for the ultimate allowance decision for
a given application. We identify ultimate allowance decisions (taking account of
appeals outcomes) by linking the applications data to administrative data recording
SSDI benefit receipt up to six years after the initial decision. We then link our
data to administrative earnings records and examine labor supply outcomes two to
four years from the date of the initial determination. By waiting two to four years
to measure labor supply outcomes, we address the likelihood that ultimately denied
applicants kept their earnings low during the appeals process.
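One common way to turn examiner assignment into an instrument is a leave-one-out allowance rate: for each application, the share of the same examiner's other cases that were allowed at the initial stage. The sketch below illustrates that construction under our own assumptions. The passage above does not spell out the exact formula, and the record layout (`examiner_id`, `allowed`) and the toy data are hypothetical.

```python
from collections import defaultdict


def leave_one_out_allowance_rates(cases):
    """Return one instrument value per case.

    cases: list of (examiner_id, allowed) tuples, with allowed in {0, 1}.
    The value for a case is the examiner's allowance rate over all of that
    examiner's OTHER cases, so a case's own outcome never enters its instrument.
    """
    totals = defaultdict(int)   # cases handled per examiner
    allows = defaultdict(int)   # initial allowances per examiner
    for examiner_id, allowed in cases:
        totals[examiner_id] += 1
        allows[examiner_id] += allowed

    rates = []
    for examiner_id, allowed in cases:
        n_other = totals[examiner_id] - 1
        if n_other == 0:
            rates.append(None)  # undefined for an examiner with a single case
        else:
            rates.append((allows[examiner_id] - allowed) / n_other)
    return rates


# Toy data (hypothetical): examiner "A" allows 2 of 3 cases, "B" allows 1 of 3.
toy_cases = [("A", 1), ("A", 1), ("A", 0), ("B", 0), ("B", 1), ("B", 0)]
print(leave_one_out_allowance_rates(toy_cases))
# [0.5, 0.5, 1.0, 0.5, 0.0, 0.5]
```

Excluding the applicant's own decision avoids a mechanical correlation between the instrument and the outcome of the case it is attached to.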
We find that the employment rate of beneficiaries on the margin of SSDI entry in
2005 and 2006 (23 percent of all applicants) would have been on average 28 percentage
points higher two years later if they had never received SSDI benefits.
Employment above SSA’s threshold defining substantial gainful activity would have
been 19 percentage points higher, and annual earnings would have been $3,781
higher (including the zeros). Although the estimate of employment capacity is large,
our estimate of earnings capacity indicates that the average work capacity of this
group is far below their predisability earnings levels. Still, our results show that a
subset of SSDI beneficiaries either retains or recovers some degree of work capacity
in the years immediately following their initial decision.
Concurrent with our study, French and Song (2011) use variation in the propensity
of administrative law judges (ALJs) in the second stage of the appeals process
to estimate the labor supply effect of SSDI receipt. They find that the employment
rate of applicants allowed at the ALJ hearing level would be 26 percentage points
higher if they had instead been denied. Our framework enables us to compare their
estimates to ours and explain why, despite important differences in the composition
of the two samples (i.e., plaintiffs in ALJ-level appeals cases versus all initial applicants),
it is reasonable to expect the two approaches to produce similar estimates.⁵
Our research design facilitates two additional contributions to the literature. First,
we estimate that 23 percent of SSDI applicants are on the margin of program entry
in the sense that whether they ultimately receive benefits depends on their initial
examiner assignment. The size and characteristics of this group are of significant
policy interest, since these applicants would be most affected by policy changes
addressing disability determinations. We find that this group on the margin of program
entry disproportionately includes younger applicants, applicants with low
earnings histories, and applicants with mental impairments. In addition, we calculate
that the program entry margin (accounting for appeals outcomes) spans the
twentieth through forty-third percentiles of the distribution of unobserved severity
among all applicants.
Second, our continuous instrumental variable offers a first opportunity to test for
heterogeneous treatment effects. We find that the causal effect of SSDI receipt on
employment is not the same across all beneficiaries. Rather, employment capacity
rises as high as 50 percentage points for marginal applicants with the least severe
impairments and falls to zero for those with the most severe impairments. We also
estimate the distribution of treatment effects among applicants on the margin of
program entry.
⁵ One important difference is that French and Song use data on ALJ decisions from the 1990s, whereas we use
data on initial decisions from the mid-2000s. To the extent that the composition of applicants has changed over time,
this could affect the underlying work capacity available in the SSDI population.
I. Background on SSDI
The SSDI program defines disability as the “inability to engage in substantial
gainful activity (SGA) by reason of any medically determinable physical or mental
impairment(s) which can be expected to result in death or which has lasted or can be
expected to last for a continuous period of not less than 12 months.” Activity is considered
“substantial” if it involves significant physical and/or mental exertion and it
is considered “gainful” if it is performed for pay or profit (whether or not profit is
actually realized). SSA operationalizes this definition by setting an earnings threshold
(currently $1,040 per month) over which individuals are said to be engaging
in SGA and are therefore disqualified from participating in the program. The SGA
threshold impacts both program entry (through eligibility) and the labor supply of
current beneficiaries (whose benefits are suspended if they earn more than the SGA
threshold once they have completed a trial work period).
Individuals apply for SSDI benefits at their local field office, which screens out
those who are not currently insured or who are engaging in SGA.⁶ These are labeled
“technical denials” and do not receive further review.⁷ The remaining applications
are forwarded to a state Disability Determination Services (DDS) office, where
cases are assigned to disability examiners for review.⁸ During our sample period
(2005–2006), the vast majority of initial claims for adults were allocated to examiners
using a computer program developed by Iron Data LLC, although some states
used their own program to process cases.⁹ We conducted telephone interviews with
managers of the ten largest DDS offices (all of which use the Iron Data software)
as well as two offices that used their own software. These 12 offices processed 40
percent of the claims in our sample in 2006. The interviews revealed that nearly
all cases were automatically assigned to examiners by computer program, which
assigned new applications to examiners based on availability.¹⁰ The managers variously
described this process as a “round robin,” “next in line,” “rotate to the back,”
or “equal distribution” system.
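The assignment rule the managers describe can be pictured as a queue of examiners: each new case goes to the examiner at the front of the line, who then rotates to the back. The sketch below is illustrative only. It ignores availability, caseload limits, and manual exceptions, it is not the Iron Data software, and all names in it are hypothetical.

```python
from collections import deque


def assign_round_robin(case_ids, examiner_ids):
    """Assign cases, in arrival order, to examiners in rotation."""
    if not examiner_ids:
        raise ValueError("need at least one examiner")
    queue = deque(examiner_ids)
    assignment = {}
    for case_id in case_ids:
        examiner = queue.popleft()   # next examiner in line
        assignment[case_id] = examiner
        queue.append(examiner)       # rotate to the back
    return assignment


assignments = assign_round_robin(["c1", "c2", "c3", "c4", "c5"], ["E1", "E2"])
print(assignments)
# {'c1': 'E1', 'c2': 'E2', 'c3': 'E1', 'c4': 'E2', 'c5': 'E1'}
```

Under a rule like this, which examiner a case reaches depends only on its position in the arrival order, not on the applicant's characteristics.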
⁶ Roughly half of SSDI applicants also concurrently apply for benefits from the Supplemental Security Income
(SSI) program, which provides additional income to individuals with limited income and assets. The medical and
vocational criteria are the same under both programs.
⁷ The field office also screens out applications from individuals who do not qualify for SSDI based on their work
history. Generally, to be eligible for SSDI one must have earned at least 40 work credits (or “covered quarters”), 20
of which must have been earned in the last ten years. However, these restrictions are relaxed for younger individuals.
⁸ There are 114 DDS offices. Two-thirds of states (including Washington, DC) have only one DDS office, and
47 states have four or fewer regional DDS offices. States with more than four DDS offices are: New York (5),
Missouri (6), Florida (10), and California (10). There are also seven offices handling claims in US territories.
⁹ In January 2011, SSA announced that all DDS offices would switch to a new (common) case management
system (to be provided by Lockheed Martin with Iron Data LLC acting as subcontractor).
¹⁰ Even prior to the era of computer assignment, there is evidence that cases were randomly assigned to examiners.
See, for example, Lewin Group, Inc. (2001), which conducted site visits in 1999 as part of a study on the quality
of disability determinations.
¹¹ Other exceptions emerged, but these seemed unlikely to be concentrated among examiners. For example, high-profile
cases or cases involving conflicts of interest with DDS personnel could be reassigned manually.
Two types of exceptions were noteworthy in our interviews with DDS managers.¹¹
First, in all DDS offices a small fraction of cases are sent by the field offices
with high-priority flags for expedited handling, and these are sometimes manually
assigned to examiners who specialize in high-priority cases. The only high-priority
cases during 2005–2006 were for allegations of terminal illness (TERI).¹² TERI
claims are characterized by impairments that cannot be reversed and are expected to
end in death, and have very high allowance rates. An allegation of certain conditions
by the claimant will result in the claim being marked TERI by the field office.¹³
Second, in some (although not all) DDS offices, the applicant’s affected body system
is taken into account in assignment of cases to newly hired examiners as part of
their training.¹⁴
All screening personnel are required to follow a five-step review process in order
to determine whether applicants are unable to perform SGA given the severity
of their impairment and/or vocational background. (See Figure A1 in the online
Appendix for a visual overview of the process.) Denials are issued to applicants
engaging in SGA (step 1, performed at the field office) and to applicants with
impairments that are obviously temporary or nonsevere (step 2). Next, “medical”
allowances are made to individuals with specific impairments deemed severe
enough to warrant allowance into the program regardless of vocational background
(step 3). The remaining applicants are evaluated to determine whether they have the
residual functional capacity to perform any of their past jobs (step 4) or any job for
which they are qualified in the national economy given their vocational background
(step 5). Denials may be issued at either of these two steps, and “vocational” allowances
are issued to qualified applicants at step 5. Vocational allowances are made
in consultation with a medical-vocational grid that provides guidance for allowance
decisions based on residual functional capacity, age group, education, and type of
work experience (i.e., skilled, unskilled).¹⁵ Disability examiners are not medically
trained but may consult with a medical consultant (a physician or psychologist) in
order to assess an applicant’s residual functional capacity, or ability to work given
their physical and/or mental impairments.
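The five steps form a sequence in which the first applicable rule decides the case. The sketch below encodes that ordering as a small function. It is a stylized reading of the description above, not SSA's procedure: each step is reduced to a yes/no flag, the field names are hypothetical, and the medical-vocational grid that guides steps 4 and 5 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Application:
    engaging_in_sga: bool        # step 1: earning above the SGA threshold
    severe_impairment: bool      # step 2: False if temporary or nonsevere
    qualifies_medically: bool    # step 3: impairment severe enough on its own
    can_do_past_work: bool       # step 4: capacity for any past job
    can_do_other_work: bool      # step 5: capacity for any other job


def sequential_review(app):
    """Return (decision, step) by applying the five steps in order."""
    if app.engaging_in_sga:
        return ("denial", 1)
    if not app.severe_impairment:
        return ("denial", 2)
    if app.qualifies_medically:
        return ("medical allowance", 3)
    if app.can_do_past_work:
        return ("denial", 4)
    if app.can_do_other_work:
        return ("denial", 5)
    return ("vocational allowance", 5)


example = Application(
    engaging_in_sga=False,
    severe_impairment=True,
    qualifies_medically=False,
    can_do_past_work=False,
    can_do_other_work=False,
)
print(sequential_review(example))  # ('vocational allowance', 5)
```

The ordering matters: an applicant who qualifies medically at step 3 is allowed without any vocational assessment, which is why medical and vocational allowances can be told apart.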
¹² Other types of high-priority cases were introduced after our sample period. Since 2007, SSA has introduced
the Quick Disability Determination (QDD) program, which identifies electronically via a predictive model those
claims that have a high probability of allowance and for which evidence is expected to be easily and quickly
obtained, and the Compassionate Allowance (CAL) program, which identifies cases likely to qualify under the
medical listings. Similarly, in 2008 SSA began enhanced processing of military service casualty claims (“Wounded
Warriors”) using the expedited procedures.
¹³ These conditions include: chronic dependence on a cardiopulmonary life-sustaining device; awaiting certain
organ transplants; chronic pulmonary or heart failure requiring continuous home oxygen and unable to care for personal
needs; any malignant neoplasm (cancer) which is metastatic, stage IV, persistent or inoperable; an allegation
or diagnosis of ALS, AIDS, or one of eight types of cancer; or, comatose for 30 days or more.
¹⁴ In many offices, less experienced examiners handle only initial determinations, whereas more experienced
examiners also handle more complex types of cases such as reconsiderations or continuing disability reviews (CDRs).
¹⁵ Practically, the grid generates increases in the allowance rate at ages 45, 50, and 55, which was exploited in
the regression discontinuity design of Chen and van der Klaauw (2008).
¹⁶ Since 1999, ten “prototype” states have eliminated the reconsideration step of the appeals process; they
are: Alabama, Alaska, California (Los Angeles North and Los Angeles), Colorado (West), Louisiana, Michigan,
Montana, New Hampshire, New York, and Pennsylvania.
Denied applicants can appeal their initial determination within 60 days by applying
for reconsideration by the original DDS office.¹⁶ Applicants denied at reconsideration
have an additional 60 days to file an appeal to have an administrative law judge
(ALJ) review their case. The ALJ must consider the application using the same steps
in the same order as the initial determination, but the applicant may present new
information. Applicants denied by an ALJ have additional opportunities to appeal
to the SSA Appeals Council (AC) and finally to Federal Court. In our data, roughly
one-third of applicants are allowed in the initial determination (with about 60 percent
of these receiving vocational allowances), and just under two-thirds of applicants are
ultimately awarded SSDI benefits within six years of their initial determination. An
audit study performed by the Office of the Inspector General (OIG) (2008) estimated
that in 2006 the average processing time for cases in the initial determination phase
was 131 days, or just over one-third of a year. Average (cumulative) processing times
for cases reaching the appeals phases were 279 days (0.76 years) for reconsideration,
811 days (2.22 years) for ALJ, 1,053 days (2.88 years) for AC, and 1,720 days
(4.71 years) for Federal Court (OIG 2008). Just under one-third of cases made it to
the ALJ level, where approximately 61 percent of initial denials were overturned
(GAO 2004). Fewer than 5 percent of cases progressed to the AC level and fewer than
1 percent of cases progressed to Federal Court.
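The year equivalents quoted for the OIG (2008) processing times follow from dividing cumulative days by 365. The sketch below reproduces them and also computes the gap between successive stage averages, which the text does not report. That gap is only a rough guide, because each average is taken over the cases that reach the stage in question. The stage labels and the 365-day convention are our assumptions.

```python
DAYS_PER_YEAR = 365  # assumption: simple 365-day year

# Average cumulative processing times in days, as quoted from OIG (2008).
cumulative_days = {
    "initial determination": 131,
    "reconsideration": 279,
    "ALJ": 811,
    "Appeals Council": 1053,
    "Federal Court": 1720,
}

# Convert each cumulative time to years, rounded to two decimals.
years = {
    stage: round(days / DAYS_PER_YEAR, 2)
    for stage, days in cumulative_days.items()
}

# Difference between each stage's average and the previous stage's average.
stages = list(cumulative_days)
incremental_days = {
    stage: cumulative_days[stage] - (cumulative_days[stages[i - 1]] if i else 0)
    for i, stage in enumerate(stages)
}

for stage in stages:
    print(f"{stage}: {cumulative_days[stage]} days "
          f"({years[stage]:.2f} years), +{incremental_days[stage]} days")
```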
Individuals who apply for SSDI benefits must stop working or reduce their earnings
below the SGA threshold for a period of five months before they are entitled to
receive benefits, or until the allowance decision is made (whichever comes later).¹⁷
The SSDI benefit formula is the same progressive formula used to calculate Social
Security benefits based on past average indexed monthly earnings and does not
depend on the type or severity of the individual’s impairment. The average monthly
benefit was $1,129 in 2010 (Social Security Administration 2010, Table 36). Once
benefits commence, beneficiaries begin a nine-month Trial Work Period (TWP)
which allows individuals to test their ability to return to work by relaxing the
restriction that earnings may not exceed the SGA threshold. The TWP is followed
by a three-month Grace Period before individuals earning above the SGA threshold
have their benefits suspended. Thus, SSDI beneficiaries may engage in SGA for up
to 12 months (not necessarily consecutively) while receiving their full benefits without
any penalty. Over the next three years, during an extended period of eligibility
(EPE) benefits are paid for months in which earnings are below SGA, and not paid
when earnings are above SGA.
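The interaction of the Trial Work Period, the Grace Period, and the extended period of eligibility can be traced month by month. The sketch below is a simplified reading of the rules as summarized above: the first 12 months with earnings above SGA are paid in full, and later months are paid only when earnings are at or below SGA. It does not model the end of the extended period of eligibility or any other program detail, and the function name and example earnings path are hypothetical.

```python
TWP_MONTHS = 9     # Trial Work Period
GRACE_MONTHS = 3   # Grace Period


def benefit_paid_by_month(monthly_earnings, sga_threshold):
    """Return a list of booleans: is the benefit paid in each month?

    The first 12 months with earnings above SGA (9 TWP + 3 grace, not
    necessarily consecutive) are paid. After those are used up, a month
    is paid only if earnings are at or below the SGA threshold.
    """
    protected_left = TWP_MONTHS + GRACE_MONTHS
    paid = []
    for earnings in monthly_earnings:
        if earnings <= sga_threshold:
            paid.append(True)            # not engaging in SGA this month
        elif protected_left > 0:
            protected_left -= 1          # uses one TWP or grace month
            paid.append(True)
        else:
            paid.append(False)           # above SGA with no protection left
    return paid


# Hypothetical path: 14 months earning $1,500, then 2 months earning $500,
# evaluated against the $1,040 monthly threshold quoted in the text.
path = [1500] * 14 + [500] * 2
paid = benefit_paid_by_month(path, sga_threshold=1040)
print(paid.count(True), paid.count(False))  # 14 2
```

In this example the benefit is withheld only in months 13 and 14 and resumes once earnings fall back below the threshold.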
Upon reaching the Social Security Full Retirement Age, SSDI benefits are automatically
converted to Social Security retired worker benefits, and the SGA earnings
restriction is lifted. Very few beneficiaries exit the program for a reason other
than death (35 percent in 2009) or automatic conversion to retired worker benefits
at the Full Retirement Age (54 percent). In 2009, only 8 percent of worker beneficiaries’
benefits were terminated because they no longer qualified for benefits; of
those, 39 percent failed a medical review (a Continuing Disability Review, or CDR)
whereas 61 percent were found to be earning more than the SGA threshold (Social
Security Administration 2010, Table 50).
II. Data and Caseload Characteristics
¹⁷ In addition to the five-month waiting period for SSDI benefits, there is a 24-month waiting period for health-care
coverage under Medicare. Neither waiting period exists for SSI, which immediately entitles its beneficiaries
to cash benefits and Medicaid.
We make use of a unique workload management database called the Disability
Operational Data Store (DIODS). The DIODS contains the universe of SSDI
applications and, according to Social Security Online, is considered “the SSA
definitive data store for disability claims for state agency workload management
purposes.”¹⁸ The DIODS contains alphanumeric codes that identify disability
examiners uniquely within DDS and allow us to construct a measure of each
examiner’s propensity to allow SSDI applicants.¹⁹ We use data on all initial medical
determinations (that is, excluding technical denials) made between January 1,
2005, and December 31, 2006. We construct our sample by restricting to disabled
workers only between the ages of 18 and 64 who were assigned to examiners handling
at least 30 cases in the observed time frame.²⁰ We further exclude applicants
who were previously SSDI or SSI beneficiaries.²¹
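The sample restrictions listed above can be expressed as a sequence of filters. The sketch below is a minimal illustration using plain dictionaries. The record keys are hypothetical, the upper cap of 900 cases per examiner code comes from the footnote on trainee codes, and the order in which the caseload counts are taken (after the individual-level filters) is our assumption, since the text does not state it.

```python
from collections import Counter
from datetime import date

MIN_CASES, MAX_CASES = 30, 900
START, END = date(2005, 1, 1), date(2006, 12, 31)


def build_sample(records, min_cases=MIN_CASES, max_cases=MAX_CASES):
    """Apply the described restrictions to a list of dict records.

    Each record is assumed (hypothetically) to carry the keys: examiner_id,
    decision_date, age, technical_denial, disabled_worker, prior_beneficiary.
    """
    eligible = [
        r for r in records
        if START <= r["decision_date"] <= END
        and not r["technical_denial"]
        and r["disabled_worker"]
        and 18 <= r["age"] <= 64
        and not r["prior_beneficiary"]
    ]
    # Count cases per examiner code, then keep codes within the caseload bounds.
    caseload = Counter(r["examiner_id"] for r in eligible)
    return [
        r for r in eligible
        if min_cases <= caseload[r["examiner_id"]] <= max_cases
    ]


def make_record(examiner_id, age=40, **overrides):
    """Build a toy record; every value here is made up for illustration."""
    record = {
        "examiner_id": examiner_id,
        "decision_date": date(2005, 6, 1),
        "age": age,
        "technical_denial": False,
        "disabled_worker": True,
        "prior_beneficiary": False,
    }
    record.update(overrides)
    return record


toy = [
    make_record("E1"),
    make_record("E1"),
    make_record("E1", age=70),                  # outside the 18 to 64 range
    make_record("E2"),                          # examiner with too few cases
    make_record("E3", prior_beneficiary=True),  # previously a beneficiary
]
# Small bounds so the toy data exercise the caseload filter.
sample = build_sample(toy, min_cases=2, max_cases=3)
print(len(sample))  # 2
```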
We link the DIODS data to SSA’s Master Beneficiary Record (MBR) in order to
identify which applicants ultimately received SSDI benefits (by the end of 2010).
This allows us to identify SSDI receipt at least four years and up to six years
after the initial determination. In order to identify applicants who appealed an initial
denial to the administrative law judge (ALJ) level, we also linked our sample
with an extract from the Case Processing and Management System (CPMS),
which includes information on hearings conducted through June 2011.²² Because
we observe benefit receipt four to six years after the initial decision, we conservatively
estimate that we correctly classify benefit receipt for at least 99.4 percent of
applicants in our sample.²³
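The 99.4 percent figure rests on a short bounding calculation given in the accompanying footnote. The sketch below reproduces it. Our reading, which is an assumption, is that the 0.01 terms in the footnote are the 2 percent share of applicants allowed after an ALJ denial multiplied by an equal one-half weight for the 2005 and 2006 cohorts, with 10 percent and 50 percent of those cases still unresolved at the end of 2010.

```python
def max_misclassified_share(late_allow_share, cohort_shares, unresolved_shares):
    """Upper bound on applicants recorded as denied who are later allowed."""
    return sum(
        late_allow_share * cohort * unresolved
        for cohort, unresolved in zip(cohort_shares, unresolved_shares)
    )


bound = max_misclassified_share(
    late_allow_share=0.02,         # benefit receipt despite denial at the ALJ level
    cohort_shares=[0.5, 0.5],      # ASSUMPTION: 2005 and 2006 cohorts of equal size
    unresolved_shares=[0.1, 0.5],  # beyond the 90th percentile; beyond the median
)
print(f"{bound:.3f}")      # 0.006
print(f"{1 - bound:.3f}")  # 0.994
```

The 2005 cohort is observed for at least five years, the ninetieth percentile of processing time, and the 2006 cohort for at least four, the median, which is where the 0.1 and 0.5 shares come from.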
To examine past and subsequent labor supply, we link to administrative annual
earnings records between 1995 and 2009 from SSAs Detailed Earnings Record
(DER). The DER contains uncapped earnings from box 5 (Medicare wages and
tips) of individuals’ W-2 forms. We aggregate across all earnings reports to measure
earnings at the individual level. This gives us annual earnings up to nine years before
and up to four years after the initial decision.
24
We use three measures of labor
supply: (i) employment—dened as earning more than $1,000 (in 2008 dollars)
in a given year;
25
(ii) employment above SGA (where the SGA threshold is that
defined by SSA in a given year); and (iii) annual earnings. All earnings amounts are
presented in 2008 dollars and include those with zero earnings. We examine labor
18
http://www.socialsecurity.gov/disability/data/ssa-sa-mowl.htm.
19
Unfortunately, we do not observe any examiner characteristics. For example, we cannot distinguish between
full- and part-time examiners, nor can we observe examiner tenure.
20
One DDS manager indicated that trainees all have the same examiner code in the DIODS for her DDS.
Although we do not believe this practice to be widespread, we eliminated cases assigned to examiner codes associ-
ated with more than 900 cases in the two-year decision period in order to avoid this potential problem. This addi-
tional restriction results in the exclusion of 24 examiners and 37,631 cases.
21
This eliminates claims assigned to specific examiners associated with the “Special Disability Workload,”
which resulted from an outreach initiative by SSA to determine whether recipients of SSI payments had worked
enough to become insured for SSDI (OIG 2006).
22
Unfortunately, we do not observe reconsideration-level appeals, or appeals beyond the ALJ level.
23
This is based on the fact that 2 percent of applications result in benefit receipt despite being denied at the
ALJ level; the median and ninetieth percentile of cumulative processing times among these applicants are four and
five years, respectively (see Table 2 in Autor et al. 2011). Thus, we misclassify at most 0.6 percent of outcomes
(= 0.01 × 0.1 of 2005 applications plus 0.01 × 0.5 of 2006 applications) as denials rather than allowances.
24
Earnings are measured between January 1 and December 31 of a given calendar year. Thus, an applicant
whose decision was made in December of 2005 will have had 11 fewer months elapse between his decision and
outcome year (say, 2007) than an applicant whose decision was made in January of 2005, even though we refer to
the outcome as being measured two years later for both applicants.
25
Because any positive earnings result in a W-2 report, we observe many individuals with very small annual
earnings. We impose the $1,000 earnings threshold to restrict our definition of employment to only “meaningful”
participation in the labor market.
supply at two to four years after the initial decision to permit most of the denied
applicants to complete the appeals process.
Finally, we link the sample to SSA's Numerical Identification System (Numident)
file, which contains dates of death for individuals who died during our analysis
period. We restrict the analysis samples to individuals who were alive through the
end of the calendar year in which earnings are observed.
26
Table 1 presents summary statistics for our two-year analysis sample, overall and
by case disposition.
27
The sample consists of just over 2.1 million applications.
These cases were adjudicated by approximately 8,700 DDS examiners, for an aver-
age caseload of roughly 250 initial determinations in the two-year period, or 125
per year.
28
Table 1 divides the sample into four groups: (i) initially allowed, (ii)
initially denied, but allowed on appeal (or through reapplication), (iii) initially
denied, and denied on appeal, and (iv) initially denied, and not observed to appeal.
29
Approximately 35 percent of applicants denied at the initial determination level do
not appeal their decision. However, among those who appeal, the success rate is
fairly high—fully three-quarters of these denials are eventually overturned. This
underscores the importance of using an intent-to-treat framework, and also is the
reason why previous estimates based only on variation in initial allowance rates are
reduced form estimates, and not estimates of the effect of benefit receipt.
Fifty-nine percent of applications are for one of two types of impairments: mus-
culoskeletal (38 percent) or mental disorders (21 percent). These percentages are
virtually the same among all allowances, although musculoskeletal (mental) cases
are less (more) likely to be allowed at the initial determination level than on appeal.
Fewer than 1 percent of all applications are cases of alleged terminal illness, and
these cases have very high initial allowance rates, representing just over 2 percent
of initial allowances.
30
Among the initially denied applicants, the three groups are
ordered in terms of impairment severity; applicants who did not appeal were more
likely to be denied in earlier steps of the initial review process than those who
appealed but were subsequently denied, who were more likely to be denied earlier
than those who successfully appealed. Ultimately allowed applicants also tend to be
older and have higher preonset earnings than ultimately denied applicants.
Figure 2 provides a first descriptive look at the data by plotting the employment
rate of 2005 applicants up to nine years before and four years after their initial
26
SSDI applicants have relatively high rates of mortality, especially among beneficiaries. Livermore, Stapleton,
and Claypool (2009) estimate that 13 percent of SSDI beneficiaries die within two years of entering the program.
27
These will differ from published statistics since we impose some additional sample restrictions, such as exclud-
ing high mortality cases (those who died in the two years following the initial determination) as well as applicants
assigned to examiners with fewer than 30 initial determinations.
28
This number actually underestimates the number of initial determination decisions per year since it imposes
our sample restrictions. In addition, note that initial determinations represent only about a third of all cases decided
by examiners, which also include reconsiderations, continuing disability reviews, and SSI determinations, as well
as SSDI claims filed by dependents.
29
We identify individuals who appeal by matching initial applications to cases presented to administrative law
judges. Thus, group iv includes some individuals who appealed for reconsideration but, upon denial at that stage,
declined to appeal further. It also may contain denied applicants who submitted a new application instead of appeal-
ing, but whose subsequent application was also denied.
30
While we do not have a direct measure of terminal illness cases in our dataset, we construct an indicator for
alleged terminal illness using detailed diagnosis codes corresponding to the impairments listed in Section I. Our
estimated fraction of terminal illness cases is consistent with administrative counts and internal SSA studies of
mortality rates among high-priority claims.
determination.
31
We restrict the sample to 2005 applicants because outcomes for
2006 applicants are observed only up to three years after their initial determination.
32
From Figure 2 we see that before the initial determination, the employment rate of
applicants allowed on appeal is very similar to that of applicants who were initially
allowed: above 90 percent five years before the initial decision. From this peak,
employment declines steadily in the years leading up to the initial decision, then
31
Figures A2 and A3 in the online Appendix present the same graphs for the other two outcome measures: earn-
ing above the SGA threshold and mean earnings.
32
Consistent with our later empirical results, we do not condition on a balanced panel for Figure 2 (that is, individuals
living up to four years after the initial determination). However, the figures and estimates are similar with
and without the restriction of a balanced panel.
T 1— S S
Initially denied
Variable
All
applicants
Initially
allowed
Allowed
on appeal
Denied
on appeal
No appeal
Observations 2,214,066 723,627 719,705 243,559 527,175
Percent of sample 100.0 32.7 32.5 11.0 23.8
Percent of initial denials 48.3 16.3 35.4
Final step of initial decision (percent)
2. Denied—nonsevere 17.3 0.0 17.8 28.5 35.0
3. Allowed—met listings 13.8 42.3 0.0 0.0 0.0
4. Denied—work past jobs 24.0 0.0 38.6 33.0 33.0
5. Denied—able to work 26.0 0.0 43.7 38.5 31.9
5. Allowed—unable to work 18.9 57.7 0.0 0.0 0.0
Body system code
Percent musculoskeletal system 37.9 27.0 45.6 42.5 40.3
Percent mental disorders 20.6 25.7 16.3 20.5 19.4
Percent cardiovascular system 8.4 9.2 8.9 6.7 7.2
Percent neurological 8.2 11.3 7.1 6.2 6.2
Percent endocrine system 4.3 2.6 5.4 4.9 4.6
Percent respiratory system 4.0 5.4 3.4 3.5 3.2
Percent special/other
3.9 1.0 3.3 6.6 7.3
Percent malignant neoplastic diseases 3.5 6.2 2.1 1.2 2.7
Percent immune system disorders 2.7 3.1 2.7 2.5 2.2
Percent digestive system 2.4 1.9 2.6 2.5 2.8
Percent special senses and speech 2.2 2.7 1.5 1.8 2.5
Percent genitourinary impairments 1.4 3.1 0.6 0.5 0.7
Percent skin disorders 0.3 0.3 0.3 0.3 0.4
Percent hematological disorders 0.3 0.3 0.3 0.3 0.4
Percent terminal illness 0.8 2.1 0.2 0.1 0.2
Age at initial decision
Mean 47.09 51.03 47.18 42.97 43.46
Standard deviation 10.36 10.49 9.20 9.92 11.98
Average earnings, 3–5 years before decision
Mean 22,697 29,123 23,358 14,495 16,764
Standard deviation 27,298 33,971 24,494 20,422 25,145
EXALLOW
Mean 0.36 0.39 0.34 0.34 0.35
Standard deviation 0.10 0.11 0.10 0.09 0.10
Notes: CPMS, ALJ-level appeals though June 2011. MBR, allowances on appeal at higher levels or additional appli-
cation through December 2010. EXALLOW is examiner initial allowance rate less own decision.
Sources: DIODS, initial applications decided in 2005–2006.
drops sharply in the decision year. By three years after the decision the two groups
are indistinguishable with employment rates around 13 percent.
In contrast, ultimately denied applicants have lower employment rates to begin
with (80 percent five years before the initial determination) and significantly
higher (albeit reduced) employment rates (around 50 percent) after the initial determination.
33
Although denied applicants who appeal (unsuccessfully) and those
who do not appeal have similar predecision employment rates, their postdecision
employment differs substantially. Specifically, only about 40 percent of individuals
who appealed unsuccessfully are working three years after their initial determination.
Given average cumulative processing times reported by the OIG (2008), very
few of these applicants are likely to be still awaiting a decision (and, hence, suppressing
labor supply) three to four years after the initial decision. Rather, the gap
between the two types of denied applicants likely reflects selection into the appeals
process; that is, individuals with less severe impairments are more likely to opt out
of the appeals process, whether by choice or because they fail to secure a lawyer
willing to represent their case.
Similarly, differences in prior employment between ultimately allowed and
denied applicants are consistent with selection on nonhealth factors into the appli-
cant population. If individuals with low labor supply prior to disability onset have
lower opportunity costs of applying, they will be more likely to apply conditional
on a given level of health, and as a result prior labor supply and health will be
negatively correlated among applicants. The relationship between prior labor mar-
ket history and ultimate disposition of the case becomes even more pronounced
when examining the percent of applicants engaging in SGA (i.e., earning more than
33
From Table 1 we calculate that approximately 68 percent (= 0.24/(0.11 + 0.24)) of those ultimately denied
benefits did not appeal the initial determination.
[Figure 2. Employment Before and After Initial Decision, 2005 Decisions. Vertical axis: percent earning
>= $1,000/year (0% to 100%). Horizontal axis: years after initial decision (–10 to 5). Series: initially
allowed; initially denied, allowed on appeal; initially denied, denied on appeal; initially denied, no appeal.]
$11,760 per year in 2009) (see Figure A2 in the online Appendix) and average earn-
ings (Figure A3 in the online Appendix).
34
III. Empirical Strategy
The goal of this article is to estimate causal models of labor supply of the following form:

(1) $y_i = X_i \beta + \gamma DI_i + \nu_i$,

where $y_i$ is earnings (or employment) of applicant $i$, $X_i$ denotes observed characteristics
(e.g., age, type of impairment) that may influence labor supply, $DI_i = 1$ if the
applicant was awarded SSDI benefits (i.e., is observed to be an SSDI beneficiary up
to six years after the initial determination), and $\nu_i$ is an error term. In observational
data, inference is hampered if some unobserved characteristic, such as severity of
the impairment, impacts both labor supply and SSDI allowance. For instance,

$y_i = X_i \beta + \gamma DI_i - s_i + \varepsilon_i$,

where $s_i$ denotes unobserved severity, which can be thought of as (the unobserved
part of) the earnings loss associated with the individual’s impairment, and which
is uncorrelated with any remaining idiosyncratic element $\varepsilon_i$. Then in the regression
above $\nu_i = -s_i + \varepsilon_i$, and if $E[s_i \mid DI_i] \neq 0$, ordinary least squares (OLS) regression
gives a biased estimate of the average treatment effect, $\gamma$. In particular, OLS
estimates $\gamma - [E[s_i \mid DI_i = 1] - E[s_i \mid DI_i = 0]]$. As observed by Bound (1989),
if $\gamma < 0$ and if the unobserved characteristic is positively correlated with SSDI
receipt, OLS overestimates the magnitude of the coefficient on DI and provides an
upper bound on the labor supply effect of benefit receipt.
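The bias expression can be checked on a toy dataset. The following is a minimal sketch, not the authors' code: the six observations, the constant standing in for $X_i \beta$, and the value $\gamma = -2$ are made up for illustration, and $\varepsilon_i$ is set to zero so that the identity holds exactly.

```python
# Toy check that OLS of y on DI recovers gamma - [E[s|DI=1] - E[s|DI=0]].
# All numbers are hypothetical; epsilon is set to zero so the identity is exact.

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    return cov_xy / var_x

gamma = -2.0                          # assumed true effect of benefit receipt
x_beta = 10.0                         # constant stand-in for X_i * beta
di = [1, 1, 1, 0, 0, 0]               # benefit receipt indicator
s = [3.0, 2.0, 4.0, 1.0, 0.0, 2.0]    # unobserved severity, higher among recipients

# Outcome equation with epsilon = 0: y = X*beta + gamma*DI - s
y = [x_beta + gamma * d - sev for d, sev in zip(di, s)]

severity_gap = (mean([v for v, d in zip(s, di) if d == 1])
                - mean([v for v, d in zip(s, di) if d == 0]))
ols_estimate = ols_slope(di, y)
predicted = gamma - severity_gap
```

In this toy example the OLS slope equals the predicted value and is larger in magnitude than the assumed $\gamma$, which is the direction of bias described by Bound (1989).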
From SSA's point of view, in an ideal world SSDI benefits are awarded to individuals
whose potential earnings (in the absence of SSDI benefits) are less than
the SGA threshold:

$X_i \beta - s_i + \varepsilon_i < SGA$.

In practice, however, cases are assigned to disability examiners who have imperfect
information, and so the assignment rule becomes based on the contrast:

$X_i \beta - \hat{s}_{ij} < SGA$,

where $\hat{s}_{ij}$ denotes the estimate by examiner $j$ of the severity of individual $i$'s impairment.
This estimate is a function of both the individual’s impairment severity,
which the examiner observes in greater detail than the econometrician, through
34
Note that, because of the earnings restriction, these are unlikely to be the same beneciaries engaging in SGA
in more than one year.
medical records and test results
35
and characteristics of the examiner assigned to
the case, such as previous experience or personal perceptions and tastes. Let

$\hat{s}_{ij} = s_i + \sigma_j$,

where $\sigma_j$ denotes a systematic component of examiner judgment that leads some
examiners to over- or underestimate severity of applicants on average. Then the
assignment rule becomes

(2) $DI_i = 1(s_i > X_i \beta - SGA - \sigma_j)$,

which implies that examiner $j$'s allowance propensity conditional on applicant characteristics is

(3) $P_j = P(\sigma_j \mid X_i) = 1 - F(X_i \beta - SGA - \sigma_j)$,

where $F$ denotes the cumulative distribution function for unobserved severity.
Equations (2) and (3) show that high-$\sigma_j$ examiners systematically overestimate
severity, resulting in lower allowance thresholds and higher propensities to allow
applicants conditional on applicant characteristics. This implies a natural identification
strategy for estimating the labor supply effect of SSDI benefit receipt. In
particular, we implement an instrumental variables (IV) estimation strategy where
we instrument for SSDI receipt in equation (1) using the assigned examiner’s propensity $P_j$.
36
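The logic of instrumenting with examiner leniency can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the authors' procedure: the parameter values, the normal and uniform distributions, and the use of the true leniency $\sigma_j$ as the instrument (instead of an estimated allowance rate) are all hypothetical choices made for brevity, and `threshold` stands in for $X_i \beta - SGA$.

```python
import random

def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def simulate(n_examiners=200, cases_per_examiner=200, gamma=-0.3,
             threshold=0.3, seed=0):
    """Simulate assignment rule (2) and return (OLS, IV) estimates of gamma."""
    rng = random.Random(seed)
    z, di, y = [], [], []
    for _ in range(n_examiners):
        sigma_j = rng.uniform(-0.5, 0.5)      # examiner leniency (hypothetical range)
        for _ in range(cases_per_examiner):
            s_i = rng.gauss(0.0, 1.0)         # unobserved severity
            eps_i = rng.gauss(0.0, 0.5)       # idiosyncratic error
            # Rule (2): allow when s_i > threshold - sigma_j
            allowed = 1 if s_i + sigma_j > threshold else 0
            z.append(sigma_j)
            di.append(allowed)
            y.append(gamma * allowed - s_i + eps_i)
    ols = covariance(di, y) / covariance(di, di)   # biased by selection on s_i
    iv = covariance(z, y) / covariance(z, di)      # single-instrument IV estimate
    return ols, iv

ols_estimate, iv_estimate = simulate()
```

Because severity is drawn independently of the examiner in this sketch, the IV ratio should land near the assumed $\gamma$ while the OLS slope is pulled far below it by selection on severity.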
In order to avoid biasing measurement of examiner propensities using the applicant’s
own decision, we construct the following individual-specific instrument for
SSDI receipt:

$EXALLOW_{ij} = \frac{n\_allowed_j - 1(DI_i = 1)}{n\_cases_j - 1}$.
37
Intuitively, EXALLOW measures the allowance rate of examiner j, to whom appli-
cant i was assigned, for all cases except for the case of applicant i himself. Thus,
conditional on observed characteristics and assuming conditional random assign-
ment, which we discuss below, EXALLOW should be positively correlated with the
applicant’s own award decision only if there exists an underlying examiner-specific
threshold for allowance decisions.
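The leave-one-out construction can be sketched in a few lines. This is an illustrative implementation, not the authors' code: the function name, the input layout (a list of examiner identifier and decision pairs), and the example data are assumptions, and the `allowed` flag is taken to be the examiner's own decision on the case.

```python
from collections import defaultdict

def leave_one_out_allowance(cases):
    """Compute a leave-one-out allowance rate for each case.

    cases: list of (examiner_id, allowed) pairs, with allowed in {0, 1}.
    Returns one value per case, in input order; None when the examiner
    has only one case, because the rate is then undefined.
    """
    n_cases = defaultdict(int)
    n_allowed = defaultdict(int)
    for examiner_id, allowed in cases:
        n_cases[examiner_id] += 1
        n_allowed[examiner_id] += allowed

    rates = []
    for examiner_id, allowed in cases:
        if n_cases[examiner_id] < 2:
            rates.append(None)
        else:
            # Remove the applicant's own decision from numerator and denominator
            rates.append((n_allowed[examiner_id] - allowed)
                         / (n_cases[examiner_id] - 1))
    return rates

# Hypothetical data: examiner "A" decides four cases, examiner "B" two
example = [("A", 1), ("A", 0), ("A", 1), ("A", 1), ("B", 0), ("B", 1)]
exallow = leave_one_out_allowance(example)
```

With very small caseloads, as for the hypothetical examiner "B", the leave-one-out rate swings between extremes, which is one motivation for a minimum caseload such as the 30 cases required in the sample described earlier.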
Table 1 reports the mean (0.36) and standard deviation (0.10) of EXALLOW. The
mean absolute deviation is 0.08. The range of EXALLOW runs virtually the entire
35
Note that we do not observe any medical records in our dataset, only the type of impairment. In addition, in
some cases the examiner may communicate with the treating doctor to clarify an aspect of an applicant’s file, and
these communications may not be part of the applicant’s more detailed record.
36
The estimation strategy we employ is similar to that used by Kling (2006) to examine the effect of incarcera-
tion on labor supply and earnings, Doyle (2007, 2008) to examine the effects of foster care placement on juvenile
delinquency and adult crime, and Perry (2008) to examine the effect of treatment of maternal depression on man-
agement of children’s asthma.
37
An alternative approach would be to regress allowance decisions on a full set of examiner fixed effects in the
first stage. A disadvantage of this approach is that IV is then susceptible to a “many weak instruments” problem
(Stock, Wright, and Yogo 2002). In addition, our formulation of the examiner allowance propensity as a continuous
instrumental variable has the advantage of allowing us to compute marginal treatment effects, as can be seen below.
length of the unit interval: 0.02 to 1. However, only a few examiners have such
extreme allowance rates: the first and ninety-ninth percentiles of EXALLOW are
0.17 and 0.64, respectively.
38
Figure 3 presents smoothed histograms at the examiner
level of examiners’ deviations from the mean initial allowance rate in their
DDS office, unadjusted and regression-adjusted for differences in case mix. Case
controls include the fraction of cases in each of nine age bands, 14 body system
codes, alleged terminal illness, three-digit zip code, and decision month, as well as
a variable measuring average prior earnings of the set of applicants assigned to a
given examiner. Adjusting for case mix reduces variation in initial allowance rates,
but there is still significant variation remaining (the standard deviation is 0.06, compared
with 0.10 unadjusted).
Two key assumptions underlie our empirical strategy. First, in order for EXALLOW
to be a valid instrument for SSDI receipt, applicants’ assignment to DDS exam-
iners must be uncorrelated with unobserved characteristics such as impairment
severity conditional on observed characteristics. This amounts to an assumption of
conditional random assignment to DDS examiner within a DDS. That is, at most,
examiners may specialize in a particular type of impairment (e.g., mental disor-
ders) or age group, but within this type, examiners do not further specialize in cases
of either low or high severity. As discussed previously, applicants are assigned to
38
Despite the fact that we condition on examiners with caseloads of 30 or more, one might be concerned that
examiners with relatively few observations will tend to have very high or very low allowance rates because they are
noisier. We explored this possibility by applying a Bayesian “shrinkage” estimator to EXALLOW (see, e.g., Kane
and Staiger 2008) and estimating our results using this “corrected” instrument. The new instrument had a range
of 0.14 to 0.75. Both the first and second stage (labor supply) estimates were slightly higher using this alternative
instrument, but not significantly so, and the patterns in the coefficients remained the same.
[Figure 3. Distribution of Examiner Deviations from DDS Mean Initial Allowance Rate. Vertical axis:
percent. Horizontal axis: residuals (–0.4 to 0.6). Series: raw; adjusted for case mix.]
Note: Caseload characteristics include DDS office, age, preonset earnings, body code, three-digit
zip code, terminal illness diagnosis, and decision month.
Source: 2005–2006 DIODS data.
examiners using a rotational process that is unrelated to applicant characteristics
except in certain cases (alleged terminal illness or body system), which we observe
in our dataset. We provide empirical evidence consistent with conditional random
assignment in Section IV. Because applicants are assigned to examiners regionally,
controlling for DDS office is crucial to our identification strategy; otherwise,
variation in EXALLOW could reflect cultural differences in disability determination
standards across DDS offices, as well as differences in local applicant populations
arising from differences in wage and employment opportunities in local labor markets.
We therefore include DDS office fixed effects in all empirical specifications.
A second assumption that is critical to the causal interpretation of our IV estimates
is the assumption that examiners’ award propensities affect applicants’ chances of
SSDI receipt in the same way (monotonicity). In other words, we assume that cases
allowed by “strict” examiners (those with low $\sigma_j$’s) would also have been allowed by
“lenient” examiners (those with high $\sigma_j$’s), and that cases denied by lenient examiners
would also have been denied by strict examiners. While not strictly testable
without a well-designed audit study, this assumption implies that denials by strict
examiners should be overturned on appeal at a higher rate than denials by lenient
examiners, and thus the instrument should have a weaker effect on individuals
assigned to strict examiners.
39
Figure 4 provides a graphical overview of our estimation strategy, where the left
panel presents the visual first stage, and the right panel presents the visual reduced
form. We separately plot smoothed SSDI receipt and employment rates by examiner’s
residualized (regression-adjusted) initial allowance rate (with dashed lines
indicating 95 percent confidence interval bands) estimated via local quadratic
regression. Consistent with our earlier estimates, SSDI receipt is increasing in the
residualized initial allowance rate. Notably its slope is also increasing in the residualized
initial allowance rate. This is consistent with the implication of monotonicity
that individuals initially denied by strict examiners are on average less healthy than
individuals initially denied by lenient examiners and, thus, more likely to obtain
benefits on appeal or through reapplication. Whereas SSDI receipt rises with the
examiner’s allowance propensity, the right panel shows that the employment rate
of applicants two years after the initial decision falls, from just under 30 percent
among individuals assigned to very strict examiners to just under 24 percent among
individuals assigned to very lenient examiners.
IV. First-Stage Estimates: Initial Disability Determinations and SSDI Receipt
In this section we present our first-stage estimates and use them to draw inferences
about the size and characteristics of the population of applicants on the margin
of program entry. Beginning with the first-stage estimates themselves, Table 2
presents the coefficient on EXALLOW in the applicant-level regression of SSDI
receipt on EXALLOW, with and without covariates. We present results for 2005 and
2006, separately as well as pooled. All models include DDS indicators to account
39
A second implication of the monotonicity assumption is that examiners who are “strict” on one type of case,
say, musculoskeletal impairments, are also strict on any other type of case, say, mental disorders. Below we provide
empirical evidence that allowance rates for different body system codes are positively correlated within examiner.
for stratification of examiners across DDS offices. We display t-statistics in parentheses,
where robust standard errors are computed and clustered by DDS examiner.
Column 1 shows the first-stage coefficient on EXALLOW from a regression with no
additional covariates. In both years, a 10 percentage point increase in initial examiner
allowance rate leads to an approximately 3 percentage point increase in the
probability of ultimately receiving SSDI.
Adding covariates sequentially to the regression allows us to indirectly test for
random assignment on the basis of observable characteristics because only covariates
that are correlated with EXALLOW will affect the estimated coefficient on
EXALLOW when included. Based on our interviews with DDS managers (see
Section I), we expect the additions of the body system and terminal illness indicators
to potentially affect the coefficient on EXALLOW, since they are case assignment
variables, but no other variables should affect the coefficient. The coefficient
on EXALLOW falls from 0.29 to 0.24 with the addition of body system codes and
is not significantly affected by the addition of any other variables, including the
TERI flag. Thus, our results are consistent with random assignment of applicants
to examiners within DDS office, conditional on body system code and alleged terminal
illness.
40
40
We also experimented with a different measure of initial allowance rate to test the implication of the monoto-
nicity assumption that generic allowance rates can be used to instrument for any type of case. For this measure, we
constructed the initial allowance rate leaving out all cases with the same body system code as the applicant (instead
of just the applicant’s own case). Table A1 in the online Appendix presents these results. For all impairments but one
(“special/other” cases, around 4 percent of the sample), this alternative measure of EXALLOW is positively and significantly
associated with increased SSDI receipt. (We replicated our analysis of labor supply effects dropping this
[Figure 4. SSDI Receipt and Labor Supply by Initial Allowance Rate. Left panel: SSDI receipt (vertical
axis 0.6 to 0.8) against residualized initial allowance rate (–0.2 to 0.2). Right panel: employment
(vertical axis 0.22 to 0.3) against residualized initial allowance rate (–0.2 to 0.2).]
Notes: Ninety-five percent confidence intervals shown with dashed lines. Employment measured in the second year
after the initial decision. Bandwidth is 0.116 for DI and 0.130 for labor force participation.
Source: DIODS data for 2005 and 2006.
Column 7 shows that conditional on observables, examiner allowance rates are
strongly correlated with individual benefit receipt: our first stage coefficient of 0.23
implies that a 10 percentage point increase in the examiner allowance rate is associated
with a statistically significant ( p < 0.0001) 2.3 percentage point increase in the
probability that an applicant ultimately receives SSDI benefits.
41
A. Size of the Population of Applicants on the Margin of Program Entry
Our first-stage coefficient also gives an estimate of the fraction of applicants on
the margin of program entry; that is, those applicants who might have received
a different allowance decision had their case been assigned to a different initial
examiner in the same DDS office. More precisely, in the case of a binary treatment
group and obtained virtually identical results.) If we alternatively constructed EXALLOW using only cases of the
same impairment type, then monotonicity would have to hold only within impairment type. However, this approach
results in loss of precision and a selected sample due to small samples within impairment type at the examiner level.
Table A2 in the online Appendix presents estimates of the first and second stages using this alternative instrument.
41
Recall that in a model with one endogenous regressor and one instrument the t-statistic squared is equal to the
incremental F statistic (in this case F = 1,592).
T 2— F-S R: E  EXALLOW  SSDI R
(1) (2) (3) (4) (5) (6) (7)
2005
Coefcient on EXALLOW 0.278*** 0.271*** 0.220*** 0.215*** 0.209*** 0.204*** 0.204***
t-stat 31.01 30.45 26.56 25.97 26.82 26.15 26.1
Fraction of always takers (
P
A
)
0.59***
t-stat 3.53
R
2
0.012 0.017 0.042 0.043 0.118 0.128 0.128
2006
Coefcient on EXALLOW 0.309*** 0.306*** 0.258*** 0.254*** 0.245*** 0.242*** 0.243***
t-stat 31.73 31.66 30.88 30.7 32.48 32.16 32.28
Fraction of always takers (
P
A
)
0.56***
t-stat 3.32
R
2
0.013 0.018 0.044 0.045 0.117 0.127 0.128
2005 and 2006
Coefcient on EXALLOW 0.294*** 0.289*** 0.240*** 0.235*** 0.227*** 0.224*** 0.226***
t-stat 41.87 40.73 38.81 38.09 40.2 39.33 39.9
Fraction of always takers (
P
A
)
0.57***
t-stat 3.42
R
2
0.012 0.017 0.042 0.043 0.117 0.127 0.127
Control variables included
Three-digit zip code X X X X X X
Body system codes X X X X X
Terminal illness indicator X X X X
Age group dummies X X X
Average previous earnings X X
Month dummies X
Control variables 112 1,015 1,031 1,031 1,040 1,041 1,052
Notes: Standard errors clustered at examiner level; t-statistics in parentheses. Mean of dependent (independent)
variable = 0.65 (0.37). All regressions include DDS dummies. Pooled specication includes year interacted with
month dummies. Number of observations is 1,090,345 for 2005 and 1,123,721 for 2006.
*** Signicant at the 1 percent level.
** Signicant at the 5 percent level.
* Signicant at the 10 percent level.
(i.e., SSDI receipt), the size of the marginal population is the first-stage coefficient
times the range of initial allowance rates (in this case 1, see Section III). Thus, we
estimate that 23 percent of applicants in 2005 and 2006 represent cases on the margin
of program entry.
42
,
43
Given an initial allowance rate of 33 percent, this implies
that 7.6 percent of applicants (or 12 percent of incoming beneficiaries) are allowed
only because of the examiner to whom they were assigned, while 15.4 percent are
denied for that reason. This does not imply that these decisions were necessarily
incorrect, only that at least one examiner in the same DDS would have come to a
different decision.
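The split of the marginal group can be reproduced with simple arithmetic. The sketch below uses the figures quoted in the text; the choice of denominator for the share of incoming beneficiaries (the overall SSDI receipt rate of 0.65 reported in the notes to Table 2) is an assumption on our part, made because it reproduces the 12 percent figure.

```python
# Figures quoted in the text
first_stage = 0.23          # share of applicants on the margin of program entry
initial_allowance = 0.33    # initial allowance rate

# Assumed denominator: overall SSDI receipt rate from the notes to Table 2
ssdi_receipt_rate = 0.65

# Marginal applicants allowed only because of their examiner
allowed_by_assignment = first_stage * initial_allowance
# Marginal applicants denied only because of their examiner
denied_by_assignment = first_stage * (1 - initial_allowance)
# Examiner-dependent allowances as a share of all eventual beneficiaries
share_of_beneficiaries = allowed_by_assignment / ssdi_receipt_rate

print(f"allowed by assignment:  {allowed_by_assignment:.1%}")
print(f"denied by assignment:   {denied_by_assignment:.1%}")
print(f"share of beneficiaries: {share_of_beneficiaries:.0%}")
```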
In contrast, we find that 57 percent of applicants would receive SSDI benefits
regardless of the initial examiner assigned to their case. This is given by the estimated
probability of SSDI receipt for the strictest examiner (see Table 2). Many of
these “always takers” receive benefits only on appeal or reapplication. The remaining
20 percent of applicants are “never takers”; that is, they would never receive
benefits no matter which examiner reviewed their case. Figure 5 illustrates how
these estimates derive from our first-stage regression model of SSDI receipt on the
It is important to note that this estimate accounts for the appeals process, which dampens the effect of initial
examiner assignment considerably. In fact, we estimate the fraction of cases whose initial decision depends on their
examiner to be 60 percent. Previous studies sponsored by SSA also document variation in disability evaluations
(e.g., Nagi 1969, Gallichio and Bye 1981); differences in methodology and caseload composition make it difficult
to compare those studies to ours.
43
Although 23 percent of applicants could have received a different allowance decision had they been assigned
to a different examiner, fewer would have received a different disposition if they had been assigned to a different
examiner (in the same DDS office). This is because most examiners have initial allowance rates close to the average
for their DDS office. For example, if all applicants were reassigned to the average examiner in their DDS office,
only 2 percent (= 0.23 × 0.08, where 0.08 is the mean absolute deviation of EXALLOW) of applicants would have
a new outcome. Alternatively, if one eliminated the top and bottom 1 percent of DDS examiners, then marginal
applicants would represent just under 11 percent (= 0.23 × (0.64 − 0.17)) of all applicants.
[Figure 5. Recoverable labels: Pr(SSDI receipt), scaled 0 to 1; initial allowance rate, scaled 0 to 1 (most lenient, most strict); severity (least severe, most severe); regions “Never receive benefits,” “Marginal applicants,” and “Always receive benefits”; thresholds P_A and P_A + P_M.]
examiner initial allowance rate. Each point on the figure is an applicant-examiner
combination, where applicants are characterized by impairment severity and
examiners are characterized by their initial allowance rate. Applicant severity is on the
y-axis (ordered bottom to top from most to least severe on unobserved scale $u$,
defined below) and examiner leniency is on the x-axis (ordered left to right from
most strict to most lenient). Monotonicity implies that any applicant with $u < P_A$
will receive benefits no matter which examiner is assigned to his case, and any
applicant with $u > P_A + P_M$ will never receive benefits no matter which examiner is
assigned to his case.
44
Only applicants for whom $P_A < u < P_A + P_M$ may or may
not receive benefits depending on the examiner. It is straightforward to show that
$u \equiv 1 - F(s)$ is uniformly distributed, where $F$ is the distribution of unobserved
severity $s$ (see Section III).
45
Since $u$ is uniform, the percent of applicants who are
always takers is $P_A$, the percent of marginal applicants is $P_M$, and the percent of
never takers is $1 - P_A - P_M$. Therefore, the marginal applicants fall between the
twentieth and forty-third percentiles of the distribution of unobserved severity.
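The decomposition above reduces to simple arithmetic, which the following Python sketch illustrates. It is not part of the original analysis: the function name `complier_shares` and the dictionary keys are hypothetical, and the inputs are the estimates quoted in the text ($P_A = 0.57$, $P_M = 0.23$).

```python
def complier_shares(p_always: float, p_marginal: float) -> dict:
    """Split applicants into always takers, marginal applicants, and never takers."""
    p_never = 1.0 - p_always - p_marginal
    if min(p_always, p_marginal, p_never) < 0:
        raise ValueError("shares must be non-negative and sum to at most 1")
    return {
        "always": p_always,
        "marginal": p_marginal,
        "never": p_never,
        # u = 1 - F(s) is reverse-ordered severity: marginal applicants span these u values
        "marginal_u_range": (p_always, p_always + p_marginal),
        # in terms of severity s itself, the ordering flips
        "marginal_severity_range": (p_never, p_never + p_marginal),
    }


shares = complier_shares(0.57, 0.23)
print(shares)
```

With these inputs the never-taker share is about 0.20, and the marginal group lies between roughly the 20th and 43rd percentiles of severity, matching the statement in the text.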
B. Characteristics of Applicants on the Margin of Program Entry
Applicants on the margin of program entry are of significant policy interest since
this group would be most affected by policy interventions affecting initial
determinations, and knowledge of their characteristics could be used to improve cost
forecasts of proposed policy changes. While we cannot identify which applicants are in
the marginal group and which are not, our instrumental variables research strategy
offers a natural opportunity to draw probabilistic inferences about their
characteristics.
46
Specically, the relative likelihood that the marginal entrant has a particular
observable characteristic, compared with a typical applicant, is given by the ratio of
the rst-stage coefcient conditional on that characteristic to the overall rst-stage
coefcient.
47
Table 3 presents rst-stage estimates for groups dened by impair-
ment type, age, and prior earnings quintile. In order to improve the precision of our
estimates, we pool the 2005 and 2006 samples. The coefcient on EXALLOW is sta-
tistically signicant in nearly all groups (except for miscellaneous “special/other”
cases), and its magnitude varies substantially across groups. For example, the rst-
stage coefcient for musculoskeletal cases (0.157) implies that 16 percent of mus-
culoskeletal applicants are on the margin of program entry as a result of variation
across examiners in the same DDS. On the other hand, the coefcient for mental
44
Note that the probability of SSDI receipt for an applicant assigned to an examiner with a 0 percent initial
allowance rate is positive because of the appeals process. Similarly, the probability of SSDI receipt for an appli-
cant assigned to an examiner with a 100 percent (raw) initial allowance rate can be less than 1 after adjusting for
covariates.
45
$\Pr(u \le t) = \Pr(1 - F(s) \le t) = \Pr(F(s) > 1 - t) = \Pr(s > F^{-1}(1 - t)) = 1 - F(F^{-1}(1 - t)) = t$.
46
This insight is due to Angrist and Pischke (2009).
47
This is a simple consequence of Bayes’s rule. The first-stage coefficient conditional on some characteristic
X gives Pr(M | X) = P(X | M)P(M)/P(X), where M denotes marginal applicant. Dividing by the overall first-stage
coefficient, P(M), gives the relative likelihood, P(X | M)/P(X).
impairments (0.322) implies that 32 percent of applicants with mental disorders are
on the margin of program entry.
48
Table 3 also presents the estimated relative likelihoods for each group. From this,
we see that the marginal entrant is 43 percent more likely than the average applicant
to suffer from a mental disorder and 30 percent less likely than average to suffer
from a musculoskeletal impairment.
49
Similarly, the marginal entrant is 43 percent
more likely to be either very young or relatively old (under 29 and over 60), and
22 percent more likely to have very low prior earnings. Thus, a policy change that
uniformly increased allowance thresholds at the initial determination stage would
induce entry by younger, lower-earning individuals with mental impairments.
48
Interestingly, the bulk of musculoskeletal cases are allowed at the appellate stages, while most mental impair-
ment cases are allowed at the initial determination.
49
Another way of characterizing the marginal population is by examining the probability that a marginal appli-
cant has a particular characteristic X. This is just the relative likelihood multiplied by the fraction of applicants with
characteristic X. For example, though applicants with musculoskeletal impairments and mental disorders make up
40 percent and 21 percent of the applicant population, respectively, they account for roughly equal shares of the
marginal population: 28 percent (= 40 × 0.70) and 30 percent (= 21 × 1.43), respectively.
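The relative-likelihood and marginal-share calculations can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the pooled overall first-stage coefficient of 0.225 is an assumption made here (it is not reported in this section, but it lies between the 2005 and 2006 first-stage coefficients of 0.204 and 0.243 quoted elsewhere in the text).

```python
def relative_likelihood(first_stage_group: float, first_stage_overall: float) -> float:
    """P(X | M) / P(X): group first-stage coefficient over the overall one."""
    return first_stage_group / first_stage_overall


def share_of_marginal_population(rel_likelihood: float, applicant_share: float) -> float:
    """P(X | M): relative likelihood times the share of applicants with X."""
    return rel_likelihood * applicant_share


OVERALL_FIRST_STAGE = 0.225  # assumed pooled value, not taken from the paper

groups = [("musculoskeletal", 0.157, 0.40), ("mental disorders", 0.322, 0.21)]
for label, coefficient, applicant_share in groups:
    rl = round(relative_likelihood(coefficient, OVERALL_FIRST_STAGE), 2)
    print(label, rl, round(share_of_marginal_population(rl, applicant_share), 2))
```

Under that assumed denominator, the ratios round to the relative likelihoods reported in Table 3 for these two groups, and multiplying by the applicant shares gives marginal-population shares close to the 28 and 30 percent cited above.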
T 3—H: F-S R, 2005  2006 C
Observations
Initial
allowance rate
(percent)
Ultimate
allowance rate
(percent)
Fraction of
always takers
( P
A
)
Coefcient on
EXALLOW t-stat
Relative
likelihood
Body system code
Musculoskeletal system 839,847 23.3 62.4 0.567 0.157*** 15.51 0.70
Mental disorders 455,433 40.8 66.6 0.547 0.322*** 25.24 1.43
Cardiovascular system 185,063 36.1 70.6 0.617 0.252*** 14.78 1.11
Neurological 181,162 45.3 73.5 0.636 0.274*** 17.74 1.21
Endocrine system 94,156 20.0 61.4 0.579 0.098*** 3.69 0.43
Respiratory system 88,578 43.7 71.3 0.656 0.162*** 7.30 0.72
Special/other 85,587 8.7 36.3 0.370 0.019 0.66 0.08
Malignant neoplastic diseases 77,021 58.2 77.9 0.667 0.297*** 16.34 1.31
Immune system disorders 59,188 37.5 70.0 0.570 0.348*** 13.41 1.54
Digestive system 53,224 26.4 60.9 0.516 0.260*** 8.49 1.15
Special senses and speech 48,396 41.1 63.7 0.550 0.240*** 7.32 1.06
Genitourinary impairments 31,837 71.1 84.2 0.742 0.264*** 11.78 1.17
Skin disorders 7,483 28.2 57.7 0.458 0.339*** 3.55 1.50
Hematological disorders 7,091 32.8 62.5 0.500 0.344*** 3.77 1.52
Age at decision
18–29 193,824 22.1 41.7 0.301 0.324*** 18.42 1.43
30–39 342,535 20.3 50.6 0.422 0.235*** 15.97 1.04
40–49 627,727 20.8 60.8 0.541 0.185*** 18.31 0.82
50–59 772,263 41.6 78.2 0.715 0.186*** 25.37 0.82
60–64 277,717 57.3 73.3 0.615 0.320*** 25.01 1.42
30–44 621,664 20.2 53.5 0.460 0.211*** 19.24 0.93
45–64 1,398,578 9.7 73.6 0.657 0.217*** 34.81 0.96
Average prior earnings
Bottom quintile 442,814 19.7 42.6 0.328 0.275*** 21.80 1.22
Second quintile 442,813 26.3 60.3 0.517 0.243*** 21.77 1.08
Third quintile 442,813 31.3 68.3 0.605 0.220*** 20.85 0.97
Fourth quintile 442,813 38.2 74.5 0.668 0.214*** 21.90 0.95
Top quintile 442,813 47.9 80.2 0.735 0.178*** 20.98 0.79
Note: Standard errors clustered at examiner level; control variables include: DDS dummies, three-digit zip codes,
body system codes, terminal illness indicator, age group dummies, average previous earnings, and month dummies.
*** Signicant at the 1 percent level.
** Signicant at the 5 percent level.
* Signicant at the 10 percent level.
Such compositional changes could have important implications for government
spending. Individuals with mental impairments incur annual medical costs that are
three times higher than medical costs for individuals with relatively mild physical
impairments such as back and joint problems and cardiovascular problems (Foote
and Hogan 2001). Individuals with mental impairments also spend 50 percent more
time on the disability insurance rolls compared to individuals with musculoskeletal
impairments overall; this gure increases to more than 60 percent for individuals
who enter between ages 18 and 34 (Rupp and Scott 1996).
Finally, by computing the percent of applicants in each group who are always
takers, we can benchmark differences in unobserved severity across groups of
applicants. For example, only 33 percent of applicants from the bottom prior earn-
ings quintile have impairments above the severity threshold of all examiners, while
74 percent of applicants in the top prior earnings quintile do. This pattern suggests
economic conditions may motivate the application decisions of low-income workers
with less severe impairments.
V. Effects of SSDI Receipt on Labor Supply
A. Main Estimates
In this section we present our main estimates of the average causal effect of SSDI
receipt on the employment and earnings of the marginal entrant. Table 4 presents
results for the labor supply regressions estimated by OLS and IV (2SLS), sepa-
rately for applicants with 2005 and 2006 decisions. We examine three outcomes:
employment (earning more than $1,000/year), employment above SGA (e.g.,
earning more than $11,760 per year in 2009), and annual earnings. We measure
all labor supply outcomes two and three years after the year of the initial decision,
and we also examine outcomes four years later for those who received their initial
determinations in 2005. In all models, we control for geography (DDS office and
three-digit zip code indicators), characteristics of the impairment (body system
codes and terminal illness indicator), age, prior labor market history, and seasonal-
ity (month of decision).
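To make the contrast between the OLS and IV (2SLS) estimators concrete, the sketch below runs both on synthetic data. It is a stylized illustration under stated assumptions, not the specification estimated in the paper: the variable names, the data-generating process, the constant treatment effect of −0.28, and the omission of all controls other than an intercept are choices made here for exposition.

```python
import numpy as np


def ols_coefficient(y, d, x):
    """OLS coefficient on d from a regression of y on d and controls x."""
    beta, *_ = np.linalg.lstsq(np.column_stack([d, x]), y, rcond=None)
    return beta[0]


def tsls_coefficient(y, d, z, x):
    """2SLS coefficient on d, instrumenting d with z; x holds exogenous controls."""
    first_stage = np.column_stack([z, x])
    pi, *_ = np.linalg.lstsq(first_stage, d, rcond=None)
    d_hat = first_stage @ pi                      # fitted values from the first stage
    beta, *_ = np.linalg.lstsq(np.column_stack([d_hat, x]), y, rcond=None)
    return beta[0]


rng = np.random.default_rng(0)
n = 200_000
severity = rng.normal(size=n)                     # unobserved impairment severity
leniency = rng.uniform(-0.2, 0.2, size=n)         # examiner leniency (the instrument)
# Allowance depends on severity and, independently, on the assigned examiner.
allow = (severity + 3 * leniency + rng.normal(scale=0.5, size=n) > 0).astype(float)
# Outcome falls with severity and, by assumption, by 0.28 with benefit receipt.
outcome = 0.5 - 0.15 * severity - 0.28 * allow + rng.normal(scale=0.3, size=n)

const = np.ones((n, 1))
ols_est = ols_coefficient(outcome, allow, const)
iv_est = tsls_coefficient(outcome, allow, leniency, const)
print(f"OLS: {ols_est:.3f}  2SLS: {iv_est:.3f}")
```

Because allowed applicants are more severely impaired in this simulation, the OLS coefficient overstates the drop in the outcome, while 2SLS recovers a value near the assumed −0.28. A continuous outcome is used only to keep the example linear.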
The OLS estimates suggest that SSDI receipt is associated with an approximately
34–35 percentage point drop in employment two years after the initial
determination, a 24–25 percentage point drop in the probability of engaging in substantial
gainful activity, and a roughly $7,000 drop in annual earnings.
50
Covariates have
little effect on the magnitude of the estimates.
In contrast, our IV estimates imply that SSDI receipt causes a 28 percentage point
decrease in employment two years after the initial determination. Thus, adjusting for
differences in unobservable characteristics via 2SLS has a substantial impact on the
estimated labor supply effects. By four years after the initial determination, the esti-
mated employment effect is only 16 percentage points. This pattern is repeated for
the other two outcomes, employment above SGA and annual earnings. Two years
after the initial decision, SSDI receipt leads to an 18–19 percentage point decrease
50
Note that, because our measure of earnings includes zeros for nonworkers in place of potential earnings, the
estimated earnings effects are likely understated for both OLS and IV.
in employment above SGA and a $3,800 to $4,600 decrease in annual earnings.
Collectively, our estimates imply very large labor supply effects.
51
It is worth noting, however, that the earnings estimates imply that beneficiaries would have earned
51
The computation of percent effects is not straightforward because we observe labor supply only for all allowed
and all denied applicants, and not for the subset of applicants on the margin of program entry. Assuming their
employment rate is between the rate for all allowed and all denied, then we find that SSDI reduces employment by
53–189 percent, employment above SGA by 66–384 percent, and earnings by 42–194 percent. Abadie (2003) has
developed a method to estimate the counterfactual labor supply of the marginal applicant, but the method depends
on having a binary instrument. Rather than discretize our instrument, we chose to bound the size of the effect.
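The bounds in this footnote appear consistent with dividing the two-year IV estimates for 2005 decisions by the mean outcomes of denied and allowed applicants reported in Table 4. The Python sketch below reproduces that arithmetic; the function name is hypothetical, and reading the bounds this way is an interpretation rather than a procedure stated in the text.

```python
def percent_effect_bounds(effect: float, mean_allowed: float, mean_denied: float):
    """Bound the percent effect by scaling with the highest and lowest baseline."""
    magnitude = abs(effect)
    lower = 100.0 * magnitude / max(mean_allowed, mean_denied)
    upper = 100.0 * magnitude / min(mean_allowed, mean_denied)
    return lower, upper


# (IV estimate, mean | allowed, mean | denied), two years after 2005 decisions
inputs = {
    "employment": (-0.279, 0.148, 0.522),
    "employment above SGA": (-0.192, 0.050, 0.293),
    "earnings": (-3781.0, 1951.0, 8928.0),
}
for outcome, args in inputs.items():
    low, high = percent_effect_bounds(*args)
    print(f"{outcome}: {low:.0f} to {high:.0f} percent")
```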
T 4 —E  SSDI R  E  E
Two years after decision Three years after decision Four years after decision
Outcome OLS IV OLS IV OLS IV
Panel A. 2005 decisions
1) Earn > = $1,000/year
Mean dependent variable | allowed
0.148 0.128 0.106
Mean dependent variable | denied
0.522 0.515 0.471
Coefcient on ALLOW
0.347*** 0.279*** 0.361*** 0.227*** 0.345*** 0.158***
(−322.48) (−8.64) (−336.60) (−6.99) (−321.81) (−4.83)
R
2
0.200 0.195 0.218 0.200 0.209 0.171
2) Earn > = SGA
Mean dependent variable | allowed
0.050 0.043 0.033
Mean dependent variable | denied
0.293 0.302 0.270
Coefcient on ALLOW
0.242*** 0.192*** 0.255*** 0.166*** 0.233*** 0.113***
(−256.29) (−7.62) (−264.05) (−6.70) (−252.71) (−4.59)
R
2
0.149 0.144 0.166 0.152 0.156 0.128
3) Earnings
Mean dependent variable | allowed
1,951 1,737 1,494
Mean dependent variable | denied
8,928 9,191 8,496
Coefcient on ALLOW
7,435*** 3,781*** 7,715*** 3,007*** 7,221*** 1,716
(−126.50) (−3.05) (−182.51) (−2.92) (−176.24) (−1.60)
R
2
0.133 0.117 0.145 0.114 0.125 0.084
Observations 1,090,345 1,069,494 1,042,666
Panel B. 2006 decisions
1) Earn > = $1,000/year
Mean dependent variable | allowed
0.133 0.103
Mean dependent variable | denied
0.489 0.448
Coefcient on ALLOW
0.335*** 0.283*** 0.330*** 0.207***
(−297.39) (−11.25) (−285.84) (−8.51)
R
2
0.194 0.192 0.195 0.177
2) Earn > = SGA
Mean dependent variable | allowed
0.042 0.030
Mean dependent variable | denied
0.267 0.242
Coefcient on ALLOW
0.226*** 0.175*** 0.213*** 0.152***
(−218.54) (−8.82) (−208.98) (−8.16)
R
2
0.141 0.136 0.140 0.132
3) Earnings
Mean dependent variable | allowed
1,688 1,365
Mean dependent variable | denied
8,044 7,562
Coefcient on ALLOW
6,915*** 4,581*** 6,580*** 3,308***
(−84.14) (−5.93) (−117.85) (−4.60)
R
2
0.139 0.131 0.129 0.112
Observations 1,123,721 1,094,347
Notes: Standard errors clustered at examiner level; t-statistics in parentheses; ALLOW denotes actual award deci-
sion (rather than initial decision). Control variables include: DDS dummies, three-digit zip codes, body system
codes, terminal illness indicator, age group dummies, average previous earnings, and month dummies.
*** Signicant at the 1 percent level.
** Signicant at the 5 percent level.
* Signicant at the 10 percent level.
only one-quarter to one-half of their preapplication earnings in the absence of benefit
receipt.
Another way of obtaining the IV estimates is to estimate the reduced form effects
of the examiner allowance rate on labor supply and scale by the first-stage
coefficient from the regression of SSDI receipt on the examiner allowance rate. Table 5
shows that a 10 percentage point increase in the examiner allowance rate decreases
employment two years after the initial determination by 0.6–0.7 percentage points.
If we attribute this difference entirely to differences in SSDI receipt, then dividing
by the first-stage coefficient implies that SSDI receipt decreases employment by
28 percentage points (= 0.057/0.204, or 0.069/0.243, for 2005 and 2006,
respectively). This assumes that the only impact examiners have on applicants is through
the channel of benefit receipt.
But as Parsons (1991) noted, the SSDI application process is itself a distinct
channel that may adversely affect the labor supply of applicants; in particular,
denied applicants who pursue appeals spend substantially more time in process,
waiting for their cases to be evaluated. While applicants await a final decision,
they cannot engage in substantial gainful activity without compromising their
eligibility for benefits. If time out of the labor market adversely affects employment
outcomes (e.g., by depreciating skills, or signaling low labor force attachment to
potential employers), then when denied applicants attempt to reenter the labor
market after exhausting their appeals they will have a more difficult time finding
work than if they had never appealed. Thus, applicants assigned to relatively strict
examiners who spend more time waiting for a decision may have lower employ-
ment rates than they otherwise would have, leading us to underestimate the effect
of SSDI receipt on labor supply.
T 5—R F R: E  EXALLOW  E  E
2005 decisions 2006 decisions
Outcome Two years Three years Four years Two years Three years
1) Earn > = $1,000/year
Coefcient on EXALLOW
0.057*** 0.047*** 0.033*** 0.069*** 0.050***
(−8.05) (−6.60) (−4.55) (−10.51) (−8.00)
R
2
0.082 0.086 0.077 0.077 0.070
2) Earn > = SGA
Coefcient on EXALLOW
0.039*** 0.034*** 0.023*** 0.042*** 0.037***
(−7.21) (−6.23) (−4.32) (−8.34) (−7.79)
R
2
0.049 0.054 0.051 0.046 0.044
3) Earnings
Coefcient on EXALLOW
772*** 619*** 355 1,114*** 804***
(−2.97) (−2.83) (−1.57) (−5.77) (−4.49)
R
2
0.066 0.062 0.054 0.071 0.058
Observations 1,090,345 1,069,494 1,042,666 1,123,721 1,094,347
Notes: Standard errors clustered at examiner level; t-statistics in parentheses. Control variables include: DDS dum-
mies, three-digit zip codes, body system codes, terminal illness indicator, age group dummies, average previous
earnings, and month dummies.
*** Signicant at the 1 percent level.
** Signicant at the 5 percent level.
* Signicant at the 10 percent level.
Figure 6 shows the relationship between our instrumental variable and average
initial (DDS-level) and final processing times. Final processing times are
measured as time from filing date to benefit receipt date (minus the five-month waiting
period, if applicable) for SSDI beneficiaries, and time to last observed decision for
nonbeneficiaries.
52
While initial processing times (approximately four months,
or 120 days on average) appear to be unrelated to the examiner allowance rates,
final processing times are shorter for more lenient examiners. As a result,
applicants assigned to the most lenient examiners spend approximately 125 fewer days
awaiting a final allowance decision than applicants assigned to an average
examiner. This pattern is the product of two forces. First, applicants denied by strict
examiners are more likely to enter the appeals process, which adds, on average,
five months at the reconsideration stage and an additional two years for applicants
progressing to the ALJ level (Autor et al. 2011). Second, mitigating the impact of
the first channel, initially denied applicants with more severe impairments
(concentrated among the stricter examiners) will spend less time in the appeals
process, all else equal, because the denied claims are more likely to be overturned at
earlier stages in the appeals process.
To investigate the inuence of endogenous application processing time on the esti-
mated labor supply effect, we perform a back-of-the-envelope calculation where we
add an estimate of the labor supply depreciation effect to the reduced form effect of
EXALLOW on labor supply in order to “restore” the lost skills due to time out of the
labor market due to appeals. To do this, we use an estimate from Autor et al. (2011)
52
Since we do not observe higher-level appeals, final processing times are censored at the ALJ level for a very
small fraction of denied applicants with cases still in progress more than five years after their initial determination.
It is also possible that final processing times are censored at the initial decision for denied applicants who bypass
the appeals process and submit a new application.
[Figure 6. Initial and final processing times in days (vertical axis, 100 to 500) plotted against the residualized initial allowance rate (horizontal axis, −0.2 to 0.2). Series: initial decision time and final decision time.]
Notes: Ninety-five percent confidence intervals shown with dashed lines. Final processing times are measured
through the appeals stage. Bandwidth is 0.067 for initial and 0.082 for final processing time.
Source: DIODS data for 2005 and 2006.
of the effect of application processing time on employment using exogenous
variation in examiners’ average initial processing times among initially allowed
applicants; they estimate a depreciation effect of approximately 0.4 percentage points
per month, or 1.6 percentage points over four months (the average difference in
processing time between the strictest and most lenient examiners). Adding this to the
reduced form estimate and dividing by the first-stage coefficient yields a modified
2SLS estimate of 35–36 percentage points (for 2005 and 2006, respectively), slightly
larger than the OLS estimates. This is consistent with Parsons’s (1991) observation
that skill depreciation associated with the lengthy appeals process invalidates the
interpretation of observational differences in labor supply between allowed and
denied applicants as an upper bound on the effect of SSDI receipt on labor supply.
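The back-of-the-envelope adjustment can be written out explicitly. The Python sketch below works in magnitudes (percentage-point decreases expressed as positive fractions) and uses the reduced-form and first-stage coefficients quoted earlier for 2005 and 2006; the function and variable names are hypothetical.

```python
def adjusted_wald(reduced_form: float, depreciation: float, first_stage: float) -> float:
    """Add the depreciation effect to the reduced form, then scale by the first stage."""
    return (reduced_form + depreciation) / first_stage


MONTHLY_DEPRECIATION = 0.004   # about 0.4 percentage points per month
MONTHS = 4                     # average gap between strictest and most lenient examiners
depreciation = MONTHLY_DEPRECIATION * MONTHS   # about 1.6 percentage points

adjusted = {
    2005: adjusted_wald(0.057, depreciation, 0.204),
    2006: adjusted_wald(0.069, depreciation, 0.243),
}
for year, value in adjusted.items():
    print(year, round(100 * value, 1), "percentage points")
```

Both values fall in the 35–36 percentage point range cited in the text.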
It is also of note that the magnitude of the effect of SSDI on labor supply declines
over time. Certainly, SSDI work incentives become weaker with program tenure
as beneciaries are more likely to have exhausted their Trial Work Period; but this
should cause the labor supply disincentive effect to increase over time, not decrease.
Similarly, the possibility of recovery should lead to greater labor supply effects over
time if many denied applicants eventually recover capacity and return to work.
One potential explanation is that as the sample ages over time, more denied
applicants become eligible for and claim Social Security benefits rather than returning to
work. But when we estimate our results on the subset of individuals ages 18–57 in
the year of decision (i.e., no more than 61 years old four years later), the declining
pattern over time still holds (see Table A3 in the online Appendix).
53
Similarly, we
rule out differential mortality as a potential explanation by estimating regressions on
a balanced panel of individuals surviving at least four years after the initial decision
(see Table A4 in the online Appendix).
54
Another explanation for the declining effect is the Great Recession. The labor
supply effect of SSDI receipt reects both the supply of and demand for disabled
workers. A lack of employment opportunities for disabled workers in 2008 and
2009—even for those who were willing and able to work— could lead to smaller
labor supply effects in the later years of the sample period. We think the recession
is unlikely to be the main cause of the declining labor supply effect for two rea-
sons. First, the estimated effects for two years and three years later, respectively,
are similar for the 2005 and 2006 decision-year samples even though the labor sup-
ply outcomes were measured one year apart, before and after the start of the Great
Recession. Second, we estimated the model stratifying by state-level unemployment
rates in 2009 (not shown) and did not find any discernible pattern across states
affected differentially by the Great Recession.
The declining pattern in labor supply effects could also reect the confounding
effect of long processing times for those who appeal. On the one hand, if more
and more denied applicants return to the labor market over time as they complete
53
Not surprisingly, the estimated labor supply effects are larger when Social Security benefit claiming is
excluded as an option for denied applicants.
54
As a nal sensitivity check, we try alternative sample denitions where we include only examiners with at
least 10, 100, or 150 cases (instead of 30), and we add back examiners with more than 900 cases (suspected train-
ing cases). Table A5 in the online Appendix shows that larger minimum values result in somewhat larger estimated
labor supply effects, while adding back training cases does not affect the estimates substantially. The rst stage is
relatively stable across the range of cutoff values.
appeals, then the labor supply disincentive effect should actually increase. But if
the confounding depreciation effect of processing time increases over time as final
processing times unfold, then the estimated labor supply effect could fall even if the
true disincentive effect remains constant.
Finally, the observed pattern is consistent with denied applicants perhaps trying
but failing to sustain employment. This would imply that the work capacity of SSDI
beneciaries on the margin of program entry diminishes with time since onset of
the disability.
B. Heterogeneity in the Effect of SSDI Receipt on Labor Supply
Our main estimates imply that the employment rate of new beneficiaries would
have been 28 percentage points higher in the absence of benefit receipt. However,
there is no reason to expect this effect to be the same for all beneficiaries; those with
more severe impairments should have less employment capacity than those with less
severe impairments. If this is the case, the effect of SSDI receipt on labor supply
should vary with impairment severity. Our unique research strategy, leveraging
variation in allowance thresholds across thousands of examiners, offers a first opportunity
to test for heterogeneity in the effect of SSDI receipt on labor supply. In this
section, we show that the effect of SSDI receipt varies along both observable and
unobservable dimensions. This implies that the 2SLS estimates from the previous section
recover a local average treatment effect (LATE), where the causal effect of SSDI
receipt is averaged over the population of applicants on the margin of program entry.
In Table 6 we present estimates of the effect of SSDI receipt on employment
two years after the initial decision for groups defined by impairment type, age, and
prior earnings quintile. The differences across impairment types are striking. SSDI
receipt causes a 17.3 percentage point reduction in employment among applicants
with musculoskeletal impairments compared to a 36.6 percentage point reduction
for applicants with mental disorders, more than twice as large.
55
The effects among
those with malignant neoplastic diseases (i.e., cancer) and cardiovascular, neuro-
logical, and digestive disorders are of similar magnitude as the effect for mental
disorders. We estimate even larger labor supply effects for applicants with endocrine
system diseases (e.g., diabetes) and skin disorders (49.4 and 91.6 percentage points,
respectively).
56
We estimate small and statistically insignificant labor supply effects
among those with respiratory, immune system, special senses and speech,
genitourinary, and hematological disorders.
We also test for heterogeneous labor supply effects by age and by prior earn-
ings quintile. Using observational comparisons of allowed and denied applicants,
von Wachter, Song, and Manchester (2011) found larger differences in labor supply
among younger applicants (ages 30–44) than older applicants (45–64). This pattern
holds for the causal effect as well. Our IV estimates imply that SSDI receipt causes a
34.2 percentage point reduction in employment among younger applicants compared
55
This may partly reect differences in decision times across the two groups. Recall that applicants with mus-
culoskeletal impairments were much more likely than applicants with mental impairments to appeal. Indeed, we
estimate similar fractions of always takers within the two groups (see Table 3).
56
While we report the IV estimate for “special/other” impairments for completeness, there is no first stage for
this group (see Table 3), and as a result the second stage estimate is not meaningful.
to a 26.6 percentage point reduction among older applicants. The patterns by income
quintile point to smaller employment effects for applicants in the top prior earnings
quintile, consistent with our earlier nding that applicants in the top quintile tend to
have more severe impairments (see Section IVB).
More generally, the differences in labor supply effects across groups could be due
to differences in unobservable characteristics such as impairment severity. Because
applicants are (conditionally) randomly assigned to a large number of examiners,
each with a different allowance threshold, we can trace out the effect of SSDI receipt
along different margins of entry by estimating marginal treatment effects (MTE)
(Heckman, Urzua, and Vytlacil 2006). As we showed in Section IVA, under mono-
tonicity these margins correspond to percentiles of the distribution of unobserved
severity. This exercise is particularly interesting in that it tests for heterogeneous
treatment effects on the basis of unobservables.
Practically, computation of the MTE can be accomplished by computing numerical
derivatives of a smoothed function relating mean labor supply outcome, $E[y]$, to
T 6— H: E  SSDI R  E (two years later),
2005  2006 C
OLS IV
Observations
Mean
employment |
allowed
Mean
employment |
denied
Coefcient
on
ALLOW
t-stat
Coefcient
on
ALLOW
t-stat
Body system code
Musculoskeletal system
839,847 0.125 0.520
0.361*** 300.48 0.173*** 3.42
Mental disorders
455,433 0.171 0.535
0.333*** 210.25 0.366*** 11.73
Cardiovascular system
185,063 0.116 0.462
0.328*** 134.18 0.335*** 5.81
Neurological
181,162 0.119 0.538
0.386*** 144.34 0.359*** 7.40
Endocrine system
94,156 0.130 0.467
0.316*** 100.41 0.494** 2.14
Respiratory system
88,578 0.099 0.462
0.309*** 84.61 0.061 0.51
Special/other
85,587 0.132 0.322
0.222*** 57.78
2.000 0.59
Malignant neoplastic
diseases
77,021 0.212 0.621
0.390*** 94.75 0.347*** 6.18
Immune system disorders
59,188 0.170 0.531
0.330*** 65.02 0.048 0.61
Digestive system
53,224 0.158 0.523
0.353*** 80.64 0.324*** 3.08
Special senses and speech
48,396 0.158 0.529
0.344*** 77.59 0.158 1.37
Genitourinary impairments
31,837 0.185 0.540
0.327*** 41.38 0.084 0.74
Skin disorders
7,483 0.156 0.560
0.377*** 32.25 0.916*** 3.66
Hematological disorders
7,091 0.219 0.590
0.336*** 26.35
0.116 0.44
Age at decision
18–29
193,824 0.289 0.645
0.364*** 154.42 0.336*** 6.24
30–39
342,535 0.201 0.581
0.397*** 184.06 0.369*** 7.09
40–49
627,727 0.153 0.516
0.379*** 266.59 0.355*** 7.56
50–59
772,263 0.111 0.443
0.348*** 234.89 0.290*** 8.42
60–64
277,717 0.093 0.223
0.140*** 79.17 0.105*** 3.47
30–44
621,664 0.183 0.561
0.395*** 273.58 0.342*** 8.06
45–64
1,398,578 0.115 0.418
0.317*** 312.78 0.266*** 11.13
Average prior earnings
Bottom quintile
442,814 0.137 0.336
0.202*** 155.37 0.296*** 7.81
Second quintile
442,813 0.153 0.542
0.357*** 241.28 0.319*** 7.69
Third quintile
442,813 0.141 0.598
0.427*** 271.37 0.311*** 7.30
Fourth quintile
442,813 0.132 0.625
0.466*** 280.66 0.305*** 7.34
Top quintile
442,813 0.140 0.616
0.449*** 250.98 0.192*** 3.76
Notes: Standard errors clustered at examiner level. Control variables include: DDS dummies, three-digit zip codes,
body system codes, terminal illness indicator, age group dummies, average previous earnings, and month dummies.
*** Signicant at the 1 percent level.
** Signicant at the 5 percent level.
* Signicant at the 10 percent level.
the predicted probability of SSDI receipt. Specifically, we regress initial allowance
decisions on indicators for type of impairment, age group, decision month, and
DDS, as well as a measure of average prior earnings, and construct the residual, $Z$,
which by construction is orthogonal to the case mix controls and varies
systematically only with EXALLOW. Then we estimate a probit of ultimate SSDI receipt on
the residualized $Z$. This is our measure of the predicted probability of SSDI receipt,
$P(Z)$. Next we estimate a local quadratic regression of employment on predicted
SSDI receipt and compute the numerical derivative of this function to estimate
$\partial E[y] / \partial P(Z)$.
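The last two steps of this procedure (local quadratic smoothing followed by numerical differentiation) can be sketched in Python on synthetic data. This is a stylized illustration, not the paper's code: the residualization and probit steps are skipped and $P(Z)$ is simulated directly, the Epanechnikov kernel is an assumption (the kernel is not stated in this section), the bandwidth of 0.084 is taken from the Figure 7 notes, and the "true" outcome curve is invented so that the estimate can be checked against a known derivative.

```python
import numpy as np


def local_quadratic_fit(p, y, grid, bandwidth):
    """Kernel-weighted quadratic fit of y on p, evaluated at each grid point."""
    fitted = np.empty(len(grid))
    for i, p0 in enumerate(grid):
        dp = p - p0
        w = np.clip(1.0 - (dp / bandwidth) ** 2, 0.0, None)  # Epanechnikov weights
        keep = w > 0
        X = np.column_stack([np.ones(keep.sum()), dp[keep], dp[keep] ** 2])
        sw = np.sqrt(w[keep])                                # weighted least squares
        coef, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
        fitted[i] = coef[0]          # intercept = smoothed E[y | P(Z) = p0]
    return fitted


rng = np.random.default_rng(0)
n = 200_000
p = rng.uniform(0.57, 0.80, size=n)             # simulated predicted receipt P(Z)
# Hypothetical outcome curve: the true MTE is -2 * (p - 0.57) by construction.
y = 0.4 - (p - 0.57) ** 2 + rng.normal(scale=0.2, size=n)

grid = np.linspace(0.60, 0.77, 18)              # evaluation points, spaced 0.01 apart
smoothed = local_quadratic_fit(p, y, grid, bandwidth=0.084)
mte = np.gradient(smoothed, grid)               # numerical derivative of E[y] in P(Z)

for p0, effect in zip(grid[::4], mte[::4]):
    print(f"P(Z) = {p0:.2f}: estimated MTE = {effect:.3f}")
```

In this simulation the estimated derivative becomes more negative as $P(Z)$ rises, by construction. Estimates near the ends of the grid are noisier because fewer observations fall inside the kernel window there.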
Figure 7 shows the MTE as a function of unobserved severity, where severity is
reverse ordered and measured in percentiles (see definition of u in Section IVA),
along with bootstrapped 95 percent confidence intervals. Applicants on the margin
for an examiner with a predicted SSDI receipt rate of 65 percent (the mean rate)
are in the sixty-fifth percentile of the unobserved (reverse) severity distribution.
That is, they have an impairment that is less severe than 65 percent of applicants,
and more severe than 35 percent of applicants. Since we estimate that 57 percent of
applicants are always takers (that is, they would receive SSDI benefits regardless of
initial examiner assignment), the MTE is not identified for applicants on the margin
of SSDI receipt rates less than 57 percent. Similarly, the MTE is not identified for
applicants on the margin of SSDI receipt rates greater than 80 percent (= 57 + 23,
the fraction of marginal applicants). As a result, we are only able to trace the MTE
for applicants between the fifty-seventh and eightieth percentiles of the unobserved
(reverse) severity distribution (or the twentieth to forty-third percentiles of the
actual unobserved severity distribution s). The estimates become imprecise at the
more extreme ends of the distribution since there are relatively small numbers of
examiners with margins at these points.
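The arithmetic behind these bounds can be written out as a short sketch. The function name is hypothetical; the inputs are the 57 percent always-taker share and the 23 percent marginal share reported in the text, expressed as whole-number percentages.

```python
def identified_mte_range(always_taker_pct, marginal_pct):
    """Return the percentile bounds over which the MTE can be traced.

    Inputs are whole-number percentages of applicants.
    """
    lower_reverse = always_taker_pct
    upper_reverse = always_taker_pct + marginal_pct
    # A reverse-severity percentile r corresponds to actual severity 100 - r.
    actual = (100 - upper_reverse, 100 - lower_reverse)
    return (lower_reverse, upper_reverse), actual

reverse_range, actual_range = identified_mte_range(57, 23)
print(reverse_range, actual_range)  # (57, 80) (20, 43)
```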
Figure 7. Marginal Treatment Effect on Employment
[Figure: horizontal axis is the percentile of the (reverse) unobserved severity distribution, 0.55 to 0.8; vertical axis runs from –2 to 0.]
Notes: Ninety-five percent confidence intervals shown with dashed lines. Bandwidth is 0.084.
Source: DIODS data for 2005 and 2006.
The estimated MTE declines monotonically as unobserved severity falls (or
reverse severity rises); because it is negative in value, as it declines it increases in
magnitude, implying the disincentive effect of SSDI rises as the allowance threshold
is lowered and applicants with less severe impairments are allowed on the program.
The MTE is not statistically different from zero among applicants with the highest
severity impairments in the marginal population but increases in magnitude to more
than 50 percentage points among those in the marginal population with the least
severe impairments.⁵⁷ This implies that beneficiaries with relatively severe impairments
have little work capacity, while those with the least severe impairments have
a very large degree of work capacity.
Our finding that the MTE is heterogeneous in unobserved severity is important
for interpreting our 2SLS estimate as a LATE and comparing our estimate to other
estimates from the literature. Of particular interest is the comparison of our estimate
of 28 percentage points with French and Song’s (2011) estimate of 26 percentage
points based on applicants on the margin of allowance at the ALJ level among
those who appeal their initial determination. In light of the interpretation of both
estimates as LATEs, it should not be surprising that French and Song’s estimate is
similar to ours. In both cases, the estimated effects are averaged among the marginal
program entrants, that is, excluding the always takers and never takers. Although the
French-Song sample excludes initially allowed applicants, as well as initially denied
applicants who do not pursue their cases further, these groups are included among
the always takers and never takers, respectively, in our framework, and thus they do
not contribute to the estimation. By the same token, the French-Song sample also
includes always takers (those who win their appeal no matter to which judge they
are assigned) and never takers (those who lose their appeal no matter to which judge
they are assigned), and these are also always takers and never takers, respectively,
in our framework. Thus, the margin of allowance is the same margin in both natural
experiments, and it makes sense that both estimates are similar.
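The claim that always takers and never takers do not contribute to the estimate can be checked with a stylized sketch. It simplifies the examiner instrument to a binary one (a stricter versus a more lenient examiner), which is an assumption for illustration and not the paper's continuous EXALLOW measure. The type shares follow the text (57 percent always takers, 23 percent marginal applicants, and the remaining 20 percent never takers). The untreated employment rates and the effects for always takers and never takers are hypothetical values chosen to differ from the marginal group's effect of –0.28.

```python
import numpy as np

# Stylized population of 100 applicants per examiner type; shares follow the text.
types = np.array(["always"] * 57 + ["complier"] * 23 + ["never"] * 20)

# Hypothetical untreated employment rates and treatment effects by type.
y0 = {"always": 0.10, "complier": 0.50, "never": 0.60}
effect = {"always": -0.05, "complier": -0.28, "never": -0.60}

def arm_means(z):
    """Mean employment and mean SSDI receipt when all face examiner type z."""
    # Always takers receive benefits under either examiner; compliers only
    # under the lenient examiner (z = 1); never takers under neither.
    d = np.array([1.0 if t == "always" or (t == "complier" and z == 1) else 0.0
                  for t in types])
    y = np.array([y0[t] + effect[t] * di for t, di in zip(types, d)])
    return y.mean(), d.mean()

y_lenient, d_lenient = arm_means(1)
y_strict, d_strict = arm_means(0)

first_stage = d_lenient - d_strict                # share of marginal applicants
wald = (y_lenient - y_strict) / first_stage       # IV (Wald) estimate
print(round(first_stage, 2), round(wald, 2))      # 0.23 -0.28
```

The Wald ratio recovers the marginal group's effect regardless of the values assigned to the other two groups, because their benefit receipt, and hence their employment, does not change with examiner assignment.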
VI. Conclusion
This paper presents causal estimates of the disincentive effect of the SSDI program
on the labor supply of program entrants. Ours is the first set of estimates of
the disincentive effect estimated using the entire population of SSDI applicants in
the present-day system. In addition, our quasi-experimental research design applied
to a new administrative dataset facilitates examination of the important heretofore
unanswered policy questions of whether the program disincentive effect varies
across individuals, to what extent, and in what ways. In particular, we can assess the
extent to which the disincentive effect varies with unobservable impairment severity.
This is of particular interest since over the last two decades, the SSDI caseload
has become increasingly dominated by individuals with impairments that are
particularly difficult to assess, such as mental and musculoskeletal impairments.
⁵⁷ Recalling that average processing times fall with examiner allowance rates above the average, the MTE may
increase in magnitude at an even greater rate in this part of the distribution if longer processing times affect
employment independently of SSDI receipt.
We find that employment of the marginal program entrant would be on average
28 percentage points greater in the absence of SSDI benefit receipt two years after
the initial determination. His or her likelihood of engaging in substantial gainful
activity as defined by the SSDI program would be on average 18–19 percentage
points higher, and he or she would earn $3,800 to $4,600 more per year on average
in the absence of SSDI benefit receipt. We estimate that in the present system,
23 percent of applicants are on the margin of program entry, that is, their ultimate
outcome depends on their initial examiner assignment. The marginal SSDI entrant
is more likely to have a mental disorder, be younger, and have preonset earnings in
the lowest earnings quintile. Such individuals tend to have higher expected medical
costs and greater expected program duration.
Importantly, we also find that the SSDI labor supply disincentive effect is not constant
across individuals. Over a range corresponding to approximately 10 percentage
points above and below the current system allowance rate, the marginal treatment
effect varies from being statistically indistinguishable from zero for individuals
characterized by high unobservable severity, to a reduction in employment of more
than 50 percentage points for those with lower unobservable severity. Our estimates
point to markedly greater residual work capacity among beneficiaries with (unobservably)
less severe impairments, and imply that a policy with the effect (intended
or unintended) of relaxing access to SSDI benefits (regardless of impairment type)
in the initial determination phase would lead to an increase in the program’s labor
supply disincentive effect.
Finally, we close with two important qualifications. First, our estimates of the
employment potential of beneficiaries had they not received SSDI benefits reflect
the population of applicants, examiners and the program rules in force in 2005 and
2006, and the labor market conditions in the years following. Subsequent changes
in economic conditions, disability policy, or healthcare policy could alter the composition
of applicants and beneficiaries in different ways, creating different labor
supply disincentives (or incentives) at that time. Second, we have presented estimates
of the effect of SSDI benefit receipt, and therefore our estimates are not a
measure of labor supply in the absence of the program itself. If the SSDI program
diminishes labor supply through channels other than benefit receipt, such as the
application and appeals process, then our estimate of the effect of benefit receipt is
just one component of the program’s total effect on labor supply.
REFERENCES
Abadie, Alberto. 2003. “Semiparametric Instrumental Variable Estimation of Treatment Response Models.” Journal of Econometrics 113 (2): 231–63.
Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. 1996. “Identification of Causal Effects Using Instrumental Variables.” Journal of the American Statistical Association 91 (434): 444–55.
Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press.
Autor, David H., and Mark G. Duggan. 2003. “The Rise in the Disability Rolls and the Decline in Unemployment.” Quarterly Journal of Economics 118 (1): 157–205.
Autor, David H., Frank Levy, and Richard J. Murnane. 2003. “The Skill Content of Recent Technological Change: An Empirical Exploration.” Quarterly Journal of Economics 118 (4): 1279–333.
Autor, David, Nicole Maestas, Kathleen Mullen, and Alexander Strand. 2011. “Does Delay Cause Decay? The Effect of Administrative Decision Time on the Labor Force Participation and Earnings of Disability Applicants.” Michigan Retirement Research Center Research Paper 2011–258.
Board of Trustees, Federal Old-Age and Survivors Insurance and Federal Disability Insurance Trust Funds. 2012. “The 2011 Annual Report of the Board of Trustees of the Federal Old-Age and Survivors Insurance and Federal Disability Insurance Trust Funds.” Accessed June 8, 2012. http://www.socialsecurity.gov/OACT/TR/2012/tr2012.pdf.
Bound, John. 1989. “The Health and Earnings of Rejected Disability Insurance Applicants.” American Economic Review 79 (3): 482–503.
Bound, John, and Timothy Waidmann. 1992. “Disability Transfers, Self-Reported Health, and the Labor Force Attachment of Older Men: Evidence from the Historical Record.” Quarterly Journal of Economics 107 (4): 1393–419.
Burkhauser, Richard V., and Mary C. Daly. 2011. The Declining Work and Welfare of People with Disabilities: What Went Wrong and a Strategy for Change. Washington, DC: AEI Press.
Chen, Susan, and Wilbert van der Klaauw. 2008. “The Work Disincentive Effects of the Disability Insurance Program in the 1990s.” Journal of Econometrics 142 (2): 757–84.
Christensen, Kaare, Gabriele Doblhammer, Roland Rau, and James W. Vaupel. 2009. “Ageing Populations: The Challenges Ahead.” Lancet 374 (9696): 1196–1208.
Doyle, Joseph J., Jr. 2007. “Child Protection and Child Outcomes: Measuring the Effects of Foster Care.” American Economic Review 97 (5): 1583–610.
Doyle, Joseph J., Jr. 2008. “Child Protection and Adult Crime: Using Investigator Assignment to Estimate Causal Effects of Foster Care.” Journal of Political Economy 116 (4): 746–70.
Duggan, Mark, and Scott Imberman. 2009. “Why Are the Disability Rolls Skyrocketing? The Contribution of Population Characteristics, Economic Conditions, and Program Generosity.” In Health at Older Ages: The Causes and Consequences of Declining Disability among the Elderly, edited by David Cutler and David Wise, 337–79. Cambridge, MA: National Bureau of Economic Research.
Foote, Sandra M., and Christopher Hogan. 2001. “Disability Profile And Health Care Costs Of Medicare Beneficiaries Under Age Sixty-Five.” Health Affairs 20 (6): 242–53.
French, Eric, and Jae Song. 2011. “The Effect of Disability Insurance Receipt on Labor Supply.” Federal Reserve Bank of Chicago, Working Paper WP-09-05.
Gallicchio, Sal, and Barry Bye. 1981. “Consistency of Initial Disability Decisions Among & Within States.” Washington, DC: U.S. Department of Health and Human Services, Social Security Administration, Office of Policy, Office of Research and Statistics.
Government Accountability Office (GAO). 2004. “Social Security Administration: More Effort Needed to Assess Consistency of Disability Decisions.” Report to the Chairman, Subcommittee on Social Security, Committee on Ways and Means, House of Representatives, GAO-04-656.
Gruber, Jonathan. 2000. “Disability Insurance Benefits and Labor Supply.” Journal of Political Economy 108 (6): 1162–83.
Gruber, Jonathan, and Jeffrey D. Kubik. 1997. “Disability Insurance Rejection Rates and the Labor Supply of Older Workers.” Journal of Public Economics 64 (1): 1–23.
Haveman, Robert H., and Barbara L. Wolfe. 1984a. “The Decline in Male Labor Force Participation: Comment.” Journal of Political Economy 92 (3): 532–41.
Haveman, Robert H., and Barbara L. Wolfe. 1984b. “Disability Transfers and Early Retirement: A Causal Relationship?” Journal of Public Economics 24 (1): 47–66.
Heckman, James J., Sergio Urzua, and Edward Vytlacil. 2006. “Understanding Instrumental Variables in Models with Essential Heterogeneity.” Review of Economics and Statistics 88 (3): 389–432.
Houtenville, Andrew J., David C. Stapleton, Robert R. Weathers, II, and Richard V. Burkhauser, eds. 2009. Counting Working-Age People with Disabilities: What Current Data Tell Us and Options for Improvement. Kalamazoo, MI: W. E. Upjohn Institute for Employment Research.
Kane, Thomas J., and Douglas O. Staiger. 2008. “Estimating Teacher Impacts on Student Achievement: An Experimental Evaluation.” National Bureau of Economic Research Working Paper 14607.
Kling, Jeffrey R. 2006. “Incarceration Length, Employment, and Earnings.” American Economic Review 96 (3): 863–76.
Lewin Group, Inc. 2001. “Evaluation of SSA’s Disability Quality Assurance (QA) Processes and Development of QA Options That Will Support the Long-Term Management of the Disability Program.” Contract Number 0600–96–27331, Final Report submitted to the Social Security Administration, March 16.
Livermore, Gina, David Stapleton, and Henry Claypool. 2009. “Health Insurance and Health Care Access Before and After SSDI Entry.” The Commonwealth Fund 109 (1255).
Maestas, Nicole, Kathleen J. Mullen, and Alexander Strand. 2013. “Does Disability Insurance Receipt Discourage Work? Using Examiner Assignment to Estimate Causal Effects of SSDI Receipt: Dataset.” American Economic Review. http://dx.doi.org/10.1257/aer.103.5.1797.
Nagi, Saad Z. 1969. Disability and Rehabilitation: Legal, Clinical, and Self-Concepts and Measurement. Columbus, OH: Ohio State University Press.
Office of the Inspector General, Social Security Administration. 2006. “The Social Security Administration’s Identification of Special Disability Workload Cases.” Evaluation Report A-13-05-15028.
Office of the Inspector General, Social Security Administration. 2008. “Disability Claims Overall Processing Times.” Audit Report A-01-08-18011.
Parsons, Donald O. 1980. “The Decline in Male Labor Force Participation.” Journal of Political Economy 88 (1): 117–34.
Parsons, Donald O. 1982. “The Male Labour Force Participation Decision: Health, Reported Health, and Economic Incentives.” Economica 49 (193): 81–91.
Parsons, Donald O. 1991. “The Health and Earnings of Rejected Disability Insurance Applicants: Comment.” American Economic Review 81 (5): 1419–26.
Perry, Cynthia D. 2008. “Does Treating Maternal Depression Improve Child Health Management? The Case of Pediatric Asthma.” Journal of Health Economics 27 (1): 157–73.
Rupp, Kalman, and Charles G. Scott. 1996. “Trends in the Characteristics of DI and SSI Disability Awardees and Duration of Program Participation.” Social Security Bulletin 59 (1): 3–21.
Social Security Administration. 2010. “Annual Statistical Report on the Social Security Disability Insurance Program, 2009.” Washington, DC: Social Security Administration, Office of Retirement and Disability Policy, Office of Research, Evaluation, and Statistics. http://www.ssa.gov/policy/docs/statcomps/di_asr/2009/index.html (accessed May 31, 2013).
Stock, James H., Jonathan H. Wright, and Motohiro Yogo. 2002. “A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments.” Journal of Business and Economic Statistics 20 (4): 518–29.
von Wachter, Till, Jae Song, and Joyce Manchester. 2011. “Trends in Employment and Earnings of Allowed and Rejected Applicants to the Social Security Disability Insurance Program.” American Economic Review 101 (7): 3308–29.