The roots of the long and contentious debate about whether we should spend more for K-12 education can be found in two sentences from the famous 1966 report led by James Coleman:
It is known that socioeconomic factors bear a strong relation to academic achievement. When these factors are statistically controlled, however, it appears that differences between schools account for only a small fraction of differences in pupil achievement (pp. 21-22). 
The conclusion is sometimes stated as “money doesn’t matter,” a paraphrase that also suggests there is no basis for equalizing spending in schools, which, because they often are funded by local property taxes, end up spending lower amounts in lower-income communities. And federal education spending focuses directly on giving states and districts money to close achievement gaps, which assumes money matters.
The spending question is still active. Decades after famous cases like the 1971 Serrano v. Priest case in California, equalization cases are currently working their way through courts in Connecticut, California, Texas, and elsewhere. And the newly-authorized Every Student Succeeds Act continues to provide billions under its Title I program to close gaps.
How might we gather evidence of the effects of money on education achievement? We now recognize that the approach used by Coleman 50 years ago does not yield ‘causal’ estimates, i.e., it does not measure the degree to which spending more money causes outcomes to improve (or not). The Coleman analysis looked at a cross-section of districts and schools at a point in time and found little relationship between spending and outcomes. But suppose two neighboring school districts differ in their average income levels. The more affluent one does not spend much on its schools but posts high test scores on the state assessment. The less affluent one spends more on its schools but posts low test scores on the state assessment. In this example, spending is negatively correlated with outcomes, but it would be erroneous to conclude from such a finding that spending more on schools harms learning: the difference in family income, not the difference in spending, could be driving the scores.
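The two-district example can be made concrete with a small simulation. The sketch below is only an illustration under made-up assumptions: every number in it (the income scale, the spending rule, the noise levels, and an assumed true spending effect of +1 point) is hypothetical and comes from no study. It shows how a cross-sectional comparison can report a negative relationship even when spending truly helps, and how holding income constant recovers the assumed effect.

```python
import random

random.seed(42)
TRUE_EFFECT = 1.0  # hypothetical: each extra unit of spending adds 1 score point


def slope(x, y):
    """Ordinary least-squares slope from regressing y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear relationship with x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


income, spending, scores = [], [], []
for _ in range(2000):  # 2,000 hypothetical districts
    inc = random.random()                        # income index between 0 and 1
    spend = 10 - 4 * inc + random.gauss(0, 0.5)  # poorer districts spend more
    score = 50 + TRUE_EFFECT * spend + 20 * inc + random.gauss(0, 2)
    income.append(inc)
    spending.append(spend)
    scores.append(score)

# Cross-sectional comparison, ignoring income.
naive_slope = slope(spending, scores)

# Same comparison after statistically controlling for income.
adjusted_slope = slope(residuals(income, spending), residuals(income, scores))

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
```

Controlling for income works here only because income is the sole confounder and it is measured. In real data, unmeasured differences between districts can remain, which is why the experimental and quasi-experimental designs discussed next are preferred.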
A ‘causal’ estimate of the effect of spending would come from an experiment, maybe structured like this: a state identifies, say, 50 school districts and divides them randomly into two groups of 25. It gives one of those two groups more funding and does not change any other aspect of funding. If money ‘causes’ education outcomes to improve, test scores of students in the two groups will diverge over time. The difference in scores is what we want to know: the causal effect of added funding.
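A minimal sketch of that hypothetical experiment follows. The baseline scores, their spread, and the assumed funding effect of 5 points are invented for illustration. The experiment is repeated many times only to show that random assignment gives the right answer on average; any single run with 25 districts per group can miss by a few points.

```python
import random

random.seed(0)
TRUE_EFFECT = 5.0   # hypothetical score gain caused by the added funding
N_DISTRICTS = 50


def run_experiment():
    """Randomly fund half the districts; return funded-minus-unfunded mean score."""
    baseline = [random.gauss(250, 10) for _ in range(N_DISTRICTS)]
    funded = set(random.sample(range(N_DISTRICTS), N_DISTRICTS // 2))
    treated = [baseline[i] + TRUE_EFFECT for i in funded]
    control = [baseline[i] for i in range(N_DISTRICTS) if i not in funded]
    return sum(treated) / len(treated) - sum(control) / len(control)


single_run = run_experiment()
estimates = [run_experiment() for _ in range(2000)]
average_estimate = sum(estimates) / len(estimates)

print(f"one experiment:          {single_run:.2f}")
print(f"average over 2,000 runs: {average_estimate:.2f}")
```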
This kind of experiment is unlikely to be done, for various legal and political reasons. But some research methods can measure causal effects without a group experiment. Two recent studies that use these methods provide evidence that money matters. But they also provide evidence that it will take massive amounts to close gaps.
Short-term money does not matter
The first study examined outcomes of School Improvement Grants (SIG), which were funded for $7 billion as part of the American Recovery and Reinvestment Act of 2009. The study was done by Mathematica Policy Research for the U.S. Department of Education’s Institute of Education Sciences. It compared schools that fell just short of receiving a SIG grant based on their test scores with schools that received SIG grants.
This approach, known by the mouthful of a name ‘regression discontinuity design,’ relies on the logic that when there are cutoff points, created perhaps by rules or regulations or arbitrary procedures, those just below the cutoff and just above it tend to be similar. For example, imagine an intervention program for students struggling to learn to read. The program uses a cutoff of the 20th percentile on a test of reading skills. Students reading at the 19th percentile participate in the program. Students reading at the 21st percentile do not participate in the program. These students are likely to be similar in terms of family background, previous educational experiences, and personal characteristics such as motivation and the like. It’s not guaranteed, but likely. Now imagine schools rather than students are on each side of the cutoff. Comparing school outcomes is how the SIG study estimated effects of SIG money.
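The reading-program example can be sketched in code. This is an illustrative simulation, not the SIG study's actual method: the outcome formula, the assumed program effect of 3 points, and the 10-percentile bandwidth are all hypothetical choices. It fits a straight line to students on each side of the cutoff and compares the two lines' predictions at the cutoff itself.

```python
import random

random.seed(1)
CUTOFF = 20.0       # program serves students below the 20th percentile
TRUE_EFFECT = 3.0   # hypothetical gain from participating in the program
BANDWIDTH = 10.0    # hypothetical: only use students within 10 points of cutoff


def fit_line(xs, ys):
    """Return (intercept, slope) of an ordinary least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


students = []
for _ in range(20000):
    pct = random.uniform(0, 100)        # percentile on the screening test
    in_program = pct < CUTOFF
    later_score = (0.5 * pct
                   + (TRUE_EFFECT if in_program else 0.0)
                   + random.gauss(0, 3))
    students.append((pct, in_program, later_score))

# Naive comparison: all participants versus all non-participants.
treated = [y for _, t, y in students if t]
untreated = [y for _, t, y in students if not t]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)

# Regression discontinuity: compare fitted lines at the cutoff.
below = [(p, y) for p, t, y in students if t and p >= CUTOFF - BANDWIDTH]
above = [(p, y) for p, t, y in students if not t and p <= CUTOFF + BANDWIDTH]
a_lo, b_lo = fit_line(*zip(*below))
a_hi, b_hi = fit_line(*zip(*above))
rd_estimate = (a_lo + b_lo * CUTOFF) - (a_hi + b_hi * CUTOFF)

print(f"naive difference:          {naive:.2f}")
print(f"discontinuity at cutoff:   {rd_estimate:.2f}")
```

The naive comparison is badly misleading because participants started out as much weaker readers, while the comparison at the cutoff isolates students who are nearly alike except for participation.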
SIG grants were substantial, about $2 million a school for three years, which amounted to about $900 per student each year. With that additional money in hand, it seems obvious that schools below the cutoff would be making more improvements than schools above the cutoff, such as using different instructional approaches, different hiring practices, developing teachers and principals and so on. And for the study to be testing something, schools on each side of the cutoff need to be doing different things. Otherwise, the study would be comparing schools making the same improvements.
But the study reported that schools just above the cutoff were undertaking improvement efforts without SIG funding. In fact, it reported that improvements under way on both sides of the cutoff were nearly equivalent (and, in the statistical analysis, the study could not conclude that improvements around the cutoff differed). What schools were doing to improve was not altered by SIG funding. It’s as if the hypothetical reading program I described above had been delivered to students on both sides of the cutoff. This crucial aspect of the study’s results was mentioned in some media reports, but others overlooked it and focused on the lack of improvements in scores. When improvements don’t differ, we should not expect outcomes to differ, and they did not.
Some commenters suggested that turnaround models being tested were just not developed to the point where they were scientifically sound.
Another explanation is that districts were taking steps to reform all low-performing schools, ones above and below the SIG cutoffs, and simply used SIG funding to underwrite some of the costs of those steps. Replacing a school’s principal, which is one of the required elements of using SIG funding, might seem like a radical step. But schools that just miss the cutoff for SIG funding are also struggling, and replacing their principals would be a reasonable decision for district administrators trying to improve those schools. ‘Comprehensive instructional reform’ is also part of the SIG model, but it too is likely to be undertaken in schools above the cutoff.
The underlying logic of SIG grants also could be an issue. Simply pushing money to schools for brief periods makes sense if one believes the money can fix whatever shortcomings the schools have, and quickly. School physical structures meet these criteria better than school operations. Structures can be repaired and refurbished within three years. But district administrators might rationally conclude that whatever instructional reform activities a school undertakes with the money should be ones that do not continue to incur costs after the three-year grant ends. So they invest in curricula (textbooks, technology), professional development for teachers and principals, using a teacher evaluation system that incorporates student test scores, and so on. Textbooks last a long time, professional development workshops don’t have to be done in the future, and evaluation systems can be rolled back. If the reform activities lead to score improvements, even better. As the Coleman report warned, schools have a limited role in education achievement. That role is even more limited in a short time span.
But long-term money can matter
Two recent studies concluded that changes in spending induced by state education finance reforms improved outcomes such as test scores, high school graduation, and earnings. On the surface, reforming a state’s education finance system sends more money to low-income schools, which is what SIG did without success. But finance reforms are long-lasting, and low-income districts and schools can invest in improvements knowing that their funding is higher for the foreseeable future.
The two studies, one by Jackson et al. (2016) and one by LaFortune et al. (2016), use techniques designed to estimate causal effects of spending more money. Essentially, they treat court-ordered finance reforms as if the reforms were ‘exogenous,’ equivalent to unanticipated surprises. Reforms aren’t surprises, of course; it’s hard for a state government not to know a court case is under way. But what a court will decide and what it will instruct a state to do is not known in advance. Both studies use ‘event history analysis’ to compare time trends in test scores and other outcomes for a state that enacts a finance reform with that state’s own trends up to that point and with trends in other states not enacting reforms. The idea is that if the reform improves outcomes, the improvement should be visible as a break in the trend for that state relative to other states. If Kansas enacts a finance reform and Nebraska does not, and Kansas then experiences an improvement larger than its trend to that point, and Nebraska does not, the similarities between the two states lend credibility to the argument that the reform caused the improvement.
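The Kansas and Nebraska comparison can be sketched with a toy calculation. All scores below are invented: both states are assumed to share the same underlying trend, and the Kansas reform is assumed to add 0.4 points for each year it has been in place. The published studies estimate this kind of pattern with regressions across many states and reforms; this sketch only shows the logic of looking for a break in one state's trend relative to another.

```python
REFORM_YEAR = 5              # hypothetical year in which Kansas enacts a reform
YEARS = range(10)
COMMON_TREND = 0.5           # hypothetical yearly score growth in both states
REFORM_GAIN_PER_YEAR = 0.4   # hypothetical extra gain per year under the reform


def score(base, year, reformed):
    """Average score in a state for a given year."""
    years_under_reform = max(0, year - REFORM_YEAR + 1) if reformed else 0
    return base + COMMON_TREND * year + REFORM_GAIN_PER_YEAR * years_under_reform


kansas = [score(250.0, y, reformed=True) for y in YEARS]
nebraska = [score(252.0, y, reformed=False) for y in YEARS]

# Gap between the states each year, and its average before the reform.
gap = [k - n for k, n in zip(kansas, nebraska)]
pre_reform_gap = sum(gap[:REFORM_YEAR]) / REFORM_YEAR

# Change in the gap relative to the pre-reform period, indexed by years
# since the reform (negative values are years before it).
event_study = {y - REFORM_YEAR: gap[y] - pre_reform_gap for y in YEARS}

for years_since, change in sorted(event_study.items()):
    print(f"{years_since:+d} years: change in gap = {change:.1f}")
```

The gap is flat before the reform and widens afterward. A flat pre-reform gap is what makes it plausible that Nebraska's trend shows what would have happened in Kansas without the reform.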
Jackson et al. match state finance reforms to a nationally representative sample of students that is tracked over time. They show that these students had more years of completed schooling and higher earnings as adults.  Their data do not enable them to study education outcomes while students are in school, but the findings hint at positive ones.
LaFortune et al. take a closer look at outcomes while students are in school. Before getting to those, however, they point out that finance reforms increased overall spending, and increased spending more in low-income districts relative to high-income districts, which means at least some ‘equalization’ happened.  Based on their evidence, it is clear that finance reforms re-allocate significant amounts of money—on average, reforms increased spending by $1,225 per student a year in the lowest 20 percent of districts ranked by income, while increasing spending by $527 in the highest 20 percent of districts ranked by income. In a state like Ohio, with 1.8 million students, these amounts imply spending increases on the order of $1.8 billion each year. The SIG program may seem large because it spent $7 billion, but that amount is modest compared to school finance reforms in even one large state.
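The Ohio figure can be checked with simple arithmetic. The text reports spending increases only for the lowest and highest income quintiles, so the sketch below assumes (as a labeled assumption, not a finding) that increases in the middle quintiles fall between those two figures.

```python
OHIO_STUDENTS = 1_800_000
INCREASE_LOWEST_QUINTILE = 1_225   # dollars per student per year
INCREASE_HIGHEST_QUINTILE = 527    # dollars per student per year
STATED_TOTAL = 1_800_000_000       # "on the order of $1.8 billion each year"
SIG_TOTAL = 7_000_000_000          # total SIG funding

# Bounds, assuming every district's increase lies between the two quintiles.
low_bound = OHIO_STUDENTS * INCREASE_HIGHEST_QUINTILE
high_bound = OHIO_STUDENTS * INCREASE_LOWEST_QUINTILE

# Average per-student increase implied by the stated total.
implied_per_student = STATED_TOTAL / OHIO_STUDENTS

# Years of Ohio-sized increases needed to equal all of SIG.
years_to_match_sig = SIG_TOTAL / STATED_TOTAL

print(f"range: ${low_bound:,} to ${high_bound:,} per year")
print(f"implied average increase: ${implied_per_student:,.0f} per student")
print(f"years to match SIG: {years_to_match_sig:.1f}")
```

The stated $1.8 billion sits inside the range and corresponds to an average increase of about $1,000 per student, and roughly four years of such increases in this one state would equal the entire SIG program.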
LaFortune et al. match state finance reforms to representative samples in each state of student test scores from the National Assessment of Educational Progress (NAEP), which enables them to measure the effects of money on test scores. Crucially, they find that NAEP test scores increase after spending increases. Increases are not large for any one year, but they note that effects will accumulate for students who attend K-12 after a reform is enacted, a 13-year span. After ten years, they estimate that a student in a lower-income district closed the score gap with students in the median-income district by 3.5 NAEP points. For comparison, in 2015, the gap between math scores for white and black eighth-graders was 32 points (and this gap is after decades of state reforms). Finance reforms can cut into the gap, but it remains a challenge.
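The 3.5-point figure follows from the effect size and NAEP standard deviations cited in note 7. The sketch below reproduces that conversion; averaging the fourth- and eighth-grade results is an assumption about how the two were combined, and the comparison with the 32-point gap is only a rough sense of scale, since that gap is between racial groups rather than income groups.

```python
EFFECT_SIZE = 0.10                      # in standard deviation units (note 7)
NAEP_SD = {"grade 4": 36, "grade 8": 34}
WHITE_BLACK_GAP_GRADE_8_MATH = 32       # NAEP points, 2015

# Convert the effect size into NAEP points for each grade.
points = {grade: EFFECT_SIZE * sd for grade, sd in NAEP_SD.items()}
average_points = sum(points.values()) / len(points)

# How large the gain is next to the 32-point gap.
share_of_gap = average_points / WHITE_BLACK_GAP_GRADE_8_MATH

for grade, value in points.items():
    print(f"{grade}: {value:.1f} NAEP points")
print(f"average: {average_points:.1f} points, "
      f"about {share_of_gap:.0%} of a 32-point gap")
```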
LaFortune et al. report another finding that underscores the challenge in closing gaps. Finance reforms reduced achievement gaps between high- and low-income school districts but did not have detectable effects on resource or achievement gaps between high- and low-income students. The authors point out that many low-income students live in high-income districts and vice versa. Equalizing resources among districts is a poorly targeted approach for mitigating achievement gaps arising from differences in household incomes that exist within districts. Using state resources to offset disparities in property tax bases may meet a legal definition of equal access under state constitutions, but the LaFortune et al. findings raise questions about this strategy for promoting more equal outcomes.
We are back to the Coleman report, but updated this way—money can matter, but spending more on schools does not yield big improvements. The update is not nothing, but it’s short of a big something.
Programs and objectives should match
The SIG study’s result that test scores did not improve is logically consistent with SIG not generating differences in what low-performing schools did to improve. And perhaps expecting short-term money to transform schools is a kind of magical thinking that the issues that plague these schools can be ‘fixed’ by temporary funds.
The LaFortune et al. study shows that durable increases in money spent in schools improved achievement. The increases help states meet their legal obligations to public education under their constitutions, and the achievement gains show that the money did matter. But the study also found that targeting the money through school districts failed to close gaps between high- and low-income students. If the objective is to close gaps, state equalization might not be the right tool.
Incorporating those lessons into federal policy argues for making Title I portable, which Nora Gordon wrote about in this series—the money will follow the student for a long time (for as long as the student is eligible) and it is precisely targeted to students who need it. It does not sidestep issues first raised in the Coleman report about the limited role of schools in determining achievement, but it starts at a sensible point.
ESSA requires states to develop plans and intervene in their lowest-performing 5 percent of schools. The national SIG study offers some lessons about what not to do but is unclear about what to do. Susanna Loeb’s recent piece here offers suggestions based on findings from turnarounds in California that appear more successful than what the national study found. Improving teaching is at the top of the list. Schools might play a minor role in achievement overall, but, within schools, teachers play a major role. Focusing on teaching within low-performing schools is where the evidence points.
— Mark Dynarski
Mark Dynarski is a Nonresident Senior Fellow at Economic Studies, Center on Children and Families, at Brookings.
This post originally appeared as part of Evidence Speaks, a weekly series of reports and notes by a standing panel of researchers under the editorship of Russ Whitehurst.
The author(s) were not paid by any entity outside of Brookings to write this particular article and did not receive financial support from or serve in a leadership position with any entity whose political or financial interests could be affected by this article.
1. The full Coleman report can be found at http://files.eric.ed.gov/fulltext/ED012275.pdf.
2. Sparks mentions the lack of differences between school practices: http://blogs.edweek.org/edweek/inside-school-research/2017/01/school_improvement_fund_final_report.html. Brown does not: https://www.washingtonpost.com/local/education/obama-administration-spent-billions-to-fix-failing-schools-and-it-didnt-work/2017/01/19/6d24ac1a-de6d-11e6-ad42-f3375f271c9c_story.html?utm_term=.a059805f8a53.
3. Kirabo Jackson, Rucker Johnson, and Claudia Persico. “The Effects of School Spending on Education and Economic Outcomes: Evidence from School Finance Reforms.” The Quarterly Journal of Economics (2016), pp. 157-218; Julien LaFortune, Jesse Rothstein, and Diane Whitmore Schanzenbach, “School Finance Reform and the Distribution of Student Achievement,” National Bureau of Economic Research, Working Paper 22011, February 2016. Kevin Carey and Elizabeth Harris provide an overview of both studies at https://www.nytimes.com/2016/12/12/nyregion/it-turns-out-spending-more-probably-does-improve-education.html?_r=0.
5. Eric Hanushek noted that the Jackson et al. estimates seem too large. They report that the gap between low- and high-income students would be closed if spending increased by 23 percent more a year. Hanushek points out that actual spending increases during the time period they studied were more like 100 percent, gaps have not closed, and other explanations for low performance such as increases in the numbers of students in poverty don’t explain the difference. See https://www.educationnext.org/boosting-education-attainment-adult-earnings-school-spending, https://www.educationnext.org/money-matters-after-all, https://www.educationnext.org/money-matter, and https://www.educationnext.org/not-right-ballpark.
7. I am using their effect size estimate of .10 (page 32) and the NAEP reported standard deviation of 36 for fourth graders and 34 for eighth graders, https://nces.ed.gov/programs/digest/d11/tables/dt11_126.asp.