Regulators Need To Use Test Scores With Great Care
Editor’s note: This post is the fifth in an ongoing discussion between Fordham’s Michael Petrilli and the University of Arkansas’s Jay Greene that seeks to answer this question: Are math and reading test results strong enough indicators of school quality that regulators can rely on them to determine which schools should be closed and which should be expanded—even if parental demand is inconsistent with test results? Prior entries can be found here, here, here, and here.
Mike, you say that we agree on the limitations of using test results for judging school quality, but I’m not sure how true that is. In order not to get too bogged down in the details of that question, I’ll try to keep this reply as brief as possible.
First, the evidence you’re citing actually supports the opposite of what you are arguing. You mention the Project STAR study showing that test scores in kindergarten correlated with later-life outcomes as proof that test scores are reliable indicators of school or program quality. But you don’t emphasize an important point: Whatever benefits students experienced in kindergarten that resulted in higher test scores, they did not cause higher test scores in later grades, even though they produced better later-life outcomes. As the study’s authors put it, “The effects of class quality fade out on test scores in later grades, but gains in non-cognitive measures persist.” This is an example of the disconnect between test scores and life outcomes, which is exactly what I’ve been arguing. If we used test scores in later grades as a proxy for school or program quality, we would wrongly conclude that this program did not help, since the test score gains faded even though the benefits endured.
You also draw the wrong conclusion from the Deming et al. article. The authors did find that test score gains for lower-scoring students in lower-performing schools resulted in higher earnings for those students. But lower-scoring students in higher-performing schools experienced a decline in later-life earnings that was even larger than that gain. These results highlight two things. First, narrowly focusing on raising test scores helps some low-scoring students. But it harms other low-scoring students such that:
“This negative impact on earnings is larger, in absolute terms, than the positive earnings impact in schools at risk of being rated Low-Performing. However, there are fewer low-scoring students in high-scoring schools, so the overall effects on low-scoring students roughly cancel one another out. Again, we find no impact of accountability pressure on higher-achieving students.”
Having no net effect on low-scoring students, as well as having no effect of any kind on higher-scoring students, does not sound like a ringing endorsement of using accountability pressure to focus narrowly on test scores.
Second, the pattern of results in that paper supports my argument about the disconnect between test score gains and changes in later-life outcomes. Low-scoring students in higher-performing schools experienced a decline of only 0.4 percent in the probability of passing the tenth-grade math exam, but they exhibited a decline in annual earnings of $748 at age twenty-five. The low-scoring students in low-performing schools experienced a much larger 4.7 percent increase in the probability of passing the tenth-grade math exam, but they exhibited an increase of only $298 in earnings at age twenty-five. A negligible drop in test performance was associated with a large decline in earnings, while a large increase in test performance was associated with a more modest gain in earnings. See the disconnect?
You’re also mistaken in your belief that the evidence of this disconnect is confined to high schools. There is a fairly large literature on early education that shows little or no enduring test score gains from preschool but large benefits later in life. Again, gains in test scores do not appear to capture very well the quality of schools or programs. In addition, a series of studies by David Grissmer and colleagues found that early math and reading achievement tests are not even very good predictors of later test results relative to other types of skills and more general knowledge. They conclude: “Paradoxically, higher long-term achievement in math and reading may require reduced direct emphasis on math and reading and more time and stronger curricula outside math and reading.”
I could go on, but I promised to be brief. The overall point is that if tests were reliable indicators of school and program quality, they should consistently be predictive of later-life outcomes. As this brief review of research demonstrates, it is quite common for test score results not to be predictive of later-life outcomes. If even rigorous research fails to show a consistent relationship between test scores and later success, why would we think that regulators and policy makers with less rigorous approaches to test scores could use them to reliably identify school and program quality? Rather than relying on test results anyway and making potentially disastrous decisions to close schools or shutter programs based on bad information, we should recognize that local actors, including parents, are in a better position to judge school quality. Their preferences deserve strong deference from more distant authorities.
– Jay Greene
This first appeared on Flypaper.