Two cheers for Lynn Olson’s and Craig Jerald’s long, perceptive explanation of hostility to statewide assessments (“Statewide Standardized Assessments Were in Peril Even Before the Coronavirus. Now They’re Really in Trouble.”). Their chronology is spot on. They’re right about the intensity of the “testing backlash.” I assume they’re right that this year’s testing moratorium will cause a lot of people to say “it’s now demonstrated that we don’t really need those irksome, onerous assessments.” And I agree that tests would be a lot more popular with educators if they were demonstrably helpful in improving instruction, a goal that waiting until year’s end does little to serve.

Yet there may also be a fundamental flaw—blind spot, really—in their analysis. What if the hostility toward testing is not, at bottom, about the tests but, rather, about what the present testing regimen is mainly designed to do, which is to hold schools and educators (and sometimes kids) to account for their results? What if tests are the unwelcome messengers but what’s really at stake is the message? I submit that if testing vanished but some other form of results-based accountability remained, educators would complain just as much.

Think about it. Nobody likes to be held to account for their results, particularly when embarrassment, inconvenience, and unwanted interventions, possibly even the loss of one’s diploma or one’s job, hang in the balance. Doctors and hospitals don’t much like it when their infection, mortality, or readmission rates are publicized and sometimes lead to sanctions. Restaurateurs understandably hate it when the health department padlocks their bistros or a reviewer offers no stars. Competitive divers bridle when judges give 1.5 points to their triple backflips. It’s human nature, and the higher the stakes, the stronger the feelings.

It’s different, of course, when high-stakes accountability leads to bonuses, gold medals, five-star ratings, and 9.5-point dives. Everyone loves accolades. That’s human nature, too.

The reason America got into results-based accountability in K-12 education is that many schools were producing totally unsatisfactory results, sometimes for everyone attending, sometimes just for subgroups within the school. We were, it was rightly said, a “nation at risk” because of those weak results.

That led to national goals, to statewide academic standards, to mandatory assessments in core subjects, and to complex regimens by which to evaluate the scores on those tests and the remedies to be applied when scores were low.

It did not, in theory, have to be tests by which results were gauged. But for a host of reasons—cost, convenience, security, the appearance of uniformity, objectivity and a sort of fairness, etc.—standardized tests are what we ended up primarily relying on.

Of course we got carried away, particularly in clumsy efforts to use kids’ test scores to evaluate teacher performance. Of course we neglected other important elements of learning besides what’s readily tested and important elements of good schools besides academic achievement. Of course we muddled—and still do—the balance between achievement and growth. Of course we didn’t pay enough attention to the seemingly Sisyphean quest for tests that would be both “formative” and “summative.”

All true, all culpable. Perhaps still all fixable. But I submit that the “peril” in which state assessments find themselves, according to Olson and Jerald, is not fundamentally about testing burden or the distortions it causes in curriculum, pedagogy, calendars, etc. It’s about results-based accountability for a system that’s producing unsatisfactory results. Educators trying to escape the accountability have resorted to a war on tests themselves and convinced many, many others—especially parents—that tests are the problem.

Think about it this way: imagine that we start rating elementary schools, A-F or five stars to one, based on how well their graduates do in middle school. We judge them by middle school grades, discipline, etc., not by tests. And we find a way to adjust for “value added” so as not to penalize schools just because their pupils are disadvantaged.

Extend that thought experiment to high schools. Imagine that we find ways to gauge—and report—their effectiveness based on how their graduates do in college and the labor market, as glimpsed in a recent Mathematica study of Louisiana high schools. Then we sanction or intervene in various ways in the high schools whose graduates fare poorly.

Would F schools be any happier once their grade isn’t derived from tests? Would unions complain less vociferously if states moved to intervene with tough love in one-star schools?

There are ample good reasons to find additional real-world measures of school quality and effectiveness, but let’s not deceive ourselves that doing so will end the war on accountability—or make our schools any better.

I repeat: Tests are the messenger, and it’s the glum message they continue to convey about many schools that’s the problem. Shooting them down won’t cause a single child to learn more, a single inept teacher to do a better job in the classroom, or a single crummy school to improve.

Otherwise, Lynn and Craig got it about right.

Chester E. Finn, Jr., is a Distinguished Senior Fellow and President Emeritus at the Thomas B. Fordham Institute. He is also a Senior Fellow at Stanford’s Hoover Institution.

Last updated April 24, 2020