The traditional gold-standard approach to research, the randomized controlled trial (RCT), is not worth its weight in gold as we move to a student-centered education system that personalizes learning for all students so that they succeed.
The reason why, as Max Ventilla, founder and CEO of AltSchool (where, full disclosure, I am an advisor), told Fast Company, is, “You’re not thinking about the global population as one unit that gets this experience or that experience. … Something that’s better for 70 percent of the kids and worse for 30 percent of the kids—that’s an unacceptable outcome.”
It’s not that RCTs are unhelpful; they can be very helpful. It’s just that stopping with an RCT represents, at best, an incomplete research process, and an incomplete process often leads to contradictory advice, information, and signals for the actors on the ground (in this case, educators).
In our latest white paper for the Clayton Christensen Institute, titled “A blueprint for breakthroughs: Federally funded education research in 2016 and beyond,” my coauthor Julia Freeland Fisher and I lay out a new path forward for educational research.
On that path, research builds on itself through a process in which researchers:
• Observe educational interventions;
• Categorize and test hypotheses about what factors may cause changes in student outcomes;
• Report the results of these tests;
• Observe anomalies to the findings—either within studies or from other studies—and dig into a series of small “n of 1” studies to understand what conditions or circumstances were different in the outliers that caused the outcomes to be different;
• And then refine the theory of causation accordingly.
In this cycle, contradictory research, of which there is plenty in education, is a good thing because it offers an opportunity to improve our understanding of causality. The challenge is that researchers must actually use those contradictions to do just that. And the federal government should help by supporting research that progresses past initial RCTs and promotes alternative methods for unearthing what drives student outcomes in different circumstances.
Said differently, RCTs can be a valuable part of a process of building research that can help us understand what works, for which students, in what circumstances. They just are not the final step in that process.
Understanding this can help us see why, hypothetically, a whole-class balanced literacy approach with a focus on comprehension strategies might achieve great results for a student who has already learned a wealth of content knowledge outside of school, but achieve a very different result for a student who lacks that same content knowledge. The approach may be the same, but the circumstances—in this case, a student’s background knowledge—are different. With an understanding of the causal mechanism, personalizing the approach for the circumstance can achieve a good result for both.
This helps to explain why different RCTs on balanced literacy or Core Knowledge might have very different, and seemingly contradictory, results. It also helps us understand why a “good” result from an RCT, one showing that a particular approach worked on average, may be masking differences that we must understand if all students, not just most students, are to succeed.
For example, a relatively recent U.S. Department of Education-funded RCT by RAND of Carnegie Learning’s Cognitive Tutor Algebra I (CTAI) product found that CTAI boosted the average student’s performance by approximately eight percentile points. Researchers emphasized that this study considered “authentic implementation” settings; that is, they tested the relative effectiveness of the program in actual school settings across a diverse array of students and teachers. Carefully studying authentic implementation sounds promising, but only if the researchers can spend sufficient time observing and revisiting the authentic circumstances surrounding the implementation, which may be affecting student results. Instead, when a thorough, well-funded study like this demonstrates that a high enough proportion of students benefit from an intervention, we tend to double down on those promising signals. In turn, the fact that some portion of students or certain classes likely didn’t fare as well, while others fared far better, is treated as probabilistic noise from which statistically significant signals of efficacy must be isolated. But generalizing research findings like this will not be helpful if we are trying to build systems that predictably support students based on their specific needs or circumstances.
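To make the arithmetic behind that point concrete, here is a minimal sketch in Python. Every number in it is invented for illustration; none comes from the RAND study or any other real trial, and the subgroup names are hypothetical, loosely echoing the background-knowledge example above. It simply shows how a positive average effect can sit on top of a subgroup that did worse.

```python
from statistics import mean

# Hypothetical score changes attributed to an intervention.
# These values are made up for illustration only; the 7-to-3 split
# loosely mirrors the "70 percent / 30 percent" scenario quoted earlier.
effects = {
    "strong_background_knowledge": [12, 10, 14, 11, 13, 12, 9],
    "limited_background_knowledge": [-6, -4, -5],
}


def average_effect(groups):
    """Pool every student together, as an average-only report would."""
    pooled = [value for values in groups.values() for value in values]
    return mean(pooled)


def effect_by_subgroup(groups):
    """Report the mean effect separately for each circumstance."""
    return {name: mean(values) for name, values in groups.items()}


overall = average_effect(effects)
by_group = effect_by_subgroup(effects)

print(f"Average effect across all students: {overall:+.1f}")
for name, value in by_group.items():
    print(f"  {name}: {value:+.1f}")
```

With these made-up numbers the pooled average is positive even though the smaller subgroup’s average is negative, which is exactly the pattern an average-only headline would hide.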
Because the paradigm of research in education has remained chronically incomplete, we haven’t been able to move toward sounder understandings of causality and the salient circumstances in which different approaches drive success. Today we often divide the world into novice and expert learners, for example, and know that we must treat the groups differently in a given domain, but we haven’t gone deep enough.
To realize the promise of personalized learning, now touted everywhere from the halls of the Department of Education to the walls of Facebook, it’s time to solve that.
—Michael B. Horn
This post originally appeared on Forbes.