In the summer 2019 issue of Education Next, Grover Whitehurst expresses concern about current approaches to social-emotional learning, or SEL, and the state of the evidence supporting such “whole-learner” practices in schools. His principal points are: (1) that work in SEL is misfocused, meaning it is directed at the wrong things (e.g., personality traits, dispositions), and (2) that practice and policymaking have gotten ahead of the evidence. In reality, traditional approaches to social-emotional learning do not focus on “personality constructs such as conscientiousness and broad dispositions such as grit.” Rather, as we describe below, effective SEL programming focuses on concrete, teachable skills and has been shown in many studies to lead to gains in important outcomes. Whitehurst’s sole reliance on two broad and general studies to make his case leaves out a large number of individual studies (randomized trials, no less) that reveal the promise and impact of SEL. We lay out our two key points below.
Whitehurst writes that work in SEL is misguided in its focus on personality traits and dispositions (e.g., conscientiousness, agreeableness, persistence), which he describes as largely influenced by genetic and environmental factors and, as such, unlikely to be changed through school-based programming. Understandably, there is great allure in cultivating these qualities in children because there is ample evidence confirming their value. For example (and as Whitehurst points out), conscientiousness is linked to desirable outcomes such as academic achievement and higher labor market earnings. However, just because these traits are desirable does not mean that they are suitable targets for school-based programming. In fact, there is little evidence that interventions targeting these types of outcomes result in meaningful change. While the kinds of traits often described as character or personality are certainly important, research suggests such traits are relatively stable over the life course.
The truth is that we should, as Whitehurst writes, focus on concrete, specific, observable, and teachable skills and competencies—and this is exactly what the best SEL interventions and practices do. These programs and strategies are designed this way because we have a great deal of evidence from developmental science about the relevance of such skills and competencies and about how they grow and change over time. SEL programs targeting such skills and competencies are effective because they make these skills explicit and teach them. For example, the 4Rs Program (Reading, Writing, Respect, and Resolution) is a universal, elementary school-based intervention that focuses on social problem solving and conflict resolution. The basic idea behind the intervention is to target the social-cognitive processes thought to lead to aggressive behavior. That is, it is designed to help children think, feel, and act differently in situations of interpersonal conflict. For example, unit 5 in the 4th grade curriculum focuses on understanding and managing conflict and solving problems collaboratively. The unit begins with students reading a relevant book and discussing it as a group. This is followed by three specific lessons, the first focused on conflict and violence and what they mean, the second on negotiation and how it works, and the third on how conflict and specific negotiation strategies go together. This is how one program makes SEL concrete and explicit.
What’s difficult is that some of the loudest champions for the field make their case based on studies that include personality and dispositional traits, and then these types of constructs become viewed as interchangeable with the concrete, developmental, and teachable skills and competencies that make up the tradition of SEL. These challenges illuminate an even more troubling issue, which is that the terminology in this field is a mess. The field goes by many names—social-emotional learning, bullying prevention, character education, conflict resolution, social skills, life skills, and soft skills, to name just a few. Moreover, major players in the field have put forward competing organizational schemes or frameworks that often use different or even conflicting terminology to describe similar sets of skills. As a result, there is little clarity about what we mean, and the field is beset with dilemmas about how to promote and measure skills in this area, further complicating attempts to translate research into practice. Our lab at the Harvard Graduate School of Education has tried to address this challenge by building a website and set of tools that use a systematic coding system to identify and show relationships between different skills, terminology, and frameworks. These tools are designed to help key stakeholders in practice and policy be concrete and explicit about their goals and align their efforts (i.e., frameworks, programs, assessments) to more effectively and deliberately achieve results. We hope these tools can serve as a starting point for ongoing work intended to bring coherence and consistency to the field.
The State of the Evidence
Whitehurst argues further that practice and policy decisions around SEL are based on skewed perceptions of the evidence, namely relying on large meta-analyses with subpar methodology and ignoring conflicting or null findings. He highlights two key studies, a 2011 meta-analysis of the effects of SEL and its follow-up examining longer-term effects, and the large-scale Social and Character Development Research Consortium (SACD) study from the early 2000s. Whitehurst is correct to highlight the limitations of existing meta-analyses; however, they still have value. For example, many of the studies included in the 2011 study did not involve randomization or use reliable outcome measures. Indeed, the SACD study—like so many large-scale evaluations of this type—revealed no differences between the schools randomized to a variety of “social and character development” interventions and those in the no-intervention condition. If we relied on these studies alone, we might be skeptical about the nature of the evidence. These studies do, however, establish important baseline knowledge and evidence that allows researchers to ask more refined questions using stronger methods and measures. Moreover, aggregating studies over time can provide a signal that is hard to discern from a host of individual studies that target very different things. The signal emerging from a collection of meta-analyses (there are now several) is strong, and worth following with a look at the individual studies included within them.
For example, more than two decades of randomized-controlled trials evaluating Promoting Alternative Thinking Strategies (PATHS) with socio-economically, racially, and developmentally diverse samples and conducted with rigorous research designs show positive impacts across both outcome domains (e.g., behavioral, cognitive, and social) and reporters (e.g., teachers and peers). Early studies found that participation in PATHS improved emotion vocabulary, fluency, reasoning, and management of emotions for first- and second-grade children, with particularly strong findings for students with disabilities. Other studies of PATHS confirm its strong and significant impact for students in special education classrooms.
In a more recent study of approximately 3,000 first through third graders, PATHS demonstrated positive and significant impacts on cognitive skills (concentration, attention, work completion), authority acceptance (oppositional and conduct problem behaviors), and social competence (prosocial behavior and emotion regulation) compared to a matched comparison group. A similar randomized study following approximately 780 students over the course of three years found decreased aggressive social problem solving, hostile attribution bias, and aggressive interpersonal negotiation strategies for students in fourth and fifth grade. These aspects of social information processing are explicit and proximal targets in the PATHS intervention. Evidence also suggests that gains produced by PATHS are sustained two years after the intervention period.
Another example comes from the 4Rs program described above, a school-based intervention in social problem solving and conflict resolution that trains and supports all teachers in kindergarten through fifth grade in how to integrate the teaching of social and emotional skills into the language arts curriculum. A randomized evaluation of 4Rs indicated that children in the 4Rs group were less aggressive, had fewer problems with attention, had fewer depressive symptoms, and showed improved social competence, compared to students in the control group. There were no effects on academic outcomes for the full sample of students participating in the study; however, children who had higher levels of behavior problems at the outset of the evaluation showed gains in attendance, reading scores, and math scores—suggesting that high-quality SEL programs can be an effective mechanism for supporting at-risk students and reducing the achievement gap.
Interestingly, the SACD study also tells an important story. As noted above, overall, it did not detect differences between the intervention and control groups, but it is worth noting that the SACD study was a mix of very different program approaches and used a general measurement battery, rather than measures aligned to the specific skills being targeted in each program. Several of the individual RCTs embedded in the broader national study found impacts on social-emotional outcomes, and this is believed to be because the individual studies used measurement batteries closely tied to the theory of change of the specific program.
Each of these examples documents impacts in areas targeted by the specific program. Each program targets specific, observable, and teachable skills and competencies, and each rests on a slightly different theory and therefore adopts a slightly different approach. We are by no means suggesting that we should ignore null findings of larger multi-program studies, but rather that we must carefully consider the evidence, what it offers, and be prepared to learn from multiple sources. What might appear to be few or no impacts may instead reflect a lack of alignment between program targets and outcome measures.
The examples we present here (and there are others) use rigorous methods to demonstrate important findings that should not be overlooked. They include diverse samples and randomized longitudinal designs, and they reveal variation in impact by specific student characteristics as well as average positive outcomes across domains (social, emotional, behavioral, cognitive) and measurement types. These are hallmark characteristics of a robust body of evidence. As long as we stick to that evidence, practitioners and policymakers have much to draw upon in designing and adopting evidence-based, effective programs and interventions to improve social-emotional and other outcomes. When theory and measurement are closely aligned, we do see effects. And this brings us back to the issue of terminology—we must be explicit about what we are targeting, about the activities that underlie expected change, and about how we are measuring impact. The importance and value of this sort of precision and alignment cannot be overstated.
Stephanie Jones is the Gerald S. Lesser Professor in Early Childhood Development at the Harvard Graduate School of Education. Rebecca Bailey is Assistant Director of the EASEL Lab at the Harvard Graduate School of Education, where Jennifer Kahn is a Research Manager and Sophie Barnes is a Research Coordinator.