The Limitations of Self-Report Measures of Non-Cognitive Skills

Recent evidence from economics and psychology highlights the importance of traits other than general intelligence for success in school and in life. Disparities in so-called “non-cognitive skills” appear to contribute to the academic achievement gap separating rich from poor students. Non-cognitive skills may also be more malleable and thus amenable to intervention than cognitive ability, particularly beyond infancy and early childhood. Understandably, popular interest in measuring and developing students’ non-cognitive skills has surged.

As practice and policy race forward, however, research on non-cognitive skills remains in its infancy. There is little agreement on which skills are most important, their stability within the same individual in different contexts, and, perhaps most fundamentally, how they can be reliably measured. Whereas achievement tests that assess how well children can read, write, and cipher are widely available, non-cognitive skills are typically assessed using self-report and, less frequently, teacher-report questionnaires. Like achievement tests, questionnaires have the advantage of quick, cheap, and easy administration. And unlike behavioral proxies that might be used to gauge the overall strength of a student’s character, questionnaires can be crafted to capture more specific traits to be targeted for development.

One obvious limitation of questionnaires is that they are subject to faking and, therefore, to social desirability bias. When considering whether an item such as “I am a hard worker” should be marked “very much like me,” a child (or her teacher or parent) may be inclined to choose a higher rating in order to appear more attractive to herself or to others. To the extent that social desirability bias is uniform within a group under study, it will inflate individual responses but not alter their rank order. If some individuals respond more to social pressure than others, however, their placement within the overall distribution of responses could change.
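The distinction between uniform and uneven inflation can be made concrete with a minimal sketch. All of the numbers below are hypothetical and chosen only for illustration; they are not drawn from the Boston data.

```python
def ranks(scores):
    """Return each respondent's rank (0 = lowest) within the list of scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

# Hypothetical "true" ratings for four students on a 5-point scale.
true_effort = [2.0, 3.0, 3.5, 4.0]

# Case 1: every student inflates her rating by the same amount.
uniform = [t + 0.5 for t in true_effort]

# Case 2: assumed inflation that differs from student to student.
inflation = [1.5, 0.0, 0.8, 0.1]
uneven = [t + b for t, b in zip(true_effort, inflation)]

uniform_preserves_rank = ranks(uniform) == ranks(true_effort)
uneven_preserves_rank = ranks(uneven) == ranks(true_effort)
```

In this toy example the uniform shift leaves the ordering of students intact, while the uneven shift moves the first student ahead of the second and the third ahead of the fourth.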

Possibly more troublesome is reference bias, which occurs when survey responses are influenced by differing standards of comparison. A child deciding whether she is a hard worker must conjure up a mental image of hard work to which she can compare her own habits. A child with high standards might consider a hard worker to be someone who does all of her homework well before bedtime and, in addition, organizes and reviews all of her notes from the day’s classes. Another child might consider a hard worker to be someone who brings home her assignments and attempts to complete them, even if most of them remain unfinished the next morning.

To illustrate the potential for reference bias in self-reported measures of non-cognitive skills, I draw on cross-sectional data from a sample of Boston students discussed in detail in a recent working paper. Colleagues from Harvard, MIT, and the University of Pennsylvania and I used self-report surveys to gather information on non-cognitive skills from more than 1,300 eighth-grade students across 32 of the city’s public schools, and linked this information to administrative data on the students’ behavior and test scores. The non-cognitive skills we measured include conscientiousness, self-control, and grit – a term coined by our collaborator Angela Duckworth to capture students’ tendency to sustain interest in, and effort toward, long-term goals.

Importantly, the schools attended by students in our sample include both open-enrollment public schools operated by the local school district and five over-subscribed charter schools that have been shown to have large, positive impacts on student achievement as measured by state math and English language arts tests. These charter schools have a “no excuses” orientation and an explicit focus on cultivating non-cognitive skills as a means to promote academic achievement and post-secondary success.

Our results confirm that the surveys we administered capture differences in non-cognitive skills that are related to important behavioral and academic outcomes. Figures 1a, 1b, and 1c compare the average number of absences, the share of students who were suspended, and the average test-score gains between fourth and eighth grade of students who ranked in the bottom and top quartiles on each skill. [1] They show, for example, that students who rated themselves in the bottom quartile with respect to self-control were absent 2.9 more days than students in the top quartile, and were nearly three times as likely to have been suspended as eighth graders; similar differences in absences and suspension rates are evident for conscientiousness and grit. In addition, the differences in test-score gains between bottom- and top-quartile students on each non-cognitive skill amount to almost a full year’s worth of learning in math over the middle school years.

Note to Figures 1a, 1b, and 1c: * indicates that the difference between bottom- and top-quartile students is statistically significant at the 95 percent confidence level.

Paradoxically, however, the positive relationships between these self-reported measures of non-cognitive skills and growth in academic achievement dissipate when the measures are aggregated to the school level. In other words, schools in which the average student reports higher levels of conscientiousness, self-control, and grit do not exhibit higher test-score gains than do other schools. In fact, students in these schools appear to learn a bit less.

This paradox is most vivid when comparing students who attend “no excuses” charter schools and those who attend open-enrollment district schools. Despite making far larger test-score gains than students attending open-enrollment district schools, and despite the emphasis their schools place on cultivating non-cognitive skills, charter school students exhibit markedly lower average levels of self-control as measured by student self-reports (see Figure 2). This statistically significant difference of -0.23 standard deviations is in the opposite direction of that expected, based on the student-level relationships between self-control and test-score gains displayed above. The average differences between the charter and district students in conscientiousness and grit, although statistically insignificant, run in the same counter-intuitive direction. [2]

Note to Figure 2: * indicates that the difference between district and charter schools is statistically significant at the 95 percent confidence level.

Two competing hypotheses could explain this paradox. One is that the measures are accurate and the charter schools, despite their success in raising test scores, and contrary to their pedagogical goals, weaken students’ non-cognitive skills along crucial dimensions such as conscientiousness, self-control, and grit.

The alternative and, in my view, more plausible hypothesis is that the measures are misleading due to reference bias stemming from differences in school climate between district and charter schools. Figure 3 confirms that the academic and disciplinary climates of the charter schools in our sample, as perceived by their students, do in fact differ from those of the open-enrollment district schools. Charter students rate teacher strictness, the clarity of rules, and the work ethic expected of them substantially higher than do students in district schools. For example, charter students’ ratings of expectations for student behavior exceed those of their district counterparts by 0.57 on the 5-point scale used for these items. Students attending charter schools also report substantially lower levels of negative peer effects and modestly lower levels of student input in their schools. Of course, these data also come from self-report surveys and may themselves be subject to reference bias. Nonetheless, they suggest the academic and disciplinary climates of the charter schools differ in ways that could lead their students to set a higher bar when assessing their conscientiousness, self-control, and grit.

Note to Figure 3: * indicates that the difference between district and charter schools is statistically significant at the 95 percent confidence level.
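A toy model shows how reference bias could produce the pattern described above, in which self-reports track true skill within each school yet point the wrong way when schools are compared. This is a sketch under strong assumptions, not the analysis from the working paper: the students, the skill values, and the rule that a self-report equals true skill minus the school's standard are all hypothetical.

```python
from statistics import mean

# Hypothetical students: (school, true_skill). Test-score gains are assumed
# to rise with true skill, so charter students gain more on average.
students = [
    ("district", 1.0), ("district", 2.0), ("district", 3.0), ("district", 4.0),
    ("charter", 3.0), ("charter", 4.0), ("charter", 5.0), ("charter", 6.0),
]

# Assumed standard that each school's climate sets for "a hard worker".
standard = {"district": 0.0, "charter": 3.0}

def self_report(school, true_skill):
    """Students rate themselves relative to their own school's standard."""
    return true_skill - standard[school]

def school_mean(school, value):
    """Average value(school, true_skill) over the students in one school."""
    return mean(value(s, t) for s, t in students if s == school)

def reports_rise_with_skill(school):
    """Check that, within a school, higher skill means a higher self-report."""
    pairs = sorted((t, self_report(s, t)) for s, t in students if s == school)
    reports = [r for _, r in pairs]
    return reports == sorted(reports)

mean_skill = {s: school_mean(s, lambda _, t: t) for s in standard}
mean_report = {s: school_mean(s, self_report) for s in standard}
```

Under these assumptions, the charter students have the higher average true skill but the lower average self-report, even though self-reports order students correctly within each school.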

Other recent studies of “no excuses” charter schools reinforce the plausibility of the reference bias hypothesis. For example, a 2013 Mathematica evaluation of KIPP middle schools finds large positive effects on student test scores and time spent on homework, but no effects on student-reported measures of self-control and persistence in school. Similarly, Will Dobbie and Roland Fryer find that attending the Harlem Promise Academy reduced student-reported grit, despite having positive effects on test scores and college enrollment, and negative effects on teenage pregnancy (for females) and incarceration (for males). This parallel evidence from research in similar settings suggests that reference bias stemming from differences in school climate is the most likely explanation for these paradoxical findings.

If the apparent negative effects of attending a “no excuses” charter school on conscientiousness, self-control, and grit do in fact reflect reference bias, then what our data show is that these schools influence the standards to which students hold themselves when evaluating their own non-cognitive skills. The consequences of this shift in normative standards for their actual behavior both within and outside of school are of course unknown – and merit further research.

Just as important, it appears that existing survey-based measures of non-cognitive skills, although perhaps useful for making comparisons among students within the same educational environment, are inadequate to gauge the effectiveness of schools, teachers, or interventions in cultivating the development of those skills. Evaluations of the effects of teacher, school, and family influences on the development of non-cognitive skills could lead to false conclusions if the assessments used are biased by distinct frames of reference.

In the rush to embrace non-cognitive skills as the missing piece in American education, policymakers may overlook the limitations of extant measures. It is therefore essential that researchers and educators seeking to enhance students’ non-cognitive skills develop alternative measures that are valid across a broad range of school settings. In the meantime, policymakers should resist proposals to incorporate survey-based measures of non-cognitive skills into high-stakes accountability systems.

—Martin R. West

This post originally appeared on The Brown Center Chalkboard.

[1] To measure math test-score gains, we regressed 8th-grade test scores on a cubic polynomial of 4th-grade scores in both math and English language arts and used the residuals from this regression as a measure of students’ performance relative to expectations based on their achievement before entering middle school.

[2] Estimates of the impact of attending a charter school based on admissions lotteries confirm that these patterns are not due to selection of students with weak non-cognitive skills into charter schools; rather each year’s attendance at a charter has a statistically significant negative impact on self-reported conscientiousness, self-control, and grit.
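
The residual-based gain measure described in note [1] can be sketched as follows. This is an illustrative reconstruction, not the code used in the working paper: the function names, the small least-squares solver, and the ten standardized scores are all hypothetical, and the specification is assumed to include an intercept plus linear, squared, and cubed terms for each fourth-grade subject.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def cubic_features(math4, ela4):
    """Intercept plus linear, squared, and cubed terms for each 4th-grade score."""
    return [1.0, math4, math4**2, math4**3, ela4, ela4**2, ela4**3]

def residualized_gains(math4, ela4, math8):
    """Regress 8th-grade math scores on the cubic features; return the residuals."""
    X = [cubic_features(m, e) for m, e in zip(math4, ela4)]
    k = len(X[0])
    # Ordinary least squares via the normal equations (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, math8)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return [y - f for y, f in zip(math8, fitted)]

# Hypothetical standardized scores for ten students.
math4 = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -0.8, 0.3, 1.2]
ela4 = [-1.2, 0.4, -0.9, 0.8, -0.3, 1.4, 0.1, 1.0, -1.4, 0.6]
math8 = [-1.3, -0.6, -0.7, 0.4, 0.2, 1.5, 1.1, -0.2, -0.4, 1.0]

gains = residualized_gains(math4, ela4, math8)
```

A positive residual marks a student who scored higher in eighth grade than her fourth-grade achievement would predict; a negative residual marks the reverse. Because the regression includes an intercept, the residuals average to zero across the sample.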
