Never Judge a Book By Its Cover—Use Student Achievement Instead
They say never judge a book by its cover. We need to start judging textbooks and other instructional materials using student achievement instead.
In February and March of last year, as teachers were preparing for the first administration of the end-of-year assessments from the Partnership for Assessment of Readiness for College and Careers (PARCC) and the Smarter Balanced Assessment Consortium (SBAC), the Center for Education Policy Research at Harvard surveyed a representative sample of 1,500 teachers and 142 principals across five states. Our report focused on the instructional changes teachers were making and the effectiveness of the training and support they were receiving. However, we also asked about the textbooks they were using.
We matched each teacher to the students they were teaching and assembled data on students’ demographic characteristics, performance on prior state tests, and the averages of such characteristics for the peers in their classroom. We also estimated each teacher’s impact on student performance in the prior school year (2013-14) to use as a control. (We wanted to account for the fact that more effective teachers may choose to use particular textbooks.) After controlling for the measures of student, peer, and teacher influences above, we estimated the variance in student outcomes on the new assessments associated with the textbook used. 
The textbook effects were substantial, especially in math. In 4th and 5th grade math classrooms, we estimated that a standard deviation in textbook effectiveness was equivalent to .10 standard deviations in achievement at the student level.  That means that if all schools could be persuaded to switch to one of the top quartile textbooks, student achievement would rise overall by roughly .127 student-level standard deviations or an average of 3.6 percentile points. Although it might sound small, such a boost in the average teacher’s effectiveness would be larger than the improvement the typical teacher experiences in their first three years on the job, as they are just learning to teach.
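The arithmetic behind the .127 and 3.6 figures can be reproduced with a short sketch. The post does not spell out its method, so the steps below are assumptions: textbook effects and student scores are both treated as normally distributed, the gain is the mean of the top quartile of textbook effects relative to the overall mean, and the percentile gain is averaged over all students rather than measured at the median.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# From the text: one SD of textbook effectiveness = .10 student-level SDs.
textbook_sd = 0.10

# Assumption: textbook effects are normal. The mean of the top quartile of a
# standard normal is pdf(z_75) / 0.25, where z_75 is the 75th percentile.
z75 = std_normal.inv_cdf(0.75)
top_quartile_mean = std_normal.pdf(z75) / 0.25

# Average gain if every school moved to a top-quartile textbook.
gain_sd = textbook_sd * top_quartile_mean

# Assumption: student scores are normal and every score shifts by gain_sd.
# The average percentile gain across students is Phi(d / sqrt(2)) - 0.5.
avg_percentile_gain = 100 * (std_normal.cdf(gain_sd / sqrt(2)) - 0.5)

print(f"top-quartile mean: {top_quartile_mean:.2f} SD")
print(f"achievement gain:  {gain_sd:.3f} SD")
print(f"average gain:      {avg_percentile_gain:.1f} percentile points")
```

Under these assumptions the sketch lands on roughly .127 standard deviations and 3.6 percentile points. A student starting exactly at the median would gain about 5 percentile points from the same shift, which is why the averaging step matters.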
As advocated by Russ Whitehurst and Matt Chingos, the search for more effective curriculum materials can yield outsized bang-for-the-buck, because schools are already buying textbooks and better textbooks do not cost more on average than less effective ones. We estimate that an annual study of textbook effectiveness would need to collect data from roughly 1,800 schools to have the statistical power necessary to detect a .10 impact for any textbook representing at least 10 percent of the market. Some states, such as Indiana and Florida, already collect data on textbook adoptions, but even if it were necessary to collect data from a new sample of schools, the study would likely cost less than $2 million annually. Across the PARCC and SBAC states, there are approximately 3 million 4th and 5th grade students. Assuming the more effective textbooks have a similar cost to the textbooks they replace, the incremental cost of the 3.6 percentile point gain in achievement would be the cost of the study itself, or roughly 67 cents per student.
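The per-student figure follows directly from the two numbers in the paragraph above. As a minimal sketch (the variable names are illustrative, and the $2 million is treated as the full annual cost even though the text gives it as an upper bound):

```python
# Figures taken from the text.
study_cost_per_year = 2_000_000   # dollars, upper bound on annual study cost
students = 3_000_000              # 4th and 5th graders in PARCC and SBAC states

cost_per_student = study_cost_per_year / students
print(f"${cost_per_student:.2f} per student per year")
```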
A focused effort to evaluate curricula and shift demand toward more effective options would yield a higher return on investment than more resource-intensive measures. For instance, Krueger (1999) estimated that small classes in the Tennessee class size experiment generated a 5 percentile point increase in performance in early grades. But that required reducing class size from 23 to 16 students per teacher. Using an average teacher salary of $55,000, the class size reduction would have a minimum cost across the PARCC and SBAC states of $3.1 billion or $1,046 per student. That is 1,561 times the per-student cost of the annual textbook study, for a slightly larger benefit! (And that does not include the cost of the extra classroom space that would be needed.)
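The class size comparison can be checked the same way. This sketch assumes the only cost is the salaries of the additional teachers needed to cover the same 3 million students, and it uses the rounded 67 cents per student for the textbook study, which is what reproduces the 1,561 ratio.

```python
# Figures taken from the text.
students = 3_000_000
teacher_salary = 55_000            # dollars per year
class_size_before = 23
class_size_after = 16
study_cost_per_student = 0.67      # rounded per-student cost of the textbook study

# Extra teachers needed to serve the same students in smaller classes.
teachers_before = students / class_size_before
teachers_after = students / class_size_after
extra_teachers = teachers_after - teachers_before

total_cost = extra_teachers * teacher_salary
cost_per_student = total_cost / students
ratio = cost_per_student / study_cost_per_student

print(f"extra teachers:   {extra_teachers:,.0f}")
print(f"total cost:       ${total_cost / 1e9:.1f} billion")
print(f"cost per student: ${cost_per_student:,.0f}")
print(f"ratio to study:   {ratio:,.0f}x")
```

With the unrounded study cost of about 66.7 cents per student, the ratio would come out closer to 1,569, so the difference is a rounding artifact rather than a change in the conclusion.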
Admittedly, even though we controlled for student baseline scores and demographics, mean characteristics of students at the classroom level, and teachers’ prior value-added, we could be overstating effects of textbooks, if the schools using certain textbooks are systematically different in some unmeasured way. A future annual survey of textbook usage and student achievement could do a better job of isolating the effect of individual textbooks by measuring changes in student achievement as schools transitioned from one book to another.
An annual report on the effectiveness of textbooks would transform the market, by providing publishers and software developers with a stronger incentive to compete on quality. To the extent that reduced student outcomes would reflect poorly on them, they would also have an incentive to provide the resources teachers, students and parents might need to use the textbook or software effectively. An annual report on effectiveness would also complement efforts, such as by William Schmidt, Morgan Polikoff and Ed Reports, to evaluate the content of textbooks and their alignment with the Common Core standards.
Welcome to the new economics of education research when states use common standards and assessments. By combining teacher-student links with the ability to measure achievement gains using common assessments, we could be generating lower-cost, faster-turnaround evaluations of curricula and other educational interventions.  Especially for products that are constantly evolving, such as textbooks and educational software, such an approach would provide more timely evidence than randomized clinical trials, at a fraction of the cost. States just need to start working together to reap the benefits.
— Thomas J. Kane
This post originally appeared as part of the Evidence Speaks series at Brookings.
Notes:
1. The five states were Maryland, Massachusetts, Delaware, New Mexico, and Nevada. Unfortunately, we could not use the testing outcomes in Nevada due to problems with the SBAC assessment in the spring of 2015.
2. We dropped the 20 percent of students whose teachers reported they did not use a textbook, an additional 30 percent whose teachers did not use their textbook as their primary curriculum, and limited the sample to textbooks that at least two teachers reported using.
3. Others have found big effects of textbooks and curricula on student achievement. For instance, the What Works Clearinghouse (WWC) currently contains ten evaluations of elementary school math textbooks and software, with effect sizes ranging from roughly −2 to 17 percentile points.
4. Chingos and Whitehurst advocated for a similar approach, suggesting that the Institute of Education Sciences should develop standards for collecting data on instructional materials. Rachana Bhatt and Cory Koedel have studied the effect of elementary math curricula by matching schools on baseline characteristics and comparing performance on the test in Indiana. See Bhatt, Rachana and Cory Koedel, “Large-Scale Evaluations of Curricular Effectiveness: The Case of Elementary Mathematics in Indiana” (2012).
5. States do not have a mechanism for coordinating research agendas. Perhaps the Council of Chief State School Officers, National Governors’ Association or the testing consortia could help.