Stanford University was the site of a summer math camp whose outcomes were studied.

I thank Jack Dieckmann for reading my critique of the proposed California State Math Framework (“California’s New Math Framework Doesn’t Add Up”) and for writing a response (“Stanford Summer Math Camp Researchers Defend Study”). In the article, I point to scores of studies cited by What Works Clearinghouse Practice Guides as examples of high-quality research that the framework ignores. I also mention two studies of Youcubed-designed math summer camps as examples of flawed, non-causal research that the proposed California State Math Framework embraces.

I focused on outcomes measured by how students performed on four tasks created by the Mathematical Assessment Research Service. Based on MARS data, Youcubed claims that students gained 2.8 years of math learning by attending its first 18-day summer camp in 2015. Dieckmann defends MARS as being “well-respected” and having a “rich legacy,” but he offers no psychometric data to support assessing students with the same four MARS tasks pre- and post-camp and converting gains into years of learning. Test-retest using the same instrument within such a short period of time is rarely good practice. And lacking a comparison or control group prevents the authors from making credible causal inferences from the scores.

Is there evidence that MARS tasks should not be used to measure the camps’ learning gains? Yes, quite a bit. The MARS website includes the following warning: “Note: please bear in mind that these materials are still in draft and unpolished form.” Later, that point is reiterated: “Note: please bear in mind that these prototype materials need some further trialing before inclusion in a high-stakes test.” I searched the list of assessments covered in the latest edition of the Buros Center’s Mental Measurements Yearbook, regarded as the encyclopedia of cognitive tests, and could find no entry for MARS. Finally, Evidence for ESSA and What Works Clearinghouse are the two main repositories for high-quality program evaluations and studies of education interventions. I searched both sites and found no studies using MARS.

The burden of proof is on any study using four MARS tasks to measure achievement gains to justify choosing that particular instrument for that particular purpose.

Dieckmann is correct that I did not discuss the analysis of change in math grades, even though a comparison group was selected using a matching algorithm. The national camp study compared the change in pre- and post-camp math grades, converted to a 4-point scale, of camp participants and matched non-participants. One reason not to take the “math GPA data” seriously is that grades are missing for more than one-third of camp participants (36%). Moreover, baseline statistics on math grades are not presented for treatment and comparison groups. Equivalence of the two groups’ GPAs before the camps cannot be verified.

Let’s give the benefit of the doubt and assume the two groups had similar pre-camp grades. Are post-camp grade differences meaningful? The paper states, “On average, students who attended camp had a math GPA that was 0.16 points higher than similar non-attendees.” In a real-world sense, that’s not very impressive on a four-point scale. We learn in the narrative that special education students made larger gains than non-special education students. Non-special education students’ one-tenth of a GPA point gain is underwhelming.

Moreover, as reported in Table 5, camp dosage, as measured in hours of instruction, is inversely related to math GPA. More instruction is associated with less impact on GPA. When camps are grouped into three levels of instructional hours (low, medium, and high dosage), effects decline from low (0.27) to medium (0.09) to high (0.04) dosage. This is precisely the opposite of the pattern of changes reported for the MARS outcome—and the opposite of what one would expect if increased exposure to the camps boosted math grades.
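For readers who want to check the direction of that pattern themselves, here is a minimal sketch in Python. It uses only the three effect values quoted above from Table 5; the variable and function names are illustrative assumptions, and the dosage levels are treated as ordered labels because the underlying hours are not quoted here. It is a quick arithmetic check, not a reanalysis of the study.

```python
# Effects on math GPA by camp dosage, as quoted above from Table 5.
# Names below are illustrative, not taken from the study.
effects_by_dosage = {"low": 0.27, "medium": 0.09, "high": 0.04}


def is_strictly_decreasing(values):
    """Return True if each value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Order the effects from least to most instructional time.
ordered = [effects_by_dosage[level] for level in ("low", "medium", "high")]

declines = is_strictly_decreasing(ordered)
high_to_low_ratio = ordered[-1] / ordered[0]

print(f"Effect shrinks as dosage rises: {declines}")
print(f"High-dosage effect as a share of low-dosage effect: {high_to_low_ratio:.0%}")
```

On these figures, the high-dosage effect is roughly one-seventh of the low-dosage effect, which is the reverse of what a dose-response relationship would predict.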

The proposed California Math Framework relies on Youcubed for its philosophical outlook on K-12 mathematics: advising how the subject should be taught, defining its most important curricular topics, providing guidance on how schools should organize students into different coursework, and recommending the best way of measuring the mathematics that students learn. With the research it cites as compelling and the research it ignores as inconsequential, the framework also sets a standard for what it sees as empirical evidence that educators should follow in making the crucial daily decisions that shape teaching and learning.

It’s astonishing that California’s K-12 math policy is poised to take the wrong road on so many important aspects of education.

Tom Loveless, a former 6th-grade teacher and Harvard public policy professor, is an expert on student achievement, education policy, and reform in K–12 schools. He also was a member of the National Math Advisory Panel and U.S. representative to the General Assembly, International Association for the Evaluation of Educational Achievement, 2004–2012.

Last updated May 24, 2023