Quality Counts 2001, A Better Balance: Standards, Tests, and the Tools to Succeed
by the editors of Education Week
Editorial Projects in Education, 2001.
In just five years, Education Week’s high-profile annual compilation Quality Counts (QC) has emerged as perhaps the K-12 education field’s most prominent source, besides the publications of the federal government, of statistical information, particularly at the state level. The reporters and editors of Education Week, which modestly styles itself “American Education’s Newspaper of Record,” prepare QC, with generous subsidy from the Pew Charitable Trusts. Appearing each January, QC typically runs to a whopping 200 folio-size pages.
Each successive edition of QC includes some familiar measures, drops some old categories, and adds some newly developed ones, the latter tied mostly to the year’s policy theme. This year’s theme was attaining “A Better Balance” between academic standards and tests on the one hand, and what the editors term “the tools to succeed” on the other. (In 2000, the theme was teachers; in 1999, accountability.) Besides thousands of numbers, QC features dozens of interpretive essays by Education Week reporters and editors–thus raising the dual specters of selective statistics and biased journalism.
We have no reason to doubt the bona fides of the editors, researchers, and advisors who choose the numbers and pen the essays. They presumably yearn to be interesting, timely, relevant, and influential. They want to get noticed and buzzed about. They want to sell copies, please their advertisers, gratify their donors, and ensure that next year’s edition is eagerly awaited (and chockablock with ads). If their report had no message, no conclusions, and no edge, it would be less noticed.
However, Quality Counts’s numbers and essays certainly do not get treated as neutral entries in a wholly academic sweepstakes. In today’s education policy wars, for better or worse, no choice of a fact can be deemed wholly neutral. Facts are also weapons. Which ones you select matters a great deal. If, for example, you seek to convey to readers a sense of teacher salaries, it matters whether you report beginning salaries or those at the top rung; whether the focus is on the mean or the median; whether fringe benefits as well as cash wages are included; and whether, for perspective, teacher salaries are set alongside the earnings of bus drivers or neurosurgeons. (Teacher salaries didn’t appear in this year’s QC, but in 2000 average teacher salaries, adjusted for the cost of living, were reported, though not counted as part of each state’s “grade.”)
Framing the Question
Subjectivity begins, of course, with the selection and framing of the theme itself. In choosing this year’s “Better Balance,” for example, the editors signaled that something is awry in the existing balance between the “hard” elements of standards-based reform (namely the academic standards, assessments, and interventions that make up a state’s accountability system) and such “soft” components as teacher training, instructional materials, and classroom environment.
Concern about this balance is as old as the standards movement itself. For at least a dozen years, a debate has raged in Washington and in state capitals over what the profession generally calls “opportunity to learn” standards, or OTL. This concern is often captured by the aphorism “It’s not fair to hold students accountable for learning things they’ve never been taught.” According to OTL doctrine, policymakers mustn’t attend solely to standards and results. They should also concern themselves with the education system’s ability to ensure that those being held accountable have ample opportunities and resources to attain the desired results.
Reasonable, yes? Sure–but only up to a point. It’s a fact that education policymakers cannot confine themselves to goals and results. They also need to be reasonably confident that the available resources and institutional arrangements have a fighting chance of producing the desired outcomes–that salaries are high enough to draw talented applicants, that school districts can provide students with up-to-date textbooks and technology.
However, it is easy to lose one’s focus on results while bogging down in resource arguments. That seems to be just fine with those who are nervous about accountability in the first place. OTL is the chief means by which yesterday’s fixation on school inputs and services reasserts itself in today’s era of results-based education. Opportunity to learn–or what QC terms “the tools to succeed”–can become a handy, even virtuous, excuse for not holding anyone to account for actually teaching or learning anything, or at least for justifying mediocrity. There is always some inadequacy or shortcoming to be found somewhere in the vastness of the K-12 delivery system, not to mention the varied problems the kids bring to school. Hence, as one starts down the path of “balance,” a reason can readily be found to rationalize unsatisfactory outcomes or to defer the day when results actually count.
The education profession has persuaded itself that all the inputs must be exactly right before any results should count for students, much less for those who teach them and lead their schools. Consequences for adults in the education system are politically touchy anyway, so OTL-type excuses for skirting them are particularly welcome. In statewide accountability systems, the notion of “cracking down” on the kids is widespread. But we look far and wide before finding any teachers or principals in serious jeopardy. It’s as if only the soldiers and not the officers are being held to account for winning or losing the battle. Whenever someone suggests accountability for the educators, the furor that follows combines OTL concerns (what if the teachers didn’t have enough professional development? What if there was high turnover among their pupils?) with moral indignation and invocation of seniority rules, tenure laws, and contractual rights.
Captured by the System
The editors of Education Week have succumbed to OTL-type reasoning, more vividly in 2001 than in the preceding four editions of QC. “States,” they now write, “must balance policies to reward and punish performance with the resources needed for students and schools to meet higher expectations.” The fundamental message of QC 2001 is that such “balance” is lacking and needs to be developed.
Thus Quality Counts 2001 succors those made uneasy by standards-based reform and high-stakes testing. In so doing, it partakes of the central assumptions of the education profession itself and risks sliding over the edge into being a professional trade journal for educators, like, say, Phi Delta Kappan or Educational Leadership, rather than a watchdog on behalf of the broader American public.
Consider the report’s “Executive Summary.” The reader need penetrate only to paragraph three to find the caution lights flashing about standards and tests. The first paragraph reports that states have been trying hard to raise academic standards and that the public supports this effort. The second paragraph says that slow progress is being made. Then comes the big But. Paragraph three warns that, without a “better balance,” all this progress is in jeopardy, together with the life prospects of “tens of thousands” of youngsters. Paragraph four then closes in for the policy kill:
Specifically, Quality Counts found, state tests are overshadowing the standards they were designed to measure and could be encouraging undesirable practices in schools. Some tests do not adequately reflect the standards or provide a rich enough picture of student learning. And many states may be rushing to hold students and schools accountable for results without providing the essential support.
The full report has three major sections. Part I consists of six essays by Education Week reporters and editors, based partly on surveys and polls. Part II is the annual state-by-state report card, full of charts and tables assigning grades and rankings to the fifty states on their level of student achievement, progress in adopting standards and accountability, efforts to improve teacher quality, their school climate, and the resources they devote to education. Finally, in part III, come 80 pages of individual state profiles.
The essays in part I are troubling on several counts, beginning with their main source of “data,” which is a survey of public school teachers–and no one else. Teachers’ views on education warrant careful attention, of course, but teachers are certainly not the only affected parties and they’re among the most self-interested. To learn about foxes, one wouldn’t settle for polling only chickens. The results of this survey are predictable: protests about narrowing the curriculum, teaching to the test, inadequate professional development, unfairness toward disadvantaged and minority youngsters–and toward hard-working teachers themselves. To their credit, these essays also profile some states, districts, schools, and teachers that are responding constructively to standards and testing. But as one browses these pages to see whose opinions (besides teachers’) are taken seriously, it becomes clear that most of the interviews and quotations come from critics and doubters within the education profession. Where are the comments from legislators, employers, or college admissions officers? The key essay on testing, for example, written by QC uber-editor Lynn Olson, quotes five teachers, ten academics, one parent, and two policymakers. The overwhelming majority of these comments are negative or skeptical toward high-stakes testing.
The trappings of objectivity and scholarly rigor are certainly present in part II, the report card: endless charts, elaborate footnotes, and long methodological explanations written in tiny type. Here reportorial selectivity yields to subtler decisions about which data to include and how to interpret them. Project research coordinator Ulrich Boser boasts in the report card’s introduction (none too subtly entitled “Pressure without Support”) that the tables are based on the “most comprehensive to date” survey of “state policies that aim to hold schools and students responsible for results and build their capacity to reach academic standards.” In their effort to be contemporary, the researchers omit all sorts of long-term trends and patterns that might be even more revealing than the “very latest” data. For example, no effort is made to show the increase in public-school spending in America during the past 30 (or 50) years, the uses to which that money has been put, the steady reduction in class size, the huge increase in numbers of school employees, and the various trends in achievement that correlate almost not at all with any of these resource trends.
The data under the heading student achievement are fine. Its six subcategories are all based on states’ National Assessment of Educational Progress (NAEP) scores in various subjects in grades 4 and 8. The key barometer throughout is what fraction of a state’s youngsters scored at or above “proficient” on the NAEP scale. In 4th grade reading in 1998, for example, scores ranged from a low of 17 percent in Hawaii to a high of 46 percent in Connecticut. Eleven states didn’t take part.
So far, so good. It’s exactly what one would want from a publication named Quality Counts: a nice, clear focus on academic results, namely student achievement, measured on the best yardstick available.
Turning to standards and accountability, we encounter three major subheadings, two of which (accounting for 70 percent of this grade) are also pretty solid. Under “standards,” states get points depending on how many core subjects and levels of schooling they have “clear and specific” standards in, as judged by the American Federation of Teachers. Under “accountability,” a state’s score depends on how many of five different ways it holds schools (not just kids!) accountable for their performance. All are reasonable things to look for, albeit the most important of them–“sanctions” for failing schools–can be found in just 14 states (including jurisdictions with plans to institute sanctions at some later date).
The “assessment” subheading is more problematic. Here a state can get full marks only if it uses five different kinds of test items, including “extended response” questions and “portfolios.” A state that relied on multiple-choice questions could not possibly do well here. This partakes of the view fashionable among educators that multiple-choice testing is inherently inadequate because it cannot be used to appraise anything but the most rudimentary of skills and factual recall-type knowledge. Of course that’s not so. A well-conceived multiple-choice question can probe deeply into a student’s command of complex cognitive skills, prowess at problem solving, and sophisticated knowledge of subject matter. To be sure, multiple-choice items cannot expose a student’s ability to write lucid prose or engage in original research, but they can go a long way toward revealing the sorts of things we want youngsters to know and be able to do. Moreover, they do so with great efficiency and speed, and they are low cost, flexible (computer-adapted items), and objective (with machine-based scoring).
Larger problems loom in the report card’s three remaining areas. In the section on improving teacher quality, a state’s grade depends in part on its embrace of some of the education profession’s trendier “reforms.” Rather than probing the skills and knowledge that a teacher imparts to her students, for example, QC 2001 puts considerable weight on whether the state uses a “performance assessment” (including videotapes, portfolios, etc.) to appraise teachers. It also rewards states that give bonuses to teachers who have been certified by the National Board for Professional Teaching Standards. Unfortunately, we know from the work of economists Michael Podgursky and Dale Ballou and others that to date there is no hard evidence that being certified by the National Board translates into being an effective teacher.
QC also tacitly privileges the conventional education-school path into the classroom, though it no longer rewards states for having their new teachers emerge from “nationally accredited” institutions. QC 2001 does, however, assign points to states that require at least 12 weeks of practice teaching as part of a preparation program–not necessarily a bad thing, but limiting for states and districts that are experimenting with programs such as Teach for America and alternative pathways to certification. Indeed, QC grants no points to states with alternative-certification programs! (It did last year.)
The section on school climate has some good features. For example, a quarter of a state’s grade is based on having public-school choice and charter schools. Troubling, though, is the fact that 35 percent of the climate grade depends on having classes smaller than 25 pupils, which means that QC has taken sides in the great class-size debate, notwithstanding the rivers of doubt that Hoover Institution economist Eric Hanushek and others have poured on the notion that smaller classes are an efficient means of boosting achievement. The remaining 40 percent of a state’s climate grade addresses legitimate concerns such as classroom misbehavior, pupil tardiness, and the extent of parents’ involvement in school. Unfortunately, those indicators depend on self-reporting by 8th graders. While we shouldn’t fault QC’s editors for the fact that these were the only such data they could find, we may wonder how reliable these numbers are.
The touchy topic of resources has two major subheads: adequacy and equity. Here is where one might most expect OTL doctrine to rule. Yet QC 2001 is even more primitive, relying instead on dollars alone. A state’s grade on resource “adequacy” turns not on some calculus of what resources are needed to furnish its youngsters with an adequate education, but simply on how rapidly the state’s education spending is rising and how much of the state’s total worth is being devoted to education. This section might be called “quantity counts,” and it yields some curious results.
West Virginia, of all places, gets the highest grade here–a straight A–as it reportedly spent $8,322 per pupil on public education in 1999 and has been boosting its outlays faster than any other state and digging deeper than all but one. Yet West Virginia is at or below the national average on all the QC achievement scores, gets a D+ for standards and accountability, a C for teacher quality, and a D+ for school climate. By contrast, Connecticut, which also spent more than $8,000 per pupil and which is in first or second place among the states on four of six NAEP scores (and eighth in the remaining two), clocks in with just a B- in “resource adequacy.” Adequate for what, one wonders. Education Week’s strange way of measuring adequacy lauds a state, like West Virginia, that has only recently begun raising its spending while punishing a state like Connecticut whose spending has been high for years. Likewise, West Virginia fares better than Connecticut because it is poorer; if both states spend exactly the same per pupil, West Virginia naturally winds up devoting more of its per-capita income to education.
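The arithmetic behind that last point is easy to sketch. What follows is only an illustration, not QC’s actual formula: the function name is hypothetical, the simple ratio of per-pupil spending to per-capita income is an assumed stand-in for whatever “effort” measure the report uses, and the two income figures are invented round numbers rather than data for any real state.

```python
def income_share(spending_per_pupil: float, per_capita_income: float) -> float:
    """Per-pupil spending as a fraction of per-capita income.

    An assumed, simplified proxy for spending 'effort'; not QC's formula.
    """
    return spending_per_pupil / per_capita_income


# Two hypothetical states that spend exactly the same amount per pupil.
same_spending = 8000.0
poorer_state_income = 20000.0   # invented figure, for illustration only
richer_state_income = 40000.0   # invented figure, for illustration only

poorer_effort = income_share(same_spending, poorer_state_income)
richer_effort = income_share(same_spending, richer_state_income)

# The poorer state shows twice the 'effort' despite identical spending.
print(f"poorer state: {poorer_effort:.0%}, richer state: {richer_effort:.0%}")
```

Under these assumptions the poorer state looks twice as committed, though its pupils receive not a dollar more.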
The measure of resource “equity” is incomprehensible to anyone who has not specialized in school finance and earned a degree in statistics. Half of a state’s grade hinges on something called “state equalization effort”; the rest comprises still more obscure factors: the “wealth-neutrality score,” “relative inequality in spending per student among districts,” and something called the “McLoone Index.” Named for school finance analyst Eugene McLoone, it is “based on the assumption that if all the pupils in a state were lined up according to the amount their districts spend on them, perfect equity would be achieved if every district spent at least as much as was spent on the pupil smack in the middle of the distribution….The ratio between what is currently spent by districts in the bottom half and what needs to be spent to achieve equity is the McLoone Index.”
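For readers without that statistics degree, the quoted definition can be worked through in a few lines. This is a minimal sketch under stated assumptions, not the computation QC’s researchers performed: the function names are hypothetical, each district is represented as a pair of per-pupil spending and enrollment, the median is taken as the spending level of the district containing the middle pupil, and the sample figures are invented.

```python
from typing import List, Tuple

District = Tuple[float, int]  # (per-pupil spending, enrollment)


def pupil_median_spending(districts: List[District]) -> float:
    """Spending level of the district that contains the middle pupil."""
    ordered = sorted(districts)
    total_pupils = sum(enrollment for _, enrollment in ordered)
    running = 0
    for spending, enrollment in ordered:
        running += enrollment
        if running * 2 >= total_pupils:
            return spending
    raise ValueError("no districts supplied")


def mcloone_index(districts: List[District]) -> float:
    """Actual spending on below-median pupils divided by what would be
    spent if each of those pupils were brought up to the median."""
    median = pupil_median_spending(districts)
    below = [(s, n) for s, n in districts if s < median]
    if not below:
        return 1.0  # nobody is below the median: perfect equity by this measure
    actual = sum(s * n for s, n in below)
    needed = sum(median * n for _, n in below)
    return actual / needed


# Invented example: three equal-sized districts with unequal spending.
unequal = [(6000.0, 100), (8000.0, 100), (10000.0, 100)]
# Invented example: a single statewide district spending one amount on everyone.
uniform = [(7000.0, 500)]

print(mcloone_index(unequal), mcloone_index(uniform))
```

Note that the index says nothing about how high the median is or what the money buys; a state that spends uniformly scores a perfect 1.0 whatever its level of spending or achievement.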
The equity upshot: Hawaii naturally wins, because its unified statewide school system spends the same amount on all students. Never mind that the Aloha State’s achievement scores are among the lowest in the land.
Something closer to objectivity reappears in the final section of QC 2001, where profiles of individual states are more balanced and informative than this reviewer expected. Each profile includes a report card recapping the state’s NAEP results and its letter grades reported in earlier pages. Each gives a few basic facts about school enrollments and demographics. Then each has an essay of a page or two about what’s going on in that state. The authors are Education Week reporters who seem to have been given a fairly free hand to frame a state’s story according to what they found interesting there and with whom they talked.
Most of the essays are sober, matter-of-fact accounts of recent doings on the education reform front. The Indiana essay is a model of that kind, as are those of Louisiana, Maine, and Delaware. Some report interesting information that national observers may not have known, such as Nebraska’s abiding love of locally selected tests and its rejection of statewide assessments. Some report that heated controversies–such as the uproar surrounding Florida’s voucher program–are cooling down. Even some places where recent developments could lend themselves to a reporter’s bias against testing don’t always produce the expected “spin.” The Massachusetts account, for example, is acceptably balanced, as are those for Colorado and high-profile Texas. There are occasional slips, however. The Ohio story, for one, tends to favor the views of those who are grumping about the state’s proficiency testing program.
On balance, however, this sprawling publication displays an unmistakable, albeit uneven, set of assumptions that align with the values, preferences, and biases of the education profession itself. It thus becomes more of a report to the profession on matters that interest people within the field than a report to the public about how well that field is serving the nation.
Perhaps we shouldn’t be surprised. Most of Education Week’s and QC’s subscribers, after all, are educators, and most of the advertisers are firms that want to sell things to educators. This inevitably tempts reporters, editors, and publishers to view the world through the lenses of readers within the field rather than outsiders who most want to know whether the system is performing as well as it should. “Give educators what they want to see” may never have been stated in planning meetings and editorial sessions. Possibly all that happened is that the authors and their advisors and supervisors have been so close to the K-12 education system for so long that they’ve lost perspective on it and its players. They may even suffer from a touch of the Stockholm syndrome, identifying with their oppressors–their customers, in Education Week’s case. Whatever the reason, the unhappy bottom line is that quality does not count quite as much as it should in Quality Counts.
Chester E. Finn Jr. is president of the Thomas B. Fordham Foundation, a senior fellow at the Manhattan Institute, and a visiting fellow at the Hoover Institution.