Among organizations that don’t give me a paycheck, TNTP may be my favorite.
They do two things really, really well. First, they take part in on-the-ground, let’s-solve-this-problem human-capital activities. In partner cities across the nation, they train and certify teachers, develop and implement new evaluation systems, help administrators improve observations, and much more.
Chances are, if you’re hearing about interesting, innovative teacher or leader work in an urban area, TNTP is involved.
Second, they put out these superb little reports. They’re always short and punchy, visually pleasing, terribly informative, and, in one way or another, unexpected. Teacher Evaluation 2.0 was a valuable how-to guide for discriminating policymakers, The Irreplaceables was a teacher-retention wake-up call, and, of course, The Widget Effect was a game-changer.
The organization is at its most influential when it combines its smarts and muscle: when it can use its research and analysis to inform the field and then help implement the change. For example, TNTP’s findings on the appalling state of teacher evaluations helped shape the Race to the Top application, precipitated a wave of state-level statutory changes, and kicked off some of TNTP’s most meaningful partnerships with states and districts.
Leap Year, the organization’s latest offering, follows in this fine research-meets-practice tradition.
It looks under the hood of the first year of teaching. The conventional wisdom holds that all teachers are lousy out of the gate, so we treat the rookie season, says the report, “like a warm-up lap.”
But there’s much more to this story.
Using its “Assessment of Classroom Effectiveness” (ACE) tool, a multiple-measures evaluation system designed specifically for new teachers, TNTP assessed new educators via observations, student surveys, growth data, and principal ratings.
Among the lessons learned: Not all teachers struggle from the start; in fact, nearly 25 percent score in the top two categories (out of five) in their first observation.
Similarly, while most teachers improve throughout their first year (by 0.2 points on a five-point scale with each observation), many do not. One-quarter of those later denied certification started off poorly and actually got worse over the year.
Indeed, just as a charter school’s early performance can accurately predict its later performance, a teacher’s first-year performance tells us a great deal about his or her ability to improve. Teachers who received certification after their first year had an average score of 3.14 on their first observation; those denied certification averaged around 2.50 on that first observation.
As the report puts it, “Teachers who are performing poorly in their first year rarely show dramatic improvement in their second year.” This includes even those teachers who, thought to have potential despite early struggles, were given a second year (an “extension plan”) to earn certification.
“After more than a year in the classroom, not a single extension-plan teacher earned an observation score in the (top two) categories.”
There are plenty more fascinating tidbits throughout the report; you’ll learn about training and norming observers, using student surveys, adjusting for the inflation of principal ratings, and cultivating early skill sets in teachers.
But probably my favorite new fact relates to improving observations. It turns out that more observations aren’t the key; more observers are. “When assessing tradeoffs between adding observers and adding observations, the evidence is fairly clear—adding observers gives the greater boost to reliability. Giving teachers three different observers, instead of the same observer for each round, significantly increases the reliability of observations.”
My only complaint about the report is really a complaint about an element of the underlying system: the names of the five rating categories, which run, in order, “Ineffective, Minimally Effective, Developing, Proficient, and Skillful.”
Give 100 reasonable people those names and ask them to rank the categories from worst to best, and I would happily wager that fewer than half would choose this exact order.
Complaining about the discrepancy between a classification title and its content may seem like semantics, but it’s more than that. We have such troubled evaluation systems, I believe, partly because we still don’t have honest conversations about effectiveness. By muddying what’s meant with these indecipherable category names, we contribute to the problem.
But this is a minor matter when compared to the serious strengths of the report.
What’s most exciting is that, unlike evaluation and tenure reform, which required new laws in most states, most state departments of education can singlehandedly (or with their state boards) alter certification rules through regulation.
That means an enterprising state chief could swiftly turn this report’s findings into policy. Don’t approve prep programs graduating candidates unprepared for that critical first year; make sure early professional development builds foundational skills; prioritize additional observers over additional observations; and make permanent certification contingent on proof of success.
I’m all but certain a number of states will take this report’s lessons to heart, and once again it will be said that TNTP influenced for the better our educator policies and practices.
This blog entry first appeared on the Fordham Institute’s Flypaper blog.
Last updated April 30, 2013