At the top of the reform agenda: match teachers to pupil growth (VAM) and grade teachers accordingly. Identify “good” teachers and “bad” teachers; reward the “good” teachers and support, retrain and perhaps fire the “bad” teachers.
Research appears to support the impact of “good teachers”:
In their analysis of these data, Rivkin, Hanushek, and Kain (2005) found that teacher quality differences explained the largest portion of the variation in reading and math achievement. As in the Tennessee findings, Jordan, Mendro, and Weerasinghe (1997) found that the difference between students who had three consecutive highly effective teachers (again defined as those whose students showed the most improvement) and those who had three consecutive low-effect teachers (those with the least improvement) in the Dallas schools was 34 percentile points in reading achievement and 49 percentile points in math.
If the goal is to fill classrooms with “good teachers” and rid classrooms of “bad teachers,” how are we doing in achieving that goal? The New Teacher Project report “The Widget Effect” paints a grim picture: in the districts studied, virtually every teacher receives a satisfactory rating, and there is little help for new or struggling teachers.
All teachers are rated good or great.
Less than 1 percent of teachers receive unsatisfactory ratings, even in schools where students fail to meet basic academic standards, year after year.
Excellence goes unrecognized.
When excellent ratings are the norm, truly exceptional teachers cannot be formally identified. Nor can they be compensated, promoted or retained.
Professional development is inadequate.
Almost 3 in 4 teachers did not receive any specific feedback on improving their performance in their last evaluation.
Novice teachers are neglected.
Low expectations for beginning teachers translate into benign neglect in the classroom and a toothless tenure process.
Poor performance goes unaddressed.
Half of the districts studied have not dismissed a single tenured teacher for poor performance in the past five years. None dismiss more than a few each year.
To address the disconnect, states, led by the US Department of Education, began to design teacher assessment systems based on a combination of student growth scores (VAM) and principal lesson observations scored against a widely accepted rubric. In the growth score category teachers are measured against each other; in the lesson observation category they are measured against a standard, for example the Danielson or Kim Marshall frameworks.
In order to qualify for Race to the Top (RttT) and School Improvement Grant (SIG) dollars, states have to implement a teacher assessment system; in New York State it is referred to by the acronym APPR. The feds and states have spent hundreds of millions of dollars, creating jobs for psychometricians, economists and other experts who use extremely dense mathematical calculations: formulae that are the subject of sharp disagreement in the academic community.
“If these teachers were measured in a different year, or a different model were used, the rankings might bounce around quite a bit,” said Edward Haertel, a Stanford professor…. “People are going to treat these scores as if they were reflections on the effectiveness of the teachers without any appreciation of how unstable they are.”
As scholars bicker, the results from early adopters are surprising the teacher scolds:
In Florida, 97 percent of teachers were deemed effective or highly effective in the most recent evaluations. In Tennessee, 98 percent of teachers were judged to be “at expectations.”
In Michigan, 98 percent of teachers were rated effective or better.
Advocates of education reform concede that such rosy numbers, after many millions of dollars developing the new systems and thousands of hours of training, are worrisome.
“It is too soon to say that we’re where we started and it’s all been for nothing,” said Sandi Jacobs, vice president of the National Council on Teacher Quality, a research and policy organization. “But there are some alarm bells going off.”
New York State is both entering the first year of teacher evaluation and implementing the Common Core (CCSS) on the soon-to-be-administered state tests. A state teacher union officer and the Chancellor of the Board of Regents have sharply differing opinions.
“We’re giving the test before teaching the curriculum. That’s not what you should do,” said Maria Neira, the vice president for research and educational services for New York State United Teachers. “We’re rushing to do it, instead of doing it right.”
Merryl H. Tisch, the chairwoman of the state board of regents, counters that the state’s timeline for common-core implementation has been clear for more than two years, and that schools and districts would have to have been “living under a rock” to be surprised now.
“There is an enormous pushback against us because we are rolling out the common-core assessment, and some think we should have waited a year,” she said. “But as youngsters graduate high school right now, they’ve already hit a wall. Their reality is right now. We feel this is such an urgent issue, we have to roll it out now.”
A principal, only half-jokingly, tells me that teachers in her school joke about undercutting other teachers to improve their “grade.” With the release of the first round of scores in August 2012, principals were confused: in numerous instances the grades did not jibe with principal judgments. For probationary teachers the grades determine tenure, and few principals are willing to fight with superintendents for their teachers.
Around the country, school districts are developing multiple-measure systems that combine student test scores, usually through a growth model, with supervisory lesson observations. We know the student test score data is “unstable,” meaning it swings widely from year to year, and up to now supervisors have rated only a few percent of teachers as “ineffective.” The new multiple-measure assessments, combining growth scores and lesson observations, still find few “ineffective” teachers. What if we train supervisors and lead teachers on the use of an agreed-upon rubric and use supervisor/teacher teams to observe?
Well, guess what: the University of Chicago Consortium on School Research conducted a two-year research project:
“Rethinking Teacher Evaluation in Chicago: Lessons Learned from Classroom Observations, Principal-Teacher Conferences, and District Implementation” from the University of Chicago Consortium on School Research focuses on Chicago, but the lessons learned have significant applicability to districts across the country. The report is one of the first to provide research-based evidence showing that new teacher observation tools, when accompanied by thoughtful evaluation systems and professional development, can effectively measure teacher effectiveness and provide teachers with feedback on the factors that matter for improving student learning. This is especially relevant for those districts that are implementing the Charlotte Danielson Frameworks.
If we spent our time and dollars training supervisors and teachers around an agreed-upon rubric, we could develop an assessment system that not only identifies both “highly effective” and “ineffective” teachers but also gives each teacher feedback that, one hopes, leads to improved practice.
The annual New York City Department of Education Instructional Expectations document demands of principals “frequent brief lesson observations with meaningful feedback.”
If we know that lesson observations conducted by well-trained supervisors and teachers are effective, why do we spend mega-dollars on constructing systems based on mathematical algorithms that only a few can understand?
Either we don’t think we can train supervisors to observe lessons, or we don’t trust them, or we are besotted with the world of data. Take your pick.
Read an excellent interview with Charlotte Danielson here: http://www.ed.gov/Teacher-Evaluation-Systems
Or maybe we can try to emulate Finland:
Finland has developed a deeply thoughtful curriculum and then provided teachers ever more autonomy with respect to how they approach that curriculum; they have both a curriculum worth teaching to and the kind of autonomy in how they approach it that is characteristic only of the high status professions. Because Finland is at the frontiers of curriculum design to support creativity and innovation, teachers have a job that has many of the attractions of the professions that involve research, development and design. They are pushing their intellectual and creative boundaries. Because Finland is understandably satisfied with the job its teachers are doing, it is willing to trust them and their professional judgments to a degree that is rare among the nations of the world (a sign of which is the fact that there are no tests given to all Finnish students at any level of the system that would allow supervisors to make judgments about the comparative worth of individual teachers or school faculties.)
Teachers jump up and down with glee – why can’t we be like Finland?
I point out that if we were like Finland, the vast majority of you wouldn’t be teachers:
Finnish teacher education programs are extremely selective, admitting only one out of every ten students who apply. The result is that Finland recruits from the top quartile of the cohort.
In the good old USA, admission standards for college schools of education are low, and virtually all prospective teachers graduate and receive certification.
These are complex issues. The one point I’m sure of is that the current teacher assessment system will neither attract and retain “good” teachers nor rid the system of “bad” teachers. It will simply anger all teachers and pit principals against teachers, and principals and teachers against superintendents and state education departments: a pitiable formula for failure.
The Luddites are right. People trump mindless machines (and dense algorithms).