
How Should We Evaluate/Assess/Rate Teacher Performance? (Maybe Peer Review)

We live in a world of assessment; let’s take a look at sports. Every major league baseball team has a group of data wonks who collect bits and pieces of data and create algorithms to assess and predict future performance. Once upon a time we could quote batting averages, home runs, and earned run averages; now we’re overwhelmed by Wins Above Replacement (WAR), launch angle, etc. We live in the world of Sabermetrics (“A Guide to Sabermetrics Research”). Every sport has its own set of data used to assess player performance and to predict outcomes.

If we work out, we keep track of minutes on the treadmill and the number of pull-ups, dips, and deep knee bends; we can measure our performance. We can keep track on our iPhone or Apple Watch. If we play golf: has our handicap dropped? Or tennis: are we beating players we used to lose to?

Dancers and musicians practice with a coach (guided practice) and improve at their art.

Which raises the nature/nurture question: do some athletes and artists have encoded DNA that makes them better athletes or musicians, or does 10,000 hours of practice produce excellence? Grit and determination, or natural ability?

David Epstein explores the question in The Sports Gene: Inside the Science of Extraordinary Athletic Performance:

The debate is as old as physical competition. Are stars like Usain Bolt, Michael Phelps, and Serena Williams genetic freaks put on Earth to dominate their respective sports? Or are they simply normal people who overcame their biological limits through sheer force of will and obsessive training?

The truth is far messier than a simple dichotomy between nature and nurture. In the decade since the sequencing of the human genome, researchers have slowly begun to uncover how the relationship between biological endowments and a competitor’s training environment affects athleticism. Sports scientists have gradually entered the era of modern genetic research.

In his book Outliers, Malcolm Gladwell lays out the much-quoted “10,000 hours rule.” Simply put: gaining mastery requires 10,000 hours of “deliberate” practice.

The principle holds that 10,000 hours of “deliberate practice” are needed to become world-class in any field.

But a new Princeton study tears that theory down. In a meta-analysis of 88 studies on deliberate practice, the researchers found that practice accounted for just a 12% difference in performance in various domains.

In education, a 4% difference
In professions, just a 1% difference

In it, [the authors] argue that deliberate practice is only a predictor of success in fields that have super stable structures. For example, in tennis, chess, and classical music, the rules never change, so you can study up to become the best.

But in less stable fields, like entrepreneurship  [and teaching]… rules can go out the window… mastery is more than a matter of practice.

Teaching is a far more complex task. On one side is the teacher, with whatever skills s/he possesses; on the other side are twenty or thirty students with a wide range of life experiences (are they hungry, or bullied, or depressed?); and in the middle is the content you’re expected to transmit to the students: standards, a curriculum, or a program, none of which you played a role in selecting.

Almost ten years ago the Obama-Duncan administration decided that dense algorithms could be used to compare teachers to other teachers who are teaching “similar” students. The tool is called Value-Added Measurement, referred to as VAM; it was rolled out as “we can use results on standardized test scores to rate and compare teachers.” John King, at that time the NYS Commissioner, adopted the use of VAM, combined with supervisory observations, to assess teacher performance.

The pushback was vigorous. Chancellor Merryl Tisch convened a summit of experts from around the country to discuss the efficacy of using the VAM tool. The experts were crystal clear: VAM was never intended to assess the performance of an individual teacher. The Board of Regents agreed upon a four-year moratorium on the use of standardized test scores to assess teacher performance. Last week both houses of the state legislature passed a bill returning the question of teacher assessment to school districts, with considerable pushback from parents who felt districts would simply substitute another off-the-shelf test.

See my blog here

We should completely de-link teacher assessment from test results.

The Netherlands has one of the highest-achieving school systems in the OECD: 8,000 unionized public schools functioning like charter schools. The schools have extremely wide discretion in how they run. Read a detailed description here.

European school systems use an inspectorate system (see links in the blog here): the school supervisory authority sends teams of experts into schools to assess the functioning of the school.

Back in the 1990s and early 2000s, New York State sent Schools Under Registration Review (SURR) teams into schools for a deep dive into the functioning of the school; the teams produced highly specific “Findings and Recommendations” reports. I was the teacher union representative on many teams.

New York City conducts periodic Quality Review visits to schools, a type of inspectorate system.

Experienced educators conduct a two-day school visit. They observe classrooms, speak with parents, students, teachers, and school leaders. They use the Quality Review Rubric to help them examine the information they gather during the school visit.

After the school visit, a Quality Review Report is published on the school’s DOE webpage. The Quality Review Report rates the school on 10 indicators of the Quality Review Rubric. The report also describes areas of strength and areas of focus and has written feedback on six of the indicators. Information from this report is also used in the School Quality Snapshot.

The QR teams can be improved: they should be joint Department/Union teams, and the union should play a role in constructing the Quality Review Rubric.

As for the assessment of individual teachers, we shouldn’t fear peer review: respected colleagues providing feedback.

Let me say, I’m not hopeful. At a recent live-streamed town hall (by invitation only), the mayor, the chancellor and the chancellor’s crew met with parent and community leaders from the Bronx. To a question about the large number of schools in a district, the chancellor posited an additional deputy superintendent and added that the press would attack him for bloating the administration, and the press would be correct. Level upon level of supervision “monitors” data; educational decisions should be made in schools, not in distant offices. One parent was worried: she had been in her son’s 6th grade class and saw student work replete with spelling errors. The deputy chancellor suggested a Google spelling app; the parent sighed, “He’ll only want to play video games on the computer.” Maybe a sign the school has serious instructional issues?

Empowering schools and holding them accountable for their decisions makes much more sense than measuring and punishing. And, BTW, resources matter, they matter a great deal, and any school assessment should factor in “poverty risk load.” (See discussion here.)

Fighting over whether a teacher is “developing” or “effective” is insane. Maybe we should be working to create collaborative school communities in which school leaders, parents and teachers work together to craft better outcomes.