
If VAM is Dead, What Comes Next? How Should Teacher Performance Be Assessed? The Governor/Legislature/Regents Try to Escape a Self-Constructed Abyss

If you toss a rock into a pool of feces, you never know who’s going to get splashed.

Right now the governor and members of the legislature might not smell too good.

Parents are angry and disillusioned, teachers have no confidence in the governor, and the education landscape is in disarray.

The Cuomo-appointed Task Force, which includes legislative leaders, is scrambling to find a solution at the edge of Dante’s Inferno, perhaps glancing at the inscription, “Lasciate ogne speranza, voi ch’intrate” (most frequently translated as “Abandon all hope, ye who enter here”).

How did our leaders create this yawning abyss? How did they anger a quarter of a million parents?

Foolishly, they allowed themselves to be lured down the (de)former path.

The (de)form side, the “disruptive innovation” agents, abjures the normal methods of change, i.e., research, pilot programs, building a constituency, and electing like-minded candidates. “Disrupters” bring about change by disrupting the existing situation:

disruptive innovation is an innovation that helps create a new market and value network, and eventually goes on to disrupt an existing market and value network.

Abolishing tenure, weakening or eliminating teacher unions, sharply raising standards (Common Core), and imposing a far more difficult examination: this is disruption theory in practice. By creating chaos, the (de)formers hope to skip over the usual processes; to be precise, they hope to blow up and sweep away the current system and build a new education system.

The New Teacher Project (TNTP), a reform-y think tank, issued a report, The Widget Effect: “…our school systems treat all teachers as interchangeable parts, not professionals. Excellence goes unrecognized and poor performance goes unaddressed. This indifference to performance disrespects teachers and gambles with students’ lives.”

The report supports a new evaluation system that will identify “poor performance” and fire the teachers responsible, as well as recognize excellence, a.k.a. merit pay.

Former Commissioner King and the governor led the charge for the Race to the Top dollars, which required a teacher evaluation system based on student performance. When not enough teachers were found ineffective, they changed the plan through legerdemain, using the budget process to change the education law.

To quote poet Robert Burns,

The best laid schemes o’ Mice an’ Men

          Gang aft agley,

An’ lea’e us nought but grief an’ pain,

          For promis’d joy!

The “best laid schemes” turned into a nightmare.

As the Task Force scrambles we might ask a fair question: How should teachers be assessed/evaluated/rated?

The current law uses two indicators: a growth model score, referred to as Value-Added Modeling (VAM), a dense mathematical algorithm that compares teachers to other teachers across the state who are teaching “similar students” (Title I eligible, ESL, SWD, class size); and the traditional supervisory lesson observation.

The highly regarded American Educational Research Association (AERA) recently issued a policy paper, “AERA Statement on Use of Value-Added Models (VAM) for the Evaluation of Educators and Educator Preparation Programs,” warning schools and school districts about the use of VAM:

There are considerable risks of misclassification and misinterpretation in the use of VAM to inform these evaluations … the education research community emphasizes that the use of VAM in any evaluations must satisfy technical requirements of accuracy, reliability, and validity. This includes attention not only to the construct validity and reliability of student assessments, but also to the reliability of the results of educator and program evaluation models, as well as their consequential validity. In sum, states and districts should apply relevant research and professional standards that relate to testing, personnel, and program evaluation before embarking on the implementation of VAM.

So, if VAM assessments are flawed, let’s go back to classroom lesson observations. However, classroom observations are also flawed: “...teacher performance, based on classroom observation, is significantly influenced by the context in which teachers work. In particular, students’ prior year (i.e., incoming) achievement is positively related to a teacher’s measured performance captured by the [Danielson Frameworks].”

In other words, teachers with higher achieving kids get higher observation scores. Another study of math teachers found “…math teachers with the highest-achieving students were nearly seven times more likely to get the top observation rating than teachers with the lowest-achieving students.”

A 2014 Brookings report explored lesson observations and growth models and reported:

* Under current teacher evaluation systems, it is hard for a teacher who doesn’t have top students to get a top rating. Teachers with students with higher incoming achievement levels receive classroom observation scores that are higher on average than those received by teachers whose incoming students are at lower achievement levels, and districts do not have processes in place to address this bias.

* The reliability of both value-added measures and demographic-adjusted teacher evaluation scores is dependent on sample size, such that these measures will be less reliable and valid when calculated in small districts than in large districts.

* Observations conducted by outside observers are more valid than observations conducted by school administrators.

* The inclusion of a school value-added component in teachers’ evaluation scores negatively impacts good teachers in bad schools and positively impacts bad teachers in good schools.

We also know that inter-school/inter-district observation reliability is low, in spite of the use of the same observation rubric in a school district; supervisory observers see lessons differently.

Value-Added Modeling (VAM) has been the subject of widespread criticism, although the scoring is free of bias. Supervisory observations, by contrast, are heavily dependent on the academic level of the students and are subject to observer bias. Ironically, the large errors of measurement in VAM, plus or minus ten or fifteen percent, benefit the teacher. An example: the cut score between effective and ineffective is a VAM score of 50, and the teacher receives a score of 45 with an error of measurement of plus or minus 10 points. Adjusted for the error of measurement, the teacher’s score falls between 35 and 55. Although the score itself fell in the ineffective range, the teacher receives an effective rating, because the top of the range clears the cut score. The statistical unreliability benefits the teacher.
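For readers who like to see the arithmetic, here is a minimal sketch of the example above. The function names are hypothetical, and the rule that a teacher is rated effective whenever the top of the error band reaches the cut score is an assumption drawn from the example, not the state's actual scoring formula.

```python
def score_interval(score, margin):
    """Return the (low, high) band around a score, given its error of measurement."""
    return score - margin, score + margin


def rating_with_margin(score, margin, cut_score):
    """Assumed rule: the teacher gets the benefit of the doubt, so the rating is
    'effective' whenever the top of the error band reaches the cut score."""
    low, high = score_interval(score, margin)
    return "effective" if high >= cut_score else "ineffective"


# The example from the text: cut score 50, teacher score 45, error of plus or minus 10.
low, high = score_interval(45, 10)
print(low, high)                       # 35 55
print(rating_with_margin(45, 10, 50))  # effective
```

Under this assumed rule, a teacher would need a score below 40 to be rated ineffective, which shows how much room a ten-point error of measurement leaves.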

Maybe it’s time to borrow from the (de)former playbook and “disrupt”; in other words, press the restart button.

How about a moratorium? Let’s put the emphasis on teacher evaluation on hold for a few years and take a look at teacher evaluation across the country and in the OECD nations. How do the highest achieving nations assess teacher performance? The OECD website includes a lengthy section on teacher appraisal; check out a study from the Netherlands.

Think the nations with the highest student scores on international tests (PISA) know something about assessing teacher performance?

We want a system that is fair, transparent, and supported by teachers and parents. We want a system that not only assesses but also builds teacher competency. And while we’re taking a deep dive into teacher assessment, let’s also take a look at student assessment.