Flawed Evaluation Systems: How Should We Assess School/Teacher Performance? Who Will Have the Cojones to Admit Their Errors and Choose a Valid/Reliable/Stable System?

What if the educators making important decisions about schools and colleges are acting too much on their guts and not enough based on actual evidence? (Review of Howard Wainer, Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies, 2011)

The list of scientists who have rejected Value-Added Modeling (VAM) is long and growing. Howard Wainer has been parsing numbers for decades, and getting angrier and angrier.

I don’t know whether it is the age we live in, or the age I have lived to, but whichever, I have lately found myself shouting at the TV screen disturbingly often. Part of the reason for this may be the unchecked growth of the crotchety side of my nature. But some of the blame for these untoward outbreaks can be traced directly to the unremarkable dopiness that substitutes for wisdom in modern society. Ideas whose worth diminishes with data and thought are too frequently offered as the only way to do things. Promulgators of these ideas either did not look for data to test their ideas, or worse, actively avoided considering evidence that might discredit them.

The science simply does not support the concept. The tragedy is that the feds decided to require VAM-based teacher evaluations before a “beta,” a testing phase. They not only jumped into the pool, they required that every state that wanted big bucks also jump into the pool.

In New York State only 20% of a teacher’s evaluation is based on student test scores; 60% of the assessment is based on supervisory observations. The state required school districts to select from a list of observation models: Danielson, Marzano, Marshall and even a rubric developed by the state teacher union.
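To see why the 60% weight on observations matters, here is a minimal sketch in Python of how a weighted composite behaves. It is an illustration only, not the state's actual APPR formula. The 20% and 60% weights come from the figures above; the post does not say what makes up the remaining 20%, so the `other` component, the 0-100 scale, the sample scores and the name `composite_score` are all assumptions.

```python
# Illustrative only: this is NOT the official APPR calculation.
# Weights are in percent. 20 (test scores) and 60 (observations) come from
# the post; the remaining 20 ("other") is an assumed placeholder.
WEIGHTS = {"test_scores": 20, "observation": 60, "other": 20}


def composite_score(test_scores, observation, other):
    """Weighted average of three component scores on an assumed 0-100 scale."""
    total = (
        WEIGHTS["test_scores"] * test_scores
        + WEIGHTS["observation"] * observation
        + WEIGHTS["other"] * other
    )
    return total / 100


# A hypothetical teacher who scores 70 on every component.
baseline = composite_score(test_scores=70, observation=70, other=70)

# Same teacher, but the observer rates the lesson 10 points lower.
lower_observation = composite_score(test_scores=70, observation=60, other=70)

# Same teacher, but the test score component is 10 points lower.
lower_tests = composite_score(test_scores=60, observation=70, other=70)

print("drop from a 10-point lower observation:", baseline - lower_observation)
print("drop from a 10-point lower test score: ", baseline - lower_tests)
```

With these made-up numbers, a 10-point swing in the observation score moves the composite three times as far as the same swing in the test score component. Under this weighting, any systematic difference in how principals rate lessons would outweigh an equal-sized difference in test results.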

Just as the VAM student test score methodology is fatally flawed, so are supervisory observation evaluations. The entire idea is based on a fallacy – that supervisors in school A would give the same score as supervisors in school B. As I wrote in a previous blog, some principals are reluctant to give lesser evaluations because low ratings reflect poorly on the principal, while others strictly apply the rubric.

A just-published study supports the view that higher achieving students with better language skills are more likely to exhibit behaviors rewarded on the Danielson and other scales.

School principals—when conducting classroom observations—appear to give some teachers an unfair boost based on the students they’re assigned to teach, rather than based on their own instructional savvy.

* Under current teacher evaluation systems, it is hard for a teacher who doesn’t have top students to get a top rating. Teachers with students with higher incoming achievement levels receive classroom observation scores that are higher on average than those received by teachers whose incoming students are at lower achievement levels, and districts do not have processes in place to address this bias.

* Observations conducted by outside observers are more valid than observations conducted by school administrators.

* The inclusion of a school value-added component in teachers’ evaluation scores negatively impacts good teachers in bad schools and positively impacts bad teachers in good schools.

We suspect that across New York State teachers in high poverty-low tax districts are receiving lower APPR scores than teachers in high achievement-high tax districts – the commissioner has failed to release this type of analysis.

These flawed scores are used to claim that teachers in high poverty schools are less capable than teachers in high wealth/high achievement schools.

Teacher assessment is not a science – the skills and experience of school leaders vary. I have sat with groups of principals watching videos and assessing lessons. Not surprisingly, we disagreed.

Other countries use inspectorate systems – “inspectors” who visit schools and assess both teacher and school effectiveness.

See English system here, Swedish system here and French system here.

The well-respected Consortium on Chicago School Research conducted a detailed examination of an assessment system in which trained teams of supervisors and teachers observed teachers in selected schools.

… research-based evidence showing that new teacher observation tools, when accompanied by thoughtful evaluation systems and professional development, can effectively measure teacher effectiveness and provide teachers with feedback on the factors that matter for improving student learning.

Our problem is that Secretary Duncan is wedded to a system that is unsupported, a system that is deeply flawed, a system that is rejected by experts across the spectrum. It is unlikely that he will throw himself at the feet of Randi Weingarten pleading for forgiveness. State after state timidly saluted and implemented systems, each trying to outdo the next, some basing as much as 50% of a teacher’s assessment on student test scores.

Who will bravely step up to the plate?

Which governor will reject the current system?

Too many governors and commissioners, to quote Wainer, “actively avoided considering evidence that might discredit them.”

The de Blasio-Farina-Mulgrew triumvirate may have the spine to reject flawed systems and move to a saner system – there are models – and in the world of politics we need more heroes.

9 responses to “Flawed Evaluation Systems: How Should We Assess School/Teacher Performance? Who Will Have the Cojones to Admit Their Errors and Choose a Valid/Reliable/Stable System?”

  1. The link to France also goes to Sweden, not France.


  2. Will the adult-size Cuomo please stand up, please stand up, please stand up.


  3. “If they succeed in their destructive goal of crippling the landmark advancement—of 45 states committing to college and career ready expectations for all students—it will be a setback to the cause of greater equality in our schools,” King said. “And that would be a disgrace.”

    King’s rhetoric is becoming nonsensical and desperate.


    • Compare King’s recent “dig” to current headlines about re-segregation due to charters and testing, the ethnicity of teachers in poor areas, forcing charters to comply with civil rights laws and the utterly awful common core tests of late. He is spouting inflammatory nonsense. And we know from a recent story that broke about the revision of our state standards just prior to the adoption of common core that Tisch is a real piece of work.


  4. No matter if some one searches for his necessary thing, thus he/she desires to be available that in detail, thus
    that thing is maintained over here.


  5. Pingback: A Window: The Regents and the Commissioner Have an Opportunity to Craft Student Tests and Teacher Evaluation Plans That Are Meaningful to Families and Staffs | Ed In The Apple

  6. Pingback: Chancellor Tisch Will Not Seek Another Term: Some Suggestions – How To Begin to Win Back Parents and Teachers | Ed In The Apple

  7. Pingback: How Should We Evaluate/Assess/Rate Teacher Performance? (Maybe Peer Review) | Ed In The Apple
