What if the educators making important decisions about schools and colleges are acting too much on their guts and not enough based on actual evidence? (Review of Howard Wainer, Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies, 2011)
The list of scientists who have rejected Value-Added Modeling (VAM) is long and growing. Howard Wainer has been parsing numbers for decades, and getting angrier and angrier.
I don’t know whether it is the age we live in, or the age I have lived to, but whichever, I have lately found myself shouting at the TV screen disturbingly often. Part of the reason for this may be the unchecked growth of the crotchety side of my nature. But some of the blame for these untoward outbreaks can be traced directly to the unremarkable dopiness that substitutes for wisdom in modern society. Ideas whose worth diminishes with data and thought are too frequently offered as the only way to do things. Promulgators of these ideas either did not look for data to test their ideas, or worse, actively avoided considering evidence that might discredit them.
The science simply does not support the concept. The tragedy is that the feds decided to require VAM-based teacher evaluations before any “beta,” any testing phase; they not only jumped into the pool themselves, they required that every state that wanted big bucks also jump into the pool.
In New York State only 20% of a teacher’s evaluation is based on student test scores; 60% of the assessment is based on supervisory observations. The state required school districts to select from a list of observation models: Danielson, Marzano, Marshall, and even a rubric developed by the state teachers union.
Just as the VAM student test score methodology is fatally flawed, so are supervisory observation evaluations. The entire idea rests on a fallacy – that supervisors in school A would give the same score as supervisors in school B. As I wrote in a previous blog, some principals are reluctant to give lower evaluations, since a poor rating reflects poorly on the principal, while others strictly apply the rubric.
A just-published study finds that higher-achieving students with better language skills are more likely to exhibit the behaviors rewarded on the Danielson and other scales.
School principals—when conducting classroom observations—appear to give some teachers an unfair boost based on the students they’re assigned to teach, rather than based on their own instructional savvy.
* Under current teacher evaluation systems, it is hard for a teacher who doesn’t have top students to get a top rating. Teachers whose incoming students are at higher achievement levels receive classroom observation scores that are higher, on average, than those received by teachers whose incoming students are at lower achievement levels, and districts do not have processes in place to address this bias.
* Observations conducted by outside observers are more valid than observations conducted by school administrators.
* The inclusion of a school value-added component in teachers’ evaluation scores negatively impacts good teachers in bad schools and positively impacts bad teachers in good schools.
We suspect that across New York State, teachers in high-poverty, low-tax districts are receiving lower APPR scores than teachers in high-achievement, high-tax districts – yet the commissioner has failed to release this type of analysis.
These flawed scores are used to claim that teachers in high-poverty schools are less capable than teachers in high-wealth, high-achievement schools.
Teacher assessment is not a science – the skills and experience of school leaders vary. I have sat with groups of principals watching videos and assessing lessons. Not surprisingly, we disagreed.
Other countries use inspectorate systems – “inspectors” who visit schools and assess both teacher and school effectiveness.
The well-respected Chicago Consortium on School Research conducted a detailed examination of an assessment system in which trained teams of supervisors and teachers observed teachers in selected schools.
… research-based evidence showing that new teacher observation tools, when accompanied by thoughtful evaluation systems and professional development, can effectively measure teacher effectiveness and provide teachers with feedback on the factors that matter for improving student learning.
Our problem is that Secretary Duncan is wedded to a system that is unsupported, deeply flawed, and rejected by experts across the spectrum; it is unlikely that he will throw himself at the feet of Randi Weingarten pleading for forgiveness. State after state timidly saluted and implemented systems, each trying to outdo the next, some basing as much as 50% of a teacher’s performance assessment on student test scores.
Who will bravely step up to the plate?
Which governor will reject the current system?
Too many governors and commissioners, to quote Wainer, “actively avoided considering evidence that might discredit them.”
The de Blasio-Farina-Mulgrew triumvirate may have the spine to reject flawed systems and move to a saner one – there are models – and in the world of politics we need more heroes.