Tag Archives: Teacher Evaluation

Report from the NYS Board of Regents Meeting (1/14-15/2019) [High School Grads as Substitutes and Punishing OptOut Schools]

I have been attending the monthly NYS Board of Regents meetings for a decade, and I tweet as fast as my fingers can tap @edintheapple.

The meetings begin in the ornate Regents Room, the walls lined with portraits of former chancellors: with one exception (Merryl Tisch), white men, mostly with facial hair. The Board of Regents has a long history, going back to the late eighteenth century. Under current law the Board is made up of seventeen members, one from each of the thirteen judicial districts and four at-large. The Regents are “elected” by a combined meeting of the legislature, effectively by the Democratic majority of the Assembly. The Regents serve a five-year term and the position is unsalaried.

The initial meeting is live streamed and archived (Watch 1/14/19 meeting here); the committee meetings follow throughout the day. The meetings run a full day Monday and a half day Tuesday. The committee meetings are not live streamed.

The meeting always begins with the chancellor asking a board member to offer comments, a “moment of reflection.” The audience stands; I suspect in the distant past it was a prayer: quaint.

The audience: state education department staff, lobbyists, representatives of education organizations (unions, school boards, etc.), and me. There is no opportunity for public comment, although during the breaks between meetings board members and audience members chat.

The meeting began with a presentation on a major new state initiative: this month, culturally relevant education, rebranded as Cultural Responsiveness and Sustainability Frameworks (see the PowerPoint here). David Kirkland, the director of the New York University Metro Center, is the lead author; David and a number of superintendents and other organization leaders spoke in support of the Frameworks. As is the practice, the Frameworks go out for public comment and then back to the board for adoption.

Another resolution was added to the agenda in response to the DeVos withdrawal of the Obama letter on student discipline and suspensions. The resolution was read to the room, a strongly worded rebuke to the DeVos policy. (Read the full resolution here)

…be it hereby resolved that the Board of Regents reaffirms its commitment to continuing its efforts to ensure that all students have equitable access to learning opportunities in safe and supportive school environments free from discrimination, harassment and bias, including reducing dependence on exclusionary school discipline and increasing equity in education for all students.

The board moved to the committee meetings. The first was K-12; among the items on the agenda was a change in the regulations dealing with the qualifications for substitute teachers. A small number of districts can’t find substitutes for absent teachers; we’re talking about districts with 3-5 schools. The controversial section is below: yes, the commissioner is proposing that the requirement be reduced to a high school diploma.

To address the Board’s concerns, the Department is proposing to require uncertified substitute teachers to hold at least an associate’s degree or its equivalent to ensure that they have a minimum educational background. However, if no eligible substitute teacher with an associate’s degree or higher, or its equivalent, is available after a good faith recruitment effort has been conducted, the school district may request from the district superintendent (for districts that are a component district of a BOCES and BOCES) or the superintendent (for school districts that are not a component district of a BOCES) a waiver allowing them to employ an individual with a high school diploma, or its equivalent.

The members were outraged and member after member rejected the proposal; after attempts to wordsmith the regulation, it was removed from the agenda. I wanted to raise my hand and make other suggestions: a district substitute teacher reserve, meaning hiring an additional teacher, or two, or whatever is necessary, on a permanent basis to serve as a sub; or use district office staff on a rotating basis; or, perhaps, try paying daily substitutes more money.

Next, an issue that has been bubbling for a week boiled over: the state will be releasing school accountability data in a few days, and NYS has, by far, the largest OptOut numbers among all states, about 20% statewide and over 50% on Long Island (BTW, a very small number in NYC, concentrated in a few high-achieving schools).

SED provided districts with a 62-slide PowerPoint and a dense algorithm used to identify “failing” schools. Regent Johnson, a former superintendent, attended the meeting in her judicial district and was outraged. She asked,

“Will OptOut schools be punished? Will schools be designated as failing schools due to OptOuts?”

She demanded of the commissioner, “Yes or No?” The commissioner sidestepped.

The board was not happy.

If you want to get into the weeds, read a detailed explanation of the accountability metrics here.

The bottom line: OptOuts within subgroups lowered the achievement metrics, pushing schools into the “failing school” (targeted or comprehensive school improvement) categories. The algorithm used by the state to designate schools has been discredited by Bruce Baker, a frequent writer on education finance issues.

On Tuesday morning I attended the Assembly Education Committee meeting under the new chair, Michael Benedetto, a retired career teacher. The first order of business of the legislative session was to pass, unanimously, with the support of all Democrats and Republicans, the teacher evaluation bill that returns the question of teacher evaluation to school districts. Read the full text of the bill and accompanying memo here. The bill will move to the full Assembly and I expect similarly speedy action in the Senate.

Legislators are more than happy to return the thorny question of teacher evaluation to local districts, and will be perturbed by the commissioner’s decision to “punish” OptOut schools when they were assured that opting out would have no negative consequences for schools.

When hordes of phone calls begin pouring in to legislator offices legislators will seek answers and this issue can become increasingly troublesome for the commissioner.

Hope this has been helpful. All questions and/or comments, of course, welcome.

“I Hate Being Observed! It’s a Waste of Time and too frequently is Harassment.”  (A view commonly held by teachers) Can teacher observations lead to constructive conversations?

A decade ago The New Teacher Project (TNTP) issued a report, “The Widget Effect,” that concluded,

* All teachers are rated good or great

* Professional development is inadequate

* Novice teachers are neglected, and

* Poor performance goes unaddressed

The report has had enormous, and toxic, impacts. The feds and states moved to assessments of teachers using student outcomes on standardized tests, value added measurements (VAM), a dense algorithm only understood by psychometricians.

For decades teachers were observed once or twice a year, or not at all: a mechanical process, a compliance chore. Teachers resented or feared being observed; supervisors found it burdensome. If you were lucky you were in a school in which the observation process was part of an ongoing discussion of the teaching/learning process.

New York State adopted an Annual Professional Performance Review (APPR) scenario; each school district in the state negotiated a process within strict regulations with the union. In New York City the system was imposed by the state commissioner. The process included VAM scores and observations using a rubric (Danielson, Marshall, Marzano, etc.). The pushback from the unions and parents grew, teachers in high-poverty schools received lower VAM scores, and the critics of the VAM methodologies grew and grew; finally the Board of Regents declared a four-year moratorium on the use of student test scores, and has just announced a one-year extension to create a new teacher evaluation tool.

While VAM scores are scorned, teacher observations by supervisors are equally flawed. Different supervisors rate the same lesson differently; there is no consensus. The use of a single rubric, in New York City the Danielson Frameworks, simply became another compliance task, a checklist. All observations are entered into a computerized database, ADVANCE, and principals who fall behind in their observations are dunned.

A principal related to me: all the principals in a district were divided into teams and observed classes in a school. The group facilitators asked the principals how they would rate the lesson. One principal asked: Shouldn’t we be discussing how we would handle the post observation conference?  The facilitator demurred, no; we’re only here to assess the lesson according to Danielson.

Danielson is not the Holy Grail, and, following Danielson to the letter does not guarantee successful student outcomes.

Early in the Danielson era I was at her presentation, at the end I asked,

“Supreme Court Justice Potter Stewart wrote he couldn’t define pornography; however, he knew it when he saw it, isn’t it the same with effective instruction?”

Charlotte disagreed.

She’s incorrect: after watching many hundreds of lessons you can “feel” a good lesson. Different classes of students require different instructional strategies; effective teaching is varying teaching techniques to suit the kids in front of you.

Attempting to use student test scores to assess teacher performance was disastrous, and, emphasizing the summative assessment rather than the formative assessment is racing down another wrong path; the light at the end of the tunnel is an oncoming locomotive.

An irony: the other Danielson book, “Talk About Teaching! Leading Professional Conversations” (2009), should be required reading for supervisors.

Danielson writes,

An important mechanism to promote teacher learning …. is that of conversation. Through focused and occasionally structured conversations, teachers are encouraged to think deeply about their work, to reflect on their approaches and student responses. And yet conducting such conversations requires skill. Many teachers assume that if their principal or supervisor wants to discuss the events in a classroom it means there is something wrong … by neglecting to engage in professional conversations with teachers, educational leaders decline to take advantage of one of the most powerful approaches at their disposal to promote teacher learning.

Conducting a post observation conference is a skill, and should not be a burdensome compliance chore for the observer and the observed.

Post observation conferences might follow a Socratic Method, engaging the teachers in a dialogue, or a few teachers might observe colleagues and jointly discuss the lesson among themselves with a facilitator. In my school the principal allowed us to substitute a peer observation system in lieu of traditional supervisory observations. Teachers worked in triads: Teacher A observed B, B observed C and C observed A, in the same week, each teaching a lesson on a similar topic. The teachers then met and engaged in a facilitated conversation around a template of questions; the “notes” were the observation report. The teachers who participated had never watched a colleague teach, and reflected deeply on their own practice.

The just-approved New York City teacher contract contains two changes to the teacher evaluation section. First, the number of observations is reduced.

… the contract approved this week also significantly cuts back how often teachers need to be observed under the city’s evaluation system. Top-rated teachers will receive only two classroom visits — down from three or four. For new teachers or those with low marks, observations are cut from a high of six to a low of three.

 And, new professional learning teams will support “school-based professional development committees to align PD to the observations conducted throughout the school year.”

  Professional development on evaluation

  • A professional learning team consisting of UFT and DOE representatives will plan and conduct annual training sessions on the implementation of the evaluation system by the last Friday in October. 
  • The professional learning team will also ensure that teacher development tools and resources will be developed and distributed, including resources regarding evaluation of specific school settings such as co-teaching, special education settings, ENL and physical education.
  • The professional learning team will provide support to school-based professional development committees to align PD to the observations conducted throughout the year. 

Is this meaningful change?

The union took a risk, convincing teachers that formative assessment, conversations, will make them into better teachers.  Maybe they will jump on board, maybe they will continue to close their doors and do what they do. Maybe the union is alienating members or maybe changing compliance-driven cultures to collaborative school cultures.

Unions are demeaned, the “right-wing” establishment spent years to get the Janus case before the court and maneuvered the court to get the “right” justices. So far, Janus seems to have motivated unions, teacher strikes in non-collective bargaining states, the public supporting teachers, and a voucher plan in Arizona soundly defeated.

Teachers can continue to win over the public by continuing to improve, as professionals, and improve the end product, student outcomes.

A friend always reminds staffs that the solution is in the room. Changing school cultures never begins with edicts from superintendents; it begins in teacher lunch rooms, in teacher rooms, it begins from the ground up. Yes, superintendents must seed the fields, must change from para-military attitudes to supporting collaborative cultures.

The union president and the chancellor took a risk: risk-taking can be the path to positive embedded change.

The UFT Contract Proposal: Can a Teacher Contract Rebuild Trust in Public Schools: An Aggressive Agreement Confronts Teacher Shortages, Teacher Collaboration and High Poverty/At Risk Schools

As Hurricane Michael whistled by the city, Mayor de Blasio, Chancellor Carranza and Union President Mulgrew announced a proposed union contract months before the mid-February contract expiration, and none too soon for a chancellor faced with one inherited crisis after another.

This afternoon hundreds of union delegates will convene to hear details, ask questions, and, after what I expect will be vigorous debate, vote on the contract. If approved, as I expect, it will move to the members, who, in a secret ballot, will vote to approve or reject the agreement. As with every contract there will be naysayers: not enough money, class size should have been addressed, etc. I expect an overwhelming approval by members. In 1995 the members did vote to reject a contract, a five-year contract with no raises in the first two years; months later virtually the same contract was approved.

Public employee negotiations are guided by the Public Employment Relations Board (PERB) regulations, and salary is governed by the principles of “Ability to Pay” and “Pattern Bargaining” (see an earlier blog for more detailed discussion).

The tentative 43-month contract provides a 2 percent salary increase on Feb. 14, 2019, followed by an increase of 2.5 percent on May 14, 2020, and 3 percent on May 14, 2021.  After the May, 2021 increase, the maximum teacher salary will jump to $128,657 from today’s high of $119,472. Starting teacher salaries will go from the current $56,711 to $61,070.
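As a rough check on those figures, the three raises can be compounded in sequence. The sketch below is plain arithmetic, assuming each increase applies to the full salary produced by the previous one; the function name is illustrative and nothing here is taken from the contract text beyond the quoted percentages and salaries.

```python
# Percentage raises quoted in the tentative contract, in order.
RAISES = [0.02, 0.025, 0.03]

def compound(salary, raises=RAISES):
    """Apply each raise to the salary produced by the previous raise."""
    for r in raises:
        salary *= 1 + r
    return salary

top_after = compound(119_472)     # today's maximum salary
start_after = compound(56_711)    # today's starting salary
total_raise = compound(1.0) - 1   # cumulative raise as a fraction

print(round(top_after), round(start_after), round(total_raise * 100, 2))
```

Under that assumption the cumulative raise is a little under 7.7 percent, and both results land within a few dollars of the salaries quoted above.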

UFT-represented employees will still receive the lump-sum payments scheduled for this October and the following two Octobers that were negotiated in the 2014 contract.

For the chancellor the proposed contract settlement is crucial; he was drowning in inherited crises that have been bubbling for years.

The education headlines: thousands of students left stranded as school buses failed to arrive; a $100,000 rabbi with a $50,000 driver, working for the Department to supervise publicly funded buses for yeshivas; staggering numbers of special education qualified students without services (Read NYTimes story here); and, of course, the lingering and contentious specialized schools segregation conundrum, pitting students of color against Asian students.

The contract is complex and the implementation difficult, it attempts to blend the needs of the union and the needs of the school system.

Teaching shortages and teacher retention: enrollment in teacher education programs is sharply down, and poor teacher retention in high needs schools, where more than half the teachers leave within five years, creates a cycle of constant teacher shortages. Are we selecting the “proper” candidates? The department will begin to pre-screen potential candidates. When I worked as a consultant in the Chancellor’s District in the late 90’s we pre-screened candidates before we sent possible candidates on to be interviewed by principals; a similar model will be implemented for selective schools.

A subset of high needs schools, mostly in the Bronx, will have the opportunity to participate in a carefully structured local decision-making model.

 The tentative contract establishes a Bronx Collaborative Schools Model for up to 120 high-needs schools, mostly in the Bronx. Schools will be identified based on staff turnover, student achievement and other criteria, but the chapter leader and the principal must both agree to participate. These schools will form joint labor-management committees and be provided with support to make significant changes in school operations. Each school will make its own decisions on how to improve school climate, reduce teacher turnover and increase academic achievement. The changes could include an additional $5,000 to $8,000 per year for teachers in a hard-to-staff license or title.

A continuing frustration has been obdurate school leaders who cannot/will not engage staff in the school decision-making process. The contract addresses the issue,

The agreement will expand the authority of school-based UFT consultation committees, empowering them to raise and address issues of professional development, basic instructional supplies, curriculum, inadequate space and workload. Those issues will be raised first at the school, but the chapter leader can escalate them to the district and central levels if resolution isn’t reached. The contract also provides stronger protection for members who voice concerns from attempts by a supervisor to retaliate against or harass them.

The tentative settlement includes a major victory for paraprofessionals: due process rights not previously in the agreement.

In a major victory for paras, the tentative contract provides due-process rights for paras that are similar to those of teachers. You remain on the payroll while the case is adjudicated.

ATRs, teachers who were excessed from their schools into a pool of teachers without a permanent placement, will be placed by local superintendents into vacancies from day one of the school year.

The number and length of teacher observations will be reduced for “effective” and “highly effective” rated teachers.

Read a Contract at a Glance here and full text of the Memorandum here.

In Los Angeles, a city with an elected central board, teachers have voted to strike and are currently in state-directed mediation; the central board hired a hedge fund despoiler as superintendent who wants to turn all of Los Angeles into a portfolio model driven by charter school choice. In Chicago, a mayoral control city, teachers have been battling their mayor and their governor for years, with very limited success as schools continue to be closed.

Elections have consequences, huge consequences for teachers and schools; negotiating a contract in a nation led by Trump and Betsy DeVos is beyond challenging.

Both leaders, Mulgrew and Carranza have taken risks: can they create a truly collaborative climate at the school level; can they build a culture of trust?

Marc Tucker, in Education Week, explores the loss of trust in our schools and how we can rebuild trust,

… the distrust of school administrators by teachers, the distrust of teachers unions by governors and legislators, the distrust of state government by school district administrations, the distrust of parents by school professionals and vice-versa…well it seems to go on and on.

Where did trust go?  How can we get it back?

A union leader and a school district leader have used the vehicle of a collective bargaining agreement to address issues that hopefully will begin to rebuild trust in our public schools.

Presidential Polling and Measuring Teachers: The Misuse of Data and the Gullibility of the Public

We are addicted to predicting winners: at race tracks the betting public creates the odds for each horse in a race, every Sunday the odds makers in Las Vegas predict winners and the numbers of points by which teams will win based on previous records and a plethora of player related achievement numbers.

This is called gambling.

Data can be used for more respectable purposes, namely predicting winners in elections, another type of race, a political race, as well as predicting “success” in teaching by measuring increases in student achievement attributed to individual teachers.

Each day the New York Times online publishes odds, in the form of a percent, for the presidential election – on Sunday Hillary was “leading” Trump 90% to 10%, on Wednesday 88% to 12%. The section is called Upshot and the site explains the methodology. One of the sources is the Princeton Election Consortium and, if you want to get into the weeds, you can read about “symmetric random drift” and “setting a Bayesian prior,” probably well beyond the interest and knowledge of the vast percentage of “ordinary” folk.

The essential problem is the source data, the actual polling. Lo those many years ago we learned we had to create a stratified random sample, a microcosm of the population we wished to poll. An example is the upcoming September 13th Democratic primary election in the 65th Assembly District in Manhattan, the seat formerly held by Sheldon Silver, awaiting sentencing by the feds. There are six contenders for the seat, and a close look at the population in the district is revealing:

Population figures, though, do not always translate into actual voters. According to 2014 census data, there were 32,952 Asian and South Asian citizens of voting age in the district. But only 15,284 were registered Democrats, said Jerry Skurnik, a partner at Prime New York, which compiles voter information. Of those, only 5,500 voted in the last three primaries.

Far fewer registered Hispanic and Portuguese Democrats voted in those three previous primaries, said Mr. Skurnik, who analyzed election data relating to social groups based on surnames. Of 11,675 registered voters, only 4,101 participated in a previous primary election, he said. Those of “European background,” including English, Irish, Italian and likely-to-be-Jewish voters, were the largest group, at 20,496 registered Democrats, with 8,205 showing up in previous primaries.
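To make the sampling point concrete, the quoted counts can be turned into turnout rates and shares. This is only a sketch: it assumes the three groups' figures are directly comparable (the article describes them in slightly different terms), it covers only the groups named above rather than the whole district, and the names `GROUPS` and `shares` are illustrative.

```python
# (registered Democrats, voted in previous primaries) as quoted above.
GROUPS = {
    "Asian and South Asian": (15_284, 5_500),
    "Hispanic and Portuguese": (11_675, 4_101),
    "European background": (20_496, 8_205),
}

def shares(groups):
    """Return (turnout rate, share of registrants, share of voters) per group."""
    total_reg = sum(reg for reg, _ in groups.values())
    total_voted = sum(voted for _, voted in groups.values())
    return {
        name: (voted / reg, reg / total_reg, voted / total_voted)
        for name, (reg, voted) in groups.items()
    }

RESULT = shares(GROUPS)
for name, (turnout, reg_share, voter_share) in RESULT.items():
    print(f"{name}: turnout {turnout:.1%}, "
          f"{reg_share:.1%} of registrants, {voter_share:.1%} of voters")
```

A sample drawn in proportion to registration would not match the mix of people who actually vote, because turnout differs by group; a pollster has to stratify on likely voters, not on population or registration.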

Randomly selecting names from census data does not produce a stratified random sample. Selecting names from prime voter lists is a major step forward; however, how many potential prime voters don’t answer the phone or decline to participate in the poll? Do the participants constitute a “stratified random sample”? I understand that fewer than 10% of those called actually respond to a polling call.

In June the United Kingdom (England, Scotland, Wales and Northern Ireland) voted in the Brexit referendum, a vote to decide whether the UK would remain in the European Union. Extensive polling indicated that the Brits would vote to remain by 52-48; when the dust cleared the Brits had voted to leave 52-48 – what went wrong?

An experienced pollster commented on “what went wrong.”

The difference between survey and election outcome can be broken down into five terms:

  1. Survey respondents not being a representative sample of potential voters (for whatever reason, Remain voters being more reachable or more likely to respond to the poll, compared to Leave voters);
  2. Survey responses being a poor measure of voting intentions (people saying Remain or Undecided even though it was likely they’d vote to leave);
  3. Shift in attitudes during the last days;
  4. Unpredicted patterns of voter turnout, with more voting than expected in areas and groups that were supporting Leave, and lower-than-expected turnout among Remain supporters.
  5. And, of course, sampling variability.

In spite of extensive polling by “the best and the brightest” the pollsters were off by four percentage points!!

Howard Wainer, a statistician with vast experience, explains:

… the response rate for virtually all of the polls ranges from 8 to 9 percent. Yes, more than 90% of those asked for their opinion hang-up. Do you know anyone who chooses to answer the phone? Who? Do you? Professional pollsters never talk about this because it means their paychecks.

The only way to use such polls is to make heroic assumptions — most commonly what is assumed is ‘ignorable nonresponse’ — that is that those who don’t respond are just the same as those who do — clearly nonsense.

Even such a sensible person as [pollster] Nate Silver has to make do with terrible information. Yes, drawing inferences from flawed data are usually better than doing it with no information at all, but it is hardly enough to keep from being terrified.

The one aspect of this in which I find some solace is that the polls may be self-fulfilling. This is seen in the shrinkage of donations to Republicans.
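A small worked example shows why “ignorable nonresponse” matters. The response rates below are hypothetical, chosen only to sit in the 8 to 9 percent range mentioned above, and the function name is illustrative; the sketch simply computes what a poll would report if one side's supporters were slightly more willing to answer.

```python
def observed_share(true_share, rate_a, rate_b):
    """Share of respondents backing side A when A's supporters
    respond at rate_a and everyone else responds at rate_b."""
    a = true_share * rate_a
    b = (1 - true_share) * rate_b
    return a / (a + b)

# Hypothetical: Remain truly at 48%; Remain voters answer 9% of calls,
# Leave voters answer 8% of calls.
POLL_REMAIN = observed_share(0.48, 0.09, 0.08)

# If both sides respond at the same rate, the poll recovers the true share.
SAME_RATES = observed_share(0.48, 0.09, 0.09)

print(f"{POLL_REMAIN:.1%}", f"{SAME_RATES:.1%}")
```

With a one-point difference in willingness to respond, a side that is actually losing 48-52 shows up in the poll at just under 51 percent. No amount of additional phone calls fixes that.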

Although it is an unintended consequence, polling results influence voters – polls discourage voters who are on the trailing side and attract voters who want to be on the winning side – the bandwagon effect.

The only absolute winners are the pollsters who receive fees for parsing out the results.

Attempts to use dense mathematical algorithms to assess teacher performance face the same core issue. Value Added Measurement (VAM) purports to compare teachers who are teaching similar students, e.g., Title 1, English language learners, special education, etc. The formula creates a score for each teacher on a 1 – 100 scale so that teachers can be compared. The problem is not the dense formula – the issue is that teachers teach different students each year and the VAM scores have high errors of measurement and swing widely from year to year. A score with an error of measurement of plus or minus fifteen points means the teacher’s score falls within a thirty-point range. The following year the score may be substantially higher or lower, and the entire system is predicated on student tests that may be fatally flawed.
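The thirty-point range is easy to see with a few lines of arithmetic. This sketch treats the error of measurement as a hard bound of fifteen points, which is a simplification, and the teacher scores used are made up for illustration.

```python
def score_range(score, error=15, low=1, high=100):
    """Range a reported score could fall in, clipped to the 1-100 scale."""
    return max(low, score - error), min(high, score + error)

def distinguishable(score_a, score_b, error=15):
    """True only when the two teachers' ranges do not overlap."""
    lo_a, hi_a = score_range(score_a, error)
    lo_b, hi_b = score_range(score_b, error)
    return hi_a < lo_b or hi_b < lo_a

print(score_range(50))            # (35, 65)
print(distinguishable(45, 60))    # False
print(distinguishable(30, 70))    # True
```

Two hypothetical teachers scored 45 and 60 cannot be told apart, because their ranges overlap; only very large gaps survive the error band.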

If the stratified random sample is flawed or the test is flawed all conclusions emanating are flawed.

The other method of assessing teacher performance is supervisory observations, which may be helpful in improving teacher performance; however, they have no inter-rater reliability.

An irony is that there are numerous examples of low scores from supervisors and considerably higher VAM scores. VAM scores, although deeply flawed, in many cases protect teachers from low observational scores that may be biased.

Polls are a photograph, a moment in time based on available data that might very well be flawed or change dramatically in the days or hours before the “final” poll, the election.

Value Added Measurements have enriched testing companies, confused and angered teachers and parents and created a quixotic quest (“…revive chivalry, undo wrongs, and bring justice to the world”) that is impossible to fulfill.

We are gullible and accept complex formulas as truth. If an explanation is filled with abstruse Greek letters and symbols it must be accurate.

Australia has compulsory voting, so polling there is probably far more accurate; in the United States local voting participation is commonly below 50%, and the voters vary from election to election. The only accurate poll is the election.

If teachers taught the same students every year and the tests met statistical standards of validity, reliability and stability the VAM scores might be reasonably accurate.

Bottom line: polling is an informed guess and VAM scores are of little value.

The Politicization of State Tests: Creating Tests in Which “All Students in New York State Are Above Average”

When the dust cleared the greatest ally to the anti-testing clique was (roll of drums!!!)  MaryEllen Elia, the New York State Commissioner of Education.

The deeply flawed state tests (“All children are above average”) reignited the argument – why do we have state tests at all (aside from the federal requirements)?

Statewide ELA test scores jumped by around 7% – although the racial achievement gap remained the same.

A magic potion, incompetence or simply political legerdemain?

A little review: in September, 2015 Governor Cuomo reconvened a blue ribbon panel, actually a process to repair the Governor’s foolhardy attacks on teachers and parents. In 2014 it appeared that Cuomo had a clear path to the Democratic nomination for his second term and deep pockets for the November general election. Seemingly out of nowhere Zephyr Teachout, a law professor at Fordham, challenged Cuomo for the Working Families Party spot on the ballot and challenged Cuomo in the Democratic primary. While the teacher union made no endorsement, some members and locals were on the Teachout side. After defeating Teachout and Rob Astorino, his Republican challenger, Cuomo decided to punish teachers. He cozied up to the charter school folks, used the budgeting process to tack on legislation to extend teacher probation, and was nastier than usual. NYSUT, the statewide teacher union, responded with a series of aggressive TV ads and the opt-out movement was created; 20% of kids opted out of the 2015 state tests.

Cuomo’s popularity rating tumbled.

I suspect clearer heads prevailed.

The purpose of the Task Force was to guide education policy from afar and place the Board of Regents and the commissioner in the foreground. The recommendations were more than recommendations; they were a pathway for state education policy. (Cuomo: This is the endgame – you figure out how to get us there)

The Task Force Report (Read here), which was released in December, contained twenty-one recommendations; the last was a moratorium on the use of state tests to evaluate principals and teachers for four years, applauded by the teacher union. The recommendations called for a thorough review of the Common Core Standards, with teachers included in every step of the process.

Recommendation 15: Undertake a formal review to determine whether to transition to untimed tests for existing and new Standardized tests aligned to the standards. It was not controversial and garnered little, if any, discussion; perhaps a pilot in a few schools and school districts across the state.

Surprisingly, very surprisingly, without any discussion with the Board of Regents, the Commissioner announced that the 2016 state tests would be untimed.

The January announcement, entitled “Changes for the 2016 Grades 3-8 ELA and Mathematics Tests” begins,

This memo outlines changes made as a result of feedback from the field:

* Greater involvement of educators in the test development process

* Decrease in the number of test questions, and

* A shift to untimed testing

The announcement came from Angela Infante, Deputy Commissioner, Office of Instructional Support and Peter Swerdewski, Assistant Commissioner, Office of State Assessment.

The state document states, “…students will be provided with as much time as they need.” No pilot, no transition, jumping off the diving board into the pool, and, the state made no attempt to identify students who took additional time.

The scores soared. The state commissioner, in the Daily News, admits the scores are “not exactly a perfect comparison”:

After widespread opposition to the difficulty of the tests erupted in 2015, state education department officials shortened the exams for 2016 and eliminated time limits.

“Because of the changes in testing, it’s not exactly a perfect comparison,” Elia said. “And even with the increases this year, there remains much work to be done.”

The state spent many millions of dollars purchasing tests, and teachers and students spent months on test prep, to collect data from what turns out to be a non-standardized test. A test that might not even meet federal requirements, although I’m sure the feds will simply ignore the faux jump in scores.

Was the test itself “harder” or “easier”? Many months down the road a Technical Advisory Committee (TAC) will release a report, hundreds of pages of dense analysis that few will read and fewer will understand.

The basic questions: are the results of the test useful?  Can they be compared with the previous year? Can schools and school districts be compared? And, at the top of the list: are the schools in New York State making academic progress?

Howard Wainer, a Distinguished Research Scientist, the author of innumerable books and articles, an internationally recognized expert writes,

Because of the changes this year’s scores can’t be compared to last year’s and because of the untimed nature of the test (and there being no record of how long anyone took) you can’t compare scores of students who took it this year with one another. It is, in no uncertain terms, an unstandardized test.

This test is akin to measuring children’s heights but allowing some students, we don’t know who, to stand on a stool, we don’t know how high, and then declaring some taller than others.

Fred Smith, another testing expert, writing in City Limits, had doubts about the validity of the test before the test administration.

Either the state education psychometrician is lacking in competence, or knew that by adopting untimed tests scores would likely jump – either is unacceptable.

If the state continues down the same path, retaining the untimed tests, even if it keeps track of students who take extra time, and the amount of extra time, we will once again be comparing apples to oranges. Kids who take extra time or choose not to take extra time may not be the same kids as this year – we simply can’t know.

Will states across the nation also jump on the untimed tests bandwagon?

In the politicized world of education the charter school folk and their acolytes beamed at higher scores, of course, we have no way of knowing why charter school scores were generally higher than public schools, and, the pro-charter print media crowed. Mayor de Blasio and Chancellor Farina also took a victory lap, and the Mayor immediately claimed the scores were proof that mayoral control be made permanent.

Board of Regents Chancellor Rosa reminded us it’s not time for a victory lap, unfortunately everyone else is milking the results – de Blasio and Farina, the charters and principals and teachers are breathing a sigh of relief.

A perverse kind of victimless crime: except for the kids who were tortured preparing for a non-standardized test.

Although the law has changed, and No Child Left Behind has been replaced by the Every Student Succeeds Act, the requirement for annual testing in grades 3-8 remains. The Leadership Conference, an umbrella group representing the major civil rights organizations across the spectrum, has strongly supported the accountability requirements, aka testing and reporting scores by subgroup, and the law is not changing.

Testing is here to stay.

The US Department of Education has announced they will be selecting six or so states or consortiums of states to play with alternate assessments.

The anti-testing crowd points to the new law and the testing kerfuffle in New York State: why not move to portfolios and performance tasks to replace current testing? This is not a new idea.

Vermont spent a decade working to create an assessment system based on portfolios, and after an external report pointed to fatal flaws, abandoned the effort.

…report by the RAND Corporation … found that the “rater reliability” in scoring the portfolios–the extent to which scorers agreed about the quality of a student’s work–was very low. The researchers urged the state to release the assessment results only at the state level.

Daniel M. Koretz, a senior social scientist at RAND and the report’s author, said the low levels of reliability indicate that the scores are essentially meaningless, since a different set of raters could come up with a completely different set of scores.

Can thousands of teachers be expected to rate portfolios the same?

The portfolio process was expensive and extremely time consuming, and there is no guarantee the portfolio work was not “assisted” by parents or others.

Yes, portfolios and performance tasks are effective classroom tools and in the perfect world might be a way of assessing student progress, in the real world, the world in which we live, it is not reasonable to expect inter-rater reliability.

The anti-testing movement will not disappear and the opt-out movement is alive.

What is absent is leadership – Arne Duncan drove us down a path for seven years that divided education: reformers versus deformers, marketeers versus public schools, unions versus the hedge funders: education is bitterly divided. Will the next president nominate an education leader who can bring together the disparate constituencies?

Education is adrift and the unstandardized testing regimen in New York State is a prime example.

Off to Minneapolis: Preparing for the American Federation of Teacher Convention: Will the Bernie and Hillary Supporters Bond?

On Monday the American Federation of Teachers will celebrate its hundredth anniversary at its biennial convention, this year in Minneapolis. About 3,000 teachers, school-related personnel and nurses will spend four days setting policy for the national union, listening to a range of speakers and, on Monday afternoon, meeting the “presumptive” Democratic nominee Hillary Clinton. (You can watch on the AFT.org website).

National conventions are always fascinating, an opportunity to meet teachers from around the nation. Chicago (CTU-Local 1) has been at war with Rahm Emanuel, their mayor, with another strike possible in September. California appears to be making positive changes away from endless testing, or, are they creating a dense accountability system – talking with teacher trade unionists from across the nation is always enlightening. I will be meeting teacher union guests from other countries. Recently I was speaking with teachers and school leaders from Austria: How do you become a principal in Austria? “You belong to the right political party.” Are teachers involved in hiring staff? (Odd look) “No, neither is the principal, teachers are assigned by the bureaucracy, and have lifetime tenure after a few years. We needed a history teacher, they sent us a gym teacher, and our system is totally top down.” BTW, Austria scores above average on PISA assessments (See here).

The convention schedule is packed full of meetings – first meeting 7 am Monday morning. The delegates will debate changes to the AFT constitution and bylaws and debate, in committees, ninety-one resolutions submitted from locals around the country. Linda Darling-Hammond will lead a discussion of teacher assessment and the new federal Every Student Succeeds Act (ESSA) that replaces No Child Left Behind (NCLB). The US Department of Education has just released draft regulations – no question the regulations and the possibilities for innovation pilots will be discussed (See Education Week discussion here).

On the convention floor there are multiple microphones (usually six, seven or eight) scattered around the arena. Any delegate can jump up to a microphone to support, oppose or amend a resolution. The committees, after debating the resolutions, set priorities, the highest priority resolutions must be debated on the floor – there are thirteen committees – the top three priorities must reach the floor.

The Democratic Platform on Education was set last week. The original platform reflected the de-reformers, led by Democrats for Education Reform (DFER); a coalition led by Bernie supporters and Randi Weingarten made significant changes (Read details here), angering the DFER faction, who are supporters of the Duncan-King policies.

A theme of the convention will be bonding the Hillary and the Bernie acolytes and building a teacher-led Hillary campaign across the nation. Not an easy task, since passions were high during the lengthy campaign; considering the Trump alternative, one would hope the Bernie folks will jump on board.

The latest polls, if you have any confidence in polls, predict a very close election (Read polling results here).

I’ll be blogging from Minneapolis – stay tuned.

Magic Bullets Equal Duds: Why Do Top-Down Educational Initiatives Rarely Succeed? Will ESSA Change the Face of Education?

I asked an astronomer friend whether the Juno spacecraft would find life under the frozen oceans of Ganymede, one of the moons of Jupiter. He answered,

The frozen ocean is essentially the surface, it’s thought that the watery mantle of Ganymede might be within the range tolerated by extremophiles, but again, a lot of speculation has been done. 

Much heat and little light, or, when Sagan was asked “yeah, but what are your gut feelings?” he replied “I don’t think with my gut.”

Science is a process of enforced intellectual rigor (enforced by peer review) which requires going from known data toward new understandings. We fill in the pages of a blank book with observations, measurements, and analysis, and then try to elucidate new models of how nature works. 

Going from the “already filled in book” to elicit behavior changes is the province of religion. 

Sadly, education policy-making is in the realm of faith, not science.

The reformers abjure “enforced intellectual rigor” and make sweeping decisions that impact millions of students based upon the absence of peer reviewed “observations, measurements and analysis.”

The current and former US Secretaries of Education support a portfolio system of schools – public and charter schools competing with each other for students – the competition, they argue, will raise student achievement in both public and charter schools. The belief is loosely based on the theories of Nobel Laureate economist Milton Friedman,

 In a famous 1955 essay, Friedman argued that there is no need for government to run schools. Instead, families could be provided with publicly financed vouchers for use at the K-12 educational institutions of their choice. Such a system, Friedman believed, would promote competition among schools vying to attract students, thus improving quality, driving down costs, and creating a more dynamic education system.

The current day reformers cannot point to any evidence and, in fact, the current system of public and charter side-by-side schools has not “raised all boats”; it may actually diminish student achievement.

In New York City another example is the rekindling of the “Reading Wars.” Chancellor Farina is a close friend of Lucy Calkins and “balanced literacy,” her approach to the teaching of reading. Most experts are sharply critical of the Calkins approach and support the use of phonics to teach reading. Friendship rules: the chancellor supports her friend (Read a discussion of the “Reading Wars” here).

Will a teacher evaluation system based on student test scores sort the best and the worst teachers and lead to higher student achievement?  Once again, there is no evidence, and, in fact, scholars tell us that value-added measurement is highly inaccurate and inappropriate for measuring teacher competency.

To make the realm of policy creation and implementation even more depressing, when schools and school districts attempt to use the “wisdom and knowledge of experts” the attempts fail.

Anthony S. Bryk, Louis M. Gomez, Alicia Grunow, and Paul G. LeMahieu, in Learning to Improve: How America’s Schools Can Get Better at Getting Better, argue,

… there is no universal mechanism in education for transforming the wisdom and knowledge experts accumulate as they work into a broader professional knowledge base … well-intentioned educational reforms across the ideological spectrum were unsuccessful because they were formed around a novel solution (such as the small schools movement, etc.,) rather than a practitioner-driven problem and were imposed from above without attention to the ways local conditions might require adaptation.

To address these two challenges, the authors argue that practitioners, policy makers, and researchers should collaborate across traditional organizational boundaries to engage in ongoing disciplined inquiry.

(Read a detailed description of the book here)

The authors lay out what they call “The Six Core Principles of Improvement”

  1. Make the work problem-specific and user-centered.

It starts with a single question: “What specifically is the problem we are trying to solve?” It enlivens a co-development orientation: engage key participants early and often.

  2. Variation in performance is the core problem to address.

The critical issue is not what works, but rather what works, for whom and under what set of conditions. Aim to advance efficacy reliably at scale.

  3. See the system that produces the current outcomes.

It is hard to improve what you do not fully understand. Go and see how local conditions shape work processes. Make your hypotheses for change public and clear.

  4. We cannot improve at scale what we cannot measure.

Embed measures of key outcomes and processes to track if change is an improvement. We intervene in complex organizations. Anticipate unintended consequences and measure these too.

  5. Anchor practice improvement in disciplined inquiry.

Engage rapid cycles of Plan, Do, Study, Act (PDSA) to learn fast, fail fast, and improve quickly. That failures may occur is not the problem; that we fail to learn from them is.

  6. Accelerate improvements through networked communities.

Embrace the wisdom of crowds. We can accomplish more together than even the best of us can accomplish alone.

 In other words, there are no magic bullets. The “answer” is not the program; the answer is the competency and cooperation of the practitioners at the school, district and university level.

By competency I mean the ability to collaborate within and across schools, the ability to understand data and convert data into classroom practice, to become reflective practitioners. The New York City-based Progressive Redesign Opportunity Schools for Excellence (PROSE) encourages schools to break free of perceived or real constraints, to craft research-based solutions at schools with the guidance and support of labor and management.

The International Network supports twenty schools, fifteen in New York City, that work with English language learner high school students who have been in the country four years or less. The six year graduation rates match all other schools, and the schools share instructional practices.

How do we seed fertile soils? How do we prepare teachers and school leaders to use peer reviewed research to drive actual practice? And, vitally important, how do we create district leadership that supports schools and does not constantly chase the magic grail, that magic bullet that has never existed?

There are highly successful schools, succeeding, frequently under the radar while schools with similar populations struggle.  Unfortunately the most successful principals, and occasionally superintendents must resort to practicing creative resistance, smiling, nodding, and continuing to do what actually works in spite of “higher ups” that chase that elusive secret sauce.

The new Every Student Succeeds Act (ESSA) devolves power from the feds to the states; states have until the spring of 2017 to create plans to address struggling schools: will states simply replicate the failed federal programs or actually create creative approaches to school improvement?

The New York State Legislature Adjourns with a “Whimper,” as Educational Policy-Making Moves to the Board of Regents

This is the way the world ends
This is the way the world ends
This is the way the world ends
Not with a bang but a whimper

T. S. Eliot, The Hollow Men (1925)

The last stanza of Eliot’s poem is an apt description of the end of the 2016 legislative session. The final days, called “the Big Ugly,” are a scramble, an endgame, the Republicans and the Democrats vying for an advantage as the state moves toward the November election. All the seats in the legislature, the 150 in the Assembly and the 63 in the Senate, will be on the ballot. While the Assembly is firmly in the Democratic column, the Senate is far more complex, and byzantine. The Democrats hold a single seat edge in the Senate (32-31); however, five Democrats (Jeff Klein, Diane Savino, Tony Avella, David Valesky, and David Carlucci), the Independent Democratic Conference (IDC), under the leadership of Klein (Bronx), caucus with the Republicans, giving the Republicans control of the Senate.

Hanging in the balance were mayoral control, campaign finance reform, removal of pensions for convicted legislators, online fantasy sports betting and scores of other bills.

You may ask: why is all this conflict and wheeling and dealing necessary? Why can’t legislators have civil conversations and decide the issues?

James Madison, in Federalist No. 51, wrote,

Ambition must be made to counteract ambition. The interest of the man must be connected with the constitutional rights of the place. It may be a reflection on human nature, that such devices should be necessary to control the abuses of government. But what is government itself, but the greatest of all reflections on human nature? If men were angels, no government would be necessary. If angels were to govern men, neither external nor internal controls on government would be necessary.

The Constitutional Convention (1787) was not covered on C-SPAN; the Constitutional Convention was a secret meeting. The only notes we have are Madison’s personal notes, not made public until after the death of all the delegates. The fifty-three delegates argued, came and went, delivered lengthy speeches, met in private, and made deals.

Slavery was one of the most significant stumbling blocks: the anti-slavery Northerners versus the slave-holding South. The compromise: slavery is not mentioned in the constitution, the question of slavery was left to the states, and slaves were counted as three-fifths of a “free person” and referred to in the clause as “all other Persons.”

Representatives and direct taxes shall be apportioned among the several states which may be included within this union, according to their respective numbers, which shall be determined by adding to the whole number of free persons, including those bound to service for a term of years, and excluding Indians not taxed, three fifths of all other Persons.

Deal-making, as reprehensible as it may seem, is at the essence of making government work.

Whether to extend mayoral control in New York City had nothing to do with education. Weakening the mayor might give the Republicans a chance in the 2017 mayoral election. In spite of pleas from Merryl Tisch and others in the upper echelons of power, Senate leader John Flanagan offered “unacceptable” plan after plan until, in the closing hours, an agreement was reached. The NY Times describes the plan as a one year extension plus,

It would effectively create a parallel system of charter schools within the city, allowing “high-performing charter schools in good standing” to switch to join the State University of New York umbrella or the Board of Regents of the State Education Department.

Probably a meaningless change. Currently, charter schools authorized by either New York City or Buffalo make reauthorization proposals after five years, and the authorizer, SUNY or the Board of Regents, can reject the recommendation. The proposal allows the charter school, if it’s “high performing and in good standing,” to move directly to SUNY or the Regents for reauthorization.

The session is most interesting for what it did not do – the houses steered clear of legislation directing the State Education Department to take any actions. A host of education bills simply died. Neither the governor nor the party leaders had any desire to once again get involved in the morass of teacher accountability or testing, any of the issues that birthed the opt outs and/or angered teachers and their unions.

The budget was generous and the political leaders appear to be leaving the educational decisions to the educational leaders.

In December the Cuomo-appointed Task Force released its report with 21 recommendations: a blueprint for the Commissioner and the Board of Regents. The core of the report was a 4-year moratorium on the use of student test scores as part of a metric to assess teacher performance.

In the six months since the release of the report the Commissioner has made tests untimed, a recommendation in the report, established a number of large field-based committees to review elements of the Common Core, and, the Regents created a number of alternative pathways to graduation.

Quietly, very quietly, the Commissioner announced a change in the observation section of the teacher evaluation regulation. The outside observer would be scrapped – what might be a good idea in theory was both overly complex and a financial burden on school districts. There was no high drama – no headlines, simply an announcement undoubtedly based on quiet discussions.

The decisions before the Board of Regents are complex, politically explosive and without explicit answers.

Can you create a teacher evaluation plan that is acceptable to principals and teachers and not trashed by external critics?

Can better tests win back opt out parents?  And, what do you mean by “better tests?”

Will alternatives to testing, perhaps, portfolios or other performance assessments, be acceptable to the feds, and acceptable to the principals and teachers?  Are performance assessments practicable in actual classroom settings?

Will additional alternative pathways to high school graduation make students more or less prepared for college?

The Regents appear to have a window – three or four years – to make decisions based on their expertise as well as respond to external pressures and scrutiny, and, hovering aloft: “disruptive” solutions such as unlimited charter schools or vouchers.

Windows open, and windows close.

Getting It Right: Building a Research-Based Teacher Assessment System

A couple of years ago I was participating in a Danielson Training Workshop, two Saturdays in a room filled with principals and network support folk. We watched a video of part of a lesson – we were told we were watching a first year teacher in November in a high school classroom.

Under the former Satisfactory/Unsatisfactory rating system the lesson was clearly satisfactory. The Danielson Frameworks (Read the 115-page NYSED document here) require that teachers be rated on a four-point scale (Distinguished, Proficient, Basic and Unsatisfactory), while New York State also requires a four-point scale (Highly Effective, Effective, Developing and Ineffective). The Frameworks divide the teaching process into four domains, 22 components and 76 elements.

The instructor asked us to rate the lesson: at my table we were all over the place. For a teacher in the third month of her first year of teaching the lesson was excellent – clearly “proficient.”  Others argued the time in teaching was irrelevant, you had to rate her against all other teachers regardless of experience – at best, she was “developing.” Inter-rater reliability was absent.

Decades ago the union sent me to an Educational Testing Service conference on teacher assessment; about thirty experienced superintendents from all over the Northeast, and me, one union guy. We began by watching three 15-minute videos of lessons. The first was an “old-fashioned” classroom: the kids sat in rows, answered teacher questions and stood when they answered; the questions were at a high level, although a small number of kids dominated the discussion. In the second video kids were sitting at tables; the teacher asked a question, gave the kids a few minutes to “huddle,” one of the kids answered for the group, and the teacher followed up with a few clarifying questions. In the third classroom the kids were at stations around the room; it was noisy, the noise was the kids discussing the assignment, and the teacher flitted around the room, answering, clarifying and asking questions.

We were asked to rate the lessons on a provided checklist.

The result: the superintendent ratings were all over the place.

I was serving as the teacher union rep on a Schools Under Registration Review (SURR) team – we were visiting a low performing school. We were told to wait, the principal was busy, four of the 50 teachers were absent and there were three vacancies, the principal was assigning classroom coverages.

At the initial get acquainted session a team member, considering the staffing issues asked, “What are the primary qualities you look for in assessing teacher quality?” The principal blurted, “They come every day and blood doesn’t run out from under the door.”

A colleague was touring a school with very high test scores.  As he walked the building with the principal, he saw uniformly “mediocre” instruction – teacher-dominated, no student engagement. He mentioned the low quality of instruction to the principal, who shrugged, “Why mess with success?”

Once again, there is no inter-rater reliability.

In a number of school districts across the state almost all teachers received maximum observation ratings.

The State Ed folk simply accept the observation ratings of principals and school districts.

Charlotte Danielson, in her other book, Talk About Teaching (September, 2015), discusses the complex role of the principal as rater as well as staff developer: how can a principal, who is the summative evaluator, honestly engage with the teachers he or she rates?

In an excellent article from the Center for Educator Compensation Reform, Measuring and Promoting Inter-Rater Agreement of Teacher and Principal Performance Ratings (February, 2012), the authors parse the reliability of teacher observation ratings. There are a number of statistical tools to assess reliability – the state uses none of them.
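To make "statistical tools to assess reliability" concrete, here is a minimal sketch of two of the simplest: raw percent agreement and Cohen's kappa, which discounts the agreement two raters would reach by chance. The ratings below are invented for illustration (H, E, D, I stand for the four HEDI categories); they are not data from any district, and the function names are my own.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of lessons on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of lessons")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each simply followed his or her own overall rating frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of the same eight lessons by two observers.
rater_a = ["E", "E", "D", "H", "I", "E", "D", "E"]
rater_b = ["E", "D", "D", "E", "I", "E", "E", "E"]

print(percent_agreement(rater_a, rater_b))       # 0.625
print(f"{cohens_kappa(rater_a, rater_b):.3f}")   # 0.385
```

In this made-up example the observers agree on five of eight lessons, which sounds respectable, yet kappa is only about 0.38 because both raters hand out "Effective" so often that much of the agreement is what chance alone would produce.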

In New York State 60% of a teacher rating is made up of the teacher observation score, and, we have no idea of the accuracy of the rating.

In the pre-Race to the Top days, the Satisfactory/Unsatisfactory rating days, the entire rating was dependent on the observation – in the last year of the Bloomberg term 2.7% of teachers in New York City received Unsatisfactory ratings; under the current far more complex system that incorporates student test scores and other measures of student growth only 1% of teachers were rated ineffective (Read a description of the plan: APPR 3012-c).

Under the newest system the other 40% is a combination of Measures of Student Learning and Student Learning Objectives; the use of state test scores is suspended until the 2019-20 school year.

Read a detailed description of the current APPR 3012-d teacher evaluation law here and a lengthy Power Point here.

In May, 2015 the Regents convened a Learning Summit and asked a number of experts to discuss the use of student growth scores (VAM): Watch the lengthy, sometimes contentious discussion here.

With one exception the experts criticized the use of student growth scores (VAM), the VAM scores did not meet the tests of “validity,” “reliability” and “stability.”

There have been glaring errors in the system. In the Sheri Lederman lawsuit, a teacher had very high observation scores and, due to the composition of her class, very low student growth scores. The judge ruled the use of the growth scores, in the individual case, was “arbitrary and capricious.”

The APPR plan negotiated in New York City, on the other hand, allows for appeals to a neutral third party, and the “neutral” has overturned ratings in appeals in which there was a wide disparity between the observation and VAM scores.

The current plan, created by the governor and approved by the legislature has been rejected by teachers and parents. Teachers are convinced that their score is dependent on the ability of the students they teach, not their competence. Parents feel schools are forced to “teach to the test” due to the consequences facing principals and teachers.

Angry parents, angry teachers and principals and a governor and a legislature looking for a way out of the box they created.

And, a cynicism from elements among the public – if two-thirds of kids are “failing” state tests how is it possible that only one percent of principals and teachers are rated “ineffective?”

The Board of Regents has been tasked with finding the “right” plan.

There has been surprisingly little research and public discussion of teacher attrition – in high poverty schools staggering percentages of teachers, 30%, 40%, 50% or more leave within their first few years.

The December, 2015, Cuomo Commission Task Force, in a scathing report, tasked the Regents with “correcting” what has been a disastrous path. The path was partially the result of the governor creating an incredibly complex teacher evaluation matrix and partially the result of Commissioner King rushing to adopt the Common Core, Common Core testing and teacher evaluation simultaneously.

Can the Regents separate political decisions from research-based and guided decisions? Can the Regents move from the John King path, an emotion-guided political path to actually following “what the research says”?

On Tuesday the new Research Work Group, chaired by Regent Johnson will convene for the first time.

The roadmap for the State Ed Department and the Board of Regents is the twenty-one recommendations of the Cuomo Common Core Task Force. A number of the recommendations are already in progress: untimed testing, an in-depth review from the field of the standards, greater transparency of the test items, alternatives to the use of examinations for students with disabilities, and the beginning of a review of teacher evaluation.

The Commissioner and the Regents have to regain a lost credibility: from policy emanating from the Gates Foundation and the so-called reformers to policies guided by scholarship and supported by parents and educators.

Killing the Zombies: Why the “Bad Teacher” Canard Refuses to Die

Who is Clay Christensen and what is disruptive innovation in education?

Christensen is a professor in the Harvard Business School and the intellectual force behind the current education reform movement. The professor proffers that education has been basically unchanged for decades, a traditional classroom model, very little has changed including little improvement in achievement. Christensen acolytes argue that the traditional model must be “disrupted.”  A wide range of examples: placing schools in competition; public, private, charter, parochial and home-schooling through a voucher system. Traditional instruction must be replaced by an iteration of personalized learning in the form of computer-based learning and, impediments to removing “bad” teacher removed.

The “disruptors” include the political leadership, from the White House to state capitals. The $4.4 billion in competitive state grants, the Race to the Top (RttT), is a prime example. Federal dollars were the lure to disrupt the traditional systems: RttT required the creation and expansion of charter schools as well as the creation of a teacher evaluation system based on student test scores.

The New Teacher Project (TNTP), an advocacy organization, a “disruptor” organization, conducted a survey of school districts and issued a report, The Widget Effect. The findings:

Effective teachers are the key to student success, yet our school systems treat all teachers as interchangeable parts, not professionals. Excellence goes unrecognized and poor performance goes unaddressed. This indifference to performance disrespects teachers and gambles with students’ lives.

The 2009 report, which surveyed fifteen school districts across four states, points to the absence of formalized evaluation systems, resulting in virtually all teachers being rated satisfactory after few classroom observations. For TNTP there was no sorting of teachers by ability: no one is fired and no one is identified as an exemplary teacher.

In districts that use binary evaluation ratings (generally “satisfactory” or “unsatisfactory”), more than 99% of teachers receive the satisfactory rating. Districts that use a broader range of rating options do little better; in their districts, 94% of teachers receive one of the top two ratings and less than 1% are rated unsatisfactory.

Since the release of the report the reformers, the “disruptors,” have been successful, enormously successful, in convincing, coercing and luring states into highly structured teacher evaluation systems meant to identify the high performers (merit pay) and prune away the low performers.

As part of its winning Race to the Top proposal New York State designed a multiple measures teacher evaluation system: 60% of a teacher's score would be supervisory observations based on a rubric selected by the school district (Danielson, Marzano, Marshall and others), 20% would be based on student growth data (VAM) from state grades 3-8 test scores, and 20% on a locally negotiated metric, which could be test scores or other measures of student learning (MOSL). The data is pumped into a dense, extremely dense mathematical algorithm and all teachers receive a score that translates into a rating on the HEDI spectrum: Highly Effective, Effective, Developing and Ineffective. The teacher evaluation plan, called the Annual Professional Performance Review (APPR), has been amended a number of times; the current plan, which prohibits the use of student test scores for four years, is called the “matrix.”
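For readers who want to see the arithmetic, here is a minimal sketch of the top-level 60/20/20 weighting only. It is not the state's actual algorithm: the 0-100 scale for each component, the cut scores for the HEDI bands, the sample teacher and the function names are all assumptions made for illustration.

```python
# The 60/20/20 weights come from the text; everything else is an assumption.
WEIGHTS = {"observation": 60, "state_growth": 20, "local_measure": 20}

# Hypothetical cut scores on a 0-100 composite (illustrative, not from the text).
HEDI_BANDS = [
    (91, "Highly Effective"),
    (75, "Effective"),
    (65, "Developing"),
    (0, "Ineffective"),
]


def composite_score(components):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return total / 100


def hedi_rating(score):
    """Return the label of the highest band whose floor the score reaches."""
    for floor, label in HEDI_BANDS:
        if score >= floor:
            return label
    raise ValueError("score must be non-negative")


# A made-up teacher: strong observations, weak state growth score.
teacher = {"observation": 85, "state_growth": 40, "local_measure": 70}
score = composite_score(teacher)
print(score, hedi_rating(score))  # prints: 73.0 Developing
```

Even in this toy version, a single low component can pull a teacher with strong observations down a band, which is part of why the test score share drew so much criticism.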

The inclusion of a value-added measurement, the student test score algorithm, has been sharply criticized by a range of scholars as well as teacher organizations, and a state court, in a non-precedent-setting decision, found the use “arbitrary and capricious.”

Millions of dollars to create a multiple measures teacher evaluation plan and the result: 1% of teachers are ineffective – the same as the Widget Effect report.

In 2009 The New Teacher Project bemoaned that only 1% of teachers were rated “unsatisfactory,” and seven years later the New York State APPR plan found, you guessed it, once again only 1% of teachers rated “ineffective.”

Millions of dollars to create a teacher evaluation system, a host of “experts,” the application of dense mathematical formulations and the percentage of teachers rated ineffective is unchanged.

This couldn’t possibly be right!! … so say the disruptors.

In a recently released report, The Widget Effect Revisited: Teacher Evaluation Reforms and the Distribution of Teacher Effectiveness (February, 2016), the authors conducted surveys of newly designed teacher evaluation plans across a number of states and interviewed principals.

On the new plans, “… less than 3% of teachers were rated below Proficient”

The raters, the principals, also reported,

“…evaluators perceive more than three times as many teachers in their school as below Proficient than they rate as such.”

In lengthy interviews the principals expressed the reasons for not rating more teachers as below Proficient.

* Time constraints (“It takes too much time away from running a school”)

* Teacher potential and motivation (“Fear of the discouraging of teachers”)

* Personal discomfort (“I have a difficult time telling teachers they’re failing”)

* Racial tensions (“Very difficult for a White principal to rate Black teachers poorly”)

* Quality of replacements (“I can’t find adequate replacements”)

* Voluntary departures (“I rate them Proficient and they leave – a deal is made”)

* Burdensome dismissal processes (“The process is too complex and time consuming”)

Is the “problem” too many “below Proficient” teachers or “below Proficient” principals?

What the authors failed to investigate, admittedly not the purpose of the study:

* Inter-rater reliability: do the raters from school to school use the same rubrics, and, are they competent to assess teacher performance?

* The bell-curve conundrum: is the lowest rated teacher in the school “below Proficient” when compared with all other teachers in the district? In other words, is it “arbitrary and capricious” to establish a system that guarantees that the lowest performer in a school must be below Proficient? After all, they may be more proficient than teachers in other schools in the district.

A second baseman on a major league team may be the “least proficient” among major leaguers and still be in the top 1% of all second basemen across colleges and minor leagues.

I know the idea is disquieting – perhaps only 1% of teachers actually are ineffective.

Prospective teachers must be accepted by a college and meet standards set by the Council for the Accreditation of Educator Preparation (CAEP); they must pass a number of nationally recognized pre-service exams, pass interviews by principals/hiring committees, teach a demonstration lesson and serve a probationary period as an at-will employee. It should not be surprising that very few teachers who survive the rigorous pre-screening end up as “below Proficient.”

Nobel Laureate Paul Krugman coined the term zombie idea: a zombie idea is “a proposition that has been thoroughly refuted by analysis and evidence, and should be dead — but won’t stay dead because it serves a political purpose, appeals to prejudices, or both.”

The disruptor “bad teacher” solution to increasing student achievement is an example of a zombie idea – in spite of reams of evidence the idea refuses to die.

What is so depressing is that the “bad” teacher argument pales when compared to teacher attrition: in the lowest achieving, highest poverty schools about half of all teachers leave within five years, and we have a pretty good idea of why they leave: the way they’re treated.

Susan Moore Johnson, at the Next Generation of Teachers Project at Harvard, examines the issue of teacher attrition in the highest poverty schools in detail. Yes, teachers commonly leave for wealthier, whiter schools; however, they are not fleeing the students, they are fleeing the working conditions.

If we know how to make significant differences (“Improve working conditions in the poorest schools”), if we’ve identified the core problem (“Teacher morale and treatment”), why don’t we act on the solution?

Those zombies are tough to kill off.