New York State Selects a Controversial Commissioner: Let’s Give Her (That’s Right – Her) A Chance!!

The New York State Board of Regents selected Mary Ellen Elia, the recently fired superintendent of the Hillsborough County, Florida, School District (Tampa), as the New York State Commissioner of Education.

Elia, who began her career as a Social Studies teacher in Buffalo in 1970, has been the superintendent in Hillsborough since 2005 and has been acclaimed nationally.

Jay Mathews, in the Washington Post, was appalled by her dismissal, called it “senseless and catastrophic,” and continued,

Since her appointment in 2005, she has built a national reputation as a warm, smart leader who has produced some of the most academically challenging schools in the country and has found ways to raise achievement even for low-income children. She is highly praised as a manager, even though she had never served before as a superintendent and — unlike any other leader of a big district I know — had previously spent 19 years as a classroom and reading teacher.

With union leaders, she worked out a teacher compensation plan that allowed exceptional new hires to climb the salary ladder fast. Teachers union president Jean Clements called it “reliable, valid, fair and easy to understand.”

Her opponents called her dictatorial and oblivious to the needs of parents, and some teachers were happy to see her leave.

As soon as her appointment as NYS Commissioner was announced the twittersphere and the blognet were abuzz – Bill Gates whispered into Merryl Tisch’s ear, she snapped her fingers and the other sixteen Regents fell in line.

The special meeting of the Regents was a four-hour marathon, with a two-hour interview of Elia. The new Board of Regents includes six former superintendents and four newly elected members, two of whom bumped the two most senior members. The current Regents are fiercely independent. The Regents dueled with acting commissioner Wagner, no shrinking violets! No one will control the current Board.

Let’s be honest – Mary Ellen Elia brings baggage.

Hillsborough, her school district, is the eighth largest school district in the nation – over 200,000 students and 25,000 staffers. In 2008 she received the largest Gates Foundation grant – 104 million dollars – to implement the Empowering Excellent Teachers project. The grant application required negotiating with the union; most of the dollars went to increasing teacher salaries. The grant required a teacher evaluation system in which student growth scores counted for 40%, alongside principal and peer assessments; a pay-for-performance plan (see pages 91-92 of the Teacher Contract here); and a salary fast-track for younger teachers. (Read about the grant from the union perspective here and a union “myth-busting” article here.)

The new Commish immediately visited a local school, spoke with the media and met with legislative leaders (see interview here), and attempted to “roll back the past”:

[Elia] indicated that she thought less of the decision to simultaneously align New York’s standardized tests to the Common Core standards and start evaluating teachers using test results, though.
“Some of this across the nation, in specific places, was done very quickly without the implementation explained and without enough time,” Elia said. “I would suggest that sometimes in haste we haven’t taken the time for people to understand and to become part of the change that needs to occur.”

Commissioner Elia should be judged by her actions.

Some suggestions:

* Get Out of Albany: Attend as many teacher meetings as possible, engage in a TV interview with NYSUT president Karen Magee, attend the UFT Delegate Meeting, meet with Opt Out parents, and become a social media presence.

* Get Teacher Evaluation Resolved: The battle over the Governor-imposed teacher evaluation plan is “eating up all the air.” Start the plan with lower growth score metrics, agree that the plan will be driven by research-based data, review the metrics annually, and alleviate the fear and suspicion among teachers.

* Review the Common Core: Educators from across the nation and the state are critical of elements of the Common Core, especially the earlier grade standards. Establish a task force: scholars, teachers and parents, to review CCLS and recommend, if necessary, changes.

* Let’s Move Beyond the Testing Morass: The current Common Core aligned Pearson tests have no credibility. Students have low test score grades because former commissioner King decided to structure the system to produce lower grades – a decision that led to his demise as commissioner. Are there better tests? Can we begin to explore adaptive testing? Is the use of portfolios for specific categories of students feasible? And, yes, we should totally reject the PARCC consortium.

* English Language Learners: The number of ELL students entering New York State schools has increased dramatically, and the State responded by adding compliance regs. Neither increasing the minutes of required instruction nor increasing the number of bilingual teachers is the answer. School districts will look for loopholes; very little will change. What is the State doing to impact instruction of ELLs in classrooms? A politically explosive arena: do school districts cut Advanced Placement classes to add bilingual classes? Why are some schools with high percentages of ELLs doing so much better than others?

* The Teacher Education Troubles: All college teacher education programs must be approved by the State, and there are hundreds of programs. Nationwide, students in teacher education programs come from the bottom half of college applicants. How can we attract better candidates? How can we retain teachers in the profession? Should we hold teacher education programs accountable for the success of their students? And, if so, how?
Are the current tests (edTPA, ALST, EAS and Content exams), which cost the students about $1,000, reliable predictors of success? Are the exams themselves valid?

* The Elephant in the Room: The high poverty, low achieving schools and school districts. Buffalo, Rochester, Syracuse, and probably another hundred or so cities across the nation share high levels of poverty and low student academic achievement. New York State has stumbled badly: the $700 million in Race to the Top dollars are gone, and what do we have to show? Staining schools with brands of failure, “out of time,” “priority” or whatever, is meaningless; inner city charter schools struggle mightily; and the governor’s “receivership” legislation is a re-packaging of failed ideas. Can the state actually adopt policies that improve teaching and learning in our poorest schools?

* The Charter School Kerfuffle: State Ed just denied the applications of a host of charter schools; they didn’t come up to standards. (Why now?) SED has routinely extended the charters of low-achieving charter schools. Charter schools commonly accept fewer students with disabilities and English language learners, and dump out low achievers and discipline problems. There are at least 2500 empty seats in New York City charter schools. The issue is called “back-fill”: kids are dumped and not replaced to inflate data. The large Charter Management Organizations actively seek philanthropy; the many, many “mom and pop” charter schools struggle to meet payrolls. Will Elia hold charters to higher standards? Will she close down failed charter schools?

For decades commissioners were drawn from the ranks of superintendents across the state. When Regent Tisch became chancellor the Board chose David Steiner, who was not a K-12 educator but the dean of a college of education; then John King, who had lots of degrees and little experience; and, now, a very high profile choice from Florida. Aren’t there any qualified superintendents in New York State?

Understand: teachers, principals and superintendents don’t work for the commissioner; they work for elected school boards. While the commissioner and the Board set policy, they do not have the ability to actually intervene locally. In East Ramapo the school board was captured by the religious school leaders, and they directed funds to the religious schools, actively closed public schools and sold the buildings to religious schools, interpreted the rules to drive special education dollars to religious schools, laid off public school staffs and dramatically raised class size. The State Ed lawyer to the commish: you have no authority to intervene.

Constitutionally the Board of Regents sets policy and the commissioner carries out the policies; for the last few years, however, the governor has set the policy, and during the King years the commissioner successfully bypassed most of the Board members.

We are entering a new era with a new commissioner, a simple message: give the lady a chance.

New York State is a far cry from Florida.

How about Chancellor Tisch arrange for John Merrow to interview Commissioner Elia, Diane Ravitch and AFT President Randi Weingarten, maybe at JCC Manhattan, invite the Opt-Out organizations, the School-to-Prison Pipeline folks and the Immigrant Coalition to pose questions, and arrange for CSPAN to telecast?

Mary Ellen Elia decided to jump out of the Florida frying pan into the New York State fire – let’s both give her a chance and hold her feet to the fire.

“Lipstick on a Pig,” Can, or Should, the Regents Salvage the Cuomo Teacher Evaluation Plan?

The newest iteration of the state teacher evaluation plan is eating up all the air: the all-day Education Learning Summit, the release of the 56-page Department of Education summary (Read here) and hours of discussions at the P-12 Committee of the Board of Regents.

A very quick review: in order to be eligible for the hundreds of millions of Race to the Top dollars (not so affectionately referred to as “thirty pieces of silver”) Commissioner Steiner and the Unions in 2010 entered into months of discussions that produced a teacher evaluation law (Section 3012-c), and after more months the law was converted into the 20-20-60 plan (20% student test scores, 20% a locally negotiated measure and 60% supervisory observations). The results of the plan varied widely across the state. In some districts majorities of teachers were “Highly Effective,” in others disturbing numbers were “Developing” and “Ineffective.” Do the scores reflect teacher competence, or do the scores reflect the district zip code or the ability of the students? Were the fates of teachers in poorer districts, and of teachers of English language learners and students with disabilities, determined before any test?

Teachers, principals and parents believe the highly complex numerical algorithms were flawed, poorly applied, or both. BTW, it doesn’t matter whether the data is actually flawed; Cuomo’s behavior has tarnished the entire system.

The State Teacher Union, NYSUT, which represents the 700 local unions strongly opposed the plan; the New York City Teacher Union, that originally fought Mayor Bloomberg over the plan, seemed relatively happy with the plan.

What’s happening?

In the year before the plan in New York City 2.8% of teachers received Unsatisfactory ratings; in the first year of the plan 1.6% of teachers received an “Ineffective” rating. Teachers in New York City were faring better under the new plan.

Under the prior system ratings were based solely on supervisory judgement, aka, principal observations; under the new plan student performance mattered. A teacher who received a low score, a “Developing” or an “Ineffective” on the observation portion and “Highly Effective” or “Effective” on the student performance section will end up with a passing score, a considerable difference from the former principal rating only system.

The new, new Cuomo plan moves away from the 20-20-60 plan to a Matrix, a 4 by 4 box that determines the teacher rating, yes, it takes a while to comprehend.

Teachers are highly suspicious and view the plan as Governor Cuomo’s plan to fire teachers, and Andrew does nothing to dissuade them.

The complexity of the models stems from the wide diversity across the state; New York City developed 159 different algorithms to account for the wide variety of subjects; many districts outside of New York City used locally developed metrics that are more questionable in their impact.

At the Education Learning Summit a report from the American Statistical Association was cited: teacher impact on student learning ranges from 1 – 14%. Some Regents argued the impact of student scores should not go beyond 20%; Chancellor Tisch, in a presentation Thursday morning, suggested 40%.

At the May 18th Regents Meeting many of the Regents had grave doubts about the Governor’s plan. Regents Cashin, Rosa and Johnson were especially critical. Regent Tilles referenced the “lipstick” analogy. There was no support for the plan. Acting Commissioner Wagner suggested a number of metrics, without Regents enthusiasm.

The subtext was fascinating – the Regents are a policy board, the commissioner turns the policy into regulations. The line between policy and operations is gray. Commissioner King deftly sidestepped the board, and determined both policy and operations. The board, the Regents, in the past, with the exception of a few members, rubber stamped the policies of the commissioner, until the parent and teacher backlash required a scalp. Exit the commissioner.

Will the Regents paint “lipstick on a pig,” and to the best of their ability gussy up a deeply flawed plan, or, vote down a plan provided by the commissioner that the Regents feel is inadequate?

Six of the current Regents are former superintendents; others are uncomfortable with the past direction of the board.

Many of the Regents suggested beginning with base metrics, relatively easy to achieve, and move up the goals as teachers and principals master the complexities. Clearly the Governor wants a tougher plan, a tougher plan equals more “D” and “I” grades.

The legislators, who passed the budget bill which included the Matrix, want a plan that “satisfies” parents, principals and teachers.

Let me make it perfectly clear: the legislature despises the Governor and would love if the Regents acted as their surrogate, and ripped the Governor.

Rent control, the property tax cap, mayoral control, the Dream Act, the education tax credit, the lifting of the charter school cap: all are on the table as the legislature moves to adjournment on June 17th.

Towards the end of the second day of the Regents Meeting Regent Young chaired a committee meeting that began a discussion of President Obama’s “My Brother’s Keeper” initiative; a discussion that in the long run is probably more important than the Cuomo “fire to success” plan.

Lupe Fiasco on Freedom:

Restoring Trust: Can the Regents/SED, Unions and Superintendents Agree on a “Valid, Reliable and Fair” Teacher Evaluation System?

As part of my union rep duties I served on committees to select teachers for new schools; my favorite question was, “What was the best lesson you taught in the last few weeks and how do you know it?” Teachers had no problem describing lessons, and lots of trouble explaining how they could assess the effectiveness of the lesson. A few would say “exit slips,” others explain that checking the homework assessed the effectiveness of the lesson, or that experience was the best guide. In the real world we plow through the curriculum, an occasional unit exam, differentiating lessons, re-teaching concepts, understanding that god in her wisdom did not make us or our students all equal.

We have been giving statewide exams for decades, before No Child Left Behind, in grades four and eight, and Regents exams are more than a century old. Yes, we did have a triage system: classes were homogeneously grouped and students who failed to make academic progress were placed in classes with similar kids. At the end of the line kids dropped out or received a lesser diploma, and moved into the workforce. Low-skilled union jobs were commonplace; the education system was the “divider” which steered kids to college or the world of work.

For the last thirty years our economy has undergone structural changes, the low-skilled union jobs have fallen victim to automation or moved overseas.

Are schools appropriately preparing students for the rapidly and continuously changing world of work, or, to use the commonly used term, are we preparing “college and career ready” students?

The answer begins in the world of baseball.

I was attending an education conference, and as commonly happens, a book was handed out: not a dense text, not a “how to” book, not a messianic message. The book was Michael Lewis’ Moneyball: The Art of Winning an Unfair Game (2003):

Moneyball is a quest for the secret of success in baseball. Following the low-budget Oakland Athletics, their larger-than-life general manager, Billy Beane, and the strange brotherhood of amateur baseball enthusiasts, Michael Lewis has written not only “the single most influential baseball book ever” (Rob Neyer, Slate) but also what “may be the best book ever written on business” (Weekly Standard). [Lewis’] … intimate and original portraits of big league ballplayers are alone worth the price of admission—but the real jackpot is a cache of numbers—numbers!—collected over the years by a strange brotherhood of amateur baseball enthusiasts: software engineers, statisticians, Wall Street analysts, lawyers and physics professors.

Baseball is no longer ruled by the cigar chomping old-timers, every decision, from salary negotiations, to valuing players, to comparing players, to which pitch to throw to each batter is guided by a mathematical algorithm. As a wonky baseball friend claims, “You could prop Bernie in the corner of the dugout (“Weekend at Bernie’s“) and IBM’s Watson could manage the team.”

We even have a term for baseball data: sabermetrics. What was a hobby among a handful of nerds now rules the national pastime.

The widespread use of data has not solved the problems of baseball: the number of Afro-American ballplayers and fans has sharply diminished and the fan base is aging out. Data may drive decisions, but the American pastime is facing a ticking clock.

See Chris Rock on baseball:

Whether we like it or not, understand it or not, data drives decisions across a wide spectrum. Ian Ayres, in Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart (2007), chronicles how data has embedded itself: from predicting the quality of red wines, to driving the medical profession, to determining which prisoners should be paroled, dense mathematical regression models rule.

It is not surprising that data influences core decisions in education.

The New Teacher Project 2009 “Widget Effect” report resounded across the education domain:

• All teachers are rated good or great. Less than 1 percent of teachers receive unsatisfactory ratings, making it impossible to identify truly exceptional teachers.

• Professional development is inadequate. Almost 3 in 4 teachers did not receive any specific feedback on improving their performance in their last evaluation.

• Novice teachers are neglected. Low expectations for beginning teachers translate into benign neglect in the classroom and a toothless tenure process.

• Poor performance goes unaddressed. Half of the districts studied have not dismissed a single tenured teacher for poor performance in the past five years.

The result has been the movement to assess teacher performance by applying dense mathematical regression models to education, attempting to compare teacher to teacher based upon student achievement data, using a range of variables to level the playing field.

In other words, there is no teacher evaluation system.

The current systems attempt to address the absence of systems using a statistical method called regression analysis. The students take a common exam, a state test for example; the model allows for a range of variables, namely the economic status of test takers, students with disabilities, the level of disability, English language learners, student attendance, etc.; and the formula differentiates among teachers, within a margin of error.

As with all statistical data sets there are errors of measurement – that “plus or minus” that warns us that the results fall within a range.

“Candidate A leads candidate B 52-48 with an error of measurement of plus or minus 4%”: a statistical tie.
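
For readers who like to see the arithmetic, here is a minimal Python sketch of the polling example. The function name and the rule of thumb it uses (two results count as a "statistical tie" when their plus-or-minus ranges overlap) are my own illustrative assumptions, not a formal statistical test.

```python
def is_statistical_tie(share_a, share_b, margin):
    """Return True when the two plus-or-minus ranges overlap.

    Illustrative rule of thumb only (an assumption for this sketch),
    not a formal significance test.
    """
    low_a, high_a = share_a - margin, share_a + margin
    low_b, high_b = share_b - margin, share_b + margin
    return low_a <= high_b and low_b <= high_a


# The example from the text: 52-48 with a margin of plus or minus 4 points.
print(is_statistical_tie(52, 48, 4))  # True: 48..56 overlaps 44..52
```

With a wider gap, say 60-40 at the same margin, the ranges no longer overlap and the function returns False.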

As school districts begin to create the models, usually referred to as Value-Added Models or Growth Models, “experts” warn about the problems of VAM. At the NYSED Education Learning Summit three education experts were critical, arguing that VAM data was not ready for prime time; a fourth expert argued that VAM was better than what preceded it and suggested a combination of VAM student performance data, teacher observations and student surveys.

By mid-June the NYS Board of Regents/SED have to create regulations to implement another new teacher evaluation system in New York State.

The current teacher evaluation law has been on the books for three years (two years in New York City), and has raised more questions than answers.

* Are teachers in Rochester and Syracuse less able or is the algorithm flawed? (A lawsuit is in progress)
* Are teachers of poorer students (Low SES), English language learners and students with disabilities less able, or is the algorithm, the formula, flawed?
* Conversely, are teachers in higher income (High SES) districts more able than other teachers?

On Monday the Regents will begin an in-depth discussion of the new Matrix model (See the Education Learning Summit page here)

The movement to Common Core tests was a disaster, probably too mild a term. Instead of phasing in the Common Core-aligned tests the decision-makers, led by Commissioner John King, used the “push off the end of the diving board” approach. Randi Weingarten asked King to incorporate a “save-harmless” for a year or two, to no avail. The result: over 100,000 parents opting-out, the opt-out movement is spreading across the nation, teachers are highly suspicious of everything and most importantly, electeds are threatened and are introducing legislation to weaken SED initiatives.

The Regents have four new members, three former superintendents and a former Buffalo school board member, who join former superintendents Cashin, Rosa and Young. Regents Cashin and Rosa voted against the original teacher evaluation plan. Hopefully their experience can put the train back on the tracks.

How can the Regents/SED win back the confidence of parents and teachers?

The complex numbers cannot be seen as a tool to punish teachers.

The movement to Common Core-aligned tests ignored the impact of drastic drops in test scores, and when the public expressed discomfort John King blamed parents and “outside agitators”: a classic example of how NOT to roll out a new initiative.

Advice: Accept the recommendations of NYSUT (the state teacher union), the UFT (the NYC teacher union) and the NYC Department of Education. In other words, create buy-in; remember Rule # 1 of “change”: participation reduces resistance.

Each year a technical committee that includes the unions can review and modify the model. Shoving new proposals down people’s throats and vigorously defending the position hasn’t worked out too well. Incremental change builds trust.

A simple example: A teacher receives a score of 42 on a model that has an error of measurement of plus or minus 10 points. In other words the teacher’s score could be anywhere within the range of 32 to 52. If the cut score between D (Developing) and E (Effective) is 50, the teacher should be graded E (Effective). We should accept the error of measurement issue and not disadvantage the teacher.
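
The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration: it reads the "plus or minus 10" as 10 points on the score scale (which is what the 32-to-52 range implies), it considers just the single cut score between D and E, and the function name and the "benefit of the doubt" rule are hypothetical, not part of any state regulation.

```python
def rate_with_benefit_of_doubt(score, margin, cut_score):
    """Rate a teacher using the top of the measurement-error range.

    Hypothetical rule for illustration: if the upper end of the range
    (score + margin) reaches the cut score, give the higher rating.
    """
    low, high = score - margin, score + margin
    rating = "E" if high >= cut_score else "D"
    return low, high, rating


# The example from the text: score 42, plus or minus 10, cut score 50.
print(rate_with_benefit_of_doubt(42, 10, 50))  # (32, 52, 'E')
```

A teacher scoring 30 under the same assumptions would have a range of 20 to 40, which never reaches the cut score, and would stay at D.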

Similarly, on the observation rubrics, we should accept the recommendations of the unions and the NYC Department, set low cut scores and examine the results each year.

If the Regents/SED decides to select cut scores that are “tougher,” the teacher and parent wars will escalate and electeds will jump on the “voter” side, the Opt-Outs, the trash the system side. Charter schools, voucher and tax credit supporters will argue public schools are not fixable.

Just as Chris Rock points out re the underlying problems of baseball, the underlying problems of teacher quality will not be resolved by teacher evaluation regulations; however, the Regents/SED need to be standing on a stage with NYSUT President Karen Magee, UFT President Michael Mulgrew and NYC Chancellor Carmen Farina joining together announcing a teacher evaluation plan that is “valid, reliable and fair.”

Regaining a lost trust is crucial and must precede any further steps, until we can trust each other we cannot move forward, and, with the vandals at the gates, Regents/SED must take the first step.

Wolf Hall. Flanagan, Cuomo, Teachers and the Discomforting World of Albany Politics

“Our wise leaders take us to the brink; our teachers make us stop and think.” Leonard Cohen

In January the Democrats in Albany maneuvered, deals were made, unmade, promises made, stories leaked to the press, and eventually Carl Heastie was crowned as the leader of the Assembly.

On Monday the Republicans in the Senate (there are 33 Republicans in the 63-member Senate) mirrored the Democrats: once again promises were undoubtedly made, members weighed the benefits or dangers of supporting this candidate or that candidate, and John Flanagan emerged as the leader of the Senate.

The media and “good government” guys and gals bemoaned that decisions impacting all of us are made behind closed doors, and driven by personal ambition and deal-making.

Wednesday night was the last part of the six-part PBS miniseries Wolf Hall, a magnificent and chilling chronicling of the shenanigans in the sixteenth century world of Henry VIII and his close advisor Thomas Cromwell. If you’re a theatre fan you can watch the Royal Shakespeare Company’s two-part, five-and-a-half-hour play at the Winter Garden Theater.

The mini-series and the play are filled with double-dealing, conniving and, not to ruin it for you, the fate of Anne Boleyn. A 21st century iteration of the unsavory side of politics is House of Cards, the Netflix series is filled with double and triple dealing, as well as an occasional murder.

Our founding fathers understood the realities of politics, the ambition of men, and the fact that men were not angels. As James Madison wrote in Federalist # 51,

Ambition must be made to counteract ambition. The interest of the man must be connected with the constitutional rights of the place. It may be a reflection on human nature, that such devices should be necessary to control the abuses of government … If men were angels, no government would be necessary. If angels were to govern men, neither external nor internal controls on government would be necessary.

The founding document of our nation, our constitution, was the product of behind the scenes deal-making. The meeting which we refer to as the Constitutional Convention was a secret meeting; the members were sworn to secrecy, and the press had no idea what was transpiring during the spring and summer of 1787 in Philadelphia. The delegates themselves, who slowly trickled into Philadelphia, thought they were meeting to make the dysfunctional Articles of Confederation workable; James Madison (Virginia) and Alexander Hamilton (New York) had other ideas. As the fifty-four delegates, each representing their state, began the discussions it became clear that the goal was actually to create a new national governing system.

The participants were not only sworn to secrecy, there were no official minutes of the debates. James Madison kept copious notes, and required that his notes remain secret for fifty years. It was not until 1911 that Max Farrand published an Annals of the Constitutional Convention using Madison’s notes and bits and pieces of recollections of other attendees to reconstruct, to the extent possible, the actual debates within the Convention. We have no idea about the accuracy of the Farrand Annals, whether they actually reflect the debates or they reflect the memories of Madison or whether Madison massaged his notes to disparage this one or that one.

Lawrence Goldstone, in The Dark Bargain: Slavery, Profits and the Struggle for the Constitution (2005), wrote,

… the proceedings remained strictly secret, conducted behind locked doors that were guarded at all times by armed sentries … The official minutes were kept intentionally sketchy … the delegates disagreed on almost everything … But of all the issues that would arise in Philadelphia, the one that evoked the most passion, the one that left the least possibility of compromise, the one that would pit morality against pragmatism, was the question of slavery. To a significant and disquieting degree, America’s most sacred document was molded and shaped by the most notorious institution in its history.

Our nation’s bible is built on secret wheeling and dealing, and stained with the curse of slavery.

In the political cauldron that is Albany, Andrew Cuomo chose to make an example of teachers. Perhaps he misread Machiavelli’s line, “It is better to be feared than loved, if you cannot be both”; he fails to understand that punishing teachers does not mean that teachers fear the punisher.

Teachers are a strange breed, to attack teachers is to attack the children they nurture and protect. Attacking teachers is akin to attacking the cubs of the lioness.

Let us not forget the first teacher who was brought up on charges. Socrates was charged with “corrupting the morals” of Athenian youth by asking them to think deeply. He was tried, convicted and sentenced to death; the teacher evaluation law in Athens was tough.

Plato, in the Apology, recounts Socrates’ last speech.

“This much is all I ask of my accusers: when my sons grow up, avenge yourselves by causing them the same kind of grief that I caused you, if you think they care for money or anything else more than they care for virtue, or if they think they are somebody when they are nobody.

Reproach them as I reproach you, that they do not care for the right things and think they are worthy when they are not worthy of anything. If you do this, I shall have been justly treated by you and my sons also.

Now the hour to part has come. I go to die, you go to live. Which of us goes to the better lot is known to no one.”
― Plato, Apology

Long after Cuomo has faded from the memories of New Yorkers they will fondly remember the teacher that changed their life.

The Albany Education Learning Summit: aka, Teacher Evaluation 4.0: Can the Regents/SED Create a Student Performance/Teacher Observation Model that is Valid, Reliable and Fair?

The Regents and invited guests gathered at the State Museum in Albany to listen to a day of comments on the new Teacher Evaluation law, called by one superintendent Teacher Evaluation 4.0. A few blocks away in the Capitol the Republican members of the Senate, at least some of them, were swearing undying loyalty to Majority Leader Dean Skelos while behind the scenes John De Francisco (Syracuse), Catherine Young (Olean) and John Flanagan (Smithtown-Huntington) whispered in their colleagues’ ears. As early as Monday Dean may be crying “Et tu John (or Catherine)” as the dagger is plunged into his leadership heart.

You can watch/listen to the Albany Summit panels, read the hundreds of pages of comments and supporting documents and submit your own comments at

Ken Wagner, the acting co-Commissioner acted as the moderator, and did an outstanding job. Wagner, who describes himself as a “recovering school psychologist,” skillfully asked clarifying questions, asked and restated questions from Regent members, the audience and the Internet audience.

New York City Regents Bendit (Manhattan), Cea (Staten Island) and Cottrell (At-Large) did not attend. In addition about a dozen members of the legislature were in attendance. All the newly appointed Regents attended.
(Regent Cea informed me she watched the webcast of the Summit)

Wagner described the changes in the law (See changes here): moving to the HEDI matrix; requiring an outside evaluator in addition to your principal; allowing for, but not requiring, a peer evaluator; requiring that all alternative assessment tools, for example Student Learning Objectives, be approved by the state; and setting time constraints for completion of the process, with a hardship provision that would allow time limit extensions.

After opening remarks by co-Commissioner Beth Berlin, Wagner began by summarizing the changes in the law in great detail. Watch the opening remarks here.

In the first two sessions superintendents and the three organizations in the state representing supervisors vigorously attacked the outside evaluator concept and attempted to buttress the role of the principal as primary evaluator. Both superintendents and school-based supervisors saw the outside evaluator concept as eroding their authority, as well as overly complex and an administrative monstrosity. There are eighteen approved observational rubrics: imagine matching up districts, imagine the cost of “training” outside observers, imagine the cost to districts. Hundreds of schools have a single supervisor and schools are hours apart; the concept of an outside evaluator is mechanically impossible. In addition the outside evaluator doesn’t “know the schools,” doesn’t know the students, doesn’t know the school culture.

What was NOT part of the discussion: in many school districts all teachers received 58, 59 or 60 points on the 60 point scale. Are the vast majority of teachers highly effective?

The superintendent/principal panel urged the Regents to only credit the outside observer with 1 – 5% of the teacher observation section of the matrix.

The expert panel was the most anticipated section of the day. Seven “experts,” six in attendance and one on video from California, occasionally agreed and frequently disagreed about the technical aspects of the student performance section of the matrix. Was a growth model, usually referred to as Value-Added Modeling (VAM), a valid, reliable and stable method to assess teacher performance or, to quote Diane Ravitch, “junk science?”

Read my previous post on the efficacy of VAMs here.

Three of the “experts” have national reputations, Tom Kane (Harvard), Aaron Pallas (Columbia) and Jesse Rothstein (U of California – Berkeley) and have been in the midst of the VAM vortex.

The panel also included Stephen Caldas (Manhattanville College), Catherine Brown (Center for American Progress), Lesley Guggenheim (The New Teacher Project) and Sandi Jacobs (National Council on Teacher Quality).

Kane was the lead researcher in the 3-year MET Project that concluded student performance, observations and student surveys create an accurate assessment of teacher performance. The report suggests that VAM measures should account for between 33 and 50% of teacher assessment.

Read a summary of the MET Project here.

Aaron Pallas, who is an expert witness in two lawsuits challenging growth scores, points out that teachers who were “measured” by the results of state tests received significantly lower scores than teachers who were “measured” by other metrics. (Read full article here.)

What is immediately obvious is that teachers whose state growth ratings were not based on the growth percentiles received much higher ratings than those whose ratings were based on the growth percentiles

Jesse Rothstein has been challenging Tom Kane quite publicly over the use of VAM scores; scholars do not usually duel in public. Read a Rothstein paper here, and a blog post that summarizes the “back and forth” and links to the many other posts here.

I urge you to put aside an hour and forty minutes and watch the expert panel here.

To summarize the experts:

* Caldas points to 50% error rates in VAM models and suggests they are useless. Guggenheim from TNTP argues that they’re better than the previous “S” – “U” systems in which virtually everyone received an “S” rating. Read a blog post here in which Caldas rejects the validity of growth models.

* Everyone agreed VAM models are extremely complicated and very few understand them.

* All the panelists agreed that an external evaluator added accuracy to the process, the “arms-length” nature of the external observer added to the reliability; while the purpose of the external evaluator was summative assessment the in-school supervisor continued to have the role of normative or monitoring day-to-day practice; merging the roles will be challenging.

* The growth metric can be used to measure anything – New York City developed 159 growth scores to measure Student Learning Objectives (SLOs)

* Only about 20% of teachers are measured by state test scores, others by locally developed SLOs, some by school-wide or group measures, such as pre-k and kindergarten teachers as well as perhaps physical education and art teachers; probably no alternative to using group measures.

* The number and type of observations are crucial: How many formal? How many informal? How many by an external evaluator? All agreed that timely feedback by all of the assessors is crucial.

* All the panelists agreed that the tests themselves were troublesome at best, or deeply flawed, and the scores produced misunderstood.

* Wagner asked about the use of video, either real time or archived, to facilitate the role of the external evaluator.

Kane referenced current research entitled The Best Foot Forward Project:

In the Best Foot Forward project, we give teachers control of the camera and allow them to choose which of their videos to submit for observation. We then train their administrators to view and score them as they would score an in-person classroom observation and then have a conversation with a teacher using the video as a coaching tool. We also provide commentary and feedback from external observers.

In conclusion: Kane believes research supports the use of VAM and produces data over time that distinguishes among teachers; Rothstein, Pallas and Caldas pretty vehemently disagree.

The majority of the discussion centered on teacher observations. While the experts all believed that observations were the key and external observers crucial, superintendents and school-based supervisors disagreed sharply.

In my view the entire testing system is far too complex, the tests not useful and the data far too unstable to be used to assess teachers. I question the ability of too many school-based supervisors to accurately assess and help teachers grow professionally. Yes, some are extremely proficient, others not; there is far too little training for supervisors in using the observation process to improve instruction.

The Best Foot Forward Project supra moves us to the next level and I hope the research is utilized by school districts.

By the end of the week, a draft plan is expected, and at the May 18/19 Regents Meeting a full discussion.

And, as you read this post the Senate Republicans are in the process of deposing their leader with wide-ranging ramifications for pending educational legislation.

Stay tuned.

The Nolan-Flanagan Bill: Are We Seeing the Revision of the Common Core and Testing in New York State?

In the current legislative session, over the next two years about 15,000 bills will be introduced in the Assembly. 95% will languish in committee or pass in the Assembly and die without a Senate partner.

It is highly unusual for the Democratic chair in the Assembly and the Republican chair in the Senate to introduce the same bill; it usually means that the leadership in both houses and the governor have agreed upon the substance of the bill.

Assembly Education chair Nolan and Senate Education chair Flanagan introduced the same bill entitled,

Relates to performance reviews of classroom teachers and building principals and the comprehensive review of education standards administered by the state department of education.

This bill would make amendments to the Education law:

• to provide a longer public comment period associated with APPR regulations,
• extend the period within which school districts must have an approved APPR plan,
• require SED to release test questions annually,
• require SED to take student characteristics into consideration when calculating teacher growth scores under 3012-d of the education law,
• establish a standardized test content review committee,
• reform the way in which the State Board of regents are selected and
• require the Commissioner to review the Common Core Learning Standards (CCLS).


Under current law the Commissioner of Education is required to submit and the Board of Regents adopt changes made to APPR by June 30th, 2015. This bill provides that the draft regulations shall be submitted prior to June 30th, 2015 and that a forty-five day public comment period shall immediately follow. No later than August 14th, 2015 the State Board of Regents are required to adopt the new regulations associated with 3012-d of the Education law. This will provide more opportunity for the public to weigh in on the actual draft regulations. Consistent with this proposal section two of the bill extends to December fifteenth, 2015 when school Districts are required to have in place an approved Annual Professional Performance Review Plan (APPR).

Section three of this bill intends to improve the learning process by requiring SED to release by June first each year test questions and corresponding correct answers from the most recent ELA and Math exams in grades three through eight back to the teachers in their respective classrooms. In addition, $8.4 million is appropriated to allow SED to create more exams so that in the future more test questions can be returned to teachers.

Section four provides funds for SED to release more standardized exam questions back to the classroom teachers to improve the overall learning process.

Section five of the bill places in education law, consistent with the Commissioner’s regulations, a requirement that SED ensure that specific student characteristics are factors in the calculation of teachers’ student growth scores. Under the Commissioner’s current regulations teachers receive adjustments for students that are English language learners, students in poverty status, students’ prior academic history, and students with disabilities.

Sections six and seven require SED to establish a content review committee to review standardized test items and/or selected passages for use on state exams in grades three through eight to ensure the tests are age appropriate and time appropriate. The content review committee shall include teachers and educational experts.

Section eight reforms the way in which the State Board of regents are selected.

Section nine requires the Commissioner to review the common core learning standards (CCLS). The review shall examine the implementation of the standards and the reasons for adopting such standards. In addition the Commissioner is directed to make recommendations on how to improve the standards if deemed necessary.

Parts of the bill clarify issues in the budget; assuring a 45-day comment period simply heads off possible legal action challenging the statute.

The section requiring the SED to release “the most recent ELA and Math exams … to teachers in their respective classrooms,” is clearly an attempt to pacify the bubbling revolt among parents and teachers. Whether the opt-out parents and the teacher union fade away or continue the assault remains to be decided. The seething anger among parents is concentrated in the higher income school districts in the suburbs; in many districts the majority of parents opted out. Can the opt-out movement be converted into an election movement? I doubt it: posting on a Facebook page is far from running a political campaign. New York State election law requires registering in a party or changing party registration a year prior to the primary: Are opt-out parents organized and politically sophisticated enough to convince thousands of opt-outs to register in the Republican or Democratic party and run a candidate in the primary?

Opting out challenges the state and interferes with calculations that impact school, school district and state accountability. I am unsure how it impacts teachers.

The section requiring “the Commissioner to review the common core learning standards” might be the first step to backing away from the chiseled in stone common core standards. Many critics have claimed the early grade standards are “developmentally inappropriate;” others challenge the substantial differences in the new CCSS Math standards. Superintendents, principals and teachers have sharply criticized the current exams; they are based on “standards,” not on a curriculum. How can you hold students and teachers accountable for topics that have not been taught because they’re not in the curriculum?

For teachers a vital section of the law requires, “specific student characteristics are factors in the calculation of teachers’ student growth scores.” The current algorithm has resulted in teachers of high-poverty students, students with disabilities and English language learners receiving lower HEDI grades than teachers of high income students – are teachers of high income students better teachers? Or, are the algorithms flawed? The new law appears to require a change in the formula.

The new Content Review Committee requires “teachers and educational experts” to serve on the committee.

The bill does change the method of selection of new Regents. The current law places the selection solely in the hands of the Democratic members of the Assembly; the bill involves the Senate in the process.

Why did the dems and the repubs agree? Who are the winners?

The big winner is Senate Ed committee chair John Flanagan.

As soon as Senate majority leader Skelos was arrested the Republican conference convened and swore undying fealty to the Lord of the Manor, and began sharpening their knives. At the end of the session in June Skelos will step down, and for the intervening six weeks or so John DeFrancisco, Catherine Young and John Flanagan, quietly, very quietly, will be seeking votes among the 33 Republican members of the Senate.

A very narrow constituency.

For years the Senate Republicans have boycotted the Regents elections; they now have a voice as a result of the Flanagan bill.

The union is not unhappy. Although they would have liked more than a one-month extension in the law to complete the local plans, the other issues are quite favorable: new calculation in the APPR law, teachers on the Content Review Committee, another look at the CCSS.

You could see a revision of the Common Core resulting in sharp increases in student test scores, or, sturm und drang signifying nothing.

Measuring Teachers by Measuring Students. Are Teacher VAM Scores So Discriminatory, Arbitrary and Capricious As to Constitute an Abuse of Discretion?

Are the regulations implementing Section 3012-d of the Education Law so discriminatory, arbitrary and capricious that they constitute an abuse of discretion, and, if so, what shall be the remedy?

Months, or perhaps a year, down the road the question supra will be posed to an arbitrator or a judge (generally referred to as an Article 78 proceeding).

If I were representing a teacher at the arbitration I would first explore the use of an outside evaluator. In most school districts there are about 160 days of actual instruction; five teaching periods a day equals 800 teaching periods per school year. Can one or two “ineffective” lesson observations determine that a teacher is “ineffective” for an entire school year?

For the many decades prior to the passage of the Annual Professional Performance Review (APPR) school-based supervisors were the sole deciders of teacher competence. Cycles of pre-observation, lesson observation and post-observation meetings would result in an observation report: a summary of the lesson, commentary on the quality of the lesson, perhaps using an agreed upon rubric, a conclusion and recommendations. Under the former law lessons were rated “satisfactory” or “unsatisfactory”; under the current system lessons are rated on the HEDI scale (Highly Effective, Effective, Developing, Ineffective).

Over a school year, commonly, there would be a cycle of observations. A satisfactory teacher may be observed once a year, new or probationary teachers many more times. The observations may be linked; the recommendations in a single observation may be referenced in subsequent observations. In addition to the formal observations the supervisor may be in the teacher’s classroom on a daily basis.

Under the new law, Section 3012-d, outside evaluator judgements count for 35% of the teacher observation section of the assessment and principal evaluations count for 15%.

35% of a teacher’s HEDI score would be determined by an outside evaluator who bases their judgement on observing just one quarter of one percent of a teacher’s practice (assuming two lesson observations out of roughly 800 teaching periods per school year).
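The arithmetic behind that fraction is easy to check. Below is a minimal sketch in Python, using only the figures cited above (about 160 instructional days, five teaching periods a day, two observed lessons); the constant and function names are my own, chosen for illustration, and the two-observation count is an assumption rather than anything fixed in the law.

```python
# Figures come from the post itself; names are illustrative, not official terms.
INSTRUCTION_DAYS = 160   # approximate days of actual instruction per year
PERIODS_PER_DAY = 5      # teaching periods per day
OBSERVED_LESSONS = 2     # assumed number of outside-evaluator observations


def observed_share(observed: int, days: int, periods_per_day: int) -> float:
    """Return the fraction of a year's teaching periods that were observed."""
    total_periods = days * periods_per_day
    return observed / total_periods


total = INSTRUCTION_DAYS * PERIODS_PER_DAY
share = observed_share(OBSERVED_LESSONS, INSTRUCTION_DAYS, PERIODS_PER_DAY)
print(f"{total} teaching periods; observed share = {share:.2%}")
```

Change OBSERVED_LESSONS to 1 or 4 and the share moves to about an eighth or a half of one percent; under any plausible count the evaluator sees well under one percent of the year.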

“Mr. Evaluator, would you agree that in the week before and after you observed the teacher the teacher could have taught a highly effective lesson?”

Of course, in the real world, it is highly unlikely that a school district would seek to discharge a teacher if the outside evaluator and the principal differed in their observation judgements.

The other side of the matrix is student performance measured by a Value-Added Assessment Model – usually referred to as a VAM. The Regents/State Education Department is in the process of developing scoring bands, determining teacher HEDI scores, essentially cut scores.

The American Statistical Association issued a report critical of the use of VAMs for high-stakes decisions.

As the largest organization in the United States representing statisticians and related professionals, the American Statistical Association (ASA) is making this statement to provide guidance, given current knowledge and experience, as to what can and cannot reasonably be expected from the use of VAMs. This statement focuses on the use of VAMs for assessing teachers’ performance but the issues discussed here also apply to their use for school accountability. The statement is not intended to be prescriptive. Rather, it is intended to enhance general understanding of the strengths and limitations of the results generated by VAMs and thereby encourage the informed use of these results.

Many states and school districts have adopted Value-Added Models (VAMs) as part of educational accountability systems. The goal of these models, which are also referred to as Value-Added Assessment (VAA) Models, is to estimate effects of individual teachers or schools on student achievement while accounting for differences in student background. VAMs are increasingly promoted or mandated as a component in high-stakes decisions such as determining compensation, evaluating and ranking teachers, hiring or dismissing teachers, awarding tenure, or closing schools.

The American Statistical Association (ASA) makes the following recommendations regarding the use of VAMs.

* The ASA endorses the wise use of data, statistical models, and designed experiments for improving the quality of education.

* VAMs are complex statistical models, and high-level statistical expertise is needed to develop the models and interpret the results.

* Estimates from VAMs should always be accompanied by measurements of precision and a discussion of the assumptions and possible limitations of the model. These limitations are particularly relevant if VAMs are used for high-stakes purposes.

* VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward student outcomes.

* VAMs typically measure correlation, not causation: Effects – positive and negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.

* Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.

* VAMs should be viewed within the context of quality improvement which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM score can have unintended consequences that reduce quality.

Read the entire report here.

The Washington Post slams the use of VAM data here.

A month after the release of the American Statistical Association (ASA) report Chetty, Friedman and Rockoff responded to the report in a paper entitled, “Discussion of the American Statistical Association’s Statement (2014) on Using VAMs for Education Assessment.” The response acknowledges the concerns voiced in the ASA report, and defends their own research. The response fails to address whether VAMs should be used for high-stakes decision-making, namely, firing teachers.

On Thursday, May 7th the Regents/State Education Department will host a summit in Albany, an all-day series of panels, by invitation only: superintendents, school boards, parent and teacher associations and an “expert” panel. How do you define expert? The panel includes experts in economics, sociology and education policy, not in statistics (Read Chalkbeat article here).

Tom Kane, one of the panelists and the author of the richly funded MET Report, suggests that student performance data should account for 33-50% of an individual teacher assessment; the report, however, is extremely light on evidence.

Howard Wainer, a statistician with a long resume, challenges the misuse of data to determine education policies in “Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies” (Princeton University Press, 2011); also listen to the interview on the site.

Watch a superb Wainer presentation at NYU discussing the differences between data and evidence – (Wainer presentation begins at minute 8).

To return to the original question: does the use of VAM meet the DAC (discriminatory, arbitrary or capricious) standard?

In my view the answer is clearly, “no.”

There are lots of anecdotes, and, while anecdotes are data they are not evidence. The public believes that we should rid the profession of “bad teachers” and looking at student outcomes seems to make sense.

In an oft quoted remark, “[A governor] asked his legislature for enough money to give a cassette or CD of classical music to every newborn child in the state. The governor cited scientific evidence to support this unusual budget request. ‘There’s even a study,’ he declared in his State of the State address, ‘that showed that after college students listened to a Mozart piano sonata for ten minutes, their IQ scores increased by nine points’.”

The only problem is the widely repeated assumption, called the Mozart Effect, is not supported by evidence.

The highly controversial 1994 “Bell Curve” book on intelligence is an example.

People often react most defensively when challenged not on their firmly held beliefs but on beliefs they wish were true but suspect at some level to be false.

If we could differentiate among teachers based on student test scores we would be living in Lake Wobegon, that wonderful National Public Radio village where “All Children are Above Average.”

I suspect that school districts will avoid charging teachers with “ineffectiveness” in situations where the sole determinant is student test scores; neither the state nor school districts want to defend what is essentially the governor’s deeply flawed law.

The governor, following in the footsteps of Mayor Bloomberg, would simply attack the arbitrator or the judge who rejects his law.

The most effective determinant of teacher quality is experienced principals and superintendents, as well as teacher peers, utilizing all sources of evidence.

Data alone should not rule, evidence should rule.