Why Did New York State Test Scores Jump? Better Instruction? Untimed Tests? All the Kids Got Smarter, or, Shenanigans?

If you want to bury a news story, you issue the press release on a Friday afternoon. If you want as much mileage as possible, you issue the release on a Tuesday morning, follow it with a press conference, in person and online, then with laudatory speeches across the state, and try to maximize the time the story garners headlines and clicks.

The State Education Department released the 2016 grades 3-8 ELA and Math scores on Friday afternoon with an odd presser. The test scores are up, way up; why is the SED ashamed?

You can take a deep dive into the New York City scores here: http://schools.nyc.gov/Accountability/data/TestResults/ELAandMathTestResults

The SED analysis of the state scores, with many disaggregated charts, is here and here.

The Commissioner was careful not to publicly laud the increase in the scores:

But rather than celebrate the largest bump since New York adopted new tests tied to the Common Core Learning Standards, education officials reported the increases with caution. They suggested that changes in how the tests were given – not actual improvement by schools and students – may have accounted for the gains.

State Education Commissioner MaryEllen Elia also warned against making comparisons with previous years, which is typically done to evaluate schools and teachers.

“It’s not an apples to apples comparison and should be viewed in that context,” Elia said during a news conference when the results were released Friday.

Data wonks who want to parse the results can check out the files here and here.

The SED states, “… changes in how the tests were given – not actual improvement by schools and students – may have accounted for the gains”; however, a deeper analysis is necessary.

If the increases are due to fewer questions and untimed tests, we should know. If both teachers and kids have been exposed to more effective Common Core instruction and better professional development, we should know. And if the SED, as some suspect, manipulated the process, we should know. All of the kids in New York State getting smarter just doesn’t seem credible.

Under Commissioner Mills, test scores increased year after year. When Chancellor Tisch and new Commissioner Steiner took over, they asked a Harvard professor, Daniel Koretz, to take a look, and sure enough, the SED had been using many of the same questions year after year. Whether this was incompetence or, more likely, a method of increasing scores, we’ll never know. Scandals in Atlanta and accusations elsewhere have cast doubt on the entire testing regimen. Jumps in test scores are treated with skepticism.

For years Howard T. Everson chaired the Regents Technical Advisory Committee (TAC) and was sharply critical of test score inflation.

But given all the flaws of the test, said Prof. Howard T. Everson of the City University of New York’s Center for Advanced Study in Education, it is hard to tell what those rising scores really meant.

“Teachers began to know what was going to be on the tests,” said Professor Everson, who was a member of a state testing advisory panel and who warned the state in 2008 that it might have a problem with score inflation. “Then you have to wonder, and folks like me wonder, is that real learning or not?”

Each year after the release of the state test scores, the TAC issued a lengthy analysis of the quality of the test. Recently the TAC process has changed. As I understand the current process, the TAC report goes to the test creator, Pearson (now replaced by Questar), who vets the report. Over the last few years the report was released a year after the test and was so heavily “massaged” it was meaningless.

The SED/Regents should, in the footsteps of Tisch and Steiner, immediately ask Everson or Koretz or a colleague with equally impeccable credentials to examine the current state test results.

If, in fact, the Commissioner doesn’t know why scores jumped, we have to ask: why not? If untimed tests resulted in higher test scores, shouldn’t Regents Exams be untimed? If the increased exposure to better Common Core instruction resulted in higher scores, why are the Algebra 1 and Geometry scores not increasing?

Shrugging and simply saying we’re happy with increased scores but clueless as to why is not acceptable. Data should influence policy at all levels, and we have to be confident that the testing regimen is credible.

4 responses to “Why Did New York State Test Scores Jump? Better Instruction? Untimed Tests? All the Kids Got Smarter, or, Shenanigans?”

  1. Zina Burton-Myrick

    Teachers are being asked daily to produce data, sit on inquiry teams, go to PD or use tracking systems to monitor student growth. If nothing is being done with this data after testing has occurred to show the impact on student growth or teacher effectiveness, then why bother? Are we mixing apples and oranges, comparing apples to apples, or is this just pure bananas?


  2. I prefer to discount the “shenanigans theory” in favor of saying that part and parcel of the improvement is owed to better modes of instructional delivery. That being said, however, the test engineers did shorten the test, and lifting the time constraints had to be a big help. Add these considerations to my own personal belief that these tests in and of themselves should never be the sole criterion for measuring pupil intelligence and comprehension of curricula, and I feel that the modifications made this year are a step toward truly getting to the point where more valid conclusions about pupil comprehension of curricula can be drawn.
    From time immemorial we have all seen how some students do horribly on math exams and by contrast perform satisfactorily on English, Science or Social Studies exams. The reason for this is in no way associated with student intelligence or comprehension of curricula. A long time ago, when I was getting started in education, I was living with my mom, who was legally deaf but could read lips. On one particular evening, I was sitting at the dining room table, grading some social studies exam papers. As each paper was given a 40% or 30% grade, I began talking to myself and getting very agitated. My mom read my body language and asked me what the problem was. I told her that I couldn’t understand why paper after paper was a failing one. I explained that I had given pre-tests, review tests and take-home tests, and as recently as the day before had orally drilled the class on the test material. My mom asked me, “Do you have good control? Do they listen to you?” I assured her that I did. She then bent over the table, selected three different failing test papers, and told me that I should call these students over one by one, in private, and ask them to answer the questions that they got wrong. When I started to protest, she said, “Just do it!” Can’t argue with mom, so the next day, while my class was at gym, I asked the Phys Ed teacher if I could see a couple of my students for a few moments, and he approved. When the first student came over, I asked him one of the questions marked wrong on his test paper, and he verbalized the correct answer. I repeated the process several times, and the flow of correct answers continued. I repeated the exercise with the other two students whose papers my mom had selected, and the results were exactly the same. That night, I asked her, “How did you know?” She said that she didn’t know until she asked me if I had good control, and when I said I did, then she knew.
    She explained to me that her belief was that my students in this case lacked writing skills and literacy skills. But because they paid attention, their auditory skills were heightened and therefore were sharper than their other senses. Going forward, I developed a system of giving grades as fractions or slashes, so that a student could receive a grade of 80/40. I also sent home a letter to parents explaining how to interpret the grade. Of course, I didn’t inform the Principal of this communication, not because of any intended effort to disrespect him, but more so owing to my own ignorance, as a novice teacher, of protocol. In the final analysis, I have always held that test assessment is essential in designing models for instructional delivery. However, I have also held that, without a provision for oral inquiry, test assessments as they are presently constructed are lacking.


  3. The first question that occurs to me is how tests which have a single correct answer are expected to measure the effectiveness of Common Core instruction, which, at its root, is expected to help students become critical thinkers. Critical thinking leads students to look at test questions very differently than students instructed in a rote, drill-and-kill fashion. Knowing history means more than just knowing the names and dates of all the presidents. It means understanding the issues that affected the country during their tenure, how and why they dealt with them as they did, and how that is still resonating today.
    The second issue is that shortening a test decreases its reliability. This is just arithmetic, but decreased reliability can explain the increase in scores.
    Ken raises a third issue, and one that I agree is central to how we assess students. Different students will have different ways of expressing what they know, and reliance on highly verbal tests, where many questions use nuanced language to discriminate among students, penalizes many students. When the DOE started urging teachers to dig deep into the data, I participated in a demonstration of the program. Choosing a question at random, I found that nearly a quarter of the class chose each of the available choices. Looking carefully at the question, I found that all four choices were correct, but the question asked for the “best answer.” This meant that only students with the verbal skills to determine which of four correct choices was the “best,” meaning the most comprehensive, could get the answer correct. The lesson for teachers from this was to do more test prep.
    I have argued elsewhere and in comments on this blog for a different approach to evaluating students (and by implication teachers), one that uses growth portfolios where students enter work that shows what they have learned during the year. I think this is the real solution to the problems referred to in this post and hope the Regents/SED will begin to look outside the testing industry for solutions.


  4. Pingback: Why Did the State Test Scores Jump? What Do You Think? Join the Discussion | Ed In The Apple
