Why Did the State Test Scores Jump? What Do You Think? Join the Discussion

A few days ago I mused about the sharp jump in test scores across the state, especially in New York City (Read here).

Leonie Haimson, in a Daily News article, avers that State Ed simply changed the cut scores to inflate the overall results:

“It’s unconscionable that the state should put out numbers to show big improvement where none seems to exist…. There is enough evidence to put these big jumps into proficiency into a lot of doubt.”

And a college professor has grave doubts:

“The increases are illusory,” said David Bloomfield, a Brooklyn College and CUNY Graduate Center education professor, citing the changes in the test and scoring.

Columbia professor Aaron Pallas says we really can’t tell why the scores spiked, and a number of frequent commenters on my blog suggest possible reasons. What do you think? Extended time? Better teaching? Did the large jump in charter school scores bump up the overall scores? Write a comment.

Frequent commenters Ken and Marc muse:

 Ken Karcinell

“I prefer to discount the ‘shenanigans theory’ in favor of saying that part and parcel of the improvement is owed to better modes of instructional delivery. That being said, the test engineers did shorten the test, and lifting the time constraints had to be a big help. Add these considerations to my own personal belief that these tests in and of themselves should never be the sole criteria for measuring pupil intelligence and comprehension of curricula, and I feel that the modifications made this year are a step toward truly getting to the point where more valid conclusions about pupil comprehension of curricula can be drawn.
From time immemorial we have all seen how some students do horribly on math exams and by contrast perform satisfactorily on English, science or social studies exams. The reason for this is in no way associated with measuring student intelligence or comprehension of curricula. A long time ago, when I was getting started in education, I was living with my mom, who was legally deaf but could read lips. One particular evening, I was sitting at the dining room table, grading some social studies exam papers.

As each paper was given a 40% or 30% grade I began talking to myself and getting very agitated. My mom read my body language and asked me what the problem was. I told her that I couldn’t understand why paper after paper was a failing one. I explained that I had given pre-tests, review tests and take-home tests, and as recently as the day before had orally drilled the class on the test material. My mom asked me, do you have good control? Do they listen to you? I assured her that I did. She then bent over the table, selected three different failing test papers, and told me that I should call these students over one by one, in an isolated one-on-one setting, and ask them to answer the questions that they got wrong.

When I started to protest, she said, “Just do it!” Can’t argue with mom, so the next day while my class was at gym, I asked the phys ed teacher if I could see a couple of my students for a few moments, and he approved. When the first student came over, I asked him one of the questions marked wrong on his test paper, and he verbalized the correct answer. As I repeated the process, the flow of correct answers continued. I repeated the exercise with the other two students whose papers my mom had selected, and the results were exactly the same. That night, I asked her, how did you know? She said that she didn’t know until she asked me whether I had good control; when I said I did, then she knew. She explained that her belief was that my students in this case lacked writing and literacy skills. But because they paid attention, their auditory skills were heightened and therefore sharper than their other senses. Going forward, I developed a system of giving grades as fractions, or slashes, so that a student could receive a grade of 80/40. I also sent home a letter to parents explaining how to interpret the grade. Of course, I didn’t inform the principal of this communication, not because of any intended effort to disrespect him, but owing to my own ignorance of protocol as a novice teacher. In the final analysis, I have always held that test assessment is essential in designing models for instructional delivery. However, I have also held that without a provision for oral inquiry, test assessments as they are presently constructed are lacking.”

Marc Korashan

“The first question that occurs to me is how tests which have a single correct answer are expected to measure the effectiveness of Common Core instruction, which, at its root, is expected to help students become critical thinkers. Critical thinking leads students to look at test questions very differently than students instructed in rote, drill-and-kill fashion. Knowing history means more than just knowing the names and dates of all the presidents. It means understanding the issues that affected the country during their tenures, how and why they dealt with them as they did, and how that still resonates today.
The second issue is that shortening a test decreases its reliability. This is just arithmetic, but decreased reliability can explain the increase in scores.
Ken raises a third issue, one that I agree is central to how we assess students. Different students will have different ways of expressing what they know, and reliance on highly verbal tests, where many questions use nuanced language to discriminate among students, penalizes many of them. When the Department started urging teachers to dig deep into the data, I participated in a demonstration of the program. Choosing a question at random, I found that nearly a quarter of the class chose each of the available choices. Looking carefully at the question, I found that all four choices were correct, but the question asked for the “best answer.” This meant that only students with the verbal skills to determine which of four correct choices was the “best,” meaning the most comprehensive, could get the answer correct. The lesson teachers drew from this was to do more test prep.
I have argued elsewhere and in comments on this blog for a different approach to evaluating students (and by implication teachers), one that uses growth portfolios where students enter work that shows what they have learned during the year. I think this is the real solution to the problems referred to in this post and hope the Regents/SED will begin to look outside the testing industry for solutions.”
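Marc’s point that shortening a test lowers its reliability really is just arithmetic; the standard psychometric result behind it is the Spearman-Brown prophecy formula. Here is a quick sketch in Python, with purely illustrative numbers (not the actual state-test figures):

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability when a test is lengthened or shortened.

    length_ratio is the new length divided by the old length
    (e.g. 0.5 means the test was cut to half as many items).
    """
    k = length_ratio
    return k * reliability / (1 + (k - 1) * reliability)

# Illustrative only: a test with reliability 0.90, cut to half its length.
print(round(spearman_brown(0.90, 0.5), 2))  # prints 0.82
```

So a hypothetical shortened test measures each student less precisely, which widens the spread of scores around each student’s true ability; that alone can move more students across a fixed cut score without any real change in learning.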


What do you think? Join the discussion


3 responses to “Why Did the State Test Scores Jump? What Do You Think? Join the Discussion”

  1. Scores on tests over time always follow a sawtooth pattern: they increase for a while, then drop suddenly, then increase and drop again. The drops coincide with the introduction of new test forms. The gains coincide with growing familiarity with the old forms as items leak out into the population.

    This takes place in K-12 tests and everywhere else. We certainly see the same pattern in such specialized tests as medical licensing exams. And yes, strenuous efforts are made to counteract this so that scores can be compared across administrations, but the best that can be managed is to minimize it, not eliminate it. And now, with the ubiquitous internet, the sharing of items is instantaneous. Someone takes the test and posts some of the items they remember, in repayment to an earlier examinee who did the same and helped them. For tests with serious consequences for individuals, coaching schools steal items on a more industrial scale.

    The option of changing forms more frequently is only economically feasible for expensive tests, but even then, when tests are administered continuously (as they must be when computer administered), there are still leakages of items and the consequent upward score creep.

    What this means is that small changes in scores should not be interpreted (on average) as having much importance.

    I don’t know the character of the NY tests (how many items are new, how equating is done, how standard setting is done), but a uniform increase is unlikely to be caused by a sudden improvement in teaching effectiveness or an increase in student ability.


  2. My money is on the arbitrary norms. That is the most obvious explanation for fluctuations from year to year. The score that counts as “passing” changes, as does the test itself, so results cannot be consistent from year to year.


  3. Though I agree with the other comments, could it simply be because this year the tests were not timed? The tests were also shorter and supposedly had no field-test questions on them. Field-test questions often throw students off because they tend to be more challenging; once thrown off, students perform poorly on the questions that follow.

