How Would We Assess Student Progress Without Standardized Tests?

In a recent blog post Diane Ravitch wrote,

After twenty years of trying, we should have learned by now that what matters most is having expert professional teachers and giving them the autonomy to do their job without interference by the governor or legislature.

And Diane points to Finland as the model,

My favorite model remains Finland, where schools are free of standardized testing, teachers are highly educated, teaching is a high-status profession, and politicians and think tanks don’t have the nerve to tell teachers how to teach.

Without getting into a detailed “back and forth,” OECD data differentiate sharply among nations. Some data for Finland and the United States:

* poverty rate: Finland has the fourth lowest poverty rate; the US has the 30th highest, and we only beat out Israel.

* income inequality: Finland is the least inequitable; we only beat out Mexico.

Comparing high wealth schools with high poverty schools is as meaningless as comparing Finland to the United States. If we want to be compared to Finland we should sharply reduce the poverty and inequality gaps within the United States.

Let’s get back to the question of assessing student performance: if our goal is providing the best education, we have to define what we mean by the “best education.” If teaching a student to be literate and numerate is the “best education” we have to set benchmarks and some method of measuring if students are reaching benchmarks.

We currently use what are called “standardized” tests, meaning all kids in the state take the same tests. The grades 3-8 tests are required by federal statute, as are exit exams in high school; in New York State, these are the Regents Exams.

When New York State precipitously adopted the Common Core State Standards and Common Core tests, proficiency rates on the tests moved from two-thirds proficient to two-thirds not proficient, angering parents and creating the opt-out movement.

About 20% of parents opt their kids out of the grades 3-8 exams; the opt-outs are concentrated in high-wealth school districts (meaning folks pay high property taxes) in the suburbs and in high-achieving schools in New York City.

Tests are not new. Prior to No Child Left Behind (NCLB) we tested kids in grades four and eight, and New York City has a long history of testing: local school districts gave tests to monitor student progress, along with citywide tests. Regents exams have been around since the 1880s.

The difference is that tests are now used to assess teacher, principal and school performance, and the results are accountability-based, meaning possible school closings and teacher ratings. The state plan under the new Every Student Succeeds Act (ESSA) may, we’ll find out in a few weeks, include “growth” as well as “proficiency,” and perhaps an “equity” measure.

If we ditch tests, it is unlikely we can move to the Finland system: Finland is a nation with very low childhood poverty and among the lowest income inequality of the OECD (Organisation for Economic Co-operation and Development) nations.

There are other tools that are currently being used to assess student progress.

A number of school districts in California are utilizing performance tasks developed by SCALE, a Stanford-based program that has developed a bank of performance assessments,

Unlike multiple-choice “bubble” tests, performance assessments require students to construct an original response rather than simply recognize a correct answer. The Performance Assessment Resource Bank includes high-quality tasks that engage students in multiple-step and extended performances, such as researching and developing mathematical models to write an article on the rising cost of college tuition. As tasks become more complex and require greater student direction they assess more complex and integrated aspects of learning and require the planning, problem-solving, and persistence that are necessary for success in the real world. This means that the use of performance assessment can both measure and encourage the development of many of the 21st century skills—critical thinking, inquiry, communication, collaboration—that are essential for success in college, career, and life.

See an example of a 9th grade Social Studies performance task/assessment here.

The New York City-based Performance-Based Assessment Consortium (PBAC), currently 39 high schools, has been receiving waivers from the NYS commissioner; students utilize portfolio/roundtable assessment procedures in lieu of three Regents exams (they still take the mathematics and English Regents exams). The State Department of Education has been granting waivers for a cohort of PBAC schools since the nineties. The current waiver expires at the end of this school year. Check out the PBAC site here.

In the nineties Vermont made a statewide attempt to replace standardized tests with a portfolio system; after a number of years Vermont abandoned the initiative – an external report, authored by Harvard scholar Daniel Koretz and others, found inter-rater reliability was absent.

In 2004 Jay Mathews at Education Next explored a number of authentic assessments of student work as alternatives to testing, and had doubts,

Lisa Graham Keegan, chief executive officer of the Washington-based Education Leaders Council, said she thinks portfolios can help teachers assess their students’ progress, but are not a good tool for determining how a school or a district is doing. She remembers a visit to a northern Arizona school where “the writing teacher was showing me a portfolio of a student’s work in which the student was writing about kamikaze pilots during World War II.” Keegan was state school superintendent for Arizona at the time and saw that “the essay was horribly written, with glaring spelling and grammatical errors, and yet had received a score of 23 out of 25 points.

“The teacher was just glowing with what a mature and moving topic the student had chosen without any direction from her. I was less impressed and said so–something along the lines of how I could appreciate that the student had something interesting to say, but my first impression was that he didn’t know how to say it–and wasn’t that the first order task for the teacher?”

Having students display their personal strengths is fine, Keegan said, as long as they still learn to read, write, and do math capably before they graduate. “A collection of student work can be incredibly valuable,” she said, “but it cannot replace an objective and systematic diagnostic program. Hopefully, we will come to a place where we incorporate both.”

Daniel Koretz and others raise questions about quality control in performance assessments,

 … direct assessments of complex performance do not typically generalize from one task to another and thus require careful sampling of tasks to secure an acceptable degree of score reliability and validity for most uses. These observations suggest the pressing need for greater quality control in the design and execution of performance assessments. If such assessments are to have lasting effects on instruction and learning, then their technical properties must be understood and appreciated by developer and practitioner alike.

A more recent report explores these questions: the Center for Educator Compensation Reform’s “Measuring and Promoting Inter-Rater Agreement of Teacher and Principal Performance Ratings” (February 2012) is a comprehensive look.
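For readers who want to see what “inter-rater agreement” means in practice, here is a minimal sketch, not taken from the Vermont or CECR reports. It assumes two hypothetical raters scoring the same eight portfolios on a 1-4 rubric; the scores are made up for illustration. It computes simple percent agreement and Cohen's kappa, a common statistic that discounts the agreement two raters would reach by chance.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of portfolios on which both raters gave the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same score at random,
    # given how often each rater uses each score.
    expected = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)

# Hypothetical rubric scores (1-4) for the same eight portfolios.
scores_a = [4, 3, 2, 4, 1, 3, 2, 4]
scores_b = [4, 2, 2, 3, 1, 3, 3, 4]

agreement = percent_agreement(scores_a, scores_b)
kappa = cohens_kappa(scores_a, scores_b)
print(f"percent agreement: {agreement:.3f}")
print(f"Cohen's kappa:     {kappa:.3f}")
```

In this made-up example the raters agree on five of eight portfolios, but kappa comes out noticeably lower than the raw agreement rate, which is the kind of gap that makes portfolio scores hard to use for accountability.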

Moving from testing to performance tasks/assessments and portfolios will be challenging; however, now is the time for New York State to begin to move forward.

I suggest a number of pilots, maybe in high opt-out schools, a few in New York City, others in suburban school districts.

For example, a number of schools in New York City are high-achieving, high opt-out schools, perhaps candidates for pilots. On Long Island and in a few other suburban districts, high opt-out schools/school districts might be candidates for district pilots.

Pilots must be partnerships with teacher unions and higher education institutions; moving to performance tasks and/or portfolios is a major instructional shift and will require both buy-in and an enormous dose of support. New Hampshire, the major example of a state that is moving towards performance tasks, is hugely invested in supporting the folks on the front lines – classroom teachers. Read a description of the New Hampshire efforts here.

We should not tarry.

There is an absence of leadership at the US Education Department: ironically, a good thing. Previous Washington administrations (Arne Duncan, John King) were intrusive; they attempted to drive their views of education down to the classroom level. The current administration clearly has no interest in teaching and learning; they are concerned with choice, i.e., charters and vouchers.

As soon as the ESSA plan is submitted in September, the state should begin the process of creating pilot schools and school districts, exploring the complexities of moving away from standardized tests to a system of performance tasks and portfolios. We don’t need a statewide system at this point; let’s begin the process. Down the road we may have a system in which some schools/school districts stay with standardized testing while others move to other assessment systems.

There are times when not being first, waiting and seeing how initiatives work out, makes sense; other times being out front allows you to set the rules. Vermont and New Hampshire are well along the path, although both are far different from New York State. A window has opened: teacher unions and some schools/school districts are ready to move away from tests. It will be a complex task, very complex: let’s get started.

Teaching Academic Tenacity: Why the SAT, Pearson and PARCC tests Are Poor Predictors of College/Career Readiness and Why Non-Cognitive Skills Trump Faulty Exams.

We are obsessed with judging teacher quality by measuring student achievement. To make it even more complex we are measuring student achievement by a brand new yardstick, the Common Core State Standards.

Parents, educators and the New York State governor are confused. Two-thirds of students scored “below proficient” on the latest tests, which the State Education Department now defines as “approaching proficiency” (smile), while half of all teachers scored “highly effective” and less than 1% scored “ineffective” on the extremely complex APPR teacher evaluation metric.

The governor asks: if two-thirds of kids are failing state tests, how can teachers score so highly on the teacher evaluation tool? How can principals give teachers high grades on the 60% lesson assessment section of the teacher evaluation tool when so many kids are doing so poorly on the tests?

Unfortunately we are using the wrong tools to measure the wrong outcomes.

We base a range of decisions on a test, a few hours of bubbling in answers and writing an essay; however the SAT and the ACT, which also use bubble sheets and essays, are poor predictors of college success. The best predictor is standing in class as measured by the student’s GPA. It should not be surprising; the GPA is determined by numerous tests over four years of high school reflecting the judgment of many teachers.

The largest study of students at colleges that do not require SAT or ACT scores has found that there is “virtually no difference” in the academic performance (measured in grades or graduation rates) of those who do and don’t submit scores.

The study — involving 123,000 students at 33 colleges and universities of varying types — found that high school grades do predict student success. And this extends to those who do better or worse than expected on standardized exams. So those students with low high school grades but high test scores generally receive low college grades, while those with high grades in high school, but low test scores, generally receive high grades in college.

This is not an isolated finding; a 2005 study explains,

… researchers examined differences in the predictive strength of high school grades and standardized test scores for student involvement, academic achievement, retention, and satisfaction. Findings indicate that high school grades are stronger predictors of success than standardized test scores for both racial and religious minority students.

Another study, by the Council for Aid to Education and NYU, supports the findings of the research above.

In spite of the evidence that the SAT does not achieve its purposes, the folks at the College Board are rolling out a new exam in the spring of 2016, a test that reflects the Common Core standard competencies; at the same time, more and more colleges are abandoning the SAT.

If tests, be they the SAT, the Pearson-produced grades 3-8 state tests or the PARCC exams, are not accurate predictors of college success or of teacher competence, how should we assess teacher performance and student achievement?

The answer may be in a Gates-funded study, Academic Tenacity: Mindsets and Skills that Promote Long-Term Learning (Carol Dweck and others, Stanford University). The introduction is exceptionally important,

In a nationwide survey of high school dropouts, 69% said that school had not motivated or inspired them to work hard. In fact, many of the students who remain in school are not motivated or inspired either, and the more time students spend in K-12 education the worse it gets. What prevents students from working hard in school? Is it something about them, or is it something about school? Is there a solution to this problem?

Most education reform focuses on curriculum and pedagogy – what material is taught and how is it taught? However, curriculum and pedagogy have often been narrowly defined as the academic content and students’ intellectual processing of that material. Research shows that this is insufficient. In our pursuit of education reform, something has been missing: the psychology of the student. Psychological factors – often called motivational or non-cognitive factors – can matter even more than cognitive factors for student academic performance …

Academic tenacity is about the mindsets and skills that allow students to:

* Look beyond short-term concerns to higher order goals, and

* Withstand challenges and setbacks to persevere towards these goals.

Dweck and her co-authors make it clear: it’s not the “right” curriculum or the “right” pedagogy; there are many paths to the same ends. The “solution” is not the Common Core, and the “solution” is not in the Charlotte Danielson frameworks. Without a teaching/learning environment that supports Academic Tenacity, too many students, too many high-poverty students and students of color, will be left behind.

The authors specifically identify “key characteristics and behaviors” that can be defined and taught,

Key Characteristics and Behaviors of Academically Tenacious Students

* Belong academically and socially
* See school as relevant to their future
* Work hard and postpone immediate pleasures
* Not derailed by intellectual and social difficulties
* Seek out challenges
* Remain engaged over the long haul

Scientific American affirms the research findings and links to a range of research findings (Check out here)

For academic achievement, ability is not enough. What’s also needed are mindsets and strategies for overcoming obstacles, staying on task, and learning and growing over the long-term … academic tenacity is not about being smart, but learning smart.

I was visiting a middle school in one of the poorest neighborhoods in the city, a neighborhood at the top of the list for handgun violence and homicides. As I walked toward the office a student “introduced” himself: “My name is xx, can I help you?” Each classroom displayed the banner of a college, and each advisory room was named for a college. No one was yelling at kids; when a student was talking loudly, a teacher simply put his finger to his lips. The school leader took me into a classroom and asked, “What are we learning today?” The kids all raised their hands, anxious to tell me all about the lesson.

The middle school downstairs was chaos.

Danielson frameworks are a guide and set a standard; however, teachers in screened schools or schools with more middle-class students are far more likely to reach the “highly effective” category, as evidenced by the teacher grades on the APPR, the state teacher evaluation metric.

Challenging content, rigorous curriculum and pedagogy combined with the teaching skills that promote academic tenacity is the path to creating successful schools and college and/or career ready students.

Are schools of education and school-based professional development emphasizing the teaching of Academic Tenacity? I fear not. Hopefully research will trump the current faulty teaching and learning trends.

Do Colleges Adequately Prepare Prospective Teachers? Should Colleges Be Responsible for Teacher Performance on the Job? Should the Feds Set Standards for Schools of Education? Why Aren’t Teachers Held in Higher Esteem?

The teachers were complaining about too much testing, impossibly complex tests, oppressive supervision, unhappiness that is commonplace, and pointed to Finland.

“Why don’t we adopt the same policies as in Finland, no standardized tests, a national curriculum and wide latitude in teaching?”

I wondered whether the teachers understood that if we adopted the same rules as in Finland few if any of the teachers would have a job. Fewer than 10% of applicants are accepted into Finnish schools of education (read “The Secret to Finland’s Success: Educating Teachers”). Unfortunately the “best and the brightest” tend not to choose teaching as a career; in fact, teaching attracts mediocre candidates,

… in 2007, among high school seniors who took the SAT and intended to major in education, the average scores were a dismal 480 in Critical Reading, 483 in Mathematics, and 476 in Writing.

In Finland teaching is a highly regarded career. Salary is not extraordinary; the nation simply holds the position of teacher in high esteem. Not so in the good old USA.

In most public systems, no matter how well they instruct, no matter how creative or inspiring they are, no matter how much their students are learning, they all get paid a structured and uniform salary. They also have little room to advance their careers unless they become desk rats in the Land of Red Tape and Bureaucracy. What smart, ambitious person wants that?

In spite of the low status, the mediocre salaries and the absence of chances for advancement, for the last decade students have flocked to schools of education. At State and City Universities, at prestigious schools with $50,000-plus tuition, and at private colleges, education programs are graduating substantial numbers of students, most of whom haven’t found jobs (in NYS only 20% of elementary school certified teachers have found jobs three years after graduation).

The feds have decided to encourage/force states and colleges to up the ante, to set higher standards for prospective teachers and higher certification standards for states.

Education Week reports,

The Obama Administration will release draft accountability rules for the nation’s teacher-preparation programs this summer. Among other things, they would require states to improve their procedures for identifying strong and weak teacher-preparation programs, and would likely bar the worst from offering federal TEACH financial-aid grants.

Duncan did … mention[ed] that factors like teacher-placement rates, retention rates, gauges of alumni satisfaction, and measures of student learning would likely be part of the mix.

“We go into this very humbly and look forward to getting lots of feedback from the public,” Duncan said, but added that he doesn’t have much patience for naysayers. There’s been “such a lack of transparency, so much opaqueness [in teacher prep] I don’t think anyone can or should defend the status quo. Anyone who thinks what we’re doing is good enough, that to me is a real stretch.”

New York State, as part of its Race to the Top grant, agreed to upgrade the teacher certification standards in the state. Over the years the State Education Department (SED) has approved scores of college-based teacher education programs around the state – the college “certifies” that the candidate completes the requisite courses and the student completes two exams – about 98% of the applicants pass the exams. Last year the SED moved to a different battery of exams, the edTPA. The Stanford-created exam is a complex combination of a video and two written exams within a tightly controlled environment (see description here).

The edTPA process has been sharply criticized by college faculties,

… the Vice President for Academics at UUP … notes “The edTPA is certainly separate from Common Core in many ways but I think the connection is that we see the same inappropriate rollout with the edTPA as we saw with the Common Core. So just as the Common Core was rushed in without adequate input from teachers in the k through 12 world, the same thing has happened with the edTPA. It was rushed in. Input from our educators at our colleges and universities was not taken seriously at State Ed, and so we have another debacle here.”

… UUP supports positive change and improving standards but teacher-educators and teachers need to be involved in decision-making in a more substantive way than the education department has drawn on them to date.

Bills have been introduced in the NYS Assembly to delay the implementation of the changes for one year, and the Board of Regents will be considering a delay in implementation at the April 28th meeting.

Current federal regulations require states to identify the lowest achieving schools in the state and take corrective action, ranging from designing a corrective action plan, removing the principal or changing half the faculty to closing the school or converting it to a charter. These “persistently lowest achieving” (PLA) schools are almost always in the poorest zip codes. Will colleges that prepare students to teach in schools with populations of poor students be at risk?

The feds intend to impose the same sanctions on schools of education. SED is currently “tracking” graduates into schools by their scores on the APPR (teacher evaluation). Every teacher in NYS receives a numerical grade (60% supervisory assessment, 20% student test scores, 20% “locally negotiated” metric) which converts to categories: highly effective, effective, developing and ineffective. Last year 1% of teachers were rated “ineffective” and 6% “developing.” Colleges with higher percentages of “ineffective” and/or “developing” teachers will be subject to sanctions.
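To make the 60/20/20 arithmetic concrete, here is a rough sketch of how such a composite could be computed. The weights come from the paragraph above; everything else is an assumption for illustration: the component names, the idea that each component is expressed as a fraction of its possible points, and especially the category cutoffs, which are placeholders and not the official SED conversion.

```python
# Points available per component, from the 60/20/20 split described above.
WEIGHTS = {"supervisory": 60, "state_tests": 20, "local_measure": 20}

# HYPOTHETICAL cutoffs (minimum composite score, label); not the official table.
HYPOTHETICAL_CUTOFFS = [
    (91, "highly effective"),
    (75, "effective"),
    (65, "developing"),
    (0, "ineffective"),
]

def composite_score(fractions):
    """Weighted sum on a 0-100 scale; each fraction is the share of points earned."""
    return sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)

def category(score):
    """Map a composite score to the first category whose floor it meets."""
    for floor, label in HYPOTHETICAL_CUTOFFS:
        if score >= floor:
            return label
    return "ineffective"

# Made-up teacher: strong observation ratings, weak student test results.
teacher = {"supervisory": 0.95, "state_tests": 0.50, "local_measure": 0.80}
score = composite_score(teacher)
label = category(score)
print(f"composite: {score:.1f} -> {label}")
```

Under these assumptions, a teacher who earns only half the available points on student test scores still lands comfortably above the placeholder “effective” floor, because the supervisory component carries three times the weight of the state tests.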

The unanswered questions are daunting:

* Should colleges be held responsible once teachers are hired and are teaching in a school district when the college plays no role in site-based teacher support?

* Is there any evidence that scores on the edTPA correlate with teacher performance?

* Is there any data which disaggregates edTPA scores by gender, race and ethnicity?

Currently only two states, New York State and the state of Washington, have implemented the edTPA exams: is NYS rushing into a sea change too quickly?

There are larger questions: should teacher preparation programs be the responsibility of the school district rather than the university?

If you ask new teachers whether their college preparation program prepared them adequately for the classroom, most (with a few exceptions) say “no.” Under the prior administration the NYC Department of Education mused about whether they would do a better job of preparing prospective teachers – the only non-college organization which has the ability to certify teachers is the Museum of Natural History (Earth Science), and the program is quite small and supported with federal grant dollars.

How important is the traditional college coursework? Should teacher preparation be treated as a vocation, with the emphasis on actual in-classroom work under the guidance of a skilled teacher – a return to an apprentice system?

A baseline question: should teacher preparation programs be established in Washington DC or at the state or school district level? And, of course, will any of the changes matter if teachers continue to be held in low esteem?