Billy Beane, the general manager of the Oakland Athletics, changed the method of evaluating baseball players – in place of the cigar-chomping baseball scouts, Beane used a data-driven approach, memorialized in Michael Lewis’s Moneyball (2003). The analysis and manipulation of very large data sets has become the sine qua non, the “standard,” for decision-making in baseball, in healthcare, and in education. In an Atlantic article, “Can Government Play Moneyball?,” the authors aver,
The moneyball formula in baseball—replacing scouts’ traditional beliefs and biases about players with data-intensive studies of what skills actually contribute most to winning—is just as applicable to the battle against out-of-control health-care costs. According to the Institute of Medicine, more than half of treatments provided to patients lack clear evidence that they’re effective. If we could stop ineffective treatments, and swap out expensive treatments for ones that are less expensive but just as effective, we would achieve better outcomes for patients and save money.
The field of education is no different – we can now parse large data sets to help educators fine-tune and individualize instruction for students:
“The important thing with the data as we see it is this: How does it improve instruction in the classroom? … The trick is to be able to combine what I call ‘autopsy data’ of what has happened with the child, with what goes on currently in class, with formative and summative evaluations on an ongoing basis.”
“In addition to longitudinal data, scores from online work or assessments scanned in from offline work go immediately into the platform, and its predictive analytics engines go to work to develop recommendations to help the student get up to speed…. Teachers don’t continue to teach things their students already know. It gives just-in-time feedback of what to pay attention to now, before students get so far behind that they can’t catch up.”
One thing is certain: As education becomes more Big Data–driven, educators and IT leaders must remember that human judgment matters too. “You have to pay attention to whether the data resonates with what the teachers know to be true about a student’s performance…. There’s no substitute for authentic analysis.”
We can receive real-time feedback and suggested approaches to remediating student errors; of course, the process does not explain why a student gets a wrong answer, and it does not involve critical thinking skills. It can tell us, for example, that 26% of Afro-American seventh graders who are eligible for Title 1 services cannot successfully divide fractions 80% of the time. How can the school district use the data? Why are 74% of students succeeding? Are the teachers of the 74% consistently successful year to year? Are the textbooks the same? Is the race/gender/experience level of the teachers a significant factor?
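The kind of descriptive query the data can answer is straightforward to sketch. Here is a minimal example with invented records (subgroup labels, teacher names, and pass/fail flags are all hypothetical), showing pass rates sliced by subgroup and by teacher – the “what,” not the “why”:

```python
from collections import defaultdict

# Hypothetical student records: (subgroup, teacher, passed_fractions_item)
records = [
    ("Title-1", "T1", True), ("Title-1", "T1", False),
    ("Title-1", "T2", True), ("Title-1", "T2", True),
    ("non-Title-1", "T1", True), ("non-Title-1", "T3", False),
]

def pass_rate_by(records, key_index):
    """Aggregate pass rates along one dimension (subgroup or teacher)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [passed, attempted]
    for rec in records:
        key, passed = rec[key_index], rec[2]
        totals[key][0] += int(passed)
        totals[key][1] += 1
    return {k: passed / n for k, (passed, n) in totals.items()}

by_subgroup = pass_rate_by(records, 0)  # e.g. {"Title-1": 0.75, ...}
by_teacher = pass_rate_by(records, 1)
```

A real district query would run over millions of rows, but the limitation is the same: the aggregation reports rates; it cannot say whether the textbook, the teacher, or anything else caused them.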
The use of “big data” can, within a statistical range, answer the questions; however, the “answer” does not tell us why…
The Gates-funded Measures of Effective Teaching Study identifies more effective teachers using test scores to define effectiveness. What the MET study does not do is tell us why some teachers are more effective than other teachers. Are they more effective in teaching particular skills or more effective in motivating students or some combination? Highly effective teachers have no idea why they are more or less effective from year to year.
Paul Tough, in How Children Succeed, challenges “the cognitive hypothesis,” the belief “that success today depends primarily on cognitive skills — the kind of intelligence that gets measured on I.Q. tests, including the abilities to recognize letters and words, to calculate, to detect patterns — and that the best way to develop these skills is to practice them as much as possible, beginning as early as possible.” Tough sets out to replace this assumption with what might be called the character hypothesis: the notion that noncognitive skills, like persistence, self-control, curiosity, conscientiousness, grit and self-confidence, are more crucial than sheer brainpower to achieving success.
The teacher who ignites noncognitive skills may be more effective than the teacher who uses the proper teaching techniques, as “measured” by the Danielson Frameworks.
The policy wonks, the decision-makers, are seduced by data: the right data set, the right algorithm, the right combination of variables can attribute a numerical score to a teacher. Once we’ve identified the most effective teachers, we can use the “score” to drive decisions: who gets tenure, who gets fired, who gets a raise or a promotion. What used to be a decision made solely by the principal is now part of a multiple-measures rubric, with a value-added measurement (VAM) counting for 20% to 50% of the score.
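Mechanically, a multiple-measures rubric is just a weighted average. The sketch below is illustrative only – the measure names, the sample scores, and the 40% VAM weight are invented for the example (the actual weights vary by state and plan within the 20%–50% range the text describes):

```python
def composite_score(measures, weights):
    """Weighted 'multiple measures' composite; weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[m] * measures[m] for m in weights)

# Hypothetical teacher: VAM weighted at 40%, the rest on observation etc.
measures = {"vam": 62.0, "observation": 88.0, "other": 75.0}
weights = {"vam": 0.40, "observation": 0.45, "other": 0.15}
score = composite_score(measures, weights)  # a single number drives the decision
```

The arithmetic is trivial; the controversy is entirely in what the VAM input means and whether it deserves that much weight.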
The Educational Testing Service (ETS) warns us that, given the instability and unreliability of VAM algorithms, VAM scores should not be used for decisions that impact careers.
Edward H. Haertel, in “Reliability and Validity of Inferences About Teachers Based on Student Test Scores” (March 2013), warns,
Teacher value-added scores are unreliable … that means that the teachers whose students show the biggest gains one year are often not the same teachers whose students show the largest gains the next year…
The goal for VAM is to strip away just those student differences that are outside of the current teacher’s control … those things the teacher should not be held accountable for…
Teacher VAM scores should emphatically not be included as a substantial factor with a fixed weight in consequential teacher personnel decisions … It is not just that the information is noisy … the scores may be systemically biased for some teachers and against others.
In spite of the evidence, the US Department of Education steadfastly hews to the VAM line. Data is the answer – if you can create the right mathematical equation, if the mountain of data is large enough, you can solve anything.
The use of data-driven decision-making presumes that teaching is a science; it presumes that with the proper mix of “chemicals” you can “create” a desired outcome.
Is the process of teaching a science which can be measured, or is teaching an art? Can you assign numerical values to every Danielson element and VAM growth score and assign a teacher a grade, or was Judge Potter Stewart correct when he wrote that he could not define pornography but he “knew it when he saw it”?
The teacher evaluation law in New York State is incredibly dense. The State Education Department approved 700 locally negotiated plans, collected gigabytes of data, spun the computers, and (roll of drums!!)
51% of teachers are “highly effective”
40% of teachers are “effective”
8% of teachers are “developing”
1% of teachers are “ineffective.”
In June 2012, 2.7% of New York City teachers received an “unsatisfactory” rating; the dense formula identified even fewer ineffective teachers.
Data has become an addiction, the “meth” of the world of education.
Perhaps it would be more cost effective if we poured the dollars into a genome project – is there a “teaching” gene? Are “highly effective” teachers the product of “nature” or “nurture”?
What does a numerical score tell a teacher? What can teachers learn from a VAM score?
The byte-ing of teaching is a failure.