The promise of the bipartisan No Child Left Behind (NCLB) 2001 legislation was, as the name says, that no child would be left behind. A key piece of this legislation is the annual testing of every child from third through eighth grade and then once in high school.
The data from these tests were intended to provide policymakers and educators with evidence to improve educational outcomes for the most disadvantaged students. But instead of promoting equity and social justice, the data are being used, in some cases, to further punish and disenfranchise the most vulnerable students.
As an educational researcher, teacher and mom, I understand the potential as well as the unintended impacts of the annual testing regime. I also know that it doesn’t have to remain this way. We, as a nation, can do better.
Fallout of standardized testing
NCLB is a reauthorization of the 1965 Elementary and Secondary Education Act. Many efforts have been made to reauthorize NCLB since 2007, with a big push this spring to get it revised and reauthorized before the fall campaign season.
NCLB’s use of standardized testing has been widely criticized for its inability to improve learning outcomes, especially for the most vulnerable students. It’s not just excessive testing but also inappropriate use of the results that is now threatening the quality of public education.
Professional organizations such as the American Evaluation Association and the American Educational Research Association (AERA) have put out public statements about how “high stakes” decisions based on test data violate the code of ethics to “do no harm.”
AERA’s statement lists a set of conditions under which testing programs need to be implemented: alignment of curriculum with the test items, adequate resources and opportunity to learn, validation of the passing scores and means to address the needs of students with language and learning differences.
In addition, AERA has said that the use of test scores in evaluations should follow a strict ethical code. Much of this is currently missing.
A range of tests
Let’s take stock of just how many tests are currently “out there” and what their different purposes are.
For instance, there are NCLB-mandated “accountability” tests, such as the Smarter Balanced Assessment (SBAC) and the Partnership for Assessment of Readiness for College and Careers (PARCC); “diagnostic” tests used by districts to assess students and inform instruction, such as those from the Northwest Evaluation Association (NWEA); and course-level tests for high school students, such as Advanced Placement and International Baccalaureate exams.
There are also college entrance tests, such as the SAT and ACT. For comparison across states there is a national sampling test, the National Assessment of Educational Progress (NAEP), and for international evaluations there is the Programme for International Student Assessment (PISA).
This is all on top of the classroom- and school-level assessments that actually support the daily teaching and learning process between a teacher and a student.
The result is too much testing and not enough learning.
The testing industry that has emerged from this is now a formidable lobby. Over the past five years it has spent more than $20 million to secure the $2 billion-a-year business of standardized testing in the U.S.
Misuse of data
The data generated from this testing are being used to make critical decisions about students, teachers and schools.
Unfortunately, test data have not been used to improve teaching. Instead, data from the NCLB-mandated accountability tests are being badly misused.
There are now several court cases related to the misuse of standardized test scores in teacher evaluations and high school completion tests. Teachers' jobs, careers and salaries are being determined by the test scores of students they don’t even teach.
U.S. Secretary of Education Arne Duncan has pushed for teacher evaluation to be based in part on students' standardized test scores despite the experiences of Tennessee, Houston and Florida, where misuse of test data has been seen and challenged in court.
Luke Flynt, a teacher in Indian River County, Florida, described in public testimony to the school committee how absurdly unreasonable these models can be. Flynt received unsatisfactory ratings because the computer model predicted that his students would score above a perfect score.
Similarly, last year, Sheri Lederman, a fourth grade teacher in New York’s Great Neck Public School district, challenged her teacher evaluation rating as inappropriate. The case will be heard by the New York Supreme Court.
Teachers were already frustrated, and testing has only added to that frustration. Between 40% and 50% of new teachers are leaving the profession within five years. This is leading to a huge loss of social capital and institutional capacity in the highest-need schools, where the rate of teacher exodus is highest. The annual cost of teacher dissatisfaction, expressed in the high turnover, is estimated to be $2.2 billion.
This misuse of data is also driving states to opt out of the Common Core State Standards (CCSS).
At least 10 states have already dropped the CCSS, and similar legislation is pending in most other states. Several states are “rebranding” the standards by having more local input and revising elements of the standards.
Testing has not worked
The National Academies, the premier source of expert advice on pressing societal challenges, have documented that the current test-based accountability models of incentives and sanctions have not been effective for improving learning or achievement.
They have also called for reformed models of accountability that would consider broader-based measures of progress.
As these details make evident, the true failure of education, as stated by the American Statistical Association (ASA), has been the failure to prepare our legislators and educational policymakers in the ethical use of statistics.
In particular, the Value Added Model (VAM), a complex statistical tool, is being inappropriately used for assessing teachers’ performance.
The ASA has cautioned that these data are not an accurate measure, as standardized test scores are not “causational.” In other words, test results are affected by many factors – not just the teacher. Results need to be interpreted with caution.
For this reason, no high-stakes decisions, such as job termination, should be made based on the test results.
The basic scientific premise of quality assessment and evaluation is taking multiple measures, using multiple methods, and making use of multiple opportunities for a more accurate representation of anything being studied, particularly something as complex as teaching.
The aspirations of “No Child Left Behind” are essential for our nation’s success. However, the current models based on limited standardized test scores significantly underrepresent the complexity of learning.
Other nations have developed models of educational accountability that are aligned with standards, reduce the number of tests, and help ensure equity and improve educational outcomes by strengthening teaching and learning. They also cost a lot less.
The question is: do we, as a nation, have the political will to leave behind the illusion of a quick fix from test scores?