Monday, July 31, 2006

NCLB Testing: About Learning, Not Standings

The No Child Left Behind Act (NCLB) is perhaps the most sweeping and controversial educational reform ever enacted. As professionals working in the testing industry, we have both benefited and suffered from this legislation. To be sure, there are aspects of NCLB that are less than ideal, but some criticisms of NCLB stray well beyond the facts.

Recently, a commentary highly critical of NCLB was published in the Wall Street Journal. The author, Charles Murray, claims that NCLB is “a disaster for federalism” and “holds good students hostage to the performance of the least talented.” Murray cites a report from the Civil Rights Project at Harvard University, which concludes that NCLB has not improved reading and math achievement as measured by the National Assessment of Educational Progress (NAEP). Murray further argues that although many state assessments show decreases in black–white achievement gaps, these decreases are meaningless because they are statistical artifacts based on changes in pass rate percentages rather than differences in test scores.

Murray’s criticism conflates measuring students against standards (criterion-referenced testing) with measuring students relative to a group (norm-referenced testing). In a norm-referenced system, if scores for white and black students increase by the same amount, clearly the gap between the groups is not closing. But Murray fails to understand the basic tenets of standards-based assessment that provide the framework for NCLB. In each state, assessments are built to measure the state content standards, the very same content standards that schools are expected to use in their instruction. In addition, each state sets achievement standards that establish how well students must perform on the assessments to be considered “proficient” and “advanced.”

In a standards-based assessment, if black and white students improve equally, the percentage of blacks achieving proficient will, over time, increase more than the percentage of whites achieving proficient. Murray is correct to say this is “mathematically inevitable.” But, is it meaningless? Is it meaningless that more black students who were not proficient are proficient now? Is it meaningless that more minority students are learning fundamental reading and math skills that they weren’t learning before?
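The “mathematically inevitable” point is easy to verify with a short simulation. The sketch below is purely illustrative (the normal score distributions, means, and cutoff are assumptions, not actual test data): both groups gain the same number of scale-score points, so the score gap never changes, yet the group starting further below the proficiency cutoff gains more percentage points of proficiency.

```python
from statistics import NormalDist

# Hypothetical score distributions (NOT real assessment data): two groups
# with the same spread but different starting means, and a fixed cutoff.
CUTOFF = 2100
group_a = NormalDist(mu=2150, sigma=100)  # mean starts above the cutoff
group_b = NormalDist(mu=2050, sigma=100)  # mean starts below the cutoff

def pct_proficient(dist: NormalDist, gain: float) -> float:
    """Percent of students at or above CUTOFF after every score rises by `gain`."""
    shifted = NormalDist(mu=dist.mean + gain, sigma=dist.stdev)
    return 100 * (1 - shifted.cdf(CUTOFF))

for gain in (0, 25, 50, 75):
    a = pct_proficient(group_a, gain)
    b = pct_proficient(group_b, gain)
    # The mean-score gap stays exactly 100 points throughout, but the
    # pass-rate gap (a - b) shrinks as both groups improve equally.
    print(f"gain={gain:>2}: A={a:5.1f}%  B={b:5.1f}%  pass-rate gap={a - b:4.1f} pts")
```

Under these assumed numbers, the pass-rate gap narrows from about 38 points with no gain to under 30 points after a 75-point gain, even though neither group gains a single point on the other in scale scores.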

In dismissing measurement against standards and embracing norm-referenced comparisons, Murray rests his arguments on stereotypical assumptions: notably, the assumption that there is “a constant, meaningful difference between groups” (as if this were a natural law rather than a product of divisions of class, privilege, and access to resources) and the statement that “they cannot all even be proficient” (i.e., what’s the point of trying, since they can’t learn anyway).

In Texas, standards-based assessment preceded NCLB by more than a decade. Over time, Texas policy makers have revised their assessments several times. The most recent program, the Texas Assessment of Knowledge and Skills (TAKS), was introduced in 2003. It is based on tougher content standards and imposes tougher achievement standards than any prior Texas assessment. Texas also did something unusual when it introduced TAKS: it phased in the tougher achievement standards over several years, starting with a standard set two standard errors of measurement (SEM) below the level recommended by the standard-setting panels. Table 1 shows the percentage of students passing the exit-level mathematics test between 2003 and 2006 under three different standards: 2 SEM below the panel recommendation, 1 SEM below the panel recommendation, and at the panel recommendation. The bold pass rates correspond to the standard that was in effect in a given year.
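The mechanics of such a phase-in can be sketched numerically. The figures below are hypothetical (an assumed normal score distribution, cutoff, and SEM, not actual TAKS parameters); they simply show how setting the cutoff one or two SEM below the panel recommendation raises the apparent pass rate until the full standard takes effect.

```python
from statistics import NormalDist

# Hypothetical cohort: scores ~ Normal(2150, 150), with an assumed panel
# cutoff of 2200 and SEM of 50. Illustrative values only, not TAKS data.
scores = NormalDist(mu=2150, sigma=150)
PANEL_CUTOFF = 2200
SEM = 50

def pass_rate(cutoff: float) -> float:
    """Percent of the cohort scoring at or above `cutoff`."""
    return 100 * (1 - scores.cdf(cutoff))

for sems_below in (2, 1, 0):  # phase-in: 2 SEM below, 1 SEM below, full standard
    cutoff = PANEL_CUTOFF - sems_below * SEM
    print(f"{sems_below} SEM below panel: cutoff={cutoff}, pass rate={pass_rate(cutoff):.1f}%")
```

The same cohort passes at three different rates depending solely on where the bar is placed, which is why pass rates under the phase-in standards and under the final panel standard must be compared separately, as Table 1 does.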

Table 1: Percent of Students Passing TAKS Grade 11 Mathematics – Spring 2003 to Spring 2006


Murray would probably be delighted with Table 1, because just as he predicted, the difference between black and white pass rates depends on the standard that is applied. On the other hand, Murray would also have to admit that passing percentages of both blacks and whites improved each year, regardless of the performance standard one might care to apply. The rise in test performance indicates that students are being taught the necessary skills they weren’t learning before. This is far from meaningless.

One aspect of Table 1 that Murray might take special note of is the rather astonishing increase in pass rates between 2003 and 2004. This increase did not surprise Texas educators at all. It turns out that the requirement to pass the exit-level TAKS tests did not apply in the first year of testing. Thus, the dramatic increase in passing rates between 2003 and 2004 is probably due in part to instructional changes and in part to changes in student motivation. It also provides some context for considering the use of NAEP scores as criteria for evaluating NCLB: NAEP does not measure any state’s content standards, and there is little or no incentive for students to give their best performance. As the only nationally comparable test, NAEP is a convenient yardstick, but it was not designed to evaluate state assessment systems, and its use for that purpose is of limited validity.

The politics of NCLB are complex and tend to encourage extreme positions. In calling NCLB “uninformative and deceptive,” Charles Murray has taken an extreme position that fails to recognize the rationale and merits of standards-based assessment. Irrespective of one’s views about NCLB, it is important that the public debate be an informed one, and Charles Murray’s rhetoric obscures many of the issues that matter most.
