Thursday, April 29, 2010

Pearson’s participation at AERA and NCME

As I read the latest TMRS Newsletter, I was reminded of my first days at Pearson. Back then, the department, led by Jon Twing, consisted of five other staff members. We were called psychometricians and we worked closely together across operational testing programs.

The department grew rapidly after that. As part of that growth, Jon and I created the Researcher-Practitioner Model as an ideal. Under the Researcher-Practitioner Model, practice informs research and research supports practice. The role of the psychometrician combines research and fulfillment: rather than maintaining separate groups of staff for research and for fulfillment, the department would use the same staff members to perform both activities, with each psychometrician dedicating the majority of their hours to contract fulfillment and the remaining hours to research.

Many things have changed since those first days but some things remain the same. With more than 50 research scientists, we are still a close-knit group. But the label “psychometrician” has been replaced with “research scientist.” And we are still working toward the ideal of the Researcher-Practitioner. While we may not have achieved the ideal, the list of Pearson staff participating in the annual conference of the American Educational Research Association (AERA) and the annual meeting of the National Council on Measurement in Education (NCME) in the latest TMRS Newsletter is proof that we are still active researchers. In Denver, Pearson staff will give 22 presentations at AERA and 15 presentations at NCME. In addition, Pearson research scientists will make three presentations at the Council of Chief State School Officers’ (CCSSO) National Conference on Student Assessment and will present at the Society for Industrial & Organizational Psychology (SIOP) conference and the International Objective Measurement Workshop (IOMW).

Please review the research that Pearson research scientists will be presenting at these meetings, as listed in the newsletter. If you are interested in reading the conference papers, several are available on the conference reports tab of the Research & Resources page of the Pearson Assessment & Information website.


Paul Nichols, PhD
Vice President
Psychometric & Research Services
Assessment & Information
Pearson

Wednesday, April 07, 2010

Performance-based Assessment Redux

Cycles in educational testing continue to repeat. The promotion and use of performance-based assessments is one such cycle. Performance-based assessment involves the observation of students performing authentic tasks in a domain. The assessments may be conducted in a more- or less-formal context. The performance may be live or may be reflected in artifacts such as essays or drawings. Generally, an explicit rubric is used to judge the quality of the performance.

An early phase of the performance-based assessment cycle was the move from the use of performance-based assessment to the use of multiple-choice tests as documented in Charles Odell’s 1928 book, Traditional Examinations and New-type Tests. The “traditional examinations” Odell referred to were performance-based assessments. The “new-type tests” Odell referred to were multiple-choice tests that were beginning to be widely adopted in education. These “new-type tests” were promoted as an improvement over the old performance-based examinations in efficiency and objectivity. However, Odell had doubts.

I am not old enough to remember the original movement from the use of performance-based assessment to the use of multiple-choice tests but I am old enough to remember the performance-based assessment movement of the 1990s. As I remember it, performance-based assessment was promoted in reaction to the perceived impact of multiple-choice accountability tests on teaching. Critics worried that the use of multiple-choice tests in high-stakes accountability testing programs was influencing teachers to teach to the test, that is, to focus on teaching the content of the test rather than a broader curriculum. Teaching to the test would then lead to inflation of test scores that reflected rote memorization rather than learning in the broader curriculum domain. In contrast, performance-based testing was promoted as a solution that would lead to authentic student learning. Teachers who engage in teaching to a performance-based test would be teaching the actual performances that were the goals of the curriculum. An example of a testing program that attempted to incorporate performance-based assessment on a large scale was the Kentucky Instructional Results Information System.

It’s déjà vu all over again, as Yogi said, and I am living through another phase of the cycle. Currently, performance-based assessments are being promoted as a component of a balanced assessment system (Bulletin #11). Proponents claim that performance-based assessments administered by teachers in the classroom can provide both formative and summative information. As a source of formative information (Bulletin #5), the rich picture of student knowledge, skills and abilities provided by performance-based assessment can be used by teachers to tailor instruction to address individual students’ needs. As a source of summative information, the scores collected by teachers using performance-based assessment can be combined with scores from large-scale standardized tests to provide a more balanced view of student achievement. In addition, proponents claim that performance-based assessments are able to assess 21st Century Skills whereas other assessment formats may not.

But current performance-based assessments still face the same technical challenges they faced in the 1990s. A major technical challenge is achieving adequate score reliability. Variance attributable to both teachers’ ratings and task sampling may leave reliability unacceptably low for scores used for summative purposes.
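As a rough sketch of why these variance sources matter, consider a generalizability-theory view of a fully crossed students × raters × tasks design (a simplifying assumption for illustration; operational designs are rarely this tidy). The generalizability coefficient for relative decisions can be written as

\[
E\rho^{2} \;=\; \frac{\sigma^{2}_{p}}{\sigma^{2}_{p} + \dfrac{\sigma^{2}_{pr}}{n_{r}} + \dfrac{\sigma^{2}_{pt}}{n_{t}} + \dfrac{\sigma^{2}_{prt,e}}{n_{r}\,n_{t}}}
\]

where \(\sigma^{2}_{p}\) is the variance among students (the signal we care about), \(\sigma^{2}_{pr}\) and \(\sigma^{2}_{pt}\) are the student-by-rater and student-by-task interaction variances, \(\sigma^{2}_{prt,e}\) is residual error, and \(n_{r}\) and \(n_{t}\) are the numbers of raters and tasks per student. With only a handful of complex tasks and raters whose judgments vary, the error terms in the denominator stay large relative to \(\sigma^{2}_{p}\) and the coefficient falls, which is exactly the concern for summative uses described above.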

A second major challenge facing performance-based assessments is gathering adequate evidence of validity. Remember that performance-based assessment scores are being asked to provide both formative and summative information. But validity evidence for formative uses stresses the consequences of test score use, whereas validity evidence for summative uses stresses more traditional sources of evidence.

A third major challenge facing performance-based assessments is the need for comparability of scores across administrations. In the past, the use of complex tasks and teacher judgments has made equating difficult.

Technology to the rescue! Technology can help address these technical challenges facing performance-based assessment in the following ways:
  • Complex tasks and simulations can be presented in standardized formats using technology to improve standardization of administration and broaden task sampling;
  • Student responses can be objectively scored using artificial intelligence and computer algorithms to minimize unwanted variance in student scores;
  • Teacher training can be detailed and sustained using online tutorials so that teachers’ ratings are consistent within teachers, across students and occasions, and across teachers; and,
  • Computers and hand-held devices can be used to collect teachers’ ratings across classrooms and across time so that scores can be collected without interrupting teaching and learning.

Save your dire prediction for others, George Santayana. We may not be doomed to repeat history, after all. Technology offers not just a response to our lessons from the past but a way to alter the future.

Paul Nichols, PhD
Vice President
Psychometric & Research Services
Assessment & Information
Pearson