Wednesday, May 25, 2005

Those Pesky Performance Standards

I never fail to be amazed by the discussions my colleagues and I engage in regarding what psychometricians call "standard setting". The essence of standard setting is to determine "how much is enough" regarding performance on some measure, and to do so in a less than capricious manner (still arbitrary, but not capricious).

Nevertheless, rooms filled with content experts, testing experts, psychometricians (some of whom are experts), standard setting experts, and others engage in endless banter regarding how to plan for, control, and analyze the data resulting from (or going into) a standard setting, as if the data were anything other than an arbitrary (though often not capricious) judgment.

Perhaps I am finally too old to enjoy such arbitrary distinctions anymore. Understand that I am not saying that standard setting is not important, that the established procedures should not be used, or that we should not carefully plan and implement the standard setting in the best way possible following standards of best practice. I think all of this should be done. I am just not sure that all of the research and rhetoric built on the outcomes of such judgmental procedures is worth the effort it takes to discuss.

One person's opinion...of course.

Friday, May 20, 2005

Life Long Learning

Over the years, I can recall various conversations regarding student growth, preparedness and remediation. They go something like the following:

First Grade Teacher: These kids have no social skills at all. Why can't the parents do more to get their kids ready for school?

Third Grade Teacher: These kids don't know the alphabet or their math facts. Why can't the earlier grade teachers do more?

High School Teacher: These young people don't have any of the prerequisite math skills. Why can't the middle school teachers do more?

College Instructor: Half of our entering freshmen are in remediation. Why can't the high school teachers do more?

Educators and the public alike often talk about a K-16 or K-20 system of education in this country. In fact, just last week a retired professor of mine talked about being a "life long student" and how the biggest pleasure he gets in life is the fun in finding things out. Yet, our educational systems are quick to "pass the blame" onto what has gone before. It seems to me that a more integrated system of learning, including measurement of skills from K-20, might make it easier to debunk (or at least put into perspective) the gaps students have in their prerequisite skills as they move from kindergarten to college.

One interesting step in this area is the use of "college readiness" indicators as part of the state-mandated assessment system. Texas has recently required such an indicator.

Preliminary results of the research supporting this effort (as conducted by Pearson Educational Measurement in coordination with the Texas Education Agency) are also presented.

Tuesday, May 03, 2005

How It All Started

T = X - E

Recall that one of the fundamental tenets of "weak true score" or "classical" measurement theory is that an examinee's unknown and unseen "true score" (T) is their observed score (X) on an assessment minus error (E). Since the development of this concept (and even before), measurement practitioners and theorists alike have been trying to estimate a student's true score with greater and greater precision. This effort typically focuses on ways to partition the error (i.e., to better understand what is causing it) and ultimately reduce it, so that observed student performance is a better indicator of underlying achievement or ability.
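For readers who like to see the idea in motion, here is a small sketch of T = X - E as a simulation. Everything in it is an assumption made for illustration: the function names (simulate, reliability), the normal distributions, the score scale (mean 50, true-score SD 10, error SD 5), and the use of averaged parallel forms as one simple way to shrink error. It is not a description of any operational procedure. Under these assumed values, the share of observed-score variance due to true scores is 100 / (100 + 25) = 0.8 for a single form, and it rises as independent errors are averaged away.

```python
import random
import statistics


def simulate(n_examinees=2000, n_forms=1, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate X = T + E for hypothetical examinees.

    Returns (true_scores, observed_scores). All parameter values are
    illustrative assumptions, not estimates from any real assessment.
    """
    rng = random.Random(seed)
    # Unseen true scores T, drawn once per examinee.
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n_examinees)]
    observed_scores = []
    for t in true_scores:
        # Each parallel form adds its own independent error E to T.
        forms = [t + rng.gauss(0.0, error_sd) for _ in range(n_forms)]
        # The observed score X is the average across the forms taken.
        observed_scores.append(statistics.fmean(forms))
    return true_scores, observed_scores


def reliability(true_scores, observed_scores):
    """Ratio of true-score variance to observed-score variance."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed_scores)


if __name__ == "__main__":
    for k in (1, 4):
        t, x = simulate(n_forms=k)
        print(f"forms={k}  reliability={reliability(t, x):.3f}")
```

In practice T is never observed, so this ratio can only be estimated indirectly; the simulation simply makes visible why reducing E brings X closer to T.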

So, what does all this have to do with the TrueScores blog? Simply this: Pearson Educational Measurement has recently expanded its research efforts, and we intend to use the TrueScores blog as one of the forums for dissemination and debate. The last thing the world needs is another forum for a pompous psychometrician to pontificate about how the world would be a better place if y'all would only buy their solution. Instead, the TrueScores blog is dedicated to honest, respectful, scientifically based and open debate about the "hot" topics in today's measurement world. Some of these topics include:

  • Establishing comparability between paper-and-pencil assessments and their online or electronic counterparts.
  • Automated essay scoring: Is it practical, reliable and valid?
  • Is Computer Adaptive Testing (CAT) a potential solution for the age-old question of testing time versus instructional time?

Background information related to these topics can be found at our web site on the research pages. Additional publications related to a host of topics in educational measurement can also be found at our web site. Future research will be added periodically and we will use this blog to communicate these additions.

We will update this blog regularly with new discussion topics. We hope this will add to the debate shaping our educational policy and provide practical and applied insights into not only classical measurement but other aspects of educational measurement, including Item Response Theory, Growth Modeling (Value Added Models), Equating, Scaling and legal defensibility. We hope you will return.

In the meantime, if you have questions about Pearson Educational Measurement, or our parent company Pearson Education, start by visiting our home page.