The Third International Conference on Educational Data Mining was held in Pittsburgh in June. The conference began in Montreal two years ago as an offshoot of the Intelligent Tutoring Systems Conference (held this year in Pittsburgh immediately following). It is a small (approximately 100 participants), single-track conference whose participants are mostly academicians in the fields of cognitive science, computer science, and artificial intelligence (AI), most or all of whom have dedicated their efforts to education research.
Educational data mining (EDM) is the process of analyzing student (and in some cases even educator) behaviors for the purpose of understanding their learning processes and improving instructional approaches. It is a critical component of intelligent tutoring systems (ITS), since the field implicitly recognizes that unidimensional models of student knowledge and skills are generally insufficient for providing adaptive supports. That said, the results from EDM go beyond informing ITS on how to do their job. For example, they can be a cornerstone of formative assessment practices, in which we provide teachers with actionable data with which to shape instructional decisions. In fact, few would dispute that the most successful ITS is one that not only provides individualized opportunities and supports to students in real time but also keeps the teacher actively in the loop.
Examples of the types of data used in EDM include:
Correctness of student responses (of course!)
Types of errors made by students
Number of incorrect attempts made
Use of hints and scaffolds
Level of engagement / frequency of off-task behaviors (as measured through eye-tracking, student/computer interaction analysis, etc.)
Student affect (as measured through physiological sensors, student self-reports, etc.)
The list goes on ...
Much of EDM research focuses on identifying how students cluster into groups based upon their behaviors (K-Means is particularly popular, though by no means the only technique used). For example, it might be found that a population of students working on an online tutoring system seems to divide into three groups -- high-performing, low-performing/high-motivated, and low-performing/low-motivated -- with each group exhibiting distinct patterns of interaction and hence learning. The types of supports offered to students in each of these groups can, and should, vary as a function of this clustering.
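To make the clustering idea concrete, here is a minimal sketch of K-Means in plain Python. Everything specific in it is an assumption for illustration only: the three features (fraction of correct responses, fraction of problems on which a hint was used, fraction of time off-task), the nine made-up students, and the deterministic seeding rule. It is not drawn from any system or paper presented at the conference, and a real study would more likely use an established library and far more data.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    # Seeding (an illustrative choice): start from the first point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(iterations):
        # Assignment step: each student joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical students: (fraction correct, hint rate, off-task fraction).
students = [
    (0.92, 0.10, 0.05), (0.88, 0.15, 0.08), (0.95, 0.05, 0.04),  # high-performing
    (0.45, 0.80, 0.06), (0.50, 0.75, 0.10), (0.40, 0.85, 0.07),  # low / high-motivated
    (0.42, 0.10, 0.60), (0.38, 0.15, 0.55), (0.47, 0.05, 0.65),  # low / low-motivated
]

labels, centroids = kmeans(students, k=3)
for student, label in zip(students, labels):
    print(label, student)
```

All three features here are fractions between 0 and 1, so no rescaling is needed. With features on different scales (raw hint counts alongside percentages, say), the distances would be dominated by the largest-valued feature unless the data were normalized first.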
As efforts to bridge the divide between instruction and assessment get underway, such as the federal Race to the Top Assessment program, it is important that the educational testing research community stay abreast of developments in the EDM research community, to best understand the types of data to collect, techniques for analysis, and their potential for improving educational opportunities for students.
Bob Dolan, Ph.D.
Senior Research Scientist, Assessment & Information