Assessing the classroom performance of students presents challenges to teachers at all levels of education. While each grade level has its own unique dilemmas, all formal school assessments must attempt to resolve a number of common critical problems. These include consistency--subjecting all students to the same set of standards; validity--measuring students on the basis of what we teach; and reliability--the tendency of assessments to yield similar data when repeated.
Mathematica, with its capabilities for processing, manipulating, and presenting a variety of statistical constructs, gives educators opportunities to examine relevant facets of assessment in original ways. These presentations enhance the teacher's understanding of the most critical components of the assessment process--individual students, groups, and the assessments themselves. For the individual student, the teacher's assessment of achievement is typically an accumulation of measurable attributes, such as test scores and homework grades, along with less measurable ones, such as attitude and other affective behaviors. A number of statistical measures that bring together individual ability and assessment difficulty are available to teachers. Using Mathematica to compute these measures can give the teacher a more insightful view of the student's learning and development.
For groups of students, the teacher is interested in progress in comparison with equivalent groups (those of similar grade level studying similar material) and with other groups, such as previous-year classes or groups taught by other teachers. Using some of Mathematica's graphical capabilities, characteristics, trends, and other qualities of group data can be examined. For example, BarChart, ListPlot, and TextListPlot display individual scores, grouped and graphed, while Show and Fit can generate functions that invite algebraic as well as graphical comparisons. These graphical presentations can illuminate subtle similarities and differences between groups and thereby more comprehensively inform the teacher about the group's learning, development, and achievement.
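As a sketch of this kind of group comparison, the following Mathematica input charts two sets of scores side by side and then fits a linear trend to each group's ordered scores. The score data and group names here are hypothetical, invented purely for illustration:

```mathematica
(* Hypothetical test scores for two class sections;
   the numbers are illustrative, not real data. *)
groupA = {62, 71, 75, 78, 80, 84, 88, 90};
groupB = {58, 66, 70, 74, 79, 83, 85, 93};

(* Grouped bar chart of individual scores *)
BarChart[{groupA, groupB}, ChartLabels -> {"Group A", "Group B"}]

(* Linear fits to each group's ordered scores; Fit treats a plain
   list of values as y-data at x = 1, 2, 3, ... *)
fitA = Fit[groupA, {1, x}, x];
fitB = Fit[groupB, {1, x}, x];

(* Overlay the raw points and both trend lines for comparison *)
Show[ListPlot[{groupA, groupB}], Plot[{fitA, fitB}, {x, 1, 8}]]
```

Comparing the fitted expressions fitA and fitB algebraically (for example, their slopes) complements the visual comparison of the plotted points.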
Improved information about the specific items that make up an assessment can help the teacher create better tests. Improved tests measure individual and group achievement more precisely, at more finely calibrated levels of difficulty, and are more robust across wider ranges of difficulty. One of the most useful tools in item response theory is the Rasch model, which assesses item difficulty. In short, the Rasch statistic transforms a nonlinear measure, the p-value (correct responses divided by total responses), into a linear measure of difficulty. It does so by taking the log-odds of an incorrect response, ln[(1-p)/p]--the negative logit of the p-value--so that harder items (those with smaller p-values) receive larger difficulty values.
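This transformation is straightforward to express in Mathematica. The sketch below defines the difficulty function and applies it to a set of hypothetical item p-values (the values themselves are invented for illustration):

```mathematica
(* Hypothetical item p-values: proportion of correct responses
   for each of five test items. *)
pValues = {0.90, 0.75, 0.60, 0.50, 0.25};

(* Rasch item difficulty: the log-odds of an incorrect response,
   ln[(1 - p)/p]. Smaller p (a harder item) yields a larger value. *)
raschDifficulty[p_] := Log[(1 - p)/p];

(* Map the transformation over all items *)
difficulties = raschDifficulty /@ pValues
(* An item with p = 0.5 has difficulty 0; an item with p = 0.25
   has difficulty Log[3], about 1.10. *)
```

Because the transformed values lie on a linear (logit) scale, differences between item difficulties can be compared meaningfully, which the raw p-values do not allow.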
Computing the Rasch statistic by hand is difficult and time-consuming, but using Mathematica to perform it makes the process more transparent and user-friendly. The teacher is then better able to make global sense of both the data and subjective information about the item.