The Challenge

How can we evaluate the consistency of instructors' grading within courses, between courses, and longitudinally over multiple semesters? The gold-standard method would be to have two or more instructors grade the same reports and then estimate inter-grader reliability. This is impractical given the time constraints of the current grading workload.


Our Approach

We have taken a two-pronged approach to this problem. For the long term, we are working to develop automated methods for assigning bin scores that can provide a "second reader" without adding to instructor workload. Until those methods are available, we check instructor grading consistency each semester using a standardized grade summary and correlation spreadsheet.

As written, the tables can manage scores for up to 40 students and 24 instructors. The spreadsheet performs three functions:

  • Identifying and removing outlier grades (i.e., those more than 2 standard deviations below the mean of the remaining grades for that instructor)
  • Summarizing and graphing lab grades by instructor and course
  • Correlating lab and lecture course grades, and plotting distributions with trend-lines
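For readers who want to try the same checks outside Excel, the outlier rule and the lab/lecture correlation can be sketched in Python. This is only an illustration and does not reproduce the spreadsheet's formulas. It assumes a single-pass, leave-one-out reading of the outlier rule (each grade is compared with the mean and standard deviation of that instructor's other grades). The function names and all scores below are hypothetical.

```python
from statistics import mean, stdev


def remove_low_outliers(grades, n_sd=2.0):
    """Split one instructor's grades into (kept, removed).

    Assumed rule: a grade is removed when it falls more than n_sd
    sample standard deviations below the mean of the OTHER grades.
    """
    kept, removed = [], []
    for i, grade in enumerate(grades):
        rest = grades[:i] + grades[i + 1:]
        # At least two other grades are needed to estimate a st. dev.
        if len(rest) >= 2 and grade < mean(rest) - n_sd * stdev(rest):
            removed.append(grade)
        else:
            kept.append(grade)
    return kept, removed


def pearson_r(xs, ys):
    """Pearson correlation between paired scores (e.g., lab vs. lecture)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for one instructor's section.
lab_grades = [88, 92, 85, 90, 40, 87]
kept, removed = remove_low_outliers(lab_grades)
print("kept:", kept, "removed:", removed)
print("mean of kept grades:", round(mean(kept), 1))

# Hypothetical paired lab and lecture grades for the same five students.
lab = [88, 92, 85, 90, 87]
lecture = [80, 91, 78, 88, 84]
print("lab vs. lecture r:", round(pearson_r(lab, lecture), 2))
```

Running the outlier step once per instructor, then summarizing the kept grades, mirrors the order of the three functions listed above. The 2 standard deviation cutoff is exposed as `n_sd` so a program can adjust it to its own norms.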

If an instructor's grading deviates from expectations or past performance, we meet with them to discuss their grading strategy and identify specific ways to improve going forward.

 

Available Resources

  Resource                                                        Links
  Excel spreadsheet for grade summary and correlation analysis    XLSX file
  User guide for grade summary and correlation analysis           R/MD file; DOCX file

 

The spreadsheet contains sample data that are randomized subsets of grades from multiple semesters. The scores have been modified to illustrate common grading variations we see. The user guide describes how to organize and enter student grade data, and how we interpret the results.

There are no hard and fast rules for what the results of the analysis should look like. Users need to define what is normal for the local program and students, then monitor program activities over time. It is also important to compare different datasets to determine which ones are more informative for local activities.

For example, in our program we perform the analysis using overall final lab course grades rather than writing grades alone. Written assignments count for 70% of the overall lab grade, so students' final overall grades are largely determined by their writing scores. We also found that overall lab scores were slightly (though not significantly) more informative than report grades alone.

 

Looking Ahead

Check the list of To Do items in the Assessment sub-project for more information about specific work in progress. Let us know if you want to contribute to one or more associated projects, or have other resources you would like to contribute.

 


Where to Learn More

  1. Gass G., Chen L. 2013. Excel based graphical tools for comparing grades across multiple lab or tutorial sections. Tested Studies in Laboratory Teaching, Proceedings of the Association for Biology Laboratory Education, 34:310-313.

Created by Dan Johnson. Last modified Thu June 9, 2022 4:45 pm by Dan Johnson.
