STEM Writing Project
Word Use Analysis
The Challenge
Tracking students' development as scientific writers by “close reading” individual reports is impractical in large BIO101 classes. How then can students’ writing skills be evaluated longitudinally in these large courses?
Our Approach
We proposed using machine-scorable text features as proxy metrics for students’ development as writers. For this study, we assembled a suite of candidate metrics, then asked:
- What does "good student scientific writing" look like? Which text features are informative?
- What do these features tell us about changes in students' writing patterns?
- Can proxy metrics provide useful insights about cohort-level changes over a curriculum sequence?
We divided our archive of >4400 student lab reports into 4 writer experience levels (see the sketch after this list):
- Novice students enrolled in their first college biology course.
- Early-career students with 0.5-1.5 years of general college writing experience and 1 semester of biology writing experience.
- Mid-career students with 2+ years of general writing experience, but only 1 semester of biology writing experience.
- Advanced students with 2+ years of general writing experience and 2+ prior semesters of biology writing experience.
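For concreteness, the rubric above can be written as a small decision rule. The sketch below is in base R (the project's analysis language); the `classify_writer` function, its handling of edge cases, and the example calls are illustrative assumptions, not the study's actual assignment code.

```r
# Illustrative encoding of the 4 writer experience levels; thresholds
# follow the bullets above, and edge-case handling is an assumption.
classify_writer <- function(years_general, semesters_bio) {
  if (semesters_bio == 0) {
    "Novice"
  } else if (years_general < 2 && semesters_bio == 1) {
    "Early-career"
  } else if (years_general >= 2 && semesters_bio == 1) {
    "Mid-career"
  } else if (years_general >= 2 && semesters_bio >= 2) {
    "Advanced"
  } else {
    NA_character_  # combinations the rubric does not cover
  }
}

classify_writer(1, 1)    # "Early-career"
classify_writer(2.5, 3)  # "Advanced"
```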
The text features we evaluated as proxy metrics fell into 3 categories (a scoring sketch follows the list):
- Lexical range: # unique words, type/token ratios, word repetition rates
- Word choices: working vocabulary, fractional type/token ratios
- Readability: wordiness, word difficulty, sentence length and complexity
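To make the categories concrete, here is a minimal base-R sketch of how a few such features can be scored from raw text. The toy report, the naive tokenizer, and the vowel-group syllable counter are simplifying assumptions; the project's actual R scripts are listed under Available Resources.

```r
# Minimal feature-scoring sketch (naive tokenization and syllable
# counting; stand-ins for the project's actual scripts).
report    <- "The enzyme catalyzed the reaction. The reaction rate increased."
sentences <- unlist(strsplit(report, "[.!?]+\\s*"))
tokens    <- tolower(unlist(strsplit(gsub("[[:punct:]]", "", report), "\\s+")))

n_tokens <- length(tokens)          # total words (N)
n_types  <- length(unique(tokens))  # unique words (V): lexical range
ttr      <- n_types / n_tokens      # simple type/token ratio

# Readability: approximate syllables as vowel groups, then apply the
# Flesch (1948) reading-ease formula cited under Metrics below.
syllables <- sum(sapply(gregexpr("[aeiouy]+", tokens),
                        function(m) sum(m > 0)))
flesch <- 206.835 - 1.015 * (n_tokens / length(sentences)) -
  84.6 * (syllables / n_tokens)
```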
We used proportional odds ordinal logistic regression (POLR) to test whether the proxy metrics could predict assigned grades.
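A minimal sketch of this test, assuming `MASS::polr` as the POLR implementation; the `reports` data frame and its columns (`grade`, `ttr`, `yule_k`, `flesch`) are hypothetical stand-ins for our grade and metric variables.

```r
# Fit a proportional odds model: do proxy metrics predict the grade?
# `reports` and its columns are hypothetical stand-ins.
library(MASS)

reports$grade <- factor(reports$grade, levels = c("C", "B", "A"),
                        ordered = TRUE)  # grades as an ordinal outcome
fit <- polr(grade ~ ttr + yule_k + flesch, data = reports, Hess = TRUE)
summary(fit)

# Average predictive error: fraction of reports whose predicted grade
# misses the grade actually assigned
mean(predict(fit, type = "class") != reports$grade)
```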
Lessons Learned
1. Several machine-scored metrics tracked students’ growing experience as writers, though not all did.
- Overall lexical richness (simple type/token ratio, Herdan’s C, Dugast’s U) did not change with experience.
- Word repetition (Yule’s K, Simpson’s D, Herdan’s Vm) declined significantly (11.4-20.6%, p<0.001). (These measures are sketched in code after this list.)
2. Lexical range & use of formal terms increased as students gained writing experience.
- Total # unique words used rose 25.1% (p<0.001).
- Use of academic & specialized terms grew faster (24.2-38.1%) than general terms (12.1-17.8%), reflecting a move toward more “formal” word choices.
3. Overall, 14/32 readability indices showed a relative association (phiC) > 0.2 over the 3-course series (p<0.001).
- Not all readability indices correlated equally well with writing experience.
- Indices that emphasize wordiness and the frequency of long or polysyllabic words were more likely to correlate positively with greater writing experience.
4. Proxy metrics were poor predictors of individual student grades. Fit for single- and multi-factor POLR models was low; even the best-fitting model had a 59% average predictive error (Nagelkerke pseudo-R² = 0.187).
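The lexical measures named in lessons 1 and 2 can all be derived from a word-frequency table. Below is a base-R sketch following the formulas compiled in Tweedie & Baayen (1998), reusing the `tokens` vector from the earlier sketch.

```r
# Lexical richness and repetition measures from a frequency table
# (formulas as compiled in Tweedie & Baayen 1998).
freqs <- table(tokens)  # how often each word type occurs
N <- sum(freqs)         # tokens
V <- length(freqs)      # types

herdan_c  <- log(V) / log(N)                           # Herdan's C
dugast_u  <- log(N)^2 / (log(N) - log(V))              # Dugast's U
yule_k    <- 1e4 * (sum(freqs^2) - N) / N^2            # Yule's K
simpson_d <- sum(freqs * (freqs - 1)) / (N * (N - 1))  # Simpson's D
herdan_vm <- sqrt(sum((freqs / N)^2) - 1 / V)          # Herdan's Vm
```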
In summary
We found that selected proxy metrics can surface changes in students’ writing longitudinally across a curricular sequence, and at the cohort level rather than only for individual students. These proxy metrics are valuable because they are less subject to interpretation and harder for students to “game.” They can also help us triangulate on the intrinsic features of students’ writing that we most want to develop over time.
Available Resources
| Resources | Links |
| --- | --- |
| Summary poster - 2022 IUSE Summit in Washington, DC | PDF file |
| R Shiny web form for collecting well-structured student reports | Link to QUBES |
| Archive of 4400 student reports and metadata | Link to QUBES |
| Structured vocabularies & R scripts for analyses | Coming soon |
Where to Learn More
Theory
- Carpenter, J. H. (2001). It’s about the Science: Students Writing and Thinking about Data in a Scientific Writing Course. Language & Learning Across the Disciplines, 5(2).
- McCannon, B. C. (2018). Readability and Research Impact. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3341573
- Oppenheimer, D. M. (2006). Consequences of erudite vernacular utilized irrespective of necessity: Problems with using long words needlessly. Applied Cognitive Psychology, 20(2), 139–156. https://doi.org/10.1002/acp.1178
- Page, E. B., & Paulus, D. H. (1968). The Analysis of Essays by Computer. Final Report.
- Plavén-Sigray, P., Matheson, G. J., Schiffler, B. C., & Thompson, W. H. (2017). The readability of scientific texts is decreasing over time. ELife, 6, e27725. https://doi.org/10.7554/eLife.27725
- Quitadamo, I. J., & Kurtz, M. J. (2007). Learning to improve: Using writing to increase critical thinking performance in general education biology. CBE Life Sciences Education, 6(2), 140–154. https://doi.org/10.1187/cbe.06-11-0203
- Tweedie, F. J., & Baayen, R. H. (1998). How Variable May a Constant Be? Measures of Lexical Richness in Perspective. Computers and the Humanities, 32(5), 323–352.
- Underwood, J. S., & Tregidgo, A. P. (2006). Improving student writing through effective feedback: Best practices and recommendations. Journal of Teaching Writing, 22, 73–97.
Metrics
- Bormuth, J. R. (1969). Development of Readability Analyses. Department of Health, Education, & Welfare.
- Browne, C., Culligan, B., & Phillips, J. (2013a). The New Academic Word List. http://www.newgeneralservicelist.org
- Browne, C., Culligan, B., & Phillips, J. (2013b). The New General Service List. http://www.newgeneralservicelist.org
- Coleman, M., & Liau, T. L. (1975). A computer readability formula designed for machine scoring. Journal of Applied Psychology, 60(2), 283–284.
- Davies, M. (2016). The Corpus of Contemporary American English (COCA): 520 million words, 1990-present. http://corpus.byu.edu/coca/
- Farr, J. N., Jenkins, J. J., & Paterson, D. G. (1951). Simplification of Flesch Reading Ease Formula. Journal of Applied Psychology, 35(5), 333–337. https://doi.org/10.1037/h0062427
- Flesch, R. (1948). A new readability yardstick. The Journal of Applied Psychology, 32(3), 221–233. https://doi.org/10.1037/h0057532
- Gunning, R. (1968). The Technique of Clear Writing, Revised Edition. McGraw-Hill. https://books.google.com/books?id=ofI0AAAAMAAJ
- Herdan, G. (1960). Type Token Mathematics. A Textbook of Mathematical Linguistics. (Vol. 4). Mouton & Co.
- Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of New Readability Formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy Enlisted Personnel. Defense Technical Information Center. https://books.google.com/books?id=7Z7ENwAACAAJ
- McLaughlin, G. H. (1969). SMOG grading: A new readability formula. Journal of Reading, 12(8), 639–646.
- O’Hayre, J. (1966). Gobbledygook Has Gotta Go (p. 113). Bureau of Land Management. http://training.fws.gov/history/HistoricDocuments.html
- Powers, R. D., Sumner, W. A., & Kearl, B. E. (1958). A recalculation of four adult readability formulas. Journal of Educational Psychology, 49(2), 99–105. https://doi.org/10.1037/h0043254
- Simpson, E. H. (1949). Measurement of Diversity. Nature, 163(4148), 688. https://doi.org/10.1038/163688a0
- Smith, E. A., & Kincaid, J. P. (1970). Derivation and Validation of the Automated Readability Index for Use with Technical Materials. Human Factors, 12(5), 457–564. https://doi.org/10.1177/001872087001200505
- Yule, G. U. (1968). The Statistical Study of Literary Vocabulary. Cambridge University Press. https://books.google.com/books?id=-R09AAAAIAAJ