In 2004, the journal Nature published a short paper by Tatem and colleagues. The authors used linear regression to fit trend lines to Olympic gold medal times for men and women in the 100-meter dash. Noting the shrinking gap between men's and women's times, they extrapolated from their regression and predicted that in the year 2156 women runners would beat men for the first time.
The paper offers a fun, tongue-in-cheek way to critique common issues in linear modeling, such as the assumptions of a linear model and predicting too far beyond the range of the data. Because the dataset treats gender as binary, it also offers a good opportunity to discuss gender as a variable type: a binary coding may be outdated for data collection, but it is still used in historical contexts.
This is the second of my three large projects and is more open-ended. The Storks and Babies case study helps students develop a code base they will find useful here as well. It also asks them to use R as a calculating tool.
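The crossover prediction described above comes down to two steps: fit one least-squares line per group, then solve for the year where the two lines meet. Below is a minimal sketch of that calculation, written in plain Python for illustration (students do this work in R). The function names `fit_line` and `crossover_year` are hypothetical, and the numbers are synthetic placeholders built to lie exactly on two lines; they are not real Olympic results.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossover_year(years_a, times_a, years_b, times_b):
    """Year at which the two fitted lines intersect."""
    slope_a, intercept_a = fit_line(years_a, times_a)
    slope_b, intercept_b = fit_line(years_b, times_b)
    if abs(slope_a - slope_b) < 1e-12:
        raise ValueError("parallel fitted lines never cross")
    # Solve slope_a * x + intercept_a == slope_b * x + intercept_b for x.
    return (intercept_b - intercept_a) / (slope_a - slope_b)


# Synthetic placeholder values, NOT real Olympic times.
years = [1960.0, 1980.0, 2000.0]
men = [10.0 - 0.01 * (y - 2000.0) for y in years]    # improves 0.01 s per year
women = [11.0 - 0.02 * (y - 2000.0) for y in years]  # improves 0.02 s per year

year_cross = crossover_year(years, men, years, women)
print(f"fitted lines cross near {year_cross:.0f}")
print(f"that is {year_cross - max(years):.0f} years beyond the last data point")
```

Printing how far the crossing lies beyond the last observation makes the extrapolation problem concrete: the arithmetic is trivial, but nothing in the data says the linear trend holds that far out.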