Cleaning Data with R and the Tidyverse in Swirl
Author(s): Rachel Hartnett
Mount St. Mary's University
- Cleaning_Data.swc (SWC | 12 KB)
- Final Swirl Lesson Plan_Cleaning Data.pdf (PDF | 141 KB)
- GitHub - swirldev/R_Programming_E: Team swirl's R Programming Course with Email Notifications
- DataDryad link to dataset used
- Blog post on data cleaning with R and the tidyverse: detecting missing values
- QUBES - Resources: Importing Data into R
- QUBES - Resources: Sampling Distributions and Null Distributions: two swirl lessons in R
- License terms
Description
This swirl lesson provides baseline knowledge of the steps needed to deal with missing data and of how to carry them out in R with the tidyverse package. By the end of the lesson, students should be able to write a checklist for cleaning missing data from their datasets, modify a dataset that contains missing data in R, and export that dataset to a .csv file. RStudio is recommended because the lesson relies on some of its features.
This lesson is designed to be completed in about 35 minutes during class; however, due to COVID-19 disruptions to in-person classes, it was implemented online. Future modifications could include more advanced visualizations and different datasets that require different assumptions to be met or need to be cleaned in different ways. In addition, I hope to add further lessons that cover more data cleaning steps.
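As a rough illustration of the kind of workflow the lesson covers, the sketch below detects missing values, removes incomplete rows, and exports the result to a .csv file. It is not taken from the lesson itself: the toy data frame, its column names, the -999 missing-value code, and the output file name are all assumptions made for this example, and dropping incomplete rows is only one of several ways to handle missing data.

```r
# Minimal sketch, not the lesson's own code. All data and names are hypothetical.
library(dplyr)
library(tidyr)
library(readr)

# Toy dataset standing in for a real one (the lesson uses a Dryad dataset).
raw <- tibble(
  site   = c("A", "A", "B", "B", "C"),
  length = c(12.1, NA, 9.8, -999, 11.4),  # -999 is an assumed missing-value code
  mass   = c(3.2, 2.9, NA, 2.5, 3.0)
)

# Step 1: recode the placeholder value as a true NA.
recoded <- raw %>%
  mutate(length = na_if(length, -999))

# Step 2: count missing values in each column.
na_counts <- recoded %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Step 3: keep only rows with no missing values.
cleaned <- recoded %>%
  drop_na()

# Step 4: export the cleaned data to a .csv file (written to a temporary folder here).
out_path <- file.path(tempdir(), "cleaned_data.csv")
write_csv(cleaned, out_path)
```

Whether removing rows is appropriate depends on the dataset and the planned analysis; inspecting `na_counts` first helps show how much data would be lost.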
Cite this work
Researchers should cite this work as follows:
- Hartnett, R. (2020). Cleaning Data with R and the Tidyverse in Swirl. Make Teaching with R in Undergraduate Biology Less Excruciating 2020, QUBES Educational Resources. doi:10.25334/SKPR-CB95