Students will clean an open-source polar bear dataset, using best practices to accurately and clearly document each step taken to collect, clean, and analyze open-access biodiversity data. This exercise uses Excel.
Upon completion of this module, each student should be able to:
- Access biodiversity data from open sources.
- Use descriptive, retrievable, and consistent file names to manage datasets.
- Identify common problems with digital datasets.
- Rectify common problems with digital datasets.
- Apply disciplinary knowledge for smart data cleaning.
- Explain the importance of reproducible data and cleaning steps.
- Document data cleaning steps to ensure reproducibility.
This resource was developed in part at the 2019 QUBES & BioQUEST Summer Institute, "Evolution of Data in the Classroom: From Data to Data Science."
Cite this work
Researchers should cite this work as follows: