- Data is the New Science Module.docx (DOCX | 3 MB)
- GBIF User Guide.docx (DOCX | 3 MB)
- IDigBio User Guide.docx (DOCX | 2 MB)
- Data-is-the-new-science-R-Remix.docx (DOCX | 242 KB)
- Data-is-the-new-science-R-Remix.Rmd (RMD | 18 KB)
- License terms
Description
The landscape is changing for those joining the 21st-century workforce. Rapid advances in data research and technology are transforming how we conduct science. The volume and variety of data being generated, the increased accessibility of data for aggregation, the improved discoverability of data, and the increasingly collaborative and interdisciplinary nature of scientific research are driving the need for new skill sets to address scientific issues of critical national and global importance.
The biodiversity sciences have experienced a rapid mobilization of data that has increased the capacity to investigate large-scale issues of critical importance (e.g., climate change, zoonotic disease, resource management, invasive species, and biodiversity loss). To investigate these types of questions, the 21st-century biodiversity scientist must be fluent in integrative fields spanning evolutionary biology, systematics, ecology, geology, genetics, biochemistry, and environmental science, and must possess the quantitative, computational, and data skills to conduct research using large and complex datasets.
In this module, students will be introduced to some emerging biodiversity data resources. They will be asked to think critically about the strengths and utility of these data resources and then encouraged to think beyond the obvious about how these data could be used to answer big science questions. They will access these data and generate figures using the R programming language, guided through these steps by an R Markdown document that is best opened and modified in the RStudio Integrated Development Environment (IDE).
Students completing this module will be able to:
- Access data from biodiversity digital data repositories
- Evaluate the research utility of occurrence data derived from different sources
- Create and interpret a graph
- Use geospatial data to inform biological thinking
- Describe how a change in a system can impact multiple parts of a system
- Calculate and compare basic summary statistics for a data set
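As a rough illustration of the graphing and summary-statistics outcomes listed above, the following base R sketch works on a small toy data frame. The records, column names, and species labels are all invented for illustration; they are not taken from GBIF, iDigBio, or the module's Rmd, which may structure its data and analysis differently.

```r
# Toy occurrence records. Every value here is invented for illustration;
# real records would come from a repository download.
occ <- data.frame(
  species  = c("A", "A", "A", "B", "B", "B"),
  source   = c("specimen", "specimen", "observation",
               "observation", "observation", "specimen"),
  year     = c(1950, 1975, 2010, 2005, 2015, 1990),
  latitude = c(40.1, 41.3, 42.0, 35.2, 36.8, 34.9)
)

# Basic summary statistics (min, quartiles, mean, max) for all latitudes
lat_summary <- summary(occ$latitude)
print(lat_summary)

# Compare mean latitude between the two (hypothetical) species
mean_lat <- tapply(occ$latitude, occ$species, mean)
print(mean_lat)

# Count how many records come from each kind of source
records_by_source <- table(occ$source)
print(records_by_source)

# A simple graph: number of records per source
barplot(
  records_by_source,
  xlab = "Record source",
  ylab = "Number of records",
  main = "Toy occurrence records by source"
)
```

The same pattern (summarize, compare groups, then plot) should carry over to a real occurrence table, although the column names in an actual download will likely differ.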
Notes
This adaptation of the "Data is the New Science" OER covers Activities 1-3 of the original resource using the R programming language. Each activity is facilitated in R via the included R Markdown (Rmd) document. The adaptation can be completed in approximately 2 hours of class time, provided that the learners have access to the RStudio IDE, either installed on their personal computers or available through an institutional resource.
The included R Markdown document can be `knitted` into a Word document before the students edit it, as long as the required R packages are already installed. The necessary installs can be completed either by following the suggested installs when the Rmd is first opened or by running the code chunk under the third-level section titled "Install new packages". The included Word document is an example of the output produced by knitting the original Rmd file.
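For instructors who prefer to prepare the document from the R console rather than through the RStudio buttons, the install-then-knit steps might look something like the sketch below. The package list is a placeholder assumption (the actual requirements are listed in the Rmd's "Install new packages" chunk), and `missing_packages()` is a hypothetical helper defined here, not part of the module.

```r
# Placeholder package list (an assumption): replace with the packages named
# in the Rmd's "Install new packages" chunk.
required_pkgs <- c("rmarkdown", "knitr")

# Hypothetical helper: return the subset of `pkgs` that is not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

rmd_file <- "Data-is-the-new-science-R-Remix.Rmd"

# Only install and knit when the Rmd is present in the working directory
if (file.exists(rmd_file)) {
  to_install <- missing_packages(required_pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # Knit the Rmd to a Word document alongside the source file
  rmarkdown::render(rmd_file, output_format = "word_document")
}
```

Knitting to Word through `rmarkdown::render()` also relies on Pandoc, which is bundled with RStudio but may need a separate install when R is run on its own.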
Cite this work
Researchers should cite this work as follows:
- Aiello-Lammens, M. (2024). Data is the New Science - R Remix. Fall 2019 BLUE FMN, QUBES Educational Resources. doi:10.25334/DNJJ-YM55