Data is the New Science - R Remix

Author(s): Matthew Aiello-Lammens

Pace University

Summary:
In this module, students will be introduced to some emerging biodiversity data resources. They will be asked to think critically about the strengths and utility of these data resources and apply what they have learned to a research question. They will work through this module using an R Markdown document, preferably in the RStudio Integrated Development Environment.

Licensed under CC Attribution-ShareAlike 4.0 International according to these terms

Version 1.0 - published on 26 Nov 2024, doi:10.25334/DNJJ-YM55

Adapted from: Data is the New Science v 1.0

Description

The landscape is changing for those joining the 21st-century workforce. Rapid advances in data research and technology are transforming how we conduct science. The volume and variety of data being generated, the increased accessibility of data for aggregation, the improved discoverability of data, and the increasingly collaborative and interdisciplinary nature of scientific research are driving the need for new skill sets to address scientific issues of critical national and global importance.

The biodiversity sciences have experienced a rapid mobilization of data that has increased the capacity to investigate large-scale issues of critical importance (e.g., climate change, zoonotic disease, resource management, invasive species, and biodiversity loss). To investigate these types of questions, the 21st-century biodiversity scientist must be fluent in integrative fields spanning evolutionary biology, systematics, ecology, geology, genetics, biochemistry, and environmental science, and must possess the quantitative, computational, and data skills to conduct research using large and complex datasets.

In this module, students will be introduced to some emerging biodiversity data resources. They will be asked to think critically about the strengths and utility of these data resources and then encouraged to think beyond the obvious about how these data could be used to answer big science questions. They will access these data and generate figures using the R programming language, guided through these steps by an R Markdown document that they will work with and modify, preferably in the RStudio Integrated Development Environment.

Students completing this module will be able to:

  • Access data from biodiversity digital data repositories
  • Evaluate the research utility of occurrence data derived from different sources
  • Create and interpret a graph
  • Use geo-spatial data to inform biological thinking
  • Describe how a change in a system can impact multiple parts of a system
  • Calculate and compare basic summary statistics for a data set
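As a rough illustration of the last objective, the sketch below computes and compares summary statistics for occurrence records from two sources using base R only. The data frame, its column names, and all values are invented for illustration; they are not taken from the module or from any repository.

```r
# Hypothetical occurrence records; every value here is invented for illustration.
occurrences <- data.frame(
  source   = c("museum", "museum", "museum", "citizen", "citizen", "citizen"),
  latitude = c(40.1, 41.3, 42.5, 38.0, 40.0, 45.0)
)

# Mean, standard deviation, and record count of latitude for each data source.
summarize_by_source <- function(df) {
  stats <- aggregate(
    latitude ~ source,
    data = df,
    FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x))
  )
  # aggregate() stores the three statistics in a matrix column;
  # flatten it into ordinary columns named mean, sd, and n.
  data.frame(source = stats$source, stats$latitude)
}

summary_table <- summarize_by_source(occurrences)
print(summary_table)
```

Comparing the `mean` and `sd` columns across rows is one simple way to start asking whether records from different sources cover the same geographic range.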

Notes

This adaptation of the "Data is the New Science" OER covers Activities 1-3 of the original resource using the R programming language. Each activity is facilitated in R via the included R Markdown (Rmd) document. The adaptation can be completed in approximately 2 hours of class time, provided that the learners have access to the RStudio IDE: either they have already installed RStudio on their personal computers or they are accessing RStudio through an institutional resource.

The included R Markdown document can be `knitted` into a Word document before the students edit it, as long as the required R packages are already installed. The necessary packages can be installed either by following the suggested installs when the Rmd is first opened or by running the code chunk under the third-level section titled "Install new packages". The included Word document shows the output of knitting the original Rmd file.
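For instructors who prefer to script these two steps, a minimal sketch follows. It assumes the `rmarkdown` package and a working pandoc installation (RStudio bundles one). The package list and the Rmd file name are placeholders, not the module's actual requirements; the authoritative list is the "Install new packages" chunk in the Rmd itself.

```r
# Placeholder package list; replace with the packages named in the
# module's "Install new packages" chunk.
required_packages <- c("rmarkdown", "knitr")

# Return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install anything missing, then knit the Rmd to a Word document.
# `rmd_path` is a hypothetical file name supplied by the caller.
knit_to_word <- function(rmd_path, pkgs = required_packages) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  rmarkdown::render(rmd_path, output_format = "word_document")
}

# Example call (not run; the file name is hypothetical):
# knit_to_word("data_is_the_new_science.Rmd")
```

Checking for missing packages before calling `install.packages()` avoids reinstalling packages that learners already have, which matters when class time is limited.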

Cite this work

Researchers should cite this work as follows: