Introduction to Primate Data Exploration and Linear Modeling with R
Author(s): Raisa Hernández-Pacheco¹, Alexandra L. Bland¹, Alexis A. Diaz¹, Alexandra G. Rosati², Stephanie J. Gonzalez¹
¹California State University, Long Beach; ²University of Michigan
- Module M.1 R basics for data exploration and management (v1.0)
- Module M.2 Descriptive statistics (v1.0)
- Module M.3 Visualizing data with ggplot2 (v1.0)
- Module M.4 Simple linear regression analysis (v1.0)
- Module M.5 Linear mixed-effects models (v1.0)
- Module M.6 Generalized linear models (v1.0)
- Module M.7 Generalized linear mixed-effects models (v1.0)
- Module M.8 Survival analysis (v1.0)
Description
The main goal of this series of modules is to provide training in data exploration, visualization, and linear modeling with R to undergraduate biology research students. All modules use authentic data from the Cayo Santiago rhesus macaques, a free-ranging nonhuman primate population living in naturalistic conditions on the island of Cayo Santiago, Puerto Rico. By introducing this primate study system and authentic research explorations, we aim to engage undergraduate research students with fundamental quantitative tools in modern population ecology, demography, and evolutionary cognition.
This series of modules was implemented during a Summer Research Primate Population Ecology Internship at the Quantitative Ecology Lab of California State University, Long Beach in 2023. Each module was designed as a data-driven, self-guided, student-paced activity. Each module includes Guiding questions with answers, as well as Challenges, and concludes with one or more Discussion questions. Challenges and Discussion questions can be used as follow-ups for discussion and clarification.
Although the modules form a series, each one can be implemented independently, as long as students have access to the prior modules for reference. This series was not designed to introduce R or to provide formal training in statistics. For a more in-depth introduction to R and statistics, students can refer to the Extra training section that each module provides.
Versions of modules for teachers that provide answers to each challenge are available upon request.
Table of Contents
Module M.1. R basics for data exploration and management
Module M.2. Descriptive statistics
Module M.3. Visualizing data with ggplot2
Module M.4. Simple linear regression analysis
Module M.5. Linear mixed-effects models
Module M.6. Generalized linear models
Module M.7. Generalized linear mixed-effects models
Module M.8. Survival analysis
Notes
Structure and functions
All modules are structured as a brief Introduction to the study system and data, followed by a combination of coding demonstrations, guiding questions, and challenges. Each module builds on the previous ones, developing both statistical knowledge and programming skills. Functions from base R and from add-on packages are summarized in the R Notation and functions index of each module.
R and RStudio version
This series of modules was created using R version 4.2.2 and RStudio version 2022.12.0+353.
R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Acknowledgments
The creation of these modules was funded by the National Science Foundation DBI BRC-BIO award 2217812. Cayo Santiago is supported by the Office of Research Infrastructure Programs (ORIP) of the National Institutes of Health, grant 2 P40 OD012217, and the University of Puerto Rico (UPR), Medical Sciences Campus. Additional support was provided by the National Science Foundation Graduate Research Fellowship Program to Alexis A. Diaz, award 2141410, and by the National Institutes of Health to Stephanie J. Gonzalez, award T32GM138075.
Future adaptations
We welcome adaptations to our work! Editable files (.Rmd) are available upon request.
Cite this work
Researchers should cite this work as follows:
- Hernández-Pacheco, R., Bland, A. L., Diaz, A. A., Rosati, A. G., Gonzalez, S. J. (2023). Introduction to Primate Data Exploration and Linear Modeling with R. QUBES Educational Resources. doi:10.25334/T0ZY-PK40