--- title: "QUBES One-Way ANOVA Homework Assignment KEY" output: html_notebook: default pdf_document: default html_document: df_print: paged --- ### Preface This is an interactive [R Markdown](http://rmarkdown.rstudio.com) Notebook file that follows (with some modifictions) the text of Chapter 5 of [*Getting started with R: an introduction for biologists, 2nd edition*](http://r4all.org/) by Andrew P. Beckerman, Dylan Z. Childs, and Owen L. Petchey. The first thing you should do is click *File*, and then *Save As* on the menu bar above and rename the file "Your Last Name_ANOVA.Rmd" ### Introduction Natural and managed ecosystems provide a variety of ecological, economic, and cultural benefits to humans that include food, water, fuel, disease control, and recreation. Many ecosystem functions and services are positively related to local biodiversity, yet most ecosystems have been altered by human activity such that they currently exhibit reduced biodiversity relative to their pre-disturbance states and diversity continues to be lost at an unprecedented rate. As a consequence, disturbed ecosystems also typically exhibit deficits of function that may persist for years to decades depending upon the ecosystem and type of disturbance. Identifying factors that will accelerate the return of key species and associated processes in ecosystems that have been damaged or destroyed by human activities is therefore a critical research priority. In recent decades, ecological restoration has become an essential tool for improving degraded ecosystems. A major challenge for practitioners is to reestablish communities that support ecosystem functions and services at levels comparable to those under pre-disturbace or reference conditions. Unfortunately, current restoration practice often fails to accomplish this goal. Given well-established positive relationships between biodiversity and multiple ecosystem functions, explicitly incorporating genetic, species, or trait diversity into habitat restoration efforts might be expected to result in increased ecosystem performance over time. Researchers from California State University, Long Beach tested this hypothesis by manipulating salt marsh plant species richness as part of a coastal wetland restoration project in southern California. They planted different numbers of salt marsh plant species (*n* = 1, 3, or 6 species) in experimental plots in Colorado Lagoon, Long Beach, CA (Fig. 1), and quantified percent cover (a proxy for plant biomass production) as a measure of ecosystem functioning after one and three years. Here we will consider the data collected three years post-restoration. ![Fig. 1. Experimental plots with either 1, 3, or 6 salt marsh plant species one year post-restoration.](experimental_plots.png) Evaluate the following questions: * What was the effect (if any) of salt marsh plant species richness on mean percent cover in experimental plots after three years? * If there is evidence of variation among diversity treatments, which group or groups were different from the others? * Given the observed data, what recommendation would you make to a local resource manager about what to plant in future salt marsh restoration efforts? As always, your workflow should be: *Plot* -> *Model* -> *Check Assumptions* -> *Interpret* -> *Plot Again*. 1. Load specific add-on packages into R (i.e., _**tidyverse**_, _**ggfortify**_, _**gridExtra**_, and _**multicomp**_): ```{r} library(tidyverse) library(ggfortify) library(gridExtra) library(multcomp) ``` 2. Import the "plant_diversity.csv" data file, give the resulting object a short descriptive name (e.g., "diversity"), and view it as a tibble: ```{r} diversity <- read_csv("plant_diversity.csv") diversity ``` 3. Visualize the Year 3 data with box-and-whisker plots. Make two different graphs: one that groups percent cover data by the categorical variable treatment and another by the continuous variable richness; use the **grid.arrange**() function in _**gridExtra**_ to organize the two plots side by side into a single figure: ```{r} treatment_plot <- ggplot(data = diversity, aes(x = treatment, y = cover_year3)) + geom_boxplot() + theme_bw() richness_plot <- ggplot(data = diversity, aes(group = richness, x = richness, y = cover_year3)) + geom_boxplot() + theme_bw() grid.arrange(treatment_plot, richness_plot, nrow = 1) ``` Note that by treating richness as a categorical variable (treatment), the spacing between levels on the *x*-axis is constant, such that the distance between one and three species is the same as the distance between three and six. There's also the issue of the current ordering of the groups - R does love the alphabet. In order to preserve the correct spacing between diversity treatment levels, it is necessary to plot richness as a continuous variable. As percent cover data were collected separately by species, combined values can exceed 100. 4. Reorder the levels of treatment from smallest to largest and check that it worked. ```{r} diversity <- diversity %>% mutate(treatment = as_factor(treatment)) %>% mutate(treatment = fct_relevel(treatment, "one", "three", "six")) levels(diversity$treatment) ``` 5. Construct a one-way analysis of variance (ANOVA), including diagnostic residuals plots: ```{r} diversity_model <- lm(cover_year3 ~ treatment, data = diversity) autoplot(diversity_model, smooth.colour = NA) ``` Evaluate the residuals plots above: do the data likely meet the assumptions of ANOVA? Why or why not? **Type your answer below**: * ANSWER GOES HERE... 6. Regardless of your previous answer, assume that a one-way ANOVA is appropriate for these data. Report the results of your analysis: ```{r} anova(diversity_model) summary(diversity_model) ``` 7. If you decide to reject the null hypothesis for the overall ANOVA, use the **glht**() function (= general linear hypothesis test) in the _**multcomp**_ package to generate Tukey's multiple comparision tests to determine which treatment means are different: ```{r} diversity_model_MC <- glht(diversity_model, linfct = mcp(treatment = "Tukey")) summary(diversity_model_MC) ``` 8. Using Figure 2 below as a guide, make a publication quality plot of the eelgrass data (replacing the Xs with letters denoting results of the Tukeys tests); export and save it to your Projects folder as a .png file (sized 5 in wide x 3 in high): ![Fig. 2. Example means plot with Tukeys results.](example_diversity_fig.png) ```{r} diversity_means <- diversity %>% group_by(richness) %>% summarise(mean_cover = mean(cover_year3)) diversity_means example_diversity_fig <- ggplot(data = diversity, group = richness, aes(x = richness, y = cover_year3, color = richness)) + geom_point(size = 2, alpha = 0.5) + geom_point(data = diversity_means, aes(x = richness, y = mean_cover, color = richness), shape = 18, size = 4) + scale_x_continuous(breaks = c(1:6, 1)) + scale_y_continuous(limits = c(0, 150)) + xlab("Salt marsh plant species richness") + ylab("Percent cover") + guides(color = FALSE, size = FALSE) + theme_minimal() + annotate("text", x = 1, y = 150, label = "A", size = 3) + annotate("text", x = 3, y = 150, label = "B", size = 3) + annotate("text", x = 6, y = 150, label = "B", size = 3) example_diversity_fig ``` ```{r} ggsave("diversity_fig.png", example_diversity_fig, width = 6.5, height = 3) ``` 9. Report your interpretation of the results of your analyses, in light of your original questions. Be specific; use evidence from your analyses above to support your claims: * Write a complete figure caption to accompany the graph you produced in step 8. * What was the effect (if any) of salt marsh plant species richness on mean percent cover in experimental plots after three years? * If there is evidence of variation among diversity treatments, which group or groups were different from the others? * Given the observed data, what recommendation would you make to a local resource manager about what to plant in future salt marsh restoration efforts? ### Wrap-up Well done! Save your file and exit the R session.