---
title: "Mountaintop Removal Mining"
author: "Brian Tyler Dagliano"
date: "5/1/2021"
output:
  word_document: default
  html_document: default
  pdf_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

###Let's investigate the salamander data to try to make some observations regarding the effects that mountaintop removal mining (MTRM) has on streams. Throughout this analysis I would like you to imagine that you are a biologist, working for a state agency, trying to understand which activities most impact wildlife. As a biologist, it is important that your analysis of the data can be used to inform decisions about the management of the forest.

##In order to run this analysis we will need to load the ggpubr and readr packages. If you do not have these packages already installed, you will need to install them using install.packages("PACKAGE_NAME").

```{r}
library("ggpubr") #This package is used to create plots
library("readr") #This package is used to read the .csv file
```

##Bringing in the data

#It is important that you have put the data into your working directory, and that the file name of the data matches what you have written in the R code. Once you have brought the data in, it is good practice to view the data to ensure it has been loaded correctly.

```{r}
Data <- read_csv("Price_et_al_2015_salamander_data (1).csv")
if (interactive()) View(Data) #View() opens a data viewer, so it is only run in an interactive session and is skipped when knitting
```

##Test for normality

#The first step of our analysis will require us to test for normality. The term "normality" refers to whether the data follow a normal distribution. If a dataset is normally distributed, then its distribution will look similar to a bell curve, with nearly as many data points above the mean as there are below. Also, there won't be many data points much higher or lower than the mean. If the data are considered normally distributed, then we will use a parametric test (because these tests assume a normal distribution); however, if they are not normally distributed, then we will use a non-parametric test.
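#As a rough illustration of the bell-curve idea (separate from the salamander data), the sketch below builds two small made-up samples, one symmetric and bell-shaped and one strongly right-skewed, and checks what fraction of each sample lies above its own mean. The names Toy.Normal, Toy.Skewed, Fraction.Above.Mean and Toy.Summary are hypothetical and exist only for this example.

```{r}
#Made-up samples for illustration only; these are NOT salamander counts
Toy.Normal <- qnorm(ppoints(50)) #50 evenly spaced quantiles of a normal distribution (bell-shaped)
Toy.Skewed <- qexp(ppoints(50)) #50 evenly spaced quantiles of an exponential distribution (right-skewed)

#Hypothetical helper: the fraction of values in x that lie above the mean of x
Fraction.Above.Mean <- function(x) mean(x > mean(x))

Toy.Summary <- c(Normal = Fraction.Above.Mean(Toy.Normal),
                 Skewed = Fraction.Above.Mean(Toy.Skewed))
Toy.Summary
```

#In the bell-shaped sample about half of the values sit above the mean, while in the skewed sample a few large values pull the mean upward so that noticeably fewer than half of the values sit above it. This is only a quick intuition check under these made-up samples, not a formal test of normality.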
#One way we can check for normality is by looking at the data to see whether it looks normally distributed.

```{r}
ggdensity(Data$Dusky_salamander, main = "Density plot Salamander Count", xlab = "Salamander Abundance")
```

#While this graph does not appear to be distributed normally, relying solely on a visual analysis would not be wise. We can also use the Shapiro-Wilk test to test for normality. If the p-value is greater than 0.05, there is no evidence against normality and we treat the dataset as normally distributed; if it is less than 0.05, we conclude that the dataset is not normally distributed.

```{r}
shapiro.test(Data$Dusky_salamander)
```

#Given the result of the Shapiro-Wilk test, we can see that the dataset is not normally distributed and that we should use a non-parametric test when analyzing the data. Now that we know what kind of test we should use, we can move on with our analysis.

##Visualize the data

#Next, we will create a box-and-whisker plot of the Dusky Salamander data to visualize the data. This will allow us to compare the Dusky Salamander reference sites with the sites affected by MTRM.

```{r pressure, echo=FALSE}
ggboxplot(Data, x = "Mined", y = "Dusky_salamander", #This line of code identifies which variables in the dataset will be used for the x and y axes
          color = "Mined", palette = c("#00AFBB", "#E7B800"), #This line of code colors the boxes by group (which creates the label on top of the plot) and assigns a color to each of the groups
          order = c("Mined", "Reference"), #This line of code sets the order in which the groups will be listed
          ylab = "Dusky Salamander Count", xlab = "Mined vs Reference") #This line of code assigns the x and y axis labels
```

#By looking at this plot, we can see that the sites near the MTRM sites had far fewer Dusky Mountain Salamanders. However, we should still run a statistical test that will allow us to test our null hypothesis.

#Reminder:

#Our null hypothesis: There is no difference in Dusky Salamander abundance in streams near MTRM sites and those separate from MTRM sites.
#Alternate hypothesis: There is a difference in the Dusky Salamander abundance in streams near MTRM sites and those separate from MTRM sites.

##Kruskal-Wallis Test

#To test our null hypothesis we will run a Kruskal-Wallis test. This test compares two or more groups using the ranks of the observations, rather than the raw values, to determine whether there is statistical evidence that at least one group tends to have larger values than another. (The Kruskal-Wallis test is similar to the One-Way ANOVA test that was explained in the video; however, it compares ranks instead of means, so it does not rely on the assumption of normally distributed data. It is the widely recommended alternative to the One-Way ANOVA test when the data are not normally distributed.)

```{r}
Data$Mined <- as.factor(Data$Mined) #This tells R that our data column labeled "Mined" should be considered a factor, so that kruskal.test treats it as the grouping variable.
Dusky.Test <- kruskal.test(Data$Dusky_salamander, Data$Mined)
# Summary of the analysis
Dusky.Test
```

#Looking at the p-value for this test, we can reject the null hypothesis that there is no difference in Dusky Salamander abundance in streams near MTRM sites and those separate from MTRM sites. We therefore accept the alternate hypothesis that there is a difference in the Dusky Salamander abundance in streams near MTRM sites and those separate from MTRM sites.

#As mentioned in the lesson, it is possible that the Dusky Salamanders can be used to indicate potential negative impacts on other species of salamanders. However, it is also important to look at other species whenever possible, because many of them have specific environmental needs. These differences can make some species more or less resilient than others to specific stressors. Our dataset also includes data for the Seal Salamander, another salamander commonly found in Appalachian Mountain streams.
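#A side note on the Kruskal-Wallis test used above: when there are only two groups, as with Mined vs Reference, it gives the same p-value as the Wilcoxon rank-sum test computed with the normal approximation and no continuity correction. The sketch below checks this on a small set of made-up counts. The names Toy.Count, Toy.Group, Toy.KW and Toy.Wilcox are hypothetical, and the numbers are invented for illustration, not taken from the salamander dataset.

```{r}
#Made-up counts for illustration only; these are NOT the salamander data
Toy.Count <- c(0, 1, 0, 2, 1, 8, 10, 7, 12, 9)
Toy.Group <- factor(rep(c("Mined", "Reference"), each = 5))

#Kruskal-Wallis test, called the same way as in the analysis above
Toy.KW <- kruskal.test(Toy.Count, Toy.Group)

#Wilcoxon rank-sum test using the normal approximation without continuity correction
Toy.Wilcox <- wilcox.test(Toy.Count ~ Toy.Group, exact = FALSE, correct = FALSE)

#The two p-values should match
c(Kruskal = Toy.KW$p.value, Wilcoxon = Toy.Wilcox$p.value)
```

#Both tests work on the ranks of the counts, which is why neither one needs the data to be normally distributed.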
#We will now run the same analysis for the Seal Salamander, to see if there was a similar effect of MTRM. Your assignment is to answer the questions throughout the Seal Salamander analysis.

##Visually check for normality

```{r}
ggdensity(Data$Seal_salamander, main = "Density plot Salamander Count", xlab = "Salamander Abundance")
```

#Does the data appear to be normally distributed?

#No

##Shapiro-Wilk Test for normality

```{r}
shapiro.test(Data$Seal_salamander)
```

#What can we determine from this test?

#The dataset is not normally distributed

##Visualize the data

```{r}
ggboxplot(Data, x = "Mined", y = "Seal_salamander",
          color = "Mined", palette = c("#00AFBB", "#E7B800"),
          order = c("Mined", "Reference"),
          ylab = "Seal Salamander Count", xlab = "Mined vs Reference")
```

#What observations can be made when visualizing the box-and-whisker plot?

#The two populations are different. The mined sites have fewer Seal Salamanders than the reference sites. However, the difference does not seem to be as large as it was for the Dusky Salamanders.

##Kruskal-Wallis Test

```{r}
#Note that we do not need to use the as.factor code again because we have already done it for this dataset.
Seal.Test <- kruskal.test(Data$Seal_salamander, Data$Mined)
# Summary of the analysis
Seal.Test
```

##What hypothesis is being tested?

#The null hypothesis: There is no difference in Seal Salamander abundance in streams near MTRM sites and those separate from MTRM sites.

##What is the alternate hypothesis?

#There is a difference in the Seal Salamander abundance in streams near MTRM sites and those separate from MTRM sites.

##Is the p-value of this Kruskal-Wallis test significant?

#Yes

##Comparing the p-values of the two Kruskal-Wallis tests, which salamanders are more affected by MTRM?

#The Dusky Salamanders seem to have been affected more than the Seal Salamanders

##Using online sources, collect the same background information we studied for the Dusky Salamanders, for the Seal Salamanders. (Appearance, Habitat, Life Style, Conservation status) Include the website link of your source.

#Seal Salamanders are patterned on top and pale underneath.
The markings on top are mostly black and brown. Seal Salamanders are usually found in hardwood forests, in small to medium streams with cool, well-aerated water. They are found under many kinds of cover objects along the streambanks. Seal Salamanders are abundant in areas of their preferred habitat. Studies spanning 20 years have found no significant changes in abundance over that timespan.

##Lastly, what attributes of the Seal Salamanders make them likely to be affected by MTRM?

#Because they prefer cool, well-aerated water, mining activities that change the water temperature or other properties of the water could have negative effects on them. Additionally, mining activities could affect the availability of cover objects in their habitat.
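#One caution when comparing the two species: a p-value tells us how strong the evidence for a difference is, not how large the difference is. A simple complement is to compare the median count in each group for each species. The sketch below shows the idea on a made-up data frame called Toy (a hypothetical name) whose column names mirror the ones used above. The counts are invented for illustration and are not the results of this study.

```{r}
#Made-up counts for illustration only; these are NOT the salamander data
Toy <- data.frame(
  Mined = rep(c("Mined", "Reference"), each = 5),
  Dusky_salamander = c(0, 1, 0, 2, 1, 8, 10, 7, 12, 9),
  Seal_salamander = c(1, 2, 0, 3, 2, 4, 5, 3, 6, 4)
)

#Median count of each species within each group
Toy.Medians <- aggregate(cbind(Dusky_salamander, Seal_salamander) ~ Mined,
                         data = Toy, FUN = median)
Toy.Medians
```

#To try this on the real dataset, the same aggregate call could be run with data = Data, assuming the columns are named as in the code above.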