Today’s investigation. Nature is commonly described as a balanced system. We can quantify such balance through biodiversity indices, measures of individual abundance and species richness in a community. When a community experiences a drastic change in biodiversity, it may become unbalanced and the viability of its members can be threatened. For community biologists, studying ecosystem engineers is key to understanding these balance dynamics. However, what happens when a new invasive species in the community takes on such a role? Today, we will learn how to use the null distribution and a test statistic to test hypotheses about the effect of invasive species on marshes using data from the CSULB Wetlands Ecology Lab.


Introduction

In this lab, we will use the null distribution and the test statistic to test hypotheses about the effect of Lepidium latifolium on the abundance of soil invertebrates in a marsh ecosystem. In California, L. latifolium is an invasive plant hypothesized to be an ecosystem engineer. Ecosystem engineers are species that shape community dynamics. They are often native species that sustain the whole community and keep the ecosystem from collapsing. However, global change has brought a new type of engineer to certain communities. Now we see non-native, invasive species taking up the role, and biologists wonder what effects the presence of invasive species like L. latifolium has on the rest of the community.

Today, we will estimate the null distribution, test statistic, and p-value to carry out hypothesis testing and determine whether there is evidence that L. latifolium affects the community structure of native soil invertebrates (Figure 1). In her studies, Dr. Christine Whitcraft and colleagues answered this question. They found evidence indicating that L. latifolium significantly impacts brackish marsh ecosystems by changing the community structure of soil- and canopy-dwelling invertebrates. So, let’s use their data from wetlands to carry out hypothesis testing and draw appropriate conclusions about the role of invasive species as ecosystem engineers.


Figure 1. High marsh-terrestrial ecotone (transitional marsh) along margins of grassland, tidal marsh plain (marsh plain), and fringing tidal marsh (fringing marsh) at Rush Ranch Open Space Preserve (left), and *Lepidium latifolium* (right). Image: Christine Whitcraft.



Upon completion of this lab, you should be able to:


References:


Worked example

In previous chapters, we compared different estimates of variables of interest. In doing so, we asked how large the effect of a particular treatment is on the variable of interest (the response variable). Hypothesis testing, on the other hand, is used to ask whether there is an actual effect of the treatment on the variable of interest. That is, estimation gives us insight into the magnitude of a difference, but only through hypothesis testing can we evaluate whether such a difference is larger than expected from chance alone. Keep in mind that a treatment may take many forms in biology and is not limited to an actual lab treatment.


To get started, let’s define hypothesis testing and differentiate between the null hypothesis and the alternative hypothesis:


To evaluate the null hypothesis, we need a test statistic, its null distribution, and the p-value.


Now, let’s say we are studying a population of green iguanas (Iguana iguana), an invasive species in Caribbean islands, and we are interested in understanding why the sex ratio of offspring is shifted towards females during warm years. We suspect that environmental temperature during embryonic development determines whether an egg develops as male or female (a phenomenon known as temperature-dependent sex determination). Say we sample nests randomly and count the number of offspring of each sex during a warm year, finding 13 females and 3 males. At this point, the question is: do female and male offspring occur with equal frequency in the population during a warm year?

Let’s carry out the following four steps to answer our research question through hypothesis testing.


1. Stating the null and alternative hypotheses

The null hypothesis should be specific and skeptical, while the alternative hypothesis can be general but meaningful within the biological context. For our iguana example, the parameter of interest is the proportion of female offspring. Call this proportion p. Our null hypothesis is then p = 0.5; that is, the frequency of female offspring in the population is not different from that of males (50%) in a warm year:
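Written in symbols, the two hypotheses described in this step can be stated as follows. This is a reconstruction from the surrounding text; the label \(H_A\) for the alternative hypothesis is an assumed notation.

\[ \begin{aligned} H_0&: p = 0.5\\\\ H_A&: p \neq 0.5 \end{aligned} \]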



Here the alternative hypothesis allows two possibilities: p is either greater than or less than 0.50. Because of this, we call it a two-sided or two-tailed hypothesis.


2. Estimating the test statistic

For our example, the observed number of female offspring is our test statistic. According to \(H_0\), 50% of the offspring are expected to be females. Thus, we expect 8 out of the 16 offspring sampled to be females. However, our sample resulted in 13 females and 3 males. Thus, our test statistic is 13.


3. Estimating the null distribution and the p-value

To evaluate our data under the null hypothesis we need to estimate the probability that the observed data (13 female and 3 male offspring iguanas) is explained by chance. That is, even though the expectation under the null hypothesis is 8 female and 8 male offspring iguanas (50% each), we may not observe exactly such distribution in our data even when \(H_0\) is true just because of chance. Recall from Chapter 4 that a sample statistic is a variable that can take different values when sampling a population and such distribution of values is described by the sampling distribution (Chapter 4, Worked example). In the same way, our test statistic under the null hypothesis is a variable that can take a distribution of different values due to chance.

For example, imagine you are tossing a coin 16 times in the air and counting the number of heads. We know that there is a 50% chance of getting heads on each of those 16 tosses (like our null hypothesis in the iguana example), but say that you counted 13 heads and 3 tails. The null expectation (50% chance of heads) is still true! We just got the observed data due to chance. Thus, to evaluate our data under the null hypothesis, we need to estimate the set of values the test statistic can take under the null hypothesis and their associated probabilities. This is called the null distribution.
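The coin-tossing idea can be sketched in R. This is a minimal illustration, not necessarily how the null distribution shown in this chapter was produced: it assumes each of the 16 offspring is independently female with probability 0.5, and the object names, the seed, and the number of simulations are arbitrary choices.

```r
# Simulating the null distribution for the iguana example.
# Assumption: under H0, each of the 16 offspring is female with
# probability 0.5, independently of the others (16 fair coin tosses).
set.seed(1)            # arbitrary seed, only for reproducibility

n_offspring <- 16      # sample size from the worked example
p_null      <- 0.5     # proportion of females under H0
n_sims      <- 10000   # number of simulated samples (an assumed choice)

# Each simulated sample yields one value of the test statistic:
# the number of females out of 16
sim_females <- rbinom(n_sims, size = n_offspring, prob = p_null)

# The relative frequency of each possible value (0 to 16)
# approximates the null distribution
null_sim <- table(factor(sim_females, levels = 0:n_offspring)) / n_sims

# Exact probabilities from the binomial distribution, for comparison
null_exact <- dbinom(0:n_offspring, size = n_offspring, prob = p_null)

round(cbind(females   = 0:n_offspring,
            simulated = as.numeric(null_sim),
            exact     = null_exact), 4)
```

Because the simulated column depends on random draws, it changes slightly with the seed and settles closer to the exact column as the number of simulations grows.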

For our example, we get the following null distribution:


Figure 2. Null distribution for the test statistic; the number of females out of 16 sampled offspring iguanas. According to this, the probability of observing exactly 8 females out of 16 offspring sampled under the null hypothesis is about 20%.



The test statistic varies widely, but its values are not equally probable across the distribution; some values are more likely than others. For example, the null distribution is centered at 8 female offspring, as expected from \(H_0\). So, how do we know if the observed data (13 female and 3 male offspring iguanas) represent a significant difference or just chance under the null hypothesis? For this we need to estimate the probability of obtaining the observed data (or more extreme data values) if the null hypothesis is true. This is called the p-value.

With the p-value, we are asking ourselves if \(H_0\) is true, how unusual is the data we collected? For our iguana example, we are asking if female and male offspring are equally frequent during a warm year, how unusual is obtaining 13 females (or a number more extreme than that) from a random sample of 16 iguanas? If the resulting probability is extremely low, then we have evidence to reject the null hypothesis. If this probability is high, then we do not have evidence to reject the null hypothesis as there is a high chance of obtaining the observed data under the null hypothesis just by chance. To estimate this probability, the p-value, we use the null distribution of the test statistic.

Below are the values of the null distribution. Our expectation under the null hypothesis is in red; we expect 8 female offspring out of 16 sampled iguanas. On the other hand, the probability of obtaining the observed data or a number more extreme than the one observed (the test statistic: 13 females) is the sum of all the corresponding mutually exclusive probabilities of obtaining either 13, 14, 15, or 16 females (purple values). Keep in mind that our test is two-sided, and thus we need to consider both sides or tails of the null distribution.


No. females Probability
0 0.0000
1 0.0002
2 0.0019
3 0.0089
4 0.0291
5 0.0671
6 0.1166
7 0.1771
8 0.1967
9 0.1691
10 0.1185
11 0.0754
12 0.0280
13 0.0081
14 0.0028
15 0.0005
16 0.0000


Recall from Chapter 4 that we can use the addition rule to estimate the probability of getting either one of a group of different mutually exclusive events. Here we want the probability of getting 13 or more females out of 16 sampled iguanas.

For our example,

\[ \begin{aligned} \Pr[\text{13 or more females}]&=\Pr[13]+\Pr[14]+\Pr[15]+\Pr[16]\\\\ &=0.0081+0.0028+0.0005+0.0000\\\\ &=0.011 \end{aligned} \]


As this is a two-sided test, we also need to add the probability of observing 0, 1, 2, or 3 females (the left tail of the null distribution) to obtain the p-value.

A short approximation to adding the eight probabilities is multiplying one side by 2:


\[ \begin{aligned} \text{p-value}&=(2)(\Pr[13]+\Pr[14]+\Pr[15]+\Pr[16])\\\\ &=(2)(0.011)\\\\ &=0.022 \end{aligned} \]


In this case, the probability of observing 13 female offspring iguanas, or a more extreme value, out of a random sample of 16 is 0.022, assuming the null hypothesis is true.
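The same arithmetic can be reproduced in R from the tabulated null distribution. The sketch below uses hypothetical object names and compares doubling one tail with adding both tails directly. As a cross-check that is not part of the procedure above, it also calls base R's exact binomial test.

```r
# Null distribution from the table above (number of females: 0 to 16)
females <- 0:16
prob <- c(0.0000, 0.0002, 0.0019, 0.0089, 0.0291, 0.0671, 0.1166, 0.1771,
          0.1967, 0.1691, 0.1185, 0.0754, 0.0280, 0.0081, 0.0028, 0.0005,
          0.0000)

observed <- 13   # test statistic: number of females in the sample

# Right tail: 13 or more females
p_right <- sum(prob[females >= observed])

# Left tail: values equally far below the null expectation of 8 (3 or fewer)
p_left <- sum(prob[females <= 16 - observed])

p_doubled <- 2 * p_right        # shortcut: multiply one tail by 2
p_summed  <- p_left + p_right   # add both tails directly

round(c(right = p_right, left = p_left,
        doubled = p_doubled, summed = p_summed), 4)

# Cross-check with the exact binomial test (not derived from the table)
binom.test(observed, n = 16, p = 0.5, alternative = "two.sided")$p.value
```

Without rounding the intermediate sum, doubling gives about 0.0228 and adding both tails gives about 0.0224; the exact binomial test gives about 0.021. All three agree with the p-value of roughly 0.02 reported in the text.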


4. Drawing a conclusion

By convention, we use a significance level of 0.05 to determine statistical significance. A p-value < 0.05 indicates that we can reject the null hypothesis. In our example, if the null hypothesis is true, the probability of obtaining data at least as extreme as those observed is just 0.02, or 2%. As this is below 0.05, we reject \(H_0\). Here we conclude that female and male offspring are not equally frequent during a warm year, with most iguana offspring being females.


Info-Box! Results in any report or manuscript need to include the following information in parentheses so that the reader has everything needed to understand your conclusions:

  • test statistic
  • sample size
  • p-value

For example, the results from the Worked example could be reported as: “The frequency of female offspring iguanas during a warm year is significantly different from that of males (\(test=13\), n = 16, p-value = 0.02)”.



Materials and Methods

Today’s activity, Invasion of the marsh, is organized into two main exercises exploring the effects of L. latifolium on the abundance of soil invertebrates in California marsh ecosystems. These exercises will also motivate inferences about the role of invasive species as ecosystem engineers.


Invasion of the marsh


Research question 1: Does the presence of L. latifolium affect the abundance of soil invertebrates?

1. Import the data

Let’s start by importing the “marsh” dataset into RStudio and exploring it. Hint: review past R scripts and don’t forget to use the metadata file for reference.
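As a reminder of the general pattern, here is a minimal sketch of importing and exploring a CSV file. It does not use the lab data: it writes a tiny made-up data frame to a temporary file so that it runs on its own. The column names follow the code used later in this lab, while the values, the "native" and "canopy" labels, and the object names are hypothetical.

```r
# Hypothetical stand-in for the lab's data file: a tiny CSV written to a
# temporary location so the example is self-contained. Values are made up.
toy <- data.frame(
  plant        = c("Lep", "Lep", "native", "native"),
  invertebrate = c("soil", "canopy", "soil", "canopy"),
  stage_lep    = c("rosette", "rosette", NA, NA),
  n            = c(12, 5, 20, 7)
)
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

# Importing: with real data, replace toy_path with the path to your file
marsh_toy <- read.csv(toy_path)

dim(marsh_toy)      # number of observations (rows) and variables (columns)
str(marsh_toy)      # variable names and types
head(marsh_toy)     # first rows of the data frame
summary(marsh_toy)  # quick summary of each variable
```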


Questions:

  1. How many variables and observations does “marsh” have?
  2. Considering the research question, what are the response and explanatory variables?
  3. Considering the experiment and research question, what is the treatment?


2. State the null and alternative hypotheses

Here, we are interested in whether the presence of L. latifolium affects soil invertebrate abundance, that is, whether the presence of L. latifolium increases or decreases the abundance of invertebrates in a wetland ecosystem. Therefore, our alternative hypothesis should be two-sided.

Challenge 1. State the null and the alternative hypothesis.


Now, let’s visualize the variables of interest before testing \(H_0\). Our explanatory variable (“plant”) is categorical and the response variable (“n”) is numerical. Thus, we use a boxplot.

# loading packages
library(ggplot2)
library(tidyverse)

# filtering by invertebrate type
soil <- filter(marsh,invertebrate=="soil")

# boxplot for abundance across treatment
p1 <- ggplot(soil,aes(x=plant,y=n)) +
  geom_boxplot()

p1


Question:

  1. How would you interpret the boxplot?


Let’s check other potential features of L. latifolium that could be contributing to the observed data. For that, let’s add another categorical variable to our plot; “stage_lep” or the life history stage of the invasive plant.

# boxplot for abundance across L. latifolium stages
p2 <- ggplot(soil,aes(x=stage_lep,y=n)) +
  geom_boxplot()

p2

# removing NAs from the ggplot 
p3 <- ggplot(drop_na(soil),aes(x=stage_lep,y=n)) +
  geom_boxplot()

p3


Questions:

  1. How do these boxplots differ from the previous one?
  2. What biological meaning could this data trend have?


Keep in mind that data visualization can be more useful than this. Let’s plot all variables together!

# all together!
p4 <- ggplot(soil,aes(x=plant,y=n,fill=stage_lep)) +
  geom_boxplot()

p4


From this visual inspection, something must be going on with our treatment! Let’s formally test \(H_0\).


3. Estimate the test statistic and its precision

Let’s estimate the mean soil invertebrate abundance per plot per treatment.

# filtering data by invertebrate type and treatment
s_lep <- filter(marsh,
               invertebrate=="soil" & plant=="Lep")

# mean abundance per plot
m_s_lep <- mean(s_lep$n)

m_s_lep

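# standard error of the mean: sd divided by the square root of the
# number of plots (36 in the original script; nrow() avoids hard-coding it)
se_s_lep <- sd(s_lep$n)/sqrt(nrow(s_lep))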

se_s_lep

In this case, we say that the test statistic is 31 soil invertebrates on average per plot.
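For reference, the standard error computed in the code above is the sample standard deviation of abundance divided by the square root of the number of plots. The symbols \(s\) and \(N_{plots}\) are introduced here only for illustration and are not part of the dataset:

\[ SE_{\bar{Y}} = \frac{s}{\sqrt{N_{plots}}} = \frac{s}{\sqrt{36}} \]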


Question:

  1. What is the test statistic and its precision? Write it using an appropriate format.


4. Estimate the null distribution and the p-value

In this case, we can use the plots with native plants as a control treatment to build the null distribution below. For this, we follow the steps in Chapter 4: we randomly sample the observed number of soil invertebrates per plot in the presence of native plants and estimate the mean abundance. We repeat the process 10,000 times.
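The resampling procedure described above could be sketched as follows. This is an illustration under stated assumptions, not the code used to build the null distribution in this lab. The counts are simulated placeholders rather than the lab data, resampling is assumed to be done with replacement using as many plots as were observed, and the observed mean used at the end is a hypothetical value.

```r
set.seed(42)   # arbitrary seed for reproducibility

# PLACEHOLDER data: invertebrate counts for 36 native-plant plots.
# These are simulated values, not the lab data; replace with real counts.
native_n <- rpois(36, lambda = 25)

n_reps <- 10000   # number of resamples, as described in the text

# Assumption: each resample draws as many plots as were observed,
# with replacement, and records the mean abundance
null_means <- replicate(n_reps,
                        mean(sample(native_n, size = length(native_n),
                                    replace = TRUE)))

# Tabulate the null distribution by rounded mean abundance
null_table <- table(round(null_means)) / n_reps
null_table

# Two-sided p-value for a hypothetical observed mean,
# obtained by doubling the smaller tail (capped at 1)
obs_mean <- 27   # hypothetical value, for illustration only
p_value <- min(1, 2 * min(mean(null_means >= obs_mean),
                          mean(null_means <= obs_mean)))
p_value
```

With the real data, `native_n` would hold the counts from the native-plant plots and `obs_mean` would be the test statistic estimated in step 3.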

Figure 3. Null distribution of mean soil invertebrate abundance in marshes.


mean n Probability
11 0.0001
12 0.0007
13 0.0008
14 0.0014
15 0.0056
16 0.0098
17 0.0162
18 0.0265
19 0.0375
20 0.0444
21 0.0537
22 0.0634
23 0.0723
24 0.0757
25 0.0764
26 0.0756
27 0.0739
28 0.0693
29 0.0563
30 0.0452
31 0.0453
32 0.0356
33 0.0274
34 0.0236
35 0.0193
36 0.0120
37 0.0095
38 0.0053
39 0.0053
40 0.0035
41 0.0022
42 0.0021
43 0.0016
44 0.0009
45 0.0006
46 0.0005
47 0.0001
48 0.0002
49 0.0002


Questions:

  1. What is the p-value?
  2. Do you reject the null hypothesis? Explain.


Research question 2: Does the presence of L. latifolium in the rosette stage affect the abundance of soil invertebrates?

Stop, Think, Do: Now it is your turn to test the null hypothesis against the alternative hypothesis that the presence of L. latifolium in the rosette stage (circular arrangement of leaves before flowering) affects the abundance of soil invertebrates. Stop and review steps 3-4 you just did. Think about how to manage the “soil” dataset to obtain the data needed to address the research question. Do the analysis following the demonstration code and test your hypothesis. Hint: Create a new data frame by filtering “soil” by stage_lep=="rosette" and follow the steps above using the same null hypothesis.


Questions:

  1. What is the p-value?
  2. Do you reject the null hypothesis? Explain.


Discussion questions

  1. Define the null and the alternative hypothesis.
  2. Why do we need to use a null distribution to test hypotheses?
  3. Define the p-value and how it is used in hypothesis testing.


Great Work!