Today’s investigation. One way to investigate evolution is by studying sexual dimorphism, defined as morphometric differences between males and females of the same population. For instance, sexual selection can favor extravagant traits in one sex with no effect on the other. By comparing traits such as canine teeth, tail length, or fur brightness between males and females, we can make connections between morphometrics and evolutionary adaptations. Today, we will explore how Student’s t-test allows us to examine hypotheses about sexual dimorphism in the chipmunk Tamias minimus using morphometric techniques from the CSULB Mammal lab.


Introduction

In this lab, we will perform a one-sample and a two-sample Student’s t-test to examine differences in mean body length between male and female chipmunks. Many chipmunk species show female-biased sexual dimorphism. It is hypothesized that this dimorphism is adaptive because larger females produce larger litters and heavier offspring. However, chipmunks usually exhibit promiscuous mating systems in which females do not compete for mates. This raises the question of why females are larger than males if sexual selection is not the main driver of the dimorphism.

Today, we will measure the body length of male and female T. minimus to determine whether this species is indeed sexually dimorphic. For this, we will use CSULB specimens and measuring techniques from the CSULB Mammal lab led by Dr. Ted Stankowich. Specifically, we will measure the body length of preserved specimens (Figure 1) and compare the two group means using a t-test.

Figure 1. *Tamias minimus* chipmunk in Montana (left, source: AnimalDiversity.org) and CSULB specimen (right, source: Raisa Hernández Pacheco).



Upon completion of this lab, you should be able to:

  1. State null and alternative hypotheses for one-sample, two-sample, and paired t-tests.
  2. Estimate a t statistic, its p-value, and a 95% confidence interval by hand and in R.
  3. Check the t-test assumptions of normality and equal variances.
  4. Use morphometric data to test for sexual dimorphism in Tamias minimus.


Worked example

In Chapter 5, you learned that the null hypothesis states that there is no difference between population parameters; for example, that two population means are equal. There are many methods to test the null hypothesis. Today, we will learn how to compare two means and draw a conclusion using Student’s t-test.

Let’s get started with model assumptions. The assumptions that should hold for a t-test include:

  1. The data are independent of one another.
  2. The data are randomly selected and represent the population.
  3. The data follow a normal distribution.
  4. The variances of the groups being compared are equal; in practice, we consider this assumption met when the sample standard deviations are approximately equal.


1. Performing a one-sample t-test

The one-sample t-test compares a sample mean (\(\overline{y}\)) to the population mean (\(\mu_0\)) proposed by the null hypothesis. Here, the null and alternative hypotheses are:

\[ \begin{aligned} H_0&: \mu=\mu_{0}\\ H_A&: \mu\neq\mu_{0} \end{aligned} \]

The test statistic, t, for the one-sample t-test is: \[ \begin{aligned} t=\frac{\overline{y}-\mu_{0}}{SE_\overline{y}}, \end{aligned} \]

where \(\overline{y}\) is the sample mean, \(\mu_{0}\) is the population mean given by the null hypothesis, and \(SE_{\overline{y}}\) is the standard error of the sample mean. Here, the t statistic is the ratio of the difference between the sample mean and the population mean to the variation within the sample.

The sampling distribution of t, the t-distribution, resembles the normal distribution (Chapter 7). However, in contrast to the normal distribution, the t-distribution’s shape depends on the number of observations and thus changes with the degrees of freedom (\(df = n-1\)). This follows from the fact that \(SE_{\overline{y}}\) depends on n. Because the t-distribution accounts for the extra uncertainty of estimating the population variance from a sample, it is more spread out than the normal distribution (Figure 2). As sample size increases, the t-distribution converges to the standard normal distribution because the sample variance approaches the true population variance.
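
To see this convergence numerically, we can compare the two-tailed 5% critical values of the t-distribution with the standard normal critical value; a minimal sketch in R:

# t critical values shrink toward the normal value as df grows
qt(0.975, df = c(4, 9, 19, 49, 99))   # approx. 2.78, 2.26, 2.09, 2.01, 1.98

# standard normal critical value for comparison
qnorm(0.975)                          # approx. 1.96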

Figure 2. The normal (gray) and *t* (blue) distribution.



Thus, the p-value can be obtained by comparing the observed t statistic with the Student’s t-distribution expected under the null hypothesis, given the degrees of freedom. Under the null hypothesis, the sample mean equals the population mean, and thus \(\overline{y}-\mu_{0}\) should be close to 0. As you can see from the probability distribution in Figure 2, the closer t is to 0, the more difficult it is to reject the null hypothesis.

Let’s carry out a one-sample t-test using body mass index (BMI) data from hypertensive women.


A. Estimating the t statistic

Say that a study took a sample of twenty hypertensive women to see if their mean BMI was consistent with that expected from a healthy population. The results are in Table 1.


Table 1. Body mass index of hypertensive women
BMI (kg/m^2) BMI (kg/m^2)
28.46 26.97
27.39 27.99
27.53 26.74
27.88 29.00
26.65 25.93
28.67 28.00
26.72 27.25
26.70 27.41
27.08 26.51
28.76 27.61

Let’s use the expected mean BMI of the nonhypertensive female population (\(25\ kg/m^2\)) as our null hypothesis:

\[ \begin{aligned} H_0&: \mu=25\ kg/m^2\\ H_A&: \mu\neq25\ kg/m^2 \end{aligned} \]

From Table 1, we estimate a mean BMI of \(27.46 \pm 0.187\) (\(mean\pm SE\)) for the sample of hypertensive women. Recall from Chapter 4 that SE is the standard error, our estimate of the precision of the sample mean.


We can now estimate the test statistic: \[ \begin{aligned} t&=\frac{27.46-25.0}{0.187}=13.16 \end{aligned} \]
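
We can reproduce this calculation in R; a minimal sketch entering the Table 1 values by hand:

# BMI values from Table 1
bmi <- c(28.46, 27.39, 27.53, 27.88, 26.65, 28.67, 26.72, 26.70, 27.08, 28.76,
         26.97, 27.99, 26.74, 29.00, 25.93, 28.00, 27.25, 27.41, 26.51, 27.61)

# sample mean, standard error, and t statistic
ybar  <- mean(bmi)
se    <- sd(bmi) / sqrt(length(bmi))
tstat <- (ybar - 25) / se
c(mean = ybar, SE = se, t = tstat)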


B. Estimating the p-value

Under the null hypothesis, the sampling distribution of t corresponds to the t-distribution with \(n - 1\) degrees of freedom, or \(df=20-1=19\). Our test is two-sided because we are testing the probability of obtaining results as extreme as, or more extreme than, our test statistic, whether these are smaller or larger than expected under the null hypothesis.

When looking at the critical values of the t-distribution in a statistical table (statsexamples.com), we find that for \(df=19\) and a tail probability of 0.025 (half of \(\alpha=0.05\), because this is a two-tailed test), the critical value is 2.09. Because our t statistic (13.16) is more extreme than this critical value, we reject the null hypothesis (p < 0.05). We conclude that the mean BMI of hypertensive women differs from that of the healthy population.
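
We can obtain both the critical value and an exact p-value in R; a minimal sketch:

# two-tailed critical value for df = 19 and alpha = 0.05
qt(0.975, df = 19)

# two-tailed p-value for the observed t statistic
2 * pt(13.16, df = 19, lower.tail = FALSE)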


C. Estimating 95% confidence intervals

Confidence intervals are another way to describe the precision of our sample mean. They quantify an interval associated with a level of confidence that the true parameter lies in the proposed range. In particular, a 95% CI expresses 95% confidence that the true population parameter falls within the proposed range of values.

For our example, the 95% CI of the true mean BMI is:


\[ \begin{aligned} \overline{Y}-{t_{0.05,19}}SE_\overline{Y}&<\mu<\overline{Y}+{t_{0.05,19}}SE_\overline{Y}\\\\ 27.46-2.093(0.187)&<\mu<27.46+2.093(0.187)\\\\ 27.1&<\mu<27.9 \end{aligned} \]


The bounds of the 95% CI for the mean BMI of hypertensive women are \(27.1\ kg/m^2\) and \(27.9\ kg/m^2\). An interval constructed this way will bracket the population mean in 95% of samples. In this case, our interval is narrow, which suggests that we estimated the mean BMI with high precision.
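
The same interval can be computed in R; a sketch reusing ybar and se from the code above:

# 95% CI for the true mean BMI
ybar + c(-1, 1) * qt(0.975, df = 19) * se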


You can check your work using the R t.test() function.

# one-sample t-test: 'data' is the vector of observations, 'mu0' the null mean
t.test(data, mu = mu0, alternative = "two.sided")
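
For the BMI example, using the bmi vector entered above:

# one-sample t-test against the null mean of 25
t.test(bmi, mu = 25, alternative = "two.sided")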


2. Performing a two-sample t-test

This method is used to examine the difference between the means of two independent sample groups. Here, the null and alternative hypotheses are:

\[ \begin{aligned} H_0&: \mu_1=\mu_2\\ H_A&: \mu_1\neq\mu_2 \end{aligned} \]

The test statistic, t, for the two-sample t-test is: \[ \begin{aligned} t=\frac{\overline{y}_1-\overline{y}_2}{SE_{\overline{y}_1-\overline{y}_2}}, \end{aligned} \] where \(\overline{y}_1\) and \(\overline{y}_2\) are the two group means and \(SE_{\overline{y}_1-\overline{y}_2}\) is the standard error of the difference between the two sample means. Here, the t statistic is the ratio of the difference between the two group means to the variation within the two groups.


The \(SE_{\overline{y}_1-\overline{y}_2}\) is defined as: \[ \begin{aligned} SE_{\overline{y}_1-\overline{y}_2}=\sqrt{s^2_p (\frac{1}{n_1} + \frac{1}{n_2})}, \end{aligned} \] where \(s^2_p\) is the pooled sample variance.


The pooled sample variance \(s^2_p\) is defined as: \[ \begin{aligned} s_p^2=\frac{{df_1s^2_1+df_2s^2_2}}{df_1+df_2}, \end{aligned} \] where \(df_1\) and \(df_2\) are the degrees of freedom of groups 1 and 2, respectively, and \(s^2_1\) and \(s^2_2\) are the variances of groups 1 and 2, respectively.


A. Stating the hypotheses

Let’s say a lab is developing a new drug to treat hypercholesterolemia patients, whose cholesterol levels are > 240 mg/dL. To test any medicine, it is standard to compare the new drug against a placebo (no active drug). Thus, the lab makes two pills: the new drug and a placebo. They gather 20 test subjects with hypercholesterolemia and randomly divide them into two groups of ten subjects; group 1 receives the placebo (Drug A) and group 2 receives the drug treatment (Drug B). Since they are interested in knowing whether this new drug can lower cholesterol, they can use a two-sample t-test to compare the two group means.


Here, the hypotheses are:

\[ \begin{aligned} H_0&: \mu_A=\mu_B\\ H_A&: \mu_A\neq\mu_B \end{aligned} \]

The team found that the group that took drug A had a mean cholesterol of 245.67 mg/dL and a standard deviation of 1.14 mg/dL, while the group that took drug B had a mean cholesterol of 227.18 mg/dL and a standard deviation of 0.81 mg/dL. The data are in Table 2.


Table 2. Cholesterol levels in hypercholesterolemia patients under treatment
Treatment Cholesterol values (mg/dL)
Drug A 247.8, 245.6, 244.9, 243.9, 245.5, 245.9, 246, 245.6, 247, 244.5
Drug B 228.8, 226.6, 227.7, 226.3, 227.4, 226.7, 226.3, 227.2, 228, 226.8


B. Estimating the pooled sample variance

First, we have to determine the weighted average of the two variances of each group. We do this by estimating the pooled sample variance. Recall that the degrees of freedom are \(n-1\). Thus, for group 1 \(df_1=10-1=9\) and for group 2 \(df_2=10-1=9\).

For our example, we define the pooled sample variance as:

\[ \begin{aligned} s_p^2=\frac{{9(1.14^2)+9(0.81^2)}}{9+9}=0.978 \end{aligned} \]

C. Estimating the standard error

Now that we found our pooled sample variance we can find the standard error of the difference between the means:


\[ \begin{aligned} SE_{\overline{Y_1}-\overline{Y_2}}&=\sqrt{0.978(\frac1{10} + \frac1{10})}=0.442 \end{aligned} \]

D. Estimating the two-sample test statistic

\[ \begin{aligned} t=\frac{245.67-227.18}{0.442}=41.8 \end{aligned} \]
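
We can verify the pooled sample variance, standard error, and t statistic in R; a minimal sketch entering the Table 2 values by hand:

# cholesterol values from Table 2
drugA <- c(247.8, 245.6, 244.9, 243.9, 245.5, 245.9, 246, 245.6, 247, 244.5)
drugB <- c(228.8, 226.6, 227.7, 226.3, 227.4, 226.7, 226.3, 227.2, 228, 226.8)

# degrees of freedom, pooled variance, standard error, and t statistic
df1 <- length(drugA) - 1
df2 <- length(drugB) - 1
sp2 <- (df1 * var(drugA) + df2 * var(drugB)) / (df1 + df2)
se  <- sqrt(sp2 * (1 / length(drugA) + 1 / length(drugB)))
tstat <- (mean(drugA) - mean(drugB)) / se
c(sp2 = sp2, SE = se, t = tstat)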


E. Estimating the p-value

When looking at the critical values of the t-distribution in a statistical table (statsexamples.com), we find that for \(df=18\) and a tail probability of 0.025 (half of \(\alpha=0.05\), because this is a two-tailed test), the critical value is 2.10. Because our t statistic (41.8) is more extreme than this critical value, we reject the null hypothesis (p < 0.05). We conclude that the two sample means are different: patients who took drug B had lower cholesterol than patients who took the placebo, so drug B was effective at treating hypercholesterolemia (Figure 3).

Figure 3. Cholesterol levels in hypercholesterolemia patients under treatment. Error bars represent standard error.



We can easily compute this in R using the function t.test().

# two-sample t-test (response variable ~ grouping factor)
t.test(cholesterol ~ treatment, alternative = "two.sided", var.equal = TRUE, data = mydataframe)
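
For the Worked example, a sketch that assembles Table 2 into a data frame (the column names treatment and cholesterol are our own), reusing the drugA and drugB vectors entered above:

# assembling Table 2 into a data frame
mydataframe <- data.frame(
  treatment   = rep(c("DrugA", "DrugB"), each = 10),
  cholesterol = c(drugA, drugB)
)

# two-sample t-test assuming equal variances
t.test(cholesterol ~ treatment, alternative = "two.sided", var.equal = TRUE,
       data = mydataframe)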


3. Paired or unpaired?

The t-test can be paired in contrast to unpaired (two-sample). The paired t-test compares the means of two treatments measured on the *same* individuals. Here, no test for equal variances is required because we are testing the same individuals twice. The paired t-test is straightforward: we first reduce each pair of observations to a single value by calculating the difference between the paired measurements, and then test whether the mean difference equals 0. Let’s demonstrate this using R code.

Say now we are interested in the effect of the new drug on each hypercholesterolemia patient. For this, we will measure cholesterol levels before and after the drug treatment. Here, the same patient is tested twice, and thus we say we have a “pair” of observations per individual.

Here, the hypotheses are:

\[ \begin{aligned} H_0&: \mu_d=0\\ H_A&: \mu_d\neq0, \end{aligned} \]

where \(\mu_d\) is the mean of the within-patient differences in cholesterol (before minus after).

We can easily compute paired tests in R using the function t.test() with the argument paired = TRUE. Here, we pass the two vectors of paired measurements as the first two arguments.

# paired t-test: 'before' and 'after' hold the paired measurements per patient
t.test(before, after, paired = TRUE)
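
A minimal sketch with made-up before and after values for five hypothetical patients (the numbers are illustrative only), showing that the paired test is equivalent to a one-sample test on the differences:

# hypothetical paired cholesterol measurements (illustrative values)
before <- c(248, 246, 252, 244, 249)
after  <- c(230, 229, 233, 226, 231)

# paired t-test
t.test(before, after, paired = TRUE)

# equivalent one-sample t-test on the paired differences
t.test(before - after, mu = 0)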


4. Equal variances

Finally, when we are statistically comparing means from two samples, we need to consider the assumption of equal variances. This determines which kind of t-test we will run:

  1. If the variances are equal, we run the Student’s two-sample t-test described above (var.equal = TRUE in R).
  2. If the variances are unequal, we run Welch’s t-test, which does not pool the variances (var.equal = FALSE, the default in R’s t.test()).

We can easily test for equal variances in R using the function var.test(). The first argument in the function is a formula with the response variable followed by the explanatory variable (Factor).

var.test(ResponseVariable~Factor, data=MYDATA)
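
For the drug example, a sketch reusing the mydataframe built above:

# F test for equal variances between the two treatment groups
var.test(cholesterol ~ treatment, data = mydataframe)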


Info-Box! Let’s interpret the R t.test() output for the one-sample t-test of the BMI data.

# one-sample t-test of the BMI data
t.test(bmi, mu = 25)
## 
##  One Sample t-test
## 
## data:  bmi
## t = 13.185, df = 19, p-value = 5.2e-11
## alternative hypothesis: true mean is not equal to 25
## 95 percent confidence interval:
##  27.07170 27.85357
## sample estimates:
## mean of x 
##  27.46263

The t.test() output provides the t statistic, degrees of freedom, and p-value. It also states the alternative hypothesis, which in this case is that the true mean BMI of hypertensive women is not equal to \(25\ kg/m^2\), together with the 95% CI and the sample mean. Note that any deviation from our Worked example is due to rounding.


Materials and Methods


Today’s activity “Detecting sexual dimorphism using a t-test” will compare body size between males and females in order to test whether there is sexual dimorphism in chipmunks.


Detecting sexual dimorphism using a t-test


Research question 1: Is there a difference in mean body length between male and female Tamias minimus?


1. Analyze Tamias minimus digital images

Review Chapter 2 and analyze the 30 specimen images of T. minimus using ImageJ. Generate a .csv file with your data for analysis.


2. Import the data

Let’s import and explore the data in RStudio.
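
A minimal sketch, assuming your ImageJ measurements are saved as chipmunks.csv with columns sex and length (adjust the file and column names to your own data):

# importing the measurement data
mydata <- read.csv("chipmunks.csv")

# exploring the structure and the first rows
str(mydata)
head(mydata)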

Questions:

  1. State the null and alternative hypotheses.
  2. What t-test is appropriate to answer the research question?


3. Estimate the t statistic

Following the Worked example, analyze the data.


Question:

  1. Is there a significant difference in mean body length between males and females? Explain.


Challenge: Check your work using the t.test() function in R.
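
A sketch of the check, assuming the column names sex and length used in the code below:

# two-sample t-test of body length between the sexes, assuming equal variances
t.test(length ~ sex, alternative = "two.sided", var.equal = TRUE, data = mydata)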


4. Visualize the data

First, let’s learn how to create a table of descriptive statistics using the R package rstatix. For this, we pipe the data into group_by() to group the observations by a categorical variable (e.g., sex), and then into get_summary_stats(), whose first argument is the response variable we want to summarize (e.g., body length).

# installing the package
install.packages("rstatix", repos = "http://cran.us.r-project.org")

# loading packages (dplyr provides group_by() and the %>% pipe)
library(rstatix)
library(dplyr)

# summary statistics of body length by sex
sum <- mydata %>% 
      group_by(sex) %>% 
      get_summary_stats(length)  


Now, let’s use the newly created table of descriptive statistics in ggplot2. We will plot the group means as bars and add error bars for the standard error with the function geom_errorbar(). Note that the summary table stores the group means in the column mean and the standard errors in the column se.

# loading package
library(ggplot2)

# bar plot of mean body length with error bars for standard error
p1 <- ggplot(sum, aes(x = sex, y = mean)) +
    geom_bar(stat = "identity") +
    geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.2) +
    theme_classic(15) +
    ylab("Body length (cm)") +
    xlab("Sex")
p1


5. Check the model assumption of normality

Let’s check the assumption that our data are normally distributed by plotting their frequency distribution. Because the t-test assumes normality within each group, we plot a histogram of body length for each sex.

# histograms of body length, one panel per sex
p2 <- ggplot(mydata, aes(x = length)) +
  geom_histogram(binwidth = 0.5) +
  facet_wrap(~ sex)
p2

Recall that small samples are expected to show deviations from a perfect bell shape even when the underlying population is normally distributed.

Question:

  1. Does the assumption of normality hold?


Discussion questions

  1. Interpret the estimated t statistic.
  2. Interpret the 95% CI from the R output.
  3. If you were interested in testing whether the mean body length of your sample of 15 females differs from the known mean body length of another, completely sampled population, which test would you use? Explain.
Great Work!