M. Drew LaMar
October 26, 2016
In agent-based models, we modeled the
of many
and measured the resulting
In population models, we model the
of
and measure the resulting
Definition: Evolution is the process of change in the gene pool.
Question: What are the major forces of evolution?
Answer: There are four major forces:
- Natural selection
- Drift
- Gene flow
- Mutation
Our goal is to develop models of individual forces and explore them to develop expectations of evolutionary behavior.
Let's follow the ODD protocol, even though it was designed for agent-based models.
Purpose: To understand the effects of natural selection in the absence of all other evolutionary forces.
Definition: A locus is a chromosomal or genomic location of a gene.
Definition: An allele is an alternate form of a gene at a given locus.
Definition: Gene frequency is the relative proportion of an allele.
Definition: The gene pool is the set of all alleles in a population at all loci.
Discuss: What assumptions do we need to make to remove the forces of drift, gene flow, and mutation?
Answer:
- Infinite population (Removes drift)
- Closed system (Removes gene flow)
- Fixed number of alleles (Removes mutation)
Discuss: In terms of the entities (loci and alleles), what simplifications would be helpful at this stage?
Answer: First focus on one locus and two alleles.
Discuss: Before adding selection, do we have a model of evolution to start with?
Answer: Yes! We can start from Hardy-Weinberg Equilibrium.
Model: Hardy-Weinberg Equilibrium states that allele and genotype frequencies in a population will remain constant from generation to generation in the absence of other evolutionary influences.
Suppose our population is diploid.
Let \( A_{1} \) and \( A_{2} \) denote the two alleles at our focus locus.
Let \( p \) and \( q \) denote the frequencies of \( A_{1} \) and \( A_{2} \), respectively (note: \( p + q = 1 \)).
Discuss: What is our unit for time? Is it discrete or continuous?
Answer: Time is measured in generations, which is discrete.
Under the assumption of random mating, what are the genotype frequencies for \( A_{1}A_{1} \), \( A_{1}A_{2} \), and \( A_{2}A_{2} \)?
\( A_{1}A_{1} \) | \( A_{1}A_{2} \) | \( A_{2}A_{2} \) |
---|---|---|
\( p^2 \) | \( 2pq \) | \( q^2 \) |
This follows since the number of \( A_{1} \) alleles in a genotype is a binomial random variable.
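For a variable to be a binomial random variable, ALL of the following conditions must be met:
- There is a fixed number of trials \( N \).
- Each trial has exactly two possible outcomes (success or failure).
- The trials are independent of one another.
- The probability of success \( p \) is the same in every trial.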
The probability of having \( k \) successes in \( N \) trials is
\[ \textrm{Prob[k successes in N trials]} = \left(\begin{array}{c}N \\ k\end{array}\right)p^{k}q^{N-k} \]
Genotype probability | Number of \( A_{1} \) alleles | Binomial formula with \( N = 2 \) |
---|---|---|
Prob[\( A_{1}A_{1} \)] | Prob[k = 2] | \( \left(\begin{array}{c}2 \\ 2\end{array}\right)p^{2}q^{2-2} = p^2 \) |
Prob[\( A_{1}A_{2} \)] | Prob[k = 1] | \( \left(\begin{array}{c}2 \\ 1\end{array}\right)p^{1}q^{2-1} = 2pq \) |
Prob[\( A_{2}A_{2} \)] | Prob[k = 0] | \( \left(\begin{array}{c}2 \\ 0\end{array}\right)p^{0}q^{2-0} = q^2 \) |
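As a quick numerical check of the binomial argument, the following Python sketch evaluates the formula with \( N = 2 \) trials (the two allele copies in a diploid genotype). The function name `genotype_frequencies` and the example value \( p = 0.3 \) are illustrative assumptions, not part of the original derivation.

```python
from math import comb

def genotype_frequencies(p):
    """Return Hardy-Weinberg frequencies (A1A1, A1A2, A2A2) given the frequency p of A1."""
    q = 1.0 - p
    n = 2  # two allele copies per diploid genotype
    # k counts the A1 alleles in the genotype: k = 2, 1, 0
    return tuple(comb(n, k) * p**k * q**(n - k) for k in (2, 1, 0))

# Hypothetical example value for p
freqs = genotype_frequencies(0.3)
print(freqs)       # approximately p^2, 2pq, q^2
print(sum(freqs))  # the three frequencies should sum to 1 (up to rounding)
```

Because the three genotypes exhaust all outcomes, the frequencies should sum to one for any \( p \) between 0 and 1.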
Question: How do allele frequencies change over generations?
\[ p_{0} \rightarrow p_{1} \rightarrow p_{2} \rightarrow \cdots \]
\[ p_{t+1} = F(p_{t}), t \geq 0 \]
Note: Subscript corresponds to generation number.
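Under Hardy-Weinberg Equilibrium, the allele frequency stays at its initial value, so \( F \) is the identity map:
\[ p_{0} \rightarrow p_{0} \rightarrow p_{0} \rightarrow \cdots \]
\[ p_{t+1} = p_{t} \]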
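The update rule \( p_{t+1} = F(p_{t}) \) can be explored numerically by applying \( F \) repeatedly. The sketch below is a minimal illustration; `iterate_map` and `hardy_weinberg` are hypothetical helper names, and the Hardy-Weinberg case is taken to be the identity map \( F(p) = p \).

```python
def iterate_map(F, p0, generations):
    """Return [p_0, p_1, ..., p_generations], where p_{t+1} = F(p_t)."""
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(F(trajectory[-1]))
    return trajectory

def hardy_weinberg(p):
    """With no evolutionary forces acting, the allele frequency is unchanged."""
    return p

# Hypothetical starting frequency, followed for 5 generations
trajectory = iterate_map(hardy_weinberg, 0.3, 5)
print(trajectory)  # every entry equals the initial frequency 0.3
```

Any other choice of \( F \), such as a map that includes selection, can be passed to the same loop without changing it.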
How to construct \( F \)? To introduce selection, we need to model fitness, which for a diploid population occurs on the genotype and not on the allele. We will model relative fitness - i.e., we only care about the fitness of each genotype relative to the other genotypes.
Genotype | \( A_{1}A_{1} \) | \( A_{1}A_{2} \) | \( A_{2}A_{2} \) |
---|---|---|---|
Fitness | \( w_{11} \) | \( w_{12} \) | \( w_{22} \) |
Frequency | \( p^2 \) | \( 2pq \) | \( q^2 \) |
Note: Rather than track the frequency \( p \) of allele \( A_{1} \), we will track the frequency \( q \) of allele \( A_{2} \), because we interpret this model as describing the fate of a new mutation, where \( A_{2} \) is the new mutant allele. This changes only the perspective, not the resulting biological interpretation.
For model derivation purposes, it helps to imagine a finite population of \( N_{t} \) individuals.
Genotype | \( A_{1}A_{1} \) | \( A_{1}A_{2} \) | \( A_{2}A_{2} \) |
---|---|---|---|
Fitness | \( w_{11} \) | \( w_{12} \) | \( w_{22} \) |
Frequency\( _{t} \) | \( p_{t}^2 \) | \( 2p_{t}q_{t} \) | \( q_{t}^2 \) |
Count\( _{t} \) | \( p_{t}^2N_{t} \) | \( 2p_{t}q_{t}N_{t} \) | \( q_{t}^2N_{t} \) |
Discuss: We have an implicit assumption here that we haven’t mentioned yet. Can you spot it?
Answer: Fitness is independent of time; that is, the fitness values \( w_{ij} \) are constant parameters.
Discuss: Thinking of the weights \( w_{ij} \) as multiplicative growth factors, what would Count\( _{t+1} \) be?
Genotype | \( A_{1}A_{1} \) | \( A_{1}A_{2} \) | \( A_{2}A_{2} \) |
---|---|---|---|
Fitness | \( w_{11} \) | \( w_{12} \) | \( w_{22} \) |
Frequency\( _{t} \) | \( p_{t}^2 \) | \( 2p_{t}q_{t} \) | \( q_{t}^2 \) |
Count\( _{t} \) | \( p_{t}^2N_{t} \) | \( 2p_{t}q_{t}N_{t} \) | \( q_{t}^2N_{t} \) |
Count\( _{t+1} \) | \( p_{t}^2N_{t}w_{11} \) | \( 2p_{t}q_{t}N_{t}w_{12} \) | \( q_{t}^2N_{t}w_{22} \) |
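To make the bookkeeping in the table concrete, here is a small Python sketch that computes the genotype counts before and after selection. The function name `genotype_counts` and all numerical values (\( q_{t} = 0.1 \), \( N_{t} = 1000 \), and the three fitness values) are hypothetical, chosen only for illustration.

```python
def genotype_counts(q, n_individuals, w11, w12, w22):
    """Return (Count_t, Count_{t+1}) as tuples ordered (A1A1, A1A2, A2A2)."""
    p = 1.0 - q
    # Count_t: Hardy-Weinberg frequencies times the population size N_t
    before = (p**2 * n_individuals,
              2 * p * q * n_individuals,
              q**2 * n_individuals)
    # Count_{t+1}: each genotype count is scaled by its fitness
    after = (before[0] * w11, before[1] * w12, before[2] * w22)
    return before, after

# Hypothetical parameter values
before, after = genotype_counts(q=0.1, n_individuals=1000, w11=1.0, w12=0.9, w22=0.8)
print(before)
print(after)
```

Note that the counts in generation \( t+1 \) need not sum to \( N_{t} \), since the fitness values act as growth factors on each genotype.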
Discuss: How many \( A_{2} \) alleles are there in the \( t+1 \) generation? How many total alleles are there in the \( t+1 \) generation?
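Answer: \[ \begin{aligned} N_{2,t+1} & = 2p_{t}q_{t}N_{t}w_{12} + 2q_{t}^2N_{t}w_{22} \\ N_{t+1} & = 2\bigl(p_{t}^2N_{t}w_{11} + 2p_{t}q_{t}N_{t}w_{12} + q_{t}^2N_{t}w_{22}\bigr) \end{aligned} \]
Here \( N_{2,t+1} \) is the number of \( A_{2} \) alleles and \( N_{t+1} \) is the total number of alleles in generation \( t+1 \). Each heterozygote carries one \( A_{2} \) allele, each \( A_{2}A_{2} \) homozygote carries two, and every individual carries two alleles in total.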
Discuss: So what is the frequency of allele \( A_{2} \) in generation \( t+1 \)?
Answer: \[ \begin{aligned} q_{t+1} & = \frac{N_{2,t+1}}{N_{t+1}} \\ & = \frac{2p_{t}q_{t}N_{t}w_{12} + 2q_{t}^2N_{t}w_{22}}{2\bigl(p_{t}^2N_{t}w_{11} + 2p_{t}q_{t}N_{t}w_{12} + q_{t}^2N_{t}w_{22}\bigr)} \\ & = \frac{p_{t}q_{t}w_{12} + q_{t}^2w_{22}}{p_{t}^2w_{11} + 2p_{t}q_{t}w_{12} + q_{t}^2w_{22}} \end{aligned} \]
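The final recurrence for \( q_{t+1} \) translates directly into code. The following Python sketch iterates it over generations; the function names `next_q` and `simulate`, the starting frequencies, and the fitness values are all hypothetical choices for illustration, not values from the lecture.

```python
def next_q(q, w11, w12, w22):
    """One generation of selection: return q_{t+1} from q_t and the genotype fitnesses."""
    p = 1.0 - q
    # Denominator of the recurrence (often called the mean fitness of the population)
    mean_fitness = p**2 * w11 + 2 * p * q * w12 + q**2 * w22
    return (p * q * w12 + q**2 * w22) / mean_fitness

def simulate(q0, w11, w12, w22, generations):
    """Return [q_0, q_1, ..., q_generations] under constant fitness values."""
    qs = [q0]
    for _ in range(generations):
        qs.append(next_q(qs[-1], w11, w12, w22))
    return qs

# Hypothetical scenario: a rare mutant allele A2 with a fitness advantage
qs = simulate(q0=0.01, w11=1.0, w12=1.05, w22=1.1, generations=200)
print(qs[0], qs[-1])

# Sanity check: with equal fitnesses the model reduces to Hardy-Weinberg (q stays constant)
neutral = simulate(q0=0.3, w11=1.0, w12=1.0, w22=1.0, generations=10)
print(neutral[-1])
```

The neutral case is a useful check on the derivation: when \( w_{11} = w_{12} = w_{22} \), the numerator is \( q_{t}(p_{t} + q_{t})w \) and the denominator is \( (p_{t} + q_{t})^2 w \), so \( q_{t+1} = q_{t} \), which recovers Hardy-Weinberg Equilibrium.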