Population Genetics

M. Drew LaMar
October 26, 2016

From Agent-Based Models...

In agent-based models, we modeled the

  • behavior (processes in the form of rules)

of many

  • individual agents (objects in the form of agents and patches),

and measured the resulting

  • emergent system properties (output).

...to Population Models

In population models, we model the

  • changes (processes in the form of equations)

of

  • populations (objects in the form of state variables)

and measure the resulting

  • system behavior (output).

Evolution

Definition: Evolution is the process of change in the gene pool.

Population Genetics and Evolution

Question: What are the major forces of evolution?

Answer: There are four major forces:

  • Natural selection
  • Drift
  • Gene flow
  • Mutation

Population Genetics and Evolution

Our goal is to develop models of individual forces and explore them to develop expectations of evolutionary behavior.

Let's follow the ODD protocol, even though it was designed for agent-based models.

Purpose: To understand the effects of natural selection in the absence of all other evolutionary forces.

Natural Selection - ODD

Entities, State Variables and Scales

Entities

Definition: A locus is a chromosomal or genomic location of a gene.

Definition: An allele is an alternate form of a gene at a given locus.

Definition: Gene frequency is the relative proportion of an allele.

Definition: The gene pool is the set of all alleles in a population at all loci.

Entities, State Variables and Scales

Assumptions and Simplifications

Discuss: What assumptions do we need to make to remove the forces of drift, gene flow, and mutation?

Answer:

  • Infinite population (Removes drift)
  • Closed system (Removes gene flow)
  • Fixed number of alleles (Removes mutation)

Entities, State Variables and Scales

Assumptions and Simplifications

Discuss: In terms of the entities (loci and alleles), what simplifications would be helpful at this stage?

Answer: First focus on one loci and two alleles.

Discuss: Before adding selection, do we have a model of evolution to start with?

Answer: Yes! We can start from Hardy-Weinberg Equilibrium.

Entities, State Variables and Scales

Assumptions and Simplifications

Model: Hardy-Weinberg Equilibrium states that allele and genotype frequencies in a population will remain constant from generation to generation in the absence of other evolutionary influences.

State variables

Suppose our population is diploid.

Let \( A_{1} \) and \( A_{2} \) denote the two alleles at our focus locus.

Let \( p \) and \( q \) denote the proportion of \( A_{1} \) and \( A_{2} \), respectively (Note: \( p + q = 1 \))

Entities, State Variables and Scales

Assumptions and Simplifications

State variables - Time

Discuss: What is our unit for time? Is it discrete or continuous?

Answer: Time is measured in generations, which is discrete.

Entities, State Variables and Scales

Under the assumption of random mating, what are the genotype frequencies for \( A_{1}A_{1} \), \( A_{1}A_{2} \), and \( A_{2}A_{2} \)?

\( A_{1}A_{1} \) \( A_{1}A_{2} \) \( A_{2}A_{2} \)
\( p^2 \) \( 2pq \) \( q^2 \)

This follows since the number of \( A_{1} \) alleles in a genotype is a binomial random variable.

Entities, State Variables and Scales

For a variable to be a binomial random variable, ALL of the following conditions must be met:

  • There are a fixed number of trials \( N \) (N=2 chromosomes)
  • On each trial, there are two possible outcomes - “success” or “failure” (\( A_{1} \) or \( \ A_{2} \))
  • The probability of each outcome is the same on each trial (\( p \) and \( q=1-p \))
  • Trials are independent of one another (random mating)

The probability of having \( k \) successes in \( N \) trials is

\[ \textrm{Prob[k successes in N trials]} = \left(\begin{array}{c}N \\ k\end{array}\right)p^{k}q^{N-k} \]

Entities, State Variables and Scales

 
Prob[\( A_{1}A_{1} \)] Prob[k = 2] \( \left(\begin{array}{c}2 \\ 2\end{array}\right)p^{2}q^{2-2} = p^2 \)
Prob[\( A_{1}A_{2} \)] Prob[k = 1] \( \left(\begin{array}{c}2 \\ 1\end{array}\right)p^{1}q^{2-1} = 2pq \)
Prob[\( A_{2}A_{2} \)] Prob[k = 0] \( \left(\begin{array}{c}2 \\ 0\end{array}\right)p^{0}q^{2-0} = q^2 \)

Question: How do allele frequencies change over generations?

\[ p_{0} \rightarrow p_{1} \rightarrow p_{2} \rightarrow \cdots \]

\[ p_{t+1} = F(p_{t}), t \geq 0 \]

Note: Subscript corresponds to generation number.

Entities, State Variables and Scales

Hardy-Weinberg

\[ p_{0} \rightarrow p_{0} \rightarrow p_{0} \rightarrow \cdots \]

\[ p_{t+1} = p_{t} \]

How to construct \( F \)? To introduce selection, we need to model fitness, which for a diploid population occurs on the genotype and not on the allele. We will model relative fitness - i.e., we only care about the fitness of each genotype relative to the other genotypes.

Entities, State Variables and Scales

Selection

Genotype \( A_{1}A_{1} \) \( A_{1}A_{2} \) \( A_{2}A_{2} \)
Fitness \( w_{11} \) \( w_{12} \) \( w_{22} \)
Frequency \( p^2 \) \( 2pq \) \( q^2 \)

Note: Rather than model the change of allele \( p \), we will model the change of allele \( q \), as we are interpreting this model in the case of mutation, where \( A_{2} \) is the new mutant allele. This just changes perspective, not the resulting biological interpretation.

Entities, State Variables and Scales

Selection

For model derivation purposes, it helps to imagine a finite population of individuals \( N_{t} \).

Genotype \( A_{1}A_{1} \) \( A_{1}A_{2} \) \( A_{2}A_{2} \)
Fitness \( w_{11} \) \( w_{12} \) \( w_{22} \)
Frequency\( _{t} \) \( p_{t}^2 \) \( 2p_{t}q_{t} \) \( q_{t}^2 \)
Count\( _{t} \) \( p_{t}^2N_{t} \) \( 2p_{t}q_{t}N_{t} \) \( q_{t}^2N_{t} \)

Discuss: We have an implicit assumption here that we haven’t mentioned yet. Can you spot it?

Answer: Fitness is independent of time (parameters).

Entities, State Variables and Scales

Selection

Genotype \( A_{1}A_{1} \) \( A_{1}A_{2} \) \( A_{2}A_{2} \)
Fitness \( w_{11} \) \( w_{12} \) \( w_{22} \)
Frequency\( _{t} \) \( p_{t}^2 \) \( 2p_{t}q_{t} \) \( q_{t}^2 \)
Count\( _{t} \) \( p_{t}^2N_{t} \) \( 2p_{t}q_{t}N_{t} \) \( q_{t}^2N_{t} \)

Discuss: Thinking of the weights \( w_{ij} \) as multiplicative growth factors, what would Count\( _{t+1} \) be?

Entities, State Variables and Scales

Selection

Genotype \( A_{1}A_{1} \) \( A_{1}A_{2} \) \( A_{2}A_{2} \)
Fitness \( w_{11} \) \( w_{12} \) \( w_{22} \)
Frequency\( _{t} \) \( p_{t}^2 \) \( 2p_{t}q_{t} \) \( q_{t}^2 \)
Count\( _{t} \) \( p_{t}^2N_{t} \) \( 2p_{t}q_{t}N_{t} \) \( q_{t}^2N_{t} \)
Count\( _{t+1} \) \( p_{t}^2N_{t}w_{11} \) \( 2p_{t}q_{t}N_{t}w_{12} \) \( q_{t}^2N_{t}w_{22} \)

Discuss: How many \( A_{2} \) alleles are there in the \( t+1 \) generation? How many total alleles are there in the \( t+1 \) generation?

Entities, State Variables and Scales

Selection

Genotype \( A_{1}A_{1} \) \( A_{1}A_{2} \) \( A_{2}A_{2} \)
Fitness \( w_{11} \) \( w_{12} \) \( w_{22} \)
Frequency\( _{t} \) \( p_{t}^2 \) \( 2p_{t}q_{t} \) \( q_{t}^2 \)
Count\( _{t} \) \( p_{t}^2N_{t} \) \( 2p_{t}q_{t}N_{t} \) \( q_{t}^2N_{t} \)
Count\( _{t+1} \) \( p_{t}^2N_{t}w_{11} \) \( 2p_{t}q_{t}N_{t}w_{12} \) \( q_{t}^2N_{t}w_{22} \)
Answer: \[ \begin{align} N_{2,t+1} & = 2p_{t}q_{t}N_{t}w_{12} + 2q_{t}^2N_{t}w_{22} \\ N_{t+1} & = 2\bigl(p_{t}^2N_{t}w_{11} + 2p_{t}q_{t}N_{t}w_{12} + q_{t}^2N_{t}w_{22}\bigr) \end{align} \]

Discuss: So what is the frequency of allele \( A_{2} \) in generation \( t+1 \)?

Entities, State Variables and Scales

\[ \begin{align} N_{2,t+1} & = 2p_{t}q_{t}N_{t}w_{12} + 2q_{t}^2N_{t}w_{22} \\ N_{t+1} & = 2\bigl(p_{t}^2N_{t}w_{11} + 2p_{t}q_{t}N_{t}w_{12} + q_{t}^2N_{t}w_{22}\bigr) \end{align} \]

Answer: \[ \begin{align} q_{t+1} & = \frac{N_{2,t+1}}{N_{t+1}} \\ & = \frac{2p_{t}q_{t}N_{t}w_{12} + 2q_{t}^2N_{t}w_{22}}{2\bigl(p_{t}^2N_{t}w_{11} + 2p_{t}q_{t}N_{t}w_{12} + q_{t}^2N_{t}w_{22}\bigr)} \\ & = \frac{p_{t}q_{t}w_{12} + q_{t}^2w_{22}}{p_{t}^2w_{11} + 2p_{t}q_{t}w_{12} + q_{t}^2w_{22}} \end{align} \]

Entities, State Variables and Scales

Replace \( p_{t} = 1-q_{t} \) to arrive at

\[ q_{t+1} = \frac{(1-q_{t})q_{t}w_{12} + q_{t}^2w_{22}}{(1-q_{t})^2w_{11} + 2(1-q_{t})q_{t}w_{12} + q_{t}^2w_{22}}. \]