Bioinformatics Learning Resources for Genetics Courses

For each resource, we indicate associated GSA topics*, bioinformatics tools, uses and constraints. Unless otherwise noted, background knowledge and skills typically taught in introductory biology are assumed; additional prerequisites are noted within the constraints. (*Genetics Society of America Genetics Learning Framework 2015 Core Categories)

Introductory Bioinformatics

Using Bioinformatics to Understand Genetic Diseases (click to download)

This Practical Guide outlines a number of basic bioinformatics approaches that can be used to understand the molecular basis of genetic diseases. It discusses a rare variation in the insulin gene and explores how the variation affects the gene product and how this results in disease.

  • GSA Topics: Molecular Biology of Gene Function, Genetic Variation
  • Bioinformatics Tools: BLAST, multiple sequence alignment, automated sequence translation, 3D protein structure, phylogenetic trees
  • Why Use This? Introduction to multiple bioinformatics tools in the context of a human disease (diabetes).  Each set of exercises takes only a few minutes and illustrates one or two tools.  Students could try each tool on their own ‘pet gene’ if desired.
  • Constraints: Answers not included with exercises, but easy to obtain by following instructions. Note: First link in first exercise is broken; instead, go to http://genome.ucsc.edu/ and click on ‘BLAT’.
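As an illustration of what automated sequence translation does, the following minimal Python sketch translates a coding sequence with the standard genetic code and reports where a variant changes the protein. It is not taken from the Practical Guide: the function names `translate` and `compare_proteins` and the short toy sequences are assumptions made for this example, not the actual insulin variant discussed in the resource.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence (reading frame 1) until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


def compare_proteins(reference_dna, variant_dna):
    """List (position, reference residue, variant residue) for each difference."""
    ref, var = translate(reference_dna), translate(variant_dna)
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, var)) if a != b]


# Hypothetical toy sequences: a single T-to-C change in the third codon.
reference = "ATGGCCCTGTGG"
variant = "ATGGCCCCGTGG"
print(translate(reference), translate(variant))
print(compare_proteins(reference, variant))
```

Students could substitute sequences from their own 'pet gene' to see whether a given nucleotide change is silent or alters an amino acid.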

Drosophila Behavioral Genetics (click to download)

This exercise uses both inquiry-based and active-learning approaches to introduce students to modern genetic and genomic analysis. Students first quantify behavioral interactions associated with mating in wild-type fruit flies. They then connect these phenotypic ontologies to individual candidate genes using curated data from Drosophila’s model organism database, FlyBase. Students explore known characteristics of their chosen candidate genes, including models of gene structure, genomic context, and known functional attributes such as patterns of spatial and temporal gene expression.

  • GSA Topics: Gene Expression and Regulation, Genetics of Model Organisms
  • Bioinformatics Tools: Biological databases, genome browsers
  • Why Use This? Accessible introduction to biological databases and genome browsers; highlights the power of genetic model organisms to study complex biological phenomena.
  • Constraints: Requires two class meetings or a two-hour lab period, computers/internet access

Intermediate Bioinformatics

Sequence Similarity: Exploring Ebola Virus (click to download)

Introductory bioinformatics exercises often walk students through the use of computational tools but provide little understanding of what a computational tool does "under the hood." A solid understanding of how a bioinformatics algorithm functions, including its limitations, is key for interpreting the output in a biologically relevant context. This bioinformatics exercise integrates an introduction to web-based sequence alignment algorithms with models that help students reflect on and appreciate how computational tools produce similarity output data. This streamlined adaptation provides the bioinformatics concepts and tools that enable students to explore phylogenetic relationships among the known Ebola virus strains and to test the hypothesis that the recent Ebola outbreak in the DRC is caused by a known Ebola strain rather than a new strain.

  • GSA Topics: Evolution & Population Genetics
  • Bioinformatics Tools: BLAST, multiple sequence alignment, phylogenetic tree building
  • Why Use This? Introduction to the concept of sequence similarity, bioinformatics tools to examine it, and ‘under-the-hood’ understanding of algorithms involved; ends with a hypothesis-based small project.
  • Constraints: Requires two 3-hour lab periods, paper & pencil, computers/internet access and some background knowledge in probability; may be too in-depth for first-time exposure to bioinformatics.
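To give a flavor of what an alignment tool computes "under the hood," here is a minimal Python sketch of a global alignment score using dynamic programming (the Needleman-Wunsch approach). This is a simplified stand-in chosen for illustration: BLAST itself uses faster heuristic local alignment, and the exercise's own models may differ. The function name `global_alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions for this example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score for sequences a and b.

    Only the previous row of the dynamic-programming table is kept,
    since each cell depends on its upper, left, and diagonal neighbors.
    """
    # Row 0: aligning an empty prefix of a against b costs one gap per base.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


# Toy sequences (hypothetical, not Ebola virus data).
print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "ACT"))
```

Changing the gap or mismatch penalty and re-running shows how the scoring scheme, not just the sequences, shapes the reported similarity.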

Advanced Bioinformatics

RNA-seq with Galaxy (click to download)

This resource involves downloading RNA-seq data sets from the NCBI Sequence Read Archive (SRA) and using Galaxy tools to identify differentially expressed genes. Different data sets and experimental questions can be explored. Students are introduced to all major computational steps in RNA-seq data analysis, including the concept of computational pipelines/workflows.

  • GSA Topics: Gene Expression & Regulation
  • Bioinformatics Tools: Galaxy, FastQC, Trimmomatic, Bowtie/HISAT2, DESeq2
  • Why Use This? Sophisticated command-line tools for transcriptome analysis are made accessible through the Galaxy user interface. Instructors can choose one of several pre-selected data sets or supply their own. The activity also introduces students to NCBI databases (SRA), the importance of data quality control, and the use of workflows (pipelines).
  • Constraints: Requires three 3-hour lab periods; some processes (e.g., read mapping) may take hours to complete depending on genome complexity