RNA-Seq is one of the most popular techniques for evaluating the abundance of a gene's transcripts (a measure that overlaps with, but is not necessarily synonymous with, expression). In this introductory lesson, we examine the expression of the leptin gene in samples from a mouse experimental design evaluating diet (high-fat vs. normal diet) and disease (cancer vs. normal tissue). While in-depth RNA-Seq data analysis is an advanced exercise, we cover basic concepts from three components of this analysis: 1) the provided sample data are quality filtered to remove low-confidence sequencing reads; 2) the filtered reads are aligned to a reference transcriptome; and 3) the number of sequencing reads aligned to the leptin gene is compared between the cancer and normal tissue samples by visualizing the read counts in a genome browser, giving an estimate of the fold-change in gene expression.
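The fold-change comparison in the third component can be sketched in a few lines of Python. This is a minimal illustration and not the lesson's own code: the function name `fold_change`, the pseudocount, and the read counts are all hypothetical, chosen only to make the arithmetic easy to follow. It also assumes the two samples were sequenced to a similar depth; a real analysis would normalize counts for library size first.

```python
import math


def fold_change(treatment_count, control_count, pseudocount=1.0):
    """Return (fold change, log2 fold change) for two read counts.

    The pseudocount is an assumption of this sketch: it avoids dividing
    by zero when a gene has no reads in the control sample.
    """
    ratio = (treatment_count + pseudocount) / (control_count + pseudocount)
    return ratio, math.log2(ratio)


# Hypothetical read counts for one gene (made-up numbers, not lesson data).
cancer_reads = 399
normal_reads = 99

fc, log2_fc = fold_change(cancer_reads, normal_reads)
print(f"fold change: {fc:.1f}, log2 fold change: {log2_fc:.1f}")
# prints "fold change: 4.0, log2 fold change: 2.0"
```

Reporting the log2 value alongside the raw ratio is a common convention because it treats increases and decreases symmetrically: a four-fold increase is +2 and a four-fold decrease is -2.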
This intermediate to advanced lesson is embedded in Jupyter notebooks, an approach that also introduces basic command-line computing. Jupyter notebooks intersperse explanations with the command-line commands they describe, making them an excellent learning tool. While command-line computing experience is not required, any prior experience, or the help of an instructor or TA familiar with Linux, will be helpful. Formative assessments are embedded in the notebook and on a companion lesson page (along with answers, slides, and other resources). A summative assessment captures student understanding of the emphasized concepts.