A Fun Introductory Command Line Lesson: Next Generation Sequencing Quality Analysis with Emoji!
Author(s): Rachael M. St. Jacques †1, William M. Maza †1, Sabrina D. Robertson2, Andrew Lonsdale3, Caylin S. Murray1, Jason J. Williams4, Ray A. Enke*1
1. James Madison University 2. University of North Carolina Chapel Hill 3. University of Melbourne 4. Cold Spring Harbor Laboratory, DNA Learning Center
Radical innovations in DNA sequencing technology over the past decade have created an increased need for computational bioinformatics analyses in the 21st century STEM workforce. Recent evidence, however, demonstrates that there are significant barriers to teaching these skills at the undergraduate level, including lack of faculty training, lack of student interest in bioinformatics, lack of vetted teaching materials, and overly full curricula. To this end, the James Madison University Center for Genome & Metagenome Studies (JMU CGEMS) and other primarily undergraduate institution (PUI) collaborators are devoted to developing and disseminating engaging bioinformatics teaching materials specifically designed for streamlined integration into general undergraduate biology curricula. Here, we have developed a fun introductory-level lesson on command line next generation sequencing (NGS) analysis and integrated it into a large enrollment core biology course. This one-off activity takes a crucial but mundane aspect of NGS quality control (QC) analysis and uses the software FASTQE, which reports its results as Emoji, to pique student interest. This amusing command line analysis is subsequently paired with a more rigorous research-grade software package called FASTP, in which students complete sequence QC and filtering using a few simple commands. Collectively, this short lesson provides novice-level faculty and students with an engaging entry point to learning basic genomics command line programming skills as a gateway to more complex and elaborate applications of computational bioinformatics analyses.
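To make the idea of Emoji quality output concrete, the following is a minimal Python sketch of how per-position quality scores in a FASTQ file could be summarized as Emoji. It is not FASTQE's implementation: the three score bins, the Emoji chosen for them, the function names, and the two example reads are all assumptions made for illustration. The only detail taken from the FASTQ format itself is the common Phred+33 encoding of the quality line.

```python
from statistics import mean

# Hypothetical score-to-emoji bins, for illustration only.
# FASTQE uses its own, finer-grained mapping.
EMOJI_BINS = [(30, "😃"), (20, "😐"), (0, "💩")]


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string (Phred+33 by default) into integer scores."""
    return [ord(ch) - offset for ch in quality_line]


def score_to_emoji(score):
    """Return the emoji of the first bin whose threshold the score reaches."""
    for threshold, emoji in EMOJI_BINS:
        if score >= threshold:
            return emoji
    return EMOJI_BINS[-1][1]


def read_fastq(lines):
    """Yield (header, sequence, quality) tuples from 4-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i], lines[i + 1], lines[i + 3]


def mean_quality_emoji(records):
    """Summarize the mean quality at each read position as one emoji."""
    per_position = {}
    for _, _, quality in records:
        for pos, score in enumerate(phred_scores(quality)):
            per_position.setdefault(pos, []).append(score)
    return "".join(
        score_to_emoji(mean(per_position[pos])) for pos in sorted(per_position)
    )


# Two made-up reads whose quality drops toward the 3' end.
EXAMPLE = """@read1
ACGT
+
II5#
@read2
ACGT
+
II?#
"""

summary = mean_quality_emoji(read_fastq(EXAMPLE.splitlines()))
print(summary)
```

Run on the two example reads, the sketch prints one Emoji per base position, so a decline in quality along the read is visible at a glance. This is the intuition the lesson builds on before students move to research-grade QC and filtering.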
Primary image: Undergraduate students learn the basics of command line NGS quality analysis using the FASTQE and FASTP programs.
- A Fun Introductory Command Line Lesson: Next Generation Sequencing Quality Analysis with Emoji! (PDF | 2 MB)
- S1.FASTQE-Pre-Class Assignment.pdf (PDF | 4 MB)
- S2-4.FASTQE-fastq files.zip (ZIP | 8 MB)
- S5. FASTQE-Lecture slides.pptx (PPTX | 12 MB)
- S6. FASTQE-Jupyter notebook alternative implementation instructions.docx (DOCX | 1 MB)
- S7. FASTQE-instructor version of lesson.pdf (PDF | 1 MB)
- S8. FASTQE-student version.pdf (PDF | 888 KB)