A Fun Introductory Command Line Exercise: Next Generation Sequencing Quality Analysis with Emoji!
Author(s): Rachael St. Jacques (1), Max Maza (1), Sabrina Robertson (2), Guoqing Lu (3), Andrew Lonsdale (4), Ray A Enke (5)
1. Department of Biology, James Madison University 2. Department of Psychology & Neuroscience, University of North Carolina at Chapel Hill 3. Department of Biology and School of Interdisciplinary Informatics, University of Nebraska Omaha 4. ARC Centre of Excellence in Plant Cell Walls, Melbourne University 5. James Madison University
The activity takes FASTQ NGS data files and runs them through a fun program called FASTQE. This program is very similar to the popular FastQC; however, rather than outputting plots of NGS sequence quality, FASTQE outputs emojis signifying the quality of each base call in the file. The activity takes a fundamental yet somewhat tedious step in NGS analysis and makes it accessible and fun for students without much experience in the field. It is also designed so that students with little to no command line experience can learn and run a few simple commands. The elated reaction from students when a long string of emojis appears after typing a few commands is really cool! The activity also uses another command line tool called FASTP for FASTQ file trimming and filtering.
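To give a feel for the idea behind emoji-based quality output, here is a minimal Python sketch. It is not FASTQE's code: the function names, the three quality bins, and the emoji chosen for each bin are hypothetical, and the quality strings are assumed to use the common Phred+33 encoding. Summarizing by the mean score at each read position is also an assumption made for illustration.

```python
# Illustrative sketch only. The thresholds and emoji are hypothetical
# and are NOT the mapping that FASTQE itself uses.

def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]

def score_to_emoji(score):
    """Map a Phred score to an emoji using made-up thresholds."""
    if score >= 30:
        return "😄"  # high-confidence base call
    if score >= 20:
        return "😐"  # borderline base call
    return "💩"      # low-confidence base call

def mean_quality_emoji(quality_lines, offset=33):
    """Return one emoji per read position, based on the mean score there."""
    decoded = [phred_scores(q, offset) for q in quality_lines]
    longest = max(len(scores) for scores in decoded)
    emoji = []
    for pos in range(longest):
        # Shorter reads simply do not contribute to later positions.
        column = [scores[pos] for scores in decoded if pos < len(scores)]
        emoji.append(score_to_emoji(sum(column) / len(column)))
    return "".join(emoji)

# Hypothetical quality lines (the fourth line of each FASTQ record).
example_qualities = ["IIII5555####", "IIIIIIII5555"]
print(mean_quality_emoji(example_qualities))
```

In this sketch the quality drops toward the end of the reads, which is the kind of pattern that trimming and filtering are meant to address.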
Cite this work
Researchers should cite this work as follows:
- St. Jacques, R., Maza, M., Robertson, S., Lu, G., Lonsdale, A., Enke, R. A. (2019). A Fun Introductory Command Line Exercise: Next Generation Sequencing Quality Analysis with Emoji!. NIBLSE Incubator: Intro to Command Line Coding Genomics Analysis, QUBES Educational Resources. doi:10.25334/Q4P733