Created 26 Feb 2019

Intro to Command Line Coding Genomics Analysis

Version 1 published February 28, 2019

This resource is a fun, computer-based introduction to command line programming. The activity takes FASTQ next-generation sequencing (NGS) data files and runs a fun program called FASTQE. This program is very similar to the popular FastQC; however, rather than plotting NGS sequence quality, FASTQE outputs emojis signifying the quality of each base call in the file. The activity takes a fundamental yet somewhat boring step in NGS analysis and makes it accessible and fun for students without much experience in the field. It is also designed so that students with little to no command line experience can learn and run a few simple commands. The elated reaction from students when a long string of emojis appears after typing a few commands is really cool! The activity also uses another command line tool, FASTP, for FASTQ file trimming and filtering.
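As a rough illustration of the idea behind emoji-based quality output, here is a minimal Python sketch. It assumes the common Phred+33 encoding of FASTQ quality strings; the score thresholds, the emoji choices, and the function names are hypothetical and are not FASTQE's actual mapping or code.

```python
# Minimal sketch: turn a FASTQ quality string into emojis.
# Assumptions (not taken from FASTQE itself): Phred+33 encoding,
# illustrative thresholds of 30 and 20, and three arbitrary emojis.

def phred_scores(quality_line, offset=33):
    """Decode each quality character into its Phred score."""
    return [ord(ch) - offset for ch in quality_line]

def score_to_emoji(score):
    """Map one Phred score to an emoji (hypothetical thresholds)."""
    if score >= 30:
        return "😄"  # high-confidence base call
    if score >= 20:
        return "😐"  # acceptable base call
    return "😡"      # low-confidence base call

def quality_to_emojis(quality_line):
    """Build the emoji string for one read's quality line."""
    return "".join(score_to_emoji(s) for s in phred_scores(quality_line))

# A made-up four-line FASTQ record: header, sequence, separator, quality.
record = ["@read1", "GATTACA", "+", "II5555#"]
print(quality_to_emojis(record[3]))
```

In this made-up record the quality characters `I`, `5`, and `#` decode to Phred scores 40, 20, and 2, so the read starts with confident base calls and ends with a poor one, which is the kind of pattern that motivates trimming and filtering.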

fastq = fastqe

Incubator Details

Statement from the author (Ray Enke):

I'm looking for feedback on my in-class activity to make sure that it is technically sound and to determine how it can be used in slightly different implementations. I would like to:

  • Review the technical or scientific content of the activity.
  • Identify pedagogical strategies for making it more student-centered, inquiry focused, or promoting active learning.
  • Document and annotate the materials so that they can be more easily used by other teachers.
  • Customize the materials so that they are accessible to a different audience, teaching setting, or course context.
  • Clarify the learning outcomes and work on assessments that address those goals.

Statement from the managing editor (Sabrina Robertson):

The resource takes a complicated topic, assessing the quality of sequencing reads, and makes it very accessible to students through the use of an emoji-based version of FASTQC called FASTQE. Using emojis to introduce students to the concept of sequencing read quality is brilliant and sure to hook students right from the start! The activity is also very simple and introduces students to the command line. It would be an excellent fit for introductory biology courses that want to integrate bioinformatics into their curriculum. Adaptability is the aspect of this resource that needs the most improvement. While the activity is simple, it lacks background on FASTQC and on the biological relevance of sequencing and of assessing and ensuring read quality.

Suggestions for improvement:

  • Brief introduction to Illumina sequencing and its application in research and/or the clinic. There are many short videos (~5 min) available online that could quickly show how Illumina sequencing works and the kinds of projects it is used for.
  • Perhaps the source of the files (female_oral1) could be discussed as a way to give the activity biological context. Is this a microbiome sample? Where does the data come from? Why do students need to assess quality, then filter and trim reads?
  • The resource assumes students have knowledge of the FASTQC program. Without previous instruction or activities using FASTQC, instructors would not yet be able to introduce this as a stand-alone activity. The lightning talk introduces FASTQC a little, but not enough. This activity could be developed so that FASTQE is used first as a tool to help students understand FASTQC more easily. Providing the FASTQC background in the activity itself would be beneficial. Alternatively, referring instructors to existing FASTQC activities may be sufficient.

Incubator start and end times:

This incubator will run from April 22, 2019 to June 3, 2019

QUBES Liaison: Hayley Orndorf

Licensing Information

All NIBLSE Incubators are under Creative Commons licensing. The default license for Incubators is the Attribution-ShareAlike 4.0 International license. This license allows for sharing of adaptations of the work, as long as all adaptations are shared alike. It also allows for commercial use of the work. Learn more at the Creative Commons website.


Authorship Information

During the Learning Resource Incubator process, a small group of faculty work on improving and supporting the use of existing bioinformatics lessons. Prior to the start of the incubator, the author sets out their expectations around participant contributions and authorship, which are listed below.

  • Contributions to the teaching resource:
    • I plan to list each of the incubator personnel who actively participate as contributors.
  • Authorship of the primary teaching resource:
    • I plan to include incubator participants who make a significant intellectual contribution to the primary teaching resource as a co-author.
  • Authorship of derivative or customized materials:
    • I plan to support others if they are interested in customizing these materials for use in other course contexts or with other student audiences.

Below are the Group members who are working on this Incubator: