Using DNA Subway to Analyze Sequence Relationships

Author(s): Jason Williams (1), Ray A. Enke (2), Oliver Hyman (2), Emily Lescak (3), Sam S Donovan (4), William Tapprich (5), Elizabeth F Ryder (6)

1. DNA Learning Center 2. The Department of Biology, James Madison University; The Center for Genome & Metagenome Studies, James Madison University 3. University of Alaska 4. University of Pittsburgh 5. University of Nebraska-Omaha 6. Worcester Polytechnic Institute

This is a bioinformatics exercise using the DNA Subway Blue Line, a user-friendly pipeline of bioinformatics tools, to analyze a collection of mosquito DNA bar-code sequences.

Licensed under CC Attribution-ShareAlike 4.0 International

Version 2.0 - published on 30 May 2018, doi:10.25334/Q4J111



These are instructions for a 120-minute bioinformatics lab in which students learn how to use the DNA Subway Blue Line. DNA Subway is an online educational bioinformatics platform that bundles research-grade bioinformatics tools, high-performance computing, and databases into workflows with an easy-to-use interface. The Blue Line is a workflow for analyzing DNA bar-coding data to determine the taxonomic identity of an organism and to examine inferred phylogenetic relationships among species. The data set analyzed in this exercise is a collection of mosquito DNA sequences generated by students at James Madison University. Identifying a mosquito’s genus is important because different mosquito genera can carry different pathogens that cause deadly human diseases.

Note: A natural way to divide this material may be to cover everything up to section IV (Phylogenetic Trees) in one class, leaving additional time to address trees in as much depth as desired.


1) The biggest change from the initial submission was to include the analysis of the mosquito data as a way of motivating students’ interest in the exercise. Identifying the genus of a mosquito is critical to knowing which pathogens it may be carrying. We added background and pictures on mosquitoes at the beginning of the exercise, and the mosquito sequences are now available in the DNA Subway database.

2) We revised the text to give students more explanation and guidance on how the data were generated and on the steps required to clean up the data for analysis. We added several figures and clarified figure legends. We also included additional background and examples in the section on phylogenetic trees.

3) We revised the text to make the correspondence between the DNA Subway ‘stops’ and the corresponding instructions clearer. We also made format changes (blue boxes) to help distinguish background information from instructions on how to proceed through the pipeline.

4) We wrote answers to the student questions, and created Instructor and Student versions of the exercise. Answers include a detailed analysis of the mosquito data set.

Cite this work

Researchers should cite this work as follows: