Tags: C8. Data types

Description

C8. Describe and manage biological data types, structure, and reproducibility. This competency addresses two distinct concerns: 1) each of the varied ‘omics fields produces data in formats particular to its needs, and these formats evolve with changes in technologies and refinements in downstream software; and 2) all experimental data is subject to error, and the user must be cognizant of the need to verify the reproducibility of their data. Students need to develop an awareness of different data types, and the ability to manipulate them, given the versioning of formats. They also need to exercise caution, to carry out appropriate statistical analyses on their data as part of normal operating procedures, to report the uncertainty of their results, and to provide the relevant information to enable reproduction of their results.

  • Describe the various sequence formats used to store DNA and protein sequences (e.g., FASTA, FASTQ).
  • Understand the representation of gene features using General Feature Format (GFF) files.
  • Compare reproducibility of biological and technical replicate data (e.g., transcriptomic data) using statistical tests (Spearman rank correlation and false discovery rate calculations).
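
As an illustration of the skills listed above, the following is a minimal Python sketch, using only the standard library, of reading FASTA and FASTQ records, splitting a GFF3 feature line into its nine columns, and computing a Spearman rank correlation and Benjamini-Hochberg adjusted p-values. All function names and sample values here are hypothetical and chosen for illustration; the sketch assumes well-formed input (four-line FASTQ records, Phred+33 quality encoding, tab-separated GFF3) and is not a substitute for established tools such as Biopython's `Bio.SeqIO`, `scipy.stats.spearmanr`, or `statsmodels.stats.multitest.multipletests`.

```python
import io
from statistics import mean


def read_fasta(handle):
    """Yield (header, sequence) pairs; sequences may span several lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fastq(handle):
    """Yield (name, sequence, quality), assuming four lines per record."""
    while True:
        name = handle.readline().strip()
        if not name:
            break
        seq = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().strip()
        yield name[1:], seq, qual


def phred_scores(quality):
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality]


def parse_gff_line(line):
    """Split one GFF3 feature line into a dict of its nine columns."""
    keys = ["seqid", "source", "type", "start", "end",
            "score", "strand", "phase", "attributes"]
    rec = dict(zip(keys, line.rstrip("\n").split("\t")))
    rec["start"], rec["end"] = int(rec["start"]), int(rec["end"])
    rec["attributes"] = dict(
        item.split("=", 1) for item in rec["attributes"].split(";") if item
    )
    return rec


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks (inputs must vary)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data, for illustration only.
records = list(read_fasta(io.StringIO(">seq1 demo\nACGT\nTTGA\n")))
rep1 = [5.1, 0.3, 12.0, 7.7, 2.2]   # expression values, replicate 1
rep2 = [4.8, 0.5, 11.1, 8.0, 2.0]   # expression values, replicate 2
rho = spearman(rep1, rep2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A rank correlation near 1 between replicates suggests that they order the genes consistently, and the adjusted p-values can be compared against a chosen false discovery rate threshold. The appropriate threshold and test depend on the experimental design.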
