Introduction to Data Management and Metadata using NEON aquatic macroinvertebrate data

Author(s): Kaitlin Stack Whitney

Rochester Institute of Technology

This lesson focuses on understanding metadata, the data about the data, using aquatic macroinvertebrate abundance and species information from a variety of NEON sampling locations.

Licensed under CC Attribution 4.0 International

Version 1.0, published on 30 Sep 2019. doi:10.25334/SJX1-F373


This lesson introduces students to working with metadata, which can be broadly thought of as the data ABOUT existing data. Data isn’t complete without metadata, and this lesson will help students understand both how to work with metadata and how to create their own.

Data used: NEON aquatic macroinvertebrate datasets from multiple stations. The lesson could be adapted to use other datasets or taxonomic groups.


Activities: The lesson involves three major activities. 1) Querying and downloading datasets and corresponding products from NEON. 2) Reading metadata files that correspond with the data files and answering comprehension questions about them. 3) Combining two datasets based on the understanding of the metadata gained in activity 2 (e.g. knowing which columns indicate sampling dates, and in which formats, allows students to combine multiple datasets appropriately).

Programs: No specific programming skills or languages are required for this lesson. It is designed to be done entirely in common office/student software (e.g. Microsoft Word and Microsoft Excel) and could also be done using online programs (e.g. my university has student licenses for Google Sheets and Google Docs).
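
For instructors who would like to see the logic of activity 3 spelled out, here is a minimal Python sketch of the same steps students carry out by hand in a spreadsheet. The lesson itself does not require it. The rows, site names, and column names (collect_date, sample_date, taxon, count) are hypothetical and are not taken from NEON files; the real column names and date formats should be read from the metadata.

```python
from datetime import datetime

# Hypothetical rows standing in for two downloaded tables.
# Column names and date formats are illustrative only; the real
# ones must be looked up in the metadata for each dataset.
table_a = [
    {"site": "SITE_A", "collect_date": "2018-07-03", "taxon": "Baetis", "count": 12},
    {"site": "SITE_A", "collect_date": "2018-07-17", "taxon": "Baetis", "count": 9},
]
table_b = [
    {"site": "SITE_B", "sample_date": "07/10/2018", "taxon": "Hydropsyche", "count": 5},
]


def normalize(rows, date_column, date_format):
    """Return copies of rows with the date column renamed to 'date' in ISO format."""
    normalized = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != date_column}
        parsed = datetime.strptime(row[date_column], date_format)
        new_row["date"] = parsed.date().isoformat()
        normalized.append(new_row)
    return normalized


# The metadata tells us which column holds the sampling date and how it is
# written; here we assume year-month-day for one table and month/day/year
# for the other.
combined = (
    normalize(table_a, "collect_date", "%Y-%m-%d")
    + normalize(table_b, "sample_date", "%m/%d/%Y")
)

# Sort by the shared ISO date so rows from both tables interleave correctly.
combined.sort(key=lambda row: (row["date"], row["site"]))

for row in combined:
    print(row["date"], row["site"], row["taxon"], row["count"])
```

The point of the sketch is that the two date columns cannot be combined until the metadata has been read: without it, there is no way to know that "07/10/2018" means 10 July rather than 7 October.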


Learning objectives:

1 – Students will be able to define ‘metadata’ and understand how metadata is critical for reproducible research.

2 – Students will be able to correctly answer comprehension questions about a metadata file.

3 – Students will be able to apply their understanding of the metadata file to create a new data file from two data sets.

4 – Students will understand the importance of creating and understanding metadata to go along with datasets.


Timing: This lesson was designed to take place in two 75-minute class periods in a workshop format. It could easily be part of a longer lab, a homework assignment, or a remote / online / asynchronous assignment.


This version is current as of Spring 2019 and has been taught in the classroom. I encourage folks to adapt, modify, and make new versions.

Cite this work

Researchers should cite this work as follows: