
Journal Club Wed. Jan. 20, ODD and TRACE protocols for documenting models

The two papers I will discuss are:

Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J., & Railsback, S. F. (2010). The ODD protocol: a review and first update. Ecological Modelling, 221(23), 2760-2768.

Grimm, V., Augusiak, J., Focks, A., Frank, B. M., Gabsi, F., Johnston, A. S., ... & Thorbek, P. (2014). Towards better modelling and decision support: documenting model development, testing, and analysis using TRACE. Ecological Modelling, 280, 129-139.

The former paper describes a protocol (ODD) for fully documenting agent-based models, and the latter describes a framework (TRACE) for planning, performing, and documenting good modeling practice.

My interest in these two papers is in what we can take from them and implement in a modeling exercise and curriculum, so that we can instill best practices for reproducible research in our students. I will write a more extensive blog post later, but the TRACE paper has a clear description that could serve as an outline for a course on modeling, and I want to apply it to my own modeling efforts.

  1. best practices
  2. documentation
  3. journal club
  4. modeling

Comments on this entry

  1. Kam Dahlquist

    Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J., & Railsback, S. F. (2010). The ODD protocol: a review and first update. Ecological Modelling, 221(23), 2760-2768.

    Grimm, V., Augusiak, J., Focks, A., Frank, B. M., Gabsi, F., Johnston, A. S., ... & Thorbek, P. (2014). Towards better modelling and decision support: documenting model development, testing, and analysis using TRACE. Ecological Modelling, 280, 129-139.

    Note that these notes are sometimes copied directly from the article and would need to be paraphrased for use in derivative documents.

     

    • ODD stands for Overview, Design concepts, and Details; it was established in 2006 for agent-based modeling.
    • The 2010 paper evaluates how ODD was used in published papers and offers revisions based on that usage and on feedback from interviews.
    • Definitely skewed toward agent-based modeling (with which I am vaguely familiar), but could be adapted for use with other types of models.
    • Elements of the ODD are as follows (revised version)
      • Purpose: What is the purpose of the model?
      • Entities, state variables, and scales: What kinds of entities are in the model? By what state variables, or attributes, are these entities characterized? What are the temporal and spatial resolutions and extents of the model?
      • Process overview and scheduling: Who does what and in what order?
      • Design concepts: 11 of these
        • Basic principles: Which general concepts, theories, hypotheses, or modeling approaches are underlying the model’s design? (lots of good subquestions here)
        • Emergence
        • Adaptation
        • Objectives
        • Learning
        • Prediction
        • Sensing
        • Interaction
        • Stochasticity
        • Collectives
        • Observation
      • Initialization: What is the initial state of the model world at t = 0 of a simulation run?
      • Input data: Does the model use input from external sources such as data files or other models to represent processes that change over time?
      • Submodels: What in detail are the submodels that represent the processes listed in “Process overview and scheduling”? What are the model parameters, their dimensions, and reference values?  How were submodels designed or chosen, and how were they parameterized and then tested?
    • Over the roughly 3-year period examined, 56 publications used ODD, although 13 of them shared a co-author with the ODD standard itself. Of these, 75% followed it completely and correctly or missed just one element. The authors used the remaining 25% to revise the standard.
    • Complaints:
      • Can be redundant
      • Overkill for simple models
      • Separates units of object-oriented programming (but should be language independent)
    • Limitations:
      • Designed to describe one version of a model, not iterative versions
    • Benefits:
      • Promotes rigorous model formation
      • Facilitates reviews and comparisons of ABMs
      • May promote more holistic approaches to modeling and theory
    • For us, we can definitely take some of these questions and streamline them for students to use to describe their models.
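One way to streamline the ODD questions for students is to treat the seven elements as a checklist and flag the ones a draft description leaves empty. The following is only a minimal sketch in Python: the grouping into Overview, Design concepts, and Details follows the protocol's name, while the function name `missing_elements` and the sample draft are hypothetical and not taken from either paper.

```python
# The seven ODD elements (2010 revision), grouped by the three blocks
# that give the protocol its name.
ODD_ELEMENTS = {
    "Overview": [
        "Purpose",
        "Entities, state variables, and scales",
        "Process overview and scheduling",
    ],
    "Design concepts": ["Design concepts"],
    "Details": ["Initialization", "Input data", "Submodels"],
}


def missing_elements(description):
    """Return the ODD elements that are absent or blank in a description.

    `description` maps an element name to the student's free-text answer.
    """
    missing = []
    for elements in ODD_ELEMENTS.values():
        for element in elements:
            if not description.get(element, "").strip():
                missing.append(element)
    return missing


# Hypothetical student draft, used only to illustrate the check.
draft = {
    "Purpose": "Explore how foraging rules affect population size.",
    "Entities, state variables, and scales": "Foragers on a 50 x 50 grid.",
    "Process overview and scheduling": "   ",  # left blank by the student
    "Design concepts": "Emergence: population size is not imposed.",
    "Initialization": "100 foragers placed at random at t = 0.",
    "Submodels": "Movement, feeding, reproduction.",
}

for element in missing_elements(draft):
    print("Still to write:", element)
```

A checklist like this only tests for presence, not quality; judging whether an answer actually addresses the question still takes a human reader.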

     

    • TRACE stands for: Transparent and Comprehensive Ecological modelling documentation
    • In iterative model development, you start simple and then build up: models get complex and subcomponents get tested, but in the end the model may not be adequately documented
    • Stakeholders have no way of knowing if they can trust the model if there is no documentation
      • Want to avoid blind trust in a model (especially if faulty)
      • Also want to avoid blind mistrust in modeling
    • Establish clear expectations in terms of what documentation is needed
    • This article summarizes the original TRACE protocol; then, based on discussion, feedback, and use in one publication, the authors revise the TRACE format and the recommendations for its use
    • They merge the terms evaluation and validation into “evaludation”, defined as “the entire process of establishing model quality and credibility throughout all stages of model development, analysis, and application”
    • Meant to document model design and testing, but not the results of the model (which is what is usually focused on in a publication)
    • Two basic tasks of TRACE
      • Keep a modeling notebook in which you document daily what you did regarding model design, testing, and analysis, and what you learned from it
      • Using this modeling notebook, apply the standardized terminology to create a TRACE document
        • Note that Kam will provide examples of student electronic lab notebooks to see what poor, medium, and excellent examples are
        • In Biological Databases course, have a “Gene Database Testing Report Template” to do a quality assurance report for the creation of a gene database using the software
        • Have started a “Model Testing Report Template” which I will revise based on the TRACE documentation for use with my lab group (and subsequent courses).
        • In general, for a bioinformatics electronic lab notebook, I tell the students that they can copy the protocol, but then they need to modify it based on what they did and add results and discussion
    • I think the most important contribution of the paper is this explication of the “evaludation” process (which is why I’ve highlighted it)
      • Data evaluation, assessing the quality of numerical and qualitative data used for model development and testing
      • Conceptual model evaluation, scrutinizing the simplifying assumptions underlying a model’s design
      • Implementation verification, checking the model’s implementation in equations and software
      • Model output verification, comparing model output to the data and patterns that guided model design and calibration
      • Model analysis, examining the model’s sensitivity to changes in parameters and formulation to understand the model’s main behaviours and describing and justifying simulation experiments
      • Model output corroboration, comparing model output to data and patterns that were not used for model development and parameterization.
    • A TRACE template is provided in the supplementary material, along with three examples of its use; the template sections below correspond to the evaludation elements above
      • Problem formulation
      • Model description
      • Data evaluation
      • Conceptual model evaluation
      • Implementation verification
      • Model output verification
      • Model analysis
      • Model output corroboration
    • They want to establish a culture of model development, testing, and analysis in ecology
      • Culture means that you just do all these things as well as you can because you know that peers and model clients are expecting you to; there is no point any more in complaining about “additional effort” for these things.
      • I think they overestimate the level of reproducible research in empirical sciences
      • My commentary:  documentation takes a long time and it’s difficult to get compliance
      • Many standards are out there, but compliance with them is in question. In the area I am familiar with, microarray data at GEO and ArrayExpress, about 25% of the datasets we’ve tried to use (an arbitrary list) have had serious problems, including missing column headers, missing gene IDs, and no way to tell which samples go with which data files
      • Deception at Duke scandal
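The two basic TRACE tasks described above (keep a dated notebook, then sort it under the standardized headings) can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' tooling: the section names come from the template list in these notes, while the tuple layout of a notebook entry, the function names `assemble_trace` and `undocumented`, and the sample entries are all hypothetical.

```python
from collections import defaultdict

# Section headings of the TRACE template, in the order listed above.
TRACE_SECTIONS = [
    "Problem formulation",
    "Model description",
    "Data evaluation",
    "Conceptual model evaluation",
    "Implementation verification",
    "Model output verification",
    "Model analysis",
    "Model output corroboration",
]


def assemble_trace(notebook):
    """Group (date, section, note) notebook entries under TRACE headings."""
    by_section = defaultdict(list)
    for date, section, note in notebook:
        if section not in TRACE_SECTIONS:
            raise ValueError(f"unknown TRACE section: {section}")
        by_section[section].append(f"{date}: {note}")
    lines = []
    for section in TRACE_SECTIONS:
        lines.append(section)
        entries = by_section.get(section) or ["(no entries yet)"]
        lines.extend(f"  - {entry}" for entry in entries)
    return "\n".join(lines)


def undocumented(notebook):
    """List TRACE sections that have no notebook entry at all."""
    covered = {section for _, section, _ in notebook}
    return [s for s in TRACE_SECTIONS if s not in covered]


# Hypothetical notebook entries, invented for illustration only.
notebook = [
    ("2016-01-11", "Problem formulation", "Agreed on the model question."),
    ("2016-01-12", "Data evaluation", "Two data files lack column headers."),
    ("2016-01-14", "Implementation verification", "Unit test for growth step."),
    ("2016-01-15", "Model analysis", "Varied the growth rate by +/- 10%."),
]

print(assemble_trace(notebook))
print("Sections with no entries:", undocumented(notebook))
```

Tagging each entry with a section on the day it is written is a design choice for this sketch; it makes gaps visible early, so that missing corroboration or output verification is noticed before the write-up stage.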


  2. Kam Dahlquist

    Here is the link to the standards registry that I mentioned today:

    https://biosharing.org/


  3. Kam Dahlquist

    A couple more articles to add to the discussion:

    Kirk et al. (2015) Systems Biology (Un)Certainties. Science

    http://science.sciencemag.org/content/350/6259/386.full

    A letter in response appeared in the 15 Jan 2016 issue of Science.

    Roche et al. (2015) Public Data Archiving in Ecology and Evolution: How Well Are We Doing? PLoS Biology

    http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002295

