Outside the Norm: Using Public Ecology Database Information to Teach Biostatistics
Editor: Joseph Dauer
Abstract
Biology students’ understanding of statistics is incomplete due to poor integration of these two disciplines. In some cases, students fail to learn statistics at the undergraduate level due to poor student interest and cursory teaching of concepts, highlighting a need for new and unique approaches to the teaching of statistics in the undergraduate biology curriculum. The most effective method of teaching statistics is to provide opportunities for students to apply concepts, not just learn facts. Opportunities to learn statistics also need to be prevalent throughout a student’s education to reinforce learning. We developed and implemented a curriculum that integrates a topic in biology with an emphasis on statistical analysis in order to improve students’ quantitative thinking skills. Our lesson focuses on the change in the richness of native species for a specified area with the aid of iNaturalist and the capacity for analysis afforded by Google Sheets. We emphasized the skills of data entry, storage, organization, curation, and analysis. Students then had to report their findings, as well as discuss biases and other confounding factors. Pre- and post-lesson assessment revealed that students’ quantitative thinking skills improved, as measured by a paired-samples t test. At the end of the lesson, students had an increased understanding of basic statistical concepts, such as bias in research and making data-based claims, within the framework of biology.
Primary Image: Website screenshot of an iNaturalist observation (Clasping Milkweed – Asclepias amplexicaulis). This image is an example of a data entry on iNaturalist. The data students export from iNaturalist are made up of hundreds, or even thousands, of observations like this one. This image is licensed under the Creative Commons Attribution-Share Alike 4.0 International license. Source: Observation by cassi saari, 2014.
Citation
Tyce C, Goudsouzian LK. 2023. Outside the Norm: Using Public Ecology Database Information to Teach Biostatistics. CourseSource 10. https://doi.org/10.24918/cs.2023.38
Society Learning Goals
Ecology
 Biological Diversity
 What is biodiversity at the genetic, species and functional (niche) level within an area, a biome or on Earth?
Lesson Learning Goals
Students will:
 describe how the presence of bias in research (such as selection bias) will impact statistical results.
 use spreadsheet software to organize and display data in a way that facilitates comprehension and analysis.
 use data to make claims about a phenomenon.
 be exposed to how the amount of data available, the type of data, and test parameters inform the choice and implementation of appropriate statistical tests.
Lesson Learning Objectives
Students will:
 interpret the parameters and value of the chi-square goodness of fit test while analyzing data.
 search for, filter, and export data from the program iNaturalist.
 apply concepts of sampling, bias, and experimental design to illustrate strengths and weaknesses within the data collected.
 use Google Sheets to conduct statistical analyses.
 analyze prior research to find biases.
 effectively report statistical findings in writing.
Introduction
Statistics is the ability to “think with data,” and therefore is an important discipline to be addressed in science courses (1). The use of quantitative reasoning, such as the application of statistical analysis, is a core competency listed in Vision and Change (2). Literature surrounding statistics has focused on introductory collegiate courses and experiences before college (1), suggesting a need for further literature applying statistics to specific subject matter. Statistics has historically been poorly integrated into biology curricula, and some students fail to learn statistics at the undergraduate level altogether (3). To address this lack of statistical instruction in life science courses, previous studies have created lessons that increase biology students’ understanding of statistics. Colon-Berlingeri and Burrowes (4) designed multiple activities for genetics and zoology courses to enhance student statistics skills. Metz (3) sought to improve statistical understanding in biology students by modifying statistics courses to include more biological examples. Regardless of implementation style, it is important that the disciplines of math and biology are taught together to biology students instead of as separate entities (5), and that such exposure is scaffolded over the course of a student’s academic career to ensure proper development of skills (6). This study focuses on statistics education outside of the general statistics courses, specifically looking at teaching statistics within a college-level molecular biology laboratory course.
The module we describe was designed and implemented within an intermediate Molecular Biology laboratory course at a small liberal arts university in Pennsylvania. This lesson followed a DNA barcoding activity, where students had to extract the DNA of an unknown species, then PCR amplify and perform sequence analysis of a hypervariable region of the genome. This activity extended the lesson to focus on the overall species richness of a given area. The lesson requires students to download datasets relating to the distribution of plant and animal life in specified geographical regions from the database iNaturalist. This information is typically collected from a student’s county of birth. However, students may also use instructor-provided data if needed (see Teaching Discussion), as well as data from other areas of interest or personal significance (see Inclusive Teaching). After collecting information, students use the spreadsheet program Google Sheets to tabulate the frequencies of various classes of organisms and to test whether those frequencies have changed over time using a chi-square goodness of fit test. These activities are supplemented by instruction in the basic statistics concepts of bias, sampling methods, and experimental design, as well as information specific to the chi-square goodness of fit test. Students determine how the number and types of species have changed over the past year using their statistical analysis, and they evaluate the accuracy and appropriateness of the data by applying their understanding of experimental design and sampling methods.
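The tabulation step described above is done in Google Sheets in the lesson itself. For instructors who would like an independent check of student counts, the same tally can be sketched in Python. This is a minimal sketch, not part of the lesson materials: the sample rows are invented, and the column names `observed_on` and `iconic_taxon_name` are assumptions about the layout of an iNaturalist CSV export that should be checked against a real download.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for an iNaturalist CSV export. The column names
# "observed_on" and "iconic_taxon_name" are assumptions about the export.
SAMPLE_EXPORT = """observed_on,iconic_taxon_name,scientific_name
2021-06-02,Plantae,Asclepias syriaca
2021-06-15,Aves,Turdus migratorius
2021-07-04,Plantae,Trifolium repens
2022-05-20,Plantae,Asclepias syriaca
2022-06-11,Insecta,Danaus plexippus
2022-06-30,Aves,Cardinalis cardinalis
2022-07-08,Aves,Turdus migratorius
"""


def tally_by_year(csv_text):
    """Return {year: Counter({group: count})} for the rows in csv_text."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row["observed_on"][:4])  # ISO date assumed: YYYY-MM-DD
        counts.setdefault(year, Counter())[row["iconic_taxon_name"]] += 1
    return counts


counts = tally_by_year(SAMPLE_EXPORT)
for year in sorted(counts):
    print(year, dict(counts[year]))
```

The per-year counts produced this way play the role of the observed and expected frequencies that students later compare with the chi-square goodness of fit test.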
The American Statistical Association published guidelines in 2014 with recommendations for educators, including an enumeration of skills required of students pursuing careers involving statistics, teaching pedagogy, and how to work with other programs that teach statistics (6). The ASA-recommended concepts we chose to emphasize and implement included (but were not limited to) teaching statistical theory concepts such as design of studies and biases, refraining from the use of simple datasets, teaching statistics communication, and building real-world problem-solving skills (6).
To assess the learning outcomes of this new lesson, we administered a pre- and post-module assessment (Supporting File S1). This included 10 statistics questions, 5 attitudinal questions, and 5 demographic questions. Two of the statistics questions were written by us. The rest were validated questions that came from previous research by Colon-Berlingeri and Burrowes (4), a practice Advanced Placement exam (7), or the Assessment Resource Tools for Improving Statistical Thinking (ARTIST) database (8). The five attitudinal questions were selected from Wise (9).
Intended Audience
This lesson was designed for all biology majors at an undergraduate liberal arts university, regardless of background knowledge in statistics. It was taught to 95 Molecular Biology students with mixed backgrounds in statistics. Forty-six of these students chose to participate in the research aspect of this lesson. Of these 46 individuals, 36 had completed a collegiate statistics course, 2 were currently enrolled in statistics courses, 2 had statistics experience from advanced placement programs in high school, and 6 had no collegiate statistics experience. This class was primarily composed of second-year students. Approximately half of the students were biology majors, half were enrolled in an accelerated physician assistant program, and a small percentage were also biochemistry majors.
Required Learning Time
For this exercise, six hours (two three-hour lab classes) of in-class time were devoted to the lesson. The laboratory setting of the activity meant the amount of time students spent in the classroom varied from student to student, with all students able to complete the laboratory in the allotted time. In addition, two short homework assignments (approximately one hour to complete each) were assigned (Supporting Files S2–S7). Depending on data collection methods and statistical concepts taught (i.e., a reduction in content), the overall time commitment of this exercise can be modified (see Teaching Discussion).
Prerequisite Student Knowledge
No prerequisite knowledge is required for completion of the module. However, because students must synthesize complex ideas about climate change, global commerce, and other factors which influence the spread and persistence of species, the module is best suited for students beyond their first year of an undergraduate biology program.
Prerequisite Teacher Knowledge
Teachers should have basic knowledge of p values and their interpretation, concepts regarding how biases impact research (such as how they affect the credibility of results), chi-square goodness of fit tests, types of data (categorical, nominal, ordinal), as well as a broad understanding of different statistical tests (specifically the difference between parametric and nonparametric tests). Though such concepts are only covered in brief, advanced knowledge of these topics makes it easier to detect mistakes and common errors in students’ statistical thinking. We used both a biostatistics textbook (11) and various online resources (12–15) to ensure we had a statistical foundation strong enough to address student questions.
Teachers should also have knowledge of native and introduced species. This includes what makes introduced species successful, how they become introduced to new environments, and where they may be most prevalent. Although the focus of the lesson is statistics, knowledge of introduced species ensures instructors can determine if student claims regarding frequencies of native species throughout the inclass activities are logical and reasonable.
In addition, teachers should become familiar with the software being used within this lesson. This includes the database iNaturalist, as well as the techniques used in Google Sheets. Both the in-class activity and supporting materials can be used to gain the appropriate knowledge.
Scientific Teaching Themes
Active Learning
The introductory lesson on statistics prompts students to work in groups to answer interleaved questions regarding statistics, such as determining if statements/methods contain biases and classifying data as categorical, nominal, or ordinal. Such laboratory discussion is an effective method for fostering active learning in the classroom (16). Students also participate in hands-on and computer-based activities. To understand the effects of sample size on bias, students use colored beads to collect different sized samples and compare with other groups. Students also collect and analyze their own data from online databases and compare their findings with others in their group. The use of computer-based exercises and group work has been noted to actively involve students, helping them retain more information, learn from others, and actively participate in learning (17).
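The bead activity can also be previewed with a short simulation before class. The following is a minimal sketch in Python under stated assumptions: the jar composition (30 red, 70 blue), the sample sizes, and the number of simulated groups are all hypothetical and are not taken from the lesson materials. It illustrates that estimates from small samples vary more from group to group than estimates from large samples.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical jar: 30 red beads and 70 blue beads (true red share = 0.30).
JAR = ["red"] * 30 + ["blue"] * 70


def red_share(sample_size):
    """Draw beads without replacement and return the fraction that are red."""
    sample = random.sample(JAR, sample_size)
    return sample.count("red") / sample_size


def spread_of_estimates(sample_size, groups=2000):
    """Standard deviation of the red share across many simulated groups."""
    shares = [red_share(sample_size) for _ in range(groups)]
    return statistics.pstdev(shares)


spread_by_size = {n: spread_of_estimates(n) for n in (5, 20, 50)}
for n, spread in spread_by_size.items():
    print(f"sample size {n:2d}: spread of estimates = {spread:.3f}")
```

In a classroom, each student group corresponds to one simulated draw; comparing results across groups shows the same pattern on a smaller scale.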
Assessment
Instructors measured learning within the development of this lesson using a pre- and post-module assessment (Supporting File S1). The pre- and post-module assessments are identical. This module was submitted for review and approved by the test university’s Institutional Review Board. Students who gave informed consent participated in this assessment, which consisted of 10 content-based questions, 5 attitudinal questions, and 5 demographic questions. All but two content questions were either validated through previous assessments (4), taken from a previous practice Advanced Placement exam (7), or found in the ARTIST database (8). The results of this assessment showed a significant improvement in understanding of statistical concepts (see Teaching Discussion).
For instructors wishing to measure learning in their own classroom without pre- and post-assessments, we recommend assessing the following aspects of the lesson. Instructors can measure learning through graded homework assignments (Supporting Files S5–S7). They can also monitor student progress throughout the in-lab activity and lecture. This includes grading the in-class assignment for correctness, as well as assigning credit for discussion questions.
Students were able to evaluate their own learning and understanding of content through the completion of the homework and the laboratory activities, as well as their success during in-class, lecture-integrated discussion questions. Some questions asked students to reflect on why they were asked to use certain strategies in completing the assignment, and whether they thought those strategies were effective. These self-evaluation questions are metacognitive and help students think like biologists (18).
Inclusive Teaching
The lesson incorporates group work and discussion, both during lectures and activities. Students listen to multiple points of view, providing multiple unique opportunities to learn new information, as discussed in the active learning section. Some examples utilize physical objects to teach concepts. Images/screenshots are present as well to help students better comprehend written instruction, and a ‘note sheet’ outlining Google Sheets shortcuts utilizes GIFs to visualize techniques (Supporting File S8). Faculty help, through verbal description and in-class demonstration, was also available. This aligns with the multiple means of representation outlined by the Universal Design for Learning framework (19). In addition, software and information used within this project are free to use and cloud-based. This makes the information accessible in two ways. First, students do not have to pay for programs to do a lesson. Second, students who only have access to cloud-computing-based devices (e.g., Chromebooks) have access to the lesson as well because no software installation is required. If students are unable to afford their own devices, university computer labs or laboratory computers may be utilized with no need for lengthy software preparation. This lesson also uses data of personal significance to students. However, students who are uncomfortable choosing a location related to where they are from or where they live, or who are unable to find information for their location, may choose other regions of interest, such as the location of the institution, the location of a friend/family member, or a region they find interesting. Finally, we saw no significant difference in learning outcomes according to students’ self-identified gender, leading us to believe that our lesson is equitable in regard to gender (see Teaching Discussion).
Lesson Plan
Pre-Class Preparation: Day 1 (Table 1: Part 1)
Preparation for the first lesson should begin with a review of the lesson, beginning with the pre-lab lecture (Supporting File S9). Note that supporting information is present within the speaker notes of the slideshow presentation. This lecture includes discussion of hypotheses, experimental design, sampling methods, biases, and the significance of native species. Having a solid understanding of these ideas is important for class discussion. It is likely that a student will ask a question about a particular circumstance (e.g., does a survey conducted by the school’s dining hall contain bias? If so, is it intentional?).
Next, familiarize yourself with the examples and questions in the pre-lab lecture. This will be the best way to assess if students are understanding the content of the lesson at that time. This is also important because, given the nature of this lecture, it is possible for multiple answers to be right given the appropriate circumstances. For example, the “building a house” example is meant to illustrate that, if a flaw in the design process is present in an experiment, all other aspects (including statistical analysis) will be flawed as well. Yet, students may state that the plan is meant to increase transparency of research. Though different from the intended response, it is nonetheless a valid answer. When questions appeared to cause confusion, or had nuances to their answer, we addressed these in front of the entire class as they arose. Also make sure you have small items available for the sample size activity. Our laboratory used different colored beads, but this can be changed (see Teaching Discussion).
After reviewing presentation material, work through the laboratory assignment (Supporting File S10). As you work through the activity, note any places you may find challenging. For example, if you are not familiar with spreadsheet software, you may find it valuable to use your own learning process to inform the teaching of your classroom. Also remember to use the Google Sheets Tips and Tricks document to help you finish your work (Supporting File S8). Please note that shortcuts/commands for the program may differ depending on the machine being used. If your institution uses machines other than those operating on Windows or Apple software, you will need to determine the appropriate shortcuts and commands.
After reviewing the assignment, prepare sample datasets from iNaturalist, or use ones provided in the supporting materials (Supporting File S11). If a classroom is large, iNaturalist may not be able to process all download requests or it will work at a slow pace (as we observed in the first implementation of this lesson). Prepared datasets aid in the efficiency of the lab, but at the potential cost of student engagement. Since one of the goals of the lesson is to make data more applicable to students, preparing sample data from areas surrounding the educational institution can maintain engagement. The sample dataset, along with a blank version of the assignment, can be uploaded to a course management system (such as Blackboard, Google Classroom, Canvas, etc.) for easy student access to, and submission of, lesson materials.
Finally, read the homework assignment, which consists of two articles and a guided reading worksheet. The first article outlines types of bias that can be encountered in research (20) (Supporting Files S2, S3), and the second is a brief article about a study that was retracted by the prestigious British medical journal The Lancet (21) (Supporting File S4). An annotated version of the first article was provided to students to aid comprehension. Read the answer key for the assignment to understand what responses are deemed correct (Supporting File S6).
Day 1 Class (Table 1: Part 2)
All students should open the computer they are using to complete the activity and to view supporting materials. In our classroom, this included tablet computers, laboratory PCs, and personal laptops. In addition, students are encouraged to take notes. Once prepared, walk through the lecture slides with the class, providing examples during the presentation. Note that the more examples you can provide, the better. These should not be reserved for example slides only. For specifics on how to treat each individual slide, please reference the speaker notes within the presentation.
After the lecture, students may begin to work in groups on the activity. Student groups are meant to be collaborative, thus providing a support framework for students if questions or concerns arise. However, remind students that everyone must hand in their own work and use their own dataset (if possible) because grading is on an individual, not group, basis. In the implementation of this lesson, our students were provided with the necessary articles through our learning management system, but these may be offered as hard copies if desired (see Teaching Discussion).
During the activity, monitor students’ progress by walking between student groups and viewing their screens. We found that students' unfamiliarity with computers could hinder progress. Students may not understand they are working incorrectly online or hesitate to ask for assistance and will need to be corrected even without asking for help. Day Two activities build upon Day One, so errors should be corrected right away to prevent later difficulties in completing the exercise.
If multiple students make a similar error, we found it best to direct everyone’s attention to the front projector screen. For technical mistakes, we demonstrated the correct process on the instructor’s computer. For example, students were able to watch us manipulate information in Google Sheets. If many students struggled with a statistical concept, we would discuss the topic with the class and return the presentation to the appropriate slide. We would leave this information up until a new question arose.
It is also important for the teacher to monitor progress because portions of the activity (such as question two in Supporting File S10) require students to compare answers. The instructor’s presence can facilitate discussion for those who are less comfortable speaking up in their group, as well as ensure that students are communicating and not merely sharing computers without discussion. In situations where students are uncomfortable collaborating with peers, educators can provide their own data from pre-class preparation and compare the two datasets with the student.
Because this activity was conducted in a laboratory setting, students were allowed to leave once the in-class assignment was completed but were reminded that their homework was due at the next lesson. Students may pick up their homework assignments as they leave the laboratory. If desired, students may remain in the lab to work on the assignment with peers or professor aid.
Pre-Class Preparation: Day 2 (Table 1: Part 3)
Prepare for the second lecture in a similar manner to the first. Go over all concepts in the presentation (Supporting File S12), including p values, data organization, how to choose appropriate statistical tests, chi-square, etc. It is especially important to do a careful review of p value concepts given that many discussion questions within the laboratory activity, as well as the homework, ask about p values. Be sure to state that the p value is not just the probability of one result occurring, but the probability of a result at least that extreme occurring, assuming the null hypothesis is true (11). Next, go through all example slides, including those that require student participation. When discussing data types, there may be circumstances that yield results different than what is displayed on the slides. For example, a pH strip test may yield categorical data by the color of the strip, but if a color corresponds to a pH number it may change to ranked/ordinal data or even numerical data. If such discussion occurs, be sure to explain how each answer could be right or wrong given the context of the question.
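The distinction between the probability of one specific result and the probability of a result at least that extreme can be made concrete with a small calculation. The following Python sketch uses a hypothetical coin-flip example that is not part of the lesson slides: the null hypothesis is a fair coin, and the observed result is 9 heads in 10 flips.

```python
from math import comb


def prob_exactly(k, n, p=0.5):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_sided_p_value(k, n, p=0.5):
    """Probability of k or more successes in n trials if the null (p) is true."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


# Hypothetical example: 9 heads in 10 flips, null hypothesis = fair coin.
exact = prob_exactly(9, 10)
p_value = one_sided_p_value(9, 10)
print(f"P(exactly 9 heads) = {exact:.4f}")
print(f"P(9 or more heads) = {p_value:.4f}")
```

The second number is the one-sided p value: it adds the probability of the even more extreme outcome (10 heads) to the probability of the observed outcome, so it is always at least as large as the first.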
Another example within the lesson is the knee-jerk reflex test. Although the test itself is not important for statistics education, it is a common assessment that serves one specific purpose. This makes it similar to statistical tests, which work in certain contexts for a specific purpose.
Finally, when walking through the sample chi-square table, it is beneficial to write the calculations on the board when presenting this material. Depending on the mathematics background of the educator this may require extra practice.
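Instructors who want to rehearse the board calculation may find it useful to check their arithmetic in code. The sketch below, written in Python rather than Google Sheets, computes the chi-square goodness of fit statistic as the sum of (observed − expected)² / expected over all categories and compares it with the standard critical value at α = 0.05. The counts and proportions are hypothetical and do not come from the lesson's sample table.

```python
# Critical values of the chi-square distribution at alpha = 0.05,
# keyed by degrees of freedom (standard table values).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_gof(observed, expected_proportions):
    """Return (statistic, degrees of freedom) for a goodness-of-fit test."""
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts: native vs. introduced observations this year,
# compared with last year's proportions (80% native, 20% introduced).
observed = [150, 50]
statistic, df = chi_square_gof(observed, [0.8, 0.2])
reject_null = statistic > CRITICAL_05[df]
print(f"chi-square = {statistic:.3f}, df = {df}, reject null: {reject_null}")
```

With these invented numbers the expected counts are 160 and 40, the statistic is 0.625 + 2.5 = 3.125, and the null hypothesis of unchanged proportions is not rejected at the 0.05 level.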
After reviewing the lesson, work with the students to complete the in-class assignment (Supporting File S13), paying attention to the calculations required for each step. It is likely that you will encounter students ordering columns incorrectly in the spreadsheet and therefore conducting improper calculations. Having a solid understanding of the calculations to be completed will allow you to better help students. Also pay particular attention as the students create formulas in Google Sheets, as this can be a complex process for first-time users.
Finally, read through the rubric for the final homework assignment: a ‘mini lab report’ requiring students to use appropriate methods to report their statistical findings (Supporting File S7).
Day 2 Class (Table 1: Part 4)
Begin the class by collecting the previous week’s homework, consisting of one worksheet. Alternatively, students can upload the assignment through the course learning management system. Begin the pre-laboratory lecture discussing the new statistical content for the day. For specific content to present on each slide, refer to the speaker notes within the presentation.
After the lesson, allow students to break into groups and work on the assignment. Much like the first day, ensure that students are working collaboratively and fairly, and actively monitor student progress to determine if intervention is required for the entire classroom. As students work, pass out the final homework assignment to be completed. This will be due the following week. This and the previous assignment can be completed at the end of the last class day (Table 1: Part 5).
Table 1. Lesson timeline. A guide to the preparation and implementation of the lesson, including anticipated times and required supporting materials.
| Activity | Description | Time | Required Materials |
| --- | --- | --- | --- |
| Part 1: Preparation for First Lesson | | | |
| Review Presentation | Review lesson presentation including a review of statistical concepts and practice questions. Also prepare teaching aids for the hands-on activity (beads were used for our classroom). | 30–60 minutes depending on statistics background | Supporting File S9 |
| Review Student Activity | Complete the activity to be assigned and note possible areas of trouble for your students. Also review the provided Google Sheets cheat sheet. | 60–120 minutes depending on spreadsheet background | Supporting Files S10 and S8 |
| Review Student Homework | Look over student homework readings and reading guide. | 15–30 minutes | Supporting Files S2–S6 |
| Post Student Materials | Upload documents and resources to the course management system, including the homework, lessons, and assignments. Also upload sample datasets. | 10–15 minutes | Supporting File S11 |
| Part 2: The First Lesson | | | |
| Pre-Lab Lecture | Teach students using the presentation slides prepared. | 45–60 minutes | Supporting File S9 |
| Laboratory Activity | Have students complete the laboratory activity in groups of 4. Frequently check in with students and walk around the classroom to monitor progress. | 60–120 minutes | Supporting File S10 |
| Assign Homework | Provide students with hard copies of the homework assignment as they leave the laboratory. Students may stay to work on homework during laboratory time if they wish. Inform them that it will be due in one week. | 5 minutes (as students leave) | Supporting Files S2–S6 |
| Part 3: Preparation for Second Lesson | | | |
| Review Presentation | Review lesson presentation, including a review of statistical concepts and practice questions. | 30–60 minutes depending on statistics background | Supporting File S12 |
| Review Student Activity | Complete the activity to be assigned and note possible areas of trouble for your students. Also review the provided Google Sheets cheat sheet. | 60–120 minutes depending on spreadsheet background | Supporting Files S13 and S8 |
| Review Student Homework | Look over student laboratory report rubric and sample. | 10–20 minutes | Supporting File S7 |
| Post Student Materials | Upload documents and resources to the course management system, including the homework, lessons, and assignments. | 10–15 minutes | |
| Part 4: The Second Lesson | | | |
| Collect Homework | Collect hard copies of student homework from the previous laboratory. | 5 minutes | |
| Pre-Lab Lecture | Teach students using the presentation slides prepared. | 45–60 minutes | Supporting File S12 |
| Laboratory Activity | Have students complete the laboratory activity in groups of 4. Frequently check in with students and walk around the classroom to monitor progress. | 60–120 minutes | Supporting File S13 |
| Assign Homework | Provide students with hard copies of the homework assignment as they leave the laboratory. Students may stay to work on homework during laboratory time if they wish. Inform them that it will be due in one week. | 5 minutes (as students leave) | Supporting File S7 |
| Part 5: After the Lessons | | | |
| Grade First Homework | Grade the first homework assignment submitted by students. | 5–10 minutes per student | |
| Grade Second Homework | Grade the second homework assignment submitted by students. This should be handed in one week after it was assigned. | 5–10 minutes per student | |
Teaching Discussion
Our assessment data showed that students’ understanding of statistical concepts improved significantly after completion of the lesson. To assess student learning outcomes in a quantitative manner, an optional pre- and post-assessment was given to students who gave informed consent to participate. These assessments, and the lesson, were submitted to and approved by the Institutional Review Board at the test university (approval number ET55081021). Results of the pre- and post-assessment (Figure 1) were analyzed using a paired-samples t test. We saw significant improvement in student scores from the pre-assessment (mean score = 41.08696; standard deviation = 15.66713) to the post-assessment (mean score = 57.3913; standard deviation = 19.48615), t(46) = 4.4227, p < 0.05. Furthermore, there was no statistically significant difference in improvement between self-identified males (median percent change in score = 100) and females (median percent change in score = 33.33…), U(N_{male} = 15, N_{female} = 30) = 272.5, p > 0.05, as analyzed using a Mann-Whitney U test. This suggests that our teaching methods did not introduce a detectable bias with regard to gender (Figure 2). This finding is encouraging, given the prevalence of gender-math stereotypes assuming females to have lower mathematical aptitude, as well as a high presence of stereotype threat among females (meaning individuals fear their performance may be attributed to a negative stereotype) (10). Although we also collected information on students’ self-identified race/ethnicity, these data were not analyzed because of the limited self-reported diversity within the classroom and its volunteers.
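Instructors who collect their own pre- and post-assessment scores can run the same kind of comparison. The following Python sketch shows how a paired-samples t statistic is computed from per-student score differences. The six score pairs are hypothetical and are not the study data; the sketch only illustrates the calculation, not the authors' analysis workflow.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical pre/post scores (percent correct) for six students.
pre = [40, 30, 50, 60, 20, 40]
post = [60, 50, 50, 70, 40, 60]


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


t_stat, df = paired_t(pre, post)
print(f"t({df}) = {t_stat:.3f}")
```

The resulting statistic is compared against a t distribution with n − 1 degrees of freedom, where n is the number of students with both scores; for df = 5, the two-tailed critical value at α = 0.05 is 2.571.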
Two findings surprised us. The first is that most students (36/46) reported at least some level of introductory statistics at the college level (Figure 3). This is because approximately half of students within the laboratory were enrolled in an accelerated physician assistant program that requires a college-level statistics course. Given this information, we were surprised by the low pre-assessment scores of students in the course who had taken some form of statistics prior to the laboratory activity (as shown in Figure 3). These results may offer support for the American Statistical Association’s recommendation to scaffold statistics instruction over the course of college education to increase student learning outcomes (6). This finding may also suggest a need for statistics education to be tailored to a major/program of study, as the general statistics course offered at the test university is taught by the mathematics department to any undergraduates in need of statistics. The second finding that stood out was that students’ opinions of statistics failed to change (individual dependent samples t tests for each question yielded p > 0.05) (Figure 4). This may suggest that, while short lessons can improve content knowledge, they may not be capable of changing opinions, possibly due to their brief duration.
We observed that students had mixed reactions to the lesson content. Some students were comfortable with the content. Others appeared confused and overwhelmed, despite the intent of the lesson to serve as an introduction to statistics. The lesson can be narrowed in scope if desired. We encourage educators to adjust the content of the lesson depending on the students’ class level, mathematics background, and perceived abilities. Options include teaching only one day of the lesson depending on statistical background (for example, providing students with data and having them complete Day Two only), removing questions about bias to focus on working with data, and removing content about hypotheses if it was already covered in a previous lab. We also observed that technical errors could disrupt the lesson plan. The limited ability of iNaturalist to process simultaneous requests meant some students had to wait a long time (e.g., 20 minutes) for their download, which, in some cases, required that we provide sample datasets to expedite the lesson. Sometimes students made errors on the website that caused them to download too much or too little information. In these cases, we directed students to take data from a partner rather than use more precious lab time. If class time is an issue, consider providing the handout or making a video tutorial outlining the iNaturalist downloading process in advance so that students can collect their data before class. Finally, we discovered that when students limited their iNaturalist data selection parameters to the winter months, some geographical regions did not yield enough data for analysis. For example, areas of New York State had very few naturalist observations in the winter. In these cases, we told students to alter the selection parameters to include the summer months as well.
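One way to catch the sparse-winter-data problem early is to count observations per month before starting the analysis. The sketch below is illustrative only: it assumes the exported CSV has a date column named observed_on in YYYY-MM-DD form, uses a few invented rows in place of a real export, and treats December through February as winter. The function name observations_per_month is hypothetical.

```python
import csv
import io
from collections import Counter

# Assumption: Northern Hemisphere winter months.
WINTER_MONTHS = {12, 1, 2}


def observations_per_month(csv_text, date_column="observed_on"):
    """Count rows per calendar month (1-12); rows without a usable date are skipped."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        parts = (row.get(date_column) or "").strip().split("-")
        if len(parts) == 3 and parts[1].isdigit():
            counts[int(parts[1])] += 1
    return counts


# Invented rows standing in for a real export.
sample = """observed_on,scientific_name
2021-01-15,Cardinalis cardinalis
2021-06-02,Asclepias amplexicaulis
2021-06-20,Danaus plexippus
2021-07-04,Asclepias syriaca
bad-date,Unknown
"""

counts = observations_per_month(sample)
winter_total = sum(counts[month] for month in WINTER_MONTHS)
print(winter_total)  # prints "1"
```

If the winter total is very small for the chosen region, that is a signal to widen the date range before class rather than during it.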
Extensions
There are other changes that can be made to the lesson to tailor its relevance to a particular course or student population. For example, other datasets could be used for this lab as long as they are suitable for chi-square analysis (categorical data). These could come from other online databases, faculty research projects, or even student laboratory activities. Even the statistical analysis conducted (chi-square) and the statistical software could be changed depending on student knowledge and the types of data being analyzed.
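To check whether a candidate dataset suits a chi-square analysis, it can help to work one small example by hand. The following is a minimal sketch of a chi-square goodness-of-fit calculation in Python. The counts and the expected 50/50 split are invented for illustration, and the function name chi_square_gof is hypothetical; this is not the spreadsheet procedure used in the lesson.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi2, df) comparing observed counts to expected proportions."""
    if len(observed) != len(expected_props):
        raise ValueError("need one expected proportion per category")
    if abs(sum(expected_props) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")
    total = sum(observed)
    expected = [p * total for p in expected_props]
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be positive")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


# Hypothetical counts: 60 native and 40 introduced species observations,
# tested against an assumed 50/50 expectation.
observed = [60, 40]
chi2, df = chi_square_gof(observed, [0.5, 0.5])
print(chi2, df)  # prints "4.0 1"
```

With one degree of freedom, the tabled critical value at the 0.05 level is 3.841, so a statistic of 4.0 in this invented example would lead to rejecting the 50/50 expectation.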
Another potential modification involves relating the data provided by iNaturalist to ecological concepts beyond introduced species, such as urbanization, climate change, pollution, environment fragmentation, habitat depletion, and more. Rather than being assigned an ecological question to answer, students could devise their own. They might locate relevant issues by reviewing current events or popular press articles, then brainstorm and discuss in groups what questions they could answer about the topic using data provided by iNaturalist. This self-directed activity gives students the opportunity to think like a researcher, as scientists often generate their research questions by basing them on a given dataset. Additionally, it would help to increase student engagement in the lesson by giving students the opportunity to investigate what interests them. Finally, students would be challenged to apply their knowledge of biological concepts to novel situations.
To further challenge students, the homework articles could be changed to focus on a different element of the lab (e.g., how to design an experiment instead of understanding biases). The articles can also be annotated to change the difficulty level of the activity. In-class examples can also be changed. While the content should remain (e.g., an example about hypotheses should always be about hypotheses), the subject of the question can change (e.g., changing the cicada example to be about bees). This flexibility is valuable for tailoring the lesson to a particular classroom. Even the physical, hands-on bead activity can be changed to use different objects. For example, it is common to use M&M candy for such an activity and to compare random samples to factory standards (22).
This activity in its current state has the ability to improve student outcomes and has the potential to be modified to fit a variety of educational settings.
Supporting Materials

S1. Outside the Norm – Student Assessment

S2. Outside the Norm – First Homework Article (annotated)

S3. Outside the Norm – First Homework Article (original)

S4. Outside the Norm – Second Homework Article

S5. Outside the Norm – First Homework Assignment

S6. Outside the Norm – First Homework Assignment Answer Key

S7. Outside the Norm – Second Homework Assignment

S8. Outside the Norm – Student Guide

S9. Outside the Norm – Lecture 1 Slides

S10. Outside the Norm – Lab Activity 1

S11. Outside the Norm – Sample Data

S12. Outside the Norm – Lecture 2 Slides

S13. Outside the Norm – Lab Activity 2
Acknowledgments
We are grateful to the students in the course for engaging with the activity. We thank the Promoting Active Learning and Mentoring (PALM) Network for supporting this project. The PALM Network was supported by the National Science Foundation Research Coordination Network in Undergraduate Biology Education grant DBI162420.
References
1. Horton N, Hardin J. 2015. Teaching the next generation of statistics students to “think with data”: Special issue on statistics and the undergraduate curriculum. Am Stat 69:259–265. doi:10.1080/00031305.2015.1094283.
2. American Association for the Advancement of Science (AAAS). 2011. Vision and change in undergraduate biology education: A call to action. AAAS, Washington, DC.
3. Metz AM. 2008. Teaching statistics in biology: Using inquiry-based learning to strengthen understanding of statistical analysis in biology laboratory courses. CBE Life Sci Educ 7:317–326. doi:10.1187/cbe.07070046.
4. Colon-Berlingeri M, Burrowes PA. 2011. Teaching biology through statistics: Application of statistical methods in genetics and zoology courses. CBE Life Sci Educ 10:259–267. doi:10.1187/cbe.10110137.
5. National Research Council (US) Committee on Undergraduate Biology Education to Prepare Research Scientists for the 21st Century. 2003. Bio2010: Transforming undergraduate education for future research biologists. National Academies Press (US), Washington, DC. doi:10.17226/10497.
6. American Statistical Association Undergraduate Guidelines Workgroup. 2014. 2014 curriculum guidelines for undergraduate programs in statistical science. American Statistical Association, Alexandria, VA.
7. The AP College Board. 2012. AP Statistics Practice Exam: From the 2012 Administration. The AP College Board.
8. Garfield J, delMas R, Chance B. 2003. Assessment resource tools for improving statistical thinking. Paper presented in the Symposium: Assessment of Statistical Reasoning to Enhance Educational Quality, AERA Annual Meeting.
9. Wise SL. 1985. The development and validation of a scale measuring attitudes toward statistics. Educ Psychol Meas 45:401–405. doi:10.1177/001316448504500226.
10. Kiefer AK, Sekaquaptewa D. 2007. Implicit stereotypes and women’s math performance: How implicit gender-math stereotypes influence women’s susceptibility to stereotype threat. J Exp Soc Psychol 43:825–832. doi:10.1016/j.jesp.2006.08.004.
11. Dytham C. 2011. Choosing and using statistics: A biologist’s guide, 3rd ed. Wiley-Blackwell, Hoboken, NJ.
12. Z. 2020. How to perform a chi-square goodness of fit test in Excel. Statology. Retrieved from https://www.statology.org/chisquaregoodnessoffittestexcel/ (accessed 20 July 2021).
13. Minitab Blog. 2016. What are degrees of freedom in statistics? Retrieved from https://blog.minitab.com/en/statisticsandqualitydataanalysis/whataredegreesoffreedominstatistics (accessed 28 July 2021).
14. University of Washington Psychology Writing Center. 2010. Reporting results of common statistical tests in APA format. University of Washington, Seattle, WA.
15. Z. 2020. How to calculate expected frequency. Statology. Retrieved from https://www.statology.org/expectedfrequency/ (accessed 20 January 2023).
16. Ueckert CW, Gess-Newsome J. 2008. Active learning strategies: Three activities to increase student involvement in learning. Sci Teach 75:47–52.
17. Fuller TD. 1998. Using computer assignments to promote active learning in the undergraduate social problems course. Teach Sociol 26:215–221. doi:10.2307/1318835.
18. Tanner KD. 2012. Promoting student metacognition. CBE Life Sci Educ 11:113–120. doi:10.1187/cbe.12030033.
19. Rose D. 2001. Universal Design for Learning. J Spec Educ Technol 16:66–67. doi:10.1177/016264340101600208.
20. Šimundić AM. 2013. Bias in research. Biochem Medica 23:12–15. doi:10.11613/BM.2013.003.
21. Eggertson L. 2010. Lancet retracts 12-year-old article linking autism to MMR vaccines. Can Med Assoc J 182:E199–E200. doi:10.1503/cmaj.1093179.
22. Smith RA. 2008. A tasty sample(r): Teaching about sampling using M&M’s, p 8–10. In Benjamin LT Jr (ed), Favorite activities for the teaching of psychology. American Psychological Association, Washington, DC.