## Using DataCamp to help teach data science in a biostatistics class

A large part of science can be described as the process of extracting information from the world, through observation and measurement, in the form of data. Even with in silico models, we construct analytic information through mathematical analysis, or record "measurements" on the resulting simulations. Statistics or machine learning can then be applied to those results or data to extract relevant information.

I have been teaching an Introduction to Biostatistics course at the College of William and Mary for the past 4 years.  The focus of my course is "data", which allows me to organize the course around the following questions:

1. How do we design experiments that will lead to data containing information about the real world that most effectively stands as evidence for our specific hypothesis or question?
2. With this data in hand, what skills do we need to learn in order to store, manipulate and visualize the data?
3. What statistical tools/techniques should we use to extract the maximal amount of relevant information from our data?

These questions can be visualized as three points on the following "data science triangle":

It is important to mention here that the image above is not meant to silo these three areas or imply that they are done in isolation. For example, an effective practicing scientist will have deep knowledge of the statistical procedures they will use and the data they will collect while designing their experiments. All of this leads me to the following statement: data is central to science and statistics, and leaving two of the three points of the data science triangle (design and basic data handling skills) out of a biostatistics course does a disservice to our students.

There are many interesting opportunities and potential challenges associated with incorporating more experimental design and data skills into a biostatistics course (stay tuned!). For example, should a biostatistics course have a separate lab for data skills? Should students be designing their own experiments and collecting their own data? Sidestepping these extremely important discussions, let's focus for now on the following question: Where can faculty go for resources and training in data skills, and how can those skills be brought into the classroom? Two entities that immediately come to my mind are Data Carpentry and DataCamp. There are of course plenty of others, so feel free to mention your favorites in the comments! The remainder of this post introduces DataCamp and their offerings, but I encourage everyone to also take a look at Data Carpentry!

So what courses do my students take?  Below are clickable DataCamp badges corresponding to each course.  Remember, the first chapter of each course is free, so go check them out!

What has been your experience with DataCamp? Do you teach data management skills? What else would you like to know or share about integrating experimental design, data science and biostatistics in a single course?  Leave us a comment below!