Collections

Thinking with Data: How to Turn Information into Insights

Many analysts are too concerned with tools and techniques for cleansing, modeling, and visualizing datasets and not concerned enough with asking the right questions. In this practical guide, data strategy consultant Max Shron shows you how to put the why before the how, through an often-overlooked set of analytical skills.

Thinking with Data helps you learn techniques for turning data into knowledge you can use. You’ll learn a framework for defining your project, including the data you want to collect, and how you intend to approach, organize, and analyze the results. You’ll also learn patterns of reasoning that will help you unveil the real problem that needs to be solved.

Max Shron talk at NYC data science meetup

Talk for O'Reilly

Sam S Donovan onto Data Science Resources

Investigating Trade-offs among Mammal Traits

This is an example of how Shiny can be used to quickly engage students in data exploration, visualization, and analysis. This tool allows you to explore the dataset associated with "PanTHERIA: a species-level database of life history, ecology, and geography of extant and recently extinct mammals."

This app draws on a large species-level dataset with metabolic, life history, and ecological traits of most living and recently extinct mammal species. Users can select and plot traits, fit linear models to the data, and query displayed datapoints.

Sam S Donovan onto Data Science Resources

NIBLSE Core Competencies

The bioinformatics competencies that NIBLSE recommends undergraduate life sciences students have by the time they graduate. As discussed in the narrative, they are informed by the results of the national NIBLSE survey, analysis of ninety syllabi with bioinformatics content, and the cumulative expertise and experience of the authors. Following each competency is a list of three representative examples illustrating the competency. 

 

Sam S Donovan onto Data Science Resources

Data Management Skill Building Hub

The Data Management Skillbuilding Hub contains resources for better data management and is open to community input and updates. These resources are adaptable across a range of contexts and are intended for researchers, teachers, librarians, or anyone who wants to learn better data management practices. Each tile below contains a lesson in slide format with annotations, a one-page handout that distills the main message, and a hands-on exercise.

Also see the data life cycle at DataONE.

Sam S Donovan onto Data Science Resources

Swirl - for learning R

swirl teaches you R programming and data science interactively, at your own pace, and right in the R console!

Sam S Donovan onto Data Science Resources

A Primer for Computational Biology by Shawn T. O'Neil

This is an Open Educational Resource written on the PressBooks platform. This textbook has great resources and concise chapters you can assign to students.

Sam S Donovan onto Data Science Resources

Software Carpentry

Since 1998, Software Carpentry has been teaching researchers the computing skills they need to get more done in less time and with less pain. Our volunteer instructors have run hundreds of events for more than 34,000 researchers since 2012. All of our lesson materials are freely reusable under the Creative Commons - Attribution license.

The Software Carpentry Foundation and its sibling project, Data Carpentry, have merged to become The Carpentries, a fiscally sponsored project of Community Initiatives, a 501(c)(3) non-profit incorporated in the United States.

Sam S Donovan onto Data Science Resources

Modeling: Case Studies and Experimental Activities

Becky Sanft onto Resources for Tuesday Sessions

CC-BIOME

Contains a link to the CC-BIOME site that will be discussed during the session and a link to a forum discussion where you can provide feedback on the CC-BIOME site.

   

Elia Crisucci onto Resources for Tuesday Sessions

An Invitation to Modeling Resources

These are two manuscripts and PowerPoint slides that outline a framework for models and modeling that we will discuss in the session.

Kam Dahlquist onto Resources for Tuesday Sessions

XKCD - simple writer

This tool only allows you to use the 1,000 most common words in the English language.

Educational use here - Introducing Students to the Challenges of Communicating Science by Using a Tool That Employs Only the 1,000 Most Commonly Used Words

 

Sam S Donovan onto HITS 2018 Workshop Resources

Data Management Skill Building Hub

The Data Management Skillbuilding Hub contains resources for better data management and is open to community input and updates. These resources are adaptable across a range of contexts and are intended for researchers, teachers, librarians, or anyone who wants to learn better data management practices. Each tile below contains a lesson in slide format with annotations, a one-page handout that distills the main message, and a hands-on exercise.

Also see the data life cycle at DataONE.

You can get your whole genome sequenced, but should you? (Wired article, 6/26/2017)

Whole Genome Sequencing and You (10 min)

How to Sequence the Human Genome (5 min)

Allen Cell Drug Perturbation website

Carlos C. Goller onto HT Cell imaging Collection

Lessons from HeLa Cells: The Ethics and Policy of Biospecimens

Human biospecimens have played a crucial role in scientific and medical advances. Although the ethical and policy issues associated with biospecimen research have long been the subject of scholarly debate, the story of Henrietta Lacks, her family, and the creation of HeLa cells captured the attention of a much broader audience. The story has been a catalyst for policy change, including major regulatory changes proposed in the United States surrounding informed consent. These proposals are premised in part on public opinion data, necessitating a closer look at what such data tell us. The development of biospecimen policy should be informed by many considerations, one of which is public input, robustly gathered, on acceptable approaches that optimize shared interests, including access for all to the benefits of research. There is a need for consent approaches that are guided by realistic aspirations and a balanced view of autonomy within an expanded ethical framework.

Data for Democracy - Code of Ethics

Data for Democracy is partnering with Bloomberg and BrightHive to develop a code of ethics for data scientists. This code will aim to define values and priorities for overall ethical behavior, in order to guide a data scientist in being a thoughtful, responsible agent of change. The code of ethics is being developed through a community-driven approach.

By hosting discussions among data scientists, we hope to better capture the diverse interests, needs, and concerns that are at play in the community, and put together a code that is truly created by data scientists, for data scientists.

Community Principles on Ethical Data Practices

This code of ethics for data sharing is created and proposed for adoption by the data science community to reflect the behaviors and principles for the responsible and ethical use and sharing of data by data scientists. 

This is a community-driven, crowdsourced effort: you can join the discussion and contribute to the next version of the Community Principles on Ethical Data Sharing.

QIIME 2 View

This interface can view .qza and .qzv files directly in your browser without uploading to a server.

QIIME 2 View (or q2view for short) is an entirely client-side interface for viewing QIIME 2 artifacts and visualizations (.qza/.qzv files respectively). This means that you do not need to have a working QIIME 2 installation to inspect QIIME 2 results. It also means that the files you provide are not sent beyond your browser. In other words, this entire site functions without a server (which makes it very inexpensive to operate).

Additionally, q2view supports viewing externally hosted files, which means you can provide a link to your file (for example on Dropbox) and q2view will automatically download and display it. Better yet, the resulting pages are themselves shareable, so if a collaborator does not have QIIME 2, you can simply upload your results and share your q2view links with your collaborator. q2view will automatically fetch your results and display them to your collaborator.
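A rough illustration of why a viewer like this can work without a server: as far as we understand the format, a .qza or .qzv file is an ordinary zip archive whose top-level directory is named after the artifact's UUID and contains a metadata.yaml file. The sketch below uses only Python's standard zipfile module to peek inside such a file. The helper names and the tiny in-memory demo archive are hypothetical and are not part of QIIME 2 or q2view.

```python
import io
import zipfile


def list_artifact_members(source):
    """Return the sorted member paths inside a .qza/.qzv archive.

    `source` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return sorted(archive.namelist())


def read_metadata(source):
    """Return the text of the first '<top-level-dir>/metadata.yaml', or None."""
    with zipfile.ZipFile(source) as archive:
        for name in sorted(archive.namelist()):
            parts = name.split("/")
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                return archive.read(name).decode("utf-8")
    return None


def make_demo_artifact():
    """Build a small in-memory stand-in for an artifact (made-up contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "demo-uuid/metadata.yaml",
            "uuid: demo-uuid\ntype: Demo\nformat: DemoFormat\n",
        )
        archive.writestr("demo-uuid/data/table.txt", "sample\tcount\nA\t3\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Swap make_demo_artifact() for a real path such as "results.qza".
    print(list_artifact_members(make_demo_artifact()))
    print(read_metadata(make_demo_artifact()))
```

Because the file is read locally and never uploaded, this mirrors the client-side idea described above, although q2view itself runs in the browser rather than in Python.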

Allen Institute Data

The Allen Cell Explorer is the data portal for the Allen Institute for Cell Science, where you can explore our publicly available data, tools and models. The portal provides an unprecedented view into the organizational diversity of human stem cells by combining large-scale 3D imaging data, predictive models, observations of cells, detailed methods, and cell lines that can be purchased for use in labs around the world.

Sam S Donovan onto Data Sources

Pacific Biosciences datasets

Here are some sample datasets so that you can explore PacBio® sequence data as well as file types generated by the PacBio RS and SMRT® Analysis.

Sam S Donovan onto Data Sources