
Using genomic data and machine learning to predict antibiotic resistance: a tutorial paper

Author(s): Faye Orcales (1), Lucy Moctezuma (2), Meris Johnson-Hagler (1), John Suntay (1), Jameel Ali (1), Kristiene Recto (1), Pleuni Pennings, Ph.D. (1)

(1) San Francisco State University; (2) California State University, East Bay



Summary:

Antibiotic resistance is a global public health concern. Bacteria have evolved resistance to most antibiotics, which means that the bacteria causing any given infection may be resistant to one or several antibiotics. Accurate and efficient detection of resistance is important both for effective treatment and for controlling the spread of resistant strains. It has been suggested that genomic sequencing and machine learning (ML) could make resistance testing more accurate and cost-effective. Given that ML is likely to become an ever more important tool in medicine, we believe that it is important for pre-health students and others in the life sciences to learn to use machine learning tools. This paper provides a step-by-step tutorial on using four different ML models to predict drug resistance in E. coli isolates from genomic data. The tutorial is accessible to beginners and requires no software installation, as it is based on Google Colab notebooks. The notebooks can be used in undergraduate and graduate classes.

Licensed under CC Attribution-ShareAlike 4.0 International

Version 2.0 - published on 23 Jul 2024 doi:10.25334/EPNE-YH86

Description

This project contains two primary resources under "File Contents": an article (manuscript) in PDF format and a GitHub link. We advise readers to go through the article first to get a sense of what our tutorials entail and to build a solid foundation in key machine learning concepts before diving into the tutorial code. After reading the article, you are encouraged to go to our GitHub repository.

This GitHub repository contains step-by-step tutorial notebooks that guide you through a machine learning analysis pipeline to predict antibiotic resistance from genomic data. The tutorial is suitable for undergraduate and graduate students, as well as trained biologists who have little to no computational experience. The GitHub repository has 6 notebooks. We recommend you start with notebook 1 and finish with notebook 6, as each notebook uses key concepts from the previous one. Only a Google account is required to access the tutorials, as they are provided as Google Colab notebooks. You do not need to download additional software. To run the code, you must first make your own copy of the notebook. To do this, navigate to your notebook of interest on GitHub > select "Open in Colab" > select "File" to open a dropdown menu > finally select "Save a copy in Drive."
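To give a rough sense of what "predicting antibiotic resistance from genomic data" can look like in code, here is a minimal sketch. It is not taken from the tutorial notebooks. It assumes each isolate is described by a presence/absence vector of genes, uses synthetic data in which one hypothetical gene (CAUSAL_GENE) confers resistance, and fits a small hand-written logistic regression, a model chosen here only for illustration. All names and parameters are made up for this example.

```python
import math
import random

random.seed(0)

N_ISOLATES = 200   # assumption: size of the synthetic dataset
N_GENES = 10       # assumption: number of gene presence/absence features
CAUSAL_GENE = 3    # hypothetical gene whose presence confers resistance


def make_isolate():
    """Return (gene presence/absence vector, resistance label) for one synthetic isolate."""
    genes = [random.randint(0, 1) for _ in range(N_GENES)]
    label = genes[CAUSAL_GENE]
    if random.random() < 0.05:  # 5% label noise, mimicking imperfect lab tests
        label = 1 - label
    return genes, label


data = [make_isolate() for _ in range(N_ISOLATES)]
train, test = data[:150], data[150:]  # simple train/test split


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, genes):
    """Predicted probability that an isolate is resistant."""
    return sigmoid(bias + sum(w * g for w, g in zip(weights, genes)))


def fit(rows, epochs=200, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * N_GENES
    bias = 0.0
    for _ in range(epochs):
        for genes, label in rows:
            error = predict_proba(weights, bias, genes) - label
            bias -= lr * error
            weights = [w - lr * error * g for w, g in zip(weights, genes)]
    return weights, bias


weights, bias = fit(train)


def accuracy(rows):
    """Fraction of isolates whose predicted label matches the true label."""
    correct = sum(
        (predict_proba(weights, bias, genes) >= 0.5) == bool(label)
        for genes, label in rows
    )
    return correct / len(rows)


test_accuracy = accuracy(test)
top_gene = max(range(N_GENES), key=lambda i: abs(weights[i]))

print(f"held-out accuracy: {test_accuracy:.2f}")
print(f"gene with the largest learned weight: {top_gene}")
```

The sketch follows the usual shape of such a pipeline: build a feature table, split it into training and test sets, fit a model, and evaluate it on isolates the model has not seen. Real analyses typically rely on an established ML library rather than hand-written training code.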

Notes

Second version - added first manuscript revision 

Cite this work

Researchers should cite this work as follows: