Translational Medicine in the Age of Data

Billions of clinical measurements are recorded every day and stored in electronic health systems around the world. Each of these experiments is a window into the human system, and together they create the most comprehensive and diverse medical data set ever imagined. Unfortunately, traditional statistical techniques were not developed to handle such diversity; instead, they excel at analyzing homogeneous data sets with first-order effects. As a result, these techniques cannot untangle the sophisticated web of biological pathways and genetic interactions governing the human system.

With enormous data comes enormous opportunity

Data science is a new field dedicated to developing the methods, algorithms, and tools needed to unravel the complexities of enormous data. In our lab, we advance data science by designing rigorous computational and mathematical methods that address the fundamental challenges of health data science. Foremost, we integrate our medical observations with systems and chemical biology models, not only to explain drug effects but also to further our understanding of basic biology and human disease.

One particular area of interest is the integration of high-throughput data capture technologies, such as next-generation genome and transcriptome sequencing, metabolomics, and proteomics, with the electronic medical record to study the complex interplay between genetics, environment, and disease.

For more in-depth information on our research areas of interest, see our reviews in WIREs Systems Biology and Medicine, Science Translational Medicine, and Clinical Pharmacology & Therapeutics.

News and Events

Hunt for dangerous drug interactions reveals strategy that can save lives
By Sam Roe and Karisa King

"The experiment began with thousands of patient files, millions of prescription orders, billions of clinical measurements and a single question: Could big data be used to discover deadly drug combinations?" Read the whole story.

Animal Venom Database Could Be Boon To Drug Development
By Emily Mullin

"The bite of a poisonous snake, scorpion or other venomous creature could very well kill you, but it also might be able to heal certain medical conditions like cancer, diabetes and heart failure. That's the idea behind VenomKB, short for Venom Knowledge Base, the first online database that aims to catalog all the known animal toxins and their physiological effects on humans." Read the whole story.

Can Big Data Tell Us What Clinical Trials Don't? - New York Times Magazine
By Veronique Greenwood

"The Tatonetti Laboratory at Columbia University is a nexus in this search for signal in the noise. There, Nicholas Tatonetti, an assistant professor of biomedical informatics develops algorithms to trawl medical databases and turn up correlations." Read the whole story.

Featured publications

Tal Lorberbaum, Kevin J. Sampson, Raymond L. Woosley, Robert S. Kass, Nicholas P. Tatonetti
An Integrative Data Science Pipeline to Identify Novel Drug Interactions that Prolong the QT Interval.
Drug Safety. Feb 2016. Source.

Alexandra Jacunski, Scott Dixon, Nicholas P. Tatonetti
Connectivity Homology Enables Inter-Species Network Models of Synthetic Lethality.
PLOS Computational Biology. Oct 2015. Source.

Mary Regina Boland, Zachary Shahn, David Madigan, George Hripcsak, Nicholas P. Tatonetti
Birth Month Affects Lifetime Disease Risk: A Phenome-Wide Method.
Journal of the American Medical Informatics Association. June 2015. Source.

See more publications.


Our lab is part of the Department of Biomedical Informatics, the Department of Systems Biology, and the Department of Medicine at Columbia University. We are a member of the Data Science Institute at Columbia.

Prospective graduate students should apply to the Department of Biomedical Informatics Training Program or the Computational Biology Training Program at Columbia.