Publicly available data and code

Open Access


Understanding the downstream effects of targeted proteins is essential to drug design. We introduce a data-driven method named DATE, which integrates drug-target relationships with gene expression, protein-protein interaction, and pathway annotation data to connect Drugs to target pAthways by the Tissue Expression.

Links drugs to tissue-specific target pathways
467,396 connections for 1,034 drugs and 954 pathways in 259 tissues/cell lines available
Code available on request.
Published in CPT: Pharmacometrics & Systems Pharmacology (2018)
Updated on June 8, 2018
Open Access


G protein-coupled receptors (GPCRs) are central to how cells respond to their environment and are a major class of pharmacological targets. We developed a data-driven method named GOTE that connects Gpcrs to dOwnstream cellular pathways by the Tissue Expression.

Links G-protein coupled receptors to tissue-specific molecular pathways
93,012 connections for 213 GPCRs and 654 pathways in 196 tissues/cell types available
Code available here
Published in Bioinformatics (2016)
Updated on June 8, 2018
Open Access


A knowledge base for therapeutic uses of venoms. As of its original release, it contains 39,000 venom/effect associations mined from MEDLINE that describe potentially therapeutic effects of venoms on the human body.

Links venom compounds to physiological effects
39K venom/effect associations in three databases available for download
Code available on GitHub
Published in Scientific Data (2015)
Updated on December 1, 2015
Open Access


Network analysis framework that identifies adverse event (AE) neighborhoods within the human interactome (protein-protein interaction network). Drugs targeting proteins within this neighborhood are predicted to be involved in mediating the AE.

Links drugs to seed sets of proteins and phenotypes, like drug side-effects and diseases
A description of the algorithm is available here
Code in Python available on GitHub
Updated on December 14, 2015
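
To illustrate the neighborhood idea described above, here is a minimal sketch in Python. It is not the framework's actual scoring method: it swaps in a plain breadth-first search that collects every protein within a fixed number of hops of the AE seed proteins, then flags drugs with at least one target inside that set. The function names, the `max_hops` parameter, and the toy protein and drug identifiers are all hypothetical.

```python
from collections import deque


def neighborhood(interactome, seeds, max_hops=1):
    """Return proteins within max_hops of any seed protein (seeds included).

    interactome: dict mapping each protein to the set of its interaction
    partners (undirected, so every edge is listed in both directions).
    """
    # Seeds that are absent from the network are ignored.
    dist = {s: 0 for s in seeds if s in interactome}
    queue = deque(dist)
    while queue:
        protein = queue.popleft()
        if dist[protein] == max_hops:
            continue  # do not expand beyond the hop limit
        for partner in interactome.get(protein, ()):
            if partner not in dist:
                dist[partner] = dist[protein] + 1
                queue.append(partner)
    return set(dist)


def drugs_in_neighborhood(drug_targets, hood):
    """Map each drug to its targets that fall inside the neighborhood."""
    flagged = {}
    for drug, targets in drug_targets.items():
        hits = sorted(set(targets) & hood)
        if hits:
            flagged[drug] = hits
    return flagged


# Toy data (hypothetical identifiers, for illustration only).
interactome = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P4"},
    "P3": {"P1"},
    "P4": {"P2", "P5"},
    "P5": {"P4"},
}
seeds = {"P1"}  # proteins assumed to be associated with the adverse event
drug_targets = {"drugA": {"P3"}, "drugB": {"P5"}, "drugC": {"P2", "P5"}}

hood = neighborhood(interactome, seeds, max_hops=1)
flagged = drugs_in_neighborhood(drug_targets, hood)
print(sorted(hood))
print(flagged)
```

In this toy network, drugA and drugC are flagged because each has a target within one hop of the seed, while drugB is not. A fixed hop cutoff is only the simplest way to define a neighborhood; see the algorithm description and the GitHub code for how the framework itself does it.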
Open Access


Interspecies, network-based predictions of synthetic lethality. The original release contains ~109 million gene pairs with their associated synthetic lethality scores.

Predicts synthetic lethality (SL) from yeast to humans
109 million human SL gene pairs available in 3 parts: part 1, part 2, and part 3; mouse gene pairs are also available
Code is available upon request.
Updated on August 25, 2015
Open Access

TFICA ("tif-uh-kuh")

Results of the Transcription Factor-Independent Component Analysis method, which more accurately links TFs to their gene targets using ChIP-Seq and expression data.

Links transcription factors (TF) to target gene modules and diseases
149 TFs, 424 gene modules, 60K relationships available for download
Code is available upon request.
Published in PLOS Genetics (2014)
Updated on February 6, 2014


Drug side effects were mined from publicly available data. Offsides is a database of drug side effects that were found in these data but are not listed on the official FDA label.

Links drugs to adverse reactions
1K drugs, 10K side effects, 1M drug-effect relationships and similarities available
Code in R, Fortran, and Python available at
Last updated on March 14, 2012


Drug interactions were mined from publicly available sources. Twosides is the only comprehensive database of drug-drug-effect relationships.

Links pairs of drugs to adverse reactions
63K pairs of drugs, 800K drug interactions available for download
Code in R, Fortran, and Python available at
Updated on March 14, 2012