RIFTEHR
Relationship Inference from the Electronic Health Records
Estimate of disease heritability using 7.4 million familial relationships inferred from electronic health records
Fernanda Polubriaginof, Rami Vanguri, Kayla Quinnies, Gillian Belbin, Alexandre Yahi, Hojjat Salmasian, Tal Lorberbaum, Victor Nwankwo, Li Li, Mark Shervey, Patricia Glowe, Iuliana Ionita-Laza, Mary Simmerling, George Hripcsak, Suzanne Bakken, David Goldstein, Krzysztof Kiryluk, Eimear Kenny, Joel Dudley, David K. Vawdrey, Nicholas P. Tatonetti
Supporting Materials
RIFTEHR (Relationship Inference from the Electronic Health Records) is a tool for mining familial relationships using the emergency contact data provided by patients during their inpatient stays. The inferred relationships accurately predict genetic relatedness, with a positive predictive value of 87% to 99% depending on the relationship type. The preliminary manuscript describing the method, its validation, and the use of EHR-inferred relationships to estimate disease heritability is available as a pre-print for the scientific community. The following is a list of code and data referenced by the manuscript. We endeavor to make as much of the data publicly available as possible while still protecting patient privacy.
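To make the idea of mining relationships from emergency contact data concrete, here is a minimal sketch. It is not the RIFTEHR code, which is available on GitHub. It assumes that a contact can be linked to a patient record by an exact match on normalized first name, last name, and phone number, and that a hand-checked reference set exists for computing positive predictive value. All class names, field names, function names, and sample records below are hypothetical.

```python
from typing import NamedTuple


class Patient(NamedTuple):
    mrn: str    # medical record number (hypothetical field name)
    first: str
    last: str
    phone: str


class EmergencyContact(NamedTuple):
    patient_mrn: str   # patient who listed this contact
    first: str
    last: str
    phone: str
    relationship: str  # contact's stated relationship to the patient


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits so formatting differences do not block a match."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def infer_relationships(patients, contacts):
    """Return a set of (patient_mrn, relative_mrn, relationship) triples.

    Assumption: a contact is linked only when it matches exactly one
    patient record other than the patient who listed it.
    """
    index = {}
    for p in patients:
        key = (normalize(p.first), normalize(p.last), normalize(p.phone))
        index.setdefault(key, []).append(p.mrn)

    inferred = set()
    for c in contacts:
        key = (normalize(c.first), normalize(c.last), normalize(c.phone))
        matches = index.get(key, [])
        if len(matches) == 1 and matches[0] != c.patient_mrn:
            inferred.add((c.patient_mrn, matches[0], normalize(c.relationship)))
    return inferred


def positive_predictive_value(inferred, reference):
    """Fraction of inferred relationships that also appear in the reference set."""
    if not inferred:
        raise ValueError("no inferred relationships to evaluate")
    return len(inferred & reference) / len(inferred)


# Made-up records for illustration only.
patients = [
    Patient("P1", "Ana", "Diaz", "212-555-0101"),
    Patient("P2", "Maria", "Diaz", "212-555-0102"),
    Patient("P3", "Luis", "Diaz", "212-555-0103"),
]
contacts = [
    EmergencyContact("P1", "Maria", "Diaz", "(212) 555-0102", "Mother"),
    EmergencyContact("P3", "maria", "DIAZ", "2125550102", "Mother"),
    EmergencyContact("P2", "Ana", "Diaz", "212 555 0101", "Daughter"),
    EmergencyContact("P1", "John", "Smith", "212-555-0199", "Friend"),  # matches no patient
]

inferred = infer_relationships(patients, contacts)

# Hypothetical reference set, standing in for relationships confirmed by another source.
reference = {("P1", "P2", "mother"), ("P2", "P1", "daughter")}
ppv = positive_predictive_value(inferred, reference)
```

Requiring a single unambiguous match trades coverage for precision, which is the kind of trade-off a positive predictive value is meant to quantify.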
Heritability measures the proportion of variation in a disease or trait that can be attributed to genetic factors. Observational Heritability is an estimate of heritability made from observational resources in which ascertainment is uncontrolled. We introduce a methodology, called SOLARStrap, to estimate Observational Heritability in the preliminary manuscript referenced above. Source code and data to run RIFTEHR and SOLARStrap are available on GitHub. Data files for notebooks and the rhinitis example are also available.
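As a rough illustration of what a heritability estimate is, the sketch below uses classical midparent-offspring regression on simulated data. This is not SOLARStrap, which is described in the manuscript; it is a swapped-in textbook estimator, chosen only because it fits in a few lines. It assumes a purely additive trait, random mating, no shared family environment, and controlled sampling, none of which can be taken for granted in EHR data. The function names and simulation parameters are hypothetical.

```python
import random


def midparent_offspring_slope(midparent, offspring):
    """Least-squares slope of offspring values on midparent values.

    Under the assumptions stated above, this slope estimates
    narrow-sense heritability.
    """
    n = len(midparent)
    if n < 2 or n != len(offspring):
        raise ValueError("need two equal-length samples with at least 2 families")
    mean_x = sum(midparent) / n
    mean_y = sum(offspring) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(midparent, offspring))
    sxx = sum((x - mean_x) ** 2 for x in midparent)
    if sxx == 0:
        raise ValueError("midparent values have no variance")
    return sxy / sxx


def simulate_families(n_families, true_h2, seed):
    """Simulate standardized trait values for parent pairs and one child per family."""
    rng = random.Random(seed)
    # Residual spread chosen so the child's trait also has unit variance:
    # Var(child) = true_h2**2 * Var(midparent) + resid_sd**2, with Var(midparent) = 0.5.
    resid_sd = (1 - 0.5 * true_h2 ** 2) ** 0.5
    midparent, offspring = [], []
    for _ in range(n_families):
        mother = rng.gauss(0, 1)
        father = rng.gauss(0, 1)
        mid = (mother + father) / 2
        child = true_h2 * mid + rng.gauss(0, resid_sd)
        midparent.append(mid)
        offspring.append(child)
    return midparent, offspring


# Simulated families with a true heritability of 0.5 (made-up value).
mid, off = simulate_families(n_families=2000, true_h2=0.5, seed=1)
h2_estimate = midparent_offspring_slope(mid, off)
```

With a few thousand simulated families the estimate should land close to the simulated value. In observational data, families enter the record for reasons related to disease, so a naive estimator like this one can be biased.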
Browse the high confidence observational heritability estimates. Lower confidence estimates are available at Mendeley Data and additional files are in the Supplemental Data Files.
Clinical Data Release
Clinical and familial relationship data from Columbia and Cornell will be made available at the time of publication at Mendeley Data and, additionally, at Release Data. The data will be prepared according to "Section 4. Preparation of clinical data for release" in the supplemental materials. These data will cover approximately 500 traits.
Download the observational heritability estimates for the 500 traits from the table below.
Updates and News
May 17th, 2018
- Published online at Cell
- De-identified data made available on the site
April 20th, 2017
- Added web app to browse new high confidence heritability estimates (see above).
July 29th, 2016
- Added data browsers for the high confidence heritability estimates: Columbia (N=216) and Cornell (N=160).
- Added all heritability runs to release data set.
July 27th, 2016
- Supporting website created and pre-print manuscript deposited at bioRxiv (http://dx.doi.org/10.1101/066068).
June 14th, 2018
- Peer-reviewed manuscript available at Cell: https://www.cell.com/cell/pdf/S0092-8674(18)30525-7.pdf