MDClone is a free, secure, self-service platform for building queries and downloading computationally derived (“synthetic”) data from the institute’s research data core (RDC). Because the data do not contain protected health information (PHI), their use is not classified as human subjects research.
*Before you submit your request, please read the Instructions for Requesting Access.
The Institute for Informatics, Data Science and Biostatistics (I2DB) implemented the MDClone platform in 2018 and conducted a landmark study validating that synthetic data produce the same results as the original data while maintaining data privacy. The institute’s significant effort and investment in securing this resource for our campus is part of a strategic goal to accelerate the pace of data-driven research at Washington University in St. Louis.
Our instance of MDClone is designed to meet the specific clinical and translational research needs of researchers at the School of Medicine.
The I2DB scope of support includes:
- Online training and onboarding to the platform
- Demonstration videos
- Beginner and advanced classes via the Bernard Becker Medical Library
- Data brokerage services for downloading the original data (requires IRB approval)
MDClone data coverage includes:
- Inpatient and outpatient data from across BJC network facilities
- Demographics, encounters, diagnoses, procedures, medication orders, lab results, vitals, allergies, surgeries, microbiology, social history, problem lists, practitioners, and insurance
- Data from 1997 to present
For a more detailed breakdown of the data available in MDClone, please review the MDClone Data Inventory.
I2DB publications related to MDClone:
- Understanding the opportunity and application of synthetic data in healthcare. Paediatric and Perinatal Epidemiology
- Spot the Difference: Comparing Results of Analyses from Real Patient Data and Synthetic Derivatives. JAMIA Open
- The Use of Synthetic Electronic Health Record Data and Deep Learning to Improve Timing of High-Risk Heart Failure Surgical Intervention by Predicting Proximity to Catastrophic Decompensation. Frontiers in Digital Health
- Predicting Mortality among Patients with Liver Cirrhosis in Electronic Health Records with Machine Learning. PLoS One
- The National COVID Cohort Collaborative: Analyses of Original and Computationally Derived Electronic Health Record Data. Journal of Medical Internet Research
For more information, please contact us at email@example.com.