MDClone is a free, secure, self-service platform for building queries and downloading computationally derived (“synthetic”) data from the institute’s research data core (RDC). Since the data do not contain protected health information (PHI), their use is not classified as human subject research.

*Before you submit your request, please read the Instructions for Requesting Access.

Figure 1 from “Spot the Difference: Comparing Results of Analyses from Real Patient Data and Synthetic Derivatives” by Randi E. Foraker, Sean C. Yu, Aditi Gupta, et al., used under CC BY 4.0.

The Institute for Informatics, Data Science and Biostatistics (I2DB) implemented the MDClone platform in 2018 and conducted a landmark study validating that synthetic data produce the same results as original data while maintaining data privacy. The institute’s significant effort and investment in securing this resource for our campus are part of a strategic goal to accelerate the pace of data-driven research at Washington University in St. Louis.

Our instance of MDClone is designed to meet the specific clinical and translational research needs of researchers at the School of Medicine.

You can schedule MDClone Virtual Consultations up to 30 days in advance. Please contact Chris Sorensen with questions.

The I2DB scope of support includes:

MDClone data coverage includes:

  • Inpatient and outpatient data from across BJC network facilities
  • Demographics, encounters, diagnoses, procedures, medication orders, lab results, vitals, allergies, surgeries, microbiology, social history, problem lists, practitioners, and insurance
  • Data from 2010 to present

For a more detailed breakdown of the data available in MDClone, please review the MDClone Data Inventory.

I2DB publications related to MDClone:

For more information, please contact us.