case study

data factory platform

SECURE RESEARCH PARTNERSHIPS
WITH DATA SANDBOXES

HealthLab’s Client needed terabytes of clinical data de-identified, along with a safe, auditable way to share it with a pharmaceutical company’s research project.


At a Glance …

Key Metrics

5 Terabytes of Clinical Data De-identified

129 Data Ponds Created

6 Medical Publications

CHALLENGES


Health institutions can lose oversight of their data once they have shared it with a research partner.

De-identifying clinical data at scale can be challenging.

Making data useful for research partners can require multiple iterations, which can be time-consuming.

SOLUTIONS


The solution used HealthLab’s Data Factories. Terabytes of clinical data were first de-identified in one “factory” sandbox; derivative data ponds were then created in a second factory’s sandbox, where they could be shared and audited. Deep technical insight into Cerner’s data tables was required to make the final data sets useful to the researchers.

PARTNER SANDBOXES

“BIG” DATA PIPELINES

MEDICAL INFORMATICS EXPERTISE
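
For readers who want a concrete picture of the two-stage flow described above, here is a minimal sketch in Python. It is not HealthLab’s implementation: the field names, the `deidentify` and `derive_pond` helpers, and the audit-log format are all hypothetical, and real clinical de-identification also has to handle dates, free text, and other quasi-identifiers that this sketch ignores.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical direct-identifier fields; real clinical schemas differ.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def deidentify(record, secret_key):
    """Stage 1: drop direct identifiers and replace the patient ID with a keyed hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(secret_key, str(record["patient_id"]).encode(), hashlib.sha256)
    cleaned["patient_id"] = digest.hexdigest()[:16]
    return cleaned


def derive_pond(name, records, columns, audit_log):
    """Stage 2: build a derived data pond from selected columns and log the step."""
    pond = [{c: r[c] for c in columns if c in r} for r in records]
    audit_log.append({
        "pond": name,
        "columns": list(columns),
        "row_count": len(pond),
        "created_at": datetime.now(timezone.utc).isoformat(),
    })
    return pond


if __name__ == "__main__":
    key = b"example-key-kept-inside-the-first-sandbox"  # assumption: key never leaves stage 1
    raw = [{"patient_id": 101, "name": "Jane Doe", "phone": "555-0100", "lab_value": 7.2}]
    audit_log = []
    clean = [deidentify(r, key) for r in raw]
    pond = derive_pond("study_a_labs", clean, ["patient_id", "lab_value"], audit_log)
    print(pond)
    print(audit_log)
```

Because the same key always maps a patient ID to the same pseudonym, records for one patient stay linkable across derived ponds, while the audit log records which pond was created, from which columns, and when.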

BENEFITS



1. ITERABILITY
Changes to the data ponds for the teams could be tracked, tweaked, and reapplied, shortening the iteration cycles.

2. AUDITABILITY
The Client never had to let their data leave their sandboxes. Every change and every derived data pond was visible.

3. RE-USABILITY
As each third-party study created new data ponds, the final “gold standard” datasets could be reused in other projects.