
case study
data factory platform
SECURE RESEARCH PARTNERSHIPS WITH DATA SANDBOXES
HealthLab’s Client needed terabytes of clinical data de-identified, along with a safe, auditable way to share that data with a pharmaceutical company’s research project.
At a Glance …
Key Metrics
5 Terabytes of Clinical Data De-identified
129 Data Ponds Created
6 Medical Publications
CHALLENGES
Health institutions can lose oversight of their data once they have shared it with a research partner.
De-identifying clinical data at scale can be challenging.
Making data useful for research partners can require multiple iterations, which can be time-consuming.
SOLUTIONS
The solution used HealthLab’s Data Factories. Terabytes of clinical data were first de-identified in one “factory” sandbox; derivative data ponds were then created in a second factory’s sandbox, where they could be shared and audited. Deep technical insight into Cerner’s data tables was required to make the final data sets useful to the researchers.
PARTNER SANDBOXES
“BIG” DATA PIPELINES
MEDICAL INFORMATICS EXPERTISE
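The two-stage flow described under SOLUTIONS can be illustrated with a minimal Python sketch. It is not HealthLab’s implementation: the field names, the `deidentify` and `derive_pond` helpers, and the audit-log layout are all hypothetical, and the keyed hash shown here is simple pseudonymization, which on its own would not meet a full de-identification standard.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical field names; real clinical tables (e.g. Cerner's) differ.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient ID with a keyed hash so records stay linkable."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(records, secret_key):
    """Stage 1 (first sandbox): drop direct identifiers, pseudonymize the ID."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        row["patient_id"] = pseudonymize(str(record["patient_id"]), secret_key)
        cleaned.append(row)
    return cleaned


def derive_pond(name, source_rows, columns, predicate, audit_log):
    """Stage 2 (second sandbox): build a derivative pond and log how it was made."""
    pond = [{c: row[c] for c in columns} for row in source_rows if predicate(row)]
    audit_log.append({
        "pond": name,
        "columns": list(columns),
        "rows": len(pond),
        "created_at": datetime.now(timezone.utc).isoformat(),
    })
    return pond


# Made-up sample records, for illustration only.
records = [
    {"patient_id": 1, "name": "A. Example", "address": "1 Main St",
     "phone": "555-0100", "diagnosis": "E11", "age": 54},
    {"patient_id": 2, "name": "B. Example", "address": "2 Main St",
     "phone": "555-0101", "diagnosis": "I10", "age": 61},
]

audit_log = []
clean = deidentify(records, secret_key=b"demo-key-not-for-production")
pond = derive_pond(
    "diabetes_cohort",
    clean,
    columns=["patient_id", "diagnosis", "age"],
    predicate=lambda row: row["diagnosis"] == "E11",
    audit_log=audit_log,
)
```

Because every derived pond appends an entry to `audit_log`, the data owner can review what was created and from which columns without the source records leaving the first stage.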
BENEFITS
ITERABILITY
Changes to each team’s data ponds could be tracked, tweaked, and reapplied, shortening the iteration cycles.
AUDITABILITY
The Client never had to let their data leave their sandboxes. Every change and every derived data pond remained visible for audit.
RE-USABILITY
As each third-party study created new data ponds, the final “gold standard” datasets could be reused in other projects.