Test implementation of a distributed database system for privacy-preserving data analysis
PI: Tim Brick (Human Development and Family Studies)
Plan for funding tuition for graduate students, or the remainder of the researcher’s salary for postdoc and research faculty: None; Summer or reduced-hours only
Other Senior Team Members: Michael Neale, Distinguished Professor of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics; Steven Boker, Professor of Psychology and of Data Science, University of Virginia. The PI will serve as the mentor for the junior researcher.
The goal of this project is to develop an initial test implementation of a distributed database system for privacy-preserving data analysis in the behavioral sciences. Open Science principles dictate that data must be made available for re-analysis and to ensure reproducibility. Yet as data become more personal and more intensive, anonymizing them becomes nearly impossible. Several high-profile leaks and deanonymization projects on data from services like Netflix and Strava have illustrated a clear need for a new model of privacy. These challenges come to a head in the behavioral sciences with recent increases in the collection and use of intensive longitudinal data (ILD) from smartphones, wearables, and other sensors. These data are tremendously valuable for projects on topics ranging from child development to addiction to PTSD and anxiety, but collecting them presents a formidable privacy challenge.
One emerging approach to this problem is to have each participant maintain individual data (MID) and then to use a distributed likelihood evaluation (DLE) tool to analyze the data in place. In this model (called MIDDLE), each participant in a scientific study keeps their data in a private cloud locker, and the MIDDLE engine uses tools like secure multiparty computation and federated computing to perform analyses on the data at a group level, permitting scientists to run models and understand group-level patterns without having access to person-level data. Although a simple proof-of-concept has demonstrated the utility of this approach, a number of problems must be solved before it can be deployed.
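To make the idea of group-level analysis without person-level access concrete, the following is a minimal sketch, not the MIDDLE implementation. It assumes a Gaussian model and uses additive secret sharing, one standard secure multiparty computation technique, to total each participant's log-likelihood contribution. It simulates all parties in a single process, with no networking and no defense against colluding or malicious parties. All names (`PRIME`, `SCALE`, `share`, `local_loglik`, `secure_total_loglik`) are hypothetical and chosen for illustration.

```python
import math
import secrets

# Illustrative assumptions: a prime field modulus and a fixed-point scale.
PRIME = 2**61 - 1
SCALE = 10**6


def encode(x: float) -> int:
    """Map a real number to a fixed-point element of the field."""
    return round(x * SCALE) % PRIME


def decode(v: int) -> float:
    """Map a field element back to a real number (upper half = negative)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(value: int, n: int) -> list[int]:
    """Split value into n random shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def local_loglik(data: list[float], mu: float, sigma: float) -> float:
    """Gaussian log-likelihood of one participant's own data."""
    const = -0.5 * math.log(2 * math.pi * sigma**2)
    return sum(const - (x - mu) ** 2 / (2 * sigma**2) for x in data)


def secure_total_loglik(lockers: list[list[float]], mu: float, sigma: float) -> float:
    """Total log-likelihood computed from shares only (simulated in-process)."""
    n = len(lockers)
    # Each participant evaluates its own contribution and splits it into n shares.
    sent = [share(encode(local_loglik(d, mu, sigma)), n) for d in lockers]
    # Participant j adds up the j-th share received from every participant.
    partial = [sum(sent[i][j] for i in range(n)) % PRIME for j in range(n)]
    # Only these partial sums are combined; no individual value is revealed.
    return decode(sum(partial) % PRIME)
```

In this sketch the analyst proposes parameter values (`mu`, `sigma`), each locker evaluates the likelihood locally, and only the recombined total is returned, which an optimizer could then use to update the parameters. The fixed-point encoding introduces a rounding error of at most half of `1 / SCALE` per participant.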
This project tackles the implementation of one required software element: a distributed database framework that can be used as a testbed for future research. The project will design and implement a containerized framework for ingesting and storing a variety of data types and for running secure multiparty computation algorithms for distributed analysis of those data, beginning with a secure multiparty computation algorithm that is already well-defined. The team is experienced in developing quantitative software: its current software project, OpenMx, has been downloaded 1.6M times and cited over 2000 times.
A list of specific areas of computational and/or data science expertise or skills that the current team is particularly interested in recruiting to support the project: databases, distributed computing.
Any other requirements or expectations of potential ICDS Junior Researchers: none
A list of specific objectives for work supported by this call: containerized software for a data storage and analysis node on RC; eventual publication of a paper describing the software and process
At least one medium to long-term goal: A collaborative proposal to the NIH call for Building Sustainable Software Tools for Open Science (RFA-OD-24-010; submission due June 2026); a proposal to Personal Health Informatics for Delivering Actionable Insights to Individuals (PAR-25-235; submission due February, June, or October 2026) in collaboration with a clinical trial.
A short statement (1 sentence to 1 paragraph) explaining the connection of the project to ICDS’s mission: This project opens the door to interdisciplinary applications of data science to the behavioral and health sciences, in line with ICDS’s focus on interdisciplinary data science approaches.
A paragraph summarizing team members’ recent and/or planned engagement with ICDS: Timothy Brick is an ICDS co-hire who regularly participates in ICDS activities and committees, such as monthly lunches, the faculty search committee, and annual ICDS symposia, and serves as a faculty representative for ICDS and CHHD on the Chief Information Security Officer’s advisory board. He was recently appointed by the SSRI as a liaison with the ICDS for HPC development.