Estimating the causal effects of clinical interventions from observational electronic health records (Faculty/Junior Researcher Collaboration Opportunity)

PI: Vasant Honavar (IST)

Apply as Junior Researcher 

PI will be responsible for tuition support.

Proposal description:

Electronic Health Records (EHRs) provide digital representations of patients’ medical histories collected over time. Structured EHRs contain critical clinical information such as diagnoses, medications, vital signs, lab results, and procedures. This data potentially offers a rich source of insights into the comparative effectiveness of treatments, medications, and other interventions on health outcomes. However, extracting such insights entails causal effect estimation from extremely high-dimensional, sparsely and irregularly sampled, longitudinal, primarily observational (as opposed to experimental) health data. While there has been impressive progress on causal effect estimation from observational data, most existing methods focus on cross-sectional data (with specified pre-treatment variables, treatment, and post-treatment effects of interest) or on relational, as opposed to longitudinal, data. Furthermore, they are generally limited to point interventions delivered at one time (as opposed to interventions delivered in a particular order over time, e.g., chemotherapy followed by surgery for cancer patients).

Against this background, this project aims to develop effective methods for causal effect estimation from longitudinal data under four scenarios: point interventions and point effects, longitudinal interventions and point effects, point interventions and longitudinal effects, and longitudinal interventions and longitudinal effects. The resulting methods will be tested on synthetic data (where the ground truth causal effects are known) as well as real-world de-identified EHR data (the latter in collaboration with our clinical research collaborators).

Specific aims of the research include:

• Mathematically formulate the different variants of the problem of causal effect estimation from longitudinal data

• Establish the conditions under which such causal effects are identifiable

• Tailor state-of-the-art representation learning techniques for causal effect estimation to the different scenarios mentioned above

• Evaluate the resulting algorithms on synthetic data (where the ground truth causal effects are known)

• Demonstrate real-world applications using EHR data (in collaboration with our clinical research collaborators)
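
To illustrate what evaluation on synthetic data with a known ground truth can look like, the following is a minimal sketch for the simplest scenario only (a point intervention and a point effect). It is not the project's method: the data-generating process, the single confounder, the function names, and the use of linear regression adjustment are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(n, true_effect, seed=0):
    """Hypothetical generator: one confounder drives both treatment and outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                    # confounder (e.g., baseline severity)
    p = 1.0 / (1.0 + np.exp(-1.5 * x))        # treatment probability depends on x
    t = rng.binomial(1, p)                    # observed (non-randomized) treatment
    y = true_effect * t + 2.0 * x + rng.normal(size=n)  # outcome
    return x, t, y

def naive_estimate(t, y):
    """Difference in mean outcomes; ignores confounding."""
    return y[t == 1].mean() - y[t == 0].mean()

def adjusted_estimate(x, t, y):
    """Regression adjustment: coefficient on t in a least-squares fit of y on (1, t, x)."""
    design = np.column_stack([np.ones_like(x), t, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

TRUE_EFFECT = 1.0  # known only because the data are synthetic
x, t, y = simulate(20000, TRUE_EFFECT)
naive = naive_estimate(t, y)
adjusted = adjusted_estimate(x, t, y)
print(f"true={TRUE_EFFECT:.2f} naive={naive:.2f} adjusted={adjusted:.2f}")
```

Because the true effect is fixed by the simulator, the error of each estimator can be measured directly. In this sketch the naive comparison is biased by the confounder, while the adjusted estimate should land close to the true value. The longitudinal scenarios described above would need considerably richer simulators and estimators than this.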

The long-term goals of this research center around advances in theoretical foundations, methods, and tools for causal effect estimation from complex longitudinal data with applications to healthcare, public policy, education, and beyond. The project aims to lay the groundwork for a competitive multi-disciplinary collaborative proposal for external funding.

Connection to ICDS Mission: This project directly supports the ICDS mission by developing advanced methods for causal effect estimation from longitudinal data, with many potential real-world applications, including healthcare.

Ideal student background: Good knowledge of the theory and practice of causal modeling and causal inference (at a level comparable to that offered by DS 560 at Penn State) and of machine learning (including deep learning), as well as familiarity with electronic health records data.