Data-Driven Discovery of Regulatory Mechanisms and Cellular Resource Allocation via Multi-Modal Data Integration (Faculty/Junior Researcher Collaboration Opportunity)

PI: Vasant Honavar (IST)

Plan for funding tuition for graduate students, or the remainder of the researcher’s salary for postdocs and research faculty: Existing NSF funding through the National Synthesis Center for Emergence in the Molecular and Cellular Sciences (NCEMS), Award No. 2335029, will be used.

Project Narrative: This project supports a working group within the U.S. National Science Foundation (NSF) National Synthesis Center for Emergence in the Molecular and Cellular Sciences (NCEMS) at Penn State. NCEMS aims to drive multidisciplinary collaboration by synthesizing publicly available research data to address fundamental scientific questions at the intersection of data science and molecular and cellular biology. Despite major consortia like the Dependency Map and Cancer Cell Line Encyclopedia generating extensive multi-omics datasets for thousands of cell lines, these data remain fragmented and are rarely integrated at the single-cell level. Recent advances in single-cell multi-omics sequencing and imaging now enable simultaneous profiling of DNA, RNA, and protein within the same cell line; however, integration frameworks to unify these data across modalities are lacking. Our project will address this challenge by developing unified computational frameworks that harmonize and integrate single-cell sequencing and imaging data across DNA, RNA, and protein modalities within well-characterized cell lines. By minimizing donor and cell-type variability, this approach will enable robust, mechanistic studies of gene regulation and cellular response to perturbation, creating a foundational resource for benchmarking and advancing machine learning, network biology, and multi-omics research. Through NCEMS’s international collaboration, with working group members spanning 43 institutions across 18 U.S. states and six countries, the Junior Researcher will gain unique exposure to a global scientific network, raising both the research profile and international visibility of Penn State.

Project Objectives and Goals: This project will deliver unified, reproducible frameworks and analytical pipelines for integrating and analyzing single-cell and bulk multi-omics data across DNA, RNA, and protein modalities in well-characterized cell lines. Immediate outcomes will include harmonized datasets, robust cross-modality alignment tools, and a curated, accessible database of cell lines with high-coverage multi-omics data. The project will develop and validate machine learning and network biology models to infer regulatory relationships and predict missing modalities, along with standardized pipelines for data harmonization and benchmarking. In the medium term, these resources will provide a foundational framework for mechanistic studies of gene regulation, enable robust benchmarking of new analytical methods, and support community-driven resource development by the NCEMS Working Group. Long-term goals include publication and dissemination of results and tools, and the expansion of these frameworks to additional cell lines and data modalities, thereby accelerating advances in multi-omics integration and regulatory biology.

Required Expertise/Skills: Integrating, harmonizing, and curating heterogeneous multi-omics datasets (single-cell and bulk RNA-seq, scATAC-seq, CITE-seq, imaging data); computational modeling and machine learning (including network biology, latent representation learning, and predictive modeling); development and deployment of reproducible analytical pipelines; statistical validation and cross-modality benchmarking; quality control for large-scale biological data.

Interdisciplinary Components: This project synthesizes expertise across computational biology, genomics, single-cell multi-omics, network biology, and statistics. It bridges experimental and computational efforts, aligning with NCEMS’s mission of enabling community-scale synthesis research and ICDS’s commitment to addressing complex scientific challenges through interdisciplinary computational approaches.

Mentorship and Team Integration: The graduate student supported by this funding will receive comprehensive mentorship and support from the following interdisciplinary experts:

● Vasant Honavar: NCEMS Working Group co-lead and Associate Director; ICDS Associate Director; Professor, Penn State College of Information Sciences and Technology

● Elizabeth Brunk: Assistant Professor of Pharmacology and of Chemistry, University of North Carolina School of Medicine

● Ferhat Ay: Associate Professor, La Jolla Institute for Immunology

● William Noble: Professor in the Departments of Genome Sciences and Computer Science and Engineering at the University of Washington

● Maowei Dong: NCEMS Project Manager, Huck Institutes of the Life Sciences

● Justin Petucci: NCEMS Associate Director, ICDS RISE AI/ML Team Lead

Mentorship will include weekly meetings, presentations, methodological training, and regular feedback, fostering advanced analytical and computational skills and international collaboration. An NCEMS Staff Scientist (PhD-level, experienced in open science, team science, and data science) will provide direct mentorship, ensuring both technical and professional development. The junior researcher will gain research experience in a professional environment outside their primary lab, interact with NCEMS Working Groups, and develop transferable data science skills valuable for their thesis and future careers. Participation in NCEMS-sponsored events, such as the Annual Summit, training workshops, and hackathons, will further support professional growth. Authorship will be provided on research papers to which the researcher contributes in accordance with Working Group guidelines.

Funding Request: We are seeking a graduate student at 50% RA for this project who is able to work in person at the NCEMS offices (4th floor of the Benkovic Building).

PI ICDS Engagement: PI Vasant Honavar is actively involved in ICDS activities as a current associate director.