Millions of Galaxies but No Time: Rapid Inference of Galaxy Properties with Neural Density Estimators (Faculty/Junior Researcher Collaboration Opportunity)

PI: Joel Leja (Astronomy and Astrophysics)

Plan for funding tuition for graduate students, or the remainder of the researcher’s salary for postdoc and research faculty: Grant funding through Caryl Gronwall / the HETDEX project. Supplementary proprietary funds from the involved faculty are also available as needed.

Background: Interpreting galaxy images and spectroscopy is the primary way to understand galaxy formation and evolution. This is generally done in a Bayesian inference framework by forward-modeling the observations, adopting appropriate priors, and sampling the parameter space with techniques such as MCMC or nested sampling. This forward-modeling is a complex and expensive process: one must generate populations of millions or billions of stars with varying elemental abundances, formation histories, and stellar physics, and combine this with other models of luminous astrophysical objects such as black holes and gaseous nebulae. One of the most advanced and widely adopted approaches for this is the Prospector (Leja et al., 2017) inference framework, developed in part here at Penn State, which incorporates hundreds of galaxy astrophysical parameters in a mature forward-modeling framework.

The key challenge is that these models are slow: Prospector can generate 20 galaxy models per second, but millions of models are required to fit a single galaxy, resulting in ∼10-hour fits for single objects. This is wholly insufficient to fit the billions of galaxies that are expected to be observed by the Vera C. Rubin Observatory, which will have first light this year. Our research has focused on developing and deploying new rapid inference methods to address this issue, reaching ∼15 minutes with neural net emulators (Mathews et al., 2023) and ∼1–60 seconds with a modified version of simulation-based inference (SBI++) customized for astronomical data (Wang et al., 2023).
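To make the scale of the problem concrete, the timings quoted above can be turned into a rough compute budget. The sketch below is back-of-the-envelope arithmetic only: it assumes serial, single-core fits, takes "billions of galaxies" to mean 10^9 (an assumption, not a survey figure), and uses the per-galaxy times from the text. Note that 10 hours at 20 models per second is about 7 × 10^5 model evaluations, the same order of magnitude as the "millions" quoted.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR

# Figures taken from the text.
models_per_second = 20
fit_hours = 10
models_per_fit = models_per_second * fit_hours * SECONDS_PER_HOUR

# Assumption for illustration: "billions of galaxies" taken as 1e9.
n_galaxies = 1_000_000_000


def cpu_years(seconds_per_galaxy, n=n_galaxies):
    """Total serial compute time, in years, to fit n galaxies."""
    return seconds_per_galaxy * n / SECONDS_PER_YEAR


# Per-galaxy fit times quoted in the text, in seconds.
costs = {
    "Prospector (~10 h)": cpu_years(10 * SECONDS_PER_HOUR),
    "Emulator (~15 min)": cpu_years(15 * 60),
    "SBI++ (60 s)": cpu_years(60),
    "SBI++ (1 s)": cpu_years(1),
}

print(f"model evaluations in one 10-hour fit: {models_per_fit:,}")
for method, years in costs.items():
    print(f"{method:>20s}: {years:,.0f} CPU-years for {n_galaxies:.0e} galaxies")
```

Under these assumptions, direct fitting costs more than a million CPU-years for 10^9 galaxies, while the fastest SBI++ timing brings the same job down to a few tens of CPU-years.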

Project Goal: We seek an ICDS Junior Researcher who will perform the first, pioneering application of our SBI++ algorithm to millions of galaxies from the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), in which Penn State has a leadership role. This will involve training a neural density estimator using a training set of simulated galaxies along with a model of the observational uncertainties, applying the model to a large catalog of observed galaxies, and assessing the quality of the results. We expect this to be featured in a publication in the Astrophysical Journal.
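The three steps in the project goal (train on simulated galaxies with a noise model, apply to a catalog, assess the results) can be illustrated with a deliberately tiny toy. This is not SBI++ and not the Prospector model: the forward model `simulate`, the single parameter `theta`, its prior range, and the noise level are all hypothetical, and a linear-Gaussian conditional density fitted by least squares stands in for the neural density estimator. The point is only the shape of the workflow: the expensive fitting happens once, after which each catalog object costs a single cheap evaluation.

```python
import math
import random


def simulate(theta, rng, noise_sigma=0.1):
    """Hypothetical toy forward model plus Gaussian observational noise."""
    return 2.0 * theta + 1.0 + rng.gauss(0.0, noise_sigma)


def fit_conditional_gaussian(thetas, xs):
    """Least-squares fit of q(theta | x) = Normal(slope * x + intercept, sigma^2).

    Stand-in for training a neural density estimator on (theta, x) pairs.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(thetas) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, thetas))
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    # Residual scatter sets the width of the approximate posterior.
    sq_resid = sum((t - (slope * x + intercept)) ** 2 for x, t in zip(xs, thetas))
    return slope, intercept, math.sqrt(sq_resid / n)


rng = random.Random(0)

# 1. Training set: draw parameters from the prior and simulate noisy data.
theta_train = [rng.uniform(8.0, 12.0) for _ in range(20_000)]
x_train = [simulate(t, rng) for t in theta_train]
slope, intercept, post_sigma = fit_conditional_gaussian(theta_train, x_train)

# 2. Apply to a mock "observed" catalog: one cheap evaluation per object.
theta_true = [rng.uniform(8.0, 12.0) for _ in range(50_000)]
x_obs = [simulate(t, rng) for t in theta_true]
post_mean = [slope * x + intercept for x in x_obs]

# 3. Assess quality: fraction of true values inside the 1-sigma interval.
hits = sum(abs(m - t) < post_sigma for m, t in zip(post_mean, theta_true))
coverage = hits / len(theta_true)

print(f"posterior width: {post_sigma:.3f}, 1-sigma coverage: {coverage:.3f}")
```

A well-calibrated Gaussian posterior should contain the truth within one sigma about 68% of the time, so a coverage far from that value in step 3 would signal a problem with the estimator or the noise model.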

Expertise/skills of interest: (a) Experience in Bayesian statistics, neural networks, MCMC/nested sampling, and simulation-based inference is helpful (but not required!). (b) Programming experience in Python is required.

Expectations: (a) Post-comps graduate students or postdocs with some experience (or at least strong interest) in: (1) Astronomy & Astrophysics, Physics or a related field; and (2) Applied Math, Computer Sciences, Data Sciences, IST, Statistics or a related field. (b) Be willing to produce open-source coding materials (e.g. Jupyter notebooks) and make results readily reproducible/highly citable. (c) Attend weekly meetings and provide project updates to faculty advisor(s), alongside participation in group meetings (1 hour, once/week).

Mentoring: In addition to Prof. Joel Leja, Dr. Olivia Curtis, Prof. Caryl Gronwall and Prof. Robin Ciardullo are members/leaders in HETDEX and are interested in and willing to serve as mentors for the junior researcher. There is widespread enthusiasm and support for this project!

Engagement: Leja is an ICDS co-hire and a member of the Center for Astrostatistics who has a strong interest in the wide variety of data-intensive methodologies employed within and beyond ICDS.