Advancing Air Pollution Exposure Assessment with Machine Learning Techniques (Faculty/Junior Researcher Collaboration Opportunity)

PI: Xi Gong (Biobehavioral Health)

Air quality is vital to human health. To conduct epidemiological analyses of the relationship between air pollution exposure and health outcomes, it is essential to first accurately estimate an individual’s exposure to air pollutants. However, the process from air pollutant emissions to individual exposure is complex and nonlinear, which poses significant challenges for modeling. There is a need for exposure assessment models that balance accuracy, complexity, and usability. Because machine learning (ML) and artificial intelligence (AI) methods are self-learning, fast-converging, and fault-tolerant, they are well suited to modeling these complex nonlinear relationships.

Our team has developed a preliminary model using a pruned feedforward neural network (pruned-FNN) to estimate annual air pollution exposure based on emission timing and rates, terrain features, meteorological conditions, and proximity measures. This model has been implemented in one U.S. state over an 11-year period. The current project aims to redesign the computational framework to enhance predictive performance and expand its application to additional air pollutants, broader geographic regions, and longer time spans.

Planned Activities:

 Data Collection and GIS database development: Expand the current dataset (air emission, air monitoring, climate, and terrain data) from one state to cover the entire U.S. over the past 30 years. Preprocess and integrate the data into a GIS database for use in modeling.

 Model design and training: Redesign the current model by comparing alternative ML/AI models to identify the best-performing one. Air monitoring data will serve as the ground truth for model training, calibration, and cross-validation.

 Exposure assessment: Apply the model to estimate exposure to each air pollutant at high spatial resolution (e.g., 10 km by 10 km) across the entire U.S. over the past 30 years. Organize the results in a GIS database for further visualization and spatial analysis.
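
As a rough illustration of the model-comparison step described above, the sketch below scores two candidate models with k-fold cross-validation, treating monitor readings as ground truth. Everything in it is an assumption made for illustration: the synthetic sites, the two predictors (emission rate and distance), the candidate models (a mean baseline and a k-nearest-neighbor regressor), and all function names. It is not the team's pruned-FNN or its actual data pipeline.

```python
import random


def make_synthetic_sites(n, seed=0):
    """Hypothetical monitoring sites: ((emission_rate, distance_km), concentration)."""
    rng = random.Random(seed)
    sites = []
    for _ in range(n):
        emission = rng.uniform(1.0, 10.0)
        distance = rng.uniform(1.0, 50.0)
        # Synthetic nonlinear "truth" plus noise; NOT a physical dispersion model.
        concentration = emission / distance + rng.gauss(0.0, 0.02)
        sites.append(((emission, distance), concentration))
    return sites


def fit_mean(train):
    """Baseline candidate: always predict the mean training concentration."""
    mean = sum(y for _, y in train) / len(train)
    return lambda x: mean


def fit_knn(train, k=5):
    """Second candidate: average the k nearest training sites in feature space."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda s: (s[0][0] - x[0]) ** 2 + (s[0][1] - x[1]) ** 2,
        )[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict


def k_fold_rmse(data, fit, n_folds=5):
    """Root-mean-square error over held-out folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    squared_errors = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the one being held out.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = fit(train)
        squared_errors.extend((model(x) - y) ** 2 for x, y in held_out)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


sites = make_synthetic_sites(200)
candidates = {"mean_baseline": fit_mean, "knn": fit_knn}
scores = {name: k_fold_rmse(sites, fit) for name, fit in candidates.items()}
best = min(scores, key=scores.get)  # candidate with the lowest held-out RMSE
print(scores, "best:", best)
```

The folds here are assigned without regard to location. With real monitoring data, nearby monitors tend to be correlated, so grouping folds by region may give a more honest estimate of how a model performs in areas without monitors.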

A list of specific areas of computational and/or data science expertise or skills that the current team is particularly interested in recruiting to support the project:

AI/ML methodology and implementation, GIS-based modeling and spatial data analysis.

Other Expectations of ICDS Junior Researcher:

Regular availability for meetings (weekly or bi-weekly, times flexible)

A list of specific objectives for work supported by this call: 

We will submit at least one paper presenting the model and datasets. The model has the potential to become a key reference in environmental exposure assessment research, and the resulting exposure dataset could serve as a foundational resource for future environmental health studies focused on air pollution exposure and human health outcomes. The medium- to long-term goal is to leverage the preliminary results to support a future funding proposal for an NIH NIEHS R01 or R21 grant, along with other relevant opportunities.

Connection of the project to ICDS’s mission: 

We will develop and apply data science and ML/AI methods to environmental health science to advance understanding, response, and mitigation of air pollution’s adverse health effects.

A paragraph summarizing team members’ recent and/or planned engagement with ICDS:

Xi Gong is a co-hire at ICDS. He has regularly participated in and plans to continue engaging in ICDS activities, such as monthly lunches, faculty search interviews, AI Week, and the annual ICDS symposium.