Data Science Community Seminar
Date: Tuesday, October 12
Time: 9:00 a.m.
Join Penn State’s Data Science Community for a short talk about data science research.
Presenter: Qingyun Wu, Assistant Professor of Information Sciences and Technology
Title: Fast, Economical & Scalable AutoML
About: Automated machine learning (AutoML) automates the time-consuming, iterative tasks of machine learning model development, such as data pre-processing, hyperparameter tuning, and model selection. It frees data scientists, analysts, and developers from tedious trial and error in building machine learning models. In this talk, I will introduce our latest efforts in fast, economical, and scalable AutoML, show how it can benefit a wide spectrum of end-to-end data science and machine learning tasks, and discuss the new challenges it raises.
Presenter: Manzhu Yu, Assistant Professor, Associate Director of Geoinformatics and Earth Observation Laboratory
Title: Time series prediction of air pollution from wildfires using Transformer – a multi-head attention mechanism
About: Wildfire smoke can be more damaging to respiratory health than other sources of air pollution. As fires grow larger and human populations expand, it is crucial to provide a more accurate picture of how communities will be at risk from wildfires. In this research, we investigated the capability of a Transformer architecture to predict short-term PM2.5 measurements in California during wildfire seasons. The time series prediction uses the past 24 hours of observations to predict PM2.5 measurements for the next 12 hours. Feature contributions and temporal feature contributions were calculated to capture different characteristics of the multi-variable time series, distinguish each variable’s contribution to the prediction, and provide guidance for future air quality forecast systems that rely on multi-variable data.
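For readers new to this kind of forecasting setup, the 24-hours-in, 12-hours-out framing in the abstract can be sketched as a sliding-window transformation. This is only an illustration and not the presenter's code: it assumes hourly observations, and the function name make_windows, its parameters, and the choice of column 0 as the PM2.5 target are all hypothetical.

```python
import numpy as np

def make_windows(series, n_in=24, n_out=12, target_col=0):
    """Split a multi-variable time series into (input, target) windows.

    series: array of shape (time_steps, n_features), assumed hourly.
    target_col: column assumed to hold PM2.5 (hypothetical layout).
    Returns X of shape (n_samples, n_in, n_features) and
    y of shape (n_samples, n_out).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2:
        raise ValueError("series must be 2-D: (time_steps, n_features)")

    n_samples = series.shape[0] - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested windows")

    # Each sample pairs n_in past steps (all features) with the
    # following n_out steps of the target column only.
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    y = np.stack([
        series[i + n_in:i + n_in + n_out, target_col]
        for i in range(n_samples)
    ])
    return X, y
```

Arrays shaped like X and y are what a sequence model, Transformer or otherwise, would typically be trained on in a setup like this; the model itself is outside the scope of the sketch.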