Grassroots data science initiative seeks to create connections, collaborations
Posted on September 22, 2020
by Liam Jackson; originally published on Penn State News
UNIVERSITY PARK, Pa. — Dave Hunter knows how valuable a serendipitous connection can be. Throughout his career, Hunter, professor of statistics at Penn State, said that “being at the right place at the right time” helped him find new partnerships and projects to advance his science and his academic career. Now, Hunter is providing similar opportunities to other Penn State researchers involved in data science.
With help from a Teaching and Learning with Technology (TLT) faculty fellowship, Hunter spearheaded the launch of a grassroots data science initiative designed, in part, to provide more chances for other researchers to create new connections, conversations and collaborations. Now that the community has grown, the Institute for Computational and Data Sciences (ICDS) and the University Libraries have joined the initiative, which is already leading to new collaborations.
Data science for all
Hunter was first inspired to form a data science community when the University launched a major in data sciences in 2016. As head of the Department of Statistics at the time, Hunter was involved in conversations about the proposed major with faculty members in the College of Information Sciences and Technology and the School of Electrical Engineering and Computer Science, each of which had faculty and students deeply involved in data science education.
“In the end we decided to take an intercollege approach to the major,” Hunter said. “It was a nice way to build community around this new major, and that’s where I started to dive into this idea of data science at Penn State, when I started to have lots of chance conversations around data science with various people.”
In one of those conversations, Hunter learned from a colleague — Scott McDonald, professor of science education — about the TLT Faculty Fellows program.
“Scott told me he had been working with the fellows program for a while, and I thought, wouldn’t it be great if this data science thing were bigger than just the three departments involved in the major? It could involve everybody who is trying to do something with big data. It seemed like a nice way to build a community that would not be exclusive,” he said.
The TLT Faculty Fellows program provides faculty members with a team of support staff around projects that are “at the intersection of pedagogy and technology,” said Bart Pursel, interim director of innovation with TLT. A key component of the program is that it’s designed to be flexible to adapt to the changing nature of projects that work with ideas and technology at the cutting edge.
“Projects might meander and end up veering off in a couple of directions, and that’s okay. The fellows program is designed for that,” said Pursel. “We want to take projects from innovation to scale, and we take projects on with the idea that we don’t just want to do a niche thing. We want projects to grow legs and go into other disciplines, or impact different parts of the University.”
Hunter was paired with TLT’s Data Empowered Learning team, where he said he found support from a whole team who had experience with data science. But Hunter credits one person, Hannah Williams, project manager, with his ability to launch the initiative.
“There are many great people from TLT involved in this initiative, but Hannah has truly gone above and beyond,” he said.
After several months of brainstorming and conversations with stakeholders from around the University, the team came up with a plan for investigating the possibility of creating a community.
Putting shape to a data science community
The first task was to identify how to shape a data science initiative at Penn State. To keep it as inclusive as possible, Hunter wanted a grassroots movement that allowed the membership to guide the direction of the initiative. Fittingly for a data science initiative, the team sought to collect data, using a detailed survey sent out to Penn State faculty. The biggest question they sought to answer was: Does the Penn State community want to be involved in a data science initiative?
“The answer was a resounding yes,” said Williams. “People wanted interaction and engagement, and we started to realize that people feel very strongly about all sorts of things in this space of data science: policy, privacy, applications, methodologies, education, internal collaborations, grant proposals, and just being competitive in general.”
Considering how ubiquitous data science is, creating a University-wide community would require a careful, thoughtful approach. Even defining data science is a challenge for many institutions, but Hunter prefers a definition that, at first, seems simple.
“Data science is about deriving meaning from data, and often that means big data,” Hunter said.
As Hunter notes, this definition implies that data science requires a team approach, often an interdisciplinary one.
“Often, this means the data themselves are not easy to obtain, so you need some technical expertise to obtain data,” he said. “Then you have to put it into a format where it can be analyzed, which requires a different type of expertise. Then you need to know what you’re trying to learn, which requires subject matter expertise. You need to know how to answer the questions you’re trying to answer, and this requires some statistical expertise.”
The TLT team helped Hunter launch a web presence for the community, datascience.psu.edu. Their real success came after they realized that people needed more opportunities for conversations around different aspects of data science. An informal lecture series would be the best way to gather anyone with an interest in data science, an idea supported by their survey data.
Anyone interested in joining the data science community can get more information on the data science community website.
Modeling their talk series after the Penn State Materials Research Institute’s Millennium Café series, the group strategically built in opportunities for conversation wherever they could. First, each talk would include two speakers, one on the development side of data science and one on the application side. Next, each speaker would be asked to tell a little bit about themselves, outside of their researcher or practitioner role, to provide a “human element” to the talks, said Williams. Finally, they booked space for far longer than two 13-minute talks would require: four hours in total, and there were times when even that didn’t seem like enough.
“People were sticking around for sometimes several hours afterward to talk shop, whether to find out what open-source packages people are using, or just sharing best practices and ideas,” said Pursel. “It’s been encouraging to see that.”
In its first year, the Data Science Talks featured 17 researchers from 15 departments and three campuses. One of them, Michael Rutter, associate professor of statistics and mathematics at Penn State Behrend, gave his talk using a Beam robot in the Dreamery.
The initiative has led to more than just conversations for several researchers. After giving a data science talk about his work with PlantVillage, David Hughes, associate professor of entomology and biology, was approached by an audience member, Medha Uppala, postdoctoral researcher in the Department of Statistics.
“As a new postdoc at Penn State, I was keen on forming new contacts with researchers in the data science community. This was especially important to me as I’m an applied statistician and I was looking for new field applications that I can pursue my research in,” Uppala said. “David spoke about his research with PlantVillage, the farmer networks in Kenya and how they adopt new technologies. It so happens that some of my background is in social networks. So I approached him to chat about his work and if he had any open social network problems I could collaborate on. It all began there.”
Hughes said the two are involved in a new project that is “going to be a huge study in Kenya on social networks.”
“It’s a big change in our work, which is great and the whole point of such forums,” Hughes said. “Diversity is always good. Bringing together different ideas is the essence of advancement in knowledge.”
The future of the community
The success of the series grabbed the attention of two Penn State units that, like TLT, serve the entire University community — ICDS and the University Libraries. Both have become co-organizers of the initiative.
Hunter opted to hand the reins over to new faculty leadership both because he is going on sabbatical in the fall 2020 semester, and because he believes in an inclusive community.
“This is the sort of thing that by its nature should not be directed by the same person year after year after year,” he said.
Two new faculty leads — Briana Ezray, research data librarian, and Xiaofeng Liu, associate professor of civil and environmental engineering and ICDS co-hire — volunteered to take on a leadership role with the community, and they will be announcing the fall semester speakers at the first fall 2020 data science meeting on Sept. 24.
Williams noted that seeing the evolution and growth of the initiative is a sign of what she hopes are many good things to come.
“This is now this amazing partnership with other important units in the University, ICDS and the Libraries, who have a vested interest in this and want to see it move forward,” she said. “That’s a huge success, and it’s only going to get better from there.”