Researchers take aim at hackers trying to attack high-value AI models
Posted on March 26, 2019
Driving Digital Innovation in AI is a series of stories about how Penn State’s Institute for CyberScience researchers are studying machine learning and artificial intelligence to make these technologies more useful and effective, as well as investigating the impact of AI on people and society.
Machine learning is powering some of the world economy’s most valuable systems, from helping select which students will attend prestigious universities to finding potential threats in a crowded soccer game. Because machine learning creates such valuable models, those models also become targets for hackers and malicious attackers, according to Patrick McDaniel, William L. Weiss Professor of Information and Communications Technology in the School of Electrical Engineering and Computer Science at Penn State and an Institute for CyberScience associate.
McDaniel and his team are not only studying how to make AI and machine-learning models more effective, they also are studying how to protect these increasingly valuable, but often not well-understood, burgeoning technologies. He said that while a considerable mystique has grown up around artificial intelligence and machine learning, the technology relies on common-sense logic. It’s all about using probabilities to build models, according to McDaniel.
“Some people have the idea that machine learning is some sort of magic dust that you can sprinkle over anything to make that device or technology automatically learn, but machine learning is really a set of mathematical techniques that looks at examples of some phenomenon and then uses mathematical techniques to generalize,” said McDaniel.
As an example, he said a researcher could easily create a machine-learning model that determines which students are basketball players based on the students’ height.
“Let’s say you put that set of 30 students in your training data set and, then, you train a model, which is just a mathematical technique that looks at all those samples, and the model says, ‘if you’re above this height, then you play basketball,’ because taller kids tend to play basketball more than shorter kids,” said McDaniel.
The problem is that, in the real world, there are always exceptions to the rule. Those exceptions are one of the main vulnerabilities of machine-learning programs.
“If you’ve seen Spud Webb, who is short, but is a great basketball player, you realize that the model we created to select basketball players is not a perfect model. And that’s the challenge of machine learning: most real-world phenomena are messy,” said McDaniel. “There’s really no easy way to know if someone plays basketball just by looking at their height.”
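The height example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ten training heights are invented (the article's hypothetical uses 30 students), the names `fit_threshold` and `predict` are hypothetical, and the 170 cm figure is an approximate stand-in for a short professional player such as Webb.

```python
# Invented training data: (height in cm, plays basketball?).
# These values are assumptions for illustration, not data from the article.
train = [
    (160, False), (165, False), (170, False), (172, False), (175, False),
    (180, True), (185, True), (188, True), (190, True), (195, True),
]

def fit_threshold(samples):
    """Pick the height cutoff that misclassifies the fewest training samples."""
    heights = sorted({h for h, _ in samples})
    # Candidate cutoffs are the midpoints between consecutive distinct heights.
    candidates = [(a + b) / 2 for a, b in zip(heights, heights[1:])]

    def errors(cutoff):
        return sum((h > cutoff) != label for h, label in samples)

    return min(candidates, key=errors)

def predict(cutoff, height):
    """The whole 'model': above the cutoff means basketball player."""
    return height > cutoff

cutoff = fit_threshold(train)

# A short but excellent player (roughly 170 cm, an approximation) falls
# below the learned cutoff, so the model gets this real-world case wrong.
print(cutoff, predict(cutoff, 170))
```

The model fits its training set perfectly and still misjudges the exception, which is the kind of gap the next section describes attackers exploiting.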
High value = highly vulnerable
Because models can be inaccurate, hackers can exploit those inaccuracies as an attack route against machine-learning models.
“If I’m a hacker, I can produce a sample that I know your model is wrong about,” said McDaniel. “And what’s even worse — if I have the ability to modify a sample, let’s say an image of a puppy, and I know how your model is designed, I can just tweak little pieces of that image, so that, even though you and I would recognize that the image is a picture of a puppy, when the image is fed to your machine-learning program, it would say it’s a kitten.”
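A toy version of the puppy-to-kitten tweak might look like the following. It is a sketch, not the method used by McDaniel's team: it assumes a simple linear classifier whose weights the attacker knows, uses invented weights and an invented eight-pixel "image," and nudges each pixel by a small fixed amount in the direction that lowers the score, in the spirit of sign-based attacks such as the fast gradient sign method.

```python
def score(w, b, x):
    """Linear model: positive score means 'puppy', otherwise 'kitten'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    return "puppy" if score(w, b, x) > 0 else "kitten"

def sign(v):
    return (v > 0) - (v < 0)

def perturb(w, x, eps):
    """Shift every pixel by at most eps in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is just w, so the attacker only needs the sign of each weight.
    """
    return [xi - eps * sign(wi) for xi, wi in zip(x, w)]

# Hypothetical weights and image (pixel intensities in [0, 1]); both are
# made up for illustration.
w = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.4, -0.3]
b = 0.0
x = [0.6, 0.4, 0.5, 0.3, 0.5, 0.5, 0.6, 0.4]

eps = 0.15
x_adv = perturb(w, x, eps)

print(classify(w, b, x), classify(w, b, x_adv))
```

No pixel moves by more than `eps`, yet the label flips. With only eight pixels the change has to be fairly large; in an image with thousands of pixels, far smaller per-pixel changes can add up to the same effect.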
Of course, developers are using machine learning for more than picking basketball teams and choosing between puppy and kitten pictures, McDaniel said. Businesses and organizations are using machine-learning models in processes ranging from helping soldiers guide military weapon systems to working with economists to monitor markets.
It’s that value that makes machine-learning models so attractive to hackers, according to McDaniel. Hackers could change those models slightly to steal, gain advantages, or just otherwise create mayhem.
“One thing we learned about security is that anything that has value is going to be attacked,” said McDaniel. “So, as we continue to deploy machine learning, we’re going to find that hackers are going to target not only the systems — in the traditional hacking into people’s computers sense — but they’re going to attack the models.”
McDaniel’s research team is working on technical counter-measures for these attacks.
Ryan Sheatsley, a doctoral candidate in information sciences and technology who works with McDaniel, said that the group has noticed a “cat and mouse game” develop in the machine-learning space over the past couple of years. Hackers find ways to manipulate machine-learning models, and organizations try to defend against those attacks.
According to Sheatsley, all of these defenses have ultimately failed. Taking a page from other cybersecurity measures, the team is building a type of defense that might help trap hackers who are probing for vulnerabilities in machine-learning programs.
“Instead of trying to prevent people from hacking into the system, our question was: What if we can make them hack into the system in a predictable way, a way that we can measure and observe?” said Sheatsley. “In traditional computer science and security we call this a honeypot. Defenders purposefully put out this vulnerable system and the attacker attacks the system in a way that’s predictable that we can measure.”
By making tweaks to the model, the researchers can add an artifact — similar to a watermark on a picture — in the model that hackers would be tempted to change. If they do, the researchers know that the model has been manipulated and could take actions to counter the hack.
McDaniel and his students expect these cat-and-mouse games between hackers and machine-learning model defenders to continue. Just recently, McDaniel received a $9.98 million grant from the National Science Foundation’s Secure and Trustworthy Cyberspace program to establish and lead the Center for Trustworthy Machine Learning. The grant will allow members of the multi-institution, multidisciplinary center to develop a rigorous understanding of the types of security risks involved in machine learning and to devise the tools, metrics and methods to manage and mitigate those vulnerabilities.