Even though you may not entirely realize it, artificial intelligence has a huge influence on our lives. The AI subfield of machine learning, in which computers learn from data, underlies social media algorithms and the recommendations we receive for everything from streaming content to shopping. It’s also used to assess medical images and help banks make lending decisions, and it’s the basis for autonomous vehicles, among countless other emerging uses.
As the field matures, it has become apparent that machine learning applications are only as good as the data they train on, and that data is often not very granular.
That’s why Ramya Korlakai Vinayak, an assistant professor of electrical and computer engineering at the University of Wisconsin-Madison, will use her National Science Foundation CAREER Award to build novel models and algorithms that tease out the diversity hiding in these datasets, so that preference and metric learning algorithms work better for diverse populations.
Metric and preference learning are commonly used to learn how people represent different concepts and make preference judgments based on them. They are useful in a variety of applications, including recommendation systems, behavioral and cognitive psychology, individualized education, crowdsourced democracy, and aggregating preferences over populations in social science surveys and datasets.
Vinayak says that many data collections involving people, like those from health and behavioral studies or social science research, contain societal-scale information from diverse populations, including people of different ages, backgrounds, income levels, and phenotypes. However, many off-the-shelf machine learning algorithms do not take such diversity into account. “Unfortunately, the existing tools and algorithms are inadequate because they focus very much on pooling everybody’s data and learning a common model that works well on average,” she says. “Furthermore, usually there is not enough data from each individual to learn separate personalized models. So, we need better machine learning models and algorithms that can actually capture the diversity in societal-scale datasets.”
One issue is that large datasets often lack individual identifiers and do not break people down into subgroups. One of Vinayak’s aims is to develop tools that can pick these subgroups out of the larger pool of data, while still maintaining individual confidentiality and privacy. “I’m building models and algorithms where we can learn different models for different subgroups, but without knowing who belongs to which subgroup,” she says. “Simultaneously, we’ll cluster people into different subgroups while learning the models that fit well for the subgroups.”
For instance, the algorithm could look at a large dataset of individuals with different preferences and learn these diverse preferences, even if information about those differences is not explicitly part of the data.
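To make the idea of clustering people while fitting subgroup models concrete, here is a minimal Python sketch. It is not Vinayak's method: it uses a generic alternating scheme, in the spirit of k-means, on invented toy data. The function names, the two hidden slopes, and the starting guesses are all assumptions made for illustration.

```python
# Hypothetical toy data: each person contributes only three (x, y) pairs,
# and nobody is labeled with a subgroup.
def make_people():
    people = []
    for i in range(6):
        slope = 2.0 if i % 2 == 0 else -1.0  # hidden subgroup, never shown to the learner
        xs = [1.0 + 0.5 * i, 2.0 + 0.5 * i, 3.0 + 0.5 * i]
        people.append([(x, slope * x) for x in xs])
    return people


def squared_error(points, slope):
    # How badly a candidate subgroup model (y = slope * x) fits one person's data.
    return sum((y - slope * x) ** 2 for x, y in points)


def fit_slope(points):
    # Least-squares slope for a line through the origin.
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)


def cluster_and_fit(people, init_slopes, n_rounds=10):
    slopes = list(init_slopes)
    labels = [0] * len(people)
    for _ in range(n_rounds):
        # Step 1: assign each person to the subgroup model that fits them best.
        labels = [
            min(range(len(slopes)), key=lambda k: squared_error(person, slopes[k]))
            for person in people
        ]
        # Step 2: refit each subgroup model on the pooled data of its members.
        for k in range(len(slopes)):
            pooled = [pt for person, lab in zip(people, labels) if lab == k for pt in person]
            if pooled:
                slopes[k] = fit_slope(pooled)
    return slopes, labels


# The starting guesses are arbitrary assumptions; schemes like this can be
# sensitive to initialization on harder data.
slopes, labels = cluster_and_fit(make_people(), init_slopes=[1.0, -0.5])
print(slopes, labels)
```

The learner never sees the subgroup labels. It recovers them only because people in the same hidden subgroup are best explained by the same model.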
“The algorithms I’m trying to build would leverage data from everybody for parameters that apply to everyone, but also simultaneously learn these other subsets of parameters that are more individual,” she says. “So, in essence, these models are bridging the two extremes of universal modeling and individual modeling.”
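As a rough illustration of mixing shared and individual parameters, the Python sketch below fits one slope common to everyone plus a separate offset for each person. This is a standard textbook-style estimator applied to invented data, not the models described in the award. The linear form, the function name, and all numbers are assumptions.

```python
def fit_shared_and_personal(people):
    """Fit y = shared_slope * x + personal_offset[i] by least squares.

    `people` is a list with one entry per person, each a list of (x, y) pairs.
    The linear form is an assumption chosen only for illustration.
    """
    means = []
    num = 0.0
    den = 0.0
    for points in people:
        x_bar = sum(x for x, _ in points) / len(points)
        y_bar = sum(y for _, y in points) / len(points)
        means.append((x_bar, y_bar))
        # Centering within each person removes that person's offset, so
        # everyone's data can be pooled to estimate the shared slope.
        num += sum((x - x_bar) * (y - y_bar) for x, y in points)
        den += sum((x - x_bar) ** 2 for x, _ in points)
    shared_slope = num / den
    # Each personal offset is then read off from that person's own averages.
    offsets = [y_bar - shared_slope * x_bar for x_bar, y_bar in means]
    return shared_slope, offsets


# Hypothetical data: three people with only two or three observations each.
true_offsets = [-1.0, 0.5, 3.0]
inputs = [[0.0, 1.0, 2.0], [1.0, 2.0, 4.0], [2.0, 3.0]]
people = [[(x, 1.5 * x + b) for x in xs] for xs, b in zip(inputs, true_offsets)]

shared_slope, offsets = fit_shared_and_personal(people)
print(shared_slope, offsets)
```

No single person supplies enough data to pin down a full model alone, but the shared slope draws on all eight observations while each offset stays specific to one person.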
For the outreach portion of her CAREER Award, Vinayak hopes to bring an awareness of data diversity to students. To that end, she is developing modules for undergraduate and graduate courses to discuss the topic, which she says is rarely part of the current machine learning curriculum. “Thinking about learning from diverse data becomes more and more important as machine learning models and algorithms touch our lives every day and become more common,” she says. “I want to prepare students to be able to think along these lines as they go out and make an impact in various fields.”
Vinayak is also collaborating with the College of Engineering’s inclusion, equity and diversity teams to increase mentoring opportunities for students from underrepresented backgrounds interested in computer engineering. She already mentors several undergraduates through the Wisconsin Science and Computing Emerging Research Stars program and hopes to involve more computer engineering students in that or similar programs.
Featured image caption: Ramya Korlakai Vinayak, an assistant professor of electrical and computer engineering, will use her National Science Foundation CAREER Award to build novel models and algorithms that tease out the diversity hiding in large datasets, so that preference and metric learning algorithms work better for diverse populations. Credit: Joel Hallberg.