Kangwook Lee
May 6, 2024

Through his CAREER award, Kangwook Lee is looking for ways to make AI more adaptable

Written By: Jason Daley

The reason generative artificial intelligence models like ChatGPT or Gemini can fix computer code in seconds, compose a sonnet in the style of Shakespeare, or explain the physics of a black hole is that they were trained on data drawn from essentially the entire internet. Gulping down and processing that much information, however, is incredibly expensive and time-consuming.

That’s why Kangwook Lee, an assistant professor of electrical and computer engineering at the University of Wisconsin-Madison, is using a National Science Foundation CAREER award to develop theory and tools for model adaptation, or tuning these massive pre-trained models to work efficiently on specific tasks. The goal is to create more targeted (and therefore, more useful) AI tools.

In the recent past, machine learning researchers could train models from scratch. But as the field advanced, building these models became increasingly costly. Machine learning engineers instead began adapting pre-trained models for new applications through a set of model adaptation techniques. However, many of those techniques don’t work well with the latest generation of models, including large language models like ChatGPT.

“There are new model adaptation methods being developed almost every week,” says Lee. “But we don’t have any clear framework that describes them and we don’t have a clear theoretical understanding of what’s going on with these new methods. So the motivation of this project is to develop a new understanding of these emerging techniques to adapt or transfer knowledge from the pre-trained models to downstream tasks.”

In earlier iterations of model adaptation, Lee says, researchers would take a model with millions or billions of “tuning knobs” and adjust them little by little to optimize the model for a certain task. But the newest machine learning models are so huge that adjusting all those knobs is prohibitively expensive.

Instead, the new model adaptation techniques leave the majority of tuning knobs fixed, while adjusting just a small set of knobs. Some newer model adaptation methods don’t adjust any tuning knobs – they simply apply a wrapper to the model, i.e., a custom interface geared toward one application.
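To make the “knobs” picture concrete, here is a minimal, self-contained sketch of that freeze-most, tune-few idea using NumPy. It uses a low-rank adapter (in the style of methods like LoRA) as one concrete example of such a technique; the tiny matrices, toy task, and hyperparameters are illustrative stand-ins, not the specific methods Lee studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" weights: billions of knobs in a real model, a small matrix here.
# These stay frozen during adaptation.
d = 8
W = rng.normal(size=(d, d))

# Low-rank adapter: the only trainable knobs, just 2*d*r parameters.
r = 2                                     # adapter rank, far smaller than d
A = rng.normal(size=(d, r)) / np.sqrt(d)
B = np.zeros((r, d))                      # zero init: adapted model starts identical to base

def forward(x):
    # Base model output plus the small low-rank correction.
    return x @ W + x @ A @ B

# Toy "downstream task": imitate a slightly shifted target model.
W_target = W + 0.1 * rng.normal(size=(d, d))
X = rng.normal(size=(64, d))
Y = X @ W_target

initial_loss = np.mean((forward(X) - Y) ** 2)

lr = 0.05
for _ in range(500):
    grad_out = 2 * (forward(X) - Y) / X.size   # d(MSE)/d(output)
    # Gradients flow only into A and B; the frozen W is never touched.
    grad_B = (X @ A).T @ grad_out
    grad_A = X.T @ (grad_out @ B.T)
    A -= lr * grad_A
    B -= lr * grad_B

final_loss = np.mean((forward(X) - Y) ** 2)
print(initial_loss, final_loss)  # loss drops while tuning only a handful of knobs
```

The design point is the parameter count: the adapter trains 2·d·r knobs regardless of how large the frozen model is, which is why such methods remain affordable as models grow.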

As part of the outreach portion of his project, Lee and his students are developing a tool that enables people to create their own specialized AI models—no coding or training needed. There are already well-developed AI modules for applications like speech recognition or writing that can be stacked like Lego blocks. “These methods don’t require any traditional background in machine learning or data science,” says Lee. “All you need is good creativity to come up with new ways of connecting existing AI models.”
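The “Lego block” idea above can be sketched in a few lines of Python. Every function below is a hypothetical stand-in for a pre-trained module (none of these names are real APIs or part of Lee’s tool): one plays a speech recognizer, one a language model used through a fixed-instruction “wrapper,” and composing them requires no training at all.

```python
def speech_to_text(audio: bytes) -> str:
    # Stand-in for a pre-trained speech recognition module.
    return "remember to water the plants on friday"

def to_checklist_item(text: str) -> str:
    # Stand-in for a language model used via a "wrapper": a fixed instruction
    # is prepended to the input, and no model parameters are changed.
    prompt = "Rewrite as a checklist item: " + text
    _, _, body = prompt.partition(": ")
    return "- [ ] " + body.capitalize()

def voice_note_to_todo(audio: bytes) -> str:
    # The Lego-block composition: one module's output feeds the next.
    return to_checklist_item(speech_to_text(audio))

print(voice_note_to_todo(b"<audio bytes>"))
```

Swapping in a different second block (say, a translator instead of a checklist rewriter) changes the application without retraining anything, which is the kind of creativity-over-coding workflow the quote describes.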