Qiaomin Xie
April 1, 2024

Xie hopes to advance reinforcement learning with NSF CAREER Award

Written By: Tom Ziemer

As requests flood a data center, the computer network must match those requests to servers in a way that balances completing each job quickly with managing overall system performance.

Qiaomin Xie likes to use the analogy of directing customers at a grocery store to checkout lines—the optimal resource allocation will keep things flowing for individual consumers while also maintaining harmony for the collective group.

But to ensure complex systems function smoothly, especially when confronted with situations that require sequential decision-making, the algorithms that control them need to be properly designed. A data-driven approach, which exploits the data these systems generate, provides a new paradigm for designing intelligent control algorithms. That’s where machine learning researchers like Xie come in.

Xie, an assistant professor of industrial and systems engineering at the University of Wisconsin-Madison, will use a National Science Foundation CAREER Award to enhance a machine learning technique called reinforcement learning for use in systems like computer and communications networks.

The five-year, $532,910 grant will fund Xie’s work on the algorithms and theories underpinning reinforcement learning.

“Reinforcement learning is the natural tool to help us design better control algorithms for these large-scale systems,” says Xie, who joined the UW-Madison faculty in 2021. “Reinforcement learning enables the system to learn and improve its performance autonomously through interaction with the environment, and also adapts to changes in the environment.”

In contrast to other machine learning techniques, which typically learn from static datasets or independent samples, reinforcement learning follows a dynamic process that Xie says makes it well-suited to navigate problems that require sequential decision-making. Reinforcement learning also uniquely balances the tradeoff between taking exploratory actions to learn more about a given environment and leveraging current knowledge to maximize rewards.
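To make the exploration-versus-exploitation tradeoff concrete, here is a minimal sketch of epsilon-greedy, a textbook strategy, applied to a toy version of the data center example. It is not Xie's method. The function name `epsilon_greedy_routing`, the exponential latency model, and the parameter values are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy_routing(mean_latencies, steps=5000, epsilon=0.1, seed=0):
    """Route requests to servers with an epsilon-greedy rule (illustrative only).

    mean_latencies: hypothetical true mean latency of each server, unknown
    to the learner and used here only to simulate feedback.
    """
    rng = random.Random(seed)
    n = len(mean_latencies)
    counts = [0] * n          # how many requests each server has received
    estimates = [0.0] * n     # running average reward (negative latency)

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random server to learn more about it.
            server = rng.randrange(n)
        else:
            # Exploit: pick the server that currently looks best.
            server = max(range(n), key=lambda i: estimates[i])

        # Simulated noisy feedback (assumed exponential latency).
        latency = rng.expovariate(1.0 / mean_latencies[server])
        reward = -latency

        # Incremental update of the running average for the chosen server.
        counts[server] += 1
        estimates[server] += (reward - estimates[server]) / counts[server]

    return counts, estimates


counts, estimates = epsilon_greedy_routing([1.0, 0.5, 2.0])
print(counts, estimates)
```

With `epsilon` set to 0, the learner would only exploit and could lock onto a poor server after a few unlucky samples. A larger `epsilon` gathers more information but sends more requests to servers already known to be slow.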

“Similar to how humans learn from experience, reinforcement learning agents learn through trial and error,” says Xie, who became interested in reinforcement learning after a computer program trained with the method made headlines in 2016 by defeating a master-level player in the Chinese board game Go. “To maximize the cumulative rewards over time, the agent must learn to associate immediate action with long-term outcomes.”

But for reinforcement learning to realize its potential in applications such as computer and communications networks, robotics and inventory control, its algorithms need sharpening to contend with environments that feature high levels of randomness and noise, long time horizons, and real systems far more complex than simplified training models.

Xie hopes her work can yield more efficient and resilient reinforcement learning algorithms and theories for problems involving single or multiple strategic decision-making agents.

Top photo by Joel Hallberg.