The language of machine learning was a decidedly foreign one to Hanna Barton as she sat in class at the University of Wisconsin-Madison in early September.
Barton, a graduate student in industrial and systems engineering, had signed up for Machine Learning in Action, a new topics course in the department, at the urging of her PhD adviser, Harvey D. Spangler Assistant Professor Nicole Werner. But, having taken only one course in the programming language Python as an undergraduate in biomedical engineering, Barton felt unnerved as Assistant Professor Justin Boutilier began the first lecture.
Eight weeks later, she found herself at the annual international meeting of the Human Factors and Ergonomics Society in Seattle, easily conversing with researchers who employ machine learning in their work.
“It’s probably the largest jump in my proficiency in a class,” says Barton, who’s studying human factors and health systems engineering.
In Machine Learning in Action, undergraduate and graduate students get an overview of popular machine learning methods and then put them to work on real datasets. The work gives them a taste of what they’re likely to encounter should they pursue careers in data science or, in the case of Barton and others, equips them with useful methods to apply in their own research.
Boutilier, who joined the College of Engineering in fall 2019, brought the course concept with him from the Massachusetts Institute of Technology, where he completed a postdoctoral research fellowship and taught a condensed version to master of business administration and master of supply chain management students in the Sloan School of Management.
Rather than concentrating heavily on the theory behind machine learning, Boutilier emphasizes tangible examples of problems that are ripe for solving with different methods. He introduces each technique with a case study, such as using decision trees to predict the decisions of U.S. Supreme Court justices or using logistic regression to draw crucial knowledge about cardiovascular disease from the landmark, longitudinal Framingham Heart Study.
Students try their hand at various methods—linear and logistic regression, clustering, classification, bagging, boosting and more—to ferret out insights from datasets that include life expectancies and potential contributing factors from all 50 states, rider data from the Boston bike share program and an assortment of information on plants related to the viability of invasive species.
By working through relevant questions using Python, the students get a firsthand look at the possibilities—and potential pitfalls—of machine learning and the level of rigor required to obtain accurate conclusions.
“This course is focused on the application of machine learning,” says Boutilier. “We don’t really do a lot of theory. My goal for the students is that by the end of this course, they could go do this.”
That’s been the case for Ebrahim Eldamnhoury, a graduate student in the Department of Civil and Environmental Engineering’s program in construction engineering and management. He’s using methods from the course in his research, which aims to predict the likelihood that construction projects will succeed.
“I knew a little bit of machine learning, some preliminary stuff like linear or logistic regression,” says Eldamnhoury, who’s consulted Boutilier for advice on research dilemmas. “Getting into this class gave me a wide range of other tools that I can use in my research.”
Boutilier also brings his own research on global health predicaments into the classroom. He’s previously applied a technique called random forest, essentially a large collection of decision trees whose predictions are combined, to predict ambulance travel times in the densely populated city of Dhaka, Bangladesh. And he’s used a host of machine learning methods in his work predicting diabetes and hypertension risk in lower-income populations in India.
“I hope it just shows students how these methods can actually be used in interesting problems, and that they know enough to go apply them,” he says. “They basically know enough to do exactly what I did.”
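For readers curious what a forest of decision trees looks like in practice, here is a minimal sketch in Python, the language used in the course. It is not Boutilier's model or course material. The data (trip distance in kilometers, a rush-hour flag and travel time in minutes) are made up for illustration, the function names `fit_stump`, `fit_forest` and `predict` are hypothetical, and each tree is limited to a single split to keep the example short. The core idea is the same as in a full random forest: train many simple trees on random resamples of the data, each looking at a randomly chosen feature, and average their predictions.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree (a "stump") on a single feature."""
    best = None
    values = sorted({row[feature] for row in rows})
    # Try a threshold halfway between each pair of neighboring values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        # Sum of squared errors if each side predicts its own mean.
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # The feature is constant in this sample, so predict the overall mean.
        overall = mean(targets)
        return (feature, float("inf"), overall, overall)
    _, threshold, left_mean, right_mean = best
    return (feature, threshold, left_mean, right_mean)


def fit_forest(rows, targets, n_trees=50, seed=0):
    """Train many stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw as many rows as the dataset has, with replacement.
        picks = [rng.randrange(len(rows)) for _ in rows]
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in picks],
                                [targets[i] for i in picks],
                                feature))
    return forest


def predict(forest, row):
    """Average the predictions of every stump in the forest."""
    return mean(left if row[feature] <= threshold else right
                for feature, threshold, left, right in forest)


# Made-up training data: (distance_km, rush_hour) -> travel time in minutes.
rows = [(distance, rush) for distance in range(1, 11) for rush in (0, 1)]
targets = [3.0 * distance + 10.0 * rush for distance, rush in rows]

forest = fit_forest(rows, targets)
short_trip = predict(forest, (2, 0))  # short trip, off-peak
long_trip = predict(forest, (9, 1))   # long trip, rush hour
print(f"short trip: {short_trip:.1f} min, long trip: {long_trip:.1f} min")
```

In real work, a library implementation such as scikit-learn's `RandomForestRegressor` would typically stand in for this hand-rolled version, since it grows deeper trees and handles many practical details this sketch leaves out.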