Someday soon, oil refineries may trade in crude oil for agricultural waste like corn stalks or renewable plants like switchgrass in order to produce sustainable biofuels. But we’re not there quite yet; converting those products into usable chemicals on a large scale requires efficient catalytic reactions, which researchers are still hunting for. Now, Conway Assistant Professor Reid Van Lehn and his colleagues in the Department of Chemical and Biological Engineering have found a way to use machine learning to speed up the search for suitable reaction conditions, which may help the era of biofuels come a little bit sooner.
One of the ways to convert lignocellulosic biomass into usable fuels is via acid-catalyzed reactions, which usually take place in water. It’s often a slow process, but research has shown that the addition of certain organic cosolvents can increase reaction rates 100-fold or more.
Finding which cosolvents work best, however, is a matter of trial and error. Over the past decade, Van Lehn’s group and researchers at the University of Minnesota have used molecular dynamics simulations to try to understand how the cosolvents change the system and increase catalytic reaction rates, finding that the amount of water around the reactant is directly related to the rate of the catalytic reaction.
Using that information, Van Lehn and his student Alex Chew, along with Baldovin-DaPra Associate Professor Victor Zavala and his students Shengli Jiang and Weiqi Zhang, decided to see if they could use a machine learning method called a convolutional neural network to identify which cosolvents were most likely to improve catalytic reaction rates by analyzing relationships between the positions of solvent molecules. “It turns out there’s a whole class of machine learning algorithms that are designed to recognize these types of spatial correlations,” Van Lehn says. “So, the question we’ve been exploring now is, given a set of water positions and the positions of other solvent molecules in the environment, can we predict any type of interaction related to these water molecules?”
To find out, Van Lehn’s team took experimental data on these catalytic reactions, produced by Richard L. Antoine Professor George Huber and Ernest Micek Distinguished Chair in Chemical and Biological Engineering James Dumesic, and used it to train the neural network, which they named SolventNet, to look at the positions of the water molecules and infer reactivity based only on that factor.
SolventNet, they found, is very accurate at predicting which cosolvents produce the best results. That accuracy is especially surprising since the system does not take into account the most important part of the reaction, the catalyst. “The catalyst is not there. We don’t model the transition state at all,” Van Lehn says. “And yet, encoded within the solvent environment is still enough information for accurate rate predictions.”
Van Lehn estimates that this new method of assessing cosolvents could reduce the computational cost of predicting reaction rates by a factor of 200. “And what this means is that now we can screen a huge number of solvent systems to try to find optimal conditions that improve reaction rates,” he says.
The entire project, he says, is an outgrowth of UW-Madison’s strength in machine learning. Chew took a class on machine learning and, as a project for the class, applied its principles to acid-catalyzed reactions, then brought the idea to Van Lehn’s and Zavala’s groups. “I think they got a high score on the project,” Van Lehn says. “I would hope so, because now it’s getting published as it’s under review at a top journal.”
Van Lehn says other projects he’s working on, and many other problems in engineering, can benefit from machine learning. Colleagues like Zavala as well as experts in electrical and computer engineering and computer science are developing new ways to apply machine learning to research. “It’s very easy for us in engineering to partner with them,” says Van Lehn, “especially where we can bring in some of our computational expertise and physical insight and then identify appropriate algorithms.”