Machine learning systems have become increasingly popular in the world of scientific research, and it’s easy to see why.
The algorithms – designed to examine large and complex datasets for patterns that could predict future outcomes – can save a lot of man-hours, and many hope they will even be able to find patterns that humans, through more traditional methods of data analysis, cannot.
Impressive, yes. But machine learning models are so complex that it's notoriously difficult even for their creators to explain their results. In fact, they've even been known to "cheat" in order to arrive at a tidy-looking answer.
Add that reality to the fact that many of the scientists now harnessing the technology are not machine learning experts, and you have a recipe for scientific disaster. As Princeton professor Arvind Narayanan and his doctoral student Sayash Kapoor explained to Wired, a surprising number of scientists using these systems make serious methodological errors — and if the trend continues, the ripple effects in academia could be severe.
According to Wired, the duo grew concerned when they came across a political science study that, using a machine learning model, claimed it could predict the outbreak of the next civil war with a staggering 90 percent accuracy. But when Narayanan and Kapoor took a closer look, they found the paper was riddled with false results — a consequence, the researchers say, of something called "data leakage."
In short, data leakage happens when a machine learning model is trained on data it shouldn't have access to — typically information from the very test set used to evaluate it. This usually happens when users mismanage their data pools, distorting how the model "learns" and inflating its apparent accuracy.
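To make the idea concrete, here is a minimal, hypothetical sketch (not taken from the civil war study) of one common form of leakage: computing a preprocessing statistic — here, a simple feature mean — over the full dataset before splitting off the test data, so that the training features secretly encode information about the held-out points.

```python
# Hypothetical toy data; the last value is our held-out test point.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

# LEAKY pipeline: the mean is computed over ALL points, test point included,
# so the "training" features are contaminated by test-set information.
leaky_mean = sum(data) / len(data)
leaky_train = [x - leaky_mean for x in data[:4]]

# CLEAN pipeline: the mean is computed over the training split only.
train = data[:4]
train_mean = sum(train) / len(train)
clean_train = [x - train_mean for x in train]

print(leaky_mean)   # dragged upward by the unseen test outlier
print(train_mean)   # computed from training data alone
```

Running this, the leaky mean (22.0) is wildly different from the training-only mean (2.5) — the test outlier has bled into the preprocessing step. In a real pipeline, the same mistake lets a model look far more accurate on the test set than it would be on genuinely new data.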
After discovering the data leakage in the civil war paper, the Princeton researchers began looking for similar machine learning errors in other published studies — and the results, which they published in an article of their own that has not yet been peer-reviewed, were striking. They found data leakage in a grand total of 329 papers across a number of fields, including medicine and the social sciences.
“They claimed near-perfect accuracy, but we found that in each of those cases there was an error in the machine learning pipeline,” Kapoor told the magazine.
As they explain in the article, the proliferation of machine learning is fueling what they call a "reproducibility crisis," which essentially means that the results of a study cannot be reproduced by researchers who follow up on it.
The claim raises the specter that a sequel could be looming to another serious replication crisis that has rocked the scientific establishment over the past decade, in which researchers misused statistics to reach sweeping conclusions that represented nothing more than statistical noise in large datasets.
If this stands up to closer scrutiny, it would be an extremely concerning revelation. Dead spider robots aside, most research isn't done for its own sake. The goal of most science is to eventually apply it to something, whether that means informing immediate action or guiding future study. A mistake anywhere in an information pipeline will often propagate into errors further down the road — and as probably goes without saying, that could have some pretty devastating consequences.
According to Wired, Narayanan and Kapoor believe the prevalence of machine learning errors in scientific research can be attributed to two things: the hype surrounding the systems and the lack of training provided to those who use them. The AI industry has been commercializing machine learning software that promises ever-increasing levels of ease and efficiency — and as Narayanan and Kapoor point out, that's not necessarily a good thing.
“The idea that you can take a four-hour online course and then use machine learning in your scientific research has become so overblown,” Kapoor says. “People haven’t stopped to think about where things can potentially go wrong.”
Of course, scientists can make mistakes without AI. It also doesn't help that machine learning can seem quite difficult to challenge, especially when ease and efficiency are part of the sales pitch — after all, it's just calculations, isn't it? But as things stand, it looks like researchers are making serious mistakes not just with machine learning, but because of it.
That's not to say that AI can't be useful for scientific study. We're sure that in many cases it has been, and it will probably continue to be. Clearly, though, researchers using it need to be careful and honestly ask themselves whether they know what they're doing. Because at the end of the day, it's not machine error — it's human error.
To echo all the math teachers: maybe show your work next time.
READ MORE: Botched use of machine learning is causing a ‘reproducibility crisis’ in science [Wired]
Learn more about machine learning: Ambitious researchers want to use AI to talk to all animals