Remodelling machine learning: An AI that thinks like a scientist

Modern science generates vast amounts of data, far more than humans can process on their own. This is why scientists often rely on computational methods such as machine learning, a type of artificial intelligence, to crunch big data. Machine learning is commonplace today, and is even used in applications like Spotify and Netflix to help predict your media preferences. But there’s a problem. Traditional approaches to machine learning, such as deep learning, are focused on classification and statistical correlation. This makes them very useful for recognising and comparing patterns in things like audio or images, but not very good at discovering what might have generated the patterns in the first place. Understanding these so-called generating mechanisms, in other words the dynamics of cause and effect, is a fundamental challenge across all scientific fields, from decoding how a complex biological cell arises from DNA, to figuring out how a thunderstorm forms from differences in air temperature and humidity.

Such tasks require abilities such as inference or reasoning, which most artificial intelligences are not designed for. To tackle these challenges, one team of scientists is developing a revolutionary new form of machine learning, which they’ve called causal deconvolution by algorithmic generative models. In essence, this means breaking down complicated and interconnected datasets to look at the underlying mechanisms, or code, that generate them. It aims to find the individual underlying algorithms that give rise to a dataset, in a sense treating natural phenomena as computer programmes generating observed data.
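
To make the idea of "a computer programme generating observed data" concrete, here is a minimal Python sketch. It is only an illustration and is not taken from the researchers' work: the choice of an elementary cellular automaton, and the names step and run, are assumptions made for this example. It shows how a rule that fits in a single small number can produce a large, intricate-looking dataset.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    Each cell looks at itself and its two neighbours (with wrap-around
    edges); the three bits select which bit of `rule` gives the new value.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(rule, width=31, steps=15):
    """Start from a single 'on' cell in the middle and collect every row."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


def render(rows):
    """Turn the rows into text so the generated pattern can be inspected."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)
```

Calling print(render(run(30))) draws the pattern produced by rule number 30. The "dataset" is the grid of cells; the "generating mechanism" is just the number 30 plus the update procedure. Working backwards from a grid like this to the rule that made it is, loosely, the kind of question the new approach is asking.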

We can think about this with a visual analogy. If we have a table with several objects on it and we lay a cloth over it, we’re presented with a unique 3D terrain, and this represents a mass of collected data. Some of it relates to the objects underneath and some doesn’t. Modern machine learning, which is based on classical statistics, is good at identifying objects by comparing features such as shape to previous examples it’s been shown. But what it can’t do is tell us much about how the objects underneath were formed.

The aim of this new methodology is first to work out which bits of the terrain relate to the different objects (this is the deconvolution part) and then to identify the objects by looking at how they might be generated. This is done by testing lots of possible algorithms that could generate a coffee cup, a hat or a book, and looking at how these could influence the tablecloth arrangement. So unlike traditional machine learning, it attempts to infer how the objects might have been generated to match the observed data. But this isn’t just intended to work on imaginary objects under a tablecloth. It’s hoped that it can be applied to help scientists unravel the dynamics of cause and effect in fields such as genetics and cell biology, for example in understanding the complicated gene interactions that give rise to cancer. That’s because this methodology moves away from classification and towards questions of causation, providing insights and models to help us understand the underlying mechanisms behind observed data.
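
The two steps described above, splitting the data into parts and then testing candidate generators against each part, can be sketched as a toy in Python. This is not the researchers' actual method: the three candidate generators, the small parameter range, the greedy splitting strategy and the names find_generator and deconvolve are all assumptions invented for illustration. A sequence of numbers stands in for the tablecloth terrain.

```python
from itertools import product

# Hypothetical candidate "generating mechanisms": each is a tiny programme
# that produces n values from two small integer parameters a and b.
def constant(n, a, b):
    return [a] * n

def arithmetic(n, a, b):
    return [a + b * i for i in range(n)]

def geometric(n, a, b):
    return [a * b ** i for i in range(n)]

CANDIDATES = {"constant": constant, "arithmetic": arithmetic, "geometric": geometric}


def find_generator(observed, param_range=range(0, 6)):
    """Return (name, a, b) for the first candidate that reproduces
    `observed` exactly, or None if no candidate does."""
    observed = list(observed)
    for name, generate in CANDIDATES.items():
        for a, b in product(param_range, repeat=2):
            if generate(len(observed), a, b) == observed:
                return name, a, b
    return None


def deconvolve(observed, min_len=3):
    """Greedily split `observed` into segments, each explained by one
    candidate generator. Points nothing explains are kept as single-value
    segments with generator None (treated as noise)."""
    observed = list(observed)
    segments, start = [], 0
    while start < len(observed):
        # Try the longest remaining stretch first, shrinking until one fits.
        for end in range(len(observed), start + min_len - 1, -1):
            match = find_generator(observed[start:end])
            if match is not None:
                segments.append((observed[start:end], match))
                break
        else:
            end = start + 1
            segments.append((observed[start:end], None))
        start = end
    return segments
```

For example, deconvolve([2, 5, 8, 11, 3, 6, 12, 24]) splits the sequence into [2, 5, 8, 11], explained by ("arithmetic", 2, 3), and [3, 6, 12, 24], explained by ("geometric", 3, 2). The point of the toy is the shape of the answer: instead of a label such as "looks like a cup", each part of the data comes back paired with a small programme that could have produced it.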

So, as machine intelligence continues to develop, researchers may become increasingly reliant on AIs that think less like machines and more like scientists.
