
AI doesn't really learn! New research reveals a way to decipher the black box of artificial intelligence

WBOY
Release: 2024-01-16 21:30:31

Artificial intelligence (AI) has been developing rapidly, but to humans, powerful models remain a "black box."

We don't understand a model's inner workings or the process by which it reaches its conclusions.

However, recently, Professor Jürgen Bajorath, a chemical informatics expert at the University of Bonn, and his team have made a major breakthrough.

They have designed a technique that reveals how some artificial intelligence systems used in drug research operate.

Research shows that artificial intelligence models predict drug effectiveness primarily by recalling existing data, rather than by learning specific chemical interactions.

In other words, the AI's predictions amount to piecing together remembered training data; the machine learning models do not actually learn!

Their research results were recently published in the journal Nature Machine Intelligence.


Paper link: https://www.nature.com/articles/s42256-023-00756-9

In the field of medicine, researchers are feverishly searching for effective active substances to fight disease: which drug molecules are the most effective?

Typically, these effective molecules (compounds) are docked to proteins, which act as enzymes or receptors that trigger specific physiological chains of action.

In special cases, certain molecules are also responsible for blocking adverse reactions in the body, such as excessive inflammatory responses.

The number of possible compounds is huge, and finding the one that works is like looking for a needle in a haystack.

So researchers first use AI models to predict which molecules will dock best and bind most strongly to their respective target proteins. These drug candidates are then screened in more detail in experimental studies.


Since the development of artificial intelligence, drug discovery research has increasingly adopted AI-related technologies.

For example, graph neural networks (GNNs) are well suited to predicting how strongly a molecule binds to a target protein.

A graph consists of nodes representing objects and edges representing relationships between those nodes. In the graph representation of a protein-ligand complex, edges connect protein nodes or ligand nodes, representing the structure of a substance, or connect protein and ligand nodes, representing the interactions between a protein and a ligand.
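To make the graph representation concrete, here is a minimal sketch of how a protein-ligand complex might be encoded, assuming Python and the PyTorch Geometric library (neither is named in the article); the node features and edges are toy placeholders, not data from the study.

```python
import torch
from torch_geometric.data import Data

# Toy complex: nodes 0-2 are protein atoms, nodes 3-4 are ligand atoms.
# Each node carries a two-dimensional one-hot feature: [protein, ligand].
x = torch.tensor([
    [1.0, 0.0],  # protein atom
    [1.0, 0.0],  # protein atom
    [1.0, 0.0],  # protein atom
    [0.0, 1.0],  # ligand atom
    [0.0, 1.0],  # ligand atom
])

# Undirected edges, stored as two directed edges each: bonds 0-1 and 1-2
# inside the protein, bond 3-4 inside the ligand, and a single
# protein-ligand interaction edge 1-3.
edge_index = torch.tensor([
    [0, 1, 1, 2, 3, 4, 1, 3],
    [1, 0, 2, 1, 4, 3, 3, 1],
])

complex_graph = Data(x=x, edge_index=edge_index)
```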

GNN models use protein-ligand interaction maps extracted from X-ray structures to predict ligand affinities.
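The article does not say which six architectures were tested, so the following is only a generic sketch of a GNN affinity model under the same PyTorch Geometric assumption: message-passing layers propagate information along the graph's edges, and a pooled readout is mapped to one predicted affinity value per complex.

```python
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class AffinityGNN(nn.Module):
    """Two message-passing layers followed by a pooled regression head."""
    def __init__(self, in_dim=2, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index).relu()
        g = global_mean_pool(h, batch)   # one vector per complex
        return self.head(g).squeeze(-1)  # predicted binding affinity
```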

Professor Jürgen Bajorath said that GNN models are like a black box to us: we have no way of knowing how they derive their predictions.


Professor Jürgen Bajorath works at the LIMES Institute of the University of Bonn, the Bonn-Aachen International Center for Information Technology, and the Lamarr Institute for Machine Learning and Artificial Intelligence.

How does artificial intelligence work?

Researchers from the chemical informatics department of the University of Bonn, together with colleagues from Sapienza University of Rome, analyzed in detail whether graph neural networks really learn the interactions between proteins and ligands.

The researchers analyzed a total of six different GNN architectures using their specially developed "EdgeSHAPer" method.

The EdgeSHAPer program can determine whether the GNN has learned the most important interactions between compounds and proteins, or made predictions through other means.
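EdgeSHAPer builds on Shapley values from game theory: each edge is treated as a "player," and its score is its average contribution to the prediction. The published method has its own sampling scheme; the sketch below only illustrates the general idea with a naive Monte Carlo Shapley estimate over edges, reusing the toy model and graph from above.

```python
import torch

def edge_shapley(model, x, edge_index, batch, n_samples=100):
    """Naive Monte Carlo Shapley estimate per edge: for random orderings
    of the edges, record how the prediction changes as each edge is
    added. Costs n_samples * n_edges model calls, so illustration only."""
    n_edges = edge_index.size(1)
    scores = torch.zeros(n_edges)
    with torch.no_grad():
        for _ in range(n_samples):
            mask = torch.zeros(n_edges, dtype=torch.bool)
            prev = model(x, edge_index[:, mask], batch)  # edgeless baseline
            for e in torch.randperm(n_edges):
                mask[e] = True
                cur = model(x, edge_index[:, mask], batch)
                scores[e] += (cur - prev).item()
                prev = cur
    return scores / n_samples  # high score = edge drives the prediction

# Example: attribute the toy complex's prediction to its individual edges.
batch = torch.zeros(complex_graph.num_nodes, dtype=torch.long)
scores = edge_shapley(AffinityGNN(), complex_graph.x,
                      complex_graph.edge_index, batch)
```

Edges with high scores are the ones the model actually relied on; in the study, these were compared against the known protein-ligand interactions.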

The scientists trained the six GNNs on graphs extracted from the structures of protein-ligand complexes for which the compound's mode of action and binding strength to the target protein were known.

Then they tested the trained GNNs on other compounds and used EdgeSHAPer to analyze how the GNNs produced their predictions.
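Wiring the pieces above together, the train-then-attribute workflow might look like the following sketch; the datasets here are toy placeholders, not the study's protein-ligand data.

```python
import torch
from torch_geometric.loader import DataLoader

# Placeholder data: in the study, these would be graphs extracted from
# protein-ligand complexes with experimentally known affinities in data.y.
complex_graph.y = torch.tensor([7.2])  # toy affinity label
train_set, test_set = [complex_graph], [complex_graph]

model = AffinityGNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

for epoch in range(50):
    for data in DataLoader(train_set, batch_size=32, shuffle=True):
        optimizer.zero_grad()
        loss = loss_fn(model(data.x, data.edge_index, data.batch), data.y)
        loss.backward()
        optimizer.step()

# Attribute each held-out prediction to its edges, mirroring the analysis.
for data in DataLoader(test_set, batch_size=1):
    edge_scores = edge_shapley(model, data.x, data.edge_index, data.batch)
```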

“If GNNs behave as expected, they need to learn the interactions between compounds and target proteins and make predictions by prioritizing specific interactions.”

However, according to the research team's analysis, the six GNNs largely failed to do this. Most of the GNNs learned only a few protein-drug interactions and focused mainly on the ligands.

[Figure: EdgeSHAPer edge-attribution results for the six GNNs]

The figure above shows the experimental results for the six GNNs. The color-coded bars show, for the top 25 edges of each prediction as determined with EdgeSHAPer, the average proportions attributable to the protein, the ligand, and protein-ligand interactions.

We can see that the interactions, shown in green, are what the models should be learning, yet they account for only a small share across the experiments, while the ligand edges, shown in orange, make up the largest proportion.

To predict how strongly a molecule binds to a target protein, the models primarily "remember" chemically similar molecules encountered during training, along with their binding data, regardless of the target protein. These remembered chemical similarities essentially determine the predictions.
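To see what such similarity-driven "memorization" amounts to, here is a hedged sketch of a deliberately simple, protein-free baseline: predict a test compound's affinity from its most chemically similar training compounds, assuming the RDKit library (the SMILES strings and affinity values are invented placeholders).

```python
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs

def fingerprint(smiles):
    """Morgan fingerprint of a molecule as a fixed-length bit vector."""
    return AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(smiles), 2, nBits=2048)

def knn_affinity(test_smiles, train_smiles, train_affinities, k=3):
    """Average the affinities of the k most Tanimoto-similar training
    compounds; no information about the target protein is used at all."""
    fp = fingerprint(test_smiles)
    sims = [DataStructs.TanimotoSimilarity(fp, fingerprint(s))
            for s in train_smiles]
    top = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)[:k]
    return sum(train_affinities[i] for i in top) / k

# Toy usage: aspirin-like query against three placeholder training compounds.
train = ["CC(=O)Oc1ccccc1C(=O)O", "c1ccccc1O", "CCO"]
print(knn_affinity("c1ccccc1C(=O)O", train, [6.1, 4.8, 3.2]))
```

If a baseline like this matches a GNN's accuracy, the GNN's apparent "learning" is hard to distinguish from memorized chemical similarity, which is the study's point.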


This is reminiscent of the "Clever Hans effect," named after the horse that appeared to be able to count but in fact inferred the expected answer from subtle differences in its handler's facial expressions and gestures.

This may mean that the so-called "learning ability" of GNNs is untenable and that the models' predictive power has been largely overestimated, because predictions of the same quality can be made using simpler methods and chemical knowledge.

However, the study also revealed another phenomenon: as the potency of the test compounds increases, the models tend to learn more interactions.

Perhaps by modifying the representation and training techniques, these GNNs can be further improved in the desired direction. However, the assumption that physical quantities can be learned from molecular graphs should generally be treated with caution.

"Artificial intelligence is not black magic."
