Predicting optimal options for synthesizing drug molecules using geometric deep learning methods, paving the way for new drug discovery


Late-stage functionalization is an economical way to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage functionalization challenging.

To address this problem, researchers from the University of Munich, ETH Zurich, and Roche Basel collaborated to develop a late-stage functionalization platform based on geometric deep learning and high-throughput reaction screening.

Because borylation is one of the key steps in functionalization, the researchers used computational models to predict reaction yields under different reaction conditions, with a mean absolute error of 4-5%. The models classified whether new reactions would work for known and unknown substrates with accuracies of 92% and 67%, respectively. The regioselectivity of the major product was captured with a classifier F-score of 67%. When applied to 23 different commercial drug molecules, the platform identified many opportunities for structural diversification.
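
As a rough illustration of how the three kinds of figures quoted above are computed, the sketch below evaluates a mean absolute error, an accuracy, and an F-score on small invented data sets. The numbers and variable names are assumptions made for this example and are not taken from the study.

```python
# Toy illustration of the metrics quoted above. All data are invented.

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Reaction yields in percent (hypothetical).
measured_yield = [10.0, 35.0, 60.0, 0.0]
predicted_yield = [14.0, 30.0, 55.0, 2.0]
yield_mae = mean_absolute_error(measured_yield, predicted_yield)

# Binary reaction outcome: 1 = product observed, 0 = no product (hypothetical).
reaction_worked = [1, 0, 1, 1, 0]
predicted_worked = [1, 0, 0, 1, 0]
outcome_accuracy = accuracy(reaction_worked, predicted_worked)

# Per-atom regioselectivity labels: 1 = borylation site (hypothetical).
is_site = [1, 0, 0, 1, 0, 0]
predicted_site = [1, 0, 1, 0, 0, 0]
site_f_score = f_score(is_site, predicted_site)

print(yield_mae, outcome_accuracy, site_f_score)
```

The F-score is the natural choice for the regioselectivity task because reactive sites are rare among all atoms in a molecule, so plain accuracy would look high even for a model that never predicts a site.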

The study, titled "Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning", was published in the journal Nature Chemistry on November 23, 2023.


LSF projects play an important role in medicinal chemistry research

The novelty and complexity of target structures make their synthesis challenging when the aim is to establish structure-activity relationships in medicinal chemistry. Structure-activity relationship models can guide hit-to-lead and lead optimization programs to improve the pharmacological activity and physicochemical properties of drug candidates. Efficient synthesis is therefore crucial for exploring structure-activity relationships, and it is often the bottleneck of the design-make-test-analyze cycle.

Many methods exist for activating and modifying C-H bonds to achieve late-stage functionalization (LSF) of organic scaffolds, ranging from molecular building blocks to advanced drug molecules. Numerous catalytic systems offer directed and undirected approaches, as well as chemo- and site-selective access to modified analogues.

Among the numerous LSF methods, C-H borylation is considered the most commonly used for rapid compound diversification. Organoboron compounds can be converted into a variety of functional groups and serve as reliable handles for subsequent C-C bond coupling reactions, enabling extensive structure-activity relationship studies.

However, only a few applications of LSF in drug discovery have been reported so far, and most of these reports focus on a single LSF reaction type. Direct LSF of multiple types of C-H bonds with different bond strengths, electronic properties, and steric and functional group environments remains challenging. Furthermore, LSF projects are often time- and resource-intensive, which is at odds with the tight schedules and limited resources of many medicinal chemistry projects.


Illustration: Overview of the borylation diversification study. (Source: paper)

Artificial Intelligence-Supported Late-Stage Functionalization (LSF)

High-throughput experimentation (HTE) is an established method for reaction optimization. It enables semi-automated, miniaturized, low-volume screening, so that many transformations can be run in parallel, quickly and reproducibly, using only small amounts of precious building blocks and consumables. Combined with FAIR (Findability, Accessibility, Interoperability, Reusability) documentation, which yields high-quality datasets covering both successful and failed reactions, HTE lays the foundation for advanced data analytics and machine learning to unlock LSF for drug discovery.
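
One way to picture documentation that keeps failed reactions alongside successful ones is a single record schema shared by both. The sketch below is only an illustration of that idea: the class name, field names, and placeholder values are hypothetical and do not reflect the schema used in the study.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ReactionRecord:
    """Hypothetical record for one screening well (illustrative fields only)."""
    substrate_id: str
    catalyst: str
    ligand: str
    solvent: str
    temperature_c: float
    yield_percent: Optional[float]  # None if the yield was not quantified
    succeeded: bool                 # failed reactions are stored, not dropped

records = [
    ReactionRecord("drug_A", "catalyst_1", "ligand_1", "solvent_1", 80.0, 42.0, True),
    ReactionRecord("drug_A", "catalyst_1", "ligand_2", "solvent_1", 80.0, 0.0, False),
]

# Serialize to a plain, tool-independent format and read it back.
serialized = json.dumps([asdict(r) for r in records], indent=2)
restored = [ReactionRecord(**entry) for entry in json.loads(serialized)]

failed = [r for r in restored if not r.succeeded]
print(f"{len(restored)} records, {len(failed)} failed")
```

Keeping negative results in the same structure means a model trained on the data sees conditions that do not work as well as those that do.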

Graph neural networks (GNNs) have been widely used for molecular feature extraction and property prediction. Among the machine learning methods developed for chemical reaction planning, GNNs have been successfully applied to retrosynthetic planning, regioselectivity prediction, and reaction product prediction. Transformers and fingerprint-based methods have also been developed to solve similar problems.

Research has shown that the outcome of competing reactions can be accurately predicted by learning the geometry of the transition state. Predictions of the regioselectivity of reactions driven by electronic effects can be improved using density functional theory (DFT) and graph representations augmented with atomic partial charges. Combining graph machine learning with high-throughput experimentation (HTE) makes it possible to optimize conditions for C-H activation reactions of organic substrates. Some studies have focused on deep learning models of transition states that can predict reaction outcomes, including enantioselectivity in some cases. However, these methods have been limited to small molecular structures and relatively small data sets, which makes it challenging to apply such models to more structurally complex drug-like molecules. According to the literature, the regioselectivity of iridium-catalyzed borylation reactions can be predicted with a hybrid machine learning model enhanced with quantum chemical information about transition states. However, the impact of steric and electronic effects on the performance of models for C-H activation reactions, and on regioselectivity prediction in molecules with multiple aromatic ring systems, has not yet been explored.

Automated LSF Borylation Screening with Geometric Deep Learning

Researchers from the University of Munich, ETH Zurich, and Roche Basel present an automated LSF borylation screening method, combined with geometric deep learning, for identifying late-stage hit and lead diversification opportunities. Deep learning was used to predict the reaction outcome, yield, and regioselectivity of LSF on complex drug molecules.


Illustration: Overview of the method. (Source: paper)

With this approach, the number of laboratory experiments could be significantly reduced, improving the efficiency and sustainability of chemical synthesis.

In the first step, the researchers thoroughly analyzed the published literature to select reaction conditions suitable for high-throughput screening, along with substrates whose properties are relevant to late-stage hit and lead diversification in drug discovery. The reaction conditions were chosen based on a manually curated data set drawn from 38 publications.

The LSF substrates were selected through a cluster analysis of 1,174 approved drugs, which yielded 23 structurally diverse drug molecules. This allows researchers to work with relevant examples of reaction conditions and substrates in an "informer library" approach, rather than relying solely on idealized substrates and fragments of limited applicability when optimizing the synthesis of lead compounds.
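
The article does not say which clustering algorithm was used. As a minimal sketch of how a cluster analysis can reduce a large library to a few structurally diverse representatives, the example below applies a simple leader (sphere-exclusion) clustering with Tanimoto similarity to toy fingerprints. The fingerprints, the molecule names, and the 0.5 threshold are all assumptions made for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def leader_clustering(fingerprints, threshold=0.5):
    """Assign each molecule to the first leader it resembles, else make it a leader.

    Returns a dict mapping each leader (cluster representative) to its members.
    """
    clusters = {}
    for name, fp in fingerprints.items():
        for leader in clusters:
            if tanimoto(fp, fingerprints[leader]) >= threshold:
                clusters[leader].append(name)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters

# Toy library: each molecule is a set of fingerprint bit indices (invented).
toy_library = {
    "mol_1": {1, 2, 3, 4},
    "mol_2": {1, 2, 3, 5},   # similarity to mol_1: 3/5 = 0.6
    "mol_3": {10, 11, 12},
    "mol_4": {10, 11, 13},   # similarity to mol_3: 2/4 = 0.5
    "mol_5": {20, 21},
}

clusters = leader_clustering(toy_library, threshold=0.5)
representatives = list(clusters)
print(representatives)  # ['mol_1', 'mol_3', 'mol_5']
```

Picking one molecule per cluster gives a small set that still spans the structural variety of the full library, which is the purpose of reducing 1,174 drugs to 23 substrates.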

In the second step, the researchers generated experimental data sets using semi-automated high-throughput experimentation (HTE). The reaction data for the selected drug molecules and reaction conditions provide high-quality data for the subsequent machine learning of reaction outcomes.

Finally, the researchers trained different graph neural network (GNN) models on two-dimensional, three-dimensional, and atomic partial charge-augmented molecular graphs to predict binary reaction outcomes, reaction yields, and regioselectivities. "Interestingly, the predictions improved when we considered three-dimensional information about the starting material rather than just its two-dimensional chemical formula," said Kenneth Atz, a PhD student at ETH Zurich. The method has been used successfully to identify positions in existing active ingredients where additional reactive groups can be introduced. This will help researchers develop new, more effective variants of known pharmaceutical active ingredients more quickly.
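
To make the difference between the three graph variants concrete, the sketch below builds a toy three-atom graph and runs one round of neighbour aggregation, optionally weighting messages by inverse interatomic distance (3D information) and appending a partial charge to each atom's features. This is not the architecture from the paper: the atoms, coordinates, charges, and the inverse-distance weighting are assumptions chosen for illustration, and a real GNN would add trainable weights and nonlinearities.

```python
import math

# Toy fragment: elements, coordinates (angstrom) and partial charges are invented.
atoms = [
    {"element": "C", "xyz": (0.0, 0.0, 0.0), "charge": -0.10},
    {"element": "C", "xyz": (1.4, 0.0, 0.0), "charge": 0.05},
    {"element": "N", "xyz": (2.1, 1.2, 0.0), "charge": -0.30},
]
bonds = [(0, 1), (1, 2)]  # 2D connectivity as pairs of atom indices
ELEMENTS = ["C", "N", "O"]

def node_features(atom, use_charge):
    """One-hot element encoding, optionally extended by the partial charge."""
    one_hot = [1.0 if atom["element"] == e else 0.0 for e in ELEMENTS]
    return one_hot + ([atom["charge"]] if use_charge else [])

def message_passing_step(atoms, bonds, use_3d=False, use_charge=False):
    """One round of sum aggregation over bonded neighbours.

    With use_3d=True each message is scaled by 1/distance, so geometry
    influences the updated atom features.
    """
    feats = [node_features(a, use_charge) for a in atoms]
    updated = [list(f) for f in feats]
    for i, j in bonds:
        weight = 1.0
        if use_3d:
            weight = 1.0 / math.dist(atoms[i]["xyz"], atoms[j]["xyz"])
        for k in range(len(feats[i])):
            updated[i][k] += weight * feats[j][k]
            updated[j][k] += weight * feats[i][k]
    return updated

graph_2d = message_passing_step(atoms, bonds)
graph_3d_q = message_passing_step(atoms, bonds, use_3d=True, use_charge=True)

print(graph_2d[1])      # central carbon sees one C and one N neighbour
print(graph_3d_q[1])    # same atom with distance weighting and charge feature
```

In the 2D variant two atoms with the same bonded neighbours get identical features, whereas the 3D and charge-augmented variant can tell them apart when their geometry or electronic environment differs.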

The paper is available at: https://www.nature.com/articles/s41557-023-01360-5

Related reports: https://techxplore.com/news/2023-11-Artificial intelligence paves the way for drugs.html
