Artificial intelligence: Machine learning approach for screening large database and drug discovery

doi:10.1016/j.antiviral.2023.105740

Antiviral Research

Volume 220, December 2023, 105740

https://doi.org/10.1016/j.antiviral.2023.105740

Abstract

Drug discovery research currently faces many difficulties, including the need to develop new drugs during disease outbreaks and drug resistance caused by rapidly accumulating mutations. Virtual screening is the most widely used method in computer-aided drug discovery. It is notably effective at screening large molecular databases for drug candidates. Recently, a number of web servers have been developed for quickly screening publicly accessible chemical databases. In a nutshell, deep learning algorithms and artificial neural networks have modernised the field. Machine learning and deep learning algorithms have been used in several drug discovery processes, including peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modelling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physicochemical activity. Although a wide variety of data-driven AI/ML tools are presently available, the majority have so far been developed in the context of non-communicable diseases such as cancer, and a number of obstacles have prevented their translation to the discovery of treatments for infectious diseases. This review discusses various aspects of AI and ML in the virtual screening of large databases. With an emphasis on antivirals as well as other diseases, it offers a perspective on the advantages, drawbacks, and hazards of AI/ML techniques in the search for innovative treatments.

Introduction

The vast research in the biological field has generated huge amounts of data. However, experts have questioned whether biological data are transmitted and transferred effectively enough to derive practical knowledge (Sanal et al., 2019). In pharmaceutical research, data are available in various forms, such as research outcomes, clinical data, and ethnic population-wise data. Discovering and developing a new drug costs on average $1 to $2 billion and takes about 15 years to reach the market. When research problems are taken into consideration, these data can be used to develop new drug treatments that are accurate and cost-effective (Sousa et al., 2019).

The data suggest that almost 90% of drug candidates fail in the clinical phase, which adds greatly to the cost of development. Because the chemical database is so vast, any technique that is quick, effective, and relatively inexpensive is valuable for analysing these molecules. Computer-aided drug discovery (CADD) involves many techniques, such as virtual screening (VS), pharmacophore mapping, and docking studies (Vamathevan et al., 2019a, Vamathevan et al., 2019b). VS is a computational technique for searching huge libraries of small molecules (ligands) for lead compounds that can bind to a molecular target. Traditional screening is tedious and laborious, so in recent years artificial intelligence (AI) has drawn attention due to its robust applications (Paul et al., 2021).
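As a rough illustration of what searching a ligand library can look like in practice, the following minimal sketch ranks library molecules by Tanimoto similarity to a known active compound, a common ligand-based approach. It is not a method described in this review. The fingerprints are hypothetical sets of "on" bit positions, and the names `tanimoto`, `screen_library`, and the 0.5 threshold are assumptions chosen for the example; a real workflow would compute fingerprints with a cheminformatics toolkit.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def screen_library(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the review
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, sim) for name, sim in scored if sim >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data: hypothetical fingerprints, not real molecules.
query_fp = frozenset({1, 4, 7, 9})
toy_library = {
    "ligand_A": frozenset({1, 4, 7, 9}),   # identical bits to the query
    "ligand_B": frozenset({1, 4, 7, 12}),  # 3 shared bits out of 5 in the union
    "ligand_C": frozenset({2, 3, 5}),      # no bits in common with the query
}

hits = screen_library(query_fp, toy_library)
```

With this toy data, `ligand_A` and `ligand_B` pass the threshold and `ligand_C` is discarded. The same filter-and-rank pattern scales to large libraries because each comparison is independent of the others.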

Researchers are developing innovative methods and algorithms to find suitable new molecules. Notably, artificial intelligence (AI), deep learning (DL), machine learning (ML), and computational chemistry are new approaches to drug discovery. Such methods can be used alone or in combination to create novel techniques that incorporate a diverse set of efficient algorithms and improve predictions (Mak and Pichika, 2019).

AI-based computational techniques are highly stable and effective when applied in pharmaceutics and health science. AI works by mimicking human performance in machine learning models (Chan et al., 2019). Computational models and machine learning algorithms play an important role in developing a new drug by predicting its various activities. AI can help screen and identify complex chemical structures, their properties, and their activity during the drug identification process. It has applications in every field of pharmaceutics (Vanommeslaeghe et al., 2015). Integrating AI and ML is more effective and generates accurate predictions for newly designed drugs. In recent years, various types of studies have provided deep atomic-level insights that help identify the sources of diseases, their functions, and other information (Ahuja, 2019).

Natural products are constantly being investigated in the development of new bioactive molecules with commercial applications, attracting scientific research efforts because of their pharmacophore-like structures, pharmacokinetic features, and distinct chemical space. Bioactive compounds are especially valuable for developing new drugs because of their physicochemical diversity and lower toxicity (A Bryce and H Hillier, 2014). Since ancient times, many bioactive compounds, such as pilocarpine, quinine, morphine, and artemisinin, have come from natural sources. They differ from synthetic compounds in various respects, such as their chiral centres and high percentage of oxygen, and their chemical space is highly diverse, containing different structural scaffolds compared with synthetic compound libraries. Bioactive compounds have now become a new target for drug synthesis and development because of their unique characteristics (Davenport and Kalakota, 2019). The conventional method of discovering new bioactive compounds involves several steps: collection of material, authentication of the sample, extraction, isolation, characterization of the isolated bioactive compounds, and finally a biological assay to determine their activity. Further steps include bioactive/lead optimization by chemical synthesis and structural improvement of pharmacodynamic and pharmacokinetic properties to increase biological activity. By contrast, CADD is efficient, economical, and less time-consuming than methods that rely only on in vitro and in vivo assays (Aguiar-Pulido et al., 2013).

Drug design by CADD follows two methods, structure-based drug design (SBDD) and ligand-based drug design (LBDD), both of which rely on algorithms, scoring functions, and force fields for ranking. Many computer programs and software packages have now been developed that run various algorithms to interpret the results of both methods using predefined scoring functions. However, predicting exact energy levels and force fields is a difficult task when screening possible drug molecules. Quantum physics can address this, increasing the efficiency of prediction in new drug discovery and reducing error (Zador et al., 2021). An advantage of CADD, and of molecular docking in particular, is its ability to screen a complete small-molecule database quickly and to show realistic interactions between hit molecules and macromolecules. The macromolecules are polymers of amino acids or nucleic acids, which can be predicted by algorithms that adjust the atoms in the macromolecules. Several freely available online tools, programs, and software packages can predict the properties of lead molecules. Recently, different classes of natural products such as alkaloids, flavonoids, steroids, and terpenoids have proven to be new drug candidates (Rayan et al., 2017). Apart from these phytochemical groups, metabolomics products, bacteria, cnidaria, and insects are also used in drug discovery. Currently, in silico analyses, artificial intelligence, and cheminformatics approaches have proved robust for selecting candidates of interest from large databases and analysing their bioactivity, pharmacodynamics, and pharmacokinetic properties (Newman and Cragg, 2016).
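To make the idea of ranking with a predefined scoring function more concrete, here is a minimal sketch of a toy empirical score applied to docked poses. It is an illustration only and does not reproduce any scoring function from the review or from a specific docking program. The `Pose` fields, the three weights, and the example ligands are all hypothetical, and the score is a simple weighted sum in which lower values mean more favourable predicted binding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Pose:
    """A docked ligand pose summarised by a few interaction counts (hypothetical)."""
    ligand: str
    hbonds: int                # protein-ligand hydrogen bonds
    hydrophobic_contacts: int  # hydrophobic atom pairs in contact
    rotatable_bonds: int       # crude proxy for entropy lost on binding


# Hypothetical weights: rewards are negative, penalties positive (lower is better).
W_HBOND = -1.0
W_HYDROPHOBIC = -0.2
W_ROTOR = 0.3


def score(pose: Pose) -> float:
    """Weighted sum of interaction terms; a toy stand-in for a scoring function."""
    return (
        W_HBOND * pose.hbonds
        + W_HYDROPHOBIC * pose.hydrophobic_contacts
        + W_ROTOR * pose.rotatable_bonds
    )


def rank(poses: List[Pose]) -> List[Pose]:
    """Order poses from most to least favourable predicted binding."""
    return sorted(poses, key=score)


# Toy poses: invented numbers, not results for real compounds.
poses = [
    Pose("ligand_A", hbonds=3, hydrophobic_contacts=10, rotatable_bonds=4),
    Pose("ligand_B", hbonds=1, hydrophobic_contacts=5, rotatable_bonds=2),
    Pose("ligand_C", hbonds=4, hydrophobic_contacts=2, rotatable_bonds=8),
]

ranked = rank(poses)
```

In this toy case `ligand_A` ranks first, followed by `ligand_C` and `ligand_B`. Real scoring functions use many more terms and calibrated weights, which is where the difficulty of predicting exact energies noted above comes in.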

In this review, we discuss current computational methods, developed in recent years, that use artificial intelligence, machine learning, and cheminformatics to screen large databases.

Section snippets

Conventional drug discovery

The drug discovery process focuses on identifying a molecule that is therapeutically effective in the treatment and management of diseases. Drug development begins when there is a lack of appropriate medical solutions available for the disease. Drug discovery and development is a lengthy, expensive, and complicated procedure that can take 10–12 years and cost billions to complete. Due to recent advancements in FDA guidelines in the last 40 years, the intricacy of drug development has been

Conclusion

The identification and optimization of lead compounds from large chemical libraries is a crucial step in drug discovery, one that is very time-consuming and requires substantial investment. The lead compound should possess all possible pharmacodynamic and pharmacokinetic properties and should bind with high affinity and specificity to a disease-associated target that can serve as a drug target. Application of computational tools and reliable scoring function that measures the

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (77)

A.P. Challa et al. Machine learning on drug-specific data to predict small molecule teratogenicity. Reprod. Toxicol. (2020)

H.S. Chan et al. Advancing drug discovery via artificial intelligence. Trends Pharmacol. Sci. (2019)

P. Jeffrey et al. Assessment of the blood–brain barrier in CNS drug discovery. Neurobiol. Dis. (2010)

J. Lin et al. Accurate prediction of potential druggable proteins based on genetic algorithm and Bagging-SVM ensemble classifier. Artif. Intell. Med. (2019)

C.A. Lipinski et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery (2012)

K.K. Mak et al. Artificial intelligence in drug development: present status and future prospects. Drug Discovery Today Technol. (2019)

S. Mignani et al. Present drug-likeness filters in medicinal chemistry during the hit and lead optimization process: how far can they be simplified? Drug Discovery Today (2018)

G. Opassi et al. The hitchhiker's guide to the chemical-biological galaxy. Drug Discov. Today (2018)

D. Paul et al. Artificial intelligence in drug discovery and development. Drug Discov. Today Technol. (2021)

W.P. Rodrigues et al. Whole-canopy gas exchanges in Coffea sp. is affected by supra-optimal temperature and light distribution within the canopy: the insights from an improved multi-chamber system. Sci. Hortic. (2016)

F. Spyrakis et al. Open challenges in structure-based virtual screening: receptor modeling, target flexibility consideration and active site water molecules description. Arch. Biochem. Biophys. (2015)

S. Techaoei et al. Chemical evaluation and antibacterial activity of novel bioactive compounds from endophytic fungi in Nelumbo nucifera. Saudi J. Biol. Sci. (2020)

N.J. Vickers. Animal communication: when I'm calling you, will you answer too? Curr. Biol. (2017)

Q. Wang et al. Identification of a novel protein arginine methyltransferase 5 inhibitor in non-small cell lung cancer by structure-based virtual screening. Front. Pharmacol. (2018)

B.M. Wingert et al. Improving small molecule virtual screening strategies for the next generation of therapeutics. Curr. Opin. Chem. Biol. (2018)

J.L. Wolfender et al. Innovative omics-based approaches for prioritisation and targeted isolation of natural products–new strategies for drug discovery. Nat. Prod. Rep. (2019)

R. A Bryce et al. Quantum chemical approaches: semiempirical molecular orbital and hybrid quantum mechanical/molecular mechanical techniques. Curr. Pharmaceut. Des. (2014)

V. Aguiar-Pulido et al. Evolutionary computation and QSAR research. Curr. Comput. Aided Drug Des. (2013)

Z. Ahmed et al. Artificial intelligence with multi-functional machine learning platform development for better healthcare and precision medicine. Database (2020)

A.S. Ahuja. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ (2019)

N.A. Alzahab et al. Hybrid deep learning (hDL)-based brain-computer interface (BCI) systems: a systematic review. Brain Sci. (2021)

S.I. Avram et al. A collection of bioselective flavonoids and related compounds filtered from high-throughput screening outcomes. J. Chem. Inf. Model. (2014)

Bhaskar SA, Rungta R, Route J, Nyberg E, Mitamura T. Sieg at mediqa 2019. Multi-task neural ensemble for biomedical...

M.J. Bradburn et al. Survival analysis Part III: multivariate data analysis–choosing a model and assessing its adequacy and fit. Br. J. Cancer (2003)

R.F. Bruns et al. Rules for identifying potentially reactive or promiscuous compounds. J. Med. Chem. (2012)

P.L. Cedoz et al. Methyl Mix 2.0: an R package for identifying DNA methylation genes. Bioinformatics (2018)

K.S. Da Costa et al. Exploring the potentiality of natural products from essential oils as inhibitors of odorant-binding proteins: a structure-and ligand-based virtual screening approach to find novel mosquito repellents. ACS Omega (2019)

A. Daina et al. A boiled-egg to predict gastrointestinal absorption and brain penetration of small molecules. ChemMedChem (2016)

T. Davenport et al. The potential for artificial intelligence in healthcare. Fut. Healthcare J. (2019)

A.B. Deore et al. The stages of drug discovery and development process. Asian J. Pharm. Res. Dev. (2019)

J. Drews. Drug discovery: a historical perspective. Science (2000)

E. Ferrero et al. In silico prediction of novel therapeutic targets using gene–disease association data. J. Transl. Med. (2017)

C. Gorgulla. An open-source drug discovery platform enables ultra-large virtual screens. Nature (2020)

T. Halgren. New method for fast and accurate binding-site identification and analysis. Chem. Biol. Drug Des. (2007)

J. Henry et al. Towards high-throughput chemo behavioural phenomics in neuropsychiatric drug discovery. Mar. Drugs (2019)

D. Horvath. A virtual screening approach applied to the search for trypanothione reductase inhibitors. J. Med. Chem. (1997)

B.J. Huffman et al. Natural products in the “marketplace”: interfacing synthesis and biology. J. Am. Chem. Soc. (2019)

J.P. Hughes et al. Principles of early drug discovery. Br. J. Pharmacol. (2011)
