Artificial intelligence: Machine learning approach for screening large database and drug discovery
Introduction
Extensive research in the biological sciences has generated enormous volumes of data. However, experts have questioned whether biological data are transmitted and transferred effectively enough to derive practical knowledge (Sanal et al., 2019). In pharmaceutical research, data are available in various forms, such as research outcomes, clinical data, and ethnic population-wise data. Discovering and developing a new drug costs on average $1 to $2 billion and takes about 15 years to reach the market. When applied to specific research problems, these data can be used to develop new drug treatments that are accurate and cost-effective (Sousa et al., 2019).
The data suggest that almost 90% of drug candidates fail in the clinical phase, which makes the overall process very costly. Because the chemical database is so vast, any technique that is quick, effective, and relatively inexpensive can be utilised to analyse these molecules. Many techniques are involved in computer-aided drug discovery (CADD), such as virtual screening (VS), pharmacophore mapping, and docking studies (Vamathevan et al., 2019a, Vamathevan et al., 2019b). VS is a computational technique for searching huge libraries of small molecules (ligands) for lead compounds that can bind to a molecular target. Traditional screening is tedious and laborious, so in recent years artificial intelligence (AI) has drawn attention due to its robust applications (Paul et al., 2021).
To find new suitable molecules, researchers are developing innovative methods and algorithms. In particular, AI, deep learning (DL), machine learning (ML), and computational chemistry are new approaches to drug discovery. Such methods can be used alone or in combination to create novel techniques that incorporate a diverse set of efficient algorithms and improve predictions (Mak and Pichika, 2019).
AI-based computational techniques are among the most stable and effective tools applied in pharmaceutics and health science. AI works by mimicking human performance through machine learning models (Chan et al., 2019). Computational models and machine learning algorithms play an important role in developing a new drug by predicting its various activities. AI can help screen and identify complex chemical structures, their properties, and their activity during drug identification. It has applications in every field of pharmaceutics (Vanommeslaeghe et al., 2015). The integration of AI and ML is more effective and generates accurate predictions for newly designed drugs. In recent years, various types of studies have provided deep atomic-level insights that help identify the sources of diseases, their functions, and other information (Ahuja, 2019).
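As a rough illustration of how a machine learning model can predict an activity from molecular properties, the sketch below uses a k-nearest-neighbour classifier. The choice of k-nearest neighbours, the function name `knn_predict`, the two-value descriptor vectors, and the activity labels are all assumptions made for this example; they are not taken from any study cited here.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# A labelled example: (descriptor vector, activity label).
Example = Tuple[Sequence[float], str]


def knn_predict(train: List[Example], query: Sequence[float], k: int = 3) -> str:
    """Predict a label for `query` by majority vote among its k nearest
    training examples, using Euclidean distance in descriptor space."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, made-up descriptor values (e.g. two scaled properties per
# molecule); real models would use measured or computed descriptors.
toy_train: List[Example] = [
    ((0.1, 0.2), "inactive"),
    ((0.2, 0.1), "inactive"),
    ((0.3, 0.3), "inactive"),
    ((0.8, 0.9), "active"),
    ((0.9, 0.8), "active"),
    ((0.7, 0.7), "active"),
]

prediction = knn_predict(toy_train, (0.75, 0.8), k=3)
print(prediction)
```

An odd `k` avoids tied votes when there are only two classes. In practice the descriptors would need scaling and the model would need validation on held-out compounds.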
Natural products are constantly being investigated in the development of new bioactive molecules with commercial applications. They attract scientific research efforts because of their pharmacophore-like structures, pharmacokinetic features, and distinct chemical space. Bioactive compounds are especially valuable for developing new drugs because of their physicochemical diversity and lower toxicity (Bryce and Hillier, 2014). Since ancient times, many bioactive compounds such as pilocarpine, quinine, morphine, and artemisinin have been in use. They differ from synthetic compounds in several respects, such as their chiral centres and high percentage of oxygen, and their chemical space is highly diverse, containing different structural scaffolds when compared with synthetic compound libraries. Because of these unique characteristics, bioactive compounds have become a new target for drug synthesis and development (Davenport and Kalakota, 2019). The conventional method of discovering new bioactive compounds involves several steps: collection of material, authentication of the sample, extraction, isolation, characterization of the isolated bioactive compounds, and finally a biological assay to assess their activity. Further steps include bioactive/lead optimization by chemical synthesis and structural improvement of pharmacodynamic and pharmacokinetic properties to increase biological activity. In contrast, CADD is efficient, economical, and less time-consuming than methods that rely only on in vitro and in vivo assays (Aguiar-Pulido et al., 2013).
Drug design by CADD follows two methods: structure-based drug design (SBDD) and ligand-based drug design (LBDD). Both rely on algorithms, scoring functions, and force fields for ranking. Many computer programs and software packages now run various algorithms to interpret the results of both methods using predefined scoring functions. However, predicting exact energy levels and force fields remains a difficult task when screening possible drug molecules. Quantum physics can address this, increasing the efficiency of prediction in new drug discovery and reducing error (Zador et al., 2021). The advantage of CADD, and of molecular docking in particular, is its ability to screen a complete small-molecule database quickly and to show realistic interactions between hit molecules and macromolecules. These macromolecules are polymers of amino acids or nucleic acids, whose structures can be predicted by algorithms that adjust the positions of their atoms. Several freely available online tools, programs, and software packages can predict the properties of lead molecules. Recently, different classes of natural products such as alkaloids, flavonoids, steroids, and terpenoids have proven to be new drug candidates (Rayan et al., 2017). Apart from these phytochemical groups, metabolomics products, bacteria, cnidaria, and insects are also used as sources in drug discovery. Currently, in silico analyses based on AI and cheminformatics approaches have proved robust for selecting candidates of interest from large databases and analysing their bioactivity, pharmacodynamics, and pharmacokinetic properties (Newman and Cragg, 2016).
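To make the idea of ranking a library with a predefined scoring function more concrete, the following sketch ranks molecules by Tanimoto similarity to a known reference ligand, a scoring scheme commonly used in ligand-based screening. It is a minimal sketch under stated assumptions: fingerprints are represented as plain sets of "on" bit positions, and the molecule names, bit values, and the helper `rank_library` are hypothetical. A real workflow would compute fingerprints with a cheminformatics toolkit.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits:
    |A and B| / (|A| + |B| - |A and B|)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_library(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every library molecule against the query and return the
    top_k hits, highest similarity first (ties broken by name)."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical toy fingerprints; the bit positions carry no chemical meaning.
query_fp = frozenset({1, 4, 7, 9})
toy_library = {
    "mol_A": frozenset({1, 4, 7, 9}),
    "mol_B": frozenset({1, 4, 8}),
    "mol_C": frozenset({2, 3, 5}),
    "mol_D": frozenset({1, 4, 7, 10, 11}),
}

hits = rank_library(query_fp, toy_library, top_k=3)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Because each molecule is scored independently, the same loop scales to large databases and can be parallelised. Structure-based methods would replace `tanimoto` with a docking score computed against the target macromolecule.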
In this review, we discuss current computational methods, developed in recent years, that use artificial intelligence, machine learning, and cheminformatics to screen large databases.
Conventional drug discovery
The drug discovery process focuses on identifying a molecule that is therapeutically effective in the treatment and management of diseases. Drug development begins when there is a lack of appropriate medical solutions available for the disease. Drug discovery and development is a lengthy, expensive, and complicated procedure that can take 10–12 years and cost billions to complete. Due to recent advancements in FDA guidelines in the last 40 years, the intricacy of drug development has been
Conclusion
The identification and optimization of lead compounds from large chemical libraries is a crucial step in drug discovery that is very time-consuming and requires substantial investment. The lead compound should possess all the desired pharmacodynamic and pharmacokinetic properties and should have high-affinity binding and specificity for a disease-associated target that can be used as a drug target. Application of computational tools and reliable scoring function that measures the
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (77)
- et al. Machine learning on drug-specific data to predict small molecule teratogenicity. Reprod. Toxicol. (2020)
- et al. Advancing drug discovery via artificial intelligence. Trends Pharmacol. Sci. (2019)
- et al. Assessment of the blood–brain barrier in CNS drug discovery. Neurobiol. Dis. (2010)
- et al. Accurate prediction of potential druggable proteins based on genetic algorithm and Bagging-SVM ensemble classifier. Artif. Intell. Med. (2019)
- et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery (2012)
- et al. Artificial intelligence in drug development: present status and future prospects. Drug Discov. Today Technol. (2019)
- et al. Present drug-likeness filters in medicinal chemistry during the hit and lead optimization process: how far can they be simplified? Drug Discov. Today (2018)
- et al. The hitchhiker's guide to the chemical-biological galaxy. Drug Discov. Today (2018)
- et al. Artificial intelligence in drug discovery and development. Drug Discov. Today Technol. (2021)
- et al. Whole-canopy gas exchanges in Coffea sp. is affected by supra-optimal temperature and light distribution within the canopy: the insights from an improved multi-chamber system. Sci. Hortic. (2016)