Drug-discovery protocols in the pharmaceutical industry have for years relied mainly on high-throughput screening methods to quickly assay the biological or biochemical activity of a large number of drug-like compounds. A whole range of difficult problems, including the efficacy, activity, toxicity, and bioavailability of the designed compounds, is usually encountered in the discovery process. The total cost of the drug development process, more than hundreds of millions of dollars, should be added to this list. Computational techniques, which provide options for understanding chemical systems, yield information that is difficult, if not nearly impossible, to obtain in laboratory experiments. During the last few decades, the use of these techniques in drug design has accelerated high-throughput screening by exploiting virtual features of molecules. Of the high-throughput virtual screening approaches, quantitative structure-activity relationship (QSAR) modeling has proved to be a practical one in modern drug-discovery protocols. It depends exclusively on the physicochemical features of the ligands (molecular descriptors), and so remains applicable when no information is available about the 3-D structure of the target.
QSAR is fundamentally a protocol that applies statistics and mathematics to predict or classify biological data related to the designed molecules. Many linear and non-linear statistical model-building methods have been applied in the QSAR approach. The artificial neural network (ANN) is one of the most popular non-linear modeling methods used in QSAR studies. It was first applied in drug design by Hiller et al. in 1973, who indicated that a neural network could be helpful for classifying molecules into two categories: active and inactive. Later, in 1990, Aoyama et al. successfully applied a neural network to decision making about compound interactions and compared it with a linear model-building method, multiple linear regression (MLR). They showed that a neural network with one neuron at the output layer can act as a multi-regression method for predicting molecular biological activity. At the same time, neural networks were widely used in QSAR based on the 2-D representation of compound similarities. These researchers sought to confirm that neural networks can be a useful tool in the routine work of QSAR analysis, feature extraction, non-linear modeling, classification, and prediction.
The number of drug-like compounds that are attractive to the pharmaceutical industry is increasing every day. The same is true of the number of molecular descriptors describing the physicochemical features of these compounds. This growth brings two major disadvantages, namely redundancy and over-fitting, which make prediction and/or classification unreliable. Several algorithms have been proposed as solutions to these drawbacks during the last two decades. This article discusses the advantages and disadvantages of the proposed neural network algorithms, and especially of the innovative deep learning techniques, used in ligand-based virtual screening.
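The redundancy problem mentioned above can be made concrete with a minimal sketch. It is not one of the algorithms reviewed in this article; it only illustrates, under stated assumptions, how near-duplicate molecular descriptors can be detected with a simple pairwise correlation filter. The function name `drop_redundant_descriptors`, the 0.95 threshold, and the synthetic descriptor matrix are all hypothetical choices made for illustration, and the sketch assumes that no descriptor column is constant.

```python
import numpy as np

def drop_redundant_descriptors(X, threshold=0.95):
    """Return the indices of descriptor columns to keep.

    A column is discarded when its absolute Pearson correlation with any
    column that has already been kept exceeds `threshold` (an assumed,
    illustrative cut-off). Assumes no column is constant.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not nearly collinear with a kept one.
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep

# Synthetic example: 50 "compounds" with 3 independent descriptors,
# plus a fourth descriptor that is almost a rescaled copy of the first.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 3))
near_copy = 2.0 * base[:, 0] + 0.01 * rng.normal(size=50)
X = np.column_stack([base, near_copy])

kept = drop_redundant_descriptors(X)
X_reduced = X[:, kept]
```

In this toy setting the fourth column carries almost no information beyond the first, so the filter discards it and leaves three descriptors. A filter of this kind addresses only pairwise linear redundancy; it does not by itself prevent over-fitting.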