180 Handbook of Chemoinformatics Algorithms
6.4 CALCULATION OF DESCRIPTORS
Descriptors are quantitative characteristics describing molecular structures that are
used in QSAR and other chemoinformatics studies. They can be experimental or
calculated physicochemical properties of molecules, such as molecular weight, molar
refraction, energies of HOMO and LUMO, normal boiling point, and octanol/water
partition coefficients, as well as topological indices or invariants of molecular graphs
(structural formulas), molecular surface, molecular volume, etc. The first abstract
molecular topological indices introduced in molecular property prediction studies were the
Wiener index [28] and the Platt index [29].
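As a minimal sketch of how these two indices can be computed, the example below treats a hydrogen-suppressed molecular graph as an adjacency list. It uses the standard graph-theoretical definitions: the Wiener index is the sum of shortest-path (topological) distances over all unordered pairs of vertices, and the Platt index is the sum of edge degrees, where the degree of an edge is the number of edges adjacent to it. The function names and the dictionary-based graph representation are illustrative assumptions, not part of any particular software package.

```python
from collections import deque


def bfs_distances(adj, start):
    """Topological distances (bond counts) from start to every reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for u in adj:
        total += sum(bfs_distances(adj, u).values())
    # Every pair was visited twice (once from each end).
    return total // 2


def platt_index(adj):
    """Sum of edge degrees; edge u-v is adjacent to deg(u) + deg(v) - 2 edges."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                total += len(adj[u]) + len(adj[v]) - 2
    return total


# Hydrogen-suppressed carbon skeletons (vertex labels are arbitrary integers).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane), platt_index(n_butane))    # 10 4
print(wiener_index(isobutane), platt_index(isobutane))  # 9 6
```

The two isomers have the same atom and bond counts but different index values, which is what makes such invariants usable as descriptors of branching.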
Herein, we will not discuss the different types of descriptors in detail but will briefly
mention the major descriptor classes. The Handbook of Molecular Descriptors by
Roberto Todeschini and Viviana Consonni [30] is an excellent monograph that
provides reference material on more than 2000 different descriptors. Most of the
descriptors included in this book can be calculated by the Dragon software [31].
Dragon calculates
many different groups of descriptors: constitutional descriptors (sometimes
referred to as zero-dimensional [0D] descriptors); counts of different molecular
groups, physicochemical properties of compounds, and so on (one-dimensional [1D]
descriptors); connectivity indices, information indices, counts of paths and walks,
and so on (two-dimensional [2D] descriptors); geometrical properties, GETAWAY,
WHIM, and 3D-MoRSE descriptors, and so on (3D descriptors); and some other
descriptors. MolconnZ [32] is another widely used descriptor calculation program.
In total, it
calculates more than 800 descriptors including valence path, cluster, path/cluster and
chain molecular connectivity indices, kappa molecular shape indices, topological and
electrotopological state indices, differential connectivity indices, the graph’s radius
and diameter, Wiener and Platt indices, Shannon and Bonchev-Trinajstić information
indices, counts of different vertices, and counts of paths and edges between different
kinds of vertices. MOE [25] descriptors include both 2D and 3D molecular descriptors.
2D descriptors include physical properties, subdivided surface areas, atom counts
and bond counts, Kier and Hall connectivity and kappa shape indices, adjacency and
distance matrix descriptors, pharmacophore feature descriptors, and partial charge
descriptors. 3D molecular descriptors include potential energy descriptors, surface
area, volume and shape descriptors, and conformation-dependent charge descriptors.
Chirality molecular topological descriptors (CMTD) developed in our laboratory
include chirality and ZE-isomerism molecular connectivity indices, overall Zagreb
indices, extended indices, and overall connectivity indices [33–35]. They are calculated
in the same way as conventional descriptors but with modified vertex degrees. Another
group of descriptors frequently used in our laboratory is atom-pair (AP) descriptors [36].
Each descriptor
is defined as a count of atom pairs of the same type whose atoms are separated by a
certain topological distance (2DAP descriptors) or by a Euclidean distance within a
certain interval (3DAP descriptors). A new version of the program includes chirality
descriptors, which are counts of APs in which one or both atoms of the pair are chiral [37].
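The counting scheme behind 2D atom-pair descriptors can be sketched as follows. This is a simplified illustration, not the implementation used in the program cited above: atom types are assumed here to be plain element symbols (real AP schemes typically use richer atom types), the graph is hydrogen-suppressed, and the function names are hypothetical. Each descriptor is keyed by the two atom types and the topological distance between them.

```python
from collections import Counter, deque


def topological_distances(adj, start):
    """Shortest-path distances (in bonds) from start to all reachable atoms."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def atom_pair_counts(atom_types, adj):
    """Count atom pairs keyed by (type_a, type_b, topological distance)."""
    counts = Counter()
    atoms = sorted(adj)
    for i, a in enumerate(atoms):
        dist = topological_distances(adj, a)
        for b in atoms[i + 1:]:
            if b in dist:  # skip atoms in a disconnected fragment
                # Sort the two types so that C...O and O...C share one key.
                ta, tb = sorted((atom_types[a], atom_types[b]))
                counts[(ta, tb, dist[b])] += 1
    return counts


# Ethanol heavy atoms: C0-C1-O2 (assumed element-only atom typing).
types = {0: "C", 1: "C", 2: "O"}
bonds = {0: [1], 1: [0, 2], 2: [1]}

ap = atom_pair_counts(types, bonds)
print(dict(ap))
# {('C', 'C', 1): 1, ('C', 'O', 2): 1, ('C', 'O', 1): 1}
```

A 3D variant would replace the topological distance with a binned Euclidean distance computed from atomic coordinates.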
Comparative
molecular field analysis (CoMFA) descriptors represent values of Lennard-Jones and
Coulomb energies of interactions between a molecule and a probe atom at certain
grid points built around a set of spatially aligned molecules [38].
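To make the idea of field values at grid points concrete, the sketch below evaluates a Lennard-Jones 12-6 term and a Coulomb term between a molecule and a probe at each point of a regular grid. Everything numerical here is an assumption chosen for illustration: a single well depth `eps` and minimum-energy distance `r_min` for all atom-probe pairs, a probe charge of +1, a vacuum Coulomb constant of 332.06 kcal·Å/(mol·e²), a 2 Å grid spacing, and truncation of energies at a 30 kcal/mol cutoff. It is not the parameterization of any specific CoMFA implementation.

```python
import math
from itertools import product

COULOMB_KCAL = 332.06  # kcal*Angstrom/(mol*e^2), vacuum dielectric (assumed)


def field_at_point(atoms, point, probe_charge=1.0, eps=0.1, r_min=3.5, cutoff=30.0):
    """Return (steric, electrostatic) probe energies at one grid point.

    atoms is a list of (x, y, z, partial_charge) tuples in Angstrom and e.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge in atoms:
        # Floor the distance so a grid point on top of an atom cannot divide by zero.
        r = max(math.dist((x, y, z), point), 1e-3)
        ratio6 = (r_min / r) ** 6
        steric += eps * (ratio6 * ratio6 - 2.0 * ratio6)  # Lennard-Jones 12-6
        electrostatic += COULOMB_KCAL * charge * probe_charge / r
    # Truncate large energies close to the nuclei.
    steric = min(steric, cutoff)
    electrostatic = max(-cutoff, min(electrostatic, cutoff))
    return steric, electrostatic


def grid_points(lower, upper, spacing):
    """Regular grid covering the box from lower to upper (inclusive)."""
    axes = []
    for lo, hi in zip(lower, upper):
        n = int(math.floor((hi - lo) / spacing + 1e-9)) + 1
        axes.append([lo + i * spacing for i in range(n)])
    return list(product(*axes))


def field_descriptors(atoms, lower, upper, spacing=2.0):
    """Flatten steric and electrostatic values over the grid into one vector."""
    values = []
    for p in grid_points(lower, upper, spacing):
        values.extend(field_at_point(atoms, p))
    return values


# A single hypothetical atom at the origin carrying a partial charge of +0.1 e.
atoms = [(0.0, 0.0, 0.0, 0.1)]
vector = field_descriptors(atoms, (0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
print(len(vector))  # 8 grid points x 2 fields = 16 values
```

Because every molecule in the data set is evaluated on the same grid, the resulting vectors have the same length and can be compared column by column, which is why the spatial alignment of the molecules matters.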
The molecules are
aligned according to a pharmacophore model, a spatially arranged set of features that