332 Handbook of Chemoinformatics Algorithms
Deterministic algorithms similar to the one described above have been applied to
processes such as pyrolysis, metabolism, and signal transduction. Broadbelt et al. [4]
developed a network generator named NetGen and applied it to hydrocarbon pyrol-
ysis (ethane and cyclohexane). Another pyrolysis network generation process can be
found in Ref. [5] for the thermal cracking of butane. For both applications, the
elementary transformations and their associated be- and r-matrices are given in Figure
11.7. Both studies generated networks with reaction rates. In Ref. [4], the rates were
estimated using linear free-energy and quantitative structure–reactivity relationships,
as in Faulon and Sault [5]. Finally, in both cases, the numbers of species and reactions
were found to scale exponentially with the number of atoms of the initial species.
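The iterative character of such deterministic generators can be illustrated with a minimal sketch. It is not the NetGen implementation, nor that of Ref. [5]: species are reduced to integers (the carbon count of a linear chain), and the single rule `chain_scission` is a hypothetical stand-in for the elementary transformations of Figure 11.7. The names `generate_network`, `chain_scission`, and `max_species` are assumptions introduced only for this example.

```python
from collections import deque


def generate_network(initial_species, rules, max_species=1000):
    """Breadth-first closure of a species set under transformation rules.

    Each rule is a callable taking one species and returning a list of
    product tuples. Generation stops when no unprocessed species remain
    or when the (hypothetical) max_species cap is reached.
    """
    species = set(initial_species)
    reactions = set()
    queue = deque(initial_species)
    while queue and len(species) < max_species:
        reactant = queue.popleft()
        for rule in rules:
            for products in rule(reactant):
                reactions.add((reactant, products))
                for product in products:
                    if product not in species:
                        species.add(product)
                        queue.append(product)
    return species, reactions


def chain_scission(n_carbons):
    """Toy rule (an assumption): break a linear chain at each C-C bond."""
    return [
        tuple(sorted((k, n_carbons - k)))
        for k in range(1, n_carbons // 2 + 1)
    ]


if __name__ == "__main__":
    species, reactions = generate_network([4], [chain_scission])
    print(sorted(species))
    print(sorted(reactions))
```

The size cap is included because, as noted above, the numbers of species and reactions can grow very quickly with the size of the initial species. This toy rule is far too simple to reproduce that growth; real generators operate on molecular graphs and must also detect duplicate structures.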
Beyond pyrolysis, techniques similar to NetGen have been proposed for generating
reaction networks to predict the toxicity of complex reaction mixtures [27], focusing
on degradation of chemicals by cytochrome P450. The deterministic generator pro-
posed by Faulon and Sault has also been used to generate and analyze complex reaction
networks in interstellar chemistry [28].
Broadbelt et al. adapted NetGen into a new software package named BNICE to explore
the diversity of complex metabolic networks [26]. With BNICE, elementary trans-
formations were computed for the KEGG database resulting in about 250 unique
elementary transformations. The transformations were then applied to the biosynthesis
pathways of the aromatic amino acids phenylalanine, tyrosine, and tryptophan. The three
amino acids are synthesized from chorismate and the following cofactors and cosubstrates:
glutamate, glutamine, serine, NAD+/NAD, and 5-phospho-α-D-ribose-1-diphosphate
(PRPP). The native pathways from chorismate to phenylalanine and tyrosine comprise
three reactions among fewer than 10 compounds, and the native pathway leading
to tryptophan is composed of five reactions among 19 compounds. The 250 elementary
transformations were applied to the initial substrate, cosubstrates, and cofactors,
producing 246 compounds for the phenylalanine pathway, 289 for the tyrosine pathway,
and 58 for the tryptophan pathway. These compounds fall into three categories: the
compounds that are part of the native pathways (i.e., compounds in KEGG), com-
pounds found in the CAS database (http://www.cas.org/) but not in KEGG, and novel
compounds. The main outcome of the study was the in silico discovery of novel
alternative pathways for aromatic amino acid biosynthesis, which remain
to be experimentally verified through enzyme and pathway engineering. BNICE
was also applied to polyketide biosynthesis [29]. While about 10,000 polyketide
structures have been discovered experimentally, BNICE raised this number to over
7 million.
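The three-way classification of generated compounds described above amounts to simple set membership tests, as the following sketch suggests. It assumes that each compound has already been reduced to a canonical identifier (plain strings here) and that the KEGG and CAS contents are available as sets of such identifiers. The function name `classify_compounds` and the category labels are hypothetical and do not come from BNICE.

```python
def classify_compounds(generated, kegg_ids, cas_ids):
    """Split generated compounds into the three categories of the text.

    generated: iterable of canonical compound identifiers (assumed).
    kegg_ids:  set of identifiers present in KEGG (assumed available).
    cas_ids:   set of identifiers present in the CAS database (assumed).
    """
    generated = set(generated)
    in_kegg = generated & kegg_ids
    # Found in CAS but not in KEGG.
    in_cas_only = (generated - kegg_ids) & cas_ids
    # Found in neither database: candidate novel compounds.
    novel = generated - kegg_ids - cas_ids
    return {"in_kegg": in_kegg, "in_cas_only": in_cas_only, "novel": novel}


if __name__ == "__main__":
    # Placeholder identifiers, not real compounds.
    result = classify_compounds(["a", "b", "c"], {"a"}, {"a", "b"})
    for category, members in result.items():
        print(category, sorted(members))
```

In practice the difficult step is the one assumed away here, namely assigning each generated structure a canonical identifier so that identical compounds compare equal across databases.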
To generate reaction networks for product biocatalysis and biodegradation, a semi-
automated system (UM-PPS) with a database (UM-BBD) has been developed at the
University of Minnesota [30] (http://umbbd.msi.umn.edu/predict/). In this system,
reaction rules (i.e., elementary transformations) are applied to an initial compound
entered by the user. Because several reaction rules applied to an initial compound
may result in different products, the user has to select the next product and the next
rule to apply. The process is iterated until no more rules can be applied, with the user
selecting the next product and rule at each step. At the time of writing (March 2009), the
database contained 259 biotransformation rules. Rules generally transform one functional
group into another; for instance, a cyano group can be hydrated with one water