11.5 Mapping Metabolic Pathways
This section first describes the metabolic pathway data, then explains how to measure the statistical significance of homomorphisms and reports the results of pairwise mappings between four species.
Data. The genome-scale metabolic network data in these studies were drawn from BioCyc [19–21], a collection of 260 Pathway/Genome Databases, each of which describes the metabolic pathways and enzymes of a single organism. For this chapter, the authors chose the metabolic networks of Escherichia coli, the yeast Saccharomyces cerevisiae, the eubacterium Bacillus subtilis, and the archaebacterium Thermus thermophilus so that they cover the major lineages Archaea, Eukaryotes, and Eubacteria. The bacterium E. coli, with 256 pathways, is the most extensively studied prokaryotic organism. T. thermophilus, with 178 pathways, belongs to Archaea. B. subtilis, with 174 pathways, is one of the best understood Eubacteria in terms of molecular biology and cell biology. S. cerevisiae, with 156 pathways, is the most thoroughly researched eukaryotic microorganism.
Statistical Significance of Mapping. Following a standard randomization procedure, one can randomly permute pairs of edges (u, v) and (u′, v′) in the text graph, provided that no other edges exist among the four vertices u, u′, v, v′, by reconnecting them as (u, v′) and (u′, v). This keeps the incoming and outgoing degrees of each vertex intact. One finds the minimum cost homomorphism from the pattern graph in the ...
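The edge-swap randomization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `swap_edges` and `degrees`, the representation of the text graph as a set of directed `(source, target)` pairs, and the `max_tries` cutoff are all assumptions made for the example.

```python
import random
from collections import Counter
from itertools import permutations


def degrees(edges):
    """Return (out-degree, in-degree) counters for a directed edge collection."""
    out_deg, in_deg = Counter(), Counter()
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    return out_deg, in_deg


def swap_edges(edges, n_swaps, seed=None, max_tries=10000):
    """Randomize a directed graph by degree-preserving edge swaps.

    Two edges (u, v) and (u2, v2) are rewired to (u, v2) and (u2, v) only if
    the four endpoints are distinct and no other edge connects any two of them.
    `max_tries` is a hypothetical safety cutoff, not part of the procedure.
    """
    rng = random.Random(seed)
    edge_set = set(edges)
    edge_list = list(edge_set)
    if len(edge_list) < 2:
        return edge_set  # nothing to swap
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (u, v), (u2, v2) = edge_list[i], edge_list[j]
        if len({u, v, u2, v2}) < 4:
            continue  # the two edges share a vertex
        # Look for any other edge among the four vertices, in either direction.
        chosen = {(u, v), (u2, v2)}
        blocked = any(
            (a, b) in edge_set and (a, b) not in chosen
            for a, b in permutations((u, v, u2, v2), 2)
        )
        if blocked:
            continue
        # Rewire: every vertex keeps its in-degree and out-degree.
        edge_set -= chosen
        edge_set |= {(u, v2), (u2, v)}
        edge_list[i], edge_list[j] = (u, v2), (u2, v)
        done += 1
    return edge_set


if __name__ == "__main__":
    text_graph = [(0, 1), (2, 3), (4, 5), (6, 7), (1, 4)]
    randomized = swap_edges(text_graph, n_swaps=10, seed=1)
    # The degree sequences of the original and randomized graphs agree.
    print(degrees(randomized) == degrees(text_graph))
```

Because each accepted swap removes one outgoing edge from u and u′ and adds one back, and likewise for the incoming edges of v and v′, the degree sequences are invariant no matter how many swaps are applied. Repeating the procedure with different seeds yields an ensemble of random text graphs with the same degree profile as the original.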