Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Nicotinamide adenine dinucleotide wikipedia , lookup
Ridge (biology) wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Minimal genome wikipedia , lookup
Biology and consumer behaviour wikipedia , lookup
Gene expression profiling wikipedia , lookup
V21 Current metabolomics Review: (1) recent work on metabolic networks required revising the picture of separate biochemical pathways into a densely-woven metabolic network (2) The connectivity of substrates in this network follows a power-law. (3) Constraint-based modeling approaches (FBA) were successful in analyzing the capabilities of cellular metabolism including - its capacity to predict deletion phenotypes - the ability to calculate the relative flux values of metabolic reactions, and - the capability to identify properties of alternate optimal growth states in a wide range of simulated environmental conditions Open questions - what parts of metabolism are involved in adaptation to environmental conditions? - is there a central essential metabolic core? - what role does transcriptional regulation play? 21. Lecture WS 2005/06 Bioinformatics III 1 Distribution of fluxes in E.coli Aim: understand principles that govern the use of individual reactions under different growth conditions. Nature 427, 839 (2004) Stoichiometric matrix for E.coli strain MG1655 containing 537 metabolites and 739 reactions taken from Palsson et al. Apply flux balance analysis to characterize solution space (all possible flux states under a given condition). d Ai Sij v j 0 dt j vj is the flux of reaction j and Sij is the stoichiometric coefficient of reaction j. 21. Lecture WS 2005/06 Bioinformatics III 2 Optimal states Denote the mass carried by reaction j producing (consuming) metabolite i by vˆij S ij v j Fluxes vary widely: e.g. dimensionless flux of succinyl coenzyme A synthetase reaction is 0.185, whereas the flux of the aspartate oxidase reaction is 10.000 times smaller, 2.2 10-5. Using linear programming and adapting constraints for each reaction flux vi of the form imin ≤ vi ≤ imax, the flux states were calculated that optimize cell growth on various substrates. Plot the flux distribution for active (non-zero flux) reactions of E.coli grown in a glutamate- or succinate-rich substrate. 21. Lecture WS 2005/06 Bioinformatics III 3 Overall flux organization of E.coli metabolic network a, Flux distribution for optimized biomass production on succinate (black) and glutamate (red) substrates. The solid line corresponds to the power-law fit that a reaction has flux v P(v) (v + v0)- , with v0 = 0.0003 and = 1.5. d, The distribution of experimentally determined fluxes from the central metabolism of E. coli shows power-law behaviour as well, with a best fit to P(v) v- with = 1. Both computed and experimental flux distribution show wide spectrum of fluxes. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 4 Response to different environmental conditions Is the flux distribution independent of environmental conditions? b, Flux distribution for optimized biomass on succinate (black) substrate with an additional 10% (red), 50% (green) and 80% (blue) randomly chosen subsets of the 96 input channels (substrates) turned on. The flux distribution was averaged over 5,000 independent random choices of uptake metabolites. the flux distribution is independent of the external conditions. Is the wide flux distribution also present in non-optimal conditions? c, Flux distribution from the non-optimized hit-and-run sampling method of the E. coli solution space. The solid line is the best fit, with v0 = 0.003 and = 2. Inset shows the flux distribution in four randomly chosen sample points. Many individual non-optimal states are consistent with an exponent = 1. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 5 Use scaling behavior to determine local connectivity The observed flux distribution is compatible with two different potential local flux structures: (a) a homogenous local organization would imply that all reactions producing (consuming) a given metabolite have comparable fluxes (b) a more delocalized „high-flux backbone (HFB)“ is expected if the local flux organisation is heterogenous such that each metabolite has a dominant source (consuming) reaction. Schematic illustration of the hypothetical scenario in which (a) all fluxes have comparable activity, in which case we expect kY(k) 1 and (b) the majority of the flux is carried by a single incoming or outgoing reaction, for which we should have kY(k) k . Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 6 Measuring the importance of individual reactions To distinguish between these 2 schemes for each metabolite i produced (consumed) by k reactions, define vˆ k ij Y k , i k vˆ j 1 l 1 il 2 where vij is the mass carried by reaction j which produces (consumes) metabolite i. If all reactions producing (consuming) metabolite i have comparable vij values, Y(k,i) scales as 1/k. If, however, the activity of a single reaction dominates we expect Y(k,i) 1 (independent of k). Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 7 Characterizing the local inhomogeneity of the flux net a, Measured kY(k) shown as a function of k for incoming and outgoing reactions, averaged over all metabolites, indicates that Y(k) k-0.27. Inset shows non-zero mass flows, v^ij, producing (consuming) FAD on a glutamate-rich substrate. an intermediate behavior is found between the two extreme cases. the large-scale inhomogeneity observed in the overall flux distribution is also increasingly valid at the level of the individual metabolites. The more reactions that consume (produce) a given metabolite, the more likely it is that a single reaction carries most of the flux, see FAD. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 8 Clean up metabolic network Simple algorithm removes for each metabolite systematically all reactions but the one providing the largest incoming (outgoing) flux distribution. The algorithm uncovers the „high-flux-backbone“ of the metabolism, a distinct structure of linked reactions that form a giant component with a star-like topology. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 9 Maximal flow networks glutamate rich succinate rich substrates Directed links: Two metabolites (e.g. A and B) are connected with a directed link pointing from A to B only if the reaction with maximal flux consuming A is the reaction with maximal flux producing B. Shown are all metabolites that have at least one neighbour after completing this procedure. The background colours denote different known biochemical pathways. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 10 FBA-optimized network on glutamate-rich substrate High-flux backbone for FBA-optimized metabolic network of E. coli on a glutamate-rich substrate. Metabolites (vertices) coloured blue have at least one neighbour in common in glutamate- and succinate-rich substrates, and those coloured red have none. Reactions (lines) are coloured blue if they are identical in glutamate- and succinate-rich substrates, green if a different reaction connects the same neighbour pair, and red if this is a new neighbour pair. Black dotted lines indicate where the disconnected pathways, for example, folate biosynthesis, would connect to the cluster through a link that is not part of the HFB. Thus, the red nodes and links highlight the predicted changes in the HFB when shifting E. coli from glutamate- to succinate-rich media. Dashed lines indicate links to the biomass growth reaction. (1) Pentose Phospate (11) Respiration (2) Purine Biosynthesis (12) Glutamate Biosynthesis (20) Histidine Biosynthesis (3) Aromatic Amino Acids (13) NAD Biosynthesis (21) Pyrimidine Biosynthesis (4) Folate Biosynthesis (14) Threonine, Lysine and Methionine Biosynthesis (5) Serine Biosynthesis (15) Branched Chain Amino Acid Biosynthesis (6) Cysteine Biosynthesis (16) Spermidine Biosynthesis (22) Membrane Lipid Biosynthesis (7) Riboflavin Biosynthesis (17) Salvage Pathways (23) Arginine Biosynthesis (8) Vitamin B6 Biosynthesis (18) Murein Biosynthesis (24) Pyruvate Metabolism (9) Coenzyme A Biosynthesis (19) Cell Envelope Biosynthesis (25) Glycolysis (10) TCA Cycle Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 11 Interpretation Only a few pathways appear disconnected indicating that although these pathways are part of the HFB, their end product is only the second-most important source for another HFB metabolite. Groups of individual HFB reactions largely overlap with traditional biochemical partitioning of cellular metabolism. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 12 How sensitive is the HFB to changes in the environment? b, Fluxes of individual reactions for glutamate-rich and succinate-rich conditions. Reactions with negligible flux changes follow the diagonal (solid line). Some reactions are turned off in only one of the conditions (shown close to the coordinate axes). Reactions belonging to the HFB are indicated by black squares, the rest are indicated by blue dots. Reactions in which the direction of the flux is reversed are coloured green. Only the reactions in the high-flux territory undergo noticeable differences! Type I: reactions turned on in one conditions and off in the other (symbols). Type II: reactions remain active but show an orders-in-magnitude shift in flux under the two different growth conditions. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 13 Flux distributions for individual reactions Shown is the flux distribution for four selected E. coli reactions in a 50% random environment. a Triosphosphate isomerase; b carbon dioxide transport; c NAD kinase; d guanosine kinase. Reactions on the v curve (small fluxes) have unimodal/gaussian distributions (a and c). Shifts in growth-conditions only lead to small changes of their flux values. Reactions off this curve have multimodal distributions (b and d), showing several discrete flux values under diverse conditions. Under different growth conditions they show several discrete and distinct flux values. Almaar et al., Nature 427, 839 (2004) 21. Lecture WS 2005/06 Bioinformatics III 14 Summary Metabolic network use is highly uneven (power-law distribution) at the global level and at the level of the individual metabolites. Whereas most metabolic reactions have low fluxes, the overall activity of the metabolism is dominated by several reactions with very high fluxes. E. coli responds to changes in growth conditions by reorganizing the rates of selected fluxes predominantly within this high-flux backbone. Apart from minor changes, the use of the other pathways remains unaltered. These reorganizations result in large, discrete changes in the fluxes of the HFB reactions. 21. Lecture WS 2005/06 Bioinformatics III 15 The same authors as before used Flux Balance Analysis to examine utilization and relative flux rate of each metabolite in a wide range of simulated environmental conditions for E.coli, H. pylori and S. cerevisae: consider in each case 30.000 randomly chosen combinations where each uptake reaction is a assigned a random value between 0 and 20 mmol/g/h. adaptation to different conditions occurs by 2 mechanisms: (a) flux plasticity: changes in the fluxes of already active reactions. E.g. changing from glucose- to succinate-rich conditions alters the flux of 264 E.coli reactions by more than 20% (b) less commonly, adaptation includes structural plasticity, turning on previously zero-flux reactions or switching off active pathways. 21. Lecture WS 2005/06 Bioinformatics III 16 Emergence of the Metabolic Core The two adaptation method mechanisms allow for the possibility of a group of reactions not subject to structural plasticity being active under all environmental conditions. Assume that active reactions were randomly distributed. If typically a q fraction of the metabolic reactions are active under a specific growth condition, we expect for n distinct conditions an overlap of at least qn reactions. This converges quickly to 0. 21. Lecture WS 2005/06 Bioinformatics III 17 Emergence of the Metabolic Core (A–C) The average relative size of the number of reactions that are always active as a function of the number of sampled conditions (black line) for (A) H. pylori, (B) E. coli, and (C) S. cerevisiae. (D and E) The number of metabolic reactions (D) and the number of metabolic core reactions (E) in the three studied organisms. However, as the number of conditions increases, the curve converges to a constant enoted by the dashed line, identifying the metabolic core of an organism. Red line : number of reactions that are always active if activity is randomly distributed in the metabolic network. The fact that it converges to zero indicates that the real core represents a collective network effect, forcing a group of reactions to be active in all conditions. 21. Lecture WS 2005/06 Bioinformatics III 18 Metabolic Core of E.coli The constantly active reactions form a tightly connected cluster! All reactions that are found to be active in each of the 30,000 investigated external conditions are shown. Metabolites that contribute directly to biomass formation are colored blue, while core reactions (links) catalyzed by essential (or nonessential) enzymes are colored red (or green). (Black-colored links denote enzymes with unknown deletion phenotype.) Blue dashed lines indicate multiple appearances of a metabolite, while links with arrows denote unidirectional reactions. Note that 20 out of the 51 metabolites necessary for biomass synthesis are not present in the core, indicating that they are produced (or consumed) in a growth-condition-specific manner. Blue and brown shading: folate and peptidoglycan biosynthesis pathways White numbered arrows denote current antibiotic targets inhibited by: (1) sulfonamides, (2) trimethoprim, (3) cycloserine, and (4) fosfomycin. A few reactions appear disconnected since we have omitted the drawing of cofactors. 21. Lecture WS 2005/06 Bioinformatics III 19 Metabolic Core Reactions The metabolic cores contain 2 types of reactions: (a) reactions that are essential for biomass production under all environment conditions (81 of 90 in E.coli) (b) reactions that assure optimal metabolic performance. 21. Lecture WS 2005/06 Bioinformatics III 20 Characterizing the Metabolic Cores (A) The number of overlapping metabolic reactions in the metabolic core of H. pylori, E. coli, and S. cerevisiae. The metabolic cores of simple organisms (H. pylori and E.coli) overlap to a large extent. The largest organism (S.cerevisae) has a much larger reaction network that allows more flexbility the relative size of the metabolic core is much lower. (B) The fraction of metabolic reactions catalyzed by essential enzymes in the cores (black) and outside the core in E. coli and S. cerevisiae. Reactions of the metabolic core are mostly essential ones. (C) One could assume that the core represents a subset of high-flux reactions. This is apparently not the case. The distributions of average metabolic fluxes for the core and the noncore reactions in E. coli are very similar. 21. Lecture WS 2005/06 Bioinformatics III 21 Correlation among E.coli Metabolic Reactions Pearson correlation using flux values from 30,000 conditions for each reaction pair before grouping the reactions according to a hierarchical averagelinkage clustering algorithm. The values of the flux-correlation matrix range from -1 (red) through 0 (white) to 1 (blue). The horizontal color bar denotes if a reaction is a member of the core (green), and the vertical color bar denotes whether the enzymes catalyzing the reaction are essential (red). group of highly correlated reactions significantly overlaps with metabolic core. (B) Distribution of Pearson correlation in mRNA copy numbers from 41 experiments. The correlations of the core reactions are clearly shifted towards higher values. 21. Lecture WS 2005/06 Bioinformatics III 22 Summary - Adaptation to environmental conditions occurs via structural plasticity and/or flux plasticity. Here: identification of a surprisingly stable metabolic core of reactions that are tightly connected to eachother. - the reactions belonging to this core represent potential targets for antimicrobial intervention. 21. Lecture WS 2005/06 Bioinformatics III 23 Integrated Analysis of Metabolic and Regulatory Networks Sofar, studies of large-scale cellular networks have focused on their connectivities. The emerging picture shows a densely-woven web where almost everything is connected to everything. In the cell‘s metabolic network, hundreds of substrates are interconnected through biochemical reactions. Although this could in principle lead to the simultaneous flow of substrates in numerous directions, in practice metabolic fluxes pass through specific pathways ( high flux backbone). Topological studies sofar did not consider how the modulation of this connectivity might also determine network properties. Therefore it is important to correlate the network topology with the expression of enzymes in the cell. 21. Lecture WS 2005/06 Bioinformatics III 24 Analyze transcriptional control in metabolic networks Regulatory and metabolic functions of cells are mediated by networks of interacting biochemical components. Metabolic flux is optimized to maximize metabolic efficiency under different conditions. Control of metabolic flow: - allosteric interactions - covalent modifications involving enzymatic activity - transcription (revealed by genome-wide expression studies) Here: N. Barkai and colleagues analyzed published experimental expression data of Saccharomyces cerevisae. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 25 Recurrence signature algorithm Aim: identify transcription „modules“ (TMs). a set of randomly selected genes is unlikely to be identical to the genes of any TM. Yet many such sets do have some overlap with a specific TM. In particular, sets of genes that are compiled according to existing knowledge of their functional (or regulatory) sequence similarity may have a significant overlap with a transcription module. Algorithm receives a gene set that partially overlaps a TM and then provides the complete module as output. Therefore this algorithm is referred to as „signature algorithm“. Ihmels et al. Nat Genetics 31, 370 (2002) 21. Lecture WS 2005/06 Bioinformatics III 26 Recurrence signature algorithm normalization of data identify modules classify genes into modules a, The signature algorithm. b , Recurrence as a reliability measure. The signature algorithm is applied to distinct input sets containing different subsets of the postulated transcription module. If the different input sets give rise to the same module, it is considered reliable. c, General application of the recurrent signature method. Ihmels et al. Nat Genetics 31, 370 (2002) 21. Lecture WS 2005/06 Bioinformatics III 27 Correlation between genes of the same metabolic pathway Distribution of the average correlation between genes assigned to the same metabolic pathway in the KEGG database. The distribution corresponding to random assignment of genes to metabolic pathways of the same size is shown for comparison. Importantly, only genes coding for enzymes were used in the random control. Interpretation: pairs of genes associated with the same metabolic pathway show a similar expression pattern. However, typically only a set of the genes assigned to a given pathway are coregulated. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 28 Correlation between genes of the same metabolic pathway Genes of the glycolysis pathway (according KEGG) were clustered and ordered based on the correlation in their expression profiles. Shown here is the matrix of their pair-wise correlations. The cluster of highly correlated genes (orange frame) corresponds to genes that encode the central glycolysis enzymes. The linear arrangement of these genes along the pathway is shown at right. Of the 46 genes assigned to the glycolysis pathway in the KEGG database, only 24 show a correlated expression pattern. In general, the coregulated genes belong to the central pieces of pathways. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 29 Coexpressed enzymes often catalyze linear chain of reactions Coregulation between enzymes associated with central metabolic pathways. Each branch corresponds to several enzymes. In the cases shown, only one of the branches downstream of the junction point is coregulated with upstream genes. Interpretation: coexpressed enzymes are often arranged in a linear order, corresponding to a metabolic flow that is directed in a particular direction. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 30 Co-regulation at branch points To examine more systematically whether coregulation enhances the linearity of metabolic flow, analyze the coregulation of enzymes at metabolic branch-points. Search KEGG for metabolic compounds that are involved in exactly 3 reactions. Only consider reactions that exist in S.cerevisae. 3-junctions can integrate metabolic flow (convergent junction) or allow the flow to diverge in 2 directions (divergent junction). In the cases where several reactions are catalyzed by the same enzymes, choose one representative so that all junctions considered are composed of precisely 3 reactions catalyzed by distinct enzymes. Each 3-junction is categorized according to the correlation pattern found between enzymes catalyzing its branches. Correlation coefficients > 0.25 are considered significant. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 31 Coregulation pattern in three-point junctions All junctions corresponding to metabolites that participate in exactly 3 reactions (according to KEGG) were identified and the correlations between the genes associated with each such junction were calculated. The junctions were grouped according to the directionality of the reactions, as shown. Divergent junctions, which allow the flow of metabolites in two alternative directions, predominantly show a linear coregulation pattern, where one of the emanating reaction is correlated with the incoming reaction (linear regulatory pattern) or the two alternative outgoing reactions are correlated in a context-dependent manner with a distinct isozyme catalyzing the incoming reaction (linear switch). By contrast, the linear regulatory pattern is significantly less abundant in convergent junctions, where the outgoing flow follows a unique direction, and in conflicting junctions that do not support metabolic flow. Most of the reversible junctions comply with linear regulatory patterns. Indeed, similar to divergent junctions, reversible junctions allow metabolites to flow in two alternative directions. Reactions were counted as coexpressed if at least two of the associated genes were significantly correlated (correlation coefficient >0.25). As a random control, we randomized the identity of all metabolic genes and repeated the analysis. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III In the majority of divergent junctions, only one of the emanating branches is significantly coregulated with the incoming reaction that synthesizes the metabolite. 32 Co-regulation at branch points: conclusions The observed co-regulation patterns correspond to a linear metabolic flow, whose directionality can be switched in a condition-specific manner. When analyzing junctions that allow metabolic flow in a larger number of directions, there also only a few important branches are coregulated with the incoming branch. Therefore: transcription regulation is used to enhance the linearity of metabolic flow, by biasing the flow toward only a few of the possible routes. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 33 Connectivity of metabolites The connectivity of a given metabolite is defined as the number of reactions connecting it to other metabolites. Shown are the distributions of connectivity between metabolites in an unrestricted network () and in a network where only correlated reactions are considered (). In accordance with previous results (Jeong et al. 2000) , the connectivity distribution between metabolites follows a power law (log-log plot). In contrast, when coexpression is used as a criterion to distinguish functional links, the connectivity distribution becomes exponential (log-linear plot). Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004) 21. Lecture WS 2005/06 Bioinformatics III 34 Differential regulation of isozymes Observe that isozymes at junction points are often preferentially coexpressed with alternative reactions. investigate their role in the metabolic network more systematically. Two possible functions of isozymes associated with the same metabolic reaction. An isozyme pair could provide redundancy which may be needed for buffering genetic mutations or for amplifying metabolite production. Redundant isozymes are expected to be coregulated. Alternatively, distinct isozymes could be dedicated to separate biochemical pathways using the associated reaction. Such isozymes are expected to be differentially expressed with the two alternative processes. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 35 Differential regulation of isozymes in central metabolic PW Arrows represent metabolic pathways composed of a sequence of enzymes. Coregulation is indicated with the same color (e.g., the isozyme represented by the green arrow is coregulated with the metabolic pathway represented by the green arrow). Most members of isozyme pairs are separately coregulated with alternative processes. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 36 Differential regulation of isozymes Regulatory pattern of all gene pairs associated with a common metabolic reaction (according to KEGG). All such pairs were classified into several classes: (1) parallel, where each gene is correlated with a distinct connected reaction (a reaction that shares a metabolite with the reaction catalyzed by the respective gene pair); (2) selective, where only one of the enzymes shows a significant correlation with a connected reaction; and (3) converging, where both enzymes were correlated with the same reaction. Correlations coefficients >0.25 were considered significant. To be counted as parallel, rather than converging, we demanded that the correlation with the alternative reaction be <80% of the correlation with the preferred reaction. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 37 Differential regulation of isozymes: interpretation The primary role of isozyme multiplicity is to allow for differential regulation of reactions that are shared by separated processes. Dedicating a specific enzyme to each pathway may offer a way of independently controlling the associated reaction in response to pathway-specific requirements, at both the transcriptional and the post-transcriptional levels. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 38 Genes coexpressed with metabolic pathways Identify the coregulated subparts of each metabolic pathway and identify relevant experimental conditions that induce or repress the expression of the pathway genes. Also associate additional genes showing similar expression profiles with each pathway using the signature algorithm. Input: set of genes, some of which are expected to be coregulated. Output: coregulated part of the input and additional coregulated genes together with the set of conditions where the coregulation is realized. Numerous genes were found that are not directly involved in enzymatic steps: - transporters - transcription factors Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 39 Co-expression of transporters Transporter genes are co-expressed with the relevant metabolic pathways providing the pathways with its metabolites. Co-expression is marked in green. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 40 Co-regulation of transcription factors Transcription factors are often co-regulated with their regulated pathways. Shown here are transcription factors which were found to be co-regulated in the analysis. Co-regulation is shown by color-coding such that the transcription factor and the associated pathways are of the same color. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 41 Hierarchical modularity in the metabolic network Sofar: co-expression analysis revealed a strong tendency toward coordinated regulation of genes involved in individual metabolic pathways. Does transcription regulation also define a higher-order metabolic organization, by coordinated expression of distinct metabolic pathways? Based on observation that feeder pathways (which synthesize metabolites) are frequently coexpressed with pathways using the synthesized metabolites. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 42 Feeder-pathways/enzymes Feeder pathways or genes co-expressed with the pathways they fuel. The feeder pathways (light blue) provide the main pathway (dark blue) with metabolites in order to assist the main pathway, indicating that coexpression extends beyond the level of individual pathways. These results can be interpreted in the following way: the organism will produce those enzymes that are needed. 21. Lecture WS 2005/06 Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) Bioinformatics III 43 Hierarchical modularity in the metabolic network Derive hierarchy by applying an iterative signature algorithm to the metabolic pathways, and decreasing the resolution parameter (coregulation stringency) in small steps. Each box contains a group of coregulated genes (transcription module). Strongly associated genes (left) can be associated with a specific function, whereas moderately correlated modules (right) are larger and their function is less coherent. The merging of 2 branches indicates that the associated modules are induced by similar conditions. All pathways converge to one of 3 low-resolution modules: amino acid biosynthesis, protein synthesis, and stress. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 44 Hierarchical modularity in the metabolic network Although amino acids serve as building blocks for proteins, the expression of genes mediating these 2 processes is clearly uncoupled! This may reflect the association of rapid cell growth (which triggers enhanced protein synthesis) with rich growth conditions, where amino acids are readily available and do not need to be synthesized. Amino acid biosynthesis genes are only required when external amino acids are scarce. In support of this view, a group of amino acid transporters converged to the protein synthesis module, together with other pathways required for rapid cell growth (glucose fermentation, nucleotide synthesis and fatty acid synthesis). Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 45 Global network properties Jeong et al. showed that the structural connectivity between metabolites imposes a hierarchical organization of the metabolic network. That analysis was based on connectivity between substrates, considering all potential connections. Here, analysis is based on coexpression of enzymes. In both approaches, related metabolic pathways were clustered together! There are, however, some differences in the particular groupings (not discussed here), and importantly, when including expression data the connectivity pattern of metabolites changes from a power-law dependence to an exponential one corresponding to a network structure with a defined scale of connectivity. This reflects the reduction in the complexity of the network. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 46 Summary Transcription regulation is prominently involved in shaping the metabolic network of S. cerevisae. 1 Transcription leads the metabolic flow toward linearity. 2 Individual isozymes are often separately coregulated with distinct processes, providing a means of reducing crosstalk between pathways using a common reaction. 3 Transcription regulation entails a higher-order structure of the metabolic network. It exists a hierarchical organization of metabolic pathways into groups of decreasing expression coherence. Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003) 21. Lecture WS 2005/06 Bioinformatics III 47