Network-based data integration reveals extensive post-transcriptional regulation of human tissue-specific metabolism

Tomer Shlomi*, Moran Cabili*, Markus J. Herrgard, Bernhard Ø. Palsson and Eytan Ruppin
* These authors contributed equally to this work

Metabolism

Metabolism is the totality of the chemical reactions that operate in a living organism.
• Catabolic reactions: break down molecules and produce energy
• Anabolic reactions: use energy to build up essential cell components

Why Study Human Metabolism?

• Inborn errors of metabolism cause acute symptoms and even death at an early age
• Metabolic diseases (obesity, diabetes) are major sources of morbidity and mortality
• Metabolic enzymes and their regulators are gradually becoming viable drug targets

Modeling Cellular Metabolism: A Short Review

Metabolic flux: the production or elimination of a quantity of metabolite per mass of organ or organism over a specific time frame.
[Figure: network diagram labelling a metabolite and a reaction catalyzed by an enzyme]
“..it is the concept of metabolic flux that is crucial in the translation of genotype and environmental factors into phenotype or a threshold for disease.” Brendan Lee, Nature 2006

Constraint Based Modeling

Find a steady-state flux distribution through all biochemical reactions
• Under the constraints:
– Mass balance: metabolite production and consumption rates are equal
– Thermodynamic: irreversibility of reactions
– Enzymatic capacity: bounds on enzyme rates
• Successfully predicts: constant

Constraint Based Modeling (CBM): Mathematical Representation of Constraints

Example reaction (catalyzed by glucokinase): Glucose + ATP → Glucose-6-Phosphate + ADP
• Mass balance: S·v = 0 (a subspace of R^n)
• Stoichiometric matrix S (rows are metabolites, columns are reactions): network topology with the stoichiometry of the biochemical reactions. The glucokinase column is Glucose −1, ATP −1, G-6-P +1, ADP +1
• Thermodynamic & capacity: 10 > v_i > 0 (a bounded convex cone)
• Optimization: maximize v_growth
Fell et al. (1986), Varma and Palsson (1993)

Human Metabolic Models

• Motivated by the fact that in-vivo studies of tissue-specific metabolic functions are limited in scope
• Individual genes and pathways (KEGG, HumanCyc)
– Detailed description of the genes, reactions, enzymes
– No connections between pathways
• Specific cell-types and organelles
– Red blood cell (Wiback et al. 2002)
– Mitochondria (Vo et al. 2004)
• Large-scale human metabolic networks
– The first large-scale model of human metabolism: ~2000 genes, ~3700 reactions, 7 organelles (Duarte et al. 2007, Ma et al. 2007)

CBM in Human

Modeling human tissue function is problematic:
• Various cell-types activate different pathways (shown in expression studies)
• Hard to formulate cellular metabolic objectives
– (like biomass maximization for microbial species)
• Unknown inputs and outputs of each cell-type
Can we use constraint-based modeling to systematically predict tissue-specific metabolic behavior?

Our Objective:
1. General approach to study tissue-specific metabolic models
2. Tissue-specific activity of metabolic genes/reactions

Our Method: Model Integration with Tissue-Specific Gene and Protein Expression Data
Motivated by the assertion that highly expressed genes in a certain tissue are likely to be active there

Our Method

1. Gene expression data and protein measurement data → highly and lowly expressed gene sets → (gene-to-reaction mapping) → highly and lowly expressed reaction sets
2. Human metabolic model (Duarte et al.)
3. New objective function: maximize consistency with expression data, using Mixed Integer Linear Programming (MILP)
4. Determine activity state and confidence level for each gene/reaction

Our Method: Determine Highly and Lowly Expressed Reaction Sets

1. Gene sets: extract the set of enzymes whose expression is significantly increased or decreased (GeneNote, HPRD)
2. Reaction sets: employ a detailed gene-to-reaction mapping to identify a tissue-specific expression state for each reaction, e.g. R1 = (g1 & g2) | g3 | g4

Our Method: Represent Flux Consistency with Expression State

[Figure: toy network with Input and Output, metabolites M1–M9 and enzymes E1–E7; reactions H1–H3 are marked highly expressed and L1–L2 lowly expressed]
We look for a real flux vector V, and add Boolean vectors H and L such that:
• H_i = 1 ⇒ V_i ≠ 0 (if the enzyme associated with V_i is highly expressed)
• L_i = 1 ⇒ V_i = 0 (if the enzyme associated with V_i is lowly expressed)

Our Method: Define a New Objective Function

[Figure: the same toy network with a flux solution in which 4 out of 5 reactions are consistent with the expression state]
Use Mixed Integer Linear Programming and define a new objective function: MAX Σ (H_i + L_i)
In practice this means maximizing the number of highly expressed reactions that are active and the number of lowly expressed reactions that are inactive, i.e. maximizing consistency with the expression data.

Our Method: Flux Activity State

• A gene's flux activity state reflects the absence or existence of non-zero flux through the enzymatic reactions it encodes
• Comparing the flux activity states with the expression states informs us about post-transcriptional regulation
[Figure: toy network with enzymes marked as highly or lowly expressed and as up-regulated or down-regulated]

Flux Activity State: Consider the Space of Possible Solutions

• For each tissue we predict sets of active and inactive genes and reactions
• Since there is a space of possible solutions to the MILP problem, we solve a set of MILP problems to determine each gene's activity:
1. Simulate a state where the gene is inactive
2. Simulate an active gene product
Estimate confidence levels based on the drop in consistency (with expression) between the two solutions.

Results: Gene Tissue-Specific Activity

• We employed the method described above on:
– the metabolic network model of Duarte et al.
– gene and protein expression measurements from GeneNote and HPRD
• 10 tissues: brain, heart, kidney, liver, lung, pancreas, prostate, spleen, skeletal muscle and thymus
• The activity state of 781 out of 1475 model genes was determined in at least one tissue

Post-transcriptional Regulation of Metabolic Genes

• Post-transcriptional regulation plays a major role in shaping tissue-specific metabolic behavior, affecting ~20% of the metabolic genes per tissue
• On average, 42 genes (3.6%) are post-transcriptionally up-regulated and 180 (15.4%) are post-transcriptionally down-regulated in each tissue
[Figure legend: down-regulated, up-regulated]

Cross Validation Test

• We performed a five-fold cross-validation test
• 80% of the genes were used to constrain the model
• Gene activity states for the held-out 20% of the genes were predicted from the expression constraints of the remaining 80%
• The overlap between the genes predicted as active and the highly expressed genes in the held-out data was significantly high for all tissues

Large Scale Validation: Large-Scale Mining of Tissue-Specificity Data

- Tissue specificity of genes, reactions, and metabolites is significantly correlated with all data sources
- Tissue specificity of post-transcriptionally up-regulated elements is significantly high
- Tissue specificity of post-transcriptionally down-regulated elements is significantly low
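The five-fold cross-validation test described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the gene identifiers, and the random seed are all hypothetical, and the step that actually predicts activity states (solving the MILP constrained by the 80% of genes) is left out. The slides do not name the statistical test used to judge the overlap, so only the raw overlap count is computed.

```python
import random


def five_fold_splits(genes, k=5, seed=0):
    """Yield (constraining, held_out) gene lists; each gene is held out exactly once."""
    shuffled = list(genes)
    random.Random(seed).shuffle(shuffled)
    # Striding by k gives k folds whose sizes differ by at most one gene.
    folds = [shuffled[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        constraining = [g for j, fold in enumerate(folds) if j != i for g in fold]
        yield constraining, held_out


def overlap_count(predicted_active, highly_expressed, held_out):
    """Number of held-out genes that are both predicted active and highly expressed."""
    return len(set(predicted_active) & set(highly_expressed) & set(held_out))


# Hypothetical gene identifiers, used only to show the shape of the splits.
genes = [f"g{i}" for i in range(10)]
splits = list(five_fold_splits(genes))
for constraining, held_out in splits:
    print(len(constraining), "genes constrain the model;", len(held_out), "are held out")

# Hypothetical predictions and expression calls for one held-out fold.
example_overlap = overlap_count({"g1", "g2", "g3"}, {"g2", "g3", "g4"}, ["g2", "g4"])
```

With ten genes each fold holds out two genes and constrains the model with the other eight, matching the 80%/20% split on the slide.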

Tissue-Specific Metabolite Exchange with Biofluids

• 249 metabolites are known to be secreted or taken up by human tissues
• 54% of the metabolites are not associated with transporters and so cannot be predicted from expression data
• Transport direction cannot be inferred from the expression data
• A transporter might carry several metabolites
• Many of the known transporters are post-transcriptionally regulated

Metabolic Disease-Causing Genes

• 162 metabolic genes are associated with a Mendelian disease
• Prediction accuracy: precision of 49% and recall of 22%
• There is a significant effect of post-transcriptional regulation on disease-causing genes (e.g. GBE1, which causes a glycogen storage disease, is post-transcriptionally up-regulated in liver, heart, skeletal muscle, and brain)

Summary: Methodological Standpoint

• First constraint-based modeling analysis of recently published human metabolic networks
• First to account for post-transcriptional regulation within the computational framework of large-scale metabolic modeling
• Integrates expression data as part of the optimization instead of imposing it as a constraint during a preprocessing step (Akesson et al. 2004)

Summary: Main Conclusions

• Post-transcriptional regulation plays a significant role in shaping tissue-specific metabolic behavior
• The tissue specificity of many metabolic disease-causing genes goes markedly beyond that manifested in their expression level, giving rise to new predictions concerning their involvement in different tissues
• Metabolite exchange with biofluids displays a large variance across tissues, composing a unique view of tissue-specific uptake and secretion of hundreds of metabolites

What’s Next?

• Integrate other tissue-specificity data
• Modeling of metabolic diseases
– Using various data sources (known disease-causing genes, drug databases)
– Predict tissue-wide metabolic symptoms
– Predict metabolic response to drugs
• Predict disease biomarkers that can be identified by biofluid metabolomics

Thank you!

Mathematical representation of our optimization problem

\[
\begin{aligned}
\max_{v,\,y^+,\,y^-}\quad & \sum_{i \in R_E} \left(y_i^+ + y_i^-\right) + \sum_{i \in R_N} y_i^+ \\
\text{s.t.}\quad & S \cdot v = 0 && (1) \\
& v_{\min} \le v \le v_{\max} && (2) \\
& v_i + y_i^+ \left(v_{\min,i} - \varepsilon\right) \ge v_{\min,i}, \quad i \in R_E && (3) \\
& v_i + y_i^- \left(v_{\max,i} + \varepsilon\right) \le v_{\max,i}, \quad i \in R_E && (4) \\
& v_{\min,i} \left(1 - y_i^+\right) \le v_i \le v_{\max,i} \left(1 - y_i^+\right), \quad i \in R_N && (5) \\
& v \in \mathbb{R}^m, \quad y_i^+,\, y_i^- \in \{0, 1\}
\end{aligned}
\]

Here R_E and R_N are the highly and lowly expressed reaction sets, and ε is a positive flux threshold above which a reaction counts as active.
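To make the objective of the optimization problem concrete, here is a small sketch that scores candidate flux vectors on a toy network. It does not solve the MILP; it only checks mass balance and counts how many reactions agree with their expression state. Everything in it is an assumption made for illustration: the four-reaction network, the gene states, the activity threshold `eps`, and the convention of evaluating the gene-to-reaction rule R1 = (g1 & g2) | g3 | g4 with min for AND and max for OR.

```python
# Toy sketch: the network, gene states, and all names below are hypothetical.


def r1_expression_state(g):
    """Expression state of R1 = (g1 & g2) | g3 | g4.

    Gene states: 1 = highly expressed, -1 = lowly expressed, 0 = neither.
    Assumed convention: AND -> min (every subunit needed), OR -> max (any isozyme suffices).
    """
    return max(min(g["g1"], g["g2"]), g["g3"], g["g4"])


def is_steady_state(S, v, tol=1e-9):
    """Mass balance S.v = 0: every metabolite is produced as fast as it is consumed."""
    return all(abs(sum(s * f for s, f in zip(row, v))) <= tol for row in S)


def consistency(v, states, eps=1.0):
    """Highly expressed reactions with |flux| >= eps plus lowly expressed ones with zero flux."""
    active_high = sum(1 for f, s in zip(v, states) if s == 1 and abs(f) >= eps)
    inactive_low = sum(1 for f, s in zip(v, states) if s == -1 and f == 0)
    return active_high + inactive_low


# Reactions (columns): uptake of A, R1: A -> B, R2: A -> B, secretion of B.
S = [
    [1, -1, -1, 0],  # metabolite A
    [0, 1, 1, -1],   # metabolite B
]
genes = {"g1": 1, "g2": -1, "g3": 0, "g4": 1}
# Uptake has no expression call, R2 is lowly expressed, secretion is highly expressed.
states = [0, r1_expression_state(genes), -1, 1]

via_r1 = [2.0, 2.0, 0.0, 2.0]  # all flux routed through R1
via_r2 = [2.0, 0.0, 2.0, 2.0]  # all flux routed through R2

for name, v in [("via_r1", via_r1), ("via_r2", via_r2)]:
    print(name, is_steady_state(S, v), consistency(v, states))
```

Both candidates satisfy mass balance, but routing flux through the highly expressed R1 scores higher than routing it through the lowly expressed R2. The MILP searches for the steady-state flux vector with the highest such score, rather than comparing hand-picked candidates.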