Introduction to Systems Biology
詹鎮熊, Postdoctoral Researcher, Department of Computer Science and Information Engineering, National Taiwan University

What is a system?

Features of a system
- Components
- Interrelated components
- Boundary
- Purpose
- Environment
- Interfaces
- Input
- Output
- Constraints

Examples of Systems

Life's Complexity Pyramid
- System
- Functional modules
- Building blocks
- Components
Z. N. Oltvai and A.-L. Barabási, Science 298, 763 (2002)

Levels of biological organization
- Biosphere – Ecosystem – Community – Population – Organism
- Organism – Organ system – Tissue – Cell – Molecule – Atom

Organism – Cell – Organelle – Molecules
- The human body is made up of trillions of cells
- Each cell has:
  - 46 chromosomes
  - 2 meters of DNA
  - 3 billion bases (A, T, G, C)
  - 20,000 to 30,000 genes

The Central Dogma

Bottom-up
- From genes to phenotypes
- If the genome can be fully sequenced, can we resolve all the secrets hidden in the DNA?

The -omics (-ome) era

Genomics (Genome)
- Human Genome Project
- Other Genome Projects
  - Mouse
  - Fly
  - Dog
  - Worm
  - Bacteria
  - …
- Most recently …
  - Cat

Human Genome Project
- Sequence the whole genome of several individuals
- Competition between Celera and NIH
- Took over a decade
- Draft in 2000, complete in 2003

The next stage: HapMap
- HapMap is a catalog of common genetic variants that occur in human beings
- It describes:
  - what these variants are
  - where they occur in our DNA
  - how they are distributed among people within populations and among populations in different parts of the world

Single Nucleotide Polymorphism (SNP)

Personalized genome
- James Watson (454 Life Sciences)
- Craig Venter (Venter Institute)
- 23andMe (backed by Google, focus on social/family relationships)
- Navigenics (focus on medical conditions)
- Personal Genome Project (PGP, Harvard)

Proteomics (Proteome)
- Categorize all proteins (and their relationships) in a temporally and spatially confined system
  - Identities of these proteins
  - Quantities
- Variants of these proteins
  - Alternative splicing forms
  - Post-translational modifications (phosphorylation, methylation, ubiquitination, …)

Proteomics
- Mass Spectrometry
- Fluorescence Resonance Energy Transfer (FRET)
  - Co-localization (interaction) between protein-protein and protein-DNA pairs

Transcriptome
- Identify all transcription factors (TFs) functioning in a specific temporally and spatially confined system
- Identify all genes regulated by specific TFs
- ChIP-chip
- TRANSFAC database

Chromatin Immuno-Precipitation (ChIP)
- A well-established procedure used to investigate interactions between DNA-binding proteins and DNA in vivo

ChIP-chip

Transcription Factor Binding Motifs

Interactome
- Catalog all interactions (protein-protein or protein-DNA) within an organism
- Yeast Two-Hybrid
- Immuno-coprecipitation (co-IP)
- Mass Spectrometry
- FRET
- …

Yeast Two-Hybrid

Metabolomics (Metabolome)
- "Systematic study of the unique chemical fingerprints that specific cellular processes leave behind"
- Collection of all metabolites in a biological organism

Analytical methods for metabolomics
- Separation
  - Gas chromatography (GC)
  - High-performance liquid chromatography (HPLC)
  - Capillary electrophoresis (CE)
- Detection
  - Mass spectrometry
  - Nuclear magnetic resonance (NMR) spectroscopy

Glycomics
- Oligosaccharide
- Glycoprotein/Proteoglycan
  - Proteins attached to oligosaccharides
- Important to cell recognition
  - Cancer targeting
  - Influenza

Model Organisms
- Yeast (S. cerevisiae)
- Worm (C. elegans)
- Fruit fly (D. melanogaster)
- Mouse (M. musculus)

Monitoring the System
- High-throughput monitoring of gene expression
  - Microarray
  - Protein microarray
  - GC/HPLC/MS/tandem MS
- Phenotype/Disease

Microarray

Protein Microarray

Phenotypes
- Lethality
- Synthetic lethal
- Developmental
- Morphological
- Behavioral
- Diseases

Genotypes and Phenotypes
- genotype + environment → phenotype
- genotype + environment + random variation → phenotype

Importance of Computer Models
- Interactions in a cell are too complex to handle with pen and paper
- With high-throughput tools, biology shifts from descriptive to predictive
- Computers are required to store, process, assemble, and model all high-throughput data into networks

Types of Computer Models
- Chemical Kinetic Model
  - Defined by the concentrations of different molecular species in the cell
  - Represented by a number of equations
  - Some processes may be stochastic
- Simplified Discrete Circuit
  - Network with nodes and arrows
  - Nodes represent quantities or other attributes
  - Directed edges represent the effect of nodes on other nodes

Different Mathematical Formulations
- Differential Equations
  - Linear (ordinary)
  - Partial
  - Stochastic
- S-Systems
  - Power-law formulation
  - Captures complicated dynamics
  - Parameter estimation is computationally intensive

Model details
- Selection of genes, gene products, and other molecules to be included
- Cellular compartments: nucleus, Golgi, or other organelles
- Too much detail may lead to more noise
- A minimal model able to predict system properties (mRNA level, growth rate, etc.) is sufficient

Construct Model from Global Patterns
- Microarray gene expression patterns: up-regulated/down-regulated
- Gene expression profiles under different conditions: tumor/normal, cell cycle, drug treatment, …
- Methods:
  - Bayesian inference
  - Machine learning (clustering, classification)
  - …

Framework for Systems Biology

Tools for Simulation
- E-Cell
- Cell Illustrator
- Virtual Cell
- Standardizing efforts:
  - BioJake
  - SBML (Systems Biology Markup Language)
    - Facilitates the exchange of models

E-Cell System
- Software for constructing object models equivalent to a cell system or a part of a cell system
- Employs the Structured Variable-Process model (previously called the Substance-Reactor model, or SRM)
- Objects: Variables, Processes, Systems

Cell Illustrator

Computational Databases
- Protein-protein interaction
  - DIP, BIND, MIPS, MINT, IntAct, POINT, BioGRID
- Protein-DNA interaction
  - TRANSFAC, SCPD
- Metabolic pathways
  - KEGG, EcoCyc, WIT, Reactome
- Gene expression
  - GEO, ArrayExpress, GNF, NCI60, commercial
- Gene Ontology

Network Biology
- The entities within a system form intertwined complex networks
  - Genes
  - Proteins
  - Metabolites
  - External factors…

Gene (Transcription) Regulatory Network

Protein-Protein Interaction Network

Metabolic Pathways

KEGG metabolic pathway

Gene Ontology
- The Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism
- Annotations
  - Molecular Function
  - Cellular Components
  - Biological Processes

Challenges of Databases
- Provide information beyond simple entries (e.g. PPI with functional annotation or binding strength)
- Data maintenance and updates
- Integration with other databases

Applications
- Target identification and drug discovery

Disease Gene Identification
- From networks
- From literature
- From microarray
- Quantitative Trait Loci (QTL)
- Genome-Wide Association Study (GWAS)
- Endeavour
  - Systems biology (integrated) approaches?

Drug Targets

Gene identification from network
- Nodes
  - Hubs
- Edges (interactions)
  - Define critical genes from connected edges?
  - Shortest path, alternative path?
  - Weights
- Metabolic pathways as well

Gene identification from literature
- OMIM (Online Mendelian Inheritance in Man)
- Single-gene disease
- Complex disease
- Defects identified become targets for drugs and cures

Gene identification from microarray
- Up-regulated genes
- Down-regulated genes
- Too many?
  - Cluster of genes
  - Regulators (transcription factors) for the important clusters

Quantitative Trait Loci (QTL)
- A region of DNA that is associated with a particular phenotypic trait
- The phenotypic characteristic varies in degree and is attributable to the interaction of two or more genes
- A QTL may not be the gene itself; it is a DNA sequence that is closely linked to the target gene

Quantitative Trait Loci
- LOD (logarithm of the odds) score: how likely a locus is to be observed in a group with a specific trait (phenotype)
- Expression QTL (eQTL): combines microarray gene expression data with QTL analysis (identifies transcription regulatory elements as QTL)
- cM: centimorgan, a unit of genetic distance; on average roughly 1,000,000 bases in the human genome

Genome-Wide Association Studies (GWAS)
- Genome-wide association studies (GWAS) rely on newly available research tools and technologies to rapidly and cost-effectively analyze genetic differences between people with specific illnesses, such as diabetes or heart disease, and healthy individuals.

Keys to success of GWAS
- Population resource
  - Large sample size required for significant detection
- SNP map and genotyping
  - High-throughput genotyping
- IT and analysis tools
  - Storage and analysis (1000 microarrays for billions of data points)

What have GWAS found?
- Genes associated with risks of:
  - Type 2 diabetes
  - Parkinson's disease
  - Heart disorders
  - Obesity
  - Prostate cancer
  - …

An integrated approach: Endeavour
- Each gene receives several scores for its association with a disease
- These scores include those mentioned above
- Genes are ranked separately according to each score
- With a consensus scoring scheme (data fusion), the resulting prediction accuracy could be improved
Aerts et al. (2006)

Toward personalized medicine
- Targeted therapy
  - Uses antibodies against biomarkers (of cancer or infectious agents)
  - Requires prior knowledge of patient response (through lab tests or biochips)
- Gene therapy
  - Replace or inhibit genes in patients
  - Vectors
    - Adenovirus, adeno-associated virus (AAV)
  - Silencing the disease gene
    - RNAi (RNA interference)
    - microRNA

Putting It All Together
- Network of networks
  - Gene regulation (protein-DNA)
  - Protein-protein interaction
  - Metabolic pathway
- How…?

Questions?
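
One possible way to picture the closing "network of networks" idea, together with the earlier questions about hubs and shortest paths, is the small Python sketch below. It is illustrative only and is not taken from the slides: every node name (TF1, GeneA, MetX, …), every edge, and the helper functions merge_layers, hubs, and shortest_path are hypothetical. It assumes the three layers (gene regulation, protein-protein interaction, metabolic pathway) can be treated as undirected edge lists over shared names, which is a strong simplification.

```python
from collections import defaultdict, deque

# Hypothetical toy layers: all node names and edges are invented for illustration.
REGULATION = [("TF1", "GeneA"), ("TF1", "GeneB"), ("TF2", "GeneB")]          # protein-DNA
INTERACTION = [("GeneA", "GeneB"), ("GeneB", "GeneC"), ("GeneB", "GeneD")]   # protein-protein
METABOLIC = [("GeneC", "MetX"), ("MetX", "GeneE")]                           # enzyme-metabolite


def merge_layers(*layers):
    """Combine several edge lists into one undirected adjacency map."""
    graph = defaultdict(set)
    for layer in layers:
        for a, b in layer:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def hubs(graph, top=1):
    """Return the `top` nodes with the most neighbors (ties broken by name)."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:top]


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


network = merge_layers(REGULATION, INTERACTION, METABOLIC)
print(hubs(network))                            # highest-degree node in the merged network
print(shortest_path(network, "TF1", "GeneE"))   # a path that crosses all three layers
```

In this toy data the highest-degree node only emerges once the layers are merged, and the path from a transcription factor to a metabolic enzyme passes through all three layers. Real networks would also need edge directions, weights, and confidence scores, which this sketch leaves out.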