Download network - bioinf leipzig
Networks in Biology
Gene Regulatory Networks (GRNs)
Dr. Katja Nowick
[email protected]
www.nowick-lab.info

Networks in Biology
Networks in cells (molecular networks):
• Metabolic networks
• Gene regulatory networks
• Protein-protein interaction networks
Networks between cells:
• Neural networks
• Immune system
Networks in ecosystems:
• Food networks
• Cooperation/symbiosis
Social networks:
• Friendships
• Epidemiology
The identity of the nodes (vertices) and the meaning of the links (edges) depend on the network studied.

Characteristics of biological networks
• Node degree distribution follows a power law
• Small-world characteristics
• Hierarchical and modular organization
• Overrepresentation of certain network motifs
• Preferential attachment
• Dynamic

Typical parameters analyzed in a network
• Node degree (hubness)
• Neighborhood
• Centralization
• Clustering coefficient
• Centrality (betweenness centrality, closeness centrality)

Why are cells different from each other?

Examples of Gene Regulation Networks
• Stem cell differentiation regulation
Nodes: genes, including transcription factors (TFs)
Links: interactions (who regulates the expression of whom)
• Directional or bidirectional
• Activating or repressing
• Feedback and other loops
MacArthur et al., PLoS ONE 3: e3086 (2008)

Examples of Gene Regulation Networks
• TF network of E. coli
Ca. 20% of all interactions in E. coli
Here nodes are operons (genes on the same mRNA)
Links: TF X regulates operon Y

Examples of Gene Regulation Networks
• TF network of Drosophila embryonic development
Transcription + translation (gene expression)!
TFs are also proteins, so some of the generated proteins regulate new genes, which forms a network.

Examples of Gene Regulation Networks
• Stem cell differentiation regulation
Nodes: genes, including transcription factors (TFs)
Links: interactions (who regulates the expression of whom)
• Directional or bidirectional
• Activating or repressing
• Feedback and other loops
MacArthur et al., PLoS ONE 3: e3086 (2008)

TFs regulate expression of other genes
[Figure: a TF bound at the promoter of a gene]
Many TFs have to come together to start/stop transcription of a target.

Transcription factors (TFs)
~1500 TFs in the human genome
[Figure: TF families in the human genome (ZNF, HOX, bHLH, bZip, NHR, FOX, T-Box, Paired Box, AP-2, TEA, GCM, E2F, RFX, Tubby, AF-4, Dwarfin, Jumonji, Bromodomain, Heat shock, Methyl-CpG-binding, Trp cluster, β-Scaffold, Pocket domain, Structural, Other); segment counts legible in the extraction: 117, 762, 199. Modified after Messina et al., 2004]

Some TFs bind DNA as dimers
• bHLH: basic helix-loop-helix TFs
• bZip: basic leucine zipper TFs
• NR: nuclear receptors
Homo-dimers or hetero-dimers: added complexity

Environmental signals trigger the GRN
• Activators
• Repressors

TFs are often hubs in the GRNs
• TFs and their target genes

TF binding to DNA
TF binding sites (TFBS): short sequence motifs, degenerate
Enhancers are sites on the DNA that are bound by activators in order to loop the DNA, bringing a specific promoter to the initiation complex. Enhancers are much more common in eukaryotes than in prokaryotes, where only a few examples are known to date.
Silencers are DNA sequence regions that, when bound by particular transcription factors, can silence expression of the gene.
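
The node and link definitions above (directed links that are either activating or repressing) and the remark that TFs are often hubs can be sketched in a few lines of Python. This is only an illustration: the gene names, the edge list, and the hub threshold are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical toy GRN: (regulator, target, sign).
# sign = +1 for an activating link, -1 for a repressing link.
EDGES = [
    ("TF_A", "TF_B", +1),
    ("TF_A", "gene_1", +1),
    ("TF_A", "gene_2", -1),
    ("TF_B", "TF_A", -1),    # feedback loop: TF_B represses its activator
    ("TF_B", "gene_2", +1),
    ("TF_B", "TF_B", +1),    # autoregulation
]


def degrees(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for regulator, target, _sign in edges:
        out_deg[regulator] += 1
        in_deg[target] += 1
    nodes = set(in_deg) | set(out_deg)
    return {node: (in_deg[node], out_deg[node]) for node in nodes}


def hubs(edges, min_out_degree):
    """Return the sorted nodes that regulate at least min_out_degree targets."""
    return sorted(
        node
        for node, (_in, out) in degrees(edges).items()
        if out >= min_out_degree
    )


if __name__ == "__main__":
    for node, (n_in, n_out) in sorted(degrees(EDGES).items()):
        print(f"{node}: in-degree {n_in}, out-degree {n_out}")
    print("hubs:", hubs(EDGES, min_out_degree=3))
```

In this toy network only the two TFs reach the out-degree threshold, while the target genes have incoming links only.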
TFs recognize specific sites/motifs in DNA
• TFs bind short sequence motifs
• Motifs are degenerate

TFs interact to regulate their targets
• TFs cooperate to regulate their targets
[Figure: several TFs bound together at the promoter of a gene]

TFs interact to regulate their targets
• Co-occurrence of TF binding sites in the genome
ENCODE 2012

Complex TF interactions
• Summary
• TFs bind as monomers, homo-dimers, or hetero-dimers
• Multiple TFs (~7-10) cooperate to regulate gene expression
• TFs regulate the expression of other TFs
• Feedback loops, autoregulation …
• It makes sense to represent this complexity in a network

TFs: what is known and what not

Not only TFs regulate gene expression
General TFs: RNA polymerase II transcription-initiation complex
Specific TFs: activate or repress expression of particular genes
GRFs*
Cofactors: bridge between specific and general TFs; activate or repress
Chromatin remodelers: make DNA accessible or inaccessible
miRNAs: bind to mRNAs to degrade them
*GRF = Gene Regulatory Factor

Epigenetic control of gene expression
• Chromatin remodelers
Examples of epigenetic/histone modifications
Temporal changes of the epigenome

Interactions between TFs and histone modifications
• Histone modifications influence chromatin states
• Chromatin states influence binding of TFs
• TFs interact with enzymes that modify histones

miRNAs
• = small non-coding RNA molecules (ca. 22 nucleotides)
• > 1000 miRNAs in the human genome
A primary miRNA (pri-miRNA) transcript is encoded in the cell's DNA and transcribed in the nucleus, processed by the enzyme Drosha, and exported into the cytoplasm, where it is further processed by Dicer.
After strand separation, the mature miRNA represses protein production either by blocking translation or by causing transcript degradation.

Interactions between TFs, miRNAs, other ncRNAs, and histone modifications
• Neurogenesis

Interactions between TFs, miRNAs, other ncRNAs, and histone modifications
• TFs bind as monomers, homo-dimers, or hetero-dimers
• Multiple TFs (~7-10) cooperate to regulate gene expression
• TFs regulate the expression of other TFs
• Feedback loops, autoregulation …
• Network
• Add epigenetic modifications
~375 million interactions
• Add ncRNAs
• Even more complex networks
~5000 ncRNAs

Why are tissues different from each other?
Cell states are defined by gene expression.
How is a gene activated or repressed (at a certain time and location)?
So let's talk about the links now.

Examples of Gene Regulation Networks
• Stem cell differentiation regulation
Nodes: genes, including transcription factors (TFs)
Links: interactions (who regulates the expression of whom)
• Directional or bidirectional
• Activating or repressing
• Feedback and other loops
MacArthur et al., PLoS ONE 3: e3086 (2008)

Goal: discover which gene is regulated by which TF
How do we get the information for the links?

Network construction based on literature
• Manual
• Semi-automated (e.g. preBIND)
• Natural Language Processing (NLP) (e.g. PathwayStudio)
preBIND: Donaldson I, et al. BMC Bioinformatics. 4:11 (2003)

Is the network encoded in the DNA?
TFs bind to specific motifs.
It should be possible to predict TF target genes by reading the DNA.
http://fasta.bioch.virginia.edu/cshl/

Experimental approaches
• Expensive
• Time-consuming
• For one research group, only feasible for a few TFs
A collection of TFBS can be found in databases: JASPAR, TRANSFAC

Motif databases
• JASPAR: http://jaspar.genereg.net/
• TRANSFAC: http://www.gene-regulation.com/pub/databases.html

How good is a motif?
To score a single site s for its match to a motif W, we use Pr(s | W).

How good is a motif?
• Scoring motif matches
• Pr(s | W) is the key idea. However, some statistical adjustment is applied to it.
Consider a genome that is very A/T rich: Pr(A) = 0.45, Pr(T) = 0.45, Pr(C) = 0.05, Pr(G) = 0.05
We saw that Pr(ACACGTT | W) = 0.048. In fact, Pr(ACATGTT | W) = 0.048 too.
• Compute the probability of each site under the above "background model":
Pr(ACACGTT) = 0.45 × 0.05 × 0.45 × 0.05 × 0.05 × 0.45 × 0.45 = 0.0000051
So Pr(ACACGTT | W) = 0.048 is 9364 times Pr(ACACGTT).
Similarly, Pr(ACATGTT) = 0.0000461
So Pr(ACATGTT | W) = 0.048 is 1040 times Pr(ACATGTT).
• In other words, if we compare how well "W explains the site" to how well "random background explains it", then ACACGTT stands out.

How good is a motif?
• The log-likelihood ratio (LLR) score
Given a motif W, background nucleotide frequencies Wb, and a site s:
LLR score of s = log( Pr(s | W) / Pr(s | Wb) )
Good scores > 0. Bad scores < 0.

Finding the TF target gene
• So, what to do with the motif now?
Find motif matches in the DNA.
Typically, people designate the gene closest to the motif match as the TF's target.

Motif discovery
We assumed that we have an experimental characterization of a TF's binding specificity (the motif).
What if we don't?
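
The background-model calculation and the LLR score from the "How good is a motif?" slides can be reproduced with a short Python sketch. The background frequencies and the value Pr(s | W) = 0.048 come from the slides. The motif W itself is not given there, so `motif_probability` takes a position-specific probability table that the caller must supply (a hypothetical input format), and the natural logarithm is an assumption because the slides do not state the base.

```python
import math

# Background nucleotide frequencies of the A/T-rich example genome (from the slides).
BACKGROUND = {"A": 0.45, "T": 0.45, "C": 0.05, "G": 0.05}


def background_probability(site, background=BACKGROUND):
    """Pr(s | Wb): product of the background frequencies of the bases in the site."""
    p = 1.0
    for base in site:
        p *= background[base]
    return p


def motif_probability(site, position_probs):
    """Pr(s | W) for a motif given as one {base: probability} dict per position.

    The table format is an assumption; the slides only quote the result 0.048.
    """
    if len(site) != len(position_probs):
        raise ValueError("site and motif must have the same length")
    p = 1.0
    for base, column in zip(site, position_probs):
        p *= column[base]
    return p


def llr_score(p_site_given_motif, site, background=BACKGROUND):
    """LLR = log(Pr(s | W) / Pr(s | Wb)); natural log assumed."""
    return math.log(p_site_given_motif / background_probability(site, background))


if __name__ == "__main__":
    for site in ("ACACGTT", "ACATGTT"):
        ratio = 0.048 / background_probability(site)
        print(site, f"ratio = {ratio:.0f}", f"LLR = {llr_score(0.048, site):.2f}")
```

Both sites score above 0, and ACACGTT scores higher because it is much less likely under the A/T-rich background.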
We can try computational motif discovery.

Motif discovery – Option 1
Try to find the motif given the promoter regions of the five genes G1, G2, … G5

Motif discovery – Option 2

Motif discovery – some algorithms
Idea: find a motif with many matches (significantly more than expected by chance) in the given sequences

Motif discovery – some tools

Is the network encoded in the DNA?
TFs bind to specific motifs.
It should be possible to predict TF target genes by reading the DNA.
Is this really so simple?
• For most TFs the binding site is not known
• Since TFBS are degenerate, it is hard to predict how efficiently the TF really binds
• How far away can the binding site be from the promoter?
• Multiple TFs might compete for the same binding site
• Is the nearest gene really the target gene?
• Does the binding event have an effect at all?
• …

Does the TF binding really have an effect?
Problem: TFs bind at many places. But is a gene actually regulated by the binding event?
Combine motif-finding experiments with experiments that change the TF expression (perturbation experiments):
• Chromatin immunoprecipitation (ChIP)-Seq
• Overexpression or knock-down of TFs in cell lines, followed by RNA-Seq

Inferring networks from perturbations
Sachs et al. Science. 2005 308:523-9
The topology of regulatory molecular biological networks can be reverse engineered by analyzing a set of perturbations.
Picture: reverse engineering of the hierarchy of a cell signaling network using multiple perturbations and a statistical method called Bayesian network inference.

Inferring Networks from Time Series Microarrays
Zou M, Conzen SD. Bioinformatics. 2005 21(1):71-9.
Regulatory interactions can also be inferred directly from data, i.e. biological pathways/networks are reverse engineered from data. In the example above, time-series expression data is used to infer a directed and signed graph based on delayed correlations.

Why are tissues different from each other?
• Summary

GRNs are hierarchical
[Figure: hierarchy with three levels. Top layer: kernels, initial TFs. Core layer. Bottom layer: differentiation batteries, terminal TFs]

GRNs are hierarchical – Yeast
Yeast regulatory network of 13385 regulatory interactions among 4503 genes, which includes 158 TFs and 4369 target genes. The model, based on experimental evidence in yeast, organizes TFs into three distinct layers: the top, core, and bottom layers. TFs within a layer are highly interconnected and share similar properties. TFs of the different layers regulate distinct sets of target genes. The three layers are also connected by a central skeleton, a feed-forward structure that uses the TFs of the top layer to regulate TFs of the core layer, and TFs of the core layer to regulate TFs of the bottom layer. The core layer has the highest number of TFs and hubs and is important for propagating signals that regulate almost all targets.

GRNs are hierarchical – Development
e.g. Drosophila development
Developmental biologists have proposed a concept that concentrates on the temporal order of events in developmental pathways. In this system, modules are classified as kernels, plug-ins, input-output switches, and differentiation batteries. Each module can be thought of as fulfilling one specific function. Kernels are the initial modules of the network and affect most other parts of the network. They are, for instance, involved in initiating the development of certain body parts. Differentiation batteries may play a role in the terminal steps of the differentiation of body parts and generally do not affect other parts of the network.
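
The three-layer picture can be illustrated with a deliberately simplified rule, which is an assumption made here and not the method behind the yeast model: among TF-to-TF links, a TF that no other TF regulates is placed in the top layer, a TF that regulates no other TF is placed in the bottom layer, and everything else is core. The TF names and edges below are hypothetical.

```python
def classify_tf_layers(tf_edges):
    """Assign each TF to 'top', 'core', or 'bottom' from TF->TF edges.

    Simplified rule (an assumption, not the published procedure):
      top    = not regulated by any other TF
      bottom = regulated by another TF, but regulates no other TF
      core   = regulated by another TF and regulates another TF
    Autoregulation (a TF regulating itself) is ignored.
    """
    tfs = set()
    regulators = set()   # TFs with at least one outgoing TF->TF link
    regulated = set()    # TFs with at least one incoming TF->TF link
    for source, target in tf_edges:
        tfs.update((source, target))
        if source != target:
            regulators.add(source)
            regulated.add(target)

    layers = {}
    for tf in tfs:
        if tf not in regulated:
            layers[tf] = "top"
        elif tf not in regulators:
            layers[tf] = "bottom"
        else:
            layers[tf] = "core"
    return layers


# Hypothetical toy hierarchy with a feed-forward skeleton top -> core -> bottom.
TOY_TF_EDGES = [
    ("T1", "C1"), ("T1", "C2"),
    ("C1", "C2"), ("C2", "C1"),   # core TFs are interconnected
    ("C1", "B1"), ("C2", "B2"),
    ("B1", "B1"),                 # autoregulation of a terminal TF
]

if __name__ == "__main__":
    for tf, layer in sorted(classify_tf_layers(TOY_TF_EDGES).items()):
        print(tf, layer)
```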
GRNs are hierarchical – cell fate
[Figure: hierarchy with three levels. Top layer: kernels, initial TFs. Core layer. Bottom layer: differentiation batteries, terminal TFs]
Terminal selector TFs (acting either alone or in synergistic combination) activate downstream target genes directly via terminal selector motifs and also autoregulate their own expression via those motifs. Autoregulated expression of a terminal selector is critical for maintaining the differentiated features of the cell. Downstream targets of terminal selectors (X) define the differentiated properties of a neuron, such as neurotransmitter receptors, ion channels, and adhesion proteins. Targets may also include TFs that regulate specific "subroutines". TFs that are induced by terminal selectors may also cooperate with terminal selector proteins in a feed-forward loop configuration to jointly control specific terminal genes.
Hobert O, PNAS 2008;105:20067-20071
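
The two wiring patterns named above, autoregulation and the feed-forward loop (X regulates Y, and X and Y both regulate Z), can be searched for in an edge list. The following is a minimal sketch on a hypothetical toy network; the node names are placeholders, not genes from Hobert (2008).

```python
from itertools import permutations


def autoregulated(edges):
    """Return the sorted nodes that regulate their own expression."""
    return sorted({source for source, target in edges if source == target})


def feed_forward_loops(edges):
    """Return sorted (X, Y, Z) triples with links X->Y, Y->Z and X->Z."""
    edge_set = {(s, t) for s, t in edges if s != t}
    nodes = {node for edge in edge_set for node in edge}
    return sorted(
        (x, y, z)
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )


# Hypothetical toy network: a terminal selector (TS) that autoregulates,
# induces a downstream TF, and controls terminal genes together with it.
TOY_EDGES = [
    ("TS", "TS"),                 # autoregulation
    ("TS", "TF_sub"),             # terminal selector induces another TF
    ("TS", "ion_channel"),
    ("TF_sub", "ion_channel"),    # closes the feed-forward loop
    ("TS", "receptor"),
]

if __name__ == "__main__":
    print("autoregulated:", autoregulated(TOY_EDGES))
    print("feed-forward loops:", feed_forward_loops(TOY_EDGES))
```

The exhaustive search over node triples is fine for small examples like this one; a genome-scale network would need an approach that follows each node's successors instead.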