Protein-coding genes
Organisation of the human genome
• Nuclear genome (3.2 Gbp): 24 types of chromosomes (Y: 51 Mb; chr1: 279 Mb)
• Mitochondrial genome
• 1.5% exons
• Introns (junk)
• Intergenic regions (junk)

The genome is empty?

Estimated number of genes:
  Saccharomyces cerevisiae (baker’s yeast)    6,034
  Drosophila melanogaster (fruit fly)        13,061
  Caenorhabditis elegans (roundworm)         19,099
  Arabidopsis thaliana (mustard plant)       25,000

Increasing biological complexity requires genomic changes that increase the informational capacity of the system...
...but the number of genes in the sequenced genomes does not match what would be expected (apparently).

Gene counts in sequenced genomes:
  Amphimedon queenslandica         18693
  Trichoplax adhaerens             11514
  Bos taurus                      >22790
  Nematostella vectensis           18000
  Nasonia vitripennis              17279
  Homo sapiens                     21527
  Mus musculus                     22083
  Danio rerio                      21413
  Drosophila melanogaster          13781
  Ciona intestinalis               16000
  Takifugu rubripes                18500
  Caenorhabditis elegans           20224
  Strongylocentrotus purpuratus    23300
  Anolis carolinensis              17000
  Xenopus tropicalis               18000
  Gallus gallus                   <17000
  Arabidopsis thaliana             26000
  Gorilla gorilla                  21000
  Oryza sativa                     50000
  Pan troglodytes                  21000
  Populus trichocarpa              45550
  Glycine max                      75778

Why doesn’t (coding) gene number matter?
• More sophisticated regulation of expression?
• Proteome vastly larger than genome?
  – Alternative splicing
  – RNA editing
• Post-translational modifications
• Cellular location

…but remember, there are other genes

Genes in the genome:
• Protein-coding genes (mRNA): around 20500 (as of 10/2012)
• Non-coding RNAs
  – Ribosomal RNA (rRNA)
  – Transfer RNA (tRNA)
  – Small nuclear RNA (snRNA)
  – Small nucleolar RNA (snoRNA)
  – microRNA (miRNA)
  – Other non-coding RNAs (Xist, 7SK, etc.)
• Pseudogenes

Non-polypeptide-coding: RNA-encoding

Statistics about the current Gencode freeze (version 13)
*The statistics derive from the gtf files, which include only the main chromosomes of the human reference genome.
Version 13 (March 2012 freeze, GRCh37)
General stats:
  Total No of Genes                                     55123
  Protein-coding genes                                  20670
  Long non-coding RNA genes                             12393
  Small non-coding RNA genes                             9173
  Pseudogenes                                           13123
  Total No of Transcripts                              182967
  Protein-coding transcripts                            77901
  Long non-coding RNA loci transcripts                  19835
  Total No of distinct translations                     78119
  Genes that have more than one distinct translation    14235

Protein-coding genes (mRNA)

Human genes and their homology to genes from other organisms

CODING GENES

Noncoding regions in coding genes
• Regulatory regions
  – RNA polymerase binding site
  – Transcription factor binding sites
  – Polyadenylation [poly(A)] sites
  – Enhancers
• 5’- and 3’-UTRs

DNA as a series of ‘docking’ sites
It is the location of these docking sites relative to one another that permits genes to be transcribed, spliced, and translated properly and in specific spatial and temporal patterns.

…some more statistics
• Gene density: 1 gene per 100 kb (varies widely)
• On average 9 exons per gene
• 363 exons in the titin gene
• Many genes are intronless
• Largest intron: 800 kb (WWOX gene)
• Smallest introns: 10 bp
• Average 5’ UTR: 0.2–0.3 kb
• Average 3’ UTR: 0.77 kb, but underestimated…
• Largest protein: titin, 38,138 aa
• Largest gene: dystrophin

Human genes vary enormously in size and exon content

An example of a complex human gene locus: INK4a-ARF
From: Prof. Gordon Peters’ website

Genes within genes
Intron 26 of the neurofibromatosis gene (NF1) encodes:
• OMGP (oligodendrocyte myelin glycoprotein)
• EVI2A and EVI2B (homologues of ecotropic viral integration sites in mouse)

Why doesn’t gene number matter?
• More sophisticated regulation of expression
• Proteome vastly larger than genome
  – Alternative splicing
  – RNA editing…
• Post-translational modifications
• Co-option
• GRN (gene regulatory network) connectivity

Dynamic networks

Why doesn’t gene number matter?
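The "proteome vastly larger than genome" point can be checked roughly against the Gencode version 13 counts quoted earlier. The sketch below only divides numbers taken from that table; the dictionary keys and the `summarize` helper are names invented here, and dividing the "more than one distinct translation" count by the number of protein-coding genes is an assumption about how that row was tallied.

```python
# Counts copied from the Gencode version 13 (March 2012 freeze, GRCh37) table.
# Key names are illustrative, not Gencode field names.
GENCODE_V13 = {
    "total_genes": 55123,
    "protein_coding_genes": 20670,
    "protein_coding_transcripts": 77901,
    "genes_with_multiple_translations": 14235,
}


def summarize(stats):
    """Return simple ratios derived from the gene and transcript counts."""
    coding = stats["protein_coding_genes"]
    return {
        # Fraction of all annotated genes that are protein-coding.
        "protein_coding_share": coding / stats["total_genes"],
        # Mean number of annotated transcripts per protein-coding gene.
        "transcripts_per_coding_gene": stats["protein_coding_transcripts"] / coding,
        # ASSUMPTION: genes with several distinct translations are all
        # protein-coding genes, so this is a share of the 20670.
        "multi_translation_share": stats["genes_with_multiple_translations"] / coding,
    }


if __name__ == "__main__":
    for name, value in summarize(GENCODE_V13).items():
        print(f"{name}: {value:.3f}")
```

On these figures protein-coding genes are a little over a third of annotated genes, yet each carries close to four annotated transcripts on average, which is the sense in which transcript diversity outruns gene number.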
Table 1. Levels of regulation (loci of control constraints) above the genome.
Levels and transitions: dynamic regulatory system
1. Genome to transcriptome: Epigenetic regulation of gene expression (5). Includes pathways that detect energy levels (redox levels) and repress DNA transcription when cellular NADH levels are increased.
2. Transcriptome to proteome: Regulatory constraints include post-translational modification of proteins.
3. Proteome to dynamic system: Metabolic networks of glycolysis and mitochondrial oxidation-reduction are the dynamic systems presently the best understood in terms of both mechanism of formation and operating principles. They display control distributed over all enzymes of a network, and their phenotype includes cellular redox potential.
4. Dynamic systems to phenotype: Control of global phenotype such as disease may be localized to a single regulatory system (such as metabolic, hormone signaling, etc.) or be distributed over many systems and levels.

Gene Expression
• The products of genes may be RNA or protein
• RNA and protein synthesis occur in many steps
• These steps are regulated and controlled

Figure: UCSC

Location of CpG islands in the gene
CpG islands do NOT have a deficit of CpG dinucleotides

How epigenetics works
Figure labels: promoter region, CpG island, gene, RNA, proteins; legend: CpG, methylated CpG
• Unmethylated CpGs relax chromatin
• Methylated CpGs constrain chromatin

Chromatin Modification
• Chromatin remodeling: SWI/SNF
• Transcription factor modification: acetylation, phosphorylation
• DNA methylation: CpG dinucleotides, MeCP2
• Histone substitution: H2AZ, H2AX, H3.3
• Histone modification: acetylation, ubiquitination, sumoylation, methylation, phosphorylation

Eukaryotic transcription regulation
Modular construction and combinatorial control
• The regulatory sequence (cis element) on DNA consists of multiple motifs, each specific for a transcription factor.
• Multiple transcription factors can bind simultaneously to the regulatory sequences and act together on the transcription of the gene.
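The modular, combinatorial picture of a regulatory region can be sketched as a motif scan followed by a simple logic rule. This is a toy illustration only: the three motifs are textbook promoter consensus elements chosen for convenience, the example sequence is made up, only the forward strand is scanned, and the all-factors-required (AND) rule in `combinatorial_output` is an assumption rather than a model of any real promoter.

```python
# Textbook promoter elements used purely for illustration.
MOTIFS = {
    "TATA_box": "TATAAA",
    "GC_box": "GGGCGG",
    "CCAAT_box": "CCAAT",
}

# Hypothetical regulatory sequence (not taken from any real gene).
EXAMPLE_SEQUENCE = "ACGGGCGGTTCCAATGCATATAAAGGC"


def find_sites(sequence, motifs):
    """Return sorted (position, motif_name) pairs for every exact match."""
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        start = sequence.find(motif)
        while start != -1:
            hits.append((start, name))
            # Step by one so overlapping occurrences are also reported.
            start = sequence.find(motif, start + 1)
    return sorted(hits)


def bound_factors(sequence, motifs):
    """Set of motif names with at least one site in the sequence."""
    return {name for _, name in find_sites(sequence, motifs)}


def combinatorial_output(sequence, motifs, required):
    """Toy AND rule: 'on' only if every required motif has a site."""
    return set(required) <= bound_factors(sequence, motifs)


if __name__ == "__main__":
    print(find_sites(EXAMPLE_SEQUENCE, MOTIFS))
    print(combinatorial_output(EXAMPLE_SEQUENCE, MOTIFS, MOTIFS.keys()))
```

Removing any one motif from the sequence switches the toy output off, which is the point of combinatorial control: the same small set of factors can give different outcomes depending on which sites a given regulatory region carries.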
Regulated Transcription
Figure labels: co-activator protein, general transcription factors, TBP, transcriptional activators binding to promoter region, TATA, -35, Gene X

Activators stimulate the highly cooperative assembly of initiation complexes
• Binding sites for activators that control transcription of the mouse TTR gene (Figure 10-60)
• Model for cooperative assembly of an activated transcription-initiation complex in the TTR promoter (Figure 10-61)
(TTR = transthyretin)

Distant Cis-Acting Elements
• Locus Control Region: regulatory site required for optimal expression of an adjacent group of genes
• Insulator Element: prevents activation/repression extending to an adjacent regulatory sequence

ALTERNATIVE PROMOTERS
Sex-specific regulation of the DNMT1 (methyltransferase) gene: oocyte, somatic, or spermatocyte promoters

Posttranscriptional control
• Regulation of RNA processing
• Regulation of mRNA degradation
• Regulation of translation

mRNA: many places for variation, modification, regulation
• transcription: initiation, elongation, termination
• 5’ capping
• 3’ polyA addition: alternative sites
• splicing: alternative exons; self-splicing, spliceosome-mediated
• editing: changing bases and codons
• stability: nonsense-mediated decay, degradation signals, sequestration
• nuclear export: mature mRNA only
• localization in cytoplasmic compartments
• access to translation machinery
• antisense/RNA interference: inhibit translation

The PolyA Site (PAS)
Figure labels: 3’ exon, stop, UTR, PolyA signal AATAAA, ~17 nt, PAS, poly(A) tail

Alternative polyadenylation sites
Alternative PAS & Post-transcriptional (de)regulation
Figure labels: coding sequence; 3' UTR with possible regulatory elements (stability, translation, transport); AUUAAA signals at several positions

Use of an abnormal polyA site is associated with various diseases:
• α/β thalassemia (globin)
• Mantle cell lymphoma (Cyclin CCND1)
• Teratocarcinoma (PDGF)
• Hypertension (Ca2+ ATPase)

Consensus nucleotides at intron/exon junctions

Alternative splicing is a mechanism for generating functional diversity

Alternative processing example

RNA editing
RNA editing is a rare form of post-transcriptional processing whereby base-specific changes are enzymatically introduced at the RNA level.
Types of RNA editing in humans:
(i) C ---> U, carried out in humans by a specific cytosine deaminase, e.g. the expression of the human apolipoprotein B gene in the intestine involves tissue-specific RNA editing (Apo B-100 versus Apo B-48)
(ii) A ---> I, the amino group on carbon 6 of adenine is replaced by a carbonyl group. The resulting inosine (I) then acts as a G. Occurs in some ligand-gated ion channels.
(iii) U ---> C, in mRNA of the WT1 Wilms’ tumor gene
(iv) U ---> A, in alpha-galactosidase mRNA

Gene Expression
• The products of genes may be RNA or protein
• RNA and protein synthesis occur in many steps
• These steps are frequently regulated

3. Protein Phosphorylation
Post-translational modifications that alter activity of the p53 protein. Enzymes that have been shown to modify specific amino acid residues of p53 are shown. Enzymes that inhibit the covalent modifications are indicated in red. P, phosphorylation; R, ribosylation; Ac, acetylation.

…increasing informational capability of the genome, but there are other genes….
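The C ---> U editing case listed under RNA editing can be illustrated with a toy translation. In apolipoprotein B the edit is usually described as turning a CAA (glutamine) codon into a UAA stop codon, which is how the intestine makes the shorter Apo B-48 from the same gene that yields Apo B-100. The sketch below assumes a made-up 18-nucleotide mRNA (not the real APOB sequence), a deliberately partial codon table, and hypothetical helper names.

```python
# Partial codon table: only the codons that occur in the toy mRNA below.
CODON_TABLE = {
    "AUG": "M", "GAU": "D", "CAA": "Q", "UUU": "F", "GGA": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}

# Hypothetical mRNA fragment: AUG GAU CAA UUU GGA UGA
TOY_MRNA = "AUGGAUCAAUUUGGAUGA"


def translate(mrna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def edit_c_to_u(mrna, position):
    """Return a copy of the mRNA with the C at `position` (0-based) changed to U."""
    if mrna[position] != "C":
        raise ValueError(f"base at position {position} is not C")
    return mrna[:position] + "U" + mrna[position + 1:]


if __name__ == "__main__":
    edited = edit_c_to_u(TOY_MRNA, 6)  # first base of the CAA codon
    print(translate(TOY_MRNA))  # full-length toy peptide
    print(translate(edited))    # truncated peptide after CAA -> UAA
```

A single-base change at the RNA level is enough to shorten the product, without any change to the gene itself; that is the sense in which editing adds to the informational capacity discussed in these slides.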