Distribution of sequences in the human genome

Human genome: 3200 Mb
  Genes and related sequences: 1200 Mb
    Genes: 48 Mb
    Related sequences: 1152 Mb (pseudogenes; gene fragments; introns, UTRs)
  Intergenic DNA: 2000 Mb
    Interspersed repeats: 1400 Mb
      LINE: 640 Mb
      SINE: 420 Mb
      LTR elements: 250 Mb
      DNA transposons: 90 Mb
    Other intergenic regions: 600 Mb
      Microsatellites: 90 Mb
      Various: 510 Mb

50 kb of genome compared

Transcription factors in eukaryotes

Types of repeats in the human genome (number of copies)

SINE: 1,558,000
  Alu: 1,090,000
  MIR: 393,000
  MIR3: 75,000
LINE: 868,000
  LINE-1: 516,000
  LINE-2: 315,000
  LINE-3: 37,000
LTR elements: 443,000
  ERV class I: 112,000
  ERV(K) class II: 8,000
  ERV(L) class III: 83,000
  MaLR: 240,000
DNA transposons: 294,000
  hAT: 195,000
  Tc-1: 75,000
  PiggyBac: 2,000
  Unclassified: 22,000

Whole-genome comparison

To make the prediction more reliable, each pair of candidate orthologs must satisfy a reciprocal condition: each of the two genes must be the most similar match to the other when compared against the entire other genome (Genome 1 against Genome 2, and Genome 2 against Genome 1). If both relations hold, the two proteins (shown in yellow) can be proposed as orthologs.

Tools for Comparative Genomics

• UCSC Browser: This site contains the reference sequence and working draft assemblies for a large collection of genomes.
• Ensembl: The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online.
• MapView: The Map Viewer provides a wide variety of genome mapping and sequencing data.[26]
• VISTA: A comprehensive suite of programs and databases for comparative analysis of genomic sequences. It was built to visualize the results of comparative analysis based on DNA alignments. The presentation of comparative data generated by VISTA suits both small and large scales of data.

Comparative genomics at the vertebrate extremes
Dario Boffelli, Marcelo A. Nobrega and Edward M. Rubin
Nature Reviews Genetics, volume 5, June 2004, p. 457

Annotators of the human genome are increasingly exploiting comparisons with genomes at both the distal and proximal evolutionary edges of the vertebrate tree. Despite the sequence similarity between primates, comparisons among members of this clade are beginning to identify primate- as well as human-specific functional elements. At the distal evolutionary extreme, comparing the human genome to that of non-mammal vertebrates such as fish has proved to be a powerful filter to prioritize sequences that most probably have significant functional activity in all vertebrates.

Conservation in enhancers shared by human and fish

A core enhancer in an intron in DACH is >98% identical for 350 bp in humans, mice and rats. In the ~1 billion years of parallel evolutionary time that separates human, mouse, rat, chicken, frog and fish, only 6 substitutions occurred in a 120-bp fragment that corresponds to an enhancer, 4 of which occurred in the frog lineage alone, and none occurred in the mammalian lineage.

Defining functional DNA elements in the human genome
Manolis Kellis et al., PNAS, April 29, 2014, vol. 111, no. 17, 6131–6138

With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease.
However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.

The complementary nature of evolutionary, biochemical and genetic evidence
[Figure labels: Encyclopedia of DNA Elements (ENCODE) Project; DNA that produces a phenotype upon alteration; GERP++ elements from 34 mammal alignments]

Epigenetic and evolutionary signals in cis-regulatory modules (CRMs) of the HBB complex

Primate phylogenetic tree

Biomedical relevance

• Specific differences in cytochrome P450 genes that are involved in drug metabolism, or in other genetic components that are relevant to disease, such as pathways involving the melanocortin receptor, methyltransferases and the parathyroid hormone receptor 1.
• Macaques have an expanded array of MHC class I genes that are central to their response to infectious agents and other immune system processes.
• Rhesus macaques carry variants in the ornithine carbamoyltransferase (OTC), phenylalanine hydroxylase (PAH) and N-acetylglucosaminidase alpha (NAGLU) genes that predispose some human individuals to disease.
• Chimpanzees carry ‘disease’ alleles in genes that are related to cancer (mutL homologue 1 (MLH1)), diabetes mellitus (peroxisome proliferator-activated receptor γ (PPARG)) and Alzheimer’s disease (apolipoprotein E (APOE)).

Transcriptomics.
Expression of drug-metabolizing P450 genes and some amino acid sequences differ between cynomolgus and rhesus macaques [98], which has implications for pharmacokinetics.

The fetal-to-adult “switch” at the β-globin locus

The Albumin / Alpha-fetoprotein (ALB/AFP) locus
[Figure: map of the ALB/AFP locus (scale in kb) with Hind III (AAGCTT) and Sau3A (GATC) restriction sites, the Ealb and Eafp enhancers, and the ALB and AFP promoters in their active and inactive states before birth and after birth.]

3C “Carbon Copy” (5C)

Hi-C, a method that probes the three-dimensional architecture of whole genomes

1. Crosslink DNA
2. Cut with restriction enzyme
3. Fill ends and mark with biotin
4. Ligate blunt ends
5. Purify and shear DNA
6. Pull down biotin
7. Sequence paired ends

Map of chromosome 14 at 1 Mb resolution

(A) The map of chromosome 14 at 1 Mb resolution exhibits substructure in the form of an intense diagonal and a constellation of large blocks.
(B) The observed/expected matrix shows loci with either more (red) or fewer (blue) interactions than would be expected given their genomic distance (range: 0.2 – 5).

Correlation map of chromosome 14 at a resolution of 100 kb

The principal component (eigenvector) correlates with the distribution of genes and with features of open chromatin.

Genome architecture at three scales

• Two compartments, corresponding to open and closed chromatin, spatially partition the genome.
• Chromosomes (blue, cyan, green) occupy distinct territories.
• Individual chromosomes weave back and forth between the open and closed chromatin compartments.
• At the scale of single megabases, the chromosome consists of a series of fractal globules.

Prof. Vincenzo De Simone
Lecture no. 3: Transcriptome analysis using DNA microarrays
Keywords: microarrays, gene chips, transcriptome.
Degree courses: Agricultural Biotechnology, Medical Biotechnology
Course: Advanced Molecular Biology
Instructor email: vincenzo.desimone@unina.it
Academic year: 2009-2010

Lecture no. 4: An example of a comparative transcriptome study using DNA microarrays
Keywords: microarrays, transcriptome, gene expression profiles.

Conclusions

• Structure and composition of genomes.
• Repeated sequences in the human genome.
• Tools for comparing genomes.
• Identification of CRMs (cis-regulatory modules) by comparing genomes at the extremes of the vertebrate evolutionary tree.
• Identification of CRMs through multidisciplinary approaches (biochemistry, genetics, genomics and comparative transcriptomics).
• Comparative genomics of primates and identification of regions with variable evolutionary rates and of alleles relevant to human disease.
• Interactions between distant genomic regions and the 3C/5C/Hi-C methods.
• The VISTA package for comparative genomics.
• The UCSC Genome Browser.
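
The reciprocal criterion used in these slides to propose orthologs between two genomes (each gene must be the best match of the other) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions: the gene names and similarity scores are invented, the function name `reciprocal_best_hits` is hypothetical, and a single symmetric score per gene pair stands in for the two separate searches a real comparison would run.

```python
from typing import Dict, List, Tuple

# Hypothetical similarity scores between genes of genome 1 (A*) and genome 2 (B*).
# The values are invented for illustration; higher means more similar.
example_scores: Dict[Tuple[str, str], float] = {
    ("A1", "B1"): 950.0,
    ("A1", "B2"): 300.0,
    ("A2", "B1"): 400.0,
    ("A2", "B2"): 120.0,
    ("A3", "B2"): 700.0,
    ("A3", "B3"): 650.0,
}


def reciprocal_best_hits(scores: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Return gene pairs in which each gene is the other's best-scoring match."""
    best_1_to_2: Dict[str, str] = {}  # best genome-2 match for each genome-1 gene
    best_2_to_1: Dict[str, str] = {}  # best genome-1 match for each genome-2 gene
    for (g1, g2), score in scores.items():
        if g1 not in best_1_to_2 or score > scores[(g1, best_1_to_2[g1])]:
            best_1_to_2[g1] = g2
        if g2 not in best_2_to_1 or score > scores[(best_2_to_1[g2], g2)]:
            best_2_to_1[g2] = g1
    # Keep a pair only if the relation holds in both directions.
    return sorted(
        (g1, g2) for g1, g2 in best_1_to_2.items() if best_2_to_1.get(g2) == g1
    )


print(reciprocal_best_hits(example_scores))
# prints [('A1', 'B1'), ('A3', 'B2')]
```

In this toy data, A2 finds B1 as its best match, but B1 prefers A1, so the pair A2/B1 is not proposed as orthologous.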
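
The observed/expected matrix mentioned in the Hi-C slides can be sketched in Python as follows. This is a simplified illustration, not the published procedure: it assumes the expected contact count at a given genomic distance is the mean of the observed counts along the corresponding diagonal of a single-chromosome matrix, and the function name `observed_over_expected` and the toy counts are hypothetical. The later correlation and principal-component steps are not shown.

```python
from typing import List

# Toy symmetric contact matrix for four genomic bins (invented counts).
toy_contacts: List[List[float]] = [
    [10, 4, 1, 2],
    [4, 10, 2, 1],
    [1, 2, 10, 4],
    [2, 1, 4, 10],
]


def observed_over_expected(contacts: List[List[float]]) -> List[List[float]]:
    """Divide each contact count by the mean count at the same bin separation."""
    n = len(contacts)
    # expected[d] is the mean observed count over all bin pairs d bins apart.
    expected = []
    for d in range(n):
        diagonal = [contacts[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    return [
        [
            contacts[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


oe = observed_over_expected(toy_contacts)
for row in oe:
    print([round(value, 2) for value in row])
```

Values above 1 mark bin pairs that interact more often than their separation alone would predict, and values below 1 mark pairs that interact less often, which corresponds to the red and blue entries of the matrix described in the slides.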