Download Lecture 8 - Brandeis Life Sciences
Lecture 9
Genome-wide analysis of nucleic acids

A Genome Revolution in Biology and Medicine
• We are in the midst of a "Golden Era" of biology
• The Human Genome Project has produced a huge storehouse of data that will be used to change every aspect of biological research and medicine
• The revolution is about treating biology as an information science, not about specific biochemical technologies.

Historical Milestones
Year       Milestone
1866       Mendel’s discovery of genes
1871       Discovery of nucleic acids
1951       First protein sequence (insulin)
1953       Double helix structure of DNA
1960s      Elucidation of the genetic code
1977       Advent of DNA sequencing
1975-79    First cloning of human genes
1986       Fully automated DNA sequencing
1995       First whole genome (Haemophilus influenzae)
1999       First human chromosome (Chr #22)
2000       Drosophila / Arabidopsis genomes
2001       Human and mouse genomes
Many more genomes since then!

• Genomic data
– Whole genome data sets, according to http://www.ebi.ac.uk/genomes/ as of 30 Sept 2004:

Archaea      19
Bacteria     167
Eukaryota    36
Organelles   569
Phages       137
Plasmids     204
Viroids      36
Viruses      911
TOTAL        2079

The ...omics

Genomics
• The application of high-throughput automated technologies to molecular biology.
• The experimental study of complete genomes.

Genomics Technologies
• Automated DNA sequencing
• Automated annotation of sequences
• DNA microarrays
– gene expression (measure RNA levels)
– single nucleotide polymorphisms (SNPs)
• Protein chips (SELDI, etc.)
• Protein-protein interactions

New Types of Biological Data
• Microarrays - gene expression
• Multi-level maps: genetic, physical, sequence splicing, expression, function
• Networks of protein-protein interactions
• Cross-species relationships
– homologous genes
– chromosome organization (synteny)
– common regulatory sequences

Biological Information
[Diagram: the cell at the center, linked to genome sequence, mRNA expression, protein 2-D gel, mass spec., and protein 3-D structure]

What is gene expression?
• The amount of RNA produced from a gene.
• The level of RNA produced from a gene is controlled by:
– Transcription
– Degradation
• Transcriptome - the transcripts expressed in a cell under defined experimental conditions.
– mRNA (5-10% of total RNA).
– rRNA, tRNA - make up most of total RNA

Analysis of gene expression at the single gene level.
• Northern Blots
– Measure RNA levels by hybridization of a labeled probe to total RNA.
• Reporter Genes
– Use of an enzyme to measure the amount of transcription from a promoter.
• Quantitative RT-PCR.

Assaying the regulation of 1000s of genes in a single experiment
• DNA microarrays
– DNA molecules printed at high density, used to determine the level of RNA or DNA in a sample.
– Can be thought of as “reverse Northern blots”
• Other technologies
– SAGE
– Microbeads

DNA Microarrays
• Spotted DNA arrays (glass slides)
– Competitive binding of samples
– Fluorescent detection - Cy3 and Cy5
– Small sample sizes (10-30 µl).
– PCR or cDNA arrays
– Long oligonucleotide arrays
• Short oligonucleotide arrays
– e.g. Affymetrix
• DNA spotted onto nylon membranes (macroarrays)

Applications of DNA microarrays
• Expression profiling
– Determining the relative levels of RNA in two or more samples.
• DNA/DNA hybridizations
– Investigate gene content between different strains
– Determine gene dosage
– 16S arrays - microbial communities (being developed).
• Identification of protein binding sites
– ChIP-chip: immunoprecipitation of protein/DNA complexes, then assaying those interactions with microarrays.

cDNA spotted microarrays

Labeling RNA or DNA with Cy3 or Cy5.
• Cy3 and Cy5 - the fluorescent molecules most often used to label samples for microarray analysis.
– Absorb light at one wavelength and emit at another.
– Emission and excitation spectra do not overlap significantly.
– In arrays, Cy3 and Cy5 are usually false-colored green (Cy3) and red (Cy5) for ease of visualization.
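
As a rough illustration of how two-color (Cy3/Cy5) spot intensities are commonly turned into relative expression levels, the sketch below computes a log2(Cy5/Cy3) ratio per spot. The slides do not specify this calculation. The log2 transform, the intensity floor, the two-fold threshold, the gene names, and the intensity values are all assumptions made for illustration.

```python
import math


def log_ratios(cy5, cy3, floor=1.0):
    """Return log2(Cy5/Cy3) for each spot.

    cy5, cy3: equal-length sequences of background-corrected intensities.
    floor: assumed minimum intensity, used to avoid dividing by zero or
           taking the log of zero on empty spots.
    """
    ratios = []
    for red, green in zip(cy5, cy3):
        red = max(red, floor)
        green = max(green, floor)
        ratios.append(math.log2(red / green))
    return ratios


def classify(log_ratio, threshold=1.0):
    """Label a spot using an assumed two-fold cutoff (|log2 ratio| >= 1)."""
    if log_ratio >= threshold:
        return "higher in Cy5 sample"
    if log_ratio <= -threshold:
        return "higher in Cy3 sample"
    return "unchanged"


# Hypothetical spots: (Cy5 intensity, Cy3 intensity). Not real data.
spots = {
    "geneA": (800.0, 200.0),
    "geneB": (150.0, 600.0),
    "geneC": (500.0, 500.0),
}

names = list(spots)
values = log_ratios([spots[n][0] for n in names], [spots[n][1] for n in names])
for name, value in zip(names, values):
    print(name, round(value, 2), classify(value))
```

Working on a log scale makes a two-fold increase and a two-fold decrease symmetric around zero, which is one reason log ratios are often preferred over raw ratios. Real analyses would normalize the two channels before this step.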
Affymetrix Gene Chips

Microarray Experiment - labeling, hybridizing, scanning

Affymetrix = Oligonucleotide Microarray
Each gene on an Affy chip is represented by a probe set

Rationale of Affy analysis
- MM (mismatch) probes are used to measure background signals due to non-specific sources and scanner offset.
- Using an MM probe as an estimate of background seems great in theory.
- The expression value for a gene is a combination of the (PM-MM) signals for each of the probes (e.g. the average), where PM is the perfect-match probe.

Microarray Data Analysis
• Data mining and visualization
• Controls and normalization of results
• Statistical validation
• Linkage between gene expression data and gene sequence/function/metabolic pathways databases
• Clustering and pattern detection
• Discovery of common sequences in coregulated genes

Regulons and Stimulons
• Operon - group of genes co-expressed on a single transcript.
– One location of the genome
• Regulon - genes that are regulated by a single transcription factor.
– Genes and operons throughout the genome
• Stimulon - collection of genes that are regulated in response to environmental changes.
– Can be multiple regulons affected at once.
• Regulatory network - alternative term for regulon.

Identifying genes whose expression changes at specific stages of the cell cycle

Microarray analysis of 150 damage-regulated mRNAs after a single unrepaired HO-induced DSB
[Figure: expression at 1, 2, 3, 4 and 6 hr for genes YCR033W through YCR045C (including FEN3, RRP43, RBK1, PHO87, BUD5, MATa2, MATa1, TSM1) near the HO site; 4 kb/hr. Credits: Audrey Gasch, Moreshwar Vaze]

Circadian Rhythms
Genes whose expression changes during the day in fruit flies
[Figure: expression over alternating day and night periods]

Cancer can be qualified from the transcriptome

Bioinformatics
• Genomics produces high-throughput, high-quality data, and bioinformatics provides the analysis and interpretation of these massive data sets.
• It is impossible to separate genomics laboratory technologies from the computational tools required for data analysis.

What types of data can we use to build a transcriptional network?
- Protein-protein interaction data
- Expression data
- ChIP data

ChIP-on-chip

Comparative Genomics
• The assumption that underlies comparative genomics is that the two genomes had a common ancestor and that each organism is a combination of the ancestor and the action of evolution.
• Evolution can be broadly thought of as the combination of two processes: mutational forces that generate random mutations in the genome sequence, and selection pressures that
1. Eliminate random mutations (negative selection),
2. Have no effect on mutations (neutral selection), or
3. Increase the frequency of mutant alleles in the population as a result of a gain in fitness (positive selection).
• The combined action of mutation and selection is represented generally by a RATE MATRIX of base-pair changes between the two observed genomes.

Comparative Genomics
[Figure: evolutionary relationship between metazoans that are sequenced, or due for sequencing, including human, mouse, rat and C. elegans. Evolutionary distances are in millions of years.]

Comparative Genomics
• Comparative genomics may be defined as the derivation of genomic information following comparison of the information content of two or more species' genome sequences.

The similarity is such that human chromosomes can be cut (schematically at least) into about 150 pieces (only about 100 are large enough to appear here), then reassembled into a reasonable approximation of the mouse genome.
http://www.ornl.gov/TechResources/Human_Genome/graphics/slides/ttmousehuman.html

Harnessing the genome to answer real problems
How do we control infectious disease?
How do we slow or stop the effects of cancer?
How can we detect and treat genetic disorders?
Only 2% of human diseases are due to single gene defects; the rest involve networks of gene expression.
Most pharmaceutical drugs act on individual proteins or sets of proteins.

Proteomics
The study of the ‘proteome’
While an organism has only one genome, it has many transcriptomes, proteomes and metabolomes.
mRNA level does not equal expressed protein level, nor does it indicate the nature of the functional protein product.

Genomic Sequence -> (Transcriptional Control) -> mRNA -> (Translational Control) -> Protein Product -> (Post-Translational Control) -> Functional Protein Product

Temporal Changes in mRNA and protein
[Figure: gene, expression and protein plotted against time t]
When you measure expression affects what you find

Does mRNA level correlate with protein level?
[Figure, panel 1: 20 liver proteins and corresponding mRNAs; mRNA (EST clones) vs. protein (2D gels); R = 0.48. Anderson & Seilhamer, Electrophoresis 1997 18:533-537]
[Figure, panel 2: Glutathione-S-transferase in 60 human cell lines (lung, ovarian, CNS, leukemia, renal, melanoma, breast); mRNA (Northern) vs. protein (Affinity-HPLC); R = 0.43. From Tew et al 1996]
Anderson & Anderson, Electrophoresis 1998 19:1853-1861

Genomics, proteomics era.
- Lots of data (lots of real data and lots of noise!). Needs validation!
- Dangers:
+ Become too descriptive and reductionist
+ Forget about the biological problem

Year by year we are becoming better equipped to accomplish the things we are striving for. But what are we actually striving for?
- Bertrand de Jouvenel, 1903-1987

Success is the ability to go from failure to failure without losing your enthusiasm.
- Winston Churchill, 1874-1965