Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 22— Genomics II • Functional Genomics—studying genes in groups, with respect to the cell, tissue, signaling pathway or organism • Proteomics—to understand the interplay among many different proteins (cellular processes and organismal level [traits]) • Bioinformatics—using computers, math, and statistics to understand the genome and proteome information (record, store, analyze, predict) Chapter 22— Genomics II • Functional Genomics—studying genes in groups, with respect to the cell, tissue, signaling pathway or organism • Proteomics—to understand the interplay among many different proteins (cellular processes and organismal level [traits]) • Bioinformatics—using computers, math, and statistics to understand the genome and proteome information (record, store, analyze, predict) A A mixture of 3 different types of F mRNA A Microarrays for studying gene expression or resequencing D D F A portion of a DNA microarray A A D B F Add reverse transcriptase, poly-dT primers that anneal to the mRNAs, C and fluorescent nucleotides. Note: Only 1 complementary cDNA strand is made. A Fluorescently labeled cDNA that is complementaryF to the mRNA A E D F D D F A D F Hybridize cDNAs to the microarray. Figure 22.1 A B C D E F View with a laser scanner. Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Modern day “Southerns” and “Northerns”—microarray analysis Two distinct forms of large B-cell lymphoma are shown by the expression pattern: GC B-like DLBCL (orange) and Activated B-like DLBCL (blue) significantly better overall survival ASH ALIZADEH et al. 2000 Nature 403, 503-511 (3 February 2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling Observation/problem • Diffuse large B-cell lymphoma (DLBCL) = most common subtype of non-Hodgkin's lymphoma is clinically heterogeneous: 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder succumb to the disease Hypothesis • variability in natural history reflects unrecognized molecular heterogeneity in the tumours. Experiment • DNA microarrays used for a systematic characterization of gene expression in B-cell malignancies. Results • Diversity in gene expression among the tumours of DLBCL patients (reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour). • Identified two molecularly distinct forms of DLBCL which had gene expression patterns indicative of different stages of B-cell differentiation. – – One type expressed genes characteristic of germinal centre B cells ('germinal centre B-like DLBCL'); the second type expressed genes normally induced during in vitro activation of peripheral blood B cells ('activated B-like DLBCL'). • Patients with germinal centre B-like DLBCL had a significantly better overall survival than those with activated B-like DLBCL. Conclusion • Molecular classification of tumours on the basis of gene expression can thus identify previously undetected and clinically significant subtypes of cancer. ASH ALIZADEH et al. 2000 Nature 403, 503-511 (3 February 2000) Protein of interest Which DNA sequences bind to my protein of interest? Add formaldehyde to crosslink protein to DNA. Lyse the cells. Sonicate DNA into small pieces. Add antibodies that recognize the protein of interest. The antibodies are bound to heavy beads. After the antibodies bind to the protein of interest, the sample is subjected to centrifugation. Chromatin Immunoprecipitation Assay (ChIP) Protein of interest Antibody against protein of interest Bead Pellet Collect complexes in pellet. Add chemical that breaks the crosslinks to remove the protein. Known Candidates: Conduct PCR using primers to a known DNA region. or Unknown Candidates: Ligate DNA linkers to the ends of the DNA. Linker If PCR amplifies the DNA, the protein was bound to the DNA region recognized by the primers. Figure 22.2 Conduct PCR using primers that are complementary to the linkers. Incorporate fluorescently labeled nucleotides during PCR. Denature DNA and hybridize to a microarray. See Figure 22.1 Chapter 22—Genomics II • Functional Genomics—studying genes in groups, with respect to the cell, tissue, signaling pathway or organism • Proteomics—to understand the interplay among many different proteins (cellular processes and organismal level [traits]) • Bioinformatics—using computers, math, and statistics to understand the genome and proteome information (record, store, analyze, predict) Why is the proteome so large? Alternative splicing pre-mRNA Exon 1 Exon 2 Exon 3 Exon 4 Alternative splicing Translation Exon 2 Exon 1 Exon 5 Exon 4 Exon 6 or Exon 3 Exon 1 Exon 5 Exon 4 Exon 6 or Exon 2 Exon 1 Exon 6 Exon 4 (a) Alternative splicing Exon 5 Exon 6 Irreversible modifications Proteolytic processing SH SH Disulfide bond formation S S Heme group Attachment of prosthetic groups, sugars, or lipids Sugar Phospholipid Reversible modifications Phosphorylation Acetylation Methylation PO42- Phosphate group O C CH3 Acetyl group CH3 Methyl group (b) Posttranslational covalent modification Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Why is the proteome so large? Post translational modification Techniques to study the proteome: 2D Gel analysis Lyse a sample of cells and load the resulting mixture of proteins onto an isoelectric focusing gel. pH 4.0 Proteins migrate until they reach the pH where their net charge is 0. At this point, a single band could contain 2 or more different proteins. pH 10.0 SDS-polyacrylamide gel pH 4.0 Lay the tube gel onto an SDS-polyacrylamide gel and separate proteins according to their molecular mass. pH 10.0 200 kDa 10 kDa Brooker, Fig 22.4 N Purified protein Techniques to study the proteome: Mass spectrometry C Digest protein into small fragments using a protease. N C Determine the mass of these fragments with a first spectrometer. Abundance 1652 daltons 0 Brooker, Fig 22.5 Mass/charge 4000 Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Abundance 1652 daltons 0 4000 Mass/charge Analyze this fragment with a second spectrometer. The peptide is fragmented from one end. 1201 Abundance 1008 900 1652 1428 1315 1114 Mass/charge –Asn–Ser–Asn–Leu–His–Ser– Tandem mass spectrometry to sequence peptides 1565 1800 Brooker, Fig 22.5, cont. Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Chapter 22—Genomics II • Functional Genomics—studying genes in groups, with respect to the cell, tissue, signaling pathway or organism • Proteomics—to understand the interplay among many different proteins (cellular processes and organismal level [traits]) • Bioinformatics—using computers, math, and statistics to understand the genome and proteome information (record, store, analyze, predict) Example of DNA Sequence as stored in Genetic Database Numbers represent the base number in the sequence file Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display A bioinformatics program may ask: • Does the sequence contain a gene? • Which nt’s are the functional sites (e.g. promoters, exons, introns, termination sequence)? • Does the sequence encode a protein? (have an open reading frame [ORF] • What is the secondary structure of its RNA or associated amino acid sequence? • Is the sequence homologous to any other known sequences? • What is the evolutionary relationship between two or more sequences? 5′ end 3′ end A secondary structural model for E. coli 16S rRNA Brooker, Fig 22.7 Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Sequence matches between E. coli and K. pneumoniae • DNA sequences of the lacY gene – ~ 78% of the bases are a perfect match • In this case, the two sequences are similar because the genes are homologous to each other – They have been derived from the same ancestral gene – Refer to Figure 22.6 Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Example output from a computer alignment program (and comparison to real world data) Human Pa Ca Mouse Lu Ca Human LHON, Human Thy Ca Interesting cancer mutation pattern in mitochondrial ND6 protein Mouse Lu Ca Sequence homology used to “hang” human cancer mutations on the bovine crystal structure of Cytochrome B Chen and Uberto 2014 Federal Genetic Databases National Center for Biotechnology Information www.ncbi.nlm.nih.gov/ U.S. government-funded national resource for molecular biology information. BLAST programs identify sequences with homology or similarity Table 22.5 Origin of orthologous genes Ancestral lacY gene Ancestral organism Evolutionary separation of 2 (or more) distinct species E. coli K. pneumoniae lacY gene lacY gene Accumulation of random mutations in the 2 genes lacY gene Figure 22.6 Mutation lacY gene Mutation Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Mb ζ ψζ ψα2 ψα1 α2 α1 f ε g G g A ψβ δ β Millions of years ago 0 200 a chains 400 b chains Myoglobin Hemoglobins 600 800 Duplication Better at binding and storing oxygen in muscle cells Ancestral globin Better at binding and transporting oxygen via red blood cells 1,000 Brooker, Fig 8.7 Copyright ©The McGraw-Hill Companies, Inc. Permission required for reproduction or display Orthologs, paralogs, homologs • • • • Like Brooker fig 8-7 All the globin genes have homology to each other a-like genes are paralogs of each other; b-like genes are paralogs of each other; a-1 in mice and a-1 in humans are orthologs From Thompson and Thompson, Genetics in Medicine, 6th ed.