Hutchinson-Gilford Progeria
- premature aging
- lifespan = 13.4 years
- retarded growth
- midface hypoplasia
- micrognathia
- alopecia
- low adiposity
- osteodysplasia
- premature, severe atherosclerosis
- death due to MI
De Sandre-Giovannoli, Science Express, 17 April 2003

Lamin A mutations in HGS
Exons 11 and 12 code the lamin A tail (not lamin C).
[Figure: lamin A domain map; red is coiled-coil and blue is globular domains]
1824C>T is silent at the amino acid level (G608G) but was not found in 300 controls.
1824C>T creates a cryptic donor site at 1819, giving a 50 aa deletion.

Best guess
Most diseases are probably interactions between polygenic heritable events and environmental pressures leading to somatic epigenetic changes.
Translation: diseases are complicated.

Gene by Environment Interaction
Predisposition (DNA): FAP, MSH, BRCA, LDLr
Event: hydrocarbons, radiation, estrogens, low fiber
Disease: colon CA, colon CA, breast CA, atherosclerosis

Microarrays: the big net.
Ideal disease-hunter: genomic-scale protein quantitation and sequencing.
Imperfect solution A: genomic-scale detection of mRNA level. Problem: little information on protein level.
Imperfect solution B: genome-wide SNP/haplotype. Problem: statistical limits on patient populations.
Common compromise: microarray profiling of mRNA transcripts (transcript profiling) to identify target areas. Target genes are then followed up by proteomics and SNPs.

Array flavors
DNA detection (SNP, genotyping, etc.)
• short oligonucleotides to detect mismatches
RNA detection (transcript profiling)
• Plasmid
• Inserts
• Long oligonucleotides (60 mers)
• Short oligonucleotides (20 mers)

Hybridization: basic elements
• Hybridization = Annealing - Melting
• CRUCIAL: non-covalent hydrogen bonds --> equilibrium rules, binding is statistical
• Best hybridization occurs with:
  • long sequences (no hyb when nt<4)
  • high salt concentration (hybrids melt in water)
  • low temperatures (hybrids melt with heat)
  • G and C (3 H bonds) bind better than A and T (2 H bonds)
  • self-complementarity is low (high GC is bad)

Base-pairing (the stuff of life)
[Figure: A-T and G-C base pairs. Lewin, Genes VII, page 8.]

Tm: a good thing.
Tm is a measure of the stability of double-stranded DNA under a given set of conditions. Stability, and therefore Tm, is affected by:
Strand length - the longer the strand, the higher the Tm.
Base composition - the higher the GC content, the higher the Tm.
Ionic strength - as the ionic strength increases, so does Tm. Double-helical DNA is stabilised by cations. Divalent cations (e.g. Mg2+) are more effective than monovalent cations (Na+ or K+).
Organic solvents - formamide, for instance, lowers the Tm by weakening the hydrophobic interactions.

Melting curves: Tm measured
[Figure: melting curves with Tm marked]

PCR primer design
www.oligo.net

Array Choice Factors
Expression profiling:
Sequence known? Oligo arrays: high confidence, immediate ID
Not known? cDNA arrays: clone drift/cross hyb, sequence clones

Sample selection
- isolate the purest phenotypic examples of test and control
- laser capture microdissection (LCM)
- always control for treatment and manipulation
- people are the most meaningful, but least controllable
- animals are highly controllable, but less meaningful
- cell systems (in vitro) are controlled, but meaningful?
- small amounts of RNA can be amplified
- while purifying cells is good, the processing is bad
- the quality of the results is directly proportional to the quality of the samples that are chosen
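The Tm factors listed above (strand length, GC content, ionic strength) can be illustrated with two textbook approximations. This is a minimal sketch, not a method taken from these slides: the Wallace rule and the salt-adjusted formula are common rules of thumb, the function names are hypothetical, and the 50 mM Na+ default is an assumption. Real primer or probe design would use nearest-neighbor thermodynamics.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def tm_wallace(seq: str) -> float:
    """Wallace rule: 2 degrees C per A/T plus 4 degrees C per G/C.

    A rough estimate intended for short oligos only.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def tm_salt_adjusted(seq: str, na_molar: float = 0.05) -> float:
    """Salt-adjusted approximation:

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N

    na_molar is the monovalent cation concentration in mol/L
    (the 0.05 M default is an assumption for illustration).
    """
    n = len(seq)
    percent_gc = 100.0 * gc_fraction(seq)
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * percent_gc - 675.0 / n


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer
    print("GC fraction:", gc_fraction(probe))
    print("Tm at 50 mM Na+:", round(tm_salt_adjusted(probe, 0.05), 1))
    print("Tm at 500 mM Na+:", round(tm_salt_adjusted(probe, 0.5), 1))
```

Raising the salt concentration, the GC fraction, or the length each raises the estimate, which matches the qualitative trends in the Tm list.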
Laser Capture Microdissection
The importance of purity
[Figure: human colon cancer; blue are normal cells, red are tumor cells]

Assessing sample quality
Amount: > 5 µg total RNA or 500 ng of poly A+
Basic: O.D. 260/280 ratio >2.1; nucleic acids absorb at 260 nm, protein at 280 nm, thus increasing impurity reduces the ratio
Better: agarose gel electrophoresis, EtBr stained; if total RNA, 28S = 2 x 18S ribosomal (Lab-on-chip); or Q-PCR of a low and a high gene, against a standard
Best: test chip

GeneChip® Probe Arrays
[Figure: GeneChip probe array (1.28 cm) and a hybridized probe cell (11 µm) with single-stranded, labeled RNA target bound to oligonucleotide probe; millions of copies of a specific oligonucleotide probe per cell; >1 million probes; image of hybridized probe array]
George Washington Genomics Core Facility

Synthesis of Ordered Oligonucleotide Arrays
[Figure: photolithographic synthesis on the substrate: light (deprotection) through a mask, coupling of T; light (deprotection) through the next mask, coupling of C; REPEAT]

GeneChip® Expression Array Design
[Figure: gene sequence 5´ to 3´ covered by multiple oligo probes; probes designed to be Perfect Match and probes designed to be Mismatch]

Procedures for Target Preparation
Cells --> Poly (A)+ RNA --> cDNA --> IVT (Biotin-UTP, Biotin-CTP) --> labeled transcript --> Fragment (heat, Mg2+) --> labeled fragments --> Hybridize (16 hours) --> Wash & Stain --> Scan
Streptavidin-Phycoerythrin (SAPE): fluorescent stain, laser stimulated

Analysis of expression level from probe sets
A single, contiguous gene set for the rat β-actin gene.
Each pixel is quantitated and integrated for each oligo feature (range 0-25,000).
Perfect Match (PM); Mismatch (MM) control
PM - MM = difference score
All significant difference scores are averaged to create the "average difference" = expression level of the gene.

Affymetrix® Instrument System
Platform for GeneChip® Probe Arrays
• Integrated
• Exportable
• Easy to use
• Versatile

GeneChip analysis of human atherosclerosis
- Dissect normal media from atherosclerotic lesion
- Prepare highly purified RNA O.D.
260/280 = 2.0
- Reverse transcribe w/poly dT + T7 = cDNA
- Transcribe with T7 + biotin-UTP = cRNA
- Purify probe/hybridize to chip
- Wash and detect with avidin/PE + ab amplification
- Read fluorescent label and deconvolve genes

Basic Bioinformatics: Scatterplot
[Figure: log-log scatterplot of E145 P22-N (raw) against sample E145 P4-N (raw), both axes 0.01 to 10000]

Transcript profiling of aged rat aorta.
Affymetrix GeneChip analysis of 10 aortas @ 20 mo. vs. 3 mo.
mRNAs Decreased in the Aged Aorta

Experiment 1      Experiment 2
Signal  Change    Signal  Change    Description
12884   -3.6      14901   -4.9      Egr-1 (3 probe sets)
9440    -5.7      25730   -3.5      collagen alpha1 type I (3 probe sets)
342     -3.3      330     -3.9      flavin-containing monooxygenase 1 (FMO-1)
7540    -2.8      5989    -3.3      cyclooxygenase isoform COX-2
1781    -3.4      2053    -3.5      leucine zipper protein mRNA
3090    -8.2      2291    -3.1      heat shock protein 70 (3 probe sets)
1718    -3.1      3836    -3.0      DNA polymerase alpha
6897    -2.3      4514    -2.7      phosphoenolpyruvate carboxykinase (GTP)
3552    -2.9      3284    -2.2      retinol-binding protein (RBP)
9063    -2.1      6580    -2.0      C4 complement protein
1372    -6.5      6282    -1.9      DnaJ-like protein (RDJ1)
8604    -2.2      10106   -2.2      plasminogen activator inhibitor-1 (PAI-1)
3845    -4.5      13019   -1.7      RCO4-1 gene for cytochrome c oxidase subunit IV
1708    -11.1     11044   -1.6      lipoprotein lipase
6593    -2.1      12816   -1.5      RTK40 homolog
3139    -2.1      7395    -1.5      ribonucleoprotein F
AND 18 ESTs

FAQs: How many replicates?
[Figure: Number of Genes Called Differentially Expressed as a Function of Number of Replicates; y-axis: number of genes greater than 2-fold (0 to 4500); x-axis: number of replicates (1 to 7)]

Simple fold changes
• Crude, insensitive, but effective
Criteria: Present, 1.5-fold up/down
Hierarchical clustering
Statistical testing and ontology

Gene Abbrev.  Fold  Lists  Description

Apoptosis
BAD         1.4   *     BCL2-antagonist of cell death
BCL2L1      6.6   *     BCL2-like 1 (BCL-XL)
CCND1       1.9   ***   cyclin D1, PRAD1
MDM2        2.2   *     Mdm2, p53 binding protein
PRSS25                  serine protease 25-Omi/HtrA2
TNFRSF6                 TNF receptor superfamily, 6, fas, CD95
VDAC2                   voltage-dependent anion channel 2
(Fold/Lists entries for the last three rows, as extracted and not aligned to genes: ? ? 1.2 *** 1.8 ***)

Growth factors/regulators
FGF5        2.3   **    fibroblast growth factor 5
HDGF        -1.2  ***   hepatoma-derived growth factor (high-mobility group protein 1-like)
IGFBP3      -1.6  **    insulin-like growth factor binding protein 3 (2 sets)
IGFBP4      -1.5  *     insulin-like growth factor binding protein 4
LRP1        -1.7  **    LRP1, TGF-β Type V receptor
LTBP2       -1.3  ***   latent transforming growth factor beta binding protein 2
SMURF2      1.8   ***   SMAD-specific ubiquitin ligase
VEGFB       -1.5  ***   vascular endothelial growth factor B

Signalling
FKBP9                   FK506 binding protein 9, 63 kDa
JAK1                    Janus associated kinase 1
MAP3K12                 mitogen-activated protein kinase kinase kinase 12
MAP3K4                  mitogen-activated protein kinase kinase kinase 4
PPIH                    peptidyl prolyl isomerase H (cyclophilin H)
STAT1                   signal transducer and transactivator 1
STAT3                   signal transducer and transactivator 3
STAT6                   signal transducer and transactivator 6
(Eight Fold/Lists pairs appear unattached in the source and probably belong to these rows, in order: -2.4 *, -1.2 *, -1.7 ***, -1.4 **, 3.4 *, 1.4 **, -1.4 *, -1.3 ***)

Cell Cycle
CCND1       1.9   ***   cyclin D1, PRAD1 (3 sets)
CCNI        -1.6  ***   cyclin I
CDK11       -1.6  ***   cyclin-dependent kinase (CDC2-like) 11
CUL1        -1.3  **    cullin 1-cyclin D1 degrading
JUN         1.4   ***   v-Jun homolog
MDM2        2.2   *     Mdm2, p53 binding protein
PDGFRB      -2.1  **    platelet-derived growth factor receptor, beta

Chromatin remodeling
CBFA2T1     -1.6  *     core-binding factor, cyclin D-related
CHD3        -1.5  ***   chromodomain helicase DNA binding protein 3
HDAC4       -1.5  *     histone deacetylase 4
HIST1H2BN   1.8   **    histone 1, both H2bn and H2bd
HIST1H2AL   1.7   **    histone 1, H2al
MYST1       -1.5  ***   MYST histone acetyltransferase 1
POLB        1.8   ***   polymerase (DNA directed), beta

Cholesterol/Fatty acid/Membranes
ATP8B1      2.3   ***   potential phospholipid-transporting ATPase
FADS1       -1.4  **    fatty acid desaturase 1
LRP1        -1.7  **    low density lipoprotein-related protein 1
PLTP        -1.4  ***   phospholipid transfer protein
SRD5A1      1.9   **    steroid-5-alpha-reductase, alpha 1

Extracellular Matrix
COL1A2      -1.3  ***   collagen, type I, alpha 2 (2 sets)
COL6A1      -1.6  ***   collagen, type VI, alpha 1
FBN1        -1.3  ***   fibrillin 1 (Marfan syndrome)
FN1         -1.3  **    fibronectin 1 (2 sets)
LAMB2       -1.4  ***   laminin, beta 2 (laminin S)
LAMA2       -1.6  ***   laminin, alpha 2 (merosin)
RECK        ?     ***   reversion-inducing cys-rich w/Kazal (MMP9 regulator)
TIMP1       -1.5  **    tissue inhibitor of metalloproteinase 1 (2 sets)

Mitochondrial/Metabolic
AHCYL1      -1.5  ***   S-adenosylhomocysteine hydrolase-like 1 (3 sets)
ATP5J       1.2   ***   ATP synthase, H+ transporting, mitochondrial F0 complex, subunit F6
ETFA        1.3   ***   electron-transfer-flavoprotein, alpha polypeptide (glutaric aciduria II)
HCCS        1.4   ***   holocytochrome c synthase (cytochrome c heme-lyase)
TOMM34      1.4   ***   translocase of outer mitochondrial membrane 34

Stress/oxidant/antioxidant
DNAJA2      1.3   **    DnaJ (Hsp40) homolog, subfamily A, member 2
DNAJB4      2.7   ***   DnaJ (Hsp40) homolog B4 (2 sets), HLJ1
PSMF1       1.4   *     proteasome (prosome, macropain) inhibitor subunit 1 (PI31)
PSMB6       1.4   *     proteasome (prosome, macropain) subunit, beta type, 6
PTMA        -1.5  **    prothymosin, alpha (gene sequence 28)
SOD3        -1.6  **    superoxide dismutase 3, extracellular

Transcription factors
BLZF1       1.8   ***   basic leucine zipper nuclear factor 1 (JEM-1)
CEBPD       -1.4  **    CCAAT/enhancer binding protein (C/EBP), delta
JUN         1.4   ***   v-Jun homolog
MSC         -1.5  ***   musculin (lamin C homolog, repressor)
ZNF24       -1.3  ***   zinc finger protein 24 (KOX 17)
ZNF42       1.4   *     zinc finger protein 42 (myeloid-specific retinoic acid-responsive)
ZNF337      -1.3  *     zinc finger protein 337

Pathways of genetic information

Expression of Egr-1 mRNA in human lesions.
[Figure: Egr-1 and RhoA mRNA by patient # (E213, E217), minutes (5, 65) and tissue (L, M), with an H2O lane]

Egr-1 mRNA and protein in lesions vs normal cells.
[Figure: A) Egr-1 mRNA, media vs. lesion, for patients E197 and E196 (y-axis 0 to 20); B) Western blot of Egr-1 and actin, media (M) and lesion (L), for patients E197, E221, E240 and E243]

Expression screening by GeneChip®
• each oligo sequence (20 mer) is synthesized as an 11 µm square (feature)
• each feature contains > 1 million copies of the oligo
• scanner resolution is about 2 µm (pixel)
• each gene is quantitated by 11 oligos and compared to an equal number of mismatched controls
• 44,000 genes are evaluated with 11 matching oligos and 11 mismatched oligos = about 1 x 10^6 features/chip
• features are photolithographically synthesized onto a 2 x 2 cm glass substrate

GeneChip® Array Advantages: Specificity
[Figure: oligo arrays vs. cDNA arrays for a gene "on" and a gene "off"; size labels ~150 µm and 24 µm; detection pattern vs. single spot]

Limitations to all microarrays.
- dynamic range of gene expression: very difficult to simultaneously detect low and high abundance genes accurately
- each gene has multiple splice variants; 2 splice variants may have opposite effects (e.g. trk); arrays can be designed for splicing, but complexity increases 5X
- translational efficiency is a regulated process: mRNA level does not necessarily correlate with protein level
- proteins are modified post-translationally: glycosylation, phosphorylation, etc.
- pathogens might have little 'genomic' effect

CardioChip in silico workup
- Lipoprotein genes/variants
- Atherosclerosis markers
- Restenosis markers
- Coagulation factors
- Stress markers
- Inflammatory markers
- Infectious agents
- Heart failure predictors
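
The probe-set arithmetic described in this deck (PM - MM difference scores averaged into an "average difference", signed fold changes between conditions, and the feature count per chip) can be sketched as follows. This is an illustration under stated assumptions, not the Affymetrix software algorithm: the function names are hypothetical, the sketch averages every probe pair instead of only the "significant" difference scores, and the signed fold-change convention (negative for decreases) is inferred from the tables above.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of the PM - MM difference scores across a probe set.

    Assumption: all probe pairs are averaged; the slides average only
    the 'significant' difference scores, which is not modeled here.
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and the same length")
    return mean(p - m for p, m in zip(pm, mm))


def fold_change(test, control):
    """Signed fold change: +2.0 means doubled, -2.0 means halved."""
    if test <= 0 or control <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test / control
    return ratio if ratio >= 1.0 else -1.0 / ratio


def features_per_chip(genes, probes_per_gene):
    """Each probe has a perfect-match and a mismatch feature."""
    return genes * probes_per_gene * 2


if __name__ == "__main__":
    # Hypothetical intensities for a three-probe set.
    pm = [100, 200, 300]
    mm = [50, 100, 150]
    print("average difference:", average_difference(pm, mm))
    print("fold change:", fold_change(50.0, 100.0))
    print("features:", features_per_chip(44000, 11))
```

A criterion such as "1.5-fold up/down" then amounts to keeping genes whose signed fold change has an absolute value of at least 1.5.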