Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection

Christine Lee
Dr. Cecilie Boysen, Ph.D.
Paracel, Applied High Performance Computing
Southern California Bioinformatics Institute, Summer 2004
Funded by the National Science Foundation and National Institutes of Health

Outline
• History of RNAi
• Small interfering RNA (siRNA) Mechanism
• siRNA design and selection
• Blast vs Smith-Waterman
• Project Objectives and Results
• Conclusions & Future Work

History of RNAi
• Discovered in 1998 by Andrew Fire, Craig Mello, and colleagues
• RNAi – silencing of gene expression by dsRNA molecules
• Organism used: Caenorhabditis elegans

Short interfering RNA (siRNA) Mechanism
[Figure: http://www.bioteach.ubc.ca/MolecularBiology/AntisenseRNA/siRNA.gif]

siRNA Selection & Design: Avoiding Cross-Hybridization
• Important to guard against strong cross-hybridization to other genes
• Cross-hybridization with non-specific targets results in wasted lab time and materials, as well as inaccurate conclusions
• Preliminary sequence analysis allows verification of candidate oligos to protect against cross-hybridization

siRNA Selection & Design
Hybridization concerns:
• siRNA mismatch tolerance
• Insertion/deletion vs mismatch

Query:    1 GAACTTATCTTCCTTCTTC 19
            |||||||||||||||||||
Sbjct: 3783 GAACTTATCTTCCTTCTTC 3801

Query:   19 GAAGAAGGAAGATAAGTTC 1
            ||||||||| || ||||||
Sbjct:  778 GAAGAAGGATGAGAAGTTC 796

Blast vs Smith-Waterman
• Blast may miss relevant alignments
• Using word size seven, nearly 6% of all possible alignments with three mismatches between 21-mers will be missed
• Increasing the word size or allowing more mismatches raises the rate of missed hits
• Smith-Waterman is said to have higher sensitivity, so why not use it?

Project Objectives
• Test set: 10,000 19-mer oligos/siRNAs
• Test database: RefSeq
• Comparison study between Blast and Smith-Waterman
• 15/19 -> Percent Identity threshold set to 78%
• … e-value adjustment from default of 10.
• E-value 500 used

A Closer Look at Smith-Waterman & Blast Parameters

Algorithm       Score  Identities   Param              Match  Mismatch  GO/GE       Gap Total
Smith-Waterman  29     17/19 (89%)  default            +2     -2                    -3
Blast           12     15/16 (93%)  W7 e 500 Default   +1     -3        G -5 E -2   -7
Blast           29     17/19 (89%)  W7 e 500 G1 q2r2   +2     -2        G -1 E -2   -3

Smith-Waterman (default):
Query:   19 TCACCGTAGATGCTCTTTC 1
            || |||| |||||||||||
Sbjct: 2376 TC-CCGTGGATGCTCTTTC 2393

Blast (W7 e 500 Default):
Query:    1 gaaagagcatctacgg 16
            ||||||||||| ||||
Sbjct: 2393 gaaagagcatccacgg 2378

Blast (W7 e 500 G1 q2r2):
Query:    1 gaaagagcatctacggtga 19
            ||||||||||| |||| ||
Sbjct: 2393 gaaagagcatccacgg-ga 2376

Smith-Waterman vs. Blast Results

Original Query Sequence: CTTTTTAACATCGACGGTC

>gi|4503928|ref|NM_002051.1| Homo sapiens GATA binding protein 3 (GATA3), mRNA
Length = 2365
Score = 31.7 bits (38), Expect = 0.041
Identities = 19/19 (100%)
Strand = Plus / Plus

Query:    1 ctttttaacatcgacggtc 19
            |||||||||||||||||||
Sbjct:  299 ctttttaacatcgacggtc 317

SWN hit-4 bin
Blast hit-1 bin
W7 G1 r2 q2 e500 E2
Percent Identity: 89%, GATA3 gene

Smith-Waterman vs. Blast Results

Original Query Sequence: AAAATACTGAGAGAGGGAG

>gi|4503928|ref|NM_002051.1| Homo sapiens GATA binding protein 3 (GATA3), mRNA
Length = 2365
Score = 31.7 bits (38), Expect = 0.041
Identities = 19/19 (100%)
Strand = Plus / Plus

Query:    1 ctttttaacatcgacggtc 19
            |||||||||||||||||||
Sbjct:  299 ctttttaacatcgacggtc 317

>gi|4557424|ref|NM_001248.1| Homo sapiens ectonucleoside triphosphate diphosphohydrolase 3 (ENTPD3), mRNA
Length = 2797
Score = 24.6 bits (29), Expect = 5.7
Identities = 17/19 (89%), Gaps = 1/19 (5%)
Strand = Plus / Minus

Query:    1 aa-aatactgagagaggga 18
            || ||||||||| ||||||
Sbjct: 2044 aagaatactgagggaggga 2026

SWN hit-1 bin
Blast hit-4 bin

Conclusions and Future Work
• Produce more conclusive statistics on how often Smith-Waterman gives more accurate results
• No consensus exists as to which hits are considered dangerous or significant for cross-hybridization
• Creation of a position-specific matrix
  * Mutation tolerance on the 5’ end
  * Low tolerance on the 3’ end
  * GU wobble

References
• Novina, C and Sharp, P. The RNAi revolution. Nature. 2004 Jul 8;430(6996):161-4.
• Dorsett, Y and Tuschl, T. siRNAs: applications in functional genomics and potential as therapeutics. Nat Rev Drug Discov. 2004 Apr;3(4):318-29.
• Snove, O Jr. and Holen, T. Many commonly used siRNAs risk off-target activity. Biochem Biophys Res Commun. 2004 Jun 18;319(1):256-63.
• Paroo, Z and Corey, DR. Challenges for RNAi in vivo. Trends Biotechnol. 2004 Aug;22(8):390-4.
• Amarzguioui, M. et al. Tolerance for mutations and chemical modifications in siRNA. Nucleic Acids Res. 2003;31(2):589-95.

Acknowledgements
• Dr. Cecilie Boysen (advisor)
• Paracel Scientific Staff
  * David Meyer, Paracel Software Engineer
  * Stephanie Pao, Paracel Technical Sales Engineer
  * Frances Tong, Paracel Intern
  * William White, Paracel Technical Writer
• Southern California Bioinformatics Institute 2004 Faculty and Staff: Dr. Jamil Momand, Dr. Nancy Warter-Perez, Dr. Sandra Sharp, Dr. Wendie Johnston, and Jackie Leung
• Fellow interns
• NIH & NSF

Short interfering RNA Mechanism
Post-transcriptional gene silencing.
Novina, C and Sharp, P. The RNAi revolution. Nature Vol 430. July 8, 2004.
Dorsett, Y and Tuschl, T. siRNAs: applications in functional genomics and potential as therapeutics. Nat Rev Drug Discov. 2004 Apr;3(4):318-29.
• Reverse genetic approaches – expensive and time consuming
• siRNA may be chemically synthesized or expressed from DNA vectors

MicroRNAs
Translational silencing.
Picture from: Novina, C and Sharp, P. The RNAi revolution. Nature Vol 430. July 8, 2004.
• Short RNAs, 19-25 nucleotides
• Abundant, single-stranded RNAs encoded in genomes of most multicellular organisms: from a few thousand to 40,000 molecules per cell
• Some evolutionarily conserved and developmentally regulated

Differences between siRNA and miRNA
siRNA
• Promote the cleavage or degradation of mRNAs
• Sense strand has “exactly the same sequence as the target strand”
• Target genes or genetic elements from which they originated
miRNA
• Regulate the expression of mRNAs; transcription is not impeded and mRNAs not destroyed
• Imperfect base-pairing between mRNA targets and miRNA
• Regulate separate genes

Interchangeability of siRNAs and miRNAs
• miRNA may act like siRNA
  * perfect or near-perfect complementarity to cellular mRNAs
• Could siRNA also work like miRNA?
  * synthetic siRNA partially complementary to ‘reporter’ gene inhibited its expression
• Distinction between single site with almost exact complementarity and numerous partially complementary binding sites

Laboratory and Clinical Applications of siRNA
• In C. elegans, simple experiment: inject dsRNA, soak in dsRNA solution, or feed with bacteria expressing dsRNA
• In worms, screening for obesity and ageing
• In fruit flies, purified long dsRNA used to identify roles of genes in cholesterol metabolism and heart formation
• Therapeutic potential of siRNAs for humans
Paroo, Z and Corey, DR. Challenges for RNAi in vivo. Trends Biotechnol. 2004 Aug;22(8):390-4.

File                        Type   Bases  Sequences  # of Oligos
BRCA1                       fasta  3243   1          3255
GATA3                       fasta  3070   1          3070
HLA-molecule                fasta  2918   1          2918
Insulin-like growth factor  fasta  4989   1          4971
Interleukin receptor        fasta  1451   1          1433
NFKB1                       fasta  4104   1          4186
Serine kinase               fasta  3506   1          3488
Serotonin receptor          fasta  1927   1          1909
TNF2                        fasta  1669   1          1651
Vinculin                    fasta  5647   1          5629
Total                              32554  10

Blast vs Smith-Waterman Speed Test Results
[Bar chart: time in minutes for SWN and Blast under the Default and e500 settings; bar labels 346.08, 205.69, 46.7, and 11.35]
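
Worked example: Blast seed miss rate

The Blast vs Smith-Waterman slide quotes a miss rate of nearly 6% for 21-mers with three mismatches at word size seven. The sketch below reproduces that figure under simplifying assumptions that are not stated in the slides: the alignment is ungapped, every placement of the mismatches is equally likely, and a hit is found only if at least one run of `word_size` consecutive matching bases exists. The function name `missed_fraction` is made up for this illustration.

```python
from itertools import combinations


def missed_fraction(length: int, mismatches: int, word_size: int) -> float:
    """Fraction of mismatch placements that leave no exact seed word.

    Assumes an ungapped alignment of `length` columns and treats every
    placement of `mismatches` mismatching columns as equally likely.
    """
    missed = 0
    total = 0
    for positions in combinations(range(length), mismatches):
        total += 1
        # Measure each run of matches between consecutive mismatches,
        # including the runs before the first and after the last one.
        previous = -1
        longest_run = 0
        for p in positions + (length,):
            longest_run = max(longest_run, p - previous - 1)
            previous = p
        if longest_run < word_size:
            missed += 1
    return missed / total


# 21-mer, three mismatches, word size 7: 84 of 1330 placements are missed.
print(f"{missed_fraction(21, 3, 7):.3f}")  # 0.063
```

Calling the same function with a larger `word_size` or more `mismatches` shows how quickly the missed fraction grows, which is the trend the slide describes.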
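
Worked example: checking the scores in the parameter table

The scores on the "A Closer Look at Smith-Waterman & Blast Parameters" slide can be recomputed from the alignments themselves. This is a minimal sketch, assuming the Blast-style affine gap convention in which a gap of k columns costs G + k*E; that convention is consistent with the Gap Total column (-7 for G 5, E 2 and -3 for G 1, E 2). The helper `score_alignment` is hypothetical and is not part of Blast or of any Smith-Waterman implementation used in the project.

```python
def score_alignment(query, subject, match, mismatch, gap_open, gap_extend):
    """Score an already-aligned pair of strings that use '-' for gap columns.

    `mismatch` is passed as a negative number; `gap_open` and `gap_extend`
    are passed as positive costs. Returns (score, identities, columns).
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    score = 0
    identities = 0
    in_gap = False
    for q, s in zip(query.upper(), subject.upper()):
        if q == "-" or s == "-":
            # Charge the opening cost once per run of gap columns,
            # then the extension cost for every gap column.
            if not in_gap:
                score -= gap_open
            score -= gap_extend
            in_gap = True
        else:
            in_gap = False
            if q == s:
                score += match
                identities += 1
            else:
                score += mismatch
    return score, identities, len(query)


# Blast with W7 e 500 G1 q2r2: match +2, mismatch -2, G 1, E 2
tuned = score_alignment("gaaagagcatctacggtga", "gaaagagcatccacgg-ga", 2, -2, 1, 2)
# Blast defaults: match +1, mismatch -3, G 5, E 2
default = score_alignment("gaaagagcatctacgg", "gaaagagcatccacgg", 1, -3, 5, 2)
print(tuned)    # (29, 17, 19)
print(default)  # (12, 15, 16)
```

With the default penalties, extending the 16-column alignment across the gap would cost 7 and gain only 2 from the two extra matches, which is one way to read why the default Blast hit stops at 15/16 while the tuned run reports 17/19. The percentages on the slides match integer truncation of identities over columns (17/19 gives 89, 15/19 gives 78).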
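
Worked example: tiling a sequence into 19-mer candidates

The slides do not say how candidate oligos were generated from each FASTA file. One plausible reading, offered here only as an assumption, is that every 19-base window was taken, so a sequence of N bases yields N - 18 oligos; this agrees with several rows of the file table, such as the 1451-base entry with 1433 oligos. The function name `tile_oligos` is illustrative.

```python
def tile_oligos(sequence: str, length: int = 19) -> list[str]:
    """Return every window of `length` bases, stepping one base at a time."""
    seq = sequence.upper()
    # A sequence shorter than `length` produces an empty list.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Toy input: a 25-base sequence gives 25 - 19 + 1 = 7 candidate oligos.
candidates = tile_oligos("CTTTTTAACATCGACGGTCAAAATA")
print(len(candidates))  # 7
print(candidates[0])    # CTTTTTAACATCGACGGTC
```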