Classification of DNA sequences using Bloom Filters
CS 6293 AT: Current Bioinformatics
HW2 Papers
1. BLAT--The BLAST-Like Alignment Tool
2. Classification of DNA sequences using Bloom Filters
Course Instructor: Dr. Jianhua Ruan
Presenters: Husnu Narman, Nihat Altiparmak

BLAT--The BLAST-Like Alignment Tool
W. James Kent (2002), UCSC
Cited by 2229 (Google Scholar)

Brief Information About BLAST
• BLAST: Basic Local Alignment Search Tool
• Finds a gene in different kinds of databases
• Steps shown in the slide's workflow diagram:
  – Divide the query into small words and compare
  – High-Scoring Segment Pairs (HSPs)
  – Evaluate, handle exceptions, and report
  – Scan for exact matches in the HSPs
  – List all of the HSPs in the database
  – Extend exact matches to HSPs

BLAT
• BLAT: The BLAST-Like Alignment Tool
• Finds a gene in different kinds of databases
• Why a new search tool?

Differences between BLAST and BLAT
BLAST
• Indexes the query
• Triggers an extension when one or two hits occur
• List of exons sorted by size
BLAT
• Indexes the database
• Triggers extensions on any number of perfect or near-perfect hits
• Looks up the location of a sequence in the genome or determines the exon structure of an mRNA

Classification of DNA sequences using Bloom Filters
Stranneheim et al. (2010), Stockholm, Sweden

Classification of DNA sequences using Bloom Filters
• New-generation sequencing technologies
  – Complex datasets
  – Need for new, efficient, specialized sequence analysis algorithms
• Often only novel sequences are required; unnecessary sequences (those belonging to a known genome) need to be removed
• A new algorithm (FACS) classifies sequences as belonging or not belonging to a reference sequence
• Source code available at:
  – http://facs.biotech.kth.se

Bloom Filter
• A memory-efficient data structure for testing whether an element is part of a reference set
• An m-bit vector with k hash functions
• Never returns a false negative; may, however, return a false positive
• Optimal number of hash functions, where n is the number of elements in the reference set: $k = \frac{m}{n} \ln 2$

Example Bloom Filter
[Figure: example Bloom filter with $m = 18$ bits and $k = 3$ hash functions, showing the elements x, y, and z and a query element w.]

Method
• A Bloom filter is created from the reference sequence with the desired k-mer size and false positive rate.
• The query sequences are then classified using the Bloom filter.

Evaluation
• Experimental metagenome dataset (Allander et al. 2005) containing 177184 reads
• Analysis using the human genome as a reference
• FACS, BLAT and SSAHA2 compared (speed-up labels shown on the slide: 21x, 31x)

Evaluation
[Figure: bar chart of "False Positive Rate" and "False Positive Rate (Missed)", in percent (axis from 0 to 0.06), for FACS, BLAT and SSAHA2.]

Any Questions?
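
Illustrative sketch (not from the slides or the FACS source code): the Bloom filter and Method slides can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions: the sizing formula $m = -n \ln p / (\ln 2)^2$ is the standard one that accompanies $k = \frac{m}{n} \ln 2$, the k hash positions are derived from a single SHA-256 digest by double hashing, and the names `BloomFilter`, `build_filter`, `classify`, and the `min_fraction` cutoff are hypothetical choices made here for illustration. FACS itself may differ in all of these details.

```python
import hashlib
import math


class BloomFilter:
    """m-bit vector with k hash functions (illustrative, not the FACS code)."""

    def __init__(self, n_items, fp_rate):
        # Standard sizing: m = -n ln(p) / (ln 2)^2, then k = (m / n) ln 2.
        self.m = max(1, math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Assumption: derive k bit positions from one SHA-256 digest
        # using double hashing, (h1 + i * h2) mod m.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        # True if every one of the k positions is set (false positives possible,
        # false negatives impossible).
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def kmers(seq, size):
    """Return all overlapping substrings of length `size`."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def build_filter(reference, kmer_size, fp_rate):
    """Create a Bloom filter holding every k-mer of the reference sequence."""
    words = set(kmers(reference, kmer_size))
    bf = BloomFilter(len(words), fp_rate)
    for word in words:
        bf.add(word)
    return bf


def classify(read, bf, kmer_size, min_fraction=0.8):
    """Label a read as 'belonging' if enough of its k-mers hit the filter.

    `min_fraction` is a hypothetical cutoff chosen for this sketch.
    """
    words = kmers(read, kmer_size)
    if not words:
        return False
    hits = sum(1 for word in words if word in bf)
    return hits / len(words) >= min_fraction
```

A filter would be built once with something like `build_filter(reference, 8, 0.001)` and then reused for every read via `classify(read, bf, 8)`. Reads that are exact substrings of the reference are always classified as belonging, because a Bloom filter never returns a false negative; reads from elsewhere are rejected unless enough of their k-mers happen to be false positives.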