BLAST and Database Searches
Mohammed Mehdi Rizvi
Molecular Lab Techniques 446

Background

BLAST is the Basic Local Alignment Search Tool. It is an algorithm for comparing sequences, such as the amino acid sequences of proteins or the nucleotide sequences of DNA. The program and its algorithms were designed by David J. Lipman, Webb Miller, Eugene Myers, Warren Gish and Stephen Altschul at NIH.

Several DNA sequence databases exist, including GenBank, EMBL and DDBJ. BLAST searches against these databases, but they can also be accessed, searched and queried without BLAST. BLAST is used mainly for aligning sequences and finding similarities between them.

(Pictured: Eugene Myers, Webb Miller, David J. Lipman, Stephen Altschul)

BLAST

We’ve used nucleotide BLAST in this class. There are also protein-protein BLASTs, as well as BLASTs that compare proteins with translated nucleotide sequences. BLAST is useful in primer design.

Before fast algorithms such as BLAST were developed, database searches were very time-consuming because they relied on full alignment procedures such as the Smith-Waterman algorithm. Before BLAST, Lipman and William R. Pearson also developed FASTA, another alignment software package, whose legacy, the FASTA format, is still ubiquitous today.

How does it work?

The first step of the algorithm breaks the query into “words”. The usual word length for DNA is 11 characters, so a long DNA sequence is broken down into 11-character words. The words are compared against the sequence database. A scoring matrix is used to obtain the score, S. Low-complexity sequences are filtered out because they cause artificial hits.

Alignment

Alignment is used to determine whether one sequence is like another by uncovering identical or similar regions. There are two alignment types: global and local. A global alignment aligns one whole sequence against the entire other sequence, so its output is an end-to-end comparison of the two sequences. Local alignments reveal similar, conserved regions.
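As a rough illustration of the ideas above, the Python sketch below splits a query into overlapping 11-character words, looks up exact word matches in a single subject sequence, and computes a Smith-Waterman local alignment score. It is not the BLAST implementation: the function names, the scoring values (+2 match, -1 mismatch, -2 per gap position) and the example sequences are assumptions chosen for illustration, and real BLAST goes on to extend word hits and assess their statistical significance.

```python
from collections import defaultdict

WORD_SIZE = 11  # usual DNA word length mentioned in the text


def split_into_words(query, word_size=WORD_SIZE):
    """Return (offset, word) for every overlapping word in the query."""
    return [(i, query[i:i + word_size])
            for i in range(len(query) - word_size + 1)]


def find_word_hits(query, subject, word_size=WORD_SIZE):
    """Return (query_offset, subject_offset) pairs where a word matches exactly."""
    # Index every word of the subject by its text for fast lookup.
    index = defaultdict(list)
    for j in range(len(subject) - word_size + 1):
        index[subject[j:j + word_size]].append(j)
    hits = []
    for i, word in split_into_words(query, word_size):
        for j in index.get(word, []):
            hits.append((i, j))
    return hits


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman: best score over all local alignments of a and b.

    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment start anywhere (local, not global).
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences sharing a 12-base region.
shared = "ACGTACGTTAGC"
query = "CC" + shared + "AA"
subject = "GGG" + shared + "TTT"

print(find_word_hits(query, subject))         # [(2, 3), (3, 4)]
print(local_alignment_score(query, subject))  # 24
```

The two word hits fall inside the shared 12-base region, and the local score of 24 corresponds to 12 matching bases at +2 each. The unrelated flanking bases do not lower the score, which is the practical difference between a local and a global alignment.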
BLAST, as implied by the name, uses local alignments.

Interpreting BLAST output

E value: a lower E value suggests a more significant match (a smaller probability that the match arose by chance).
Query coverage: the percentage of the query sequence that is aligned.
The graphics in the output show where a similarity has been found, and the colour indicates the degree of similarity.

References

https://www.ndsu.edu/pubweb/~mcclean/plsc411/Blast-explanation-lectureand-overhead.pdf
http://blog.thegrandlocus.com/2014/06/once-upon-a-blast
https://blastalgorithm.com/
https://www.ncbi.nlm.nih.gov/
http://www.genebee.msu.su/blast/blast_faqs.html