Sequence Similarity Searching
75321
Class 4 March 2010
Why Compare Sequences?
Identify sequences found in lab experiments
What is this thing I just found?
Compare new genes to known ones
Compare genes from different species: information about evolution
Guess functions for entire genomes full of new gene sequences
Are there other sequences like this one?
1) Huge public databases - GenBank, Swiss-Prot, etc.
2) Sequence comparison is the most powerful and reliable method to determine evolutionary relationships between genes
3) Similarity searching is based on alignment
4) BLAST and FASTA provide rapid similarity searching
a. rapid = approximate (heuristic)
b. can give false positive and false negative results
Similarity ≠ Homology
1) 25% similarity over 100 or more amino acids (AAs) is strong evidence for homology
2) Homology is an evolutionary statement which means "descent from a common ancestor"
common 3D structure
usually common function
homology is all or nothing, you cannot say "50% homologous"
How to Compare Sequences?
GATGCCATAGAGCTGTAGTCGTACCCT <—
—> CTAGAGAGC-GTAGTCAGAGTGTCTTTGAGTTCC
Manually line them up and count?
an alignment program can do it for you
or just use a text editor
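Counting matches by hand amounts to the following small calculation. This is a minimal sketch, assuming the two sequences have already been lined up to the same length with `-` as the gap character; the function name `percent_identity` and the toy sequences are hypothetical, and dividing by the full alignment length (gap columns included) is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    # Assumption: the denominator is the alignment length, gaps included.
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment, not taken from the slides.
print(round(percent_identity("GATTACA", "GA-TACA"), 1))
```

Here 6 of the 7 columns match, so the result is a little under 86%.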
Dot Plot
shows regions of similarity as diagonals
Global vs Local similarity
1) Global similarity uses complete aligned sequences - total % matches
GCG GAP program, Needleman & Wunsch algorithm
2) Local similarity looks for best internal matching region between 2 sequences
GCG BESTFIT program,
Smith-Waterman algorithm,
BLAST and FASTA
3) dynamic programming
optimal computer solution, not approximate
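The difference between global and local dynamic programming can be sketched in a few lines. This is an illustrative sketch, not the GCG implementation: the scoring values (match +1, mismatch -1, linear gap -1) and the function names `global_score` and `local_score` are assumptions chosen for the example, and only the optimal score is computed, not the alignment itself.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style: score of the best end-to-end alignment."""
    # Row 0: aligning a prefix of b against nothing costs one gap per base.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman style: score of the best matching internal region."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere (local behaviour).
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Hypothetical sequences: a shared core (GATTACA) with unrelated flanks.
x = "TTTGATTACATTT"
y = "CCCGATTACACCC"
print(global_score(x, y), local_score(x, y))
```

With these toy inputs the local score picks out the shared core, while the global score is dragged down by the mismatched flanks.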
Search with Protein, not DNA Sequences
1) 4 DNA bases vs. 20 amino acids - less similarity by chance
2) can have varying degrees of similarity between different AAs
- # of mutations, chemical similarity, PAM matrix
3) protein databanks are much smaller than DNA databanks
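Point 1 can be made concrete with a small calculation. This is a rough sketch under a strong simplifying assumption: every letter is equally likely and positions are independent, which real sequences do not satisfy. The function name `chance_match_probability` is hypothetical.

```python
def chance_match_probability(frequencies):
    """Probability that two independently drawn letters are identical."""
    # Both letters must be the same symbol i, which happens with p_i * p_i.
    return sum(p * p for p in frequencies)


# Assumption: uniform letter frequencies (not true of real genomes or proteins).
dna = chance_match_probability([1 / 4] * 4)
protein = chance_match_probability([1 / 20] * 20)
print(f"DNA: {dna:.2f}  protein: {protein:.2f}")
```

Under this assumption about a quarter of DNA positions match by chance, versus about one in twenty for protein, which is why a given level of identity is more meaningful in a protein comparison.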
Similarity is Based on Dot Plots
1) two sequences on vertical and horizontal axes of graph
2) put dots wherever there is a match
3) diagonal line is region of identity (local alignment)
4) apply a window filter - look at a group of bases, must meet % identity to get a dot
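The four steps above can be sketched as a short program. This is a minimal sketch, not the program used to make the slides: the names `dot_plot` and `render` are hypothetical, a dot is placed at the start position of each window (one of several possible conventions), and the two example sequences are read off the axis labels of the dot plot slides with the spacing removed, which is an assumption.

```python
def dot_plot(seq_x, seq_y, window=1, min_identity=1.0):
    """Return the set of (row, col) dots; rows index seq_y, columns index seq_x."""
    dots = set()
    for row in range(len(seq_y) - window + 1):
        for col in range(len(seq_x) - window + 1):
            # Count matching positions along the diagonal inside the window.
            matches = sum(seq_y[row + k] == seq_x[col + k] for k in range(window))
            if matches / window >= min_identity:
                dots.add((row, col))
    return dots


def render(seq_x, seq_y, dots):
    """Draw the plot as text: '*' for a dot, '.' for an empty cell."""
    lines = ["  " + seq_x]
    for row, base in enumerate(seq_y):
        cells = "".join("*" if (row, col) in dots else "." for col in range(len(seq_x)))
        lines.append(base + " " + cells)
    return "\n".join(lines)


# Assumed sequences (axis labels from the slides, spacing removed).
horizontal = "GATCAACTGACGTA"
vertical = "GTTCAGCTGCGTAC"

print(render(horizontal, vertical, dot_plot(horizontal, vertical)))
print()
# Filtered plot: 4-base window, at least 75% identity (3 of 4 bases match).
print(render(horizontal, vertical,
             dot_plot(horizontal, vertical, window=4, min_identity=0.75)))
```

The unfiltered plot is noisy because any single base matches often by chance; the windowed version keeps mainly the diagonal runs.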
Simple Dot Plot
[Figure: dot plot grid. Horizontal axis: GATCAACTGACGTA. Vertical axis: GTTCAGCTGCGTAC. Dots not preserved.]
Dot plot filtered with 4 base window and 75% identity
[Figure: filtered dot plot grid. Horizontal axis: GATCAACTGACGTA. Vertical axis: GTTCAGCTGCGTAC. Dots not preserved.]
Dot plot of real data
Global vs. Local Alignments