Diagrams Sep 7

Biological preliminaries
Genome: entire complement of genetic material carried by an individual.
Transcriptome: entire set of transcribed sequences produced by the genome.
Proteome: entire set of proteins encoded by the genome.
One-letter nucleotide codes. Based on Nomenclature Committee of the International Union of Biochemistry (NC-IUB). Molecular Biology and Evolution 3:99-108 (1986).

Name                Bases              Code
Guanine             G                  G
Adenine             A                  A
Thymine             T                  T
Cytosine            C                  C
Purine              G or A             R
Pyrimidine          T or C             Y
Amino               A or C             M
Keto                G or T             K
Strong (3 H bonds)  G or C             S
Weak (2 H bonds)    A or T             W
Not G               A or C or T        H
Not A               G or T or C        B
Not T               G or C or A        V
Not C               G or A or T        D
Any                 G or C or T or A   N
Unknown             ?                  X
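As a small illustration of how the code table can be used, the sketch below stores the codes in a dictionary and checks whether a concrete sequence is compatible with a pattern written in ambiguity codes. The names `IUB_CODES`, `bases_for` and `matches` are hypothetical and chosen for this example only. `X` (unknown) is deliberately left out, since the table does not assign it a set of bases.

```python
# One-letter nucleotide codes from the table, mapped to the bases they stand for.
# "X" (unknown) is omitted on purpose: the table gives it no base set.
IUB_CODES = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT",
    "N": "GCTA",
}


def bases_for(code):
    """Return the set of bases that a one-letter code can stand for."""
    return set(IUB_CODES[code.upper()])


def matches(pattern, sequence):
    """True if each base of `sequence` is allowed by the code at the same position."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in bases_for(code)
        for code, base in zip(pattern, sequence.upper())
    )


if __name__ == "__main__":
    print(sorted(bases_for("R")))       # purines
    print(matches("GAYTC", "GACTC"))    # Y allows C
    print(matches("GAYTC", "GAATC"))    # Y does not allow A
```

A dictionary of strings keeps the table readable; converting to a set only at lookup time is enough for short patterns.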
One-letter amino acid codes

Alanine        A    Arginine       R    Asparagine     N
Aspartic acid  D    Cysteine       C    Glutamic acid  E
Glutamine      Q    Glycine        G    Histidine      H
Isoleucine     I    Leucine        L    Lysine         K
Methionine     M    Phenylalanine  F    Proline        P
Serine         S    Threonine      T    Tryptophan     W
Tyrosine       Y    Valine         V    Unknown        X
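The amino acid codes can be held in the same kind of lookup table. The following is a minimal sketch; the names `AMINO_ACIDS` and `spell_out`, and the peptide in the usage line, are made up for illustration.

```python
# One-letter amino acid codes as listed in the table (X = unknown).
AMINO_ACIDS = {
    "A": "Alanine", "R": "Arginine", "N": "Asparagine",
    "D": "Aspartic acid", "C": "Cysteine", "E": "Glutamic acid",
    "Q": "Glutamine", "G": "Glycine", "H": "Histidine",
    "I": "Isoleucine", "L": "Leucine", "K": "Lysine",
    "M": "Methionine", "F": "Phenylalanine", "P": "Proline",
    "S": "Serine", "T": "Threonine", "W": "Tryptophan",
    "Y": "Tyrosine", "V": "Valine", "X": "Unknown",
}


def spell_out(peptide):
    """Translate a string of one-letter codes into a list of full names."""
    return [AMINO_ACIDS[residue] for residue in peptide.upper()]


if __name__ == "__main__":
    print(spell_out("MKV"))  # hypothetical three-residue peptide
```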
Biological preliminaries
Similarity: resemblance between two characters.
Homology: two traits are homologous if they are derived (with or without modification) from a common ancestor.
[Figure: tree with character states A and B illustrating homologs.]
Homoplasy: independent origin of similar characters in different species.
[Figure: tree with character states A and B illustrating homoplasy.]
Plesiomorphy: primitive or ancestral character state.
Apomorphy: derived state representing an evolutionary novelty.
Symplesiomorphy: primitive state shared by several taxa.
Synapomorphy: derived character state shared by several taxa.
Autapomorphy: derived character state unique to a single taxon.
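The four terms for character states can be told apart by two questions: is the state ancestral or derived, and is it carried by one taxon or by several? The sketch below encodes exactly that. It assumes the ancestral state is already known and that each derived state arose only once (no homoplasy). The function name `classify_states` and the taxon names are hypothetical.

```python
from collections import defaultdict


def classify_states(states, ancestral):
    """Label each observed character state.

    states:    dict mapping taxon name -> character state
    ancestral: the state assumed to be primitive (an input, not inferred here)
    """
    taxa_by_state = defaultdict(list)
    for taxon, state in states.items():
        taxa_by_state[state].append(taxon)

    labels = {}
    for state, taxa in taxa_by_state.items():
        shared = len(taxa) > 1
        if state == ancestral:
            labels[state] = "symplesiomorphy" if shared else "plesiomorphy"
        else:
            # Assumes a single origin; with homoplasy this label would be wrong.
            labels[state] = "synapomorphy" if shared else "autapomorphy"
    return labels


if __name__ == "__main__":
    observed = {"taxon1": "a", "taxon2": "a", "taxon3": "b",
                "taxon4": "b", "taxon5": "c"}
    print(classify_states(observed, ancestral="a"))
```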
Mutations: a mutation is an error in the replication of the nucleotide sequence. It may encompass one or more nucleotides and, in complicated situations, may involve disjoint nucleotides. Mutations can be caused by internal errors of metabolism or by external agents such as radiation.
Substitutions: differences between two sequences that were originally caused by mutations and have since been acted on by selection.
Replacements: observed differences between amino acid sequences.
Transition - Transversions
Purines: A, G
Pyrimidines: C, T
Transition: a change purine ⇔ purine or pyrimidine ⇔ pyrimidine
Transversion: a change purine ⇔ pyrimidine
Of the 12 possible changes from one nucleotide to another, 8 are transversions and 4 are transitions.
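The 8-to-4 ratio can be checked by enumerating every ordered pair of distinct nucleotides and classifying each change. This is a minimal sketch; the helper name `classify_change` is an assumption made for the example.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_change(old, new):
    """Classify a change between two different nucleotides."""
    if old == new:
        raise ValueError("no change: both nucleotides are identical")
    if not {old, new} <= PURINES | PYRIMIDINES:
        raise ValueError("expected nucleotides from A, C, G, T")
    pair = {old, new}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Enumerate all 12 ordered changes (A->C, A->G, ..., T->G).
counts = {"transition": 0, "transversion": 0}
for old, new in permutations("ACGT", 2):
    counts[classify_change(old, new)] += 1

print(counts)  # {'transition': 4, 'transversion': 8}
```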
Genomics
Beginnings
The first protein sequenced was bovine insulin in 1956. This was
basically done by a series of tricks. Each individual amino acid was
determined by a separate and different experiment.
The first direct attempts to sequence an RNA molecule were by Holley and co-workers in 1965 (R.W. Holley et al., 1965, Science 147:1462-1465). The technique they used was very labor intensive, and it took them approximately one year to determine the 77 nucleotides that make up the alanine transfer RNA of yeast.
Genome Sequencing
1956  First protein sequence (bovine insulin)
1965  Holley et al. Science 147: 1462-1465 (yeast alanine tRNA)
1977  Maxam and Gilbert. PNAS 74: 560-564
1977  Sanger et al. PNAS 74: 5463-5467
1995  First sequenced genome of a free-living organism: Haemophilus influenzae
1997  Escherichia coli and Saccharomyces cerevisiae
1998  Caenorhabditis elegans genome
2000  Drosophila melanogaster genome
2001  Human genome sequence
2005  454 introduces next-generation sequencing (NGS)
Maxam & Gilbert sequencing
- Principle
  - Radioactively label DNA fragments at their 5′ end using alkaline phosphatase / polynucleotide kinase
  - Separate the fragments according to their size using gel electrophoresis, followed by autoradiography to visualize the fragments
- Method: four different treatments
  - Dimethylsulfate followed by heat treatment (G)
  - Dimethylsulfate + mild acid (A+G)
  - Hydrazine (C+T)
  - Hydrazine + 2 M NaCl (C)
- Limitation
  - Requires a cloning step (amplification and labeling)
[Figure: schematic autoradiogram with four lanes (G, A+G, T+C, C), running from the negative electrode (−ve, longest fragments) to the positive electrode (+ve, shortest fragments). The 32P-labeled fragments range from C, CT, CTT up to CTTCAGTACGTCG. Inferred DNA sequence: CTTCAGTACGTCG.]
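The logic of reading a Maxam & Gilbert gel can be sketched in code. The model below is a deliberate simplification that follows the figure: each of the four treatments yields one labeled fragment per reactive base, and the fragment is recorded by its length up to and including that base. The names `TREATMENTS`, `run_gel` and `read_gel` are hypothetical. A band in both the G and A+G lanes is read as G, a band in A+G only as A, and likewise for C and T.

```python
# Bases at which each chemical treatment cleaves (lane names as in the figure).
TREATMENTS = {"G": "G", "A+G": "AG", "T+C": "TC", "C": "C"}


def run_gel(sequence):
    """Return {lane: set of fragment lengths} for a 5'-labeled strand."""
    return {
        lane: {i + 1 for i, base in enumerate(sequence) if base in targets}
        for lane, targets in TREATMENTS.items()
    }


def read_gel(gel):
    """Infer the sequence by walking the bands from shortest to longest."""
    longest = max((max(bands) for bands in gel.values() if bands), default=0)
    bases = []
    for length in range(1, longest + 1):
        if length in gel["G"] and length in gel["A+G"]:
            bases.append("G")
        elif length in gel["A+G"]:
            bases.append("A")
        elif length in gel["C"] and length in gel["T+C"]:
            bases.append("C")
        elif length in gel["T+C"]:
            bases.append("T")
        else:
            bases.append("N")  # no band at this length: base cannot be called
    return "".join(bases)


if __name__ == "__main__":
    gel = run_gel("CTTCAGTACGTCG")  # sequence from the figure
    print(sorted(gel["G"]))
    print(read_gel(gel))
```

Only two of the four lanes are base-specific, which is why each call needs two lanes to be compared.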
Sanger sequencing
- Principle
  - Uses the properties of DNA replication
  - Therefore requires the use of primers
  - Aim: generate fragments of different sizes by stopping DNA replication with 2′,3′-dideoxyribonucleotide triphosphates
- Method
  - Four separate reactions are performed, one with each of the radioactively labeled 2′,3′-dideoxyribonucleotide triphosphates
  - Gel electrophoresis followed by autoradiography
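The chain-termination idea can be illustrated with a small simulation. It is a sketch under simplifying assumptions: fragment lengths are counted from the end of the primer, every possible termination product is present, and the strand being read is the newly synthesized one (the reverse complement of the template). The function names and the template sequence are made up for the example. The template was chosen so that the synthesized strand is CTTCAGTACGTCG.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template):
    """Strand built by the polymerase, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def sanger_reactions(template):
    """Return {dideoxynucleotide: fragment lengths} for the four reactions.

    In the reaction containing ddG, synthesis can stop wherever a G is
    incorporated into the new strand, and likewise for ddA, ddT and ddC.
    """
    new_strand = synthesized_strand(template)
    lanes = {dd: [] for dd in "GATC"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_lanes(lanes):
    """Read the gel from the shortest to the longest fragment."""
    lane_by_length = {
        length: dd for dd, lengths in lanes.items() for length in lengths
    }
    return "".join(lane_by_length[length] for length in sorted(lane_by_length))


if __name__ == "__main__":
    lanes = sanger_reactions("CGACGTACTGAAG")  # hypothetical template
    print(lanes)
    print(read_lanes(lanes))
```

Because every lane is specific to one base, each band can be called from a single lane.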
[Figure series: chemical structures of a DNA strand (sugar-phosphate backbone with bases), of DNA replication, in which a nucleotide triphosphate is added at the free 3′-OH of the growing strand, and of a dideoxynucleotide triphosphate, which lacks the 3′-OH.]
[Figure: schematic autoradiogram with one lane per dideoxynucleotide (G, A, T, C), running from the negative electrode (−ve, longest fragments) to the positive electrode (+ve, shortest fragments). The 32P-labeled fragments range from C, CT, CTT up to CTTCAGTACGTCG. Inferred DNA sequence: CTTCAGTACGTCG. The schematic is shown a second time next to a real autoradiogram.]
Autoradiogram courtesy of Dr. Rahat Zaheer
Example of a good trace
Example of a poorer quality trace
Example of a bad trace
Example of the beginning of a trace
Example of the middle of a trace
Example near the useful end of a trace