Isaiah 40:4, 5
4 Every valley shall be exalted, and
every mountain and hill shall be
made low: and the crooked shall
be made straight, and the rough
places plain:
5 And the glory of the LORD shall
be revealed, and all flesh shall see
it together: for the mouth of the
LORD hath spoken it.
©2001 Timothy G. Standish
Getting Meaning From Molecular Data
Timothy G. Standish, Ph.D.
What are Genes?
The one gene one enzyme hypothesis has
been refined to mean each gene codes for a
polypeptide
Things get fuzzy when a specific locus codes
for more than one polypeptide
For the purposes of this class, we will define
genes as segments of DNA that are
transcribed and associated regions that
control their transcription
Genes may code for either polypeptides or
RNAs
Determination of Gene Numbers
DNA sequences are considered to be the gold
standard for determining the number of genes in an
organism’s genome
The problem is that most organisms have unsequenced genomes and, even when genomes are
sequenced, deciding if a segment of DNA
represents a region that is transcribed can
frequently be difficult
Searching DNA for open reading frames seems to
be the most logical way of finding genes, but the
existence of an open reading frame does not
by itself show that it is transcribed
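As a rough illustration of what searching for open reading frames involves, the Python sketch below scans the three forward reading frames of a DNA string for an ATG followed in frame by a stop codon. The function name `find_orfs`, the `min_codons` cutoff and the test sequence are all hypothetical choices made for this example, and a real gene finder would also scan the reverse strand and weigh other evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG...stop run on the given strand.

    start and end are 0-based; end is exclusive and includes the stop codon.
    min_codons is an arbitrary cutoff on codons before the stop (assumption).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ORF
    return orfs

# Made-up sequence: ATG AAA CCC GGG TTT TAA, offset by two bases.
print(find_orfs("GGATGAAACCCGGGTTTTAACC"))
```

A hit from a scan like this only says that a stretch could be translated, which is why it cannot settle whether the region is actually transcribed.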
Indirect Estimates
DNA hybridization etc.
Denaturation and Renaturation
Heating double stranded DNA can overcome the
hydrogen bonds holding it together and cause the
strands to separate resulting in denaturation of
the DNA
When cooled, the relatively weak hydrogen bonds
between bases can re-form and the DNA renatures
Double stranded DNA
ATGAGCTGTACGATCGTG
TACTCGACATGCTAGCAC
Denatured (single stranded) DNA
ATGAGCTGTACGATCGTG   +   TACTCGACATGCTAGCAC
Renatured (double stranded) DNA
ATGAGCTGTACGATCGTG
TACTCGACATGCTAGCAC
Denaturation and Renaturation
DNA with a high guanine and cytosine content has
relatively more hydrogen bonds between strands
This is because for every GC base pair 3 hydrogen bonds
are made while for AT base pairs only 2 bonds are made
Thus higher GC content is reflected in higher melting or
denaturation temperature
ACGAGCTGCACGAGC
TGCTCGACGTGCTCG
67 % GC content - High melting temperature

ATGAGCTGTCCGATC
TACTCGACAGGCTAG
53 % GC content - Intermediate melting temperature

ATGATCTGTAAGATC
TACTAGACATTCTAG
33 % GC content - Low melting temperature
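The arithmetic behind these three duplexes can be checked with a short Python sketch. It counts G and C bases in one strand and totals the hydrogen bonds using the 3-per-GC and 2-per-AT rule stated above; the function names are hypothetical, and real melting temperatures also depend on factors such as salt concentration and sequence length that this sketch ignores.

```python
def gc_fraction(seq):
    """Fraction of bases in one strand that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def hydrogen_bonds(seq):
    """Hydrogen bonds in the duplex: 3 per GC pair, 2 per AT pair."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 3 * gc + 2 * at

# Top strands of the three example duplexes.
for strand in ("ACGAGCTGCACGAGC", "ATGAGCTGTCCGATC", "ATGATCTGTAAGATC"):
    print(strand, f"{gc_fraction(strand):.0%} GC,",
          hydrogen_bonds(strand), "hydrogen bonds")
```

The three strands contain 10, 8 and 5 GC pairs out of 15, so the duplexes are held together by 40, 38 and 35 hydrogen bonds respectively.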
Determination of GC Content
Comparison of melting temperatures can be used to
determine the GC content of an organism's genome
To do this it is necessary to be able to detect whether
DNA is melted or not
Absorbance at 260 nm of DNA in solution provides
a means of determining how much is single stranded
Single stranded DNA absorbs 260 nm ultraviolet
light more strongly than double stranded DNA does
although both absorb at this wavelength
Thus, increasing absorbance at 260 nm during
heating indicates increasing concentration of single
stranded DNA
Determination of GC Content
[Figure: melting curves. OD260 (0 to 1.0) plotted against temperature (65 to 95 oC). DNA is double stranded at the low end of each curve and single stranded at the high end.]
Tm is the temperature at which half the DNA is melted
Relatively low GC content: Tm = 75 oC
Relatively high GC content: Tm = 85 oC
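One way to read Tm off a melting curve is to find the temperature at which absorbance is halfway between its fully double stranded and fully single stranded values. The Python sketch below does this by linear interpolation between readings. The function name and the readings are hypothetical, and it assumes absorbance rises steadily with temperature.

```python
def melting_temperature(temps, absorbances):
    """Interpolate the temperature where A260 is midway between min and max.

    Assumes the readings are ordered by temperature and that absorbance
    does not decrease as temperature rises (assumption of this sketch).
    """
    low, high = min(absorbances), max(absorbances)
    half = (low + high) / 2
    points = list(zip(temps, absorbances))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            # Linear interpolation within the segment that spans the midpoint.
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("curve never crosses its midpoint")

# Made-up OD260 readings, for illustration only.
temps = [65, 70, 75, 80, 85, 90, 95]
od260 = [0.20, 0.22, 0.30, 0.60, 0.90, 0.98, 1.00]
print(f"Estimated Tm: {melting_temperature(temps, od260):.1f} oC")
```

Comparing the Tm values estimated this way for two DNA samples gives the relative GC content described above: the sample with the higher Tm has the higher GC content.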
GC Content Of Some Genomes
Organism                 % GC
Homo sapiens             39.7 %
Sheep                    42.4 %
Hen                      42.0 %
Turtle                   43.3 %
Salmon                   41.2 %
Sea urchin               35.0 %
E. coli                  51.7 %
Staphylococcus aureus    50.0 %
Phage λ                  55.8 %
Phage T7                 48.0 %
Hybridization
The bases in DNA will only pair in very specific ways, G
with C and A with T
In short DNA sequences, imprecise base pairing will not be
tolerated
Long sequences can tolerate some mispairing only if the -ΔG
of the majority of bases in a sequence exceeds the energy
required to keep mispaired bases together
Because the source of any single strand of DNA is
irrelevant, and only the sequence is important, DNA from
different sources can form a double helix as long as their
sequences are compatible
Thus, this phenomenon of base pairing of single stranded
DNA strands to form a double helix is called hybridization
as it may be used to make hybrid DNA composed of
strands which came from different sources
Hybridization
DNA from source “X”
CTGATGGTCATGAGCTGTCCGATCGATCAT
         TACTCGACAGGCTAG
DNA from source “Y” (hybridized to its complementary region in “X”)
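The pairing in this example can be reproduced with a small Python sketch that looks for the stretch of strand “X” to which strand “Y” is complementary. It assumes, as in the diagram, that “Y” is written in the direction that lines up base for base with “X”, so no reversal is applied, and it demands perfect pairing. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-by-base complement (G<->C, A<->T), without reversing the strand."""
    return seq.upper().translate(COMPLEMENT)

def probe_sites(target, probe):
    """0-based positions in target where probe pairs perfectly along its length."""
    wanted = complement(probe)
    sites = []
    pos = target.find(wanted)
    while pos != -1:
        sites.append(pos)
        pos = target.find(wanted, pos + 1)
    return sites

source_x = "CTGATGGTCATGAGCTGTCCGATCGATCAT"
source_y = "TACTCGACAGGCTAG"
print(complement(source_y))             # ATGAGCTGTCCGATC
print(probe_sites(source_x, source_y))  # [9]
```

The single hit at position 9 corresponds to the region of “X” drawn directly above “Y”. Allowing for the mispairing that longer sequences tolerate would need a scoring scheme rather than exact matching.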
Hybridization
Because DNA sequences will seek out and hybridize with
other sequences with which they base pair in a specific
way, much information can be gained about unknown DNA
using single stranded DNA of known sequence
Short sequences of single stranded DNA can be used as
“probes” to detect the presence of their complementary
sequence in any number of applications including:
– Southern blots
– Northern blots (in which RNA is probed)
– In situ hybridization
– Dot blots . . .
In addition, the renaturation or hybridization of DNA in
solution can tell much about the nature of organisms’
genomes
Reassociation Kinetics
An organism’s DNA can be heated in solution
until it melts, then cooled to allow DNA strands to
reassociate forming double stranded DNA
This is typically done after shearing the DNA to
form many fragments a few hundred bases in
length
The larger and more complex an organism's
genome is, the longer it will take for
complementary strands to bump into one another
and hybridize
Reassociation follows second order kinetics
Reassociation Kinetics
The following equation describes the second order
rate kinetics of DNA reassociation:

$$\frac{C}{C_0} = \frac{1}{1 + k C_0 t}$$

C = concentration of single stranded DNA after time t
Co = initial concentration of single stranded DNA
k = second order rate constant (the important thing is that it is a constant)
Cot = Co (measured in moles/liter) x t (seconds). Generally graphed on a log10 scale.
Cot1/2 is the point at which half the initial
concentration of single stranded DNA has
annealed to form double-stranded DNA
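Setting C/Co = 0.5 in the equation gives 1 + k x Cot1/2 = 2, so Cot1/2 = 1/k. The Python sketch below evaluates the equation over a range of Cot values and checks that relationship. The rate constant used is an arbitrary value chosen for illustration, not a measured one, and the function names are hypothetical.

```python
def fraction_single_stranded(cot, k):
    """C/Co from the second order rate equation: 1 / (1 + k * Cot)."""
    return 1.0 / (1.0 + k * cot)

def cot_half(k):
    """Cot at which half the DNA has reassociated (C/Co = 0.5), i.e. 1/k."""
    return 1.0 / k

k = 0.25  # hypothetical rate constant, liters per mole per second
for exponent in range(-4, 5):
    cot = 10.0 ** exponent
    print(f"Cot = 1e{exponent:+d}  C/Co = {fraction_single_stranded(cot, k):.4f}")

print("Cot1/2 =", cot_half(k))
```

Because Cot1/2 is the reciprocal of k, DNA that reassociates slowly (small k) has a large Cot1/2.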
Reassociation Kinetics
[Figure: Cot curve. Fraction remaining single-stranded (C/Co, 0 to 1.0) plotted against Cot (mole x sec./l) on a log scale from 10^-4 to 10^4. Cot1/2 is the Cot value at which C/Co = 0.5.]
Higher Cot1/2 values indicate greater genome complexity
Reassociation Kinetics
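[Figure: Cot curves for prokaryotic DNA and eukaryotic DNA. Fraction remaining single-stranded (C/Co, 0 to 1.0) plotted against Cot (mole x sec./l) on a log scale from 10^-4 to 10^4. The eukaryotic curve is labelled with a repetitive DNA portion and a unique sequence complex DNA portion.]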
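A common way to think about the stepped eukaryotic curve is as a sum of components, each following the second order equation with its own rate constant and weighted by the fraction of the genome it makes up. The Python sketch below illustrates that idea. The additive model is an assumption of this sketch rather than something stated on the slide, and the fractions and rate constants are invented for illustration.

```python
def mixed_fraction(cot, components):
    """C/Co for a genome made of independent kinetic components.

    components: iterable of (fraction_of_genome, rate_constant) pairs
    whose fractions sum to 1 (assumed additive model).
    """
    return sum(f / (1.0 + k * cot) for f, k in components)

# Hypothetical genomes: all values below are made up for illustration.
PROKARYOTE = [(1.0, 1.0)]                    # one unique-sequence component
EUKARYOTE = [(0.25, 1000.0), (0.75, 0.01)]   # fast repeats + slow unique DNA

for exponent in range(-4, 5):
    cot = 10.0 ** exponent
    print(f"Cot = 1e{exponent:+d}"
          f"  prokaryote {mixed_fraction(cot, PROKARYOTE):.3f}"
          f"  eukaryote {mixed_fraction(cot, EUKARYOTE):.3f}")
```

With these made-up numbers the repetitive component has reassociated by a Cot of about 0.1, while the unique component only begins to anneal at Cot values above 1, producing the two-step shape.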
Repetitive DNA
Organism                    % Repetitive DNA
Homo sapiens                21 %
Mouse                       35 %
Calf                        42 %
Drosophila                  70 %
Wheat                       42 %
Pea                         52 %
Maize                       60 %
Saccharomyces cerevisiae    5 %
E. coli                     0.3 %
The Globin Gene Family
Globin genes code for the
protein portion of hemoglobin
In adults, hemoglobin is made
up of 4 globin proteins, 2 α globins
and 2 β globins, each carrying an
iron (Fe) containing heme group
During development, different globin genes are
expressed which alter the oxygen affinity of
embryonic and fetal hemoglobin
Model For Evolution Of The
Globin Gene Family
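[Figure: model for evolution of the globin gene family.]
Ancestral globin gene
  -> duplication and mutation -> α and β
  -> transposition -> α lineage on chromosome 16, β lineage on chromosome 11
  -> α lineage: duplication and mutation -> ζ and α
  -> β lineage: duplication and mutation -> ε, γ and β, then Gγ and Aγ
Chromosome 16: ζ, ψζ, ψα2, ψα1, α2, α1, ψθ
  ζ: embryo; α: fetus and adult
Chromosome 11: ε, Gγ, Aγ, ψβ, δ, β
  ε: embryo; Gγ and Aγ: fetus; δ and β: adult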
Pseudogenes (ψ) resemble genes, but may lack introns and, along with other
differences, typically have stop codons that come soon after the start codons.