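Map by Blotting
[Figure: the actual restriction map of a region, with EcoRI (E) and HindIII (H) sites, marker positions, and GeneR, GeneC, GeneX and GeneA on fragments ranging from 0.5 kb to 9 kb, compared with the map inferred from a Southern blot, which recovers only the 4 kb EcoRI fragment and the 8 kb HindIII fragment around GeneC.]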
Deletion of restriction site
[Figure: EcoRI maps of the same region in wild type (WT) and mutant, showing the same genes, markers and fragment sizes (0.5 kb to 9 kb). The mutant has lost one EcoRI site.]
Individuals
Methods used to study differences between individuals
RFLP
SNP
DNA Repeats
RFLP analysis
RFLP = restriction fragment length polymorphism
Refers to variation in restriction sites between individuals in a
population, caused by mutations in the restriction sites
These are extremely useful and valuable for geneticists (and
lawyers)
On average, two individuals (humans) differ at 1 in every 300-1000 bp
The human genome is 3 x 10^9 bp
This means that they will differ at 3 million bp or more!
By chance, some of these changes will create or destroy the recognition
sites for restriction enzymes
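The arithmetic behind the "3 million bp" estimate can be checked with a short sketch. The function name is only illustrative; the genome size and the 1-in-300 to 1-in-1000 range are the figures quoted on the slide.

```python
GENOME_SIZE_BP = 3_000_000_000  # human genome size quoted on the slide

def expected_differences(genome_size_bp, bp_per_difference):
    """Expected number of differing positions between two genomes."""
    return genome_size_bp // bp_per_difference

low = expected_differences(GENOME_SIZE_BP, 1000)   # one difference per 1000 bp
high = expected_differences(GENOME_SIZE_BP, 300)   # one difference per 300 bp
print(f"{low:,} to {high:,} differing positions")
```

So the slide's 3 million is the lower end of the range; at 1 difference per 300 bp the estimate rises to 10 million.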
RFLP
Let's generate an EcoRI map for the region in one individual:
GAATTC ----3kb---- GAATTC ----4kb---- GAATTC
The same region of a second individual may appear as:
GAATTC -----------7kb (GAGTTC)----------- GAATTC
[Figure: EcoRI Southern blot with a marker lane, lane 1 (normal, GAATTC) and lane 2 (mutant, GAGTTC).]
RFLP
The internal EcoRI site is missing in the second individual
For X1 the sequence at this site is
GAATTC
CTTAAG
This is the sequence recognized by EcoRI
The equivalent site in the X2 individual is different
GAGTTC
CTCAAG
This sequence IS NOT recognized by EcoRI and is therefore
not cut
Now if we examine a large number of humans at this site we
may find that 25% possess the EcoRI site and 75% lack this
site.
We can say that a restriction fragment length polymorphism
exists in this region
These polymorphisms usually do not have any phenotypic
consequences
They are typically silent mutations that do not alter the protein
sequence, because of redundancy in codon usage or because they
lie in introns or non-genic regions
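A minimal sketch of how a single base change removes an EcoRI fragment boundary. The two sequences are toy constructs (hypothetical, scaled so that 10 bp stands in for 1 kb); only the GAATTC and GAGTTC sites come from the slides.

```python
ECORI_SITE = "GAATTC"

def ecori_cut_positions(seq):
    """Return cut positions; EcoRI cuts after the first G (G^AATTC)."""
    positions = []
    i = seq.find(ECORI_SITE)
    while i != -1:
        positions.append(i + 1)
        i = seq.find(ECORI_SITE, i + 1)
    return positions

def internal_fragments(seq):
    """Lengths of fragments bounded by EcoRI cuts on both sides."""
    cuts = ecori_cut_positions(seq)
    return [b - a for a, b in zip(cuts, cuts[1:])]

# Toy region: three sites spaced 30 and 40 bp apart (stand-ins for 3 kb and 4 kb).
x1 = "GAATTC" + "A" * 24 + "GAATTC" + "A" * 34 + "GAATTC"
# Same region with the internal site mutated to GAGTTC.
x2 = "GAATTC" + "A" * 24 + "GAGTTC" + "A" * 34 + "GAATTC"

print(internal_fragments(x1))  # two fragments
print(internal_fragments(x2))  # one longer fragment
```

The mutated individual yields one fragment whose length is the sum of the two fragments seen in the first individual, which is the 3 kb + 4 kb versus 7 kb pattern on the slide.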
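RFLP
RFLPs are identified by Southern blots
In this region of the human X chromosome, two forms of the
X chromosome (X1 and X2) are segregating in the population.
[Figure: restriction maps of X1 and X2 showing BamHI (B) and EcoRI (R) sites and the positions of Probe1 and Probe2. Fragment sizes (kb) shown on X1: 4, 5, 3, 6 and 3.5; on X2: 4, 8, 6 and 3.5.]
Digest DNA with EcoRI or BamHI and probe with Probe1 or Probe2
What do we get?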
RFLP in individuals
If we used Probe1 for Southern blots with a BamHI digest, what
would be the results for X1/X1, X1/X2 and X2/X2 individuals?
[Figure: BamHI digest, Probe1: a single 18 kb band in X1/X1, X1/X2 and X2/X2 individuals.]
If we used Probe1 for Southern blots with an EcoRI digest, what
would be the results for X1/X1, X1/X2 and X2/X2 individuals?
[Figure: EcoRI digest, with the X1 and X2 maps redrawn. Band sets listed for Probe1 (kb): "5, 3" (X1/X1, which has the internal EcoRI site), "8, 5, 3" (X1/X2) and "8" (X2/X2, which lacks the site). 9.5 kb band labels also appear in the figure.]
RFLP
RFLPs are found by trial and error
They require an appropriate probe and an appropriate enzyme
They are very valuable because they can be used just like any
other genetic marker to map genes
They are employed in recombination analysis (mapping) in the
same way as conventional morphological allele markers
The presence of a specific restriction site at a locus on
one chromosome and its absence at the same locus on another
chromosome can be viewed as two allelic forms of a gene
The phenotype in this case is a Southern blot pattern rather than white
eye/red eye
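The idea that the "phenotype" is a Southern blot pattern can be sketched as a small band predictor. The site coordinates below are hypothetical (chosen to reproduce the 4, 5, 3, 6, 3.5 kb and 4, 8, 6, 3.5 kb EcoRI fragments of the X1 and X2 maps), and the probe is assumed to span the polymorphic site.

```python
def probe_bands(sites_kb, probe):
    """Sizes of restriction fragments that overlap the probe interval."""
    start, end = probe
    sites = sorted(sites_kb)
    return [right - left
            for left, right in zip(sites, sites[1:])
            if left < end and right > start]

def genotype_bands(chrom_a, chrom_b, probe):
    """Bands seen for a diploid individual: the union of both chromosomes."""
    bands = set(probe_bands(chrom_a, probe)) | set(probe_bands(chrom_b, probe))
    return sorted(bands, reverse=True)

# Hypothetical EcoRI site coordinates in kb.
X1_SITES = [0, 4, 9, 12, 18, 21.5]   # internal site at 9 kb present
X2_SITES = [0, 4, 12, 18, 21.5]      # internal site missing
PROBE1 = (8.5, 9.5)                  # assumed to straddle the variable site

for name, a, b in [("X1/X1", X1_SITES, X1_SITES),
                   ("X1/X2", X1_SITES, X2_SITES),
                   ("X2/X2", X2_SITES, X2_SITES)]:
    print(name, genotype_bands(a, b, PROBE1))
```

Each genotype gives a distinct band pattern, so the blot can be scored exactly like a visible trait with two codominant alleles.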
[Figure: EcoRI (R) maps of the X1 and X2 chromosomes carrying two RFLPs, with the positions of Probe1 and Probe2. Band labels shown for X1/X1 and X2/X2 individuals include 6+2, 5, 8 and 3 kb.]
Using RFLPs to map human disease genes
[Figure: pedigree scored for two RFLPs. RFLP1 (top): EcoRI digest, Probe1, fragments labelled 8, 6 and 2 kb. RFLP2 (bottom): EcoRI digest, Probe2, fragments labelled 5, 3, 1 and 2.]
Which RFLP pattern segregates specifically with the diseased
individuals?
Top or bottom?
Using DNA probes for different RFLPs you can screen
individuals for an RFLP pattern that shows co-inheritance with
the disease
Conclusion: the actual mutation resides at or near the top
RFLP (RFLP1)
RFLPs and Mapping unknown Genes
Let's review standard mapping:
To map any two genes with respect to one another, the individual must be
heterozygous at both loci. Genes W and B are responsible for wing
and bristle development
Centromere ----W--------B---- Telomere
To find the map distance between these two genes we need allelic
variants at each locus
W = wings
w = no wings
B = bristles
b = no bristles
To measure genetic distance between these two genes, the
double heterozygote is crossed to the double homozygous recessive
Mapping
To map a gene with respect to another, you perform crosses and
measure recombination frequency between the two genes.
Genes W and B are responsible for wing and bristle development
Centromere ----W--------B---- Telomere
To find the map distance between these two genes we need
ALLELIC variants at each locus
W = wings
w = no wings
B = bristles
b = no bristles
To measure distance between these two genes, the double
heterozygote is crossed to the double homozygous recessive
WB/wb female (wings, bristles)  x  wb/wb male (no wings, no bristles)
----W--------B----                 ----w--------b----
----w--------b----                 ----w--------b----
Mapping
Female gamete  Male gamete  Genotype  Phenotype              Number
WB             wb           WB/wb     Wings, bristles        51
wb             wb           wb/wb     No wings, no bristles  43
Wb             wb           Wb/wb     Wings, no bristles     3
wB             wb           wB/wb     No wings, bristles     4
Map distance = # recombinants / total progeny
7/101 = 0.069, or about 7 M.U.
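The map distance calculation can be written out as a short sketch. The function name is illustrative; the counts are the progeny numbers from the test cross above.

```python
def map_distance(parental_counts, recombinant_counts):
    """Map distance in map units: percent of progeny that are recombinant."""
    recombinants = sum(recombinant_counts)
    total = recombinants + sum(parental_counts)
    return 100 * recombinants / total

# Parental classes: WB (51) and wb (43); recombinant classes: Wb (3) and wB (4).
distance = map_distance([51, 43], [3, 4])
print(f"{distance:.1f} map units")
```

With 7 recombinants among 101 progeny the result is just under 7 map units, which the slide rounds to 7 M.U.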
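Mapping
Both the normal and mutant alleles of gene B (B and b) are
sequenced and we find
[Figure: EcoRI (E) maps of the B region between the centromere and the telomere. The B allele has an internal GAATTC site that splits the region into 3 kb and 2 kb EcoRI fragments; in the b allele this site reads AAATTC, leaving a single 5 kb fragment.]
The mutation disrupts the sequence and destroys an EcoRI site!
If DNA is isolated from B/B, B/b and b/b individuals, cut
with EcoRI and probed in a Southern blot, the pattern that
we will obtain is
B/B  Bristles     2 kb band
B/b  Bristles     5 kb and 2 kb bands
b/b  No bristles  5 kb band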
Mapping
To find the map distance between two genes we need
ALLELIC variants at each locus
Therefore in the cross (WB/wb x wb/wb), the genotype
at the B locus can be distinguished either by the presence or
absence of bristles OR by a Southern blot
WB/wb female  x  wb/wb male
Female: wings, bristles. Southern blot: 5 kb and 2 kb bands
Male: no wings, no bristles. Southern blot: 5 kb band
For some genes the phenotype is very difficult to score
Having an RFLP makes the problem easier
Just like genes, RFLPs mark specific positions on chromosomes
and can be used for mapping.
Mapping
Class        Female gamete  Male gamete  Genotype  Phenotype                   Number
Parental     WB             wb           WB/wb     Wings, 5 kb and 2 kb bands   51
Parental     wb             wb           wb/wb     No wings, 5 kb band          43
Recombinant  Wb             wb           Wb/wb     Wings, 5 kb band             3
Recombinant  wB             wb           wB/wb     No wings, 5 kb and 2 kb bands  4
Map distance = # recombinants / total progeny = 7/101, or about 7 M.U.
Mapping
To find the map distance between genes, multiple alleles are
required.
We know the distance between W and B by the classical method
because multiple alleles exist at each locus
(W & w, B & b). It is 7 MU.
We know the distance between B and R by the classical method: it is
20 MU.
[Map: Centromere ----W----(7 MU)----B----(20 MU)----R---- Telomere, with the new gene C also marked]
Now suppose you find a new gene C.
You could map this gene with respect to genes W, B and R using
classical methods.
However, what if it is difficult to study the function of this new
gene (the phenotype is difficult to see with the naked eye)?
If the researcher identifies an RFLP in this gene, you can map
the gene mutation by simply following the RFLP.
Mapping
[Figure: EcoRI (E) maps of the two alleles. Allele C gives a single 8 kb EcoRI fragment; allele c has an extra internal EcoRI site that splits it into 6 kb and 2 kb fragments.]
With this RFLP, the C gene can be mapped with respect to
other genes in any cross.
Genotype/phenotype relationships for the W and C genes
WW and Ww = red eyes
ww = white eyes
CC = 8 kb band
C/c = 8, 6, 2 kb bands
cc = 6, 2 kb bands
To determine the map distance between W and C, the following cross
is performed
W C / w c   x   w c / w c
Mapping
[Map: W, B (7 MU from W) and R (20 MU from B), with C to be placed]
Cross: W C(8) / w c(6,2)   x   w c(6,2) / w c(6,2)
Class        Female gamete  Male gamete  Genotype  Phenotype       Number
Parental     WC             wc           WC/wc     Red / 8, 6, 2   45
Parental     wc             wc           wc/wc     White / 6, 2    45
Recombinant  Wc             wc           Wc/wc     Red / 6, 2      5
Recombinant  wC             wc           wC/wc     White / 8, 6, 2  5
Map distance between W and C is 10 MU
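Scoring this cross from eye colour plus band pattern can be sketched as follows. The dictionary layout and function name are illustrative; the counts and band sets are those in the table, and the maternal gamete is inferred on the assumption that the tester parent contributes only w and c (6 and 2 kb bands).

```python
# Progeny counts keyed by (eye colour, EcoRI bands in kb), taken from the table.
progeny = {
    ("red", (8, 6, 2)): 45,
    ("white", (6, 2)): 45,
    ("red", (6, 2)): 5,
    ("white", (8, 6, 2)): 5,
}

def maternal_gamete(eye_colour, bands):
    """Infer the gamete from the W C / w c mother; the father gives only w c."""
    w_allele = "W" if eye_colour == "red" else "w"
    c_allele = "C" if 8 in bands else "c"   # the 8 kb band marks allele C
    return w_allele + c_allele

PARENTAL_GAMETES = {"WC", "wc"}

recombinants = sum(count for (eye, bands), count in progeny.items()
                   if maternal_gamete(eye, bands) not in PARENTAL_GAMETES)
total = sum(progeny.values())
distance_w_c = 100 * recombinants / total
print(f"W to C: {distance_w_c:.0f} map units")
```

The band pattern stands in for a visible phenotype at the C locus, so the calculation is otherwise identical to the classical wing and bristle cross.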
Mapping
Prior to RFLP analysis, only a few classical markers existed in
humans (approximately 200)
Now over 7000 RFLPs have been mapped in the human genome.
Newly identified inherited disorders are now mapped by determining
whether they are linked to previously identified RFLPs
[Figure: the classical map (Centromere, W or w, 7 MU, B or b, 20 MU, R or r, Telomere) aligned with an RFLP map of the same chromosome: RFLP1 (Probe1, 7 kb or 4 kb), RFLP2 (Probe2, 1 kb or 2 kb) and RFLP3 (Probe3, 3 kb or 9 kb).]
Individuals
Methods used to study differences between individuals
RFLP
SNP
DNA Repeats
Genetic polymorphism
•Genetic Polymorphism: A difference in DNA sequence among
individuals, groups, or populations.
•Genetic Mutation: A change in the nucleotide sequence of a
DNA molecule.
Genetic mutations are a subset of genetic polymorphism
Genetic variation:
–Single nucleotide polymorphism (point mutation)
–Repeat heterogeneity
SNP
•A Single Nucleotide Polymorphism is a source of variation in a
genome.
•A SNP ("snip") is a single base change in DNA.
•SNPs are the simplest form and most common source of
genetic polymorphism in the human genome (90% of all human
DNA polymorphisms).
•There are two types of nucleotide base substitutions resulting
in SNPs:
–Transition: substitution between purines (A, G) or
between pyrimidines (C, T). Transitions constitute two thirds of all
SNPs.
–Transversion: substitution between a purine and a
pyrimidine.
While a single base can change to any of the other three
bases, most SNPs have only two alleles.
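The transition/transversion distinction can be expressed as a small classifier. This is a sketch with illustrative names, using only the purine and pyrimidine definitions given above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in PURINES | PYRIMIDINES or alt not in PURINES | PYRIMIDINES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("no substitution: both bases are the same")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_substitution("A", "G"))  # purine to purine
print(classify_substitution("C", "T"))  # pyrimidine to pyrimidine
print(classify_substitution("A", "T"))  # purine to pyrimidine
```

Of the 12 possible substitutions only 4 are transitions, so their two-thirds share of observed SNPs is far higher than chance alone would predict.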
SNPs-
Single Nucleotide Polymorphisms
-----------------------ACGGCTAA
-----------------------ATGGCTAA
Instead of using restriction enzymes, these are found by
direct sequencing/PCR
They are extremely useful for mapping
Markers              Number
Classical Mendelian  ~200
RFLPs                7000
SNPs                 1.4 x 10^6
SNPs occur every 300-1000 bp along the 3 billion bp human
genome
Many SNPs have no effect on cell function
Note: RFLPs caused by single-base changes are a subclass of SNPs
SNPs
Humans are genetically >99 per cent identical: it is the
tiny percentage that is different
Much of our genetic variation is caused by single-nucleotide
differences in our DNA : these are called single nucleotide
polymorphisms, or SNPs.
As a result, each of us has a unique genotype that typically differs in
about three million nucleotides from every other person.
SNPs occur about once every 300-1000 base pairs in the genome, and
the frequency of a particular polymorphism tends to remain stable in
the population.
Because only about 3 to 5 percent of a person's DNA sequence codes
for the production of proteins, most SNPs are found outside of
"coding sequences".
How did SNPs arise?
F2a ----ACGGACTGAC----CCTTACGTTG----TACTACGCAT----
F1  ----ACTGACTGAC----CCTTACGTTG----TACTACGCAT----
P   ----ACTGACTGAC----CCTTACGTTG----TACTACGCAT----
F1  ----ACTGACTGAC----CCTTACGTTG----TACTAGGCAT----
F2b ----ACTGACTGAC----CCATACGTTG----TACTAGGCAT----
Compare the two F2 progeny
Haplotype1 (F2a) = SNP allele1
----ACGGACTGAC----CCTTACGTTG----TACTACGCAT----
Haplotype2 (F2b) = SNP allele2
----ACTGACTGAC----CCATACGTTG----TACTAGGCAT----
Each of the 10^13 cells in the human body receives
thousands of DNA lesions per day (Lindahl and Barnes 2000)
When these lesions are not repaired, they become fixed as mutations in the
genome of that particular cell
If a mutation is fixed in germ cells that go on to be fertilized
and form an embryo, it will be propagated to progeny
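SNPs, RFLPs, point mutations
[Figure: aligned chromosomes carrying GAATTC (EcoRI) sites. A single-base change inside a site (GAATTC to GAGTTC, or GAATTC to GACTTC) is labelled RFLP, SNP and point mutation at once; single-base changes elsewhere are labelled SNP.]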
Coding Region SNPs
•Types of coding region SNPs
–Synonymous: the substitution causes no amino acid change in
the protein produced. This is also called a silent mutation.
–Non-synonymous: the substitution alters
the encoded amino acid. A missense mutation changes the
codon to one that specifies a different amino acid. A nonsense mutation
creates a premature stop codon.
–More than half of all coding sequence SNPs result in
non-synonymous codon changes.
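These categories can be illustrated with a sketch that looks up both codons in the standard genetic code. The function name is illustrative, and a change from a stop codon to an amino acid is simply reported as missense here, which is a simplification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a codon change as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":          # "*" marks a stop codon
        return "nonsense"
    return "missense"

print(classify_coding_snp("GAA", "GAG"))  # Glu to Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu to Val, the sickle cell change
print(classify_coding_snp("TAC", "TAA"))  # Tyr to stop
```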
Alzheimer’s SNP
Occasionally, a SNP may actually cause a disease.
SNPs within a coding sequence are of particular interest to
researchers because they are more likely to alter the biological
function of a protein.
One of the genes associated with Alzheimer's, apolipoprotein E or
ApoE, is a good example of how SNPs affect disease development.
This gene contains two SNPs that result in three possible alleles for
this gene: E2, E3, and E4.
Each allele differs by one DNA base, and the protein product of each
gene differs by one amino acid.
Each individual inherits one maternal copy of ApoE and one paternal
copy of ApoE.
Research has shown that an individual who inherits at least one E4
allele will have a greater chance of getting Alzheimer's.
Apparently, the change of one amino acid in the E4 protein alters its
structure and function enough to make disease development more
likely. Inheriting the E2 allele, on the other hand, seems to indicate
that an individual is less likely to develop Alzheimer's.
Intergenic SNPs
Researchers have found that most SNPs are not responsible for a
disease state because they are intergenic SNPs
Instead, they serve as biological markers for pinpointing a disease on
the human genome map, because they are usually located near a gene
found to be associated with a certain disease.
Scientists have long known that diseases caused by single genes and
inherited according to the laws of Mendel are actually rare.
Most common diseases, like diabetes, are caused by multiple genes.
Finding all of these genes is a difficult task.
Recently, there has been a focus on the idea that all of the genes
involved can be traced by using SNPs.
By comparing the SNP patterns of affected and unaffected
individuals (patients with diabetes and healthy controls, for
example), scientists can catalog the DNA sequence variations
that differ between the two groups and so identify mutations that
underlie susceptibility to diabetes
How do you identify SNPs in individuals? PCR
PCR is quick, sensitive and robust, and is useful when dealing with small amounts of DNA or
where rapid, high-throughput screening is required.
PCR:
* The polymerase chain reaction involves many rounds of DNA synthesis.
* All DNA synthesis reactions require a template, a primer, an enzyme and a supply of
nucleotides. In the standard PCR, two primers flank the target for amplification and face
inwards. DNA synthesis therefore proceeds across the region between the primers.
PCR results in exponential amplification of the target sequence.
How does it work?
The reaction begins by heating up the DNA template to 94°C, which splits (denatures)
the double strands into single strands. The sample is then cooled to about 54°C, which
allows the primers to stick (anneal) to the template. When the sample is heated up again
to 72°C, the polymerase enzyme uses the primers as starting points to copy the single
strands.
A special heat-stable DNA polymerase that can withstand these high temperatures is used.
The cycle of denaturation, primer annealing and primer extension is repeated over and
over again (using a machine that automates the heating and cooling of the samples), each
time producing more copies of the original template.
During repeated rounds of these reactions, the number of newly synthesized DNA strands
increases exponentially. After 25 to 30 cycles, the initial template DNA will have been
copied several million-fold.
Doubling occurs in every cycle of the PCR leading to exponential amplification of the
target. After 25 cycles there are over 8 000 000 copies!
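The doubling arithmetic can be checked with a short sketch. It assumes perfect doubling in every cycle from a single template, which is an idealisation: real reactions are less efficient, and the slide's "over 8 000 000" figure is lower than the ideal value for 25 cycles.

```python
def ideal_copies(cycles, starting_templates=1):
    """Copies after a number of cycles, assuming perfect doubling each cycle."""
    return starting_templates * 2 ** cycles

for n in (10, 25, 30):
    print(f"{n} cycles: {ideal_copies(n):,} copies")
```

Under this idealised model 25 cycles give about 3.4 x 10^7 copies and 30 cycles just over 10^9.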
The PCR is useful where the amount of starting material is limited or poorly preserved.
Examples of PCR applications include cloning DNA from single cells, prenatal screening for
mutations in early human embryos, and the forensic analysis of DNA sequences in samples
such as fingerprints, blood stains, semen or hairs.
The PCR is also very useful where many samples have to be processed in parallel. For
example, the large-scale analysis of single nucleotide polymorphisms involves PCR-based
techniques.
PCR
If a region of DNA has already been sequenced in one individual,
the sequence information can be used to isolate and amplify that
sequence from the DNA of other individuals in a population.
Individuals with mutations in p53 are at risk for colon cancer
To determine whether an individual had such a mutation before PCR was
available, one had to clone the gene from the individual of interest
(construct a genomic library, screen the library, isolate the clone
and sequence the gene).
With PCR, the gene can be amplified directly from DNA isolated
from that individual.
No lengthy cloning procedure necessary
Only small amounts of genomic DNA required
30 rounds of amplification can give you >10^9 copies of a gene
PCR
[Figure: PCR cycles. Heat to denature the template and add primers; extend with heat-resistant DNA polymerase; in later cycles, heat again with primers and DNA polymerase present.]
5’AAAGATCGGGGGGGGGGGGGGGTCGATCTA3’
3’TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5’
PRIMER1 5’AAAGATC3’
3’AGCTAGAT5’ PRIMER2
5’AAAGATCGGGGGGGGGGGGGGGTCGATCTA3’
3’AGCTAGAT5’
5’AAAGATC3’
3’TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5’
5’AAAGATCGGGGGGGGGGGGGGGTCGATCTA3’
3’TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5’
5’AAAGATCGGGGGGGGGGGGGGGTCGATCTA3’
3’TTTCTAGCCCCCCCCCCCCCCCAGCTAGAT5’
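The primer layout in this diagram can be checked with a minimal in-silico PCR sketch. The function names are illustrative; the template and primers are those on the slide, with PRIMER2 rewritten 5' to 3' (the slide shows it as 3'AGCTAGAT5').

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the top-strand product bounded by the two primers, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand at its reverse complement.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]

TEMPLATE = "AAAGATCGGGGGGGGGGGGGGGTCGATCTA"   # top strand, 5' to 3'
PRIMER1 = "AAAGATC"                           # forward primer, 5' to 3'
PRIMER2 = "TAGATCGA"                          # reverse primer, 5' to 3'

product = in_silico_pcr(TEMPLATE, PRIMER1, PRIMER2)
print(product)
```

Because the two primers sit at the very ends of this template and face inwards, the product is the whole template; moving either primer inwards would shorten it.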
PCR
How do you detect PCR products?
Agarose gels
The size of the PCR product depends on the location of the PCR primers
SNPs and Primers- ASO hybridization
Individual 1
GACTCCTGAGGAGAAGTG
Individual 2
GACTCCTGTGGAGAAGTG
Use allele-specific oligos (ASOs; orange and red in the figure), one for each allele of a SNP.
DNA from individuals 1 and 2 is tested under
CONDITIONS (raised temperature) that only allow perfectly matched oligos to
anneal to the genomic DNA.
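The perfect-match requirement can be modelled as a mismatch count. This is a simplified sketch: each oligo is written as the same-strand sequence it detects (a real oligo would be its complement), and the `max_mismatches` parameter is a hypothetical stand-in for hybridization stringency.

```python
INDIVIDUAL_1 = "GACTCCTGAGGAGAAGTG"
INDIVIDUAL_2 = "GACTCCTGTGGAGAAGTG"

def anneals(oligo, target, max_mismatches=0):
    """True if the oligo matches the target within the allowed mismatches."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target must have the same length")
    mismatches = sum(a != b for a, b in zip(oligo, target))
    return mismatches <= max_mismatches

ASO_A = INDIVIDUAL_1   # oligo specific for the A allele
ASO_T = INDIVIDUAL_2   # oligo specific for the T allele

# High stringency (raised temperature): only perfect matches anneal.
print(anneals(ASO_A, INDIVIDUAL_1), anneals(ASO_A, INDIVIDUAL_2))
# Low stringency: a single mismatch is tolerated, so the test stops discriminating.
print(anneals(ASO_A, INDIVIDUAL_2, max_mismatches=1))
```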
PCR allows only one SNP to be tested at a time
SNP   1    2    3  4  5  6  7  8  9
Ind1  aAa  cCc  A  G  G  T  A  T  C
Ind2  aTa  cGc  A  T  G  T  G  T  G
Microarrays and SNPs
SNP1 (alleles A and B):
GACTCCTGAGGAGAAGTG
GACTCCTGTGGAGAAGTG
SNP2 (alleles A and B):
GGGGGGGGGGGGGGGGGG
GGGGGGGGCGGGGGGGGG
Design oligonucleotides complementary to each polymorphism.
These oligos are arrayed on a slide
Each spot corresponds to a polymorphism
Isolate genomic DNA from an individual
Label the genomic DNA and hybridize it to the array
[Figure: oligo probes on the slide, two spots for SNP1 and two spots for SNP2, one per allele.]
Microarray
[Figure: a microarray slide carrying the SNP1 probes (GACTCCTGAGGAGAAGTG, GACTCCTGTGGAGAAGTG) and SNP2 probes (GGGGGGGGGGGGGGGGGG, GGGGGGGGCGGGGGGGGG) for SNPs 1 to 9, hybridized with DNA from three individuals. Genotypes read from the spots: SNP1 TT and SNP2 GG; SNP1 AA and SNP2 CC; SNP1 AT and SNP2 GG.]
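Reading a genotype off the array amounts to asking which allele spots light up. A minimal sketch follows; the signal values and the 0.5 threshold are hypothetical, and the function name is illustrative.

```python
def call_genotype(signal_by_allele, threshold=0.5):
    """Call a diploid genotype from per-allele spot intensities (0 to 1)."""
    present = sorted(allele for allele, signal in signal_by_allele.items()
                     if signal >= threshold)
    if len(present) == 1:
        return present[0] * 2        # one spot lit: homozygous
    if len(present) == 2:
        return "".join(present)      # both spots lit: heterozygous
    return None                      # no call: nothing lit, or too many spots

# Hypothetical spot intensities for SNP1 in three individuals.
print(call_genotype({"A": 0.05, "T": 0.92}))
print(call_genotype({"A": 0.90, "T": 0.03}))
print(call_genotype({"A": 0.81, "T": 0.77}))
```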
Genotype and Haplotype
In the most basic sense, a haplotype is a “haploid genotype”.
Haplotype: particular pattern of SNPs (or alleles) found on a
single chromosome in a single individual.
The DNA sequences of any two people are more than 99 percent identical.
Sets of nearby SNPs on the same chromosome are inherited in
blocks.
Therefore, while a block may contain a large number of SNPs, a
few SNPs are enough to uniquely identify the haplotype of that
block.
The HapMap is a map of these specific SNPs.
SNPs that identify the haplotypes are called tag SNPs.
This makes genome scan approaches to finding regions with genes
that affect diseases much more efficient and comprehensive.
Haplotyping: involves grouping individuals by haplotypes, or
particular patterns of sequential SNPs, on a single chromosome.
There are thought to be a small number of haplotype patterns for
each chromosome.
Microarrays or PCR are used to accomplish haplotyping.
Haplotype and SNPs
Each individual has a characteristic pattern of SNPs
SNPs occur every 300-1000 bp. There are over a million
SNPs in each individual
When we generate a SNP map for an individual we DO NOT
check every single SNP in that individual's DNA
SNPs are transmitted as blocks (bounded by recombination hot spots), so
there is no point analyzing SNPs that are inherited together
SNP   1    2    3  4  5  6  7  8  9
Ind1  aAa  cCc  A  G  G  T  A  T  C
Ind2  aGa  cGc  A  T  G  T  G  T  G
SNPs in red were not studied. Only the 9 black SNPs were
studied
SNP mapping is used to narrow down the known physical location
of mutations to a single gene.
The human genome sequence provided us with the list of many of the
parts to make a human.
The HapMap provides us with indicators which we can focus on in
looking for genes involved in common disease.
By using HapMap data to compare the SNP patterns of people
affected by a disease with those of unaffected people, researchers
can survey the whole genome and identify genetic contributions to
common diseases more efficiently than has been possible without
this genome-wide map of variation: the HapMap Project has
simplified the search for gene variants.
Oligonucleotide chips contain thousands of short DNA sequences
immobilised at different positions. Such chips can be used to
discriminate between alternative bases at the site of a SNP.
Chips allow many SNPs to be analyzed in parallel.
Short DNA sequences on the chip represent all possible variations at
a polymorphic site;
Labeled genomic DNA from an individual will only stick if there is an
exact match. The base is identified by the location of the fluorescent
signal.
A recessive disease pedigree
Mapping recessive disease genes with DNA markers
SNP markers are mapped evenly across the genome.
The markers are polymorphic.
We can tell looking at the SNP pattern of a particular
grandchild which grandparent contributed a certain part of its
DNA.
If we knew that grandparent carried the disease, we could say
that part of the DNA might be responsible for the disease.
Markers 1 to 9 (SNPs in red were not studied; only the 9 black SNPs were studied)
4 different alleles at each locus
Position 1 can be A or C or G or T
Position 2 can be A or C or G or T
Position 3 ...
Grandparent  Chromosomes
1            A-A-A-A-A-A-A-A-A
             A-A-A-A-A-A-A-A-A
2            C-C-C-C-C-C-C-C-C
             C-C-C-C-C-C-C-C-C
3            G-G-G-G-G-G-G-G-G
             G-G-G-G-G-G-G-G-G
4            T-T-T-T-T-T-T-T-T
             T-T-T-T-T-T-T-T-T
Mapping recessive disease genes with SNP markers
Marker         1 2 3 4 5 6 7 8 9
Grandparent 1  A-A-A-A-A-A-A-A-A
               A-A-A-A-A-A-A-A-A
Grandparent 2  C-C-C-C-C-C-C-C-C
               C-C-C-C-C-C-C-C-C
Dad            A-A-A-A-A-A-A-A-A
               C-C-C-C-C-C-C-C-C
Grandparent 3  G-G-G-G-G-G-G-G-G
               G-G-G-G-G-G-G-G-G
Grandparent 4  T-T-T-T-T-T-T-T-T
               T-T-T-T-T-T-T-T-T
Mom            G-G-G-G-G-G-G-G-G
               T-T-T-T-T-T-T-T-T
Offspring1     A-A-A-C-C-A-A-C-C
               G-G-G-G-T-T-T-T-G
Offspring2     C-C-A-A-C-A-C-A-A
               G-G-G-G-T-T-T-G-G
Offspring3     A-A-A-A-A-C-C-C-C
               T-T-G-G-G-G-T-T-T
Offspring4     C-C-C-C-C-C-A-A-A
               G-G-T-T-T-T-T-T-T
Grandparents 1 and 4 and offspring 1 and 4 have the disease
We would look at the markers and see that ONLY at position 7
do offspring 1 and 4 both carry the DNA from grandparents 1 and 4.
It is therefore likely that the disease gene will be somewhere
near marker 7.
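The reasoning on this slide can be sketched as a search for markers where every affected child, and no unaffected child, carries both grandparent 1's allele (A) and grandparent 4's allele (T). The data structure and function name are illustrative; the chromosomes are the offspring haplotypes from the slide, with the first of each pair taken to be the paternal chromosome.

```python
PATERNAL_DISEASE_ALLELE = "A"   # inherited from affected grandparent 1
MATERNAL_DISEASE_ALLELE = "T"   # inherited from affected grandparent 4

# (paternal chromosome, maternal chromosome, affected?) for each child.
OFFSPRING = {
    "Offspring1": ("A-A-A-C-C-A-A-C-C", "G-G-G-G-T-T-T-T-G", True),
    "Offspring2": ("C-C-A-A-C-A-C-A-A", "G-G-G-G-T-T-T-G-G", False),
    "Offspring3": ("A-A-A-A-A-C-C-C-C", "T-T-G-G-G-G-T-T-T", False),
    "Offspring4": ("C-C-C-C-C-C-A-A-A", "G-G-T-T-T-T-T-T-T", True),
}

def candidate_markers(offspring, n_markers=9):
    """Markers where carrying both disease alleles coincides with being affected."""
    hits = []
    for i in range(n_markers):
        consistent = True
        for paternal, maternal, affected in offspring.values():
            has_both = (paternal.split("-")[i] == PATERNAL_DISEASE_ALLELE and
                        maternal.split("-")[i] == MATERNAL_DISEASE_ALLELE)
            if has_both != affected:
                consistent = False
        if consistent:
            hits.append(i + 1)   # report 1-based marker numbers
    return hits

print(candidate_markers(OFFSPRING))
```

Only one marker survives the filter, which is why the disease gene is placed near it.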
SNP typing
Oligonucleotide chips contain thousands of short DNA sequences immobilised at different
positions. Such chips can be used to discriminate between alternative bases at the site of a
SNP.
Chips allow many SNPs to be analysed in parallel, which is necessary for large-scale
association or pharmacogenomic studies.
Key principles
* DNA chips are miniature devices with thousands of different DNA sequences
immobilised at different positions on the surface. Oligonucleotide chips contain very short
DNA sequences (~25 nucleotides).
* A DNA sequence containing a single nucleotide polymorphism is hybridised to the chip.
* A method is employed to discriminate between alternative bases at the polymorphic
site. This is known as typing the polymorphism.
* A signal, corresponding to the specific identified base, is detected.
* A chip can be used to type many SNPs simultaneously.
How does it work?
Two chip-based typing methods are widely used. One method relies on allele-specific
hybridisation. Short DNA sequences on the chip represent all possible variations at a
polymorphic site; a labelled DNA will only stick if there is an exact match. The base is
identified by the location of the fluorescent signal.
Alternatively, the oligonucleotide on the chip may stop one base before the variable site. In
this case typing relies on allele-specific primer extension. A DNA sample stuck onto the chip
is used as a template for DNA synthesis, with the immobilised oligonucleotide as a primer.
The four nucleotides, containing different fluorescent labels, are added along with DNA
polymerase. The incorporated base, which is inserted opposite to the polymorphic site on the
template, is identified by the nature of its fluorescent signal. In a variation of this
technique, the added nucleotide is identified not by a fluorescent label but by mass
spectrometry.
How is it used?
The chip-based methods discussed above are particularly suitable for high-throughput SNP
typing which is required for large-scale studies of populations. Two of the most important
applications are 'association studies', which attempt to correlate SNP profiles with
predisposition to disease, and pharmacogenomic studies, which attempt to correlate SNP
profiles with drug response patterns.
A disadvantage of chip-based assays is that they are somewhat inflexible - new SNPs cannot
easily be incorporated onto a chip, requiring a new chip to be made. This is being overcome
PCR and RFLP
WT   ----------CCTGAGGAG----------  (MstII site)
     ----------GGACTCCTC----------
Mut  ----------CCTGTGGAG----------
     ----------GGACACCTC----------
PCR amplify DNA from normal and sickle cell patient
Digest with MstII
[Figure: gel with WT and Mut lanes and a size ladder of 500, 400, 300, 200 and 100 bp.]
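Why the sickle cell change abolishes the MstII site can be shown with a pattern match. MstII recognizes CCTNAGG, where N is any base; the helper name below is illustrative, and the sequences are the short stretches shown on the slide rather than a full PCR product.

```python
import re

# MstII recognition sequence CCTNAGG, with N standing for any base.
MSTII_SITE = re.compile(r"CCT[ACGT]AGG")

def count_fragments(amplicon):
    """Number of fragments expected after complete MstII digestion."""
    return len(MSTII_SITE.findall(amplicon)) + 1

WT = "CCTGAGGAG"    # normal sequence
MUT = "CCTGTGGAG"   # sickle cell sequence: the A to T change destroys the site

print("WT fragments:", count_fragments(WT))
print("Mut fragments:", count_fragments(MUT))
```

The wild-type product is cut into two smaller bands, while the mutant product stays as a single uncut band on the gel.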
Individuals
Methods used to study differences between individuals
RFLP
SNP
DNA Repeats
Genetic polymorphism
•Genetic Polymorphism: A difference in DNA sequence among
individuals, groups, or populations.
Genetic mutations are a kind of genetic polymorphism.
Genetic variation:
–Single nucleotide polymorphism (point mutation)
–Repeat heterogeneity
Repeats and DNA fingerprint
Variation between people: a small DNA change (a single
nucleotide polymorphism, SNP) in a target site.
RFLPs and point mutations are proof of variation at the DNA
level.
Satellite sequences: a short sequence of DNA repeated many
times.
[Figure: interspersed repeats (copies shown on Chr1 and Chr2) versus tandem repeats.]
Mini Satellite Repeats and Blots
Minisatellite sequences: a short sequence (20-100 bp long) of
DNA repeated many times (alleles vary in length from 0.5 to 20
kb)
[Figure: EcoRI (E) fragments carrying tandem minisatellite arrays on Chr1 and Chr2, with fragment labels of 2, 5, 6, 3, 1, 4 and 0.5 kb, and the resulting Southern blot showing 5, 3 and 1 kb positions.]
Take genomic DNA
Digest with EcoRI
Probe Southern blot
with repeat probe
Repeat expansion/contraction
Tandem repeats expand and contract during recombination.
Mistakes in pairing lead to changes in tandem repeat number
These can be detected by Southern blotting because, as the
number of repeats expands at a specific site, the restriction
fragment at that site increases in size
[Figure: the same EcoRI (E) fragment on Chr1 is 2 kb in individual 1 and 3 kb in individual 2; the Southern blot shows lanes Ind2 and Ind1 against 5, 3 and 1 kb positions.]
There are on average
between 2 and 10 alleles
(repeat-number variants) per minisatellite locus
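The link between repeat number and fragment size is simple arithmetic, sketched here. The 50 bp repeat unit and 500 bp of flanking DNA are hypothetical values, chosen so that the two alleles reproduce the 2 kb and 3 kb fragments in the figure.

```python
def fragment_length(n_repeats, repeat_unit_bp=50, flanking_bp=500):
    """Restriction fragment size for a tandem array with n_repeats copies."""
    return flanking_bp + n_repeats * repeat_unit_bp

# Two alleles of the same locus that differ only in repeat number.
print(fragment_length(30), "bp")   # shorter allele
print(fragment_length(50), "bp")   # expanded allele
```

Each gain or loss of a repeat shifts the band by one repeat unit, so alleles with different repeat counts separate on a gel.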
Microsatellite and PCR
Microsatellite repeat expansion and contraction is
investigated using PCR and gels instead of gels and
Southern blots
DNA fingerprint
1 2 3 4
The use of microsatellite analysis in genetic profiling. In this
example, 2 different microsatellites located on the short arm of
chromosome 6 have been amplified by the polymerase chain
reaction (PCR). The PCR products are labeled with a blue or
green fluorescent marker and run in a polyacrylamide gel, each
lane showing the genetic profile of a different individual. Each
individual has a different genetic profile because each person has
a different set of microsatellite length variants, the variants
giving rise to bands of different sizes after PCR.
DNA fingerprinting
Variation between people: a small DNA change (a single nucleotide
polymorphism, SNP) in a target site.
RFLPs and SNPs are proof of variation at the DNA level.
Satellite sequences: a short sequence of DNA repeated many
times.
Microsatellites are 2-4 bp units repeated in tandem 15-100
times in a row
Minisatellites are 20-100 bp units repeated in tandem (0.5 to 20 kb
long)
Class  Size       No. of loci  Method
SNP    1 bp       100 million  PCR/microarray
Micro  ~200 bp    200,000      PCR
Mini   0.2-20 kb  30,000       Southern blot
FBI and Microsatellite
The FBI uses a set of 13 different microsatellite markers in
forensic analysis.
13 sets of specific PCR primers are used to determine the allele
present in the test sample for each marker.
The markers used, the number of alleles at each marker and the probability
of obtaining a random match for each marker are shown.
How often would you expect an individual to be mis-identified if all 13
markers are analyzed?
Marker  No. of alleles  Probability of random match
A       11              0.112
B       19              0.036
C       7               0.081
D       7               0.195
E       10              0.062
F       10              0.075
G       10              0.158
H       11              0.065
I       10              0.067
J       8               0.085
K       8               0.089
L       15              0.028
M       20              0.039
P = 0.112 x 0.036 x 0.081 x 0.195 x ... = 1.7 x 10^-15
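The combined probability is the product of the 13 per-marker values, which assumes the markers are inherited independently. A short sketch using the numbers from the table (the variable names are illustrative):

```python
import math

# Per-marker random match probabilities for markers A to M, from the table.
MATCH_PROBABILITIES = [0.112, 0.036, 0.081, 0.195, 0.062, 0.075, 0.158,
                       0.065, 0.067, 0.085, 0.089, 0.028, 0.039]

# Independence between markers lets the probabilities be multiplied.
combined = math.prod(MATCH_PROBABILITIES)
print(f"P = {combined:.1e}")
```

The product comes to roughly 1.7 x 10^-15, far smaller than one over the human population, which is why a full 13-marker match is treated as identifying.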
The innocence project
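Prop 69 (2004)
[Figure: familial searching. One person commits a serious crime but is not in the database; a relative commits a minor crime and has DNA in the database. DNA from the crime scene gives a partial match, so the investigation focuses on the family.]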
Jeffreys
In 1986, the Enderby murder case, a case local to Leicester,
saw the first use of DNA profiling in criminology. Two young
girls had been raped and murdered, one in 1983 and one in
1986. After the second murder, a young man was arrested and
gave a full confession. The police thought he must have
committed the first murder as well, so they asked Professor
Jeffreys to analyse forensic samples – semen from the first and
second victims, samples from the victims, and blood from the
prime suspect.
"The police were right – both girls had been raped by the same
man," says Professor Jeffreys. "But it wasn't the man who had
confessed. At first I thought there was something wrong with
the technology, but we and the Home Office's Forensic
Science Service did additional testing and it was clear that it
was not his semen. He had given a false confession and was
released – so the first time DNA profiling was used in
criminology, it was to prove innocence."
Blood samples from more than 5000 men in the local community were
collected. The murderer nearly got away with it – sending a proxy in
to give a blood sample – but eventually he was apprehended and got
two life sentences.