RFLP analysis
RFLP= Restriction fragment length polymorphism
Refers to variation in restriction sites between individuals in a
population
These are extremely useful and valuable for geneticists (and
lawyers)
On average, two individuals (humans) differ at 1 in 1000 bp
The human genome is 3 x 10^9 bp
This means that any two individuals will differ at about 3 million bp.
By chance, some of these changes will create or destroy the recognition
sites for restriction enzymes
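The estimate above is simple arithmetic. A minimal Python sketch, using only the two figures quoted on this slide (the variable names are illustrative):

```python
# Expected number of base differences between two human genomes,
# using the figures quoted on the slide.
GENOME_SIZE_BP = 3 * 10**9   # human genome size in base pairs
BP_PER_DIFFERENCE = 1000     # about 1 difference per 1000 bp

expected_differences = GENOME_SIZE_BP // BP_PER_DIFFERENCE
print(f"{expected_differences:,} bp")
```

Only a small fraction of these differences would be expected to fall inside a restriction enzyme recognition site.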
RFLP
Let's generate a restriction map for a region of the human X chromosome
The restriction map in the same region of the X chromosome
of a second individual may appear as
Normal: GAATTC
Mutant: GAGTTC
RFLP
The internal EcoRI site is missing in the second individual
For X1 the sequence at this site is
5'-GAATTC-3'
3'-CTTAAG-5'
This is the sequence recognized by EcoRI
The equivalent site in the X2 individual is mutated
5'-GAGTTC-3'
3'-CTCAAG-5'
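As a rough illustration of why the X2 allele is no longer cut, the check below looks for the EcoRI recognition sequence in each allele. This is a minimal sketch: the function names are hypothetical, and a real digest prediction would scan a whole sequence rather than a six-base window.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def has_ecori_site(seq: str) -> bool:
    """True if the sequence contains an EcoRI recognition site."""
    return ECORI_SITE in seq.upper()

x1_site = "GAATTC"  # allele X1 from the slide
x2_site = "GAGTTC"  # allele X2 from the slide (A -> G change)

print(has_ecori_site(x1_site))
print(has_ecori_site(x2_site))
# The site is palindromic: it reads the same on both strands.
print(reverse_complement(ECORI_SITE) == ECORI_SITE)
```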
Now if we examine a large number of humans at this site we
may find that 25% possess the EcoRI site and 75% lack this
site.
We can say that a restriction fragment length polymorphism
exists in this region
These polymorphisms usually do not have any phenotypic
consequences
Most are silent: they do not alter the protein sequence
because of redundancy in codon usage, they lie in introns
or non-genic regions, or they do not affect protein
structure/function.
RFLP
RFLPs are identified by Southern blots
In this region of the human X chromosome, two forms of the
X chromosome are segregating in the population.
Digest DNA with
EcoRI and probe with
probe1
What do we get?
RFLP
Digesting with BamHI and performing Southern blots with the
above probe produces the following results for X1/X1, X1/X2 and
X2/X2 individuals:
There is no variation with respect to the BamHI sites, all
individuals produce the same banding patterns on Southern
blots
If we used probe2 for Southern blots with a BamHI digest,
what would be the results for X1/X1, X1/X2 and X2/X2
individuals?
If we used probe2 for Southern blots with an EcoRI digest,
what would be the results for X1/X1, X1/X2 and X2/X2
individuals?
RFLP
RFLPs are found by trial and error, and they require an
appropriate probe and enzyme
They are very valuable because they can be used just like any
other genetic marker to map genes
They are employed in recombination analysis (mapping) in the
same way as conventional allelic variants are employed
The presence of a specific restriction site at a specific locus on
one chromosome and its absence at the same locus on the homologous
chromosome can be viewed as two allelic forms of a gene
The phenotype in this case is a Southern blot rather than white
eye/red eye
Let's review standard mapping:
To map any two genes with respect to one another, the individual
being tested must be heterozygous at both loci.
Mapping
Genes W and B are responsible for wing and bristle development
[Figure: chromosome map showing the centromere, genes W and B, and the telomere]
To find the map distance between these two genes we need allelic
variants at each locus
W=wings
w= No wings
B=Bristles
b= no bristles
To measure genetic distance between these two genes, the
double heterozygote is crossed to the double homozygote
Mapping
[Figure: table of female gametes against the male gamete (wb)]
Map distance = (# recombinants / total progeny) x 100
7/101 x 100 = 6.9, or about 7 map units (m.u.)
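The map distance calculation can be written out as a short function. This is a minimal sketch; the function name is illustrative, and the counts are the 7 recombinants out of 101 progeny quoted on the slide.

```python
def map_distance(recombinants: int, total_progeny: int) -> float:
    """Recombination frequency expressed in map units (percent)."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    return 100 * recombinants / total_progeny

distance = map_distance(recombinants=7, total_progeny=101)
print(f"{distance:.1f} map units")
```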
Mapping
Both the normal and mutant alleles of gene B (B and b) are
sequenced and we find
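[Figure: EcoRI restriction maps of the B region, between gene W and the telomere. Allele B carries an internal EcoRI site (GAATTC) that divides the region into 3 kb and 2 kb fragments; in allele b the site reads AAATTC, so the region is a single 5 kb fragment. E = EcoRI site.]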
By chance, this mutation disrupts the amino acid sequence and
also an EcoRI site!
If DNA is isolated from B/B, B/b and b/b individuals, cut
with EcoRI and probed in a Southern blot, the pattern that we
will obtain will be
B/B Bristle
B/b Bristle
b/b No bristle
Mapping
Therefore in the previous cross (WB/wb x wb/wb), the genotype
at the B locus can be distinguished either by the presence and
absence of bristles or Southern blots
Parent            Phenotype                Southern blot
WB/wb (female)    Wings, bristles          5 kb and 2 kb bands
wb/wb (male)      No wings, no bristles    5 kb band
There are some phenotypes for specific genes that are very
painful to measure
Having an RFLP makes the problem easier
Mapping
The same Southern blot method can be employed for the wing (W)
locus with a different restriction enzyme (BamHI) if an
RFLP exists at this locus
You make the DNA, digest half with EcoRI and probe with bristle
probe
Digest the other half with BamHI and probe with the wing probe.
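[Figure: BamHI restriction maps of the W region. Allele W reads GTATCC at the internal position (not a BamHI site), giving one 8 kb fragment; allele w reads GGATCC (a BamHI site), giving two 4 kb fragments. B = BamHI site.]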
Mapping
To find the map distance between genes, multiple alleles are
required.
We can determine the distance between W and B by the classical
method because multiple alleles exist at each locus (W & w, B & b)
[Figure: chromosome map with labels Centromere, W, B, C, R, Telomere]
You find a new gene, C. No variants of this gene are known that
alter the phenotype of the fly in a way you can observe. Say we don't
even know the function of this gene, so you can't even predict its
phenotype.
However, the researcher identified an RFLP variant in this gene.
Mapping
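[Figure: EcoRI restriction maps of the C region. Allele C gives an 8 kb fragment; allele c has an additional internal EcoRI site, giving 6 kb and 2 kb fragments. E = EcoRI site.]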
With this RFLP, the C gene can be mapped with respect to
other genes:
Genotype/phenotype relationships for the W and C genes
WW and Ww = Red eyes
ww = white eyes
CC = 8kb band
C/c = 8, 6, 2 kb bands
cc = 6, 2 kb bands
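The band patterns listed above follow from combining the fragments contributed by each allele. A minimal sketch of that rule, assuming the probe detects every fragment listed for an allele (the dictionary and function names are illustrative):

```python
# EcoRI fragments (in kb) detected for each allele of gene C,
# taken from the genotype list above.
ALLELE_BANDS = {"C": {8}, "c": {6, 2}}

def southern_bands(genotype: str) -> list[int]:
    """Bands (kb, largest first) expected for a diploid genotype such as 'Cc'."""
    bands = set()
    for allele in genotype:
        bands |= ALLELE_BANDS[allele]
    return sorted(bands, reverse=True)

for genotype in ("CC", "Cc", "cc"):
    print(genotype, southern_bands(genotype))
```

The heterozygote shows all three bands, which is why it can be told apart from both homozygotes on the blot.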
To determine the map distance between W and C, the following cross
is performed
W C / w c   x   w c / w c
Mapping
[Figure: gametes from the W C / w c female combined with the male gamete (w c); C is scored as an 8 kb band and c as 6 kb and 2 kb bands]
Mapping
Prior to RFLP analysis, only a few classical markers existed in
humans
Now over 7000 RFLPs have been mapped in the human genome.
Newly identified inherited disorders are now mapped by determining
whether they are linked to previously identified RFLPs
Genetic polymorphism
•Genetic Polymorphism: A difference in DNA sequence among
individuals, groups, or populations.
•Genetic Mutation: A change in the nucleotide sequence of a
DNA molecule.
Genetic mutations are a kind of genetic polymorphism.
Genetic variation:
•Single nucleotide polymorphism (point mutation)
•Repeat heterogeneity
SNP
•A Single Nucleotide Polymorphism is a source of variance in a
genome.
•A SNP ("snip") is a single base mutation in DNA.
•SNPs are the simplest and most common form of
genetic polymorphism in the human genome (90% of all human
DNA polymorphisms).
•There are two types of nucleotide base substitutions resulting in
SNPs:
–Transition: substitution between purines (A, G) or between
pyrimidines (C, T). Transitions constitute two thirds of all SNPs.
–Transversion: substitution between a purine and a
pyrimidine.
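The two substitution classes can be told apart mechanically from the reference and variant bases. A minimal sketch (the function name is illustrative):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be A, C, G or T")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_substitution("A", "G"))
print(classify_substitution("C", "T"))
print(classify_substitution("A", "C"))
```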
SNPs: Single Nucleotide Polymorphisms
-----------------------ACGGCTAA
-----------------------ATGGCTAA
Instead of using restriction enzymes, these are found by direct
sequencing
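Once two individuals have been sequenced over the same region, candidate SNPs are simply the positions where the aligned sequences disagree. A minimal sketch using the two sequences shown on this slide (the function name is illustrative, and the sequences are assumed to be already aligned and of equal length):

```python
def find_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]

print(find_differences("ACGGCTAA", "ATGGCTAA"))
```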
They are extremely useful for mapping
Marker type            Number
Classical Mendelian    100
RFLPs                  7000
SNPs                   1.4 x 10^6
SNPs occur every 300-1000 bp along the 3 billion bp human
genome
Many SNPs have no effect on cell function
SNPs
Humans are genetically >99 percent identical: the variation
lies in the tiny percentage that is different
Much of our genetic variation is caused by single-nucleotide
differences in our DNA : these are called single nucleotide
polymorphisms, or SNPs.
As a result, each of us has a unique genotype that typically differs in
about three million nucleotides from every other person.
SNPs occur about once every 300-1000 base pairs in the genome, and
the frequency of a particular polymorphism tends to remain stable in
the population.
Because only about 3 to 5 percent of a person's DNA sequence codes
for the production of proteins, most SNPs are found outside of
"coding sequences".
SNPs, RFLPs, point mutations
[Figure: a panel of GAATTC (EcoRI) sites across individuals, with the variants GACTTC and GAGTTC labeled SNP, RFLP and point mutation (Pt mut)]
Coding Region SNPs
•Types of coding region SNPs
–Synonymous: the substitution causes no amino acid change to
the protein it produces. This is also called a silent mutation.
–Non-synonymous: the substitution results in an alteration of
the encoded amino acid. A missense mutation changes the
protein by replacing one amino acid with another. A nonsense mutation
creates a premature stop codon.
–One half of all coding sequence SNPs result in
non-synonymous codon changes.
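The categories above can be assigned by translating the codon before and after the change. A minimal sketch using the standard genetic code; the function name is illustrative, and the example codons are chosen here for illustration, not taken from the slides.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":  # "*" marks a stop codon
        return "nonsense"
    return "missense"

print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAA", "GTA"))  # Glu -> Val
print(classify_coding_snp("TAT", "TAA"))  # Tyr -> stop
```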
Occasionally, a SNP may actually cause a disease.
SNPs within a coding sequence are of particular interest to
researchers because they are more likely to alter the biological
function of a protein.
Intergenic SNPs
Researchers have found that most SNPs are not
responsible for a disease state because they are intergenic SNPs
Instead, they serve as biological markers for pinpointing a disease
on the human genome map, because they are usually located near a
gene found to be associated with a certain disease.
Scientists have long known that diseases caused by single genes and
inherited according to the laws of Mendel are actually rare.
Most common diseases, like diabetes, are caused by multiple genes.
Finding all of these genes is a difficult task.
Recently, there has been focus on the idea that all of the genes
involved can be traced by using SNPs.
By comparing the SNP patterns in affected and non-affected
individuals—patients with diabetes and healthy controls, for
example—scientists can catalog the specific DNA variations that
underlie susceptibility for diabetes
PCR
If a region of DNA has already been cloned and sequenced, the
sequence can be used to isolate and amplify that sequence
from other individuals in a population.
Individuals with mutations in p53 are at risk for colon cancer
Prior to PCR, to determine whether an individual had such a mutation,
one would have to clone the gene from the individual of interest
(construct a genomic library, screen the library, isolate the
clone and sequence the gene).
With PCR, the gene can be amplified directly from DNA isolated
from that individual.
No lengthy cloning procedure
Only small amounts of genomic DNA required
30 rounds of amplification can give you >10^9 copies of a gene
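The copy number follows from doubling the template in each round. A minimal sketch, assuming ideal amplification by default (the `efficiency` parameter is an illustrative addition; real reactions fall short of perfect doubling):

```python
def pcr_copies(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies after a number of PCR cycles; efficiency 1.0 means perfect doubling."""
    return starting_copies * (1 + efficiency) ** cycles

copies = pcr_copies(30)
print(f"{copies:,.0f} copies from a single template after 30 cycles")
```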
Genotype and Haplotype
In the most basic sense, a haplotype is a “haploid genotype”.
Haplotype: particular pattern of sequential SNPs (or alleles) found on
a single chromosome in a single individual
The DNA sequence of any two people is about 99.9 percent identical.
Sets of nearby SNPs on the same chromosome are inherited in blocks.
Blocks may contain a large number of SNPs, but a few SNPs are
enough to uniquely identify the haplotypes in a block.
The HapMap is a map of these blocks, and the specific SNPs that
identify the haplotypes are called tag SNPs.
Haplotyping involves grouping individuals by haplotypes, or particular
patterns of sequential SNPs, on a single chromosome.
Microarrays and sequencing are used to accomplish haplotyping.
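To illustrate the idea that a few SNPs are enough to identify the haplotypes in a block, the sketch below searches for the smallest set of SNP positions that tells a set of haplotypes apart. The four haplotypes are hypothetical, and the brute-force search is only an illustration; it is not the method used by the HapMap Project.

```python
from itertools import combinations

def find_tag_snps(haplotypes: list[str]) -> tuple[int, ...]:
    """Smallest set of 0-based SNP positions that distinguishes all haplotypes."""
    n_sites = len(haplotypes[0])
    n_distinct = len(set(haplotypes))
    for size in range(1, n_sites + 1):
        for positions in combinations(range(n_sites), size):
            signatures = {tuple(h[p] for p in positions) for h in haplotypes}
            if len(signatures) == n_distinct:
                return positions
    return tuple(range(n_sites))

# Hypothetical block of 6 SNPs with 4 haplotypes (one string per haplotype).
block = ["AAGCTA", "AAGTTG", "GCACCA", "GCATCG"]
print(find_tag_snps(block))
```

In this made-up block, positions 0, 1, 2 and 4 always change together, so typing one of them plus position 3 is enough to recover the whole haplotype.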
SNP mapping is used to narrow down the known physical location
of mutations to a single gene.
The human genome sequence provided us with the list of many of
the parts that make a human.
The HapMap provides us with indicators which we can focus on in
looking for genes involved in common disease.
Using the HapMap data we compare the SNP patterns of people
affected by a disease with those of unaffected people.
This allows researchers to survey the whole genome quickly and
identify genetic contributions to common diseases--the HapMap
Project has simplified the search for gene variants.
[Figure: a recessive disease pedigree]
Mapping recessive disease genes with DNA markers
DNA markers are mapped evenly across the genome
The markers are polymorphic: they look slightly different in
different individuals.
By looking at a particular individual, we can tell which grandparent
contributed a certain part of its DNA.
If we knew that grandparent carried the disease, we could say
that part of the DNA might be responsible for the disease.
Markers: A B C D E F G H I
4 different alleles at each locus
A1, A2, A3, A4
B1, B2, B3, B4
C1, C2,………….
Mapping recessive disease genes with DNA markers
Markers: A B C D E F G H I
Grandparents 1 and 4 and offspring 1 and 4 have a disease
We would look at the markers and see that ONLY at position G
do offspring 1 and 4 have the DNA from grandparents 1 and 4.
It is therefore likely that the disease gene will be somewhere
near marker G.
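The reasoning on this slide can be sketched as a filter over markers. The data below are hypothetical, since the figure with the actual marker patterns did not survive: for each marker, each offspring is listed with the two grandparents whose DNA it carries at that position. A marker is a candidate when every affected offspring carries DNA from both affected grandparents there.

```python
AFFECTED_GRANDPARENTS = {1, 4}
AFFECTED_OFFSPRING = {1, 4}

# Hypothetical data: marker -> offspring -> grandparents of origin
# for the two copies carried at that marker.
origins = {
    "F": {1: {1, 3}, 2: {2, 4}, 3: {2, 3}, 4: {1, 4}},
    "G": {1: {1, 4}, 2: {2, 3}, 3: {1, 3}, 4: {1, 4}},
    "H": {1: {2, 4}, 2: {1, 3}, 3: {2, 3}, 4: {1, 4}},
}

def candidate_markers(origins, affected_offspring, affected_grandparents):
    """Markers where all affected offspring carry only affected-grandparent DNA."""
    hits = []
    for marker, by_offspring in origins.items():
        if all(by_offspring[child] == affected_grandparents
               for child in affected_offspring):
            hits.append(marker)
    return hits

print(candidate_markers(origins, AFFECTED_OFFSPRING, AFFECTED_GRANDPARENTS))
```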
Repeats
Variation between people can be a small DNA change, such as a single
nucleotide polymorphism (SNP) in a target site.
RFLPs and SNPs are proof of variation at the DNA level.
Satellite sequences: a short sequence of DNA repeated many
times in a row.
[Figure: interspersed and tandem repeats on Chr1 and Chr2]
[Figure: EcoRI (E) fragments of different sizes carrying interspersed and tandem repeats on Chr1 and Chr2, detected with a repeat probe]
Repeat expansion
Tandem repeats expand and contract during recombination.
Mistakes in pairing lead to changes in tandem repeat numbers
[Figure: EcoRI (E) maps of a tandem repeat region in Individual 1 and Individual 2, with different fragment sizes where the repeat number differs]
[Figure: microsatellite]
DNA fingerprinting
Satellite sequences: a short sequence of DNA repeated many
times in a row.
September 1984:
Make a probe that hybridizes to many of these minisatellites at the same
time.
Hybridize the probe to a blot with DNA from several different
people.
The X-ray film of the blot was developed at Leicester University.
"What a complicated mess," said Professor Jeffreys; then
suddenly he realized, "we had patterns."
"There was a level of individual specificity."
Class    Size       No. of loci    Method
SNP      1 bp       100 million    PCR/microarray
Micro    ~100 bp    200,000        PCR
Mini     1-20 kb    30,000         Southern blot
DNA fingerprint
[Figure] The use of microsatellite analysis in genetic profiling. In this
example, 2 different microsatellites located on the short arm of
chromosome 6 have been amplified by the polymerase chain
reaction (PCR). The PCR products are labeled with a blue or
green fluorescent marker and run in a polyacrylamide gel, each
lane showing the genetic profile of a different individual. Each
individual has a different genetic profile because each person has
a different set of microsatellite length variants, the variants
giving rise to bands of different sizes after PCR.
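The link between repeat number and band size can be sketched as follows. All numbers here are hypothetical (the repeat unit, the flanking length and the repeat counts per person are illustrative, not taken from the figure): the PCR product is the flanking sequence between the primers plus the repeat unit times the number of repeats.

```python
def pcr_product_length(repeat_unit: str, repeat_count: int, flank_bp: int) -> int:
    """Length in bp of a PCR product spanning a microsatellite."""
    return flank_bp + len(repeat_unit) * repeat_count

FLANK_BP = 80        # hypothetical total flanking sequence, primers included
REPEAT_UNIT = "CA"   # hypothetical dinucleotide repeat

# Hypothetical repeat counts for the two alleles carried by each person.
alleles = {"person_1": (12, 15), "person_2": (10, 18)}

for person, counts in alleles.items():
    sizes = sorted(pcr_product_length(REPEAT_UNIT, n, FLANK_BP) for n in counts)
    print(person, sizes)
```

Two people with different repeat counts give bands at different positions on the gel, which is what makes the profile individual-specific.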