A Level Biology Unit 10
Heckmondwike Grammar School Biology Department
Edexcel A-Level Biology B
Mendelian Inheritance
one gene
two genes
Chi squared test
Evolution and the Gene Pool
Epigenetic Control of Gene Expression
Stem Cells
Biotechnology
GM Crops
These notes may be used freely by biology students and teachers.
I would be interested to hear of any comments and corrections.
Neil C Millar ([email protected]) June 2016
Unit 1 Biochemistry
Unit 2 Cells
Unit 3 Reproduction
Unit 4 Transport
Unit 5 Biodiversity
Unit 6 Ecology
Unit 7 Metabolism
Unit 8 Microbes
Unit 9 Control Systems
Unit 10 Genetics
HGS Biology A-level notes
NCM 09/16
Biology Unit 10 Genetics and Biotechnology
10.01 Mendelian Inheritance
• Be able to construct genetic crosses.
• The terms genotype, phenotype, homozygote,
heterozygote, dominance, recessive, codominance
and multiple alleles.
• Be able to construct pedigree diagrams.
• Sex linkage on the X chromosome, including
haemophilia in humans.
• The inheritance of two non-interacting unlinked
genes.
• Autosomal linkage results from the presence of
alleles on the same chromosome, including
black/grey body and long/vestigial wing in Drosophila.
The results of crosses can be explained by the events
of meiosis. The processes of random assortment and
crossing over during meiosis give rise to new
combinations of alleles in gametes. How random
fertilisation during sexual reproduction brings about
genetic variation.
Be able to use chi squared tests to test the significance
of the difference between observed and expected
results.
10.02 Evolution and the Gene Pool
• The Hardy-Weinberg equation can be used to
monitor changes in the allele frequencies in a
population.
• Mutations are the source of new variations.
• Sometimes changes in allele frequencies can be the
result of chance and not selection, including genetic
drift. Allele frequencies can be influenced by
population bottlenecks and the founder effect.
• Selection pressures acting on the gene pool change
allele frequencies in the population, including
stabilising selection (maintaining continuity in a
population) and disruptive selection (leading to
changes or speciation).
10.05 Epigenetic Control of Gene Expression
Gene expression can be changed by epigenetic
modification, including:
• Histone modification and DNA methylation.
• Transcription factors are proteins that bind to
DNA. The role of transcription factors in regulating
gene expression.
• How post-transcription modification of mRNA in
eukaryotic cells (RNA splicing) can result in
different products from a single gene.
• Non-coding RNA.
Epigenetic modification is important in ensuring cell
differentiation. How epigenetic modifications can
result in totipotent stem cells in the embryo
developing into pluripotent cells in the blastocyst and
finally into fully differentiated somatic cells.
10.06 Stem Cells
• What is meant by a stem cell, including the differences
between totipotent, pluripotent and multipotent stem
cells.
• Pluripotent stem cells from embryos provide
opportunities to develop new medical advances,
although there are ethical considerations.
• How differentiated cells can be
reprogrammed to form induced pluripotent stem
cells (iPS cells) by the artificial introduction of
named genes. Why the use of iPS cells may be
less problematic than the use of embryonic stem
cells.
10.07 Biotechnology
• How recombinant DNA can be produced, including
the role of restriction endonucleases and DNA
ligase.
• How PCR can be used to amplify DNA samples.
• What is meant by the term genome. Gel
bioinformatics (unit 5). How gene sequencing can
be used to predict the amino acid sequence of
proteins and possible links to genetically
determined conditions.
• How DNA profiling can be used in forensic science
to identify criminals and to test paternity.
• How recombinant DNA can be inserted into other
cells, and the use of various vectors such as viruses
and gene guns.
• How antibiotic resistance marker genes and replica
plating are used to identify recombinant cells.
• How ‘knockout’ mice can be used as a valuable
animal model to investigate gene function.
10.08 GM Crops
The process of genetic modification of soya beans and
how it has been used to improve production, including
altering the balance of fatty acids to prevent oxidation
of soya products. Why the widespread use of genetic
modification of major commercial crops and other
transgenic processes have caused public debate of
their advantages and disadvantages.
Mendelian Inheritance
In unit 1 we studied molecular genetics – the study of DNA. Here we are concerned with the study of
inheritance of characteristics at the whole organism level. This is also known as classical genetics or
Mendelian inheritance, since it was pioneered by Gregor Mendel.
Gregor Mendel
Mendel (1822-1884) was an Austrian monk at Brno monastery. He
was a keen scientist and gardener, and studied at Vienna University,
where he learnt mathematics. He investigated inheritance in pea
plants and published his results in 1866. They were ignored at the
time, but were rediscovered in 1900, and Mendel is now recognised
as the “Father of Genetics”. His experiments succeeded where
others had failed because:
• Mendel investigated simple qualitative characteristics (or traits),
such as flower colour or seed shape, and he varied one trait at a
time. Previous investigators had tried to study many complex
quantitative traits, such as human height or intelligence, but this
is a rare instance where qualitative results are more informative
than quantitative ones, and Mendel knew this.
• Mendel used an organism whose sexual reproduction he could easily control by carefully pollinating
stigmas with pollen using a brush. Peas can also be self-pollinated, allowing self-crosses to be performed.
This is not possible with animals.
• Mendel repeated his crosses hundreds of times and applied statistical tests to his results.
• Mendel studied two generations of peas at a time.
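A typical experiment looked like this:
[Diagram: pure-breeding red-flowered plants are crossed with pure-breeding white-flowered plants (P1); all the F1 offspring are red; breeding from the F1 plants gives an F2 generation of about 3 red : 1 white.]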
Mendel made several conclusions from these experiments:
1. There are no mixed colours (e.g. pink), so this disproved the widely-held blending theories of
inheritance that characteristics gradually mixed over time.
2. A characteristic can disappear for a generation, but then reappear the following generation, looking
exactly the same. So a characteristic can be present but hidden.
3. The outward appearance (the phenotype) is not necessarily the same as the inherited factors (the
genotype). For example the P1 red plants are not the same as the F1 red plants.
4. One form of a characteristic can mask the other. The two forms are called dominant and recessive.
5. The F2 ratio is always close to 3:1 (or 75%:25%). Mendel was able to explain this by supposing that each
individual has two versions of each inherited factor, one received from each parent. We’ll look at his
logic in a minute.
Mendel’s factors are now called genes and we know they are found on chromosomes. The two alternative
forms are called alleles and are found on homologous pairs of chromosomes (the maternal and paternal).
So in the example above we would say that there is a gene for flower colour and its two alleles are “red”
and “white”. One allele comes from each parent, and the two alleles are
found on the same position (or locus) on the homologous chromosomes. If
the homologous chromosomes have the same alleles at a locus this is
homozygous, and if they have different alleles this is heterozygous. The
chromosomes on the right are homozygous for the seed shape genes but
heterozygous for the flower colour gene. The term “pure-breeding” really
means homozygous. You should revise genes and chromosomes from unit 1.
With two alleles there are three possible combinations of alleles (or genotypes) and two possible
appearances (or phenotypes):
Genotype    Name                    Phenotype
RR          homozygous dominant     red
rr          homozygous recessive    white
Rr, rR      heterozygous            red
The dominant allele is defined as the allele that is expressed in the heterozygous state, while the recessive
allele is defined as the allele that is only expressed in the homozygous state (or is not expressed in the
heterozygous state).
The Monohybrid Cross
A simple breeding experiment involving just a single characteristic, like Mendel’s experiment, is called a
monohybrid cross. We can now explain Mendel’s monohybrid cross in detail.
At fertilisation any male gamete can fertilise any female gamete at random. The possible
results of a fertilisation are shown in the diagram. Each of the possible outcomes has an equal
chance of happening, so this explains the 3:1 ratio.
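The random combination of gametes can also be sketched in a few lines of Python. This is only an illustration: the function names `cross` and `phenotype` are invented here, and the alleles R (red, dominant) and r (white, recessive) follow the flower colour example used in these notes.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes for a one-gene cross, e.g. cross("Rr", "Rr")."""
    offspring = Counter()
    # Each parent passes on one of its two alleles; every pairing is equally likely.
    for a, b in product(parent1, parent2):
        # Sorting puts the capital letter first, so "rR" is counted as "Rr".
        offspring["".join(sorted(a + b))] += 1
    return offspring


def phenotype(genotype, dominant="red", recessive="white"):
    """One copy of the dominant (capital) allele is enough to show the dominant trait."""
    return dominant if any(allele.isupper() for allele in genotype) else recessive


genotypes = cross("Rr", "Rr")          # the F1 x F1 cross
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))    # 1 RR : 2 Rr : 1 rr
print(dict(phenotypes))   # 3 red : 1 white
```

The four equally likely fertilisations give the genotype ratio 1:2:1, which collapses to the 3:1 phenotype ratio because RR and Rr look the same.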
Mendel’s First Law (the principle of segregation)
This result is summarised in Mendel’s First Law, which states that individuals carry two discrete hereditary
factors (alleles) controlling each characteristic. The two alleles segregate (or separate) during meiosis, so
each gamete carries only one of the two alleles. Today we can explain Mendel’s first law by the behaviour
of homologous chromosomes during meiosis:
The Monohybrid Test Cross
You can see an individual’s phenotype, but you can’t see its genotype. If an individual shows the recessive
trait (white flowers in the above example) then they must be homozygous recessive as it’s the only
genotype that will give that phenotype. If they show the dominant trait then they could be homozygous
dominant or heterozygous. You can find out which by performing a test cross with a pure-breeding
homozygous recessive. This gives two possible results:
• If the offspring all show the dominant trait then the parent must be homozygous dominant.
• If the offspring are a mixture of phenotypes in a 1:1 ratio, then the parent must be heterozygous.
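The logic of the test cross can be sketched as follows. The function names `cross_with_recessive` and `deduce_genotype` are hypothetical, and the alleles R and r are the same illustrative symbols as before. Note that in a real experiment "all dominant" is only convincing when a reasonably large number of offspring has been counted.

```python
from itertools import product


def cross_with_recessive(unknown):
    """Cross a parent of genotype `unknown` ("RR" or "Rr") with a homozygous recessive (rr).

    Returns (number of dominant-phenotype outcomes, number of recessive-phenotype outcomes)
    among the four equally likely fertilisations.
    """
    offspring = [a + b for a, b in product(unknown, "rr")]
    dominant = sum(1 for genotype in offspring if "R" in genotype)
    return dominant, len(offspring) - dominant


def deduce_genotype(dominant_count, recessive_count):
    """Work backwards from the offspring of a test cross to the unknown parent."""
    # Any recessive offspring must have received an r allele from the unknown parent.
    return "Rr" if recessive_count > 0 else "RR"


print(cross_with_recessive("RR"))   # all offspring show the dominant trait
print(cross_with_recessive("Rr"))   # dominant and recessive in a 1:1 ratio
```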
The results of a genetic cross can also be shown as a pedigree diagram, like a family tree. These pedigrees
show the inheritance of a particular characteristic through a family, and are most often used for humans
(particularly for the inheritance of a genetic disease), but are also used for commercial animals like racing
horses or pedigree dogs.
In these diagrams, males are shown as squares, females as circles, and the phenotypes as different colours.
Each individual is usually named or numbered, so that they can be referred to. Every pedigree should have a
key to indicate what the colours represent.
The pedigrees only show phenotypes, because that is what is known about the individuals, but by studying a
pedigree diagram, many of the genotypes can be deduced. The most useful feature to look for is a cross
where two parents with one phenotype have at least one offspring of a different type, such as 7, 8 and 14 in
the diagram above. There can only be one explanation: The parents must be heterozygotes, showing the
dominant phenotype, and the different child must be homozygous recessive, showing the recessive
phenotype. We now know that the yellow allele is recessive and the red is dominant, and with this
knowledge, many of the genotypes can be filled in.
How does Genotype control Phenotype?
Mendel never knew this, but we can explain in detail the relation between an individual’s genes and its
appearance. A gene was originally defined as an inherited factor that controls a characteristic, but we now
know that a gene is also a length of DNA that codes for a protein (see unit 2). It is the proteins that
actually control phenotype in their many roles as enzymes, pumps, transporters, motors, hormones, or
structural elements. For example the flower colour gene actually codes for an enzyme that converts a
white pigment into a red pigment:
• The dominant allele is the normal (or “wild-type”) form of the gene that codes for functioning enzyme,
which therefore makes red-coloured flowers.
• The recessive allele is a mutation of the gene. This mutated gene codes for non-functional enzyme, so
the red pigment can’t be made, and the flower remains white. Almost any mutation in a gene will result
in an inactive gene product (often an enzyme), since there are far more ways of making an inactive
protein than a working one.
Sometimes the gene actually codes for a protein apparently unrelated to the phenotype. For example the
gene for seed shape in peas (round or wrinkled) actually codes for an enzyme that synthesises starch! The
functional enzyme makes lots of starch and the seeds are full and rounded, while the non-functional enzyme
makes less starch so the seeds wrinkle up. The gene responsible for all the symptoms of cystic fibrosis
actually codes for a chloride ion channel protein. A “tallness” gene may be a control gene that regulates the
release of growth hormone.
This table shows why the allele that codes for a functional protein is usually dominant over an allele that
codes for a non-functional protein. In a heterozygous cell, some functional protein will be made, and this is
usually enough to have the desired effect. In particular, enzyme reactions are not usually limited by the
amount of enzyme, so a smaller amount in heterozygotes will have little effect on phenotype.
Genotype                     Gene product
homozygous dominant (RR)     all functional enzyme
homozygous recessive (rr)    no functional enzyme
heterozygous (Rr)            some functional enzyme
Sex Determination
In unit 2 we came across the sex chromosomes (X and Y). Since these are non-homologous they are called
heterosomes, while the other 22 pairs are called autosomes. In humans the sex chromosomes are
homologous in females (XX) and non-homologous in males (XY), though in other species it is the other
way round. The inheritance of the X and Y chromosomes can be demonstrated using a monohybrid cross:
This shows that there will always be a 1:1 ratio of males to females. Note that female gametes (eggs) always
contain a single X chromosome, while the male gametes (sperm) can contain a single X or a single Y
chromosome. Sex is therefore determined solely by the sperm. There are techniques for separating X and
Y sperm, and this is used for planned sex determination in farm animals using artificial insemination (AI).
In humans it is the Y chromosome that actually determines sex: all embryos start developing as females, but
if the sex-determining “SRY” gene on the Y chromosome is expressed, male hormones are produced in the
embryo, causing the development of male characteristics. In the absence of male hormones, the embryo
continues to develop as a female. The X chromosome is not involved in sex determination.
Although females have two X chromosomes, only one of them is actually used in each cell. The other X
chromosome is completely inactivated in a process called X inactivation. The inactivated X chromosome is
chosen at random in each cell and is condensed into a structure called a Barr body, which cannot be
expressed. X inactivation happens in all female cells so that they have the same amount of gene product as
males (who only have one X chromosome).
Sex-Linked Characteristics
What else do the X and Y chromosomes do? As we saw in unit 2, the Y chromosome is very small,
containing very few genes, and doesn’t seem to do anything other than determine sex. The X
chromosome, on the other hand, is large and contains over a thousand genes that have nothing to do with
sex, coding for important products such as rhodopsin, blood clotting proteins and muscle proteins. Females
have two copies of each gene on the X chromosome (i.e. they’re diploid), but males only have one copy of
each gene on the X chromosome (i.e. they’re haploid). Males always inherit their X chromosome from
their mothers, and always pass on their X chromosome to their daughters. This means that the inheritance
of these genes is different for males and females, so they are called sex-linked characteristics.
Eye Colour in Fruit Flies
The first example of a sex-linked gene discovered was eye colour in the fruit fly Drosophila melanogaster.
This tiny fly has been a favourite organism for genetics research for over 100 years because:
• The flies are small and easily reared in the laboratory.
• They have a short two-week life cycle.
• Each female lays hundreds of fertilised eggs, giving large populations of offspring
suitable for statistical analysis.
• Although the flies are only 2mm long, their characteristics can be observed
quite easily under magnification.
• They only have four pairs of chromosomes.
Drosophila can have red or white eyes, with red (R) being dominant to white (r). When a red-eyed female is
crossed with a white-eyed male, the offspring all have red eyes, as expected for a dominant characteristic
(left cross below). However, when the opposite cross was done (a white-eyed female with a red-eyed male) all
the male offspring had white eyes (right cross below). This surprising result was not expected for a simple
dominant characteristic, but it could be explained if the gene for eye colour was located on the X
chromosome. Note that in these crosses the alleles are written in the form XR (red eyes) and Xr (white
eyes) to show that they are on the X chromosome.
Another well-known example of a sex linked characteristic is haemophilia in humans. We saw in unit 4 how
blood clotting is initiated by a cascade of protein clotting factors culminating in the production of fibrin,
which binds blood cells together to form a clot. The genes for two of the protein factors (factors VIII and
IX) are on the X chromosome, and a mutation in either of them stops the blood clotting, causing haemophilia. The
disease affects around 1 in 5,000 male births, but almost no females. The diagram below shows a cross between
a normal male and a heterozygous (carrier) female, using the symbols XH for the dominant allele (normal
blood-clotting factors) and Xh for the recessive allele (non-functional blood-clotting factors, haemophilia).
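The same cross can be worked through in Python as a rough sketch. The gamete labels "XH", "Xh" and "Y" follow the symbols in the text; the function name `describe` and the wording of the outcome labels are just illustrative choices.

```python
from collections import Counter
from itertools import product

mother_gametes = ["XH", "Xh"]   # carrier female (XH Xh)
father_gametes = ["XH", "Y"]    # normal male (XH Y)


def describe(egg, sperm):
    """Name the offspring produced when this egg is fertilised by this sperm."""
    sex = "male" if sperm == "Y" else "female"
    # Only X chromosomes carry the clotting-factor gene; the Y has no allele for it.
    x_alleles = [gamete for gamete in (egg, sperm) if gamete != "Y"]
    if all(allele == "Xh" for allele in x_alleles):
        status = "haemophilia"
    elif "Xh" in x_alleles:
        status = "carrier"
    else:
        status = "normal"
    return f"{status} {sex}"


offspring = Counter(
    describe(egg, sperm) for egg, sperm in product(mother_gametes, father_gametes)
)
for outcome, count in offspring.items():
    print(count, outcome)
```

Each of the four outcomes is equally likely, so half of the sons are expected to have haemophilia and half of the daughters are expected to be carriers.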
Females with haemophilia are very rare since they would have to be homozygous recessive and so inherit a
haemophilia allele from their father. Until recently boys with haemophilia had a low life expectancy due to
uncontrollable internal bleeding following minor accidents, so rarely had children. However, haemophiliac
girls are becoming more common, as improved treatments for the disease have allowed more haemophiliac
males to survive to adulthood and become parents. Haemophilia has passed through the royal families of
Europe due to an original mutation in Queen Victoria:
Other examples of sex linked characteristics include red-green colour-blindness and muscular dystrophy.
In most situations (and all of Mendel’s experiments) one allele is completely dominant over the other, so
there are just two phenotypes. But in some cases there are three phenotypes, because neither allele is
completely dominant over the other, so the heterozygous genotype has its own phenotype. This situation is
called codominance or incomplete dominance. Since there is no dominance we can no longer use capital
and small letters to indicate the alleles, so a more formal system is used. The gene is represented by a
letter and the different alleles by superscripts to the gene letter.
Flower Colour in Snapdragons
One example of codominance is flower colour in snapdragon plants. The flower colour gene C has two
alleles: CR (red) and CW (white). The three genotypes and their phenotypes are:
Genotype                 Gene product              Phenotype
homozygous (CR CR)       all functional enzyme     red
heterozygous (CR CW)     some functional enzyme    pink
homozygous (CW CW)       no functional enzyme      white
In this case the enzyme is probably less active, so a smaller amount of enzyme will make significantly less
product, and this leads to the third phenotype. A monohybrid cross looks like this:
Note that codominance is not an example of “blending inheritance” since the original phenotypes reappear
in the second generation. The genotypes are not blended and they still obey Mendel’s law of segregation. It
is only the phenotype that appears to blend in the heterozygotes.
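A short Python sketch makes the point that the parental phenotypes reappear. It assumes the usual snapdragon phenotypes (red, pink and white); the alleles are written as the superscript letters only ("R" for CR, "W" for CW), and the names `PHENOTYPE` and `cross` are invented for this example.

```python
from collections import Counter
from itertools import product

# Each genotype has its own phenotype, so no allele is hidden.
PHENOTYPE = {("R", "R"): "red", ("R", "W"): "pink", ("W", "W"): "white"}


def cross(parent1, parent2):
    """Count offspring phenotypes; a parent is a pair of alleles, e.g. ("R", "W") for CR CW."""
    return Counter(
        PHENOTYPE[tuple(sorted(pair))] for pair in product(parent1, parent2)
    )


f1 = cross(("R", "R"), ("W", "W"))   # red x white
f2 = cross(("R", "W"), ("R", "W"))   # pink x pink

print(dict(f1))   # every F1 plant is pink
print(dict(f2))   # red and white reappear: 1 red : 2 pink : 1 white
```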
Sickle Cell Anaemia
Another example of codominance is sickle cell haemoglobin in humans. The gene for haemoglobin (or more
accurately for the polypeptide globin – see unit 1) “Hb” has two codominant alleles: HbA (the normal gene)
and HbS (the mutated gene). The mutation in the HbS gene is a single base substitution (T→A),
changing one amino acid out of 146 in the polypeptide chain. This amino acid binds to other
haemoglobin molecules, so the molecules link together to form long chains, distorting the
red blood cells into sickle shapes.
There are three phenotypes:
HbA HbA   Normal. All haemoglobin molecules are normal, with normal disk-shaped red blood cells.
HbS HbS   Sickle cell anaemia. All haemoglobin molecules are abnormal, so most red blood cells are
sickle-shaped. These sickled red blood cells are less flexible than normal cells, so can block
capillaries and arterioles, causing cell death and severe pain. Sickle cells are also destroyed by
the spleen faster than they can be made, so not enough oxygen can be carried in the blood
(anaemia). Without treatment this phenotype is fatal in early childhood, though modern
medical intervention can extend life expectancy to 50.
HbA HbS   Sickle cell trait. 50% of the haemoglobin molecules in every red blood cell are normal, and
50% abnormal. Long chains do not form, so the red blood cells are normal and carry oxygen
normally. However these red blood cells do sickle when infected by the malaria parasite, so
infected cells are destroyed by the spleen. This phenotype therefore confers resistance to
malaria, and is common in areas of the world where malaria is endemic.
Other examples of codominance include coat colour in cattle (red/white/roan), and coat colour in cats.
Lethal Alleles
An unusual effect of codominance is found in Manx cats, which have no tails. If two Manx cats are crossed
the litter has a ratio of 2 Manx kittens to 1 normal (long-tailed) kitten. The reason for this unexpected
ratio is shown in this genetic diagram:
The gene S actually controls the development of the embryo cat’s spine. It has two codominant alleles: SN
(normal spine) and SA (abnormal, short spine). The three phenotypes are:
SN SN   Normal. Normal spine, long tail.
SN SA   Manx cat. Last few vertebrae absent, so no tail.
SA SA   Lethal. Spine doesn’t develop, so this genotype is fatal early in development. The embryo
doesn’t develop and is absorbed by the mother, so there is no evidence for its existence.
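The 2:1 ratio can be checked with a small Python sketch. The alleles are written as the superscript letters only ("N" for SN, "A" for SA), and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

PHENOTYPE = {("N", "N"): "normal", ("A", "N"): "Manx", ("A", "A"): "lethal"}

manx = ("N", "A")   # both parents are Manx cats (SN SA)

# All four equally likely fertilisations, before any embryos are lost.
conceived = Counter(PHENOTYPE[tuple(sorted(pair))] for pair in product(manx, manx))

# The SA SA embryos never develop, so they are missing from the litter.
born = {phenotype: n for phenotype, n in conceived.items() if phenotype != "lethal"}

print(dict(conceived))   # 1 normal : 2 Manx : 1 lethal
print(born)              # 1 normal : 2 Manx
```

The underlying genotype ratio is still the ordinary 1:2:1; the ratio only looks unusual because one class of offspring is never observed.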
Many human genes also have lethal alleles, because many genes are so essential for life that a mutation in
these genes is fatal. If the lethal allele is expressed early in embryo development then the fertilised egg may
not develop enough to start a pregnancy, or the embryo may miscarry. If the lethal allele is expressed later
in life, then we call it a genetic disease, such as muscular dystrophy or cystic fibrosis.
Multiple Alleles
An individual has two copies of each gene, so can only have two alleles of any gene, but there can be more
than two alleles of a gene in a population. An example of this is blood group in humans. The red blood cell
antigen is coded for by the gene I (for isohaemagglutinogen), which has three alleles IA, IB and Io. (They are
written this way to show that they are alleles of the same gene.) IA and IB are codominant, while Io is
recessive. The six possible genotypes and four phenotypes are:
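Phenotype (blood group)   Genotypes        Antigens on red blood cells   Plasma antibodies
A                         IA IA, IA Io     A                             anti-B
B                         IB IB, IB Io     B                             anti-A
AB                        IA IB            A and B                       none
O                         Io Io            none                          anti-A and anti-B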
The cross below shows how all four blood groups can arise from a cross between a group A parent and a group B parent:
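As a rough check, the blood group genotypes and this cross can be enumerated in Python. The alleles are written as single characters ("A" for IA, "B" for IB, "o" for Io), the function name `blood_group` is hypothetical, and both parents are assumed to be heterozygous (IA Io and IB Io), since that is the only way all four groups can appear.

```python
from collections import Counter
from itertools import combinations_with_replacement, product


def blood_group(genotype):
    """IA and IB are codominant; Io is recessive to both."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"


# Every possible pair of alleles: six genotypes giving four phenotypes.
genotypes = list(combinations_with_replacement("ABo", 2))
groups = {blood_group(genotype) for genotype in genotypes}

# Group A parent (IA Io) crossed with group B parent (IB Io).
offspring = Counter(blood_group(pair) for pair in product(("A", "o"), ("B", "o")))

print(len(genotypes), sorted(groups))
print(dict(offspring))
```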
Other examples of multiple alleles are: eye colour in fruit flies, with over 100 alleles, and human leukocyte
antigen (HLA) genes, with 47 known alleles.
Two Genes
So far we have looked at the inheritance of a single gene, but Mendel also studied the inheritance of two
different characteristics at a time in pea plants, so we’ll look at one of his dihybrid crosses. The two traits
are seed shape and seed colour. Round seeds (R) are dominant to wrinkled seeds (r), and yellow seeds (Y)
are dominant to green seeds (y). With these two genes there are 4 possible phenotypes (note that it’s
often useful to use a shorthand where _ can mean a dominant or recessive allele):
Genotypes                   Shorthand   Phenotype
RRYY, RRYy, RrYY, RrYy      R_Y_        round yellow
RRyy, Rryy                  R_yy        round green
rrYY, rrYy                  rrY_        wrinkled yellow
rryy                        rryy        wrinkled green
Mendel’s dihybrid cross looked like this:
All 4 possible phenotypes are produced, but always in the ratio 9:3:3:1.
Mendel’s Second Law (the principle of independent assortment)
This result is summarised in Mendel’s Second Law, which states that alleles of different genes are inherited
independently; in other words the inheritance of one gene does not affect the inheritance of the other.
Today we can explain Mendel’s second law by the independent assortment of bivalents during meiosis:
The dihybrid cross can be written out in a genetic diagram, just like all the monohybrid crosses:
The gametes have one allele of each gene, and that allele can end up with either allele of the other gene.
This gives 4 different gametes for the second generation, and 16 possible genotype outcomes.
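The 16 outcomes can be counted with a short Python sketch. The function names `gametes` and `phenotype` are invented for this example, and genotypes are written as four-letter strings such as "RrYy" (shape gene first, colour gene second).

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes from a two-gene genotype such as "RrYy": one allele of each gene."""
    shape_alleles, colour_alleles = genotype[:2], genotype[2:]
    return [shape + colour for shape, colour in product(shape_alleles, colour_alleles)]


def phenotype(gamete1, gamete2):
    """Combine two gametes and read off the seed shape and seed colour."""
    shape = "round" if "R" in gamete1[0] + gamete2[0] else "wrinkled"
    colour = "yellow" if "Y" in gamete1[1] + gamete2[1] else "green"
    return f"{shape} {colour}"


f1_gametes = gametes("RrYy")   # RY, Ry, rY, ry
f2 = Counter(phenotype(g1, g2) for g1, g2 in product(f1_gametes, repeat=2))

print(f1_gametes)
print(dict(f2))   # the 16 outcomes fall into a 9:3:3:1 ratio
```

Because the two genes assort independently, the 9:3:3:1 ratio is just two 3:1 ratios multiplied together.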
The Dihybrid Test Cross
There are 4 genotypes that all give the same round yellow phenotype. Just like we saw with the
monohybrid cross, these four genotypes can be distinguished by crossing with a double recessive
phenotype. This gives 4 different results:
Original genotype   Result of test cross
RRYY                all round yellow
RRYy                1 round yellow : 1 round green
RrYY                1 round yellow : 1 wrinkled yellow
RrYy                1 round yellow : 1 round green : 1 wrinkled yellow : 1 wrinkled green
Two Linked Genes
The law of independent assortment only applies when the two genes are on separate chromosomes. When
the two genes are on the same chromosome they are not independent but remain linked together during
meiosis, so are inherited together. This is called autosomal linkage.
Linkage was first discovered in the fruit fly Drosophila. The two characteristics
investigated were:
• Body colour: grey body (G) dominant to black body (g)
• Wing length: long wings (L) dominant to short, or vestigial, wings (l)
A cross was carried out between grey-body, long-wing heterozygote (GgLl)
and a black-body vestigial-wing homozygote (ggll), with the following results:
[Table: numbers of offspring with each phenotype (grey long, black vestigial, grey vestigial, black long) and total offspring]
• Most of the offspring (83%) displayed the original two phenotypes – grey long or black vestigial – in the
ratio 1:1. This is because the two genes (for body colour and wing length) are on the same
chromosome, so the alleles stay linked together during meiosis.
• Sometimes the linkage is broken due to crossing-over during meiosis, so the alleles can mix up, and this
genetic recombination accounts for the rest of the offspring (17%) with the other two phenotypes (grey
vestigial or black long in the ratio 1:1).
We can explain this in a genetic diagram, this time showing the genes on the homologous chromosomes:
Chromosome maps
The number of offspring with genetic recombination can be calculated as a frequency, called the crossover
value (COV):
$$\text{crossover value} = \frac{\text{number of offspring showing recombination}}{\text{total number of offspring}} \times 100$$
In the Drosophila cross just described the crossover value for the body colour gene and wing size gene is 17%.
But with other pairs of genes the crossover values will be quite different. This is because the crossover
value for a pair of genes on a chromosome depends on how far apart the two gene loci are on the
chromosome. The further apart they are, the more likely it is for a chiasma to form between the two loci
during meiosis, so the higher the chances of recombination so the higher the crossover value.
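The crossover value calculation is simple enough to write as a one-line function. In this sketch the function name `crossover_value` is an illustrative choice and the offspring counts are hypothetical, picked only so that the result matches the 17% quoted above; they are not the counts from the original experiment.

```python
def crossover_value(recombinant_offspring, total_offspring):
    """Percentage of offspring that show a recombinant phenotype."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100 * recombinant_offspring / total_offspring


# Hypothetical counts: 170 recombinant offspring out of 1000 in total.
cov = crossover_value(170, 1000)
print(f"crossover value = {cov}%")
```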
It was quickly realised that crossover values between pairs of genes could be used
to build up a map of the location of those genes on a chromosome (the gene
loci). This diagram shows a simple chromosome map for chromosome 2 of D.
melanogaster. The numbers are crossover values, which correspond to distance
along the chromosome.
Effect of Gene Linkage
Alleles on the same chromosome are often inherited together (they’re linked), contrary to Mendel’s
second law. However, sometimes alleles on the same chromosome can be recombined and inherited
independently, due to crossing over in meiosis. There are no fixed genetic ratios, and the frequency of
recombinant phenotypes depends on how far apart the two genes are on the chromosome.
Analysing Crosses with the Chi-squared (χ²) Test
The results of genetic crosses are an example of categoric data, i.e. observations using words rather than
numbers (e.g. colours, shapes, species). If a large number of observations are made then the number of
observations of each category can be counted to give frequencies. The Chi-squared (χ²) test compares the
frequencies observed from an experiment with the frequencies expected from a theory such as Mendel's
laws of genetics.
Null Hypothesis: there is no difference between the observed and expected frequencies.
For example the frequencies of flower colours from a genetic cross can be compared to frequencies
expected from a genetic cross. Here the flower colours of 929 plants were observed and the observed
frequency of each colour was recorded in the column of a table.
[Table: flower colour, observed frequency (O) and expected frequency (E)]
The formula for χ² is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
In order to carry out the χ²-test we need to add two columns to the results table:
1. The first new column is for “expected frequencies”. Mendel’s law predicts a 3:1 ratio, so 75% of the 929
plants (696.75) are expected to be red, and 25% are expected to be white (232.25).
2. The second new column is to calculate (O-E)²/E for each colour. Add up all these values at the bottom
to give the χ² value (0.39 in this case).
3. Calculate the degrees of freedom: dof = number of categories – 1 = 2 – 1 = 1 in this case.
4. Look up the critical value of χ² in a χ²-table for 1 degree of freedom (3.84).
To test the null hypothesis we compare this critical value (3.84) with our calculated value of χ² (0.39).
• If the critical value of χ² is less than our calculated value then p < 0.05. We reject the null hypothesis
and conclude that there is a significant difference between the observed and expected frequencies.
• If the critical value of χ² is more than our calculated value then p > 0.05. We accept the null
hypothesis and conclude that there is no significant difference in frequencies, and the observed
difference in frequencies is just due to chance.
In this example 3.84 > 0.39, so we choose the second option: p > 0.05 so we accept the null hypothesis and
conclude that the slight difference from an exact 3:1 ratio is just due to chance, i.e. the observed
frequencies of flower colours are consistent with Mendel's law.
This is an extract from a table of critical values of χ² at a significance level of 0.05. If the critical value of
the test statistic is less than the calculated value then p < 0.05 and the result is significant at the 0.05 level.
degrees of freedom   1      2      3      4      5      6      7      8      9      10
critical value       3.84   5.99   7.81   9.49   11.07  12.59  14.07  15.51  16.92  18.31
degrees of freedom   11     12     13     14     15     16     17     18     19     20
critical value       19.68  21.03  22.36  23.68  25.00  26.30  27.59  28.87  30.14  31.41
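The χ² calculation for the flower colour cross can be sketched in a few lines of Python. This is only an illustration: the function name `chi_squared` is made up, and the observed counts of 705 red and 224 white plants are an assumption chosen to match the stated total of 929 and the stated χ² value of 0.39 (the individual counts are not given in the text). The critical values are the ones listed above.

```python
# Critical values of chi-squared at p = 0.05 for 1 to 20 degrees of freedom
# (copied from the table of critical values above).
CRITICAL_005 = [3.84, 5.99, 7.81, 9.49, 11.07, 12.59, 14.07, 15.51, 16.92, 18.31,
                19.68, 21.03, 22.36, 23.68, 25.00, 26.30, 27.59, 28.87, 30.14, 31.41]


def chi_squared(observed, ratio):
    """Return (chi2, dof) for observed counts tested against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


# ASSUMED observed counts (red, white); only the total of 929 is in the notes.
observed = [705, 224]
chi2, dof = chi_squared(observed, ratio=[3, 1])
critical = CRITICAL_005[dof - 1]

# The difference is significant only if the calculated value exceeds the critical value.
significant = chi2 > critical
print(round(chi2, 2), dof, critical, significant)
```

With these assumed counts the calculated value is well below 3.84, so the null hypothesis is accepted, as in the worked example.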
Population Genetics
We’ve seen how alleles are passed on from one individual to another in a population. Now we’ll see how
all the alleles in a population might change. The sum of all the alleles of all the genes of all the individuals in
a population is called the gene pool. Within a gene pool we want to know how the proportions of different
alleles change over time. In the early 20th century biologists started to apply Mendel’s laws of inheritance
to whole populations and found a simple formula to calculate allele frequencies in a gene pool. This formula
is called the Hardy-Weinberg equation, since it was devised independently by the English mathematician G.
H. Hardy and the German physician W. Weinberg in 1908.
The Hardy-Weinberg Equation
Just as with the genetic crosses, let’s consider the case of a single gene at a time. For example, imagine that
coat colour in cats is controlled by a single gene with two alleles – black (B) and white (b). The black allele
is completely dominant over the white allele. Each cat has two alleles for coat colour – either BB or Bb or
bb. In population genetics we always measure frequencies – decimal fractions out of one. We don’t know
what the frequency of each genotype in the population is, but we do know that the sum of the two allele
frequencies must add up to one, by definition (because there are only two alleles of this gene).
Mathematically, if p is the frequency of the dominant allele B, and q is the frequency of the recessive allele b, then:
p + q = 1
Now the gametes produced by the cats in this population will only have one allele of the coat colour gene
each – either B or b. While we don’t know the allele in any particular gamete, we know that overall,
because gamete production is random, the frequencies of the B and b alleles in the gametes will be the
same as in the gene pool of the parent cats, i.e. p and q. So we can do a Punnett square for reproduction in
this cat population:
                        male gametes
                        B (p)        b (q)
female gametes  B (p)   BB (p²)      Bb (pq)
                b (q)   Bb (pq)      bb (q²)
This Punnett square gives us the frequencies of the different genotypes in the population when the cats
reproduce. The genotype BB has a frequency p², the genotype bb has a frequency q², and the genotype Bb
has a frequency 2pq. The sum of the genotype frequencies must add up to one (by definition), so:
p² + 2pq + q² = 1
This is the Hardy-Weinberg equation.
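The same result can be reached algebraically, without drawing the Punnett square. The short derivation below is a sketch that assumes, as above, that gametes combine at random, so that squaring the allele frequency equation gives the genotype frequencies.

```latex
\begin{aligned}
p + q &= 1 \\
(p + q)^2 &= 1^2 \\
p^2 + 2pq + q^2 &= 1
\end{aligned}
```

The three terms on the left are the frequencies of BB, Bb and bb respectively.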
Using the Hardy-Weinberg Equation
We can use the Hardy-Weinberg equation to calculate genotype and allele frequencies from observed
phenotype frequencies. Let’s take a population of 1000 cats with 840 black cats and 160 white cats.
• The phenotype frequency for black is 0.84 (840/1000) and for white is 0.16 (160/1000).
• We know that white is the recessive allele, so the white cats must be homozygous recessive, so the
frequency of the genotype bb is 0.16.
• The genotype bb has a frequency q², so q² = 0.16.
• q = √(q²) = √0.16 = 0.4
• p + q = 1, so p = 1 – q = 1 – 0.4 = 0.6
Now we can calculate the genotype frequencies:
• frequency of BB = p² = 0.6² = 0.36
• frequency of Bb = 2pq = 2 × 0.6 × 0.4 = 0.48
• frequency of bb = q² = 0.16 (already found)
• check that these add up to one: 0.36 + 0.48 + 0.16 = 1.00
We can convert these frequencies to actual numbers in the population, for example:
• Number of heterozygous cats = 0.48 × 1000 = 480
The Hardy-Weinberg equation can be used to calculate any of the three types of frequencies:
Allele frequencies
The proportions of the two alleles B and b in the population. Allele frequencies are particularly
interesting because evolution causes the allele frequencies to change.
    recessive allele (b) = q
    dominant allele (B) = p
Genotype frequencies
The proportions of the three possible genotypes (BB, Bb and bb) in the population. We can’t see the
genotypes, but we can calculate them.
    homozygous recessive (bb) = q²
    homozygous dominant (BB) = p²
    heterozygous (Bb) = 2pq
Phenotype frequencies
The proportions of the different characteristics in the population (e.g. red or white). These are the
easiest to measure, because we can see and count them in a population.
    recessive phenotype = q²
    dominant phenotype = p² + 2pq
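The cat calculation (160 white cats in a population of 1000) can be sketched in Python as below. The function name `hardy_weinberg` and its parameters are illustrative, not part of the notes, and the function assumes the population is in Hardy-Weinberg equilibrium and that the recessive phenotype is always homozygous recessive.

```python
import math


def hardy_weinberg(recessive_count, population):
    """Estimate allele and genotype frequencies from the number of
    homozygous recessive individuals, assuming Hardy-Weinberg equilibrium."""
    q = math.sqrt(recessive_count / population)  # because q^2 = frequency of bb
    p = 1 - q                                    # because p + q = 1
    return {"p": p, "q": q, "BB": p * p, "Bb": 2 * p * q, "bb": q * q}


freqs = hardy_weinberg(recessive_count=160, population=1000)
heterozygous_cats = round(freqs["Bb"] * 1000)
print({name: round(value, 2) for name, value in freqs.items()}, heterozygous_cats)
```

Starting from the phenotype count is a deliberate choice: the recessive phenotype is the only one whose genotype can be read directly from appearance.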
The Hardy-Weinberg equation can be very useful in many different applications. For example the incidence
of the single-gene recessive disorder cystic fibrosis in humans is 1 in 2500. From this observation we can
use the Hardy-Weinberg equation to calculate that one in 25 people are heterozygous carriers of the
disease allele, and this sort of information is important in genetic counselling.
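The cystic fibrosis figure can be checked with the same arithmetic. This is a minimal sketch using only the incidence quoted above; the variable names are illustrative, and the calculation assumes Hardy-Weinberg equilibrium.

```python
import math

incidence = 1 / 2500        # frequency of affected people = q^2
q = math.sqrt(incidence)    # frequency of the recessive disease allele
p = 1 - q                   # frequency of the normal allele
carriers = 2 * p * q        # frequency of heterozygous carriers
one_in = 1 / carriers       # express the carrier frequency as "one in N"

print(round(q, 2), round(carriers, 4), round(one_in, 1))
```

The carrier frequency comes out at just under 4%, which is roughly one person in 25.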
The Hardy-Weinberg Principle
The Hardy-Weinberg equation predicts that the frequencies of dominant and recessive alleles in a gene
pool remain constant over time, so long as five key conditions about the population are met:
1. There are no mutations, so no new alleles are created.
2. There is no immigration, so no new alleles are introduced, and no emigration, so no alleles are lost.
3. Mating is random, so alleles are mixed randomly in sexual reproduction.
4. The population is large, so no alleles are eliminated by genetic drift.
5. There is no selection, so no alleles are favoured or eliminated.
These conditions mean that there is nothing to disturb the gene pool, which therefore remains in a stable
genetic equilibrium. In other words, the allele frequencies in the population will remain constant from
generation to generation. This principle is called the Hardy-Weinberg principle. Before this it was thought
that dominant alleles would increase in frequency over time, and recessive alleles would decrease in
frequency, but this intuitive idea is wrong. Dominant alleles need not be common. For example the
dominant allele for Huntington’s disease is very rare in human populations and almost everyone is
homozygous recessive.
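The claim that a dominant allele does not automatically become more common can be illustrated numerically. The sketch below is an idealised model, not a simulation of a real population: it assumes all five conditions hold, and the function name `next_generation` and the starting frequency of 0.01 are illustrative choices.

```python
def next_generation(p):
    """Frequency of the dominant allele after one round of random mating,
    with no mutation, migration, drift or selection."""
    q = 1 - p
    freq_homozygous_dominant = p * p
    freq_heterozygous = 2 * p * q
    # Homozygous dominant parents pass on only the dominant allele;
    # heterozygous parents pass it on half of the time.
    return freq_homozygous_dominant + 0.5 * freq_heterozygous


p = 0.01                     # a rare dominant allele (illustrative value)
history = [p]
for _ in range(10):
    p = next_generation(p)
    history.append(p)

print([round(value, 6) for value in history])
```

Every value in the list is the same, because p² + pq = p(p + q) = p.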
Gene Pools and Evolution
Evolution is defined as a change in allele frequencies of a population’s gene pool over time. In most real
populations allele frequencies do change over time, so the population is evolving. This means that at least
one of the five conditions of the Hardy-Weinberg principle is not true, in other words one or more factors
are acting to change the allele frequencies. These disturbing factors include:
1. Mutations
The original source of new alleles and genetic variation.
2. Gene flow
The movement of alleles between populations.
3. Non-random mating
Changes in allele frequency due to inbreeding.
4. Genetic drift
Changes in allele frequency in a small population due to chance.
5. Selection
Changes in allele frequency due to natural or sexual selection, producing adaptive
changes in response to the environment.
We can immediately see that the five conditions needed for a Hardy-Weinberg equilibrium listed above
are simply the absence of these five disturbing factors. We’ll look at each of these disturbing
factors in turn.
1. Mutations
Mutations are the original source of all new alleles. We looked at different kinds of mutation in units 1 and
3, but they can all make new alleles.
Mutations are random, rare and spontaneous.
• Random means that any part of the DNA is as likely as any other to be mutated, whether it is coding or
non-coding, important or unimportant, somatic or germ-line. And the change is random too – it may
equally be neutral, lethal or beneficial.
• Rare, because cells have error-checking systems to prevent mutations from occurring. The result is that
a gene will mutate only about once per 10⁵ cell divisions on average. However, there are so many living
cells on the planet that new alleles arise somewhere all the time.
• Spontaneous means that mutations are not caused by any particular factor (though the mutation rate is
increased by certain mutagenic factors like ionising radiation and viruses).
Only mutations in germ-line (reproductive) cells will enter the gene pool and be available for selection.
Mutations in somatic (body) cells will die with their owner.
2. Gene flow
Allele frequencies in a population’s gene pool will be changed if there is a significant movement of alleles
into or out of the population. One obvious way for this to happen is by migration: immigration could
introduce new alleles to a population while emigration could cause some alleles to be lost from the
population. Gene flow can also arise due to dispersal of seeds, pollen, or spores.
3. Non-random mating
Non-random mating includes:
• Sexual selection, where individuals choose mates of the other sex with particular characteristics to
reproduce with. This is a form of selection, so is covered by item 5 below.
• Inbreeding, where closely-related individuals mate. This increases the frequency of homozygotes.
• Selective breeding of domesticated animals and plants, where humans choose which individuals can
breed, which usually involves inbreeding as well. Alleles that are favoured by humans will increase in
frequency, while those that are undesirable will decrease in frequency.
4. Genetic Drift
Allele frequencies can change just due to random chance. In a large population these random changes
would have an insignificant effect, but in a small population the effect can be considerable and can lead to
evolutionary changes. Changes in allele frequency in a small population due to chance are called genetic
drift. There are two common examples of genetic drift: genetic bottlenecks and the founder effect.
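The effect of population size on genetic drift can be illustrated with a small random simulation. This is a sketch under stated assumptions rather than a model taken from the notes: each generation is assumed to be formed by drawing 2N allele copies at random from the previous gene pool, and the function name `drift`, the population sizes and the number of generations are all illustrative.

```python
import random


def drift(p, population_size, generations, rng):
    """Follow the frequency of one allele when each generation's 2N allele
    copies are drawn at random from the previous generation's gene pool."""
    copies = 2 * population_size
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
    return p


def mean_shift(results, start=0.5):
    """Average distance of the final allele frequencies from the start value."""
    return sum(abs(p - start) for p in results) / len(results)


rng = random.Random(1)  # fixed seed so that the run is repeatable
small = [drift(0.5, 10, 50, rng) for _ in range(200)]     # 10 individuals
large = [drift(0.5, 1000, 50, rng) for _ in range(20)]    # 1000 individuals

print(round(mean_shift(small), 2), round(mean_shift(large), 2))
```

In the small populations the allele usually ends up either lost or fixed, while in the large populations its frequency stays close to the starting value.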
Genetic Bottlenecks
A genetic bottleneck happens when a population is drastically reduced in size due to a natural
catastrophe. The few survivors will only have a small range of alleles between them, with many of the
original alleles being lost in the large numbers who died. As the population grows again it will have a
very different set of allele frequencies from the original parent population.
 Cheetahs are a threatened species partly due to their very low genetic diversity. This is probably due
to a genetic bottleneck at the end of the last glacial period ten thousand years ago.
 An extreme example is the Golden Hamster, of which the vast majority are descended from a single
litter found in the Syrian Desert around 1930.
 We now know that humans have very low genetic diversity compared to other primate species.
Analysis of mitochondrial and Y-chromosome DNA from humans suggests that modern humans
went through a genetic bottleneck 70 000 years ago, when the world population fell to 15 000 due
to environmental changes following the eruption of the Toba supervolcano in Indonesia.
The Founder Effect
The founder effect occurs when a small number of individuals colonise a new habitat and start a new,
isolated population. Since the few individuals will only have a small range of alleles between them, the
founder effect is an example of a genetic bottleneck, and is sometimes called a colonisation bottleneck.
Founder effects are common throughout evolutionary history, and are readily seen in remote islands
(such as the Hawaiian or Galapagos islands), where colonisation is difficult and rare. A few animals or a
few plant seeds may by chance float or “raft” to a remote island during a storm, and give rise to new
populations. These new populations will have low genetic diversity, reflecting the small range of alleles in
the small founding population. In extreme cases a founding population can be as small as a single
pregnant female animal or a single plant seed.
The founder effect can also be seen in human populations. For example the island of Pingelap in
Micronesia suffered a typhoon in 1775 that reduced the population on the island to only 20. The
islanders today have a high frequency of a particular form of total colour blindness, since one of the
typhoon survivors was a carrier for this allele. The Afrikaners of South Africa have a high incidence of
Huntington’s disease, since one of the original Dutch settlers had the disease due to the presence of a
dominant allele.
5. Selection
Selection includes natural selection and sexual selection, and it always causes a change in allele frequency.
The important difference between selection and the other four disturbing factors is that selection is the only
factor that produces adaptive evolutionary changes (as described in unit 5). These histograms show three
kinds of natural selection, depending on which phenotypes are selected by the environment. The shaded
areas represent the phenotypes that are favoured.
• Directional Selection occurs when one extreme phenotype (e.g. tallest) is favoured over the other
extreme (e.g. shortest). This happens when the environment changes in a particular way. "Environment"
includes biotic as well as abiotic factors, so organisms evolve in response to each other, e.g. if predators
run faster there is selective pressure for prey to run faster, or if one tree species grows taller, there is
selective pressure for others to grow taller. Most environments do change (e.g. due to migration of new
species, natural catastrophes, climate change, sea level change, continental drift, etc.), so
directional selection is common.
• Disruptive (or Diverging) Selection. This occurs when both extremes of phenotype are selected over
intermediate types. For example in a population of finches, birds with large and small beaks feed on large
and small seeds respectively and both do well, but birds with intermediate beaks have no advantage, and
are selected against.
• Stabilising (or Normalising) Selection. This occurs when the intermediate phenotype is selected over
extreme phenotypes, and tends to occur when the environment doesn't change much. For example
birds’ eggs and human babies of intermediate birth weight are most likely to survive. Natural selection
doesn't have to cause a directional change, and if an environment doesn't change there is no pressure
for a well-adapted species to change. Fossils suggest that many species remain unchanged for long
periods of geological time.
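How selection changes allele frequency can be sketched with a simple one-gene model of directional selection. The model is an assumption added for illustration: homozygous recessive individuals are given a relative fitness of 1 – s, the other genotypes a fitness of 1, and the function name `select_against_recessive` and the values of q and s are hypothetical.

```python
def select_against_recessive(q, s):
    """One generation of selection in which homozygous recessive individuals
    have relative fitness 1 - s and the other genotypes have fitness 1."""
    p = 1 - q
    mean_fitness = 1 - s * q * q
    # Recessive alleles in the survivors come from heterozygotes (half of
    # their alleles) and from the surviving homozygous recessives.
    return (p * q + (1 - s) * q * q) / mean_fitness


q = 0.4   # starting frequency of the recessive allele (illustrative)
s = 0.5   # ASSUMED selection coefficient, not a value from the notes
trajectory = [q]
for _ in range(20):
    q = select_against_recessive(q, s)
    trajectory.append(q)

print([round(value, 3) for value in trajectory[:5]])
```

The recessive allele becomes steadily rarer but is not eliminated within the run, because most copies are carried by heterozygotes, which are not selected against. With s = 0 the function returns q unchanged, which is the Hardy-Weinberg case.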
Epigenetic Control of Gene Expression
In unit 1 we saw how genes are expressed through transcription and translation into proteins, which give
cells their functions and properties. But cells don’t express all their genes all the time. Gene expression can
be switched on or off by environmental stimuli (e.g. age, light, injury, nutrients, chemicals). The regulation of
gene expression is called epigenetic regulation because it is regulated by environmental factors, not by the
DNA sequence (epigenetic literally means “outside genetics”). These epigenetic changes include chemical
changes to DNA and histones, but not to the base sequence of DNA. And as we shall see, epigenetic
changes can be passed down to daughter cells through mitosis, but not usually to offspring through sexual
reproduction.
There are five epigenetic control points along the gene expression pathway:
We’ll look at each step in detail.
1. Chromatin Remodelling
In unit 3 we saw that DNA in the nucleus of a eukaryotic cell is tightly bound to histone proteins, forming
chromatin.
The degree of folding changes during the cell cycle, for example the tightly-packed chromosome structures
are only formed during prophase of mitosis, to allow the DNA to be moved easily. However, even in
interphase, the folding can change from a loosely-packed form (euchromatin) to a tightly-packed form
(heterochromatin). This switching is called chromatin remodelling, and regulates which genes can be
expressed.
Heterochromatin                        Euchromatin
Nucleosomes tightly packed             Nucleosomes loosely packed
DNA inaccessible for transcription     DNA accessible for transcription
Genes repressed                        Genes activated
DNA methylated                         DNA de-methylated
Histones de-acetylated                 Histones acetylated
The switching is controlled by enzymes chemically changing the DNA or histone proteins. Two of the most
important changes are DNA methylation and histone acetylation.
DNA methylation
One of the four DNA nucleotides, cytosine (C), can be methylated by having a methyl group (–CH₃)
attached by the enzyme DNA methyl transferase (DNMT):
Note that this methylation does not affect the hydrogen bonds formed in base-pairs, so does not affect
base-pairing or the base sequence of the DNA. DNMT only methylates cytosine when it is followed by
guanine in the double helix (known as a CpG dinucleotide sequence), and the diagonally-opposite
cytosine is also methylated, so both strands of DNA are methylated at each location.
DNA methylation turns off gene expression in that region of DNA by causing the nucleosomes to coil
up into tight heterochromatin. In this state the transcription proteins can’t bind and no transcription can
take place.
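Because DNMT only acts on a cytosine that is followed by guanine, the possible methylation sites in a piece of DNA can be found by a simple text search. The sketch below is illustrative only: the function names and the 12-base sequence are made up, and sequences are assumed to be written 5'→3'.

```python
def cpg_sites(sequence):
    """Return the 0-based positions of every cytosine that is followed
    by guanine (a CpG dinucleotide) in a 5'->3' DNA sequence."""
    sequence = sequence.upper()
    return [i for i in range(len(sequence) - 1) if sequence[i:i + 2] == "CG"]


def complementary_strand(sequence):
    """Return the complementary strand, also written 5'->3'."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sequence.upper()))


dna = "ATCGGCGTACGA"  # made-up sequence for illustration
print(cpg_sites(dna))
print(cpg_sites(complementary_strand(dna)))
```

Both strands contain the same number of CpG sites, because the complement of CG read in the opposite direction is also CG. This is why both strands can be methylated at each location.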
Histone modification
In chromatin DNA wraps twice round a core of histone proteins to form a structure called a
nucleosome. Each protein core is made of eight globular polypeptide chains (it’s an octamer), and each
chain has a long polypeptide “tail” (around 100 amino acids long) emerging from the core. The amino
acids in these tails can be extensively modified by the addition or removal of methyl groups (–CH₃),
acetyl groups (–COCH₃), phosphate groups (–PO₄) or other groups, and these modifications cause the
nucleosomes to stick together to form heterochromatin or separate to form euchromatin. So these
changes also regulate gene expression.
Chromatin remodelling is a long-term regulator of gene expression: once genes are switched off they
usually stay switched off in that cell and in its daughter cells. This happens because the methylation patterns
and histone modifications are copied whenever DNA is replicated and the cell divides. The daughter cells
that result from mitosis thus have the same epigenetic changes and so the same inactivated genes.
Chromatin remodelling is therefore an important factor in cell differentiation (see Gene Expression in Embryo Development).
2. Control of Transcription by Transcription Factors
In unit 1 we saw that, for transcription to happen, the enzyme RNA polymerase must bind to the DNA
molecule just upstream of the gene, at a region called the promoter. However, RNA polymerase binds only
weakly at first and it needs the assistance of a number of other DNA-binding proteins before it can start
transcribing the gene. There are two classes of regulatory proteins:
• Transcription factors bind to the promoter region, just beside the RNA polymerase. Each transcription
factor protein has a specific binding site that binds to a particular DNA sequence in the promoter.
• Activator proteins bind to the enhancer sequence, some distance further upstream. Again, different
activator proteins have different binding sites that bind to specific DNA sequences.
Once bound to DNA, the different regulatory proteins bind together as the DNA molecule bends and
loops to form a combined transcription complex. This complex activates RNA polymerase, which can now
move along the DNA molecule, transcribing the gene.
This example shows two regulatory proteins, but in practice there can be over a dozen different proteins
involved. Some promote transcription, while others repress it, so transcription factors provide a very
flexible method of control. Both the promoter and enhancer sequences are blocked if the DNA is in the
closed heterochromatin, so transcription can’t take place. Since transcription factors are proteins, they are
synthesised in the cytoplasm and transported into the nucleus to bind to DNA. And of course their own
production is controlled by transcription factors!
Steroid hormones, like oestrogen and testosterone, stimulate protein synthesis and growth by being
transcription factors. Steroid hormones are lipids, so cross the cell membrane by lipid diffusion and bind to
a receptor protein in the cytoplasm to form a transcription factor.
3. Alternative Splicing of mRNA
In unit 1 we learnt that genes contain coding sections (exons) and non-coding sections (introns). The
introns are removed from mRNA in a process called post-transcriptional modification (or just splicing),
which is carried out by a large RNA-protein complex called a spliceosome. It turns out that almost all genes
have more exons than they need to make one particular protein, and by combining some of the exons
together in different ways, the spliceosome can make different isoforms of mature mRNA, which are then
translated to make different proteins with a different structure and function. This is called alternative
splicing.
Almost all human genes make different proteins by alternative splicing, making anything from 2 to 10,000
different mRNA isoforms from each gene. This explains how the 20,000 genes in the human genome can
code for over 100,000 different proteins. The record-holder so far is the Drosophila gene Dscam, which
makes proteins involved in the development of the fly’s nervous system. This gene has 116 exons, but only
17 are used in each protein, giving a possible 38,000 alternative combinations. A simpler example is the
production of the two human peptide hormones calcitonin and CGRP, which are made from the same
gene. In the thyroid gland the gene is spliced to make calcitonin, while in the hypothalamus the gene is
alternatively spliced to produce CGRP:
The observation that different mRNA splicings are used in different tissues shows that this is under
environmental control, so alternative splicing is another example of epigenetic regulation of gene
expression.
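The idea of building different mature mRNAs from one set of exons can be shown with a toy example. Everything in this sketch is hypothetical: the gene, its four exon sequences and the choice of exons in each isoform are invented for illustration and do not describe the calcitonin/CGRP gene or any other real gene.

```python
def splice(exons, keep):
    """Join the chosen exons, in their original order, into a mature mRNA."""
    return "".join(sequence for name, sequence in exons.items() if name in keep)


# HYPOTHETICAL gene with four exons (made-up sequences), in 5'->3' order.
exons = {"E1": "AUGGCU", "E2": "GGAUCC", "E3": "CUUAAG", "E4": "UGGUAA"}

isoform_a = splice(exons, keep={"E1", "E2", "E4"})  # e.g. spliced in one tissue
isoform_b = splice(exons, keep={"E1", "E3", "E4"})  # e.g. spliced in another tissue

print(isoform_a)
print(isoform_b)
```

The two isoforms share their first and last exons but differ in the middle, so they would be translated into proteins with different amino acid sequences.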
4. Control of expression by non-coding RNA
About 90% of human DNA is transcribed into RNA, but only about 2% of this RNA is mRNA that is
translated into protein. The remaining 98% of RNA just remains as RNA and is called non-coding RNA
(ncRNA), because it does not code for protein.
ncRNAs are almost all involved in the control of gene expression at all points along the gene expression
pathway. There are thousands of different kinds of ncRNA already discovered, and this is probably just the
tip of the iceberg. Some of the main ncRNAs are:
Small interfering RNA (siRNA) and micro RNA (miRNA) are both involved in controlling
gene expression by RNA interference (RNAi). siRNA and miRNA are short lengths of
antisense RNA, with sequences complementary to part of an mRNA molecule. The
short RNA molecules therefore bind to the mRNA by complementary base-pairing,
forming regions of double-stranded RNA. This double-stranded RNA cannot be
translated in a ribosome, and in fact is broken down by RNAse enzymes. RNA
interference inhibits gene expression by destroying mRNA.
Small nuclear RNA (snRNA) combines with proteins to form small nuclear ribonucleoproteins
(snRNPs, pronounced “snurps”), which in turn form the spliceosome complexes we’ve just looked at. The
snRNA molecules bind to the pre-mRNA and so control alternative splicing.
These are all involved in translation. We’ve already come across transfer RNA (tRNA)
and ribosomal RNA (rRNA) in unit 1. Small nucleolar RNA (snoRNA) is found in the
nucleolus, where it regulates the amount and type of rRNA transcription, and therefore
the production of ribosomes. snoRNA therefore controls gene expression at the
translation stage.
Mature mRNA, consisting solely of exons, still contains long sequences at either end that
are not translated into protein. These are called untranslated regions (UTRs). These
UTRs are involved in the binding of the mRNA to ribosomes and in switching on the
translation of that mRNA.
Xist is a gene on the X chromosome that controls the process of X inactivation, where
one of the two X chromosomes in females is inactivated. This happens in all female cells
so that they have the same amount of gene product as males (who only have one X
chromosome). The Xist gene is transcribed to ncRNA but not translated into protein.
The ncRNA then coats one X chromosome, causing it all to condense into a
heterochromatin form called a Barr body, and permanently silencing all its genes.
With so many different functions, non-coding RNA probably has a bigger effect on our characteristics than
coding RNA.
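The complementary base-pairing behind RNA interference can be sketched as a string search. This is a simplified illustration, not a model of how real siRNA or miRNA behaves: it assumes perfect complementarity along the whole short RNA, and the function names and sequences are made up.

```python
def antisense(rna):
    """Return the antisense (reverse-complement) RNA sequence, 5'->3'."""
    pairs = {"A": "U", "U": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(rna))


def binding_site(mrna, short_rna):
    """Return the position in the mRNA where the short RNA can base-pair
    perfectly, or -1 if there is no fully complementary region."""
    return mrna.find(antisense(short_rna))


mrna = "AUGGCUUACGGAUCCUAA"     # made-up mRNA for illustration
sirna = antisense("UACGGAUC")   # a short RNA built to target part of it

print(sirna)
print(binding_site(mrna, sirna))        # position of the double-stranded region
print(binding_site(mrna, "AAAAAAAA"))   # -1: this short RNA cannot bind
```

A short RNA that finds a binding site would form a double-stranded region there, marking that mRNA for breakdown.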
Gene Expression in Embryo Development
We’ve seen how genes can be switched on and off by epigenetic control. Now we’ll look at the most
important example of epigenetic control – embryo development. An adult human has around 10¹⁴ cells;
every one formed by mitosis from the zygote, so every one with the same DNA (ignoring the odd
mutation). Yet these cells are all different – in fact there are over 200 different cell types in an adult human.
Since all the cells have the same genome, the differences must be due entirely to changes in gene
expression, and these changes must arise as the embryo develops. The process of becoming a specialised
cell by controlling gene expression is called cell differentiation.
We looked briefly at human embryo development in unit 3. Here are the stages during the first 5 days:
By the 32-cell stage the cells are already starting to become differentiated and by the blastocyst stage there
are clearly two cell types:
 An outer spherical layer of cells called the trophoblast, which will form the placenta
 An inner mass of cells, which will form the embryo.
From now on the fate of each cell and all its descendants is pre-determined. This cell determination is
found early in the development of all embryos and is triggered by different environmental cues. It may seem
odd that cells inside a tiny embryo have different environments, but there are numerous small but
significant differences within the growing embryo.
 In most animals, cell fate is determined by cytoplasmic determination, caused by chemical gradients in
the egg cell, present before fertilisation. The zygote has two opposite “poles” and, as it divides by
cleavage, the embryo cells experience slightly different chemical environments, which act as epigenetic
environmental cues to trigger differentiation.
 In mammals cell fate is determined by positional determination, caused simply by the position of a cell in
the developing embryo. The main body plan (the dorso-ventral axis) is established by day 16, and cells in
different parts of the embryo transmit chemical signals between neighbours to coordinate
differentiation.
In mammals there is no differentiation or determination up to the 8-cell stage. Indeed 8-cell embryos can
be split, and each cell can act like a zygote and grow to become a complete embryo. This is what happens
naturally with identical twins, and artificially with embryo cloning of farm animals. After the 8-cell stage the
cells start to be determined, and if they are experimentally transplanted to a different part of the embryo
they develop as they would have before the move.
For some organisms we can draw a “fate map”, showing the destiny of every cell. This has been done for
the tiny nematode worm Caenorhabditis elegans, which is only 1 mm long and always has exactly 959 adult
cells (excluding gametes). This shows a simplified fate map (or cell lineage) for the first four divisions of the
nematode worm:
By the fourth cell division (16 cells) the fate of each cell is reasonably determined, and after 10 divisions,
when the adult animal is complete, every cell is irreversibly differentiated. The fate map and the cell
differentiation are exactly the same for every nematode worm.
As an embryo develops more and more genes are switched off in any given cell. The more genes are
switched off the more differentiated the cell is. However, there is a minimum number of housekeeping
genes that every cell needs to express in order to remain alive. There are around 4000 of these (out of a
total genome of 20,000 genes), including genes involved in the cell cycle, metabolism and cell structure.
Fully-differentiated somatic cells will typically express an additional 100-600 genes needed for their
particular functions.
Control of haemoglobin genes
A good example of gene switching during embryo development is haemoglobin. In unit 1 we saw that
haemoglobin is made of four polypeptide chains: 2 α chains and 2 β chains (α₂β₂). In unit 3 we found that
a different haemoglobin is found in embryos, with a higher affinity for oxygen. This fetal haemoglobin is
made of 2 α chains and 2 γ chains (α₂γ₂). This chart shows how the three genes are switched on or off
before and after birth. The switching is under epigenetic control using DNA methylation, histone
acetylation, ncRNAs and transcription factors.
Stem Cells
Stem cells are cells that can divide and differentiate into another kind of cell. Stem cells possess two key
properties:
• Stem cells are potent – they have the potential to differentiate into specialized cell types.
• Stem cells are immortal – they can divide indefinitely.
As we’ve just seen, early embryo cells are stem cells, since they will differentiate into all the different cells
of an adult organism, but as the embryo develops, the cells become less and less potent. There are different
classes of stem cell potency (though in fact the reduction in potency is gradual through embryo
development):
• Totipotent (aka omnipotent) stem cells can differentiate into any cell type, including placental tissue, so
can construct a complete, viable organism. Found in: zygote and very early embryo cells up to 8-cell stage.
• Pluripotent stem cells can differentiate into any cell type except placental tissue. Found in: blastocyst
inner cell mass.
• Multipotent stem cells can differentiate into cells of a closely related family of cells. Found in: particular
cells from most adult tissues (used for growth and repair).
• Unipotent stem cells can only differentiate into cells of their own type. Found in: most adult tissues.
The discovery of stem cells in the 1950s immediately led to suggestions that they could be used clinically in
medical treatments. Many human diseases are caused by the death or destruction of particular cells. The
idea is to transplant tissue grown from stem cells into a patient, where it would grow and replace damaged
tissue. Some potential examples of these cell-based therapies are shown in the table:
Type of cell             Disease that could be treated
Heart muscle cells       Myocardial infarction
Pancreatic β cells       Type I diabetes
Skeletal muscle cells    Muscular dystrophy
Blood cells              …
Nerve cells              Parkinson’s disease, multiple sclerosis, strokes, paralysis due to spinal cord injury
Skin cells               …
Bone cells               …
Cartilage cells          …
Retina cells             Macular degeneration
Traditional sources of Stem Cells
 Embryo stem cells are grown in vitro from the inner cell mass of five-day old blastocysts. These
embryos are created by in vitro fertilisation (IVF) to help infertile couples reproduce, but any “spare”
embryos, no longer needed for reproduction, can be used to create stem cells, with the informed
consent of the donor couple. Since these stem cells are pluripotent, they can be differentiated into any
cell type for clinical use. However, these cells are not the patient’s own cells, so there is a problem of
immune rejection by the patient’s immune system, and, during stem cell therapy, patients have to take
immunosuppressant drugs. In addition, there remains a debate about whether it is ethical to use human
embryos for this purpose, since the embryos are destroyed in the process. Embryonic stem cell
research is banned or tightly restricted in some European countries, although the UK government
does allow such research in the UK.
 Adult stem cells are extracted from certain tissues of the body. It is thought that most organs and
tissues maintain a small number of undifferentiated stem cells, which the body uses to replace and repair
damaged tissue. Tissues where stem cells have been found include the brain, bone marrow, blood
vessels, muscle, skin, heart, gut and liver. Since these stem cells are multipotent, they can differentiate
only into their own family of cells (e.g. blood cells, muscle cells), but not others. The use of these cells
has no ethical issues, and there are also no problems of rejection if the stem cells are taken from the
patient’s own body. However, they are difficult to find and difficult to grow in culture. Blood cell-forming (haematopoietic) stem cells from bone marrow are already being used successfully to treat
leukaemia (cancer of white blood cells); and heart disease and diabetes have been treated in mice. In
addition, human stem cells grown in culture are being used to test the effects of new drugs, without
harming humans or animals.
Neither of these sources of human stem cells is ideal: The embryonic stem cells cause an immune reaction
and have ethical problems, while the adult stem cells are not very effective. Ideally we would like
pluripotent stem cells from the patient’s own tissues. There are now two ways we might do this:
New Sources of Stem Cells
 Somatic Cell cloning (or therapeutic cloning)
This technique involves making a human embryo in the same way that Dolly the sheep was made – by
somatic cell nuclear transfer. The difference with therapeutic cloning is that the embryo is not implanted
into a surrogate mother and never grows into a viable human (such reproductive cloning is illegal).
Instead the embryo is grown in vitro to the blastocyst stage and the inner cell mass, containing
pluripotent stem cells, is removed. These stem cells are genetically identical to the patient, so there is
no immune rejection. However, the creation of a human embryo solely for medical purposes raises
ethical issues.
 Induced Pluripotent Stem (iPS) Cells
This technique was discovered in 2006 by Shinya Yamanaka at Kyoto University in Japan. He took
fibroblast (connective tissue) cells and inserted four genes into them using a retrovirus vector. The four
genes were Oct4, Sox2, Klf4 and c-Myc, which all code for transcription factors. In fully-differentiated
fibroblast cells these genes are switched off, but the new active copies of the genes were expressed to
make the necessary transcription factors. These transcription factors in turn switched on many other
genes and “reprogrammed” the fibroblast cells to become pluripotent stem cells, just like those found in
a blastocyst.
The iPS cells can be grown in culture and differentiated to become any somatic cell, just like embryonic
stem cells. But there is no immune rejection (since the cells are the patient’s own) and there are no ethical
issues (since no embryos are involved). iPS cells are still a very new development, still at the research
stage, but are likely to solve the problems of both adult and embryo stem cells, making the use of
embryo stem cells obsolete.
Biotechnology is defined as: "Any technological application that uses biological systems, living organisms, or
derivatives thereof, to make or modify products or processes for specific use." In practice biotechnology usually
refers to the application of molecular (DNA) biology in the laboratory. Biotechnology has applications in:
 research e.g. human genome project
 medicine e.g. genetically-engineered drugs, gene therapy
 agriculture e.g. improving crops
 industry e.g. manufacturing enzymes, biosensors
A particular aspect of biotechnology is genetic engineering, which means altering the genes in a living
organism to produce a Genetically Modified Organism (GMO) with a new genotype. Genetic engineering
can include inserting a foreign gene from one species into another (forming a transgenic organism); altering
an existing gene so that its product is changed; or changing gene expression.
Techniques of Biotechnology
Modern biotechnology is possible due to the development of techniques from the 1960s onwards, which
arose from our greater understanding of DNA and how it functions, following the discovery of its structure
by Watson and Crick in 1953. This table lists the techniques that we shall look at in detail.
Technique | Purpose
1. Restriction Enzymes | To cut DNA at specific points, making small fragments
2. DNA Ligase | To join DNA fragments together
3. Reverse Transcriptase | To make DNA from mRNA
4. Polymerase Chain Reaction (PCR) | To amplify very small samples of DNA
5. Electrophoresis | To separate fragments of DNA
6. Southern Blot | To look for specific sequences in DNA
7. DNA Sequencing | To read the base sequence of a length of DNA
8. DNA Profiling | To compare different people’s DNA
9. Vectors | To carry DNA into cells
10. Transformation | To deliver a vector into a living cell
11. Marker Genes | To identify cells that have been transformed
12. Knockout Mice | To investigate gene function
1. Restriction Enzymes
These are enzymes that cut DNA at specific sites. They are properly called restriction endonucleases
because they cut phosphodiester bonds in the middle of the polynucleotide chain. Some restriction
enzymes cut straight across both chains, forming blunt ends, but most enzymes make a staggered cut in the
two strands, forming sticky ends.
The cut ends are “sticky” because they have short stretches of single-stranded DNA with complementary
sequences. One sticky end will stick (or anneal) to another sticky end by complementary base pairing (i.e.
with weak hydrogen bonds), if the sticky ends have both been cut with the same restriction enzyme.
Restriction enzymes have highly specific active sites, and will only cut DNA at specific base sequences, 4-8
base pairs long, called recognition sequences. Recognition sequences are usually palindromic, which means
that the sequence and its complement are the same but reversed (e.g. GAATTC has the complement
CTTAAG). Short lengths of DNA cut out by restriction enzymes are called restriction fragments. There
are thousands of different restriction enzymes known, with over a hundred different recognition sequences.
Restriction enzymes are named after the bacterial species they came from, so EcoRI is from E. coli strain R.
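The idea of a palindromic recognition sequence and a staggered cut can be checked with a short Python sketch. The function names and the example DNA sequence are made up for illustration; the only facts taken from outside this page are that EcoRI recognises GAATTC and cuts each strand between the G and the first A.

```python
# Illustrative sketch only: helper names and the test sequence are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_palindromic(site):
    """A recognition sequence is palindromic if it equals its reverse complement."""
    return site == reverse_complement(site)

def digest(dna, site, cut_offset):
    """Cut one strand `cut_offset` bases into every copy of `site`.

    Returns the restriction fragments of that strand, in order.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

dna = "TTAGGAATTCCGATGAATTCAA"          # made-up sequence with two EcoRI sites
print(is_palindromic("GAATTC"))         # True
print(digest(dna, "GAATTC", 1))         # ['TTAGG', 'AATTCCGATG', 'AATTCAA']
```

Each internal fragment starts with AATT, which is the single-stranded overhang that makes the ends "sticky".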
2. DNA Ligase
We came across DNA ligase in unit 1 joining gaps in the DNA backbone following DNA replication. It is
commonly used in genetic engineering to do the reverse of a restriction enzyme, i.e. to join together
complementary restriction fragments. Two restriction fragments can anneal if they have complementary
sticky ends, but only by weak hydrogen bonds, which can quite easily be broken, say by gentle heating. The
backbone is still incomplete. DNA ligase completes the DNA backbone by forming covalent phosphodiester
bonds. Restriction enzymes and DNA ligase can therefore be used together to join lengths of DNA from
different sources.
3. Reverse Transcriptase
The enzyme reverse transcriptase does the reverse of transcription: it synthesises DNA from an RNA
template (so it is an RNA-dependent DNA polymerase enzyme). Reverse transcriptase is produced
naturally by the retroviruses (which we came across in unit 1), and it helps them to invade cells. In
biotechnology reverse transcriptase is used to make an “artificial gene”, called complementary DNA
(cDNA), from an mRNA template as shown in this diagram:
Mature mRNA (without introns) is extracted from cells and mixed with reverse transcriptase and DNA
nucleotides. A new strand of DNA is synthesised, complementary to the mRNA strand, forming a double-stranded DNA/RNA “heteroduplex” molecule. The two strands of this molecule are then separated and
reverse transcriptase now synthesises a second DNA strand, complementary to the first. The result is a
normal double-stranded DNA molecule called cDNA. Note that the cDNA molecule is much shorter than
the original gene in the organism’s DNA (typically <50% the size), since the cDNA doesn’t have introns. In
addition, cDNA only has the exons for one particular alternative splicing; whereas the original DNA has a
variety of other exons available. Indeed mRNA extracted from different tissues will form different cDNAs,
even if they come from the same gene! cDNA is therefore an “artificial gene”.
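The two synthesis steps can be sketched in Python as plain string operations. This is a simplified model with made-up names and a made-up mRNA; it ignores primers and enzyme details, and simply shows that the second DNA strand ends up with the same sequence as the mRNA, with T in place of U.

```python
# Simplified model of cDNA synthesis; names and the mRNA are hypothetical.
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}
DNA_TO_DNA = {"A": "T", "T": "A", "C": "G", "G": "C"}

def first_strand(mrna):
    """DNA strand complementary to the mRNA (both written 5' to 3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

def second_strand(dna):
    """DNA strand complementary to the first DNA strand (written 5' to 3')."""
    return "".join(DNA_TO_DNA[base] for base in reversed(dna))

mrna = "AUGGCCUAA"                  # made-up mature mRNA, introns already removed
strand1 = first_strand(mrna)        # forms the DNA/RNA heteroduplex with the mRNA
strand2 = second_strand(strand1)    # pairs with strand1 to give double-stranded cDNA

print(strand1)   # TTAGGCCAT
print(strand2)   # ATGGCCTAA
```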
Reverse transcriptase has several uses in biotechnology:
 It makes genes without introns. Eukaryotic genes with many introns are often too big to be
incorporated into a bacterial plasmid, and bacteria are unable to splice out the introns anyway. The
artificial cDNA gene is made from mRNA that already has the introns spliced out of it, so it can be
expressed in bacteria.
 It contains the exact sequence for one specific protein, without needing any particular exon splicing.
 It makes a stable copy of a gene, since DNA is less readily broken down by enzymes than RNA.
 It makes genes easier to find. There are some 20 000 genes in the human genome, and finding the DNA
fragment containing one gene out of this many is a very difficult task. However a given cell only
expresses a few genes, so only makes a few different kinds of mRNA molecule. For example the β cells
of the pancreas make insulin, so make lots of mRNA molecules coding for insulin. This mRNA can be
isolated from these cells and used to make cDNA of the insulin gene.
4. Polymerase Chain Reaction (PCR)
The polymerase chain reaction is a technique used to copy (or amplify) DNA samples as small as a single
molecule. It was developed in 1983 by Kary Mullis, for which discovery he won a Nobel Prize in 1993. PCR
is simply DNA replication in a test tube. If a length of DNA is mixed with the four nucleotides (A, T, C and
G) and the enzyme DNA polymerase in a test tube, then the DNA will be replicated many times. The
details are shown in this diagram:
1. Start with a sample of the DNA to be amplified, and add the four nucleotides and the enzyme DNA polymerase.
2. Heat to 95°C for two minutes to break the hydrogen bonds between the base pairs and separate the
two strands of DNA. Normally (in vivo) the DNA double helix would be separated by an enzyme.
3. Add primers to the mixture and cool to 40°C. Primers are short lengths of single-stranded DNA (about
20 bp long) that anneal (i.e. form complementary base pairs) to complementary sequences on the two
DNA strands forming short lengths of double-stranded DNA. The DNA is cooled to 40°C to allow the
hydrogen bonds to form. There are two reasons for using primers:
 The enzyme DNA polymerase can only extend existing double stranded DNA.
 Only the DNA between the primer sequences is replicated, so by choosing appropriate primers you
can ensure that only a specific target sequence is copied. The choice of primers is therefore very
important to select the DNA to be amplified.
4. The DNA polymerase enzyme can now build new strands alongside each old strand to make double-stranded DNA. Each new nucleotide binds to the old strand by complementary base pairing and is
joined to the growing chain by a phosphodiester bond. The enzyme used in PCR is derived from the
thermophilic bacterium Thermus aquaticus, which grows naturally in hot springs at a temperature of
90°C, so it is not denatured by the high temperatures in step 2. Its optimum temperature is about 72°C,
so the mixture is heated to this temperature for a few minutes to allow replication to take place as
quickly as possible.
5. Each original DNA molecule has now been replicated to form two molecules. The cycle is repeated
from step 2 and each time the number of DNA molecules doubles. This is why it is called a chain
reaction, since the number of molecules increases exponentially, like an explosive chain reaction. After n
cycles, there is an amplification factor of 2ⁿ. Typically PCR is run for 20-30 cycles.
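The doubling arithmetic is easy to check in Python. This sketch assumes an idealised reaction in which every molecule is copied in every cycle (real reactions are less efficient); the function name is made up.

```python
def pcr_copies(start_molecules, cycles):
    """Number of DNA molecules after `cycles` rounds of perfect doubling."""
    return start_molecules * 2 ** cycles

for n in (1, 10, 20, 30):
    print(n, pcr_copies(1, n))
# 1 2
# 10 1024
# 20 1048576
# 30 1073741824
```

So a single molecule gives about a million copies after 20 cycles and about a billion after 30.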
PCR can be completely automated, so in a few hours a tiny sample of DNA can be amplified millions of
times with little effort. The product can be used for further studies, such as cloning, electrophoresis, or
gene probes. Because PCR can use such small samples it can be used in forensic medicine (with DNA taken
from samples of blood, hair or semen), and can even be used to copy DNA from mummified human bodies,
extinct woolly mammoths, or from an insect that's been encased in amber since the Jurassic period. One
problem of PCR is having a pure enough sample of DNA to start with. Any contaminant DNA will also be
amplified, and this can cause problems, for example in court cases.
5. Electrophoresis
This is a form of chromatography used to separate different pieces of DNA on the basis of their length. It
might typically be used to separate restriction fragments. The DNA samples are placed into wells at one
end of a thin slab of gel made of agarose or polyacrylamide, and covered in a buffer solution. An electric
current is passed through the gel. Each nucleotide in a molecule of DNA contains a negatively-charged
phosphate group, so DNA is attracted to the anode (the positive electrode). The molecules have to diffuse
through the gel, and longer lengths of DNA are retarded by the gel so move more slowly than shorter
lengths. So the smaller the length of the DNA molecule, the further down the gel it will move in a given
time. At the end of the run the current is turned off.
Unfortunately the DNA on the gel cannot be seen, so it must be visualised. There are two common
methods for doing this:
 The DNA can be stained with a coloured chemical such as azure A (which stains the DNA bands blue),
or a fluorescent molecule such as ethidium bromide (which emits coloured light when the finished gel is
illuminated with invisible ultraviolet light).
 The DNA samples at the beginning can be radiolabelled with a radioactive isotope and then
visualised using autoradiography. Ordinary photographic film (sometimes called X-ray film) is placed on
top of the finished gel in the dark for a few hours, and the radiation from any radioactive DNA on the
gel exposes the film. When the film is developed the position of the DNA shows up as dark bands on
the film. This method is extremely sensitive.
6. Southern Blot
A Southern blot is used to detect a specific target sequence in samples of DNA using a DNA probe. A
DNA probe is simply a short length of single-stranded DNA (100-1000 nucleotides long) with a radioactive
or fluorescent “label” attached. The probe will anneal to any fragments of DNA containing the
complementary sequence, forming regions of double-stranded hybrid DNA (the process is also called
hybridisation). The hybrid DNA fragments are now labelled and can be identified.
The Southern blot method is:
1. DNA is extracted from the source cells (e.g. different patients or
different species) and amplified by PCR to make enough DNA for
the hybridisation. The DNA samples are then digested by a
restriction enzyme into many small fragments, and the fragments
separated on an electrophoresis gel.
2. The gel is placed in an alkali solution, which breaks the hydrogen
bonds between the DNA bases, causing the strands to separate. A
thin sheet of nylon or nitrocellulose is placed on top of the gel.
The alkali solution is then drawn up through the gel to a stack of
paper towels by capillary action, bringing the DNA with it. The
DNA sticks to the nylon membrane.
3. The nylon sheet is separated from the gel, placed in a plastic bag
containing a solution of labelled probes and mixed thoroughly. The
probes will anneal to DNA fragments in the nylon membrane that
have a complementary sequence, forming hybrid DNA molecules
stuck to the nylon sheet (but again they can’t be seen).
4. The location of the hybrid DNA can be visualised by different
methods, depending on the label used. If the probes were
radioactive then they can be visualised by autoradiography, and the
probes show up as bands on photographic film. If the probes were
fluorescent then they can be visualised as bands of light when
illuminated by ultraviolet light.
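The hybridisation step can be pictured with a small Python sketch. It is a simplified model: each fragment is represented by one of its two strands, and the probe counts as annealing if its complementary sequence occurs on either strand. The function names, fragments and probe are all made up for illustration.

```python
# Simplified model of probe hybridisation; all names and sequences are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def hybridises(probe, fragment_strand):
    """True if the probe can base-pair with either strand of the fragment.

    Both separated strands are stuck to the membrane, so the probe's target
    may lie on the strand given here or on its partner strand.
    """
    target = reverse_complement(probe)
    return target in fragment_strand or probe in fragment_strand

fragments = ["TTAGG", "AATTCCGATG", "AATTCAA"]   # bands on the gel
probe = "CATCGG"                                 # labelled single-stranded probe

labelled = [f for f in fragments if hybridises(probe, f)]
print(labelled)   # ['AATTCCGATG']
```

Only the band containing the complementary sequence picks up the label, so only that band shows up on the film or under ultraviolet light.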
The Southern blot was invented by Edwin Southern at Edinburgh University in 1975 (as a biological joke,
similar blotting techniques using RNA or protein are called northern and western blots respectively).
7. DNA Sequencing
This means reading the base sequence of a length of DNA. DNA sequencing is based on a beautifully
elegant technique developed by Fred Sanger in Cambridge in 1975, and now called Sanger Sequencing.
1. Label 4 test tubes A, T, C and G. Into every test tube add:
 a sample of the DNA to be sequenced, up to 400
nucleotides long, so long molecules must be broken up into
shorter fragments first using restriction enzymes. The
sample must contain many millions of individual molecules,
so may need to be amplified by PCR first.
 the four DNA nucleotides
 the enzyme DNA polymerase.
2. In each test tube add:
 a dideoxy nucleotide that cannot form a phosphodiester
bond and so stops further synthesis of DNA. Tube A has
dideoxy A (A*), tube T has dideoxy T (T*), and so on. The
dideoxy nucleotides are present at about 1% of the
concentration of the normal nucleotides.
 a fluorescent primer to allow the DNA polymerase to work
and to visualise the DNA later. A different primer is used in
each tube: tube A has a green primer, tube T red, tube C
blue and tube G yellow.
3. Let the DNA polymerase synthesise many copies of the DNA
sample. About 1% of the time, at random, a dideoxy nucleotide
will be added to the growing chain and synthesis of that chain
will then stop. A range of DNA molecules will be synthesised
ranging from full length to very short. The important point is
that in tube A, all the fragments will stop at an A nucleotide. In
tube T, all the fragments will stop at a T nucleotide, and so on.
4. The contents of the four tubes are mixed together and all the
different DNA molecules are then separated using capillary
electrophoresis, which gives good separation in a narrow tube
gel (1m long by 0.1mm diameter). The DNA molecules move
down the gel, smallest first. As they move they pass through a
laser beam, which causes the fluorescent labels to emit light of
their particular colour, depending on the terminal base. The
coloured light is detected by a sensor and the colour recorded
on a computer, which converts the sequence of colours into a
sequence of bases.
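The logic of reading a sequence from terminated fragments can be sketched in Python. This is a simplified model, not a simulation of the chemistry: instead of random 1% terminations it lists every possible stopping point, and it ignores the length of the primer. The tube colours are those given above; the function names and the example strand are made up.

```python
# Simplified model of Sanger sequencing; names and the strand are hypothetical.
COLOURS = {"A": "green", "T": "red", "C": "blue", "G": "yellow"}   # tube colours
COLOUR_TO_BASE = {colour: base for base, colour in COLOURS.items()}

def terminated_lengths(new_strand, tube_base):
    """Lengths of the chains made in one tube: each stops at a dideoxy `tube_base`."""
    return [i + 1 for i, base in enumerate(new_strand) if base == tube_base]

def read_sequence(new_strand):
    """Mix the four tubes, separate by length, and read the colours in order."""
    detections = []
    for tube_base, colour in COLOURS.items():
        for length in terminated_lengths(new_strand, tube_base):
            detections.append((length, colour))
    detections.sort()   # the shortest fragments pass the laser first
    return "".join(COLOUR_TO_BASE[colour] for _, colour in detections)

new_strand = "GATTACAGC"                    # strand built by DNA polymerase
print(terminated_lengths(new_strand, "A"))  # [2, 5, 7]
print(read_sequence(new_strand))            # GATTACAGC
```

Because every length from 1 upwards appears in exactly one tube, the order of colours passing the detector spells out the sequence of the newly synthesised strand.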
Sanger’s original method was very slow, but the technology improves every year and modern “next
generation” machines can sequence one million bases per second, equivalent to 1 human genome per hour!
It is now possible for individuals to have their personal genomes sequenced.
Why Sequence?
1. To find out more about DNA and how it works, for example, non-coding DNA, alternative splicing, etc.
2. To compare sequences between species and so deduce phylogenetic relationships (unit 5).
3. To compare sequences between individuals and so deduce family relationships.
4. To identify different alleles at a gene locus to explore human diversity and history.
5. To identify alleles linked with diseases and so plan personalised medical treatments.
Once a gene sequence is known the amino acid sequence of
the protein that the DNA codes for can also be
determined, using the genetic code table. Computers can
identify start and stop codons and so work out where genes
are and what they do. Although the entire human genome
was sequenced in 2003, we still haven’t identified all the
genes, but it is estimated that there are about 20,000. This
is a surprisingly small number, but, as we have seen, these
20,000 genes code for more than 500,000 different
proteins, due to alternative splicing.
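As a rough illustration of how a computer reads a protein sequence from DNA, this Python sketch builds the standard genetic code table and translates from the first start codon (ATG) to the first in-frame stop codon. The function name and the example sequence are made up, and real gene-finding software is far more sophisticated (it must deal with introns, both strands and all reading frames).

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# Amino acids use one-letter codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def find_protein(dna):
    """Translate from the first ATG to the first in-frame stop codon."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(find_protein("CCATGGCCAAGTGGTAAGC"))   # MAKW
```

Here ATG GCC AAG TGG TAA is read as Met-Ala-Lys-Trp followed by a stop.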
Genetically-determined conditions
Our genes determine all our characteristics, including our state of health, and there is hope that a better
understanding of human DNA will lead to new understanding of disease. There are a few conditions that
are known to be caused by an allele of a single gene, such as haemophilia, muscular dystrophy, cystic
fibrosis and Huntington’s disease. It is hoped that these single-gene disorders become treatable using gene
therapy, although none has yet been successful. However, these single-gene disorders are rare, and most
diseases, like cancer and heart disease, are caused by many alleles interacting.
This makes them very difficult to understand and treat, but as the genomes of more and more individuals
are being sequenced, we can start to correlate certain patterns of alleles with certain conditions. This is
likely to lead to more personalised medicine, where treatments can be tailored to patients’ genotypes, not
just their symptoms.
8. DNA Profiling
DNA profiling (or genetic fingerprinting) is used to distinguish between DNA samples from different people
and is widely used in forensics and paternity testing. 99.9% of human DNA has exactly the same sequence
in every person, including the protein-coding DNA and all the non-coding DNA that makes the important
ncRNAs. However, there is enough variation in the remaining 0.1% (over 100 000 base pairs) to be used to
distinguish one individual from another.
Scientists have discovered a number of regions of non-coding DNA that contain simple repetitive
sequences called STRs (short tandem repeats), for example the sequence GATAGATAGATAGATAGATA
contains five repeats of the 4-base sequence GATA. Everyone has these STR sequences in the same loci,
but different people have different numbers of repeats. So these STR regions are known as Variable
Number Tandem Repeat (VNTR) sequences.
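Counting the repeats at an STR locus is a simple pattern-matching job, sketched here in Python. The function name and the second example sequence are made up for illustration.

```python
import re

def count_tandem_repeats(dna, unit):
    """Size of the longest run of back-to-back copies of `unit` in `dna`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)

print(count_tandem_repeats("GATAGATAGATAGATAGATA", "GATA"))   # 5
print(count_tandem_repeats("CCGATAGATAGATATT", "GATA"))       # 3
```

Two people with 5 and 3 repeats at this locus would therefore have fragments differing in length by 8 bases.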
There are typically around 10 different variants, or alleles, at each VNTR locus (say 3-13 repeats), so each
allele is shared by around 10% of the population. This obviously isn’t good enough for a unique
identification, but forensic scientists usually look at 17 different VNTR loci on 17 different chromosomes.
Since there are two versions of each chromosome in a diploid cell (the maternal and paternal
chromosomes), there are actually 34 loci tested. If each locus has 10 different variants this gives a total of
10³⁴ different combinations. And since there are fewer than 10¹⁰ humans, it is extraordinarily unlikely that any
two individuals will have exactly the same pattern of VNTRs. This diagram shows two VNTR loci of two
individuals. They have the same number of repeats at locus 2, but different numbers at locus 1.
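The size of these numbers can be checked with a few lines of Python. The sketch uses the round figures from the text (10 alleles per locus, 34 loci) and assumes the loci are inherited independently; the variable names are made up.

```python
alleles_per_locus = 10
loci_tested = 34

combinations = alleles_per_locus ** loci_tested   # possible VNTR patterns
population_upper_bound = 10 ** 10                 # more than the number of humans

# How many possible patterns there are for every person alive
patterns_per_person = combinations // population_upper_bound

print(combinations == 10 ** 34)            # True
print(patterns_per_person == 10 ** 24)     # True
```

There are at least a million million million million possible patterns for each living person, which is why a chance match between two unrelated people is so unlikely.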
If we could isolate just the VNTR loci and run them on an electrophoresis gel, they would separate by
length of DNA (i.e. number of repeats), and so give a different pattern of bands on the gel. This is the basis
of the DNA profiling method, invented by Sir Alec Jeffreys at Leicester University in 1984.
1. Cells are collected by taking a buccal swab or from evidence at a crime scene, and DNA is extracted
from the cells. As little as 100 pg of DNA is needed.
2. The DNA is amplified by PCR. Remember, in PCR only the DNA between the primers is amplified. So
17 pairs of PCR primers are carefully constructed to match target sequences at known VNTR regions of
17 chromosomes. An 18th locus on the sex chromosomes is also targeted to reveal the individual’s sex.
The primers are also fluorescent and are made so that each primer fluoresces a different colour. This
diagram shows two loci being targeted by the PCR primers.
3. All the amplified DNA fragments are separated by capillary electrophoresis, just as in DNA sequencing.
The light emitted by the fluorescent primers is detected by a sensor and recorded on a computer, which
gives a print-out of the bands. This is the DNA profile.
This shows a traditional DNA fingerprint gel, showing bands of DNA fragments from many different loci.
Samples of the same DNA (or from identical twins) give the same banding pattern, but samples from different
people give different banding patterns. These gels can be quite difficult to interpret.
This shows a modern computer-generated DNA profile, showing homologous pairs of bands for each of 16
loci, including X and Y. The labels show the numbers of repeats represented by each band, so the profile is
very easy to interpret. All DNA profiles are stored on a UK National DNA Database.
DNA profiling is used
 in forensic science, to match DNA samples collected from a crime (e.g. from sperm, blood, hair, skin)
with that of suspects. DNA evidence is very powerful, but convictions can be questioned if there is any
suspicion of DNA contamination.
 to determine family relationships, e.g. paternity testing. Since children inherit half their DNA from each
parent, then one band from each locus should match each parent.
 to prevent undesirable inbreeding during breeding programs in farms and zoos (unit 5).
 to measure genetic diversity within a population.
 to establish phylogenetic relationships between species, including extinct ones, using DNA extracted
from archaeological remains.
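The paternity-testing rule (one band at each locus from each parent) can be written as a short Python check. The profiles below are invented for illustration, with each locus recorded as the pair of repeat numbers on the two homologous chromosomes; the function name is also made up.

```python
# Hypothetical profiles: locus name -> (repeats on one chromosome, repeats on the other)
def consistent_with_parents(child, mother, father):
    """True if, at every locus, one child allele matches the mother and the other the father."""
    for locus, (a, b) in child.items():
        from_mother_then_father = a in mother[locus] and b in father[locus]
        from_father_then_mother = b in mother[locus] and a in father[locus]
        if not (from_mother_then_father or from_father_then_mother):
            return False
    return True

child     = {"locus 1": (7, 9),  "locus 2": (12, 12)}
mother    = {"locus 1": (7, 11), "locus 2": (12, 5)}
father    = {"locus 1": (9, 9),  "locus 2": (12, 8)}
other_man = {"locus 1": (6, 8),  "locus 2": (12, 10)}

print(consistent_with_parents(child, mother, father))      # True
print(consistent_with_parents(child, mother, other_man))   # False
```

A single mismatching locus is enough to rule out the second man in this simplified model, which ignores the small chance of a new mutation.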
9. Vectors
Now we turn to genetically-modifying living cells. The next three techniques are concerned with
transferring genes (DNA) from the test tube into cells, so that the genes can be replicated and expressed in
those cells. To do this we first need a vector.
In genetic engineering a vector is a length of DNA that carries the gene we want into a host cell. A vector
is needed because a length of DNA containing a gene on its own won’t actually do anything inside a host
cell. Since it is not part of the cell’s normal genome it won’t be replicated when the cell divides, it won’t be
expressed, and in fact it will probably be broken down pretty quickly. A vector gets round these problems
by having these properties:
 It is big enough to hold the gene we want (plus a few others), but not too big.
 It is circular (or more accurately a closed loop), so that it is less likely to be broken down (particularly
in prokaryotic cells where DNA is always circular).
 It contains control sequences, such as a replication origin and a transcription promoter, so that the gene
will be replicated, expressed, or incorporated into the cell’s normal genome.
 It contains marker genes, so that cells containing the vector can be identified.
Common vectors include bacterial plasmids, viruses and yeast artificial chromosomes. Plasmids are the
most common kind of vector, so we shall look at how they are used in some detail. Plasmids are short
circular bits of DNA found naturally in bacterial cells. A typical plasmid contains 3-5 genes and there are
usually around 10 copies of a plasmid in a bacterial cell. Plasmids are copied separately from the main
bacterial DNA when the cell divides, so the plasmid genes are passed on to all daughter cells. They are also
used naturally for exchange of genes between bacterial cells, so bacterial cells will readily take up a plasmid.
Because they are so small, plasmids are easy to handle in a test tube, and foreign genes can quite easily be
incorporated into them using restriction enzymes and DNA ligase.
One of the first plasmids to be used was the R-plasmid. This
plasmid contains a replication origin, several recognition
sequences for different restriction enzymes (with names like
PstI and EcoRI), and two marker genes, which in this case
confer resistance to antibiotics. The R plasmid gets its name
from these resistance genes.
The diagram below shows how a gene can be incorporated into a plasmid using restriction and ligase enzymes.
1. A restriction enzyme (PstI here) is used to cut the gene from the donor DNA, with sticky ends.
2. The same restriction enzyme cuts the plasmid in the middle of one of the marker genes (we’ll see why
this is useful later).
3. The gene and plasmid are mixed in a test tube and they anneal because they were cut with the same
restriction enzyme and have the same sticky ends.
4. The fragments are joined covalently by DNA ligase to form a hybrid vector (in other words a mixture
or hybrid of bacterial and foreign DNA).
5. Several other products are also formed: some plasmids will simply re-anneal with themselves to re-form
the original plasmid, and some DNA fragments will join together to form chains or circles. These
different products cannot easily be separated, but it doesn’t matter, as the marker genes can be used
later to identify the correct hybrid vector.
This technique takes place entirely in test tubes; there are no cells involved. So the next step is to insert
our modified DNA (the hybrid or recombinant vector) into a living cell.
10. Transformation
Transformation means inserting new DNA (usually as a vector) into a living cell (called a host cell), which is
thus genetically modified, or transformed. The terms transfection and transduction are also used for similar
processes. A transformed cell can replicate and express the genes in the new DNA. DNA is a large
molecule that does not readily cross cell membranes, so the membranes must be made permeable in some
way. There are different ways of doing this depending on the type of host cell.
 Heat Shock. Bacterial and animal cells in culture can be made to take up DNA from their surroundings
by suddenly raising the temperature by about 40°C.
 Electroporation. The most efficient method of delivering genes to bacterial cells is to use a high-voltage pulse, which temporarily disrupts the membrane and allows the plasmid to enter the cell.
 Gene Gun. Tiny gold or tungsten particles coated with DNA can be fired at plant cells using a
compressed air gun. The gene gun gets round the problem of inserting DNA through the tough cell wall
to modify plant cells. The high-velocity particles penetrate the cell wall and deliver the DNA to the plant
cell nucleus.
 Plant Infection. The bacterium Agrobacterium tumefaciens is a pathogen of many dicot plants, where its
infection causes crown gall disease. A. tumefaciens possesses a plasmid, called the Ti plasmid, part of which is
integrated into the plant cells' chromosomal DNA during infection. Scientists have exploited this infection mechanism to
genetically modify plants. First the new gene is inserted into the Ti plasmid in A. tumefaciens cells, which
is grown in culture. Then the target plant cells are infected with transformed A. tumefaciens cells, which
insert the new gene into some plant cells. Finally, whole new plants are grown from these modified cells
by micropropagation.
Animals and Humans
 Micro-Injection. To transform individual cells, such as fertilised animal egg cells, the DNA is injected
directly into the nucleus using an incredibly fine micro-pipette.
 Liposomes. Human cells in vivo can be transformed by DNA encased in liposomes, which fuse with the
cell membrane, delivering the DNA into the cell.
 Viruses. Human cells in vivo can be infected by genetically-engineered viruses, which deliver the DNA
into host cells. The viruses must first be made safe, so they can’t cause disease.
Most of these transformation techniques have a very low success rate (≪1%), so we need to be able to
identify those few cells that have taken up the foreign DNA and been transformed. This is where the
plasmid’s marker genes are used.
11. Marker Genes
Marker genes (or reporter genes) are used to find which cells have actually taken up the hybrid vector.
Following transformation, there are (at least) these four possible outcomes:
Vectors contain two different marker genes, which are needed to identify the required cells.
The first marker gene distinguishes cells that have taken up a plasmid from those that haven’t.
Cells with the plasmid now have a gene for resistance to an antibiotic such as tetracycline (not for the
antibiotic), so if all the cells are grown on a medium containing tetracycline, all the normal untransformed
cells (>99.99%) are killed. Only the few transformed cells will survive, and these can then be grown and
cloned on another plate.
The second marker gene distinguishes cells that have taken up the hybrid plasmid from those that
have taken up the original plasmid. The trick here is that the foreign DNA is inserted inside the second
marker gene, so cells with the hybrid plasmid cannot make that gene product. Different genes are used for
this second marker:
 The marker gene can be a gene for resistance to another antibiotic, such as ampicillin. Cells with the
hybrid vector are not resistant to ampicillin. Since this means killing the cells we want, the ampicillin test
is done on a replica plate. Colonies that grow on the first (tetracycline) plate but not on the replica
(ampicillin) plate are the ones we want.
 The marker gene can be a gene for the enzyme β-galactosidase (lactase). This enzyme turns a white
substrate in the agar plate into a blue product. So colonies of cells with the original plasmid turn blue,
while those with the hybrid plasmid remain white, and can easily be identified.
 The marker gene can be a gene for green fluorescent protein (GFP). Colonies of cells with the original
plasmid fluoresce green in UV light, while those with the hybrid plasmid do not fluoresce, and can easily
be identified.
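As a rough illustration of the selection logic above, the sketch below classifies colonies from the two-antibiotic (tetracycline then ampicillin) replica-plate test. The function name, colony labels and plate results are hypothetical, made up for this example only.

```python
def classify_colony(grows_on_tetracycline: bool, grows_on_ampicillin: bool) -> str:
    """Classify a colony from its growth on the two antibiotic plates.

    Assumes the first marker is tetracycline resistance and the foreign DNA
    was inserted inside the ampicillin-resistance gene (the second marker).
    """
    if not grows_on_tetracycline:
        return "no plasmid"        # untransformed cells are killed by tetracycline
    if grows_on_ampicillin:
        return "original plasmid"  # second marker gene is still intact
    return "hybrid plasmid"        # second marker disrupted by the inserted DNA


# Hypothetical replica-plate results: colony -> (tetracycline plate, ampicillin plate)
plates = {"A": (True, True), "B": (True, False), "C": (False, False)}

wanted = [name for name, (tet, amp) in plates.items()
          if classify_colony(tet, amp) == "hybrid plasmid"]
print(wanted)
```

Only colony B grows on the tetracycline plate but not on the ampicillin replica, so it is the one that would be picked from the original plate.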
Gene Cloning
Gene cloning simply means making multiple copies of a piece of DNA. It is a necessary step in just about any
aspect of molecular biotechnology, such as genetic engineering, genome sequencing or genetic
fingerprinting. There are two different ways to clone DNA:
 In vitro gene cloning uses PCR to clone DNA in the test tube.
 In vivo gene cloning uses restriction enzymes, vectors, DNA ligase, transformation of bacterial cells,
marker genes and growth cultures to clone DNA inside bacterial cells.
The two techniques have different advantages and disadvantages:
in vitro cloning (using PCR) | in vivo cloning (using living cells)
Simple, automated technique, which can be completed in a few hours | Complex, multi-step process, needing several days to complete
Very sensitive, can clone a single molecule | Large amounts of original DNA needed
Can use DNA from different kinds of source, including degraded DNA from crime scenes or archaeological sources | Needs intact, pure DNA
Clones DNA molecules up to 1 kbp long | Clones DNA molecules up to 2 Mbp long
High error rate, since no error-correction | Low error rate due to cellular error-correction
DNA is made in the test tube, so cannot be expressed directly | DNA is made in cells, so can be expressed easily
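The sensitivity of in vitro cloning comes from the fact that each PCR cycle can, at best, double the number of DNA molecules. A minimal sketch of that arithmetic follows; the function name and the efficiency parameter are assumptions for illustration, and real reactions fall short of perfect doubling.

```python
def pcr_copies(initial_molecules: int, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate the number of DNA copies after a given number of PCR cycles.

    efficiency = 1.0 means perfect doubling every cycle (an idealisation);
    lower values model cycles in which only some molecules are copied.
    """
    return initial_molecules * (1 + efficiency) ** cycles


# Starting from a single molecule, 30 ideal cycles give 2**30 copies.
print(pcr_copies(1, 30))
```

This is why a single starting molecule is enough for PCR, whereas in vivo cloning needs much more starting material.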
12. Knockout Mice
We have now identified many thousands of genes in humans and other organisms, but in many cases we
don’t actually know what the genes do! The genes are identified by examining entire genome sequences and
looking for indicators of genes, such as start and stop codons, promoter sequences, etc. Computers then
compare gene sequences with the sequences of known genes from other organisms (such as Drosophila) to
try to identify their function, but similar-looking genes may have different functions in different organisms.
Of the 20,000 genes in the human genome, we know the approximate function of about 15,000.
Some of the 15,000 identified human genes.
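The gene-finding step described above (looking for start and stop codons) can be sketched in a few lines of Python. This is a deliberately minimal illustration, not how real gene-prediction software works: it scans only the forward strand, ignores introns and promoters, and the function name, the min_codons threshold and the example sequence are all assumptions made for this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) positions of open reading frames on the forward strand.

    An ORF here is an ATG start codon followed, in the same reading frame,
    by a stop codon. min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # the three forward reading frames
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                j = i + 3
                # read codon by codon until a stop codon or the end of the sequence
                while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 <= len(dna):           # a stop codon was found in frame
                    if (j - i) // 3 >= min_codons:
                        orfs.append((i, j + 3))
                    i = j + 3                   # carry on after this ORF
                    continue
            i += 3
    return orfs


# Made-up example sequence containing one short ORF
sequence = "CCATGAAATTTGGGTAACC"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

A candidate ORF found this way is only a hint that a gene may be present; as the text says, working out what the gene actually does needs further evidence.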
One way to identify the real physiological function of a gene is to make a knockout mouse, where one gene
is inactivated. Mice are fairly closely related to humans, compared to other laboratory animals like fruit flies
and nematode worms, and they share most of their genes with us. They can also be bred and observed
easily in a lab, and there are fewer ethical problems in performing genetic experiments on mice compared
to humans. By comparing the appearance, physiology, cells and biochemistry of the knockout mouse with a
normal one the function of the missing gene can often be deduced.
This technique is often used to study human diseases such as cancer, obesity, heart disease, diabetes,
arthritis, substance abuse, anxiety, aging and Parkinson’s disease. For example, knockout mice with an
inactive CFTR gene develop cystic fibrosis like humans do, and have been used to study the disease and to
trial potential gene therapy cures, where new DNA is introduced to cells in vivo to try to cure the disease.
It is more ethical to perform early trials to test the safety of gene therapy on mice than on humans.
Knockout mice are made by a technique called gene targeting. The first knockout mouse was created by
Capecchi, Evans and Smithies in 1989, for which they were awarded the Nobel Prize in 2007. The gene
targeting technique is described below.
Gene Targeting Method
1. We start with the mouse genome, which has already been sequenced, and identify a gene sequence
from it (though we don’t yet know what the gene does). We then make a targeting vector, which has
the same sequence as the gene and its flanking sequences, but with some DNA removed so the gene
won’t work, and a marker gene added.
2. Isolate pluripotent stem cells from a brown mouse embryo in a Petri dish, add the targeting vector and
let it enter the stem cells using electroporation (see pxx).
3. The targeting vector will line up with the original gene in the stem cell’s DNA because it has almost the
same sequence, and DNA will be swapped between the vector and the host DNA in a process called
homologous recombination. This process is like crossing over in meiosis, but it happens during the
normal cell cycle between sister chromatids. So gene targeting hijacks a normal homologous
recombination process to replace the normal gene with a modified (knockout) gene.
4. The stem cells continue to grow in culture. The few cells that were transformed are identified using the
marker gene and the others are killed.
5. The transformed cells are injected into the inner cell mass of a blastocyst embryo from a different
(white) mouse using a micropipette, so the embryo contains a mixture of normal cells from the white
mouse and transformed cells from the brown mouse. The embryo is implanted into a surrogate mother.
6. The mother gives birth to chimera mice – that is mice with cells
from two different sources. The chimeras are obvious because
they have distinct white and brown patches of fur.
7. We want mice with 100% transformed cells, not chimeras. We do
this by breeding from the chimeras. Some chimeras will, by
chance, have gonad tissue developed from transformed stem cells,
so they will produce gametes with modified DNA. From their
offspring we can select homozygous transformed mice. These are
the knockout mice.
These steps are shown on the next page
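The breeding in step 7 can be illustrated with a simple Punnett-square calculation. This sketch assumes (it is not stated in the steps above) that the transformed stem cells carry the knockout on only one of the two homologous chromosomes, so they are heterozygous. Here '+' stands for the normal allele and '-' for the knockout allele, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes from a one-gene cross.

    Each parent is a two-character genotype such as '+-'; every pairing of
    one allele from each parent is taken to be equally likely.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted((allele1, allele2)))  # '-+' and '+-' count as the same
        offspring[genotype] += 1
    return offspring


# Chimera with transformed (assumed heterozygous) germ cells x normal mouse
print(cross("+-", "++"))

# Two heterozygous offspring from that cross bred together
print(cross("+-", "+-"))
```

Under these assumptions, half the offspring of the first cross carry one knockout allele, and about a quarter of the offspring of the second cross are homozygous ('--') knockout mice.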
Genetically Modified Organisms
We have looked at some of the many techniques used in biotechnology. We’ll now turn to some
applications of these techniques. The applications involve altering the genes in a living organism to produce
a Genetically Modified Organism (GMO) with a new genotype. The GMO is designed to benefit humans in
some way. If a foreign gene is copied from one species into another the GMO is called a transgenic
organism, but remember that not all GMOs are transgenic: the genetic modification might just alter an
existing gene so that its product is changed or change its gene expression. We’ll consider the applications in
three groups.
 Gene Products. Using genetically modified organisms (usually microbes) to produce chemicals
(usually proteins) for medical or industrial applications.
 New Phenotypes. Using gene technology to alter the characteristics of organisms (usually farm
animals or crops).
 Gene Therapy. Using gene technology on humans to treat a disease.
Gene Products
The biggest and most successful kind of genetic engineering is the production of gene products. These
products are of medical, agricultural or commercial value to humans. This table shows a few of the
examples of genetically engineered products that are already available.
Product and use
 human hormone used to treat diabetes
 human growth hormone, used to treat dwarfism
 human hormone
 bovine growth hormone, used to increase milk yield of cows
 Factor VIII: human blood clotting factor, used to treat haemophiliacs
 anti-blood clotting agent used in surgery
 used in reconstructive surgery
 hepatitis B antigen, for vaccination
 research and clinical use
 enzyme inhibitor used to treat cystic fibrosis and emphysema
 enzyme used to treat Pompe’s disease
 enzyme used to treat CF
 enzyme used in manufacture of cheese
 enzyme used in paper production
 biodegradable plastic
Host Organism
 bacteria, yeast
 yeast, plants
 goats, plants
 sheep, yeast
 bacteria/yeast
The products are mostly proteins, which are produced directly when a gene is expressed, but they can also
be non-protein products produced by genetically-engineered enzymes. The basic idea is to transfer a gene
(often human) to another host organism (usually a microbe) so that it will make the gene product quickly,
cheaply and ethically. It is also possible to make “designer proteins” by altering gene sequences, but while
this is a useful research tool, there are no commercial applications yet.
New Phenotypes
This means altering the characteristics of organisms by genetic engineering. The organisms are generally
commercially-important crops or farm animals, and the object is to improve their quality in some way. This
can be seen as a high-tech version of selective breeding, which has been used by humans to alter and
improve their crops and animals for at least 10 000 years. This table gives an idea of what is being done.
long life tomatoes: There are two well-known projects, both affecting the gene for the enzyme
polygalacturonase (PG), a pectinase that softens fruits as they ripen. Tomatoes that make less PG ripen
more slowly and retain more flavour. The American “Flavr Savr” tomato used antisense technology to
silence the gene, while the British Zeneca tomato disrupted the gene. Both were successful and were on
sale for a few years, but neither is produced any more.

Genes for various powerful protein toxins have been transferred from the bacterium Bacillus thuringiensis
to crop plants including maize, rice and potatoes. These Bt toxins are thousands of times more powerful
than chemical insecticides, and since they are built in to the crops, insecticide spraying (which is
non-specific and damages the environment) is unnecessary.

The gene for a virus coat protein has been cloned and inserted into tobacco, potato and tomato plants.
The coat protein seems to “immunise” the plants, which are much more resistant to viral attack.

The gene for an enzyme that synthesises a chemical toxic to weevils has been transferred from Bacillus
bacteria to the Rhizobium bacteria that live in the root nodules of legume plants. These root nodules are
now resistant to attack by weevils.

This is a huge project, which aims to transfer the 15-or-so genes required for nitrogen fixation from the
nitrogen-fixing bacteria Rhizobium into cereals and other crop plants. These crops would then be able to
fix their own atmospheric nitrogen and would not need any fertiliser. However, the process is extremely
complex, and the project is nowhere near success.

crop improvement: Proteins in some crop plants, including wheat, are often deficient in essential amino
acids (which is why vegetarians have to watch their diet so carefully), so the protein genes are being
altered to improve their composition for human consumption.

tick-resistant sheep: The gene for the enzyme chitinase, which kills ticks by digesting their exoskeletons,
has been transferred from plants to sheep. These sheep should be immune to tick parasites, and may not
need sheep dip.

fast-growing fish: A number of fish species, including salmon, trout and carp, have been given a gene
from another fish (the ocean pout) which activates the fish’s own growth hormone gene so that they grow
larger and more quickly. Salmon grow to 30 times their normal mass at 10 times the normal rate.

cleaning microbes: Genes for enzymes that digest many different hydrocarbons found in crude oil have
been transferred to Pseudomonas bacteria so that they can clean up oil spills.
Genetically Modified Soya Beans
Soya beans (Glycine max, called soybean in the USA) are leguminous bean plants originally from China. The
beans are high in protein (since they’re legumes) and low in fat, so have become popular in human diets and
are now the most widely-cultivated legume worldwide. Soya beans have three main commercial uses:
 Soya bean lipids are extracted from the beans and used as vegetable oil for frying and baking. The oil is
also used in the manufacture of paint and cosmetics.
 Soya bean meal is the cell mass remaining after the oil is removed. It has a high protein content and is
used as livestock feed and dog food.
 The beans can be eaten raw or used to make soy sauce, soya flour, soya milk, tofu and miso.
The biggest producer of soya beans is the United States and now 93% of the US soya bean crop is
genetically modified. There are two reasons for genetic modification:
1. Herbicide resistance (Roundup-Ready Soybeans). The herbicide glyphosate (sold as Roundup by
Monsanto) kills plants by inhibiting an enzyme unique to plants. Now soya beans have been genetically modified with the gene for an enzyme from the bacterium Agrobacterium tumefaciens, which is not
affected by glyphosate. These soya beans are therefore resistant to the herbicide. Fields can safely be
sprayed with this herbicide, which will kill all weeds, but not the GM soya bean. This helps to improve
soya bean yield but means continued use of agrochemicals, so is controversial.
2. Modified lipid composition (Vistive Gold). Normal soya bean oil already has a good mix of lipids
with a low concentration of unhealthy saturated fats and a high concentration of healthy polyunsaturated
fats. However the polyunsaturated fats are susceptible to oxidation on heating in a fryer or on long-term
storage, which makes the oil rancid. Monsanto has made a GM soya bean with fewer polyunsaturated
fats, giving its oil a longer shelf life. Two enzymes in the biosynthetic pathway of fatty acids have been
knocked out to alter the fatty acid composition.
Evaluating Biotechnology
The whole point of creating genetically-modified organisms is to benefit humans, and the benefits are
usually fairly obvious, but nevertheless there has been some vocal opposition to GMOs. Opposition is often
based on ethical, moral or social grounds, such as harm to animals or the environment, though there can
also be more practical issues, such as distrust of large corporations.
 Medicines and drugs can be produced safely in large quantities from microbes rather than from
slaughtered animals. These medicines benefit humans and can spare animal suffering as well.
 Agricultural productivity can be improved while using fewer pesticides or fertilisers, so helping the
environment. GM crops can grow on previously unsuitable soil or in previously unsuitable climates.
 GM crops can improve the nutrition and health of millions of people by improving the nutritional quality
of their staple crops.
 Risks to the modified organism. Genetic modification of an organism may have unforeseen genetic
effects on that organism and its offspring. These genetic effects could include metabolic diseases or
cancer, and would be particularly important in vertebrate animals, which have a nervous system and so
are capable of suffering. The research process may also harm animals.
 Transfer to other organisms. Genes transferred into GMOs could be transferred again into other
organisms, by natural accidents. These natural accidents could include horizontal gene transmission in
bacteria, cross-species pollination in plants, and viral transfer. This could result in a weed being resistant
to a herbicide, or a pathogenic bacterium being resistant to an antibiotic. To avoid transfer via cross-pollination, genes can now be inserted into chloroplast DNA, which is not found in pollen.
 Risks to ecosystems. A GMO may have an unforeseen effect on its food web, affecting other
organisms. Many ecosystems are delicately balanced, and a GMO could change that balance.
 Risk to biodiversity. GMOs may add to the loss of genetic biodiversity already occurring due to
selective breeding.
 Risks to human societies. There could be unexpected and complicated social and economic
consequences from using GMOs. For example if GM bananas could be grown in temperate countries,
that would be disastrous for the economies of those Caribbean countries that rely on banana exports.
 Risks to local farmers. Developing GMOs is expensive, and the ownership of the technology remains
with the large multi-national corporations. This means the benefits may not be available to farmers in
third world countries who need them most.