A Level Biology Unit 10

Heckmondwike Grammar School Biology Department
Edexcel A-Level Biology B

Contents

Mendelian Inheritance
    one gene
    two genes
Chi squared test
Evolution and the Gene Pool
Epigenetic Control of Gene Expression
Stem Cells
Biotechnology
GM Crops

These notes may be used freely by biology students and teachers. I would be interested to hear of any comments and corrections.
Neil C Millar ([email protected]) June 2016

Y12
Unit 1 Biochemistry
Unit 2 Cells
Unit 3 Reproduction
Unit 4 Transport
Unit 5 Biodiversity
Unit 6 Ecology

Y13
Unit 7 Metabolism
Unit 8 Microbes
Unit 9 Control Systems
Unit 10 Genetics

HGS Biology A-level notes NCM 09/16

Biology Unit 10 Genetics and Biotechnology Specification

10.01 Mendelian Inheritance
• Be able to construct genetic crosses.
• The terms genotype, phenotype, homozygote, heterozygote, dominance, recessive, codominance and multiple alleles.
• Be able to construct pedigree diagrams.
• Sex linkage on the X chromosome, including haemophilia in humans.
• The inheritance of two non-interacting unlinked genes.
• Autosomal linkage results from the presence of alleles on the same chromosome, including black/grey body and long/vestigial wing in Drosophila.
• The results of crosses can be explained by the events of meiosis.
• The processes of random assortment and crossing over during meiosis give rise to new combinations of alleles in gametes.
• How random fertilisation during sexual reproduction brings about genetic variation.
• Be able to use chi squared tests to test the significance of the difference between observed and expected results.

10.02 Evolution and the Gene Pool
• The Hardy-Weinberg equation can be used to monitor changes in the allele frequencies in a population.
• Mutations are the source of new variations.
• Sometimes changes in allele frequencies can be the result of chance and not selection, including genetic drift.
• Allele frequencies can be influenced by population bottlenecks and the founder effect.
• Selection pressures acting on the gene pool change allele frequencies in the population, including stabilising selection (maintaining continuity in a population) and disruptive selection (leading to changes or speciation).

10.05 Epigenetic Control of Gene Expression
• Gene expression can be changed by epigenetic modification, including histone modification and DNA methylation.
• Transcription factors are proteins that bind to DNA.
• The role of transcription factors in regulating gene expression.
• How post-transcription modification of mRNA in eukaryotic cells (RNA splicing) can result in different products from a single gene.
• Non-coding RNA.
• Epigenetic modification is important in ensuring cell differentiation.
• How epigenetic modifications can result in totipotent stem cells in the embryo developing into pluripotent cells in the blastocyst and finally into fully differentiated somatic cells.

10.06 Stem Cells
• What is meant by a stem cell, including the differences between totipotent, pluripotent and multipotent stem cells.
• Pluripotent stem cells from embryos provide opportunities to develop new medical advances, although there are ethical considerations.
• How differentiated fibroblasts can be reprogrammed to form induced pluripotent stem cells (iPS cells) by the artificial introduction of named genes.
• Why the use of iPS stem cells may be less problematic than the use of embryonic stem cells.

10.07 Biotechnology
• How recombinant DNA can be produced, including the role of restriction endonucleases and DNA ligase.
• How PCR can be used to amplify DNA samples.
• What is meant by the term genome.
• Gel electrophoresis, DNA sequencing and bioinformatics (unit 5).
• How gene sequencing can be used to predict the amino acid sequence of proteins and possible links to genetically determined conditions.
• How DNA profiling can be used in forensic science to identify criminals and to test paternity.
• How recombinant DNA can be inserted into other cells, and the use of various vectors such as viruses and gene guns.
• How antibiotic resistance marker genes and replica plating are used to identify recombinant cells.
• How ‘knockout’ mice can be used as a valuable animal model to investigate gene function.

10.08 GM Crops
• The process of genetic modification of soya beans and how it has been used to improve production, including altering the balance of fatty acids to prevent oxidation of soya products.
• Why the widespread use of genetic modification of major commercial crops and other transgenic processes have caused public debate of their advantages and disadvantages.

Mendelian Inheritance

In unit 1 we studied molecular genetics – the study of DNA. Here we are concerned with the study of inheritance of characteristics at the whole organism level. This is also known as classical genetics or Mendelian inheritance, since it was pioneered by Gregor Mendel.

Gregor Mendel

Mendel (1822-1884) was an Austrian monk at Brno monastery. He was a keen scientist and gardener, and studied at Vienna University, where he learnt mathematics.
He investigated inheritance in pea plants and published his results in 1866. They were ignored at the time, but were rediscovered in 1900, and Mendel is now recognised as the “Father of Genetics”. His experiments succeeded where others had failed because:
• Mendel investigated simple qualitative characteristics (or traits), such as flower colour or seed shape, and he varied one trait at a time. Previous investigators had tried to study many complex quantitative traits, such as human height or intelligence, but this is a rare instance where qualitative results are more informative than quantitative ones, and Mendel knew this.
• Mendel used an organism whose sexual reproduction he could easily control by carefully pollinating stigmas with pollen using a brush. Peas can also be self-pollinated, allowing self-crosses to be performed. This is not possible with animals.
• Mendel repeated his crosses hundreds of times and applied statistical tests to his results.
• Mendel studied two generations of peas at a time.

A typical experiment looked like this:

Mendel made several conclusions from these experiments:
1. There are no mixed colours (e.g. pink), so this disproved the widely-held blending theories of inheritance that characteristics gradually mixed over time.
2. A characteristic can disappear for a generation, but then reappear the following generation, looking exactly the same. So a characteristic can be present but hidden.
3. The outward appearance (the phenotype) is not necessarily the same as the inherited factors (the genotype). For example the P1 red plants are not the same as the F1 red plants.
4. One form of a characteristic can mask the other. The two forms are called dominant and recessive respectively.
5. The F2 ratio is always close to 3:1 (or 75%:25%). Mendel was able to explain this by supposing that each individual has two versions of each inherited factor, one received from each parent.
We’ll look at his logic in a minute.

Mendel’s factors are now called genes and we know they are found on chromosomes. The two alternative forms are called alleles and are found on homologous pairs of chromosomes (the maternal and paternal). So in the example above we would say that there is a gene for flower colour and its two alleles are “red” and “white”. One allele comes from each parent, and the two alleles are found on the same position (or locus) on the homologous chromosomes. If the homologous chromosomes have the same alleles at a locus this is homozygous, and if they have different alleles this is heterozygous. The chromosomes on the right are homozygous for the seed shape genes but heterozygous for the flower colour gene. The term “pure-breeding” really means homozygous. You should revise genes and chromosomes from unit 1.

With two alleles there are three possible combinations of alleles (or genotypes) and two possible appearances (or phenotypes):

Genotype    Name                    Phenotype
RR          homozygous dominant     red
rr          homozygous recessive    white
Rr, rR      heterozygous            red

The dominant allele is defined as the allele that is expressed in the heterozygous state, while the recessive allele is defined as the allele that is only expressed in the homozygous state (or is not expressed in the heterozygous state).

The Monohybrid Cross

A simple breeding experiment involving just a single characteristic, like Mendel’s experiment, is called a monohybrid cross. We can now explain Mendel’s monohybrid cross in detail. At fertilisation any male gamete can fertilise any female gamete at random. The possible results of a fertilisation can most easily be worked out using a Punnett Square as shown in the diagram. Each of the possible outcomes has an equal chance of happening, so this explains the 3:1 ratio observed by Mendel.
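The Punnett square logic can also be sketched in a few lines of Python. This is only an illustration, not part of Mendel's method: the function name `monohybrid_cross` and the use of two-letter strings such as "Rr" for genotypes are assumptions made for this example. It pairs every gamete from one heterozygous parent with every gamete from the other and counts the outcomes.

```python
from collections import Counter
from itertools import product


def monohybrid_cross(parent1, parent2, dominant="R"):
    """Return genotype and phenotype counts for a cross such as 'Rr' x 'Rr'."""
    genotypes = Counter()
    # Each parent contributes one allele per gamete; pair every gamete
    # from one parent with every gamete from the other (the Punnett square).
    for allele1, allele2 in product(parent1, parent2):
        # Sorting puts the capital letter first, so 'rR' is counted as 'Rr'.
        genotypes["".join(sorted(allele1 + allele2))] += 1

    phenotypes = Counter()
    for genotype, count in genotypes.items():
        # One copy of the dominant allele is enough to give red flowers.
        phenotypes["red" if dominant in genotype else "white"] += count
    return genotypes, phenotypes


genotypes, phenotypes = monohybrid_cross("Rr", "Rr")
print(dict(genotypes))   # counts of RR, Rr and rr among the four outcomes
print(dict(phenotypes))  # red and white counts, in the ratio 3:1
```

The four equally likely outcomes are one RR, two Rr and one rr, which gives three red plants for every white one.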
Mendel’s First Law (the principle of segregation)

This result is summarised in Mendel’s First Law, which states that individuals carry two discrete hereditary factors (alleles) controlling each characteristic. The two alleles segregate (or separate) during meiosis, so each gamete carries only one of the two alleles. Today we can explain Mendel’s first law by the behaviour of homologous chromosomes during meiosis:

The Monohybrid Test Cross

You can see an individual’s phenotype, but you can’t see its genotype. If an individual shows the recessive trait (white flowers in the above example) then they must be homozygous recessive as it’s the only genotype that will give that phenotype. If they show the dominant trait then they could be homozygous dominant or heterozygous. You can find out which by performing a test cross with a pure-breeding homozygous recessive. This gives two possible results:
• If the offspring all show the dominant trait then the parent must be homozygous dominant.
• If the offspring are a mixture of phenotypes in a 1:1 ratio, then the parent must be heterozygous.

Pedigrees

The results of a genetic cross can also be shown as a pedigree diagram, like a family tree. These pedigrees show the inheritance of a particular characteristic through a family, and are most often used for humans (particularly for the inheritance of a genetic disease), but are also used for commercial animals like racing horses or pedigree dogs. In these diagrams, males are shown as squares, females as circles, and the phenotypes as different colours. Each individual is usually named or numbered, so that they can be referred to. Every pedigree should have a key to indicate what the colours represent. The pedigrees only show phenotypes, because that is what is known about the individuals, but by studying a pedigree diagram, many of the genotypes can be deduced.
The most useful feature to look for is a cross where two parents with one phenotype have at least one offspring of a different type, such as 7, 8 and 14 in the diagram above. There can only be one explanation: the parents must be heterozygotes, showing the dominant phenotype, and the different child must be homozygous recessive, showing the recessive phenotype. We now know that the yellow allele is recessive and the red is dominant, and with this knowledge, many of the genotypes can be filled in.

How does Genotype control Phenotype?

Mendel never knew this, but we can explain in detail the relation between an individual’s genes and its appearance. A gene was originally defined as an inherited factor that controls a characteristic, but we now know that a gene is also a length of DNA that codes for a protein (see unit 2). It is the proteins that actually control phenotype in their many roles as enzymes, pumps, transporters, motors, hormones, or structural elements. For example the flower colour gene actually codes for an enzyme that converts a white pigment into a red pigment:
• The dominant allele is the normal (or “wild-type”) form of the gene that codes for functioning enzyme, which therefore makes red-coloured flowers.
• The recessive allele is a mutation of the gene. This mutated gene codes for non-functional enzyme, so the red pigment can’t be made, and the flower remains white. Almost any mutation in a gene will result in an inactive gene product (often an enzyme), since there are far more ways of making an inactive protein than a working one.

Sometimes the gene actually codes for a protein apparently unrelated to the phenotype. For example the gene for seed shape in peas (round or wrinkled) actually codes for an enzyme that synthesises starch! The functional enzyme makes lots of starch and the seeds are full and rounded, while the non-functional enzyme makes less starch so the seeds wrinkle up.
The gene responsible for all the symptoms of cystic fibrosis actually codes for a chloride ion channel protein. A “tallness” gene may be a control gene that regulates the release of growth hormone.

This table shows why the allele that codes for a functional protein is usually dominant over an allele that codes for a non-functional protein. In a heterozygous cell, some functional protein will be made, and this is usually enough to have the desired effect. In particular, enzyme reactions are not usually limited by the amount of enzyme, so a smaller amount in heterozygotes will have little effect on phenotype.

Genotype                     Gene product              Phenotype
homozygous dominant (RR)     all functional enzyme     red
homozygous recessive (rr)    no functional enzyme      white
heterozygous (Rr)            some functional enzyme    red

Sex Determination

In unit 2 we came across the sex chromosomes (X and Y). Since these are non-homologous they are called heterosomes, while the other 22 pairs are called autosomes. In humans the sex chromosomes are homologous in females (XX) and non-homologous in males (XY), though in other species it is the other way round. The inheritance of the X and Y chromosomes can be demonstrated using a monohybrid cross:

This shows that there will always be a 1:1 ratio of males to females. Note that female gametes (eggs) always contain a single X chromosome, while the male gametes (sperm) can contain a single X or a single Y chromosome. Sex is therefore determined solely by the sperm. There are techniques for separating X and Y sperm, and this is used for planned sex determination in farm animals using artificial insemination (AI).

In humans it is the Y chromosome that actually determines sex: all embryos start developing as females, but if the sex-determining “SRY” gene on the Y chromosome is expressed, male hormones are produced in the embryo, causing the development of male characteristics.
In the absence of male hormones, the embryo continues to develop as a female. The X chromosome is not involved in sex determination.

Although females have two X chromosomes, only one of them is actually used in each cell. The other X chromosome is completely inactivated in a process called X inactivation. The inactivated X chromosome is chosen at random in each cell and is condensed into a structure called a Barr body, which cannot be expressed. X inactivation happens in all female cells so that they have the same amount of gene product as males (who only have one X chromosome).

Sex-Linked Characteristics

What else do the X and Y chromosomes do? As we saw in unit 2, the Y chromosome is very small, containing very few genes, and doesn’t seem to do anything other than determine sex. The X chromosome, on the other hand, is large and contains over a thousand genes that have nothing to do with sex, coding for important products such as rhodopsin, blood clotting proteins and muscle proteins. Females have two copies of each gene on the X chromosome (i.e. they’re diploid), but males only have one copy of each gene on the X chromosome (i.e. they’re haploid). Males always inherit their X chromosome from their mothers, and always pass on their X chromosome to their daughters. This means that the inheritance of these genes is different for males and females, so they are called sex-linked characteristics.

Eye Colour in Fruit Flies

The first example of a sex-linked gene discovered was eye colour in the fruit fly Drosophila melanogaster. This tiny fly has been a favourite organism for genetics research for over 100 years because:
• The flies are small and easily reared in the laboratory.
• They have a short two-week life cycle.
• Each female lays hundreds of fertilized eggs, giving large populations of offspring suitable for statistical analysis.
• Although the flies are only 2 mm long, their characteristics can be observed quite easily under magnification.
• They only have four pairs of chromosomes.

Drosophila can have red or white eyes, with red (R) being dominant to white (r). When a red-eyed female is crossed with a white-eyed male, the offspring all have red eyes, as expected for a dominant characteristic (left cross below). However, when the opposite cross was done (a white-eyed female with a red-eyed male) all the male offspring had white eyes (right cross below). This surprising result was not expected for a simple dominant characteristic, but it could be explained if the gene for eye colour was located on the X chromosome. Note that in these crosses the alleles are written in the form Xᴿ (red eyes) and Xʳ (white eyes) to show that they are on the X chromosome.

Haemophilia

Another well-known example of a sex-linked characteristic is haemophilia in humans. We saw in unit 4 how blood clotting is initiated by a cascade of protein clotting factors culminating in the production of fibrin, which binds blood cells together to form a clot. The genes for two of the protein factors (factors VIII and IX) are on the X chromosome, and a mutation in either of these stops the blood clotting, causing haemophilia. The disease affects roughly 1 in 5,000 male births, but almost no females. The diagram below shows a cross between a normal male and a heterozygous (carrier) female, using the symbols Xᴴ for the dominant allele (normal blood-clotting factors) and Xʰ for the recessive allele (non-functional blood-clotting factors, haemophilia).

Females with haemophilia are very rare since they would have to be homozygous recessive and so inherit a haemophilia allele from their father. Until recently boys with haemophilia had a low life expectancy due to uncontrollable internal bleeding following minor accidents, so rarely had children.
However, haemophiliac girls are becoming more common, as improved treatments for the disease have allowed more haemophiliac males to survive to adulthood and become parents.

Haemophilia has passed through the royal families of Europe due to an original mutation in Queen Victoria:

Other examples of sex-linked characteristics include red-green colour-blindness and muscular dystrophy.

Codominance

In most situations (and all of Mendel’s experiments) one allele is completely dominant over the other, so there are just two phenotypes. But in some cases there are three phenotypes, because neither allele is completely dominant over the other, so the heterozygous genotype has its own phenotype. This situation is called codominance or incomplete dominance. Since there is no dominance we can no longer use capital and small letters to indicate the alleles, so a more formal system is used. The gene is represented by a letter and the different alleles by superscripts to the gene letter.

Flower Colour in Snapdragons

One example of codominance is flower colour in snapdragon plants. The flower colour gene C has two alleles: Cᴿ (red) and Cᵂ (white). The three genotypes and their phenotypes are:

Genotype                 Gene product              Phenotype
homozygous (CᴿCᴿ)        all functional enzyme     red
homozygous (CᵂCᵂ)        no functional enzyme      white
heterozygous (CᴿCᵂ)      some functional enzyme    pink

In this case the enzyme is probably less active, so a smaller amount of enzyme will make significantly less product, and this leads to the third phenotype. A monohybrid cross looks like this:

Note that codominance is not an example of “blending inheritance” since the original phenotypes reappear in the second generation. The genotypes are not blended and they still obey Mendel’s law of segregation. It is only the phenotype that appears to blend in the heterozygotes.
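The reappearance of the parental colours in the second generation can be checked with a short Python sketch. The labels are assumptions made for illustration: "R" stands for the red allele and "W" for the white allele of the snapdragon flower colour gene, and the helper name `cross` is hypothetical. The first cross is red × white, and the second crosses two of the resulting pink plants.

```python
from collections import Counter
from itertools import product

# Phenotype of each genotype, with the two alleles in alphabetical order.
PHENOTYPE = {("R", "R"): "red", ("R", "W"): "pink", ("W", "W"): "white"}


def cross(parent1, parent2):
    """Count offspring phenotypes from two parents given as allele pairs."""
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that ("W", "R") and ("R", "W") are the same genotype.
        counts[PHENOTYPE[tuple(sorted((allele1, allele2)))]] += 1
    return counts


f1 = cross(("R", "R"), ("W", "W"))  # red x white
f2 = cross(("R", "W"), ("R", "W"))  # pink x pink
print(dict(f1))  # every outcome is pink
print(dict(f2))  # red, pink and white in the ratio 1:2:1
```

The F1 plants are all pink, but red and white both return in the F2, so the alleles themselves were never blended.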
Sickle Cell Anaemia

Another example of codominance is sickle cell haemoglobin in humans. The gene for haemoglobin (or more accurately for the polypeptide globin – see unit 1) “Hb” has two codominant alleles: Hbᴬ (the normal gene) and Hbˢ (the mutated gene). The mutation in the Hbˢ gene is a single base substitution (T→A), changing one amino acid out of 146 in the polypeptide chain. This amino acid binds to other haemoglobin molecules, so the molecules link together to form long chains, distorting the red blood cells into sickle shapes. There are three phenotypes:
• HbᴬHbᴬ: Normal. All haemoglobin molecules are normal, with normal disk-shaped red blood cells.
• HbˢHbˢ: Sickle cell anaemia. All haemoglobin molecules are abnormal, so most red blood cells are sickle-shaped. These sickled red blood cells are less flexible than normal cells, so can block capillaries and arterioles, causing cell death and severe pain. Sickle cells are also destroyed by the spleen faster than they can be made, so not enough oxygen can be carried in the blood (anaemia). Without treatment this phenotype is fatal in early childhood, though modern medical intervention can extend life expectancy to 50.
• HbᴬHbˢ: Sickle cell trait. 50% of the haemoglobin molecules in every red blood cell are normal, and 50% abnormal. Long chains do not form, so the red blood cells are normal and carry oxygen normally. However these red blood cells do sickle when infected by the malaria parasite, so infected cells are destroyed by the spleen. This phenotype therefore confers resistance to malaria, and is common in areas of the world where malaria is endemic.

Other examples of codominance include coat colour in cattle (red/white/roan), and coat colour in cats (black/orange/tortoiseshell).

Lethal Alleles

An unusual effect of codominance is found in Manx cats, which have no tails.
If two Manx cats are crossed the litter has a ratio of 2 Manx kittens to 1 normal (long-tailed) kitten. This unexpected ratio is explained in this genetic diagram:

The gene S actually controls the development of the embryo cat’s spine. It has two codominant alleles: Sᴺ (normal spine) and Sᴬ (abnormal, short spine). The three phenotypes are:
• SᴺSᴺ: Normal. Normal spine, long tail.
• SᴺSᴬ: Manx cat. Last few vertebrae absent, so no tail.
• SᴬSᴬ: Lethal. Spine doesn’t develop, so this genotype is fatal early in development. The embryo doesn’t develop and is absorbed by the mother, so there is no evidence for its existence.

Many human genes also have lethal alleles, because many genes are so essential for life that a mutation in these genes is fatal. If the lethal allele is expressed early in embryo development then the fertilised egg may not develop enough to start a pregnancy, or the embryo may miscarry. If the lethal allele is expressed later in life, then we call it a genetic disease, such as muscular dystrophy or cystic fibrosis.

Multiple Alleles

An individual has two copies of each gene, so can only have two alleles of any gene, but there can be more than two alleles of a gene in a population. An example of this is blood group in humans. The red blood cell antigen is coded for by the gene I (for isohaemagglutinogen), which has three alleles Iᴬ, Iᴮ and Iᵒ. (They are written this way to show that they are alleles of the same gene.) Iᴬ and Iᴮ are codominant, while Iᵒ is recessive. The six possible genotypes and four phenotypes are:

Phenotype (blood group)    Genotypes       Antigens on red blood cells    Plasma antibodies
A                          IᴬIᴬ, IᴬIᵒ      A                              anti-B
B                          IᴮIᴮ, IᴮIᵒ      B                              anti-A
AB                         IᴬIᴮ            A and B                        none
O                          IᵒIᵒ            none                           anti-A and anti-B

The cross below shows how all four blood groups can arise from a cross between a group A and a group B parent.
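As an illustration, the same cross can be sketched in Python, assuming both parents are heterozygous (a group A parent carrying the A and O alleles, and a group B parent carrying the B and O alleles). The single-letter labels "A", "B" and "o" and the helper names `blood_group` and `cross` are assumptions made for this example only.

```python
from collections import Counter
from itertools import product


def blood_group(genotype):
    """Return the ABO blood group for a two-allele genotype such as 'Ao'."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"  # A and B are codominant, so both antigens are made
    if "A" in alleles:
        return "A"   # o is recessive to A
    if "B" in alleles:
        return "B"   # o is recessive to B
    return "O"       # only the homozygous recessive gives group O


def cross(parent1, parent2):
    """Count offspring blood groups over all gamete combinations."""
    return Counter(blood_group(a + b) for a, b in product(parent1, parent2))


offspring = cross("Ao", "Bo")
print(dict(offspring))  # one each of AB, A, B and O
```

Each of the four blood groups appears once among the four equally likely outcomes, so the expected ratio is 1:1:1:1.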
Other examples of multiple alleles are: eye colour in fruit flies, with over 100 alleles, and human leukocyte antigen (HLA) genes, with 47 known alleles.

Two Genes

So far we have looked at the inheritance of a single gene, but Mendel also studied the inheritance of two different characteristics at a time in pea plants, so we’ll look at one of his dihybrid crosses. The two traits are seed shape and seed colour. Round seeds (R) are dominant to wrinkled seeds (r), and yellow seeds (Y) are dominant to green seeds (y). With these two genes there are 4 possible phenotypes (note that it’s often useful to use a shorthand where _ can mean a dominant or recessive allele):

Genotypes                 Shorthand    Phenotype
RRYY, RRYy, RrYY, RrYy    R_Y_         round yellow
RRyy, Rryy                R_yy         round green
rrYY, rrYy                rrY_         wrinkled yellow
rryy                      rryy         wrinkled green

Mendel’s dihybrid cross looked like this:

All 4 possible phenotypes are produced, but always in the ratio 9:3:3:1.

Mendel’s Second Law (the principle of independent assortment)

This result is summarised in Mendel’s Second Law, which states that alleles of different genes are inherited independently; in other words the inheritance of one gene does not affect the inheritance of the other. Today we can explain Mendel’s second law by the independent assortment of bivalents during meiosis:

The dihybrid cross can be written out in a genetic diagram, just like all the monohybrid crosses:

The gametes have one allele of each gene, and that allele can end up with either allele of the other gene. This gives 4 different gametes for the second generation, and 16 possible genotype outcomes.

The Dihybrid Test Cross

There are 4 genotypes that all give the same round yellow phenotype. Just like we saw with the monohybrid cross, these four genotypes can be distinguished by crossing with a double recessive phenotype.
This gives 4 different results:

Original genotype    Result of test cross
RRYY                 all round yellow
RRYy                 1 round yellow : 1 round green
RrYY                 1 round yellow : 1 wrinkled yellow
RrYy                 1 round yellow : 1 round green : 1 wrinkled yellow : 1 wrinkled green

Two Linked Genes

The law of independent assortment only applies when the two genes are on separate chromosomes. When the two genes are on the same chromosome they are not independent but remain linked together during meiosis, so are inherited together. This is called autosomal linkage. Linkage was first discovered in the fruit fly Drosophila. The two characteristics investigated were:
• Body colour: grey body (G) dominant to black body (g)
• Wing length: long wings (L) dominant to short, or vestigial, wings (l)

A cross was carried out between a grey-body, long-wing heterozygote (GgLl) and a black-body, vestigial-wing homozygote (ggll), with the following results:

Phenotype          Offspring    Percentage
grey long          965          42.0%
black vestigial    944          41.0%
grey vestigial     185          8.0%
black long         206          9.0%
total              2300         100%

Most of the offspring (83%) displayed the original two phenotypes – grey long or black vestigial – in the ratio 1:1. This is because the two genes (for body colour and wing length) are on the same chromosome, so the alleles stay linked together during meiosis. Sometimes the linkage is broken due to crossing-over during meiosis, so the alleles can mix up, and this genetic recombination accounts for the rest of the offspring (17%) with the other two phenotypes (grey vestigial or black long in the ratio 1:1).
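The percentages quoted for this cross can be reproduced from the raw counts. The following Python sketch is only an arithmetic check; the variable names are chosen for illustration.

```python
# Offspring counts from the Drosophila test cross GgLl x ggll
counts = {
    "grey long": 965,
    "black vestigial": 944,
    "grey vestigial": 185,
    "black long": 206,
}
# The two phenotypes that can only arise through crossing over
recombinant_classes = ("grey vestigial", "black long")

total = sum(counts.values())
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

recombinants = sum(counts[name] for name in recombinant_classes)
recombinant_pct = 100 * recombinants / total
parental_pct = 100 - recombinant_pct

print(total)                      # total number of offspring
print(percentages)                # percentage of each phenotype
print(round(recombinant_pct, 1))  # percentage showing recombination
print(round(parental_pct, 1))     # percentage showing the parental phenotypes
```

With these counts the recombinant offspring make up 391 of 2300, or 17%, leaving 83% with the parental phenotypes.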
We can explain this in a genetic diagram, this time showing the genes on the homologous chromosomes:

Chromosome maps

The number of offspring with genetic recombination can be calculated as a frequency, called the crossover value (COV):

$$\text{crossover value} = \frac{\text{number of offspring showing recombination}}{\text{total number of offspring}} \times 100$$

In the Drosophila cross just described the crossover value for the body colour gene and wing size gene is:

$$\text{COV} = \frac{185 + 206}{2300} \times 100 = 17\%$$

But with other pairs of genes the crossover values will be quite different. This is because the crossover value for a pair of genes on a chromosome depends on how far apart the two gene loci are on the chromosome. The further apart they are, the more likely it is for a chiasma to form between the two loci during meiosis, so the higher the chances of recombination and the higher the crossover value. It was quickly realised that crossover values between pairs of genes could be used to build up a map of the location of those genes on a chromosome (the gene loci). This diagram shows a simple chromosome map for chromosome 2 of D. melanogaster. The numbers are crossover values, which correspond to distance along the chromosome.

Effect of Gene Linkage

• Alleles on the same chromosome are often inherited together (they’re linked), contrary to Mendel’s second law.
• However, sometimes alleles on the same chromosome can be recombined and inherited independently, due to crossing over in meiosis.
• There are no fixed genetic ratios, and the frequency of recombinant phenotypes depends on how far apart the two genes are on the chromosome.

Analysing Crosses with the Chi-squared (χ²) Test

The results of genetic crosses are an example of categoric data, i.e. observations using words rather than numbers (e.g. colours, shapes, species).
If a large number of observations are made then the number of observations of each category can be counted to give frequencies. The chi-squared (χ²) test compares the frequencies observed from an experiment with the frequencies expected from a theory such as Mendel's laws of genetics.

Null Hypothesis: there is no difference between the observed and expected frequencies.

For example the frequencies of flower colours from a genetic cross can be compared to frequencies expected from a genetic cross. Here the flower colours of 929 plants were observed and the observed frequency of each colour was recorded in a column of a table.

flower colour    observed frequency (O)    expected frequency (E)    (O−E)²/E
red              705                       696.75                    0.10
white            224                       232.25                    0.29
sum              929                       929                       0.39

The formula for χ² is:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

In order to carry out the χ²-test we need to add two columns to the results table and then find a critical value:
1. The first new column is for “expected frequencies”. Mendel’s law predicts a 3:1 ratio, so 75% of the 929 plants (696.75) are expected to be red, and 25% are expected to be white (232.25).
2. The second new column is to calculate (O−E)²/E for each colour. Add up all these values at the bottom to give the χ² value (0.39 in this case).
3. Calculate the degrees of freedom: dof = number of categories – 1 = 2 – 1 = 1 in this case.
4. Look up the critical value of χ² in a χ²-table for 1 degree of freedom (3.84).

To test the null hypothesis we compare this critical value (3.84) with our calculated value of χ² (0.39).
• If the critical value of χ² is less than our calculated value then p < 0.05. We reject the null hypothesis and conclude that there is a significant difference between the observed and expected frequencies.
• If the critical value of χ² is more than our calculated value then p > 0.05. We accept the null hypothesis and conclude that there is no significant difference in frequencies, and the observed difference in frequencies is just due to chance.
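The four steps can be followed in a few lines of plain Python. This is a minimal sketch of the arithmetic only: the function name `chi_squared` is hypothetical, and the critical value 3.84 is simply the figure quoted in the text for 1 degree of freedom rather than something the code calculates.

```python
def chi_squared(observed, expected_ratio):
    """Return the chi-squared statistic and the expected frequencies."""
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    # Step 1: share the total out according to the expected ratio (e.g. 3:1).
    expected = [total * part / ratio_total for part in expected_ratio]
    # Step 2: calculate (O - E)^2 / E for each category and add them up.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


observed = [705, 224]                # red, white
chi2, expected = chi_squared(observed, [3, 1])

dof = len(observed) - 1              # Step 3: number of categories - 1
CRITICAL_VALUE = 3.84                # Step 4: quoted in the text for dof = 1

significant = chi2 > CRITICAL_VALUE  # True would mean p < 0.05
print(expected)                      # expected frequencies for red and white
print(round(chi2, 2), dof, significant)
```

For these data the statistic rounds to 0.39, which is below the critical value, so the difference from a 3:1 ratio is not significant.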
In this example 3.84 > 0.39, so we choose the second option: p > 0.05 so we accept the null hypothesis and conclude that the slight difference from an exact 3:1 ratio is just due to chance, i.e. the observed frequencies of flower colours are consistent with Mendel's law.

This is an extract from a table of critical values of χ² at a significance level (α) of 0.05. If the critical value of the test statistic is less than the calculated value then p < 0.05 and the result is significant at the 0.05 level.

dof   critical χ²
1     3.84
2     5.99
3     7.81
4     9.49
5     11.07
6     12.59
7     14.07
8     15.51
9     16.92
10    18.31
11    19.68
12    21.03
13    22.36
14    23.68
15    25.00
16    26.30
17    27.59
18    28.87
19    30.14
20    31.41

Population Genetics

We've seen how alleles are passed on from one individual to another in a population. Now we'll see how all the alleles in a population might change. The sum of all the alleles of all the genes of all the individuals in a population is called the gene pool. Within a gene pool we want to know how the proportions of different alleles change over time. In the early 20th century biologists started to apply Mendel's laws of inheritance to whole populations and found a simple formula to calculate allele frequencies in a gene pool. This formula is called the Hardy-Weinberg equation, since it was devised independently by the English mathematician G. H. Hardy and the German physician W. Weinberg in 1908.

The Hardy-Weinberg Equation

Just as with the genetic crosses, let's consider the case of a single gene at a time. For example, imagine that coat colour in cats is controlled by a single gene with two alleles – black (B) and white (b). The black allele is completely dominant over the white allele. Each cat has two alleles for coat colour – either BB or Bb or bb. In population genetics we always measure frequencies – decimal fractions out of one.
We don't know what the frequency of each genotype in the population is, but we do know that the two allele frequencies must add up to one, by definition (because there are only two alleles of this gene). Mathematically, if p is the frequency of the dominant allele B, and q is the frequency of the recessive allele b, then

p + q = 1

Now the gametes produced by the cats in this population will only have one allele of the coat colour gene each – either B or b. While we don't know the allele in any particular gamete, we know that overall, because gamete production is random, the frequencies of the B and b alleles in the gametes will be the same as in the gene pool of the parent cats, i.e. p and q. So we can do a Punnett square for reproduction in this cat population:

This Punnett square gives us the frequencies of the different genotypes in the population when the cats reproduce. The genotype BB has a frequency p², the genotype bb has a frequency q², and the genotype Bb has a frequency 2pq. The genotype frequencies must add up to one (by definition), so:

p² + 2pq + q² = 1

This is the Hardy-Weinberg equation.

Using the Hardy-Weinberg Equation

We can use the Hardy-Weinberg equation to calculate genotype and allele frequencies from observed phenotype frequencies. Let's take a population of 1000 cats with 840 black cats and 160 white cats.
• The phenotype frequency for black is 0.84 (840/1000) and for white is 0.16 (160/1000).
• We know that white is the recessive allele, so the white cats must be homozygous recessive, so the frequency of the genotype bb is 0.16.
• The genotype bb has a frequency q², so q² = 0.16 and

$$q = \sqrt{q^2} = \sqrt{0.16} = 0.4$$

• p + q = 1, so p = 1 – q = 1 – 0.4 = 0.6
• Now we can calculate the genotype frequencies:
  frequency of BB = p² = 0.6² = 0.36
  frequency of Bb = 2pq = 2 x 0.6 x 0.4 = 0.48
  frequency of bb = q² = 0.16 (already found)
  check that these add up to one: 0.36 + 0.48 + 0.16 = 1.00
• We can convert these frequencies to actual numbers in the population, for example: number of heterozygous cats = 0.48 x 1000 = 480

The Hardy-Weinberg equation can be used to calculate any of the three types of frequencies:

Allele frequency
  Description: The proportions of the two alleles B and b in the population. Allele frequencies are particularly interesting because evolution causes the allele frequencies to change.
  Formulae: recessive allele (b) = q
            dominant allele (B) = p

Genotype frequency
  Description: The proportions of the three possible genotypes (BB, Bb and bb) in the population. We can't see the genotypes, but we can calculate them.
  Formulae: homozygous recessive (bb) = q²
            homozygous dominant (BB) = p²
            heterozygous (Bb) = 2pq

Phenotype frequency
  Description: The proportions of the different characteristics in the population (e.g. black or white). These are the easiest to measure, because we can see and count them in a population.
  Formulae: recessive phenotype = q²
            dominant phenotype = p² + 2pq

The Hardy-Weinberg equation can be very useful in many different applications. For example the incidence of the single-gene recessive disorder cystic fibrosis in humans is 1 in 2500. From this observation we can use the Hardy-Weinberg equation to calculate that about one in 25 people are heterozygous carriers of the disease allele, and this sort of information is important in genetic counselling.
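The cat and cystic fibrosis calculations follow the same route (recessive phenotype frequency, then q, then p, then the genotype frequencies), so they can be sketched with one small Python function. The function name and dictionary keys are invented for this illustration, and the calculation assumes the population is in Hardy-Weinberg equilibrium.

```python
import math

def hardy_weinberg(recessive_phenotype_freq):
    """From the frequency of the recessive phenotype (q squared), return the
    allele and genotype frequencies, assuming Hardy-Weinberg equilibrium."""
    q = math.sqrt(recessive_phenotype_freq)   # frequency of the recessive allele
    p = 1 - q                                 # frequency of the dominant allele
    return {
        "p": p,
        "q": q,
        "homozygous_dominant": p * p,
        "heterozygous": 2 * p * q,
        "homozygous_recessive": q * q,
    }

# 160 white cats out of 1000
cats = hardy_weinberg(160 / 1000)
print(round(cats["p"], 2), round(cats["q"], 2))        # 0.6 0.4
print(round(cats["heterozygous"] * 1000))              # 480 heterozygous cats

# Cystic fibrosis: incidence of 1 in 2500
cf = hardy_weinberg(1 / 2500)
print(round(cf["heterozygous"], 3))                    # 0.039, i.e. roughly 1 in 25
```

Note that the carrier frequency comes out at 0.0392, which is closer to 1 in 25.5; "one in 25" is a rounded figure.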
The Hardy-Weinberg Principle

The Hardy-Weinberg equation predicts that the frequencies of dominant and recessive alleles in a gene pool remain constant over time, so long as five key conditions about the population are met:

1. There are no mutations, so no new alleles are created.
2. There is no immigration, so no new alleles are introduced, and no emigration, so no alleles are lost.
3. Mating is random, so alleles are mixed randomly in sexual reproduction.
4. The population is large, so no alleles are eliminated by genetic drift.
5. There is no selection, so no alleles are favoured or eliminated.

These conditions mean that there is nothing to disturb the gene pool, which therefore remains in a stable genetic equilibrium. In other words, the allele frequencies in the population will remain constant from generation to generation. This principle is called the Hardy-Weinberg principle. Before this it was thought that dominant alleles would increase in frequency over time, and recessive alleles would decrease in frequency, but this intuitive idea is wrong. Dominant alleles need not be common. For example the dominant allele for Huntington's disease is very rare in human populations and almost everyone is homozygous recessive.

Gene Pools and Evolution

Evolution is defined as a change in allele frequencies of a population's gene pool over time. In most real populations allele frequencies do change over time, so the population is evolving. This means that at least one of the five conditions of the Hardy-Weinberg principle is not true, in other words one or more factors are acting to change the allele frequencies. These disturbing factors include:

1. Mutations: the original source of new alleles and genetic variation.
2. Gene flow: the movement of alleles between populations.
3. Non-random mating: changes in allele frequency due to inbreeding.
4. Genetic drift: changes in allele frequency in a small population due to chance.
5. Selection: changes in allele frequency due to natural or sexual selection, producing adaptive changes in response to the environment.

We can immediately see that the five conditions needed for a Hardy-Weinberg equilibrium listed above are simply the absence of these five disturbing factors. We'll look at each of these disturbing factors in turn.

1. Mutations

Mutations are the original source of all new alleles. We looked at different kinds of mutation in units 1 and 3, but they can all make new alleles. Mutations are random, rare and spontaneous.

• Random means that any part of the DNA is as likely as any other to be mutated, whether it is coding or non-coding, important or unimportant, somatic or germ-line. And the change is random too – it may equally be neutral, lethal or beneficial.
• Rare, because cells have error-checking systems to prevent mutations from occurring. The result is that a gene will mutate only about once per 10⁵ cell divisions on average. However, there are so many living cells on the planet that new alleles arise somewhere all the time.
• Spontaneous means that mutations are not caused by any particular factor (though the mutation rate is increased by certain mutagenic factors like ionising radiation and viruses).

Only mutations in germ-line (reproductive) cells will enter the gene pool and be available for selection. Mutations in somatic (body) cells will die with their owner.

2. Gene flow

Allele frequencies in a population's gene pool will be changed if there is a significant movement of alleles into or out of the population. One obvious way for this to happen is by migration: immigration could introduce new alleles to a population while emigration could cause some alleles to be lost from the population. Gene flow can also arise due to dispersal of seeds, pollen, or spores.

3. Non-random mating

Non-random mating includes:

• Sexual selection, where individuals choose mates of the other sex with particular characteristics to reproduce with. This is a form of selection, so is covered by item 5 below.
• Inbreeding, where closely-related individuals mate. This increases the frequency of homozygotes.
• Selective breeding of domesticated animals and plants, where humans choose which individuals can breed, which usually involves inbreeding as well. Alleles that are favoured by humans will increase in frequency, while those that are undesirable will decrease in frequency.

4. Genetic Drift

Allele frequencies can change just due to random chance. In a large population these random changes would have an insignificant effect, but in a small population the effect can be considerable and can lead to evolutionary changes. Changes in allele frequency in a small population due to chance are called genetic drift. There are two common examples of genetic drift: genetic bottlenecks and the founder effect.

Genetic Bottlenecks

A genetic bottleneck happens when a population is drastically reduced in size due to a natural catastrophe. The few survivors will only have a small range of alleles between them, with many of the original alleles being lost in the large numbers who died. As the population grows again it will have a very different set of allele frequencies from the original parent population.

Cheetahs are a threatened species partly due to their very low genetic diversity. This is probably due to a genetic bottleneck at the end of the last glacial period ten thousand years ago. An extreme example is the Golden Hamster, of which the vast majority are descended from a single litter found in the Syrian Desert around 1930. We now know that humans have very low genetic diversity compared to other primate species.
Analysis of mitochondrial and Y-chromosome DNA from humans suggests that modern humans went through a genetic bottleneck 70 000 years ago, when the world population fell to 15 000 due to environmental changes following the eruption of the Toba supervolcano in Indonesia.

The Founder Effect

The founder effect occurs when a small number of individuals colonise a new habitat and start a new, isolated population. Since the few individuals will only have a small range of alleles between them, the founder effect is an example of a genetic bottleneck, and is sometimes called a colonisation bottleneck.

Founder effects are common throughout evolutionary history, and are readily seen in remote islands (such as the Hawaiian or Galapagos islands), where colonisation is difficult and rare. A few animals or a few plant seeds may by chance float or "raft" to a remote island during a storm, and give rise to new populations. These new populations will have low genetic diversity, reflecting the small range of alleles in the small founding population. In extreme cases a founding population can be as small as a single pregnant female animal or a single plant seed.

The founder effect can also be seen in human populations. For example the island of Pingelap in Micronesia suffered a typhoon in 1775 that reduced the population on the island to only 20. The islanders today have a high frequency of a particular form of total colour blindness, since one of the typhoon survivors was a carrier for this allele. The Afrikaners of South Africa have a high incidence of Huntington's disease, since one of the original Dutch settlers had the disease due to the presence of a dominant allele.

5. Selection

Selection includes natural selection and sexual selection, and it always causes a change in allele frequency.
The important difference between selection and the other four disturbing factors is that selection is the only factor that produces adaptive evolutionary changes (as described in unit 5).

These histograms show three kinds of natural selection, depending on which phenotypes are selected by the environment. The shaded areas represent the phenotypes that are favoured.

Directional Selection. This occurs when one extreme phenotype (e.g. tallest) is favoured over the other extreme (e.g. shortest). This happens when the environment changes in a particular way. "Environment" includes biotic as well as abiotic factors, so organisms evolve in response to each other, e.g. if predators run faster there is selective pressure for prey to run faster, or if one tree species grows taller, there is selective pressure for others to grow taller. Most environments do change (e.g. due to migration of new species, natural catastrophes, climate change, sea level change, continental drift, etc.), so directional selection is common.

Disruptive (or Diverging) Selection. This occurs when both extremes of phenotype are selected over intermediate types. For example in a population of finches, birds with large and small beaks feed on large and small seeds respectively and both do well, but birds with intermediate beaks have no advantage, and are selected against.

Stabilising (or Normalising) Selection. This occurs when the intermediate phenotype is selected over extreme phenotypes, and tends to occur when the environment doesn't change much. For example birds' eggs and human babies of intermediate birth weight are most likely to survive. Natural selection doesn't have to cause a directional change, and if an environment doesn't change there is no pressure for a well-adapted species to change. Fossils suggest that many species remain unchanged for long periods of geological time.
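Of the five disturbing factors, genetic drift is perhaps the easiest to explore with a quick simulation. The sketch below is not part of the specification: it assumes a very simple model in which every generation the 2N alleles of a population of N individuals are drawn at random from the previous generation's gene pool. The function name, population sizes, number of generations and random seed are all arbitrary choices for illustration.

```python
import random

def simulate_drift(pop_size, p_start=0.5, generations=50, seed=1):
    """Follow the frequency of one allele over time in a population of fixed
    size, where each generation's 2N alleles are sampled at random from the
    previous generation (no mutation, migration or selection)."""
    rng = random.Random(seed)          # fixed seed so a run can be repeated
    p = p_start
    history = [p]
    for _ in range(generations):
        # each of the 2N alleles is a copy of our allele with probability p
        copies = sum(rng.random() < p for _ in range(2 * pop_size))
        p = copies / (2 * pop_size)
        history.append(p)
    return history

small = simulate_drift(pop_size=10)      # e.g. survivors of a bottleneck
large = simulate_drift(pop_size=5000)

# print the allele frequency every 10 generations
print("small population:", [round(x, 2) for x in small[::10]])
print("large population:", [round(x, 2) for x in large[::10]])
```

With settings like these we would expect the allele frequency in the small population to wander widely, and possibly reach 0 or 1 (the allele is lost or fixed), while in the large population it should stay close to the starting value, as condition 4 of the Hardy-Weinberg principle suggests.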
Epigenetic Control of Gene Expression

In unit 1 we saw how genes are expressed through transcription and translation into proteins, which give cells their functions and properties. But cells don't express all their genes all the time. Gene expression can be switched on or off by environmental stimuli (e.g. age, light, injury, nutrients, chemicals). The regulation of gene expression is called epigenetic regulation because it is regulated by environmental factors, not by the DNA sequence (epigenetic literally means "outside genetics"). These epigenetic changes include chemical changes to DNA and histones, but not to the base sequence of DNA. And as we shall see, epigenetic changes can be passed down to daughter cells through mitosis, but not usually to offspring through sexual reproduction.

There are five epigenetic control points along the gene expression pathway:

We'll look at each step in detail.

1. Chromatin Remodelling

In unit 3 we saw that DNA in the nucleus of a eukaryotic cell is tightly bound to histone proteins, forming chromatin. The degree of folding changes during the cell cycle, for example the tightly-packed chromosome structures are only formed during prophase of mitosis, to allow the DNA to be moved easily. However, even in interphase, the folding can change from a loosely-packed form (euchromatin) to a tightly-packed form (heterochromatin). This switching is called chromatin remodelling, and regulates which genes can be expressed.

Heterochromatin:
• Nucleosomes tightly packed
• DNA inaccessible for transcription
• Genes repressed
• DNA methylated
• Histones de-acetylated

Euchromatin:
• Nucleosomes loosely packed
• DNA accessible for transcription
• Genes activated
• DNA de-methylated
• Histones acetylated

The switching is controlled by enzymes chemically changing the DNA or histone proteins.
Two of the most important changes are DNA methylation and histone acetylation.

DNA methylation

One of the four DNA nucleotides, cytosine (C), can be methylated by having a methyl group (-CH3) attached by the enzyme DNA methyl transferase (DNMT):

Note that this methylation does not affect the hydrogen bonds formed in base-pairs, so does not affect base-pairing or the base sequence of the DNA. DNMT only methylates cytosine when it is followed by guanine in the double helix (known as a CpG dinucleotide sequence), and the diagonally-opposite cytosine is also methylated, so both strands of DNA are methylated at each location. DNA methylation turns off gene expression in that region of DNA by causing the nucleosomes to coil up into tight heterochromatin. In this state the transcription proteins can't bind and no transcription can take place.

Histone modification

In chromatin DNA wraps twice round a core of histone proteins to form a structure called a nucleosome. Each protein core is made of eight globular polypeptide chains (it's an octamer), and each chain has a long polypeptide "tail" (around 100 amino acids long) emerging from the core. The amino acids in these tails can be extensively modified by the addition or removal of methyl groups (-CH3), acetyl groups (-COCH3), phosphate groups (-PO4³⁻) or other groups, and these modifications cause the nucleosomes to stick together to form heterochromatin or separate to form euchromatin. So these changes also regulate gene expression.

Chromatin remodelling is a long-term regulator of gene expression: once genes are switched off they usually stay switched off in that cell and in its daughter cells. This happens because the methylation patterns and histone modifications are copied whenever DNA is replicated and the cell divides. The daughter cells that result from mitosis thus have the same epigenetic changes and so the same inactivated genes.
Chromatin remodelling is therefore an important factor in cell differentiation (see Gene Expression in Embryo Development).

2. Control of Transcription by Transcription Factors

In unit 1 we saw that, for transcription to happen, the enzyme RNA polymerase must bind to the DNA molecule just upstream of the gene, at a region called the promoter. However, RNA polymerase binds only weakly at first and it needs the assistance of a number of other DNA-binding proteins before it can start transcribing the gene. There are two classes of regulatory proteins:

• Transcription factors bind to the promoter region, just beside the RNA polymerase. Each transcription factor protein has a specific binding site that binds to a particular DNA sequence in the promoter.
• Activator proteins bind to the enhancer sequence, some distance further upstream. Again, different activator proteins have different binding sites that bind to specific DNA sequences.

Once bound to DNA, the different regulatory proteins bind together as the DNA molecule bends and loops to form a combined transcription complex. This complex activates RNA polymerase, which can now move along the DNA molecule, transcribing the gene.

This example shows two regulatory proteins, but in practice there can be over a dozen different proteins involved. Some promote transcription, while others repress it, so transcription factors provide a very flexible method of control. Both the promoter and enhancer sequences are blocked if the DNA is in the closed heterochromatin form, so transcription can't take place.

Since transcription factors are proteins, they are synthesised in the cytoplasm and transported into the nucleus to bind to DNA. And of course their own production is controlled by transcription factors!

Steroid hormones, like oestrogen and testosterone, stimulate protein synthesis and growth by being transcription factors.
Steroid hormones are lipids, so cross the cell membrane by lipid diffusion and bind to a receptor protein in the cytoplasm to form a transcription factor.

3. Alternative Splicing of mRNA

In unit 1 we learnt that genes contain coding sections (exons) and non-coding sections (introns). The introns are removed from mRNA in a process called post-transcriptional modification (or just splicing), which is carried out by a large RNA-protein complex called a spliceosome. It turns out that almost all genes have more exons than they need to make one particular protein, and by combining some of the exons together in different ways, the spliceosome can make different isoforms of mature mRNA, which are then translated to make different proteins with a different structure and function. This is called alternative splicing.

Almost all human genes make different proteins by alternative splicing, making anything from 2 to 10,000 different mRNA isoforms from each gene. This explains how the 20,000 genes in the human genome can code for over 100,000 different proteins. The record-holder so far is the Drosophila gene Dscam, which makes proteins involved in the development of the fly's nervous system. This gene has 116 exons, but only 17 are used in each protein, giving a possible 38,000 alternative combinations.

A simpler example is the production of the two human peptide hormones calcitonin and CGRP, which are made from the same gene. In the thyroid gland the gene is spliced to make calcitonin, while in the hypothalamus the gene is alternatively spliced to produce CGRP:

The observation that different mRNA splicings are used in different tissues shows that this is under environmental control, so alternative splicing is another example of epigenetic regulation of gene expression.

4. Control of expression by non-coding RNA

About 90% of human DNA is transcribed into RNA, but only about 2% of this RNA is mRNA that is translated into protein. The remaining 98% of RNA just remains as RNA and is called non-coding RNA (ncRNA), because it does not code for protein. ncRNAs are almost all involved in the control of gene expression at all points along the gene expression pathway. There are thousands of different kinds of ncRNA already discovered, and this is probably just the tip of the iceberg. Some of the main ncRNAs are:

siRNA, miRNA
Small interfering RNA (siRNA) and micro RNA (miRNA) are both involved in controlling gene expression by RNA interference (RNAi). siRNA and miRNA are short lengths of antisense RNA, with sequences complementary to part of an mRNA molecule. The short RNA molecules therefore bind to the mRNA by complementary base-pairing, forming regions of double-stranded RNA. This double-stranded RNA cannot be translated in a ribosome, and in fact is broken down by RNAse enzymes. RNA interference inhibits gene expression by destroying mRNA.

snRNA
Small nuclear RNA (snRNA) combines with proteins to form small nuclear ribonucleoproteins (snRNPs, or "snurps"), which in turn form the spliceosome complexes we've just looked at. The snRNA molecules bind to the pre-mRNA and so control alternative splicing.

tRNA, rRNA, snoRNA
These are all involved in translation. We've already come across transfer RNA (tRNA) and ribosomal RNA (rRNA) in unit 1. Small nucleolar RNA (snoRNA) is found in the nucleolus, where it regulates the amount and type of rRNA transcription, and therefore the production of ribosomes. snoRNA therefore controls gene expression at the translation stage.

mRNA UTR
Mature mRNA, consisting solely of exons, still contains long sequences at either end that are not translated into protein. These are called untranslated regions (UTRs). These UTRs are involved in the binding of the mRNA to ribosomes and in switching on the translation of that mRNA.

Xist
Xist is a gene on the X chromosome that controls the process of X inactivation, where one of the two X chromosomes in females is inactivated. This happens in all female cells so that they have the same amount of gene product as males (who only have one X chromosome). The Xist gene is transcribed to ncRNA but not translated into protein. The ncRNA then coats one X chromosome, causing it all to condense into a heterochromatin form called a Barr body, and permanently silencing all its genes.

With so many different functions, non-coding RNA probably has a bigger effect on our characteristics than coding RNA.

Gene Expression in Embryo Development

We've seen how genes can be switched on and off by epigenetic control. Now we'll look at the most important example of epigenetic control – embryo development. An adult human has around 10¹⁴ cells, every one formed by mitosis from the zygote, so every one with the same DNA (ignoring the odd mutation). Yet these cells are all different – in fact there are over 200 different cell types in an adult human. Since all the cells have the same genome, the differences must be due entirely to changes in gene expression, and these changes must arise as the embryo develops. The process of becoming a specialised cell by controlling gene expression is called cell differentiation.

We looked briefly at human embryo development in unit 3. Here are the stages during the first 5 days:

By the 32-cell stage the cells are already starting to become differentiated and by the blastocyst stage there are clearly two cell types:

• An outer spherical layer of cells called the trophoblast, which will form the placenta.
• An inner mass of cells, which will form the embryo.

From now on the fate of each cell and all its descendants is pre-determined. This cell determination is found early in the development of all embryos and is triggered by different environmental cues.
It may seem odd that cells inside a tiny embryo have different environments, but there are numerous small but significant differences within the growing embryo.

• In most animals, cell fate is determined by cytoplasmic determination, caused by chemical gradients in the egg cell, present before fertilisation. The zygote has two opposite "poles" and, as it divides by cleavage, the embryo cells experience slightly different chemical environments, which act as epigenetic environmental cues to trigger differentiation.
• In mammals cell fate is determined by positional determination, caused simply by the position of a cell in the developing embryo. The main body plan (the dorso-ventral axis) is established by day 16, and cells in different parts of the embryo transmit chemical signals between neighbours to coordinate differentiation.

In mammals there is no differentiation or determination up to the 8-cell stage. Indeed 8-cell embryos can be split, and each cell can act like a zygote and grow to become a complete embryo. This is what happens naturally with identical twins, and artificially with embryo cloning of farm animals. After the 8-cell stage the cells start to be determined, and if they are experimentally transplanted to a different part of the embryo they develop as they would have before the move.

For some organisms we can draw a "fate map", showing the destiny of every cell. This has been done for the tiny nematode worm Caenorhabditis elegans, which is only 1 mm long and always has exactly 959 adult cells (excluding gametes). This shows a simplified fate map (or cell lineage) for the first four divisions of the nematode worm:

By the fourth cell division (16 cells) the fate of each cell is reasonably determined, and after 10 divisions, when the adult animal is complete, every cell is irreversibly differentiated. The fate map and the cell differentiation are exactly the same for every nematode worm.
As an embryo develops more and more genes are switched off in any given cell. The more genes are switched off the more differentiated the cell is. However, there is a minimum number of housekeeping genes that every cell needs to express in order to remain alive. There are around 4000 of these (out of a total genome of 20,000 genes), including genes involved in the cell cycle, metabolism and cell structure. Fully-differentiated somatic cells will typically express an additional 100-600 genes needed for their particular functions.

Control of haemoglobin genes

A good example of gene switching during embryo development is haemoglobin. In unit 1 we saw that haemoglobin is made of four polypeptide chains: 2 α chains and 2 β chains (α₂β₂). In unit 3 we found that a different haemoglobin is found in embryos, with a higher affinity for oxygen. This fetal haemoglobin is made of 2 α chains and 2 γ chains (α₂γ₂). This chart shows how the three genes are switched on or off before and after birth.

The switching is under epigenetic control using DNA methylation, histone acetylation, ncRNAs and transcription factors.

Stem Cells

Stem cells are cells that can divide and differentiate into another kind of cell. Stem cells possess two key properties:

• Stem cells are potent – they have the potential to differentiate into specialised cell types.
• Stem cells are immortal – they can divide indefinitely.

As we've just seen, early embryo cells are stem cells, since they will differentiate into all the different cells of an adult organism, but as the embryo develops, the cells become less and less potent. There are different classes of stem cell potency (though in fact the reduction in potency is gradual through embryo development):

Totipotent (aka omnipotent)
  Properties: Can differentiate into any cell type, including placental tissue, so can construct a complete, viable organism.
  Source: Zygote and very early embryo cells up to 8-cell stage.

Pluripotent
  Properties: Can differentiate into any cell type except placental tissue.
  Source: Blastocyst inner cell mass.

Multipotent
  Properties: Can differentiate into cells of a closely related family of cells.
  Source: Particular cells from most adult tissues (used for growth and repair).

Unipotent
  Properties: Can only differentiate into cells of their own type.
  Source: Most adult tissues.

The discovery of stem cells in the 1950s immediately led to suggestions that they could be used clinically in medical treatments. Many human diseases are caused by the death or destruction of particular cells. The idea is to transplant tissue grown from stem cells into a patient, where it would grow and replace damaged tissue. Some potential examples of these cell-based therapies are shown in the table:

Type of cell            Disease that could be treated
Myocytes                Myocardial infarction
Pancreatic cells        Type I diabetes
Skeletal muscle cells   Muscular dystrophy
Blood cells             Leukaemia
Nerve cells             Parkinson's disease, multiple sclerosis, strokes, paralysis due to spinal injury
Skin cells              Burns
Bone cells              Osteoporosis
Cartilage cells         Osteoarthritis
Retina cells            Macular degeneration

Traditional sources of Stem Cells

Embryo stem cells are grown in vitro from the inner cell mass of five-day old blastocysts. These embryos are created by in vitro fertilisation (IVF) to help infertile couples reproduce, but any "spare" embryos, no longer needed for reproduction, can be used to create stem cells, with the informed consent of the donor couple. Since these stem cells are pluripotent, they can be differentiated into any cell type for clinical use. However, these cells are not the patient's own cells, so there is a problem of immune rejection by the patient's immune system, and, during stem cell therapy, patients have to take immunosuppressant drugs.
In addition, there remains a debate about whether it is ethical to use human embryos for this purpose, since the embryos are destroyed in the process. Embryonic stem cell research is banned or tightly restricted in some European countries, although the UK government does allow such research in the UK.
Adult stem cells are extracted from certain tissues of the body. It is thought that most organs and tissues maintain a small number of undifferentiated stem cells, which the body uses to replace and repair damaged tissue. Tissues where stem cells have been found include the brain, bone marrow, blood vessels, muscle, skin, heart, gut and liver. Since these stem cells are multipotent, they can differentiate only into their own family of cells (e.g. blood cells, muscle cells), but not others. The use of these cells raises no ethical issues, and there are also no problems of rejection if the stem cells are taken from the patient’s own body. However, they are difficult to find and difficult to grow in culture. Blood-cell-forming (haematopoietic) stem cells from bone marrow are already being used successfully to treat leukaemia (cancer of white blood cells), and heart disease and diabetes have been treated in mice. In addition, human stem cells grown in culture are being used to test the effects of new drugs, without harming humans or animals.
Neither of these sources of human stem cells is ideal: the embryonic stem cells cause an immune reaction and raise ethical problems, while the adult stem cells are not very effective. Ideally we would like pluripotent stem cells from the patient’s own tissues. There are now two ways we might do this:

New Sources of Stem Cells
Somatic Cell cloning (or therapeutic cloning)
This technique involves making a human embryo in the same way that Dolly the sheep was made, by somatic cell nuclear transfer.
The difference with therapeutic cloning is that the embryo is not implanted into a surrogate mother and never grows into a viable human (such reproductive cloning is illegal). Instead the embryo is grown in vitro to the blastocyst stage and the inner cell mass, containing pluripotent stem cells, is removed. These stem cells are genetically identical to the patient, so there is no immune rejection. However, the creation of a human embryo solely for medical purposes raises ethical issues.
Induced Pluripotent Stem (iPS) Cells
This technique was discovered in 2006 by Shinya Yamanaka at Kyoto University in Japan. He took fibroblast (connective tissue) cells and inserted four genes into them using a retrovirus vector. The four genes were Oct4, Sox2, Klf4 and c-Myc, which all code for transcription factors. In fully-differentiated fibroblast cells these genes are switched off, but the new active copies of the genes were expressed to make the necessary transcription factors. These transcription factors in turn switched on many other genes and “reprogrammed” the fibroblast cells to become pluripotent stem cells, just like those found in a blastocyst. The iPS cells can be grown in culture and differentiated to become any somatic cell, just like embryonic stem cells. But there is no immune rejection (since the cells are the patient’s own) and there are no ethical issues (since no embryos are involved). iPS cells are still a very new development, still at the research stage, but are likely to solve the problems of both adult and embryo stem cells, making the use of embryo stem cells obsolete.

Biotechnology
Biotechnology is defined as: "Any technological application that uses biological systems, living organisms, or derivatives thereof, to make or modify products or processes for specific use."
In practice biotechnology usually refers to the application of molecular (DNA) biology in the laboratory. Biotechnology has applications in:
research, e.g. human genome project
medicine, e.g. genetically-engineered drugs, gene therapy
agriculture, e.g. improving crops
industry, e.g. manufacturing enzymes, biosensors
A particular aspect of biotechnology is genetic engineering, which means altering the genes in a living organism to produce a Genetically Modified Organism (GMO) with a new genotype. Genetic engineering can include inserting a foreign gene from one species into another (forming a transgenic organism), altering an existing gene so that its product is changed, or changing gene expression.

Techniques of Biotechnology
Modern biotechnology is possible due to the development of techniques from the 1960s onwards, which arose from our greater understanding of DNA and how it functions, following the discovery of its structure by Watson and Crick in 1953. This table lists the techniques that we shall look at in detail, grouped by type (technique: purpose).
Manipulating DNA in vitro
1 Restriction Enzymes: to cut DNA at specific points, making small fragments
2 DNA Ligase: to join DNA fragments together
3 Reverse Transcriptase: to make DNA from mRNA
4 PCR: to amplify very small samples of DNA
Analysing DNA in vitro
5 Electrophoresis: to separate fragments of DNA
6 Southern Blot: to look for specific sequences in DNA
7 DNA Sequencing: to read the base sequence of a length of DNA
8 DNA Profiling: to compare different people’s DNA
Using DNA in vivo
9 Vectors: to carry DNA into cells
10 Transformation: to deliver a vector into a living cell
11 Marker Genes: to identify cells that have been transformed
12 Knockout Mice: to investigate gene function

1. Restriction Enzymes
These are enzymes that cut DNA at specific sites.
They are properly called restriction endonucleases because they cut phosphodiester bonds in the middle of the polynucleotide chain. Some restriction enzymes cut straight across both chains, forming blunt ends, but most enzymes make a staggered cut in the two strands, forming sticky ends. The cut ends are “sticky” because they have short stretches of single-stranded DNA with complementary sequences. One sticky end will stick (or anneal) to another sticky end by complementary base pairing (i.e. with weak hydrogen bonds), if the sticky ends have both been cut with the same restriction enzyme.
Restriction enzymes have highly specific active sites, and will only cut DNA at specific base sequences, 4-8 base pairs long, called recognition sequences. Recognition sequences are usually palindromic, which means that the sequence and its complement are the same but reversed (e.g. GAATTC has the complement CTTAAG). Short lengths of DNA cut out by restriction enzymes are called restriction fragments. There are thousands of different restriction enzymes known, with over a hundred different recognition sequences. Restriction enzymes are named after the bacterial species they came from, so EcoRI is from E. coli strain R.

2. DNA Ligase
We came across DNA ligase in unit 1, joining gaps in the DNA backbone following DNA replication. It is commonly used in genetic engineering to do the reverse of a restriction enzyme, i.e. to join together complementary restriction fragments. Two restriction fragments can anneal if they have complementary sticky ends, but only by weak hydrogen bonds, which can quite easily be broken, say by gentle heating. The backbone is still incomplete. DNA ligase completes the DNA backbone by forming covalent phosphodiester bonds. Restriction enzymes and DNA ligase can therefore be used together to join lengths of DNA from different sources.

3.
Reverse Transcriptase
The enzyme reverse transcriptase does the reverse of transcription: it synthesises DNA from an RNA template (so it is an RNA-dependent DNA polymerase enzyme). Reverse transcriptase is produced naturally by the retroviruses (which we came across in unit 1), and it helps them to invade cells. In biotechnology reverse transcriptase is used to make an “artificial gene”, called complementary DNA (cDNA), from an mRNA template as shown in this diagram:
Mature mRNA (without introns) is extracted from cells and mixed with reverse transcriptase and DNA nucleotides.
A new strand of DNA is synthesised, complementary to the mRNA strand, forming a double-stranded DNA/RNA “heteroduplex” molecule.
The two strands of this molecule are then separated and reverse transcriptase now synthesises a second DNA strand, complementary to the first.
The result is a normal double-stranded DNA molecule called cDNA.
Note that the cDNA molecule is much shorter than the original gene in the organism’s DNA (typically <50% the size), since the cDNA doesn’t have introns. In addition, cDNA only has the exons for one particular alternative splicing, whereas the original DNA has a variety of other exons available. Indeed mRNA extracted from different tissues will form different cDNAs, even if they come from the same gene! cDNA is therefore an “artificial gene”.
Reverse transcriptase has several uses in biotechnology:
It makes genes without introns. Eukaryotic genes with many introns are often too big to be incorporated into a bacterial plasmid, and bacteria are unable to splice out the introns anyway. The artificial cDNA gene is made from mRNA that already has the introns spliced out of it, so it can be expressed in bacteria. It contains the exact sequence for one specific protein, without needing any particular exon splicing.
It makes a stable copy of a gene, since DNA is less readily broken down by enzymes than RNA.
It makes genes easier to find.
There are some 20,000 genes in the human genome, and finding the DNA fragment containing one gene out of this many is a very difficult task. However, a given cell expresses only a fraction of these genes, so it makes a limited range of mRNA molecules. For example the cells of the pancreas make insulin, so make lots of mRNA molecules coding for insulin. This mRNA can be isolated from these cells and used to make cDNA of the insulin gene.

4. Polymerase Chain Reaction (PCR)
The polymerase chain reaction is a technique used to copy (or amplify) DNA samples as small as a single molecule. It was developed in 1983 by Kary Mullis, for which discovery he won a Nobel Prize in 1993. PCR is simply DNA replication in a test tube. If a length of DNA is mixed with the four nucleotides (A, T, C and G) and the enzyme DNA polymerase in a test tube, then the DNA will be replicated many times. The details are shown in this diagram:
1. Start with a sample of the DNA to be amplified, and add the four nucleotides and the enzyme DNA polymerase.
2. Heat to 95°C for two minutes to break the hydrogen bonds between the base pairs and separate the two strands of DNA. Normally (in vivo) the DNA double helix would be separated by an enzyme.
3. Add primers to the mixture and cool to 40°C. Primers are short lengths of single-stranded DNA (about 20 nucleotides long) that anneal (i.e. form complementary base pairs) to complementary sequences on the two DNA strands, forming short lengths of double-stranded DNA. The DNA is cooled to 40°C to allow the hydrogen bonds to form. There are two reasons for using primers:
The enzyme DNA polymerase can only extend existing double-stranded DNA.
Only the DNA between the primer sequences is replicated, so by choosing appropriate primers you can ensure that only a specific target sequence is copied. The choice of primers is therefore very important to select the DNA to be amplified.
4.
The DNA polymerase enzyme can now build new strands alongside each old strand to make double-stranded DNA. Each new nucleotide binds to the old strand by complementary base pairing and is joined to the growing chain by a phosphodiester bond. The enzyme used in PCR is derived from the thermophilic bacterium Thermus aquaticus, which grows naturally in hot springs at a temperature of 90°C, so it is not denatured by the high temperatures in step 2. Its optimum temperature is about 72°C, so the mixture is heated to this temperature for a few minutes to allow replication to take place as quickly as possible.
5. Each original DNA molecule has now been replicated to form two molecules. The cycle is repeated from step 2 and each time the number of DNA molecules doubles. This is why it is called a chain reaction, since the number of molecules increases exponentially, like an explosive chain reaction. After n cycles, there is an amplification factor of 2^n, so 30 cycles give about 2^30, or roughly a billion, copies of each starting molecule.
Typically PCR is run for 20-30 cycles. PCR can be completely automated, so in a few hours a tiny sample of DNA can be amplified millions of times with little effort. The product can be used for further studies, such as cloning, electrophoresis, or gene probes. Because PCR can use such small samples it can be used in forensic medicine (with DNA taken from samples of blood, hair or semen), and can even be used to copy DNA from mummified human bodies, extinct woolly mammoths, or from an insect that's been encased in amber since the Jurassic period. One problem of PCR is having a pure enough sample of DNA to start with. Any contaminant DNA will also be amplified, and this can cause problems, for example in court cases.

5. Electrophoresis
This is a form of chromatography used to separate different pieces of DNA on the basis of their length.
It might typically be used to separate restriction fragments. The DNA samples are placed into wells at one end of a thin slab of gel made of agarose or polyacrylamide, and covered in a buffer solution. An electric current is passed through the gel. Each nucleotide in a molecule of DNA contains a negatively-charged phosphate group, so DNA is attracted to the anode (the positive electrode). The molecules have to move through the gel, and longer lengths of DNA are retarded by the gel so move more slowly than shorter lengths. So the shorter the DNA molecule, the further down the gel it will move in a given time. At the end of the run the current is turned off.
Unfortunately the DNA on the gel cannot be seen, so it must be visualised. There are two common methods for doing this:
The DNA can be stained with a coloured chemical such as azure A (which stains the DNA bands blue), or a fluorescent molecule such as ethidium bromide (which emits coloured light when the finished gel is illuminated with invisible ultraviolet light).
The DNA samples at the beginning can be radiolabelled with a radioactive isotope such as ³²P, then visualised using autoradiography. Ordinary photographic film (sometimes called X-ray film) is placed on top of the finished gel in the dark for a few hours, and the radiation from any radioactive DNA on the gel exposes the film. When the film is developed the position of the DNA shows up as dark bands on the film. This method is extremely sensitive.

6. Southern Blot
A Southern blot is used to detect a specific target sequence in samples of DNA using a DNA probe. A DNA probe is simply a short length of single-stranded DNA (100-1000 nucleotides long) with a radioactive or fluorescent “label” attached.
The probe will anneal to any fragments of DNA containing the complementary sequence, forming regions of double-stranded hybrid DNA (the process is also called hybridisation). The hybrid DNA fragments are now labelled and can be identified. The Southern blot method is:
1. DNA is extracted from the source cells (e.g. different patients or different species) and amplified by PCR to make enough DNA for the hybridisation. The DNA samples are then digested by a restriction enzyme into many small fragments, and the fragments separated on an electrophoresis gel. (Diagram: electrophoresis gel.)
2. The gel is placed in an alkali solution, which breaks the hydrogen bonds between the DNA bases, causing the strands to separate. A thin sheet of nylon or nitrocellulose is placed on top of the gel. The alkali solution is then drawn up through the gel to a stack of paper towels by capillary action, bringing the DNA with it. The DNA sticks to the nylon membrane. (Diagram labels: weight, paper towels, nylon sheet, gel, alkali solution.)
3. The nylon sheet is separated from the gel, placed in a plastic bag containing a solution of labelled probes and mixed thoroughly. The probes will anneal to DNA fragments in the nylon membrane that have a complementary sequence, forming hybrid DNA molecules stuck to the nylon sheet (but again they can’t be seen). (Diagram labels: sealed bag, nylon sheet, probe solution.)
4. The location of the hybrid DNA can be visualised by different methods, depending on the label used. If the probes were radioactive then they can be visualised by autoradiography, and the probes show up as bands on photographic film. If the probes were fluorescent then they can be visualised as bands of light when illuminated by ultraviolet light. (Diagram: autoradiograph.)
The Southern blot was invented by Edwin Southern at Edinburgh University in 1975 (as a biological joke, similar blotting techniques using RNA or protein are called northern and western blots respectively).

7.
DNA Sequencing
This means reading the base sequence of a length of DNA. DNA sequencing is based on a beautifully elegant technique developed by Fred Sanger in Cambridge in 1975, and now called Sanger Sequencing.
1. Label 4 test tubes A, T, C and G. Into every test tube add:
a sample of the DNA to be sequenced, up to 400 nucleotides long, so long molecules must be broken up into shorter fragments first using restriction enzymes. The sample must contain many millions of individual molecules, so may need to be amplified by PCR first.
the four DNA nucleotides
the enzyme DNA polymerase.
2. In each test tube add:
a dideoxy nucleotide that cannot form a phosphodiester bond with the next nucleotide and so stops further synthesis of DNA. Tube A has dideoxy A (A*), tube T has dideoxy T (T*), and so on. The dideoxy nucleotides are present at about 1% of the concentration of the normal nucleotides.
a fluorescent primer to allow the DNA polymerase to work and to visualise the DNA later. A different primer is used in each tube: tube A has a green primer, tube T red, tube C blue and tube G yellow.
3. Let the DNA polymerase synthesise many copies of the DNA sample. About 1% of the time, at random, a dideoxy nucleotide will be added to the growing chain and synthesis of that chain will then stop. A range of DNA molecules will be synthesised, ranging from full length to very short. The important point is that in tube A, all the fragments will stop at an A nucleotide. In tube T, all the fragments will stop at a T nucleotide, and so on.
4. The contents of the four tubes are mixed together and all the different DNA molecules are then separated using capillary electrophoresis, which gives good separation in a narrow tube gel (1 m long by 0.1 mm diameter). The DNA molecules move down the gel, smallest first. As they move they pass through a laser beam, which causes the fluorescent labels to emit light of their particular colour, depending on the terminal base.
The coloured light is detected by a sensor and the colour recorded on a computer, which converts the sequence of colours into a sequence of bases.
Sanger’s original method was very slow, but the technology improves every year and modern “next generation” machines can sequence one million bases per second, equivalent to 1 human genome per hour! It is now possible for individuals to have their personal genomes sequenced.

Why Sequence?
1. To find out more about DNA and how it works, for example, non-coding DNA, alternative splicing, etc.
2. To compare sequences between species and so deduce phylogenetic relationships (unit 5).
3. To compare sequences between individuals and so deduce family relationships.
4. To identify different alleles at a gene locus to explore human diversity and history.
5. To identify alleles linked with diseases and so plan personalised medical treatments.
Once a gene sequence is known the amino acid sequence of the protein that the DNA codes for can also be determined, using the genetic code table. Computers can identify start and stop codons and so work out where genes are and what they do. Although the entire human genome was sequenced in 2003, we still haven’t identified all the genes, but it is estimated that there are about 20,000. This is a surprisingly small number, but, as we have seen, these 20,000 genes code for more than 500,000 different proteins, due to alternative splicing.

Genetically-determined conditions
Our genes determine all our characteristics, including our state of health, and there is hope that a better understanding of human DNA will lead to new understanding of disease. There are a few conditions that are known to be caused by an allele of a single gene, such as haemophilia, muscular dystrophy, cystic fibrosis and Huntington’s disease.
It is hoped that these single-gene disorders will become treatable using gene therapy, although none has yet been successful. However, these single-gene disorders are rare, and most diseases, like cancer and heart disease, are caused by many alleles interacting. This makes them very difficult to understand and treat, but as the genomes of more and more individuals are being sequenced, we can start to correlate certain patterns of alleles with certain conditions. This is likely to lead to more personalised medicine, where treatments can be tailored to patients’ genotypes, not just their symptoms.

8. DNA Profiling
DNA profiling (or genetic fingerprinting) is used to distinguish between DNA samples from different people and is widely used in forensics and paternity testing. 99.9% of human DNA has exactly the same sequence in every person, including the protein-coding DNA and all the non-coding DNA that makes the important ncRNAs. However, there is enough variation in the remaining 0.1% (over 100,000 base pairs) to be used to distinguish one individual from another.
Scientists have discovered a number of regions of non-coding DNA that contain simple repetitive sequences called STRs (short tandem repeats), for example the sequence GATAGATAGATAGATAGATA contains five repeats of the 4-base sequence GATA. Everyone has these STR sequences at the same loci, but different people have different numbers of repeats. So these STR regions are known as Variable Number Tandem Repeat (VNTR) sequences. There are typically around 10 different variants, or alleles, at each VNTR locus (say 3-13 repeats), so each allele is shared by around 10% of the population. This obviously isn’t good enough for a unique identification, but forensic scientists usually look at 17 different VNTR loci on 17 different chromosomes.
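To make the idea of a repeat count concrete, here is a minimal Python sketch that counts the longest run of tandem repeats in a sequence, using the GATA example above. The function name `count_tandem_repeats` and the flanking bases in the two "people" are assumptions made purely for illustration. Real profiling measures fragment lengths after PCR rather than scanning raw sequence.

```python
def count_tandem_repeats(sequence: str, unit: str) -> int:
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    sequence, unit = sequence.upper(), unit.upper()
    best = 0
    for start in range(len(sequence)):
        run = 0
        # Count consecutive copies of the repeat unit beginning at `start`.
        while sequence.startswith(unit, start + run * len(unit)):
            run += 1
        best = max(best, run)
    return best


# The example locus from the text: five repeats of GATA.
print(count_tandem_repeats("GATAGATAGATAGATAGATA", "GATA"))  # prints 5

# Two hypothetical people with different alleles at the same locus
# (the flanking bases are invented for illustration).
person_a = "CCTG" + "GATA" * 7 + "TTCA"
person_b = "CCTG" + "GATA" * 11 + "TTCA"
print(count_tandem_repeats(person_a, "GATA"),
      count_tandem_repeats(person_b, "GATA"))  # prints 7 11
```

The sketch checks every start position rather than skipping ahead, which is slower but avoids missing a run when the repeat unit can overlap itself.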
Since there are two versions of each chromosome in a diploid cell (the maternal and paternal chromosomes), there are actually 34 loci tested. If each locus has 10 different variants this gives a total of 10^34 different combinations. And since there are fewer than 10^10 humans, it is extraordinarily unlikely that any two individuals will have exactly the same pattern of VNTRs.
This diagram shows two VNTR loci of two individuals. They have the same number of repeats at locus 2, but different numbers at locus 1. If we could isolate just the VNTR loci and run them on an electrophoresis gel, they would separate by length of DNA (i.e. number of repeats), and so give a different pattern of bands on the gel. This is the basis of the DNA profiling method, invented by Sir Alec Jeffreys at Leicester University in 1984.

Method
1. Cells are collected by taking a buccal swab or from evidence at a crime scene, and DNA is extracted from the cells. As little as 100 pg of DNA is needed.
2. The DNA is amplified by PCR. Remember, in PCR only the DNA between the primers is amplified. So 17 pairs of PCR primers are carefully constructed to match target sequences at known VNTR regions of 17 chromosomes. An 18th locus on the sex chromosomes is also targeted to reveal the individual’s sex. The primers are also fluorescent and are made so that each primer fluoresces a different colour. This diagram shows two loci being targeted by the PCR primers.
3. All the amplified DNA fragments are separated by capillary electrophoresis, just as in DNA sequencing. The light emitted by the fluorescent primers is detected by a sensor and recorded on a computer, which gives a print-out of the bands. This is the DNA profile.
This shows a traditional DNA fingerprint gel, showing bands of DNA fragments from many different loci.
Samples of the same DNA (or from identical twins) give the same banding pattern, but samples from different people give different banding patterns. These gels can be quite difficult to interpret.
This shows a modern computer-generated DNA profile, showing homologous pairs of bands for each of 16 loci, including X and Y. The labels show the numbers of repeats represented by each band, so the profile is very easy to interpret. All DNA profiles are stored on a UK National DNA Database.
DNA profiling is used:
in forensic science, to match DNA samples collected from a crime (e.g. from sperm, blood, hair, skin) with that of suspects. DNA evidence is very powerful, but convictions can be questioned if there is any suspicion of DNA contamination.
to determine family relationships, e.g. paternity testing. Since children inherit half their DNA from each parent, one band from each locus should match each parent.
to prevent undesirable inbreeding during breeding programs in farms and zoos (unit 5).
to measure genetic diversity within a population.
to establish phylogenetic relationships between species, including extinct ones, using DNA extracted from archaeological remains.

9. Vectors
Now we turn to genetically modifying living cells. The next three techniques are concerned with transferring genes (DNA) from the test tube into cells, so that the genes can be replicated and expressed in those cells. To do this we first need a vector. In genetic engineering a vector is a length of DNA that carries the gene we want into a host cell. A vector is needed because a length of DNA containing a gene on its own won’t actually do anything inside a host cell. Since it is not part of the cell’s normal genome it won’t be replicated when the cell divides, it won’t be expressed, and in fact it will probably be broken down pretty quickly.
A vector gets round these problems by having these properties:
It is big enough to hold the gene we want (plus a few others), but not too big.
It is circular (or more accurately a closed loop), so that it is less likely to be broken down (particularly in prokaryotic cells where DNA is always circular).
It contains control sequences, such as a replication origin and a transcription promoter, so that the gene will be replicated, expressed, or incorporated into the cell’s normal genome.
It contains marker genes, so that cells containing the vector can be identified.
Common vectors include bacterial plasmids, viruses and yeast artificial chromosomes. Plasmids are the most common kind of vector, so we shall look at how they are used in some detail.
Plasmids are short circular bits of DNA found naturally in bacterial cells. A typical plasmid contains 3-5 genes and there are usually around 10 copies of a plasmid in a bacterial cell. Plasmids are copied separately from the main bacterial DNA when the cell divides, so the plasmid genes are passed on to all daughter cells. They are also used naturally for exchange of genes between bacterial cells, so bacterial cells will readily take up a plasmid. Because they are so small, plasmids are easy to handle in a test tube, and foreign genes can quite easily be incorporated into them using restriction enzymes and DNA ligase.
One of the first plasmids to be used was the R-plasmid. This plasmid contains a replication origin, several recognition sequences for different restriction enzymes (with names like PstI and EcoRI), and two marker genes, which in this case confer resistance to antibiotics. The R-plasmid gets its name from these resistance genes.
The diagram below shows how a gene can be incorporated into a plasmid using restriction and ligase enzymes.
1. A restriction enzyme (PstI here) is used to cut the gene from the donor DNA, with sticky ends.
2.
The same restriction enzyme cuts the plasmid in the middle of one of the marker genes (we’ll see why this is useful later).
3. The gene and plasmid are mixed in a test tube and they anneal because they were cut with the same restriction enzyme and have the same sticky ends.
4. The fragments are joined covalently by DNA ligase to form a hybrid vector (in other words a mixture or hybrid of bacterial and foreign DNA).
5. Several other products are also formed: some plasmids will simply re-anneal with themselves to re-form the original plasmid, and some DNA fragments will join together to form chains or circles. These different products cannot easily be separated, but it doesn’t matter, as the marker genes can be used later to identify the correct hybrid vector.
This technique takes place entirely in test tubes; there are no cells involved. So the next step is to insert our modified DNA (the hybrid or recombinant vector) into a living cell.

10. Transformation
Transformation means inserting new DNA (usually as a vector) into a living cell (called a host cell), which is thus genetically modified, or transformed. The terms transfection and transduction are also used for similar processes. A transformed cell can replicate and express the genes in the new DNA. DNA is a large molecule that does not readily cross cell membranes, so the membranes must be made permeable in some way. There are different ways of doing this depending on the type of host cell.
Microbes
Heat Shock. Bacterial and animal cells in culture can be made to take up DNA from their surroundings by suddenly raising the temperature by about 40°C.
Electroporation. The most efficient method of delivering genes to bacterial cells is to use a high-voltage pulse, which temporarily disrupts the membrane and allows the plasmid to enter the cell.
Plants
Gene Gun.
Tiny gold or tungsten particles coated with DNA can be fired at plant cells using a compressed air gun. The gene gun gets round the problem of inserting DNA through the tough cell wall to modify plant cells. The high-velocity particles penetrate the cell wall and deliver the DNA to the plant cell nucleus.
Plant Infection. The bacterium Agrobacterium tumefaciens is a pathogen of many dicot plants, where its infection causes crown gall disease. A. tumefaciens possesses a plasmid, called the Ti plasmid, part of which is integrated into the plant cells' chromosomal DNA during infection. Scientists have exploited this infection mechanism to genetically modify plants. First the new gene is inserted into the Ti plasmid in A. tumefaciens cells, which are grown in culture. Then the target plant cells are infected with transformed A. tumefaciens cells, which insert the new gene into some plant cells. Finally, whole new plants are grown from these modified cells by micropropagation.
Animals and Humans
Micro-Injection. To transform individual cells, such as fertilised animal egg cells, the DNA is injected directly into the nucleus using an incredibly fine micro-pipette.
Liposomes. Human cells in vivo can be transformed by DNA encased in liposomes, which fuse with the cell membrane, delivering the DNA into the cell.
Viruses. Human cells in vivo can be infected by genetically-engineered viruses, which deliver the DNA into host cells. The viruses must first be made safe, so they can’t cause disease.
Most of these transformation techniques have a very low success rate (≪1%), so we need to be able to identify those few cells that have taken up the foreign DNA and been transformed. This is where the plasmid’s marker genes are used.

11. Marker Genes
Marker genes (or reporter genes) are used to find which cells have actually taken up the hybrid vector.
Following transformation, there are (at least) these four possible outcomes:

[Diagram: the four possible outcomes of transformation]

Vectors contain two different marker genes, which are needed to identify the required cells.

The first marker gene distinguishes cells that have taken up a plasmid from those that haven’t. Cells with the plasmid now have a gene for resistance to an antibiotic such as tetracycline (a gene for resistance, not a gene for the antibiotic itself), so if all the cells are grown on a medium containing tetracycline, all the normal untransformed cells (>99.99%) are killed. Only the few transformed cells will survive, and these can then be grown and cloned on another plate.

The second marker gene distinguishes cells that have taken up the hybrid plasmid from those that have taken up the original plasmid. The trick here is that the foreign DNA is inserted inside the second marker gene, so cells with the hybrid plasmid cannot make that gene product. Different genes are used for this second marker:

The marker gene can be a gene for resistance to another antibiotic, such as ampicillin. Cells with the hybrid vector are not resistant to ampicillin. Since this means killing the cells we want, the ampicillin test is done on a replica plate. Colonies that grow on the first (tetracycline) plate but not on the replica (ampicillin) plate are the ones we want.
The marker gene can be a gene for the enzyme β-galactosidase (lactase). This enzyme turns a white substrate in the agar plate into a blue product. So colonies of cells with the original plasmid turn blue, while those with the hybrid plasmid remain white, and can easily be identified.
The marker gene can be a gene for green fluorescent protein (GFP). Colonies of cells with the original plasmid fluoresce green in UV light, while those with the hybrid plasmid do not fluoresce, and can easily be identified.

Gene Cloning

Gene cloning simply means making multiple copies of a piece of DNA.
It is a necessary step in just about any aspect of molecular biotechnology, such as genetic engineering, genome sequencing or genetic fingerprinting. There are two different ways to clone DNA:

In vitro gene cloning uses PCR to clone DNA in the test tube.
In vivo gene cloning uses restriction enzymes, vectors, DNA ligase, transformation of bacterial cells, marker genes and growth cultures to clone DNA inside bacterial cells.

The two techniques have different advantages and disadvantages:

in vitro cloning (using PCR) | in vivo cloning (using living cells)
Simple, automated technique, which can be completed in a few hours | Complex, multi-step process, needing several days to complete
Very sensitive, can clone a single molecule | Large amounts of original DNA needed
Can use DNA from different kinds of source, including degraded DNA from crime scenes or archaeological sources | Needs intact, pure DNA
Clones DNA molecules up to 1 kbp long | Clones DNA molecules up to 2 Mbp long
High error rate, since no error-correction | Low error rate due to cellular error-correcting mechanisms
DNA is made in the test tube, so cannot be expressed directly | DNA is made in cells, so can be expressed easily

12. Knockout Mice

We have now identified many thousands of genes in humans and other organisms, but in many cases we don’t actually know what the genes do! The genes are identified by examining entire genome sequences and looking for indicators of genes, such as start and stop codons, promoter sequences, etc. Computers then compare gene sequences with the sequences of known genes from other organisms (such as Drosophila) to try to identify their function, but similar-looking genes may have different functions in different organisms. Of the 20,000 genes in the human genome, we know the approximate function of about 15,000.

[Figure: some of the 15,000 identified human genes.]
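The idea of looking for indicators of genes, such as start and stop codons, can be illustrated with a minimal Python sketch. This is not the software used in real genome annotation: it only scans one strand of a DNA string, in the three forward reading frames, for a start codon (ATG) followed in the same frame by a stop codon (TAA, TAG or TGA). The function name find_orfs, the min_codons cut-off and the example sequence are all made up for illustration.

```python
# Illustrative sketch only: real gene-finding software also checks the
# reverse strand, promoter sequences, introns and much more.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each open reading frame found.

    An ORF here is ATG ... stop, all in the same reading frame.
    start and end are 0-based positions in dna; end is exclusive.
    min_codons is the minimum number of codons, counting ATG but not
    the stop codon (a hypothetical cut-off chosen for illustration).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                       # three forward reading frames
        start = None                             # position of the current ATG, if any
        for i in range(frame, len(dna) - 2, 3):  # step one codon at a time
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None                     # go on to look for the next ORF
    return orfs


# Made-up example sequence containing one short ORF in the third reading frame
example = "CCATGGCTTGGTAACC"
for start, end, seq in find_orfs(example):
    print(start, end, seq)
```

A candidate sequence found this way is only a possible gene; as the notes say, its function still has to be worked out by comparison with known genes or by experiment.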
One way to identify the real physiological function of a gene is to make a knockout mouse, where one gene is inactivated. Mice are fairly closely related to humans, compared to other laboratory animals like fruit flies and nematode worms, and they share most of their genes with us. They can also be bred and observed easily in a lab, and there are fewer ethical problems in performing genetic experiments on mice compared to humans. By comparing the appearance, physiology, cells and biochemistry of the knockout mouse with a normal one, the function of the missing gene can often be deduced.

This technique is often used to study human diseases such as cancer, obesity, heart disease, diabetes, arthritis, substance abuse, anxiety, aging and Parkinson’s disease. For example, knockout mice with an inactive CFTR gene develop cystic fibrosis like humans do, and have been used to study the disease and to trial potential gene therapy cures, where new DNA is introduced to cells in vivo to try to cure the disease. It is more ethical to perform early trials to test the safety of gene therapy on mice than on humans.

Knockout mice are made by a technique called gene targeting. The first knockout mouse was created by Capecchi, Evans and Smithies in 1989, for which they were awarded the Nobel Prize in 2007. The gene targeting technique is described on the next page.

Gene Targeting Method

1. We start with the sequence of the mouse genome, which has already been determined, and we identify a gene sequence from it (though we don’t yet know what the gene does). We then make a targeting vector, with the same sequence as the gene and its surrounding flanking sequences, but with some DNA removed so the gene won’t work, and a marker gene added.
2. Isolate pluripotent stem cells from a brown mouse embryo in a Petri dish, add the targeting vector and let it enter the stem cells using electroporation (see pxx).
3.
The targeting vector will line up with the original gene in the stem cell’s DNA because it has almost the same sequence, and DNA will be swapped between the vector and the host DNA in a process called homologous recombination. This process is like crossing over in meiosis, but it happens during the normal cell cycle between sister chromatids. So gene targeting hijacks a normal homologous recombination process to replace the normal gene with a modified (knockout) gene.
4. The stem cells continue to grow in culture. The few cells that were transformed are identified using the marker gene and the others are killed.
5. The transformed cells are injected into the inner cell mass of a blastocyst embryo from a different (white) mouse using a micropipette, so the embryo contains a mixture of normal cells from the white mouse and transformed cells from the brown mouse. The embryo is implanted into a surrogate mother.
6. The mother gives birth to chimera mice, that is, mice with cells from two different sources. The chimeras are obvious because they have distinct white and brown patches of fur.
7. We want mice with 100% transformed cells, not chimeras. We get these by breeding from the chimeras. Some chimeras will, by chance, have gonad tissue developed from transformed stem cells, so they will produce gametes with modified DNA. From their offspring we can select homozygous transformed mice. These are the knockout mice.

These steps are shown on the next page.

[Diagram: the gene targeting method]

Genetically Modified Organisms

We have looked at some of the many techniques used in biotechnology. We’ll now turn to some applications of these techniques. The applications involve altering the genes in a living organism to produce a Genetically Modified Organism (GMO) with a new genotype. The GMO is designed to benefit humans in some way.
If a foreign gene is copied from one species into another the GMO is called a transgenic organism, but remember that not all GMOs are transgenic: the genetic modification might just alter an existing gene so that its product or its expression is changed. We’ll consider the applications in three groups.

Gene Products | Using genetically modified organisms (usually microbes) to produce chemicals (usually proteins) for medical or industrial applications.
New Phenotypes | Using gene technology to alter the characteristics of organisms (usually farm animals or crops).
Gene Therapy | Using gene technology on humans to treat a disease.

Gene Products

The biggest and most successful kind of genetic engineering is the production of gene products. These products are of medical, agricultural or commercial value to humans. This table shows a few examples of genetically engineered products that are already available.

Product | Use | Host Organism
Insulin | human hormone used to treat diabetes | bacteria, yeast
HGH | human growth hormone, used to treat dwarfism | bacteria
Encephalin | human hormone | plants
BST | bovine growth hormone, used to increase milk yield of cows | bacteria
Factor VIII | human blood clotting factor, used to treat haemophiliacs | pigs
Anti-thrombin | anti-blood clotting agent used in surgery | goats
Collagen | used in reconstructive surgery | plants
Vaccines | hepatitis B antigen, for vaccination | yeast, plants
Antibodies | research and clinical use | goats, plants
AAT | enzyme inhibitor used to treat cystic fibrosis and emphysema | sheep, yeast
α-glucosidase | enzyme used to treat Pompe’s disease | rabbits
DNAse | enzyme used to treat CF | bacteria
rennin | enzyme used in manufacture of cheese | bacteria/yeast
cellulase | enzyme used in paper production | bacteria
PHB | biodegradable plastic | plants

The products are mostly proteins, which are produced directly when a gene is expressed, but they can also be non-protein products produced by genetically-engineered enzymes.
The basic idea is to transfer a gene (often human) to another host organism (usually a microbe) so that it will make the gene product quickly, cheaply and ethically. It is also possible to make “designer proteins” by altering gene sequences, but while this is a useful research tool, there are no commercial applications yet.

New Phenotypes

This means altering the characteristics of organisms by genetic engineering. The organisms are generally commercially-important crops or farm animals, and the object is to improve their quality in some way. This can be seen as a high-tech version of selective breeding, which has been used by humans to alter and improve their crops and animals for at least 10 000 years. This table gives an idea of what is being done.

Organism | Modification
long life tomatoes | There are two well-known projects, both affecting the gene for the enzyme polygalacturonase (PG), a pectinase that softens fruits as they ripen. Tomatoes that make less PG ripen more slowly and retain more flavour. The American “Flavr Savr” tomato used antisense technology to silence the gene, while the British Zeneca tomato disrupted the gene. Both were successful and were on sale for a few years, but neither is produced any more.
Insect-resistant crops | Genes for various powerful protein toxins have been transferred from the bacterium Bacillus thuringiensis to crop plants including maize, rice and potatoes. These Bt toxins are thousands of times more powerful than chemical insecticides, and since they are built into the crops, insecticide spraying (which is non-specific and damages the environment) is unnecessary.
virus-resistant crops | The gene for a virus coat protein has been cloned and inserted into tobacco, potato and tomato plants. The coat protein seems to “immunise” the plants, which are much more resistant to viral attack.
pest-resistant legumes | The gene for an enzyme that synthesises a chemical toxic to weevils has been transferred from Bacillus bacteria to the Rhizobium bacteria that live in the root nodules of legume plants. These root nodules are now resistant to attack by weevils.
Nitrogen-fixing crops | This is a huge project, which aims to transfer the 15-or-so genes required for nitrogen fixation from the nitrogen-fixing bacteria Rhizobium into cereals and other crop plants. These crops would then be able to fix their own atmospheric nitrogen and would not need any fertiliser. However, the process is extremely complex, and the project is nowhere near success.
crop improvement | Proteins in some crop plants, including wheat, are often deficient in essential amino acids (which is why vegetarians have to watch their diet so carefully), so the protein genes are being altered to improve their composition for human consumption.
tick-resistant sheep | The gene for the enzyme chitinase, which kills ticks by digesting their exoskeletons, has been transferred from plants to sheep. These sheep should be immune to tick parasites, and may not need sheep dip.
Fast-growing fish | A number of fish species, including salmon, trout and carp, have been given a gene from another fish (the ocean pout) which activates the fish’s own growth hormone gene so that they grow larger and more quickly. Salmon grow to 30 times their normal mass at 10 times the normal rate.
environment cleaning microbes | Genes for enzymes that digest many different hydrocarbons found in crude oil have been transferred to Pseudomonas bacteria so that they can clean up oil spills.

Genetically Modified Soya Beans

Soya beans (Glycine max, called soybean in the USA) are leguminous bean plants originally from China.
The beans are high in protein (since they’re legumes) and low in fat, so have become popular in human diets and are now the most widely-cultivated legume worldwide. Soya beans have three main commercial uses:

Soya bean lipids are extracted from the beans and used as vegetable oil for frying and baking. The oil is also used in the manufacture of paint and cosmetics.
Soya bean meal is the cell mass remaining after the oil is removed. It has a high protein content and is used as livestock feed and dog food.
The beans can be eaten raw or used to make soy sauce, soya flour, soya milk, tofu and miso.

The biggest producer of soya beans is the United States, and 93% of the US soya bean crop is now genetically modified. There are two reasons for genetic modification:

1. Herbicide resistance (Roundup-Ready Soybeans). The herbicide glyphosate (sold as Roundup by Monsanto) kills plants by inhibiting an enzyme unique to plants. Soya beans have now been genetically modified with the gene for a version of this enzyme from the bacterium Agrobacterium tumefaciens, which is not affected by glyphosate. These soya beans are therefore resistant to the herbicide. Fields can safely be sprayed with this herbicide, which will kill all weeds, but not the GM soya bean. This helps to improve soya bean yield but means continued use of agrochemicals, so is controversial.
2. Modified lipid composition (Vistive Gold). Normal soya bean oil already has a good mix of lipids with a low concentration of unhealthy saturated fats and a high concentration of healthy polyunsaturated fats. However the polyunsaturated fats are susceptible to oxidation on heating in a fryer or on long-term storage, which makes the oil rancid. Monsanto has made a GM soya bean with fewer polyunsaturated fats, giving the oil a longer shelf life. Two enzymes in the biosynthetic pathway of fatty acids have been knocked out to alter the fatty acid composition.
Evaluating Biotechnology

The whole point of creating genetically-modified organisms is to benefit humans, and the benefits are usually fairly obvious, but nevertheless there has been some vocal opposition to GMOs. Opposition is often based on ethical, moral or social grounds, such as harm to animals or the environment, though there can also be more practical issues, such as distrust of large corporations.

Benefits

Medicines and drugs can be produced safely in large quantities from microbes rather than from slaughtered animals. These medicines benefit humans and can spare animal suffering as well.
Agricultural productivity can be improved while using fewer pesticides or fertilisers, so helping the environment.
GM crops can grow on previously unsuitable soil or in previously unsuitable climates.
GM crops can improve the nutrition and health of millions of people by improving the nutritional quality of their staple crops.

Risks

Risks to the modified organism. Genetic modification of an organism may have unforeseen genetic effects on that organism and its offspring. These genetic effects could include metabolic diseases or cancer, and would be particularly important in vertebrate animals, which have a nervous system and so are capable of suffering. The research process may also harm animals.
Transfer to other organisms. Genes transferred into GMOs could be transferred again into other organisms, by natural accidents. These natural accidents could include horizontal gene transmission in bacteria, cross-species pollination in plants, and viral transfer. This could result in a weed being resistant to a herbicide, or a pathogenic bacterium being resistant to an antibiotic. To avoid transfer via cross-pollination, genes can now be inserted into chloroplast DNA, which is not found in pollen.
Risks to ecosystems. A GMO may have an unforeseen effect on its food web, affecting other organisms.
Many ecosystems are delicately balanced, and a GMO could change that balance.
Risk to biodiversity. GMOs may continue the reduction in genetic biodiversity already occurring due to selective breeding.
Risks to human societies. There could be unexpected and complicated social and economic consequences from using GMOs. For example, if GM bananas could be grown in temperate countries, that would be disastrous for the economies of those Caribbean countries that rely on banana exports.
Risks to local farmers. Developing GMOs is expensive, and the ownership of the technology remains with the large multi-national corporations. This means the benefits may not be available to the farmers in third world countries who need them most.