MOLECULAR GENETICS

Unit I
Introduction to molecular genetics: Mendelian genetics – Mendel's work, laws of heredity, test cross, incomplete dominance. Epistasis. Multiple allelism. Sex determination in plants and animals: concepts of allosomes and autosomes; XX-XY, XX-XO, ZW-ZZ and ZO-ZZ types. Linkage and crossing over: linkage in maize and Drosophila, mechanism of crossing over and its importance. Chromosomal variations: general account of structural and numerical aberrations. Cytoplasmic inheritance. Variations. Karyotype in man. Inherited disorders: allosomal (Klinefelter syndrome and Turner syndrome) and autosomal (Down syndrome and cri du chat syndrome).

Unit II
Molecular organization of the chromosome: DNA as the genetic material. DNA replication in prokaryotes: enzymes and proteins involved in replication, theta model and rolling circle model. Copy number control. Mutations: types (spontaneous and induced); mutagens (physical and chemical); mutation at the molecular level. DNA repair, causes and mechanisms: photoreactivation, excision repair, mismatch repair, SOS repair. Recombination in prokaryotes.

Unit III
Transformation, conjugation and transduction in bacteria. Structure of prokaryotic and eukaryotic genes. Genetic code: properties and wobble hypothesis. Transcription in prokaryotes: mechanism, promoters and RNA polymerase, transcription factors. Translation mechanism in prokaryotes.

Unit IV
Regulation of gene expression in prokaryotes: operon concept (lac and trp). Attenuation and antitermination. DNA methylation. Regulatory sequences and trans-acting factors. Transcriptional regulation in prokaryotes and translational regulation. Insertional elements and transposons. Extrachromosomal inheritance. Transposable elements in maize and Drosophila.

Unit V
Genetic control and development: genetic determinants of development.
Early embryonic development in animals. Population genetics and evolution: allele frequencies and genotype frequencies, Hardy–Weinberg principle. Mutation and migration, natural selection. Genetic drift, causes of variation and artificial selection. Genetic and multifactorial interactions. Genetic load and genetic counseling.

References
1. Gardner (2001). Principles of Genetics. John Wiley & Sons Inc., New York.
2. Robert Tamarin (1996). Principles of Genetics, 5th ed. Wm. C. Brown Publishers, Boston.
3. Benjamin Lewin (2004). Genes VIII. Pearson Education Corporation, New Jersey.
4. Alberts B. (1994). Molecular Biology of the Cell. Garland Publishing Inc., New York.
5. Lodish & Baltimore (2004). Molecular Cell Biology, II ed. W. H. Freeman & Company, New York.
6. Freifelder D. (1987). Molecular Biology, II ed. Narosa Publishing House, New Delhi.

Unit I

Changes in Chromosome Structure

Chromosomal aberrations
Chromosomal aberrations are disruptions in the normal chromosomal content of a cell. An abnormal number of chromosomes (aneuploidy) or of chromosome sets may be lethal or give rise to genetic disorders. Chromosomal mutations produce changes in whole chromosomes (more than one gene) or in the number of chromosomes present. The structural changes are of four types:
1. Deletion - loss of part of a chromosome
2. Duplication - extra copies of a part of a chromosome
3. Inversion - reversal of the direction of a part of a chromosome
4. Translocation - part of a chromosome breaks off and attaches to another chromosome

Duplication occurs when a gene sequence is repeated in excess of the normal amount. Duplications also occur in normal chromosomes. They may have an adaptive advantage in that useful mutations may arise in the extra copies.

An inversion alters the position and sequence of the genes so that gene order is reversed within the chromosome. When the inverted segment of the chromosome contains the centromere, the inversion is called a heterobrachial or pericentric inversion.
When the inverted segment does not include the centromere, which lies outside the segment, the inversion is called a homobrachial or paracentric inversion.

A translocation occurs when a part of one chromosome is transferred to another, nonhomologous chromosome. When two nonhomologous chromosomes exchange segments, it is called a reciprocal translocation. Most translocations are reciprocal.

A deletion is the loss of a chromosome region caused by viral attack, chemicals, irradiation or other environmental factors. Most deletions are lethal or cause serious disorders. If a single break occurs near the end of a chromosome and the terminal segment is lost, it is called a terminal deletion. If two breaks occur and the intercalary segment between them is lost, it is called an intercalary deletion.

The gain or loss of DNA from chromosomes can lead to a variety of genetic disorders. Human examples include:
Cri du chat, which is caused by the deletion of part of the short arm of chromosome 5.
Wolf-Hirschhorn syndrome, which is caused by partial deletion of the short arm of chromosome 4.
Down's syndrome, which is usually caused by an extra copy of chromosome 21.
Edwards syndrome (trisomy-18), which is the second-most-common trisomy. Affected individuals have characteristic clenched hands and overlapping fingers.
Patau syndrome, also called D-syndrome or trisomy-13. Symptoms are somewhat similar to those of trisomy-18, but without the characteristic hand shape.
Klinefelter's syndrome (XXY). Men with Klinefelter syndrome are usually sterile, and tend to have longer arms and legs and to be taller than their peers. Boys with the syndrome are often shy and quiet, and have a higher incidence of speech delay and dyslexia. During puberty, without testosterone treatment, some of them may develop gynecomastia.
Turner syndrome (X instead of XX or XY). In Turner syndrome, female sexual characteristics are present but underdeveloped. People with Turner syndrome often have a short stature, low hairline, abnormal eye features and bone development, and a "caved-in" appearance to the chest.
XYY syndrome.
XYY boys are usually taller than their siblings. Like XXY boys and XXX girls, they are somewhat more likely to have learning difficulties.
Triple-X syndrome (XXX). XXX girls tend to be tall and thin. They have a higher incidence of dyslexia.

CHANGES IN CHROMOSOME NUMBER
Changes in chromosome number can occur by the gain or loss of individual chromosomes (aneuploidy), the loss of an entire set of chromosomes (monoploidy) or the gain of one or more complete sets of chromosomes (euploidy). Each of these conditions is a variation on the normal diploid number of chromosomes, and each can have drastic effects on phenotypic expression.

Aneuploidy - the abnormal condition where one or more chromosomes of a normal set are missing or present in more than their usual number of copies
Monoploidy - the loss of an entire set of chromosomes
Euploidy - an entire set of chromosomes is duplicated once or several times

The different conditions of aneuploidy are:
1. Nullisomy - the loss of both members of a pair of homologous chromosomes; individuals are called nullisomics and their chromosomal composition is 2N-2
2. Monosomy - the loss of a single chromosome; individuals are called monosomics and their chromosomal composition is 2N-1
3. Trisomy - the gain of an extra copy of a chromosome; individuals are called trisomics and their chromosomal composition is 2N+1
4. Tetrasomy - the gain of an extra pair of homologous chromosomes; individuals are called tetrasomics and their chromosomal composition is 2N+2

Development of aneuploids
The development of aneuploids is not well understood, but they may arise by a process called non-disjunction. Non-disjunction occurs when paired chromosomes do not separate during either meiosis I or meiosis II. The direct result of this event is that gametes develop that have too few or too many chromosomes.
If this occurs during meiosis I, no normal gametes are produced; if it occurs during meiosis II, half of the gametes will be normal and the other half will be abnormal.

Phenotypic Effects of Aneuploidy
The table below is a list of some of the aneuploids of humans. The best known condition is probably Down's syndrome, which results from an extra copy of chromosome 21. An important point to remember is that aneuploidy is usually lethal in animals, but can be tolerated to a greater extent in plants.

Monoploidy
An individual that contains one half the normal number of chromosomes is a monoploid and exhibits monoploidy. Monoploids are very rare in nature because recessive lethal mutations become unmasked, and thus they die before they are detected. These alleles normally are not a problem in diploids because their effects are masked by dominant alleles in the genome. In some species, such as bees and ants, the males are normally monoploid because they develop from unfertilized eggs. In species where monoploidy is not the normal condition, monoploid individuals are generally sterile. A stage in the life cycle of some fungal species can also be monoploid. Monoploidy has been applied in plant biotechnology to rapidly develop plants with a fixed genotype from anthers.

Euploidy

Euploidy in Animals
An individual whose genome contains three or more full copies of the haploid chromosome set is polyploid. As a general rule, polyploids can be tolerated in plants, but are rarely found in animals.

Euploidy in plants
Before we discuss polyploidy in plants in detail, a distinction must first be made between the two major classes of polyploids, autopolyploids and allopolyploids.
Autopolyploid - an individual that has an additional set of chromosomes identical to those of the parental species; an autotriploid would have the chromosomal composition AAA and an autotetraploid would be AAAA, both in comparison to the diploid with the chromosomal composition AA
Allopolyploid - an individual that has an additional set of chromosomes derived from another species; these typically occur after chromosomal doubling and their chromosomal composition would be AABB; if both species have the same number of chromosomes then the derived species would be an allotetraploid

An autotriploid could occur if a normal gamete (n) unites with a gamete that has not undergone reduction and is thus 2n. The zygote would be 3n. Triploids could also be produced by mating a diploid (gametes = n) with a tetraploid (gametes = 2n) to produce an individual that is 3n.

One generalization that has been made is that autopolyploids are larger than their diploid counterparts. For example, their flowers and fruits are larger, which appears to be the result of larger cell size rather than greater cell number. This increased size does offer some commercial advantages. Important triploid plants include some potatoes, bananas, watermelons and Winesap apples. All of these crops must be propagated asexually. Examples of tetraploids are alfalfa, coffee, peanuts and McIntosh apples. These are also larger and grow more vigorously.

Euploidy and Plant Speciation
One goal of plant breeding has been to develop allopolyploids that have new traits not seen in other species. The one beneficial allopolyploid developed to date is Triticale. This amphidiploid was developed from the pollination of wheat (Triticum, 2n=42) with rye (Secale, 2n=14). The goal of this experiment was to combine the rugged phenotype of rye with the high-yielding characteristics of wheat. The final chromosomal composition was 2n=56 chromosomes.
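
The Triticale arithmetic can be checked with a short Python sketch. It assumes that each parent contributes a normally reduced gamete and that the hybrid's chromosomes are then doubled once; the function names are illustrative and are not part of the original notes.

```python
def gamete_number(somatic_number: int) -> int:
    """Chromosome number of a normally reduced gamete (n) for a 2n parent."""
    if somatic_number % 2:
        raise ValueError("somatic chromosome number must be even")
    return somatic_number // 2


def amphidiploid_number(parent_a: int, parent_b: int) -> int:
    """Somatic number after hybridization followed by one chromosome doubling."""
    hybrid = gamete_number(parent_a) + gamete_number(parent_b)
    return 2 * hybrid


wheat, rye = 42, 14  # somatic (2n) numbers given in the text
f1_hybrid = gamete_number(wheat) + gamete_number(rye)  # 21 + 7
triticale = amphidiploid_number(wheat, rye)
print(f1_hybrid, triticale)
```

Under these assumptions the undoubled hybrid carries 28 chromosomes and the doubled amphidiploid carries 56, which matches the 2n=56 stated above.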
CROSSING OVER
Crossing over can be defined as a process which produces new combinations (recombinations) of genes by the interchange of corresponding segments between non-sister chromatids of homologous chromosomes. The following two types of crossing over have been recognized:
1. Germinal or meiotic
2. Somatic or mitotic

Germinal or meiotic: Crossing over commonly occurs in the germinal cells of the reproductive organs during gametogenesis, which includes meiosis. This type of crossing over is called germinal or meiotic crossing over.
Somatic or mitotic: Sometimes crossing over may occur during mitosis of somatic cells. This type of crossing over is called somatic or mitotic crossing over.

MECHANISM OF CROSSING OVER:
According to the widely accepted Whitehouse model of crossing over, the whole process includes the following four stages:
i. Synapsis
ii. Duplication of chromosome
iii. Crossing over
iv. Terminalization

Synapsis: During the zygotene stage of prophase I of meiosis in developing sex cells, the homologous chromosomes (which carry the same genes) come close to each other, and pairing, or synapsis, between them takes place. It is a prime event which provides the mechanical basis of heredity and variation.
Duplication of chromosome: Synapsis is followed by duplication of the chromosomes. During this stage, each homologous chromosome of a bivalent splits longitudinally and forms two identical sister chromatids, so that each bivalent is now composed of four chromatids.
Crossing over: Crossing over occurs between the homologous chromosomes only during the four-stranded or tetrad stage. During crossing over, two non-sister chromatids first break at corresponding points due to the activity of a nuclear enzyme called endonuclease.
Then a segment on one side of each break connects with a segment on the opposite side of the break, so that the two non-sister chromatids cross each other at the point of break and exchange. The fusion of each chromosomal segment with the opposite one takes place due to the action of the enzyme ligase. The crossing of two chromatids is called chiasma formation, and the resulting cross is called a chiasma (plural chiasmata). Crossing over thus includes the breaking of chromatid segments, their transposition and their fusion.
Terminalization: After the completion of crossing over, the non-sister chromatids start to repel each other because the force of synaptic attraction between them decreases. The chromatids separate progressively from the centromere towards the chiasma, and the chiasma itself moves in a zipper fashion towards the end of the tetrad. This movement of the chiasma is called terminalization.

The following theories explain the mechanism of crossing over:
Classical theory by Karl in 1932
Duplication theory by Belling in 1927
Copy choice theory by J. Lederberg in 1955
Break and exchange theory by Muller
Among them, the most widely accepted is the break and exchange theory.
http://www.contexo.info/DNA_Basics/Meiosis.htm

CYTOPLASMIC GENETIC SYSTEM
The part of the cell which lies between the plasma membrane and the nuclear envelope is known as the cytoplasm. It forms a most essential part of the cell because it is the seat of all biosynthetic and bioenergetic functions. Most phenotypic characters are controlled by the genes present in the chromosomes, but some characters are expressed by factors present in the cytoplasm. These factors are called plasmagenes. They also transmit characters from one generation to the next. This type of inheritance is called cytoplasmic inheritance. Together these factors constitute a cytoplasmic genetic system, e.g. mitochondrial DNA and chloroplast DNA.
The offspring receive cytoplasm only from the female parent and not from the male parent. Hence only the plasmagenes of the female parent are contributed to the offspring, and this is called maternal inheritance, e.g. shell coiling in snail.

Chloroplast DNA:
The chloroplasts of green plants are cytoplasmic organelles that contain various pigments and enzymes necessary for photosynthesis. This pigmentation was one of the easiest traits to observe in early plant breeding experiments. Chloroplasts are transmitted by the cytoplasm of the zygote through the female parent. This led to the hypothesis that chloroplasts must carry genes. We now know that chloroplasts contain a unique circular DNA that is completely different from nuclear DNA. The presence of this organelle DNA was demonstrated in 1962. Chloroplast DNA is 10-20 times smaller than E. coli DNA. Ris and Plaut first reported a DNA molecule in the chloroplast of Chlamydomonas. Later, DNA was reported from the chloroplasts of other algae and higher plants. The DNA of the chloroplast closely resembles bacterial DNA. The length of the chloroplast DNA varies between plant species. Chloroplast DNA has a molecular weight of 83-128 x 10^6 and a size of 1.21-1.93 x 10^5 bp. A number of genes are located on the circle; one important feature is the presence of two copies of the ribosomal DNA sequence, mostly found as inverted repeats.

The symbiotic origin of the chloroplast:
Because of certain characteristics, the chloroplast is comparable to a symbiotic organism living inside the plant cell. Chloroplasts divide, grow and differentiate. They contain circular DNA, ribosomal RNA and mRNA, and are able to conduct protein synthesis. On the basis of these similarities between chloroplasts and microorganisms, it has been suggested that chloroplasts might have resulted from a symbiotic relationship between an autotrophic microorganism and a host cell. But certain doubts still exist about the symbiotic origin of the chloroplast.
Chloroplasts are thought to have originated from blue-green algae. The chloroplast genome is larger and more complex than the mitochondrial genome. It contains approximately 150 genes. These genes code for many RNAs and proteins involved in gene expression and in photosynthesis.

Mitochondrial DNA:
The mitochondrion is a cell organelle. It is involved in ATP synthesis and produces energy; hence it is called the power house of the cell. Almost all eukaryotes have mitochondrial genomes. A mitochondrion contains one or two molecules of DNA, depending on its size. Mitochondrial DNA is highly twisted, double stranded and circular in shape. Mitochondrial genome sizes are variable and unrelated to the complexity of the organism. Most multicellular animals have small mitochondrial genomes with a compact genetic organization, the genes being close together with little space between them (e.g. the human mitochondrial genome). One of the larger sequenced mitochondrial genomes is that of the plant Arabidopsis thaliana. Lower eukaryotes such as yeast, and flowering plants, have larger, less compact mitochondrial genomes. DNA sequence analysis has revealed similarities between the mitochondrial genome and that of the bacterium Rickettsia. The length of the mitochondrial DNA resembles that of bacterial DNA.

Besides its circular shape, mitochondrial DNA differs from nuclear DNA in many ways. In 1968 Rabinowitch showed that mitochondrial DNA has more guanine and cytosine (GC) than nuclear DNA. As mitochondrial DNA is smaller than nuclear DNA, it contains only a small amount of coded information. All mitochondrial genomes contain genes for the mitochondrial rRNAs and at least some of the protein components of the respiratory chain. Parson & Simpson (1967) showed the presence of a DNA polymerase enzyme in mitochondria which helps in the duplication of mitochondrial DNA.
Because they contain DNA, mitochondria are capable of self-reproduction and may carry the biological information needed for a type of cytoplasmic inheritance.

The symbiotic origin of mitochondria:
Early cytologists such as Altmann and Schimper (1890) suggested the possibility that mitochondria originated from prokaryotic cells. According to their hypothesis, the mitochondrion is an intracellular parasite that entered the cytoplasm of eukaryotic cells early in evolution. Mitochondria are supposed to be derived from the bacterium Rickettsia. The DNA molecule of the mitochondrion and its replication process are similar to those of bacteria.

DOMINANCE RELATIONS
Different alleles of the same gene can have different levels of phenotypic expression (outward appearance). Because genes are usually present in two copies (sex chromosomes being the exception in diploid organisms like humans), this difference in expression can be important to the phenotype of the organism. Some alleles mask the effect of all other forms of the gene present. These alleles are said to be dominant. A dominant allele has the same effect when it is present as a single copy (heterozygous) as when it is present as two copies (homozygous). The allele not expressed in the heterozygous form is said to be recessive to the masking, dominant allele. In the pea plants studied by the Austrian monk and botanist Gregor Johann Mendel, the allele for tall plants was dominant to the allele for dwarf plants. When a gene is said to be dominant or recessive without any qualification, it is assumed that the effect is with reference to the wild type allele.

An effect called co-dominance can also occur, in which the effects of both alleles appear in the heterozygote. Many genes for characteristics that are not for discrete states (e.g. presence versus absence) can be co-dominant.
Examples of this include such characteristics as blood types in the ABO blood system. Alleles are not required (constrained) to act in an all or nothing manner, and degrees of blending may be possible. In certain instances, there is no dominance shown at all, and when two differing parental types cross there is simply an intermediate blending of the two forms. For example, snapdragons with white flowers, when mated with those with red flowers, give progeny with pink flowers. There are also situations between dominance and no dominance, which are known as partial dominance. In these situations an intermediate form is produced in the first generation, but it is markedly more similar to one of the parents than the other.

For the effects of a recessive allele to be observable, it has to be present in the homozygous form. In other words, there must be two copies of the same allele present. Recessive alleles can be hidden in one generation but may reappear in the next, depending upon the breeding that has occurred. Just because the allele cannot be directly observed does not mean it is not present.

INCOMPLETE DOMINANCE & CODOMINANCE
1. In incomplete dominance, neither allele has complete dominance over the other, and hence neither allele is FULLY expressed; instead, the heterozygote shows an intermediate character.
Loss of function mutation: Loss of function mutations are mutations that code for a substance which is completely defective, for example an enzyme that has no catalytic ability at all.
Example: Snapdragon
One allele codes for red flowers
One allele codes for white flowers
When both alleles are together, the phenotype expressed will be pink, which means that neither allele is FULLY expressed and an intermediate characteristic results.

GENOTYPES               RESULTING PHENOTYPE
RR = Homozygous Red     Red Flower
RW = Heterozygous       Pink Flower
WW = Homozygous White   White Flower
where R = allele for Red & W = allele for white

2.
In codominance, both alleles express themselves FULLY, as shown in the phenotype.
Gain of function mutation: Gain of function mutations are mutations that code for a substance which is functional, but perhaps leads to a slightly different outcome, for example an enzyme that functions best at a higher or lower temperature. Gain of function mutations are quite common.
Example: ABO Blood group
I^A codes for the production of antigen A
I^B codes for the production of antigen B
When both alleles are together, both characteristics are expressed FULLY, meaning that antigens A and B are both produced in the blood. Hence co-dominance.
Example 2: Coat colour in cattle
One allele codes for white patches
One allele codes for red patches
White and red patches found together indicate co-dominance.

GENOTYPES               RESULTING PHENOTYPE
RR = Homozygous Red     Red Fur
RW = Heterozygous       Red & White Fur
WW = Homozygous White   White Fur
where R = allele for Red & W = allele for white

Overdominance: When the heterozygote has a more extreme phenotype than either of the parents, it is usually referred to as overdominance, super dominance or heterodominance. For example, there is heterodominance when the heterozygote Aa, for a pair of factors which control size, is bigger than either homozygote. This type of allelic interaction is found with quantitative characters like size, production, vigour, etc.

GENE INTERACTION
The phenotypic ratios obtained by Mendel in garden peas demonstrate that one gene controls one character and that, of the two alleles of a gene, one allele is completely dominant over the other. Due to this, the heterozygote has a phenotype identical to that of the homozygous dominant parent. Soon after Mendel's work was rediscovered, instances came to light where a gene was not producing an individual effect. On the contrary, genes were interacting with each other to produce novel phenotypes which did not exhibit the dominance relationships observed in Mendel's experiments.
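
The snapdragon and cattle genotype tables above can be turned into a small Python sketch of a one-gene cross. It assumes that each parent passes on either of its two alleles with equal probability; the helper names `cross` and `phenotype_counts` are illustrative and are not part of the original notes.

```python
from collections import Counter
from itertools import product

# Genotype-to-phenotype tables taken from the two examples in the text.
SNAPDRAGON = {"RR": "red", "RW": "pink", "WW": "white"}          # incomplete dominance
CATTLE = {"RR": "red", "RW": "red and white", "WW": "white"}     # codominance


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes of a one-gene cross (each parent is two allele letters)."""
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort the letters so that "WR" and "RW" are counted as the same genotype.
        offspring["".join(sorted((allele1, allele2)))] += 1
    return offspring


def phenotype_counts(genotypes: Counter, table: dict) -> Counter:
    """Translate genotype counts into phenotype counts using a lookup table."""
    counts = Counter()
    for genotype, n in genotypes.items():
        counts[table[genotype]] += n
    return counts


f2 = cross("RW", "RW")  # two heterozygotes crossed
print(dict(f2))
print(dict(phenotype_counts(f2, SNAPDRAGON)))
print(dict(phenotype_counts(f2, CATTLE)))
```

Because the heterozygote has its own phenotype in both cases, the phenotype counts follow the 1:2:1 genotype counts directly, unlike the 3:1 ratio seen with complete dominance.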
In one of the first reported cases, by Kolreuter, the heterozygote showed a phenotype intermediate between the parental phenotypes. This was termed incomplete dominance or intermediate inheritance. In co-dominance the heterozygote expresses both parental phenotypes equally. Sometimes a gene masks the expression of another gene at a different locus. This is known as epistasis. On still other occasions a gene does not completely mask another gene, as in epistasis, but in some way modifies the effect of the second gene. Such genes are known as modifying genes; they either enhance or suppress the expression of a different gene. Some genes act together to produce an effect that neither gene can produce separately. Such genes are said to be complementary. Some genes copy other genes, so to speak, producing a similar effect; independent genes that produce the same effect are called duplicate genes. Some genes cause death. They are known as lethal genes.

Each chemical reaction in the cell involves the conversion of a precursor into an end product, each step being mediated by a specific enzyme. Each protein depends on a gene for its production. If any one of the enzymes is lacking, the substance cannot be converted into the product and the pathway stops at that step. A metabolic block occurs. This is an example of gene interaction.

EPISTASIS
When two different genes which are not alleles both affect the same character in such a way that the expression of one masks, inhibits or suppresses the expression of the other, it is called epistasis. The gene that suppresses is said to be epistatic and the gene which remains suppressed is hypostatic.

Dominant Epistasis: Epistasis due to a dominant gene is called dominant epistasis. In poultry, white birds belong to two different varieties, namely White Leghorns and White Wyandottes. The gene for white in Leghorns is dominant over the gene for coloured plumage. But the gene for white in Wyandottes is recessive.
A cross between White Leghorns and White Wyandottes gives an F1 of white birds with small dark flecks. When such birds are interbred, the F2 progeny segregate in the ratio of 13 white to 3 coloured birds. The genotype of White Leghorns is IICC and that of White Wyandottes is iicc. The gene I is epistatic to gene C. Even if a single I is present in the genotype, the phenotype will be white, so only birds that are ii and carry at least one C are coloured. This is a case of dominant epistasis because even one dominant allele of gene I is able to express itself.

Recessive Epistasis: Epistasis due to a recessive gene is called recessive epistasis. In mice, albinism is produced by a recessive allele in the homozygous state (aa). There is a different gene B which in the dominant state (BB and Bb) produces the grey coat colour called agouti, and when recessive (bb) leads to black coat colour. The recessive genotype for albinism (aa) is epistatic to the gene for agouti (BB and Bb) and also to its recessive, homozygous form (bb) for black. The presence of the dominant allele (AA or Aa) of the epistatic gene allows expression of gene B, so that agouti (BB and Bb) and black (bb) coat colours can be produced.

P:        Agouti (AABB)  X  Albino (aabb)
Gametes:  AB and ab
F1:       Agouti (AaBb)  X  Agouti (AaBb)
Gametes:  AB, Ab, aB, ab
F2:
        AB             Ab             aB             ab
AB      AABB Agouti    AABb Agouti    AaBB Agouti    AaBb Agouti
Ab      AABb Agouti    AAbb Black     AaBb Agouti    Aabb Black
aB      AaBB Agouti    AaBb Agouti    aaBB Albino    aaBb Albino
ab      AaBb Agouti    Aabb Black     aaBb Albino    aabb Albino

Phenotypic ratio: 9 Agouti : 3 Black : 4 Albino

Other examples: epistasis in Drosophila, epistasis and blood group in man.

GENETIC MAPPING OR LINKAGE MAPPING

How do researchers create a genetic map?
To produce a genetic map, researchers collect blood or tissue samples from family members where a certain disease or trait is prevalent. Using various laboratory techniques, the scientists isolate DNA from these samples and examine it for the unique patterns of bases seen only in family members who have the disease or trait.
These characteristic molecular patterns are referred to as polymorphisms, or markers. Before researchers identify the gene responsible for the disease or trait, DNA markers can tell them roughly where the gene is on the chromosome. This is possible because of a genetic process known as recombination. As eggs or sperm develop within a person's body, the 23 pairs of chromosomes within those cells exchange - or recombine - genetic material. If a particular gene is close to a DNA marker, the gene and marker will likely stay together during the recombination process, and be passed on together from parent to child. So, if each family member with a particular disease or trait also inherits a particular DNA marker, chances are high that the gene responsible for the disease lies near that marker. The more DNA markers there are on a genetic map, the more likely it is that one will be closely linked to a disease gene - and the easier it will be for researchers to zero-in on that gene. One of the first major achievements of the HGP was to develop dense maps of markers spaced evenly across the entire collection of human DNA. The observations by Thomas Hunt Morgan that the amount of crossing over between linked genes differs led to the idea that crossover frequency might indicate the distance separating genes on the chromosome. Morgan's student Alfred Sturtevant developed the first genetic map, also called a linkage map. Sturtevant proposed that the greater the distance between linked genes, the greater the chance that non-sister chromatids would cross over in the region between the genes. By working out the number of recombinants it is possible to obtain a measure for the distance between the genes. This distance is called a genetic map unit (m.u.), or a centimorgan and is defined as the distance between genes for which one product of meiosis in 100 is recombinant. A recombinant frequency (RF) of 1 % is equivalent to 1 m.u. 
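
The relationship between recombinant counts and map units can be illustrated with a minimal Python sketch. The progeny counts are hypothetical, and the function names are illustrative rather than taken from the original notes; the simple proportionality is only a reasonable approximation for closely linked genes.

```python
def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of meiotic products (scored as progeny) that are recombinant."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


def map_distance_cm(recombinants: int, total: int) -> float:
    """Map distance in map units (centimorgans): 1% recombinants = 1 m.u."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return 100 * recombinants / total


# Hypothetical test cross: 170 recombinant progeny out of 1000 scored.
print(recombination_frequency(170, 1000))
print(map_distance_cm(170, 1000))
```

With these made-up counts the two genes would be placed about 17 map units apart.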
A linkage map is created by finding the map distances between a number of traits that are present on the same chromosome, ideally avoiding having significant gaps between traits to avoid the inaccuracies that will occur due to the possibility of multiple recombination events. Linkage mapping is critical for identifying the location of genes that cause genetic diseases. In an ideal population, genetic traits and markers will occur in all possible combinations with the frequencies of combinations determined by the frequencies of the individual genes. For example, if alleles A and a occur with frequency 90% and 10%, and alleles B and b at a different genetic locus occur with frequencies 70% and 30%, the frequency of individuals having the combination AB would be 63%, the product of the frequencies of A and B, regardless of how close together the genes are. However, if a mutation in gene B that causes some disease happened recently in a particular subpopulation, it almost always occurs with a particular allele of gene A if the individual in which the mutation occurred had that variant of gene A and there have not been sufficient generations for recombination to happen between them (presumably due to tight linkage on the genetic map). In this case, called linkage disequilibrium, it is possible to search potential markers in the subpopulation and identify which marker the mutation is close to, thus determining the mutation's location on the map and identifying the gene at which the mutation occurred. Once the gene has been identified, it can be targeted to identify ways to mitigate the disease. Linkage map A linkage map is a genetic map of a species or experimental population that shows the position of its known genes and/or genetic markers relative to each other in terms of recombination frequency, rather than as specific physical distance along each chromosome. 
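The 63% figure in the linkage-equilibrium example above is simply the product of the two allele frequencies. The sketch below reproduces that arithmetic and, as an assumption beyond the text, also computes the commonly used disequilibrium coefficient D (observed AB frequency minus the expected product). The observed value of 0.70 is hypothetical and serves only to show a nonzero D.

```python
def expected_haplotype_frequency(p_a: float, p_b: float) -> float:
    """Expected frequency of the A-B combination if the two loci are independent."""
    return p_a * p_b


def disequilibrium(observed_ab: float, p_a: float, p_b: float) -> float:
    """D = observed AB frequency minus the frequency expected under independence."""
    return observed_ab - expected_haplotype_frequency(p_a, p_b)


p_a, p_b = 0.90, 0.70          # allele frequencies from the example in the text
expected = expected_haplotype_frequency(p_a, p_b)

observed_ab = 0.70             # hypothetical observed frequency, for illustration only
d = disequilibrium(observed_ab, p_a, p_b)

print(f"expected AB = {expected:.2f}, observed AB = {observed_ab:.2f}, D = {d:.2f}")
```

A value of D near zero is what the ideal population in the text would show; a clearly nonzero D in a subpopulation is the signal that linkage-disequilibrium mapping looks for.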
A genetic map is a map based on the frequencies of recombination between markers during crossover of homologous chromosomes. The greater the frequency of recombination (segregation) between two genetic markers, the farther apart they are assumed to be. Conversely, the lower the frequency of recombination between the markers, the smaller the physical distance between them. Historically, the markers originally used were detectable phenotypes (enzyme production, eye color) derived from coding DNA sequences; eventually, confirmed or assumed noncoding DNA sequences such as microsatellites or those generating restriction fragment length polymorphisms (RFLPs) have been used. Genetic maps help researchers to locate other markers, such as other genes by testing for genetic linkage of the already known markers. A genetic map is not a physical map or gene map. Recombination frequency Recombination frequency (θ) is the frequency that a chromosomal crossover will take place between two loci (or genes) during meiosis. Recombination frequency is a measure of genetic linkage and is used in the creation of a genetic linkage map. A centimorgan (cM) is a unit that describes a recombination frequency of 1%. As an example of linkage, consider the classic experiment by William Bateson and Reginald Punnett. They were interested in trait inheritance in the sweet pea and were studying two genes—the gene for flower color (P, purple, and p, red) and the gene affecting the shape of pollen grains (L, long, and l, round). They crossed the pure lines PPLL and ppll and then selfcrossed the resulting PpLl lines. According to Mendelian genetics, the expected phenotypes would occur in a 9:3:3:1 ratio of PL:Pl:pL:pl. To their surprise, they observed an increased frequency of PL and pl and a decreased frequency of Pl and pL (see chart below). 
Bateson and Punnett experiment

Phenotype and genotype     Observed    Expected from 9:3:3:1 ratio
Purple, long (P_L_)        284         216
Purple, round (P_ll)       21          72
Red, long (ppL_)           21          72
Red, round (ppll)          55          24

Their experiment revealed linkage between the P and L alleles and the p and l alleles. The frequency of P occurring together with L and of p occurring together with l is greater than that of the recombinant Pl and pL. The recombination frequency cannot be computed directly from this experiment, but intuitively it is less than 50%. The progeny in this case received two dominant alleles linked on one chromosome (referred to as coupling or cis arrangement). However, after crossover, some progeny could have received one parental chromosome with a dominant allele for one trait (e.g. purple) linked to a recessive allele for a second trait (e.g. round), with the opposite being true for the other parental chromosome (e.g. red and long). This is referred to as repulsion or a trans arrangement. The phenotype here would still be purple and long, but a test cross of this individual with the recessive parent would produce progeny with a much greater proportion of the two crossover phenotypes. While such a problem may not seem likely from this example, unfavorable repulsion linkages do appear when breeding for disease resistance in some crops. When two genes are located on the same chromosome, the chance of a crossover producing recombination between the genes is directly related to the distance between the two genes. Thus, recombination frequencies have been used to develop linkage maps or genetic maps.

LETHALITY AND LETHAL GENES
The term lethal is applied to those changes in the genome of an organism which cause the death of that organism; the phenomenon is called lethality. The fully dominant lethal allele kills the carrier individual in both homozygous and heterozygous conditions. It occasionally arises by mutation from a normal allele.
Therefore the mutant dominant lethal allele is removed from the population in the same generation in which it arose. The recessive lethal allele kills the carrier individual only in the homozygous condition. Completely lethal genes usually cause the death of the zygote, either later in embryonic development or even after birth or hatching. Complete lethality is the case where no individual attains the age of reproduction.

TYPES OF LETHALITY:
Zygotic lethality
Gametic lethality
Gametophytic lethality

EXAMPLE OF LETHAL ALLELES:
In snapdragon three types of plants occur:
1. Green plants with chlorophyll
2. Yellowish green plants with carotenoids
3. White plants without any chlorophyll
The homozygous green plants have the genotype CC and the homozygous white plants have the genotype cc. The heterozygotes between green and white plants have the genotype Cc. When heterozygotes are crossed together they produce plants in the ratio of 1:2:1, but the white plants die due to the absence of chlorophyll, which modifies the ratio to 1:2 (or 2:1, depending on the order in which the classes are written). In this case the homozygous recessive genotype is lethal.

In mice an incompletely dominant allele Y for yellow coat colour is lethal in the homozygous condition. When a heterozygous yellow mouse is crossed with another heterozygous yellow mouse, a 2:1 ratio is obtained instead of a 3:1 ratio, because the homozygous dominant genotype results in the death of the mouse. In Drosophila the genes for white eyes and vestigial wings are the best examples of sublethal genes. Both of these genes reduce the viability of flies to a great extent. In humans, congenital ichthyosis, amaurotic idiocy, sickle cell anemia, retinoblastoma and epiloia are examples of lethality caused by lethal genes.

LINKAGE
Genetic linkage was first discovered by the British geneticists William Bateson and Reginald Punnett shortly after Mendel's laws were rediscovered. Genetic linkage occurs when particular genetic loci or alleles for genes are inherited jointly.
Genetic loci on the same chromosome are physically connected and tend to stay together during meiosis, and are thus genetically linked. This is called autosomal linkage. Alleles for genes on different chromosomes are usually not linked, due to independent assortment of chromosomes during meiosis. Because there is some crossing over of DNA when the chromosomes segregate, alleles on the same chromosome can be separated and go to different daughter cells. There is a greater probability of this happening if the alleles are far apart on the chromosome, as it is more likely that a cross-over will occur between them. The relative distance between two genes can be calculated using the offspring of an organism showing two linked genetic traits, and finding the percentage of the offspring where the two traits do not run together. The higher the percentage of descendants that do not show both traits, the farther apart on the chromosome the two genes are. When genes occur on the same chromosome, they are usually inherited as a single unit. Genes inherited in this way are said to be linked, and are referred to as "linkage groups." For example, in fruit flies the genes affecting eye color and wing length are inherited together because they appear on the same chromosome.

Views of classical geneticists on linkage:
I. Sutton's view on linkage: He suggested that each chromosome must bear more than a single gene and that the genes carried by one chromosome must be inherited together.
II. Coupling and repulsion hypothesis of Bateson and Punnett: The tendency of dominant or recessive alleles to remain together was explained as gametic coupling by Bateson and Punnett, e.g. in a dihybrid cross of sweet pea plants with purple or blue flowers and long pollen grains. The tendency of dominant and recessive alleles to repel each other was termed repulsion.
III.
Morgan's view on linkage: Morgan stated that the pair of genes of a homozygous parent tends to enter the same gamete and remain together, whereas the same genes of heterozygous parents tend to enter different gametes and remain apart from each other. The strength of linkage depends upon the distance between the linked genes in the chromosome. His concept of linkage helped to develop the theory of the linear arrangement of genes.
IV. Chromosome theory of linkage: The chromosome theory of linkage of Morgan and Castle states that:
The genes which show linkage are situated in the same pair of chromosomes.
The linked genes are arranged in a linear fashion on the chromosome.
The distance between the linked genes determines the strength of linkage. Closely located genes show stronger linkage than widely located genes.
The linked genes remain in their original combination during the course of inheritance.

Kinds of linkage: Linkage is of two kinds.
1. Complete linkage: When the linked genes are so closely located in chromosomes that they are inherited in the same linkage group for two or more generations in a continuous manner, they are called completely linked genes, and the phenomenon of inheritance is called complete linkage.
2. Incomplete linkage: The linked genes do not always stay together; they may be exchanged between the homologous non-sister chromatids during crossing over. This type of linkage is called incomplete linkage.

MATERNAL EFFECTS
The embryo is formed when a female gamete unites with a male gamete. In the vast majority of species, the female gamete is physically larger than the male gamete and provides the cytoplasm for the developing embryo. Within this cytoplasm are factors that were released by the nuclear genes of the female. Those factors may have specific effects upon the developing embryo. The female cytoplasm also contributes the mitochondria for all species as well as the chloroplast for plant species.
These two organelles contain DNA and control certain traits in the offspring. Those phenotypes that are controlled by nuclear factors found in the cytoplasm of the female are said to express a maternal effect. Those phenotypes controlled by organelle genes exhibit maternal inheritance.

Example: The classic phenotype which exhibits maternal effects is the coiling direction of snail shells. The coiling phenotype that is seen in the offspring is controlled by the genotype of the mother. The following crosses were made between pure line snails, and the following results were seen. By convention, the female is always given first. These results at first glance appear to be at odds with Mendel's laws. First, the F1 phenotype is not the same for both crosses. With other experiments, the results of reciprocal crosses (complementary crosses where the phenotypes of female and male are reversed in the initial parental cross) were equivalent, but with this experiment it appears that the female controls the phenotype. Yet the F2 appears to contradict this hypothesis, because the left- and right-coiled F1 individuals produced all right-coiled progeny. Furthermore, the 3:1 Mendelian ratio is not seen in the F2, but rather appears in the F3 generation. How can this result be explained?

First, let's look for results that are familiar. The F3 ratio of 3 right : 1 left for both crosses suggests that right-coiled shells are dominant to left-coiled shells. If this is the case, then we can assign the following genotypes to the pure lines:
Right-coiled shell: s+s+
Left-coiled shell: ss
The next observation is that the phenotype of the F1 generation is always that of the female parent. One hypothesis would suggest that the genotype of the female controls the phenotype of its offspring. Can this result be confirmed in the subsequent generations? If the genotypes we assigned to the parents are correct, then the genotype of F1 individuals from each cross is s+s (from s+s+ x ss and ss x s+s+).
If the female genotype does control the phenotype of its offspring, then we would predict that all the F2 snails would have right coils. This is the exact result that is seen. But what would the genotypes of the F2 snails be? If we intermate snails with the genotype s+s, the genotypic ratio should be 3 s+_ to 1 ss. These genotypes would not be expressed as a phenotype until the F3 generation. These are the results that were obtained. A general conclusion from all traits that express a maternal effect is that the normal Mendelian ratios are expressed one generation later than expected.

Mendelian inheritance
Mendelian inheritance (or Mendelian genetics or Mendelism) is a set of primary tenets relating to the transmission of hereditary characteristics from parent organisms to their children; it underlies much of genetics. These tenets were initially derived from the work of Gregor Mendel published in 1865 and 1866, which was "re-discovered" in 1900, and were initially very controversial. When they were integrated with the chromosome theory of inheritance by Thomas Hunt Morgan in 1915, they became the core of classical genetics.

History
The laws of inheritance were derived by Gregor Mendel, a 19th-century monk [1] conducting hybridization experiments in garden peas (Pisum sativum). Between 1856 and 1863, he cultivated and tested some 28,000 pea plants. From these experiments he deduced two generalizations which later became known as Mendel's Laws of Heredity or Mendelian inheritance. He described these laws in a two-part paper, "Experiments on Plant Hybridization", that he read to the Natural History Society of Brno on February 8 and March 8, 1865, and which was published in 1866.[2] Mendel's conclusions were largely ignored. Although they were not completely unknown to biologists of the time, they were not seen as generally applicable, even by Mendel himself, who thought they only applied to certain categories of species or traits.
A major block to understanding their significance was the importance attached by 19th-century biologists to the apparent blending of inherited traits in the overall appearance of the progeny, now known to be due to multigene interactions, in contrast to the organ-specific binary characters studied by Mendel.[1] In 1900, however, his work was "re-discovered" by three European scientists, Hugo de Vries, Carl Correns, and Erich von Tschermak. The exact nature of the "re-discovery" has been somewhat debated: De Vries published first on the subject, mentioning Mendel in a footnote, while Correns pointed out Mendel's priority after having read De Vries's paper and realizing that he himself did not have priority. De Vries may not have acknowledged truthfully how much of his knowledge of the laws came from his own work, or came only after reading Mendel's paper. Later scholars have accused Von Tschermak of not truly understanding the results at all.[1] Regardless, the "re-discovery" made Mendelism an important but controversial theory. Its most vigorous promoter in Europe was William Bateson, who coined the terms "genetics" and "allelomorph" (later shortened to "allele") to describe many of its tenets. The model of heredity was highly contested by other biologists because it implied that heredity was discontinuous, in opposition to the apparently continuous variation observable for many traits. Many biologists also dismissed the theory because they were not sure it would apply to all species, and there seemed to be very few true Mendelian characters in nature. However, later work by biologists and statisticians such as R.A. Fisher showed that if multiple Mendelian factors were involved in the expression of an individual trait, they could produce the diverse results observed.
Thomas Hunt Morgan and his assistants later integrated the theoretical model of Mendel with the chromosome theory of inheritance, in which the chromosomes of cells were thought to hold the actual hereditary material, and create what is now known as classical genetics, which was extremely successful and cemented Mendel's place in history. Mendel's findings allowed other scientists to predict the expression of traits on the basis of mathematical probabilities. A large contribution to Mendel's success can be traced to his decision to start his crosses only with plants he demonstrated were true-breeding. He also only measured absolute (binary) characteristics, such as color, shape, and position of the offspring, rather than quantitative characteristics. He expressed his results numerically and subjected them to statistical analysis. His method of data analysis and his large sample size gave credibility to his data. He also had the foresight to follow several successive generations (f2, f3) of his pea plants and record their variations. Finally, he performed "test crosses" (back-crossing descendants of the initial hybridization to the initial true-breeding lines) to reveal the presence and proportion of recessive characters. Without his careful attention to procedure and detail, Mendel's work could not have had the impact it made on the world of genetics. Mendel's Laws The principles of heredity were written by the Augustinian monk Gregor Mendel in 1865. Mendel discovered that by crossing white flower and purple flower plants, the result was a hybrid offspring. Rather than being a mix of the two, the offspring was purple flowered. He then conceived the idea of heredity units, which he called "factors", one which is a recessive characteristic and the other dominant. Mendel said that factors, later called genes, normally occur in pairs in ordinary body cells, yet segregate during the formation of sex cells. Each member of the pair becomes part of the separate sex cell. 
The dominant gene, such as the one for purple flowers in Mendel's plants, will hide the recessive gene, the one for white flowers. After Mendel self-fertilized the F1 generation and obtained the 3:1 ratio, he correctly theorized that genes can be paired in three different ways for each trait: AA, aa, and Aa. The capital A represents the dominant factor and the lowercase a represents the recessive. Mendel stated that each individual has two factors for each trait, one from each parent. The two factors may or may not contain the same information. If the two factors are identical, the individual is called homozygous for the trait. If the two factors have different information, the individual is called heterozygous. The alternative forms of a factor are called alleles. The genotype of an individual is made up of the many alleles it possesses. An individual's physical appearance, or phenotype, is determined by its alleles as well as by its environment. An individual possesses two alleles for each trait; one allele is given by the female parent and the other by the male parent. They are passed on when an individual matures and produces gametes, egg and sperm. When gametes form, the paired alleles separate randomly so that each gamete receives a copy of one of the two alleles. The presence of an allele does not guarantee that the trait will be expressed in the individual that possesses it. In heterozygous individuals the only allele that is expressed is the dominant one. The recessive allele is present but its expression is hidden. Mendel summarized his findings in two laws: the Law of Segregation and the Law of Independent Assortment. The Law of Segregation states that the members of each pair of alleles separate when gametes are formed. A gamete will receive one allele or the other. The Law of Independent Assortment states that two or more pairs of alleles segregate independently of one another during gamete formation.
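The two laws can be illustrated with a small enumeration. The sketch below is a minimal illustration, not Mendel's own procedure: it assumes complete dominance, unlinked genes, and equally likely gametes, and it represents a genotype as a list of two-letter strings, one per gene (a hypothetical convention chosen here). Segregation appears as each gamete receiving one allele per gene; independent assortment appears as every combination of alleles across genes being formed.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes a parent can form: one allele from each gene (segregation),
    in every combination across genes (independent assortment)."""
    return list(product(*genotype))


def cross(parent1, parent2):
    """Count offspring genotypes over all equally likely gamete pairings."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # sorted() places the uppercase (dominant) allele first, e.g. "Aa" not "aA"
        offspring = tuple("".join(sorted(a + b)) for a, b in zip(g1, g2))
        counts[offspring] += 1
    return counts


def phenotypes(offspring):
    """Collapse genotype counts to phenotype counts, assuming complete dominance."""
    result = Counter()
    for genotype, n in offspring.items():
        label = tuple(
            "dominant" if any(allele.isupper() for allele in gene) else "recessive"
            for gene in genotype
        )
        result[label] += n
    return result


monohybrid = cross(["Aa"], ["Aa"])
print(monohybrid)              # genotype counts AA : Aa : aa
print(phenotypes(monohybrid))  # dominant : recessive phenotype counts

dihybrid = cross(["Aa", "Bb"], ["Aa", "Bb"])
print(phenotypes(dihybrid))    # four phenotype classes out of 16 pairings
```

Enumerating all gamete pairings keeps the counts exact; a random simulation would only approach these ratios for large numbers of offspring.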
Law of Independent Assortment The Law of Independent Assortment, also known as "Inheritance Law", states that the inheritance pattern of one trait will not affect the inheritance pattern of another. While Mendel's experiments with mixing one trait always resulted in a 3:1 ratio (Fig. 1) between dominant and recessive phenotypes, his experiments with mixing two traits (dihybrid cross) showed 9:3:3:1 ratios (Fig. 2). But the 9:3:3:1 table shows that each of the two genes are independently inherited with a 3:1 ratio. Mendel concluded that different traits are inherited independently of each other, so that there is no relation, for example, between a cat's color and tail length. This is actually only true for genes that are not linked to each other. Independent assortment occurs during meiosis I in eukaryotic organisms, specifically anaphase I of meiosis,[3] to produce a gamete with a mixture of the organism's maternal and paternal chromosomes. Along with chromosomal crossover, this process aids in increasing genetic diversity by producing novel genetic combinations. Of the 46 chromosomes in a normal diploid human cell, half are maternally-derived (from the mother's egg) and half are paternally-derived (from the father's sperm). This occurs as sexual reproduction involves the fusion of two haploid gametes (the egg and sperm) to produce a new organism having the full complement of chromosomes. During gametogenesis - the production of new gametes by an adult - the normal complement of 46 chromosomes needs to be halved to 23 to ensure that the resulting haploid gamete can join with another gamete to produce a diploid organism. An error in the number of chromosomes, such as those caused by a diploid gamete joining with a haploid gamete, is termed aneuploidy. In independent assortment the chromosomes that end up in a newly-formed gamete are randomly sorted from all possible combinations of maternal and paternal chromosomes. 
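The random sorting of maternal and paternal chromosomes can be counted directly: each chromosome pair contributes one of two choices, so n pairs give 2^n possible gametes (ignoring crossing over). The sketch below is a minimal illustration with hypothetical function names; it enumerates the combinations for a small number of pairs and computes the count for the 23 human pairs.

```python
from itertools import product


def gamete_combinations(n_pairs: int) -> int:
    """Number of distinct maternal/paternal chromosome combinations in a gamete."""
    return 2 ** n_pairs


def enumerate_gametes(n_pairs: int):
    """List every combination explicitly; 'M' = maternal copy, 'P' = paternal copy.
    Only practical for small n_pairs."""
    return ["".join(choice) for choice in product("MP", repeat=n_pairs)]


print(enumerate_gametes(3))      # all combinations for three chromosome pairs
print(gamete_combinations(23))   # count for the 23 human chromosome pairs
```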
Because gametes end up with a random mix instead of a pre-defined "set" from either parent, gametes are therefore considered assorted independently. As such, the gamete can end up with any combination of paternal or maternal chromosomes. Any of the possible combinations of gametes formed from maternal and paternal chromosomes will occur with equal frequency. For human gametes, with 23 pairs of chromosomes, the number of possibilities is 2^23 or 8,388,608 possible combinations.[4] The gametes will normally end up with 23 chromosomes, but the origin of any particular one will be randomly selected from paternal or maternal chromosomes. This contributes to the genetic variability of progeny.

Figure 1: Dominant and recessive phenotypes. (1) Parental generation. (2) F1 generation. (3) F2 generation. All plants of the F1 (first) generation show the dominant (red) phenotype, and the F2 (second) generation shows a 3:1 ratio of dominant (red) to recessive (white) phenotypes.

Figure 2: The genotypes of two independent traits show a 9:3:3:1 ratio in the F2 generation. In this example, coat color is indicated by B (brown, dominant) or b (white) while tail length is indicated by S (short, dominant) or s (long). When parents are homozygous for each trait (SSbb and ssBB), their children in the F1 generation are heterozygous at both loci and only show the dominant phenotypes. If the children mate with each other, in the F2 generation all combinations of coat color and tail length occur: 9 are brown/short (purple boxes), 3 are white/short (pink boxes), 3 are brown/long (blue boxes) and 1 is white/long (green box).

Figure 3: The color alleles of Mirabilis jalapa are not dominant or recessive. (1) Parental generation. (2) F1 generation. (3) F2 generation. The "red" and "white" alleles together make a "pink" phenotype, resulting in a 1:2:1 ratio of red:pink:white in the F2 generation.

Background
The reason for these laws is found in the nature of the cell nucleus.
It is made up of several chromosomes carrying the genetic traits. In a normal cell, each of these chromosomes has two parts, the chromatids. A reproductive cell, which is created in a process called meiosis, usually contains only one of those chromatids of each chromosome. By merging two of these cells (usually one male and one female), the full set is restored and the genes are mixed. The resulting cell becomes a new embryo. The fact that this new life has half the genes of each parent (in humans, 23 chromosomes from the mother and 23 from the father, for a total of 46) is one reason for the Mendelian laws. The second most important reason is the varying dominance of different genes, causing some traits to appear unevenly instead of averaging out (dominant does not mean more likely to reproduce; recessive genes can become the most common, too).

There are several advantages of this method (sexual reproduction) over reproduction without genetic exchange:
1. Instead of nearly identical copies of an organism, a broad range of offspring develops, allowing more different abilities and evolutionary strategies.
2. There are usually some errors in every cell nucleus. Copying the genes usually adds more of them. By distributing them randomly over different chromosomes and mixing the genes, such errors will be distributed unevenly over the different children. Some of them will therefore have only very few such problems. This helps reduce problems with copying errors somewhat.
3. Genes can spread faster from one part of a population to another. This is useful, for instance, if there is a temporary isolation of two groups. New genes developing in each of the populations are not reduced to half when one side replaces the other; they mix and form a population with the advantages of both sides.
4. Sometimes, a mutation (e.g. sickle cell anemia) can have positive side effects (in this case malaria resistance).
The mechanism behind the Mendelian laws can make it possible for some offspring to carry the advantages without the disadvantages until further mutations solve the problems.

Mendelian trait
A Mendelian trait is one that is controlled by a single locus and shows a simple Mendelian inheritance pattern. In such cases, a mutation in a single gene can cause a disease that is inherited according to Mendel's laws. Examples include sickle-cell anemia, Tay-Sachs disease, cystic fibrosis and xeroderma pigmentosum. A disease controlled by a single gene contrasts with a multifactorial disease, like arthritis, which is affected by several loci (and the environment), as well as with those diseases inherited in a non-Mendelian fashion. The Mendelian Inheritance in Man database is a catalog of, among other things, genes in which Mendelian mutations cause disease.

Test cross
In genetics, a test cross, first introduced by Mendel, is used to determine if an individual exhibiting a dominant trait is homozygous or heterozygous for that trait. Test crosses involve breeding the individual in question with another individual that expresses a recessive version of the same trait. If all offspring display the dominant phenotype, the individual in question is homozygous dominant; if the offspring display both dominant and recessive phenotypes, then the individual is heterozygous. In some sources, the "test cross" is defined as being a type of back cross between the recessive homozygote and the F1 generation.

Dominance
In genetics, dominance describes the effects of the different versions of a particular gene on the phenotype of an organism. Many animals (including humans) and plants have two copies of each gene in their genome, one inherited from each parent. The different variants of a specific gene (such as that coding for earlobes) are known as alleles.
If an organism inherits two alleles that are at odds with one another, and the phenotype of the organism is determined completely by one of the alleles, then that allele is said to be dominant. The other allele, which has no tangible effect on the organism's phenotype, is said to be recessive. For example, in humans, if a person inherits the allele for free earlobes from one parent and the one for attached earlobes from the other, that person will have free earlobes. Thus the free lobe allele is said to be dominant over the attached lobe allele (and the attached lobe allele is said to be recessive to the free lobe allele). In order to have attached earlobes, a person must inherit the allele for attached earlobes from both parents. Note that this doesn't necessarily mean that either parent must have attached earlobes - since both parents could be carrying the allele for attached lobes while outwardly having free lobes. In most cases a dominance relationship is seen when the gene encodes an enzyme, and its recessive counterpart does not. In many cases, a normal function can be maintained with only half the amount of an enzyme. In these cases a single copy of the dominant allele produces enough of the gene’s product to give the same effect as two normal copies. Dominance was discovered by Mendel, who introduced the use of uppercase letters to denote dominant alleles and lowercase to denote recessive alleles, as is still commonly used in introductory genetics courses (for example, E and e for alleles causing free and attached lobes). Although this usage is convenient it is misleading, because dominance is not a property of an allele considered in isolation, but a relationship between the effects of two alleles. When geneticists loosely refer to a dominant allele or a recessive allele, they mean that the allele is dominant or recessive to the standard allele. 
Geneticists often use the term dominance in other contexts, distinguishing between simple or complete dominance as described above, and other relationships. Relationships described as incomplete or partial dominance are usually more accurately described as giving an intermediate or blended phenotype. Codominance describes a relationship where the distinct phenotypes caused by each allele are both seen when both alleles are present.

Nomenclature
Genes are indicated in shorthand by a combination of one or a few letters; for example, in cat coat genetics the alleles Mc and mc (for "mackerel tabby") play a prominent role. Alleles producing dominant traits are denoted by initial capital letters; those that confer recessive traits are written with lowercase letters. The alleles present in a locus are usually separated by a slash, '/'. In the Mc vs mc case, the dominant trait is the "mackerel-stripe" pattern and the recessive one the "classic" or "oyster" tabby pattern; thus a classic-pattern tabby cat would carry the alleles mc/mc, whereas a mackerel-stripe tabby would be either Mc/mc or Mc/Mc.

Relationship to other genetics concepts
Humans have 23 homologous chromosome pairs (22 pairs of autosomes and one pair of sex chromosomes, X and Y). It is estimated that the human genome contains 20,000 to 25,000 genes[1]. Each chromosomal pair has the same genes, although it is generally unlikely that homologous genes from each parent will be identical in sequence. The specific variations possible for a single gene are called alleles: for a single eye-color gene, there may be a blue eye allele, a brown eye allele, a green eye allele, etc. Consequently, a child may inherit a blue eye allele from their mother and a brown eye allele from their father. The dominance relationships between the alleles control which traits are and are not expressed.
An example of an autosomal dominant human disorder is Huntington's disease, a neurological disorder resulting in impaired motor function. The mutant allele results in an abnormal protein containing a long run of repeated glutamine residues. This defective protein is toxic to neural tissue, resulting in the characteristic symptoms of the disease. Hence, one copy suffices to confer the disorder. A list of human traits that follow a simple inheritance pattern can be found in human genetics. Humans have several genetic diseases, often but not always caused by recessive alleles.

Punnett square

The genetic combinations possible with simple dominance can be expressed by a diagram called a Punnett square. One parent's alleles are listed across the top and the other parent's alleles are listed down the left side. The interior squares represent possible offspring, in the ratio of their statistical probability. In the example of flower color in peas, P represents the dominant purple-colored allele and p the recessive white-colored allele. If both parents are purple-colored and heterozygous (Pp), the Punnett square for their offspring would be:

         P      p
  P      PP     Pp
  p      Pp     pp

In the PP and Pp cases, the offspring is purple-colored due to the dominant P. Only in the pp case is the recessive white-colored phenotype expressed. Therefore, the phenotypic ratio in this case is 3:1, meaning that F2 generation offspring will be purple-colored three times out of four, on average. Note: Dominant alleles are capitalized.

Dominant trait

A dominant trait refers to a genetic feature that hides the recessive trait in the phenotype of an individual. A dominant trait is a phenotype that is seen in both the homozygous AA and heterozygous Aa genotypes. Many traits are determined by pairs of alleles, one inherited from each parent.
Often when these are paired and compared, one allele (the dominant) will be found to effectively shut out the instructions from the other, recessive allele. For example, if a person has one allele for blood type A and one for blood type O, that person will always have blood type A. For a person to have blood type O, both their alleles must be O (recessive). When an individual has two dominant alleles (AA), the condition is referred to as homozygous dominant; an individual with two recessive alleles (aa) is called homozygous recessive. An individual carrying one dominant and one recessive allele is referred to as heterozygous. By convention, the dominant allele is written before the recessive allele in a heterozygous genotype: Aa, not aA.

Types of dominance

Simple dominance or complete dominance

Consider the simple example of flower color in peas, first studied by Gregor Mendel. The dominant allele is purple and the recessive allele is white. In a given individual, the two corresponding alleles of the chromosome pair fall into one of three patterns:

  both alleles purple (PP)
  both alleles white (pp)
  one allele purple and one allele white (Pp)

If the two alleles are the same (homozygous), the trait they represent will be expressed. But if the individual carries one of each allele (heterozygous), only the dominant one will be expressed. The recessive allele will simply be masked.

Simple dominance in pedigrees

Dominant traits are recognizable by the fact that they do not skip generations, as recessive traits do. A recessive trait can reappear after being hidden: it is quite possible for two parents with purple flowers to have white flowers among their progeny, but two such white offspring could not have purple offspring (although very rarely, one might be produced by mutation). In this situation, the purple individuals in the first generation must have both been heterozygous (carrying one copy of each allele).
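The enumeration behind these three patterns can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original notes: the function names `punnett` and `phenotype_counts` are hypothetical, and the sketch assumes a single gene with complete dominance in which the uppercase letter marks the dominant allele.

```python
from collections import Counter
from itertools import product


def punnett(parent1, parent2):
    """Count offspring genotypes for a one-gene cross, e.g. punnett("Pp", "Pp")."""
    # Each parent contributes one of its two alleles; sorting writes "pP" as "Pp".
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))


def phenotype_counts(genotypes, dominant="P"):
    """Group genotype counts by phenotype, assuming complete dominance."""
    counts = Counter()
    for genotype, n in genotypes.items():
        counts["dominant" if dominant in genotype else "recessive"] += n
    return counts


cross = punnett("Pp", "Pp")
print(dict(cross))                    # {'PP': 1, 'Pp': 2, 'pp': 1}
print(dict(phenotype_counts(cross)))  # {'dominant': 3, 'recessive': 1}
```

Two heterozygous purple parents give the 1:2:1 genotype ratio and the 3:1 phenotype ratio, with the white (pp) class appearing only when both parents pass on the recessive allele.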
Incomplete dominance

Discovered by Carl Correns, incomplete dominance (sometimes called partial dominance) describes a heterozygous genotype that produces an intermediate phenotype. In this case, only one allele (usually the wild type) at the single locus is expressed, in a dosage-dependent manner, which results in an intermediate phenotype. A cross of two intermediate phenotypes (= monohybrid heterozygotes) will result in the reappearance of both parent phenotypes and the intermediate phenotype. There is a 1:2:1 phenotype ratio instead of the 3:1 phenotype ratio found when one allele is dominant and the other is recessive. This lets an organism's genotype be diagnosed from its phenotype without time-consuming breeding tests. The classic example of this is the color of carnations.

         R      R'
  R      RR     RR'
  R'     RR'    R'R'

R is the allele for red pigment. R' is the allele for no pigment. Thus, RR offspring make a lot of red pigment and appear red. R'R' offspring make no red pigment and appear white. Both RR' and R'R offspring make some pigment and therefore appear pink. Another readily visible example of incomplete dominance is the color modifier Merle in dogs.

Codominance

In codominance, neither phenotype is recessive. Instead, the heterozygous individual expresses both phenotypes. A common example is the ABO blood group system. The gene for blood types has three alleles: A, B, and i. The i allele causes type O and is recessive to both A and B. The A and B alleles are codominant with each other. When a person has both an A and a B allele, the person has type AB blood. When two persons with AB blood type have children, the children can be type A, type B, or type AB. There is a 1A:2AB:1B phenotype ratio instead of the 3:1 phenotype ratio found when one allele is dominant and the other is recessive. This is the same phenotype ratio found in matings of two organisms that are heterozygous for incompletely dominant alleles.
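The ABO rules just described (A and B codominant, i recessive to both) can be written down as a small lookup. The following Python sketch is an illustration under those stated rules only; the function names `abo_phenotype` and `abo_cross` are hypothetical and genotypes are written as two-character strings such as "Ai".

```python
from collections import Counter
from itertools import product


def abo_phenotype(genotype):
    """Map a two-allele ABO genotype (alleles 'A', 'B', 'i') to a blood type."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"  # codominance: both phenotypes are expressed
    if "A" in alleles:
        return "A"   # AA or Ai, because i is recessive to A
    if "B" in alleles:
        return "B"   # BB or Bi, because i is recessive to B
    return "O"       # ii


def abo_cross(parent1, parent2):
    """Count the blood types among the four equally likely allele pairings."""
    return Counter(abo_phenotype(a + b) for a, b in product(parent1, parent2))


print(dict(abo_cross("AB", "AB")))  # {'A': 1, 'AB': 2, 'B': 1}
```

Two AB parents reproduce the 1A:2AB:1B ratio, and no type O child is possible because neither parent carries i.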
Example Punnett square for a father with A and i, and a mother with B and i (entries are the resulting blood types):

         A      i
  B      AB     B
  i      A      O

Amongst the very few codominant genetic diseases in humans, one relatively common one is alpha-1 antitrypsin deficiency (A1AD), in which the genotypes Pi00, PiZ0, PiZZ, and PiSZ all have their more-or-less characteristic clinical presentations. Most molecular markers are considered to be codominant. A roan horse has codominant follicle genes, expressing individual red and white follicles.

Dominant negative

Some mutations are not loss-of-function, but gain-of-function changes. These are usually dominant. For example, "dominant negative" or antimorphic mutations occur when the gene product adversely affects the normal, wild-type gene product within the same cell. This usually occurs if the product can still interact with the same elements as the wild-type product, but blocks some aspect of its function. Such proteins may be competitive inhibitors of the normal protein functions. Types:

1. A mutation in a transcription factor that removes the activation domain but leaves the DNA-binding domain intact. This product can then block the wild-type transcription factor from binding the DNA site, leading to reduced levels of gene activation.
2. A mutation in a protein that is functional as a dimer. A mutation that removes the functional domain but retains the dimerization domain would cause a dominant negative phenotype, because some fraction of protein dimers would be missing one of the functional domains.

Autosomal dominant gene

An autosomal dominant gene is one that occurs on an autosomal (non-sex determining) chromosome. As it is dominant, the phenotype it gives will be expressed even if the gene is heterozygous. This contrasts with recessive genes, which need to be homozygous to be expressed. The chances of an autosomal dominant disorder being inherited are 50% if one parent is heterozygous for the mutant gene and the other is homozygous for the normal, or 'wild-type', gene.
This is because the offspring will always inherit a normal gene from the parent carrying the wild-type genes, and will have a 50% chance of inheriting the mutant gene from the other parent. If the mutant gene is inherited, the offspring will be heterozygous for the mutant gene, and will suffer from the disorder. If the parent with the disorder is homozygous for the gene, the offspring produced from mating with an unaffected parent will always have the disorder. See Mendelian inheritance. The term vertical transmission refers to the concept that autosomal dominant disorders are inherited through generations. This is obvious when you examine the pedigree chart of a family for a particular trait. Because males and females are equally affected, they are equally likely to have affected children. Although the mutated gene should be present in successive generations in which there are more than one or two offspring, it may appear that a generation is skipped if there is reduced penetrance. Recessive trait The term "recessive allele" refers to an allele that causes a phenotype (visible or detectable characteristic) that is only seen in homozygous genotypes (organisms that have two copies of the same allele) and never in heterozygous genotypes. Every diploid organism, including humans, has two copies of every gene on autosomal chromosomes, one from the mother and one from the father. The dominant allele of a gene will always be expressed while the recessive allele of a gene will be expressed only if the organism has two recessive forms.[2] Thus, if both parents are carriers of a recessive trait, there is a 25% chance with each child to show the recessive trait. The term "recessive allele" is part of the laws of Mendelian inheritance formulated by Gregor Mendel. Examples of recessive traits in Mendel's famous pea plant experiments include the color and shape of seed pods and plant height. 
Autosomal recessive allele

[Figure: relationship between two carrier parents and the probabilities of children being unaffected, carriers, or affected]

Autosomal recessive is a mode of inheritance of genetic traits located on the autosomes (the pairs of non-sex determining chromosomes; in humans, 22). In contrast to an autosomal dominant trait, a recessive trait only becomes phenotypically apparent when two similar alleles of a gene are present. In other words, the subject is homozygous for the trait. The frequency of the carrier state can be calculated from the Hardy-Weinberg formula, $p^2 + 2pq + q^2 = 1$, where $p$ is the frequency of one allele and $q = 1 - p$ is the frequency of the other; the term $2pq$ is the expected frequency of heterozygous carriers. Recessive genetic disorders typically occur when both parents are carriers and each contributes a disease allele to the embryo. As both parents are heterozygous for the disorder, the chance of two disease alleles landing in one of their offspring is 25% (in autosomal dominant traits this is higher). 50% of the children (or 2/3 of the unaffected ones) are carriers. When one of the parents is homozygous, the trait will only show in his/her offspring if the other parent is also a carrier. In that case, the chance of disease in the offspring is 50%.

Nomenclature of recessiveness

Technically, the term "recessive gene" is imprecise because it is not the gene that is recessive but the phenotype (or trait). It should also be noted that the concepts of recessiveness and dominance were developed before a molecular understanding of DNA and before molecular biology, so mapping many newer concepts to "dominant" or "recessive" phenotypes is problematic. Many traits previously thought to be recessive have mild forms or biochemical abnormalities that arise from the presence of the one copy of the allele.
This suggests that the dominant phenotype is dependent upon having two dominant alleles, and the presence of one dominant and one recessive allele creates some blending of both dominant and recessive traits.

Examples

Pea Plant

Gregor Mendel performed many experiments on pea plants (Pisum sativum) while researching traits, chosen because of the simple and low variety of characteristics, as well as the short period of germination. He experimented with color (green vs. yellow), size (short vs. tall), pea texture (smooth vs. wrinkled), and many others. By good fortune, the characteristics displayed by these plants clearly exhibited a dominant and recessive form. This is not true for many organisms. For example, when testing the color of the pea plants, he chose two yellow plants, since yellow was more common than green. He mated them, and examined the offspring. He continued to mate only those that appeared yellow, and eventually, the green ones would stop being produced. He also mated the green ones together and determined that only green ones were produced. Mendel determined that this was because green was a recessive trait which only appeared when yellow, the dominant trait, was not present. Also, he determined that the dominant trait would be displayed whether or not the recessive trait was there.

Autosomal recessive disorders

Dominance/recessiveness refers to phenotype, not genotype. An example to prove the point is sickle cell anemia. The sickle cell genotype is caused by a single base pair change in the beta-globin gene: normal = GAG (Glu), sickle = GTG (Val). There are several phenotypes associated with the sickle genotype:

1. anemia (a recessive trait)
2. blood cell sickling (co-dominant)
3. altered beta-globin electrophoretic mobility (co-dominant)
4. resistance to malaria (dominant)

This example demonstrates that one can only refer to dominance/recessiveness with respect to individual phenotypes.

Mechanisms of dominance

Many genes code for enzymes.
Consider the case where someone is homozygous for some trait. Both alleles code for the same enzyme, which causes a trait. Only a small amount of that enzyme may be necessary for a given phenotype. The individual therefore has a surplus of the necessary enzyme. Let's call this case "normal". Individuals without any functional copies cannot produce the enzyme at all, and their phenotype reflects that. Consider a heterozygous individual. Since only a small amount of the normal enzyme is needed, there is still enough enzyme to show the phenotype. This is why some alleles are dominant over others. In the case of incomplete dominance, the single dominant allele does not produce enough enzyme, so the heterozygotes show some different phenotype. For example, fruit color in eggplants is inherited in this manner. A purple color is caused by two functional copies of the enzyme, with a white color resulting from two non-functional copies. With only one functional copy, there is not enough purple pigment, and the color of the fruit is a lighter violet shade.

Some non-normal alleles can be dominant. The mechanisms for this are varied, but one simple example is when the functional enzyme $E$ is composed of several subunits $E_i$, each encoded by a pair of alleles, $E_i = a_{i1}a_{i2}$, which make it either functional or nonfunctional according to one of the schemes described above. For example, one could have the rule that if any of the $E_i$ subunits is nonfunctional, the entire enzyme $E$ is nonfunctional, in the sense that the phenotype is not displayed. In the case of a single subunit, say $E_1 = F$, where $F$ has one functional and one nonfunctional allele in a heterozygous individual ($F = a_1A_1$), the concentration of functional enzyme could be 50% of normal. If the enzyme has two identical subunits, each of which is functional only half of the time, the concentration of functional enzyme is 25% of normal.
For four subunits, the concentration of functional enzyme is about 6% of normal (scaling roughly as $1/2^c$, where $c$ is the number of subunits; $1/2^4 \approx 6\%$). This may not be enough to produce the wild-type phenotype. There are other mechanisms for dominant mutants.

Epistasis

Epistasis is the interaction between genes. Epistasis takes place when the action of one gene is modified by one or several other genes, which are sometimes called modifier genes. The gene whose phenotype is expressed is said to be epistatic, while the phenotype altered or suppressed is said to be hypostatic. In general, the fitness increment of any one allele depends in a complicated way on many other alleles; but, because of the way that the science of population genetics was developed, evolutionary scientists tend to think of epistasis as the exception to the rule. In the first models of natural selection devised in the early 20th century, each gene was considered to make its own characteristic contribution to fitness, against an average background of other genes. Some introductory college courses still teach population genetics this way. Epistasis and genetic interaction refer to the same phenomenon; however, epistasis is widely used in population genetics and refers especially to the statistical properties of the phenomenon. Examples of tightly linked genes having epistatic effects on fitness are found in supergenes and the human major histocompatibility complex genes. The effect can occur directly at the genomic level, where one gene could code for a protein preventing transcription of the other gene. Alternatively, the effect can occur at the phenotypic level. For example, the gene causing albinism would hide the gene controlling color of a person's hair. In another example, a gene coding for a widow's peak would be hidden by a gene causing baldness. Fitness epistasis (where the affected trait is fitness) is one cause of linkage disequilibrium.
Studying genetic interactions can reveal gene function, the nature of the mutations, functional redundancy, and protein interactions. Because protein complexes are responsible for most biological functions, genetic interactions are a powerful tool.

Sex-determination system

A sex-determination system is a biological system that determines the development of sexual characteristics in an organism. Most sexual organisms have two sexes. In many cases, sex determination is genetic: males and females have different alleles or even different genes that specify their sexual morphology. In animals, this is often accompanied by chromosomal differences. In other cases, sex is determined by environmental variables (such as temperature) or social variables (the size of an organism relative to other members of its population). The details of some sex-determination systems are not yet fully understood.

Chromosomal determination

XX/XY sex chromosomes

The XX/XY sex-determination system is one of the most familiar sex-determination systems and is found in human beings and most other mammals. At least one monotreme, the platypus, presents a particular sex-determination scheme that in some ways resembles that of the ZW sex chromosomes of birds, and it also lacks the SRY gene. In the XY sex-determination system, females have two of the same kind of sex chromosome (XX), while males have two distinct sex chromosomes (XY). Some species (including humans) have a gene SRY on the Y chromosome that determines maleness; others (such as the fruit fly) use the presence of two X chromosomes to determine femaleness. The XY sex chromosomes are different in shape and size from each other, unlike the autosomes, and are termed allosomes.

Drosophila sex-chromosomes

The XY sex-determination system is the sex-determination system found in humans, most other mammals, some insects (Drosophila) and some plants (Ginkgo).
In this system, females have two of the same kind of sex chromosome (XX), and are called the homogametic sex. Males have two distinct sex chromosomes (XY), and are called the heterogametic sex. The XY sex determination system was first described independently by Nettie Stevens and Edmund Beecher Wilson in 1905.

Mechanisms

Some species (including most mammals) have a gene or genes on the Y chromosome that determine maleness. In the case of humans, a single gene (SRY) on the Y chromosome acts as a signal to set the developmental pathway towards maleness.[1] Other mammals use several genes on the Y chromosome for that same purpose. Not all male-specific genes are located on the Y chromosome. Other species (including most Drosophila species) use the presence of two X chromosomes to determine femaleness. One X chromosome gives putative maleness. Y chromosome genes are required for normal male development. Humans, as well as some other organisms, can have a chromosomal arrangement that is contrary to their phenotypic sex, that is, XX males or XY females. See, for example, XX male syndrome and Androgen insensitivity syndrome.

XX/X0 sex determination

In this variant of the XY system, females have two copies of the sex chromosome (XX) but males have only one (X0). The 0 denotes the absence of a second sex chromosome. This system is observed in a number of insects, including the grasshoppers and crickets of order Orthoptera and in cockroaches (order Blattodea). The nematode C. elegans is male with one sex chromosome (X0); with a pair of chromosomes (XX) it is a hermaphrodite.

ZW sex chromosomes

The ZW sex-determination system is found in birds and some insects and other organisms. The ZW sex-determination system is reversed compared to the XY system: females have two different kinds of chromosomes (ZW), and males have two of the same kind of chromosomes (ZZ).
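The three chromosomal systems above differ mainly in which sex is heterogametic. As a compact summary, the table-style sketch below encodes them in Python; the dictionary `SEX_SYSTEMS` and the function `heterogametic_sex` are hypothetical names chosen for illustration, and the sketch ignores exceptions mentioned in the text (for example, XX C. elegans being hermaphrodites rather than females).

```python
# Sex-chromosome complement -> sex, for the chromosomal systems described above.
SEX_SYSTEMS = {
    "XX/XY": {"XX": "female", "XY": "male"},
    "XX/X0": {"XX": "female", "X0": "male"},  # 0 marks the missing chromosome
    "ZW/ZZ": {"ZW": "female", "ZZ": "male"},
}


def heterogametic_sex(system):
    """Return the sex whose two sex-chromosome symbols differ."""
    for complement, sex in SEX_SYSTEMS[system].items():
        if complement[0] != complement[1]:
            return sex
    raise ValueError(f"no heterogametic sex recorded for {system}")


for name in SEX_SYSTEMS:
    print(name, "->", heterogametic_sex(name))
```

Males are the heterogametic sex in the XX/XY and XX/X0 systems, while in the ZW/ZZ system it is the female.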
Haplodiploidy

Haplodiploidy is found in insects belonging to Hymenoptera, such as ants and bees. Unfertilized eggs develop into haploid individuals, which are the males. Diploid individuals are generally female but may be sterile males. Thus, if a queen bee mates with one drone, her daughters share ¾ of their genes with each other, not ½ as in the XY and ZW systems. This is believed to be significant for the development of eusociality, as it increases the significance of kin selection. This is common also in wasps that are parasitic and in the male greenflies.

Multiple Alleles

Alleles are alternative forms of a gene, and they are responsible for differences in phenotypic expression of a given trait (e.g., brown eyes versus green eyes). A gene for which at least two alleles exist is said to be polymorphic. Instances in which a particular gene may exist in three or more allelic forms are known as multiple allele conditions. It is important to note that while multiple alleles occur and are maintained within a population, any individual possesses only two such alleles (at equivalent loci on homologous chromosomes).

Examples of Multiple Alleles

Two human examples of multiple-allele genes are the gene of the ABO blood group system, and the human-leukocyte-associated antigen (HLA) genes. The ABO system in humans is controlled by three alleles, usually referred to as I^A, I^B, and I^O (the "I" stands for isohaemagglutinin). I^A and I^B are codominant and produce type A and type B antigens, respectively, which migrate to the surface of red blood cells, while I^O is the recessive allele and produces no antigen. The blood groups arising from the different possible genotypes are summarized in the following table.

  Genotype     Blood Group
  I^A I^A      A
  I^A I^O      A
  I^B I^B      B
  I^B I^O      B
  I^A I^B      AB
  I^O I^O      O

HLA genes code for protein antigens that are expressed in most human cell types and play an important role in immune responses.
These antigens are also the main class of molecule responsible for organ rejections following transplantations—thus their alternative name: major histocompatibility complex (MHC) genes. The most striking feature of HLA genes is their high degree of polymorphism —there may be as many as one hundred different alleles at a single locus. If one also considers that an individual possesses five or more HLA loci, it becomes clear why donor-recipient matches for organ transplantations are so rare (the fewer HLA antigens the donor and recipient have in common, the greater the chance of rejection). Polymorphism in Noncoding DNA It must be realized that although the above two are valid examples, most genes are not multiply allelic but exist only in one or two forms within a population. Most of the DNA sequence variation between individuals arises not because of differences in the genes, but because of differences in the noncoding DNA found between genes. An example of a noncoding DNA sequence that is extremely abundant in humans is the so-called microsatellite DNA. Microsatellite sequences consist of a small number of nucleotides repeated up to twenty or thirty times. For instance, the microsatellite composed of the dinucleotide AC is very common, appearing about one hundred thousand times throughout the human genome. The interesting feature about microsatellites is that they are very highly polymorphic for the number of repeat lengths. For example, one particular individual might possess the microsatellite sequence ACACACACAC at a specific locus on one chromosome, and the sequence ACACACACACACACACAC at the same locus on the other homologous chromosome. Making Use of Polymorphic DNA Multiple alleles and noncoding polymorphic DNA are of considerable importance in gene mapping— identifying the relative positions of genetic loci on chromosomes. Gene maps are constructed by using the frequency of crossing-over to estimate the distance between a pair of loci. 
To obtain a good estimate, one must analyze a large number of offspring from a single cross. In laboratory organisms such as the fruit fly Drosophila, programmed crosses can be carried out so it is possible to use gene loci to construct a reliable genetic map. In humans, this is not the case. For this reason, the more highly variable noncoding regions are of considerable importance in human genetic mapping.

Genetic linkage

Genetic linkage occurs when particular genetic loci or alleles for genes are inherited jointly. Genetic loci on the same chromosome are physically connected and tend to segregate together during meiosis, and are thus genetically linked. Alleles for genes on different chromosomes are usually not linked, due to independent assortment of chromosomes during meiosis. Because there is some crossing over of DNA when the chromosomes segregate, alleles on the same chromosome can be separated and go to different daughter cells. There is a greater probability of this happening if the alleles are far apart on the chromosome, as it is more likely that a cross-over will occur between them. The relative distance between two genes can be calculated using the offspring of an organism showing two linked genetic traits, and finding the percentage of the offspring where the two traits do not run together. The higher the percentage of descendants that do not show both traits, the further apart on the chromosome the genes are. Among individuals of an experimental population or species, some phenotypes or traits occur randomly with respect to one another in a manner known as independent assortment. Today scientists understand that independent assortment occurs when the genes affecting the phenotypes are found on different chromosomes, or are separated by a great enough distance on the same chromosome that recombination between them occurs about half of the time.
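The distance estimate described above amounts to a single division. The Python sketch below shows it under stated assumptions: the function name `recombination_frequency` is hypothetical, the offspring counts are invented for illustration, and the conversion of 1% recombination to one map unit (centimorgan) is the usual convention, which approximates distance well only for closely linked genes.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of offspring in which the two linked traits do not run together."""
    total = parental + recombinant
    if total <= 0:
        raise ValueError("need at least one scored offspring")
    return recombinant / total


# Hypothetical counts from one cross: 830 parental-type and 170 recombinant offspring.
rf = recombination_frequency(parental=830, recombinant=170)
map_units = 100 * rf  # 1% recombination is conventionally 1 map unit (centimorgan)
print(f"recombination frequency = {rf:.2f}, about {map_units:.0f} map units")
```

A frequency near 0 indicates tightly linked genes, while a frequency near 0.5 cannot be distinguished from independent assortment.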
An exception to independent assortment develops when genes appear near one another on the same chromosome. When genes occur on the same chromosome, they are usually inherited as a single unit. Genes inherited in this way are said to be linked, and are referred to as "linkage groups." For example, in fruit flies the genes affecting eye color and wing length are inherited together because they appear on the same chromosome. But in many cases, even genes on the same chromosome that are inherited together produce offspring with unexpected allele combinations. This results from a process called crossing over. At the beginning of normal meiosis, a chromosome pair (made up of a chromosome from the mother and a chromosome from the father) intertwines and exchanges sections or fragments of chromosome. The pair then breaks apart to form two chromosomes with a new combination of genes that differs from the combination supplied by the parents. Through this process of recombining genes, organisms can produce offspring with new combinations of maternal and paternal traits that may contribute to or enhance survival. Genetic linkage was first discovered by the British geneticists William Bateson and Reginald Punnett shortly after Mendel's laws were rediscovered.

Chromosomal crossover

[Figure: Thomas Hunt Morgan's illustration of crossing over (1916)]

Chromosomal crossover (or crossing over) is the process by which two chromosomes pair up and exchange sections of their DNA. This often occurs during prophase I of meiosis in a process called synapsis. Synapsis begins before the synaptonemal complex develops, and is not completed until near the end of prophase I. Crossover usually occurs when matching regions on matching chromosomes break and then reconnect to the other chromosome. The result of this process is an exchange of genes, called genetic recombination.
Chromosomal crossovers also occur in asexual organisms and in somatic cells, since they are important in some forms of DNA repair.[1]

[Figure: a double crossing over]
[Figure: recombination involves the breakage and rejoining of parental chromosomes]

Crossing over was described, in theory, by Thomas Hunt Morgan. He relied on the discovery of the Belgian Professor Frans Alfons Janssens of the University of Leuven, who described the phenomenon in 1909 and had called it 'chiasmatypie'. The term chiasma is linked, if not identical, to chromosomal crossover. Morgan immediately saw the great importance of Janssens' cytological interpretation of chiasmata to the experimental results of his research on the heredity of Drosophila. The physical basis of crossing over was first demonstrated by Harriet Creighton and Barbara McClintock in 1931.[2]

Chemistry

[Figure: molecular structure of a Holliday junction]

Meiotic recombination initiates with double-stranded breaks that are introduced into the DNA by the Spo11 protein.[3] One or more exonucleases then digest the 5' ends generated by the double-stranded breaks to produce 3' single-stranded DNA tails. The meiosis-specific recombinase Dmc1 and the general recombinase Rad51 coat the single-stranded DNA to form nucleoprotein filaments.[4] The recombinases catalyze invasion of the opposite chromatid by the single-stranded DNA from one end of the break. Next, the 3' end of the invading DNA primes DNA synthesis, causing displacement of the complementary strand, which subsequently anneals to the single-stranded DNA generated from the other end of the initial double-stranded break. The structure that results is a cross-strand exchange, also known as a Holliday junction. The contact between two chromatids that will soon undergo crossing-over is known as a chiasma. The Holliday junction is a tetrahedral structure which can be 'pulled' by other recombinases, moving it along the four-stranded structure.
Consequences

In most eukaryotes, a cell carries two copies of each gene, each referred to as an allele. Each parent passes on one allele to each offspring. An individual gamete inherits a complete haploid complement of alleles on chromosomes that are independently selected from each pair of chromatids lined up on the metaphase plate. Without recombination, all alleles for those genes linked together on the same chromosome would be inherited together. Meiotic recombination allows a more independent selection between the two alleles that occupy the positions of single genes, as recombination shuffles the allele content between homologous chromosomes. Recombination does not have any influence on the statistical probability that another offspring will have the same combination. This theory of "independent assortment" of alleles is fundamental to genetic inheritance. However, there is an exception that requires further discussion.

[Figure: the difference between gene conversion and chromosomal crossover. Blue is the two chromatids of one chromosome and red is the two chromatids of another one.]

The frequency of recombination is actually not the same for all gene combinations. This leads to the notion of "genetic distance", which is a measure of recombination frequency averaged over a (suitably large) sample of pedigrees. Loosely speaking, one may say that this is because recombination is greatly influenced by the proximity of one gene to another. If two genes are located close together on a chromosome, the likelihood that a recombination event will separate these two genes is less than if they were farther apart. Genetic linkage describes the tendency of genes to be inherited together as a result of their location on the same chromosome. Linkage disequilibrium describes a situation in which some combinations of genes or genetic markers occur more or less frequently in a population than would be expected from their distances apart.
This concept is applied when searching for a gene that may cause a particular disease. This is done by comparing the occurrence of a specific DNA sequence with the appearance of a disease. When a high correlation between the two is found, it is likely that the relevant gene lies close to that sequence.

Problems

Although crossovers typically occur between homologous regions of matching chromosomes, similarities in sequence can result in mismatched alignments. Crossovers at such mismatched alignments are called unbalanced recombination. Unbalanced recombination is fairly rare compared to normal recombination, but severe problems can arise if a gamete containing unbalanced recombinants becomes part of a zygote. The result can be a local duplication of genes on one chromosome and a deletion of these on the other, a translocation of part of one chromosome onto a different one, or an inversion.

Mendel's Principles

The focus of genetics research then shifted to understanding what really happens in the transmission of hereditary traits from parents to children. A number of hypotheses were suggested to explain heredity, but Gregor Mendel, a little known Central European monk, was the only one who got it more or less right. His ideas had been published in 1866 but largely went unrecognized until 1900, which was long after his death. While Mendel's research was with plants, the basic underlying principles of heredity that he discovered also apply to people and other animals because the mechanisms of heredity are essentially the same for all complex life forms.

[Figure: Gregor Mendel, 1822-1884.]

Through the selective cross-breeding of common pea plants (Pisum sativum) over many generations, Mendel discovered that certain traits show up in offspring without any blending of parent characteristics. For instance, the pea flowers are either purple or white; intermediate colors do not appear in the offspring of cross-pollinated pea plants.
Mendel observed seven traits that are easily recognized and apparently only occur in one of two forms:
1. flower color is purple or white
2. flower position is axial or terminal
3. stem length is long or short
4. seed shape is round or wrinkled
5. seed color is yellow or green
6. pod shape is inflated or constricted
7. pod color is yellow or green

Mendel picked common garden pea plants for the focus of his research because they can be grown easily in large numbers and their reproduction can be manipulated. Pea plants have both male and female reproductive organs. As a result, they can either self-pollinate or cross-pollinate with another plant. In his experiments, Mendel was able to selectively cross-pollinate purebred plants with particular traits and observe the outcome over many generations. This was the basis for his conclusions about the nature of genetic inheritance.

In cross-pollinating plants that produce exclusively yellow or exclusively green pea seeds, Mendel found that the first offspring generation (F1) always has yellow seeds. However, the following generation (F2) consistently has a 3:1 ratio of yellow to green. This 3:1 ratio occurs in later generations as well. Mendel realized that this was the key to understanding the basic mechanisms of inheritance. He came to three important conclusions from these experimental results:
1. that the inheritance of each trait is determined by "units" or "factors" that are passed on to descendants unchanged (these units are now called genes)
2. that an individual inherits one such unit from each parent for each trait
3. that a trait may not show up in an individual but can still be passed on to the next generation.

It is important to realize that, in this experiment, the starting parent plants were homozygous for pea seed color. That is to say, they each had two identical forms (or alleles) of the gene for this trait: two yellows or two greens. The plants in the F1 generation were all heterozygous.
In other words, they each had inherited two different alleles, one from each parent plant. It becomes clearer when we look at the actual genetic makeup, or genotype, of the pea plants instead of only the phenotype, or observable physical characteristics.

Mendel's observations from these experiments can be summarized in two principles:
1) The principle of segregation
2) The principle of independent assortment

1) According to the principle of segregation, for any particular trait, the pair of alleles of each parent separate and only one allele passes from each parent on to an offspring. Which allele in a parent's pair of alleles is inherited is a matter of chance. We now know that this segregation of alleles occurs during the process of sex cell formation (i.e., meiosis).

[Figure: segregation of alleles in the production of sex cells.]

2) According to the principle of independent assortment, different pairs of alleles are passed to offspring independently of each other. The result is that new combinations of genes present in neither parent are possible. For example, a pea plant's inheritance of the ability to produce purple flowers instead of white ones does not make it more likely that it will also inherit the ability to produce yellow pea seeds in contrast to green ones. Likewise, the principle of independent assortment explains why the human inheritance of a particular eye color does not increase or decrease the likelihood of having six fingers on each hand. Today, we know this is because the genes for independently assorted traits are located on different chromosomes (or far enough apart on the same chromosome to be separated frequently by crossing over).

These two principles of inheritance, along with the understanding of unit inheritance and dominance, were the beginnings of our modern science of genetics. However, Mendel did not realize that there are exceptions to these rules.

NOTE: Some biologists refer to Mendel's "principles" as "laws".
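The 3:1 ratio and the principle of independent assortment described above can be checked by enumerating every combination of gametes, which is what a Punnett square does. The following is a minimal sketch, assuming complete dominance and unlinked genes; the function names and the allele letters (Y/y for seed color, P/p for flower color) are illustrative choices, not notation from the text.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """List every gamete a parent can form.

    `genotype` is a list of two-letter allele pairs, one per gene,
    e.g. ["Yy", "Pp"]. One allele is taken from each pair (segregation),
    and pairs are combined freely (independent assortment).
    """
    return ["".join(combo) for combo in product(*genotype)]


def phenotype(offspring):
    """Map each allele pair to 'dominant' or 'recessive'.

    Assumes complete dominance: one uppercase allele is enough
    to show the dominant form of the trait.
    """
    return tuple(
        "dominant" if any(allele.isupper() for allele in pair) else "recessive"
        for pair in offspring
    )


def cross(parent1, parent2):
    """Count offspring phenotypes over all equally likely gamete pairings."""
    counts = Counter()
    for g1 in gametes(parent1):
        for g2 in gametes(parent2):
            offspring = [a + b for a, b in zip(g1, g2)]
            counts[phenotype(offspring)] += 1
    return counts


# Monohybrid cross of two F1 heterozygotes (Yy x Yy).
mono = cross(["Yy"], ["Yy"])
# Dihybrid cross (YyPp x YyPp) for two independently assorting genes.
di = cross(["Yy", "Pp"], ["Yy", "Pp"])
print(mono)
print(di)
```

For the monohybrid cross the four gamete pairings give three dominant to one recessive; for the dihybrid cross the sixteen pairings fall into the classic 9:3:3:1 pattern. Linked genes would break the assumption that all gamete combinations are equally likely.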
MULTIPLE ALLELES

allele = (n) a form of a gene which codes for one possible outcome of a phenotype

For example, in Mendel's pea investigations, he found that there was a gene that determined the color of the pea pod. One form of it (one allele) creates yellow pods, and the other form (allele) creates green pods. Two possible phenotypes of one trait (pod color) are determined by two alleles (forms) of the one "color" gene.

THE DEAL ON MULTIPLE ALLELES

Now, if there are 4 or more possible phenotypes for a particular trait, then more than 2 alleles for that trait must exist in the population. We call this "MULTIPLE ALLELES". Let me stress something. There may be multiple alleles within the population, but individuals have only two of those alleles, because individuals have only two biological parents. We inherit half of our genes (alleles) from our mother and the other half from our father, so we end up with two alleles for every trait in our phenotype.

An excellent example of multiple allele inheritance is human blood type. Blood type exists as four possible phenotypes: A, B, AB, and O. There are 3 alleles for the gene that determines blood type. (Remember: you have just 2 of the 3 in your genotype, 1 from your mother and 1 from your father.) The alleles are as follows:

ALLELE    CODES FOR
IA        Type "A" blood
IB        Type "B" blood
i         Type "O" blood

Notice that, according to the symbols used in the table above, the allele for "O" (i) is recessive to the alleles for "A" and "B". With three alleles we have a higher number of possible combinations in creating a genotype.

GENOTYPE    RESULTING PHENOTYPE
IAIA        Type A
IAi         Type A
IBIB        Type B
IBi         Type B
IAIB        Type AB
ii          Type O

SEX LINKAGE IN DIPLOIDS

Sex linkage is the phenotypic expression of an allele that is related to the chromosomal sex of the individual. Sex linkage, also called X linkage, refers to the mode of inheritance common to those organisms that have the XY chromosome system of sex (gender) determination.
For example, in humans, females are usually XX and males are typically XY. The X chromosome has many genes not found on the Y chromosome, and this means that many genes associated with the X and Y chromosomes in the human male do not come in pairs, contrary to what Mendel concluded about his pea plants. There are X linked recessive and X linked dominant traits.

X linked recessive: Study the diagrams carefully. Notice that the X linked recessive trait leads to a pattern that looks on the surface like a typical complete dominance situation where two heterozygotes are crossed. You do get a 3:1 phenotypic ratio, but notice that all the individuals with the recessive phenotype are male! E.g. the various types of color "blindness" are familiar examples of X linked recessive traits. Males are more frequently affected than females. When a female parent is a carrier of the trait, each of her sons has a 50% chance of being affected and each of her daughters has a 50% chance of being a carrier while phenotypically normal. An affected male parent cannot transmit the trait directly to his sons.

X linked dominance: Notice the pattern for an X linked dominant trait. In this example a homozygous recessive female is mated with a male who carries the dominant allele. He would have the 'purple' phenotype in this hypothetical example and the female would have the 'white' phenotype. Observe that in this situation the male offspring all have the recessive phenotype, since they all have the a allele but no A allele to mask it. The female offspring all have the dominant phenotype because they are all heterozygous. The human genetic disorder soft enamel is a commonly cited X linked dominant trait. The affected male transmits the trait to all his daughters but not to his sons. When affected females are homozygous, they transmit the trait to all their children of both sexes.

Y linked traits: The Y chromosome does have genes on it. Most of these genes are not paired up with corresponding genes on the X chromosome.
For humans this means that every son of a male will inherit the gene determining the trait, since the Y chromosome is only passed from father to son. E.g. a trait called hairy ears is often cited as such a trait.

Unit II

Origin of Replication

Definition

An origin of replication is a sequence of DNA at which replication is initiated on a chromosome, plasmid or virus. For small DNAs, including bacterial plasmids and small viruses, a single origin is sufficient. Larger DNAs have many origins, and DNA replication is initiated at all of them; otherwise, if all replication had to proceed from a single origin, it would take too long to replicate the entire DNA mass.

Background

The origin of replication determines the vector copy number, which could typically be in the range of 25–50 copies/cell if the expression vector is derived from the low-copy-number plasmid pBR322, or between 150 and 200 copies/cell if derived from the high-copy-number plasmid pUC. The copy number influences the plasmid stability, i.e. the maintenance of the plasmid within the cells during cell division. A positive effect of a high copy number is the greater stability of the plasmid when random partitioning occurs at cell division. On the other hand, a high number of plasmids generally decreases the growth rate, thus possibly allowing cells with few plasmids to dominate the culture, since they grow faster. There appears to be no significant advantage of using higher-copy-number plasmids over pBR322-based vectors in terms of production yields.

The origin of replication also determines the plasmid's compatibility: its ability to replicate in conjunction with another plasmid within the same bacterial cell. Plasmids that utilize the same replication system cannot co-exist in the same bacterial cell. They are said to belong to the same compatibility group.
The introduction of a new origin, in the form of a second plasmid from the same compatibility group, mimics the result of replication of the resident plasmid. Thus any further replication is prevented until after the two plasmids have been segregated to different cells to create the correct prereplication copy number. A small derivative of plasmid R1 was used to integratively suppress a chromosomal dnaA(Ts) mutation. The strain obtained grew normally at 42°C. The integratively suppressed strain was used as recipient for various plasmid R1 derivatives. Plasmid R1 and miniplasmid derivatives of R1 could be established in the strain that carried an integrated R1 replicon, but they were rapidly lost during growth. However, plasmids also carrying ColE1 replication functions were almost completely stably inherited. The integratively suppressed strain therefore allows the establishment of bacteria diploid with respect to plasmid R1 and forms a useful and sensitive system for studies of interaction between plasmid R1 replication functions. Several of the chimeric plasmids caused inhibition of growth at high temperatures. All plasmids that inhibited growth carried one particular PstI fragment from plasmid R1 (the PstI F fragment), and in all cases the growth inhibition could be ascribed to repression of initiation of chromosome replication at 42°C, i.e., they carry a trans-acting switch-off function. Furthermore, the analogous PstI fragments from different copy mutants of plasmid R1 were analyzed similarly, and one mutant was found to lack the switch-off function. The different chimeric plasmids were also tested for their incompatibility properties. All plasmids that carried the switch-off function (and no other plasmids) also carried R1 incompatibility gene(s). 
Since the PstI F fragment, which is present on all these plasmids, is very small (0.35 × 10^6), it is suggested that the switch-off regulation of replication (by an inhibitor), incompatibility, and copy number control are governed by the same gene.

In bacteria, plasmids and some DNA viruses, DNA replication is initiated and regulated by binding of initiator proteins to repetitive sequences. To understand the control mechanism we used the plasmid mini-F, whose copy number is stringently maintained in Escherichia coli, mainly by its initiator protein RepE and the incC region. The monomers of RepE protein bound to incC iterons, which exert incompatibility in trans and control the copy number of mini-F plasmid in cis. Many incompatibility defective mutants carrying mutations in their incC iterons had lost the affinity to bind to RepE, while one mutant retained high level binding affinity. The mutated incC mini-F plasmids lost the function to control the copy number. The copy number of the wild-type mini-F plasmid did not increase in the presence of excess RepE. These results suggested that the control of replication by incC iterons does not rely on their capacity to titrate RepE protein. Using a ligation assay, we found that RepE proteins mediated a cross-link structure between ori2 and incC, for which the dimerization domain of RepE and the structure of incC seem to be important. The structure probably causes inhibition of extra rounds of DNA replication initiation on mini-F plasmids, thereby keeping mini-F plasmid at a low copy number.

Initiation of replication of the broad-host-range plasmid RSF1010 is accurately controlled by the plasmid-encoded proteins RepB (MobA), RepB', RepA and RepC [Haring et al., Proc. Natl. Acad. Sci. USA 82 (1985) 6090-6094; Scherzinger et al., Nucleic Acids Res. 19 (1991) 1203-1211].
The genes encoding these proteins which are essential for replication and conjugative mobilization are transcribed from a cluster of promoters, P1/P3 and P2, which partly overlap with the origin of conjugal transfer, oriT. Three regions were found where deletion mutations affect the mobilization of RSF1010 and increase its copy number in Escherichia coli. A deletion in the mobC gene increased the copy number of RSF1010 four-fold. Another deletion, that removed oriT and part of the promoter believed to be responsible for the expression of mobC, results in a three-fold increase in copy number. The third type of deletions affect the N-terminal part of RepB (MobA). A deletion that created a frame-shift results in a three-fold increase in copy number. A smaller, in-frame deletion of this region only affected the mobilization of RSF1010, but not its copy number. The extent by which RSF1010 or its deletion derivatives could repress the P1/P3 and P2 promoters has indicated that these promoters are negatively regulated by MobC and RepB (MobA), presumably by their attachment to the oriT region of RSF1010. Both MobC and RepB are required for the maximal repression of the rep operon. Optimal function of RSF1010 thus involves not only overlapping genes, but also proteins that exert multiple functions, mobilization, replication and regulation. The replication control system of plasmid R6-5 has been investigated by characterization of high-copy-number mutant miniplasmids, development of an in vivo assay for the site of action or "target" of the replication control elements, and sequence analysis of the replication control regions of the wild-type plasmid and two copy-number mutant derivatives. These and other experiments have shown that three plasmid determinants--copA/incA, copB, and copT--are involved in DNA replication control. 
The products of the copB and copA/incA genes, a 9500-dalton basic polypeptide and either a 7200-dalton basic polypeptide or a short untranslated RNA molecule, respectively, are negative-acting elements that interact with the third element, their target, the copT DNA sequence, or its product to regulate the frequency of initiation of plasmid replication. The location of copT within the copA/incA gene and 1600 base pairs upstream from the origin of replication indicates that regulation is effected at a preinitiation stage of replication, such as the production of a primer or other initiation factor. Gene as the unit of genetic material A gene is the basic unit of heredity in a living organism. All living things depend on genes to hold the information to build and maintain their cells and to pass on their traits to offspring. In general terms, a gene is a segment of nucleic acid that, taken as a whole, specifies a trait. The colloquial usage of the term gene often refers to the scientific concept of an allele. The notion of a gene has evolved with the science of genetics, which began when Gregor Mendel noticed that biological variations are inherited from parent organisms as specific, discrete traits. The biological entity responsible for defining traits was termed a gene, but the biological basis for inheritance remained unknown until DNA was identified as the genetic material in the 1940s. All organisms have many genes corresponding to many different biological traits, some of which are immediately visible, such as eye color or number of limbs, and some of which are not, such as blood type or increased risk for certain diseases, or the thousands of basic biochemical processes that comprise life. In cells, a gene is a portion of DNA that contains both "coding" sequences that determine what the gene does, and "non-coding" sequences that determine when the gene is active (expressed). 
When a gene is active, the coding and non-coding sequences are copied in a process called transcription, producing an RNA copy of the gene's information. This piece of RNA can then direct the synthesis of proteins via the genetic code. In other cases, the RNA is used directly, for example as part of the ribosome. The molecules resulting from gene expression, whether RNA or protein, are known as gene products, and are responsible for the development and functioning of all living things.

In more technical terms, a gene is a locatable region of genomic sequence, corresponding to a unit of inheritance, and is associated with regulatory regions, transcribed regions and/or other functional sequence regions. The physical development and phenotype of organisms can be thought of as a product of genes interacting with each other and with the environment.

[A concise definition of a gene, taking into account complex patterns of regulation and transcription, genic conservation and non-coding RNA genes, has been proposed by Gerstein et al.: "A gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products".]

The concept of the gene was first introduced by Mendel, who termed genes hereditary factors or elements. This concept was purely hypothetical and did not carry any experimental evidence. Wilhelm Johannsen coined the term gene in 1909 to describe heritable factors responsible for the transmission and expression of a given biological character, but without reference to any particular theory of inheritance. Evidence from experimental work carried out with higher plants later suggested that genes are the units of inheritance, each controlling one phenotypic trait. Based on these data, Morgan in 1926 proposed the particulate gene theory, in which he stated that genes are corpuscular and arranged in a linear order on chromosomes like beads on a string.
Thus, Mendel's unit of inheritance correlates more directly with the gene as a unit of function than with the unit of structure or single nucleotide pair.
1. The unit of function means that a fragment or unit of genetic material controls the inheritance of one unit character or attribute of phenotype.
2. The unit of structure could be operationally defined in two ways:
   i. As the unit of inheritance not subdivisible by recombination.
   ii. As the smallest unit of genetic material capable of independent mutation.

Gene

A unit of heredity, usually a stretch of genetic material (DNA or RNA) with a defined function in the organism or cell, such as one for a protein. There are many genes within a genome. For example, the human genome was estimated to contain about 30 000 genes (more recent estimates are closer to 20 000 protein-coding genes), while the rice genome has been estimated at about 50 000.

The gene is the unit of inheritance, and each chromosome may have several thousand genes. We inherit particular chromosomes through the egg of our mother and sperm of our father. The genes on those chromosomes carry the code that determines our physical characteristics, which are a combination of those of our two parents.

[According to the pre-1940 beads-on-a-string concept, the gene was the basic unit of inheritance defined by three criteria: function, recombination and mutation. By the last two criteria, the unit of structure is the single nucleotide pair. Since it clearly does not make sense to call each nucleotide pair a gene, emphasis has been shifted to the original definition of the gene as the unit of function.]

The classical view was that all three criteria defined the same basic unit of inheritance. But this theory was discarded with advances in the understanding of DNA structure. Sutton introduced a gene concept, which was later elaborated by Muller.
This concept was known as classical gene structure. It states that:
1. Genes determine physical as well as physiological characteristics.
2. Genes are present on the chromosome, and on a single chromosome there are many genes.
3. Each gene occupies a specific position on a chromosome, called its locus (plural loci), and genes are arranged in a single linear order.

The Hershey-Chase Experiment

Legend: Illustration of the 1952 experiment connecting DNA and heredity. Side by side experiments are performed with separate bacteriophage (virus) cultures in which either the protein coat is labeled with radioactive sulfur or the DNA core is labeled with radioactive phosphorus.
1. The radioactively labeled phages are allowed to infect bacteria.
2. Agitation in a blender dislodges phage particles from bacterial cells.
3. Centrifugation concentrates cells, separating them from the phage particles left in the supernatant.

Results:
1. Radioactive sulfur is found predominantly in the supernatant.
2. Radioactive phosphorus is found predominantly in the cell fraction, from which a new generation of infective phage can be isolated.

Conclusion: The active component of the bacteriophage that transmits the infective characteristic is the DNA. There is a clear correlation between DNA and genetic information.

Mutation

In biology, mutations are changes to the nucleotide sequence of the genetic material of an organism. Mutations can be caused by copying errors in the genetic material during cell division, by exposure to ultraviolet or ionizing radiation, chemical mutagens, or viruses, or can be induced by the organism itself through cellular processes such as hypermutation. In multicellular organisms with dedicated reproductive cells, mutations can be subdivided into germ line mutations, which can be passed on to descendants through the reproductive cells, and somatic mutations, which involve cells outside the dedicated reproductive group and which are not usually transmitted to descendants.
If the organism can reproduce asexually through mechanisms such as cuttings or budding, the distinction can become blurred. For example, plants can sometimes transmit somatic mutations to their descendants asexually or sexually where flower buds develop in somatically mutated parts of plants. A new mutation that was not inherited from either parent is called a de novo mutation. The source of the mutation is unrelated to the consequence, although the consequences are related to which cells are affected.

Classification

By effect on structure

[Figure: five types of chromosomal mutations.]

The sequence of a gene can be altered in a number of ways. Gene mutations have varying effects on health depending on where they occur and whether they alter the function of essential proteins. Structurally, mutations can be classified as:

Small-scale mutations, such as those affecting a gene in one or a few nucleotides, including:
o Point mutations, often caused by chemicals or malfunction of DNA replication, exchange a single nucleotide for another. Most common is the transition, which exchanges a purine for a purine (A ↔ G) or a pyrimidine for a pyrimidine (C ↔ T). A transition can be caused by nitrous acid, base mis-pairing, or mutagenic base analogs such as 5-bromo-2'-deoxyuridine (BrdU). Less common is a transversion, which exchanges a purine for a pyrimidine or a pyrimidine for a purine (C/T ↔ A/G). An example of a transversion is adenine (A) being converted into cytosine (C). A point mutation can be reversed by another point mutation, in which the nucleotide is changed back to its original state (true reversion), or by second-site reversion (a complementary mutation elsewhere that results in regained gene functionality).
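The transition/transversion rule above is mechanical enough to sketch in a few lines. The example below is illustrative only: it assumes DNA sequences written with the letters A, C, G and T, and the function names are hypothetical helpers rather than part of any library.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base change as a transition or a transversion.

    A transition stays within one chemical class (purine to purine,
    or pyrimidine to pyrimidine); a transversion switches class.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "no change"
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def point_mutations(original, mutated):
    """List (position, original base, new base, class) for each difference.

    Only substitutions are handled, so both sequences must have the
    same length; insertions and deletions need an alignment step.
    """
    if len(original) != len(mutated):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, b, classify_substitution(a, b))
        for i, (a, b) in enumerate(zip(original, mutated))
        if a != b
    ]


print(classify_substitution("A", "G"))   # purine to purine
print(classify_substitution("A", "C"))   # purine to pyrimidine
print(point_mutations("GATTACA", "GACTAAA"))
```

A true reversion in this picture is simply a second substitution at the same position that restores the original base.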
Point mutations that occur within the protein coding region of a gene may be classified into three kinds, depending upon what the erroneous codon codes for:
   Silent mutations: which code for the same amino acid.
   Missense mutations: which code for a different amino acid.
   Nonsense mutations: which code for a stop and can truncate the protein.
o Insertions add one or more extra nucleotides into the DNA. They are usually caused by transposable elements, or errors during replication of repeating elements (e.g. AT repeats). Insertions in the coding region of a gene may alter splicing of the mRNA (splice site mutation), or cause a shift in the reading frame (frameshift), both of which can significantly alter the gene product. Insertions can be reverted by excision of the transposable element.
o Deletions remove one or more nucleotides from the DNA. Like insertions, these mutations can alter the reading frame of the gene. They are generally irreversible: though exactly the same sequence might theoretically be restored by an insertion, transposable elements able to revert a very short deletion (say 1–2 bases) in any location are either highly unlikely to exist or do not exist at all. Note that a deletion is not the exact opposite of an insertion: the former is quite random while the latter consists of a specific sequence inserting at locations that are not entirely random or even quite narrowly defined.

Large-scale mutations in chromosomal structure, including:
o Deletions of large chromosomal regions, leading to loss of the genes within those regions.
o Chromosomal translocations: interchange of genetic parts from nonhomologous chromosomes.
o Interstitial deletions: an intra-chromosomal deletion that removes a segment of DNA from a single chromosome, thereby apposing previously distant genes.
For example, cells isolated from a human astrocytoma, a type of brain tumor, were found to have a chromosomal deletion removing sequences between the "fused in glioblastoma" (fig) gene and the receptor tyrosine kinase "ros", producing a fusion protein (FIG-ROS). The abnormal FIG-ROS fusion protein has constitutively active kinase activity that causes oncogenic transformation (a transformation from normal cells to cancer cells).
o Chromosomal inversions: reversing the orientation of a chromosomal segment.

By effect on function

Lethal mutations are mutations that lead to the death of the organisms which carry them.

A back mutation or reversion is a point mutation that restores the original sequence and hence the original phenotype.

By impact on protein sequence

A frameshift mutation is a mutation caused by insertion or deletion of a number of nucleotides that is not evenly divisible by three from a DNA sequence. Due to the triplet nature of gene expression by codons, the insertion or deletion can disrupt the reading frame, or the grouping of the codons, resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein produced is.

Missense mutations or nonsynonymous mutations are types of point mutations where a single nucleotide is changed to cause substitution of a different amino acid. This in turn can render the resulting protein nonfunctional. Such mutations are responsible for diseases such as epidermolysis bullosa, sickle-cell disease, and SOD1-mediated ALS.

A neutral mutation is a mutation that occurs in an amino acid codon which results in the use of a different, but chemically similar, amino acid. This is similar to a silent mutation, where a codon mutation may encode the same amino acid (see Wobble Hypothesis); for example, a change from AUU to AUC will still encode isoleucine, so no discernible change occurs (a silent mutation).
A nonsense mutation is a point mutation in a sequence of DNA that results in a premature stop codon, or a nonsense codon in the transcribed mRNA, and possibly a truncated, and often nonfunctional, protein product.

Silent mutations are mutations that do not result in a change to the amino acid sequence of a protein. They may occur in a region that does not code for a protein, or they may occur within a codon in a manner that does not alter the final amino acid sequence. The phrase silent mutation is often used interchangeably with the phrase synonymous mutation; however, synonymous mutations are a subcategory of the former, occurring only within exons. The name silent could be a misnomer. For example, a silent mutation in the exon/intron border may lead to alternative splicing by changing the splice site (see Splice site mutation), thereby leading to a changed protein.

Causes of mutation

Two classes of mutations are spontaneous mutations (molecular decay) and induced mutations caused by mutagens.

Spontaneous mutations on the molecular level include:
o Tautomerism: A base is changed by the repositioning of a hydrogen atom.
o Depurination: Loss of a purine base (A or G) to form an apurinic site (AP site).
o Deamination: Changes a normal base to an atypical base. Examples include C → U and A → HX (hypoxanthine), which can be corrected by DNA repair mechanisms; and 5MeC (5-methylcytosine) → T, which is less likely to be detected as a mutation because thymine is a normal DNA base.
o Transition: A purine changes to another purine, or a pyrimidine to a pyrimidine.
o Transversion: A purine becomes a pyrimidine, or vice versa.

Induced mutations on the molecular level can be caused by:
Chemicals
o Hydroxylamine (NH2OH)
o Base analogs (e.g. BrdU)
o Alkylating agents (e.g. N-ethyl-N-nitrosourea). These agents can mutate both replicating and non-replicating DNA. In contrast, a base analog can only mutate the DNA when the analog is incorporated during DNA replication.
Each of these classes of chemical mutagens has certain effects that then lead to transitions, transversions, or deletions. o Agents that form DNA adducts (e.g. ochratoxin A metabolites) o DNA intercalating agents (e.g. ethidium bromide) o DNA crosslinkers o Oxidative damage Radiation o Ultraviolet radiation (nonionizing radiation). Two nucleotide bases in DNA – cytosine and thymine – are most vulnerable to radiation that can change their properties. UV light can induce adjacent thymine bases in a DNA strand to pair with each other, as a bulky dimer. o Ionizing radiation DNA has so-called hotspots, where mutations occur up to 100 times more frequently than the normal mutation rate. A hotspot can be at an unusual base, e.g., 5methylcytosine. Mutation rates also vary across species. Evolutionary biologists have theorized that higher mutation rates are beneficial in some situations, because they allow organisms to evolve and therefore adapt more quickly to their environments. For example, repeated exposure of bacteria to antibiotics, and selection of resistant mutants, can result in the selection of bacteria that have a much higher mutation rate than the original population. Mutation and Their Role in Evolution Mutations create variation within the gene pool. Less favorable (or deleterious) mutations can be reduced in frequency in the gene pool by natural selection, while more favorable (beneficial or advantageous) mutations may accumulate and result in adaptive evolutionary changes. For example, a butterfly may produce offspring with new mutations. The majority of these mutations will have no effect; but one might change the color of one of the butterfly's offspring, making it harder (or easier) for predators to see. If this color change is advantageous, the chance of this butterfly surviving and producing its own offspring are a little better, and over time the number of butterflies with this mutation may form a larger percentage of the population. 
Neutral mutations are defined as mutations whose effects do not influence the fitness of an individual. These can accumulate over time due to genetic drift. It is believed that the overwhelming majority of mutations have no significant effect on an organism's fitness. Also, DNA repair mechanisms are able to mend most changes before they become permanent mutations, and many organisms have mechanisms for eliminating otherwise permanently mutated somatic cells. Mutation is generally accepted by the scientific community as the mechanism upon which natural selection acts, providing the advantageous new traits that survive and multiply in offspring or disadvantageous traits that die out with weaker organisms. Mutation of DNA at molecular level Replication Errors Nothing is perfect. Although in most cases DNA replicates itself correctly, it does make errors from time to time. Tautomeric shifts The bases on both strands of DNA should be complementary. "A" must pair with "T" and "G" must pair with "C". If the pairing is wrong, when DNA undergoes semiconservative replication, the daughter DNA will be wrong. For example, if a C-G pair is changed to C-A, after replication, one of the two daughter DNA will have A-T instead of the original C-G. How does C pair with A? Scientists discovered a rare tautomeric form of each base GATC. Tautomer of a base contains exactly the same atoms as the common form but the atomic arrangement is slightly different. The deviated molecular shape enables a tautomer to pair with a normally noncomplementary base. A summary of tautomeric pairing is given in the following table: Please note that a tautomer cannot bond with another tautomer. Base analogues Some compounds act as base analogues. Like tautomers, they resemble the bases. But since its structure is still different from the bases, base analogues often pair wrongly and cause mutation. An example is 5-bromodeoxyuridine (5-BrdU). It can substitute for thymine. 
However, the presence of a bromine atom in place of the methyl group makes tautomerization more likely, so a TA pair can be changed to a CG pair. Another example is 2-aminopurine (2-AP), which resembles adenine and can cause an AT pair to change to a GC pair.

Alkylation

Some chemicals are electrophilic ("electron loving"). They attack DNA by donating an alkyl group to the DNA molecule. Ethyl methanesulfonate (EMS) is a common alkylating agent. It alkylates guanine, producing O6-ethylguanine, which is capable of pairing with thymine and causes a GC to AT mutation after replication. An alkylating agent may also donate an alkyl group to the N7 atom of guanine or the N3 atom of adenine. This makes the bond between the base and deoxyribose more fragile. If the base breaks off the sugar, a gap called an apurinic site (AP site) is left behind. When the DNA replicates, the AP site may terminate the process, or a wrong base may be inserted opposite it, which often causes a point mutation.

Deamination

Deamination refers to the loss of an amino group. Cytosine and adenine tend to lose their amino groups, which are replaced by keto groups. Cytosine is converted to uracil, and adenine to hypoxanthine. Uracil bonds with adenine during replication instead of guanine, causing a C:G to T:A mutation. Hypoxanthine, on the other hand, bonds with cytosine instead of thymine, causing an A:T to G:C mutation.

Acridine dyes

These chemicals have about the same dimensions as a base pair and can wedge themselves between adjacent base pairs in DNA. When the DNA is replicated, this wedge often causes slippage or wrong base pairing, and frameshift mutations often occur. Proflavin is one example.

Silent mutation

Sometimes a mutation, often a point mutation, has no effect at all. For example, if a codon UCA is changed to UCG, it is a mutation. However, UCA and UCG both code for the same amino acid, serine. Such a silent mutation is generally not harmful.
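The codon-level categories described above (silent, missense, nonsense) can be illustrated with a short sketch. It assumes the standard genetic code written in RNA letters; the function name `classify_point_mutation` and the one-letter amino acid string are choices made here for illustration and do not come from the notes.

```python
from itertools import product

# Standard genetic code, codons ordered U, C, A, G at each position.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(original: str, mutated: str) -> str:
    """Classify a single-codon change by its effect on the protein.

    Hypothetical helper: returns "silent", "missense" or "nonsense".
    """
    if original not in CODON_TABLE or mutated not in CODON_TABLE:
        raise ValueError("codons must be three letters from U, C, A, G")
    if CODON_TABLE[original] == "*":
        raise ValueError("original codon is a stop codon; not handled here")
    if CODON_TABLE[original] == CODON_TABLE[mutated]:
        return "silent"      # same amino acid (or identical codon)
    if CODON_TABLE[mutated] == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # a different amino acid is substituted


if __name__ == "__main__":
    for before, after in [("UCA", "UCG"), ("AUU", "AUC"),
                          ("GAG", "GUG"), ("UAU", "UAA")]:
        print(before, "->", after, classify_point_mutation(before, after))
```

The sketch looks only at the encoded amino acid, so it cannot tell a chemically similar ("neutral") substitution from any other missense change, and it ignores effects on splicing.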
Triplet repeats

There is an unusual kind of mutation that involves neither the change of a base nor the deletion or insertion of a single base. Instead, a codon repeats abnormally. Huntington's disease is caused by a mutation of this kind. The CAG sequence of a Huntington's patient is repeated 42-86 times, while in a normal genome this sequence is repeated no more than 34 times. The cause of triplet repeats is still unknown, but they occur frequently in the human genome.

UV and radiation

Purines and pyrimidines absorb UV light, especially at wavelengths around 260 nm. Pyrimidines use the extra energy to form cross bonds between carbon atoms of two neighbouring pyrimidines, producing a dimer. The most common kind of dimer is the thymine dimer. If the error goes uncorrected, when the DNA is replicated the polymerase cannot tell which base to insert opposite the dimer. The result is wrong base pairing. High-energy waves, such as gamma rays and X-rays, penetrate deeply into tissues and ionize the molecules around them. The ionized molecules are very unstable since they have unpaired electrons. They attack DNA, alter its structure and cause mutations. Although there are many ways that DNA can undergo mutation, the mutation rate is actually very low (about 1:10,000 to 1:1,000,000).

Damage Reversal or Photoreactivation

Exposure of a cell to ultraviolet light can result in the covalent joining of two adjacent pyrimidines, producing a dimer. Although cytosine-cytosine and cytosine-thymine dimers are also formed, the principal products of UV irradiation are thymine-thymine dimers. These thymine dimers prevent DNA polymerase from replicating the DNA strand beyond the site of dimer formation. In E. coli, an enzyme called DNA photolyase (deoxyribodipyrimidine photolyase or photoreactivating enzyme) detects and binds to the damaged DNA site. Then the enzyme absorbs energy from visible light, which activates it so it can break the bonds holding the pyrimidine dimer together.
The enzyme then falls free of the DNA. This enzyme thus reverses the UV induced dimerization. Photo reactivation or photo-restoration is a light dependent DNA repair mechanism in which certain types of pyrimidine dimers are cleaved. This repair pathway is found in many prokaryotes and lower eukaryotes but absent in higher eukaryotes. Photo reactivation should not be confused with other, non enzymatic mechanisms of monomerization. Direct reversal: Only a few types of damage are repaired in this way although it is probably the most energy efficient. Especially the formation of pyrimidine dimers, which is the major type of damage induced by UV light. Pyrimidine dimers are formed between adjacent pyrimidines (particularly thymines) on the same strand of DNA by the formation of a cyclobutane ring resulting from saturation of the double bonds in their ring structure (Fig. 12.25). Pyrimidine dimers distort the double helical structure of DNA and block transcription or replication past the damaged site. Recognition of distortions in the double helix is the major way that DNA damage is generally recognized in the cell. One mechanism of repair (there are several others) is through direct reversal of the dimerization reaction. The process is called photoreactivation because the energy to break the cyclobutane ring is derived from visible light. Therefore, in this kind of repair mechanism the original pyrimidine bases are restored and remain in the DNA. The repair of pyrimidine dimers by photoreactivation by the enzyme photolyase, is common to many prokaryotes and eukaryotes (E. coli, yeast, and several species of plants and animals). However, photoreactivation is not universal. Many species (including humans) lack this kind of repair mechanism. But humans have other kinds of repair mechanisms that directly reverse certain damages. Photoreactivation phr gene - codes for deoxyribodipyrimidine photolyase that, with cofactor folic acid, binds in dark to T dimer. 
When light shines on the cell, folic acid absorbs the light and the energy is used to break the bonds of the T dimer; photolyase then falls off the DNA.

Prokaryotic DNA replication

DNA replication in prokaryotes is exemplified in E. coli. It is bidirectional and originates at a single origin of replication (OriC).

Initiation

The initiation of replication is mediated by DnaA, a protein that binds to regions of the origin known as DnaA boxes. In E. coli, there are 5 DnaA boxes, each of which contains a highly conserved 9 bp consensus sequence 5' - TTATCCACA - 3'. Binding of DnaA to this region causes it to become negatively supercoiled. Following this, a region of OriC upstream of the DnaA boxes (known as DnaB boxes) becomes melted. There are three of these regions; each is 13 bp long and AT-rich (which facilitates melting because less energy is required to break the two hydrogen bonds that form between A and T nucleotides). This region has the consensus sequence 5' - GATCTNTTNTTTT - 3'. Melting of the DnaB boxes requires ATP (which is hydrolyzed by DnaA). Following melting, DnaA recruits a hexameric helicase (six DnaB proteins) to opposite ends of the melted DNA. This is where the replication fork will form. Recruitment of helicase requires six DnaC proteins, each of which is attached to one subunit of helicase. Once this complex is formed, an additional five DnaA proteins bind to the original five DnaA proteins to form five DnaA dimers. DnaC is then released, and the prepriming complex is complete. In order for DNA replication to continue, SSB protein is needed to prevent the single strands of DNA from forming any secondary structures and to prevent them from reannealing, and DNA gyrase is needed to relieve the stress (by creating negative supercoils) created by the action of DnaB helicase. The unwinding of DNA by DnaB helicase allows primase (DnaG) and RNA polymerase to prime each DNA template so that DNA synthesis can begin.
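As an illustration of how the consensus sequences quoted above might be located in a sequence, here is a minimal sketch that scans both strands for the 9 bp DnaA box and reports the AT fraction of a 13 bp element. The toy sequence is invented for the example and is not the real OriC sequence; the helper names are likewise hypothetical.

```python
DNAA_BOX = "TTATCCACA"  # 9 bp consensus quoted in the text

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sites(seq: str, motif: str, max_mismatches: int = 0):
    """Scan both strands of seq for motif.

    Returns a sorted list of (start_index, strand, matched_window) tuples;
    a "-" hit means the motif lies on the complementary strand.
    """
    hits = []
    for strand, target in (("+", motif), ("-", reverse_complement(motif))):
        for start in range(len(seq) - len(target) + 1):
            window = seq[start:start + len(target)]
            mismatches = sum(1 for a, b in zip(window, target) if a != b)
            if mismatches <= max_mismatches:
                hits.append((start, strand, window))
    return sorted(hits)


def at_fraction(seq: str) -> float:
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


if __name__ == "__main__":
    # Invented toy sequence: one 13-mer matching GATCTNTTNTTTT (N = A),
    # one DnaA box on the top strand and one on the bottom strand.
    thirteen_mer = "GATCTATTATTTT"
    toy = thirteen_mer + "GGC" + DNAA_BOX + "CCG" + reverse_complement(DNAA_BOX)
    print(find_sites(toy, DNAA_BOX))
    print(round(at_fraction(thirteen_mer), 2))
```

Raising `max_mismatches` above zero allows near-consensus boxes to be reported as well, which matters because real binding sites rarely match a consensus exactly.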
Elongation

Once priming is complete, DNA polymerase III holoenzyme is loaded onto the DNA and replication begins. The catalytic mechanism of DNA polymerase III involves the use of two metal ions in the active site, and a region in the active site that can discriminate between deoxynucleotides and ribonucleotides. The metal ions are divalent cations that help the 3' OH initiate a nucleophilic attack on the alpha phosphate of the deoxyribonucleotide, and that orient and stabilize the negatively charged triphosphate on the deoxyribonucleotide. Nucleophilic attack by the 3' OH on the alpha phosphate releases pyrophosphate, which is subsequently hydrolyzed (by inorganic pyrophosphatase) into two phosphates. This hydrolysis drives DNA synthesis to completion. Furthermore, DNA polymerase III must be able to distinguish between correctly paired bases and incorrectly paired bases. This is accomplished by distinguishing Watson-Crick base pairs through the use of an active site pocket that is complementary in shape to the structure of correctly paired nucleotides. This pocket has a tyrosine residue that is able to form van der Waals interactions with the correctly paired nucleotide. In addition, dsDNA (double stranded DNA) in the active site has a wider and shallower minor groove that permits the formation of hydrogen bonds with the third nitrogen of purine bases and the second oxygen of pyrimidine bases. Finally, the active site makes extensive hydrogen bonds with the DNA backbone. These interactions result in the DNA polymerase III closing around a correctly paired base. If a base is inserted and incorrectly paired, these interactions cannot occur because of disruptions in hydrogen bonding and van der Waals interactions. The template DNA is read in the 3' → 5' direction; therefore, the new strand is synthesized in the 5' → 3' direction. However, one of the parent strands of DNA is 3' → 5' while the other is 5' → 3'.
To solve this, replication occurs in opposite directions. Heading towards the replication fork, the leading strand is synthesized in a continuous fashion, requiring only one primer. On the other hand, the lagging strand, heading away from the replication fork, is synthesized in a series of short fragments known as Okazaki fragments, consequently requiring many primers. The RNA primers of Okazaki fragments are subsequently degraded by RNase H and DNA polymerase I (exonuclease), and the gaps (or nicks) are filled with deoxyribonucleotides and sealed by the enzyme ligase.

Termination

Termination of DNA replication in E. coli is completed through the use of termination sequences and the Tus protein. These sequences allow the two replication forks to pass through in only one direction, but not the other. However, these sequences are not required for termination of replication.

Regulation of DNA replication is achieved through several mechanisms. These involve the ratio of ATP to ADP, the ratio of DnaA to the number of DnaA boxes, and the hemimethylation and sequestering of OriC. The ratio of ATP to ADP indicates that the cell has reached a specific size and is ready to divide. This "signal" occurs because in a rich medium, the cell will grow quickly and will have a lot of excess DNA. Furthermore, DnaA binds equally well to ATP or ADP, but only the DnaA-ATP complex is able to initiate replication. Thus, in a fast growing cell, there will be more DnaA-ATP than DnaA-ADP. Because the levels of DnaA are strictly regulated, and 5 DnaA-DnaA dimers are needed to initiate replication, the ratio of DnaA to the number of DnaA boxes in the cell is important. After DNA replication is complete, this ratio is halved, so DNA replication cannot occur again until the level of DnaA protein increases. Finally, DNA is sequestered by a membrane-binding protein called SeqA. This protein binds to hemi-methylated GATC DNA sequences.
This four-bp sequence occurs 11 times in OriC, and newly synthesized DNA only has its parent strand methylated. DAM methyltransferase methylates the newly synthesized strand of DNA only if it is not bound to SeqA. The importance of hemi-methylation is twofold. Firstly, OriC becomes inaccessible to DnaA, and secondly, DnaA binds better to fully methylated DNA than to hemi-methylated DNA.

Rolling circle replication

Rolling circle replication produces multiple copies of a single circular template. It describes a process of nucleic acid replication that can rapidly synthesize multiple copies of circular molecules of DNA or RNA, such as plasmids, the genomes of bacteriophages, and the circular RNA genome of viroids. Some eukaryotic viruses also replicate their DNA via a rolling circle mechanism.

Circular DNA replication

Rolling circle DNA replication is initiated by an initiator protein encoded by the plasmid or bacteriophage DNA, which nicks one strand of the double-stranded, circular DNA molecule at a site called the double-strand origin, or DSO. The initiator protein remains bound to the 5' phosphate end of the nicked strand, and the free 3' hydroxyl end is released to serve as a primer for DNA synthesis by DNA polymerase III. Using the unnicked strand as a template, replication proceeds around the circular DNA molecule, displacing the nicked strand as single-stranded DNA. Displacement of the nicked strand is carried out by a host-encoded helicase called PcrA (the abbreviation standing for plasmid copy reduced) in the presence of the plasmid replication initiation protein. Continued DNA synthesis can produce multiple single-stranded linear copies of the original DNA in a continuous head-to-tail series called a concatemer.
These linear copies can be converted to double-stranded circular molecules through the following process: First, the initiator protein makes another nick to terminate synthesis of the first (leading) strand. RNA polymerase and DNA polymerase III then replicate the single-stranded origin (SSO) DNA to make another double-stranded circle. DNA polymerase I removes the primer, replacing it with DNA, and DNA ligase joins the ends to make another molecule of double-stranded circular DNA. Rolling circle replication has found wide uses in academic research and biotechnology, and has been successfully used for amplification of DNA from very small amounts of starting material.

http://www3.interscience.wiley.com:8100/legacy/college/boyer/0471661791/animations/replication/replication.swf

Theta structure

A theta structure is an intermediate structure formed during the replication of a circular DNA molecule (prokaryote DNA). Two replication forks can proceed independently around the DNA ring, and when viewed from above the molecule resembles the Greek letter "theta" (θ). Originally discovered by Cairns, it led to the understanding that (in this case) bidirectional DNA replication could take place. Proof of the bidirectional nature came from providing replicating cells with a pulse of tritiated thymidine, quenching rapidly and then autoradiographing. Results showed that the radioactive thymidine was incorporated into both forks of the theta structure, not just one, indicating synthesis at both forks, in opposite directions around the loop.

DNA replication

The double helix is unwound and each strand acts as a template. Bases are matched to synthesize the new partner strands. DNA replication, the basis for biological inheritance, is a fundamental process occurring in all living organisms to copy their DNA. This process is "semiconservative" in that each strand of the original double-stranded DNA molecule serves as template for the reproduction of the complementary strand.
Hence, following DNA replication, two identical DNA molecules have been produced from a single double-stranded DNA molecule. Cellular proofreading and error-checking mechanisms ensure near perfect fidelity for DNA replication.[1][2] In a cell, DNA replication begins at specific locations in the genome, called "origins".[3] Unwinding of DNA at the origin, and synthesis of new strands, forms a replication fork. In addition to DNA polymerase, the enzyme that synthesizes the new DNA by adding nucleotides matched to the template strand, a number of other proteins are associated with the fork and assist in the initiation and continuation of DNA synthesis. DNA replication can also be performed in vitro (outside a cell). DNA polymerases, isolated from cells, and artificial DNA primers are used to initiate DNA synthesis at known sequences in a template molecule. The polymerase chain reaction (PCR), a common laboratory technique, employs such artificial synthesis in a cyclic manner to amplify a specific target DNA fragment from a pool of DNA.

DNA polymerase

DNA polymerase adds nucleotides to the 3' end of a strand of DNA. If a mismatch is accidentally incorporated, the polymerase is inhibited from further extension. Proofreading removes the mismatched nucleotide and extension continues. DNA polymerases are a family of enzymes that carry out all forms of DNA replication.[5] A DNA polymerase can only extend an existing DNA strand paired with a template strand; it cannot begin the synthesis of a new strand. To begin synthesis of a new strand, a short fragment of DNA or RNA, called a primer, must be created and paired with the template strand before DNA polymerase can synthesize new DNA. Once a primer pairs with the DNA to be replicated, DNA polymerase synthesizes a new strand of DNA by extending the 3' end of an existing nucleotide chain, adding new nucleotides matched to the template strand one at a time via the creation of phosphodiester bonds.
The energy for this process of DNA polymerization comes from two of the three total phosphates attached to each unincorporated base. (Free bases with their attached phosphate groups are called nucleoside triphosphates.) When a nucleotide is being added to a growing DNA strand, two of the phosphates are removed and the energy produced creates a phosphodiester (chemical) bond that attaches the remaining phosphate to the growing chain. The energetics of this process also help explain the directionality of synthesis: if DNA were synthesized in the 3' to 5' direction, the energy for the process would come from the 5' end of the growing strand rather than from free nucleotides. DNA polymerases are generally extremely accurate, making less than one error for every 10^7 nucleotides added.[6] Even so, some DNA polymerases also have proofreading ability; they can remove nucleotides from the end of a strand in order to correct mismatched bases. If the 5' nucleotide needs to be removed during proofreading, the triphosphate end is lost. Hence, the energy source that usually provides energy to add a new nucleotide is also lost.

DNA replication within the cell

Origins of replication

For a cell to divide, it must first replicate its DNA.[7] This process is initiated at particular points within the DNA, known as "origins", which are targeted by proteins that separate the two strands and initiate DNA synthesis.[3] Origins contain DNA sequences recognized by replication initiator proteins (e.g. DnaA in E. coli and the Origin Recognition Complex in yeast).[8] These initiator proteins recruit other proteins to separate the two strands and initiate replication forks. Initiator proteins recruit other proteins to separate the DNA strands at the origin, forming a bubble.
Origins tend to be "AT-rich" (rich in adenine and thymine bases) to assist this process because A-T base pairs have two hydrogen bonds (rather than the three formed in a C-G pair)— strands rich in these nucleotides are generally easier to separate.[9] Once strands are separated, RNA primers are created on the template strands and DNA polymerase extends these to create newly synthesized DNA. As DNA synthesis continues, the original DNA strands continue to unwind on each side of the bubble, forming replication forks. In bacteria, which have a single origin of replication on their circular chromosome, this process eventually creates a "theta structure" (resembling the Greek letter theta: θ). In contrast, eukaryotes have longer linear chromosomes and initiate replication at multiple origins within these. The replication fork Many enzymes are involved in the DNA replication fork. When replicating, the original DNA splits in two, forming two "prongs" which resemble a fork (hence the name "replication fork"). DNA has a ladder-like structure; imagine a ladder broken in half vertically, along the steps. Each half of the ladder now requires a new half to match it. Because DNA polymerase can only synthesize a new DNA strand in a 5' to 3' manner, the process of replication goes differently for the two strands comprising the DNA double helix. Leading strand The leading strand is that strand of the DNA double helix that is orientated in a 5' to 3' manner. On the leading strand, a polymerase "reads" the DNA and adds nucleotides to it continuously. This polymerase is DNA polymerase III (DNA Pol III) in prokaryotes and presumably Pol ε[10][11] in eukaryotes. Lagging strand The lagging strand is that strand of the DNA double helix that is orientated in a 3' to 5' manner. Because of its orientation, opposite to the working orientation of DNA polymerase III which is in a 5' to 3' manner, replication of the lagging strand is more complicated than of the leading strand. 
On the lagging strand, primase "reads" the DNA and adds RNA to it in short, separated segments. In eukaryotes, primase is intrinsic to Pol α.[12] DNA polymerase III or Pol δ lengthens the primed segments, forming Okazaki fragments. Primer removal in eukaryotes is also performed by Pol δ.[13] In prokaryotes, DNA polymerase I "reads" the fragments, removes the RNA using its 5' to 3' exonuclease activity, and replaces the RNA nucleotides with DNA nucleotides (this is necessary because RNA and DNA use slightly different kinds of nucleotides). DNA ligase joins the fragments together.

Dynamics at the replication fork

[Figure: the assembled human DNA clamp, a trimer of the protein PCNA.]

As helicase unwinds DNA at the replication fork, the DNA ahead is forced to rotate. This process results in a build-up of twists in the DNA ahead.[14] This build-up would form a resistance that would eventually halt the progress of the replication fork. DNA topoisomerases are enzymes that solve these physical problems in the coiling of DNA. Topoisomerase I cuts a single backbone on the DNA, enabling the strands to swivel around each other to remove the build-up of twists. Topoisomerase II cuts both backbones, enabling one double-stranded DNA to pass through another, thereby removing knots and entanglements that can form within and between DNA molecules. Bare single-stranded DNA has a tendency to fold back upon itself and form secondary structures; these structures can interfere with the movement of DNA polymerase. To prevent this, single-strand binding proteins bind to the DNA until a second strand is synthesized, preventing secondary structure formation.[15] Clamp proteins form a sliding clamp around DNA, helping the DNA polymerase maintain contact with its template and thereby assisting with processivity. The inner face of the clamp enables DNA to be threaded through it.
Once the polymerase reaches the end of the template or detects double stranded DNA, the sliding clamp undergoes a conformational change which releases the DNA polymerase. Clamp-loading proteins are used to initially load the clamp, recognizing the junction between template and RNA primers.

Regulation of replication

[Figure: the cell cycle of eukaryotic cells.]

Eukaryotes

Within eukaryotes, DNA replication is controlled within the context of the cell cycle. As the cell grows and divides, it progresses through stages in the cell cycle; DNA replication occurs during the S phase (Synthesis phase). The progress of the eukaryotic cell through the cycle is controlled by cell cycle checkpoints. Progression through checkpoints is controlled through complex interactions between various proteins, including cyclins and cyclin-dependent kinases.[16] The G1/S checkpoint (or restriction checkpoint) regulates whether eukaryotic cells enter the process of DNA replication and subsequent division. Cells which do not proceed through this checkpoint are quiescent in the "G0" stage and do not replicate their DNA. Replication of chloroplast and mitochondrial genomes occurs independent of the cell cycle, through the process of D-loop replication.

Bacteria

Most bacteria do not go through a well-defined cell cycle and instead continuously copy their DNA; during rapid growth this can result in multiple rounds of replication occurring concurrently.[17] Within E. coli, the best-characterized bacterium, regulation of DNA replication can be achieved through several mechanisms, including: the hemimethylation and sequestering of the origin sequence, the ratio of ATP to ADP, and the levels of protein DnaA. These all control the process of initiator proteins binding to the origin sequences. Because E. coli methylates GATC DNA sequences, DNA synthesis results in hemimethylated sequences.
This hemimethylated DNA is recognized by a protein (SeqA) which binds and sequesters the origin sequence; in addition, DnaA (required for initiation of replication) binds less well to hemimethylated DNA. As a result, newly replicated origins are prevented from immediately initiating another round of DNA replication.[18] ATP builds up when the cell is in a rich medium, triggering DNA replication once the cell has reached a specific size. ATP competes with ADP to bind to DnaA, and the DnaA-ATP complex is able to initiate replication. A certain number of DnaA proteins are also required for DNA replication: each time the origin is copied the number of binding sites for DnaA doubles, requiring the synthesis of more DnaA to enable another initiation of replication.

Termination of replication

Because bacteria have circular chromosomes, termination of replication occurs when the two replication forks meet each other on the opposite end of the parental chromosome. E. coli regulates this process through the use of termination sequences which, when bound by the Tus protein, enable only one direction of replication fork to pass through. As a result, the replication forks are constrained to always meet within the termination region of the chromosome.[19] Eukaryotes initiate DNA replication at multiple points in the chromosome, so replication forks meet and terminate at many points in the chromosome; these are not known to be regulated in any particular manner. Because eukaryotes have linear chromosomes, DNA replication often fails to synthesize to the very end of the chromosomes (telomeres), resulting in telomere shortening. This is a normal process in somatic cells: cells are only able to divide a certain number of times before the DNA loss prevents further division. (This is known as the Hayflick limit.)
Within the germ cell line, which passes DNA to the next generation, the enzyme telomerase extends the repetitive sequences of the telomere region to prevent degradation. Telomerase can become mistakenly active in somatic cells, sometimes leading to cancer formation.

Rolling circle replication

Another method of copying DNA, sometimes used in vivo by bacteria and viruses, is the process of rolling circle replication.[20] In this form of replication, a single replication fork progresses around a circular molecule to form multiple linear copies of the DNA sequence. In cells, this process can be used to rapidly synthesize multiple copies of plasmids or viral genomes. In the cell, rolling circle replication is initiated by an initiator protein encoded by the plasmid or virus DNA. This protein is able to nick one strand of the double-stranded, circular DNA molecule at a site called the double-strand origin (DSO) and remains bound to the 5' phosphate end of the nicked strand. The free 3' hydroxyl end is released and can serve as a primer for DNA synthesis. Using the unnicked strand as a template, replication proceeds around the circular DNA molecule, displacing the nicked strand as single-stranded DNA. Continued DNA synthesis produces multiple single-stranded linear copies of the original DNA in a continuous head-to-tail series. In vivo these linear copies are subsequently converted to double-stranded circular molecules. Rolling circle replication can also be performed in vitro and has found wide uses in academic research and biotechnology, often used for amplification of DNA from very small amounts of starting material. Replication can be initiated by nicking a double-stranded circular DNA molecule or by hybridizing a primer to a single-stranded circle of DNA. The use of a reverse primer (or random primers) produces hyperbranched rolling circle amplification, resulting in exponential rather than linear growth of the DNA molecule.
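The head-to-tail series produced by rolling circle replication can be pictured with a toy model. The sketch below is a deliberate simplification: the circle is a short made-up string, the two strands share one index (antiparallel orientation is ignored), and the function name `rolling_circle` is hypothetical. It only shows why copying repeatedly around a circular template from a nick yields a concatemer of identical units.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def rolling_circle(nicked_strand: str, nick: int, rounds: int) -> str:
    """Toy rolling circle synthesis.

    nicked_strand: sequence of the strand that gets nicked, read around the circle
    nick:          index at which the initiator protein nicks that strand
    rounds:        how many times synthesis travels around the circle
    Returns the newly synthesized strand, a head-to-tail concatemer.
    """
    size = len(nicked_strand)
    # The unnicked strand serves as the template; it is complementary,
    # position by position, to the nicked strand.
    template = [COMPLEMENT[base] for base in nicked_strand]
    product = []
    for step in range(rounds * size):
        position = (nick + step) % size      # keep going around the circle
        product.append(COMPLEMENT[template[position]])
    return "".join(product)


def split_units(concatemer: str, unit_length: int):
    """Cut a concatemer into unit-length linear copies."""
    return [concatemer[i:i + unit_length]
            for i in range(0, len(concatemer), unit_length)]


if __name__ == "__main__":
    circle = "ATGCCGTA"                      # invented 8-base circle
    concatemer = rolling_circle(circle, nick=3, rounds=3)
    print(concatemer)
    print(split_units(concatemer, len(circle)))
```

In this model each trip around the template adds one more copy, so the product grows linearly with the number of rounds; the exponential growth of hyperbranched amplification is not modelled.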
Polymerase chain reaction

Researchers commonly replicate DNA in vitro using the polymerase chain reaction (PCR). PCR uses a pair of primers to span a target region in template DNA, and then polymerizes partner strands in each direction from these primers using a thermostable DNA polymerase. Repeating this process through multiple cycles amplifies the targeted DNA region. At the start of each cycle, the mixture of template and primers is heated, separating the newly synthesized molecule and template. Then, as the mixture cools, both of these become templates for annealing of new primers, and the polymerase extends from these. As a result, the number of copies of the target region doubles each round, increasing exponentially.[21]

Spontaneous mutation

Definition and sources

A spontaneous mutation is one that occurs as a result of natural processes in cells. These can be distinguished from induced mutations, which occur as a result of interaction of DNA with an outside agent or mutagen. Since some of the same mechanisms are involved in producing spontaneous and induced mutations, we will consider them together. Some so-called "spontaneous mutations" are probably the result of naturally occurring mutagens in the environment; nevertheless there are others that definitely arise spontaneously, for example, DNA replication errors.

DNA replication errors and polymerase accuracy

Mistakes in DNA replication where an incorrect nucleotide is added will lead to a mutation in the next round of DNA replication of the strand with the incorrect nucleotide. The frequency at which a DNA polymerase makes mistakes (inserts an incorrect base) influences the spontaneous mutation frequency, and different polymerases have been observed to vary in their accuracy. One major factor affecting polymerase accuracy is the presence of a "proofreading" 3'-5' exonuclease which removes incorrectly paired bases inserted by the polymerase.
The studies showed that the function of the 3'-5' exonuclease is to prevent misincorporation during DNA replication and to prevent mutations. Mutator mutants have since been isolated in other organisms and have been shown to affect various components of the DNA replication complex; alterations in a number of these proteins are likely to affect the accuracy of the system.

Base alterations and base damage

Tautomerization: The bases of DNA are subject to spontaneous structural alterations called tautomerization: they are capable of existing in two forms between which they interconvert. For example, guanine can exist in keto or enol forms. The keto form is favored but the enol form can occur by shifting a proton and some electrons; these forms are called tautomers or structural isomers. The various tautomeric forms of the bases have different pairing properties. Thymine can also have an enol form; adenine and cytosine exist in amino or imino forms. If G is in the enol form during DNA replication, the polymerase will add a T across from it instead of the normal C because the base pairing rules are changed (this is not a polymerase error). The result is a G:C to A:T transition; tautomerization causes transition mutations only.

Deamination: Another mutagenic process occurring in cells is spontaneous base degradation. The deamination of cytosine to uracil happens at a significant rate in cells. Deamination can be repaired by a specific repair process which detects uracil, which is not normally present in DNA; otherwise the U will cause A to be inserted opposite it and cause a C:G to T:A transition when the DNA is replicated. Deamination of methylcytosine to thymine can also occur. Methylcytosine occurs in the human genome at the sequence 5'-CpG-3', which is normally avoided in the coding regions of genes. If the meC is deaminated to T, there is no repair system which can recognize and remove it (because T is a normal base in DNA).
This means that wherever CpG occurs in genes it is a "hot spot" for mutation. Such a hot spot has recently been found in the achondroplasia gene.

Oxidation: A third type of spontaneous DNA damage that occurs frequently is damage to the bases by free radicals of oxygen. These arise in cells as a result of oxidative metabolism and are also formed by physical agents such as radiation. An important oxidation product is 8-hydroxyguanine, which mispairs with adenine, resulting in G:C to T:A transversions.

Alkylation: Still another type of spontaneous DNA damage is alkylation, the addition of alkyl (methyl, ethyl, occasionally propyl) groups to the bases or backbone of DNA. Alkylation can occur through reaction of compounds such as S-adenosyl methionine with DNA. Alkylated bases may be subject to spontaneous breakdown or mispairing.

Spontaneous frameshift mutations

These result from the insertion or deletion of one or more (not in multiples of three) nucleotides in the coding region of a gene. This causes an alteration of the reading frame: since codons are groups of three nucleotides, there are three possible reading frames for each gene although only one is used.

e.g. mRNA with sequence: AUG CAG AUA AAC GCU GCA UAA
amino acid sequence from the first reading frame: met gln ile asn ala ala stop
the second reading frame gives: cys arg stop

A mutation of this sort changes all the amino acids downstream and is very likely to create a nonfunctional product since it may differ greatly from the normal protein. Further, reading frames other than the correct one often contain stop codons which will truncate the mutant protein prematurely.

Induced Mutation

A mutagen is a natural or human-made agent (physical or chemical) which can alter the structure or sequence of DNA. Induced mutations are caused by mutagens.
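Before turning to specific mutagens, the reading-frame example given above for frameshift mutations can be checked with a short Python sketch. It translates the example mRNA in each of the three frames using the standard genetic code, then deletes one nucleotide to imitate a frameshift. The function name `translate`, the use of one-letter amino acid codes, and the choice of which nucleotide to delete are illustrative assumptions, not part of the original notes.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna, frame=0):
    """Translate `mrna` starting at offset `frame` (0, 1 or 2).

    Translation stops at the first stop codon or when fewer than
    three nucleotides remain.
    """
    peptide = []
    for i in range(frame, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = "AUGCAGAUAAACGCUGCAUAA"  # the example sequence from the text
for frame in range(3):
    print("reading frame", frame + 1, ":", translate(mrna, frame))

# Hypothetical frameshift: delete the fourth nucleotide (the C of CAG).
mutant = mrna[:3] + mrna[4:]
print("after deletion   :", translate(mutant))
```

The first frame gives MQINAA (met gln ile asn ala ala) and the second gives CR (cys arg), matching the example. With the single-nucleotide deletion, the first frame now reads AUG AGA UAA and yields only MR before an early stop codon, which illustrates how a frameshift can truncate the protein.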
Chemical mutagens

The first report of mutagenic action of a chemical was in 1942 by Charlotte Auerbach, who showed that nitrogen mustard (a component of the poisonous mustard gas used in World Wars I and II) could cause mutations in cells. Since that time, many other mutagenic chemicals have been identified and there is a huge industry and government bureaucracy dedicated to finding them in food additives, industrial wastes, etc. It is possible to distinguish chemical mutagens by their modes of action; some of these cause mutations by mechanisms similar to those which arise spontaneously while others are more like radiation.

1. Base analogs

These chemicals structurally resemble purines and pyrimidines and may be incorporated into DNA in place of the normal bases during DNA replication:

- bromouracil (BU): an artificially created compound extensively used in research. It resembles thymine (it has a Br atom instead of the methyl group) and will be incorporated into DNA and pair with A like thymine. It has a higher likelihood of tautomerization to the enol form (BU*).
- aminopurine: an adenine analog which can pair with T or (less well) with C; it causes A:T to G:C or G:C to A:T transitions.

Base analogs cause transitions, as do spontaneous tautomerization events.

2. Chemicals which alter structure and pairing properties of bases

There are many such mutagens; some well-known examples are:

- nitrous acid: formed by digestion of nitrites (preservatives) in foods. It causes C to U, meC to T, and A to hypoxanthine deaminations. (See above for the consequences of the first two events; hypoxanthine in DNA pairs with C and causes transitions.) Deamination by nitrous acid, like spontaneous deamination, causes transitions.
- nitrosoguanidine, methyl methanesulfonate, ethyl methanesulfonate: chemical mutagens that react with bases and add methyl or ethyl groups.
Depending on the affected atom, the alkylated base may then degrade to yield a baseless site, which is mutagenic and recombinogenic, or mispair to result in mutations upon DNA replication.

3. Intercalating agents

Acridine orange, proflavin and ethidium bromide (used in labs as dyes and mutagens) are all flat, multiple-ring molecules which interact with the bases of DNA and insert between them. This insertion causes a "stretching" of the DNA duplex, and the DNA polymerase is "fooled" into inserting an extra base opposite an intercalated molecule. The result is that intercalating agents cause frameshifts.

4. Agents altering DNA structure

We are using this as a "catch-all" category which includes a variety of different kinds of agents. These may be:

- large molecules which bind to bases in DNA and cause them to be noncoding; we refer to these as "bulky" lesions (e.g. NAAAF)
- agents causing intra- and inter-strand crosslinks (e.g. psoralens, found in some vegetables and used in treatments of some skin conditions)
- chemicals causing DNA strand breaks (e.g. peroxides)

Radiation

Radiation was the first mutagenic agent known; its effects on genes were first reported in the 1920s. Microwaves, infrared, visible, ultraviolet (UV), X and gamma radiation are some examples of radiation. X- and gamma-rays are energetic enough that they produce reactive ions (charged atoms or molecules) when they react with biological molecules; thus they are referred to as ionizing radiation. UV radiation is not ionizing but can react with DNA and other biological molecules and is also important as a mutagen.

Biological effects of radiation

Ionizing radiation produces a range of damage to cells and organisms, primarily due to the production of free radicals of water (the hydroxyl or OH radical). Free radicals possess unpaired electrons, are chemically very reactive and will interact with DNA, proteins, lipids in cell membranes, etc.
Thus X-rays can cause DNA and protein damage which may result in organelle failure, block cell division, or cause cell death.

Genetic effects of radiation

Ionizing radiation produces a range of effects on DNA, both through free radical effects and direct action:

- breaks in one or both strands (can lead to rearrangements, deletions, chromosome loss, death if unrepaired; this is from stimulation of recombination)
- damage to/loss of bases (mutations)
- crosslinking of DNA to itself or proteins

The genetic effects of radiation were reported in 1927 in Drosophila by Muller and in 1928 in plants (barley) by Stadler; both showed that the frequency of induced mutations is a function of X-ray dose.

UV (ultraviolet)

UV radiation is less energetic, and therefore non-ionizing, but its wavelengths are preferentially absorbed by the bases of DNA and by the aromatic amino acids of proteins, so it, too, has important biological and genetic effects. UV is normally classified in terms of its wavelength:

- UV-C (180-290 nm): "germicidal"; the most energetic and lethal; it is not found in sunlight because it is absorbed by the ozone layer.
- UV-B (290-320 nm): the major lethal/mutagenic fraction of sunlight.
- UV-A (320 nm-visible): "near UV"; it also has deleterious effects (primarily because it creates oxygen radicals) but it produces very few pyrimidine dimers.

The major lethal lesions are pyrimidine dimers in DNA (produced by UV-B and UV-C); these are the result of a covalent attachment between adjacent pyrimidines in one strand. These dimers, like bulky lesions from chemicals, block transcription and DNA replication and are lethal if unrepaired. They can stimulate mutation and chromosome rearrangement as well. An example is the thymine dimer.

Temperature: An increase in temperature of 3˚C-10˚C will increase the mutation rate.

DNA repair mechanism

Because DNA damage occurs spontaneously and as a result of ubiquitous environmental agents, most organisms possess some capacity to repair their DNA.
We can divide "repair" mechanisms into 3 categories:

- Damage reversal (direct repair): the simplest; enzymatic action restores normal structure without breaking the backbone.
- Damage removal (excision repair and mismatch repair): involves cutting out and replacing a damaged or inappropriate base or section of nucleotides.
- Damage tolerance (SOS repair and recombination repair): not truly repair but a way of coping with damage so that life can go on.

Damage reversal

1. Photoreactivation

An example of a single-step reaction is the direct reversal that can be accomplished by the bacterial photolyase enzyme: a cyclobutane pyrimidine dimer is converted into two adjacent pyrimidines, and thereby the lesion is repaired. This is one of the simplest and perhaps oldest repair systems: it consists of a single enzyme which can split pyrimidine dimers (break the covalent bond) in the presence of light. The photolyase enzyme catalyzes this reaction; it is found in many bacteria, lower eukaryotes, insects, and plants. It seems to be absent in mammals (including humans). The gene is present in mammals but may code for a protein with an accessory function in another type of repair.

Damage removal

1. Base excision repair

Simple base modifications such as monofunctional alkylations can be removed by the base excision repair system, whereas more complex, bulky lesions are dealt with by the nucleotide excision repair pathways. Nucleotide excision repair involves recognition, incision, excision (degradation), synthesis (polymerization) and, finally, ligation. In base excision repair, the damaged or inappropriate base is removed from its sugar linkage and replaced. This is done by glycosylase enzymes, which cut the base-sugar bond. Example: uracil glycosylase, an enzyme which removes uracil from DNA. Uracil is not supposed to be in DNA; it can occur if RNA primers are not removed in DNA replication or (more likely) if cytosine is deaminated (this is potentially mutagenic).
The enzyme recognizes uracil and cuts the glycosyl linkage to deoxyribose. The sugar is then cleaved and a new base put in by DNA polymerase using the other strand as a template.

The steps of excision repair in prokaryotes are as follows:

1. The distortion in the DNA (caused by the thymine dimer) is recognized by a protein complex.
2. A pair of endonucleases makes nicks in the DNA strand on either side of the thymine dimer (generally the nicks are 12 nucleotides apart).
3. The 12-nucleotide piece of DNA between the nicks is removed, and DNA polymerase I fills in the gap left behind.
4. DNA ligase seals the final nick in the DNA.

Mismatch repair

This process occurs after DNA replication as a last "spellcheck" on its accuracy. It is another multi-step process, acting on mismatches that are often the consequence of a replicative error. In E. coli, these mismatched bases are repaired by a set of enzymes, the MutS, MutL and MutH proteins. The MutS protein recognizes the lesion and initiates the assembly of a repair complex containing all three proteins. The product of the dam gene (DNA adenine methyltransferase) methylates adenine bases in the parental strand. The MutH protein incises at a GATC sequence in the unmethylated strand. Next, a MutS-, MutL- and MutU-dependent excision step removes a section of DNA containing the GATC site and the mismatch. The resulting single-stranded gap is filled in by DNA polymerase III. There is currently much interest in what the homologous pathway is in mammalian cells, and whether there is interaction between it and nucleotide excision.

DNA damage tolerance

Not all DNA damage is or can be removed immediately; some of it may persist for a while. If a DNA replication fork encounters DNA damage such as a pyrimidine dimer, the damage will normally act as a block to further replication. However, in eukaryotes, DNA replication initiates at multiple sites and it may be able to resume downstream of a dimer, leaving a "gap" of single-stranded unreplicated DNA.
The gap is potentially just as dangerous as the dimer, if not more so, if the cell divides. So there is a way to repair the gap by recombination with either the other homolog or the sister chromatid; this yields two intact daughter molecules, one of which still contains the dimer.

Recombination (daughter-strand gap) repair

This is a repair mechanism which promotes recombination to fix the daughter-strand gap (not the dimer) and is a way to cope with the problems of a non-coding lesion persisting in DNA. Recombination repair has been well characterized in bacteria, but these processes are not well defined in mammalian cells. The events of recombinational repair are as follows. First, the damaged region undergoes recombination with the complementary strand on the other DNA molecule. One strand is exchanged between the two DNA molecules. This essentially transfers the gap to the DNA molecule that doesn't have the dimer. The gap can now be filled in by DNA polymerase I, and the dimer can be repaired by excision, since a template strand now exists.

SOS repair

If the UV exposure is sufficiently severe, the DNA damage may overwhelm the other repair mechanisms. In such situations, DNA replication would almost certainly halt, and the cell would die. As a last-ditch effort to save itself, a cell activates the SOS repair system. This is a complex system, in which a whole battery of repair mechanisms is used to try to save the cell. One of these mechanisms allows replication to proceed across damaged templates, even though the template can't be read accurately. As a result, random nucleotides get inserted into the newly synthesized DNA strand. RecA is a protein which inhibits the proofreading property of DNA polymerase I. This mechanism is therefore error-prone and leads to mutations, which could be deleterious. In this case, however, the alternative is death, so mutation is preferable.
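The trade-off described for SOS repair (a stalled fork versus a completed but possibly mutated copy) can be illustrated with a toy Python model. This is only a sketch under loud assumptions: the lesion symbol `X`, the function name `replicate`, and the uniform random choice of nucleotide opposite a lesion are all hypothetical simplifications, and strands are written position by position, ignoring their antiparallel orientation.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
LESION = "X"  # hypothetical symbol for a base that cannot be read


def replicate(template, sos=False, rng=None):
    """Copy `template` base by base and return the new strand.

    Without SOS, copying stops at the first lesion (a stalled fork).
    With SOS, a random nucleotide is inserted opposite each lesion,
    which is the error-prone bypass described in the text.
    """
    if rng is None:
        rng = random.Random()
    new_strand = []
    for base in template:
        if base == LESION:
            if not sos:
                break  # replication halts at the damage
            new_strand.append(rng.choice("ACGT"))
        else:
            new_strand.append(COMPLEMENT[base])
    return "".join(new_strand)


template = "ATGCXXGGA"  # two adjacent unreadable bases, as in a dimer
stalled = replicate(template)
bypassed = replicate(template, sos=True, rng=random.Random(0))
print("no SOS:", stalled, f"({len(stalled)} of {len(template)} bases copied)")
print("SOS   :", bypassed, f"({len(bypassed)} of {len(template)} bases copied)")
```

Without SOS only the four bases before the lesion are copied; with SOS the full strand is produced, but the two positions opposite the lesion carry whatever nucleotide was drawn at random.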
Unit III Elucidation of the genetic code Assigning amino acids to triplet codons was the main objective of molecular biology during the first half of the 1960s. Once it had been accepted that the genetic code is triplet and non-overlapping, and that genes display colinearity with the proteins that they specify, attention turned to elucidation of the code, with the objective of assigning amino acids to individual codons. Two types of experiment enabled the code of Escherichia coli to be completely worked out during the years 1961-66. Cell-free translation of artificial RNAs The first type of experiment was pioneered by Marshall Nirenberg and Heinrich Matthaei at the National Institutes of Health, Maryland, and made use of an extract, prepared from E. coli cells, that contained all the components needed to carry out translation except for mRNA. This cell-free protein synthesizing system therefore only made protein when RNA was added. If the sequence of the added RNA was known or could be predicted then the composition of the proteins that were made could be used to assign codons to amino acids. The system was first used with the simplest artificial RNAs, those that contain just a single nucleotide, such as the homopolymer poly(U), whose sequence is 5′-UUU … UUU-3′. When added into the cell-free system, poly(U) directed synthesis of polyphenylalanine, showing that the codon 5′-UUU-3′ codes for phenylalanine. Equivalent experiments enabled 5′-AAA-3′ to be assigned to lysine and 5′-CCC-3′ to proline. For unexplained reasons poly(G) gave no protein product. Reliable methods for sequencing RNA were not developed until the late 1960s. This meant that when the cell-free experiments were performed it was not possible to determine the exact sequences of artificial heteropolymers - RNAs containing more than one nucleotide. 
These could still, however, be used in the cell-free system because their codon compositions could be deduced statistically from the identity and relative amounts of the nucleotides used in the reaction mixture from which each heteropolymer was made. Random heteropolymers of different compositions enabled amino acids to be assigned to over half the codons in the genetic code. A few more codons were assigned when the technique for synthesis of artificial RNAs was refined so that ordered heteropolymers could be made. These RNAs are polymerized from dinucleotides such as 5′-GC-3′ and so have predictable sequences, the ordered heteropolymer poly(GC) having the sequence 5′-GCGC … GCGC-3′ and therefore containing just two codons, 5′-GCG-3′ and 5′-CGC-3′.

The triplet binding assay

The genetic code could not be completed with the standard cell-free protein synthesizing system because it was simply not possible to devise random and ordered heteropolymers that enabled every codon to be assigned unambiguously. A new approach was therefore needed. This was the triplet binding assay, devised by Nirenberg and Philip Leder in 1964 and based on their discovery that a ribosome will attach to an RNA triplet if the appropriate aminoacyl-tRNA (a tRNA linked to its amino acid; see Section 11.1.1) is also present. The code could therefore be completed (and all previously assigned codons checked) by synthesizing triplets of all possible sequences and testing each one individually with different aminoacyl-tRNAs. The final remaining ambiguity concerned the termination codons, which could not be directly identified by the triplet binding assay as it could not be proven that the inability of 5′-UAA-3′, 5′-UAG-3′ and 5′-UGA-3′ to bind any aminoacyl-tRNA was not simply the result of deficiencies in this assay system.
Confirmation that these are the termination codons was provided by Sydney Brenner and co-workers at Cambridge, UK, through genetic analysis of suppressor mutations, which result in one or other of the termination codons being recognized as a codon for an amino acid.

Gene Structure

There are two general types of gene in the human genome: non-coding RNA genes and protein-coding genes. Non-coding RNA genes represent 2-5 per cent of the total and encode functional RNA molecules. Many of these RNAs are involved in the control of gene expression, particularly protein synthesis. They have no overall conserved structure. Protein-coding genes represent the majority of the total and are expressed in two stages: transcription and translation. They show incredible diversity in size and organisation and have no typical structure. There are, however, several conserved features.

Figure: Simplified overview of gene structure and expression. A protein-coding gene is defined by the extent of the primary transcript. The promoter and any other regulatory elements are outside the gene. The gene itself is divided into three types of sequence. The coding region (light blue) is the information used to define the sequence of amino acids in the protein. The untranslated regions (dark blue) are found in the mRNA but are not used to define the protein sequence; they are often regulatory in nature. Finally, introns (white) are found in the primary transcript but are spliced out of the mRNA. They may interrupt the coding and untranslated regions.

The boundaries of a protein-encoding gene are defined as the points at which transcription begins and ends. The core of the gene is the coding region, which contains the nucleotide sequence that is eventually translated into the sequence of amino acids in the protein. The coding region begins with the initiation codon, which is normally ATG. It ends with one of three termination codons: TAA, TAG or TGA.
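The definition of the coding region just given (begins at ATG, ends at the first in-frame TAA, TAG or TGA) can be expressed as a minimal Python sketch. It assumes an already spliced, intron-free DNA sequence written 5′ to 3′; the function name `find_coding_region` and the example sequence are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna):
    """Return (start, end) of the first ATG ... stop stretch, or None.

    `start` is the index of the A of ATG; `end` is exclusive and
    includes the termination codon. Codons are read in frame with the ATG.
    """
    start = dna.find("ATG")
    while start != -1:
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop after this ATG; try the next ATG, if any.
        start = dna.find("ATG", start + 1)
    return None


# Made-up transcript: short untranslated flanks around a coding region.
transcript = "GGC" + "ATGCAGATAAACGCTGCATAA" + "GGT"
region = find_coding_region(transcript)
print(region)
print(transcript[region[0]:region[1]])
```

Everything outside the returned interval corresponds to the untranslated regions discussed next; real gene finders must also handle introns and both strands, which this sketch deliberately ignores.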
On either side of the coding region are DNA sequences that are transcribed but are not translated. These untranslated regions or non-coding regions often contain regulatory elements that control protein synthesis (see Figure). Both the coding region and the untranslated regions may be interrupted by introns.

Most human genes are divided into exons and introns. Exons are the parts of the transcript that will eventually be transported to the cytoplasm for translation. When discussing genes with alternative splicing, an exon is a portion of the transcript that could be translated, given the correct splicing conditions. The exons can be divided into three parts. Introns are intervening sequences between the exons that are never translated. Some sequences inside introns function as miRNA, and there are even some cases of small genes residing completely within the intron of a large gene. For some genes (such as the antibody genes), internal control regions are found inside introns. These situations, however, are treated as exceptions. The exons are the sections that are found in the mature transcript (messenger RNA), while the introns are removed from the primary transcript by a process called splicing.

Regions of the genome with protein-coding genes include several elements:

- Enhancer regions (normally up to a few thousand base pairs upstream of transcription)
- Promoter regions (normally less than a couple of hundred base pairs upstream of transcription), which include elements such as the TATA and CAAT boxes, GC elements, etc.

The smallest protein-coding gene in the human genome is only 500 nucleotides long and has no introns. It encodes a histone protein. The largest human gene encodes the protein dystrophin, which is missing or non-functional in the disease muscular dystrophy. This gene is 2.5 million nucleotides in length and it takes over 16 hours to produce a single transcript. However, more than 99 per cent of the gene is made up of the introns that separate its 79 exons.
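The dystrophin figures quoted above imply a rough transcription speed and an upper bound on the amount of exon sequence. The arithmetic is sketched below in Python, assuming (as a simplification not stated in the notes) that the polymerase moves at a constant rate along the whole gene; the variable names are illustrative.

```python
GENE_LENGTH_NT = 2_500_000   # dystrophin gene length quoted in the text
TRANSCRIPTION_TIME_H = 16    # quoted lower bound on the time per transcript
INTRON_PERCENT_MIN = 99      # quoted lower bound on the intron share

# Average elongation rate if the whole gene takes exactly 16 hours.
# Because the text says "over 16 hours", this is an upper bound.
seconds = TRANSCRIPTION_TIME_H * 3600
rate_nt_per_s = GENE_LENGTH_NT / seconds

# If more than 99 per cent is intron, less than 1 per cent is exon.
max_exon_nt = GENE_LENGTH_NT * (100 - INTRON_PERCENT_MIN) // 100

print(f"average rate: at most about {rate_nt_per_s:.1f} nucleotides per second")
print(f"exon sequence: fewer than {max_exon_nt} nucleotides")
```

So the quoted numbers correspond to roughly 43 nucleotides per second at most, and to fewer than 25,000 nucleotides of exon sequence in a 2.5-million-nucleotide gene.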
POST-TRANSLATIONAL MODIFICATIONS AND FOLDING OF NEWLY ASSEMBLED POLYPEPTIDES

The protein molecule ultimately needed by a cell often differs from the polypeptide chain that is synthesized. There are several ways in which the synthesized chain is modified:

1. In prokaryotes fMet is never retained at the amino terminal.
The formyl group is removed by the enzyme deformylase, which leaves methionine as the amino-terminal amino acid. In both prokaryotes and eukaryotes the fMet or Met, and sometimes a few more amino acids, are also removed. This is catalyzed by an aminopeptidase.
2. Newly created amino-terminal amino acids are sometimes acetylated.
3. Some amino acid side chains may also be modified. For example, in collagen a large fraction of the proline and lysine residues are hydroxylated. Phosphorylation of serine, tyrosine, and threonine occurs in many organisms. Various sugars may be attached to the free hydroxyl group of serine and threonine to form glycoproteins.
4. A variety of prosthetic groups such as heme and biotin are covalently attached to some enzymes.
5. Two distant sulfhydryl groups in two cysteines may be oxidized to form a disulphide bond, e.g. in insulin.
6. Polypeptide chains may be cleaved at different sites to form an active enzyme. Chymotrypsinogen is converted into the digestive enzyme chymotrypsin by the removal of four amino acids from two different sites.
7. After these modifications are made to a polypeptide chain, further folding of that chain occurs to make it an active protein.
8. The newly synthesized polypeptide chain is in its primary structure.
9. It then forms bonds (chiefly hydrogen bonds) between nearby amino acids to form α-helices and β-pleated sheets. This is the secondary structure of the protein.
10. The protein folds back further into a complex structure called the tertiary structure.
11. Proteins in their tertiary structure, along with some functional groups or prosthetic groups, join together to form the quaternary structure. It is the active state of the protein, e.g. hemoglobin.

Transduction

Transduction is the process by which DNA is transferred from one bacterium to another by a virus. It also refers to the process whereby foreign DNA is introduced into another cell via a viral vector.
This is a common tool used by molecular biologists to stably introduce a foreign gene into a host cell's genome. When bacteriophages (viruses that infect bacteria) infect a bacterial cell, their normal mode of reproduction is to harness the replicational, transcriptional, and translational machinery of the host bacterial cell to make numerous virions, or complete viral particles, including the viral DNA or RNA and the protein coat.

Transduction as a method of transferring genetic material

However, the packaging of bacteriophage DNA has low fidelity, and small pieces of bacterial DNA, together with the bacteriophage genome, may become packaged into the bacteriophage particle. At the same time, some phage genes are left behind in the bacterial chromosome. There are generally two types of recombination events that can lead to this incorporation of bacterial DNA into the viral DNA, leading to two modes of recombination.

Generalized transduction

Generalized transduction may occur in two main ways, recombination and headful packaging. If bacteriophages undertake the lytic cycle of infection upon entering a bacterium, the virus will take control of the cell's machinery for use in replicating its own viral DNA. If by chance bacterial chromosomal DNA is inserted into the viral capsid used to contain the viral DNA while this lytic pathway is proceeding, the mistake will lead to generalized transduction. If the virus replicates using "headful packaging", it attempts to fill the nucleocapsid with genetic material. If the viral genome leaves spare capacity, viral packaging mechanisms may incorporate bacterial genetic material into the new virion. The new virus capsule, now loaded in part with bacterial DNA, goes on to infect another bacterial cell. This bacterial material may become recombined into another bacterium upon infection. When the new DNA is inserted into this recipient cell it can meet one of three fates:
1.
The DNA will be absorbed by the cell and be recycled for spare parts.
2. If the DNA was originally a plasmid, it will re-circularize inside the new cell and become a plasmid again.
3. If the new DNA matches a homologous region of the recipient cell’s chromosome, it will exchange DNA material in a manner similar to conjugation.
This type of recombination is random, and the amount recombined depends on the size of the virus being used.
It is worth asking whether generalized transduction can occur by lysogenic phages. Two possible scenarios might be imagined to cause generalized transduction, though literature references have not been found to confirm or dispute them:
1. A lysogenic phage whose site of integration is randomly chosen, which occasionally brings along adjacent DNA because of an erroneous excision process.
2. A lysogenic phage that goes into its lytic phase and randomly incorporates cell DNA.

Specialized transduction
The second type of recombination event is called specialized transduction. If a virus removes itself from the chromosome incorrectly, some of the bacterial DNA can be packaged into the virion. Mistakes in this process of viral DNA going from the lysogenic to the lytic cycle lead to specialized transduction. There are three possible results from specialized transduction:
1. DNA can be absorbed and recycled for spare parts.
2. The bacterial DNA can match up with homologous DNA in the recipient cell and exchange with it. The recipient cell now has DNA from both itself and the other bacterial cell.
3. DNA can insert itself into the genome of the recipient cell as if still acting like a virus, resulting in a double copy of the bacterial genes.
An example of specialized transduction is λ phage in Escherichia coli.

Discovery
Transduction was discovered by Norton Zinder and Joshua Lederberg at the University of Wisconsin-Madison in 1951.

Unit IV
Methylation of DNA
The importance of methylation in DNA-protein interactions is well known. 
We showed that a particular DNA sequence could be protected from restriction endonucleases if it were methylated. A small percentage of cytosine residues are methylated in many eukaryotic organisms, mainly in CpG sequences; 80% of the cytosines in CpG sequences in human DNA are methylated. The degree of methylation of DNA is related to the silencing of a gene. Genes that are dormant in one cell type but active in another, or genes that are dormant at one stage of development but active in another, are usually less methylated when active and more fully methylated when inactive. For example, adenovirus, a cancer-causing virus, has been observed in many eukaryotic cell lines. In most lines in which the adenovirus DNA has integrated into the host chromosome, late viral genes are turned off. These genes are highly methylated at their CCGG or GCGC sites. In addition, chemicals that prevent methylation frequently activate previously dormant genes. For example, 5-azacytidine inhibits methylation; X chromosomal genes, which are normally deactivated, can be reactivated by treatment with 5-azacytidine. There are numerous other examples of the activation of genes after treatment with this chemical. The activated genes lack methylated cytosines that were previously methylated. Finally, the possibility exists that DNA methylation can affect the pattern of chromatin structure. Recent work has also indicated that the methylation itself may not prevent transcription, but rather may be a signal for transcriptional inactivity. In the thale cress plant, Arabidopsis thaliana, a protein named Mom (for Morpheus' molecule) has been discovered that, when mutated, results in genes that have heavy methylation levels but are actively transcribed. Thus, the methylation level can be separated from the transcriptional activity of genes, although the two usually occur together. Arabidopsis is proving to be a good model in the study of the role of methylation in transcriptional activation. 
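The CpG sequences and the CCGG and GCGC sites mentioned above can be located in a sequence with a few lines of Python. This is only a minimal sketch: the function names and the example sequence are hypothetical, positions are 0-based, and the code reads a single strand. It finds candidate sites only and says nothing about whether a given cytosine is actually methylated.

```python
import re

def cpg_positions(seq):
    """Return 0-based start positions of every CpG dinucleotide on this strand."""
    return [m.start() for m in re.finditer("CG", seq.upper())]

def site_positions(seq, site):
    """Return 0-based start positions of a recognition site, allowing overlaps."""
    # A zero-width lookahead lets overlapping occurrences be reported too.
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq.upper())]

# Hypothetical example sequence, not taken from any real genome.
example = "ATCCGGTAGCGCTTACGAT"

cpgs = cpg_positions(example)
ccgg = site_positions(example, "CCGG")
gcgc = site_positions(example, "GCGC")
print("CpG starts:", cpgs)
print("CCGG starts:", ccgg)
print("GCGC starts:", gcgc)
```

Every CCGG or GCGC site contains a CpG, which is why methylation at these sites can be probed with methylation-sensitive restriction enzymes.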
Further interest has been generated in the role of methylation in controlling gene expression by the discovery of Z DNA, and the fact that Z DNA can be stabilized by methylation. This observation has led to a model of transcriptional regulation based on alternative DNA structures. Sequences that could exist as Z DNA exist as B DNA when being transcribed. If the gene is to be silenced (turned off), the CpG sequences are converted to stable Z DNA by methylation, which then blocks transcription. This possibility has gained some interest because of the recent discovery of an enzyme, double-stranded RNA adenosine deaminase (ADAR1), that binds to Z DNA sequences.

DNA Methylation
The primary structure of DNA can be modified in various ways. These modifications are important in the expression of the genetic material. One such modification is DNA methylation, in which methyl groups (–CH3) are added (by specific enzymes) to certain positions on the nucleotide bases. In bacteria, adenine and cytosine are commonly methylated, whereas, in eukaryotes, cytosine is the most commonly methylated base. Bacterial DNA is frequently methylated to distinguish it from foreign, unmethylated DNA that may be introduced by viruses; bacteria use proteins called restriction enzymes to cut up any unmethylated viral DNA. In eukaryotic DNA, cytosine bases are often methylated to form 5-methylcytosine (FIGURE 10.18). The extent of cytosine methylation varies; in most animal cells, about 5% of the cytosine bases are methylated, but more than 50% of the cytosine bases in some plants are methylated. On the other hand, no methylation of cytosine has been detected in yeast cells, and only very low levels of methylation (about 1 methylated cytosine base per 12,500 nucleotides) are found in Drosophila. Why eukaryotic organisms differ so widely in their degree of methylation is not clear. Methylation is most frequent on cytosine nucleotides that sit next to guanine nucleotides on the same strand:
. . . GC . . .
. . . CG . . .
In eukaryotic cells, methylation is often related to gene expression. Sequences that are methylated typically show low levels of transcription, while sequences lacking methylation are actively being transcribed. Methylation can also affect the three-dimensional structure of the DNA molecule.

OPERON CONCEPT
The Lactose Operon
The lac operon was the first system where genetic control of protein synthesis was studied. François Jacob and Jacques Monod observed that the expression of certain genes in mutants of Escherichia coli was induced by the presence of lactose in the medium. Later, they developed the concept of the lactose operon and won the Nobel Prize in 1965 together with André Lwoff. An operon is a segment of DNA that has a cluster of contiguous genes which are transcribed as a single polycistronic mRNA. The lactose operon has three contiguous structural genes:
lac Z. This gene codes for the enzyme Beta-galactosidase, which breaks the Beta-galactoside linkage in the lactose molecule to generate glucose and galactose.
lac Y. This gene codes for the enzyme Beta-galactoside permease, which transports lactose into the cell.
lac A. This gene codes for the enzyme Beta-galactoside transacetylase, whose actual function is still unknown.
The promoter and operator of the lac operon are located immediately upstream from these genes. The promoter is the specific region in the DNA where the RNA polymerase binds to initiate transcription. The operator is the region of DNA where the repressor binds to block transcription. The lac I operon is located immediately upstream from the lac operon. The lac I operon consists of the lac I promoter and the lac I regulator gene. The lac I regulator gene codes for the repressor protein. Only lacZ and lacY appear to be necessary for lactose catabolism.
Specific control of the lac genes depends on the availability of the substrate lactose to the bacterium. 
The proteins are not produced by the bacterium when lactose is unavailable as a carbon source. The lac genes are organized into an operon; that is, they are oriented in the same direction immediately adjacent on the chromosome and are co-transcribed into a single polycistronic mRNA molecule. Transcription of all genes starts with the binding of the enzyme RNA polymerase (RNAP) to a specific DNA binding site immediately upstream of the genes, the promoter. From this position RNAP proceeds to transcribe all three genes (lacZYA) into mRNA. The first control mechanism is the regulatory response to lactose, which uses an intracellular regulatory protein called the lactose repressor to hinder production of β-galactosidase in the absence of lactose. The lacI gene coding for the repressor lies near the lac operon and is always expressed (constitutive). If lactose is missing from the growth medium, the repressor binds very tightly to a short DNA sequence just downstream of the promoter near the beginning of lacZ, called the lac operator. The repressor binding to the operator interferes with binding of RNAP to the promoter, and therefore mRNA encoding LacZ and LacY is only made at very low levels. When cells are grown in the presence of lactose, however, a lactose metabolite called allolactose, an isomer of lactose in which glucose and galactose are joined by a different linkage, binds to the repressor, causing a change in its shape. Thus altered, the repressor is unable to bind to the operator, allowing RNAP to transcribe the lac genes and thereby leading to high levels of the encoded proteins. The second control mechanism is a response to glucose, which uses the catabolite activator protein (CAP) to greatly increase production of β-galactosidase in the absence of glucose. Cyclic adenosine monophosphate (cAMP) is a signal molecule whose prevalence is inversely proportional to that of glucose. 
It binds to the CAP, which in turn allows the CAP to bind to the CAP binding site, a DNA sequence just upstream of the promoter, where it assists the RNAP in binding to the DNA. In the absence of glucose, the prevalence of cAMP and the binding of the CAP to the DNA significantly increase the production of β-galactosidase, enabling the cell to digest lactose as a source of glucose.
In summary:
When lactose is absent, there is very little Lac enzyme production (the operator has LacI bound to it).
When lactose is present but a preferred carbon source (like glucose) is also present, a small amount of enzyme is produced (LacI is not bound to the operator).
When lactose is the favoured carbon source (for example, in the absence of glucose), cAMP-CAP binds upstream of the promoter at a specific site. This bends the DNA around the protein, which creates tension and allows the RNA polymerase to bind to the promoter, and Lac enzyme production is maximised. Without bound CAP, RNA polymerase binds the lac promoter only weakly, so transcription stays low.

Regulatory Sequences in Protein-Coding Genes
■ Expression of eukaryotic protein-coding genes generally is regulated through multiple protein-binding control regions that are located close to or distant from the start site.
■ Promoters direct binding of RNA polymerase II to DNA, determine the site of transcription initiation, and influence transcription rate.
■ Three principal types of promoter sequences have been identified in eukaryotic DNA. The TATA box, the most common, is prevalent in rapidly transcribed genes. Initiator promoters are found in some genes, and CpG islands are characteristic of genes transcribed at a low rate.
■ Promoter-proximal elements occur within ≈200 base pairs upstream of a start site. Several such elements, containing ≈10–20 base pairs, may help regulate a particular gene. 
■ Enhancers, which contain multiple short control elements, may be located from 200 base pairs to tens of kilobases upstream or downstream from a promoter, within an intron, or downstream from the final exon of a gene.
■ Promoter-proximal elements and enhancers often are cell-type-specific, functioning only in specific differentiated cell types.

REGULATORY SEQUENCES
In our consideration of gene regulation, it will be necessary to distinguish between the DNA sequences that are transcribed and the DNA sequences that regulate the expression of other sequences. We will refer to any DNA sequence that is transcribed into an RNA molecule as a gene. According to this definition, genes include DNA sequences that encode proteins, as well as sequences that encode rRNA, tRNA, snRNA, and other types of RNA. Structural genes encode proteins that are used in metabolism or biosynthesis or that play a structural role in the cell. Regulatory genes are genes whose products, either RNA or proteins, interact with other sequences and affect their transcription or translation. In many cases, the products of regulatory genes are DNA-binding proteins. We will also encounter DNA sequences that are not transcribed at all, but still play a role in regulating other nucleotide sequences. These regulatory elements affect the expression of sequences to which they are physically linked. Much of gene regulation takes place through the action of proteins produced by regulatory genes that recognize and bind to regulatory elements.

DNA-Binding Proteins: Regulatory genes produce regulatory proteins. Much of gene regulation is accomplished by proteins that bind to DNA sequences and influence their expression. These regulatory proteins generally have discrete functional parts called domains, typically consisting of 60 to 90 amino acids, that are responsible for binding to DNA. Within a domain, only a few amino acids actually make contact with the DNA. 
These amino acids (most commonly asparagine, glutamine, glycine, lysine, and arginine) often form hydrogen bonds with the bases or interact with the sugar–phosphate backbone of the DNA. Many regulatory proteins have additional domains that can bind other molecules such as other regulatory proteins. DNA-binding proteins can be grouped into several distinct types on the basis of a characteristic structure, called a motif, found within the binding domain. Motifs are simple structures, such as alpha helices, that can fit into the major groove of the DNA.

TRANSPOSABLE ELEMENTS IN PROKARYOTES AND EUKARYOTES
A DNA sequence that moves from one place to another within a genome is called a transposon or transposable element. It is also known as a jumping gene or selfish gene. Transposition is either conservative or replicative. In conservative transposition, the transposon moves without copying itself; it is liberated from the donor site by double-strand breaks in the DNA. In replicative transposition, a copy of the transposon is inserted at the new site while the original stays in place. This mechanism involves only single-strand nicks at the donor site. Barbara McClintock first discovered transposable elements in maize in the 1940s; they were identified in prokaryotes in the late 1960s. Each transposon is bounded by inverted repeats at either end. Transposons are present in both prokaryotes and eukaryotes, e.g. in E. coli, Drosophila, and maize. When a plasmid carrying a transposon is introduced into a bacterium, the transposon may be incorporated into plasmid DNA or genomic DNA. During this movement it can pick up certain genes found close to it and transfer them to a new site. In this way a new combination of genes develops in the host cell. Transposons are of three types:
I. IS elements
II. Complex transposons
III. Composite transposons
IS elements: The IS element is the simplest transposon. It contains about 700-1500 bases coding for the enzyme transposase. It is bounded by inverted terminal repeats of about 10-30 base pairs. 
The terminal inverted repeats are necessary for transposition because they are the signal recognized by the transposition enzyme. IS elements are numbered IS1, IS2, and so on.
Complex transposon: A complex transposon consists of an antibiotic resistance gene and genes for the transposase and resolvase enzymes. Complex transposons are numbered Tn1, Tn2, and so on. For example, Tn7 carries a streptomycin resistance gene.
Composite transposon: A composite transposon has a central region flanked by two IS elements, one before and one after the central region. It also has terminal inverted repeats. Composite transposons are also given Tn numbers, e.g. Tn5, Tn9, and Tn10. The IS elements at the two ends can be identical or different.
Significance of transposons:
To transfer a foreign gene from one plasmid to another or from one organism to another.
To restructure a genome.
To construct rDNAs for gene cloning.

Unit V
FACTORS AFFECTING GENE FREQUENCY
The factors which affect gene frequency may lead to evolutionary change. There are five circumstances in which the Hardy-Weinberg law may fail to apply, and each of them affects gene frequency: mutation, gene flow, genetic drift, nonrandom mating, and natural selection.

MUTATION
The frequency of gene B and its allele b will not remain in Hardy-Weinberg equilibrium if the rate of mutation of B -> b (or vice versa) changes. By itself, this type of mutation probably plays only a minor role in evolution; the rates are simply too low. Even so, evolution absolutely depends on mutations because this is the only way that new alleles are created. After being shuffled in various combinations with the rest of the gene pool, these provide the raw material on which natural selection can act.

GENE FLOW
Many species are made up of local populations whose members tend to breed within the group. Each local population can develop a gene pool distinct from that of other local populations. However, members of one population may breed with occasional immigrants from an adjacent population of the same species. 
This can introduce new genes or alter existing gene frequencies in the residents. In many plants and some animals, gene flow can occur not only between subpopulations of the same species but also between different (but still related) species. This is called hybridization. If the hybrids later breed with one of the parental types, new genes are passed into the gene pool of that parent population. This process is called introgression. In either case, gene flow increases the variability of the gene pool. GENETIC DRIFT As we have seen, interbreeding often is limited to the members of local populations. If the population is small, Hardy-Weinberg may be violated. Chance alone may eliminate certain members out of proportion to their numbers in the population. In such cases, the frequency of an allele may begin to drift toward higher or lower values. Ultimately, the allele may represent 100% of the gene pool or, just as likely, disappear from it. Drift produces evolutionary change, but there is no guarantee that the new population will be more fit than the original one. Evolution by drift is aimless, not adaptive. NONRANDOM MATING One of the cornerstones of the Hardy-Weinberg equilibrium is that mating in the population must be random. If individuals (usually females) are choosy in their selection of mates the gene frequencies may become altered. Darwin called this sexual selection. Nonrandom mating seems to be quite common. Breeding territories, courtship displays, "pecking orders" can all lead to it. In each case certain individuals do not get to make their proportionate contribution to the next generation. Assortative mating Humans seldom mate at random preferring phenotypes like themselves (e.g., size, age, ethnicity). This is called assortative mating. Marriage between close relatives is a special case of assortative mating. The closer the kinship, the more alleles shared and the greater the degree of inbreeding. Inbreeding can alter the gene pool. 
Potentially harmful recessive alleles - invisible in the parents - become exposed to the forces of natural selection in the children.

NATURAL SELECTION
If individuals having certain genes are better able to produce mature offspring than those without them, the frequency of those genes will increase. This is simply expressing Darwin's natural selection in terms of alterations in the gene pool. Natural selection results from differential mortality and/or differential fecundity.
Mortality Selection
Certain genotypes are less successful than others in surviving through to the end of their reproductive period. The evolutionary impact of mortality selection can be felt anytime from the formation of a new zygote to the end (if there is one) of the organism's period of fertility. Mortality selection is simply another way of describing Darwin's criteria of fitness.
Fecundity Selection
Certain phenotypes may make a disproportionate contribution to the gene pool of the next generation by producing a disproportionate number of young. Such fecundity selection is another way of describing another criterion of fitness described by Darwin: family size. In each of these examples of natural selection, certain phenotypes are better able than others to contribute their genes to the next generation. Thus, by Darwin's standards, they are more fit. The outcome is a gradual change in the gene frequencies in that population.

THE GENETICS OF PATTERN FORMATION IN DROSOPHILA
One of the best-studied systems for the genetic control of pattern formation is the early embryonic development of Drosophila melanogaster. Geneticists have isolated a large number of mutations in fruit flies that influence all aspects of their development, and these mutations have been subjected to molecular analysis, providing much information about how genes control early development in Drosophila. 
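As a brief numerical aside on the gene-frequency factors covered in the preceding section (before the account of fly development continues), the Hardy-Weinberg expectation and the effect of genetic drift can be sketched in Python. This is an illustration under stated assumptions, not a method given in these notes: the function names are hypothetical, and drift is modelled by random resampling of 2N allele copies each generation (a Wright-Fisher style model, chosen here for simplicity) with no mutation, selection, or gene flow.

```python
import random

def hardy_weinberg(p):
    """Expected genotype frequencies (BB, Bb, bb) when allele B has frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q

def drift(p, pop_size, generations, rng):
    """Follow the frequency of allele B under drift alone.

    Assumption: each generation draws 2 * pop_size allele copies at random
    from the previous generation's gene pool.
    """
    n_copies = 2 * pop_size
    for _ in range(generations):
        count_b_upper = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count_b_upper / n_copies
        if p in (0.0, 1.0):  # allele lost or fixed: no further change is possible
            break
    return p

print(hardy_weinberg(0.5))
rng = random.Random(1)  # seeded only so that a run can be repeated
print([drift(0.5, 10, 200, rng) for _ in range(5)])      # small population
print([drift(0.5, 1000, 200, rng) for _ in range(5)])    # larger population
```

Small populations typically end at 0.0 or 1.0 within a few hundred generations, while larger ones tend to stay at intermediate frequencies over the same period, which matches the statement that drift matters most when the population is small.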
The development of the fruit fly: An adult fruit fly possesses three basic body parts: head, thorax, and abdomen (FIGURE 21.3). The thorax consists of three segments: the first thoracic segment carries a pair of legs; the second thoracic segment carries a pair of legs and a pair of wings; and the third thoracic segment carries a pair of legs and the halteres (rudiments of the second pair of wings found in most other insects). The abdomen contains nine segments. When a Drosophila egg has been fertilized, its diploid nucleus immediately divides nine times without division of the cytoplasm, creating a single, multinucleate cell (FIGURE 21.4b). These nuclei are scattered throughout the cytoplasm but later migrate toward the periphery of the embryo and divide several more times (FIGURE 21.4c). Next, the cell membrane grows inward and around each nucleus, creating a layer of approximately 6000 cells at the outer surface of the embryo (FIGURE 21.4d). Four nuclei at one end of the embryo develop into pole cells, which eventually give rise to germ cells. The early embryo then undergoes further development in three distinct stages: (1) the anterior–posterior axis and the dorsal–ventral axis of the embryo are established (FIGURE 21.5a); (2) the number and orientation of the body segments are determined (FIGURE 21.5b); and (3) the identity of each individual segment is established (FIGURE 21.5c). Different sets of genes control each of these three stages (Table 21.1). Egg-polarity genes: The egg-polarity genes play a crucial role in establishing the two main axes of development in fruit flies. You can think of these axes as the longitude and latitude of development: any location in the Drosophila embryo can be defined in relation to these two axes. There are two sets of egg-polarity genes: one set determines the anterior– posterior axis and the other determines the dorsal–ventral axis. 
These genes work by setting up concentration gradients of morphogens within the developing embryo. A morphogen is a protein whose concentration gradient affects the developmental fate of the surrounding region. The egg-polarity genes are transcribed into mRNAs during egg formation in the maternal parent, and these mRNAs become incorporated into the cytoplasm of the egg. After fertilization, the mRNAs are translated into proteins that play an important role in determining the anterior–posterior and dorsal–ventral axes of the embryo. Because the mRNAs of the polarity genes are produced by the female parent and influence the phenotype of their offspring, the traits encoded by them are examples of genetic maternal effects. Egg-polarity genes function by producing proteins that become asymmetrically distributed in the cytoplasm, giving the egg polarity, or direction. This asymmetrical distribution may take place in a couple of ways. The mRNA may be localized to particular regions of the egg cell, leading to an abundance of the protein in those regions when the mRNA is translated. Alternatively, the mRNA may be randomly distributed, but the protein that it encodes may become asymmetrically distributed, either by a transport system that delivers it to particular regions of the cell or by its removal from particular regions by selective degradation. Determination of the dorsal–ventral axis: The dorsal ventral axis defines the back (dorsum) and belly (ventrum) of a fly (see Figure 21.5). At least 12 different genes determine this axis, one of the most important being a gene called dorsal. The dorsal gene is transcribed and translated in the maternal ovary, and the resulting mRNA and protein are transferred to the egg during oogenesis. In a newly laid egg, mRNA and protein encoded by the dorsal gene are uniformly distributed throughout the cytoplasm but, after the nuclei migrate to the periphery of the embryo (see Figure 21.4c), Dorsal protein becomes redistributed. 
Along one side of the embryo, Dorsal protein remains in the cytoplasm; this side will become the dorsal surface. Along the other side, Dorsal protein is taken up into the nuclei; this side will become the ventral surface. At this point, there is a smooth gradient of increasing nuclear Dorsal concentration from the dorsal to the ventral side (FIGURE 21.6). The nuclear uptake of Dorsal protein is thought to be governed by a protein called Cactus, which binds to Dorsal protein and traps it in the cytoplasm. The presence of yet another protein, called Toll, can alter Dorsal, allowing it to dissociate from Cactus and move into the nucleus. Together, Cactus and Toll regulate the nuclear distribution of Dorsal protein, which in turn determines the dorsal–ventral axis of the embryo. Inside the nucleus, Dorsal protein acts as a transcription factor, binding to regulatory sites on the DNA and activating or repressing the expression of other genes (Table 21.2). High nuclear concentration of Dorsal protein (as on the ventral side of the embryo) activates a gene called twist, which causes mesoderm to develop. Low concentrations of Dorsal protein (as in cells on the dorsal side of the embryo) activate a gene called decapentaplegic, which specifies dorsal structures. In this way, the ventral and dorsal sides of the embryo are determined.

Determination of the anterior–posterior axis: Establishing the anterior–posterior axis of the embryo is a crucial step in early development. We will consider several genes in this pathway (Table 21.3). One important gene is bicoid, which is first transcribed in the ovary of an adult female during oogenesis. Bicoid mRNA becomes incorporated into the cytoplasm of the egg and, as it passes into the egg, bicoid mRNA becomes anchored to the anterior end of the egg by part of its 3′ end. This anchoring causes bicoid mRNA to become concentrated at the anterior end (FIGURE 21.7a). 
(A number of other genes that are active in the ovary are required for proper localization of bicoid mRNA in the egg.) When the egg has been laid, bicoid mRNA is translated into Bicoid protein. Because most of the mRNA is at the anterior end of the egg, Bicoid protein is synthesized there and forms a concentration gradient along the anterior–posterior axis of the embryo, with a high concentration at the anterior end and a low concentration at the posterior end. This gradient is maintained by the continuous synthesis of Bicoid protein and its short half-life. The high concentration of Bicoid protein at the anterior end induces the development of anterior structures such as the head of the fruit fly. Bicoid, like Dorsal, is a morphogen. It stimulates the development of anterior structures by binding to regulatory sequences in the DNA and influencing the expression of other genes. One of the most important of the genes stimulated by Bicoid protein is hunchback, which is required for the development of the head and thoracic structures of the fruit fly. The development of the anterior–posterior axis is also greatly influenced by a gene called nanos, an egg-polarity gene that acts at the posterior end of the axis. The nanos gene is transcribed in the adult female, and the resulting mRNA becomes localized at the posterior end of the egg (FIGURE 21.7b). After fertilization, nanos mRNA is translated into Nanos protein, which diffuses slowly toward the anterior end. The Nanos protein gradient is opposite that of Bicoid protein: Nanos is most concentrated at the posterior end of the embryo and is least concentrated at the anterior end. Nanos protein inhibits the formation of anterior structures by repressing the translation of hunchback mRNA. The synthesis of the Hunchback protein is therefore stimulated at the anterior end of the embryo by Bicoid protein and is repressed at the posterior end by Nanos protein. 
This combined stimulation and repression results in a Hunchback protein concentration gradient along the anterior–posterior axis that, in turn, affects the expression of other genes and helps determine the anterior and posterior structures. Segmentation genes: Like all insects, the fruit fly has a segmented body plan. When the basic dorsal–ventral and anterior– posterior axes of the fruit-fly embryo have been established, segmentation genes control the differentiation of the embryo into individual segments. These genes affect the number and organization of the segments, and mutations in them usually disrupt whole sets of segments. The approximately 25 segmentation genes in Drosophila are transcribed after fertilization; so they don’t exhibit a genetic maternal effect, and their expression is regulated by the Bicoid and Nanos protein gradients. The segmentation genes fall into three groups as shown in FIGURE 21.8. Gap genes define large sections of the embryo; mutations in these genes eliminate whole groups of adjacent segments. Mutations in the Krüppel gene, for example, cause the absence of several adjacent segments. Pair-rule genes define regional sections of the embryo and affect alternate segments. Mutations in the even-skipped gene cause the deletion of even-numbered segments, whereas mutations in the fushi tarazu gene cause the absence of odd-numbered segments. Segment-polarity genes affect the organization of segments. Mutations in these genes cause part of each segment to be deleted and replaced by a mirror image of part or all of an adjacent segment. For example, mutations in the gooseberry gene cause the posterior half of each segment to be replaced by the anterior half of an adjacent segment. The gap genes, pair-rule genes, and segment-polarity genes act sequentially, affecting progressively smaller regions of the embryo. First, the egg-polarity genes activate or repress the gap genes, which divide the embryo into broad regions. 
The gap genes, in turn, regulate the pair-rule genes, which affect the development of pairs of segments. Finally, the pair rule genes influence the segment-polarity genes, which guide the development of individual segments. Homeotic genes: After the segmentation genes have established the number and orientation of the segments, homeotic genes become active and determine the identity of individual segments. Eyes normally arise only on the head segment, whereas legs develop only on the thoracic segments. The products of homeotic genes activate other genes that encode these segment-specific characteristics. Mutations in the homeotic genes cause body parts to appear in the wrong segments. Homeotic mutations were first identified in 1894, when William Bateson noticed that floral parts of plants occasionally appeared in the wrong place: he found, for example, flowers in which stamens grew in the normal place of petals. In the late 1940s, Edward Lewis began to study homeotic mutations in Drosophila, which caused bizarre rearrangements of body parts. Mutations in the Antennapedia gene, for example, cause legs to develop on the head of a fly in place of the antenna (FIGURE 21.9). Homeotic genes create addresses for the cells of particular segments, telling the cells where they are within the regions defined by the segmentation genes. When a homeotic gene is mutated, the address is wrong and cells in the segment develop as though they were somewhere else in the embryo. Homeotic genes are expressed after fertilization and are activated by specific concentrations of the proteins produced by the gap, pair-rule, and segment-polarity genes. The homeotic gene Ultrabithorax (Ubx), for example, is activated when the concentration of Hunchback protein (a product of a gap gene) is within certain values. These concentrations exist only in the middle region of the embryo; so Ubx is expressed only in these segments. 
The homeotic genes encode regulatory proteins that bind to DNA; each gene contains a subset of nucleotides, called a homeobox, that are similar in all homeotic genes. The homeobox consists of 180 nucleotides and encodes 60 amino acids that serve as a DNA-binding domain; this domain is related to the helix-turn-helix motif (See Figure 16.2a). Homeoboxes are also present in segmentation genes and other genes that play a role in spatial development. There are two major clusters of homeotic genes in Drosophila. One cluster, the Antennapedia complex, affects the development of the adult fly’s head and anterior thoracic segments. The other cluster consists of the bithorax complex and includes genes that influence the adult fly’s posterior thoracic and abdominal segments. Together, the bithorax and Antennapedia genes are termed the homeotic complex (HOM-C). In Drosophila, the bithorax complex contains three genes, and the Antennapedia complex has five; they are all located on the same chromosome ( FIGURE 21.10). In addition to these eight genes, HOM-C contains many sequences that regulate the homeotic genes. Remarkably, the order of the genes in the HOM-C is the same as the order in which the genes are expressed along the anterior–posterior axis of the body. The genes that are expressed in the more anterior segments are found at the one end of the complex, whereas those expressed in the more posterior end of the embryo are found at the other end of complex (See Figure 21.10). The reason for this correlation is unknown. Homeobox Genes in Other Organisms: After homeotic genes in Drosophila had been isolated and cloned, molecular geneticists set out to determine if similar genes exist in other animals; probes complementary to the homeobox of Drosophila genes were used to search for homologous genes that might play a role in the development of other animals. 
The search was hugely successful: homeobox-containing (Hox) genes have been found in all animals studied so far, including nematodes, beetles, sea urchins, frogs, birds, and mammals. They have even been discovered in fungi and plants, indicating that Hox genes arose early in the evolution of eukaryotes. In vertebrates, there are four clusters of Hox genes, each of which contains from 9 to 11 genes. Interestingly, the Hox genes of other organisms exhibit the same relation between order on the chromosome and order of their expression along the anterior–posterior axis of the embryo as that of Drosophila ( FIGURE 21.11). Mammalian Hox genes, like those in Drosophila, encode transcription factors that help determine the identity of body regions along an anterior– posterior axis. SUMMARY: Development is a complex process consisting of numerous events that must take place in a highly specific sequence. The results of studies in fruit flies and other organisms reveal that this process is regulated by a large number of genes. In Drosophila, the dorsal–ventral axis and the anterior–posterior axis are established by maternal genes; these genes encode mRNAs and proteins that are localized to specific regions within the egg and cause specific genes to be expressed in different regions of the embryo. The proteins of these genes then stimulate other genes, which in turn stimulate yet other genes in a cascade of control. As might be expected, most of the gene products in the cascade are regulatory proteins, which bind to DNA and activate other genes. In the course of development, successively smaller regions of the embryo are determined ( FIGURE 21.12). In Drosophila, first, the major axes and regions of the embryo are established by egg polarity genes. 
Next, patterns within each region are determined by the action of segmentation genes: the gap genes define large sections; the pair-rule genes define regional sections of the embryo and affect alternate segments; and the segment-polarity genes affect individual segments. Finally, the homeotic genes provide each segment with a unique identity. Initial gradients in proteins and mRNA stimulate localized gene expression, which produces more finely located gradients that stimulate even more localized gene expression. Developmental regulation thus becomes more and more narrowly defined. The processes by which limbs, organs, and tissues form (called morphogenesis) are less well understood, although this pattern of generalized-to-localized gene expression is encountered frequently.

GENETIC COUNSELING

Every day researchers are learning more about the genetics of common diseases and how those diseases run in families. If you have an inherited disease in your family, a genetic counseling session can help you understand your personal risk or the risk for other family members. It can also help you learn what testing, surveillance, prevention strategies, or research trials may be right for your situation. In most cases, a genetic counselor will lead the session, but some nurses, doctors, and medical geneticists are also trained to do genetic counseling.

Who Is A Genetic Counselor?

Traditionally, a genetic counselor has a master’s degree in genetic counseling and has studied genetic diseases and how those diseases run in families. The genetic counselor can help a person or family understand their risk for genetic conditions (such as cystic fibrosis, cancer, or Down syndrome), educate the person or family about that disease, and assess the risk of passing those diseases on to children. A genetic counselor will often work with families to identify members who are at risk.
If it is appropriate, they will discuss genetic testing, coordinate any testing, interpret test results, and review all additional testing, surveillance, surgical, or research options that are available to members of the family. Genetic counselors often work as part of a health care team in conjunction with specially trained doctors, social workers, nurses, medical geneticists, or other specialists to help families make informed decisions about their health. They also work as patient advocates, helping individuals receive additional support and services for their health care needs. Who Sees A Genetic Counselor? Any person who may have a genetic condition, has a family history of an inherited disease, or has other risk factors for a genetic condition or birth defect may benefit from seeing a genetic counselor. If a person's family history indicates the possibility of an inherited disease, their doctor may give them a referral. Some pregnant women may also be referred to genetic counselors to receive counseling about the risks of birth defects or for help in interpreting test results. Pregnant women older than 35 are especially likely to see a genetic counselor because it is standard for them to be offered amniocentesis due to their increased risk of having a baby with a chromosomal abnormality such as Down syndrome. If you are unsure about whether you would benefit from genetic counseling, Genetic Health's Tree Builder tool can help clarify whether you have an increased risk for certain genetic conditions like cancer, diabetes, or heart disease. What Happens at A Genetic Counseling Session? To assess your risk for an inherited condition, a genetic counselor needs to know medical information about you and your family. In some cases, you may need to provide this information when you make the appointment. 
The genetic counselor will often take a more detailed family medical history and use this information to generate a family tree, which shows all of your relatives, their relationship to you and diseases they had. This diagram helps the genetic counselor determine your risk for inherited diseases. If you do have an increased risk, the counselor will make sure that you understand the basic genetic concepts that affect how the disease runs in families, educate you about the disease itself, and explain the level of risk for you and your family. A Family Tree for a Family with the Inherited Syndrome FAP After an initial appointment, the genetic counselor may need more information in order to make a final risk assessment. For example, they may need to know results of a pathology report on a relative's tumor, or the exact age when a relative developed a disease. They may also need to review medical records for a relative to clarify a diagnosis. Once the counselor has established your risk, he or she may discuss options — such as genetic tests if they are available — that may help clarify whether you or members of your family carry a genetic mutation that increases your risk for a particular disease. If there is an appropriate test, the counselor will discuss in detail what information it can give, the risks, benefits, limitations, and other possible consequences of being tested. They also provide detailed follow-up to be sure that you understand what the results mean. Even if genetic testing is not appropriate for your situation, the counselor will help you understand other options to reduce your risk (such as having ovaries removed in women at risk for ovarian cancer) or lifestyle changes that may help your situation. In many cases, the medical team will be involved in designing a plan of action for continued medical management. The genetic counseling session will usually last at least an hour if not longer. 
Although some people may only require one session, others will require several sessions if they are pursuing genetic testing or for additional follow-up. What not to Expect from a Genetic Counseling Session Genetic counseling sessions do not include: Any testing or procedures that you do not explicitly approve. A genetic counselor will carefully explain to you any tests that are possible for your situation. However, they cannot have the test done until you give written consent that you understand and want that particular test. The genetic counselor can not draw blood or use your DNA or test results without your permission. Prescriptions. In most cases, genetic counselors are not medical doctors and do not write prescriptions. Specific medical recommendations. A genetic counselor will try to make sure that you fully understand the risks, benefits, and possible consequences of every option that is available to you. However, the genetic counselor will not make medical decisions for you. Long-term psychological care. Although many genetic counseling sessions include follow-up sessions to be sure that you are able to handle new information about your health, most genetic counselors are not trained to provide long-term psychological care. For example, if results from a genetic test cause emotional problems that disrupt your daily life, the genetic counselor will most likely refer you to a mental health counselor, support group, or other sources of support for your situation. How Can I Prepare for A Genetic Counseling Session? The best way to prepare for a genetic counseling session for adult onset diseases such as cancer, heart disease, or diabetes is to find out as much as you can about your family medical history. Talk to your family members and try to find medical information about your siblings, parents, aunts and uncles, cousins, grandparents, children, and grandchildren. 
At minimum, this information should include: your relation to each family member, including whether family members are adopted or half-relatives; major health conditions that affect each family member, such as cancer, diabetes, or heart disease; the age of onset for each condition; age of death (where relevant); cause of death; and whether family members had a child with a blood relative. Try to confirm each health condition that affects family members. In many cases, your risk may be different depending on exactly what condition your family member had. For example, if you think that a relative had lung cancer when in fact they had breast cancer, it could seriously affect the accuracy of your risk assessment.

How to Find a Genetic Counselor

In many cases, a doctor will refer you to a genetic counselor if it is appropriate for your condition. However, you may be in a situation where you are seeking genetic counseling on your own. Most genetic counselors are associated with a hospital, clinic, or research group. You can try calling your local medical group or clinic. You may also be able to find a genetic counselor through resources such as the National Society of Genetic Counselors.

How to Find Additional Support

Genetic counselors are trained to provide support for people coping with genetic diseases. However, in many cases people may need long-term support, or support from people going through a similar experience. In these cases, the genetic counselor may be able to recommend support groups for your particular situation. Support groups vary widely in their scope and focus. Organizations for your particular disease, such as the American Heart Association, American Diabetes Association, or American Cancer Society, also often list support groups.

GENETIC LOAD

In population genetics, genetic load or genetic burden is a measure of the cost of lost alleles due to selection (selectional load) or mutation (mutational load).
It is a value in the range 0 ≤ L ≤ 1, where 0 represents no load. The concept was first formulated in 1937 by J. B. S. Haldane, independently formulated, named and applied to humans in 1950 by H. J. Muller, and elaborated further by Haldane in 1957.

Definition: Genetic load is the reduction in selective value for a population compared to what the population would have if all individuals had the most favored genotype. It is normally stated in terms of fitness as the reduction in the mean fitness for a population compared to the maximum fitness. In symbols, L = (w_max - w_mean) / w_max, where w_max is the maximum fitness and w_mean is the mean fitness of the population.

Causes of genetic load: Load may be caused by selection and mutation.

1. Mutational load: Mutation load is caused when a mutation at a locus produces a new allele of either lesser or greater fitness. This lowers the average fitness of the population relative to the maximum: a deleterious mutation has a lower relative fitness, lowering average fitness directly, while an advantageous mutation effectively lowers the relative fitness of the existing allele, and thus also lowers average fitness relative to the new maximum.

2. Selectional load: Selection occurs when the fitnesses of particular alleles are unequal, hence selection always exerts a load. With directional selection, the allele frequencies will tend towards an equilibrium position with the fittest allele reaching a frequency in mutation-selection balance. As mutations are rare, this is effectively fixation. Consider two alleles A1 and A2 with fitnesses w1 and w2. If w1 > w2, then at equilibrium the frequency of A1 is approximately 1, and hence the mean fitness is approximately w1 and the load is approximately 0. If the mean fitness is 0, the load is equal to 1, but the population goes extinct.

One of the most fundamental misconceptions is that most of the individuals in a normal population do not carry genes for genetic diseases. Starting with this assumption, many breeders argue that it should be possible to breed animals with a desired conformation while avoiding undesirable traits.
They will concede that some unfortunate individuals do carry recessive genes for these traits, but believe that if they choose their breeding stock carefully, they can avoid the problems that others are having. If problems do show up, it is due to "bad luck", the lack of direct genetic tests for recessive genes, or because another breeder has been concealing something. The truth is that it is virtually impossible to avoid genetic disease. Geneticists believe that most species carry a "genetic load" of 3-5 recessive lethal genes. The difference between purebred dogs and humans is that humans have something in excess of 2500 genetic diseases, but most of them are extremely rare and thus seldom come from both parents to produce an affected child, whereas many dog breeds have a relatively small number of very common genetic diseases. It is the frequency of these problems, rather than the number of different ones, that is the true indicator of genetic health in a population. When direct genetic tests do become generally available, they will be used to identify the defective genes carried by a dog that may be bred, not for the purpose of eliminating these individuals from the gene pool, but for planning crosses intelligently so that the defects don't match up to create a homozygous affected individual. They will also make it easier to identify individuals carrying dominant traits that are not always expressed, and those who are homozygous for late-onset diseases, such as SA, which do not show up until after an affected individual may have passed the gene on to every one of his or her progeny.

PEDIGREE ANALYSIS

Basic principles

If more than one individual in a family is afflicted with a disease, it is a clue that the disease may be inherited.
This information can then be used to predict recurrence risk in future generations. A basic method for determining the pattern of inheritance of any trait (which may be a physical attribute like eye color or a serious disease like Marfan syndrome) is to look at its occurrence in several individuals within a family, spanning as many generations as possible. For a disease trait, a doctor has to examine existing family members to determine who is affected and who is not. The same information may be difficult to obtain about more distant relatives, and is often incomplete. Once family history is determined, the doctor will draw up the information in the form of a special chart or family tree that uses a particular set of standardized symbols. This is referred to as a pedigree. In a pedigree, males are represented by squares and females by circles. An individual who exhibits the trait in question, for example, someone who suffers from Marfan syndrome, is represented by a filled symbol. A horizontal line between two symbols represents a mating. The offspring are connected to each other by a horizontal line above the symbols and to the parents by vertical lines. Roman numerals (I, II, III, etc.) symbolize generations. Arabic numerals (1, 2, 3, etc.) symbolize birth order within each generation. In this way, any individual within the pedigree can be identified by the combination of two numbers (for example, individual II-3).

Dominant and recessive traits

Using genetic principles, the information presented in a pedigree can be analyzed to determine whether a given physical trait is inherited or not and what the pattern of inheritance is. In simple terms, traits can be either dominant or recessive. A dominant trait can be passed on to a son or daughter by just one parent.
Characteristics of a dominant pedigree are: 1) Every affected individual has at least one affected parent; 2) Affected individuals who mate with unaffected individuals have a 50% chance of transmitting the trait to each child; and 3) Two affected individuals may have unaffected children. Recessive traits are passed on to children from both parents, although the parents may seem perfectly "normal." Characteristics of recessive pedigrees are: 1) An individual who is affected may have parents who are not affected; 2) All the children of two affected individuals are affected; and 3) In pedigrees involving rare traits, the unaffected parents of an affected individual may be related to each other. The reason for the two distinct patterns of inheritance has to do with the genes that predispose an individual to a given disease. Genes exist in different forms known as alleles, usually distinguished one from the other by the traits they specify. Individuals carrying identical alleles of a given gene are said to be homozygous for the gene in question. Similarly, when two different alleles are present in a gene pair, the individual is said to be heterozygous. Dominant traits are expressed in the heterozygous condition (in other words, you only need to inherit one disease-causing allele from one parent to have the disease). Recessive traits are only expressed in the homozygous condition (in other words, you need to inherit the same disease-causing allele from both parents to have the disease).

Penetrance and expressivity

Penetrance is the probability that a disease will appear in an individual when a disease allele is present. For example, if all the individuals who have the disease-causing allele for a dominant disorder have the disease, the allele is said to have 100% penetrance. If only a quarter of individuals carrying the disease-causing allele show symptoms of the disease, the penetrance is 25%.
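The transmission rules and the penetrance figures above can be checked with a small calculation. The following is a minimal sketch, assuming a single gene with two alleles written as one upper-case and one lower-case letter; the function names `cross` and `prob_affected` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Offspring genotype probabilities for a one-gene cross (a Punnett square).

    Each parent is a two-letter genotype such as "Aa"; every combination of
    one allele from each parent is treated as equally likely.
    """
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


def prob_affected(offspring, disease_allele, dominant, penetrance=1.0):
    """Chance that a child shows the trait.

    dominant=True  -> one copy of the disease allele is enough.
    dominant=False -> two copies (homozygous) are required.
    penetrance scales the result, as in the 25% example in the text.
    """
    p_at_risk = 0.0
    for genotype, p in offspring.items():
        copies = genotype.count(disease_allele)
        if (dominant and copies >= 1) or (not dominant and copies == 2):
            p_at_risk += p
    return p_at_risk * penetrance


# Dominant disease allele "A": affected heterozygote x unaffected parent.
print(prob_affected(cross("Aa", "aa"), "A", dominant=True))
# Same cross when only a quarter of carriers show symptoms.
print(prob_affected(cross("Aa", "aa"), "A", dominant=True, penetrance=0.25))
# Recessive disease allele "a": two unaffected carrier parents.
print(prob_affected(cross("Aa", "Aa"), "a", dominant=False))
```

The sketch treats penetrance as a single multiplier, which is a simplification; in real pedigrees penetrance can depend on age and other factors.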
Expressivity, on the other hand, refers to the range of symptoms that are possible for a given disease. For example, an inherited disease like Marfan syndrome can have either severe or mild symptoms, making it difficult to diagnose. Non-inherited traits Not all diseases that occur in families are inherited. Other factors that can cause diseases to cluster within a family are viral infections or exposure to disease-causing agents (for example, asbestos). The first clue that a disease is not inherited is that it does not show a pattern of inheritance that is consistent with genetic principles (in other words, it does not look anything like a dominant or recessive pedigree). Population genetics is the study of the allele frequency distribution and change under the influence of the four evolutionary processes: natural selection, genetic drift, mutation and gene flow. It also takes account of population subdivision and population structure in space. As such, it attempts to explain such phenomena as adaptation and speciation. Population genetics was a vital ingredient in the modern evolutionary synthesis, its primary founders were Sewall Wright, J. B. S. Haldane and R. A. Fisher, who also laid the foundations for the related discipline of quantitative genetics. Scope and theoretical considerations The framework of mathematical population genetics is an important achievement of the modern evolutionary synthesis. According to Beatty (1986), for example, it defines the core of the modern synthesis. According to Lewontin (1974) the theoretical task for population genetics is a process in two spaces: a "genotypic space" and a "phenotypic space". 
The challenge of a complete theory of population genetics is to provide a set of laws that predictably map a population of genotypes (G1) to a phenotype space (P1), where selection takes place, and another set of laws that map the resulting population (P2) back to genotype space (G2) where Mendelian genetics can predict the next generation of genotypes, thus completing the cycle. Even leaving aside for the moment the non-Mendelian aspects of molecular genetics, this is clearly a gargantuan task. Visualizing this transformation schematically (adapted from Lewontin 1974, p. 12):

G1 --T1--> P1 --T2--> P2 --T3--> G2 --T4--> G1' (next generation)

T1 represents the genetic and epigenetic laws, the aspects of functional biology, or development, that transform a genotype into phenotype. We will refer to this as the "genotype-phenotype map". T2 is the transformation due to natural selection, T3 are epigenetic relations that predict genotypes based on the selected phenotypes, and finally T4 is the rules of Mendelian genetics.

In practice, there are two bodies of evolutionary theory that exist in parallel: traditional population genetics, operating in the genotype space, and the biometric theory used in plant and animal breeding, operating in phenotype space. The missing part is the mapping between the genotype and phenotype space. This leads to a "sleight of hand" (as Lewontin terms it) whereby variables in the equations of one domain are considered parameters or constants where, in a full treatment, they would be transformed themselves by the evolutionary process and are in reality functions of the state variables in the other domain. The "sleight of hand" is assuming that we know this mapping. Proceeding as if we do understand it is enough to analyze many cases of interest. For example, if the phenotype is almost one-to-one with genotype (sickle-cell disease) or the time-scale is sufficiently short, the "constants" can be treated as such; however, there are many situations where it is inaccurate.
Genetic structure

Because of physical barriers to migration, along with limited vagility and natal philopatry, natural populations are rarely panmictic (Buston et al., 2007). There is usually a geographic range within which individuals are more closely related to one another than those randomly selected from the general population. This is described as the extent to which a population is genetically structured (Repaci et al., 2007).

Population geneticists

The three founders of population genetics were the Britons R.A. Fisher and J.B.S. Haldane and the American Sewall Wright. Fisher and Wright had some fundamental disagreements, and a controversy about the relative roles of selection and drift continued for much of the century between the Americans and the British. The Frenchman Gustave Malécot was also important early in the development of the discipline. John Maynard Smith was Haldane's pupil, whilst W.D. Hamilton was heavily influenced by the writings of Fisher. The American George R. Price worked with both Hamilton and Maynard Smith. American Richard Lewontin and Japanese Motoo Kimura were heavily influenced by Wright.

Evolution and Population Genetics Review

Haploid, Diploid

Diploid cells (2N) have two complete sets of chromosomes. The body cells of animals are diploid. Haploid cells have one complete set of chromosomes. Some organisms are haploid. Animals are diploid but their gametes (sperm and eggs) are haploid.

Mitosis

Mitosis is a type of cell division that results in daughter cells that are identical (genetically) to the parent cell. If the parent cell is diploid, the two daughter cells will be diploid. Similarly, a haploid cell that divides by mitosis will produce two haploid daughter cells. The diagram shows how chromosome movement results in two daughter cells with chromosomes that are identical to the parent cell. Below: The single-stranded chromosomes in the two daughter cells will later become double-stranded.
The two resulting strands (chromatids) are identical. Meiosis Meiosis is a type of cell division in which the daughter cells have 1/2 the number of chromosomes as the parent cell. If the parent cell is diploid, the daughter cells will each be haploid. Meiosis has two separate divisions resulting in four daughter cells. The first division is shown below. Each of the two cells produced by the first division (shown above) divides again (shown below). Notice that the second meiotic division is like mitosis. Evolution and Population Genetics Evolution Occurs in Populations Species There is not a good definition of species; perhaps the concept of species is artificial but it is useful because it allows people to classify organisms. Most biologists would agree that members of a sexually-reproducing species are able to interbreed and have a shared gene pool. Different species do not exchange genes with each other; they do not interbreed. This definition of species is based on sexual reproduction and therefore does not work with prokaryotes or other asexual species. Population A population is an interbreeding group of organisms (the same species) that occupies a particular area. The size of the area is somewhat arbitrary. There could be a population of fish in an aquarium and a population of fish in a lake. Gene Frequency and Evolution Gene frequency refers to the proportion of alleles that are of a particular type. For example, if 60% of the alleles in a population are "a" and 40% are "A", then the gene frequency of "a" is 0.6 and the gene frequency of "A" is 0.4. On a small scale, evolution involves changes in gene frequencies. Population Model A population is a group of interbreeding organisms that occupies a particular area. Initial Population Circles are used to represent genes in this diagram of a population. Individuals are diploid, so two circles are used to represent an individual. 
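The definition of gene frequency given above can be made concrete by counting gene copies in a small diploid population. This is a minimal sketch: the function name `allele_frequencies` and the six-individual population are hypothetical, chosen only so that the result is about one third "A" and two thirds "a".

```python
from collections import Counter


def allele_frequencies(genotype_counts):
    """Return {allele: frequency} for a diploid population.

    genotype_counts maps a two-letter genotype such as "Aa" to the number
    of individuals with that genotype. Each individual carries two gene
    copies, so each letter of the genotype is counted once per individual.
    """
    allele_totals = Counter()
    for genotype, n_individuals in genotype_counts.items():
        for allele in genotype:
            allele_totals[allele] += n_individuals
    total_copies = sum(allele_totals.values())
    return {allele: n / total_copies for allele, n in allele_totals.items()}


# Hypothetical population: six individuals, twelve gene copies in total.
population = {"AA": 1, "Aa": 2, "aa": 3}
for allele, freq in sorted(allele_frequencies(population).items()):
    print(allele, round(freq, 2))
```

Because frequencies are computed from gene copies rather than from individuals, a heterozygote contributes one copy to each allele's total.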
Gene Frequencies in the Model Population

In the population above, 33% of the genes for eye color are "A" and 67% are "a". The frequency of "A" is therefore 0.33 and the frequency of "a" is 0.67.

Gametes

During meiosis, "AA" individuals will produce all "A" gametes. Similarly, 1/2 of the gametes produced by an "Aa" individual will be "A" and the other half will be "a"; "aa" individuals will produce all "a" gametes.

Individual    AA       Aa              aa
Gametes       all A    1/2 A, 1/2 a    all a

The proportion of A and a in the gametes will be the same as in the population. In the example population we have been using, suppose that each individual produces four gametes. In reality, males produce many millions of gametes and females produce relatively few. This is not a concern for our model because in either case, the gene frequency of the gametes will be the same as that of the population that produced them. The gene frequency of "A" and "a" in the gamete pool will remain 0.33 and 0.67.

Gene frequency: The next generation

Because the gene frequency in the gamete pool did not change, the gene frequency in the population in the next generation remains the same. The Hardy-Weinberg law states that under certain conditions (discussed below), the gene frequency of a population does not change from generation to generation.

Should There Be Fewer Recessive Alleles?

The population model described above predicts that gene frequencies will not change from one generation to the next even if there are more recessive alleles. There is sometimes a misconception among students beginning to study genetics that dominant traits are more common than recessive traits. It isn't true. For example, blood type O is recessive and is the most common type of blood. Huntington's (a disease of the nervous system) is caused by a dominant gene and the normal gene is recessive. Fortunately, most people are homozygous for the recessive normal gene; the dominant disease gene is uncommon.
The misconception comes from the observation that in a cross of Aa X Aa, 3/4 of the offspring will show the dominant characteristic. However, the 3:1 ratio comes only if the parents are both Aa. If there are many recessive genes in a population, then most matings are likely to be aa X aa and most offspring will be aa.

Forces that Change Gene Frequencies

Migration can change the gene frequency of a population if the migrants have a different gene frequency than that of the population they are leaving or entering.

The founder effect occurs when the gene frequency of a newly established population is somewhat different from the parental population. This may be due to the small sample of founding individuals. The sample-size phenomenon can be illustrated by flipping a coin. The expected proportion of "heads" from flipping a coin is 50%, but if a coin is flipped only 4 times, you may get all "heads" or all "tails". If the coin is flipped 1000 times, the actual proportions of "heads" and "tails" will probably not deviate much from 50%. Thus, the larger the sample of emigrants, the more likely it is to reflect the population that it is leaving. Below: The population on the right was formed from a few individuals emigrating from the population on the left.

During a bottleneck, a large population undergoes a decrease in size so that relatively few individuals remain. Because there are few individuals, the gene frequency is more likely to drift. Below: The gene frequency of the initial population (left) changes because many of the individuals have died. The population on the right is the same population after the bottleneck has occurred.

Genetic drift refers to random fluctuations in the gene frequency of a population. This is more likely to occur in a small population. As with bottlenecks and the founder effect, it is a sample-size phenomenon. The smaller the population, the more likely gene frequencies are to fluctuate from generation to generation.
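The sample-size argument behind genetic drift can be illustrated with a short simulation. The sketch below is not taken from the source: it assumes a simple random-sampling model (in the style of the Wright-Fisher model) in which each new generation is formed by drawing 2N gene copies at random from the previous generation's gene pool, with no selection, mutation, or migration. The function names and the population sizes 10 and 1000 are hypothetical.

```python
import random


def simulate_drift(pop_size, start_freq, generations, seed=0):
    """Track the frequency of allele "A" under random sampling alone.

    Each generation, 2 * pop_size gene copies are drawn; each copy is "A"
    with probability equal to the current frequency of "A".
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        count_a = sum(1 for _ in range(n_copies) if rng.random() < freq)
        freq = count_a / n_copies
        history.append(freq)
    return history


def mean_first_step_change(pop_size, start_freq=0.5, replicates=200):
    """Average size of the one-generation change in frequency over many runs."""
    total = 0.0
    for seed in range(replicates):
        history = simulate_drift(pop_size, start_freq, 1, seed=seed)
        total += abs(history[1] - start_freq)
    return total / replicates


if __name__ == "__main__":
    for n in (10, 1000):
        print("population size", n, "mean change", round(mean_first_step_change(n), 4))
```

Comparing the two printed averages shows the same effect as the coin-flip example: the small population's frequency moves much further per generation than the large population's.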
Mutation changes gene frequencies when genes of one type ("A" for example) mutate to another type ("a" for example).

Natural selection changes gene frequencies when genes or gene combinations are more likely to result in greater reproductive success of the individual that possesses them.

Conditions Necessary for Hardy-Weinberg Equilibrium

Notice that the gene frequency in the next generation is the same as that of the initial population. The Hardy-Weinberg principle states that if the following conditions are met, the gene frequency of a population will not change from generation to generation: no migration, large population size, no mutation, random mating, and no selection.

Natural Selection

Natural selection is a mechanism that produces changes in the gene frequency from one generation to the next. As a result, organisms become better adapted to their environment. It is important to keep in mind as you read below that natural selection does not act on individuals; it acts on populations. Individual organisms cannot become better-adapted to their environment. Natural selection occurs because
1. Individuals within a population vary; they are not all identical.
2. Some variants are “better” than others.
3. The traits that vary are heritable.
4. The “better” individuals will have more success reproducing; they will have more offspring. In successive generations, more offspring will have the better trait.
These items are discussed below.

Variation

Sexual reproduction promotes genetic variation. For many traits that occur in a population, individuals are often not all identical. For example, if running speed were measured, some individuals would likely be able to run faster than others but most individuals would probably be intermediate. If number of individuals is plotted against the trait in question (running speed for example), a graph like the one shown is often produced.
We would get a similar bell-shaped curve if we plotted height, weight, performance on exams, or almost any other characteristic. Some Variants are Better Some individuals are bound to be better than others. Perhaps their body structure allows them to escape predators better or to find food faster or to better provide for their young. For example, suppose that the faster-running animals diagrammed below are better able to escape predators than the slower ones. You would expect that more of the faster ones would survive and reproduce than the slower ones. The slower rabbits will not reproduce as much because predators kill them more than they kill the faster rabbits. Traits Are Heritable Those individuals that survive better or reproduce more will pass their superior genes to the next generation. Individuals that do not survive well or that reproduce less as a result of "poorer genes" will not pass those genes to the next generation in high numbers. As a result, the population will change from one generation to the next. The frequency of individuals with better genes will increase. This process is called natural selection. Natural Selection Produces Evolutionary Change If the conditions discussed above are met, the genetic composition of the population will change from one generation to the next. This process is called natural selection. The word "evolution" refers to a change in the genetic composition of a population. Natural selection produces evolutionary change because it changes the genetic composition of populations. A variety of other mechanisms can also produce evolutionary change. For example, suppose that 65% of the eye-color genes in a population were for individuals with blue eyes and 35% of the genes were for brown eyes. If most of the immigrants entering the population carried the blue gene, the overall composition might change from 65% blue to 70% blue. Natural selection acts on populations; a single individual cannot evolve. 
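The immigration example above (65% blue shifting towards 70%) is a weighted average of the resident and immigrant gene frequencies. The sketch below shows the arithmetic; the function name, the immigrant frequency of 90% blue and the immigrant share of 20% are hypothetical values chosen only because they reproduce the 70% figure, not numbers taken from the text.

```python
def frequency_after_migration(p_resident, p_migrant, migrant_fraction):
    """Allele frequency after mixing residents with immigrants.

    migrant_fraction is the share of the post-migration population made up
    of immigrants (0 = no immigrants, 1 = complete replacement).
    """
    if not 0.0 <= migrant_fraction <= 1.0:
        raise ValueError("migrant_fraction must be between 0 and 1")
    return (1.0 - migrant_fraction) * p_resident + migrant_fraction * p_migrant


if __name__ == "__main__":
    # Residents: 65% blue. Hypothetical immigrants: 90% blue, making up
    # 20% of the merged population.
    print(round(frequency_after_migration(0.65, 0.90, 0.20), 3))
```

If the immigrants had the same gene frequency as the residents, the function would return the resident frequency unchanged, which matches the earlier statement that migration only matters when the migrants differ from the population they join.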
Natural selection does not act on an individual to make it better adapted to its environment.

Example of Natural Selection: Industrial Melanism

Kettlewell studied the peppered moth (Biston betularia) from insect collections in England. He observed that in polluted areas, most of the peppered moths were the dark form. In clean areas, most were the pale form. During the early 1800s, the dark form comprised less than 2% of the population and the pale form made up more than 98%. During the 1800s the dark form increased in frequency in urban areas. Kettlewell suggested that dark moths survived better in polluted areas because they were more difficult for avian (bird) predators to see on the darkened tree trunks. Similarly, he suggested that light-colored moths were more difficult to see in unpolluted areas because the tree trunks were light-colored. To test this, he released moths of each type (light and dark) in both polluted and unpolluted areas. In the unpolluted area, he recaptured 13.7% of the light moths and 4.7% of the dark moths. In the polluted area, he recaptured 13% of the light and 27.5% of the dark moths. In each area, the better-camouflaged form was recaptured more often, which is what his hypothesis predicts.

Sexual Reproduction and Evolutionary Change

Variation

Individuals within a population usually are not all identical, and much of this variation is due to genetic differences among individuals. Sexual reproduction acts to increase variation in populations by shuffling genes. Offspring have some genes from each of two different parents and therefore are not identical clones of their parents. The increased variation due to sexual reproduction allows natural selection (and thus evolution) to produce changes in populations as described above. Ultimately, all variation in a population comes from changes in the DNA. These changes are called mutations. Recombination during sexual reproduction promotes variation. Sperm and eggs (gametes) are produced by a type of cell division called meiosis.
During meiosis, crossing-over and independent assortment act to shuffle the genes before gametes are produced.

Fluctuating environments

Evolutionary change due to natural selection would not be necessary if the environment never changed and the organisms within it were optimally adapted to it. For example, imagine a plant that is adapted to an environment that has an average annual rainfall of 100 cm. If the climate were to change so that the amount of rainfall decreased, individuals that could tolerate less rain would survive and reproduce better, thus establishing their drought-tolerant genes in subsequent generations. If there were no variation in the plant population, there would not be any drought-tolerant individuals and the species would likely go extinct in areas of decreased rainfall. Sexual reproduction, therefore, enables species to survive in fluctuating or changing environments because it promotes variation, which in turn allows natural selection to act.

Model Chromosomes

The drawings of chromosomes below will be cut out and used in class for reviewing mitosis and meiosis in the "Review" section at the beginning of this page. Be sure that you can do the following using these models of chromosomes:
Create a haploid cell.
Create a diploid cell.
Simulate mitosis in a diploid cell.
Simulate mitosis in a haploid cell.
Simulate meiosis in a diploid cell.
Use the models to create two gametes: an egg and a sperm.
Simulate the fusion of the two gametes to create a fertilized egg (called a zygote).

Hardy Weinberg Law

The formula $(p + q)^2 = p^2 + 2pq + q^2$ expresses the genotypic expectations of progeny in terms of the gametic or allelic frequencies of the parental gene pool. It was originally formulated independently in 1908 by the British mathematician Hardy and the German physician Weinberg.
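The expansion above can be checked numerically. The sketch below is an illustration only: the function names and the starting population (60% AA, 40% aa, no heterozygotes) are assumptions, and it takes p as the frequency of allele A and q = 1 - p as the frequency of allele a. It computes the expected genotype frequencies under random mating and then recounts the allele frequency from those genotypes.

```python
def genotype_frequencies(p):
    """Expected AA, Aa and aa frequencies under random mating."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def allele_frequency(f_AA, f_Aa, f_aa):
    """Frequency of allele A, counted from genotype frequencies.

    Every AA individual carries two copies of A and every Aa individual
    carries one, so heterozygotes contribute half their frequency.
    """
    total = f_AA + f_Aa + f_aa
    return (f_AA + 0.5 * f_Aa) / total


if __name__ == "__main__":
    # Hypothetical starting population with homozygotes only.
    p = allele_frequency(0.6, 0.0, 0.4)
    gen1 = genotype_frequencies(p)
    gen2 = genotype_frequencies(allele_frequency(*gen1))
    print(p, gen1, gen2)
```

With these made-up numbers the first round of random mating gives genotype frequencies of roughly 0.36, 0.48 and 0.16, and a second round gives the same values again, because the allele frequency recounted from the offspring is still 0.6.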
Both put forward the idea, called the Hardy-Weinberg law or Hardy-Weinberg equilibrium after their names, that both gene frequencies and genotype frequencies will remain constant from generation to generation in an infinitely large interbreeding population in which mating is at random and no selection, migration or mutation occurs. Should a population initially be in disequilibrium, one generation of random mating is sufficient to bring it into genetic equilibrium, and thereafter the population will remain in equilibrium (unchanged in gametic and zygotic frequencies) as long as the Hardy-Weinberg conditions persist. The Hardy-Weinberg law depends on the following conditions for its full attainment:
1. The population is infinitely large and mates at random.
2. No selection is operative.
3. The population is closed, i.e., no immigration or emigration occurs.
4. No mutation is operative in alleles.
5. Meiosis is normal, so that chance is the only factor operative in gametogenesis.

The significance of the Hardy-Weinberg equilibrium was not immediately appreciated. A rebirth of biometrical genetics was later brought about with the classical papers of R.A. Fisher, beginning in 1918, and those of Sewall Wright, beginning in 1920. Under the leadership of these mathematicians, emphasis was placed on the population rather than on the individual or family group, which had previously occupied the attention of most Mendelian geneticists. In about 1935, T. Dobzhansky and others started to interpret and to popularize the mathematical approach for studies of genetics and evolution.

Quantitative genetics is the study of continuous traits (such as height or weight) and their underlying mechanisms. It is effectively an extension of simple Mendelian inheritance in that the combined effect of the many underlying genes results in a continuous distribution of phenotypic values.

History

The field was founded, in evolutionary terms, by the originators of the modern synthesis, R.A. Fisher, Sewall Wright and J.
B. S. Haldane, and aimed to predict the response to selection given data on the phenotype and relationships of individuals. Analysis of quantitative trait loci, or QTL, is a more recent addition to the study of quantitative genetics. A QTL is a region in the genome that affects the trait or traits of interest. Quantitative trait loci approaches require accurate phenotypic, pedigree and genotypic data from a large number of individuals.

Traits

Quantitative genetics is not limited to continuous traits; it covers all traits that are determined by many genes. These include:
Continuous traits are quantitative traits with a continuous phenotypic range. They are often polygenic, and may also be influenced significantly by environmental effects.
Meristic traits or other ordinal numbers are expressed in whole numbers, such as number of offspring, or number of bristles on a fruit fly. These traits can be treated either as approximately continuous traits or as threshold traits.
Some qualitative traits can be treated as if they have an underlying quantitative basis, expressed as a threshold trait (or multiple thresholds). Some human diseases (such as schizophrenia) have been studied in this manner.

Basic principles

The phenotypic value (P) of an individual is the combined effect of the genotypic value (G) and the environmental deviation (E): $P = G + E$

The genotypic value is the combined effect of all the genetic effects, including nuclear genes, mitochondrial genes and interactions between the genes. It is therefore often subdivided into an additive (A) and a dominance component (D). The additive effect describes the cumulative effect of the individual genes, while the dominance effect is the result of interactions between those genes.
The environmental deviation can be subdivided into a pure environmental component (E) and an interaction factor (I) describing the interaction between genes and the environment. This can be described as: $P = A + D + E + I$

The contribution of those components cannot be determined in a single individual, but they can be estimated for whole populations by estimating the variances for those components, denoted as: $V_P = V_A + V_D + V_E + V_I$

The heritability of a trait is the proportion of the total (i.e. phenotypic) variation ($V_P$) that is explained by the genetic variation. This is the total genetic variation ($V_G = V_A + V_D$) for broad-sense heritability ($H^2 = V_G / V_P$), while only the additive genetic variation ($V_A$) is used for narrow-sense heritability ($h^2 = V_A / V_P$), often simply called heritability. The latter gives an indication of how a trait will respond to natural or artificial selection.

Resemblance between relatives

Central to estimating the variances for the various components is the principle of relatedness. A child has a father and a mother. Consequently, the child and father share 50% of their alleles, as do the child and the mother. However, the mother and father normally share some alleles as a result of shared ancestors. Similarly, two full siblings also share on average 50% of their alleles with each other, while half sibs share on average only 25% of their alleles. This variation in relatedness can be used to estimate which proportion of the total phenotypic variance ($V_P$) is explained by the above-mentioned components.

Correlated traits

Although some genes affect only a single trait, many genes affect several traits. Because of this, a change in a single gene will have an effect on all those traits. This is calculated using covariances, and the phenotypic covariance ($\mathrm{Cov}_P$) between two traits can be partitioned in the same way as the variances described above.
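As a small worked illustration of the variance partition and the two heritabilities described above, the sketch below computes both ratios from a set of variance components. The function name and the numbers are made up for the example; following the text, the total genetic variance is taken as the additive plus the dominance variance, and the interaction term is counted on the environmental side.

```python
def heritabilities(v_a, v_d, v_e, v_i=0.0):
    """Return (broad-sense, narrow-sense) heritability.

    v_a: additive genetic variance      v_d: dominance variance
    v_e: environmental variance         v_i: gene-environment interaction
    """
    v_p = v_a + v_d + v_e + v_i        # total phenotypic variance
    if v_p <= 0:
        raise ValueError("total phenotypic variance must be positive")
    v_g = v_a + v_d                    # total genetic variance
    return v_g / v_p, v_a / v_p


if __name__ == "__main__":
    # Made-up variance components, for illustration only.
    broad, narrow = heritabilities(v_a=4.0, v_d=1.0, v_e=5.0)
    print(broad, narrow)   # prints: 0.5 0.4
```

The narrow-sense value is never larger than the broad-sense value, since it leaves the dominance variance out of the numerator.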
The genetic correlation is calculated by dividing the covariance between the additive genetic effects of two traits by the square root of the product of the variances for the additive genetic effects of the two traits. For two traits X and Y: $r_g = \mathrm{Cov}_A(X, Y) / \sqrt{V_A(X)\, V_A(Y)}$

In population genetics, genetic load or genetic burden is a measure of the cost of lost alleles due to selection (selectional load) or mutation (mutational load). It is a value in the range $0 \le L \le 1$, where 0 represents no load. The concept was first formulated in 1937 by J. B. S. Haldane, independently formulated, named and applied to humans in 1950 by H. J. Muller[1], and elaborated further by Haldane in 1957.

Definition

Genetic load is the reduction in selective value for a population compared to what the population would have if all individuals had the most favored genotype.[3] It is normally stated in terms of fitness as the reduction in the mean fitness for a population compared to the maximum fitness.

Mathematics

Consider a single gene locus with the alleles $A_1, \ldots, A_n$, which have the fitnesses $w_1, \ldots, w_n$ and the allele frequencies $p_1, \ldots, p_n$ respectively. Ignoring frequency-dependent selection, the genetic load (L) may be calculated as:

$L = (w_{max} - \bar{w}) / w_{max}$   (1)

where $w_{max}$ is the maximum value of the fitnesses and $\bar{w}$ is the mean fitness, which is calculated as the mean of all the fitnesses weighted by their corresponding allele frequency:

$\bar{w} = \sum_{i=1}^{n} p_i w_i$   (2)

where the $i$th allele is $A_i$ and has the fitness and frequency $w_i$ and $p_i$ respectively. When $w_{max} = 1$, (1) simplifies to

$L = 1 - \bar{w}$

Causes of genetic load

Load may be caused by selection and mutation.

Mutational load

Mutation load is caused when a mutation at a locus produces a new allele of either lesser or greater fitness. This lowers the average fitness of the population relative to the maximum: a deleterious mutation has a lower relative fitness, lowering average fitness, while an advantageous mutation effectively lowers the relative fitness of the existing allele, and thus also lowers average fitness.
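The definition of load given above (the shortfall of mean fitness relative to the maximum fitness, with mean fitness weighted by allele frequency) can be evaluated directly. The sketch below assumes the usual form in which the shortfall is divided by the maximum fitness; the function names and the example fitness values are hypothetical.

```python
def mean_fitness(fitnesses, frequencies):
    """Mean fitness: each allele's fitness weighted by its frequency."""
    if len(fitnesses) != len(frequencies):
        raise ValueError("need one frequency per fitness value")
    if abs(sum(frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return sum(w * p for w, p in zip(fitnesses, frequencies))


def genetic_load(fitnesses, frequencies):
    """Load = (w_max - w_bar) / w_max (assumed form, see lead-in)."""
    w_max = max(fitnesses)
    if w_max <= 0:
        raise ValueError("maximum fitness must be positive")
    w_bar = mean_fitness(fitnesses, frequencies)
    return (w_max - w_bar) / w_max


if __name__ == "__main__":
    # Hypothetical locus: a favored allele (fitness 1.0) at 90% and a
    # slightly deleterious allele (fitness 0.8) at 10%.
    print(round(genetic_load([1.0, 0.8], [0.9, 0.1]), 3))
    # If the favored allele is fixed, there is no load.
    print(genetic_load([1.0, 0.8], [1.0, 0.0]))
```

In the first case the mean fitness is about 0.98, so the load is about 0.02; in the second the mean fitness equals the maximum and the load is zero.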
Selectional load

Selection occurs when the fitnesses of particular alleles are unequal, hence selection always exerts a load. With directional selection, the allele frequencies will tend towards an equilibrium position with the fittest allele reaching a frequency in mutation-selection balance. As mutations are rare, this is effectively fixation. Consider two alleles $A_1$ and $A_2$. If $w_1 > w_2$, then at equilibrium, $p_1 = 1$ and $p_2 = 0$, hence $\bar{w} = w_1 = w_{max}$, and $L = 0$. If the mean fitness is 0, the load is equal to 1, but the population goes extinct.

One of the most fundamental misconceptions is that most of the individuals in a normal population do not carry genes for genetic diseases. Starting with this assumption, many breeders argue that it should be possible to breed animals with a desired conformation while avoiding undesirable traits. They will concede that some unfortunate individuals do carry recessive genes for these traits, but believe that if they choose their breeding stock carefully, they can avoid the problems that others are having. If problems do show up, they are put down to "bad luck", the lack of direct genetic tests for recessive genes, or another breeder concealing something. The truth is that it is virtually impossible to avoid genetic disease. Geneticists believe that most species carry a "genetic load" of 3-5 recessive lethal genes. The difference between purebred dogs and humans is that humans have something in excess of 2500 genetic diseases, but most of them are extremely rare and thus seldom come from both parents to produce an affected child, whereas many dog breeds have a relatively small number of very common genetic diseases. It is the frequency of these problems, rather than the number of different ones, that is the true indicator of genetic health in a population.
When direct genetic tests do become generally available, they will be used to identify the defective genes carried by a dog that may be bred, not for the purpose of eliminating these individuals from the gene pool, but for planning crosses intelligently so that the defects don't match up to create a homozygous affected individual. They will also make it easier to identify individuals carrying dominant traits that are not always expressed, and those who are homozygous for late-onset diseases, such as SA, which do not show up until after an affected individual may have passed the gene on to every one of his or her progeny.

POPULATION GENETICS AND EVOLUTION

Thus, population genetics has provided great support to the idea of organic evolution. Various aspects of evolution can be reviewed in terms of population genetics as follows:

Speciation

The process of formation of new species is called speciation. Two genetically divergent populations most commonly form new species when they become geographically isolated from each other. If a large population is fragmented into two or more units which are geographically isolated from one another, each independent unit follows a different evolutionary path for the following reasons:
1. Each isolated unit of a population may have its own mutations, which provide raw material for organic diversity.
2. The mutations and gene combinations which appear in different isolated population units will have different adaptive values in the new environments.
3. The organisms which originally colonize a certain geographical area and form an isolated population may not be representative of the group from which they came, so that different gene frequencies exist from the beginning.
4. The size of the new population may become quite small at various times so that a genetic "bottleneck" is formed, from which all subsequent organisms will arise.
5. During the period of small population size, the gene frequencies will fluctuate in unpredictable directions.
The fluctuation in gene frequency is called genetic drift.

Stratification

When the genetic equilibrium of a natural population is disturbed by evolutionary pressures such as mutation, selection, migration, and isolation, the natural population is divided into many sub-populations. Each sub-population has different gene frequencies. A population which consists of two or more sub-populations with different gene frequencies is said to be stratified. For example, the Indian population is a stratified population because it consists of various sub-populations, such as Dravidians and Aryans.

Population genetics has provided great help to human genetics in understanding the inheritance of various human traits in a given population. It has also helped in better understanding of organic evolution, especially the evolution of man.