LECTURE 1
Of Special Course “Modern Problems of Molecular Biology” for Pharmaceutical Department students of the 2nd Year of Study

Theme: “MOLECULAR BIOLOGY: SUBJECT AND TASKS. DNA STRUCTURE, FUNCTIONS AND PROPERTIES. MOLECULAR MECHANISMS OF DNA REPLICATION, RECOMBINATION AND REPAIR. MOLECULAR STRUCTURE OF A GENE. STRUCTURE OF GENOMES OF VIRUSES, PRO- AND EUKARYOTES”

Lecture plan:
1. Subject, aim and tasks of Molecular Biology Course
2. Steps of Molecular Biology development
3. DNA chemical structure, macromolecular arrangement, functions and properties
3.1 Form, structure and macromolecular properties of DNA
3.2 Primary DNA structure
3.3 DNA chemical characteristic
3.4 Secondary structure of DNA
3.5 Tertiary structure of DNA: double helix
3.6 DNA conformations
4. DNA replication
5. DNA self-correction
6. DNA damage
7. DNA reparation
8. DNA and RNA comparison. RNA types and properties
9. Gene definition and classification
9.1 Gene types
9.2 Gene properties
10. Sequences of human genome
10.1 Regulatory sequences
10.2 Repetitive sequences
10.3 Gene clusters
10.4 Pseudogenes
10.5 Tandemly repeated (satellite) DNA
11. Transposed sequences
11.1 Transposons causing diseases
12. Non-nuclear heredity
12.1 Mitochondrial genome

1. Subject, aim and tasks of Molecular Biology Course
All living beings have a molecular basis of structure and function. The bodies of living organisms consist of approximately 30 main elements. Carbon, oxygen, hydrogen and nitrogen, together with phosphorus and sulfur, account for about 99% of the mass. Sodium, potassium, iron, calcium, magnesium, chlorine and iodine are also of great importance. The atoms of these elements combine in different proportions to form the enormous variety of molecules that make up living systems (cells).
The molecules that compose living beings are subdivided into organic and inorganic. The inorganic molecules of living beings are represented by water (about 70% of body mass) and various dissolved salts. Complex organic molecules make up a significant share of the mass. They contain carbon as the main component and also include hydrogen, oxygen, nitrogen, phosphorus and sulfur. Organic polymers made of amino acids form protein molecules. Nucleotides play an important role in energy storage and transfer (ATP, GTP and others). However, most of the nucleotides in a cell are subunits of the informational molecules DNA and RNA.
Thus, a living being can be defined as a highly organized arrangement of molecules that continuously uses energy and matter from the surrounding environment «under the direction of» the informational programs of nucleic acids. These programs provide long-term maintenance of the highly efficient organization of molecules and processes in living beings.

1.1 Molecular Biology subject and tasks
Molecular biology is the science of the mechanisms of genetic information storage, transmission and expression, of the intermolecular interactions that underlie biological processes, and of the structure and functions of irregular biopolymers (nucleic acids and proteins). All molecules of living systems are subjects of molecular biology.
The tasks of Molecular Biology are the following:
1) Study of the molecular structures of living systems
2) Study of the relations between the molecular structures of living beings
3) Study of genes and their functions
4) Application of this knowledge in molecular biotechnology
The main aim of studying Molecular Biology is to be able to explain the vital functions of the human organism at the molecular-genetic level of structural organization.

2. Steps of Molecular Biology development
1902 - Walter Sutton observed chromosomal movement during meiosis and proposed that the hereditary "factors" (later called genes) are located on chromosomes: the chromosomal theory of heredity
1905-1908 - William Bateson and Reginald Crundall Punnett demonstrated that the action of some genes modifies the action of other genes: the first time gene regulation was demonstrated
1933 - A new technique, electrophoresis, was introduced by Arne Tiselius for separating proteins in solution
1937 - Frederick Charles Bawden discovered tobacco mosaic virus RNA
1944 - Barbara McClintock reported transposable elements: "jumping genes"
1946 - Edward Tatum and Joshua Lederberg discovered that bacteria can exchange genetic material directly through conjugation
1946 - Max Delbrück and Alfred Day Hershey discovered that genetic material from different viruses can combine: genetic recombination
1950 - Erwin Chargaff found that in DNA the amount of adenine is always about the same as that of thymine, and the amount of cytosine about the same as that of guanine. This is now called "Chargaff's Rules"
1953 - James Watson and Francis Crick proposed the double-stranded, helical, complementary, anti-parallel model for DNA
1955 - Frederick Sanger announced the first complete sequence of a protein, bovine insulin
1955 - Arthur Kornberg discovered and isolated DNA polymerase from E. coli bacteria
1956 - Francis Crick and George Gamow worked out the "Central Dogma" to explain protein synthesis from DNA: the DNA sequence codes for amino acid sequences and genetic information flows in one direction, from DNA to mRNA to protein
1959 - Francois Jacob and Jacques Monod discovered an important mechanism behind genetic regulation: mappable control functions located on chromosomes in the DNA sequence, named "repressor" and "operon"
1961 - Marshall Nirenberg, Heinrich Matthaei and Severo Ochoa cracked the "Genetic Code": a sequence of three nucleotide bases (a codon) determines each amino acid
1967 - Mary Weiss and Howard Green found a technique for combining human cells and mouse cells grown in one culture: somatic cell hybridisation
1967 - The first evolutionary trees from protein sequences were set up by W.M. Fitch and E. Margoliash
1970 - Howard Temin and David Baltimore independently isolated reverse transcriptase, an enzyme that can make DNA from RNA
1972 - Paul Berg used a restriction enzyme to cut DNA and ligase to paste two DNA strands together to form a hybrid circular molecule. This was the first recombinant DNA molecule
1972 - First successful DNA cloning experiments
1973 - Stanley Cohen and Herbert Boyer first successfully transferred DNA from one life form into another: they spliced viral DNA and bacterial DNA to create a plasmid with dual antibiotic resistance
1974 - Allan Maxam and Walter Gilbert (Harvard) and Frederick Sanger (U.K. Medical Research Council) independently developed different methods for sequencing DNA
1977 - Bacteriophage ΦX174 (5386 bp) was the first complete genome (DNA) to be sequenced
1977 - Richard Roberts’ and Phil Sharp’s labs showed that eukaryotic genes contain many interruptions, called introns
1978 - Genentech successfully produced human insulin using recombinant DNA technology in E. coli
1978 - David Botstein discovered that cutting DNA with restriction enzymes produces fragments that differ from one person to another: restriction fragment length polymorphisms (RFLP)
1981 - Gordon and Ruddle (Yale University) made the first transgenic mice by inserting genes from other animals with DNA microinjection
1981 - Human mitochondrial DNA sequenced (16569 bp)
1983 - Kary Mullis invented the polymerase chain reaction (PCR), a method for multiplying DNA sequences in vitro
1983 - The first genetically modified plant was created: a tobacco plant resistant to an antibiotic
1984 - Alec Jeffreys developed the technique of using sequences of DNA for identification, called "genetic fingerprinting"
1984 - Chiron Corp determined the entire sequence of the HIV-1 genome
1990 - Human Genome Project launched: estimated cost of $3 billion (planned for 15 years)
1990 - BLAST, a fast sequence similarity searching tool, was introduced by S. Karlin and S.F. Altschul
1994 - The Flavr Savr tomato became the first genetically modified food to be approved for sale. Calgene introduced an antisense gene that suppresses expression of polygalacturonase, the enzyme responsible for the tomato's softening
1999 - Drosophila melanogaster (fruit fly) genome completely sequenced (175 Mb)
2000 - Completion of the Arabidopsis thaliana sequence (157 Mb)
2000 - Human genome draft version finished (3200 Mb)
2010 - Completion of the 2010 Project: understanding the function of all genes of Arabidopsis thaliana within their cellular, organismal and evolutionary context
Future goals of molecular biology and bioinformatics research:
2050 - Completion of the first computational model of a complete cell, or maybe even of a complete organism

3. DNA chemical structure, macromolecular arrangement, functions and properties
A nucleic acid is a complex, high-molecular-weight biochemical macromolecule composed of smaller units, nucleotides, which form chains that convey genetic information.
Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. The main role of DNA molecules is the long-term storage of information.

3.1 Form, structure and macromolecular properties of DNA
- DNA usually occurs as linear chromosomes in eukaryotes and circular chromosomes in prokaryotes;
- in most organisms DNA is a double helix; some viruses are an exception and have single-stranded DNA;
- DNA has primary, secondary and tertiary levels of structure;
- prokaryotes typically have a single chromosome, which consists of one supercoiled DNA macromolecule;
- in eukaryotic cells DNA is part of a deoxyribonucleoprotein complex (chromatin).

3.2 Primary DNA structure:
- it is an unbranched polynucleotide chain with a definite sequence of nucleotides;
- it is strictly species-specific and individual;
- it represents the encoded form of genetic information (genetic code);
- in the case of circular DNA the ends of the molecule are joined.

3.3 DNA chemical characteristic
Each repeating nucleotide unit contains both a segment of the backbone of the molecule, which holds the chain together, and a base, which interacts with the other DNA strand in the helix. A base linked to a sugar is called a nucleoside, and a base linked to a sugar and one or more phosphate groups is called a nucleotide. If multiple nucleotides are linked together, as in DNA, the polymer is called a polynucleotide.
The backbone of the DNA strand is made from alternating phosphate and sugar residues. The sugar in DNA is 2-deoxyribose, which is a pentose (five-carbon) sugar. The sugars are joined together by phosphate groups that form phosphodiester bonds between the third and fifth carbon atoms of adjacent sugar rings.

3.4 Secondary structure of DNA
- two antiparallel polynucleotide chains connected by hydrogen bonds;
- each type of base on one strand forms a bond with just one type of base on the other strand, in accordance with Chargaff's rules. This is called complementary base pairing;
- the complementary chains have different functions: coding chain and template chain;
- this arrangement ensures the efficiency of the replication and transcription processes.

3.5 Tertiary structure of DNA: double helix
The DNA chain is 22 to 26 Ångströms wide (2.2 to 2.6 nanometres), and one nucleotide unit is 3.4 Å (0.34 nm) long. The sugar-phosphate backbones are not equally spaced, resulting in major and minor grooves. One complete turn of the helix spans 3.4 nm (10 base pairs per turn).

3.6 DNA conformations
DNA exists in several possible conformations. The conformations so far identified are: A-DNA, B-DNA, C-DNA, D-DNA, E-DNA, H-DNA, L-DNA, and Z-DNA. However, only A-DNA, B-DNA, and Z-DNA are believed to be found in nature. The conformation that DNA adopts depends on the hydration level, the DNA sequence, the amount and direction of supercoiling, chemical modifications of the bases, the type and concentration of metal ions, and the presence of polyamines in solution.

4. DNA replication
DNA replication, the basis for biological inheritance, is a fundamental process by which all living organisms copy their DNA. The process is "semiconservative" in that each strand of the original double-stranded DNA molecule serves as a template for the synthesis of the complementary strand. Hence, following DNA replication, two identical DNA molecules have been produced from a single double-stranded DNA molecule. Cellular proofreading and error-checking mechanisms ensure near-perfect fidelity of DNA replication.
In a cell, DNA replication begins at specific locations in the genome, called "origins". Unwinding of the DNA at the origin and synthesis of new strands form a replication fork.

5. DNA self-correction
The rare imino form of cytosine (iC) pairs (incorrectly, as far as the cell is concerned) with adenine. Nanoseconds later, the iC converts back to the normal amino form of cytosine and no longer pairs with adenine. The mismatched base sticks out and is cleaved off by the exonuclease activity of the enzyme (DNA polymerase), leaving a free 3'-OH group to try again.

6. DNA damage
DNA can be damaged by many different sorts of mutagens. These include oxidizing agents, alkylating agents and high-energy electromagnetic radiation such as ultraviolet light and X-rays. The type of DNA damage produced depends on the type of mutagen. For example, UV light mostly damages DNA by producing thymine dimers, which are cross-links between adjacent pyrimidine bases in a DNA strand. On the other hand, oxidants such as free radicals or hydrogen peroxide produce multiple forms of damage, including base modifications, particularly of guanosine, as well as double-strand breaks. It has been estimated that in each human cell about 500 bases suffer oxidative damage per day. Of these oxidative lesions, the most damaging are double-strand breaks, as they can produce point mutations, insertions and deletions in the DNA sequence, as well as chromosomal translocations.
Many mutagens intercalate into the space between two adjacent base pairs. These molecules are mostly polycyclic, aromatic and planar, and include ethidium, proflavin, daunomycin, doxorubicin and thalidomide. DNA intercalators are used in chemotherapy to inhibit DNA replication in rapidly growing cancer cells. For an intercalator to fit between base pairs, the bases must separate, which distorts the DNA strand by unwinding the double helix. These structural modifications inhibit transcription and replication, causing both toxicity and mutations. As a result, DNA intercalators are often carcinogens, with benzopyrene diol epoxide, acridines, aflatoxin and ethidium bromide being well-known examples.

7. DNA reparation (repair)
a. PHOTOREACTIVATION (photoenzyme repair). Under the action of ultraviolet radiation, dimeric bonds appear between adjacent pyrimidine bases, most often between ..T-T.. in one chain, which prevents the formation of complementary bonds with the A nucleotides of the other chain. Under the action of visible light, photoreactivation takes place with the help of a repair enzyme. It destroys the dimeric bonds, and hydrogen bonds between the complementary nucleotides of the DNA strands form again (Fig. 2.9).
b. EXCISION REPAIR does not depend on light (dark repair). In this type of repair, the damaged part of one DNA chain is excised by an endonuclease; after that another enzyme, a repair polymerase, catalyzes the synthesis of the missing part on the remaining strand according to the principles of complementarity and antiparallelism. Then the enzyme ligase joins the free ends of the new segment to the old ones (Fig. 2.10).
If the damage cannot be removed by the mechanisms mentioned above, it is removed during recombination between two double strands of DNA. As a result, one new DNA molecule without damage is formed.

8. DNA and RNA comparison. RNA types and properties
Both DNA and RNA are composed of repeating units of nucleotides. Each nucleotide consists of a sugar, a phosphate and a nucleic acid base. The sugar in DNA is deoxyribose. The sugar in RNA is ribose, the same as deoxyribose but with one more OH (an oxygen-hydrogen combination called a hydroxyl group). This is the biggest difference between DNA and RNA. Another difference is that RNA molecules can have a much greater variety of nucleic acid bases. DNA has mostly just 4 different bases, with a few extra occasionally. The difference in these bases (between DNA and RNA) allows RNA molecules to assume a wide variety of shapes and to perform many different functions. DNA, on the other hand, serves as a set of directions and that's about all (but that's absolutely necessary!).

Messenger RNA (mRNA)
- makes up about 1% of the total weight of the cell;
- determines which amino acids, in what sequence and in what quantities, are included in a polypeptide chain;
- is a primary transcript in prokaryotes;
- is the result of processing in eukaryotes.

Transfer RNA (tRNA)
- is the shortest RNA and has a small molecular mass;
- has three levels of spatial organization;
- has an acceptor stem and an anticodon loop;
- contains modified nitrogenous bases;
- the presence of modified nitrogenous bases in the anticodon explains the mismatch between the number of codons (61) and the number of tRNA types.

Main functions of rRNA
- structural (rRNA makes up about 60% of ribosome weight; one molecule of rRNA is part of the small ribosomal subunit, and three molecules build up the large ribosomal subunit);
- initiation of protein synthesis (provides the interaction of ribosomes with certain nucleotide sequences of mRNA and tRNA);
- termination of protein synthesis (determines the end of synthesis and the release of completed protein molecules from ribosomes).

9. Gene definition and classification
A gene is a segment of DNA corresponding to a single protein (or set of alternate protein variants) or a single (catalytic or structural) RNA molecule.

9.1 Genes’ classification:
Structural genes - genes that code for the amino acid sequence of structural proteins and enzymes.
Genes of modulation - genes that reduce the activity of other genes (suppressors or inhibitors) or increase the activity of other genes (intensifiers).
Genes of control (regulation) - genes that influence the timing of other genes' activity. For example, the promoter is the part of a gene that precedes the structural genes coding for certain proteins. The promoter is recognized by RNA polymerase, which signals the start of mRNA synthesis. The terminator is a certain sequence of nucleotides at the end of a gene's structural part. It contains nonsense triplets. mRNA synthesis stops here.
RNA-coding genes (rRNA, tRNA), constitutive genes. rRNA genes (4 types) carry information about the structure of ribosomal RNA and are responsible for its synthesis. tRNA genes (more than 30 variations) carry information about tRNA structure.

9.2 Gene properties
- Discrete action;
- Stability (constancy);
- Lability (change) of genes, connected to their ability to mutate;
- Specificity: every gene is responsible for the development of a trait;
- Pleiotropy: one gene can be responsible for several traits;
- Penetrance and expressivity: the frequency with which a gene is manifested and the degree of trait expression.

10. Sequences of human genome
It is estimated that only about 5% of the human genome contains actual coding sequences. Genes for polypeptides include a leader region followed by the coding region followed by the trailer.
Leader -----> coding region -----> trailer
Intervening sequences separate genes on a strand of DNA. The leader and trailer are not translated into protein. The coding region is divided into exons and introns.
Leader -----> coding region -----> trailer
              exons --- introns
While introns are transcribed into immature RNA, they are removed from within the transcript by splicing together the exons on either side and thus do not encode amino acids.
Leader ----(intron---exon---intron---exon---intron)n----trailer
Genes have polarity, characterized by a 5’ upstream end and a 3’ downstream end. In double-stranded form the complementary strands are antiparallel.

10.1 Regulatory Sequences
A class of sequences makes up a numerically insignificant fraction of the genome but provides critical functions. For example, certain sequences indicate the beginning and end of genes, mark sites for initiating replication and recombination, or provide landing sites for proteins that turn genes on and off. Like structural genes, regulatory sequences are inherited; however, they are not commonly referred to as genes.
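
Returning to the leader / exon / intron / trailer layout of a gene described in section 10, the removal of introns can be illustrated with a minimal Python sketch. The toy transcript, the exon coordinates and the function name `splice` are all invented for illustration; in a real cell the spliceosome recognizes splice-site signals in the RNA rather than receiving a list of coordinates, and the untranslated leader and trailer are omitted here for brevity.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a transcript and drop the introns.

    exons: list of (start, end) pairs, 0-based, end exclusive,
    listed in 5' -> 3' order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuag" + "UUUGGA" + "guacgcacag" + "UAA"
exons = [(0, 6), (16, 22), (32, 35)]  # assumed coordinates for this toy example

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCUUUGGAUAA
```

Only the upper-case exon segments survive in `mature`, which mirrors the statement that introns are transcribed but do not encode amino acids.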

10.2 Repetitive sequences
HIGHLY REPETITIVE CENTROMERIC DNA - tandem repeats in the (untranscribed) heterochromatin flanking the centromeres; their function is not yet known.
One commonly occurring highly repetitive sequence is the Alu sequence (Alu element), which is about 300 base pairs long and is present in hundreds of thousands of copies scattered throughout the human genome. Each copy contains a single site for the restriction endonuclease AluI.
VNTRs - Variable Number Tandem Repeats are 1-5 kb long and consist of variable numbers of adjacent repeats. No one knows what they do, but they are highly variable among individuals. These are the fragments of DNA that are used as DNA FINGERPRINTS, and they are used extensively in criminal forensics.
SPACER DNA - regions of nontranscribed DNA between tandemly repeated genes, such as ribosomal RNA genes in eukaryotes. Its function probably has to do with ensuring the high rates of transcription associated with these genes.
These are sequences that have no product, yet serve a definite function. For example, telomere sequences allow replication without shortening of the chromosome ends.

10.3 Gene clusters
A gene cluster is a set of two or more genes that encode similar products. Because populations descended from a common ancestor tend to possess the same varieties of gene clusters, they are useful for tracing recent evolutionary history. An example of a gene cluster is the human β-globin gene cluster, which contains five functional genes and one non-functional gene for similar proteins. Hemoglobin molecules contain two identical proteins from this gene cluster, depending on their specific role.

10.4 Pseudogenes
Another class of non-coding DNA is the "pseudogene", so named because it is believed to be a remnant of a real gene that has suffered mutations and is no longer functional. Pseudogenes may have arisen through the duplication of a functional gene, followed by inactivation of one of the copies.
Comparing the presence or absence of pseudogenes is one method used by evolutionary geneticists to group species and to determine relatedness. Thus, these sequences are thought to carry a record of our evolutionary history.

10.5 Tandemly Repeated (Satellite) DNA
Satellite DNA is DNA of different density: a component of an animal's DNA that differs in density from the surrounding DNA, consists of short repeating sequences of nucleotide pairs, and does not undergo transcription.
There is remarkable variability in genome size among eukaryotes that has little correlation with organismal complexity, ploidy or number of coding genes. For example, a newt has six times the genome size of a human. Much of this variation is due to non-coding, tandemly repeated DNA. Indeed, a substantial fraction of the genomes of many eukaryotes is composed of repetitive DNA in which short sequences are tandemly repeated in small to huge arrays.
Tandemly repetitive sequences, commonly known as "satellite DNAs", are classified into three major groups:
Satellites are very highly repetitive, with repeat lengths of one to several thousand base pairs. These sequences typically are organized as large (up to 100 million bp!) clusters in the heterochromatic regions of chromosomes, near centromeres and telomeres; they are also found abundantly on the Y chromosome.
Minisatellites are moderately repetitive, tandemly repeated arrays of moderately sized (9 to 100 bp, but usually about 15 bp) repeats, generally with mean array lengths of 0.5 to 30 kb. They are found in euchromatic regions of the genomes of vertebrates, fungi and plants and are highly variable in array size.
Microsatellites are moderately repetitive and composed of arrays of short (2-6 bp) repeats found in vertebrate, insect and plant genomes. The human genome contains at least 30,000 microsatellite loci located in euchromatin.
Copy numbers are characteristically variable within a population, typically with mean array sizes on the order of 10 to 100. Microsatellites are sometimes considered a type of VNTR.
Random changes that alter the length of microsatellite DNA near the gene for the vasopressin receptor affect social behavior in male voles. A longer microsatellite region resulted in more bonding and caregiving.

11. Transposed sequences
Multiple copies of small DNA segments called TRANSPOSABLE GENETIC ELEMENTS exist throughout the genome, and some can excise and move to different positions within the genome of a single cell, a process called transposition. Their function is not known, and some suspect that they could be remnants of "parasitic" or "selfish" DNA that simply goes along for the ride without regard for the host. In the course of transposition they can cause mutations and change the amount of DNA in the genome. Transposons were once also called "jumping genes" and are examples of mobile genetic elements. They were discovered by Barbara McClintock early in her career, for which she was awarded a Nobel Prize in 1983. She noticed insertions, deletions and translocations caused by these transposons. Such changes in the genome could, for example, lead to a change in the color of corn kernels. About 50% of the total genome of maize consists of transposons.
RETROTRANSPOSONS are similar, but they have been reverse transcribed from RNA into DNA, which then inserts into the host genome. In this respect they behave somewhat like retroviruses.

11.1 Transposons causing diseases
Transposons are mutagens. They can damage the genome of their host cell in different ways:
- A transposon or a retroposon that inserts itself into a functional gene will most likely disable that gene.
- After a transposon leaves a gene, the resulting gap will probably not be repaired correctly.
Diseases that can be caused by transposons include hemophilia A and B, severe combined immunodeficiency, porphyria, predisposition to cancer, and Duchenne muscular dystrophy.
Additionally, many transposons contain promoters which drive transcription of their own transposase. These promoters can cause aberrant expression of linked genes, causing disease or mutant phenotypes.

12. Non-nuclear heredity
A genome is all the genetic information in the haploid set of chromosomes of a cell. When the first draft of the human genome sequence became available in February 2001, there was some surprise that instead of 100,000 genes only about 30,000 genes were counted. Moreover, only about 1.5% of the human genome consists of protein-coding exons, while over 50% of human DNA consists of non-coding repetitive sequences. The reasons for the presence of so much non-coding DNA in eukaryotic genomes and the extraordinary differences in genome size, or C-value, among species represent a long-standing puzzle known as the "C-value enigma".
Human Genome researchers have confirmed the existence of 19,599 protein-coding genes in the human genome and identified another 2,188 DNA segments that are predicted to be protein-coding genes. In total this gives 21,787 protein-coding genes.
Almost a quarter of the protein-coding genes are involved in expression, replication and maintenance of the genome, and another 20% specify components of the signal transduction pathways that regulate genome expression and other cellular activities in response to signals received from outside the cell. Enzymes responsible for the general biochemical functions of the cell account for another 17.5% of the known genes; the remainder are involved in activities such as transport of compounds into and out of cells, the folding of proteins into their correct three-dimensional structures, the immune response, and synthesis of structural proteins such as those found in the cytoskeleton and in muscles.
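
The gene counts and functional shares quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: the counts and percentages are taken from the text, but applying the percentages to the 21,787 total is an assumption made here for illustration (the text gives them as shares of the known genes), and "almost a quarter" is treated as exactly 25%.

```python
confirmed = 19_599  # confirmed protein-coding genes (figure from the text)
predicted = 2_188   # additional predicted protein-coding segments (from the text)
total = confirmed + predicted
print(total)  # 21787

# Approximate functional shares quoted in the text.
shares = {
    "genome expression, replication and maintenance": 0.25,  # "almost a quarter"
    "signal transduction": 0.20,
    "general biochemical enzymes": 0.175,
}

# Everything not covered by the three named categories.
remainder = 1 - sum(shares.values())

for name, share in shares.items():
    print(f"{name}: roughly {round(total * share)} genes")
print(f"all other functions: about {remainder:.1%} of genes")
```

On these assumptions a little over a third of the genes fall into the remaining categories (transport, protein folding, immune response, structural proteins).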

12.1 Mitochondrial genome
Not all genetic information is found in nuclear DNA. Both plants and animals have an organelle (a "little organ" within the cell) called the mitochondrion. Each mitochondrion has its own set of genes. Mitochondria were once independent living cells similar to today’s bacteria. More than a billion years ago, these bacteria invaded primitive amoeboid cells and established a mutually beneficial (symbiotic) relationship. So, human cells probably evolved as symbiotic cellular communities. Over time, redundant genes were lost from the bacteria and they became entirely dependent on their hosts, ceasing to exist as independent life forms. Cells often have multiple mitochondria, particularly cells requiring lots of energy, such as active muscle cells.
Unlike nuclear DNA (the DNA found within the nucleus of a cell), half of which comes from our mother and half from our father, mitochondrial DNA is inherited only from our mother. This is because the mitochondria of the embryo come from the female gamete, the egg, and not from the male gamete, the sperm. Mitochondrial DNA also does not recombine; there is no shuffling of genes from one generation to the next, as there is with nuclear genes. Large numbers of mitochondria are found in the tail of the sperm, providing it with an engine that generates the energy needed for swimming toward the egg. However, when the sperm enters the egg during fertilization, the tail falls off, taking away the father's mitochondria.
The human mitochondrial genome is a small circular DNA molecule 16,569 bp in length containing 37 genes. Twenty-four of the mitochondrial genes specify RNA molecules involved in protein synthesis, while the remaining 13 encode proteins required for the biochemical reactions that make up respiration (ATP synthesis). A number of rare diseases are caused by mutations in mitochondrial DNA, and the tissues primarily affected are those that rely most on respiration, i.e. the brain and nervous system, muscles, and the kidneys and liver. Mitochondrial diseases include Leber's hereditary optic neuropathy, in which there is loss of vision often combined with cardiac arrhythmia, and Kearns-Sayre syndrome, which involves paralysis of the eye muscles, dementia and seizures.

LECTURE 4
Of Special Course “Modern Problems of Molecular Biology” for Pharmaceutical Department students of the 2nd Year of Study

THEME: “MOLECULAR MECHANISMS OF GENE, CHROMOSOMAL AND GENOMIC MUTATIONS”

Lecture plan:
1. Mutation definition. Mutant bodies
2. Mutation origin and inheritance
3. Mutation effect on an organism
4. Beneficial mutations
5. Harmful mutations
6. Antimutagenic effect
7. Mutation types
7.1 Gene mutations
7.2 Chromosomal aberrations
7.3 Genomic mutations
8. Mosaicism
9. Copy number variation
10. Frequency of mutations

1. Mutation definition. Mutant bodies
A mutation is defined as a permanent, transmissible change in a DNA sequence away from normal that has not been repaired. Organisms in which mutations have taken place are called mutants.

2. Mutation origin and inheritance
Mutations occur naturally, caused by errors in DNA duplication, errors in processing DNA and errors in meiosis and mitosis. Physical and chemical damage can induce mutations as well, and both are used by researchers to study mutations.
If the frequency of an allele in the population is lower than 1 percent, the allele is regarded as a mutation. This implies that there is a normal allele that is prevalent in the population and that the mutation changes it to a rare and abnormal variant.
Mutations range in size from a single DNA building block (DNA base) to a large segment of a chromosome.
Mutations occur in two ways: they can be inherited from a parent or acquired during a person’s lifetime. Mutations that are passed from parent to child are called hereditary mutations or germline mutations (because they are present in the egg and sperm cells, which are also called germ cells).
This type of mutation is present throughout a person’s life in virtually every cell in the body. Mutations that occur only in an egg or sperm cell, or those that occur just after fertilization, are called new (de novo) mutations. De novo mutations may explain genetic disorders in which an affected child has a mutation in every cell, but has no family history of the disorder.
Acquired (or somatic) mutations occur in the DNA of individual cells at some time during a person’s life. These changes can be caused by environmental factors such as ultraviolet radiation from the sun, or can occur if a mistake is made as DNA copies itself during cell division. Acquired mutations in somatic cells cannot be passed on to the next generation.
Some genetic changes are very rare; others are common in the population. Genetic changes that occur in more than 1 percent of the population are called polymorphisms. They are common enough to be considered a normal variation in the DNA. Polymorphisms are responsible for many of the normal differences between people such as eye color, hair color, and blood type. Although many polymorphisms have no negative effects on a person’s health, some of these variations may influence the risk of developing certain disorders.

3. Mutation effect on an organism
The characteristics of organisms are determined by their genetic material (DNA), and random mutations (changes) in the DNA can result in slight changes in organisms. As these accumulate, organisms can change, resulting in evolution. About 90 percent of DNA is thought to be non-functional, and mutations there generally have no effect. The remaining 10 percent is functional and influences the properties of an organism, as it is used to direct the synthesis of proteins that guide the metabolism of the organism. Mutations to this 10 percent can be neutral, beneficial, or harmful. Probably less than half of the mutations to this 10 percent of DNA are neutral.
Harmful mutations result in organisms that are less likely to survive, and so these mutations tend to be eliminated from the population (the group of organisms in a species). Beneficial mutations can also be lost by chance, but this happens less often, and so they tend to be preserved. As these accumulate, the species can gradually adapt to its environment. Neutral mutations are usually lost by chance as well, but sometimes they spread to the whole population. We then say that the mutation has become fixed in the population. The rate of evolution is the rate at which mutations become fixed in the population.

4. Beneficial mutations
Although most mutations that change protein sequences are harmful, some mutations have a positive effect on an organism. In this case, the mutation may enable the mutant organism to withstand particular environmental stresses better than wild-type organisms, or to reproduce more quickly. In these cases a mutation will tend to become more common in a population through natural selection.
For example, a specific 32 base pair deletion in human CCR5 (CCR5-Δ32) confers HIV resistance on homozygotes and delays AIDS onset in heterozygotes. The CCR5 mutation is more common in those of European descent. One possible explanation for the relatively high frequency of CCR5-Δ32 in the European population is that it conferred resistance to the bubonic plague in mid-14th century Europe. People with this mutation were more likely to survive infection; thus its frequency in the population increased. This theory could explain why this mutation is not found in Africa, which the bubonic plague never reached. A newer theory suggests that the selective pressure on the CCR5-Δ32 mutation was caused by smallpox instead of the bubonic plague.

5. Harmful mutations
Changes in DNA caused by mutation can cause errors in protein sequence, creating partially or completely non-functional proteins.
To function correctly, each cell depends on thousands of proteins doing their jobs in the right places at the right times. When a mutation alters a protein that plays a critical role in the body, a medical condition can result. A condition caused by mutations in one or more genes is called a genetic disorder. Some mutations alter a gene's DNA base sequence but do not change the function of the protein made by the gene. Studies of the fly Drosophila melanogaster suggest that if a mutation does change a protein, this will probably be harmful, with about 70 percent of these mutations having damaging effects, and the remainder being either neutral or weakly beneficial. However, studies in yeast have shown that only 7% of mutations that are not in genes are harmful.
If a mutation is present in a germ cell, it can give rise to offspring that carry the mutation in all of their cells. This is the case in hereditary diseases. On the other hand, a mutation may occur in a somatic cell of an organism. Such mutations will be present in all descendants of this cell within the same organism, and certain mutations can cause the cell to become malignant, and thus cause cancer.
Often, gene mutations that could cause a genetic disorder are repaired by the DNA repair system of the cell. Each cell has a number of pathways through which enzymes recognize and repair mistakes in DNA. Because DNA can be damaged or mutated in many ways, the process of DNA repair is an important way in which the body protects itself from disease.

6. Antimutagenic effect
During evolution, nature has created several protective mechanisms that decrease the frequency of phenotypic manifestations of mutations.
I. Degeneracy of the genetic code, i.e. one amino acid is specified by several triplets.
II. Mutant genes are often recessive and do not manifest themselves in the heterozygous state, because the diploid set of chromosomes provides a normal allele of the gene.
III.
Repair of DNA structure damaged by the action of mutagens, supported by repetitions of genes, the double-stranded structure of DNA, and homologous pairs of chromosomes. Repair can occur in two ways:
Photoreactivation (photoenzymatic repair, see Chapter 2): ultraviolet radiation causes dimeric bonds to form between adjacent pyrimidine bases, most often between T and T in one strand. These T-T links prevent the formation of complementary bonds with the A nucleotides of the other strand. A light-activated enzyme splits the dimers and restores the normal bases.
Excision repair does not depend on light (dark repair). In this type of repair, the damaged part of one DNA strand is cut out by an endonuclease enzyme.

7. Mutation types
7.1 Gene mutations
Gene mutations are characterized by changes in the normal sequence of nucleotides. A normal gene and the mutant gene formed from it are allelic. There are the following types of gene mutations:
Replacement of nucleotides: one nucleotide appears instead of another. There are 2 types of replacement:
Missense mutation: a change in one DNA base pair that results in the substitution of one amino acid for another in the protein made by a gene.
Nonsense mutation: also a change in one DNA base pair. Instead of substituting one amino acid for another, however, the altered DNA sequence prematurely signals the cell to stop building a protein. This type of mutation results in a shortened protein that may function improperly or not at all.
These mutations appear constantly and are the most frequent. They are important for the evolution of nature and for the selection of practically important mutants. Sickle-cell anemia is an example of a mutation caused by nucleotide replacement. In this disease the amino acid glutamic acid is replaced by valine at the sixth position of the beta-chain of hemoglobin: the triplet GTG appears instead of the glutamic acid-coding triplet GAG. Gene mutations may or may not manifest as a change in a character, so some mutations cannot be detected.
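The distinction between missense and nonsense replacements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_substitution` is hypothetical, the codon table is a small excerpt of the standard genetic code written with DNA letters of the coding strand, and the "silent" category (a replacement that leaves the amino acid unchanged, as allowed by the degeneracy of the code) is added here for completeness.

```python
# Excerpt of the standard genetic code (coding-strand DNA codons).
# Only the codons needed for this illustration are listed.
CODON_TABLE = {
    "GAG": "Glu", "GAA": "Glu",                   # glutamic acid
    "GTG": "Val",                                 # valine
    "AAG": "Lys",                                 # lysine
    "TAG": "Stop", "TAA": "Stop", "TGA": "Stop",  # stop codons
}


def classify_substitution(normal_codon, mutant_codon):
    """Classify a single-codon replacement as silent, missense or nonsense."""
    normal_aa = CODON_TABLE[normal_codon]
    mutant_aa = CODON_TABLE[mutant_codon]
    if mutant_aa == normal_aa:
        return "silent"      # same amino acid, protein unchanged
    if mutant_aa == "Stop":
        return "nonsense"    # premature stop, shortened protein
    return "missense"        # one amino acid replaced by another


# Sickle-cell replacement: GAG (glutamic acid) becomes GTG (valine).
print(classify_substitution("GAG", "GTG"))
```

With a complete codon table the same function could be applied to any single-codon replacement.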
The degree of change in a character depends on how the structure of the polypeptide chain and the function of the protein (e.g. an enzyme) are altered. It should be taken into account that a mutation appears in only one allele of a gene, and its manifestation depends on interaction with the normal allele in heterozygotes.
Insertion: an insertion changes the number of DNA bases in a gene by adding a piece of DNA. As a result, the protein made by the gene may not function properly.
Duplication: a duplication consists of a piece of DNA that is abnormally copied one or more times. This type of mutation may alter the function of the resulting protein.
Repeat expansion: nucleotide repeats are short DNA sequences that are repeated a number of times in a row. For example, a trinucleotide repeat is made up of 3-base-pair sequences, and a tetranucleotide repeat is made up of 4-base-pair sequences. A repeat expansion is a mutation that increases the number of times that the short DNA sequence is repeated. This type of mutation can cause the resulting protein to function improperly.

7.2 Chromosomal aberrations
Duplications
A deleted chromosome fragment can attach to its homologue, thereby duplicating a region of genes on the chromosome to which it attaches. Or, as happens in some genes, a segment of the gene or chromosome undergoes multiple repetitions, so that several copies are located on the chromosome. A common duplication, the trinucleotide repeat, occurs within some abnormal genes, and is responsible for several genetic disorders, including fragile X syndrome and Huntington's disease. In vertebrates, hemoglobin genes may have evolved by duplication. Hemoglobin is composed of alpha and beta sub-units, which are coded by genes that have a similarity of 75%, suggesting a common origin by duplication followed by divergence.
Example: Fragile X, the most common inherited form of mental retardation (intellectual disability). The X chromosome of some people is unusually fragile at one tip, which is seen "hanging by a thread" under a microscope.
Most people have about 29 "repeats" at this end of their X chromosome; those with Fragile X have over 700 repeats due to duplications. In the USA it affects about 1 in 1500 males and 1 in 2500 females.

Chromosomal translocation
A gene can be transposed or translocated (moved) to a different location along the chromosome, so that the sequence might read A-B-C-E-F-G-D rather than A-B-C-D-E-F-G. Reciprocal translocations involve exchange of genes between non-homologous chromosomes. A reciprocal translocation between human chromosomes 9 and 22 causes chronic myelogenous leukemia because it interferes with a gene that controls cell division. The resulting abnormal chromosome 22 is called the Philadelphia chromosome, after the city in which the researchers who discovered this abnormality worked.
Robertsonian translocation (centric fusion): fusion of the long arms of two acrocentric chromosomes (13, 14, 15, 21, 22) into a single chromosome, with loss of the short arms at the same time. It most often occurs as 21/21, 13/14, and 14/21 translocations. Apart from being an important cause of uniparental disomy, it may cause trisomy 21 (Down's syndrome) in the offspring. Human chromosome 2 is the result of a fusion between two ancestral ape chromosomes (gorillas have 24 pairs of chromosomes).
Ring chromosomes usually occur when a chromosome breaks in two places and the ends of the chromosome arms fuse together to form a circular structure. The ring may or may not include the chromosome’s constriction point (centromere). In many cases, genetic material near the ends of the chromosome is lost.
Dicentric chromosomes: unlike normal chromosomes, which have a single constriction point (centromere), a dicentric chromosome contains two centromeres. Dicentric chromosomes result from the abnormal fusion of two chromosome pieces, each of which includes a centromere. These structures are unstable and often involve a loss of some genetic material.

7.3 Genomic mutations
Genomic mutations involve a change in the number of individual chromosomes or of whole chromosome sets.
The set of chromosomes in a sex cell (gamete) is called a genome. Genomic mutations are subdivided into:
1) Polyploidy: the state of having more than a diploid set of chromosomes (3n, triploid; 4n, tetraploid; 5n, pentaploid; 6n, hexaploid). Many plant species are polyploid.
2) Aneuploidy: a genetic change that involves the loss or gain of entire chromosomes. Due to problems in the cell division process, the replicated chromosomes may not separate into the daughter cells accurately. This can result in cells that have too many or too few chromosomes. An example of a fairly common aneuploid condition that is unrelated to cancer is Down syndrome, in which there is an extra copy of chromosome 21 in all of the cells of the affected individual.
The most common types of aneuploids are:
1. Nullisomics: a whole pair of chromosomes is missing, expressed as 2n-2.
2. Monosomics: one homologue is missing, expressed as 2n-1.
3. Trisomics: individuals with an extra chromosome. One pair has three instead of two chromosomes, expressed as 2n+1.

8. Mosaicism
Mutations may also occur in a single cell within an early embryo. As the cells divide during growth and development, the individual will have some cells with the mutation and some cells without the genetic change. This situation is called mosaicism.

9. Copy number variation
In some cases the number of copies varies, meaning that a person can be born with one, three, or more copies of particular genes. Less commonly, one or more genes may be entirely missing. This type of genetic difference is known as copy number variation (CNV). Copy number variation results from insertions, deletions, and duplications of large segments of DNA. These segments are big enough to include whole genes. Variation in gene copy number can influence the activity of genes and ultimately affect many body functions. Researchers were surprised to learn that copy number variation accounts for a significant amount of genetic difference between people.
More than 10 percent of human DNA appears to contain these differences in gene copy number. While much of this variation does not affect health or development, some differences likely influence a person’s risk of disease and response to certain drugs. Future research will focus on the consequences of copy number variation in different parts of the genome and study the contribution of these variations to many types of disease.

10. Frequency of mutations
Both genetic and non-genetic factors influence the frequency of mutations:
Properties of a given locus. It has been calculated that the average frequency of mutations is from 10⁻⁵ to 10⁻⁶. But some genes are more mutable and others are more stable. E.g., in humans the haemophilia A gene mutates with a frequency of 30-50 × 10⁻⁶ and the haemophilia B gene with a frequency of 2-3 × 10⁻⁶.
Properties of the organism in which a gene functions. It has been reported that haemophilia B mutations occur 10 times more frequently in women than in men.
Factors of a non-genetic nature. It is supposed that the number of mutations in a human being increases with age, under the influence of ultraviolet rays, etc.
The process by which genetic material changes is called mutagenesis (H. de Vries). There are spontaneous and induced mutagenesis.
Spontaneous mutagenesis (SM) is the name for changes of inherited material under natural conditions without visible causes. Causes do exist, however, and the main mechanisms are:
- errors of DNA duplication that lead to the replacement of bases in nucleotides;
- errors of DNA repair that fix the changes that have happened;
- the presence of special mutator genes that induce mutagenesis;
- the ability of separate genes to migrate;
- the accumulation in cells of certain chemical substances that later damage the genetic apparatus.
Induced mutagenesis is an artificial increase in the frequency of mutations under the influence of non-physiological doses of mutagens. Mutagens are substances causing mutations.
They are subdivided into physical, chemical and biological. Among physical mutagens, ionizing radiation takes first place. Its mutagenic influence on human beings has increased because of the wide use of radioactive substances in industry, X-ray diagnosis and medicine. The consequences of the Chernobyl accident are of great importance in this respect. The spectrum of chemical mutagens is extremely wide. These include substances from harmful chemical manufacturing, chemicals used in agriculture, household chemicals, and pharmaceuticals. Biological mutagens are viruses that are able to insert into the human genome and damage it.

LECTURE 5
Of Special Course “Modern Problems of Molecular Biology” for Pharmaceutical Department students of the 2nd Year of Study
THEME: “DNA METHODS. RECOMBINANT DNA”
Lecture Plan:
1. Recombinant DNA technology definition
2. Direct gene testing
3. DNA methods
4. Sources of the genetic material for DNA testing
5. Step 1: DNA Restriction
5.1 Creation of the recombinant DNA
6. Polymerase chain reaction (PCR)
7. Step 2: Gel electrophoresis
8. Step 3: DNA hybridization
9. Southern blot
10. Indirect gene tracking (linkage)
10.1 Restriction fragment length polymorphism
10.2 DNA chips
10.3 DNA profiling
11. Predictive Genetic Testing
12. DNA sequencing
13. The Western blot
14. The Northern blot
15. Eastern blotting
16. Southwestern blotting
17. Practical Applications of DNA Technology
18. Detection of a Gene

1. Recombinant DNA technology definition
Recombinant DNA technology is a collection of experimental techniques that allow the isolation, copying and insertion of new DNA sequences into host (recipient) cells by a number of laboratory protocols and methodologies.

2.
Direct gene testing
Direct gene testing looks at the presence or absence of a known gene mutation by examining the sequence of nucleotides in the gene. The test is very accurate and is used for diagnosis and screening, including prenatal testing, genetic carrier testing and screening, and presymptomatic and predictive testing.
Limitations include:
– Interpretation of the test result: e.g. finding that a person has a faulty gene does not always relate to how a person is, or will be, affected by that condition
– The testing may be time-consuming and expensive for the health service, if not for the patient
– For some complex conditions, e.g. cancer, the testing may have to be done on a family member with the condition to identify a family-specific mutation in the gene (mutation searching) before unaffected family members can be offered predictive testing

3. DNA methods:
DNA sequencing: determination of the sequence of bases in DNA
DNA cloning: multiplication of individual DNA fragments
Reverse transcription: production of specific DNA fragments (DNA probes) by reverse transcription from mRNA
DNA hybridization: by means of genetic engineering
Polymerase chain reaction (PCR)
FISH analysis

4. Sources of the genetic material for DNA testing
DNA to be tested can be extracted from the cells of a variety of body fluids or tissues. While the majority of tests are carried out using DNA from blood cells (lymphocytes), cells obtained from the lining of the cheek using a mouthwash or the cells in the roots of an individual’s hair may also be sources of DNA.

5. Step 1: DNA Restriction
In the 40-50 years that molecular biology has existed, scientists have used restriction mapping to analyze the structure and sequence of DNA molecules. This powerful technique involves the use of restriction digestion. Restriction digestion is the process of cutting DNA molecules into smaller pieces with special enzymes called restriction endonucleases (sometimes just called restriction enzymes or RE).
REs recognize specific DNA sequences, usually 4 to 8 bp in length (for example GATATC), wherever that sequence occurs in the DNA, and cleave at these sequences. As everyone’s DNA has some small differences, the sites may be at different places in people’s non-coding DNA, and so the enzymes will cut the DNA into fragments of different sizes in different people. Arber, Nathans and Smith were awarded the Nobel Prize in 1978 for discovering restriction enzymes and having the insight and creativity to use these enzymes to map genes. More than 900 restriction enzymes, some sequence-specific and some not, have been isolated from over 230 strains of bacteria since.
Restriction enzymes and the fragments produced by them have become powerful tools of molecular genetics. They are used to map DNA molecules physically, to analyze population polymorphisms, to rearrange DNA molecules, to prepare molecular probes, to create mutants, to analyze the modification status of the DNA, and in other applications.
There are three classes of restriction enzymes (type I, II and III). The most commonly used ones are type II enzymes. These recognize specific sequences that are 4, 5, or 6 nucleotides in length and display a twofold symmetry. Recognition sequences for many type II enzymes are the same on both strands. Such recognition sequences are said to be palindromic. Some REs cleave both strands exactly at the axis of symmetry, generating fragments of DNA that carry blunt (flush) ends (e.g. EcoRV); others cleave each strand at similar locations on opposite sides of the axis of symmetry, creating fragments of DNA that carry protruding single-stranded termini (also called “sticky” ends or overhangs).

5.1 Creation of the recombinant DNA
Restriction endonucleases make double-stranded cuts at unique DNA sequences, mostly palindromes. DNA molecules cut this way have sticky (complementary) ends and can be reannealed or spliced with other DNA molecules to produce new gene combinations, which are then sealed by DNA ligase.

6.
Polymerase chain reaction (PCR) In molecular biology, the PCR is a technique to amplify a single or few copies of a piece of DNA across several orders of magnitude, generating thousands to millions of copies of a particular DNA sequence. The method relies on thermal cycling, consisting of cycles of repeated heating and cooling of the reaction for DNA melting and enzymatic replication of the DNA. Primers (short DNA fragments) containing sequences complementary to the target region along with a DNA polymerase (after which the method is named) are key components to enable selective and repeated amplification. As PCR progresses, the DNA generated is itself used as a template for replication, setting in motion a chain reaction in which the DNA template is exponentially amplified. PCR can be extensively modified to perform a wide array of genetic manipulations. Almost all PCR applications employ a heat-stable DNA polymerase, such as Taq polymerase, an enzyme originally isolated from the bacterium Thermus aquaticus. This DNA polymerase enzymatically assembles a new DNA strand from DNA building blocks, the nucleotides, by using single-stranded DNA as a template and DNA oligonucleotides (also called DNA primers), which are required for initiation of DNA synthesis. The vast majority of PCR methods use thermal cycling, i.e., alternately heating and cooling the PCR sample to a defined series of temperature steps. These thermal cycling steps are necessary first to physically separate the two strands in a DNA double helix at a high temperature in a process called DNA melting. At a lower temperature, each strand is then used as the template in DNA synthesis by the DNA polymerase to selectively amplify the target DNA. The selectivity of PCR results from the use of primers that are complementary to the DNA region targeted for amplification under specific thermal cycling conditions. 
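The chain-reaction arithmetic described above can be made concrete with a short Python sketch. It assumes an idealized model that is not given in the text: every cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 means perfect doubling. The function name `pcr_copies` and the `efficiency` parameter are illustrative only.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle (doubling);
    lower values model incomplete amplification.
    """
    return initial_copies * (1 + efficiency) ** cycles


# Starting from a single template molecule with perfect doubling:
for n in (10, 20, 30):
    print(n, "cycles:", int(pcr_copies(1, n)), "copies")
```

Under this model 20 cycles already give about a million copies from one template, which is the "thousands to millions of copies" range mentioned above. Real reactions fall short of perfect doubling and eventually plateau as primers and nucleotides are used up.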
Developed in 1983 by Kary Mullis, PCR is now a common and often indispensable technique used in medical and biological research labs for a variety of applications. These include DNA cloning for sequencing, DNA-based phylogeny, or functional analysis of genes; the diagnosis of hereditary diseases; the identification of genetic fingerprints (used in forensic sciences and paternity testing); and the detection and diagnosis of infectious diseases. In 1993 Mullis was awarded the Nobel Prize in Chemistry for his work on PCR.

7. Step 2: Gel electrophoresis
Gel electrophoresis is a technique used for the separation of nucleic acids and proteins. The cut DNA is placed into a slab of ‘jelly’ (a gel matrix) and an electrical current is applied so that the gel has a negative (-) end at the top, where the DNA is loaded, and a positive (+) end at the bottom, just like the negative and positive ends of a battery. As DNA is a chemical which has a negative charge, it moves towards the positive end of the gel, that is, from the top to the bottom. The pieces of DNA separate on the gel according to size: the biggest pieces move the slowest and so will be closest to the top of the gel. The gel now contains all of the individual’s DNA spread from the top to the bottom of the gel.
The frictional force of the gel material acts as a "molecular sieve," separating the molecules by size. During electrophoresis, macromolecules are forced to move through the pores when the electrical current is applied. Their rate of migration through the electric field depends on the strength of the field, the size and shape of the molecules, the relative hydrophobicity of the samples, and the ionic strength and temperature of the buffer in which the molecules are moving. After staining, the separated macromolecules in each lane can be seen in a series of bands spread from one end of the gel to the other.

8.
Step 3: DNA hybridization
To select out the pieces of DNA that need to be analysed, the pieces of DNA that have spread through the gel are covered with special DNA ‘probes’. The probes have been made in the laboratory and contain a match for the DNA sequence that the test is designed to identify. The probes in fact have the opposite (complementary) letters in the genetic code sequence to the sequence in the gene or DNA segment that needs to be isolated. The two sequences match up because of the ability of the letters A and T, and C and G, to pair with each other, as shown in the Figure. The development of the probes used is critical. They can be expensive to develop and the process may take some time.
Recent developments have enabled faster testing to see if a genetic condition is due to the loss of copies of particular gene(s) (deletion) or to too many copies (duplication). The probes are produced in the form of microarrays. The same principles described above are used in microarray testing, except that the DNA from the person being tested is applied to a very small unit on which thousands of different ‘probes’ representing thousands of regions of DNA or genes have been placed. Microarrays can be built that are specific for one particular chromosome or include all of the DNA in a human cell (the genome).
An example of the result of a DNA genetic test as seen in the laboratory is shown in the Figure. There are two copies of each gene. In this case:
Person A has two faulty copies of a gene and may have the genetic condition.
Person B has one copy that is faulty and the other is working. Therefore this person is a carrier of the faulty gene.
Person C has both copies of this gene containing the right information and has normal gene function.
The DNA examination may involve the analysis of the gene itself (direct gene testing) or of short segments of the DNA close to or within a gene (indirect gene tracking or linkage).

9.
Southern blot is a method routinely used in molecular biology for detection of a specific DNA sequence in DNA samples. Southern blotting combines transfer of electrophoresis-separated DNA fragments to a filter membrane and subsequent fragment detection by probe hybridization. The method is named after its inventor, the British biologist Edwin Southern. Other blotting methods (i.e., western blot, northern blot, eastern blot, southwestern blot) that employ similar principles, but using RNA or protein, have later been named in reference to Edwin Southern's name. As the technique was eponymously named, Southern blot should be capitalized as is required for proper nouns, whereas names for other blotting methods should not. Hybridization of the probe to a specific DNA fragment on the filter membrane indicates that this fragment contains DNA sequence that is complementary to the probe. The transfer step of the DNA from the electrophoresis gel to a membrane permits easy binding of the labeled hybridization probe to the size-fractionated DNA. It also allows for the fixation of the target-probe hybrids, required for analysis by autoradiography or other detection methods. Southern blots performed with restriction enzyme-digested genomic DNA may be used to determine the number of sequences (e.g., gene copies) in a genome. A probe that hybridizes only to a single DNA segment that has not been cut by the restriction enzyme will produce a single band on a Southern blot, whereas multiple bands will likely be observed when the probe hybridizes to several highly similar sequences (e.g., those that may be the result of sequence duplication). Modification of the hybridization conditions (for example, increasing the hybridization temperature or decreasing salt concentration) may be used to increase specificity and decrease hybridization of the probe to sequences that are less than 100% similar. 
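The relationship between probe complementarity, hybridization stringency and the number of bands can be sketched as follows. This is a toy model under stated assumptions: fragments and probe are plain strings read 5' to 3', stringency is reduced to a maximum number of tolerated mismatches (`max_mismatches`), and the function names and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that pairs with seq (A with T, C with G), read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_binds(fragment, probe, max_mismatches=0):
    """True if the probe can pair somewhere on the fragment with at most
    max_mismatches mispaired bases (0 models the most stringent conditions)."""
    target = reverse_complement(probe)  # sequence the probe pairs with
    for start in range(len(fragment) - len(target) + 1):
        window = fragment[start:start + len(target)]
        mismatches = sum(1 for a, b in zip(window, target) if a != b)
        if mismatches <= max_mismatches:
            return True
    return False


def count_bands(fragments, probe, max_mismatches=0):
    """Number of size-separated fragments that would light up on the blot."""
    return sum(probe_binds(f, probe, max_mismatches) for f in fragments)


# Hypothetical fragments: an exact target, a near-identical copy, unrelated DNA.
fragments = ["TTATGGATCCGG", "CCATGGTTCCAA", "AAAAAAAAAAAA"]
probe = "GGATCCAT"
print(count_bands(fragments, probe, max_mismatches=0))  # stringent conditions
print(count_bands(fragments, probe, max_mismatches=1))  # relaxed conditions
```

In this example the stringent setting detects only the exact target, while the relaxed setting also detects the similar sequence, which mirrors the extra bands described for duplicated sequences.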
10. Indirect gene tracking (linkage)
Indirect gene tracking relies on comparing DNA markers from family members with the condition to markers in unaffected relatives. It is used in situations where the gene itself has not been precisely located or where mutation(s) in a gene have not yet been defined; the test is not as accurate as direct gene testing but can be used in diagnosis, including prenatal, presymptomatic and predictive testing.
Limitations include: it may not always be possible to find DNA markers that enable the scientists to tell the difference between the faulty gene copy and the working gene copy.

10.1 Restriction fragment length polymorphism
The term restriction fragment length polymorphism, or RFLP (commonly pronounced “rif-lip”), refers to a difference between two or more samples of homologous DNA molecules arising from differing locations of restriction sites, and to a related laboratory technique by which these segments can be distinguished. In RFLP analysis the DNA sample is broken into pieces (digested) by restriction enzymes, and the resulting restriction fragments are separated according to their lengths by gel electrophoresis and transferred to a membrane via the Southern blot procedure. Hybridization of the membrane to a labeled DNA probe then determines the length of the fragments which are complementary to the probe. An RFLP occurs when the length of a detected fragment varies between individuals. Each fragment length is considered an allele, and can be used in genetic analysis.
Although now largely obsolete, RFLP analysis was the first DNA profiling technique inexpensive enough to see widespread application. In addition to genetic fingerprinting, RFLP was an important tool in genome mapping, localization of genes for genetic disorders, determination of risk for disease, and paternity testing. Most RFLP markers are co-dominant (both alleles in a heterozygous sample will be detected) and highly locus-specific.
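How a single base change produces a fragment length polymorphism can be shown with a small Python sketch. The assumptions are: a linear DNA string, one enzyme with the recognition site GATATC cut in the middle to leave blunt ends (as EcoRV does), and two made-up alleles that differ by one base inside the second site. The function names are illustrative.

```python
def digest(dna, site="GATATC", cut_offset=3):
    """Cut a linear DNA string at every occurrence of the recognition site.

    cut_offset is the position inside the site where the strand is cut
    (3 cuts GATATC in the middle, leaving blunt ends).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def fragment_lengths(dna, site="GATATC", cut_offset=3):
    """Sorted fragment lengths, as they would be compared on a gel."""
    return sorted(len(f) for f in digest(dna, site, cut_offset))


# Two hypothetical alleles: allele B has lost the second site (A changed to T).
allele_a = "AAAA" + "GATATC" + "CCCCC" + "GATATC" + "TTTT"
allele_b = "AAAA" + "GATATC" + "CCCCC" + "GATTTC" + "TTTT"

print(fragment_lengths(allele_a))  # three fragments
print(fragment_lengths(allele_b))  # two fragments, one of them longer
```

The different sets of fragment lengths are what the text calls alleles of the RFLP marker. A real analysis would detect only the fragments that hybridize to the probe.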
RFLP analysis may be subdivided into single-locus probe (SLP) and multi-locus probe (MLP) paradigms. Usually, the SLP method is preferred over MLP because it is more sensitive, easier to interpret and capable of analyzing mixed-DNA samples. Moreover, data can be generated even when the DNA is degraded (e.g. when it is found in bone remains).
In this method of genetic testing, scientists use the fact that there are special segments of DNA that are located very close to the gene on the same chromosome. These segments nearly always travel with the gene when it is passed from parent to child: this is more likely the closer they are to the gene. These segments of DNA are called ‘polymorphic markers’: “poly” means many and “morphic” means forms. These markers are different in different families. They are a bit like flashing lights that warn the scientists that either the working copy of the gene, or the faulty copy containing the mutation, is nearby. The closer the markers are linked to the gene, the more confident the scientist can be that a marker is travelling with either the working copy or the faulty gene copy. This method of indirect gene tracking is referred to as the linkage method. The markers that are linked to the faulty or working gene copies are special to each family, so this method of genetic testing can only be done within families. Indirect gene tracking is a ‘family test’.

10.2 DNA chips
DNA chips can reveal DNA mutations and RNA expression. There are a large number of genes in eukaryotic genomes. The pattern of expression between different tissues and at different times is quite distinctive. Cells have unique sets of mRNAs. For example, early-stage skin cancer cells have a unique mRNA "fingerprint". To find these patterns, DNA sequences can be arranged in an array on a solid support. DNA chip technology provides these large arrays. By merging DNA technology with the manufacturing technology of the semiconductor industry, large arrays are being produced.
DNA chips are glass slides onto which DNA sequences are attached in a precise order. The typical slide is divided into 24 × 24 µm squares. Each square contains about 10 million copies of a particular sequence, which is up to 20 nucleotides long. A computer controls the addition of the nucleotides in a predetermined pattern. Up to 60,000 different sequences can be put on a single chip.

Cellular mRNA is isolated from cells and used to make complementary DNA (cDNA). Reverse transcriptase and PCR are used together in a process called RT-PCR. The amplified cDNA is coupled to a fluorescent dye and then hybridized to the chip. A sensitive scanner detects the spots on the array that glow. The combinations of spots that light up differ with different types of cells or different physiological states.

DNA chip technology can also be used to detect genetic variants. Twenty-nucleotide fragments representing all possible mutations of a DNA sequence are arranged on the chip. The person's DNA is then hybridized to the chip to determine whether any of it binds to a mutant sequence.

10.3 DNA profiling (also called DNA testing, DNA typing, or genetic fingerprinting) is a technique employed by forensic scientists to assist in the identification of individuals on the basis of their respective DNA profiles. DNA profiles are encoded sets of numbers that reflect a person's DNA makeup and can also be used as the person's identifier. DNA profiling should not be confused with full genome sequencing. It is used in, for example, parental testing and rape investigation.

Although 99.9% of human DNA sequences are the same in every person, enough of the DNA is different to distinguish one individual from another. DNA profiling uses repetitive ("repeat") sequences that are highly variable, called variable number tandem repeats (VNTRs). VNTR loci are very similar between closely related humans, but so variable that unrelated individuals are extremely unlikely to have the same VNTRs.
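The idea that a profile is a set of numbers derived from tandem repeats can be sketched as follows. This is only an illustration under stated assumptions: the locus names, motifs and sequences are invented, the motifs are kept much shorter than real VNTR repeat units for readability, and the helper names are hypothetical.

```python
import re


def repeat_count(locus_sequence, motif):
    """Number of copies in the longest uninterrupted run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", locus_sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


def profile(sample, motifs):
    """Map each locus to the sorted repeat counts of its two alleles."""
    return {
        locus: tuple(sorted(repeat_count(allele, motifs[locus]) for allele in alleles))
        for locus, alleles in sample.items()
    }


def make_allele(motif, copies):
    """Build an invented allele: flanking bases around `copies` repeats."""
    return "TTC" + motif * copies + "AAC"


# Hypothetical loci and repeat motifs.
motifs = {"locus_1": "GATA", "locus_2": "CAG"}

evidence = {"locus_1": [make_allele("GATA", 7), make_allele("GATA", 9)],
            "locus_2": [make_allele("CAG", 12), make_allele("CAG", 15)]}
suspect = {"locus_1": [make_allele("GATA", 9), make_allele("GATA", 7)],
           "locus_2": [make_allele("CAG", 15), make_allele("CAG", 12)]}
unrelated = {"locus_1": [make_allele("GATA", 6), make_allele("GATA", 9)],
             "locus_2": [make_allele("CAG", 12), make_allele("CAG", 20)]}

print(profile(evidence, motifs) == profile(suspect, motifs))
print(profile(evidence, motifs) == profile(unrelated, motifs))
```

Sorting the two allele counts at each locus makes the comparison independent of the order in which the alleles were recorded. A real comparison would also weigh how common each repeat count is in the population, which this sketch does not attempt.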
The DNA profiling technique was first reported in 1984 by Sir Alec Jeffreys at the University of Leicester in England, and is now the basis of several national DNA databases.

11. Predictive Genetic Testing

Sometimes the detection of the faulty gene provides the person with an increased risk estimate, rather than certainty, that they will develop a particular condition later in life. This type of direct gene testing is called predictive testing. Predictive testing is available to some families for inherited conditions such as an inherited predisposition to haemochromatosis or breast cancer.

12. DNA sequencing

Once a gene has been located precisely on a chromosome, the next steps are for scientists to determine the normal chemical structure of the gene and the changes that alter the coded message.

Maxam-Gilbert sequencing

In 1976–1977, Allan Maxam and Walter Gilbert developed a DNA sequencing method based on chemical modification of DNA and subsequent cleavage at specific bases. Although Maxam and Gilbert published their chemical sequencing method two years after the groundbreaking paper of Sanger and Coulson on plus-minus sequencing, Maxam-Gilbert sequencing rapidly became more popular, since purified DNA could be used directly, while the initial Sanger method required that each read start be cloned for production of single-stranded DNA. However, with the improvement of the chain-termination method (see below), Maxam-Gilbert sequencing has fallen out of favour because of its technical complexity, which prohibits its use in standard molecular biology kits, its extensive use of hazardous chemicals, and difficulties with scale-up.

The method requires radioactive labelling at one end and purification of the DNA fragment to be sequenced. Chemical treatment generates breaks at a small proportion of one or two of the four nucleotide bases in each of four reactions (G, A+G, C, C+T).
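How the four reactions (G, A+G, C, C+T) jointly identify each base can be shown with a small simulation. It is a simplified sketch, not a model of the chemistry: every susceptible base is assumed to yield a band, the fragment length is taken as the number of bases before the cleaved base, and the function names are hypothetical.

```python
# Bases attacked in each of the four chemical reactions.
REACTIONS = {"G": "G", "A+G": "AG", "C": "C", "C+T": "CT"}


def run_lanes(sequence):
    """Return, per reaction, the set of end-labelled fragment lengths.

    Cleaving at the base with 0-based index i leaves a labelled
    fragment holding the i bases that precede it.
    """
    return {
        name: {i for i, base in enumerate(sequence) if base in targets}
        for name, targets in REACTIONS.items()
    }


def read_gel(lanes, length):
    """Infer the sequence from which lanes show a band at each length."""
    bases = []
    for i in range(length):
        if i in lanes["G"]:
            bases.append("G")      # band in both G and A+G lanes
        elif i in lanes["A+G"]:
            bases.append("A")      # band in A+G lane only
        elif i in lanes["C"]:
            bases.append("C")      # band in both C and C+T lanes
        elif i in lanes["C+T"]:
            bases.append("T")      # band in C+T lane only
        else:
            bases.append("N")      # no band: base cannot be called
    return "".join(bases)


template = "GATTACA"  # invented example sequence
lanes = run_lanes(template)
print(read_gel(lanes, len(template)))
```

The reading rule is the useful part: a band in both the G and A+G lanes is called G, a band only in A+G is called A, and the C and C+T lanes are read in the same way for C and T.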
Thus a series of labelled fragments is generated, from the radiolabelled end to the first 'cut' site in each molecule. The fragments in the four reactions are arranged side by side in gel electrophoresis for size separation. To visualize the fragments, the gel is exposed to X-ray film for autoradiography, yielding a series of dark bands, each corresponding to a radiolabelled DNA fragment, from which the sequence may be inferred.

Also sometimes known as "chemical sequencing", this method originated in the study of DNA-protein interactions (footprinting), nucleic acid structure and epigenetic modifications to DNA, and within these fields it still has important applications.

13. The Western blot (alternatively, protein immunoblot) is an analytical technique used to detect specific proteins in a given sample of tissue homogenate or extract. It uses gel electrophoresis to separate denatured proteins by the length of the polypeptide (denaturing conditions) or native proteins by their 3-D structure (native/non-denaturing conditions). The proteins are then transferred to a membrane (typically nitrocellulose or PVDF), where they are probed (detected) using antibodies specific to the target protein.

There are now many reagent companies that specialize in providing antibodies (both monoclonal and polyclonal) against tens of thousands of different proteins. Commercial antibodies can be expensive, although the unbound antibody can be reused between experiments. This method is used in molecular biology, biochemistry, immunogenetics and related disciplines. Other related techniques include using antibodies to detect proteins in tissues and cells by immunostaining and enzyme-linked immunosorbent assay (ELISA).

The method originated from the laboratory of George Stark at Stanford. The name western blot was given to the technique by W.
Neal Burnette and is a play on the name Southern blot, a technique for DNA detection developed earlier by Edwin Southern. Detection of RNA is termed northern blotting, and the detection of post-translational modifications of proteins is termed Eastern blotting.

14. The Northern blot is a technique used in molecular biology research to study gene expression by detection of RNA (or isolated mRNA) in a sample. With northern blotting it is possible to observe cellular control over structure and function by determining particular gene expression levels during differentiation and morphogenesis, as well as in abnormal or diseased conditions. Northern blotting involves the use of electrophoresis to separate RNA samples by size, and detection with a hybridization probe complementary to part or all of the target sequence. The term 'northern blot' refers specifically to the capillary transfer of RNA from the electrophoresis gel to the blotting membrane; however, the entire process is commonly referred to as northern blotting. The northern blot technique was developed in 1977 by James Alwine, David Kemp, and George Stark at Stanford University. Northern blotting takes its name from its similarity to the first blotting technique, the Southern blot, named for biologist Edwin Southern.

15. Eastern blotting is a technique to analyze proteins, lipids, or glycoconjugates, and is most often used to detect carbohydrate epitopes. Thus, Eastern blotting can be considered an extension of the biochemical technique of western blotting that detects protein post-translational modifications (PTMs). Multiple techniques have been described by the term Eastern blotting; most use proteins or lipids blotted from an SDS-PAGE gel onto a PVDF or nitrocellulose membrane. Transferred proteins are analyzed for post-translational modifications using probes that may detect lipids, carbohydrate, phosphorylation or any other protein modification.
Eastern blotting should be used to refer to methods that detect their targets through specific interaction of the PTM and the probe, distinguishing them from a standard Far-western blot. In principle, Eastern blotting is similar to lectin blotting (i.e. detection of carbohydrate epitopes on proteins or lipids); however, the term lectin blotting is more prevalent in the literature.

16. Southwestern blotting, based along the lines of Southern blotting (which was created by Edwin Southern) and first described by B. Bowen and colleagues in 1980, is a lab technique for identifying and characterizing DNA-binding proteins (proteins that bind to DNA) by their ability to bind to specific oligonucleotide probes. The proteins undergo gel electrophoresis and are subsequently transferred to nitrocellulose membranes, as in other types of blotting. The name southwestern blotting reflects the fact that this technique detects DNA-binding proteins: DNA detection is by Southern blotting and protein detection is by western blotting. Since the first southwestern blots, many more protocols have been proposed. Earlier protocols were hampered by the large amounts of protein required and by protein degradation during isolation.

"Southwestern blot mapping" is performed for rapid characterization of both DNA-binding proteins and their specific sites on genomic DNA. Proteins are separated on a sodium dodecyl sulfate polyacrylamide gel (SDS-PAGE), renatured by removing SDS in the presence of urea, and blotted onto nitrocellulose by diffusion. The genomic DNA region of interest is digested by restriction enzymes selected to produce fragments of appropriate but different sizes, which are subsequently end-labeled and allowed to bind to the separated proteins. The specifically bound DNA is eluted from each individual protein-DNA complex and analyzed by acrylamide gel electrophoresis.
Tissue-specific DNA-binding proteins may be detected by this technique. Moreover, their sequence-specific binding allows the purification of the corresponding selectively bound DNA fragments and may improve protein-mediated cloning of DNA regulatory sequences.

17. Practical Applications of DNA Technology

1. Medical: disease often involves changes in gene expression, so DNA methods
- help to find the defective gene.
- Disease or infection diagnosis is also possible: PCR and labeled DNA probes from pathogens can help identify microbe types, and restriction fragment length polymorphism (RFLP) analysis can be used when markers are often inherited with the disease.
- Fragment analysis (DNA fingerprinting) is also used for paternity testing.
- Gene therapy: the idea is to replace defective genes via microinjection of DNA, which requires vectors.
2. Pharmaceutical products: manufactured drugs.

18. Detection of a Gene

Locating a gene (or its activity):
- Restriction maps, made via gel electrophoresis and DNA electropherograms.
- DNA fingerprinting: how to make one, illustrated by a murder case and a rape case; DNA prints in health and society; DNA forensic science.
- DNA probe hybridization: detects specific DNA with a probe.
- Comparing restriction fragments to a probe (Southern blotting): after DNA electrophoresis and blotting, a specific gene sequence can be detected in samples by its binding to labeled probes.
- DNA microarrays: monitor the expression of thousands of genes, and changes in it, by passing cDNA made from the cell's mRNA over a slide carrying ssDNA of all the cell's genes. DNA microchips are fabricated by high-speed robotics akin to Intel chip making, and the DNAs (derived from mRNAs) are fluorescently tagged so that they are easy to see in the slide's wells [microchip arrays are made simultaneously by the phosphoramidite method of Caruthers].
- Gene sequencing strategy: random fragments are sequenced and then ordered relative to each other via overlap and supercomputing.
- Sequencing methodology: the dideoxy procedure (developed by Fred Sanger).
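The gene sequencing strategy listed above, ordering random fragments by their overlaps, can be sketched with a greedy merge. This is a toy illustration on an invented sequence, assuming error-free reads from a single strand; the function names and the minimum overlap of 3 bases are assumptions, and real assemblers are far more elaborate.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(fragments, min_len=3):
    """Greedily merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining reads do not overlap; leave them as separate contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


genome = "ATGGCGTACGTTAGCCTAGGA"  # invented sequence
shotgun_reads = [genome[11:21], genome[0:10], genome[5:15]]  # shuffled, overlapping
print(assemble(shotgun_reads))
```

Greedy merging is chosen here only because it is short enough to read at a glance; it can misassemble repetitive sequences, which is one reason large projects need substantial computing power.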