DNA METHODS FOR HLA TYPING
A WORKBOOK FOR BEGINNERS

New in version 8: next generation sequencing and allele frequencies

Carolyn Katovich Hurley, Ph.D., D(ABHI)
C. W. Bill Young Marrow Donor Recruitment and Research Program
Department of Oncology, Georgetown University School of Medicine, Washington, DC 20057

Version 8
Copyright (c) 1993, 1998, 2004, 2008, 2015 by Georgetown University

PURPOSE OF THE MANUAL

The purpose of this manual is to provide the reader with the basic background needed to understand the HLA system and the molecular biology methods used to identify HLA "types". It is assumed that the reader has a college education and has taken courses in basic biology and biochemistry. The best results will be obtained if the reader starts at the beginning of the manual and follows all of the instructions in the manual. The manual is meant to be a workbook, and spaces for answers are provided in the text. Answers are provided at the end of the manual. Readers are encouraged to discuss their questions with their laboratory director.

TABLE OF CONTENTS

Transplantation and HLA Typing
1. General Concepts in Molecular Biology
2. HLA Class I Proteins and Genes
3. HLA Class II Proteins and Genes
4. HLA Alleles and Inheritance
5. HLA Types Defined by Serology
6. Gene Amplification Using the Polymerase Chain Reaction
7. Use of Oligonucleotide Probes to Detect Specific DNA Sequences
8. Sanger-Based DNA Sequencing of HLA Genes
9. Next Generation DNA Sequencing for HLA
10. Other Molecular Biology Techniques for HLA Typing
11. Interpretation of DNA Typing Results
Answers to the Questions

REFERENCE WEB SITES

Web Address                                      Content
http://hla.alleles.org                           Nomenclature
http://igdawg.org/cwd.html                       Common and well documented alleles
http://www.ebi.ac.uk/ipd/imgt/hla/               Allele history, alignments, serologic links
https://bioinformatics.bethematchclinical.org/   Allele/haplotype frequencies in US, haplotype tools, multi-allele codes
http://www.pypop.org/popdata/                    World-wide allele distribution maps

TRANSPLANTATION AND HLA TYPING

Hematopoietic stem cells found in the bone marrow differentiate to become blood cells. These cells play important roles in fighting infection (white blood cells or lymphocytes), in transporting oxygen to the tissues (red blood cells), and in causing blood clotting (platelets). Certain types of cancer affect these stem cells and can be fatal to the patient.

One therapy for blood diseases such as leukemia and aplastic anemia is bone marrow (or hematopoietic stem cell) transplantation. In this procedure, the abnormal bone marrow of the patient is destroyed by irradiation and chemotherapy. The patient (recipient) is then transfused with stem cells from another individual (donor). The transfused cells travel to the cavities in the bones and grow there, forming new blood cells.
In order for hematopoietic stem cell transplantation to be successful, the patient and the donor of the stem cells must be matched for a group of proteins called HLA molecules (or HLA antigens). This manual describes HLA molecules and discusses the techniques that are used to determine the HLA "types" of the patient and potential donors. If the patient and a potential donor have the same HLA type, the transplant has a good chance of being successful. If a patient and a donor are HLA mismatched, the donor's stem cells may be destroyed by the patient's immune system (graft rejection) or the immune system cells in the donor's stem cell preparation may attempt to destroy the patient's own cells (graft vs. host disease). These events can result in major complications or death of the patient. Transplantation of organs such as the kidney faces similar histocompatibility barriers.

This manual is designed to be used by readers who will participate in using molecular biology techniques to identify HLA types. It begins with a description of some basic molecular biology concepts.

CHAPTER 1
GENERAL CONCEPTS IN MOLECULAR BIOLOGY

The purpose of this chapter is to describe basic principles regarding DNA and to describe the methods used to study DNA.

I. Structure of DNA and RNA

A. DNA (deoxyribonucleic acid) is composed of a phosphate and sugar (deoxyribose) backbone attached to bases (Adenine, Thymine, Cytosine, Guanine) [Figure 1-1]. The combination of a base, a sugar, and a phosphate group is called a nucleotide (e.g., dATP is an abbreviation of deoxyadenosine triphosphate; dNTP is a generic abbreviation for any of the four deoxynucleoside triphosphates). Nucleotides are the basic building blocks of DNA.

B. Nucleotides are linked together to form a linear chain of nucleotides. One end of this single DNA strand is called 5'; the other end is called 3'. The two ends have different structures.
It is common practice to put the 5' nucleotide on the left side of the page when writing down a sequence of nucleotides (e.g., 5' TAAGGCT 3').

Figure 1-1. Structure of DNA. [Diagram: a nucleotide (dGTP, deoxyguanosine triphosphate) and two antiparallel strands, each with a phosphate (P) and sugar (S) backbone, joined by the base pairs G-C, A-T, T-A, and C-G.]

C. Two strands of DNA form a ladder (or double helix) by base pairing (G-C and A-T). The bases are paired through hydrogen bonds. G-C pairs form 3 hydrogen bonds while A-T pairs form 2 hydrogen bonds. This difference means that it is harder to break the bonds that hold together a G-C pair than the bonds holding together an A-T pair.

D. The two strands of DNA that form a double helix run in opposite directions. This means that the 3' end of one strand is next to the 5' end of the second strand.

E. The length of double-stranded DNA is measured in base pairs. One thousand base pairs is called a kilobase (1 kb).

F. The two strands of DNA are complementary to one another; if you are given the sequence of one strand, you automatically know the sequence of the second strand. To save space, the sequence of only one strand of the DNA, the coding strand, is reported in the literature. The sequence of the coding strand is the same sequence as found in the mRNA that is transcribed from that gene and is always written with the 5' end on the left (like a sentence). [More on mRNA in Chapter 2.]

G. RNA (ribonucleic acid) is composed of phosphate, ribose, and bases (A, C, G, Uracil). RNA containing the coding information for a protein (messenger RNA or mRNA) is single stranded.

II. Restriction endonucleases (RE)

A. Why are restriction enzymes important to understand? We no longer use RE for HLA typing, but the concept of the frequency of specific sequences in the DNA will be important for other methods of DNA typing. Also, library creation for next generation sequencing involves ligating DNA fragments together, a topic covered in this section.
Restriction endonucleases (RE) are enzymes that cut double-stranded DNA at a specific sequence of base pairs. These sequences are palindromes, that is, they read the same forwards (on the coding strand) as backwards (on the noncoding strand). For example, the restriction enzyme EcoRI cleaves DNA at the sequence:

---GAATTC--- (coding strand)
---CTTAAG--- (noncoding strand)

B. Depending on the restriction enzyme, the ends generated after cleavage are blunt (or flat) or have a 5' or 3' single strand protrusion (cohesive or sticky end).

Blunt ends are generated by the RE SmaI:

5' ----CCCGGG---- 3'   ===>   5' ----CCC 3'   5' GGG---- 3'
3' ----GGGCCC---- 5'          3' ----GGG 5'   3' CCC---- 5'

A 5' single strand protrusion is formed by the RE EcoRI:

5' ----GAATTC---- 3'   --->   5' ----G     3'   5' AATTC---- 3'
3' ----CTTAAG---- 5'          3' ----CTTAA 5'   3'     G---- 5'

═══════════════════════════════════════════════════════════════════
QUESTION 1: Look up the recognition sequence for the restriction enzyme PstI on the internet. Write out the sequence of a double stranded DNA with the restriction enzyme site in it. Label the 5' and 3' ends. Draw the fragments generated after cleavage with the 5' and 3' ends labeled. What kind of protrusion is generated?
═══════════════════════════════════════════════════════════════════

C. Cohesive-ended fragments can be ligated (or linked) to one another only if their protrusions are compatible (have complementary sequences). Any DNA fragments which have blunt ends can be ligated together.

D. The number of nucleotides defining the cleavage site of a restriction enzyme can vary. RE with shorter recognition sequences cut DNA more frequently than those with longer recognition sequences.
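The palindrome property and the effect of recognition-site length on cutting frequency can be checked with a few lines of code. The following is a minimal sketch in Python, not part of the manual: the function names `reverse_complement`, `is_palindromic_site`, and `expected_spacing` are illustrative, and the spacing estimate assumes that A, C, G, and T are equally common in the DNA.

```python
# Illustrative helpers; the names are assumptions, not from the manual.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic_site(site: str) -> bool:
    """True if both strands read the same in the 5' to 3' direction."""
    return site.upper() == reverse_complement(site)


def expected_spacing(site_length: int) -> int:
    """Average number of bases between sites, assuming each of the
    four bases occurs with probability 1/4 at every position."""
    return 4 ** site_length


# Recognition sequences quoted in the text above.
for name, site in [("EcoRI", "GAATTC"), ("SmaI", "CCCGGG")]:
    print(name, site, is_palindromic_site(site), expected_spacing(len(site)))
```

Real genomes do not have equal base content, so the spacing returned here is only a rough expectation.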
Assuming that a piece of DNA has an equal content of each base (A, C, G, and T), an RE with a 4-base recognition sequence will cleave the DNA, on average, every 4^4 (256) bases compared to every 4^6 (4096) bases for an RE with a 6-base recognition sequence.

Note: The probability of cleavage for an RE that cuts GGCT is calculated as the probability of finding that sequence in the DNA. The probability of a specific base, e.g., G, occurring is 1/4 (based on 4 bases: A, C, G, T), so the probability of the specific sequence GGCT is 1/4 x 1/4 x 1/4 x 1/4 = 1/256.

E. RE are endonucleases. If a restriction site appears at the end of a DNA fragment, the restriction enzyme will not cut the DNA.

5' GAATTCGACTGCCATA 3'
3' CTTAAGCTGACGGTAT 5'

will not be cut by EcoRI.

═══════════════════════════════════════════════════════════════════
QUESTION 2: Write down the DNA sequence cleaved by the RE NotI. Will NotI cleave genomic DNA more or less frequently than EcoRI? [Hint: You will need to look up the recognition sequence of this enzyme before you can answer the question.]
═══════════════════════════════════════════════════════════════════

III. Denaturation/Hybridization

A. Denaturation disrupts the hydrogen bonds which hold the bases and, hence, the double stranded DNA together. Denaturation can be accomplished by heating the DNA or by treating the DNA with alkali (NaOH) or polar solvents (e.g., dimethyl sulfoxide (DMSO), formamide) which break hydrogen bonds.

B. The melting temperature (Tm) is defined as the temperature at which 50% of the DNA is hybridized (i.e., found in a double stranded form) and 50% is denatured. The Tm (in °C) for a short piece of DNA can be estimated by [4 x number of G-C pairs] + [2 x number of A-T pairs]. The Tm is influenced by the base composition and the length of the double stranded DNA. Heating DNA at 94°C or higher will usually denature all double stranded DNA regardless of the length or base composition.
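The Tm estimate for a short duplex is simple enough to express as a small function. The sketch below is an illustration only; the function name `estimate_tm` is an assumption, and the rule it applies is the rough approximation given above for short pieces of DNA, not a precise thermodynamic calculation.

```python
def estimate_tm(seq: str) -> int:
    """Estimate the Tm (in degrees C) of a short duplex from one strand:
    4 degrees per G-C pair plus 2 degrees per A-T pair."""
    seq = seq.upper()
    gc_pairs = seq.count("G") + seq.count("C")
    at_pairs = seq.count("A") + seq.count("T")
    if gc_pairs + at_pairs != len(seq):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 4 * gc_pairs + 2 * at_pairs


# A 9 base pair duplex with 4 G-C pairs and 5 A-T pairs.
print(estimate_tm("AATGCGGAT"))  # 26
```

Only one strand is passed in because the second strand is fixed by base pairing, so counting G and C on one strand gives the number of G-C pairs.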
For example, the Tm of the double stranded DNA sequence:

AATGCGGAT
TTACGCCTA

is (4 x 4) + (2 x 5) = 26°C.

═══════════════════════════════════════════════════════════════════
QUESTION 3: Write out the sequence of any piece of DNA that is 18 base pairs in length and determine its approximate melting temperature. Is the melting temperature higher or lower than the example shown above? What would be the melting temperature of the 18 base pair sequence if it was made up of only G-C pairs? Only A-T pairs?
═══════════════════════════════════════════════════════════════════

C. Hybridization or reannealing of DNA regenerates the base pairs and yields double stranded DNA. The efficiency of hybridization depends on:

1. Concentration of DNA: For example, if we denature the DNA found in a human cell, it will take a long time for one strand of an HLA gene to find its complement among all the other genes present. In contrast, the strand can rapidly find its complement in a solution that contains only many copies of that HLA gene.

Figure 1-2. DNA denaturation and annealing. [Diagram: annealing under stringent and non-stringent conditions.]

2. Time that the hybridization is given to take place: The more complex the mixture of DNA, the more time a single stranded piece of DNA needs to find its match. For example, one strand of an HLA gene will have a hard time finding its complement in a mixture containing all the genes found in a human cell. The longer the hybridization time, the more likely the strands are to find one another and anneal.

3. Base composition of the DNA: Stretches of GC pairs tend to anneal more rapidly than stretches of AT pairs to form double stranded DNA. This probably happens because the formation of 3 hydrogen bonds between a single GC pair stabilizes the hybridized strands to a greater extent than the 2 hydrogen bonds formed by an AT pair. TMAC (tetramethylammonium chloride) is a chemical which causes DNA to reanneal at a rate related only to its length, so that the time of reannealing does not depend on the number of GC pairs.

D. Single strands of DNA will bind to imperfectly matched single strands under conditions of low stringency [Figure 1-2]. Under high stringency conditions, only perfect matches are found. Usually, salt concentration and temperature are used to control the stringency.

1. Temperature of the reaction: High temperatures cause the DNA strands to move more rapidly. Imperfectly matched DNA hybrids will dissociate more rapidly at high temperatures. Therefore, higher temperatures favor the generation of perfectly matched hybrids. If the temperature is too high, however, the two strands will not reanneal at all.

2. Salt concentration of the reaction mixture: High concentrations of salt allow imperfectly matched hybrids to be formed. At low salt concentrations, only perfectly matched hybrids will form.

3. Low salt and high temperature create high stringency conditions. The stringency of the match is controlled during the incubation of single stranded DNA (hybridization) or during the wash following the hybridization reaction. The temperature is the easiest parameter to adjust to obtain perfect matches. If using a short piece of single stranded DNA for hybridization, the final high stringency wash is often carried out 3-5°C below the melting temperature to keep the perfectly matched strands hybridized but to eliminate imperfectly matched hybrids [Figure 1-3].

Figure 1-3. [Diagram of the hybridization workflow: denaturation of DNA, addition of single stranded synthetic DNA (oligo), hybridization under controlled stringency, and a wash to remove imperfectly matched hybrids.]

E. Probes are used to detect specific DNA sequences by hybridization. For HLA typing, these probes are synthetic single stranded DNA (oligonucleotides) which are 12-26 nucleotides in length.
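The difference between high and low stringency can be pictured with a toy model in which a probe is allowed a fixed number of mismatched bases. The sketch below is a deliberate simplification and not a laboratory method: the function `probe_binding_sites`, its `max_mismatches` parameter, and the sequences are all hypothetical, and in practice stringency is set by temperature and salt rather than by a mismatch count.

```python
def probe_binding_sites(target: str, probe: str, max_mismatches: int = 0) -> list:
    """Return 0-based start positions in `target` where `probe` matches
    with at most `max_mismatches` mismatched bases.

    `target` is written in the same sense as the probe; physically the
    probe would pair with the complementary strand at that position.
    """
    target, probe = target.upper(), probe.upper()
    hits = []
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        mismatches = sum(1 for a, b in zip(window, probe) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Hypothetical sequences chosen only for illustration.
target = "ACGTTGCAAGGCTTAGC"
perfect_probe = "GCAAGGCT"   # matches the target exactly
variant_probe = "GCAAGACT"   # differs from the target at one base

print(probe_binding_sites(target, perfect_probe))                    # high stringency
print(probe_binding_sites(target, variant_probe))                    # high stringency
print(probe_binding_sites(target, variant_probe, max_mismatches=1))  # low stringency
```

In this model the variant probe is rejected when no mismatches are tolerated and accepted when one is tolerated, which mirrors the behavior of imperfectly matched hybrids under high and low stringency.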
Probes are often called sequence specific oligonucleotide probes (SSOP).

1. Because many copies of these probes can be added to create a high concentration, hybridization takes place rapidly.

F. One strand of the DNA is often attached to a solid support (e.g., membrane or bead) to increase the speed of hybridization and to aid in the detection of a successful hybridization reaction.

IV. Detection of hybridization

A. If the probe is labeled, hybridization can be detected.

1. Probes can be labeled using radioactive phosphate (P32) or by adding a modified or unusual nucleotide to a probe. The label is added to the probe either during or after synthesis of the probe. Methods to do this are described later.

2. Binding can be detected by autoradiography (to detect P32) or through detection of color, chemiluminescence (light emitted by a chemical reaction) or fluorescence. Detectors may be X-ray film, ELISA plate readers, fluorescence or chemiluminescence detectors, or the human eye. Figure 1-4 illustrates a detection system.

Figure 1-4. Detection of DNA hybridized to probe. [Diagram labels: probe, tag, antibody to tag, enzyme, substrate; DNA-1, DNA-2, DNA-3, and DNA-4 on a solid support (membrane); the probe hybridizes to DNA-2.]

V. Electrophoresis

A. Electrophoresis through agarose or polyacrylamide gels is one method to separate, identify, and purify DNA fragments [Figure 1-5]. DNA is negatively charged (due to the phosphate backbone) and moves away from the negative pole (cathode) and toward the positive pole (anode) in an electric field. Pieces of DNA are identified by size. Small pieces of DNA move more rapidly than large pieces of DNA. DNA is visualized on the gel by UV light if the DNA is stained with ethidium bromide or other less hazardous dyes, or by X-ray film if the DNA is labeled with a radioactive isotope.
Figure 1-5. Electrophoresis of DNA. [Diagram: DNA (negative phosphate backbone) migrating from the negative (-) pole toward the positive (+) pole through a 1.5-2% agarose gel in Tris/Acetate/EDTA or Tris/Borate/EDTA buffer, connected to a power supply.]

B. The gel itself is a matrix through which the pieces of DNA travel. Polyacrylamide-like gels are used to separate small fragments of DNA (5-500 base pairs) and are used for Sanger-based DNA sequencing or oligonucleotide purification. Agarose gels are used for DNA from 200 base pairs to 50,000 base pairs (50 kilobases). Two DNA fragments which are 50 bp and 55 bp will likely migrate at different rates and will be identified on an acrylamide gel but will migrate together (and not be resolved) on an agarose gel.

C. The percent composition of acrylamide or agarose determines how easily pieces of DNA of different sizes can travel through the matrix; therefore, the percent composition of the gel will determine the range over which DNA fragments are resolved (separated from one another).

═══════════════════════════════════════════════════════════════════
QUESTION 4: Figure 1-6 is a picture of a stained agarose gel. The lane labeled M contains commercially purchased marker DNA which contains DNA fragments that are multiples of 100 base pairs in length. The other two lanes contain pieces of DNA that have been electrophoresed in the gel. Label the positive and negative poles of the gel and indicate the direction of DNA migration. What are the approximate sizes of the two DNA fragments in the lanes?

Figure 1-6. Agarose gel. [Image: lane M (marker) with bands labeled 700, 400, and 100 BP.]
═══════════════════════════════════════════════════════════════════

VI. Synthesis of DNA

A. Enzymes called DNA polymerases synthesize single strands of DNA [Figure 1-7]. The enzymes require:

1. A single stranded DNA template to copy. The polymerases synthesize a product whose sequence is complementary to the template.
Thermostable DNA polymerase (Taq DNA polymerase) and similar enzymes require a DNA template. Denatured genomic DNA can serve as this template. When starting with an mRNA template, the enzyme reverse transcriptase copies mRNA to make complementary DNA (cDNA).

Figure 1-7. DNA synthesis. [Diagram components: single stranded DNA template, single stranded oligo primer, DNA polymerase (enzyme), and nucleotides (dNTP: N = A, C, G, T; dNTP, 2'-deoxynucleoside 5'-triphosphate).]

2. A single stranded DNA primer (a synthetic oligonucleotide) that is hybridized to the template. Polymerases usually add nucleotides on to the 3' end of a primer and extend the newly synthesized strand from 5' to 3'.

3. Nucleotides (dNTP) to form the new DNA strand.

B. If modified nucleotides are added during synthesis or if a modified (e.g., biotinylated) primer is used for synthesis, the newly synthesized DNA becomes labeled.

C. Single stranded DNA of a defined sequence (oligonucleotide) can be made in the laboratory using an automated DNA synthesizer and phosphoramidite chemistry. These pieces of DNA can be used as primers for DNA synthesis or as probes in hybridization reactions. Modified nucleotides might be used instead of A, C, G, T to provide more stability or to enhance hybridization to certain sequences.

D. DNA ligases (e.g., T4 DNA ligase) catalyze the formation of the phosphate backbone in double stranded DNA and can be used to join up DNA fragments [Figure 1-8]. For example, a piece of DNA cut with the RE EcoRI into two fragments can be put together again by incubating with a ligase.

Figure 1-8. [Diagram of enzyme activities: restriction enzyme (EcoRI) cleavage producing G / AATTC cohesive ends and ligase rejoining them; kinase adding a 5' phosphate (*P-GTATTC); terminal transferase (TdT) extending TGCTTAC to TGCTTACUUUU.]

E. Kinases (e.g., T4 polynucleotide kinase) add phosphates on to the 5' ends of DNA fragments [Figure 1-8]. This enables DNA to be ligated or attached to another piece of DNA.

F. Terminal transferase adds nucleotides to the 3' ends of DNA molecules to make a single stranded tail [Figure 1-8]. Terminal transferase can be used to create homopolymer tails (tails of all one nucleotide) for cloning or for labeling the DNA.

VII. Extraction of DNA from cells

A. Any cell with a nucleus can be used as a source of DNA [Figure 1-9]. Red blood cells do not contain nuclei; however, other cells in the blood like white blood cells are a good source of DNA. Cell lines like transformed B cells are also a good source of DNA. Because transformed cells can be grown in culture in the laboratory, they provide an inexhaustible supply of DNA.

B. Many different protocols can be used to isolate DNA from cells. One protocol uses Triton X-100 (a detergent) to lyse the cell membrane, releasing the nuclei. If the starting material was whole blood, these nuclei must then be washed extensively to wash away any hemoglobin released by the red blood cells. The heme portion of hemoglobin interferes with the gene amplification reaction used to determine HLA types.

Figure 1-9. DNA and its preparation. [Diagram: DNA is more stable than RNA and is found in all nucleated cells. Steps shown: lyse the cell and nuclear membranes; remove proteins bound to DNA; isolate DNA away from other cell components.]

C. The nuclei are lysed using another detergent, Tween-20, and the DNA is freed from the proteins bound to it by treatment with Proteinase K, an enzyme which destroys proteins.

D. After the proteins are destroyed, the Proteinase K is also destroyed by incubation of the DNA at high temperatures (90°C). It is important to destroy the Proteinase K because it can degrade the enzyme used in the HLA typing reaction.

E. Other methods may employ a solid phase to bind DNA for isolation.

F. A number of vendors sell kits to prepare DNA.

Reference: Green and Sambrook. Molecular Cloning, A Laboratory Manual, 4th edition. Cold Spring Harbor Laboratory Press.

CHAPTER 2
HLA CLASS I PROTEINS AND GENES

The purpose of this chapter is to describe an important group of HLA molecules, the class I molecules. Matching for bone marrow transplantation involves identification of these class I molecules, HLA-A, HLA-B and HLA-C. This chapter also discusses the general properties of proteins and genes.

I. HLA is an abbreviation for human leukocyte antigen. These protein molecules are expressed on the surfaces of our cells and play an important role in transplantation. Because of this, the HLA molecules of graft donor and recipient must be identified to find the best match. These molecules are often called histocompatibility molecules because of their role in compatibility of tissue.

II. Class I HLA molecules are histocompatibility molecules which are identified during HLA typing.

A. The class I molecules are found on the surface of essentially all nucleated cells in the body.

Figure 2-1. Class I protein structure. [Diagram labels: alpha chain from NH2 to COOH with alpha 1 domain, alpha 2 domain, alpha 3 domain, transmembrane region, and cytoplasmic region; beta-2 microglobulin with its own NH2 end.]

B. At least three different class I molecules, HLA-A, HLA-B, and HLA-C, are expressed on each cell of an individual. These molecules are very similar to one another. There are approximately 0.5-1 million class I molecules on the surface of a cell.

C. Class I molecules are composed of a cell membrane glycopolypeptide (44K MW, alpha or heavy chain) associated with a second polypeptide, beta-2 microglobulin (12K MW) [Figure 2-1]. Each chain is a linear string of amino acids. Each class I protein has a specific sequence of amino acids that differs from other proteins. The sequence of an HLA-A molecule is different from the sequence of an HLA-B molecule.
[Note: Class II molecules also have alpha chains but they are different in amino acid sequence from class I alpha chains. Likewise, the HLA-A, B, C alpha chains differ from one another in protein sequence.]

Table 2-1. One- and Three-Letter Codes for Amino Acids

Alanine        Ala  A     Leucine        Leu  L
Arginine       Arg  R     Lysine         Lys  K
Asparagine     Asn  N     Methionine     Met  M
Aspartic Acid  Asp  D     Phenylalanine  Phe  F
Cysteine       Cys  C     Proline        Pro  P
Glutamic Acid  Glu  E     Serine         Ser  S
Glutamine      Gln  Q     Threonine      Thr  T
Glycine        Gly  G     Tryptophan     Trp  W
Histidine      His  H     Tyrosine       Tyr  Y
Isoleucine     Ile  I     Valine         Val  V

There are 20 amino acids (e.g., lysine, serine, leucine) [Table 2-1]. To save space, publications sometimes use a single letter to refer to an amino acid. For example, S is used for serine, L for leucine, and K for lysine. One end of the polypeptide chain is called the amino-terminus (or N-terminus) and the other end is called the carboxy-terminus (or C-terminus).

D. Each class I alpha (or heavy) chain can be divided into regions: three extracellular domains (each ~100 amino acids in length), a transmembrane region, and a cytoplasmic tail. Each alpha chain is initially synthesized with a short signal peptide on its amino terminus. This peptide is removed from the polypeptide during transport to the cell surface and is not found in the mature protein.

III. Class I genes

A. Some of the information specifying the amino acid sequences of the HLA class I molecules is located on human chromosome 6 in a region called the major histocompatibility complex (MHC) [Figure 2-2]. The sequence of base pairs containing the genetic information that specifies a protein sequence defines a gene. The genes that encode the three class I 44K MW alpha chains are located next to one another in the MHC. Beta-2 microglobulin is encoded on another chromosome.

Figure 2-2. Human MHC. [Diagram: HLA class II region with DP (B1, A1), DQ (B1, A1), and DR (B1, B3, B4, B5, A) loci; HLA class I region with B, C, and A loci.]

B. The genes that encode the alpha chains of the class I proteins are located next to one another. This cluster of genes is part of the major histocompatibility gene complex (MHC). This complex encompasses approximately 3500 kb (3,500,000 base pairs) of DNA. Beta-2 microglobulin is encoded on another chromosome. When the location of a gene on a chromosome is known, that gene can also be called a locus.

C. The World Health Organization (W.H.O.) has a committee that assigns names to HLA loci. For example, HLA-A is the name of the locus that encodes the HLA-A alpha chain.

D. The information for a single gene is not found in a single stretch of base pairs but is found in multiple segments of double stranded DNA called exons [Figure 2-3]. These segments are separated by intervening segments of DNA called introns. The entire set of exons containing the coding information for an entire polypeptide chain is called a "gene" [Figure 2-3].

Figure 2-3. Class I alpha chain locus. [Diagram: DNA (exons/introns) with exon 1 (L), exons 2-4 (A1, A2, A3), exon 5 (TM), and exons 6-8 (Cyt); mRNA (cDNA/CDS) with exons 1-8 joined; polypeptide.]

IV. Transfer of information from DNA to RNA to protein.

A. The DNA encoding the HLA molecule is copied (transcribed) into mRNA. After processing, the mRNA contains only the exon sequences.

═══════════════════════════════════════════════════════════════════
QUESTION 1: Figure 2-4 lists the genomic or DNA sequence of an HLA-A gene. Figure 2-5 lists the coding sequence (CDS) (= mRNA sequence) from the same HLA-A gene. Use the information in Figure 2-5 to identify and box the exons which encode the HLA-A gene in the genomic DNA sequence in Figure 2-4. Remember that genomic DNA contains exons and introns. Hint: The vertical lines in Figure 2-4 and the shading added to Figure 2-5 will help you find the exons.
═══════════════════════════════════════════════════════════════════
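The relationship between a genomic sequence and its coding sequence can also be sketched in code. The example below is an illustration under stated assumptions, not a tool from the manual: it assumes that each exon is bracketed by a pair of vertical lines (`|`) and that the sequence starts outside an exon, the function names `extract_exons` and `coding_sequence` are hypothetical, and the short annotated sequence is invented for the example rather than taken from an HLA gene.

```python
def extract_exons(annotated: str) -> list:
    """Return the exon segments of a sequence in which every exon is
    bracketed by a pair of '|' marks.

    Assumes the sequence starts outside an exon, so that splitting on
    '|' yields segments alternating non-exon, exon, non-exon, ...
    """
    cleaned = "".join(annotated.split())  # drop spaces and line breaks
    return cleaned.split("|")[1::2]


def coding_sequence(annotated: str) -> str:
    """Join the exons in order, as in the processed mRNA."""
    return "".join(extract_exons(annotated))


# Invented toy sequence: exons in upper case, other DNA in lower case.
toy_gene = "ccag|ATGGCC|gtgagt ctcag|GCTCCC|gtaag tag|TGA|gacag"

print(extract_exons(toy_gene))
print(coding_sequence(toy_gene))
```

Working through the exercise by hand first is still worthwhile; the sketch only shows that joining the marked exons in order reproduces the coding sequence.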
Figure 2-4. Full length genomic sequence A*01:01:01:01

A*01:01:01:01 -291 -281 -271 -261 -251 -241 -231 -221 -211 -201
CAGGAGCAGA GGGGTCAGGG CGAAGTCCCA GGGCCCCAGG CGTGGCTCTC AGGGTCTCAG GCCCCGAAGG CGGTGTATGG ATTGGGGAGT CCCAGCCTTG

A*01:01:01:01 -191 -181 -171 -161 -151 -141 -131 -121 -111 -101
GGGATTCCCC AACTCCGCAG TTTCTTTTCT CCCTCTCCCA ACCTACGTAG GGTCCTTCAT CCTGGATACT CACGACGCGG ACCCAGTTCT CACTCCCATT

A*01:01:01:01 -91 -81 -71 -61 -51 -41 -31 -21 -11 -1
GGGTGTCGGG TTTCCAGAGA AGCCAATCAG TGTCGTCGCG GTCGCTGTTC TAAAGTCCGC ACGCACCCAC CGGGACTCAG ATTCTCCCCA GACGCCGAGG

A*01:01:01:01 10 20 30 40 50 60 70 80 90 100
|ATGGCCGTCA TGGCGCCCCG AACCCTCCTC CTGCTACTCT CGGGGGCCCT GGCCCTGACC CAGACCTGGG CGG|GTGAGTG CGGGGTCGGG AGGGAAACCG

A*01:01:01:01 110 120 130 140 150 160 170 180 190 200
CCTCTGCGGG GAGAAGCAAG GGGCCCTCCT GGCGGGGGCG CAGGACCGGG GGAGCCGCGC CGGGAGGAGG GTCGGGCAGG TCTCAGCCAC TGCTCGCCCC

A*01:01:01:01 210 220 230 240 250 260 270 280 290 300
CAG|GCTCCCA CTCCATGAGG TATTTCTTCA CATCCGTGTC CCGGCCCGGC CGCGGGGAGC CCCGCTTCAT CGCCGTGGGC TACGTGGACG ACACGCAGTT

A*01:01:01:01 310 320 330 340 350 360 370 380 390 400
CGTGCGGTTC GACAGCGACG CCGCGAGCCA GAAGATGGAG CCGCGGGCGC CGTGGATAGA GCAGGAGGGG CCGGAGTATT GGGACCAGGA GACACGGAAT

A*01:01:01:01 410 420 430 440 450 460 470 480 490 500
ATGAAGGCCC ACTCACAGAC TGACCGAGCG AACCTGGGGA CCCTGCGCGG CTACTACAAC CAGAGCGAGG ACG|GTGAGTG ACCCCGGCCC GGGGCGCAGG

A*01:01:01:01 510 520 530 540 550 560 570 580 590 600
TCACGACCCC TCATCCCCCA CGGACGGGCC AGGTCGCCCA CAGTCTCCGG GTCCGAGATC CACCCCGAAG CCGCGGGACT CCGAGACCCT TGTCCCGGGA

A*01:01:01:01 610 620 630 640 650 660 670 680 690 700
GAGGCCCAGG CGCCTTTACC CGGTTTCATT TTCAGTTTAG GCCAAAAATC CCCCCGGGTT GGTCGGGGCG GGGCGGGGCT CGGGGGACTG GGCTGACCGC

A*01:01:01:01 710 720 730 740 750 760 770 780 790 800
GGGGTCGGGG CCAG|GTTCTC ACACCATCCA GATAATGTAT GGCTGCGACG TGGGGCCGGA CGGGCGCTTC CTCCGCGGGT ACCGGCAGGA CGCCTACGAC

A*01:01:01:01 810 820 830 840 850 860 870 880 890 900
GGCAAGGATT ACATCGCCCT GAACGAGGAC CTGCGCTCTT GGACCGCGGC GGACATGGCA GCTCAGATCA CCAAGCGCAA GTGGGAGGCG GTCCATGCGG

A*01:01:01:01 910 920 930 940 950 960 970 980 990 1000
CGGAGCAGCG GAGAGTCTAC CTGGAGGGCC GGTGCGTGGA CGGGCTCCGC AGATACCTGG AGAACGGGAA GGAGACGCTG CAGCGCACGG |GTACCAGGGG

A*01:01:01:01 1010 1020 1030 1040 1050 1060 1070 1080 1090 1100
CCACGGGGCG CCTCCCTGAT CGCCTATAGA TCTCCCGGGC TGGCCTCCCA CAAGGAGGGG AGACAATTGG GACCAACACT AGAATATCAC CCTCCCTCTG

A*01:01:01:01 1110 1120 1130 1140 1150 1160 1170 1180 1190 1200
GTCCTGAGGG AGAGGAATCC TCCTGGGTTT CCAGATCCTG TACCAGAGAG TGACTCTGAG GTTCCGCCCT GCTCTCTGAC ACAATTAAGG GATAAAATCT

A*01:01:01:01 1210 1220 1230 1240 1250 1260 1270 1280 1290 1300
CTGAAGGAGT GACGGGAAGA CGATCCCTCG AATACTGATG AGTGGTTCCC TTTGACACCG GCAGCAGCCT TGGGCCCGTG ACTTTTCCTC TCAGGCCTTG

A*01:01:01:01 1310 1320 1330 1340 1350 1360 1370 1380 1390 1400
TTCTCTGCTT CACACTCAAT GTGTGTGGGG GTCTGAGTCC AGCACTTCTG AGTCTCTCAG CCTCCACTCA GGTCAGGACC AGAAGTCGCT GTTCCCTTCT

A*01:01:01:01 1410 1420 1430 1440 1450 1460 1470 1480 1490 1500
CAGGGAATAG AAGATTATCC CAGGTGCCTG TGTCCAGGCT GGTGTCTGGG TTCTGTGCTC TCTTCCCCAT CCCGGGTGTC CTGTCCATTC TCAAGATGGC

A*01:01:01:01 1510 1520 1530 1540 1550 1560 1570 1580 1590 1600
CACATGCGTG CTGGTGGAGT GTCCCATGAC AGATGCAAAA TGCCTGAATT TTCTGACTCT TCCCGTCAG|A CCCCCCCAAG ACACATATGA CCCACCACCC

A*01:01:01:01 1610 1620 1630 1640 1650 1660 1670 1680 1690 1700
CATCTCTGAC CATGAGGCCA CCCTGAGGTG CTGGGCCCTG GGCTTCTACC CTGCGGAGAT CACACTGACC TGGCAGCGGG ATGGGGAGGA CCAGACCCAG

A*01:01:01:01 1710 1720 1730 1740 1750 1760 1770 1780 1790 1800
GACACGGAGC TCGTGGAGAC CAGGCCTGCA GGGGATGGAA CCTTCCAGAA GTGGGCGGCT GTGGTGGTGC CTTCTGGAGA GGAGCAGAGA TACACCTGCC

A*01:01:01:01 1810 1820 1830 1840 1850 1860 1870 1880 1890 1900
ATGTGCAGCA TGAGGGTCTG CCCAAGCCCC TCACCCTGAG ATGGG|GTAAG GAGGGAGATG GGGGTGTCAT GTCTCTTAGG GAAAGCAGGA GCCTCTCTGG

A*01:01:01:01 1910 1920 1930 1940 1950 1960 1970 1980 1990 2000
AGACCTTTAG CAGGGTCAGG GCCCCTCACC TTCCCCTCTT TTCCCAG|AGC TGTCTTCCCA GCCCACCATC CCCATCGTGG GCATCATTGC TGGCCTGGTT

A*01:01:01:01 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100
CTCCTTGGAG CTGTGATCAC TGGAGCTGTG GTCGCTGCCG TGATGTGGAG GAGGAAGAGC TCAG|GTGGAG AAGGGGTGAA GGGTGGGGTC TGAGATTTCT

A*01:01:01:01 2110 2120 2130 2140 2150 2160 2170 2180 2190 2200
TGTCTCACTG AGGGTTCCAA GCCCCAGCTA GAAATGTGCC CTGTCTCATT ACTGGGAAGC ACCTTCCACA ATCATGGGCC GACCCAGCCT GGGCCCTGTG

A*01:01:01:01 2210 2220 2230 2240 2250 2260 2270 2280 2290 2300
TGCCAGCACT TACTCTTTTG TAAAGCACCT GTTAAAATGA AGGACAGATT TATCACCTTG ATTACGGCGG TGATGGGACC TGATCCCAGC AGTCACAAGT

A*01:01:01:01 2310 2320 2330 2340 2350 2360 2370 2380 2390 2400
CACAGGGGAA GGTCCCTGAG GACAGACCTC AGGAGGGCTA TTGGTCCAGG ACCCACACCT GCTTTCTTCA TGTTTCCTGA TCCCGCCCTG GGTCTGCAGT

A*01:01:01:01 2410 2420 2430 2440 2450 2460 2470 2480 2490 2500
CACACATTTC TGGAAACTTC TCTGGGGTCC AAGACTAGGA GGTTCCTCTA GGACCTTAAG GCCCTGGCTC CTTTCTGGTA TCTCACAGGA CATTTTCTTC

A*01:01:01:01 2510 2520 2530 2540 2550 2560 2570 2580 2590 2600
CCACAG|ATAG AAAAGGAGGG AGTTACACTC AGGCTGCAA|G TAAGTATGAA GGAGGCTGAT GCCTGAGGTC CTTGGGATAT TGTGTTTGGG AGCCCATGGG

A*01:01:01:01 2610 2620 2630 2640 2650 2660 2670 2680 2690 2700
GGAGCTCACC CACCCCACAA TTCCTCCTCT AGCCACATCT TCTGTGGGAT CTGACCAGGT TCTGTTTTTG TTCTACCCCA G|GCAGTGACA GTGCCCAGGG

A*01:01:01:01 2710 2720 2730 2740 2750 2760 2770 2780 2790 2800
CTCTGATGTG TCTCTCACAG CTTGTAAAG|G TGAGAGCTTG GAGGGCCTGA TGTGTGTTGG GTGTTGGGTG GAACAGTGGA CACAGCTGTG CTATGGGGTT

A*01:01:01:01 2810 2820 2830 2840 2850 2860 2870 2880 2890 2900
TCTTTGCGTT GGATGTATTG AGCATGCGAT GGGCTGTTTA AGGTGTGACC CCTCACTGTG ATGGATATGA ATTTGTTCAT GAATATTTTT TTCTATAG|TG

A*01:01:01:01 2910 2920 2930 2940 2950 2960 2970 2980 2990 3000
TGA|GACAGCT GCCTTGTGTG GGACTGAGAG GCAAGAGTTG TTCCTGCCCT TCCCTTTGTG ACTTGAAGAA CCCTGACTTT GTTTCTGCAA AGGCACCTGC

A*01:01:01:01 3010 3020 3030 3040 3050 3060 3070
3080 3090 3100 ATGTGTCTGT GTTCGTGTAG GCATAATGTG AGGAGGTGGG GAGAGCACCC CACCCCCATG TCCACCATGA CCCTCTTCCC ACGCTGACCT GTGCTCCCTC A*01:01:01:01 3110 3120 3130 3140 3150 3160 3170 3180 3190 3200 CCCAATCATC TTTCCTGTTC CAGAGAGGTG GGGCTGAGGT GTCTCCATCT CTGTCTCAAC TTCATGGTGC ACTGAGCTGT AACTTCTTCC TTCCCTATTA C.W. Bill Young Marrow Donor Recruitment and Research Program 22 Figure 2-5. Nucleotide CDS (coding sequence) A*01:01:01:01 -20 -15 -10 -5 1 ATG GCC GTC ATG GCG CCC CGA ACC CTC CTC CTG CTA CTC TCG GGG GCC CTG GCC CTG ACC CAG ACC TGG GCG G|GC A*01:01:01:01 A*01:01:01:01 5 10 15 20 25 TCC CAC TCC ATG AGG TAT TTC TTC ACA TCC GTG TCC CGG CCC GGC CGC GGG GAG CCC CGC TTC ATC GCC GTG GGC A*01:01:01:01 30 35 40 45 50 TAC GTG GAC GAC ACG CAG TTC GTG CGG TTC GAC AGC GAC GCC GCG AGC CAG AAG ATG GAG CCG CGG GCG CCG TGG A*01:01:01:01 55 60 65 70 75 ATA GAG CAG GAG GGG CCG GAG TAT TGG GAC CAG GAG ACA CGG AAT ATG AAG GCC CAC TCA CAG ACT GAC CGA GCG A*01:01:01:01 80 85 90 95 100 AAC CTG GGG ACC CTG CGC GGC TAC TAC AAC CAG AGC GAG GAC G|GT TCT CAC ACC ATC CAG ATA ATG TAT GGC TGC A*01:01:01:01 105 110 115 120 125 GAC GTG GGG CCG GAC GGG CGC TTC CTC CGC GGG TAC CGG CAG GAC GCC TAC GAC GGC AAG GAT TAC ATC GCC CTG A*01:01:01:01 130 135 140 145 150 AAC GAG GAC CTG CGC TCT TGG ACC GCG GCG GAC ATG GCA GCT CAG ATC ACC AAG CGC AAG TGG GAG GCG GTC CAT A*01:01:01:01 155 160 165 170 175 GCG GCG GAG CAG CGG AGA GTC TAC CTG GAG GGC CGG TGC GTG GAC GGG CTC CGC AGA TAC CTG GAG AAC GGG AAG A*01:01:01:01 180 185 190 195 200 GAG ACG CTG CAG CGC ACG G|AC CCC CCC AAG ACA CAT ATG ACC CAC CAC CCC ATC TCT GAC CAT GAG GCC ACC CTG A*01:01:01:01 205 210 215 220 225 AGG TGC TGG GCC CTG GGC TTC TAC CCT GCG GAG ATC ACA CTG ACC TGG CAG CGG GAT GGG GAG GAC CAG ACC CAG A*01:01:01:01 230 235 240 245 250 GAC ACG GAG CTC GTG GAG ACC AGG CCT GCA GGG GAT GGA ACC TTC CAG AAG TGG GCG GCT GTG GTG GTG CCT TCT A*01:01:01:01 255 260 265 270 275 GGA GAG GAG CAG AGA TAC ACC TGC CAT GTG CAG CAT GAG GGT CTG CCC AAG CCC CTC ACC CTG AGA 
TGG G|AG CTG
A*01:01:01:01 280 285 290 295 300
TCT TCC CAG CCC ACC ATC CCC ATC GTG GGC ATC ATT GCT GGC CTG GTT CTC CTT GGA GCT GTG ATC ACT GGA GCT
A*01:01:01:01 305 310 315 320 325
GTG GTC GCT GCC GTG ATG TGG AGG AGG AAG AGC TCA G|AT AGA AAA GGA GGG AGT TAC ACT CAG GCT GCA A|GC AGT
A*01:01:01:01 330 335 340
GAC AGT GCC CAG GGC TCT GAT GTG TCT CTC ACA GCT TGT AAA G|TG TGA

B. The mRNA is translated into protein by the ribosomes. The ribosome reads the genetic code to convert RNA sequences into a protein sequence. The genetic code consists of all possible triplet combinations of RNA bases. Each triplet (codon) specifies one amino acid. For example, the codon UCU (TCT in DNA) specifies the amino acid serine. An amino acid can be specified by more than one triplet. For example, UCC, UCA, and UCG also specify serine. The class I protein sequences start with a methionine at the amino-terminus of the signal peptide encoded by an AUG codon in the mRNA. Some triplets are called "stop codons" and identify the end of the protein sequence (carboxy-terminus). UGA is a stop codon.

C. The class I signal peptide and part of the first amino acid of the alpha-1 domain are encoded in exon 1 [Figure 2-3]. The rest of the first domain is encoded in exon 2. Exon 3 encodes the second domain and exon 4 encodes the third domain. The remainder of the exons encode the transmembrane and cytoplasmic regions.

D. Intron 1 follows exon 1, intron 2 follows exon 2 and so forth.

E. If the gene is characterized by a sequence analysis of a DNA copy of the mRNA, that sequence is called a cDNA (complementary DNA) sequence. The cDNA sequence reported in the literature has the same sequence as the mRNA and is always written 5' (on the left) to 3'.

═══════════════════════════════════════════════════════════════════
QUESTION 2: Use Figure 2-5 and Table 2-2 to translate the cDNA (mRNA) sequence for HLA-A into a protein sequence. The codon encoding the first amino acid in the leader sequence is indicated. Circle the stop codon.
═══════════════════════════════════════════════════════════════════

V. The polypeptide specified by an alpha gene associates with the polypeptide specified by the beta-2 microglobulin gene to form a class I protein [Figure 2-1]. Polypeptide is a term used to indicate that the alpha and beta chains are not usually found alone but are found in a complex (also called a heterodimer).

Table 2-2. The Alphabet of the Genetic Code (1)

UUU (2) Phe    UCU Ser    UAU Tyr       UGU Cys
UUC Phe        UCC Ser    UAC Tyr       UGC Cys
UUA Leu        UCA Ser    UAA Term (3)  UGA Term
UUG Leu        UCG Ser    UAG Term      UGG Trp
CUU Leu        CCU Pro    CAU His       CGU Arg
CUC Leu        CCC Pro    CAC His       CGC Arg
CUA Leu        CCA Pro    CAA Gln       CGA Arg
CUG Leu        CCG Pro    CAG Gln       CGG Arg
AUU Ile        ACU Thr    AAU Asn       AGU Ser
AUC Ile        ACC Thr    AAC Asn       AGC Ser
AUA Ile        ACA Thr    AAA Lys       AGA Arg
AUG Met        ACG Thr    AAG Lys       AGG Arg
GUU Val        GCU Ala    GAU Asp       GGU Gly
GUC Val        GCC Ala    GAC Asp       GGC Gly
GUA Val        GCA Ala    GAA Glu       GGA Gly
GUG Val        GCG Ala    GAG Glu       GGG Gly

(1) mRNA codons which specify a particular amino acid.
(2) U is found in mRNA; T is found in DNA.
(3) Term indicates a termination codon which halts protein synthesis.

References:
Nelson and Cox. Lehninger’s Principles of Biochemistry. 2012.
Alberts et al. Molecular Biology of the Cell. 2007.
Murphy. Janeway’s Immunobiology. 2011.

CHAPTER 3
HLA CLASS II PROTEINS AND GENES

The purpose of this chapter is to describe a second important group of HLA molecules, the class II molecules. Matching of donor and recipient for bone marrow transplantation involves identification of at least one of the class II molecules, HLA-DR. Many of the characteristics of the class II molecules are similar to the class I molecules discussed in Chapter 2.

I. Class II HLA molecules

A. These molecules are found on the surface of cells of the immune system such as B cells and dendritic cells, although other cell types may express these molecules under certain conditions (e.g., under the influence of cytokines).

[Figure 3-1. Class II Protein Structure. Labels, from the amino-termini to the carboxy-termini: alpha 1 and beta 1 domains, alpha 2 and beta 2 domains, transmembrane region, cytoplasmic region.]

B. Three different class II molecules, HLA-DR, HLA-DQ, and HLA-DP, are expressed on immune system cells by an individual. These molecules are very similar to one another. There are approximately 0.5-1 million class II molecules on the cell surface.

C. The class II molecules are cell membrane glycoproteins consisting of an alpha polypeptide chain (~34K MW) and a beta polypeptide chain (~28K MW) [Figure 3-1]. Each protein has a specific sequence of amino acids that differs from other proteins. The sequence of an HLA-DR molecule is different from the sequence of an HLA-DP molecule.

D. Each class II polypeptide chain can be divided into regions: two extracellular domains (each ~100 amino acids in length), a transmembrane region, and a cytoplasmic tail. Each polypeptide is initially synthesized with a short signal peptide on its amino terminus. This peptide is removed from the polypeptide during transport to the cell surface after synthesis and is not found in the mature protein.

II. Class II genes

A. The information specifying the amino acid sequences of the HLA class II molecules is located on human chromosome 6 [Figure 3-2]. The sequence of base pairs containing the genetic information that specifies a protein sequence defines a gene.

[Figure 3-2. Human MHC. HLA class II region: DP (B1, A1), DQ (B1, A1), DR (B1, B3, B4, B5, A). HLA class I region: B, C, A.]

B. The genes that encode the alpha and beta chains of the class II proteins are located next to one another in the MHC.

C. DQB1 is the name of the locus that encodes the DQ beta chain and DQA1 is the name of the locus that encodes the DQ alpha chain. The numbers (e.g., DQA1) were added because there are other class II-like genes in the MHC.

D. The information for a single gene is found in multiple segments of double stranded DNA called exons [Figure 3-3].

═══════════════════════════════════════════════════════════════════
QUESTION 1: Go to the website http://hla.alleles.org/ and find a copy of the current listing of the W.H.O. nomenclature for HLA alleles. What is the name of the locus that encodes the DR alpha chain?
═══════════════════════════════════════════════════════════════════

[Figure 3-3. DR Beta Chain Locus. The figure relates the DNA (exons/introns; exons 1 through 6, labeled L, B1, B2, Tm, and Cyt) to the mRNA (cDNA/CDS) and to the polypeptide.]

CHAPTER 4
HLA ALLELES AND INHERITANCE

The purpose of this chapter is to describe the HLA diversity in the human population. It is this diversity that makes it so difficult to find an unrelated donor with the same HLA alleles as a patient.

I. HLA genes are highly polymorphic (many forms). This means that, if we look at an HLA gene in unrelated individuals in a large population of humans, we will see that many people have a different DNA sequence for that gene. Each different DNA sequence is called an allele. These different gene sequences give rise to many different class I and class II allelic products (polypeptides).

A. Some loci have many alleles. For example, the DRB1 locus has over 1500 alleles. Other loci have only a few alleles. For example, DRA has seven alleles.

B. It is unlikely that two unrelated individuals carry the same class II alleles at DR, DQ and DP loci. A ballpark estimate for common HLA alleles is that only 1 person in 20,000 people from the same ethnic group will carry the same HLA alleles as another person in that group.

C. The two alleles carried by an individual are called their genotype.

[Figure 4-1]

II. In the class II alleles, the second exons, which encode the first domain of each polypeptide chain, contain the majority of the allelic differences. Allelic differences can also be found in other exons. HLA class II typing usually focuses on defining the differences in the second exon [Figure 4-1].

═══════════════════════════════════════════════════════════════════
QUESTION 1: Go to the web, http://www.ebi.ac.uk/ipd/imgt/hla/, and find a copy of the current list of the CDS (coding region) sequences of the DRB1 alleles. Practice using the alignment tool.
═══════════════════════════════════════════════════════════════════

III. The W.H.O. assigns names to HLA alleles. For example, alleles at the DRB1 locus are called: DRB1*01:01:01, DRB1*01:02:01, DRB1*04:02:01.

A. Each HLA allele is designated by the name of the gene or locus followed by an asterisk and two or more fields separated by colons [Figure 4-2]. For example, DPB1*02:01:02 is an allele of the HLA-DP B1 gene; DQB1*03:01:01:01 is an allele of the HLA-DQ B1 gene; DQA1*01:01:01 is an allele of the HLA-DQ A1 gene. The first field following the asterisk is a numerical designation often based on the serologic type of the resultant protein (explained in Chapter 5) and/or the similarity to other alleles in that group or family. The next field in the allele designation refers to the order in which that allele was discovered. The allele name should be viewed as simply a unique name for the allele and does not necessarily imply anything about its relationship to other alleles with similar names.

═══════════════════════════════════════════════════════════════════
QUESTION 2: Use http://hla.alleles.org/ to list a few alleles of the DQB1 locus.
═══════════════════════════════════════════════════════════════════

B. Sometimes shorter names are used such as DRB1*01:01.
This designation means that the typing does not distinguish among alleles whose names all start with DRB1*01:01. These alleles include DRB1*01:01:01, DRB1*01:01:02 and DRB1*01:01:03.

C. Bone marrow donor registries may use letter codes to designate subsets of HLA alleles. For example, the letters AF in DRB1*14:AF mean *14:01 or *14:09. As another example, DRB1*04:ABC means DRB1*04:03 or *04:04 or *04:06 or *04:07 or *04:08 or *04:10 or *04:11 or *04:17 or *04:19 or *04:20 or *04:23. The letter codes can be found at https://bioinformatics.bethematchclinical.org/.

D. The number of people carrying a specific HLA allele varies. These frequency differences mean that some alleles are considered common and others may have been observed in only one individual so far. A list of common alleles can be found at http://igdawg.org/cwd.html. Maps showing the world-wide distribution of some HLA alleles can be found at http://www.pypop.org/popdata/.

═══════════════════════════════════════════════════════════════════
QUESTION 3: The DRB1*01:01 allele is found in 1.9% of individuals in a population. The DRB1*03:01 allele is found in 7%. What is the probability of finding an individual from that population who carries both alleles?
═══════════════════════════════════════════════════════════════════

[Figure 4-2]

IV. DNA sequence differences between alleles can result in differences in the protein sequence or can be silent at the protein level.

A. An example of alleles that do not differ at the protein sequence level is DRB1*11:01:01 and DRB1*11:01:02. Both alleles have the same protein sequence but differ in the codons used to specify that sequence (silent or synonymous substitutions). The W.H.O. committee indicates silent changes by adding a third field to the allele name. Since the immune system detects differences in the HLA proteins, silent differences that do not change the protein sequence are not considered important in selecting an HLA matched donor.

B. N is added to indicate a null or nonexpressed allele (e.g., DRB4*01:03:01:02N). Since the N is optional, this allele is also correctly named DRB4*01:03:01:02. Null alleles are important to consider in the selection of an HLA matched donor. The fourth field of this allele name indicates that it differs from DRB4*01:03:01:01 in the intron or 5’ or 3’ regions of the gene (that is, the exon sequences are identical).

═══════════════════════════════════════════════════════════════════
QUESTION 4: Using http://www.ebi.ac.uk/ipd/imgt/hla/, translate the first 20 codons of DRB1*01:01:01 and DRB1*13:02:01 into amino acids using Table 2-2. [The web site will also convert DNA sequence to protein sequence but give it a try using the table!]
═══════════════════════════════════════════════════════════════════

═══════════════════════════════════════════════════════════════════
QUESTION 5: Write down a different DNA sequence that would encode the same polypeptide.

ACT  GGT  TAC  TTC  GAG
Thr  Gly  Tyr  Phe  Glu
T    G    Y    F    E
═══════════════════════════════════════════════════════════════════

V. Each person carries two copies of each HLA-DR, -DQ and -DP alpha and beta gene, one inherited from their mother and one from their father [Figures 4-3 and 4-4]. Furthermore, since all of the HLA genes are found on a single chromosome, each person has inherited one copy of chromosome 6 carrying one HLA gene complex from their mother and one copy carrying a second HLA gene complex from their father.

A. The HLA genes are codominantly expressed, that is, both copies encode proteins that are expressed on a single cell.

B. A person can have two identical alleles of a single gene (homozygous) or may have two different alleles of a single gene (heterozygous).

C. A parent and child share one chromosome or haplotype (haploidentical). A sibling (brother or sister) has a 1 in 4 chance of receiving the same two copies of chromosome 6 from their parents as another sibling and becoming HLA identical. Thus, the class II genes, DR, DQ, DP, are inherited as a package. Traditionally, chromosomes from the father are labeled "a" and "b" and chromosomes from the mother are labeled "c" and "d".

[Figure 4-3. Inheritance of HLA-DRB1 Locus Alleles in a Family. Genotypes shown in the figure: DRB1*01:01, DRB1*03:01; DRB1*01:01, DRB1*03:01; DRB1*01:01, DRB1*15:01; DRB1*03:01, DRB1*15:01; DRB1*03:01, DRB1*03:01; DRB1*03:01, DRB1*15:01. Figure notes: 4 possible genotypes in children; heterozygous vs homozygous; 1 in 4 chance that two sibs inherit same two alleles.]

D. Sometimes the two copies of chromosome 6, which carry the class II genes, exchange gene segments, a process called reciprocal recombination [Figure 4-4]. This exchange reshuffles the DR, DQ, DP combinations. If this happens in the germ cells (egg and sperm), that person's offspring may inherit the new combination.

═══════════════════════════════════════════════════════════════════
QUESTION 6: Father is DRB1*04:02,DRB1*11:03 and mother is DRB1*01:01,DRB1*03:01. What are the possible DRB1 allele combinations inherited by their children? Can two children be DR identical (i.e., share DRB1 alleles)? Will any of the children be homozygous?
═══════════════════════════════════════════════════════════════════

[Figure 4-4. Segregation of Haplotypes in Families.
Father (ab): A*01:01,B*07:02,DRB1*04:01 (a); A*02:01,B*08:01,DRB1*01:01 (b)
Mother (cd): A*03:01,B*15:01,DRB1*11:04 (c); A*02:01,B*53:01,DRB1*11:01 (d)
Sib 1 (ac): A*01:01,B*07:02,DRB1*04:01; A*03:01,B*15:01,DRB1*11:04
Sib 2 (bc): A*02:01,B*08:01,DRB1*01:01; A*03:01,B*15:01,DRB1*11:04
Sib 3 (ad): A*01:01,B*07:02,DRB1*04:01; A*02:01,B*53:01,DRB1*11:01
Sib 4 (bd): A*02:01,B*08:01,DRB1*01:01; A*02:01,B*53:01,DRB1*11:01
Recombinant: A*02:01,B*08:01,DRB1*01:01; A*03:01,B*53:01,DRB1*11:01
Haplotype = combination of alleles on chromosome. Haploidentical = share 1 haplotype.]

═══════════════════════════════════════════════════════════════════
QUESTION 7: Father is a: DRB1*04:03,DPB1*02:01; b: DRB1*11:01,DPB1*02:01 and mother is c: DRB1*01:01,DPB1*02:01; d: DRB1*03:02,DPB1*01:01. What are the possible DR,DP allele combinations inherited by their children? Could a child be DRB1*04:03,DPB1*02:01; DRB1*01:01,DPB1*01:01?
═══════════════════════════════════════════════════════════════════

VI. The number of genes in the MHC may vary among different individuals carrying different haplotypes or different copies of chromosome 6.

A. Some chromosomes carry only one expressed DR beta gene [Figure 4-5]. An expressed gene is one that encodes a polypeptide. The DR beta locus is called DRB1.

[Figure 4-5. DR Genes and DR Proteins. One chromosome carries a DRB1 gene and a DRA gene and encodes one DR protein; another carries a DRB1 gene, a DRB4 gene, and a DRA gene and encodes two DR proteins.]

B. Other copies of chromosome 6 carry two expressed DR beta genes. One locus is DRB1. The second beta chain locus is called DRB3 or DRB4 or DRB5. Individuals who carry haplotypes containing two DR beta loci express two different DR molecules encoded by that haplotype [Figure 4-5].

C. Each chromosome carries a single DQA1, DQB1, DPA1, and DPB1 gene and, thus, encodes a single DQ and a single DP molecule.

D. Some class II genes in the MHC are pseudogenes (e.g., DRB2) and are defective in some way. Others (e.g., DM, DO) specify proteins that are not considered important for transplantation matching.

VII. Particular HLA-DR alleles are often found together on chromosomes that carry two DR beta loci.

A. DRB1 and DRB4. Alleles whose names start with DRB1*04, DRB1*07, and DRB1*09 are usually found on the same chromosome as alleles whose names begin with DRB4 [Figure 4-6].

B. DRB1 and DRB3. Alleles whose names start with DRB1*03, DRB1*11, DRB1*12, DRB1*13, and DRB1*14 are usually found on the same chromosome as alleles whose names begin with DRB3 [Figure 4-6].

C. DRB1 and DRB5. Alleles whose names start with DRB1*15 or DRB1*16 are usually found on the same chromosome as alleles whose names begin with DRB5 [Figure 4-6].

D. These associations are common but not always found. For example, DRB1*15:01 can be found on a chromosome without a DRB5 allele and DRB5 can be found on a chromosome without a DRB1*15:01 allele.

[Figure 4-6. DR Genes and DR Proteins. Labels as printed in the figure:
DRB1 gene (DRB1*04:02), DRB3 gene (DRB3*01:01), DRA gene (DRA*01:01): DR protein and DR52 protein
DRB1 gene (DRB1*04:02), DRB4 gene (DRB4*01:01), DRA gene (DRA*01:01): DR protein and DR53 protein
DRB1 gene (DRB1*16:02), DRB5 gene (DRB5*02:01), DRA gene (DRA*01:02): DR protein and DR51 protein]

E. The names of the DRB loci (DRB1, DRB3, DRB4, DRB5) are derived from the hypothesized evolutionary origin of the DRB genes [Figure 4-7]. These genes are thought to have arisen from duplication, deletion, and diversification over evolutionary time.
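
The usual pairings listed in items A through C of section VII can be written as a small lookup table. The Python sketch below is illustrative only: the dictionary `USUAL_SECOND_DRB_LOCUS` and the function `usual_second_drb_locus` are names made up for this example and do not come from any HLA software. A returned locus is only the common expectation, since item D notes that exceptions occur, and a result of None means only that the allele family is not listed in items A through C.

```python
# Usual pairings between the first field of a DRB1 allele name and the
# second expressed DR beta locus, copied from items A through C.
# These are common associations, not rules (see item D).
USUAL_SECOND_DRB_LOCUS = {
    "04": "DRB4", "07": "DRB4", "09": "DRB4",
    "03": "DRB3", "11": "DRB3", "12": "DRB3", "13": "DRB3", "14": "DRB3",
    "15": "DRB5", "16": "DRB5",
}


def usual_second_drb_locus(drb1_allele):
    """Return the DRB locus usually found with a DRB1 allele, or None.

    None means only that the first field is not listed in items A through C.
    """
    # Split "DRB1*04:02" into the locus name and the colon-separated fields.
    locus, separator, fields = drb1_allele.partition("*")
    if locus != "DRB1" or not separator or not fields:
        raise ValueError(f"not a DRB1 allele name: {drb1_allele!r}")
    first_field = fields.split(":")[0]
    return USUAL_SECOND_DRB_LOCUS.get(first_field)


for allele in ("DRB1*04:02", "DRB1*11:01", "DRB1*16:02", "DRB1*01:01"):
    print(allele, "->", usual_second_drb_locus(allele))
```

Only the first field of the allele name is used for the lookup because the associations in the text are stated at the level of allele families such as DRB1*04.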
[Figure 4-7. Gene Duplication Generates the DR Subregion. An ancestral DRB gene is duplicated to give DRB1 and DRB2; these genes are duplicated to give DRB1, DRB2, DRB1’, DRB2’; diversification gives DRB1, DRB2, DRB3, DRB4; deletion of a gene then gives either DRB1, DRB2, DRB4 or DRB1, DRB2, DRB3.]

═══════════════════════════════════════════════════════════════════
QUESTION 8: Draw the class II loci present on the two copies of chromosome 6 from a person who carries DRB1*13:04, DRB1*04:02, DRB3*02:01, DRB4*01:01, DQA1*02:01, DQB1*02:01, DPA1*01:04, DPB1*04:01 alleles. Is the person homozygous or heterozygous for DQ and DP alleles?
═══════════════════════════════════════════════════════════════════

VIII. The class II genes are closely related to one another and share many segments of sequence. This sharing may result in difficulties in distinguishing specific HLA alleles.

A. Some alleles fall into families based on similarities in their DNA sequences. These families are designated by their similar nomenclature. For example, the alleles DRB1*08:01:01, DRB1*08:02:03, DRB1*08:03:02, and DRB1*08:04:04 are very similar in sequence to one another, as denoted by the digits in the first field of the allele name, 08.

B. Some segments of each class II gene sequence are shared by all alleles.

C. Other segments of the gene sequences are polymorphic or vary among alleles. Alleles from different allele families may share these polymorphic segments of sequence.

═══════════════════════════════════════════════════════════════════
QUESTION 9: Compare the DNA sequences of DRB1*01:02:01 and DRB5*02:02 and identify sequences that are shared. Some sequences are common to many DRB alleles; other sequence segments are found only in a few alleles. Identify the shared sequences in these two alleles. [Hint: sequences are usually compared to one sequence (e.g., DRB1*01:01) and nucleotides that are identical are indicated by a dash (-).] Compare the DNA sequences of DRB1*01:03 and DRB1*13:01:01 and identify sequences that are shared. Observe the nucleotides that differ between alleles.
═══════════════════════════════════════════════════════════════════

IX. Class I loci are also highly polymorphic.

A. In the class I alleles, the second and third exons, which encode the first and second domains of the alpha chain, contain the majority of the allelic differences.

B. The class I loci are very similar to one another. The sequences that distinguish an A locus allele from a B or C locus allele are located in the first exon (encoding the signal peptide) or in the 3' exons (encoding the transmembrane and cytoplasmic regions) or in the introns separating the exons.

C. Alleles are designated by numbers. For example, A*02:02 and A*03:01:01:01 are alleles of the HLA-A locus. B*27:02 is an allele of the HLA-B locus [Figure 4-8].

D. Null or nonexpressed alleles may be indicated by an “N” in the allele name, e.g., A*24:11N. An “L” indicates an allele which exhibits decreased protein expression. An “S” indicates that the HLA protein product is secreted. While null alleles are important to consider in matching of patient and donor, the importance of low vs. normal expression levels or secreted vs. cell surface molecule is not yet known. “Q” means that the level of expression is not known but expected to be low or absent.

[Figure 4-8]

═══════════════════════════════════════════════════════════════════
QUESTION 10: Use the W.H.O. nomenclature report to list a few alleles of the C locus.
═══════════════════════════════════════════════════════════════════

E. The HLA genes are closely related to one another and share many segments of sequence. This sharing may result in difficulties in distinguishing specific HLA class I alleles or molecules.

═══════════════════════════════════════════════════════════════════
QUESTION 11: Go to the web page: http://www.ebi.ac.uk/ipd/imgt/hla/ and find a copy of the current list of the CDS sequences of the class I alleles. Compare the nucleotide sequences of B*07:02:01, B*08:01, and B*42:01 and identify sequences that are shared between alleles.
═══════════════════════════════════════════════════════════════════

X. Each person carries two copies of each HLA-A, -B, and -C locus, one inherited from their mother and one from their father.

A. These genes are codominantly expressed, that is, both copies encode proteins that are expressed on a single cell [Figure 4-9]. If this is a cell of the immune system, it will also express all the class II alleles the cell carries.

[Figure 4-9]

B. A parent and child share one chromosome or haplotype (haploidentical). A sibling has a 1 in 4 chance of receiving the same two chromosomes (HLA identical) as another sibling.

C. Sometimes reciprocal recombination alters these allele combinations.

XI. Other class I molecules (HLA-E, HLA-G) are expressed in specific tissues at specific times. For example, HLA-G is expressed at the maternal/fetal interface. These molecules are less polymorphic than HLA-A, -B, and -C and are not considered important for HLA matching for transplantation. Other class I-like genes (e.g., HLA-Y) are pseudogenes and are defective, not expressing a protein.

References:
Fernandez Vina MA, Hollenbach JA, Lyke KE, Sztein MB, Maiers M, Klitz W, Cano P, Mack S, Single R, Brautbar C, Israel S, Raimondi E, Khoriaty E, Inati A, Andreani M, Testi M, Moraes ME, Thomson G, Stastny P, Cao K. 2012. Tracking human migrations by the analysis of the distribution of HLA alleles, lineages and haplotypes in closed and open populations. Philos Trans R Soc Lond B Biol Sci. 367(1590):820-9.
Mack SJ, Cano P, Hollenbach JA, He J, Hurley CK, Middleton D, Moraes ME, Pereira SE, Kempenich JH, Reed EF, Setterholm M, Smith AG, Tilanus MG, Torres M, Varney MD, Voorter CE, Fischer GF, Fleischhauer K, Goodridge D, Klitz W, Little AM, Maiers M, et al. 2013. Common and well-documented HLA alleles: 2012 update to the CWD catalogue. Tissue Antigens 81(4):194-203.
Parham, P., Lomen, C.E., Lawlor, D.A., Ways, J.P., Holmes, N., Coppin, H.L., Salter, R.D., Wan, A.M., Ennis, P.D. 1988. Nature of polymorphism in HLA-A, -B, -C molecules. Proc. Natl. Acad. Sci. USA 85:4005-4009.
Parham, P. and Ohta, T. Population biology of antigen presentation by MHC Class I molecules. Science 272:67-74, 1996.
Solberg OD, Mack SJ, Lancaster AK, Single RM, Tsai Y, Sanchez-Mazas A, Thomson G. 2008. Balancing selection and heterogeneity across the classical human leukocyte antigen loci: a meta-analytic review of 497 population studies. Hum Immunol. 69(7):443-64.

CHAPTER 5
HLA TYPES DEFINED BY SEROLOGY

The purpose of this chapter is to describe the terms used by HLA serologists to describe the many forms of the HLA molecules found in the human population.

I. Serology is used to identify the HLA proteins on the surfaces of cells.

A. The different forms of the HLA proteins found in the human population may be detected serologically using antibodies (human alloantisera or monoclonal antibodies) in a test called a microcytotoxicity assay [Figure 5-1].

[Figure 5-1. HLA Typing: Microcytotoxicity Assay. Labels: antibody, HLA molecule, cell, complement.]

B. The alloantisera utilized in this assay are obtained from humans who have been sensitized to foreign HLA molecules by pregnancy or previous transplant. These antibodies are used as reagents to identify serologic determinants or specificities (or HLA types) by reacting with the HLA molecules present on the cell surfaces.

C. Because humans mount an immune response to foreign HLA molecules, the HLA molecules are often called antigens. An antigen is any substance that an antibody can bind.

II. Each serologic specificity (or HLA type) is designated by a letter indicating the kind of HLA antigen (A, B, C, DR, DQ, DP) and a number. The number indicates the order in which the type was originally discovered or its relationship to other defined antigens. For example, DQ7 is a serologic specificity localized on an HLA-DQ antigen, DR4 is a serologic specificity localized on an HLA-DR antigen, and B8 is a serologic specificity localized on an HLA-B antigen.

═══════════════════════════════════════════════════════════════════
QUESTION 1: Go to the web http://hla.alleles.org/ and find a list of the accepted serologic specificities.
═══════════════════════════════════════════════════════════════════

III. Broad antigens and splits

A. Some antibodies define clusters of HLA proteins (broad antigens) that are similar and that are indistinguishable using a particular antibody preparation [Figure 5-2]. For example, the A2 specificity is likely found on over 200 different HLA-A molecules.

B. Other more specific antibodies may allow the definition of subdivisions or "splits" of broad antigens. A nontechnical example of splits which may be clearer is that a stranger may recognize you as a member of the Jones family (broad specificity) but a friend may recognize you as Mary Jones (a split).

[Figure 5-2]

═══════════════════════════════════════════════════════════════════
EXAMPLE: Antibodies define A9, a broad antigen [Figure 5-2]. A9 was the 9th HLA-A specificity described. The splits of A9 have been labeled A23 and A24. Thus, the specificity A23 is often called a split of the broad specificity A9 and can also be written as A23(9).
Because these serologic specificities were named as they were discovered, the splits of A9 were designated A23 and A24 because they were the 23rd and 24th HLA-A antigens discovered. An individual who types as A24 using A24-specific antibodies would also type as A9 using A9-specific antibodies but would not type as A23.
═══════════════════════════════════════════════════════════════════

    C. Different HLA alleles defined by DNA typing can specify HLA proteins which are indistinguishable using serologic typing. For example, an individual carrying the DRB1*04:01:01 allele would have the same serologic type (DR4) as an individual carrying the DRB1*04:12 allele. Thus, DRB1*04:01:01 and DRB1*04:12 are splits of the broad specificity DR4. These splits are identified by DNA typing but, because we do not have serologic reagents specific enough to define this split, we cannot serologically distinguish between the antigens specified by the two alleles, DRB1*04:01:01 and DRB1*04:12.

═══════════════════════════════════════════════════════════════════
QUESTION 2: Go to http://www.ebi.ac.uk/ipd/imgt/hla/ and look under HLA dictionary. Look up the tables that list the serologic specificities of HLA-A, -B, and -DR and link them to the alleles.
═══════════════════════════════════════════════════════════════════

    D. Although HLA types have been defined using serology for many years, the available antibodies lack the resolution required to identify all of the specific products of the HLA alleles [Figure 5-3]. For example, serology cannot determine whether a donor and recipient who are typed as DR4 carry the same alleles of DR4. This is one reason that, in some situations such as typing for bone marrow transplantation, serology has been replaced by DNA-based typing methods.

═══════════════════════════════════════════════════════════════════
QUESTION 3: Do the donor and recipient below carry the same DR alleles?
Donor: DR8, DR3
Recipient: DR8, DR3
═══════════════════════════════════════════════════════════════════

[Figure 5-3]

═══════════════════════════════════════════════════════════════════
QUESTION 4: Father is DR4, DR11 and mother is DR1, DR3. What are the possible DR types inherited by their children? What is one possible set of DR alleles that might be found in the family?
═══════════════════════════════════════════════════════════════════

    E. Cellular assays such as the mixed lymphocyte culture (MLC) measure the differences in class II proteins between individuals. A cellular assay is more sensitive in detecting HLA differences than serologic typing since even a single amino acid difference can cause stimulation of white blood cells. Because of the difficulty in generating and maintaining cellular reagents and because of difficulties in interpreting experimental results, these types of assays have been replaced by DNA-based typing methods.

IV. The HLA genes encode specific HLA antigens.
    A. The DR molecules encoded by the DRA and DRB1 loci carry the DR1, DR15, DR16, DR3, DR4, etc. serologic specificities [Figure 5-4].
    [Figure 5-4]
    B. Molecules encoded by the DRA and DRB3 loci carry the DR52 serological specificity. Molecules encoded by the DRA and DRB4 loci carry the DR53 serological specificity. Molecules encoded by the DRA and DRB5 loci carry the DR51 serological specificity.
    C. Molecules encoded by the DQA1 and DQB1 loci carry DQ serological specificities. Molecules encoded by the DPA1 and DPB1 loci carry DP serological specificities. [Note: There are very few alloantisera that detect DP types.]
    D. Molecules encoded by the A, B, and C loci carry HLA-A, HLA-B, and HLA-C serologic specificities. HLA-C is poorly defined by serologic reagents.
    E. Individuals who carry haplotypes (or chromosomes) with two DR beta loci express two different DR molecules specified by that haplotype and, since the allele associations are usually fixed, the combination of DR molecules can usually be predicted [Figure 5-4]. [Review Chapter 4.]
        1. Individuals who express a DR4 molecule usually also express a DR53 molecule and carry a DR4 beta chain allele (e.g., DRB1*04:02) and a DR53 beta chain allele (e.g., DRB4*01:01:01:01) [Figure 5-4]. This is also true for DR7 and DR9.
        2. Individuals who express a DR3 molecule usually also express a DR52 molecule and carry a DR3 beta chain allele (e.g., DRB1*03:01:01) and a DR52 beta chain allele (e.g., DRB3*01:01:02:01). This is also true for DR11, DR12, DR13, and DR14.
        3. Individuals who express a DR15 molecule usually also express a DR51 molecule and carry a DR15 beta chain allele (e.g., DRB1*15:01:01) and a DR51 beta chain allele (e.g., DRB5*01:01:01). This is also true for DR16.

═══════════════════════════════════════════════════════════════════
QUESTION 5: Draw the DR loci present in the MHC of a person who is serologically typed as DR11, DR52. List one set of possible alleles. [Hint: There is more than one possible allele for each locus; just pick one.]
═══════════════════════════════════════════════════════════════════

    F. There are exceptions to these associations. For example, some DR7 haplotypes carry a DRB4 allele but it is not expressed as a DR53 molecule on the cell surface because of a defect in the gene (e.g., a termination codon halting protein synthesis). Such a null allele is sometimes labeled with an "N" (e.g., DRB4*01:03:01:02N).

═══════════════════════════════════════════════════════════════════
QUESTION 6: Father expresses HLA-A2,A3,B27,B53 and mother expresses HLA-A2,A11,B51,B71. What are the possible HLA-A and -B types that their children might express? One of the children expresses HLA-A2,B27,B71. What are the most likely haplotypes carried by the parents?
═══════════════════════════════════════════════════════════════════

V. Complications.
    A. Unfortunately, the serologic types associated with some HLA alleles are not yet known. For example, the serologic type specified by B*08:08 is not known.
    B. In some cases, the name of the allele does not reflect its serologic type. For example, the HLA molecule specified by B*50:02 is serologically typed as B45.

═══════════════════════════════════════════════════════════════════
QUESTION 7: Using the DNA dictionary, what is the serologic type assigned to the following alleles: A*68:01:01, B*13:01, B*18:04, DRB1*03:01:01, DRB1*11:22?
═══════════════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════════════
QUESTION 8: A patient carries A*02:01:01:01 and A*24:09N alleles. What is the serologic type of this patient? Would a donor serologically typed as A2, A24 be a match for this patient?
═══════════════════════════════════════════════════════════════════

References:
HLA websites with serology information:
http://www.ebi.ac.uk/ipd/imgt/hla/
http://hla.alleles.org/

CHAPTER 6
GENE AMPLIFICATION USING THE POLYMERASE CHAIN REACTION

The polymerase chain reaction (PCR) is a rapid way of isolating large quantities of specific HLA genes for HLA typing [Figure 6-1]. Using this method, we can generate millions of copies of a specific gene.

[Figure 6-1: PCR amplification for identification of HLA alleles. Panels: before and after amplification. Labels: HLA gene, other genes.]

I. PCR is a method of DNA synthesis [Figure 1-7] using:
    A. A DNA polymerase that is not destroyed when heated (thermostable): Taq polymerase. This enzyme comes from bacteria that live in hot springs.
    B. DNA template: Crude cell lysates containing heat-denatured DNA can be used, although better results are obtained with DNA that has been more extensively purified.
    C. Nucleotides.
    D. Primers.
        1. One of the advantages of PCR is that it uses two primers so that both DNA strands are copied. The two single-stranded primers must flank the region to be amplified. Since most of the polymorphism of the class II genes is found in the second exon, primers usually are designed to flank this region [Figure 6-2]. Primers for class I alleles usually flank the polymorphism found in exons 2 and 3.
        [Figure 6-2: PCR primers amplify one part of an HLA gene. Class II gene; labels: L, B1, B2, Tm, Cyt; exons 1-6.]
        2. The primers hybridize to opposite strands of the DNA ladder. One primer is called a forward (or 5' or sense or coding) primer and the other primer is called a reverse (or 3' or antisense or noncoding) primer. Remember the discussion in Chapter 1 about DNA synthesis.
        3. The primer sequences must complement (match) their target sequence and be sufficiently long (20-30 nucleotides) to bind to only the HLA gene that you want to amplify.
        4. A primer should not contain any stretches of sequence that would anneal to the other primer (form primer dimers). For example, 5' AGCACTTTT and 5' TCCATAAAA would not be a good choice because the stretches of Ts and As would anneal. The polymerase would add complementary nucleotides to make a short double-stranded DNA called a primer dimer.

            5' AGCACTTTT 3'
                 3' AAAATACCT 5'
            ===>
            5' AGCACTTTTATGGA 3'
            3' TCGTGAAAATACCT 5'

II. DNA is amplified by using a three-step procedure:
    A. DNA denaturation (94-96°C) to generate a single-stranded template [Figure 6-3].
    [Figure 6-3]
    B. Annealing of the primers (45-65°C) using hybridization conditions chosen so that the primers bind to perfectly matched sequences (the target sequence) and not to mismatched sequences [Figure 6-4]. The temperature of annealing controls, in part, the specificity of the amplification [Hint: Remember Chapter 1]. The higher the temperature, the more specific the amplification until you get to the melting temperature of the primer. At that temperature, amplification does not work very well, if at all.

    [Figure 6-4: PCR, anneal primers at 50-65°C. Annealing temperature depends on the length and G+C content of the primer (i.e., its melting temperature).]

        5' GGGGTGCCCCCCCCTTTTTTTGAAAAA 3'
                             3' CTTTTT 5'
        5' GGGGT 3'
        3' CCCCACGGGGGGGGAAAAAAACTTTTT 5'

    [Figure 6-5: PCR, extension at 72°C (same template and primers as Figure 6-4). Use the temperature optimal for the polymerase.]

    C. Extension (synthesis of DNA) (around 72°C) [Figure 6-5].
    D. At the end of the first cycle, the amount of target DNA is doubled [Figure 6-6].

    [Figure 6-6: PCR, end of 1st cycle. Note: primers become part of the newly synthesized DNA.]

        5' GGGGTGCCCCCCCCTTTTTTTGAAAAA 3'
        3' CCCCACGGGGGGGGAAAAAAACTTTTT 5'

        3' CCCCACGGGGGGGGAAAAAAACTTTTT 5'
        5' GGGGTGCCCCCCCCTTTTTTTGAAAAA 3'

    E. The three steps are repeated over and over by simply changing the temperature of the reaction mix using an instrument called a thermal cycler. The newly synthesized strands serve as templates for synthesis in the next cycle (another advantage of PCR). Usually 25-30 cycles of amplification are carried out to yield millions of copies of the gene of interest.
    F. Figure 6-7 illustrates how the reaction yields an amplicon of defined length after many cycles. Question 2 gives you a chance to demonstrate this to yourself.
    [Figure 6-7]
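The dependence of annealing temperature on primer length and G+C content, and the doubling of target DNA in each cycle, can be sketched in a few lines of Python. This is a minimal illustration and not part of the original workbook. The function names are hypothetical; the melting temperature uses a common rule of thumb for short oligonucleotides (about 2°C for each A or T and 4°C for each G or C), which may differ from the formula used in Chapter 1; and the copy count assumes that every cycle doubles the target perfectly, which real reactions only approximate.

```python
def rule_of_thumb_tm(primer):
    """Approximate melting temperature (deg C) of a short oligo.

    Assumption: 2 deg C per A or T plus 4 deg C per G or C, a rough
    rule of thumb that is only reasonable for short primers.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G, and T")
    return 2 * at + 4 * gc


def ideal_copies(start_copies, cycles):
    """Copies of target after a number of cycles, assuming perfect doubling."""
    return start_copies * 2 ** cycles


if __name__ == "__main__":
    # Short example primers taken from the figures and text of this chapter.
    for primer in ("GGGGT", "AGCACTTTT"):
        print(primer, rule_of_thumb_tm(primer), "deg C (approximate)")
    # Typical cycle numbers quoted in the text.
    for cycles in (25, 30):
        print(cycles, "cycles:", ideal_copies(1, cycles), "copies (ideal)")
```

A longer or more G+C-rich primer gives a higher estimate, which is why such primers tolerate a higher, more specific annealing temperature. Under the perfect-doubling assumption, 25 cycles starting from a single molecule already give more than 33 million copies.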
═══════════════════════════════════════════════════════════════════
QUESTION 1: What would happen if you used a primer set that had the following sequences to amplify a gene from human DNA?
    5' CC
    5' TA
Would primers that were 20 bases long be more or less specific in priming the amplification reaction? Why? [Hint: Remember the discussion about the specificity of restriction enzymes. What is the probability that you will find a two-nucleotide-long sequence in the DNA compared to the probability of finding a 20-nucleotide-long sequence?]
═══════════════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════════════
QUESTION 2: Draw out a PCR reaction amplifying the DNA listed below using the primers listed. What happens at each step of the amplification (denaturation, annealing, extension)? What do the amplified products look like after the second cycle of amplification? After the third cycle?
    5' AATAATAATAATAATAATTATGGCGGCTATCGGCGGCGGCTTTATTATTATTATTCCCC 3'
    3' TTATTATTATTATTATTAATACCGCCGATAGCCGCCGCCGAAATAATAATAATAAGGGG 5'
    Primers: 5' TTATG and 5' AAAGC
How many base pairs in length is the amplified fragment? If you started with one piece of DNA, after 3 cycles, how many copies do you have? After 10 cycles?
═══════════════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════════════
QUESTION 3: The sequences of the two primers that are used to amplify HLA-A alleles are:
    Forward: 5' CCC AGA CGC CGA GGA TGG CCG 3' (Hint: 5' UTR-exon 1 boundary)
    Reverse: 5' GCA GGG CGG AAC CTC AGA GTC ACT CTC T 3' (Hint: Intron 3; remember 5' to 3' and complementary DNA strands running in opposite directions)
Use the HLA-A sequence in Figure 2-4 to locate these sequences in the DNA. Remember that DNA synthesis proceeds 5' to 3' and that we are only interested in the polymorphic regions (second and third exons).
Why can't you find the primer sequences in the CDS sequence of HLA-A in Figure 2-5?
═══════════════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════════════
QUESTION 4: Locate the C locus primers used to amplify exons 2 and 3: AGCGAGG(GT)GCCCGCCCGGCGA and GGAGATGGGGAAGGCTCCCCACT. [Hint: (GT) means that two primers are synthesized, one with G at this position and one with T at this position. Hint: Look at the genomic sequences because the primers anneal in the introns. Note: Primers described in Cereb, N. et al. 1996. Nucleotide sequences of MHC class I introns 1, 2, and 3 in humans and intron 2 in nonhuman primates. Tissue Antigens 47:498-511. Correction in Tissue Antigens 48:235-236, 1996.]
═══════════════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════════════
QUESTION 5: Design primers to amplify the second exon of the DQA1 alleles. You will need to find a publication listing the sequences of all the DQA1 alleles to do this. [Hint: HLA/IMGT web site.]
═══════════════════════════════════════════════════════════════════

III. To determine if DNA has been amplified following the PCR reaction, the PCR product is often analyzed by gel electrophoresis to identify a fragment of a specific size. For example, amplification of exon 2 of the DRB genes will yield a fragment of DNA approximately 270 base pairs in length and amplification of exons 2 and 3 of an HLA-B gene will yield a fragment of DNA approximately 1000 base pairs in length.
    A. Note that a primer dimer may also be detected on the gel during electrophoresis. It will be smaller than the HLA amplicon so it will move farther down the gel.

IV. Advantages of PCR
    A. Large quantities of the HLA gene of interest will allow the rapid detection of the HLA type.
    B. Amplification of a specific HLA gene means that detection of an HLA type will not be influenced by the presence of other HLA genes or non-HLA genes during probe hybridization or sequencing.

V. Contamination with previously amplified DNA is the biggest potential problem. It is relatively easy to contaminate the work area with amplified DNA because there are so many copies of an HLA gene following amplification.

═══════════════════════════════════════════════════════════════════
QUESTION 6: Assume that a PCR amplification generates 27 million (27 x 10^6) copies of a piece of DNA in a 100 microliter reaction volume. How many copies will be in a 1 microliter drop that was sucked into the barrel of a pipettor?
═══════════════════════════════════════════════════════════════════

VI. Allele or group-specific PCR or sequence-specific primer (SSP) typing. A modification of this approach is called ARMS (amplification refractory mutation system).
    A. In this technique, PCR primers are designed to anneal only to a specific set of alleles or to a single allele [Figure 6-8]. One or both primers include sequences unique to the allele(s). These unique sequences should be located at the 3' end of the primer for maximum specificity in the annealing step. Remember that, to get efficient amplification, both PCR primers must anneal to the DNA.
    [Figure 6-8: Sequence-specific PCR. Sample: DRB1*01, DRB1*15. Three reactions (DRB1*01 primer, DRB1*03 primer, DRB1*15 primer), each run on a gel beside a molecular weight (MW) marker.]
    B. The primer set can be designed to give an amplified fragment of a specific size which can be detected by gel electrophoresis.
    C. Failure to use the appropriate amplification conditions can cause amplification that will give either false positives (wrong alleles amplified) or false negatives (correct allele not amplified).
    D. Primers used in the ARMS system may have an additional mismatch incorporated in their sequence. This makes the primers mismatched to all alleles; however, the allele(s) which amplify are less mismatched than the alleles which should not amplify.
    E. The 3' end of the primer is most important in controlling the specificity of annealing and amplification. Differences between the primer and template near the 5' end of the primer are unlikely to affect annealing of the primer and will give rise to false positives (i.e., amplification occurs even though the primer is mismatched).

═══════════════════════════════════════════════════════════════════
QUESTION 7: Design a forward PCR primer that will amplify all the DRB1*04 alleles but not most of the other DRB1 alleles. You can use 5' CCG CTG CAC TGT GAA GCT CT as a reverse primer (end of exon 2) but you need to design a forward primer. You will have to look at the DRB1 allele sequences that you looked at in Chapter 4. Can you design a forward PCR primer that will amplify only the DRB1*11 alleles? Can you design a primer set that will amplify only DRB1*11:01 alleles?
═══════════════════════════════════════════════════════════════════

VII. The amplified DNA is used to identify HLA types as described in the next chapters.

References:
Green and Sambrook. Molecular Cloning, A Laboratory Manual, 4th edition. Cold Spring Harbor Laboratory Press.
Bunce, M., O'Neill, C.M., Barnardo, M.C., Krausa, P., Browning, M.J., Morris, P.J. & Welsh, K.I. 1995. Phototyping: comprehensive DNA typing for HLA-A, B, C, DRB1, DRB3, DRB4, DRB5 & DQB1 by PCR with 144 primer mixes utilizing sequence-specific primers (PCR-SSP). Tissue Antigens 46:355-367.
Cereb, N., Kong, Y., Lee, S., Maye, P., Yang, S.Y. 1996. Nucleotide sequences of MHC class I introns 1, 2, and 3 in humans and intron 2 in nonhuman primates. Tissue Antigens 47:498-511. Correction in Tissue Antigens 48:235-236, 1996.
Krausa, P., Bodmer, J.G., and Browning, M.J. 1993. Defining the common subtypes of HLA A9, A10, A28 and A19 by use of ARMS/PCR. Tissue Antigens 42:91-99.
Mullis, K. 1990. The unusual origin of the polymerase chain reaction. Scientific American, April 1990.
Olerup, O. and Zetterquist, H. 1992. HLA-DR typing by PCR amplification with sequence specific primers (PCR-SSP) in 2 hours: An alternative to serological DR typing in clinical practice including donor-recipient matching in cadaveric transplantations. Tissue Antigens 39:225-235.
Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B., and Erlich, H.A. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.

CHAPTER 7
USE OF OLIGONUCLEOTIDE PROBES TO DETECT SPECIFIC DNA SEQUENCES

The amplified DNA generated using the PCR is used to identify HLA types. This chapter discusses the use of oligonucleotide probe hybridization [Figure 7-1] to detect specific HLA alleles.

[Figure 7-1: SSOPH, standard format. Different genomic DNA samples on a membrane; multiple membranes, one for each labeled oligo probe (Probe #1 to Probe 4).]

I. Selection and synthesis of probes (oligonucleotides) to detect specific alleles (SSOP = sequence specific oligonucleotide probes, also called SSO) by hybridization (SSOPH).
    A. Oligonucleotides ("oligos") (single-stranded DNAs) must complement their target sequence and be sufficiently long (~18 nucleotides) to allow the use of hybridization conditions that discriminate between the target sequence and other closely related sequences [Figure 7-2]. The more differences between the matched and mismatched sequences, the easier it is to establish specific hybridization conditions. If the sequences are mismatched for only one nucleotide, the mismatched base should be placed in the middle of the oligonucleotide probe.
Stringent washes are more likely to remove probes with a single-base mismatch in this position because a central mismatch is more destabilizing than a mismatch at one end of the oligonucleotide.
    [Figure 7-2]
    B. The oligo should be approximately 50% G+C (if possible) and should not contain any complementary sequences that might cause the oligo to anneal to itself (form a hairpin-like structure). For example, AAAAATGCCGCTATTTTT would not be a good choice.
    C. The oligo can be complementary to either strand of the DNA. Most people synthesize oligos which are identical in sequence to the coding strand (CDS, the sequence published in the literature) so that they are easier to find in the HLA allele sequences.

II. Labeling of oligo probes to detect binding. Probes are usually end-labeled, tailed, or contain modified nucleotides. If probes are attached to beads, the bead can be fluorescently tagged.

III. Hybridization of probes to DNA samples.
    A. Amplified DNA is linked to a solid support, denatured, and then hybridized to a labeled SSOP [Figures 7-1, 7-3]. The support can be a membrane, plastic plate, or bead.
    B. The conditions of the hybridization and/or the wash to remove nonspecifically bound probe are very important in controlling the specificity of hybridization [Figure 7-4]. Most people hybridize using nonstringent conditions and utilize stringent conditions for washing.
    [Figure 7-3: Probe hybridization. Labels: synthetic oligonucleotide (oligo), solid support, DNA.]
    [Figure 7-4: Controlling stringency. Steps: denaturation of DNA, addition of single-stranded synthetic DNA (oligo), hybridization, wash to remove imperfectly matched hybrids. Labels: PCR, probes.]
    C. In one form of this technique, samples of DNA from many people are applied to a single membrane and hybridized with a single probe [Figure 7-1]. Each assay requires a separate membrane for each probe.
    D. Alternatively, the probes may be linked to the solid support (like a 24-well plate) and then hybridized to labeled, denatured, amplified DNA (reverse dot blot) [Figure 7-5]. The support can be a membrane, plastic plate, or bead.
    E. Another variation on the SSOP technique is the "chip" technology. In this methodology, short overlapping oligonucleotides representing the entire sequences of alleles are attached to a solid support about the size of a dime (a "chip"). Labeled, denatured, amplified DNA is incubated with this chip and the binding of the DNA to individual oligos is detected by excitation of the label with a laser.
    [Figure 7-5: SSOPH, reverse format. One sample per membrane: the labeled, amplified HLA gene is hybridized to multiple oligo probes on a solid support.]

═══════════════════════════════════════════════════════════════════
COMMENT: Both probes and primers are single-stranded pieces of synthetic DNA. When the oligonucleotide is used to prime DNA synthesis, it is called a primer. When the oligonucleotide is used for hybridization and that hybridization event is used to define an HLA type, the oligonucleotide is called a probe.
═══════════════════════════════════════════════════════════════════

IV. Interpretation of hybridization results
    A. When a sample is tested with a panel of oligo probes to type DRB1, for example, some probes will hybridize and some will not, based on the sequences of the two DRB1 alleles present. A software program then takes these data, compares them to the DNA sequences of all the known DRB1 alleles, and determines which alleles might be present. See Chapter 11.

V. Difficulty with this system:
    A. Alleles often share nucleotide sequences so probes usually detect more than one allele. One may need to use multiple probes for typing a single allele. One may also need to carry out group-specific amplification [Figure 7-6]. Identification of individual alleles is called allele-level typing.
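The comparison carried out by the interpretation software in section IV can be sketched as follows. This is a toy illustration under stated assumptions, not the algorithm of any particular typing program. The allele names, allele sequences, and probes are all invented for the example, and a probe is counted as positive whenever its sequence occurs exactly in an allele sequence (the probe is written in the same sense as the coding strand, and the stringent wash is assumed to remove every mismatched hybrid). The sketch lists every pair of alleles whose combined probe pattern equals the observed pattern.

```python
from itertools import combinations_with_replacement

# Hypothetical toy data: invented names and sequences, not real HLA alleles.
ALLELES = {
    "X*01": "ACGTTGCAAGGCTA",
    "X*02": "ACGTAGCAAGGCTA",
    "X*03": "ACGTTGCAACGCTA",
    "X*04": "ACGTAGCAACGCTA",
}

# Hypothetical probes, written in the same sense as the allele sequences.
PROBES = {
    "p1": "GTTGC",
    "p2": "GTAGC",
    "p3": "AAGGC",
    "p4": "AACGC",
}


def probe_hits(sequence):
    """Probes expected to hybridize to one allele (exact match assumed)."""
    return frozenset(name for name, probe in PROBES.items() if probe in sequence)


def consistent_genotypes(observed_positive):
    """All allele pairs (including homozygous pairs) whose combined
    probe pattern equals the set of probes observed as positive."""
    observed = frozenset(observed_positive)
    matches = []
    for a, b in combinations_with_replacement(sorted(ALLELES), 2):
        expected = probe_hits(ALLELES[a]) | probe_hits(ALLELES[b])
        if expected == observed:
            matches.append((a, b))
    return matches


if __name__ == "__main__":
    print(consistent_genotypes({"p1", "p3"}))
    print(consistent_genotypes({"p1", "p2", "p3", "p4"}))
```

In this toy panel, a sample positive for only p1 and p3 is explained by a single allele pair, whereas a sample positive for all four probes is explained equally well by two different allele pairs, so the probe pattern alone does not resolve the typing to individual alleles.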
Typing which narrows down the allele possibilities but still includes more than one possible allele is termed intermediate resolution testing. An example is a sample typed as (DRB1*11:01 or DRB1*11:04) and (DRB1*03:02 or DRB1*03:03).
    B. Since oligo probes identify only specific sequences within the gene, differences outside of these regions or at the edges of the probe will go unnoticed. Thus new alleles might be missed.
    C. Contamination with other DNAs may give false positives. For example, if DNA amplified in a previous assay contaminates a pipettor, that DNA may be transferred into a DNA sample in another assay.

═══════════════════════════════════════════════════════════════════
QUESTION 1: The two sequences below differ by a single nucleotide. Design a 10-base-long oligo probe to detect sequence #1.
    #1 5' ATACAGAGGTACTACGCCTAATATGGCGCTA
    #2 5' ATACAGAGGTACTACACCTAATATGGCGCTA
What is the G+C content of the oligo that you have designed? What is its approximate melting temperature? If you carry out the hybridization wash at 50°C, what will happen? What would happen if you put the discriminating nucleotide at the 3' or 5' end of the probe?
═══════════════════════════════════════════════════════════════════
═══════════════════════════════════════════════════════════════════
QUESTION 2: Using the nomenclature website that lists all of the DRB1 allele sequences, design an oligonucleotide to specifically detect DRB1*11 alleles. This oligo will be equivalent to a DR11-specific antibody used in serologic HLA typing. If we use the oligonucleotide probes to define "serologic types", this is called antigen level or low resolution typing. [Note that the probe will also hybridize to a few non-DRB1*11 alleles like DRB1*04:15 and DRB1*03:08.] Design an oligonucleotide to detect all DRB1 alleles. Design an oligonucleotide to detect only the DRB1*10:01 alleles.
If we use the oligonucleotide probes to define alleles, this is called allele level or high resolution typing.
═══════════════════════════════════════════════════════════════════

[Figure 7-6]

═══════════════════════════════════════════════════════════════════
QUESTION 3: How would you design a protocol to distinguish a DRB1*01:02 allele from a DRB1*01:01 allele?
═══════════════════════════════════════════════════════════════════

References (a few examples):
Bugawan, T.L., Apple, R., and Erlich, H.A. 1994. A method for typing polymorphism at the HLA-A locus using PCR amplification and immobilized oligonucleotide probes. Tissue Antigens 44:137-147.
Bugawan, T.L. and Erlich, H.A. 1991. Rapid typing of HLA-DQB1 DNA polymorphism using nonradioactive oligonucleotide probes and amplified DNA. Immunogenetics 33:163.
Middleton, D., Williams, F., Hamill, M.A., et al. 2000. Frequency of HLA-B alleles in a Caucasoid population determined by a two-stage PCR-SSOP typing strategy. Human Immunol. 61:1285-1297.
Fernandez-Vina, M., Lazaro, A.M., Sun, Y., Miller, S., Forero, L., and Stastny, P. 1995. Population diversity of B-locus alleles observed by high-resolution DNA typing. Tissue Antigens 45:153-168.
Gao, X., Fernandez-Vina, M., Shumway, W., and Stastny, P. 1990. DNA typing for class II HLA antigens with allele-specific or group-specific amplification: typing for subsets of HLA-DR4. Human Immunol. 27:40-50.
Fulton, R.J., McDade, R.L., Smith, P.L., Kienker, L.J., and Kettman, J.R. Jr. 1997. Advanced multiplexed analysis with the FlowMetrix(TM) system. Clinical Chemistry 43:1749-1756.
Saiki, R.K., Walsh, P.S., Levenson, C.H., and Erlich, H.A. 1989. Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes. Proc. Natl. Acad. Sci. USA 86:6230-6234.

CHAPTER 8
SANGER-BASED DNA SEQUENCING OF HLA GENES

It is not yet known how many HLA alleles exist in the human population. It is possible that there will be so many HLA alleles that SSOP typing will require huge panels of oligonucleotide probes and SSP will require huge primer panels to identify alleles. It may be more practical and informative to use DNA sequencing to determine if a specific donor and recipient carry the same HLA alleles.

I. DNA sequencing can be used to determine the exact base sequence of the DNA or mRNA (cDNA) encoding an HLA molecule.

II. Method.
    A. The DNA of the gene must be amplified by PCR [Figure 8-1].
        1. PCR is more commonly used today to obtain enough copies of a gene to sequence (Chapter 6). The primers used for PCR amplification determine how many alleles are coamplified. For example, if primers that amplify all HLA-B alleles are used, then the amplified product will contain both HLA-B alleles expressed by an individual.
        [Figure 8-1]
        2. If a sequence-specific primer set is used for PCR, each of the two alleles may be isolated for sequencing.

        Cloning is more work and would likely not be used routinely for HLA typing. To make a DNA library or to "clone" a piece of DNA, pieces of DNA or DNA copies of the mRNA (cDNA) are inserted into a vector. Vectors are pieces of DNA that can replicate like a plasmid or virus. Each vector containing an inserted piece of DNA is propagated in bacteria and, as the bacteria replicate, millions of copies of the vector and inserted DNA are made. Cloning can be used to isolate one HLA allele for characterization from a heterozygous individual. PCR-amplified DNA (e.g., HLA-B PCR product) is cloned into a vector and individual bacterial colonies are isolated. Each colony carries one vector and, hence, one of the two HLA alleles.
        DNA/cDNA libraries can be made that contain most of the pieces of DNA (genomic library) carried by a cell or most of the mRNAs (cDNA library) expressed by a cell. From these libraries, the genes or mRNAs (cDNAs) that encode the HLA molecules can be isolated.

═══════════════════════════════════════════════════════════════════
QUESTION 1: If a person was typed as DRB1*01:01 and DRB1*07:01, how would you obtain only the DRB1*07:01 allele for sequence analysis?
═══════════════════════════════════════════════════════════════════

    B. Sanger-based DNA sequencing is a method of DNA synthesis (Chapter 1) requiring:

        1. A single stranded template containing the PCR-amplified copies of the HLA gene being sequenced.

        2. One primer that anneals to the template (called the sequencing primer).

        3. A DNA polymerase like Taq polymerase.

        4. Nucleotides.

            a. dATP, dTTP, dGTP, dCTP.

            b. Dideoxynucleotides (ddATP, ddTTP, ddCTP, ddGTP). When the polymerase incorporates a dideoxynucleotide into the growing DNA strand, synthesis stops at that point.

        5. Some method of labeling is used to identify the newly synthesized DNA strand.

            a. The dideoxynucleotide could be labeled with a fluorescent dye.

            b. Or the primer could be labeled with a fluorescent dye.

    C. The method used for sequencing is called the Sanger chain termination sequencing method [Figure 8-2]:

        1. The amplified DNA is denatured and the primer is annealed.

        2. The DNA is divided into 4 aliquots. A different chain-terminating nucleotide is added to each aliquot in addition to all four of the normal nucleotides. For example, ddCTP + dATP, dTTP, dGTP, dCTP are added to one tube [Figure 8-2].

        Figure 8-2 Labeled Terminator Sequencing

        Template with annealed primer (nucleotides supplied: C, G, A, T and C-dideoxy):
            3’GGGGTGCCCCCCCCTTTTTTTGAAAAA5’
            5’CCCCA

        Product terminated at the first C (Cd = dideoxy C):
            3’GGGGTGCCCCCCCCTTTTTTTGAAAAA5’
            5’CCCCACd

        Product terminated at the second C:
            3’GGGGTGCCCCCCCCTTTTTTTGAAAAA5’
            5’CCCCACGGGGGGGGAAAAAAACd

        3. As the polymerase is synthesizing a complementary DNA strand (the sequence shown in Figure 8-2), it has a choice of nucleotides for incorporation. If it uses a normal nucleotide, synthesis proceeds. If the polymerase incorporates a dideoxynucleotide, synthesis halts. Remember that many identical strands of DNA are being synthesized at the same time.

        4. Each reaction generates populations of labeled oligonucleotides of different lengths that begin from a fixed point (the primer) and terminate randomly at the residue represented by the ddNTP in that aliquot [Figure 8-3].

        5. The populations of oligonucleotides of different lengths are resolved by electrophoresis on a polyacrylamide-like gel. If different colored fluorescent dyes are used to label each dideoxy aliquot, the four aliquots can be run in the same lane. A laser reads each color as the fragments pass by a detector (automated sequencer).

        6. The “read” from a single sequencing primer is usually longer than an HLA exon but doesn’t usually include all of the exons in the amplicon. Usually multiple sequencing primers are used in different reactions to produce fragments covering all of the exons included in the amplicon. Sense and antisense primers are used to obtain the sequence of both strands of the DNA being sequenced.

        Figure 8-3

═══════════════════════════════════════════════════════════════════
QUESTION 2: Draw out the sequencing reaction for the following piece of single stranded DNA using the primer listed below. What would the sequencing gel look like?

3' AAAAAATGCCGAATCCGATACGTCGGGCATT 5'
Primer: 5' TTTTTTA 3'
═══════════════════════════════════════════════════════════════════

    D. In sequence based typing (SBT), amplified HLA alleles are sequenced directly (i.e., without cloning) to identify the alleles carried by the individual.

        1.
        Depending on the PCR primers used, the sequence may contain either a single allele or two alleles mixed together [Figure 8-4].

        Figure 8-4 Sequence-Based Typing
        Heterozygote: T G C [C/A] T G C A, with the fourth position read as M

            a. When two alleles are sequenced simultaneously, positions where the two alleles differ in sequence will show two nucleotides. For example, in Figure 8-4, the fourth position shows both a C and an A. This is labeled as M, meaning both C and A. Table 8-1 lists the codes used for multiple nucleotides.

            b. A sequence-specific sequencing primer may be used to produce the sequence of a single allele from a PCR reaction containing both alleles at a locus.

        2. A software program is used to identify the alleles based on their sequence.

        Table 8-1 IUB Codes for Multiple Nucleotides

        Code   Nucleotides   Mnemonic
        R      A/G           puRine
        Y      C/T           pYrimidine
        K      G/T           Keto
        M      A/C           aMino
        S      G/C           Strong (3 hydrogen bonds)
        W      A/T           Weak (2 hydrogen bonds)
        N      A/C/T/G       aNy base

        IUB, International Union of Biochemistry Nomenclature Committee

III. Sanger sequencing is a great method for identifying the HLA alleles carried by an individual but, without isolating each of the two alleles, it doesn’t tell us the phase of the nucleotides across the entire region sequenced. Phase identifies which polymorphic residues are carried on the maternal versus the paternal chromosome 6 [Figure 8-5]. Without phasing, it may not be possible to tell which genotype (combination of 2 alleles) an individual carries.

Figure 8-5 Phase of Polymorphisms
Paternal chrom 6 / Maternal chrom 6
(| marks an exon boundary; - marks identity with DRB1*010101)

Codons -4 to 21
DRB1*010101  CTG GCT TTG GCT GGG GAC ACC CGA C|CA CGT TTC TTG TGG CAG CTT AAG TTT GAA TGT CAT TTC TTC AAT GGG ACG
DRB1*040101  --- --- --- --- --- --- --- --- -|-- --- --- --- GA- --- G-- --A CA- --G --- --- --- --- --C --- ---

Codons 22 to 46
DRB1*010101  GAG CGG GTG CGG TTG CTG GAA AGA TGC ATC TAT AAC CAA GAG GAG TCC GTG CGC TTC GAC AGC GAC GTG GGG GAG
DRB1*040101  --- --- --- --- --C --- --C --- -A- T-- --- C-- --- --- --- -A- --- --- --- --- --- --- --- --- ---

Codons 47 to 71
DRB1*010101  TAC CGG GCG GTG ACG GAG CTG GGG CGG CCT GAT GCC GAG TAC TGG AAC AGC CAG AAG GAC CTC CTG GAG CAG AGG
DRB1*040101  --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- -A-

Codons 72 to 96
DRB1*010101  CGG GCC GCG GTG GAC ACC TAC TGC AGA CAC AAC TAC GGG GTT GGT GAG AGC TTC ACA GTG CAG CGG CGA G|TT GAG
DRB1*040101  --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- -|-C T-T

References:

Sanger, F., Nicklen, S., and Coulson, A.R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

Kotsch, K., Wehling, J., Kohler, S., and Blasczyk, R. 1997. Sequencing of HLA class I genes based on the conserved diversity of the noncoding regions: sequencing-based typing of the HLA-A gene. Tissue Antigens 50:178-191.

Versluis, L.F., Rozemuller, E., Tonks, S., Marsh, S. G. E., Bouwens, A. G. M., Bodmer, J. G., and Tilanus, M. G. J. 1993. High-resolution HLA-DPB typing based upon computerized analysis of data obtained by fluorescent sequencing of the amplified polymorphic exon 2. Human Immunology 38:277-283.

McGinnis, M.D., Conrad, M.P., Bouwens, A.G.M., Tilanus, M.G.J., and Kronick, M.N. 1995. Automated, solid-phase sequencing of DRB region genes using T7 sequencing chemistry and dye-labeled primers. Tissue Antigens 46:173-179.

Petersdorf, E.W. and Hansen, J.A. 1995. A comprehensive approach for typing alleles of the HLA-B locus by automated sequencing. Tissue Antigens 46:73-85.

CHAPTER 9
NEXT GENERATION DNA SEQUENCING OF HLA

The purpose of this chapter is to describe new methods of DNA sequencing which are replacing the Sanger method. The basic sequencing chemistries share similarities with the Sanger method so it will be useful to read Chapter 8. One advantage is that these new methods can determine and report the sequence of a single DNA double helix and so allow the phase of polymorphic nucleotides to be determined.

I. Next generation sequencing (called NGS for short) is often used to sequence entire human genomes. It is thought that characterization of the sequence of all of the genetic information in an individual will help, for example, determine disease susceptibility, identify mutations that might result in cancer, and aid in disease treatment decisions. Thus, NGS is providing the technology for “personalized medicine.” Although whole genome sequencing should result in sequencing of the HLA genes carried by an individual, in practice this doesn’t work very well. Therefore, the use of NGS to identify HLA alleles usually begins with amplification of the HLA genes by PCR.

II. There are several platforms for NGS using different instruments and somewhat different sequencing and detection chemistries. Below is a simplified version of how all of these methods work. For the details of each platform, see the references for this chapter.

    A. One advantage of NGS is that it has the capability of sequencing long stretches of DNA. Thus, some strategies for HLA sequencing include amplification of the entire HLA gene, exons and introns, with generic or locus-specific primers.

    B. Because NGS results in the sequences of individual DNA fragments, it is possible to combine different HLA amplicons for sequencing.
    So, for example, it is possible in one sequencing run to obtain the DNA sequences of all of the HLA genes from an individual simultaneously. It is also possible to combine the samples from many individuals and sequence all their HLA genes simultaneously.

    C. Once the HLA genes are amplified by PCR, the amplicons may need to be broken into smaller fragments for DNA sequencing [Figure 9-1]. For example, an NGS sequencing chemistry might be limited to determining the sequence of only 500 base pairs. So, if an HLA amplicon is 6,000 base pairs long, for example, it must be fragmented into smaller pieces for DNA sequencing. Any piece of DNA over 500 base pairs will not be completely sequenced. One method for fragmenting the DNA is by sonication.

    D. Each long DNA amplicon (remember this includes thousands of copies of the region amplified) will be randomly broken into pieces of varying sizes. This will result in a series of overlapping DNA fragments covering the length of the amplicon.

    Figure 9-1 Amplicons and Reads
    Thousands of copies of HLA-B amplicon containing the entire gene (5 kb)
    Random fragments of HLA-B amplicon generated by sonication
    Each fragment binds to solid support and is sequenced separately

II. Before sequencing the amplified DNA, the laboratory will need to prepare a DNA library. The methods used will depend on the platform used.

    A. Library preparation will usually involve ligating short pieces of DNA to both the 5’ and 3’ ends of each DNA fragment [Figure 9-2]. These short DNA sequences, called “adaptors”, will help the fragments anneal to the surface of the sequencing reaction chamber and will provide sites where PCR and sequencing primers can anneal.

    B. The ligation will also attach unique DNA sequences called “indices”. These short sequences act as barcodes, labeling the fragments as coming from one individual.
    The purpose of the indices is to allow the laboratory to combine the samples from multiple individuals into one sequencing run. The software will later sort out the information for each individual based on these barcodes.

    Figure 9-2 DNA Library
    Genomic DNA or PCR amplicon
    Randomly fragment DNA
    Attach PCR & sequencing primer annealing sites to ends of each DNA fragment
    Hybridize single stranded DNA fragments to solid support

III. Once the library is created from each individual sample, samples from many individuals are combined and DNA sequencing commences. There are several different NGS sequencing chemistries that can be used. All have some resemblance to the Sanger sequencing method.

    A. The unique feature of next generation sequencing is that each DNA fragment is sequenced separately in what is called massively parallel sequencing. Thus, the sequences of millions of DNA fragments are determined simultaneously.

IV. Analysis of the huge amount of data collected is then carried out by software programs.

    A. Once sequencing is complete, a computer program separates all the sequences (called “reads”) based on their indices into sets of reads deriving from one individual (called “demultiplexing”).

    B. Then a computer program aligns the reads to each locus based on the sequences of reference genes [Figure 9-3]. So, for example, the computer will identify all the reads that match to a reference HLA-A sequence and will assemble them to form the sequence of an HLA-A gene.

    Figure 9-3 Software Must Align Each Read to Reference Set of Genes
    Class I sequence may yield thousands of reads
    Ref HLA-A, Ref HLA-B, Ref HLA-C, Ref HLA-Y
    Genes are very similar, sharing regions of sequence. This is challenging for programs to address.

    C. The computer program will then separate the reads matching to HLA-A into the sequences of the two alleles present at that locus by identifying which polymorphic residues are carried on the same DNA fragments, i.e., determine the phase of polymorphisms [Figure 9-4].

    Figure 9-4 Fragments Are Phased, Separated into Two Alleles

    Reads:    GCAGTGCAA
                  TGCAATACGCG
                     AATACGCGTAACT
    Phased:   GCAGTGCAATACGCGTAACT

    Reads:    GCATTGCATTAC
                   GCATTACGCGTAGCT
    Phased:   GCATTGCATTACGCGTAGCT

    Long distance between polymorphisms: loss of phase

    D. The end result is the complete sequences of both alleles at each HLA locus for an individual. This method is a high volume procedure so that several hundred individuals might be typed simultaneously.

References:

Gabriel C, Furst D, Fae I, Wenda S, Zollikofer C, Mytilineos J, Fischer GF. 2014. HLA typing by next-generation sequencing - getting closer to reality. Tissue Antigens 83(2):65-75.

De Santis D, Dinauer D, Duke J, Erlich HA, Holcomb CL, Lind C, Mackiewicz K, Monos D, Moudgil A, Norman P, Parham P, Sasson A, Allcock RJ. 2013. 16(th) IHIW : review of HLA typing by NGS. Int J Immunogenet 40(1):72-6.

Erlich H. 2012. HLA DNA typing: past, present, and future. Tissue Antigens 80(1):1-11.

http://en.wikipedia.org/wiki/DNA_sequencing

Platforms (these web sites often have educational videos):

Illumina MiSeq: http://www.illumina.com/
Pacific Biosciences: http://www.pacificbiosciences.com/
Life Technologies Ion Torrent: http://ioncommunity.lifetechnologies.com/welcome

CHAPTER 10
OTHER MOLECULAR BIOLOGY TECHNIQUES FOR HLA TYPING

While PCR/SSOP and SSP typing are the most common techniques used for HLA typing, other techniques have been described.

I. PCR/Restriction Fragment Length Polymorphism (RFLP)/AFLP [Figure 10-1].

    A. This method can be used if two alleles differ by the presence of a restriction enzyme site.
    Following PCR amplification, the amplified DNA is cleaved with the restriction enzyme and the alleles are identified by the fragmentation pattern upon gel electrophoresis.

    Figure 10-1 PCR / RFLP / AFLP

        Allele 1     Allele 2
        GAATTC       GTATTC
        CTTAAG       CATAAG

    Restriction enzyme digest: allele 1 carries the restriction site and is cleaved; allele 2 lacks the site and is not cleaved.
    Gel lanes (allele): 1/1, 2/2, 1/2

    B. Two problems with the technique are a failure to get complete cleavage, which may make a homozygote look like a heterozygote, and difficulty in finding appropriate restriction sites in all of the alleles at a locus.

II. Denaturing gradient gel electrophoresis.

    A. As amplified double stranded DNA is electrophoresed through a denaturing gradient gel, it reaches a point in the gradient where it denatures. This point is determined by the sequence of the DNA.

    B. Not widely used because the results may be difficult to interpret.

III. Single stranded conformational polymorphism (SSCP).

    A. PCR-amplified DNA is denatured and electrophoresed on a polyacrylamide gel. Each single strand migrates to a position related to its conformation as determined by its sequence.

    B. Not widely used because the results may be difficult to interpret. This method could be useful in comparing the alleles of two individuals.

IV. Heteroduplex formation [Figure 10-2].

    A. Amplified DNA is denatured and allowed to reanneal under nonstringent conditions. If DNA strands are present which do not perfectly match (e.g., in a person heterozygous for the gene amplified), these will form heteroduplexes in addition to homoduplexes. These heteroduplexes will have an altered conformation compared to the homoduplexes and will migrate differently in an electric field.

    B. Sometimes a labeled reference DNA is added. This DNA is designed to anneal to the gene of interest, creating additional heteroduplexes. In this case, only the labeled heteroduplexes are detected.
    This variation is called reference strand mediated conformational analysis (RSCA).

    C. This method could be used to compare the alleles of two individuals or to determine an HLA type if compared to known alleles.

    Figure 10-2 Heteroduplex
    Alleles 1 and 2 are denatured and reannealed under nonstringent conditions, giving homoduplexes and heteroduplexes.
    Gel lanes 1 and 2: homozygotes. Lane 3: heterozygote.

V. Exonuclease-released fluorescence [Figure 10-3].

    A. A sequence-specific oligonucleotide labeled with reporter and quencher dyes is hybridized to target DNA. Addition of Taq polymerase and a locus-specific primer located 5' of the probe causes the reporter dye to be released during PCR. Once the reporter dye is separated from the quencher dye, fluorescence is produced, indicating that the probe had hybridized to the DNA.

    Figure 10-3 Exonuclease-Released Fluorescence (Taq, Reporter, Quencher)

References:

RFLP:

Maeda, M., Uryu, N., Murayama, N., Ishii, H., Ota, M., Tsuji, K., and Inoko, H. 1990. A simple and rapid method for HLA-DP genotyping by digestion of PCR-amplified DNA with allele-specific restriction endonucleases. Human Immunology 27:111-121.

Olerup, O. 1990. HLA class II typing by digestion of PCR-amplified DNA with allele-specific restriction endonucleases will fail to unequivocally identify the genotypes of many homozygous and heterozygous individuals. Tissue Antigens 36:83-87.

SSCP:

Orita, M., Iwahana, H., Kanazawa, H., Hayashi, K., and Sekiya, T. 1989. Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphisms. Proc. Natl. Acad. Sci. USA 86:2766-2770.

Lo, Y.M.D., Patel, P., Mehal, W.Z., Fleming, K.A., Bell, J.I., and Wainscoat, J.S. 1992. Analysis of complex genetic systems by ARMS-SSCP: application to HLA genotyping. Nucleic Acids Res. 20:1005-1009.

Heteroduplex/RSCA:

Savage et al. 1996. A rapid HLA-DRB1*04 subtyping method using PCR and DNA heteroduplex generators. Tissue Antigens 47:284-292.

Summers, C., Morling, F., Taylor, M., Yin, J. L., and Stevens, R. 1994. Donor-recipient HLA class I bone marrow transplant matching by multilocus heteroduplex analysis. Transplantation 58:628-629.

Arguello, J.R., Little, A-M, Bohan, E. et al. 1998. High resolution HLA class I typing by reference strand mediated conformation analysis (RSCA). Tissue Antigens 52:57-66.

Exonuclease:

Faas et al. 1996. Sequence specific priming and exonuclease-released fluorescence detection of HLA-DQB1 alleles. Tissue Antigens 48:97-112.

Slateva, K., Albis-Camps, M., Blasczyk, et al. 1998. Fluorotyping of HLA-A by sequence specific priming and fluorogenic probing. Tissue Antigens 52:462-72.

CHAPTER 11
INTERPRETATION OF DNA TYPING RESULTS

The hybridization results obtained with sequence specific oligonucleotide probes or the amplifications obtained with sequence specific primers are used to identify HLA types. Likewise, the DNA sequences obtained by either Sanger sequencing or NGS are used to assign alleles.

I. Most HLA alleles do not have a unique DNA sequence that characterizes that allele and that is found in no other allele (Chapters 6 and 7).

II. Detection of HLA types might use a panel of SSOP (30-40 probes) to obtain low resolution or serologic typing level resolution results for a single locus, e.g., DRB1 or HLA-A. The use of more probes will produce an intermediate level of typing resolution.

    A. HLA types are obtained by comparing the positive and negative probe hybridizations (or the positive and negative sequence specific amplifications) to the known list of alleles (Figure 11-1).
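The comparison described in item A can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used by any typing laboratory: the allele names, probe names, and hit patterns below are hypothetical placeholders (they are not the patterns of Figure 11-1), and the function name compatible_genotypes is invented here. The sketch lists every pair of alleles whose combined probe pattern equals the observed set of positive probes.

```python
from itertools import combinations_with_replacement

# Hypothetical hit table: the probes each allele would hybridize with.
# These patterns are invented for illustration only.
PROBE_HITS = {
    "allele_1": {"P1", "P2"},
    "allele_2": {"P3", "P4"},
    "allele_3": {"P1", "P3"},
    "allele_4": {"P2", "P4"},
}


def compatible_genotypes(positive_probes, hit_table=PROBE_HITS):
    """Return every allele pair whose combined pattern equals the positives.

    A pair (a, a) represents a homozygous sample. A probe counts as positive
    for a genotype when either of its two alleles carries the probe sequence.
    """
    observed = set(positive_probes)
    matches = []
    for a, b in combinations_with_replacement(sorted(hit_table), 2):
        if hit_table[a] | hit_table[b] == observed:
            matches.append((a, b))
    return matches


one_pattern = compatible_genotypes({"P1", "P2"})
all_positive = compatible_genotypes({"P1", "P2", "P3", "P4"})
```

With this made-up table, the pattern P1 + P2 fits only a sample carrying allele_1 alone, while a sample positive for all four probes fits two different allele pairs. More than one genotype can explain the same hybridization pattern, which is why the list of known alleles matters for interpretation.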
    Figure 11-1

    DR allele   0101   0301   0401   0701   1101   1301
    Probe 1
    Probe 2
    Probe 3
    Probe 4
    Probe 5
    Probe 6

═══════════════════════════════════════════════════════════════════
QUESTION 1: Based on Figure 11-1, what would be the typing if probe 1 and probe 3 were positive and the remainder negative? What would the typing be if probe 2, probe 4, and probe 5 were positive and the rest negative?
═══════════════════════════════════════════════════════════════════

═══════════════════════════════════════════════════════════════════
QUESTION 2: Align the sequences of the DRB1 alleles. If probe DR1001 (5'TGGCAGCTTAAGTTTGAA (codons 9-13)) is positive, one of the DRB1*01 alleles is present. If probe DR7007 (5'ACATCCTGGAAGACGAGC (codons 66-72)) is also positive, the DRB1*01:03 allele may be present. What would be your interpretation of the DR type for this sample if the probe DR1004 (5' GAGCAGGTTAAACATGAG (codons 9-14)) is also positive in addition to the probes listed above? How would your interpretation of DRB1*01:03 change?
═══════════════════════════════════════════════════════════════════

III. DNA sequence interpretation

    A. The use of Sanger sequencing for typing allows identification of the complete sequence of the HLA region amplified. The testing must be supplemented with a strategy to determine the phase of the polymorphic nucleotides in order to identify the genotype(s) present.

    B. NGS sequencing of an entire HLA gene with phasing should result in the identification of a single genotype at the allele level.

IV. Interpretation of typing results provides us with a genotype for an individual. Often the typing result yields more than one possible genotype and more testing will be required to determine which genotype is the correct one for that individual. Figure 11-2 is an example of a sequencing result which identified two possible genotypes: DRB1*01:01:01, *03:01:02 or DRB1*01:04, *03:14.

Figure 11-2 Alternative Genotypes or Ambiguous Combinations
(| marks an exon boundary; - marks identity with DRB1*01:01:01; * marks sequence not available)

Codons -4 to 21
DRB1*01:01:01  CTG GCT TTG GCT GGG GAC ACC CGA C|CA CGT TTC TTG TGG CAG CTT AAG TTT GAA TGT CAT TTC TTC AAT GGG ACG
DRB1*01:04     *** *** *** *** --- --- --- --- -|-- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
DRB1*03:01:02  --- --- --- --- --- --- --- A-- -|-- --- --- --- GA- T-C TC- -C- -C- --G --- --- --- --- --- --- ---
DRB1*03:14     *** *** *** *** *** *** *** *** *|-- --- --- --- GA- T-C TC- -C- -C- --G --- --- --- --- --- --- ---

Codons 22 to 46
DRB1*01:01:01  GAG CGG GTG CGG TTG CTG GAA AGA TGC ATC TAT AAC CAA GAG GAG TCC GTG CGC TTC GAC AGC GAC GTG GGG GAG
DRB1*01:04     --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
DRB1*03:01:02  --- --- --- --- -AC --- --C --- -A- T-- C-- --- --G --- --- AA- --- --- --- --- --- --- --- --- ---
DRB1*03:14     --- --- --- --- -AC --- --C --- -A- T-- C-- --- --G --- --- AA- --- --- --- --- --- --- --- --- ---

Codons 47 to 71
DRB1*01:01:01  TAC CGG GCG GTG ACG GAG CTG GGG CGG CCT GAT GCC GAG TAC TGG AAC AGC CAG AAG GAC CTC CTG GAG CAG AGG
DRB1*01:04     --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
DRB1*03:01:02  -T- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- -A-
DRB1*03:14     -T- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- -A-

Codons 72 to 96
DRB1*01:01:01  CGG GCC GCG GTG GAC ACC TAC TGC AGA CAC AAC TAC GGG GTT GGT GAG AGC TTC ACA GTG CAG CGG CGA G|TT GAG
DRB1*01:04     --- --- --- --- --- -AT --- --- --- --- --- --- --- --- -TG --- --- --- --- --- --- --- --- -|-- ---
DRB1*03:01:02  --- -G- CG- --- --- -AT --- --- --- --- --- --- --- --- -TG --- --- --- --- --- --- --- --- -|-C C-T
DRB1*03:14     --- -G- CG- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- -|** ***

Alternative allele combinations possible: DRB1*01:01:01, *03:01:02 vs DRB1*01:04, *03:14

═══════════════════════════════════════════════════════════════════
QUESTION 3: Why can’t these two alternative genotypes be distinguished by sequencing of both alleles of the DRB1 locus together? How would you determine which genotype is the correct one?
═══════════════════════════════════════════════════════════════════

V. New HLA alleles continue to be described. If the same sample is typed with the same set of probes each year, the interpretation of the hybridization results might change over time. For example, if DNA from a sample was positive with probe DR1001AS in 1990, the result would be interpreted as DRB1*01:01 or DRB1*01:02 or DRB1*01:03. If DNA from the same sample was typed again a few years later, a positive hybridization result with probe DR1001AS would be interpreted as DRB1*01:01 or DRB1*01:02 or DRB1*01:03 or DRB1*01:04. DRB1*01:04 is a more recently described allele which carries the sequence detected by DR1001AS. Note that the hybridization result does not change; only the interpretation of that result changes. This means that one must always review the interpretation of a typing result if the typing occurred some time in the past (e.g., over a year ago). This requires knowledge of the sequences of the probes and primers used in the typing and the positive and negative hybridization results obtained with those probes and primers.

═══════════════════════════════════════════════════════════════════
QUESTION 4: A donor on the unrelated bone marrow registry was typed as DRB1*11 using the probe 5703 (5' GCCTGATGAGGAGTACTG (codons 55-61)). What would be this person’s DRB1 type based on all of the DRB1 alleles that have been identified to date? (Hint: Look at the list of current DRB1 alleles.)
═══════════════════════════════════════════════════════════════════

References:

Hurley, C.K. 1997. Acquisition and use of DNA-based HLA typing data in bone marrow registries. Tissue Antigens 49:323-328.

ANSWERS TO THE QUESTIONS

Chapter 1:

Question 1: PstI recognition site 5' CTGCA|G 3'. This generates a 3' protrusion.
example:

    5' GTACTGCA|GTC 3'                5' GTACTGCA 3'          5' GTC     3'
    3' CATG|ACGTCAG 5'   cut --->     3' CATG     5'   and    3' ACGTCAG 5'

Question 2:
NotI recognition site: 5' GC|GGCCGC 3' = 8 base pairs = 4^8 = one cut every 65,536 bp
EcoRI recognition site: 5' G|AATTC 3' = 6 base pairs = 4^6 = one cut every 4,096 bp
NotI has a longer recognition sequence than EcoRI; therefore, NotI cuts less frequently.

Question 3: Example:
    ATGCCTTAGGCATCCGTT
    TACGGAATCCGTAGGCAA
Tm = [4 x 9] + [2 x 9] = 36 + 18 = 54°C
This example temperature is higher than the example provided in the manual.
If the sequence is 18 bp of GC, the Tm = [4 x 18] + [2 x 0] = 72°C.
If the sequence is 18 bp of AT, the Tm = [4 x 0] + [2 x 18] = 36°C.

Question 4: The DNA fragment in lane 1 is approx. 500 bp long. The DNA fragment in lane 2 is approx. 330 bp long. The migration is from negative (top of gel) to positive (bottom).

Chapter 2:

Question 1:

                     -291       -281       -271       -261       -251       -241       -231       -221       -211       -201
A*01:01:01:01  CAGGAGCAGA GGGGTCAGGG CGAAGTCCCA GGGCCCCAGG CGTGGCTCTC AGGGTCTCAG GCCCCGAAGG CGGTGTATGG ATTGGGGAGT CCCAGCCTTG

                     -191       -181       -171       -161       -151       -141       -131       -121       -111       -101
A*01:01:01:01  GGGATTCCCC AACTCCGCAG TTTCTTTTCT CCCTCTCCCA ACCTACGTAG GGTCCTTCAT CCTGGATACT CACGACGCGG ACCCAGTTCT CACTCCCATT

                      -91        -81        -71        -61        -51        -41        -31        -21        -11         -1
A*01:01:01:01  GGGTGTCGGG TTTCCAGAGA AGCCAATCAG TGTCGTCGCG GTCGCTGTTC TAAAGTCCGC ACGCACCCAC CGGGACTCAG ATTCTCCCCA GACGCCGAGG

                       10         20         30         40         50         60         70         80         90        100
A*01:01:01:01  |ATGGCCGTCA TGGCGCCCCG AACCCTCCTC CTGCTACTCT CGGGGGCCCT GGCCCTGACC CAGACCTGGG CGG|GTGAGTG CGGGGTCGGG AGGGAAACCG
C.W.
Bill Young Marrow Donor Recruitment and Research Program 88 A*01:01:01:01 110 120 130 140 150 160 170 180 190 200 CCTCTGCGGG GAGAAGCAAG GGGCCCTCCT GGCGGGGGCG CAGGACCGGG GGAGCCGCGC CGGGAGGAGG GTCGGGCAGG TCTCAGCCAC TGCTCGCCCC A*01:01:01:01 210 220 230 240 250 260 270 280 290 300 CAG|GCTCCCA CTCCATGAGG TATTTCTTCA CATCCGTGTC CCGGCCCGGC CGCGGGGAGC CCCGCTTCAT CGCCGTGGGC TACGTGGACG ACACGCAGTT A*01:01:01:01 310 320 330 340 350 360 370 380 390 400 CGTGCGGTTC GACAGCGACG CCGCGAGCCA GAAGATGGAG CCGCGGGCGC CGTGGATAGA GCAGGAGGGG CCGGAGTATT GGGACCAGGA GACACGGAAT A*01:01:01:01 410 420 430 440 450 460 470 480 490 500 ATGAAGGCCC ACTCACAGAC TGACCGAGCG AACCTGGGGA CCCTGCGCGG CTACTACAAC CAGAGCGAGG ACG|GTGAGTG ACCCCGGCCC GGGGCGCAGG A*01:01:01:01 510 520 530 540 550 560 570 580 590 600 TCACGACCCC TCATCCCCCA CGGACGGGCC AGGTCGCCCA CAGTCTCCGG GTCCGAGATC CACCCCGAAG CCGCGGGACT CCGAGACCCT TGTCCCGGGA A*01:01:01:01 610 620 630 640 650 660 670 680 690 700 GAGGCCCAGG CGCCTTTACC CGGTTTCATT TTCAGTTTAG GCCAAAAATC CCCCCGGGTT GGTCGGGGCG GGGCGGGGCT CGGGGGACTG GGCTGACCGC A*01:01:01:01 710 720 730 740 750 760 770 780 790 800 GGGGTCGGGG CCAG|GTTCTC ACACCATCCA GATAATGTAT GGCTGCGACG TGGGGCCGGA CGGGCGCTTC CTCCGCGGGT ACCGGCAGGA CGCCTACGAC A*01:01:01:01 810 820 830 840 850 860 870 880 890 900 GGCAAGGATT ACATCGCCCT GAACGAGGAC CTGCGCTCTT GGACCGCGGC GGACATGGCA GCTCAGATCA CCAAGCGCAA GTGGGAGGCG GTCCATGCGG A*01:01:01:01 910 920 930 940 950 960 970 980 990 1000 CGGAGCAGCG GAGAGTCTAC CTGGAGGGCC GGTGCGTGGA CGGGCTCCGC AGATACCTGG AGAACGGGAA GGAGACGCTG CAGCGCACGG |GTACCAGGGG A*01:01:01:01 CCTCCCTCTG 1010 1020 1030 1040 1050 1060 1070 1080 1090 CCACGGGGCG CCTCCCTGAT CGCCTATAGA TCTCCCGGGC TGGCCTCCCA CAAGGAGGGG AGACAATTGG GACCAACACT AGAATATCAC 1100 A*01:01:01:01 1110 1120 1130 1140 1150 1160 1170 1180 1190 1200 GTCCTGAGGG AGAGGAATCC TCCTGGGTTT CCAGATCCTG TACCAGAGAG TGACTCTGAG GTTCCGCCCT GCTCTCTGAC ACAATTAAGG GATAAAATCT A*01:01:01:01 1210 1220 1230 1240 1250 1260 1270 1280 1290 1300 CTGAAGGAGT GACGGGAAGA CGATCCCTCG AATACTGATG 
AGTGGTTCCC TTTGACACCG GCAGCAGCCT TGGGCCCGTG ACTTTTCCTC TCAGGCCTTG A*01:01:01:01 1310 1320 1330 1340 1350 1360 1370 1380 1390 1400 TTCTCTGCTT CACACTCAAT GTGTGTGGGG GTCTGAGTCC AGCACTTCTG AGTCTCTCAG CCTCCACTCA GGTCAGGACC AGAAGTCGCT GTTCCCTTCT A*01:01:01:01 1410 1420 1430 1440 1450 1460 1470 1480 1490 1500 CAGGGAATAG AAGATTATCC CAGGTGCCTG TGTCCAGGCT GGTGTCTGGG TTCTGTGCTC TCTTCCCCAT CCCGGGTGTC CTGTCCATTC TCAAGATGGC A*01:01:01:01 1510 1520 1530 1540 1550 1560 1570 1580 1590 1600 CACATGCGTG CTGGTGGAGT GTCCCATGAC AGATGCAAAA TGCCTGAATT TTCTGACTCT TCCCGTCAG|A CCCCCCCAAG ACACATATGA CCCACCACCC A*01:01:01:01 1610 1620 1630 1640 1650 1660 1670 1680 1690 1700 CATCTCTGAC CATGAGGCCA CCCTGAGGTG CTGGGCCCTG GGCTTCTACC CTGCGGAGAT CACACTGACC TGGCAGCGGG ATGGGGAGGA CCAGACCCAG A*01:01:01:01 1710 1720 1730 1740 1750 1760 1770 1780 1790 1800 GACACGGAGC TCGTGGAGAC CAGGCCTGCA GGGGATGGAA CCTTCCAGAA GTGGGCGGCT GTGGTGGTGC CTTCTGGAGA GGAGCAGAGA TACACCTGCC A*01:01:01:01 1810 1820 1830 1840 1850 1860 1870 1880 1890 1900 ATGTGCAGCA TGAGGGTCTG CCCAAGCCCC TCACCCTGAG ATGGG|GTAAG GAGGGAGATG GGGGTGTCAT GTCTCTTAGG GAAAGCAGGA GCCTCTCTGG A*01:01:01:01 1910 1920 1930 1940 1950 1960 1970 1980 1990 2000 AGACCTTTAG CAGGGTCAGG GCCCCTCACC TTCCCCTCTT TTCCCAG|AGC TGTCTTCCCA GCCCACCATC CCCATCGTGG GCATCATTGC TGGCCTGGTT A*01:01:01:01 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100 CTCCTTGGAG CTGTGATCAC TGGAGCTGTG GTCGCTGCCG TGATGTGGAG GAGGAAGAGC TCAG|GTGGAG AAGGGGTGAA GGGTGGGGTC TGAGATTTCT A*01:01:01:01 2110 2120 2130 2140 2150 2160 2170 2180 2190 2200 TGTCTCACTG AGGGTTCCAA GCCCCAGCTA GAAATGTGCC CTGTCTCATT ACTGGGAAGC ACCTTCCACA ATCATGGGCC GACCCAGCCT GGGCCCTGTG A*01:01:01:01 2210 2220 2230 2240 2250 2260 2270 2280 2290 2300 TGCCAGCACT TACTCTTTTG TAAAGCACCT GTTAAAATGA AGGACAGATT TATCACCTTG ATTACGGCGG TGATGGGACC TGATCCCAGC AGTCACAAGT A*01:01:01:01 2310 2320 2330 2340 2350 2360 2370 2380 2390 2400 CACAGGGGAA GGTCCCTGAG GACAGACCTC AGGAGGGCTA TTGGTCCAGG ACCCACACCT GCTTTCTTCA TGTTTCCTGA TCCCGCCCTG GGTCTGCAGT A*01:01:01:01 
2410 2420 2430 2440 2450 2460 2470 2480 2490 2500 CACACATTTC TGGAAACTTC TCTGGGGTCC AAGACTAGGA GGTTCCTCTA GGACCTTAAG GCCCTGGCTC CTTTCTGGTA TCTCACAGGA CATTTTCTTC A*01:01:01:01 2510 2520 2530 2540 2550 2560 2570 2580 2590 2600 CCACAG|ATAG AAAAGGAGGG AGTTACACTC AGGCTGCAA|G TAAGTATGAA GGAGGCTGAT GCCTGAGGTC CTTGGGATAT TGTGTTTGGG AGCCCATGGG A*01:01:01:01 2610 2620 2630 2640 2650 2660 2670 2680 2690 2700 GGAGCTCACC CACCCCACAA TTCCTCCTCT AGCCACATCT TCTGTGGGAT CTGACCAGGT TCTGTTTTTG TTCTACCCCA G|GCAGTGACA GTGCCCAGGG A*01:01:01:01 2710 2720 2730 2740 2750 2760 2770 2780 2790 2800 CTCTGATGTG TCTCTCACAG CTTGTAAAG|G TGAGAGCTTG GAGGGCCTGA TGTGTGTTGG GTGTTGGGTG GAACAGTGGA CACAGCTGTG CTATGGGGTT A*01:01:01:01 2810 2820 2830 2840 2850 2860 2870 2880 2890 2900 TCTTTGCGTT GGATGTATTG AGCATGCGAT GGGCTGTTTA AGGTGTGACC CCTCACTGTG ATGGATATGA ATTTGTTCAT GAATATTTTT TTCTATAG|TG A*01:01:01:01 2910 2920 2930 2940 2950 2960 2970 2980 2990 3000 TGA|GACAGCT GCCTTGTGTG GGACTGAGAG GCAAGAGTTG TTCCTGCCCT TCCCTTTGTG ACTTGAAGAA CCCTGACTTT GTTTCTGCAA AGGCACCTGC A*01:01:01:01 3010 3020 3030 3040 3050 3060 3070 3080 3090 3100 ATGTGTCTGT GTTCGTGTAG GCATAATGTG AGGAGGTGGG GAGAGCACCC CACCCCCATG TCCACCATGA CCCTCTTCCC ACGCTGACCT GTGCTCCCTC A*01:01:01:01 3110 3120 3130 3140 3150 3160 3170 3180 3190 3200 CCCAATCATC TTTCCTGTTC CAGAGAGGTG GGGCTGAGGT GTCTCCATCT CTGTCTCAAC TTCATGGTGC ACTGAGCTGT AACTTCTTCC TTCCCTATTA C.W. 
Bill Young Marrow Donor Recruitment and Research Program 89

Question 2:

A*01:01:01:01
 -21        -11         -1         10         20         30         40         50         60         70
MAVM APRTLLLLLS GALALTQTWA GSHSMRYFFT SVSRPGRGEP RFIAVGYVDD TQFVRFDSDA ASQKMEPRAP WIEQEGPEYW DQETRNMKAH

A*01:01:01:01
        80         90        100        110        120        130        140        150        160        170
SQTDRANLGT LRGYYNQSED GSHTIQIMYG CDVGPDGRFL RGYRQDAYDG KDYIALNEDL RSWTAADMAA QITKRKWEAV HAAEQRRVYL EGRCVDGLRR

A*01:01:01:01
       180        190        200        210        220        230        240        250        260        270
YLENGKETLQ RTDPPKTHMT HHPISDHEAT LRCWALGFYP AEITLTWQRD GEDQTQDTEL VETRPAGDGT FQKWAAVVVP SGEEQRYTCH VQHEGLPKPL

A*01:01:01:01
       280        290        300        310        320        330        340        350
TLRWELSSQP TIPIVGIIAG LVLLGAVITG AVVAAVMWRR KSSDRKGGSY TQAASSDSAQ GSDVSLTACK V

See the answer to question 2 for the start codon and leader sequence.

Chapter 3:

Question 1: DRA

Chapter 4:

Question 1: No question; look up the reference.

Question 2: For example: DQB1*05:01, DQB1*02:01

Question 3: (1.9/100) x (7/100) = 13.3/10,000 = 0.133/100 = 0.133%

Question 4:

DRB1*01:01:
GGG GAC ACC CGA CCA CGT TTC TTG TGG CAG CTT AAG TTT GAA
Gly Asp Thr Arg Pro Arg Phe Leu Trp Gln Leu Lys Phe Glu
TGT CAT TTC TTC AAT GGG
Cys His Phe Phe Asn Gly

DRB1*13:02:
GGG GAC ACC AGA CCA CGT TTC TTG GAG TAC TCT ACG TCT GAG
Gly Asp Thr Arg Pro Arg Phe Leu Glu Tyr Ser Thr Ser Glu
TGT CAT TTC TTC AAT GGG
Cys His Phe Phe Asn Gly

Question 5: Example:
ACC GGA TAT TTT GAA
Thr Gly Tyr Phe Glu

Question 6:
♂ DRB1*04:02, DRB1*11:03
♀ DRB1*01:01, DRB1*03:01
Possible DRB1 combinations of children:
DRB1*04:02, *03:01
DRB1*04:02, *01:01
DRB1*11:03, *03:01
DRB1*11:03, *01:01
Yes, it is possible for two children to be DR identical. No children are homozygous; all are heterozygous (i.e., carry 2 different alleles).
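The four combinations listed for Question 6 can be reproduced by pairing each paternal allele with each maternal allele. The short Python sketch below is an illustration only, not part of the workbook: the function name `offspring_genotypes` is hypothetical, recombination within DRB1 is ignored, and the paternal alleles are the two that appear in the listed children.

```python
from itertools import product


def offspring_genotypes(father, mother):
    """Pair every paternal allele with every maternal allele (one from each parent)."""
    return list(product(father, mother))


# Parental DRB1 alleles, taken from the combinations listed in the answer.
father = ("DRB1*04:02", "DRB1*11:03")
mother = ("DRB1*01:01", "DRB1*03:01")

children = offspring_genotypes(father, mother)
for paternal, maternal in children:
    status = "homozygous" if paternal == maternal else "heterozygous"
    print(f"{paternal}, {maternal} ({status})")
```

Each of the four combinations is equally likely for any one child, which is why two siblings can inherit the same pair and be DR identical.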
Question 7:
♂ DRB1*04:03, DPB1*02:01
   DRB1*11:01, DPB1*02:01
♀ DRB1*01:01, DPB1*02:01
   DRB1*03:02, DPB1*01:01
Possible DR, DP combinations of children:
DRB1*04:03, DPB1*02:01; DRB1*01:01, DPB1*02:01
DRB1*04:03, DPB1*02:01; DRB1*03:02, DPB1*01:01
DRB1*11:01, DPB1*02:01; DRB1*01:01, DPB1*02:01
DRB1*11:01, DPB1*02:01; DRB1*03:02, DPB1*01:01
and many more possibilities by recombination, for example:
DRB1*04:03, DPB1*02:01; DRB1*03:02, DPB1*02:01
Yes, DRB1*04:03, DPB1*02:01; DRB1*01:01, DPB1*01:01 is possible by recombination.

Maternal haplotypes:    DRB1*01:01  DPB1*02:01
                                  X
                        DRB1*03:02  DPB1*01:01
Recombinant haplotypes: DRB1*01:01  DPB1*01:01
                        DRB1*03:02  DPB1*02:01

Question 8:
DRB1*13:04 DRB3*02:01 DQA1*02:01 DQB1*02:01 DPA1*01:04 DPB1*04:01
DRB1*04:02 DRB4*01:01 DQA1*02:01 DQB1*02:01 DPA1*01:04 DPB1*04:01
Note the predicted association of DRB1*13:04 with DRB3*02:01 and DRB1*04:02 with DRB4*01:01. The person is homozygous for DQ and DP.

Question 9: See next two pages.

Question 10: For example: C*01:01, C*02:01, C*02:02, C*08:01
C.W.
Bill Young Marrow Donor Recruitment and Research Program 92 Chapter 4-Question 9: Part One: DRB1*0101 DRB1*0102 DRB5*0202 1 10 GGG GAC ACC CGA CCA CGT TTC TTG TGG CAG CTT AAG TTT GAA TGT CAT TTC TTC AAT --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- T-- --- --- CA- --- GA- --- -A- --G --- --- --- --- --C 20 GGG ACG GAG CGG GTG CGG TTG CTG GAA AGA --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --C --- C-C --- DRB1*0101 DRB1*0102 DRB5*0202 40 50 GAG GAG TCC GTG CGC TTC GAC AGC GAC GTG GGG GAG TAC CGG GCG GTG ACG GAG CTG GGG CGG CCT GAT GCC GAG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- AA- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --C --T --- DRB1*0101 DRB1*0102 DRB5*0202 70 80 90 100 GAG CAG AGG CGG GCC GCG GTG GAC ACC TAC TGC AGA CAC AAC TAC GGG GTT GGT GAG AGC TTC ACA GTG CAG CGG CGA GTT GAG CCT AAG GTG ACT GTG TAT --- --- --- --- --- --- --- --- --- --T --- --- --- --- --- --- -C- -TG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- GC- --- --- --- --- --- --- --- --- --- --- --- --- --- -C- -TG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- DRB1*0101 DRB1*0102 DRB5*0202 110 120 130 CCT TCA AAG ACC CAG CCC CTG CAG CAC CAC AAC CTC CTG GTC TGC TCT GTG AGT GGT TTC TAT CCA GGC AGC ATT GAA GTC AGG TGG TTC CGG AAC GGC CAG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- G-- -G- --- --- A-- --- --- --- --- --- --- --- --- --- --- --- -A- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- DRB1*0101 DRB1*0102 DRB5*0202 140 GAA GAG AAG GCT GGG GTG GTG TCC ACA GGC CTG ATC CAG --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --- --- --- --T --- DRB1*0101 DRB1*0102 DRB5*0202 180 TAC ACC TGC CAA GTG GAG CAC CCA AGT GTG ACG AGC CCT CTC --- --- --- --- 
--- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --C --- --- --- --- --- DRB1*0101 DRB1*0102 DRB5*0202 210 220 GGC TTC GTG CTG GGC CTG CTC TTC CTT GGG GCC GGG CTG TTC ATC TAC TTC AGG AAT CAG AAA GGA CAC TCT GGA --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --T --- --- --- --- --- --- --- --- --- --- --A --- --- --- --- -A- --- --- --- --G --- --- --- 150 AAT GGA GAT TGG ACC TTC CAG ACC CTG GTG --- --- --- --- --- --- --- --- --- ----- --- --C- --- --- --- --- -TT --- --- 30 TGC ATC TAT AAC CAA --- --- --- --- --G-- --- --- --- --- 60 TAC TGG AAC AGC CAG AAG GAC CTC CTG --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- A-- --- 160 ATG CTG GAA ACA GTT CCT CGG AGT GGA GAG --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --- --- 170 GTT ----- 190 200 ACA GTG GAA TGG AGA GCA CGG TCT GAA TCT GCA CAG AGC AAG ATG CTG AGT GGA GTC GGG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- -A- --- --- --- --- --- --- --- --- --- --- --- A-- --- C.W. 
Bill Young Marrow Donor Recruitment and Research Program 93 230 CTT CAG CCA ACA GGA TTC CTG AGC TGA --- --- --- --- --- --- --- --- ----- --C --- --- --- C-- G-- --- --- part two: DRB1*0101 DRB1*0103 DRB1*1301 1 10 GGG GAC ACC CGA CCA CGT TTC TTG TGG CAG CTT AAG TTT GAA TGT CAT TTC TTC AAT --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- A-- --- --- --- --- GA- T-C TC- -C- -C- --G --- --- --- --- --- 20 GGG ACG GAG CGG GTG CGG TTG CTG GAA AGA --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --C --- --C --- DRB1*0101 DRB1*0103 DRB1*1301 40 50 GAG GAG TCC GTG CGC TTC GAC AGC GAC GTG GGG GAG TAC CGG GCG GTG ACG GAG CTG GGG CGG CCT GAT GCC GAG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- AA- --- --- --- --- --- --- --- --- --- -T- --- --- --- --- --- --- --- --- --- --- --- --- DRB1*0101 DRB1*0103 DRB1*1301 70 80 90 100 GAG CAG AGG CGG GCC GCG GTG GAC ACC TAC TGC AGA CAC AAC TAC GGG GTT GGT GAG AGC TTC ACA GTG CAG CGG CGA GTT GAG CCT AAG GTG ACT GTG TAT --A G-C GA- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----A G-C GA- --- --- --- --- --- --- --- --- --- --- --- --- --- --- -TG --- --- --- --- --- --- --- --- --C C-T --- --- --- --- --- --- DRB1*0101 DRB1*0103 DRB1*1301 110 120 130 CCT TCA AAG ACC CAG CCC CTG CAG CAC CAC AAC CTC CTG GTC TGC TCT GTG AGT GGT TTC TAT CCA GGC AGC ATT GAA GTC AGG TGG TTC CGG AAC GGC CAG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --- --- --- --- --- --- --T --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --T --- --- DRB1*0101 DRB1*0103 DRB1*1301 140 GAA GAG AAG GCT GGG GTG GTG TCC ACA GGC CTG ATC CAG --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- A-- --- --- --- --- --- --- --- --- --C DRB1*0101 
DRB1*0103 DRB1*1301 180 TAC ACC TGC CAA GTG GAG CAC CCA AGT GTG ACG AGC CCT CTC --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --C --- --A --- --- --- DRB1*0101 DRB1*0103 DRB1*1301 210 220 GGC TTC GTG CTG GGC CTG CTC TTC CTT GGG GCC GGG CTG TTC ATC TAC TTC AGG AAT CAG AAA GGA CAC TCT GGA --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- 150 AAT GGA GAT TGG ACC TTC CAG ACC CTG GTG --- --- --- --- --- --- --- --- --- ----- --- --C --- --- --- --- --- --- --- 30 TGC ATC TAT AAC CAA --- --- --- --- ---A- T-- C-- --- --G 60 TAC TGG AAC AGC CAG AAG GAC CTC CTG --- --- --- --- --- --- --- A-- ----- --- --- --- --- --- --- A-- --- 160 ATG CTG GAA ACA GTT CCT CGG AGT GGA GAG --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --- --- 170 GTT ----- 190 200 ACA GTG GAA TGG AGA GCA CGG TCT GAA TCT GCA CAG AGC AAG ATG CTG AGT GGA GTC GGG --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ----- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- C.W. 
Bill Young Marrow Donor Recruitment and Research Program 94 230 CTT CAG CCA ACA GGA TTC CTG AGC TGA --- --- --- --- --- --- --- --- ----- --- --- -G- --- --- --- --- --- Chapter 4, Question 11: 1 ConsensusATGCGGGTCA TGGCGCCCCG b-07021 ----t----- ---------b-0801 ----t----- ---------b-4201 ----t----- ---------- AACCCTCCTC ----g--------g--------g----- CTGCTGCTCT ---------------------------- CGGGGGCCCT ---c--------c--------c------ GGCCCTGACC ---------------------------- GAGACCTGGG ---------------------------- CCGGCTCCCA ---------------------------- CTCCATGAGG ---------------------------- 100 TATTTCTACA ---------------g------------ Consensus b-07021 b-0801 b-4201 101 CCGCCGTGTC --t-----------a-----t------- CCGGCCCGGC ---------------------------- CGCGGGGAGC ---------------------------- CCCGCTTCAT ---------------------------- CGCAGTGGGC -t--------t--------t-------- TACGTGGACG ---------------------------- ACACGCAGTT ----c-----------------c----- CGTGAGGTTC ---------------------------- GACAGCGACG ---------------------------- 200 CCGCGAGTCC ---------------------------- Consensus b-07021 b-0801 b-4201 201 GAGGATGGAG ---aga------aga------aga---- CCGCGGGCGC ---------------------------- CGTGGATAGA ---------------------------- GCAGGAGGGG ---------------------------- CCGGAGTATT ---------------------------- GGGACCGGGA --------a--------a--------a- GACACAGATC c--------c--------c--------- TTCAAGACCA -a----g--c ----------a----g--c ACACACAGAC -gg----------------gg------- 300 TGACCGAGAG ---------------------------- Consensus b-07021 b-0801 b-4201 301 AGCCTGCGGA ---------------------------- ACCTGCGCGG ---------------------------- CTACTACAAC ---------------------------- CAGAGCGAGG ---------------------------- CCGGGTCTCA ---------------------------- CACCCTCCAG ---------------------------- AGGATGTATG --c-----c--c-----c--c-----c- GCTGCGACGT ---------------------------- GGGGCCGGAC ---------------------------- 400 GGGCGCCTCC ---------------------------- Consensus b-07021 b-0801 b-4201 401 
TCCGCGGGTA --------c--------c--------c- TGACCAGTAC ----------a--------a-------- GCCTACGACG ---------------------------- GCAAGGATTA ---------------------------- CATCGCCCTG ---------------------------- AACGAGGACC ---------------------------- TGCGCTCCTG ---------------------------- GACCGCGGCG ------c--------------------- GACACGGCGG --------------c--------c---- 500 CTCAGATCAC ---------------------------- Consensus b-07021 b-0801 b-4201 501 CCAGCGCAAG ---------------------------- TGGGAGGCGG ---------------------------- CCCGTGTGGC ------a--------------------- GGAGCAGCTG --------g-------gac -------gac AGAGCCTACC ---------------------------- TGGAGGGCAC --------ga ------------------- GTGCGTGGAG ---------------------------- TGGCTCCGCA ---------------------------- GATACCTGGA ---------------------------- 600 GAACGGGAAG ---------------------------- Consensus b-07021 b-0801 b-4201 601 GAGACGCTGC --c-a----g --c------g --c------g AGCGCGCGGA -------t-------------------- CCCCCCAAAG ---------------------------- ACACATGTGA -----c--------c--------c---- CCCACCACCC ---------------------------- CATCTCTGAC ---------------------------- CATGAGGCCA ---------------------------- CCCTGAGGTG ---------------------------- CTGGGCCCTG ---------------------------- 700 GGCTTCTACC --t------------------------- 701 800 Consensus CTGCGGAGAT CACACTGACC TGGCAGCGGG ATGGCGAGGA CCAAACTCAG GACACCGAGC TTGTGGAGAC CAGACCAGCA GGAGATAGAA CCTTCCAGAA C.W. 
Bill Young Marrow Donor Recruitment and Research Program 95 b-07021 b-0801 b-4201 ---------- ---------- ---------- ---------- ---------- -----t---- ---------- ---------- ---------- ------------------- ---------- ---------- ---------- ---------- -----t---- ---------- ---------- ---------- ------------------- ---------- ---------- ---------- ---------- -----t---- ---------- ---------- ---------- ---------- Consensus b-07021 b-0801 b-4201 801 GTGGGCAGCT ---------------------------- GTGGTGGTGC ---------------------------- CTTCTGGAGA ---------------------------- AGAGCAGAGA ---------------------------- TACACATGCC ---------------------------- ATGTACAGCA ---------------------------- TGAGGGGCTG ---------------------------- CCGAAGCCCC ---------------------------- TCACCCTGAG ---------------------------- 900 ATGGGAGCCA ---------g ---------g ---------g Consensus b-07021 b-0801 b-4201 901 TCTTCCCAGT ---------------------------- CCACCATCCC -----g--------g--------g---- CATCGTGGGC ---------------------------- ATTGTTGCTG ---------------------------- GCCTGGCTGT ---------------------------- CCTAGCAGTT ---------------------------- CTAGTGGTCA ...------...------...------- TCGGAGCTGT ---------------------------- GGTCGCTGCT ---------------------------- 1000 GTGATGTGTA ---------------------------- Consensus b-07021 b-0801 b-4201 1001 GGAGGAAGAG ---------------------------- CTCAGGTGGA t--------------------------- AAAGGAGGGA ---------------------------- GCTACTCTCA ---------------------------- GGCTGCGTCC --------g--------g--------g- AGCGACAGTG ---------------------------- CCCAGGGCTC ---------------------------- TGATGTGTCT ---------------------------- CTCACAGCTT ---------------------------- 1100 GAAAAGTGTG --~~~~~~~~ --~~~~~~~~ --~~~~~~~ Chapter 5 Question 1: No question. Question 2: No question. Question 3: Not necessarily. There are many alleles of DR3 and DR8. Serologic testing cannot distinguish between the alleles. 
For example, the donor can be DRB1*08:03, DRB1*03:01 and the recipient can be DRB1*08:04, DRB1*03:02.

Question 4:
A) ♂ DR4/DR11 ♀ DR1/DR3
children: DR4/DR1; DR4/DR3; DR11/DR1; DR11/DR3
B) ♂ DRB1*04:02/DRB1*11:03 ♀ DRB1*01:01/DRB1*03:01
children: DRB1*04:02/DRB1*01:01; DRB1*04:02/DRB1*03:01; DRB1*11:03/DRB1*01:01; DRB1*11:03/DRB1*03:01

Question 5:
Loci: DRB1       DRB3       DRA
      DRB1*11:02 DRB3*02:02 DRA*01:01

Question 6:
♂ A2,A3,B27,B53 ♀ A2,A11,B51,B71
Children:
A2,B27;A2,B51 -or- A2,B53;A2,B51 -or- A3,B27;A2,B51 -or- A3,B53;A2,B51
A2,B27;A2,B71 -or- A2,B53;A2,B71 -or- A3,B27;A2,B71 -or- A3,B53;A2,B71
A2,B27;A11,B71 -or- A2,B53;A11,B71 -or- A3,B27;A11,B71 -or- A3,B53;A11,B71
A2,B27;A11,B51 -or- A2,B53;A11,B51 -or- A3,B27;A11,B51 -or- A3,B53;A11,B51
parent one: A2,B27 and A3,B53
parent two: A2,B71 and A11,B51

Question 7: A*68:01=A68(28), B*13:01=B13, B*18:04=serologic type not known, DRB1*03:01=DR17(3), DRB1*11:22=serologic type not known

Question 8: Patient exhibits serologic type A2 only; the A*24:09N allele does not specify an HLA-A antigen. An A2,A24 donor would not be a good choice; an A2 donor would be better.

Chapter 6:

Question 1: Primers only 2 bp long would amplify many sequences. Primers 20 bp long would be more specific because the probability of finding a given 20 bp sequence by chance is far lower than that of finding a given 2 bp sequence. A 2 bp sequence would be fairly common and the amplification would not be specific.
1/(4^2) vs. 1/(4^20)
1/16 vs. 1/1,099,500,000,000

Question 2:
Denaturation (dsDNA becomes ssDNA):
5' AATAA--------------------------------CCCC
3' TTATT--------------------------------GGGG

Primers anneal:
5' AATAA-----------TTATGGCGG-------GGCGGCTTTA-------CCCC
                                       |||||
                                       CGAAA 5'
                  5'TTATG
                    |||||
3' TTATT------------AATACCGCC-------CCGCCGAAAT-------GGGG

Extension: nucleotides add to the 3' end of the primer and form 2 separate dsDNA molecules at the end of the first cycle:
5'AATAA----------TTATGGCGG----------GGCGCTTT----------CCCC
      ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄CCGCGAAA 5'
              5' TTATGGCGG►►►►►►►►►►►►►►►►►►►►►►►►►►►
3'TTATT----------AATACCGCC----------CCGCGAAA----------GGGG

After second cycle:
5'AATAA--------------------------------GCTTT----------CCCC 3'
3'TTATT--------------------------------CGAAA 5'

5'TTATG---------------------------------CCCC 3'
3'TTATT---------AATAC---------------------------------GGGG 5'

5'TTATG------------------GCTTT 3'
3'TTATT---------AATAC------------------CGAAA 5'

5'TTATG------------------GCTTT----------CCCC 3'
3'AATAC------------------CGAAA 5'
= 4 dsDNA molecules

The polymerase copies the DNA until the temperature is changed (e.g., 72 °C to 94 °C). In the initial cycles of the PCR, the original DNA strands yield long pieces of DNA when copied by Taq. As the cycles proceed, a major DNA product appears that is a fixed size, bounded by the primers.
After 3rd cycle: 8 dsDNA molecules
There are 26 bp in the amplified fragment.
after 3 cycles = 2^3 = 8 copies
after 10 cycles = 2^10 = 1024 copies

Question 3: See the answer to Chapter 2, Question 1, where the primer annealing sites are underlined. The primer sequences cannot be found entirely in the cDNA sequence because the primers are positioned partially (forward) or completely in the introns.
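The arithmetic behind Questions 1 and 2 of Chapter 6 can be checked with a few lines of Python. This is a minimal sketch, not part of the workbook: the function names `chance_match` and `copies_after` are made up for illustration, and it assumes that the four bases are equally frequent and that every PCR cycle doubles the product perfectly.

```python
def chance_match(primer_length):
    """Chance that a random stretch of DNA matches one specific sequence of this length."""
    return 1 / 4 ** primer_length


def copies_after(cycles, starting_copies=1):
    """Copies present after the given number of ideal doubling cycles."""
    return starting_copies * 2 ** cycles


# Question 1: a 2 bp primer versus a 20 bp primer.
print(f"2 bp:  1 in {4 ** 2:,}")
print(f"20 bp: 1 in {4 ** 20:,}")

# Question 2: growth in copy number.
print("after 3 cycles: ", copies_after(3))
print("after 10 cycles:", copies_after(10))
```

The exact value of 4^20 is 1,099,511,627,776, which the answer rounds to 1,099,500,000,000. Real reactions double less efficiently than this ideal, so the copy numbers are upper bounds.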
Question 4: C*010201 C*04010101 C*050101 C*06020101 C*0802 C*12030101 C*1701 GGTTCTAGAG ------------------------------------------------------- C*010201 C*04010101 C*050101 C*06020101 C*0802 C*12030101 C*1701 20 30 40 50 60 70 80 90 100 110 TGGCGCCCCG AACCCTCATC CTGCTGCTCT CGGGAGCCCT GGCCCTGACC GAGACCTGGG CCT|GTGAGTG CGGGGTTGGG AGGGAAACGG CCTCTGCGGA ---------- ---------- ---------- ---------- ---------- ---------- --G|------- ---------- ---------- ------G------------ ---------- ---------- ---------- ---------- ---------- ---|------- --A------- ---------- ------------------- ---------- ---------- ---------- ---------- ---------- ---|------- ---------- ---------- ------------------- ---------- ---------- ---------- ---------- ---------- ---|------- --A------- ---------- ------------------- ---------- ---------- ---------- ---------- ---------- ---|------- ---------- ---------- ------------------A -G-----C-- ---------- ---------- --------T- ---------- --G|------- ---------- ---------- ---------- C*010201 C*04010101 C*050101 C*06020101 C*0802 C*12030101 C*1701 120 130 140 150 160 170 180 190 200 210 GAGGAACGAG GTGCCCGCCC GGCGAGGGCG CAGGACCCGG GGAGCCGCGC AGGGAGGAGG GTCGGGCGGG TCTCAGCCCC TCCTCGCCCC CAG|GCTCCCA -----G---- -G-------- ---------- ---------- ---------- ---------- ---------- --------A- ------T--- ---|-----------G---- -G-------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---|-----------G---- -G-------- ---------- ---------- ---------- -------T-- ---------- ---------- ---------- ---|-----------G---- -G-------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---|-----------G---- -G-------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---|-----------G---- -G-------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---|------- AAGCCAATCA ------------------------------------------------------- GCGTCTCCGC 
---------------------------------------------------A--- AGTCCCGGTT ------------------------------------------------------- CTAAAGTCCC -----------------------------------------------G-----.. CAGTCACCCA ---------------------------------------------.....----- CCCGGACTCA ------------------G ---------G ---------G ---------G ---------- GATTCTCCCC ------------------------------------------------------- AGACGCCGAG ------------------------------------------------------- |ATGCGGGTCA |---------|---------|---------|---------|---------|---------- C.W. Bill Young Marrow Donor Recruitment and Research Program 99 C*010201 C*04010101 C*050101 C*06020101 C*0802 C*12030101 C*1701 CAGCGGAGAG ---------------------T-----------------T---------T----- C*010201 C*04010101 C*050101 C*06020101 C*0802 C*12030101 C*1701 1020 1030 1040 1050 1060 1070 1080 1090 1100 1110 GGGAGCCTTC CCCATCTCCC GTAGATCTCC CGGCATGGCC TCCCACGAGG AGGGGAGGAA AATGGGATCA GCGCTAGAAT ATCGCCCTCC CTTGAATGGA ---------- ---------- ---------- ---G------ ---------- ---------- ---------- ---------- ---------- ------------------- ---------T ---------- ---G------ ---------- ---------- ---------- -----G---- ---------- ------------------- ---------T ---------- ---G------ ---------- ---------- ---------- ---------- ---------- ------------------- ---------T ---------- ---G------ ---------- ---------- ---------- -----G---- ---------- ------------------- ---------T ---------- ---G------ ---------- ---------- ---------- ---------- ---------- ------------------- ---------T A--------- ---G------ ---------- ---------- ---------- ---------- ---------- ---------- CCTACCTGGA ------------------------------------------------------- GGGCACGTGC -------------------------------------------------GA---- GTGGAGTGGC ------------------------------------------------------- Question 5: For example: Forward: 5' 3' GGT GTA AAC TTG TAC CAG TCCGCAGATA --------------------------------------------------G---- CCTGGAGAAC 
------------------------------------------------------- GGGAAGGAGA ---------------A-----------------A--------------------- CGCTGCAGCG ------------------------------------------------------- CGCGG|GTACC -----|---------|---------|---------|---------|---------|----- AGGGGCAGTG -------------------------------------------------------
Reverse: 5' CAA CTC TAC CGC TGC TAC 3'

Question 6: 270,000 copies

Question 7:
Part one: 5' CTT GGA GCA GGT TAA ACA 3' for example (codons 7-13). Place the mismatch on the 3' end of a PCR primer.
Part two: This cannot be done. The best choice is 5' CTG GGG CGG CCT GAT GAG 3' (codons 52-58), which will amplify all DRB1*11 alleles plus a few other alleles such as DRB1*14:15.
Part three: No, it cannot be done.

Chapter 7:

Question 1: 3' GATGCGGATT 5'. If the DNA is double stranded, the complementary sequence, 5' CTACGCCTAA 3', can also be used.
50% G+C. Approximate melting temperature is 30 °C ([4 x 5] + [2 x 5] = 30). No hybridization will occur at 50 °C, because it is above the melting temperature.
If the probe is positioned so that the variation between the alleles is at the 3' or 5' end of the probe, it will be very difficult to achieve specific hybridization; the probe will likely bind well to both alleles.

Question 2:
Part one: For example: 5' CGG CCT GAT GAG GAG TAC (codons 55-60). Place the mismatch in the center of the oligo for a probe.
Part two: For example: 5' TGG AAC AGC CAG AAG GAC (codons 61-66), although DRB1*04:11 differs in this region.
Part three: 5' GAG GAG GTT AAG TTT GAG (codons 9-14).

Question 3: First, amplify only the DRB1*01 alleles with PCR primers designed around codons 25-31 and DRBAMPB (for example). Then use a probe designed around codons 83-89 to distinguish between DRB1*01:01 and DRB1*01:02. Use a probe with the sequence 5' C GGG GCT GTG GAG AGC TT (for example, to detect DRB1*01:02 and not DRB1*01:01, *01:03, *01:04).
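The melting temperature estimate in Chapter 7, Question 1 (4 degrees for each G or C plus 2 degrees for each A or T) is easy to reproduce. The Python sketch below is an illustration, not part of the workbook: the function names `reverse_complement`, `gc_fraction` and `estimated_tm` are hypothetical, and the 4/2 rule is only a rough guide for short oligonucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def estimated_tm(seq):
    """Rough melting temperature in degrees C: 4 per G or C, 2 per A or T."""
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


probe = "CTACGCCTAA"  # written 5' to 3', as in the answer
print("other strand, 5' to 3':", reverse_complement(probe))
print("G+C fraction:", gc_fraction(probe))
print("estimated Tm:", estimated_tm(probe), "degrees C")
```

Read from right to left, the reverse complement is the 3' GATGCGGATT 5' sequence given in the answer. Both strands have five G or C bases and five A or T bases, so both give the same estimate of 30 °C.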
Chapter 8:

Question 1: Create PCR primers that would amplify DRB1*07:01.

Question 2: T G C A

Chapter 9: No questions

Chapter 10: No questions

Chapter 11:

Question 1:
DRB1*01:01, *04:01
DRB1*11:01, *07:01

Question 2: The sample also carries one of the DRB1*04 alleles. Since one of the DRB1*04 alleles, DRB1*04:02, also carries the sequence detected by probe DR7007, it is not known if DRB1*01:03 is present. As a result, the sample would be typed as [DRB1*01:03 and DRB1*04:01 or *04:03 or *04:04 or ... *04:10] OR [DRB1*01:01 or *01:02 or *01:03 and DRB1*04:02]. To determine if DRB1*01:03 is present, you would need to amplify DRB1*01 using a group-specific amplification.

Question 3: These two allele combinations would have exactly the same sequence. Positions of mixed bases (i.e., two bases, a polymorphic residue) are the same. One must isolate individual alleles to identify which combination is present. Look at the sequence of the two alternative DRB1*03 alleles and see how knowing whether the DRB1*03 specific sequence at codons 9-14 is on the same strand of DNA (i.e., in the same allele) as the sequence at codons 77 and 86 will identify which allele, DRB1*03:01:02 or DRB1*03:14, is present.

Question 4: The person could have any one of the DRB1*11 alleles (now numbering over 30) or DRB1*03:08 or DRB1*04:15 or DRB1*12:04 or DRB1*14:11. The GAG codon at 58 used to be unique for the DRB1*11 alleles but this is no longer the case.
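The point of Chapter 11, Question 3, that two different allele pairs can give the same mixed-base sequence, can be shown with a toy example. The Python sketch below is an illustration only: the four 6-base sequences are invented and are not real HLA alleles, and the function name `heterozygous_read` is hypothetical. The mixed-base letters follow the standard IUPAC codes (for example, R for A or G and Y for C or T).

```python
# IUPAC codes for a single base or a mix of two bases at one position.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def heterozygous_read(allele_1, allele_2):
    """Sequence seen when two equal-length alleles are sequenced together."""
    return "".join(IUPAC[frozenset(pair)] for pair in zip(allele_1, allele_2))


# Invented sequences: both pairs differ at the same two positions, but the
# variants sit on different strands (different phase) in each pair.
pair_one = ("ACGTAC", "ATGTGC")
pair_two = ("ACGTGC", "ATGTAC")

print(heterozygous_read(*pair_one))
print(heterozygous_read(*pair_two))
```

Both pairs give the same read, AYGTRC, so the mixed sequence alone cannot show which pair is present. Only isolating one allele, for example by group-specific amplification, reveals which bases lie on the same strand.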