American Journal of Hematology 70:269–277 (2002)

Spectrum of β-Thalassemia Mutations and Their Association With Allelic Sequence Polymorphisms at the β-Globin Gene Cluster in an Eastern Indian Population

Ritushree Kukreti,1* Debasis Dash,1 Vineetha K E,1 Sanchita Chakravarty,1 Swapan Kr Das,2 Madhusnata De,2 and Geeta Talukder2

1 Functional Genomics Unit, Centre for Biochemical Technology (CSIR), Delhi University Campus, Delhi, India
2 Thalassemia Counselling Department, Vivekananda Institute of Medical Sciences, Ramakrishna Mission Seva Pratisthan, Calcutta, India

In this report, the spectrum of β-thalassemia mutations and genotype-to-phenotype correlations were defined in a large number of patients (β-thalassemia carriers and major) with varying disease severity in an Eastern Indian population, mainly from the state of West Bengal. The five most common β-thalassemia mutations were detected, which included IVS1-5 (G→C), codon 15 (G→A), codon 26 (G→A), codon 30 (G→C), and codon 41/42 (−TCTT). These accounted for 85% of the 80 β-thalassemic alleles deciphered from 56 patients, including β-thalassemia major and carriers, and 15% of alleles remained uncharacterized in these patients. Expression of the human β-globin gene is regulated by an array of cis-acting DNA elements, including five DNase I hypersensitive sites (HSs) in the locus control region (LCR), promoters that incorporate certain silencer elements, and enhancers 3′ of the β-globin gene. For detailed studies and to understand the molecular basis of β-thalassemia, we studied two groups of subjects: a group of 12 patients from four families having β-thalassemia major and carrier phenotype and a control group of 26 healthy individuals.
In these two groups, we examined portions of the β-globin gene locus control region HSs 1, 2, 3, and 4, which included the (CA)x(TA)y repeat motif, the (AT)xNy(AT)z repeat motif, the inverted repeat sequence TGGGGACCCCA, the promoter region of the Gγ-globin gene, an (AT)x(T)y repeat 5′ of the silencer region, and the β-globin gene and its 3′ flanking region. We investigated the allelic sequence polymorphisms in these regions and their association with the β-thalassemia mutations to identify possible genotype–phenotype relationships in β-thalassemia patients. An analysis of cis-acting regulatory regions showed varied sequence haplotypes associated with some frequent β-thalassemia mutations in this Eastern Indian population. Am. J. Hematol. 70:269–277, 2002. © 2002 Wiley-Liss, Inc.

Key words: β-thalassemia mutation; β-globin gene; locus control region; hypersensitive site; allelic sequence polymorphisms

Contract grant sponsor: Department of Biotechnology, Government of India, Programme on Functional Genomics.
*Correspondence to: Dr. Ritushree Kukreti, Functional Genomics Unit, Centre for Biochemical Technology (CSIR), Delhi University Campus, Mall Road, Delhi 110 007, India. E-mail: [email protected]
Received for publication 12 November 2001; Accepted 15 March 2002
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/ajh.10117

Fig. 1. Physical representation of the β-globin genes and β-globin LCR. (a) Location of the seven restriction enzyme sites defining the RFLP haplotypes (Orkin et al., 1982): HincII 5′ of the ε-globin gene; HindIII in the Gγ and Aγ genes; HincII in the ψβ1-globin gene; HincII 3′ of the ψβ1-globin gene; AvaII in the β-globin gene; BamHI 3′ of the β-globin gene. (b) The human β-globin LCR, located approximately 8–22 kb upstream of the ε-globin gene, is composed of 5 hypersensitive sites. Arrows above the map indicate the location of the major HSs of the LCR: HS4 showing the SNP at the palindromic sequence, HS2 showing the AT-rich region, and HS1 showing the (CA)x(TA)y motif. (c) β-Globin gene showing positions of the 5′ silencer region including the (AT)x(T)y motif, promoter region, exons, introns, and 3′ sequences (enhancer) present downstream of exon 3 of the β-globin gene. Polymorphic positions in the 5′ flanking region, inside the β-globin gene, and in the 3′ flanking region define the sequence haplotype. β-Thalassemia mutations Cd 15 (G→A), Cd 30 (G→C), and IVS1-5 (G→C) are represented by bold characters. The β-globin gene and HSs of the β-LCR are amplified and sequenced by appropriate synthetic oligonucleotides.

INTRODUCTION

β-Thalassemia is a highly prevalent autosomal recessive disorder characterized by the reduced or absent expression of the β-globin gene, leading to an imbalance of α- and β-globin chains [1]. So far, over 300 β-thalassemia alleles have been characterized in or around the β-globin region. The frequency of the β-thalassemia trait in India is 3.3%, with 1–2 per 1,000 couples being at risk of having an affected offspring each year, leading to a high societal burden [2]. As the ethnic composition of the Indian population is heterogeneous [3], each region of the country has its own distinct set and frequency of β-thalassemia mutations [4]. Orkin et al. [5] observed that the chromosomal environments bearing the β-thalassemia mutant alleles are in strong linkage disequilibrium with specific patterns of DNA restriction site polymorphism, referred to as the β-globin gene cluster or RFLP haplotypes (Fig. 1). Analysis of these haplotypes in the β-globin gene cluster has been useful in determining the chromosomal background of β-thalassemia mutations in several human populations.

The eastern region of India is not well characterized in this regard. The Bengali population from the state of West Bengal has been the subject of our study. It is an admixture of native people with later migrants who settled there and left their genetic imprints.
β-Thalassemia mutations are found in abundance here. We have sequenced minimal combinations of cis-acting regulatory elements that normally direct full expression of the β-globin gene [6,7] (Fig. 1) and analyzed the allelic sequence variations at the β-globin gene cluster in an eastern Indian population. In the present study, we also attempted to study the nucleotide variations in the β-globin gene cluster and their association with the β-thalassemia mutations so that we could explore the genetic basis of the clinical diversity of the disease in the eastern part of India.

MATERIALS AND METHODS

Genomic DNA Samples

β-Thalassemia patients studied for this work were recruited from the Vivekananda Institute of Medical Sciences, Calcutta, after diagnosis. The disease was diagnosed by clinical and hematological data. Control blood samples were collected from different communities with informed consent.

Phenotype Analysis

Genomic DNA was isolated from whole blood in anticoagulant (EDTA) by using the standard proteinase K–phenol–chloroform procedure as described in Old et al. [8]. Abnormal hemoglobins were analyzed by agarose gel electrophoresis at pH 8.6; Hb A2 and Hb F were determined by gel elution and alkali denaturation [9], respectively. α-Globin genotype was determined as described previously [10].

Mutation Analysis by Allele-Specific Oligonucleotide, ARMS-PCR, and SnapShot Primer Extension Kit

Screening for five known β-thalassemia mutations, IVS1-1 (G→T), IVS1-5 (G→C), codon 8/9 (+G), codon 26 (G→A), and codon 41/42 (−TCTT), was performed with the Amplification Refractory Mutation System (ARMS) as described in Old et al. [8]. The single-nucleotide primer extension method was used for validation or comparative genotyping of known SNPs and point mutations [11]. This straightforward technique permits exact base identity determination of a polymorphic locus by using the ABI Prism SnapShot ddNTP primer extension kit (Applied Biosystems, Foster City, CA) without direct sequencing. Standard Gene Scan analysis software was used to analyze the data. We also opted for direct DNA sequencing of the β-globin gene to further confirm the mutations.

Examination of Cis-Acting Potential Regulators

The β-globin gene, spanning a 1.8-kb region, and the following regions cis to the β-globin gene were examined (GenBank accession number U01317): (a) β-globin gene cluster locus control region 5′ HS-1, 5′ HS-2, 5′ HS-3, and 5′ HS-4; (b) the promoter region of the Gγ-globin gene; (c) the (AT)x(T)y repeat in the 5′ silencer region; and (d) the 3′ flanking region of the β-globin gene. The (CA)x(TA)y repeat motif in β-LCR HS1, the (AT)xNy(AT)z repeat motif in β-LCR HS2, and the (AT)x(T)y repeat motif in the 5′ β-globin gene promoter region were analyzed by Genescan to determine the size of the PCR products, and direct sequencing was carried out using forward and reverse primers [12,13]. The location of these cis-acting regulatory elements, the PCR primers used for their amplification, sizing, and sequencing, and the optimum conditions for polymerase chain reactions with the Perkin-Elmer Gene Amp PCR System 9600 (Perkin-Elmer, Oak Brook, IL) are shown in Table I. The PCR products were purified with the Qiaquick PCR purification Kit (Qiagen), and DNA sequencing was performed by the ABI Prism 377 automated DNA sequencer (Applied Biosystems) using dye terminator chemistry. The sequencing reaction consisted of 25 cycles, with denaturation at 96°C for 30 sec and annealing at 60°C for 4 min. Sequences were aligned with the corresponding wild-type sequences using the Factura and Sequence Navigator software programs.

TABLE I. PCR-Based Methods of Analysis for the β-LCR-HS1, -HS2, -HS3, -HS4, Gγ-Promoter Region, (AT)x(T)y, and 3′ Flanking Sequence (Enhancer) Polymorphisms

Region analyzed: β-LCR–HS1
  Primers (sequence 5′–3′): FP–CCTGCAAGTTATCTGGTCAC; RP–CTTAGGGGCTTATTTTATTTTGT
  Location of the primers (GenBank coordinates): 13156–13601
  Size of amplified product (bp): 445
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 1′; 72°C, 1′) 3 cycles; (94°C, 30″; 61°C, 1′; 72°C, 1′) 3 cycles; (94°C, 30″; 60°C, 1′; 72°C, 1′) 23 cycles; 72°C, 10′

Region analyzed: β-LCR–HS2
  Primers (sequence 5′–3′): FP–CAGGGCAGATGGCAAAAA; RP–CTGACCCCGTATGTGAGCA
  Location of the primers (GenBank coordinates): 8757–9217
  Size of amplified product (bp): 460
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 75″; 72°C, 15″) 27 cycles; 72°C, 10′

Region analyzed: β-LCR–HS3
  Primers (sequence 5′–3′): FP–ATGGGGCAATGAAGCAAAGGAA; RP–ACCCATACATAGGAAGCCCATAGC
  Location of the primers (GenBank coordinates): 4397–4992
  Size of amplified product (bp): 595
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 75″; 72°C, 15″) 27 cycles; 72°C, 10′

Region analyzed: β-LCR–HS4
  Primers (sequence 5′–3′): FP–GCAAACACAGCAAACACAACGAC; RP–CAGGGCAAGCCATCTCATAGC
  Location of the primers (GenBank coordinates): 668–1109
  Size of amplified product (bp): 442
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 75″; 72°C, 15″) 35 cycles; 72°C, 10′

Region analyzed: Gγ–promoter region
  Primers (sequence 5′–3′): FP–GGCCCCTTCCCCACACTATC; RP–ATGGCAGAGGCAGAGGACAGGTTG
  Location of the primers (GenBank coordinates): 34273–34816
  Size of amplified product (bp): 544
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 75″; 72°C, 15″) 35 cycles; 72°C, 10′

Region analyzed: (AT)x(T)y region
  Primers (sequence 5′–3′): FP–TTCCCAAAACCTAATAAGTAAC; RP–CCTCAGCCCTCCCTCTAA
  Location of the primers (GenBank coordinates): 61440–61960
  Size of amplified product (bp): 520
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 1′; 72°C, 1′) 3 cycles; (94°C, 30″; 61°C, 1′; 72°C, 1′) 3 cycles; (94°C, 30″; 60°C, 1′; 72°C, 1′) 23 cycles; 72°C, 10′

Region analyzed: 3′ flanking sequences (enhancer)
  Primers (sequence 5′–3′): FP–TGCCCTGGCCCACAAGTATC; RP–TCAGGGGAAAGGTGGTATCTCTAA
  Location of the primers (GenBank coordinates): 63586–64125
  Size of amplified product (bp): 539
  PCR profile: 94°C, 5′; (94°C, 30″; 62°C, 75″; 72°C, 15″) 35 cycles; 72°C, 10′

PCR, polymerase chain reaction; FP, forward primer; RP, reverse primer; bp, base pairs. Primer locations correspond to positions in the human β-globin gene cluster sequence.

RESULTS

Spectrum of β-Thalassemia Mutations in Eastern India

A total of 80 β-thalassemic alleles have been deciphered from 10 homozygous β-thalassemia, 40 heterozygous β-thalassemia, 4 Hb E/β-thalassemia, and 6 double heterozygous β-thalassemia Eastern Indian patients. β-Thalassemia carriers and β-thalassemia major were defined by the phenotypic characteristics (high levels of Hb A2, >3.5%, and transfusion dependence) [14]. Common mutations IVS1-1 (G→T), IVS1-5 (G→C), codon 8/9 (+G), codon 26 (G→A), codon 41/42 (−TCTT), codon 15 (G→A), and codon 30 (G→C) described in the Indian population [15] were screened by ARMS. Five different mutations were detected, as shown in Table II: IVS1-5 (G→C), Hb E:codon 26 (G→A), codon 41/42 (−TCTT), codon 15 (G→A), and codon 30 (G→C). These accounted for 28.75%, 25%, 17.5%, 3.75%, and 10% of all the studied β-thalassemic alleles, respectively, and 15% were uncharacterized. The single nucleotide extension technique, using fluorescent-labeled dideoxy nucleotides and primers designed to terminate on the base adjacent to the SNP, has also been used to detect these mutations in the β-globin gene, and it is a cost-effective method for investigating point mutations and deletions for diagnostic purposes.

TABLE II. Spectrum of β-Thalassemia Mutations in Eastern India

Mutation                  Alleles    %
IVS1–5 (G→C)              23         28.75
Hb E:codon 26 (G→A)       20         25.0
Codon 41/42 (−TCTT)       14         17.50
Codon 15 (G→A)            3          3.75
Codon 30 (G→C)            8          10.0
Uncharacterized           12         15.0
Total                     80         100

Family Studies

We studied 12 patients belonging to four unrelated families with β-thalassemia carrier and major phenotype from an eastern Indian population. Three individuals of family D showed the phenotypes of heterozygous and homozygous β-thalassemia; however, sequence analysis detected no mutation in either allele of the β-globin gene. All the family members were analyzed for the β-globin gene and subsequently for the cis-acting elements: β-LCR HS1, β-LCR HS2, β-LCR HS3, β-LCR HS4, the 5′ silencer region (AT)x(T)y, the promoter region of the Gγ-globin gene, and the 3′ flanking region. The pedigrees, hematological parameters, α-genotype, and β-thalassemia mutations of all members of the four families are given in Fig. 2. Genotypes linked to β-thalassemia mutations in these members were revealed by PCR and genomic sequencing (Table III). Complete sequencing of the repeat motifs (CA)x(TA)y, (AT)xNy(AT)z, and (AT)x(T)y of the β-globin gene cluster was carried out in 26 normal individuals from the eastern Indian population, and the number and frequencies of observed repeat polymorphisms are shown in Table IV. It is often difficult to correctly align the repeat polymorphism. We have combined analysis of PCR product size using fluorescent primers with information derived from sequence analysis to show that it is possible to discern the pattern of various repeats in the individuals with correct typing of the motifs [12,13].

Fig. 2. Pedigrees, hematological results, α-genotype, and β-thalassemia mutations of patients from four families with β-thalassemia carrier and β-thalassemia major phenotype. N: β-globin gene GenBank reference sequence. The arrow indicates the family whose members showed the phenotypes of heterozygous and homozygous β-thalassemia although no mutation was detected in either allele of the β-globin gene.

Polymorphic Sites in β-LCR HS1, β-LCR HS2, β-LCR HS3, and β-LCR HS4

The β-LCR HS1 sequence analysis of the 445-bp fragment revealed three polymorphic patterns in the (CA)x(TA)y motif: (CA)12(TA)6, (CA)10(TA)8, and (CA)12(TA)7, the latter being a novel polymorphic pattern found in the Indian population. The observed variations led to allelic variability, both in base composition and in the length of the repeat, and they add an element to the genetically variable environment of the β-thalassemia chromosome. Sequence analysis of the 462-bp HS2 fragment, including the highly polymorphic (AT)xNy(AT)z motif of the β-LCR, showed polymorphisms identical to those described previously [16,17]. The motif occurred in three different sequence configurations, (AT)10N12(AT)11, (AT)9N12(AT)11, and (AT)9N12(AT)10, specific to the genotype of the SNP at the palindromic region of HS4 of the β-LCR. It was observed that the G allele of β-LCR HS4 is associated with only one distinct polymorphic pattern of AT-rich segments, (AT)9N12(AT)10 of the LCR HS2 region [17] (Table IV), indicating that these mutational events were probably rare and arose from a common founder. In the core region of 5′ HS3 of the LCR, there were no variations from the reference sequence among the different haplotypes (data not shown).

Polymorphic Sites in Promoter Region of the Gγ-Globin Gene, 5′ Silencer Region to the β-Globin Gene, β-Globin Gene, and Its 3′ Flanking Region

Analysis of the Gγ-globin gene promoter showed no variation from the normal sequence. We examined sequence polymorphism in the 5′-flanking region (−650 to −250) upstream of the cap site, in the β-globin gene (positions −224 to the poly A signal), and in its 3′ flanking region (sequences at the end of exon 3 of the β-globin gene) [18] in 24 β-thalassemic alleles from eastern India. The 5′-flanking region contains numerous dimorphic nucleotide positions [10]. Three of them were polymorphic in our samples, at positions −551, −521, and −340. In addition, a microsatellite, (AT)x(T)y, is subject to frequent micro-insertion or deletion [19]. We found (AT)7(T)7, (AT)8(T)5, and (AT)9(T)5 configurations in the families studied here. No polymorphisms were observed in either the 5′ or the 3′ untranslated regions of the gene. Five dimorphic sites defining the intragenic polymorphism in the β-globin gene (Fig. 1, positions +59, +511, +569, +576, and +1,161) and four dimorphic sites in the 3′ flanking region of the β-globin gene (Fig. 1, positions 101, 181, 183, and 339) have been studied. Three β-thalassemia mutations, i.e., codon 15 (G→A), codon 30 (G→C), and IVS1-5 (G→C), and two uncharacterized alleles were analyzed in our family members.
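The two-block repeat configurations reported above, such as (AT)7(T)7 or (CA)12(TA)6, can in principle be read off a sequenced allele programmatically. The following Python sketch is only an illustration and is not the Genescan and direct-sequencing workflow used in this study: the function name `count_two_block_repeat` and the flanking bases in the usage lines are hypothetical, and a real trace would first need base calling and alignment to the reference.

```python
import re


def count_two_block_repeat(sequence, first_unit, second_unit):
    """Return (x, y) for the longest (first_unit)x(second_unit)y run, or None.

    Hypothetical helper for illustration; units are plain DNA strings.
    """
    pattern = re.compile(
        "((?:%s)+)((?:%s)+)" % (re.escape(first_unit), re.escape(second_unit))
    )
    best_match = None
    # Keep the longest run so short chance matches elsewhere are ignored.
    for match in pattern.finditer(sequence.upper()):
        if best_match is None or len(match.group(0)) > len(best_match.group(0)):
            best_match = match
    if best_match is None:
        return None
    x = len(best_match.group(1)) // len(first_unit)
    y = len(best_match.group(2)) // len(second_unit)
    return (x, y)


# Synthetic alleles with made-up flanking bases (GG ... CC / GC), not real data.
silencer_allele = "GG" + "AT" * 7 + "T" * 7 + "CC"
hs1_allele = "GG" + "CA" * 12 + "TA" * 7 + "GC"

print(count_two_block_repeat(silencer_allele, "AT", "T"))
print(count_two_block_repeat(hs1_allele, "CA", "TA"))
```

Taking the longest run rather than the first match is a design choice for this sketch; it avoids reporting an incidental short AT-T stretch in the flanking sequence as the motif.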
TABLE III. β-Thalassemia Mutations and Their Associated Sequence Haplotypes of β-Globin Gene Cluster in Eastern Indian Populations(a)

A B C D E F G H
12–6 10–8 * * * * * 12–7 *
10–12–11 9–12–10 * * 9–12–10 9–12–10 9–12–10 9–12–10 *
HS2 (AT)xNy(AT) −10,623
−551 T C * C C C * * C
−158 C * * * * * * * *
γ-Promoter region
HS4 (TGGGGACCCCA) −18,542 A G * * G G G * * G
7–7 8–5 9–5 * * 8–5 8–5 8–5 9–5
C * * * * T T * *
A * * * * * * * *
5′ Silencer region (AT)x(T)y −530 −521 −491
T C * * * C C C C
−340 C * T * * * T T T
Exon 1 +59
C G * * * * * G G
G T T * * * * T T
C * * * * * * * *
T * C C C C C C C
β-Globin gene Intron 2 +511 +569 +576 +1161
1 4 5 2 2 2 3 6 6 FR
G * * * * * * * *
G * A * * * * * *
C * A * * A A A A
A * * * * * * * *
3′ to β-globin gene 101 181 183 339
BGC ref; Codon 15; Codon 30; Codon 30; Codon 30; IVS1–5; IVS1–5; Uncharacterised allele; Uncharacterised allele
β-thalassemia mutations
HS1 (CA)x(TA)y −6,284
β-LCR

(a) In β-LCR HS2, N = ACA CAT ATA CGT. Various positions in the β-LCR are numbered relative to the ε-globin gene. Various positions are numbered at the 5′ end and within the β-globin gene in relation to the β-globin gene cap site, and at the 3′ end in relation to its polyadenylylation site. Positions +59, +511, +569, +576, and +1161 refer to the framework defined by Orkin et al. in 1982 [5]; FR, framework. In each position, asterisks indicate identity with the β-globin GenBank reference sequence. BGC refers to a normal individual from the eastern Indian population.

TABLE IV. Number (n) and Percentages (%) of Different Repeat Sequences of the β-Globin Gene Cluster in the Alleles From Normal Individuals of Eastern Indian Population

Origin (Eastern India)
Repeat sequences                n      %

β-LCR HS1 (CA)x(TA)y
  (CA)12(TA)6                   34     65.4
  (CA)10(TA)8                   7      13.5
  (CA)12(TA)7                   11     21.1
  No.(a)                        52     100

β-LCR HS2 (AT)xNy(AT)z
  (AT)9N12(AT)10                18     34.6
  (AT)9N12(AT)11                16     30.8
  (AT)10N12(AT)11               18     34.6
  No.                           52     100

5′ Silencer region (AT)x(T)y
  (AT)7(T)7                     34     65.4
  (AT)8(T)5                     12     23.1
  (AT)9(T)5                     6      11.5
  No.                           52     100

(a) No. refers to total number of analyzed alleles.

Sequence Polymorphism Associated With β-Thalassemia Mutations

The sequence haplotypes analyzed in the β-globin gene cluster define the chromosomal background in which the thalassemia mutation arose. Nine single nucleotide polymorphisms and three length variants were identified in the β-globin gene cluster, and these sites could be segregated into eight sequence haplotypes in the thalassemia families studied. Three observed β-thalassemia mutations, codon 15 (G→A), codon 30 (G→C), and IVS1-5 (G→C), were found to be linked with different sequence haplotypes; sequence variations within the structural gene refer to the frameworks described by Orkin et al. [5] (Table III). The codon 15 (G→A) mutation is linked to haplotype A and sequence framework 4. The codon 30 (G→C) mutation is associated with three different haplotypes: B, C, and D. Haplotype B has sequence framework 5, while haplotypes C and D are linked with the identical sequence framework 2. The most common mutation, IVS1-5 (G→C), is associated with two different haplotypes, E and F, with sequence frameworks 2 and 3, respectively. In family D, the patient with the β-thalassemia major phenotype inherited different chromosome haplotypes, G and H, which are linked to the identical framework 6 sequence containing the normal β-globin gene (uncharacterized allele). Because this is the first molecular characterization involving the repeat sequence polymorphism in the β-globin gene cluster in eastern India, we combined automated genotyping with direct sequence analysis and family studies to demonstrate the nature of the repeat polymorphic sequences.

DISCUSSION

We have investigated the molecular basis of β-thalassemia in Eastern India and linked some β-thalassemia mutations to β-globin gene cluster sequence haplotypes using population samples from eastern India.
This is the first report of an association of repeat polymorphic markers with the globin gene mutations in this region of India. Five different β-thalassemia mutations were detected in 85% of the total 80 β-thalassemic alleles studied (Table II). Among the β-thalassemia mutations, IVS1-5 (G→C) and Hb E are the most common, occurring at percentages of 28.75 and 25, respectively. IVS1-5 (G→C) is also the most frequent mutation in many other parts of India, i.e., Punjab, Gujarat, Maharashtra, and southern India [2,15,20]. Characterization of the mutation patterns revealed by such studies should provide the basis for prenatal diagnosis and genetic counseling of affected individuals.

Although regulation of β-globin gene expression may have diverse explanations, there could be genetic elements cis to the β-globin gene that affect the severity of the disease. Several genetic loci are candidates for cis-acting regulators of β-globin gene transcription [7]. The β-globin gene LCR, located 6–18 kb 5′ to the ε-globin gene, plays a vital role in the tissue-specific and developmental-specific expression of globin genes. It consists of five hypersensitive sites (HSs 1–5) [21], which occur in phylogenetically conserved regions that bind transcription factors. Three different polymorphic (CA)x(TA)y patterns of β-LCR HS1, namely, (CA)12(TA)6, (CA)10(TA)8, and (CA)12(TA)7, were observed in our patients. These polymorphic patterns are also present in the studied normals (Table IV), indicating that they are presumably polymorphisms [6]. We observed from automated genotyping and direct sequence analysis that the G allele in the palindromic sequence of the HS4 locus is associated with only one distinct pattern, (AT)9N12(AT)10, of HS2 of the β-LCR. There appears to be linkage disequilibrium at the two loci, with the preponderance of occurrence of the G allele at HS4 along with the (AT)9N12(AT)10 allele at the HS2 locus of the β-LCR, suggesting that the G allele could be an evolutionarily new mutation in the study population [17]. There are reports that sequence variations in the HS2 region of the β-LCR are associated with altered levels of fetal hemoglobin [16] in thalassemia intermedia patients.

The DNA silencer region 5′ to the β-globin gene acts as a negative regulatory element, and it is possible that the β-globin gene silencer with the core structure (AT)x(T)y may modulate expression of the β-globin gene [22]. Three length polymorphisms were observed in the (AT)x(T)y sequence region: (AT)7(T)7, (AT)8(T)5, and (AT)9(T)5. The AT dinucleotide insert accompanies either the presence or the absence of a T→A replacement at −528, showing a +ATA,−T event as reported previously [23]. These variants have also been described by Perrin et al. [24], Ragusa et al. [25], and Gasperini et al. [26], mostly in Algerian, Sicilian, and Sardinian populations. The (AT)9(T)5 motif has been studied with varying implications. It has been associated with silent β-thalassemia, a mild phenotype, and higher levels of Hb F in some homozygous β-thalassemia patients [25]. The dimorphisms observed in the 5′ region at positions −551, −521, and −340, within the β-globin gene at positions +59, +511, +569, and +1,161, and in the 3′ flanking region of the β-globin gene at positions 181 and 183 have also been reported in Melanesian, Caucasian, Asian, and African populations [18].

The codon 15 (G→A) mutation appears on one sequence haplotype, A. The codon 30 (G→C) mutation showed association with three distinct sequence haplotypes: B, C, and D. These haplotypes exhibit differences on both sides of the mutation and could be the result of a conversion between a normal and a thalassemia chromosome.
Such events are likely, as variant 9-12-10 in the β-LCR HS2 and variant 9-3 in the silencer region 5′ to the β-globin gene are frequent in normal alleles from the eastern Indian population (Table IV). The most common mutation, IVS1-5 (G→C), appears on two sequence haplotypes, E and F, differing by three base substitutions (−551, −521, and +59). The haplotypes associated with this mutation are also represented among the normal alleles of this population. In India this mutation is strongly linked with eight different haplotypes and two different sequence frameworks, 2 and 3 [27], but it is also found in the same percentage in other populations, such as Indo-Mauritians, with three varied haplotypes on an identical sequence framework [27]. Recently Bandyopadhyay et al. [2] found this mutation to be strongly linked (82%) with a + − − − − + haplotype (HindII, HindIII Gγ, HindIII Aγ, HindII 5′, HindII 3′, and HinfI) in an eastern Indian population. This mutation is considered to be the oldest β-thalassemia allele in India, on the basis of its high haplotype diversity as well as its wide distribution in this subcontinent. This type of molecular heterogeneity underlies the wide spectrum of clinical manifestations of the disease.

In brief, this analysis revealed the association of some common β-thalassemia mutations with varied haplotypes, which could be generated by the large size of the chromosomal region and the existence of hot spots for meiotic recombination, leading to many recombinations, substitutions, or conversions with the adjacent regions [24]. This exercise gives us an opportunity to determine the allele diversity associated with the β-thalassemia mutations; it also allows us to study the nucleotide variations in the β-globin gene cluster and their associations with the β-thalassemia mutations in order to explore the genetic basis of the clinical diversity of the disease in the eastern part of India, primarily the state of West Bengal.
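As a check on the allele frequencies quoted in this discussion, the percentages in Table II can be recomputed from the allele counts. The Python sketch below uses only the counts given in Table II; the variable and function names are illustrative and not part of the original analysis.

```python
# Allele counts copied from Table II (80 beta-thalassemic alleles in total).
ALLELE_COUNTS = {
    "IVS1-5 (G>C)": 23,
    "Hb E: codon 26 (G>A)": 20,
    "Codon 41/42 (-TCTT)": 14,
    "Codon 15 (G>A)": 3,
    "Codon 30 (G>C)": 8,
    "Uncharacterized": 12,
}


def allele_percentages(counts):
    """Convert allele counts into percentages of all alleles studied."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


percentages = allele_percentages(ALLELE_COUNTS)
# Share of alleles carrying one of the five identified mutations.
characterized = 100 - percentages["Uncharacterized"]

for name, pct in percentages.items():
    print(f"{name:24s}{pct:6.2f}%")
print(f"Characterized alleles: {characterized:.2f}%")
```

The five identified mutations together give the 85% figure cited in the text, with the remaining 15% corresponding to the uncharacterized alleles.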
One of our aims was to learn whether the severity of the phenotype of the thalassemia carrier and thalassemia major in family D could be predicted on the basis of genotypic analysis. The clinical phenotypes are usually heterogeneous and depend mainly on the β-thalassemia mutation inherited [28]. β-Globin gene analysis of family D members did not reveal any mutation, in either the heterozygous or the homozygous state, corresponding to the severity of the disease. Studies of such families would be of interest, as they may reveal some association of the sequence polymorphisms in the cluster with low β-globin gene expression and consequently with the phenotypes of β-thalassemia. To explain the phenotype of a β-thalassemia carrier and β-thalassemia major, we postulated the existence of an unknown genetic determinant in these patients that might be present in the β-globin gene cluster and affect the phenotype; simple sequences present throughout the cluster are known to form a number of unusual structures in chromatin, and they are believed to play an important role in eukaryotic gene regulation [29]. Sequence analysis of the major regulatory regions of the β-globin gene cluster detected various changes from the reference sequence. We found sequence haplotypes G and H to be associated with the uncharacterized allele in patients from family D. The nucleotide changes (data not shown) and the repeat variations (Table IV) in the β-globin gene cluster found in our study are also represented among the normal alleles of this population. This indicates that the sequence repeat polymorphisms present in the β-globin gene cluster are common polymorphisms [6,7,10,16], and although they may not necessarily be associated with the hematological characteristics of β-thalassemia, they could play an important role in subtle expression of the locus as a whole [16].
We speculate that the genetic determinant may lie either in other unexamined cis-acting sequences, i.e., phylogenetically conserved regions of the LCR, 5′ of the β-globin gene, 5′ of the δ-globin gene, or other 3′ sequences of the β-globin gene, or elsewhere in the genome, which may involve a gene encoding a transcription factor that regulates the function of the β-globin gene [27,30]. With the recent success of the Human Genome Project, it is anticipated that more genetic modifiers and environmental factors will be discovered that can help in understanding the variations in phenotypes of β-thalassemia.

ACKNOWLEDGMENTS

The authors thank Prof. Samir K. Brahmachari and Dr. Mitali Mukherjee for scientific help and Ms. R. Jaya, Ms. Sakshi, and Ms. Ruchi for technical support. Financial support from the Department of Biotechnology, Government of India, in the Programme on Functional Genomics to S.K.B. is duly acknowledged.

REFERENCES

1. Weatherall DJ, Clegg JB. Thalassemia - a global public health problem. Nat Med 1996;2:847–849.
2. Bandyopadhyay A, Bandyopadhyay S, Chowdhury MD, Dasgupta UB. Major β-globin gene mutations in Eastern India and their associated haplotypes. Hum Hered 1999;49:232–235.
3. Varawalla NY, Old JM, Sarkar R, Venkatesan R, Weatherall DJ. The spectrum of β-thalassemia mutation on the Indian subcontinent: the basis for prenatal diagnosis. Br J Haematol 1991;78:242–247.
4. Thapar R. A history of India. Vol 1. Harmondsworth, England: Penguin; 1996.
5. Orkin SH, Kazazian HH Jr, Antonarakis SE, Goff SC, Boehm CD, Sexton JP, Waber PG, Giardina PJ. Linkage of β-thalassemia mutations and β-globin gene polymorphisms with DNA polymorphisms in human β-globin gene cluster. Nature 1982;296:627–631.
6. Pasceri P, Pannell D, Wu X, Ellis J. Full activity from human β-globin locus control region transgenes requires 5′HS1, distal β-globin promoter, and 3′ β-globin sequences. Blood 1998;82:853–862.
7. Lu Z-H, Steinberg MH. Fetal hemoglobin in sickle cell anemia: relation to regulatory sequences cis to the β-globin gene. Blood 1996;87:1604–1611.
8. Old JM, Varawalla NY, Weatherall DJ. The rapid detection and prenatal diagnosis of β-thalassemia in the Asian Indian and Cypriot populations in the UK. Lancet 1990;336:834–837.
9. De M, Chakraborty G, Das SK, Bhattacharya DK, Talukder G. Molecular studies of haemoglobin E in tribal populations of Tripura. Lancet 1997;349:1297.
10. Dimovski AJ, Adekile AD, Divoky V, Baysal E, Huisman THJ. Polymorphic pattern of the (AT)x(T)y motif at −530 5′ to the β-globin gene in over 40 patients homozygous for various β-thalassemia mutations. Am J Hematol 1994;45:51–57.
11. Fortina P, Delgrosso K, Sakazume T, Santacroce R, Moutereau S, Su HJ, Graves D, McKenzie S, Surrey S. Simple two-color array-based approach for mutation detection. Eur J Hum Genet 2000;8:884–894.
12. Tatu T, Thein SL. Automated genotyping for accurate assignment of the (AT)xNy(AT)z motif within the β-globin locus control region - hypersensitive site 2. Br J Haematol 2001;112:488–492.
13. Saleem Q, Choudhry S, Mukerji M, Bashyam L, Padma MV, Chakravarthy A, Maheshwari MC, Jain S, Brahmachari SK. Molecular analysis of autosomal dominant hereditary ataxias in the Indian population: high frequency of SCA2 and evidence for a common founder mutation. Hum Genet 2000;106:179–187.
14. Chui DHK, Waye JS. Hydrops fetalis caused by α-thalassemia: an emerging health care problem. Blood 1998;91:2213–2222.
15. Varawalla NY, Fitches AC, Old JM. Analysis of β-globin gene haplotypes in Asian Indians: origin and spread of β-thalassemia on the Indian subcontinent. Hum Genet 1992;90:443–449.
16. Samakoglu S, Philipsen S, Grosveld F, Luleci G, Bagci H. Nucleotide changes in the γ-globin promoter and the (AT)xNy(AT)z polymorphic sequence of LCRS-2 region associated with altered levels of Hb F. Eur J Hum Genet 1999;7:345–356.
17. Kukreti R, Rao CB, Das SK, De M, Talukder G, Vaz F, Verma IC, Brahmachari SK. Study of the single nucleotide polymorphism (SNP) at the palindromic sequence of hypersensitive site (HS) 4 of the human β-globin locus control region (LCR) in Indian population. Am J Hematol 2002;69:77–79.
18. Fullerton SM, Harding RM, Boyce AJ, Clegg JB. Molecular and population genetic analysis of allelic sequence diversity at the human β-globin locus. Proc Natl Acad Sci U S A 1994;91:1805–1809.
19. Harding RM, Fullerton SM, Griffiths RC, Bond J, Cox MJ, Schneider JA, Moulin DS, Clegg JB. Archaic African and Asian lineages in the genetic ancestry of modern humans. Am J Hum Genet 1997;60:772–789.
20. Venkatesan R, Sarkar R, Old JM. β-Thalassemia mutations and their linkage to β-haplotypes in Tamilnadu in Southern India. Clin Genet 1992;42:251–256.
21. Orkin SH. Regulation of globin gene expression in erythroid cells. Eur J Biochem 1995;23:271–281.
22. Drew LR, Tang DC, Berg PE, Rodgers GP. The role of trans-acting factors and DNA-bending in the silencing of human β-globin gene expression. Nucleic Acids Res 2000;28:2823–2830.
23. Elion J, Berg PE, Trabuchet G, Schechter AN, Krishnamoorthy R, Labie D. Is polymorphism 0.5 kb 5′ to the β-globin gene relevant to the βS gene expression? Blood 1989;74(Suppl 1):527a.
24. Perrin P, Bouhassa R, Mselli L, Garguier N, Nigon V-M, Bennani C, Labie D, Trabuchet G. Diversity of sequence haplotypes associated with β-thalassemia mutations in Algeria: implications for their origin. Gene 1998;213:169–177.
25. Ragusa A, Lombardo M, Beldjord C, Ruberto C, Lombardo T, Elion J, Nagel RL, Krishnamoorthy R. Genetic epidemiology of β-thalassemia in Sicily: do sequences 5′ to the Gγ gene and 5′ to the β gene interact to enhance HBF expression in β-thalassemia? Am J Hematol 1992;40:199–206.
26. Gasperini D, Perseu L, Melis MA, Maccioni L, Sollaino MC, Paglietti E, Cao A, Galanello R. Heterozygous β-thalassemia with thalassemia intermedia phenotype. Am J Hematol 1998;57:43–47.
27. Kotea N, Ramasawmy R, Lu CY, Fa NS, Gerard N, Beesoon S, Ducrocq R, Surrun SK, Nagel RL, Krishnamoorthy R. Spectrum of β-thalassemia mutations and their linkage to β-globin gene haplotypes in the Indo-Mauritians. Am J Hematol 2000;63:11–15.
28. Rund D, Oron-Karni V, Filon D, Goldfarb A, Rachmilewitz E, Oppenheim A. Genetic analysis of β-thalassemia intermedia in Israel: diversity of mechanisms and unpredictability of phenotype. Am J Hematol 1997;54:16–22.
29. Wells RD, Collier DA, Hanvey JC, Shimizu M, Wohlrab F. The chemistry and biology of unusual DNA structures adopted by oligopurine–oligopyrimidine sequences. FASEB J 1988;2:2939–2949.
30. Onishi Y, Kiyama R. Enhancer activity of HS2 of the human β-LCR is modulated by distance from the key nucleosome. Nucleic Acids Res 2001;29:3448–3457.