* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Embryo Genome Profiling by Single-Cell
Non-coding DNA wikipedia , lookup
Population genetics wikipedia , lookup
Cell-free fetal DNA wikipedia , lookup
DNA sequencing wikipedia , lookup
Genetic engineering wikipedia , lookup
SNP genotyping wikipedia , lookup
Medical genetics wikipedia , lookup
Pharmacogenomics wikipedia , lookup
Dominance (genetics) wikipedia , lookup
No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup
Oncogenomics wikipedia , lookup
History of genetic engineering wikipedia , lookup
Hardy–Weinberg principle wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Human genetic variation wikipedia , lookup
Molecular Inversion Probe wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Genomic imprinting wikipedia , lookup
Minimal genome wikipedia , lookup
Public health genomics wikipedia , lookup
Microevolution wikipedia , lookup
Genome (book) wikipedia , lookup
Human leukocyte antigen wikipedia , lookup
Human genome wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Pathogenomics wikipedia , lookup
Metagenomics wikipedia , lookup
Genome editing wikipedia , lookup
Genomic library wikipedia , lookup
Genome evolution wikipedia , lookup
Human Genome Project wikipedia , lookup
Exome sequencing wikipedia , lookup
Whole genome sequencing wikipedia , lookup
Papers in Press. Published February 26, 2015 as doi:10.1373/clinchem.2014.228569 The latest version is at http://hwmaint.clinchem.org/cgi/doi/10.1373/clinchem.2014.228569 Clinical Chemistry 61:4 000 – 000 (2015) Molecular Diagnostics and Genetics Embryo Genome Profiling by Single-Cell Sequencing for Preimplantation Genetic Diagnosis in a -Thalassemia Family Yanwen Xu,1,2† Shengpei Chen,3,4,5,6† Xuyang Yin,3,4,5† Xiaoting Shen,1,2† Xiaoyu Pan,3,4,5,7† Fang Chen,3,4,5,8 Hui Jiang,3,4,5,9 Yu Liang,3 Wei Wang,3,4 Xun Xu,3 Jian Wang,3 Xiuqing Zhang,3,5* Canquan Zhou,1,2* and Jun Wang3,10,11,12* BACKGROUND: The embryonic genome, including genotypes and haplotypes, contains all the information for preimplantation genetic diagnosis, representing great potential for mendelian disorder carriers to conceive healthy babies. METHODS: We developed a strategy to obtain the full embryonic genome for a -thalassemia– carrier couple to have a healthy second baby. We carried out sequencing for single blastomere cells and the family trio and further developed the analysis pipeline, including recovery of the missing alleles, removal of the majority of errors, and phasing of the embryonic genome. RESULTS: The final accuracy for homozygous and heterozygous single-nucleotide polymorphisms reached 99.62% and 98.39%, respectively. The aneuploidies of embryos were detected as well. Based on the comprehensive embryonic genome, we effectively performed wholegenome mendelian disorder diagnosis and human leukocyte antigen matching tests. This retrospective study in a -thalassemia family demonstrates a method for embryo genome recovery through single-cell sequencing, which permits detection of genetic variations in preimplantation genetic diagnosis. It shows the potential of single-cell sequencing technology in preimplantation genetic diagnosis clinical practices. CONCLUSIONS: © 2015 American Association for Clinical Chemistry 1 First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; 2 Guangdong Provincial Key Laboratory of Reproductive Medicine, Guangdong, China; 3 BGI-Shenzhen, Shenzhen, China; 4 Shenzhen Municipal Key Laboratory of Birth Defects Screening and Engineering, Shenzhen, China; 5 Guangdong Provincial Key Laboratory of Human Diseases Genome, Guangdong, China; 6 State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; 7 School of Bioscience and Bioengineering, South China University of Technology, Guangzhou, China; 8 Section of Molecular Disease Biology, Department of Veterinary Disease Biology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; 9 Department of Biology, University of Copenhagen, Copenhagen, Denmark; 10 Department of Biology, University of Copenhagen, Copenhagen, Denmark; 11 King Abdulaziz University, Jeddah, Saudi Arabia; 12 The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen. † Yanwen Xu, Shengpei Chen, Xuyang Yin, Xiaoting Shen, and Xiaoyu Pan contributed equally to this work and all should be considered as first authors. In the past 2 decades, in vitro fertilization (IVF)13 with preimplantation genetic diagnosis (PGD) has played a key role in reproductive clinics. Couples carrying chromosomal abnormalities or mendelian disorders would be advised to choose IVF with PGD (1, 2 ), especially those with an affected child. The affected child provides important information for PGD, revealing mutations related to disease as pathogenic variants for mendelian disease diagnosis (3 ). During the IVF process, genetic materials can be obtained from embryos at 3 stages: polar body biopsy (4 ), cleavage biopsy at day 3 (5 ), and blastocyst biopsy at day 5/6 (6 ). Cleavage biopsy for a single blastomere is the most popular approach for PGD indications. -Thalassemia (OMIM 613985) is a group of autosomal recessive monogenic disorders caused by mutations in the -globin genes (7 ). It is currently the most common group of monogenic disorders. To date, the available definitive cure for -thalassemia major is hematopoietic stem cell transplantation from an HLAidentical donor. The HLA system is the name of the MHC in humans. It contains a large number of genes that encode antigen-presenting proteins that play a major role in immune system function. PGD of single-gene disorders combined with HLA matching has emerged as a tool for couples at risk of transmitting a genetic disease to select unaffected embryos of an HLA tissue type compatible with that of an existing affected child (8 ). At delivery, hematopoietic stem cells from the newborn umbilical cord blood can be used to treat the affected sibling. * Address correspondence to: X.Z. at: •••. E-mail [email protected]. C.N. at: First Affiliated Hospital of Sun Yat-sen University, 58 Zhongshan Rd. II Guangzhou, Gu, China 510080. Fax 86-20-87750632; e-mail [email protected]. J.W. at: •••. E-mail [email protected]. Received June 6, 2014; accepted January 13, 2015. Previously published online at DOI: 10.1373/clinchem.2014.228569 13 Nonstandard abbreviations: IVF, in vitro fertilization; PGD, preimplantation genetic diagnosis; WGA, whole-genome amplification; ADO, allele dropout; PA, preferential amplification; MDA, multiple displacement amplification; SNP, single-nucleotide polymorphism; NGS, next-generation sequencing; WGS, whole-genome sequencing; AF, amniotic fluid; MAR, missing allele rate; HMM, hidden Markov model; CR, copy ratio; ChrX, X chromosome. 1 Copyright (C) 2015 by The American Association for Clinical Chemistry During the PGD procedure, aneuploidy testing for preimplantation embryos is necessary owing to the high incidence of chromosomal abnormalities in embryos cultured in vitro. The sex information of embryos can be an indication for PGD as well. Hence, genome-wide profiling of embryos with diagnosis of single gene disorders, HLA matching, sex, and aneuploidy provides an approach to PGD for mendelian disorder carriers, such as the carriers of -thalassemia. Traditionally, multiplex PCR has been used to detect the pathogenic variants of an embryo with short tandem repeat markers in close proximity to the causative gene as a diagnosis backup. With the introduction of whole-genome amplification (WGA) to amplify the biopsied embryonic single cell, the complete embryonic genome can be obtained to facilitate the diagnosis of genetic variants in embryos. However, the allele dropout (ADO) and preferential amplification (PA) in WGA still restrict the diagnostic accuracy of WGA-based PGD. The ADO in multiple displacement amplification (MDA) is reported to range from 10.0% to 38.9% (9 ). Karyomapping based on single-nucleotide polymorphism (SNP) array has been reported to overcome the ADO obstacles in single-cell analysis (10 ). For the embryonic aneuploidy testing, detection of chromosomal abnormalities based on array comparative genomic hybridization (11 ), SNP array (12 ), or next-generation sequencing (NGS) (13–17 ) in biopsied embryonic cells have been published. In this study, genome profiling of human embryos using single blastomere whole-genome sequencing (WGS) was assessed in a -thalassemia family. The ADO in WGA was addressed and the haplotype of the embryo was established. Materials and Methods SAMPLE RECRUITMENT We recruited a couple who had a daughter suffering from -thalassemia. This couple with a maternal age of 37 years underwent IVF with PGD to obtain a healthy baby. Single-cell biopsy was performed on day 3 after fertilization and amplified using the MDA approach. Seven blastomeres were biopsied. A single cell was aspirated with a sampling micropipette going through the split in the zona pellucida. After biopsy, embryos were cultured in blastocyst medium (SAGE BioPharma) until vitrification. PCR–reverse dot blot analysis was used to diagnose -thalassemia, and short tandem repeat linkage analysis was used for HLA matching. The pregnant mother received amniocentesis for prenatal diagnosis of thalassemia at 16 weeks of gestation. Both the PGD and prenatal diagnosis indicated a female embryo that was negative for -thalassemia (not a carrier of or affected with -thalassemia) and that the HLA types matched those of the affected child. A healthy female baby was 2 Clinical Chemistry 61:4 (2015) born at 38 weeks of gestation in July 2012. We collected a portion of the WGA production of the 7 blastomeres with the amniotic fluid (AF) of the corresponding IVF fetus, as well as 5 mL of whole blood from this couple and the proband daughter. Each participant provided informed consent under the protocol approved by the Ethics Committee of the first affiliated hospital of Sun Yatsen University. WGA, LIBRARY CONSTRUCTION, AND SEQUENCING Each blastomere cell was transferred into a PCR tube containing 3.5 L of PBS. The wash buffer during cell sorting was used as the negative control. MDA was performed using the REPLI-g Midi kit (Qiagen) according to the instructions. The amplified DNA was stored at ⫺20 °C for further processing. Approximately 1 g MDA product from each blastomere was used for library construction and sequencing. For the transferred blastomere, we performed WGS using a pair-end strategy of 90 bp with 30-fold depth. The peripheral blood samples of parents and proband daughter as well as the AF were used for DNA extraction, library construction, All-in-One target region capture, and sequencing. The MDA products of the other 6 blastomeres nontransferred were also used for All-in-One target region sequencing. We extracted 1 g DNA from each blood sample and AF, and constructed libraries with 200-bp insert sizes. Libraries were used for All-in-One target capture with a NimbleGen EZ sequence capture array containing 181.37-Mb target regions of the whole exome, one million tag-SNPs, the whole MHC region, and several mendelian disorder– related genes. After chip hybridization and elution, the enriched DNA libraries were quality controlled. We performed NGS for the qualified libraries with pair-end 90-bp reads. The sequence reads of this project have been deposited at NCBI SRA (SRA065291). ALIGNMENT, SNP/INDEL CALLING The reads with low quality and adapter contamination were filtered out before aligning. The cleaned reads were mapped to the reference genome (Hg19, NCBI release GRCh37) using Burrows-Wheeler alignment [BWA (http://bio-bwa.sourceforge.net/)] (18 ) with the default parameters. We removed PCR duplications using SAMtools (19 ). Afterward, we recalibrated the base qualities and performed the SNP/Indel calling using the Genome Analysis Toolkit (http://www.broadinstitute.org/ gatk/) (20 ) (indels are insertions and deletion in a genome with a length of approximately 15 nucleotides). ORIGINAL GENOTYPE CALLING We performed the original genotype calling on the candidate sites, where one of the parents carried an SNP. The ADO and PA of WGA were evaluated as well. Tradition- Embryo Genome Profiling by NGS for PGD ally, ADO is defined as the random amplification failure of 1 of the 2 heterozygous alleles, whereas PA is defined as a bias that favors the amplification of one allele over the other, leading to failure to reach the threshold of detection (21 ). For better understanding, herein we denoted the missing allele rate (MAR) as the overall percentage of ADO and PA. In this section, the expected homozygous (parents carried same homozygous SNP) and heterozygous (parents carried different homozygous SNP) would be used to estimate the sequence error rate and MAR respectively. The L statistic was defined as the odds ratio between heterozygous and homozygous genotypes: L ⫽ log10 Pr共genotype ⫽ AB兲 , max[Pr共genotype ⫽ AA兲, Pr共genotype ⫽ BB兲] where Pr is the probability of each genotype. Pr was calculated through binomial distribution (Bin) in these formulas: Pr(genotype ⫽ AB) ⫽ Bin(N,0.5), Pr(genotype ⫽ AA) ⫽ Bin共N,1 ⫺ e兲, and Pr(genotype ⫽ BB) ⫽ Bin共N,e兲, where the sequence error (e) was estimated as the mismatch rate at expected homozygous sites and N represents the number of expected recombinations and SNPs. EMBRYO HAPLOTYPE PHASING We used heterozygous callings of original genotypes to predict the parental transmission through a hidden Markov model (HMM). An HMM can be considered a generalization of a mixture model, for which the hidden variables, which control the mixture component to be selected for each observation, are related through a Markov process. During this process, the maternal and paternal transmissions were predicted separately: (a) The hidden states of the model, S ⫽ {0,1}, defined which parental allele was transmitted to the embryo. In particular, 0 represented the allele transmitted to the proband sister. (b) The distinct observation symbols per state, V ⫽ {0,1}, expressed the consistency between the transmitted allele and the draft genotype; 1 stood for consistency between the transmitted allele and the draft genotype and 0 stood for inconsistency. (c) The initial state distribution ⫽ {i},(i ⑀ S, i ⫽ 0.5) was the given prior probability of the different hidden state at the first site. (d) The state transition probability distribution, A ⫽ {aij},(i,j ⑀ S), represented the probability of recombination or not, which could be calculated according to former studies (22 ). If i ⫽ j, aij ⫽ 1 ⫺ Nr/Np; otherwise, aij ⫽ Nr/Np, where Nr and Np represented the number of expected recombinations and SNPs. (e) The observation symbols probability distribution in each state, B ⫽ {bi(k)},(i ⑀ S, k ⑀ V ), were estimated using the expected homozygous (Must-hom.) and expected heterozygous (Must-het.) sites: b i (0) ⫽ no. sites (L ⬎ 0,Must-hom.) , no. sites (L ⬎ 0,Must-het. or Must-hom.) b i (0) ⫽ 1 ⫺ b i (0). Finally, we deciphered the optimized hidden states using HMM and the Viterbi algorithm. ANEUPLOIDY DETECTION Aneuploidies of the blastomeres were analyzed according to the depth of the heterozygous SNP alleles. The signals were filtered with 2 criteria: with size larger than 30 Mb and with ratios between the depth of the 2 alleles ⬎2 or ⬍0.5. After filtering, the signals with a copy ratio (CR) ⬍0.75 were considered to be deletions, and the signals with CR ⬎1.25 were considered to be duplications. SEX DETERMINATION Ideally, the CR of the sex chromosomes can reveal the sex of the embryo. The CR was defined as the ratio between the observed and expected reads: CRc ⫽ R c /R total , L c /L total where subscript c is the ID and R and L represent the number of mapped reads and chromosomal length, respectively. In the case of heterozygosity examination, we evaluated the difference in the L distribution on the X chromosome (ChrX) between expected heterozygous and homozygous sites by the rank-sum test. Results NEXT-GENERATION SEQUENCING The couple received IVF treatment with a PGD test for both -thalassemia and HLA. A single-blastomere WGA of the transferred embryo was collected for WGS (PEDE-A). A total of 118.51 Gbp data was generated using NGS, which covered 96.8% of the reference genome [HG19, NCBI (National Center for Biotechnology Information) built 37] to 38.25⫻ depth. To construct the parental haplotype, we performed All-in-One target sequencing using genomic DNA extracted from the peripheral blood of parents (PED-Mother & PED-Father) and the proband daughter (PED-Sister) as well as WGA products from the 7 blastomeres, including the transferred embryo (PED-E-B) and the other 6 embryos nontransferred (PED-E-1,2,4,5,6,7). In addition, amniotic fluid of the corresponding IVF baby (PED-AF) was collected to evaluate the accuracy of method through All-inOne target sequencing (Fig. 1A). The summary of data production for NGS is displayed in Table 1 in the Data Supplement that accompanies the online version of Clinical Chemistry 61:4 (2015) 3 Fig. 1. Sample design, sequence strategy, bioinformatics pipeline, and general performance. (A), Sequence strategy and bioinformatics pipeline. (B and C), The detailed statistics of the performance on autosomes and sex chromosomes. (D), The MAR statistics before and after embryo haplotype phasing. this article at http://www.clinchem.org/content/vol61/ issue4. WGA BIAS ASSESSMENT AND ORIGINAL SNP PROFILING According to the AF genotypes on autosomes, we found that 34 004 of 158 643 embryo heterozygous SNPs had dropped at least 1 allele, corresponding to an ADO rate of 21.43% (Fig. 2A). Approximately 56% of the ADO occurred in the paternal allele on the autosomes (Fig. 2B). As was reported previously (23, 24 ), we observed a GC content–related bias, namely the ADO rate was increased in GC-poor (⬍30%) and GC-rich (⬎30%) genomic regions (see online Supplemental Fig. 1). To eliminate this bias, we performed original genotype calling of the embryo genome using the parameter L to determine the original genotype (see online Supplemental Fig. 2). It indicated that L could isolate heterozygous sites from homozygote and ADO backgrounds. Besides the expected homozygous and heterozygous sites, 330 060 original genotypes were determined successfully. According to the AF genotype on autosomes, 75.39% 4 Clinical Chemistry 61:4 (2015) and 99.32% of the homozygous and heterozygous genotypes were accurately predicted (Fig. 1B). Regarding ChrX, 71.36% and 99.66% of the homozygous and heterozygous genotypes were accurately determined (Fig. 1C). This indicated an MAR of 40.02% on autosomes and 37.99% on sex chromosomes. EMBRYO HAPLOTYPE PHASING We phased the embryonic autosomes with the parental haplotypes, which were inferred from the genotype of the proband daughter (Fig. 1A). We successfully phased 131 785 and 133 601 of the maternal and paternal heterozygous sites, including mutations in the hemoglobin, beta (HBB)14 gene. Subsequently, we used an HMM to determine the parental transmission, identifying re- 14 Human genes: HBB, hemoglobin, beta; HLA-DRB1, SlC25A22, solute carrier family 25 (mitochondrial carrier: glutamate), member 22; INS, insulin; SMPD1, sphingomyelin phosphodiesterase 1, acid lysosomal; TH, tyrosine hydroxylase. Embryo Genome Profiling by NGS for PGD Fig. 2. Basic statistics and chromosomal contribution of ADO. (A), According to the AF genotypes, approximately 21% of the heterozygotes dropped out at least 1 allele, and approximately 56% of these ADO occur on the paternal allele. (B), The chromosomal contribution of each ADO category (green, paternal ADO; orange, maternal ADO; red, both allele dropped ADO; grey, uncertain). combination breakpoints and filtering out sequencing error at the same time. The approach led to successful phasing of 98.57% of the autosomal SNPs in the embryonic genome. According to the AF genotype, the accuracy of homozygous callings was significantly improved from 75.39% to 99.73%. Using the haplotype information, 62 110 homozygous callings were identified as PA or ADO and corrected back to heterozygous with an accuracy of 95.44% (Fig. 1B), making the overall accuracy of the heterozygous callings 97.85%. Also, the autosomal MAR was markedly decreased from 40.02% to 0.50% (Fig. 1D). We further performed a simulation by subsampling of PED-E-A to characterize how the accuracy of the embryo genotype depends on sequencing depth (see online Supplemental Fig. 3). As little as 25% of the WGS data was able to decrease the MAR to ⬍1% (mean depth ⫽ 9.5⫻). Concomitantly, the accuracy of the final genotypes stabilized above 97%. This indicated that our model was robust for lower depth single-cell sequencing. The accuracy of homologous recombination site pinpointing using our method was demonstrated. We compared the recombination breakpoints in parental alleles between amniotic fluid PED-AF and the embryo Clinical Chemistry 61:4 (2015) 5 Table 1. Aneuploidies detection for the 7 embryos by NGS and SNP array. Embryos NGS results SNP array results PED-E-A 46,XX 46,XX PED-E-1 46,XY 46,XY PED-E-2 46,XX 46,XX PED-E-4 41,XY, −1(M), −6(P), −12(P), −14(M), −15(P) 41,XY, −1, −6, −12, −14, −15 PED-E-5 47,XX, +7(P), +9(M), −11(P) 47,XX, +7, +9, −11 PED-E-6 44,XX, −12(P), −19(P) 44,XX, −12, −19 PED-E-7 45,XX, +6(P), −8(P), −20(P) 45,XX, +6, −8, −20 M, maternal alleles originated; P, paternal alleles originated. transferred PED-E-A (see online Supplemental Fig. 4). We also evaluated the accuracy statistics of SNP calling close to the recombination breakpoints (see online Supplemental Fig. 5), which illustrated the reliability of the embryonic haplotype phased by our method. SEX DETECTION AND SEX CHROMOSOME ANALYSIS Genomic information on sex chromosomes is central to the diagnoses of mendelian diseases such as Duchenne muscular dystrophy (MIM number 310200). We performed analysis on sex determination and haplotype on sex chromosomes. The CR of sex chromosomes would have revealed the sex information (see online Supplemental Fig. 6). The relative ratios of ChrX and ChrY for the transferred embryo were 0.75 and 0.07, respectively. The lower ratio of ChrY indicated that the signals of ChrY might randomly originate from an experimental or analytical process. To reach an unequivocal conclusion, we determined the sex of the embryo by examining the heterozygosity of ChrX. Theoretically, the L distribution between expected heterozygous and homozygous sites would be similar for male embryos but different for female embryos. The rank–sum test (P value ⫽ 2.2 ⫻ 10⫺16) indicated a significant difference in the L distribution for the transferred embryo (PED-E-A), implying a female embryo. This finding was consistent with the birth. The L distribution between heterozygous and homozygous sites was significantly different for PED-E-2, PED-E-5, PED-E-6, PED-E-7, indicating the 4 embryos were female. The other 2 embryos PED-E-1 and PED-E-4 were determined as male with the same L distributions. The sex information of all 7 embryos was validated by SNP array (Table 1; also see Supplemental Fig. 8). We next determined the haplotype of ChrX for the female embryo transferred. Of the 4887 original genotypes, over 99% of the transmitted allele was successfully predicted. The general accuracies of the homozygous and heterozygous callings were 97.18% and 99.52%, respec6 Clinical Chemistry 61:4 (2015) tively (Fig. 1C). Meanwhile, the MAR was decreased from 39.73% to 0.12% (Fig. 1D). ANEUPLOIDY DETECTION Aneuploidies of the 7 blastomeres were detected using the depth ratio of the parental alleles in heterozygous sites. Supplemental Fig. 7 shows the sequencing depth of the parental alleles with CR ⬍0.75 (deletion) or CR ⬎1.25 (duplication) after filtering for partial chromosomes in the 7 embryos. The transferred embryo was confirmed as euploidy because the majority of filtered alleles were within 0.75 ⬍ CR ⬍ 1.25 in all the 22 autosomes and sex chromosomes. The aneuploidy results of all 7 embryos are listed in Table 1, and the NGS results were all validated by SNP array (see online Supplemental Fig. 8). The embryonic aneuploidies by NGS in this study were all consistent with the SNP array results. WHOLE-GENOME MENDELIAN DISORDER DIAGNOSIS AND HLA MATCHING The embryonic genome allowed us to perform a genomewide diagnosis of mendelian disorders, including 83 common mendelian disorders in the clinic involving 384 genes (Fig. 3A). The genotypes indicated that the transferred embryo was totally negative for these 83 disorders. Again, the results were confirmed by the healthy phenotypes after birth. A detailed diagnosis on -thalassemia and HLA matching was further performed. Mutations in the HBB gene cause -thalassemia. The All-in-One sequencing indicated that the proband sister carried a homozygous A3 C point mutation at rs7480526 (NM_000518.4:c.315 ⫹ 74T⬎G), a known pathogenic variant for -thalassemia (25 ) in the intron region of the HBB gene. Both parents carried a single copy of this mutation and exhibited a normal phenotype. We performed an HBB gene diagnosis based on the genotype and haplotype for all 7 embryos. The transferred embryo was homozygous A in rs7480526. Haplotypes of the proband sister were defined as Hap0/Hap0. The pre- Embryo Genome Profiling by NGS for PGD Fig. 3. Genome-wide clinical screening. (A), Information of the genome for the transferred embryo PED-E-A. The outer to inner circles formed represent the genome-wide mendelian disorder diagnosis, chromosomal structure, sequence depth, paternal allele transmission, and maternal allele transmission. (B), Details of the -thalassemia diagnosis for PED-E-A. (C), Details of the HLA-matching test for PED-E-A. SlC25A22, solute carrier family 25 (mitochondrial carrier: glutamate), member 22; INS, insulin; SMPD1, sphingomyelin phosphodiesterase 1, acid lysosomal; TH, tyrosine hydroxylase. dicted transmission showed that the haplotype was HapI/ HapI for PED-E-A at rs7480526 (Fig. 3B). We diagnosed this transferred embryo as -thalassemia negative (not a carrier of or affected with -thalassemia), which was consistent with the AF genotypes and diagnosis after birth. Genotype at rs7480526 and predicted transmission of the haplotype for all 7 embryos are shown in Table 2 and Supplemental Fig. 9. The genotype at rs7480526 was consistent with the predicted transmission of haplotype for all 7 embryos and was further validated by Sanger sequencing (Table 2; also see Supplemental Fig. 10). The genotypes of HLA genes (see online Supplemental Table 2) have been used before transplantation (26 ). For this family, the umbilical cord blood stem cells of the transferred embryo could be used to treat the proband sister’s thalassemia (27 ). We selected the heterozygous genotypes of all of the HLA genes (see online Sup- Table 2. The genotype at rs7480526, the haplotype in the HBB gene, and MHC regions for the 7 embryos. NGS Embryos Genotype at rs7480526 Sanger sequencing Haplotype in HBB gene Haplotype in MHC regions Genotype at rs7480526 Haplotype in HLA-DRB1 PED-Sister CC Hap0/Hap0 Hap0/Hap0 CC Hap0/Hap0 PED-E-A AA Hap1/Hap1 Hap0/Hap0 AA Hap0/Hap0 PED-E-1 AC Hap0/Hap1 Hap0/Hap1 AC Hap0/Hap1 PED-E-2 AC Hap0/Hap1 Hap1/Hap0 AC Hap1/Hap0 PED-E-4 AC Hap1/Hap0 Hap0/Hap0 AC Hap0/Hap0 PED-E-5 AA Hap1/Hap1 Hap0/Hap1 AA Hap0/Hap1 PED-E-6 AC Hap1/Hap0 Hap0/Hap0 AC Hap0/Hap0 PED-E-7 CC Hap0/Hap0 Hap1/Hap0 CC Hap1/Hap0 Clinical Chemistry 61:4 (2015) 7 plemental Table 2) in NGS data to predict the transmission haplotypes of each embryo. According to the predicted transmission, the haplotypes of the transferred embryo (PED-E-A) were Hap0/Hap0 in HLA genes (Fig. 3C), consistent with the proband sister. Thus, we determined that their HLA types were matched. The predicted transmission of haplotype in MHC regions for all 7 embryos is shown in Table 2 and online Supplemental Fig. 9. The haplotype in HLA genes predicted from NGS data were further validated by PCR with Sanger sequencing of an oligonucleotide sequence in the major histocompatibility complex, class II, DR beta 1 (HLA-DRB1) gene with both paternal and maternal heterozygous alleles (see online Supplemental Fig. 11). The haplotype of HLA-DRB1 in each embryo by Sanger sequencing was consistent with the result predicted from NGS. Discussion In a retrospective study, we demonstrated a model for retrieval of the embryo genome using single-cell sequencing assisted by parental haplotypes in a -thalassemia family. Over 97% of the embryo autosomal genotypes and 99% of the genotypes on ChrX were reconstructed accurately. In addition, the diagnosis of -thalassemia, HLA matching, sex, and aneuploidy were analyzed and proved to be accurate. This established the technical foundation of embryo genome recovery using single-cell sequencing in clinics. Although we successfully recovered the embryo SNPs and made an accurate diagnosis, there are 2 major directions for further development. First, the parental haplotype can be inferred using a simple trio strategy. We can infer the parental haplotypes using these “offspring” and phase these embryo genomes reciprocally. There may be several experimental approaches to obtain the parental haplotype directly, such as a fosmid library (28 ), long fragment read technology (29 ), and microfluidic technology (30 ). The single-cell sequencing of sperm or second polar body can help to obtain the parental haplotypes as well. Second, in this study we focused exclusively on SNPs and aneuploidies. However, clinical applications of embryonic genomes require information for other forms of variations, e.g., short indels and de novo mutations. Indels play an important role in mendelian disorders (31 ) and are potential markers of complex traits (32 ). In this study, we also tried to detect embryo indels using our strategy. The accuracy of autosomal homozygous and heterozygous callings of indels in embryo was improved from 69.00% and 83.10% to 89.14% and 88.13%, respectively (see online Supplemental Table 3). Because of the false positives in WGA and sequencing, we were unable to confirm de novo mutations. In traditional PGD, multiplex PCR and array-based technology have been used to identify pathogenic mu8 Clinical Chemistry 61:4 (2015) tations (33 ). However, multiplex PCR has a lower throughput and a limited number of known sites can be analyzed. Array-based technology provides a highthroughput approach such as aCGH (array comparative genomic hybridization) and SNP array, but it can identify only known common markers in human genome. Although karyomapping based on SNP array provides an option for the detection of the majority of mutations, the resolution and accuracy are restricted. NGS provides a high-throughput and single-nucleotide resolution technology to analyze the embryonic genome, enabling PGD. Compared to array-based karyomapping, NGS mainly shows technological advantages. The strength of NGS is in the detection of family-specific mutations or rare mutations, which would not be included in the probe targets of SNP array. A large number of pathogenic mutations would not be directly detected in most SNP array assays, such as the rs7480526 of the HBB gene in this study. Furthermore, detection of embryo indels was evaluated, indicating the feasibility of NGS for analysis in embryos of complicated variations that cannot be analyzed in SNP array assay. The cost of NGS with 30-fold coverage of the embryo genome would be about $1000. However, sequencing with lower depth is also possible because, in our model, the MAR decreased ⬍1% with a depth of 9.5⫻. The method theoretically can be modified so that a condensed target region for sequencing could be used and the cost reduced accordingly. The turnaround time of sequencing is about 20 working days, so that embryo vitrification is needed. Several studies have indicated that, with the introduction of vitrification, the use of NGS during pregnancy and the overall efficiency during IVF– PGD might improve dramatically (34 –36 ). Furthermore, with continuous technical improvements the costeffectiveness of sequencing is rapidly increasing as is the accuracy. In this study, we introduced an NGS-based method to access the embryo genome. Numerical chromosome anomalies, which often occur during human cleavagestage embryogenesis, have been accurately detected. The ability to recover an embryo genome at high resolution using single-cell sequencing should bring new insights into the diagnoses of mendelian disorders and personal genomic analysis for PGD. In the case of parental carriage of mendelian disorders, the embryo genome provides all of the necessary evidence for diagnosis. It paves the way for PGD toward personal medicine, such as diagnosis of allergies (37 ). The additional information available to parents will raise ethical questions, especially regarding the prediction of novel parental pathological mutations. Another major problem in testing embryos without a phenotype for diseases that are not in the family is in interpreting the variants. Variants of uncertain significance will often be discovered, and many of these vari- Embryo Genome Profiling by NGS for PGD ants are difficult to classify even in affected individuals. The variants of uncertain significance must be considered in embryonic personal genome interpretation during clinical practice. The pretest and posttest genetic counseling is also substantial and should be standardized. The ethical questions raised need to be thoroughly discussed within the scientific community and at a societal level. Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article. Authors’ Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest: Employment or Leadership: C. Zhou, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China. Consultant or Advisory Role: None declared. Stock Ownership: None declared. Honoraria: None declared. Research Funding: J. Wang, W. Wang, S. Chen, X. Yin, X. Pan, F. Chen and H. Jiang, this study was funded by the Laboratory of Shenzhen Birth Defect Screening Project (JZF no. [2011] 861 and JZF no. [2011] 862) and was approved by the Shenzhen Municipal Commission for Development and Reform and Key Laboratory Project in Shenzhen (CXB201108250096A) and Key Laboratory of Cooperation Project in Guangdong Province (2011A060906007). Expert Testimony: None declared. Patents: S. Chen, X. Yin, and X. Pan, PCT/CN2013/073375. Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, or final approval of manuscript. Acknowledgments: We thank Laurie Goodman and Huijue Jia for writing advice. We thank Yingrui Li, Xu Yang, and Fengping Xu for support and thank Chunlei Zhang, Xuchao Li, and Chun Gong for assistance in data analysis. References 1. Bisignano A, Wells D, Harton G, Munne S. PGD and aneuploidy screening for 24 chromosomes: advantages and disadvantages of competing platforms. Reprod Biomed Online 2011;23:677– 85. 2. Verlinsky Y, Cohen J, Munne S, Gianaroli L, Simpson JL, Ferraretti AP, Kuliev A. Over a decade of experience with preimplantation genetic diagnosis: a multicenter report. Fertil Steril 2004;82:292– 4. 3. Ng SB, Nickerson DA, Bamshad MJ, Shendure J. Massively parallel sequencing and rare disease. Hum Mol Genet 2010;19:R119 –24. 4. Verlinsky Y, Ginsberg N, Lifchez A, Valle J, Moise J, Strom CM. Analysis of the first polar body: preconception genetic diagnosis. Hum Reprod 1990;5:826 –9. 5. Handyside AH, Kontogianni EH, Hardy K, Winston RM. Pregnancies from biopsied human preimplantation embryos sexed by Y-specific DNA amplification. Nature 1990;344:768 –70. 6. McArthur SJ, Leigh D, Marshall JT, de Boer KA, Jansen RP. Pregnancies and live births after trophectoderm biopsy and preimplantation genetic testing of human blastocysts. Fertil Steril 2005;84:1628 –36. 7. Olivieri NF. The beta-thalassemias. N Engl J Med 1999; 341:99 –109. 8. Milachich T, Timeva T, Ekmekci C, Beyazyurek C, Tac HA, Shterev A, Kahraman S. Birth of a healthy infant after preimplantation genetic diagnosis by sequential blastomere and trophectoderm biopsy for -thalassemia and HLA genotyping. Eur J Obstet Gynecol Reprod Biol 2013;169:261–7. 9. Zheng YM, Wang N, Li L, Jin F. Whole genome amplification in preimplantation genetic diagnosis. J Zhejiang Univ Sci B 2011;12:1–11. 10. Handyside AH, Harton GL, Mariani B, Thornhill AR, Affara N, Shaw MA, Griffin DK. Karyomapping: a universal method for genome wide analysis of genetic disease based on mapping crossovers between parental haplotypes. J Med Genet 2010;47:651– 8. 11. Fragouli E, Alfarawati S, Daphnis DD, Goodall NN, Mania A, Griffiths T, et al. Cytogenetic analysis of human blastocysts with the use of FISH, CGH and aCGH: scientific data and technical evaluation. Hum Reprod 2011; 26:480 –90. 12. Treff NR, Su J, Tao X, Levy B, Scott RT Jr. Accurate single cell 24 chromosome aneuploidy screening using whole genome amplification and single nucleotide polymorphism microarrays. Fertil Steril 2010;94: 2017–21. 13. Yin X, Tan K, Vajta G, Jiang H, Tan Y, Zhang C, et al. Massively parallel sequencing for chromosomal abnormality testing in trophectoderm cells of human blastocysts. Biol Reprod 2013;88:69. 14. Zhang C, Chen S, Yin X, Pan X, Lin G, Tan Y, et al. A single cell level based method for copy number variation analysis by low coverage massively parallel sequencing. PLoS One 2013;8:e54236. 15. Fiorentino F, Biricik A, Bono S, Spizzichino L, Cotroneo E, Cottone G, et al. Development and validation of a next-generation sequencing-based protocol for 24chromosome aneuploidy screening of embryos. Fertil Steril 2014;101:1375– 82. 16. Wang L, Wang X, Zhang J, Song Z, Wang S, Gao Y, et al. Detection of chromosomal aneuploidy in human preimplantation embryos by next-generation sequencing. Biol Reprod 2014;90:95. 17. Wells D, Kaur K, Grifo J, Glassner M, Taylor JC, Fragouli E, Munne S. Clinical utilisation of a rapid low-pass whole genome sequencing technique for the diagnosis of aneuploidy in human embryos prior to implantation. J Med Genet 2014;51:553– 62. 18. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009; 25:1754 – 60. 19. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009;25:2078 –9. 20. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010;20:1297– 303. 21. Findlay I, Ray P, Quirke P, Rutherford A, Lilford R. Allelic drop-out and preferential amplification in single cells and human blastomeres: implications for preimplantation diagnosis of sex and cystic fibrosis. Hum Reprod 1995;10:1609 –18. 22. Kirkness EF, Grindberg RV, Yee-Greenbaum J, Marshall CR, Scherer SW, Lasken RS, Venter JC. Sequencing of isolated sperm cells for direct haplotyping of a human genome. Genome Res 2013;23:826 –32. 23. Xu X, Hou Y, Yin X, Bao L, Tang A, Song L, et al. Singlecell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 2012;148: 886 –95. 24. Hou Y, Song L, Zhu P, Zhang B, Tao Y, Xu X, et al. Singlecell exome sequencing and monoclonal evolution of a JAK2-negative myeloproliferative neoplasm. Cell 2012;148:873– 85. 25. Bilgen T, Arikan Y, Canatan D, Yesilipek A, Keser I. The association between intragenic SNP haplotypes and mutations of the beta globin gene in a Turkish population. Blood Cells Mol Dis 2011;46:226 –9. 26. Sheldon S, Poulton K. HLA typing and its influence on organ transplantation. Methods Mol Biol 2006;333: 157–74. 27. Lucarelli G, Isgro A, Sodani P, Gaziev J. Hematopoietic stem cell transplantation in thalassemia and sickle cell anemia. Cold Spring Harb Perspect Med 2012;2: a011825. 28. Kitzman JO, Mackenzie AP, Adey A, Hiatt JB, Patwardhan RP, Sudmant PH, et al. Haplotype-resolved genome sequencing of a Gujarati Indian individual. Nat Biotechnol 2011;29:59 – 63. 29. Peters BA, Kermani BG, Sparks AB, Alferov O, Hong P, Alexeev A, et al. Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells. Nature 2012;487:190 –5. 30. Fan HC, Wang J, Potanina A, Quake SR. Whole-genome molecular haplotyping of single cells. Nat Biotechnol 2011;29:51–7. 31. Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, et al. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet 2010;42:30 –5. 32. Vali U, Brandstrom M, Johansson M, Ellegren H. Insertion-deletion polymorphisms (indels) as genetic markers in natural populations. BMC Genet 2008;9:8. 33. Harper JC, Sengupta SB. Preimplantation genetic diagnosis: state of the art 2011. Hum Genet 2012;131: 175– 86. Clinical Chemistry 61:4 (2015) 9 34. Roque M, Lattes K, Serra S, Sola I, Geber S, Carreras R, Checa MA. Fresh embryo transfer versus frozen embryo transfer in in vitro fertilization cycles: a systematic review and meta-analysis. Fertil Steril 2013; 99:156 – 62. 35. Shapiro BS, Daneshmand ST, De Leon L, Garner FC, 10 Clinical Chemistry 61:4 (2015) Aguirre M, Hudson C. Frozen-thawed embryo transfer is associated with a significantly reduced incidence of ectopic pregnancy. Fertil Steril 2012;98:1490 – 4. 36. Lathi RB, Massie JA, Gilani M, Milki AA, Westphal LM, Baker VL, Behr B. Outcomes of trophectoderm biopsy on cryopreserved blastocysts: a case series. Reprod Biomed Online 2012;25:504 –7. 37. Negoro T, Orihara K, Irahara T, Nishiyama H, Hagiwara K, Nishida R, et al. Influence of SNPs in cytokine-related genes on the severity of food allergy and atopic eczema in children. Pediatr Allergy Immunol 2006;17:583–90.