Download Embryo Genome Profiling by Single-Cell

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Non-coding DNA wikipedia , lookup

Population genetics wikipedia , lookup

Polyploid wikipedia , lookup

Cell-free fetal DNA wikipedia , lookup

DNA sequencing wikipedia , lookup

Genetic engineering wikipedia , lookup

SNP genotyping wikipedia , lookup

Medical genetics wikipedia , lookup

Pharmacogenomics wikipedia , lookup

Dominance (genetics) wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Oncogenomics wikipedia , lookup

History of genetic engineering wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Bisulfite sequencing wikipedia , lookup

Human genetic variation wikipedia , lookup

Molecular Inversion Probe wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Genomic imprinting wikipedia , lookup

Minimal genome wikipedia , lookup

Public health genomics wikipedia , lookup

Microevolution wikipedia , lookup

Genome (book) wikipedia , lookup

Human leukocyte antigen wikipedia , lookup

Human genome wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Pathogenomics wikipedia , lookup

Metagenomics wikipedia , lookup

Genome editing wikipedia , lookup

Genomic library wikipedia , lookup

Tag SNP wikipedia , lookup

Genome evolution wikipedia , lookup

RNA-Seq wikipedia , lookup

Human Genome Project wikipedia , lookup

Exome sequencing wikipedia , lookup

Genomics wikipedia , lookup

Whole genome sequencing wikipedia , lookup

Preimplantation genetic diagnosis wikipedia , lookup

Designer baby wikipedia , lookup

Transcript
Papers in Press. Published February 26, 2015 as doi:10.1373/clinchem.2014.228569
The latest version is at http://hwmaint.clinchem.org/cgi/doi/10.1373/clinchem.2014.228569
Clinical Chemistry 61:4
000 – 000 (2015)
Molecular Diagnostics and Genetics
Embryo Genome Profiling by Single-Cell Sequencing
for Preimplantation Genetic Diagnosis in a
␤-Thalassemia Family
Yanwen Xu,1,2† Shengpei Chen,3,4,5,6† Xuyang Yin,3,4,5† Xiaoting Shen,1,2† Xiaoyu Pan,3,4,5,7†
Fang Chen,3,4,5,8 Hui Jiang,3,4,5,9 Yu Liang,3 Wei Wang,3,4 Xun Xu,3 Jian Wang,3 Xiuqing Zhang,3,5*
Canquan Zhou,1,2* and Jun Wang3,10,11,12*
BACKGROUND:
The embryonic genome, including genotypes and haplotypes, contains all the information for preimplantation genetic diagnosis, representing great potential
for mendelian disorder carriers to conceive healthy babies.
METHODS: We developed a strategy to obtain the full
embryonic genome for a ␤-thalassemia– carrier couple to
have a healthy second baby. We carried out sequencing
for single blastomere cells and the family trio and further
developed the analysis pipeline, including recovery of the
missing alleles, removal of the majority of errors, and
phasing of the embryonic genome.
RESULTS: The final accuracy for homozygous and
heterozygous single-nucleotide polymorphisms reached
99.62% and 98.39%, respectively. The aneuploidies of
embryos were detected as well. Based on the comprehensive embryonic genome, we effectively performed wholegenome mendelian disorder diagnosis and human leukocyte antigen matching tests.
This retrospective study in a ␤-thalassemia
family demonstrates a method for embryo genome recovery
through single-cell sequencing, which permits detection of
genetic variations in preimplantation genetic diagnosis. It
shows the potential of single-cell sequencing technology in
preimplantation genetic diagnosis clinical practices.
CONCLUSIONS:
© 2015 American Association for Clinical Chemistry
1
First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; 2 Guangdong Provincial Key Laboratory of Reproductive Medicine, Guangdong, China; 3 BGI-Shenzhen,
Shenzhen, China; 4 Shenzhen Municipal Key Laboratory of Birth Defects Screening and
Engineering, Shenzhen, China; 5 Guangdong Provincial Key Laboratory of Human Diseases Genome, Guangdong, China; 6 State Key Laboratory of Bioelectronics, School of
Biological Science and Medical Engineering, Southeast University, Nanjing, China;
7
School of Bioscience and Bioengineering, South China University of Technology,
Guangzhou, China; 8 Section of Molecular Disease Biology, Department of Veterinary
Disease Biology, Faculty of Health and Medical Sciences, University of Copenhagen,
Copenhagen, Denmark; 9 Department of Biology, University of Copenhagen, Copenhagen, Denmark; 10 Department of Biology, University of Copenhagen, Copenhagen, Denmark; 11 King Abdulaziz University, Jeddah, Saudi Arabia; 12 The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen.
†
Yanwen Xu, Shengpei Chen, Xuyang Yin, Xiaoting Shen, and Xiaoyu Pan contributed
equally to this work and all should be considered as first authors.
In the past 2 decades, in vitro fertilization (IVF)13 with
preimplantation genetic diagnosis (PGD) has played a
key role in reproductive clinics. Couples carrying chromosomal abnormalities or mendelian disorders would be
advised to choose IVF with PGD (1, 2 ), especially those
with an affected child. The affected child provides important information for PGD, revealing mutations related to
disease as pathogenic variants for mendelian disease diagnosis (3 ). During the IVF process, genetic materials can
be obtained from embryos at 3 stages: polar body biopsy
(4 ), cleavage biopsy at day 3 (5 ), and blastocyst biopsy at
day 5/6 (6 ). Cleavage biopsy for a single blastomere is the
most popular approach for PGD indications.
␤-Thalassemia (OMIM 613985) is a group of autosomal recessive monogenic disorders caused by mutations in the ␤-globin genes (7 ). It is currently the most
common group of monogenic disorders. To date, the
available definitive cure for ␤-thalassemia major is hematopoietic stem cell transplantation from an HLAidentical donor. The HLA system is the name of the
MHC in humans. It contains a large number of genes
that encode antigen-presenting proteins that play a major
role in immune system function. PGD of single-gene
disorders combined with HLA matching has emerged as
a tool for couples at risk of transmitting a genetic disease
to select unaffected embryos of an HLA tissue type compatible with that of an existing affected child (8 ). At
delivery, hematopoietic stem cells from the newborn umbilical cord blood can be used to treat the affected sibling.
* Address correspondence to: X.Z. at: •••. E-mail [email protected]. C.N. at: First
Affiliated Hospital of Sun Yat-sen University, 58 Zhongshan Rd. II Guangzhou, Gu, China
510080. Fax 86-20-87750632; e-mail [email protected]. J.W. at: •••. E-mail
[email protected].
Received June 6, 2014; accepted January 13, 2015.
Previously published online at DOI: 10.1373/clinchem.2014.228569
13
Nonstandard abbreviations: IVF, in vitro fertilization; PGD, preimplantation genetic diagnosis; WGA, whole-genome amplification; ADO, allele dropout; PA, preferential amplification; MDA, multiple displacement amplification; SNP, single-nucleotide polymorphism; NGS, next-generation sequencing; WGS, whole-genome sequencing; AF,
amniotic fluid; MAR, missing allele rate; HMM, hidden Markov model; CR, copy ratio;
ChrX, X chromosome.
1
Copyright (C) 2015 by The American Association for Clinical Chemistry
During the PGD procedure, aneuploidy testing for preimplantation embryos is necessary owing to the high incidence of chromosomal abnormalities in embryos cultured in vitro. The sex information of embryos can be an
indication for PGD as well. Hence, genome-wide profiling of embryos with diagnosis of single gene disorders,
HLA matching, sex, and aneuploidy provides an approach to PGD for mendelian disorder carriers, such as
the carriers of ␤-thalassemia.
Traditionally, multiplex PCR has been used to detect the pathogenic variants of an embryo with short tandem repeat markers in close proximity to the causative
gene as a diagnosis backup. With the introduction of
whole-genome amplification (WGA) to amplify the biopsied embryonic single cell, the complete embryonic
genome can be obtained to facilitate the diagnosis of
genetic variants in embryos. However, the allele dropout
(ADO) and preferential amplification (PA) in WGA still
restrict the diagnostic accuracy of WGA-based PGD.
The ADO in multiple displacement amplification
(MDA) is reported to range from 10.0% to 38.9% (9 ).
Karyomapping based on single-nucleotide polymorphism (SNP) array has been reported to overcome the
ADO obstacles in single-cell analysis (10 ). For the embryonic aneuploidy testing, detection of chromosomal
abnormalities based on array comparative genomic hybridization (11 ), SNP array (12 ), or next-generation sequencing (NGS) (13–17 ) in biopsied embryonic cells
have been published. In this study, genome profiling of
human embryos using single blastomere whole-genome
sequencing (WGS) was assessed in a ␤-thalassemia family. The ADO in WGA was addressed and the haplotype
of the embryo was established.
Materials and Methods
SAMPLE RECRUITMENT
We recruited a couple who had a daughter suffering from
␤-thalassemia. This couple with a maternal age of 37
years underwent IVF with PGD to obtain a healthy baby.
Single-cell biopsy was performed on day 3 after fertilization and amplified using the MDA approach. Seven blastomeres were biopsied. A single cell was aspirated with a
sampling micropipette going through the split in the
zona pellucida. After biopsy, embryos were cultured in
blastocyst medium (SAGE BioPharma) until vitrification. PCR–reverse dot blot analysis was used to diagnose
␤-thalassemia, and short tandem repeat linkage analysis
was used for HLA matching. The pregnant mother received amniocentesis for prenatal diagnosis of ␤thalassemia at 16 weeks of gestation. Both the PGD and
prenatal diagnosis indicated a female embryo that was
negative for ␤-thalassemia (not a carrier of or affected
with ␤-thalassemia) and that the HLA types matched
those of the affected child. A healthy female baby was
2
Clinical Chemistry 61:4 (2015)
born at 38 weeks of gestation in July 2012. We collected
a portion of the WGA production of the 7 blastomeres
with the amniotic fluid (AF) of the corresponding IVF
fetus, as well as 5 mL of whole blood from this couple and
the proband daughter. Each participant provided informed consent under the protocol approved by the Ethics Committee of the first affiliated hospital of Sun Yatsen University.
WGA, LIBRARY CONSTRUCTION, AND SEQUENCING
Each blastomere cell was transferred into a PCR tube
containing 3.5 ␮L of PBS. The wash buffer during cell
sorting was used as the negative control. MDA was performed using the REPLI-g Midi kit (Qiagen) according
to the instructions. The amplified DNA was stored at
⫺20 °C for further processing.
Approximately 1 ␮g MDA product from each blastomere was used for library construction and sequencing.
For the transferred blastomere, we performed WGS using a pair-end strategy of 90 bp with 30-fold depth.
The peripheral blood samples of parents and proband daughter as well as the AF were used for DNA
extraction, library construction, All-in-One target region
capture, and sequencing. The MDA products of the
other 6 blastomeres nontransferred were also used for
All-in-One target region sequencing. We extracted 1 ␮g
DNA from each blood sample and AF, and constructed
libraries with 200-bp insert sizes. Libraries were used for
All-in-One target capture with a NimbleGen EZ sequence capture array containing 181.37-Mb target regions of the whole exome, one million tag-SNPs, the
whole MHC region, and several mendelian disorder–
related genes. After chip hybridization and elution, the
enriched DNA libraries were quality controlled. We performed NGS for the qualified libraries with pair-end
90-bp reads. The sequence reads of this project have been
deposited at NCBI SRA (SRA065291).
ALIGNMENT, SNP/INDEL CALLING
The reads with low quality and adapter contamination
were filtered out before aligning. The cleaned reads were
mapped to the reference genome (Hg19, NCBI release
GRCh37) using Burrows-Wheeler alignment [BWA
(http://bio-bwa.sourceforge.net/)] (18 ) with the default
parameters. We removed PCR duplications using
SAMtools (19 ). Afterward, we recalibrated the base qualities and performed the SNP/Indel calling using the Genome Analysis Toolkit (http://www.broadinstitute.org/
gatk/) (20 ) (indels are insertions and deletion in a
genome with a length of approximately 15 nucleotides).
ORIGINAL GENOTYPE CALLING
We performed the original genotype calling on the candidate sites, where one of the parents carried an SNP. The
ADO and PA of WGA were evaluated as well. Tradition-
Embryo Genome Profiling by NGS for PGD
ally, ADO is defined as the random amplification failure
of 1 of the 2 heterozygous alleles, whereas PA is defined as
a bias that favors the amplification of one allele over the
other, leading to failure to reach the threshold of detection (21 ). For better understanding, herein we denoted
the missing allele rate (MAR) as the overall percentage of
ADO and PA. In this section, the expected homozygous
(parents carried same homozygous SNP) and heterozygous (parents carried different homozygous SNP) would
be used to estimate the sequence error rate and MAR
respectively. The L statistic was defined as the odds ratio
between heterozygous and homozygous genotypes:
L ⫽ log10
Pr共genotype ⫽ AB兲
,
max[Pr共genotype ⫽ AA兲, Pr共genotype ⫽ BB兲]
where Pr is the probability of each genotype. Pr was calculated through binomial distribution (Bin) in these
formulas:
Pr(genotype ⫽ AB) ⫽ Bin(N,0.5),
Pr(genotype ⫽ AA) ⫽ Bin共N,1 ⫺ e兲, and
Pr(genotype ⫽ BB) ⫽ Bin共N,e兲,
where the sequence error (e) was estimated as the mismatch rate at expected homozygous sites and N represents the number of expected recombinations and
SNPs.
EMBRYO HAPLOTYPE PHASING
We used heterozygous callings of original genotypes to
predict the parental transmission through a hidden
Markov model (HMM). An HMM can be considered a
generalization of a mixture model, for which the hidden
variables, which control the mixture component to be
selected for each observation, are related through a
Markov process. During this process, the maternal and
paternal transmissions were predicted separately: (a) The
hidden states of the model, S ⫽ {0,1}, defined which
parental allele was transmitted to the embryo. In particular, 0 represented the allele transmitted to the proband
sister. (b) The distinct observation symbols per state, V ⫽
{0,1}, expressed the consistency between the transmitted
allele and the draft genotype; 1 stood for consistency
between the transmitted allele and the draft genotype and
0 stood for inconsistency. (c) The initial state distribution
␲ ⫽ {␲i},(i ⑀ S, ␲i ⫽ 0.5) was the given prior probability
of the different hidden state at the first site. (d) The state
transition probability distribution, A ⫽ {aij},(i,j ⑀ S), represented the probability of recombination or not, which
could be calculated according to former studies (22 ). If
i ⫽ j, aij ⫽ 1 ⫺ Nr/Np; otherwise, aij ⫽ Nr/Np, where Nr
and Np represented the number of expected recombinations and SNPs. (e) The observation symbols probability
distribution in each state, B ⫽ {bi(k)},(i ⑀ S, k ⑀ V ), were
estimated using the expected homozygous (Must-hom.)
and expected heterozygous (Must-het.) sites:
b i (0) ⫽
no. sites (L ⬎ 0,Must-hom.)
,
no. sites (L ⬎ 0,Must-het. or Must-hom.)
b i (0) ⫽ 1 ⫺ b i (0).
Finally, we deciphered the optimized hidden states using
HMM and the Viterbi algorithm.
ANEUPLOIDY DETECTION
Aneuploidies of the blastomeres were analyzed according
to the depth of the heterozygous SNP alleles. The signals
were filtered with 2 criteria: with size larger than 30 Mb
and with ratios between the depth of the 2 alleles ⬎2 or
⬍0.5. After filtering, the signals with a copy ratio (CR)
⬍0.75 were considered to be deletions, and the signals
with CR ⬎1.25 were considered to be duplications.
SEX DETERMINATION
Ideally, the CR of the sex chromosomes can reveal the sex
of the embryo. The CR was defined as the ratio between
the observed and expected reads:
CRc ⫽
R c /R total
,
L c /L total
where subscript c is the ID and R and L represent the
number of mapped reads and chromosomal length, respectively. In the case of heterozygosity examination, we
evaluated the difference in the L distribution on the X
chromosome (ChrX) between expected heterozygous
and homozygous sites by the rank-sum test.
Results
NEXT-GENERATION SEQUENCING
The couple received IVF treatment with a PGD test for
both ␤-thalassemia and HLA. A single-blastomere WGA
of the transferred embryo was collected for WGS (PEDE-A). A total of 118.51 Gbp data was generated using
NGS, which covered 96.8% of the reference genome
[HG19, NCBI (National Center for Biotechnology Information) built 37] to 38.25⫻ depth. To construct the
parental haplotype, we performed All-in-One target sequencing using genomic DNA extracted from the peripheral blood of parents (PED-Mother & PED-Father)
and the proband daughter (PED-Sister) as well as WGA
products from the 7 blastomeres, including the transferred embryo (PED-E-B) and the other 6 embryos nontransferred (PED-E-1,2,4,5,6,7). In addition, amniotic
fluid of the corresponding IVF baby (PED-AF) was collected to evaluate the accuracy of method through All-inOne target sequencing (Fig. 1A). The summary of data
production for NGS is displayed in Table 1 in the
Data Supplement that accompanies the online version of
Clinical Chemistry 61:4 (2015) 3
Fig. 1. Sample design, sequence strategy, bioinformatics pipeline, and general performance.
(A), Sequence strategy and bioinformatics pipeline. (B and C), The detailed statistics of the performance on autosomes and sex chromosomes.
(D), The MAR statistics before and after embryo haplotype phasing.
this article at http://www.clinchem.org/content/vol61/
issue4.
WGA BIAS ASSESSMENT AND ORIGINAL SNP PROFILING
According to the AF genotypes on autosomes, we found
that 34 004 of 158 643 embryo heterozygous SNPs had
dropped at least 1 allele, corresponding to an ADO rate
of 21.43% (Fig. 2A). Approximately 56% of the ADO
occurred in the paternal allele on the autosomes (Fig.
2B). As was reported previously (23, 24 ), we observed a
GC content–related bias, namely the ADO rate was increased in GC-poor (⬍30%) and GC-rich (⬎30%)
genomic regions (see online Supplemental Fig. 1).
To eliminate this bias, we performed original genotype calling of the embryo genome using the parameter L
to determine the original genotype (see online Supplemental Fig. 2). It indicated that L could isolate heterozygous sites from homozygote and ADO backgrounds. Besides the expected homozygous and heterozygous sites,
330 060 original genotypes were determined successfully.
According to the AF genotype on autosomes, 75.39%
4
Clinical Chemistry 61:4 (2015)
and 99.32% of the homozygous and heterozygous genotypes were accurately predicted (Fig. 1B). Regarding
ChrX, 71.36% and 99.66% of the homozygous and
heterozygous genotypes were accurately determined (Fig.
1C). This indicated an MAR of 40.02% on autosomes
and 37.99% on sex chromosomes.
EMBRYO HAPLOTYPE PHASING
We phased the embryonic autosomes with the parental
haplotypes, which were inferred from the genotype of the
proband daughter (Fig. 1A). We successfully phased
131 785 and 133 601 of the maternal and paternal
heterozygous sites, including mutations in the hemoglobin, beta (HBB)14 gene. Subsequently, we used an HMM
to determine the parental transmission, identifying re-
14
Human genes: HBB, hemoglobin, beta; HLA-DRB1, SlC25A22, solute carrier family 25
(mitochondrial carrier: glutamate), member 22; INS, insulin; SMPD1, sphingomyelin
phosphodiesterase 1, acid lysosomal; TH, tyrosine hydroxylase.
Embryo Genome Profiling by NGS for PGD
Fig. 2. Basic statistics and chromosomal contribution of ADO.
(A), According to the AF genotypes, approximately 21% of the heterozygotes dropped out at least 1 allele, and approximately 56% of these ADO
occur on the paternal allele. (B), The chromosomal contribution of each ADO category (green, paternal ADO; orange, maternal ADO; red, both
allele dropped ADO; grey, uncertain).
combination breakpoints and filtering out sequencing
error at the same time. The approach led to successful
phasing of 98.57% of the autosomal SNPs in the embryonic genome. According to the AF genotype, the accuracy of homozygous callings was significantly improved
from 75.39% to 99.73%. Using the haplotype information, 62 110 homozygous callings were identified as PA
or ADO and corrected back to heterozygous with an
accuracy of 95.44% (Fig. 1B), making the overall accuracy of the heterozygous callings 97.85%. Also, the autosomal MAR was markedly decreased from 40.02% to
0.50% (Fig. 1D).
We further performed a simulation by subsampling
of PED-E-A to characterize how the accuracy of the embryo genotype depends on sequencing depth (see online
Supplemental Fig. 3). As little as 25% of the WGS data
was able to decrease the MAR to ⬍1% (mean depth ⫽
9.5⫻). Concomitantly, the accuracy of the final genotypes stabilized above 97%. This indicated that our
model was robust for lower depth single-cell sequencing.
The accuracy of homologous recombination site
pinpointing using our method was demonstrated. We
compared the recombination breakpoints in parental alleles between amniotic fluid PED-AF and the embryo
Clinical Chemistry 61:4 (2015) 5
Table 1. Aneuploidies detection for the 7 embryos by NGS and SNP array.
Embryos
NGS results
SNP array results
PED-E-A
46,XX
46,XX
PED-E-1
46,XY
46,XY
PED-E-2
46,XX
46,XX
PED-E-4
41,XY, −1(M), −6(P), −12(P), −14(M), −15(P)
41,XY, −1, −6, −12, −14, −15
PED-E-5
47,XX, +7(P), +9(M), −11(P)
47,XX, +7, +9, −11
PED-E-6
44,XX, −12(P), −19(P)
44,XX, −12, −19
PED-E-7
45,XX, +6(P), −8(P), −20(P)
45,XX, +6, −8, −20
M, maternal alleles originated; P, paternal alleles originated.
transferred PED-E-A (see online Supplemental Fig. 4).
We also evaluated the accuracy statistics of SNP calling
close to the recombination breakpoints (see online Supplemental Fig. 5), which illustrated the reliability of the
embryonic haplotype phased by our method.
SEX DETECTION AND SEX CHROMOSOME ANALYSIS
Genomic information on sex chromosomes is central to
the diagnoses of mendelian diseases such as Duchenne
muscular dystrophy (MIM number 310200). We performed analysis on sex determination and haplotype on
sex chromosomes.
The CR of sex chromosomes would have revealed
the sex information (see online Supplemental Fig. 6).
The relative ratios of ChrX and ChrY for the transferred
embryo were 0.75 and 0.07, respectively. The lower ratio
of ChrY indicated that the signals of ChrY might randomly originate from an experimental or analytical process. To reach an unequivocal conclusion, we determined
the sex of the embryo by examining the heterozygosity of
ChrX. Theoretically, the L distribution between expected heterozygous and homozygous sites would be similar for male embryos but different for female embryos.
The rank–sum test (P value ⫽ 2.2 ⫻ 10⫺16) indicated a
significant difference in the L distribution for the transferred embryo (PED-E-A), implying a female embryo.
This finding was consistent with the birth. The L distribution between heterozygous and homozygous sites was
significantly different for PED-E-2, PED-E-5, PED-E-6,
PED-E-7, indicating the 4 embryos were female. The
other 2 embryos PED-E-1 and PED-E-4 were determined as male with the same L distributions. The sex
information of all 7 embryos was validated by SNP array
(Table 1; also see Supplemental Fig. 8).
We next determined the haplotype of ChrX for the
female embryo transferred. Of the 4887 original genotypes, over 99% of the transmitted allele was successfully
predicted. The general accuracies of the homozygous and
heterozygous callings were 97.18% and 99.52%, respec6
Clinical Chemistry 61:4 (2015)
tively (Fig. 1C). Meanwhile, the MAR was decreased
from 39.73% to 0.12% (Fig. 1D).
ANEUPLOIDY DETECTION
Aneuploidies of the 7 blastomeres were detected using
the depth ratio of the parental alleles in heterozygous
sites. Supplemental Fig. 7 shows the sequencing depth of
the parental alleles with CR ⬍0.75 (deletion) or CR
⬎1.25 (duplication) after filtering for partial chromosomes in the 7 embryos. The transferred embryo was
confirmed as euploidy because the majority of filtered
alleles were within 0.75 ⬍ CR ⬍ 1.25 in all the 22
autosomes and sex chromosomes. The aneuploidy results
of all 7 embryos are listed in Table 1, and the NGS results
were all validated by SNP array (see online Supplemental
Fig. 8). The embryonic aneuploidies by NGS in this
study were all consistent with the SNP array results.
WHOLE-GENOME MENDELIAN DISORDER DIAGNOSIS AND
HLA MATCHING
The embryonic genome allowed us to perform a genomewide diagnosis of mendelian disorders, including 83
common mendelian disorders in the clinic involving 384
genes (Fig. 3A). The genotypes indicated that the transferred embryo was totally negative for these 83 disorders.
Again, the results were confirmed by the healthy phenotypes after birth. A detailed diagnosis on ␤-thalassemia
and HLA matching was further performed.
Mutations in the HBB gene cause ␤-thalassemia.
The All-in-One sequencing indicated that the proband
sister carried a homozygous A3 C point mutation at
rs7480526 (NM_000518.4:c.315 ⫹ 74T⬎G), a known
pathogenic variant for ␤-thalassemia (25 ) in the intron
region of the HBB gene. Both parents carried a single
copy of this mutation and exhibited a normal phenotype.
We performed an HBB gene diagnosis based on the genotype and haplotype for all 7 embryos. The transferred
embryo was homozygous A in rs7480526. Haplotypes of
the proband sister were defined as Hap0/Hap0. The pre-
Embryo Genome Profiling by NGS for PGD
Fig. 3. Genome-wide clinical screening.
(A), Information of the genome for the transferred embryo PED-E-A. The outer to inner circles formed represent the genome-wide
mendelian disorder diagnosis, chromosomal structure, sequence depth, paternal allele transmission, and maternal allele transmission. (B), Details of the ␤-thalassemia diagnosis for PED-E-A. (C), Details of the HLA-matching test for PED-E-A. SlC25A22, solute carrier
family 25 (mitochondrial carrier: glutamate), member 22; INS, insulin; SMPD1, sphingomyelin phosphodiesterase 1, acid lysosomal;
TH, tyrosine hydroxylase.
dicted transmission showed that the haplotype was HapI/
HapI for PED-E-A at rs7480526 (Fig. 3B). We diagnosed this transferred embryo as ␤-thalassemia negative
(not a carrier of or affected with ␤-thalassemia), which
was consistent with the AF genotypes and diagnosis
after birth. Genotype at rs7480526 and predicted
transmission of the haplotype for all 7 embryos are
shown in Table 2 and Supplemental Fig. 9. The genotype at rs7480526 was consistent with the predicted
transmission of haplotype for all 7 embryos and was
further validated by Sanger sequencing (Table 2; also
see Supplemental Fig. 10).
The genotypes of HLA genes (see online Supplemental Table 2) have been used before transplantation
(26 ). For this family, the umbilical cord blood stem cells
of the transferred embryo could be used to treat the proband sister’s thalassemia (27 ). We selected the heterozygous genotypes of all of the HLA genes (see online Sup-
Table 2. The genotype at rs7480526, the haplotype in the HBB gene, and MHC regions for the 7 embryos.
NGS
Embryos
Genotype at
rs7480526
Sanger sequencing
Haplotype in
HBB gene
Haplotype in MHC
regions
Genotype at
rs7480526
Haplotype in
HLA-DRB1
PED-Sister
CC
Hap0/Hap0
Hap0/Hap0
CC
Hap0/Hap0
PED-E-A
AA
Hap1/Hap1
Hap0/Hap0
AA
Hap0/Hap0
PED-E-1
AC
Hap0/Hap1
Hap0/Hap1
AC
Hap0/Hap1
PED-E-2
AC
Hap0/Hap1
Hap1/Hap0
AC
Hap1/Hap0
PED-E-4
AC
Hap1/Hap0
Hap0/Hap0
AC
Hap0/Hap0
PED-E-5
AA
Hap1/Hap1
Hap0/Hap1
AA
Hap0/Hap1
PED-E-6
AC
Hap1/Hap0
Hap0/Hap0
AC
Hap0/Hap0
PED-E-7
CC
Hap0/Hap0
Hap1/Hap0
CC
Hap1/Hap0
Clinical Chemistry 61:4 (2015) 7
plemental Table 2) in NGS data to predict the
transmission haplotypes of each embryo. According to
the predicted transmission, the haplotypes of the transferred embryo (PED-E-A) were Hap0/Hap0 in HLA
genes (Fig. 3C), consistent with the proband sister. Thus,
we determined that their HLA types were matched. The
predicted transmission of haplotype in MHC regions for all
7 embryos is shown in Table 2 and online Supplemental
Fig. 9. The haplotype in HLA genes predicted from NGS
data were further validated by PCR with Sanger sequencing
of an oligonucleotide sequence in the major histocompatibility complex, class II, DR beta 1 (HLA-DRB1) gene with
both paternal and maternal heterozygous alleles (see online
Supplemental Fig. 11). The haplotype of HLA-DRB1 in
each embryo by Sanger sequencing was consistent with the
result predicted from NGS.
Discussion
In a retrospective study, we demonstrated a model for
retrieval of the embryo genome using single-cell sequencing assisted by parental haplotypes in a ␤-thalassemia
family. Over 97% of the embryo autosomal genotypes
and 99% of the genotypes on ChrX were reconstructed
accurately. In addition, the diagnosis of ␤-thalassemia,
HLA matching, sex, and aneuploidy were analyzed and
proved to be accurate. This established the technical
foundation of embryo genome recovery using single-cell
sequencing in clinics.
Although we successfully recovered the embryo
SNPs and made an accurate diagnosis, there are 2 major
directions for further development. First, the parental
haplotype can be inferred using a simple trio strategy. We
can infer the parental haplotypes using these “offspring”
and phase these embryo genomes reciprocally. There may
be several experimental approaches to obtain the parental
haplotype directly, such as a fosmid library (28 ), long
fragment read technology (29 ), and microfluidic technology (30 ). The single-cell sequencing of sperm or second polar body can help to obtain the parental haplotypes as well. Second, in this study we focused exclusively
on SNPs and aneuploidies. However, clinical applications of embryonic genomes require information for
other forms of variations, e.g., short indels and de novo
mutations. Indels play an important role in mendelian
disorders (31 ) and are potential markers of complex traits
(32 ). In this study, we also tried to detect embryo indels
using our strategy. The accuracy of autosomal homozygous and heterozygous callings of indels in embryo was
improved from 69.00% and 83.10% to 89.14% and
88.13%, respectively (see online Supplemental Table 3).
Because of the false positives in WGA and sequencing, we
were unable to confirm de novo mutations.
In traditional PGD, multiplex PCR and array-based
technology have been used to identify pathogenic mu8
Clinical Chemistry 61:4 (2015)
tations (33 ). However, multiplex PCR has a lower
throughput and a limited number of known sites can be
analyzed. Array-based technology provides a highthroughput approach such as aCGH (array comparative
genomic hybridization) and SNP array, but it can identify only known common markers in human genome.
Although karyomapping based on SNP array provides an
option for the detection of the majority of mutations, the
resolution and accuracy are restricted. NGS provides a
high-throughput and single-nucleotide resolution technology to analyze the embryonic genome, enabling
PGD. Compared to array-based karyomapping, NGS
mainly shows technological advantages. The strength of
NGS is in the detection of family-specific mutations or
rare mutations, which would not be included in the
probe targets of SNP array. A large number of pathogenic
mutations would not be directly detected in most SNP
array assays, such as the rs7480526 of the HBB gene in
this study. Furthermore, detection of embryo indels was
evaluated, indicating the feasibility of NGS for analysis in
embryos of complicated variations that cannot be analyzed in SNP array assay.
The cost of NGS with 30-fold coverage of the embryo genome would be about $1000. However, sequencing with lower depth is also possible because, in our
model, the MAR decreased ⬍1% with a depth of 9.5⫻.
The method theoretically can be modified so that a condensed target region for sequencing could be used and the
cost reduced accordingly. The turnaround time of sequencing is about 20 working days, so that embryo vitrification is needed. Several studies have indicated that,
with the introduction of vitrification, the use of NGS
during pregnancy and the overall efficiency during IVF–
PGD might improve dramatically (34 –36 ). Furthermore, with continuous technical improvements the costeffectiveness of sequencing is rapidly increasing as is the
accuracy.
In this study, we introduced an NGS-based method
to access the embryo genome. Numerical chromosome
anomalies, which often occur during human cleavagestage embryogenesis, have been accurately detected. The
ability to recover an embryo genome at high resolution
using single-cell sequencing should bring new insights
into the diagnoses of mendelian disorders and personal
genomic analysis for PGD. In the case of parental carriage of mendelian disorders, the embryo genome provides all of the necessary evidence for diagnosis. It paves
the way for PGD toward personal medicine, such as diagnosis of allergies (37 ). The additional information
available to parents will raise ethical questions, especially
regarding the prediction of novel parental pathological
mutations. Another major problem in testing embryos
without a phenotype for diseases that are not in the family is in interpreting the variants. Variants of uncertain
significance will often be discovered, and many of these vari-
Embryo Genome Profiling by NGS for PGD
ants are difficult to classify even in affected individuals. The
variants of uncertain significance must be considered in embryonic personal genome interpretation during clinical
practice. The pretest and posttest genetic counseling is also
substantial and should be standardized. The ethical questions raised need to be thoroughly discussed within the scientific community and at a societal level.
Author Contributions: All authors confirmed they have contributed to
the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising
the article for intellectual content; and (c) final approval of the published
article.
Authors’ Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest:
Employment or Leadership: C. Zhou, First Affiliated Hospital of Sun
Yat-sen University, Guangzhou, China.
Consultant or Advisory Role: None declared.
Stock Ownership: None declared.
Honoraria: None declared.
Research Funding: J. Wang, W. Wang, S. Chen, X. Yin, X. Pan, F.
Chen and H. Jiang, this study was funded by the Laboratory of Shenzhen Birth Defect Screening Project (JZF no. [2011] 861 and JZF no.
[2011] 862) and was approved by the Shenzhen Municipal Commission for Development and Reform and Key Laboratory Project in Shenzhen (CXB201108250096A) and Key Laboratory of Cooperation Project in Guangdong Province (2011A060906007).
Expert Testimony: None declared.
Patents: S. Chen, X. Yin, and X. Pan, PCT/CN2013/073375.
Role of Sponsor: The funding organizations played no role in the
design of study, choice of enrolled patients, review and interpretation of
data, or final approval of manuscript.
Acknowledgments: We thank Laurie Goodman and Huijue Jia for
writing advice. We thank Yingrui Li, Xu Yang, and Fengping Xu
for support and thank Chunlei Zhang, Xuchao Li, and Chun Gong for
assistance in data analysis.
References
1. Bisignano A, Wells D, Harton G, Munne S. PGD and aneuploidy screening for 24 chromosomes: advantages
and disadvantages of competing platforms. Reprod
Biomed Online 2011;23:677– 85.
2. Verlinsky Y, Cohen J, Munne S, Gianaroli L, Simpson JL,
Ferraretti AP, Kuliev A. Over a decade of experience with
preimplantation genetic diagnosis: a multicenter report. Fertil Steril 2004;82:292– 4.
3. Ng SB, Nickerson DA, Bamshad MJ, Shendure J. Massively parallel sequencing and rare disease. Hum Mol
Genet 2010;19:R119 –24.
4. Verlinsky Y, Ginsberg N, Lifchez A, Valle J, Moise J,
Strom CM. Analysis of the first polar body: preconception genetic diagnosis. Hum Reprod 1990;5:826 –9.
5. Handyside AH, Kontogianni EH, Hardy K, Winston RM.
Pregnancies from biopsied human preimplantation
embryos sexed by Y-specific DNA amplification. Nature
1990;344:768 –70.
6. McArthur SJ, Leigh D, Marshall JT, de Boer KA, Jansen
RP. Pregnancies and live births after trophectoderm biopsy and preimplantation genetic testing of human
blastocysts. Fertil Steril 2005;84:1628 –36.
7. Olivieri NF. The beta-thalassemias. N Engl J Med 1999;
341:99 –109.
8. Milachich T, Timeva T, Ekmekci C, Beyazyurek C, Tac HA,
Shterev A, Kahraman S. Birth of a healthy infant after
preimplantation genetic diagnosis by sequential blastomere and trophectoderm biopsy for ␤-thalassemia
and HLA genotyping. Eur J Obstet Gynecol Reprod Biol
2013;169:261–7.
9. Zheng YM, Wang N, Li L, Jin F. Whole genome amplification in preimplantation genetic diagnosis. J Zhejiang
Univ Sci B 2011;12:1–11.
10. Handyside AH, Harton GL, Mariani B, Thornhill AR, Affara N, Shaw MA, Griffin DK. Karyomapping: a universal
method for genome wide analysis of genetic disease
based on mapping crossovers between parental haplotypes. J Med Genet 2010;47:651– 8.
11. Fragouli E, Alfarawati S, Daphnis DD, Goodall NN, Mania A, Griffiths T, et al. Cytogenetic analysis of human
blastocysts with the use of FISH, CGH and aCGH: scientific data and technical evaluation. Hum Reprod 2011;
26:480 –90.
12. Treff NR, Su J, Tao X, Levy B, Scott RT Jr. Accurate single
cell 24 chromosome aneuploidy screening using
whole genome amplification and single nucleotide
polymorphism microarrays. Fertil Steril 2010;94:
2017–21.
13. Yin X, Tan K, Vajta G, Jiang H, Tan Y, Zhang C, et al.
Massively parallel sequencing for chromosomal abnormality testing in trophectoderm cells of human blastocysts. Biol Reprod 2013;88:69.
14. Zhang C, Chen S, Yin X, Pan X, Lin G, Tan Y, et al. A single
cell level based method for copy number variation analysis by low coverage massively parallel sequencing.
PLoS One 2013;8:e54236.
15. Fiorentino F, Biricik A, Bono S, Spizzichino L, Cotroneo
E, Cottone G, et al. Development and validation of a
next-generation sequencing-based protocol for 24chromosome aneuploidy screening of embryos. Fertil
Steril 2014;101:1375– 82.
16. Wang L, Wang X, Zhang J, Song Z, Wang S, Gao Y, et al.
Detection of chromosomal aneuploidy in human preimplantation embryos by next-generation sequencing.
Biol Reprod 2014;90:95.
17. Wells D, Kaur K, Grifo J, Glassner M, Taylor JC, Fragouli
E, Munne S. Clinical utilisation of a rapid low-pass
whole genome sequencing technique for the diagnosis
of aneuploidy in human embryos prior to implantation.
J Med Genet 2014;51:553– 62.
18. Li H, Durbin R. Fast and accurate short read alignment
with Burrows-Wheeler transform. Bioinformatics 2009;
25:1754 – 60.
19. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer
N, et al. The Sequence Alignment/Map format and
SAMtools. Bioinformatics 2009;25:2078 –9.
20. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis
K, Kernytsky A, et al. The Genome Analysis Toolkit: a
MapReduce framework for analyzing next-generation
DNA sequencing data. Genome Res 2010;20:1297–
303.
21. Findlay I, Ray P, Quirke P, Rutherford A, Lilford R. Allelic
drop-out and preferential amplification in single cells
and human blastomeres: implications for preimplantation diagnosis of sex and cystic fibrosis. Hum Reprod
1995;10:1609 –18.
22. Kirkness EF, Grindberg RV, Yee-Greenbaum J, Marshall
CR, Scherer SW, Lasken RS, Venter JC. Sequencing of
isolated sperm cells for direct haplotyping of a human
genome. Genome Res 2013;23:826 –32.
23. Xu X, Hou Y, Yin X, Bao L, Tang A, Song L, et al. Singlecell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 2012;148:
886 –95.
24. Hou Y, Song L, Zhu P, Zhang B, Tao Y, Xu X, et al. Singlecell exome sequencing and monoclonal evolution of
a JAK2-negative myeloproliferative neoplasm. Cell
2012;148:873– 85.
25. Bilgen T, Arikan Y, Canatan D, Yesilipek A, Keser I. The
association between intragenic SNP haplotypes and
mutations of the beta globin gene in a Turkish population. Blood Cells Mol Dis 2011;46:226 –9.
26. Sheldon S, Poulton K. HLA typing and its influence on
organ transplantation. Methods Mol Biol 2006;333:
157–74.
27. Lucarelli G, Isgro A, Sodani P, Gaziev J. Hematopoietic
stem cell transplantation in thalassemia and sickle cell
anemia. Cold Spring Harb Perspect Med 2012;2:
a011825.
28. Kitzman JO, Mackenzie AP, Adey A, Hiatt JB, Patwardhan RP, Sudmant PH, et al. Haplotype-resolved genome sequencing of a Gujarati Indian individual. Nat
Biotechnol 2011;29:59 – 63.
29. Peters BA, Kermani BG, Sparks AB, Alferov O, Hong P,
Alexeev A, et al. Accurate whole-genome sequencing
and haplotyping from 10 to 20 human cells. Nature
2012;487:190 –5.
30. Fan HC, Wang J, Potanina A, Quake SR. Whole-genome
molecular haplotyping of single cells. Nat Biotechnol
2011;29:51–7.
31. Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK,
Dent KM, et al. Exome sequencing identifies the cause
of a mendelian disorder. Nat Genet 2010;42:30 –5.
32. Vali U, Brandstrom M, Johansson M, Ellegren H.
Insertion-deletion polymorphisms (indels) as genetic
markers in natural populations. BMC Genet 2008;9:8.
33. Harper JC, Sengupta SB. Preimplantation genetic
diagnosis: state of the art 2011. Hum Genet 2012;131:
175– 86.
Clinical Chemistry 61:4 (2015) 9
34. Roque M, Lattes K, Serra S, Sola I, Geber S, Carreras
R, Checa MA. Fresh embryo transfer versus frozen
embryo transfer in in vitro fertilization cycles: a systematic review and meta-analysis. Fertil Steril 2013;
99:156 – 62.
35. Shapiro BS, Daneshmand ST, De Leon L, Garner FC,
10
Clinical Chemistry 61:4 (2015)
Aguirre M, Hudson C. Frozen-thawed embryo transfer is
associated with a significantly reduced incidence of ectopic pregnancy. Fertil Steril 2012;98:1490 – 4.
36. Lathi RB, Massie JA, Gilani M, Milki AA, Westphal LM,
Baker VL, Behr B. Outcomes of trophectoderm biopsy
on cryopreserved blastocysts: a case series. Reprod
Biomed Online 2012;25:504 –7.
37. Negoro T, Orihara K, Irahara T, Nishiyama H, Hagiwara K, Nishida R, et al. Influence of SNPs in
cytokine-related genes on the severity of food allergy and atopic eczema in children. Pediatr Allergy
Immunol 2006;17:583–90.