Nature Genetics: doi:10.1038/ng.3304

Supplementary Figure 1
Distribution of sequencing coverage in the WGS500 project.
Left, plots of the cumulative distribution, for each WGS500 sample, of coverage across the genome (top left) or exome (bottom left).
Top right, a comparison of coverage between WGS500 samples (blue) and exomes (black) sequenced at the Oxford Biomedical
Research Centre. Thicker lines are the medians across samples; dotted vertical lines are the global medians. Bottom right, the
distribution, for each WGS500 sample, of the ratio of the number of reads with the alternate allele (ALT) to the total number of reads
(TOTAL), for novel variants. We expect the mean to be 0.5. Individuals with mean <0.4 are shown with colored lines. These are likely to
have sample contamination, which leads to a larger number of heterozygous calls for which there are few ALT reads. The sample
HCM_2361 was removed from further analysis.
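The contamination check described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the sample names and read counts are hypothetical, and only the 0.4 threshold on the mean ALT/TOTAL ratio comes from the legend.

```python
from statistics import mean


def flag_low_allele_balance(ratios_by_sample, threshold=0.4):
    """Return {sample: mean ALT/TOTAL ratio} for samples whose mean is below threshold."""
    flagged = {}
    for sample, ratios in ratios_by_sample.items():
        sample_mean = mean(ratios)
        if sample_mean < threshold:
            flagged[sample] = sample_mean
    return flagged


# Hypothetical (ALT reads, TOTAL reads) at novel variant sites for two samples.
counts = {
    "sample_A": [(14, 30), (16, 31), (15, 29)],  # balanced, mean close to 0.5
    "sample_B": [(4, 30), (15, 32), (5, 28)],    # many calls with few ALT reads
}
ratios = {
    sample: [alt / total for alt, total in sites]
    for sample, sites in counts.items()
}
flagged = flag_low_allele_balance(ratios)
```

Only `sample_B` falls below the threshold here, mirroring the pattern the legend attributes to contamination.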
Supplementary Figure 2
Influence of coverage on concordance between sequence data and SNP arrays for multiple samples.
Left, genotype concordance as a function of sequencing depth; note that concordance drops progressively when coverage drops below
15×. 95% confidence intervals, calculated by the Wald method, are indicated. Right, fraction of sites with a given level of coverage.
Note that samples with higher coverage (e.g., LVNC_1.1.70, LVNC_1.2.83) have fewer SNPs in the lower-coverage bins, and the
genotype concordance estimate therefore has larger confidence intervals.
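The Wald interval mentioned above is the normal-approximation interval for a proportion, p ± z·sqrt(p(1 − p)/n). A minimal sketch follows; the counts are hypothetical and the clamping to [0, 1] is an added convenience, not something stated in the legend.

```python
import math


def wald_interval(concordant, total, z=1.96):
    """Wald (normal-approximation) confidence interval for a concordance proportion."""
    p = concordant / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Same concordance (99%) estimated from many sites and from few sites.
many_sites = wald_interval(990, 1000)
few_sites = wald_interval(99, 100)
```

With the same observed concordance, the interval from 100 sites is wider than that from 1,000 sites, which is why sparsely populated coverage bins have larger confidence intervals.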
Supplementary Figure 3
Effect of filtering variants by frequency in public databases and/or other WGS500 samples.
Density plots of the distribution of the number of novel heterozygous (top) or rare homozygous (bottom) coding variants (ANNOVAR
annotation) across all individuals, where frequency is defined in the control data sets indicated. The individuals in the top 5th percentile
are shown; all of these samples are known to have African or South Asian ancestry except for MR_6 and MR_8, which we suspect
have some sample contamination (Supplementary Fig. 1). ESP, NHLBI Exome Sequencing Project.
Supplementary Figure 4
The burden of variants of unknown significance in candidate genes for craniosynostosis.
Histograms of the number of potentially pathogenic, conserved coding variants in different candidate gene sets for craniosynostosis
(CRS). The candidate genes were chosen by a combination of literature and high-throughput database searches, augmented by expert
curation (Online Methods). Sample names in green text indicate that the variant is not likely to be pathogenic, as it does not fit a
plausible inheritance model or is less functionally compelling than another candidate (Supplementary Table 6). SC, Saethre-Chotzen
syndrome.
Supplementary Figure 5
The burden of putative regulatory variants.
Distributions of the number of novel heterozygous (top) and rare homozygous (bottom) variants that alter conserved positions in
regulatory regions within 5 kb (red) or 50 kb (black) of a gene. The number of variants does not change substantially when only
regulatory regions within 5 kb of genes are considered (red line), reflecting the fact that these regions tend to be close to genes. Note
that most of the outliers are also outliers in Supplementary Figure 3, and these samples tend to be of African or Asian ancestry.
Supplementary Figure 6
The burden of putative regulatory variants around candidate genes for early-onset epilepsy or craniosynostosis.
As shown for Supplementary Figure 4 but for variants at conserved positions in regulatory regions within 50 kb of candidate genes for
early-onset epilepsy (top) or craniosynostosis (bottom).
Supplementary Figure 7
Segregation of putative causal variants in UMOD and CASR.
Top, the NM_001008389:c.410G>A UMOD variant was identified by WGS in individual III.2 in this family with familial juvenile
hyperuricaemic nephropathy (FJHN). The G>A transition generates an AccI restriction endonuclease recognition site, and digestion of a
349-bp PCR product with AccI was used to confirm cosegregation of the variant with affected individuals in the family. Digestion of the
mutant (mut) allele generated 93-bp and 256-bp fragments, with the wild-type (WT) allele remaining uncut. Bottom, the
NM_000388:c.2299G>C CASR variant was identified by WGS in individuals I.2 and II.1 in a family with familial hypoparathyroidism
(FH). The G>C transversion causes loss of a BssSI restriction endonuclease recognition site, and digestion of a 367-bp PCR product
with BssSI was used to confirm cosegregation of the variant with affected individuals in the family. Digestion of the wild-type (WT) allele
generated 180-bp and 187-bp fragments, with the mutant (mut) allele remaining uncut.
Supplementary Figure 8
Parental origin and sequence conservation of the HUWE1 mutation.
Left, alignment of sequencing reads from the proband (CRS_4659), mother (CRS_4654) and father (CRS_4655) over 2 C/A
polymorphisms (arrows; C shown in red and A shown in black). Allele-specific primers (AARev and CCRev) were designed with a
common primer Intron6-For to amplify the HUWE1 mutation and polymorphisms in a single PCR product (top right). Red arrows
indicate polymorphic sites, and nucleotides included in the primer sequences are underlined. The results of PCR are shown
underneath. The products were digested with HpaII, which showed the presence of the mutation (white arrow) only in the CC-Rev
amplification product from the proband (second panel from the bottom, right), indicating paternal origin. Bottom right, an alignment of
the DUF908 domain in the protein encoded by HUWE1, with the mutated residue indicated (red arrow).
Supplementary Figure 9
Identification of an inherited interstitial insertion involving chromosomes 2p25.3 and Xq27.1 associated with X-linked
recessive hypoparathyroidism.
The sequences of the proximal (top left) and distal (top right) insertion junctions are shown. Reference sequences on Xq27.1 and
2p25.3 are indicated in red and blue, respectively. A 3-bp microinsertion at the distal insertion boundary is indicated in yellow. Bottom,
primers specific for chromosomes 2 (2SPF) and X (XSPF and XSPR) were designed for the DNA sequence at the distal boundary and
used to further characterize the insertion. The sizes of the PCR products obtained with each primer pair are indicated. Chromosome X
is shown in black, and the inserted sequence from chromosome 2p25.3 is shown in gray.
Supplementary Figure 10
Coverage in this study compared to a large-scale exome sequencing project.
Coverage comparison of this study and a large-scale exome sequencing (WES) project for the variants given in Table 1 (top) and the
causative variants identified in the WES project (bottom). For the WES project, for nondisclosure reasons, only the gene name is given.
The WES coverage data (blue) were compiled from 141 whole-exome data sets that were sequenced using the Roche NimbleGen
SeqCap EZ v.2.0 kit. Labels for variants located in regions targeted by this kit are in blue, those within 20 bp of the targeted regions are
in green and those outside the targeted regions are in red. The WGS500 data (red) were compiled from all the whole-genome data sets
used in this study. The horizontal green lines denote two exemplary coverage thresholds used in variant detection. To improve
readability, the plots were truncated above a coverage value of 100 (top) or 200 (bottom) and the box-plot whiskers were extended to
the data extremes.
Supplementary Figure 11
Distribution of the lengths of the largest regions of homozygosity across all samples.
Thirty-seven samples had at least one region of homozygosity >4 Mb in length (black bars), suggesting consanguinity. Note that the
largest bin includes one sample with confirmed uniparental isodisomy. See the Online Methods for an explanation of how regions of
homozygosity were identified.
Supplementary Note
Case studies
1. HUWE1 in craniosynostosis
1.1. Introduction
Background on the disease
Craniosynostosis, the premature fusion of the cranial sutures, is a serious disorder with a prevalence
of ~1 in 2,200 children. There are over 30 known disease genes, with dominantly acting mutations in
the FGFR2, FGFR3, TWIST1 and EFNB1 genes accounting for most of the 20-25% of cases with a single
genetic aetiology1.
Case study
The female proband, CRS_4659, was noted to have microcephaly in utero, and craniosynostosis was
suspected based on magnetic resonance imaging performed at 30 weeks’ gestation. She was born at
term by planned Caesarean section, and did not require any resuscitation. On formal craniofacial
assessment at the age of 7 weeks, a very tall skull with a marked transverse occipital constriction and
multiple palpable soft spots were noted. She was dysmorphic with exorbitism, slightly upslanting
palpebral fissures and arched eyebrows. She had a high arched palate and thin upper lip; the ears,
hands and feet were normal. Three-dimensional computed tomographic analysis of the skull showed
multiple widespread craniolacunae and synostosis of all sutures (Figure 3A). She underwent an
occipital craniectomy and foramen magnum decompression at 7 months of age and a fronto-orbital
advancement at 4 years of age. Formal developmental assessment prior to the second procedure
indicated that she was performing in the low average range of ability, with decreased attention and
concentration and marked distractibility. Speech and language assessment using Clinical Evaluation
of Language Fundamentals (CELF) Preschool gave scores in the 3-6 range (average 10).
Known genes/pathways and prior screens
The clinical picture was not reminiscent of craniosynostosis disorders caused by known disease
mutations. Analysis of blood for disorders of bone biochemistry (Ca, P, Mg, alkaline phosphatase), a
craniosynostosis disease gene screen (for mutation hotspots/deletions in FGFR1, FGFR2, FGFR3,
TWIST1 and EFNB1), karyotyping and array comparative genomic hybridisation screen (Agilent 250k)
were all normal.
Experimental design and strategy for identifying candidates
We sequenced the proband and her unaffected parents. They were nonconsanguineous, and there
was no family history of craniosynostosis. We suspected a de novo, dominant mutation (because the
majority of monogenic craniosynostosis exhibits dominant genetics), but also considered recessive
mechanisms. We searched for de novo variants in the proband, absent in the parents, and prioritised
exonic mutations predicted to cause alterations in the encoded protein.
1.2. Methods for follow-up studies
Parental origin of the mutation
We designed allele-specific primers to determine whether the child’s HUWE1 mutation was present
on the allele of maternal or paternal origin. We exploited two closely adjacent intronic C/A
polymorphisms at positions chrX:53,675,478 and chrX:53,675,488, ~1150 bp upstream of the HUWE1
mutation, that we had identified in the whole genome sequence. The proband and her mother
(CRS_4654) were heterozygous (CC/AA), whilst the father (CRS_4655) was a CC hemizygote
(Supplementary Figure 8 left). Allele-specific primers CCRev
(5'-CCAAGGTGGGTTTTTGTTTTGTTTTTTGTTTTGTTTTGTTTTG-3') and AARev
(5'-CCAAGGTGGGTTTTTGTTTTGTTTTTTGTTTTTTTTTGTTTTT-3') were designed to amplify the CC and AA alleles respectively (bases differing
between the two primers italicised). PCR was carried out using either CCRev or AARev and a common
forward primer Intron6-For (5'-CCCATCAACCCTATGAAGGATAGTATCTATATCC-3'), so that the product
spanned exons 5 and 6 (containing the HUWE1 mutation). The AARev primer did not generate a
product using the paternal sample, confirming that amplification was allele-specific (Supplementary
Figure 8, top gel pictures). The 1392 bp amplification product was digested with HpaII, for which a
restriction site is ablated by the HUWE1 mutation. The 645 bp fragment characteristic of the mutant
allele was observed only in DNA amplified from the child using the CCRev primer (Supplementary
Figure 8, bottom gel picture, white arrow), indicating that the mutation resided on the paternal allele
(Figure 3B).
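Because the italics marking the discriminating bases are lost in plain text, they can be recovered by comparing the two primer sequences position by position. The sketch below assumes the breaks inside the printed sequences are only line wrapping, so each primer is written as the concatenation of its two printed halves; the function name is illustrative.

```python
# Primer sequences as printed above, each joined across its line break.
CCREV = "CCAAGGTGGGTTTTT" "GTTTTGTTTTTTGTTTTGTTTTGTTTTG"
AAREV = "CCAAGGTGGGTTTTTGTTTTGTTTTTTGTTT" "TTTTTTGTTTTT"


def primer_differences(primer_a, primer_b):
    """Return (1-based position, base in a, base in b) wherever two equal-length primers differ."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have the same length for a positional comparison")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(primer_a, primer_b))
        if base_a != base_b
    ]


differences = primer_differences(CCREV, AAREV)
for position, cc_base, aa_base in differences:
    print(f"position {position}: CCRev {cc_base} / AARev {aa_base}")
```

The primers are reverse primers, so a G/T difference in the primer corresponds to a C/A difference on the forward strand. Under the joining assumption, the two differing positions lie 10 bases apart, matching the spacing of the polymorphisms at chrX:53,675,478 and chrX:53,675,488.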
X inactivation studies for HUWE1 mutation
Skewing of X inactivation was measured using the androgen receptor gene (AR) triplet repeat
assay2,3. Briefly, 1 µg of genomic DNA was predigested with RsaI either in the presence (+) or
absence (-) of the methylation-sensitive enzyme HpaII (20 U). PCR amplification was carried out using
primers AR-For (5'-TCCAGAATCTGTTCCAGAGCGTGC-3') and AR-Rev
(5'-FAM-GCTGTGAAGGTTGCTGTTCCTCAT-3'). Amplicons were analysed on an ABI 3130 sequencer and sized
using GeneScan software. Differences in peak areas for the two alleles in the HpaII(+) assay were
corrected for differences in amplification efficiency measured in the HpaII(-) assay, and the final
results expressed as a percentage of the more inactivated allele. Of note, Platypus was unable to
specify the correct AR triplet repeat genotypes in heterozygotes (not shown), highlighting the
difficulties of accurately calling simple sequence repeats using 100 bp read data.
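The correction for amplification efficiency can be sketched as below. This is one common reading of the calculation, not necessarily the exact formula the authors used: each allele's HpaII(+) peak area is divided by its HpaII(-) peak area, and the result is expressed as the percentage attributable to the more inactivated allele. The peak areas are hypothetical.

```python
def x_inactivation_percent(digested_areas, undigested_areas):
    """Percentage inactivation of the more inactivated AR allele.

    digested_areas:   (allele 1, allele 2) peak areas from the HpaII(+) assay
    undigested_areas: (allele 1, allele 2) peak areas from the HpaII(-) assay
    """
    # Normalise each allele by its undigested signal to correct for
    # allele-specific differences in amplification efficiency.
    normalised_1 = digested_areas[0] / undigested_areas[0]
    normalised_2 = digested_areas[1] / undigested_areas[1]
    fraction_1 = normalised_1 / (normalised_1 + normalised_2)
    return 100.0 * max(fraction_1, 1.0 - fraction_1)


# Hypothetical peak areas: allele 1 survives digestion far better than allele 2.
skew = x_inactivation_percent(digested_areas=(9000, 500), undigested_areas=(5000, 4000))
balanced = x_inactivation_percent(digested_areas=(3000, 3000), undigested_areas=(5000, 5000))
```

A value near 50% indicates random inactivation, whereas values approaching 100% indicate extreme skewing.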
Analysis of HUWE1 expression
RNA was extracted from EBV-transformed lymphoblastoid cells and scalp fibroblasts obtained from
patient CRS_4659. cDNA was synthesised using the Fermentas RevertAid First-Strand Synthesis kit
with random hexamer primers according to the manufacturer's instructions. PCR amplification was
carried out using primers Ex6Rev (5'-CTGCCAGCACCACTTGCATATCAGAGGAAGCC-3') and Ex5For
(5'-GTGCGAGTTATATCACTGGGTGGACCTGTTGG-3') to generate a product of 257 bp. HpaII digests the
normal but not the mutant product, generating fragments of 186 and 71 bp.
1.3. Results
Amongst 94 de novo mutations identified in the proband, a mutation at chrX:53,674,333 in the
HUWE1 gene (c.329G>A encoding p.Arg110Gln) was the only one predicted to alter protein coding.
HUWE1 encodes a ubiquitin ligase, and the mutation resides at a highly conserved position (including
across invertebrates and yeasts) of the DUF908 domain, the function of which is unknown
(http://pfam.sanger.ac.uk/family/duf908) (Supplementary Figure 8, bottom right). Five missense
mutations elsewhere in the protein were previously reported in pedigrees segregating X-linked
mental retardation or autistic spectrum disorder4-6. In two of these pedigrees, female carriers were
reported to be symptomatic and/or to have associated macrocephaly.
The mutation was found to originate from the paternal X chromosome (Supplementary Figure 8,
bottom gel picture). To seek additional support for an X-linked origin of the child’s disorder, we
studied X-inactivation, because (owing to negative selection of cells lacking a functional gene copy),
female carriers of many serious X-linked disorders exhibit preferential inactivation of the X
chromosome bearing the mutant allele7. Indeed, we found that the proband exhibited extreme
skewing of X-inactivation (Figure 3C), but to our surprise the maternal (and therefore non-mutant) X
was preferentially inactivated. Corroborating this conclusion, RNA expression studies in scalp
fibroblasts and EBV-transformed lymphoblastoid cells showed that only mutant HUWE1 is expressed
in the patient (Figure 3D). Neither the mother nor the paternal grandmother showed extreme
skewing of X-inactivation (Figure 3C), ruling out the possibility that one of the proband’s X
chromosomes was constitutionally susceptible or resistant to X-inactivation.
We considered the possibility that a different de novo mutation, occurring on the maternally
inherited X chromosome, could have led to selective inactivation of the maternal X. We therefore
scrutinised the seven other X-encoded de novo mutations that had been detected by WGS (Figure
3E). Three variants were located within genes (all noncoding): two were shown to be present on the
paternal allele whilst the third, for which parental origin could not be determined, was in a gene
(CCDC160) that is in a region not subject to X-inactivation8. Of the other four de novo variants, all
were found in SINE or LINE repeats, and appeared unlikely to be functionally significant.
In addition, we scrutinised the X chromosome for regions in which a maternal allele was apparently
not transmitted, as this could indicate a de novo deletion on the maternal allele. We identified two
such regions, one (chrX:5,055,376-5,057,467) in an intergenic region at Xp22.1, and the other
(chrX:154,778,278-154,784,971) within intron 1 of TMLHE at Xq28. These copy number changes did
not appear to be good candidates to cause the skewed X-inactivation.
We used dideoxy-sequencing to screen 280 patients with craniosynostosis for mutations of HUWE1
located within the region encoding the DUF908 domain, but found no other significant mutations. In
addition, we sequenced the entire HUWE1 gene in 47 patients with multisuture synostosis using
Fluidigm Access Array multiplexing and Ion Torrent sequencing, but again found no likely pathogenic
mutations. However, we did subsequently identify, using exome sequencing, a different de novo
hemizygous mutation altering the same amino acid of HUWE1 (c.328C>T encoding p.R110W) in a boy
presenting with metopic craniosynostosis, moderate-severe learning disability and other dysmorphic
features, making us confident that this mutation was pathogenic in both cases.
1.4. Clinical actions
There is strong evidence that the mutation is the cause of the child’s learning disability and
craniosynostosis, because (1) HUWE1 is an established disease gene in X-linked mental retardation in
males4,6, and (2) a de novo mutation at the same codon was found by exome sequencing in a male
patient who presented with craniosynostosis.
The parents requested help with obtaining special educational support for their daughter, so a letter
outlining the genetic diagnosis and its likely contribution to the learning disability was written to the
education authorities. A suitable educational plan was subsequently implemented and the parents
considered the genetic information to be instrumental in this outcome.
2. EPO in erythrocytosis
2.1. Introduction
Background on the disease
Erythrocytosis is a clinical condition characterized by increased red cell mass and typically elevated
haematocrit and haemoglobin (Hb) concentration9. It can be congenital (e.g. genetic) or acquired. In
primary erythrocytosis, patients have an intrinsic defect in the erythroid cells of the bone marrow
and typically have low levels of erythropoietin (Epo), the protein that promotes the survival,
proliferation and differentiation of erythrocyte progenitor cells. In secondary erythrocytosis, the
increased red cell production is driven by external factors (e.g. hypoxia or defects in oxygen sensing)
through increased erythropoietin production, and patients typically have high or inappropriately
normal Epo levels9. Epo production is controlled at the transcriptional level in an oxygen-regulated
manner. This control is mediated by hypoxia-inducible transcription factors (HIF), and mutations in
genes in the HIF pathway are known to cause erythrocytosis10.
Even after screening for all known mutations, there remains a considerable number of patients in
whom no genetic cause has been found11. Of these patients with idiopathic erythrocytosis, about
two thirds have inappropriately normal or elevated levels of erythropoietin (given their level of Hb),
suggesting a high likelihood of a defect in their oxygen-sensing pathway. Furthermore, most of these
patients have early-onset (childhood) disease and often have a family history (sometimes with clear
patterns of Mendelian inheritance), suggesting a high probability for an underlying genetic aetiology.
Case study
Two unrelated families, Family M and Family S, were identified showing an autosomal dominant
pattern of erythrocytosis inheritance (Figure 4B). The patients with erythrocytosis had raised Hb
concentrations (>16.5 g/dl in females and >18.5 g/dl in males) and haematocrits (>0.5 l/l), had high
Epo levels and were diagnosed at a young age, some as young as 2 years of age (e.g. PAR07 and
PAR08).
Known genes / pathways and prior screens
Our investigations focused on patients with idiopathic erythrocytosis with either high or
inappropriately normal Epo levels, suggesting a defect in oxygen sensing rather than primary
erythrocytosis. Based on knowledge of the biological pathway for hypoxia induced erythrocytosis, a
list of a priori candidate genes included: HIF1A, EPAS1 (HIF2A), HIF3A, ARNT (HIF1B), HIF1AN (FIH),
EGLN1 (PHD2), EGLN2 (PHD1), EGLN3 (PHD3), VHL, EPO, HBB, HBA1, HBA2, BPGM. We also
considered JAK2 and EPOR as candidates, since these are involved in congenital erythrocytosis.
Known exonic mutations in VHL, PHD2, HIF2A, EPOR and JAK2, were screened for prior to genome
sequencing, and found to be absent.
Experimental design and strategy for identifying candidates
In family S, DNA samples were available from six individuals both from affected and unaffected
family members (Figure 4B). WGS was performed on two affected individuals (PAR09 and PAR07),
and on one unaffected (PAR18), forming a trio. DNA from one affected (PAR08) and 2 unaffected
members (PAR19 and PAR22) were used in follow-up genotyping and segregation analysis. We did
not have any information on, nor could we collect DNA from, the father of PAR09.
In family M, DNA samples were available for four individuals. WGS was performed on two affected
individuals (PAR15 and PAR16), and DNA from one affected (PAR17) and one unaffected (PAR20)
members were used in follow-up genotyping and segregation analysis.
In both families, the disease appeared to follow a dominant inheritance pattern and so we focused
on heterozygous variants shared between affecteds, but not unaffecteds, in each family individually.
As all individuals have large numbers of these, we prioritized known candidate genes. We originally
looked at coding candidates, but later extended this to noncoding exonic sequence.
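The filtering strategy described above can be sketched with set operations. The sample labels, variant tuples and gene names below are hypothetical placeholders, and the representation of a variant as (chromosome, position, ref, alt, gene) is an assumption made for illustration.

```python
def segregating_variants(het_calls, affected, unaffected):
    """Heterozygous variants carried by every affected and by no unaffected individual."""
    shared = set.intersection(*(het_calls[sample] for sample in affected))
    for sample in unaffected:
        shared -= het_calls[sample]
    return shared


def restrict_to_candidates(variants, candidate_genes):
    """Keep only variants annotated to one of the a priori candidate genes."""
    return {variant for variant in variants if variant[4] in candidate_genes}


# Hypothetical heterozygous calls: (chromosome, position, ref, alt, gene).
v1 = ("chr7", 1000, "G", "A", "GENE_A")
v2 = ("chr2", 500, "C", "T", "GENE_B")
v3 = ("chr5", 750, "A", "G", "GENE_C")
v4 = ("chr9", 300, "T", "C", "GENE_D")
het_calls = {
    "affected_1": {v1, v2, v3},
    "affected_2": {v1, v2},
    "unaffected_1": {v2, v4},
}

shared = segregating_variants(het_calls, ["affected_1", "affected_2"], ["unaffected_1"])
prioritised = restrict_to_candidates(shared, candidate_genes={"GENE_A", "GENE_D"})
```

Applying the family filter first and the candidate-gene filter second keeps the full segregating set available, which matters when the search is later widened from coding to noncoding exonic sequence.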
2.2. Results
None of our candidate genes contained coding mutations that segregated with disease. However,
we identified a single nucleotide variant (G>A) in the 5’UTR of EPO at chr7:100,318,468, shared by
the affected individuals in the 2 families (PAR07 and PAR09 in family S, PAR15 and PAR16 in family
M). The variant was not found in PAR18 nor in other WGS500 samples. Sanger sequencing confirmed
the presence of the 5’UTR variant in the affected members of Family M (PAR17, PAR15, PAR16) and
Family S (PAR09, PAR07, PAR08) and its absence in unaffected members of either family (Family S:
PAR18, PAR19, PAR22; Family M: PAR20).
The EPO variant is not listed in dbSNP, 1000 Genomes, or any other samples sequenced within the
WGS500 project. In order to determine whether the variant had arisen independently in the two
families, we analysed the surrounding SNVs in the family members for which genomic sequence was
available. This revealed 47 rare variants (1000 Genomes frequency < 1%) that are found, uniquely
(within the WGS500 project) in the four affected individuals, in an approximately 8 Mb region
(chr7:93,533,251-100,993,241). This provides compelling evidence that the region is identical-by-descent
in both families and that the 5’UTR variant had one common origin. The 5’UTR variant is the
only exonic variant and EPO the only candidate gene in this 8 Mb region. No other genomic region in
the two families shows a similar pattern of sharing of rare variants. This suggests that the haplotype
carrying the variant is multiple generations old, and thus likely to be found in others.
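The search for a shared haplotype can be sketched as follows: select variants that are rare in a reference panel and carried by exactly the affected individuals, then report the interval they span. The data structures, sample labels and positions are hypothetical; only the 1% frequency cut-off and the reported interval coordinates come from the text.

```python
def shared_rare_variants(carriers_by_variant, frequency_by_variant, affected, max_frequency=0.01):
    """Variants below max_frequency whose carriers are exactly the affected individuals."""
    affected = set(affected)
    return sorted(
        variant
        for variant, carriers in carriers_by_variant.items()
        if set(carriers) == affected
        and frequency_by_variant.get(variant, 0.0) < max_frequency
    )


def spanned_interval(variants):
    """(start, end) of the positions covered by a list of (chromosome, position) variants."""
    positions = [position for _, position in variants]
    return min(positions), max(positions)


affected = ["a1", "a2", "a3", "a4"]
carriers_by_variant = {
    ("chr7", 100): {"a1", "a2", "a3", "a4"},
    ("chr7", 180): {"a1", "a2", "a3", "a4", "c1"},  # also seen in another sample
    ("chr7", 250): {"a1", "a2", "a3", "a4"},
    ("chr7", 300): {"a1", "a2"},                      # not shared by all affecteds
    ("chr7", 400): {"a1", "a2", "a3", "a4"},          # shared but common
}
frequency_by_variant = {
    ("chr7", 100): 0.001,
    ("chr7", 180): 0.002,
    ("chr7", 250): 0.0,
    ("chr7", 300): 0.0,
    ("chr7", 400): 0.2,
}

shared = shared_rare_variants(carriers_by_variant, frequency_by_variant, affected)
interval = spanned_interval(shared)

# Length of the interval reported in the text (inclusive coordinates).
reported_length_bp = 100_993_241 - 93_533_251 + 1
```

The interval reported in the text spans 7,459,991 bp, which is the "approximately 8 Mb" region referred to above.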
This finding will be followed up by screening larger cohorts of patients with idiopathic erythrocytosis
for this 5’UTR variant and by functional molecular studies to investigate the mechanism of action of
this mutation. The putative functional variant lies in a small conserved block within the EPO 5’UTR
(Figure 4A), and so it seems likely to affect expression.
2.3. Clinical actions
WGS has played an important diagnostic role in clarifying the aetiology of the disease in these two
families with erythrocytosis. These findings will be useful in screening and family planning and
counselling.
3. SOX3 in hypoparathyroidism
3.1. Introduction
Background on the disease
Hypoparathyroidism is an endocrine disorder in which deficiency of parathyroid hormone (PTH)
results in hypocalcaemia that may be associated with tetany, carpopedal spasms, seizures, laryngeal
stridor, cataracts or ectopic calcification. Treatment with oral vitamin D preparations and calcium
supplements is effective at restoring normocalcaemia and ameliorating the neuro-muscular
symptoms. Hypoparathyroidism may be congenital due to parathyroid gland agenesis (e.g. the
DiGeorge syndrome) or acquired and due to destruction of the parathyroid glands (e.g. in
autoimmune diseases). In addition, hypoparathyroidism may occur as part of a complex congenital
syndrome (e.g. the DiGeorge syndrome or a pluriglandular autoimmune disorder) or as a non-syndromic
solitary endocrinopathy, which is referred to as isolated or idiopathic
hypoparathyroidism. Familial occurrences of idiopathic hypoparathyroidism have been reported and
autosomal dominant, autosomal recessive, and X-linked inheritances have been established. Genetic
abnormalities of TBX1, AIRE1, GATA3, TBCE, GCMB, PTH, CASR, GNA11, SOX3 and the mitochondrial
genome have been reported in patients with these syndromic and non-syndromic forms of
hypoparathyroidism. The incidence of hypoparathyroidism has not been established, but DiGeorge
syndrome is reported to occur in 1/3,000 live births and an autoimmune form of hypoparathyroidism
has been reported to occur more often in the Finnish and Iranian-Jewish populations.
Case study
The proband, HPT_3, is the son of nonconsanguineous parents. He presented at age 9 months with
seizures due to hypocalcemia (corrected serum calcium concentrations = 3.0 to 4.0 mg/dl; normal =
9.0 to 10.1 mg/dl) attributed to primary hypoparathyroidism. His grandfather, HPT_1, was also
known to have had hypocalcemic seizures during childhood. Neither the proband nor his grandfather
suffered from immunodeficiency, cardiac anomalies, craniofacial defects, developmental delay,
deafness, or renal dysplasia. Thus, the findings seemed consistent with isolated hypoparathyroidism
(HPT) that was likely inherited as an X-linked recessive trait. Treatment with vitamin D preparations
and oral calcium supplements restored normocalcaemia.
Known genes/pathways and prior screens
Genetic abnormalities involving the coding regions of GCMB, PTH, CASR, SOX3 and AIRE1 had
been previously excluded. A previous report of familial hypoparathyroidism implicated a deletion-insertion
near SOX3 on chromosome X via linkage analysis12. The report identified a region from
2p25.3 inserted at Xq27.1 with simultaneous deletion of a region from chromosome X. It also
demonstrated Sox3 expression in the developing parathyroid tissue of mouse embryos.
Experimental design and strategy for identifying candidates
The genomes of the proband, HPT_3, and his affected grandfather, HPT_1, were sequenced. DNA
from the proband’s unaffected mother, HPT_2, and brother, HPT_4, was used in follow-up
genotyping. The pattern of inheritance clearly suggested that the pathogenic variant was X-linked,
and SOX3, previously linked to hypoparathyroidism, was the main candidate gene. Accordingly, we
searched for mutations and larger structural variants in and around SOX3.
3.2. Results
Visual inspection of reads mapping downstream of SOX3 showed an apparent deletion of 1.4 kb of X
chromosomal sequence, located approximately 81.5 kb downstream of the gene. The pairs of reads
flanking this deletion consistently mapped to either end of an approximately 50 kb sequence on
chromosome 2p (Figure 4C), suggesting a simultaneous deletion of an X chromosomal region with an
insertion into the X chromosome of the region from 2p.
We used PCR to confirm the chromosomal rearrangement in the affected individuals (Figure 4D;
Supplementary Figure 9) and to show that the mother is a carrier and that the unaffected brother
does not have the rearrangement.
The breakpoint on chromosome X is coincident with a palindromic sequence at the 5’ end
(chrX:139,502,865-139,503,044), indicated in blue and red below, with the asterisk marking the
breakpoint (Supplementary Figure 9):
GGGTTCAGCTTCCCTCTAAGCCCCTAACATGTTTGTTCTAGTTTATTTCTGGTGACTTCAGTGCTTTTAAAAAGC
AATATAT*AAGCTATATCTAGCTTATATATTGCTTTTTAAAAGCACTGAAGTCACCAGAAATAAACTAGAACAA
ACATGTTAGGGGCTTAGAGGGAAGCTGAACCC
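The palindromic structure of this sequence can be checked by comparing it with its own reverse complement. The sketch below copies the three printed lines, removes the asterisk, and measures how far the outer arms remain reverse-complementary; the helper names are illustrative.

```python
# Sequence as printed above; "*" marks the breakpoint.
PRINTED = (
    "GGGTTCAGCTTCCCTCTAAGCCCCTAACATGTTTGTTCTAGTTTATTTCTGGTGACTTCAGTGCTTTTAAAAAGC"
    "AATATAT*AAGCTATATCTAGCTTATATATTGCTTTTTAAAAGCACTGAAGTCACCAGAAATAAACTAGAACAA"
    "ACATGTTAGGGGCTTAGAGGGAAGCTGAACCC"
)
BREAKPOINT_OFFSET = PRINTED.index("*")  # number of bases before the breakpoint
SEQUENCE = PRINTED.replace("*", "")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of an uppercase DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def palindrome_arm_length(sequence):
    """Number of bases from each end that pair as an inverted repeat."""
    mirrored = reverse_complement(sequence)
    arm = 0
    while arm < len(sequence) // 2 and sequence[arm] == mirrored[arm]:
        arm += 1
    return arm


arm_length = palindrome_arm_length(SEQUENCE)
print(len(SEQUENCE), BREAKPOINT_OFFSET, arm_length)
```

The sequence length agrees with the stated coordinates (139,503,044 − 139,502,865 + 1 = 180 bp), and comparing `BREAKPOINT_OFFSET` with `arm_length` shows where the breakpoint sits relative to the inverted repeat.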
The deletion does not include any of the conserved non-coding elements defined in UCSC, but it
does occur approximately 500 bp upstream of a vertebrate conserved element (chrX:139,504,830-139,505,088, lod=422).
The rearrangement breakpoints are distinct from the previously reported kindred12, suggesting
independent events, but they are in broadly similar regions, so the pathogenesis is likely to be
similar.
3.3. Clinical actions
WGS has helped to elucidate the aetiology of the disease in this family with hypoparathyroidism.
This finding will be useful in counselling and screening of relatives.
Supplementary references
1. Wilkie, A.O. et al. Prevalence and complications of single-gene and chromosomal disorders in
craniosynostosis. Pediatrics 126, e391-400 (2010).
2. Allen, R.C., Zoghbi, H.Y., Moseley, A.B., Rosenblatt, H.M. & Belmont, J.W. Methylation of
HpaII and HhaI sites near the polymorphic CAG repeat in the human androgen-receptor gene
correlates with X chromosome inactivation. Am J Hum Genet 51, 1229-39 (1992).
3. Tilley, W.D., Marcelli, M., Wilson, J.D. & McPhaul, M.J. Characterization and expression of a
cDNA encoding the human androgen receptor. Proc Natl Acad Sci U S A 86, 327-31 (1989).
Nature Genetics: doi:10.1038/ng.3304
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
Froyen, G. et al. Submicroscopic duplications of the hydroxysteroid dehydrogenase
HSD17B10 and the E3 ubiquitin ligase HUWE1 are associated with mental retardation. Am J
Hum Genet 82, 432-43 (2008).
Nava, C. et al. Analysis of the chromosome X exome in patients with autism spectrum
disorders identified novel candidate genes, including TMLHE. Transl Psychiatry 2, e179
(2012).
Isrie, M. et al. HUWE1 mutation explains phenotypic severity in a case of familial idiopathic
intellectual disability. Eur J Med Genet 56, 379-82 (2013).
Plenge, R.M., Stevenson, R.A., Lubs, H.A., Schwartz, C.E. & Willard, H.F. Skewed Xchromosome inactivation is a common feature of X-linked mental retardation disorders. Am
J Hum Genet 71, 168-73 (2002).
Carrel, L. & Willard, H.F. X-inactivation profile reveals extensive variability in X-linked gene
expression in females. Nature 434, 400-4 (2005).
McMullin, M.F. The classification and diagnosis of erythrocytosis. Int J Lab Hematol 30, 44759 (2008).
Franke, K., Gassmann, M. & Wielockx, B. Erythrocytosis: the HIF pathway in control. Blood
122, 1122-8 (2013).
McMullin, M.F. Idiopathic erythrocytosis: a disappearing entity. Hematology Am Soc Hematol
Educ Program, 629-35 (2009).
Bowl, M.R. et al. An interstitial deletion-insertion involving chromosomes 2p25.3 and Xq27.1,
near SOX3, causes X-linked recessive hypoparathyroidism. J. Clin. Invest. 115, 2822-31
(2005).
Qi, X.P. et al. RET germline mutations identified by exome sequencing in a Chinese multiple
endocrine neoplasia type 2A/familial medullary thyroid carcinoma family. PLoS One 6,
e20353 (2011).
Slimani, A. et al. Effect of mutations in LDLR and PCSK9 genes on phenotypic variability in
Tunisian familial hypercholesterolemia patients. Atherosclerosis 222, 158-66 (2012).
Wiestner, A., Schlemper, R.J., van der Maas, A.P. & Skoda, R.C. An activating splice donor
mutation in the thrombopoietin gene causes hereditary thrombocythaemia. Nat Genet 18,
49-52 (1998).
Bolze, A. et al. Ribosomal protein SA haploinsufficiency in humans with isolated congenital
asplenia. Science 340, 976-8 (2013).
Elsayed, S.M. et al. Autosomal dominant SCA5 and autosomal recessive infantile SCA are
allelic conditions resulting from SPTBN2 mutations. Eur J Hum Genet 22, 286-8 (2014).
Wang, Y. et al. A Japanese SCA5 family with a novel three-nucleotide in-frame deletion
mutation in the SPTBN2 gene: a clinical and genetic study. J Hum Genet 59, 569-73 (2014).
Lise, S. et al. Recessive mutations in SPTBN2 implicate beta-III spectrin in both cognitive and
motor development. PLoS Genet 8, e1003074 (2012).
Babbs, C. et al. Homozygous mutations in a predicted endonuclease are a novel cause of
congenital dyserythropoietic anemia type I. Haematologica 98, 1383-7 (2013).
Cossins, J. et al. Congenital myasthenic syndromes due to mutations in ALG2 and ALG14.
Brain 136, 944-56 (2013).
Petousi, N. et al. Erythrocytosis associated with a novel missense mutation in the BPGM
gene. Haematologica 99, e201-4 (2014).
Zajac, J.D. & Danks, J.A. The development of the parathyroid gland: from fish to human.
Current Opinion in Nephrology and Hypertension 17, 353-6 (2008).
Uckun-Kitapci, A., Underwood, L.E., Zhang, J. & Moats-Staats, B. A novel mutation (E767K) in
the second extracellular loop of the calcium sensing receptor in a family with autosomal
dominant hypocalcemia. Am J Med Genet A 132A, 125-9 (2005).
Liu, M. et al. Novel UMOD mutations in familial juvenile hyperuricemic nephropathy lead to
abnormal uromodulin intracellular trafficking. Gene 531, 363-9 (2013).
Nature Genetics: doi:10.1038/ng.3304
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.
41.
42.
43.
44.
45.
46.
47.
Smith, G.D. et al. Characterization of a recurrent in-frame UMOD indel mutation causing
late-onset autosomal dominant end-stage renal failure. Clin J Am Soc Nephrol 6, 2766-74
(2011).
Lens, X.M., Banet, J.F., Outeda, P. & Barrio-Lucia, V. A novel pattern of mutation in
uromodulin disorders: autosomal dominant medullary cystic kidney disease type 2, familial
juvenile hyperuricemic nephropathy, and autosomal dominant glomerulocystic kidney
disease. Am J Kidney Dis 46, 52-7 (2005).
McNally, E.M., Golbus, J.R. & Puckelwartz, M.J. Genetic mutations and mechanisms in dilated
cardiomyopathy. J Clin Invest 123, 19-26 (2013).
Kirby, A. et al. Mutations causing medullary cystic kidney disease type 1 lie in a large VNTR in
MUC1 missed by massively parallel sequencing. Nat Genet 45, 299-303 (2013).
Bokil, N.J., Baisden, J.M., Radford, D.J. & Summers, K.M. Molecular genetics of long QT
syndrome. Mol Genet Metab 101, 1-8 (2010).
Wu, Y. et al. Mutations in ionotropic AMPA receptor 3 alter channel properties and are
associated with moderate cognitive impairment in humans. Proc Natl Acad Sci U S A 104,
18163-8 (2007).
Martin, H.C. et al. Clinical whole-genome sequencing in severe early-onset epilepsy reveals
new genes and improves molecular diagnosis. Human Molecular Genetics (2014).
Weckhuysen, S. et al. KCNQ2 encephalopathy: emerging phenotype of a neonatal epileptic
encephalopathy. Ann Neurol 71, 15-25 (2012).
Nakamura, K. et al. Clinical spectrum of SCN2A mutations expanding to Ohtahara syndrome.
Neurology 81, 992-8 (2013).
Palles, C. et al. Germline mutations affecting the proofreading domains of POLE and POLD1
predispose to colorectal adenomas and carcinomas. Nature Genetics 45, 136-44 (2013).
Zhou, X.P. et al. Germline mutations in BMPR1A/ALK3 cause a subset of cases of juvenile
polyposis syndrome and of Cowden and Bannayan-Riley-Ruvalcaba syndromes. Am J Hum
Genet 69, 704-11 (2001).
Lipton, L. et al. Germline mutations in the TGF-beta and Wnt signalling pathways are a rare
cause of the "multiple" adenoma phenotype. J Med Genet 40, e35 (2003).
Schwarzova, L. et al. Novel mutations of the APC gene and genetic consequences of splicing
mutations in the Czech FAP families. Fam Cancer 12, 35-42 (2013).
Sharma, V.P. et al. Mutations in TCF12, encoding a basic helix-loop-helix partner of TWIST1,
are a frequent cause of coronal craniosynostosis. Nat Genet 45, 304-7 (2013).
van der Zwaag, P.A. et al. A genetic variants database for arrhythmogenic right ventricular
dysplasia/cardiomyopathy. Hum Mutat 30, 1278-83 (2009).
Posch, M.G. et al. A missense variant in desmoglein-2 predisposes to dilated
cardiomyopathy. Mol Genet Metab 95, 74-80 (2008).
Caputo, S. et al. Description and analysis of genetic variants in French hereditary breast and
ovarian cancer families recorded in the UMD-BRCA1/BRCA2 databases. Nucleic Acids Res 40,
D992-1002 (2012).
Panizza, E. et al. Yeast model for evaluating the pathogenic significance of SDHB, SDHC and
SDHD mutations in PHEO-PGL syndrome. Hum Mol Genet 22, 804-15 (2013).
Fokkema, I.F., den Dunnen, J.T. & Taschner, P.E. LOVD: easy creation of a locus-specific
sequence variation database using an "LSDB-in-a-box" approach. Hum Mutat 26, 63-8 (2005).
Loeys, B.L. et al. Aneurysm syndromes caused by mutations in the TGF-beta receptor. N Engl
J Med 355, 788-98 (2006).
Levano, S. et al. Increasing the number of diagnostic mutations in malignant hyperthermia.
Hum Mutat 30, 590-8 (2009).
Castellone, M.D. et al. A novel de novo germ-line V292M mutation in the extracellular region
of RET in a patient with phaeochromocytoma and medullary thyroid carcinoma: functional
characterization. Clin Endocrinol (Oxf) 73, 529-34 (2010).
Nature Genetics: doi:10.1038/ng.3304
48.
49.
50.
Ngo, D.N. et al. Screening of the RET gene of Vietnamese Hirschsprung patients identifies 2
novel missense mutations. J Pediatr Surg 47, 1859-64 (2012).
Pickup, M.J. & Pollanen, M.S. Traumatic subarachnoid hemorrhage and the COL3A1 gene:
emergence of a potential causal link. Forensic Sci Med Pathol 7, 192-7 (2011).
Dandanell, M., Friis-Hansen, L., Sunde, L., Nielsen, F.C. & Hansen, T.V. Identification of 3
novel VHL germ-line mutations in Danish VHL patients. BMC Med Genet 13, 54 (2012).
Nature Genetics: doi:10.1038/ng.3304
Supplementary Tables
Supplementary Table 1: Concordance between genotypes from WGS and cytoSNP12v1 array data.
Sample         Number of SNPs   % concordant genotypes   Mean coverage (concordant/discordant)
CAT_919B6      277,538          99.93                    23.16/15.85
LVNC_1.1.70    279,002          99.95                    47.91/39.06
LVNC_1.2.83    277,784          99.94                    46.22/39.99
RTA_2.1.47     277,612          99.64                    21.82/14.39
RTA_2.2.73     276,711          99.91                    22.90/16.19
DCM_3.2.55     279,271          99.93                    24.66/18.13
DCM_3.2.98     278,471          99.92                    25.27/20.78
CCM_4.2.69     277,472          99.93                    23.85/16.89
CCM_4.2.92     277,571          99.92                    23.77/16.06
CQT_8.1.61     279,619          99.93                    26.95/18.78
EOE_5          278,605          99.96                    35.14/26.96
ERY_PAR09      278,016          99.95                    25.87/17.82
Supplementary Table 2: Error profiles between genotypes from WGS and from array data. The results show the proportion of the total SNP comparisons (across all 12
samples) that fall into each error category.
                              WGS genotype
cytoSNP12v1 array genotype    HomRef     Het        HomAlt
HomRef                        50.618%    0.018%     0.002%
Het                           0.019%     29.621%    0.011%
HomAlt                        0.004%     0.012%     19.696%
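As a consistency check on Supplementary Table 2, the overall concordance is the sum of the diagonal cells. The short Python sketch below is an illustration rather than part of the original analysis; it assumes that rows are array genotypes and columns are WGS genotypes, although the diagonal and off-diagonal sums do not depend on that orientation.

```python
# Percentages from Supplementary Table 2, keyed by (array genotype, WGS genotype).
# The row/column orientation is an assumption; the sums below are unaffected by it.
PERCENT = {
    ("HomRef", "HomRef"): 50.618, ("HomRef", "Het"): 0.018, ("HomRef", "HomAlt"): 0.002,
    ("Het", "HomRef"): 0.019, ("Het", "Het"): 29.621, ("Het", "HomAlt"): 0.011,
    ("HomAlt", "HomRef"): 0.004, ("HomAlt", "Het"): 0.012, ("HomAlt", "HomAlt"): 19.696,
}

# Concordant comparisons lie on the diagonal; everything else is an error category.
concordant = sum(p for (array_gt, wgs_gt), p in PERCENT.items() if array_gt == wgs_gt)
discordant = sum(p for (array_gt, wgs_gt), p in PERCENT.items() if array_gt != wgs_gt)

print(f"concordant: {concordant:.3f}%")
print(f"discordant: {discordant:.3f}%")
```

The two sums need not add to exactly 100% because each cell is rounded to three decimal places.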
Supplementary Table 3: The number of putative de novo protein-altering variants (ANNOVAR annotation) in children from trios or quartets under different filtering
strategies. When segregation is ignored, this is simply the number of novel heterozygous coding variants as defined by absence from the control datasets indicated.
Parental genotypes (either raw calls or jointly called genotypes) were checked to determine whether the variants were likely to be de novo. EOE: Early-onset epilepsy,
CRS: craniosynostosis, SC: Saethre-Chotzen syndrome, ERY: erythrocytosis, HCM: hypertrophic cardiomyopathy. Note that all families were treated as trios with one
affected child and two unaffected parents for this analysis, even though, for the ERY trio, both the mother and daughter were affected, and, for the MR and EOE quartets,
both children were affected. +Note that the mean excludes HCM_2361, MR_6 and MR_8 because the abnormally large number of novel heterozygous variants in these
samples is probably due to contamination.
Filtered against     Segregation             CRS    CRS    CRS    CRS    EOE    EOE    EOE    EOE    ERY      HCM     MR     MR     EOE    EOE    EOE    EOE    SC     Mean (reduction relative
                                             4103   4447   4659   4917   15     18     21     22     PAR07c   2361+   6+     8+     12     2      5      7      2930   to row above)+
1000G, ESP           ignored                 193    265    330    273    469    395    336    314    285      1194    604    611    281    179    277    258    242    292.6
1000G, ESP           de novo - raw calls     15     25     38     24     25     35     64     43     36       641     121    138    20     21     51     31     21     32.1 (89%)
1000G, ESP           de novo - joint calls   1      4      4      3      4      1      5      0      2        11      3      3      1      5      4      1      1      2.6 (92%)
WGS500               ignored                 78     78     88     83     274    200    59     65     106      888     109    124    147    57     97     86     89     107.6
WGS500               de novo - raw calls     1      1      1      1      3      3      4      3      3        529     31     41     2      2      1      3      1      2.1 (98%)
WGS500               de novo - joint calls   0      1      1      1      3      1      0      0      2        7       3      3      1      2      0      1      1      1 (52%)
1000G, ESP, WGS500   ignored                 54     54     62     59     252    176    44     50     81       851     102    114    93     42     78     63     63     83.6
1000G, ESP, WGS500   de novo - raw calls     1      1      1      1      3      3      4      3      3        521     30     41     2      2      1      3      1      2.1 (98%)
1000G, ESP, WGS500   de novo - joint calls   0      1      1      1      3      1      0      0      2        7       3      3      1      2      0      1      1      1 (52%)
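The percentages in the "Mean" column of Supplementary Table 3 give the reduction in the mean number of candidate variants relative to the preceding filtering step. The Python sketch below illustrates that arithmetic using the rounded means printed in the table; the function name is hypothetical and the calculation is a reader's check, not the authors' code.

```python
def percent_reduction(previous_mean, current_mean):
    """Percentage by which current_mean is lower than previous_mean."""
    return 100.0 * (1.0 - current_mean / previous_mean)


# Rounded means from Supplementary Table 3: (row above, this row).
steps = {
    "1000G, ESP: ignored -> de novo (raw calls)": (292.6, 32.1),
    "1000G, ESP: raw calls -> joint calls": (32.1, 2.6),
    "WGS500: ignored -> de novo (raw calls)": (107.6, 2.1),
    "WGS500: raw calls -> joint calls": (2.1, 1.0),
}

for label, (previous_mean, current_mean) in steps.items():
    print(f"{label}: {percent_reduction(previous_mean, current_mean):.0f}%")
```

Because the inputs are already rounded, small discrepancies from the published percentages are possible in other tables.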
Supplementary Table 4: The number of putative simple recessive (i.e. homozygous) protein-altering variants (ANNOVAR annotation) in trio and quartet children under
different filtering strategies. When segregation is ignored, this is simply the number of rare homozygous coding variants as defined by their frequency in the control
datasets indicated. In order to classify the variant as recessive, the parents were both required to be heterozygous (in the raw calls). Manual inspection revealed that
many of the homozygous variants filtered out using the other WGS500 samples tended to be in low complexity regions which are prone to spurious or incorrect calls, or
have very low coverage. +Note that the mean excludes HCM_2361, MR_6 and MR_8 because of suspected sample contamination, EOE_5 due to a uniparental disomy for
chromosome 9, and EOE_12 due to consanguinity.
Filtered
against
1000G, ESP
WGS500
1000G, ESP,
WGS500
Segregation
CRS
4103
CRS
4447
CRS
4659
CRS
4917
EOE
15
ignored
72
78
74
88
75
78
recessive
2
1
1
1
2
ignored
2
4
0
3
2
recessive
0
2
0
0
ignored
1
3
0
recessive
0
1
0
EOE EOE
18
21
EOE
22
ERY
PAR07c
HCM
+
2361
79
85
82
93
5
3
2
4
4
3
1
2
0
6
1
1
0
0
0
1
3
1
3
1
1
0
0
0
1
0
0
0
MR MR
+
+
6
8
91
EOE
+
12
EOE EOE EOE SC
+
2
5
7 2930
+
76
87
87
Mean (reduction relative to
row above)
80.8
3
2
7
3
2.8 (96.8%)
1
13
4
1
1.9
34
1
0
1
0
0.5 (73.9%)
7
9
1
11
3
1
1.5
1
7
1
0
0
0
0.25 (83.3%)
68
85
85
1
2
10
18
13
40
0
3
6
18
1
0
Supplementary Table 5: Search terms used to define candidate gene lists for early-onset epilepsy for analysis of the burden of variants of unknown significance.
Databases were accessed in September 2012. HGMD, Human Gene Mutation Database, www.hgmd.cf.ac.uk/; MIPS, Mammalian Protein-Protein Interaction Database,
http://mips.helmholtz-muenchen.de/genre/proj/corum; GO, Gene Ontology, www.geneontology.org.
Tier   Databases    Search term                                                    Number of genes   Total number of unique genes
1      HGMD         Ohtahara syndrome (disease/phenotype)                          1                 10
                    epileptic encephalopathy (disease/phenotype)                   10
                    early infantile epileptic encephalopathy (disease/phenotype)   4
2      HGMD, MIPS   epilepsy (disease/phenotype)                                   83                82
                    seizure (all fields)                                           1
                    gene names from tier 1                                         9
3      HGMD, GO     ion channel (all fields)                                       240               679
                    ion channel (gene description)                                 134
                    ion channel (gene ontology)                                    226
                    brain development (gene ontology)                              78
                    brain development (all fields)                                 98
                    ion channel complex                                            200
                    brain development                                              335
Supplementary Table 6: The variants thought to be causal (classes A, B or C) or possibly causal (class D or E, still under investigation) in the EOE and CRS trios and the MR
quartet. The EOE quartet and HCM trio are not listed because they currently have no good candidates, and the ERY trio (with parent and child affected) is described in
the section on the noncoding variant in EPO.
Sample        Gene             Class   Consequence                               Inheritance                         Candidate gene
CRS_4103      MNT              E       nonsense                                  de novo                             no
CRS_4447      ZIC1             A       nonsense                                  de novo                             no, but in EOE tier 3
CRS_4917      TECPR1; THNSL2   E       nonsynonymous; nonsynonymous              de novo; de novo                    no; no
CRS_4659      HUWE1            B       nonsynonymous                             de novo                             no, but in MR tier 1
SC_2930       CDC45            A       nonsynonymous and splicing (synonymous)   recessive (compound heterozygous)   no
EOE_0007      SCN2A            C       nonsynonymous                             de novo                             tier 2
EOE_0012      PIGQ             A       essential splice site                     recessive                           no
EOE_0015      CSNK1G1          D       nonsynonymous                             de novo                             no
EOE_0018      CBL              D       essential splice site                     de novo                             no
EOE_2         KCNQ2            C       nonsynonymous                             de novo                             tier 2
EOE_5         KCNT1            B       nonsynonymous                             recessive - UPD9                    tier 3
MR_6, MR_8    GRIA3            C       nonsynonymous                             X-linked                            tier 1
Supplementary Table 7: Results from two-sided Fisher’s exact tests for a difference in the frequency of pseudo-candidate variants in candidate genes between trio
probands and controls. Note that the numbers in each row do not add to 216 (the total number of individuals included in Figure 2), since related unaffected individuals
were excluded from the control set, as were the individuals in the EOE quartet, because they might be expected to carry variants in these genes too.
Disease   Variant genotype     Tier   Controls without   Controls with   Cases without   Cases with   Odds ratio   p value
                                      variants           variants        variants        variants
EOE       novel heterozygous   1      146                2               6               0            0            1
EOE       novel heterozygous   2      98                 50              2               4            3.88         0.18
EOE       novel heterozygous   3      12                 136             0               6            NA           1
EOE       rare homozygous      1      147                1               6               0            0            1
EOE       rare homozygous      2      144                4               6               0            0            1
EOE       rare homozygous      3      126                22              3               3            5.63         0.05
CRS       novel heterozygous   1      114                35              4               1            0.82         1
CRS       novel heterozygous   2      102                47              5               0            0            0.32
CRS       novel heterozygous   3      48                 101             1               4            1.89         1
CRS       rare homozygous      1      149                0               5               0            0            1
CRS       rare homozygous      2      145                4               5               0            0            1
CRS       rare homozygous      3      139                10              4               1            3.43         0.31
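The p values in Supplementary Table 7 can be reproduced from the four counts in each row. The Python sketch below is a minimal illustration, not the authors' code: it assumes the common definition of the two-sided Fisher's exact test, in which the p value is the total probability of all 2x2 tables (with the same margins) that are no more likely than the observed one. The function name is hypothetical, and exact integer arithmetic is used so that ties are handled without floating-point tolerance.

```python
from math import comb


def fisher_exact_two_sided(cases_with, cases_without, controls_with, controls_without):
    """Two-sided Fisher's exact p value for a 2x2 table (illustrative sketch)."""
    n_cases = cases_with + cases_without
    n_with = cases_with + controls_with
    total = n_cases + controls_with + controls_without
    n_without = total - n_with

    def weight(k):
        # Unnormalised hypergeometric probability of k cases carrying a variant.
        return comb(n_with, k) * comb(n_without, n_cases - k)

    lowest = max(0, n_cases - n_without)
    highest = min(n_cases, n_with)
    observed = weight(cases_with)
    # Sum every table that is no more probable than the observed table.
    extreme = sum(weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed)
    return extreme / comb(total, n_cases)


# EOE, novel heterozygous, tier 2: 4 of 6 cases and 50 of 148 controls carry variants.
p_tier2 = fisher_exact_two_sided(4, 2, 50, 98)
# EOE, rare homozygous, tier 3: 3 of 6 cases and 22 of 148 controls carry variants.
p_tier3 = fisher_exact_two_sided(3, 3, 22, 126)
print(f"{p_tier2:.2f} {p_tier3:.2f}")
```

The tabulated odds ratios differ slightly from the simple cross-product ratio (for example, 4 x 98 / (2 x 50) = 3.92 against the reported 3.88), which would be consistent with a conditional maximum-likelihood estimate, although the source does not state which estimator was used.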
Supplementary Table 8: Summary of conditions for which pathogenic genes were identified (class A, B or C), with evidence for pathogenicity. (a) Gene and/or variant will be
reported in an independent publication. (b) Causal variant discovered independently of WGS500. (c) Reference to publications on WGS500 case studies.
Disease: Acquired essential thrombocytosis
Gene: THPO
Class of mutation: splicing
Evidence for pathogenicity: The same variant (NM_001177598: c.13+1G>C) has been described in two families before13,14 and
arose independently. Wiestner et al. showed that it leads to skipping of exon 3 in the 5' UTR, which
increases thrombopoietin protein, causing enhanced megakaryopoiesis and thus increased platelet
count in the affected individual15. This variant was used clinically: additional family members were
screened, provided with counselling and offered appropriate haematology follow-up.
WGS500 reference (c): N/A

Disease: Asplenia
Gene: RPSA (b)
Class of mutation: splicing
Evidence for pathogenicity: Bolze et al. found that rare heterozygous RPSA mutations were significantly enriched in asplenia
patients compared to controls (8/23 cases vs. 1/508 controls)16. This mutation (NM_002295.4:c.34+5G>C)
leads to impaired splicing at the end of exon 1 of RPSA, producing an insertion of 70 bp in
80% of the transcripts coming from that allele. Overall the mutation leads to a 40% reduction of the
RPSA protein level.
WGS500 reference (c): N/A

Disease: Cerebellar ataxia
Gene: SPTBN2
Class of mutation: nonsense
Evidence for pathogenicity: This is a homozygous stop mutation (NM_006946:c.1881C>A:p.C627X), and a mouse knockout has the
same phenotype. Another family with a different homozygous stop mutation and identical phenotype
and additional cases with truncating mutations have been published17,18. The family have requested
consideration for using this for prenatal diagnosis.
WGS500 reference (c): 19

Disease: Common variable immunodeficiency disorder
Gene: (a)
Class of mutation: missense
Evidence for pathogenicity: This gene is part of the TNFR superfamily and involved in B cell activation and proliferation. This
variant is associated with B cell defects in its homozygous and heterozygous state. Although there is
published evidence for a functional B cell defect in carriers, there seems to be incomplete penetrance
and we therefore consider that it contributes to, rather than causes, the phenotype.
WGS500 reference (c): N/A

Disease: Congenital dyserythropoietic anaemia, type 1
Gene: C15ORF41
Class of mutation: missense
Evidence for pathogenicity: This variant (NM_001130010:c.533T>A:p.L178Q) was present in all 3 affected family members of this
pedigree. A second homozygous missense change (Y94C) was later identified in the same gene in four
further CDA-I patients from two unrelated pedigrees. C15ORF41 has been added to the genes
resequenced in unexplained anaemia patients by the NHS diagnostic service and they have recently
identified a further individual from an unrelated pedigree with CDA-I caused by the L178Q change
identified in the WGS study. This makes a total of 8 individuals from 4 unrelated pedigrees with CDA-I
caused by this gene and work is ongoing into the pathogenic mechanism.
WGS500 reference (c): 20

Disease: Congenital myasthenic syndrome
Gene: ALG2
Class of mutation: missense
Evidence for pathogenicity: The recessive variant (NM_033087:c.203T>G:p.V68G) segregates with disease within the family: the
parents and unaffected brother are heterozygous, the patient homozygous. Expression of the protein
derived from patient muscle biopsy was severely reduced versus control muscle biopsies. Expression
from cDNA encoding the ALG2 mutation in HEK293 cells showed severely reduced ALG2 protein
expression versus controls. Information was used in the clinic for genetic counselling, and to provide
appropriate treatment: a cholinesterase inhibitor (pyridostigmine) in combination with either
salbutamol or ephedrine.
WGS500 reference (c): 21

Disease: Craniosynostosis
Gene: ZIC1
Class of mutation: nonsense
Evidence for pathogenicity: This patient had bicoronal synostosis and severe learning disability and was found to have a de novo
nonsense mutation in ZIC1 (NM_003412.3: c.1163C>A: p.S388*). We found three other similar patients
with de novo nonsense mutations in this gene, one by exome sequencing and two by resequencing it
in 342 patients. The transcript escapes nonsense-mediated decay and cDNA constructs showed altered
activity in biological assays. This information was used clinically.
WGS500 reference (c): N/A

Disease: Craniosynostosis
Gene: HUWE1
Class of mutation: missense
Evidence for pathogenicity: This patient had multisuture craniosynostosis and mild learning disability. A de novo missense
mutation (NM_031407.6: c.329G>A:p.R110Q) was found in HUWE1, which was previously reported for
mental retardation with craniofacial features. The mutation affects a very highly conserved residue in a
domain of unknown function (DUF908). The gene is large, spanning 154,641 bp and comprising 84
exons, and, because of extensive heterogeneity in CRS, the contribution to the disease is likely to be
low, and thus it was not surprising that we did not find any other HUWE1 mutations in a cohort of 47
unrelated cases with complex CRS. The mutation was shown to have originated on the paternal X
chromosome (Figure 3B and Supplementary Figure 8). Unexpectedly, cells from the patient show
preferential inactivation of the maternally inherited, wild-type X (Figure 3C) and, consistent with these
two observations, only the mutant allele was expressed in the two tissues (fibroblast and transformed
lymphoblasts) available for analysis (Figure 3D). Whilst this work was under review we identified, using
exome sequencing, a different de novo hemizygous mutation altering the same amino acid of HUWE1
(c.328C>T encoding p.R110W) in a boy presenting with metopic craniosynostosis, moderate-severe
learning disability and other dysmorphic features. See the Supplementary Note for further details.
WGS500 reference (c): N/A

Disease: Erythrocytosis
Gene: EPO
Class of mutation: noncoding
Evidence for pathogenicity: The same variant at a highly conserved base (NM_000799.2:c.-136G>A; Figure 4A) within the 5' UTR of
the erythropoietin gene EPO was identified in two independent families with erythrocytosis and
co-segregated with the disease. EPO is a strong candidate gene for erythrocytosis as erythropoietin is
essential for red cell production and increased erythropoietin levels lead to increased red cell mass,
the hallmark of erythrocytosis. The genetic evidence for causality of this EPO variant is strong: it is the
only rare exonic variant found in an extended (8 Mb) region that is identical-by-descent in the affected
individuals in these two unrelated families (the only such region), suggesting that it had a single
mutational origin. See the Supplementary Note for further details.
WGS500 reference (c): N/A

Disease: Erythrocytosis
Gene: BPGM
Class of mutation: missense
Evidence for pathogenicity: The patient inherited this mutation (NM_001724:c.269G>A:p.R90H) from his mother, who was
asymptomatic but had hemoglobin levels on the upper end of the normal range. Both he and his
mother had significantly lower levels of BPGM in red blood cells. BPGM deficiency affects the
hemoglobin-oxygen dissociation curve, which leads to less available oxygen, stimulating red cell
production. Other erythrocytosis patients with BPGM deficiency have been reported, and the same
residue is mutated in other patients.
WGS500 reference (c): 22

Disease: Familial hypoparathyroidism
Gene: SOX3
Class of mutation: noncoding
Evidence for pathogenicity: A complex interstitial insertion-deletion leading to deletion of 1.4 kb of the X chromosome and
insertion of 50 kb from chromosome 2p was discovered in a patient with X-linked hypoparathyroidism.
This variant lies 81.5 kb downstream of SOX3, segregates with the disease and is similar to, but distinct
from, an event previously reported in an independent kindred12. SOX3 is a strong candidate since it is
known to be involved in the development of the parathyroid gland23. See the Supplementary Note for
further details.
WGS500 reference (c): N/A

Disease: Familial hypoparathyroidism
Gene: CASR
Class of mutation: missense
Evidence for pathogenicity: This variant (NM_000388:c.2299G>C:p.E767Q) co-segregates with disease in the family
(Supplementary Figure 7), and a previously reported mutation at the same location (E767K) causes a
similar phenotype (autosomal dominant hypocalcaemia)24. It was missed by prior sequencing in a UK
research lab.
WGS500 reference (c): N/A

Disease: Familial tubulointerstitial nephropathy
Gene: UMOD
Class of mutation: missense
Evidence for pathogenicity: This variant (NM_001008389:c.410G>A:p.C137Y) co-segregates with disease in the family
(Supplementary Figure 7). The UMOD gene is well known in FJHN25, and this variant is located in a
region in which multiple mutations have been observed before (cbEGF3). The variant was missed by
prior sequencing in a non-UK research lab.
WGS500 reference (c): N/A

Disease: Familial tubulointerstitial nephropathy
Gene: UMOD
Class of mutation: missense (inframe insertion/deletion)
Evidence for pathogenicity: This is a complex indel comprising chr16:20360333, CCTTCGGGGCAG > C
(NM_001008389:c.279_289del:p.93_97del) and chr16:20360345, A > AGGAGGCGG
(NM_001008389:c.278_279insCCGCCTCC:p.V93fs). Pathogenicity is strongly supported by a previously
published study describing four kindreds with the same indel and phenotype26 and another paper
describing this indel in a family with Autosomal Dominant Medullary Cystic Kidney Disease type 2 (ref. 27).
Some of the families in the paper by Smith et al.26 have the same haplotype as this one. The UMOD
gene was not initially suspected due to late presentation and absence of gout. This discovery has been
used clinically to improve diagnosis of other family members, and potentially identify suitable kidney
donors within the family.
WGS500 reference (c): N/A

Disease: Hypertrophic cardiomyopathy (sarcomere gene-negative)
Gene: MYBPC3 (c)
Class of mutation: nonsense
Evidence for pathogenicity: We found a heterozygous nonsense mutation (NM_000256:c.1303C>T:p.Q435X) in one of two affected
cousins, who had not had prior clinical genetic testing. Segregation data are not available in his family,
but mutations in MYBPC3 and MYH3 cause 75% of HCM cases28, and so this variant seems highly likely
to be causal. The cause of HCM in the other cousin and his brother is still unidentified.
WGS500 reference (c): N/A

Disease: Inflammatory bowel syndrome/colitis
Gene: (a)
Class of mutation: missense
Evidence for pathogenicity: This gene is involved in the epithelial stress response and in the production of reactive oxygen species. The
variant is extremely rare: absent from 1000 Genomes and from 4000 IBD patients. Plasmid data have
demonstrated that the protein is defective, and data from biopsies and primary epithelial organoids
from case and controls showed defects in protein function. Recent publications suggest that defects in
gene function increase colitis susceptibility in animal models. Further details will be disclosed in a
subsequent publication.
WGS500 reference (c): N/A

Disease: Interstitial nephritis
Gene: MUC1
Class of mutation: (a)
Evidence for pathogenicity: Mutations in MUC1 cause medullary cystic kidney disease type 1 (ref. 29). Variants in this gene are not
amenable to identification by WGS due to segmental duplications, and this variant was found by
another method.
WGS500 reference (c): N/A

Disease: Long QT syndrome
Gene: KCNQ1
Class of mutation: frameshift
Evidence for pathogenicity: KCNQ1 is a well known gene in long QT syndrome30. This variant
(NM_000218:c.1195_1196insC:p.A399fs), which leads to a frameshift and premature stop codon,
segregates in the family and was missed in original HPLC clinical genetic testing. It is now being used
for cascade testing.
WGS500 reference (c): N/A

Disease: Mental retardation
Gene: GRIA3
Class of mutation: missense
Evidence for pathogenicity: GRIA3, which encodes an ionotropic glutamate receptor, has previously been implicated in X-linked
mental retardation31. Both affected brothers inherited the mutation from their heterozygous mother,
who is phenotypically normal. It lies in the highly conserved channel region, and electrophysiology
experiments showed that it affects gating of the channel. Functional studies are underway, and further
details will be disclosed in a subsequent publication.
WGS500 reference (c): N/A

Disease: Ohtahara syndrome and other early-onset epilepsies
Gene: PIGQ
Class of mutation: splicing
Evidence for pathogenicity: We found a recessive mutation (NM_004204:c.690-2A>G) that affected splicing of PIGQ and led to
defective glycophosphatidyl inositol (GPI) biosynthesis. Mutations in other GPI pathway genes,
including PIGA, the binding partner of PIGQ, have been implicated in various syndromes that involve
seizures.
WGS500 reference (c): 32

Disease: Ohtahara syndrome and other early-onset epilepsies
Gene: KCNT1
Class of mutation: missense
Evidence for pathogenicity: This patient had uniparental isodisomy for chromosome 9, which led to a missense variant in KCNT1
(NM_020822: c.2896G>A:p.A966T) becoming homozygous. This gene had previously been implicated
in other types of epilepsy, and electrophysiology experiments demonstrated an effect on channel
current.
WGS500 reference (c): 32

Disease: Ohtahara syndrome and other early-onset epilepsies
Gene: KCNQ2
Class of mutation: missense
Evidence for pathogenicity: This patient had a de novo mutation in KCNQ2 (NM_004518:c.827C>T:p.T276I), which falls in a highly
conserved transmembrane segment of the channel that forms part of the pore and is two amino acids
away from the T274M mutation recently described in another patient33.
WGS500 reference (c): 32

Disease: Ohtahara syndrome and other early-onset epilepsies
Gene: SCN2A
Class of mutation: missense
Evidence for pathogenicity: This patient had a de novo mutation in SCN2A (NM_001040143:c.5558A>G:p.H1853R). It falls in the
cytosolic C-terminal region of the protein; other de novo mutations in the cytosolic domains were
recently reported in patients with Ohtahara syndrome34.
WGS500 reference (c): 32

Disease: Multiple adenoma
Gene: POLD1; POLE
Class of mutation: missense; missense
Evidence for pathogenicity: The POLD1 (NM_002691:c.G1433A:p.S478N) and POLE (NM_006231:c.1270C>G:p.L424V) variants
co-segregated in multiple families, were over-represented in cases versus controls, and functional assays
in yeast showed they caused hypermutation.
WGS500 reference (c): 35

Disease: Multiple adenoma
Gene: MSH6
Class of mutation: missense and nonsense
Evidence for pathogenicity: This patient had possible compound heterozygote mutations in MSH6 (NM_000179:c.G2315A:p.R772Q
and NM_000179:c.2731C>T:p.R911*). Nonsense mutations in this gene are known to cause Lynch
syndrome, which predisposes to colorectal cancer.
WGS500 reference (c): N/A

Disease: Multiple adenoma
Gene: BMPR1A
Class of mutation: frameshift
Evidence for pathogenicity: We found a rare frameshift mutation in BMPR1A (NM_004329.2:c.142_143insT:p.Thr49Asnfs*22), a
known juvenile polyposis36 and multiple adenoma gene37.
WGS500 reference (c): N/A

Disease: Multiple adenoma
Gene: APC
Class of mutation: splicing
Evidence for pathogenicity: This mutation (NM_001127511:c.251-2A>G) affects a canonical splice site, consistent with knowledge
that early APC exon mutations and splice mutations cause attenuated polyposis38.
WGS500 reference (c): N/A

Disease: Saethre-Chotzen syndrome (TWIST1 negative)
Gene: TCF12
Class of mutation: nonsense; splicing
Evidence for pathogenicity: One patient had a nonsense mutation (NM_207037.1:c.1283T>G; p.L428*) and another a splicing
mutation (NM_207037.1:c.1035+3G>C; called intronic by Annovar on RefSeq transcripts) which was
shown to lead to the skipping of exon 12. Mutations in TCF12 were found in four other patients by
exome sequencing, and 32/341 patients in whom this gene was resequenced; all had coronal
synostosis.
WGS500 reference (c): 39

Disease: Saethre-Chotzen syndrome (TWIST1 negative)
Gene: CDC45
Class of mutation: synonymous (splicing) and missense
Evidence for pathogenicity: This patient was compound heterozygous for one missense (NM_001178010.2:c.773A>G;p.D258G)
and one synonymous (NM_001178010.2:c.318C>T;p.V106=) variant in CDC45. The synonymous
variant was found to cause skipping of exon 4. Two other coronal synostosis patients with compound
heterozygous missense mutations were found: one by exome sequencing and the other by
resequencing 427 cases.
WGS500 reference (c): N/A
Supplementary Table 9: Breakdown of results by project category for all 156 projects, with the percentage of the total for that project category indicated in parentheses.
See Online Methods for an explanation of results class and of project category. The totals for the broader project categories are shaded grey.
                   Result class
Project category   A           B          C            D            E            Total
1.1                4           0          5            9            9            27
1.2                1           1          0            5            2            9
1.3                0           0          1            0            0            1
1.4                0           0          2            0            8            10
1                  5 (10.6%)   1 (2.1%)   8 (17%)      14 (29.8%)   19 (40.4%)   47
2.1                3           2          3            2            4            14
2.2                0           0          1            2            4            7
2                  3 (14.3%)   2 (9.5%)   4 (19.1%)    4 (19.1%)    8 (38.1%)    21
3                  2 (3.7%)    1 (1.9%)   2 (3.7%)     12 (22.2%)   37 (68.5%)   54
4                  2 (5.9%)    0          3 (8.8%)     4 (11.8%)    25 (73.5%)   34
Total              12 (7.7%)   4 (2.6%)   17 (10.9%)   34 (21.8%)   89 (57%)     156
Supplementary Table 10: Incidental findings deemed not to be significant. The frequencies in the UK10K twins cohort and in the Exome Variant Server European
American (EVS_EA) cohort are shown. VUS: variant of unknown significance; NS: nonsynonymous; ARVC: Arrhythmogenic right ventricular cardiomyopathy; LOVD:
Leiden Open Variation Database; UMD: Universal Mutation Database; GSDB: Genome Sequence Database.
Potential incidental finding condition | Gene | Variant | Effect | UK10K Twins | EVS_EA | Comments on pathogenicity
Arrhythmogenic right ventricular cardiomyopathy | DSG2 | NM_001943:c.473T>G:p.V158G | NS | 0.0071 | 0.0079 | Classified as variant of ‘no known pathogenicity’ in ARVC database (ref. 40) based on 9 independent reports.
Arrhythmogenic right ventricular cardiomyopathy | DSG2 | NM_001943:c.2759T>G:p.V920G | NS | 0.0057 | 0.0050 | Classified as variant of ‘no known pathogenicity’ in ARVC database (ref. 40) based on 8 independent reports. Lack of segregation reported by ref. 41.
Arrhythmogenic right ventricular cardiomyopathy | DSG2 | NM_001943:c.1174G>A:p.V392I | NS | 0.0017 | 0.0021 | Classified as variant of ‘no known pathogenicity’ in ARVC database (ref. 40) based on 14 independent reports.
Arrhythmogenic right ventricular cardiomyopathy | DSP | NM_004415:c.4372C>G:p.R1458G | NS | 0.0020 | 0.0021 | Classified as variant of ‘no known pathogenicity’ in ARVC database (ref. 40) based on 2 independent reports.
Arrhythmogenic right ventricular cardiomyopathy | DSP | NM_001008844:c.88G>A:p.V30M | NS | 0.0006 | 0.0019 | Classified as variant of ‘no known pathogenicity’ in ARVC database (ref. 40) based on 10 independent reports.
Arrhythmogenic right ventricular cardiomyopathy | DSP | NM_001008844.1:c.2815G>A:p.G939S | NS | | | Classified as VUS in ARVC database (ref. 40).
Breast cancer | BRCA2 | NM_000059:c.9586A>G:p.K3196E | NS | | | Classified as VUS in UMD for BRCA2 (ref. 42).
Breast cancer | BRCA2 | NM_000059:c.8182G>A:p.V2728I | NS | | | Classified as neutral in UMD for BRCA2 (ref. 42).
Breast cancer | BRCA2 | NM_000059:c.9976A>T:p.K3326* | nonsense | | | Classified as neutral in UMD for BRCA2 (ref. 42).
Breast cancer | BRCA2 | NM_000059:c.1151C>T:p.S384F | NS | | | Classified as neutral in UMD for BRCA2 (ref. 42).
Breast cancer | BRCA2 | NM_000059:c.223G>C:p.A75P | NS | | | Classified as neutral in UMD for BRCA2 (ref. 42).
Familial hypercholesterolaemia | LDLR | NM_001195800:c.1371C>T:p.N457= | splice acceptor | | | Codon AAC to AAT, synonymous, not near splice site.
Familial hypercholesterolaemia | LDLR | NM_001195800:c.1372G>A:p.E458K | NS | | | Not listed in UK or Dutch LDLR databases (http://www.ucl.ac.uk/ldlr/Current/search.php?select_db=LDLR&srch=all&page=8).
Familial hypercholesterolaemia | PCSK9 | NM_174936:c.520C>T:p.P174S | NS | | | Report indicating this may be protective of (not risk factor for) hypercholesterolaemia (ref. 14).
Hereditary paraganglioma pheochromocytoma syndrome | SDHD | NM_003002:c.158C>T:p.P53L | NS | | | Normal in a yeast assay (ref. 43).
Inherited colorectal cancer | MLH1 | NC_000003.11:g37056045A>G | splice donor | | | +10 bp into intron and not obviously affecting splicing.
Inherited colorectal cancer | MSH2 | NM_000251.1:c.1886A>G:p.G629R | NS | | | Not listed in LOVD (ref. 44) for MSH2 (http://chromium.liacs.nl/LOVD2/colon_cancer/variants.php?select_db=MSH2&action=view_all).
Inherited colorectal cancer | MSH6 | NM_000179:c.1526T>C:p.V509A | NS | | | Likely neutral in Universal Mutation Database for MSH6 (http://www.umd.be/MSH6/).
Loeys-Dietz syndrome (aortic aneurysm) | TGFBR1 | NM_001130916:c.1202A>G:p.N401S | NS | absent | 0.0007 | Single report of Loeys-Dietz syndrome with non-penetrant parent (ref. 45), no corroborative functional data. 6 samples in EVS with variant.
Malignant hyperthermia | RYR1 | NM_000540:c.4055C>G:p.A1352G | NS | 0.0080 | 0.0000 | Not listed as causative in European Malignant Hyperthermia Group database (https://emhg.org/genetics/mutations-inryr1/). Not segregating according to ref. 46.
Malignant hyperthermia | RYR1 | NM_000540:c.4178A>G:p.K1393R | NS | 0.0040 | 0.0058 | Not listed as causative in European Malignant Hyperthermia Group database (https://emhg.org/genetics/mutations-inryr1/).
Malignant hyperthermia | RYR1 | NM_000540:c.7025A>G:p.N2342S | NS | 0.0011 | 0.0013 | Not listed as causative in European Malignant Hyperthermia Group database (https://emhg.org/genetics/mutations-inryr1/).
Phaeochromocytoma & medullary thyroid carcinoma (MTC) | RET | NM_020630:c.874G>A:p.V292M | NS | absent | absent | Gain of function mutation in two independent MTC reports, in vitro functional data indicating weakly transforming on transfection (phosphotyrosine activity and proliferation rates; refs. 13, 47). Mutation also described as loss-of-function in two patients with Hirschsprung disease (ref. 48).
Vascular Ehlers-Danlos syndrome (subarachnoid haemorrhage) (EDS) | COL3A1 | NM_000090:c.812G>A:p.R271Q | NS | 0.0017 | 0.0038 | Listed as probably pathogenic in EDS database (https://eds.gene.le.ac.uk/variants.php?select_db=COL3A1&action=view_all). Report of potential pathogenic association between COL3A1 and subarachnoid haemorrhage or arteriopathy in 4 patients including R271Q (ref. 49). Present in 37 EVS samples suggesting absolute risk low or absent.
Vascular Ehlers-Danlos syndrome (subarachnoid haemorrhage) (EDS) | COL3A1 | NM_000090:c.3938A>G:p.K1313R | NS | 0.0017 | 0.0026 | Listed as probably pathogenic in EDS database. K1313R reported for one patient in ref. 49. Present in 23 EVS samples suggesting absolute risk low or absent.
Von Hippel-Lindau disease | VHL | NM_000551.2:c.340+5G>C | splice donor | | | 340+5G>C, reported as benign in ref. 50.
Frequencies not tied to a single row in the source layout, in source order. Block following the DSP p.G939S, BRCA2 and LDLR p.N457= variants: 0.0031, 0.0100, 0.0014, 0.0003, 0.0006, 0.0001, 0.0001, 0.0045, 0.0084, 0.0015, 0.0005, 0.0002. Preceding the SDHD comment: 0.0003, 0.0001. Preceding the MLH1 comment: 0.0022. Preceding the COL3A1 rows: 0.0048, 0.0013. VHL row (column not indicated): 0.0009.