Detecting copy number variants and runs of
homozygosity on a single array — challenges and
applications
Douglas Hurd and Ruth Burton
Abstract
In constitutional genetics research, analysis of single nucleotide polymorphisms (SNPs) provides
invaluable insight into a number of conditions. When analysed in conjunction with copy number
variation (CNV) data from array comparative genomic hybridisation (aCGH) arrays, this insight can
aid in the identification of additional genetic variants to those yielded by the CNV data alone.
Protocols for high-resolution SNP arrays can be time consuming, whereas aCGH protocols are
less laborious and, as the gold standard for CNV detection, well established in laboratory
workflows. Recent advances have made it possible to combine CNV probes with probes able to
detect SNPs on a single aCGH+SNP array, affording the benefits of shorter processing time and
dual data with easy integration into the workflow. Although these combined arrays do not have the
resolution capabilities of traditional SNP platforms, they have been research-validated to provide
informative SNP data for various genetic aberrations such as uniparental disomy (UPD), mosaic
aneuploidy and runs of homozygosity (ROH), without compromising on high quality CNV data.
Researchers commonly report biologically relevant SNP data at lower resolutions and indeed the
argument exists that increased resolution does not necessarily equal an increase in informative
data. This review explores the various applications of combined arrays, the challenges faced in
their implementation and their many advantages such as the easy to interpret, flexible data they
provide.
Introduction
Identifying DNA variants that contribute to a disease
or syndrome is a key objective in human genetics.
Copy number variants (CNVs) and other forms of
structural variation are important in understanding
the mechanisms underlying many common
diseases. CNVs are defined as chromosomal
segments, at least 1000 bases in length, that vary in
copy number (CN) between individuals1. A second
major contributor to human variation is at the
resolution of a single base. Single nucleotide
polymorphisms (SNPs) are genome positions at
which there are two distinct alleles, each of which
appears at high frequency in the population.
Array comparative genomic hybridisation (aCGH) is
the gold standard for detecting CNV2; however, until
recently it was not possible to combine the long
60-mer oligonucleotide probes used for CNV detection
with probes able to detect SNPs. This review
highlights the importance of combined copy number
(CN) and SNP platforms in constitutional genetics
research and describes the advantages of using
such long oligonucleotide aCGH arrays over short
oligonucleotide SNP genotyping platforms.
The primary considerations when selecting an array
platform are typically integration into existing
workflows and the resolution of the array. Array
resolution is particularly important when studying
uniparental disomy (UPD) and consanguinity. UPD is
the presence of a homologous chromosome pair
derived from only one parent. The absence of any
heterozygous SNPs over an entire chromosome is a
clear indication of UPD. Smaller runs of
homozygosity (ROH) are common in offspring of
consanguineous relationships; these vary
considerably in size and frequency.
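As a rough illustration of the whole-chromosome case, the sketch below tallies the
fraction of homozygous SNP calls per chromosome and flags chromosomes with almost no
heterozygous calls as UPD candidates. The genotype encoding ('AA', 'AB', 'BB'), the
function names and the 0.95 flagging threshold are assumptions made for this example
and are not taken from any array software. A flagged chromosome would still need a
copy-neutral CN result, since a deletion also removes heterozygous calls.

```python
from collections import defaultdict


def homozygous_fraction_by_chromosome(calls):
    """Return {chromosome: fraction of SNP calls that are homozygous}.

    calls: iterable of (chromosome, genotype), genotype in {'AA', 'AB', 'BB'}
    (an assumed encoding for this sketch).
    """
    total = defaultdict(int)
    homozygous = defaultdict(int)
    for chrom, genotype in calls:
        total[chrom] += 1
        if genotype in ("AA", "BB"):
            homozygous[chrom] += 1
    return {chrom: homozygous[chrom] / total[chrom] for chrom in total}


def flag_upd_candidates(calls, min_fraction=0.95):
    """List chromosomes whose homozygous fraction reaches min_fraction (assumed value)."""
    fractions = homozygous_fraction_by_chromosome(calls)
    return sorted(chrom for chrom, frac in fractions.items() if frac >= min_fraction)


# Toy data: chr6 is almost entirely homozygous, chr7 looks like a normal chromosome.
example_calls = (
    [("chr6", "AA")] * 48 + [("chr6", "BB")] * 50 + [("chr6", "AB")] * 2
    + [("chr7", "AA")] * 35 + [("chr7", "BB")] * 33 + [("chr7", "AB")] * 32
)
print(flag_upd_candidates(example_calls))  # ['chr6']
```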
Workflow
aCGH not only delivers the highest quality CNV
data2 but also provides a more streamlined and rapid
workflow when compared to SNP-based array
platforms (Figure 1). This is particularly useful for
high-throughput research laboratories that require
fast access to results. One of the challenges in
combining CN and SNP content is the selection of
probes that reliably detect and discriminate between
SNP alleles while working under hybridisation
conditions developed for CN detection; however,
using the standard aCGH protocol greatly reduces
total and hands-on time in the lab.
Figure 1: A comparison of two typical array processing workflows. The aCGH +SNP workflow
offers considerable time savings when compared to a typical SNP genotyping platform. A: The
CGH +SNP protocol as used by the OGT CytoSure arrays and B: A typical protocol for a SNP
genotyping platform.
Applications
There are three distinct uses for SNP probes in constitutional genetics research:
• Aiding in the identification of mosaic aneuploidy and chimerism
• Identification of UPD by the detection of runs of homozygosity (ROH)
• Identification of ROH arising from inheritance by descent and consanguinity
Mosaic aneuploidy and chimerism
Mosaic aneuploidy can be detected in a normal aCGH experiment; however, the B-allele frequency
(BAF) of SNP probes can help in the identification of mosaicism3,4, as the distribution of
homozygous and heterozygous SNPs can reinforce the subtle changes in CN that occur in mosaic
samples. The BAF generated using SNP probes can, in addition, help to determine whether
chimerism is present3,4. An advantage of using a combined CGH and SNP platform is that complex
conditions such as mosaicism and chimerism can be studied (Figure 2).
Figure 2: An example of a mosaic deletion of 20q analysed using CytoSure Interpret Software. The top
panel displays the CN probes in blue, and the bottom panel the SNP probes in black and red. The SNP
probes are displayed in a BAF plot, which clearly shows the mosaic region. The values of 77% and 94%
indicate the percentage of cells containing that aberration. Mosaicism is also shown by the CN probes
as a shift in the average log ratio away from zero*.
ROH in outbred populations
It is now well known that individuals in many different population groups have ROH in their
genome. The natural frequency and size of these ROH in normal outbred populations have been well
studied. It is important to consider this when choosing a SNP detection platform, particularly if
the goal of the study is to report biologically relevant ROH as well as changes in CN.
In normal European populations, ROH covering on average 93 Mb (1.5%) of DNA were present
throughout the genome. These ROH can be up to 4 Mb in length5 and were found in populations from
all parts of Europe, with the average number of ROH per person being approximately 40 and the
median length approximately 1.25 Mb6.
Similar ROH have been reported in other outbred populations. For example, in a Chinese population
the size of the ROH varied from 2.94 to 26.27 Mb7. Using HapMap samples, obtained from a diverse
population set, DNA from CEPH Utah residents was found to have a mean of 8.3 LOH regions, with the
maximum region being 6.48 Mb in length. Meanwhile, samples from Japanese residents of Tokyo had an
average of 8.4 regions with a maximum length of 17.91 Mb8. Finally, a large study of a diverse
population set reported by Kirin et al (2010) showed that many other populations also contain
ROH9. However, a ROH of over 10 Mb is considered very rare in cosmopolitan
populations9.
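The population figures above (number of ROH, median length, rarity of segments over 10 Mb)
are all summaries of called ROH segments. A minimal sketch of how such segments might be
called from ordered SNP genotypes on one chromosome is given below. The genotype encoding,
the function names and the minimum length and SNP count are assumptions chosen for the
example. Any non-homozygous call ends a run here, whereas production callers usually
tolerate occasional heterozygous or missing calls.

```python
from statistics import median


def _append_run(runs, positions, min_length_bp, min_snps):
    """Keep a run only if it has enough SNPs and spans enough bases."""
    if len(positions) >= min_snps and positions[-1] - positions[0] >= min_length_bp:
        runs.append((positions[0], positions[-1]))


def call_roh(snps, min_length_bp=1_000_000, min_snps=3):
    """Return (start_bp, end_bp) for each run of consecutive homozygous SNPs.

    snps: list of (position_bp, genotype) for one chromosome, sorted by position,
    with genotype in {'AA', 'AB', 'BB'} (an assumed encoding).
    """
    runs = []
    current = []  # positions of the homozygous SNPs in the run being built
    for position, genotype in snps:
        if genotype in ("AA", "BB"):
            current.append(position)
        else:
            _append_run(runs, current, min_length_bp, min_snps)
            current = []
    _append_run(runs, current, min_length_bp, min_snps)
    return runs


def summarise(runs):
    """Count, total and median length in Mb, and number of runs longer than 10 Mb."""
    lengths_mb = [(end - start) / 1e6 for start, end in runs]
    return {
        "count": len(runs),
        "total_mb": sum(lengths_mb),
        "median_mb": median(lengths_mb) if lengths_mb else 0.0,
        "over_10_mb": sum(1 for length in lengths_mb if length > 10),
    }


# Toy genotypes for a single chromosome (positions in base pairs).
toy_snps = [
    (1_000_000, "AA"), (2_000_000, "BB"), (3_500_000, "AA"), (4_000_000, "AB"),
    (5_000_000, "AA"), (5_200_000, "AA"), (6_000_000, "AB"),
    (7_000_000, "BB"), (12_000_000, "BB"), (19_000_000, "AA"), (20_000_000, "AB"),
]
toy_runs = call_roh(toy_snps)
print(toy_runs)
print(summarise(toy_runs))
```

In this toy example the two-SNP run around 5 Mb is discarded, leaving one short run and one
run over 10 Mb, the kind of segment described above as very rare in cosmopolitan populations.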
All ROH have the potential to cause an autosomal recessive disease. However, it is the
excessively long ROH that are likely to greatly increase the chance of a discernible phenotype.
Long ROH are most commonly caused by UPD, but can also be due to consanguinity or shared
parental ancestry9. A recent report10 found that the definition of ancestral ROH varied between
laboratories but included definitions such as “the presence of ROH on a few chromosomes” and
“1 Mb blocks and higher of ROH”.
Uniparental disomy
Uniparental disomy occurs when both copies of a chromosome are inherited from a single parent.
If only parts of a chromosome are inherited in this way, this is called segmental UPD. It is
possible to inherit two copies of the same chromosome, which is known as isodisomy. With
isodisomy, regions of LOH are seen. When both of a parent's homologous chromosomes (two
different copies) are inherited, this is known as heterodisomy.
Detection of UPD has largely been performed through screening DNA using microsatellite markers.
Other methods of UPD detection rely on identifying imprinted genes through changes in
methylation patterns. Both approaches are time consuming and challenging. It is not possible to
detect UPD using a traditional CGH array as there are no changes in CN, so a platform
containing SNP probes must be used. To distinguish between isodisomy and heterodisomy it is
necessary to analyse the inheritance of the ROH. It is important to be able to distinguish
between isodisomy and heterodisomy when studying UPD and recessive diseases. Unless the mutated
gene is carried by both parents, uniparental isodisomy is a prerequisite for a recessive
disease to occur.
It is important to study UPD using a combined CN and SNP platform because UPD is often
associated with chromosomal aberrations. Interestingly, it is not UPD which causes the
phenotype per se11 but the aberration.
There are several well-known constitutional diseases that can arise due to UPD, typically by
affecting imprinting11. The most common imprinting syndromes are shown in Table 1.
Chromosome                                      Syndrome
Paternally inherited chr6                       Transient neonatal diabetes
Maternally inherited (in 5% of cases) chr7      Silver-Russell syndrome
Paternally inherited chr11                      Beckwith-Wiedemann syndrome
Maternally inherited chr14                      Temple syndrome
Maternally inherited (in 25% of cases) chr15    Prader-Willi syndrome
Paternally inherited (in 2-3% of cases) chr15   Angelman syndrome
Table 1: Common imprinting syndromes
Typically the type of UPD in these syndromes is either whole chromosome or segmental isodisomy,
or a combination of segmental heterodisomy and isodisomy caused by meiotic recombination
events. The segments are typically very large, well over 10 Mb12, 13. In cases of
Beckwith-Wiedemann syndrome, paternally inherited segmental isodisomy of chromosome 11 is always
seen; however, the size of the segments varies. In a study by Cooper et al (2007), the sizes of
the segments were shown to vary from less than 3 Mb to whole chromosome UPD, with the majority
of samples having segments of greater than 17 Mb. From this study the critical regions could be
narrowed down to between 1.7 and 2.8 Mb14. An example of whole chromosome UPD on chromosome 6
is shown in Figure 3.
Figure 3: A BAF plot showing the distribution of the individual SNP probes analysed using
CytoSure Interpret Software. Shown here is an example of whole chromosome UPD so the majority of
the probes have a BAF value of 1. The left-hand graph shows the overall percentage of homozygous
probes for all the chromosomes; here chromosome 6 is selected and this is highlighted in red.
The centre dial gives the percentage of homozygous SNP probes for the whole chromosome, which is
95%. The right-hand table details the ROH. In this example there are two continuous ROH, one on
the p-arm containing 155 SNPs and a second on the q-arm containing 240 SNPs. The score reflects
the quality of the ROH, with a higher score indicating increased quality†.
Consanguinity
In clinical genetics, consanguinity is defined as the union of individuals related as second
cousins or closer and it is estimated that such couples account for 10.4% of the world’s
population15.
As discussed above, in a normal outbred population ROH are short, typically under 5 Mb.
Consanguineous samples, however, have a significantly increased number and size of ROH,
exceeding 10 Mb16. This increases the chance of homozygosity for recessive mutations. It is
estimated that the offspring of first cousins have an increased risk of congenital
malformations of 1.7-2.8%. It has been shown that the ROH are present throughout the genome17.
The number and size of ROH in offspring of consanguineous unions depend on the degree of
parental relatedness16, 17 and can theoretically vary from 25% of the genome (800 Mb) for first
degree relatives to 1.56% of the genome (50 Mb) for fifth degree relatives. Each further degree
of relatedness halves the expected proportion, giving 6.25% for third degree relatives such as
first cousins.
It has been suggested that the actual ROH might be larger than predicted by the theoretical
calculations. A study by Woods et al (2006) showed that an offspring of a first cousin union had
ROH covering 11% of the genome; the theoretical calculations predicted that this should be only
6.25%18. An example of a consanguineous sample is shown in Figure 4, showing multiple long ROH
across the genome.
Figure 4: A consanguineous sample on an OGT CytoSure ISCA +SNP array analysed using CytoSure
Interpret Software. ROH are indicated by the red solid bars to the left-hand side of the
chromosome ideograms. The bright red blocks to the right-hand side of the ideograms indicate
deletions and the green blocks amplifications*.
Challenges of identifying biologically significant ROH
Identification of homozygosity can be useful for understanding underlying disease mechanisms.
As discussed above, normal outbred populations rarely have ROH above 10 Mb but commonly have
smaller ROH9 (Kirin et al, 2010); these occur across all populations and are termed ancestral
ROH. Although the detection of ROH is useful, it raises complex legal and ethical issues, and it
is important to be able to distinguish between naturally occurring ancestral ROH and ROH that
are biologically relevant.
To detect biologically relevant ROH it is necessary to use a cut-off value to exclude ancestral
ROH. There is conflicting evidence in the literature regarding what value should be used; the
recommendations are summarised in Table 2.
Study                   ROH threshold
Kearney et al19         Suggested a conservative clinical threshold of between 3 Mb and 10 Mb
Conlin et al3           20 Mb
Sund et al16            10 Mb on two separate chromosomes
Papenhausen et al20     13.5 Mb on single chromosome (15 Mb total on two chromosomes)
Bruno et al4            5.3 Mb, with most regions not clinically significant
Table 2: Several recent studies present conflicting recommendations regarding the cut-off value
that should be used to distinguish ancestral ROH from biologically relevant ROH.
The variation in cut-off values reported in the literature is reflected in research
laboratories' reporting policies. A recent study10 found that each laboratory made its own
decision regarding the cut-off value for classifying biologically relevant ROH. These values
ranged from ≥5 Mb to ≥10 Mb. In some laboratories the total percentage of homozygosity across
the genome was considered, whereas in other laboratories the frequency of ROH was considered to
be important. Overall there was considerable variability in what was considered biologically
relevant, and the study highlighted the need for the introduction of guidelines to standardise
the process.
Frequency of biologically significant ROH
There are few reports on the frequency of ROH and UPD found in samples typically analysed by
cytogenetics research laboratories, and it is interesting to consider whether using a combined
CN and SNP array could increase the discovery of biologically relevant ROH. Approximately 80%
of developmental disorder samples of unknown cause have a normal result when a traditional aCGH
platform is used.
It is estimated that the frequency of UPD in newborns is approximately 1 in 3,500, with not all
UPDs causing a phenotypic effect. Around 1,100 cases of whole chromosome UPD and approximately
120 reports on segmental UPD have been described in the literature11. In a large study by
Papenhausen et al20 where 13,000 samples were tested, 92 samples were found to have ROH greater
than 13.5 Mb on a single chromosome or multiple ROH amounting to 15 Mb over two chromosomes.
These samples were suspected to have UPD. From studying the inheritance patterns of the ROH,
where available, there was an even mix of complete isodisomy and heterodisomy combined with
isodisomy. The ROH varied in size from 13.5 Mb to 127.8 Mb with an average size of 46.32 Mb.
Smaller studies have also reported a low frequency of detection of ROH and a complex range in
size and frequency3, 16, 4.
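The Papenhausen et al criterion quoted above can be written as a simple rule. The sketch
below is one possible reading of it, not the authors' implementation: a sample is flagged
if any single ROH exceeds 13.5 Mb, or if the ROH on the two most affected chromosomes
together reach 15 Mb. The function name, the input layout and the interpretation of
"15 Mb over two chromosomes" are assumptions made for this example.

```python
from collections import defaultdict


def suspect_upd(roh_segments, single_mb=13.5, combined_mb=15.0):
    """Flag a sample under an assumed reading of the Papenhausen et al criterion.

    roh_segments: list of (chromosome, length_mb) for one sample.
    """
    # Rule 1: one ROH longer than 13.5 Mb on a single chromosome.
    if any(length > single_mb for _, length in roh_segments):
        return True
    # Rule 2 (assumed reading): ROH on two chromosomes that together reach 15 Mb.
    per_chromosome = defaultdict(float)
    for chrom, length in roh_segments:
        per_chromosome[chrom] += length
    two_largest = sorted(per_chromosome.values(), reverse=True)[:2]
    return len(two_largest) == 2 and sum(two_largest) >= combined_mb


# Hypothetical samples, lengths in Mb.
print(suspect_upd([("chr15", 46.3)]))                              # True
print(suspect_upd([("chr2", 8.0), ("chr9", 7.5)]))                 # True
print(suspect_upd([("chr1", 4.0), ("chr3", 3.2), ("chr8", 2.1)]))  # False

# Detection rate reported in the study: 92 flagged samples out of 13,000 tested.
print(f"{92 / 13000:.2%}")  # 0.71%
```

Changing `single_mb` to one of the other thresholds in Table 2 shows how strongly the
number of flagged samples depends on the cut-off a laboratory adopts.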
A comparatively small study of 35 samples that
had a developmental disorder of unknown
cause and a normal aCGH result showed that
using a high-resolution SNP array did not detect
additional pathogenic CN aberrations. A vast
amount of data was generated and 200-1000
changes were identified per sample. More
aberrations were detected in samples with
reduced technical quality. Stringent filtering had
to be applied to identify potentially relevant
aberrations. Four samples were identified that
had a ROH associated with an OMIM disease
gene. Inheritance studies showed that these
ROH were not true segmental UPD. This result is
not unexpected as the samples came from a
small founder population21.
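The filtering step described here, reducing hundreds of calls to the few ROH that overlap
a disease gene, can be sketched as a length cut-off followed by an interval overlap test.
The 5 Mb cut-off, the function names, the gene names and every coordinate below are
hypothetical placeholders for illustration and are not taken from the study or from OMIM.

```python
def intervals_overlap(start_a, end_a, start_b, end_b):
    """True when two closed intervals on the same chromosome share at least one base."""
    return start_a <= end_b and start_b <= end_a


def roh_overlapping_genes(roh_segments, genes, min_length_mb=5.0):
    """Return (chrom, start, end, gene) for long ROH that overlap a listed gene.

    roh_segments: list of (chrom, start_bp, end_bp)
    genes: list of (name, chrom, start_bp, end_bp)
    min_length_mb: assumed cut-off used to discard short, ancestral ROH
    """
    hits = []
    for chrom, start, end in roh_segments:
        if (end - start) / 1e6 < min_length_mb:
            continue  # too short, treated as ancestral ROH in this sketch
        for name, gene_chrom, gene_start, gene_end in genes:
            if chrom == gene_chrom and intervals_overlap(start, end, gene_start, gene_end):
                hits.append((chrom, start, end, name))
    return hits


# Hypothetical ROH calls and gene coordinates.
roh_segments = [
    ("chr2", 10_000_000, 12_000_000),   # 2 Mb, below the cut-off
    ("chr7", 40_000_000, 52_500_000),   # 12.5 Mb
    ("chr11", 1_000_000, 9_000_000),    # 8 Mb, no listed gene inside
]
genes = [
    ("GENE_A", "chr2", 11_000_000, 11_050_000),
    ("GENE_B", "chr7", 45_000_000, 45_120_000),
    ("GENE_C", "chr11", 20_000_000, 20_030_000),
]
print(roh_overlapping_genes(roh_segments, genes))
```

As the study notes, a hit from such a filter is only a candidate: inheritance studies are
still needed to tell true segmental UPD from ROH inherited within a founder population.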
Conclusion
The studies reviewed here highlight the current
complexities in defining and detecting ROH.
Although the frequency of biologically relevant
ROH is low, detecting ROH and distinguishing
ancestral ROH from biologically relevant ROH is
important and can be useful for discerning the
underlying cause of the disease. What is clear
though is that there is little additional benefit to
identifying small ROH. This adds to the
complexity of the data and does not improve the
identification of biologically relevant regions.
Combined CN and SNP platforms offer gold-standard
CNV analysis together with SNP probe
resolution that enables accurate detection of
biologically relevant ROH.
The CytoSure ISCA +SNP array
After careful optimization and considerable
experimental validation OGT has identified a
number of informative SNP probes that work
effectively using the standard aCGH protocol
allowing easy integration into existing workflows.
In addition, OGT’s CytoSure CGH +SNP arrays
allow any reference DNA to be used and no
restriction digest of the sample is required. This
means that the labeling and hybridisation steps
can be completed in a single day, which is
significantly quicker than a typical SNP workflow
(Table 3).
The OGT workflow is scalable and amenable to
automation, particularly when using OGT’s
CytoSure HT Genomic Labeling Kit. No dedicated
PCR areas or specialist equipment, other than
the hybridisation oven and chambers, are
required and any standard microarray scanner
can be used. The array design itself is flexible
and custom CN +SNP designs are
straightforward to produce. Each array purchase
comes with complimentary access to CytoSure
Interpret Software, a powerful, user-friendly CN
and SNP data analysis package. Innovative
features such as the Accelerate Workflow enable
the automation of data analysis workflows,
minimising the need for user intervention and
maximising the consistency and speed of data
interpretation. CytoSure Interpret Software also
includes extensive annotation tracks covering
syndromes, genes, exons, CNVs and
recombination hotspots, each of which links to
publicly available databases such as ISCA,
Ensembl and the Database of Genomic Variants,
providing results in context.
OGT’s CytoSure ISCA +SNP array has been
specifically developed to offer sufficient resolution
to detect abnormally long LOH stretches present
in consanguineous samples or in samples
containing UPD, whilst excluding standard length
ROH that are not biologically relevant without
compromising CN detection.
                                OGT’s CGH +SNP protocol               Standard SNP protocol
Total time required             27 – 41 hours (dependent on format)   39 hours 45 min – 41 hours 45 min
Total hands-on time             1 hour 5 min                          6 hours 45 min
Time to hybridisation set-up    1 day                                 3 days
Time to results                 3 days                                4 days
Table 3: Overview of workflows. The OGT aCGH +SNP workflow offers considerable time savings
when compared to a typical SNP genotyping platform.
To find out more about OGT’s CytoSure CGH +SNP arrays, visit www.ogt.com/cytosure or
contact [email protected].
References
1. Feuk, L. et al (2006) Structural variation in the human genome. Nat. Rev. Genetics 7, 85-97
2. Curtis, C. et al (2009) The pitfalls of platform comparison: DNA copy number array
technologies assessed. BMC Genomics 10, 588-610
3. Conlin, L.K. et al (2010) Mechanisms of mosaicism, chimerism and uniparental disomy
identified by single nucleotide polymorphism array analysis. Human Molecular Genetics 7,
1263-1275
4. Bruno, D.L. et al (2009) Detection of cryptic pathogenic copy number variations and
constitutional loss of heterozygosity using high resolution SNP microarray analysis in 117
patients referred for cytogenetic analysis and impact on clinical practice. Journal of Medical
Genetics 46, 123-131
5. McQuillan, R. et al (2008) Runs of homozygosity in European populations. American Journal
of Human Genetics 83, 359-372
6. Nothnagel, M. et al (2010) Genomic and geographic distribution of SNP-defined runs of
homozygosity in Europeans. Human Molecular Genetics 1, 2927-2935
7. Li, L. et al (2006) Long contiguous stretches of homozygosity in the Human Genome. Human
Mutation 27, 1115-1121
8. Gibson, J. et al (2006) Extended tracts of homozygosity in outbred human populations.
Human Molecular Genetics 14, 789-795
9. Kirin, M. et al (2010) Genomic runs of homozygosity record population history and
consanguinity. PLoS ONE 5(11): e13996. doi:10.1371/journal.pone.0013996
10. Grote, L. et al (2012) Variability in laboratory reporting practices for regions of
homozygosity indicating parental relatedness as identified by SNP microarray testing. Genetics
in Medicine 14, 971-976
11. Liehr, T. et al (2010) Cytogenetic contribution to uniparental disomy (UPD). Molecular
Cytogenetics 3, 1755-8166
12. Bruce, S. et al (2005) Global analysis of uniparental disomy using high density genotyping
arrays. Journal of Medical Genetics 42, 847-851
13. Altug-Teber, Ö. et al (2005) A rapid microarray based whole genome analysis for detection
of uniparental disomy. Human Mutation 26, 153-159
14. Cooper, W.N. et al (2007) Mitotic recombination and uniparental disomy in
Beckwith-Wiedemann syndrome. Genomics 89, 613-617
15. Bittles, A.H. and Black, M.L. (2010) Consanguinity, human evolution, and complex diseases.
Proceedings of the National Academy of Sciences USA 26, 1779-1786
16. Sund, K.L. et al (2012) Regions of homozygosity identified by SNP microarray analysis aid
in the diagnosis of autosomal recessive disease and incidentally detect parental blood
relationships. Genetics in Medicine 15, 70-78
17. Bennett, R.L. et al (2002) Genetic counselling and screening of consanguineous couples and
their offspring: recommendations of the National Society of Genetic Counselors. Journal of
Genetic Counseling 11, 97-119
18. Woods, C.G. et al (2006) Quantification of homozygosity in consanguineous individuals with
autosomal recessive disease. The American Journal of Human Genetics 78, 889-896
19. Kearney, H.M. et al (2011) Diagnostic implications of excessive homozygosity detected by
SNP-based microarrays: consanguinity, uniparental disomy, and recessive single-gene mutations.
Clinics in Laboratory Medicine 31, 595-613
20. Papenhausen, P. et al (2011) UPD detection using homozygosity profiling with a SNP
genotyping microarray. American Journal of Medical Genetics Part A 155, 757–768
21. Siggberg, L. et al (2012) High-resolution SNP array analysis of patients with developmental
disorder and normal array CGH results. BMC Med Genet 13:84
* Data kindly provided by Emory Genetics Laboratory.
† Data kindly provided by Dr Deborah J G Mackay and Dr Rebecca Poole, Wessex Regional Genetics
Laboratory, Salisbury District Hospital, Salisbury.
Begbroke Science Park,
Begbroke Hill, Woodstock Rd
Begbroke, Oxfordshire, OX5 1PF
United Kingdom
T:+44 (0)1865 856826 (US: 914-467-5285)
F: +44 (0)1865 848684
www.ogt.com
CytoSure: This product is provided under an agreement between Agilent Technologies, Inc. and OGT. The manufacture, use, sale or import of this
product may be subject to one or more of U.S. patents, pending applications, and corresponding international equivalents, owned by Agilent
Technologies, Inc. The purchaser has the non-transferable right to use and consume the product for RESEARCH USE ONLY AND NOT for
DIAGNOSTICS PROCEDURES. It is not intended for use, and should not be used, for the diagnosis, prevention, monitoring, treatment or alleviation of
any disease or condition, or for the investigation of any physiological process, in any identifiable human, or for any other medical purpose.
This document and its contents are © Oxford Gene Technology IP Limited – 2013. All rights reserved. OGT™ and CytoSure™ are trademarks of Oxford
Gene Technology IP Limited.