Detecting copy number variants and runs of homozygosity on a single array: challenges and applications

Douglas Hurd and Ruth Burton

Abstract

In constitutional genetics research, analysis of single nucleotide polymorphisms (SNPs) provides invaluable insight into a number of conditions. When analysed in conjunction with copy number variation (CNV) data from array comparative genomic hybridisation (aCGH) arrays, SNP data can help to identify genetic variants beyond those revealed by the CNV data alone. Protocols for high-resolution SNP arrays can be time-consuming, whereas aCGH protocols are less laborious and, as the gold standard for CNV detection, are well established in laboratory workflows. Recent advances have made it possible to combine CNV probes with probes able to detect SNPs on a single aCGH+SNP array, affording the benefits of shorter processing time and dual data with easy integration into the workflow. Although these combined arrays do not have the resolution capabilities of traditional SNP platforms, they have been research-validated to provide informative SNP data for various genetic aberrations such as uniparental disomy (UPD), mosaic aneuploidy and runs of homozygosity (ROH), without compromising on high-quality CNV data. Researchers commonly report biologically relevant SNP data at lower resolutions, and indeed the argument exists that increased resolution does not necessarily equal an increase in informative data. This review explores the various applications of combined arrays, the challenges faced in their implementation and their many advantages, such as the easy-to-interpret, flexible data they provide.

Introduction

Identifying DNA variants that contribute to a disease or syndrome is a key objective in human genetics. Copy number variants (CNVs) and other forms of structural variation are important in understanding the mechanisms underlying many common diseases.
CNVs are defined as chromosomal segments at least 1000 bases in length that vary in copy number (CN) between individuals [1]. A second major contributor to human variation acts at the resolution of a single base. Single nucleotide polymorphisms (SNPs) are genome positions at which there are two distinct alleles, each of which appears at high frequency in the population. Array comparative genomic hybridisation (aCGH) is the gold standard for detecting CNV [2]; however, until recently it was not possible to combine the long 60-mer oligonucleotide probes used for CNV detection with probes able to detect SNPs.

This review highlights the importance of combined copy number (CN) and SNP platforms in constitutional genetics research and describes the advantages of using such long oligonucleotide aCGH arrays over short oligonucleotide SNP genotyping platforms.

The primary considerations when selecting an array platform are typically integration into existing workflows and the resolution of the array. Array resolution is particularly important when studying uniparental disomy (UPD) and consanguinity. UPD is the presence of a homologous chromosome pair derived from only one parent. The absence of any heterozygous SNPs over an entire chromosome is a clear indication of UPD. Smaller runs of homozygosity (ROH) are common in offspring of consanguineous relationships; these vary considerably in size and frequency.

Workflow

aCGH not only delivers the highest quality CNV data [2] but also provides a more streamlined and rapid workflow than SNP-based array platforms (Figure 1). This is particularly useful for high-throughput research laboratories that require fast access to results. One of the challenges in combining CN and SNP content is the selection of probes that reliably detect and discriminate between SNP alleles while working under hybridisation conditions developed for CN detection; however, using the standard aCGH protocol greatly reduces total and hands-on time in the lab.
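The Introduction defines UPD and ROH in terms of missing heterozygous SNP calls. As a rough illustration only, the Python sketch below scans SNP calls along one chromosome and reports stretches that contain no heterozygous call, together with the overall heterozygous fraction. The data layout (position in bp plus a two-letter genotype such as "AA" or "AB"), the function names `find_roh` and `heterozygous_fraction`, and the `min_snps` parameter are assumptions made for this example; they do not describe how CytoSure Interpret Software or any other package calls ROH.

```python
from typing import List, Tuple

# Hypothetical input format: (position in bp, genotype such as "AA", "AB" or "BB").
Call = Tuple[int, str]


def find_roh(calls: List[Call], min_snps: int = 3) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of consecutive homozygous calls."""
    runs = []
    current: List[int] = []  # positions in the run being built
    for pos, genotype in sorted(calls):
        if genotype[0] == genotype[1]:
            # Homozygous call: extend the current run.
            current.append(pos)
        else:
            # Heterozygous call: close the current run if it is long enough.
            if len(current) >= min_snps:
                runs.append((current[0], current[-1], len(current)))
            current = []
    if len(current) >= min_snps:
        runs.append((current[0], current[-1], len(current)))
    return runs


def heterozygous_fraction(calls: List[Call]) -> float:
    """Fraction of calls that are heterozygous; close to 0 is what the text
    describes as an indication of whole chromosome UPD."""
    if not calls:
        return 0.0
    het = sum(1 for _, genotype in calls if genotype[0] != genotype[1])
    return het / len(calls)


# Made-up calls for one chromosome, for illustration only.
example = [
    (1_000_000, "AA"), (2_000_000, "BB"), (3_000_000, "AA"),
    (4_000_000, "AB"), (5_000_000, "BB"), (6_000_000, "AA"),
]
roh = find_roh(example)
het = heterozygous_fraction(example)
```

In practice the number and spacing of SNP probes limits the smallest run that can be reported, which is why array resolution matters for this application.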
Figure 1: A comparison of two typical array processing workflows. The aCGH +SNP workflow offers considerable time savings when compared to a typical SNP genotyping platform. A: The CGH +SNP protocol as used by the OGT CytoSure arrays. B: A typical protocol for a SNP genotyping platform.

Applications

There are three distinct uses for SNP probes in constitutional genetics research:

- Aiding in the identification of mosaic aneuploidy and chimerism
- Identification of UPD by the detection of runs of homozygosity (ROH)
- Identification of ROH by inheritance by descent and consanguinity

Mosaic aneuploidy and chimerism

Mosaic aneuploidy can be detected in a normal aCGH experiment; however, the B-allele frequency (BAF) of SNP probes can help in the identification of mosaicism [3,4], as the distribution of homozygous and heterozygous SNPs can reinforce the subtle changes in CN that occur in mosaic samples. The BAF generated using SNP probes can also help to determine whether chimerism is present [3,4]. An advantage of using a combined CGH and SNP platform is that complex conditions like mosaicism and chimerism can be studied (Figure 2).

ROH in outbred populations

It is now well known that individuals in many different population groups have ROH in their genome. The natural frequency and size of these ROH in normal outbred populations have been well studied. It is important to consider this when choosing a SNP detection platform, particularly if the goal of the study is to report biologically relevant ROH as well as changes in CN.

In normal European populations, ROH covering on average 93 Mb (1.5%) of DNA were present throughout the genome. The ROH can be up to 4 Mb in length [5] and were found in populations from all parts of Europe, with the average number of ROH in a person being approximately 40 and the median length approximately 1.25 Mb [6]. Similar ROH have been reported in other outbred populations.
For example, in a Chinese population the ROH varied from 2.94 to 26.27 Mb in length [7]. In HapMap samples, obtained from a diverse population set, DNA from CEPH Utah residents was found to have a mean of 8.3 LOH regions, with the largest region being 6.48 Mb in length. Meanwhile, samples from Japanese residents of Tokyo had an average of 8.4 regions with a maximum length of 17.91 Mb [8]. Finally, a large study of a diverse population set reported by Kirin et al (2010) showed that many other populations also contain ROH [9]. However, a ROH of over 10 Mb is considered very rare in cosmopolitan populations [9].

All ROH have the potential to cause an autosomal recessive disease. However, it is the excessively long ROH that are likely to greatly increase the chance of a discernible phenotype. Long ROH are most commonly caused by UPD, but can also be due to consanguinity or shared parental ancestry [9]. A recent report [10] found that the definition of ancestral ROH varied between laboratories but included definitions such as “the presence of ROH on a few chromosomes” and “1 Mb blocks and higher of ROH”.

Figure 2: An example of a mosaic deletion of 20q analysed using CytoSure Interpret Software. The top panel displays the CN probes in blue and the bottom panel the SNP probes in black and red. The SNP probes are displayed in a BAF plot which clearly shows the mosaic region. The values of 77% and 94% indicate the percentage of cells containing that aberration. Mosaicism is also shown by the CN probes by a shift in the average log ratio away from zero*.

Uniparental disomy

Uniparental disomy occurs when both copies of a chromosome are inherited from a single parent. If only parts of a chromosome are inherited this is called segmental UPD. It is possible to inherit two copies of the same chromosome, which is known as isodisomy. With isodisomy, regions of LOH are seen. When two chromosomes from the same parent are inherited, this is known as heterodisomy.

Detection of UPD has largely been performed through screening DNA using microsatellite markers. Other methods of UPD detection rely on identifying imprinted genes through changes in methylation patterns. Both approaches are time-consuming and challenging. It is not possible to detect UPD using a traditional CGH array as there are no changes in CN, so a platform containing SNP probes must be used. To distinguish between isodisomy and heterodisomy it is necessary to analyse the inheritance of the ROH. This distinction is important when studying UPD and recessive diseases. Unless the mutated gene is carried by both parents, uniparental isodisomy is a prerequisite for a recessive disease to occur.

It is important to study UPD using a combined CN and SNP platform because UPD is often associated with chromosomal aberrations. Interestingly, it is not UPD which causes the phenotype per se [11] but the aberration.

There are several well-known constitutional diseases that can arise due to UPD, typically by affecting imprinting [11]. The most common imprinting syndromes are shown in Table 1.

Table 1: Common imprinting syndromes

| Inheritance | Chromosome | Syndrome |
| --- | --- | --- |
| Maternally inherited | chr6 | Transient neonatal diabetes |
| Maternally inherited (in 5% of cases) | chr7 | Silver-Russell syndrome |
| Paternally inherited | chr11 | Beckwith-Wiedemann |
| Maternally inherited | chr14 | Temple syndrome |
| Maternally inherited (in 25% of cases) | chr15 | Prader-Willi |
| Paternally inherited (in 2-3% of cases) | chr15 | Angelman syndrome |

Typically the type of UPD in these syndromes is either whole chromosome or segmental isodisomy, or a combination of segmental heterodisomy and isodisomy caused by meiotic recombination events. The segments are typically very large, well over 10 Mb [12,13]. In cases of Beckwith-Wiedemann, paternally inherited segmental isodisomy of chromosome 11 is always seen; however, the size of the segments varies. In a study by Cooper et al (2007), the sizes of the segments were shown to vary from less than 3 Mb to whole chromosome UPD, with the majority of samples having segments of greater than 17 Mb. From this study the critical regions could be narrowed down to between 1.7-2.8 Mb [14]. An example of whole chromosome UPD on chromosome 6 is shown in Figure 3.

Figure 3: A BAF plot showing the distribution of the individual SNP probes analysed using CytoSure Interpret Software. Shown here is an example of whole chromosome UPD, so the majority of the probes have a BAF value of 1. The left-hand graph shows the overall percentage of homozygous probes for all the chromosomes; here chromosome 6 is selected and is highlighted in red. The centre dial gives the percentage of homozygous SNP probes for the whole chromosome, which is 95%. The right-hand table details the ROH. In this example there are two continuous ROH, one on the p-arm containing 155 SNPs and a second on the q-arm containing 240 SNPs. The score reflects the quality of the ROH, with a higher score indicating increased quality†.

Consanguinity

In clinical genetics, consanguinity is defined as the union of individuals related as second cousins or closer, and it is estimated that such couples account for 10.4% of the world’s population [15].

As discussed above, in a normal outbred population ROH are short and are typically under 5 Mb. Consanguineous samples, however, have a significantly increased number and size of ROH, exceeding 10 Mb [16]. This increases the chance of homozygosity for recessive mutations. It is estimated that the offspring of first cousins have an increased risk of congenital malformations of 1.7-2.8%. It has been shown that the ROH are present throughout the genome [17].

The number and size of ROH in offspring of consanguineous unions depend on the degree of parental relatedness [16,17] and can theoretically vary from 25% of the genome (800 Mb) for first-degree relatives to 1.56% of the genome (50 Mb) for fifth-degree relatives.

It has been suggested that the actual ROH might be larger than predicted by the theoretical calculations. A study by Woods et al (2006) showed that an offspring of a first cousin union had ROH covering 11% of the genome; the theoretical calculations predicted that this should only be 6.25% [18]. An example of a consanguineous sample is shown in Figure 4, showing multiple long ROH across the genome.

Figure 4: A consanguineous sample on an OGT CytoSure ISCA +SNP array analysed using CytoSure Interpret Software. ROH are indicated by the red solid bars to the left-hand side of the chromosome ideograms. The bright red blocks to the right-hand side of the ideograms indicate deletions and the green blocks amplifications*.

Challenges of identifying biologically significant ROH

Identification of homozygosity can be useful for understanding underlying disease mechanisms. As discussed above, normal outbred populations rarely have ROH above 10 Mb but commonly have smaller ROH [9] (Kirin et al, 2010); these occur across all populations and are termed ancestral ROH. Although the detection of ROH is useful, it raises complex legal and ethical issues, and it is important to be able to distinguish between naturally occurring ancestral ROH and ROH that are biologically relevant.

To detect biologically relevant ROH it is necessary to use a cut-off value to exclude ancestral ROH. There is conflicting evidence in the literature regarding what value should be used; the recommendations are summarised in Table 2.

| Study | ROH threshold |
| --- | --- |
| Kearney et al [19] | Suggested a conservative clinical threshold of between 3 Mb and 10 Mb |
| Conlin et al [3] | 20 Mb |
| Sund et al [16] | 10 Mb on two separate chromosomes |
| Papenhausen et al [20] | 13.5 Mb on single chromosome (15 Mb total on two chromosomes) |
| Bruno et al [4] | 5.3 Mb, with most regions not clinically significant |

Table 2: Several recent studies present conflicting recommendations regarding the cut-off value that should be used to distinguish ancestral ROH from biologically relevant ROH.

The variation in cut-off values reported in the literature is reflected in research laboratories’ reporting policies. A recent study [10] found that each laboratory made its own decision regarding the cut-off value for classifying biologically relevant ROH. These values ranged from ≥10 Mb to ≥5 Mb. In some laboratories the total percentage of homozygosity across the genome was considered, whereas in other laboratories the frequency of ROH was considered to be important. Overall there was considerable variability in what was considered biologically relevant, which highlighted the need for the introduction of guidelines to standardise the process.

Frequency of biologically significant ROH

There are few reports on the frequency of ROH and UPD found in samples typically analysed by cytogenetics research laboratories, and it is interesting to consider whether using a combined CN and SNP array could increase the discovery of biologically relevant ROH. Approximately 80% of developmental disorder samples of unknown cause have a normal result when a traditional aCGH platform is used.

It is estimated that the frequency of UPD in newborns is approximately 1 in 3,500, with not all UPDs causing a phenotypic effect. Around 1,100 cases of whole chromosome UPD and approximately 120 reports on segmental UPD have been described in the literature [11]. In a large study by Papenhausen et al [20] in which 13,000 samples were tested, 92 samples were found to have ROH greater than 13.5 Mb on a single chromosome or multiple ROH amounting to 15 Mb over two chromosomes. These samples were suspected to have UPD. From studying the inheritance patterns of the ROH, where available, there was an even mix of complete isodisomy and heterodisomy combined with isodisomy. The ROH varied in size from 13.5 Mb to 127.8 Mb with an average size of 46.32 Mb. Smaller studies have also reported a low frequency of detection of ROH and a complex range in size and frequency [3,4,16].
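Two of the calculations discussed above can be sketched in a few lines of Python. The first reproduces the theoretical homozygous fractions quoted in the Consanguinity section; the rule (1/2)^(degree + 1) is an assumption chosen because it matches the quoted 25%, 6.25% and 1.56% figures, and the 3200 Mb genome size is likewise inferred from "25% of the genome (800 Mb)". The second shows how the choice of cut-off from the range in Table 2 changes which ROH are retained. The function names and the sample ROH lengths are hypothetical and serve only as illustration.

```python
from typing import Dict, List

# Assumption: genome size implied by the text's "25% of the genome (800 Mb)".
GENOME_MB = 3200.0


def expected_roh_fraction(degree: int) -> float:
    """Theoretical homozygous fraction in offspring of parents related at `degree`.

    Assumed rule: (1/2) ** (degree + 1). It reproduces the figures quoted in the
    text: 25% (first degree), 6.25% (first cousins, third degree), 1.56% (fifth degree).
    """
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return 0.5 ** (degree + 1)


def roh_above_cutoff(lengths_mb: List[float], cutoff_mb: float) -> List[float]:
    """Keep only ROH at least `cutoff_mb` long; shorter runs are treated as ancestral."""
    return [length for length in lengths_mb if length >= cutoff_mb]


# Hypothetical ROH lengths (Mb) for one sample; not taken from any study.
sample_roh_mb = [1.2, 4.8, 6.5, 14.0, 46.3]

# Per-ROH cut-offs within the range reported in the text and Table 2.
calls_by_cutoff: Dict[float, List[float]] = {
    cutoff: roh_above_cutoff(sample_roh_mb, cutoff) for cutoff in (5.0, 10.0, 20.0)
}

if __name__ == "__main__":
    for degree in (1, 3, 5):
        fraction = expected_roh_fraction(degree)
        print(f"degree {degree}: {fraction:.2%} of genome, about {fraction * GENOME_MB:.0f} Mb")
    for cutoff, kept in calls_by_cutoff.items():
        print(f"cut-off {cutoff} Mb: {len(kept)} ROH retained {kept}")
```

With these made-up lengths, a 5 Mb cut-off retains three ROH while a 20 Mb cut-off retains one, which illustrates why laboratories applying different thresholds can report the same sample differently. Criteria that combine ROH across chromosomes, such as the one used by Papenhausen et al, would need per-chromosome bookkeeping that this sketch does not attempt.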
A comparatively small study of 35 samples that had a developmental disorder of unknown cause and a normal aCGH result showed that using a high-resolution SNP array did not detect additional pathogenic CN aberrations. A vast amount of data was generated, and 200-1000 changes were identified per sample. More aberrations were detected in samples with reduced technical quality. Stringent filtering had to be applied to identify potentially relevant aberrations. Four samples were identified that had a ROH associated with an OMIM disease gene. Inheritance studies showed that these ROH were not true segmental UPD. This result is not unexpected as the samples came from a small founder population [21].

Conclusion

The studies reviewed here highlight the current complexities in defining and detecting ROH. Although the frequency of biologically relevant ROH is low, detecting ROH and distinguishing ancestral ROH from biologically relevant ROH is important and can be useful for discerning the underlying cause of a disease. What is clear, though, is that there is little additional benefit in identifying small ROH: doing so adds to the complexity of the data and does not improve the identification of biologically relevant regions. Combined CN and SNP platforms offer gold-standard CNV analysis together with SNP probe resolution that enables accurate detection of biologically relevant ROH.

The CytoSure ISCA +SNP array

After careful optimization and considerable experimental validation, OGT has identified a number of informative SNP probes that work effectively using the standard aCGH protocol, allowing easy integration into existing workflows. In addition, OGT’s CytoSure CGH +SNP arrays allow any reference DNA to be used, and no restriction digest of the sample is required. This means that the labeling and hybridisation steps can be completed in a single day, which is significantly quicker than a typical SNP workflow (Table 3).
The OGT workflow is scalable and amenable to automation, particularly when using OGT’s CytoSure HT Genomic Labeling Kit. No dedicated PCR areas or specialist equipment, other than the hybridisation oven and chambers, are required, and any standard microarray scanner can be used. The array design itself is flexible, and custom CN +SNP designs are straightforward to produce.

Each array purchase comes with complimentary access to CytoSure Interpret Software, a powerful, user-friendly CN and SNP data analysis package. Innovative features such as the Accelerate Workflow enable the automation of data analysis workflows, minimising the need for user intervention and maximising the consistency and speed of data interpretation. CytoSure Interpret Software also includes extensive annotation tracks covering syndromes, genes, exons, CNVs and recombination hotspots, each of which links to publicly available databases such as ISCA, Ensembl and the Database of Genomic Variants, providing results in context.

OGT’s CytoSure ISCA +SNP array has been specifically developed to offer sufficient resolution to detect abnormally long LOH stretches present in consanguineous samples or in samples containing UPD, whilst excluding standard-length ROH that are not biologically relevant, without compromising CN detection.

| | OGT’s CGH +SNP protocol | Standard SNP protocol |
| --- | --- | --- |
| Total time required | 27-41 hours (dependent on format) | 39 hours 45 min to 41 hours 45 min |
| Total hands-on time | 1 hour 5 min | 6 hours 45 min |
| Time to hybridisation set-up | 1 day | 3 days |
| Time to results | 3 days | 4 days |

Table 3: Overview of workflows. The OGT aCGH +SNP workflow offers considerable time savings when compared to a typical SNP genotyping platform.

To find out more about OGT’s CytoSure CGH +SNP arrays, visit www.ogt.com/cytosure or contact [email protected].

References

1. Feuk, L. et al (2006) Structural variation in the human genome. Nat. Rev. Genetics 7, 85-97
2. Curtis, C. et al (2009) The pitfalls of platform comparison: DNA copy number array technologies assessed. BMC Genomics 10, 588-610
3. Conlin, L.K. et al (2010) Mechanisms of mosaicism, chimerism and uniparental disomy identified by single nucleotide polymorphism array analysis. Human Molecular Genetics 7, 1263-1275
4. Bruno, D.L. et al (2009) Detection of cryptic pathogenic copy number variations and constitutional loss of heterozygosity using high resolution SNP microarray analysis in 117 patients referred for cytogenetic analysis and impact on clinical practice. Journal of Medical Genetics 46, 123-131
5. McQuillan, R. et al (2008) Runs of homozygosity in European populations. American Journal of Human Genetics 83, 359-372
6. Nothnagel, M. et al (2010) Genomic and geographic distribution of SNP-defined runs of homozygosity in Europeans. Human Molecular Genetics 1, 2927-2935
7. Li, L. et al (2006) Long contiguous stretches of homozygosity in the Human Genome. Human Mutation 27, 1115-1121
8. Gibson, J. et al (2006) Extended tracts of homozygosity in outbred human populations. Human Molecular Genetics 14, 789-795
9. Kirin, M. et al (2010) Genomic runs of homozygosity record population history and consanguinity. PLoS ONE 5(11): e13996. doi:10.1371/journal.pone.0013996
10. Grote, L. et al (2012) Variability in laboratory reporting practices for regions of homozygosity indicating parental relatedness as identified by SNP microarray testing. Genetics in Medicine 14, 971-976
11. Liehr, T. et al (2010) Cytogenetic contribution to uniparental disomy (UPD). Molecular Cytogenetics 3, 1755-8166
12. Bruce, S. et al (2005) Global analysis of uniparental disomy using high density genotyping arrays. Journal of Medical Genetics 42, 847-851
13. Altug-Teber, Ö. et al (2005) A rapid microarray based whole genome analysis for detection of uniparental disomy. Human Mutation 26, 153-159
14. Cooper, W.N. et al (2007) Mitotic recombination and uniparental disomy in Beckwith-Wiedemann syndrome. Genomics 89, 613-617
15. Bittles, A.H. and Black, M.L. (2010) Consanguinity, human evolution, and complex diseases. Proceedings of the National Academy of Sciences USA 26, 1779-1786
16. Sund, K.L. et al (2012) Regions of homozygosity identified by SNP microarray analysis aid in the diagnosis of autosomal recessive disease and incidentally detect parental blood relationships. Genetics in Medicine 15, 70-78
17. Bennett, R.L. et al (2002) Genetic counselling and screening of consanguineous couples and their offspring: recommendations of the National Society of Genetic Counselors. Journal of Genetic Counseling 11, 97-119
18. Woods, C.G. et al (2006) Quantification of homozygosity in consanguineous individuals with autosomal recessive disease. The American Journal of Human Genetics 78, 889-896
19. Kearney, H.M. et al (2011) Diagnostic implications of excessive homozygosity detected by SNP-based microarrays: consanguinity, uniparental disomy, and recessive single-gene mutations. Clinics in Laboratory Medicine 31, 595-613
20. Papenhausen, P. et al (2011) UPD detection using homozygosity profiling with a SNP genotyping microarray. American Journal of Medical Genetics Part A 155, 757-768
21. Siggberg, L. et al (2012) High-resolution SNP array analysis of patients with developmental disorder and normal array CGH results. BMC Med Genet 13:84

* Data kindly provided by Emory Genetics Laboratory.
† Data kindly provided by Dr Deborah J G Mackay and Dr Rebecca Poole, Wessex Regional Genetics Laboratory, Salisbury District Hospital, Salisbury.

Begbroke Science Park, Begbroke Hill, Woodstock Rd, Begbroke, Oxfordshire, OX5 1PF, United Kingdom
T: +44 (0)1865 856826 (US: 914-467-5285) F: +44 (0)1865 848684 www.ogt.com

CytoSure: This product is provided under an agreement between Agilent Technologies, Inc. and OGT. The manufacture, use, sale or import of this product may be subject to one or more of U.S.
patents, pending applications, and corresponding international equivalents, owned by Agilent Technologies, Inc. The purchaser has the non-transferable right to use and consume the product for RESEARCH USE ONLY AND NOT for DIAGNOSTICS PROCEDURES. It is not intended for use, and should not be used, for the diagnosis, prevention, monitoring, treatment or alleviation of any disease or condition, or for the investigation of any physiological process, in any identifiable human, or for any other medical purpose. This document and its contents are © Oxford Gene Technology IP Limited – 2013. All rights reserved. OGT™ and CytoSure™ are trademarks of Oxford Gene Technology IP Limited.