Technical Note: Informatics

Automated Cluster Calling for Polyploid Genomes with GenomeStudio® Software

GenomeStudio supports automated cluster calling for any number of clusters.

Introduction

The GenomeStudio Data Analysis Software is a highly visual and intuitive platform for analyzing data generated from Illumina assays. It works seamlessly with Illumina's genotyping platforms to support a diverse range of data analysis needs. Primary analyses of GoldenGate® and Infinium® genotyping data, such as raw data normalization, clustering, genotyping, and cluster calling, are performed using algorithms in the Genotyping (GT) Module. This document provides a brief overview of the automated cluster calling functionality currently available in GenomeStudio, which uses both the Density Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points to Identify the Clustering Structure (OPTICS) algorithms [1, 2].

Project Options

The GenomeStudio Genotyping Module offers an option to set custom project options and analyze data as a polyploid project. The Current Project Options Dialog Box is available through the Tools Menu within the GenomeStudio Genotyping Module (Figure 1). Options can be adjusted per project to increase or decrease the sensitivity of the algorithms to cluster detection by changing the minimum number of points required to define a cluster and the default cluster distance (Figure 2).

Figure 1: Genotyping Module Tools Menu

Figure 2: Current Project Options Dialog Box

The automated clustering algorithms start from an estimated density distribution and are able to detect meaningful clusters in data with varying density, which is common in genotyping data. Sensitivity of cluster detection can be adjusted at the project level by specifying the minimum number of points in a cluster and the cluster distance. The X-Y coordinates of the cluster positions can be exported from GenomeStudio for downstream data analysis.

References

1. Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Evangelos Simoudis, Jiawei Han, Usama M. Fayyad (eds.). Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231. ISBN 1-57735-004-9.
2. Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander (1999). OPTICS: Ordering Points to Identify the Clustering Structure. ACM SIGMOD International Conference on Management of Data. ACM Press. pp. 49–60.

Additional Information

For more information about GenomeStudio Software and supporting documentation, please refer to the GenomeStudio Portal or visit www.illumina.com/genomestudio.

Illumina • 1.800.809.4566 toll-free • +1.858.202.4566 tel • [email protected] • www.illumina.com

FOR RESEARCH USE ONLY

© 2011-2013 Illumina, Inc. All rights reserved. Illumina, IlluminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium, iSelect, MiSeq, Nextera, NuPCR, SeqMonitor, Solexa, TruSeq, TruSight, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or registered trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.

Pub. No. 970-2011-001. Current as of 09 January 2013.
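Supplementary sketch: the two sensitivity settings described under Project Options (minimum number of points in a cluster and cluster distance) correspond to the two parameters of the published DBSCAN algorithm [1]. The following is a minimal, standard-library Python sketch of that algorithm for illustration only. It is not GenomeStudio's implementation, and it does not cover OPTICS. The names `eps`, `min_pts`, `dbscan`, and `cluster_centers`, the synthetic data, and the parameter values are all assumptions made for this example.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    eps     -- hypothetical stand-in for the "cluster distance" option
    min_pts -- hypothetical stand-in for the "minimum number of points" option
    """
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be reclaimed later as a border point
            continue
        cluster_id += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point of this cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def cluster_centers(points, labels):
    """Mean X-Y position of each cluster, keyed by cluster id."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        if lab == NOISE:
            continue
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}


# Synthetic data: five tight groups along X (as a tetraploid biallelic marker
# with five dosage classes might produce) plus one isolated outlier.
offsets = [(0.0, 0.0), (-0.01, 0.0), (0.01, 0.0), (0.0, -0.01), (0.0, 0.01)]
centers_x = [0.1, 0.3, 0.5, 0.7, 0.9]
points = [(cx + dx, 1.0 + dy) for cx in centers_x for dx, dy in offsets]
points.append((0.4, 0.2))  # outlier

labels = dbscan(points, eps=0.03, min_pts=4)
centers = cluster_centers(points, labels)
print("clusters found:", len(centers))
print("outlier label:", labels[-1])
```

In this sketch, raising `min_pts` or lowering `eps` makes detection stricter, so sparse groups are reported as noise, while the opposite changes merge nearby groups. That mirrors the sensitivity trade-off described in the text, though the actual behavior of the GenomeStudio options may differ.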