Calling Polyploid Genotypes with GenomeStudio Software v2010.3/v1.8

Technical Note: Informatics
Automated Cluster Calling for Polyploid Genomes
with GenomeStudio® Software
GenomeStudio supports automated cluster calling for any number of clusters.
Introduction
The GenomeStudio Data Analysis Software is a highly visual
and intuitive platform for analyzing data generated from Illumina
assays. GenomeStudio works seamlessly with Illumina's genotyping
platforms to support a diverse range of data analysis needs.
Primary analyses of GoldenGate® and Infinium® genotyping data,
such as raw data normalization, clustering, genotyping, and cluster
calling, are performed using algorithms in the Genotyping (GT)
Module. This document provides a brief overview of the automated
cluster calling functionality currently available in GenomeStudio,
which uses both the Density-Based Spatial Clustering of Applications
with Noise (DBSCAN) and the Ordering Points to Identify the
Clustering Structure (OPTICS) algorithms [1, 2].
The automated clustering algorithms start from an estimated density
distribution and are able to detect meaningful clusters in data with
varying density, which is common in genotyping data. Sensitivity of
cluster detection can be adjusted at the project level by specifying
the minimum number of points in a cluster and the cluster distance.
The X-Y coordinates of the cluster positions can be exported from
GenomeStudio for downstream data analysis.
Project Options
The GenomeStudio Genotyping Module offers an option to set custom
project options and analyze data as a polyploid project. The Current
Project Options dialog box is available through the Tools menu within
the GenomeStudio Genotyping Module (Figure 1). Options can be
adjusted per project to increase or decrease the sensitivity of the
algorithm to cluster detection by changing the minimum number of
points required to define a cluster and the default cluster distance
(Figure 2).
Figure 1: Genotyping Module Tools Menu
Figure 2: Current Project Options Dialog Box
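The effect of the two project-level settings described above, the minimum number of points per cluster and the cluster distance, can be illustrated with a small sketch. The code below is a textbook DBSCAN written in plain Python and run on synthetic data; it is not GenomeStudio's implementation, and every name in it (make_points, dbscan, count_clusters, eps, min_pts) is hypothetical. The five synthetic groups are an assumption chosen to resemble a marker with five genotype classes.

```python
import math
import random

NOISE = -1  # label for points that belong to no cluster


def make_points(per_cluster=40, spread=0.01, seed=0):
    """Five synthetic 2-D groups (hypothetical data, not GenomeStudio output)."""
    rng = random.Random(seed)
    centers = [(0.00, 1.0), (0.25, 1.0), (0.50, 1.0), (0.75, 1.0), (1.00, 1.0)]
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(per_cluster)]


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN: returns one label per point, NOISE for unclustered points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # i is a core point, so it seeds a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len({label for label in labels if label != NOISE})


if __name__ == "__main__":
    pts = make_points()
    for eps, min_pts in [(0.05, 5), (0.05, 50), (0.30, 5)]:
        n = count_clusters(dbscan(pts, eps=eps, min_pts=min_pts))
        print(f"eps={eps}, min_pts={min_pts}: {n} cluster(s)")
```

On this synthetic data, a small distance with a modest minimum point count should recover the five groups. Raising the minimum above the group size leaves every point as noise, and a distance larger than the gap between groups merges them into one cluster. This is the kind of sensitivity trade-off the two settings control.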
References
1. Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu (1996).
A density-based algorithm for discovering clusters in large spatial databases
with noise. In Evangelos Simoudis, Jiawei Han, Usama M. Fayyad (eds.).
Proceedings of the Second International Conference on Knowledge Discovery
and Data Mining (KDD-96). AAAI Press. pp. 226–231. ISBN 1-57735-004-9.
2. Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander (1999).
OPTICS: Ordering Points to Identify the Clustering Structure. Proceedings of
the ACM SIGMOD International Conference on Management of Data. ACM Press.
pp. 49–60.
Additional Information
For more information about GenomeStudio Software and supporting
documentation, please refer to the GenomeStudio Portal or visit
www.illumina.com/genomestudio.
Illumina • 1.800.809.4566 toll-free • +1.858.202.4566 tel • [email protected] • www.illumina.com
FOR RESEARCH USE ONLY
© 2011-2013 Illumina, Inc. All rights reserved.
Illumina, IlluminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy,
Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium, iSelect, MiSeq, Nextera, NuPCR, SeqMonitor,
Solexa, TruSeq, TruSight, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or
registered trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.
Pub. No. 970-2011-001 Current as of 09 January 2013