BIOINFORMATICS APPLICATIONS NOTE
Vol. 18 no. 2 2002
Pages 335–336
GenomePixelizer—a visualization program for
comparative genomics within and between
species
A. Kozik, E. Kochetkova and R. Michelmore
Department of Vegetable Crops, University of California, Davis, CA 95616, USA
Received on August 28, 2001; revised on September 28, 2001; accepted on October 12, 2001
ABSTRACT
Summary: GenomePixelizer is a visualization tool that
generates custom images of the physical or genetic
positions of specified sets of genes in whole genomes
or parts of genomes. Multiple sets of genes can be
shown simultaneously with user-defined characteristics
displayed. It allows the analysis of duplication events within
and between species based on sequence similarities. The
program is written in Tcl/Tk and works on any platform that
supports the Tcl/Tk toolkit. GenomePixelizer generates
HTML ImageMap tags for each gene in the image allowing
links to databases. Images can be saved and presented
on web pages.
Availability: GenomePixelizer is freely available at
http://niblrrs.ucdavis.edu/GenomePixelizer/GenomePixelizer_Welcome.html
Contact: [email protected]
INTRODUCTION
The increasing availability of the sequences for whole
genomes has created the need for different types of
visualization tools that allow the facile manipulation
and comparison of the data. Several genome viewers
are currently available, for example: NCBI Map Viewer
(http://www.ncbi.nlm.nih.gov/), TIGR Genome Browser
(http://www.tigr.org/), MIPS Arabidopsis Redundancy
Viewer (http://mips.gsf.de/proj/thal/db/gv/rv/) and WormBase (AceDB, http://www.wormbase.org/). These tools
allow the viewing and exploration of not only whole chromosome(s) but also the details of genome assembly, ORF
prediction and gene annotation. However, the existing
genome viewers lack the flexibility to work with specific
subsets of genes, to analyze the relationships between
different chromosomes and to examine patterns of gene
duplication. The MIPS Arabidopsis Redundancy Viewer
comes closest to achieving this; however, this web-based
tool cannot display sets of genes from species other than
Arabidopsis, focus on regions of interest or create images
with anything other than the default parameters.
We therefore created a highly customizable genome
visualization program, GenomePixelizer, that works on
any computer running the Tcl/Tk toolkit. GenomePixelizer
is most similar to the MIPS Arabidopsis Redundancy Viewer
and gff2ps (http://www1.imim.es/software/gfftools/GFF2PS.html).
GenomePixelizer differs from gff2ps in that it does not
require input in GFF format
(http://www.sanger.ac.uk/Software/formats/GFF/).
The input file format for GenomePixelizer
is simpler, more flexible and customizable, and can be
created using a spreadsheet editor such as Excel. GenomePixelizer
displays the relationships between genes on any number
of chromosomes; in contrast, gff2ps does not display
relationships between chromosomes and the Arabidopsis
Redundancy Viewer displays only two chromosomes
simultaneously. However, these programs are complementary
and their combined usage is extremely powerful
for understanding genome organization and evolution. We
are using GenomePixelizer to analyze the evolution of
Nucleotide Binding Site–Leucine Rich Repeat (NBS–LRR)
encoding genes in Arabidopsis relative to genome
duplication events.
© Oxford University Press 2002
PROGRAM CAPABILITIES
GenomePixelizer generates images of one or more
genomes. The positions of user-selected sets of
genes are displayed along the chromosomes based
on either physical or genetic distances (in Mb or
cM respectively). The source of sequences is not restricted to one organism; relationships between different
genomes can be displayed (e.g. cytochrome P450 genes
in Arabidopsis thaliana and Caenorhabditis elegans,
http://niblrrs.ucdavis.edu/GenomePixelizer/Examples/GenoPix_Example_arab-worm-inter.html).
Two user-defined characteristics are available for each gene: the
position above or below the chromosome (the direction
of transcription is currently the default in the program)
and the color of the element (e.g. the type of gene or
the presence of particular motifs). GenomePixelizer
generates HTML ImageMap tags for ‘clickable’ links to
databases such as MIPS that provide detailed information
for each gene. Adjustable levels of sequence similarity
between genes are indicated by colored lines joining the
pairs of genes compared. The patterns generated allow
the easy identification of duplicated genomic regions
(Figure 1). This allows comparisons between patterns of
duplication for different families of genes, investigations
of the occurrence of large versus local duplications and
deletions as well as studies of macro- and micro-synteny.
Fig. 1. Screenshot from GenomePixelizer showing cytochrome
P450 genes distributed over the five chromosomes of Arabidopsis.
Genes with greater than 75% predicted amino acid identity are
joined by lines. A dialog box containing the gene ID or
additional information, obtained by clicking on an individual
element, is shown in the lower right corner.
FEATURES OF GENOMEPIXELIZER
• Displays user-defined features for selected genes
throughout whole genome(s).
• Images fit into a single screen without scrolling. It is
also possible to generate larger images with a built-in
scroll-bar.
• Simple and flexible input file that can be set up, edited and
modified using a spreadsheet editor (e.g. MS Excel).
Individual genes can easily be added, deleted or
modified.
• Minimal modification to the input file provides zoom-in
functionality and allows the viewing of regions of
high gene density in greater detail.
• Regions with high gene density can be drawn using
automatic or manual correction to display overlapping
gene symbols.
• Source code is freely available and new features can be
added with minimal code modifications.
• Images can be captured by any screenshot program
and incorporated into Web pages. Images may also be
saved as a PostScript file and then converted to
GIF or PNG format.
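The 'clickable' links mentioned above rely on standard HTML image maps, in which each gene symbol corresponds to one `<area>` element. The following is a minimal sketch in Python (GenomePixelizer itself is written in Tcl/Tk). It assumes each gene is drawn as a small rectangle. The function name `image_map`, the rectangle tuples and the `base_url` prefix are hypothetical and are not taken from the program.

```python
from html import escape
from urllib.parse import quote


def image_map(genes, base_url, map_name="genome"):
    """Build an HTML <map> with one rectangular <area> per gene.

    genes    -- iterable of (gene_id, x1, y1, x2, y2) pixel rectangles
                (an assumed layout, not GenomePixelizer's internal one)
    base_url -- hypothetical database URL prefix; the gene ID is appended
    """
    lines = [f'<map name="{escape(map_name, quote=True)}">']
    for gene_id, x1, y1, x2, y2 in genes:
        label = escape(gene_id, quote=True)
        href = escape(base_url + quote(gene_id), quote=True)
        lines.append(
            f'  <area shape="rect" coords="{x1},{y1},{x2},{y2}" '
            f'href="{href}" alt="{label}" title="{label}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


# One gene symbol occupying a 4 x 8 pixel box on the image.
html_map = image_map(
    [("At1g01010", 10, 20, 14, 28)],
    "https://example.org/gene?id=",
)
```

The resulting map would be attached to the saved image with `<img usemap="#genome" ...>`, so that clicking a gene symbol opens its database entry.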
IMPLEMENTATION
GenomePixelizer requires three files. The startup file
specifies the names of the input file and the distance matrix
file as well as the number and size of chromosomes,
the upper and lower levels of sequence similarity, the
horizontal and vertical dimensions of the image, and other
optional parameters. The input file contains the gene IDs,
gene coordinates, and gene features defined by the user. The
distance matrix file contains pairs of gene IDs and their
percentage similarity or identity as defined by the user.
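To make the roles of the input file and the distance matrix file concrete, the sketch below parses both in Python. The exact file layouts are not given in this note, so the tab-separated column orders used here (chromosome, gene ID, position, W/C orientation, color; and gene ID, gene ID, percentage) are assumptions for illustration only, as are the names `Gene`, `read_genes` and `read_similarities`.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Gene:
    chromosome: int   # chromosome number
    gene_id: str
    position: float   # physical (Mb) or genetic (cM) coordinate
    above: bool       # True: drawn above the chromosome
    color: str        # user-defined feature shown as a color


def _rows(handle):
    """Yield tab-separated rows, skipping blank lines and '#' comments."""
    for row in csv.reader(handle, delimiter="\t"):
        if row and not row[0].startswith("#"):
            yield row


def read_genes(handle):
    """Parse the ASSUMED input-file layout into Gene records."""
    genes = []
    for chrom, gene_id, pos, side, color in _rows(handle):
        # Assumption: 'W' (Watson) is drawn above, anything else below.
        genes.append(Gene(int(chrom), gene_id, float(pos), side == "W", color))
    return genes


def read_similarities(handle):
    """Parse the ASSUMED distance-matrix layout into (id, id, percent)."""
    return [(a, b, float(score)) for a, b, score in _rows(handle)]


# Hypothetical sample data standing in for the two files.
genes = read_genes(io.StringIO(
    "# chr\tid\tpos\tside\tcolor\n"
    "1\tgeneA\t0.5\tW\tred\n"
    "2\tgeneB\t3.2\tC\tblue\n"
))
pairs = read_similarities(io.StringIO("geneA\tgeneB\t82.5\n"))
```

Keeping one gene per row is what makes such a file easy to edit in a spreadsheet: adding, deleting or recoloring a gene is a single-row change.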
GenomePixelizer reads the startup file and draws the
chromosomes within the window according to their specified sizes. It then reads the input file and places each gene
either below or above the chromosomes. The default positions of genes above or below the chromosomes correspond to Watson/Crick orientation; however, the user may
assign any other binary characteristic. Each gene is represented by a colored element. The color scheme is flexible
and customizable; it may reflect any feature defined by the
user, such as the type of gene or the presence of particular motifs. Simultaneously, GenomePixelizer generates a
separate file with HTML ImageMap tags that can be used
to create Web pages with clickable images. Finally, the
program reads the distance matrix file and draws lines
between those pairs of genes whose similarity falls within the
upper and lower levels defined in the startup file. Examples and detailed
documentation are available at:
http://niblrrs.ucdavis.edu/GenomePixelizer/GenomePixelizer_Welcome.html.
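The placement and line-drawing steps described above can be sketched as two small Python functions. This is an illustration under stated assumptions, not the program's Tcl/Tk code: it assumes a horizontal chromosome, linear scaling of the gene coordinate to pixels, and similarity limits that are inclusive at both ends. The names `place_gene` and `links_to_draw` and the `offset` parameter are hypothetical.

```python
def place_gene(position, chrom_length, x_left, x_right, y_axis, above, offset=6):
    """Return the (x, y) pixel position of a gene symbol.

    The gene coordinate (Mb or cM) is scaled linearly onto a horizontal
    chromosome drawn from x_left to x_right at height y_axis. The symbol
    is shifted by `offset` pixels above or below the chromosome line.
    """
    if not 0 <= position <= chrom_length:
        raise ValueError("position lies outside the chromosome")
    x = x_left + (x_right - x_left) * position / chrom_length
    # Screen coordinates grow downward, so "above" means a smaller y.
    y = y_axis - offset if above else y_axis + offset
    return round(x), y


def links_to_draw(pairs, lower, upper):
    """Keep gene pairs whose similarity lies within [lower, upper].

    Inclusive bounds are an assumption; the note only says "within".
    """
    return [(a, b, s) for a, b, s in pairs if lower <= s <= upper]


# A gene at 15 Mb on a 30 Mb chromosome drawn between x = 100 and x = 900.
xy = place_gene(15.0, 30.0, 100, 900, 200, above=True)

# Hypothetical similarity values; only pairs between 75% and 95% are linked.
kept = links_to_draw(
    [("a", "b", 80.0), ("a", "c", 60.0), ("b", "c", 100.0)],
    lower=75,
    upper=95,
)
```

Raising the lower limit hides weaker matches, which is one way the adjustable similarity levels can expose only recent duplications.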
ACKNOWLEDGEMENTS
This work was supported by a grant from the National Science Foundation Plant Genome Program (Award no. DBI-9975971) and the USDA IFAFS Plant Genome Program
(Award no. 00-52100-9609). We thank Blake Meyers for
critical reading of the manuscript.