GenomePixelizer - a visualization tool for comparative genomics within and between species.

A. Kozik, E. Kochetkova, and R. Michelmore (Department of Vegetable Crops, UC Davis, CA)

[Figure: GenomePixelizer main interface. The program reads the Run Setup file by default at start-up.]

[Figure: The GenomePixelizer "Matrix Color Tuner" procedure lets the user assign colors to similarity/identity lines dynamically, based on the distance matrix file data, without changing the source input file.]

We developed a genome visualization program, GenomePixelizer, to study evolutionary patterns of specific gene families in whole genome(s). GenomePixelizer generates custom images of the physical or genetic positions of specified sets of genes in one or more genomes or parts of genomes. The positions of user-selected sets of genes are displayed along the chromosomes based on either physical or genetic distances. Multiple sets of genes can be shown simultaneously, each with user-defined characteristics. The program allows the analysis of duplication events within and between species by displaying user-adjustable levels of sequence similarity. This makes it possible to compare patterns of duplication for different gene families, to investigate the occurrence of large versus local duplications and deletions, and to study macro- and micro-synteny.

We are using GenomePixelizer to study the evolution of NBS-LRR encoding genes in comparison to other families of similar size, such as cytochrome P450 and receptor kinase encoding genes in Arabidopsis, both at the whole genome level and at the level of individual clusters. We are also adapting GenomePixelizer to display homologs identified in EST libraries for comparative studies.

The program is written in Tcl/Tk and works on any computer platform that supports the Tcl/Tk toolkit. GenomePixelizer generates HTML ImageMap tags for each gene, allowing links to databases. GenomePixelizer is distributed under the GNU General Public License.
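The user-adjustable identity levels described above can be illustrated with a small sketch. This is not GenomePixelizer's Tcl/Tk code; it is a Python illustration that assumes each matrix row holds two gene IDs, an identity fraction, and a line color, and that the chromosome number can be read from the third character of an Arabidopsis gene ID (for example `At4g16890` lies on chromosome 4). The names `Pair`, `parse_matrix`, `select_pairs`, and `chromosome_of` are hypothetical.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    gene_a: str
    gene_b: str
    identity: float  # fraction between 0 and 1, as in the matrix file
    color: str       # color of the line drawn between the two genes


def parse_matrix(text: str) -> list[Pair]:
    """Parse whitespace-separated rows: gene ID, gene ID, identity, color."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        gene_a, gene_b, identity, color = fields
        pairs.append(Pair(gene_a, gene_b, float(identity), color))
    return pairs


def chromosome_of(gene_id: str) -> int:
    """Assumption: nuclear Arabidopsis IDs look like At<chromosome>g<number>."""
    return int(gene_id[2])


def select_pairs(pairs: list[Pair], lower_pct: float, upper_pct: float) -> list[Pair]:
    """Keep pairs whose identity lies between the lower and upper levels."""
    lower, upper = lower_pct / 100, upper_pct / 100
    return [p for p in pairs if lower <= p.identity <= upper]


# A few rows in the assumed matrix layout.
SAMPLE = """\
At4g16890 At4g16950 0.901 orange
At1g34210 At1g71830 0.900 purple
At4g13290 At4g13310 0.895 green
At1g01600 At4g00360 0.889 green
At1g53440 At1g53430 0.883 purple
"""

pairs = parse_matrix(SAMPLE)
kept = select_pairs(pairs, lower_pct=89, upper_pct=100)
# Pairs whose members sit on different chromosomes.
between = [p for p in pairs if chromosome_of(p.gene_a) != chromosome_of(p.gene_b)]

for p in kept:
    print(p.gene_a, p.gene_b, p.identity, p.color)
```

Raising or lowering `lower_pct` changes which connecting lines would be drawn, which is the kind of adjustment used to compare local clusters with larger duplications.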
Detailed program description, source code, examples, and documentation are freely available at: http://niblrrs.ucdavis.edu/GenomePixelizer/

Run Setup file:

1. name of file containing gene coordinates: ./Trio_NBS_P450_PKLRR_Input
2. name of the distance matrix file: ./Trio_NBS_P450_PKLRR_Matrix_Color
3. number of chromosomes: 5
4. size of chromosomes: 30 20 24 18 27
5. identity upper level: 100
6. identity lower level: 75
7. window size (pixels) X: 960
8. window size (pixels) Y: 720
9. html prefix: http://mips.gsf.de/cgi-bin/proj/thal/search_gene?code=
10. Title: NBS, P450, PK-LRR clustering in Arabidopsis, 75% identity
11. Laboratory: (Michelmore lab, UCD)
########################################################
##### for experienced users below this line ########
12. W/C correction: A
13. horizontal size of gene: 9
14. vertical size of gene: 4
15. W/C coefficient: 1
16. W/C correction value: 6
17. chromosome thickness: 5
18. gene feature mode (standard [std] or extended [ext]): std

[Figure: The Canvas editor lets the user add text and graphical labels to images generated by GenomePixelizer.]

[Figure: The GenomePixelizer "Gene Painter" procedure lets the user paint different sets of genes in different colors in batch mode, dynamically, without rerunning the project.]

Program output: graphical genomic comparison of clustering of three gene families.

Gene Coordinates (Input):

Chromosome #   Gene ID     Position on chromosome   "Watson/Crick" orientation   Gene "property"
. . .
5              At5g63410   25.395                   C                            purple
5              At5g63450   25.408                   C                            green
5              At5g65240   26.074                   C                            purple
5              At5g66900   26.714                   C                            orange
5              At5g66910   26.718                   C                            orange
5              At5g67200   26.813                   C                            purple
5              At5g67280   26.842                   C                            purple
5              At5g67310   26.855                   C                            green
1              At1g01280   0.112                    W                            green
1              At1g01600   0.219                    W                            green
1              At1g04210   1.114                    W                            purple
1              At1g05700   1.709                    W                            purple
1              At1g07560   2.327                    W                            purple
1              At1g08590   2.718                    W                            purple
1              At1g09970   3.252                    W                            purple
1              At1g10860   3.612                    W                            purple
1              At1g11600   3.902                    W                            green
1              At1g11680   3.938                    W                            green
. . .

Identity Matrix File:

Gene ID     Gene ID     Identity level between pair of genes   Line color coding
. . .
At4g16890   At4g16950   0.901                                  orange
At1g34210   At1g71830   0.900                                  purple
At4g16860   At4g16920   0.900                                  orange
At4g13290   At4g13310   0.895                                  green
At3g44480   At3g44670   0.894                                  orange
At2g30750   At2g30770   0.893                                  green
At1g01600   At4g00360   0.889                                  green
At4g31940   At4g31950   0.886                                  green
At1g34540   At3g56630   0.885                                  green
At4g31940   At4g31970   0.885                                  green
At1g61180   At1g61190   0.884                                  orange
At3g26190   At3g26200   0.883                                  green
At4g12310   At4g12320   0.883                                  green
At1g53440   At1g53430   0.883                                  purple
. . .

[Figure: The GenomePixelizer "Locus Zoomer" procedure lets the user zoom, in semi-automatic mode, into regions of interest and generate subprojects by extracting data from the whole dataset.]

[Figure: Distribution of NBS-LRR (putative resistance genes), cytochrome P450, and PK-LRR (protein kinases) in the Arabidopsis genome. GenomePixelizer color scheme: NBS-LRR - orange, cytochrome P450 - green, PK-LRR - purple. Lines connect genes with identity of 75% or higher.]

Example Project: Fine Dissection of Segmental Duplications in Arabidopsis Genome using GenomePixelizer

[Figure: Segmental Duplications in Arabidopsis Genome. Legend: NBS-LRR, cytochrome P450, PK-LRR. Colored lines connect genes with identity of 80% or higher. The color scheme of the lines showing identity is chosen to make the different pairs of chromosomes easy to distinguish.]

Project implementation:

1. Data collection: gene coordinates and protein sequences (predicted ORFs) from the MIPS Arabidopsis database [http://mips.gsf.de]
2. Data collection: Functional Categories (FUNCAT) for the set of genes from the PEDANT database [http://pedant.gsf.de/]
3. Generation of the matrix file by processing the results of a "genome against genome" FASTA search.
4. Running of GenomePixelizer with the whole set of genes (~26,000).
5. Selection of the region of interest, and data extraction for a subproject using the "Locus Zoomer" procedure.
6. Re-running of GenomePixelizer with the selected set of genes, displaying different levels of identity (60% and 40%, respectively) using the "Matrix Color Tuner" procedure.
7. Gene coloring according to MIPS Functional Categories using the "Gene Painter" procedure.

GenomePixelizer automatically generates HTML ImageMap tags for each gene, allowing Web links to databases.
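As a rough illustration of the ImageMap output, the sketch below turns one gene record into an HTML `<area>` tag. The window size, chromosome sizes, gene box size, and html prefix are the values from the Run Setup file; everything else is an assumption made for this example and not taken from GenomePixelizer itself: the layout (chromosomes as stacked horizontal bars, Watson genes above the axis and Crick genes below), the `MARGIN` value, and the function names `gene_box` and `area_tag`.

```python
from html import escape

# Values taken from the Run Setup file.
WINDOW_X, WINDOW_Y = 960, 720
CHROMOSOME_SIZES = [30, 20, 24, 18, 27]
GENE_W, GENE_H = 9, 4
HTML_PREFIX = "http://mips.gsf.de/cgi-bin/proj/thal/search_gene?code="

MARGIN = 40  # hypothetical left/right margin in pixels


def gene_box(chromosome: int, position: float, strand: str) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) pixel coordinates for one gene.

    Assumed layout: the longest chromosome spans the drawable width, and
    chromosome n is drawn on the n-th of equally spaced horizontal rows.
    """
    scale = (WINDOW_X - 2 * MARGIN) / max(CHROMOSOME_SIZES)  # pixels per position unit
    row_height = WINDOW_Y / (len(CHROMOSOME_SIZES) + 1)
    left = round(MARGIN + position * scale)
    axis_y = round(row_height * chromosome)
    # Assumption: Watson (W) genes sit above the axis, Crick (C) genes below.
    top = axis_y - GENE_H if strand == "W" else axis_y
    return left, top, left + GENE_W, top + GENE_H


def area_tag(gene_id: str, chromosome: int, position: float, strand: str) -> str:
    """Build a rectangular ImageMap area linking the gene to the database."""
    left, top, right, bottom = gene_box(chromosome, position, strand)
    href = escape(HTML_PREFIX + gene_id, quote=True)
    return (f'<area shape="rect" coords="{left},{top},{right},{bottom}" '
            f'href="{href}" alt="{escape(gene_id)}">')


# Two records from the gene coordinates input.
genes = [
    ("At1g01280", 1, 0.112, "W"),
    ("At5g63410", 5, 25.395, "C"),
]

tags = [area_tag(*gene) for gene in genes]
image_map = '<map name="genome">\n' + "\n".join(tags) + "\n</map>"
print(image_map)
```

An `<img>` element that references the map through `usemap="#genome"` would then make each gene box clickable.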