SelectionMap: a tool for detection and visualisation of natural selection patterns within codon alignments

I. Description
SelectionMap is a free Windows-based program designed to detect and compare natural selection patterns within homologous genes of related virus species. The program detects selection using the FUBAR [1] method implemented in HyPhy [2] and then produces a selection map illustrating the type and degree of selection at each individual codon site. The program also accepts as input FUBAR and/or MEME [3] output files containing selection data in comma-separated value (csv) format, together with the corresponding codon alignments, and plots the selection map from them. The resulting map is a multi-coloured figure showing sites that are evolving under:
(1) negative selection for the same amino acid,
(2) negative selection for different amino acids,
(3) negative selection in only one species,
(4) positive selection,
(5) episodic positive selection.

II. Download and installation
The program can be downloaded from http://web.cbio.uct.ac.za/~brejnev/downloads/ComputationalTools/
1. Extract the SelectionMap-v1.0.zip file into a temporary folder.
2. In the temporary folder, double-click the file “SETUP.EXE” and follow the instructions of the setup program. Please use the default installation directory “C:\SelectionMap-v1.0”.
3. Inside the installation folder, run the “HyPhy-CLI2.2.6.exe” file to install HyPhy.

III. Running the program
1. Starting the program
To plot a selection map, double-click the SelectionMap-v1.0.exe file. Once the program has launched, you can either load codon alignment(s) and detect selection with FUBAR before plotting the selection map, or load FUBAR and/or MEME csv output file(s) together with their corresponding codon alignment(s) and then plot the map. The start-up screen (Fig 1.) displays these two options for loading the input files.
Fig 1. Start-up screen

2. Loading the input codon alignment(s)
When the user clicks “Load codon alignment(s) in FASTA format”, the “Loading codon alignment(s)” screen (Fig 2.) appears. To load the input, either click on the frame in the middle of the screen or drag and drop the codon alignment file(s) onto the frame. Once the alignments are loaded, click the command button to detect selection.
NB: If two or three alignments are loaded, they must be profile-aligned before they can be used.
Fig 2. Loading codon alignment(s) screen

3. Types of analysis
There are three types of analysis: (1) FUBAR-MEME, which uses selection data from both FUBAR and MEME, (2) FUBAR, which uses selection data from FUBAR only, and (3) MEME, which uses selection data from MEME only. The analysis can be performed on one virus species or on multiple virus species, in which case the selection patterns are compared directly.

4. Loading the input data file(s)
To load the first input file click “Load data file 1”, then “Load data file 2” to load the second and “Load data file 3” to load the third. Alternatively, drag and drop the input file(s) onto the program picture frame.

5. FUBAR-MEME analysis
The input file for this analysis (Fig 1.) is a text file in comma- or tab-separated format whose first to tenth columns contain the FUBAR output, whose eleventh to twentieth columns contain the MEME output, and whose twenty-first column contains the consensus (most frequent) amino acid at each codon site.
Fig 1. Input format for FUBAR-MEME analysis
NB: If two or three input files are to be compared, the alignments must be profile-aligned and separated before running FUBAR and MEME, to ensure that homologous sites are aligned.

6. FUBAR analysis
The input file for this analysis (Fig 2.) is a text file in comma- or tab-separated format whose first to tenth columns contain the FUBAR output and whose eleventh column contains the consensus (most frequent) amino acid at each codon site.
Fig 2. Input format for FUBAR analysis
NB: If two or three input files are to be compared, the codon alignments must be profile-aligned and separated before running FUBAR, to ensure that homologous sites are aligned.

7. MEME analysis
The input file for this analysis (Fig 3.) is a text file in comma- or tab-separated format whose first to tenth columns contain the MEME output.
Fig 3. Input format for MEME analysis
NB: If two or three input files are to be compared, the codon alignments must be profile-aligned and separated before running MEME, to ensure that homologous sites are aligned.

IV. Program features
Fig 4. Start-up screen
Fig 4. SelectionMap main interface

1. Menus
a. The File menu gives the option to exit the program.
b. The Save menu saves the selection map to disk in either EMF or PNG format.
c. The Copy menu copies the map so that it can be pasted into another program such as MS PowerPoint or MS Word. Alternatively, right-click on the image and select copy.

2. Command buttons
a. The Plot map command button plots the map once the input files have been loaded.
b. The Zoom in and Zoom out command buttons zoom the map in and out respectively.
c. The Set method button selects the method to use. The methods supported are FUBAR-MEME, FUBAR and MEME.

3. The Input data file buttons (see the three labels below the “FUBAR-MEME” caption) load the input files. Once the files have been loaded, their names are displayed on the labels. An easier alternative is to drag and drop the input data files onto the picture frame.

4. The Posterior probability and P-value text boxes set the statistical significance thresholds for selection inferred by FUBAR and MEME respectively.

5. The Delete file button removes the input files.

6. The Colour boxes set the colour used for each type of selection detected.

V. References
Murrell B, Moola S, Mabona A, et al. (2013) FUBAR: a fast, unconstrained bayesian approximation for inferring selection. Mol Biol Evol 30:1196–205. doi: 10.1093/molbev/mst030
Murrell B, Wertheim J, Moola S, et al. (2012) Detecting individual sites subject to episodic diversifying selection. PLoS Genetics. doi: 10.1371/journal.pgen.1002764

Authors: Brejnev Muhire (1), Arvind Varsani (2) and Darren Martin (1)
(1) Institute of Infectious Diseases and Molecular Medicine, Computational Biology Group, University of Cape Town, South Africa
(2) School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch, 8140, New Zealand

Bug reporting: [email protected] or [email protected]
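
Appendix: preparing a FUBAR-MEME input file (illustrative sketch)
The 21-column input layout described under "FUBAR-MEME analysis" in section III (ten FUBAR columns, ten MEME columns, then the consensus amino acid) can be assembled with a short script. The Python sketch below is only an illustration and is not part of SelectionMap. The function names (consensus_amino_acids, merge_rows, to_csv_text) are hypothetical, the standard genetic code is assumed, and the FUBAR and MEME rows are assumed to be already parsed into ten fields per codon site. Whether SelectionMap expects a header row is not stated in this manual, so the sketch handles data rows only.

```python
import csv
import io
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumption:
# the genes of interest use the standard code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def consensus_amino_acids(sequences):
    """Return the most frequent amino acid at each codon site of an alignment.

    `sequences` is a list of aligned nucleotide strings of equal length.
    Codons containing gaps or ambiguous bases are ignored; a site with no
    translatable codon is reported as "-".
    """
    n_codons = len(sequences[0]) // 3
    consensus = []
    for site in range(n_codons):
        counts = Counter()
        for seq in sequences:
            codon = seq[3 * site:3 * site + 3].upper()
            aa = CODON_TABLE.get(codon)  # None for gapped or ambiguous codons
            if aa is not None:
                counts[aa] += 1
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return consensus


def merge_rows(fubar_rows, meme_rows, consensus):
    """Build 21-column rows: 10 FUBAR fields, 10 MEME fields, consensus residue."""
    if not (len(fubar_rows) == len(meme_rows) == len(consensus)):
        raise ValueError("FUBAR, MEME and alignment must cover the same sites")
    merged = []
    for fubar, meme, aa in zip(fubar_rows, meme_rows, consensus):
        if len(fubar) != 10 or len(meme) != 10:
            raise ValueError("expected 10 FUBAR and 10 MEME columns per site")
        merged.append(list(fubar) + list(meme) + [aa])
    return merged


def to_csv_text(rows):
    """Serialise rows as comma-separated text, one codon site per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

For the FUBAR-only layout, the same consensus column would follow the ten FUBAR fields directly. The length check in merge_rows is a cheap guard against combining results from alignments that do not cover the same codon sites.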