Creating a Venn diagram and list for unique genes from RAST

Purpose: To provide a Venn diagram chart and a compiled list comparing unique and shared genes among a reference organism and four others.

Scope: This SOP is for anyone working in the Newman lab, or who has obtained a copy of the Excel spreadsheet template, and wants to make genomic comparisons between organisms, particularly bacteria. This SOP is focused on an Excel spreadsheet template that was created in the Newman lab, so other labs may have different methods of obtaining unique and shared gene values.

Task: One student laboratory assistant or researcher could complete this entire SOP in approximately one to two hours, depending on the response time of the RAST server. This estimate includes receiving genetic comparison information from RAST and making all required spreadsheets and diagrams. This software was created to make this information easy to obtain; compiling and counting it manually would take far more tedious work.

Instructions:

1. Open Firefox (this is the browser that works best with RAST).
2. Go to rast.nmpdr.org
3. Log in to RAST (username: newmanlab, password: 16srrna1).
4. In the Jobs Overview window, find the organism you wish to focus on by searching the Name column, and click View Details under Annotation Progress.
5. In the Job Details window, click Browse annotated genome in the SEED Viewer.
6. In the Organism Overview window, on the right, click the Compare tab. Then click on the link to sequence based comparison.
7. In the Select Reference Organism window at the top, select the first organism that appears in the list of the five that you have chosen to compare. (NOTE: One of the organisms must be the type species for that genus.) You can type at the top of the box to narrow your selection. The default settings for Completeness, Sort, and Domains can all be left alone.
8. In the Select Comparison Organisms window at the bottom, select the other four organisms in the order in which they appear in the list. After the first organism has been selected, hold the CTRL key to make a multiple selection. (NOTE: Be sure to select the latest version of each genome by checking that the number in parentheses is the highest for that organism.)
9. After all organisms have been selected, click Compute. The comparison is complete when all comparison organisms have a BlastDotPlot icon at the end.
10. Save and export this information for use in Excel by clicking the Export Table button.
11. Name the file “01 Reference organism STRAIN (genome number)” (this can be copied as it appears on the RAST output). Save this file and all subsequent ones in the Newmanlab network space, in the Strain Collection folder under your organism’s proper identification. In the organism’s folder you can make a new folder titled Venn Diagram Comparison.
12. Go back and complete steps 7-11 with each of the other four organisms as the reference, taking them in the order in which they appear in the list. Be sure to use the same strain and genome each time.
13. Once all of the tables have been exported and saved in the network space, go into the Procedures, Protocols, and Manuals folder and click on the folder Venn Diagram Calculator.
14. Copy and save the following files into the Venn Diagram Comparison folder you created earlier: Venn Diagram Generator V 2.0, Venn Diagram template, and Venn Diagram Output Converter.
15. Open the Venn Diagram Data Generator V 2.0 and go to the Paste Data Here tab.
16. Open the file you saved from RAST in the Venn Diagram Comparison folder, starting with 01. This is a Notepad file.
17. Copy all text in the Notepad file by pressing CTRL A, then CTRL C.
18. Go to the Paste Data Here tab and press CTRL V to paste the gene list into the Excel sheet.
19. Proceed to the Gene Counts tab and copy the values listed in the B column.
20. Open the Venn Diagram Output Converter Excel file you copied into your Venn Diagram Comparison folder earlier. Paste the copied values from the Gene Counts tab into the B column.
21. Repeat steps 15-20 for the other four organisms’ RAST outputs, pasting their gene counts into the adjacent C, D, E, and F columns. Be sure to keep the organisms in order by how the files are numbered (1 goes in B, 2 goes in C, and so on).
22. Scroll all the way over to the right in the Venn Diagram Output Converter. This spreadsheet has been set up to calculate the average numbers of genes shared between the different organisms, as well as the unique genes each one retains. These numbers are what go into the Venn diagram.
23. Open the Venn Diagram template PowerPoint file. Enter each organism’s name next to its corresponding number in the template text. The other information about the genome (WGS: Whole Genome Shotgun, CDS: coding sequence length, and number of base pairs) can be obtained in the Genome section of NCBI (select Genome in the drop-down menu next to the search bar, then search for your organism).
24. Enter the average shared gene values that the Venn Diagram Output Converter calculated into the corresponding number sections on the Venn Diagram template.
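
The SOP does not spell out how the Output Converter spreadsheet arrives at its unique and averaged shared gene counts. As a rough illustration only, the Python sketch below shows one plausible way such numbers could be tallied. It assumes each RAST comparison has already been reduced to one row of True/False flags per reference gene (one flag per comparison organism, True meaning a hit was reported), and it assumes that a shared region's value is the average of the counts from every run whose reference organism belongs to that region. The function names, the input layout, and the averaging rule are all hypothetical and are not taken from the lab's template.

```python
from collections import Counter
from statistics import mean


def count_regions(presence_rows, reference, others):
    """Tally the genes of one reference run by the set of organisms carrying them.

    presence_rows: iterable of tuples of booleans, one tuple per reference gene
    and one boolean per comparison organism (assumed layout, not RAST's own).
    """
    regions = Counter()
    for flags in presence_rows:
        members = {reference} | {org for org, hit in zip(others, flags) if hit}
        regions[frozenset(members)] += 1
    return regions


def average_regions(runs):
    """Average each region's count over all runs in which that region appears."""
    collected = {}
    for regions in runs:
        for members, count in regions.items():
            collected.setdefault(members, []).append(count)
    return {members: mean(counts) for members, counts in collected.items()}


# Toy example with two hypothetical organisms, "A" and "B".
run_a = count_regions([(True,), (True,), (False,)], "A", ["B"])
run_b = count_regions([(True,), (True,), (True,), (False,)], "B", ["A"])
averaged = average_regions([run_a, run_b])
```

In this toy example the region shared by A and B is counted as 2 genes when A is the reference and 3 when B is the reference, so its averaged value is 2.5, while each organism keeps 1 unique gene. The counts differ between runs because each run only counts the reference organism's own genes, which is presumably why the spreadsheet reports averages.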
Important Note: Be sure to monitor the number of contigs (found in the NCBI genomic summary) for each genome you use in your comparison; below 100 is ideal. A high number of contigs could mean that the sequence has low coverage and could be incorrect, or that a high level of contamination was present in the sequencing sample.
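
The contig check in this note can be sketched as a small Python helper. This is only an illustration: the contig counts are assumed to have been copied by hand from the NCBI genomic summary, and the genome names, the numbers, and the function name are hypothetical. The limit of 100 comes from the note above.

```python
CONTIG_LIMIT = 100  # "below 100 is ideal", per the note above


def flag_high_contig_genomes(contig_counts, limit=CONTIG_LIMIT):
    """Return the names of genomes with `limit` or more contigs, sorted by name."""
    return sorted(name for name, n in contig_counts.items() if n >= limit)


# Hypothetical genome names and contig counts, for illustration only.
example_counts = {"genome_1": 42, "genome_2": 100, "genome_3": 315}
flagged = flag_high_contig_genomes(example_counts)
```

Here `flagged` holds `genome_2` and `genome_3`, the two genomes that are not below the limit and so deserve a closer look before being used in the comparison.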