Creating a Venn diagram and list for unique genes from RAST

Purpose: To provide a Venn diagram chart and a compiled list comparing unique and shared genes among a reference organism and four others.

Scope: This SOP is for anyone who works in the Newman lab, or who has obtained a copy of the Excel spreadsheet template, and wants to make genomic comparisons between organisms, particularly bacteria. It is built around an Excel spreadsheet template that was created in the Newman lab, so other labs may use different methods to obtain unique and shared gene values.

Task: One student laboratory assistant or researcher can complete this entire SOP in approximately one to two hours, depending on the response time of the RAST server. This estimate includes receiving the genetic comparison information from RAST and making all required spreadsheets and diagrams. The template was created to give easy access to this information, which would otherwise take much more tedious work to compile and count manually.

Instructions:

1. Open Firefox (this is the browser that works best with RAST).
2. Go to rast.nmpdr.org
3. Log in to RAST (username: newmanlab password: 16srrna1).
4. In the Jobs Overview window, find the organism you wish to focus on by searching the Name column, and click View Details under Annotation Progress.
5. In the Job Details window, click Browse annotated genome in the SEED Viewer.
6. In the Organism Overview window, on the right, click the Compare tab. Then click the link to sequence based comparison.
7. In the Select Reference Organism window at the top, select the first organism that appears in the list of the five that you have chosen to compare. (Note: one of the organisms must be the type species for that genus.) You can type at the top of the box to narrow your selection. The default settings for Completeness, sort, and Domains can all be left alone.
8. In the Select Comparison Organisms window at the bottom, select the other four organisms in the order in which they appear in the list. After the first organism has been selected, hold the CTRL key to make a multiple selection. (Note: be sure to select the latest version of each genome by checking that the number in parentheses is the highest for that organism.)
9. After all organisms have been selected, click Compute. The comparison is complete when all comparison organisms have a BlastDotPlot icon at the end.
10. Save and export this information for use in Excel by clicking the Export Table button.
11. Make the file name “01 Reference organism STRAIN (genome number)” (this can be copied as it appears on the RAST output). Save this file and all subsequent files in the Newmanlab network space, in the Strain Collection folder, under your organism’s proper identification. In the organism’s folder you can make a new folder titled Venn Diagram Comparison.
12. Go back and complete steps 7-11 with each of the other four organisms as the reference, taking them in the order in which they appear in the list. Be sure to use the same strain and genome each time.
13. Once all of the tables have been exported and saved in the network space, go into the Procedures, Protocols, and Manuals folder and click on the folder Venn Diagram Calculator.
14. Copy and save the following files into the Venn Diagram Comparison folder you created earlier: Venn Diagram Generator V 2.0, Venn Diagram template, and Venn Diagram Output Converter.
15. Open the Venn Diagram Data Generator V 2.0 and go to the Paste Data Here tab.
16. Open the file you saved from RAST in the Venn Diagram Comparison folder, starting with 01. This is a Notepad file.
17. Copy all text in the Notepad file by pressing CTRL A, then CTRL C.
18. Go to the Paste Data Here tab and press CTRL V to paste the gene list into the Excel sheet.
19. Proceed to the Gene Counts tab and copy the values listed in the B column.
20. Open the Venn Diagram Output Converter Excel file you copied into your Venn Diagram Comparison folder earlier. Paste the copied values from the Gene Counts tab into the B column.
21. Repeat steps 15-20 for the other four organisms’ RAST outputs, pasting their gene counts into the adjacent C, D, E, and F columns. Be sure to keep the organisms in the order in which the files are numbered (1 goes in B, 2 goes in C, and so on).
22. Scroll all the way to the right in the Venn Diagram Output Converter. This spreadsheet has been set up to calculate the average numbers of genes shared between the different organisms, as well as the unique genes each one retains. These numbers are what go into the Venn diagram.
23. Open the Venn Diagram template PowerPoint file. In the template text, enter each organism’s name next to its corresponding number. The other information about the genome (WGS: Whole Genome Shotgun; CDS: coding sequence length; and number of base pairs) can be obtained in the Genome section of NCBI (select Genome in the drop-down menu next to the search bar, then search for your organism).
24. Enter the average shared gene values that the Venn Diagram Output Converter calculated into the corresponding number sections on the Venn Diagram template.

Important Note: Be sure to monitor the number of contigs (found in the NCBI genomic summary) for each genome you use in your comparison; below 100 is ideal. A high number of contigs could mean that the sequence has low coverage and could be incorrect, or that a high level of contamination was present in the sequencing sample.
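
For readers without the Excel template, the counting and averaging that the Venn Diagram Output Converter is described as doing can be sketched in Python. This is only an illustration, not the template's actual formulas, which this SOP does not show. It assumes that each RAST comparison has already been reduced, for every gene in the reference organism, to the set of comparison organisms with a hit, and that a region's final value is the mean of the counts from every run whose reference organism lies in that region. The function names `region_counts` and `average_regions` and the input layout are hypothetical.

```python
from collections import Counter


def region_counts(reference, hits_by_gene):
    """Count the reference organism's genes per Venn region.

    reference    -- name of the reference organism for this run
    hits_by_gene -- one set per reference gene, naming the comparison
                    organisms in which that gene had a hit (assumed input)
    Returns a Counter mapping a region (frozenset of organism names that
    share the gene) to the number of genes in it.
    """
    counts = Counter()
    for hits in hits_by_gene:
        counts[frozenset({reference, *hits})] += 1
    return counts


def average_regions(runs):
    """Average each region over the runs whose reference is in the region.

    runs -- dict mapping reference organism name -> Counter from region_counts
    A run that saw no genes in a region contributes 0 to that region's mean.
    """
    regions = set()
    for counts in runs.values():
        regions.update(counts)
    averaged = {}
    for region in regions:
        values = [runs[ref][region] for ref in region if ref in runs]
        averaged[region] = sum(values) / len(values)
    return averaged


if __name__ == "__main__":
    # Two organisms only, to keep the example short; the SOP uses five.
    runs = {
        "A": region_counts("A", [{"B"}, {"B"}, set(), {"B"}]),
        "B": region_counts("B", [{"A"}, set(), set(), {"A"}, {"A"}, {"A"}, {"A"}]),
    }
    for region, value in average_regions(runs).items():
        print(sorted(region), value)
```

Averaging is needed in this sketch because the number of genes two organisms appear to share can differ depending on which of them served as the reference. A region containing only one organism holds that organism's unique genes.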