Phage Lab III - Generic Genome Browser of WUSTL Phages

Phage Lab III - Annotation, pulling it all together

In Lab I we practiced using the genome browser to see the results of Glimmer, GeneMark and GeneMark coding potential. In Lab II we practiced using DNA Master to collect information on Shine-Dalgarno (SD) scores and start codons, and running BLAST searches to find and collect information on previously annotated genes. In this session we will practice collecting all these results together for a couple of genes in Etude before you start work on your own section.

Gene 2 or Gene 3

Start by collecting all the information you need for the region around the Glimmer 00002 prediction, around 450-900, or Glimmer 00003, in the region from 800-1100. Gene 2 is easy; Gene 3 is a little more challenging. For your chosen region, collect the relevant information by filling out the numbered items below.

Let's define the genomic environment. Since genes rarely overlap by more than 10 bases or so, knowing where the neighboring genes are will help us annotate because of the “tight pack” rule.

1. Location and strand of gene to the left:
2. Location and strand of gene to the right:

One strong signal that indicates the presence of genes is the presence of gene models from gene predictors. Let's collect that information for this region:

3. Location (if present) of Glimmer prediction:
4. Location (if present) of GeneMark prediction:

Since the presence of tRNAs would preclude a gene in this region, let's look for them here.

5. Location (if present) of any tRNAs predicted by tRNAscan-SE or ARAGORN:

Coding Potential is also a good signal that indicates the presence of a gene, and an annotator would try to pick a start codon that begins the gene BEFORE there is any high coding potential. So what are the results of the Coding Potential analysis for this region?

6. Region (if any) showing Coding Potential:
7. Basic description of the shape of the Coding Potential (most important is to note where the values start to get above about 50%):

If there really is a gene here, there has to be a start codon. Usually there are several start codons, any one of which could be the one used by the phage. Remember from your reading that Mycobacteria can use ATG, TTG and GTG for start codons. Since some start codons are used much more commonly than others, you need to collect the exact sequence of each start codon. Also, each start codon has an SD score; these are helpful in picking among start codons. List all the start codons that you consider to have a reasonable possibility of being the start codon used by the phage (be sure to include the SD score as well as the sequence and the position). Recall the “tight pack rule” and its exceptions when considering what is “reasonable”. If you have forgotten them, go back to the Introduction to Annotation and reread the section on the “tight pack rule”.

8. List the top candidate starts that are reasonable for this region and frame. (Because of the tight pack rule, “reasonable start codons” will be found from about a 15 bp overlap with the upstream gene to about a 50 bp gap. If there are no “reasonable” start codons, you will need to widen your search to “unusual start codons”, say 30 bp overlap to 150 bp gap. Failing that, look for “very unusual start codons”, say 45 bp overlap or as far downstream as necessary.):

Finally, we want to see if there are any very similar proteins that have been previously annotated, and find the start codon used in these other annotations.

9. What is the start coordinate of the DNA sequence you used in your BLAST search? (You need to know this so you can interpret your BLAST results.):
10. List the number of significant BLAST alignments (i.e. > 95% identical) with previously annotated proteins:
11. Describe the types of alignments, focusing on the starting coordinates (The evidence here is: “does the BLAST alignment agree or disagree with the start you used when you picked your query in step 9?”):

Finally, annotation! Start by deciding if there is a gene at all in this region. For this section, list all the observations that support and refute the presence of a gene in this region and come to a conclusion (this will probably be 2-4 sentences).

12. Decide on presence of a gene:

If you have decided that there is a gene in this region, you must now pick the start codon best supported by the above observations. Using what you now know about genes in mycobacteriophages and the “Guiding Principles”, you must interpret the observations above and come to a conclusion. There are two ways you can do this. Start by listing the possible starts and then add your observations to each start, noting whether that observation supports or refutes the start; OR start with a list of each observation and, under each, list the starts that are supported and refuted by that observation. Pick one of these two methods and enter your interpretation.

13. Interpretation:

Finally, make your decision and write a summary of your analysis, including an indication of how confident you are in your conclusions (again 2-4 sentences).

14. Conclusion:

Your data:

Now repeat the analysis on another region of your choice in Etude. Fill in all 14 items.

1. Location and strand of gene to the left:
2. Location and strand of gene to the right:
3. Location (if present) of Glimmer prediction:
4. Location (if present) of GeneMark prediction:
5. Location (if present) of any tRNAs predicted by tRNAscan-SE or ARAGORN:
6. Region (if any) showing Coding Potential:
7. Basic description of the shape of the Coding Potential:
8. List all the potential starts that are reasonable for this region and frame:
9. What are the start and end coordinates of the DNA sequence you used in your BLAST search:
10. List the number of significant BLAST alignments with previously annotated proteins:
11. Describe the locations of alignments, focusing on the starting coordinates (i.e. do both proteins start at the same amino acid, or does one protein have extra or missing amino acids at the start?):
12. Decide on presence of a gene:
13. Interpretation:
14. Conclusion:
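As a bookkeeping aid for item 8, the overlap and gap windows quoted above can be written down as a small Python function. This is only a sketch: the function name `classify_start` and its arguments are made up for illustration, it assumes both genes are on the forward strand, and it assumes coordinates are 1-based with `upstream_end` being the last base of the upstream gene's stop codon. The thresholds are the approximate ones given in this worksheet.

```python
def classify_start(start, upstream_end):
    """Label a candidate start by its distance from the upstream gene.

    Hypothetical helper. Assumes forward-strand genes and 1-based
    coordinates; upstream_end is the last base of the upstream gene.
    """
    # Positive gap = bases between the two genes; negative gap = overlap.
    gap = start - upstream_end - 1
    if -15 <= gap <= 50:
        return "reasonable"
    if -30 <= gap <= 150:
        return "unusual"
    if gap >= -45:
        # Up to 45 bp overlap, or as far downstream as necessary.
        return "very unusual"
    return "outside the windows listed in this worksheet"


if __name__ == "__main__":
    # Example coordinates are invented, not taken from Etude.
    for candidate in (436, 451, 520, 700):
        print(candidate, classify_start(candidate, upstream_end=450))
```

The labels only tell you which search window a candidate falls in. They do not replace the SD score, coding potential, and BLAST evidence you weigh in items 12-14.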
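To see where the list of possible starts for a gene comes from, the sketch below walks upstream from a stop codon, one codon at a time in the same frame, and records every ATG, GTG or TTG until it reaches the previous in-frame stop codon. It is an illustration, not a description of how DNA Master works. The function name `candidate_starts` is hypothetical, only the forward strand is handled, and SD scores are not calculated (take those from DNA Master).

```python
START_CODONS = {"ATG", "GTG", "TTG"}   # starts listed in this worksheet
STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_starts(seq, stop_start):
    """Return (position, codon) for each in-frame start upstream of a stop.

    Hypothetical helper. seq is forward-strand DNA; stop_start is the
    1-based position of the first base of the gene's stop codon.
    """
    seq = seq.upper()
    i = stop_start - 1                 # convert to 0-based index
    if seq[i:i + 3] not in STOP_CODONS:
        raise ValueError("stop_start does not point at a stop codon")
    found = []
    i -= 3
    while i >= 0:
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:       # previous in-frame stop ends the search
            break
        if codon in START_CODONS:
            found.append((i + 1, codon))   # report 1-based position
        i -= 3
    return sorted(found)


if __name__ == "__main__":
    # Toy sequence, invented for illustration.
    toy = "TAAATGCCCGTGAAATTGCCCTAG"
    print(candidate_starts(toy, stop_start=22))
```

Every start it returns is only a candidate; which ones are “reasonable” still depends on the tight pack rule and the other evidence.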
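For items 10 and 11, the comparison of starting coordinates can be sketched as follows. The function name `compare_starts` and its parameters are hypothetical. It assumes you have read three numbers off one BLAST alignment: the percent identity, and the amino acid positions at which the alignment begins in your query and in the previously annotated protein (both 1-based). The 95% cutoff is the one given in item 10.

```python
def compare_starts(query_start, subject_start, identity, min_identity=95.0):
    """Describe how one BLAST alignment relates to your chosen start.

    Hypothetical helper. query_start and subject_start are 1-based amino
    acid positions where the alignment begins; identity is in percent.
    """
    if identity <= min_identity:
        return "not significant by the > 95% identity criterion"
    if query_start == 1 and subject_start == 1:
        return "both proteins start at the same amino acid"
    if query_start > 1 and subject_start == 1:
        extra = query_start - 1
        return f"query has {extra} extra amino acid(s) at the start"
    if query_start == 1 and subject_start > 1:
        extra = subject_start - 1
        return f"annotated protein has {extra} extra amino acid(s) at the start"
    return "neither start is aligned; inspect the alignment by hand"


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(compare_starts(1, 1, 99.2))
    print(compare_starts(12, 1, 98.0))
    print(compare_starts(1, 8, 97.5))
```

If the query has extra amino acids, the earlier annotation used a start downstream of yours; if the annotated protein has extra amino acids, it used a start upstream of yours. Either result is evidence to record, not a decision in itself.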