BIO 315 BIOINFORMATICS QUIZ #1

NAME: SHARDAE OLIVER

PROBLEMS

Problem #1
Read the 25 base DNA sequence from the following chromatogram. [Note: If more than 25 bases are shown, read 25 consecutive bases from the first easily read base]
[Key: Red = T, Green = A, Blue = C, Black = G]

a. Write the strand that you read on top and below it write the complementary strand. Be sure to label 5’ and 3’.

b. Briefly describe 2 advantages of this type of sequencing compared to conventional sequencing. One advantage must be related to an advance in the informatics part of modern sequencing.

Problem #2
Consider the area below that is highlighted in blue. How would you evaluate if this is an area of the chromatogram considered to contain reliable sequence information? How did the computer interpret this information to come up with the sequence: CCCC (i.e. what criteria were used)?

Problem #3
Using the sequence below, do a BLASTn search of the nr database restricted to Homo sapiens to identify the gene this sequence comes from. You may want to check the ‘Exclude: Models’ box.

AGGATGGATATGACTTAGTGCAGGA

a. Write down the E value from your first hit in the list that has an accession number beginning with “NM_” from the species Homo sapiens. Complete the pertinent information below for this hit.

E value: _________________
Max score: __________________
Make a screenshot showing where you got the E value and Max score.
Accession-Version #: ____________________

The following about the gene and source of information (i.e. database &/or site):
Official Gene Symbol: ___________________________________________
Official Gene Name: ____________________________________________

Based on the accession number you can tell from which database this sequence comes. What type of database is it? To answer this, I want you to provide a brief description of the type, not the name or details of what is contained in this database.
_________________________________________________________________

Check the sequence revision history of this sequence. When was the entry first seen at NCBI?
_________________________________________________________________

b. Now determine what disease or disorder this gene is related to and find a description of the phenotype associated with the primary disease/disorder. Briefly describe where you went to retrieve this information and how you got there. Write down the name of the disease(s) below. Make a screenshot w/ a link of the description of the phenotype(s) and add it to your Quiz Results file. Do not include a lot of extraneous information in your screenshot. If this gene is associated with more than one disease, list all of the diseases but only select one to describe the phenotype.

Name of the disease(s): _________________________________________
How retrieved:

c. Find the following information about this gene and indicate where you found the information.

On what chromosome and position is the gene located? __________________
What is the size of the mRNA(s)? (Include units) ______________________
What size is the protein? (Include units) ______________________________

d. Analyze this sequence with the ORF Finder. Print your results to a pdf, naming the file: NameNameQues3d. Then make a screenshot of the area showing the reading frames: circle the correct reading frame in both the table and the diagram (be specific). Did these results match what you expected from what you have already learned about this gene? Briefly explain (e.g. what matched and where did you find the info). Be sure to indicate what the ORF Finder suggested and confirm this information with the GenBank entry. Note any discrepancies.

The correct reading frame was: ________________________________________
Include the frame and region (e.g. base positions)
Results matched? Explain.

e. Go to Map Viewer.
Bring up the gene and set your screen so that it shows the following parameters:

    Model Transcripts
    Ideogram
    Phenotype
    Homo sapiens UniGene Clusters (Hs_UniG)
    Genes (make this your Master)

What is UniGene? ______________________________________
If you choose to use abbreviations, you must indicate what they stand for. For example: HSP in a BLAST search; HSP stands for High Scoring Pair. You can use the abbreviation DNA without its definition.

Set a region of 50,000 bp that encompasses your gene. Print your results to a pdf and then make a screenshot of the area showing your gene. Circle your gene on one of the maps. Name the printout file: NameNameQues3e1

Now set the region to encompass 5,000,000 bp. Print your results to a pdf and then make a screenshot of the area showing your gene. Circle your gene on one of the maps. Name the printout file: NameNameQues3e2

Problem #4
A BLAST search is done and one of the hits has an E value of 5 × 10^-5 and another has an E value of 5 × 10^-2.

a. Based on these values, which do you expect to be more closely related to your query sequence?

b. One of the hits had an E value of 2. Under what circumstances might you be interested in this match?

Problem #5
Consider the sequence below. For ORF 2-, locate the longest possible ORF. How long is it (be sure to indicate units)? Assume that you have a full length sequence and can accurately identify where the ORF begins and ends. Show your work. Do this manually, not with ORF Finder.

5’CGTCATAGATTACATGGGTTCATGCATTACCATG3’
3’GCAGTATCTAATGTACCCAAGTACGTAATGGTAC5’

TERMS

Choose one term from the list below and give a brief description in your own words. Your description should indicate what the term refers to and its significance to bioinformatics. Write your answer on the back of this page. If you base any part of your answer on information that you look up, include the reference that you used.

    In silico
    Genome
    HGP
    BLAST
    BAC/YAC
    EST

EXTRA CREDIT

Do you agree with the sequence data in this region that was determined by the computer? If not, by what criteria would you modify the sequence?
FYI: The quality scores were as follows: A (9), C (9), A (8)

QUICK CHECKLIST

Screenshots
    3a of E value/Max score
    3b w/ links
    3d
    3e, both views
        Model Transcripts
        Ideogram
        Phenotype
        Hs_UniG
        Gene

Printouts (i.e. print to a pdf)
    3d
    3e, both views
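
NOTE ON QUALITY SCORES

The extra-credit question quotes per-base quality scores. If those scores are Phred-scaled (an assumption; the quiz does not say which scale the base caller used), each score Q corresponds to an estimated error probability P through the standard relation Q = -10 * log10(P). The short Python sketch below applies that relation to the three quoted scores. The function name phred_to_error_prob is made up for this illustration and is not part of any tool used in the quiz.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score Q into an error probability.

    Rearranges the standard Phred definition Q = -10 * log10(P)
    to P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


# Scores quoted in the extra-credit question.
# Assumption: they are Phred-scaled base-call qualities.
calls = [("A", 9), ("C", 9), ("A", 8)]

for base, q in calls:
    p = phred_to_error_prob(q)
    print(f"{base}: Q={q} -> estimated error probability {p:.3f}")
```

For scale, Q = 10 means an estimated 1 in 10 chance that the base call is wrong, and Q = 20 means 1 in 100. Whether a given score is acceptable depends on the cutoff your lab or software applies.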