BIO 315 BIOINFORMATICS
QUIZ #1
NAME ____SHARDAE OLIVER__________________________________________
PROBLEMS
Problem #1
Read the 25 base DNA sequence from the following chromatogram. [Note: If more than
25 bases are shown, read 25 consecutive bases from the first easily read base]
[Key: Red = T, Green = A, Blue = C, Black = G]
a. Write the strand that you read on top and below it write the complementary strand. Be
sure to label 5’ and 3’.
b. Briefly describe 2 advantages of this type of sequencing compared to conventional
sequencing. One advantage must be related to an advance in the informatics part of
modern sequencing.
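
Note on part a: the complementary strand can be written out by hand or checked with a short script. The Python sketch below is only an illustration. The function names are hypothetical, and the example read is simply the first 10 bases of the Problem #3 query sequence, not the chromatogram answer.

```python
# Base-pairing rules for DNA (assumes only the letters A, C, G, T appear).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq):
    """Return the complementary strand, base by base, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


top = "AGGATGGATA"  # illustrative read only
bottom = complement(top)

# Write the read on top and its complement below, with the ends labeled.
print(f"5'-{top}-3'")
print(f"3'-{bottom}-5'")
```

The complement is printed in the same left-to-right order as the read, so its ends are labeled 3' on the left and 5' on the right.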
Problem #2
Consider the area below that is highlighted in blue. How would you evaluate if this is an
area of the chromatogram considered to contain reliable sequence information? How did
the computer interpret this information to come up with the sequence: CCCC (i.e. what
criteria were used)?
Problem #3
Using the sequence below, do a BLASTn search of the nr database restricted to Homo
sapiens to identify the gene this sequence comes from. You may want to check the
‘Exclude: Models’ box.
AGGATGGATATGACTTAGTGCAGGA
a. Write down the E value from your first hit in the list that has an accession number
beginning with “NM_” from the species Homo sapiens. Complete the pertinent
information below for this hit.
E value: _________________
Max score: __________________
Make a screenshot showing where you got the E value and Max score.
Accession-Version #: ____________________
Provide the following about the gene, along with the source of the information (i.e. database and/or site):
Official Gene Symbol: ___________________________________________
Official Gene Name: ____________________________________________
Based on the accession number you can tell from which database this sequence
comes. What type of database is it? To answer this, I want you to provide a brief
description of the type, not the name or details of what is contained in this
database.
_________________________________________________________________
Check the sequence revision history of this sequence. When was the entry first
seen at NCBI?
_________________________________________________________________
b. Now determine what disease or disorder this gene is related to and find a description of
the phenotype associated with the primary disease/disorder. Briefly describe where
you went to retrieve this information and how you got there. Write down the name of
the disease(s) below. Make a screenshot w/ a link of the description of the
phenotype(s) and add it to your Quiz Results file. Do not include a lot of extraneous
information in your screen shot. If this gene is associated with more than one disease,
list all of the diseases but only select one to describe the phenotype.
Name of the disease(s): _________________________________________
How retrieved:
c. Find the following information about this gene and indicate where you found the
information.
On what chromosome and position is the gene located? __________________
What is the size of the mRNA(s)? (Include units) ______________________
What size is the protein? (Include units) ______________________________
d. Analyze this sequence with the ORF finder. Print your results to a pdf, naming the
file: NameNameQues3d. Then make a screenshot of the area showing the reading
frames: circle the correct reading frame in both the table and the diagram (be
specific). Did these results match what you expected from what you have already
learned about this gene? Briefly explain (e.g. what matched and where did you find
the info). Be sure to indicate what the ORF Finder suggested and confirm this
information with the Genbank entry. Note any discrepancies.
The correct reading frame was: ________________________________________
Include the frame and region (e.g. base positions)
Results matched? Explain.
e. Go to Map Viewer. Bring up the gene and set your screen so that it shows the
following parameters:
Model Transcripts
Ideogram
Phenotype
Homo sapiens UniGene Clusters (Hs_UniG)
Genes (make this your Master)
What is UniGene? ______________________________________
If you choose to use abbreviations, you must indicate what they stand for.
For example: HSP in a BLAST search; HSP stands for High Scoring Pair
You can use the abbreviation DNA without its definition.
Set a region of 50,000 bp that encompasses your gene. Print your results to a pdf and
then make a screenshot of the area showing your gene. Circle your gene on one
of the maps. Name the printout file: NameNameQues3e1
Now set the region to encompass 5,000,000 bp. Print to a pdf your results and then
make a screenshot of the area showing your gene. Circle your gene on one of the
maps. Name the printout file: NameNameQues3e2
Problem #4
A BLAST search is done and one of the hits has an E value of 5 × 10^-5 and another had an
E-value of 5 × 10^-2.
a. Based on these values, which do you expect to be more closely related to your query
sequence?
b. One of the hits had an E-value of 2. Under what circumstances might you be
interested in this match?
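
Note: one way to build intuition for the E-values in this problem (5 × 10^-5, 5 × 10^-2 and 2) is to convert each to the probability of seeing at least one hit that good by chance, using the standard relation P = 1 - e^(-E). The Python sketch below assumes that relation applies to the search in question. The function name is hypothetical.

```python
import math


def evalue_to_pvalue(e_value):
    """Probability of at least one chance hit scoring this well: P = 1 - exp(-E)."""
    # expm1 keeps precision when E is very small.
    return -math.expm1(-e_value)


for e_value in (5e-5, 5e-2, 2.0):
    print(f"E = {e_value:g}  ->  P = {evalue_to_pvalue(e_value):.6f}")
```

For small E the two numbers are nearly equal. They diverge as E approaches and exceeds 1.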
Problem #5
Consider the sequence below. For ORF 2-, locate the longest possible ORF. How long is
it (be sure to indicate units)? Assume that you have a full length sequence and can
accurately identify where the ORF begins and ends. Show your work. Do this manually,
not with ORF finder.
5’CGTCATAGATTACATGGGTTCATGCATTACCATG3’
3’GCAGTATCTAATGTACCCAAGTACGTAATGGTAC5’
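
Note: the manual procedure (pick the strand, pick the frame offset, read codons from the first ATG to the first in-frame stop) can be checked afterwards with a short script. The Python sketch below is an illustration under stated assumptions: frame -k is taken to mean the reverse complement read from offset k-1, which may differ from the convention of a particular tool, and only ATG is treated as a start codon. The function names and the example sequences are hypothetical and are not the quiz sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the opposite strand written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_orf(seq, frame):
    """Longest ATG-to-stop ORF (stop codon included) in one reading frame.

    frame is one of +1, +2, +3, -1, -2, -3. Assumption: frame -k reads the
    reverse complement starting at offset k-1. Returns '' if no ORF is found.
    """
    strand = seq.upper() if frame > 0 else reverse_complement(seq)
    offset = abs(frame) - 1
    # Split the chosen strand into complete codons for this frame.
    codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]

    best = ""
    start = None
    for index, codon in enumerate(codons):
        if start is None and codon == "ATG":
            start = index  # first start codon opens the ORF
        elif start is not None and codon in STOP_CODONS:
            orf = "".join(codons[start:index + 1])
            if len(orf) > len(best):
                best = orf
            start = None  # look for another ORF downstream
    return best


example = "CTTAGGGCATC"  # hypothetical sequence
orf = longest_orf(example, -2)
print(orf, len(orf), "bp")
```

When reporting a length, state whether the stop codon is counted and give units (bp for the nucleotide sequence, amino acids for the translated product).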
TERMS
Choose one term from the list below and give a brief description in your own words.
Your description should indicate what the term refers to and the significance to
bioinformatics. Write your answer on the back of this page. If you base any part of your
answer on information that you look up, include the reference that you used.
In silico
Genome
HGP
BLAST
BAC/YAC
EST
EXTRA CREDIT
Do you agree with the sequence data in this region that was determined by the computer?
If not, by what criteria would you modify the sequence?
FYI: The quality scores were as follows: A (9), C (9), A (8)
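
Note: if the quality scores above are Phred-scaled (an assumption, since the quiz does not say which scale is used), each score Q corresponds to an estimated base-call error probability of 10^(-Q/10). The Python sketch below shows the conversion. The function name is hypothetical.

```python
def phred_to_error_probability(quality):
    """Estimated probability that a base call is wrong, assuming a Phred-scaled score."""
    return 10 ** (-quality / 10)


# Base calls and scores as listed in the FYI line above.
calls = [("A", 9), ("C", 9), ("A", 8)]

for base, quality in calls:
    error = phred_to_error_probability(quality)
    print(f"{base}: Q{quality} -> estimated error probability {error:.3f}")
```

For comparison, Q20 corresponds to an estimated error probability of 0.01 on the same scale.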
QUICK CHECKLIST
Screenshots
   3a of E value/Max score
   3b w/ links
   3d
   3e, both views
      Model Transcripts
      Ideogram
      Phenotype
      Hs_UniG
      Gene
Printouts (i.e. print to a pdf)
   3d
   3e, both views