Basics of Bioinformatics, University of Oulu
Spring 2015
Teachers Phillip Watts, Sonja Kujala
These exercises are a modified version of NCBI tutorials
(http://www.ncbi.nlm.nih.gov/education/tutorials/)
Please make brief notes and add pictures (e.g. with the snipping tool) as you go along! When
finished, send this free-form “report” to [email protected]
http://www.ncbi.nlm.nih.gov/gquery/
Exercise 1:
I. PubMed, PMC, Taxonomy and PopSet
Describe PubMed, PMC, Taxonomy and PopSet, briefly! Perform a search
for mammoth across all of the Entrez (NCBI gquery) databases. Which
databases contain records associated with the term mammoth? Link to
the mammoth literature citations in the PubMed database. Identify the
articles available free in PMC. Access the article “The year of the
mammoth”. Find a link where you can download a PDF copy of this
article. Then return to the Article format. Which articles does this
publication cite? In which articles is this article cited? Download the
abstracts of these articles. Access publications of some of the authors of
these articles.
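The same mammoth search can also be scripted rather than clicked through. The following is a minimal sketch, assuming the public NCBI E-utilities ESearch endpoint and the database names `pubmed`, `pmc`, `taxonomy` and `popset`; the helper `esearch_url` and the `retmax` default are illustrative choices, not part of the exercise. It only builds the request URLs; opening one in a browser returns the matching record IDs.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumed here; not given in the exercise).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that queries one Entrez database for a term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# The four databases described in part I of the exercise.
DATABASES = ["pubmed", "pmc", "taxonomy", "popset"]
MAMMOTH_URLS = {db: esearch_url(db, "mammoth") for db in DATABASES}

if __name__ == "__main__":
    for db, url in MAMMOTH_URLS.items():
        print(f"{db}: {url}")
```

Comparing the record counts returned for each database is one way to check the answers found through the web interface.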
Go back to the mammoth search in PubMed. Display the PopSet links.
Access the record by Greenwood with PopSet ID 14090839. View the
alignment (zoom to sequence and move along the alignment). Link from
the record to the Nucleotide database. The sequence alignment of which
gene is studied in this PopSet? Display the Taxonomy Links for the
PopSet and list the organisms covered.
Access the Taxonomy record for Mammuthus primigenius. What is the
origin of the mammoth specimens for some of the sequences reported
in the Entrez databases? What is the lineage for mammoth? Which are
the three major divisions of cellular organisms? Which of these has the
highest number of entries in the “Structure” database?
II. OMIM, UniGene and Homologene
Describe OMIM, UniGene and Homologene, briefly! Perform an
unlimited search for cytochrome c oxidase in the OMIM database.
Repeat the query for “cytochrome c oxidase” as a term. Which search is
more restrictive? Limit the retrieved entries only to those with gene
location on chromosomes 4, 6 and 19. How many records have you
retrieved? What is the chromosomal location of gene COX7A1 (OMIM
record 123995)? Note the information about muscle and liver isoforms.
Are there any known disease phenotypes (allelic variants) associated
with the COX7A1 gene? Access the UniGene of this record. Examine the
expression profiles. Now search the whole UniGene with the query COX7A1.
How many of these UniGene records are from mammals? Now limit the
search to UniGene records that have expression evidence of at least
100 ESTs. How many of these UniGene records are from mammals?
Access the HomoloGene database and perform a search for records
relating to COX genes (search by gene name, using cox* as the query). How many
records do you retrieve? Are COX7A1 and COX7A2 members of the same
HomoloGene group?
Are all COX genes equally conserved in evolution?
Are there any COX genes that are conserved throughout the
superkingdom of Eukaryota (use the Ancestor field)? How many did you
find? Display the taxonomy tree for organisms included in the
HomoloGene COX1 record.
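The two OMIM searches at the start of this part differ only in whether the words are wrapped in double quotes. The sketch below shows how the two query strings look once they are encoded into a request, assuming the NCBI E-utilities ESearch endpoint and the database name `omim`; `as_phrase` and `omim_search_url` are hypothetical helper names. It does not run the searches, so the record counts still have to be compared by hand.

```python
from urllib.parse import urlencode

# ESearch endpoint of the NCBI E-utilities (an assumption, not from the exercise).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def as_phrase(text):
    """Wrap the words in double quotes so they are searched as one phrase."""
    return f'"{text}"'


def omim_search_url(term):
    """Build an ESearch URL for the OMIM database."""
    return f"{ESEARCH}?{urlencode({'db': 'omim', 'term': term})}"


unrestricted = "cytochrome c oxidase"  # words searched without phrase quoting
phrase = as_phrase(unrestricted)       # words searched as an exact phrase

if __name__ == "__main__":
    print(omim_search_url(unrestricted))
    print(omim_search_url(phrase))
```

In the encoded URL the quotes appear as `%22` and the spaces as `+`, which makes it easy to see which of the two searches was actually sent.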
III. Entrez Gene
Describe Entrez Gene, briefly! Retrieve human entries related to "prion
protein" in Entrez Gene. Identify the gene for prion protein (PRNP).
Name the map location of this gene on the human genome. What is the
function of this protein? What are the alternate gene symbols? Name the
phenotypes associated with mutations in this gene.
Is the RefSeq mRNA record reviewed? How many alternatively spliced
products have been annotated for the gene?
To obtain information about the homologs from other eukaryotes, click
on the Homologene link. Change the Display option to "Alignment
Scores". What is the percent identity between the human and
mouse proteins? View the alignment by clicking on the "Blast" link.
Go back to the Entrez Gene report. Identify the clinically associated
variations annotated on this gene by clicking on the SNP link. Next, filter
with “Clinical/LSDB”. How many of them are missense
(nonsynonymous) changes?
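Restricting the "prion protein" search to human entries can be done with the limits on the web page, or by typing a fielded query directly into the search box. The sketch below assembles such a query string, assuming the Entrez field tag `[Organism]` and the Boolean operator `AND`; `with_field` and `all_of` are hypothetical helpers introduced only for illustration.

```python
def with_field(value, field):
    """Quote a value and attach an Entrez field tag, e.g. "Homo sapiens"[Organism]."""
    return f'"{value}"[{field}]'


def all_of(*clauses):
    """Combine clauses so that a record must match every one of them."""
    return " AND ".join(clauses)


# Phrase search for the protein name, limited to human records.
PRION_QUERY = all_of('"prion protein"', with_field("Homo sapiens", "Organism"))

if __name__ == "__main__":
    print(PRION_QUERY)
```

The printed string can be pasted into the Entrez Gene search box; the PRNP record should be among the results if the query is built as assumed.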