Download ECS 189K - UC Davis

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Cell-penetrating peptide wikipedia , lookup

Nucleic acid analogue wikipedia , lookup

G protein–coupled receptor wikipedia , lookup

Expanded genetic code wikipedia , lookup

Magnesium transporter wikipedia , lookup

Silencer (genetics) wikipedia , lookup

QPNC-PAGE wikipedia , lookup

Ancestral sequence reconstruction wikipedia , lookup

Genetic code wikipedia , lookup

Molecular evolution wikipedia , lookup

Gene expression wikipedia , lookup

List of types of proteins wikipedia , lookup

Protein wikipedia , lookup

Protein moonlighting wikipedia , lookup

Protein folding wikipedia , lookup

Cyclol wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Protein (nutrient) wikipedia , lookup

Western blot wikipedia , lookup

Metalloprotein wikipedia , lookup

Intrinsically disordered proteins wikipedia , lookup

Protein–protein interaction wikipedia , lookup

Point mutation wikipedia , lookup

Protein adsorption wikipedia , lookup

Biochemistry wikipedia , lookup

Nuclear magnetic resonance spectroscopy of proteins wikipedia , lookup

Homology modeling wikipedia , lookup

Protein structure prediction wikipedia , lookup

Transcript
ECS 129
Assignment: Option1 (No programming)
Due: Thursday, February 27, 2014
Genes and proteins
Problem1: Gene Finder
You have just sequenced a shot segment of DNA. You wish to analyze this DNA
sequence to determine whether it could encode a protein.
5’ TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT 3’
1) Find the longest possible coding region (also called open reading frame, or ORF).
Remember that there are six possibilities to check
2) Label which strand on the DNA will be the coding strand, and which will be the
template strand when this DNA is transcribed
3) Transcribe this ORF into mRNA, indicating the 5’ and 3’ ends
4) Translate this mRNA into a protein sequence, indicating the N and C termini.
Problem 2: Identification of a disease
A short fragment of a protein has been identified as a marker for a human genetic disease.
This fragment is:
GLASLGTPDEYIEKLAT
1) Identify the wild type human protein associated with this fragment. (Blast can be
run at: http://www.ncbi.nlm.nih.gov/blast/)
2) Obtain the one letter AA code sequence of the full wild type protein
3) Describe the single mutation between the wild type sequence and the marker
given above.
4) This mutation has been associated with an inherited disorder. Find the name of
that disorder, and write a small paragraph on the nature, and consequences of this
disease.
5) Visit the Genome pages at Ensembl (http://www.ensembl.org). Search the
Human Genome using the wild type sequence corresponding to the fragment of
protein given above (make sure to use the wild type fragment of the same size;
use BLASTP, and use “exact match”). Show the location(s) of the corresponding
gene projected on a human karyotype.
Problem 3: Accessing and Analyzing Structures from the Protein Data Bank
The Protein Data Bank, or PDB, is the best place to obtain 3-Dimensional structures of
proteins, nucleic acids and other macromolecules. From the PDB site
http://www.rcsb.org, you can locate proteins by keyword searching or by entering the
PDB accession number for the structure file, like 5PTI. Details on the molecule (how the
structure was determined, pertinent research articles, position of secondary structures,
unusual amino acids, etc) can be found on the RCSB web site but also in the PDB file
itself.
PDB files are just formatted text files, so you can open them in a text editor or even Word
and read them. There is a wealth of information there! The molecular viewing programs
use the ATOM records in the file, which contain residue name, residue number, atom
name and atom number. Often, the structure files include other molecules besides the
protein, such as water molecules, nucleic acid and bound ligands.
1) Find a protein that contains the unusual amino acid HYP. What is this amino
acid?
2) We first look at the structure of an HIV protease. We choose the PDB file 1A30.
Download this PDB file, and display its content using RASMOL or PYMOL.
This structure shows HIV protease as a dimer (chain A and B), bound to a small
tri-peptide (chain C, sequence EDL). Generate an image in which the ligand is
shown in spacefilling mode (CPK), and the protein dimer is shown in cartoon
mode. Save this picture in GIF or PNG format, and include it in your report
3) Now we look at the PDB file 1CWA, which contains the structure of the
complex between two proteins. Find the protein names, sizes and secondary
structure contents. Identify the non standard amino acids in chain C.
a) Describe briefly the importance of this structure
b) Generate a GIF or PNG image containing a CPK representation of the
complex
Good Luck !