BCB 444/544 Fall 07
Study Guide #2 - Oct 22
p 1 of 8
BCB 444/544 - F07
Study Guide #2 - For Exam 2 (Oct 26)
Answers will be discussed in Review Session on Thurs Oct 25
General comments
Exam 2 will cover all topics covered in class, lab and assigned readings:
 Lectures 13 - 26 (Wed Sept 19 thru Mon Oct 22 )
 Labs 5 - 8
 HW# 3 & 4 (not 5)
 All assigned reading & URLs indicated in PPTs, including:
Xiong: Chps 6 (beginning with HMMs), 7, 8, 12 - 15 (not 10 & 11)
Eddy: What is a hidden Markov Model?
Ginalski: Practical Lessons from Protein Structure Prediction
This study guide covers ~90% of material important for Exam 2 - no
guarantees about other 10%!
Exam 2 will be a closed-book, closed-notes, 50-minute exam.
Some questions will involve computation; bring your calculators if you like.
All required formulae or tables will be provided.
Some questions will require short essay-like answers that demonstrate your
understanding of key concepts covered in the course.
Topics & Study Questions:

Review: Nucleus, Chromosomes, Genes
o Name two primary differences in the organization of eukaryotic versus
prokaryotic cells

Review: RNA, Proteins, Promoters, Transcription factors
o What is the relationship between protein sequence – structure – function?
o Eukaryotic gene structure: Introns vs Exons
o Regulation of gene expression
 What is a promoter? An enhancer?
 What is a transcription factor?

PSSMs, Profiles & HMMs
o What are sequence logos?
o What is the main difference between a PSSM and a profile?
o How do you calculate the probability of a given path through an HMM?
o How do you calculate the most probable path through an HMM?
o How do you calculate the total probability of an observed sequence from an
HMM?
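As a minimal sketch of the first and last calculations above, assuming a hypothetical two-state HMM (the state names H and L and every probability below are invented for illustration and are not taken from the course materials): the probability of one path is the product of the transition and emission probabilities along it, and the total probability of the sequence is the sum over all paths, which the forward algorithm computes one position at a time.

```python
# Hypothetical two-state HMM; every number here is an illustrative assumption.
START = {"H": 0.5, "L": 0.5}                             # initial state probabilities
TRANS = {"H": {"H": 0.5, "L": 0.5},
         "L": {"H": 0.4, "L": 0.6}}                      # transition probabilities
EMIT = {"H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
        "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}   # emission probabilities


def path_probability(seq, path):
    """P(seq, path): one start term, then a transition and an emission per step."""
    prob = START[path[0]] * EMIT[path[0]][seq[0]]
    for i in range(1, len(seq)):
        prob *= TRANS[path[i - 1]][path[i]] * EMIT[path[i]][seq[i]]
    return prob


def forward_probability(seq):
    """P(seq): sum over all paths, computed position by position."""
    f = {s: START[s] * EMIT[s][seq[0]] for s in START}
    for symbol in seq[1:]:
        f = {s: EMIT[s][symbol] * sum(f[r] * TRANS[r][s] for r in f)
             for s in START}
    return sum(f.values())


print(path_probability("GCA", "HHL"))
print(forward_probability("GCA"))
```

Summing path_probability over every possible path gives the same value as forward_probability, but the number of paths grows exponentially with sequence length, which is why the forward recursion is used instead.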

Protein Structure: Basics, Visualization, Classification
o Amino acid properties: Why is Glycine "special?" Proline? Cysteine?
o What are the 4 main levels of protein structure?
o What are 3 main types of secondary structural elements?
o Name 2 databases in which proteins are categorized on the basis of their
structural class.
o What is the major database for protein structure information?
o Name 2 protein structure visualization tools.
o What are the 2 main methods used to obtain high-resolution experimental
structures of proteins?

Protein Secondary Structure Prediction
o Why are different programs needed to predict the secondary structures
of globular vs. membrane proteins?
o What features of membrane proteins make them "easier" targets for
protein secondary structure prediction?

Protein Tertiary Structure & Prediction
o What is meant by the "protein folding" problem?
o What is meant by the "inverse folding" problem?
o What is the primary goal of the Structural Genomics Project?
o What are the 3 major methods for protein tertiary structure prediction?
o What is the CASP contest?
o Name 1 program for protein structure comparison.

RNA Structure & Prediction
o List 4 different types of functional RNAs
o Why are there so many different types of base-pairs in RNA structures?
o What types of bonds/molecular interactions are primarily responsible for
stabilizing RNA structures?
o What are 3 main methods for RNA structure prediction?
o How is covariation used in RNA structure prediction?
o What type of protein structure prediction method was developed by
Kai-Ming Ho's group at ISU?
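To illustrate how covariation can be scored for the RNA question above, here is a minimal sketch, assuming a small made-up alignment. The mutual information between two alignment columns is high when the bases at the two positions change together (as compensatory changes in a base pair do) and zero when one column is invariant or the columns vary independently. The function name and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Made-up alignment: columns 0 and 4 always form a Watson-Crick pair
# (G-C, A-U, C-G, U-A), while column 2 never changes.
toy_alignment = ["GAAAC", "AAAAU", "CAAAG", "UAAAA"]
print(mutual_information(toy_alignment, 0, 4))  # covarying columns
print(mutual_information(toy_alignment, 0, 2))  # one column is invariant
```

Column pairs with high mutual information are candidates for base-paired positions, since a change on one side of a helix tends to be matched by a compensating change on the other side.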

Gene Structure and (just a little bit on) Prediction
o Name 3 differences in the structural features of genes in eukaryotes vs
prokaryotes
o Name 4 DNA sequence "signals" in genes that are often used in
computational gene prediction approaches.
Sample Questions/Problems
1. Answer True or False, or fill in the blank, to complete the following statements
a. ____ The cytoskeleton in eukaryotic cells organizes the extracellular space.
b. ____ The process by which information in RNA is used to make proteins is
called translation.
c. ____ The process in which information in DNA is copied into RNA is called
transcription.
d. ____ Peptide bonds are both planar and flexible.
e. ____ Enzymes that catalyze reactions in the cell are always proteins.
f. ____ Protein interactions are not required for the functions of most proteins.
g. ____ An exon is a segment of a eukaryotic gene that does not encode protein.
h. ____ In eukaryotes, one gene can sometimes encode several proteins.
i. ____ Transcription factors are proteins that often bind specific DNA
sequences and promote the initiation of transcription.
j. ____ Non-coding RNAs can play important roles in cells, even though they do
not produce proteins.
2. Short answer questions
a. Briefly describe how PSSMs and profiles are generated, and how they differ.
Name two applications for which either a PSSM or a profile could be used.
Why is a PSSM or profile used (i.e., what advantage does it confer over a
single consensus sequence)?
b. What are the 4 main hierarchical levels of protein structure? What types of
bonds (covalent or non-covalent) are most important for stabilizing structure
at each level?
c. What is the difference between a protein motif and a protein domain?
d. What is an HMM? Why is it more "powerful" and "flexible" than either a PSSM
or a profile?
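For question 2a, a minimal sketch of one way a PSSM can be built from an ungapped alignment: count residues in each column, add pseudocounts, convert to frequencies, and take the log-odds against a background frequency. The toy DNA alignment, the uniform background of 0.25, and the pseudocount of 1 are assumptions chosen for illustration.

```python
from math import log2

ALPHABET = "ACGT"


def build_pssm(alignment, pseudocount=1.0, background=0.25):
    """Return one dict per column mapping base -> log2-odds score."""
    n_seqs = len(alignment)
    total = n_seqs + pseudocount * len(ALPHABET)
    pssm = []
    for col in zip(*alignment):          # iterate over alignment columns
        scores = {}
        for base in ALPHABET:
            freq = (col.count(base) + pseudocount) / total
            scores[base] = log2(freq / background)
        pssm.append(scores)
    return pssm


def score(pssm, seq):
    """Score a sequence of the same length by summing column log-odds."""
    return sum(column[base] for column, base in zip(pssm, seq))


toy_sites = ["TATA", "TATT", "TACA", "TATA"]  # made-up aligned sites
pssm = build_pssm(toy_sites)
print(score(pssm, "TATA"), score(pssm, "GCGC"))
```

A sequence that resembles the aligned sites gets a positive score and a dissimilar one gets a negative score, which is information a single consensus sequence cannot express.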
3. "Problem Solving" & Short Essay
A. Protein Structure Basics & Prediction
According to Ginalski et al, 2005, what are the most important problems that
must be addressed to improve protein tertiary structure prediction?
B. RNA Structure Basics & Prediction
What are the three main approaches to RNA secondary structure prediction?
Explain the advantages and disadvantages (if any) of each.
C. Gene Structure Basics & Prediction
Using this hidden Markov model, calculate the most probable path for sequence
ACTG.
Your probability table will begin like this:
        Start   E   5   I   End
Begin     1     0   0   0    0
A         0
C         0
T         0
G         0
End       0
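The table-filling procedure for the most probable path (the Viterbi algorithm) can be sketched as follows. The HMM figure for this problem is not reproduced in this text, so the transition and emission probabilities below are hypothetical values chosen only so that the sketch runs; only the state names (Start, E, 5, I, End) come from the table.

```python
# Hypothetical parameters: the figure for this problem is not reproduced here,
# so every probability below is an illustrative assumption.
STATES = ["E", "5", "I"]
TRANS = {
    "Start": {"E": 1.0},
    "E": {"E": 0.8, "5": 0.2},
    "5": {"I": 1.0},
    "I": {"I": 0.8, "End": 0.2},
}
EMIT = {
    "E": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "5": {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return (most probable state path, its probability) for seq."""
    # v[s] = probability of the best path that ends in state s at this position
    v = {s: TRANS["Start"].get(s, 0.0) * EMIT[s][seq[0]] for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_v, pointer = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: v[r] * TRANS[r].get(s, 0.0))
            new_v[s] = v[best_prev] * TRANS[best_prev].get(s, 0.0) * EMIT[s][symbol]
            pointer[s] = best_prev
        backpointers.append(pointer)
        v = new_v
    # Finish with the transition into End, then trace the pointers back.
    last = max(STATES, key=lambda s: v[s] * TRANS[s].get("End", 0.0))
    best_prob = v[last] * TRANS[last].get("End", 0.0)
    path = [last]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1], best_prob


print(viterbi("ACTG"))
```

Each pass through the loop fills one row of the probability table, keeping only the best-scoring way to reach each state. Under these assumed numbers the best path for ACTG is E, 5, I, I; the answer for the model in the actual figure depends on its own probabilities.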
D. Transcription/translation:
Below the DNA sequence shown, write the RNA sequence that would be
transcribed from the top strand of DNA, assuming it is copied completely from the
3' to the 5' end. On the RNA sequence, circle the START and STOP codons for
translation. Translate the RNA sequence into amino acids and write the deduced
protein sequence below, too.
DNA
3'-TATATCGCGTTACGATCTGCACAAGATCATC-5'
5'-ATATAGCGCAATGCTAGACGACTTCTAGTAG-3'
RNA
5'-
Protein:
NH2 -
Suppose the DNA sequence in another individual is different, due to a single
base-pair substitution, as shown below. What effects (if any) would you expect this SNP
to have on the sequence and function of the expected protein product?
3'-TATATCGCGTTACGATCTGCACAAGGTCATC-5'
5'-ATATAGCGCAATGCTAGACGACTTCCAGTAG-3'
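The mechanics of this kind of problem can be sketched in a few lines, assuming the standard genetic code and a short hypothetical template strand (deliberately not the sequence from the problem). A template written 3' to 5' is complemented base by base to give the mRNA written 5' to 3', and translation runs from the first AUG to the first in-frame stop codon. The function names are illustrative.

```python
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Template strand written 3'->5' in, complementary mRNA written 5'->3' out."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna, translate(mrna))
```

Running the same two functions on a sequence and on its single-base variant is a quick way to check whether a substitution is silent, changes an amino acid, or creates or removes a stop codon.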
Important "Bioinformatics" vocabulary:
New Molecular Biology Jargon:
CpG island
RNA polymerase
Promoter
Enhancer
Post-transcriptional control
Primary, secondary, tertiary, quaternary structure
Hydrophobic/ hydrophilic
Motif
Domain
Covalent bond
Non-covalent bond
Peptide bond
Hydrogen bond (H-bond)
Base-stacking interactions
Rotamer
Ramachandran plot
Energy minimization
X-ray crystallography
NMR spectroscopy
AFM
Co-variation
Experimental constraints (for RNA structure prediction)
cDNA
EST
Bioinformatics Jargon:
Log odds score
PSSM
Profile
HMM
Markov model
1st, 2nd, 3rd order MMs
Observed vs hidden variables
Emission vs transition probabilities
Viterbi algorithm
Profile HMM
Regularization
Pseudocounts
Dirichlet mixture
Regular expression
PDB, MMDB, MSD
PYMOL, Cn3D
CATH, SCOP, DALI
GOR V, CMD
Q3 SCORE
Phobius
Protein Structure Prediction
1) Ab initio
2) Threading or fold recognition
3) Homology modeling
Miyazawa-Jernigan (MJ) model/potentials
SWISS-MODEL
3-D JURY
RNA structure prediction
1) Ab initio (thermodynamic)
2) Comparative
3) Combined computational & experimental
Gene structure prediction
1) Ab initio
2) Similarity based
3) Combined