Protein Tertiary Structure
Protein Data Bank (PDB)
• Contains all known 3D structural data of
large biological molecules, mostly proteins
and nucleic acids: ~87,000 structures.
• The data is typically obtained by X-ray
crystallography or NMR (Nuclear magnetic
resonance) spectroscopy and submitted
by biologists and biochemists from around
the world.
• Freely accessible.
[Figure: PDB entry page. Labels: accession number, PDB file, Java-based visualization tools, secondary structure]
PDB file example:
A PDB file can be viewed with various visualization tools, such as PyMOL.
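To make the file format concrete, here is a minimal sketch of reading C-alpha coordinates from PDB-format text. It assumes the standard fixed-column layout of ATOM records; the function name `parse_ca_atoms` and the sample records (with made-up coordinates) are illustrative and not part of the tutorial.

```python
from collections import defaultdict

# Made-up ATOM records for illustration only (not from a real PDB entry).
SAMPLE_PDB = """\
ATOM      1  N   MET A   1      27.340  24.430   2.614
ATOM      2  CA  MET A   1      26.266  25.413   2.842
ATOM      9  CA  LYS A   2      24.500  27.100   5.300
ATOM     17  CA  GLY B   1      10.000  12.500   8.250
"""


def parse_ca_atoms(pdb_text):
    """Group C-alpha atoms by chain.

    Returns {chain_id: [(residue_name, residue_number, (x, y, z)), ...]}.
    Assumes fixed-column ATOM records (atom name in columns 13-16,
    chain ID in column 22, coordinates in columns 31-54).
    """
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and other record types
        if line[12:16].strip() != "CA":
            continue  # keep only backbone C-alpha atoms
        residue_name = line[17:20].strip()
        chain_id = line[21]
        residue_number = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        chains[chain_id].append((residue_name, residue_number, xyz))
    return dict(chains)


chains = parse_ca_atoms(SAMPLE_PDB)
for chain_id, atoms in sorted(chains.items()):
    print(chain_id, len(atoms), "C-alpha atoms")
```

Counting the keys of the returned dictionary gives the number of chains in the file, which is one way to check a claim such as "this protein is composed of 4 chains".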
Protein, chain, domain
• Here is a protein composed of 4 chains.
• Which protein is it?
Protein, chain, domain
• One chain may have
multiple domains.
• A protein domain is a
conserved part of a given
protein sequence and
structure that can evolve,
function, and exist
independently of the rest of
the protein chain.
• Each domain has a stable
3D structure.
Protein domain classifications
• Scientists have tried to classify proteins by
their structural properties into a tree-like
hierarchy.
• The two most widely used domain classifications
are CATH and SCOP.
CATH: Protein Domain Structure Classification
Class, Architecture, Topology and Homology
• Class: the secondary structure composition: mainly-alpha, mainly-beta, and alpha-beta.
• Architecture: the overall shape of the domain structure, as determined by the
orientations of the secondary structures, e.g. barrel or 3-layer sandwich.
• Topology: Structures are grouped into fold groups at
this level depending on both the overall shape and
connectivity of the secondary structures.
• Homologous Superfamily: evolutionarily conserved structures.
http://www.cathdb.info/
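As a small illustration of the four levels, the sketch below splits a dotted CATH identifier into its Class, Architecture, Topology, and Homologous Superfamily prefixes. The function name `split_cath_id`, the example code `3.40.50.720`, and the class-number mapping are assumptions added for illustration; check them against the CATH website before relying on them.

```python
# Assumed mapping of CATH class numbers to the class names listed above.
CATH_CLASSES = {"1": "mainly-alpha", "2": "mainly-beta", "3": "alpha-beta"}


def split_cath_id(cath_id):
    """Split a four-part CATH code such as '3.40.50.720' into its levels."""
    parts = cath_id.strip().split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected a code of the form C.A.T.H, got {cath_id!r}")
    return {
        "class": parts[0],
        "architecture": ".".join(parts[:2]),
        "topology": ".".join(parts[:3]),
        "homologous_superfamily": ".".join(parts[:4]),
    }


levels = split_cath_id("3.40.50.720")
print(levels)
print("class name:", CATH_CLASSES.get(levels["class"], "other"))
```

Each level is a prefix of the next, which mirrors the tree-like hierarchy: two domains with the same Topology code necessarily share the same Architecture and Class.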
SCOP
Structural Classification of Proteins
http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html
Based on known protein structures
• Manually created by visual inspection
• Hierarchical database structure:
– Class, Fold, Superfamily, Family, Protein and Species
[Figure: SCOP hierarchy browser. Labels: node, parents of node, children of node]
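The node, parent, and children relationships can be sketched as a simple tree. This is a minimal sketch, not how SCOP itself is implemented; the class `ScopNode` and its methods are hypothetical, and the example lineage is for illustration only.

```python
SCOP_LEVELS = ["Class", "Fold", "Superfamily", "Family", "Protein", "Species"]


class ScopNode:
    """One node of the hierarchy, linked to its parent and its children."""

    def __init__(self, level, name, parent=None):
        self.level = level
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        """Create a child one level below this node and return it."""
        next_level = SCOP_LEVELS[SCOP_LEVELS.index(self.level) + 1]
        child = ScopNode(next_level, name, parent=self)
        self.children.append(child)
        return child

    def lineage(self):
        """Return the path from the root (Class) down to this node."""
        path, node = [], self
        while node is not None:
            path.append(node)
            node = node.parent
        return list(reversed(path))


# Illustrative lineage; the names are examples, not data from the tutorial.
root = ScopNode("Class", "All alpha proteins")
species = (
    root.add_child("Globin-like")
    .add_child("Globin-like")
    .add_child("Globins")
    .add_child("Hemoglobin, alpha-chain")
    .add_child("Human (Homo sapiens)")
)
for node in species.lineage():
    print(f"{node.level}: {node.name}")
```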
Protein structure alignment
• Structural alignment attempts to establish
homology between two or more protein
structures based on their 3D conformation.
• A good structural alignment
often implies an evolutionary
relationship, even between
proteins with low sequence identity.
Sequence – structure relations
• Similar sequences → similar structures.
• Different sequences → ???
• Different sequences that fold into similar
structures are the most interesting, since they
may imply a common origin.
• This is what we aim to find.
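"Similar" and "different" sequences are usually quantified as percent sequence identity. The sketch below assumes the two sequences are already aligned (same length, `-` for gaps) and counts identities over the columns where neither sequence has a gap; other definitions of the denominator exist, so treat this choice as an assumption. The function name and the example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Made-up aligned sequences: 4 identical residues out of 5 compared columns.
print(percent_identity("MKT-LV", "MRTALV"))
```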
Protein structure alignment
• Alignment tools try to superimpose the two
structures so that the distance between them is
minimal.
• The distance measure is the RMSD (Root Mean
Square Deviation).
• Given two sets of n points v and w, the RMSD is
defined as follows:
$$\mathrm{RMSD}(v, w) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert v_i - w_i \rVert^2}$$
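The RMSD of two equally sized point sets can be computed directly from this definition. The sketch below assumes the points are already paired (v_i corresponds to w_i) and already superimposed; finding the optimal superposition is a separate step that alignment tools perform and that is not shown here. The function name `rmsd` and the example points are illustrative.

```python
import math


def rmsd(v, w):
    """Root mean square deviation between two paired lists of 3D points."""
    if len(v) != len(w) or not v:
        raise ValueError("point sets must be non-empty and of equal size")
    total = 0.0
    for p, q in zip(v, w):
        # squared Euclidean distance between the paired points
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / len(v))


# Made-up points: w is v shifted by (3, 4, 0), so every pair is 5 units apart.
v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
w = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0), (3.0, 5.0, 0.0)]
print(rmsd(v, w))
```

Because every pair in the example is exactly 5 units apart, the RMSD equals 5; an RMSD of 0 would mean the two point sets coincide.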
Protein structure alignment
• The structural alignment servers do
LOCAL structural alignment.
• They try to align larger stretches of protein
backbone with minimal RMSD.
• Thus, another parameter to assess the
quality of the alignment is the alignment
length.
Protein structure alignment
• Low RMSD → similar structures
• Low alignment length → dissimilar structures
• SAS score = 100*RMSD/(alignment length)
• Low SAS → similar structures
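The SAS score combines both quantities, so the same RMSD counts for less when it is achieved over a shorter alignment. A minimal sketch of the formula follows; the function name and the numbers are hypothetical.

```python
def sas_score(rmsd, alignment_length):
    """SAS = 100 * RMSD / alignment length (lower means more similar)."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return 100.0 * rmsd / alignment_length


# Hypothetical alignments with the same RMSD but different lengths.
print(sas_score(2.0, 100))  # long alignment
print(sas_score(2.0, 40))   # short alignment scores worse (higher SAS)
```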
Structure alignment servers
DaliLite:
http://www.ebi.ac.uk/Tools/structure/dalilite/
• 1XIS and 1NAR have only 7% sequence
identity, but they are structurally similar.
• We will download their PDB files from the PDB,
and structurally align them using DaliLite.
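The download step can also be scripted. This is a minimal sketch that assumes the RCSB file server URL pattern `https://files.rcsb.org/download/<ID>.pdb`; that pattern, and the helper names `pdb_url` and `download_pdb`, are assumptions added here and are not part of the tutorial.

```python
import urllib.request

# Assumed URL pattern of the RCSB PDB file download service.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the download URL for a 4-character PDB accession number."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def download_pdb(pdb_id, dest_dir="."):
    """Download one PDB file into dest_dir and return the local path."""
    path = f"{dest_dir}/{pdb_id.strip().upper()}.pdb"
    urllib.request.urlretrieve(pdb_url(pdb_id), path)  # needs network access
    return path
```

Calling `download_pdb("1XIS")` and `download_pdb("1NAR")` would save the two files locally, ready to be uploaded to the alignment server.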
Insert PDB files
This file can be loaded into the PyMOL viewer.
Food for thought
How can structure alignment help us in
structure prediction?
Structure prediction
• Input: protein sequence;
• Output: protein 3D structure.
• This is a VERY difficult task.
• CASP: Critical Assessment of Techniques for
Protein Structure Prediction
• Worldwide experiment for protein structure
prediction taking place every two years.
Structure prediction
• Comparative modeling: uses previously solved structures
as starting points, or templates.
– Homology modeling: searches for sequence similarity
to proteins with known structures.
– Protein threading: sequence-to-structure alignment
against a database of 'templates' (known structures).
• Ab initio modeling: builds 3D protein models
"from scratch", i.e., based on physical principles
rather than on previously solved structures.
I-TASSER structure prediction server
• Based on multiple-threading alignments.
• I-TASSER (as 'Zhang-Server') was ranked
as the No. 1 server for protein structure
prediction in the CASP7, CASP8,
CASP9, and CASP10 experiments.
I-TASSER results