Transcript
oteomics
“The most beautiful thing we can experience is the mysterious. It is the source of all true art and all science. He
of contents
From the genome to the proteome
Classification of proteins
Experimental techniques
Inhibitor and drug design
Screening of ligands
Xray solved crystal structures
NMR structures
Empirical methods and predictive techniques
Posttranslational modification prediction
drug design  1
One of the most important applications of bioinformatics is r
The development and testing of a new drug is expensive bot
Functional genomics, bioinformatics and proteomics promise
d development pipeline
drug design  2
While the exact phases of the development of a drug are var
The testing process, which involves preclinical and clinical t
d drug design  3
The discovery process, which is rather laborious and expensiv
Target identification
Discovery and optimization of a “lead” compound
Toxicology and pharmacokinetics (which quantitatively studies abs
elopment some insights
Typically, researchers discover new drugs through:
- New insights into a disease process that allow researchers to
- Many tests of molecular compounds to find possible beneficia
- Existing treatments that have unanticipated effects
- New technologies, such as those that provide new ways to ta
drug design  4
Target identification consists of isolating a biological molecul
After target identification, the objective of drug design is the
Given that the function of the target is essential for the vital
Understanding the structure and the function of proteins is a
d drug design-5
https://www.youtube.com/watch?v=bIFnOVKd2Ko
drug design  6
Example
The HIV protease is a protein produced by the human immunode
The HIV protease is essential for the virus proliferation: the inhib
drug design  7
How can a molecule inhibit the action of an enzyme, such as
Proteases are proteins that digest other proteins, such as restrict
Many of the proteins that HIV needs to survive and proliferate in
This polypeptide must then be cut into the functional protein com
Like many other enzymes, the HIV protease has an active site, to
Design a molecule that binds to the active site of the HIV proteas
creening  1
The first step toward the discovery of an inhibitor for a partic
Traditionally, the search for lead compounds has always been
Recently, methods for high throughput screening (HTS) have
creening  2
The active sites of enzymes are housed in pockets (cavities)
The proteinligand interaction is dictated mainly by the comp
Docking and screening algorithms for ligands try to produce
docking  1
Docking is simply the in silico simulation of the binding of a prot
Surface geometry
Interactions between related residues
Electrostatic force fields
docking  2
In many cases, the threedimensional structure of a protein
In drug design, molecular docking is used to determine how a pa
and docking  2
Molecular docking approaches have much in common with p
Both problems involve calculating the energy of a particular mole
docking  3
As in protein folding, there are two main considerations to take int
Define an energy function for evaluating the quality of a particular confo
Managing the flexibility of both the protein and the putative ligand
The keylock approach assumes a rigid protein structure which binds to a ligan
The induced fit docking allows flexibility of both the protein and the ligand
Compromise: assuming a rigid backbone, while allowing the flexibility of the si
docking  4
Experimental conformation
Best predicted conformation
docking  5
AutoDock (http://autodock.scripps.edu/) is a well known me
It uses a force field based on a grid in order to evaluate a particu
The force field is used to give a score to the resulting conformati
docking  6
AutoDock originally used a Monte Carlo/simulated annealing
Random changes are induced in the current position and conformation o
However, in order to allow the algorithm to find low-energy states, over
The latest AutoDock releases use, instead, genetic algorithm
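The Monte Carlo/simulated annealing search described above can be sketched as follows. This is a minimal illustration and not AutoDock's implementation: the energy function `toy_energy`, the three-number "pose", the cooling schedule and all parameter values are assumptions made for the example. A real docking run would score a ligand position and conformation against a force field.

```python
import math
import random

def toy_energy(pose):
    """Hypothetical stand-in for a docking score: lower is better, minimum at (1, -2, 0.5)."""
    x, y, z = pose
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 + (z - 0.5) ** 2

def anneal(energy, start, steps=5000, t_start=5.0, t_end=0.01, step_size=0.5, seed=0):
    """Simulated annealing with the Metropolis acceptance rule."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for i in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule)
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        # Random change applied to the current pose
        candidate = [c + rng.uniform(-step_size, step_size) for c in current]
        e_candidate = energy(candidate)
        delta = e_candidate - e_current
        # Downhill moves are always accepted; uphill moves are accepted with
        # probability exp(-delta / t), which shrinks as the temperature falls
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best

best_pose, best_energy = anneal(toy_energy, start=(8.0, 8.0, 8.0))
print(best_pose, best_energy)
```

Accepting some uphill moves at high temperature is what lets the search leave local minima; as the temperature drops, the walk behaves more and more like plain descent.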
screening  1
The main compromise in designing docking algorithms is the
For screening databases of possible drugs, searching algorithms
screening  2
Methods specifically designed for database screening, such a
SLIDE characterizes the active site of the target in accordanc
Any potential ligand in the database is characterized in the s
screening  3
The indexing operation allows SLIDE to rapidly eliminat
By the reduction of the number of ligands that are subj
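The idea of indexing the binding site so that unsuitable ligands are discarded quickly can be illustrated with a small hash-table sketch. It is not SLIDE's actual data structure: the point labels, the single-level dictionary, the 1.0 bin width and the helper names `triangle_key`, `build_index` and `ligand_matches` are all assumptions made for this example.

```python
import math
from collections import defaultdict
from itertools import combinations

def triangle_key(points, bin_width=1.0):
    """Key a triangle of (label, (x, y, z)) points by its sorted labels and binned side lengths."""
    labels = tuple(sorted(label for label, _ in points))
    sides = sorted(math.dist(a[1], b[1]) for a, b in combinations(points, 2))
    return labels, tuple(int(s // bin_width) for s in sides)

def build_index(site_points, bin_width=1.0):
    """Hash every triangle of binding-site points once, before any ligand is examined."""
    index = defaultdict(list)
    for tri in combinations(site_points, 3):
        index[triangle_key(tri, bin_width)].append(tri)
    return index

def ligand_matches(index, ligand_points, bin_width=1.0):
    """A ligand survives the filter if at least one of its triangles has a key present in the index."""
    return any(triangle_key(tri, bin_width) in index
               for tri in combinations(ligand_points, 3))

# Hypothetical site points, labeled by the ligand feature each one would accommodate
site = [("donor", (0.0, 0.0, 0.0)), ("acceptor", (3.2, 0.0, 0.0)),
        ("hydrophobic", (0.0, 4.5, 0.0)), ("donor", (3.0, 4.0, 1.0))]
index = build_index(site)

ligand_a = [("donor", (10.0, 10.0, 10.0)), ("acceptor", (13.2, 10.0, 10.0)),
            ("hydrophobic", (10.0, 14.5, 10.0))]
ligand_b = [("donor", (0.0, 0.0, 0.0)), ("donor", (1.0, 0.0, 0.0)),
            ("donor", (0.0, 1.0, 0.0))]
print(ligand_matches(index, ligand_a), ligand_matches(index, ligand_b))
```

Exact bin matching misses triangles whose side lengths fall just across a bin boundary; a practical filter would also probe neighboring bins. Only ligands that pass this cheap lookup would go on to the expensive docking step.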
screening  4
Flexible side chain
Rigid anchor fragment
Identification of the correspondence with the model triangles by means of multi-level hash tables based on chemical and geometrical peculiarities
Flexible side chain
Complementarity adaptation throu
Ligand triangle
Set of admissible model triangles
Multi-level hash table
Identification of chemically and geometrically possible overlaps between the ligand and the model triangles
Addition of the ligand side chains
Docking of the rigid anchor fragment based on triangle overlapping; collision resolution for the bac
and screening pipeline
Virtual screening, the search for bioactive compounds via computa
While virtual screening is already a standard practice in pharmaceu
and screening pipeline 2
In the survey the authors present an overview of recent developments
Finally, to facilitate the set-up of corresponding pipelines, a downloada
nteraction and affinity databases
Drug2Gene, 4.4 million entries
BindingDB, 1.1 million entries, www.bindingd
SuperTarget, insilico.charite.de/supertarget
at we have seen so far
https://www.youtube.com/watch?v=u49k72rUdyc&t=6s
s solved by Xray  1
Even the most powerful microscopic technique is insufficient to det
Instead, the discovery of Xrays by W. C. Roentgen (1895) has allo
s solved by Xray  2
In 1912, M. von Laue discovered that crystals, solid structures forme
In the early ‘50s, pioneering scientists such as D. Hodgkin were able
Today, the Xray crystallography was used to determine the structure
s solved by Xray  3
Source: PDB statistics
s solved by Xray  4
The first step in the crystallographic determination of a prote
Crystallization is a very delicate and challenging process, bu
Just as the sugar crystals can be produced through the slow evap
Protein crystals, however, are generally very small (from about 0
s solved by Xray  5
The growth of protein crystals generally requires carefully co
Once obtained, protein crystals are loaded inside a capillary
s solved by Xray  6
Originally, the diffraction pattern was captured on a radiogra
Modern tools for Xray crystallography are based on detecto
Given the gathered diffraction data, numerical methods and
s solved by Xray  7
In detail…
From the Xray diffraction spectrum of the crystals, crystallograp
Electron density maps can then be examined using computer gra
In this way, and eventually after some adjustments, a molecular
s solved by Xray  8
Finally, note that the obtained crystallographic structure is e
The crystallized proteins are not completely rigid and the mo
Moreover, the location of the water molecules within the crys
However, crystallography is currently the main method for vi
s solved by Xray  9
The Protein Data Bank (PDB, http://www.pdb.org) is the lea
Evaluation and interpretation of electron density map
Crystallization and crystal characterization
Data collection: diffraction spectrum of the crystal
d by X-Ray 10 celebrating
https://www.youtube.com/watch?v=uqQlwYv8VQI
uctures  1
The spectroscopic technique called Nuclear Magnetic Resonance (N
At the basis of NMR, there is the observation that the atoms of som
Atomic nuclei try to align themselves with the static magnetic field
uctures  2
The behavior of each atom is mainly influenced by the neigh
Data analysis and interpretation requires complex numerical
NMR methods do not use crystallization: they are very advan
uctures  3
The result of an NMR experiment is a set of constraints on th
These constraints can then be used, together with the protei
However, in general, many protein models can actually satisf
PDB collects approximately 11000 protein structures derived
uctures  4
Source: PDB statistics
R structures  5
https://www.youtube.com/watch?v=H-SQFSynKOk&t=16s
B1
Protein structures contained in PDB are stored in text format
Each line of a PDB file contains the coordinates (x,y,z), in angstro
Also, an image of the 3D structure of a protein can be obtain
For each structure in the PDB database, a four character cod
Example: 2APR identifies the rhizopuspepsin data, which is an as
Files in PDB format are generally called XXXX.pdb or pdbXXXX.en
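Because coordinate records in a PDB file use fixed column positions, a single line can be read with plain string slicing. The sketch below assumes the standard ATOM/HETATM column layout; the function name `parse_atom_line` and the example record are made up for illustration and do not come from a real entry.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record using the fixed column layout of the PDB format."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),         # columns 7-11: atom serial number
        "atom_name": line[12:16].strip(),  # columns 13-16: atom name
        "res_name": line[17:20].strip(),   # columns 18-20: residue name
        "chain_id": line[21],              # column 22: chain identifier
        "res_seq": int(line[22:26]),       # columns 23-26: residue sequence number
        "x": float(line[30:38]),           # columns 31-38: x in angstroms
        "y": float(line[38:46]),           # columns 39-46: y in angstroms
        "z": float(line[46:54]),           # columns 47-54: z in angstroms
    }

# Illustrative record built field by field so that the column widths are explicit
example = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
    "ATOM", 1, " N", "", "ALA", "A", 1, "", 11.104, 6.134, -6.504)

atom = parse_atom_line(example)
print(atom)
```

Slicing by column is preferable to splitting on whitespace, since adjacent fields in a PDB line are allowed to touch each other.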
PDB  2
l
l
l
The Protein Data Bank (PDB) format provides a standard repre
Documentation describing the PDB file format is available from
Historical copies of the PDB file format from 1992* and 1996*
B3
re representations
The cartoon method evidences regions of secondary structure
The 3D threadlike repr
The representation of the molecular surface reveals the overall shape of the protein
predictive techniques  1
Problem: Definition of an algorithm which, given the threed
A very important question since many proteins are active only wh
predictive techniques  2
Solution: from the PDB database, select a set of sample s
There will be interfacial residues, involved in the contact surfac
For each residue, a set of features, to be measured and used to
predictive techniques  3
Possible features:
Number of residues within a given radius with respect to the test
Net charge of the residue and of the neighboring residues
Hydrophobicity level
Potential of the hydrogen bonds
Construction of a feature vector describing the given residue
In conjunction with the feature vector, a target is given, atte
Application of machine learning methods
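A feature vector of the kind listed above might be assembled as in the following sketch. It covers three of the listed features (neighbor count, net charge, hydrophobicity) and omits the hydrogen-bond potential. The 8 angstrom radius, the simplified charge table, the choice of the Kyte-Doolittle hydropathy scale and the function name `residue_features` are assumptions for illustration.

```python
import math

# Kyte-Doolittle hydropathy values (one common hydrophobicity scale)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified side-chain charges; histidine is treated as neutral here
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def residue_features(residues, index, radius=8.0):
    """Build [neighbor count, net charge, hydrophobicity] for one residue.

    residues: list of (one_letter_code, (x, y, z)), e.g. C-alpha coordinates.
    """
    aa, center = residues[index]
    neighbors = [a for j, (a, pos) in enumerate(residues)
                 if j != index and math.dist(center, pos) <= radius]
    net_charge = CHARGE.get(aa, 0) + sum(CHARGE.get(a, 0) for a in neighbors)
    return [len(neighbors), net_charge, KYTE_DOOLITTLE[aa]]

# Hypothetical four-residue toy structure
toy = [("K", (0.0, 0.0, 0.0)), ("D", (3.0, 0.0, 0.0)),
       ("L", (0.0, 5.0, 0.0)), ("E", (20.0, 0.0, 0.0))]
features = residue_features(toy, 0)
print(features)  # [2, 0, -3.9]
```

Each vector would be paired with a target label (interfacial or not) taken from the known complexes, and the labeled set handed to a learning method of choice.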
modification prediction
The wide variety of protein structures and functions is partia
Removal of protein segments
Formation of covalent bonds between residues and sugars, or ph
Formation of crosslinks involving (possibly far) residues within a
Many of these modifications are carried out by other protein
Neural network based prediction techniques
sorting  1
The presence of internal cellular compartments surrounded by mem
Both the chemical environment and the protein population may dif
It is imperative, for energetic and functional reasons, that eukaryo
For example, histones, proteins that bind to DNA and are assoc
Other proteins, such as proteases, which are located inside perox
sorting  2
It seems that eukaryotic cells consider proteins as belo
The first set of proteins is exclusively translated by ribo
sorting  3
Subsequently, the mRNAs, translated by the floating ribosom
within the nucleus
in the mitochondria
in the chloroplasts
in the peroxisomes
sorting  4
Cytoplasm appears to be the default environment for protein
Organelle      Signal localization   Type                          Signal length
Mitochondria   N-terminal            Amphipathic helix             12-30
Chloroplasts   N-terminal            Charged                       25
Nucleus        Internal              Basic                         7-41
Peroxisomes    C-terminal            SKL (serine-lysine-leucine)   3
sorting  5
Nuclear proteins possess a nuclear localization sequence: an
Mitochondrial proteins possess an amphipathic helix (which c
This mitochondrial signal sequence is recognized by a receptor on
sorting  6
The chloroplast proteins, encoded by nuclear genes, ha
Finally, proteins destined to peroxisomes possess one o
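Of the localization signals discussed here, the C-terminal SKL tripeptide is the simplest to test for in a sequence. The sketch below checks only that one signal; the function names and the example sequences are hypothetical, and a match is a candidate rather than a confirmed localization.

```python
def has_c_terminal_skl(sequence):
    """Return True if the protein sequence ends with the tripeptide Ser-Lys-Leu."""
    return sequence.upper().endswith("SKL")

def guess_compartment(sequence):
    # Only the peroxisomal SKL signal is checked; every other case is left undetermined
    return "peroxisome (candidate)" if has_c_terminal_skl(sequence) else "undetermined"

# Made-up sequences for illustration
print(guess_compartment("MAAGTLSKL"))  # ends with SKL
print(guess_compartment("MSKLAAGT"))   # SKL present, but not at the C-terminus
```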
sorting  7
The second set of proteins is translated by ribosomes, bound
The endoplasmic reticulum is a network of membranes intim
All the proteins translated by the ER ribosomes, in fact, begi
When the first 1530 amino acids to be translated correspond to
sorting  8
Although no particular consensus sequence exists for the sig
When the translation resumes, the new polypeptide is extrud
A signal peptidase cuts the target N-terminal sequence fr
Neural nets: http://www.cbs.dtu.dk/services/SignalP/
cleavage  1
Both prokaryotes and eukaryotes possess several enzym
There are different types of proteolytic cleavage:
Removal of the methionine residue present at the beginnin
Removal of the signal peptides
cleavage  2
Sometimes, the cleavage signal is constituted by a single res
Chymotrypsin cuts polypeptides at the C-terminal side of bulky aroma
Trypsin cuts the peptide bond on the carboxyl side of lysine and a
Elastase cuts the peptide bond on the C-terminal side of small residue
However, in many cases, the sequence motif is longer and am
Neural networks: prediction accuracy > 98% (http://www.pa
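For proteases whose cleavage signal is a single residue, the cut sites can be found with a direct scan of the sequence. The sketch below encodes the usual textbook rules, trypsin after lysine or arginine and chymotrypsin after phenylalanine, tryptophan or tyrosine. It ignores known exceptions such as a following proline and leaves elastase out. The function name `digest` and the test peptide are made up for the example.

```python
# Residues after which each protease cuts (simplified single-residue rules)
CLEAVAGE_RULES = {
    "trypsin": "KR",
    "chymotrypsin": "FWY",
}

def digest(sequence, protease):
    """Split a sequence on the carboxyl side of every residue recognized by the protease."""
    residues = CLEAVAGE_RULES[protease]
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        # No cut after the last residue: there is no peptide bond to break
        if aa in residues and i < len(sequence) - 1:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments

peptide = "MKWVTFRSLLK"
print(digest(peptide, "trypsin"))       # ['MK', 'WVTFR', 'SLLK']
print(digest(peptide, "chymotrypsin"))  # ['MKW', 'VTF', 'RSLLK']
```

Longer and more ambiguous motifs do not reduce to a lookup like this, which is where the learned predictors mentioned above come in.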
cleavage  3
ylation  1
Glycosylation is the process that permanently binds an oligo
The presence of glycosylated residues can have a significant
In eukaryotes:
Nglycosylation
Oglycosylation
ylation  2
The Nglycosylation is the addition of an oligo-saccharide to
The main signal which indicates that an asparagine residue (Asn)
However, this sequence alone is not sufficient to determine glyco
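Candidate N-glycosylation sites can be listed with a short pattern scan. The sketch assumes the commonly cited consensus Asn-X-Ser/Thr, with X being any residue except proline. The function name and the test sequence are made up for illustration, and the hits are only candidates, since the motif alone does not decide whether a site is glycosylated.

```python
import re

# Lookahead so that overlapping motifs (e.g. "NNTS") are all reported
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T motif (X not proline)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]

positions = find_sequons("MNGSANPTLNNTS")
print(positions)  # [2, 10, 11]
```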
ylation  3
The Oglycosylation is a posttranslational process in which
Unlike the Nglycosylation, known sequence motifs that mar
Neural Networks: accuracy of 75% for Nglycosylation and h
rylation  1
Phosphorylation (binding of a phosphate group) of surface re
Kinases, which are responsible for phosphorylation, are also invo
Since phosphorylation frequently serves as a signal for the enzym
Phosphatases are the enzymes responsible for removing phospha
rylation  2
Since phosphorylation of key residues of tyrosine, serine and
No single consensus sequence identifies a residue as a phosphory
Neural Networks: accuracy 70%
(http://www.cbs.dtu.dk/services/NetPhos)
in Protein Interaction 1

Protein interactions are characterized as stable or transient and they could be e

Hemoglobin and core RNA polymerase are examples of multi-subunit interaction

Transient interactions are temporary and typically require a set of conditions tha
in Protein Interaction 2
Proteins bind to each other through a combination of hydrophobic bonding,
ein Protein Interaction 3
The result of two or more proteins that interact with a specific functional objective can be demon

Alter the kinetic properties of enzymes, which may be the result of subtle changes in substrate

Allow for substrate channeling by moving a substrate between domains or subunits, resulting

Create a new binding site, typically for small effector molecules

Inactivate or destroy a protein

Change the specificity of a protein for its substrate through the interaction with different bindin

Serve a regulatory role in either an upstream or a downstream event
nteraction and drug design




The importance of unveiling the human protein interaction network is undeniable, particularly in the b
Even though protein interaction networks evolve over time and can suffer spontaneous alterations, o
These disorders may be caused by external pathogens, such as bacteria and viruses, or by intrinsic
Therefore, having the knowledge of how proteins interact with each other will provide a great opportu
ding…  1
While genomics is rapidly becoming a highly developed resea
The proteome characterization promises to bridge the gap be
Various taxonomies have been developed to classify and org
ding…  2
Equipped with databases of families, superfamilies and prote
Important proteomics applications are in drug design
Recent advances in protein structure understanding and in X-ray
ding…  3
Although the 3D structure of proteins is fundamental to unde
Actually, the location signal and the various posttranslationa
Posttranslational modifications account for the fact that the sam
ED Talk about drugs
https://www.youtube.com/watch?v=RKmxL8VYy0M