Lecture 4 - ISP 2016

Theory and Simulations of Polyelectrolytes
International Summer School
June 23-25 2016, Lomonosov Moscow State University
Natural polyelectrolytes
Alexey Shaytan, PhD
[email protected]
23 June 2016
https://goo.gl/cKnnbN
Outline
• Introduction: main classes of natural polyelectrolytes
• Key differences between natural and synthetic polymers
• Main concepts and advances in molecular biology
• Detailed discussion of main classes of natural
polyelectrolytes and their complexes with examples:
–Nucleic acids (DNA, RNA)
–Proteins
–Polysaccharides
–Lipids
What are natural polyelectrolytes?
Natural polyelectrolytes are biomacromolecules
which are at the same time polyelectrolytes
IUPAC definitions*
Polyelectrolyte molecule: A macromolecule in which a substantial portion of the
constitutional units have ionizable or ionic groups, or both.
Macromolecule = Polymer molecule: A molecule of high relative molecular mass, the
structure of which essentially comprises the multiple repetition of units derived, actually
or conceptually, from molecules of low relative molecular mass.
Biomacromolecule: Macromolecule (including proteins, nucleic acids, and
polysaccharides) formed by living organisms.
Biopolymer: Substance composed of one type of biomacromolecules.
Polymer: A substance composed of macromolecules.
Common definitions
A macromolecule is a very large molecule.
*Terminology for biorelated polymers and applications (IUPAC Recommendations 2012)
Classes of natural polyelectrolytes?
They are the same as the four classes of biopolymers
Biopolymer classes
• Nucleic acids (DNA, RNA)
• Proteins
• Polysaccharides
• Lipids*
DNA and RNA are always polyelectrolytes.
In the other classes, it depends on composition.
Proteins on average have 20% of charged groups (in vertebrates).
Phospholipids (membrane lipids): 1-2 charges per lipid molecule.
Polysaccharides may be uncharged or may have the highest charge density of all known biopolymers (heparin).
*Not always considered macromolecules or polymers.
Natural polyelectrolytes: main locations
[Diagram: animal cell, bacterial cell, and virus]
RNA, DNA
Polysaccharides
• Extracellular matrix
• Cartilage, skin, etc.
• Bacterial cell envelopes
Lipids
• Membranes
Proteins
• Everywhere
Diagram is not comprehensive and shows only example locations of biomacromolecules
From monomers to polymers
Class          | Monomers       | Monomer types #            | Topology
DNA/RNA        | Nucleotide     | 4 (A, T/U, G, C)           | Linear
Protein        | Amino acid     | 20                         | Linear
Polysaccharide | Monosaccharide | Many, + many modifications | Linear or branched
Lipid          | Lipid          | Many, depends on organism  | Aggregates of individual molecules, e.g. lipid bilayers
Central dogma of molecular biology
In living organisms*
• DNA replication
• Transcription: RNA polymerase copies DNA to mRNA
• Translation: ribosomes make protein from mRNA
*Image from: https://commons.wikimedia.org/wiki/File:Peptide_syn.png
Central dogma of molecular biology
https://en.wikipedia.org/wiki/Protein_production#/media/File:Genetic_code.svg
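The two information-transfer steps of the central dogma can be illustrated with a minimal sketch. It assumes the input is the coding (sense) DNA strand, uses the standard genetic code, and ignores start-codon search, splicing, and other cellular details. The function names transcribe and translate are illustrative only.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated with the
# first base varying slowest, in the order U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a coding-strand DNA sequence (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon (one-letter codes) until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

Here the toy sequence ATGGCCTAA encodes Met-Ala followed by a stop codon.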
Practical synthesis of biopolymers
Obtaining and generating DNA
• Solid-phase DNA synthesis
• Polymerase chain reaction (PCR)
• Molecular cloning: use bacteria to copy DNA
[Diagram labels: DNA, in vitro transcription, RNA; expression system, protein; protein purification, protein]
Folding and self-assembly
Proteins, RNA, and DNA may self-assemble into 3D
structures with extremely high precision
• Small soluble proteins spontaneously fold into a unique 3D structure
• Ribosome 30S subunit formed by RNA and proteins
• DNA origami
Box 1: protein folding
Levinthal's paradox: because of the very large number of degrees of freedom in
an unfolded polypeptide chain, the molecule has an astronomical
number of possible conformations. An estimate of 10^143 was made in one
of Levinthal's papers.
Sequences are optimized for folding.
Folding funnel
Folding pathways with intermediates
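The order of magnitude in Levinthal's estimate can be reproduced with a short calculation. This is a sketch under stated assumptions, not a reconstruction of the original paper: it assumes 300 degrees of freedom with 3 states each (which gives about 10^143) and a hypothetical sampling rate of 10^13 conformations per second.

```python
import math

SECONDS_PER_YEAR = 3600 * 24 * 365.25


def log10_conformations(degrees_of_freedom, states_per_dof):
    """log10 of the number of conformations, states ** degrees_of_freedom."""
    return degrees_of_freedom * math.log10(states_per_dof)


def log10_search_years(log10_conf, samples_per_second=1e13):
    """log10 of the years needed to visit every conformation once."""
    return (log10_conf
            - math.log10(samples_per_second)
            - math.log10(SECONDS_PER_YEAR))


# Assumed numbers: 300 degrees of freedom, 3 states each.
exponent = log10_conformations(300, 3)
print(f"conformations ~ 10^{exponent:.0f}")
print(f"exhaustive search ~ 10^{log10_search_years(exponent):.0f} years")
```

Even at this sampling rate an exhaustive search would take vastly longer than the billions of years available to evolution, which is why folding must proceed along funnels and pathways rather than by random search.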
billions of years
Evolution, selection, mutations
Mutations in DNA/RNA
during copying
Over billions of years nature performed vast sampling of
sequence space for DNA, RNA and proteins
Big numbers and sequence space
• Atoms in visible universe ~10^81
• Number of theoretical proteins 100 amino acids in length ~10^130
• Cells in human body ~4*10^13
• Bacteria on Earth ~5*10^30
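The size of sequence space follows from simple counting: a chain of length N over an alphabet of k monomer types has k^N possible sequences. A minimal sketch follows. The function name is illustrative, and the comparison value of 10^81 atoms is taken from the list above.

```python
import math


def sequence_space_size(alphabet_size, length):
    """Number of distinct linear sequences: alphabet_size ** length."""
    return alphabet_size ** length


proteins_100 = sequence_space_size(20, 100)  # 20 amino acid types
dna_100 = sequence_space_size(4, 100)        # 4 nucleotide types

print(f"proteins of length 100: ~10^{math.log10(proteins_100):.0f}")
print(f"DNA of length 100:      ~10^{math.log10(dna_100):.0f}")

# Compare with the ~10^81 atoms in the visible universe quoted above.
print(proteins_100 > 10 ** 81)
```

Python integers have arbitrary precision, so the counts are exact and only the printed exponents are rounded.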
Box 2: Antibodies and immune system
Antibodies may bind to other
molecules with high affinity
Generated by immune system via rapid
sampling of protein sequence space
Box 3: Aptamers, in vitro evolution
Aptamer - oligonucleotide or peptide that binds to a specific target molecule.
RNA aptamer binding
vitamin B molecule
SELEX method to generate
RNA/DNA aptamers
Macugen: an aptamer that binds the VEGF protein, used to treat
age-related macular degeneration (AMD)
Key features of biopolymers
What can we learn from nature?
• Self-assembly
• Precise non-covalent interactions between macromolecules
• Program 3D structure via sequence
• Library generation and artificial selection
• Conversion of various polymers (DNA<->RNA->protein)
• Use living systems as vehicles to produce polymers
Why study biopolymers?
We expect technological revolutions to happen in biotech soon.
• Next-generation sequencing
• Gene editing
• Optogenetics
Why study biopolymers?
Synthetic biology – rational engineering of new organisms.
• New layer of abstraction
• Standardization of biological parts
Electrical engineering | Synthetic biology
Transistor             | Gene
Logic element          | Regulatory elements
Integrated circuit     | Gene circuit
Chassis                | Model organism
Registry of standard biological parts: parts.igem.org
Part II – detailed discussion
Nucleic acids
Nucleic acids
X = OH for RNA
X = H for DNA
[Figure labels: X, nucleoside, nucleotide]
DNA base pairs: A-T, G-C
RNA bases: A, U, G, C
Uracil replaces thymine in RNA
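The base-pairing rules can be written as a lookup table. The sketch below builds the complementary DNA strand, read in the opposite direction because the two strands are antiparallel, and converts a DNA sequence to the RNA alphabet. Function names are illustrative.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(dna.upper()))


def to_rna_alphabet(dna):
    """Rewrite a DNA sequence in the RNA alphabet (uracil replaces thymine)."""
    return dna.upper().replace("T", "U")


print(reverse_complement("ATGC"))
print(to_rna_alphabet("GATTACA"))
```

Applying reverse_complement twice returns the original sequence, which is a quick sanity check on the pairing table.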
Nucleic acids
DNA forms: A-DNA, B-DNA, Z-DNA
RNA secondary structure: RNA hairpin
RNA has A-form in double helix
Nucleic acids: RNA vs DNA
Sugar puckering
[Figure label: only conformation adopted by RNA]
RNA has less conformational entropy and a lower penalty for adopting various 3D structures
A or B conformation depends on sugar puckering, which is affected by the presence of the 2' OH group in RNA
Chromatin
[Figure: complex organism vs bacteria; eukaryotic cell; cell nucleus, 6 µm]
• Chromatin = DNA + proteins + RNA
• Compacts DNA ~1 000 000 times
• Nucleus: control center of the cell
• Turns genes on and off
• Responds to stimuli
• Has epigenetic memory
Human DNA length: 2 meters
Total body DNA length: 80 billion km
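The total body DNA length can be checked with back-of-the-envelope arithmetic. The sketch assumes that 2 meters is the DNA length per cell and uses the figure of about 4*10^13 cells in the human body quoted earlier in the lecture. The length-to-diameter ratio is only a rough illustration of scale, not the compaction factor from the slide.

```python
DNA_PER_CELL_M = 2.0        # DNA length per human cell, meters
CELLS_IN_BODY = 4e13        # cells in the human body, from the lecture
NUCLEUS_DIAMETER_M = 6e-6   # nucleus diameter, 6 micrometers


def total_dna_length_km(dna_per_cell_m, n_cells):
    """Total length of all DNA in the body, in kilometers."""
    return dna_per_cell_m * n_cells / 1000.0


def length_to_diameter_ratio(dna_length_m, diameter_m):
    """How many times longer the DNA is than the nucleus is wide."""
    return dna_length_m / diameter_m


total_km = total_dna_length_km(DNA_PER_CELL_M, CELLS_IN_BODY)
ratio = length_to_diameter_ratio(DNA_PER_CELL_M, NUCLEUS_DIAMETER_M)
print(f"total DNA length: {total_km:.1e} km")
print(f"DNA length / nucleus diameter: {ratio:.1e}")
```

The result, 8*10^10 km, matches the 80 billion km quoted above.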
[Figure: levels of chromatin compaction, with scale labels 2 nm, 10 nm, 30 nm, 300 nm, 700 nm, 1400 nm. Felsenfeld and Groudine. Nature, 2003]
Nucleosome structure
[Figure labels: core histones (2x, 2x, tetramer); histone tails; 147 bp]
Nucleosome core particle (NCP) + linker DNA = nucleosome
Nucleosome + linker histone = chromatosome
3D-print your own nucleosome:
Nucleosome LEGO project
github.com/molsim/nuclLEGO
Nucleosome structure
AK Shaytan, GA Armeev, A Goncearenco, VB Zhurkin, D Landsman, AR Panchenko, JMB, 2016
Part II – detailed discussion
Proteins
Proteins
Amino acid
Peptide bond
Proteins: structure
It is believed that hydrophobic
collapse is the main driving
force for protein folding
Proteins: hydrophobic/hydrophilic balance
For globular soluble proteins, charged residues are exposed on the surface
Among totally non-exposed residues, charged residues account for ~6%
Charged amino acid frequency in vertebrates (Asp, Glu, Lys, Arg): ~23%
[Figure legend: - charged, + charged, hydrophobic]
Shaytan AK, Shaitan KV, Khokhlov AR. Biomacromolecules 2009, 10,1224-1237
Example: self-assembling fibrils
EF-C peptide
Amino-acid sequence:
Gln-Cys-Lys-Ile-Lys-Gln-Ile-Ile-Asn-Met-Trp-Gln
Nanofibrils (d = 4 nm, l = 100-400 nm) + viral vector + cell
= up to 100-fold enhancement of viral transduction
Yolamanova M, Meier C, Shaytan AK, et al., Nature Nanotechnol. 2013, 8(2):130-6.
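The charge content of a short peptide such as EF-C can be counted directly from its sequence. This is a simplified sketch: it counts only Lys and Arg as positive and Asp and Glu as negative, and it ignores the chain termini, histidine, and any pH dependence. The helper names are illustrative.

```python
POSITIVE = {"Lys", "Arg"}
NEGATIVE = {"Asp", "Glu"}


def parse_sequence(three_letter_sequence):
    """Split a hyphen-separated three-letter sequence into residues."""
    return three_letter_sequence.split("-")


def charged_fraction(residues):
    """Fraction of residues with a charged side chain."""
    charged = sum(1 for r in residues if r in POSITIVE or r in NEGATIVE)
    return charged / len(residues)


def net_side_chain_charge(residues):
    """Net charge from side chains only (termini ignored)."""
    return (sum(1 for r in residues if r in POSITIVE)
            - sum(1 for r in residues if r in NEGATIVE))


ef_c = parse_sequence("Gln-Cys-Lys-Ile-Lys-Gln-Ile-Ile-Asn-Met-Trp-Gln")
print(len(ef_c), charged_fraction(ef_c), net_side_chain_charge(ef_c))
```

Under these assumptions EF-C has 12 residues, two of them lysines, so its side chains carry a net positive charge.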
Protein design
Molecular mechanics force fields
Problems:
• Total free energy of folding is a sum of many opposing
components (hydrophobic, electrostatic, polar, entropic)
• Native proteins are only marginally stable: 5-15 kcal/mol between
the folded and unfolded states
• Conformational space is huge! (Levinthal's paradox)
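To get a feeling for what marginal stability of 5-15 kcal/mol means, the free energy difference can be converted into a folded population. The sketch assumes a simple two-state model (folded and unfolded only) at room temperature, which is an idealization and not part of the lecture.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def folded_fraction(dg_unfolding_kcal, temperature_k=298.15):
    """Two-state folded population for an unfolding free energy in kcal/mol.

    K = exp(dG / RT) is the ratio of folded to unfolded molecules.
    """
    k = math.exp(dg_unfolding_kcal / (R_KCAL * temperature_k))
    return k / (1.0 + k)


for dg in (0.0, 5.0, 15.0):
    unfolded = 1.0 - folded_fraction(dg)
    print(f"dG = {dg:4.1f} kcal/mol -> unfolded fraction {unfolded:.1e}")
```

At 5 kcal/mol only about 2 molecules in 10 000 are unfolded, yet this stability is small compared with the large opposing energy terms, so a modest error in any single term can change the predicted outcome.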
Protein design
De novo protein design is possible due to advances in computational biology
Part II – detailed discussion
Polysaccharides
Polysaccharides
Polymers of monosaccharides (simple sugars)
General formula for simple sugar
Cyclic isomers of glucose
Stereo isomers of glucose
Amylose: a polymer of glucose
Amylopectin – branched form
Polyelectrolytes among polysaccharides
Glycosaminoglycans: essential components of the extracellular matrix that
contribute to the tensile strength of cartilage, tendons, and ligaments
Heparin has the highest negative charge density of any known biological molecule.
Agarose
Part II – detailed discussion
Lipids
Lipids
Hydrophobic or amphiphilic small molecules
Lipid membranes
Lipid membranes
5nm
Atomistic models
M. Bozdaganyan, master thesis, 2010
Ion channels in membranes
M.A. Kasimova, master thesis, 2011
10 ns
M. Jensen, PNAS, 2010
Thank you for your attention!
Questions?