Download of a protein

Proteins Attila Ambrus versatile functions in biological systems linear polymers of amino acids spontaneous folding to 3D structures that eventually determines function structure dictates function (DNA replication machinery) wide range of functional groups, most of them are chemically reactive functional groups account for function (e.g. enzymes) complexes with other biomacromolecules (proteins, RNA/DNA, lipids, carbohydrates, inorganics (e.g. ions), etc.) adopt even more functionalities that proteins alone lack some proteins are rigid, some are flexible: rigid proteins may work for connective tissues or cytoskeleton while flexible ones can assemble with other molecules for more complex functions (e.g. transmit some kind of information in or between cells) flexibility and function (the protein lactoferrin undergoes a substantial conformational change upon binding Fe3+ ; apo- and holo-enzymes) Alpha-amino acids building blocks of proteins four different substituents around (alpha-)carbon: chirality (except Gly) CORN rule: if COOH, R, NH2 are clockwise: D-form, anticlockwise: L-form L=S (except for cysteine) “side chain” absolute configuration (Cahn–Ingold–Prelog [CIP] system) Ionization state of amino acids as a function of pH (without side chain contributions) side chains Side chains they differ in size, shape, charge, H-bonding capacity, hydrophobic character and chemical reactivity twenty amino acids build up all proteins in all species in the evolutionary tree (with few exceptions; this “alphabet” is several billion years old) hydrophobic effect in proteins: hydrophobic core resisting contact with water (apolar character), multimerization surfaces (protein-protein interactions) polar side chains prefer being on the surface contacting water Proline is a special amino acid the ring structure markedly influences local protein structure due to its rigid nature (see also cis/trans peptide bonds later) Aromatic side chains reactive Determination of protein concentration # of Tyr, Trp and S-S bonds count for e of a protein Polar/uncharged amino acids reactive additional asymmetric center Cysteine is also special in a way… much more reactive than -OH two Cys –SHs can form disulfide bonds (-S-S-, by oxidation, forming cystine) that is particularly important in stabilization of the 3D structure of proteins Polar/charged amino acids at near neutral pH, depending on local environment (catalytic effects, enzyme active centers) Aspartic acid Glutamic acid in special environments/settings in a protein Asp/Glu can be (partially or transiently) protonated that generally has an important functional role in enzymatic mechanisms Why these amino acids (why not others)? they are versatile enough for structure and function of necessary proteins /enzymes for life they were probably available from prebiotic reactions (before the origin of life) other possible amino acids may be too reactive for the purpose (e.g. homoserine or homocysteine) spontaneous cyclization (limitations for protein structure) Peptide bond residue condensation dihedral angle: w (amide bond) w=0o for cis, 180o for trans isomer (isomerization is slow [10-100 s], but can be facilitated by peptidyl prolyl isomerases; normal protein folding is 10-100 ms) endergonic reaction under most conditions, needs input of free energy the peptide bond is kinetically stabilized (metastable) since the lifetime of a peptide bond in water is ~1000 years (in the absence of a catalyst) in folded proteins overwhelmingly (~1000:1) the trans isomer dominate (for X-Pro peptide bonds this ratio only ~3:1!; with is proline the similar state of energy) magnitudes ofinthe steric clashes two resonance forms, Ea=~20 kcal/mol, lesssimilar reactive than esters, detection effects are cis configuration of peptide bond: at 190-230 nm (UV spectrometry) 60% 40% relatively high dipole moment in the double-bonded form (~3.5 D), lining up these dipoles e.g. in an alpha-helix produces great net dipole moments (important in physico-chemical properties of proteins) peptide bonds (proteins) can be broken down to amino acids (or smaller peptides) chemically by acids or bases (generally with 6 M HCl, 110 oC, 18-96 h or 2-4 M NaOH, 100 oC, 4-8 h) or enzymatically by peptidases (proteinases, proteases, see later) Protein termini the protein chain has a polarity (the two ends of the chain are different) by convention, the –NH2 terminus is put at the start of writing the sequence (Leu-Phe-Gly-Gly-Tyr is another oligopeptide with indeed differing properties) Backbone and side chains distinctive side chains (variable parts) main chain or backbone (repeating/constant) there are great H-bonding potentials in the backbone: N-H is a good donor, C=O is a good acceptor they interact with one another and with functional groups from side chains and stabilize structural elements in proteins proteins generally contain 50-2000 amino acids (a muscle protein contain 27,000 amino acids) sequences of small numbers of amino acids are called oligopeptides or just peptides (although if they serve already protein-like functions, they may be called miniproteins) the average molecular weight of an amino acid is ~110 g/mol, hence the molecular weight (MW) of a protein generally ranges from 5,500 to 220,000 g/mol they also use as a unit of molecular weight of biomacromolecules the Dalton (after John Dalton [1766-1844] who suggested for the unit of atomic mass the weight of an H atom in 1803; since 1961 we use 12C as a basis of atomic weight especially due to the discovery of elemental isotopes in 1912). Designation of Dalton as a unit can be Da, D, d and kDa, kD, kd; practically the same number numerically as the regular MW, so for example: 50 kDa=~50,000 g/mol Cross-linking disulfide bonds oxidation occurs especially for extracellular proteins (intracellular environment is generally too reductive for the S-S bond) periplasm of bacterial strains are also rather oxidative and may support correct folding if proteins are stabile with specific –SH groups being oxidized (advantage of a periplasmic protein over-expression system, see protein purification, later) rarely there are other side chains participating in cross-links in proteins, like in collagen fibers in connective tissue or in fibrin blood clots Frederick Sanger, 1953 amino acid sequence of insuline (protein hormone, the very first protein sequence determined) ~2,000,000 protein sequences are known today! amino acid sequence = primary structure (of a protein) What is a protein sequence good for? essential to get to know the mechanism of action (e.g. catalytic mechanism of an enzyme) proteins with novel properties can be generated by varying the sequences of known proteins (the science of protein engineering) the primary sequence determines the 3D structure of the protein and it is the link between the genetically encoded information in DNA and the actual biological function of the protein analysis of the relation between primary and 3D structures uncovers mechanisms of folding/unfolding/refolding of proteins sequence determination is a component of molecular pathology (searching for mutations that determines predisposition to various diseases – alterations in amino acid sequence may result in abnormal function and disease) sequence of a protein reveals much about its evolutionary history, protein sequences that resemble to one another likely to have a common ancestor, hence molecular events in evolution can be traced down (phylogenetics – “relatedness”, molecular paleontology)

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download of a protein