Basic concepts of molecular biology and proteins I

PROTEINS

A protein is a large molecule composed of one or more chains of amino acids in a specific order; the order is determined by the base sequence of nucleotides in the gene coding for the protein. Proteins are required for the structure, function, and regulation of the body's cells, tissues, and organs, and each protein has unique functions. Examples are hormones, enzymes, and antibodies.

A peptide bond. This covalent bond forms when the carbon atom from the carboxyl group of one amino acid shares electrons with the nitrogen atom (blue) from the amino group of a second amino acid. As indicated, a molecule of water is lost in this condensation reaction.

Proteins are built up by amino acids that are linked by peptide bonds to form a polypeptide chain. (a) Schematic diagram of an amino acid. A central carbon atom (Cα) is attached to an amino group (NH2), a carboxyl group (COOH), a hydrogen atom (H), and a side chain (R). (b) In a polypeptide chain the carboxyl group of amino acid n has formed a peptide bond, C-N, to the amino group of amino acid n + 1. One water molecule is eliminated in this process. The repeating units, which are called residues, are divided into main-chain atoms and side chains. The main-chain part, which is identical in all residues, contains a central Cα atom attached to an NH group, a C'=O group, and an H atom. The side chain R, which is different for different residues, is bound to the Cα atom.

Condensation reaction: a water molecule is lost. The opposite is the hydrolysis reaction: a water molecule is added.

Let us consider a macromolecule composed of n structural units along the backbone.

• $R_i$ = position vector of unit i, with components $R_i = (R_{xi}, R_{yi}, R_{zi})$
• $\{C_i\}$ = ith conformation
• $\{C\} = \{R_1, R_2, R_3, \ldots, R_{n-1}, R_n\}$

Schematic representation of a chain of n backbone units. Bonds are labeled from 2 to n, and structural units from 1 to n. The location of the ith unit with respect to the laboratory-fixed frame OXYZ is indicated by the position vector Ri.
R1 and R3 are explicitly shown.

Diagram showing a polypeptide chain where the main-chain atoms are represented as rigid peptide units, linked through the Cα atoms. Each unit has two degrees of freedom; it can rotate around two bonds, its Cα-C' bond and its N-Cα bond. The angle of rotation around the N-Cα bond is called phi (φ) and that around the Cα-C' bond is called psi (ψ). The conformation of the main-chain atoms is therefore determined by the values of these two angles for each amino acid.

Amino acids are chiral molecules (except Gly). Looking down the H-Cα bond from the hydrogen atom, the L-form has the CO, R, and N substituents of Cα arranged in a clockwise direction (CORN).

$\{C\} = \{R_1, R_2, R_3, \ldots, R_{n-1}, R_n\}$. If you know these variables, then you know the structure. There are 3n variables, since each $R_i = (R_{xi}, R_{yi}, R_{zi})$.

A change of variables from R to internal coordinates can be used for representing the conformation of a protein.

Schematic representation of a portion of the main chain of a macromolecule. $l_i$ is the bond vector between units i-1 and i, as shown. $\phi_i$ denotes the torsional angle about bond i.

Spatial representation of the torsional mobility around the bond i+1. The torsional angle $\phi_{i+1}$ of bond i+1 determines the position of the atom $C_{i+2}$ relative to $C_{i-1}$. $C'_{i+2}$ and $C''_{i+2}$ represent the locations of the atom i+2 when $\phi_{i+1}$ assumes the respective values 180° and 0°, characteristic of the trans and cis rotameric states.
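The change of variables between position vectors and internal coordinates can be illustrated in the reverse direction: given three consecutive backbone units plus a bond length, a bond angle, and a torsional angle, the next unit can be placed in space. The following is a minimal Python sketch and is not part of the original slides. The function name place_atom, the helper functions, and the numerical values (bond length 1.5, bond angle 110°) are illustrative assumptions. The torsion is measured so that 180° is trans and 0° is cis, as in the caption above.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    """Vector a scaled to unit length."""
    n = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / n, a[1] / n, a[2] / n)

def place_atom(a, b, c, length, theta_deg, phi_deg):
    """Place atom d bonded to c (hypothetical helper).

    length    : bond length |c-d|
    theta_deg : bond angle b-c-d in degrees
    phi_deg   : torsional angle a-b-c-d in degrees (180 = trans, 0 = cis)
    """
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    bc = unit(sub(c, b))                # direction of the b-c bond
    n = unit(cross(sub(b, a), bc))      # normal to the a-b-c plane
    m = cross(n, bc)                    # in-plane axis, on the same side as a
    dx = -length * math.cos(theta)      # component along the b-c bond
    dy = length * math.sin(theta) * math.cos(phi)
    dz = length * math.sin(theta) * math.sin(phi)
    return tuple(c[k] + dx * bc[k] + dy * m[k] + dz * n[k] for k in range(3))

# Three units lying in the XY plane (made-up coordinates).
a = (-0.5, 0.87, 0.0)
b = (0.0, 0.0, 0.0)
c = (1.5, 0.0, 0.0)

trans = place_atom(a, b, c, 1.5, 110.0, 180.0)
cis = place_atom(a, b, c, 1.5, 110.0, 0.0)
print("trans:", trans)
print("cis:  ", cis)
```

With these inputs both positions stay in the plane of a, b, and c; the trans position falls on the opposite side of the b-c bond from a, and the cis position on the same side.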
GENERALIZED COORDINATES FOR DEFINING THE INTERNAL CONFORMATION

If you eliminate translation and rotation, 3n-6 = 3(n-2) variables are left.

$l_i = r_i - r_{i-1}$ is the bond vector connecting the units i-1 and i, pointing from i-1 to i.

An alternative representation uses a change of coordinate system, in which $r_i$ is the position vector in the new frame:

• $r_1 = 0$ removes the translational degrees of freedom.
• Choosing the first bond along the x axis, with the third unit in the xy plane, removes the rotational degrees of freedom: $y_2 = z_2 = z_3 = 0$.

The remaining 3n-6 coordinates $\{x_2, x_3, y_3, x_4, y_4, z_4, \ldots, x_n, y_n, z_n\}$ define the internal configuration of the molecule.

The position of the ith unit with respect to the original frame can be expressed in terms of the internal position vectors $r_i$ as $R_i = R_1 + T_1 r_i$, where $T_1$ is the transformation matrix for the passage from the frame O1X1Y1Z1 into the laboratory-fixed frame OXYZ (the first atom is used as a reference).

Note on cis vs. trans states: not all torsional angles are equally probable. Some torsional angles, referred to as rotational isomeric states (RIS), are more frequent than others, being favored by the intrinsic torsional energies of the particular bonds.

Rotational energy as a function of dihedral angle for a threefold symmetric torsional potential (dashed curve) and a three-state potential with a preference for the trans isomer (ϕ = 180°) over the gauche isomers (60° and 300°) (solid curve), with the cis (0°) state being most unfavorable.

• Trans: 180°
• Gauche+: 300°
• Gauche-: 60°
• Cis: 0° (or 360°)

Rotational isomeric states for the central bond in a segment of four backbone atoms. Large blue spheres show backbone atoms. They are indexed from 1 to 4. The small spheres show side groups; they are labeled by the indices of the backbone atoms to which they are affixed (with a prime sign).

The staggered conformations are the most energetically favored conformations of two tetrahedrally coordinated carbon atoms.
(a) A view along the C-C bond in ethane (CH3CH3) showing how the two carbon atoms can rotate so that their hydrogen atoms are either not staggered (aligned) or staggered. Three indistinguishable staggered conformations are obtained by a rotation of 120 degrees around the C-C bond. (b-d) Similar views as in (a) of valine. The three staggered conformations are different for valine because the three groups attached to Cβ are different. The first staggered conformation (b) is less crowded and energetically most favored because the two methyl groups bound to Cβ are both close to the small H atom bound to Cα.

Ramachandran plots show the allowed combinations of the conformational angles phi and psi. (Three plot panels: psi versus phi, both axes running from 0 to 360.)

Proteins may assume infinitely many conformations:

• Fixed bond lengths (l)
• Fixed bond angles (θ), 0° < θ < 180°
• Variable torsional angles (φ), 0° < φ < 360°

But just one of them is the native conformation (tertiary structure). Folding is almost reversible (native <-> non-native).

Back to the protein science from protein structure

Refolding of a denatured protein.

Non-covalent bonds

• A C-C bond has an energy of 83 kcal/mol.
• Each noncovalent bond, such as the hydrogen bonds N...H-O and O...H-N, has an energy of 5 kcal/mol.

Although they are very weak, many of them form to create a strong bonding arrangement.

Disulfide bonds. This diagram illustrates how covalent disulfide bonds form between adjacent cysteine side chains. As indicated, these cross-linkages can join either two parts of the same polypeptide chain or two different polypeptide chains. Since the energy required to break one covalent bond is much larger than the energy required to break even a whole set of noncovalent bonds, a disulfide bond can have a major stabilizing effect on a protein.
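The comparison between covalent and noncovalent bond energies can be made concrete with a short calculation. It uses only the two values quoted above (83 kcal/mol for a C-C bond, 5 kcal/mol for each noncovalent bond) and assumes, as a simplification, that the noncovalent contributions simply add. The number N of noncovalent bonds needed to match one C-C bond is then:

```latex
N = \frac{83\ \text{kcal/mol}}{5\ \text{kcal/mol}} \approx 16.6
\quad\Longrightarrow\quad N_{\min} = 17
```

Under that assumption, on the order of 17 weak bonds acting together are comparable to a single C-C bond, which illustrates why many noncovalent contacts are needed for a strong bonding arrangement.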
Small proteins need more S-S bonds. The disulfide is usually the end product of air oxidation according to the following schematic reaction scheme:

2 -CH2SH + 1/2 O2 → -CH2-S-S-CH2- + H2O

The binding of a protein to another molecule is highly selective. Many weak bonds are needed to enable a protein to bind tightly to a second molecule (a ligand). The ligand must therefore fit precisely into the protein's binding site, like a hand into a glove, so that a large number of noncovalent bonds can be formed between the protein and the ligand.

• The interior of proteins is hydrophobic: a hydrophobic core and a hydrophilic surface.
• Proteins have alpha helices and beta sheets as their secondary structures.

Secondary structures: α-helices

• Mostly right-handed in proteins.
• First described by Linus Pauling (1951).
• Phi and psi angles are 120° and 110°, respectively.
• H bonds form between the C=O of residue n and the NH of residue n+4.
• About 10 residues long (range: 4 to more than 40 residues).
• All the H bonds point in the same direction, so helices have dipoles.

Negatively charged groups such as phosphate ions frequently bind to the amino ends of α helices. The dipole moment of an α helix, as well as the possibility of hydrogen bonding to free NH groups at the end of the helix, favors binding.

Some amino acids are favored in helices: Ala, Glu, Leu, and Met. Pro, Gly, Tyr, and Ser are strongly disfavored.

The helical wheel or spiral: there are 3.6 amino acids per turn, so the angle between two consecutive amino acids is 100°.

Secondary structures: β-sheets

• Proteins formed by strands are rigid.
• Strands are 5 to 10 residues long.
• Residues are fully extended.
• Phi and psi fall in the upper left quadrant of the Ramachandran plot.

Two types of beta sheet structures. (A) Antiparallel beta sheet. (B) Parallel beta sheet. Both of these structures are common.

The structure of a coiled-coil. Amphipathic: one side is hydrophilic, the other side is hydrophobic.

A collection of protein molecules, selected to show a range of sizes and shapes.
Each protein is shown as a space-filling model, represented at the same scale.

Collagen and elastin. (A) Collagen is a triple helix formed by three extended protein chains that wrap around one another. Many rodlike collagen molecules are cross-linked together in the extracellular space to form unextendable collagen fibrils (top) that have the tensile strength of steel. The striping on the collagen fibril is caused by the regular repeating arrangement of the collagen molecules within the fibril. (B) Elastin polypeptide chains are cross-linked together to form rubberlike, elastic fibers. Each elastin molecule uncoils into a more extended conformation when the fiber is stretched and will recoil spontaneously as soon as the stretching force is relaxed.

HW1: The coordinates of a polypeptide with the following sequence are provided as follows:

ATOM 1 N THR 1 -7.712 14.556 16.794 1.00 17.45
ATOM 2 CA THR 1 -7.046 15.510 17.660 1.00 16.13
ATOM 3 C THR 1 -6.849 14.891 19.045 1.00 14.58
ATOM 8 N VAL 2 -5.693 15.098 19.646 1.00 12.07
ATOM 9 CA VAL 2 -5.490 14.585 21.007 1.00 10.98
ATOM 10 C VAL 2 -5.851 15.665 22.008 1.00 13.48
ATOM 15 N ALA 3 -6.867 15.389 22.802 1.00 10.39
ATOM 16 CA ALA 3 -7.253 16.247 23.919 1.00 12.54
ATOM 17 C ALA 3 -6.661 15.678 25.190 1.00 14.90
ATOM 20 N TYR 4 -6.222 16.563 26.096 1.00 9.30
ATOM 21 CA TYR 4 -5.723 16.132 27.380 1.00 7.77
ATOM 22 C TYR 4 -6.713 16.614 28.449 1.00 13.04
ATOM 32 N ILE 5 -7.246 15.673 29.198 1.00 9.28
ATOM 33 CA ILE 5 -8.238 15.938 30.232 1.00 11.00
ATOM 34 C ILE 5 -7.695 15.622 31.609 1.00 10.89
ATOM 40 N ALA 6 -7.893 16.576 32.533 1.00 8.54
ATOM 41 CA ALA 6 -7.525 16.339 33.920 1.00 11.63
ATOM 42 C ALA 6 -8.755 15.757 34.632 1.00 11.09
ATOM 45 N ILE 7 -8.490 14.746 35.464 1.00 11.88
ATOM 46 CA ILE 7 -9.580 14.193 36.271 1.00 12.85
ATOM 47 C ILE 7 -9.227 14.353 37.743 1.00 16.83
ATOM 53 N AGLY 8 -10.186 14.794 38.548 0.54 14.91
ATOM 55 CA AGLY 8 -9.921 14.902 39.978 0.54 17.87
ATOM 57 C AGLY 8 -11.072 14.308 40.774 0.54 15.38
ATOM 61 N ASER 9 -10.786 13.771 41.962 0.54 16.88
ATOM 63 CA ASER 9 -11.849 13.281 42.834 0.54 18.62
ATOM 65 C ASER 9 -11.365 12.948 44.236 0.54 15.13
ATOM 73 N AASN 10 -12.108 13.339 45.270 0.54 20.60
ATOM 75 CA AASN 10 -11.739 13.008 46.640 0.54 23.35
ATOM 77 C AASN 10 -12.820 12.169 47.327 0.54 27.76

• Ignore the first two columns. The third column gives the type of the backbone atom, the fourth column gives the residue type, the fifth column lists the residue number, and the sixth to eighth columns give the x, y, and z coordinates of the atom.
• Calculate the phi, psi, and omega angles of one of the residues, as well as the theta angle of that residue and the bond vectors.
• Do you think your residue is part of a helix, a beta strand, or a loop? (Refer to the Ramachandran map.)

Summary

• Protein interiors are hydrophobic
• Proteins are made of secondary structures
• Secondary structure elements are connected to form simple motifs
• Protein molecules are organized in a structural hierarchy
• Large polypeptide chains fold into several domains

Two α helices that are connected by a short loop region in a specific geometric arrangement constitute a helix-turn-helix motif. Two such motifs are shown: the DNA-binding motif (a), which is further discussed in Chapter 8, and the calcium-binding motif (b), which is present in many proteins whose function is regulated by calcium.

Schematic diagrams of the calcium-binding motif.

The hairpin motif is very frequent in β sheets and is built up from two adjacent β strands that are joined by a loop region. Two examples of such motifs are shown. (a) Schematic diagram of the structure of bovine trypsin inhibitor. The hairpin motif is colored red. (b) Schematic diagram of the structure of the snake venom erabutoxin. The two hairpin motifs within the β sheet are colored red and green.
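The backbone angles requested in HW1 above can be computed from the N, CA, and C coordinates with a standard four-point dihedral formula. The sketch below is one possible approach in Python and is not the instructor's solution. It uses the first nine ATOM records from the homework table. The function names are hypothetical, angles are reported in the range -180° to 180° (which may differ from the 0° to 360° scale used in the slides), omega for residue i is taken here as the torsion about the peptide bond to residue i+1, and theta is taken as the N-CA-C bond angle; the last two are assumptions about what the homework intends.

```python
import math

RECORDS = """\
ATOM 1 N THR 1 -7.712 14.556 16.794 1.00 17.45
ATOM 2 CA THR 1 -7.046 15.510 17.660 1.00 16.13
ATOM 3 C THR 1 -6.849 14.891 19.045 1.00 14.58
ATOM 8 N VAL 2 -5.693 15.098 19.646 1.00 12.07
ATOM 9 CA VAL 2 -5.490 14.585 21.007 1.00 10.98
ATOM 10 C VAL 2 -5.851 15.665 22.008 1.00 13.48
ATOM 15 N ALA 3 -6.867 15.389 22.802 1.00 10.39
ATOM 16 CA ALA 3 -7.253 16.247 23.919 1.00 12.54
ATOM 17 C ALA 3 -6.661 15.678 25.190 1.00 14.90
"""

def parse(records):
    """Map (residue number, atom name) -> (x, y, z) from whitespace-separated records."""
    atoms = {}
    for line in records.splitlines():
        f = line.split()
        atoms[(int(f[4]), f[2])] = tuple(float(v) for v in f[5:8])
    return atoms

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in (-180, 180]."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    return math.degrees(math.atan2(norm(b2) * dot(b1, n2), dot(n1, n2)))

def bond_angle(a, b, c):
    """Angle a-b-c at vertex b, in degrees."""
    u, v = sub(a, b), sub(c, b)
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

def backbone_angles(atoms, i):
    """phi, psi, omega for residue i (needs residues i-1 and i+1)."""
    phi = dihedral(atoms[(i - 1, "C")], atoms[(i, "N")], atoms[(i, "CA")], atoms[(i, "C")])
    psi = dihedral(atoms[(i, "N")], atoms[(i, "CA")], atoms[(i, "C")], atoms[(i + 1, "N")])
    omega = dihedral(atoms[(i, "CA")], atoms[(i, "C")], atoms[(i + 1, "N")], atoms[(i + 1, "CA")])
    return phi, psi, omega

atoms = parse(RECORDS)
phi, psi, omega = backbone_angles(atoms, 2)
theta = bond_angle(atoms[(2, "N")], atoms[(2, "CA")], atoms[(2, "C")])
bond_n_ca = sub(atoms[(2, "CA")], atoms[(2, "N")])   # bond vector N -> CA
bond_ca_c = sub(atoms[(2, "C")], atoms[(2, "CA")])   # bond vector CA -> C
print(f"VAL 2: phi={phi:.1f} psi={psi:.1f} omega={omega:.1f} theta={theta:.1f}")
print("bond vectors:", bond_n_ca, bond_ca_c)
```

Residue 2 (VAL) is used because the excerpt contains both its preceding and following residues; phi is undefined for the first residue of a chain and psi for the last. The resulting (phi, psi) pair can then be located on a Ramachandran map to judge the secondary structure.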
The Greek key motif is found in antiparallel β sheets when four adjacent β strands are arranged in the pattern shown as a topology diagram in (a). The motif occurs in many β sheets and is exemplified here by the enzyme Staphylococcus nuclease (b).

STRUCTURAL HIERARCHY

• Primary structure
  - Arrangement of amino acids along the linear polypeptide chain
• Secondary structure
  - Helices and strands arrange themselves in simple motifs
  - Several motifs usually are combined to form compact globular structures called domains
• Tertiary structure
  - Arrangement of motifs or domains in 3D space
• Quaternary structure
  - Arrangement of monomeric proteins with respect to each other in 3D space

Organization of polypeptide chains into domains. Small protein molecules like the epidermal growth factor, EGF, comprise only one domain. Others, like the serine proteinase chymotrypsin, are arranged in two domains that are required to form a functional unit. Many of the proteins that are involved in blood coagulation and fibrinolysis, such as urokinase, factor IX, and plasminogen, have long polypeptide chains that comprise different combinations of domains homologous to EGF and serine proteinases and, in addition, calcium-binding domains and Kringle domains.

• The fundamental unit of tertiary structure is the domain.
• Domain: a polypeptide chain or a part of a polypeptide chain that can independently fold into a stable tertiary structure.
• Domains are units of function.

DOMAINS: Each domain can fold independently

Elements of secondary structure such as alpha helices and beta sheets pack together into stable globular elements called domains. A typical protein molecule is built from one or more domains, often linked through relatively unstructured regions of polypeptide chain.

Different domain structures

Comparison of the conformations of two serine proteases. The backbone conformations of elastase and chymotrypsin.
Although only those amino acids in the polypeptide chain shaded in green are the same in the two proteins, the two conformations are very similar nearly everywhere. The active site of each enzyme is circled in red; this is where the peptide bonds of the proteins that serve as substrates are bound and cleaved by hydrolysis. The serine proteases derive their name from the amino acid serine, whose side chain is part of the active site of each enzyme and directly participates in the cleavage reaction.

Motifs that are adjacent in the amino acid sequence are also usually adjacent in the three-dimensional structure. Triose-phosphate isomerase is built up from four β-α-β-α motifs that are consecutive both in the amino acid sequence (a) and in the three-dimensional structure (b).

Schematic diagram showing the packing of hydrophobic side chains between the two α helices in a coiled-coil structure. Every seventh residue in both α helices is a leucine, labeled "d." Due to the heptad repeat, the d-residues pack against each other along the coiled-coil. Residues labeled "a" are also usually hydrophobic and participate in forming the hydrophobic core along the coiled-coil.

Salt bridges can stabilize coiled-coil structures and are sometimes important for the formation of heterodimeric coiled-coil structures. The residues labeled "e" and "g" in the heptad sequence are close to the hydrophobic core and can form salt bridges between the two α helices of a coiled-coil structure, the e-residue in one helix with the g-residue in the second and vice versa. (a) Schematic view from the top of a heptad repeat. (b) Schematic view from the side of a coiled-coil structure.

Proteins can be divided into three main classes:

• Alpha domain proteins
• Alpha/beta structures
• Beta domain structures

Four-helix bundles frequently occur as domains in α proteins.
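The heptad repeat described above can be tabulated for any sequence. The following Python sketch is an illustration only: the example sequence is made up, the function name heptad_table is hypothetical, and the sequence is assumed to start at position a. It also lists each residue's helical-wheel angle using 100° per residue, which follows from 3.6 residues per turn (360°/3.6).

```python
HEPTAD = "abcdefg"
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn

def heptad_table(sequence, start="a"):
    """Return (residue, heptad label, wheel angle in degrees) for each residue.

    start is the heptad label assumed for the first residue.
    """
    offset = HEPTAD.index(start)
    table = []
    for i, residue in enumerate(sequence):
        label = HEPTAD[(i + offset) % len(HEPTAD)]
        angle = (i * DEGREES_PER_RESIDUE) % 360.0
        table.append((residue, label, angle))
    return table

# Made-up 14-residue sequence (two heptads), for illustration only.
example = "LKELEDKLEELESK"
table = heptad_table(example)

# Residues at the a and d positions, which form the hydrophobic core.
core = [(res, label) for res, label, _ in table if label in "ad"]

for res, label, angle in table:
    print(res, label, angle)
print("core positions:", core)
```

Seven residues advance the wheel by 700°, which is 20° short of two full turns, so the a and d residues of successive heptads stay within a narrow arc of the wheel. This is consistent with a hydrophobic stripe running along one side of each helix.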
Alpha-Beta proteins

In most α/β barrel structures the eight β strands of the barrel enclose a tightly packed hydrophobic core formed by the side chains from the β strands.

α/β barrel domain of an enzyme

Hydrophilic hole

Enzyme pyruvate folds into several domains