Begin 01/15/09

We will take, as an operational definition of cancer, that it is an in vivo process related to cell transformation. In fact, one source of transformed cell lines is cultured tumor cells. A tumor might be considered an in vivo analogue of a focus, and the other characteristics of transformed cells have parallels in cancer cells as well. Thus, tumor formation is related to loss of density-dependent growth regulation, while loss of anchorage dependence is the characteristic conferring metastatic properties, meaning the ability of tumors to spread beyond the site of origin. Metastasis also involves acquisition of capabilities that have no counterpart in the in vitro process of transformation. Important examples are: (1) the ability to penetrate blood vessel walls with special proteins (called matrix metalloproteinases) and (2) the ability to develop a vascular network to supply blood to tumors (called angiogenesis; remember the flow chart).

About 10 years ago there was tremendous excitement surrounding the clinical trials of the angiogenesis inhibitor proteins angiostatin and endostatin. These are naturally occurring proteins that inhibit angiogenesis, and in mice, the murine proteins essentially cured solid tumors. Unfortunately, in clinical trials the human analogues exerted only small and temporary effects.

An important characteristic common to transformed cells and tumor cells is that the changes just described are heritable. By implication, this involves alteration of genetic information, i.e., changes in DNA. Hence, DNA must be a target for oncogenic agents, including chemicals, and therefore DNA is a major focus for studies of carcinogenesis on a molecular level, including chemical carcinogenesis.
Epigenetic processes, particularly the switching on or off of critical genes by environmentally induced changes in the methylation state of 5′-CpG-3′ doublets, have become a focus of attention and represent a pathway that does not involve a change in genetic information. This area of research is new and rapidly evolving, but will not be a focus of this course, as it doesn’t directly involve exposure to chemicals.

Before proceeding further, it is important to point out that discussion of chemical carcinogenesis, like any other area of molecular biology, is replete with its own specialized vocabulary. While the introduction of terminology may not seem overwhelming at first, it accumulates rapidly, and it is wise to be conscientious in learning definitions and keeping them straight. (An example is the nucleobase, nucleoside, and nucleotide series from ENVR 430.)

OVERVIEW OF CHEMISTRY

I will quickly review the chemistry that I think you should be familiar with to feel at ease with the course material. The presence of carbon defines organic molecules. Carbon requires 4 electrons to fill its valence (L) shell, and does this by forming bonds with 2 to 4 other atoms. When bonded to 4 atoms, carbon is tetravalent, and the bonds are oriented towards the vertices of a tetrahedron. In an undistorted structure, the bonds make internal angles of 109.5°. This angle is important to us because it determines 3-dimensional structure, which is critical in determining the function of biomolecules. Furthermore, any distortion from optimum geometry introduces strain, which destabilizes structure. Another property of tetrahedral carbon is that when the molecular groups at the apices are all different, as indicated in the overhead, the property of chirality is introduced. This means that the reflection of the tetrahedron in a mirror gives an image that cannot be superimposed on the original model. The two mirror images are called enantiomers.
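The tetrahedral bond angle quoted above can be checked with a short calculation. The following is a minimal sketch in Python. The function names and the choice of alternating cube corners as tetrahedron vertices are illustrative assumptions, not part of the course material.

```python
import math


def tetrahedral_angle_degrees() -> float:
    """Angle between two bonds of a regular tetrahedron: arccos(-1/3)."""
    return math.degrees(math.acos(-1.0 / 3.0))


def angle_between(v1, v2) -> float:
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return math.degrees(math.acos(dot / (norm1 * norm2)))


# Four alternating corners of a cube centered on the origin form a regular
# tetrahedron; the carbon atom is imagined at the origin (an assumption
# made purely for this illustration).
VERTICES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

if __name__ == "__main__":
    print(round(tetrahedral_angle_degrees(), 2))
    print(round(angle_between(VERTICES[0], VERTICES[1]), 2))
```

Both calls give the same value, about 109.5°, which is the angle that strain-free tetravalent carbon adopts.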
Molecules with chiral centers have the optical property of rotating plane-polarized light and therefore have optical activity. The importance of chirality will become evident when we discuss metabolism, because metabolizing enzymes have chiral active sites and therefore selectively generate one out of several possible enantiomers. Since macromolecular targets of activated metabolites, including DNA, are also chiral, chirality plays a crucial role in determining reaction pathways and hence in biological activity.

When carbon is bonded to three atoms (trivalent carbon), it must share two electrons with one substituent in a “double” bond to satisfy valence requirements. The overhead also shows the geometry of trivalent carbon, with bonds distributed in a plane at angles of 120°. Sharing two electrons in the double bond is accomplished by forming a σ bond and a π bond to one of the atoms. Because of the π bond, the geometry around the double bond is fixed, since rotation around the bond axis would require breaking the π bond. The resulting geometric relationship between the substituents of the π-bonded atoms is defined as cis when they are on the same side of the perpendicular to the plane through the bonded atoms and trans when they are on opposite sides.

Approximate bond energies, i.e., the energy in kcal/mole required to break a bond, or conversely the energy released in bond formation, are:
83 kcal/mole for C-C σ bonds
150 kcal/mole for C=C double bonds
Bond energies are of interest to us because the energies released in assembling molecules provide a measure of the relative stability of starting compounds and products and, for reversible reactions, determine the direction in which reactions will proceed spontaneously.

In addition to valence and geometry, the idea of functional groups will be important from the standpoint of our discussions.
Functional groups are arrangements of atoms that are not complete molecules, but occur frequently and are usually associated with specific properties when they are incorporated into molecules. Examples of the functional groups that we most commonly encounter are shown in the next overhead.

The interaction of biomolecules with their environment will also be important for us, and will depend on the physico-chemical properties of the molecules. Chemical compounds may be divided into two classes based on physico-chemical properties: polar and non-polar. Polar molecules have higher solubility in water than non-polar molecules because the dipoles are attracted to and fit into the lattice created by polar water molecules, illustrated on the next overhead. Although water is not solid at ambient temperatures, this slide shows that it is nevertheless highly ordered. Polarity can be actual separation of charge, as in the case of salts such as Na⁺Cl⁻ or the zwitterionic form of amino acids, or may be the result of unequal sharing of electrons in covalent bonds between atoms of different electronegativities. Generally, compounds in which carbon is bonded to the electronegative atoms O, N and S are polar. Polar molecules can be accommodated efficiently into the water lattice and therefore tend to be water-soluble.

Non-polar molecules have structures in which there is little or no charge separation. Non-polar molecules are not soluble in water because they are excluded from the water lattice by lack of interaction with water dipoles. However, non-polar molecules are soluble in non-polar media, such as oils and fats. Compounds containing only C and H are non-polar, but lack of polarity may also result from a symmetric distribution of polar bonds. The overhead shows CCl4 as an example.

In addition to electrostatic attractions, polar compounds may enhance interactions with water or other polar molecules by hydrogen bonding, which is very important in biochemistry.
When H is bonded to an electronegative atom such as N or O, it loses most of its valence electron density and thus can partially share an electron pair with an electron donor if one is within a short distance. H-bonds are stabilizing but weak, amounting to ~5 kcal/mole, compared to 83 kcal/mole for C-C bonds. Nevertheless, the energy of stabilization gained from H-bonding often determines the 3-D structure of large, flexible biomolecules in aqueous medium, and so is very important in structure-function relationships. A critical property of H-bonds is that they are directional: the atoms involved must be collinear for optimum overlap of the atomic orbitals involved in the sharing. The significance of directionality is that it imposes geometric constraints on H-bond formation.

Enzymes and other proteins are made up of amino acids, which have the general structure shown on the right side of the next overhead. The group R is called a “side chain” and, in amino acids of physiological importance, can be any of 20 groups. The overhead gives the side chains for the 20 standard amino acids, along with their three-letter abbreviations and one-letter codes. The table is a convenient reference, because the single-letter code is often used to represent amino acid sequences in large proteins, and some of the letters are not obvious. Starting with Genes VIII, this table has been omitted. Remembering the description of a chiral molecule, you can see that all amino acids except glycine (R = H) are chiral. They occur naturally in the L configuration (S in every case except cysteine, whose L form is designated R). We shall discuss conventions associated with the terminology of chirality later, so for now accept that L and S indicate enantiomers (i.e., mirror images).

Proteins are formed by linking amino acids together through condensation of the carboxyl group of one amino acid (aa) with the α amino group of a second aa, with the elimination of water.
The bond formed in this manner is a peptide bond, and proteins are polymers, typically of 100 - 500 amino acids, linked by such peptide bonds. Proteins are also referred to as polypeptides.

Amino acids determine protein conformation in two ways. Through the primary structure, bends are introduced in the protein backbone at the site of the cyclic amino acid proline, as shown in the overhead. Cross-link formation through oxidative coupling of the sulfhydryl groups of cysteines may bring together two regions of a protein that are widely separated in the linear representation. Oxidative coupling can be described as oxidation of the –SH groups by one electron to give a thiyl radical, followed by coupling with a second thiyl radical to form a disulfide bond. Formally, H2 is released.

Secondary structural features are introduced by non-covalent interactions between the side chains of the amino acids and between the side chains and the protein environment. Three fundamental structures are recognized: α-helices, β-sheets and spherical globules, illustrated in the structure of horseradish peroxidase shown as a ribbon diagram on the next overhead. Cys 11 and Cys 91, separated by 80 aa in the linear sequence, are brought into juxtaposition by the Cys-Cys bond. This structure, and the use of conventions such as ribbons and stick bonds, is an example of how crystal structures available from the Rutgers Protein Data Bank can be manipulated to illustrate various characteristics.

Structural features may involve repetitive patterns that are recognizable in linear representations of proteins, which is the type of information available from sequencing. Such repeated patterns are called motifs.
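As a rough illustration of how a motif can be recognized in a linear sequence, the sketch below scans a protein sequence written in the one-letter code for the N-glycosylation sequon N-{P}-[ST]-{P} (Asn, any residue except Pro, Ser or Thr, any residue except Pro). The example sequence and the function name are made up for illustration, and real motif databases use richer descriptions than a single regular expression.

```python
import re

# N-glycosylation sequon: Asn, any residue but Pro, Ser or Thr, any residue
# but Pro. Wrapping the pattern in a lookahead lets overlapping hits be found.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(sequence, pattern=SEQUON):
    """Return (1-based start position, matched residues) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    example = "AANGSAANPSA"  # hypothetical sequence, for illustration only
    for position, residues in find_motif(example):
        print(position, residues)
```

In the example sequence only the first asparagine is reported, because the second is followed by proline, which the pattern excludes.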
One area of bioinformatics currently receiving considerable attention is the scanning of sequences of newly isolated proteins for recognizable motifs, as a means to relate the new molecule to known proteins and to infer possible function through alignment of recognizable functional domains. An important point regarding the interactions just described is that they illustrate why a change in only one amino acid at a critical site, one possible consequence of a DNA mutation, can drastically affect the 3-D structure of a protein and can thus change or abolish function. This will be important in considering the differences between normal gene products and oncogene products when we discuss the effects of chemically induced mutations of genes.

Begin 01/20/09

OVERVIEW OF DNA STRUCTURE

The other building blocks of interest to us are the components of DNA and RNA. From biology, chemistry or the popular press you probably know that the structure of DNA is a double helix comprised of a sugar-phosphate backbone to which are appended compounds called bases. So there are three components: bases, sugar and phosphate.

The bases by themselves, without an attached sugar, are often referred to as nucleobases. They belong to two classes. Two two-ring bases, guanine and adenine, are derived from the parent compound purine. They are represented by the one-letter abbreviations G and A or the three-letter codes Gua and Ade, with numbering conventions as shown at the bottom of the overhead. Two bases derived from the pyrimidine parent are cytosine and thymine, abbreviated by the one-letter codes C and T. The three-letter abbreviations Cyt and Thy, analogous to Gua and Ade, are starting to appear; in the recent past, only the one-letter codes had been used for the pyrimidines.

The sugar component is a ribose in RNA and a deoxyribose in DNA.
When ribose or deoxyribose molecules are depicted as components of RNA or DNA, the positions on the sugars are “primed” to distinguish them from the positions of the nucleobase. The parent sugar is ribose, which is a 5-carbon sugar, with the carbon skeleton structure and numbering convention shown on the overhead. [OH, numbering and structure of sugar] Ribose has hydroxyl groups at C2, C3 and C5. The absence of a hydroxyl at C2 defines deoxyribose, which is the component of DNA and is one of the structural distinctions between DNA and RNA. It is important to note that the ribose and 2-deoxyribose molecules are optically active and all natural sugars have a D configuration at C1. Note very carefully the geometric relationship of the substituents relative to the (approximate) plane of the 5-membered sugar ring. This relationship defines ribose, as compared to, for example, arabinose (inverted 2-OH). [Also, the geometrical relationship automatically determines the optical configuration of the remaining chiral centers once the optical configuration at C1′ is specified as D.] Although these diagrams depict planar rings, the rings are actually slightly puckered, and discussion of this feature may arise in some readings, so be forewarned.

The stages of assembling the building blocks are: nucleobase; deoxynucleoside (nucleobase attached to the sugar through a glycosidic bond); and deoxynucleotide or deoxynucleic acid (sugar + phosphate). The phosphate unit provides a means of linking sugars into a strand, as we shall see in the next overhead.

The next overhead is a formalized representation of assembled DNA. [OH] The sugars are linked by phosphate groups to form the DNA polymer backbone through formation of ester bonds between the 5′-OH of one deoxyribose and the 3′-OH of a second deoxyribose. [OH, DNA strands] Because the sugar units are linked by formation of a diester to make the bridging phosphates, the linkages are described as phosphodiester linkages.
The phosphates are ionized at biological pH and are conventionally drawn in ionized form, as on the overhead. Regarding the sugar-phosphate backbone, notice that the strands have an unattached 3'-OH at one end and a 5'-OH at the other end, so they have directionality. This feature is often referred to as the polarity of the strands, but is not to be confused with the polarity created in molecules by separation of charge. An important feature of the double helix, which you can see in the representation on this overhead, is that the strands associate with opposite polarities: the 3′ terminus of one strand is opposite the 5′ end of the complementary strand. The strands are associated by hydrogen bonding between the bases, with interactions always being between purines and pyrimidines. [OH, DNA double helix structure]

The reason for this selectivity of association will shortly become evident. The most energetically favorable bonding schemes are between G-C and A-T, which are said to form complementary pairs. In the double helix, the strands are completely complementary to each other. The pairing scheme just described, which you should memorize, is called the Watson-Crick scheme, after the team that first proposed it.

It is also possible for Gua and Ade to form H-bonds with the O6-N7 edge or the N6-N7 edge, respectively. [OH, triple helix pairing] Pairing in this manner is called Hoogsteen pairing and has been observed in triple helices in vitro. A second important example of Hoogsteen pairing, from our perspective, occurs when Gua or Ade is modified by adduct formation and rotates around the glycosidic linkage so that the normal H-bonding edge faces away from the partner nucleobase in the double helix. In this situation the O6-N7 edge of Gua or the N6-N7 edge of Ade will pair. In this orientation, Hoogsteen pairing has implications for misincorporation of bases during replication, as we shall see later on.
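The Watson-Crick pairing rules and the opposite polarity of the two strands can be summarized in a few lines of Python. This is only a sketch: the function name is hypothetical, and the input is assumed to be an unmodified DNA strand written 5′ to 3′ in the one-letter code.

```python
# Watson-Crick partners: each purine pairs with one specific pyrimidine.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    Because the strands pair with opposite polarity, the partner is built
    by reading the input backwards (3' to 5') and swapping each base for
    its Watson-Crick partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3.upper()))


if __name__ == "__main__":
    print(complementary_strand("ATCAGA"))
```

Applying the function twice returns the original strand, which reflects the fact that the two strands of the helix are completely complementary to each other.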
There may also be biological relevance for Hoogsteen pairing in association with telomeric structures at DNA ends. Telomeric structures are hypothesized to provide a means of preserving the ends of replicating DNA. Turning off the telomerase enzymes responsible for these structures may be involved in the aging process, and the inappropriate continued functioning of telomerases may be related to immortalization, which is a step on the way to cell transformation. Some discussion of this appears periodically in the popular press, usually in relation to clinical trials of telomerase inhibitors, which are a class of anti-cancer drugs.

The diameter of the H-bonded pairs in the Watson-Crick scheme is 1.08 nm, and maintaining that constant diameter is important in the formation of the stable double helix. That is the reason for strict adherence to the pyrimidine-purine pairing selectivity. Note also that the G-C pair associates through 3 hydrogen bonds, while the A-T pair associates through 2 hydrogen bonds; you might infer from this that the G-C pair is more strongly bound. In fact, this would be correct. This observation also has biological relevance, which we will discuss in sketching out the mechanisms of replication and transcription.

The next overhead shows two 3-D representations of the same DNA double helix, which I have reproduced from an NMR structure. The double helix, which is a secondary structural feature of DNA, is a conformation adopted because it is the most energetically favorable in the polar biological medium. In the helix, the relatively hydrophobic bases associate by stacking with their molecular planes parallel, while the polar (ionic) sugar-phosphate backbone winds around the outside of the stack with the ionized phosphates interacting with the polar aqueous environment. The fact that the pyrimidine-purine pairs have a constant diameter enhances the stability of this arrangement.
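The difference between G-C and A-T pairs noted above (3 versus 2 hydrogen bonds) can be turned into a simple count for any duplex. The sketch below assumes a fully Watson-Crick-paired duplex described by one of its strands. The function names are illustrative, and the count is a crude indicator of relative pairing strength, not a thermodynamic calculation.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}


def watson_crick_h_bonds(strand):
    """Total inter-strand hydrogen bonds for a fully paired duplex."""
    return sum(H_BONDS_PER_PAIR[base] for base in strand.upper())


def gc_fraction(strand):
    """Fraction of base pairs in the duplex that are G-C pairs."""
    strand = strand.upper()
    return sum(base in "GC" for base in strand) / len(strand)


if __name__ == "__main__":
    for duplex in ("AATT", "GGCC", "ATCAGA"):
        print(duplex, watson_crick_h_bonds(duplex), round(gc_fraction(duplex), 2))
```

For two duplexes of equal length, the one with the higher G-C fraction has more hydrogen bonds holding the strands together.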
The representation on the left is a space-filling structure with some associated water molecules. The structure on the right is identical, but in stick format to illustrate more clearly the base stacking and the ridge-and-groove pattern of the sugar-phosphate backbone: [OH, 3-D model of DNA with stick structure for contrast] The grooves are not symmetric: one is large and one is small, so these features are appropriately called the major and minor grooves. The parameters by which the helix is characterized are: 34.6° rotation/base pair, which results in 10.4 base pairs for each 360° helical turn. The rise along the axis is 3.38 Å per base pair, so each complete turn spans about 35 Å. The structure in the overhead is adopted by DNA under physiological conditions and is called B-DNA.

Environmental conditions can cause the helix to change conformation. One of the more pronounced changes occurs under conditions of high salt concentration in vitro, where DNA adopts a left-handed helical conformation. The left-handed helix is called Z-DNA, from the zigzag pattern formed by the sugar-phosphate strands. The next overhead is also from an NMR structure: on the left is a space-filling representation, and on the right is a stick representation with the sugar-phosphate backbone outlined in green. [OH, Z-DNA, stick, and space-filling structures]

In Z-DNA the rotation is -30°/base pair, so that there are 12 base pairs for each 360° helical turn and one turn occupies 5.71 Å distance along the axis of the helix. The backbone forms a single uniform groove around the stacked bases. In addition to high salt concentrations, the Z-conformation is also favored by repetitive d(GC) regions (less so by AT regions). Since d(GC) runs are common in genomic DNA, the question of a biological role for Z-DNA arose. Z-DNA was found in negatively supercoiled DNA present during transcription, and additionally a group of proteins that specifically recognize Z-DNA has recently been identified.
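The helical parameters quoted above are related by simple arithmetic, sketched here in Python. The sketch assumes that 3.38 Å is the axial rise per base pair of B-DNA, which is consistent with the commonly quoted pitch of roughly 34 to 35 Å per turn. The function names are illustrative only.

```python
def base_pairs_per_turn(rotation_deg_per_bp):
    """Base pairs needed for one full 360-degree turn of the helix.

    The sign of the rotation only encodes handedness (negative for a
    left-handed helix), so its absolute value is used.
    """
    return 360.0 / abs(rotation_deg_per_bp)


def pitch_angstrom(rotation_deg_per_bp, rise_per_bp_angstrom):
    """Distance along the helix axis covered by one complete turn."""
    return base_pairs_per_turn(rotation_deg_per_bp) * rise_per_bp_angstrom


if __name__ == "__main__":
    # B-DNA: 34.6 degrees per base pair, assumed rise of 3.38 Angstrom per base pair
    print(round(base_pairs_per_turn(34.6), 1))
    print(round(pitch_angstrom(34.6, 3.38), 1))
    # Z-DNA: -30 degrees per base pair (left-handed)
    print(base_pairs_per_turn(-30.0))
```

With these inputs the B-DNA helix needs about 10.4 base pairs per turn and the Z-DNA helix needs 12.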
The human RNA-editing enzyme adenosine deaminase (ADAR1) was the first protein identified to interact with Z-DNA. The existence of these proteins strongly supports biological significance for Z-DNA. Also, Z-DNA forming regions identified at promoter sites suggest a possible role for Z-DNA as a transcriptional regulator.

A feature of B-DNA that we will be concerned with in discussing the effect of covalent adduct formation is the relative orientation of the sugar and nucleobase with regard to rotation around the glycosidic bond, i.e., the bond from sugar C1′ to the base. The hydrogen-bonding edge of the base is normally pointed away from the sugar, as shown in the 2D representation on the next overhead: [OH, orientation of base around glycosidic linkage] You are looking down the N-C1′ bond from above. When the normal H-bonding edge of the nucleobase is oriented in the green area, the base is in an anti conformation. If the base rotates around the glycosidic linkage so that the normal H-bonding edge is in the red area, the base will be in a syn conformation. Formation of purine adducts, or other modification of the purine structure, particularly oxidation to 8-oxo-dA or -dG, can cause the base to adopt a syn conformation. In the syn conformation, the back edge of the base (the O6-N7 or N6-N7 edge) becomes the H-bonding edge, so a Hoogsteen pair can be formed. As the next overhead shows, this allows an 8-oxo-dGuo:dGuo pair to form during replication, because the diameter of the helix is not expanded when one of the purine nucleobases is in the syn orientation. As we shall see shortly, this is a possible explanation for a G•C → C•G transversion mutation as a result of oxidative damage.

In addition to the twist of the helical structure of DNA, the helix itself may be twisted in a phenomenon called supercoiling.
The next overhead shows supercoiling of circular DNA, but the same situation applies for long genomic DNA, where the ends are effectively fixed. If supercoiling is in the opposite sense to the right-handed helix, it is said to be negative, and the DNA is underwound because for each negative supercoil, one twist in the double helix is subtracted. If supercoiling is in the same sense as the right-handed helix, it is positive, and the DNA is overwound because for each positive supercoil, one twist is added. Since most DNA is negatively supercoiled and hence underwound, some biologically relevant role or significance might be predicted, and we have just described one possible example in talking about Z-DNA. In the extreme, underwinding equates to sections of unwound helix, as the overhead shows. [OH] In fact, this is a situation that has to precede a number of DNA functions in which exposure of a single strand is required, the most important examples being replication and transcription.

Before leaving the physical description of DNA, there are additional conventions that are appropriate to introduce. Obviously, the 3-D pictures are not a very convenient way to represent DNA for purposes of illustrating reactions, and hence other formalisms have been adopted. Most common is the “ladder” form: [OH, DNA ladder convention]

5' A T C A G A 3'
3' T A G T C T 5'

where the complementary bases are drawn opposite each other in a ladder-like structure, with the glycosidic linkages indicated by solid lines and the H-bonded association by hatched lines. The polarity of the strands is indicated at the ends of the segment. Another common formalism is the step form: [OH, DNA step form]

B P B P B P B P OH

which is used particularly for indicating single-stranded DNA.
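The ladder convention can be generated mechanically from one strand, as the following Python sketch shows. The function name and the use of ":" to stand in for the hatched H-bond lines are assumptions made for this illustration.

```python
# Watson-Crick partners used to fill in the lower strand of the ladder.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def ladder(top_strand_5_to_3):
    """Draw a duplex in ladder form as three lines of plain text.

    The upper strand is written 5' to 3'. The lower strand is written
    3' to 5' so that each base sits directly below its partner.
    """
    top = top_strand_5_to_3.upper()
    bottom = "".join(COMPLEMENT[base] for base in top)
    return "\n".join([
        "5' " + " ".join(top) + " 3'",
        "   " + " ".join(":" for _ in top),  # ":" marks each H-bonded pair
        "3' " + " ".join(bottom) + " 5'",
    ])


if __name__ == "__main__":
    print(ladder("ATCAGA"))
```

Called with ATCAGA, the function reproduces the two strands of the ladder shown above, with the strand polarities marked at each end.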