Empirical energy function

Summarizing some points about a typical MM force field
• In principle, for a given new molecule, all force field parameters need to be derived from scratch, because changing one functional group can change many other things.
[Figure: example molecular structures]
But this would be impractical, given how many molecules exist. One way around this difficulty is to build the molecule we want to study from 'unit molecular fragments'. This way, we only need to build force fields for these unit molecular fragments. (Certainly, this is an approximation.) This approach has been widely used for polymer molecules such as proteins and DNA.
Q: How do we know the force fields are accurate?
A: The force field parameters are chosen/optimized so that the calculated results agree well with experimental results. The better the agreement with experiment and the higher the predictive ability of a force field, the better it is.
• Note that there are many types of carbon atom, even within the 'unit molecular fragments' you choose. For example, a carbon may be singly bonded or doubly bonded. This affects the vdW parameters and many other parameters.
• Usually, all the atomic charges are FIXED. (This is an approximation; consider the cis- and trans-n-butane case.)
• Then, given the geometry of the molecule under study, the energy of the molecule at that structure can be calculated.
• Besides the typical MM force field, there are empirical force fields that can deal with bond breaking and bond formation.

Molecular dynamics simulation
[Figure (source: Wikipedia)]
• For transition-metal molecular systems, the empirical force field functional forms can be more complex because of the complexity of the d-orbitals. The interaction between metal ions and ligands can be treated as bonded or non-bonded.
• Well-known force fields for biomolecules: AMBER, CHARMM, OPLS, etc.
• Note that force fields are still under development.
The transferability of a force field across different conformations (e.g., cis and trans C-C-C-C; the force fields of amino acids are derived from chosen conformations) and across molecules (different molecules built from unit molecular fragments) is not yet 100%. Some people even argue that the force field needs to be changed during an MD simulation to achieve better results. In addition, polarizable force fields are under development as well.
• Besides describing molecular interactions with empirical functional forms that do not include electrons explicitly, quantum chemistry calculations can also be used to calculate molecular energies. This method includes electron orbitals explicitly in the calculation.
• Quantum chemistry calculation procedures can be categorized into two groups: semi-empirical methods and ab initio methods ("ab initio" means "starting from the very beginning", i.e., from first principles).
• Between the empirical MM force field and quantum chemistry calculations there is a combined/compromise method, named the QM (quantum mechanical)/MM method. This method allows one to examine the local electronic structure (orbitals) in large molecular systems.

Ways to alleviate the sampling problem in MD and MC simulation
• Simulated annealing
• Parallel tempering, also known as replica exchange

Replica Exchange MD simulation
• Remove the net translational and rotational motion caused by truncation error at a chosen time interval.
• At a simulation step, replicas are exchanged with probability P = min{1, exp[-(E2-E1)(1/kT1 - 1/kT2)]}, where E1 and E2 are the energies of the replicas at temperatures T1 and T2.
• Enhances sampling (alleviates the problem of being trapped at a local minimum).
• For a helical peptide of ~20 amino acids, this speeds up sampling about 10 times compared with conventional MD.
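The exchange probability for replica exchange given above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the lecture: the function names `swap_probability` and `attempt_swap` are made up here, and the energies are assumed to be in kcal/mol so that the Boltzmann constant can be given in kcal/(mol K).

```python
import math
import random

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption of this sketch.
K_B = 0.0019872041


def swap_probability(e1, e2, t1, t2, k_b=K_B):
    """Return P = min{1, exp[-(E2 - E1)(1/kT1 - 1/kT2)]}."""
    exponent = -(e2 - e1) * (1.0 / (k_b * t1) - 1.0 / (k_b * t2))
    if exponent >= 0.0:
        # exp(exponent) >= 1, so the min{} clamps the probability to 1.
        return 1.0
    return math.exp(exponent)


def attempt_swap(e1, e2, t1, t2, rng=random.random):
    """Metropolis-style test: accept the exchange if a uniform random
    number in [0, 1) falls below the exchange probability."""
    return rng() < swap_probability(e1, e2, t1, t2)


if __name__ == "__main__":
    # Hypothetical numbers: the hotter replica (T2) has found a lower energy,
    # so the exchange is always accepted.
    print(swap_probability(-100.0, -105.0, 300.0, 320.0))
    # The reverse situation is accepted only some of the time.
    print(swap_probability(-105.0, -100.0, 300.0, 320.0))
```

When the replica at the higher temperature has the lower energy, the exponent is non-negative and the exchange is always accepted; otherwise it is accepted with a probability that shrinks as the energy gap or the temperature gap grows.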
(PCCP)

Random Number Generators / Monte Carlo Simulation
• An example of a uniform pseudo-random number generator (Fortran 77 version):

      double precision function usran(ir)
c
c     this function generates random values between 0.0 and 1.0 using
c     an integer seed ir, which is updated on every call
c     it is based on the imsl routine ggubs.
c
c     double precision version
c
      implicit double precision (a-h,o-z)
      parameter(da=16807.d0,db=2147483647.d0,dc=2147483648.d0)
c     new seed = mod(16807*seed, 2**31-1), rounded to an integer
      ir=abs(mod(da*ir,db)+0.5d0)
      usran=dble(ir)/dc
      return
      end

Random Number Generators / Monte Carlo Simulation
• An example of a uniform pseudo-random number generator (Fortran 77 version; don't mind the rigor of the coding here):

      DOUBLE PRECISION FUNCTION RANF(DUMMY)
c
c     this function generates random values between 0.0 and 1.0 using
c     an integer seed that is kept between calls (DUMMY is not used)
c     linear congruential generator: SEED = MOD(SEED*L + C, M)
c
c     double precision version
c
      INTEGER L, C, M, SEED
      PARAMETER (L=1029, C=221591, M=1048576)
      SAVE SEED
      DATA SEED /0/
      SEED = MOD(SEED*L + C, M)
      RANF = DBLE(SEED)/DBLE(M)
      RETURN
      END

Protein Structure Basics
References:
1. Branden & Tooze, "Introduction to Protein Structure" (second edition), Garland Publishing, 1999, ISBN: 0815323050. (Available from 藝軒 (Yi Hsien)!)
2. Creighton, "Proteins: Structures and Molecular Properties" (second edition), Freeman and Company, 1993.

There are several levels of protein structure
Primary structure: the sequence of amino acid residues
Secondary structure: the polypeptide backbone conformation
Tertiary structure: the three-dimensional structure of a protein
Quaternary structure: the arrangement of one subunit relative to another in space

Proteins are polypeptide chains
The basic repeating unit along the main chain is NH-CH-C'=O, which is the residue of the common parts of amino acids after peptide bonds have been formed.

Amino acids can be classified by their R groups (1)
His
[Figure: two forms of His, labeled "the left form" and "the right form"]
The right form is usually predominant in model peptides.
However, which form is predominant depends on the precise conditions in the local environment. Both forms (the left form and the right form) are found in proteins.

Cysteines can form disulfide bridges

The conformation of the main-chain atoms is therefore determined by the values of the phi (φ) and psi (ψ) angles of each amino acid.

Ramachandran plot
Glycine residues can adopt many different conformations
(a) A result from calculations of sterically allowed regions
(b) Observed values for all residue types except glycine
(c) Observed values for glycine: note that the values include combinations of φ and ψ that are not allowed for other amino acids.

Rotamers: Most side chains have one or a few conformations that occur more frequently than the other possible staggered conformations. These are called rotamers. Today, collections of these favored conformations, or rotamer libraries, are a standard tool in computer programs used for modeling protein structures.
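The phi and psi angles discussed above, and the side-chain angles that define rotamers, are all dihedral (torsion) angles defined by four consecutive atoms. The sketch below shows one common way to compute such an angle from Cartesian coordinates using atan2. It is an illustration only: the function name `dihedral_deg` and the test coordinates are made up here. For phi the four atoms would be C(i-1), N(i), CA(i), C(i); for psi they would be N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    """Vector difference a - b for 3-component tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3-component tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in [-180, 180]) about the p1-p2 bond.

    0 degrees corresponds to cis (p0 and p3 eclipsed), 180 to trans.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal to the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal to the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: a cis arrangement, then a trans arrangement.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Using atan2 rather than acos keeps the sign of the angle, which is needed to tell apart, for example, the left-hand and right-hand sides of a Ramachandran plot.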