Quantum Simulation Neural Networks
Brian Barch

[Figure: examples of results of molecular dynamics: a snapshot of melting water from an NN-based simulation, and the poliovirus used in a 10^8-atom simulation of infection]

Problem: Molecular Dynamics
• Want to simulate the motion or distributions of atoms
• Do this by constructing a potential energy surface (PES)
  • Predicts energy from atomic positions
  • The energy can then be differentiated to get the forces on the atoms
• Impossible analytically due to the electron wavefunctions
• Traditionally handled with iterative methods, e.g. density functional theory (DFT) or quantum chemistry methods
  • These have a time vs. accuracy tradeoff
• Can augment them with neural networks, which predict accurately and very quickly once trained
• Neural network goal: predict energy from atomic positions

Dataset: Atomic hydrogen
• 8 datasets at different temperatures and pressures
• ~3000 configurations of hydrogen atoms each
• Each configuration: x, y, z coordinates for 54 atoms, plus the total energy
• As neural network PESs go, this is very high dimensional

Preprocessing
• Calculated interatomic distances and angles between atoms
• For the baseline model: used these to calculate symmetry function values, which were then normalized and saved
• For the new model: preprocessed only the distances and angles

Symmetry functions
• Represent the atomic environment in a form invariant under changes that shouldn't affect the energy
  • e.g. rotation, translation of the whole system, index swaps
• Differentiable but highly nonlinear (a code sketch of G2 follows this section)
• Different types of parameters and functions per symmetry function mean they can't be represented as a single vector function (which would make things much easier)
• A cutoff function ensures they represent only the local environment, which reduces scaling
• Baseline model (as used in the literature): symmetry functions are manually picked, kept as static hyperparameters, and used to preprocess the data
• Project goal: design and implement trainable symmetry functions
  • i.e. turn a hyperparameter into a normal parameter
  • To my knowledge, there are no papers even suggesting this

[Figure: symmetry function types G1-G5, plus the cutoff function, which is used inside the others but is not itself a symmetry function]
[Figure: effects of the parameters of G2: G2 for 2 atoms as a function of distance, for different parameter values]

Baseline atomic neural network structure
• Each atom is described by a vector of symmetry function values
• Baseline model:
  • Takes the list of symmetry function vectors as input
  • An individual atomic NN is trained on each vector
  • The atomic NNs share weights
  • The total energy is the sum of the outputs of the atomic NNs (sketched in code below)

[Figure: baseline NN structure: preprocessed symmetry functions feed separate atomic NNs that share weights; their outputs are summed to give the total energy]

New convolutional neural network model
• Replaced the atomic NNs with 1D convolutional layers, with the atom index as the convolved dimension
  • Functionally the same as the baseline, but more efficient
  • Allows GPU optimization, e.g. cuDNN
  • Different symmetry functions go in different channels
• Takes distances and angles as input and applies a symmetry function layer
• Uses a batch normalization layer instead of pre-normalized data
• The final layer is the same as in the baseline model

[Figure: new model structure: distances and angles → symmetry function layer → batch normalization → conv layers → sum of atomic energies]

Trainable symmetry functions
• The symmetry function layer calculates symmetry function values with parameters stored as Theano tensors
• More generalizable, since there is no need to preprocess the data
• Allows backpropagation onto the symmetry function parameters, exactly as is done for the weights (see the Theano sketch below)
• Caused the NN to train more slowly, but once good symmetry function parameters are found they can be used for preprocessing to avoid this
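For concreteness, here is a minimal sketch of the G2 radial symmetry function, assuming the standard Behler-Parrinello form: a Gaussian in the neighbor distance multiplied by a cosine cutoff and summed over neighbors. The function names, parameter values, and distances are illustrative, not taken from the project code.

```python
# Minimal sketch (not project code): the G2 radial symmetry function for one atom,
# assuming the standard Behler-Parrinello form with a cosine cutoff.
import numpy as np

def cutoff(r, r_c):
    """Cosine cutoff f_c(r): decays smoothly to zero at r_c and is zero beyond it."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g2(dists, eta, r_s, r_c):
    """Sum over neighbor distances of a Gaussian (controlled by eta, centered at r_s) times the cutoff."""
    return np.sum(np.exp(-eta * (dists - r_s) ** 2) * cutoff(dists, r_c))

# Illustrative neighbor distances (Angstrom) for one hydrogen atom
dists = np.array([0.9, 1.4, 2.7, 5.0])
print(g2(dists, eta=1.0, r_s=0.0, r_c=6.0))
```

Here eta sets the Gaussian width and r_s its center; these are the values varied in the "effects of the parameters of G2" figure and later made trainable.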
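Likewise, a minimal sketch of the baseline energy summation, assuming a single hidden layer: the same small network (shared weights) maps each atom's symmetry-function vector to an atomic energy, and the configuration energy is the sum. Layer sizes and the tanh activation are placeholders.

```python
# Minimal sketch (not project code): shared-weight atomic NNs whose outputs are
# summed to give the total energy of a configuration. Sizes are illustrative.
import numpy as np

n_atoms, n_sym, n_hidden = 54, 24, 20          # 54 atoms per configuration; 24 = 8 each of G1-G3
rng = np.random.default_rng(0)
W1, b1 = 0.1 * rng.standard_normal((n_sym, n_hidden)), np.zeros(n_hidden)
W2, b2 = 0.1 * rng.standard_normal((n_hidden, 1)), np.zeros(1)

def total_energy(G):
    """G: (n_atoms, n_sym) symmetry-function values for one configuration."""
    h = np.tanh(G @ W1 + b1)                   # same weights applied to every atom
    e_atom = h @ W2 + b2                       # one energy contribution per atom
    return float(e_atom.sum())                 # total energy = sum of atomic energies

G = rng.standard_normal((n_atoms, n_sym))
print(total_energy(G))
```

This per-atom weight sharing is what the new model expresses as a 1D convolution over the atom index, which is why that part is functionally the same as the baseline while being GPU-friendly.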
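Finally, a sketch of how the symmetry function parameters can be made trainable in Theano, as described above: eta and R_s are stored as shared variables, so T.grad backpropagates the loss onto them exactly as it does onto the weights. The linear energy readout, learning rate, and shapes are placeholders standing in for the real convolutional network.

```python
# Minimal sketch (not project code): G2 parameters eta and R_s as Theano shared
# variables, so T.grad backpropagates onto them just like onto network weights.
import numpy as np
import theano
import theano.tensor as T

n_sym = 8                     # number of G2 channels (8 of each type were used)
R_c = 6.0                     # cutoff radius, illustrative value
rng = np.random.RandomState(0)

eta = theano.shared(np.linspace(0.5, 4.0, n_sym), name='eta')   # trainable sym. func. parameter
R_s = theano.shared(np.zeros(n_sym), name='R_s')                # trainable sym. func. parameter
w = theano.shared(0.1 * rng.randn(n_sym), name='w')             # toy readout weights
b = theano.shared(0.0, name='b')

dists = T.vector('dists')     # neighbor distances for one atom
E_ref = T.scalar('E_ref')     # reference energy for that atom

fc = 0.5 * (T.cos(np.pi * dists / R_c) + 1.0) * (dists <= R_c)  # cosine cutoff
# Broadcast distances against the per-channel eta/R_s: shape (n_neighbors, n_sym)
gauss = T.exp(-eta * (dists.dimshuffle(0, 'x') - R_s) ** 2)
G2 = T.sum(gauss * fc.dimshuffle(0, 'x'), axis=0)               # one value per channel

E_pred = T.dot(G2, w) + b
loss = (E_pred - E_ref) ** 2

params = [eta, R_s, w, b]                                        # sym. func. params train like weights
grads = T.grad(loss, params)
updates = [(p, p - 0.001 * g) for p, g in zip(params, grads)]
train_step = theano.function([dists, E_ref], loss, updates=updates)

print(train_step(np.array([0.9, 1.4, 2.7, 5.0]), -1.0))
```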
Results
• The goal was to make trainable symmetry functions, which succeeded
• The new convolutional NN structure greatly increased the training and prediction speed of the NN through GPU optimization
• G1-G3 could be implemented quickly, but G4 and G5 use angles, so they require a triple sum over all atoms
  • This causes a massive slowdown in the feed-forward pass, and an even larger one during backpropagation
• The baseline results relied heavily on G4 and G5, which are the most complex symmetry functions, so I could not beat my best results with the baseline model
• Compared to the same untrained symmetry functions, trained ones always performed better
  • Decreased MSE, faster training, and less overfitting
  • ~10% MSE decrease in the best case, and more with worse hyperparameters
• I did, however, beat the results of the model used by researchers from LBL on this problem
• The trained symmetry function parameters were unpredictable – sometimes they stayed near their initialization, other times several converged to the same value or crossed over
  • Unsure what to make of this

[Figure: loss during training for a NN with static symmetry function parameters (left) and with trainable symmetry function parameters (right). Training MSE (red) and validation MSE (blue) are measured in electronvolts. Used 8 each of G1, G2, and G3 for this.]

Future directions
• Will focus on increasing the efficiency of implementing and training the complex symmetry functions
• I plan to represent symmetry functions not as a single complex layer, but as a combination of multiple simple add, multiply, exponent, etc. layers (see the sketch below)
• I will also focus on making sure the NNs are learning useful symmetry function parameters, by penalizing redundancies and by optimizing hyperparameters and training with an evolutionary neural network structure I made previously (based on MPANN)
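To illustrate the planned decomposition, here is a minimal sketch of G2 rebuilt from elementary operations (shift, square, scale, exponentiate, multiply by the cutoff, sum). The "layer" names are invented for illustration; chaining them reproduces the monolithic G2 of the earlier sketch, which is the property a simple-layer representation would need to preserve.

```python
# Minimal sketch (not project code): G2 composed from elementary "layers"
# rather than one monolithic symmetry-function layer. Names are invented.
import numpy as np

def shift(x, r_s):   return x - r_s                 # subtract layer
def square(x):       return x ** 2                  # elementwise square layer
def scale(x, eta):   return -eta * x                # multiply-by-parameter layer
def expo(x):         return np.exp(x)               # exponent layer
def cutoff(r, r_c):  return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g2_composed(dists, eta, r_s, r_c):
    # Chaining the simple layers reproduces the monolithic G2 of the earlier sketch
    return np.sum(expo(scale(square(shift(dists, r_s)), eta)) * cutoff(dists, r_c))

print(g2_composed(np.array([0.9, 1.4, 2.7, 5.0]), eta=1.0, r_s=0.0, r_c=6.0))
```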