Quantum Simulation Neural Networks
Brian Barch
Problem: Molecular Dynamics
• Want to simulate the motion or distributions of atoms
• Do this by constructing a potential energy surface (PES)
• Predicts energy from position
• Can then differentiate the energy to get the forces on atoms (see the relation below)
• Impossible analytically due to electron wavefunctions
• Traditionally use iterative methods, e.g. density functional theory (DFT) or quantum chemistry methods
• These have a time vs. accuracy tradeoff
• Can augment with neural networks, which can predict accurately very quickly once trained
• Neural network goal: predict energy from atomic positions
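As a one-line reminder of the step from energies to forces (standard physics rather than anything specific to this poster), the force on atom i is the negative gradient of the PES with respect to that atom's position:

```latex
\mathbf{F}_i = -\nabla_{\mathbf{r}_i}\, E(\mathbf{r}_1, \dots, \mathbf{r}_N)
```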
Dataset: Atomic hydrogen
• 8 datasets at different temperatures and pressures
• ~3000 configurations of hydrogen atoms each
• Each configuration: x, y, z for 54 atoms and the total energy
• As neural network PESs go, this is very high dimensional

Preprocessing:
• Calculated interatomic distances and angles between atoms (a minimal sketch follows this list)
• For the baseline model: used these to calculate symmetry function values, which were then normalized and saved
• For the new model: preprocessed only distances and angles
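A minimal sketch of this preprocessing step, assuming a (54, 3) array of Cartesian coordinates per configuration; the function and variable names here are mine, not the poster's:

```python
import numpy as np

def distances_and_angles(coords):
    """coords: (n_atoms, 3) Cartesian positions for one configuration.
    Returns the pairwise distance matrix and the cosines of the angles
    theta_jik centered on each atom i."""
    diff = coords[None, :, :] - coords[:, None, :]        # (n, n, 3) displacement vectors r_ij
    dist = np.linalg.norm(diff, axis=-1)                  # (n, n) interatomic distances
    # cos(theta_jik) = (r_ij . r_ik) / (|r_ij| |r_ik|)
    dots = np.einsum('ijd,ikd->ijk', diff, diff)          # (n, n, n) dot products r_ij . r_ik
    norms = dist[:, :, None] * dist[:, None, :]           # |r_ij| * |r_ik|
    with np.errstate(divide='ignore', invalid='ignore'):
        cos_angles = np.where(norms > 0, dots / norms, 0.0)
    return dist, cos_angles
```

For the dataset described above, this would be applied to each of the ~3000 configurations per temperature/pressure point.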
[Figure: Examples of results of molecular dynamics – a snapshot of melting water from an NN-based simulation, and poliovirus, used in a 10^8-atom simulation of infection]
Symmetry functions
• Represent the atomic environment in a form invariant under changes that shouldn't affect the energy
• i.e. rotation, translation of the whole system, index swaps, etc.
• Differentiable but highly nonlinear
• The different types of parameters and functions used per sym. func. mean they can't be represented as a single vector function (which would make things much easier)
• A cutoff function ensures they represent only the local environment – reduces scaling (the radial G2 form with its cutoff is sketched below)
• Baseline model (as used in the literature): manually picked, kept as static hyperparameters, and used to preprocess the data
• Project goal: design and implement trainable symmetry functions
• i.e. turn a hyperparameter into a normal parameter
• To my knowledge, there are no papers even suggesting this
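The poster does not reproduce the formulas, but its G2 plot matches the standard Behler–Parrinello radial form, so here is a minimal numpy sketch of that form; the parameter names (eta, r_s) and the cutoff radius value are mine, chosen for illustration:

```python
import numpy as np

def cutoff(r, r_c):
    """Behler-style cosine cutoff: decays smoothly to zero at r_c and stays zero beyond it."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g2(dist, eta, r_s, r_c=6.0):
    """Radial G2 symmetry function value for every atom.
    dist: (n_atoms, n_atoms) distance matrix; eta, r_s: the parameters whose
    effect the poster's G2 plot shows; r_c: cutoff radius."""
    contrib = np.exp(-eta * (dist - r_s) ** 2) * cutoff(dist, r_c)
    np.fill_diagonal(contrib, 0.0)            # drop the i == j self term
    return contrib.sum(axis=1)                # one value per atom
```

Larger eta narrows the Gaussian and r_s shifts its centre, which is the parameter dependence the G2 activation plot illustrates; in the baseline model a hand-picked set of (eta, r_s) values is fixed before training.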
[Figure: Sym. func. types and the effects of the parameters of G2 – activation of sym. func. G2 for two atoms as a function of distance, for different parameter values. The cutoff function is used inside the other sym. funcs. but is not a sym. func. itself.]

Atomic Neural Network Structure
[Figure: Baseline NN structure – preprocess → sym. funcs. → separate NNs that share weights → sum of energies]
• Each atom is described by a vector of sym. funcs.
• Baseline model (a minimal sketch follows this list):
• Take the list of sym. func. vectors as input to the NN
• An individual atomic NN is trained on each vector
• The NNs share weights
• The total energy is the sum of the outputs of the atomic NNs
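A minimal numpy sketch of that structure; the layer sizes, activation, and names are my assumptions rather than values taken from the poster:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sym = 24                                   # assumed number of sym. funcs. per atom
W1, b1 = rng.normal(size=(n_sym, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

def atomic_nn(g):
    """One atomic NN: sym. func. vector -> atomic energy (weights shared by all atoms)."""
    h = np.tanh(g @ W1 + b1)
    return (h @ W2 + b2)[0]

def total_energy(sym_funcs):
    """sym_funcs: (n_atoms, n_sym) matrix; the total energy is the sum of the atomic outputs."""
    return sum(atomic_nn(g) for g in sym_funcs)
```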
New convolutional Neural Network model:
[Figure: new model pipeline – atoms → sym. func. layer → batch normalization → conv layers → sum of atomic energies; baseline pipeline for comparison – preprocess → NN]

Changes to model
• Replaced the atomic NNs with 1D convolutional layers, with the atom index as the convolved dimension (see the sketch after this list)
• This part is functionally the same as the baseline, but more efficient
• Allows GPU optimization, e.g. cuDNN
• Different symmetry functions go in different channels
• Takes distances and angles as input and applies a sym. func. layer
• Batch normalization layer instead of using pre-normalized data
• The final layer is the same as in the baseline model
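A hedged Keras-style stand-in for that architecture (the poster's implementation used Theano; the layer widths, activation, and the choice to feed precomputed sym. func. values instead of raw distances and angles are my simplifications):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

n_atoms, n_symfuncs = 54, 24                    # 54 H atoms per configuration; channel count assumed

inp = layers.Input(shape=(n_atoms, n_symfuncs))      # one channel per symmetry function
x = layers.BatchNormalization()(inp)                 # replaces offline normalization of sym. func. values
# A kernel_size=1 conv over the atom axis applies the same small dense net to every atom,
# i.e. the shared-weight atomic NNs of the baseline, but as one GPU-friendly operation.
x = layers.Conv1D(32, kernel_size=1, activation="tanh")(x)
x = layers.Conv1D(32, kernel_size=1, activation="tanh")(x)
atomic_e = layers.Conv1D(1, kernel_size=1)(x)                          # one energy per atom
total_e = layers.Lambda(lambda t: tf.reduce_sum(t, axis=1))(atomic_e)  # sum of atomic energies
model = Model(inp, total_e)
model.compile(optimizer="adam", loss="mse")
```

In the poster's full model the trainable sym. func. layer described below sits before the batch normalization and takes distances and angles as input; it is omitted here for brevity.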
Trainable symmetry functions
• The sym. func. layer calculates sym. func. values with the parameters stored as Theano tensors (a sketch of such a layer follows this list)
• Allows more generalizability, since there is no need to preprocess the data
• Allows backpropagation onto the parameters, the same as is done for the weights
• This caused the NN to train more slowly, but once good sym. func. parameters are found they can be used for preprocessing to avoid the slowdown
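The poster's layer was written in Theano; as a hedged illustration of the same idea in a currently maintained framework, here is a sketch of a G2-type layer whose eta and R_s values are registered as trainable weights (all names, shapes, and initial ranges are my assumptions):

```python
import math
import tensorflow as tf

class TrainableG2(tf.keras.layers.Layer):
    """Radial (G2-type) symmetry functions whose eta and R_s are ordinary
    trainable weights, so gradients reach them by backpropagation just like
    network weights. A stand-in for the poster's Theano-based layer."""
    def __init__(self, n_funcs, r_cut=6.0, **kwargs):
        super().__init__(**kwargs)
        self.n_funcs, self.r_cut = n_funcs, r_cut

    def build(self, input_shape):
        self.eta = self.add_weight(name="eta", shape=(self.n_funcs,),
                                   initializer=tf.keras.initializers.RandomUniform(0.1, 2.0))
        self.r_s = self.add_weight(name="r_s", shape=(self.n_funcs,),
                                   initializer=tf.keras.initializers.RandomUniform(0.0, self.r_cut))

    def call(self, dist):
        # dist: (batch, n_atoms, n_atoms) interatomic distances
        r = dist[..., None]                                        # (batch, n, n, 1)
        fc = tf.where(r < self.r_cut,
                      0.5 * (tf.cos(math.pi * r / self.r_cut) + 1.0),
                      tf.zeros_like(r))
        g = tf.exp(-self.eta * (r - self.r_s) ** 2) * fc           # broadcasts over n_funcs
        g = g * tf.cast(r > 0.0, g.dtype)                          # drop the j == i self term (distance 0)
        return tf.reduce_sum(g, axis=2)                            # (batch, n_atoms, n_funcs)
```

Because eta and r_s are weights, the framework's autodiff pushes the loss gradient onto them exactly as it does onto conv-layer weights, which is the behaviour the poster describes for its Theano tensors.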
Results:
• The goal was to make trainable sym. funcs., which succeeded
• The new convolutional NN structure greatly increased the training and prediction speed of the NN through GPU optimization
• G1–G3 could be implemented quickly, but G4 and G5 use angles, so they require a triple sum over all atoms (see the sketch below)
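To make that cost concrete, here is a deliberately naive numpy sketch of an angular symmetry function in the standard Behler G4 form (the exact form and parameter names the poster used are not shown, so treat these as assumptions); the O(n_atoms^3) triple loop, which also has to be traversed again for the gradients, is the source of the slowdown described next:

```python
import numpy as np

def cutoff(r, r_c=6.0):
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g4(dist, cos_angles, eta, zeta, lam, r_c=6.0):
    """Naive angular (G4-style) symmetry function: note the O(n_atoms^3) triple loop.
    dist: (n, n) distances; cos_angles[i, j, k] = cos of the angle j-i-k."""
    n = dist.shape[0]
    out = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            for k in range(n):
                if k == i or k == j:
                    continue
                out[i] += ((1.0 + lam * cos_angles[i, j, k]) ** zeta
                           * np.exp(-eta * (dist[i, j] ** 2 + dist[i, k] ** 2 + dist[j, k] ** 2))
                           * cutoff(dist[i, j], r_c) * cutoff(dist[i, k], r_c) * cutoff(dist[j, k], r_c))
    return (2.0 ** (1.0 - zeta)) * out
```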
• This causes a massive slowdown in the feedforward pass, and an even larger one during backpropagation
• Baseline results relied heavily on G4 and G5, which are the most complex sym. funcs., so I could not beat my best results with the baseline model
• I did beat the results of the model used by researchers from LBL on this problem, though
• When compared to the same untrained sym. funcs., the trained ones always performed better
• Decreased MSE, faster training, and less overfitting
• ~10% MSE decrease in the best case, but more with worse hyperparameters
• Trained sym. func. parameters were unpredictable – sometimes they stayed near their initialization, other times multiple parameters converged to the same value or crossed over
• Unsure what to make of this

[Figure: Loss during training for a NN with static sym. func. parameters (left) and with trainable sym. func. parameters (right). Training MSE (red) and validation MSE (blue) are measured in electronvolts. Used 8 each of G1, G2, and G3 for this.]

Future directions
• Will focus on increasing the efficiency of complex sym. func. implementation and training
• I plan to represent sym. funcs. not as a single complex layer, but as a combination of multiple simple add, multiply, exponent, etc. layers
• I will also focus on making sure the NNs are learning useful sym. func. parameters by penalizing redundancies and by optimizing training with an evolutionary neural network structure I made previously (based on MPANN)