Balancing an Inverted Pendulum with a Multi-Layer Perceptron
ECE 539 Final Project, Spring 2000
Chad Seys

Outline
• The Inverted Pendulum
• The Problem
• Approach
• Position Representation
• Output Force Representation
• Initialization
• Convergence & Reinitialization
• Results
• Discussion

The Inverted Pendulum:
• Abstraction is a rigid rod attached at its lower end to a pivot point.
• Like balancing a broom on the palm of a hand.
• Useful in modeling:
  – Launching a rocket into space
  – look up another

The Problem:
• Train a multi-layer perceptron to...
  – keep an inverted pendulum in its upright position
  – move an inverted pendulum from any position to the upright position (and keep it balanced there).

Approach:
• Divide the 180-degree range of the pendulum into M arc segments (where M is odd).
  – M is odd to provide a central region where no force is applied.
  – There will be M input neurons, one per segment.
• There will be two output neurons whose outputs will be interpreted as opposing force vectors of fixed magnitude.

Inverted Pendulum Position Representation
• A few of the possibilities to explore:
  – (Chosen) A "1" in the input dimension corresponding to the arc segment which the inverted pendulum currently occupies, "0" in the other dimensions.
  – As above, but with a gradual decline to "0" in neighboring segments.
    • Might help prevent overshoot at the top.
  – Alternatively, put "0" to the left of the inverted pendulum, "0.5" at the inverted pendulum, and "1" to the right of the inverted pendulum.
    • Might provide more directional information.

Output Force Representation
• The output neuron force vector will act perpendicular to the inverted pendulum at its center of mass.
• Will use a supervised learning paradigm.
  – Training data will be a fixed correcting force to return the inverted pendulum to the vertical.
• Ideally would use an unsupervised learning paradigm allowing varying correcting force magnitudes, but unsure how to implement it.

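
The chosen position encoding and the fixed-force teacher signal can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the angle is given in degrees from the vertical (-90 to +90, 0 = upright), and the function names (segment_index, one_hot_position, teacher_force) and the ordering of the two outputs are hypothetical.

```python
def segment_index(theta_deg, m):
    """Map an angle in [-90, 90] degrees (assumed convention: 0 = upright)
    to the index of one of m equal arc segments."""
    if m % 2 == 0:
        raise ValueError("m must be odd so that a central no-force segment exists")
    width = 180.0 / m
    idx = int((theta_deg + 90.0) // width)
    # Clamp so that the +90 degree endpoint falls into the last segment.
    return min(max(idx, 0), m - 1)


def one_hot_position(theta_deg, m):
    """Chosen encoding: 1 in the occupied segment, 0 in all other dimensions."""
    vec = [0.0] * m
    vec[segment_index(theta_deg, m)] = 1.0
    return vec


def teacher_force(theta_deg, m):
    """Teacher targets for the two output neurons, as a pair
    (push toward positive angles, push toward negative angles).
    The central segment gets no force; the output order is an assumption."""
    idx = segment_index(theta_deg, m)
    center = m // 2
    if idx == center:
        return (0.0, 0.0)
    if idx < center:
        return (1.0, 0.0)  # leaning to the negative side: push back toward positive
    return (0.0, 1.0)      # leaning to the positive side: push back toward negative
```

With M = 5, each segment spans 36 degrees, so an upright pendulum occupies the middle segment and receives no correcting force, while a pendulum at -20 degrees is pushed back toward positive angles.
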
Initialization
• At the top, with a small movement in one direction or the other.
• At increasing angles from the top, with no movement. (Not included in the final version of the project.)

Convergence & Reinitialization
• The standard: amount of match between the output and the teacher's data.
• Also stability: over how many simulation steps the inverted pendulum stays within a small number of degrees of the top.
  – This may be the criterion for reinitialization.
  – May not reset the network weights, only the inverted pendulum position.
  – (Did not appear in the final version of the project.)

[Figure: network architecture. The pendulum angle θ falls in one of M arc segments, which feed M input neurons, H hidden neurons, and 2 output neurons that produce the fixed output force.]

Results (Force vs. Time Step):
• Difficult to find a balance of force and sampling interval.
  – Too large a force resulted in overcorrection.
  – Too small a force resulted in undercorrection.
  – Smaller time steps solve this problem, but increase memory usage and processing time.
• Did not reach 100% convergence.
  – Ran one promising simulation (which appeared to be neither under- nor overcorrected) for a period of several days (>69000 iterations) and achieved a convergence rate of only 61.3%.
  – Judging by the way the pendulum falls during the testing section of the simulation, the neural network does not yet appear to have "learned" to balance the inverted pendulum.

Results
• Did not succeed in balancing an inverted pendulum during the simulation runs.
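
The force versus time-step trade-off and the stability measure can be explored with a small simulation along these lines. This is a sketch under stated assumptions, not the simulation used in the project: the pendulum is modeled as a uniform rod, integrated with a simple Euler scheme, and driven directly by the fixed-force teacher rule rather than by a trained network. The function name simulate and every parameter default are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def simulate(force, dt, steps, m_segments=9, theta0_deg=1.0,
             length=1.0, mass=1.0, tol_deg=10.0):
    """Integrate a uniform rod pivoted at its lower end (assumed model).

    A fixed force is applied perpendicular to the rod at its center of mass,
    chosen by arc segment, with no force in the central segment.
    Returns the fraction of steps spent within tol_deg of upright.
    """
    width = 180.0 / m_segments
    center = m_segments // 2
    theta = math.radians(theta0_deg)  # angle from vertical
    omega = 0.0                       # angular velocity in rad/s
    upright_steps = 0
    for _ in range(steps):
        theta_deg = math.degrees(theta)
        if abs(theta_deg) >= 90.0:
            break  # the pendulum has fallen; remaining steps count as not upright
        idx = min(int((theta_deg + 90.0) // width), m_segments - 1)
        if idx < center:
            push = force    # leaning negative: push toward positive angles
        elif idx > center:
            push = -force   # leaning positive: push toward negative angles
        else:
            push = 0.0      # central segment: no force
        # Uniform rod: I = m*l^2/3, gravity and the push both act at l/2.
        alpha = 1.5 / length * (G * math.sin(theta) + push / mass)
        omega += alpha * dt
        theta += omega * dt
        if abs(math.degrees(theta)) <= tol_deg:
            upright_steps += 1
    return upright_steps / steps
```

Calling simulate over a range of force and dt values is one way to look for the over- and undercorrection regimes described in the results; with force=0.0 the pendulum simply falls, which gives a baseline for the stability fraction.
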