Balancing an Inverted Pendulum
with a Multi-Layer Perceptron
ECE 539 Final Project
Spring 2000
Chad Seys
Outline
• The Inverted Pendulum
• The Problem
• Approach
• Position Representation
• Output Force Representation
• Initialization
• Convergence & Reinitialization
• Results
• Discussion
The Inverted Pendulum:
• Abstraction is a rigid rod
attached at its lower end
to a pivot point.
• Like balancing a broom
on the palm of a hand.
• Useful in modeling:
– Launching a rocket into
space
– look up another
The Problem:
• Train a multi-layer perceptron to...
– keep an inverted pendulum in its upright
position
– move an inverted pendulum from any position
to the upright position (keep it balanced there).
Approach:
• Divide the pendulum's 180-degree range of
motion into M arc segments (where M is odd).
– M odd to provide a central region where no
force is applied.
– There will be M input neurons, one per
segment.
• There will be two output neurons whose
outputs will be interpreted as opposing
force vectors of fixed magnitude.
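The mapping described above, from pendulum angle to arc segment and from two output neurons to a net force, might be sketched as follows. This is an illustrative sketch only: the function names, the angle convention (degrees from upright, -90 to +90), and the 0.5 firing threshold are assumptions, not details taken from the project.

```python
def segment_index(theta_deg, m):
    """Map an angle in [-90, 90] degrees (0 = upright) to one of m arc segments.

    Assumed convention: segment 0 is the far left, segment m - 1 the far right.
    """
    if m % 2 == 0:
        raise ValueError("m must be odd so that a central segment exists")
    width = 180.0 / m
    idx = int((theta_deg + 90.0) // width)
    # Clamp so that exactly +90 degrees lands in the last segment.
    return min(max(idx, 0), m - 1)


def net_force(out_pos, out_neg, magnitude=1.0, threshold=0.5):
    """Interpret two output activations as opposing forces of fixed magnitude.

    Assumption: an output "fires" when its activation exceeds the threshold.
    """
    force = 0.0
    if out_pos > threshold:
        force += magnitude
    if out_neg > threshold:
        force -= magnitude
    return force
```

With M = 5, an upright pendulum falls in the central segment (index 2), where the intended behavior is that neither output fires and no force is applied.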
Inverted Pendulum Position
Representation
• A few of the possibilities to explore:
– (Chosen) A “1” in the input dimension
corresponding to the arc segment which the
inverted pendulum currently occupies, “0” in
other dimensions.
– As above, but have a gradual decline to “0” in
neighboring segments.
• Might help prevent overshoot at the top.
– Alternatively, put “0” to the left of the inverted
pendulum, “0.5” at the inverted pendulum, and “1”
to the right of the inverted pendulum.
• Might provide more directional information.
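The three candidate input encodings listed above can be compared side by side in a short sketch. The function names are hypothetical, and the linear falloff and `spread` parameter in the graded variant are assumptions, since the slides only say the value declines gradually to "0".

```python
def encode_one_hot(k, m):
    """Chosen scheme: 1 in the occupied segment k, 0 elsewhere."""
    return [1.0 if i == k else 0.0 for i in range(m)]


def encode_graded(k, m, spread=2):
    """1 at segment k, declining linearly to 0 over `spread` neighbors.

    The linear shape of the decline is an assumption.
    """
    return [max(0.0, 1.0 - abs(i - k) / (spread + 1)) for i in range(m)]


def encode_step(k, m):
    """0 to the left of segment k, 0.5 at k, 1 to the right."""
    return [0.0 if i < k else 0.5 if i == k else 1.0 for i in range(m)]


if __name__ == "__main__":
    for encoder in (encode_one_hot, encode_graded, encode_step):
        print(encoder.__name__, encoder(2, 5))
```

All three produce a vector of length M, so they can be swapped without changing the number of input neurons.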
Output Force Representation
• The output neuron force vector will act
perpendicular to the rod, at the center of mass
of the inverted pendulum.
• Will use a supervised learning paradigm.
– Training data will be a fixed correcting force to
return the inverted pendulum to the vertical.
• Ideally would use an unsupervised learning
paradigm allowing varying correcting force
magnitudes, but unsure how to implement.
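One way to generate the supervised training data described above is a teacher rule that pushes toward the vertical from either side and applies no force in the central segment. The sketch below assumes one-hot inputs and an output ordering of (push right, push left); both names and the ordering are illustrative assumptions.

```python
def teacher_target(k, m):
    """Target outputs (push_right, push_left) for a pendulum in segment k.

    Assumed convention: segment 0 is the far left, m // 2 is the center.
    """
    center = m // 2
    if k < center:
        return (1.0, 0.0)  # leaning left: push back toward the right
    if k > center:
        return (0.0, 1.0)  # leaning right: push back toward the left
    return (0.0, 0.0)      # central segment: no force


def training_set(m):
    """One (one-hot input, target) pair per arc segment."""
    return [
        ([1.0 if i == k else 0.0 for i in range(m)], teacher_target(k, m))
        for k in range(m)
    ]
```

Because the targets depend only on the segment, the full training set has just M distinct examples.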
Initialization
• At the top, with a small movement in one
direction or the other.
• At increasing angles from the top, with no
movement (not included in the final version of
the project).
Convergence & Reinitialization
• The standard: the amount of match between the
output and the teacher’s data.
• Also, the number of simulation steps over which
the inverted pendulum stays within a small
number of degrees of the top (stability).
– This may be the criterion for reinitialization.
– Reinitialization may reset only the inverted
pendulum position, not the network weights.
– (Did not appear in the final version of the project.)
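The two measures above could be computed roughly as follows. How the project actually defined its convergence rate is not stated, so the thresholded match used here, the 5-degree tolerance, and the function names are all assumptions.

```python
def convergence_rate(outputs, targets, threshold=0.5):
    """Fraction of samples whose thresholded outputs all match the teacher."""
    matches = 0
    for out, tgt in zip(outputs, targets):
        if all((o > threshold) == (t > threshold) for o, t in zip(out, tgt)):
            matches += 1
    return matches / len(outputs)


def longest_upright_run(angles_deg, tolerance_deg=5.0):
    """Longest run of consecutive steps within tolerance of upright (0 degrees)."""
    best = run = 0
    for angle in angles_deg:
        run = run + 1 if abs(angle) <= tolerance_deg else 0
        best = max(best, run)
    return best
```

A reinitialization rule could then compare `longest_upright_run` against a chosen step count and reset only the pendulum state when the run ends.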
[Figure: network architecture. M arc segments (pendulum angle θ) feed M input neurons, followed by H hidden neurons and 2 output neurons that produce the fixed output force.]
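A forward pass through a network of this shape (M inputs, H hidden neurons, 2 outputs) might look like the sketch below. The sigmoid activation, the bias term on each neuron, and the uniform weight initialization are assumptions; the slides do not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """One weight row per neuron; the last weight in each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    """Apply one fully connected sigmoid layer."""
    extended = list(inputs) + [1.0]  # constant 1 feeds the bias weight
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in weights]


def mlp_forward(hidden_w, output_w, inputs):
    """M inputs -> H hidden activations -> 2 output activations."""
    return layer_forward(output_w, layer_forward(hidden_w, inputs))


if __name__ == "__main__":
    rng = random.Random(0)
    m, h = 5, 3
    hidden_w = init_layer(m, h, rng)
    output_w = init_layer(h, 2, rng)
    print(mlp_forward(hidden_w, output_w, [0.0, 0.0, 1.0, 0.0, 0.0]))
```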
Results (Force vs. Time Step):
• Difficult to find a balance of force and
sampling interval.
– Using too large a force resulted in overcorrection.
– Too small a force resulted in undercorrection.
– Smaller time steps solve this problem, but
increase memory usage and processing time.
• Did not reach 100% convergence.
– Ran one promising simulation (one that appeared to be neither under- nor
overcorrected) for several days (>69000 iterations) and achieved a
convergence rate of only 61.3%.
– Judging by the way the pendulum falls during the testing section of the
simulation, the neural network does not yet appear to have “learned” to
balance the inverted pendulum.
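The trade-off between force magnitude and time step noted above can be explored with a small simulation. The sketch below is not the project's simulator: it assumes a uniform rod pivoted at its lower end, semi-implicit Euler integration, a force applied perpendicular to the rod at its center of mass, and a hand-written teacher rule with a dead zone standing in for the trained network. All parameter values are illustrative.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def step(theta, omega, force, dt, mass=1.0, length=1.0):
    """Advance one semi-implicit Euler step; theta is in radians from upright.

    Uniform rod about its end: I = m L^2 / 3, with gravity and the applied
    force both acting at L / 2, giving
    alpha = 3 g sin(theta) / (2 L) + 3 F / (2 m L).
    """
    alpha = (3.0 * G / (2.0 * length)) * math.sin(theta)
    alpha += 3.0 * force / (2.0 * mass * length)
    omega = omega + alpha * dt
    theta = theta + omega * dt
    return theta, omega


def simulate(force_mag, dt, steps, theta0=0.0, omega0=0.05,
             dead_zone=math.radians(5.0)):
    """Start at the top with a small angular velocity; return the angle trace."""
    theta, omega = theta0, omega0
    trace = []
    for _ in range(steps):
        if theta > dead_zone:
            force = -force_mag
        elif theta < -dead_zone:
            force = force_mag
        else:
            force = 0.0
        theta, omega = step(theta, omega, force, dt)
        trace.append(theta)
        if abs(theta) > math.pi / 2:  # fallen past horizontal
            break
    return trace


if __name__ == "__main__":
    for force_mag in (0.5, 5.0, 50.0):
        for dt in (0.05, 0.005):
            trace = simulate(force_mag, dt, steps=2000)
            print(force_mag, dt, len(trace), round(math.degrees(trace[-1]), 1))
```

Sweeping `force_mag` and `dt` over a grid like this is one way to look for a combination that neither under- nor overcorrects, at the cost of more steps per simulated second as `dt` shrinks.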
Results
• Did not succeed in balancing an inverted
pendulum for the duration of the
simulation runs.