Advanced Topics in Robotics, CS 493/790(X), Lecture 2
Instructor: Monica Nicolescu

Robot Components
• Sensors
• Effectors and actuators
  – Used for locomotion and manipulation
• Controllers for the above systems
  – Coordinate information from sensors with commands for the robot's actuators
• Robot = an autonomous system that exists in the physical world, can sense its environment, and can act on it to achieve some goals

Sensors
• Sensor = physical device that provides information about the world
  – The process is called sensing or perception
• Robot sensing depends on the task
• Sensor (perceptual) space
  – All possible values of the sensor readings
  – One needs to "see" the world through the robot's "eyes"
  – Grows quickly as more sensors are added

State Space
• State: a description of the robot (of a system in general)
• State space: all possible states a robot could be in
  – E.g., a light switch has two states, ON and OFF; a light switch with a dimmer has a continuous state (possibly infinitely many states)
• Different from the sensor/perceptual space!
  – Internal state may be used to store information about the world (maps, location of "food", etc.)
• How intelligent a robot appears depends strongly on how much and how fast it can sense its environment and itself

Representation
• Internal state that stores information about the world is called a representation or internal model
  – Self: stored proprioception, goals, intentions, plans
  – Environment: maps
  – Objects, people, other robots
  – Task: what needs to be done, when, and in what order
• Representations and models determine the complexity of a robot's "brain"

Action
• Effectors: devices of the robot that have an impact on the environment (robotic legs, wings, propellers)
• Actuators: mechanisms that allow the effectors to do their work (muscles, motors)
• Robotic actuators are used for
  – locomotion (moving around, going places)
  – manipulation (handling objects)

Autonomy
• Autonomy is the ability to make one's own decisions and act on them
  – For robots: take the appropriate action in a given situation
• Autonomy can be complete (R2D2) or partial (teleoperated robots)

Control Architectures
• Robot control is the means by which the sensing and action of a robot are coordinated
• Controllers enable robots to be autonomous
  – They play the role of the "brain" and nervous system in animals
• Controllers need not (and should not) be a single program
  – There is typically more than one controller; each processes information from sensors and decides what actions to take
  – Should control modules be centralized?
• Challenge: how do all these controllers coordinate with each other?

Spectrum of Robot Control
• From "Behavior-Based Robotics" by R. Arkin, MIT Press, 1998

Robot Control Approaches
• Reactive Control – Don't think, (re)act.
• Deliberative (Planner-based) Control – Think hard, act later.
• Hybrid Control – Think and act separately & concurrently.
• Behavior-Based Control (BBC) – Think the way you act.
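The reactive style named above ("don't think, (re)act") can be pictured as a fixed set of stimulus-response rules mapping sensor readings directly to actions. The following is an illustrative Python sketch, not code from the lecture; the sensor names, thresholds, and action labels are invented for the example.

```python
# A minimal reactive controller: a fixed list of stimulus-response rules,
# checked in priority order, with no internal state, memory, or world model.
# (Sensor names and thresholds here are hypothetical.)

def reactive_controller(sensors):
    """Map a sensor-reading dict directly to an action string."""
    if sensors["bump"]:                  # contact sensor triggered
        return "back_up"
    if sensors["obstacle_dist"] < 0.3:   # obstacle closer than 0.3 m
        return "turn_away"
    if sensors["light_level"] > 0.8:     # strong light source detected
        return "approach_light"
    return "wander"                      # default exploratory behavior

print(reactive_controller({"bump": False, "obstacle_dist": 1.0, "light_level": 0.9}))
```

Because each rule reads the current sensors only, the response is immediate regardless of environment complexity, which is exactly the trade-off the slides describe: speed at the cost of planning and learning.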
Thinking vs. Acting
• Thinking/deliberating
  – slow; speed decreases with complexity
  – involves planning (looking into the future) to avoid bad solutions
  – thinking too long may be dangerous
  – requires (a lot of) accurate information
  – flexible for increasing complexity
• Acting/reacting
  – fast, regardless of complexity
  – innate/built-in or learned (from looking into the past)
  – limited flexibility for increasing complexity

Reactive Control: Don't think, react!
• Technique for tightly coupling perception and action to provide fast responses in changing, unstructured environments
• A collection of stimulus-response rules
• Advantages
  – Very fast and reactive
  – Powerful method: animals are largely reactive
• Limitations
  – No/minimal state, no memory, no internal representations of the world
  – Unable to plan ahead
  – Unable to learn

Deliberative Control: Think hard, then act!
• In deliberative control the robot uses all available sensory information and stored internal knowledge to create a plan of action: the sense-plan-act (SPA) paradigm
• Limitations
  – Planning requires search through potentially all possible plans, which takes a long time
  – Requires a world model, which may become outdated
  – Too slow for real-time response
• Advantages
  – Capable of learning and prediction
  – Finds strategic solutions

Hybrid Control: Think and act independently & concurrently!
• Combination of reactive and deliberative control
  – Reactive layer (bottom): deals with immediate reactions
  – Deliberative layer (top): creates plans
  – Middle layer: connects the two layers
• Usually called "three-layer systems"
• Major challenge: design of the middle layer
  – The reactive and deliberative layers operate on very different time-scales and representations (signals vs. symbols)
  – These layers must operate concurrently
• Currently one of the two dominant control paradigms in robotics

Behavior-Based Control: Think the way you act!
• Behaviors: concurrent processes that take inputs from sensors and other behaviors, and send outputs to the robot's actuators or to other behaviors, to achieve some goals
• An alternative to hybrid control, inspired by biology
• Has the same capabilities as hybrid control:
  – Acts both reactively and deliberatively
• Also built from layers
  – However, there is no intermediate layer
  – Components have a uniform representation and time-scale

Fundamental Differences of Control
• Time-scale: How fast do things happen?
  – How quickly the robot has to respond to the environment, compared to how quickly it can sense and think
• Modularity: What are the components of the control system?
  – The way the control system is broken up into modules and how they interact with each other
• Representation: What does the robot keep in its brain?
  – The form in which information is stored or encoded in the robot

Behavior Coordination
• Behavior-based systems require consistent coordination among the component behaviors for conflict resolution
• Coordination of behaviors can be:
  – Competitive: one behavior's output is selected from multiple candidates
  – Cooperative: the outputs of multiple behaviors are blended
  – A combination of the above two

Competitive Coordination
• Arbitration: a winner-take-all strategy; only one response is chosen
• Behavioral prioritization
  – Subsumption Architecture
• Action selection/activation spreading (Pattie Maes)
  – Behaviors actively compete with each other
  – Each behavior has an activation level driven by the robot's goals and sensory information
• Voting strategies
  – Behaviors cast votes on potential responses

Cooperative Coordination
• Fusion: concurrently use the output of multiple behaviors
• Major difficulty: finding a uniform command representation amenable to fusion
• Fuzzy methods
• Formal methods
  – Potential fields
  – Motor schemas
  – Dynamical systems

Example of Behavior Coordination
• Fusion: flocking (formations)
• Arbitration: foraging (search, coverage)

Learning & Adaptive Behavior
• Learning produces changes within an agent that over time enable it to perform more effectively within its environment
• Adaptation refers to an agent's learning by making adjustments in order to be more attuned to its environment
  – Phenotypic (within an individual agent) or genotypic (evolutionary)
  – Acclimatization (slow) or homeostasis (rapid)

Learning
Learning can improve performance in additional ways:
• Introduce new knowledge (facts, behaviors, rules)
• Generalize concepts
• Specialize concepts for specific situations
• Reorganize information
• Create or discover new concepts
• Create explanations
• Reuse past experiences
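The competitive and cooperative coordination schemes described above can each be sketched in a few lines. This is a hedged Python illustration: the behavior commands, priorities, and weights are invented for the example, and real systems (e.g., motor schemas) use richer command representations than a (forward-speed, turn-rate) pair.

```python
# Two toy behavior-coordination strategies over (vx, vturn) velocity commands.
# Each behavior contributes (priority_or_weight, command); names are hypothetical.

def arbitrate(outputs):
    """Competitive coordination: winner-take-all by priority."""
    # The single highest-priority behavior's command is executed as-is.
    return max(outputs, key=lambda o: o[0])[1]

def fuse(outputs):
    """Cooperative coordination: blend all commands, weighted by activation."""
    total = sum(w for w, _ in outputs)
    vx = sum(w * cmd[0] for w, cmd in outputs) / total
    vturn = sum(w * cmd[1] for w, cmd in outputs) / total
    return (vx, vturn)

cmds = [(1.0, (0.5, 0.0)),   # go-to-goal: drive forward
        (3.0, (0.0, 1.0))]   # avoid-obstacle: turn in place (higher priority)
print(arbitrate(cmds))  # only avoid-obstacle's command survives
print(fuse(cmds))       # a weighted compromise of both commands
```

The contrast shows the slides' trade-off directly: arbitration guarantees a coherent single response, while fusion lets every active behavior influence the motion but requires all commands to share one representation.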
Learning Methods
• Reinforcement learning
• Neural network (connectionist) learning
• Evolutionary learning
• Learning from experience
  – Memory-based
  – Case-based
• Learning from demonstration
• Inductive learning
• Explanation-based learning
• Multistrategy learning

Reinforcement Learning (RL)
• Motivated by psychology (the Law of Effect, Thorndike 1911): applying a reward immediately after the occurrence of a response increases its probability of reoccurring, while providing punishment after the response decreases that probability
• One of the most widely used methods for adaptation in robotics

Reinforcement Learning
• Goal: learn an optimal policy that chooses the best action for every set of possible inputs
• Policy: a state/action mapping that determines which actions to take
• Desirable outcomes are strengthened and undesirable outcomes are weakened
• Critic: evaluates the system's response and applies the reinforcement
  – External: the user provides the reinforcement
  – Internal: the system itself provides the reinforcement (reward function)

Learning to Walk
• Maes, Brooks (1990)
• Genghis: a hexapod robot
• Learned a stable tripod stance and tripod gait
• Rule-based subsumption controller
• Two sensor modalities for feedback:
  – Two touch sensors to detect hitting the floor: negative feedback
  – A trailing wheel to measure progress: positive feedback

Learning to Walk
• Nate Kohl & Peter Stone (2004)

Supervised Learning
• Supervised learning requires the user to give the robot the exact solution, in the form of the error direction and magnitude
• The user must know the exact desired behavior for each situation
• Supervised learning involves training, which can be very slow; the user must supervise the system with numerous examples

Neural Networks
• One of the most used supervised learning methods
• Used for approximating real-valued and vector-valued target functions
• Inspired by biology: learning systems are built from complex networks of interconnected neurons
• The goal is to minimize the error between the network output and the desired output
  – This is achieved by adjusting the weights on the network connections

ALVINN
• ALVINN (Autonomous Land Vehicle In a Neural Network)
• Dean Pomerleau (1991)
• Pittsburgh to San Diego: 98.2% autonomous

Learning from Demonstration & RL
• S. Schaal ('97)
• Pole balancing, pendulum swing-up

Learning from Demonstration
• Inspiration: human-like teaching by demonstration
• (Figures: demonstration, robot performance)

Learning from Robot Teachers
• Transfer of task knowledge from humans to robots
• (Figures: human demonstration, robot performance)

Classical Conditioning
• Pavlov, 1927
• Assumes that unconditioned stimuli (e.g., food) automatically generate an unconditioned response (e.g., salivation)
• A conditioned stimulus (e.g., ringing a bell) can, over time, become associated with the unconditioned response

Darwin's Perceptual Categorization
• Two types of stimulus blocks: 6 cm metallic cubes
  – Blobs: low conductivity ("bad taste")
  – Stripes: high conductivity ("good taste")
• Instead of hard-wiring stimulus-response rules, develop these associations over time
• (Figure panels: early training; after the 10th stimulus)

Genetic Algorithms
• Inspired by evolutionary biology
• Individuals in a population have a particular fitness with respect to a task
• Individuals with the highest fitness are kept as survivors
• Individuals with poor performance are discarded: the process of natural selection
• Evolutionary process: a search through the space of solutions to find the one with the highest fitness

Genetic Operators
• Knowledge is encoded as bit strings: chromosomes
  – Each bit represents a "gene"
• Biologically inspired operators are applied to yield better generations

Evolving Structure and Control
• Karl Sims, 1994
• Evolved morphology and control for virtual creatures performing swimming, walking, jumping, and following
• Genotypes encoded as directed graphs are used to produce 3D kinematic structures
• Genotypes encode points of attachment
• Sensors used: contact sensors, joint-angle sensors, and photosensors

Evolving Structure and Control
• Jordan Pollack: real structures

Readings
• F. Martin: Sections 1.1, 1.2.3, 5
• M. Matarić: Chapters 1, 3, 10
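To make the reinforcement-learning slides concrete, here is a hedged sketch of tabular Q-learning (one common RL algorithm, not necessarily the one used in the robot examples above) on a toy one-dimensional corridor. The environment, reward, and parameters are all invented for illustration: the reward function plays the role of the "internal critic", and actions leading to reward are strengthened over repeated episodes.

```python
import random

# Toy corridor: states 0..4, goal at state 4 (reward 1), actions move left/right.
# (All states, rewards, and parameter values here are arbitrary choices.)
random.seed(0)
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                      # -1 = left, +1 = right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2       # learning rate, discount, exploration

def step(s, a):
    """Apply an action; clamp to the corridor; reward only at the goal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

for _ in range(500):                    # training episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the current value estimates
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Q-learning update: desirable outcomes strengthen the taken action
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned policy (state -> best action) for the non-goal states
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the greedy policy moves right from every state, i.e., toward the reward, matching the slides' description of a learned state/action mapping shaped by reinforcement.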