Chapter 6 Learning
Learning
–
Learning is any relatively permanent change in behavior that is based upon experience.
–
It is an area of psychology that seems simple to evaluate but is quite complex.
–
Both internal and external factors can influence and interfere with an organism’s learning.
Behaviorism
–
Behaviorists insist that psychologists should study only observable, measurable behaviors, not mental processes.
–
There exists a wide range of views among researchers who call themselves behaviorists.
Methodological behaviorism
–
Methodological behaviorists study only events that they can measure and observe.
–
They sometimes use those observations to make inferences about internal events.
–
From observing how an animal behaves in the presence of certain stimuli, a methodological behaviorist infers the
presence of an intervening variable.
–
If a monkey shows its teeth or makes loud noises when a stuffed animal or a larger monkey of the same species is placed in its cage, or when a recording of the growling noises of a predatory cat is played, the methodological behaviorist infers the presence of fear (an intervening variable).
–
What measurements would you take to infer the presence of intervening variables such as:
Hunger?
•
Affection?
•
Anger?
Radical behaviorism
–
Radical behaviorists believe that internal states are caused by external events or genetics.
•
The ultimate cause of behavior is observable events, not internal states.
•
Most vague discussions of mental states can be rephrased as descriptions of behavior.
The rise of behaviorism
–
In the early 1900s, the structuralists used the technique of introspection to study psychology.
–
They asked subjects to describe their own experiences in order to study thoughts and ideas.
–
Behaviorists deemed it useless to ask people to describe their private experiences.
–
The accuracy of these reports was impossible to gauge.
–
Behaviorists insisted that psychology deal with observable and measurable events only.
–
Jacques Loeb argued that all animal and most human behavior could be explained with stimulus-response psychology.
•
This explains behavior in terms of how each stimulus triggers a response.
•
Flinching from a blow and shading one’s eyes from strong light are stimulus-response behaviors.
–
More complex patterns of behavior are the sum of changes of speed and direction elicited by various stimuli.
–
Modern behaviorists believe that behavior is produced by stimuli and responses, plus the effects of natural physiological
states (hunger, tiredness, etc.)
–
The behaviorists carried on the tradition of asking questions about animal learning previously abandoned by comparative
psychologists.
–
Early behaviorists thought it was possible to determine the basic laws of learning by studying how animals learn.
The assumptions of behaviorism
–
Behaviorists are deterministic.
•
We live in a universe of cause-and-effect. Our behavior is part of that universe, so it must have identifiable causes.
•
If enough is known about an individual’s experiences, influences, and genetics, we can predict behavior.
–
Mental explanations are ineffective.
•
Q. Why is she smiling?
•
A. She is smiling because she is happy.
•
Q. How do you know she is happy?
•
A. We can tell she is happy because she is smiling.
–
Circular reasoning arises when the presence of internal states is inferred based on behavior.
–
The influence of this perspective can be seen in the American legal system. Witnesses are discouraged from inferences
about what they saw; they are encouraged to describe appearance and behavior.
–
The environment predominates.
•
The strongest influence on behavior is outcome.
•
The environment selects and perpetuates successful behaviors, as evolution selects successful animals.
•
Behaviorists neither deny nor emphasize heredity.
–
People dismiss behaviorism.
–
They reject the notion that thoughts, beliefs and emotions are the effect and not the cause of behavior.
–
Behaviorists argue that past outcomes of behaviors cause the thoughts, beliefs and emotions.
–
Can you support the idea that thoughts, beliefs and emotions exist independently of your experiences?
Classical Conditioning
–
Pavlov and Classical Conditioning
–
Ivan Pavlov was a physiologist who won a Nobel Prize for his research on digestion.
–
His original description of classical conditioning was a by-product of this research. He did not set out to discover
classical conditioning.
–
Pavlov noticed that the dogs he used in research salivated upon the sight of the lab workers who fed them.
•
He concluded that this reflex was “psychological” - based on the dog’s previous experiences.
•
Further testing demonstrated that the sight of food produced the same effect as actually giving the food to the dog.
–
Based on tentative acceptance of the salivation reflex, Pavlov described this response as a “conditional reflex.”
–
The term was mistranslated into English as conditioned reflex.
–
This mistake created the terminology now used to describe classical conditioning.
–
Pavlov started with the unconditioned reflex of salivation to food. He hypothesized the presence of an automatic connection.
•
The dogs had an unconditioned reflex of secretion of digestive juices to food.
–
A buzzer is a neutral stimulus. It elicits attention to the sound, but no automatic connection.
•
The dogs lifted their ears and looked around when the buzzer sounded, but did not salivate.
–
Pavlov hypothesized that animals transfer a response from one stimulus to another – a new learned connection.
•
If a buzzer always preceded the food, the buzzer would begin to elicit the reflex of salivation.
•
After a few pairings of the buzzer with the food, the dogs salivated as soon as the buzzer sounded.
Terminology
–
Unconditioned Stimulus (UCS): An event that consistently and automatically elicits an unconditioned response.
–
Unconditioned Response (UCR): An action that the unconditioned stimulus automatically elicits.
–
Conditioned Stimulus (CS): Formerly the neutral stimulus; after being paired with the unconditioned stimulus, it elicits the same response. That response depends on consistent pairing of the CS with the UCS.
–
Conditioned Response (CR): The response elicited by the conditioned stimulus due to training. Usually it resembles the UCR.
Factors that enhance conditioning
–
Conditioning is quicker when the conditioned (neutral) stimulus is unfamiliar. If you are habituated to (used to) the neutral
stimulus, it will take longer to form a connection.
–
Conditioning is facilitated when people are made aware of the connection between the CS and the UCS. Having been informed of the conditioning procedure, they are conditioned faster.
The processes of classical conditioning
–
The process that establishes a conditioned response is acquisition.
–
To extinguish a classically conditioned response, the conditioned stimulus is repeatedly presented without the
unconditioned stimulus. This is referred to as extinction.
–
A rabbit is conditioned to blink its eye. Each presentation of a musical tone is followed by a puff of air blown directly into its eye. After a few repetitions, the rabbit blinks when the tone sounds. (Acquisition)
–
The tone is repeatedly played without the air puff. Gradually, the rabbit stops blinking. (Extinction)
–
Extinction does not erase the association between the CS and the UCS.
–
If the puff of air is presented again to the rabbit without warning, it blinks the next time the tone is played.
–
The temporary return of an extinguished response is spontaneous recovery.
–
The rabbit acquires the response. The response is then extinguished through the repeated presentation of the tone with no
air puff. Hours after the experiment, the rabbit hears a musical tone. It blinks.
–
Stimulus generalization is the extension of a conditioned response from the training stimulus to similar stimuli.
–
Baby Hannah is conditioned to smile and laugh at the title screen with dark background and white writing that precedes a
funny song and cartoon on her “Baby Genius” DVD. She also smiles and giggles at the FBI Warning screen on movie
DVDs.
–
Discrimination is the development of different responses to two stimuli because they produce two different outcomes.
–
Gradually Hannah stops laughing at the FBI Warning screen because the song and cartoon do not follow it.
Explanations of classical conditioning
–
The process of classical conditioning is more complex than it might seem.
–
The association is not merely a transfer of response from one stimulus to the other. The conditioned stimulus is a signal
to the organism.
–
Temporal contiguity aids the process of conditioning. The sooner the UCS occurs after the presentation of the CS, the
faster the CR is acquired.
–
The CR is acquired more quickly when the CS precedes the UCS. This is forward conditioning.
–
In trace conditioning, the CS stops before the UCS is presented. This is a relatively ineffective method.
–
Backward conditioning (the UCS followed by the CS) rarely produces a response.
–
The phenomenon of blocking shows that it is difficult to condition the same response to a second stimulus once another stimulus already predicts the UCS.
–
When rats experience an electric shock (a UCS) they jump and shriek.
–
After being conditioned to a buzzer preceding the shock (a CS) they freeze in place at its sound, a typical alarm response
in rats.
–
The CS prepares the animal for a UCS.
Conditioning, contiguity and contingency
–
A conditioned response develops only if there is predictability or contingency.
–
The UCS must be more likely to occur after the CS than at other times.
–
The learner discovers which event predicts the outcome. It is unclear whether any actual complex thinking is involved in this process.
Understanding Addiction
–
Classical conditioning is thought by those unfamiliar with psychology to be simple, mechanical learning.
–
It is in fact a complex form of learning that requires processing of information by the learner.
•
Operant Conditioning
•
Thorndike and Operant Conditioning
–
In 1911 Edward Thorndike developed a simple, behaviorist explanation of learning.
–
He used a learning curve, a graph of the changes in behavior that occur over successive trials of an experiment, to record how quickly cats learned to escape from a puzzle box.
–
The cats' learning curve indicated slow and consistent progress towards the solution.
–
But cats would learn more quickly if the response selected produced an immediate escape.
–
The cats would try many different behaviors and learn to select the one that produced escape.
–
Overall it appeared to Thorndike that the cats were not “understanding” the connections between the solution and the
escape. There was no sudden increase in the learning curve to support that assumption.
–
Thorndike observed that the escape from the box acted as a reinforcement for the behavior that led to it.
•
A reinforcement is an event that increases the future probability of the most recent response.
Thorndike’s Law of Effect
–
“Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the
animal will, other things being equal, be more firmly connected to the situation, so that, when it (the situation) recurs, they
will be more likely to recur.”
•
Operant Conditioning
•
The type of learning that Thorndike studied has come to be known as operant or instrumental conditioning.
–
The process of changing behavior by following a response with a reinforcement.
–
The subject’s behavior determines and is affected by a specific outcome.
•
Operant conditioning differs from classical conditioning in that, in the former, the subject's behavior affects the outcome.
•
Classical conditioning influences visceral, reflexive, and involuntary responses, while operant conditioning applies to skeletal,
somatic, and voluntary responses.
•
Processes of Operant Conditioning
•
In operant conditioning, extinction occurs if responses stop producing reinforcements.
–
A child for whom you are babysitting whines until you give him a cookie. If you stop giving the child cookies, he will
eventually stop whining.
•
Stimulus generalization occurs when a new stimulus is similar to the original reinforced stimulus. The more similar the new
stimulus is to the old, the more strongly the subject will respond.
The child for whom you are babysitting falls and scrapes his knee. He is crying inconsolably. You give him a cookie. He continues to whine and cry on and off all afternoon, stopping for brief periods after you give him a cookie. The stimulus of his whining has generalized to crying, and you are responding to both.
•
Discrimination occurs when someone is reinforced for responding to one stimulus but not another. The individual will respond
more vigorously to one than to the other.
–
A stimulus that indicates which response is appropriate or inappropriate is called a discriminative stimulus.
•
If you stop giving the child cookies when he cries but continue when he whines, he will whine much more often than he
will cry.
The child for whom you baby-sit does not whine for cookies when his mother is present, because she never gives in to his whining. As soon as she leaves, he begins whining for a cookie. The presence of his mother has become a discriminative stimulus.
•
When his mother is present, the child for whom you baby-sit asks her politely for juice and crackers. When his mother is absent, he whines for cookies. The presence or absence of one stimulus or another signals to him which behaviors will or will not be reinforced.
Phenomena of Operant Conditioning
–
A stimulus’ power to encourage some responses and discourage others is known as stimulus control.
–
Thorndike noted that some responses are more easily learned than others. The cats learned to escape from the puzzle boxes relatively quickly, but learned to scratch themselves on cue slowly and inconsistently.
–
B.F. Skinner and the Shaping of Behavior
–
B.F. Skinner is the most influential of all radical behaviorists.
–
He demonstrated many potential applications of operant conditioning.
–
He was a firm believer in parsimony, seeking simple explanations in terms of reinforcement histories, and avoiding the
inference of complex mental processes.
Shaping Behavior
–
Shaping establishes a new response by reinforcing successive approximations to it.
–
Skinner used an “operant chamber” (referred to as a “Skinner box” by others) into which he put the animal he wished to
train by shaping.
–
Gradually the animal was reinforced for behaviors that approached the target activity until it fully performed the behavior.
–
To make a pigeon turn in a complete clockwise circle, Skinner would first reinforce the pigeon with food for just turning a
few degrees to the right. When the pigeon began turning to the right regularly, he would cease reinforcing until the pigeon
turned a few more degrees in that direction. When that behavior was established, he’d wait until the pigeon turned further
to the right, and reinforce that movement, until finally the pigeon turned in a complete circle.
Chaining Behavior
–
This is an operant conditioning method in which behaviors are reinforced by opportunities to engage in the next behavior.
•
The animal learns the final behavior, and then the next to last, and so on, until the beginning of the sequence is
reached.
•
Eating is an example of a chained behavior in humans. We first learn to eat with utensils, and gradually acquire
the preceding activities of getting and preparing food.
–
Increasing and Decreasing the Frequency of Responses
–
A reinforcement is an event that increases the probability that a response will be repeated.
–
A punishment is an event that decreases the probability of a response.
–
Reinforcement and Punishment
–
A reinforcement is either the presentation of a desirable item such as money or food, or the removal of an unpleasant
stimulus, such as verbal nagging or physical pain.
–
A punishment is the removal of a desirable condition such as driving privileges or the presentation of an unpleasant
condition such as physical pain.
–
Most people respond better to immediate reinforcement and immediate punishment than to delayed consequences.
–
In American society, most punishments are given for behaviors that are immediately reinforcing, and the punishment itself may or may not occur.
–
The threat of punishment under these conditions is not an effective tool for changing behavior.
–
Punishment tends to be ineffective except for temporarily suppressing undesirable behavior.
–
Mild, logical, and consistent punishment can be informative and helpful.
•
The presentation of an event that strengthens or increases the likelihood of a response is called positive reinforcement.
–
A parent praises a child for excellent performance on a test.
–
A waiter receives an extra large tip for good service.
•
Punishment is referred to as passive avoidance learning because, in response to punishment, an individual learns to avoid the outcome by being passive.
–
A child learns to avoid the punishment of being sent to his room for the evening by not teasing his little sister.
–
A woman avoids distress by not calling her sister who always says cruel things whenever they talk.
•
Omission training occurs when the lack of a response produces reinforcement. Conversely, producing the response leads to a lack of reinforcement.
–
This is sometimes referred to as negative punishment.
•
Reinforcements and Punishments
–
Escape learning or active avoidance learning occurs if the responses lead to an escape from or an avoidance of something painful.
•
This is sometimes referred to as negative reinforcement.
•
Parents tell a teenager that if she breaks curfew again, she will lose her driving privileges for a month.
•
A teenager cleans his room to avoid listening to any more of his dad's nagging.
•
A babysitter gives a cookie to a child to stop his whining.
What Constitutes Reinforcement?
–
A reinforcer increases the likelihood of the preceding response.
•
This can be confusing because it leads to a circular explanation.
•
It can also be confusing because although generally a reinforcer is a pleasant event, it doesn’t have to be.
•
What constitutes a “pleasant event” can be hard to define or vary from person to person.
–
Many reinforcers satisfy biological needs, such as hunger.
–
Addictive behaviors don’t seem to give much pleasure to the addict (although they may be negatively reinforcing - done to
avoid the unpleasant condition of not having access to the drug.)
–
Some reinforcers don’t satisfy any immediate need, but represent a future opportunity to have greater access to resources
(such as a good grade – you can’t eat it, but getting many of them may raise your chances of having more to eat later in
your life.)
–
The Premack Principle
•
The Premack Principle states that the opportunity to engage in a preferred behavior will be a reinforcer for any
less preferred behavior.
–
A person who prefers going to the movies to going to museums can be reinforced for extra trips to the
museum with free movie passes.
The disequilibrium principle
•
The disequilibrium principle states that each person has a preferred pattern of dividing time between various activities. If one is unable to engage in that pattern, a return to it will be reinforcing.
•
A person who must work overtime for the next three weekends makes an extra effort to finish up the
assigned work to return to his preferred activity of playing golf.
•
Unconditioned reinforcers meet primary, biological needs and are found to be reinforcing for almost everyone. Food and
drink are unconditioned reinforcers.
•
Conditioned reinforcers are effective because they have become associated with unconditioned reinforcers. Money and
grades are conditioned reinforcers.
•
Learning What Leads to What
–
Thorndike had a strictly mechanical view of reinforcement. An animal that receives reinforcement for a behavior
will perform it more frequently. No learning takes place without reinforcement, and understanding of the reason
for the behaviors is not necessary.
–
In contrast, the idea of latent learning suggests that learning may occur in animals without being demonstrated
until the reward is presented.
•
A rat is left to explore and sniff around in a maze. When presented with the possibility of a reward of food, it runs the maze as fast as a rat that was painstakingly trained with rewards to run it.
Schedules of Reinforcement
–
Schedules of reinforcement are rules for the delivery of reinforcement.
•
Used to maintain learned behaviors that might be extinguished if reinforcement ceased.
•
Continuous reinforcement schedules provide reinforcement every time a response occurs.
•
A rat learns to run a maze because food is present at the end of the alleys that lead to the exit from the maze.
•
However, outside of the laboratory, reinforcement rarely follows every occurrence of a desired behavior.
–
Most schedules of reinforcement are intermittent. Some responses are reinforced and others are not.
–
One of the two categories of intermittent reinforcement is ratio - delivery of reinforcement depends on the number of responses given.
–
The other category of intermittent reinforcement is interval - delivery of reinforcement depends on the amount of time since the last reinforcement.
–
A fixed-ratio schedule provides reinforcement only after a certain ("fixed") number of correct responses have been made. A laboratory rat being reinforced for hitting a lever after every 5 hits is being reinforced on an FR-5 schedule.
•
The local gourmet coffee shop gives you a card that says if you buy 9 coffee drinks you will get the 10th beverage for free.
–
A variable-ratio schedule provides reinforcement after a variable number of correct responses, usually working out to an average in the long run. A baseball player who has a .333 batting average is reinforcing fans with hits on a VR-3 schedule.
•
Slot machines, like all gambling, provide a particularly compelling form of variable-ratio reinforcement to the player.
–
A fixed-interval schedule provides reinforcement for the first response made after a specific time interval. A person who is paid every two weeks is reinforced for work on a fixed-interval schedule.
•
You receive your local newspaper at the same time every day. You probably have a good idea of when to start checking for it. This is a fixed-interval schedule.
–
A variable-interval schedule provides reinforcement after a variable amount of time has elapsed.
•
If your newspaper delivery person is very inconsistent about delivery times, showing up one day at 5:00 AM, the next day at 7:30 AM, etc., your paper is delivered on a variable-interval schedule.
–
Extinction of responses tends to take longer when an individual has been on an intermittent schedule rather than a continuous schedule.
–
One explanation for this difference is that the lack of reinforcement does not signify the complete cessation of
reinforcements to the individual who’s been on an intermittent schedule.
Applications of Operant Conditioning
–
A wide variety of applications exists for the techniques of operant conditioning, including:
•
Animal training for performance, military, and helper animals.
•
Persuasion in political and commercial enterprises.
•
Psychological treatment, through the use of applied behavior analysis or behavior modification.
–
In behavior modification, the clinician determines which reinforcers sustain an undesirable or unwanted
behavior.
–
The clinician tries to change the behavior by reducing the opportunities for reinforcement of the unwanted
behavior and providing reinforcers for a more acceptable behavior.
Operant Conditioning
–
Some people are disturbed by the idea that positive reinforcement might influence behavior.
–
You wouldn’t work hard in a course or a job if your performance didn’t matter and all the grades or bonuses were
given without regard to quality.
–
Operant conditioning provides a useful and powerful way to improve behavior.
Other Kinds of Learning
–
Conditioned Taste Aversions
–
If learning occurs reliably after just one trial, it is hard to know whether the learning was a result of classical conditioning or operant conditioning.
•
One kind of learning that occurs after a single trial is an association between eating something and getting sick.
•
This is conditioned taste aversion.
•
Many species appear to have a built-in predisposition to associate illness with food that was consumed, even if
some time has elapsed between the consumption of the substance and the onset of the illness.
–
Birdsong Learning
–
The beautiful songs of male birds may be delightful to our ears, but they are serious business for the bird.
•
The songs are crucial for attracting a suitable mate.
•
They are also a warning to potential invaders of the singer’s territory.
–
Some species of songbird are especially dependent on the process of hearing live songs of older males in order to
develop a normal song.
–
There is a sensitive period early in the bird’s life during which the song is learned most readily.
–
The young bird also learns better from a live male than from a tape recording, and will not learn the songs of other
species.
–
Birdsong learning resembles human language learning in some ways.
•
It requires a social context, has an optimal period for learning early in life, starts with a kind of babbling, and
tends to deteriorate if the individual becomes deaf later in life.
–
It differs from classical conditioning in that the song the baby male bird learns from is not an unconditioned stimulus – it
elicits no response.
–
It differs from operant conditioning in that during the sensitive period there is no apparent reinforcement of the learning.
Social Learning
–
The social-learning approach, defined by Albert Bandura, states that we learn many behaviors before we attempt them for
the first time.
•
Much learning, especially in humans, results from observing the behaviors of others and from imagining the
consequences of our own.
•
Two of the chief components of social learning are modeling and imitation.
Bandura and his assistants did experiments in which children watched films in which adults either did or did not attack an
inflated “Bobo” doll.
•
Children who saw the aggressive versions of the films were more likely to repeat those actions when left alone
with a similar toy.
•
The implication was that the children were imitating the aggressive behavior they had just seen.
–
There has been great interest in the work of Bandura because of the controversy over effects of violence in TV programs
and movies.
–
It is unclear whether a direct relationship exists between televised/cinematic violence and violent behavior. People vary widely in susceptibility to the influence of violent imagery.
Vicarious Reinforcement and Punishment
–
Another aspect of the social learning approach is the idea that we are more likely to imitate behaviors of others we’ve seen
rewarded and less likely to imitate behaviors that create unpleasant results for others.
–
This substitution of others’ experiences for one’s own is vicarious reinforcement or vicarious punishment.
–
The effectiveness of vicarious reinforcement and punishment resembles that of direct reinforcement and punishment.
–
Vicarious reinforcement appears to be more effective than vicarious punishment in creating behavioral change.
–
Some people may be more able to avoid identifying with others whose behaviors brought about painful or unpleasant
consequences.
Self-Efficacy in Social Learning
–
We imitate people we admire.
–
Advertisers routinely use endorsements from celebrities and sports figures, and images of the happy, healthy, affluent
people that most of us would like to be.
–
We do not model ourselves after every admirable figure. We imitate others only when we have a sense of self-efficacy, and
perceive ourselves as also being able to perform the task successfully.
Learning
–
Classical conditioning, operant conditioning, conditioned taste aversions, and social learning represent a diverse set of
influences on human behavior.
–
Your everyday behavior is in large part a product of the combined effects of these processes.