Associationism
Associationism is one of the oldest, and, in some form or another, most widely held theories of thought.
Associationism has been the engine behind empiricism for centuries, from the British Empiricists through the
Behaviorists and modern day Connectionists. Nevertheless, ‘associationism’ does not refer to one particular
theory of cognition per se, but rather a constellation of related though separable theses. What ties these theses
together is a commitment to a certain arationality of thought: a creature's mental states are associated because
of some facts about its causal history, and having these mental states associated entails that bringing one of a
pair of associates to mind will, ceteris paribus, ensure that the other also becomes activated.

1. What is Associationism?
2. Associationism as a Theory of Mental Processes: The Empiricist Connection
3. Associationism as a Theory of Learning
4. Associationism as a Theory of Mental Structure
   4.1 Associative Symmetry
   4.2 Activation Maps of Associative Structure
   4.3 Relation Between Associative Learning and Associative Structures
   4.4 Extinction and Counterconditioning
5. Associative Transitions
6. Associative Instantiation
7. Relation between the Varieties of Association and Related Positions
8. Associationism in Social Psychology
   8.1 Implicit Attitudes
   8.2 Dual Process Theories
9. Criticisms of Associationism
   9.1 Learning Curves
   9.2 The Problem of Predication
   9.3 Word Learning
      9.3.1 Fast Mapping
      9.3.2 Syntactic Category Learning
   9.4 Against the Contiguity Relation of Associationism
      9.4.1 Against the Necessity of Contiguity
      9.4.2 Against the Sufficiency of Contiguity
   9.5 Coextensionality
Bibliography
Academic Tools
Other Internet Resources
Related Entries
1. What is Associationism?
Associationism is a theory that connects learning to thought based on principles of the organism’s causal
history. Since its early roots, associationists have sought to use the history of an organism’s experience as the
main sculptor of cognitive architecture. In its most basic form, associationism has claimed that pairs of
thoughts become associated based on the organism’s past experience. So, for example, a basic form of
associationism (such as Hume’s) might claim that the frequency with which an organism has come into contact
with Xs and Ys in its environment determines the frequency with which thoughts about Xs and thoughts
about Ys will arise together in the organism's future.
Associationism’s popularity is in part due to how many different masters it can serve. In particular,
associationism can be used as a theory of learning (e.g., as in behaviorist theorizing), a theory of thinking (as in
Jamesian ‘streams of thought’), a theory of mental structures (e.g., as concept pairs), and a theory of the
implementation of thought (e.g., connectionism). All these theories are separable, but share a related,
empiricist-friendly core. As used here, a ‘pure associationist’ will refer to one who holds associationist theories
of learning, thinking, mental structure, and implementation. The ‘pure associationist’ is a somewhat idealized
position, one that no particular theorist may have ever held, but many have approximated to differing degrees
(e.g., Locke 1690/1975, Hume 1738/1975, Thorndike 1911, Skinner 1953, Hull 1943, Churchland 1986, 1989,
Churchland and Sejnowski 1990, Smolensky 1988, Elman 1991, Elman et al. 1996, McClelland et al. 2010,
Rydell and McConnell 2006, Fazio 2007).
Outside of these core uses of associationism, the movement has also been closely aligned with a number of
different doctrines over the years: empiricism, behaviorism, anti-representationalism (i.e., skepticism about the
necessity of representational realism in psychological explanation), gradual learning, and domain-general
learning. All of these theses are dissociable from core associationist thought (see section 7). While one can be
an associationist without holding those theses, some of those theses imply associationism to differing degrees.
These extra theses’ historical and sociological ties to associationism are strong, and so will be intermittently
discussed below.
2. Associationism as a Theory of Mental Processes: The Empiricist Connection
Empiricism is a general theoretical outlook, which tends to offer a theory of learning to explain as much of our
mental life as possible. From the British empiricists through Skinner and the behaviorists (see Behaviorism
entry) the main focus has been arguing for the acquisition of concepts (for the empiricists, ‘Ideas’; for the
behaviorists, ‘responses’) through learning. However, the mental processes that underwrite such learning are
almost never themselves posited to be learned.1 So winnowing down the number of mental processes one has
to posit limits the amount of innate machinery the theorist is saddled with. Associationism, in its original form
as in Hume (1738/1975), was put forward as a theory of mental processes. Associationists attempt to answer
the question of how many mental processes there are by positing that there is only one mental process: the
ability to associate ideas.2
Of course, thinkers execute many different types of cognitive acts, so if there is only one mental process, the
ability to associate, that process must be flexible enough to accomplish a wide range of cognitive work. In
particular, it must be able to account for learning and thinking. Accordingly, associationism has been put to use
on both fronts. We will first discuss the theory of learning, and then, after analyzing that theory and seeing
what is putatively learned, we will return to the associationist theory of thinking.
3. Associationism as a Theory of Learning
In one of its senses, ‘associationism’ refers to a theory of how organisms acquire concepts, associative
structures, response biases, and even propositional knowledge. It is commonly acknowledged that
associationism took hold after the publication of John Locke’s Essay Concerning Human Understanding
(1690/1975).3 However, Locke’s comments on associationism were terse (though fertile), and did not address
learning to any great degree. The first serious attempt to detail associationism as a theory of learning was given
by Hume in the Treatise of Human Nature (1738/1975).4 Hume’s associationism was, first and foremost, a theory
connecting how perceptions (‘Impressions’) determined trains of thought (successions of ‘Ideas’). Hume’s
empiricism, as enshrined in the Copy Principle 5, demanded that there were no Ideas in the mind that were not
first given in experience. For Hume, the principles of association constrained the functional role of Ideas once
they were copied from Impressions: if Impressions IM1 and IM2 were associated in perception, then their
corresponding Ideas, ID1 and ID2, would also become associated. In other words, the ordering of Ideas was
determined by the ordering of the Impressions that caused the Ideas to arise.
1 Empiricists who have wanted more than one type of learning mechanism have tended to be constructivists.
The basic constructivist position is to posit a single mental process, the ability to associate ideas, and to
construct new processes out of the single innate process (see Fodor 1983 for discussion).
2 Though many later associationists, such as Pavlov and the behaviorists, had only one mental process, Hume
in fact also had the imagination. For discussion on how the imagination meshes with Hume’s empiricism and
associationism see Fodor (2003).
Hume’s theory then needs to analyze what types of associative relations between Impressions matter for
determining the ordering of Ideas. Hume’s analysis consisted of three types of associative relations: cause and
effect, contiguity, and resemblance. If two Impressions instantiated one of these associative relations, then their
corresponding Ideas would mimic the same instantiation.6 For instance, if Impression IM1 was
cotemporaneous with Impression IM2, then (ceteris paribus) their corresponding Ideas, ID1 and ID2, would
become associated.
As stated, Hume’s associationism was mostly a way of determining the functional profile of Ideas. But we have
not yet said what it is for two Ideas to be associated (for that see section 4). Instead, one can see Hume’s
contribution as introducing a very influential type of learning—associative learning—for Hume’s theory
purports to explain how we learn to associate certain Ideas. But we can abstract away from Hume’s framework
of ideas and his account of the specific relations that underlie associative learning, and state the theory of
associative learning more generally: if two contents of experiences, X and Y, instantiate some associative
relation, R, then those contents will become associated, so that future activations of X will tend to bring about
activations of Y. The associationist then has to explain what relation R amounts to. The Humean form of
associative learning (where R is equated with cause and effect, contiguity, or resemblance) has been hugely
influential, informing the accounts of those such as Jeremy Bentham, J. S. Mill, and Alexander Bain (see, e.g.,
the entries on J.S. Mill and 19th Century Scottish Philosophy).7
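The general schema just stated can be made concrete with a small sketch. The Python below is an illustration only, not a model proposed by any of the authors discussed. It assumes that experiences arrive as (time, content) pairs, takes R to be temporal contiguity within a hypothetical window of one second, and adds a fixed increment to a symmetric association strength each time R holds. The names contiguous, learn_associations, window, and increment are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def contiguous(event_a, event_b, window=1.0):
    """Hypothetical relation R: two experiences are contiguous when they
    occur within `window` seconds of each other."""
    return abs(event_a[0] - event_b[0]) <= window


def learn_associations(events, relation=contiguous, increment=1.0):
    """Return association strengths for every pair of distinct contents
    whose experiences instantiate `relation` at least once.

    `events` is a list of (time_in_seconds, content) tuples.
    """
    strength = defaultdict(float)
    for a, b in combinations(events, 2):
        # Only distinct contents can become associates of one another.
        if a[1] != b[1] and relation(a, b):
            # frozenset keys make the bond symmetric: X/Y is the same as Y/X.
            strength[frozenset((a[1], b[1]))] += increment
    return dict(strength)


# Invented learning history: two bell/food pairings and one isolated light.
events = [
    (0.0, "bell"), (0.5, "food"),
    (10.0, "bell"), (10.4, "food"),
    (30.0, "light"),
]
strengths = learn_associations(events)
```

Passing a different function as relation (for instance, one that tests resemblance between contents) changes what R amounts to without changing the learning schema itself.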
Associative learning didn’t hit its stride until the work of Ivan Pavlov, which spurred the subsequent rise of the
behaviorist movement in psychology. Pavlov introduced the concept of classical conditioning as a modernized
version of associative learning. For Pavlov, classical conditioning was in part an experimental paradigm for
teaching animals to learn new associations between stimuli. The general method of learning was to pair an
unconditioned stimulus (US) with a novel stimulus. An unconditioned stimulus is just a stimulus that naturally,
without training, provokes a response in an organism. Since this response is not itself learned, the response is
referred to as an ‘unconditioned response’ (UR). In Pavlov’s canonical experiment, the US was a meat powder,
as the smell of meat automatically brought about salivation (UR) in his canine subjects. The US is then paired
with a neutral stimulus, such as a bell. Over time, the contiguity between the US and the neutral stimulus causes
the neutral stimulus to provoke the same response as the US. Once the bell starts to provoke salivation, the bell
has become a ‘conditioned stimulus’ (CS) and the salivating, when prompted by the bell alone, a ‘conditioned
response’ (CR). The associative learning here is learning to form new stimulus-response pairs between the bell
and the salivation.8
3 That said, one can detect aspects of associationism in earlier writers, such as Descartes when discussing
memory and Spinoza when discussing the emotions (see the entries on Descartes and on Spinoza on the
emotions).
4 Although Hume is generally acknowledged as laying the theoretical foundation of associationism, there is
some evidence that Francis Hutcheson’s use of associations greatly influenced him. See the entry on Scottish
Philosophy in the 18th Century.
5 “All our simple ideas in their first appearance are deriv'd from simple impressions, which are correspondent
to them, and which they exactly represent” (T 1.1.1.7/4).
6 This is a bit of a loose formulation. Strictly speaking, Impressions themselves don’t instantiate any associative
relation; rather, the contents of the Impressions do. For example, it isn’t that one’s Impression (understood as a
vehicle of thought) of chickens resembles roosters; rather, it’s that the contents of one’s Impressions resemble one
another. Presumably, all Impressions qua vehicles of thought resemble one another merely by being
Impressions. What differs between Impressions is (e.g.,) whether the content they represent resembles other
represented content. This distinction between vehicle and content is important for Hume’s overall architecture:
it’s not the vehicle of the Impression that gets copied into an Idea, but rather the content of that vehicle. That
said, to ease exposition the distinction between vehicles and contents is elided in the main text except where it
is important to distinguish.
7 Although some contemporary associationist views still retain all three original Humean associative relations,
the resemblance relation has come under the most scrutiny and is the least popular of the three. For
discussions of the problem of the resemblance criterion see Field and Davey (1999), and De Houwer (2009). In
the canonical Rescorla-Wagner model (Rescorla and Wagner 1972), both contiguity and resemblance are
superseded by the contingency requirement.
Classical conditioning is a fairly circumscribed process. It is a ‘stimulus substitution’ paradigm where one
stimulus can be swapped for another to provoke a response. 9 However, the responses that are provoked
remain unchanged; all that changes is the stimulus that gets associated with the response. Thus, classical
conditioning seemed to some to be too restrictive to explain the panoply of novel behavior organisms appear
to execute.10
Edward Thorndike’s research with cats in puzzle boxes broadened the theory of associative learning by
introducing the notion of consequences to associative learning. Thorndike expanded the notion of associative
learning beyond instinctual behaviors and sensory substitution to genuinely novel behaviors. Thorndike’s
experiments initially probed (e.g.,) how cats learned to lift a lever to escape the “puzzle boxes” (the forerunner of
‘Skinner boxes’) that they were trapped in. The cats’ behaviors, such as attempting to lift a lever, were not
themselves instinctual behaviors like the URs of Pavlov’s experiments. Additionally, the cats’ behaviors were
shaped by the consequences that they brought on. For Thorndike it was because lifting the lever caused the
door to open that the cats learned the connection between the lever and the door. This new view of learning,
operant conditioning (for the organism is ‘operating’ on its environment) was not merely the passive learning of
Pavlov, but a species-nonspecific, general, active theory of learning.
This research culminated in Thorndike’s famous “Law of Effect” (1911), the first canonical psychological law
of associationist learning. It asserted that responses that are accompanied by the organism feeling satisfied will,
ceteris paribus, be more likely to be associated with the situation in which the behavior was executed, whereas
responses that are accompanied with a feeling of discomfort to the animal will, ceteris paribus, make the
response less likely to occur when the organism encounters the same situation.11 The greater the satisfaction or
discomfort produced, the greater the strengthening or weakening of the connection. To this Thorndike
added the “Law of Exercise”, that responses to situations will, ceteris paribus, be more connected to those
situations in proportion to the frequency of past pairings between situation and response. Thorndike’s
paradigm was popularized and extended by B. F. Skinner (see, e.g., Skinner 1953) who stressed the notion not
just of consequences but of reinforcement as the basis of forming associations. For Skinner, a behavior would get
associated with a situation according to the frequency and strength of reinforcement that would arise as a
consequence of the behavior.
8 A variation on classical conditioning is evaluative conditioning, where one tries to transfer the valence of the
US onto the CS (see, e.g., De Houwer et al. 2001 for an overview). For instance, one might pair a favorable
flavor (e.g., sugar) with a novel neutral face stimulus, in order to transfer the positive valence to the previously
neutral face.
9 There are many different ways of construing the details of Pavlovian conditioning. For example, some would
restrict the usage further by arguing that the US must be biologically significant, or widen the usage, as Rescorla
does (see section 7). Some anti-associationists even believe that Pavlovian conditioning is real, but not
predicated on associations (Mitchell et al. 2009).
10 Classical conditioning also had some consequences that were a bit unpalatable for empiricists: if all learning
was to be given as forming associative bonds between USs, CSs, and responses, then all of our learning had to
bottom out in some behaviors that were preprogrammed to correspond to certain stimuli: in other words,
certain instinctual patterns of behavior were innately set to be elicited by certain stimuli. Even more
problematically, such instinctual patterns were apt to be species specific, so not generalizable to humans.
11 Note how Thorndike does not hesitate to speak of mental states like satisfaction and dissatisfaction, as
opposed to the most famous practitioner of operant conditioning, the radical behaviorist B. F. Skinner (see the
Behaviorism entry).
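The Law of Effect and the Law of Exercise can be read as two separate contributions to the strength of a situation-response connection. The sketch below is a toy illustration under that reading, not Thorndike's or Skinner's own formalism: the class BondTable, its two rate parameters, and the numerical satisfaction scale are all assumptions introduced for the example.

```python
from collections import defaultdict


class BondTable:
    """Toy table of connection strengths between situations and responses."""

    def __init__(self, effect_rate=0.2, exercise_rate=0.05):
        # Both rates are arbitrary illustrative values.
        self.effect_rate = effect_rate
        self.exercise_rate = exercise_rate
        self.strength = defaultdict(float)

    def record(self, situation, response, satisfaction):
        """Update the bond after one trial.

        `satisfaction` is positive for a satisfying outcome, negative for
        discomfort, and zero for a neutral outcome.
        """
        key = (situation, response)
        # Law of Exercise: every pairing strengthens the bond a little.
        self.strength[key] += self.exercise_rate
        # Law of Effect: the outcome strengthens or weakens the bond in
        # proportion to the satisfaction or discomfort produced.
        self.strength[key] += self.effect_rate * satisfaction

    def preferred_response(self, situation):
        """Return the response most strongly bonded to `situation`, or None."""
        candidates = {
            resp: s for (sit, resp), s in self.strength.items() if sit == situation
        }
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


# Invented trial history for a cat in a puzzle box.
bonds = BondTable()
for _ in range(5):
    bonds.record("puzzle box", "lift lever", satisfaction=1.0)     # door opens
    bonds.record("puzzle box", "scratch wall", satisfaction=-0.5)  # stays trapped
preferred = bonds.preferred_response("puzzle box")
```

On a Skinnerian reading, the satisfaction argument would be replaced by an observable measure of reinforcement, since the radical behaviorist declines to posit felt satisfaction.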
Since the days of Skinner, associative learning has come in many different variations. But what all varieties
share with their historical predecessors is that associative learning is supposed to mirror the contingencies in
the world without adding additional structure to them. The question of what contingencies associative learning
detects (that is, one’s preferred analysis of what the associative relation R is), is up for debate and changes
between theorists.
The final widely shared, though less central, property of associative learning concerns the domain generality of
associative learning. Domain generality’s prevalence among associationists is due in large part to their
traditional empiricist allegiances: excising domain-specific learning mechanisms constrains the number of innate
mental processes one has to posit. Thus it is no surprise to find that both Hume and Pavlov assumed that
associative learning could be used to acquire associations between any contents, regardless of the types of
contents they were. For example, Pavlov writes, “Any natural phenomenon chosen at will may be converted
into a conditioned stimulus. Any ocular stimulus, any desired sound, any odor, and the stimulation of any
portion of the skin, whether by mechanical means or by the application of heat or cold never failed to stimulate
the salivary glands.” (Pavlov 1906, p. 615). Note that for Pavlov the content of the CS doesn’t matter. Any
content will do, as long as it bears the right functional relationship in the organism’s learning history. In that
sense, the learning is domain general—it matters not what the content is, just the role it plays (for more on this
topic, see section 9.4).12
4. Associationism as a Theory of Mental Structure
Associative learning amounts to a constellation of related views that interprets learning as associating stimuli
with responses (in operant conditioning), or stimuli with other stimuli (in classical conditioning), or stimuli with
valences (in evaluative conditioning). Associative learning accounts raise the question: when one learns to
associate contents X and Y because (e.g.,) previous experiences with Xs and Ys instantiated R, how does one
store the information that X and Y are associated? A highly contrived sample answer to this question would be
that a thinker learns an explicitly represented unconscious conditional rule that states ‘when a token of X is
activated, then also activate a token of Y.’ Instead of such a highly intellectualized response, associationists have
found a natural (though by no means necessary, see section 4.2) complementary view that the information is
stored in an associative structure.
An associative structure describes the type of bond that connects two distinct mental states.13 An example of
such a structure is the associative pair SALT/PEPPER.14 The associative structure is defined, in the first instance,
functionally: if X and Y form an associative structure, then, ceteris paribus, activations of mental state X bring
about mental state Y and vice versa without the mediation of any other psychological states (such as an
explicitly represented rule telling the system to activate a concept because its associate has been activated).15 In
other words, saying that two concepts are associated amounts to saying that there is a reliable, psychologically
basic causal relation that holds between them: the activation of one of the concepts causes the activation
of the other. So, saying that someone harbors the structure SALT/PEPPER amounts to saying that activations of
SALT will cause activations of PEPPER (and vice versa) without the aid of any other cognitive states.
12 From this level of abstraction, Pavlov and Skinner were united. Here is Garcia on Skinnerian learning: “Any
stimulus applied immediately after the response which, by empirical test, would increase response production
was deemed a reinforcer…The general procedures were said to be applicable to any and all reflexes, in any and
all organisms. There was no need to concern ourselves with species differences, with brain differences, or with
reinforcer differences. The payoff schedule’s the thing wherein we’d capture control of the organism” (Garcia
1981, p. 155).
13 Radical behaviorists such as Skinner (e.g. 1953) would deny this claim, but only because of their ontological
objections to reifying mental states. But eliminativism about the mental is a different thesis from associationism,
although both fit together well (see section 6).
14 Hereafter I will use the forward slash to denote an associative bond between the entities on either side of the
slash. Additionally, expressions written in small caps will be used to denote concepts, and I will assume that the
concepts’ structural descriptions are given by the expressions. Thus RED BIRD is taken to be a complex concept
consisting of two meaningful parts, the concept RED and the concept BIRD. However, BIRD will be assumed to
be a simple concept with no semantically decomposable parts. The structural descriptions are stipulated for
exegetical reasons and without commitment to the actual structure of the corresponding concepts.
Associative structures are most naturally contrasted with propositional structures. A pure associationist is
opposed to propositional structures—strings of mental representations that express a proposition—because
propositionally structured mental representations have structure over and above the mere associative bond
between two concepts. Take, for example, the associative structure GREEN/TOUCAN. This structure does not
predicate green onto toucan. If we know that a mind has an associative bond between GREEN and TOUCAN,
then we know that activating one of those concepts leads to the activation of the other. A pure associative
theory rules out predication, for propositional structures aren’t just strings of associations. ‘Association’ (in
associative structures) just denotes a causal relation among mental representations, whereas predication
(roughly) expresses a relation between things in the world (or intentional contents that specify external
relations). Saying that someone has an associative thought GREEN/TOUCAN tells you something about the
causal and temporal sequences of the activation of concepts in one’s mind; saying that someone has the
thought THERE IS A GREEN TOUCAN tells you that a person is predicating greenness of a particular toucan (see
Fodor 2003, pp. 91-94, for an expansion of this point).
Associative structures needn’t just hold between simple concepts. One might have reason to posit associative
structures between propositional elements (see section 5) or between concepts and valences (see section 8). But
none of the preceding is meant to imply that all structures are associative or propositional; there are other
representational formats that the mind might harbor (e.g., analog magnitudes or iconic structures). For
instance, not all semantically related concepts are harbored in associative structures. Semantically related
concepts may in fact also be directly associated (as in DOCTOR/NURSE) or they may not (as in HORSE/ZEBRA;
see Perea and Rosa 2001). The difference in structure is not just a theoretical possibility: these different
structures have different functional profiles. For example, conditioned associations appear to last longer than
semantic associations do in subjects with dementia (Glosser and Friedman 1991).
4.1 Associative Symmetry
The analysis of associative structures implies that, ceteris paribus, associations are symmetric in their causal
effects: if a thinker has a bond between SALT/PEPPER, then SALT should bring about PEPPER just as well as
PEPPER brings about SALT. But all else is rarely equal. For example, behaviorists such as Thorndike, Hull, and
Skinner knew that the order of learning affected the causal sequence of recall: if one is always hearing ‘salt and
pepper’ then SALT will be more poised to activate PEPPER than PEPPER to activate SALT. So, included in the
ceteris paribus clause in the analysis of associative structures is the idealization that the learning of the
associative elements was equally well randomized in order.
Similarly, associative symmetry is violated when there are differing numbers of associative connections between
the individual associated elements. For example, in the GREEN/TOUCAN case, most thinkers will have many
more associations stemming from GREEN than stemming from TOUCAN. Suppose we have a thinker that only
associates TOUCAN with GREEN, but associates GREEN with a large host of other concepts (e.g., GRASS,
VEGETABLES, TEA, KERMIT, SEASICKNESS, MOSS, MOLD, LANTERN, IRELAND, etc). In this case one can expect
that TOUCAN will more quickly activate GREEN than GREEN will activate TOUCAN, for the former bond will
have its activation strength less weakened amongst other associates than the latter will.
15 The mediation parenthetical can get a bit complicated to state, for one might want to claim that (e.g.,)
WRENCH and HAMMER are associated, even if the association is mediated through a link with
SCREWDRIVER. In which case, it’s best to say that two concepts form a basic associative structure if the
activation of one concept brings on the activation of another without there being any other mediating
psychological variable.
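The asymmetry described in section 4.1 for the GREEN/TOUCAN case can be illustrated numerically. The sketch below makes a deliberately crude assumption that is not in the text: a fixed amount of activation leaving a concept is divided equally among all of that concept's associates. The function spread and the dictionary links are hypothetical names, and the list of associates for GREEN is a shortened version of the one given above.

```python
def spread(source, links, total_activation=1.0):
    """Divide `total_activation` equally among the associates of `source`.

    Equal division is a simplifying assumption made for illustration.
    Returns a dict mapping each associate to the activation it receives.
    """
    associates = links.get(source, [])
    if not associates:
        return {}
    share = total_activation / len(associates)
    return {target: share for target in associates}


# A thinker for whom TOUCAN has one associate but GREEN has five.
links = {
    "TOUCAN": ["GREEN"],
    "GREEN": ["TOUCAN", "GRASS", "TEA", "MOSS", "KERMIT"],
}

toucan_to_green = spread("TOUCAN", links)["GREEN"]
green_to_toucan = spread("GREEN", links)["TOUCAN"]
```

Under these assumptions TOUCAN passes all of its activation to GREEN, while GREEN passes only a fifth of its activation to TOUCAN, which is the direction of asymmetry the text predicts.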
4.2 Activation Maps of Associative Structure
An associative activation map (sometimes called a ‘spreading activation’ map, Collins and Loftus 1975) is a
mapping for a single thinker of all the associative connections between concepts. 16 There are many ways of
operationalizing associative connections. In the abstract, a psychologist will attempt to probe which concepts
(or other mental elements) activate which other concepts (or elements). Imagine a subject who is asked to say
whether a string of letters constitutes a word or not, which is the typical goal given to subjects in a ‘lexical
decision task.’ If a subject has just seen the word ‘bugs’, we assume that the concept BUGS was activated. If the
subject is then quicker to say that, e.g., ‘insects’ is a word than the subject is to say that ‘toaster’ is, then we can
infer that INSECTS was primed, and is thus associatively related to BUGS, in this thinker. Likewise, if we find that
‘microphone’ is also responded to more quickly, then we know that MICROPHONE is associatively related to BUGS.
Using this procedure, one can generate an associative mapping of a thinker’s mind. Such a mapping would
constitute a mapping of the associative structures one harbors. However, to be a true activation map—a true
mapping of what concepts facilitate what—the mapping would also need to include information about the
violations of symmetry between concepts.
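The procedure described in this section can be sketched as a small data transformation. The code below is a hedged illustration, not a description of any published analysis: it assumes that priming is measured as the difference between a baseline response time and the response time after a given prime, and that a facilitation of at least 20 ms counts as evidence of an association. The function name, the threshold, and every response time shown are invented for the example.

```python
def build_activation_map(primed_rt_ms, baseline_rt_ms, threshold_ms=20):
    """Build a directed activation map from lexical decision data.

    primed_rt_ms:   {(prime, target): mean response time after the prime}
    baseline_rt_ms: {target: mean response time without a related prime}
    Returns {prime: {target: facilitation in ms}} for every pair whose
    facilitation meets the (assumed) threshold.
    """
    activation_map = {}
    for (prime, target), rt in primed_rt_ms.items():
        facilitation = baseline_rt_ms[target] - rt
        if facilitation >= threshold_ms:
            # Directed entries let the map record violations of symmetry.
            activation_map.setdefault(prime, {})[target] = facilitation
    return activation_map


# Invented response times (ms) for a single hypothetical subject.
primed = {
    ("bugs", "insects"): 520,
    ("bugs", "microphone"): 545,
    ("bugs", "toaster"): 598,
    ("insects", "bugs"): 560,
}
baseline = {"insects": 600, "microphone": 600, "toaster": 600, "bugs": 600}

activation_map = build_activation_map(primed, baseline)
```

Because each prime-target direction is stored separately, the map can show BUGS facilitating INSECTS more than INSECTS facilitates BUGS, which a symmetric pairing table could not represent.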
4.3 Relation Between Associative Learning and Associative Structures
The British Empiricists desired to have a thoroughgoing pure associationist theory, for it allowed them to
lessen the load of innate machinery they needed to posit. Likewise, the behaviorists also tended to want a pure
associationist theory (sometimes out of a similar empiricist tendency, other times because they were radical
behaviorists like Skinner, who banned all discussion of mental representations). Pure associationists tend to be
partial to a connection that Fodor (2003) refers to as “Bare-Boned Association.” The idea is that the current
strength of an associative connection between X and Y is determined, ceteris paribus, by the frequency of the
past associations of X and Y. As stated, Bare-Boned Association assumes that associative structures encode, at
least implicitly, the frequency of past associations of X and Y, and the strength of that associative bond is
determined by the organism’s previous history of experiencing Xs and Ys. 17 In other words, the learning history
of past associations determines the current functional profile of the corresponding associative structures. 18
Although the picture sketched above, where associative learning eventuates in associative structure, is appealing
for many, it is not forced upon one. Logically speaking, there is no reason why any type of structure could not arise
from a particular type of learning. One may, for example, gain propositional structures from associative
learning (see Mitchell et al. 2009 and Mandelbaum forthcoming for arguments that this is more than a mere
logical possibility). This may happen in two ways. In the first, one may gain an associative structure that has a
proposition as one of its associates. Assume that every time one’s father came home he immediately made
dinner. In such a case one might associate the proposition DADDY IS HOME with the concept DINNER (that is,
one might acquire: DADDY IS HOME/DINNER). However, one might also just have a propositional structure
result from associative learning. If every time one’s father came home he made dinner, then one might just end
up learning IF DADDY IS HOME THEN DINNER WILL COME SOON, which is a propositional structure.
16 This claim should be qualified in a few ways. First, the mapping might not be a full mapping of a single
thinker as opposed to a subsystem of a single thinker (such as their intramodular representation of their
lexicon, see Fodor 1983). Secondly, the mapping needn’t be between concepts per se, and can instead be
between mental representations on which, for some reason or another, one needn’t bestow the honorific
‘concepts’ (because, for example, the mental representations are intramodular and thus not properly ‘general’, see
Evans 1982).
17 ‘Experiencing Xs and Ys’ generally means something such as ‘having formed representations of Xs and Ys
based on their appearance in the ambient environment,’ but needn’t necessarily mean that. If one just happened
to keep thinking X followed by Y for any reason, even though Xs and Ys weren’t given in experience, that too
could change the associative strength of the X/Y bond. Additionally, some theories allow ‘piggybacking’
associations, that is, associations formed from activated propositional structures. For example, constantly having the
propositional thought MOLLY OWNS A DOG could affect the associative bond between MOLLY and DOG (see
Mandelbaum forthcoming for discussion).
18 Although bare-boned associationism provides a good approximation of Hume and Pavlov, it doesn’t quite
capture the full theory of those working in operant conditioning paradigms, for it doesn’t involve any notion of
reinforcement, or updating one’s associative structure based on consequences. This isn’t accidental: how to
square cognitive updating (i.e., association-based or belief-based updating) based on consequences with the
Spartan tenets of associationism has often been a point of difficulty (see, e.g., Festinger and Carlsmith 1959).
4.4 Extinction and Counterconditioning
There is a different, tighter relationship between associative learning and associative structures concerning how
to modulate an association. Associative theorists, especially from Pavlov onward, have been clear on the
functional characteristics necessary to modulate an already created association. There have been two generally
agreed upon routes: extinction and counterconditioning. Suppose that, through associative learning, you have learned
to associate a CS with a US. How can that association be broken? Associationists have posited that one breaks an
associative structure via two particular types of associative learning (/unlearning). Extinction is the
name for one such process. During extinction one decouples the external presentation of the CS and the US by
presenting the CS without the US (and sometimes the US without the CS). Over time, the organism will learn
to disconnect the CS and US.
Counterconditioning names a similar process to extinction, though one which proceeds via a slightly different
method. Counterconditioning can only occur when an organism has an association between a mental
representation and a valence, as acquired in an evaluative conditioning paradigm. Suppose that one associates
DUCKS with a positive valence. To break this association via counterconditioning one introduces ducks not
with a lack of positive valence (as would happen in extinction) but with the opposite valence, a negative
valence. Counterconditioning counters the existing valence with the opposite valence. Over multiple exposures,
the initial representation/valence association weakens, and is perhaps completely broken.19
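The contrast between the two routes can be sketched numerically. The following is only an illustration of the traditional bond-weakening picture, not a model the entry endorses: it assumes a simple error-correcting update (loosely in the style of conditioning models), a single number standing in for associative strength, and hypothetical names and parameters (`update`, `run_trials`, `rate`). Extinction is modeled as pairing the representation with no valence (target 0), counterconditioning as pairing it with the opposite valence (target -1).

```python
def update(strength, target, rate=0.3):
    """One trial of an assumed error-correcting update: move toward the target."""
    return strength + rate * (target - strength)


def run_trials(strength, target, n_trials, rate=0.3):
    """Apply the update repeatedly and return the strength after every trial."""
    history = [strength]
    for _ in range(n_trials):
        strength = update(strength, target, rate)
        history.append(strength)
    return history


# Acquisition: the representation is repeatedly paired with a positive valence.
acquired = run_trials(0.0, 1.0, 20)[-1]

# Extinction: the representation is presented with no valence at all.
extinction = run_trials(acquired, 0.0, 10)

# Counterconditioning: the representation is paired with the opposite valence.
countercond = run_trials(acquired, -1.0, 10)
```

Under these assumptions the extinction series decays toward zero, while the counterconditioning series passes through zero and takes on the opposite sign. Whether real extinction works by weakening the original bond at all is exactly what the following discussion treats as an open question.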
How successful extinction and counterconditioning are, and how they work, is the source of some controversy.
Although the traditional view is that extinction breaks associative bonds, it is an open empirical question
whether extinction proceeds by breaking the previously created associative bonds, or whether it proceeds by
leaving that bond alone but creating new, more salient (and perhaps context-specific) bonds between the CS
and other mental states (see Bouton 2002 for evidence for the latter interpretation). Additionally, reinstatement,
the spontaneous reappearance of an associative bond after seemingly successful extinction, has been observed
in many contexts (see, e.g., Dirikx et al. 2007 for reinstatement of fear in humans). 20
One fixed point in this debate is that one reverses associative structures via these two types of associative
learning/unlearning, and only via these two pathways. What one does not do is try to break an associative
structure by using practical or theoretical reasoning. If you associate SALT with PEPPER then telling you that salt
has nothing to do with pepper, or giving you very good reasons not to associate the two (say, someone will give
you $50,000 for not associating them) won’t affect the association. This much has at least been clear since
Locke. In the Essay concerning Human Understanding, in his chapter “Of the Association of Ideas” (chapter
XXXIII) he writes,
When this combination is settled, and while it lasts, it is not in the power of reason to help us, and
relieve us from the effects of it. Ideas in our minds, when they are there, will operate according to
their natures and circumstances. And here we see the cause why time cures certain affections, which
reason, though in the right, and allowed to be so, has not power over, nor is able against them to
prevail with those who are apt to hearken to it in other cases (2.33.13).
19 Curiously, it appears that extinction isn’t very effective in evaluative conditioning paradigms, though
counterconditioning is (see De Houwer 2011 for many citations, such as Diaz et al. 2005 and
Vansteenwegen 2006).
20 Technically, reinstatement is the reappearance of the CR upon reexposure to the US after successful
extinction, whereas spontaneous recovery is the name for the return of the associative pairing just due to the
passage of time. Both reinstatement and spontaneous recovery are related, and both provide difficulties for the
traditional view of extinction.
Likewise, say one has just eaten lutefisk and then vomited. The smell and taste of lutefisk will then be
associated with feeling nauseated, and no amount of telling one that they shouldn’t be nauseated will be very
effective. Say the lutefisk that made one vomit was covered in poison, so that we know that the lutefisk wasn’t
the root cause of the sickness.21 Having this knowledge won’t dislodge the association. In essence, associative
structures are functionally defined as being modifiable by counterconditioning and extinction and nothing
else. Thus, assuming one sees counterconditioning and extinction as types of associative learning, we can say
that associative learning does not necessarily eventuate in associative structures, but associative structures can
only be modified by associative learning.
5. Associative Transitions
So far we’ve discussed learning and mental structures, but have yet to discuss thinking. The pure associationist
will want a theory that covers not just acquisition and cognitive structure, but also the transition between
thoughts. Associative transitions are a particular type of thinking, akin to what William James called “The
Stream of Thought” (James 1890). Associative transitions are movements between thoughts that are not
predicated on a prior logical relationship between the elements of the thoughts that one connects. In this sense,
associative transitions are contrasted with computational transitions as analyzed by the Computational Theory
of Mind (see the Computational Theory of Mind entry). CTM understands inferences as truth preserving
movements in thought that are underwritten by the formal/syntactic properties of thoughts. For example,
inferring the conclusion in modus ponens from the premises is possible just based on the form of the major
and minor premise, and not on the content of the premises. Associative transitions are transitions in thought
that are not based on the logico-syntactic properties of thoughts. Rather, they are transitions in thought that
occur based on the associative relations among the separate thoughts.
Imagine an impure associationist model of the mind, one that contains both propositional and associative
structures. A computational inference might be one such as inferring YOU ARE A G from the thoughts IF YOU
ARE AN F, THEN YOU ARE A G, and YOU ARE AN F. However, an associative transition is just a stream of ideas
that needn’t have any formal, or even rational, relation between them, such as the transition from THIS
COFFEESHOP IS COLD to RUSSIA SHOULD ANNEX IDAHO, without there being any intervening thoughts. This
transition could be subserved merely by one’s association of IDAHO and COLD, or it could happen because the
two thoughts have tended to co-occur in the past, and their close temporal proximity caused an association
between the two thoughts to arise (or for many other reasons). Regardless of the etiology, the transition doesn’t
occur on the basis of the formal properties of the thoughts.22
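The difference between the two kinds of transition can be made concrete with a small sketch. It is an illustration only: the tuple encoding of conditionals, the function names, and the lookup table are all hypothetical choices, not anything the entry specifies. The inferential rule inspects only the form of its inputs, whereas the associative transition is a bare content-to-content link that happens to exist in a particular thinker.

```python
def modus_ponens(conditional, premise):
    """Return the consequent when the inputs have the form (IF P THEN Q) and P.

    Only the shape of the thoughts is inspected; their content is irrelevant.
    """
    tag, antecedent, consequent = conditional
    if tag == "IF" and antecedent == premise:
        return consequent
    return None


# A hypothetical thinker's associative links: nothing but content-to-content pairs.
ASSOCIATIONS = {
    "THIS COFFEESHOP IS COLD": "RUSSIA SHOULD ANNEX IDAHO",
}


def associative_transition(thought):
    """Return whatever thought happens to be linked to this one, if any."""
    return ASSOCIATIONS.get(thought)


inferred = modus_ponens(("IF", "YOU ARE AN F", "YOU ARE A G"), "YOU ARE AN F")
drifted = associative_transition("THIS COFFEESHOP IS COLD")
```

The inferential function works for any contents that fit the pattern, while the associative one works only for the pairs this thinker's history has linked, however unrelated their contents.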
According to this taxonomy, talk of an ‘associative inference’ (e.g., Anderson et al. 1994, Armstrong et al. 2012)
is a borderline oxymoron. The easiest way to give sense to the idea of an associative inference is for it to
involve transitions in thought that began because they were purely inferential (as understood by the
computational theory of mind) but then became associated over time. For example, at first one might make the
modus ponens inference because a particular series of thoughts instantiates the modus ponens form. Over time
the premises of that particular token of a modus ponens argument become associated with each other through
their continued use in that inference and now the thinker merely associates the premises with the conclusion.
That is, the constant contiguity between the premises and the conclusion occurred because the inference was
made so frequently, but the inference was originally made so frequently not because of the associative relations
between the premises and conclusion, but because of the form of the thoughts (and the particular motivations of
the thinker). This constant contiguity then formed the basis for an associative linkage between the premises and
the conclusion.
21 Interestingly, Locke also seemed to understand the nature of taste aversions (see section 9.4): “A grown
person surfeiting with honey no sooner hears the name of it, but his fancy immediately carries sickness and
qualms to his stomach, and he cannot bear the very idea of it; other ideas of dislike, and sickness, and vomiting,
presently accompany it, and he is disturbed; but he knows from whence to date this weakness, and can tell how
he got this indisposition. Had this happened to him by an over-dose of honey when a child, all the same effects
would have followed; but the cause would have been mistaken, and the antipathy counted natural” (Locke 1690
2.33.7).
22 In the example of associative transitions offered above, we used associations between propositions. But of
course a pure associationist view would not allow propositional structures. It is thus a bit more difficult for a
pure associationist to distinguish associative transitions from associative structures. For the pure associationist,
all transitions are associative transitions among associative structures, for association is the only available
mental process and associative structures the only available mental structure. Thus, for the pure associationist,
the only possible difference between an associative structure and an associative transition is a contingent
temporal one (where an associative structure is ideally contemporaneous whereas an associative transition
unfolds over time).
As was the case for associative structures, associative transitions in thought are not just a logical possibility.
There are particular empirical differences associated with associative transitions versus inferential transitions.
Associative transitions tend to move across different content domains, whereas inferential transitions tend to
stay on a more focused set of contents. These differences have been seen to result in measurable differences in
mood: associative thinking across topics bolsters mood when compared to logical thinking on a single topic
(Mason and Bar 2011).
6. Associative Instantiation
The associationist position so far has been neutral on how associations are to be implemented. Implementation
can be seen at a representational (that is, psychological) level of explanation, or at the neural level. A pure
associationist picture would posit an associative implementation base at one, or both, of these levels.23
The most well-known associative instantiation base is a class of networks called Connectionist networks (see
the Connectionism entry). Connectionist networks are sometimes pitched at the psychological level (see, e.g.,
Elman 1991, Elman et al. 1996, Smolensky 1988). This amounts to the claim that models of algorithms
embedded in the networks capture the essence of certain mental processes, such as associative learning. Other
times connectionist networks are said to be models of neural activity (‘neural networks’). Connectionist
networks consist in sets of nodes, generally input nodes, hidden nodes, and output nodes. Input nodes are
taken to be analogs of sensory neurons (or sub-symbolic sensory representations), output nodes the analog of
motor neurons (or sub-symbolic behavioral representations), and hidden nodes are stand-ins for all other
neurons.24 The network consists in these nodes being connected to each other with varying strengths. The
topology of the connections gives one an associative mapping of the system, with the associative weights
understood as the differing strengths of connections. On the psychological reading, these associations are
functionally defined; on the neurological reading, they are generally understood to be representing synaptic
conductance (and are the analogs of dendrites). Prima facie, these networks are purely associative and do not
contain propositional elements, and the nodes themselves are not to be equated with single representational
states (such as concepts; see, e.g., Gallistel and King 2009).
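A minimal sketch of such a network may help fix ideas. Everything specific here is an assumption made for illustration: the layer sizes, the weight values, and the sigmoid activation function are hypothetical and are not drawn from any of the models cited above. The point is only that the network's entire "knowledge" consists of connection strengths between nodes.

```python
import math


def sigmoid(x):
    """Squash a summed input into the range (0, 1); an assumed activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(activations, weights):
    """Compute one layer; weights[j][i] is the connection strength from node i to node j."""
    return [
        sigmoid(sum(w * a for w, a in zip(row, activations)))
        for row in weights
    ]


# Hypothetical connection strengths: 2 input nodes, 2 hidden nodes, 1 output node.
INPUT_TO_HIDDEN = [[0.8, -0.4], [0.2, 0.9]]
HIDDEN_TO_OUTPUT = [[1.5, -1.0]]


def forward(inputs, w_ih=INPUT_TO_HIDDEN, w_ho=HIDDEN_TO_OUTPUT):
    """Propagate activation from the input nodes through the hidden nodes to the output."""
    hidden = layer(inputs, w_ih)
    return layer(hidden, w_ho)


output = forward([1.0, 0.0])
```

Changing any weight changes how strongly activation at one node brings about activation at another, which is the sense in which the weights serve as associative strengths. No individual node in the sketch stands for a concept or a proposition.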
However, a connectionist network can implement a classical Turing machine architecture (see, e.g., McLaughlin
and Fodor 1990, Chalmers 1993). Many, if not most, of the adherents of classical computation, for example
proponents of CTM, think that the brain is an associative network, one which implements a classical
computational program. Some adherents of CTM do deny that the brain runs an associative network (see, e.g.,
Gallistel and King 2009, who appear to deny that there is any scientific level of explanation that association is
intimately involved in), but they do so on separate empirical grounds and not because of any logical
inconsistency with an associative brain implementing a classical mind.
23 The question of how many levels of explanation one allows in their cognitive architecture is a wholly separate
question from whether any of those architectures are associationistic. Generalizations here vary wildly from
theorist to theorist. For example, many theorists, roughly following Marr (1982), assume there is just one
algorithmic (psychological/representational) level which is then instantiated in a physical (neurological) level
(see, e.g., Mitchell et al. 2009). Others generally assume that there are multiple psychological levels. For
instance, Fodor writes, “psychological faculties at the nth level are typically implemented by psychological
faculties at the n-1th level” (2003, p. 132).
24 In this context, ‘sub-symbolic’ just means that the node on its own has no semantic value. In other words, a
single node wouldn’t represent any content.
When discussing an associative implementation base it is important to distinguish questions of associationist
structure from questions of representational reality. Connectionists have often been followers of the Skinnerian
anti-representationalist tradition (Skinner 1938). Because of the distributed nature of the nodes in connectionist
networks, the networks have tended to be analyzed as associative stimulus/response chains of subsymbolic
elements. However, the question of whether connectionist networks have representations which are distributed
in patterns of activity throughout different nodes of the network, or whether connectionist networks are best
understood as containing no representational structures at all, is orthogonal to both the question of whether
the networks are purely associative or computational, and whether the networks can implement classical
architectures.
7. Relation between the Varieties of Association and Related Positions
These four types of associationism share a certain empiricist spiritual similarity, but are logically, and
empirically, separable. The pure associationist who wants to posit the smallest number of domain-general
mental processes will theorize that the mind consists of associative structures acquired by associative learning
which enter into associative transitions and are implemented in an associative instantiation base. However,
many hybrid views are available and frequently different associationist positions become mixed and matched,
especially once issues of empiricism, domain-specificity, and gradual learning arise. Below is a partial taxonomy
of where some well-known theorists lie in terms of associationism and these other, often related doctrines.
Prinz (2002) and Karmiloff-Smith (1992) are examples of empiricist non-associationists. It is rare to find an
associationist who is a nativist, but plenty of nativists have aspects of associationism in their own work. For
example, even the arch-nativist Jerry Fodor allows that intramodular lexicons contain associative structures
(Fodor 1983). Similarly, there are many non-behaviorist (at least non-radical, analytic, or methodological
behaviorist) associationists, such as Elman (1991), Smolensky (1988), Baeyens (De Houwer et al. 2001) and
modern day dual process theorists such as Evans and Stanovich (2013). It is quite difficult to find a non-associationist behaviorist, though Tolman approximates one (Tolman 1948). Elman and Smolensky also qualify
as representationalist associationists, and Van Gelder (1995) as an anti-representationalist non-associationist.
Karmiloff-Smith (1992) can be interpreted as, for some areas of learning, a proponent of gradual learning
without being associationist (some might also read contemporary Bayesian theorists, e.g., Tenenbaum et al.
2011 and Chater et al. 2006 as holding a similar position for some areas of learning). Rescorla (1988) and Heyes
(2012) claim to be associationists who are pro step-wise, one-shot learning (though Rescorla sees his project as
a continuation of the classical conditioning program, others see his data as grist for the anti-associationist, pro-computationalist mill, see Gallistel and King 2009). Lastly, Tenenbaum and his contemporary Bayesian
colleagues sometimes qualify as holding a domain-general learning position without it being associationist.25
8. Associationism in Social Psychology
Since the cognitive revolution, associationism’s influence has died out quite a bit in cognitive psychology and
psycholinguistics. This is not to say that all aspects of associative theorizing are dead in these areas; rather, they
have just taken on much smaller roles (for example, it has often been suggested that mental lexicons are
structured, in part, associatively, which is why lexical decision tasks are taken to be facilitation maps of one’s
lexicon). In other areas of cognitive psychology (for example, the study of causal cognition), associationism is
no longer the dominant theoretical paradigm, but is still very much alive as a theoretical option (see Shanks
2010 for an overview of associationism in causal cognition). Associationism is also still thriving in the
connectionist literature, as well as in the animal cognition tradition.
25 There are no domain-specific associationists because associative learning is incompatible with domain
specificity. Domain specificity assumes different mental processes for different domains, and associative
learning presupposes the same learning mechanism regardless of domain.
But the biggest contemporary stronghold of associationist theorizing resides in social psychology, an area
which has traditionally been hostile to associationism (see, e.g., Asch 1962, 1969). The ascendance of
associationism in social psychology has been a fairly recent development, and has caused a revival of
associationist theories in philosophy and cognitive science. The two areas of social psychology that have seen
the greatest renaissance of associationism are the implicit attitude and dual-process theory literatures.
8.1 Implicit Attitudes
Implicit attitudes are generally operationally defined as mental representations that are unreported, inaccessible
to consciousness, and detectable in paradigms such as the Implicit Association Test (Greenwald et al. 1998),
the Affect Misattribution Procedure (Payne 2009), the Sorting Paired Features Task (Bar-Anan et al. 2009) and the
Go/No-Go Association Task (Nosek and Banaji 2001). The default position among social psychologists is to
treat implicit attitudes as if they are associations among mental representations (Fazio 2007), or among pairs of
mental representations and valences. In particular, they treat implicit attitudes as associative structures which
enter into associative transitions. Recently this issue has come under much debate (see De Houwer 2014,
Mandelbaum forthcoming; see also the Implicit Attitudes entry).
8.2 Dual Process Theories
Associative structures and transitions are widely implicated in a particular type of influential dual-process
theory. Though there are many dual-process theories in social psychology (see, e.g., the papers in Chaiken and
Trope, 1999, or the discussion in Evans and Stanovich 2013), the one most germane to associationism is also
the most popular. It originates from work in the psychology of reasoning and is often also invoked in the
heuristics and biases tradition (see, e.g., Kahneman 2011). It has been developed by many different
psychological theorists (Sloman 1996, Smith and DeCoster 2000, Wilson et al. 2001, Evans and Stanovich 2013)
and, in parts, taken up by philosophers too (see, e.g., Gendler 2008, Frankish 2009, see also some of the essays
in Evans and Frankish 2009).
The dual-process strain most relevant to the current discussion posits two systems, one evolutionarily ancient
intuitive system underlying unconscious, automatic, fast, parallel and associative processing, the other an
evolutionarily recent reflective system characterized by conscious, controlled, slow, ‘rule-governed’ serial
processes (see, e.g., Evans and Frankish 2013). The ancient system, sometimes called ‘System 1’, is often
understood to include a collection of autonomous, distinct subsystems, each of which is recruited to deal with
distinct types of problems (see Stanovich 2011 for a discussion of ‘TASS—the autonomous set of systems’).
Although theories differ on how System 1 interacts with System 2,26 the theoretical core of System 1 is the claim
that its processing is essentially associative. As in the implicit attitude debate, dual systems models have recently
come under fire (see Kruglanski 2013, Osman 2013, Mandelbaum forthcoming), though they remain very
popular.
9. Criticisms of Associationism
Associationism has been a dominant theme in mental theorizing for centuries. As such, it has garnered an
appreciable amount of criticism.
26 For example, in a ‘default-interventionist’ model System 2 processes are not always engaged though they are
in ‘parallel competitive’ models (both models include the constant automatic engagement of System 1). See
Evans and Stanovich 2013 for discussion.
9.1 Learning Curves
The basic associative learning theories imply, either explicitly or implicitly, slow, gradual learning of associations
(Baeyens et al. 1995). The learning process can be summarized in a learning curve which plots the frequency (or
magnitude) of the conditioned response as a function of the number of reinforcements (Gallistel 2004, p.
13124). Mappings between CRs and USs are gradually built up over numerous trials (in the lab) or experiences
(in the world). Gradual, slow learning has come under fire from a variety of areas (see the Garcia effect and
language learning sections). However, here we just focus on the behavioral data. In a series of works reanalyzing animal behavior, Gallistel (2004, Gallistel and King 2009) has argued that although group-level learning
curves do display the properties of being negatively accelerated and gradually developing, these curves are
misleading because no individual’s learning curve has these properties. Gallistel has argued that learning for
individuals is generally step-like, rapid, and abrupt. An individual’s transition from a low level of responding to
asymptotic responding is very quick. Sometimes, the learning is so quick that it is literally one-shot learning. For
example, after analyzing multiple experiments of animal learning of spatial location Gallistel writes “the
learning of a spatial location generally requires but a single experience. Several trials may, however, be required
to convince the subject that the location is predictable from trial to trial” (Gallistel 2004, p. 13130).
Gallistel argues that the group learning curves look smooth and gradual because there are large
individual differences between subjects in the onset latency of their step-wise curves (ibid,
p. 13125); in other words, different animals take different amounts of time for the learning to commence. The
differences between individual subjects’ learning curves lie in when the steps begin, not in the
speed of the individual animal’s learning process. All individuals appear to show rapid rises in learning, but
since each begins its learning at a different time, averaging over the group makes the rapid step-wise learning
look like slow, gradual learning (Gallistel 2004, p. 13124).
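The averaging argument is easy to reproduce with made-up numbers. The sketch below is not Gallistel's data or analysis: the onset trials, the number of subjects, and the two response levels are all hypothetical, chosen only to show how abrupt individual curves with different onsets yield a gradual group curve.

```python
def step_curve(onset, n_trials, low=0.0, high=1.0):
    """An individual that responds at `low` until trial `onset`, then jumps to `high`."""
    return [high if trial >= onset else low for trial in range(n_trials)]


def group_average(curves):
    """Average the response level across individuals, trial by trial."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


N_TRIALS = 20
# Hypothetical onset trials for ten subjects; only the spread matters here.
ONSETS = [2, 4, 5, 7, 8, 10, 11, 13, 15, 18]

individuals = [step_curve(onset, N_TRIALS) for onset in ONSETS]
group = group_average(individuals)
```

Every individual curve takes only two values and changes in a single trial, yet the group curve climbs through many intermediate values in small increments. A reader looking only at `group` would have no way to tell that no individual learned gradually.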
9.2 The Problem of Predication
The problem of predication is, at its core, a problem of how an associative mechanism can result in the
acquisition of subject/predicate structures, structures which many theorists believe appear in language, thought,
and judgment. The first major discussion of the problem appears in Kant (1781/1787), but variants of the basic
Kantian criticism can be seen across the contemporary literature (see, e.g., Chomsky 1959, Fodor and Pylyshyn
1988, Fodor 2003, Mandelbaum 2013; for the details of the Kantian argument see the entry on Kant’s
Transcendental Argument).
For a pure associationist, association is ‘semantically transparent’ (see Fodor 2003), in that it purports to add no
additional structure to thoughts. When a simple concept X and a simple concept Y become associated, one
acquires the associative structure X/Y. But X/Y has no additional structure on top of the contents of X and Y. Knowing
that X and Y are associated amounts to knowing a causal fact: that activating Xs will bring about the activation
of Ys and vice versa. However, so the argument goes, some of our thoughts appear to have more structure than
this: the thought BIRDS FLY predicates the property of flying onto birds. The task for the associationist is to
explain how associative structures can distinguish a thinker who has a single (complex) thought BIRDS FLY from
a thinker who conjoins two simple thoughts in an associative structure where one thought, BIRDS, is
immediately followed by another, FLY. As long as the two simple thoughts are reliably causally correlated so
that, for a thinker, activations of BIRDS regularly bring about FLY, then that thinker has the associative
structure BIRDS/FLY. Yet it appears that thinker hasn’t yet had the thought BIRDS FLY. The problem of
predication is explaining how a purely associative mechanism could eventuate in complex thoughts. In Fodor’s
terms the problem boils down to how association, a causal relation among mental representations, can effect
predication, a relation among intentional contents (Fodor 2003).
A family of related objections to associationism can be interpreted as variations on this theme. For example,
problems of productivity, compositionality, and systematicity for associationist theorizing appear to be variants
of the problem of predication (for more on these specific issues see the Language of Thought Hypothesis
entry and the Compositionality entry). If association doesn’t add any additional structure to the mental
representations that get associated, then it is hard to see how it can explain the compositionality of thought,
which relies on structures that specify relations among intentional contents. Compositionality requires that the
meaning of a complex thought is determined by the meanings of its simple constituents along with their
syntactic arrangements. The challenge to associationism is to explain how an associative mechanism can give
rise to the syntactic structures necessary to distinguish a complex thought like BIRDS FLY from the temporal
succession of two simple thoughts BIRDS and FLY. Since the compositionality of thought is posited to undergird
the productivity of thought (thinkers’ abilities to think novel thoughts of arbitrary length, e.g., GREEN BIRDS
FLY, GIANT GREEN BIRDS FLY, CUDDLY GIANT GREEN BIRDS FLY, etc.), associationism has problems explaining
productivity.
Systematicity is the thesis that there are predictable patterns among which thoughts a thinker is capable of
entertaining. Thinkers that can entertain thoughts of certain structures can always entertain distinct thoughts
that have related structure. For instance, any thinker who can think a complex thought of the form ‘X transitive
verb Y’ can think ‘Y transitive verb X.’ Systematicity entails that we won’t find any (human) thinker that can
only think one of those two thoughts, in which case we could not find a person who could think AUDREY
WRONGED MAX, but not MAX WRONGED AUDREY. Of course, these two thoughts have very different effects in
one’s cognitive economy. The challenge for the associationist is to explain how the associative structure
AUDREY/WRONGED/MAX can be distinguished from the structure MAX/WRONGED/AUDREY, while capturing
the differences in those thoughts’ effects.
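The shape of the challenge can be shown in a few lines. The sketch makes a deliberately strong assumption on the critic's behalf, namely that an associative structure records nothing beyond which representations are bound together, and it models that with an unordered set. The function names are hypothetical, and an associationist is free to reject the modeling choice, for instance by appealing to ordered chains of activation.

```python
def associative_structure(*concepts):
    """Bare association: records which concepts are linked, and nothing else."""
    return frozenset(concepts)


def structured_thought(subject, verb, obj):
    """A thought with role structure: who did what to whom is part of its identity."""
    return (subject, verb, obj)


same_association = (
    associative_structure("AUDREY", "WRONGED", "MAX")
    == associative_structure("MAX", "WRONGED", "AUDREY")
)

same_thought = (
    structured_thought("AUDREY", "WRONGED", "MAX")
    == structured_thought("MAX", "WRONGED", "AUDREY")
)
```

On this encoding the two associative structures are identical while the two structured thoughts are distinct. The critic's demand is for an account, stated in purely associative terms, of where the extra information in the second encoding comes from.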
Associationists have had different responses to the problem. Some have denied that human thought is actually
compositional, productive, and systematic, and some non-associationists have agreed with this critique. For
example, Prinz and Clark claim “concepts do not compose most of the time” (2002, 62), and Johnson (2004)
argues that the systematicity criterion is wrongheaded (see Aydede 1997 for extended discussion of these
issues). Rumelhart et al. offer a connectionist interpretation of ‘schemata’, one which is intended to cover some
of the phenomena mentioned in this section (Rumelhart et al. 1986). Others have worked to show that
classical conditioning can indeed give rise to complex associative structures (Rescorla 1988). In defense of the
associationist construal of complex associations Rescorla writes, “Clearly, the animals had not simply coded the
RH [complex] compound in terms of parallel associations with its elements. Rather they had engaged in some
more hierarchical structuring of the situation, forming a representation of the compound and using it as an
associate” (Rescorla 1988, p. 156). Whether or not associationism has the theoretical tools to explain such
complex compounds by itself is still debated (see, e.g., Fodor 2003, Mitchell et al. 2009, Gallistel and King 2009).
9.3 Word Learning
Multiple issues in the acquisition of the lexicon appear to cause problems for associationism. Some of the most
well known examples are reviewed below (for further discussion of word learning and associationism see
Bloom 2000).
9.3.1 Fast Mapping
Children learn words at an incredible rate, acquiring around 6,000 words by age 6 (Carey 2010, p. 184). If
gradual learning is the rule, then words too should be learned gradually across this time. However, this does not
appear to be the case. In a series of studies, Carey discovered the phenomenon of ‘fast mapping’, which is one-shot learning of a word (Carey 1978a, 1978b, Carey and Bartlett 1978). Her most influential example
investigated children’s acquisition of ‘chromium’ (a color word referring to olive green). Children were shown
one of two otherwise identical objects, which differed only in color, and asked, “Can you get me the chromium
tray, not the red one, the chromium one” (recited in Carey 2010, p. 2). All of the children handed over the
correct tray at that time. When the children were later tested in differing contexts, more than half remembered
the referent of ‘chromium.’ These findings have been extended—for example, Markson and Bloom (1997)
showed that they are not specific to the remembering of novel words, but also hold for novel facts.
Fast mapping poses two problems for associationism. The first is that the learning of a new word did not
develop slowly, as would be predicted by proponents of gradual learning. The second is that in order for the
word learning to proceed, the mind must have been aided by additional principles not given by the
environment. Some of these principles, such as Markman’s (1989) taxonomic, whole object, and mutual
exclusivity constraints and Gleitman’s syntactic bootstrapping (Gleitman et al. 2005), imply that the mind does
add structure to what is learned. Consequently, the associationist claim that learning is just mapping external
contingencies without adding structure is imperiled.
9.3.2 Syntactic Category Learning
‘Motherese’, the name of the type of language that infants generally hear, consists of simple sentences such as
‘Nora want a bottle?’ and ‘Are you tired?’. These sentences almost always contain a noun and a verb. Yet, the
infant’s vocabulary massively over-represents nouns in the first 100 words or so, while massively
under-representing the verbs (never mind adjectives or adverbs, which almost never appear in the first 100 words
infants produce; see, e.g., Goldin-Meadow, Seligman, and Gelman, 1976; Bates, Dale, and Thal, 1995). Even
more surprising is that the over-representation of nouns to verbs holds even though “the incidence of each
word (that is, the token frequency) is higher for the verbs than for the nouns in the common set used by
mothers” (Snedeker and Gleitman 2004, p. 259, citing data from Sandhoffer, Smith, and Luo 2000). Moreover,
children hear a preponderance of determiners (‘the’ and ‘a’) but don’t produce them (Bloom 2000). These facts
are not specific to English, but hold cross-culturally (see, e.g., Caselli et al. 1995). The disparity between the
variation of the syntactic categories infants receive as input and produce as output is troublesome to
associationism, insofar as associationism is committed to the learned structures (and the behaviors that follow
from them) merely patterning what is given in experience.
9.4 Against the Contiguity Analysis of Associationism
Contiguity has been a central part of associationist analyses since the British Empiricists. In the experimental
literature, the problem of figuring out the parameters needed for contiguity has sometimes been termed the
problem of the ‘Window of Association’ (e.g., Gallistel and King 2009). The crux of the problem is that if
contiguity is to be a founding pillar of associationism, then the window needs to be relatively short. Thus the
need to specify the temporal properties of the window is a desideratum for any empirically adequate
associationist theory that involves contiguity. 27 A related problem for contiguity theorists is that if the domain
generality of associative learning is desired, then the window needs to be homogeneous across content domains.
The late 1960s saw persuasive attacks on domain generality, as well as the necessity and sufficiency of the
contiguity criterion in general.
9.4.1 Against the Necessity of Contiguity
Research on ‘taste aversions’ and ‘bait-shyness’ provided a variety of problems with contiguity in the associative
learning tradition of classical conditioning. Garcia observed that a gustatory stimulus (e.g., drinking water or
eating a hot dog) but not an audiovisual stimulus (a light and a sound) would naturally become associated with
feeling nauseated. For instance, Garcia and Koelling (1966) paired an audiovisual stimulus, a light and a sound,
with a gustatory stimulus, flavored water. The two stimuli were then paired with the rats receiving radiation,
which made the rats feel nauseated. The rats associated the feeling of nausea with the water and not with the
sound, even though the sound was contiguous with the water. Moreover, the delay between ingesting the
gustatory stimulus and feeling nauseated could be quite long, with the feeling not coming on until 12 hours
later (Roll and Smith 1972), and the organism needn’t even be conscious when the negative feeling arises (for
a review, see Seligman 1970, Garcia et al. 1974). The temporal delay shows that the CS (the flavored water)
needn’t be contiguous with the US (the feeling of nausea) in order for learning to occur, thus showing that
contiguity isn’t necessary for associative learning.
Garcia’s work also laid bare the problems with the domain general aspect of associationism. In the above study
the rat was prepared to associate the nausea with the gustatory stimulus, but would not associate it with the
audiovisual stimulus. However, if one changes the US from feeling nauseated to receiving shocks in perfect
contiguity with the audiovisual and gustatory stimuli, then the rats will associate the shocks with the audiovisual
stimulus but not with the gustatory stimulus. That is, rats are prepared to associate audiovisual stimuli with the
shock but contraprepared to associate the shocks with the gustatory stimulus. Thus, learning does not seem to
be entirely domain general (for similar content specificity effects in humans, see Baeyens et al. 1990). 28
27 Gallistel and King (2009, 239) argue that there is no such window. Instead they argue that what matters for
learning in place of contiguity is a ratio of the time between the presentation of the CS and the appearance of
the US as compared to the time between different US presentations (in a given context). For example, speeding
up the CS/US connection by a factor of two reduces the number of US presentations one needs by half.
Lastly, the ‘Garcia effect’ has also been used to show problems in the learning curve (see section 9.1). ‘Taste
aversions’ are the phenomena whereby an organism gets sick from ingesting the stimulus and the taste (or
odor, Garcia et al. 1974) of that stimulus gets associated with the feeling of sickness. As anyone who has had
food poisoning can attest, this learning can proceed in a one-shot fashion, and needn’t have a gradual rise over
many trials (taste aversions have also been observed in humans, see, e.g., Bernstein and Webster 1980,
Bernstein 1985, Logue et al. 1981, Rozin 1986).
9.4.2 Against the Sufficiency of Contiguity
Kamin’s famous blocking experiments (1969) showed that not all contiguous structures lead to classical
conditioning. A rat that has already learned that CS1 predicts a US will not learn that a subsequent CS2
predicts the US, if the CS2 is always paired with the CS1. Suppose that a rat has learned that a light predicts a
shock because of the constant contiguity of the light and shock. After learning this, the rat has a sound
introduced which only arises in conjunction with the light and the shock. As long as the rat had previously
learned that the light predicts the shock, it will not learn that the sound does (as can be seen on later trials that
have the sound alone). In sum, having learned that the CS1 predicts the US blocks the organism from learning
that the CS2 predicts the US.29 So even though CS2 is perfectly contiguous with the US, the association
between CS2 and the US remains unlearned, thus serving as a counterexample to sufficiency of contiguity. 30
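The blocking design can be made concrete with the error-correction rule of Rescorla and Wagner (1972), in which each cue present on a trial changes its associative strength in proportion to the difference between the maximum strength the US supports and the summed strength of all cues present. The following Python sketch is only an illustration under stated assumptions: the function name, the single learning-rate parameter alpha (standing in for the product of the model's two salience parameters), the asymptote lam = 1.0, and the trial counts are illustrative choices, not values taken from the studies discussed here.

```python
def rescorla_wagner(phases, alpha=0.3, lam=1.0):
    """Apply the Rescorla-Wagner update over a list of (cues, n_trials) phases.

    Assumption for this sketch: every trial ends with the US, so the
    target (asymptote) on each trial is lam.
    """
    strength = {}
    for cues, n_trials in phases:
        for cue in cues:
            strength.setdefault(cue, 0.0)
        for _ in range(n_trials):
            # Prediction error: what the US supports minus what is already predicted.
            error = lam - sum(strength[cue] for cue in cues)
            for cue in cues:
                strength[cue] += alpha * error
    return strength


# Blocking group: light alone is trained first, then light + sound together.
blocked = rescorla_wagner([(("light",), 50), (("light", "sound"), 50)])

# Control group (hypothetical comparison): light + sound together from the start.
control = rescorla_wagner([(("light", "sound"), 50)])

print("sound after pretraining on light:", round(blocked["sound"], 4))
print("sound without pretraining:      ", round(control["sound"], 4))
```

With these assumed values, the sound ends up with almost no associative strength in the blocking group, because the light already predicts the US and leaves almost no prediction error, whereas in the control group the sound shares the available strength with the light.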
Similarly, Rescorla (1968) demonstrated that a CS can appear only when the US appears and yet have the
association between them be unlearnable. If a tone is arranged to sound only when there are shocks, but there
are still shocks when there are no tones (that is, the CS only appears with the US, but the US sometimes
appears without the CS), no associative learning between the CS and the US will occur. Instead, subjects (in
Rescorla 1968, rats) will only learn a connection between the shock and the experimental situation—e.g., the
room in which the experiment is carried out.
In large part because of the problems discussed in 9.4, many classical conditioning theorists gave up the
traditional program. Some, like Garcia, appeared to give up the classical theoretical framework altogether
(Garcia et al. 1974); others, such as Rescorla and Wagner, tried to usher the framework into the modern era
(see Rescorla and Wagner 1972, Rescorla 1988), where conditioning is seen as sensitive to base rates and
driven by informational pick-up.31 Whether this movement is interpreted as a substantive revision of classical
conditioning (Rescorla 1988, Heyes 2012) or a wholesale abandoning of it (Gallistel and King 2009) is
debatable.
28 It appears that content specificity of associations needn’t just be based on innate dispositions. For example,
in an evaluative conditioning paradigm using odors as USs and faces as CSs, the evaluative conditioning only
commenced when the odors were interpreted as plausibly human (Todrank et al. 1995). But ‘plausibly human’
included learned information (such as the odors associated with soap). When the odors were typically
associated with objects and not humans, no learning transpired. Additionally, there appear to be
content-specific differences in associative learning at a greater level of abstraction: there is evidence that negative
US/CS pairings are learned more quickly, and form stronger bonds, than positive US/CS pairings (Rozin 1986,
Baeyens et al. 1990).
29 Blocking has been observed in humans (see Dickinson et al. 1984) but one needn’t delve into the empirical
literature to feel the pull of the phenomenon. Imagine you’ve eaten an orange and immediately have an allergic
reaction. If in your next meal you eat an orange and an apple and have the allergic reaction, you will be less
likely to think the apple caused the reaction than you would were you to have never experienced the allergic
reaction after eating the orange.
30 More problematically for associationists, blocking doesn’t always work, and associative theory does not
predict when it will fail. For example, if a weak odor is paired with a strong taste and the pairing is followed by
gastrointestinal distress, the taste magnifies the sensitivity of the odor as a signal (Rusiniak et al. 1979). Relatedly, if a
hawk eats a black mouse and gets sick, the hawk won’t just avoid black mice but will avoid all mice. However,
if the black mouse tastes different than a white mouse, then the hawk will continue to eat white mice even after
black mice make it sick (Brett et al. 1976).
9.5 Coextensionality
The Rescorla experiment also demonstrates another problem in associative theorizing: the question of why
some property is singled out as a CS as opposed to different, equally contemporaneously instantiated
properties. Put a different way, one needs a principle to say what the ‘same situation’ amounts to in
generalizations such as Thorndike’s laws. For instance, if a CS and a US, say a tone and a shock, are perfectly
paired so that they are either both present or both absent, the organism won’t associate the location in which it received
shocks (e.g., the experimental setting) with getting shocked; it will just associate the tone with the shocks. But
in the condition where the US occurs without the CS, but the CS does not occur without the US, the organism
will gain an association between the shocks and the location. However, in both cases the location is present on
every trial. In contrast to shocks, x-ray radiation, when used as a US, never appears to become associated with
location, even if they are always perfectly paired (Garcia et al. 1972). 32
The problem of saying which properties become associated when multiple properties are coinstantiated
sometimes goes by the name the ‘Credit Assignment Problem’ (see, e.g., Gallistel and King 2009). 33 Some would
argue that this problem is a symptom of a larger issue: trying to use extensional criteria to specify intentional
content (see, e.g., Fodor 2003). Associationists need a criterion to specify which of the coextensive properties will in
fact be learned, and which not.
An additional worry stems from the observation that sometimes the lack of a property being instantiated is an
integral component of what is learned. To deal with the problem of missing properties, contemporary
associationists have introduced an important element to the theory: inhibition. For example, if a US and a CS
only appear when the other is absent, the organism will learn a negative relationship holds between them; that
is, the organism will learn that the absence of the CS predicts the US. 34 Here the CS becomes a ‘conditioned
inhibitor’ of the US. Inhibition, using associations as modulators and not just activators, is a central part of
current associationist thinking. For example, in connectionist networks, inhibition is implemented by the
activation of certain nodes inhibiting the activation of other nodes. Connection weights can be positive or
negative, with the negative weight standing in for the inhibitory strength of the association.
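The role of negative weights can be illustrated with a toy unit that sums weighted input from two other nodes. The following Python sketch is a minimal illustration, not a model from the literature cited here: the node names, the weight values, and the clipped-linear activation function are all assumptions chosen for readability.

```python
def net_input(activations, weights):
    """Weighted sum of the incoming activations for one receiving node."""
    return sum(activations[name] * weights[name] for name in weights)


def clipped_activation(net):
    """Assumed activation function: keep the node's activity within [0, 1]."""
    return max(0.0, min(1.0, net))


# Hypothetical connection weights onto a single "US" node:
# a positive weight excites it, a negative weight inhibits it.
weights = {"excitor": 0.8, "inhibitor": -0.6}

# Only the excitatory node is active.
excitor_only = clipped_activation(
    net_input({"excitor": 1.0, "inhibitor": 0.0}, weights)
)

# Both nodes are active, so the inhibitory weight subtracts from the net input.
with_inhibitor = clipped_activation(
    net_input({"excitor": 1.0, "inhibitor": 1.0}, weights)
)

print("US node, excitor only:       ", round(excitor_only, 2))
print("US node, excitor + inhibitor:", round(with_inhibitor, 2))
```

Under these assumed weights, activating the inhibitory node lowers the receiving node's activity without removing the excitatory connection, which is the sense in which an association can act as a modulator and not just an activator.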
31 Oddly enough, evaluative conditioning does not seem as sensitive to base rates or as susceptible to ‘occasion
setting’ as classical conditioning is (see De Houwer et al. 2001).
32 The more one looks into how locational properties become associated, the more problems seem to mount.
For example, if a rat has a strong preference for a particular drink but gets shocked while ingesting that drink,
the rat will not change its preference for the flavor. Instead, the rat will just learn to avoid the drink when it
encounters it in the experimental location. But when the rat is given a chance to ingest the drink anywhere else
(e.g., back in its home cage) it will still continue to ingest the drink. Furthermore, in the case where the rat gets
shocked while drinking the highly desirable flavor in the Skinner box on trial N, the rat will increase how much
of the drink it will intake on trial N+1. This is a reasonable strategy: assuming that one knows they are going to
get shocked, they might as well intake as much as possible while getting shocked. For more on these points,
see Garcia et al. (1970).
33 In other versions of the problem it is understood as the problem the organism faces in trying to figure out
which of its behaviors produced the environmental change that interests the organism. It also appears in
problems in Artificial Intelligence (see Minsky 1963).
34 For a pure associationist, one would phrase this as the organism learning to associate the lack of CS with the
US. How the pure associationist analyzes the absence of a CS while using only associative structures can also be
a difficult issue.
Bibliography




























Anderson, J., Spoehr, K. and Bennett, D., 1994, “A Study in Numerical Perversity: Teaching
Arithmetic to a Neural Network,” in Neural Networks for Knowledge Representation and Inference, D. Levine
and M. Aparicio IV (eds.), East Sussex: Psychology Press, pp. 311-335.
Armstrong, K., Kose, S., Williams, L., Woolard, A., and Heckers, S., 2012, “Impaired Associative
Inference in Patients with Schizophrenia,” Schizophrenia Bulletin, 38(3): 622-629.
Asch, S., 1962, “A Problem in the Theory of Associations,” Psychologische Beiträge, 6: 553–563.
–––, 1969, “A Reformulation of the Problem of Association,” American Psychologist, 24(2): 92–102.
Aydede, M., 1997, “Language of Thought: The Connectionist Contribution,” Minds and Machines, 7(1):
57-101.
Baeyens, F., Eelen, P., Van den Bergh, O., and Crombez, G., 1990, “Flavor-Flavor and Color-Flavor
Conditioning in Humans,” Learning and Motivation, 21 (4): 434-455.
Baeyens, F., Eelen, P., and Crombez, G., 1995, “Pavlovian Associations are Forever: On Classical
Conditioning and Extinction,” Journal of Psychophysiology, 9(2): 127–141.
Bar-Anan Y., Nosek, B., and Vianello, M., 2009, “The Sorting Paired Features Task: A Measure of
Association Strengths,” Experimental Psychology, 56(5): 329-343.
Bates, E., and MacWhinney, B., 1987, “Competition, Variation, and Language Learning,” in B.
MacWhinney (Ed.), Mechanisms of Language Acquisition, Hillsdale, N.J.: Lawrence Erlbaum Associates,
pp. 157-193.
Bernstein, I., and Webster, M., 1980, “Learned Taste Aversions in Humans,” Physiology and Behavior,
25(3): 363–366.
Bernstein, I., 1985, “Learned Food Aversions in the Progression of Cancer and its Treatment,” in N.
Braveman and P. Bronstein, (eds.), Experimental Assessments and Clinical Applications of Conditioned Food
Aversions, New York: New York Academy of Sciences, pp. 365–80.
Bloom, P., 2000, How Children Learn the Meanings of Words, Cambridge: MIT Press.
Bouton, M., 2002, “Context, Ambiguity, and Unlearning: Sources of Relapse after Behavioral
Extinction,” Biological Psychiatry, 52(10): 976-986.
Brett, L., Hankins, W., and Garcia, J., 1976, “Prey-Lithium Aversions. III: Buteo hawks,” Behavioral
Biology, 17(1): 87-98.
Carey, S., 1978a, “Less May Never Mean More,” in R. Campbell and P. Smith (eds.), Recent Advances in the
Psychology of Language, New York: Plenum Press, pp. 109-132.
–––, 1978b, “The Child as Word Learner,” in J. Bresnan, G. Miller, and M. Halle (eds.), Linguistic Theory
and Psychological Reality, Cambridge: MIT Press, pp. 264-293.
–––, 2010, “Beyond Fast Mapping,” Language Learning and Development, 6(3): 184-205.
Carey, S., and Bartlett, E., 1978, “Acquiring a Single New Word,” Proceedings of the Stanford Child
Language Conference, 15: 17–29.
Caselli, M. C., Bates, E., Casadio, P., Fenson, J., Fenson, L., Sanderl, L., and Weir, J., 1995, “A
Cross-linguistic Study of Early Lexical Development,” Cognitive Development, 10 (2): 159-199.
Chaiken, S., and Trope, Y., (eds.), 1999, Dual-Process Theories in Social Psychology, New York: Guilford
Press.
Chalmers, D., 1993, “Connectionism and Compositionality: Why Fodor and Pylyshyn Were Wrong,”
Philosophical Psychology 6(3): 305-319.
Chater, N., Tenenbaum, J., and Yuille, A., 2006, “Probabilistic Models of Cognition: Conceptual
Foundations,” Trends in Cognitive Sciences, 10 (7): 287-291.
Chater, N., 2009, “Rational Models of Conditioning,” Behavioral and Brain Sciences, 32 (2): 204-205.
Churchland, P., 1989, A Neurocomputational Perspective: The Nature of Mind and the Structure of Science,
Cambridge: MIT.
Churchland, P., and Sejnowski, T., 1990, “Neural Representation and Neural Computation,” Philosophical
Perspectives, 4: 343-382.
Churchland, P., 1986, “Some Reductive Strategies in Cognitive Neurobiology,” Mind, 95 (379): 279–
309.
Chomsky, N., 1959, “A Review of B.F. Skinner’s Verbal Behavior,” Language, 35(1): 26-58.
Collins, A., and Loftus, E., 1975, “A Spreading-Activation Theory of Semantic
Processing,” Psychological Review, 82 (6): 407-428.

























De Houwer, J., Thomas, S., and Baeyens, F., 2001, “Association Learning of Likes and Dislikes: A
Review of 25 years of Research on Human Evaluative Conditioning,” Psychological Bulletin, 127(6): 853-869.
De Houwer, J., 2009, “The Propositional Approach to Associative Learning as an Alternative for
Association Formation Models,” Learning & Behavior, 37(1): 1-20.
–––, 2011, “Evaluative Conditioning: A Review of Procedure Knowledge and Mental Process
Theories,” in T. Schachtman and S. Reilly (eds.), Associative Learning and Conditioning Theory: Human and
Non-Human Applications, New York: Oxford University Press, pp. 399-416.
–––, 2014, “A Propositional Model of Implicit Evaluation,” Social and Personality Psychology Compass, 8 (7): 342-353.
Diaz, E., Ruis, G., and Baeyens, F., 2005, “Resistance to Extinction of Human Evaluative
Conditioning Using a Between-Subjects Design,” Cognition and Emotion, 19 (2): 245-268.
Dickinson, A., Shanks, D., and Evenden, J., 1984, “Judgment of Act-Outcome Contingency: The Role
of Selective Attribution,” The Quarterly Journal of Experimental Psychology, 36(1): 29-50.
Dirikx, T., Hermans, D., Vansteenwegen, D., Baeyens, F., and Eelen, P., 2004, “Reinstatement of
Extinguished Conditioned Responses and Negative Stimulus Valence as a Pathway to Return of Fear
in Humans,” Learning and Memory, 11: 549-54.
Elman, J., 1991, “Distributed Representations, Simple Recurrent Networks, and Grammatical
Structure,” Machine Learning, 7(2-3): 195-225.
Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., Plunkett, K., 1996, Rethinking
Innateness: A Connectionist Perspective on Development, Cambridge, MA: MIT Press.
Evans, G., 1982, The Varieties of Reference, J. McDowell (ed.), Oxford: Clarendon Press.
Evans, J., and Frankish, K., (eds.), 2009, In Two Minds: Dual Processes and Beyond, Oxford: Oxford
University Press.
Evans, J., and Stanovich, K., 2013, “Dual-Process Theories of Higher Cognition: Advancing the
Debate,” Perspectives on Psychological Science, 8(3): 223-241.
Fazio, R., 2007, “Attitudes as Object-Evaluation Associations of Varying Strength,” Social
Cognition, 25(5): 603-637.
Festinger, L., and Carlsmith, J., 1959, “Cognitive Consequences of Forced Compliance,” The Journal of
Abnormal and Social Psychology, 58(2): 203-210.
Field, A., and Davey, G., 1999, “Reevaluating Evaluative Conditioning: A Nonassociative Explanation
of Conditioning Effects in the Visual Evaluative Conditioning Paradigm,” Journal of Experimental
Psychology: Animal Behavior Processes, 25(2): 211-224.
Fodor, J., and Pylyshyn, Z., 1988, “Connectionism and Cognitive Architecture: A Critical Analysis,”
Cognition, 28 (1-2): 3-71.
Fodor, J., 1983, The Modularity of Mind. Cambridge: MIT Press.
–––, 2003, Hume Variations. Oxford: Clarendon Press.
–––, and McLaughlin, B., 1990, “Connectionism and the Problem of Systematicity: Why Smolensky’s
Solution Doesn't Work,” Cognition, 35(2): 183-204.
Frankish, K., 2009, “Systems and Levels: Dual-System Theories and the Personal-Subpersonal
Distinction,” in J. Evans and K. Frankish (eds.), In Two Minds: Dual Processes and Beyond, Oxford:
Oxford University Press, pp. 89–107.
Gallistel, C., Fairhurst, S., and Balsam, P., 2004, “The Learning Curve: Implications of a Quantitative
Analysis,” Proceedings of the National Academy of Sciences of the United States of America, 101(36): 13124-13131.
Gallistel, C., and King, A., 2009, Memory and the Computational Brain: Why Cognitive Science Will Transform
Neuroscience, West Sussex: Wiley Blackwell.
Garcia, J., 1981, “Tilting at the Paper Mills of Academe,” American Psychologist, 36(2): 149-158.
Garcia, J., Kovner, R., and Green, K., 1970, “Cue Properties vs Palatability of Flavors in Avoidance
Learning,” Psychonomic Science, 20(5): 313-314.
Garcia, J., McGowan, B., and Green, K., 1972, “Biological Constraints on Conditioning II,” in W.
Black, and W. Prokasy (eds.), Classical Conditioning II: Current Research and Theory, New York:
Appleton-Century-Crofts, pp. 3-27.





























Garcia, J., Hankins, W., and Rusiniak, K., 1974, “Behavioral Regulation of the Milieu Interne in Man
and Rat,” Science, 185(4154): 824-831.
Gendler, T., 2008, “Alief and Belief,” Journal of Philosophy 105 (10): 634–63.
Gleitman, L., Cassidy, K., Nappa, R., Papafragou, A., and Trueswell, J., 2005, “Hard Words,” Language
Learning and Development, 1(1): 23–64.
Glosser, G. and Friedman, R., 1991, “Lexical but not Semantic Priming in Alzheimer’s Disease,”
Psychology and Aging, 6 (4): 522-27.
Goldin-Meadow, S., Seligman, M., and Gelman, S., 1976, “Language in the Two-Year Old,” Cognition
4(2): 189-202.
Greenwald, A., McGhee, D., and Schwartz, J., 1998, “Measuring Individual Differences in Implicit
Cognition: The Implicit Association Test,” Journal of Personality and Social Psychology, 74(6): 1464–1480.
Heyes, C., 2012, “Simple Minds: A Qualified Defence of Associative Learning,” Philosophical
Transactions of the Royal Society B: Biological Sciences, 367(1603): 2695-2703.
Hull, C., 1943, Principles of Behavior, New York: Appleton-Century-Crofts.
Hume, D., 1738, A Treatise of Human Nature, L. A. Selby-Bigge (ed.), 2nd ed. revised by P. H. Nidditch,
Oxford: Clarendon Press, 1975.
James, W., 1890, The Principles of Psychology (Vol. 1). New York: Holt.
Johnson, K., 2004, “On the Systematicity of Language and Thought,” Journal of Philosophy, 101 (3):
111–139.
Kahneman, D., 2011, Thinking, Fast and Slow, New York: Farrar, Straus and Giroux.
Kamin, L., 1969, “Predictability, Surprise, Attention, and Conditioning,” in B. Campbell and R.
Church (eds.), Punishment and Aversive Behavior, New York: Appleton-Century-Crofts, pp. 279-296.
Kant, I. 1781/1787, Critique of Pure Reason, in P. Guyer and A. Wood (eds.) Critique of Pure Reason, New
York: Cambridge University Press.
Karmiloff-Smith, A., 1995, Beyond Modularity: A Developmental Perspective on Cognitive Science, Cambridge:
MIT Press/Bradford Books.
Kruglanski, A., 2013, “Only One? The Default Interventionist Perspective as a Unimodel—
Commentary on Evans & Stanovich,” Perspectives on Psychological Science, 8(3): 242-247.
Locke, J., 1690, An Essay Concerning Human Understanding, in Peter H. Nidditch (ed.) An Essay
Concerning Human Understanding, Oxford: Clarendon Press, 1975.
Logue, A., Ophir, I., and Strauss, K., 1981, “The Acquisition of Taste Aversion in Humans,”
Behavioral Research and Therapy, 19 (4): 319-33.
Mandelbaum, E., 2013, “Against Alief,” Philosophical Studies, 165 (1): 197-211.
–––, Forthcoming, “Attitude, Inference, Association: On the Propositional Structure of Implicit
Attitudes,” Nous.
Markman, E., 1989, Categorization and Naming in Children: Problems of Induction, Cambridge: MIT Press.
Markson, L., and Bloom, P., 1997, “Evidence Against a Dedicated System for Word Learning in
Children,” Nature, 385 (6619): 813-815.
Marr, D., 1982, Vision: A Computational Investigation into the Human Representation and Processing of Visual
Information, NY: W.H. Freeman and Co.
Mason, M., and Bar, M., 2012, “The Effect of Mental Progression on Mood,” Journal of Experimental
Psychology: General, 141(2): 217-221.
McClelland, J., Botvinick, M., Noelle, D., Plaut, D., Rogers, T., Seidenberg, M., and Smith, L., 2010,
“Letting Structure Emerge: Connectionist and Dynamic Systems Approaches to Cognition,” Trends in
Cognitive Sciences, 14 (8): 348–356.
Minsky, M., 1963, “Steps toward Artificial Intelligence,” in E. Feigenbaum and J. Feldman (eds.),
Computers and Thought, New York, NY: McGraw-Hill, pp. 406-450.
Mitchell, C., De Houwer, J., and Lovibond, P., 2009, “The Propositional Nature of Human
Associative Learning,” Behavioral and Brain Sciences 32(2): 183-246.
Nosek, B., and Banaji, M., 2001, “The Go/No-Go Association Task,” Social Cognition, 19 (6): 625-66.
Osman, M., 2013, “A Case Study: Dual-Process Theories of Higher Cognition—Commentary on
Evans & Stanovich,” Perspectives on Psychological Science, 8(3): 248-252.




























Pavlov, I., 1906, “The Scientific Investigation of the Psychical Faculties or Processes in the Higher
Animals,” Science, 24 (620): 613-619.
Payne, B., 2009, “Attitude Misattribution: Implications for Attitude Measurement and the Implicit-Explicit
Relationship,” in R. Petty, R. Fazio, and P. Briñol (eds.), Attitudes: Insights from the New Wave of
Implicit Measures, Hillsdale, NJ: Erlbaum, pp. 459-484.
Perea, M., and Rosa, E., 2002, “The Effects of Associative and Semantic Priming in the Lexical
Decision Task,” Psychological Research 66(3): 180-194.
Prinz, J., 2002, Furnishing the Mind: Concepts and their Perceptual Basis. Cambridge: MIT Press.
–––, and Clark, A., 2004, “Putting Concepts to Work: Some Thoughts for the 21st Century,” Mind &
Language, 19 (1): 57-69.
Rescorla, R., 1968, “Probability of Shock in the Presence and Absence of CS in Fear
Conditioning,” Journal of Comparative and Physiological Psychology, 66(1): 1-5.
–––, 1988, “Pavlovian Conditioning: It's Not What You Think It Is,” American Psychologist, 43(3): 151-160.
Rescorla, R., and Wagner, A., 1972, “A Theory of Pavlovian Conditioning: Variations in the
Effectiveness of Reinforcement and Nonreinforcement,” in W. Black, and W. Prokasy (eds.) Classical
Conditioning II: Current Research and Theory, New York: Appleton-Century-Crofts, pp. 64-99.
Roll, D., and Smith, J., 1972, “Conditioned Taste Aversion in Anesthetized Rats,” in M. Hager and J.
Seligman (eds.), Biological Boundaries of Learning. New York: Appleton-Century-Crofts, pp. 98-102.
Rozin, P., 1986, “One-Trial Acquired Likes and Dislikes in Humans: Disgust as a US, Food
Predominance, and Negative Learning Predominance,” Learning and Motivation, 17(2): 180-189.
Rumelhart, D., Smolensky, P., McClelland, J., and Hinton, G., 1986, “Sequential Thought Processes in
PDP Models,” in J.McClelland and D. Rumelhart (eds.), Parallel Distributed Processing Vol. 2: Explorations
in the Microstructure of Cognition: Psychological and Biological Models, Cambridge: MIT Press, pp. 7-57.
Rusiniak, K., Hankins, W., Garcia, J., and Brett, L., 1979, “Flavor-illness Aversions: Potentiation of
Odor by Taste in Rats,” Behavioral and Neural Biology, 25(1): 1-17.
Rydell, R. and McConnell, A., 2006, “Understanding Implicit and Explicit Attitude Change: A
Systems of Reasoning Analysis,” Journal of Personality and Social Psychology 91 (6): 995-1008.
Sandhoffer, C., Smith, L., and Luo, J., 2000, “Counting Nouns and Verbs in the Input: Differential
Frequencies, Different Kinds of Learning?” Journal of Child Language, 27 (3): 561-585.
Seligman, M., 1970, “On the Generality of the Laws of Learning,” Psychological Review, 77 (5): 406-418.
Shanks, D., 2010, “Learning: From Association to Cognition,” Annual Review of Psychology, 61: 273–301.
Skinner, B., 1938, The Behavior of Organisms: An Experimental Analysis. Oxford: Appleton-Century.
–––, 1953, Science and Human Behavior. New York: Simon and Schuster.
Sloman, S., 1996, “The Empirical Case for Two Systems of Reasoning,” Psychological Bulletin, 119 (1): 3-22.
Smith, E. R. & DeCoster, J., 2000, “Dual-Process Models in Social and Cognitive Psychology:
Conceptual Integration and Links to Underlying Memory Systems,” Personality and Social Psychology
Review, 4(2): 108-131.
Smith, J., and Roll, D., 1967, “Trace Conditioning with X-rays as an Aversive Stimulus,” Psychonomic
Science, 9(1): 11-12.
Smolensky, P., 1988, “On the Proper Treatment of Connectionism,” Behavioral and Brain Sciences, 11(1):
1-23.
Snedeker, J., and Gleitman, L., 2004, “Why it is Hard to Label Our Concepts,” in D. Hall and S.
Waxman (eds.), Weaving a Lexicon, Cambridge, MA: MIT Press, pp. 257-294.
Stanovich, K., 2011, Rationality and the Reflective Mind. New York: Oxford University Press.
Tenenbaum, J., Kemp, C., Griffiths, T., and Goodman, N., 2011, “How to Grow a Mind: Statistics,
Structure, and Abstraction.” Science, 331(6022): 1279-1285.
Thorndike, E., 1911, Animal intelligence: Experimental studies. New York: Macmillan.
Todrank, J., Byrnes, D., Wrzesniewski, A., and Rozin, P., 1995, “Odors can Change Preferences for
People in Photographs: A Cross-Modal Evaluative Conditioning Study with Olfactory USs and Visual
CSs,” Learning and Motivation, 26(2): 116-140.
Tolman, E., 1948, “Cognitive Maps in Rats and Men.” Psychological Review, 55(4): 189-208.



Van Gelder, T., 1995, “What Might Cognition Be, If not Computation?,” The Journal of Philosophy, 91
(7): 345-381.
Vansteenwegen, D., Francken, G., Vervliet, B., De Clercq, A., and Eelen, P., 2006, “Resistance to
Extinction in Evaluative Conditioning,” Journal of Experimental Psychology: Animal Behavior Processes,
32(1): 71-79.
Wilson, T., Lindsey, S., and Schooler, T., 2000, “A Model of Dual Attitudes,” Psychological Review 107
(1): 101-26.
Other Internet Resources






John Locke’s Chapter on the Association of Ideas from An Essay Concerning Human Understanding:
http://oregonstate.edu/instruct/phl302/texts/locke/locke1/Book2c.html#Chapter XXXIII
David Hume’s A Treatise of Human Nature: http://www.earlymoderntexts.com/authors/hume.html
William James “The Stream of Consciousness”: http://psychclassics.yorku.ca/James/jimmy11.htm
William James “The Stream of Thought”: http://psychclassics.asu.edu/James/Principles/prin9.htm
(chapter from his Principles of Psychology)
Edward Thorndike on the Law of Effect (from his book Animal Intelligence):
http://psychclassics.yorku.ca/Thorndike/Animal/chap5.htm
Ivan Pavlov’s “Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral
Cortex” http://psychclassics.yorku.ca/Pavlov/
Related Entries
Behaviorism | Compositionality | Computational Theory of Mind | Connectionism | David Hume | J.S. Mill
| Kant’s Transcendental Argument | Logical Form | 19th Century Scottish Philosophy
Acknowledgments
Helpful feedback was received from Michael Brownstein, Bryce Huebner, Zoe Jenkin, Jake Quilty-Dunn,
Shaun Nichols, and Susanna Siegel who are hereby thanked for their efforts.
<Eric Mandelbaum>