Associationism

Associationism is one of the oldest, and, in some form or another, most widely held theories of thought. Associationism has been the engine behind empiricism for centuries, from the British Empiricists through the Behaviorists and modern day Connectionists. Nevertheless, ‘associationism’ does not refer to one particular theory of cognition per se, but rather a constellation of related though separable theses. What ties these theses together is a commitment to a certain arationality of thought: a creature's mental states are associated because of some facts about its causal history, and having these mental states associated entails that bringing one of a pair of associates to mind will, ceteris paribus, ensure that the other also becomes activated.

1. What is Associationism?
2. Associationism as a Theory of Mental Processes: The Empiricist Connection
3. Associationism as a Theory of Learning
4. Associationism as a Theory of Mental Structure
   o 4.1 Associative Symmetry
   o 4.2 Activation Maps of Associative Structure
   o 4.3 Relation Between Associative Learning and Associative Structure
   o 4.4 Extinction and Counterconditioning
5. Associative Transitions
6. Associative Instantiation
7. Relation between the Varieties of Association and Related Positions
8. Associationism in Social Psychology
   o 8.1 Implicit Attitudes
   o 8.2 Dual Process Theories
9. Criticisms of Associationism
   o 9.1 Learning Curves
   o 9.2 The Problem of Predication
   o 9.3 Word Learning
        9.3.1 Fast Mapping
        9.3.2 Syntactic Category Learning
   o 9.4 Against the Contiguity Relation of Associationism
        9.4.1 Against the Necessity of Contiguity
        9.4.2 Against the Sufficiency of Contiguity
   o 9.5 Coextensionality
Bibliography
Academic Tools
Other Internet Resources
Related Entries

1. What is Associationism?

Associationism is a theory that connects learning to thought based on principles of the organism’s causal history.
Since its early roots, associationists have sought to use the history of an organism’s experience as the main sculptor of cognitive architecture. In its most basic form, associationism has claimed that pairs of thoughts become associated based on the organism’s past experience. So, for example, a basic form of associationism (such as Hume’s) might claim that the frequency with which an organism has come into contact with Xs and Ys in one’s environment determines the frequency with which thoughts about Xs and thoughts about Ys will arise together in the organism's future. Associationism’s popularity is in part due to how many different masters it can serve. In particular, associationism can be used as a theory of learning (e.g., as in behaviorist theorizing), a theory of thinking (as in Jamesian ‘streams of thought’), a theory of mental structures (e.g., as concept pairs), and a theory of the implementation of thought (e.g., connectionism). All these theories are separable, but share a related, empiricist-friendly core. As used here, a ‘pure associationist’ will refer to one who holds associationist theories of learning, thinking, mental structure, and implementation. The ‘pure associationist’ is a somewhat idealized position, one that no particular theorist may have ever held, but many have approximated to differing degrees (e.g., Locke 1690/1975, Hume 1738/1975, Thorndike 1911, Skinner 1953, Hull 1943, Churchland 1986, 1989, Churchland and Sejnowski 1990, Smolensky 1988, Elman 1991, Elman et al. 1996, McClelland et al. 2010, Rydell and McConnell 2006, Fazio 2007). Outside of these core uses of associationism the movement has also been closely aligned with a number of different doctrines over the years: empiricism, behaviorism, anti-representationalism (i.e., skepticism about the necessity of representational realism in psychological explanation), gradual learning, and domain-general learning. 
All of these theses are dissociable from core associationist thought (see section 7). While one can be an associationist without holding those theses, some of those theses imply associationism to differing degrees. These extra theses’ historical and sociological ties to associationism are strong, and so will be intermittently discussed below.

2. Associationism as a Theory of Mental Processes: The Empiricist Connection

Empiricism is a general theoretical outlook, which tends to offer a theory of learning to explain as much of our mental life as possible. From the British empiricists through Skinner and the behaviorists (see Behaviorism entry) the main focus has been arguing for the acquisition of concepts (for the empiricists, ‘Ideas’; for the behaviorists, ‘responses’) through learning. However, the mental processes that underwrite such learning are almost never themselves posited to be learned.[1] So winnowing down the number of mental processes one has to posit limits the amount of innate machinery the theorist is saddled with. Associationism, in its original form as in Hume (1738/1975), was put forward as a theory of mental processes. Associationists attempt to answer the question of how many mental processes there are by positing that there is only one mental process: the ability to associate ideas.[2] Of course, thinkers execute many different types of cognitive acts, so if there is only one mental process, the ability to associate, that process must be flexible enough to accomplish a wide range of cognitive work. In particular, it must be able to account for learning and thinking. Accordingly, associationism has been put to use on both fronts. We will first discuss the theory of learning, and then, after analyzing that theory and seeing what is putatively learned, we will return to the associationist theory of thinking.

[1] Empiricists who have wanted more than one type of learning mechanism have tended to be constructivists. The basic constructivist position is to posit a single mental process, the ability to associate ideas, and to construct new processes out of the single innate process (see Fodor 1983 for discussion).
[2] Though many later associationists, such as Pavlov and the behaviorists, had only one mental process, Hume in fact also had the imagination. For discussion on how the imagination meshes with Hume’s empiricism and associationism see Fodor (2003).

3. Associationism as a Theory of Learning

In one of its senses, ‘associationism’ refers to a theory of how organisms acquire concepts, associative structures, response biases, and even propositional knowledge. It is commonly acknowledged that associationism took hold after the publication of John Locke’s Essay Concerning Human Understanding (1690/1975).[3] However, Locke’s comments on associationism were terse (though fertile), and did not address learning to any great degree. The first serious attempt to detail associationism as a theory of learning was given by Hume in the Treatise of Human Nature (1738/1975).[4] Hume’s associationism was, first and foremost, a theory of how perceptions (‘Impressions’) determined trains of thought (successions of ‘Ideas’). Hume’s empiricism, as enshrined in the Copy Principle,[5] demanded that there were no Ideas in the mind that were not first given in experience. For Hume, the principles of association constrained the functional role of Ideas once they were copied from Impressions: if Impressions IM1 and IM2 were associated in perception, then their corresponding Ideas, ID1 and ID2, would also become associated. In other words, the ordering of Ideas was determined by the ordering of the Impressions that caused the Ideas to arise. Hume’s theory then needs to analyze which types of associative relations between Impressions matter for determining the ordering of Ideas.
Hume’s analysis consisted of three types of associative relations: cause and effect, contiguity, and resemblance. If two Impressions instantiated one of these associative relations, then their corresponding Ideas would mimic the same instantiation.[6] For instance, if Impression IM1 was cotemporaneous with Impression IM2, then (ceteris paribus) their corresponding Ideas, ID1 and ID2, would become associated.

As stated, Hume’s associationism was mostly a way of determining the functional profile of Ideas. But we have not yet said what it is for two Ideas to be associated (for that see section 4). Instead, one can see Hume’s contribution as introducing a very influential type of learning, associative learning, for Hume’s theory purports to explain how we learn to associate certain Ideas. But we can abstract away from Hume’s framework of ideas and his account of the specific relations that underlie associative learning, and state the theory of associative learning more generally: if two contents of experiences, X and Y, instantiate some associative relation, R, then those contents will become associated, so that future activations of X will tend to bring about activations of Y. The associationist then has to explain what relation R amounts to. The Humean form of associative learning (where R is equated with cause and effect, contiguity, or resemblance) has been hugely influential, informing the accounts of those such as Jeremy Bentham, J. S. Mill, and Alexander Bain (see, e.g., the entries on J.S. Mill and 19th Century Scottish Philosophy).[7]

Associative learning didn’t hit its stride until the work of Ivan Pavlov, which spurred the subsequent rise of the behaviorist movement in psychology. Pavlov introduced the concept of classical conditioning as a modernized version of associative learning.
For Pavlov, classical conditioning was in part an experimental paradigm for teaching animals to learn new associations between stimuli. The general method of learning was to pair an unconditioned stimulus (US) with a novel stimulus. An unconditioned stimulus is just a stimulus that naturally, without training, provokes a response in an organism. Since this response is not itself learned, the response is referred to as an ‘unconditioned response’ (UR). In Pavlov’s canonical experiment, the US was a meat powder, as the smell of meat automatically brought about salivation (UR) in his canine subjects. The US is then paired with a neutral stimulus, such as a bell. Over time, the contiguity between the US and the neutral stimulus causes the neutral stimulus to provoke the same response as the US. Once the bell starts to provoke salivation, the bell has become a ‘conditioned stimulus’ (CS) and the salivating, when prompted by the bell alone, a ‘conditioned response’ (CR). The associative learning here is learning to form new stimulus-response pairs between the bell and the salivation.[8]

[3] That said, one can detect aspects of associationism in earlier writers, such as Descartes when discussing memory and Spinoza when discussing the emotions (see the entry on Descartes and on Spinoza on the emotions).
[4] Although Hume is generally acknowledged as laying the theoretical foundation of associationism, there is some evidence that Francis Hutcheson’s use of associations greatly influenced him. See the entry on Scottish Philosophy in the 18th Century.
[5] “All our simple ideas in their first appearance are deriv'd from simple impressions, which are correspondent to them, and which they exactly represent” (T 1.1.1.7/4).
[6] This is a bit of a loose formulation. Strictly speaking, Impressions themselves don’t instantiate any associative relation; rather, the contents of the Impressions do. For example, it isn’t that one’s Impression (understood as a vehicle of thought) of chickens resembles roosters; rather, it is the contents of one’s Impressions that resemble one another. Presumably, all Impressions qua vehicles of thought resemble one another merely by being Impressions. What differs between Impressions is (e.g.,) whether the content they represent resembles other represented content. This distinction between vehicle and content is important for Hume’s overall architecture: it’s not the vehicle of the Impression that gets copied into an Idea, but rather the content of that vehicle. That said, to ease exposition the distinction between vehicles and contents is elided in the main text except where it is important to distinguish.
[7] Although some contemporary associationist views still retain all three original Humean associative relations, the resemblance relation has come under the most scrutiny and is the least popular of the three. For discussions of the problem of the resemblance criterion see Field and Davey (1999), and De Houwer (2009). In the canonical Rescorla-Wagner model (Rescorla and Wagner 1972), both contiguity and resemblance are superseded by the contingency requirement.

Classical conditioning is a fairly circumscribed process. It is a ‘stimulus substitution’ paradigm where one stimulus can be swapped for another to provoke a response.[9] However, the responses that are provoked remain unchanged; all that changes is the stimulus that gets associated with the response. Thus, classical conditioning seemed to some to be too restrictive to explain the panoply of novel behavior organisms appear to execute.[10]

Edward Thorndike’s research with cats in puzzle boxes broadened the theory of associative learning by introducing the notion of consequences to associative learning. Thorndike expanded the notion of associative learning beyond instinctual behaviors and sensory substitution to genuinely novel behaviors.
Thorndike’s experiments initially probed (e.g.,) how cats learned to lift a lever to escape the “puzzle boxes” (the forerunners of ‘Skinner boxes’) that they were trapped in. The cats’ behaviors, such as attempting to lift a lever, were not themselves instinctual behaviors like the URs of Pavlov’s experiments. Additionally, the cats’ behaviors were shaped by the consequences that they brought on. For Thorndike it was because lifting the lever caused the door to open that the cats learned the connection between the lever and the door. This new view of learning, operant conditioning (for the organism is ‘operating’ on its environment), was not merely the passive learning of Pavlov, but a species-nonspecific, general, active theory of learning. This research culminated in Thorndike’s famous “Law of Effect” (1911), the first canonical psychological law of associationist learning. It asserted that responses that are accompanied by the organism feeling satisfied will, ceteris paribus, be more likely to be associated with the situation in which the behavior was executed, whereas responses that are accompanied by a feeling of discomfort to the animal will, ceteris paribus, make the response less likely to occur when the organism encounters the same situation.[11] The greater the positive or negative feelings produced, the greater the resulting change in the likelihood that the behavior will be evinced. To this Thorndike added the “Law of Exercise”, that responses to situations will, ceteris paribus, be more connected to those situations in proportion to the frequency of past pairings between situation and response.

Thorndike’s paradigm was popularized and extended by B. F. Skinner (see, e.g., Skinner 1953) who stressed the notion not just of consequences but of reinforcement as the basis of forming associations.
For Skinner, a behavior would get associated with a situation according to the frequency and strength of reinforcement that would arise as a consequence of the behavior.

[8] A variation on classical conditioning is evaluative conditioning, where one tries to transfer the valence of the US onto the CS (see, e.g., De Houwer et al. 2001 for an overview). For instance, one might pair a favorable flavor (e.g., sugar) with a novel neutral face stimulus, in order to transfer the positive valence to the previously neutral face.
[9] There are many different ways of construing the details of Pavlovian conditioning. For example, some would restrict the usage further by arguing that the US must be biologically significant, or widen the usage, as Rescorla does (see section 7). Some anti-associationists even believe that Pavlovian conditioning is real, but not predicated on associations (Mitchell et al. 2009).
[10] Classical conditioning also had some consequences that were a bit unpalatable for empiricists: if all learning was to be given as forming associative bonds between USs, CSs, and responses, then all of our learning had to bottom out in some behaviors that were preprogrammed to correspond to certain stimuli: in other words, certain instinctual patterns of behavior were innately set to be elicited by certain stimuli. Even more problematically, such instinctual patterns were apt to be species specific, so not generalizable to humans.
[11] Note how Thorndike does not hesitate to speak of mental states like satisfaction and dissatisfaction, as opposed to the most famous practitioner of operant conditioning, the radical behaviorist B. F. Skinner (see the Behaviorism entry).

Since the days of Skinner, associative learning has come in many different variations. But what all varieties share with their historical predecessors is that associative learning is supposed to mirror the contingencies in the world without adding additional structure to them.
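
As one concrete illustration of learning that tracks contingencies, the Rescorla-Wagner model mentioned earlier updates the associative strength V of a CS on each trial by delta V = alpha * beta * (lambda - V), where lambda is the maximum strength the US can support. The following is a minimal sketch for a single CS; the function name, the parameter values, and the boolean trial encoding are illustrative assumptions rather than anything specified in the text.

```python
def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam=1.0, v0=0.0):
    """Track the associative strength V of one CS across trials.

    trials: iterable of booleans, True when the US accompanies the CS.
    alpha, beta: CS salience and US learning rate (illustrative values).
    lam: asymptote of learning that the US supports.
    """
    v = v0
    history = []
    for us_present in trials:
        target = lam if us_present else 0.0  # without the US the target is 0
        v += alpha * beta * (target - v)     # delta V = alpha * beta * (lambda - V)
        history.append(v)
    return history


# Ten paired CS-US trials: V rises quickly at first, then levels off below lam.
acquisition = rescorla_wagner([True] * 10)
```

On this rule the size of each update depends on how surprising the outcome is, so early pairings change V more than later ones.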
The question of what contingencies associative learning detects (that is, one’s preferred analysis of what the associative relation R is) is up for debate and varies between theorists.

The final widely shared, though less central, property of associative learning concerns the domain generality of associative learning. Domain generality’s prevalence among associationists is due in large part to their traditional empiricist allegiances: excising domain specific learning mechanisms constrains the amount of innate mental processes one has to posit. Thus it is no surprise to find that both Hume and Pavlov assumed that associative learning could be used to acquire associations between any contents, regardless of the types of contents they were. For example, Pavlov writes, “Any natural phenomenon chosen at will may be converted into a conditioned stimulus. Any ocular stimulus, any desired sound, any odor, and the stimulation of any portion of the skin, whether by mechanical means or by the application of heat or cold never failed to stimulate the salivary glands.” (Pavlov 1906, p. 615). Note that for Pavlov the content of the CS doesn’t matter. Any content will do, as long as it bears the right functional relationship in the organism’s learning history. In that sense, the learning is domain general: it matters not what the content is, just the role it plays (for more on this topic, see section 9.4).[12]

4. Associationism as a Theory of Mental Structure

Associative learning amounts to a constellation of related views that interprets learning as associating stimuli with responses (in operant conditioning), or stimuli with other stimuli (in classical conditioning), or stimuli with valences (in evaluative conditioning). Associative learning accounts raise the question: when one learns to associate contents X and Y because (e.g.,) previous experiences with Xs and Ys instantiated R, how does one store the information that X and Y are associated?
A highly contrived sample answer to this question would be that a thinker learns an explicitly represented unconscious conditional rule that states ‘when a token of X is activated, then also activate a token of Y.’ Instead of such a highly intellectualized response, associationists have found a natural (though by no means necessary, see section 4.2) complementary view that the information is stored in an associative structure.

An associative structure describes the type of bond that connects two distinct mental states.[13] An example of such a structure is the associative pair SALT/PEPPER.[14] The associative structure is defined, in the first instance, functionally: if X and Y form an associative structure, then, ceteris paribus, activations of mental state X bring about mental state Y and vice versa without the mediation of any other psychological states (such as an explicitly represented rule telling the system to activate a concept because its associate has been activated).[15] In other words, saying that two concepts are associated amounts to saying that there is a reliable, psychologically basic causal relation that holds between them: the activation of one of the concepts causes the activation of the other. So, saying that someone harbors the structure SALT/PEPPER amounts to saying that activations of SALT will cause activations of PEPPER (and vice versa) without the aid of any other cognitive states.

[12] From this level of abstraction, Pavlov and Skinner were united. Here is Garcia on Skinnerian learning: “Any stimulus applied immediately after the response which, by empirical test, would increase response production was deemed a reinforcer…The general procedures were said to be applicable to any and all reflexes, in any and all organisms. There was no need to concern ourselves with species differences, with brain differences, or with reinforcer differences. The payoff schedule’s the thing wherein we’d capture control of the organism” (Garcia 1981, p. 155).
[13] Radical behaviorists such as Skinner (e.g. 1953) would deny this claim, but only because of their ontological objections to reifying mental states. But eliminativism about the mental is a different thesis than associationism, although both fit together well (see section 6).
[14] Hereafter I will use the forward slash to denote an associative bond between the entities on either side of the slash. Additionally, expressions written in small caps will be used to denote concepts, and I will assume that the concepts’ structural descriptions are given by the expressions. Thus RED BIRD is taken to be a complex concept consisting of two meaningful parts, the concept RED and the concept BIRD. However, BIRD will be assumed to be a simple concept with no semantically decomposable parts. The structural descriptions are stipulated for exegetical reasons and without commitment to the actual structure of the corresponding concepts.

Associative structures are most naturally contrasted with propositional structures. A pure associationist is opposed to propositional structures (strings of mental representations that express a proposition) because propositionally structured mental representations have structure over and above the mere associative bond between two concepts. Take, for example, the associative structure GREEN/TOUCAN. This structure does not predicate green onto toucan. If we know that a mind has an associative bond between GREEN and TOUCAN, then we know that activating one of those concepts leads to the activation of the other. A pure associative theory rules out predication, for propositional structures aren’t just strings of associations.
‘Association’ (in associative structures) just denotes a causal relation among mental representations, whereas predication (roughly) expresses a relation between things in the world (or intentional contents that specify external relations). Saying that someone has an associative thought GREEN/TOUCAN tells you something about the causal and temporal sequences of the activation of concepts in one’s mind; saying that someone has the thought THERE IS A GREEN TOUCAN tells you that a person is predicating greenness of a particular toucan (see Fodor 2003, pp. 91-94, for an expansion of this point).

Associative structures needn’t just hold between simple concepts. One might have reason to posit associative structures between propositional elements (see section 5) or between concepts and valences (see section 8). But none of the preceding is meant to imply that all structures are associative or propositional; there are other representational formats that the mind might harbor (e.g., analog magnitudes or iconic structures). For instance, not all semantically related concepts are harbored in associative structures. Semantically related concepts may in fact also be directly associated (as in DOCTOR/NURSE) or they may not (as in HORSE/ZEBRA; see Perea and Rosa 2001). The difference in structure is not just a theoretical possibility: these different structures have different functional profiles. For example, conditioned associations appear to last longer than semantic associations do in subjects with dementia (Glosser and Friedman 1991).

4.1 Associative Symmetry

The analysis of associative structures implies that, ceteris paribus, associations are symmetric in their causal effects: if a thinker has a bond between SALT/PEPPER, then SALT should bring about PEPPER just as well as PEPPER brings about SALT. But all else is rarely equal.
For example, behaviorists such as Thorndike, Hull, and Skinner knew that the order of learning affected the causal sequence of recall: if one is always hearing ‘salt and pepper’ then SALT will be more poised to activate PEPPER than PEPPER to activate SALT. So, included in the ceteris paribus clause in the analysis of associative structures is the idealization that the learning of the associative elements was equally well randomized in order.

Similarly, associative symmetry is violated when there are differing amounts of associative connections between the individual associated elements. For example, in the GREEN/TOUCAN case, most thinkers will have many more associations stemming from GREEN than stemming from TOUCAN. Suppose we have a thinker that only associates TOUCAN with GREEN, but associates GREEN with a large host of other concepts (e.g., GRASS, VEGETABLES, TEA, KERMIT, SEASICKNESS, MOSS, MOLD, LANTERN, IRELAND, etc). In this case one can expect that TOUCAN will more quickly activate GREEN than GREEN will activate TOUCAN, for the former bond will have its activation strength less weakened amongst other associates than the latter will.

[15] The mediation parenthetical can get a bit complicated to state, for one might want to claim that (e.g.,) WRENCH and HAMMER are associated, even if the association is mediated through a link with SCREWDRIVER. In that case, it’s best to say that two concepts form a basic associative structure if the activation of one concept brings on the activation of another without there being any other mediating psychological variable.

4.2 Activation Maps of Associative Structure

An associative activation map (sometimes called a ‘spreading activation’ map, Collins and Loftus 1975) is a mapping for a single thinker of all the associative connections between concepts.[16] There are many ways of operationalizing associative connections.
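
One way to picture such a map, including the asymmetry in the GREEN/TOUCAN example from section 4.1, is as a graph in which the activation leaving a concept is divided among its associates. The sketch below is only illustrative: the class name, the method names, and above all the equal division of activation are assumptions, since the text does not commit to any particular rule for how activation is shared.

```python
class ActivationMap:
    """Toy spreading-activation map over symmetric associative links."""

    def __init__(self):
        self._links = {}

    def associate(self, a, b):
        # An associative bond is stored in both directions.
        self._links.setdefault(a, set()).add(b)
        self._links.setdefault(b, set()).add(a)

    def spread(self, source, activation=1.0):
        # Assumption: activation is split equally among all associates.
        associates = self._links.get(source, set())
        if not associates:
            return {}
        share = activation / len(associates)
        return {concept: share for concept in associates}


amap = ActivationMap()
amap.associate("TOUCAN", "GREEN")
for other in ("GRASS", "TEA", "MOSS"):
    amap.associate("GREEN", other)

toucan_to_green = amap.spread("TOUCAN")["GREEN"]  # TOUCAN has one associate
green_to_toucan = amap.spread("GREEN")["TOUCAN"]  # GREEN has four associates
```

Although the bond itself is stored symmetrically, TOUCAN passes all of its activation to GREEN while GREEN passes only a fraction to TOUCAN, which is the kind of symmetry violation a full activation map would have to record.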
In the abstract, a psychologist will attempt to probe which concepts (or other mental elements) activate which other concepts (or elements). Imagine a subject who is asked to say whether a string of letters constitutes a word or not, which is the typical goal given to subjects in a ‘lexical decision task.’ If a subject has just seen the word ‘bugs’, we assume that the concept BUGS was activated. If the subject is then quicker to say that, e.g., ‘insects’ is a word than the subject is to say that ‘toaster’ is, then we can infer that INSECTS was primed, and is thus associatively related to BUGS, in this thinker. Likewise, if we find that ‘microphone’ is also responded to quicker, then we know that MICROPHONE is associatively related to BUGS. Using this procedure, one can generate an associative mapping of a thinker’s mind. Such a mapping would constitute a mapping of the associative structures one harbors. However, to be a true activation map (a true mapping of what concepts facilitate what) the mapping would also need to include information about the violations of symmetry between concepts.

4.3 Relation Between Associative Learning and Associative Structures

The British Empiricists desired to have a thoroughgoing pure associationist theory, for it allowed them to lessen the load of innate machinery they needed to posit. Likewise, the behaviorists also tended to want a pure associationist theory (sometimes out of a similar empiricist tendency, other times because they were radical behaviorists like Skinner, who banned all discussion of mental representations). Pure associationists tend to be partial to a connection that Fodor (2003) refers to as “Bare-Boned Association.” The idea is that the current strength of an associative connection between X and Y is determined, ceteris paribus, by the frequency of the past associations of X and Y.
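
Read literally, Bare-Boned Association makes associative strength a function of co-occurrence counts alone. Here is a minimal sketch under the assumption that strength is simply the raw number of past pairings; the class and method names are hypothetical, and the text does not specify the exact function from frequency to strength.

```python
from collections import Counter


class BareBonedAssociator:
    """Associative strength determined only by the frequency of past pairings."""

    def __init__(self):
        self._pairings = Counter()

    def observe(self, x, y):
        # Record one co-occurrence; the pair is treated as unordered.
        self._pairings[frozenset((x, y))] += 1

    def strength(self, x, y):
        # Assumption: strength equals the raw count of past pairings.
        return self._pairings[frozenset((x, y))]


learner = BareBonedAssociator()
for _ in range(3):
    learner.observe("SALT", "PEPPER")
learner.observe("SALT", "SUGAR")
```

Nothing in this sketch responds to reinforcement or to the consequences of a response; only the learning history of pairings matters.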
As stated, Bare-Boned Association assumes that associative structures encode, at least implicitly, the frequency of past associations of X and Y, and the strength of that associative bond is determined by the organism’s previous history of experiencing Xs and Ys.[17] In other words, the learning history of past associations determines the current functional profile of the corresponding associative structures.[18]

Although the picture sketched above, where associative learning eventuates in associative structure, is appealing for many, it is not forced upon one. Logically speaking, there is no reason to bar any type of structure from arising from a particular type of learning. One may, for example, gain propositional structures from associative learning (see Mitchell et al. 2009 and Mandelbaum forthcoming for arguments that this is more than a mere logical possibility). This may happen in two ways. In the first, one may gain an associative structure that has a proposition as one of its associates. Assume that every time one’s father came home he immediately made dinner. In such a case one might associate the proposition DADDY IS HOME with the concept DINNER (that is, one might acquire: DADDY IS HOME/DINNER). However, one might also just have a propositional structure result from associative learning. If every time one’s father came home he made dinner, then one might just end up learning IF DADDY IS HOME THEN DINNER WILL COME SOON, which is a propositional structure.

[16] This claim should be qualified in a few ways. First, the mapping might not be a full mapping of a single thinker as opposed to a subsystem of a single thinker (such as their intramodular representation of their lexicon, see Fodor 1983). Secondly, the mapping needn’t be between concepts per se, and can instead be between mental representations on which, for some reason or another, one needn’t bestow the honorific ‘concepts’ (because, for example, the mental representations are intramodular and thus not properly ‘general’, see Evans 1982).
[17] ‘Experiencing Xs and Ys’ generally means something such as ‘having formed representations of Xs and Ys based on their appearance in the ambient environment,’ but needn’t necessarily mean that. If one just happened to keep thinking X followed by Y for any reason, even though Xs and Ys weren’t given in experience, that too could change the associative strength of the X/Y bond. Additionally, some theories allow ‘piggybacking’ associations, that is, associations formed from activated propositional structures. For example, constantly having the propositional thought MOLLY OWNS A DOG could affect the associative bond between MOLLY and DOG (see Mandelbaum forthcoming for discussion).
[18] Although bare-boned associationism provides a good approximation of Hume and Pavlov, it doesn’t quite capture the full theory of those working in operant conditioning paradigms, for it doesn’t involve any notion of reinforcement, or updating one’s associative structure based on consequences. This isn’t accidental: how to square cognitive updating (i.e., association-based or belief-based updating) based on consequences with the Spartan tenets of associationism has often been a point of difficulty (see, e.g., Festinger and Carlsmith 1959).

4.4 Extinction and Counterconditioning

There is a different, tighter relationship between associative learning and associative structures concerning how to modulate an association. Associative theorists, especially from Pavlov onward, have been clear on the functional characteristics necessary to modulate an already created association. There have been two generally agreed upon routes: extinction and counterconditioning. Suppose that, through associative learning, you have learned to associate a CS with a US. How do we break that association? Associationists have posited that one breaks an associative structure via two different particular types of associative learning (/unlearning). Extinction is the name for one such process.
During extinction one decouples the external presentation of the CS and the US by presenting the CS without the US (and sometimes the US without the CS). Over time, the organism will learn to disconnect the CS and US. Counterconditioning names a similar process to extinction, though one which proceeds via a slightly different method. Counterconditioning can only occur when an organism has an association between a mental representation and a valence, as acquired in an evaluative conditioning paradigm. Suppose that one associates DUCKS with a positive valence. To break this association via counterconditioning one introduces ducks not with a lack of positive valence (as would happen in extinction) but with the opposite valence, a negative valence. Counterconditioning counters the existing valence with the opposite valence. Over multiple exposures, the initial representation/valence association weakens, and is perhaps completely broken.19 How successful extinction and counterconditioning are, and how they work, is the source of some controversy. Although the traditional view is that extinction breaks associative bonds, it is an open empirical question whether extinction proceeds by breaking the previously created associative bonds, or whether it proceeds by leaving that bond alone but creating new, more salient (and perhaps context-specific) bonds between the CS and other mental states (see Bouton 2002 for evidence for the latter interpretation). Additionally, reinstatement, the spontaneous reappearance of an associative bond after seemingly successful extinction, has been observed in many contexts (see, e.g., Dirikx et al. 2007 for reinstatement of fear in humans). 20 One fixed point in this debate is that one reverses associative structures via these two types of associative learning/unlearning, and only via these two pathways. What one does not do is try to break an associative structure by using practical or theoretical reasoning. 
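The two routes can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the linear error-driven update, the learning rate, and the coding of outcomes as numbers (1.0 for the original pairing, 0.0 for its absence, -1.0 for the opposite valence) are all hypothetical choices made for illustration and are not taken from the text. The sketch also models only the traditional view on which the original bond itself weakens; it does not capture the alternative on which extinction leaves the old bond intact and adds new ones.

```python
def update(strength, outcome, rate=0.2):
    """Move the associative strength a fraction `rate` toward `outcome`.

    The linear rule and the default rate are illustrative assumptions.
    """
    return strength + rate * (outcome - strength)


def run_trials(strength, outcome, n_trials, rate=0.2):
    """Apply the same outcome on every trial and return the strength history."""
    history = [strength]
    for _ in range(n_trials):
        strength = update(strength, outcome, rate)
        history.append(strength)
    return history


# Acquisition: the representation is repeatedly paired with the outcome (1.0).
acquired = run_trials(0.0, 1.0, 20)[-1]

# Extinction: the representation is presented without the outcome (0.0).
extinguished = run_trials(acquired, 0.0, 20)[-1]

# Counterconditioning: the representation is paired with the opposite valence (-1.0).
countered = run_trials(acquired, -1.0, 20)[-1]

print(round(acquired, 3), round(extinguished, 3), round(countered, 3))
```

On these assumptions extinction drives the strength back toward zero, whereas counterconditioning pushes it past zero toward the opposite valence.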
If you associate SALT with PEPPER then telling you that salt has nothing to do with pepper, or giving you very good reasons not to associate the two (say, someone will give you $50,000 for not associating them) won’t affect the association. This much has at least been clear since Locke. In the Essay concerning Human Understanding, in his chapter “Of the Association of Ideas” (chapter XXXIII), he writes:

When this combination is settled, and while it lasts, it is not in the power of reason to help us, and relieve us from the effects of it. Ideas in our minds, when they are there, will operate according to their natures and circumstances. And here we see the cause why time cures certain affections, which reason, though in the right, and allowed to be so, has not power over, nor is able against them to prevail with those who are apt to hearken to it in other cases (2.33.13).

19 Curiously, it appears that extinction isn’t very effective in evaluative conditioning paradigms, though counterconditioning is (see De Houwer 2011 for many citations, such as Diaz et al. 2005 and Vansteenwegen et al. 2006).

20 Technically, reinstatement is the reappearance of the CR upon reexposure to the US after successful extinction, whereas spontaneous recovery is the name for the return of the associative pairing just due to the passage of time. Both reinstatement and spontaneous recovery are related, and both provide difficulties for the traditional view of extinction.

Likewise, say one has just eaten lutefisk and then vomited. The smell and taste of lutefisk will then be associated with feeling nauseated, and no amount of telling one that they shouldn’t be nauseated will be very effective. Say the lutefisk that made one vomit was covered in poison, so that we know that the lutefisk wasn’t the root cause of the sickness.21 Having this knowledge won’t dislodge the association.
In essence, associative structures are functionally defined as being fungible based on counterconditioning and extinction and nothing else. Thus, assuming one sees counterconditioning and extinction as types of associative learning, we can say that associative learning does not necessarily eventuate in associative structures, but associative structures can only be modified by associative learning. 5. Associative Transitions So far we’ve discussed learning and mental structures, but have yet to discuss thinking. The pure associationist will want a theory that covers not just acquisition and cognitive structure, but also the transition between thoughts. Associative transitions are a particular type of thinking, akin to what William James called “The Stream of Thought” (James 1890). Associative transitions are movements between thoughts that are not predicated on a prior logical relationship between the elements of the thoughts that one connects. In this sense, associative transitions are contrasted with computational transitions as analyzed by the Computational Theory of Mind (see the Computational Theory of Mind entry). CTM understands inferences as truth preserving movements in thought that are underwritten by the formal/syntactic properties of thoughts. For example inferring the conclusion in modus ponens from the premises is possible just based on the form of the major and minor premise, and not on the content of the premises. Associative transitions are transitions in thought that are not based on the logico-syntactic properties of thoughts. Rather, they are transitions in thought that occur based on the associative relations among the separate thoughts. Imagine an impure associationist model of the mind, one that contains both propositional and associative structures. A computational inference might be one such as inferring YOU ARE A G from the thoughts IF YOU ARE AN F, THEN YOU ARE A G, and YOU ARE AN F. 
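The form-driven character of such an inference can be made concrete with a small sketch. Everything here is a hypothetical illustration rather than a model drawn from the literature: the `Conditional` type, the `modus_ponens` and `associative_transition` functions, and the lookup table standing in for a learning history are all assumed names. The point is only that the first function inspects the shape of its premises and works for any content of that shape, while the second consults nothing but which thoughts happen to be linked.

```python
from typing import Dict, NamedTuple, Optional


class Conditional(NamedTuple):
    """A thought of the form IF <antecedent> THEN <consequent>."""
    antecedent: str
    consequent: str


def modus_ponens(major: Conditional, minor: str) -> Optional[str]:
    """Return the consequent only when the minor premise matches the antecedent.

    The rule looks at the form of the premises, not at what they are about.
    """
    if minor == major.antecedent:
        return major.consequent
    return None


def associative_transition(thought: str, links: Dict[str, str]) -> Optional[str]:
    """Return whatever thought is linked to `thought`, whatever its form."""
    return links.get(thought)


# Computational transition: licensed by the form of the premises.
major = Conditional("YOU ARE AN F", "YOU ARE A G")
print(modus_ponens(major, "YOU ARE AN F"))   # YOU ARE A G
print(modus_ponens(major, "YOU ARE AN H"))   # None

# Associative transition: licensed only by a learned link (assumed here).
links = {"SALT": "PEPPER"}
print(associative_transition("SALT", links))  # PEPPER
```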
However, an associative transition is just a stream of ideas that needn’t have any formal, or even rational, relation between them, such as the transition from THIS COFFEESHOP IS COLD to RUSSIA SHOULD ANNEX IDAHO, without there being any intervening thoughts. This transition could be subserved merely by one’s association of IDAHO and COLD, or it could happen because the two thoughts have tended to co-occur in the past, and their close temporal proximity caused an association between the two thoughts to arise (or for many other reasons). Regardless of the etiology, the transition doesn’t occur on the basis of the formal properties of the thoughts.22 According to this taxonomy, talk of an ‘associative inference’ (e.g., Anderson et al. 1994, Armstrong et al. 2012) is a borderline oxymoron. The easiest way to give sense to the idea of an associative inference is for it to involve transitions in thought that began because they were purely inferential (as understood by the computational theory of mind) but then became associated over time. For example, at first one might make the modus ponens inference because a particular series of thoughts instantiates the modus ponens form. Over time the premises of that particular token of a modus ponens argument become associated with each other through their continued use in that inference and now the thinker merely associates the premises with the conclusion. That is, the constant contiguity between the premises and the conclusion occurred because the inference was made so frequently, but the inference was originally made so frequently not because of the associative relations between the premises and conclusion, but because of the form of the thoughts (and the particular motivations of the thinker). This constant contiguity then formed the basis for an associative linkage between the premises and the conclusion.

21 Interestingly, Locke also seemed to understand the nature of taste aversions (see section 9.4): “A grown person surfeiting with honey no sooner hears the name of it, but his fancy immediately carries sickness and qualms to his stomach, and he cannot bear the very idea of it; other ideas of dislike, and sickness, and vomiting, presently accompany it, and he is disturbed; but he knows from whence to date this weakness, and can tell how he got this indisposition. Had this happened to him by an over-dose of honey when a child, all the same effects would have followed; but the cause would have been mistaken, and the antipathy counted natural” (Locke 1690, 2.33.7).

22 In the example of associative transitions offered above, we used associations between propositions. But of course a pure associationist view would not allow propositional structures. It is thus a bit more difficult for a pure associationist to distinguish associative transitions from associative structures. For the pure associationist, all transitions are associative transitions among associative structures, for association is the only available mental process and associative structures the only available mental structure. Thus, for the pure associationist, the only possible difference between an associative structure and an associative transition is a contingent temporal one (where an associative structure is ideally contemporaneous whereas an associative transition unfolds over time).

As was the case for associative structures, associative transitions in thought are not just a logical possibility. There are particular empirical differences associated with associative transitions versus inferential transitions. Associative transitions tend to move across different content domains, whereas inferential transitions tend to stay on a more focused set of contents. These differences have been seen to result in measurable differences in mood: associative thinking across topics bolsters mood when compared to logical thinking on a single topic (Mason and Bar 2011).

6.
Associative Instantiation The associationist position so far has been neutral on how associations are to be implemented. Implementation can be seen at a representational (that is psychological) level of explanation, or at the neural level. A pure associationist picture would posit an associative implementation base at one, or both, of these levels.23 The most well-known associative instantiation base is a class of networks called Connectionist networks (see the Connectionism entry). Connectionist networks are sometimes pitched at the psychological level (see, e.g., Elman 1991, Elman et al. 1996, Smolensky 1988). This amounts to the claim that models of algorithms embedded in the networks capture the essence of certain mental processes, such as associative learning. Other times connectionist networks are said to be models of neural activity (‘neural networks’). Connectionist networks consist in sets of nodes, generally input nodes, hidden nodes, and output nodes. Input nodes are taken to be analogs of sensory neurons (or sub-symbolic sensory representations), output nodes the analog of motor neurons (or sub-symbolic behavioral representations), and hidden nodes are stand-ins for all other neurons.24 The network consists in these nodes being connected to each other with varying strengths. The topology of the connections gives one an associative mapping of the system, with the associative weights understood as the differing strengths of connections. On the psychological reading, these associations are functionally defined; on the neurological reading, they are generally understood to be representing synaptic conductance (and are the analogs of dendrites). Prima facie, these networks are purely associative and do not contain propositional elements, and the nodes themselves are not to be equated with single representational states (such as concepts; see, e.g., Gallistel and King 2009). 
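The layout just described (input, hidden, and output nodes joined by connections of varying strength) can be sketched in a few lines. This is an illustrative toy, not a model from the connectionist literature: the layer sizes, the particular weights, and the sigmoid activation function are all assumptions chosen only to make the example run.

```python
import math


def sigmoid(x):
    """Squash a summed input into the range (0, 1); an assumed activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(activations, weights):
    """Propagate activation one step.

    weights[j][i] is the connection strength from source node i to target node j.
    """
    return [
        sigmoid(sum(w * a for w, a in zip(row, activations)))
        for row in weights
    ]


# Hypothetical connection strengths: 2 input nodes, 3 hidden nodes, 1 output node.
input_to_hidden = [[0.8, -0.4], [0.3, 0.9], [-0.6, 0.5]]
hidden_to_output = [[1.2, -0.7, 0.4]]


def forward(inputs):
    """Activate the input nodes and let activation spread to the output node."""
    hidden = layer(inputs, input_to_hidden)
    return layer(hidden, hidden_to_output)


print(forward([1.0, 0.0]))
print(forward([0.0, 1.0]))
```

Nothing in the sketch stores a proposition or a rule: what the network "knows" is carried entirely by the pattern of connection strengths, and no single node stands for a concept on its own.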
However, a connectionist network can implement a classical Turing machine architecture (see, e.g., McLaughlin and Fodor 1990, Chalmers 1993). Many, if not most, of the adherents of classical computation, for example proponents of CTM, think that the brain is an associative network, one which implements a classical computational program. Some adherents of CTM do deny that the brain runs an associative network (see, e.g., Gallistel and King 2009, who appear to deny that there is any scientific level of explanation that association is intimately involved in), but they do so on separate empirical grounds and not because of any logical inconsistency with an associative brain implementing a classical mind.

23 The question of how many levels of explanation one allows in their cognitive architecture is a wholly separate question from whether any of those architectures are associationistic. Generalizations here vary wildly from theorist to theorist. For example, many theorists, roughly following Marr (1982), assume there is just one algorithmic (psychological/representational) level which is then instantiated in a physical (neurological) level (see, e.g., Mitchell et al. 2009). Others generally assume that there are multiple psychological levels. For instance, Fodor writes, “psychological faculties at the nth level are typically implemented by psychological faculties at the n-1th level” (2003, p. 132).

24 In this context, ‘sub-symbolic’ just means that the node on its own has no semantic value. In other words, a single node wouldn’t represent any content.

When discussing an associative implementation base it is important to distinguish questions of associationist structure from questions of representational reality. Connectionists have often been followers of the Skinnerian anti-representationalist tradition (Skinner 1938).
Because of the distributed nature of the nodes in connectionist networks, the networks have tended to be analyzed as associative stimulus/response chains of subsymbolic elements. However, the question of whether connectionist networks have representations which are distributed in patterns of activity throughout different nodes of the network, or whether connectionist networks are best understood as containing no representational structures at all, is orthogonal to both the question of whether the networks are purely associative or computational, and whether the networks can implement classical architectures. 7. Relation between the Varieties of Association and Related Positions These four types of associationism share a certain empiricist spiritual similarity, but are logically, and empirically, separable. The pure associationist who wants to posit the smallest number of domain-general mental processes will theorize that the mind consists of associative structures acquired by associative learning which enter into associative transitions and are implemented in an associative instantiation base. However, many hybrid views are available and frequently different associationist positions become mixed and matched, especially once issues of empiricism, domain-specificity, and gradual learning arise. Below is a partial taxonomy of where some well-known theorists lie in terms of associationism and these other, often related doctrines. Prinz (2002) and Karmiloff-Smith (1992) are examples of empiricist non-associationists. It is rare to find an associationist who is a nativist, but plenty of nativists have aspects of associationism in their own work. For example, even the arch-nativist Jerry Fodor allows that intramodular lexicons contain associative structures (Fodor 1983). Similarly, there are many non-behaviorist (at least non-radical, analytic, or methodological behaviorist) associationists, such as Elman (1991), Smolensky (1988), Baeyens (De Houwer et al. 
2001) and modern day dual process theorists such as Evans and Stanovich (2013). It is quite difficult to find a non-associationist behaviorist, though Tolman approximates one (Tolman 1948). Elman and Smolensky also qualify as representationalist associationists, and Van Gelder (1995) as an anti-representationalist non-associationist. Karmiloff-Smith (1992) can be interpreted as, for some areas of learning, a proponent of gradual learning without being associationist (some might also read contemporary Bayesian theorists, e.g., Tenenbaum et al. 2011 and Chater et al. 2006, as holding a similar position for some areas of learning). Rescorla (1988) and Heyes (2012) claim to be associationists who are pro step-wise, one-shot learning (though Rescorla sees his project as a continuation of the classical conditioning program, others see his data as grist for the anti-associationist, pro-computationalist mill, see Gallistel and King 2009). Lastly, Tenenbaum and his contemporary Bayesian colleagues sometimes qualify as holding a domain-general learning position without it being associationist.25

8. Associationism in Social Psychology

Since the cognitive revolution, associationism’s influence has died out quite a bit in cognitive psychology and psycholinguistics. This is not to say that all aspects of associative theorizing are dead in these areas; rather, they have just taken on much smaller roles (for example, it has often been suggested that mental lexicons are structured, in part, associatively, which is why lexical decision tasks are taken to be facilitation maps of one’s lexicon). In other areas of cognitive psychology (for example, the study of causal cognition), associationism is no longer the dominant theoretical paradigm, but is still very much alive as a theoretical option (see Shanks 2010 for an overview of associationism in causal cognition).
Associationism is also still thriving in the connectionist literature, as well as in the animal cognition tradition. But the biggest contemporary stronghold of associationist theorizing resides in social psychology, an area which has traditionally been hostile to associationism (see, e.g., Asch 1962, 1969). The ascendance of associationism in social psychology has been a fairly recent development, and has caused a revival of associationist theories in philosophy and cognitive science. The two areas of social psychology that have seen the greatest renaissance of associationism are the implicit attitude and dual-process theory literature.

25 There are no domain-specific associationists because associative learning is incompatible with domain specificity. Domain specificity assumes different mental processes for different domains, and associative learning presupposes the same learning mechanism regardless of domain.

8.1 Implicit Attitudes

Implicit attitudes are generally operationally defined as mental representations that are unreported, inaccessible to consciousness, and detectable in paradigms such as the Implicit Association Test (Greenwald et al. 1998), the Affect Misperception Task (Payne 2009), the Sorted Paired Feature Task (Bar-Annan et al. 2009) and the Go/No-Go Association Task (Nosek and Banaji 2001). The default position among social psychologists is to treat implicit attitudes as if they are associations among mental representations (Fazio 2007), or among pairs of mental representations and valences. In particular, they treat implicit attitudes as associative structures which enter into associative transitions. Recently this issue has come under much debate (see De Houwer 2014, Mandelbaum forthcoming; see also the Implicit Attitudes entry).

8.2 Dual Process Theories

Associative structures and transitions are widely implicated in a particular type of influential dual-process theory.
Though there are many dual-process theories in social psychology (see, e.g., the papers in Chaiken and Trope, 1999, or the discussion in Evans and Stanovich 2013), the one most germane to associationism is also the most popular. It originates from work in the psychology of reasoning and is often also invoked in the heuristics and biases tradition (see, e.g., Kahneman 2011). It has been developed by many different psychological theorists (Sloman 1996, Smith and Decoster 2000, Wilson et al. 2001, Stanovich and Evans 2013) and, in parts, taken up by philosophers too (see, e.g., Gendler 2008, Frankish 2009, see also some of the essays in Evans and Frankish 2009). The dual-process strain most relevant to the current discussion posits two systems, one evolutionarily ancient intuitive system underlying unconscious, automatic, fast, parallel and associative processing, the other an evolutionarily recent reflective system characterized by conscious, controlled, slow, ‘rule-governed’ serial processes (see, e.g., Evans and Frankish 2013). The ancient system, sometimes called ‘System 1’, is often understood to include a collection of autonomous, distinct subsystems, each of which is recruited to deal with distinct types of problems (see Stanovich 2011 for a discussion of ‘TASS—the autonomous set of systems’). Although theories differ on how System 1 interacts with System 2,26 the theoretical core of System 1 is arguing that its processing is essentially associative. As in the implicit attitude debate, dual systems models have recently come under fire (see Kruglanski 2013, Osman 2013, Mandelbaum forthcoming), though they remain very popular. 9. Criticisms of Associationism Associationism has been a dominant theme in mental theorizing for centuries. As such, it has garnered an appreciable amount of criticism. 
26 For example, in a ‘default-interventionist’ model System 2 processes are not always engaged though they are in ‘parallel competitive’ models (both models include the constant automatic engagement of System 1). See Evans and Stanovich 2013 for discussion.

9.1 Learning Curves

The basic associative learning theories imply, either explicitly or implicitly, slow, gradual learning of associations (Baeyens et al. 1995). The learning process can be summarized in a learning curve which plots the frequency (or magnitude) of the conditioned response as a function of the number of reinforcements (Gallistel 2004, p. 13124). Mappings between CRs and USs are gradually built up over numerous trials (in the lab) or experiences (in the world). Gradual, slow learning has come under fire from a variety of areas (see the Garcia effect and language learning sections). However, here we just focus on the behavioral data. In a series of works reanalyzing animal behavior, Gallistel (2004, Gallistel and King 2009) has argued that although group-level learning curves do display the properties of being negatively accelerated and gradually developing, these curves are misleading because no individual’s learning curve has these properties. Gallistel has argued that learning for individuals is generally step-like, rapid, and abrupt. An individual’s learning from a low level of responding to asymptotic responding is very quick. Sometimes, the learning is so quick that it is literally one-shot learning. For example, after analyzing multiple experiments of animal learning of spatial location Gallistel writes “the learning of a spatial location generally requires but a single experience. Several trials may, however, be required to convince the subject that the location is predictable from trial to trial” (Gallistel 2004, p. 13130).
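One way a gradual group curve could coexist with abrupt individual curves is if individuals differ in when their step occurs. The following toy simulation illustrates that possibility. It is a sketch under invented assumptions, not a reanalysis of any data: the number of subjects and trials, the all-or-none response level, and the uniformly distributed onset trials are hypothetical, and the helper names are made up for the example.

```python
import random


def individual_curve(onset, n_trials):
    """A step-like learner: no responding before `onset`, full responding after."""
    return [1.0 if trial >= onset else 0.0 for trial in range(n_trials)]


def group_curve(onsets, n_trials):
    """Average the individual curves trial by trial."""
    curves = [individual_curve(onset, n_trials) for onset in onsets]
    return [sum(curve[t] for curve in curves) / len(curves) for t in range(n_trials)]


rng = random.Random(0)                               # fixed seed for repeatability
n_trials = 50
onsets = [rng.randint(5, 40) for _ in range(200)]    # assumed spread of onset trials
average = group_curve(onsets, n_trials)

# Every individual jumps from 0 to 1 in a single trial, yet the group
# average climbs through many intermediate values.
print([round(value, 2) for value in average[::5]])
```

In this sketch no simulated individual ever responds at an intermediate level, so the smooth rise in the average reflects only the spread of onset times.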
Gallistel argues that the reason the group learning curves look to be smooth and gradual is that there are large individual differences between subjects in the onset latency of the step-wise curves (ibid, p. 13125); in other words, different animals take different amounts of time for the learning to commence. The differences between individual subjects’ learning curves lie in when the steps begin and not in the speed of the individual animal’s learning process. All individuals appear to show rapid rises in learning, but since each begins their learning at different times, when we average over the group the rapid step-wise learning appears to look like slow, gradual learning (Gallistel 2004, p. 13124).

9.2 The Problem of Predication

The problem of predication is, at its core, a problem of how an associative mechanism can result in the acquisition of subject/predicate structures, structures which many theorists believe appear in language, thought, and judgment. The first major discussion of the problem appears in Kant (1781/1787), but variants of the basic Kantian criticism can be seen across the contemporary literature (see, e.g., Chomsky 1959, Fodor and Pylyshyn 1988, Fodor 2003, Mandelbaum 2013; for the details of the Kantian argument see the entry on Kant’s Transcendental Argument). For a pure associationist, association is ‘semantically transparent’ (see Fodor 2003), in that it purports to add no additional structure to thoughts. When a simple concept X and a simple concept Y become associated one acquires the associative structure X/Y. But X/Y has no additional structure on top of their contents. Knowing that X and Y are associated amounts to knowing a causal fact: that activating Xs will bring about the activation of Ys and vice versa. However, so the argument goes, some of our thoughts appear to have more structure than this: the thought BIRDS FLY predicates the property of flying onto birds.
The task for the associationist is to explain how associative structures can distinguish a thinker who has a single (complex) thought BIRDS FLY from a thinker who conjoins two simple thoughts in an associative structure where one thought, BIRDS, is immediately followed by another, FLY. As long as the two simple thoughts are reliably causally correlated so that, for a thinker, activations of BIRDS regularly bring about FLY, then that thinker has the associative structure BIRDS/FLY. Yet it appears that thinker hasn’t yet had the thought BIRDS FLY. The problem of predication is explaining how a purely associative mechanism could eventuate in complex thoughts. In Fodor’s terms the problem boils down to how association, a causal relation among mental representations, can effect predication, a relation among intentional contents (Fodor 2003). A family of related objections to associationism can be interpreted as variations on this theme. For example, problems of productivity, compositionality, and systematicity for associationist theorizing appear to be variants of the problem of predication (for more on these specific issues see the Language of Thought Hypothesis entry and the Compositionality entry). If association doesn’t add any additional structure to the mental representations that get associated, then it is hard to see how it can explain the compositionality of thought, which relies on structures that specify relations among intentional contents. Compositionality requires that the meaning of a complex thought is determined by the meanings of its simple constituents along with their syntactic arrangements. The challenge to associationism is to explain how an associative mechanism can give rise to the syntactic structures necessary to distinguish a complex thought like BIRDS FLY from the temporal succession of two simple thoughts BIRDS and FLY.
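The contrast at issue can be pictured as a difference between two data structures. The sketch below is purely illustrative and rests on modelling assumptions that are not in the text: an association is represented as an unordered pair of linked representations, and a predication as a record with distinct subject and predicate roles. The names `associate` and `Predication` are hypothetical.

```python
from typing import NamedTuple


def associate(x, y):
    """An associative link, modelled here as an unordered pair with no roles."""
    return frozenset({x, y})


class Predication(NamedTuple):
    """A structured thought in which the two constituents play different roles."""
    subject: str
    predicate: str


# The bare association carries no information about which item is predicated
# of which: BIRDS/FLY and FLY/BIRDS are one and the same link.
print(associate("BIRDS", "FLY") == associate("FLY", "BIRDS"))      # True

# The structured thought does carry that information.
print(Predication("BIRDS", "FLY") == Predication("FLY", "BIRDS"))  # False
```

On this way of putting it, the challenge is to say where the extra role information comes from if association is the only available mechanism.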
Since the compositionality of thought is posited to undergird the productivity of thought (thinkers’ abilities to think novel sentences of arbitrary lengths, e.g., GREEN BIRDS FLY, GIANT GREEN BIRDS FLY, CUDDLY GIANT GREEN BIRDS FLY, etc.), associationism has problems explaining productivity. Systematicity is the thesis that there are predictable patterns among which thoughts a thinker is capable of entertaining. Thinkers that can entertain thoughts of certain structures can always entertain distinct thoughts that have related structure. For instance, any thinker who can think a complex thought of the form ‘X transitive verb Y’ can think ‘Y transitive verb X.’ Systematicity entails that we won’t find any (human) thinker that can only think one of those two thoughts, in which case we could not find a person who could think AUDREY WRONGED MAX, but not MAX WRONGED AUDREY. Of course, these two thoughts have very different effects in one’s cognitive economy. The challenge for the associationist is to explain how the associative structure AUDREY/WRONGED/MAX can be distinguished from the structure MAX/WRONGED/AUDREY, while capturing the differences in those thoughts’ effects. Associationists have had different responses to the problem. Some have denied that human thought is actually compositional, productive, and systematic, and other non-associationists have agreed with this critique. For example, Prinz and Clark claim “concepts do not compose most of the time” (2002, 62), and Johnson (2004) argues that the systematicity criterion is wrongheaded (see Aydede 1997 for extended discussion of these issues). Rumelhart et al. offer a connectionist interpretation of ‘schemata’, one which is intended to cover some of the phenomena mentioned in this section (Rumelhart et al. 1986). Others have worked to show that classical conditioning can indeed give rise to complex associative structures (Rescorla 1988).
In defense of the associationist construal of complex associations Rescorla writes, “Clearly, the animals had not simply coded the RH [complex] compound in terms of parallel associations with its elements. Rather they had engaged in some more hierarchical structuring of the situation, forming a representation of the compound and using it as an associate” (Rescorla 1988, p. 156). Whether or not associationism has the theoretical tools to explain such complex compounds by itself is still debated (see, e.g., Fodor 2003, Mitchell et al. 2009, Gallistel and King 2009).

9.3 Word Learning

Multiple issues in the acquisition of the lexicon appear to cause problems for associationism. Some of the most well known examples are reviewed below (for further discussion of word learning and associationism see Bloom 2000).

9.3.1 Fast Mapping

Children learn words at an incredible rate, acquiring around 6,000 words by age 6 (Carey 2010, p. 184). If gradual learning is the rule, then words too should be learned gradually across this time. However, this does not appear to be the case. In a series of studies, Carey discovered the phenomenon of ‘fast mapping’, which is one-shot learning of a word (Carey 1978a, 1978b, Carey and Bartlett 1978). Her most influential example investigated children’s acquisition of ‘chromium’ (a color word referring to olive green). Children were shown one of two otherwise identical objects, which only differed in color and asked, “Can you get me the chromium tray, not the red one, the chromium one” (recited in Carey 2010, p. 2). All of the children handed over the correct tray at that time. When the children were later tested in differing contexts, more than half remembered the referent of ‘chromium.’ These findings have been extended; for example, Markson and Bloom (1997) showed that they are not specific to the remembering of novel words, but also hold for novel facts. Fast mapping poses two problems for associationism.
The first is that the learning of a new word did not develop slowly, as would be predicted by proponents of gradual learning. The second is that in order for the word learning to proceed, the mind must have been aided by additional principles not given by the environment. Some of these principles, such as Markman’s (1989) taxonomic, whole object, and mutual exclusivity constraints, and Gleitman’s syntactic bootstrapping (Gleitman et al. 2004), imply that the mind does add structure to what is learned. Consequently, the associationist claim that learning is just mapping external contingencies without adding structure is imperiled.

9.3.2 Syntactic Category Learning

‘Motherese’, the name of the type of language that infants generally hear, consists of simple sentences such as ‘Nora want a bottle?’ and ‘Are you tired?’. These sentences almost always contain a noun and a verb. Yet, the infant’s vocabulary massively over-represents nouns in the first 100 words or so, while massively under-representing the verbs (never mind adjectives or adverbs, which almost never appear in the first 100 words infants produce; see, e.g., Goldin-Meadow, Seligman, and Gelman, 1976; Bates, Dale, and Thal, 1995). Even more surprising is that the over-representation of nouns to verbs holds even though “the incidence of each word (that is, the token frequency) is higher for the verbs than for the nouns in the common set used by mothers” (Snedeker and Gleitman 2004, p. 259, citing data from Sandhoffer, Smith, and Luo 2000). Moreover, children hear a preponderance of determiners (‘the’ and ‘a’) but don’t produce them (Bloom 2000). These facts are not specific to English, but hold cross-culturally (see, e.g., Caselli et al. 1995).
The disparity between the variation of the syntactic categories infants receive as input and produce as output is troublesome to associationism, insofar as associationism is committed to the learned structures (and the behaviors that follow from them) merely patterning what is given in experience.

9.4 Against the Contiguity Analysis of Associationism

Contiguity has been a central part of associationist analyses since the British Empiricists. In the experimental literature, the problem of figuring out the parameters needed for contiguity has sometimes been termed the problem of the ‘Window of Association’ (e.g., Gallistel and King 2009). The crux of the problem is that if contiguity is to be a founding pillar of associationism, then the window needs to be relatively short. Thus specifying the temporal properties of the window is a desideratum for any empirically adequate associationist theory that involves contiguity.27 A related problem for contiguity theorists is that if the domain generality of associative learning is desired, then the window needs to be homogeneous across content domains. The late 1960s saw persuasive attacks on domain generality, as well as on the necessity and sufficiency of the contiguity criterion in general.

9.4.1 Against the Necessity of Contiguity

Research on ‘taste aversions’ and ‘bait-shyness’ raised a variety of problems for contiguity in the associative learning tradition of classical conditioning. Garcia observed that a gustatory stimulus (e.g., drinking water or eating a hot dog) but not an audiovisual stimulus (a light and a sound) would naturally become associated with feeling nauseated. For instance, Garcia and Koelling (1966) paired an audiovisual stimulus, a light and a sound, with a gustatory stimulus, flavored water. The two stimuli were then paired with the rats receiving radiation, which made the rats feel nauseated.
The rats associated the feeling of nausea with the water and not with the sound, even though the sound was contiguous with the water. Moreover, the delay between ingesting the gustatory stimulus and feeling nauseated could be quite long, with the feeling not coming on until 12 hours later (Roll and Smith 1972), and the organism needn’t even be conscious when the negative feeling arises. (For a review, see Seligman 1970, Garcia et al. 1974.) The temporal delay shows that the CS (the flavored water) needn’t be contiguous with the US (the feeling of nausea) in order for learning to occur, thus showing that contiguity isn’t necessary for associative learning.

Garcia’s work also laid bare the problems with the domain general aspect of associationism. In the above study the rat was prepared to associate the nausea with the gustatory stimulus, but would not associate it with the audiovisual stimulus. However, if one changes the US from feeling nauseated to receiving shocks in perfect contiguity with the audiovisual and gustatory stimuli, then the rats will associate the shocks with the audiovisual stimulus but not with the gustatory stimulus. That is, rats are prepared to associate audiovisual stimuli with the shock but contraprepared to associate the shocks with the gustatory stimulus. Thus, learning does not seem to be entirely domain general (for similar content specificity effects in humans, see Baeyens et al. 1990).28

27 Gallistel and King (2009, 239) argue that there is no such window. Instead they argue that what matters for learning in place of contiguity is a ratio of the time between the presentation of the CS and the appearance of the US as compared to the time between different US presentations (in a given context). For example, speeding up the CS/US connection by a factor of two reduces the number of US presentations one needs by half.

Lastly, ‘The Garcia effect’ has also been used to show problems in the learning curve (see section 9.1).
‘Taste aversions’ are the phenomena whereby an organism gets sick from ingesting the stimulus and the taste (or odor, Garcia et al. 1974) of that stimulus gets associated with the feeling of sickness. As anyone who has had food poisoning can attest, this learning can proceed in a one-shot fashion, and needn’t have a gradual rise over many trials (taste aversions have also been observed in humans, see, e.g., Bernstein and Webster 1980, Bernstein 1985, Logue et al. 1981, Rozin 1986).

9.4.2 Against the Sufficiency of Contiguity

Kamin’s famous blocking experiments (1969) showed that not all contiguous structures lead to classical conditioning. A rat that has already learned that CS1 predicts a US will not learn that a subsequent CS2 predicts the US if the CS2 is always paired with the CS1. Suppose that a rat has learned that a light predicts a shock because of the constant contiguity of the light and shock. After learning this, the rat has a sound introduced which only arises in conjunction with the light and the shock. As long as the rat had previously learned that the light predicts the shock, it will not learn that the sound does (as can be seen on later trials that have the sound alone). In sum, having learned that the CS1 predicts the US blocks the organism from learning that the CS2 predicts the US.29 So even though CS2 is perfectly contiguous with the US, the association between CS2 and the US remains unlearned, thus serving as a counterexample to the sufficiency of contiguity.30

Similarly, Rescorla (1968) demonstrated that a CS can appear only when the US appears and yet have the association between them be unlearnable. If a tone is arranged to sound only when there are shocks, but there are still shocks when there are no tones (that is, the CS only appears with the US, but the US sometimes appears without the CS), no associative learning between the CS and the US will occur.
Instead, subjects (in Rescorla 1968, rats) will only learn a connection between the shock and the experimental situation, e.g., the room in which the experiment is carried out.

In large part because of the problems discussed in 9.4, many classical conditioning theorists gave up the traditional program. Some, like Garcia, appeared to give up the classical theoretical framework altogether (Garcia et al. 1974); others, such as Rescorla and Wagner, tried to usher the framework into the modern era (see Rescorla and Wagner 1972, Rescorla 1988), where conditioning is seen as sensitive to base rates and driven by informational pick-up.31 Whether this movement is interpreted as a substantive revision of classical conditioning (Rescorla 1988, Heyes 2012) or a wholesale abandoning of it (Gallistel and King 2009) is debatable.

28 It appears that content specificity of associations needn’t just be based on innate dispositions. For example, in an evaluative conditioning paradigm using odors as USs and faces as CSs, the evaluative conditioning only commenced when the odors were interpreted as plausibly human (Todrank et al. 1995). But ‘plausibly human’ included learned information (such as the odors associated with soap). When the odors were typically associated with objects and not humans, no learning transpired. Additionally, there appear to be content-specific differences in associative learning at a greater level of abstraction: there is evidence that negative US/CS pairings are learned more quickly, and form stronger bonds, than positive US/CS pairings (Rozin 1986, Baeyens et al. 1990).

29 Blocking has been observed in humans (see Dickinson et al. 1984), but one needn’t delve into the empirical literature to feel the pull of the phenomenon. Imagine you’ve eaten an orange and immediately have an allergic reaction. If in your next meal you eat an orange and an apple and have the allergic reaction, you will be less likely to think the apple caused the reaction than you would were you to have never experienced the allergic reaction after eating the orange.

30 More problematically for associationists, blocking doesn’t always work, and when it fails is not predictable by associative theory. For example, if a weak odor is paired with a strong taste and the pairing is followed by gastrointestinal distress, the taste magnifies the sensitivity of the odor as a signal (Rusiniak et al. 1979). Relatedly, if a hawk eats a black mouse and gets sick, the hawk won’t just avoid black mice but will avoid all mice. However, if the black mouse tastes different than a white mouse, then the hawk will continue to eat white mice even after black mice make it sick (Brett et al. 1976).

9.5 Coextensionality

The Rescorla experiment also demonstrates another problem in associative theorizing: the question of why some property is singled out as a CS as opposed to different, equally contemporaneously instantiated properties. Put a different way, one needs a principle to say what the ‘same situation’ amounts to in generalizations such as Thorndike’s laws. For instance, if a CS and a US, say a tone and a shock, are perfectly paired so that they are either both present or both absent, the organism won’t associate the location in which it received shocks (e.g., the experimental setting) with getting shocked; it will just associate the tone with the shocks. But in the condition where the US occurs without the CS, but the CS does not occur without the US, the organism will gain an association between the shocks and the location. However, in both cases the location is present on every trial. In contrast to shocks, x-ray radiation, when used as a US, never appears to become associated with location, even if they are always perfectly paired (Garcia et al. 1972).32 The problem of saying which properties become associated when multiple properties are coinstantiated sometimes goes by the name the ‘Credit Assignment Problem’ (see, e.g., Gallistel and King 2009).
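Purely as an illustration, the contrast between the tone conditions described above can be mimicked with the learning rule of Rescorla and Wagner (1972), on which every cue present on a trial is updated by one shared prediction error. The sketch below is not taken from any of the cited studies. Treating the location as an ever-present cue, cutting time into discrete slices, having the tone on for about half of them, using a single learning rate, and the names `simulate`, `rate`, and `trials` are all assumptions made for the example. The second call uses equal shock probabilities with and without the tone, which is one assumed way of rendering a tone that carries no information about the shock.

```python
import random


def simulate(p_shock_with_tone, p_shock_without_tone,
             trials=20000, rate=0.01, seed=0):
    """Return (tone_strength, context_strength) after training.

    Time is cut into discrete slices (an assumption). The context cue is
    present on every slice; the tone is present on about half of them.
    """
    rng = random.Random(seed)
    v_tone = 0.0
    v_context = 0.0
    for _ in range(trials):
        tone_on = rng.random() < 0.5
        p_shock = p_shock_with_tone if tone_on else p_shock_without_tone
        shock = 1.0 if rng.random() < p_shock else 0.0
        # Every cue present on the slice contributes to one shared prediction.
        prediction = v_context + (v_tone if tone_on else 0.0)
        error = shock - prediction
        # Only cues that are present get updated, all by the same error.
        v_context += rate * error
        if tone_on:
            v_tone += rate * error
    return v_tone, v_context


if __name__ == "__main__":
    # Tone and shock perfectly paired: both present or both absent.
    print("paired:        tone=%.2f context=%.2f" % simulate(1.0, 0.0))
    # Shock equally likely with or without the tone (assumed rendering).
    print("uninformative: tone=%.2f context=%.2f" % simulate(0.4, 0.4))
```

In this sketch the tone ends up with nearly all of the associative strength in the first condition and almost none in the second, where the ever-present context absorbs it; more generally, the tone's expected strength settles near the difference between the two shock probabilities. Note that the modeler has already decided that the tone and the location are the candidate cues, so the sketch presupposes an answer to the question of which coinstantiated properties are eligible for credit, and it says nothing about the x-ray case.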
Some would argue that the Credit Assignment Problem (see note 33) is a symptom of a larger issue: trying to use extensional criteria to specify intentional content (see, e.g., Fodor 2003). Associationists need a criterion to say which of the coextensive properties will in fact be learned, and which not.

An additional worry stems from the observation that sometimes the lack of a property being instantiated is an integral component of what is learned. To deal with the problem of missing properties, contemporary associationists have introduced an important element to the theory: inhibition. For example, if a US and a CS only appear when the other is absent, the organism will learn that a negative relationship holds between them; that is, the organism will learn that the absence of the CS predicts the US.34 Here the CS becomes a ‘conditioned inhibitor’ of the US. Inhibition, using associations as modulators and not just activators, is a central part of current associationist thinking. For example, in connectionist networks, inhibition is implemented by the activation of certain nodes inhibiting the activation of other nodes. Connection weights can be positive or negative, with the negative weight standing in for the inhibitory strength of the association.

31 Oddly enough, evaluative conditioning does not seem as sensitive to base rates or as susceptible to ‘occasion setting’ as classical conditioning is (see De Houwer et al. 2001).

32 The more one looks into how locational properties become associated, the more problems seem to mount. For example, if a rat has a strong preference for a particular drink but gets shocked while ingesting that drink, the rat will not change its preference for the flavor. Instead, the rat will just learn to avoid the drink when it encounters it in the experimental location. But when the rat is given a chance to ingest the drink anywhere else (e.g., back in its home cage) it will still continue to ingest the drink. Furthermore, in the case where the rat gets shocked while drinking the highly desirable flavor in the Skinner box on trial N, the rat will increase its intake of the drink on trial N+1. This is a reasonable strategy: assuming that one knows one is going to get shocked, one might as well take in as much as possible while getting shocked. For more on these points, see Garcia et al. (1970).

33 In other versions of the problem it is understood as the problem the organism faces in trying to figure out which of its behaviors produced the environmental change that interests the organism. It also appears in problems in Artificial Intelligence (see Minsky 1963).

34 For a pure associationist, one would phrase this as the organism learning to associate the lack of CS with the US. How the pure associationist analyzes the absence of a CS while using only associative structures can also be a difficult issue.

Bibliography

Anderson, J., Spoehr, K. and Bennett, D., 1994, “A Study in Numerical Perversity: Teaching Arithmetic to a Neural Network,” in Neural Networks for Knowledge Representation and Inference, D. Levine and M. Aparicio IV (eds.), East Sussex: Psychology Press, pp. 311-335.
Armstrong, K., Kose, S., Williams, L., Woolard, A., and Heckers, S., 2012, “Impaired Associative Inference in Patients with Schizophrenia,” Schizophrenia Bulletin, 38(3): 622-629.
Asch, S., 1962, “A Problem in the Theory of Associations,” Psychologische Beitrage, (6): 553–563.
–––, 1969, “A Reformulation of the Problem of Association,” American Psychologist, 24(2): 92–102.
Aydede, M., 1997, “Language of Thought: The Connectionist Contribution,” Minds and Machines, 7(1): 57-101.
Baeyens, F., Eelen, P., Van den Bergh, O., and Crombez, G., 1990, “Flavor-Flavor and Color-Flavor Conditioning in Humans,” Learning and Motivation, 21 (4): 434-455.
Baeyens, F., Eelen, P., and Crombez, G., 1995, “Pavlovian Associations are Forever: On Classical Conditioning and Extinction,” Journal of Psychophysiology, 9(2): 127–141.
Bar-Anan, Y., Nosek, B., and Vianello, M., 2009, “The Sorting Paired Features Task: A Measure of Association Strengths,” Experimental Psychology, 56(5): 329-343.
Bates, E., and MacWhinney, B., 1987, “Competition, Variation, and Language Learning,” in B. MacWhinney (ed.), Mechanisms of Language Acquisition, Hillsdale, N.J.: Lawrence Erlbaum Associates, pp. 157-193.
Bernstein, I., and Webster, M., 1980, “Learned Taste Aversions in Humans,” Physiology and Behavior, 25(3): 363–366.
Bernstein, I., 1985, “Learned Food Aversions in the Progression of Cancer and its Treatment,” in N. Braveman and P. Bronstein (eds.), Experimental Assessments and Clinical Applications of Conditioned Food Aversions, New York: New York Academy of Sciences, pp. 365–80.
Bloom, P., 2000, How Children Learn the Meanings of Words, Cambridge: MIT Press.
Bouton, M., 2002, “Context, Ambiguity, and Unlearning: Sources of Relapse after Behavioral Extinction,” Biological Psychiatry, 52(10): 976-986.
Brett, L., Hankins, W., and Garcia, J., 1976, “Prey-Lithium Aversions. III: Buteo hawks,” Behavioral Biology, 17(1): 87-98.
Carey, S., 1978a, “Less May Never Mean More,” in R. Campbell and P. Smith (eds.), Recent Advances in the Psychology of Language, New York: Plenum Press, pp. 109-132.
–––, 1978b, “The Child as Word Learner,” in J. Bresnan, G. Miller, and M. Halle (eds.), Linguistic Theory and Psychological Reality, Cambridge: MIT Press, pp. 264-293.
–––, 2010, “Beyond Fast Mapping,” Language Learning and Development, 6(3): 184-205.
Carey, S., and Bartlett, E., 1978, “Acquiring a Single New Word,” Proceedings of the Stanford Child Language Conference, 15: 17–29.
Caselli, M. C., Bates, E., Casadio, P., Fenson, J., Fenson, L., Sanderl, L., and Weir, J., 1995, “A Crosslinguistic Study of Early Lexical Development,” Cognitive Development, 10 (2): 159-199.
Chaiken, S., and Trope, Y., (eds.), 1999, Dual-Process Theories in Social Psychology, New York: Guilford Press.
Chalmers, D., 1993, “Connectionism and Compositionality: Why Fodor and Pylyshyn Were Wrong,” Philosophical Psychology, 6(3): 305-319.
Chater, N., Tenenbaum, J., and Yuille, A., 2006, “Probabilistic Models of Cognition: Conceptual Foundations,” Trends in Cognitive Sciences, 10 (7): 287-291.
Chater, N., 2009, “Rational Models of Conditioning,” Behavioral and Brain Sciences, 32 (2): 204-205.
Churchland, P., 1989, A Neurocomputational Perspective: The Nature of Mind and the Structure of Science, Cambridge: MIT Press.
Churchland, P., and Sejnowski, T., 1990, “Neural Representation and Neural Computation,” Philosophical Perspectives, 4: 343-382.
Churchland, P., 1986, “Some Reductive Strategies in Cognitive Neurobiology,” Mind, 95 (379): 279–309.
Chomsky, N., 1959, “A Review of B.F. Skinner’s Verbal Behavior,” Language, 35(1): 26-58.
Collins, A., and Loftus, E., 1975, “A Spreading-Activation Theory of Semantic Processing,” Psychological Review, 82 (6): 407-428.
De Houwer, J., Thomas, S., and Baeyens, F., 2001, “Association Learning of Likes and Dislikes: A Review of 25 years of Research on Human Evaluative Conditioning,” Psychological Bulletin, 127(6): 853-869.
De Houwer, J., 2009, “The Propositional Approach to Associative Learning as an Alternative for Association Formation Models,” Learning & Behavior, 37(1): 1-20.
–––, 2011, “Evaluative Conditioning: A Review of Procedure Knowledge and Mental Process Theories,” in T. Schachtman and S. Reilly (eds.), Associative Learning and Conditioning Theory: Human and Non-Human Applications, New York: Oxford University Press, pp. 399-416.
–––, 2014, “A Propositional Model of Implicit Evaluation,” Social and Personality Psychology Compass, 8 (7): 342-353.
Diaz, E., Ruis, G., and Baeyens, F., 2005, “Resistance to Extinction of Human Evaluative Conditioning Using a Between-Subjects Design,” Cognition and Emotion, 19 (2): 245-268.
Dickinson, A., Shanks, D., and Evenden, J., 1984, “Judgment of Act-Outcome Contingency: The Role of Selective Attribution,” The Quarterly Journal of Experimental Psychology, 36(1): 29-50.
Dirikx, T., Hermans, D., Vansteenwegen, D., Baeyens, F., and Eelen, P., 2004, “Reinstatement of Extinguished Conditioned Responses and Negative Stimulus Valence as a Pathway to Return of Fear in Humans,” Learning and Memory, 11: 549-54.
Elman, J., 1991, “Distributed Representations, Simple Recurrent Networks, and Grammatical Structure,” Machine Learning, 7(2-3): 195-225.
Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., and Plunkett, K., 1996, Rethinking Innateness: A Connectionist Perspective on Development, Cambridge, MA: MIT Press.
Evans, G., 1982, The Varieties of Reference, J. McDowell (ed.), Oxford: Clarendon Press.
Evans, J., and Frankish, K., (eds.), 2009, In Two Minds: Dual Processes and Beyond, Oxford: Oxford University Press.
Evans, J., and Stanovich, K., 2013, “Dual-Process Theories of Higher Cognition: Advancing the Debate,” Perspectives on Psychological Science, 8(3): 223-241.
Fazio, R., 2007, “Attitudes as Object-Evaluation Associations of Varying Strength,” Social Cognition, 25(5): 603-637.
Festinger, L., and Carlsmith, J., 1959, “Cognitive Consequences of Forced Compliance,” The Journal of Abnormal and Social Psychology, 58(2): 203-210.
Field, A., and Davey, G., 1999, “Reevaluating Evaluative Conditioning: A Nonassociative Explanation of Conditioning Effects in the Visual Evaluative Conditioning Paradigm,” Journal of Experimental Psychology: Animal Behavior Processes, 25(2): 211-224.
Fodor, J., and Pylyshyn, Z., 1988, “Connectionism and Cognitive Architecture: A Critical Analysis,” Cognition, 28 (1-2): 3-71.
Fodor, J., 1983, The Modularity of Mind. Cambridge: MIT Press.
–––, 2003, Hume Variations. Oxford: Clarendon Press.
–––, and McLaughlin, B., 1990, “Connectionism and the Problem of Systematicity: Why Smolensky’s Solution Doesn't Work,” Cognition, 35(2): 183-204.
Frankish, K., 2009, “Systems and Levels: Dual-System Theories and the Personal-Subpersonal Distinction,” in J. Evans and K. Frankish (eds.), In Two Minds: Dual Processes and Beyond. Oxford: Oxford University Press, pp. 89–107.
Gallistel, C., Fairhurst, S., and Balsam, P., 2004, “The Learning Curve: Implications of a Quantitative Analysis,” Proceedings of the National Academy of Sciences of the United States of America, 101(36): 13124-13131.
Gallistel, C., and King, A., 2009, Memory and the Computational Brain: Why Cognitive Science Will Transform Neuroscience, West Sussex: Wiley Blackwell.
Garcia, J., 1981, “Tilting at the Paper Mills of Academe,” American Psychologist, 36(2): 149-158.
Garcia, J., Kovner, R., and Green, K., 1970, “Cue Properties vs Palatability of Flavors in Avoidance Learning,” Psychonomic Science, 20(5): 313-314.
Garcia, J., McGowan, B., and Green, K., 1972, “Biological Constraints on Conditioning II,” in W. Black and W. Prokasy (eds.), Classical Conditioning II: Current Research and Theory, New York: Appleton-Century-Crofts, pp. 3-27.
Garcia, J., Hankins, W., and Rusiniak, K., 1974, “Behavioral Regulation of the Milieu Interne in Man and Rat,” Science, 185(4154): 824-831.
Gendler, T., 2008, “Alief and Belief,” Journal of Philosophy, 105 (10): 634–63.
Gleitman, L., Cassidy, K., Nappa, R., Papafragou, A., and Trueswell, J., 2005, “Hard Words,” Language Learning and Development, 1(1): 23–64.
Glosser, G. and Freidman, R., 1991, “Lexical but not Semantic Priming in Alzheimer’s Disease,” Psychology and Aging, 6 (4): 522-27.
Goldin-Meadow, S., Seligman, M., and Gelman, S., 1976, “Language in the Two-Year Old,” Cognition, 4(2): 189-202.
Greenwald, A., McGhee, D., and Schwartz, J., 1998, “Measuring Individual Differences in Implicit Cognition: The Implicit Association Test,” Journal of Personality and Social Psychology, 74(6): 1464–1480.
Heyes, C., 2012, “Simple Minds: A Qualified Defence of Associative Learning,” Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603): 2695-2703.
Hull, C., 1943, Principles of Behavior, New York: Appleton-Century-Crofts.
Hume, D., 1738, A Treatise of Human Nature, L. A. Selby-Bigge (ed.), 2nd ed. revised by P. H. Nidditch, Oxford: Clarendon Press, 1975.
James, W., 1890, The Principles of Psychology (Vol. 1). New York: Holt.
Johnson, K., 2004, “On the Systematicity of Language and Thought,” Journal of Philosophy, 101 (3): 111–139.
Kahneman, D., 2011, Thinking, Fast and Slow, New York: Farrar, Straus and Giroux.
Kamin, L., 1969, “Predictability, Surprise, Attention, and Conditioning,” in B. Campbell and R. Church (eds.), Punishment and Aversive Behavior, New York: Appleton-Century-Crofts, pp. 279-296.
Kant, I., 1781/1787, Critique of Pure Reason, in P. Guyer and A. Wood (eds.), Critique of Pure Reason, New York: Cambridge University Press.
Karmiloff-Smith, A., 1995, Beyond Modularity: A Developmental Perspective on Cognitive Science, Cambridge: MIT Press/Bradford Books.
Kruglanski, A., 2013, “Only One? The Default Interventionist Perspective as a Unimodel—Commentary on Evans & Stanovich,” Perspectives on Psychological Science, 8(3): 242-247.
Locke, J., 1690, An Essay Concerning Human Understanding, in Peter H. Nidditch (ed.), An Essay Concerning Human Understanding, Oxford: Clarendon Press, 1975.
Logue, A., Ophir, I., and Strauss, K., 1981, “The Acquisition of Taste Aversion in Humans,” Behaviour Research and Therapy, 19 (4): 319-33.
Mandelbaum, E., 2013, “Against Alief,” Philosophical Studies, 165 (1): 197-211.
–––, Forthcoming, “Attitude, Inference, Association: On the Propositional Structure of Implicit Attitudes,” Nous.
Markman, E., 1989, Categorization and Naming in Children: Problems of Induction, Cambridge: MIT Press.
Markson, L., and Bloom, P., 1997, “Evidence Against a Dedicated System for Word Learning in Children,” Nature, 385 (6619): 813-815.
Marr, D., 1982, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, NY: W.H. Freeman and Co.
Mason, M., and Bar, M., 2012, “The Effect of Mental Progression on Mood,” Journal of Experimental Psychology: General, 141(2): 217-221.
McClelland, J., Botvinick, M., Noelle, D., Plaut, D., Rogers, T., Seidenberg, M., and Smith, L., 2010, “Letting Structure Emerge: Connectionist and Dynamic Systems Approaches to Cognition,” Trends in Cognitive Sciences, 14 (8): 348–356.
Minsky, M., 1963, “Steps toward Artificial Intelligence,” in E. Feigenbaum and J. Feldman (eds.), Computers and Thought, New York, NY: McGraw-Hill, pp. 406-450.
Mitchell, C., De Houwer, J., and Lovibond, P., 2009, “The Propositional Nature of Human Associative Learning,” Behavioral and Brain Sciences, 32(2): 183-246.
Nosek, B., and Banaji, M., 2001, “The Go/No-Go Association Task,” Social Cognition, 19 (6): 625-66.
Osman, M., 2013, “A Case Study: Dual-Process Theories of Higher Cognition—Commentary on Evans & Stanovich,” Perspectives on Psychological Science, 8(3): 248-252.
Pavlov, I., 1906, “The Scientific Investigation of the Psychical Faculties or Processes in the Higher Animals,” Science, 24 (620): 613-619.
Payne, B., 2009, “Attitude Misattribution: Implications for Attitude Measurement and the Implicit-Explicit Relationship,” in R. Petty, R. Fazio, and P. Briñol (eds.), Attitudes: Insights from the New Wave of Implicit Measures. Hillsdale, NJ: Erlbaum, pp. 459-484.
Perea, M., and Rosa, E., 2002, “The Effects of Associative and Semantic Priming in the Lexical Decision Task,” Psychological Research, 66(3): 180-194.
Prinz, J., 2002, Furnishing the Mind: Concepts and their Perceptual Basis. Cambridge: MIT Press.
–––, and Clark, A., 2004, “Putting Concepts to Work: Some Thoughts for the 21st Century,” Mind & Language, 19 (1): 57-69.
Rescorla, R., 1968, “Probability of Shock in the Presence and Absence of CS in Fear Conditioning,” Journal of Comparative and Physiological Psychology, 66(1): 1-5.
–––, 1988, “Pavlovian Conditioning: It's Not What You Think It Is,” American Psychologist, 43(3): 151-160.
Rescorla, R., and Wagner, A., 1972, “A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement,” in W. Black and W. Prokasy (eds.), Classical Conditioning II: Current Research and Theory, New York: Appleton-Century-Crofts, pp. 64-99.
Roll, D., and Smith, J., 1972, “Conditioned Taste Aversion in Anesthetized Rats,” in M. Hager and J. Seligman (eds.), Biological Boundaries of Learning. New York: Appleton-Century-Crofts, pp. 98-102.
Rozin, P., 1986, “One-Trial Acquired Likes and Dislikes in Humans: Disgust as a US, Food Predominance, and Negative Learning Predominance,” Learning and Motivation, 17(2): 180-189.
Rumelhart, D., Smolensky, P., McClelland, J., and Hinton, G., 1986, “Sequential Thought Processes in PDP Models,” in J. McClelland and D. Rumelhart (eds.), Parallel Distributed Processing Vol. 2: Explorations in the Microstructure of Cognition: Psychological and Biological Models, Cambridge: MIT Press, pp. 7-57.
Rusiniak, K., Hankins, W., Garcia, J., and Brett, L., 1979, “Flavor-illness Aversions: Potentiation of Odor by Taste in Rats,” Behavioral and Neural Biology, 25(1): 1-17.
Rydell, R. and McConnell, A., 2006, “Understanding Implicit and Explicit Attitude Change: A Systems of Reasoning Analysis,” Journal of Personality and Social Psychology, 91 (6): 995-1008.
Sandhoffer, C., Smith, L., and Luo, J., 2000, “Counting Nouns and Verbs in the Input: Differential Frequencies, Different Kinds of Learning?” Journal of Child Language, 27 (3): 561-585.
Seligman, M., 1970, “On the Generality of the Laws of Learning,” Psychological Review, 77 (5): 406-418.
Shanks, D., 2010, “Learning: From Association to Cognition,” Annual Review of Psychology, 61: 273–301.
Skinner, B., 1938, The Behavior of Organisms: An Experimental Analysis. Oxford: Appleton-Century.
–––, 1953, Science and Human Behavior. New York: Simon and Schuster.
Sloman, S., 1996, “The Empirical Case for Two Systems of Reasoning,” Psychological Bulletin, 119 (1): 3-22.
Smith, E. R. and DeCoster, J., 2000, “Dual-Process Models in Social and Cognitive Psychology: Conceptual Integration and Links to Underlying Memory Systems,” Personality and Social Psychology Review, 4(2): 108-131.
Smith, J., and Roll, D., 1967, “Trace Conditioning with X-rays as an Aversive Stimulus,” Psychonomic Science, 9(1): 11-12.
Smolensky, P., 1988, “On the Proper Treatment of Connectionism,” Behavioral and Brain Sciences, 11(1): 1-23.
Snedeker, J., and Gleitman, L., 2004, “Why it is Hard to Label Our Concepts,” in D. Hall and S. Waxman (eds.), Weaving a Lexicon, Cambridge, MA: MIT Press, pp. 257-294.
Stanovich, K., 2011, Rationality and the Reflective Mind. New York: Oxford University Press.
Tenenbaum, J., Kemp, C., Griffiths, T., and Goodman, N., 2011, “How to Grow a Mind: Statistics, Structure, and Abstraction,” Science, 331(6022): 1279-1285.
Thorndike, E., 1911, Animal Intelligence: Experimental Studies. New York: Macmillan.
Todrank, J., Byrnes, D., Wrzesniewski, A., and Rozin, P., 1995, “Odors can Change Preferences for People in Photographs: A Cross-Modal Evaluative Conditioning Study with Olfactory USs and Visual CSs,” Learning and Motivation, 26(2): 116-140.
Tolman, E., 1948, “Cognitive Maps in Rats and Men,” Psychological Review, 55(4): 189-208.
Van Gelder, T., 1995, “What Might Cognition Be, If not Computation?,” The Journal of Philosophy, 91 (7): 345-381.
Vansteenwegen, D., Francken, G., Vervliet, B., De Clercq, A., and Eelen, P., 2006, “Resistance to Extinction in Evaluative Conditioning,” Journal of Experimental Psychology: Animal Behavior Processes, 32(1): 71-79.
Wilson, T., Lindsey, S., and Schooler, T., 2000, “A Model of Dual Attitudes,” Psychological Review, 107 (1): 101-26.

Other Internet Resources

John Locke’s Chapter on the Association of Ideas from An Essay Concerning Human Understanding: http://oregonstate.edu/instruct/phl302/texts/locke/locke1/Book2c.html#Chapter XXXIII
David Hume’s A Treatise of Human Nature: http://www.earlymoderntexts.com/authors/hume.html
William James “The Stream of Consciousness”: http://psychclassics.yorku.ca/James/jimmy11.htm
William James “The Stream of Thought”: http://psychclassics.asu.edu/James/Principles/prin9.htm (chapter from his Principles of Psychology)
Edward Thorndike on the Law of Effect (from his book Animal Intelligence): http://psychclassics.yorku.ca/Thorndike/Animal/chap5.htm
Ivan Pavlov’s “Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex”: http://psychclassics.yorku.ca/Pavlov/

Related Entries

Behaviorism | Compositionality | Computational Theory of Mind | Connectionism | David Hume | J.S. Mill | Kant’s Transcendental Argument | Logical Form | 19th Century Scottish Philosophy

Acknowledgments

Helpful feedback was received from Michael Brownstein, Bryce Huebner, Zoe Jenkin, Jake Quilty-Dunn, Shaun Nichols, and Susanna Siegel, who are hereby thanked for their efforts.

Eric Mandelbaum