Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Journal of Experimental Psychology: Learning, Memory, and Cognition 2013, Vol. 39, No. 3, 663– 677 © 2012 American Psychological Association 0278-7393/13/$12.00 DOI: 10.1037/a0028578 Effects of Emotional Valence and Arousal on Recollective and Nonrecollective Recall Carlos F. A. Gomes and Charles J. Brainerd Lilian M. Stein Cornell University Pontifical Catholic University of Rio Grande do Sul The authors investigated the effects of valence and arousal on memory using a dual-process model that quantifies recollective and nonrecollective components of recall without relying on metacognitive judgments to separate them. The results showed that valenced words increased reconstruction (a component of nonrecollective retrieval) relative to neutral words. In addition, the authors found that positive valence increased recollective retrieval in comparison to negative valence, whereas negative valence increased nonrecollective retrieval relative to positive valence. The latter effect, however, depended on arousal: It was reliable only when arousal was high. The present findings supported the notion that emotional valence is a conceptual gist because it affected nonrecollective retrieval and because subjects’ recall protocols were clustered by valence. The results challenge the hypothesis that valence affects only recollection, and they clarify previous inconsistent findings about the effects of emotion on memory accuracy and brain activity. Keywords: emotion, memory, valence, recall, dual-retrieval models iment by Bradley, Greenwald, Petry, and Lang (1992) were asked to recall 60 pictures that they had rated according to valence and arousal. Recall of pictures rated as either negative or positive was better than recall of pictures rated as neutral. Similarly, Xu, Zhao, Zhao, and Yang (2011) found that recognition memory for negative and positive words was better than for neutral words. Therefore, arousal and valence both seem to improve memory for words. The relative effects of positive and negative valence on memory have been the focus of research since the 1920s (Flügel, 1925; Wohlgemuth, 1923).1 Much of this early work was centered on the Freudian prediction that people tend to forget more unpleasant than pleasant experiences, owing to the defense mechanism of repression. More simply, the Freudian hedonic memory hypothesis anticipated that negatively valenced information would be more poorly remembered than positively valenced information, a prediction that is also made by more contemporary hypotheses such as the Pollyanna principle (Matlin & Stang, 1978) and the fading affect bias (Walker, Skowronski, & Thompson, 2003). In that connection, Thomson (1930; see also Sharp, 1938; Stagner, 1933) investigated long-term retention of self-generated negative and positive words. During the first phase of the study, subjects were required to write 20 positive words and 20 negative words. During the second phase, 1 month later, they recalled as many of the words generated in the first phase as possible. Recall of positive words (38%) was higher than recall of negative words (24%). Of How does emotion affect episodic memory? In order to answer this question, memory researchers have often studied the effects of two key components of emotion: arousal and valence (Kensinger, 2009). Arousal is defined as the level of activation that is generated when a stimulus is encoded, which varies from calm (low) to excited (high). Research on the effects of arousal have suggested that memory performance for items high in arousal is better than for items low in arousal (Buchanan, Etzel, Adolphs, & Tranel, 2006; Cahill, Babinsky, Markowitsch, & McGaugh, 1995; Cahill & McGaugh, 1995). In an experiment by Cahill et al. (1996), for example, subjects watched 24 film clips in which half were high in arousal (e.g., animal mutilation and violent crimes) and half were low in arousal (e.g., court proceedings and travelogues). Three weeks later, subjects recalled more from the former. Valence, however, is defined as the level of pleasantness that is generated when a stimulus is encoded, which varies from strongly negative to neutral to strongly positive. Studies of the effects of valence have indicated that memory performance for both negative and positive items is usually better than for neutral items (Kensinger & Corkin, 2003; Ochsner, 2000). For instance, subjects in an exper- This article was published Online First May 21, 2012. Carlos F. A. Gomes and Charles J. Brainerd, Department of Human Development, Cornell University; Lilian M. Stein, Postgraduate Program in Psychology, Pontifical Catholic University of Rio Grande do Sul, Brazil. Portions of the research were supported by National Institutes of Health Grant 1RC1AG036915-02 and by Brazilian National Council for Scientific and Technological Development Grants CNPq; 107804/2007-7, 500486/ 2007-7. The software that was used to conduct the analyses of the dualretrieval model that are reported in this article (goodness-of-fit tests, parameter estimation, and significance tests of parameter values) is available from the first and second authors upon request. Correspondence concerning this article should be addressed to Carlos F. A. Gomes, Department of Human Development, Cornell University, Ithaca, NY 14853. E-mail: [email protected] 1 In fact, the earliest study we could find on the relationship between valence and memory was reported by Colegrove (1899). This study was a survey of 1,658 individuals in which one of the questions asked: “Do you remember pleasant or unpleasant experiences better?” (p. 243). The results revealed that pleasant experiences were reported as being better remembered than unpleasant experiences. Although this result was observed for children, young adults, and older adults, it peaked during adolescence and young adulthood. 663 664 GOMES, BRAINERD, AND STEIN course, such data do not provide evidence for or against a repression mechanism, nor any other psychological process, and the reason why positive events might be better remembered than negative ones remained unclear. Of late, dual-process conceptions have been invoked to explain emotion’s effects on episodic memory. According to such conceptions, episodic memories can be accompanied by mental reinstatement of contextual features of prior experience (recollective retrieval) or not (nonrecollective retrieval). Studies that have investigated the role of dual processes in the effects of emotion on memory have followed the traditional procedure of using subjects’ metacognitive judgments about their memories, especially remember/know judgments, to separate recollective from nonrecollective retrieval. Most experiments in which metacognitive judgments were used to quantify recollective and nonrecollective retrieval have focused on valence rather than arousal (though Ochsner, 2000, reported that high-arousal words elevated rates of remember judgments relative to low-arousal words). In Kensinger and Corkin’s (2003) research, for instance, subjects studied negative and neutral words and then performed a recognition test on which they made remember/know judgments about words that were recognized as old. Remember rates were higher for negative than for neutral words, but the two know rates did not differ, suggesting that negative valence stimulates increased recollective retrieval. In the same vein, Ochsner found higher remember rates for negative and positive words relative to neutral words, and vivid recollection of negative information is documented in other experiments (Dewhurst & Parry, 2000; Rimmele, Davachi, Petrov, Dougal, & Phelps, 2011; Sharot & Yonelinas, 2008). One explanation advanced in the literature (e.g., Christianson, 1992; Ochsner, 2000) is that items that generate an emotional reaction possess unique features (e.g., physiological reactions, associations with autobiographical memories) that render them more salient or distinctive than neutral items. The presence of such distinctive features boosts recollective retrieval and, consequently, remember judgments. However, the idea that emotion only enhances recollective retrieval is inconsistent with two other lines of investigation, namely studies on the effects of emotion on brain activity and on memory accuracy. If emotion only enhances recollective retrieval, memory performance for emotional items should be particularly associated with activity in regions linked to recollection or visual and sensory processing (e.g., hippocampal, parahippocampal, and temporooccipital regions; Macaluso, Frith, & Driver, 2000; Yonelinas, Hopfinger, Buonocore, Kroll, & Baynes, 2001) but not with regions linked to familiarity or semantic processing (e.g., prefrontal cortex [PFC]; Cabeza & Nyberg, 2000; Goldberg, Perfetti, Fiez, & Schneider, 2007; Yonelinas, Otten, Shaw, & Rugg, 2005). However, functional magnetic resonance imaging data suggest that memory enhancements due to valence are associated with increased activity in regions within the PFC, and memory enhancements due to arousal are associated with increased activity in the amygdala (Dolcos, LaBar, & Cabeza, 2004; Kensinger & Corkin, 2004). In addition, if emotion enhances recollective but not nonrecollective retrieval, memory accuracy for emotional items should be better in comparison to nonemotional items. However, several experiments have implicated negative valence in reduced memory accuracy. For instance, it has been reported that negative valence leads to more false memories of gist-consistent material than neutral or positive valence (Brainerd, Holliday, Reyna, Yang, & Toglia, 2010; Brainerd, Stein, Silveira, Rohenkohl, & Reyna, 2008; Dehon, Laroi, & Van der Linden, 2010; Goodman et al., 2011; Otgaar, Candel, & Merckelbach, 2008; but see Palmer & Dodson, 2009). Moreover, negative valence has been found to increase subjective reports of confidence and vividness without increasing memory accuracy. Talarico and Rubin (2003) found that when college students’ recollections of the events of September 11 were compared with their recollections of everyday events, confidence and vividness ratings were higher for the September 11 events than for the everyday events, but there was no difference in accuracy between them. To explain the contradictory effects of emotion on memory accuracy and metacognitive judgments, Phelps and Sharot (2008) concluded that emotion’s impact on metacognitive judgments is more robust and consistent than its impact on accuracy for objective details (see also Zimmerman & Kelley, 2010). The latter explanation, however, does not elucidate how emotion affects the mechanisms underlying memory performance. Although valence and arousal have been factorially manipulated in some experiments (for a review, see Kensinger, 2004), the confounding of these variables has been a persistent problem that makes theoretical interpretation uncertain. For example, in Cahill et al.’s (1996) study, high-arousal film clips were more negative than low-arousal film clips. In Cahill and McGaugh’s (1995) procedure, memory for emotionally arousing versus nonarousing pictures was assessed by comparing memory for pictures of a negative arousing event (a car accident, with accompanying brain damage and limb amputation) with pictures of neutral events. In Pierce and Kensinger (2011), negative and positive words had higher arousal values on the affective norms for English words (ANEW; Bradley & Lang, 1999) than neutral words. Similarly, in Xu et al.’s (2011) study, positive words were lower in arousal relative to negative ones. Finally, in Thomson’s (1930) study, it is unclear whether the positive words generated by the subjects also differed in arousal relative to the negative words that they generated. Other studies in which materials that differ in valence also differ in arousal include Sharot, Delgado, and Phelps (2004); Sharot, Varfaellie, and Yonelinas, (2007), and Dewhurst and Parry (2000). Importantly, in studies in which valence and arousal have been factorially manipulated, they have often been found to interact (e.g., Brainerd et al., 2010; Libkuman, Stabler, & Otani, 2004; Steinmetz, Addis, & Kensinger, 2010; Waring & Kensinger, 2011). For instance, Brainerd et al. (2010) found that the tendency of negative valence to elevate false memories was greater for highthan for low-arousal items. The Present Experiments In the research that we report, we investigated the effects of emotional arousal and valence on recollective and nonrecollective retrieval, using procedures that had two salient features. The first is that the effects of valence were assessed without the confounding influence of arousal, using lists that had been constructed from emotional word norms. In Experiment 1, valence levels were manipulated from negative to neutral to positive, with arousal level controlled. In Experiment 2, valance and arousal levels were manipulated factorially. The second feature, which requires further explanation, is that we used a new model-based procedure for measuring recollective and nonrecollective retrieval (Brainerd, EMOTIONAL RECALL Reyna, & Howe, 2009) to investigate how these processes are affected by valence and arousal. As mentioned, the traditional approach to measuring dualretrieval processes involves requesting supplementary metacognitive judgments following old/new recognition decisions about previously encoded material, typically remember/know judgments and occasionally confidence judgments. The use of such supplementary judgments is predicated on the notion that they separate the contributions of recollective and nonrecollective retrieval (e.g., Yonelinas, 2002). Recently, however, this assumption has stimulated a lively debate. Critiques of all of the standard supplementary judgment techniques have been published, with critiques of remember/know judgments having a particularly long history (e.g., Donaldson, 1996; Dunn, 2004, 2008; Rotello, Macmillan, & Reeder, 2004). As matters now stand, there are multiple reasons for doubting that the conventional strategy of supplementing old/ new recognition decisions with metacognitive judgments is a valid method of separating recollective from nonrecollective retrieval (Brainerd & Reyna, 2010; Malmberg, 2008). This, in turn, raises doubts about the meaning of the aforementioned findings on how emotional content affects recollective and nonrecollective retrieval. Brainerd et al. (2009) proposed an approach to separating recollective and nonrecollective processes that avoids extant criticisms of the metacognitive judgment strategy because (a) it uses recall rather than recognition as the focal memory task and (b) it does not require that subjects make metacognitive judgments. This recall-based procedure has recently been used to determine how several variables affect recollective and nonrecollective retrieval, including repetition, semantic organization of lists, list length, exemplar typicality, chronological age, explicitness of retrieval cues, and immediate versus delayed testing (e.g., Brainerd & Reyna, 2010; Brainerd, Aydin, & Reyna, in press). The procedure works as follows. Subjects receive a small number of study-test trials (a minimum of three) on a focal list, using any standard method of recall (e.g., associative, cued, free, serial). The resulting sequences of errors and successes are then analyzed with a simple two-stage Markov 665 chain, which is shown in Appendix A (Equation A1). This model posits that learning to recall an item consists of three performance states: an initial zero-recall state U; an intermediate state P, in which the probability of recalling the item is some value 0 ⬍ p ⬍ 1; and an absorbing errorless-recall state L. Items are assumed to be in state U before learning begins. After the first trial or after any subsequent trial, an item that is in U may have moved to P, to L, or remained in U. Items that move to L remain there on all subsequent trials, items that move to P can move to L on some subsequent trial, and items that remain in U can move to P or L on some subsequent trial. In this model, recollective retrieval is identified with state L, and nonrecollective retrieval is identified with state P (for a review of relevant data, see Brainerd et al., 2009). The model parameters that measure these processes are defined in Table 1. Recollective retrieval is measured with parameters that give the probability of moving to state L from state P or state U, which are called direct access parameters because this form of retrieval is assumed to activate items’ verbatim traces without searching through and comparing traces of other items. This form of retrieval generates recollective phenomenology. Nonrecollective retrieval, however, involves two component processes: reconstruction and familiarity. Reconstruction is measured via model parameters that give the probability of moving to state P from state U, and familiarity is measured by parameters that give the probability of successful recall of items that are in state P. Reconstructive retrieval is a delimited search operation that uses partial information about list items (e.g., “seasoning”) to construct sets of recall candidates (e.g., basil, oregano, pepper, sage). Because such information does not identify specific items, a further operation, familiarity, is required to decide which reconstructions ought to be output and which should be withheld. Reconstructed items are assumed to generate familiarity signals. Reconstructions that deliver stronger signals are more likely to be output than those that deliver weaker ones. Thus, nonrecollective retrieval consists of reconstruction ⫹ familiarity judgment. These operations support the error-prone recall of state P because reconstruction may generate candidate sets that do Table 1 Retrieval Processes That are Measured by the Markov Chain Process/parameter Definition Recollective retrieval Direct access D1 D2 The probability that an item’s verbatim trace can be accessed after the first study trial. For an item whose verbatim trace could not be accessed following prior study trials, the probability that its verbatim trace can be accessed after the current study trial. Nonrecollective retrieval Reconstruction R1 R2 Familiarity judgment J1 J2 For an item whose verbatim trace cannot be accessed after the first study trial, the probability that the item can be reconstructed after the first study trial. For an item whose verbatim trace cannot be accessed after the current study trial and that could not be reconstructed after prior study trials, the probability that the item can be reconstructed after the current study trial. For an item that is reconstructed following the first study trial, the probability that the reconstruction is judged to be familiar enough to output. For an item that is reconstructed following any study trial after the first one, the probability that the reconstruction is judged to be familiar enough to output. GOMES, BRAINERD, AND STEIN 666 not contain correct list items (e.g., garlic) and because familiarity judgment may screen out candidates that are actually list items. Experiment 1 In this experiment, we manipulated valence (negative, neutral, positive) and controlled arousal, the aim being to investigate how valence affects the retrieval processes that underlie recall. From previous studies of valence effects, we hypothesized that overall recall performance for valenced words (negative and positive) would be better than for nonvalenced words. However, our main objective was to measure how the two types of valence affect recollective and nonrecollective retrieval, using the parameters in Table 1. Method Subjects. Forty undergraduates (22 females) from a Brazilian university volunteered to participate in the experiment. Subjects were aged between 17 and 26 years (M ⫽ 19 years; SD ⫽ 2 years) and were required to give written informed consent to participate in the experiment. Materials. Each subject learned to recall a list of 34 words drawn from the Brazilian version of the ANEW (Kristensen, Gomes, Vieira, & Justo, 2011). The original words in Portuguese and their English translations are presented in Appendix B. The study list was composed of 10 negative words (Mvalence ⫽ 2.02; Marousal ⫽ 5.36), 10 neutral words2 (Mvalence ⫽ 5.04; Marousal ⫽ 5.28), 10 positive words (Mvalence ⫽ 7.98; Marousal ⫽ 5.36), and four filler words that were always presented in the first, second, 33rd, and 34th position in order to minimize primacy and recency effects. Analyses of variance (ANOVAs) indicated that the negative, neutral, and positive words differed with respect to valence but not arousal. In addition, an investigation of whether valence conditions differed with respect to five other dimensions was undertaken, namely semantic relatedness, word length, frequency, concreteness, and imagery. Semantic relatedness was measured by computing the frequency of six types of conceptual relations (antonymy, synonymy, taxonomic properties, entity properties, situation properties, and introspective properties; Wu & Barsalou, 2009) between all 55 possible pairs of words within each list. The scoring showed that the frequency of conceptual relations was low and similar across word lists. Similarly, ANOVAs indicated that there were no reliable differences among the positive, negative, and neutral words regarding word length, printed and spoken frequency in Portuguese (Sardinha, 2003), concreteness (Toglia & Battig, 1978), and imagery (Clark & Paivio, 2004). Procedure. Subjects were tested in groups and underwent a standard three-trial study-test procedure that has been used in recent experiments in which recollective and nonrecollective retrieval have been measured with the present procedure (e.g., Brainerd et al., in press; Brainerd & Reyna, 2010; Brainerd et al., 2009). The procedure consists of three pairs of study-test cycles, with a short buffer activity interpolated between each study and test cycle to eliminate short-term memory effects. The overall procedure was S1B1aT1a B1bT1b, S2B2T2, S3B3T3, where S denotes a study cycle on the word list, B denotes a buffer activity (30 s of letter shadowing), and T denotes a free-recall test. This design has important advantages over a canonical design (S1B1T1S2B2T2S3B3T3). Specifically, it generates additional data (T1b) that allows the Markov chain to be fit twice across study-test cycles (i.e., across T1aT2T3 and T1bT2T3), and, therefore, it generates two sets of parameter estimates that will differ if subjects tend to forget items when additional recall tests are imposed between study cycles. During study cycles, subjects viewed words one at a time at a 3-s rate (with a 1-s interword interval) on a computer screen and were told to pay close attention because their memory for the words would be tested later. With the exception of filler words, the presentation order of 30 focal words was random. During test cycles, subjects performed written free recall. They were told to write down as many studied words as possible and not to worry about spelling. Written recall continued until 30 s had elapsed without the emission of a new word. Results The results are presented in three parts. First, we report standard descriptive statistics and ANOVAs that assessed the qualitative effects of emotional valence on recall. Second, we report the model-based results, which assessed the effects of valence on recollective and nonrecollective retrieval. The latter analysis is divided into model fit and process-level analyses. Third, we report an additional analysis that evaluated the effects of valence on clustering in free recall, which is an index of semantic processing in recall. For all statistical tests, we adopted a .05 level of significance. Descriptive statistics and ANOVAs. We measured recall performance with three descriptive statistics: the proportion of words recalled on each test, the mean trial of last error (MTLE), and the mean total errors per item (MTEI). We computed all three measures separately for T1aT2T3 and T1bT2T3 and then averaged (T1T2T3). The average proportion of negative, neutral, and positive words recalled as a function of test trial are shown in Table 2. Inspection of Table 2 indicates better recall for positive than for negative valence and better recall for negative than for neutral valence. We submitted the recall proportion data to a 3 (valence: negative, neutral, positive) ⫻ 3 (test trial: T1, T2, T3) repeated measures ANOVA. There was a main effect of valence, F(2, 78) ⫽ 52.59, MSE ⫽ .02, 2p ⫽ .57, and a main effect of test trial, F(2, 78) ⫽ 251.60, MSE ⫽ .02, 2p ⫽ .87. Regarding the main effect of valence, paired comparisons showed that subjects recalled more positive (M ⫽ 56%) than negative words (M ⫽ 47%), and recalled more negative than neutral words (M ⫽ 35%). Naturally, the proportion of words recalled by subjects increased from Trial 1 (M ⫽ 26%) to Trial 2 (M ⫽ 51%), and from Trial 2 to Trial 3 (M ⫽ 62%). 2 In prior studies in which valenced words have been compared with arousal controlled (e.g., Brainerd et al., 2008), the procedure has been the same as the one used in Experiment 1. Specifically, sets of negative, “neutral,” and positive words have been compared in which all three sets have intermediate arousal scores (close to 5 on the 9-point ANEW scale). Thus, the term neutral refers to valence rather than arousal. However, “neutral” can have an arousal meaning, too, and words that are neutral in that sense are low-arousal words, whereas words of intermediate arousal levels are better conceptualized as “ambivalent.” In summary, the neutral words in Experiment 1 were neutral with respect to valence, not arousal, and if the arousal meaning is added, then the present set of neutral words could be termed neutral-ambivalent. EMOTIONAL RECALL 667 Table 2 Recall Performance Statistics as a Function of Experimental Condition Proportion recalled Condition Negative Neutral Positive T1 .26 (.13) .18 (.14) .33 (.15) Overall recall T2 T3 Experiment 1 .52 (.18) .65 (.19) .39 (.22) .49 (.21) .63 (.16) .71 (.18) MTLE MTEI 1.75 (.42) 2.10 (.52) 1.53 (.43) 1.58 (.41) 1.94 (.50) 1.34 (.37) Experiment 2 Negative Low arousal High arousal Positive Low arousal High arousal .23 (.13) .16 (.13) .38 (.16) .35 (.17) .53 (.16) .51 (.20) 1.99 (.37) 2.12 (.41) 1.86 (.36) 1.99 (.42) .22 (.12) .23 (.14) .42 (.18) .45 (.16) .53 (.19) .58 (.18) 1.96 (.41) 1.87 (.39) 1.83 (.40) 1.75 (.39) Note. MTLE ⫽ mean trial of last error; MTEI ⫽ mean total errors per item. Standard deviation appears in parentheses. We also found a main effect of valence using the MTLE and the MTEI measures. Summary statistics for MTLE and MTEI are presented in Table 2. Inspection of Table 2 suggests that learning to recall words was easier for positive than for both negative and neutral valence, and easier for negative than for neutral valence. We conducted two repeated measures ANOVAs with valence (negative, neutral, positive) as the independent variable and either MTLE or MTEI as the dependent variable. The analyses revealed a main effect of valence for MTLE, F(2, 78) ⫽ 30.98, MSE ⫽ .11, 2p ⫽ .44, and for MTEI, F(2, 78) ⫽ 52.59, MSE ⫽ .07, 2p ⫽ .57. Paired comparisons indicated that, independent of the type of performance measure used, positive recall was better than negative or neutral recall, and negative recall was better than neutral recall. Summing up, recall of valenced words (positive or negative) was better than recall of neutral words, regardless of the performance measure. Furthermore, recall of positive words was better than recall of negative words, regardless of the performance measure. Model analyses. In this section, we used the Markov chains in Appendix A (see also Brainerd et al., in press) to measure the effects of valence on recollective and nonrecollective retrieval. The Markov chains were applied to the outcome space implied by T1aT2T3 and T1bT2T3 free-recall tests, separately. Because parameter estimates were very similar between the two, which happens when forgetting between T1a and T1b is roughly zero (a common finding with healthy young adults; e.g., Brainerd et al., 2009), we pooled the data of both spaces and applied the Markov chains to the pooled data to generate the final set of parameter estimates. The intrusion rate across trials was very low (less than 1% of the words recalled) and, therefore, did not affect model-based results. As is customary, the model analyses proceed in two steps: (a) fit followed by (b) parameter estimation and significance testing. Fit. As is also customary (see Brainerd & Reyna, 2010), there were two types of fit analyses: necessarity tests and sufficiency tests. Necessity tests ask whether simpler models, which assume that recall involves only a single retrieval process, can account for the present data. If so, the dual-retrieval model fails on grounds of parsimony. As shown in Appendix A (Equations A12 and A14), there are two such one-process models that can be fit to the present data—namely, a model that posits a single recollective retrieval operation and a model that posits a single nonrecollective retrieval operation. The recollection-only model produces a G2 test statistic with 5 degrees of freedom (see Equation A15), and the nonrecollection-only model produces a G2 test statistic with 3 degrees of freedom (see Equation A13). We computed those tests for all three valence conditions. Because the G2 statistic is asymptotically distributed as 2 (Riefer & Batchelder, 1988), the critical value for rejection of the null hypothesis of fit at the .05 level for any of the conditions of this experiment is 11.07 for the oneprocess recollective model and 7.82 for the one-process nonrecollective model. As there were three conditions, the critical value of G2 to reject the recollective model for the experiment as a whole is 25.00, and the critical value to reject the nonrecollective model for the experiment as whole is 16.92. The null hypothesis of fit was rejected for the experiment as a whole, with the values of the G2 statistics exceeding their critical values by wide margins, G2(15) ⫽ 1655.38, and G2(9) ⫽ 129.17. We also computed G2 statistics for the one-process models for each experimental condition. The values for the negative, neutral, and positive valence condition were, respectively, 614.19, 386.18, and 655.01 for the recollective model, and 73.88, 34.00, and 21.29 for the nonrecollective model. Therefore, the null hypothesis of fit was rejected for the oneprocess models in each and every experimental condition, and the dual-retrieval model consequently passed the necessity tests. Turning to sufficiency tests, these tests ask whether the dualretrieval model is adequate to account for the data, or whether a more complex model (i.e., a model with additional retrieval processes) is needed. This test (see Equation A11) produces a G2 statistic with one degree of freedom, which is therefore distributed as 2(1), with a critical value of 3.84 to reject the dual-retrieval model for any of the conditions of this experiment. Because there are three conditions, a G2 value of 7.82 rejects the model for the experiment as whole. The G2 value for the experiment as a whole was less than half of the critical value, 3.11. The G2 values for the negative, neutral, and positive valence condition were .58, .04, and GOMES, BRAINERD, AND STEIN 668 2.49, respectively. Therefore, the dual-retrieval model passed the sufficiency tests as well as the necessity tests. Parameter estimates and significance tests. We now turn to the process-level analysis of the effects of valence on recall. Table 3 shows maximum likelihood estimates for the parameters of the dual-retrieval model for each experimental condition, which will be used to determine how valence affects recollective and nonrecollective retrieval. The grand average of the standard deviation of estimates across conditions was .11. There are two important comparisons in Table 3. The first is between valenced (positive, negative) and nonvalenced words (neutral), and the second is between positively and negatively valenced words. Results for the former comparison explain why valence improves recall, whereas results for the latter explain why positive valence improves recall more than negative valence. It turned out that the general superiority of valenced words was due to advantages in nonrecollective retrieval, whereas the superiority of positive over negative valence was due to advantages in recollective retrieval. In what follows, we present two sources of evidence that demonstrate this pattern: likelihood ratio tests of parameter differences among experimental conditions and recall performance decomposition in terms of recollective and nonrecollective recall. Statistical reliability of differences in parameter estimates was determined with likelihood ratio tests. In brief, between-condition likelihood ratio tests compare the goodness of fit of a joint model in which one parameter is set equal between any two conditions (e.g., D1 positive ⫽ D1 negative) against the fit of a joint model in which all parameters are free to vary between the same two conditions, which produces a G2 test statistic that is asymptotically distributed as 2(1). Inspection of the parameter values in Table 3 reveals three main results that were investigated with likelihood ratio tests. First, direct access parameters (recollective retrieval) were higher for positive valence relative to neutral (⌬M ⫽ .20) and negative valence (⌬M ⫽ .25). There was a reliable difference between positive and neutral valence for D1, G2(1) ⫽ 5.42, and D2, G2(1) ⫽ 4.44, and a reliable difference between positive and negative valence for D1, G2(1) ⫽ 7.38, and D2, G2(1) ⫽ 3.93. Second, reconstruction (nonrecollective retrieval) was lower for neutral relative to positive valence (⌬M ⫽ .09). There was a reliable difference in R1 for positive versus neutral valence, G2(1) ⫽ 4.68. Third, familiarity judgment (nonrecollective retrieval) was higher for negative relative to positive (⌬M ⫽ .40) and neutral valence (⌬M ⫽ .27). There was a reliable difference between negative and positive valence for J1, G2(1) ⫽ 6.45, and J2, G2(1) ⫽ 5.04. In addition, there was a reliable difference in J2 for negative versus neutral valence, G2(1) ⫽ 3.93. The parameter estimates in Table 3 can be used to decompose recall performance on each test trial into its recollective and nonrecollective components (see Brainerd & Reyna, 2010). When the parameter estimates are inserted in the dual-retrieval model’s starting vector W and the transition matrix M (see Appendix A Equation A1) and the two are multiplied, a new vector Z is generated that has four elements: (a) the proportion of words that were recalled via recollective retrieval, P(a); (b) the proportion of words that were reconstructed but not recalled, P(b); (c) the proportion of words that were recalled via nonrecollective retrieval, that is, those that were reconstructed and recalled, P(c); and (d) the proportion of words that were neither recollected nor reconstructed, P(d). For the ith test trial (for i ⱖ 1), this vector is Zi ⫽ [P(a), P(b), P(c), P(d)]i ⫽ W ⫻ Mi⫺1. The proportion of words recalled on the ith test trial is simply the sum of P(a)i and P(c)i. Thus, total recall on each test is expressed in terms of its recollective and nonrecollective components. When this was done (see Figure 1), we found that recall of positive words was primarily recollective, whereas recall of negative words was exclusively nonrecollective. This pattern held across all test trials. Moreover, analysis of learning over trials (i.e., the difference between performance on consecutive tests) indicated that learning to recall neutral and negative words was mainly due to increases in nonrecollective retrieval (for neutral words: ⌬T1,T2 ⫽ 19%, ⌬T2,T3 ⫽ 10%; for negative words: ⌬T1,T2 ⫽ 25%, ⌬T2,T3 ⫽ 14%) rather than to increases in recollective retrieval (for neutral words: ⌬T1,T2 ⫽ 2%, ⌬T2,T3 ⫽ 1%; for negative words: ⌬T1,T2 ⫽ 0%, ⌬T2,T3 ⫽ 0%). Conversely, learning to recall positive words was mainly due to increases in recollective retrieval (⌬T1,T2 ⫽ 20%, ⌬T2,T3 ⫽ 10%) rather than to increases in nonrecollective retrieval (⌬T1,T2 ⫽ 10%, ⌬T2,T3 ⫽ ⫺2%). Summary. In summary, goodness-of-fit tests showed that the dual-retrieval model was both necessary and sufficient to account for the data. Parameter estimates indicated that positive and neg- Table 3 Parameter Estimates of the Dual-Retrieval Model for Each Emotional Condition Direct access Condition Negative Neutral Positive Reconstruction D1 D2 Mean D .00 .08 .20 .00 .02 .29 .00 .05 .25 R1 R2 Experiment 1 .32 .43 .25 .33 .59 .17 Familiarity judgment Mean R J1 J2 Mean J .38 .29 .38 .81 .44 .27 .83 .65 .56 .82 .55 .42 .56 .72 .78 .86 .68 .56 .53 .49 Experiment 2 Negative Low arousal High arousal Positive Low arousal High arousal .12 .03 .15 .08 .14 .06 .13 .13 .24 .28 .19 .21 .11 .15 .11 .18 .11 .17 .31 .22 .16 .21 .24 .22 1.0 1.0 .38 .41 EMOTIONAL RECALL 669 80% Trial 1 Recall Performance 70% 60% Nonrecollective 50% Recollective Trial 2 Trial 3 21% 23% 40% 65% 30% 13% 51% 39% 50% 29% 40% 20% 26% 10% 10% 20% 11% 10% 8% 0% Negative Neutral Positive Negative Neutral Positive Negative Neutral Positive Figure 1. Recall decomposition in terms of recollective and nonrecollective retrieval as a function of valence and test trial in Experiment 1. ative valence affected recollective and nonrecollective retrieval in different ways. Specifically, positive valence (a) increased direct access relative to both negative and neutral valence, and (b) it also increased reconstruction relative to neutral valence. Negative valence increased familiarity judgment relative to positive and neutral valence. Thus, the reason that valenced items are remembered better than neutral ones is different for positive valence (direct access and reconstruction advantages) than for negative valence (familiarity advantage), although both positive and negative valence increased nonrecollective retrieval relative to neutral (on the positive valence side, reconstruction; on the negative valence side, familiarity judgment). Finally, decomposition of recall performance into its recollective and nonrecollective components showed that learning to recall positive words was largely a matter of learning how to retrieve recollectively, whereas learning how to recall negative or neutral words was essentially a matter of learning how to retrieve nonrecollectively. Clustering. To investigate the effects of valence on clustering in free recall (i.e., when two or more words of the same valence are recalled next to each other), we computed Bousfield’s (1953) ratio of repetition (RR) statistic separately for negative, neutral, and positive words. Averaged across all trials, the mean RR was .47, .33, and .48 for negative, neutral, and positive valence, respectively. We submitted the data to a repeated measures ANOVA, and the results showed a main effect of valence, F(2, 78) ⫽ 8.73, MSE ⫽ .03, 2p ⫽ .18. Paired comparisons showed that the RR was lower for neutral words than for both positive and negative words. Furthermore, although there was a reliable correlation between RR for negative and positive words, r(40) ⫽ .35, there was not a reliable correlation between RR for neutral words and either negative or positive words. Discussion The aim of the first experiment was to investigate how valence affects recall and its underlying retrieval processes. On the basis of the previous literature (Buchanan, 2007; Kensinger & Corkin, 2003; Ochsner, 2000), we hypothesized that recall of valenced words (negative, positive) would be better than recall of nonvalenced words (neutral). This hypothesis was supported. At the process level, this effect was due to increased nonrecollective retrieval of negative words and increased recollective and nonrecollective retrieval of positive words. On average, reconstruction was higher for valenced relative to nonvalenced words. The latter result was further supported by analysis of clustering in free recall, which showed that clustering was higher for valenced relative to nonvalenced words. In that connection, prior research with the present model has shown that nonrecollective retrieval increases as traditional measures of semantic relatedness of lists increase but recollective retrieval does not (Brainerd & Reyna, 2010). In addition, process-level analyses indicated that better recall of positive than negative words was due to increased recollective retrieval of positive words. One limitation of Experiment 1 is that holding arousal constant did not allow us to investigate whether valence effects on recall depend on arousal. This is an important limitation because emotional arousal alone has been implicated in changes in memory performance (Cahill & McGaugh, 1998). Specifically, arousing events are usually more likely to be recalled than nonarousing ones (Bradley et al., 1992), they can lead to focal memory enhancements (Christianson & Loftus, 1991), and they can render subsequent events less likely to be recalled (Hurlemann et al., 2005). Furthermore, arousal has been linked to biological mechanisms that are distinct from those associated with valence, such as changes in skin conductance (Lang, Greenwald, Bradley, & Hamm, 1993) and increased release of glucocorticoids from the adrenal cortex (McGaugh, 2004). Moreover, a few studies suggest that the effects of valence and arousal interact (e.g., Brainerd et al., 2010; Libkuman et al., 2004; Steinmetz et al., 2010). To illustrate, Waring and Kensinger (2011) showed subjects a sequence of scenes composed of negative, neutral, or positive items, such that half the negative and positive items were high in arousal and half were low in arousal. Later, subjects performed a recognition test. GOMES, BRAINERD, AND STEIN 670 Recognition was better for positive items high in arousal relative to positive items low in arousal. However, recognition was worse for negative items high in arousal relative to negative items low in arousal. This pattern is instructive because it suggests that how valence effects are expressed may depend on arousal, so that the findings of Experiment 1 may be limited to designs with constant levels of arousal. To overcome this limitation, we manipulated valence (negative, positive) and arousal (low, high) in a factorial design. Experiment 2 The aim of Experiment 2 was to investigate how the effects of valence on recall and its underlying retrieval processes are affected by changes in arousal level. Method Subjects. Sixty-three undergraduates (45 females) from a Brazilian university participated in the experiment. Subjects were aged between 17 and 26 years (M ⫽ 21 years, SD ⫽ 3 years) and were required to give written informed consent to participate in the experiment. Design and materials. A 2 (valence: negative, positive) ⫻ 2 (arousal: low, high) within-subjects design was used. Each subject learned to recall a list of 36 words drawn from the Brazilian version of the ANEW (Kristensen et al., 2011; see Appendix B). The study list was composed of nine negative and high-arousal words (negative-high; Mvalence ⫽ 3.16; Marousal ⫽ 5.63), nine negative and low-arousal words (negative-low; Mvalence ⫽ 3.31; Marousal ⫽ 4.11), nine positive and high-arousal words (positivehigh; Mvalence ⫽ 7.61; Marousal ⫽ 5.50), and nine positive and low-arousal words (positive-low; Mvalence ⫽ 7.71; Marousal ⫽ 3.93). Statistical tests (ANOVAs) confirmed four results that are apparent from the average valence and arousal scores for each condition: (a) The average valence score for positive words (M ⫽ 7.66) was statistically different from the average valence score for negative words (M ⫽ 3.24); (b) the average arousal score for high-arousal words (M ⫽ 5.57) was statistically different from the average arousal score for low-arousal words (M ⫽ 4.02); (c) there were no reliable differences regarding valence scores between positive-high and positive-low words, or between negative-high and negative-low words; and (d) there were no reliable differences regarding arousal scores between positive-high and negative-high words, or between positive-low and negative-low words. However, between-condition differences in arousal (⌬M ⫽ 1.55) were smaller than between-condition differences in valence (⌬M ⫽ 4.42). Such differences are due to the absence of either very highor very low-arousal words that are also strongly valenced in affective word pools. For this reason, the only other choices would be either not to manipulate arousal at all or to confound valence and arousal. The latter alternatives are obviously less desirable than factorial manipulation with a weaker arousal than valence manipulation. Moreover, our conclusions concerning the effects of valence and arousal on memory are qualitative rather than quantitative. More specifically, our conclusions are about whether valence and arousal affect retrieval processes rather than about whether these effects vary as a function of the strength of these manipulations. We acknowledge, however, that differences in sta- tistical power could lead to situations in which valence has reliable effects but arousal does not. In addition, the frequency of conceptual relations was low and similar across all four Valence ⫻ Arousal conditions. Finally, there were no reliable differences among the word lists regarding word length, printed and spoken frequency in Portuguese, concreteness, and imagery. Procedure. The experimental procedure was identical to the one used in Experiment 1 except for the following differences: In Experiment 2, subjects saw a list of 36 words and no filler words were used. Results Descriptive statistics and ANOVAs. As before, we measured recall performance by the proportion of words recalled on each test trial, the MTLE, and the MTEI. Regarding the first measure, Table 2 shows the average proportion of negative-low, negative-high, positive-low, and positive-high words recalled as a function of test trial. Visual inspection of Table 2 suggests that recall performance for positive-high words was better than for negative-high words. However, recall performance for positivelow words was similar to recall performance for negative-low words, showing that valence interacted with arousal. To investigate whether the numerical differences in the proportion of words recalled among conditions were statistically reliable, we conducted a 2 (valence: negative, positive) ⫻ 2 (arousal: low, high) ⫻ 3 (test trial: T1, T2, T3) repeated measures ANOVA. The results showed a main effect of valence, F(1, 62) ⫽ 13.10, MSE ⫽ .03, 2p ⫽ .17; a main effect of test trial, F(2, 124) ⫽ 508.90, MSE ⫽ .01, 2p ⫽ .89; and an interaction between valence and arousal, F(1, 62) ⫽ 10.64, MSE ⫽ .02, 2p ⫽ .14. Paired comparisons showed that subjects recalled more positive words on average (M ⫽ 40%) than negative (M ⫽ 36%). Analysis of the interaction between valence and arousal, however, showed that valence effects depended on arousal. On the one hand, the proportion of positive-high words recalled (M ⫽ 42%) was higher than the proportion of negativehigh words (M ⫽ 34%). On the other hand, there was not a reliable difference between the proportion of positive-low (M ⫽ 39%) and negative-low words recalled (M ⫽ 38%). As it is apparent in Table 2, the proportion of words recalled increased from Trial 1 (M ⫽ 21%) to Trial 2 (M ⫽ 40%), and from Trial 2 to Trial 3 (M ⫽ 54%). We found similar results using the MTLE and the MTEI measures. The average MTLE and MTEI score and standard deviation are presented in Table 2 for each condition. Inspection of Table 2 indicates that learning to recall positive-high words was easier than learning to recall negative-high words, and learning to recall positive-low words was not easier than learning to recall negativelow words. We submitted these data to a 2 (valence: negative, positive) ⫻ 2 (arousal: low, high) repeated measures ANOVA using either MTLE or MTEI as the dependent variable. The analyses revealed a main effect of valence for MTLE, F(1, 62) ⫽ 14.01, MSE ⫽ .09, 2p ⫽ .18, and for MTEI, F(1, 62) ⫽ 13.10, MSE ⫽ .09, 2p ⫽ .17, and an interaction between valence and arousal for MTLE, F(1, 62) ⫽ 9.25, MSE ⫽ .08, 2p ⫽ .13, and for MTEI, F(1, 62) ⫽ 10.46, MSE ⫽ .07, 2p ⫽ .14. Paired comparisons showed that learning to recall was easier for positive-high than for negative-high words, independent of the performance EMOTIONAL RECALL 671 iment 1, the value for the experiment as a whole, 5.80, was below the critical value for rejection of the null hypothesis of fit. The G2 values for the negative-low, negative-high, positive-low, and positive-high were 1.87, .22, 3.12, and .59, respectively. Thus, the dual-retrieval model passed both necessity and sufficiency tests. Parameter estimates and significance tests. Maximum likelihood estimates of the parameters of the dual-retrieval model are presented in Table 3 for each Valence ⫻ Arousal condition. The grand mean of the standard deviation of estimates across conditions was .05. Inspection of these data reveals an interaction between valence and arousal at the process level. Direct access was low for negative-high words in comparison to positive-high words (⌬M ⫽ .11) and negative-low words (⌬M ⫽ .08), but it was similar for positive-low and negative-low words (⌬M ⫽ .03). Likelihood ratio tests revealed a reliable difference between positive-high and negative-high words for D1, G2(1) ⫽ 10.03, and a reliable difference between negative-low and negative-high words for D1, G2(1) ⫽ 7.04. In addition, reconstruction was slightly higher for positive-low relative to negative-low words (⌬M ⫽ .05), which was due to a reliable difference in R1, G2(1) ⫽ 6.14. Finally, familiarity judgment was higher for negative words relative to positive words (⌬M ⫽ .31), and this difference was larger for high-arousal words (⌬M ⫽ .37) than for low-arousal words (⌬M ⫽ .25). However, likelihood ratio tests revealed a reliable difference only between negative-low and positive-low words for J1, G2(1) ⫽ 4.87. Thus, as before, the recall superiority of positive over negative words was due to advantages in recollective retrieval, but this superiority only held for words high in arousal. Finally, we decomposed recall performance on Trials 1, 2, and 3 in terms of recollective and nonrecollective retrieval. The results are presented in Figure 2. As in the first experiment, the decomposition showed that recall of positive-high words was primarily supported by recollective retrieval, whereas recall of negative-high words was primarily supported by nonrecollective retrieval. Analysis of the learning over trials indicated that learning to recall positive-high and negative-low words was due to increases in measure. However, there was no difference between positive-low and negative-low words in recall performance. In summary, valence interacted with arousal regardless of the performance measure. Specifically, recall performance was higher for positive than for negative valence only when arousal was high. Next, we evaluated how valence and arousal affected underling retrieval operations. Model analyses. As in Experiment 1, we used the Markov chain presented in Appendix A to measure the effects of valence and arousal on recollective and nonrecollective retrieval. Model fit analyses are reported first, followed by parameter estimation and significance testing. Fit. We started with the necessity tests, which ask whether simpler one-stage models can account for the present data. The recollection-only model produces a G2 test statistic with 5 degrees of freedom for each condition, and, therefore, the critical value for rejecting the null hypothesis of fit is 11.07 in each condition and 31.41 across the four Valence ⫻ Arousal conditions. The nonrecollection-only model produces a G2 test statistic with 3 degrees of freedom for each condition, and, therefore, the critical value for rejecting the null hypothesis of fit is 7.82 in each condition and 21.03 across the four Valence ⫻ Arousal conditions. The null hypothesis of fit was rejected for both models for the experiment as a whole, with the values of the G2 statistics exceeding their critical values by wide margins, G2(20) ⫽ 2631.88, and G2(12) ⫽ 281.33. The values of the G2 statistic for the negativelow, negative-high, positive-low, and positive-high conditions were, respectively, 827.47, 648.05, 545.97, and 610.39 for the recollective model, and 74.17, 97.23, 53.00, and 56.93 for the nonrecollective model. Next we turn to the sufficiency test, which asks whether the dual-retrieval model is adequate to account for the data. The dual-retrieval model produces a G2 test statistic with 1 degree of freedom for each condition, and, therefore, the critical value for rejecting the null hypothesis of fit is 3.84 in each condition and 9.49 across the four Valence ⫻ Arousal conditions. As in Exper- 70% Trial 1 Trial 2 Trial 3 Recall Performance 60% 50% Nonrecollective 19% Recollective 20% 40% 15% 30% 20% 27% 16% 35% 23% 25% 11% 11% 12% 29% 24% 13% 10% 8% 11% 15% 39% 33% 26% 20% 15% 10% 3% 0% Low High Negative Low High Positive Low High Negative Low High Positive Low High Negative Low High Positive Figure 2. Recall decomposition in terms of recollective and nonrecollective retrieval as a function of valence, arousal, and test trial in Experiment 2. 672 GOMES, BRAINERD, AND STEIN recollective retrieval (for positive-high: ⌬T1,T2 ⫽ 14%, ⌬T2,T3 ⫽ 10%; for negative-low: ⌬T1,T2 ⫽ 12%, ⌬T2,T3 ⫽ 9%) more than to increases in nonrecollective retrieval (for positive-high: ⌬T1,T2 ⫽ 8%, ⌬T2,T3 ⫽ 3%; for negative-low: ⌬T1,T2 ⫽ 4%, ⌬T2,T3 ⫽ 5%). However, learning to recall negative-high words was due to increases in nonrecollective retrieval (⌬T1,T2 ⫽ 12%, ⌬T2,T3 ⫽ 10%) more than to increases in recollective retrieval (⌬T1,T2 ⫽ 7%, ⌬T2,T3 ⫽ 5%). Learning to recall positive-low words was due to similar increases in recollective (⌬T1,T2 ⫽ 9%, ⌬T2,T3 ⫽ 6%) and nonrecollective retrieval (⌬T1,T2 ⫽ 12%, ⌬T2,T3 ⫽ 4%). Summary. Summing up, the effect of valence on retrieval processes interacted with arousal as follows: First, when arousal was high, recall of positive words involved more recollection and less nonrecollective recall than recall of negative words (see Figure 2). This is the same pattern as in Experiment 1. Second, when arousal was low, levels of recollection and reconstruction were similar for positive and negative words. However, it can be seen in Figure 2 that these different valence patterns are due to the fact that process-level effects of arousal are the opposite for the two valences. When valence is positive, high arousal drives recollective retrieval up and nonrecollective retrieval down. However, when valence is negative, high arousal drives recollective retrieval down and nonrecollective retrieval up. As discussed by Brainerd and Reyna (2010), this pattern points to an important property of our model-based approach to measuring underlying retrieval processes: When a manipulation causes performance in one condition to be better than in an another, or produces no between-condition differences, the model can show that the manipulation has opposite effects on underlying processes. Discussion In Experiment 2, we investigated how the effects of valence on recall and retrieval processes interact with the effects of arousal. Our results showed that arousal interacts with valence, as predicted by other studies that manipulated valence and arousal factorially (Brainerd et al., 2010; Libkuman et al., 2004; Steinmetz et al., 2010; Waring & Kensinger, 2011). Specifically, recall was better for positive than negative words only when both were high in arousal. At the process level, this effect was due to increased recollective retrieval (direct access) of positive words. Importantly, for negative valence, high arousal decreased recollective retrieval relative to low. This result helps to explain, for instance, the low rate of recollective recall of negative words in Experiment 1, as the arousal values of those words were closer to the negativehigh words in Experiment 2 than to the negative-low words. Moreover, our results indicate that the low rate of recollective recall of negative-high words was mainly due to the low level of the direct access during the first learning cycle (D1) rather than during later learning cycles (D2). Overall, recollective retrieval dominated when positive-high words were being recalled, but nonrecollective retrieval dominated when negative-high words were being recalled. Also, recall of negative-low words was dominated by recollection, whereas recall of positive-low words exhibited roughly equal contributions of recollective and nonrecollective retrieval. Thus, the relative contribution of the two forms of retrieval to overall recall of the two valence types depended on arousal. There were similarities and differences in the results of Experiments 2 versus Experiment 1. On the one hand, better recall of positive versus negative words was due to advantages in recollective retrieval in both experiments. On the other hand, there were differences in the absolute values of parameter estimates between the two experiments. Those differences, however, are not problematical for several reasons. In addition to measurement error, such differences could have been caused by differences in subject samples and methodological differences. With respect to methodological differences, list length and the number of within-subject conditions varied between experiments. General Discussion The aim of this study was to investigate the effects of emotional valence and arousal on recollective and nonrecollective retrieval, using a model-based approach that avoids extant criticisms of traditional metacognitive methods of measuring these processes. In two experiments, we manipulated valence and either controlled arousal or manipulated it factorially. Recollective retrieval and two components of nonrecollective retrieval (reconstruction and familiarity judgment) were measured. The experiments produced two principal results that clarify previous findings about the effects of emotion on memory accuracy (Bradley et al., 1992; Brainerd et al., 2010, 2008; Dehon et al., 2010; Goodman et al., 2011; Kensinger & Corkin, 2003) and brain activity (Dolcos et al., 2004; Kensinger & Corkin, 2004) and that reconcile some inconsistencies in these findings. First, our results suggested that valenced words (negative, positive), as opposed to nonvalenced words (neutral), increase nonrecollective retrieval. Second, our results suggested that positive valence foments recollective retrieval relative to negative, whereas negative valence foments nonrecollective retrieval relative to positive. The second experiment produced a further result of theoretical significance: Valence effects depended on arousal, both at the performance and process levels. In what follows, we discuss the theoretical and methodological implications of the findings reported here as well as the limitations of the present study. The investigation of the role of dual processes in the effects of emotion on memory has followed the traditional approach of using metamnemonic judgments, such as remember/know and confidence judgments, to separate recollective from nonrecollective retrieval. Studies in which the remember/know procedure has been used have shown that valenced items increase remember judgments in comparison to nonvalenced items (Dewhurst & Parry, 2000; Kensinger & Corkin, 2003; Ochsner, 2000; Sharot et al., 2007). However, the idea that emotion enhances recollective retrieval is inconsistent with findings about the effects of emotion on brain activity and memory accuracy. For instance, successful encoding of valenced material has been associated with activity in regions that are implicated in semantic processing (Dolcos et al., 2004; Kensinger & Corkin, 2004). The findings of our experiments fit well with the neuroimaging data. Specifically, valenced words increased the more semantically driven form of retrieval, reconstruction, and also increased a classic qualitative index of semantic processing in recall, clustering. This pattern suggests that retrieval of valenced memories should correlate with brain activity in regions implicated in semantic processing, the PFC being a case in point. EMOTIONAL RECALL Turning to the findings about the effects of emotion on memory accuracy, negative valence has been shown to increase false memories of gist-consistent events relative to neutral or positive (Brainerd et al., 2010, 2008; Dehon et al., 2010; Goodman et al., 2011). Brainerd et al. (2008) found that levels of false memory increase from positive to neutral to negative because of increasing retrieval of gist memories and decreasing retrieval of verbatim memories that allow rejection of false information. Our results connecting negative words to an error-prone retrieval mechanism (reconstruction from gist ⫹ familiarity) and retrieval of positive words to an errorless retrieval mechanism (direct access of targets’ verbatim traces) are consistent with this pattern. Retrieval of gist memories support both true and false memories (Brainerd & Reyna, 2002; Schacter, Verfaellie, & Pradere, 1996), which explains why subjects in our experiment recalled more negative than neutral words and why subjects in Brainerd et al.’s experiment showed more false memories for negative than for positive or neutral words (increased gist resemblance between true and false items). Although our findings also suggested that positive valence increases nonrecollective retrieval relative to neutral, (a) it did not increase nonrecollective retrieval as much as negative valence and (b) it greatly increased recollective retrieval, which should reduce false memories and improve accuracy. Results reported in other studies support the latter conclusion. In associative recognition, a test that is slanted toward memory for surface features, accuracy is better for positive word pairs relative to negative (Pierce & Kensinger, 2011). In addition, associative recall of names paired with happy faces is better than for names paired with neutral faces (Tsukiura & Cabeza, 2008, 2011). One explanation of why valenced words increase nonrecollective retrieval in comparison to nonvalenced words is that valence is a conceptual gist that people often rely on to reconstruct past events (Rivers, Reyna, & Mills, 2008). Knowledge about the valence of emotional states evoked during an episode (e.g., anger, fear, joy, love) can then provide a base for reconstruction of missing elements of that episode, whether they are true or false. This could make reconstruction more efficient by reducing the number of output candidates that are generated (e.g., only things that make me happy). Or it might increase the strengths of familiarity signals, too. Our results provided evidence of both effects. Why has the previous literature indicated that emotion enhances recollective but not nonrecollective retrieval? Here, it is important to note Phelps and Sharot’s (2008) conclusion that the influence of emotion on metacognitive judgments about memory (traditional methods of separating dual processes) is more consistent and robust than is its influence on actual memory performance (our separation method). Zimmerman and Kelley (2010), for instance, showed that predictions about future associative recall of negative word pairs were higher than predictions of recall of neutral word pairs, but actual recall of negative and neutral pairs were similar, which suggests that something other than actual memory for emotional items is used in metacognitive judgments. For example, such judgments appear to be influenced by beliefs about how emotion (valence and arousal) affects memory (Magnussen et al., 2006). This does not rule out the possibility that emotion affects actual memory processes, as suggested by the previous literature. In fact, our results support that idea. However, the simple hypothesis that valence affects only recollection did not prove successful, 673 and further, how retrieval processes reacted to valence depended on arousal. Regarding the influence of arousal, recall of positive words was superior to recall of negative words only when arousal was high. Second, recollective retrieval was higher for positive than for negative words only when arousal was high. Third, recall of negative words was chiefly supported by nonrecollective retrieval only when arousal was high, and recall of positive words was chiefly supported by recollective retrieval only when arousal was high. As mentioned, the confounding of valence and arousal has been a persistent problem in studies of the effects of emotion on memory. When arousal is manipulated, arousing items are often more negative than nonarousing ones (e.g., Cahill et al., 1996; Cahill & McGaugh, 1995). When valence is manipulated, negative and positive items are often more arousing than neutral (e.g., Dewhurst & Parry, 2000; Sharot et al., 2004, 2007; Pierce & Kensinger, 2011), and negative items are often more arousing than positive (e.g., Xu et al., 2011). Along with other studies in which valence and arousal have been factorially manipulated (Brainerd et al., 2010; Libkuman et al., 2004; Steinmetz et al., 2010; Waring & Kensinger, 2011), our results showed that confounding valence and arousal has important theoretical consequences. Specifically, the manner in which valence differences affect underlying retrieval processes depends on arousal, which surely affects how valence differences correlate with patterns of brain activity. Finally, a potentially important limitation of this study concerns the arousal levels of the words used in both experiments. Arousal scores vary from 1 (calm) to 9 (excited) on the ANEW. In the first experiment, arousal was controlled at 5.3. In the second experiment, arousal varied from 4.0 to 5.6. Thus, the arousal values of the words that were used in our experiments did not, on average, come from the lower third of the rating scale (very low arousal) or the upper third (very high arousal). Consequently, although valence and arousal were not confounded in either of our experiments and although they were factorially manipulated in Experiment 2, extreme values of arousal were not used. It is conceivable that the manner in which arousal affects underlying retrieval processes and the way valence interacts with arousal could be different at the extreme ends of the scale. References Bousfield, W. A. (1953). The occurrence of clustering in the recall of randomly arranged associates. The Journal of General Psychology, 49, 229 –240. doi:10.1080/00221309.1953.9710088 Bradley, M. M., Greenwald, M. K., Petry, M. C., & Lang, P. J. (1992). Remembering pictures: Pleasure and arousal in memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 18, 379 – 390. doi:10.1037/0278-7393.18.2.379 Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words (ANEW): Instruction manual and affective ratings [Technical Report C-1]. Gainesville: The Center for Research in Psychophysiology, University of Florida. Brainerd, C. J., Aydin, C., & Reyna, V. F. (in press). Development of dual-retrieval processes in recall: Learning, forgetting, and reminiscence. Journal of Memory and Language. doi:10.1016/ j.jml.2011.12.002 Brainerd, C. J., Holliday, R. E., Reyna, V. F., Yang, Y., & Toglia, M. P. (2010). Developmental reversals in false memory: Effects of emotional valence and arousal. Journal of Experimental Child Psychology, 107, 137–154. doi:10.1016/j.jecp.2010.04.013 674 GOMES, BRAINERD, AND STEIN Brainerd, C. J., & Reyna, V. F. (2002). Fuzzy-trace theory and false memory. Current Directions in Psychological Science, 11, 164 –169. doi:10.1111/1467-8721.00192 Brainerd, C. J., & Reyna, V. F. (2010). Recollective and nonrecollective recall. Journal of Memory and Language, 63, 425– 445. doi:10.1016/ j.jml.2010.05.002 Brainerd, C. J., Reyna, V. F., & Howe, M. L. (2009). Trichotomous processes in early memory development, aging, and cognitive impairment: A unified theory. Psychological Review, 116, 783– 832. doi: 10.1037/a0016963 Brainerd, C. J., Stein, L. M., Silveira, R. A. T., Rohenkohl, G., & Reyna, V. F. (2008). How does negative emotion cause false memories? Psychological Science, 19, 919 –925. doi:10.1111/j.1467-9280.2008 .02177.x Buchanan, T. W. (2007). Retrieval of emotional memories. Psychological Bulletin, 133, 761–779. doi:10.1037/0033-2909.133.5.761 Buchanan, T. W., Etzel, J. A., Adolphs, R., & Tranel, D. (2006). The influence of autonomic arousal and semantic relatedness on memory for emotional words. International Journal of Psychophysiology, 61, 26 –33. doi:10.1016/j.ijpsycho.2005.10.022 Cabeza, R., & Nyberg, L. (2000). Imaging cognition II: An empirical review of 275 PET and fMRI studies. Journal of Cognitive Neuroscience, 12, 1– 47. doi:10.1162/08989290051137585 Cahill, L., Babinsky, R., Markowitsch, H. J., & McGaugh, J. L. (1995, September). The amygdala and emotional memory. Nature, 377, 295– 296. doi:10.1038/377295a0 Cahill, L., Haier, R. J., Fallon, J., Alkire, M. T., Tang, C., Keator, D., . . . McGaugh, J. L. (1996). Amygdala activity at encoding correlated with long-term, free recall of emotional information. Proceedings of the National Academy of Sciences of the USA, 93, 8016 – 8021. doi:10.1073/ pnas.93.15.8016 Cahill, L., & McGaugh, J. L. (1995). A novel demonstration of enhanced memory associated with emotional arousal. Consciousness & Cognition, 4, 410 – 421. doi:10.1006/ccog.1995.1048 Cahill, L., & McGaugh, J. L. (1998). Mechanisms of emotional arousal and lasting declarative memory. Trends in Neurosciences, 21, 294 –299. doi:10.1016/S0166-2236(97)01214-9 Christianson, S.-A. (1992). Emotional stress and eyewitness memory: A critical review. Psychological Bulletin, 112, 284 –309. doi:10.1037/ 0033-2909.112.2.284 Christianson, S.-A., & Loftus, E. F. (1991). Remembering emotional events: The fate of detailed information. Cognition & Emotion, 5, 81–108. doi:10.1080/02699939108411027 Clark, J. M., & Paivio, A. (2004). Extensions of the Paivio, Yuille, and Madigan (1968) norms. Behavior Research Methods, Instruments, & Computers, 36, 371–383. doi:10.3758/BF03195584 Colegrove, F. W. (1899). Individual memories. The American Journal of Psychology, 10, 228 –255. doi:10.2307/1412480 Dehon, H., Laroi, F., & Van der Linden, M. (2010). Affective valence influences participant’s susceptibility to false memories and illusory recollection. Emotion, 10, 627– 639. doi:10.1037/a0019595 Dewhurst, S. A., & Parry, L. A. (2000). Emotionality, distinctiveness, and recollective experience. European Journal of Cognitive Psychology, 12, 541–551. doi:10.1080/095414400750050222 Dolcos, F., LaBar, K. S., & Cabeza, R. (2004). Dissociable effects of arousal and valence on prefrontal activity indexing emotional evaluation and subsequent memory: An event-related fMRI study. NeuroImage, 23, 64 –74. doi:10.1016/j.neuroimage.2004.05.015 Donaldson, W. (1996). The role of decision processes in remembering and knowing. Memory and Cognition, 24, 523–533. doi:10.3758/ BF03200940 Dunn, J. C. (2004). Remember-know: A matter of confidence. Psychological Review, 111, 524 –542. doi:10.1037/0033-295X.111.2.524 Dunn, J. C. (2008). The dimensionality of the remember– know task: A state-trace analysis. Psychological Review, 115, 426 – 446. doi:10.1037/ 0033-295X.115.2.426 Flügel, J. C. (1925). A quantitative study of feeling and emotion in everyday life. British Journal of Psychology, 15, 318 –355. doi:10.1111/ j.2044-8295.1925.tb00187.x Goldberg, R. F., Perfetti, C. A., Fiez, J. A., & Schneider, W. (2007). Selective retrieval of abstract semantic knowledge in left prefrontal cortex. The Journal of Neuroscience, 27, 3790 –3798. doi:10.1523/ JNEUROSCI.2381-06.2007 Goodman, G. S., Ogle, C. M., Block, S. D., Harris, L. S., Larson, R. P., Augusti, E.-M., . . . Urquiza, A. (2011). False memory for traumarelated Deese-Roediger-McDermott lists in adolescents and adults with histories of child sexual abuse. Developmental and Psychopathology, 23, 423– 438. doi:10.1017/S0954579411000150 Hurlemann, R., Hawellek, B., Matusch, A., Kolsch, H., Wollersen, H., Madea, B., . . . Dolan, R. J. (2005). Noradrenergic modulation of emotion-induced forgetting and remembering. Journal of Neuroscience, 25, 6343– 6349. doi:10.1523/JNEUROSCI.0228-05.2005 Kensinger, E. A. (2004). Remembering emotional experiences: The contributions of valence and arousal. Reviews in the Neurosciences, 15, 241–251. doi:10.1515/REVNEURO.2004.15.4.241 Kensinger, E. A. (2009). Remembering the details: Effects of emotion. Emotion Review, 1, 99 –113. doi:10.1177/1754073908100432 Kensinger, E. A., & Corkin, S. (2003). Memory enhancement for emotional words: Are emotional words more vividly remembered than neutral words? Memory and Cognition, 31, 1169 –1180. doi:10.3758/ BF03195800 Kensinger, E. A., & Corkin, S. (2004). Two routes to emotional memory: Distinct neural processes for valence and arousal. Proceedings of the National Academy of Sciences of the USA, 101, 3310 –3315. doi: 10.1073/pnas.0306408101 Kristensen, C. H., Gomes, C. F. A., Vieira, K., & Justo, A. (2011). Brazilian norms for the Affective Norms for English Words. Trends in Psychiatry and Psychotherapy, 33, 135–146. doi:10.1590/S223760892011000300003 Lang, P. J., Greenwald, M. K., Bradley, M. M., & Hamm, A. O. (1993). Looking at pictures: Affective, facial, visceral, and behavioral reactions. Psychophysiology, 30, 261–273. doi:10.1111/j.1469-8986.1993 .tb03352.x Libkuman, T., Stabler, C., & Otani, H. (2004). Arousal, valence, and memory for detail. Memory, 12, 237–247. doi:10.1080/ 09658210244000630 Macaluso, E., Frith, E. D., & Driver, J. (2000, August 18). Modulation of human visual cortex by crossmodal spatial attention. Science, 289, 1206 –1208. doi:10.1126/science.289.5482.1206 Magnussen, S., Andersson, J., Cornoldi, C., De Beni, R., Endestad, T., Goodman, G. S., . . . Zimmer, H. (2006). What people believe about memory? Memory, 14, 595– 613. doi:10.1080/09658210600646716 Malmberg, K. J. (2008). Recognition memory: A review of the critical findings and an integrated theory for relating them. Cognitive Psychology, 57, 335–384. doi:10.1016/j.cogpsych.2008.02.004 Matlin, M., & Stang, D. (1978). The Pollyanna principle. Cambridge, MA: Schenkman Publishing. McGaugh, J. L. (2004). The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annual Review of Neuroscience, 27, 1–28. doi:10.1146/annurev.neuro.27.070203.144157 Ochsner, K. N. (2000). Are affective events richly recollected or simply familiar? The experience and process of recognizing feelings past. Journal of Experimental Psychology: General, 129, 242–261. doi: 10.1037/0096-3445.129.2.242 Otgaar, H., Candel, I., & Merckelbach, H. (2008). Children’s false memories: Easier to elicit for a negative than for a neutral event. Acta Psychologica, 128, 350 –354. doi:10.1016/j.actpsy.2008.03.009 Palmer, J. E., & Dodson, C. S. (2009). Investigating the mechanisms EMOTIONAL RECALL fuelling reduced false recall of emotional material. Cognition & Emotion, 23, 238 –259. doi:10.1080/02699930801976663 Phelps, E. A., & Sharot, T. (2008). How (and why) emotion enhances the subjective sense of recollection. Current Directions in Psychological Science, 17, 147–152. doi:10.1111/j.1467-8721.2008.00565.x Pierce, B. H., & Kensinger, E. A. (2011). Effects of emotion on associative recognition: Valence and retention interval matter. Emotion, 11, 139 – 144. doi:10.1037/a0021287 Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and the measurement of cognitive processes. Psychological Review, 95, 318 – 339. doi:10.1037/0033-295X.95.3.318 Rimmele, U., Davachi, L., Petrov, R., Dougal, S., & Phelps, E. A. (2011). Emotion enhances the subjective feeling of remembering, despite lower accuracy for contextual details. Emotion, 11, 553–562. doi:10.1037/ a0024246 Rivers, S. E., Reyna, V. F., & Mills, B. (2008). Risk taking under the influence: A fuzzy-trace theory of emotion in adolescence. Developmental Review, 28, 107–144. doi:10.1016/j.dr.2007.11.002 Rotello, C. M., Macmillan, N. A., & Reeder, J. A. (2004). Sum– difference theory of remembering and knowing: A two-dimensional signal detection model. Psychological Review, 111, 588 – 616. doi:10.1037/0033295X.111.3.588 Sardinha, T. B. (2003). The bank of Portuguese. Retrieved from http:// lael.pucsp.br/direct Schacter, D. L., Verfaellie, M., & Pradere, D. (1996). The neuropsychology of memory illusions: False recall and recognition in amnesic patients. Journal of Memory and Language, 35, 319 –334. doi:10.1006/ jmla.1996.0018 Sharot, T., Delgado, M. R., & Phelps, E. A. (2004). How emotion enhances the feeling of remembering. Nature Neuroscience, 7, 1376 –1380. doi: 10.1038/nn1353 Sharot, T., Verfaellie, M., & Yonelinas, A. P. (2007). How emotion strengthens the recollective experience: A time-dependent hippocampal process. PLoS ONE, 2, e1068. doi:10.1371/journal.pone.0001068 Sharot, T., & Yonelinas, A. P. (2008). Differential time-dependent effects of emotion on recollective experience and memory for contextual information. Cognition, 106, 538 –547. doi:10.1016/j.cognition.2007.03.002 Sharp, A. A. (1938). An experimental test of Freud’s doctrine of the relation of hedonic tone to memory revival. Journal of Experimental Psychology, 22, 395– 418. doi:10.1037/h0056403 Stagner, R. (1933). Factors influencing the memory value of words in a series. Journal of Experimental Psychology, 16, 129 –137. doi:10.1037/ h0074490 Steinmetz, K. R. M., Addis, D. R., & Kensinger, E. A. (2010). The effect of arousal on the emotional memory network depends on valence. NeuroImage, 53, 318 –324. doi:10.1016/j.neuroimage.2010.06.015 Talarico, J. M., & Rubin, D. C. (2003). Confidence, not consistency, 675 characterizes flashbulb memories. Psychological Science, 14, 455– 461. doi:10.1111/1467-9280.02453 Thomson, R. H. (1930). An experimental study of memory as influenced by feeling tone. Journal of Experimental Psychology, 13, 462– 468. doi:10.1037/h0074526 Toglia, M. P., & Battig, W. F. (1978). Handbook of semantic word norms. Hillsdale, NJ: Erlbaum. Tsukiura, T., & Cabeza, R. (2008). Orbitofrontal and hippocampal contributions to memory for face-name associations: The rewarding power of a smile. Neuropsychologia, 46, 2310 –2319. doi:10.1016/j.neuropsychologia .2008.03.013 Tsukiura, T., & Cabeza, R. (2011). Remembering beauty: Roles of orbitofrontal and hippocampal regions in successful memory encoding of attractive faces. NeuroImage, 54, 653– 660. doi:10.1016/j.neuroimage .2010.07.046 Walker, W. R., Skowronski, J. J., & Thompson, C. P. (2003). Life is pleasant–and memory helps to keep it that way! Review of General Psychology, 7, 203–210. doi:10.1037/1089-2680.7.2.203 Waring, J. D., & Kensinger, E. A. (2011). How emotion leads to selective memory: Neuroimaging evidence. Neuropsychologia, 49, 1831–1842. doi:10.1016/j.neuropsychologia.2011.03.007 Wohlgemuth, A. (1923). The influence of feeling on memory. British Journal of Psychology, 13, 405– 416. doi:10.1111/j.2044-8295.1923 .tb01167.x Wu, L., & Barsalou, L. W. (2009). Peceptual simulation in conceptual combination: Evidence from property generation. Acta Psychologica, 132, 173–189. doi:10.1016/j.actpsy.2009.02.002 Xu, X., Zhao, Y., Zhao, P., & Yang, J. (2011). Effects of level of processing on emotional memory: Gist and details. Cognition & Emotion, 25, 53–72. doi:10.1080/02699931003633805 Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years of research. Journal of Memory and Language, 46, 441–517. doi:10.1006/jmla.2002.2864 Yonelinas, A. P., Hopfinger, J. B., Buonocore, M. H., Kroll, N. E. A., & Baynes, K. (2001). Hippocampal, parahippocampal and occipitaltemporal contributions to associative and item recognition memory: An fMRI study. Brain Imaging, 12, 359 –363. doi:10.1097/00001756200102120-00035 Yonelinas, A. P., Otten, L. J., Shaw, K. N., & Rugg, M. D. (2005). Separating the brain regions involved in recollection and familiarity in recognition memory. The Journal of Neuroscience, 25, 3002–3008. doi:10.1523/JNEUROSCI.5295-04.2005 Zimmerman, C. A., & Kelley, C. M. (2010). “I’ll remember this!” effects of emotionality on memory predictions versus memory performance. Journal of Memory and Language, 62, 240 –253. doi:10.1016/ j.jml.2009.11.004 (Appendices follow) GOMES, BRAINERD, AND STEIN 676 Appendix A Dual-Retrieval Model Consider a multitrial experiment in which subjects undergo three learning cycles of the form S1T1 S2T2 S3T3. On each recall test, a target is either recalled (correct) or not (error). Therefore, a target’s history of recall across all three tests generates one out of eight possible patterns of correct (C) and error (E) responses: C1C2C3, C1C2E3, . . . , E1E2E3. The six parameters presented in Table 1 can be estimated from the frequency of such error-success patterns by applying a two-stage Markov chain that contains those parameters. The states of the two-stage model are U (an initial L共n ⫹ 1兲 1 D2 0 D2 L共n兲 M ⫽ PE 共n兲 PC 共n兲 U共n兲 W ⫽ 关L共1兲, PE 共1兲, PC 共1兲, U共1兲兴 ⫽ 关D1, 共1 ⫺ D1兲R1共1 ⫺ J1兲, 共1 ⫺ D1兲R1J1, 共1 ⫺ D1兲共1 ⫺ R1兲兴. PE 共n ⫹ 1兲 0 共1 ⫺ D2 兲共1 ⫺ J2 兲 共1 ⫺ J2 兲 共1 ⫺ D2 兲R2 共1 ⫺ J2 兲 The probabilities of the eight individual error-success patterns are obtained by multiplying the vector and the matrix together. Those expressions are: P共C1 C2 C3 兲 ⫽ D1 ⫹ 共1 ⫺ D1兲R1J1共J2兲 no-recall state), P (an intermediate partial-recall state, with a correct recall hidden state PC and an incorrect recall hidden state PE), and L (a terminal and an absorbing criterion-recall state). The two-stage Markov process for these states consists of the following starting vector W and transition matrix M: 2 (A2) P共C1 C2 E3 兲 ⫽ 共1 ⫺ D1兲R1J1J2共1 ⫺ J2兲 (A3) P共C1 E2 C3 兲 ⫽ 共1 ⫺ D1兲R1J1(1⫺J2)关D2 ⫹ 共1 ⫺ D1兲J2兴 (A4) P共C1 E2 E3 兲 ⫽ 共1 ⫺ D1兲R1J1(1⫺J2) 共1 ⫺ D2兲 (A5) 2 P共E1 C2 C3 兲 ⫽ 共1 ⫺ D1兲共1 ⫺ R1兲D2 ⫹ 共1 ⫺ D1兲R1共1⫺J1兲 ⫻ 关D2⫹共1⫺D2兲共J2兲 2 兴⫹共1⫺D1兲共1⫺R1兲共1⫺D2兲R2共J2兲 2 (A6) PC 共n ⫹ 1兲 0 共1 ⫺ D2 兲J2 J2 共1 ⫺ D2 兲R2 J2 . L6 ⫽ ⌸共pi兲N共i兲 , G 2 ⫽ ⫺2ln关L6 /L7兴, ⫹ R1共1 ⫺ J1兲共1 ⫺ D2兲J2共1 ⫺ J2兲兴 (A7) P共E1 E2 C3 兲 ⫽ 共1 ⫺ D1兲共1 ⫺ R1兲共1 ⫺ D2兲关共1 ⫺ R2兲D2 ⫹ 共1 ⫺ R2兲共1 ⫺ D2兲R2J2 ⫹ R2共1 ⫺ J2兲D2 ⫹ R2共1 ⫺ J2兲共1 ⫺ D2兲J2兴 ⫹ 共1 ⫺ D1兲R1共1 ⫺ J1兲共1 ⫺ D2兲共1 ⫺ J2兲 ⫻ 关D2 ⫹ 共1 ⫺ D2兲J2兴; (A8) P共E1 E2 E3 兲 ⫽ 共1 ⫺ D1兲共1 ⫺ R1兲兵关共1 ⫺ D2兲共1 ⫺ R2兲兴2 ⫹ 共1 ⫺ D2兲 (A1) (A10) in which the pi are the eight expressions on the right sides of Equations A2–A9, and the N(i) are empirical data counts of the corresponding error-success sequences. Because six memory parameters are estimated, the likelihood value in A10 is computed with 1 degree of freedom. A goodness-of-fit test that evaluates the null hypothesis that learning to recall involves two processes is then obtained by computing a likelihood ratio statistic that compares the likelihood in A10 with the likelihood of the same data when all seven observable probabilities are free to vary. That test statistic, which is asymptotically distributed as 2(1), is P共E1 C2 E3 兲 ⫽ 共1 ⫺ D1兲关共1 ⫺ R1兲共1 ⫺ D2兲R2J2共1 ⫺ J2兲 (A11) where L7 is the likelihood of the data when all seven observable probabilities are free to vary. For any set of data over which A11 can be defined, it is also possible to define two one-stage models that incorporate singletrace conceptions. Specifically, a single nonrecollective process is obtained from the two-process model by eliminating the U state from it, which results in the following initial probability vector W' and a transition matrix M': W' ⫽ 关L共1兲, PE 共1兲, PC 共1兲兴 ⫽ 关D1, 共1 ⫺ D1兲共1 ⫺ J1兲, 共1 ⫺ D1兲J1兴. ⫻ 共1 ⫺ R2兲共1 ⫺ D2兲R2共1 ⫺ J2兲 ⫹ 共1 ⫺ D2兲 R2共1 ⫺ J2兲 其 2 U共n ⫹ 1兲 0 0 0 共1 ⫺ D2 兲共1 ⫺ R2 兲 2 ⫹ 共1 ⫺ D1兲R1共1 ⫺ J1兲共1 ⫺ J2兲2 共1 ⫺ D2兲2 ; (A9) Maximum likelihood estimates of the six parameters in Table 1 are then obtained by maximizing the following likelihood function: L(n) M' ⫽ P 共n兲 E PC 共n兲 (Appendices continue) L共n ⫹ 1兲 1 D2 0 PE 共n ⫹ 1兲 0 共1 ⫺ D2 兲共1 ⫺ J2 兲 共1 ⫺ J2 兲 PC 共n ⫹ 1兲 0 共1 ⫺ D2 兲J2 . J2 (A12) EMOTIONAL RECALL Probabilities for each error-success pattern are then obtained as in the two-process model, which in turn can be used to maximize a simplified version of A11 that contains only the parameters in A12. The revised fit statistic is then, G ⫽ ⫺2ln关L4 /L7兴, 2 (A13) which is asymptotically distributed as 2(3) because the model has four parameters. In addition, a single recollective process model can be obtained from the two-process model by eliminating the P state. This generates the following initial probability vector W" and a transition matrix M": W" ⫽ 关L共1兲, U共1兲兴 ⫽ 关D1, 共1 ⫺ D1兲兴. 677 M" ⫽ L(n) U(n) L共n ⫹ 1兲 U共n ⫹ 1兲 1 0 共1 ⫺ D2 兲 D2 (A14) The likelihood of any set of data over which A11 can be defined can also be estimated for this one-process model maximizing a simplified version of A11 that contains only the parameters in A14. The revised fit statistic is then, G2 ⫽ ⫺2ln关L2 /L7], (A15) which is asymptotically distributed as 2(5) because the model has two parameters. In particular, note that the two one-stage models are hierarchically nested models. The implication is that if the null hypothesis of fit is rejected for A13, then it is also rejected for A15. Appendix B Original Word Lists Used in Experiments 1 and 2 (and their English translations in parentheses) Experiment 1 Fillers: Chaleira (kettle), martelo (hammer), metal (metal), and quadrado (square); Negative targets: Abandonado (abandoned), egoı́sta (selfish), horrı́vel (horrible), idiota (idiot), infecção (infection), lama (mud), lixo (garbage), luto (mourning), podre (rotten), and trevas (darkness); Neutral targets: Controle (control), fogo (fire), interesse (interest), juventude (youth), processo (process), reunião (meeting), sociedade (society), tempo (time), tobogã (toboggan), and falso (false); Positive targets: Beijo (kiss), conhecimento (knowledge), dinheiro (money), encontro (date), graduado (graduate), noiva (bride), pai (father), sexo (sex), sucesso (success), and vida (life). Experiment 2 Negative-low targets: Cicatriz (scar), débil (feeble), desajeitado (clumsy), desertor (deserter), detestar (detest), pecado (sin), porão (basement), preguiçoso (lazy), and ralé (rabble); Negative-high targets: Faminto (starving), fuga (flight), hospital (hospital), incomodar (annoy), infante (infant), perseguir (chase), pressão (pressure), suspeito (suspicious), and tubarão (shark); Positive-low targets: Adorável (adorable), favor (favor), gramado (lawn), inspirar (inspire), lago (lake), papel (paper), sábio (wise), satisfeito (satisfied), and terra (earth); Positive-high targets: Aventura (adventure), curioso (curious), dinheiro (money), encontro (date), graduado (graduate), liberdade (freedom), noiva (bride), paixão (passion), and vida (life). Received January 2, 2012 Revision received March 12, 2012 Accepted March 19, 2012 䡲