What to Click, When to Stop, and What to Buy: A Model of
Information Processing and Choice at an E-commerce Website
Timothy J. Gilbride
Imran S. Currim
Ofer Mintz
S. Siddarth*
March 2015
* Timothy J. Gilbride ([email protected]) is Associate Professor and Notre Dame Chair in
Marketing, Mendoza College of Business, University of Notre Dame, Notre Dame, IN 46556.
Imran S. Currim ([email protected]) is Chancellor’s Professor and Professor of Marketing, Paul
Merage School of Business, University of California, Irvine, CA 92697. Ofer Mintz
([email protected]) is Assistant Professor of Marketing, E. J. Ourso College of Business,
Louisiana State University, Baton Rouge, LA 70803. S. Siddarth ([email protected]) is
Associate Professor of Marketing, Marshall School of Business, University of Southern
California, Los Angeles, CA 90089. The authors thank Internet Technology Group, Inc. (ITGi)
for providing the data, and the UCI Paul Merage School of Business Dean’s Office for financial
support.
What to Click, When to Stop, and What to Buy: A Model of
Information Processing and Choice at an E-commerce Website
Abstract
Information display boards have been extensively used to study how consumers process product
attribute information but rarely to infer consumer preferences or predict choices. We bridge this gap by
proposing a modeling framework for the decisions consumers make: which product-attribute to inspect
next, when to stop processing information, and which, if any, product to purchase. The models are
estimated on datasets collected at a popular manufacturer's website wherein shoppers accessed product-attribute information and made product choices. We find evidence that consumers (i) use both expected utility and spatial location to determine which product-attributes to inspect, (ii) stop processing information according to a sequential rather than a fixed sample strategy, and (iii) use prior information
for product-attributes not accessed. The results provide meaningful customer preference estimates and can
be employed by firms to prioritize non-buyers for follow-up communications.
Keywords: Choice Models; Information Processing; E-Commerce; Bayesian Models.
Websites such as Amazon, Apple, Best Buy, and CNET, amongst others, often organize product-attribute information in a matrix form reminiscent of the information display boards (IDBs) widely
employed in lab-based information processing research (see Figure 1). This requires shoppers to make a
sequence of decisions: (i) which product-attribute to inspect next, (ii) when to stop processing
information, and (iii) which, if any, product to purchase. The main purpose of this paper is to develop an
integrated model of these three decisions in order to test different theories of consumer information
processing as well as infer consumers’ preferences and knowledge about the market by using data from
consumers at the actual point-of-purchase.
The large literature on behavioral information processing has examined how consumers process
attribute information and make choices. This research, primarily based on laboratory studies using IDBs,
demonstrates that consumers employ strategies such as processing by alternative (columns in Figure 1) or
by attribute (rows in Figure 1), and/or access and process only a subset of the available information when
making decisions. An equally large literature in economics has formulated optimal search models
incorporating consumer’s prior knowledge and expected utility into a cost-benefit framework. In addition,
recent empirical and experimental research in marketing and economics on search and information
processing has drawn on behavioral theories to investigate alternative models that challenge previous
theoretical models. Our work adds to these extant literatures by modeling multi-attribute, multi-product
information processing from a field study in an actual purchase environment.
Specifically, the data we employ for this work comes from the website of a firm that showed
shoppers the names of products and attributes, as in the first row and column of Figure 1, but concealed
the product-attribute values in the remaining cells. Shoppers could then click on a cell in order to reveal
the values. Thus, our data captures shopper decisions on which product-attribute to inspect next, when to
stop processing information, and which, if any, product to purchase. This data allows us to test several
propositions of interest to researchers and managers. First, we propose and compare sequential and fixed
sample stopping rules for when consumers stop processing information in the IDB. In the sequential rule,
the decision is based on the information that has been revealed so far in the IDB (analogous to sequential search¹).
search1). In contrast, the fixed sample rule is simply based on the total amount of information processed
(analogous to fixed sample search). Previous research has not compared sequential vs. fixed sample
product-attribute information processing strategies in a real-world, multi-attribute point-of-purchase
decision. Second, we investigate how product-attributes that are not revealed in the information
acquisition process influence the choice and cell opening decisions, i.e., whether these levels are
effectively ignored by the consumer or if consumers substitute some ex-ante values for these unseen
product-attributes. Finally, we test whether incorporating alternative (column) and attribute (row) based
information processing into the information acquisition model improves performance over a model that is
based solely on cost-benefit principles. We extend the previous forced-choice, lab-based research findings
by examining these questions using data from consumers who made actual purchase decisions.
We also contribute two methodological ideas to facilitate the current research, which may prove
useful to others working in this area. First, both behavioral and economic theories suggest that the level of
the unknown attributes as well as the uncertainty about the value influence consumer information
processing. Such uncertainty is typically represented by specifying parametric distributions of attributes,
for example, prior knowledge about price may be represented via a log-normal distribution. In contrast,
we propose an alternative, non-parametric method based on the likely maximum and minimum levels that
the attributes may take. We show that this method does a good job of representing the uncertainty in
attribute levels while significantly reducing the computational burden for the researcher. Our second
methodological contribution is an alternative method to infer ex-ante expected attribute values which
exploits the information processing pattern within and between consumers.
Our main empirical findings are as follows. First, we find that consumers use a sequential
strategy to decide when to stop processing additional information in the IDB. Second, we find consumers use prior information on attribute levels when they don't inspect a particular product-attribute level, and
our model based approach to inferring these values improves upon other methods. Finally, we find
incorporating alternative (column) and attribute (row) based information processing, or what has been
referred to as “spatial biases” in the economics literature (Sanjurjo 2014), improves the fit and forecasting
performance of the models. However, since our model based approach is only a paramorphic
representation, we do not draw any definitive conclusions about consumers' actual mental processes.

¹ The term "search" is typically used in the literature when information acquisition is in a non-IDB setting. Our context closely resembles an IDB setting and we will use the phrases "information processing" or "information acquisition" interchangeably, making the assumption that all information acquired is processed by consumers. We will use "search" when referring to literature in a non-IDB setting.
Managerially, the model provides attribute-weights analogous to those obtained in conjoint-type
studies, but not previously obtained from IDB studies. In addition to the usual applications, the parameter
estimates can be used to infer the market’s expected value for attributes. Further, we show how the model
can be used by managers to identify their most likely prospects among those who did not purchase during
this occasion at the point-of-purchase, which is important for follow-up communications.
This work builds on previous research that used the same dataset. Mintz, Currim, and Jeliazkov
(2013) link a consumer's overall pattern of information processing on a website to the decision of whether or not to purchase. "Pattern" was defined as the extent to which a consumer engaged in alternative- versus attribute-based information processing and ranged in value from 1 to -1. However, the dependent variable in their model (whether or not to purchase) is different from ours (which product-attribute to inspect next, when to stop, and which product to purchase), i.e., they do not analyze any of the
three information processing or choice decisions investigated in this paper. Further, their independent
variable (pattern) is also different from ours (product-attributes) and they cannot infer attribute
importance weights. Currim, Mintz, and Siddarth (2015) study the final product choice of consumers and
show that a reduced form model based on the information accessed by consumers fits better than one
based on information available. By not modeling the cell opening decision and assuming that the revealed
attribute levels are exogenous, they ignore the endogenous relationship between the attribute weights and
the cells that are opened. In contrast, the structural model proposed in this research overcomes this
endogeneity problem by jointly modeling information acquisition and product choice. In addition, neither
paper investigates sequential or fixed sample stopping rules nor how consumers represent and use prior
information. Hence, in both method and purpose, this research differs from that of Mintz, Currim, and
Jeliazkov (2013) and Currim, Mintz, and Siddarth (2015).
2. Background
The literature on information processing and search is vast and, in this section, we briefly review
only those studies that have a direct bearing on our research. The economic modeling literature takes a
formal cost-benefit approach wherein decision makers compare the expected value of additional
information to the cost of acquiring it. Hagerty and Aaker's (1984) model of how consumers process
information from an IDB setting, which is the basis of our own implementation, is powerful and flexible,
and builds on early work by Stigler (1961), Nelson (1970), Ratchford (1980), and Shugan (1980). Hagerty
and Aaker operationalize the marginal benefit of further search via the expected value of sample
information (EVSI) and propose a stopping rule based on comparing the EVSI to the cost of obtaining
additional information. The EVSI is dependent on prior beliefs and the sequence in which information is
inspected. Hagerty and Aaker show that if a consumer’s prior expectations for the value of each product’s
attribute levels can be described by a multivariate normal distribution, then the EVSI can be calculated
via the unit normal loss integral given by Raiffa and Schlaifer (1961). Hagerty and Aaker test their model
using parameter estimates obtained from an initial survey and evaluate how well the EVSI model predicts
the subsequent IDB information acquisition activity of their subjects. However, the data necessary to
calibrate the model seems to limit its application to laboratory settings and product attributes that are
continuous. In contrast, our models and empirical implementation handle both nominal and continuous
attributes and directly estimates attribute importance from observed behavior, without using surveys.
De los Santos et al. (2012) offer a useful contrast of sequential and fixed sample product search
strategies. They conclude that consumers follow a fixed sample search process in the purchase of online
books where product uncertainty is limited to only one attribute, price. Our work allows for uncertainty in
all the product attributes, but limits the scope of information acquisition to the IDB. In work conceptually
similar to ours, Ke, Shen, and Villas-Boas (2014) formulate an analytical model of multi-attribute, multi-product search in continuous time, with infinite product attributes and search yielding infinitesimal
information. This extends the single product model of Branco, Sun, and Villas-Boas (2012). The
continuous time, infinitesimal approach allows the information acquisition process and consumer’s value
of a product to be represented as Brownian motion and modeled via dynamic programming methods to
derive optimal solutions and insights. Gabaix et al. (2006) also investigate multi-attribute, multi-product
search with a discrete number of continuous variables in an experimental setting; they find that consumers
do not follow the optimal sequential search strategy, but rather employ cognitive coping mechanisms. We
build on this research by relaxing the parametric assumptions about prior knowledge and estimating
models in an actual purchase situation.
The behavioral information processing literature based on laboratory studies using IDBs
primarily focuses on describing consumers’ information processing and choice behaviors and typically
eschews any formal econometric analysis. Among others, Payne, Bettman, and Johnson (1993), Bettman,
Luce, and Payne (1998), and Dhar and Nowlis (2004) find that consumers process information in three basic ways: (1) by alternative, where multiple attributes of a single alternative are processed before information on a new alternative is accessed; (2) by attribute, where information about a single attribute across multiple alternatives is processed before information about another attribute is accessed; and (3) a combination of
alternative- and attribute-processing. Behavioral researchers have found these patterns to be pervasive and
to vary systematically depending on the stage of the choice task (Bettman and Park 1980, Gensch 1987,
Payne 1976) and individual characteristics (Creyer, Bettman, and Payne 1990, Payne, Bettman, and
Johnson 1993). Shi, Wedel, and Pieters (2013) use these theories and employ eye-tracking data from a
lab-based setting to model product-attribute information acquisition using a three layer hidden Markov
model. Our approach incorporates both cost-benefit and descriptive elements into a model based
framework.
Sequential search models from economics and certain behavioral information processing models
posit that the search process itself leads to a specific product selection. In optimal sequential search
models, the chosen alternative is by definition the last product inspected. Similarly, in elimination-by-aspects models from the behavioral literature, the manner in which information is processed leads to only
one remaining alternative. Our model is more akin to two-stage models; we model information
acquisition and then choice from the set of alternatives.
Other studies have also incorporated both rational and descriptive behavioral approaches to model
information processing. Simonson, Huber, and Payne (1988) collect data similar to the method of Hagerty
and Aaker (1984) and find that prior certainty, attribute importance, and an overall measure of brand
attractiveness have a significant impact on the sequence of information processed. It is important to note
that their model is descriptive in nature without an underlying economic structure like that proposed by
Hagerty and Aaker (1984). Urban and Roberts integrate multi-attribute preference, risk, and belief
dynamics (due to word of mouth, reviews, etc.) into a choice model that focuses on pre-launch
management of a new product. Fischer et al. (2000), Johnson, Payne, and Bettman (1988), and Kivetz,
Netzer, and Srinivasan (2004), respectively, have modeled preference reversals, preference uncertainty,
and the compromise effect. Meyer (1982) provides a formal descriptive model of consumer information
processing behavior, “lying somewhere between a pure process model (e.g., Bettman 1979)…, and a
purely-normative statistical model (e.g., Hagerty and Aaker 1984) aimed at describing the optimal rather
than actual behavior in markets” (p. 95). Meyer models the probability that information on an alternative
will be accessed at a given time, and how expectations with respect to attribute values are updated over
the information acquisition sequence. Meyer tests his descriptive model in two controlled experiments,
and unlike our research, does not model consumers’ purchasing decision. However, he notes the
importance of “assess(ing) the degree to which it (the model) can predict individual behavior in complex
settings” (p. 120).
In summary, none of the economic, behavioral, or hybrid approaches proposed heretofore focuses on the main goal of this paper: to model which attribute to inspect next, when to stop, and which alternative to choose, in order to infer consumer preferences for attributes and predict choices, which we accomplish by employing data from an actual online point-of-purchase setting.
3. Model
Our modeling effort has several goals. The first goal is to test different aspects of consumers’
point-of-purchase information processing. The modeling will focus on the type of stopping rule used, the
calculation of the benefit to additional processing which determines which product-attribute to inspect
next, and the role of unseen product-attributes in information processing and product choice. Second, the
model is intended to provide managerially useful estimates of attribute importance from consumers in an
actual purchasing situation. The information processing model is used to overcome the real-world
difficulties of non-varying attribute levels and a fixed choice set across consumers in order to identify the
model and obtain parameter estimates.
The models presented are paramorphic representations of the actual consumer decision process.
Although we rely on economic and behavioral theories to guide our modeling, we cannot incorporate all
plausible theories or empirical regularities from lab based research. First, as noted by Bradlow, Hu, and
Ho (2004), the theoretical description must be amenable to mathematical representation. Second, the data
must be suitable to fitting and estimating the model. For example, while it may be possible to
mathematically represent a behavioral hypothesis (for instance learning and updating one’s prior beliefs),
the data may be insufficient to identify or reliably estimate the model. As such, we carefully restrict the
models tested so that we can falsify one model under consideration versus another, without asserting that
the preferred model is definitively the underlying consumer decision process.
3.1. Base Model
We begin by fully elaborating a model that incorporates a sequential stopping rule, an expected value
calculation of processing benefits, and with the average values of the actual attribute levels used to
represent the ex-ante values of product-attributes not inspected. An outline of the estimation procedure is
included and full details are provided in the Online Technical Appendix. Following this initial
description, we will introduce alternative mathematical representations designed to represent different
behavioral mechanisms. Table 1 previews the main components in the model and the variations which
will be explored.
Consumers process information for and reveal product-attribute levels by clicking on cells in an
online IDB; products are in columns and attributes are in rows. We assume that consumers have ex-ante
expectations for the levels of the product-attributes, represented by $\bar{x}_{jk}$ for product j and attribute k. As a consumer inspects product-attributes, she discovers $x_{jk}$, the actual value of attribute k for product j. Let the vector $x_{ij}$ represent the combination of revealed and ex-ante expected values for product j for person i. Before inspecting any product-attributes, $x_{ij} = \bar{x}_j$. When a product-attribute level is revealed, $x_{jk}$ replaces $\bar{x}_{jk}$ in the vector $x_{ij}$, and this vector therefore corresponds to consumer i's unique information processing pattern. The vector $x_{ij}$ includes an intercept term that represents the product name or any other additional
information that is revealed prior to search (e.g., column headings with product name).
The first component of the model specifies how consumers make the final choice between
product alternatives. As is standard in the discrete choice literature (e.g., Gilbride and Allenby 2004), we
assume that the final product decision is based on a linear compensatory indirect utility function. The indirect utility that consumer i expects from product j is given as:

$V_{ij} = \beta' x_{ij}^f + \varepsilon_{ij}.$    (1)

Here $x_{ij}^f$ represents the final vector of product-attribute levels for product j that consumer i is using in his decision. The error term ε represents additional uncertainty about any remaining attributes and/or other factors related to the product, e.g., the performance of the product in particular usage situations, durability, quality, etc. If the consumer makes a product choice without inspecting any product-attribute levels at the point-of-purchase, his choice is based on $\beta' \bar{x}_j + \varepsilon_{ij}$. As the consumer sequentially inspects product-attribute levels, indirect utility is given by $\beta' x_{ij} + \varepsilon_{ij}$ at each step of the process.
Let $X_i^f$ represent the final matrix of product-attributes across alternatives in a particular choice set used by a consumer and $\varepsilon_i$ the stacked vector of $\varepsilon_{ij}$. The consumer's ultimate decision problem is to choose the alternative corresponding to $\max(\beta' X_i^f + \varepsilon_i)$. When $\varepsilon_{ij}$ is distributed i.i.d. standard extreme value, the expected maximum utility across alternatives is given as (see Anderson, de Palma, and Thisse 1992):

$E\left[\max(\beta' X_i^f + \varepsilon_i)\right] = \ln\left(\iota' \exp(\beta' X_i^f)\right) + \gamma,$    (2)
where ι is a vector of ones and γ is Euler’s constant. This value is also distributed according to the
extreme value distribution (Ben-Akiva and Lerman 1985), a property that we will exploit in developing
the likelihood function.
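To make the logsum in equation (2) concrete, the following sketch (Python; the attribute matrix and weights are purely illustrative and not taken from the paper) computes the expected maximum utility for a small set of products.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma in equation (2)

def expected_max_utility(X, beta):
    """E[max_j(beta'x_j + eps_j)] under i.i.d. standard extreme value errors:
    ln( iota' exp(beta'X) ) + gamma, the logsum in equation (2)."""
    return np.log(np.sum(np.exp(X @ beta))) + EULER_GAMMA

# Hypothetical 3-product, 2-attribute example; values are illustrative only.
X_f = np.array([[1.0, 0.5],
                [2.0, -0.5],
                [1.5, 0.5]])
beta = np.array([0.8, 1.2])
print(expected_max_utility(X_f, beta))
```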
Consumers acquire information about products and attribute levels from many different sources at
different points in time and bring that information with them to the point-of-purchase. If their prior
information is complete and known with certainty, we would not see any additional information
acquisition at the point-of-purchase. We now consider the benefit to the consumer of inspecting or
processing more information as part of their decision process. At any step of the process, let $X_i^n$ represent the matrix of product-attribute levels currently being used by the consumer after n cells have been opened² and $X_i^{n+}$ the state of the matrix if one additional cell is opened. The benefit of revealing the contents of this new cell, $x_{jk}$, is given by $E[\max(\beta' X_i^{n+} + \varepsilon_i)] - E[\max(\beta' X_i^n + \varepsilon_i)]$, which can, in turn, be written as:

$\chi = \ln\left(\iota' \exp(\beta' X_i^{n+})\right) - \ln\left(\iota' \exp(\beta' X_i^n)\right).$    (3)

² This includes the product-attribute levels already revealed and the ex-ante values for product-attributes which have not been opened.
Per equation (3), opening a new cell in the IDB will yield an incremental expected maximum utility, χ.
Consistent with economic models of product search (e.g., Moorthy, Ratchford, and Talukdar 1997), we
assume that consumers' prior knowledge and uncertainty about the values of the product-attribute levels $x_{jk}$ can be represented via the parametric distribution $f(x_{jk})$. Then, the expected benefit of inspecting the cell with attribute k for product j is:

$\chi_{ijk}^n = \int \ln\left(\iota' \exp(\beta' X_i^{n+})\right) f(x_{jk})\, dx_{jk} - \ln\left(\iota' \exp(\beta' X_i^n)\right).$    (4)

The quantity $\chi_{ijk}^n$ can be calculated for every product-attribute level which has not already been inspected
by the consumer and this quantity is used to decide if any additional information processing will be
undertaken, and if so, which product-attribute level will be inspected next. We assume the cost of
processing information in the IDB is constant across time and options and is not included in equation (4);
it will be discussed below. The quantity in equation (4) plays the same role as the EVSI of Hagerty and
Aaker (1984), but can accommodate any parametric distribution as opposed to just the multivariate
normal and thus allows for continuous and discrete attributes.
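As an illustration of equation (4), the sketch below approximates the expected benefit of opening a cell by Monte Carlo integration over an assumed log-normal prior for a continuous attribute; the function names, the use of simulation rather than the paper's numerical integration scheme, and all parameter values are assumptions for exposition only.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_inclusive_value(X, beta):
    """ln( iota' exp(beta'X) ): the logsum over products."""
    return np.log(np.sum(np.exp(X @ beta)))

def expected_benefit(X_n, beta, j, k, mu, sigma, draws=2000):
    """Monte Carlo approximation of equation (4): the expected gain in maximum
    utility from revealing cell (j, k), averaging over draws from an assumed
    log-normal prior f(x_jk) with parameters (mu, sigma)."""
    base = log_inclusive_value(X_n, beta)
    total = 0.0
    for x_jk in rng.lognormal(mean=mu, sigma=sigma, size=draws):
        X_plus = X_n.copy()
        X_plus[j, k] = x_jk               # candidate cell set to the drawn value
        total += log_inclusive_value(X_plus, beta)
    return total / draws - base           # chi_{ijk}^n
```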
We finalize the model specification by formally stating the decisions faced by the consumer.
Because all consumers in our study inspected at least one product-attribute level, the first decision made is which product-attribute level to inspect. Mathematically, the consumer inspects $x_{jk}$ if $\chi_{ijk}^n = \max\{\chi_{ijk}^n\}$ over all j and k values. Like Hagerty and Aaker (1984), we assume that consumers will continue to inspect product-attributes as long as the expected gain exceeds some threshold, τ, which represents either the monetary or cognitive cost of acquisition. In other words, consumers will continue to inspect product-attributes as long as:

$\max\{\chi_{ijk}^{n+}\} > \tau,$    (5)

where the set $\{\chi_{ijk}^{n+}\}$ excludes cells that have already been inspected. If additional information processing is indicated, the consumer once again chooses to inspect attribute k for product j if $\chi_{ijk}^{n+}$ is $\max\{\chi_{ijk}^{n+}\}$ over all j and k values corresponding to cells that have not already been inspected. We use the superscript "n+" and the subscript i to emphasize that the value of $\chi_{ijk}^{n+}$ depends on which other product-attribute levels have already been inspected, i.e., the n previous selections made by individual i. Finally, letting $X_i^f$ represent the final attribute matrix containing all the inspected attribute levels and a column representing the "none" or outside good, then the chosen alternative is the one which corresponds to $\max(\beta' X_i^f + \varepsilon_i)$.
This model represents a sequential stopping rule because the decision of how many product-attributes to inspect (i.e., when to stop processing) in equation (5) is a function of the information
revealed in the search process and is not decided a priori. Equation (4) calculates the benefit of
processing as the expected value of revealing a particular product-attribute compared to the current state
of knowledge. Unseen product-attributes, i.e., the product-attribute levels not revealed by the consumer in
the IDB, are represented by the ex-ante expected values $\bar{x}_{jk}$.
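The base model's decision sequence can be summarized in a short simulation sketch: compute the expected benefit of every unopened cell, open the cell with the largest benefit as long as it exceeds τ (equation (5)), and otherwise stop and choose. This is only a schematic of the myopic rule described above, with hypothetical inputs; the cell selection is shown as a deterministic argmax and the extreme value errors enter only at the choice stage.

```python
import numpy as np

def simulate_processing(X_bar, X_true, beta, tau, benefit_fn, rng=None):
    """Myopic sequential rule of the base model: open the unopened cell with
    the largest expected benefit chi as long as max(chi) exceeds tau
    (equation (5)), then make the final product choice.
    X_bar holds ex-ante expected attribute values, X_true the actual (hidden)
    values; benefit_fn(X_n, beta, j, k) returns chi for an unopened cell."""
    if rng is None:
        rng = np.random.default_rng(0)
    J, K = X_bar.shape
    X_n = X_bar.copy()                      # current information state
    opened = set()
    while len(opened) < J * K:
        chi = {(j, k): benefit_fn(X_n, beta, j, k)
               for j in range(J) for k in range(K) if (j, k) not in opened}
        (j, k), best = max(chi.items(), key=lambda c: c[1])
        if best <= tau:                     # expected gain below processing cost
            break
        X_n[j, k] = X_true[j, k]            # reveal the actual attribute level
        opened.add((j, k))
    # final choice over J products plus the outside good (utility 0),
    # with i.i.d. extreme value errors added at the choice stage
    v = np.append(X_n @ beta, 0.0) + rng.gumbel(size=J + 1)
    return opened, int(np.argmax(v))
```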
While the decision model relies on prior knowledge, expected values, and cost-benefit analysis, it
does not represent the economically optimal sequential acquisition process. Gabaix et al. (2006) discuss
the economically optimal process in the context of a multi-attribute, multi-product decision task with a
finite number of attributes, and Ke, Shen, and Villas-Boas (2014) discuss a continuous-time, infinite
attribute model. Our model departs from the “optimal” because equations (4) and (5) do not consider the
option value of being able to inspect other attributes in the future. Similar to Gabaix et al.'s “directed
cognition model,” our models assume that “agents act as if their next set of search operations were their
last opportunity for search” (p. 1043). In a simple experiment, Gabaix et al. show that the myopic model
fits subjects’ search data better than the optimal model and that in larger problems like our empirical
example, calculating the optimal solution is intractable even using modern computing methods. Thus,
while this base model is rational and uses orthodox assumptions from economic theory, it is not the
optimal information processing strategy.
3.1.1. Model Estimation. The data contains three pieces of information corresponding to the model
elements described above: (i) which product-attribute level to inspect next, (ii) whether or not to continue
inspecting product-attribute levels, and (iii) which product to choose. Equation (1) introduces an error
term associated with the consumer’s final choice and via the model structure, specifically equation (2),
this error term is integrated out of the intermediate decisions. However, because the scale of the error
term is assumed to be the same across alternatives, the expected value of the maximum utility is
distributed according to the extreme value distribution. This provides a straightforward way to model the
multinomial or binary outcomes of the information processing model. The actual observations of these
events allow for the specification and estimation of the statistical models.
The consumer chooses which attribute level to inspect next based on $\max\{\chi_{ijk}^n\}$ over all j and k; this is a multinomial outcome. Let $x_{ijk}^n$ be a candidate product-attribute level (i.e., the cell in the IDB). Because the value of $\chi_{ijk}^n$ is the difference between two extreme valued variables, it is also distributed extreme value and inherits the scale term from ε in the underlying indirect utility function, equal to 1. The $\max\{\chi_{ijk}^n\}$ therefore follows the standard multinomial choice probability. The probability that a particular product-attribute level is inspected is given by:

$\Pr(x_{ijk}^n = 1) = \dfrac{\exp(\chi_{ijk}^n)}{\sum_{(j,k) \notin M} \exp(\chi_{ijk}^n)},$    (6)
where M is the set of product-attribute levels already inspected. Let $c_i^n = 1$ if individual i continues to inspect product-attribute information at step n. Then $c_i^n = 1$ if the largest value of $\chi_{ijk}^{n+}$ is greater than some threshold. Specifically, if $\max_{j,k \notin M} \chi_{ijk}^{n+} > \tau$ then information processing continues. Because $\chi_{ijk}^{n+}$ is distributed extreme value with scale 1, this probability is:

$\Pr(c_i^n = 1) = \dfrac{\exp\left(\max_{j,k \notin M} \chi_{ijk}^{n+} - \tau\right)}{1 + \exp\left(\max_{j,k \notin M} \chi_{ijk}^{n+} - \tau\right)}.$    (7)
Given the previous assumptions about ε, the final choice probability is given by:
$\Pr(y_{ij} = 1) = \dfrac{\exp(\beta' x_{ij}^f)}{\sum_{j=1}^{J+1} \exp(\beta' x_{ij}^f)},$    (8)

where the (J+1)th product corresponds to the "no-purchase" option. Attribute levels for the outside good are
set equal to 0.
For each individual, $N_i$ attribute levels are inspected. Let $\pi(x_i^n)$ correspond to the cell inspection probability given in (6), $\pi(c_i^n)$ correspond to the probability of continuing to inspect cells given in (7), $1 - \pi(c_i^n)$ the probability to quit, and $\pi(y_{ij})$ the final choice probability, given in (8). The likelihood function for an individual is given by:

$\ell_i = \left[\prod_{n=1}^{N_i - 1} \pi(x_i^n)\, \pi(c_i^n)\right] \pi(x_i^{N_i}) \left[1 - \pi(c_i^{N_i})\right] \pi(y_{ij}).$    (9)
Bayesian Markov Chain Monte Carlo (MCMC) methods are used to obtain draws from the posteriors of
all model parameters. In addition to the vector of importance weights β and the scalar stopping threshold
τ, the parameters for $f(x_{jk})$ must be estimated or specified. Because all the products in the current study represent the same brand in a particular price category, we assume $f(x_{jk}) = f(x_k)$, i.e., the a priori distribution of the likely levels of an attribute is the same across products. This is consistent with
Moorthy, Ratchford, and Talukdar (1997) who argue that consumers’ expectations differ by brand.
However, this assumption may need to be revised in situations in which the products represent different
brands or are from different price tiers.
Our data contain continuous and discrete attributes. For continuous attributes, we assume $x_k$ is distributed log-normal$(\mu_k, \sigma_k^2)$. Following the example of De los Santos et al. (2012), we use the actual distribution of product-attribute levels in the choice set and calculate $E[x_k] = \bar{x}_k$, the arithmetic mean. In our MCMC estimation it is more efficient to estimate the coefficient of variation, which,³ together with $\bar{x}_k$, allows us to calculate $\mu_k$ and $\sigma_k^2$. For discrete attributes, $f(x_k)$ is assumed to be Bernoulli with parameter $\theta_k = 0.5$. The MCMC chain then produces a posterior distribution of the discrete values of $x_k$. Orthogonal
coding is used for discrete attributes; i.e., for a discrete attribute with two levels, its presence is coded by
0.5 and its absence is coded by −0.5. The product name and price are revealed to all consumers at the start
of the data collection process and binary coding is used to capture these model intercepts. The integration
in equation (4) is implemented numerically.
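For the log-normal prior, the mean and the estimated coefficient of variation pin down $\mu_k$ and $\sigma_k^2$ through standard identities; a minimal sketch of that mapping (with illustrative numbers only) is:

```python
import numpy as np

def lognormal_params(mean, cv):
    """Recover (mu_k, sigma_k^2) from the attribute mean and coefficient of
    variation using E[x] = exp(mu + sigma^2 / 2) and CV^2 = exp(sigma^2) - 1.
    Note that E[x_k] is not equal to mu_k."""
    sigma2 = np.log1p(cv ** 2)
    mu = np.log(mean) - sigma2 / 2.0
    return mu, sigma2

# Illustrative only: an attribute with mean 4.0 and coefficient of variation 0.3.
mu_k, sigma2_k = lognormal_params(4.0, 0.3)
```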
Details of the estimation procedure appear in the Online Technical Appendix. Priors are diffuse
but proper, and a standard Metropolis-Hastings algorithm is used to draw parameter values from their
posteriors. The algorithm follows procedures detailed in Rossi, Allenby, and McCulloch (2005).
Estimation of the remaining models involves variations on the components outlined here and will therefore not be repeated. The most computationally challenging part of the estimation procedure is calculating the set $\{\chi_{ijk}^{n+}\}$ for each individual, for each attribute-level revealed. In a product-attribute matrix that has 3 products and 11 attributes (discussed in section 4), at time 0 the set $\{\chi_{ijk}^{n+}\}$ contains 33 items, and after the
first attribute level is revealed it contains 32 items, and so on. Having a computationally tractable
expression for the value of revealing a particular attribute-level (e.g., equation (4)) is essential to
estimating this class of models with real data involving multiple products and attributes and hundreds of
consumers.
Consumers revealed different subsets of the product-attribute levels, creating variation in the $X_i^f$ matrix; because of this variation across consumers, the attribute importance weights β can be identified using just the final product choice portion of the model, as in Currim, Mintz, and Siddarth (2015).

³ Note that for the log-normal distribution $E[x_k]$ is not equal to $\mu_k$.

However, modeling the information search process not only addresses the endogeneity concerns, it also helps to identify the attribute weights. Fixing $\bar{x}_k$ and changing $\sigma_k^2$ implies a different sequence of revealed product-attributes for each respondent. The parameters β and $\{\sigma_k^2\}$ cannot be simultaneously changed in equation (6) keeping the likelihood constant, because only β contributes to the likelihood in equation (8). Recall that after the first product-attribute level is revealed, each consumer has a unique set of $\{\chi_{ijk}^{n+}\}$. For a given cell position specified by j and k, the value of $\chi_{ijk}^n$ will change as additional product-attribute levels are revealed, so that $\chi_{ijk}^n \neq \chi_{ijk}^{n+1}$. Because of this, the $\{\chi_{ijk}^{n+}\}$ are not confounded with the row or column order of the product-attributes in the IDB. Changing the stopping parameter τ implies a different number of inspected attributes and, again, since β and $\{\sigma_k^2\}$ cannot be simultaneously changed without changing the other components of the likelihood, τ is uniquely identified by equation (7).
Simulation experiments available from the authors demonstrate that the parameter values are identified
and can be recovered using data sets analogous to those used in the empirical example.
3.2. Stopping Rule
The base model assumes consumers use a sequential stopping rule in that the decision to continue
inspecting attributes is a function of the information that has been revealed so far in the search. In
equation (7), the term $\chi_{ijk}^{n+}$ depends on which product-attributes have been inspected (and their actual
value xjk ) up to point n. An alternative is a fixed sample stopping rule, so named because the consumer
decides before beginning the search how many product-attributes will be examined.
Both the sequential search and the fixed sample search strategy have been proposed and
examined extensively in the economics literature for the case of a single unknown product attribute,
typically price or the wage rate. The sequential search strategy dominates the fixed sample search strategy
from a normative standpoint, see a demonstration in Feinberg and Johnson (1977) and additional
discussion and references in Miller (1993, p. 164). However, a recent study by De los Santos et al. (2012)
showed that many of the predictions of optimal sequential search were violated and that the fixed sample
search provided a better description of consumers' behavior in the online search and purchase of
books. Honka (2014) uses a fixed sample search strategy to model consumer search and purchases of auto
insurance; she links this strategy to the common practice in marketing of modeling consumer
consideration sets, e.g., Mehta, Rajiv, and Srinivasan (2003). Under this approach, the fixed sample
search takes the form of determining how many and which products will be examined and included in the
final choice set.
In the context of our information processing problem, the fixed sample stopping rule permits two
possible behavioral interpretations. First, consistent with economic theory, a consumer decides before the
task to open a fixed number of cells, say 5 or 6, and then to stop and make a decision about which
product, if any, to purchase. Similar to the sequential strategy, this decision can be formulated in terms of
the cost and benefits of information acquisition, integrated over the consumer’s prior knowledge of the
attribute levels, and an optimal strategy can be formulated. An alternative explanation is that a consumer
simply grows fatigued and is more likely to stop opening more cells, as a function of how many cells
have already been opened. The key difference between sequential and fixed sampling in the current
setting is that the decision to stop processing information in the fixed sample strategy is a function of how
many cells have been opened, not a function of the information revealed in the IDB.
To implement the fixed sample stopping rule, it would be natural to model the number of
inspected product-attributes as a Poisson process. However, the summary statistics in our data (discussed
in the next section) show that the average number of attributes revealed is 11.54 while the variance is
85.38, which violates the Poisson assumption that the mean and variance are equal. We therefore use a
Negative Binomial distribution, which is often used as an over-dispersed Poisson. The Online Technical
Appendix details the truncated Negative Binomial distribution which has parameters α and p. The
Negative Binomial distribution models the decision to stop processing product-attributes as an increasing
function of the number of product-attributes already inspected. That is, the probability of continuing to
inspect product-attributes is relatively high after the first product-attribute is inspected (the probability of
quitting is low), but is much lower after the 10th or 20th cell is revealed. In terms of the model likelihood,
equation (7) is replaced with the probability of observing exactly n opened cells calculated from the
truncated Negative Binomial distribution where n = 1,…,ni corresponds to each consumer’s specific
information processing sequence.
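As a rough illustration, method-of-moments values for a Negative Binomial with the reported mean (11.54) and variance (85.38) can serve as starting points, and the truncation reflects that every consumer opened at least one cell. The parameterization below (scipy's r, p convention) is an assumption for exposition and need not match the α, p parameterization detailed in the Online Technical Appendix.

```python
import numpy as np
from scipy.stats import nbinom

# Method-of-moments starting values from the reported summary statistics
# (mean 11.54, variance 85.38 opened cells); these are starting points only.
mean, var = 11.54, 85.38
p = mean / var                        # nbinom success probability
r = mean ** 2 / (var - mean)          # nbinom size (dispersion) parameter

def truncated_nb_pmf(n, r, p):
    """P(N = n | N >= 1): a Negative Binomial for the number of opened cells,
    truncated at one because every consumer inspected at least one cell."""
    return nbinom.pmf(n, r, p) / (1.0 - nbinom.pmf(0, r, p))
```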
3.3. Product-Attribute Selection
In the base model, the benefit of continuing to process information or of inspecting a particular product-attribute level is the difference between the expected maximum utility of inspecting candidate cell $x_{ijk}^n$ and the expected maximum utility of making the final choice using the current set of revealed product-attributes. This is captured in equation (4), which requires specifying the distribution of $x_k$ and integrating
over the possible values of xjk ; we refer to this as the “Expected Value” approach to product-attribute
selection. In addition to statistical and economic theory, Meyer (1982) shows in an experimental setting
that the probability that an alternative will be inspected is a function of both the expected utility and the
uncertainty about the alternative. However, Miller (1993) suggests that the assumption that consumers
necessarily manifest that information in the form of a probability distribution function is “less tenable.”
From a purely practical standpoint, there is no closed form expression for the integral in equation (4) and
numeric methods must be used, which is computationally costly.
We propose an alternative method of computing the benefit of processing additional information
that retains the level and dispersion of expected utility, but does not rely on parametric assumptions and is
computationally less demanding. Let $Min_k$ represent the lowest value that attribute k is expected to equal and $Max_k$ represent the highest. The level of expected utility is reflected in the values of $Max_k$ and $Min_k$, while the range, $Max_k - Min_k$, represents the uncertainty in the attribute levels. Using $Max_k$ and $Min_k$, equation (4) is recast as:

$\chi_{ijk}^n = \ln\left(\iota' \exp\left(\beta' X_i^{n+(Max_{jk})}\right)\right) - \ln\left(\iota' \exp\left(\beta' X_i^{n+(Min_{jk})}\right)\right).$    (10)
Thus, instead of integrating over the unknown values of $x_{jk}$, the benefit of inspecting a cell is given by the difference in expected maximum utility when the attribute level is at its a priori expected highest value and when it is at its lowest. When a consumer is uncertain about an attribute, there will be a relatively big difference between $Max_k$ and $Min_k$, which results in a relatively large value of $\chi_{ijk}^n$ and increases the probability that cell j,k will be inspected. In other words, when a consumer is uncertain about what she is going to get, there is a lot of benefit for her to acquire additional information. In contrast, when a consumer is relatively certain about an attribute, $Max_k$ and $Min_k$ will be relatively close, $\chi_{ijk}^n$ will be relatively small, and the probability that cell j,k is inspected will decrease.
For discrete attributes, $Max_{jk}$ is set equal to the attribute being "present" and $Min_{jk}$ is set equal to the attribute being "not present" in product j. When higher values of an attribute are expected to decrease indirect utility, i.e., $\beta_k < 0$, then the two terms in equation (10) are reversed. When the analyst does not know whether higher or lower values will be preferred, $\chi_{ijk}^n$ can be based on the absolute value of the
difference. Equation (10) has a closed form which facilitates estimation of the model. We refer to this as
the “Max – Min” method of product-attribute selection.
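A minimal sketch of the Max – Min benefit in equation (10), assuming $\beta_k > 0$ so the Max term enters first (otherwise the terms are swapped or the absolute value is taken, as discussed above); the function names and inputs are illustrative only.

```python
import numpy as np

def log_inclusive_value(X, beta):
    """ln( iota' exp(beta'X) ): the logsum over the products in X."""
    return np.log(np.sum(np.exp(X @ beta)))

def max_min_benefit(X_n, beta, j, k, max_k, min_k):
    """Equation (10): the benefit of opening cell (j, k) is the gap in expected
    maximum utility between the attribute at its highest and at its lowest
    plausible value; no integration over f(x_jk) is needed."""
    X_hi, X_lo = X_n.copy(), X_n.copy()
    X_hi[j, k], X_lo[j, k] = max_k, min_k
    return log_inclusive_value(X_hi, beta) - log_inclusive_value(X_lo, beta)
```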
In terms of the model likelihood function, equation (10) simply replaces equation (4) in the calculation of $\chi_{ijk}^n$ and the remainder of the model is unchanged. Instead of specifying a particular $f(x_k)$ such as the log-normal and estimating the coefficient of variation for continuous variables, we estimate $Max_k$ and $Min_k$. The Online Technical Appendix contains full details. To facilitate estimation, we take $Max_k$ and $Min_k$ to be symmetric around $\bar{x}_k$ but find that, within a broad range, the results are not sensitive to different assumed $\bar{x}_k$. We restrict $Min_k \geq 0$.
Theory, past research, and empirical patterns in the data suggest that a consumer may select a
particular product-attribute level based on its proximity to the previously opened cell, i.e., whether it is in
the same row (attribute based processing in our data), column (alternative based processing), or diagonal
(mixed processing) to the previous selection. Gabaix et al.'s (2006) experimental data were analyzed by Sanjurjo (2014) and, consistent with the behavioral literature, he found strong tendencies for row, column,
and “typewriter” processing in an IDB; he referred to these patterns as “spatial biases.” One way to
capture this information acquisition process is via a distance metric that permits product-attributes closer to the last revealed cell to be preferred. Let $Dist_{ijk}^r$ and $Dist_{ijk}^c$ represent the row and column distance from the last item revealed. The next product-attribute accessed is that j,k combination that satisfies:

$\max\left(\chi_{ijk}^{n+} - \phi_r Dist_{ijk}^r - \phi_c Dist_{ijk}^c\right).$    (11)

In equation (11), holding $\chi_{ijk}^{n+}$ constant, cells which are closer to the currently opened cell are preferred to
those that are further away. The relative magnitudes of ϕr and ϕc determine whether row (attribute) or
column (alternative) proximity is more important. Although row (attribute) and column (alternative)
based information processing are typically thought of as moving to the cell immediately adjacent to the
last cell, Simonson, Huber, and Payne (1988) found that 40% of transitions were more than one cell away
or diagonally situated to the last attribute level accessed. This distance based metric accounts for these
types of transitions. In our empirical analysis, we will combine the Max - Min calculation of equation
(10) with the distance metric in equation (11) and refer to this as the “Hybrid” method of determining
product-attribute selection.
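To illustrate the Hybrid rule in equation (11), the sketch below penalizes each candidate cell's Max – Min benefit by its row and column distance from the last opened cell. Treating distance as the absolute difference in cell indices is our assumption, and $\phi_r$, $\phi_c$ are hypothetical values that would be estimated in practice.

```python
import numpy as np

def hybrid_scores(chi, last_cell, phi_r, phi_c):
    """Equation (11): penalize each candidate cell's benefit chi[j, k] by its
    row (attribute) and column (product) distance from the last opened cell,
    so that nearby cells are favored; the next cell is the unopened argmax."""
    j0, k0 = last_cell                     # j indexes products, k attributes
    J, K = chi.shape
    scores = np.empty((J, K))
    for j in range(J):
        for k in range(K):
            dist_r = abs(k - k0)           # row distance: different attribute
            dist_c = abs(j - j0)           # column distance: different product
            scores[j, k] = chi[j, k] - phi_r * dist_r - phi_c * dist_c
    return scores
```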
3.4. Unseen Product-Attributes
In the base model, if a particular product-attribute was not inspected by a consumer, it is assumed that the
mean $\bar{x}_k$ for that product-attribute is used by the consumer. Meyer (1982) provides some experimental
support for this by showing that consumers treat completely unknown alternatives as if they had the
average utility of products in the market. We refer to this assumption as using the “Actual Mean” to
represent unseen product-attributes.
We calculate $\bar{x}_k$ as the arithmetic mean of the actual values $x_{jk}$ for continuous attributes in the choice set. While this has precedent in the search literature and seems to be a reasonable assumption, there is no guarantee in any setting that the actual mean matches consumers' expectations. We therefore
explore two different options for representing unseen product-attributes. The first is that consumers
simply do not use product-attributes that they do not inspect in the IDB. This is consistent with the
finding that consumers limit cognitive effort and focus on only a subset of information to make decisions
(see Bettman, Luce, and Payne 1998). This is also the underlying rationale for statistical procedures such
as variable selection in discrete choice models (Gilbride, Allenby, and Brazell 2006), although variable
selection focuses on an attribute that is irrelevant for all products, as opposed to the attribute level of a
particular product.
Using our notation from earlier, before any product-cells are inspected, the vector $x_{ij} = 0$. When a product-attribute level is revealed, $x_{jk}$ replaces the appropriate 0 in the vector $x_{ij}$. Unseen attributes are
simply not included in the calculation of the indirect utility. However, this sets up a paradox in our model
structure. Both the Expected Value and Max-Min models for product-attribute selection rely on
consumers using prior knowledge of the attribute levels either through $f(x_k)$ or through the values of $Max_k$ and $Min_k$. Why would consumers abandon this knowledge when considering the expected utility from a
product choice? We emphasize the paramorphic nature of the decision models and argue that it is
plausible for consumers to use one information set when deciding which product-attributes to inspect and
a smaller information set when choosing the final product. Along similar lines, Andrews and Srinivasan
(1995) postulated that consumers could have a “consideration utility” and a “choice utility” which
differentially weight product attributes in their multi-stage decision model. We acknowledge the possible
inconsistency in the model structure but estimate the model to see if it in fact does a better job
representing consumers' information acquisition and product choice. We will refer to unseen product-attributes as being "Not Used," with the appropriate elements of $x_{ij}$ set to 0 when product-attributes are not inspected.
The “Actual Mean” method of dealing with unseen product-attributes relies quite heavily on the
specified value of $\bar{x}_k$ since it appears in the indirect utility function. The "Not Used" method requires the
potentially problematic assumption that consumers use different information sets for deciding the benefit
of inspecting product-attributes compared to the final product choice; we would like to avoid these
assumptions. We build on Branco, Sun, and Villas-Boas (2012) and propose an expectation deviation
method of handling unseen product-attributes. In our model, there is an intercept term which represents the product label for each column in the IDB. Consider a product class with only two attributes; then the indirect utility is given by:

$V_{ij} = \beta_0 + \beta_1 x_{j1}^* + \beta_2 x_{j2}^* + \varepsilon_{ij}.$    (12)

Here the product-attribute values $x_{jk}^*$ represent deviations from the expected values $\bar{x}_k$, and $\beta_0$ is the indirect utility when the product-attribute levels all equal their expected values. If we assume for unseen product-attribute levels that $x_{jk} = \bar{x}_k$, then before any product-attribute levels are inspected, $V_{ij} = \beta_0 + \varepsilon_{ij}$. Rewriting (12) in expectation deviation format and expanding $\beta_0$ we get:

$V_{ij} = \beta_0^* + \beta_1 \bar{x}_1 + \beta_2 \bar{x}_2 + \beta_1 (x_{j1} - \bar{x}_1) + \beta_2 (x_{j2} - \bar{x}_2) + \varepsilon_{ij}.$    (13)

Here $\beta_0^* = \beta_0 - \beta_1 \bar{x}_1 - \beta_2 \bar{x}_2$. When product-attribute 1 is inspected and product-attribute 2 is not, we get:

$V_{ij} = \beta_0^* + \beta_1 \bar{x}_1 + \beta_2 \bar{x}_2 + \beta_1 (x_{j1} - \bar{x}_1) + \beta_2 (x_{j2} - \bar{x}_2) + \varepsilon_{ij} = \beta_0^* + \beta_2 \bar{x}_2 + \beta_1 x_{j1} + \varepsilon_{ij}.$    (14)

This result suggests a new parameterization of the indirect utility function. Let $d_{ijk} = 1$ when product-attribute j,k has not been inspected by consumer i and $d_{ijk} = 0$ if it has. Then:

$V_{ij} = \beta_0^* + \beta_1 x_{j1}(1 - d_{ij1}) + \beta_2 x_{j2}(1 - d_{ij2}) + \delta_1 d_{ij1} + \delta_2 d_{ij2} + \varepsilon_{ij}.$    (15)

In this model $\delta_k = \beta_k \bar{x}_k$. Here we can again assume that if a product-attribute is not inspected then $x_{jk} = \bar{x}_k$, but we do not need to specify $\bar{x}_k$; it is estimated in the parameter $\delta_k$. We will refer to this method of handling unseen product-attributes as "Inferred Mean."
The Inferred Mean method requires expanding the parameter vector β by K, the number of attributes in the product choice set, in order to estimate δ. In terms of our model set-up, the vector $x_{ij}$ is now of length $2K$. Before any cells are inspected in the IDB, the first 1, …, K elements of $x_{ij}$ are set equal to 0 while the last K+1, …, 2K elements are set equal to 1, with these last elements representing the binary variables in equation (15). When a consumer inspects the cell corresponding to product j and attribute k, the value $x_{jk}$ replaces the corresponding 0 for $x_{ijk}$ and, in the second half of $x_{ij}$, the value of $x_{ij(k+K)}$ is switched from 1 to 0. The sequence of product-attributes revealed by a consumer and the differences between the final set of product-attributes across consumers identify the binary variables in
equation (15). The remainder of the model set-up does not change.
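A small sketch of the Inferred Mean bookkeeping for one product: the design vector of length 2K carries revealed attribute levels in its first half and the not-inspected indicators $d_{ijk}$ in its second half (the intercept is handled separately); the function and example values are illustrative only.

```python
import numpy as np

def inferred_mean_vector(x_revealed, revealed_mask):
    """Length-2K design vector for one product under equation (15): the first
    K slots carry revealed attribute levels (0 if unseen), the last K carry
    the indicators d_ijk (1 if unseen), whose coefficients
    delta_k = beta_k * xbar_k stand in for the ex-ante expected values."""
    x = np.asarray(x_revealed, dtype=float)
    d = 1.0 - np.asarray(revealed_mask, dtype=float)   # 1 when NOT inspected
    return np.concatenate([x * (1.0 - d), d])

# Illustrative: K = 3 attributes, only the first has been inspected (level 5.0).
x_ij = inferred_mean_vector([5.0, 0.0, 0.0], [1, 0, 0])
# -> array([5., 0., 0., 0., 1., 1.])
```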
Table 1 summarizes the main modeling elements and the different variations that will be tested
using our empirical data. The next section describes in detail the data collection process and summary
statistics.
4. Data
The data was collected by Internet Technology Group, Inc. (ITGi), for an online study of
consumer behavior at a well-known electronic manufacturer’s website. Due to confidentiality agreements,
the name or exact nature of the product cannot be revealed. However, the product is a consumer durable
with the types and number of features similar to what might be found in a computer, tablet, e-reader, or
camera. The data collection procedure is discussed next followed by descriptive statistics.
4.1. Data Collection
Unlike previous IDB studies, the data does not come from a lab-based study but represents information
processing and purchases of real shoppers. The manufacturer installed the Decision Board Platform
(Mintz et al. 1997) on its website for a consecutive 50-hour period over a weekend. Similar to the classic
Mouselab IDB used in previous lab studies (e.g., Payne, Bettman, and Johnson 1993), this platform
recorded shoppers’ sequence of product-attribute information acquisition and their final choice. As shown
in Figure 2, three products were available in column format with each product’s model number and price
shown in the first row. Cells containing information on eleven other product attributes for each alternative
appeared in corresponding rows below, but these values were concealed until the consumer clicked and
revealed the value of that attribute level. Similar to the “Choose” command button in the mock-up, a
prominent "Customize and Buy" command button was located at the bottom of each column.⁴
If a customer clicked on the “Customize and Buy” button, they were taken to a secure server
where they entered shipping location and credit card information, which for legal reasons is unavailable to
us. It is possible that some customers did not finish entering their credit card information; however, the
manufacturer’s product category manager confirmed that relatively few consumers abandoned their
shopping carts after going to the “Customize and Buy” secure server and that a vast majority of customers
who clicked “Customize and Buy” actually did purchase the chosen product. In any event, at a minimum,
the data can be interpreted as revealing a decision to “Customize and Buy” if not “Buy.” No other data on
consumers’ prior knowledge or demographics was collected by the company, typical of internet-based
retail settings.

⁴ The manufacturer decided to implement the "Customize and Buy" option rather than go with the Decision Board's default "Choose" button.
The data was collected from shoppers who went to one of three distinct price tiers (i.e., high,
medium, or low), each on different websites that featured three products and eleven attributes. To control
for heterogeneity between these market segments, we analyze these datasets separately. We first present a
detailed description and results for the high priced dataset. Later, in section 5 we present selected model
based results for the other two data sets. Table 2 indicates which attributes were continuous and discrete,
and the ordinal ranking of their attribute levels. Every shopper saw the same product-attribute matrix in
which products 1 and 3 were the lower-priced alternatives and product 2 had a higher price. Attributes
10a and 10b were revealed together when a cell in row 10 was clicked; these were elements of the
physical dimensions of the product (e.g., height and weight), which were not perfectly correlated across
the alternatives (i.e., the lightest product was not the shortest). The products shown were real, not
hypothetical, and, as a result, attribute-levels for six of the attributes (A2, A4, A6, A7, A8, and A9) were
the same for all products; further, for several other attributes these levels were the same for two of the
three products. Despite these overlaps, a careful inspection reveals that the highest priced alternative,
product 2, does not dominate on all the attributes because it is missing the desirable discrete attributes 3
and 11.
Although consumers have to click on cells or links to access (more) information about attributes
or alternatives on many popular websites like Facebook, Expedia, CNET, etc., they typically do not have
to click on cells in order to read the information in product-attribute matrices on e-commerce websites.
This raises the question of whether consumers alter their information processing strategy as a result of the
data collection process. Previous experimental laboratory research using IDB computerized decision process tracers shows that subjects display similar information processing, choices, and eye movements to subjects that are not using IDB computerized decision process tracers to make a decision (e.g., Johnson, Meyer, and Ghose 1989, Johnson, Payne, and Bettman 1988). The current methodology involves
consumers making actual purchase decisions, but it is unknown how the data collection may specifically
alter consumer behavior. We return to this topic in the conclusion.
4.2. Descriptive Statistics
Data from a total of 136 shoppers are available for analysis. As shown in the right hand side of Figure 2,
the data from each shopper provides the following information: the sequence of product-attribute levels
inspected, the last cell accessed, and which one of the three products or the “no purchase” alternative was
chosen. We only include those observations in which the customer accessed more than one cell.
As reported in panel A of Table 3, 58 of the 136 shoppers (43%) “customized and bought,” with
27 customers purchasing product 2 (20%), the most expensive product, and 13 and 18 shoppers,
respectively, purchasing the two equally priced products 1 (10%) and 3 (13%). Only 7% of shoppers
accessed all the information in the Decision Board (panel B), with the average shopper accessing 11.54 of
the 33 cells (panel A). Those who “customized and bought” accessed more cells than those who did not (13.62 vs. 9.99); however, inter-shopper variation was high (standard deviation 9.24). 68% of shoppers
accessed information on all three products, 9% for two products and 23% for a single product.
As expected and as reported in panel B of Table 3, shoppers accessed attributes in the top rows
more often than those in the bottom rows; A1 was accessed most often (89% of the time), A3 second
most (70%), and A10 least (40%). 50% of the shoppers accessed five or fewer rows, 15% accessed 6-10
rows, and only 35% of shoppers accessed all 11 rows. We also report in panel B the percentage of
shoppers who clicked on individual cells on the Decision Board and find that most shoppers only
accessed a subset of the information available to them in what would be considered a relatively high
price, high involvement type of purchase.
5. Results
We have data from three different market segments: high priced, medium priced, and low priced. In order
to control for parameter heterogeneity, separate models were estimated for each dataset. As in the
previous section, we will focus on the high priced market segment and then show that similar results were
obtained in the other two segments. A total of 110 consumer responses were used to calibrate the models
listed in the top of Table 4; 26 responses were reserved for hold-out testing. For all models, the MCMC
chains were run for 50,000 iterations, with a thinned sample of every 10th draw from the last 25,000 used to estimate the posterior moments of the parameters. The β’s associated with Attributes A1 – A9 and A11 were expected to be positive, and equation (10) was used to calculate the expected benefit of inspecting each candidate product-attribute for the Max-Min and Hybrid
models. Attributes 10a and 10b describe physical aspects of the product and it is unclear if larger or
smaller values would be desired for this class of product; therefore, the absolute value of equation (10)
was used for these two attributes. For the Expected Value models, in order to ensure comparability to the
Max-Min and Hybrid models, the β’s associated with Attributes A1 – A9 and A11 were restricted to be
greater than 0.
The chains converged quickly and convergence was tested by starting the chains from multiple
starting points and comparing the resulting parameter estimates. All parameters were estimated using a Metropolis-Hastings algorithm with a random-walk proposal density, and were drawn one at a time with
customized proposal densities and relatively long MCMC chains to ensure proper mixing. Table 4
describes the models and contains each model’s fit statistics, while Table 5 provides the posterior means
and posterior standard deviations for parameters from selected models.
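For readers unfamiliar with this estimation strategy, the sketch below illustrates one-at-a-time random-walk Metropolis-Hastings sampling with burn-in and thinning in the spirit described above. The log posterior is a stand-in (independent standard normals), the settings in the demo call are reduced for speed, and the paper's actual likelihood and priors are not reproduced.

```python
import math
import random

def log_posterior(theta):
    # placeholder target: independent standard normal parameters
    return -0.5 * sum(t * t for t in theta)

def rw_metropolis(theta0, step_sizes, n_iter=50_000, burn=25_000, thin=10):
    theta = list(theta0)
    current_lp = log_posterior(theta)
    draws = []
    for it in range(n_iter):
        for k in range(len(theta)):                      # update each parameter separately
            proposal = list(theta)
            proposal[k] += random.gauss(0.0, step_sizes[k])  # random-walk proposal
            lp_prop = log_posterior(proposal)
            if math.log(random.random()) < lp_prop - current_lp:
                theta, current_lp = proposal, lp_prop
        if it >= burn and (it - burn) % thin == 0:
            draws.append(list(theta))                    # thinned draws from the second half
    return draws

draws = rw_metropolis([0.0, 0.0, 0.0], step_sizes=[0.5, 0.5, 0.5], n_iter=2_000, burn=1_000)
post_means = [sum(d[k] for d in draws) / len(draws) for k in range(3)]
print(post_means)
```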
5.1. Model Comparisons
A broad set of model combinations from Table 1 was estimated, with choices guided by the logical
consistency of the model alternatives and preliminary results. In-sample fit is measured by the log
marginal density (LMD), calculated using the Gelfand and Dey (1994) importance sampler and includes a
penalty for the number of parameters. Gamerman and Lopes (2006) show this measure of LMD performs
well, and it has been used in other Bayesian analyses such as Lenk and DeSarbo (2000) and Gilbride and
Lenk (2010). The LMD favors the model with the largest value, which is model 8, a Sequential stopping
rule model using the Hybrid product-attribute selection process and the Inferred Mean method of handling
unseen attributes. The out-of-sample fit is measured by computing the log likelihood of the observed data
for the 26 consumers in the hold-out sample using a random sample of 2,500 draws of the parameters
from the posterior distribution, and then calculating the average. This measure favors the model with the
largest value, which again is model 8.
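The hold-out measure just described can be computed as sketched below: average the hold-out log likelihood over a random subset of posterior draws. The likelihood used here is a toy placeholder rather than the paper's choice model, and the sketch does not cover the Gelfand and Dey (1994) LMD calculation.

```python
import math
import random

def holdout_fit(posterior_draws, holdout_data, log_likelihood, n_draws=2500):
    # average hold-out log likelihood over a random subset of posterior draws
    sampled = random.sample(posterior_draws, min(n_draws, len(posterior_draws)))
    values = [sum(log_likelihood(theta, obs) for obs in holdout_data) for theta in sampled]
    return sum(values) / len(values)   # larger (less negative) is better

# toy illustration: Bernoulli "purchase" likelihood with a single probability parameter
def toy_loglik(theta, obs):
    p = min(max(theta[0], 1e-6), 1 - 1e-6)
    return math.log(p) if obs == 1 else math.log(1 - p)

draws = [[random.betavariate(4, 6)] for _ in range(5000)]   # stand-in posterior draws
holdout = [1, 0, 0, 1, 0]
print(holdout_fit(draws, holdout, toy_loglik))
```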
Comparing the left hand to the right hand side of Table 4, we see that the in-sample fit favors the
Sequential stopping rule over the Fixed Sample stopping rule for models which are comparable on product-attribute selection and assumptions about unseen product-attributes. For out-of-sample fit, the Sequential
stopping rule dominates in eight of the eight comparisons. This suggests that consumers are responding to
the information that is being revealed in the IDB when deciding whether or not to continue processing
additional information, as opposed to basing their decision on just how many cells have already been
opened.
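The contrast between the two stopping rules can be summarized schematically as follows; the functional forms and parameter names are illustrative assumptions only and do not reproduce the paper's equations.

```python
import math

def stop_prob_sequential(expected_benefit_of_next_cell, cost, scale=1.0):
    # stop when the information revealed so far implies little benefit to continuing
    return 1.0 / (1.0 + math.exp((expected_benefit_of_next_cell - cost) / scale))

def stop_prob_fixed_sample(n_cells_opened, tau0, tau1):
    # depends only on how many cells have already been opened
    return 1.0 / (1.0 + math.exp(-(tau0 + tau1 * n_cells_opened)))

print(stop_prob_sequential(expected_benefit_of_next_cell=0.2, cost=0.5))
print(stop_prob_fixed_sample(n_cells_opened=12, tau0=-3.0, tau1=0.25))
```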
Focusing on the first four rows of Table 4, we can contrast the performance of the Expected
Value versus the Max-Min method of calculating the benefits of additional information processing. The
Expected Value version of the model posits that consumers integrate over the distribution of likely
product-attribute values; in the Max-Min the expected benefit to additional processing is calculated as the
difference when the candidate product-attribute is at its maximum versus its minimum expected value.
These ideas are captured in equations (4) and (10). In each comparable model, the Max-Min methodology
has a better in-sample and out-of-sample fit than the Expected Value methodology. The Max-Min model
fits the data better and offers significant computational savings. The Hybrid model modifies Max-Min
(see equation (11)) by penalizing cells in the IDB which are farther away from the last inspected cell. The
last six rows in the upper panel of Table 4 show that for otherwise comparable models, the Hybrid model
has both better in-sample and out-of-sample fit (Models 3 vs. 6, 4 vs. 7, and 5 vs. 8). These results
suggest that the proximity of cells has an important role in determining the information processing of
consumers in IDBs.5
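As a rough illustration of these two ingredients, the sketch below computes a Max-Min style benefit for a candidate cell and then adjusts it for the cell's distance from the last one inspected. The additive form and variable names are assumptions for exposition; the paper's equations (10) and (11) are not reproduced exactly.

```python
def maxmin_benefit(utilities_if_max, utilities_if_min):
    # change in the maximum utility across products when the candidate
    # product-attribute is at its maximum vs. its minimum expected value
    return max(utilities_if_max) - max(utilities_if_min)

def hybrid_score(benefit, candidate_cell, last_cell, phi_row, phi_col):
    # proximity adjustment: cells closer to the last inspected cell score higher;
    # the exact functional form and sign conventions of phi_row / phi_col follow
    # equation (11) in the paper and are only approximated here
    row_dist = abs(candidate_cell[0] - last_cell[0])
    col_dist = abs(candidate_cell[1] - last_cell[1])
    return benefit - phi_row * row_dist - phi_col * col_dist

b = maxmin_benefit(utilities_if_max=[2.4, 1.9, 2.1], utilities_if_min=[1.8, 1.9, 2.1])
print(hybrid_score(b, candidate_cell=(5, 2), last_cell=(4, 2), phi_row=0.85, phi_col=0.40))
```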
In the six comparisons of how to handle unseen product-attributes, models with the Not Used assumption fit the in-sample data better than models with the Actual Mean assumption; for out-of-sample fit, Not Used is preferred in five of the six comparisons (the exception is comparing model 9 to
model 10). However, the Inferred Mean method, which includes parameters that account for the product
category attribute means, fits the data best. These results offer support for the substantive conclusion
that consumers rely on prior information and provide a mechanism for modeling this when data on
consumers’ prior beliefs is not available.
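A minimal sketch of how the three treatments of unseen product-attributes differ when scoring a product is given below; the β, δ, and attribute values are made-up numbers, not estimates from the paper.

```python
def product_utility(attr_values, opened, beta, delta=None, actual_mean=None, rule="inferred_mean"):
    total = 0.0
    for k, x in enumerate(attr_values):
        if opened[k]:
            total += beta[k] * x
        elif rule == "not_used":
            continue                              # unopened cells contribute nothing
        elif rule == "actual_mean":
            total += beta[k] * actual_mean[k]     # plug in the actual category mean
        elif rule == "inferred_mean":
            total += delta[k]                     # delta_k stands in for beta_k times the prior value
    return total

beta = [1.5, 4.0]
print(product_utility([0.8, 0.5], opened=[True, False], beta=beta,
                      delta=[1.1, 1.8], actual_mean=[0.7, 0.5], rule="inferred_mean"))
```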
In summary, the results show that when consumers decide when to stop processing additional information, they rely on information revealed in the IDB, as opposed to stopping after a pre-specified number of cells or simply as a function of how many cells have already been opened. When deciding which product-attribute levels to inspect, both the change in expected maximum utility and the location of the cell in relation to the last cell inspected are important. Finally, when evaluating the overall products, consumers use a prior value for product-attributes which are not inspected in the IDB. Methodologically, the results show that using a Max-Min type of calculation for the anticipated benefit of information processing is effective compared to the computationally more demanding Expected Value calculation. Further, the proposed Inferred Mean method of representing consumers’ prior values for product-attributes is viable in situations with limited consumer data. This analysis was replicated for the medium priced and low priced data sets for the best fitting models (models 7, 8, 15, and 16), and the results match the pattern seen in the high priced data set. Model fit statistics are reported at the bottom of Table 4.

5 Models were also fit using only the row and column distance metrics to determine which product-attribute would be inspected next, i.e., equation (11) excluded the expected-benefit term computed from equation (10). These models fit better than the Max-Min models and worse than the Hybrid models; however, parameter estimates for the importance weights were all insignificant. These results are not included because a different justification of the error structure has to be assumed in order to calculate equation (6) in the likelihood.
5.2 Parameter Estimates
Table 5 contains parameter estimates for selected models for the high priced data set. In addition to the
models already described, a model which used only the final product choices to calibrate the β’s was also
estimated. In this model, the decision of which product-attribute to inspect next and when to stop inspecting attributes is random,6 and only the opened product-attribute levels were used to calibrate the
final choice model. Table 4 shows that the in-sample and out-of-sample fit for the model which did not
consider consumers’ information processing was worse than all other models considered. Table 5 shows
that all the posterior estimates of β for the “random” model included 0 in their 95% highest posterior
densities (i.e., were not statistically significant) except for the product intercepts. By contrast, virtually all
the estimates of the β’s in models 7 and 8, which model consumers’ information processing, are
statistically significant and have the expected sign. Only the β for A10a in Model 8 is not statistically
significant. Contrasting the β’s from models 7 and 8, we see that with the Inferred Mean method of modeling unseen attributes, the absolute value of all the parameters and the posterior standard deviations are larger. Nine of the eleven estimates of δ from Model 8 are statistically significant; because attributes A10a and A10b are opened at the same time, only a single value of δ could be estimated. Managerial applications of these parameter estimates will be discussed in the next section.

6 The probability of inspecting a product-attribute was equal to 1/(N − ni), where N is the total number of product-attributes (N = 33) and ni is the number of cells opened so far by consumer i; that is, the probability is one divided by the number of product-attributes currently unopened in the IDB. The probability of stopping was ½ after each cell was inspected.
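For clarity, the random baseline in footnote 6 can be simulated as sketched below; this illustrates the baseline's logic only and is not the estimation code.

```python
import random

def simulate_random_processing(n_cells=33, seed=None):
    rng = random.Random(seed)
    unopened = list(range(n_cells))
    opened = []
    while unopened:
        cell = rng.choice(unopened)      # probability 1 / (number of unopened cells)
        unopened.remove(cell)
        opened.append(cell)
        if rng.random() < 0.5:           # stop with probability 1/2 after each cell
            break
    return opened

print(simulate_random_processing(seed=1))
```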
In models 7 and 8, both the row ϕr and column ϕc coefficients of the Hybrid model are significant
at the 0.05 level. The value for the row coefficient is greater than the value for the column coefficient
(0.85 vs. 0.56 in model 7, 0.85 vs. 0.40 in model 8), which indicates that consumers were more likely to
process by alternative (column) than by attribute (row). This contrasts with the empirical findings of
Simonson, Huber, and Payne (1988) who found a greater propensity for attribute level processing in their
laboratory based choice task; but supports the empirical findings in the two-stage literature (e.g., Bettman
and Park 1980, Gensch 1987, Hauser and Wernerfelt 1990, Payne 1976), which suggest that alternative-based processing is expected over attribute-based processing just before a choice.
The posterior means of Maxk and Mink are listed for models 7 and 8. In estimating the Max-Min model, the Mink value was constrained to be greater than 0 because it would not make sense to have negative values for these attributes. The results show that for A1, A10a, and A10b, Mink was not significantly different from 0. Importantly, the upper limit Maxk is not constrained, and none of these parameter estimates is unreasonable given the actual values of the attributes.
In summary, these results show that the information processing models (models 7 and 8) produce
better parameter estimates than models which just use the final choice data (model 17) to estimate market
level preferences. These results also show that the models can be estimated using just the revealed
sequence of product-attribute levels and final choices from an actual market setting.
5.3. Managerial Application of Models
Here we briefly explore two managerially interesting applications of model 8, which incorporates the
Sequential stopping rule, Hybrid product-attribute selection, and the Inferred Mean approach to handling
unseen product-attributes. One of the interesting aspects of the Inferred Mean method is that, theoretically, δk = βk·xk, where xk is some prior value for attribute k that consumers use when product-attribute k is not opened for a particular product. This implies that xk = δk/βk, and we can estimate the market’s expected value for attribute k. The last column of Table 5 performs this calculation using the posterior means of βk and δk. Since orthogonal coding was used for the discrete attributes, δk/βk = 0.5 indicates that the market expected that attribute to be part of the product offering, while δk/βk = −0.5 indicates the opposite. However, since no constraints were placed on the estimates of δ, it should not be surprising that δk/βk does not exactly equal 0.5 or −0.5; we will interpret the sign of δk/βk as indicating the market’s expectations. Attributes A2 and A7 – A9 were all expected to be part of the product offering in the high priced market. Attribute A3 was not expected to be included. Attribute A11 was a special one-time promotion that would have been unknown to consumers; this may explain why δ11 was not statistically significant, but directionally, the market did not expect this attribute. For the three continuous attributes that we can measure unambiguously, the implied values of xk (0.73, 65.70, and 13.50) are plausible given the product and are within the range of Mink and Maxk estimated for model 8. These results show that the model can provide estimates of the market’s expectations for the different attributes.
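As a worked illustration of this calculation, the snippet below recovers the implied prior value for A1 from the Table 5 posterior means under model 8, and then repeats the logic with made-up numbers for a discrete attribute under orthogonal coding.

```python
# Implied prior value x_k = delta_k / beta_k, using the Table 5 posterior means
# for A1 under model 8 (beta ≈ 1.475, delta ≈ 1.078); rounding is approximate.
beta_a1, delta_a1 = 1.475, 1.078
print(round(delta_a1 / beta_a1, 2))   # ≈ 0.73, the value reported in the last column of Table 5

# For a discrete attribute under orthogonal coding (illustrative numbers only):
# a ratio near +0.5 suggests the market expected the attribute to be present,
# a ratio near −0.5 that it did not.
beta_disc, delta_disc = 4.0, 2.1
print(round(delta_disc / beta_disc, 2))
```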
Second, our method addresses a common problem in online marketing: poor conversion rates.
Our model provides recommendations for prioritizing who should receive follow-up communications via
either e-mail or retargeting display ads and which product should be featured for each individual, based
on actual information processing on the firm’s website. In our hold-out sample of 26 shoppers, 16 did not
purchase one of the three products. Table 6 provides the purchase probabilities from model 8 for each of
the three products for each of these shoppers, with results sorted in terms of those most likely to buy one
of the three products. This provides managers with diagnostics about which shoppers had greater probabilities of purchasing and which products most interest them. Managers can compute a cut-off via an expected value calculation to determine who should receive follow-up information; for instance, in our example it may not be profitable for managers to follow up with shoppers 9 – 16 because their forecasted purchase probability is relatively low.7 It is important to note that because our method computes attribute importance weights, it gives different recommendations than one which simply counts the number of product-attributes revealed and targets shoppers based on that metric. Specifically, both shoppers 2 and 4 revealed 11 product-attribute levels, but they have very different purchase probabilities.

7 Shoppers 13 – 16 each opened the same 3 cells in the first row of the information display board.
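A simple sketch of the expected-value cut-off suggested above follows; the margins and contact cost are hypothetical inputs, and the purchase probabilities for the two shoppers shown are taken from Table 6.

```python
shoppers = {
    1: {"Prod 1": 0.219, "Prod 2": 0.106, "Prod 3": 0.514},
    2: {"Prod 1": 0.006, "Prod 2": 0.008, "Prod 3": 0.750},
}
margin = {"Prod 1": 40.0, "Prod 2": 60.0, "Prod 3": 40.0}  # hypothetical profit per sale
contact_cost = 5.0                                         # hypothetical cost of one follow-up

for sid, probs in shoppers.items():
    best_product, p = max(probs.items(), key=lambda kv: kv[1])
    expected_profit = p * margin[best_product] - contact_cost
    decision = "follow up" if expected_profit > 0 else "skip"
    print(sid, best_product, round(expected_profit, 2), decision)
```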
6. Conclusion
This research contributes to the information processing literature by proposing new models, testing
economic and descriptive behavioral theoretical principles, and measuring consumer preferences using
data from an e-commerce website. Our method of data collection obtains revealed consumer preference
information at the point-of-purchase from consumers in actual purchasing situations while they were
shopping on a popular manufacturer’s retail website. We find that consumers’ decision of when to stop
processing information is a function of the specific information revealed so far, as opposed to how many
cells in the IDB have been opened. This more closely aligns with the economic theory of sequential
search as opposed to fixed sample search. Although this research is not a test of whether consumers
follow the optimal sequential or fixed sample search, it does test important aspects of these theories and is
consistent with the emerging view in economics that models of actual behavior deviate from their
theoretical optimum.
Our results also support the view in both economic and behavioral literatures that consumers rely
on prior information when product-attributes are not revealed in the IDB. This conclusion was facilitated
by modeling the information acquisition process as deviations from consumers’ prior expectations. When
the actual, average value of the attribute was used to represent consumer expectations, model fit
deteriorated. In fact, in this data set, it was better to assume that consumers only selectively used product-attribute values and ignored unopened cells in the IDB as opposed to using a plausible, but ultimately wrong, value for consumers’ expectations. The proposed Inferred Mean method takes advantage of the different sequences of opened cells in the IDB within and across consumers to estimate the expected
value of attributes.
In this relatively high priced, actual purchasing situation, the proximity of cells to the last one
opened was an important determinant of consumers’ information processing. Our results confirm
behavioral theory and lab based results in both psychology and economics on this topic. Attribute (row)
based processing may occur because a particular attribute was important to the consumer or, analogously,
alternative (column) based processing may be the result of that column being the consumer’s a priori
favorite product. Importantly, our model includes expected utility in the decision of which attribute to
open next, controlling for these preference based explanations. Our results show that both expected utility
and proximity determine consumers’ information processing in an IDB.
Product search and information acquisition can occur via different channels and at different
points of time, not just at the point-of-purchase. Economic and statistical theory suggest that consumers
integrate over their prior beliefs in order to calculate the expected benefit of processing additional
information in an IDB. Behavioral theory also supports the idea that the level and amount of uncertainty
in the attributes are important in determining consumer search. In our information processing context, as
opposed to representing prior information via a parametric probability distribution, this research suggests
representing it via the expected maximum and minimum value of the attributes. This method avoids the
computationally costly necessity of numerical integration and provided a better fit to our data. This model
should be of use to applied researchers and this result may be of interest to theoretical researchers.
Our models are paramorphic representations of the consumer decision process and we do not
draw definitive conclusions about consumers’ mental processes. We find that the sequential stopping rule
fits the data better than the fixed sample stopping rule. De los Santos et al. (2012) concluded that
consumers followed a fixed sample search; this finding may reflect differences in the task, i.e. between
physical search for product information on one attribute versus acquiring information on multiple
attributes in an IDB. Similarly, the Expected Value model and the Max-Min model are two different ways
to represent consumers’ prior knowledge and assess the benefit of additional information processing. Our
data does not offer insight into whether consumers are actually integrating over attribute values or
performing a difference operation, only that the Max-Min model fits our data best. Additional data and
more refined measurement will be needed to support the findings of our model-based results.
This research also contributes to management practice since the proposed method has minimal
information requirements and can be implemented by interested firms in actual purchase situations, as
demonstrated in this research. The method used here can augment other marketing research techniques
such as conjoint analysis to uncover consumer preferences for particular attributes and their impact on
purchase decisions. Estimating attribute importance weights that account for the inherent endogeneity in
information processing as well as the location of the attribute on the website’s IDB is a major managerial
contribution of this research. Because consumers have different sequences of information processing as
well as different final sets of revealed product-attributes, the model can overcome the difficulties of not
having an experimentally designed product matrix. Across consumers, not everyone reveals all the
information in the IDB, and the differences create variation that allows for parameter estimates, even for
attributes with identical levels across alternatives. Some consumers will reveal the values for a particular
attribute for all three products, some for only two, others for a different set of two, some will only open
the attribute for the first product, etc.
The attribute weights obtained by the proposed models should help managers improve product
design, pricing, and communication decisions based on authentic behavior by shoppers on the firm’s
website. As demonstrated in Table 5, the parameters also estimate consumers’ ex-ante expectations for
product attribute levels. Finally, this research provides diagnostics to help managers prioritize follow-up
communications towards consumers who were most likely to purchase based on their current visit to a
retailer’s website (see Table 6).
Websites typically present all the information for all products and attributes in online IDBs. Thus,
similar to conjoint and eye or mouse-tracking based marketing research studies, firms may need to engage
in primary data collection to use the models proposed here. Compared to eye or mouse-tracking or
conjoint studies, the consumers in this research were making actual purchase decisions and we can view
their exact information processing sequences. However, additional research is needed to determine
whether the data collection method changes the proportion of consumers purchasing or the mix of
products purchased. Laboratory based experiments can be used to test whether the order of attributes or
products influences the parameter estimates; the models proposed here will facilitate that research.
Another important area for additional research is modeling heterogeneity and/or consumers’
specific path to purchase in order to represent their prior information. In infrequently purchased
categories it is rare to observe multiple purchases from the same consumer in point-of-purchase data. This
makes it difficult to accurately estimate distributions of heterogeneity. One option would be to make model parameters, such as δ, functions of demographic information. Google uses browser history and IP addresses to infer a range of demographics about web users; this information could be incorporated
without altering the current data collection methodology. Consumers obtain product information from a
variety of sources over extended periods of time and not just at the point-of-purchase. The specific
information or source of that information may impact subsequent information processing and search;
richer individual level data will require richer models. Our set-up also permits lab-based studies on the
impact of conflicting versus non-conflicting information on which product-attributes to inspect, when to
stop, and what to buy.
Management research has always been interdisciplinary and this study extends that tradition by
developing and applying new models with roots in economics and psychology. Importantly, this research
shows that theoretical and lab based results can be measured and tested in an actual purchase situation.
The new models can be used by management in applied situations as demonstrated and hopefully will be
used as a basis for additional academic and commercial research.
References
Anderson, S. P., A. D. Palma, J. F. Thisse. 1992. Discrete Choice Theory of Product Differentiation.
Cambridge, Massachusetts, MIT Press.
Andrews, R. L., T. C. Srinivasan. 1995. Studying Consideration Effects in Empirical Choice Models
Using Scanner Panel Data. J. Mark. Res. 32(1) 30–41.
Ben-Akiva, M., S. R. Lerman. 1985. Discrete Choice Analysis: Theory and Application to Travel
Demand. Cambridge, Mass, The MIT Press.
Bettman, J. R. 1979. An Information Processing Theory of Consumer Choice. Reading, Massachusetts,
Addison-Wesley Publishing Company.
Bettman, J. R., M. F. Luce, J. W. Payne. 1998. Constructive Consumer Choice Processes. J. Consum. Res.
25(3) 187–217.
Bettman, J. R., C. W. Park. 1980. Effects of Prior Knowledge and Experience and Phase of the Choice
Process on Consumer Decision Processes: A Protocol Analysis. J. Consum. Res. 7(3) 234–248.
Bradlow, E. T., Y. Hu, T.-H. Ho. 2004. Modeling Behavioral Regularities of Consumer Learning in
Conjoint Analysis. J. Mark. Res. 41(4) 392–396.
Creyer, E. H., J. R. Bettman, J. W. Payne. 1990. The Impact of Accuracy and Effort Feedback and Goals
on Adaptive Decision Behavior. J. Behav. Decis. Mak. 3(1) 1–16.
Currim, I. S., O. Mintz, S. Siddarth. 2015. Information Accessed or Information Available? The Impact
on Consumer Preferences Inferred at a Durable Product E-commerce Website. J. Interact. Mark.
29 11–25.
De los Santos, B., A. Hortaçsu, M. R. Wildenbeest. 2012. Testing Models of Consumer Search Using
Data on Web Browsing and Purchasing Behavior. Am. Econ. Rev. 102(6) 2955–2980.
Dhar, R., S. M. Nowlis. 2004. To Buy or Not to Buy: Response Mode Effects on Consumer Choice. J.
Mark. Res. 41(4) 423–432.
Feinberg, R. M., W. R. Johnson. 1977. The Superiority of Sequential Search: A Calculation. South. Econ.
J. 43(4) 1594–1598.
Fischer, G. W., J. Jia, M. F. Luce. 2000. Attribute Conflict and Preference Uncertainty: The RandMAU
Model. Manag. Sci. 46(5) 669–684.
Gabaix, X., D. Laibson, G. Moloche, S. Weinberg. 2006. Costly Information Acquisition: Experimental
Analysis of a Boundedly Rational Model. Am. Econ. Rev. 96(4) 1043–1068.
Gamerman, D., H. F. Lopes. 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian
Inference, 2nd ed. Boca Raton, Chapman and Hall/CRC.
Gelfand, A. E., D. K. Dey. 1994. Bayesian Model Choice: Asymptotics and Exact Calculations. J. R. Stat.
Soc. Ser. B Methodol. 56(3) 501–514.
Gensch, D. H. 1987. A Two-Stage Disaggregate Attribute Choice Model. Mark. Sci. 6(3) 223–239.
Gilbride, T. J., G. M. Allenby. 2004. A Choice Model with Conjunctive, Disjunctive, and Compensatory
Screening Rules. Mark. Sci. 23(3) 391–406.
Gilbride, T. J., G. M. Allenby, J. D. Brazell. 2006. Models for Heterogeneous Variable Selection. J.
Mark. Res. 43(3) 420–430.
Gilbride, T. J., P. J. Lenk. 2010. Posterior Predictive Model Checking: An Application to Multivariate
Normal Heterogeneity. J. Mark. Res. 47(5) 896–909.
Hagerty, M. R., D. A. Aaker. 1984. A Normative Model of Consumer Information Processing. Mark. Sci.
3(3) 227.
Hauser, J. R., B. Wernerfelt. 1990. An Evaluation Cost Model of Consideration Sets. J. Consum. Res.
16(4) 393–408.
Honka, E. 2014. Quantifying search and switching costs in the US auto insurance industry. RAND J.
Econ. 45(4) 847–884.
Johnson, E. J., R. J. Meyer, S. Ghose. 1989. When Choice Models Fail: Compensatory Models in
Negatively Correlated Environments. J. Mark. Res. 26(3) 255–270.
Johnson, E. J., J. W. Payne, J. R. Bettman. 1988. Information Displays and Preference Reversals. Organ.
Behav. Hum. Decis. Process. 42(1) 1.
Ke, T. T., Z.-J. M. Shen, J. M. Villas-Boas. 2014. Search for Information on Multiple Products. Working
Paper. University of California, Berkeley.
Kivetz, R., O. Netzer, V. “Seenu” Srinivasan. 2004. Alternative Models for Capturing the Compromise
Effect. J. Mark. Res. 41(3) 237–257.
Lenk, P. J., W. S. DeSarbo. 2000. Bayesian Inference for Finite Mixtures of Generalized Linear Models
with Random Effects. Psychometrika 65(1) 93–119.
Mehta, N., S. Rajiv, K. Srinivasan. 2003. Price Uncertainty and Consumer Search: A Structural Model of
Consideration Set Formation. Mark. Sci. 22(1) 58–84.
Meyer, R. J. 1982. A Descriptive Model of Consumer Information Search Behavior. Mark. Sci. 1(1) 93–
121.
Miller, H. J. 1993. Consumer Search and Retail Analysis. J. Retail. 69(2) 160–192.
Mintz, A., N. Geva, S. B. Redd, A. Carnes. 1997. The Effect of Dynamic and Static Choice Sets on
Political Decision Making: An Analysis Using the Decision Board Platform. Am. Polit. Sci. Rev.
91(3) 553–566.
Mintz, O., I. S. Currim, I. Jeliazkov. 2013. Information Processing Pattern and Propensity to Buy: An
Investigation of Online Point-of-Purchase Behavior. Mark. Sci. 32(5) 716–732.
Moorthy, S., B. T. Ratchford, D. Talukdar. 1997. Consumer Information Search Revisited: Theory and
Empirical Analysis. J. Consum. Res. 23(4) 263–277.
Nelson, P. 1970. Information and Consumer Behavior. J. Polit. Econ. 78(2) 311–329.
Payne, J. W. 1976. Task Complexity and Contingent Processing in Decision Making: An Information
Search and Protocol Analysis. Organ. Behav. Hum. Perform. 16(2) 366–387.
Payne, J. W., J. R. Bettman, E. J. Johnson. 1993. The Adaptive Decision Maker. Cambridge, UK,
Cambridge University Press.
Raiffa, H., R. Schlaifer. 1961. Applied Statistical Decision Theory. Division of Research, Graduate
School of Business Administration, Harvard University.
Ratchford, B. T. 1980. The Value of Information for Selected Appliances. J. Mark. Res. 17(1) 14–25.
Rossi, P. E., G. M. Allenby, R. McCulloch. 2005. Bayesian Statistics and Marketing. Chichester, West
Sussex, England, John Wiley & Sons.
Sanjurjo, A. 2014. The Role of Memory in Search and Choice. Working Paper. Universidad de Alicante.
Shi, S. W., M. Wedel, F. G. M. (Rik) Pieters. 2013. Information Acquisition During Online Decision
Making: A Model-Based Exploration Using Eye-Tracking Data. Manag. Sci. 59(5) 1009–1026.
Shugan, S. M. 1980. The Cost of Thinking. J. Consum. Res. 7(2) 99–111.
Simonson, I., J. Huber, J. W. Payne. 1988. The Relationship Between Prior Brand Knowledge and
Information Acquisition Order. J. Consum. Res. 14(4) 566–578.
Stigler, G. J. 1961. The Economics of Information. J. Polit. Econ. 69(3) 213–225.
Figure 1. Online Shopping Example at Best Buy
Figure 2. Decision Board Example
Sequence of Cells Accessed
Table 1. Summary of Model and Alternatives
Stopping Rule
Sequential: The decision to stop opening cells is a function of the information revealed so far.
Fixed Sample: The decision to stop opening cells is a function of how many cells have already been opened.

Product-Attribute Selection
Expected Value: The benefit of acquiring product-attribute information is the change in the expected maximum utility; this requires integrating over the potential values of the candidate product-attribute.
Max-Min: The benefit of acquiring product-attribute information is the difference between the expected maximum utility when the candidate product-attribute is at its maximum versus at its minimum expected value.
Hybrid: The benefit of acquiring product-attribute information also includes the proximity of cells in the IDB: closer cells are preferred to cells which are farther away.

Unseen Product-Attributes
Actual Mean: If a product-attribute is not seen by a consumer, she uses a prior value xk calculated from the actual product attributes.
Not Used: If a product-attribute is not seen by a consumer, she simply does not use that product-attribute in the final product choice decision.
Inferred Mean: If a product-attribute is not seen by a consumer, she uses a prior value xk, where δk = βk·xk is estimated as part of the model.
Table 2. Matrix of Information Presented to Shoppers
Attribute   Row   Continuous or Discrete   Product 1 (Level)   Product 2 (Level)   Product 3 (Level)
Price       N/A   Continuous               1                   2                   1
A1          1     Continuous               1                   2                   1
A2          2     Discrete                 1                   1                   1
A3          3     Discrete                 2                   1                   2
A4          4     Continuous               1                   1                   1
A5          5     Continuous               2                   3                   1
A6          6     Discrete                 1                   1                   1
A7          7     Discrete                 1                   1                   1
A8          8     Discrete                 1                   1                   1
A9          9     Discrete                 1                   1                   1
A10a        10    Continuous               1                   3                   2
A10b        10    Continuous               2                   3                   1
A11         11    Discrete                 1                   1                   2

Notes: Prices and alternative (model) names were available to shoppers without clicking a cell; attributes 10a and 10b were accessed by clicking the same cell. Level 1 indicates the lowest value and level 3 the highest value for an attribute; for example, level 1 for price indicates a lower price, level 1 for attribute 10a indicates a shorter alternative, and level 1 for attribute 10b indicates a lighter alternative.
Table 3. Descriptive Statistics – High Priced Dataset
Panel A. Number of Cells Accessed by Shoppers based on Customize and Buy (C&B) Decision

Number of         Total Shoppers     No C&B           C&B Product 1    C&B Product 2    C&B Product 3
Cells Accessed    Count      %       Count     %      Count     %      Count     %      Count     %
Average           11.54              9.99             16.08            13.33            12.28
St. Dev.          9.24               8.36             11.49            9.97             9.19
Median            9                  7                14               11               11
2-4 Cells         38         28%     26        33%    2         15%    6         22%    4         22%
5-9 Cells         33         24%     19        24%    2         15%    3         11%    3         17%
10-15 Cells       28         21%     15        19%    3         23%    10        37%    6         33%
16+ Cells         37         27%     18        23%    6         46%    8         30%    5         28%
Total             136        ---     78        57%    13        10%    27        20%    18        13%
Panel B. Percentage of Shoppers Accessing Different Cells

Attribute   Product 1   Product 2   Product 3   Overall
A1          60%         73%         68%         89%
A2          35%         44%         38%         68%
A3          41%         46%         40%         70%
A4          32%         39%         36%         60%
A5          28%         35%         32%         54%
A6          22%         32%         27%         49%
A7          21%         32%         26%         47%
A8          27%         32%         32%         48%
A9          24%         34%         29%         48%
A10         21%         26%         23%         40%
A11         30%         35%         32%         48%
Overall     79%         85%         82%         7%
Table 4. Model Fit Statistics
High Price Data, In-sample N = 110, Out-of-sample N = 26

Sequential stopping rule
Model   Product-Attribute Selection   Unseen Product-Attributes   LMD        Out-of-sample
1       Expected Value                Actual Mean                 −4385.9    −956.5
2       Expected Value                Not Used                    −4317.6    −949.4
3       Max-Min                       Actual Mean                 −4350.3    −920.0
4       Max-Min                       Not Used                    −4037.0    −801.3
5       Max-Min                       Inferred Mean               −3924.3    −800.1
6       Hybrid                        Actual Mean                 −3225.1    −662.9
7       Hybrid                        Not Used                    −3126.7    −607.8
8       Hybrid                        Inferred Mean               −3055.0    −598.7

Fixed Sample stopping rule
Model   Product-Attribute Selection   Unseen Product-Attributes   LMD        Out-of-sample
9       Expected Value                Actual Mean                 −4635.8    −982.9
10      Expected Value                Not Used                    −4520.1    −983.4
11      Max-Min                       Actual Mean                 −4525.0    −945.1
12      Max-Min                       Not Used                    −4267.9    −832.1
13      Max-Min                       Inferred Mean               −4115.5    −819.6
14      Hybrid                        Actual Mean                 −3403.1    −688.5
15      Hybrid                        Not Used                    −3330.1    −634.7
16      Hybrid                        Inferred Mean               −3254.0    −622.2

17      Random                        Not Used                    −5025.9    −1060.1

Low Price Data, In-sample N = 524, Out-of-sample N = 60
Sequential:    Model 7   Hybrid, Not Used        LMD −14192.9   Out-of-sample −1690.6
               Model 8   Hybrid, Inferred Mean   LMD −13380.0   Out-of-sample −1622.7
Fixed Sample:  Model 15  Hybrid, Not Used        LMD −14880.6   Out-of-sample −1837.6
               Model 16  Hybrid, Inferred Mean   LMD −14120.5   Out-of-sample −1775.7

Medium Price Data, In-sample N = 149, Out-of-sample N = 28
Sequential:    Model 7   Hybrid, Not Used        LMD −4388.6    Out-of-sample −876.7
               Model 8   Hybrid, Inferred Mean   LMD −4258.5    Out-of-sample −857.1
Fixed Sample:  Model 15  Hybrid, Not Used        LMD −5028.9    Out-of-sample −957.5
               Model 16  Hybrid, Inferred Mean   LMD −4878.2    Out-of-sample −907.6
Table 5. Selected Parameter Estimates
β’s for Product Attributes — posterior mean (posterior std. dev.)

              Model 17 – Random,      Model 7 – Sequential,    Model 8 – Sequential, Hybrid, Inferred Mean
              Random, Not Used        Hybrid, Not Used
              β (std. dev.)           β (std. dev.)            β (std. dev.)       δ (std. dev.)                   δ/β
A1            0.241 (0.36)            2.112 (0.13)             1.475 (0.38)        1.078 (0.65)                    0.73
A2            1.672 (1.26)            1.651 (0.04)             4.984 (1.07)        1.810 (0.61)                    0.36
A3            0.266 (0.81)            1.561 (0.03)             1.317 (0.17)        −0.893 (0.23)                   −0.67
A4            −0.015 (0.01)           0.018 (0.00)             0.112 (0.03)        7.334 (2.16)                    65.70
A5            0.035 (0.05)            0.045 (0.00)             0.337 (0.13)        4.554 (1.92)                    13.50
A6            −0.028 (1.26)           1.489 (0.03)             4.352 (1.22)        1.739 (0.75)                    0.40
A7            2.098 (1.92)            1.585 (0.03)             4.476 (1.47)        1.687 (0.88)                    0.37
A8            1.627 (1.57)            1.689 (0.03)             6.190 (1.63)        3.034 (0.87)                    0.49
A9            −0.452 (1.82)           1.568 (0.03)             4.966 (1.63)        2.702 (0.95)                    0.54
A10a          1.940 (2.39)            −3.390 (0.52)            −0.208 (0.93)       1.504 (1.30)                    ---
A10b          −0.349 (0.54)           0.586 (0.02)             0.294 (0.04)        (single δ shared with A10a)     ---
A11           −1.025 (0.83)           1.699 (0.05)             1.383 (0.29)        −0.721 (0.42)                   −0.52
Product 1     −3.024 (0.54)           −6.628 (0.43)            −27.510 (1.81)
Product 2     −2.753 (0.61)           −6.535 (0.46)            −27.282 (1.81)
Product 3     −2.860 (0.58)           −7.054 (0.52)            −27.621 (1.81)

Information Processing Parameters
τ             ---                     −2.749 (0.01)            −2.405 (0.13)
ϕ (Row)       ---                     0.847 (0.00)             0.850 (0.03)
ϕ (Column)    ---                     0.564 (0.00)             0.402 (0.05)

Continuous Variables Endpoints
              Model 17                Model 7                  Model 8
                                      Min k      Max k         Min k      Max k
A1            ---                     2.01       3.01          0.26       2.88
A4            ---                     25.57      94.42         43.94      76.06
A5            ---                     1.51       28.06         11.08      18.42
A10a          ---                     0.52       1.82          0.47       2.04
A10b          ---                     2.20       9.25          2.36       9.07

Parameter estimates in bold indicate that the 95% highest posterior density did not include 0, i.e., the parameter is significant at the 95% level.
Table 6. Hold-out Sample Predicted Probabilities
              Predicted Probability
Shopper ID    Prod 1    Prod 2    Prod 3    None
1             0.219     0.106     0.514     0.161
2             0.006     0.008     0.750     0.236
3             0.193     0.093     0.451     0.262
4             0.186     0.074     0.267     0.473
5             0.381     0.019     0.014     0.587
6             0.241     0.074     0.046     0.640
7             0.080     0.191     0.072     0.657
8             0.017     0.303     0.015     0.665
9             0.085     0.126     0.066     0.723
10            0.124     0.091     0.056     0.729
11            0.157     0.063     0.012     0.768
12            0.020     0.188     0.018     0.773
13            0.063     0.092     0.057     0.788
14            0.063     0.092     0.057     0.788
15            0.063     0.092     0.057     0.788
16            0.063     0.092     0.057     0.788