Expert Systems with Applications 27 (2004) 623–633
www.elsevier.com/locate/eswa

An integrated data mining and behavioral scoring model for analyzing bank customers

Nan-Chen Hsieh*
Department of Information Management, National Taipei College of Nursing, No. 365, Min Te Road 11257, Taipei, Taiwan, ROC

Abstract

Analyzing bank databases for customer behavior management is difficult since bank databases are multi-dimensional, comprised of monthly account records and daily transaction records. This study proposes an integrated data mining and behavioral scoring model to manage existing credit card customers in a bank. A self-organizing map neural network was used to identify groups of customers based on repayment behavior and recency, frequency, monetary behavioral scoring predictors. It also classified bank customers into three major profitable groups of customers. The resulting groups of customers were then profiled by customers’ feature attributes determined using an Apriori association rule inducer. This study demonstrates that identifying customers by a behavioral scoring model helps to characterize customers and facilitates marketing strategy development.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Data mining; Behavioral scoring model; Customer segmentation; Neural network; Association rule

1. Introduction

Contemporary marketing strategies perceive customers as important resources to an enterprise. Therefore, it is essential for enterprises to successfully acquire new customers and retain high value customers. To achieve these aims, many enterprises have accumulated large databases, which can then be analyzed and applied to develop new business strategies and opportunities.
However, instead of targeting all customers equally or providing the same incentive offers to all customers, enterprises can select only those customers who meet certain profitability criteria based on their individual needs or purchasing behaviors (Dyche & Dych, 2001). Credit scoring and behavioral scoring are techniques that help decision makers to understand their customers. Credit scoring models help to decide whether to grant credit to new applicants based on customer characteristics such as age, income and marital status (Chen & Huang, 2003). Behavioral scoring models help to analyze the purchasing behavior of existing customers (Setiono, Thong, & Yap, 1998). These two scoring models are closely related to classification analysis by statistical methods (Hand, 1981; Johnson & Wichern, 1998), and especially to classification analysis by neural networks in the field of data mining (Lancher, Coats, Shanker, & Fant, 1995).

Until now, most existing data mining approaches applied to bank databases have focused on discovering general rules (Agrawal, Imielinski, & Swami, 1993; Bult & Wansbeek, 1995; Setiono et al., 1998), predicting personal bankruptcy (Dasgupta, Dispensa, & Ghose, 1994; Desai, Crook, & Overstreet, 1996; Zhang, Hu, Patuwo, & Indro, 1999) and credit scoring (Kim & Sohn, 2004; Lancher et al., 1995; Sharda & Wilson, 1996). Few works have studied the mining of bank databases from the viewpoint of customer behavioral scoring (Sharda & Wilson, 1996). More specifically, we wanted to look at both the account data of the customers and their credit card transactions. With these data, the aim was to discover interesting patterns that could provide clues about what incentives a company could offer its customers as part of better marketing strategies.

* Tel./fax: +2-822-7101-2220. E-mail address: [email protected].
0957-4174/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2004.06.007

As shown in Fig.
1, this study presents a two-stage approach for behavioral scoring analysis of implicit knowledge using bank customer account and transaction data. Topics discussed include data preprocessing, customer behavior scoring modelling, sensitivity analysis of the relative importance of attributes contributing to customer profiling, and the two stages of the behavioral scoring model itself.

Fig. 1. Two-stage behavioral scoring modeling.

The key feature of the two-stage behavioral scoring model is a cascade involving a self-organizing map (SOM) and an Apriori association rule inducer. An SOM (Kim & Sohn, 2004; Kohonen, 1995) is an unsupervised learning algorithm that maps similar multi-dimensional input vectors to the same region of a neuron map, and Apriori (Agrawal et al., 1993) is mainly used to find potential relationships between items or features that occur together in the database. In the first stage of the approach presented here, a conceptual customer behavioral scoring model was established to predict profitable groups of customers based on previous repayment behavior and RFM (Bult & Wansbeek, 1995) behavioral scoring predictors. This SOM was employed to classify customers into three major profitable groups: revolver users, transactor users, and convenience users. Once the SOM had identified the profitable groups of customers, Apriori profiled each group, focusing on demographic and geographic characteristics for building and maintaining the most profitable customer base. The customer profile was then used to describe a representative case in each group of customers, and served as a tool for establishing better bank marketing strategies. After analyzing the bank database, this study demonstrates that customer behavior scoring models are an effective method for banks to identify their most profitable customers. We conclude by analyzing target groups of customers using the proposed two-stage behavioral scoring model.

The remainder of this study is organized as follows. Section 2 describes the analysis methodology and presents an integrated data mining and behavioral scoring model. Section 3 assesses neural networks as a tool for customer segmentation, using past repayment behavior and RFM scoring variables to build behavioral scoring models. Section 3 also presents the process of creating customer profiles according to feature attributes determined by an Apriori association rule inducer. Finally, conclusions are drawn in Section 4.

2. Description of the analysis methodology

2.1. Credit and behavioral scoring models

Credit and behavioral scoring models (Thomas, 2000) are among the most successful applications of statistical and operational research modelling in finance and banking, and the number of scoring analysts in the industry is constantly increasing. The main objective of both credit and behavioral scoring models is to classify customers into groups (Lancher et al., 1995). Hence scoring problems are related to the field of classification analysis (Hand, 1981; Johnson & Wichern, 1998; Morrison, 1990). Applied to bank databases, classification analysis for credit scoring is used to categorize a new applicant as either accepted or rejected with respect to features such as age, income and marital status (Chen & Huang, 2003). On the other hand, classification analysis for behavioral scoring is used to describe the behavior of existing customers by using behavioral scoring variables and also to predict the future purchasing behavior or credit status of existing customers (Setiono et al., 1998). Until now, the building of both scoring models has always been based on a pragmatic approach; because of this, a single best, standard scoring model for every circumstance almost certainly does not exist.
Most previous studies have focused on building more accurate credit or behavioral scoring models and on increasing the accuracy of the classification model with various kinds of statistical techniques. However, analyzing bank databases for customer behavior management is difficult since bank databases are multi-dimensional, comprising monthly account records and daily transaction records (Donato et al., 1999). Therefore, even with highly accurate scoring models, some misclassification patterns appear frequently. This study draws mainly on data mining perspectives. A general integrated data mining and behavioral scoring model for customer behavior analysis, which includes the necessary preprocessing of real-world data sets, derivation of scoring predictors and customer profiling in order to support a standard model building process, would be of great utility. The framework of the two-stage behavioral scoring model serves as a tool to validate the effect of data mining techniques in practical scoring analysis applications.

2.2. Neural networks for segmentation analysis

For credit scoring or behavioral scoring analysis, many studies have shown that neural networks perform significantly better than statistical techniques such as linear discriminant analysis (LDA), multiple discriminant analysis (MDA), logistic regression analysis (LRA) and so on (Desai et al., 1996; Lancher et al., 1995; Malhotra and Malhotra, 2003; Sharda & Wilson, 1996; Zhang et al., 1999). The application of neural networks to segmentation analysis is a promising research area and is a challenge for a variety of marketers (Vellido, Lisboa, & Vaughan, 1999). Baesens, Viaene, Poel, Vanthienen, & Dedene (2002) applied Bayesian neural networks to repeat purchase behavior modelling in direct marketing.
Davies, Moutinho, & Curry (1996) and Moutinho, Davies, & Curry (1996) analyzed how different bank customer groups hold different expectations of automatic teller machine (ATM) services. Rather than profiling segments based on demographic or geographic characteristics, Dasgupta et al. (1994) characterized potential customer segments in terms of lifestyle variables. Balakrishnan, Cooper, Jacob, & Lewis (1996) accomplished a six-segment classification study using coffee brand switching probabilities derived from scanner data at a sub-household level. Mazanec (1992) grouped tourists using a benefit approach. Setiono et al. (1998) utilized a rule-extraction neural network to target companies for the promotion of new information technology. Fish, Barnes, & Aiken (1995) proposed a new methodology for industrial market segmentation by neural networks. Lee, Chiu, Lu, & Chen (2002) explored the performance of credit scoring by integrating back propagation neural networks with traditional discriminant analysis. Kim & Sohn (2004) used neural networks to manage customer loans. Among these studies, only Balakrishnan et al. used the frequency sensitive competitive learning (FSCL) algorithm in segmentation analysis. The rest of the studies used supervised feed-forward multilayer perceptrons (MLP) trained by back propagation and gradient descent, or similar alternatives.

2.3.
Properties of the built behavioral scoring model

In the business world, the most successful application of behavioral scoring models is embedded in databases: customer histories are analyzed for similar behavioral patterns among existing customer preferences, and those patterns are used for a targeted selection of existing or future customers. The decisions to be made include which target groups of customers will be encouraged to spend more, what credit line to assign, whether to promote new products to particular groups of customers, and, if the repayment ability turns bad, how to manage debt recovery. Therefore, a behavioral scoring model is an information-driven marketing process that enables marketers to develop, test, implement, measure and appropriately modify customized marketing programs and strategies. In addition to the customer values that credit scoring models use as major scoring information, repayment behavior patterns and customer purchasing histories are also required in a behavioral scoring model. Behavioral scoring models are intended to establish associations between the input predictors and the output scores in order to model the behavior of different customers. More precisely, behavioral scoring models try to group customers that share behavior patterns. This is carried out by assigning behavior scores to each customer and grouping customers into classes of similar score value using an SOM neural network. The behavior score is given by a mathematical function of the form:

\[ \text{behavior score} = f_{\mathrm{SOM}}(\text{predictor}_1, \text{predictor}_2, \ldots). \]

In this study, four predictors, namely repayment behavior and the RFM values, are used to classify three profitable groups of customers. Individual customer scores are updated on a yearly basis in this study.

3. Assessing the neural network for customer segmentation

3.1. Preparing the data sets

For this study, bank databases were provided by a major Taiwanese credit card issuer.
Data preprocessing was required to ensure data field consistency in behavioral scoring model building. Not all the data are related to the chosen purposes, so knowledge extraction from the bank databases included the following three sub-actions.

The first sub-action was intended to organize the raw data. Two data sets were obtained: a set containing effective credit card account information of 158,126 customers until June 2003, and another set storing over 20 million individual transaction records for these accounts from January 2000 to June 2003. Then, the two data sets were joined using a customer identifier to create a single behavior-oriented data set.

The second sub-action was the extraction of only those data considered useful for the analysis. Unnecessary data fields and records containing incomplete or missing data were removed from the data sets (Fish et al., 1995).

The third sub-action was the application of simple statistics to calculate new, aggregated behavioral scoring predictors. The aim of calculating the aggregates was to emphasize the customer repayment behavior and RFM (Bult & Wansbeek, 1995) information hidden in the 12-month observation period. In this case, values derived from the database such as the maximum, minimum and average of a set of variables (e.g. repayment states, payment cycle days, number of credit card purchases, consumption amount, interest on credit balance, and so on) for the monthly activity over the past 12 months were considered for the purpose of building a behavioral scoring model. As mentioned, the desired outcome is to be able to predict which customer belongs to which profitable group. The range of values of each numerical predictor is split into intervals so that each interval contains as many customers as possible that have significantly homogeneous behavior. Multiple predictors can be grouped together to obtain the same effect.
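The aggregation in the third sub-action can be illustrated with a minimal Python sketch. The function name `aggregate_monthly`, the field names `purchases` and `amount`, and all of the monthly values are hypothetical and chosen only for illustration; the study does not describe its implementation.

```python
from statistics import mean

def aggregate_monthly(records, fields):
    """Return the max, min and mean of each field over a list of monthly records."""
    summary = {}
    for field in fields:
        values = [month[field] for month in records]
        summary[f"{field}_max"] = max(values)
        summary[f"{field}_min"] = min(values)
        summary[f"{field}_avg"] = mean(values)
    return summary

# Hypothetical 12-month history for one customer (illustrative numbers only).
history = [{"purchases": p, "amount": a}
           for p, a in zip([3, 5, 2, 4, 6, 1, 0, 2, 3, 4, 5, 1],
                           [120, 300, 80, 150, 400, 40, 0, 90, 110, 200, 260, 50])]

features = aggregate_monthly(history, ["purchases", "amount"])
```

Each customer then contributes one row of aggregated values rather than twelve monthly rows, which is the shape a segmentation model would typically expect.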
To derive the most profitable customers, customers with similar repayment behavior were identified with respect to the RFM values found in the real world.

3.2. Analyzing the behavior of customers

To establish a better relationship with customers, banks constantly seek ways of differentiating their offerings and developing more appropriate services for distinct market segments. An important feature of current state-of-the-art segmentation analysis is the use of past transaction data. The results produced are based on the assumption that customer behavior follows patterns similar to past patterns and will repeat in the future. Therefore, there could not be a better time than now to recognize the importance of an effective new marketing strategy using data mining techniques. Increasing the amount of purchases while improving customer satisfaction is a major goal.

Segmentation analysis is a method of achieving more targeted communication with customers and is a pioneering step towards classifying individual customers according to previously defined groups of customers. The process of segmentation analysis describes the characteristics of groups of customers within the data, and puts customers into segments according to their affinities or similar characteristics. This study tries to construct a behavioral scoring model for direct marketing and encouraging consumption (Lancher et al., 1995). These two goals are similar when analyzing potential credit card customer behavior. However, attempts at good customer behavior management may be limited by poor data relevance and quality, the volume of data needing to be processed, or difficulty in viewing the data. Therefore, the original data set could not be used directly to predict customer behavior, and extra behavioral scoring predictors were needed for prediction.

As mentioned, banks have three types of profitable customers: revolver users, transactor users and convenience users.
Revolver users always carry a credit card balance, rolling over part of the bill to the next month instead of paying off the balance in full each month. Revolver users are highly profitable customers because every month they pay considerable interest on their outstanding balance. Meanwhile, transactor users pay in full on or before the due date of the interest-free credit period and do not incur any interest payments or finance charges. Transactor users do not contribute significant revenue through interest on their credit balances, but the discount on each transaction they make still provides an important source of bank revenue. Finally, convenience users are customers who periodically charge large bills, such as for vacations or large purchases, to their credit card, and then pay these bills off over several months. Convenience users thus contribute significant amounts of interest on their credit balance.

Fig. 2 presents the conceptual framework used to answer the questions posed in this study. This figure shows the two components, customer segmentation and customer profiling, which are the major issues discussed here. Generally, credit card issuers make money from annual fees, interest on credit balances, and the discount collected from merchants on each transaction. In this framework, account and transaction data sets are assumed to be input sets to customer segmentation. The values of RFM and repayment behavior are assumed to be behavioral scoring predictors affecting customer segmentation. The recency (R) value measures the average time between the day a customer makes a charge and the day the customer pays the bill, the frequency (F) value measures the average number of credit card purchases made, and the monetary (M) value measures the amount spent during a yearly time period. Next, variables such as customer attributes and credit card usage are assumed to influence customer profiling.
Finally, clusters and the associated customer profiles are assumed to be outputs, which in turn influence credit card marketing strategies.

Fig. 2. A conceptual framework of customer behavioral modeling.

In Fig. 2, repayment behavior is highly related to customer segmentation, but is an implicit variable which cannot be retrieved directly from the data set. We needed to develop a method for modeling customer repayment behavior. As shown in the following equation, this study employs ‘Repayment Ability’ (RA) to model repayment behavior:

\[ \text{Repayment Ability} = \frac{\text{no. of months without delayed pay off}}{\text{no. of months of holding the card}}. \]

The default observation range is assumed to be 12 months, and RA is computed as the ‘no. of months without delayed pay off’ divided by the ‘no. of months of holding the card’. For example, if a customer does not carry a credit card balance for 8 months, then RA is computed as 8/12. For each customer, if RA approaches one, the customer is considered a transactor user. Meanwhile, if RA lies between zero and one, the customer is considered a convenience user. Finally, if RA approaches zero, the customer is considered a revolver user.

3.3. Assessing the SOM for customer behavioral scoring

In recent years, the SOM (Kohonen, 1995) has gained popularity as a classification analysis tool in business-related areas (Vellido et al., 1999). In this study, the SOM is built with data from existing customers, which include variables from the account and transaction data sets. All of the existing customers’ data are used to build the behavioral scoring model in order to predict potential customer behavior. The behavioral scoring model utilized in this study is arranged to form a two-dimensional SOM with a 4×4 rectangular array of neurons.
Each of these neurons is connected to the input vectors through synaptic weights, which are adjusted during learning. The first phase of SOM training is a rough estimation phase, used to capture the gross data patterns. The second phase is a tuning phase, used to adjust the map to model the fine features of the data. During the learning process, when a pattern is presented as an input to the neural network, the Euclidean distance between the pattern and each neuron is calculated using RA first and then RFM as input variables. For inputs to the SOM, each feature is scaled by subtracting the mean and dividing by the standard deviation, so that each scaled feature has a mean of zero and a standard deviation of one. Once the most similar neuron is determined, the neighborhood of that neuron is identified. The neighborhood of a neuron is defined as all the neurons within a given link distance of the matched neuron. All neurons in the neighborhood are adjusted to have feature values closer to the current case. The adjustment amount of the neuron weights is controlled by the learning rate.

The SOM map is shown in Fig. 3; for each neuron, the repayment behavior, the number of customers, the ratio of that number to the overall number of customers, RA and RFM are shown. Fig. 4 illustrates the overall distribution of customers with respect to the three major profitable types of customers. Most cases are distributed over neurons 9–16, which contain 104,979 customers whose repayment behavior is revolver user. Neurons 3, 4, 7 and 8 contain a total of 21,202 convenience users. Neurons 1, 2, 5 and 6 contain a total of 31,945 transactor users. On the basis that no meaningful conclusions can be drawn from small numbers of customers, no further analysis is performed on the clusters with fewer than 1000 cases (i.e. neurons 6, 12 and 15).
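The input construction and learning step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the configuration used in the study: the customer values, the random initial weights, the learning rates, radii and epoch counts of the two phases, and the reading of "link distance" as the number of grid links (Manhattan distance) are all hypothetical.

```python
import math
import random
from statistics import mean, pstdev

def repayment_ability(months_on_time, months_held=12):
    """RA = months without delayed pay off / months of holding the card."""
    return months_on_time / months_held

def zscore_columns(rows):
    """Scale each feature to mean 0 and standard deviation 1."""
    stats = [(mean(c), pstdev(c)) for c in zip(*rows)]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def best_matching_unit(weights, x):
    """Grid position of the neuron closest to x in Euclidean distance."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))

def train_step(weights, x, learning_rate, radius):
    """Move the winning neuron and its grid neighbours towards x."""
    bmu = best_matching_unit(weights, x)
    for pos, w in weights.items():
        # Assumption: link distance = number of grid links (Manhattan distance).
        if abs(pos[0] - bmu[0]) + abs(pos[1] - bmu[1]) <= radius:
            weights[pos] = [wi + learning_rate * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

# Hypothetical customers: (months paid on time, R, F, M); illustrative values.
raw = [(12, 20, 3, 900), (8, 35, 6, 2500), (1, 50, 9, 4000), (11, 22, 2, 700)]
inputs = zscore_columns([[repayment_ability(m), r, f, mon] for m, r, f, mon in raw])

rng = random.Random(0)
weights = {(i, j): [rng.uniform(-1, 1) for _ in range(4)]
           for i in range(4) for j in range(4)}  # 4x4 map, one weight per feature

# Rough estimation phase (large steps), then tuning phase (small steps).
for learning_rate, radius, epochs in ((0.5, 2, 20), (0.05, 1, 50)):
    for _ in range(epochs):
        for x in inputs:
            train_step(weights, x, learning_rate, radius)

clusters = [best_matching_unit(weights, x) for x in inputs]
```

After training, each customer is assigned to the grid position of its best matching neuron, which plays the role of the cluster label in the discussion of Fig. 3.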
The next major step is to choose the target groups of customers for direct marketing and encouraging consumption. The repayment behavior can be used to indicate customer risk; the risk degrees among the three profitable groups of customers are ordered as ‘transactor user’ ≤ ‘convenience user’ ≤ ‘revolver user’. Moreover, within each profitable group, the clusters whose RFM values tend towards R↓F↑M↑ are selected as targets; all customers who belong to these clusters become candidates for suitable bank marketing strategies and attract the most attention.

Fig. 3. Neurons in a 4×4 map, each neuron defines a cluster.

3.4. Determining the relative importance of variables

After the segmentation of the existing customers, it is possible to infer the characteristics of each group of customers and from that propose appropriate management strategies. Customer profiling (Setiono et al., 1998) provides a basis for enterprises to offer customers better services and retain good customers. Customer profiling is done by assembling collected information on customers and their potential behavior. We first applied a neural network sensitivity analysis (Zurada, Malinowski, & Cloete, 1994) to all customers to determine whether there are significant differences between customers and to reduce the number of input variables, and then inferred customer profiles with an Apriori association rule inducer. The data set obtained after data preprocessing contained 32 attributes: 10 character attributes and 22 continuous attributes. The neural network sensitivity analysis was used to retain the relatively important attributes, with repayment behavior and the RFM values chosen as the predicted variables for all customers.
As recommended by Hornik, Stinchcombe, & White (1989), a network with one hidden layer is sufficient to model a complex system with any desired accuracy, so the employed neural network model has just one hidden layer. Table 1 lists the distribution of the relative importance for each input variable using the neural network. The sensitivity analysis of the neural network and the order of the most significant input variables indicate those variables that are worth looking at in more detail. Factors with a relative importance of 0.00997 and above were used in successive customer profiling. In Table 1, Amount_of_Consumption, CardAge_Month and CreditLine were the three most differentiating variables. On the other hand, Marital_Status, Age_Segments and Sex were the least differentiating variables.

Fig. 4. Customer distribution to repayment behavior.

Table 1
Sensitivity analysis of the relative importance input variables

Neural network model
Input layer (no. of neurons) | First hidden layer (no. of neurons) | Output layer (no. of neurons) | Predicted accuracy
32 | 20 | 4 | 96.16%

Relative importance to RA and RFM
Variable name | Relative importance | Comments
Amount_of_Consumption | 0.40658 | Monthly amount of consumption
Cardage_Month | 0.30048 | Number of months for which the card has been held
Creditline | 0.19980 | Credit line
Total_Consumption | 0.15002 | Yearly amount of consumption
Blockcod | 0.12431 | Card usage limit or not
Occupation | 0.03722 | Encoded field
Cardtype | 0.02546 | Encoded field
Marital_Status | 0.01779 | 0, single; 1, married; 2, divorced; 3, separation
Age_Segments | 0.01499 | 1, <25; 2, 25–30; 3, 30–35; 4, 35–40; 5, 40–45; 6, 45–50; 7, 50–60; 8, >60
Sex | 0.00997 | 1, male; 0, female
⋮ | ⋮ |

3.5. Create customer profiles

The aim of the study is to discover hidden patterns in bank databases in order to better understand the different characteristics of different customers and develop new strategies to provide better service.
In the previous sections, we used a behavioral scoring model to classify customers into clusters with shared characteristics. Association rule mining was then used to create a customer profile for each cluster. The purpose of association rule extraction is to discover significant relationships between items or features that occur frequently in a transaction database. Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of items. Let DB be a transaction database, where each transaction $T$ consists of a set of items such that $T \subseteq I$. Given a set of items $X \subseteq I$, a transaction $T$ contains $X$ if and only if $X \subseteq T$. Support(X, DB) denotes the rate of X in DB. An association rule is an implication of the form ‘$X \Rightarrow Y$ (s%, c%, l)’, where $X \subseteq I$, $Y \subseteq I$ and $X \cap Y = \emptyset$. An association rule $X \Rightarrow Y$ holds in DB with support s% if the probability of a transaction in DB containing both X and Y is s%, i.e.

\[ \mathrm{Support}(X \Rightarrow Y) = \mathrm{Support}(X \cup Y, DB). \]

An association rule $X \Rightarrow Y$ holds in DB with confidence c% if the probability that a transaction in DB which contains X also contains Y is c%, i.e.

\[ \mathrm{Confidence}(X \Rightarrow Y) = \frac{\mathrm{Support}(X \cup Y, DB)}{\mathrm{Support}(X, DB)}. \]

The well-known Apriori algorithm (Agrawal et al., 1993) has been proposed for mining association rules in a transaction database. Mining association rules means discovering all the association rules whose support is at least a minimum support (minsup) threshold and whose confidence is at least a minimum confidence (minconf) threshold. The association rules must satisfy two conditions:

\[ \mathrm{Support}(X \Rightarrow Y) \ge \mathit{minsup}, \qquad \mathrm{Confidence}(X \Rightarrow Y) \ge \mathit{minconf}. \]

When all of the association rules are generated, the simplest way to determine the positive tendency of each association rule is to use the lift. Lift is the ratio of confidence to expected confidence. Expected confidence is the number of transactions that include the consequent divided by the total number of transactions.
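The three measures defined above can be computed directly from their definitions. The following Python sketch uses a hypothetical four-transaction database whose items are written as attribute=value strings; the data are invented for illustration and do not come from the bank database.

```python
def support(itemset, db):
    """Fraction of transactions in db that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in db) / len(db)

def rule_metrics(antecedent, consequent, db):
    """Return (support, confidence, lift) of the rule antecedent => consequent."""
    both = support(set(antecedent) | set(consequent), db)
    confidence = both / support(antecedent, db)
    expected_confidence = support(consequent, db)  # share of DB containing Y
    return both, confidence, confidence / expected_confidence

# Hypothetical toy database: each transaction is a set of attribute=value items.
db = [frozenset(t) for t in (
    {"Sex=1", "Marital_Status=0", "Occupation=4010"},
    {"Sex=1", "Marital_Status=0"},
    {"Sex=0", "Marital_Status=1", "Occupation=4010"},
    {"Sex=1", "Marital_Status=1", "Occupation=4010"},
)]

s, c, l = rule_metrics({"Sex=1"}, {"Marital_Status=0"}, db)
```

Here the rule Sex=1 ⇒ Marital_Status=0 has support 2/4, confidence (2/4)/(3/4) = 2/3 and lift (2/3)/(1/2) = 4/3; a lift above one indicates that the antecedent raises the likelihood of the consequent.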
Suppose that we use $X \Rightarrow Y$ to determine a customer’s tendency to purchase Y; then Support(Y, DB) is the expected confidence, and the lift is computed as:

\[ \mathrm{Lift}(X \Rightarrow Y) = \frac{\mathrm{Confidence}(X \Rightarrow Y)}{\mathrm{Expected\_Confidence}(X \Rightarrow Y)}. \]

According to the SOM results, the customers fall into three major profitable groups dispersed over 16 clusters. The 10 variables derived from the sensitivity analysis were chosen as predictor variables for association rule analysis. For simplicity of explanation, we chose only cluster-1 and cluster-2 for mining association rules. The Apriori association rule inducer was configured to identify association rules with at least 85% confidence and 5% support. Table 2 lists the cluster profile of cluster-1 in the form of association rules, where each rule represents a customer profile that was dominant or most strongly associated with the customers matching that cluster. For discriminating purposes, we have grouped customers with shared behavioral characteristics. From this, marketers can create more accurate campaigns towards each target group of customers for cross-selling and encouraging consumption. After briefly reviewing the 16 clusters using cluster profiles, the customers whose values tend towards R↓F↑M↑ can be targeted with greater accuracy. However, the risk arising from the different profitable groups of customers in practical applications should be considered.

3.6. Merging redundant association rules

After customers were classified by the behavioral scoring model, the resulting clusters were profiled by feature attributes determined using an Apriori association rule inducer. An association rule is considered relevant for decision making if it has support and confidence at least equal to some minimal support and confidence thresholds defined by the user. As shown in Table 2, the set of extracted association rules is usually very large, owing to the presence of a huge proportion of redundant rules conveying the same information. Many of the rules may contain redundant or irrelevant information or describe trivial knowledge. We present interactive strategies for pruning redundant association rules on the basis of an equivalence relation to enhance readability.

Table 2
Cluster-1 profile

Rule ID | Association rules | Support | Confidence | Lift | R. to Rules
1 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100 | 5.6% | 87.9% | 1.91 |
2 | Marital_Status=0 ⇐ Sex=1&Age_Segments=2 | 6.7% | 87.8% | 1.91 | 1
3 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=2 | 17.3% | 87.8% | 0.99 |
4 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=3 | 14.8% | 88.2% | 1.00 | 7
5 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=4 | 6.5% | 89.1% | 1.01 |
6 | Occupation=4010 ⇐ Marital_Status=0&CardType=113 | 6.2% | 94.2% | 1.06 |
7 | Occupation=4010 ⇐ Marital_Status=0&Sex=1&Age_Segments=3 | 6.7% | 87.8% | 0.99 |
8 | Occupation=4010 ⇐ Marital_Status=0&Sex=1 | 17.9% | 86.1% | 0.97 | 7
9 | Occupation=4010 ⇐ Marital_Status=0 | 46.0% | 87.9% | 0.99 | 3~8
10 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=3 | 10.8% | 88.0% | 0.99 | 18
11 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=4 | 13.2% | 87.9% | 0.99 | 16,19
12 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=5 | 9.5% | 88.8% | 1.00 |
13 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=6 | 6.2% | 91.0% | 1.03 |
14 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=7 | 5.0% | 89.4% | 1.01 |
15 | Occupation=4010 ⇐ Marital_Status=1&CardType=113 | 5.7% | 92.7% | 1.05 |
16 | Occupation=4010 ⇐ Marital_Status=1&Sex=1&Age_Segments=4 | 6.1% | 86.6% | 0.98 |
17 | Occupation=4010 ⇐ Marital_Status=1&Sex=1 | 22.7% | 88.1% | 1.00 | 16,20
18 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=3 | 5.2% | 88.9% | 1.01 |
19 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=4 | 5.9% | 88.0% | 0.99 |
20 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Sex=1 | 10.5% | 87.9% | 0.99 |
21 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’ | 22.7% | 88.4% | 1.00 | 18~20
22 | Occupation=4010 ⇐ Marital_Status=1 | 49.4% | 88.5% | 1.00 | 10~21
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |

Several methods have been proposed in the literature to reduce the number of extracted association rules. Silverstein, Brin, & Motwani (1998) used Pearson’s correlation statistic measure in place of the confidence measure. Srikant & Agrawal (1995) defined generalized association rules using a taxonomy of the item set. Heckerman (1996) and Silberschatz & Tuzhilin (1996) measured the distance between association rules by evaluating the deviation in each rule’s support and confidence. Bayardo, Agrawal, & Gunopulos (1999) used item constraints, which are Boolean expressions defined by the user, to specify the form of association rules. Pasquier, Bastide, Taouil, & Lakhal (1999) adapted the Duquenne-Guigues basis for global implications, and the proper basis for partial implications, to the framework of association rules.
Klemettinen, Mannila, Ronkainen, Toivonen, & Verkamo (1994) simplified a relatively significant number of association rules via the visualization technique. Bastide, Pasquier, Taouil, Stumme, & Lakhal (2000) used the Galois connection as a basis to discover minimal non-redundant association rules.

Table 3
The redundant-free cluster profile of cluster-1 (merged)

Rule ID | Association rule | Support | Confidence | Lift
1 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100 | 5.6% | 87.9% | 1.91
3 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=2 | 17.3% | 87.8% | 0.99
5 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=4 | 6.5% | 89.1% | 1.01
6 | Occupation=4010 ⇐ Marital_Status=0&CardType=113 | 6.2% | 94.2% | 1.06
7 | Occupation=4010 ⇐ Marital_Status=0&Sex=1&Age_Segments=3 | 6.7% | 87.8% | 0.99
12 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=5 | 9.5% | 88.8% | 1.00
13 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=6 | 6.2% | 91.0% | 1.03
14 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=7 | 5.0% | 89.4% | 1.01
15 | Occupation=4010 ⇐ Marital_Status=1&CardType=113 | 5.7% | 92.7% | 1.05
16 | Occupation=4010 ⇐ Marital_Status=1&Sex=1&Age_Segments=4 | 6.1% | 86.6% | 0.98
18 | Occupation=4010 ⇐ Marital_Status=1&Blockcod='n'&Age_Segments=3 | 5.2% | 88.9% | 1.01
19 | Occupation=4010 ⇐ Marital_Status=1&Blockcod='n'&Age_Segments=4 | 5.9% | 88.0% | 0.99
20 | Occupation=4010 ⇐ Marital_Status=1&Blockcod='n'&Sex=1 | 10.5% | 87.9% | 0.99
⋮ | ⋮ | ⋮ | ⋮ | ⋮

Table 4
The redundant-free cluster profile of cluster-2 (merged)

Rule ID | Association rule | Support | Confidence | Lift
2 | Marital_Status=0 ⇐ CardType=100&Age_Segments=2 | 6.3% | 86.1% | 1.87
3 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2 | 5.3% | 85.7% | 1.86
6 | Marital_Status=0 ⇐ Blockcod='n'&Sex=1&Age_Segments=2 | 5.3% | 85.6% | 1.86
7 | Occupation=4010 ⇐ CardType=100&Marital_Status=2 | 8.3% | 92.7% | 1.35
8 | Occupation=4010 ⇐ CardType=113 | 6.7% | 93.8% | 1.36
9 | Occupation=4010 ⇐ CardType=821 | 5.5% | 97.0% | 1.41
⋮ | ⋮ | ⋮ | ⋮ | ⋮
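As a hedged illustration of the three judgment standards reported in the Support, Confidence and Lift columns of Tables 2 to 4, the following Python sketch computes them for a single rule over a small in-memory dataset. It is not the Apriori inducer used in this study. The names `support`, `confidence`, `lift`, `is_relevant` and `toy_db` are hypothetical, the records are invented, support is taken here as the fraction of records containing both antecedent and consequent, and the expected confidence is assumed to be the support of the consequent alone, which is the usual convention for lift.

```python
from typing import FrozenSet, Iterable, Sequence

Record = FrozenSet[str]


def support(items: Iterable[str], db: Sequence[Record]) -> float:
    """Fraction of records in db that contain every item in items."""
    wanted = frozenset(items)
    if not db:
        return 0.0
    return sum(1 for record in db if wanted <= record) / len(db)


def confidence(x: Iterable[str], y: Iterable[str], db: Sequence[Record]) -> float:
    """Confidence of X => Y: Support(X u Y) / Support(X)."""
    x_set, y_set = frozenset(x), frozenset(y)
    sx = support(x_set, db)
    return support(x_set | y_set, db) / sx if sx > 0 else 0.0


def lift(x: Iterable[str], y: Iterable[str], db: Sequence[Record]) -> float:
    """Lift of X => Y: confidence divided by the expected confidence.

    Assumption: the expected confidence is Support(Y), the share of
    records containing the consequent regardless of the antecedent.
    """
    expected = support(y, db)
    return confidence(x, y, db) / expected if expected > 0 else 0.0


def is_relevant(x: Iterable[str], y: Iterable[str], db: Sequence[Record],
                min_support: float = 0.05, min_confidence: float = 0.85) -> bool:
    """Apply user-defined thresholds (defaults mirror the 5% / 85% setting)."""
    x_set, y_set = frozenset(x), frozenset(y)
    return (support(x_set | y_set, db) >= min_support
            and confidence(x_set, y_set, db) >= min_confidence)


# Invented toy records; attribute names follow the tables, values are made up.
toy_db = [
    frozenset({"Occupation=4010", "Marital_Status=0"}),
    frozenset({"Occupation=4010", "Marital_Status=0"}),
    frozenset({"Occupation=4010", "Marital_Status=1"}),
    frozenset({"Occupation=2000", "Marital_Status=0"}),
]

antecedent = {"Marital_Status=0"}
consequent = {"Occupation=4010"}
rule_confidence = confidence(antecedent, consequent, toy_db)
rule_lift = lift(antecedent, consequent, toy_db)
```

A lift below 1, as for this toy rule, means that the antecedent makes the consequent slightly less likely than its baseline rate, which is how the lift values just under 1.00 in the tables can be read.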
Bayardo & Agrawal (1999) proposed A-maximal rules, that is, association rules with maximal antecedents, for which adding an item to the antecedent reduces the population of objects concerned.

We intend to provide strategies that preserve useful, relevant and non-redundant association rules. Thus, redundant rules, which in certain databases represent the majority of extracted rules, particularly for dense or correlated data where the total number of valid rules is very large, are pruned. Using the concept of an equivalence class, redundant rules are collected in the same equivalence class. Only the most informative non-redundant association rules are presented to the user, where the union of the antecedents (or consequents) is equal to the union of the antecedents (or consequents) of all the association rules valid in the context. The resulting rules have minimal antecedents and maximal consequents within the same equivalence class. The extracted set of rules conveys, without any loss of information, all the information in the set of association rules valid in the context. From these bases it is possible to deduce efficiently, without access to the original dataset, all valid association rules with their supports and confidences.

An association rule $X_1 \Rightarrow Y_1$ is redundant-free if and only if there does not exist another association rule $X_2 \Rightarrow Y_2$ such that $X_2 \subseteq X_1$ and $Y_1 \subseteq Y_2$. For example, in Table 2, rule 9 is redundant to rules 3–8, because rule 9 does not convey additional information to the user. Therefore, rule 9 can be removed from the cluster profile. The two rule merging principles are illustrated below.

(1) Let $X_1 \Rightarrow Y_1$ $(s_1\%, c_1\%, l_1)$ and $X_2 \Rightarrow Y_2$ $(s_2\%, c_2\%, l_2)$ be two association rules in the same cluster profile, where $X_2 \subseteq X_1$ or $Y_1 \subseteq Y_2$. Then $X_1 \Rightarrow Y_1$ $(s_1\%, c_1\%, l_1)$ is a redundant association rule and can be directly removed from the cluster profile. For example, Table 3 represents the redundant-free cluster profile of cluster-1 (Table 2). The last field in Table 2, 'R. to Rules', indicates the association rules to which each removed rule is redundant.

(2) Let $X_1 \Rightarrow Y_1$ $(s_1\%, c_1\%, l_1)$ and $X_2 \Rightarrow Y_2$ $(s_2\%, c_2\%, l_2)$ be two association rules in different cluster profiles, where $X_2 \subseteq X_1$ or $Y_1 \subseteq Y_2$, and $t_1$, $t_2$ are the numbers of cases representing $X_1 \Rightarrow Y_1$ $(s_1\%, c_1\%, l_1)$ and $X_2 \Rightarrow Y_2$ $(s_2\%, c_2\%, l_2)$, respectively. Then $X_1 \Rightarrow Y_1$ $(s_1\%, c_1\%, l_1)$ is a redundant association rule and should be removed from the cluster profile. The three judgment standards, support, confidence and lift, of $X_2 \Rightarrow Y_2$ $(s'\%, c'\%, l')$ are updated as:

\[
s' = \frac{t_1 \cdot \mathrm{Support}(X_1 \cup Y_1, DB_1) + t_2 \cdot \mathrm{Support}(X_2 \cup Y_2, DB_2)}{t_1 + t_2},
\]
\[
c' = \frac{t_1 \cdot \mathrm{Support}(X_1 \cup Y_1, DB_1) + t_2 \cdot \mathrm{Support}(X_2 \cup Y_2, DB_2)}{t_1 \cdot \mathrm{Support}(X_1, DB_1) + t_2 \cdot \mathrm{Support}(X_2, DB_2)},
\]
\[
l' = \frac{c'}{\bigl(t_1 \cdot \mathrm{Support}(Y_1, DB_1) + t_2 \cdot \mathrm{Support}(Y_2, DB_2)\bigr) / (t_1 + t_2)}.
\]

For example, Tables 3 and 4 represent the redundant-free cluster profiles of cluster-1 and cluster-2, respectively. Suppose that these two tables are the customer profiles of the 'transactor user'. The (c2, id2) rule in Table 4 is then redundant to the (c1, id1) rule in Table 3, so it can be removed from Table 4, and the judgment standards of the first rule in Table 3 are updated as: Marital_Status=0 ⇐ Occupation=4010 & Sex=1 & Age_Segments=2 & CardType=100 (5.6%, 87.8%, 1.90). The judgment standards of the remaining redundant association rules are updated accordingly, as in Table 5. Here, (cx, idy) denotes the association rule with Rule ID y in cluster-x, and the last field in Table 5, 'U. to Rule', indicates the association rule according to which the judgment standards were updated.

Table 5
The redundant-free cluster profile of cluster-1 and cluster-2 (merged)

Rule ID | Association rule | Support | Confidence | Lift | U. to Rule
1 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100 | 5.58% | 87.9% | 1.91 | c2,id2; c2,id3
3 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=2 | 17.3% | 87.8% | 0.99 |
5 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=4 | 6.5% | 89.1% | 1.01 |
6 | Occupation=4010 ⇐ Marital_Status=0&CardType=113 | 6.28% | 94.1% | 1.10 | c2,id8
7 | Occupation=4010 ⇐ Marital_Status=0&Sex=1&Age_Segments=3 | 6.7% | 87.8% | 0.99 |
12 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=5 | 9.5% | 88.8% | 1.00 |
13 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=6 | 6.2% | 91.0% | 1.03 |
14 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=7 | 5.0% | 89.4% | 1.01 |
15 | Occupation=4010 ⇐ Marital_Status=1&CardType=113 | 5.88% | 92.9% | 1.09 | c2,id8
16 | Occupation=4010 ⇐ Marital_Status=1&Sex=1&Age_Segments=4 | 6.1% | 86.6% | 0.98 |
18 | Occupation=4010 ⇐ Marital_Status=1&Blockcod='n'&Age_Segments=3 | 5.2% | 88.9% | 1.01 |
19 | Occupation=4010 ⇐ Marital_Status=1&Blockcod='n'&Age_Segments=4 | 5.9% | 88.0% | 0.99 |
20 | Occupation=4010 ⇐ Marital_Status=1&Blockcod='n'&Sex=1 | 10.5% | 87.9% | 0.99 |
6 | Marital_Status=0 ⇐ Blockcod='n'&Sex=1&Age_Segments=2 | 5.3% | 85.6% | 1.86 |
7 | Occupation=4010 ⇐ CardType=100&Marital_Status=2 | 8.3% | 92.7% | 1.35 |
9 | Occupation=4010 ⇐ CardType=821 | 5.5% | 97.0% | 1.41 |

4. Conclusion

Credit and behavioral scoring have become useful tools to model financial problems. However, most studies have concentrated on building an accurate credit scoring model to decide whether or not to grant credit to new applicants. In order to strengthen customer behavior management for existing credit card customers, we created a behavioral scoring model using neural networks and an association rule inducer. The existing customers were divided into three profitable groups according to their shared behavior and characteristics. Marketers can then infer the profiles of customers in each group and propose management strategies appropriate to the characteristics of each group.

This study provides a useful method of analyzing bank databases. Beyond simply understanding customer value, the bank gains the opportunity to establish better customer relationships while increasing customer loyalty and revenue. Additionally, this two-stage behavioral scoring model can also be applied to predict personal bankruptcy among bank customers from the account database. Further research may aim at time-series behavioral scoring models that include the change of credit status in every period. Credit card customers could be segmented into more subgroups according to newly developed predictors. Building on this study and many others, more detailed management and marketing strategies can be implemented according to more detailed customer sub-groups.

References

Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the SIGMOD'93, Washington, DC, 207–216.
Baesens, B., Viaene, S., Poel, D., Vanthienen, J., & Dedene, G. (2002). Bayesian neural network for repeat purchase modelling in direct marketing. European Journal of Operational Research, 138, 191–211.
Balakrishnan, P. V. S., Cooper, M. C., Jacob, V. S., & Lewis, P. A. (1996). Comparative performance of the FSCL neural net and K-means algorithm for market segmentation. European Journal of Operational Research, 93, 346–357.
Bastide, Y., Pasquier, N., Taouil, R., Stumme, G., & Lakhal, L. (2000). Mining minimal non-redundant association rules using frequent closed item sets. Lecture Notes in Computer Science, 1861, 972–986.
Bayardo, R. J., & Agrawal, R. (1999). Mining the most interesting rules. Proceedings of KDD Conference, 145–154.
Bayardo, R. J., Agrawal, R., & Gunopulos, D. (1999). Constraint-based rule mining in large, dense databases. Proceedings of ICDE Conference, 188–197.
Bult, J. R., & Wansbeek, T. (1995). Optimal selection for direct mail. Marketing Science, 14(4), 378–381.
Chen, M. C., & Huang, S. H. (2003). Credit scoring and rejected instances reassigning through evolutionary computation techniques. Expert Systems with Applications, 24, 433–441.
Dasgupta, C. G., Dispensa, G. S., & Ghose, S. (1994). Comparing the predictive performance of a neural network model with some traditional market response models. International Journal of Forecasting, 10, 235–244.
Davies, F., Moutinho, L., & Curry, B. (1996). ATM attitudes: a neural network analysis. Marketing Intelligence and Planning, 14(2), 26–32.
Desai, V. S., Crook, J. N., & Overstreet, G. A., Jr. (1996). A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95, 24–37.
Donato, J. M., Schryver, J. C., Hinkel, G. C., Schmoyer, R. L., Leuze, M. R., & Grandy, N. W. (1999). Mining multi-dimensional data for decision support. Future Generation Computer Systems, 15, 433–441.
Dyche, J., & Dych, J. (2001). The CRM handbook: a business guide to customer relationship management. Reading, MA: Addison-Wesley.
Fish, K. E., Barnes, J. H., & Aiken, M. W. (1995). Artificial neural networks: a new methodology for industrial market segmentation. Industrial Marketing Management, 24, 431–438.
Hand, D. J. (1981). Discrimination and classification. New York: Wiley.
Heckerman, D. (1996). Bayesian networks for knowledge discovery. Advances in knowledge discovery and data mining, 273–305.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximations. Neural Networks, 2, 336–359.
Johnson, R. A., & Wichern, D. W. (1998). Applied multivariate statistical analysis (4th ed.). Upper Saddle River, NJ: Prentice-Hall.
Kim, Y. S., & Sohn, S. Y. (2004). Managing loan customers using misclassification patterns of credit scoring model. Expert Systems with Applications, 26, 567–573.
Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., & Verkamo, A. I. (1994). Finding interesting rules from large sets of discovered association rules. Proceedings of CIKM Conference, 401–407.
Kohonen, T. (1995). Self-organizing maps. Berlin: Springer.
Lancher, R. C., Coats, P. K., Shanker, C. S., & Fant, L. F. (1995). A neural network for classifying the financial health of a firm. European Journal of Operational Research, 85(1), 53–65.
Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using the hybrid neural discriminate technique. Expert Systems with Applications, 23, 245–254.
Malhotra, R., & Malhotra, D. K. (2003). Evaluating consumer loans using neural networks. Omega, 31(2), 83–96.
Mazanec, J. A. (1992). Classifying tourists into market segments: a neural network approach. Journal of Travel and Tourism Marketing, 1(1), 39–59.
Morrison, D. F. (1990). Multivariate statistical methods. New York, NY: McGraw-Hill.
Moutinho, L., Davies, F., & Curry, B. (1996). The impact of gender on car buyer satisfaction and loyalty. Journal of Retailing and Consumer Sciences, 3(3), 135–144.
Pasquier, N., Bastide, Y., Taouil, R., & Lakhal, L. (1999). Closed set based discovery of small covers for association rules. Proceedings of BDA Conference, 361–381.
Setiono, R., Thong, J. Y. L., & Yap, C. S. (1998). Symbolic rule extraction from neural networks: an application to identifying organizations adopting IT. Information and Management, 34(2), 91–101.
Sharda, R., & Wilson, R. (1996). Neural network experiments in business failures predication: a review of predictive performance issues. International Journal of Computational Intelligence and Organizations, 1(2), 107–117.
Silberschatz, A., & Tuzhilin, A. (1996). What makes patterns interesting in knowledge discovery systems. IEEE Transactions on Knowledge and Data Engineering, 8(6), 970–974.
Silverstein, C., Brin, S., & Motwani, R. (1998). Beyond market baskets: generalizing association rules to dependence rules. Data Mining and Knowledge Discovery, 2(1), 39–68.
Srikant, R., & Agrawal, R. (1995). Mining generalized association rules. Proceedings of VLDB Conference, 407–419.
Thomas, L. C. (2000). A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers. International Journal of Forecasting, 16, 149–172.
Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in business: a survey of applications (1992–1998). Expert Systems with Applications, 17, 51–70.
Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural networks in bankruptcy prediction: general framework and crossvalidation analysis. European Journal of Operational Research, 116, 16–32.
Zurada, J. M., Malinowski, A., & Cloete, I. (1994). Sensitivity analysis for minimization of input data dimension for feedforward neural network. IEEE International Symposium on Circuits and Systems, London, May 20–June 3.