Expert Systems with Applications 27 (2004) 623–633
www.elsevier.com/locate/eswa
An integrated data mining and behavioral scoring model
for analyzing bank customers
Nan-Chen Hsieh*
Department of Information Management, National Taipei College of Nursing, No. 365, Min Te Road 11257, Taipei, Taiwan, ROC
Abstract
Analyzing bank databases for customer behavior management is difficult since bank databases are multi-dimensional, comprised of
monthly account records and daily transaction records. This study proposes an integrated data mining and behavioral scoring model to
manage existing credit card customers in a bank. A self-organizing map neural network was used to identify groups of customers based on
repayment behavior and recency, frequency, monetary behavioral scoring predictors. It also classified bank customers into three major
profitable groups of customers. The resulting groups of customers were then profiled by customer feature attributes determined using an
Apriori association rule inducer. This study demonstrates that identifying customers by a behavioral scoring model helps to reveal customer
characteristics and facilitates marketing strategy development.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Data mining; Behavioral scoring model; Customer segmentation; Neural network; Association rule
1. Introduction
Contemporary marketing strategies perceive customers
as important resources to an enterprise. Therefore, it is
essential for enterprises to successfully acquire new
customers and retain high-value customers. To achieve
these aims, many enterprises have accumulated a significant
number of large databases, which can then be analyzed and
applied to develop new business strategies and
opportunities.
However, instead of targeting all customers equally or
providing the same incentive offers to all customers,
enterprises can select only those customers who meet
certain profitability criteria based on their individual needs
or purchasing behaviors (Dyche & Dych, 2001). Credit
scoring and behavioral scoring are techniques that help
decision makers to understand their customers. Credit scoring
models help to decide whether to grant credit to new
applicants based on customer characteristics such as age, income
and marital status (Chen & Huang, 2003). Behavioral
scoring models help to analyze purchasing behavior of
* Tel./fax: C2-822-7101-2220.
E-mail address: [email protected].
0957-4174/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2004.06.007
existing customers (Setiono, Thong, & Yap, 1998). These
two scoring models are highly related to the field of
classification analysis by statistical analysis (Hand, 1981;
Johnson & Wichern, 1998), especially classification analysis by neural networks in the field of data mining (Lancher,
Coats, Shanker, & Fant, 1995).
Until now, most existing data mining approaches have
been discovering general rules (Agrawal, Imielinski, &
Swami, 1993; Bult & Wansbeek, 1995; Setiono et al., 1998),
predicting personal bankruptcy (Dasgupta, Dispensa, &
Ghose, 1994; Desai, Crook, & Overstreet, 1996; Zhang, Hu,
Patuwo, & Indro, 1999) and credit scoring (Kim & Sohn,
2004; Lancher et al., 1995; Sharda & Wilson, 1996) in bank
databases. Few works have studied the mining of bank
databases from the viewpoint of customer behavioral
scoring (Sharda & Wilson, 1996). More specifically, we
wanted to look at both the account data of the customers and
their credit card transactions. With these data, the aim was
to discover interesting patterns in the data that could provide
clues about what incentives a company could offer as better
marketing strategies to its customers. As shown in Fig. 1,
this study presents a two-stage approach for behavioral
scoring analysis of implicit knowledge using bank customer
account and transaction data. Topics discussed include data
preprocessing, customer behavior scoring modelling, sensitivity analysis of the relatively important attributes contributing
to the customer profiling, and the two stages of the
behavioral scoring model itself.
Fig. 1. Two-stage behavioral scoring modeling.
The key feature of the two-stage behavioral scoring
model is a cascade involving a self-organizing map (SOM)
and an Apriori association rule inducer. An SOM (Kim &
Sohn, 2004; Kohonen, 1995) is an unsupervised learning
algorithm that maps similar multi-dimensional
input vectors to the same region of a neuron map, and
Apriori (Agrawal et al., 1993) is mainly used to find
potential relationships between items or features that occur
together in the database. In the first stage of the
approach presented here, a conceptual customer behavioral
scoring model was established to predict profitable groups
of customers based on previous repayment behavior and
RFM (Bult & Wansbeek, 1995) behavioral scoring
predictors. This SOM was employed to classify customers
into three major profitable groups: revolver
users, transactor users, and convenience users.
Once the SOM identified the profitable groups of
customers, an Apriori association rule inducer profiled each group of customers,
focusing on demographic and geographic characteristics, for
building and maintaining the most profitable customer base.
The customer profile was then used to describe a
representative case in each group of customers, and served
as a tool for establishing better bank marketing strategies.
After analyzing the bank database, this study demonstrates
that customer behavior scoring models are an effective
method for banks to identify their most profitable customers.
We conclude by analyzing target groups of customers using
the proposed two-stage behavioral scoring model.
This study is organized as follows. Section 2 describes the
analysis methodology and presents an integrated data mining and
behavioral scoring model. Section 3 assesses
neural networks as a tool for customer segmentation,
using past repayment behavior and RFM scoring variables
to build behavioral scoring models. Section 3 also presents
the process of creating customer profiles according to
their feature attributes as determined by an Apriori
association rule inducer. Finally, conclusions are made in
Section 4.
2. Description of the analysis methodology
2.1. Credit and behavioral scoring models
Credit and behavioral scoring models (Thomas, 2000)
are among the most successful applications of statistical and
operational research modelling in finance and banking, and
the number of scoring analysts in the industry is constantly
increasing. The main objective of both credit and behavioral
scoring models is to classify customers into groups (Lancher
et al., 1995). Hence scoring problems are related to the field
of classification analysis (Hand, 1981; Johnson & Wichern,
1998; Morrison, 1990). Applied to bank databases,
classification analysis for credit scoring is used to categorize
a new applicant as either accepted or rejected with respect to
features such as age, income and marital status (Chen &
Huang, 2003). On the other hand, classification analysis for
behavioral scoring is used to describe the behavior of
existing customers by using behavioral scoring variables
and also to predict future purchasing behavior or credit
status of existing customers (Setiono et al., 1998).
Until now, the building of both scoring models has
always been based on a pragmatic approach; because of this, a
single best, standard scoring model for every
circumstance almost certainly does not exist. Most previous
studies have focused on building more accurate credit or
behavioral scoring models and increasing the accuracy of
the classification model with various kinds of statistical
techniques. However, analyzing bank databases for customer behavior management is difficult since bank databases
are multi-dimensional, comprising monthly account
records and daily transaction records (Donato et al.,
1999). Therefore, even with highly accurate scoring models,
some misclassification patterns appear frequently.
This study draws heavily on data mining
perspectives. A general integrated data mining and
behavioral scoring model for customer behavior analysis,
which includes the necessary preprocessing of real-world
data sets, the derivation of scoring predictors and customer
profiling in order to support a standard model building
process, would be of great utility. The framework of the two-stage
behavioral scoring model serves as a tool to validate the
effect of data mining techniques in practical scoring analysis
applications.
2.2. Neural networks for segmentation analysis
For credit scoring or behavioral scoring analysis, many
studies have shown that neural networks perform
significantly better than statistical techniques such as linear
discriminant analysis (LDA), multiple discriminant analysis
(MDA), logistic regression analysis (LRA) and so on (Desai
et al., 1996; Lancher et al., 1995; Malhotra & Malhotra,
2003; Sharda & Wilson, 1996; Zhang et al., 1999). The
application of neural networks to segmentation analysis is a
promising research area and is a challenge for a variety of
marketers (Vellido, Lisboa, & Vaughan, 1999).
Baesens, Viaene, Poel, Vanthienen, & Dedene (2002)
applied Bayesian neural networks to repeat-purchase
behavior modelling in direct marketing. Davies, Moutinho,
& Curry (1996) and Moutinho, Davies, & Curry (1996)
analyzed how different bank customer groups hold
different expectations of the automatic teller machine
(ATM) service. Rather than profiling segments based on
demographic or geographic characteristics, Dasgupta et al.
(1994) characterized potential customer segments in terms
of lifestyle variables. Balakrishnan, Cooper, Jacob, & Lewis
(1996) accomplished a six-segment classification study
using coffee brand switching probabilities derived from
scanner data at a sub-household level. Mazanec
(1992) grouped tourists using a benefit approach. Setiono
et al. (1998) utilized a rule-extraction neural network to target
companies for the promotion of new information
technology. Fish, Barnes, & Aiken (1995) proposed a new
methodology for industrial market segmentation by neural
networks. Lee, Chiu, Lu, & Chen (2002) explored the
performance of credit scoring by integrating
back propagation neural networks with traditional discriminant
analysis. Kim & Sohn (2004) used neural networks to
manage customer loans.
Among these studies, only Balakrishnan et al. used the
frequency sensitive competitive learning (FSCL) algorithm
in segmentation analysis. The rest of the studies used
supervised feed-forward multilayer perceptrons (MLP)
trained by back propagation and gradient descent, or similar
alternatives.
2.3. Properties of the built behavioral scoring model
In the business world, the most successful applications of
behavioral scoring models are built on customer databases:
customer histories are analyzed for
similar behavioral patterns among existing customer preferences, and those patterns are used for a targeted selection of
existing or future customers. The decisions to be made
include which target groups of customers will be encouraged to spend more, what credit line to assign, whether to
promote new products to particular groups of customers,
and, if the repayment ability turns bad, how to manage debt
recovery. Therefore, a behavioral scoring model is an
information-driven marketing process that enables marketers to develop, test, implement, measure and appropriately
modify customized marketing programs and strategies.
In addition to customer values that credit scoring models
use as major scoring information, repayment behavior
patterns and customer purchasing histories are also required
in a behavioral scoring model. Behavioral scoring models
are intended to establish associations between the input
predictors and the output scores in order to model the
behavior of different customers. More precisely, behavioral
scoring models try to group customers that
share behavior patterns. This is carried out by assigning
behavior scores to each customer and grouping customers
into classes of similar score value using an SOM neural
network. The behavior score is given by a mathematical
function of the form:
$$\text{behavior score} = f_{\mathrm{SOM}}(\text{predictor}_1, \text{predictor}_2, \ldots).$$
In this study, four predictors, namely repayment behavior
and the RFM values, are used to classify three profitable groups
of customers. Individual customer scores are updated on a
yearly basis in this study.
3. Assessing the neural network for customer segmentation
3.1. Preparing the data sets
For this study, bank databases were provided by a major
Taiwanese credit card issuer. Data preprocessing was
required to ensure data field consistency in behavioral
scoring model building. Not all the data are
related to the chosen purposes, so knowledge extraction
from the bank databases included the following three sub-actions. The first sub-action was intended to organize the
raw data. Two data sets were obtained: a set containing
effective credit card account information of 158,126
customers until June 2003, and another set storing over
20 million individual transaction records for these accounts
from January 2000 to June 2003. Then, the two data sets were
joined using a customer identifier to create a single
behavior-oriented data set. The second sub-action was
the extraction of only those data considered useful for the
analysis. Unnecessary data fields and records containing
incomplete or missing data were removed from the data sets
(Fish et al., 1995). The third sub-action was the application
of simple statistics to calculate aggregates that serve as new
behavioral scoring predictors.
The aim of calculating the aggregates was to emphasize
the customer repayment behavior and RFM (Bult &
Wansbeek, 1995) information hidden in the 12-month
observation period. In this case, values derived from the
database such as the maximum, minimum and average of a set
of variables (e.g. repayment states, payment cycle days,
number of credit card purchases, consumption amount,
interest on credit balance, and so on) for the monthly
activity over the past 12 months were considered for the
purpose of building a behavioral scoring model. As
mentioned, the desired outcome is to be able to predict
which customer belongs to which profitable group. The
range of values of each numerical predictor is split into
intervals so that each interval contains as many customers as
possible that have a significantly homogeneous behavior.
Multiple predictors can be grouped together to obtain the
same effect. To derive the most profitable customers, we
chose to identify similar repayment behavior with respect
to the RFM values found in the real world.
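The three sub-actions described above (join on a customer identifier, drop incomplete records, derive simple aggregates) can be sketched as follows. This is only an illustration under stated assumptions: the field names, the sample records and the helper `build_behavioral_dataset` are hypothetical and do not come from the bank database used in the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical account records keyed by customer identifier.
accounts = {
    "C001": {"credit_line": 100000},
    "C002": {"credit_line": 50000},
}

# Hypothetical monthly activity rows: (customer_id, month, purchases, amount).
monthly_activity = [
    ("C001", "2002-07", 3, 1200.0),
    ("C001", "2002-08", 5, 3000.0),
    ("C002", "2002-07", 1, 400.0),
    ("C002", "2002-08", None, None),  # incomplete record, dropped below
    ("C003", "2002-07", 2, 900.0),    # no matching account, dropped below
]


def build_behavioral_dataset(accounts, monthly_activity):
    """Join activity to accounts and derive max/min/average aggregates."""
    grouped = defaultdict(list)
    for customer_id, _month, purchases, amount in monthly_activity:
        # Keep only complete rows that belong to a known account.
        if customer_id in accounts and purchases is not None and amount is not None:
            grouped[customer_id].append((purchases, amount))

    dataset = {}
    for customer_id, rows in grouped.items():
        counts = [purchases for purchases, _ in rows]
        amounts = [amount for _, amount in rows]
        dataset[customer_id] = {
            **accounts[customer_id],
            "max_amount": max(amounts),
            "min_amount": min(amounts),
            "avg_amount": mean(amounts),
            "avg_purchases": mean(counts),
        }
    return dataset


behavioral = build_behavioral_dataset(accounts, monthly_activity)
```

Each entry of `behavioral` is one row of the single behavior-oriented data set; in practice the aggregates would be computed over the full 12-month observation period rather than two sample months.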
3.2. Analyzing the behavior of customers
To establish a better relationship with customers, banks
constantly seek ways of differentiating their offerings and
developing more appropriate services for distinct market
segments. An important observation on the current state-of-the-art segmentation analysis is the use of past transaction
data. The results produced are based on the assumption that
the customer behavior follows patterns similar to past
patterns and will repeat in the future. Therefore, there could
not be a better time than now to recognize the importance of
an effective new marketing strategy using data mining
techniques. To increase the amount of purchases while
improving customer satisfaction is a major goal.
Segmentation analysis is a method of achieving more
targeted communication with customers and is a pioneering
step towards classifying individual customers according to
previously defined groups of customers. The process of
segmentation analysis describes the characteristics of
groups of customers within the data, and puts customers
into segments according to their affinities or similar
characteristics.
This study tries to construct a behavioral scoring model
for direct marketing and encouraging consumption (Lancher
et al., 1995). These two goals are similar for analyzing
potential credit card customer behavior. However, attempts
at good customer behavior management may be
limited by poor data relevance and quality, the volume of
data needing to be processed, or difficulty in viewing the
data. Therefore, the original data set could not be used
directly to predict customer behavior, so extra behavioral
scoring predictors were needed for prediction.
As mentioned, banks have three types of profitable
customers: revolver users, transactor users and convenience
users. Revolver users always carry a credit card balance,
rolling over part of the bill to the next month, instead of
paying off the balance in full each month. Revolver users are
highly profitable customers because every month they pay
considerable interest on their outstanding balance. Meanwhile, transactor users pay in full on or before the due date
of the interest-free credit period and do not incur any
interest payments or finance charges. Transactor users do
not contribute significant revenue through interest on their
credit balances, but the discount on each transaction they
make still provides an important source of bank revenue.
Finally, convenience users are customers who periodically
charge large bills, such as for vacation or large purchases to
their credit card, and then pay these bills off over several
months. Convenience users thus contribute significant
amounts of interest on their credit balance.
Fig. 2 presents the conceptual framework used to answer
the questions posed in this study. This figure shows the two
components, customer segmentation and customer profiling,
which serve as major issues to be discussed here. Generally,
credit card issuers make money from annual fees, interest on
credit balance, and the discount collected from merchants
on each transaction. In this framework, account and
transaction data sets are assumed to be input sets to
customer segmentation. The values of RFM and repayment
behavior are assumed to be behavioral scoring predicators
affecting customer segmentation.
The recency (R) value measures the average time
between the day a customer makes a charge and the day
the customer pays the bill, the frequency (F) value measures the average
number of credit card purchases made, and the monetary (M)
value measures the amount of consumption spent during a
yearly time period. Next, variables such as customer
attributes and credit card usage are assumed to influence
customer profiling. Finally, clusters and the associated
customer profiles are assumed to be outputs, as well as
influencing credit card marketing strategies. In Fig. 2,
repayment behavior is highly related to customer segmentation, but is an implicit variable which cannot be retrieved
directly from the data set. We needed to develop a method
for modeling the customer repayment behavior.
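As a rough illustration of the R, F and M definitions given above, the sketch below computes the three values for one customer. The transaction tuples and the function `rfm_values` are hypothetical, and expressing frequency as purchases per month is an assumption, since the text does not state the averaging unit.

```python
from datetime import date

# Hypothetical transactions for one customer in a 12-month window:
# (charge date, date the bill was paid, amount charged).
transactions = [
    (date(2002, 7, 3), date(2002, 8, 2), 1500.0),
    (date(2002, 9, 10), date(2002, 10, 20), 2500.0),
    (date(2003, 1, 5), date(2003, 1, 25), 2000.0),
]


def rfm_values(transactions, months=12):
    """Return (R, F, M) following the definitions in the text."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    gaps = [(paid - charged).days for charged, paid, _ in transactions]
    recency = sum(gaps) / len(gaps)          # average days from charge to payment
    frequency = len(transactions) / months   # purchases per month (assumed unit)
    monetary = sum(amount for _, _, amount in transactions)  # yearly consumption
    return recency, frequency, monetary


r, f, m = rfm_values(transactions)
```

With these definitions a low R together with high F and M describes a customer who pays quickly, buys often and spends a lot.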
As shown in the following equation, this study employs
‘Repayment Ability’ (RA) to model repayment behavior,
$$\text{Repayment Ability} = \frac{\text{no. of months without delayed pay off}}{\text{no. of months of holding the card}}.$$
Fig. 2. A conceptual framework of customer behavioral modeling.
The default observation range is assumed to be
12 months, and RA is computed as the ‘no. of months
without delayed pay off’ divided by the ‘no. of months of
holding the card’. For example, if a customer carries no
credit card balance for 8 months, then RA
is computed as 8/12. For each customer, if RA
approaches one, then that
customer is considered a transactor user. Meanwhile, if RA
lies between zero and one, then that
customer is considered a convenience user. Finally, if
RA approaches zero, then
that customer is considered a revolver user.
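A minimal sketch of the RA computation and the three-way labeling follows. The text only says that RA is "approaching" one or zero, so the cut-offs `low=0.2` and `high=0.8` are illustrative assumptions, not values from the study; both function names are hypothetical as well.

```python
def repayment_ability(months_without_delay, months_held=12):
    """RA = months without delayed pay off / months of holding the card."""
    if months_held <= 0:
        raise ValueError("months_held must be positive")
    if not 0 <= months_without_delay <= months_held:
        raise ValueError("months_without_delay must lie in [0, months_held]")
    return months_without_delay / months_held


def repayment_behavior(ra, low=0.2, high=0.8):
    """Label a customer from RA; `low` and `high` are assumed cut-offs."""
    if ra >= high:
        return "transactor user"
    if ra <= low:
        return "revolver user"
    return "convenience user"


ra = repayment_ability(8)        # the 8/12 example from the text
label = repayment_behavior(ra)
```

In the study itself the grouping comes from the SOM rather than from fixed thresholds, so this only mirrors the verbal rule.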
3.3. Assessing the SOM for customer behavioral scoring
In recent years, the SOM (Kohonen, 1995) has
gained popularity as a classification analysis tool in
business-related areas (Vellido et al., 1999). In this study,
the SOM is built with data from existing customers, which
include variables from the account and transaction data sets. All
of the existing customers’ data are used to build the
behavioral scoring model in order to predict potential
customer behavior.
The behavioral scoring model utilized in this study is
arranged to form a two-dimensional SOM with a 4×4
rectangular array of neurons. Each of these neurons
is connected to the input vectors through synaptic weights
which are adjusted during learning. The first phase of
SOM training is a rough estimation phase, used to capture the gross
data patterns. The second phase is a tuning phase, used to
adjust the map to model the fine features of the data.
During the learning process, when a pattern is presented as
an input to the neural network, the Euclidean distance
between the pattern and each neuron is calculated using
RA first and then RFM as input variables. For inputs to the
SOM, each feature is scaled by subtracting the mean and
dividing by the standard deviation, resulting in each scaled
feature having a mean of zero and a standard deviation of
one. Once the most similar neuron is determined, the
neighborhood of that neuron is identified. The neighborhood of a neuron is defined as all the neurons within a
given link distance of the matched neuron. All neurons in
the neighborhood are adjusted to have feature values closer
to the current case. The adjustment amount of the neuron
weights is controlled by the learning rate.
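The training procedure described above can be sketched in plain Python. This is a simplified illustration, not the software used in the study: the sample customers, the epoch counts, learning rates and radii of the two phases, and the use of Manhattan grid distance as the "link distance" are all assumptions.

```python
import math
import random


def standardize(rows):
    """Scale each feature to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (Euclidean)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, side=4, phases=((20, 0.5, 2), (40, 0.1, 1)), seed=0):
    """Train a side x side SOM; each phase is (epochs, learning_rate, radius)."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(side) for c in range(side)]
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in grid]
    for epochs, rate, radius in phases:      # rough phase, then tuning phase
        for _ in range(epochs):
            for x in rng.sample(data, len(data)):
                br, bc = grid[best_matching_unit(weights, x)]
                for i, (r, c) in enumerate(grid):
                    # Neighborhood: neurons within `radius` links of the winner.
                    if abs(r - br) + abs(c - bc) <= radius:
                        weights[i] = [w + rate * (v - w)
                                      for w, v in zip(weights[i], x)]
    return weights


# Hypothetical customers as [RA, R, F, M] vectors.
customers = [
    [0.95, 20, 2.0, 30000], [0.90, 25, 1.5, 20000], [0.50, 35, 0.8, 60000],
    [0.10, 45, 3.0, 90000], [0.05, 50, 2.5, 80000], [0.60, 30, 1.0, 50000],
]
scaled = standardize(customers)
weights = train_som(scaled)
clusters = [best_matching_unit(weights, x) for x in scaled]
```

Each value in `clusters` is the index (0 to 15) of the neuron, and hence the cluster, that a customer is assigned to.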
The SOM map is shown in Fig. 3; the repayment
behavior, number of customers, ratio of the number of
customers to the overall number of customers, RA and RFM
are shown for each neuron. Fig. 4 illustrates the overall
distribution of customers with respect to the three major
profitable types of customers. The bulk of the cases are distributed
over neurons 9–16, where the number of customers is 104,979 and
the repayment behavior is revolver user. Neurons 3, 4, 7 and 8
indicate that a total of 21,202 customers are convenience users.
Neurons 1, 2, 5 and 6 indicate that a total of 31,945 customers
are transactor users.
On the basis that no meaningful conclusions can be
drawn from small numbers of customers, no further analysis
needs to be performed on the clusters with fewer than 1000
cases (i.e. neurons 6, 12 and 15). The next major step is to
choose the target groups of customers for direct marketing
and encouraging consumption. The repayment behavior can be used to
indicate the risk of customers; the risk degrees among the three
profitable groups of customers are ‘transactor user’ ≤ ‘convenience user’ ≤ ‘revolver user’. Moreover, the clusters
whose RFM values tend to R↓F↑M↑ in each profitable group
are selected as target ones, and all customers who belong to
these clusters become candidates for suitable
bank marketing strategies, as they attract the most
attention.
Fig. 3. Neurons in a 4×4 map; each neuron defines a cluster.
3.4. Determining the relatively important variables
After the segmentation of the existing customers, it is
possible to infer the characteristics of each group of
customers and from that propose appropriate management
strategies. Customer profiling (Setiono et al., 1998) provides
a basis for enterprises to offer customers better services and
retain good customers. Customer profiling is done by
assembling collected information on customers and their
potential behavior. We first applied a neural network sensitivity
analysis (Zurada, Malinowski, & Cloete, 1994) to
all customers to determine whether there are significant
differences between customers and to minimize the input
variables, and then inferred customer profiles using an Apriori
association rule inducer.
The data set obtained after data preprocessing contained
32 attributes: 10 character attributes and 22 continuous
attributes. The neural network sensitivity analysis was used
to retain the relatively important attributes, with repayment
behavior and RFM values chosen as the predicted variables for
all customers. As recommended by Hornik, Stinchcombe, & White (1989), a network with one hidden layer is
sufficient to model a complex system with any desired
accuracy, so the employed neural network model has just
one hidden layer.
Table 1 lists the distribution of the relative importance
for each input variable using the neural network. The
sensitivity analysis of the neural network and the order of
the most significant input variables indicate those variables that
are worth looking at in more detail. Factors with a relative
importance of 0.00997 and above were used in successive
customer profiling. In Table 1, Amount_of_Consumption,
CardAge_Month and CreditLine were the three most
differentiating variables. On the other hand, Marital_Status,
Age_Segments and Sex are the least differentiating variables.
Fig. 4. Customer distribution to repayment behavior.
Table 1
Sensitivity analysis of the relative importance input variables

Neural network model
Input layer (no. of neurons)         32
First hidden layer (no. of neurons)  20
Output layer (no. of neurons)        4
Predicted accuracy                   96.16%

Relative importance to RA and RFM
Variable name          Relative importance  Comments
Amount_of_Consumption  0.40658              Monthly amount of consumption
Cardage_Month          0.30048              Number of months for which the card has been held
Creditline             0.19980              Credit line
Total_Consumption      0.15002              Yearly amount of consumption
Blockcod               0.12431              Card usage limit or not
Occupation             0.03722              Encoded field
Cardtype               0.02546              Encoded field
Marital_Status         0.01779              0, single; 1, married; 2, divorced; 3, separation
Age_Segments           0.01499              1, <25; 2, 25–30; 3, 30–35; 4, 35–40; 5, 40–45; 6, 45–50; 7, 50–60; 8, >60
Sex                    0.00997              1, male; 0, female
⋮                      ⋮
3.5. Creating customer profiles
The study’s aim is to discover hidden patterns
in bank databases so that banks can better understand
the different characteristics of different customers and
develop new strategies to provide better service. In the
previous sections, we used a behavioral scoring model to
classify customers into clusters with shared characteristics. Association rule mining was then
used to create a customer profile for each cluster. The
purpose of association rule extraction is to discover
significant relationships between items or features that
occur frequently in a transaction database.
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of items. Let DB be a
transaction database, where each transaction $T$ consists of
a set of items such that $T \subseteq I$. Given a set of items $X \subseteq I$,
a transaction $T$ contains $X$ if and only if $X \subseteq T$. Support($X$, DB)
denotes the rate of $X$ in DB. An association rule
is an implication of the form ‘$X \Rightarrow Y$ ($s\%$, $c\%$, $l$)’, where
$X \subseteq I$, $Y \subseteq I$ and $X \cap Y = \emptyset$. An association rule $X \Rightarrow Y$
holds in DB with support $s\%$ if the probability of
a transaction in DB containing both $X$ and $Y$ is $s\%$ (i.e.
Support($X \Rightarrow Y$) = Support($X \cup Y$, DB)). An association
rule $X \Rightarrow Y$ holds in DB with confidence $c\%$ if the
probability that a transaction in DB which contains $X$ also
contains $Y$ is $c\%$ (i.e. Confidence($X \Rightarrow Y$) = Support($X \cup Y$, DB)/Support($X$, DB)). The well-known Apriori
algorithm (Agrawal et al., 1993) has been proposed for
mining association rules in a transaction database.
Mining association rules means discovering all the association
rules whose support is at least a minimum support
(minsup) threshold and whose confidence is at least a
minimum confidence (minconf) threshold. The association rules must satisfy two conditions:
$$\mathrm{Support}(X \Rightarrow Y) \geq \mathit{minsup}, \qquad \mathrm{Confidence}(X \Rightarrow Y) \geq \mathit{minconf}.$$
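To make the support and confidence conditions concrete, here is a minimal level-wise sketch in the spirit of Apriori. It is a simplified illustration and not the rule inducer used in the study; the four sample transactions and the thresholds passed to `association_rules` are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Share of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, minsup):
    """Level-wise search: grow candidates only from frequent itemsets."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        kept = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in kept.items() if s >= minsup}
        frequent.update(kept)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a, b in combinations(kept, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept
                        for s in combinations(c, len(c) - 1))]
    return frequent


def association_rules(transactions, minsup, minconf):
    """Return (antecedent, consequent, support, confidence) tuples."""
    frequent = frequent_itemsets(transactions, minsup)
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                conf = sup / frequent[antecedent]
                if conf >= minconf:
                    rules.append((antecedent, itemset - antecedent, sup, conf))
    return rules


# Hypothetical customer records, one set of attribute=value items each.
transactions = [
    frozenset({"Occupation=4010", "Marital_Status=1", "Sex=1"}),
    frozenset({"Occupation=4010", "Marital_Status=1"}),
    frozenset({"Occupation=4010", "Marital_Status=0"}),
    frozenset({"Occupation=2000", "Marital_Status=1", "Sex=1"}),
]
rules = association_rules(transactions, minsup=0.5, minconf=0.85)
```

On this toy data only the rule Sex=1 ⇒ Marital_Status=1 passes both thresholds, with support 0.5 and confidence 1.0.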
When all of the association rules have been generated, the
simplest way to determine the positive tendency of each
association rule is to use lift. Lift is the
ratio of confidence to expected confidence. Expected
confidence is the number of transactions that include the
consequent divided by the total number of transactions.
Suppose that we use $X \Rightarrow Y$ to determine a customer’s
tendency of purchasing $Y$; the expected confidence is then Support($Y$, DB), and the lift is computed as:
$$\mathrm{Lift}(X \Rightarrow Y) = \frac{\mathrm{Confidence}(X \Rightarrow Y)}{\mathrm{Expected\_Confidence}(X \Rightarrow Y)}.$$
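The lift calculation can be illustrated with a short sketch that takes expected confidence to be the share of transactions containing the consequent, as defined in the text. The function name `lift` and the sample transactions are hypothetical.

```python
def lift(antecedent, consequent, transactions):
    """Lift = confidence / expected confidence for antecedent => consequent."""
    n = len(transactions)
    both = sum((antecedent | consequent) <= t for t in transactions)
    with_antecedent = sum(antecedent <= t for t in transactions)
    with_consequent = sum(consequent <= t for t in transactions)
    if with_antecedent == 0 or with_consequent == 0:
        raise ValueError("antecedent and consequent must both occur in the data")
    confidence = both / with_antecedent
    expected_confidence = with_consequent / n   # share containing the consequent
    return confidence / expected_confidence


# Hypothetical customer records, one set of attribute=value items each.
sample = [
    frozenset({"Occupation=4010", "Marital_Status=1", "Sex=1"}),
    frozenset({"Occupation=4010", "Marital_Status=1"}),
    frozenset({"Occupation=4010", "Marital_Status=0"}),
    frozenset({"Occupation=2000", "Marital_Status=1", "Sex=1"}),
]
value = lift(frozenset({"Sex=1"}), frozenset({"Marital_Status=1"}), sample)
```

A lift above 1 means the antecedent makes the consequent more likely than it is in the data as a whole; a lift near 1 means the rule adds little information.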
According to the SOM results, the customers fall
into three major profitable groups dispersed
over 16 clusters. The 10 variables derived from the
sensitivity analysis were chosen as predictor variables for
association rule analysis. To simplify the explanation, we
chose only cluster-1 and cluster-2 for mining association
rules. The Apriori association rule inducer was configured to
identify association rules that had at least 85% confidence
and 5% support.
Table 2 lists the cluster profile of cluster-1 in the form
of association rules, where each rule represents a customer
profile that was dominant or most strongly associated with
the customers matching that cluster. For discriminating
purposes, we have grouped customers with shared
behavioral characteristics. From this, marketers can create
more accurate campaigns towards each target group of
customers for cross-selling and encouraging consumption.
After briefly reviewing the 16 clusters using cluster
profiles, the customers whose values tend to R↓F↑M↑ can
be targeted with greater accuracy. However, the risk
arising from the different profitable groups of customers in
practical applications should be considered.
3.6. Merging redundant association rules
After customers are classified by the behavioral scoring model, the
resulting clusters are profiled by feature attributes determined using
an Apriori association rule inducer. An association rule is considered
relevant for decision making if its support and confidence are at least
equal to the minimal support and confidence thresholds defined by the
user. As shown in Table 2, the set of extracted association rules is
usually very large, owing to the presence of a huge proportion of
redundant rules conveying the same information. Many of the rules may
contain redundant or irrelevant information, or describe trivial
knowledge. We present interactive strategies for pruning redundant
association rules on the basis of an equivalence relation, to enhance
the readability of the rule set.

Table 2
Cluster-1 profile
Rule ID | Association rules | Support | Confidence | Lift | R. to Rules
1 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100 | 5.6% | 87.9% | 1.91 |
2 | Marital_Status=0 ⇐ Sex=1&Age_Segments=2 | 6.7% | 87.8% | 1.91 | 1
3 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=2 | 17.3% | 87.8% | 0.99 |
4 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=3 | 14.8% | 88.2% | 1.00 | 7
5 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=4 | 6.5% | 89.1% | 1.01 |
6 | Occupation=4010 ⇐ Marital_Status=0&CardType=113 | 6.2% | 94.2% | 1.06 |
7 | Occupation=4010 ⇐ Marital_Status=0&Sex=1&Age_Segments=3 | 6.7% | 87.8% | 0.99 |
8 | Occupation=4010 ⇐ Marital_Status=0&Sex=1 | 17.9% | 86.1% | 0.97 | 7
9 | Occupation=4010 ⇐ Marital_Status=0 | 46.0% | 87.9% | 0.99 | 3–8
10 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=3 | 10.8% | 88.0% | 0.99 | 18
11 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=4 | 13.2% | 87.9% | 0.99 | 16, 19
12 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=5 | 9.5% | 88.8% | 1.00 |
13 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=6 | 6.2% | 91.0% | 1.03 |
14 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=7 | 5.0% | 89.4% | 1.01 |
15 | Occupation=4010 ⇐ Marital_Status=1&CardType=113 | 5.7% | 92.7% | 1.05 |
16 | Occupation=4010 ⇐ Marital_Status=1&Sex=1&Age_Segments=4 | 6.1% | 86.6% | 0.98 |
17 | Occupation=4010 ⇐ Marital_Status=1&Sex=1 | 22.7% | 88.1% | 1.00 | 16, 20
18 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=3 | 5.2% | 88.9% | 1.01 |
19 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=4 | 5.9% | 88.0% | 0.99 |
20 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Sex=1 | 10.5% | 87.9% | 0.99 |
21 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’ | 22.7% | 88.4% | 1.00 | 18–20
22 | Occupation=4010 ⇐ Marital_Status=1 | 49.4% | 88.5% | 1.00 | 10–21
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |

Several methods have been proposed in the literature to reduce the
number of extracted association rules. Silverstein, Brin, & Motwani
(1998) used Pearson’s correlation statistic
measure in place of the confidence measure. Srikant & Agrawal (1995)
defined generalized association rules using a taxonomy of the item set.
Heckerman (1996) and Silberschatz & Tuzhilin (1996) measured the
distance between association rules by evaluating the deviation in a
rule’s support and confidence. Bayardo, Agrawal, & Gunopulos (1999)
used item constraints, which are Boolean expressions defined by the
user, to specify the form of association rules. Pasquier, Bastide,
Taouil, & Lakhal (1999) adapted the Duquenne-Guigues basis for global
implications and the proper basis for partial implications to the
framework of association rules. Klemettinen, Mannila, Ronkainen,
Toivonen, & Verkamo (1994) simplified a relatively significant number of
association rules via a visualization technique. Bastide, Pasquier,
Taouil, Stumme, & Lakhal (2000) used the Galois connection as a basis
to discover minimal non-redundant association rules. Bayardo & Agrawal
(1999) proposed A-maximal rules, that is, association rules with
maximal antecedents, for which the population of objects concerned is
reduced whenever an item is added to the antecedent.

Table 3
The redundant-free cluster profile of cluster-1 (merged)
Rule ID | Association rules | Support | Confidence | Lift
1 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100 | 5.6% | 87.9% | 1.91
3 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=2 | 17.3% | 87.8% | 0.99
5 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=4 | 6.5% | 89.1% | 1.01
6 | Occupation=4010 ⇐ Marital_Status=0&CardType=113 | 6.2% | 94.2% | 1.06
7 | Occupation=4010 ⇐ Marital_Status=0&Sex=1&Age_Segments=3 | 6.7% | 87.8% | 0.99
12 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=5 | 9.5% | 88.8% | 1.00
13 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=6 | 6.2% | 91.0% | 1.03
14 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=7 | 5.0% | 89.4% | 1.01
15 | Occupation=4010 ⇐ Marital_Status=1&CardType=113 | 5.7% | 92.7% | 1.05
16 | Occupation=4010 ⇐ Marital_Status=1&Sex=1&Age_Segments=4 | 6.1% | 86.6% | 0.98
18 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=3 | 5.2% | 88.9% | 1.01
19 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=4 | 5.9% | 88.0% | 0.99
20 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Sex=1 | 10.5% | 87.9% | 0.99
⋮ | ⋮ | ⋮ | ⋮ | ⋮

Table 4
The redundant-free cluster profile of cluster-2 (merged)
Rule ID | Association rules | Support | Confidence | Lift
2 | Marital_Status=0 ⇐ CardType=100&Age_Segments=2 | 6.3% | 86.1% | 1.87
3 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2 | 5.3% | 85.7% | 1.86
6 | Marital_Status=0 ⇐ Blockcod=’n’&Sex=1&Age_Segments=2 | 5.3% | 85.6% | 1.86
7 | Occupation=4010 ⇐ CardType=100&Marital_Status=2 | 8.3% | 92.7% | 1.35
8 | Occupation=4010 ⇐ CardType=113 | 6.7% | 93.8% | 1.36
9 | Occupation=4010 ⇐ CardType=821 | 5.5% | 97.0% | 1.41
⋮ | ⋮ | ⋮ | ⋮ | ⋮
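The three measures reported in the cluster profiles (support, confidence and lift) can be illustrated with a minimal sketch. It assumes the textbook definitions, which may differ from the variant reported by the rule inducer used in this study; the toy records, the function names and the threshold values are hypothetical and serve only as an illustration.

```python
from typing import FrozenSet, List, Tuple

Record = FrozenSet[str]


def support(itemset: FrozenSet[str], records: List[Record]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def rule_measures(antecedent: FrozenSet[str],
                  consequent: FrozenSet[str],
                  records: List[Record]) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) of the rule antecedent => consequent.

    Textbook definitions (an assumption, not taken from the paper):
      support    = P(antecedent and consequent)
      confidence = P(antecedent and consequent) / P(antecedent)
      lift       = confidence / P(consequent)
    """
    s_rule = support(antecedent | consequent, records)
    confidence = s_rule / support(antecedent, records)
    lift = confidence / support(consequent, records)
    return s_rule, confidence, lift


def is_relevant(s: float, c: float, min_support: float, min_confidence: float) -> bool:
    """A rule is kept only if it reaches both user-defined thresholds."""
    return s >= min_support and c >= min_confidence


# Hypothetical toy customer records (not data from the study).
customers = [
    frozenset({"Marital_Status=0", "Sex=1", "Occupation=4010"}),
    frozenset({"Marital_Status=0", "Sex=1"}),
    frozenset({"Marital_Status=1", "Sex=1", "Occupation=4010"}),
    frozenset({"Marital_Status=1", "Sex=0", "Occupation=4010"}),
]

s, c, l = rule_measures(frozenset({"Sex=1"}), frozenset({"Marital_Status=0"}), customers)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift close to 1, as for most rules in Table 2, indicates that the antecedent barely changes the probability of the consequent.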
We intend to provide strategies that preserve useful, relevant and
non-redundant association rules. Redundant rules, which in certain
databases represent the majority of extracted rules (particularly for
dense or correlated data, where the total number of valid rules is very
large), are pruned. Using the concept of an equivalence class,
redundant rules are collected in the same equivalence class. Only the
most informative non-redundant association rules are presented to the
user, such that the union of their antecedents (or consequents) is
equal to the union of the antecedents (or consequents) of all the
association rules valid in the context. The resulting rules have
minimal antecedents and maximal consequents within the same equivalence
class. The extracted set of rules therefore conveys, without any loss
of information, all the information in the set of association rules
that are valid in the context. From these bases, all valid association
rules, with their supports and confidences, can be deduced efficiently
without access to the original dataset.
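The idea of keeping only rules with minimal antecedents and maximal consequents can be sketched as follows. This is an illustrative reading, not the author's implementation: a rule is dropped when some other rule has an antecedent that is a subset of its own and a consequent that is a superset of its own. The `Rule` type, the function names and the toy items are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    rule_id: int
    antecedent: FrozenSet[str]  # X
    consequent: FrozenSet[str]  # Y


def makes_redundant(other: Rule, rule: Rule) -> bool:
    """True if `other` says at least as much as `rule` from no more premises.

    That is, other.X is a subset of rule.X and rule.Y is a subset of other.Y,
    and the two rules are not identical in content.
    """
    same_content = (other.antecedent == rule.antecedent
                    and other.consequent == rule.consequent)
    if same_content:
        return False
    return other.antecedent <= rule.antecedent and rule.consequent <= other.consequent


def prune(rules: List[Rule]) -> List[Rule]:
    """Keep only the rules that no other rule makes redundant."""
    return [r for r in rules if not any(makes_redundant(o, r) for o in rules)]


# Hypothetical toy rules over abstract items a..e.
rules = [
    Rule(1, frozenset({"a"}), frozenset({"b"})),
    Rule(2, frozenset({"a"}), frozenset({"b", "c"})),
    Rule(3, frozenset({"a", "d"}), frozenset({"b"})),
    Rule(4, frozenset({"e"}), frozenset({"c"})),
]

kept = prune(rules)
print([r.rule_id for r in kept])
```

Here rules 1 and 3 are dropped because rule 2 reaches a larger consequent from an antecedent that is no larger, while rules 2 and 4 survive.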
An association rule X1 ⇒ Y1 is redundant-free if and only if there does
not exist another association rule X2 ⇒ Y2 such that X2 ⊆ X1 and
Y1 ⊆ Y2. For example, in Table 2, rule 9 is redundant to rules 3–8,
because rule 9 does not convey additional information to the user.
Therefore, rule 9 can be removed from the cluster profile. Two
rule-merging principles are illustrated below.
(1) Let X1 ⇒ Y1 (s1%, c1%, l1) and X2 ⇒ Y2 (s2%, c2%, l2) be two
association rules in the same cluster profile, where X2 ⊆ X1 or
Y1 ⊆ Y2. Then X1 ⇒ Y1 (s1%, c1%, l1) is a redundant association rule
and can be directly removed from the cluster profile. For example,
Table 3 presents the redundant-free cluster profile of cluster-1
(Table 2). The last field in Table 2, ‘R. to Rules’, lists the rules
to which each removed rule is redundant.
(2) Let X1 ⇒ Y1 (s1%, c1%, l1) and X2 ⇒ Y2 (s2%, c2%, l2) be two
association rules in different cluster profiles, where X2 ⊆ X1 or
Y1 ⊆ Y2, and let t1 and t2 be the numbers of cases representing
X1 ⇒ Y1 (s1%, c1%, l1) and X2 ⇒ Y2 (s2%, c2%, l2), respectively. Then
X1 ⇒ Y1 (s1%, c1%, l1) is a redundant association rule and should be
removed from the cluster profile. The three judgment standards of
X2 ⇒ Y2, namely support, confidence and lift (s'%, c'%, l'), are
updated as:

$$
\begin{aligned}
s' &= \frac{t_1 \cdot \mathrm{Support}(X_1 \cup Y_1, DB_1) + t_2 \cdot \mathrm{Support}(X_2 \cup Y_2, DB_2)}{t_1 + t_2}, \\
c' &= \frac{t_1 \cdot \mathrm{Support}(X_1 \cup Y_1, DB_1) + t_2 \cdot \mathrm{Support}(X_2 \cup Y_2, DB_2)}{t_1 \cdot \mathrm{Support}(X_1, DB_1) + t_2 \cdot \mathrm{Support}(X_2, DB_2)}, \\
l' &= \frac{c'}{\dfrac{t_1 \cdot \mathrm{Support}(Y_1, DB_1) + t_2 \cdot \mathrm{Support}(Y_2, DB_2)}{t_1 + t_2}}.
\end{aligned}
$$

For example, Tables 3 and 4 present the redundant-free cluster profiles
of cluster-1 and cluster-2, respectively. Suppose that these two tables
are the customer profiles of the ‘transactor user’. The (c2, id2) rule
in Table 4 is then redundant to the (c1, id1) rule in Table 3, so it
can be removed from Table 4, and the judgment standards of the first
rule in Table 3 are updated as:
Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100
(5.6%, 87.8%, 1.90). The judgment standards for the remaining redundant
association rules are updated accordingly, as shown in Table 5. Here,
(cx, idy) denotes the association rule with Rule ID y in cluster-x, and
the last field in Table 5, ‘U. to Rule’, indicates the association rule
according to which the judgment standards were updated.

Table 5
The redundant-free cluster profile of cluster-1 and cluster-2 (merged)
Rule ID | Association rules | Support | Confidence | Lift | U. to Rule
1 | Marital_Status=0 ⇐ Occupation=4010&Sex=1&Age_Segments=2&CardType=100 | 5.58% | 87.9% | 1.91 | c2,id2; c2,id3
3 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=2 | 17.3% | 87.8% | 0.99 |
5 | Occupation=4010 ⇐ Marital_Status=0&Age_Segments=4 | 6.5% | 89.1% | 1.01 |
6 | Occupation=4010 ⇐ Marital_Status=0&CardType=113 | 6.28% | 94.1% | 1.10 | c2,id8
7 | Occupation=4010 ⇐ Marital_Status=0&Sex=1&Age_Segments=3 | 6.7% | 87.8% | 0.99 |
12 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=5 | 9.5% | 88.8% | 1.00 |
13 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=6 | 6.2% | 91.0% | 1.03 |
14 | Occupation=4010 ⇐ Marital_Status=1&Age_Segments=7 | 5.0% | 89.4% | 1.01 |
15 | Occupation=4010 ⇐ Marital_Status=1&CardType=113 | 5.88% | 92.9% | 1.09 | c2,id8
16 | Occupation=4010 ⇐ Marital_Status=1&Sex=1&Age_Segments=4 | 6.1% | 86.6% | 0.98 |
18 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=3 | 5.2% | 88.9% | 1.01 |
19 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Age_Segments=4 | 5.9% | 88.0% | 0.99 |
20 | Occupation=4010 ⇐ Marital_Status=1&Blockcod=’n’&Sex=1 | 10.5% | 87.9% | 0.99 |
6 | Marital_Status=0 ⇐ Blockcod=’n’&Sex=1&Age_Segments=2 | 5.3% | 85.6% | 1.86 |
7 | Occupation=4010 ⇐ CardType=100&Marital_Status=2 | 8.3% | 92.7% | 1.35 |
9 | Occupation=4010 ⇐ CardType=821 | 5.5% | 97.0% | 1.41 |
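The update of the three judgment standards can be sketched as a case-weighted pooling of the two rules' statistics. This is a minimal illustration under the assumption that, for each rule, the supports of X ∪ Y, X and Y are available as fractions of that rule's own cluster database; the `RuleStats` type, the function name and all numbers are hypothetical and are not taken from Tables 3 to 5.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RuleStats:
    cases: int               # t: number of cases representing the rule
    supp_rule: float         # Support(X u Y, DB)
    supp_antecedent: float   # Support(X, DB)
    supp_consequent: float   # Support(Y, DB)


def merge_measures(r1: RuleStats, r2: RuleStats) -> Tuple[float, float, float]:
    """Return the updated (support, confidence, lift) of the retained rule.

    All three measures are pooled over both cluster databases,
    weighting each rule's supports by its number of cases.
    """
    t1, t2 = r1.cases, r2.cases
    joint = t1 * r1.supp_rule + t2 * r2.supp_rule
    s_new = joint / (t1 + t2)
    c_new = joint / (t1 * r1.supp_antecedent + t2 * r2.supp_antecedent)
    pooled_consequent = (t1 * r1.supp_consequent + t2 * r2.supp_consequent) / (t1 + t2)
    l_new = c_new / pooled_consequent
    return s_new, c_new, l_new


# Hypothetical inputs: the redundant rule (removed) and the retained rule.
removed = RuleStats(cases=1000, supp_rule=0.05, supp_antecedent=0.06, supp_consequent=0.45)
retained = RuleStats(cases=3000, supp_rule=0.06, supp_antecedent=0.07, supp_consequent=0.47)

s_new, c_new, l_new = merge_measures(removed, retained)
print(f"s'={s_new:.4f} c'={c_new:.4f} l'={l_new:.4f}")
```

Because every term is weighted by the number of cases, the statistics of the larger cluster dominate the updated measures.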
4. Conclusion
Credit and behavioral scoring have become useful tools to
model financial problems. However, most studies have
concentrated on building an accurate credit scoring model to
decide whether or not to grant credit to new applicants. In
order to strengthen customer behavior management for
existing credit card customers, we created a behavioral
scoring model using neural networks and an association rule
inducer. The existing customers were divided into three
profitable groups of customers according to their shared
behavior and characteristics. Marketers then can infer the
profiles of customers in each group and propose management
strategies appropriate to the characteristics of each group.
This study provides a method for analyzing bank databases. Beyond
simply understanding customer value, the bank gains the opportunity to
establish better customer relationships while increasing customer
loyalty and revenue. Additionally, this two-stage behavioral scoring
model can also be applied to the account database to predict personal
bankruptcy among bank customers. Further research may aim at
time-series behavioral scoring models that include the change of credit
status in every period. Credit card customers could also be segmented
into more subgroups according to newly developed predictors. Thanks to
this paper and many others, more detailed management and marketing
strategies can be implemented according to more detailed customer
sub-groups.
References
Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules
between sets of items in large databases. Proceedings of the
SIGMOD’93, Washington, DC, 207–216.
Baesens, B., Viaene, S., Poel, D., Vanthienen, J., & Dedene, G. (2002).
Bayesian neural network for repeat purchase modelling in
direct marketing. European Journal of Operational Research, 138,
191–211.
Balakrishnan, P. V. S., Cooper, M. C., Jacob, V. S., & Lewis, P. A. (1996).
Comparative performance of the FSCL neural net and K-means
algorithm for market segmentation. European Journal of Operational
Research, 93, 346–357.
Bastide, Y., Pasquier, N., Taouil, R., Stumme, G., & Lakhal, L. (2000).
Mining minimal non-redundant association rules using frequent closed
item sets. Lecture Notes in Computer Science, 1861, 972–986.
Bayardo, R. J., & Agrawal, R. (1999). Mining the most interesting rules.
Proceedings of KDD Conference, 145–154.
Bayardo, R. J., Agrawal, R., & Gunopulos, D. (1999). Constraint-based rule
mining in large, dense databases. Proceedings of ICDE Conference,
188–197.
Bult, J. R., & Wansbeek, T. (1995). Optimal selection for direct mail.
Marketing Science, 14(4), 378–381.
Chen, M. C., & Huang, S. H. (2003). Credit scoring and rejected instances
reassigning through evolutionary computation techniques. Expert
Systems with Applications, 24, 433–441.
Dasgupta, C. G., Dispensa, G. S., & Ghose, S. (1994). Comparing the
predictive performance of a neural network model with some traditional
market response models. International Journal of Forecasting, 10,
235–244.
Davies, F., Moutinho, L., & Curry, B. (1996). ATM attitudes: a
neural network analysis. Marketing Intelligence and Planning, 14(2),
26–32.
Desai, V. S., Crook, J. N., & Overstreet, G. A., Jr. (1996). A comparison
of neural networks and linear scoring models in the credit
union environment. European Journal of Operational Research, 95,
24–37.
Donato, J. M., Schryver, J. C., Hinkel, G. C., Schmoyer, R. L., Leuze, M. R.,
& Grandy, N. W. (1999). Mining multi-dimensional data for decision
support. Future Generation Computer Systems, 15, 433–441.
Dyche, J. (2001). The CRM handbook: a business guide to
customer relationship management. Reading, MA: Addison-Wesley.
Fish, K. E., Barnes, J. H., & Aiken, M. W. (1995). Artificial neural
networks—a new methodology for industrial market segmentation.
Industrial Marketing Management, 24, 431–438.
Hand, D. J. (1981). Discrimination and classification. New York: Wiley.
Heckerman, D. (1996). Bayesian networks for knowledge discovery.
Advances in knowledge discovery and data mining, 273–305.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward
networks are universal approximations. Neural Networks, 2, 336–359.
Johnson, R. A., & Wichern, D. W. (1998). Applied multivariate statistical
analysis (4th Ed.). Upper Saddle River, NJ: Prentice-Hall.
Kim, Y. S., & Sohn, S. Y. (2004). Managing loan customers using
misclassification patterns of credit scoring model. Expert Systems with
Applications, 26, 567–573.
Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., & Verkamo,
A. I. (1994). Finding interesting rules from large sets of discovered
association rules. Proceedings of CIKM Conference, 401–407.
Kohonen, T. (1995). Self-organizing maps. Berlin: Springer.
Lancher, R. C., Coats, P. K., Shanker, C. S., & Fant, L. F. (1995). A neural
network for classifying the financial health of a firm. European Journal
of Operational Research, 85(1), 53–65.
Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using
the hybrid neural discriminate technique. Expert Systems with
Applications, 23, 245–254.
Malhotra, R., & Malhotra, D. K. (2003). Evaluating consumer loans using
neural networks. Omega, 31(2), 83–96.
Mazanec, J. A. (1992). Classifying tourists into market segments: a neural
network approach. Journal of Travel and Tourism Marketing, 1(1),
39–59.
Morrison, D. F. (1990). Multivariate statistical methods. New York, NY:
McGraw-Hill.
Moutinho, L., Davies, F., & Curry, B. (1996). The impact of gender on car
buyer satisfaction and loyalty. Journal of Retailing and Consumer
Sciences, 3(3), 135–144.
Pasquier, N., Bastide, Y., Taouil, R., & Lakhal, L. (1999). Closed set based
discovery of small covers for association rules. Proceedings of BDA
Conference, 361–381.
Setiono, R., Thong, J. Y. L., & Yap, C. S. (1998). Symbolic rule extraction
from neural networks—an application to identifying organizations
adopting IT. Information and Management, 34(2), 91–101.
Sharda, R., & Wilson, R. (1996). Neural network experiments in business
failures predication: a review of predictive performance issues.
International Journal of Computational Intelligence and Organizations, 1(2), 107–117.
Silberschatz, A., & Tuzhilin, A. (1996). What makes patterns interesting in
knowledge discovery systems. IEEE Transactions on Knowledge and
Data Engineering, 8(6), 970–974.
Silverstein, C., Brin, S., & Motwani, R. (1998). Beyond market baskets:
generalizing association rules to dependence rules. Data Mining and
Knowledge Discovery, 2(1), 39–68.
Srikant, R., & Agrawal, R. (1995). Mining generalized association rules.
Proceedings of VLDB Conference, 407–419.
Thomas, L. C. (2000). A survey of credit and behavioural scoring:
forecasting financial risk of lending to consumers. International
Journal of Forecasting, 16, 149–172.
Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in
business: a survey of applications (1992–1998). Expert Systems with
Applications, 17, 51–70.
Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural
networks in bankruptcy prediction: general framework and cross-validation analysis. European Journal of Operational Research, 116,
16–32.
Zurada, J. M., Malinowski, A., & Cloete, I. (1994). Sensitivity analysis for
minimization of input data dimension for feedforward neural network.
IEEE International Symposium on Circuits and Systems, London, May
20–June 3.