International Seminar on Scientific Issues and Trends (ISSIT) 2014
CLASSIFICATION ALGORITHM IMPLEMENTATION OF DATA
MINING IN THE DETERMINATION OF GIVING LOAN
COOPERATIVE
Suryanto 1), Nandang Iriadi 2)
1) Informatics Management Program, Akademi Manajemen Informatika dan Komputer Bina Sarana Informatika, Jl. RS Fatmawati No. 24 Pondok Labu 12450, [email protected]
2) Computer Engineering Program, Akademi Manajemen Informatika dan Komputer Bina Sarana Informatika, Jl. RS Fatmawati No. 24 Pondok Labu 12450, [email protected]
Abstract - Cooperatives are an important form of organization for promoting economic growth. Credit unions are an alternative for people to obtain funding in an effort to improve their quality of life, meet daily needs, and develop a business. Undeniably, lending funds to customers raises problems, such as customers paying their installments late, misusing funds for other purposes, or failing to expand their business, so that cooperative funds stop flowing and loans turn bad. The purpose of this research is to establish an algorithmic model that can predict the behavior of troubled borrowers. Data mining is one of the methods that can be used to analyze existing chunks of data and to summarize them into specific information related to the data. The classification uses a decision tree built with the C4.5 algorithm, expressed in the form of rule statements. The decision tree model is able to improve the accuracy of analyzing the creditworthiness of prospective borrowers. The richer the information or knowledge contained in the training data, the higher the accuracy of the decision tree. The results obtained are an accuracy of 86.13%, a precision of 89.94%, a recall of 92.61%, and an AUC of 0.842, so that each prediction falls into the good classification and excellent classification categories.
Keywords: behavior of borrowers, Data Mining, C4.5 Algorithm
I. INTRODUCTION
Cooperatives that provide fresh loans to prospective members and to existing members are emerging. All of the facilities and conveniences for obtaining a loan have been designed with certain requirements set by the cooperative. Cooperatives are institutions that provide loans to borrowers who need quick funds for their members' purposes. In general, the loan funds distributed by cooperatives are allocated to, for example, venture capital, education, and home renovation. Undeniably, lending funds to cooperative members raises problems, such as members paying installments late, funds being misused for other purposes, or members failing to expand their business, so that cooperative funds stop flowing and loans turn bad. Therefore, this study was conducted to help resolve those problems by designing a data mining application that predicts the borrowing (credit) behavior of potential borrowers at the cooperative. Applying the C4.5 decision tree algorithm is expected to improve the accuracy of the analysis used to predict the behavior of members who may take out loans from the Ceger Jaya Multipurpose Cooperative Enterprise, located in Ceger, Cipayung, East Jakarta, the place chosen by the researchers as the case study. The number of members who applied for loan funds in 2010 was 762, of whom 300 were problematic customers; in 2011-2012 the number of customers was 928, of whom 202 were problematic. The data examined are limited to the 2011-2012 data for predicting customer behavior. Predicting this credit behavior requires a method or technique that can process the data that already exist in the cooperative; one such method is data mining, and of the data mining functions mentioned below, classification is the one used in this research. The data mining technique is applied in an application, and the classification method used is the decision tree, with the C4.5 algorithm as the tree-building algorithm. The data processed in this study are the installment data of members of the Ceger Jaya Multipurpose Cooperative Enterprise for 2011-2012, in Microsoft Excel format. The outcome of this study is an application that can help the Multipurpose Cooperative Enterprise (KSU) "Ceger Jaya" reach its credit marketing targets in the future. With this data mining, KSU Ceger Jaya can obtain an overview for running its lending business. To predict the behavior of these credit customers, a method or technique is needed that can process the data already held; one such method is to use data mining techniques. Data mining techniques comprise six functions (tasks), which include:
1. The description function, used to briefly describe a set of data.
2. The estimation function, used to estimate values from existing case data.
3. The prediction function, used to estimate results that are not yet known.
4. The classification function, which classifies the data; classification algorithms include the mean vector algorithm, the K-nearest neighbor algorithm, the ID3 algorithm, the C4.5 algorithm, and the C5.0 algorithm.
II. THEORY
2.1. Overview of Related Studies
There are several studies that use the C4.5 decision tree as the algorithm model for making predictions based on historical data.
1. Credit Scoring Model Based on Decision Tree and the Simulated Annealing Algorithm [8]. This work builds a model to predict which customers are problematic and which are not in their loan payments, using the C4.5 decision tree combined with the simulated annealing algorithm. The data used are taken from a German credit finance company. Jiang took a few attributes and incorporated them into the model to predict the percentage of problematic customers.
2. Could Decision Trees Improve the Classification Accuracy and Interpretability of Loan Granting Decisions? [24]. This study compares several algorithms, such as linear regression, neural networks, support vector machines, case-based reasoning, rule-based fuzzy neural networks, and decision trees.
2.2. Credit
According to Article 1 number 11 of Act No. 10 of 1998, amending Law No. 7 of 1992 concerning Banking, credit is the provision of money or an equivalent claim, based on a lending agreement between a bank and another party, which requires the borrower to repay the debt within a certain period of time with interest. In distributing funds, the bank or lender has specific requirements that must be met, namely [9]:
1. Type of credit needed
2. Desired amount
3. The loan period
4. Method of loan repayment
5. Collateral held
6. Financial statements for past periods
7. Feasibility, and
8. Other requirements
In general, credit can be interpreted in two ways, namely [9]:
1. Credit in the sense of the provision or distribution of money
2. Credit in the form of goods or services
According to [9], creditworthiness can also be measured with the 7P principles, namely:
1. Personality: an assessment used to determine the personality of the prospective customer.
2. Purpose: the purpose of taking the credit (productive business, personal use, trade).
3. Party: in extending credit, the bank sorts borrowers into several categories (small, medium, or large business loans).
4. Payment: the way the loan will be repaid by the customer (whether from income or from the financed object).
5. Prospect: an assessment of future expectations, especially of the object financed by the loan.
6. Profitability: the credit financed by the bank should benefit both parties, the bank and the customer.
7. Protection: protection of the credit-financed object.
Credit analysis techniques using the C4.5 algorithm are expected to minimize the entry of problematic debtors, since the more troubled borrowers there are, the higher the level of bad loans, which in turn can lead to bankruptcy.
There are several attributes that accompany the debtor data, such as age, amount of credit, checking account, guarantor, loan term, length of employment, bank account number, employment status, credit history, home ownership status, savings, marital status, reason for the loan, and others [24].
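For illustration only, the debtor attributes listed above can be represented as a simple record structure; the field names in the Python sketch below are paraphrased from that list and are not the cooperative's actual schema.

from dataclasses import dataclass

@dataclass
class DebtorRecord:
    """Illustrative debtor record; field names are paraphrased from the attribute list above."""
    age: int
    credit_amount: float
    loan_term_months: int
    years_employed: int
    employment_status: str
    credit_history: str
    home_ownership: str
    savings: float
    marital_status: str
    loan_purpose: str

# A hypothetical applicant:
applicant = DebtorRecord(age=35, credit_amount=5_000_000, loan_term_months=12,
                         years_employed=6, employment_status="employee",
                         credit_history="no arrears", home_ownership="own home",
                         savings=1_500_000, marital_status="married",
                         loan_purpose="working capital")
print(applicant)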
2.3. Data Mining
Data mining is an iterative process aimed at the analysis of databases, with the goal of extracting information and knowledge that is accurate and potentially useful to professionals in the field for decision making and problem solving [22].
Data mining activities can be divided into two parts, according to the main objectives of the analysis [22]:
1. Interpretation. The purpose of interpretation is to identify regular patterns in the data and to express them through rules and criteria that can be easily understood by experts in the application domain.
2. Prediction. The goal of prediction is to anticipate the value that a variable will assume in the future or to estimate the probability of future events.
Data mining is the core of the Knowledge Discovery in Databases (KDD) process [16]. KDD is an organized process for identifying valid, new, useful, and understandable patterns from large and complex data sets.
KDD consists of nine steps, as shown in Figure 2.1. The following is a brief explanation of the steps in KDD [16] (a rough code illustration of steps 2-4 follows the list):
1. Developing an understanding of the application domain. At this stage the goals of the end user and the relevant setting in which KDD will be carried out are determined.
2. Selecting and creating the data set on which knowledge discovery will be performed. The data to be used for the KDD process are determined at this stage.
3. Preprocessing and cleansing. In this phase data reliability is improved, including cleaning the data, for example handling incomplete data and eliminating noise or outliers.
4. Transforming the data. At this stage the data are made more usable through dimension reduction methods and attribute transformation.
5. Choosing the appropriate data mining task. At this stage the type of data mining to be used is specified, whether classification, regression, or clustering, depending on the purpose of KDD and the previous stages.
6. Choosing the data mining algorithm. At this stage the most appropriate algorithm for finding patterns is selected.
7. Employing the data mining algorithm. At this stage the data mining algorithm determined in the previous stage is applied.
8. Evaluation. At this stage the patterns obtained are evaluated and interpreted.
9. Using the discovered knowledge. At this stage the knowledge is incorporated into other systems, the system is activated, and the results are measured.
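As a rough illustration of steps 2-4 on data like that used in this study (installment records stored in Microsoft Excel), the sketch below selects a data set, cleans obvious problems, and discretizes one attribute with pandas. The file name and column names are assumptions for illustration only.

import pandas as pd

# Step 2: select and create the data set (hypothetical file and column names).
df = pd.read_excel("installments_2011_2012.xlsx")

# Step 3: preprocessing and cleansing - handle incomplete data and implausible values.
df = df.dropna(subset=["Income", "LoanCeiling", "Status"])
df = df[df["Income"] > 0]

# Step 4: transformation - discretize a continuous attribute into bands like those
# used later in this paper (approximate bin edges, in rupiah).
df["IncomeBand"] = pd.cut(df["Income"],
                          bins=[2_000_000, 3_000_000, 4_000_000, 9_000_000],
                          labels=["2-3 Jt", "3-4 Jt", "5-9 Jt"])
print(df[["IncomeBand", "Status"]].head())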
1. Classification
Classification is one of the most common applications of data mining [7].
2. Algorithm C4.5
The C4.5 algorithm is an enhancement of the ID3 algorithm. C4.5 constructs a decision tree from a set of training data in the form of cases or records (tuples) in a database, where each record has discrete or continuous attributes [14]. A decision tree has a tree-like structure in which each internal (non-leaf) node describes an attribute, each branch represents an outcome of the tested attribute, and each leaf describes a class. The C4.5 algorithm uses the concept of information gain, or entropy reduction, to select the optimal split [7].
There are several stages in building a decision tree with the C4.5 algorithm [10], namely:
a. Prepare the training data. Training data are usually taken from historical data that occurred in the past and have already been grouped into certain classes.
b. Determine the root of the tree. The root is taken from the selected attributes by calculating the gain value of each attribute; the attribute with the highest gain becomes the first root. Before calculating the gain of an attribute, the entropy value is calculated first, using the formula

Entropy(S) = Σ_{i=1}^{n} -p_i * log2(p_i)

where:
S = the set of cases
n = the number of partitions of S
p_i = the proportion of S_i to S
c. Then calculate the gain value using the formula

Gain(S, A) = Entropy(S) - Σ_{i=1}^{n} (|S_i| / |S|) * Entropy(S_i)

where:
S = the set of cases
A = an attribute
n = the number of partitions of attribute A
|S_i| = the number of cases in partition S_i
|S| = the number of cases in S
(A code sketch of both formulas is given after the construction steps below.)
d. Repeat step b for each partition until all records are partitioned.
e. The decision tree partitioning process stops when:
   - all records in node N belong to the same class,
   - there are no remaining attributes on which the records can be partitioned, or
   - there are no records left in an empty branch.
In summary, the C4.5 algorithm builds a decision tree as follows [10] (a small illustrative sketch in Python follows these steps):
a. Select an attribute as the root.
b. Create a branch for each value of that attribute.
c. Divide the cases among the branches.
d. Repeat the process for each branch until all cases in the branch belong to the same class.
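To make the formulas and the four summary steps concrete, here is a minimal Python sketch of entropy, information gain, and recursive tree construction for categorical attributes. It is an illustration under simplifying assumptions (no continuous attributes, no pruning, plain gain rather than gain ratio) and is not the authors' implementation; the example records and attribute values are hypothetical.

from collections import Counter
from math import log2

def entropy(rows, target):
    """Entropy(S) = sum over classes of -p_i * log2(p_i)."""
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

def gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum_i |S_i|/|S| * Entropy(S_i)."""
    total = len(rows)
    parts = {}
    for r in rows:
        parts.setdefault(r[attr], []).append(r)
    remainder = sum(len(p) / total * entropy(p, target) for p in parts.values())
    return entropy(rows, target) - remainder

def build_tree(rows, attrs, target):
    classes = {r[target] for r in rows}
    if len(classes) == 1:                  # stop: all records in the node share a class
        return classes.pop()
    if not attrs:                          # stop: no attributes left to partition on
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a, target))         # a. select the split attribute
    node = {best: {}}
    remaining = [a for a in attrs if a != best]
    for value in {r[best] for r in rows}:                          # b. create a branch per value
        subset = [r for r in rows if r[best] == value]             # c. divide the cases
        node[best][value] = build_tree(subset, remaining, target)  # d. recurse for each branch
    return node

# Tiny hypothetical data set, using attribute names similar to those in this study.
rows = [
    {"Income": "2-3 Jt", "Period": "1-6 Bln", "Status": "Current"},
    {"Income": "2-3 Jt", "Period": "7-12 Bln", "Status": "Troubled"},
    {"Income": "3-4 Jt", "Period": "1-6 Bln", "Status": "Current"},
    {"Income": "3-4 Jt", "Period": "7-12 Bln", "Status": "Current"},
]
print(build_tree(rows, ["Income", "Period"], "Status"))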
3. Rule Based Classification
A rule-based algorithm is the best way to represent small pieces of data or knowledge [7]. Rule-based logic is usually written in IF-THEN form, which can be expressed as:
IF condition THEN conclusion
An example of such a rule is:
IF age = youth AND student = yes THEN buys_computer = yes
The IF part of the rule is known as the rule antecedent or precondition, while the THEN part is referred to as the rule consequent. The rule antecedent usually contains one or more attributes (e.g., the age and student attributes), joined with a logical AND when more than one attribute is used. The rule consequent is a class prediction; in the example above it predicts a computer purchase, buys_computer = yes [7].
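For illustration, the example rule above can be written directly as a predicate in code (a trivial sketch, not taken from the paper):

def buys_computer(record):
    # IF age = youth AND student = yes THEN buys_computer = yes
    if record["age"] == "youth" and record["student"] == "yes":
        return "yes"
    return "unknown"  # the single rule only covers its own antecedent

print(buys_computer({"age": "youth", "student": "yes"}))  # -> yes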
4. Evaluation and Validation Methods
a. Confusion Matrix
The confusion matrix is a visualization tool commonly used in supervised learning. Each column of the matrix represents instances of a predicted class, while each row represents instances of an actual class (Gorunescu, 2010).
b. ROC Curve
The ROC curve shows classification accuracy and allows classifiers to be compared visually; it is another way to test the performance of a classification and summarizes the confusion matrix [6]. The accuracy measured as the area under the ROC curve (AUC) [6] can be classified into five groups (a small mapping sketch follows this list):
a. 0.90 - 1.00 = excellent classification
b. 0.80 - 0.90 = good classification
c. 0.70 - 0.80 = fair classification
d. 0.60 - 0.70 = poor classification
e. 0.50 - 0.60 = failure
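A small sketch of this grouping, mapping an AUC value to its diagnostic label (the handling of boundary values is an assumption):

def auc_category(auc):
    """Map an AUC value to the diagnostic labels listed above."""
    if auc >= 0.90:
        return "excellent classification"
    if auc >= 0.80:
        return "good classification"
    if auc >= 0.70:
        return "fair classification"
    if auc >= 0.60:
        return "poor classification"
    if auc >= 0.50:
        return "failure"
    return "below 0.50"

print(auc_category(0.842))  # -> good classification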
III. RESULTS AND DISCUSSION
3.1. Measurement Research
A. Results of the Research
The purpose of this study is to test the accuracy of credit analysis using the C4.5 algorithm. The data analyzed are loan data, i.e., loans that have all been approved by the Multipurpose Cooperative Enterprise "Ceger Jaya".
a. Algorithm C4.5
The steps for building the C4.5 model from the training data, which amount to 928 records, are:
a. Prepare the training data. The training data used in this study consist of 928 records.
b. Calculate the entropy value. Applying the entropy formula to the whole data set (726 Current cases and 202 Troubled cases) gives:

Entropy(S) = (-726/928 * log2(726/928)) + (-202/928 * log2(202/928)) = 0.7559

c. After that, calculate the gain value for each attribute with the gain formula

Gain(S, A) = Entropy(S) - Σ_{i=1}^{n} (|S_i| / |S|) * Entropy(S_i)

and select the attribute with the highest gain as the root of the tree. For example, for the Income attribute:

Gain(S, Income) = 0.7559 - ((301/928 * 0.9996) + (354/928 * 0.3134) + (273/928 * 0.4771)) = 0.1718

From the calculation of entropy and gain it appears that the Income attribute has the highest gain value, 0.1718, so Income becomes the root node of the decision tree. The results of the entropy and gain calculations are shown in Table 3.1.
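As a cross-check of the numbers above, the following short sketch recomputes the overall entropy, the gain for Income, its split info, and its gain ratio from the class counts reported in Table 3.1 below (726/202 overall; 147/154, 334/20, and 245/28 in the three Income bands). This is only a verification aid, not part of the original study.

from math import log2

def entropy(counts):
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c)

overall = entropy([726, 202])                              # overall entropy, ~0.7559
income_bands = {"2-3 Jt": [147, 154], "3-4 Jt": [334, 20], "5-9 Jt": [245, 28]}
n = 928
weighted = sum(sum(c) / n * entropy(c) for c in income_bands.values())
gain = overall - weighted                                  # gain for Income, ~0.1718
split_info = entropy([sum(c) for c in income_bands.values()])  # ~1.5765
gain_ratio = gain / split_info                             # ~0.1090
print(round(overall, 4), round(gain, 4), round(split_info, 4), round(gain_ratio, 4))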
Table 3.1 Results of the entropy and gain calculations for determining the root node

Node / Value               Cases  Current  Troubled  Entropy   Gain       Info (weighted entropy)  Split Info   Gain Ratio
Total                      928    726      202       0.7559
Income                     928    726      202       0.7559    0.1717786  0.584116                 1.5765429    0.1089591
  2-3 Jt                   301    147      154       0.9996
  3-4 Jt                   354    334      20        0.3134
  5-9 Jt                   273    245      28        0.4771
Family Liability           928    726      202       0.7559    0.1685537  0.587341                 1.7892474    0.0942037
  Many                     435    415      20        0.2691
  Moderately               157    100      57        0.9452
  Slightly                 240    123      117       0.9995
  Empty                    96     88       8         0.4138
Business Activity          928    726      202       0.7559    0.001519   0.754376                 1.5745435    0.0009647
  Shop                     361    274      87        0.7967
  Services                 273    219      54        0.7175
  Workshop                 294    233      61        0.7366
Status of Business Place   928    726      202       0.7559    0.0004024  0.755492                 0.9991421    0.0004027
  Renting                  448    355      93        0.7368
  Owned Alone              480    371      109       0.7729
Loan Ceiling               928    726      202       0.7559    0.0006239  0.755271                 1.5052644    0.0004145
  1-5 Jt                   231    184      47        0.7288
  6-10 Jt                  234    185      49        0.7403
  11-15 Jt                 233    181      52        0.7659
  16-30 Jt                 230    176      54        0.7863
Period                     928    726      202       0.7559    0.0003092  0.755585                 0.9862325    0.0003135
  1-6 Bln                  528    417      111       0.7419
  7-12 Bln                 400    309      91        0.7736
Interest Rate              928    726      202       0.7559    5.93E-05   0.755835                 0.999246     5.93E-05
  4%                       479    373      106       0.7625
  3%                       449    353      96        0.7487
Status of Residence        928    726      202       0.7559    0.0007565  0.755138                 1.5804253    0.0004787
  Own Home                 344    270      74        0.7511
  Rental Home              288    220      68        0.7885
  Ride                     296    236      60        0.7273
Goods Guarantee            928    726      202       0.7559    0.0003848  0.75551                  0.9675783    0.0003977
  Motorcycle reg           562    444      118       0.7414
  Car reg                  366    282      84        0.7772

From the entropy and gain calculations in Table 3.1, the Income attribute has the highest gain ratio, 0.1089591. Therefore, Income is the root node of the decision tree. To determine the next node (node 1.1), the entropy and gain calculations are repeated within the branches of the root attribute (Income); the number of cases in each calculation is the number of cases selected by the corresponding value of the root node (Income). The tree resulting from this calculation is shown in Figure 3.1.
Figure 3.1 Tree root node
Figure 3.1 shows the decision tree generated from the calculation of the gain ratio over all attributes. Based on the gain ratio results in Table 3.1, Income has the highest gain ratio and has three branches according to its values, namely 2-3 Jt, 3-4 Jt, and 5-9 Jt. The next step is to calculate the gain ratio under the existing root in order to determine node 1.1 on the Income = 2-3 Jt branch, as shown in Table 3.2.
The decision tree obtained when the data are processed with RapidMiner can be seen in Figure 3.2.
Figure 3.2 Decision Tree Using the C4.5 Algorithm
Based on the above decision tree, the following classification rules can be established (a short sketch of applying such rules in code follows the list):
1. R1: IF income = 2-3 Jt AND Liability
Family=Many AND Status of Business Premises
= renting AND Period of time =1-6 Bln THEN
Current
2. R2: IF income = 2-3 Jt AND Liability
Family=Many AND Status of Business Premises
= renting AND Period of time =7-12 Bln AND
Goods guarantee = Car registration THEN Current
3. R3: IF income = 2-3 Jt AND Liability
Family=Many AND Status of Business Premises
= renting AND Period of time =7-12 Bln AND
Goods guarantee = Motorcycle registration THEN
Troubled
4. R4: IF income = 2-3 Jt AND Liability
Family=Many AND Status of Business Premises
= one’s own THEN Current
5. R5: IF income = 2-3 Jt AND Liability
Family=Slightly AND Loan Ceiling = 1-5 Jt AND
Business Activities = Machine Shop THEN
Current
6. R6: IF income = 2-3 Jt AND Liability
Family=Slightly AND Loan Ceiling = 1-5 Jt
AND Business Activities = Services THEN
Troubled
7. R7: IF income = 2-3 Jt AND Liability
Family=Slightly AND Loan Ceiling = 1-5 Jt
AND Business Activities = Store THEN Troubled
8. R8: IF income = 2-3 Jt AND Liability
Family=Slightly AND Loan Ceiling = 11-15 Jt
AND Business Activities = Machine Shop THEN
Troubled
9. R9: IF income = 2-3 Jt AND Liability
Family=Slightly AND Loan Ceiling = 11-15 Jt
AND Business Activities = Services THEN
Current
10. R10: IF income = 2-3 Jt AND Liability
Family=Slightly AND Loan Ceiling = 11-15 Jt
AND Business Activities = Store THEN Troubled
11. R11: IF income = 3-4 Jt AND Liability
Family=Many THEN Current
12. R12:IF income = 3-4 Jt AND Liability
Family=Empty AND Loan Ceiling = 1-5 Jt
THEN Current
13. R13:IF income = 3-4 Jt AND Liability
Family=Empty AND Loan Ceiling = 11-15 Jt
THEN Troubled
14. R14:IF income = 3-4 Jt AND Liability
Family=Empty AND Loan Ceiling = 16-30 Jt
THEN Current
15. R15:IF income = 3-4 Jt AND Liability
Family=Empty AND Loan Ceiling = 6-10 Jt
THEN Current
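As referenced above, a few of these rules can be encoded directly as conditional logic. The sketch below covers only R1, R4, and R11, uses hypothetical field names, and returns "unknown" for records not covered by these three rules; it illustrates how the extracted rule set could be applied and is not the deployed application.

def classify(rec):
    """Apply a small subset of the extracted rules (R1, R4, R11) to a loan record."""
    if (rec["income"] == "2-3 Jt" and rec["family_liability"] == "Many"
            and rec["business_premises"] == "Renting" and rec["period"] == "1-6 Bln"):
        return "Current"   # R1
    if (rec["income"] == "2-3 Jt" and rec["family_liability"] == "Many"
            and rec["business_premises"] == "Owned Alone"):
        return "Current"   # R4
    if rec["income"] == "3-4 Jt" and rec["family_liability"] == "Many":
        return "Current"   # R11
    return "unknown"       # rules R2-R3, R5-R10, and R12-R15 are omitted in this sketch

print(classify({"income": "3-4 Jt", "family_liability": "Many",
                "business_premises": "Renting", "period": "1-6 Bln"}))  # -> Current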
3.2. Evaluation and Validation of the Model
After the model testing has been carried out, the accuracy of the model is evaluated using the confusion matrix and the ROC curve / AUC (Area Under the Curve).
1. Confusion Matrix
Table 3.2 shows the accuracy calculated on the training data using the C4.5 algorithm. From the 928 training records and 10 attributes (Income, Business Activity, Status of Business Place, Loan Ceiling, Term, Interest Rate, Status of Dwelling, Vehicle Owned, Printed Collateral, Loan Quality), the C4.5 algorithm obtained 46 records correctly predicted as Troubled, 15 Troubled records predicted as Current, 30 Current records predicted as Troubled, and 238 records correctly predicted as Current.
Table 3.2 Confusion Matrix (accuracy) for the training data
2. Evaluation of the ROC Curve
The ROC curve has levels of diagnostic value, namely [6]:
Accuracy of 0.90 - 1.00 = excellent classification
Accuracy of 0.80 - 0.90 = good classification
Accuracy of 0.70 - 0.80 = fair classification
Accuracy of 0.60 - 0.70 = poor classification
Accuracy of 0.50 - 0.60 = failure
The ROC result of processing the training data with the C4.5 algorithm, 0.842, can be seen in Table 3.3 and Table 3.5, corresponding to a good level of diagnostic classification. After testing, the measurements obtained on the training data are accuracy = 86.52%, precision = 89.17%, recall = 94.20%, and Area Under Curve = 0.844. Figure 3.4 shows the corresponding chart, with an ROC AUC (Area Under Curve) value of 0.844, which for diagnostic purposes places the result in the good classification category.
a. The equations that can be used to calculate precision and accuracy are [7]:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Or, using the accuracy equations [6]:
Precision (Current) = 46 / (46 + 0) = 1 = 100%
Precision (Troubled) = 30 / (30 + 243) = 0.8981 = 89.81%
Recall (Current) = 46 / (46 + 30) = 0.60526 = 60.53%
Recall (Troubled) = 243 / (30 + 243) = 0.8981 = 89.81%
Accuracy = (46 + 243) / (46 + 15 + 30 + 243) = 0.8652 = 86.52%
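The accuracy figure can be checked with a short script using the confusion-matrix counts quoted above (46, 15, 30, 243). Precision and recall depend on which class is treated as positive, so only accuracy is annotated here; this is a verification sketch, not the authors' code.

def metrics(tp, fn, fp, tn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN); precision and recall for the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Treating "Troubled" as the positive class, with the training counts quoted above:
# 46 Troubled predicted Troubled, 15 Troubled predicted Current,
# 30 Current predicted Troubled, 243 Current predicted Current.
acc, prec, rec = metrics(tp=46, fn=15, fp=30, tn=243)
print(round(acc, 4))  # ~0.865, matching the 86.52% accuracy reported in the text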
Figure 3.3 ROC curve for the Decision Tree method
Table 3.3 Confusion Matrix for the testing data (accuracy: 86.13%)

                  Pred. Troubled   Pred. Current   Class recall
True: Current           27               48            64.00%
True: Troubled         238               19            92.61%
Class precision      89.81%           71.64%

Table 3.3 shows the accuracy calculation when testing the data with the C4.5 algorithm. From the 928 training records, the C4.5 algorithm obtained 46 records correctly predicted as Troubled, 15 Troubled records predicted as Current, 30 Current records predicted as Troubled, and 243 records correctly predicted as Current.

IV. CONCLUSIONS
From the research conducted, it can be concluded that the C4.5 algorithm can be used as an analysis tool by credit analysts. This is reinforced by the evaluation results, which show that the C4.5 algorithm can distinguish troubled from non-troubled borrowers with an accuracy of 86.52%. The decision tree patterns formed can be applied in an application to make it easier to detect problematic borrower behavior. By applying the C4.5 decision tree algorithm, the accuracy of the analysis of borrower behavior is expected to improve. Although the C4.5 algorithm model has been implemented and runs well in the system, there are still some things that should be added to improve it.
REFERENCES
[1] Alpaydin, Ethem. (2010). Introduction to Machine Learning. London: The MIT Press.
[2] Anwar, Syaiful. (2012). Penerapan Data Mining untuk Memprediksi Perilaku Nasabah Kredit: Studi Kasus BPR Marcorindo Perdana Ciputat. Tesis, Magister Ilmu Komputer, STMIK Nusa Mandiri, Jakarta.
[3] Bramer, Max. (2007). Principles of Data Mining. London: Springer-Verlag London Limited.
[4] Kothari, C.R. (2004). Research Methodology: Methods and Techniques. India: New Age International Limited.
[5] Firmansyah. (2011). Penerapan Algoritma Klasifikasi C4.5 untuk Penentuan Kelayakan Pemberian Kredit Koperasi. Tesis, Magister Ilmu Komputer, STMIK Nusa Mandiri, Jakarta.
[6] Gorunescu, Florin. (2011). Data Mining: Concepts, Models, and Techniques. Berlin Heidelberg: Springer-Verlag.
[7] Han, J., & Kamber, M. (2006). Data Mining: Concepts and Techniques. San Francisco: Morgan Kaufmann.
[8] Jiang, Y. (2009). Credit Scoring Model Based on Decision Tree and the Simulated Annealing Algorithm. 2009 World Congress on Computer Science and Information Engineering (pp. 18-22). Los Angeles: IEEE Computer Society.
[9] Kasmir. (2011). Analisis Laporan Keuangan.
Jakarta : PT. Rajagrafindo Persada
[10] Kusrini, & Luthfi, E. T. (2009). Algoritma Data
Mining. Yogyakarta: Andi Publishing.
[11] Kotsiantis, S., Kanellopoulos, D., Karioti, V., & Tampakas, V. (2009). An ontology-based portal for credit risk analysis. 2009 2nd IEEE International Conference on Computer Science and Information Technology (pp. 165-169). Beijing.
[12] Leidiyana, Heny. (2011). Komparasi Algoritma Klasifikasi Data Mining dalam Penentuan Resiko Kredit Kepemilikan Kendaraan Bermotor. Tesis, Magister Ilmu Komputer, STMIK Nusa Mandiri, Jakarta.
[13] Lai, K. K., Yu, L., Zhou, L., & Wang, S. (2006). Credit Risk Evaluation with Least Square Support Vector Machine. Springer-Verlag, 490-495.
[14] Larose, D. T. (2005). Discovering Knowledge in Data. New Jersey: John Wiley & Sons, Inc.
[15] Liao. (2007). Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications. Singapore: World Scientific Publishing.
[16] Maimon, Oded & Rokach, Lior. (2005). Data Mining and Knowledge Discovery Handbook. New York: Springer.
[17] Odeh, O. O., Featherstone, A. M., & Das, S.
(2010). Predicting Credit Default: Comparative
Results from an Artificial Neural Network,
Logistic Regression and Adaptive Neuro-Fuzzy
Inference System. EuroJournals Publishing, Inc.
2010 , 7-17.
[18]
Profil Koperasi Serba Usaha “Ceger Jaya”
Kelurahan Ceger-Cipayung Jakarta Timur
[19] Sumathi, S., & Sivanandam, S.N. (2006). Introduction to Data Mining and its Applications. Berlin Heidelberg New York: Springer.
[20] Sholichah, Alfiyatus. (2009). Data Mining untuk Pembiayaan Murabahah Menggunakan Association Rule (Studi Kasus BMT MMU Sidogiri). Skripsi, Universitas Islam Negeri Maulana Malik Ibrahim, Malang.
[21] Sogala, Satchidananda S. (2006). Comparing the Efficacy of the Decision Trees with Logistic Regression for Credit Risk Analysis. India.
[22] Vercellis, Carlo. (2009). Business Intelligence: Data Mining and Optimization for Decision Making. Southern Gate, Chichester, West Sussex: John Wiley & Sons, Ltd.
[23] Zemke, Stefan. (2003). Data Mining for Prediction: Financial Series Case. Doctoral Thesis, The Royal Institute of Technology, Department of Computer and Systems Sciences, Sweden.
[24] Zurada, J. (2010). Could Decision Trees Improve the Classification Accuracy and Interpretability of Loan Granting Decisions? HICSS '10 Proceedings of the 2010 43rd Hawaii International Conference on System Sciences (p. 19). Koloa.
Suryanto is a lecturer in Computer Science at AMIK BSI. He received a Master's degree (M.Kom) in Computer Science from the Information Systems program of STMIK Nusa Mandiri in 2010, specializing in Management Information Systems. His research interests are in management information systems. He is a researcher who won the DIKTI novice faculty research grant for the 2013-2014 period.

Nandang Iriadi is a lecturer in Computer Science at AMIK BSI. He received a Master's degree (M.Kom) in Computer Science from STMIK Nusa Mandiri in 2010, specializing in Management Information Systems. His research interests are in management information systems. He is a researcher who won the DIKTI novice faculty research grant for the 2013-2014 period.