International Seminar on Scientific Issues and Trends (ISSIT) 2014

CLASSIFICATION ALGORITHM IMPLEMENTATION OF DATA MINING IN THE DETERMINATION OF COOPERATIVE LOAN GRANTING

Suryanto 1), Nandang Iriadi 2)
1) Informatics Management Program, Akademi Manajemen Informatika dan Komputer Bina Sarana Informatika, Jl. RS Fatmawati No. 24, Pondok Labu 12450, [email protected]
2) Computer Engineering Program, Akademi Manajemen Informatika dan Komputer Bina Sarana Informatika, Jl. RS Fatmawati No. 24, Pondok Labu 12450, [email protected]

Abstract- Cooperatives are an important form of organization for promoting economic growth. Credit unions are an alternative source of funding for people seeking to improve their quality of life, meet daily needs, and develop a business. Without doubt, lending funds to customers raises problems: customers pay their installments late, misuse funds for other purposes, or fail to expand their business, so that cooperative funds stop circulating and loans turn bad. The aim of this research is to build an algorithmic model that can predict the behavior of problem borrowers. Data mining is one method for analyzing large volumes of existing data and summarizing them into specific information. This study applies data mining classification with a C4.5 decision tree, expressed as a set of rules. The decision tree model improves accuracy in analyzing the creditworthiness of loan applicants; the richer the information or knowledge contained in the training data, the higher the accuracy of the decision tree. The results obtained are an accuracy of 86.13%, a precision of 89.94%, a recall of 92.61%, and an AUC of 0.842, placing the predictions in the good to excellent classification categories.

Keywords: behavior of borrowers, Data Mining, C4.5 Algorithm
I. INTRODUCTION

Cooperatives that provide fresh loans to prospective members and to existing members are emerging everywhere. All facilities and conveniences for obtaining a loan are designed around requirements set by the cooperative itself. A cooperative is an institution that lends to borrowers who need funds quickly for their members' purposes. In general, cooperative loans are allocated to purposes such as venture capital, education, and home renovation. Undeniably, lending funds to members raises problems: members pay their installments late, misuse funds for other purposes, or fail to expand their business, so that cooperative funds stop circulating and loans turn bad. This study was therefore conducted to help resolve those problems by designing a data mining application that predicts the behavior of prospective borrowers of the cooperative. Applying the C4.5 decision tree algorithm is expected to improve the accuracy of predicting the behavior of members who may borrow from the Ceger Jaya Multipurpose Cooperative, located in Ceger, Cipayung, East Jakarta, which was chosen by the researchers as the case study. In 2010 the cooperative received 762 loan applications, of which 300 became problem customers; in 2011-2012 it received 928 applications, of which 202 became problem customers. The data the researchers were permitted to examine, and that are used to predict customer behavior, are the 2011-2012 credit data. Predicting this behavior requires a method or technique that can process the data the cooperative already holds; data mining is one such method, and the data mining functions described below are used in this research.
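The figures just quoted fix the class balance of the data set. A quick calculation (a minimal Python sketch, using only the counts reported above) makes the baseline explicit:

```python
# Class balance of the 2011-2012 KSU Ceger Jaya loan data,
# using only the counts reported in the text above.
total_members = 928      # loan applications in 2011-2012
troubled = 202           # problem customers among them
current = total_members - troubled

troubled_rate = troubled / total_members
print(f"Current: {current}, Troubled: {troubled}")
print(f"Troubled rate: {troubled_rate:.2%}")   # about 21.77%
```

Note that a trivial model that always predicts "Current" would already be right for roughly 78% of these records, so any classifier built on this data should be judged against that majority-class baseline.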
Data mining techniques are applied in the application that was built; the classification method used is the decision tree, and the tree-building algorithm is C4.5. The data processed in this study are the installment records of the members of the Ceger Jaya Multipurpose Cooperative for 2011-2012, in Microsoft Excel format. The result is an application that helps the Multipurpose Cooperative (KSU) "Ceger Jaya" reach its credit marketing targets in the future: with data mining, KSU Ceger Jaya can get an overview of its lending business. Predicting the behavior of credit customers requires a method or technique that can process the data the institution already holds, and data mining is one such method. Data mining comprises six functions (tasks), among them:
1. Description: summarizing a data set briefly.
2. Estimation: estimating a value from existing case data.
3. Prediction: estimating results that are not yet known.
4. Classification: assigning data to classes; algorithms include the mean vector algorithm, the K-nearest neighbor algorithm, ID3, C4.5, and C5.0.

II. THEORY

2.1. Overview of Related Studies
Several studies have used the C4.5 decision tree as an algorithmic model for prediction from historical data.
1. Credit Scoring Model Based on Decision Tree and the Simulated Annealing Algorithm [8]. Jiang built a model to predict which customers are problematic in loan payments, combining a C4.5 decision tree with the simulated annealing algorithm. The data were taken from a German credit finance company; Jiang selected a few attributes and fed them into the model to predict the percentage of problem customers.
2. Could Decision Trees Improve the Classification Accuracy and Interpretability of Loan Granting Decisions? [24]. Zurada compared several algorithms: linear regression, neural networks, support vector machines, case-based reasoning, rule-based fuzzy neural networks, and decision trees.

2.2. Credit
Credit, as defined in Article 1 number 11 of Act No. 10 of 1998 (amending Law No. 7 of 1992 on Banking), is the provision of cash or its equivalent, based on a lending agreement between a bank and another party, that obliges the borrower to repay the debt within a certain time, with interest. In distributing funds, the bank or lender imposes specific requirements that must be met, namely [9]:
1. Type of credit needed
2. Desired amount
3. Loan period
4. Method of loan repayment
5. Collateral held
6. Financial statements for recent periods
7. Feasibility
8. Other requirements

In general, credit can be understood in two senses [9]:
1. Credit as the provision or distribution of money
2. Credit in the form of goods or services

Creditworthiness can also be measured with the 7P principles [9]:
1. Personality: an assessment of the prospective customer's character.
2. Purpose: the purpose of taking the credit (productive enterprise, personal use, trade).
3. Party: the lender sorts borrowers into categories (small, medium, or large business loans).
4. Payment: how the customer will repay the loan (from income or from the financed object).
5. Prospect: an assessment of future expectations, especially for the object financed by the loan.
6. Profitability: the credit should benefit both parties, the bank and the customer.
7. Protection: protection for the credit-financed object.
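The 7P assessment is qualitative, but for illustration it can be encoded as a simple pre-screening checklist. This is only a sketch: the field names and pass criteria below are hypothetical, not taken from this paper or from [9].

```python
# Hypothetical 7P pre-screening checklist (illustrative only;
# field names and criteria are assumptions, not from the paper).
def seven_p_screening(applicant: dict) -> tuple[dict, bool]:
    checks = {
        "personality":   applicant.get("character_ok", False),
        "purpose":       applicant.get("purpose") in {"productive", "personal", "trade"},
        "party":         applicant.get("segment") in {"small", "medium", "large"},
        "payment":       applicant.get("repayment_source") in {"income", "financed_object"},
        "prospect":      applicant.get("outlook_ok", False),
        "profitability": applicant.get("benefits_both_parties", False),
        "protection":    applicant.get("collateral_ok", False),
    }
    return checks, all(checks.values())

applicant = {
    "character_ok": True, "purpose": "productive", "segment": "small",
    "repayment_source": "income", "outlook_ok": True,
    "benefits_both_parties": True, "collateral_ok": True,
}
checks, passed = seven_p_screening(applicant)
print(passed)  # True: every criterion is satisfied
```

A real analyst would of course weigh these criteria rather than require all of them; the point is only that the 7P framework is itself a rule-based check, which is why rule-producing classifiers such as C4.5 fit this domain naturally.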
Credit analysis with the C4.5 algorithm is expected to minimize the admission of problem debtors: the more troubled borrowers enter the portfolio, the higher the level of bad loans, which can ultimately lead to bankruptcy. Debtor data are accompanied by attributes such as age, credit amount, checking account, guarantor, loan term, length of employment, bank account number, employment status, credit history, home-ownership status, savings, marital status, reason for the loan, and others [24].

2.3. Data Mining
Data mining is an iterative process aimed at analyzing databases, with the goal of extracting accurate, potentially useful information and knowledge that helps professionals in decision making and problem solving [22]. Data mining activity can be divided into two parts according to the main objective of the analysis [22]:
1. Interpretation. The purpose of interpretation is to identify regular patterns in the data and to express them through rules and criteria that can be easily understood by experts in the application domain.
2. Prediction. The goal of prediction is to anticipate the value a variable will assume in the future or to estimate the probability of future events.

Data mining is the core of the process of Knowledge Discovery in Databases (KDD) [16]. KDD is an organized process for identifying valid, new, useful, and understandable patterns in large and complex data sets. It consists of nine steps, as shown in Figure 2.1 and briefly explained below [16]:
1. Understanding the application domain. At this stage the goals of the end user and the setting in which KDD will be carried out are determined.
2. Selecting and creating the data set on which knowledge discovery will be performed. The data to be used for the KDD process are determined at this stage.
3. Preprocessing and cleansing. In this phase data reliability is improved: the data are cleaned, e.g. by handling incomplete records and removing noise or outliers.
4. Data transformation. The data are prepared using dimension-reduction methods and attribute transformation.
5. Choosing the appropriate data mining task. The type of data mining to be used is specified: classification, regression, or clustering, depending on the goal of the KDD process and the previous stages.
6. Choosing the data mining algorithm. The most appropriate algorithm for finding patterns is selected.
7. Applying the data mining algorithm. The algorithm determined in the previous stage is executed.
8. Evaluation. The patterns obtained are evaluated and interpreted.
9. Using the discovered knowledge. The knowledge is incorporated into other systems, the system is activated, and the results are measured.

1. Classification
Classification is one of the most common applications of data mining [7].

2. The C4.5 Algorithm
The C4.5 algorithm is an enhancement of the ID3 algorithm. C4.5 constructs a decision tree from a set of training data in the form of cases or records (tuples) in a database, where each record has discrete or continuous attributes [14]. A decision tree is a tree-like structure in which each internal (non-leaf) node represents an attribute, each branch represents an outcome of the attribute test, and each leaf represents a class. C4.5 uses the concept of information gain, or entropy reduction, to select the optimal split [7].

There are several stages in building a decision tree with the C4.5 algorithm [10]:
a. Prepare the training data.
Training data are usually taken from historical data that have already been grouped into certain classes.
b. Determine the root of the tree. The root is the attribute with the highest gain value; before the gain of an attribute can be calculated, the entropy must be calculated first, using the formula:

Entropy(S) = sum over i = 1..n of -pi * log2(pi)

where:
S = the set of cases
n = the number of partitions of S
pi = the proportion of Si to S

c. Calculate the gain value using the formula:

Gain(S, A) = Entropy(S) - sum over i = 1..n of (|Si| / |S|) * Entropy(Si)

where:
S = the set of cases
A = an attribute
n = the number of partitions of attribute A
|Si| = the number of cases in partition i
|S| = the number of cases in S

d. Repeat steps b-c for each partition. The partitioning process stops when:
- all records in a node belong to the same class;
- no attributes remain on which to partition the records; or
- a branch contains no records.

In outline, the C4.5 algorithm builds a decision tree as follows [10]:
a. Select an attribute as the root.
b. Create a branch for each value of that attribute.
c. Divide the cases among the branches.
d. Repeat the process for each branch until all cases in the branch belong to one class.

3. Rule-Based Classification
A rule-based algorithm represents pieces of data or knowledge as rules [7]. Rule-based logic is usually written in IF-THEN form:

IF condition THEN conclusion

An example of a rule is:

IF age = youth AND student = yes THEN buys_computer = yes

The IF part of the rule is known as the rule antecedent or precondition, while the THEN part is referred to as the rule consequent. The antecedent usually contains one or more attributes (e.g. the age and student attributes), joined with a logical AND when more than one attribute is used. The rule consequent is the class prediction; in the example above, the prediction is that the customer buys a computer (buys_computer = yes) [7].

4. Evaluation and Validation Methods
a. Confusion matrix. A confusion matrix is a visualization tool commonly used in supervised learning. Each column of the matrix represents a predicted class, while each row represents an actual class (Gorunescu, 2010).
b. ROC curve. The ROC curve shows classification accuracy and allows visual comparison; it expresses the confusion matrix and is another way to test classification performance [6]. The accuracy, measured as the area under the curve (AUC), can be classified into groups [6]:
0.90 - 1.00 = excellent classification
0.80 - 0.90 = good classification
0.70 - 0.80 = fair classification
0.60 - 0.70 = poor classification

III. RESULTS AND DISCUSSION

3.1. Measurement
A. Results of the Research
The purpose of this study is to test the accuracy of credit analysis using the C4.5 algorithm. The data analyzed are loan records, all of which had been approved by the Multipurpose Cooperative (KSU) "Ceger Jaya".

a. The steps of the C4.5 algorithm are applied to the training data, which consist of 928 records: first, prepare the training data. The training data used in this study amount to 928 records.
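Before walking through the numbers, the entropy, gain, and gain-ratio formulas above can be written directly in code. This is a small Python sketch (not from the paper); the class counts are the ones this study reports for the Income attribute, so the outputs can be checked against the worked example and Table 3.1.

```python
import math

def entropy(counts):
    """Entropy(S) = sum over classes of -p_i * log2(p_i)."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def info_gain(parent, partitions):
    """Gain(S, A) = Entropy(S) - sum of |S_i|/|S| * Entropy(S_i)."""
    total = sum(parent)
    remainder = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(parent) - remainder

def gain_ratio(parent, partitions):
    """C4.5 normalizes gain by the split information of the partition sizes."""
    total = sum(parent)
    split = -sum(sum(p) / total * math.log2(sum(p) / total) for p in partitions)
    return info_gain(parent, partitions) / split

# Counts from the paper: 726 Current / 202 Troubled overall, and the
# Income partitions (Current, Troubled) for 2-3 Jt, 3-4 Jt, 5-9 Jt.
overall = (726, 202)
income = [(147, 154), (334, 20), (245, 28)]

print(round(entropy(overall), 4))             # 0.7559, as in step b
print(round(info_gain(overall, income), 4))   # 0.1718, the highest gain
print(round(gain_ratio(overall, income), 4))  # 0.109, matching Table 3.1
```

Running the same three functions over every attribute's partition counts and picking the largest gain ratio reproduces the root-node selection described below.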
b. Calculate the entropy value. Applying the entropy formula to the 928 training records (726 Current, 202 Troubled):

Entropy(S) = -(726/928) * log2(726/928) - (202/928) * log2(202/928) = 0.7559

c. Calculate the gain value for each attribute and select the highest; the attribute with the highest gain becomes the root of the tree. For example, for the Income attribute:

Gain(S, Income) = 0.7559 - ((301/928 * 0.9996) + (354/928 * 0.3134) + (273/928 * 0.4771)) = 0.1718

From the calculation of entropy and gain it appears that the Income attribute has the highest gain value, 0.1718. The full results are shown in Table 3.1.

Table 3.1. Entropy and gain values used to determine the root node

Node                      Data  Current  Troubled  Entropy  Gain    Split Info  Gain Ratio
Total                      928    726      202     0.7559
Income                     928    726      202     0.7559   0.1718  1.5765429   0.1089591
  2-3 Jt                   301    147      154     0.9996
  3-4 Jt                   354    334       20     0.3134
  5-9 Jt                   273    245       28     0.4771
Dependent Family           928    726      202     0.7559   0.1686  1.7892474   0.0942037
  Many                     435    415       20     0.2691
  Moderately               157    100       57     0.9452
  Slightly                 240    123      117     0.9995
  Empty                     96     88        8     0.4138
Business Activity          928    726      202     0.7559   0.0015  1.5745435   0.0009647
  Shop                     361    274       87     0.7967
  Services                 273    219       54     0.7175
  Workshop                 294    233       61     0.7366
Status of Business Place   928    726      202     0.7559   0.0004  0.9991421   0.0004027
  Renting                  448    355       93     0.7368
  Owned Alone              480    371      109     0.7729
Loan Ceiling               928    726      202     0.7559   0.0006  1.5052644   0.0004145
  1-5 Jt                   231    184       47     0.7288
  6-10 Jt                  234    185       49     0.7403
  11-15 Jt                 233    181       52     0.7659
  16-30 Jt                 230    176       54     0.7863
Period                     928    726      202     0.7559   0.0003  0.9862325   0.0003135
  1-6 Bln                  528    417      111     0.7419
  7-12 Bln                 400    309       91     0.7736
Interest Rate              928    726      202     0.7559   0.0001  0.9992460   0.0000593
  4%                       479    373      106     0.7625
  3%                       449    353       96     0.7487
Status of Residence        928    726      202     0.7559   0.0008  1.5804253   0.0004787
  Own Home                 344    270       74     0.7511
  Rental Home              288    220       68     0.7885
  Staying with Others      296    236       60     0.7273
Goods Guarantee            928    726      202     0.7559   0.0004  0.9675783   0.0003977
  Motorcycle reg.          562    444      118     0.7414
  Car reg.                 366    282       84     0.7772

From Table 3.1 it appears that Income has the highest gain ratio, 0.1089591; therefore Income is the root node of the decision tree. To determine the next node (node 1.1), the entropy and gain calculations are repeated on the subset of cases selected by the root attribute (Income). The resulting root of the tree is shown in Figure 3.1.

Figure 3.1. Root node of the decision tree

In Figure 3.1 the decision tree is generated from the overall gain-ratio calculation for each attribute. Based on the gain ratios in Table 3.1, Income has the highest value and three branches according to its values: 2-3 Jt, 3-4 Jt, and 5-9 Jt. The next step is to compute the gain ratio below the root to determine node 1.1 for the branch Income = 2-3 Jt; see Table 3.2. The decision tree produced by processing the data in RapidMiner can be seen in Figure 3.2.

Figure 3.2. Decision tree produced by the C4.5 algorithm

Based on the decision tree, the following classification rules can be established:
1. R1: IF Income = 2-3 Jt AND Dependent Family = Many AND Status of Business Place = Renting AND Period = 1-6 Bln THEN Current
2. R2: IF Income = 2-3 Jt AND Dependent Family = Many AND Status of Business Place = Renting AND Period = 7-12 Bln AND Goods Guarantee = Car reg. THEN Current
3. R3: IF Income = 2-3 Jt AND Dependent Family = Many AND Status of Business Place = Renting AND Period = 7-12 Bln AND Goods Guarantee = Motorcycle reg. THEN Troubled
4. R4: IF Income = 2-3 Jt AND Dependent Family = Many AND Status of Business Place = Owned Alone THEN Current
5. R5: IF Income = 2-3 Jt AND Dependent Family = Slightly AND Loan Ceiling = 1-5 Jt AND Business Activity = Workshop THEN Current
6. R6: IF Income = 2-3 Jt AND Dependent Family = Slightly AND Loan Ceiling = 1-5 Jt AND Business Activity = Services THEN Troubled
7. R7: IF Income = 2-3 Jt AND Dependent Family = Slightly AND Loan Ceiling = 1-5 Jt AND Business Activity = Shop THEN Troubled
8. R8: IF Income = 2-3 Jt AND Dependent Family = Slightly AND Loan Ceiling = 11-15 Jt AND Business Activity = Workshop THEN Troubled
9. R9: IF Income = 2-3 Jt AND Dependent Family = Slightly AND Loan Ceiling = 11-15 Jt AND Business Activity = Services THEN Current
10. R10: IF Income = 2-3 Jt AND Dependent Family = Slightly AND Loan Ceiling = 11-15 Jt AND Business Activity = Shop THEN Troubled
11. R11: IF Income = 3-4 Jt AND Dependent Family = Many THEN Current
12. R12: IF Income = 3-4 Jt AND Dependent Family = Empty AND Loan Ceiling = 1-5 Jt THEN Current
13. R13: IF Income = 3-4 Jt AND Dependent Family = Empty AND Loan Ceiling = 11-15 Jt THEN Troubled
14. R14: IF Income = 3-4 Jt AND Dependent Family = Empty AND Loan Ceiling = 16-30 Jt THEN Current
15. R15: IF Income = 3-4 Jt AND Dependent Family = Empty AND Loan Ceiling = 6-10 Jt THEN Current
3.2. Evaluation and Validation of the Model
The accuracy of the model was tested using the confusion matrix and the ROC curve / AUC (Area Under Curve).

1. Confusion matrix
Table 3.2 shows the accuracy calculation for the training data using the C4.5 algorithm. From the 928 training records with 10 attributes (Income, Business Activity, Status of Business Place, Loan Ceiling, Period, Interest Rate, Status of Residence, Vehicle Owned, Collateral, Loan Quality), the C4.5 method yields 46 records correctly predicted as Troubled, 15 Troubled records predicted as Current, 30 Current records predicted as Troubled, and 243 records correctly predicted as Current.

Table 3.2. Confusion matrix (accuracy) for the training data

                Pred. Troubled   Pred. Current
True: Troubled        46               15
True: Current         30              243

2. Evaluation with the ROC curve
The ROC value has diagnostic levels, namely [6]:
0.90 - 1.00 = excellent classification
0.80 - 0.90 = good classification
0.70 - 0.80 = fair classification
0.60 - 0.70 = poor classification
0.50 - 0.60 = failure

After testing, the measurements obtained on the training data are: accuracy = 86.52%, precision = 89.17%, recall = 94.20%, and Area Under Curve = 0.844. Figure 3.4 shows the resulting ROC chart; with an AUC of 0.844, the diagnostic result is a good classification.
a. The equations used to calculate accuracy, precision, and recall are [7]:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Applying them to the training-data confusion matrix:

Precision (Troubled) = 46 / (46 + 30) = 0.6053 = 60.53%
Recall (Troubled) = 46 / (46 + 15) = 0.7541 = 75.41%
Precision (Current) = 243 / (243 + 15) = 0.9419 = 94.19%
Recall (Current) = 243 / (243 + 30) = 0.8901 = 89.01%
Accuracy = (46 + 243) / (46 + 15 + 30 + 243) = 0.8652 = 86.52%

Figure 3.3. ROC curve for the decision tree method

Table 3.3 shows the confusion matrix calculated on the testing data with the C4.5 algorithm: 48 records correctly predicted as Troubled, 27 Troubled records predicted as Current, 19 Current records predicted as Troubled, and 238 records correctly predicted as Current, giving an accuracy of 86.13%.

Table 3.3. Confusion matrix for the testing data (accuracy: 86.13%)

                 Pred. Troubled   Pred. Current   Class recall
True: Troubled        48               27            64.00%
True: Current         19              238            92.61%
Class precision    71.64%           89.81%

IV. CONCLUSIONS
From the research conducted it can be concluded that the C4.5 algorithm can be used as a tool by credit analysts. This is reinforced by the evaluation results: the C4.5 algorithm separates troubled from non-troubled borrowers with an accuracy of 86.52%. The decision tree pattern that was formed can be applied in an application to make it easier to detect problematic borrower behavior. Although the C4.5 model has been implemented and runs well in the system, some things should still be added to improve it.

REFERENCES
[1] Alpaydin, Ethem (2010). Introduction to Machine Learning. London: The MIT Press.
[2] Anwar, Syaiful (2012). Penerapan Data Mining untuk Memprediksi Perilaku Nasabah Kredit: Studi Kasus BPR Marcorindo Perdana Ciputat. Tesis, Magister Ilmu Komputer, STMIK Nusa Mandiri, Jakarta.
[3] Bramer, Mark (2007). Principles of Data Mining. London: Springer-Verlag.
[4] Kothari, C. R. (2004). Research Methodology: Methods and Techniques. India: New Age International Limited.
[5] Firmansyah (2011). Penerapan Algoritma Klasifikasi C4.5 untuk Penentuan Kelayakan Pemberian Kredit Koperasi. Tesis, Magister Ilmu Komputer, STMIK Nusa Mandiri, Jakarta.
[6] Gorunescu, Florin (2011). Data Mining: Concepts, Models, and Techniques. Berlin Heidelberg: Springer-Verlag.
[7] Han, J., & Kamber, M. (2006). Data Mining: Concepts and Techniques. San Francisco: Morgan Kaufmann.
[8] Jiang, Y. (2009). Credit Scoring Model Based on Decision Tree and the Simulated Annealing Algorithm. 2009 World Congress on Computer Science and Information Engineering (pp. 18-22). Los Angeles: IEEE Computer Society.
[9] Kasmir (2011). Analisis Laporan Keuangan. Jakarta: PT Rajagrafindo Persada.
[10] Kusrini, & Luthfi, E. T. (2009). Algoritma Data Mining. Yogyakarta: Andi Publishing.
[11] Kotsiantis, S., Kanellopoulos, D., Karioti, V., & Tampakas, V. (2009). An Ontology-Based Portal for Credit Risk Analysis. 2009 2nd IEEE International Conference on Computer Science and Information Technology (pp. 165-169). Beijing.
[12] Leidiyana, Heny (2011). Komparasi Algoritma Klasifikasi Data Mining dalam Penentuan Resiko Kredit Kepemilikan Kendaraan Bermotor. Tesis, Magister Ilmu Komputer, STMIK Nusa Mandiri, Jakarta.
[13] Lai, K. K., Yu, L., Zhou, L., & Wang, S. (2006). Credit Risk Evaluation with Least Square Support Vector Machine. Springer-Verlag, 490-495.
[14] Larose, D. T. (2005). Discovering Knowledge in Data. New Jersey: John Wiley & Sons, Inc.
[15] Liao (2007). Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications. Singapore: World Scientific Publishing.
[16] Maimon, Oded, & Rokach, Lior (2005). Data Mining and Knowledge Discovery Handbook. New York: Springer.
[17] Odeh, O. O., Featherstone, A. M., & Das, S. (2010). Predicting Credit Default: Comparative Results from an Artificial Neural Network, Logistic Regression and Adaptive Neuro-Fuzzy Inference System. EuroJournals Publishing, Inc., 7-17.
[18] Profil Koperasi Serba Usaha "Ceger Jaya", Kelurahan Ceger, Cipayung, Jakarta Timur.
[19] Sumathi, S., & Sivanandam, S. N. (2006). Introduction to Data Mining and its Applications. Berlin Heidelberg New York: Springer.
[20] Sholichah, Alfiyatus (2009). Data Mining untuk Pembiayaan Murabahah Menggunakan Association Rule (Studi Kasus BMT MMU Sidogiri). Skripsi, Universitas Islam Negeri Maulana Malik Ibrahim, Malang.
[21] Sogala, Satchidananda S. (2006). Comparing the Efficacy of the Decision Trees with Logistic Regression for Credit Risk Analysis. India.
[22] Vercellis, Carlo (2009). Business Intelligence: Data Mining and Optimization for Decision Making. Chichester, West Sussex: John Wiley & Sons, Ltd.
[23] Zemke, Stefan (2003). Data Mining for Prediction: Financial Series Case. Doctoral Thesis, Department of Computer and Systems Sciences, The Royal Institute of Technology, Sweden.
[24] Zurada, J. (2010). Could Decision Trees Improve the Classification Accuracy and Interpretability of Loan Granting Decisions? HICSS '10: Proceedings of the 2010 43rd Hawaii International Conference on System Sciences (pp. 1-9). Koloa.

Suryanto is a lecturer in Computer Science at AMIK BSI. He received a Master's degree in Computer Science from STMIK Nusa Mandiri in 2010, in the Management Information Systems program. His research interests are in management information systems.
He won the DIKTI novice faculty research grant for the 2013-2014 period.

Nandang Iriadi is a lecturer in Computer Science at AMIK BSI. He received a Master's degree in Computer Science from STMIK Nusa Mandiri in 2010, in the Management Information Systems program. His research interests are in management information systems. He won the DIKTI novice faculty research grant for the 2013-2014 period.