www.ierjournal.org International Engineering Research Journal (IERJ) Volume 1 Issue 11 Page 1663-1666, 2016, ISSN 2395-1621

Real Time Credit Card Transaction Analysis

#1 Anushree Naik, #2 Kalyani Phulmamdikar, #3 Shreya Pradhan, #4 Sayali Thorat, #5 Prof. Sachin V. Dhande

1 [email protected] 2 [email protected] 3 [email protected] 4 [email protected]

#1234 Department of Computer Engineering, SKNCOE, Savitribai Phule Pune University, Pune, Maharashtra, India.

ABSTRACT

Credit card fraud is a growing problem that affects card holders around the world. As credit cards become the most prevalent mode of payment for online as well as regular purchases, the fraud associated with them is also increasing. This paper discusses automated credit card fraud detection by means of machine learning. We apply a Bayesian belief network (Naïve Bayes theorem) to the problem and show its results on real-world financial data. The Bayesian network helps to obtain high fraud coverage with a low false alarm rate. Finally, future directions are indicated to improve the implemented techniques and results.

Keywords: Online banking, Credit card fraud detection, Data mining, OTP, Naïve Bayes, Hadoop file system, Map reduce, Machine learning.

ARTICLE INFO

Article History
Received: 16th January 2016
Received in revised form: 17th January 2016
Accepted: 20th January 2016
Published online: 25th January 2016

© 2015, IERJ All Rights Reserved

I. INTRODUCTION

Credit card fraud is increasing day by day because fraudsters keep finding new ways of committing fraudulent transactions, which demands constant innovation in detection techniques. Many techniques based on artificial intelligence, data mining, machine learning, neural networks, genetic programming and so on have evolved for detecting various kinds of fraudulent credit card transactions. A sound understanding of all these approaches should lead to an efficient credit card fraud detection system. This paper presents a survey of various techniques used in credit card fraud detection mechanisms and discusses Naive Bayes in detail.

In this paper, a new comparison measure that realistically represents the monetary gains and losses due to fraud detection is proposed. Moreover, using the proposed cost measure, a cost-sensitive method based on the Bayes algorithm is presented. To get more accurate results, we have performed the analysis on training data.

II. LITERATURE SURVEY

The use of credit and debit cards has increased significantly in recent years; unfortunately, so has the fraud committed with them. Credit card fraud detection has drawn a lot of research interest, and a number of techniques have been suggested, with special emphasis on neural networks, data mining and distributed data mining. According to the European Central Bank report of 2014, 794 million were lost to fraud in the year 2012.

Currently, financial institutions deal with fraud detection using a series of if-then rules created by an internal risk team. These rules perform well only as long as no new fraud patterns appear, because a pattern has to recur several times before the team can detect it. There is therefore a clear need for a better approach to the credit card fraud detection problem. The approaches proposed so far include, in particular, neural networks, artificial immune systems, decision trees, genetic algorithms and the Naive Bayesian algorithm.

Existing System: Artificial Neural Networks (these work similarly to the thinking system of the human brain), Decision Tree (a tree-generating system that takes decisions by connecting nodes), Genetic Algorithm (a general approach to solving the problem).

Proposed System: The proposed system uses Bayes rule. It relies on an underlying probabilistic model that handles problems with both categorical and continuous-valued attributes.

III. VARIOUS TECHNIQUES USED FOR CREDIT CARD FRAUD DETECTION

What types of fraud do you need to look out for?
1. Bankruptcy Fraud
2. Theft/Counterfeit Fraud
3. Behaviour Fraud
4. Application Fraud

Credit card fraud detection techniques:

A. Decision Tree: In this case, a similarity tree is defined recursively. The nodes are labelled with attribute names, the edges are labelled with attribute values, and the leaves contain an intensity factor, defined as a ratio based on the number of transactions that satisfy the outlined conditions. The main advantage of this method of fraud detection is that it is easy to implement, understand and display. Its disadvantage is that every transaction has to be checked one by one.

B. Genetic Algorithm: In most instances, algorithms are recommended as predictive methods of fraud detection, and this method follows that trait. Genetic algorithms have been reported to deliver credible results on home insurance data. The technique also incorporates a range of methods that are used to predict suspicious behaviour.

C. Clustering Techniques: Clustering techniques are used to detect behavioural fraud. Peer group analysis is a system that identifies accounts that start behaving differently from the accounts they previously resembled. Those accounts are flagged as suspicious, and fraud analysts can then investigate the discrepancies. The hypothesis in these clustering techniques is that if a set of accounts behaves the same over a certain period of time and one account then starts to behave significantly differently, the account holder should be notified.

D. Neural Networks: Neural networks are also recommended as effective credit card fraud detection methods. The only issue with this method is that all data has to be clustered by the type of account it belongs to.
Credit card fraud is a major issue that, if not dealt with effectively, can result in myriad complications. It is vital to find ways of detecting such problems and resolving them as soon as they arise.

IV. ARCHITECTURE

In the first step, the user signs in over a secure network and browses the list of services. After the user has selected the service(s), the credit card details are collected. These details are stored in the database, and the bank customer is uniquely identified. A request for the required amount is also initiated. Data mining is then applied to this user information using a machine learning technique, namely the Naïve Bayes algorithm. By applying it, we find out whether a particular user is the actual user or a fraudulent one. If the user is found to be a regular one, the transaction proceeds normally. Otherwise, an OTP is generated. Visual cryptography and image processing are then applied to it: two grayscale image shares of the OTP are generated, one of which is sent by email and the other to a mobile application. These shares are then superimposed. If they match exactly, the user is permitted to carry out the transaction; otherwise the credit card is locked by the admin.

V. THEOREM FOR CREDIT CARD FRAUD DETECTION AND PREVENTION

A. Bayes Theorem Background: The Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions. Bayes' theorem is named after Thomas Bayes (1702-61), who studied how to compute a distribution for the probability parameter of a binomial distribution. After Bayes' death, his friend Richard Price edited and presented this work in 1763.

B. Bayes Technique: Bayes rule [Jiawei Han, Micheline Kamber, Jian Pei: "Data Mining-Concepts and Techniques"] depends upon the prior probability (individual) as well as the posterior probability (conditional). The formula is given by:

P(Y|X) = [P(X|Y) . P(Y)] / P(X)

where
P(X) = independent probability of X
P(Y) = independent probability of Y
P(X|Y) = conditional probability of X given Y
P(Y|X) = conditional probability of Y given X

Here, the database stores all the data and transaction details of the customer. The transaction details can include the time of the transaction (morning, afternoon, evening, night), the amount generally spent (a threshold is set according to the customer's usual limit, e.g. below 20,000 or above 20,000), the day of the transaction (Monday, Tuesday, Wednesday, etc.) and the frequency of transactions (the number of times the card is swiped by the customer in a day). We have considered as many of these parameters as possible, which increases the precision of the system.

There are two major steps in this algorithm:

1. Training: In this step the initial probability is calculated, i.e. P(Yes) or P(No).

2. Detection: Firstly, the individual probability is calculated, i.e. P(Attribute|Yes) or P(Attribute|No). Example: P(Monday|Yes) is the probability that a transaction took place on a Monday given that fraud was found. In this way, all the parameters are considered with all their values, for both the positive and the negative class. Secondly, the final probability is found, i.e.

P(X1) = P(Attributes|Yes) . P(Yes) = P(Monday|Yes) . P(Morning|Yes) . P(>20,000|Yes) . P(5|Yes) . P(Yes)

where 5 is the number of transactions in the day. P(X2) = P(Attributes|No) . P(No) is found similarly. If P(X1) > P(X2), then fraud is taking place and an OTP is generated as mentioned earlier. Otherwise, there is no fraud and a normal transaction can be made.

C. Bayesian Network: A Bayesian network [Jiawei Han, Micheline Kamber, Jian Pei: "Data Mining-Concepts and Techniques"] is a graphical probabilistic model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG).

D. Visual Cryptography: This is a technique that allows visual information (pictures, text, etc.) to be encrypted in such a way that decryption becomes a mechanical operation that does not require a computer.

a. Visual cryptography sharing case: In this scheme, a secret image is encoded into two shares. Every pixel of the secret image is encoded into multiple sub-pixels in each share image, using a matrix to determine the colour of the pixel. In the (2,2) sharing case, complementary matrices are used to share a black pixel and identical matrices are used to share a white pixel. When the shares are stacked, all sub-pixels associated with a black pixel are black, while 50% of the sub-pixels associated with a white pixel remain white.

b. Cheating Visual Secret Sharing Scheme: For binary (2,2) visual cryptography, which creates two encrypted images from an original image, there is a simple algorithm. The first image is created from random pixels and has the same size and shape as the original image. Next, a second image of the same size and shape as the first is created. Where a pixel of the original image is the same as the corresponding pixel of the first encrypted image, the same pixel of the second encrypted image is set to the opposite colour. Where a pixel of the original image differs from the corresponding pixel of the first encrypted image, the same pixel of the second encrypted image is set to the same colour as the corresponding pixel of the first encrypted image. These two apparently random images can then be combined using exclusive OR (XOR) to recreate the original image.

VI. USE OF HADOOP AND HDFS IN THE PROPOSED SYSTEM

Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common and should be handled automatically by the framework. The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part called MapReduce. HDFS stores large files (typically in the range of gigabytes to terabytes) across multiple machines. It achieves reliability by replicating the data across multiple hosts. In the proposed system, we use HDFS for storing the user logs and MapReduce for accessing them quickly. All user data is stored in HDFS.

VII. CONCLUSION

The system proposed in this paper is a real-time system that is feasible and can be implemented. Used on the bank server, such a system can handle crucial frauds related to credit cards. Our evaluation confirmed that including the real cost, by creating a cost-sensitive system using a Bayes minimum risk classifier, gives rise to much better fraud detection results in the sense of higher savings. Considering the growth of the internet and the need for high-level security to avoid cyber crime, our system can be applied in many online payment systems in the near future.

REFERENCES

[1] Shailesh S. Dhok, "Credit Card Fraud Detection Using Hidden Markov Rule", International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume-2, Issue-1, March 2012.
[2] Mohd Avesh Zubair Khan, Jabir Daud Pathan, Ali Haider Ekbal Ahmed, "Credit Card Fraud detection using Hidden Markov Model & K-Clustering".
[3] S. Benson Edwin Raj, A. Annie Portia, "Analysis on Credit Card Fraud Detection Methods", IEEE International Conference on Computer, Communication and Electrical Technology, IEEE, March 2011.
[4] Alejandro Correa Bahnsen, Aleksandar Stojanovic, Djamila Aouada and Björn Ottersten, "Credit Card Fraud Detection using Bayes Minimum risk", 2013 12th International Conference on Machine Learning and Applications.
[5] S. Benson Edwin Raj, A. Annie Portia, "Analysis on Credit Card Fraud Detection Methods", IEEE International Conference on Computer, Communication and Electrical Technology, IEEE, March 2011.
[6] M. Hamdi Ozcelik, Mine Isik, "Improving a credit card fraud detection system using Genetic algorithm", IEEE International Conference on Networking and Information Technology, IEEE 2010.
[7] Daniel Garner, "Genetic algorithms for credit card fraud detection", IEEE Transactions, May 2011.
[8] "Research on credit card fraud detection model based on distance sum", IEEE 2009 International Joint Conference on Artificial Intelligence.
[9] Raghavedra Patidar, Lokesh Sharma, "Credit card fraud detection using neural network", ISSN: 2231-2307, Volume, Issue NCAI211, June 2011.
[10] Panigrahi, S., Kundu, A., Sural, S. & Majumdar, A. (2009). "Credit Card Fraud Detection: A Fusion Approach Using Dempster-Shafer Theory and Bayesian Learning". Information Fusion, 354-363.
[11] Dr Markus Roggenbach, CS364 Software testing slides, Swansea University, 2011.
[12] D. Whitley, "Genetic Algorithm And Neural Network", 2003.
[13] Wang Xi, "Some Ideas about Credit Card Fraud Prediction", China Trial, Apr. 2008, pp. 74-75.
[10] Amlan Kundu, Suvasini Panigrahi, Shamik Sural and Arun K. Majumdar, "BLAST-SSAHA Hybridization for Credit Card Fraud Detection", IEEE Transactions On Dependable And Secure Computing, vol. 6, Issue no. 4, pp. 309-315, October-December 2009.
[13] A. Chiu, C. Tsai, "A Web Services-Based Collaborative Scheme for Credit Card Fraud Detection", Proceedings of the IEEE International Conference on e-Technology, e-Commerce and e-Service, pp. 177-181, 2004.
[14] Amlan Kundu, Suvasini Panigrahi, Shamik Sural and Arun K. Majumdar, "Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning", Special Issue on Information Fusion in Computer Security, Vol. 10, Issue no. 4, pp. 354-363, October 2009.
[15] Liu Ren, Zhang Liping, Zhan Yinqiang, "A Study on Construction of Analysis Based CRM System", Computer Applications and Software, Vol. 21, Apr. 2004, pp. 46-47.
[16] M. Mehdi, S. Zair, A. Anou and M. Bensebti, "A Bayesian Networks in Intrusion Detection Systems", International Journal of Computational Intelligence Research, Issue No. 1, pp.09731873, Vol. 3, 2007.
[17] Ezawa, K. & Norton, S., "Constructing Bayesian Networks to Predict Uncollectible Telecommunications Accounts", IEEE Expert, October, 45-51, 1996.
[18] Blickle, T. & Thiele, L. (1995). "A Comparison of Selection Schemes used in Genetic Algorithms" (Vol. 2). Zurich: Swiss Federal Institute of Technology.
[19] Jitendra Dara, Laxman Gundemoni, "Credit Card Security and E-Payment", 2006.
[18] Wang Xi, "Some Ideas about Credit Card Fraud Prediction", China Trial, Apr. 2008, pp. 74-75.
[20] M. Hamdi Ozcelik, Ekrem Duman, Mine Isik, Tugba Cevik, "Improving a credit card fraud detection system using genetic algorithm", International Conference on Networking and Information Technology, 2010.
[21] Wen-Fang Yu, Na Wang, "Research on Credit Card Fraud Detection Model Based on Distance Sum", IEEE International Joint Conference on Artificial Intelligence, 2009.
[22] Jiawei Han, Micheline Kamber, Jian Pei, "Data Mining-Concepts and Techniques".
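As a supplementary illustration, the training and detection steps of Section V.B can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `train`, `score` and `is_fraud`, the attribute keys and the sample transactions are hypothetical, and add-one (Laplace) smoothing is an addition not mentioned in the paper, used here so that an attribute value never seen with a class does not force the product to zero.

```python
from collections import Counter, defaultdict


def train(rows):
    """Count classes and attribute values.

    rows: list of (attributes_dict, label), label "Yes" (fraud) or "No".
    """
    class_counts = Counter(label for _, label in rows)
    # value_counts[label][attribute][value] = number of matching training rows
    value_counts = defaultdict(lambda: defaultdict(Counter))
    values_seen = defaultdict(set)
    for attrs, label in rows:
        for name, value in attrs.items():
            value_counts[label][name][value] += 1
            values_seen[name].add(value)
    return class_counts, value_counts, values_seen


def score(model, attrs, label):
    """Return P(attributes | label) * P(label) under the naive assumption."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    p = class_counts[label] / total  # training step: P(Yes) or P(No)
    for name, value in attrs.items():  # detection step: P(attribute | label)
        count = value_counts[label][name][value]
        # Add-one smoothing (an assumption, not part of the paper's text).
        p *= (count + 1) / (class_counts[label] + len(values_seen[name]))
    return p


def is_fraud(model, attrs):
    """Fraud is reported when P(X1) > P(X2), as in Section V.B."""
    return score(model, attrs, "Yes") > score(model, attrs, "No")


# Hypothetical training transactions, invented for illustration only.
training_rows = [
    ({"day": "Monday", "time": "Morning", "amount": ">20000", "freq": 5}, "Yes"),
    ({"day": "Monday", "time": "Night", "amount": ">20000", "freq": 5}, "Yes"),
    ({"day": "Tuesday", "time": "Morning", "amount": "<20000", "freq": 1}, "No"),
    ({"day": "Wednesday", "time": "Evening", "amount": "<20000", "freq": 2}, "No"),
    ({"day": "Monday", "time": "Afternoon", "amount": "<20000", "freq": 1}, "No"),
    ({"day": "Friday", "time": "Morning", "amount": "<20000", "freq": 2}, "No"),
]
model = train(training_rows)
suspect = {"day": "Monday", "time": "Morning", "amount": ">20000", "freq": 5}
needs_otp = is_fraud(model, suspect)
```

In the proposed architecture, a result of `True` would correspond to the branch in which an OTP is generated, and `False` to a normal transaction. The evidence term P(X) is omitted because it is the same for both classes and does not affect the comparison.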