www.ierjournal.org
International Engineering Research Journal (IERJ) Volume 1 Issue 11 Page 1663-1666, 2016, ISSN 2395-1621
Real Time Credit Card Transaction
Analysis
#1 Anushree Naik, #2 Kalyani Phulmamdikar, #3 Shreya Pradhan, #4 Sayali Thorat,
#5 Prof. Sachin V. Dhande
#1234 Department of Computer Engineering, SKNCOE, Savitribai Phule Pune University,
Pune, Maharashtra, India.
ABSTRACT
Credit card fraud is a growing problem that affects card holders around the world.
As credit cards become the prevailing mode of payment for both online and regular
purchases, the fraud associated with them is also increasing. This paper discusses
automated credit card fraud detection by means of machine learning. We apply a
Bayesian belief network (Naïve Bayes) to the problem and report its results on
real-world financial data. The Bayesian network helps to obtain high fraud coverage
with a low false-alarm rate. Finally, future directions are indicated to improve the
implemented techniques and results.
Keywords: Online banking, Credit card fraud detection, Data mining, OTP, Naïve Bayes,
Hadoop file system, MapReduce, Machine learning.
ARTICLE INFO
Article History
25th January 2016
I. INTRODUCTION
Credit card fraud is increasing day by day because fraudsters
continually find new ways of committing fraudulent
transactions, which demands constant innovation in detection
techniques. Many techniques based on Artificial Intelligence,
Data Mining, Machine Learning, Neural Networks, Genetic
Programming, etc. have evolved for detecting various kinds of
fraudulent credit card transactions. A sound understanding of
all these approaches will lead to an efficient credit card
fraud detection system. This paper presents a survey of
various techniques used in credit card fraud detection and
describes Naive Bayes in detail. A new comparison measure that
realistically represents the monetary gains and losses due to
fraud detection is proposed. Moreover, using the proposed cost
measure, a cost-sensitive method based on the Bayes algorithm
is presented. To obtain more accurate results, we have
performed the analysis on training data.
© 2015, IERJ All Rights Reserved
Received: 16th January 2016
Received in revised form: 17th January 2016
Accepted: 20th January 2016
Published online:
II. LITERATURE SURVEY
The use of credit and debit cards has increased significantly
in recent years; unfortunately, so has the fraud committed
with them. Credit card fraud detection has drawn a lot of
research interest, and a number of techniques have been
suggested, with special emphasis on neural networks, data
mining and distributed data mining. According to the European
Central Bank report of 2014, 794 million euros were lost to
fraud in 2012. Currently, financial institutions handle fraud
detection with a series of if-then rules created by an
internal risk team. The rules perform well as long as there
are no new fraud patterns, because a pattern has to be
repeated before the team can detect it and write a rule for
it. There is therefore a clear need for a better approach to
the credit card fraud detection problem. Approaches studied in
the literature include, in particular, Neural Networks,
Artificial Immune Systems, Decision Trees, Genetic Algorithms
and the Naive Bayesian algorithm.
Existing System: Artificial Neural Networks (these work
similarly to the human brain's thinking system), Decision Tree
(a tree-structured system used to take decisions by connecting
nodes), Genetic Algorithm (it gives a general approach to
solving the problem).
Proposed System: The proposed system uses Bayes' rule. It
relies on an underlying probabilistic model that handles
problems with both categorical and continuous-valued
attributes.
III. VARIOUS TECHNIQUES USED FOR CREDIT CARD FRAUD
DETECTION
What types of fraud do you need to look out for?
1. Bankruptcy Fraud
2. Theft/Counterfeit Fraud
3. Behaviour Fraud
4. Application Fraud
Credit card fraud detection techniques:
A. Decision Tree: In this case, a similarity tree is defined
recursively: the nodes are labelled with attribute names, the
edges are labelled with attribute values, and the leaves
contain an intensity factor, defined as the proportion of
transactions that satisfy the outlined conditions.
The main advantage of this method of fraud detection is that
it is easy to implement, understand and display. The
disadvantage is that every transaction has to be checked one
by one.
B. Genetic Algorithm: Algorithms of this kind are commonly
recommended as predictive methods for fraud detection, and the
genetic algorithm follows this trait. Genetic algorithms have
been shown to deliver credible results on home insurance data.
The technique also incorporates a range of methods that are
used to predict suspicious behaviour.
C. Clustering Techniques: Clustering techniques are used to
detect behavioural fraud. Peer group analysis is a system that
identifies accounts that start behaving differently from other
accounts at some moment in time, although they were previously
behaving the same. Those accounts are flagged as suspicious,
and fraud analysts can then investigate the discrepancies. The
hypothesis behind these clustering techniques is that if
accounts behave the same over a certain period of time and one
account then starts to behave significantly differently, the
account holder should be notified.
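The peer group idea described above can be sketched as follows. This is only a minimal illustration, not the method of any cited system: it assumes the peer group has already been formed, and the function name flag_outliers, the spend figures and the threshold of 2 standard deviations are all hypothetical.

```python
from statistics import mean, pstdev

def flag_outliers(period_spend, threshold=2.0):
    """Flag accounts whose spend deviates strongly from their peers.

    period_spend maps account id -> spend in the current period for
    accounts that behaved alike in the past (the peer group, assumed
    to contain at least two accounts).
    """
    flagged = []
    for account, spend in period_spend.items():
        # Compare each account against the other members of its group.
        peers = [v for a, v in period_spend.items() if a != account]
        mu, sigma = mean(peers), pstdev(peers)
        if sigma > 0 and abs(spend - mu) / sigma > threshold:
            flagged.append(account)
    return flagged

# Hypothetical peer group: four accounts spend about 100, one jumps to 900.
group = {"A": 100, "B": 110, "C": 95, "D": 105, "E": 900}
suspicious = flag_outliers(group)
print(suspicious)
```

Accounts returned by such a check would then be passed to fraud analysts, as the text describes, rather than blocked automatically.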
D. Neural Networks: Neural networks are also recommended as
effective credit card fraud detection methods. The only issue
with this method is that all data has to be clustered by the
type of account it belongs to. Credit card fraud is a major
issue that, if not dealt with effectively, can result in
myriad complications. It is vital to find ways of detecting
such issues and resolving them as soon as they arise.
IV. ARCHITECTURE
The first step is signing in through a secure network. The
user then browses the list of services. After he/she has
selected the service(s), the credit card details are collected
from the user. These details are stored in the database and
the bank customer is uniquely identified. A request for the
required amount is also initiated. Data mining is then applied
to this user information, using a machine learning technique,
namely the Naïve Bayes algorithm. By applying it we find out
whether a particular user is the actual user or a fraudulent
one. If the user is found to be a regular one, the normal
transaction is carried out. Otherwise, an OTP is generated.
Visual cryptography and image processing are then applied to
it: two grayscale image shares of the OTP are generated, one of
which is sent by email and the other to a mobile application.
These shares are then superimposed. If they match exactly, the
user is permitted to carry out the transaction; otherwise the
credit card is locked by the admin.
V. THEOREM FOR CREDIT CARD FRAUD
DETECTION AND PREVENTION
A. Bayes Theorem Background: The Naive Bayes classifier is a
simple probabilistic classifier based on applying Bayes'
theorem with strong (naive) independence assumptions.
Bayes' theorem is named after Thomas Bayes (1702-61), who
studied how to compute a distribution for the probability
parameter of a binomial distribution. After Bayes' death,
his friend Richard Price edited and presented this work in
1763.
B. Bayes Technique: Bayes' rule [Jiawei Han, Micheline
Kamber, Jian Pei: "Data Mining: Concepts and Techniques"]
depends upon Prior Probability (Individual) as well as
Posterior Probability (Conditional).
The formula is given by:
P(Y|X) = [P(X|Y) . P(Y)] / P(X)
Where,
P(X) = Independent Probability of X
P(Y) = Independent Probability of Y
P(X|Y) = Conditional Probability of X given Y
P(Y|X) = Conditional Probability of Y given X
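As a small worked illustration of the formula, the sketch below computes P(Y|X) for Y = "transaction is fraudulent" and X = "transaction made at night". All three input probabilities are hypothetical numbers chosen only for the example, and P(X) is obtained from them by the law of total probability.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(Y|X) = P(X|Y) . P(Y) / P(X)."""
    if evidence <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood * prior / evidence

# Hypothetical values, for illustration only.
p_fraud = 0.01               # P(Y): prior probability of fraud
p_night_given_fraud = 0.60   # P(X|Y)
p_night_given_legit = 0.10   # P(X|not Y)

# P(X) = P(X|Y) . P(Y) + P(X|not Y) . P(not Y)
p_night = p_night_given_fraud * p_fraud + p_night_given_legit * (1 - p_fraud)

p_fraud_given_night = posterior(p_night_given_fraud, p_fraud, p_night)
print(round(p_fraud_given_night, 4))  # 0.0571
```

With these assumed numbers, observing a night-time transaction raises the probability of fraud from 1% to about 5.7%.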
Here, the database stores all the data and transaction
details of the customer. The transaction details can be
the Time of the Transaction (morning, afternoon, evening,
night), the Amount of the Transactions generally made (a
threshold limit is set according to the customer's usual
limit, e.g. below 20,000 or above 20,000), the Day of the
Transaction (Monday, Tuesday, Wednesday, etc.) and the
Frequency of the transactions (number of times the card is
swiped by the customer in a day). We have considered as
many such parameters as possible, which increases the
precision of the system. There are two major steps in this
algorithm:
1. Training:
In this step the Initial Probability is calculated, i.e.
P(Yes) and P(No).
2. Detection:
Firstly, the Individual Probability is calculated, i.e.
P(Attribute|Yes) and P(Attribute|No). Example:
P(Monday|Yes) gives the probability that a transaction was
made on a Monday given that it was fraudulent. In this way
all the parameters are considered, with all their values,
for both the positive (Yes) and the negative (No) class.
Secondly, the Final Probability is found, i.e.
P(X1) = P(Attribute|Yes) . P(Yes)
      = P(Monday|Yes) . P(Morning|Yes) . P(>20,000|Yes) . P(5|Yes) . P(Yes)
Similarly, P(X2) = P(Attribute|No) . P(No) is found.
If P(X1) > P(X2), then YES, fraud is happening, and an OTP
is generated as mentioned earlier. Otherwise, there is no
fraud and the normal transaction can be made.
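The training and detection steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the records are invented, the names train, score and is_fraud are hypothetical, and add-one (Laplace) smoothing is an added assumption so that an attribute value never seen with a class does not force the whole product to zero.

```python
from collections import Counter, defaultdict

def train(records):
    """Training: count classes and (class, attribute, value) occurrences."""
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(lambda: defaultdict(Counter))
    domains = defaultdict(set)
    for attrs, label in records:
        for name, value in attrs.items():
            value_counts[label][name][value] += 1
            domains[name].add(value)
    return class_counts, value_counts, domains

def score(model, attrs, label):
    """Detection: product of P(attribute|label) over all attributes, times P(label)."""
    class_counts, value_counts, domains = model
    result = class_counts[label] / sum(class_counts.values())  # P(label)
    for name, value in attrs.items():
        seen = value_counts[label][name][value]
        n_values = len(domains[name] | {value})
        # Add-one smoothing (an assumption, not stated in the text).
        result *= (seen + 1) / (class_counts[label] + n_values)
    return result

def is_fraud(model, attrs):
    """Fraud is reported when P(X1) > P(X2)."""
    return score(model, attrs, "Yes") > score(model, attrs, "No")

# Hypothetical training records: (attributes, fraud label).
records = [
    ({"day": "Monday", "time": "Morning", "amount": ">20000", "freq": 5}, "Yes"),
    ({"day": "Monday", "time": "Night", "amount": ">20000", "freq": 5}, "Yes"),
    ({"day": "Tuesday", "time": "Morning", "amount": "<=20000", "freq": 1}, "No"),
    ({"day": "Wednesday", "time": "Evening", "amount": "<=20000", "freq": 2}, "No"),
    ({"day": "Monday", "time": "Afternoon", "amount": "<=20000", "freq": 1}, "No"),
    ({"day": "Friday", "time": "Morning", "amount": "<=20000", "freq": 2}, "No"),
]
model = train(records)
risky = {"day": "Monday", "time": "Morning", "amount": ">20000", "freq": 5}
usual = {"day": "Tuesday", "time": "Morning", "amount": "<=20000", "freq": 1}
print(is_fraud(model, risky), is_fraud(model, usual))
```

In the flow described in the text, a True result would trigger OTP generation, and a False result would let the normal transaction proceed.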
C. Bayesian Network: A Bayesian network [Jiawei Han,
Micheline Kamber, Jian Pei: "Data Mining: Concepts and
Techniques"] is a graphical probabilistic model that
represents a set of random variables and their
conditional dependencies via a directed acyclic graph
(DAG).
D. Visual Cryptography: This is a technique which allows
visual information (pictures, texts, etc.) to be encrypted in
such a way that decryption becomes a mechanical operation
that does not require a computer.
a. Visual cryptography sharing case: In this scheme, a
secret image is encoded into two shares. Every pixel of the
secret image is encoded into multiple sub-pixels in each
share image, using a matrix to determine the colour of the
pixel. In the (2,2) sharing case we use complementary
matrices to share a black pixel and identical matrices to
share a white pixel. On stacking the shares, all sub-pixels
associated with a black pixel are black, while 50% of the
sub-pixels associated with a white pixel remain white.
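The (2,2) sharing case can be illustrated with a minimal sketch that expands each secret pixel into two sub-pixels. The choice of two sub-pixels per pixel, the pattern table and the function names are assumptions made for the example; stacking is modelled as a logical OR, since a stacked sub-pixel is black if it is black in either share.

```python
import secrets

# The two possible sub-pixel patterns: 1 = black, 0 = white (transparent).
PATTERNS = ((0, 1), (1, 0))

def make_shares(secret_row):
    """Encode one row of secret pixels (1 = black, 0 = white) into two shares."""
    share1, share2 = [], []
    for pixel in secret_row:
        pattern = PATTERNS[secrets.randbelow(2)]
        complement = tuple(1 - s for s in pattern)
        share1.extend(pattern)
        # Complementary patterns for a black pixel, identical for a white one.
        share2.extend(complement if pixel else pattern)
    return share1, share2

def stack(share1, share2):
    """Stack two shares: a sub-pixel is black if it is black in either share."""
    return [a | b for a, b in zip(share1, share2)]

def decode(stacked):
    """A pixel reads as black only when both of its sub-pixels are black."""
    return [stacked[i] & stacked[i + 1] for i in range(0, len(stacked), 2)]

secret_row = [1, 0, 0, 1, 1, 0]
share1, share2 = make_shares(secret_row)
stacked = stack(share1, share2)
print(decode(stacked) == secret_row)
```

Each share on its own contains exactly one black sub-pixel per pixel, whatever the secret is, so a single share reveals nothing about the image.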
b. Cheating Visual Secret Sharing Scheme: For binary (2,2)
visual cryptography, which creates two encrypted images
from an original image, there is a simple algorithm. First,
an image of random pixels is created with the same size and
shape as the original image. Next, a second image of the
same size and shape as the first is created. Where a pixel
of the original image is black, the corresponding pixel of
the second encrypted image is set to the opposite colour of
that pixel in the first encrypted image. Where a pixel of
the original image is white, the corresponding pixel of the
second encrypted image is set to the same colour as that
pixel in the first encrypted image. The two apparently
random images can then be combined using exclusive OR (XOR)
to recreate the original image.
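A minimal sketch of an XOR-based two-share scheme is given below. It assumes a binary image stored as rows of 0/1 pixels (1 = black); the function names are hypothetical. The first share is random, and the second share differs from the first exactly where the original pixel is black, so XOR-ing the two shares yields the original.

```python
import secrets

def xor_shares(image):
    """Split a binary image (rows of 0/1 pixels) into two random-looking shares."""
    share1 = [[secrets.randbelow(2) for _ in row] for row in image]
    # share2 is the opposite of share1 where the original pixel is 1,
    # and equal to share1 where it is 0.
    share2 = [[s ^ p for s, p in zip(s_row, p_row)]
              for s_row, p_row in zip(share1, image)]
    return share1, share2

def combine(share1, share2):
    """Recreate the original image by XOR-ing the two shares pixel by pixel."""
    return [[a ^ b for a, b in zip(r1, r2)] for r1, r2 in zip(share1, share2)]

# Hypothetical 3x4 image, e.g. a fragment of a rendered OTP digit.
image = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
share1, share2 = xor_shares(image)
print(combine(share1, share2) == image)
```

In the architecture described earlier, one such share would go to the user's email and the other to the mobile application, and the transaction would proceed only if the combined result matches the generated OTP image.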
VI. USE OF HADOOP AND HDFS IN THE
PROPOSED SYSTEM
Apache Hadoop is an open-source software framework
written in Java for distributed storage and distributed
processing of very large data sets on computer clusters built
from commodity hardware. All the modules in Hadoop are
designed with a fundamental assumption that hardware
failures are common and should be automatically handled
by the framework. The core of Apache Hadoop consists of a
storage part, known as Hadoop Distributed File System
(HDFS) and a processing part called MapReduce.
HDFS stores large files (typically in the range of gigabytes
to terabytes) across multiple machines. It achieves
reliability by replicating the data across multiple hosts. In
the proposed system we use HDFS to store the user logs and
MapReduce to access them quickly. All of a user's data is
stored in HDFS.
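To show how MapReduce fits the counting needed for Naïve Bayes training, the sketch below simulates the map, shuffle and reduce phases in plain Python. It does not use the Hadoop API: the comma-separated log format "day,time,amount band,label" and all function names are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain

def map_record(line):
    """Map: emit a (key, 1) pair for the class and for each attribute value."""
    day, time_of_day, amount_band, label = line.strip().split(",")
    yield ("class", label), 1
    for attribute, value in (("day", day), ("time", time_of_day),
                             ("amount", amount_band)):
        yield (attribute, value, label), 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(key, values):
    """Reduce: sum the values collected for one key."""
    return key, sum(values)

def run_job(lines):
    pairs = chain.from_iterable(map_record(line) for line in lines)
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())

# Hypothetical user log lines.
logs = [
    "Monday,Morning,>20000,Yes",
    "Monday,Night,>20000,Yes",
    "Tuesday,Morning,<=20000,No",
]
counts = run_job(logs)
# P(Monday|Yes) estimated from the reduced counts.
p_monday_given_yes = counts[("day", "Monday", "Yes")] / counts[("class", "Yes")]
print(p_monday_given_yes)
```

On a real cluster the map and reduce functions would run in parallel over log blocks stored in HDFS; the in-memory shuffle here only stands in for that step.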
VII. CONCLUSION
The system proposed in this paper is a real-time system that
is feasible and can be implemented. Using such a system on the
bank server can help handle critical credit card fraud. Our
evaluation confirmed that including the real cost, by creating
a cost-sensitive system using a Bayes minimum risk classifier,
gives rise to much better fraud detection results in the sense
of higher savings. Considering the growing world of the
internet and the need for high-level security to avoid cyber
crime, our system can be successfully applied in many online
payment systems in the near future.
REFERENCES
[1] Shailesh S. Dhok, "Credit Card Fraud Detection Using Hidden Markov Rule", International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume-2, Issue-1, March 2012.
[2] Mohd Avesh Zubair Khan, Jabir Daud Pathan, Ali Haider Ekbal Ahmed, "Credit Card Fraud Detection using Hidden Markov Model & K-Clustering".
[3] S. Benson Edwin Raj, A. Annie Portia, "Analysis on Credit Card Fraud Detection Methods", IEEE International Conference on Computer, Communication and Electrical Technology, IEEE, March 2011.
[4] Alejandro Correa Bahnsen, Aleksandar Stojanovic, Djamila Aouada and Björn Ottersten, "Credit Card Fraud Detection using Bayes Minimum Risk", 2013 12th International Conference on Machine Learning and Applications.
[5] M. Hamdi Ozcelik, Mine Isik, "Improving a credit card fraud detection system using Genetic algorithm", IEEE International Conference on Networking and Information Technology, IEEE 2010.
[6] Daniel Garner, "Genetic algorithms for credit card fraud detection", IEEE Transactions, May 2011.
[7] "Research on credit card fraud detection model based on distance sum", IEEE 2009 International Joint Conference on Artificial Intelligence.
[8] Raghavedra Patidar, Lokesh Sharma, "Credit card fraud detection using neural network", ISSN: 2231-2307, Volume, Issue NCAI211, June 2011.
[9] Panigrahi, S., Kundu, A., Sural, S. & Majumdar, A. (2009). "Credit Card Fraud Detection: A Fusion Approach Using Dempster-Shafer Theory and Bayesian Learning", Information Fusion, 354-363.
[10] Dr Markus Roggenbach, CS364 Software testing slides, Swansea University, 2011.
[11] D. Whitley, "Genetic Algorithm And Neural Network", 2003.
[12] Wang Xi, Some Ideas about Credit Card Fraud Prediction China Trial, Apr. 2008, pp. 74-75.
[13] Amlan Kundu, Suvasini Panigrahi, Shamik Sural and Arun K. Majumdar, "BLAST-SSAHA Hybridization for Credit Card Fraud Detection", IEEE Transactions on Dependable and Secure Computing, vol. 6, Issue no. 4, pp. 309-315, October-December 2009.
[14] A. Chiu, C. Tsai, "A Web Services-Based Collaborative Scheme for Credit Card Fraud Detection", Proceedings of the IEEE International Conference on e-Technology, e-Commerce and e-Service, pp. 177-181, 2004.
[15] Amlan Kundu, Suvasini Panigrahi, Shamik Sural and Arun K. Majumdar, "Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning", Special Issue on Information Fusion in Computer Security, Vol. 10, Issue no. 4, pp. 354-363, October 2009.
[16] Liu Ren, Zhang Liping, Zhan Yinqiang, A Study on Construction of Analysis Based CRM System, Computer Applications and Software, Vol. 21, Apr. 2004, pp. 46-47.
[17] M. Mehdi, S. Zair, A. Anou and M. Bensebti, "A Bayesian Networks in Intrusion Detection Systems", International Journal of Computational Intelligence Research, Issue No. 1, pp. 09731873, Vol. 3, 2007.
[18] Ezawa, K. & Norton, S., "Constructing Bayesian Networks to Predict Uncollectible Telecommunications Accounts", IEEE Expert, October, 45-51, 1996.
[19] Blickle, T., & Thiele, L. (1995). A Comparison of Selection Schemes used in Genetic Algorithms (Vol. 2). Zurich: Swiss Federal Institute of Technology.
[20] Jitendra Dara, Laxman Gundemoni, "Credit Card Security and E-Payment", 2006.
[21] M. Hamdi Ozcelik, Ekrem Duman, Mine Isik, Tugba Cevik, Improving a credit card fraud detection system using genetic algorithm, International Conference on Networking and Information Technology, 2010.
[22] Wen-Fang Yu, Na Wang, Research on Credit Card Fraud Detection Model Based on Distance Sum, IEEE International Joint Conference on Artificial Intelligence, 2009.
[23] Jiawei Han, Micheline Kamber, Jian Pei, "Data Mining: Concepts and Techniques".