Rake - Intelligrate

The e-banking antifraud solution
The intelligent software that protects
consumers and banks from the most
sophisticated hacker attacks
SCENARIO
Online fraud
The increase in online banking transactions and the resulting movement of cash via the
Net have shifted the focus of organised crime from bank robberies to online fraud, which is
just as lucrative but far less risky.
Fraud mechanisms
The following mechanisms are used to perpetrate online fraud:
Unauthorised access to online current accounts via:
Theft and subsequent use of Credentials
Web-in-the-Middle
And subsequent transfer of money by three different means:
– The money is transferred through a chain of decoy accounts and finally
credited to a foreign bank (often by recruiting current account holders
online, not always in good faith).
– The money is used to buy multiple phone top-ups and the resulting credit
immediately spent using premium phone numbers.
– The money is used to load prepaid Credit Cards, which are immediately used to
buy easily resellable goods (such as jewellery or electronic goods).
Theft and subsequent use of Credentials
Data theft by cracking Bank Databases
This is the most brutal but often the most effective mechanism, allowing
thousands of authentication details to be obtained in one go.
(Latest known event, 12 December 2008, 450,000 Accounts, Germany)
E-mail phishing
The most commonly used method for spying and collecting authentication
credentials in online banking.
Banking Trojans
These incorporate various different mechanisms for capturing and sending
access data: from keyloggers to generating films of mouse movements on
the screen.
They began to spread four years ago and have recently reached a critical
level of penetration.
(According to an empirical assessment carried out in November 2008, approximately 1% of end-user PCs are affected)
Banking Trojans
Trojans act by transferring Usernames, Passwords and Digital Signature
Certificates, as well as screenshots and all the characters typed, to pirate sites
created to gather this kind of information.
The list of these Trojans is long and well documented, although they are fairly
unknown to the vast majority of Internet users and antivirus tools often fail to pick
them up.
Their operating principle is not too dissimilar from the one introduced by
Bancos.NL, the first of the documented Trojans.
The most widespread forms remain in standby mode until the browser connects
to one of the addresses listed in the code.
At this point, when the user browses an Internet banking site, the Trojans
activate by sending his or her Username and Password, plus any other
confidential information, to the pirate sites that collect them.
Web-in-the-middle
Trojan.Silentbanker, and its subsequent variants, affect innocent users of
online banking services by intercepting client current account information before
it is coded and sending it to a central attack database.
They have the ability to intercept online banking transactions which are normally
well protected by two-level authentication procedures.
During a banking transaction, Silentbanker replaces the user account with the
hacker's account, while the user continues to perceive a perfectly normal
banking transaction.
Since they have no inkling that their details have been hacked, users
unknowingly send money to the hacker's account after having accessed the
second level of authentication.
Countermeasures
Banks lack the means to combat this phenomenon.
This is because the infection affects their customers' computers, without
any anomaly being picked up on the Internet Banking server.
Only occasionally will the bank perceive the presence of malware on a
customer's computer in a log file. This happens when the Trojan
modifies website pages to request additional information, and
consequently the web server receives POST fields which do not appear
in "clean" transactions. Analysing these POST fields allows customers
with infected PCs to be identified, although it may be difficult to
determine how to proceed:
a) Notify the user: there is a risk that this operation will be perceived by
users as an attack on their privacy. They may think that the bank has
hacked into their PC in order to get hold of this information.
b) Manually monitor the relevant account to identify any fraudulent
activity.
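The log check described above (spotting POST fields that never appear in "clean" transactions) can be sketched as follows. This is a minimal illustration, not the vendor's implementation: the field names, the customer identifiers and the log format are all hypothetical.

```python
# Hypothetical field names: the real form fields of a given e-banking site will differ.
EXPECTED_FIELDS = {"username", "password", "otp"}


def unexpected_post_fields(post_fields, expected=EXPECTED_FIELDS):
    """Return the submitted field names that a clean form never sends."""
    return set(post_fields) - set(expected)


def infected_customers(log_entries, expected=EXPECTED_FIELDS):
    """Map customer id -> extra fields seen in that customer's requests.

    log_entries is an iterable of (customer_id, post_fields) pairs, where
    post_fields is a dict of field name -> value parsed from the server log.
    Only customers with at least one unexpected field are returned.
    """
    suspects = {}
    for customer_id, post_fields in log_entries:
        extra = unexpected_post_fields(post_fields, expected)
        if extra:
            suspects.setdefault(customer_id, set()).update(extra)
    return suspects


# Hypothetical sample data: the second request carries a field injected by malware.
sample_log = [
    ("cust-001", {"username": "a", "password": "b", "otp": "123456"}),
    ("cust-002", {"username": "c", "password": "d", "otp": "654321",
                  "card_pin": "0000"}),
]
suspects = infected_customers(sample_log)
print(suspects)
```

Dividing the number of suspect customers by the number of active customers in the same log gives the kind of infection percentage discussed in the analysis that follows.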
ANALYSIS
Analysis of real cases
Identifying the presence of these additional POST fields allows the percentage of
infected computers among the banks' customers to be estimated.
Between September and November 2008, by analysing the log files of some of
our client banks, we estimated that:
Approximately 0.5% of users are infected with Trojan.Silentbanker.
0.5% of transactions may potentially become fraudulent.
A further 0.5% appear to be infected by other Trojans that modify access pages
but which we were unable to identify.
Considering the country as a whole, we can calculate the following:
• Since there are between 5 and 10 million online banking accounts in Italy,
50,000 computers could potentially be infected.
• Fraud is normally committed by ordering three or more bank transfers for
variable amounts of between 2,000 and 5,000 €.
• The turnover of this kind of fraud is 300,000,000 €.
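The slide does not say how the 300,000,000 € figure was derived. One reading that reproduces it, assuming each of the 50,000 infected computers is defrauded once with three transfers at the lower bound of 2,000 €, is sketched below; the upper bound is shown for comparison.

```python
# Figures taken from the bullets above; the way they are combined is an assumption.
infected_computers = 50_000   # estimated infected computers in Italy
transfers_per_fraud = 3       # "three or more bank transfers"
min_amount_eur = 2_000        # lower bound of the 2,000-5,000 EUR range
max_amount_eur = 5_000        # upper bound of the same range

low = infected_computers * transfers_per_fraud * min_amount_eur
high = infected_computers * transfers_per_fraud * max_amount_eur

print(f"{low:,} EUR to {high:,} EUR")  # 300,000,000 EUR to 750,000,000 EUR
```

The 50,000 figure is itself consistent with either 1% of 5 million accounts or 0.5% of 10 million accounts.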
RAKE
The winning solution
Rake allows unusual behaviour by users to be identified by using clustering and
classification techniques that are specific to Data Mining; the same ones that are
employed by the fraud detection products implemented by major credit card
operators.
Automatic monitoring of current accounts and reporting of all unusual
movements based on data mining procedures aimed at identifying the
behavioural profile of each individual user
Automatic clustering of user behaviour: allows habitual behaviour to be
identified and anything that deviates from this to be recognised;
Historical analysis of typical user parameters, in order to reduce the presence
of false positives and increase the effectiveness of the tool itself.
This method is successful even in dealing with new kinds of misappropriation,
specifically because it is based on analysing user behaviour rather than knowing
the procedure by which fraud is committed.
Fraud which has already taken place and has known behavioural patterns can
be incorporated into a second assessment step for further verification.
How it works
Data collection stage
Loading and storage of transactions carried
out over a number of weeks, in order to provide
a minimum number of transactions.
Clustering Processes
Clusters are calculated by using only the
movements that relate to the last 6 months, so
that only "recent" behaviour is taken into account.
Using these movements, statistical clusters are
examined using the E.M. (Expectation
Maximisation) algorithm, after excluding any
Outliers (events that fall outside the clusters)
using other clustering algorithms (OPTICS LOF or
DENCLUE).
The clustering process is carried out every day,
after the logs have been obtained and processed,
and the results are saved on a second DB to
improve the performance of the system.
The clustering results are entered in a DB so
that comparisons can easily be drawn between
new transaction orders and pre-calculated
clusters and a weighting can therefore be
assigned to the transactions.
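As a rough illustration of the clustering step, the sketch below keeps only the movements from the last six months and fits a two-component Gaussian mixture to the amounts with a plain E.M. loop. It is a simplified, one-dimensional stand-in written in pure Python: the real product presumably works on more dimensions, and the preliminary outlier removal (OPTICS LOF or DENCLUE) and the storage of results in a DB are omitted. The function names, the 183-day window and the sample data are assumptions.

```python
import math
from datetime import date, timedelta


def recent(movements, today, days=183):
    """Keep only (date, amount) movements from roughly the last 6 months."""
    cutoff = today - timedelta(days=days)
    return [(d, a) for d, a in movements if d >= cutoff]


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_1d(values, k=2, iterations=50):
    """Fit a k-component 1D Gaussian mixture; returns (weight, mean, variance) tuples."""
    values = sorted(values)
    n = len(values)
    # Deterministic initialisation: spread the initial means over the sorted data.
    means = [values[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n or 1.0
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [w * normal_pdf(v, m, s)
                    for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty component: leave its parameters unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a zero variance
    return list(zip(weights, means, variances))


# Hypothetical account: small weekly payments plus larger transfers.
today = date(2009, 1, 31)
amounts = [48, 50, 52, 49, 51, 980, 1000, 1020, 990, 1010]
movements = [(today - timedelta(days=7 * i), a) for i, a in enumerate(amounts)]
movements.append((date(2008, 1, 15), 5000))  # older than 6 months: ignored
window = recent(movements, today)
clusters = em_1d([a for _, a in window])
for weight, mean, var in clusters:
    print(f"weight={weight:.2f} mean={mean:.1f} variance={var:.1f}")
```

Storing the resulting (weight, mean, variance) tuples is what makes it possible to score a new transaction later without re-running the clustering.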
Pattern Recognition Mechanisms
In order to identify successions of events which have previously led to
misappropriations
A series of misappropriations have taken place by means of a series of bank transfers, carried out on
consecutive days, for amounts that increase steadily by 1,000 € each time. This can be incorporated in the
fraud detection mechanisms, but a search has to be carried out first to find any events of the same nature in
the previous history, in order to determine whether they are truly "dangerous".
In fact, if the previous succession proved to be a very widespread event that is not connected with
attempted fraud, it would be pointless to search for it among new transactions.
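The example above (transfers on consecutive days rising by 1,000 € each time) and the check against previous history can be sketched as follows. The minimum run length of three transfers and the data layout are assumptions made for illustration.

```python
from datetime import date, timedelta


def has_escalating_run(transfers, step=1000, min_length=3):
    """True if the account shows min_length transfers on consecutive days,
    each exactly `step` EUR larger than the one before.

    transfers is a list of (date, amount) pairs.
    """
    ordered = sorted(transfers)
    run = 1
    for (d0, a0), (d1, a1) in zip(ordered, ordered[1:]):
        if d1 - d0 == timedelta(days=1) and a1 - a0 == step:
            run += 1
            if run >= min_length:
                return True
        else:
            run = 1
    return False


def pattern_prevalence(history_by_account, **kwargs):
    """Fraction of historical accounts whose movements match the pattern."""
    if not history_by_account:
        return 0.0
    hits = sum(has_escalating_run(t, **kwargs) for t in history_by_account.values())
    return hits / len(history_by_account)


# Hypothetical history: only account "A" shows the escalating pattern.
history = {
    "A": [(date(2008, 11, 3), 2000), (date(2008, 11, 4), 3000), (date(2008, 11, 5), 4000)],
    "B": [(date(2008, 11, 3), 500), (date(2008, 11, 10), 500)],
    "C": [(date(2008, 11, 3), 2000), (date(2008, 11, 4), 2000)],
    "D": [(date(2008, 11, 7), 1200)],
}
print(pattern_prevalence(history))  # 0.25
```

If the prevalence in legitimate history turned out to be high, the pattern would not be worth adding to the detection rules, which is the point made in the text.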
IP Georeferencing Mechanisms
In order to identify sudden changes in Internet Banking access locations (or
providers)
This feature is incorporated in the analysis of beneficiaries (and possibly userAgents) in order to screen out
false positives.
The system allows the user's IP address to be checked against a blacklist containing addresses used by
the TOR anonymising service (other anonymisers are due to be added at a later date).
Anonymiser search mechanisms
In order to identify transactions originating from IP address concealment
services
Intelligrate collects the IP addresses of the leading anonymisation services, such as TOR, updating
RAKE on a daily basis. This allows the origin of various transactions to be analysed, blocking or
blacklisting any that originate from the aforesaid services.
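A minimal sketch of such a check, using Python's standard `ipaddress` module, is shown below. The list format (one address or CIDR network per line) is an assumption, and the addresses come from the ranges reserved for documentation, not from any real anonymiser.

```python
import ipaddress


def load_blacklist(lines):
    """Parse one IP address or CIDR network per line, skipping blanks and comments."""
    networks = []
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_anonymised(ip, blacklist):
    """True if the client IP belongs to any blacklisted address or network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blacklist)


# Hypothetical daily list; real entries would come from the anonymiser feeds.
daily_list = [
    "# anonymiser exit nodes (example data)",
    "192.0.2.10",
    "198.51.100.0/24",
    "",
]
blacklist = load_blacklist(daily_list)
print(is_anonymised("198.51.100.7", blacklist))  # True
print(is_anonymised("203.0.113.5", blacklist))   # False
```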
Whitelists and Blacklists
In order to allow the separate management of specific current accounts if
necessary
RAKE allows specific accounts to be added to whitelists if they are to be completely excluded from any
checks, while allowing any particularly suspicious bank details, telephone numbers or the numbers of
reloadable cards to be added to blacklists...
Various different mechanisms are used to assess the "deviation" between
an individual movement and normal behaviour:
Cluster EM (Expectation Maximization)
Assessment of whether the movement comes within the combination of normal
distributions identified by E.M. clustering (with different weightings depending on
the degree of deviation from the cluster average and edges).
This assessment is made without any reassessment of the clusters.
Cluster 2D-GridClustering
Geometric assessment of the deviation from existing clusters using a
two-dimensional version of GridClustering algorithms.
This assessment is also carried out without recalculating the clusters.
OPTICS Local Outlier Detection
Calculation of the deviations from the clusters using OPTICS to determine whether
the individual event comes within the cluster or can be identified as an Outlier.
By applying the three mechanisms, a "Minority Report" policy can be adopted to
report the anomaly and call the user, if necessary, only when three positives occur.
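The policy amounts to a unanimous vote among the three detectors. A minimal sketch, with hypothetical names and a configurable number of required positives:

```python
def minority_report(em_positive, grid_positive, optics_positive, required=3):
    """Report an anomaly only when at least `required` detectors flag the movement.

    With the default required=3, all three mechanisms must agree.
    """
    votes = sum([bool(em_positive), bool(grid_positive), bool(optics_positive)])
    return votes >= required


print(minority_report(True, True, True))    # True: all three agree, call the user
print(minority_report(True, True, False))   # False: only two positives
```

Lowering `required` to 2 would trade more false positives for earlier warnings; the slide only describes the three-positive variant.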
Application Modes
Rake is available in two different application modes which can be chosen with
the bank based on the technical features of the e-banking service, the way in
which customers interact with it and the need to ensure the promptness of
transactions.
Online Mode
RAKE is connected directly to the e-banking application. For each transaction,
RAKE is sent a string containing all the transaction data and returns a weighting of
between 0 (transaction OK) and 10 (transaction with a very high probability of being
fraudulent). The e-banking application can ask the customer an additional question
to verify authenticity or send a confirmation text message.
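From the e-banking application's side, the Online Mode exchange could look like the sketch below. The `key=value;key=value` string format, the field names and the two thresholds are all hypothetical; the slides specify only that a transaction string goes in and a weighting between 0 and 10 comes back.

```python
def parse_transaction(raw):
    """Parse a hypothetical 'key=value;key=value' transaction string into a dict."""
    return dict(item.split("=", 1) for item in raw.split(";") if item)


def choose_action(weight, challenge_at=4, confirm_at=7):
    """Map a 0-10 weighting to an action; both thresholds are assumptions."""
    if not 0 <= weight <= 10:
        raise ValueError("weight must be between 0 and 10")
    if weight >= confirm_at:
        return "send_confirmation_sms"
    if weight >= challenge_at:
        return "ask_security_question"
    return "accept"


tx = parse_transaction("account=IT001;amount=2500.00;beneficiary=DE002")
print(tx["amount"])          # 2500.00
print(choose_action(0))      # accept
print(choose_action(5))      # ask_security_question
print(choose_action(9))      # send_confirmation_sms
```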
Offline Mode
Every evening, the day's transactions are assessed and a report is produced
containing reports of any events that are probably fraudulent. This can be sent to
the helpdesk for telephone verification of the most suspicious ones.
Online Application Mode
The Online Mode requires the transaction weighting to be returned within a
maximum of 1 second.
This requirement makes it impossible to carry out clustering using all the
available tools and therefore requires the use of EM clusters, which provide a
geometric representation of the result, and GridBased clusters, which allow one
to determine quickly whether each new event comes within a cluster.
With EM clustering, new transaction orders are therefore compared to the
ellipses representing the clusters and are given an increasingly negative
weighting the further they are from the centre of the clusters.
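One common way to express "distance from the centre of an ellipse" is the Mahalanobis distance to each cluster's mean and covariance. The sketch below uses it to produce a 0 to 10 weighting; the linear mapping, the `cap` parameter and the choice of features (amount and hour of day) are assumptions, not details taken from the product.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2D point from a cluster (mean, 2x2 covariance)."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    return math.sqrt((dx * dx * syy - 2 * dx * dy * sxy + dy * dy * sxx) / det)


def em_weight(point, clusters, cap=5.0):
    """0 at a cluster centre, rising linearly to 10 at `cap` distance units
    from the nearest cluster (hypothetical mapping)."""
    nearest = min(mahalanobis_2d(point, mean, cov) for mean, cov in clusters)
    return min(10.0, 10.0 * nearest / cap)


# Hypothetical pre-calculated cluster: amounts around 100 EUR, ordered around noon.
clusters = [((100.0, 12.0), ((400.0, 0.0), (0.0, 4.0)))]
print(em_weight((100.0, 12.0), clusters))   # 0.0
print(em_weight((140.0, 12.0), clusters))   # 4.0
print(em_weight((1000.0, 12.0), clusters))  # 10.0
```

Because the clusters are pre-calculated, scoring a new order is a handful of arithmetic operations, which fits the one-second limit.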
2D-GC clustering determines whether new transaction orders fall within cells that
already belong to a cluster or if their presence turns groups of transactions into
clusters that were not previously clusters.
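The grid-based check can be illustrated with a simplified model in which each grid cell is treated on its own and counts as a cluster once it holds a minimum number of points. Full grid clustering algorithms also merge adjacent dense cells; that step is left out here, and the cell sizes, feature choice and labels are hypothetical.

```python
from collections import Counter


def cell_of(point, cell_size):
    """Grid cell (column, row) that a 2D point falls into."""
    return (int(point[0] // cell_size[0]), int(point[1] // cell_size[1]))


def classify(point, history, cell_size, min_points=3):
    """Classify a new transaction against the cells occupied by past ones.

    'in_cluster'    : the cell already holds at least min_points transactions
    'forms_cluster' : the new transaction brings the cell up to min_points
    'isolated'      : the cell stays below min_points even with the new one
    """
    counts = Counter(cell_of(p, cell_size) for p in history)
    existing = counts[cell_of(point, cell_size)]
    if existing >= min_points:
        return "in_cluster"
    if existing + 1 >= min_points:
        return "forms_cluster"
    return "isolated"


# Hypothetical history of (amount in EUR, hour of day) pairs.
history = [(10, 9), (12, 10), (15, 9), (210, 9), (220, 10), (900, 3)]
cell_size = (50, 6)
print(classify((20, 11), history, cell_size))   # in_cluster
print(classify((230, 8), history, cell_size))   # forms_cluster
print(classify((905, 2), history, cell_size))   # isolated
```

The `min_points` parameter plays the role of the "points per cluster" setting.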
Figures (images not reproduced; each of the following appears on two slides):
– EM Clustering – Before the Outlier Search
– EM Clustering excluding Outliers
– 2D-GC Clustering with 2 points per cluster
– 2D-GC Clustering with 3 points per cluster
OFFLINE Application Mode
The Offline Mode does not require short calculation times to be respected.
A complete analysis can therefore be carried out using the three methods
previously described, returning more accurate "probability of fraudulent event"
scores.
The result is a daily report that allows the relevant departments to investigate
any particularly suspicious events.
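A daily report of this kind could be produced along the lines of the sketch below, which writes the most suspicious transactions to CSV, highest score first. The column names, the reporting threshold and the reuse of the 0 to 10 scale from the Online Mode are assumptions.

```python
import csv
import io


def daily_report(scored, threshold=7):
    """Return CSV text listing transactions at or above the threshold.

    scored is an iterable of (transaction_id, account, weight) tuples;
    rows are sorted with the most suspicious first.
    """
    rows = sorted((r for r in scored if r[2] >= threshold),
                  key=lambda r: r[2], reverse=True)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["transaction_id", "account", "weight"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical scores for one day's transactions.
scored = [("t1", "A", 2), ("t2", "B", 9), ("t3", "C", 7)]
report = daily_report(scored)
print(report)
```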
Figures (images not reproduced; each of the following appears on two slides):
– Search for Outliers
– Identified Outliers
Installation and configuration
Rake consists of three modules:
The clusteriser, which takes into account the history of each account and the
list of movements, the beneficiaries, the user-agents and the IP/connection
providers
The Database which, in addition to storing movements and clusters, uses
Stored Procedures to produce a daily report on suspected misappropriations
The client, which provides real time responses on the degree of reliability of
each transaction.
The three modules can be implemented on a single machine or split between
different machines to improve performance.
The client component can be replicated and integrated with load sharing
equipment.
The DB component is implemented in MySQL or Oracle 11g and can be
integrated into DBs supplied by the customer.
Rake is supplied either as an application to be installed on Unix machines,
whether Solaris or Linux RedHat, or as a Virtual Appliance for VMware:
CentOS + MySQL
CentOS + Oracle
Solaris 10 + Oracle
Solaris 10 + MySQL
Rake can be integrated with customer applications and supplied with ad hoc
communication modules if adaptation to specific Internet Banking
environments is required.
Rake can easily be administered via a Web interface.
The reports can be displayed via the Web or downloaded in CSV or XLS
format.
If requested by the customer, the reports can be displayed by a Web
Service.
Screenshots of the administration and reporting displays are shown
below.
[Screenshots not reproduced: Configuration (four slides), Reports (two slides)]
THANK YOU!
INTELLIGRATE srl
Via XII OTTOBRE 2/92
16121 GENOVA
ITALY
Tel.: +39 0105954161
Fax: +39 010586753
Email: [email protected]
Web: www.intelligrate.it