Project in Computer Security
236349
Michael Kuritzky
306741638
Guy Cepelevich
304045305
Honeypot data analysis
Final Report
Supervisor: Amichai Shulman
Overview
From Wikipedia: In computer terminology, a honeypot is a trap set to detect, deflect, or in
some manner counteract attempts at unauthorized use of information systems. Generally it
consists of a computer, data, or a network site that appears to be part of a network, but is
actually isolated, (un)protected, and monitored, and which seems to contain information or a
resource of value to attackers.
In our case, the honeypot is a remote machine running an anonymizing proxy service, which
keeps logs of all the requests passed through it.
Project goals and development order:
1. Set up and activate the honeypot.
2. Create a database, parse the honeypot’s logs and inject them into the DB.
3. Create a system that provides an easy-to-use interface to view and classify the log
entries.
4. Add a rule system for automatic classification of both new (future) and existing entries.
5. Add a rule management system for rule activation, deactivation and deletion.
Implementation details:

We used Privoxy (open-source; available from http://www.privoxy.org ) to provide the
anonymizing proxy service.

The proxy runs on a remote Amazon EC2 (Elastic Compute Cloud) Instance.

The logs are stored on an Amazon EBS (Elastic Block Store) volume attached to the
Instance.

The logs are downloaded from the EBS drive, parsed and stored in a MySQL database.
We store the database locally because the EC2 instance tends to be very slow.

We chose to use MySQL to store the database for the following reasons:
o Very common
o Free
o Easy to access
o Existing management tools
o Easy to write rules on the entries
o Developers’ experience 

The system consists of two parts – the PrivoxyLogParser and PrivoxyLogViewer.

Both were written in Java, for the following reasons:
o Free
o Developers’ experience 
o Extensive integrated and third-party library support (JDBC for database
connections, Swing for GUI development)

In order to avoid memory problems (most logs are around 10MB), the parser goes over
the logs line-by-line, parses them and calls its listeners each time it finishes parsing a
log entry.

A listener connected to the parser inserts the new log entries into the database.
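
The line-by-line parsing and listener notification described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the interface LogEntryListener and the class StreamingLogParser are hypothetical names, and the sketch assumes that every non-blank line is one complete entry, which is simpler than real Privoxy logs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface; the project's IPrivoxyLogListener may differ.
interface LogEntryListener {
    void entryParsed(String entry);
}

class StreamingLogParser {
    private final List<LogEntryListener> listeners = new ArrayList<>();

    void addListener(LogEntryListener listener) {
        listeners.add(listener);
    }

    // Reads the source one line at a time, so memory use does not grow with log size.
    // Assumption for brevity: every non-blank line is one complete entry.
    int parse(Reader source) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                for (LogEntryListener listener : listeners) {
                    listener.entryParsed(line);
                }
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        StreamingLogParser parser = new StreamingLogParser();
        List<String> received = new ArrayList<>();
        parser.addListener(received::add); // stand-in for a database-inserting listener
        int parsed = parser.parse(new StringReader("first entry\n\nsecond entry\n"));
        System.out.println(parsed + " entries: " + received);
    }
}
```

In the real system the listener would insert each entry into the database instead of collecting it in a list, so the parser itself never holds more than the current line.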

We use two methods to classify the entries:
o Interest:

True indicates that the entries are interesting (e.g. pertaining to some
known attack pattern).

False indicates that the entries are not interesting, i.e. representing
regular internet usage activity.

Null indicates that the entries have not been classified yet. The user can
either classify them manually from the Entries Panel (see below), or create
a Rule to classify them automatically (see below).
o Tags:

Each entry can be associated with one or more tags (e.g. PORN, SQL
INJECTION, CLICK FORGERY) for easier browsing, grouping and
filtering.
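
Because Interest has three states (True, False, Null), a filter on it cannot use a plain equality test for the unclassified case. The sketch below shows one possible way to build the condition against the IS_INTERESTING column of the master table from Appendix A. The class name InterestFilter is hypothetical and is not part of the project.

```java
class InterestFilter {
    // Builds the SQL condition for one Interest value: TRUE, FALSE, or null (unclassified).
    static String condition(Boolean interest) {
        if (interest == null) {
            return "IS_INTERESTING IS NULL"; // "= NULL" never matches in SQL
        }
        return interest ? "IS_INTERESTING = 1" : "IS_INTERESTING = 0";
    }

    public static void main(String[] args) {
        for (Boolean value : new Boolean[] {Boolean.TRUE, Boolean.FALSE, null}) {
            System.out.println(
                "SELECT ENTRY_ID FROM hagana_proj.master WHERE " + condition(value));
        }
    }
}
```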

The viewer consists of three panels (more detailed documentation below):
o Entries Panel:

Provides convenient display of entries from the DB.

Allows on-the-spot manipulation of the entries.

Allows the user to conveniently load more logs into the DB.
o Rule Editing Panel:

Provides an interface for creating rules for automatic entry manipulation.
o Rule Management Panel:

Provides an interface for activating, deactivating or deleting existing rules.

Allows the user to view all entries matching a rule.

Allows the user to “load” a rule for editing, or use it as a template for a new
(perhaps more refined) rule.
Setting up the system

If you wish to set up a new “log-producing” instance:
o Download and install Privoxy from
http://sourceforge.net/projects/ijbswa/files/
o Modify Privoxy’s config file – set it to write logs into your preferred dir.
o Add the backupLog.bat script in Appendix B (also attached to the submission) to
Windows’ Task Scheduler, and set it to run at regular intervals (preferably once
an hour – approx. 10MB of log data are generated each hour, from our
observation). Don’t forget to modify the script to suit the file/folder locations on
your instance!
o Set up a way to copy the log files to another machine (ClipBook service,
rdpclip.exe, ftp server, etc.).

If you don’t have it yet, download, install and run MySQL Server from
http://www.mysql.com/downloads/mysql/ .

Run the schema creation script in Appendix A (also attached to the submission).
NOTE: the CREATE DATABASE statement fails if the database already exists. If, for some
reason, you need to run the script again, first delete the "hagana_proj" database from
your MySQL server (if it's there).

If you don’t have it yet, download and install the Java Runtime Environment from
http://www.java.com/en/download/manual.jsp
(or Java Development Kit, which contains the JRE, from
http://www.oracle.com/technetwork/java/javase/downloads/index.html ).

If you don’t have it yet, download and install the latest version of Eclipse.

In Eclipse, Import the two projects, PrivoxyLogParser and PrivoxyLogViewer.

In the PrivoxyLogParser project, go to the file messages.properties (in the res
folder), and change the values there to fit your MySQL server properties (server
address, username and password).

Download some logs from the EC2 Instance.

Run the PrivoxyLogViewer (the file to run there is MainFrame.java) and use the Load
Logs button in the Entries panel (see detailed documentation below) to load the
downloaded Privoxy logs into the system.

That’s it – you’re ready to view the logs! 
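
As an illustration of the messages.properties step above, the following sketch shows how connection settings read from a properties file could be turned into a MySQL JDBC URL. The key name db.server, the default localhost:3306 and the class DbSettings are assumptions made for this example; the actual keys are the ones already present in the project's messages.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class DbSettings {
    // Hypothetical property key; use whatever keys messages.properties defines.
    static final String SERVER_KEY = "db.server";

    // Builds the JDBC URL for the hagana_proj database on the configured server.
    static String jdbcUrl(Properties settings) {
        String server = settings.getProperty(SERVER_KEY, "localhost:3306");
        return "jdbc:mysql://" + server + "/hagana_proj";
    }

    public static void main(String[] args) throws IOException {
        Properties settings = new Properties();
        // Inline sample content standing in for the real properties file.
        settings.load(new StringReader("db.server=localhost:3306\ndb.user=root\n"));
        System.out.println(jdbcUrl(settings));
    }
}
```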
GUI Documentation

Entries Panel
o Screenshot: [Entries Panel screenshot not reproduced in this text version]
o Load Logs – opens a file chooser which allows the user to choose privoxy log
files to parse (multiple choice is supported). Once the logs are parsed, all the
active rules will be applied to the entries.
o Reset – resets the panel to its initial state (i.e. view all entries).
o Create Rule From Selection – uses the currently selected cells as a template
for a new rule (OR between all the values; see Rule Editing Panel documentation
below).
o Prev Page, Next Page – scrolls between pages of the current view (we can’t
load all the entries at once because of memory concerns, so we load 200 at a
time).
o Entry table – displays the main columns of the entries. Click on a single cell to
view its content in the Selected cell field, and view the Request Headers and
Response Headers associated with that entry. Select a cell or range of cells to
view and/or modify the tags associated with them, and/or modify the interest
associated with them.
o Filter entries by interest – the user can choose to view only entries with a
certain Interest value.
o Change interest of selected rows to – the user can change the Interest level of
all selected rows at once.
o Tags – the user can assign one or more tags to the selected entries (use the >
and < buttons to move tags between the Applied and Unapplied lists). Also, the
user can add new tags to the system (using the + button) or delete ones from it
(using the Delete tag button).
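
The paging behind Prev Page and Next Page (200 entries at a time) could be implemented with a LIMIT/OFFSET query, as in the sketch below. This is an assumption about one reasonable approach, not a description of the viewer's actual query; the class EntryPager is a hypothetical name.

```java
class EntryPager {
    static final int PAGE_SIZE = 200; // page size stated in the documentation

    // pageIndex is zero-based: page 0 holds the first 200 entries.
    static String pageQuery(int pageIndex) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0");
        }
        long offset = (long) pageIndex * PAGE_SIZE;
        // A stable ORDER BY keeps pages from overlapping between requests.
        return "SELECT * FROM hagana_proj.master ORDER BY ENTRY_ID LIMIT "
                + PAGE_SIZE + " OFFSET " + offset;
    }

    public static void main(String[] args) {
        System.out.println(pageQuery(0));
        System.out.println(pageQuery(2));
    }
}
```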

Rule Editing Panel
o Screenshot: [Rule Editing Panel screenshot not reproduced in this text version]
o Rule name – must be unique and up to 50 chars long. If a rule with the same name
already exists, a dialog will appear upon clicking Save, asking the user whether to
overwrite it.
o Rule description – a short textual description of the purpose of the rule.
o SQL Statement – the main part of the rule. Here, the user writes an SQL
statement that selects the entries of interest and returns their entry_id
(note the static “header” of the query – the statement must return a list of
entry_id values!). Use the DB Schema button below to show a reminder of the
relevant part of the database schema.
o Actions – the actions to take upon the entries matching the rule. Note that these
actions (and especially the Delete entries one) CANNOT be undone!
The actions are:

Delete Entries

Change interest level

Apply tags

Unapply tags
o Upon clicking the Save button, the user will be asked whether to apply the rule
now, or just save it for the future entries. Note that applying rules can take a
while!
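
To make the rule mechanism more concrete, here is a minimal sketch of how a rule with the Change interest level action might be applied over JDBC. It is not the project's implementation. The query header text is an assumption (only the alias m is taken from the sample rule in this report), and RuleApplier and its methods are hypothetical names. The sketch reads the matching ids first and updates them afterwards, which avoids selecting from the master table inside the same UPDATE statement, something MySQL restricts.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;

class RuleApplier {
    // Assumed static query header; the alias "m" matches the sample rule in this report.
    static final String HEADER = "SELECT m.ENTRY_ID FROM hagana_proj.master m ";

    // Joins the fixed header with the clause typed by the user.
    static String fullQuery(String userClause) {
        return HEADER + userClause.trim();
    }

    // Runs the rule's query and collects the ids of all matching entries.
    static List<Long> matchingIds(Connection db, String userClause) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (Statement statement = db.createStatement();
             ResultSet rows = statement.executeQuery(fullQuery(userClause))) {
            while (rows.next()) {
                ids.add(rows.getLong(1));
            }
        }
        return ids;
    }

    // Sets the interest of the given entries; null marks them as unclassified.
    static void setInterest(Connection db, List<Long> ids, Boolean interest)
            throws SQLException {
        String sql = "UPDATE hagana_proj.master SET IS_INTERESTING = ? WHERE ENTRY_ID = ?";
        try (PreparedStatement update = db.prepareStatement(sql)) {
            for (long id : ids) {
                if (interest == null) {
                    update.setNull(1, Types.TINYINT);
                } else {
                    update.setBoolean(1, interest);
                }
                update.setLong(2, id);
                update.addBatch();
            }
            update.executeBatch();
        }
    }

    public static void main(String[] args) {
        // No database needed here: just show the query a rule would run.
        System.out.println(fullQuery("WHERE m.HOST LIKE '%apmebf%'"));
    }
}
```

With an open Connection, a caller would pass the result of matchingIds to setInterest. Since the rule text is executed as SQL, rules should only be written by trusted users.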

Rule Management Panel
o Screenshot: [Rule Management Panel screenshot not reproduced in this text version]
o Selected Rule Description – click on a rule to view its description in this field.
o Active rules – these rules will be applied to any newly parsed logs.
o Inactive rules – these rules are kept, but are not applied to newly parsed logs until
the user decides to make them active again.
o Use the <, > buttons to move rules between the Active and Inactive lists.
o Delete selected rule – does just that.
o Use selected rule as template to a new rule – opens the rule in the Rule
Editing Panel, where the user can either overwrite it or use it to create a new
rule.
o View entries related to the selected rule – opens the entries (if there are any)
“captured” by the rule in the Entries Panel.
Class Diagram
[Class diagram figure not reproduced; the classes and interfaces it shows are listed below.]
PrivoxyLogParser, IPrivoxyLogListener, IPrivoxyLogEntry, DBInjectLogListener,
PrivoxyLogEntry, MainFrame, Rule, EntriesPanel, RuleEditingPanel,
RuleManagementPanel, EntryTable, AvailableTagsList, ActiveRulesList,
EntryTableModel, IVarLengthTagList, InactiveRulesList, HeadersTable,
VarLengthTagList, TagsAvailableList, TagsNotAvailableList, VarLengthTagListModel
Sample Results
These sample results come from analyzing about 90,000 log entries, which are about 1.5 days
of traffic on the honeypot.

A (very) simple rule to catch porn:
o The rule itself:
o And the results (about 1000 entries; more can probably be caught with more
sophisticated queries):
Interestingly enough, most requests come from the 78.159.125.0 subnet…

Suspicious user-agents
o This rule catches requests that supposedly come from computers with the MS
Windows 98 operating system. Since we’re in 2010 now, this seems quite
suspicious…
o The rule itself:
o And the results – about 9000(!) entries!

A quick look at the results of the Win98 rule revealed many requests to the site
rd.apmebf.com. Besides the suspicious-looking domain name, all of these requests have
&host=media.fastclick.net at the end of the URL.
We therefore wrote a trivial rule (“WHERE m.host LIKE '%apmebf%'”) to catch all the
apmebf requests and assign a tag to them. In a smaller database (about 19,000 entries),
this rule caught about 600 entries, arriving in batches of several dozen requests that
often asked for the same page.
A Google search for the word apmebf suggests that it is a malicious site, blocked by
many firewalls.
Future work

Make the SQL queries more efficient (currently we have a problem dealing with
databases with a large number of entries).

Make the user defined queries more structured and guided.

Support for creating automatic queries from multiple selection from the entries table.

Support for reconstruction and “replay” of requests.
Appendix A – Database creation script
CREATE DATABASE `hagana_proj` /*!40100 DEFAULT CHARACTER SET latin1 */;
DROP TABLE IF EXISTS `hagana_proj`.`active_rules`;
CREATE TABLE
`hagana_proj`.`active_rules` (
`RULE_NAME` char(50) NOT NULL DEFAULT 'NEW_RULE',
PRIMARY KEY (`RULE_NAME`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`associated_tags`;
CREATE TABLE
`hagana_proj`.`associated_tags` (
`ENTRY_ID` bigint(20) unsigned NOT NULL,
`TAG` varchar(50) NOT NULL,
PRIMARY KEY (`ENTRY_ID`,`TAG`) USING BTREE,
KEY `Tag_Index` (`TAG`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`master`;
CREATE TABLE
`hagana_proj`.`master` (
`ENTRY_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`SOURCE_IP` char(15) NOT NULL,
`COUNTRY` varchar(15) DEFAULT NULL,
`DESTINATION_URL` varchar(3000) DEFAULT NULL,
`HTTP_METHOD` char(5) NOT NULL,
`REQUESTED_PAGE` varchar(3000) DEFAULT NULL,
`PROTOCOL` char(15) DEFAULT NULL,
`PROTOCOL_VERSION` double DEFAULT NULL,
`DECISION` char(5) NOT NULL,
`RESPONSE_CODE` int(10) unsigned DEFAULT NULL,
`PRIVOXY_COMMENT_CODE` int(10) unsigned DEFAULT NULL,
`TIMESTAMP` datetime NOT NULL,
`IS_INTERESTING` tinyint(1) DEFAULT NULL,
`HOST` varchar(3000) DEFAULT NULL,
PRIMARY KEY (`ENTRY_ID`) USING BTREE,
KEY `IP_index` (`SOURCE_IP`),
KEY `Interest_Index` (`IS_INTERESTING`)
) ENGINE=InnoDB AUTO_INCREMENT=153623 DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`request_headers`;
CREATE TABLE
`hagana_proj`.`request_headers` (
`ENTRY_ID` bigint(20) unsigned NOT NULL,
`REQ_HEADER_NAME` varchar(100) NOT NULL,
`REQ_HEADER_CONTENT` varchar(3000) DEFAULT NULL,
KEY `Header_Id_Index` (`ENTRY_ID`) USING BTREE,
KEY `Header_Name_Index` (`REQ_HEADER_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`response_headers`;
CREATE TABLE
`hagana_proj`.`response_headers` (
`ENTRY_ID` bigint(20) unsigned NOT NULL,
`RES_HEADER_NAME` varchar(100) NOT NULL,
`RES_HEADER_CONTENT` varchar(3000) DEFAULT NULL,
KEY `Header_Id_Index` (`ENTRY_ID`) USING BTREE,
KEY `Header_Name_Index` (`RES_HEADER_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`rules_list`;
CREATE TABLE
`hagana_proj`.`rules_list` (
`RULE_NAME` char(50) NOT NULL DEFAULT 'NEW_RULE',
`DESCRIPTION` varchar(3000) DEFAULT NULL,
`SQL_QUERY` varchar(5000) NOT NULL,
`TO_DELETE_ENTRIES` smallint(1) unsigned NOT NULL DEFAULT '0',
`TO_CHANGE_INTEREST` smallint(1) unsigned NOT NULL DEFAULT '0',
`CHANGE_INTEREST_TO` smallint(1) unsigned DEFAULT NULL,
PRIMARY KEY (`RULE_NAME`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`tag_list`;
CREATE TABLE
`hagana_proj`.`tag_list` (
`TAG` varchar(50) NOT NULL,
PRIMARY KEY (`TAG`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`tags_associated_in_rules`;
CREATE TABLE
`hagana_proj`.`tags_associated_in_rules` (
`RULE_NAME` char(50) NOT NULL,
`TAG` varchar(50) NOT NULL,
PRIMARY KEY (`RULE_NAME`,`TAG`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP TABLE IF EXISTS `hagana_proj`.`tags_removed_in_rules`;
CREATE TABLE
`hagana_proj`.`tags_removed_in_rules` (
`RULE_NAME` char(50) NOT NULL,
`TAG` varchar(50) NOT NULL,
PRIMARY KEY (`RULE_NAME`,`TAG`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Appendix B – log backup script
@echo off
:: Variables
set copyCmd=copy /y
set delCmd=del /f
set startCmd=start
set killCmd=taskkill /f /im
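:: NOTE: %date% is locale-dependent; these offsets assume a "Ddd DD/MM/YYYY" format
set day=%date:~4,2%
set month=%date:~7,2%
set year=%date:~10,4%
set hour=%time:~0,2%
:: Before 10:00 the hour may have a leading space, which would break the file name
set hour=%hour: =0%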
set minute=%time:~3,2%
set timestamp=%day%_%month%_%year%_%hour%_%minute%
set srcDir=E:\PrivoxyLogs
set srcFile=%srcDir%\privoxy.log
set dstDir=%srcDir%\LogBackups
set dstFile=%dstDir%\privoxy_%timestamp%.log
set privoxyDir=D:\Privoxy-students
set privoxyExe=%privoxyDir%\privoxy.exe
::
:: Kill privoxy so that it won't bother the backup process
%killCmd% privoxy.exe
:: Copy the old log file
%copyCmd% %srcFile% %dstFile%
:: Delete the old log file
%delCmd% %srcFile%
:: Create a new empty file instead
echo. 2>%srcFile%
:: Start privoxy back up
%startCmd% %privoxyExe%
::
echo Backup done!