Project Recommended by United Technologies Corporation Research Center
Transaction Data Analysis
Transactional data are characterized by discrete “events” occurring at specific times.
Each event corresponds to a change in the state of the system caused by a transaction.
Examples are the events in a financial market where stocks are bought or sold, or the
events in a physical access control system where access is granted or denied at various
entry/exit points. Generally, more than one event is generated at each time instant,
and each event carries its own attributes. For example, in a financial market an event
is “buy” and its attributes include the name of the stock, the company or person who
bought it, the number of shares, and the current value. For such data it is generally
interesting to find patterns from which one can then develop models for purposes such
as predictive analytics or anomaly detection.
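To make the event/attribute structure concrete, the following is a minimal sketch of one way such data could be represented. The `Event` class, its field names, and the sample values are hypothetical and chosen only for illustration; the project description does not prescribe any format.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """One transactional event (hypothetical layout, for illustration only)."""
    timestamp: int                      # time instant at which the event occurred
    kind: str                           # event type, e.g. "buy" or "sell"
    attributes: dict = field(default_factory=dict)  # free-form attributes


def group_by_time(events):
    """Group events by timestamp, since several events can share one instant."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.timestamp].append(event)
    return dict(grouped)


# Made-up sample data: two events at time 1, one event at time 2.
events = [
    Event(1, "buy", {"stock": "ABC", "buyer": "Alice", "amount": 100}),
    Event(1, "sell", {"stock": "XYZ", "seller": "Bob", "amount": 50}),
    Event(2, "buy", {"stock": "ABC", "buyer": "Carol", "amount": 10}),
]
by_time = group_by_time(events)
```

Keeping the attributes in a dictionary lets different event types carry different fields, which matches the observation that events generated at the same instant can have different attributes.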
Unlike time series obtained by sampling a continuous-time process, such as the position
of a vehicle on a road network or the temperature in a room, transaction data can be
thought of as samples from a system whose underlying behavior follows state machines,
rules, and policies that are better described by logic statements.
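As an illustration of the state-machine view, the sketch below checks an access control trace against a small transition table and reports the events the table does not allow. The states, action names, and the policy itself are assumptions made up for this example, not part of the project description.

```python
# Hypothetical policy: (current state, action) -> next state.
# Any (state, action) pair missing from the table is treated as a violation.
TRANSITIONS = {
    ("outside", "enter_granted"): "inside",
    ("outside", "enter_denied"): "outside",
    ("inside", "exit_granted"): "outside",
    ("inside", "exit_denied"): "inside",
}


def find_violations(events, initial_state="outside"):
    """Return the events that the transition table does not allow.

    `events` is an iterable of (timestamp, person, action) tuples in time order.
    Each person is tracked independently and starts in `initial_state`.
    """
    state = {}
    violations = []
    for timestamp, person, action in events:
        current = state.get(person, initial_state)
        next_state = TRANSITIONS.get((current, action))
        if next_state is None:
            violations.append((timestamp, person, action))
        else:
            state[person] = next_state
    return violations


# Made-up trace for illustration.
trace = [
    (1, "alice", "enter_granted"),
    (2, "bob", "exit_granted"),    # bob was never inside
    (3, "alice", "exit_granted"),
    (4, "alice", "exit_granted"),  # alice is already outside
]
violations = find_violations(trace)
```

A violating event leaves the person's state unchanged in this sketch; a real system might instead reset the state or raise an alarm, depending on the policy.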
In this context, the project aims to review the state of the art in transactional data
analysis and, if possible, to apply standard methods such as clustering, pattern
recognition/discovery, or anomaly detection (e.g. spam filters) to this type of data.
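One very simple form of anomaly detection on event data is to flag event types that occur rarely. The sketch below is only a baseline under that assumption; the function name `rare_events`, the threshold `max_share`, and the sample log are hypothetical, and the project may well call for more capable methods.

```python
from collections import Counter


def rare_events(events, max_share=0.1):
    """Return, in sorted order, the event types whose relative frequency
    is at most `max_share` of all observed events."""
    counts = Counter(events)
    total = sum(counts.values())
    return sorted(kind for kind, n in counts.items() if n / total <= max_share)


# Made-up access log: 18 grants, 1 denial, 1 forced door opening.
log = ["granted"] * 18 + ["denied"] + ["forced_open"]
flagged = rare_events(log)
```

Frequency alone ignores the order and timing of events, so a baseline like this would typically be combined with sequence-aware checks.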
Possible freely available data sets that can be used for the project are:
• Wikipedia access traces http://www.wikibench.eu/?page_id=60
• Wikipedia Page Traffic Statistic V3 http://aws.amazon.com/datasets/6025882142118545
• Federal Contracts from the Federal Procurement Data Center (USASpending.gov) http://aws.amazon.com/datasets/2406
• Common Crawl http://commoncrawl.org/the-data/