Download 24012017174656__Privacy-Preserving Outsourced Association

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Cluster analysis wikipedia , lookup

Nonlinear dimensionality reduction wikipedia , lookup

Transcript
Privacy-Preserving Outsourced Association Rule
Mining on Vertically Partitioned Databases
Abstract
Association rule mining and frequent itemset mining are two popular and
widely studied data analysis techniques for a range of applications. In this paper,
we focus on privacy preserving mining on vertically partitioned databases. In such
a scenario, data owners wish to learn the association rules or frequent itemsets
from a collective dataset, and disclose as little information about their (sensitive)
raw data as possible to other data owners and third parties. To ensure data privacy,
we design an efficient homomorphic encryption scheme and a secure comparison
scheme. We then propose a cloud-aided frequent itemset mining solution, which is
used to build an association rule mining solution. Our solutions are designed for
outsourced databases that allow multiple data owners to efficiently share their data
securely without compromising on data privacy. Our solutions leak less
information about the raw data than most existing solutions. In comparison to the
only known solution achieving a similar privacy level as our proposed solutions,
the performance of our proposed solutions is 3 to 5 orders of magnitude higher.
Based on our experiment findings using different parameters and datasets, we
demonstrate that the run time in each of our solutions is only one order higher than
that in the best non-privacy-preserving data mining algorithms. Since both data and
computing work are outsourced to the cloud servers, the resource consumption at
the data owner end is very low.
Architecture:
SYSTEM ANALYSIS
Existing System:
Our solutions leak less information about the raw data than most existing
solutions. Our solutions achieve a higher privacy level, as most existing solutions
require either the sharing / exposure of raw data or the disclosure of the exact
supports to data owners. Such requirements result in the leakage of sensitive
information of the raw data. To the best of our knowledge, the only existing
privacy- preserving solution that does not leak sensitive information of the raw
data is one of frequent itemset mining solutions (hereafter referred to as strong
solution” in this paper). This solution cannot be used for association rule mining.
All existing solutions, with the exception of utilize a third-party server to compute
the mining result. Some solutions use asymmetric homomorphic encryption to
compute the supports of itemsets.
PROPOSED SYSTEM:
Proposed a solution to counter frequency analysis attack on substitution
cipher. However, a later work
demonstrated that
solution is not secure.
Proposed a solution based on k-anonymity frequency . To counter frequency
analysis attack, the data owner inserts fictitious transactions in the encrypted
database to conceal the item frequency.
A privacy-preserving outsourced association rule mining solution based on
predicate encryption. This solution is resilient to chosen-plaintext attacks on
encrypted items, but it is vulnerable to frequency analysis attacks. Applying this
solution to vertically partitioned databases will also result in the leakage of the
exact supports to data owners.Privacy-preserving outsourced frequent itemset
mining solution for vertically partitioned databases.
ALGORITHM
Apriori algorithm:
The Apriori Algorithm is an influential algorithm for mining frequent
itemsets for boolean association rules. Apriori uses a "bottom up" approach, where
frequent subsets are extended one item at a time (a step known as candidate
generation, and groups of candidates are tested against the data
Clustering algorithm:
Clustering is a process of partitioning a set of data (or objects) into a set
of meaningful sub-classes, called clusters.
Help users understand the natural
grouping or structure in a data set. Clustering: unsupervised classification: no
predefined classes.
Polynomial Time Algorithm:
An algorithm that is guaranteed to terminate within a number of
steps which is a polynomial function of the size of the problem. See also
computational time complexity. Search the data without loss of time to provide
outstream for the process.
MODULE DESCRIPTION
MODULE
 Product Upload
 Product Search.
 Association Rule.
MODULE DESCRIPTION
Product Upload:
The admin wants to upload new product to the cloud, it needs to verify the
validity of the cloud and recover the real secret key. We show the time for these
two processes happened in different time periods. They only happen in the time
periods when the client needs to upload new product to the cloud. Furthermore, the
work for verifying the correctness of the can fully be done by the cloud.
Product Search:
We can consider the dishonest cloud server as a suspect, the data user as a
search data to the server .If the server show the search relevant data. Then
the user select and buying the product. After continue the relevant product
to show the user side. If the user want buying the product and complaint the
irrelevant product. The product search based on index based .The cloud
provide the data based on index terms. The relevant product specified for the
user frequently buying product of services..
Association Rule:
Association rule mining and frequent itemset mining are two
popular and widely studied data analysis techniques for a range of
applications. Data owners wish to learn the association rules or frequent
itemsets from a collective dataset, and disclose as little information
about their (sensitive) raw data as possible to other data owners and third
parties.
which is used to build an association rule mining solution. Our
solutions are designed for outsourced databases that allow multiple data
owners to efficiently share their data securely without compromising on
data privacy. Our solutions leak less information about the raw data than
most existing solutions.
Frequent itemset mining and association rule mining, two widely
used data analysis techniques, are generally used for discovering
frequently
co-occurring
data
items
and
interesting
association
relationships between data items respectively in large transaction
databases.
SYSTEM SPECIFICATION
Hardware Requirements:
• System
: Pentium IV 2.4 GHz.
• Hard Disk
: 40 GB.
• Floppy Drive
: 1.44 Mb.
• Monitor
: 14’ Colour Monitor.
• Mouse
: Optical Mouse.
• Ram
: 512 Mb.
Software Requirements:
• Operating system : Windows 7 Ultimate.
• Coding Language : ASP.Net with C#
• Front-End
: Visual Studio 2012 Professional.
• Data Base
: SQL Server 2008.
Conclusion:
In this paper, we proposed a privacy-preserving outsourced frequent itemset
mining solution for vertically partitioned databases. This allows the data owners to
outsource mining task on their joint data in a privacy-preserving manner. Based on
this solution, we built a privacy-preserving outsourced association rule mining
solution for vertically partitioned databases. Our solutions protect data owner’s raw
data from other data owners and the cloud. Our solutions also ensure the privacy of
the mining results from the cloud. Compared with most existing solutions, our
solutions leak less information about the data owners’ raw data. Our evaluation has
also demonstrated that our solutions are very efficient; therefore, our solutions are
suitable to be used by data owners wishing to outsource their databases to the
cloud but require a high level of privacy without compromising on performance.
Such as secure data aggregation, beyond the data mining solutions described. This
article has been accepted for publication in a future issue of this journal, but has
not been fully edited. Content may change prior to final publication.Demonstrating
the utility of the proposed homomorphic encryption scheme and outsourced
comparison scheme in other settings will be the focus of future research.