Download ioannis - Computer Science

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts

Nonlinear dimensionality reduction wikipedia, lookup

Exploratory factor analysis wikipedia, lookup

Transcript
A Secure Protocol for
Computing Dot-products in
Clustered and Distributed
Environments
Ioannis Ioannidis, Ananth Grama and Mikhail Atallah
Purdue University.
Acknowledgements: National Science Foundation.
The Problem
• Dot-products are the basis of many important
applications
•
•
•
•
Scientific computations
Data mining
Transaction processing
Biometrics
• Use of distributed environments creates
security issues
• Data too valuable to expose
• Untrusted links or hosts
• Spoofing is very easy
The Problem
• Each party is honest-but-curious
– They play by the rules, but if they can find
out more, they will.
• Only one of the parties is interested in
the result.
• We have a random number generator,
which generates a uniformly distributed
random integer, cast into a real.
Candidate Solution
• Use conventional cryptography
– Secure tunneling can protect the links
– More complex protocols offer protection
against untrusted hosts
• Unfortunately, public-key crypto has a
high complexity
– Modular exponentiation computations can
have a crippling effect on the overall
performance
Security vs. Efficiency
• Ideally, no information should leak about the
participating vectors during a secure dotproduct protocol
• However, in the context of the given problem,
in a clustered environment, security need not
be so tight
– Dot-products inherently leak data in the solution
– Some leakage may be acceptable, since the same
dot-product will not be computed multiple times
– Small compromises in security can lead to large
gains in efficiency
An Efficient Alternative
• Use linear algebraic properties to achieve a
sufficient level of security
–
–
–
–
–
–
Hide a vector inside a matrix
Scramble the matrix
Multiply the matrix by the other vector
Retrieve the dot-product
A large part of the computation can be reused
Both parties must share a secret – a number –
before the protocol
An Efficient Alternative
• Security is not perfect
– A small number of equations will leak
– Statistics can reveal information
• But is sufficient for a real-world setting
– If you don’t need to execute the same instance
many times, leaking a few equations is not a
problem
– Statistical attacks demand larges amounts of
information
– Not so easy to gather them in clustered
environments
The Protocol
The Protocol
The Protocol
The Protocol
An Example:
Example (continued):
Proof of Correctness
Proof of Correctness
Proof of Correctness
Algorithmic Considerations
• Time overhead
– How much more computation needs to be
performed?
– Public-key cryptography adds an unacceptable
amount of overhead.
– But it is the only solution if perfect secrecy is the
goal.
• Communication overhead
– Network latency prevails in larger networks.
– Bit count is the decisive factor in tightly coupled
networks.
Stability Considerations
• Algebraic manipulations of the data can
introduce numerical errors in scientific
computation data.
• Any protocol applied to real-valued
vectors must be numerically stable to be
of practical importance.
Experimental Results
• The protocol was executed on two
PIII/450Mhz machines connected on a
Gigabit Ethernet network
• Data was randomly generated vectors of
length 106
• We measured the total overhead
(computation and communication)
– Communication overhead is expected to be a
factor of 4
Experimental Results
• Measured overhead showed a factor of
4.69 overhead
– Communication overhead is the
dominating factor, even on a fast network
• Average numerical error was measured
to 4.5 x 10-9
Conclusions and Ongoing
Research
• It is possible to execute multiparty, realvalued dot-product computations
efficiently and with satisfactory security
• Binary dot-products pose a different
problem due to the sparsity of the
vectors
– Number theoretic techniques introduce
large time and communication overheads