Download ioannis - Computer Science

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Nonlinear dimensionality reduction wikipedia , lookup

Exploratory factor analysis wikipedia , lookup

Transcript
A Secure Protocol for
Computing Dot-products in
Clustered and Distributed
Environments
Ioannis Ioannidis, Ananth Grama and Mikhail Atallah
Purdue University.
Acknowledgements: National Science Foundation.
The Problem
• Dot-products are the basis of many important
applications
•
•
•
•
Scientific computations
Data mining
Transaction processing
Biometrics
• Use of distributed environments creates
security issues
• Data too valuable to expose
• Untrusted links or hosts
• Spoofing is very easy
The Problem
• Each party is honest-but-curious
– They play by the rules, but if they can find
out more, they will.
• Only one of the parties is interested in
the result.
• We have a random number generator,
which generates a uniformly distributed
random integer, cast into a real.
Candidate Solution
• Use conventional cryptography
– Secure tunneling can protect the links
– More complex protocols offer protection
against untrusted hosts
• Unfortunately, public-key crypto has a
high complexity
– Modular exponentiation computations can
have a crippling effect on the overall
performance
Security vs. Efficiency
• Ideally, no information should leak about the
participating vectors during a secure dotproduct protocol
• However, in the context of the given problem,
in a clustered environment, security need not
be so tight
– Dot-products inherently leak data in the solution
– Some leakage may be acceptable, since the same
dot-product will not be computed multiple times
– Small compromises in security can lead to large
gains in efficiency
An Efficient Alternative
• Use linear algebraic properties to achieve a
sufficient level of security
–
–
–
–
–
–
Hide a vector inside a matrix
Scramble the matrix
Multiply the matrix by the other vector
Retrieve the dot-product
A large part of the computation can be reused
Both parties must share a secret – a number –
before the protocol
An Efficient Alternative
• Security is not perfect
– A small number of equations will leak
– Statistics can reveal information
• But is sufficient for a real-world setting
– If you don’t need to execute the same instance
many times, leaking a few equations is not a
problem
– Statistical attacks demand larges amounts of
information
– Not so easy to gather them in clustered
environments
The Protocol
The Protocol
The Protocol
The Protocol
An Example:
Example (continued):
Proof of Correctness
Proof of Correctness
Proof of Correctness
Algorithmic Considerations
• Time overhead
– How much more computation needs to be
performed?
– Public-key cryptography adds an unacceptable
amount of overhead.
– But it is the only solution if perfect secrecy is the
goal.
• Communication overhead
– Network latency prevails in larger networks.
– Bit count is the decisive factor in tightly coupled
networks.
Stability Considerations
• Algebraic manipulations of the data can
introduce numerical errors in scientific
computation data.
• Any protocol applied to real-valued
vectors must be numerically stable to be
of practical importance.
Experimental Results
• The protocol was executed on two
PIII/450Mhz machines connected on a
Gigabit Ethernet network
• Data was randomly generated vectors of
length 106
• We measured the total overhead
(computation and communication)
– Communication overhead is expected to be a
factor of 4
Experimental Results
• Measured overhead showed a factor of
4.69 overhead
– Communication overhead is the
dominating factor, even on a fast network
• Average numerical error was measured
to 4.5 x 10-9
Conclusions and Ongoing
Research
• It is possible to execute multiparty, realvalued dot-product computations
efficiently and with satisfactory security
• Binary dot-products pose a different
problem due to the sparsity of the
vectors
– Number theoretic techniques introduce
large time and communication overheads