Lecture 17: Executing SQL over Encrypted
Data in Database-Service-Provider Model
Professor Chen Li
Executing SQL over Encrypted Data in
Database-Service-Provider Model
Hakan Hacigumus
University of California, Irvine
Bala Iyer
IBM Silicon Valley Lab.
Chen Li
University of California, Irvine
Sharad Mehrotra
University of California, Irvine
SIGMOD 2002, Madison, Wisconsin, USA
What do we want to do?
[Diagram: a user stores user data on a server as an encrypted user database; the server is run by an untrusted administrator.]
We want to store the data on “a server”
But the problem is we do not trust “the server” for sensitive
information!
encrypt the data and store it
but still be able to run queries over the encrypted data
do most of the work at the server
For the case where the server is trusted, see the ICDE 2002 paper
Why is it important anyway?
[Diagram: the user reaches the server over the Internet; the encrypted user database is hosted by an (untrusted) application service provider and managed by an untrusted administrator.]
Application Service Provider (ASP) Model for Database
DB management transferred to service provider for
backup, administration, restoration, space management, upgrades etc.
use the database “as a service” provided by an ASP
use SW, HW, human resources of ASP, instead of your own
Talk Outline
System Architecture
How to create Metadata: Relational Encryption and Storage Model
Query Decomposition and Relational Operators
Query Decomposition – Examples
Experimental Results
Conclusion
System Architecture
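[Diagram: at the client site, the query translator uses the metadata to split the user's original query into a server-side query and a client-side query. The service provider at the server site runs the server-side query over the encrypted user database and returns encrypted results. The client's query executor holds these as temporary results and runs the client-side query on them to produce the actual results for the user.]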
Relational Encryption
Original table (plaintext):
NAME    SALARY   PID
John     50000     2
Marry   110000     2
James    95000     3
Lisa    105000     4

Server Site (encrypted table):
etuple                N_ID   S_ID   P_ID
fErf!$Q!!vddf>></|      50      1     10
F%%3w&%gfErf!$          65      2     10
&%gfsdf$%343v<l         50      2     20
%%33w&%gfs##!           65      2     20
Store an encrypted string – etuple – for each tuple in the original table
This is called “row level encryption”
Any kind of encryption technique can be used
Blowfish encryption algorithm is used for this work
Create an index for each (or selected) attribute(s) in the original table
Building the Index:
Partition and Identification Functions
Partition function divides domain values into partitions (buckets)
Partition (R.A) = { [0,200], (200,400], (400,600], (600,800], (800,1000] }
partitioning function has an impact on performance as well as privacy
Identification function assigns a partition id to each partition of attribute A
Partition (bucket) ids over the domain values of R.A:
[0,200] → 2, (200,400] → 7, (400,600] → 5, (600,800] → 1, (800,1000] → 4
e.g.
identR.A( (200,400] ) = 7
Any function can be used as the identification function, e.g., a hash function
Mapping Functions
Mapping function maps a value v in the domain of attribute A to the id
of the partition which value v belongs to
Partition (bucket) ids over the domain values of R.A:
[0,200] → 2, (200,400] → 7, (400,600] → 5, (600,800] → 1, (800,1000] → 4
e.g.
MapR.A( 250 ) = 7, MapR.A( 620 ) = 1
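The mapping function can be sketched as a lookup over the bucket boundaries. This is a minimal illustration, not the authors' implementation: the name `map_ra` and the use of binary search are assumptions; the bucket boundaries and ids are the ones shown on the slide.

```python
from bisect import bisect_left

# Right-closed buckets [0,200], (200,400], ..., (800,1000] and the
# partition ids the slide assigns to them for attribute R.A.
LOWER = 0
UPPER_BOUNDS = [200, 400, 600, 800, 1000]
BUCKET_IDS = [2, 7, 5, 1, 4]

def map_ra(v):
    """Return the partition id of the bucket that contains v (hypothetical helper)."""
    if not LOWER <= v <= UPPER_BOUNDS[-1]:
        raise ValueError(f"{v} is outside the domain of R.A")
    # bisect_left finds the first upper bound that is >= v, which is
    # exactly the right-closed bucket holding v.
    return BUCKET_IDS[bisect_left(UPPER_BOUNDS, v)]

print(map_ra(250), map_ra(620))
```

Because the buckets are right-closed, a boundary value such as 200 falls into the lower bucket.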
Storing Encrypted Data
R = < A, B, C >
R^S = < etuple, A_id, B_id, C_id >
etuple = encrypt ( A | B | C )
A_id = MapR.A( A ), B_id = MapR.B( B ), C_id = MapR.C( C )
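A small sketch of how one tuple of R could be turned into a row of the server-side relation. Several things here are assumptions made only for illustration: `toy_cipher` is a toy XOR stand-in for a real cipher (the slides state that Blowfish was used), all three attributes share the same hypothetical bucket layout, and the helper names are invented.

```python
from bisect import bisect_left

def toy_cipher(data, key):
    """Toy XOR stand-in for real encryption. NOT secure; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Assumption: A, B and C all use the bucket layout shown for R.A.
UPPER_BOUNDS = [200, 400, 600, 800, 1000]
BUCKET_IDS = [2, 7, 5, 1, 4]

def bucket_id(v):
    """Mapping function: value -> partition id (assumes 0 <= v <= 1000)."""
    return BUCKET_IDS[bisect_left(UPPER_BOUNDS, v)]

def to_server_row(key, a, b, c):
    """Build <etuple, A_id, B_id, C_id> for the plaintext tuple <a, b, c>."""
    etuple = toy_cipher(f"{a}|{b}|{c}".encode(), key)
    return (etuple, bucket_id(a), bucket_id(b), bucket_id(c))

key = b"owner-secret"                      # known only to the data owner
row = to_server_row(key, 250, 620, 90)
print(row[1:])                             # the coarse index values the server sees
print(toy_cipher(row[0], key).decode())    # only the key holder can recover the tuple
```

The server stores only the etuple and the bucket ids; the plaintext values never leave the client.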
Table: EMPLOYEE
NAME    SALARY   PID
John     50000     2
Marry   110000     2
James    95000     3
Lisa    105000     4

Table: EMPLOYEE^S
Etuple                N_ID   S_ID   P_ID
fErf!$Q!!vddf>></|      50      1     10
F%%3w&%gfErf!$          65      2     10
&%gfsdf$%343v<l         50      2     20
%%33w&%gfs##!           65      2     20
Mapping Conditions
Q: SELECT name, pname FROM emp, proj
WHERE emp.pid=proj.pid AND salary > 100k
Server stores attribute indices determined by mapping functions
Client stores metadata and utilizes that to translate the query
Conditions:
Condition ← Attribute op Value
Condition ← Attribute op Attribute
Condition ← (Condition ∨ Condition) | (Condition ∧ Condition) | (not Condition)
Mapping Conditions (2)
Example:
Attribute = Value
Mapcond( A = v ) ⇒ A^S = MapA( v )
Mapcond( A = 250 ) ⇒ A^S = 7
Partition ids over the domain values of A:
[0,200] → 2, (200,400] → 7, (400,600] → 5, (600,800] → 1, (800,1000] → 4
Mapping Conditions (3)
Attribute1 = Attribute2
Mapcond( A = B ) ⇒ ∨_N ( A^S = identA( pk ) ∧ B^S = identB( pl ) )
where N ranges over all pairs pk ∈ partition(A), pl ∈ partition(B) with pk ∩ pl ≠ ∅

Partitions of A   A_id      Partitions of B   B_id
[0,100]           2         [0,200]           9
(100,200]         4         (200,400]         8
(200,300]         3

C: A = B
C’: (A^S = 2 ∧ B^S = 9) ∨ (A^S = 4 ∧ B^S = 9) ∨ (A^S = 3 ∧ B^S = 8)
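The translated condition C’ can be derived mechanically by pairing every partition of A with every partition of B and keeping the pairs that overlap. The sketch below uses the partitions from the example; the `Part` type and the `overlaps` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

class Part(NamedTuple):
    lo: float
    hi: float          # partitions are always closed on the right
    lo_closed: bool    # True for [lo, hi], False for (lo, hi]
    ident: int         # partition id

A_PARTS = [Part(0, 100, True, 2), Part(100, 200, False, 4), Part(200, 300, False, 3)]
B_PARTS = [Part(0, 200, True, 9), Part(200, 400, False, 8)]

def overlaps(p, q):
    """True if the two partitions share at least one domain value."""
    lo, hi = max(p.lo, q.lo), min(p.hi, q.hi)
    if lo != hi:
        return lo < hi
    # Only the single point lo == hi is left; it belongs to both partitions
    # only if every partition that starts exactly there is closed on the left.
    return all(r.lo_closed for r in (p, q) if r.lo == lo)

def map_cond_eq(a_parts, b_parts):
    """Mapcond(A = B): all (A_id, B_id) pairs whose partitions overlap."""
    return [(p.ident, q.ident) for p in a_parts for q in b_parts if overlaps(p, q)]

print(map_cond_eq(A_PARTS, B_PARTS))
```

Each returned pair corresponds to one conjunct of C’, and the server evaluates their disjunction.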
Relational Operators over
Encrypted Relations
Partition the computation of the operators across client and server
Compute (possibly) superset of answers at the server
Filter the answers at the client
Objective: minimize the work at the client and process the answers
as soon as they arrive, without requiring storage at the client
Operators studied:
Selection
Join
Grouping and Aggregation
Sorting
Duplicate Elimination
Set Difference
Union
Projection
Selection Operator
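σ_c( R ) = σ_c( D( σ^S_{Mapcond(c)}( R^S ) ) )
Example:
σ_{A=250}( TABLE ) = σ_{A=250}( D( σ^S_{A_id = 7}( E_TABLE ) ) )
Client Query: σ_{A=250}( D( ... ) )
Server Query: σ^S_{A_id = 7}( E_TABLE )
Partition ids over the domain values of A:
[0,200] → 2, (200,400] → 7, (400,600] → 5, (600,800] → 1, (800,1000] → 4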
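The split of the selection between server and client can be sketched as follows. This is an illustration under stated assumptions: `toy_cipher` is a toy XOR stand-in for real encryption, the sample values are invented, and the generators are one possible way to let the client process answers as they arrive.

```python
from bisect import bisect_left

UPPER_BOUNDS = [200, 400, 600, 800, 1000]   # right-closed buckets over [0, 1000]
BUCKET_IDS = [2, 7, 5, 1, 4]                # partition ids from the example

def map_a(v):
    """Bucket id of value v (assumes 0 <= v <= 1000)."""
    return BUCKET_IDS[bisect_left(UPPER_BOUNDS, v)]

def toy_cipher(data, key=b"demo-key"):
    """Toy XOR stand-in for real encryption. NOT secure; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Server side: one (etuple, A_id) row per original tuple (invented sample data).
plain_values = [120, 250, 260, 390, 410, 250]
e_table = [(toy_cipher(str(v).encode()), map_a(v)) for v in plain_values]

def server_select(rows, a_id):
    """Server query: filter on the bucket id only, returning a superset."""
    return (etuple for etuple, row_id in rows if row_id == a_id)

def client_select(etuples, value):
    """Client query: decrypt each arriving etuple and apply the original condition."""
    for etuple in etuples:
        a = int(toy_cipher(etuple).decode())
        if a == value:
            yield a

superset = list(server_select(e_table, map_a(250)))
answers = list(client_select(superset, 250))
print(len(superset), answers)
```

The server returns every tuple in bucket 7, including 260 and 390; the client discards those false positives after decryption.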
Join Operator
R ⋈_c T = σ_c( D( R^S ⋈^S_{Mapcond(c)} T^S ) )
Example:
EMP ⋈_{A=B} PROJ = σ_C( D( E_EMP ⋈^S_{C’} E_PROJ ) )
Client Query: σ_C( D( ... ) )
Server Query: E_EMP ⋈^S_{C’} E_PROJ

Partitions of A   A_id      Partitions of B   B_id
[0,100]           2         [0,200]           9
(100,200]         4         (200,400]         8
(200,300]         3

C: A = B
C’: (A_id = 2 ∧ B_id = 9) ∨ (A_id = 4 ∧ B_id = 9) ∨ (A_id = 3 ∧ B_id = 8)
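A sketch of the join split: the server joins on the bucket-id pairs of C’, and the client decrypts the candidates and applies the original condition A = B. The sample values, the `toy_cipher` stand-in (toy XOR, not real encryption) and the function names are assumptions for illustration; the partitions and C’ come from the example.

```python
from itertools import product

def toy_cipher(data, key=b"demo-key"):
    """Toy XOR stand-in for real encryption. NOT secure; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def map_a(v):   # [0,100] -> 2, (100,200] -> 4, (200,300] -> 3
    return 2 if v <= 100 else 4 if v <= 200 else 3

def map_b(v):   # [0,200] -> 9, (200,400] -> 8
    return 9 if v <= 200 else 8

# C': bucket-id pairs whose partitions overlap.
C_PRIME = {(2, 9), (4, 9), (3, 8)}

# Encrypted relations as (etuple, bucket id) rows; the plaintext values are invented.
e_emp = [(toy_cipher(str(a).encode()), map_a(a)) for a in (50, 150, 250)]
e_proj = [(toy_cipher(str(b).encode()), map_b(b)) for b in (150, 250, 350)]

def server_join(left, right):
    """Server query: join on bucket ids only, yielding candidate etuple pairs."""
    for (l_et, a_id), (r_et, b_id) in product(left, right):
        if (a_id, b_id) in C_PRIME:
            yield l_et, r_et

def client_filter(pairs):
    """Client query: decrypt each candidate pair and keep those with A = B."""
    for l_et, r_et in pairs:
        a = int(toy_cipher(l_et).decode())
        b = int(toy_cipher(r_et).decode())
        if a == b:
            yield a, b

candidates = list(server_join(e_emp, e_proj))
matches = list(client_filter(candidates))
print(len(candidates), matches)
```

A nested-loop join is used here only for brevity; the server is free to pick any join algorithm over the id columns.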
Query Decomposition
Q: SELECT name, pname FROM emp, proj
WHERE emp.pid=proj.pid AND salary > 100k
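[Query trees: the original plan and its first decomposition.]
Original plan: π_{name,pname}( σ_{salary>100k}( EMP ) ⋈_{e.pid = p.pid} PROJ )
Client Query: π_{name,pname}( σ_{salary>100k}( D( Encrypted(EMP) ) ) ⋈_{e.pid = p.pid} D( Encrypted(PROJ) ) )
Server Query: returns Encrypted(EMP) and Encrypted(PROJ)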
Query Decomposition (2)
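[Query trees: the selection is pushed to the server.]
Before:
Client Query: π_{name,pname}( σ_{salary>100k}( D( E_EMP ) ) ⋈_{e.pid = p.pid} D( E_PROJ ) )
Server Query: E_EMP, E_PROJ
After:
Client Query: π_{name,pname}( σ_{salary>100k}( D( ... ) ) ⋈_{e.pid = p.pid} D( E_PROJ ) )
Server Query: σ^S_{s_id = 1 ∨ s_id = 2}( E_EMP ), E_PROJ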
Query Decomposition (3)
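[Query trees: the join is pushed to the server.]
Before:
Client Query: π_{name,pname}( σ_{salary>100k}( D( ... ) ) ⋈_{e.pid = p.pid} D( E_PROJ ) )
Server Query: σ^S_{s_id = 1 ∨ s_id = 2}( E_EMP ), E_PROJ
After:
Client Query: π_{name,pname}( σ_{salary>100k ∧ e.pid = p.pid}( D( ... ) ) )
Server Query: σ^S_{s_id = 1 ∨ s_id = 2}( E_EMP ) ⋈^S_{e.p_id = p.p_id} E_PROJ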
Query Decomposition (4)
Client Query: π_{name,pname}( σ_{salary>100k ∧ e.pid = p.pid}( D( ... ) ) )
Server Query: σ^S_{s_id = 1 ∨ s_id = 2}( E_EMP ) ⋈^S_{e.p_id = p.p_id} E_PROJ

Q:  SELECT name, pname
    FROM emp, proj
    WHERE emp.pid = proj.pid AND salary > 100k

QS: SELECT e_emp.etuple, e_proj.etuple
    FROM e_emp, e_proj
    WHERE e_emp.p_id = e_proj.p_id AND (s_id = 1 OR s_id = 2)

QC: SELECT name, pname
    FROM temp
    WHERE emp.pid = proj.pid AND salary > 100k
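The flow of Q, QS and QC can be tried end to end with an in-memory SQLite database playing the role of the server. Everything beyond the three queries is an assumption made for illustration: the bucket layouts for salary and pid, the PROJ rows and project names, and the `toy_cipher` helper (a toy XOR stand-in, not real encryption). The client-side filter QC is written in Python instead of SQL to keep the sketch short.

```python
import sqlite3

def toy_cipher(data, key=b"demo-key"):
    """Toy XOR stand-in for real encryption. NOT secure; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def salary_bucket(s):   # hypothetical: [0,100k] -> 3, (100k,150k] -> 1, above -> 2
    return 3 if s <= 100_000 else 1 if s <= 150_000 else 2

def pid_bucket(p):      # hypothetical: pid 1-2 -> 10, pid 3-4 -> 20
    return 10 if p <= 2 else 20

EMP = [("John", 50000, 2), ("Marry", 110000, 2), ("James", 95000, 3), ("Lisa", 105000, 4)]
PROJ = [(2, "Alpha"), (3, "Beta"), (4, "Gamma")]   # invented project rows

# Server site: only etuples and bucket ids are stored.
server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE e_emp (etuple BLOB, s_id INTEGER, p_id INTEGER)")
server.execute("CREATE TABLE e_proj (etuple BLOB, p_id INTEGER)")
for name, salary, pid in EMP:
    etuple = toy_cipher(f"{name}|{salary}|{pid}".encode())
    server.execute("INSERT INTO e_emp VALUES (?, ?, ?)",
                   (etuple, salary_bucket(salary), pid_bucket(pid)))
for pid, pname in PROJ:
    server.execute("INSERT INTO e_proj VALUES (?, ?)",
                   (toy_cipher(f"{pid}|{pname}".encode()), pid_bucket(pid)))

# QS: the server-side query over bucket ids.
QS = """SELECT e_emp.etuple, e_proj.etuple
        FROM e_emp, e_proj
        WHERE e_emp.p_id = e_proj.p_id AND (e_emp.s_id = 1 OR e_emp.s_id = 2)"""
candidates = server.execute(QS).fetchall()

# QC: the client decrypts the candidates and applies the original conditions.
result = []
for e_blob, p_blob in candidates:
    name, salary, e_pid = toy_cipher(e_blob).decode().split("|")
    p_pid, pname = toy_cipher(p_blob).decode().split("|")
    if e_pid == p_pid and int(salary) > 100_000:
        result.append((name, pname))

print(len(candidates), sorted(result))
```

With these invented buckets the server returns one false positive (Lisa paired with a project in the same pid bucket), which the client removes.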
Experimental Evaluation
Data
TPC-H database, scale factor 0.1
Queries
Based on TPC-H Queries Q#6 and Q#3
Partitioning Strategy
Equi-depth histograms for the first set of experiments
Equi-width histograms for the second set of experiments
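The slides name the two partitioning strategies without showing how the bucket boundaries are chosen. The following is a minimal sketch of one common way to compute them; the function names and the sample data are assumptions, not taken from the experiments.

```python
def equi_width_bounds(lo, hi, n):
    """Upper bounds of n buckets that each cover the same share of the range [lo, hi]."""
    width = (hi - lo) / n
    return [lo + width * (i + 1) for i in range(n)]

def equi_depth_bounds(values, n):
    """Upper bounds of n buckets that each hold roughly len(values) / n values.

    Assumes 1 <= n <= len(values); heavy duplicates can produce repeated bounds.
    """
    ordered = sorted(values)
    return [ordered[len(ordered) * (i + 1) // n - 1] for i in range(n)]

sample = [1, 2, 3, 4, 100, 200, 300, 1000]   # invented, skewed sample
print(equi_width_bounds(0, 1000, 5))
print(equi_depth_bounds(sample, 4))
```

On skewed data, equi-width buckets can end up nearly empty or overfull, while equi-depth buckets keep the number of tuples per bucket roughly even.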
Effect of Number of Buckets in
Non-Join Query
[Chart: Cost Factors for Query Response Time. Query response time, broken down into client side, network, and server side, plotted against the number of buckets.]
Client and communication costs decrease as the number of buckets increases,
due to better filtering at the server
Server cost does not decrease as much; a table scan remains the optimizer's
best choice
Effect of Number of Buckets in
Non-Join Query
[Chart: Client/Server vs. Single Server. Query response time for the single server and for the server side and client side of the client/server setup, plotted against the number of buckets.]
Single Server: Server is trusted and performs all operations including
decryption on site
Shows that proposed query execution protocol doesn’t introduce significant
overhead
Effect of Number of Buckets in Join Query
[Chart: Client, Server, and Total Response Times. Query response time versus number of buckets (1, 75, 100, 150, 250, 300, 500, 750, 1500).]
[Chart: Effect of Decryption Time. Client query response time with and without decryption versus number of buckets (1, 75, 100, 150, 250, 300, 500, 750, 1500).]
Sharp decrease in query response time with increase in the number of
buckets due to better filtering at the server
Client side query response time is greater than server side query response
time due to dominant decryption cost on the query (second graph)
Effect of Number of Buckets in Join Query
[Chart: Client/Server vs. Single Server. Query response time for C/S and single server versus number of buckets (1, 75, 100, 150, 250, 300, 500, 750, 1500).]
Single Server: Server is trusted and performs all operations including
decryption on site
Consistent with the previous results showing proposed query execution
protocol doesn’t introduce significant overhead
Conclusion
The ASP model is a promising solution for enterprise computing in the Internet era
We studied the data privacy problem
in the context of the ASP model
when the ASP is not trusted
Proposed solution
encrypts data, creates “coarse indexes” and stores the data at the ASP
allows only the data owner to decrypt the data
With query decomposition
most of the query execution is performed at the ASP
the client only performs encryption/decryption and filtering, and continues to
benefit from the ASP model