CPT-S 580-06
Advanced Databases
Yinghui Wu
EME 49
ADB (ln29)
CPT-S 580-08 Advanced Databases
DBMS: privacy and security in the Cloud
 Data security and privacy
 Security and privacy in cloud
 Data confidentiality
 Research Challenges
adapted from “Secure and Privacy-preserving Database Services in the Cloud”,
Divy Agrawal et al., ICDE 2013 tutorial
Database systems: security & privacy issues
Access Control [Bertino et al. TDSC’05]
 Problem Statement: authorizing data access scopes (relations,
attributes, tuples) to users of DBMS
 Discretionary access control
– Authorization administration policies, i.e., granting and
revoking authorization (centralized, ownership, etc.)
– Content-based using views and rewriting for fine-grained
access control
– Role-based access control: a role is a function with a set of actions,
and has users as its members
 Mandatory access control:
– Object and subject classification (e.g., top secret, secret,
unclassified).
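The role-based bullet can be illustrated with a minimal sketch. The class and method names below (`RoleBasedAccessControl`, `grant`, `revoke`, `assign`, `is_allowed`) are hypothetical and not taken from [Bertino et al. TDSC’05]; the sketch only assumes that a role carries a set of (action, object) permissions and that users are members of roles.

```python
class RoleBasedAccessControl:
    """Toy RBAC model: permissions attach to roles, users are role members."""

    def __init__(self):
        self.role_permissions = {}  # role -> set of (action, object) pairs
        self.user_roles = {}        # user -> set of roles

    def grant(self, role, action, obj):
        # Granting an authorization adds a permission to the role.
        self.role_permissions.setdefault(role, set()).add((action, obj))

    def revoke(self, role, action, obj):
        # Revoking removes it again; unknown permissions are ignored.
        self.role_permissions.get(role, set()).discard((action, obj))

    def assign(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, action, obj):
        # A user may act if any of their roles holds the permission.
        return any(
            (action, obj) in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, ())
        )


rbac = RoleBasedAccessControl()
rbac.grant("nurse", "select", "patient.name")
rbac.assign("alice", "nurse")
print(rbac.is_allowed("alice", "select", "patient.name"))
print(rbac.is_allowed("alice", "select", "patient.disease"))
```

Because permissions are attached to the role rather than to each user, revoking one permission from `nurse` affects every member at once.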
Data Anonymization
 Problem: protecting Personally Identifiable Information (PII) and their
sensitive attributes
Quasi-identifier: DOB, Gender, Zipcode. Sensitive: Disease.

DOB       Gender   Zipcode   Disease
1/21/76   Male     53715     Heart Disease
4/13/86   Female   53715     Hepatitis
2/28/76   Male     53703     Bronchitis
1/21/76   Male     53703     Broken Arm
4/13/86   Female   53706     Flu
2/28/76   Female   53706     Hang Nail

Quasi-identifiers need to be generalized or suppressed
Quasi-identifiers are sets of attributes that can be linked
with external data to uniquely identify an individual
Solution: k-Anonymity
[Samarati et al. TR’98]
 Quasi-identifiers indistinguishable among k individuals
 Implemented by building generalization hierarchy or partitioning
multi-dimensional data space
 Equivalence class: tuples that share the same QI
 Remaining attacks: homogeneity attack, background knowledge attack
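A minimal sketch of the k-anonymity check, using the six patient records from the data anonymization slide. The generalization chosen here (keep only the birth year, suppress gender, keep the first three zipcode digits) is an assumption made for illustration, not the hierarchy used in [Samarati et al. TR’98]; the function names are hypothetical.

```python
from collections import Counter

# (DOB, Gender, Zipcode, Disease): the first three columns are the QI.
RECORDS = [
    ("1/21/76", "Male", "53715", "Heart Disease"),
    ("4/13/86", "Female", "53715", "Hepatitis"),
    ("2/28/76", "Male", "53703", "Bronchitis"),
    ("1/21/76", "Male", "53703", "Broken Arm"),
    ("4/13/86", "Female", "53706", "Flu"),
    ("2/28/76", "Female", "53706", "Hang Nail"),
]


def generalize(record):
    """Assumed generalization: birth year only, gender suppressed, zip prefix."""
    dob, _gender, zipcode, disease = record
    year = dob.rsplit("/", 1)[1]
    return (year, "*", zipcode[:3] + "**", disease)


def smallest_class(records, qi_count=3):
    """Size of the smallest equivalence class (records sharing the same QI)."""
    classes = Counter(record[:qi_count] for record in records)
    return min(classes.values())


def is_k_anonymous(records, k, qi_count=3):
    return smallest_class(records, qi_count) >= k


generalized = [generalize(r) for r in RECORDS]
print(is_k_anonymous(RECORDS, 2))      # raw table: every QI is unique
print(is_k_anonymous(generalized, 2))  # generalized table
```

The check says nothing about the sensitive column, which is why an equivalence class whose members all share one disease still leaks it (the homogeneity attack).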
Enhanced Solution: l-Diversity
[Machanavajjhala et al. ICDE’06]
• At least l values for sensitive attributes in each equivalence
class
A 3-diverse patient table

Zipcode   Age   Salary   Disease
476**     2*    20K      Gastric Ulcer
476**     2*    25K      Gastritis
476**     2*    30K      Stomach Cancer
4790*     ≥40   50K      Gastritis
4790*     ≥40   100K     Flu
4790*     ≥40   70K      Bronchitis
476**     3*    60K      Bronchitis
476**     3*    80K      Pneumonia
476**     3*    90K      Stomach Cancer

• Remaining attacks: similarity attack, skewness attack
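The 3-diverse patient table can be checked mechanically. The sketch below uses the simplest reading of l-diversity, counting distinct sensitive values per equivalence class; that reading and the function name `diversity` are assumptions for illustration.

```python
from collections import defaultdict

# (Zipcode, Age, Salary, Disease): Zipcode and Age form the QI.
TABLE = [
    ("476**", "2*", "20K", "Gastric Ulcer"),
    ("476**", "2*", "25K", "Gastritis"),
    ("476**", "2*", "30K", "Stomach Cancer"),
    ("4790*", ">=40", "50K", "Gastritis"),
    ("4790*", ">=40", "100K", "Flu"),
    ("4790*", ">=40", "70K", "Bronchitis"),
    ("476**", "3*", "60K", "Bronchitis"),
    ("476**", "3*", "80K", "Pneumonia"),
    ("476**", "3*", "90K", "Stomach Cancer"),
]


def diversity(rows, qi_columns, sensitive_column):
    """Smallest number of distinct sensitive values in any equivalence class."""
    classes = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in qi_columns)
        classes[key].add(row[sensitive_column])
    return min(len(values) for values in classes.values())


print(diversity(TABLE, qi_columns=(0, 1), sensitive_column=3))
```

Counting distinct values does not look at what the values mean: the first class passes with three different diseases even though all three are stomach-related.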
Enhanced Solution: t-Closeness
[Li et al. ICDE’07]
• Distance between the overall distribution of sensitive attribute values and
the distribution of sensitive attribute values in an equivalence class is
bounded by t
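A minimal sketch of the t-closeness test. The slide does not fix the distance measure; this sketch assumes the variational distance (half the L1 distance between the two distributions), which is a simplifying choice for categorical values. The disease values are copied from the 3-diverse patient table, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Sensitive values per equivalence class, keyed by an arbitrary class label.
CLASSES = {
    "A": ["Gastric Ulcer", "Gastritis", "Stomach Cancer"],
    "B": ["Gastritis", "Flu", "Bronchitis"],
    "C": ["Bronchitis", "Pneumonia", "Stomach Cancer"],
}


def distribution(values):
    """Exact relative frequencies of the values."""
    counts = Counter(values)
    return {v: Fraction(c, len(values)) for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two distributions (assumed measure)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys) / 2


def max_class_distance(classes):
    """Largest distance between any class and the overall distribution."""
    overall = distribution([v for vals in classes.values() for v in vals])
    return max(
        variational_distance(distribution(vals), overall)
        for vals in classes.values()
    )


def satisfies_t_closeness(classes, t):
    return max_class_distance(classes) <= t


print(max_class_distance(CLASSES))
print(satisfies_t_closeness(CLASSES, Fraction(1, 2)))
```

With a different distance measure the numeric threshold changes, but the structure of the check (compare every class against the whole table, take the worst case) stays the same.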
Differential Privacy for Statistical Data
[Dwork ICALP’06]
 Strong privacy guarantees while querying a database
[Figure: the same query is answered with perturbation over two databases, A and A’; the output distributions P(A) and P(A’) are indistinguishable]
 A randomized function K gives ε-differential privacy iff for all datasets
D1 and D2 differing on at most one element, and all S ⊆ Range(K):
$$\ln \frac{\Pr[K(D_1) \in S]}{\Pr[K(D_2) \in S]} \le \varepsilon$$
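One standard way to obtain the ε bound for a counting query is to add Laplace noise with scale 1/ε, since adding or removing one record changes a count by at most 1. The slide does not name a mechanism, so the sketch below is an illustration under that assumption; `private_count` and `log_density_ratio` are hypothetical names.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Noisy count; a count has sensitivity 1, so the noise scale is 1/epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1 / epsilon, rng)


def log_density_ratio(output, count1, count2, epsilon):
    """ln of the output density ratio when the true counts are count1 vs count2."""
    scale = 1 / epsilon
    return (abs(output - count2) - abs(output - count1)) / scale


rng = random.Random(0)
diseases = ["Flu", "Hepatitis", "Flu", "Bronchitis"]
print(private_count(diseases, lambda d: d == "Flu", epsilon=0.5, rng=rng))
```

For neighboring datasets the true counts differ by at most 1, so `log_density_ratio` stays within ±ε for every possible output, which is the bound in the definition. Smaller ε means more noise and a stronger guarantee.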
Secure Devices for Privacy
[Anciaux et al. SIGMOD’07]

 Problem: protecting private data during queries involving both private (hidden) and
public (visible) data
 Solution: carry private data in a secure USB key, ensure private data never leaves
the USB key, and only public data flows to the key
 Query optimization for a USB key with small RAM
4/11/2013
ICDE 2013 Tutorial
Database security & privacy in the cloud
Cloud – A Tempting Attack Target
 Why the cloud?
– Ubiquitous access to consolidated data.
– Shared infrastructure economies of scale
– A lot of small and medium businesses
 Why attack?
– Target one service provider, attack multiple companies
– Financial gain from trading sensitive information
Cloud Provides Novel Attack Opportunities
 Co-residence attack [Ristenpart et al. CCS’09]
– Adversary: non-provider-affiliated malicious parties
– Map and identify location of target VM
– Place attacker VM co-resident with target VM
– Cross-VM side-channel attacks (due to sharing of physical
resources): e.g., inferring the number of visitors to a page, or
keystroke attacks for password retrieval.
 Signature wrapping attack [Somorovsky et al. CCSW’11]
– Control interface compromise by capturing a SOAP message
– Manipulate SOAP message with arbitrary XML fragments
– Use XML signature vulnerability to pass authentication
– Take control of a victim’s account
A Barrier to Conquer
 Security and privacy – a barrier
to cloud adoption
 Data (sensitive data) – a key
concern
 need to solve data security and
privacy problems in the cloud
Problems Amplified by the Cloud
 Data confidentiality
– Attacks
• Unauthorized accesses, side channel attacks
– Solutions
• Encryption, querying encrypted data
• Trusted computing
 Access privacy
– Attacks
• Inferences on access patterns or query results
– Solutions
• Private information retrieval
• Query obfuscation
[Figure: the user sends a query to the cloud servers, which hold the data, and receives an answer]
Challenges: Conflicting Goals
[Figure: functionality/performance (low to high) plotted against confidentiality/privacy (low to high); labeled regions: Existing Services, Many Crypto Systems/Protocols, Ideal State]
Data confidentiality
Database as a Service
[Hacigümüs et al. ICDE’02]
 Protects data from theft, but plaintext data is still visible
on the server
 Write – encrypt before storing
– insert into lineitem (discount) values (encrypt(10,key))
 Read – decrypt before access
– select decrypt(discount,key) from lineitem where custid = 300
 Encryption alternatives
– Software level vs. hardware level (cryptographic
coprocessor) encryption
– Granularity: field, row, page
Partition and Identification Index
[Hacigümüs et al. SIGMOD’02]
 E(tuple): encrypted-tuple, {attribute-index}
 Attribute-index: attribute value partition ids
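[Figure: attribute domain 0 to 1000 split into five partitions with ids 2 (0–200), 7 (200–400), 5 (400–600), 1 (600–800), 4 (800–1000)]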
Partition and Identification Index
 Client knows a map function, Map(val) = id of the partition
containing val
Random mapping: 0–200 → 2, 200–400 → 7, 400–600 → 5, 600–800 → 1, 800–1000 → 4
Order-preserving mapping: 0–200 → 1, 200–400 → 2, 400–600 → 4, 600–800 → 5, 800–1000 → 7
Mapping Predicate Conditions
• Map(< val) : ids of the partitions that could contain values < val
• E.g. Map(eid < 280) = {2, 7} for random mapping
• Map(> val) : ids of the partitions that could contain values > val
• Map(Ai = Aj): pairs of ids of the partitions that could have equal
Ai and Aj values
• Decryption and processing on the client
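The mapping functions can be sketched as follows, using the random mapping from the example (ids 2, 7, 5, 1, 4 over the ranges 0–200 through 800–1000). Treating each partition as a half-open interval [low, high) is an assumption, and `encrypt`/`decrypt` are placeholders that perform no real encryption; all function names are hypothetical.

```python
# (low, high, partition id); intervals are treated as [low, high), an assumption.
PARTITIONS = [(0, 200, 2), (200, 400, 7), (400, 600, 5), (600, 800, 1), (800, 1000, 4)]


def map_eq(val):
    """Id of the partition containing val."""
    for low, high, pid in PARTITIONS:
        if low <= val < high:
            return pid
    raise ValueError(f"{val} is outside the partitioned domain")


def map_lt(val):
    """Ids of the partitions that could contain values < val."""
    return {pid for low, _high, pid in PARTITIONS if low < val}


def map_gt(val):
    """Ids of the partitions that could contain values > val."""
    return {pid for _low, high, pid in PARTITIONS if high > val}


def encrypt(value):
    return ("ciphertext", value)  # placeholder, NOT real encryption


def decrypt(etuple):
    return etuple[1]


def store(values):
    """Server-side table: (encrypted tuple, partition id of the attribute)."""
    return [(encrypt(v), map_eq(v)) for v in values]


def server_select(table, ids):
    # The server filters on partition ids only; it never sees plaintext.
    return [etuple for etuple, pid in table if pid in ids]


def client_select_lt(table, val):
    # The client decrypts the superset and removes false positives.
    candidates = server_select(table, map_lt(val))
    return [decrypt(e) for e in candidates if decrypt(e) < val]


table = store([50, 250, 350, 450, 900])
print(map_lt(280))
print(client_select_lt(table, 280))
```

For `eid < 280` the server returns the rows for 50, 250, and 350 (partitions 2 and 7), and the client discards 350 after decryption. This is the superset filtering cost that grows with partition size.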
Mapping Predicate Conditions
emp.did = mrg.did
Partition / Bucketization Review
 Pros
– Efficient computation on the server
 Cons
– Data update is hard (may need re-distribution)
– Filtering the superset of answers could be time consuming,
depending on the partition sizes
– Might reveal the value distribution through relative partition
changes during dynamic data updates
CryptDB [Popa et al. SOSP’11]
 Supports a wide range of SQL queries over encrypted data
 Server fully evaluates queries on encrypted data, and client does
not perform query processing
 SQL-aware encryption
– leverage provable practical techniques for different SQL
operators over encrypted data
 Adjustable query-based encryption
– Dynamically adjust the encryption level of data items according
to user’s queries
 Onion of encryptions
– From weaker forms of encryption that allow certain computation
to stronger forms of encryption that reveal no information
SQL-Aware Onion Encryption
Onion layers, listed from outermost to innermost:
– RND (no functionality) → DET (equality selection) → JOIN (equality join) → any value
– RND (no functionality) → OPE (comparison) → OPE-JOIN (inequality join) → any value
– SEARCH (word selection, only for text fields)
– HOM (sum) → int value
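The adjustable, query-based encryption idea can be sketched as bookkeeping over onion layers: each column starts at the strongest layer, and a query that needs some operation causes outer layers to be removed until the layer supporting it is exposed. The sketch performs no cryptography and is not CryptDB's implementation; the onion labels `eq` and `ord`, the `Column` class, and the `adjust` method are hypothetical.

```python
# Layers from outermost (strongest) to innermost, per onion.
ONIONS = {
    "eq": ["RND", "DET", "JOIN"],
    "ord": ["RND", "OPE", "OPE-JOIN"],
}

# Which onion and layer each operation needs.
NEEDS = {
    "equality selection": ("eq", "DET"),
    "equality join": ("eq", "JOIN"),
    "comparison": ("ord", "OPE"),
    "inequality join": ("ord", "OPE-JOIN"),
}


class Column:
    """Tracks the currently exposed layer of each onion for one column."""

    def __init__(self):
        self.level = {name: 0 for name in ONIONS}  # 0 = outermost layer

    def current(self, onion):
        return ONIONS[onion][self.level[onion]]

    def adjust(self, operation):
        """Remove layers until `operation` is supported; return removed layers."""
        onion, layer = NEEDS[operation]
        target = ONIONS[onion].index(layer)
        removed = ONIONS[onion][self.level[onion]:target]
        # Layers are never re-applied, so the level only moves inward.
        self.level[onion] = max(self.level[onion], target)
        return removed


col = Column()
print(col.adjust("equality selection"))  # layers removed for WHERE col = ...
print(col.current("eq"), col.current("ord"))
```

A column that is never compared for order keeps its order onion at RND, so each column reveals only as much as the queries actually issued against it require.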
CryptDB System
[Figure: CryptDB system architecture; annotated components: one for sending the key of a certain onion layer, one for performing cryptographic operations]
Open problems
Open Research Problems
 Encryption for processing range/join database queries on
encrypted data
 Improve performance of querying encrypted data for use in
practical OLTP applications
– Pre-computation
– Parallel calculation
 End to end security in the cloud
– Need information flow control and auditing in addition to
cryptography or trusted computing based approaches
Concluding Remarks
 Cloud security and privacy is not a completely new problem.
Some issues are amplified by the cloud.
 Protecting data confidentiality and access privacy
 Maintaining practical functionality and performance while
achieving security and privacy
References
[Bertino et al. TDSC’05] E. Bertino et al. Database security-concepts, approaches, and
challenges. In IEEE TDSC, 2(1), 2005.
[Samarati et al. TR’98] P. Samarati et al. Protecting privacy when disclosing information:
k-anonymity and its enforcement through generalization and suppression. TR 1998.
[Machanavajjhala et al. ICDE’06] A. Machanavajjhala et al. l-diversity: privacy beyond
k-anonymity. In ICDE 2006.
[Li et al. ICDE’07] N. Li et al. t-closeness: privacy beyond k-anonymity and l-diversity. In
ICDE 2007.
[Dwork ICALP’06] C. Dwork. Differential privacy. In ICALP(2) 2006.
[Verykios et al. SIGMOD’04] V. S. Verykios et al. State-of-the-art in privacy preserving data
mining. In SIGMOD 2004.
[Agrawal et al. SIGMOD’00] R. Agrawal et al. Privacy-preserving data mining. In
SIGMOD 2000.
[Clifton et al. KDD’02] C. Clifton et al. Tools for privacy preserving distributed
data mining. In KDD 2002.
[Anciaux et al. SIGMOD’07] N. Anciaux et al. GhostDB: querying visible and
hidden data without leaks. In SIGMOD 2007.
[Chaudhuri et al. CIDR’11] S. Chaudhuri et al. Database access control &
privacy: is there a common ground? In CIDR 2011.
[Ristenpart et al. CCS’09] T. Ristenpart et al. Hey, you, get off of my cloud:
exploring information leakage in third-party compute clouds. In CCS 2009.
[Somorovsky et al. CCSW’11] J. Somorovsky et al. All your clouds are belong to
us: security analysis of cloud management interfaces. In CCSW 2011.
[Hacigümüs et al. ICDE’02] H. Hacigümüs et al. Providing database as a
service. In ICDE 2002.
[Song et al. S&P’00] D. Song et al. Practical techniques for searches on
encrypted data. In S&P 2000.
[Hacigümüs et al. SIGMOD’02] H. Hacigümüs et al. Executing SQL over
encrypted data in the database service provider model. In SIGMOD 2002.
[Hore et al. VLDB’04] B. Hore et al. A privacy-preserving index for range
queries. In VLDB 2004.
[Agrawal et al. SIGMOD’04] R. Agrawal et al. Order preserving encryption for
numeric data. In SIGMOD 2004.
[Popa et al. SOSP’11] R. A. Popa et al. CryptDB: protecting confidentiality with encrypted
query processing. In SOSP 2011.
[Damiani et al. CCS’03] E. Damiani et al. Balancing confidentiality and efficiency in
untrusted relational DBMSs. In CCS 2003.
[Wang et al. SDM’11] S. Wang et al. A comprehensive framework for secure query
processing on relational data in the cloud. In SDM 2011.
[Aggarwal et al. CIDR’05] G. Aggarwal et al. Two can keep a secret: a distributed
architecture for secure database services. In CIDR 2005.
[Emekci et al. ICDE’06] F. Emekci et al. Privacy preserving query processing using third
parties. In ICDE 2006.
[Agrawal et al. SRDS’88] D. Agrawal et al. Quorum consensus algorithms for secure and
reliable data. In SRDS 1988.
[Bajaj et al. SIGMOD’11] S. Bajaj et al. TrustedDB: a trusted hardware based database with
privacy and data confidentiality. In SIGMOD 2011.
[Song et al. IEEE’12] D. Song et al. Cloud data protection for the masses. In IEEE
Computer, 45(1), 2012.
[Chor et al. JACM’98] B. Chor et al. Private information retrieval. In J. ACM, 45(6), 1998.
[Kushilevitz et al. FOCS’97] E. Kushilevitz et al. Replication is not needed:
single database, computationally private information retrieval. In FOCS
1997.
[Sion et al. NDSS’07] R. Sion et al. On the computational practicality of
private information retrieval. In NDSS 2007.
[Olumofin et al. FC’11] F. G. Olumofin et al. Revisiting the computational
practicality of private information retrieval. In FC 2011.
[Williams et al. NDSS’08] P. Williams et al. Usable private information
retrieval. In NDSS 2008.
[Wang et al. DBSEC’10] S. Wang et al. Generalizing PIR for practical
private retrieval of public data. In DBSec 2010.
[Wang et al. DAPD’13] S. Wang et al. Towards practical private processing
of database queries over public data. In DAPD 2013.
[Vimercati et al. ICDCS’11] S. D. C. Vimercati et al. Efficient and private
access to outsourced data. In ICDCS 2011.