Privacy-Aware Computing
Introduction
Outline
 Brief introduction
 Motivating applications
 Major research issues
 Tentative schedule
 Reading assignments
 Project
 Grading
Parties concerned with privacy
 Individual privacy
 Customer data
 Public data: census data, voting records
 Health records
 Locations
 Online activities
 …
 Organization privacy
 Owning collections of personal data
 Business secrets
 Legal restrictions that prevent data sharing
 …
Cases of privacy-aware computing
 Public use of private data
 Data mining enables knowledge discovery on large
populations, but people are reluctant to release
personal information due to privacy concerns
 The Centers for Disease Control want to identify
disease outbreaks by pooling multiple datasets
that contain patient information
 Insurance companies have data on disease
incidents, patient backgrounds, etc. Personal
medical records help them maximize profits –
but customers will not be happy with that.
More Examples
 Industry Collaborations / Trade Groups.
 An industry trade group may want to identify best
practices to help members, but some practices are
trade secrets.
 How do we provide “commodity” results to all
(Manufacturing using chemical supplies from supplier X
have high failure rates), while still preserving secrets
(manufacturing process Y gives low failure rates)?
 Multinational corporations
 Multinational corporations may want to pool data
from different countries for analysis, but national
laws may prevent transborder data sharing
More examples
 Web search
 Search engine companies keep the
cookies and search history, which can be
used to derive personal information (AOL
dataset)
 Social networking
 When you use social networks, you leave a
trace of personal data and interactions
 Companies can use the data for ad targeting
– there is a risk of privacy breaches and personal
data abuse
More examples
 Mobile computing
 When you allow Google Latitude to track your
locations, you lose location privacy
 Lifestyle, clinic visits, political tendencies, domestic
violence
 Cloud computing
 Users have to outsource data to the
cloud
 Data can be sensitive (personal
information, customer records, patient
info…)
Major research areas
 Microdata publishing
 Anonymize data for statistical analysis and
modeling
 Privacy preserving data mining
 Data outsourcing
 Cloud computing
 Outsource data to untrusted parties to use
data-intensive services
 Databases
 Statistical databases
 Private information retrieval
Major areas
 Social networks
 Personal bio data, preferences, friends,
interactions
 How to design mechanisms for users to
conveniently control private data
 Mobile computing
 Location privacy
 Collaborative computing
 Collaborative data mining – share model but not
individual records
Major technical challenges
 Techniques
 Data perturbation
 Change data values while preserving global
information
 Data anonymization
 Ensure that at least k records share the same
“quasi-identifiers”, while preserving info
 Cryptographic techniques
 Secure multiparty computation
 Private information retrieval
 Cryptographic protocols for privacy-preserving
data mining
 Privacy evaluation
 Tradeoff between privacy and data utility
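The first two techniques above can be sketched in a few lines of Python. This is a minimal illustration, not a course implementation: the noise scale, the toy records, and the generalized quasi-identifier values ("453**", age ranges) are all made-up assumptions for the example.

```python
import random
from collections import Counter

def perturb(values, scale=5.0, seed=0):
    """Data perturbation: add zero-mean uniform noise to each value.
    Individual values change, but aggregate statistics such as the
    mean are approximately preserved."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]

def is_k_anonymous(records, quasi_ids, k):
    """Data anonymization check: every combination of quasi-identifier
    values must appear in at least k records."""
    groups = Counter(tuple(r[a] for a in quasi_ids) for r in records)
    return all(count >= k for count in groups.values())

# Toy records whose quasi-identifiers were already generalized
# (full ZIP -> "453**", exact age -> a range).
records = [
    {"zip": "453**", "age": "30-39", "disease": "flu"},
    {"zip": "453**", "age": "30-39", "disease": "cold"},
    {"zip": "453**", "age": "40-49", "disease": "flu"},
    {"zip": "453**", "age": "40-49", "disease": "asthma"},
]
print(is_k_anonymous(records, ["zip", "age"], 2))  # True
print(is_k_anonymous(records, ["zip", "age"], 3))  # False
```

Each group of identical quasi-identifier tuples has size 2 here, so the table is 2-anonymous but not 3-anonymous; raising k generally requires coarser generalization, which is exactly the privacy/utility tradeoff noted above.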
Differences between security
and privacy
 Privacy: decisions on what personal
information is released and who can
access it.
 Security makes sure these decisions
are respected
 Security is often a necessary method to
implement privacy
National security and privacy
 They are conflicting…
 Enhance national security
 Surveillance devices are everywhere
 USA PATRIOT Act of 2001
 … the Act dramatically reduced restrictions
on law enforcement agencies' ability to
search telephone, e-mail communications,
medical, financial, and other records …
 Big Brother is watching you –
individuals have to sacrifice privacy
Tentative Schedule
 Data perturbation
 Data anonymization
 Privacy metrics and differential
privacy
 Privacy preserving data mining
 Private information retrieval
 Secure data outsourcing
 Privacy in online social networks
 Other privacy issues
Reading assignments
 One selected paper from the reading list for
most weeks (~10 papers)
 Submit reading summary
 Before Monday noon
 How to write a reading summary?
 Five parts:
 Title
 Research problems
 Major contributions
 Strengths
 Weaknesses or missing points
 Length: a few paragraphs to one page
Paper presentation
 Choose one paper from the reading
list, or recent major conferences
 Finish in 15 minutes
 Maximum two students per class
 Signup sheet
 When: office hours, 3-4:30pm MW, first
two weeks
 Make sure you pick a slot ASAP
Course Project
 1~2 persons per team
 Types
 Experimental study on existing techniques (from the
paper list)
 Propose new algorithms
 Apply the learned techniques to some applications
 Your research
 Note
 You are encouraged to propose your own project
 The goal is to help you better understand problems
and techniques and get some hands-on experience
Project Schedule
 Proposal
 About 2 pages
 Problem description and what you plan
to do
 By the end of January
 Final deliverables
 Report
 Code
Class discussion
 You are encouraged to ask questions
or present different opinions in the
class
 Many of the topics are active research
topics
 You have a chance to generate
publishable ideas
Grading
 Reading summaries – 35%
 Paper presentation – 10%
 Project proposal – 10%
 Project final report – 15%
 Code – 10%
 Final exam – 20%
Communication
 Announcements by emails
 Other issues, [email protected]
 Office: Joshi 385
 Office hours: 3-4:30pm MW or by
appointment.
 Slides will be posted on
www.cs.wright.edu/keke.chen/privacy/