Data Warehousing, Data Mining, Privacy
Farkas, CSCE 824 - Spring 2015

Reading
  Bhavani Thuraisingham, Murat Kantarcioglu, and Srinivasan Iyer. 2007. Extended RBAC-design and implementation for a secure data warehouse. Int. J. Bus. Intell. Data Min. 2, 4 (December 2007), 367-382. https://www.utdallas.edu/~bxt043000/Publications/Technical-Reports/UTDCS-35-07.pdf
  Sweeney L, Abu A, and Winn J. Identifying Participants in the Personal Genome Project by Name. Harvard University. Data Privacy Lab. White Paper 1021-1. April 24, 2013. http://dataprivacylab.org/projects/pgp/1021-1.pdf

Data Warehousing
  Repository of data providing organized and cleaned enterprise-wide data (obtained from a variety of sources) in a standardized format
    – Data mart (single subject area)
    – Enterprise data warehouse (integrated data marts)
    – Metadata

OLAP Analysis
  Aggregation functions
  Factual data access
  Complex criteria
  Visualization

Warehouse Evaluation
  Enterprise-wide support
  Consistency and integration across diverse domains
  Security support
  Support for operational users
  Flexible access for decision makers

Data Integration
  Data access
  Data federation
  Change capture
  Need ETL (extraction, transformation, load)

Data Warehouse Users
  Internal users
    – Employees
    – Managerial
  External users
    – Reporting and auditing
    – Research

Data Mining
  Databases to be mined
  Knowledge to be mined
  Techniques used
  Applications supported

Data Mining Task
  DM: mostly automated
  Prediction tasks
    – Use some variables to predict unknown or future values of other variables
  Description tasks
    – Find human-interpretable patterns that describe the data

Common Tasks
  Classification [Predictive]
  Clustering [Descriptive]
  Association Rule Mining [Descriptive]
  Regression [Predictive]
  Deviation Detection [Predictive]

Security for Data Warehousing
  Establish the organization's security policies and procedures
  Implement logical access control
  Restrict physical access
  Establish internal control and auditing

Data Warehousing Issues: Integrity
  Poor quality data: inaccurate, incomplete, missing meta-data
  Loss of traditional consistency, e.g., keys
  Source data quality vs. derived data quality
    – Trust in the result of analysis?

Big Data Security and Privacy
  Amount of data being considered
  Privacy-preserving analytics
  Granular access control
    – Flat, two-dimensional tables
  Transaction logs and auditing
  Real-time monitoring

Big Data Integrity
  Data accuracy
  Source provenance
  End-point filtering and validation

Access Control
  Layered defense:
    – Access to processes that extract operational data
    – Access to data and processes that transform operational data
    – Access to data and meta-data in the warehouse

Access Control Issues
  Mapping from local to warehouse policies
  How to handle “new” data
  Scalability
  Identity management

Inference Problem
  Data mining discovers “new knowledge”: how to evaluate security risks?
  Example security risks:
    – Prediction of sensitive information
    – Misuse of information
  Assurance of “discovery”

Privacy and Sensitivity
  Large volume of private (personal) data
  Need:
    – Proper acquisition, maintenance, usage, and retention policy
    – Integrity verification
    – Control of analysis methods (aggregation may reveal sensitive data)

Privacy
  What is the difference between confidentiality and privacy?
  Identity, location, activity, etc.
  Anonymity vs. accountability

Legislation
  Privacy Act of 1974, U.S. Department of Justice (http://www.usdoj.gov/oip/04_7_1.html)
  Family Educational Rights and Privacy Act (FERPA), U.S. Department of Education (http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html)
  Health Insurance Portability and Accountability Act of 1996 (HIPAA) (http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act)
  Telecommunications Consumer Privacy Act (http://www.answers.com/topic/electroniccommunications-privacy-act)

Online Social Network
  Social relationship
    – Communication context changes social relationships
    – Social relationships maintained through different media grow at different rates and to different depths
    – No clear consensus on which medium is best

Internet and Social Relationships
  Internet
    – Bridges distance at a low cost
    – New participants tend to “like” each other more
    – Less stressful than face-to-face meeting
    – People focus on communicating their “selves” (except a few malicious users)

Social Network
  Description of the social structure between actors
  Connections: various levels of social familiarity, e.g., from casual acquaintance to close familial bonds
  Support online interaction and content sharing

Social Network Analysis
  The mapping and measuring of relationships and flows between people, groups, organizations, computers, or other information processing entities
  Behavioral profiling
  Note: Social network signatures
    – User names may change; family and friends are more difficult to change

Interesting Read
  M. Chew, D. Balfanz, B. Laurie, (Under)mining Privacy in Social Networks, http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4468

Next
  Web application insecurity: risk to databases
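
Example (not from the slides): aggregation may reveal sensitive data
  The Privacy and Sensitivity slide notes that aggregation may reveal sensitive data. The sketch below illustrates one way this can happen. Everything in it is assumed for illustration: the salary rows, the sum_salary query function, and the minimum-group-size rule are hypothetical and do not come from the course material. It shows that a simple size check blocks a direct query about one person, yet two permitted aggregates that differ by a single row still expose that person's value.

```python
from typing import Callable, Optional, Tuple

Row = Tuple[str, str, int]  # (name, department, salary)

# Hypothetical warehouse rows, invented for this illustration.
ROWS = [
    ("alice", "sales", 52000),
    ("bob", "sales", 48000),
    ("carol", "sales", 61000),
    ("dave", "research", 75000),
    ("erin", "research", 70000),
]

# Assumed policy: refuse any aggregate computed over fewer than 3 rows.
MIN_GROUP_SIZE = 3


def sum_salary(predicate: Callable[[Row], bool]) -> Optional[int]:
    """Return the salary total of matching rows, or None if the group is too small."""
    matched = [row for row in ROWS if predicate(row)]
    if len(matched) < MIN_GROUP_SIZE:
        return None  # suppressed by the size rule
    return sum(row[2] for row in matched)


# A direct query about one person is suppressed.
direct = sum_salary(lambda r: r[0] == "carol")

# Two permitted aggregates that differ only by carol's row.
everyone = sum_salary(lambda r: True)
everyone_but_carol = sum_salary(lambda r: r[0] != "carol")

# Subtracting them recovers the value the size rule was meant to hide.
inferred = everyone - everyone_but_carol

print(direct, inferred)  # None 61000
```

  The size rule looks at each query in isolation, which is why it misses the combination. Controlling analysis methods, as the slide asks, would also require reasoning about what a set of answered queries reveals together.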