The 6th Post Graduate Conference of Computer Engineering (cPGCON-2017)
Privacy Preserving Data Publishing
Paper ID: XX
Track: Wireless Networks and Communications
Presented by: Mr/Ms XYZ
Guided By: Prof/Dr XYZ
College Name: XYZ
College Code: XX

Contents
• Introduction and Motivation
• Our Contribution
• Literature Survey
• Our Proposed Approach
• Methodology of Evaluation
• Performance Result Analysis
• Conclusions and Future Work
• References
cPGCON-2017, Track=xx, Paper ID=xx, College code=xx

Introduction
• A large amount of data has been collected by various organizations, viz. medical and insurance organizations
  – For research and analysis
  – Contains sensitive personal information
  – Privacy-related incidents are reported in [1-7] [9-40] [50-64]

Introduction (contd.)
• Attempts to preserve privacy have been made in [3-4][10-29][75][79][80-81][172-174][180]

Motivation
• We observe in the anonymization approaches [3][10-29][49][74-77][79][80-81] that
  – there is a tradeoff between privacy and information loss
• We notice that the k-anonymity model
  – Suffers from information loss due to generalization and suppression
  – Cannot maintain diversity among the sensitive attribute values

Our Contribution
• We propose a sensitive attribute based clustering approach for the k-anonymity model.
  – For minimizing the information loss
  – For minimizing the disclosure risk
  – To maintain the diversity among the sensitive attributes

Literature Survey
• Various approaches [1-20][30-50][53-77] have been proposed in the literature for privacy preserving data mining (PPDM).
• Traditional Approaches [68-73][75]
  – Disclose the data through inferences from the original data

An Illustration
• Give an example that states how your proposed approach would solve the problem and give a better solution.

Our Proposed Approach/Mathematical Model
• The central outline of the proposed algorithm is as follows.
  – Step 1: We first load the database.
  – Step 2: We identify and classify the attributes in the database as identifier, quasi-identifier and sensitive attribute.
  – Step 3:…
  – Step 4:…

Our Proposed Approach (contd.)
[Figure: block diagram of the proposed system. Labels: Database; Initial Solution; Encoding-Grouping; Solution Encoding; Distance Matrix; Grouping with the k and l parameters; Objective Function; Bacterial Foraging Optimization; Chemotaxis; Reproduction; Elimination-Dispersal]
Note: Draw a pictorial representation of your proposed system

Methodology of Evaluation
• We compare our proposed approach with state-of-the-art clustering approaches, viz.
  – Kabir et al. [17], systematic clustering algorithm, 2011
  – Byun et al. [12], greedy k-member algorithm, 2007

Methodology of Evaluation (contd.)
• We use Visual Basic 6.0 and Microsoft Access 2007 for the implementation and run it on a machine with a 3.2 GHz Intel Core 2 Duo processor and 2 GB RAM.
• Microsoft Windows XP Professional is used as the operating system.

Metrics of Evaluation
• We evaluated our proposed approach with respect to two parameters
  – information loss and execution time.
• We ran our proposed approach for various k-values
  – 20, 40, 60, 80 and 100.

Test Application
• Write the pseudocode of your proposed algorithm in Courier New font, size 22.
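The "Grouping with the k and l parameters" step and the information loss metric named above can be illustrated with a minimal Python sketch. It is not the pseudocode of the proposed algorithm. It assumes that a clustering is a list of clusters, each a list of record dictionaries, that the l parameter means distinct l-diversity (at least l different sensitive values per cluster), and that information loss is measured with a simple range-based formula on one numeric quasi-identifier. The function names, the toy records and the loss formula are hypothetical choices for illustration; the metric used in the evaluation may differ.

```python
def satisfies_k_and_l(clusters, k, l, sensitive="Occupation"):
    """Check that every cluster has at least k records (k-anonymity)
    and at least l distinct sensitive values (distinct l-diversity)."""
    return all(
        len(cluster) >= k
        and len({record[sensitive] for record in cluster}) >= l
        for cluster in clusters
    )


def numeric_information_loss(clusters, attribute="Age"):
    """Assumed range-based loss: for each cluster, its size times the
    cluster's value range divided by the range over all records."""
    all_values = [record[attribute] for cluster in clusters for record in cluster]
    global_range = max(all_values) - min(all_values)
    if global_range == 0:
        return 0.0  # every record has the same value, nothing is lost
    loss = 0.0
    for cluster in clusters:
        values = [record[attribute] for record in cluster]
        loss += len(cluster) * (max(values) - min(values)) / global_range
    return loss


# Toy data (made up for illustration, not taken from the Adult dataset).
toy_clusters = [
    [{"Age": 25, "Occupation": "Sales"}, {"Age": 29, "Occupation": "Tech-support"}],
    [{"Age": 40, "Occupation": "Sales"}, {"Age": 45, "Occupation": "Sales"}],
]

if __name__ == "__main__":
    print(satisfies_k_and_l(toy_clusters, k=2, l=1))
    print(satisfies_k_and_l(toy_clusters, k=2, l=2))
    print(numeric_information_loss(toy_clusters))
```

In the toy data the second cluster holds only one distinct Occupation value, so it meets k = 2 but fails l = 2. This is the situation in which k-anonymity alone does not protect the sensitive attribute.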
Data Set
• We use the Adult dataset from the UCI Machine Learning Repository, with 32561 records and 14 attributes.
  – Of these, we retain only the attributes Age, Race, Marital-status, Sex, fnlwgt and Occupation.
  – The attribute Occupation is taken as the sensitive attribute in the dataset.

Performance Result and Analysis
Figure 1: Information loss for the Adult Dataset

Our Observations
• It is indeed feasible to form clusters based on the sensitive attribute so as to minimize the disclosure risk with lesser information loss.
• During the evaluation, we notice that
  – our algorithm is sometimes affected by similar sensitive attribute values, if the real dataset contains similar kinds of sensitive attribute values.
  – Thus, it becomes simpler for the miner to identify an individual.

Conclusion
• In our approach, we proposed a sensitive attribute based clustering approach for the k-anonymity model.
  – The empirical evaluations show that it is feasible to achieve lesser information loss at k ≤ 40 instead of setting a higher value of k.

Future Work
• In our research attempt,
  – we have focused on the clustering of static and centralized databases in privacy preserving data mining.
  – However, databases are growing tremendously through the use of the Internet.
  – Thus, our future work would be to extend the k-anonymity and the l-diversity models using the BFO algorithm to dynamic and distributed databases.

References
[1] A. I. Anton, Q. He and D. L. Baumer, “Inside JetBlue’s privacy policy violations”, IEEE Security and Privacy, Vol. 2, No. 6, pp. 12-18, 2004.
[2] R. Agrawal and R. Srikant, “Privacy preserving data mining”, ACM SIGMOD Record, Vol. 29, No. 2, pp. 439-450, 2000.
[3] Y. Lindell and B. Pinkas, “Privacy preserving data mining”, Journal of Cryptology, Vol. 15, No. 3, pp. 177-206, 2002.
[4] F. D. Schoeman, “Philosophical dimensions of privacy: an anthology”, Cambridge University Press, 1984.
[5] G. J. Walters, “Human Rights in an information age: a philosophical analysis”, Chapter 5, University of Toronto Press, 2002.
[6] J. Zhan, “Using cryptography for privacy protection in data mining system”, In: Proceedings of the 1st WICI International Workshop on Web Intelligence Meets Brain Informatics (WImBI), LNCS 4845, pp. 494-513, 2007.

Thank You