C. Lucchese (University of Venice, Italy), M. Vlachos (IBM Research), D. Rajan (IBM Research), P.S. Yu (University of Chicago)

Objective: Ownership seal with Mining Guarantees
The trajectories are modified imperceptibly, but their neighboring objects are not distorted.
– Embed a stamp so that we can claim ownership of the data
– Output of database and data mining operations is the same as on the original data
[Figure labels: NN Search, Clustering, Classification, …; Final Destination]

Applications: Database Search
The watermark does not change the nearest neighbor, so search operations remain the same.
– outsource data to a mining company
– maintain principal rights of the dataset
We want to retain the nearest neighbor (NN) of each object. Determine the maximum watermark embedding power p which maintains the NN for all objects, that is, for every object x and every other object y:
Dp(x, NN(x)) < Dp(x, y)
[Figure: object x, its nearest neighbor NN(x), and other objects y1, y2]

Applications: Classification Preservation
[Figure: dataset of time-series/trajectories with class labels (Class A, Class B), and the modified dataset including the watermark]
Objective: Distort the data imperceptibly so that class labels are maintained.
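The nearest-neighbor requirement Dp(x, NN(x)) < Dp(x, y) stated earlier for database search can be checked directly on a small dataset. The following is only a minimal sketch, assuming plain Euclidean distance between equal-length trajectories of (x, y) points; the function names are hypothetical and do not come from the poster.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length trajectories of (x, y) points."""
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))

def nearest_neighbor(dataset, i):
    """Index of the object closest to dataset[i], excluding i itself."""
    others = (j for j in range(len(dataset)) if j != i)
    return min(others, key=lambda j: euclidean(dataset[i], dataset[j]))

def nn_preserved(original, watermarked):
    """True if every object keeps the same nearest neighbor after modification."""
    return all(nearest_neighbor(original, i) == nearest_neighbor(watermarked, i)
               for i in range(len(original)))

# Three toy trajectories of two points each (assumed data, for illustration only).
original = [[(0, 0), (1, 0)], [(0, 1), (1, 1)], [(0, 5), (1, 5)]]
slightly = [[(0, 0.1), (1, 0)], [(0, 1), (1, 1.1)], [(0, 5), (1, 4.9)]]
heavily = [[(0, 0), (1, 0)], [(0, 4), (1, 4)], [(0, 5), (1, 5)]]

print(nn_preserved(original, slightly))  # small distortion
print(nn_preserved(original, heavily))   # large distortion
```

A distortion is acceptable in this sense only when `nn_preserved` holds for the whole dataset, which is why the embedding power has to be bounded.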
[Figure labels, classification example: Unacceptable, Acceptable; Class B]

Applications: Clustering Preservation
Results of clustering remain the same.
– geodesic distances will remain the same
– hierarchical clustering will not be affected
[Figure: clustering example with labels Gray-necked Owl Monkey Female, Gray-necked Owl Monkey Male, Orangutan juvenile, Mandrill male, Red Howler Monkey Male, Mantled Howler Monkey, Orangutan2 male, Mandrill2 male, Juvenile Baboon, De Brazza Monkey Juvenile Male, De Brazza Monkey Male, Common Chimpanzee male, Common Chimpanzee Male 2]

The secret key is embedded in a domain resilient to common trajectory transformations
Embedding takes place in the frequency domain:
– apply the Fourier transform (ft) to the original data to obtain magnitudes and phases
– keep the phases the same and modify the magnitudes with the watermark, for example w = [-1 1 -1 -1 1 1], using additive embedding with embedding power p
– apply the inverse transform (ift) to the watermarked magnitudes and the original phases to obtain the watermarked data

Techniques are also applicable for image shapes (shapes can be treated as trajectories)
[Figure: Red Howler Monkey Male (Alouatta seniculus seniculus); Orangutan skull; extracted shape; conversion of skull shape into a two-dimensional sequence]

Embed the key in the k most important coefficients
Secret information is hidden in some of the frequency components.
[Figure: trajectory (X, Y axes) reconstructed from 2, 4, 8, 16, 32 and 64 coefficients]
Select the frequency coefficients that best describe the shape of the trajectory. One can select either the highest-energy coefficients or the low-frequency coefficients.
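The embedding steps described above (transform, modify magnitudes, keep phases, invert) can be sketched as follows. This is an illustrative sketch under several assumptions that are not stated on the poster: the trajectory is treated as the complex sequence x + iy, a naive DFT from the standard library is used for self-containment, the DC coefficient is skipped, the k highest-magnitude coefficients are selected, the power p is applied without any scaling, and magnitudes are clamped at zero. The names `embed` and `top_k_indices` are hypothetical.

```python
import cmath

def dft(z):
    """Naive discrete Fourier transform of a complex sequence."""
    n = len(z)
    return [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(Z):
    """Naive inverse discrete Fourier transform."""
    n = len(Z)
    return [sum(Z[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def top_k_indices(Z, k):
    """Indices of the k highest-magnitude coefficients, skipping the DC term."""
    ranked = sorted(range(1, len(Z)), key=lambda j: abs(Z[j]), reverse=True)
    return sorted(ranked[:k])

def embed(points, watermark, power):
    """Additively embed a +/-1 watermark in the magnitudes of selected coefficients.

    Returns the watermarked points and the indices that carry the watermark.
    """
    Z = dft([complex(x, y) for x, y in points])
    idx = top_k_indices(Z, len(watermark))
    for j, w in zip(idx, watermark):
        mag, phase = cmath.polar(Z[j])
        # Phase is left untouched; magnitude is shifted by power * w (clamped at 0).
        Z[j] = cmath.rect(max(mag + power * w, 0.0), phase)
    return [(c.real, c.imag) for c in idft(Z)], idx

# Toy trajectory of 16 points (assumed data) and the example watermark.
points = [(float(t), float((t * t) % 7)) for t in range(16)]
marked, idx = embed(points, [-1, 1, -1, -1, 1, 1], power=0.5)
print(idx)
```

Selecting low-frequency coefficients instead would only change `top_k_indices`, for example by returning the first k non-DC indices.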
Removal of the watermark will be more difficult without destroying the important trajectory characteristics.

The key is detected very efficiently even when it is inserted with low embedding power
Detection of the embedded key is virtually perfect.
– transform the watermarked data to the frequency domain (ft) and take the magnitudes
– correlate the magnitudes with the watermark, for example w = [-1 1 -1 -1 1 1]
– compare the correlation against a threshold
Better detection (semi-blind): remove the ‘background noise’ bias before the embedding and during the detection.

Example of using our technique for spanning tree preservation
[Figure: MST before watermarking; MST after watermarking]

The proposed fast algorithm prunes a significant amount of the search space
Finding the maximum embedding power: for each power p, we need to examine how many times the nearest-neighbor condition is violated, that is, how many times
Dp(x, NN(x)) > Dp(x, y)
holds for some object y. To do so, express the distance parametrized by the embedding power of the key.
[Figure: objects x, y, z and NN(x)]

Our approach can embed the hidden information more than 300 times faster than the brute-force approach
The fast search techniques find the same result as the exhaustive search, but are 2-3 orders of magnitude faster.
[Figure: Running Time]

The efficient key embedding + detection allow for effective key recovery even under attacks
– Geometric attacks: perfect detection under translation/rotation/scaling
– Gaussian noise: the attack has to destroy the data in order to be effective
– Decimation: the attack can be perfectly withstood
– Data reduction: the attack is not effective, even when pruning 50% of the dataset
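The correlation-and-threshold detection described earlier, together with its robustness to translation and rotation, can be illustrated with a simplified detector. This sketch makes assumptions beyond what the poster states: the detector is given the original trajectory and the indices of the watermarked coefficients, and it correlates the watermark with the change in magnitudes rather than with the raw magnitudes. The names `correlation` and `detect` are hypothetical.

```python
import cmath

def dft(z):
    """Naive discrete Fourier transform of a complex sequence."""
    n = len(z)
    return [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def magnitudes(points, indices):
    """Fourier magnitudes of a trajectory (as x + iy) at the given indices."""
    Z = dft([complex(x, y) for x, y in points])
    return [abs(Z[j]) for j in indices]

def correlation(suspect, original, indices, watermark):
    """Mean of w_j * (|Z'_j| - |Z_j|) over the watermarked positions.

    Assumption: the original magnitudes are available to the detector.
    """
    s = magnitudes(suspect, indices)
    o = magnitudes(original, indices)
    return sum(w * (a - b) for w, a, b in zip(watermark, s, o)) / len(watermark)

def detect(suspect, original, indices, watermark, threshold):
    """Report the watermark as present if the correlation exceeds the threshold."""
    return correlation(suspect, original, indices, watermark) > threshold
```

With additive embedding of power p and no clamping, this correlation equals p for the correct watermark and -p for its negation. Rotation multiplies every Fourier coefficient by a unit complex number and translation changes only the DC coefficient, so the magnitudes at non-DC indices, and hence the correlation, are unaffected by either. Scaling changes all magnitudes by a common factor and would need a normalization step that this sketch omits.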