DBPD: A Dynamic Birthmark-based Software Plagiarism Detection Tool

Zhenzhou Tian
[email protected]
MOE Key Lab for Intelligent Networks and Network Security
Xi’an Jiaotong University, China
2017/8/12

Introduction

Software plagiarism has been a serious threat to the healthy development of the software industry
• Licenses are violated, for commercial interests or unwittingly
• Weak code protection awareness
• Powerful automated code obfuscation tools
• Software is distributed in binary form

Introduction

Many software birthmark-based techniques have been proposed
• Static Birthmarks: CVFV, SMC, IS, UC…
• Dynamic Birthmarks: WPP, SCSSB, SCDG, DKISB…

Few tools are publicly available

Tool           Static/Dynamic   Language
Sandmark       Static           Java bytecode
Stigmata       Static           Java bytecode
Birthmarking   Dynamic          Java bytecode
JPlag          Static           Source code

Dynamic birthmarks are believed to perform better than static birthmarks

Framework of DBPD

Software Birthmark
A set of characteristics extracted from a program that reflects intrinsic properties of the program and can be used to identify the program uniquely.

Design Overview
[Diagram. Components shown: Input (Plaintiff Binary, Defendant Binary); Dynamic Analysis Module; Birthmark Generator (DKISB Generator, SODB Generator, SCSSB Generator); Similarity Calculator & Decision Maker]

Three Dynamic Birthmarks

Three Birthmark Approaches Implemented

DKISB: Dynamic Key Instruction Sequence Birthmark
Generated using the k-gram algorithm from dynamic key instructions (instructions that are both value-updating and input-correlated).
SCSSB: System Call Short Sequence Birthmark
Extracted by splitting the system call sequence into short subsequences.

SODB: Stack Operation Dynamic Birthmark
Generated by analyzing the behavior of stack operations, using the push and pop patterns of the call stack to uniquely identify a program.

Demonstration
Independently implemented software with similar functionalities

Demonstration
Plagiarism Using Different Compilers and Optimization Levels

Demonstration
Plagiarism Using Specific Obfuscation Tools

Demonstration
Cross-Platform Plagiarism Scenario

Some Definitions
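
The k-gram step named for DKISB, and the splitting of a system call sequence into short subsequences named for SCSSB, can both be illustrated with a sliding window over a recorded trace. The following is a minimal sketch under stated assumptions, not DBPD's implementation: the sliding-window interpretation, the window length k = 2, the Jaccard similarity, and the two example traces are all illustrative choices that the slides do not specify, and the function names `k_grams` and `jaccard` are hypothetical.

```python
from typing import Sequence, Set, Tuple

KGram = Tuple[str, ...]


def k_grams(trace: Sequence[str], k: int) -> Set[KGram]:
    """Return the set of all length-k sliding windows over a trace.

    A trace shorter than k yields an empty set.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def jaccard(a: Set[KGram], b: Set[KGram]) -> float:
    """Jaccard similarity of two k-gram sets (an assumed metric,
    not necessarily the one DBPD uses)."""
    if not a and not b:
        return 1.0  # two empty birthmarks are treated as identical
    return len(a & b) / len(a | b)


# Made-up system call traces for illustration only.
plaintiff_trace = ["open", "read", "write", "close"]
defendant_trace = ["open", "read", "write", "write", "close"]

K = 2  # assumed window length
plaintiff_birthmark = k_grams(plaintiff_trace, K)
defendant_birthmark = k_grams(defendant_trace, K)

similarity = jaccard(plaintiff_birthmark, defendant_birthmark)
print(f"similarity = {similarity:.2f}")  # prints "similarity = 0.75"
```

Here the two birthmarks share three of four distinct 2-grams, so the score is 0.75. A decision step would then compare such a score against a threshold; the slides do not state what threshold DBPD uses.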