Reactively Adaptive Malware
What is it? How do we detect it?

Dr. Bhavani Thuraisingham
Cyber Security Research and Education Institute
https://csi.utdallas.edu
The University of Texas at Dallas
April 19, 2013
FEARLESS engineering

Outline
• Analogies
• Malware: What is it?
• Our Solutions
  – Profs. Thuraisingham, Khan, Hamlen, Lin, Makris, Cardenas, Kantarcioglu
• Directions
  – Holistic Interdisciplinary Treatment

Analogies: The Human Body
• Humans are infected with viruses and bacteria
• A virus replicates itself and spreads throughout the body
• It attacks vital organs
• A doctor conducts tests and detects the problem
• Medicine is given to slow the progress of the disease
• The patient’s condition may improve, or the patient may die

Analogies: An Organization
• A bad person joins the organization and pretends to be a good person
• He/she monitors what is going on and spies on the organization
• Conveys vital information to the adversary – insider threat
• Builds a network of bad people
• Takes over the organization

What is Malware?
• It is a piece of software that is malicious and carries out harmful actions
• It infects a vulnerable and neglected machine
• It attacks the various components of the machine: the operating system (vital organs), applications (limbs) and hardware (bone)
• It spreads across a network of machines
• It cripples the machines and the network
• It conveys vital information to the enemy – the hacker
• It takes over the network and carries out its agenda

What does it look like?
Example: Melissa Virus, March 26, 1999
[Figure label: Victim Network]

The Virus-Antivirus Arms Race
• Malware (e.g., viruses)
  – Rogue programs that carry out malicious actions on victim machines
    • Vandalism (delete files, carry out phishing scams, etc.)
    • Reconnaissance & secret exfiltration (cyber-warfare / hacktivism)
    • Sabotage (e.g., attacks against power grids)
  – Randomly mutate themselves automatically as they propagate
    • Harder to detect since no two samples look identical
• Antivirus defenses
  – Defenders manually reverse-engineer many malware samples
  – Find mutation patterns
  – Build defenses to automatically detect & quarantine all mutants

Incidents Reported 1990-2001
[Figure: incidents reported per year to the Computer Emergency Response Team/Coordination Center (CERT/CC), 1990-2001]
• Everything changed with the Code Red attack in 2001
• The problem is much worse now!

Our Malware Team
• Data Mining Solutions for Malware – Professor Latifur Khan
• Android Malware and Solutions – Professor Zhiqiang Lin
• Reactively Adaptive Malware and Solutions – Professor Kevin Hamlen
• Hardware Malware and Solutions – Professor Yiorgos Makris
• Adversarial Mining Solutions – Professor Murat Kantarcioglu
• Smart Grid Malware and Solutions – Professor Alvaro Cardenas

Data Mining Solutions
Data Mining / Knowledge Discovery in Databases / Data Pattern Processing / Knowledge Extraction
The process of discovering meaningful new correlations, patterns, trends and nuggets, often previously unknown, by sifting through large amounts of attack data using pattern recognition technologies and machine learning, statistical and mathematical techniques.
Thuraisingham, Data Mining: Technologies, Techniques, Tools and Trends, CRC Press 1998

Training and Testing
• Extract features
  – Binary n-gram features
  – Assembly n-gram features
• Enhancements to current data mining approaches
  – Hierarchical Clustering (DGSOT)
  – DGSOT: Dynamically Growing Self-Organizing Tree – our novel solution
• Supported by US Air Force 2005-2008
  – PI: Thuraisingham, Co-PI: Khan
[Figure: training and testing pipeline; labels: Training Data, Testing Data, Data Mining, Classification Model, Training, Testing, Good Class, Bad Class, Report]

Results: Example
• HFS = Hybrid Feature Set (Binary and Assembly)
• BFS = Binary Feature Set
• AFS = Assembly Feature Set

Reactively Adaptive Malware: What is it?
• Next-generation Malware Technology
  – Malware that mutates NON-randomly
  – LEARNS and ADAPTS to antivirus defenses fully automatically in the wild
  – Immune to conventional antivirus defenses
  – Supported by the U.S. Air Force; 2010-2013
    • PI: Hamlen, Co-PI: Khan

Data Mining-based Anti-antivirus [Hamlen & Khan]
[Figure labels: Signature Query Interface, Antivirus Signature Database, Signature Inference Engine, Signature Approximation Model, Obfuscation Generation, Malware Binary, Obfuscation Function, Testing, Obfuscated Binary, propagate]

“Frankenstein” [Mohan & Hamlen, USENIX WOOT, 2012]
• Stitch together code harvested from benign binaries to re-implement malware on each propagation.
• Many offensive advantages:
  – resulting malware is 100% metamorphic
    • no common features between mutants
  – statistically indistinguishable from benign-ware
    • everything is plaintext code (no cyphertexts)
  – no runtime unpacking
    • evades write-then-execute protections
  – obfuscation is targeted and directed
    • evolves to match infected system’s notion of “benign”

Frankenstein Press Coverage
• Presented at USENIX Offensive Technologies (WOOT) mid-August 2012
• Thousands of news stories in August/September
  – The Economist, New Scientist, NBC News, Wired UK, The Verge, Huffington Post, Live Science, …

Solution we are exploring: SNODMAL
Stream Based Novel Class Detection
• Divide the data stream into equal-sized chunks
  – Train a classifier from each data chunk
  – Keep the best L such classifiers as the ensemble
• Note: Di may contain data points from different classes
• Addresses infinite length and concept-drift
[Figure: data chunks (labeled chunks D1, D2, … and an unlabeled chunk), classifiers C1, C2, …, ensemble, prediction]

Smartphones can also be infected with malware!

Our Solution – Combine Static Analysis with Dynamic Analysis
• Static Analysis
  – Data mining solutions
• Dynamic Analysis
  – Platform – Android & iPhone
  – Reverse engineering Mal App
• Level App
  – System call Behavior
  – Operating systems
  – Network
• Supported by US Air Force 2012-2016
  – Technical Leads Lin and Khan
[Figure labels: Network Behavior, Remote Server]

We cannot forget about Hardware
The Hunt for the Kill Switch, Adee, IEEE Spectrum, 2008
Do you Trust Your Chips?
Yiorgos Makris ([email protected])
Research supported by: [sponsor logos]
• 2012 Phobos-Grunt Mission Fails Due to Counterfeit Non Space-Rated Chips
• The Hacker in Your Hardware, Villasenor, Scientific American 2010
Our Solution to Hardware Trojan

That’s not all – Attacks to Critical Infrastructures
Attacks
• Maroochy Shire 2000
• HVAC 2012
• Stuxnet 2010
• Smart Meters 2012
Threats
• Obama administration demonstrates attack to power grid in Feb. 2012
• DHS and INL study impact of cyberattacks on generator

New Attack-Detection Mechanisms by Incorporating “Physical Constraints” of the System
• 1st Step: Model the Physical World
• 2nd Step: Detect Attacks
  – Compare received signal with expected signal
• 3rd Step: Response to Attacks
• 4th Step: Security Analysis
  – Missed Detections: study stealthy attacks
  – False Positives: ensure safety of automated response
[Figure labels: Physical World, Model, System of Differential Equations]
[Alvaro Cárdenas, et al., AsiaCCS, 2011]

It never ends! We need to mine the adversary
• Adversary changes its behavior to avoid being detected
• Data Miner and the Adversary are playing games
• Remember: malware detection is a two-class problem
  – Good class (e.g., benign program)
  – Bad class (e.g., malware)
• Adapt your classifier to changing adversary behavior
• Questions?
  – How to model this game? Does this game ever end?
  – Is there an equilibrium point in the game?
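
The second step of the physical-constraints approach above (compare the received signal with the signal the model expects) can be sketched as follows. This is a minimal illustration and not the method of the cited AsiaCCS 2011 work: the first-order differential equation dx/dt = -a*x + b*u, its coefficients, the alarm threshold, and the function names `simulate_expected` and `detect_attacks` are all assumptions chosen for the example.

```python
def simulate_expected(x0, inputs, a=0.5, b=1.0, dt=0.1):
    """Predict sensor readings by Euler-integrating dx/dt = -a*x + b*u.

    The model and its coefficients are hypothetical stand-ins for the
    "system of differential equations" that describes the physical world.
    """
    x = x0
    expected = []
    for u in inputs:
        x = x + dt * (-a * x + b * u)
        expected.append(x)
    return expected


def detect_attacks(received, expected, threshold=0.2):
    """Return the time steps where the received signal departs from the model."""
    return [
        i
        for i, (r, e) in enumerate(zip(received, expected))
        if abs(r - e) > threshold
    ]


# Constant control input for 20 time steps (assumed scenario).
inputs = [1.0] * 20
expected = simulate_expected(0.0, inputs)

# The received signal matches the model except for one injected false reading.
received = list(expected)
received[12] += 1.0

alarms = detect_attacks(received, expected)
print(alarms)
```

The threshold is where the fourth step matters: set it too high and stealthy attacks that stay under it are missed, set it too low and sensor noise raises false positives that trigger the automated response.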
Our Solution: Game Playing
• Adversarial Stackelberg Game
  – Adversary chooses an action
  – After observing the action, data miner chooses a counteraction
  – Game ends with payoffs to each player
• Adversary may use malware obfuscation
• Change has some cost to the adversary
• We need data mining techniques to handle the changes by the adversary
• Funded by the US Army; 2012-2015
  – PI: Kantarcioglu, Co-PI: Thuraisingham

Where do we go from here: Holistic Treatment
Three actors interacting with each other:
• The Doctor – The Defender/Analyst
• The Patient – The User/Soldier
• The Virus/Bacteria – The Malware/Attacker
Together with ECS, SOM, EPPS and BBS, we are proposing an interdisciplinary approach.
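
The Stackelberg game on the Game Playing slide (the adversary moves first, the data miner observes and responds, then both receive payoffs) can be solved by backward induction on a small payoff table. The sketch below is only illustrative: the action names and every payoff value are made up for the example, with the cost of obfuscation assumed to be already subtracted from the adversary's payoffs.

```python
# Hypothetical payoff table: (adversary action, data miner action)
# -> (adversary payoff, data miner payoff). All numbers are illustrative.
PAYOFFS = {
    ("no_obfuscation", "keep_classifier"): (1, 3),
    ("no_obfuscation", "retrain_classifier"): (0, 2),
    ("obfuscate", "keep_classifier"): (4, 0),
    ("obfuscate", "retrain_classifier"): (2, 1),
}


def best_response(adversary_action):
    """Data miner (follower) picks the counteraction with its highest payoff."""
    options = [m for (a, m) in PAYOFFS if a == adversary_action]
    return max(options, key=lambda m: PAYOFFS[(adversary_action, m)][1])


def solve_stackelberg():
    """Adversary (leader) picks the action that pays best once the
    data miner's best response to it is taken into account."""
    adversary_actions = sorted({a for (a, _) in PAYOFFS})
    leader = max(
        adversary_actions,
        key=lambda a: PAYOFFS[(a, best_response(a))][0],
    )
    follower = best_response(leader)
    return leader, follower, PAYOFFS[(leader, follower)]


print(solve_stackelberg())
# ('obfuscate', 'retrain_classifier', (2, 1))
```

With these numbers the adversary obfuscates even though the data miner then retrains, because a payoff of 2 still beats the 1 it would get by not obfuscating. Raising the assumed obfuscation cost enough to push that 2 below 1 flips the outcome, which is one way to read the slide's point that change has a cost to the adversary.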