Download adam - Stanford Crypto group

Detecting Fraudulent Clicks From BotNets 2.0
Adam Barth
Joint work with Dan Boneh, Andrew Bortz, Collin Jackson, John Mitchell, Weidong Shao, and Elizabeth Stinson

BotNets, Current and Future

Traditional BotNets
• Permanent malware
• Infect host
  – Email attachments
  – Drive-by downloads
• Click-fraud, Spam, DDoS, Key-logging
• ~100,000 members

BotNets 2.0
• Ephemeral
• Browser-based
  – Malicious advertisements
  – Popular web sites
• Click-fraud, Spam, (maybe DDoS)
• Much larger

Browser Security Model
• Same-origin policy for network access
  – Origin is scheme://host:port
• Write HTTP anywhere on the network
  – Easy using HTML forms
  – Except restricted ports, like 25 (SMTP)
• Read from origin only
  – Can read some “library” formats from anywhere
    • JavaScript, CSS, Images, Applets, etc.

Desired Properties of Policy
• Can’t send spam
  – Writes to port 25 blocked
• Can’t click advertisements
  – Need to READ a token to make a click count
• Unfortunately…

DNS Rebinding Attacks
• Circumvent browser network access policy
• attacker.com points to attacker and target
[Diagram labels: <policy-file-request/>; <allow-access-from domain="*" to-ports="*" />; rebind DNS; attacker’s server; target server]
• Can read and write sockets to anywhere

An Experiment
• We ran a Flash ad (gains socket access)
  – Paid $30
  – 50,951 impressions from 44,924 unique IP addresses
• 90.6% of browsers vulnerable
  – More if we include other rebinding attacks
• $100 to hijack 100,000 IP addresses
  – No click required
  – Impressions are cheap

Duration of IP Hijacking

A Long Tail
• Some impressions last for days

Using Rebinding for Click-Fraud
• Enroll as a publisher with ad network A
  – Publish pay-per-click ads on your site
• Enroll as an advertiser with ad network B
  – Buy pay-per-impression Flash ads
• Buy bots for $0.001 each
  – Use 99% just to generate impressions on your site
  – Use 1% to generate ad clicks on $0.50-per-click ads
  – Multiply your money by 5, repeat

Implications for Click-Fraud Defense
• Simulates IP distribution exactly
  – Each bot is an independent sample from web visitors
  – Black-listing IPs as bot-infested is meaningless
• Traffic is time-appropriate for IP
  – Human at that IP is actually surfing the web right now
• HTTP headers are appropriate for IP
  – Grab real headers from request for Flash ad
  – Can’t get cookies, but many networks don’t use them

Distinguish Bots from Humans
• Bots cannot simulate human cognition
• Can’t use traditional CAPTCHAs
  – Too disruptive to the user experience
  – User has no interest in proving their humanity
• Click-fraud detection is a different problem
  – CAPTCHAs determine if this client is a human
  – We just need to estimate the proportion of humans

A Straw-Man Design
• Humans click “Yes!”
• Bots click at random
• Ad network stats:
  – 3487 Yes clicks
  – 1271 No clicks
• How many bots?
  – Expectation: 2542
  – High-probability bound is an exercise for the reader

A Real Advertisement
• Where will humans click?
• Bots cannot simulate
• Can’t trick humans into clicking
  – Actually need to process the ad

Image Recognition Doesn’t Help
• Suppose the bot can identify the hot spots
  – Say by segmenting the image using vision techniques
• In what ratio should the bot click?
  – Depends on the relative appeal of the hot spots
  – Requires human-level AI to get right
• Any error is a signal of bot proportion

Fraudster Has to Click on Many Ads

Ad Network can Measure Humans
• At first, run ads on trusted partners
  – Record distribution of human click location
  – Easy to record (x, y) coordinates of click on web
• Cheap for ad network
  – Was going to run ad anyway
• Expensive for attacker to influence
  – Must use valuable bot clicks without payout
  – Must be clicking everywhere all the time

A Work in Progress
• Need to validate diversity in distribution
  – Will run real ads and measure click location
  – How does distribution vary by screen location of ad?
• Experiment with ad design
  – Objective: human click location hard for bot to predict
• Text ads?
  – Less area to click and less enticing visuals
  – There still might be a valuable signal in click location

Conclusions
• BotNets 2.0 are coming
  – Cheap, large-scale, ephemeral bots in the browser
  – Don’t require full-machine compromise
  – Heuristic click-fraud detection’s days are numbered
• Click location can divide humans from bots
  – Accurate simulation requires human cognition
  – Easy for ad networks to deploy
  – More science needed to determine effectiveness

Thanks!
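
Worked arithmetic for two of the slides
The figures on “A Straw-Man Design” (2542 expected bots) and “Using Rebinding for Click-Fraud” (multiply your money by 5) can be reproduced with a short Python sketch. The function names are hypothetical and not from the talk. The sketch assumes that humans always click “Yes!”, that each bot picks Yes or No with equal probability, and that the publisher keeps the full $0.50 per click with no revenue share taken by the ad network.

```python
def estimate_bots(yes_clicks: int, no_clicks: int) -> dict:
    """Estimate bot and human click counts in the straw-man design.

    Assumptions (labeled, not from the slides verbatim): humans never
    click "No", and each bot clicks "Yes" or "No" with probability 1/2.
    Under those assumptions every "No" click is a bot, and the expected
    number of bots is twice the number of "No" clicks.
    """
    total = yes_clicks + no_clicks
    bots = 2 * no_clicks
    humans = total - bots
    return {"bots": bots, "humans": humans, "bot_fraction": bots / total}


def click_fraud_multiplier(bot_cost: float = 0.001,
                           click_share: float = 0.01,
                           payout_per_click: float = 0.50) -> float:
    """Return revenue earned per dollar spent on bots.

    Assumption: the publisher receives the whole per-click price.
    Each bot costs bot_cost; a fraction click_share of bots each
    produce one paid click worth payout_per_click.
    """
    revenue_per_bot = click_share * payout_per_click
    return revenue_per_bot / bot_cost


if __name__ == "__main__":
    print(estimate_bots(3487, 1271))
    print(click_fraud_multiplier())
```

With the slide's numbers, the first function attributes 2542 of the 4758 clicks to bots, a little over half of all traffic. The second gives a return of about 5 for each dollar spent, which matches the slide's claim; any revenue share kept by the ad network would lower that figure proportionally.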
