Download Bots—Fast Growing Bane of the Web: Crawlers

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Web 2.0 wikipedia , lookup

Marketing mix modeling wikipedia , lookup

Service parts pricing wikipedia , lookup

Search engine optimization wikipedia , lookup

Sales process engineering wikipedia , lookup

Web analytics wikipedia , lookup

Transcript
SESSION ID: SPO2-W05
Bots—Fast Growing Bane of the
Web: Crawlers, Scrapers and
Account Checkers
John Summers
Vice President Security Business
Unit, Akamai Technologies
#RSAC
#RSAC
Security always gets the tough nuts.
But are bots just a security problem?
#RSAC
Agenda
Understanding
Bot Problem
Security
alwaysthe
gets
the tough nuts.
Size
Operator
But
Types of Bots
are bots just a security problem?
Business and IT Impact.
The Right Strategy for Each Use Case
Lessons Learned
#RSAC
Understanding the “Bot Problem”
Your site traffic
What you think your
traffic looks like
What your traffic
actually looks like
#RSAC
Understanding the “Bot Problem”
Unknown │ 10%
Site development & monitoring │ 5%
Other │ 1%
Advertising │ 1%
Web archiver │ 2%
Search engine │ 20%
63% │ User traffic
#RSAC
This is a bot.
#RSAC
This is a bot.
A bot is just an automated tool.
#RSAC
More important is the bot operator, which programs the
tool to perform a repetitive task on the Internet.
A bot is just an automated tool.
Understanding bots requires understanding the bot operator.
How big is the Bot problem
Clothing company: 15.3% BW Car dealer: 32.3% BW
#RSAC
Hi-Tech distrib: 70.3% BW
How big is the Bot problem
31.9%
Volume
31.5%
Hits
Bot Traffic vs. Total Traffic
#RSAC
21st – 28th January 2016
MEDIAN across all companies
using BotMan
Types of Bots
More than 1,300 known bots
#RSAC
#RSAC
On going research
Dealing with
is a process
What’s wrong with…
#RSAC
Business and IT Impacts
Origin load
IT
Poor site performance and higher infrastructure costs
#RSAC
#RSAC
Business and IT Impacts
Origin load
IT
Poor site performance and higher infrastructure costs
Content aggregation
Web scraping
Loss of opportunity and customer relationships
IP theft and loss of competitive advantage
Inventory grabbing
Web analytics
BUSINESS
Loss of sales and customer relationships
Effective monitoring of marketing metrics
Form / comment spam
Good bot management
False records and defaced online communities
Search engines, partners and internal bots
Business and IT Impacts. Stakeholders
#RSAC
Origin load Infrastrucure Manager, IT Manager, CIO
IT
Content aggregation
Head of Sales, Content Manager
Web scraping
Content
aggregation
CISO, Web
Legal,scraping
VP of Products
Inventory grabbing
Inventory grabbing
Loss
of sales
and customer relationships
Web
analytics
Head of Sales, Procurement, Operations
Web analytics
monitoring
marketing
VP of Effective
Marketing,
VPofof
Sales metrics
Loss of opportunity and customer relationships
IP theft and loss of competitive advantage
BUSINESS
Form / comment spam
Good bot management
Form
/
comment
spam
VP
of
Marketing,
Community
Manager
False records and defaced online communities
Search
engines, partners
and internal
bots
Good bot management
SEO, VP of Channels, Procurement
#RSAC
Manage, Not Mitigate
Motivation
The bot is here to get something
Whack-a-mole
The bot returns but is now
better hidden from detection
Evasion
Operator modifies the bot to
evade detection / mitigation
Blocking
Prevents the bot from getting
what it came for
Awareness
Blocking also alerts bot operator
#RSAC
Let’s Talk about Use Cases
What are the impacts on
your website and business?
A bot use case is a combination of the bot
operator and the business and IT impact.
What are the incentives
and goals of the operator?
#RSAC
Let’s Talk about Use Cases
Aggregators
Mixed
Search engine
Beneficial
Grey marketers
Mixed
Spam bots
3rd
party services
Beneficial
Harmful
Web scrapers
Partner bots
Beneficial
Harmful
#RSAC
What If You Could
Hoodwink the Price Scrapers?
Your site
Higher prices
$
Customers
$
$+1
$+1
IMPACT: competitors automatically match
your online prices, steal your customers and
increase their sales at your expense
Competitors
VALUE: feed competitors incorrect pricing to
maintain competitive advantage, keep your
customers and maximize your online sales
#RSAC
What If You Could
Fool the Grey Marketers?
Your site
Higher prices
Out-of-stock
Customers
$
$
$+1
X
Grey marketer
X
Special action
IMPACT: 3rd-parties purchase or hold scarce
inventory, preventing you from selling to real
customers and reducing customer satisfaction
VALUE: slow, fool with higher prices or out-
of-stock pages, or signal to your origin to take
special action on transactions from 3rd-parties
©2015 AKAMAI | FASTER FORWARDTM
#RSAC
What If You Could
Make Your Partners Behave?
Your site
Slow traffic
Direct to API
Customers
$
$
API
Partner bot
API
IMPACT: Your own partners are aggressively
scraping your site to keep up with the latest
pricing and content updates
VALUE: provide partners with the updates
they need while reducing associated origin
load, or redirect partners to use your API
#RSAC
What If You Could
Show Google the Love?
Your site
$
$
Slow traffic
Alternate origin
IMPACT: Heavy bot traffic and aggressive
search engines (e.g., Baidu) reduce site
performance for Googlebot
Googlebot
Aggressive search
engines, other bots
VALUE: Slow or direct aggressive search
engines and other bots to alternate origin to
maximize performance for Googlebot
Bot Management Is a PROCESS
EDUCATION
ANALYSIS
IMPLEMENT
Concern
Has some understanding of a “bot problem”
Step back
Identify stakeholders with other “bot problems”
Visibility
Identify bot traffic and monitor behavior
Analysis
Identify different bots and understand impact
Manage
Determine and implement appropriate policies
Measure
Measure results of management actions
#RSAC
What have we learned?
#RSAC
Bots are not only a security problem. Business and IT are impacted
There are good bots, bad bots, and ’so-so’ bots.
…and there is always a bot operator with a specific target
Just blocking bots is the wrong thing to do
The way you detect, categorize and respond to bots is key to get the
benefit of the good bots and avoid the problem of the bad bots
Managing bots is an ongoing process, not a one time event
25
#RSAC
Thank you!
Questions