Data Mining and Text Analytics
Laura Quinn
Internet-based Advertising
The internet-based advertising industry is worth billions and has
grown massively in recent years. It is constantly expanding and
therefore requires robust systems which are able to handle billions
of queries from users in order to provide ads from millions of
advertisers on many websites. This leads to a massive reliance on:
* Real-time collection of data
* Management and analysis of the data in order to provide
appropriate ads.
These advertisements exist in many forms including:
* Sponsored Search
* Display Advertising
* Rich Media Ads
* Interstitial Ads
Google Sponsored Search
Advertisers pay Google only when their ad is clicked. An ad may be
shown when a user's query matches the keywords of the website and
the ad; this helps provide ads tailored to the user's search.
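As a rough illustration of the two ideas above (keyword matching and paying only on a click), the sketch below uses made-up advertisers, keywords and bids. The function names `matching_ads` and `charge_for_click` are hypothetical, and charging the full bid per click is a simplifying assumption rather than a description of how Google prices clicks.

```python
def matching_ads(query, ads):
    """Return the ads whose keywords overlap with the words in the query."""
    query_words = set(query.lower().split())
    return [ad for ad in ads if query_words & ad["keywords"]]


def charge_for_click(ad, spend):
    """Pay-per-click: add the ad's bid to the advertiser's spend on a click."""
    updated = dict(spend)
    updated[ad["advertiser"]] = updated.get(ad["advertiser"], 0.0) + ad["bid"]
    return updated


# Hypothetical ads; the bid is the amount assumed to be paid per click.
ads = [
    {"advertiser": "shoe_shop", "keywords": {"running", "shoes"}, "bid": 0.50},
    {"advertiser": "book_store", "keywords": {"novels", "books"}, "bid": 0.20},
]

shown = matching_ads("cheap running shoes", ads)

# Showing an ad costs nothing, so spend stays empty until a click happens.
spend = {}
spend_after_click = charge_for_click(shown[0], spend)
```

Only the shoe shop's ad matches the query, and its spend changes only once `charge_for_click` is called.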
Collection of Data from Google
We may show you ads based on many factors, including:
* Types of websites you visit, and mobile apps you have on your device
* The DoubleClick cookie on your browser and the settings in your Ads
Settings
* Websites and apps you’ve visited that belong to businesses that
advertise with Google
* Previous interactions with Google’s ads or advertising services
* Your Google or YouTube profile
We don’t do the following:
* Link your name or personally identifiable information to your
DoubleClick cookie without your consent
* Associate your DoubleClick cookie with sensitive topics like race,
religion, sexual orientation, or health without your consent
Google Ads Settings
Google and YouTube Internet-based ads are based on websites that you
have visited. Without you logging in to Google, it can determine
your:
* Gender
* Age
* Languages
* Interests
Advertising Algorithms
Off-line algorithms observe data over a period of time and then
make decisions from that data. In this case, this would mean looking
at the search queries from the time period and the advertisers' bids
for those queries. However, this would not be useful for the
sponsored search example, as it would not provide real-time results.
On-line algorithms are a better way to show sponsored search ads, as
they do not require the entire data input from the beginning.
They are able to use information about:
* The advertiser's remaining budget
* The click-through rate
These can be classed as greedy algorithms. This means that they try
to find the best solution after each input, but this does not
necessarily mean it will be the optimum solution in terms of the
profit for the advertiser and also the search engine.
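The difference between the greedy, on-line approach and an off-line optimum can be sketched with a toy example. Everything here is an assumption for illustration: the advertisers, bids, click-through rates and budgets are made up, the score of an ad is taken to be bid times click-through rate, and budgets are reduced by that expected value rather than by real clicks. It is not presented as the method any real search engine uses.

```python
from itertools import product

# Hypothetical data: bid per query, click-through rate (CTR) and budget.
advertisers = {
    "A": {"bids": {"x": 2.0, "y": 2.0}, "ctr": 0.5, "budget": 1.0},
    "B": {"bids": {"x": 1.0}, "ctr": 0.75, "budget": 0.75},
}


def expected_value(name, query):
    """Expected revenue of showing this advertiser's ad: bid * CTR (0 if no bid)."""
    adv = advertisers[name]
    return adv["bids"].get(query, 0.0) * adv["ctr"]


def greedy_online(queries):
    """Serve each query as it arrives, using only remaining budgets and CTRs."""
    remaining = {name: adv["budget"] for name, adv in advertisers.items()}
    total = 0.0
    for query in queries:
        # Candidates bid on the query and can still afford its expected cost.
        candidates = [
            name for name in advertisers
            if 0.0 < expected_value(name, query) <= remaining[name]
        ]
        if not candidates:
            continue
        best = max(candidates, key=lambda name: expected_value(name, query))
        value = expected_value(best, query)
        remaining[best] -= value
        total += value
    return total


def offline_optimum(queries):
    """Brute force over every assignment, knowing all queries in advance."""
    best_total = 0.0
    choices = list(advertisers) + [None]  # None means no ad is shown
    for assignment in product(choices, repeat=len(queries)):
        spent = {name: 0.0 for name in advertisers}
        feasible = True
        for query, name in zip(queries, assignment):
            if name is None:
                continue
            value = expected_value(name, query)
            if value == 0.0:  # advertiser did not bid on this query
                feasible = False
                break
            spent[name] += value
        within_budget = all(
            spent[n] <= advertisers[n]["budget"] for n in advertisers
        )
        if feasible and within_budget:
            best_total = max(best_total, sum(spent.values()))
    return best_total


queries = ["x", "y"]
greedy_total = greedy_online(queries)
optimal_total = offline_optimum(queries)
```

In this example the greedy choice gives query `x` to advertiser A, which uses up A's budget and leaves query `y` unserved, whereas the off-line search gives `x` to B and `y` to A. The brute-force search is only practical for tiny inputs and needs the full query list in advance, which is why it does not suit real-time sponsored search.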
In Conclusion
The online advertising industry must continue to tackle
many new and existing challenges. These include
understanding the needs of users and what each
advertiser wants to achieve. It must also decide which
type of advertising is the most suitable for each
advertisement, while continuing to provide the user with
customized and useful ads. Data miners and ad companies
will need to tackle these challenges by developing new
technologies and systems.