Contextual Presentation
Petteri Nurmi
University of Helsinki – Department of Computer Science
[email protected]
1. Table of Contents
2. Abstract
3. Introduction
4. Content Adaptation and Device Independence
   4.1. Introduction
   4.2. The Content Adaptation Process
   4.3. Challenges in Content Adaptation
   4.4. Approaches to Content Adaptation
   4.5. Device Independence
5. Personalization
   5.1. Introduction
   5.2. Collaborative methods
   5.3. Content-based personalization
   5.4. Prediction methods for mobile applications
6. Applications and Devices
   6.1. Introduction
   6.2. Aura
   6.3. ParcTAB
   6.4. GroupLens
   6.5. Amazon
   6.6. SPOT Watch
7. References
8. Method references
9. Techniques references
10. Pictures index
11. Index
2. Abstract
3. Introduction
What is contextual presentation? In [Schilit-94] different categories, such as proximate selection, were given, but these offer a very limited view of the possibilities of using contextual information. Generally speaking, contextual presentation deals with the use of context-dependent information in applications and devices.
How can contextual presentation help? In this paper we consider only two aspects of contextual presentation, namely content adaptation and personalization. Content adaptation deals with the problem of providing different presentations of the same data depending on the device and network capabilities. Personalization deals with learning from the user. In the simplest case this means providing information the user might find interesting; a more complex possibility is to learn the dynamics of the user's behaviour and use this to help the user.
Contextual presentation is not a field of study in its own right at the moment. Instead, most of the research results go hand in hand with ubiquitous applications and devices. That is why we have selected a large set of devices and applications and also offer a practical view of the problem at hand.
The paper is organized as follows. Section 4 discusses what content adaptation is and why it is important. Section 5 describes various aspects of personalization based on the categorization in [Hirsch-02]. Section 6 deals with some interesting applications such as Aura [], ParcTAB [] and Amazon [].
4. Content Adaptation and Device Independence
4.1. Introduction
The amount of information on the Internet has grown rapidly. Most of the content is designed for desktop computers and is not suitable for devices with small screens, limited colour depth and so on. Web content designers usually put rich-media content on web pages, and this poses another problem: what if the bandwidth is limited? This problem is sometimes described as the "World Wide Wait" problem. Content adaptation tries to solve both problems by transforming the web content into a more suitable form that takes into account the user's personal preferences, the device capabilities and environmental parameters such as the bandwidth.
Device independence deals with the problem of making the content accessible from different kinds of devices. A simple scenario would be an office worker who is browsing the web in his office. Later in the evening he wants to visit a page he visited during the day, but now he only has a mobile phone offering web access. The page should still be available, and the quality of the transformed page should be such that it is still readable.
4.2. The Content Adaptation Process
The content adaptation process starts when a client requests a web page. The order in which the different phases of the process occur is almost identical in every approach, so we give here a generalized model of the adaptation process. Figure 5.1 illustrates the process.
Figure 5.1 The content adaptation process
After the request arrives at the server, we either determine the client's capabilities ourselves or send the requested content to an external entity that queries the client's capabilities and then performs the transcoding operations. The question arising here is which parameters are needed and how to send them. One framework for delivering the parameters is CC/PP (Composite Capability/Preference Profile [CCPP]). CC/PP makes it possible to get the necessary device capabilities from the device vendors, which reduces the amount of data the client needs to send.
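To make this idea concrete, the following Python sketch (with hypothetical field names; real CC/PP profiles are RDF documents referenced from the device, not hard-coded dictionaries) shows how vendor-supplied defaults could be merged with the small set of overrides the client actually sends:

    # Hypothetical vendor defaults keyed by device model; in CC/PP these would be
    # retrieved as an RDF profile referenced by the device.
    VENDOR_DEFAULTS = {
        "ExamplePhone-1": {
            "screen_width_px": 176,
            "colour_depth_bits": 12,
            "markup": "XHTML",
        },
    }

    def resolve_profile(model, client_overrides):
        """Merge vendor defaults with the overrides sent by the client."""
        profile = dict(VENDOR_DEFAULTS.get(model, {}))
        profile.update(client_overrides)   # client-sent values take precedence
        return profile

    # Example: the client only needs to send what differs from the defaults.
    print(resolve_profile("ExamplePhone-1", {"colour_depth_bits": 8}))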
The parameters we choose can be described in terms of user preferences, device capabilities and networking parameters. Important user preferences include colour, timing and scaling. Device capabilities can include the remaining energy, buffer size, network adapter speed and so on. The network parameters, such as bandwidth and round-trip time, can be estimated from the request headers.
After we have the content we want to transform and the parameters we need to take into account, we select a set of transformations to apply, and the resulting content is sent to the client; a sketch of such a selection step follows the list below. Typical transformation algorithms can be categorized as follows [Teeuw-01].
- Information abstraction: reduce the bandwidth requirements but preserve the information that has the highest value to the user (e.g. by compression techniques).
- Modality transformation: change the modality of the content, e.g. transform video data into image data.
- Data transcoding: convert the data into a different (and hopefully more suitable) format, for example transforming JPEG pictures into GIF pictures.
- Data priority: remove irrelevant data and preserve the data that is relevant and/or interesting to the user.
- Purpose classification: allow priority levels for the different objects on the page and order the objects by relevance. If the full content cannot be shown, remove the data that is redundant.
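In the simplest case, the selection step could be a set of threshold rules over the context parameters. The sketch below is only illustrative; the parameter names (bandwidth_kbps, colour_depth_bits and so on) are assumptions and not part of any of the cited systems:

    def select_transformations(user, device, network):
        """Map context parameters to the transformation categories listed above."""
        steps = []
        if network.get("bandwidth_kbps", 1000) < 64:
            steps.append("information abstraction")   # compress, drop low-value detail
        if not device.get("video_support", True):
            steps.append("modality transformation")   # e.g. video -> still images
        if device.get("colour_depth_bits", 24) <= 8:
            steps.append("data transcoding")          # e.g. JPEG -> GIF
        if user.get("text_only", False):
            steps.append("data priority")             # drop images the user finds irrelevant
        if device.get("screen_width_px", 1024) < 320:
            steps.append("purpose classification")    # keep only high-priority objects
        return steps

    # Example: a low-bandwidth phone with a small 8-bit screen.
    print(select_transformations({}, {"colour_depth_bits": 8, "screen_width_px": 176},
                                 {"bandwidth_kbps": 32}))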
Figure 5.2 illustrates the use of a section-outlining algorithm. The algorithm transforms the section headers into hyperlinks that allow the user to fetch the content of a section, while the actual text of the sections is removed.
Figure 5.2 Example of a transformation algorithm – Section outlining
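A rough sketch of the section-outlining idea is given below. It assumes the page uses plain <h2> headings and that a separate URL (here just a placeholder anchor) can serve each section's content on demand; a production transcoder would of course work on a parsed document tree rather than regular expressions:

    import re

    def outline_sections(html):
        """Replace section bodies with a list of links built from the <h2> headings."""
        # re.split with a capture group returns: [before, heading1, body1, heading2, body2, ...]
        parts = re.split(r"<h2>(.*?)</h2>", html, flags=re.DOTALL)
        head = parts[0]                      # content before the first heading
        items = []
        for i, title in enumerate(parts[1::2]):
            # "#section-i" is a placeholder; a real system would link to a URL
            # that returns only that section's text.
            items.append(f'<li><a href="#section-{i}">{title.strip()}</a></li>')
        return head + "<ul>" + "".join(items) + "</ul>"

    print(outline_sections("<p>intro</p><h2>News</h2><p>long text</p><h2>Sports</h2><p>more text</p>"))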
The transformation algorithms face the problem of how to select which information is relevant. Semantic analysis of the document is too slow and too error-prone. In [Wu-99] the need for context-categorization techniques is discussed. The trivial solution would be to assign priorities to the different objects and let these priority levels guide the transformation process.
After the content is delivered to the user, the user has the possibility to alter his or her profile. The network parameters can also change, so we cannot cache the parameter values; instead, the process is started from the beginning the next time the user requests a page.
4.3. Challenges in Content Adaptation
The main challenge is to implement an architecture that provides web content to all devices with web capabilities, makes it possible to adapt to the different environmental and device-based limitations, and takes into account what the user wants. The system should be able to respond to changes in the environment, and it should offer some means of providing QoS (quality of service) based services.
Another important issue relates to privacy and copyright. The process should not raise new privacy-related threats, and the content providers should have some means of controlling the quality and the content that is delivered.
4.4. Approaches to Content Adaptation
The basic question is where the transcoding of the content is done. The possible solutions are client-side scripting, proxy-based transcoding, content-transforming servers and author-side transcoding.
Client-side scripting uses JavaScript ([JavaScript]) or another client-side scripting language to perform some of the transcoding of the web content. Another client-side method is to implement the transcoding process in the web browser or to build plug-in applications that first transcode the data and then give the result to the web browser. This approach has some serious drawbacks:
- The full content is delivered to the client every time. If the bandwidth is limited, the transcoding process can only take into account the capabilities of the device.
- The number of different devices with web access is large, so implementing a general client-side transcoding program is next to impossible.
- Small devices have limited memory and/or processing capabilities, which strongly restricts the complexity of the scripts.
Proxy-based transcoding is an intermediate solution. The device sends its request to a web proxy that handles the request, transcodes the page and returns the resulting page to the client. In [Han-98] a dynamic adaptation system is described that performs image transcoding on a web proxy. The system is dynamic in the sense that it monitors the workload of the server and performs transcoding only if there is enough computational capacity available. If the server does not have enough resources, the image is left out, which degrades the quality of the content but minimizes the timeouts due to server overload.
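The load-based decision can be approximated in a few lines. The sketch below is not the [Han-98] policy itself (which explicitly trades transcoding delay against transmission delay); it simply skips transcoding when a Unix load-average estimate suggests the proxy is busy:

    import os

    def should_transcode(image_bytes, load_threshold=0.8):
        """Transcode only when the proxy has spare capacity and the image is large
        enough for transcoding to pay off (the 10 kB cut-off is an arbitrary guess)."""
        load_per_cpu = os.getloadavg()[0] / (os.cpu_count() or 1)   # Unix only
        return load_per_cpu < load_threshold and image_bytes > 10_000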
The main problems in proxy-based adaptation are the server-overload problem discussed above and copyright issues. The latter means that the resulting content may be unacceptable to the content provider. The following example illustrates the problem: consider a web service that uses banner ads. The proxy may leave these banners out of the final version, but for the provider the visibility of the banners is important because they generate revenue.
The next approach discussed here is the use of a specialized content-transformation server. The situation is illustrated in figure 5.2 c). This approach is not used in practice because of the serious problems it has. The following list should make clear why:
- It breaks end-to-end security; for example, hash signatures can no longer be verified because the content is altered in between.
- The quality degradation may not be acceptable to the content provider.
- It introduces security threats into the communications (man-in-the-middle attacks etc.).
The final approach discussed here is the use of a content-adaptation engine on the provider side. This approach is used, for example, in [Lum-02]. When the request arrives at the provider's web server, the context parameters (device capabilities, user preferences, network parameters) and the requested page are given to a content-decision engine that decides the optimal transformations and forms the resulting content, which is then sent to the client. The main advantage of this approach is that it allows the content provider to fully control the resulting content. The disadvantages are that it is expensive to run one's own content-adaptation engine and that learning to adapt is nearly impossible to implement, for the following reasons:
- The content provider cannot learn the user's global behaviour.
- The devices may not have enough storage and/or computational capacity for behavioural learning.
Figure 5.2 The different approaches to content adaptation
a) Client-side scripting
b) Proxy-based transcoding
c) Intermediate-server-based transcoding
d) Provider (author) based transcoding
4.5. Device Independence
Device independence can be achieved by providing languages that allow device independence and support content adaptation. According to [Lemlouma-02] no such language exists yet. When speaking of device independence, the following techniques are usually mentioned: XHTML, XML, XSL and CSS ([XHTML] [XML] [XSL] [CSS]). The use of this kind of language would make independence on both sides possible. Because there are many different devices with different capabilities, it is impossible (or at least too expensive) for the content provider to design the content separately for every configuration. Device-independent languages use device-independent methods for rendering the content on the client side, but they also allow the content provider to use conditional structures to retain some control over the result. A simple scenario is a web application that chooses the style sheet to use according to some client-side capabilities.
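On the server side, such conditional control can be as simple as picking a style sheet from the resolved capability profile. The field names below are the same assumed ones used in the earlier sketch and are not part of any standard:

    def choose_stylesheet(profile):
        """Pick a style sheet based on assumed capability fields."""
        if profile.get("screen_width_px", 1024) < 320:
            return "small-screen.css"
        if profile.get("colour_depth_bits", 24) <= 8:
            return "low-colour.css"
        return "default.css"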
5. Personalization as a Method for Contextual Presentation
5.1. Introduction
In [Hirsch-02] personalization is categorized into three main tasks: content-based prediction, collaborative methods and using the past to predict the future.
Consider an online news portal that has news from various categories. When a user browses the online content, he is usually interested in only a few categories, and his preference for these categories varies from "highly interesting" to "somewhat interesting". Content-based prediction is the task of learning these interesting categories and using the extracted information to make access to the categories the user might find interesting easier. Methods for content-based prediction are discussed in section 5.3. These methods are quite new, so the section only presents two research papers that deal with this subject [][].
Collaborative methods are used to "automate word-of-mouth". Consider, for example, a movie portal that allows people to rate different items. When a user reviews a film that he liked, he would probably like to get recommendations about other films that might interest him. Collaborative methods look at the most similar users and make recommendations based on the items those users have found interesting. Collaborative methods are discussed in section 5.2.
Using the past to predict the future means that history information is used to predict what the user (or application) might do next. In everyday computer use, users perform monotonous action sequences. The sequences vary from user to user, so no general model can be built to ease the situation. Instead, we can record the action sequences, try to predict the next actions and offer easily accessible shortcuts. Research so far has not used prediction methods very much, but they are surely a promising technique for the future. Section 5.4 discusses some possible methods for prediction in mobile applications.
5.2. Collaborative methods
Collaborative filtering is widely used on commercial web sites such as Amazon and IMDb (the Internet Movie Database). The benefits of collaborative filtering are mutual: users are provided with information about items that might interest them, and enterprises get a simple marketing method for trying to increase their sales. Collaborative filtering has three main challenges, listed below:
- The quality of the recommendations.
- The computational cost of the recommendation algorithm.
- The completeness of the algorithms: every purchased/rated item in the database should be recommended at some point.
The quality of recommendations simply means that the users should be satisfied with the recommendations they get from the site. If the users are dissatisfied, they will be disappointed and will not use the system anymore. This easily leads to a situation where the algorithms offer only recommendations with very strong confidence, i.e. the system recommends only items that it considers very interesting (probability > 0.9, say). This kind of scenario usually means that only a limited set of items is used for recommendations.
Some web sites have very many (millions of) customers and products. Going through such a large dataset in real time is impossible, so the algorithms must be divided into offline and online parts, where the time complexity of the online part should be as small as possible.
The problem with many collaborative methods is that they easily lead to monotonous behaviour, as the system recommends the items with the most ratings or purchases. This leaves a set of products completely out of the spectrum and can lead to a situation where some users do not get any recommendations at all because they have only bought or rated rare items. This kind of situation should be avoided at all costs.
Figure 5.1 Making recommendations – overview of the process
A simplified overview of the recommendation process is shown in figure 5.1. The first phase of the process is to use information retrieval techniques [] to build a vector-space model of the customers and items. In clustering methods [] the first phase is to cluster the data and then perform the vector-space modelling. Because the data sets are sparse, dimensionality reduction techniques [] can be used to reduce the space requirements of the algorithms. Dimensionality reduction removes coordinates that are irrelevant because of general measurement noise. One such technique is principal component analysis [].
The second phase of the process is to find the most similar users. For this phase some similarity metric is used. The most commonly used metrics ([] [] []) can be seen in figure 5.2.
Figure 5.2 Similarity metrics
The actual recommendation phase is quite easy once the similar users have been found. Usually this is done by calculating a summarizing vector over the group of similar users. This summarizing vector can be modelled as a bar graph, as seen in figure 5.3.
Figure 5.3 Summarizing vectors
The summarizing vector is normalized. The values of its coordinates can then be thought of as probabilities of interest, so the top-N values are selected and recommended.
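As a minimal sketch of this user-based scheme (cosine similarity as in figure 5.2, a summarizing vector averaged over the k nearest neighbours, then top-N selection), consider the following; ratings are represented as plain dictionaries and all names here are illustrative:

    import math

    def cosine(u, v):
        """Cosine similarity between two rating dictionaries {item: rating}."""
        dot = sum(u[i] * v[i] for i in set(u) & set(v))
        norm_u = math.sqrt(sum(x * x for x in u.values()))
        norm_v = math.sqrt(sum(x * x for x in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    def recommend(target, others, k=10, n=5):
        """Average the k most similar users into a summarizing vector and
        return the top-n items the target user has not rated yet."""
        neighbours = sorted(others, key=lambda o: cosine(target, o), reverse=True)[:k]
        summary = {}
        for other in neighbours:
            for item, rating in other.items():
                summary[item] = summary.get(item, 0.0) + rating / max(len(neighbours), 1)
        candidates = {i: s for i, s in summary.items() if i not in target}
        return sorted(candidates, key=candidates.get, reverse=True)[:n]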
If no pre-clustering of the data is used, the whole customer database has to be compared with the user. In the worst case this takes O(MN) time, where M is the number of items and N is the number of customers. None of the computations can be done offline, so if the site has a large customer and/or product base, these methods cannot be used. The quality of the recommendations is usually quite good, but they tend to come from only a restricted set of items.
If pre-clustering is used, the data is first clustered [] offline and summarizing vectors that represent each group are formed. When the actual online computation is performed, the user vector is compared with the cluster vectors and the items are selected within the clusters. The clustering can be done by assigning each user to only one cluster or by allowing users to belong to multiple clusters with some confidence value. The good thing about clustering is that it makes the computations more efficient, as the calculation can be divided into offline and online parts. The major drawback is that the quality of the recommendations is usually much worse than with standard collaborative filtering methods. The quality can be improved by more fine-grained clustering, but this makes the online computations more time-consuming. Clustering also tends to recommend only frequent items, as the rare items are mixed up with the general measurement noise. Interestingly, none of the papers discussed the possibility of using multiple-group clustering and then performing collaborative filtering within the most similar customers.
One way of forming recommendations is to use search-based methods. These usually search for popular items by the same author, with the same actors, and so on. This is the simplest way of doing the process and also the worst: the recommendations tend to be very general, and if the product base is very large this can lead to very large result sets. The good point of this approach is that it pushes all the programming issues onto the programmers of the database management system used.
For large datasets and real-time recommendations another way of calculating the recommendations must be used: item-based recommendation algorithms [][]. Amazon () is an example of a site that uses this kind of method. The idea of the process is quite similar to user-based recommendations; this is illustrated in figure 5.4.
Figure 5.4 Item-based recommendation algorithms – the process
Basically, the process consists of first finding sets of items that tend to appear together. This data is used to calculate a similar-items table. The table construction phase consists of iterating through the items that occurred together with item i and calculating a similarity value using some similarity metric (see figure 5.2). When the user purchases or rates an item, the item-similarity table is used to find matches for the item, and from these the top-k items are recommended. This easily (and usually) leads to general recommendations, but the good thing is that the online calculations can be done very fast. At Amazon the offline calculation takes O(N^2 M) time in the worst case, so this complexity could still be optimized. In general, item-based recommendation is a simple association-rule mining process [], so existing algorithms could be modified to support different similarity metrics, providing better offline performance and the possibility to customize the metrics depending on the task.
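A much simplified version of the offline/online split is sketched below; real systems use a proper similarity metric over the co-occurrence vectors rather than the raw counts used here:

    from collections import defaultdict
    from itertools import combinations

    def build_similar_items(baskets):
        """Offline phase: count how often pairs of items occur in the same basket."""
        table = defaultdict(lambda: defaultdict(int))
        for basket in baskets:
            for a, b in combinations(sorted(set(basket)), 2):
                table[a][b] += 1
                table[b][a] += 1
        return table

    def recommend_for_item(table, item, k=5):
        """Online phase: a cheap lookup of the k items most often seen with `item`."""
        neighbours = table.get(item, {})
        return sorted(neighbours, key=neighbours.get, reverse=True)[:k]

    # Example
    table = build_similar_items([{"book-a", "book-b"}, {"book-a", "book-b", "book-c"}])
    print(recommend_for_item(table, "book-a"))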
5.3. Content-based personalization
Content-based personalization is quite a new topic, so no general overview can be given. Currently the methods are based on clustering and probabilistic modelling. We present here two different research projects as introductory material on this topic.
The first project [] was done at Microsoft Research in 2000. First the data was clustered, and then the clusters were used to build simple Markov processes from the data. Two kinds of models were constructed. First-order Markov models were constructed for each cluster: for example, consider a situation where users request pages from the weather category after they have read sports news. This forms a simple two-state Markov process, and the observed frequencies can be used as transition probabilities. The other model consisted of unordered Markov processes, where the order of the visits does not matter and only the visited categories are of interest.
The second project [] discussed here was done at the University of Helsinki and used a more complex Bayesian model. The clustering was done in two ways: both the pages and the users were clustered, and this data was used to construct a Bayesian network []. The variables of the network are illustrated in figure 5.5. The model gives the probability that a user belonging to a certain group will request a view containing a given article from a given cluster of pages, and this information was used to customize views for different users.
Figure 5.5 A two-way clustered Bayesian model
At the moment, content-based personalization is an emerging technology that will probably be used widely in the future. People would prefer this kind of solution if it worked well enough, and commercial sites are therefore interested in such applications. Accurate models for demographic and content-dependent personalization are difficult to build, which is why only a few sites can offer this kind of service. If a generic application could be built, the market possibilities would be huge.
5.4. Prediction methods for mobile applications
Predicting user behaviour in mobile applications leads to a better user experience, as the user does not need to repeat monotonous action sequences. No papers that consider models for this kind of application were found, so we only present some possibilities.
A simple method would be to use data mining techniques [][]. After a certain number of action sequences has been logged, the log data is used to generate association rules [] and their confidence levels. This method poses, for example, the following problems:
- The memory requirements may be too big; this depends on the amount of data that is stored and the number of sequences.
- How often should the mining (and updating) be performed?
- It offers only periodic learning.
The next possible method, which extends the previous model, is to use Bayes' rule, P(h | D) = P(D | h) P(h) / P(D), to update the probabilities after the first clustering phase. This allows us to replace the log data with the probabilities, which eases the memory requirements. Some possible problems:
- If the user does not use a shortcut key provided by the system, how is the updating controlled?
- What are the memory requirements?
Hidden Markov Models [] are quite good at modelling simple action sequences. With an HMM the next states can be predicted, and shortcut keys to the various possible following states can be offered. This is probably the best model for many situations, as it is reasonably simple to implement and does not require large amounts of memory. The interesting problem is how to learn the characteristics of the process. This can be seen as a Markov Decision Process (MDP) [], and reinforcement learning techniques can be used.
The last model discussed here is to use Bayesian networks []. With Bayesian networks the system does not necessarily need to learn the initial distribution, as user studies can be used to form one, which is then updated depending on the user's actions.
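To make the discussion slightly more concrete, here is a minimal sketch of the simplest of the options above: an incrementally updated first-order frequency model that offers the most likely next actions as shortcuts. It avoids periodic mining by updating on every observed action, but it is only a counting model, not an HMM or a Bayesian network:

    from collections import defaultdict

    class ShortcutPredictor:
        """Count observed action bigrams and suggest likely follow-up actions."""

        def __init__(self):
            self.counts = defaultdict(lambda: defaultdict(int))
            self.previous = None

        def observe(self, action):
            """Record one user action, updating the bigram counts online."""
            if self.previous is not None:
                self.counts[self.previous][action] += 1
            self.previous = action

        def shortcuts(self, k=3):
            """Return up to k candidate next actions given the last observed one."""
            if self.previous is None:
                return []
            followers = self.counts[self.previous]
            return sorted(followers, key=followers.get, reverse=True)[:k]

    # Hypothetical example: after opening the calendar, the user usually checks today's agenda.
    p = ShortcutPredictor()
    for action in ["open calendar", "show today", "open calendar", "show today", "open calendar"]:
        p.observe(action)
    print(p.shortcuts())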
6. Applications and Devices
6.1. Introduction
From the different applications we tried to select those that offer some practical insight into the theories presented above. Aura [] and ParcTAB [] are more ambitious projects, of which the first was done at Carnegie Mellon University and the second at Xerox. These offer a wider view of how contextual presentation can be used.
GroupLens [] and Amazon [] are typical examples of how personalization is and can be used. A more ambitious project that is not discussed here is the Lumière project [] by Microsoft. GroupLens is a web site that offers recommendations for NetNews articles, and Amazon is an online book store that offers recommendations for items to buy.
As a simple example of how location can be used to deliver content, we briefly discuss the commercial SPOT Watch project [], which was done in collaboration with Microsoft and is based on using radio signals to transmit information and a simple algorithm to filter the data depending on the location.
6.2. Aura
Project Aura is a research project at Carnegie Mellon University. Its main goal is to provide a framework that supports effective use of resources and minimizes the need for user distraction in a pervasive computing environment.
The main idea is to divide the environment into four different components that each have their own specific tasks. The architectural overview of Aura is shown in figure 9.1.
Figure 9.1 Components of Aura in a certain environment.
Every environment has two static components, the environment manager and the context observer. The dynamic components are the task manager and the service suppliers.
For the user, tasks are represented as collections of abstract services. An example of such a task is "edit text + watch video". First the task manager negotiates a configuration with the environment manager. After the negotiation phase the environment manager returns a handle to a service supplier. Using this handle the task manager can access the supplier that offers the required service.
When a user requests a certain kind of service, the environment manager looks through its database of service suppliers and selects the most appropriate one. The simplest form of selection can be illustrated by the following example: assume that the user is running Linux and requests text editing. The environment manager then selects XEmacs, but if the user were running Windows it would choose Microsoft Word. The architecture allows more sophisticated control using XML [XML]. The service suppliers are basically different applications that are wrapped to the Aura API according to some parameters.
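In its simplest form, this lookup is just a mapping from (platform, abstract service) to a concrete supplier. The registry below is purely hypothetical and only mirrors the XEmacs/Word example above; it is not the Aura API:

    SUPPLIERS = {
        # (platform, abstract service) -> concrete application wrapped for Aura
        ("linux", "text editing"): "XEmacs",
        ("windows", "text editing"): "Microsoft Word",
    }

    def select_supplier(platform, service):
        """Toy version of the environment manager's supplier selection."""
        return SUPPLIERS.get((platform, service), "no suitable supplier")

    print(select_supplier("linux", "text editing"))    # -> XEmacs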
What if the environment changes? We said that the task manager is responsible for reconfiguration, but we need some way to inform it that changes have occurred. This is done using the context observer. The context observer gets its information from the different sensors and uses this information to notify the task managers about changes in the environment.
6.3. ParcTAB
6.4. GroupLens
6.5. Amazon
Amazon is a typical example of a web site that offers recommendations to the user. The algorithm Amazon uses is an item-to-item collaborative filtering algorithm (section 5.2). According to [Linden-03], Amazon had over 29 million customers and several million data items in January 2003. Because Amazon is a web site, it has to calculate the recommendations in real time, so offline processing is needed. The offline phase consists of building similar-items tables by finding items that customers tend to buy together. This is a frequent-itemset data mining problem, and existing algorithms offer effective means of finding these sets. The similarity between items is calculated using the cosine similarity metric. Because of this comprehensive offline calculation phase, the recommendations can be provided in real time.
6.6. SPOT Watch
FM radio signals can be used to send different kinds of data, such as traffic forecasts, movie times, traffic alerts and advertisements. Figure 9.2 illustrates a simple scenario where proximity content is delivered to a wristwatch.
Figure 9.2
The SPOT Watch architecture listens to different radio stations and recognizes from the signal data those that send SPOT data. This recognition can be done by pre-programming the system to listen only to certain frequencies; the other way is to use some form of identification pattern in the signal data. Once the radio stations are known, an intensity vector of the signal strengths is extracted. Some filtering method can be applied to reduce the effects of noise.
The SPOT Watch uses the RightSPOT algorithm to infer the current location. The system does not try to determine the exact location but rather the area or neighbourhood where the user is. The locations of the radio transmitters are known in advance, so the signal strengths can be used to estimate where the user is. RightSPOT uses Bayesian inference to calculate conditional probabilities for the areas. This is illustrated in figure 9.3.
Figure 9.3
From the probabilities a histogram is built. The area with the largest probability is the most probable (maximum likelihood) location at that moment, so it is selected. This information is used to filter the SPOT data from the radio signals.
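The general idea (not the published RightSPOT algorithm, which works on rankings of signal strengths) can be sketched as maximum-likelihood selection over per-area signal models; the Gaussian per-station model below is an assumption made only for illustration:

    import math

    def most_likely_area(signal_vector, area_models):
        """Return the area whose model gives the observed strengths the highest likelihood.

        signal_vector: {station: measured strength}
        area_models:   {area: {station: (mean, std)}} learned in advance
        """
        def log_likelihood(model):
            ll = 0.0
            for station, strength in signal_vector.items():
                mean, std = model.get(station, (0.0, 1.0))
                ll += -((strength - mean) ** 2) / (2 * std ** 2) - math.log(std)
            return ll
        return max(area_models, key=lambda area: log_likelihood(area_models[area]))

    # Example with two hypothetical neighbourhoods and three stations.
    models = {
        "downtown": {"fm1": (0.9, 0.1), "fm2": (0.3, 0.1), "fm3": (0.5, 0.2)},
        "suburb":   {"fm1": (0.4, 0.1), "fm2": (0.8, 0.1), "fm3": (0.2, 0.2)},
    }
    print(most_likely_area({"fm1": 0.85, "fm2": 0.35, "fm3": 0.6}, models))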
7. References
[1] [Bickmore-97] Timothy W. Bickmore and Bill N. Schilit, Digestor: Device-Independent Access to the World Wide Web, Proceedings of the 6th World Wide Web Conference (WWW 6), 1997, pages 655–663.
[2] [Gerasimov-00] Vadim Gerasimov and Walter Bender, Things That Talk: Using Sound for Device-to-Device and Device-to-Human Communication, IBM Systems Journal, Vol. 39, Nos. 3&4, 2000.
[3] [Han-98] Richard Han, Pravin Bhagwat, Richard LaMaire, Todd Mummert, Veronique Perret and Jim Rubas, Dynamic Adaptation in an Image Transcoding Proxy for Mobile Web Browsing, IEEE Personal Communications Magazine, December 1998.
[4] [Horvitz-99] Eric Horvitz, Lumiere Project: Bayesian Reasoning for Automated Assistance, Microsoft Research, 1999.
[5] [Lemlouma-02] Tayeb Lemlouma and Nabil Layaïda, Device Independent Principles for Adapted Content Delivery, OPERA Project, 2002.
[6] [Linden-03] Greg Linden, Brent Smith and Jeremy York, Amazon.com Recommendations: Item-to-Item Collaborative Filtering.
[7] [Lum-02] Wai Yip Lum and Francis C.M. Lau, A Context-Aware Decision Engine for Content Adaptation, Pervasive Computing, 5:41–49, 2002.
[8] [Koll-01] Siva Kollipara, Rohit Sah, Srinivasan Badrinarayanan and Rabee Alshemali, SENSE: A Toolkit for Stick-e Frameworks, December 2001.
[9] [Krumm-03] John Krumm and Eric Horvitz, RightSPOT: A Novel Sense of Location for a Smart Personal Object, Microsoft Research Paper, Ubicomp 2003, Seattle.
[10] [Madhav-03] Anil Madhavapeddy, David Scott and Richard Sharp, Context-Aware Computing with Sound, Ubicomp 2003, Seattle.
[11] [Sousa-02] João Pedro Sousa and David Garlan, Aura: An Architectural Framework for User Mobility in Ubiquitous Computing Environments, Proceedings of the 3rd Working IEEE/IFIP Conference on Software Architecture, August 2002.
[12] [Teeuw-01] Wouter Teeuw, Content Adaptation, Telematica Instituut.
[13] [Wu-99] Jon C.S. Wu, Eric C.N. Hsi, Warner ten Kate and Peter M.C. Chen, A Framework for Web Content Adaptation, Philips Research Paper, 1999.
8. References to used techniques
[1] [CCPP] Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies, W3C Working Draft, 28 July 2003.
[2]