INF SERV – Media Storage and Distribution Systems:
User Modeling
13/9 – 2004
Why user modeling?
Multimedia approach

If you can’t make it, fake it
Translation


Present real-life quality
If not possible, save resources where it is not recognizable
Requirement



Know content and environment
Understand limitations to user perception
If these limitations must be violated, know the least disturbing saving options
INF5070 – media servers and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
User Modelling
What?

Formalized understanding of


users’ awareness
user behaviour
Why?


Achieve the best price/performance ratio
Understand actual resource needs




achieve higher compression using lossy compression
potential of trading resources against each other
potential of resource sharing
relax relation between media
Applications of User Modelling
Encoding Formats

Exploit limited awareness of users



JPEG/MPEG video and image compression
MP3 audio compression
Based on medical and psychological models
Quality Adaptation

Adapt to changing resource availability

no models - need experiments
Synchronization

Exploit limited awareness of users

no models - need experiments
Access Patterns



When will users access a content?
Which content will users access?
How will they interact with the content?

no models, insufficient experiments - need information from related sources
User Perception of
Quality Changes
Quality Changes
Quality of a single stream


Issue in Video-on-Demand, Music-on-Demand, ...
Not quality of an entire multimedia application
Quality Changes

Usually due to changes in resource availability



overloaded server
congested network
overloaded client
Kinds of Quality Changes
Long-term change in resource availability

Random

no back channel
no content adaptivity

continuous severe disruption


Planned

change to another encoding format
change to another quality level

requires mainly codec work

Short-term change in resource availability

Random

packet loss
frame drop

alleviated by protocols and codecs


Planned

scaling of data streams

appropriate choices require user model
Planned quality changes
Audio



Lots of research in scalable audio
No specific results for distribution systems
Rule-of-thumb

Always degrade video before audio
Video


Long-term changes
Short-term changes
Planned quality changes
Audio
Video

Long-term changes




Use separately encoded streams
Switch between formats
Non-scalable formats compress better than scalable ones (Source: Yuriy Reznik, RealNetworks)
Short-term changes
Switching between formats


Needs no user modeling
Is an architecture issue
Planned quality changes
Audio
Video


Long-term changes
Short-term changes


Use scalable encoding
Reduce short-term fluctuation by prefetching and buffering
Two kinds of scalable encoding schemes

Non-hierarchical

encodings are more error-resilient
fractal single image encoding
Hierarchical

encodings have better compression ratios
Scalable encoding


Support for prefetching and buffering is an architecture issue
Choice of prefetched and buffered data is not an architecture issue
Planned quality changes
Audio
Video


Long-term changes
Short-term changes


Use scalable encoding
Reduce short-term fluctuation by prefetching and buffering
Short-term fluctuations

Characterized by



frequent quality changes
small prefetching and buffering overhead
Supposed to be very disruptive
See for yourself: subjective assessment
Subjective Assessment
A test performed by the Multimedia Communications Group at TU Darmstadt
Goal

Predict the most appropriate way to change quality
Approach



Create artificial drop in layered video sequences
Show pairs of video sequences to testers
Ask which sequence is more acceptable
Compare two means of prediction

Peak signal-to-noise ratio (higher is better)



compares degraded and original sequences per-frame
ignores order
Spectrum of layer changes (lower is better)


takes number of layer changes into account
ignores content and order
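The two predictors compared above can be sketched in a few lines. This is a minimal illustration, not the code used in the Darmstadt test: frames are assumed to be flat lists of 8-bit samples, and `layer_changes` is a hypothetical stand-in that only counts changes in the number of displayed layers. It is not the spectrum measure as defined by the test's authors.

```python
import math


def psnr(original, degraded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(original) != len(degraded):
        raise ValueError("frames must have the same number of samples")
    mse = sum((o - d) ** 2 for o, d in zip(original, degraded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def mean_psnr(original_frames, degraded_frames):
    """Average of the per-frame PSNR values; frame order does not matter."""
    values = [psnr(o, d) for o, d in zip(original_frames, degraded_frames)]
    return sum(values) / len(values)


def layer_changes(layers_per_frame):
    """Assumed simplification: count how often the layer count changes."""
    pairs = zip(layers_per_frame, layers_per_frame[1:])
    return sum(1 for before, after in pairs if before != after)
```

Because `mean_psnr` averages per-frame values, reordering the frames leaves the result unchanged, which is the "ignores order" limitation noted above.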
Subjective Assessment
Used SPEG (OGI) as layer encoded video format
[Figures: layers over frames, illustrating the amplitude of layer variation and the frequency of layer variation]
Subjective Assessment
What is better?
[Figures: layers over frames]
First gap first or lowest gap first?
Early or late high quality?
Subjective Assessment
How does the spectrum correspond with the results of the subjective assessment?
Comparison with the peak signal-to-noise ratio
[Two tables: one column per clip and test sequence (ts 1, ts 2); rows: 1 Subjective assessment, 2 PSNR (higher is better), 3 Spectrum (lower is better)]
Clips: Farm 1, Farm 2, M&C1, M&C3, M&C4, T-Tennis3
PSNR, first table: 62.86 49.47 61.46 73.28 63.15 52.38
PSNR, second table: 48.01 25.08 49.40 26.95 66.02 63.28
Spectrum, second table: 2 0 2 0 0.5 0.5
Remaining cell values, in extraction order: 0.35 0.55 0.73 2 2 6.86 4 2 1.18 1 1.02 2.18
According to the results of the subjective assessment, the spectrum is a more suitable measure than the PSNR
Subjective Assessment
Conclusions


Subjective assessment of variations in layer encoded videos
Comparison of spectrum measure vs. PSNR measure



Observing spectrum changes is easier to implement
Spectrum changes indicate user perception better than PSNR
Spectrum changes do not capture all situations
Missing


Subjective assessment of longer sequences
Better heuristics



"thickness" of layers
order of quality changes
target layer of changes
User Model for Synchronization
Synchronization
Content Relation

e.g., several views of the same data
Spatial Relations

Layout
Temporal Relations

Intra-object Synchronization


Intra-object synchronization defines the time relation between various
presentation units of one time-dependent media object
Inter-object Synchronization

Inter-object synchronization defines the synchronization between media
objects
Relevance



Hardly relevant in current NVoD systems
Somewhat relevant in conferencing systems
Relevant in upcoming multi-object formats: MPEG-4, Quicktime
Inter-object Synchronization
Lip synchronization


demands tight coupling of audio and video streams with a limited skew between the two media streams
Slide show with audio comment
Main problem of the user model

permissible skew
Inter-object Synchronization
A lip-synchronized audio/video sequence (Audio1 and Video) is followed by a replay of a recorded user interaction (RI), a slide sequence (P1 - P3) and an animation (Animation) which is partially commented using an audio sequence (Audio2). At the start of the animation presentation, a multiple choice question is presented to the user (Interaction). Once the user has made a selection, a final picture (P4) is shown.
Main problem of the user model

permissible latency


analysing the object sequence allows prefetching
user interaction complicates prefetching
Synchronization Requirements – Fundamentals
100% accuracy is not required, i.e., skew is allowed
Skew depends on


Media
Applications
Difference between


Detection of skew
Annoyance of skew
Explicit knowledge on skew


Eases implementation
Allows for portability
Experimental Set-Up
Experiments at IBM ENC Heidelberg to quantify synchronization requirements for


Audio/video synchronization
Audio/pointer synchronization
Selection of material

Duration



30s in experiments
5s would have been sufficient
Reuse of same material for all tests
Introduction of artificial skew


By media composition with professional video equipment
With frame based granularity
Experiments

Large set of test candidates




Professional: video editors (cutters) at TV studios
Casual: everyday “user”
Awareness of the synchronization issues
Set of tests with different skews lasted 45 min
Lip Synchronization: Major Influencing Factors
Video

Content




Continuous (talking head) vs. discrete events (hammer and nails)
Background (no distraction)
Resolution and quality
View mode (head view, shoulder view, body view)
Audio



Content
Background noise or music
Language and articulation
Lip Synchronization: Level of Detection
Areas



In sync QoS: +/- 80 ms
Transient
Out of sync
Lip Synch.: Level of Accuracy/Annoyance
Some observations


Asymmetry
Additional tests with long movie


+/- 80 ms: no distraction
-240 ms, +160 ms: disturbing
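The thresholds above can be turned into a small classifier. This is a sketch under stated assumptions: the function name is hypothetical, the sign convention of the skew is taken as given by the experiments (the slides do not state it), and labelling everything between the "no distraction" and "disturbing" limits as transient is an assumption that combines the detection areas with the annoyance figures.

```python
IN_SYNC_MS = 80            # +/- 80 ms: no distraction (from the slides)
DISTURBING_NEG_MS = -240   # -240 ms or beyond: disturbing (from the slides)
DISTURBING_POS_MS = 160    # +160 ms or beyond: disturbing (from the slides)


def classify_lip_skew(skew_ms):
    """Map an audio/video skew in milliseconds to one of three areas."""
    if -IN_SYNC_MS <= skew_ms <= IN_SYNC_MS:
        return "in sync"
    if skew_ms <= DISTURBING_NEG_MS or skew_ms >= DISTURBING_POS_MS:
        return "out of sync"
    # Assumption: the band between the two limits is the transient area.
    return "transient"
```

The asymmetric limits mean that a skew of -200 ms still falls in the transient band, while +200 ms is already out of sync.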
Pointer Synchronization
Fundamental CSCW shared workspace issue
Analysis of CSCW scenarios


Discrete pointer movement (e.g. “technical sketch”)
Continuous pointer movements (e.g. “route on map”)
Most challenging test samples


Short audio
Fast pointer movement
Pointer Synchronization: Level of Detection
Observations

Difficult to detect “out of sync”


i.e., a different order of magnitude than lip sync
Asymmetry

Consistent with everyday experience
Pointer Synchronization: Level of Annoyance
Areas



In sync: QoS -500 ms, +750 ms
Transient
Out of sync
Quality of Service of Two Related Media Objects
Expressed by a quality of service value for the skew


Acceptable skew within the involved data streams
Affordable synchronization boundaries
Production level synchronization


Data should be captured and recorded with no skew at all
To be used if synchronized data will be further processed
Presentation level synchronization


Reasonable synchronization at the user interface
To be used if synchronized data will not be further processed
Quality of Service of Two Related Media Objects
Media              Mode, application                                         QoS
Video  Animation   Correlated                                                +/- 120 ms
Video  Audio       Lip synchronization                                       +/- 80 ms
Video  Image       Overlay                                                   +/- 240 ms
Video  Image       No overlay                                                +/- 500 ms
Video  Text        Overlay                                                   +/- 240 ms
Video  Text        No overlay                                                +/- 500 ms
Audio  Animation   Event correlation                                         +/- 80 ms
Audio  Audio       Tightly coupled (stereo)                                  +/- 11 μs
Audio  Audio       Loosely coupled (dialog mode with various participants)   +/- 120 ms
Audio  Audio       Loosely coupled (background music)                        +/- 500 ms
Audio  Image       Tightly coupled (music with notes)                        +/- 5 ms
Audio  Image       Loosely coupled (slide show)                              +/- 500 ms
Audio  Text        Text annotation                                           +/- 240 ms
Audio  Pointer     Audio related to shown item                               -500 ms to +750 ms
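Explicit skew bounds like the ones listed above lend themselves to a lookup table. The following is a minimal sketch that covers only a subset of the entries: the dictionary layout, the key spelling and the function name are assumptions made for illustration, and all bounds are expressed in milliseconds (so 11 μs becomes 0.011 ms).

```python
# (first medium, second medium, mode) -> (lowest, highest) permissible skew in ms
SKEW_BOUNDS_MS = {
    ("video", "animation", "correlated"): (-120, 120),
    ("video", "audio", "lip synchronization"): (-80, 80),
    ("video", "image", "overlay"): (-240, 240),
    ("video", "image", "no overlay"): (-500, 500),
    ("audio", "animation", "event correlation"): (-80, 80),
    ("audio", "audio", "tightly coupled (stereo)"): (-0.011, 0.011),
    ("audio", "image", "tightly coupled (music with notes)"): (-5, 5),
    ("audio", "text", "text annotation"): (-240, 240),
    ("audio", "pointer", "audio related to shown item"): (-500, 750),
}


def skew_acceptable(first, second, mode, skew_ms):
    """Return True if the measured skew lies within the listed QoS bounds."""
    low, high = SKEW_BOUNDS_MS[(first, second, mode)]
    return low <= skew_ms <= high
```

Keeping the bounds in data rather than in code reflects the point made earlier in the slides that explicit knowledge of the skew eases implementation and allows for portability.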
User Model for Access Patterns
Modelling
User behaviour

The basis for simulation and emulation


In turn allows performance tests
Separation into


Frequency of using the VoD system
Selection of a movie
User Interaction

Models exist

But are not verified
Selection of a movie


Dominated by the access probability
Should be simulated by realistic access patterns
Focus on Video-on-Demand
Video-on-demand systems




Objects are generally consumed from start to end
Repeated consumption is rare
Objects are read-only
Hierarchical distribution system is the rule
Caching approach


Simple approach first
Various existing algorithms
Simulation approach


No real-world systems exist
Similar real-world situations can be adopted
Using Existing Models
Use of existing access models?



Some access models exist
Most are used to investigate single server or cluster behavior
Real-world data is necessary to verify existing models
Optimistic model



Cache hit probabilities are over-estimated
Caches are under-dimensioned
Network traffic is higher than expected
Pessimistic model



Cache hit probabilities are under-estimated
Cache servers are too large or not used at all
Networks are overly large
Existing Data Sources for Video-on-Demand
Movie magazines



Data about average user behaviour
Represents large user populations
Small number of observation points (weekly)
Movie rental shops



Actual rental operations
Serves only a small user population
Initial peaks may be clipped
Cinemas




Actual viewing operations
Serves only a small user population
Small number of titles
Short observation periods
Model for Large User Populations
Zipf Distribution
$$z(i) = \frac{C}{i^{\xi}}, \qquad C = \frac{1}{\sum_{j=1}^{N} 1/j^{\xi}}$$

Verified for VoD by A. Chervenak




N - overall number of movies
ξ – skew factor
i - movie i in a list ordered by decreasing popularity
z(i) - hit probability
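The hit probabilities can be computed directly from these definitions. This is a minimal sketch: it assumes the skew factor enters as an exponent on the rank (with a skew of 1 giving the classic Zipf distribution), and the function and parameter names are illustrative.

```python
def zipf(n_movies, skew=1.0):
    """Return the hit probabilities z(1)..z(N) for N movies ordered by rank."""
    # Assumption: the skew factor is the exponent applied to the rank.
    weights = [1.0 / (i ** skew) for i in range(1, n_movies + 1)]
    c = 1.0 / sum(weights)  # normalization constant C
    return [c * w for w in weights]
```

For example, `zipf(250)` yields one probability per title in a 250-title catalogue, with the probabilities summing to 1 and decreasing with rank.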
Many application contexts


all kinds of product popularity investigations
http://linkage.rockefeller.edu/wli/zipf/ collects applications of Zipf’s law

natural languages, monkey-typing texts, web access statistics, Internet
traffic, bibliometrics, informetrics, scientometrics, library science, finance,
business, ecological systems, ...
Verification: Movie Magazine
Movie magazine




Characteristics of observations on large user populations
Smoothness
Predictability of trends
Sharp increase and slower decrease in popularities
[Figure: Highlander 3, two panels over weeks: top 100 ranking and media control index]
Comparison with the Zipf Distribution
[Figure: probability curves for 250 movie titles; rental probability over movie index; curves: 4/3/96, 4/6/96 and z(i)]
Well-known and accepted model
Easily computable
Compatible with the 90:10 rule-of-thumb
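How close a Zipf distribution comes to the 90:10 rule-of-thumb depends on the catalogue size and the skew, and it is easy to check numerically. The sketch below assumes the skew factor is an exponent on the rank; `top_share` is a hypothetical helper that returns the fraction of all requests going to the most popular titles.

```python
def top_share(n_movies, fraction, skew=1.0):
    """Share of requests that go to the top `fraction` of titles under Zipf."""
    # Assumption: the skew factor is the exponent applied to the rank.
    weights = [1.0 / (i ** skew) for i in range(1, n_movies + 1)]
    top = max(1, round(n_movies * fraction))
    return sum(weights[:top]) / sum(weights)
```

Calling `top_share(250, 0.1)` gives the share of the top 10% of a 250-title catalogue; the same value is the hit probability of a cache that holds exactly those titles. Raising `skew` concentrates more requests on the top titles.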
Verification: Small and Large User Populations
Similarities




Small populations follow the general trends
Computing averages makes the trends more visible
Time-scale of popularity changes is identical
No decrease to a zero average popularity
Differences


Large differences in total numbers
Large day-to-day fluctuations in the small populations
Typical assumptions


90:10 rule
Zipf distribution models real hit probability
Problems of Zipf
Does not work in distribution hierarchies

Accesses to independent caches beyond the first level are not described
Not easily extended to model day-to-day changes


Is timeless
Describes a snapshot situation
Optimistic for the popularity of most popular titles
Chris Hillman, bionet.info-theory, 1995
Any power law distribution for the frequency with which various combinations of “letters” appear in a sequence is due simply to a very general statistical phenomenon, and certainly does not indicate some deep underlying process or language. Rather, it says you probably aren’t looking at your problem the right way!

Approaches to Long-term Development
Model variations for long-term studies

Static approach

No long-term changes
Movies are assumed to be distributed in off-peak hours
CD sales model

Smooth curve with a single peak
Models the increase and decrease in popularity
Shifted Zipf distribution

Zipf distribution models the daily distribution
Shift simulates daily shift of popularities
Permuted Zipf distribution

Zipf distribution models the daily distribution
Permutation simulates daily shift of popularities
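The two Zipf variations can be illustrated by keeping the daily Zipf probabilities fixed and changing which title holds which rank. The slides do not spell out the exact update rules, so both rules below are assumptions: the shift moves every title down one rank per day and lets the last title re-enter at the top, and the permutation draws a fresh random ranking for every day. All names are hypothetical.

```python
import random


def zipf(n_movies, skew=1.0):
    """Daily hit probabilities by rank (skew assumed to be a rank exponent)."""
    weights = [1.0 / (i ** skew) for i in range(1, n_movies + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def shifted_ranking(titles, day):
    """Assumed shift rule: rotate the ranking by one position per day."""
    k = day % len(titles)
    return titles[-k:] + titles[:-k] if k else list(titles)


def permuted_ranking(titles, day, seed=0):
    """Assumed permutation rule: an independent random ranking per day."""
    rng = random.Random(seed * 1_000_003 + day)
    ranking = list(titles)
    rng.shuffle(ranking)
    return ranking


def hit_probability(title, ranking, probabilities):
    """Probability of a title on a day, given that day's ranking."""
    return probabilities[ranking.index(title)]
```

In both variants the distribution over ranks on any single day is still Zipf; only the assignment of titles to ranks changes from day to day.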
Verification: Zipf Variations
Rotation model for day-to-day relevance changes
[Figure: popularity index change over age in days; panels: relevance change, relevance change of a real movie]
Verification: Zipf Variations
Permutation model for day-to-day relevance changes
[Figure: popularity index change over age in days; panels: relevance change, relevance change of a real movie]
Modelling: Requirements
Model should represent movie life cycles




To reflect the aging of titles
To observe movement of movies through a hierarchy of servers
To make observations with respect to a single movie
To support the idea of pre-distribution
Model should work for large and small user populations


To allow variations in client numbers
To prevent built-in smoothing effects
Model cannot be trace-driven




The number of movies is too small
The observation time is too short
The user population size is not variable
One title cannot be re-used without similarity effects
New Model: Movie Life Cycle
Characteristics




Quick popularity increase
Various top popularities
Various speeds in popularity decrease
Various residual popularities
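A curve with these four characteristics can be sketched with a handful of parameters. The function below is purely illustrative and is not the model defined in the lecture or in the referenced paper: it assumes a linear rise to the peak followed by an exponential decay towards a residual level, and every parameter name is hypothetical.

```python
import math


def life_cycle_popularity(day, rise_days, peak, decay_rate, residual):
    """Illustrative popularity of one title `day` days after its release."""
    if day < 0:
        return 0.0  # not yet released
    if day < rise_days:
        # Quick popularity increase: linear ramp up to the peak (assumption).
        return peak * day / rise_days
    # Decrease at a title-specific speed towards a residual popularity.
    age_after_peak = day - rise_days
    return residual + (peak - residual) * math.exp(-decay_rate * age_after_peak)
```

Varying `peak`, `decay_rate` and `residual` per title gives the various top popularities, decrease speeds and residual popularities listed above.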
New Model: User Population Size
[Figure: movie hits over days for 50, 500, 5000 and 50000 draws per day]
Smoothing effect of larger user populations
Day-to-day relevance changes
Probability distribution of all movies by “new releases”
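The smoothing effect of larger user populations can be reproduced with a small simulation. This is a sketch under simple assumptions: requests on a day are independent draws from a fixed probability distribution, and the helper names, the seed and the use of the relative spread (standard deviation divided by mean) as a measure of fluctuation are all choices made for this example.

```python
import random


def daily_hits(probabilities, draws_per_day, days, movie=0, seed=1):
    """Count the hits on one movie for each simulated day."""
    rng = random.Random(seed)
    titles = range(len(probabilities))
    hits = []
    for _ in range(days):
        picks = rng.choices(titles, weights=probabilities, k=draws_per_day)
        hits.append(picks.count(movie))
    return hits


def relative_spread(hits):
    """Standard deviation of the daily hits divided by their mean."""
    mean = sum(hits) / len(hits)
    if mean == 0:
        return float("inf")
    variance = sum((h - mean) ** 2 for h in hits) / len(hits)
    return variance ** 0.5 / mean
```

Comparing `relative_spread(daily_hits(p, 50, 100))` with `relative_spread(daily_hits(p, 5000, 100))` for the same distribution `p` shows much larger day-to-day fluctuations for the small population.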
Problems with Data Sources
Lack of additional real-world data

No verification data for medium-sized populations available
Missing details

Genres

Popularity rise and decline depend on genres
Single users’ behaviour can be predicted
Single day probability variations

Children’s choices at daytime, adults’ choices at night
Regional popularity differences

Ethnic groups
Regional information
Comebacks

Sequels inspire comebacks
Detail overload

Simplifications are required for large simulations
Video Access Modeling
Simple Zipf models are not suited for simulation of server hierarchies
Trace-driven simulation cannot be used
Our model is sufficient for general investigations of caching

Long-term movie life cycles can be modeled nicely
Optimistic assumptions due to smoothness are removed
Variations in movie behavior are supported
Day-to-day popularity changes are realistic
It is not yet sufficient for advanced caching mechanisms

Single-day variations are missing
Genres are missing
Summary
User modeling helps achieve a good price/performance ratio for multimedia systems
User modeling allows cheating
Examples seen:



Modeling quality assessment of layered video
Modeling audio/video synchronization
Modeling video access probability
References
Ann Chervenak: Tertiary Storage: An Evaluation of New Applications, PhD thesis, University of California, Berkeley, 1994
Carsten Griwodz, Michael Bär, Lars Wolf: Long-term Movie Popularity Models in Video-on-Demand Systems, ACM Multimedia, Seattle, WA, USA, Nov. 1997
Charles Krasic, Jonathan Walpole: Priority-Progress Streaming for Quality-Adaptive Multimedia, ACM Multimedia Doctoral Symposium, Ottawa, Canada, Oct. 2001
Ralf Steinmetz, Klara Nahrstedt: Multimedia Fundamentals, Volume I: Media Coding and Content Processing (2nd Edition), Prentice Hall, 2002, ISBN 0130313998
Michael Zink, Oliver Künzel, Jens Schmitt, Ralf Steinmetz: Subjective Impression of Variations in Layer-Encoded Videos, IWQoS, Monterey, CA, USA, Jun. 2003
Michael Zink, Jens Schmitt, Carsten Griwodz: Layer-Encoded Video Streaming: A Proxy's Perspective, IEEE Communications Magazine, Vol. 42, No. 8, Aug. 2004