INF SERV – Media Storage and Distribution Systems: User Modeling
13/9 – 2004

Why user modeling?
  Multimedia approach
    - If you can’t make it, fake it
  Translation
    - Present real-life quality
    - If not possible, save resources where it is not recognizable
  Requirement
    - Know content and environment
    - Understand limitations to user perception
    - If these limitations must be violated, know the least disturbing saving options

INF5070 – media servers and distribution systems 2004 Carsten Griwodz & Pål Halvorsen

User Modelling
  What?
    - Formalized understanding of
      - users’ awareness
      - user behaviour
  Why?
    - Achieve the best price/performance ratio
    - Understand actual resource needs
      - achieve higher compression using lossy compression
      - potential of trading resources against each other
      - potential of resource sharing
      - relax relation between media

Applications of User Modelling
  Encoding Formats
    - Exploit limited awareness of users
      - JPEG/MPEG video and image compression
      - MP3 audio compression
      - Based on medical and psychological models
  Quality Adaptation
    - Adapt to changing resource availability
    - no models, need experiments
  Synchronicity
    - Exploit limited awareness of users
    - no models, need experiments
  Access Patterns
    - When will users access a content?
    - Which content will users access?
    - How will they interact with the content?
    - no models, insufficient experiments; need information from related sources

User Perception of Quality Changes

Quality Changes
  Quality of a single stream
    - Issue in Video-on-Demand, Music-on-Demand, ...
    - Not quality of an entire multimedia application
  Quality Changes
    - Usually due to changes in resource availability
      - overloaded server
      - congested network
      - overloaded client

Kinds of Quality Changes
  Long-term change in resource availability
    Random
      - no back channel
      - no content adaptivity
      - continuous severe disruption
    Planned
      - change to another encoding format
      - change to another quality level
      - requires mainly codec work
  Short-term change in resource availability
    Random
      - packet loss
      - frame drop
      - alleviated by protocols and codecs
    Planned
      - scaling of data streams
      - appropriate choices require user model

Planned quality changes
  Audio
    - Lots of research in scalable audio
    - No specific results for distribution systems
    - Rule-of-thumb: always degrade video before audio
  Video
    - Long-term changes
    - Short-term changes

Planned quality changes
  Video, long-term changes
    - Use separately encoded streams
    - Switch between formats
    - Non-scalable formats compress better than scalable ones (Source: Yuriy Reznik, RealNetworks)
  Switching between formats
    - Needs no user modeling
    - Is an architecture issue

Planned quality changes
  Video, short-term changes
    - Use scalable encoding
    - Reduce short-term fluctuation by prefetching and buffering
  Two kinds of scalable encoding schemes
    - Non-hierarchical
      - encodings are more error-resilient
        - fractal single image encoding
    - Hierarchical
      - encodings have better compression ratios
  Scalable encoding
    - Support for prefetching and buffering is an architecture issue
    - Choice of prefetched and buffered data is not

Planned quality changes
  Video, short-term changes
    - Use scalable encoding
    - Reduce short-term fluctuation by prefetching and buffering
  Short-term fluctuations
    - Characterized by
      - frequent quality changes
      - small prefetching and buffering overhead
    - Supposed to be very disruptive
    - See for yourself: subjective assessment

Subjective Assessment
  A test performed by the Multimedia Communications Group at TU Darmstadt
  Goal
    - Predict the most appropriate way to change quality
  Approach
    - Create artificial drop in layered video sequences
    - Show pairs of video sequences to testers
    - Ask which sequence is more acceptable
  Compare two means of prediction
    - Peak signal-to-noise ratio (higher is better)
      - compares degraded and original sequences per-frame
      - ignores order
    - Spectrum of layer changes (lower is better)
      - takes number of layer changes into account
      - ignores content and order

Subjective Assessment
  Used SPEG (OGI) as layer encoded video format
  [Figure: layers over frames, two pairs of panels: amplitude of layer variation; frequency of layer variation]

Subjective Assessment
  What is better?
  [Figure: layers over frames, two pairs of panels: first gap first or lowest gap first?; early or late high quality?]

Subjective Assessment
  How does the spectrum correspond with the results of the subjective assessment?
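As a rough illustration of the PSNR predictor named above, the following Python sketch computes a per-frame PSNR and averages it over a sequence. It assumes 8-bit pixel values (peak 255) and frames given as flat lists of pixel values; the function names `psnr` and `mean_psnr` are illustrative and do not come from the TU Darmstadt test. Averaging per-frame values is one simple way to obtain a sequence-level number, and it makes the "ignores order" property visible.

```python
import math


def psnr(original, degraded, max_value=255):
    """PSNR in dB between two equally sized frames (flat pixel sequences)."""
    if len(original) != len(degraded) or not original:
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error between the original and the degraded frame.
    mse = sum((o - d) ** 2 for o, d in zip(original, degraded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


def mean_psnr(original_frames, degraded_frames):
    """Average of the per-frame PSNR values (assumed aggregation)."""
    values = [psnr(o, d) for o, d in zip(original_frames, degraded_frames)]
    if not values:
        raise ValueError("need at least one frame pair")
    return sum(values) / len(values)
```

Because the sequence value is a plain average, playing the same degraded frames in a different order yields the same result, whereas a viewer may well perceive the two orderings differently.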
  Comparison with the peak signal-to-noise ratio
  [Table: subjective assessment, PSNR (higher is better) and spectrum (lower is better) for the test sequences ts 1 and ts 2 of the clips Farm 1, M&C1, Farm 2, M&C3, M&C4 and T-Tennis3]
  According to the results of the subjective assessment, the spectrum is a more suitable measure than the PSNR

Subjective Assessment
  Conclusions
    - Subjective assessment of variations in layer encoded videos
    - Comparison of spectrum measure vs. PSNR measure
      - Observing spectrum changes is easier to implement
      - Spectrum changes indicate user perception better than PSNR
      - Spectrum changes do not capture all situations
  Missing
    - Subjective assessment of longer sequences
    - Better heuristics
      - "thickness" of layers
      - order of quality changes
      - target layer of changes

User Model for Synchronicity

Synchronization
  Content Relation
    - e.g.: several views of the same data
  Spatial Relations
    - Layout
  Temporal Relations
    - Intra-object Synchronization
      - Intra-object synchronization defines the time relation between various presentation units of one time-dependent media object
    - Inter-object Synchronization
      - Inter-object synchronization defines the synchronization between media objects
  Relevance
    - Hardly relevant in current NVoD systems
    - Somewhat relevant in conferencing systems
    - Relevant in upcoming multi-object formats: MPEG-4, Quicktime

Inter-object Synchronization
  Lip synchronization
    - demands a tight coupling of audio and video streams with a limited skew between the two media
      streams
  Slide show with audio comment
  Main problem of the user model
    - permissible skew

Inter-object Synchronization
  A lip synchronized audio video sequence (Audio1 and Video) is followed by a replay of a recorded user interaction (RI), a slide sequence (P1 - P3) and an animation (Animation) which is partially commented using an audio sequence (Audio2). Starting the animation presentation, a multiple choice question is presented to the user (Interaction). If the user has made a selection, a final picture (P4) is shown.
  Main problem of the user model
    - permissible latency
      - analysing the object sequence allows prefetching
      - user interaction complicates prefetching

Synchronization Requirements – Fundamentals
  100% accuracy is not required, i.e., skew is allowed
  Skew depends on
    - Media
    - Applications
  Difference between
    - Detection of skew
    - Annoyance of skew
  Explicit knowledge on skew
    - Eases implementation
    - Allows for portability

Experimental Set-Up
  Experiments at IBM ENC Heidelberg to quantify synchronization requirements for
    - Audio/video synchronization
    - Audio/pointer synchronization
  Selection of material
    - Duration
      - 30 s in experiments
      - 5 s would have been sufficient
    - Reuse of same material for all tests
  Introduction of artificial skew
    - By media composition with professional video equipment
    - With frame-based granularity
  Experiments
    - Large set of test candidates
      - Professional: cutter at TV studios
      - Casual: everyday “user”
      - Awareness of the synchronization issues
    - Set of tests with different skews lasted 45 min

Lip Synchronization: Major Influencing Factors
  Video
    - Content
      - Continuous (talking head) vs.
        discrete events (hammer and nails)
    - Background (no distraction)
    - Resolution and quality
    - View mode (head view, shoulder view, body view)
  Audio
    - Content
    - Background noise or music
    - Language and articulation

Lip Synchronization: Level of Detection
  Areas
    - In sync, QoS: +/- 80 ms
    - Transient
    - Out of sync

Lip Synch.: Level of Accuracy/Annoyance
  Some observations
    - Asymmetry
    - Additional tests with long movie
      - +/- 80 ms: no distraction
      - -240 ms, +160 ms: disturbing

Pointer Synchronization
  Fundamental CSCW shared workspace issue
  Analysis of CSCW scenarios
    - Discrete pointer movement (e.g. “technical sketch”)
    - Continuous pointer movements (e.g. “route on map”)
  Most challenging probes
    - Short audio
    - Fast pointer movement

Pointer Synchronization: Level of Detection
  Observations
    - Difficult to detect “out of sync”
      - i.e., other magnitude than lip sync
    - Asymmetry
      - According to everyday experience

Pointer Synchronization: Level of Annoyance
  Areas
    - In sync, QoS: -500 ms, +750 ms
    - Transient
    - Out of sync

Quality of Service of Two Related Media Objects
  Expressed by a quality of service value for the skew
    - Acceptable skew within the involved data streams
    - Affordable synchronization boundaries
  Production level synchronization
    - Data should be captured and recorded with no skew at all
    - To be used if synchronized data will be further processed
  Presentation level synchronization
    - Reasonable synchronization at the user interface
    - To be used if synchronized data will not be further processed
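The skew bounds quoted above can be turned into a simple presentation-level check. The Python sketch below is only an illustration under stated assumptions: the class `SkewBounds` and its methods are hypothetical names, skew is treated as a signed number of milliseconds, and the slides do not say which direction of the skew counts as positive, so the caller has to apply the sign convention consistently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkewBounds:
    """Permissible skew interval in milliseconds (hypothetical helper)."""
    lower_ms: float
    upper_ms: float

    def in_sync(self, skew_ms: float) -> bool:
        # A skew inside the interval is treated as "in sync".
        return self.lower_ms <= skew_ms <= self.upper_ms

    def excess_ms(self, skew_ms: float) -> float:
        # How far the skew lies outside the interval (0 if inside).
        if skew_ms < self.lower_ms:
            return self.lower_ms - skew_ms
        if skew_ms > self.upper_ms:
            return skew_ms - self.upper_ms
        return 0.0


# Bounds taken from the experiments reported above.
LIP_SYNC = SkewBounds(-80.0, 80.0)        # audio/video lip synchronization
POINTER_SYNC = SkewBounds(-500.0, 750.0)  # audio/pointer synchronization
```

A player could, for example, call `LIP_SYNC.in_sync(measured_skew)` before deciding whether to resynchronize, and use `excess_ms` to size the correction. The asymmetric pointer bounds show why a single symmetric tolerance value is not enough.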
Quality of Service of Two Related Media Objects
  Media: Video
    Related media   Mode, application     QoS
    Animation       Correlated            +/- 120 ms
    Audio           Lip synchronization   +/- 80 ms
    Images          Overlay               +/- 240 ms
                    No overlay            +/- 500 ms
    Text            Overlay               +/- 240 ms
                    No overlay            +/- 500 ms

Quality of Service of Two Related Media Objects
  Media: Audio
    Related media   Mode, application                                         QoS
    Animation       Event correlation                                         +/- 80 ms
    Audio           Tightly coupled (stereo)                                  +/- 11 μs
                    Loosely coupled (dialog mode with various participants)   +/- 120 ms
                    Loosely coupled (background music)                        +/- 500 ms
    Image           Tightly coupled (music with notes)                        +/- 5 ms
                    Loosely coupled (slide show)                              +/- 500 ms
    Text            Text annotation                                           +/- 240 ms
    Pointer         Audio related to shown item                               -500 ms to +750 ms

User Model for Access Patterns

Modelling
  User behaviour
    - The basis for simulation and emulation
      - In turn allows performance tests
    - Separation into
      - Frequency of using the VoD system
      - Selection of a movie
      - User interaction
  Models exist
    - But are not verified
  Selection of a movie
    - Dominated by the access probability
    - Should be simulated by realistic access patterns

Focus on Video-on-Demand
  Video-on-demand systems
    - Objects are generally consumed from start to end
    - Repeated consumption is rare
    - Objects are read-only
    - Hierarchical distribution system is the rule
  Caching approach
    - Simple approach first
    - Various existing algorithms
  Simulation approach
    - No real-world systems exist
    - Similar real-world situations can be adopted

Using Existing Models
  Use of existing access models?
    - Some access models exist
    - Most are used to investigate single server or cluster behavior
    - Real-world data is necessary to verify existing models
  Optimistic model
    - Cache hit probabilities are over-estimated
    - Caches are under-dimensioned
    - Network traffic is higher than expected
  Pessimistic model
    - Cache hit probabilities are under-estimated
    - Cache servers are too large or not used at all
    - Networks are overly large

Existing Data Sources for Video-on-Demand
  Movie magazines
    - Data about average user behaviour
    - Represents large user populations
    - Small number of observation points (weekly)
  Movie rental shops
    - Actual rental operations
    - Serves only a small user population
    - Initial peaks may be clipped
  Cinemas
    - Actual viewing operations
    - Serves only a small user population
    - Small number of titles
    - Short observation periods

Model for Large User Populations
  Zipf Distribution
    $z(i) = \frac{C}{i^{\xi}}, \qquad C = \frac{1}{\sum_{j=1}^{N} 1/j^{\xi}}$
    - Verified for VoD by A. Chervenak
    - N - overall number of movies
    - ξ - skew factor
    - i - movie i in a list ordered by decreasing popularities
    - z(i) - hit probability
  Many application contexts
    - all kinds of product popularity investigations
    - http://linkage.rockefeller.edu/wli/zipf/ collects applications of Zipf’s law
      - natural languages, monkey-typing texts, web access statistics, Internet traffic, bibliometrics, informetrics, scientometrics, library science, finance, business, ecological systems, ...
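A minimal Python sketch of the Zipf model may help when experimenting with access probabilities. It assumes the usual form in which movie i (ranked by decreasing popularity) gets a hit probability proportional to 1/i^ξ, normalized over all N movies so that the probabilities sum to 1; the function names `zipf_probabilities` and `top_share` and the default ξ = 1 are illustrative choices, not taken from the slides.

```python
def zipf_probabilities(n_movies: int, xi: float = 1.0) -> list[float]:
    """Hit probability z(i) for i = 1..n_movies under a Zipf distribution."""
    if n_movies < 1:
        raise ValueError("need at least one movie")
    # Unnormalized weights 1 / i^xi for ranks 1..N.
    weights = [1.0 / (i ** xi) for i in range(1, n_movies + 1)]
    # Normalization constant C = 1 / sum_j 1 / j^xi.
    c = 1.0 / sum(weights)
    return [c * w for w in weights]


def top_share(probabilities: list[float], fraction: float) -> float:
    """Total hit probability of the most popular `fraction` of all titles."""
    k = max(1, round(len(probabilities) * fraction))
    return sum(probabilities[:k])
```

For instance, `top_share(zipf_probabilities(250), 0.1)` gives the share of requests that the model assigns to the top 10% of 250 titles, which can then be compared with measured data or with a rule of thumb. Note that the result is a static snapshot: nothing in it changes from day to day.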
Verification: Movie Magazine
  Movie magazine
    - Characteristics of observations on large user populations
      - Smoothness
      - Predictability of trends
      - Sharp increase and slower decrease in popularities
  [Figure: "Highlander 3", two panels over weeks 0 to 25: media control index; top 100 ranking]

Comparison with the Zipf Distribution
  [Figure: probability curves for 250 movie titles; rental probability over movie index; curves for 4/3/96 and 4/6/96 compared with z(i)]
    - Well-known and accepted model
    - Easily computable
    - Compatible with the 90:10 rule-of-thumb

Verification: Small and Large User Populations
  Similarities
    - Small populations follow the general trends
    - Computing averages makes the trends better visible
    - Time-scale of popularity changes is identical
    - No decrease to a zero average popularity
  Differences
    - Large differences in total numbers
    - Large day-to-day fluctuations in the small populations
  Typical assumptions
    - 90:10 rule
    - Zipf distribution models real hit probability

Problems of Zipf
  Does not work in distribution hierarchies
    - Access to independent caches beyond the first level is not described
  Not easily extended to model day-to-day changes
    - Is timeless
    - Describes a snapshot situation
  Optimistic for the popularity of most popular titles
  Chris Hillman, bionet.info-theory, 1995
    Any power law distribution for the frequency with which various combinations of “letters” appear in a sequence is due simply to a very general statistical
    phenomenon, and certainly does not indicate some deep underlying process or language. Rather, it says you probably aren’t looking at your problem the right way!

Approaches to Long-term Development
  Model variations for long-term studies
    - Static approach
      - No long-term changes
      - Movies are assumed to be distributed in off-peak hours
    - CD sales model
      - Smooth curve with a single peak
      - Models the increase and decrease in popularity
    - Shifted Zipf distribution
      - Zipf distribution models the daily distribution
      - Shift simulates daily shift of popularities
    - Permutated Zipf distribution
      - Zipf distribution models the daily distribution
      - Permutation simulates daily shift of popularities

Verification: Zipf Variations
  Rotation model for day-to-day relevance changes
  [Figure: popularity index change and relevance change over age in days (0 to 250), compared with the relevance change of a real movie]

Verification: Zipf Variations
  Permutation model for day-to-day relevance changes
  [Figure: popularity index change and relevance change over age in days (0 to 250), compared with the relevance change of a real movie]

Modelling: Requirements
  Model should represent movie life cycles
    - To reflect the aging of titles
    - To observe movement of movies through a hierarchy of servers
    - To make observations with respect to a single movie
    - To support the idea of pre-distribution
  Model should work for large and small user populations
    - To allow variations in client numbers
    - To prevent built-in smoothing effects
  Model cannot be trace-driven
    - The number of movies is too small
    - The observation time is too short
    - The user population size is not variable
    - One title cannot be re-used without similarity effects

New Model: Movie Life Cycle
  Characteristics
    - Quick popularity increase
    - Various top popularities
    - Various speeds in popularity decrease
    - Various residual popularity

New Model: User Population Size
  [Figure: movie hits over days (0 to 250) for 50, 500, 5000 and 50000 draws per day]
    - Smoothing effect of larger user populations
    - Day-to-day relevance changes
    - Probability distribution of all movies by “new releases”

Problems with Data Sources
  Lack of additional real-world data
    - No verification data for medium-sized populations available
  Missing details
    - Genres
      - Popularity rise and decline depends on genres
      - Single users’ behaviour can be predicted
    - Single day probability variations
      - Children’s choices at daytime, adults’ choices at night
    - Regional popularity differences
      - Ethnic groups
      - Regional information
    - Comebacks
      - Sequels inspire comebacks
  Detail overload
    - Simplifications are required for large simulations

Video Access Modeling
  Simple Zipf models are not suited for simulation of server hierarchies
  Trace-driven simulation cannot
  be used
  Our model is sufficient for general investigation on caching
    - Long-term movie life cycles can be modeled nicely
    - Optimistic assumptions due to smoothness are removed
    - Variations in movie behavior are supported
    - Day-to-day popularity changes are realistic
  It is not sufficient yet for advanced caching mechanisms
    - Single-day variations are missing
    - Genres are missing

Summary
  User modeling helps to achieve a good price/performance ratio for multimedia systems
  User modeling allows cheating
  Examples seen:
    - Modeling quality assessment of layered video
    - Modeling audio/video synchronization
    - Modeling video access probability

References
  Ann Chervenak: Tertiary Storage: An Evaluation of New Applications, PhD thesis, University of California, Berkeley, 1994
  Carsten Griwodz, Michael Bär, Lars Wolf: Long-term Movie Popularity Models in Video-on-Demand Systems, ACM Multimedia, Seattle, WA, USA, Nov. 1997
  Charles Krasic, Jonathan Walpole: Priority-Progress Streaming for Quality-Adaptive Multimedia, ACM Multimedia Doctoral Symposium, Ottawa, Canada, Oct. 2001
  Ralf Steinmetz, Klara Nahrstedt: Multimedia Fundamentals, Volume I: Media Coding and Content Processing (2nd Edition), Prentice Hall, 2002, ISBN 0130313998
  Michael Zink, Oliver Künzel, Jens Schmitt, Ralf Steinmetz: Subjective Impression of Variations in Layer-Encoded Videos, IWQoS, Monterey, CA, USA, Jun. 2003
  Michael Zink, Jens Schmitt, Carsten Griwodz: Layer-Encoded Video Streaming: A Proxy's Perspective, IEEE Communications Magazine, Vol. 42, No. 8, Aug. 2004