Traffic Prediction on the Internet
Anne Denton

Outline
- Paper by Y. Baryshnikov, E. Coffman, D. Rubenstein, and B. Yimwadsana
- Solutions
- Time-series prediction
- Our work for the KDD-cup 03

Time Series Prediction on the Internet
By Y. Baryshnikov, E. Coffman, D. Rubenstein, and B. Yimwadsana
- Adjustment to "hot spots"
- Avoiding degradation, even "denial of service"
- Can "hot spots" be predicted?
- Can predicted "hot spots" be avoided?

What are "hot spots"?
- Exceptionally large numbers of requests
- Spontaneous, with a short lifetime
- "Instant" ramp-up in traffic; only valid on long time scales
- Claim: the time scale of the increase is larger than the time scale needed to react
- Why does the increase take time? Word spreads gradually ("passing on the word")

How good does a predictor have to be?
- The cost of missing a "hot spot" is higher than the aggregate cost of false alarms (similar to hurricane warnings)

Examples
- Olympics (Nagano, 1998)
- Soccer World Cup (1998)
- NASA (1995)

What to do about "hot spots"? <Detour>
"The Columbia Hotspot Rescue Service: A Research Plan"
E. Coffman, P. Jelenkovic, J. Nieh, and D. Rubenstein

Approaches
- Deal with high request rates ad hoc
- Build a better network (expensive): content delivery services, caching, extra bandwidth
- Suggested solution: use available and underutilized resources

Hotspot Rescue Service
- Server-based approach: requires additional resources from the server when necessary; resources are provided by other members of the Hotspot Rescue Service
- Peer-to-peer approach: requires additional resources from clients when necessary; caching

Four Phases
- Prediction (see rest of presentation): server-based: daemons; P2P: plug-ins
- Replication: server-based: replication of objects; P2P: identified cached copies; more advanced: redistribution of traffic load
- Notification: modifications to DNS (Domain Name System); P2P system proactively announces hot objects and indicates alternative locations?
- Termination
<End of Detour>

Tail of Distribution
- Requests per 10-second time slot
- X-axis: number of hits per time slot
- Y-axis: probability that that number of hits is exceeded

Time Scales
- Prediction relies on correlation between values at different times
- Autocorrelation function: C(tau) = Integral of f(t) f(t + tau) dt
- Predictability on time scales of 5-30 min

Prediction Algorithm
- A standard problem in signal processing and econometrics
- Internet traffic is particularly bursty
- Simplest model: linear extrapolation

Structure of Prediction Algorithms
- Traffic observation: number of requests in the time unit (t-1, t], usually 1 s
- Prediction window: duration W_p > 0
- Advance notice: Delta >= 0
- Prediction at time t: a mapping of the observations in [t - W_p, t] to a number p_t >= 0 of requests predicted in the interval [t + Delta, t + Delta + 1], i.e., Delta units in the future

Linear Prediction
- Least-squares linear fit: p_t = f_t(t + Delta) with f_t(s) = a_t s + b_t
- Minimize Sum over i = t - W_p, ..., t of (f_t(i) - r_i)^2
- Performance: O(W + T), where W is the window size and T the uptime duration
- Problem: the prediction window size must match the burstiness parameters governing the request flow
- Results depend on the properties of the autocorrelation function

Conclusions of Paper
- Build a load-based taxonomy of web server traffic
- Traffic depends on technological, sociological, and psychological factors
- Look for quantification of basic patterns reflecting behavior

Do we agree???
- Why cluster when we can classify!

Our Approach
- Normally, time-series prediction uses only data from the series itself
- We use similarity to other instances, e.g., other web sites
- Model-free weighted nearest-neighbor approach
- Problem: how do we integrate time?
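The sliding-window least-squares predictor described above (fit f_t(s) = a_t s + b_t to the observations in [t - W_p, t], then evaluate at t + Delta) can be sketched in a few lines of Python. This is a minimal illustration; the function name and the synthetic ramp series are mine, not from the paper:

```python
def linear_predict(requests, window, delta):
    """Least-squares linear fit f_t(s) = a*s + b over the last
    window + 1 observations r_i, i in [t - window, t]; extrapolate
    to predict the request count delta steps in the future.
    The result is clamped at zero (a count cannot be negative)."""
    t = len(requests) - 1
    pts = [(s, requests[s]) for s in range(t - window, t + 1)]
    n = len(pts)
    sum_s = sum(s for s, _ in pts)
    sum_r = sum(r for _, r in pts)
    sum_sr = sum(s * r for s, r in pts)
    sum_ss = sum(s * s for s, _ in pts)
    # Closed-form least-squares slope and intercept.
    a = (n * sum_sr - sum_s * sum_r) / (n * sum_ss - sum_s ** 2)
    b = (sum_r - a * sum_s) / n
    return max(a * (t + delta) + b, 0.0)

# Example: a linearly ramping request series is extrapolated exactly.
ramp = [2 * i + 5 for i in range(50)]          # r_i = 2i + 5
print(linear_predict(ramp, window=10, delta=3))  # -> 109.0 (= 2*52 + 5)
```

On bursty traffic the fit is only as good as the window choice, which is exactly the paper's caveat: the window size must match the burstiness of the request flow.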
Typical Nearest Neighbor Classification / Regression
- Relation R(A1, ..., An, C)
- Attributes A_i; C is the class label (classification) or a continuous variable (regression)
- Based on a distance function over the A_i
- Either the k nearest neighbors or all neighbors within a range
- Use a kernel function to weight closer neighbors more highly

Weighting of Attributes
- Some attributes are more important than others
- Apply a scaling to the space
- Optimize the weights through hill climbing or a genetic algorithm
- How does this generalize to a time series?

Our Answer
- Identify "relevant" sections in the time series, e.g., times with already high download rates
- We call each relevant section a "prediction"

Predictions
- Each prediction contains information about: the nature of the time series; the time instance in question, i.e., the history of requests; the actual change in requests
- Collecting the predictions in a table leads to a relation, just as in the standard classification / regression setting

Data Set
- Paper citations in the arXiv e-print archive
- Background: KDD-cup 03
- Predict the change in citations over successive 3-month periods
- Only consider periods with at least 6 citations
- Evaluation: L1 distance (Manhattan distance) between predicted and actual difference
- Very close match between citation history and request history
- Predict the change in requests; only consider periods that already show a large number of requests

Attributes of a "Prediction"
- Quantitative attributes: number of citations in the window; gradient of citations in the window; aggregate number of citations up to and through the window (assuming a finite time series)
- Attribute values given by the time series: keyword occurrences, author, number of revisions of the paper, maximum time interval between revisions, country of origin, format

Similarity Function
- Common kernel function (Gaussian): K(x0, x1) = exp(-(x0 - x1)^2 / (2 sigma^2))
- What worked better: K(x0, x1) = 1 / (1 + w |x0 - x1|)

[Plot of the two similarity functions, Gaussian and 1/(1+x), for x from 0 to 20]

Accuracy
- No linear extrapolation baseline available; it could lead to negative citations

Comparison (L1 error; lower is better)
- Default prediction, no change:
  1851
- Very simple model (decrease by 0.3 in 3 months): 1532
- Prediction based on the average of the time series (synchronized at the first non-zero value): 1593
- Prediction based on quantitative attributes: 1465
- Full prediction (preliminary): 1357
- Weight-optimized (very preliminary): error reduced from 1414 to 1391

[Results chart: four series plotted over 11 periods, y-axis 0-3000]

Conclusions
- The method works well for citation prediction
- Yet to be tested for hot-spot prediction
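To close, the model-free weighted nearest-neighbor regression described earlier can be sketched as follows: the predicted change is the similarity-weighted average of the observed changes in the prediction table, using the K(x0, x1) = 1/(1 + w |x0 - x1|) kernel that worked better than the Gaussian. This is a hypothetical minimal sketch; the toy table, attribute choices, and weights are mine, not the actual experiment:

```python
def kernel(x0, x1, w=1.0):
    """Similarity kernel K(x0, x1) = 1 / (1 + w * |x0 - x1|);
    w scales the attribute (more important -> larger w)."""
    return 1.0 / (1.0 + w * abs(x0 - x1))

def predict_change(query, table, weights):
    """Model-free weighted nearest-neighbor regression: predict the
    change for `query` as the similarity-weighted average of the
    observed changes.  Each table row is (attribute_vector, change);
    the overall similarity is the product of per-attribute kernels."""
    num = den = 0.0
    for attrs, change in table:
        sim = 1.0
        for q, a, w in zip(query, attrs, weights):
            sim *= kernel(q, a, w)
        num += sim * change
        den += sim
    return num / den

# Toy "prediction" table: (citations in window, gradient) -> change.
table = [((6, 1.0), 2.0), ((20, 3.0), 8.0), ((7, 0.5), 1.0)]
# The query matches the first row exactly, so its change dominates.
print(predict_change((6, 1.0), table, weights=(1.0, 1.0)))
```

The attribute weights used here are fixed; in the approach above they would be tuned by hill climbing or a genetic algorithm against the L1 error on held-out data.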