CPSC 531: Output Data Analysis

Instructor: Anirban Mahanti
Office: ICT 745
Email: [email protected]
Class Location: TRB 101
Lectures: TR 15:30 – 16:45 hours

Slides primarily adapted from: “The Art of Computer Systems Performance Analysis” by Raj Jain, Wiley 1991. [Chapters 12, 13, and 25]

CPSC 531: Data Analysis

Outline
• Measures of Central Tendency
  • Mean, Median, Mode
• How to Summarize Variability?
• Comparing Systems Using Sample Data
• Comparing Two Alternatives
• Transient Removal

Measures of Central Tendency (1)
• Sample mean: the sum of all observations divided by the total number of observations
  • Always exists and is unique
  • Gives equal weight to all observations
  • Is strongly affected by outliers
• Sample median: list the observations in increasing order; the observation in the middle of the list is the median
  • Even number of observations: take the mean of the middle two values
  • Always exists and is unique
  • Resistant to outliers (compared to the mean)

Measures of Central Tendency (2)
• Sample mode: plot a histogram from the observations and find the bucket with the peak frequency; the midpoint of this bucket is the mode
  • A mode may not exist (e.g., all samples have equal weight)
  • More than one mode may exist (e.g., bimodal)
  • If there is only one mode, the distribution is unimodal
[Figure: three plots of PDF f(x) versus x, with the modes marked]

Measures of Central Tendency (3)
• Is the data categorical?
  • Yes: use the mode, e.g., the most used resource in a system
• Is the total of interest?
  • Yes: use the mean, e.g., total response time for Web requests
• Is the distribution skewed?
  • Yes: use the median. The median is less influenced by outliers than the mean.
  • No: use the mean. Why?

Common Misuses of Means (1)
• The usefulness of the mean depends on the number of observations and the variance
  • E.g., two response time samples: 10 ms and 1000 ms. The mean is 505 ms! A correct index, but useless.
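As a rough illustration of the definitions above, the following Python sketch computes the mean, median, and mode for the 10 ms / 1000 ms pair, and then for a small hypothetical data set (not from the slides) with and without an outlier. The helper name `central_tendency` is an assumption made for this example, and `statistics.multimode` counts exact repeated values rather than histogram buckets, so it only approximates the mode procedure described earlier.

```python
from statistics import mean, median, multimode


def central_tendency(samples):
    """Return (mean, median, modes) for a list of numeric observations."""
    return mean(samples), median(samples), multimode(samples)


# The two response times from the slide (in ms): the mean is a correct
# index, but it is far from both observations.
two_samples = [10, 1000]
pair_mean, pair_median, pair_modes = central_tendency(two_samples)
print("two samples:", pair_mean, pair_median, pair_modes)

# Hypothetical data, not from the slides: a single outlier drags the mean
# upward while the median stays where it was.
clean = [9, 10, 10, 10, 11]
with_outlier = [9, 10, 10, 10, 111]

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    m, med, modes = central_tendency(data)
    print(f"{label}: mean={m}, median={med}, modes={modes}")
```

With only two distinct values, every value is reported as a mode, which matches the remark that a meaningful mode may not exist.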
• Using the mean without regard to skewness

                System A    System B
                   10           5
                    9           5
                   11           5
                   10           4
                   10          31
    Mean:          10          10
    Mode:          10           5
    Min, Max:    [9, 11]     [4, 31]

Common Misuses of Means (2)
• Computing the mean of a product by multiplying means
  • The mean of a product equals the product of the means if the two random variables are independent.
  • If x and y are correlated, then in general $E(xy) \neq E(x)E(y)$
  • Avg. users in system: 23; avg. processes/user: 2. Avg. number of processes in system? Is it 46? No! The number of processes spawned by users depends on the load.

Outline
• Measures of Central Tendency
• How to Summarize Variability?
• Comparing Systems Using Sample Data
• Comparing Two Alternatives
• Transient Removal

Summarizing Variability
• Summarizing by a single number is rarely enough.
• Given two systems with the same mean, we generally prefer the one with less variability.
[Figure: two response-time frequency plots, each with mean = 2 s. In one, 80% of responses take 1.5 s and 20% take 4 s; in the other, 60% take ~0.001 s and 40% take ~5 s.]
• Indices of dispersion
  • Range, variance, 10- and 90-percentiles, semi-interquartile range, and mean absolute deviation

Range
• Easy to calculate: range = max – min
• In many scenarios, not very useful:
  • Min may be zero
  • Max may be an “outlier”
  • With more samples, max may keep increasing and min may keep decreasing → no “stable” point
• Range is useful if the system's performance is bounded

Variance and Standard Deviation
• Given a sample of n observations {x1, x2, …, xn}, the sample variance is calculated as:
  $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$, where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• Sample variance: $s^2$ (in the square of the unit of observation)
• Sample standard deviation: $s$ (in the unit of observation)
• Note the (n-1) in the variance computation
  • Only (n-1) of the n differences are independent
  • Given (n-1) differences, the nth difference can be computed
  • The number of independent terms is the degrees of freedom (df)

Standard Deviation (SD)
• Standard deviation and mean have the same units, so SD is preferred! E.g.
  a) Mean = 2 s, SD = 2 s; high variability?
  b) Mean = 2 s, SD = 0.2 s; low variability?
• Another widely used measure: the coefficient of variation (C.O.V.)
  • C.O.V. = ratio of the standard deviation to the mean
  • C.O.V. does not have any units
  • C.O.V. shows the magnitude of variability
  • C.O.V. in (a) is 1 and in (b) is 0.1

Percentiles, Quantiles, Quartiles
• Lower and upper bounds expressed in percent or as fractions
  • 90-percentile → 0.9-quantile
  • $\alpha$-quantile: sort and take the $[(n-1)\alpha + 1]$th observation
    • [] means round to the nearest integer
• Quartiles divide the data into parts at 25%, 50%, 75% → quartiles (Q1, Q2, Q3)
  • 25% of the observations ≤ Q1 (the first quartile)
  • The second quartile Q2 is also the median
  • The range (Q3 – Q1) is the interquartile range
  • (Q3 – Q1)/2 is the semi-interquartile range (SIQR)

Mean Absolute Deviation
• Mean absolute deviation is calculated as:
  $\frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|$

Influence of Outliers
• Range: considerably
• Sample variance: considerably, but less than range
• Mean absolute deviation: less than variance
  • Doesn’t square (i.e., magnify) the outliers
• SIQR: very resistant
• Use SIQR as the index of dispersion whenever the median is used as the index of central tendency

Outline
• Measures of Central Tendency
• How to Summarize Variability?
• Comparing Systems Using Sample Data
  • Sample vs. Population
  • Confidence Interval for Mean
• Comparing Two Alternatives
• Transient Removal

Comparing Systems Using Sample Data
• The words “sample” and “example” have a common root: “essample” (French)
• One sample does not prove a theory; a sample is just an example
• The point is that a definite statement cannot be made about the characteristics of all systems.
• However, probabilistic statements about the range of most systems can be made
• The confidence interval concept serves as a building block

Sample versus Population
• Generate 1 million random numbers with mean $\mu$ and SD $\sigma$ and put them in an urn
• Draw a sample of n observations {x1, x2, …, xn}; it has mean $\bar{x}$ and standard deviation $s$
• $\bar{x}$ is likely different from $\mu$!
• The population mean $\mu$ is unknown or impossible to obtain in many real-world scenarios
• Therefore, obtain an estimate of $\mu$ from $\bar{x}$

Confidence Interval for the Mean
• Define bounds c1 and c2 such that: $\text{Prob}\{c_1 < \mu < c_2\} = 1 - \alpha$
  • (c1, c2) is the confidence interval
  • $\alpha$ is the significance level
  • $100(1-\alpha)$ is the confidence level
• Typically a small $\alpha$ is desired → confidence level of 90%, 95%, or 99%
• One approach: take k samples, find the sample means, sort them, and take the [1+0.05(k-1)]th as c1 and the [1+0.95(k-1)]th as c2

Central Limit Theorem
• We do not need many samples. Confidence intervals can be determined from one sample because $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$
• The SD of the sample mean, $\sigma/\sqrt{n}$, is called the standard error
• Using the CLT, a $100(1-\alpha)\%$ confidence interval for a population mean is
  $\left(\bar{x} - z_{1-\alpha/2}\, s/\sqrt{n},\ \bar{x} + z_{1-\alpha/2}\, s/\sqrt{n}\right)$
  • $z_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of a unit normal variate (and is obtained from a table!)
  • $s$ is the sample SD

Confidence Interval Example
• CPU times obtained by repeating an experiment 32 times. The sorted set consists of {1.9, 2.7, 2.8, 2.8, 2.8, 2.9, 3.1, 3.1, 3.2, 3.2, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 3.9, 4.1, 4.1, 4.2, 4.2, 4.4, 4.5, 4.5, 4.8, 4.9, 5.1, 5.1, 5.3, 5.6, 5.9}
• Mean = 3.9, standard deviation (s) = 0.95, n = 32
• For a 90% confidence interval, $z_{1-\alpha/2} = 1.645$, and we get $3.90 \pm (1.645)(0.95)/\sqrt{32} = (3.62, 4.17)$

Meaning of Confidence Interval
• What does this mean?
• With 90% confidence, we can say the population mean is within the above bounds; that is, the chance of error is 10%.
• E.g., take 100 samples and construct a CI from each.
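The repeated-sampling experiment just described can be simulated. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with hypothetical parameters chosen to resemble the CPU-time example (mean 3.9, SD 0.95), the function names `z_interval` and `coverage` are invented for this example, and 1000 trials are used instead of 100 to reduce noise.

```python
import math
import random
from statistics import mean, stdev

Z_90 = 1.645  # 0.95-quantile of the unit normal, as in the example above


def z_interval(sample, z=Z_90):
    """Return (c1, c2) = x_bar -/+ z * s / sqrt(n) for one sample."""
    x_bar = mean(sample)
    half_width = z * stdev(sample) / math.sqrt(len(sample))  # stdev uses n-1
    return x_bar - half_width, x_bar + half_width


def coverage(trials, n, mu, sigma, seed=531):
    """Fraction of simulated samples whose interval contains mu."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        c1, c2 = z_interval(sample)
        if c1 < mu < c2:
            hits += 1
    return hits / trials


# Hypothetical population parameters (assumed, not measured).
frac = coverage(trials=1000, n=32, mu=3.9, sigma=0.95)
print(f"{frac:.1%} of the 90% intervals contain the population mean")
```

The printed fraction should land near 90%, which is what the confidence level promises over many repetitions; any single interval either contains the population mean or does not.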
• In about 10 of those 100 cases, the interval will not contain the population mean
[Figure: number line marking $\bar{x} - c$, $\bar{x}$, and $\bar{x} + c$; there is a 90% chance that this interval contains $\mu$]

Length of Confidence Interval
• Let $z_{1-\alpha/2}\, s/\sqrt{n} = c$. Then $z_{1-\alpha/2} = c\sqrt{n}/s$
• Larger s implies a wider confidence interval
• Larger n implies a shorter confidence interval
  • → with more observations, we are better able to predict the population mean
  • → the square-root relationship implies that increasing the number of observations by a factor of 4 only cuts the confidence interval by a factor of 2
• The confidence interval computation described here works for n ≥ 30.

What if n is not large?
• For smaller samples, we can construct confidence intervals only if the observations come from a normally distributed population
  $\left(\bar{x} - t_{[1-\alpha/2;\,n-1]}\, s/\sqrt{n},\ \bar{x} + t_{[1-\alpha/2;\,n-1]}\, s/\sqrt{n}\right)$
• $t_{[1-\alpha/2;\,n-1]}$ is the $(1-\alpha/2)$-quantile of a t-variate with (n-1) degrees of freedom

Testing for a Zero Mean
• Check if the measured value is significantly different from zero
• Determine the confidence interval
• Then check if zero is inside the interval
• The procedure is applicable to any other value a
[Figure: confidence intervals for the mean plotted against 0; intervals that include 0 (mean is zero) and intervals that exclude 0 (mean is nonzero)]

Outline
• Measures of Central Tendency
• How to Summarize Variability?
• Comparing Systems Using Sample Data
• Comparing Two Alternatives
• Transient Removal

Comparing Two Alternatives
• Often interested in comparing systems
  • “naïve” VOD vs. “batching” VOD (assignment 3)
  • “SJF” vs. “FIFO” request scheduling (assignment 1)
• Statistical techniques for such comparison:
  • Paired Observations
  • Unpaired Observations (we will omit this!)
  • Approximate Visual Test
• Did you use any of these in your assignments?

Paired Observations (1)
• n experiments with a one-to-one correspondence
  between tests on system A and tests on system B
  • No correspondence => unpaired
• This test uses the zero-mean idea
  • Treat the two samples as one sample of n pairs
  • For each pair, compute the difference
  • Construct a confidence interval for the difference
  • CI includes zero => systems not significantly different

Paired Observations (2)
• Six similar workloads used on two systems: {(5.4, 19.1), (16.6, 3.5), (0.6, 3.4), (1.4, 2.5), (0.6, 3.6), (7.3, 1.7)}. Is one system better?
• The performance differences are {-13.7, 13.1, -2.8, -1.1, -3.0, 5.6}
• Sample mean = -0.32, sample SD = 9.03
• CI = $-0.32 \pm t\sqrt{81.62/6} = -0.32 \pm t(3.69)$
• The 0.95-quantile of t with 5 degrees of freedom is 2.015
• 90% confidence interval = (-7.75, 7.11)
• The systems are not significantly different, as zero is inside the CI

Approximate Visual Test
• Compute the confidence interval for each mean
• If the CIs don’t overlap, one system is better than the other
[Figure: three cases of confidence intervals drawn around the means]
  • CIs do not overlap => alternatives are different
  • CIs overlap and the mean of one is in the CI of the other => not significantly different
  • CIs overlap but the mean of one is not in the CI of the other => need more testing

Determining Sample Size
• Goal: find the smallest sample size n that gives the desired confidence in the results
• Method:
  • Take a small set of preliminary measurements
  • Estimate the variance from the measurements
  • Use the estimate to determine the sample size needed for the desired accuracy
• r% accuracy => ±r% at $100(1-\alpha)\%$ confidence:
  $\bar{x} \pm z\,\frac{s}{\sqrt{n}} = \bar{x}\left(1 \pm \frac{r}{100}\right) \quad\Rightarrow\quad n = \left(\frac{100\,z\,s}{r\,\bar{x}}\right)^2$

Outline
• Measures of Central Tendency
• How to Summarize Variability?
• Comparing Systems Using Sample Data
• Comparing Two Alternatives
• Transient Removal

Transient Removal
• In many simulations, we are interested in steady-state performance
  • Remove the initial transient state
• However, defining exactly what constitutes the end of the transient state is difficult!
• Several heuristics have been developed:
  • Long runs
  • Proper initialization
  • Truncation
  • Initial data deletion
  • Moving average of replications
  • Batch means

Long Runs
• Use very long runs
  • Impact of the transient state becomes negligible
• Wasteful use of resources
• How long is “long enough”?
• The Raj Jain text recommends that this method not be used in isolation

Batch Means
• Run the simulation for a long duration
• Divide the N observations into m batches, each of size n
• Compute the variance of the batch means using the procedure shown, for n = 2, 3, 4, 5, …
• Plot variance vs. batch size
• Ignore
  1) Compute the batch means: $\bar{x}_i = \frac{1}{n} \sum_{j=1}^{n} x_{ij}, \quad i = 1, 2, \ldots, m$
  2) Compute the overall mean: $\bar{\bar{x}} = \frac{1}{m} \sum_{i=1}^{m} \bar{x}_i$
  3) Compute the variance of the batch means: $\text{Var}(\bar{x}) = \frac{1}{m-1} \sum_{i=1}^{m} (\bar{x}_i - \bar{\bar{x}})^2$
[Figure: variance of batch means plotted against batch size n, with the transient interval marked]
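The three-step batch means procedure above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function names are invented for this example, and dropping trailing observations that do not fill a complete batch is an assumption the slides do not address.

```python
from statistics import mean, variance


def batch_mean_variance(observations, n):
    """Variance of the means of consecutive batches of size n.

    Assumption: trailing observations that do not fill a whole batch
    are dropped.
    """
    m = len(observations) // n  # number of complete batches
    if m < 2:
        raise ValueError("need at least two complete batches")
    # Step 1: mean of each batch.
    batch_means = [mean(observations[i * n:(i + 1) * n]) for i in range(m)]
    # Steps 2 and 3: statistics.variance subtracts the overall mean of the
    # batch means and divides by (m - 1).
    return variance(batch_means)


def variance_by_batch_size(observations, sizes):
    """Map each batch size n to the variance of its batch means."""
    return {n: batch_mean_variance(observations, n) for n in sizes}


# Tiny hypothetical observation sequence, only to show the mechanics.
obs = [1, 2, 3, 4, 5, 6, 7, 8]
for n, var in variance_by_batch_size(obs, [2, 4]).items():
    print(f"batch size {n}: variance of batch means = {var:.3f}")
```

In practice the returned mapping would be plotted against batch size for a long simulation run, as the slide suggests.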