Download Confidence Interval for The Mean

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Degrees of freedom (statistics) wikipedia , lookup

Taylor's law wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Resampling (statistics) wikipedia , lookup

German tank problem wikipedia , lookup

Misuse of statistics wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Manijeh Keshtgary
Chapter 13







How to report the performance as a single number?
Is specifying the mean the correct way?
How to report the variability of measured quantities?
What are the alternatives to variance and when are they
appropriate?
How to interpret the variability?
How much confidence can you put on data with a large
variability?
How many measurements are required to get a desired
level of statistical confidence?
How to summarize the results of several different
workloads on a single computer system?
How to compare two or more computer systems using
several different workloads? Is comparing the mean
sufficient?
2





Sample Versus Population
Confidence Interval for The Mean
Approximate Visual Test
One Sided Confidence Intervals
Sample Size for Determining Mean
3



Old French word `essample'
 `sample' and `example'
One example  theory
One sample  Definite statement
4


Generate several million random numbers
with mean  and standard deviation s
Draw a sample of n observations: {x1, x2, …, xn}
x
◦ Sample mean (x)  population mean ()

Parameters: population characteristics
◦ Unknown, Use Greek letters (, s)

Statistics: Sample estimates
◦ Random, Use English letters (x, s)
5


it is not possible to get a perfect estimate of
the population mean from any finite number
of finite size samples.
The best we can do is to get probabilistic
bounds. Thus, we may be able to get two
bounds, for instance, c1 and c2,

k samples  k Sample means
◦  Can't get a single estimate of 
◦  Use bounds c1 and c2:
 Probability{c1    c2} = 1-  ( is very small)

Confidence interval: [(c1, c2)]

Significance level:  ( 0.1 )
Confidence level: 100(1-)

Confidence coefficient: 1-

c1

c2
7


Use 5-percentile and 95-percentile of the sample means to get
90% Confidence interval  Need many samples (n > 30)
Central limit theorem: Sample mean of independent and
identically distributed observations:

Where  = population mean, s = population standard deviation
Standard Error: Standard deviation of the sample mean

100(1-)% confidence interval for :
z1-/2 = (1-/2)-quantile of N(0,1)
-z1-/2
0
z1-/2
8

For example, for a
two–sided confidence
interval at 95%, α=
0.05 and p = 1 – α/2
= 0.975. The entry in
the row labeled 0.97
and column labeled
0.005 gives zp =
1.960.



x = 3.90, s = 0.95 and n = 32
A 90% confidence interval for the mean
=
We can state with 90% confidence that the
population mean is between 3.62 and 4.17.
The chance of error in this statement is 10%.
10

If we take 100 samples and construct
confidence interval for each sample, the
interval would include the population mean in
90 cases.
c1

c2
Total yes > 100(1-)
11



The preceding confidence interval applies
only for large samples, that is, for samples of
size greater than 30.
For smaller samples, confidence intervals can
be constructed only if the observations come
from a normally distributed population
100(1-) % confidence interval for n < 30
◦ Note: can be constructed only if observations come
from a normally distributed population

t[1-/2; n-1] = (1-/2)-quantile of a t-variate
with n-1 degrees of freedom
◦ Listed in Table A.4 in the Appendix
13

Table A.4 lists t[p;n]. For example, the t[0.95;13]
required for a two–sided 90% confidence
interval of the mean of a sample of 14
observation is 1.771

Sample
◦ -0.04, -0.19, 0.14, -0.09, -0.14, 0.19, 0.04, and
0.09.



Mean = 0, Sample standard deviation = 0.138.
For 90% interval: t[0.95;7] = 1.895
Confidence interval for the mean
15
16


Difference in processor times: {1.5, 2.6, -1.8, 1.3, -0.5,
1.7, 2.4}
Question: Can we say with 99% confidence
that one is superior to the other?
◦ Sample size = n = 7
◦ Mean = 7.20/7 = 1.03
◦ Sample variance = (22.84 - 7.20*7.20/7)/6 = 2.57
◦ Sample standard deviation =
= 1.60
t[0.995; 6] = 3.707

99% confidence interval = (-1.21, 3.27)
17



Opposite signs  we cannot say with 99%
confidence that the mean difference is
significantly different from zero
Answer: They are same
Answer: The difference is zero
18

Difference in processor times
◦ {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}.


Question: Is the difference 1?
99% Confidence interval = (-1.21, 3.27)
◦ The confidence interval includes 1 =>

Yes: The difference is 1 with 99% of
confidence
19

Paired: one-to-one correspondence between
the ith test of system A and the ith test on
system B
◦ Example: Performance on ith workload
◦ Straightforward analysis: the two samples are
treated as one sample of n pairs
◦ Use confidence interval of the difference

Unpaired: No correspondence
◦ Example: n people on System A, n on System B
Need more sophisticated method
◦ t-test procedure
20



Performance: {(5.4, 19.1), (16.6, 3.5), (0.6, 3.4), (1.4,
2.5), (0.6, 3.6), (7.3, 1.7)}. Is one system better?
Differences: {-13.7, 13.1, -2.8, -1.1, -3.0, 5.6}.
Answer: No. They are not different
(the confidence interval includes zero)
21

1. Compute the sample means

2. Compute the sample standard deviations
22

3. Compute the mean difference
4. Compute the standard deviation of the mean difference

5. Compute the effective number of degrees of freedom

6. Compute the confidence interval for the mean difference

7. If the confidence interval includes zero,
the difference is not significant

23

Times on System A: {5.36, 16.57, 0.62, 1.41, 0.64,
7.26}
Times on system B: {19.12, 3.52, 3.38, 2.50, 3.60, 1.74}
Question: Are the two systems significantly different?
For system A:

For System B:


24

The confidence interval includes zero
 the two systems are not different
25
26




Times on System A: {5.36, 16.57, 0.62, 1.41, 0.64,
7.26}
Times on system B: {19.12, 3.52, 3.38, 2.50, 3.60, 1.74}
t[0.95, 5] = 2.015
The 90% confidence interval for the mean of
A = 5.31  (2.015)
= (0.24, 10.38)
The 90% confidence interval for the mean of
B = 5.64  (2.015)
= (0.18, 11.10)
Confidence intervals overlap and the mean of one falls
in the confidence interval for the other
◦  Two systems are not different at this level of confidence
27





Need not always be 90% or 95% or 99%
Based on the loss that you would sustain if the
parameter is outside the range and the gain you
would have if the parameter is inside the range
Low loss  Low confidence level is fine
◦ E.g., lottery of 5 Million, one dollar ticket cost,
with probability of winning 10-7 (one in 10 million)
◦ 90% confidence  buy 9 million tickets (and pay $9M)
◦ 0.01% confidence level is fine
50% confidence level may or may not be too low
99% confidence level may or may not be too high
28

Time between crashes was measured for two
systems A and B. The mean and standard
deviations of the time are listed in Table
13.1. To check if System A is more
susceptible to failures than System B,




Larger sample  Narrower confidence interval
resulting in higher confidence
Question: How many observations n to get an accuracy
of § r% and a confidence level of 100(1-)%?
We know that for a sample of size n, the 100(1 α)% confidence interval of the population mean is
r% accuracy implies that confidence interval should be
33


Sample mean of the response time = 20 seconds
Sample standard deviation = 5
Question: How many repetitions are needed to get the
response time accurate within 1 second at 95%
confidence?
Required accuracy = 1 in 20 = 5%
Here,
= 20, s= 5, z= 1.960, and r=5,
n=
A total of 97 observations are needed.
34




All statistics based on a sample are random
and should be specified with a confidence
interval
If the confidence interval includes zero, the
hypothesis that the population mean is zero
cannot be rejected
Paired observations 
Test the difference for zero mean
Unpaired observations  More sophisticated
t-test
35