Sample Statistics
Modeling and Simulation
CS 313
SAMPLE STATISTICS
Discrete-event simulations generate a lot of experimental data.
To facilitate the analysis of all this data, it is conventional to compress the
data into a handful of meaningful statistics.
We have already seen examples of this, where job averages and time
averages were used to characterize the performance of a single-server
service node.
Each time a discrete-event simulation program is used to generate data, it is
important to appreciate that this data is only a sample from a much
larger population of possible outputs.




If the sample size is small, essentially all that can be done is to compute the
sample mean and standard deviation.
If the sample size is not small, a sample-data histogram can be
computed and then used to analyze the distribution of the data in the sample.


SAMPLE MEAN AND STANDARD
DEVIATION
How is data collected in a discrete-event simulation (DES)? There are two types of statistical analysis:
Within-the-run (e.g., the job averages and time averages used to characterize the
performance of a single-server queue (SSQ) system)
Between-the-run: simulate the system repeatedly, changing only the
initial seed from run to run.
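
A minimal Python sketch of between-the-run data collection follows. The function `run_once` is a hypothetical stand-in for a full simulation run (it only averages seeded exponential variates), and the seed values are arbitrary; neither comes from the slides.

```python
import random
import statistics


def run_once(seed, n_jobs=1000):
    """Hypothetical stand-in for one simulation run.

    Returns a within-the-run average: the mean of n_jobs exponential
    service times (rate 1.0) drawn from a generator seeded with `seed`.
    """
    rng = random.Random(seed)
    return sum(rng.expovariate(1.0) for _ in range(n_jobs)) / n_jobs


# Between-the-run sample: one within-the-run average per initial seed.
seeds = [12345, 54321, 2121, 7777, 31415]  # arbitrary illustrative seeds
sample = [run_once(seed) for seed in seeds]

print("within-run averages:", [round(x, 3) for x in sample])
print("between-run mean   :", round(statistics.fmean(sample), 3))
```

Each entry of `sample` is one observation for the between-the-run analysis; re-running with the same seed reproduces the same value.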




Definitions:
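Consider a sample $x_1, x_2, \ldots, x_n$ (continuous or discrete)
Sample Mean:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Sample Variance:
$s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Sample Standard Deviation:
$s = \sqrt{s^2}$
Coefficient of Variation:
$s / \bar{x}$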
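
The four statistics can be computed in a few lines of Python. This is a sketch that assumes the 1/n divisor for the variance (the convention under which the standard deviation equals the smallest rms dispersion about any value); the helper name `sample_stats` and the data values are illustrative only.

```python
import math


def sample_stats(data):
    """Return (mean, variance, standard deviation, coefficient of variation).

    Assumes the 1/n divisor for the variance and a nonzero mean.
    """
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n
    std_dev = math.sqrt(variance)
    cv = std_dev / mean
    return mean, variance, std_dev, cv


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values
mean, variance, std_dev, cv = sample_stats(data)
print(mean, variance, std_dev, cv)  # 5.0 4.0 2.0 0.4
```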


UNDERSTANDING THE STATISTICS
Mean: a measure of central tendency
Variance, standard deviation: measures of dispersion about the mean
The sample standard deviation has the same "units" as the data and the
sample mean. For example, if the data has units of sec, then so do the
sample mean and standard deviation.
Although the sample variance is more amenable to mathematical
manipulation (because it is free of the square root), the sample standard
deviation is typically the preferred measure of dispersion, since it has the
same units as the data.
Note that the coefficient of variation (C.V.) is unit-less, but a common
shift in the data changes the C.V.: adding the same constant to every data
point shifts the mean but leaves the standard deviation unchanged.
E.g., measure students' heights while they stand on the floor, then while they stand on chairs.
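
The effect of a common shift on the C.V. can be checked numerically. The sketch below uses made-up heights and a made-up chair height; `statistics.pstdev` is the population (1/n) standard deviation from the Python standard library.

```python
import math
import statistics


def cv(data):
    """Coefficient of variation s / mean, using the 1/n standard deviation."""
    return statistics.pstdev(data) / statistics.fmean(data)


heights_floor = [1.60, 1.70, 1.75, 1.80, 1.90]  # metres, hypothetical data
chair = 0.45                                    # hypothetical chair height
heights_chair = [h + chair for h in heights_floor]  # common shift
heights_cm = [100.0 * h for h in heights_floor]     # change of units only

print("C.V. on the floor :", round(cv(heights_floor), 4))
print("C.V. on chairs    :", round(cv(heights_chair), 4))
print("C.V. in cm        :", round(cv(heights_cm), 4))
```

Rescaling the units leaves the C.V. unchanged, while the shift makes it smaller because the mean grows and the standard deviation does not.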





RELATING THE MEAN AND STANDARD
DEVIATION

The root-mean-square (rms) function d(x) measures dispersion about any
value x:
$d(x) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - x)^2}$
Theorem 4.1.1
 The sample mean $\bar{x}$ gives the smallest possible value for d(x)
 The standard deviation s is that smallest value: $d(\bar{x}) = \min_x d(x) = s$


Example:
Collect 50 observations
The sample mean is 1.095
The sample standard deviation is 0.354

The smallest value of d(x) is s, as shown in the figure
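
The minimizing property can be checked numerically. The sketch below evaluates the rms dispersion d(x), taken here as the square root of the average squared distance from the data to x, at several trial values. The data are hypothetical and are not the 50 observations of the example. The check relies on the identity d(x)^2 = s^2 + (mean - x)^2, which follows from expanding the square.

```python
import math


def rms_dispersion(data, x):
    """d(x): root-mean-square distance from the data points to the value x."""
    return math.sqrt(sum((xi - x) ** 2 for xi in data) / len(data))


data = [0.6, 0.9, 1.0, 1.2, 1.8]  # hypothetical observations
mean = sum(data) / len(data)
s = rms_dispersion(data, mean)    # d evaluated at the sample mean

trial_values = [mean - 0.5, mean - 0.1, mean, mean + 0.1, mean + 0.5]
for x in trial_values:
    print(f"x = {x:.3f}   d(x) = {rms_dispersion(data, x):.4f}")
print(f"s = {s:.4f}")
```

No trial value gives a dispersion below s, and the value at the mean equals s.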



LINEAR DATA TRANSFORMATION
Often the output data generated by a simulation must be converted to
different units (e.g., from seconds to minutes). The resulting change in the
sample statistics can be determined directly, without any need to re-process
the converted data.
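
As a sketch of why re-processing is unnecessary: for a linear transformation x' = a x + b, the standard result is that the mean becomes a * mean + b and the standard deviation becomes |a| * s. The helper name `linear_transform_stats` and the data below are illustrative assumptions, not taken from the slides.

```python
import math
import statistics


def linear_transform_stats(mean, std_dev, a, b):
    """Statistics of x' = a*x + b computed from the statistics of x."""
    return a * mean + b, abs(a) * std_dev


seconds = [30.0, 45.0, 60.0, 90.0, 120.0]  # hypothetical data in seconds
mean_s = statistics.fmean(seconds)
sd_s = statistics.pstdev(seconds)          # 1/n standard deviation

# Convert the statistics to minutes directly (a = 1/60, b = 0).
mean_min, sd_min = linear_transform_stats(mean_s, sd_s, a=1 / 60, b=0.0)

# Cross-check by converting the data and recomputing from scratch.
minutes = [x / 60 for x in seconds]
print(mean_min, statistics.fmean(minutes))
print(sd_min, statistics.pstdev(minutes))
```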

NONLINEAR DATA TRANSFORMATION




When data is used to generate a Boolean (1 or 0) outcome, a nonlinear
data transformation is needed
The value of xi is not as important as the effect it produces
E.g., consider the effect "it will rain tomorrow": how much rain there will be
is not important
Let A be a fixed set and define $x'_i = 1$ if $x_i \in A$, and $x'_i = 0$ otherwise
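
A short Python sketch of this Boolean transformation: each data value maps to 1 if it lies in the set A and to 0 otherwise. With the 1/n variance, the transformed sample has mean p (the fraction of values in A) and standard deviation sqrt(p (1 - p)). The rainfall figures and the choice A = {x : x > 0} are hypothetical.

```python
import math


def indicator_stats(data, in_A):
    """Map each value to 1 if in_A(value) holds, else 0.

    Returns (p, s): the mean and 1/n standard deviation of the 0/1 sample.
    """
    flags = [1 if in_A(x) else 0 for x in data]
    p = sum(flags) / len(flags)
    variance = sum((f - p) ** 2 for f in flags) / len(flags)
    return p, math.sqrt(variance)


rainfall_mm = [0.0, 0.0, 3.2, 0.0, 12.5, 0.4, 0.0, 0.0]  # hypothetical data
p, s = indicator_stats(rainfall_mm, lambda x: x > 0.0)   # A = "it rained"

print("fraction of rainy days p:", p)
print("std dev of the 0/1 data :", s)
print("sqrt(p * (1 - p))       :", math.sqrt(p * (1 - p)))
```

Only membership in A matters: the 12.5 mm day and the 0.4 mm day contribute identically.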
DISCRETE-DATA HISTOGRAMS

Example 1:

Example 2:
HISTOGRAM MEAN AND STANDARD
DEVIATION

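The discrete-data histogram mean is
$\bar{x} = \sum_{x} x \hat{f}(x)$
The discrete-data histogram standard deviation is
$s = \sqrt{\sum_{x} (x - \bar{x})^2 \hat{f}(x)}$
where $\hat{f}(x)$ is the relative frequency of the value x in the sample and each sum runs over the possible values x
The discrete-data histogram variance is $s^2$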
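
A sketch of these histogram statistics in Python, computed from relative frequencies (the proportion of the sample equal to each value). The data are simulated sums of three fair dice, loosely in the spirit of the three-dice example; the sample size, the seed, and the assumption that the sum is the recorded quantity are all illustrative choices.

```python
import math
import random
from collections import Counter

rng = random.Random(12345)  # arbitrary seed for reproducibility
n = 10000
sample = [sum(rng.randint(1, 6) for _ in range(3)) for _ in range(n)]

# Relative frequency of each observed value x.
counts = Counter(sample)
f_hat = {x: c / n for x, c in sorted(counts.items())}

# Histogram mean and standard deviation from the relative frequencies.
hist_mean = sum(x * f for x, f in f_hat.items())
hist_var = sum((x - hist_mean) ** 2 * f for x, f in f_hat.items())
hist_sd = math.sqrt(hist_var)

# For discrete data these agree with the statistics of the raw sample.
sample_mean = sum(sample) / n
print("histogram mean:", round(hist_mean, 3), " sample mean:", round(sample_mean, 3))
print("histogram s   :", round(hist_sd, 3))
```

Because every data value appears exactly in the histogram, the histogram mean matches the raw sample mean up to floating-point rounding.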

Example 4.2.3
For the data in Example 4.2.1 (three dice)

For the data in Example 4.2.2 (balls placed in boxes)

CONTINUOUS-DATA HISTOGRAMS

Binning

Example: buffon
HISTOGRAM PARAMETER GUIDELINES

Example 4.3.2: Smooth, Noisy Histograms
RELATIVE FREQUENCY
HISTOGRAM INTEGRALS
HISTOGRAM MEAN AND STANDARD
DEVIATION