Chapter VI – Statistical Analysis of Experimental Data
CHAPTER VI
Statistical Analysis of Experimental Data
Repeated measurements of the same quantity do not lead to a unique value. This is a result
of the multitude of errors (mainly random errors) that can affect them. It is therefore
essential to take this variability into account and to use statistical methods to interpret
the results obtained through an experiment.
6.1. Introduction
A simple example of statistical analysis of experimental data is the representation of
the data in the form of a histogram.
Let us consider the following data representing the measurement of a temperature.
Temperature (°C)    Number of readings
1089                1
1092                1
1094                2
1095                4
1098                8
1100                9
1104                12
1105                4
1107                5
1108                5
1110                4
1112                3
1115                2
The data are first arranged into groups called bins. Here the size of a bin is 5 °C. The
bins have to satisfy a couple of conditions:
- The bins usually have the same size and cover the entire range of the data without
overlap.
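The binning step can be sketched in a few lines of Python. This is only an illustration using the temperature readings listed above: the 5 °C bin size comes from the text, but the lower edge of the first bin (1085 °C) and the names `readings`, `FIRST_EDGE` and `histogram` are assumptions made for this example.

```python
# (temperature in °C, number of readings), copied from the data listed above.
readings = [
    (1089, 1), (1092, 1), (1094, 2), (1095, 4), (1098, 8), (1100, 9), (1104, 12),
    (1105, 4), (1107, 5), (1108, 5), (1110, 4), (1112, 3), (1115, 2),
]

BIN_SIZE = 5       # °C, as stated in the text
FIRST_EDGE = 1085  # °C, assumed lower edge of the first bin (not given in the text)


def histogram(data, first_edge, bin_size):
    """Count the readings in each bin [edge, edge + bin_size)."""
    counts = {}
    for temperature, n in data:
        index = (temperature - first_edge) // bin_size
        edge = first_edge + index * bin_size
        counts[edge] = counts.get(edge, 0) + n
    return dict(sorted(counts.items()))


hist = histogram(readings, FIRST_EDGE, BIN_SIZE)
for edge, count in hist.items():
    # One '#' per reading gives a rough text version of the histogram.
    print(f"{edge}-{edge + BIN_SIZE} °C: {'#' * count} ({count})")
```

Because every reading falls into exactly one half-open interval, the bins cover the whole range without overlap. A different choice of first edge would shift the counts slightly but not the overall shape.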
Figure 6.1. Histogram.
Instrumentation and Measurements \ LK\ 2009
The above ‘bell’ shaped curve of the histogram is typical of experimental data
(although this is not a rule, see figure 6.2 for other types of histograms).
Figure 6.2. Different distributions of data. a) symmetric; b) skewed; c) J-shaped; d) bimodal; e) uniform.
- Discrete and continuous random variables:
Continuous random variable: It is a variable that can take any real value in a certain
domain.
Discrete random variable: It is a variable that can take only a limited number of distinct values.
6.2. General concepts and definitions
Population: The population comprises the entire collection of objects, measurements,
observations, and so on whose properties are under consideration and about which
some generalizations are to be made.
Sample: A sample is a representative subset of a population on which an experiment is
performed and numerical data are obtained.
Sample space: The set of all possible outcomes of an experiment is called the sample
space. It can be a discrete sample space or a continuous sample space.
Random variable: It is a variable whose value changes from one repetition of the
experiment to the next, no matter how precisely you try to repeat the experiment. A
random variable can be discrete or continuous.
Distribution function: It is a mathematical relationship used to represent the values of
the random variable.
Parameter: It is an attribute of the entire population (e.g. average, median, …)
Event: It is the outcome of a random experiment.
Statistic: It is an attribute of the sample (e.g. average, median, …)
Probability: It is the chance of occurrence of the event in an experiment.
6.2.2. Measures of central tendency
- Mean: for a sample of n elements, $\bar{x} = \sum_{i=1}^{n} \frac{x_i}{n}$
And for a population with a finite number of elements N: $\mu = \sum_{i=1}^{N} \frac{x_i}{N}$
- Median: it is the value at the center of a set, arranged in ascending or descending
order. If the size of the set is even, the median is the average of the two central
values.
- Mode: It represents the value of the variable that corresponds to the peak value of the
probability of occurrence of an event.
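As an illustration, the three measures can be computed for the temperature readings of section 6.1 with Python's standard `statistics` module. This is a minimal sketch: the variable names are chosen for this example, and the frequency table is first expanded into the 60 individual readings.

```python
import statistics

# (temperature in °C, number of readings), copied from the data in section 6.1.
readings = [
    (1089, 1), (1092, 1), (1094, 2), (1095, 4), (1098, 8), (1100, 9), (1104, 12),
    (1105, 4), (1107, 5), (1108, 5), (1110, 4), (1112, 3), (1115, 2),
]

# Expand the frequency table into the individual readings.
sample = [t for t, n in readings for _ in range(n)]

mean = statistics.mean(sample)      # sum of x_i divided by n
median = statistics.median(sample)  # average of the two central values (n is even)
mode = statistics.mode(sample)      # most frequent value

print(f"n = {len(sample)}, mean = {mean:.2f} °C, median = {median} °C, mode = {mode} °C")
```

For these data the mean (about 1103 °C) lies slightly below the median and the mode (both 1104 °C), which is consistent with a roughly bell-shaped histogram.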
6.2.3. Measures of dispersion
- Deviation: $d_i = x_i - \bar{x}$
- Mean deviation: $\bar{d} = \sum_{i=1}^{n} \frac{|d_i|}{n}$
- Standard deviation (for a population with a finite number of elements):
$\sigma = \sqrt{\sum_{i=1}^{N} \frac{(x_i - \mu)^2}{N}}$
- Sample standard deviation: $S = \sqrt{\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n - 1}}$;
it is used to estimate the population standard deviation.
- The variance: variance = $\sigma^2$ for a population, $S^2$ for a sample.
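The measures of dispersion can be sketched in Python as follows. The data set is hypothetical and chosen only because the arithmetic is easy to check by hand. The mean deviation is taken over the absolute deviations, and the sample standard deviation uses the n - 1 denominator, which is the usual convention when estimating the population standard deviation. Both conventions are assumptions of this sketch.

```python
import math


def mean_deviation(xs):
    """Average of the absolute deviations |x_i - mean|."""
    x_bar = sum(xs) / len(xs)
    return sum(abs(x - x_bar) for x in xs) / len(xs)


def population_std(xs):
    """Standard deviation of a finite population (denominator N)."""
    mu = sum(xs) / len(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))


def sample_std(xs):
    """Sample standard deviation (denominator n - 1, assumed convention)."""
    n = len(xs)
    x_bar = sum(xs) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

print("mean deviation:       ", mean_deviation(data))
print("population std / var: ", population_std(data), population_std(data) ** 2)
print("sample std / var:     ", sample_std(data), sample_std(data) ** 2)
```

The sample standard deviation is always slightly larger than the population formula applied to the same numbers, and the difference shrinks as n grows.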
6.3. Probability
‘Probability is a numerical value expressing the likelihood of occurrence of an event
relative to all possibilities in a sample space.’
The probability of occurrence of an event A is defined as the number of successful
occurrences (m) divided by the total number of possible outcomes (n) in a sample space,
evaluated for n >> 1.
probability of event A = $\frac{m}{n}$
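The condition n >> 1 can be illustrated with a small simulation. The sketch below is an assumption-laden example, not part of the original notes: it takes a fair six-sided die as the experiment, "rolling a 6" as the event A, and estimates the probability as m/n for increasing n. The function name `estimate_probability` is hypothetical.

```python
import random


def estimate_probability(event, trials, seed=0):
    """Estimate P(event) as m / n, with m successful outcomes in n die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    m = sum(1 for _ in range(trials) if event(rng.randint(1, 6)))
    return m / trials


for n in (10, 1_000, 100_000):
    estimate = estimate_probability(lambda face: face == 6, n)
    print(f"n = {n:>7}: m/n = {estimate:.4f}")
```

As n increases, the ratio m/n settles near 1/6, which is why the definition is only meaningful for a large number of trials.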
The event can be represented by: 1) a continuous random variable (the probability is
expressed as P(x)); 2) a discrete random variable (the probability is expressed as P(xi)).
Here are some properties relative to probability:
a- $0 \le P(x \text{ or } x_i) \le 1$
b- If event $\bar{A}$ is the complement of event A, then: $P(\bar{A}) = 1 - P(A)$
c- If the events A and B are mutually exclusive (A and B cannot occur
simultaneously): $P(A \cup B) = P(A) + P(B)$
d- If the events A and B are independent, the probability that both A and B will occur
together is: $P(AB) = P(A) \, P(B)$
e- The probability of occurrence of A or B or both is:
$P(A \cup B) = P(A) + P(B) - P(AB)$
Example
A distributor claims that the chance that any one of the three major components of a
computer (CPU, monitor, and keyboard) is defective is 3%. Calculate the chance that all
three will be defective in a single computer.
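One possible way to work this example, assuming that each component has a 3% chance of being defective and that the three defects are independent events (the statement does not say so explicitly), is to apply property d. The sketch below also uses the complement rule (property b) to get the chance that at least one component is defective.

```python
P_DEFECTIVE = 0.03  # chance that one given component is defective (from the example)
N_COMPONENTS = 3    # CPU, monitor, keyboard

# Property d (independent events): multiply the individual probabilities.
p_all_defective = P_DEFECTIVE ** N_COMPONENTS

# Property b (complement): "at least one defective" = 1 - "none defective".
p_none_defective = (1 - P_DEFECTIVE) ** N_COMPONENTS
p_at_least_one = 1 - p_none_defective

print(f"all three defective:    {p_all_defective:.2e}")
print(f"at least one defective: {p_at_least_one:.4f}")
```

Under these assumptions the chance that all three components are defective is 0.03 × 0.03 × 0.03 = 2.7 × 10⁻⁵, far smaller than the chance that at least one is defective (about 8.7%).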
6.3.1. Probability distribution functions
An important function of statistics is to use information from a sample to predict the
behavior of a population.
For particular situations, experience has shown that the distribution of the random
variable follows a certain mathematical pattern (function). If the parameters of
this function can be determined using the sample data, it becomes possible to predict the
properties of the parent population. Such functions are called probability mass
functions for discrete random variables. For continuous random variables, these
functions are called probability density functions.
- Probability mass function:
$\sum_{i=1}^{n} P(x_i) = 1$
The mean of the population for a discrete random variable (also called the expected
value): $\mu = \sum_{i=1}^{N} x_i P(x_i)$
The variance of the population is given by: $\sigma^2 = \sum_{i=1}^{N} (x_i - \mu)^2 P(x_i)$
- Probability density function:
$P(x_i \le x \le x_i + dx) = f(x_i)\,dx$
And then, to find the probability that x occurs between the values a and b:
$P(a \le x \le b) = \int_{a}^{b} f(x)\,dx$
The mean of the population is: $\mu = \int_{-\infty}^{\infty} x f(x)\,dx$
The variance of the population is: $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
Example
Consider the following probability distribution function for a continuous random
variable:
$f(x) = \begin{cases} \dfrac{3x^2}{35} & -2 < x < 3 \\ 0 & \text{elsewhere} \end{cases}$
a- Show that this function satisfies the requirements of a probability distribution
function.
b- Calculate the expected mean value of x.
c- Calculate the variance and the standard deviation of x.
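A numerical check of this example can be sketched as follows, taking the function as f(x) = 3x²/35 on -2 < x < 3 and zero elsewhere. This reading is an assumption, chosen because it is the interval on which the function integrates to 1. The midpoint-rule helper `integrate` is a hypothetical name introduced for the illustration; the integrals can equally be done by hand.

```python
import math

A, B = -2.0, 3.0  # assumed interval on which f(x) is non-zero


def f(x):
    """Probability density: 3x^2/35 on (A, B), zero elsewhere."""
    return 3 * x ** 2 / 35 if A < x < B else 0.0


def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


total = integrate(f, A, B)                                    # part a: should be 1
mean = integrate(lambda x: x * f(x), A, B)                    # part b
variance = integrate(lambda x: (x - mean) ** 2 * f(x), A, B)  # part c
std = math.sqrt(variance)

print(f"integral of f = {total:.6f}")
print(f"mean          = {mean:.4f}")
print(f"variance      = {variance:.4f}")
print(f"std deviation = {std:.4f}")
```

Integrating by hand gives the same values: the area under f is (27 + 8)/35 = 1, the mean is 39/28 ≈ 1.393, the variance is 2175/784 ≈ 2.774 and the standard deviation is about 1.666. Since f(x) is also non-negative everywhere, it satisfies the requirements of a probability distribution function.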
- Cumulative distribution function:
This type of distribution is used when you want to know the probability that the random
variable (rv) takes a value lower than or equal to a certain value (x).
$F(rv \le x) = F(x) = \int_{-\infty}^{x} f(x)\,dx = P(rv \le x)$
For a discrete random variable: $F(rv \le x_i) = \sum_{j=1}^{i} P(x_j)$
The cumulative distribution function has the following two properties:
$P(a < x \le b) = F(b) - F(a)$
$P(x > a) = 1 - F(a)$
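As a sketch of how these two properties are used, consider again a density f(x) = 3x²/35 on -2 < x < 3 (zero elsewhere), which is assumed here to be the function of the previous example. Integrating f from -2 to x gives F(x) = (x³ + 8)/35 inside the interval. The function name `cdf` and the chosen values of a and b are illustrative only.

```python
from fractions import Fraction


def cdf(x):
    """F(x) for f(x) = 3x^2/35 on -2 < x < 3 (assumed density)."""
    if x <= -2:
        return Fraction(0)
    if x >= 3:
        return Fraction(1)
    return (Fraction(x) ** 3 + 8) / 35


# P(a < x <= b) = F(b) - F(a), here with a = 0 and b = 1.
p_between = cdf(1) - cdf(0)

# P(x > a) = 1 - F(a), here with a = 2.
p_above = 1 - cdf(2)

print(f"P(0 < x <= 1) = {p_between}")
print(f"P(x > 2)      = {p_above}")
```

With this density, P(0 < x ≤ 1) = 1/35 and P(x > 2) = 19/35: most of the probability sits near the ends of the interval, where x² is large.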