Quantitative Methods
Topic 5
Probability Distributions
1
Outline

- Probability Distributions
  - For categorical variables
  - For continuous variables
- Concept of making inference
2
Reading
Chapters 4, 5 and 6 (particularly Chapter 6)
Fundamentals of Statistical Reasoning in Education,
Coladarci et al.
3
Tossing a coin 10 times - 1
- If the coin is not biased, we would expect “heads” to turn up 50% of the time.
- However, in 10 tosses, we will not always get exactly 5 “heads”.
- Sometimes it could be 4 heads out of 10 tosses, sometimes 3 heads, and so on.
4
Tossing a coin 10 times - 2

What is the probability of getting
- no ‘heads’ in 10 tosses?
- 1 ‘head’ in 10 tosses?
- 2 ‘heads’ in 10 tosses?
- 3 ‘heads’ in 10 tosses?
- …
5
Do an experiment in EXCEL

See animated demo: CoinToss1_demo.swf
6
Frequencies of 50 sets of coin tosses
7
Histogram of 50 sets of coin tosses
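
The same experiment can be sketched outside Excel. The following is a minimal sketch, assuming a fair coin simulated with Python's standard `random` module; the seed and the variable names are illustrative choices and are not taken from the demo. It tabulates how many of the 50 sets gave 0, 1, ..., 10 heads and prints a rough text histogram.

```python
import random
from collections import Counter

random.seed(5)  # arbitrary seed, used only to make the run repeatable

N_SETS = 50     # number of sets of tosses, as on the slides
N_TOSSES = 10   # tosses per set

# Count the heads in each set of 10 tosses of a fair coin.
heads_per_set = [
    sum(random.random() < 0.5 for _ in range(N_TOSSES))
    for _ in range(N_SETS)
]

# Frequency table: how many of the 50 sets produced each number of heads.
frequencies = Counter(heads_per_set)

# Text histogram: one '#' per set.
for heads in range(N_TOSSES + 1):
    count = frequencies[heads]
    print(f"{heads:2d} heads: {count:2d} sets  {'#' * count}")
```

Changing the seed gives a different table each time, which is the fluctuation an empirical distribution is subject to.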
8
Some terminology

Random variable
- A variable the values of which are determined by chance.

Examples of random variables
- Number of heads in 10 tosses of a coin
- Test score of students
- Height
- Income
9
Probability distribution (function)

Shows the frequency (or chance) of occurrence of each value of the random variable.
10
Probability Distribution of Coin Toss - 1

- Slide 8 shows the empirical probability distribution.
- The theoretical one can be computed.
- See animated demo: Binomial Probability_demo.swf

Number of heads in 10 tosses    Probability
 0                              0.001
 1                              0.010
 2                              0.044
 3                              0.117
 4                              0.205
 5                              0.246
 6                              0.205
 7                              0.117
 8                              0.044
 9                              0.010
10                              0.001
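
The theoretical probabilities listed above can be reproduced from the binomial formula $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$ with $n = 10$ tosses and $p = 0.5$. A minimal sketch, assuming a fair coin; the function name `binomial_pmf` is an illustrative choice, not something from the slides or the demo.

```python
from math import comb

def binomial_pmf(k, n=10, p=0.5):
    """Probability of exactly k heads in n independent tosses with P(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Print the probability of 0, 1, ..., 10 heads to three decimal places.
for k in range(11):
    print(f"{k:2d} heads: {binomial_pmf(k):.3f}")
```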
11
Probability Distribution of Coin Toss - 2
[Bar chart: theoretical probabilities. Horizontal axis: number of heads in 10 tosses (0 to 10); vertical axis: probability (0.000 to 0.300).]
12
How can we use the probability
distribution - 1?

Provide information about “central
tendency” (where the middle is, typically
captured by Mean or Median), and
variation (typically captured by standard
deviation).
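
For the coin-toss distribution, both summaries can be computed directly from the probabilities. A minimal sketch, assuming the fair-coin binomial distribution with 10 tosses; the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.5

# Probability of k heads for k = 0, 1, ..., 10.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean: each value weighted by its probability.
mean = sum(k * prob for k, prob in enumerate(pmf))

# Variance: squared distance from the mean, weighted by probability.
variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))
sd = sqrt(variance)

print(f"mean = {mean:.2f}, standard deviation = {sd:.2f}")
# mean = 5.00, standard deviation = 1.58
```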
13
How can we use the probability
distribution - 2?
- Use the distribution as a point of reference.
  - Example: if we find that, 20% of the time, we obtain only 1 head in 10 coin tosses, when the theoretical probability is about 1%, we may conclude that the coin is biased (not a 50-50 chance of tossing a head).
- A theoretical distribution is a better point of reference than an empirical one, because an empirical distribution fluctuates from one collection of data to the next.
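
The comparison in the example can be made concrete. The sketch below assumes, purely for illustration, that the experiment consists of 50 sets of 10 tosses (the number used in the earlier simulation) and that 10 of those sets (20%) showed exactly 1 head. It asks how likely a fair coin is to produce a result at least that extreme.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Theoretical chance of exactly 1 head in 10 tosses of a fair coin (about 1%).
p_one_head = binomial_pmf(1, 10, 0.5)

# Hypothetical observation: 10 of 50 sets (20%) showed exactly 1 head.
n_sets, observed = 50, 10

# Chance that a fair coin gives at least this many "1 head" sets.
p_at_least = sum(
    binomial_pmf(j, n_sets, p_one_head) for j in range(observed, n_sets + 1)
)

print(f"P(exactly 1 head in 10 tosses) = {p_one_head:.4f}")
print(f"P(at least {observed} of {n_sets} sets show 1 head) = {p_at_least:.2e}")
```

A probability this small under the fair-coin assumption is what would lead us to conclude that the coin is biased.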
14
Random variables that are
continuous
- Collect a sample of height measurements of people.
- Form an empirical probability distribution.
- Typically, the probability distribution will be a bell-shaped curve.
- Compute the mean and standard deviation.
- An empirical distribution is obtained.
- Can we obtain a theoretical distribution?

15
Normal distribution - 1
[Figure: bell-shaped normal density curve. Horizontal axis: -4 to 4; vertical axis: 0 to 0.45.]
16
Normal distribution - 2

- A random variable, X, that has a normal distribution with mean $\mu$ and standard deviation $\sigma$ can be transformed to a variable, Z, that has a standard normal distribution, where the mean is 0 and the standard deviation is 1.

  z-score: $z = \dfrac{x - \mu}{\sigma}$

- Need only discuss properties of the standard normal distribution.
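
The z-score transformation subtracts the mean from x and divides by the standard deviation. A minimal sketch with hypothetical numbers (a height of 180 cm in a population with mean 170 cm and standard deviation 10 cm); the function names are illustrative.

```python
def z_score(x, mean, sd):
    """Convert a value x to a z-score: its distance from the mean in standard deviations."""
    return (x - mean) / sd

def from_z(z, mean, sd):
    """Convert a z-score back to the original scale."""
    return mean + z * sd

# Hypothetical example: height 180 cm, population mean 170 cm, sd 10 cm.
z = z_score(180, 170, 10)
print(z)                    # 1.0
print(from_z(z, 170, 10))   # 180.0
```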
17
Standard normal distribution - 1
[Figure: standard normal density curve, horizontal axis from -4 to 4. Marked regions: 5% of the area lies below -1.64; 2.5% of the area lies above 1.96.]
18
Standard normal distribution - 2




- 2.5% of the observations lie above 1.96.
- So around 5% lie either below -1.96 or above 1.96.
- Hence the general statement that around 95% of the observations lie between -2 and 2.
- More generally, for any normal distribution, around 95% of the observations lie within ± 2 standard deviations of the mean.
19
Standard normal distribution - 3

Around 95% of the observations lie within ± two standard deviations (strictly, ±1.96).
[Figure: standard normal density curve, horizontal axis from -4 to 4, with the central region labelled "95% in this region".]
20
Standard normal distribution - 4

Around 68% of the observations lie within ± one standard deviation.
[Figure: standard normal density curve, horizontal axis from -4 to 4, with the central region labelled "68% in this region".]
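
The 68% and 95% figures do not depend on the scale of the variable. A minimal sketch, again using `statistics.NormalDist` from the standard library; the helper `share_within` and the test-score scale (mean 500, standard deviation 100) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

# Standard normal distribution.
print(f"within 1 sd:    {share_within(1):.3f}")
print(f"within 1.96 sd: {share_within(1.96):.3f}")

# Same proportions on a hypothetical test-score scale.
print(f"within 1 sd (mean 500, sd 100):    {share_within(1, 500, 100):.3f}")
print(f"within 1.96 sd (mean 500, sd 100): {share_within(1.96, 500, 100):.3f}")
```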
21
Computing normal probabilities in
EXCEL

See animated demo: NormalProbability_demo.swf
22
Exercise - 1

For the data set distributed in Week 2, TIMSS2003AUS.sav, for the variable bsmmat01 (second-last variable, maths estimated ability), compute the score range where the middle 95% of the scores lie:
- Use the observed scores and compute the percentiles from the observations.
- Assume the population is normally distributed.
23
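
The two approaches in Exercise 1 can be sketched as follows. The scores here are simulated stand-ins, not the TIMSS data: the mean of 500, the standard deviation of 80, the sample size and the seed are all hypothetical, and the list `scores` would be replaced by the real bsmmat01 column.

```python
import random
import statistics

random.seed(1)  # arbitrary seed for a repeatable illustration

# Hypothetical stand-in for the bsmmat01 scores; replace with the real data.
scores = [random.gauss(500, 80) for _ in range(2000)]

# Approach 1: percentiles computed directly from the observations.
# n=40 gives cut points every 2.5%, so the first and last are the
# 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(scores, n=40)
observed_low, observed_high = cuts[0], cuts[-1]

# Approach 2: assume normality and use mean ± 1.96 standard deviations.
mean = statistics.mean(scores)
sd = statistics.stdev(scores)
normal_low, normal_high = mean - 1.96 * sd, mean + 1.96 * sd

print(f"observed percentiles: {observed_low:.0f} to {observed_high:.0f}")
print(f"normal assumption:    {normal_low:.0f} to {normal_high:.0f}")
```

If the real scores are close to normally distributed, the two ranges should be similar.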
Exercise - 2

Dave scored 538. What percentage of students obtained scores higher than Dave?
- Use the observed scores and compute the percentiles from the observations.
- Assume the population is normally distributed.
24
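
Exercise 2 can be approached in the same two ways. As before, this is a sketch on simulated scores rather than the TIMSS data: the distribution parameters, sample size and seed are hypothetical, and only Dave's score of 538 comes from the exercise.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(1)  # arbitrary seed for a repeatable illustration

# Hypothetical stand-in for the bsmmat01 scores; replace with the real data.
scores = [random.gauss(500, 80) for _ in range(2000)]
dave = 538

# Approach 1: percentage of observed scores that are higher than Dave's.
observed_pct = 100 * sum(s > dave for s in scores) / len(scores)

# Approach 2: assume a normal population with the sample mean and sd,
# then take the area above Dave's score.
fitted = NormalDist(mean(scores), stdev(scores))
normal_pct = 100 * (1 - fitted.cdf(dave))

print(f"observed:          {observed_pct:.1f}% scored higher than Dave")
print(f"normal assumption: {normal_pct:.1f}% scored higher than Dave")
```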