Z-score

Probability
- Probability: a numerical measure of the likelihood
  that an event will occur
- An experiment: any process that generates well-defined
  outcomes
- Sample space (S): the set of all possible outcomes
  of an experiment
- An event (A): an outcome or set of outcomes that
  is of interest in the experiment. An event (A) is a
  subset of the sample space (S)
- The probability of an event A, P(A): a measure of
  the likelihood that event A will occur
- Example: Tossing a coin
  Experiment: Toss a coin and observe the up face
  S = {H, T}, where H = head and T = tail
- Example: Tossing a coin twice
  Experiment: Flip a coin twice and observe the
  sequence (keeping track of order) of up faces
  S = {HH, HT, TH, TT}
  A = {tossing at least one head}
  A = {HH, HT, TH}
- Example: Rolling a die
  Experiment: Roll a six-sided die and observe the up face
  S = {1, 2, 3, 4, 5, 6}
  A = {roll an even number}
  A = {2, 4, 6}
Methods of assigning probability
- Classical probability: each outcome is equally likely
- It is applicable to games of chance
- In such cases, if there are N outcomes in S, then the
  probability of any one outcome is 1/N
- If A is any event and nA is the number of outcomes in
  A, then:
  P(A) = nA / N
Example: Rolling a die
S = {1, 2, 3, 4, 5, 6}
P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
A = {roll an even number} = {2, 4, 6}
P(A) = 3/6 = 0.5
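The counting rule P(A) = nA / N can be checked with a few lines of Python. This is only a minimal sketch: the helper name classical_probability is made up for illustration, and it assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, sample_space):
    """Return n_A / N, assuming all outcomes in sample_space are equally likely."""
    event = set(event)
    n_a = sum(1 for outcome in sample_space if outcome in event)
    return Fraction(n_a, len(sample_space))


# Rolling a six-sided die: A = {roll an even number}
die = [1, 2, 3, 4, 5, 6]
p_even = classical_probability({2, 4, 6}, die)

# Tossing a coin twice: A = {at least one head}
two_tosses = ["".join(faces) for faces in product("HT", repeat=2)]
at_least_one_head = {seq for seq in two_tosses if "H" in seq}
p_head = classical_probability(at_least_one_head, two_tosses)

print(two_tosses)      # the sample space S for two tosses
print(p_even, p_head)  # exact fractions
```

Fractions are used instead of floats so that the results stay exact and can be compared directly with the hand calculation.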
Empirical probability is simply the relative
frequency with which some event is observed to
happen (or fail).
It is the number of times the event occurred divided by
the number of trials:
P(A) = nA / N
Where:
N = total number of trials
nA = number of outcomes producing A
Relative frequency example
Number of children    Frequency    Relative frequency
0                     40           40/215 = 0.19
1                     80           80/215 = 0.37
2                     50           50/215 = 0.23
3                     30           30/215 = 0.14
4                     10           10/215 = 0.05
5                     5            5/215 = 0.02
Sum                   215          215/215 = 1.00
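The relative frequencies in this example can be reproduced as follows. The sketch simply divides each frequency by the total; the variable names are illustrative only.

```python
# Frequencies from the example: number of children -> number of observations
frequency = {0: 40, 1: 80, 2: 50, 3: 30, 4: 10, 5: 5}

total = sum(frequency.values())

# Empirical probability of each value = its frequency / total number of trials
relative = {children: count / total for children, count in frequency.items()}

for children, count in frequency.items():
    print(f"{children}  {count:>3}  {count}/{total} = {relative[children]:.2f}")
print(f"Sum {total}  {sum(relative.values()):.2f}")
```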
Basic concepts of probability:
Probability values are always assigned on a
scale from 0 to 1
A probability near 0 indicates an event is
unlikely to occur
A probability near 1 indicates an event is
almost certain to occur
A probability near 0.5 indicates that an event is just
as likely to occur as not
The sum of the probabilities of all outcomes
must be 1
Definitions
Mutually exclusive events: occurrence of one
event precludes the occurrence of the other
event
Independent events: occurrence of one event
does not affect the occurrence or non-occurrence of the other event
Complementary events: all elementary events
that are not in the event A are in its
complementary event A'.
P(Sample space) = 1
P(A') = 1 - P(A)
Laws of Probability
The addition rule: the probability that one event
or another occurs
P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive events (A
and B cannot occur at the same time), then
P(A and B) = 0 and
P(A or B) = P(A) + P(B)
Example:
                         Gender
Type of position      ------------      Total
Managerial              8       3         11
Professional           31      13         44
Technical              52      17         69
Clerical                9      22         31
Total                 100      55        155

Technical (T) and clerical (C) positions are mutually exclusive, so:
P(T or C) = P(T) + P(C) = 69/155 + 31/155 = 100/155 = 0.645
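The addition rule applied to the row totals of this table (11, 44, 69 and 31 out of 155) can be sketched as below. The function name p_union is an assumption made for this illustration; its third argument defaults to 0, which is only valid for mutually exclusive events.

```python
from fractions import Fraction


def p_union(p_a, p_b, p_a_and_b=0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


# Row totals by type of position, taken from the table
totals = {"Managerial": 11, "Professional": 44, "Technical": 69, "Clerical": 31}
n = sum(totals.values())

p_t = Fraction(totals["Technical"], n)
p_c = Fraction(totals["Clerical"], n)

# A position cannot be both technical and clerical, so P(T and C) = 0
p_t_or_c = p_union(p_t, p_c)
print(p_t_or_c, round(float(p_t_or_c), 3))
```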
Law of multiplication: the probability that both
A and B occur together
P(A and B) = P(A) × P(B|A)
where P(B|A) is the probability of B given that A has occurred.
If A and B are independent (the occurrence of
one does not affect the occurrence of the
other):
P(B|A) = P(B), and then
P(A and B) = P(A) × P(B)
Probability of at least one = 1 - Probability of
none
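A small sketch of the multiplication law, reusing the die and coin examples from earlier. Event B (rolling more than 3) is a hypothetical event introduced only for this illustration, and the helper names are likewise made up.

```python
from fractions import Fraction


def prob(event, space):
    """Classical probability of an event in an equally likely sample space."""
    return Fraction(len(event & space), len(space))


def prob_given(b, a):
    """Conditional probability of B given A: share of A's outcomes also in B."""
    return Fraction(len(a & b), len(a))


die = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # roll an even number (event from the text)
B = {4, 5, 6}  # roll more than 3 (hypothetical event, for illustration)

# General rule: P(A and B) = P(A) x P(B given A)
p_a_and_b = prob(A, die) * prob_given(B, A)

# Independent events: two coin tosses do not affect each other
p_head = Fraction(1, 2)
p_two_heads = p_head * p_head

# At least one head in two tosses = 1 - P(no heads)
p_at_least_one_head = 1 - (1 - p_head) ** 2

print(p_a_and_b, p_two_heads, p_at_least_one_head)
```

The last result agrees with counting the outcomes HH, HT and TH out of the four equally likely sequences.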
Probability Distribution
Defined: It is the distribution of probabilities over all
possible outcomes of an experiment. Examples of
probability distributions are:
the binomial distribution (only 2 outcomes are possible
on each attempt, and attempts are statistically
independent) (Example: coin flip)
the normal distribution
other underlying distributions exist (such as the
Poisson, t, F, chi-square, etc.) that are used to
make statistical inferences.
The normal probability distribution
- The normal curve is bell-shaped and has a single
  peak at the exact centre of the distribution.
- The arithmetic mean, median, and mode of the
  distribution are equal and located at the peak.
- The normal probability distribution is symmetrical
  about its mean (1/2 of the observations are above the
  mean and 1/2 are below).
- It is determined by 2 quantities: the mean and the SD.
- The random variable has an infinite theoretical range
  (the tails do not touch the X-axis).
- The total area under the curve is 1.
About 68% of the area under the curve is within ±1
SD of the mean
95% of the area under the curve is within ±
1.96 SD of the mean
99% of the area under the curve is within ±
2.58 SD of the mean
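These three areas can be verified numerically. The sketch below uses NormalDist from Python's standard statistics module; the helper area_within is a name chosen here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1


def area_within(k):
    """Area under the standard normal curve between -k SD and +k SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 1.96, 2.58):
    print(f"within ±{k} SD: {area_within(k):.4f}")
```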
Why is the normal distribution important?
A/ Because many types of data that are of
interest have an approximately normal distribution
Central limit theorem:
the sampling distribution of means becomes approximately
normal as N increases, regardless of the shape of the
original distribution
The binomial distribution becomes approximately normal as N
increases
N.B:
The normal distribution is a continuous distribution
The binomial distribution is a discrete distribution
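One way to see the binomial distribution approaching the normal is to measure the largest gap between their cumulative probabilities for a small and a large N. This is a rough sketch under stated assumptions: p = 0.5 is chosen arbitrarily, the function name max_cdf_gap is made up, and a continuity correction of 0.5 is applied because the binomial is discrete while the normal is continuous.

```python
from math import comb, sqrt
from statistics import NormalDist


def max_cdf_gap(n, p=0.5):
    """Largest absolute difference between the binomial(n, p) cumulative
    probabilities and a normal curve with the same mean and SD."""
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    cumulative = 0.0
    gap = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        # k + 0.5 is the continuity correction for a discrete variable
        gap = max(gap, abs(cumulative - normal.cdf(k + 0.5)))
    return gap


gap_small = max_cdf_gap(10)
gap_large = max_cdf_gap(400)
print(gap_small, gap_large)
```

The gap for N = 400 comes out much smaller than the gap for N = 10, which is the sense in which the binomial "becomes normal" as N increases.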
Standard normal distribution (curve)
- A normal distribution with a mean of zero and SD of 1 is
  called the standard normal distribution
- Any normal distribution can be converted to the
  standard normal distribution using the Z-statistic
  (value)
- Z-value (SND): the distance between the selected
  value, designated X, and the population mean (M),
  divided by the population SD (σ)
  Z = (X - M) / σ
- The standard normal distribution curve is a bell-shaped
  curve centered on zero with SD = 1
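The conversion to the standard normal distribution can be sketched as follows. The mean of 100, SD of 15 and value of 130 are hypothetical numbers chosen only for illustration, and z_value is an illustrative helper name.

```python
from statistics import NormalDist


def z_value(x, mean, sd):
    """Distance of x from the mean, measured in SD units."""
    return (x - mean) / sd


# Hypothetical normal distribution, for illustration only
mean, sd, x = 100, 15, 130

z = z_value(x, mean, sd)

# The area below x on the original curve equals the area below z
# on the standard normal curve
area_original = NormalDist(mu=mean, sigma=sd).cdf(x)
area_standard = NormalDist().cdf(z)
print(z, area_original, area_standard)
```

Because the two areas agree, a single table for the standard normal curve is enough for any normal distribution.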
Z-score
The z-score is often called the standardized value
or Standard Normal Deviate (SND). It denotes
the number of SDs a data value X lies
from the mean, and in which direction.
A data value less than the mean will have a
z-score less than zero;
A data value greater than the mean will
have a z-score greater than zero; and
A data value equal to the mean will have a z-score of zero
Normal curve table
- The normal curve table gives the precise
  percentage of scores (values) between the mean (z-score
  of zero) and any other z-score. It can be
  used to determine:
  1. the proportion of scores above or below a
     particular z-score
  2. the proportion of scores between the mean and a
     particular z-score
  3. the proportion of scores between two z-scores
By converting raw scores to z-scores, the table can be
used in the same way for raw scores.
It can also be used in the opposite way:
Determine a z-score for a particular proportion
of scores under the normal curve.
* The table lists positive z-scores
* It works for negative z-scores too
* Why? Because the curve is symmetrical
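In place of a printed table, the same three lookups can be sketched with NormalDist from the standard library. The function names below are illustrative, and taking abs(z) is how the sketch exploits the symmetry of the curve for negative z-scores.

```python
from statistics import NormalDist

_STANDARD = NormalDist()  # mean 0, SD 1


def area_mean_to_z(z):
    """Proportion of scores between the mean (z = 0) and z."""
    return _STANDARD.cdf(abs(z)) - 0.5


def area_beyond_z(z):
    """Proportion of scores further from the mean than z, on the same side."""
    return 1 - _STANDARD.cdf(abs(z))


def area_between(z1, z2):
    """Proportion of scores between two z-scores."""
    low, high = sorted((z1, z2))
    return _STANDARD.cdf(high) - _STANDARD.cdf(low)


print(round(area_mean_to_z(1.0), 4))
print(round(area_mean_to_z(-1.0), 4))  # same value, by symmetry
print(round(area_beyond_z(1.0), 4))
print(round(area_between(-1.0, 1.0), 4))
```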
Steps for figuring percentage above or
below a z-score:
Convert raw score to z-score, if necessary
Draw a normal curve:
- indicate where z-score falls
- Shade area you are trying to find
Find the exact percentage with normal curve
table
Steps for figuring a z-score or raw score
from a percentage:
Draw a normal curve, shading an
approximate area for the percentage
concerned
Find the exact z-score using the normal curve
table
Convert the z-score to a raw score, if desired
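Going from a percentage back to a z-score and a raw score can be sketched as below. The inverse lookup uses NormalDist.inv_cdf from the standard library; the top 5% cutoff and the distribution with mean 100 and SD 15 are hypothetical values used only for illustration.

```python
from statistics import NormalDist


def z_for_top_fraction(fraction):
    """z-score that cuts off the given fraction in the upper tail."""
    return NormalDist().inv_cdf(1 - fraction)


def raw_score(z, mean, sd):
    """Convert a z-score back to a raw score: X = M + z * SD."""
    return mean + z * sd


# Hypothetical question: which raw score marks the top 5%
# of a normal distribution with mean 100 and SD 15?
z_cut = z_for_top_fraction(0.05)
cutoff = raw_score(z_cut, mean=100, sd=15)
print(round(z_cut, 2), round(cutoff, 1))
```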
Example:
- For X = 2200, M = 2000, σ = 200: Z = (2200 - 2000)/200 = 1
- For X = 1700, M = 2000, σ = 200: Z = (1700 - 2000)/200 = -1.5
- A z-value of 1 indicates that the value of 2200 is 1 SD above
  the mean of 2000, while a z-value of -1.5 indicates that the value of
  1700 is 1.5 SD below the mean of 2000.
Example:
- For M = 500, σ = 365, determine the position of 722 in SD units
  Z = (X - M) / σ = (722 - 500) / 365 = 222 / 365 = 0.61
We can also determine how much of the area
under the normal curve is found between any
point on the curve and the mean
Once you have a z-score, you can use the
table to find the area between the mean and that z-score
For z = 0.61 (from table A), the area is 0.2291 ≈ 0.23
Therefore, 22.9% or about 23%
Q/ How much of the population lies between
500 and 722?
A/ 0.23
Q/ How much of the population lies to the right of 722?
A/ 0.5 - 0.23 = 0.27
Q/ How much of the population lies to the left of 722?
A/ 0.5 + 0.23 = 0.73
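The areas for the value 722 (mean 500, SD 365) can be checked with a short sketch. The z-score is rounded to two decimals to mimic a table lookup; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd, x = 500, 365, 722

# Round to two decimals, as a printed table would require
z = round((x - mean) / sd, 2)

# Area between the mean and z, as listed in a normal curve table
between_mean_and_x = NormalDist().cdf(z) - 0.5

# Half of the curve lies on each side of the mean
right_of_x = 0.5 - between_mean_and_x
left_of_x = 0.5 + between_mean_and_x

print(z, round(between_mean_and_x, 4))
print(round(right_of_x, 2), round(left_of_x, 2))
```

The three areas correspond to the population between 500 and 722, above 722, and below 722, respectively.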
Example:
- The daily water usage per person in an area is normally
  distributed with a mean of 20 gallons and a SD of 5 gallons
Q1/ About 68% of the daily water usage per person in this area
lies between what 2 values?
A/ About 68% of the daily water usage will lie between 15 and
25 gallons (the mean ± 1 SD)
Q2/ What is the probability that a person from this area,
selected at random, will use less than 20 gallons per day?
A/ P(X < 20) = 0.5
Q3/ What percent uses between 20 and 24
gallons?
The z-value associated with X = 24:
z = (24 - 20)/5 = 0.8
From the table, the area beyond z = 0.8 (the upper tail) is
0.2119.
Thus, P(20 < X < 24) = 0.5 - 0.2119 = 0.2881
= 28.81%
Q4/ What percent of the population uses between
18 and 26 gallons?
A/ The z-value associated with X = 18:
z = (18 - 20)/5 = -0.4
and for X = 26:
z = (26 - 20)/5 = 1.2
Thus P(18 < X < 26) = P(-0.4 < Z < 1.2)
= P(Z > -0.4) - P(Z > 1.2)
= 0.6554 - 0.1151 = 0.5403
Example: Height of young women:
The distribution of heights of women aged 20-29 years
is approximately normal with mean = 64
inches and SD = 2.7 inches
Q/ Approximately 68% of women have a height
between ……………. and ………….
Q/ About 2.5% of women are shorter than ……..
Q/ Approximately what proportion of women
are taller than 72.1 inches?