Download P(x) - ShareStudies.com

Location: Chemistry 001
Time: Friday, Nov 20, 7pm to 9pm
o Make copies of:
   o Areas under the Normal Curve: Appendix B.1, page 784
   o Student's t Distribution: Appendix B.2, page 785
   o Binomial Probability Distribution: Appendix B.9, pages 794, 798
$p(x) = P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$
Constructing a Frequency Distribution
o Decide on the number of classes to group the data into.
   o Rule: choose the smallest k such that $2^k > n$, where k = number of classes and n = number of observations.
   o Ex.: if we have 80 observations, we should use 7 classes ($2^6 = 64 < 80$; $2^7 = 128 > 80$).
o Determine the class interval or width. It should be the same for all classes.
   o Rule: $i \ge \dfrac{H - L}{k}$, where i = class interval, H = highest observed value, L = lowest observed value, and k = number of classes.
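The two rules above can be sketched in a few lines of Python. The function names and the values H = 35 and L = 15 are illustrative assumptions, not taken from the slides.

```python
def number_of_classes(n: int) -> int:
    """Smallest k with 2**k > n (the '2 to the k' rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def minimum_class_width(highest: float, lowest: float, k: int) -> float:
    """Lower bound on the class interval: i >= (H - L) / k."""
    return (highest - lowest) / k


k = number_of_classes(80)                    # 80 observations, as in the example
width = minimum_class_width(35.0, 15.0, k)   # H and L are hypothetical values
print(k, round(width, 2))
```

In practice the computed width is usually rounded up to a convenient number, such as the width of 3 used in the selling-price classes.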
Constructing a Frequency Distribution
Count the number of items in each class. The number of items in
each class is called the class frequency.
Selling Prices ($000)    Frequency
15 up to 18                  8
18 up to 21                 23
21 up to 24                 17
24 up to 27                 18
27 up to 30                  8
30 up to 33                  4
33 up to 36                  2
Total                       80
Graphic Presentation
o Histogram
o Frequency Polygon
o Cumulative Frequency Polygon
Histogram
[Figure: histogram of Number of Vehicles (vertical axis) against Selling Price (horizontal axis); the bar heights are the class frequencies 8, 23, 17, 18, 8, 4, 2.]
Frequency Polygon
A frequency polygon consists of line segments connecting the points formed by the class midpoints and the class frequencies.
[Figure: frequency polygon of Frequencies (vertical axis) against Selling Price (horizontal axis), with the horizontal axis marked at 13.5, 16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5, 37.5.]
Cumulative Frequency Distribution
A Cumulative Frequency Distribution is used to determine how
many or what proportion of the data values are below or above a
certain value.
Selling Prices ($000)    Frequency    Cumulative Frequency
15 up to 18                  8          8
18 up to 21                 23         31  (8 + 23)
21 up to 24                 17         48  (8 + 23 + 17)
24 up to 27                 18         66
27 up to 30                  8         74
30 up to 33                  4         78
33 up to 36                  2         80
Total                       80
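The cumulative column is just a running total of the class frequencies. A minimal sketch, using the frequencies from the selling-price table (the variable names are illustrative):

```python
from itertools import accumulate

# Class frequencies from the selling-price table
frequencies = [8, 23, 17, 18, 8, 4, 2]

cumulative = list(accumulate(frequencies))       # running totals
total = cumulative[-1]
proportions = [c / total for c in cumulative]    # share of data below each upper class limit

print(cumulative)
print(proportions)
```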
Cumulative Frequency Polygon
[Figure: cumulative frequency polygon of the number of vehicles sold (0 to 80, with a second scale from 0 to 100%) against Selling Price ($000), from 15 to 36.]
Chapter 3
Measures of Location
o Arithmetic mean: $\mu = \dfrac{\sum X}{N}$
o Weighted mean: $\bar{X}_w = \dfrac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n}$
o Median
o Mode
o If Median < Mean, the distribution is positively skewed.
o If Median > Mean, the distribution is negatively skewed.
Measures of Dispersion
Dispersion refers to the spread or variability in
the data.
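o Range
o Mean deviation: $MD = \dfrac{\sum |X - \bar{X}|}{n}$
o Variance
   o Population variance: $\sigma^2 = \dfrac{\sum (X - \mu)^2}{N}$
   o Sample variance: $s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$
o Standard deviation: $\sigma = \sqrt{\sigma^2}$ (population), $s = \sqrt{s^2}$ (sample)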
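To make the difference between the divisors N and n - 1 concrete, here is a small sketch using Python's standard `statistics` module. The five data values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

data = [2, 4, 6, 8, 10]   # hypothetical observations, not from the slides

mean = statistics.fmean(data)
data_range = max(data) - min(data)
mean_deviation = sum(abs(x - mean) for x in data) / len(data)
population_variance = statistics.pvariance(data)   # divides by N
sample_variance = statistics.variance(data)        # divides by n - 1
sample_sd = statistics.stdev(data)                 # square root of the sample variance

print(data_range, mean_deviation, population_variance, sample_variance)
```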
Mean and Standard Deviation of Grouped Data (from a frequency distribution)
$\bar{X} = \dfrac{\sum f M}{n}$
$s = \sqrt{\dfrac{\sum f (M - \bar{X})^2}{n - 1}}$
f is the frequency in each class
M is the midpoint of each class
n is the total number of frequencies
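As an illustration, the grouped-data formulas can be applied to the selling-price frequency distribution from earlier in these notes. Pairing the formulas with that table is a choice made here, and the midpoints are computed from the class limits (for example, 16.5 for "15 up to 18").

```python
import math

midpoints = [16.5, 19.5, 22.5, 25.5, 28.5, 31.5, 34.5]   # class midpoints M
frequencies = [8, 23, 17, 18, 8, 4, 2]                    # class frequencies f

n = sum(frequencies)
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
grouped_sd = math.sqrt(
    sum(f * (m - grouped_mean) ** 2 for f, m in zip(frequencies, midpoints)) / (n - 1)
)

print(round(grouped_mean, 4), round(grouped_sd, 2))
```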
Chapter 4
Percentiles
Location of a percentile: $L_p = (n + 1)\dfrac{P}{100}$
n = number of observations
P = desired percentile
o Q1 = L25, Q2 = Median = L50, Q3 = L75.
o 7th decile = L70.
Example: 43, 61, 75, 91, 101, 104
L25 = (6 + 1)(25/100) = 1.75, i.e. the 1.75th position: 0.75 of the distance between the 1st and 2nd observations, so the 25th percentile is 43 + (61 - 43)(0.75) = 56.5.
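The location formula and the interpolation step can be sketched as follows. The function name is illustrative, and clamping to the first or last observation when the location falls outside the data is an assumption added here; the slides do not cover that case.

```python
def percentile(sorted_values, p):
    """Value at location L_p = (n + 1) * p / 100, interpolating linearly."""
    n = len(sorted_values)
    location = (n + 1) * p / 100      # 1-based position, may be fractional
    lower = int(location)             # whole part of the position
    fraction = location - lower       # fractional part of the position
    if lower < 1:                     # location falls before the first value
        return sorted_values[0]
    if lower >= n:                    # location falls at or past the last value
        return sorted_values[-1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


data = [43, 61, 75, 91, 101, 104]
print(percentile(data, 25))   # 56.5, matching the worked example
print(percentile(data, 50))   # the median, halfway between 75 and 91
```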
Box Plot
L = 13, H = 30, Q1 = 15, Q2 = 18, Q3 = 22.
A value is an outlier if value > Q3 + 1.5(Q3 - Q1) or if value < Q1 - 1.5(Q3 - Q1), where Q3 - Q1 is the interquartile range.
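A short sketch of the outlier rule, using the quartiles from this box-plot example (the helper name `is_outlier` is illustrative):

```python
q1, q3 = 15, 22                   # quartiles from the box-plot example
iqr = q3 - q1                     # interquartile range
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value):
    """True if the value lies outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence


print(lower_fence, upper_fence)
print(is_outlier(13), is_outlier(30))   # the lowest and highest values in the example
```

With these quartiles neither L = 13 nor H = 30 falls outside the fences.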
[Figure: box plot on a scale from 12 to 32, marking L, Q1, Median, Q3, and H.]
Ex: Using the twelve stock prices, we find the mean to be
84.42, standard deviation, 7.18, median, 84.5.
Coefficient of variation: $CV = \dfrac{s}{\bar{X}}(100\%) = 8.5\%$
Coefficient of skewness: $sk = \dfrac{3(\bar{X} - \text{Median})}{s} = -0.035$
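Both coefficients can be recomputed from the summary values quoted in the example. This is only a sketch: the twelve stock prices themselves are not in these notes, so the inputs are the rounded figures above.

```python
mean, sd, median = 84.42, 7.18, 84.5   # rounded summary values from the example

cv = sd / mean * 100                   # coefficient of variation, in percent
sk = 3 * (mean - median) / sd          # Pearson's coefficient of skewness

print(round(cv, 1), round(sk, 3))
```

With these rounded inputs the skewness comes out near -0.033 rather than -0.035; the small gap is presumably because the original calculation used unrounded values.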
Chapter 5
Classical Probability
o Based on the assumption that the outcomes of an experiment are equally likely.
o Probability of an event = Number of favorable outcomes / Total number of possible outcomes.
o Example: We roll a die. What is the probability of the event "an even number appears face up"?
   o Possible outcomes are 1, 2, 3, 4, 5, 6 (6 outcomes).
   o Favorable outcomes are 2, 4, 6 (3 outcomes).
   o Probability of an even number = 3/6 = 0.5
Empirical Probability
o Based on relative frequency.
o Probability of an event = Number of times the event occurred in the past / Total number of observations.
o Example: What is the probability of a future space shuttle mission being successful, given that 2 out of the last 113 missions ended in disaster?
   o Probability of a successful mission = Number of successful flights / Total number of flights.
   o P(A) = 111/113 = 0.98
Conditional Probability
A conditional probability is the probability of a
particular event occurring, given that another
event has occurred.
The probability of the event A given that the
event B has occurred is written P(A|B).
Two events A and B are independent if the occurrence of one has no effect on the probability of the occurrence of the other, that is, if P(A|B) = P(A) or, equivalently, P(B|A) = P(B).
Rules for Computing Probabilities: Rules of Addition
o Special Rule of Addition: If two events A and B are mutually exclusive, the probability of one or the other event occurring equals the sum of their probabilities.
   P(A or B) = P(A) + P(B)
o General Rule of Addition: If A and B are two events that are not mutually exclusive, then P(A or B) is given by the following formula:
   P(A or B) = P(A) + P(B) - P(A and B)
[Figure: Venn diagram of events A and B, with the overlap labeled "A and B".]
Joint Probability of A and B
Contingency Table:
Example: The Dean of the School of Business at Owens University collected the
following information about undergraduate students in her college
Major         Male    Female    Total
Accounting     170       110      280
Finance        120       100      220
Marketing      160        70      230
Management     150       120      270
Total          600       400     1000
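As a sketch of how the table feeds the probability rules, the snippet below computes a joint probability, applies the general rule of addition, and derives a conditional probability. The choice of events (Female, Accounting) is illustrative, and the conditional step uses the standard identity P(A|B) = P(A and B) / P(B), which the slides do not state explicitly.

```python
# Counts from the contingency table: major -> (male, female)
counts = {
    "Accounting": (170, 110),
    "Finance": (120, 100),
    "Marketing": (160, 70),
    "Management": (150, 120),
}
total = sum(m + f for m, f in counts.values())            # 1000 students

p_female = sum(f for _, f in counts.values()) / total     # P(Female)
p_accounting = sum(counts["Accounting"]) / total          # P(Accounting)
p_joint = counts["Accounting"][1] / total                 # P(Female and Accounting)

# General rule of addition: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_female + p_accounting - p_joint
# Conditional probability: P(Accounting | Female) = P(A and B) / P(B)
p_acc_given_female = p_joint / p_female

print(p_joint, round(p_either, 2), round(p_acc_given_female, 3))
```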
Chapter 6
Constructing a PDF and a CDF for a Discrete Random Variable
Example: Toss a coin three times and let X be the number of
heads. What is the PDF and CDF of X?
Outcome    Prob.    X
HHH        1/8      3
HHT        1/8      2
HTH        1/8      2
HTT        1/8      1
THH        1/8      2
THT        1/8      1
TTH        1/8      1
TTT        1/8      0

x    P(X = x)    F(x) = P(X ≤ x)
0    1/8         1/8
1    3/8         1/8 + 3/8 = 1/2
2    3/8         1/2 + 3/8 = 7/8
3    1/8         1
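The same PDF and CDF can be built by enumerating the eight outcomes. A minimal sketch using exact fractions (the variable names are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three tosses and count the heads in each
head_counts = Counter(outcome.count("H") for outcome in product("HT", repeat=3))

# PDF: P(X = x) = (number of outcomes with x heads) / 8
pmf = {x: Fraction(count, 8) for x, count in sorted(head_counts.items())}

# CDF: running sum of the PDF, F(x) = P(X <= x)
cdf = {}
running = Fraction(0)
for x, p in pmf.items():
    running += p
    cdf[x] = running

for x in pmf:
    print(x, pmf[x], cdf[x])
```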
Expected Value (Mean)
o Mathematically: the expected value (or mean) of a RV X is $\mu = E(X) = \sum_{\text{all } x} x\,p(x)$
o Sometimes written $\mu_X$
o Additivity: E(X + Y) = E(X) + E(Y)
Variance and Standard Deviation
o A measure of the variability of a RV is its variance.
o To compute the variance of a discrete RV X:
   o Compute µ
   o For each possible x, compute $(x - \mu)^2 p(x)$
   o Add up these values
   o It helps to construct a table
o In a formula: $\sigma^2 = \mathrm{Var}(X) = \sum_{\text{all } x} (x - \mu)^2 p(x)$
  OR $\sigma^2 = \mathrm{Var}(X) = \sum_{\text{all } x} x^2 p(x) - \mu^2$
o Standard deviation (SD): $\sigma = \sqrt{\mathrm{Var}(X)}$
Variance and Standard Deviation
o Consider the pdf of the random variable:
   x       0      1      2      3
   p(x)    1/8    3/8    3/8    1/8
o µ = 3/2
o Var(X) = (0 - 3/2)^2 (1/8) + (1 - 3/2)^2 (3/8) + (2 - 3/2)^2 (3/8) + (3 - 3/2)^2 (1/8) = 3/4
o What is the CDF at 2? F(2) = P(X = 0) + P(X = 1) + P(X = 2) = 7/8
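The calculations for this pdf can be checked with exact fractions. The sketch below evaluates the mean, both variance formulas, and F(2); the variable names are illustrative.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mu = sum(x * p for x, p in pmf.items())                            # E(X)
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())    # sum of (x - mu)^2 p(x)
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2   # sum of x^2 p(x), minus mu^2
f_2 = sum(p for x, p in pmf.items() if x <= 2)                     # F(2) = P(X <= 2)

print(mu, var_definition, var_shortcut, f_2)   # 3/2 3/4 3/4 7/8
```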
The Binomial Distribution
Let X be the number of "successes" in n independent "trials," each with success probability p.
o Such an X is a binomial RV with parameters n and p:
   $p(x) = P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, where $\binom{n}{x} = \dfrac{n!}{x!\,(n - x)!}$
   n is the number of trials
   x is the number of observed successes, x = 0, ..., n
   p is the probability of success on each trial
o What is the mean and variance of a binomial random variable?
o In the book, the probability p is denoted by π:
   $p(x) = {}_nC_x \, \pi^x (1 - \pi)^{n - x}$, where ${}_nC_x = \dfrac{n!}{x!\,(n - x)!}$
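A short sketch of the binomial formula in Python. It also checks numerically the standard results that a binomial RV has mean np and variance np(1 - p); those results are textbook facts and are not stated on the slide itself. The function name and the choice n = 3, p = 0.5 are illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 3, 0.5   # three fair coin tosses
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(pmf)
print(mean, n * p)                   # mean compared with n * p
print(variance, n * p * (1 - p))     # variance compared with n * p * (1 - p)
```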
The Binomial Distribution
o An important part of understanding probability/statistics is recognizing a "binomial situation".
o Binomial examples:
   o Number of defective products in a sample of items.
      n = number of items, p = probability of a product being defective
   o Number of students in this class who are in senior year.
      n = number of students in this class, p = probability of a student being a senior
   o Number of no-shows for a flight.
      n = number of passengers, p = probability of a passenger being a no-show
   o Number of times next week I'll get stuck in traffic on my way to school.
      n = number of work days per week, p = probability I get stuck in traffic
Chapter 7
Continuous Probability Distributions
o For a discrete RV X, the pdf is given by p(x) = P(X = x) for all possible values of x.
o For a continuous RV X, P(X = x) = 0 for all values of x.
   o Example: If X is the amount of time you wait in line at Starbucks, then P(X = 30.567… seconds) = 0.
o The pdf of a continuous RV is represented by a function p(x) for all values of x, where the total area under p(x) is 1.
o The uniform and normal distributions are commonly used continuous distributions.
Uniform Distribution
o The simplest distribution for a continuous random variable.
o Rectangular in shape, constant (uniform) height.
o Defined by minimum and maximum values a and b.
o Areas within the distribution represent probabilities.
o Example: Time to fly on MEA from Beirut to Paris ranges from 4 hrs to 5 hrs. The random variable is flight time; it is continuous.
[Figure: a continuous uniform distribution, P(x) against x; a rectangle of height 1/(b - a) between a and b.]
Uniform Distribution
o Mean: $\mu = \dfrac{a + b}{2}$
o SD: $\sigma = \sqrt{\dfrac{(b - a)^2}{12}}$
o Height: $P(x) = \dfrac{1}{b - a}$ if a ≤ x ≤ b, and 0 elsewhere.
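Applied to the flight-time example (a = 4 hours, b = 5 hours), the formulas can be sketched as below. The helper `uniform_probability` and the question "between 4 and 4.5 hours" are illustrative additions, not part of the slides.

```python
import math

a, b = 4.0, 5.0                     # flight time in hours, from the example

mean = (a + b) / 2
sd = math.sqrt((b - a) ** 2 / 12)
height = 1 / (b - a)


def uniform_probability(low: float, high: float) -> float:
    """P(low < X < high) as a rectangle area, clipped to the interval [a, b]."""
    low, high = max(low, a), min(high, b)
    return max(high - low, 0.0) * height


print(mean, round(sd, 4), uniform_probability(4.0, 4.5))
```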
The Standard Normal Distribution
The standard normal distribution is a normal distribution with
a mean of 0 and a standard deviation of 1.
It is also called the z distribution.
A z-value is the signed distance between a selected value,
designated X, and the mean µ, divided by the standard
deviation, σ. The formula is:
$z = \dfrac{X - \mu}{\sigma}$
The Normal Distribution
o We want to know the area under the curve between the mean, 283, and 285.4 grams, i.e. P(283 < X < 285.4).
o We convert the x values into z values:
   o z value for 283: z = (x - μ)/σ = (283 - 283)/1.6 = 0
   o z value for 285.4: z = (285.4 - 283)/1.6 = 1.5
o P(283 < weight < 285.4) = P(0 < z < 1.5) = the area under the curve between 0.00 and 1.5 = 0.4332
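The table lookup can be cross-checked with `statistics.NormalDist` from the Python standard library. This is only a sketch; in the exam setting the area comes from Appendix B.1.

```python
from statistics import NormalDist

mu, sigma = 283, 1.6
z = (285.4 - mu) / sigma                           # z value for 285.4
area = NormalDist().cdf(z) - NormalDist().cdf(0)   # P(0 < Z < z)

print(round(z, 2), round(area, 4))
```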
The Normal Distribution
o What is the value of X for which 5% of the values will be larger than X?
o Equivalently, what is the value of X below which 95% of the values will fall?
o We obtain z from Appendix B.1: z = 1.65.
o We convert to the x value: x = σz + μ.
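A sketch of the reverse lookup, going from a probability to an x value. The slide does not say which mean and standard deviation apply here, so μ = 283 and σ = 1.6 are assumed, carried over from the weight example.

```python
from statistics import NormalDist

mu, sigma = 283, 1.6             # assumed: same mean and SD as the weight example
z = NormalDist().inv_cdf(0.95)   # about 1.645; the printed table rounds this to 1.65
x = sigma * z + mu               # convert the z value back to an x value

print(round(z, 3), round(x, 2))
```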
Chapters 8 and 9
Sampling Error
o Samples are used to estimate population characteristics.
o It is unlikely that a sample mean (standard deviation) will exactly equal the population mean (standard deviation).
o What error is made in estimating the population mean based on the sample?
o Definition: the difference between a sample statistic and its corresponding population parameter.
o Example: the output of each employee is 97, 103, 96, 99, 105 units (population mean µ = 100). Select samples of two and find their means.
   o Sample 1: {97, 105} with mean = 101
      Sampling error = $\bar{X} - \mu = 101 - 100 = 1$
   o Sample 2: {103, 96} with mean = 99.5
      Sampling error = $\bar{X} - \mu = 99.5 - 100 = -0.5$
o Sampling errors are random and occur by chance.
o To make accurate predictions based on sample results, we need to first develop sampling distributions of the sample means.
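As a sketch of what "developing the sampling distribution" means for this small population, the snippet below lists every possible sample of two employees together with its sampling error. The variable names are illustrative.

```python
from itertools import combinations
from statistics import fmean

outputs = [97, 103, 96, 99, 105]   # the whole population of five employees
mu = fmean(outputs)                # population mean

# Every possible sample of two employees and its sampling error (sample mean - mu)
errors = {sample: fmean(sample) - mu for sample in combinations(outputs, 2)}

for sample, error in errors.items():
    print(sample, error)
```

Across all ten samples the sampling errors cancel out, which reflects the fact that the mean of the sample means equals the population mean.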
Sampling: Distribution of the Sample Mean
(Sigma Known)
o If a population follows the normal distribution, the
sampling distribution of the sample mean will also
follow the normal distribution.
o To determine the probability a sample mean falls within a particular region, use: $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Note that $\sigma / \sqrt{n}$ is called the Standard Error of the Mean.
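A minimal numerical sketch of the z formula for a sample mean. All four input values are hypothetical and chosen only to illustrate the calculation.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values, chosen only to illustrate the formula
mu, sigma, n = 100.0, 15.0, 36
x_bar = 104.0

standard_error = sigma / sqrt(n)      # standard error of the mean
z = (x_bar - mu) / standard_error     # z value of the sample mean
p_above = 1 - NormalDist().cdf(z)     # P(sample mean > x_bar)

print(standard_error, round(z, 2), round(p_above, 4))
```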
Sampling: Distribution of the Sample Mean
(Sigma Unknown)
o If the population does not follow the normal distribution, but the sample is of at least 30 observations, the sample means will approximately follow the normal distribution.
o To determine the probability a sample mean falls within a particular region, use: $z = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$
Point Estimate
o Definition: the statistic computed from sample information and used to estimate the population parameter.
o Examples:
   o The sample mean, $\bar{X}$, is a point estimate of the population mean, µ.
   o The sample standard deviation, s, is a point estimate of the population standard deviation, σ.
   o The sample proportion, p, is a point estimate of the population proportion, π.
Confidence Interval
Confidence interval equations:
CI Eq1: $\bar{X} \pm z \dfrac{\sigma}{\sqrt{n}}$
CI Eq2: $\bar{X} \pm z \dfrac{s}{\sqrt{n}}$
CI Eq3: $\bar{X} \pm t \dfrac{s}{\sqrt{n}}$
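The z-based intervals (Eq1 and Eq2) can be sketched with the standard library. The function name and the sample figures are hypothetical. For Eq3 the only change would be to replace z with a t value from Appendix B.2, since Python's `statistics` module does not provide the t distribution.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sd, n, confidence=0.95):
    """x_bar +/- z * sd / sqrt(n), with z taken from the standard normal."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample: mean 50, SD 10, n = 100
low, high = z_confidence_interval(50.0, 10.0, 100)
print(round(low, 2), round(high, 2))
```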
When to Use the t Distribution
Is the population normal?
o No: Is n 30 or more?
   o No: Use a nonparametric test (or assume normal and go through the flow chart again).
   o Yes: Use the z distribution (Eq2 or Eq1).
o Yes: Is the population SD known?
   o No: Use t if n is less than or equal to 30 (Eq3); use z if n is more than 30 (Eq2).
   o Yes: Use the z distribution (Eq1).
Sample Size for Estimating Population Mean
$n = \left(\dfrac{z s}{E}\right)^2$
n is the sample size;
z is the standard normal value corresponding to the desired
level of confidence;
s is an estimate of the population SD;
E is the maximum allowable error (1/2 length of the CI).
If the result is not a whole number, round up.
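A small sketch of the sample-size formula, including the round-up step. The function name and the input values are hypothetical.

```python
from math import ceil


def sample_size_for_mean(z: float, s: float, e: float) -> int:
    """n = (z * s / E) ** 2, rounded up to the next whole number."""
    return ceil((z * s / e) ** 2)


# Hypothetical inputs: 95% confidence (z = 1.96), s = 20, E = 5
print(sample_size_for_mean(1.96, 20, 5))   # (1.96 * 20 / 5)^2 = 61.47, so n = 62
```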
Standard Error of the Sample Proportion
$\sigma_p = \sqrt{\dfrac{p(1 - p)}{n}}$
Confidence Interval for a Population Proportion
$p \pm z \sqrt{\dfrac{p(1 - p)}{n}}$
Sample Size for the
Population Proportion
Three items need to be specified:
1. The desired level of confidence.
2. The margin of error in the population proportion.
3. An estimate of the population proportion.
$n = p(1 - p)\left(\dfrac{z}{E}\right)^2$
If an estimate of π is not available, use p=0.5 to approximately
estimate the sample size.
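A sketch of the proportion sample-size formula, with p = 0.5 as the default when no estimate is available. The function name and inputs are hypothetical, and the result is rounded up as for the mean.

```python
from math import ceil


def sample_size_for_proportion(z: float, e: float, p: float = 0.5) -> int:
    """n = p(1 - p)(z / E)^2, rounded up; p defaults to 0.5 when no estimate exists."""
    return ceil(p * (1 - p) * (z / e) ** 2)


# Hypothetical inputs: 95% confidence (z = 1.96), margin of error E = 0.05
print(sample_size_for_proportion(1.96, 0.05))          # no estimate of the proportion
print(sample_size_for_proportion(1.96, 0.05, p=0.3))   # with a prior estimate of 0.3
```

Using p = 0.5 gives the largest value of p(1 - p), so it yields the most conservative (largest) sample size.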
Finite-Population Correction Factor
If the population size N is not very large, then we use a
population correction factor when computing the CI.
If (n/N > 0.05) then use :
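$\bar{X} \pm t \dfrac{s}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
OR
$\bar{X} \pm z \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
OR
$\bar{X} \pm z \dfrac{s}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$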
Finite-Population Correction Factor
If the population size N is not very large, then we use a
population correction factor when computing the CI.
If (n/N > 0.05) then use :
$p \pm z \sqrt{\dfrac{p(1 - p)}{n}} \sqrt{\dfrac{N - n}{N - 1}}$
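A sketch of the correction applied to a proportion interval. The function names and the example figures are hypothetical; the n/N > 0.05 check follows the rule stated above.

```python
from math import sqrt


def fpc(population_size: int, n: int) -> float:
    """Finite-population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((population_size - n) / (population_size - 1))


def proportion_interval(p, n, z, population_size=None):
    """p +/- z * sqrt(p(1 - p) / n), with the correction applied when n/N > 0.05."""
    margin = z * sqrt(p * (1 - p) / n)
    if population_size is not None and n / population_size > 0.05:
        margin *= fpc(population_size, n)
    return p - margin, p + margin


# Hypothetical: p = 0.4 from n = 100 drawn out of N = 500 (n/N = 0.2)
print(proportion_interval(0.4, 100, 1.96, population_size=500))
print(proportion_interval(0.4, 100, 1.96))   # no correction, for comparison
```

The corrected interval is narrower, since sampling a large share of a finite population leaves less uncertainty.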
Material NOT Included in the Midterm
o Chapter 2:
   o Chebyshev's Theorem
   o Geometric Mean
o Chapter 4:
   o Software Coefficient of Skewness
   o Stem-and-Leaf Displays
o Chapter 5:
   o Permutation Equation
o Chapter 6:
   o Hypergeometric Probability Distribution
   o Poisson Probability Distribution
   o Covariance
o Chapter 7:
   o The Normal Approximation to the Binomial