1 REVIEW
1.1 PROBABILITY DISTRIBUTIONS
Normal Distribution
The probability distribution for a continuous random variable can be represented by a
curve corresponding to the theoretical shape of the histogram for the given variable. The
area under the curve is equal to 1 and the area under the curve between any two values of
the variable gives the probability that the variable will take values within that range.
Most of the random variables in nature have, approximately, a Normal Distribution.
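[Figure: normal curve f(x) against x, with µ, x1, x2 and x3 marked on the horizontal axis.]
\[ P(X \le x_1) = A_1, \qquad P(x_1 \le X \le x_2) = A_2, \qquad P(X \ge x_3) = A_3 \]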
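The three areas can be computed numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the mean, standard deviation and cut points are illustrative assumptions, not values from the text.

```python
from statistics import NormalDist

# Hypothetical normal variable: mean 100, standard deviation 15 (illustrative only).
X = NormalDist(mu=100, sigma=15)
x1, x2, x3 = 85, 100, 130

A1 = X.cdf(x1)               # P(X <= x1): area to the left of x1
A2 = X.cdf(x2) - X.cdf(x1)   # P(x1 <= X <= x2): area between x1 and x2
A3 = 1 - X.cdf(x3)           # P(X >= x3): area to the right of x3

print(round(A1, 4), round(A2, 4), round(A3, 4))
```

Each probability is a difference of cumulative areas, which mirrors reading areas off the curve.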
The Z Distribution
Let a random variable X have a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then
\[ Z = \frac{X - \mu}{\sigma} \]
where Z has a normal distribution with mean 0 and standard deviation 1.
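As a quick check of the standardization, the sketch below compares a probability computed on the original scale of X with the same probability computed on the Z scale. The parameters and the value x are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 100, 15      # hypothetical parameters of X
x = 120                  # hypothetical value of interest

z = (x - mu) / sigma                      # standardized value Z = (X - mu) / sigma
p_from_x = NormalDist(mu, sigma).cdf(x)   # P(X <= x) on the original scale
p_from_z = NormalDist(0, 1).cdf(z)        # P(Z <= z) on the standard scale

print(z, p_from_x, p_from_z)
```

The two probabilities agree, which is why a single table of the Z distribution serves every normal variable.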
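Let a random variable X have any distribution with mean $\mu$ and standard deviation $\sigma$. Let
\[ \bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} \]
be the sample mean from a sample of size $n \ge 30$. Then $\bar{X}$ has, approximately, a normal distribution with
\[ \mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
and
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]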
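The behavior of the sample mean can be illustrated by simulation. This is a rough sketch, assuming an exponential population (mean 1, standard deviation 1) chosen only because it is clearly non-normal; the seed, sample size and number of replicates are arbitrary.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 36                   # sample size (n >= 30)
reps = 5000              # number of simulated samples

# Exponential population with rate 1: mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0
sample_means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

print("mean of sample means:", mean(sample_means))   # should be near mu
print("sd of sample means:  ", stdev(sample_means))  # should be near sigma / sqrt(n)
print("sigma / sqrt(n):     ", sigma / n ** 0.5)
```

The simulated mean and standard deviation of the sample means should land close to $\mu$ and $\sigma / \sqrt{n}$, up to simulation noise.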
The T Distribution
Let a random variable have any distribution and let $\bar{x}$ and $s$ be the mean and the standard deviation from a sample of size $n \ge 30$. Then
\[ T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]
The T distribution is similar to the Z distribution. Its shape depends upon n, and it approaches the Z distribution as n becomes large.
The χ² (Chi-Square) Distribution
Let X be a normal random variable with mean $\mu$ and standard deviation $\sigma$. Let s be the standard deviation from a sample of size n. Then
\[ \chi^2 = \frac{(n - 1) S^2}{\sigma^2} \]
The shape of the $\chi^2$ distribution depends upon n.
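A simulation can make the statistic above concrete. The sketch below draws repeated normal samples and computes $(n-1)S^2/\sigma^2$ for each. It relies on one fact not stated in the text: a chi-square variable with $n-1$ degrees of freedom has mean $n-1$. The population values, seed and replicate count are hypothetical.

```python
import random
from statistics import mean, variance

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 10                   # sample size
sigma = 2.0              # hypothetical population standard deviation
reps = 5000              # number of simulated samples

chi2_values = []
for _ in range(reps):
    sample = [rng.gauss(50.0, sigma) for _ in range(n)]
    s2 = variance(sample)                        # sample variance S^2 (divisor n - 1)
    chi2_values.append((n - 1) * s2 / sigma ** 2)

print("average of (n-1)S^2/sigma^2:", mean(chi2_values))  # should be near n - 1 = 9
```

Changing n shifts the average, which is one way to see that the shape of the distribution depends on the sample size.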
The F Distribution
Let X1, X2 be two normal random variables with means $\mu_1$, $\mu_2$ and standard deviations $\sigma_1$, $\sigma_2$, and let s1, s2 be the sample standard deviations from two samples of sizes n1 and n2. Then
\[ F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \]
The shape of the F distribution depends upon n1 and n2.
1.2 ESTIMATION
The theoretical procedures for estimating and hypothesis testing are based upon the
probability distributions of estimators. Estimators are random variables that are used to
estimate population parameters (e.g. $\mu$ and $\sigma$). Estimations of population parameters are given in terms of $(1-\alpha)$ Confidence Intervals,
consisting of the value of the estimator plus or minus the error of estimation. The error of
estimation depends upon the distribution of the estimator and its standard deviation,
called standard error.
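Estimation of the population mean µ
A $(1-\alpha)$ Confidence Interval for µ is given by:
\[ \bar{x} \pm z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right) \quad \text{or} \quad \bar{x} \pm t_{(\alpha/2,\, n-1)} \left( \frac{s}{\sqrt{n}} \right) \]
depending on whether or not $\sigma$ is known.
Here $\bar{x}$ is the estimator of µ and $z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right)$ or $t_{(\alpha/2,\, n-1)} \left( \frac{s}{\sqrt{n}} \right)$ represents the error of estimation.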
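As an example, let $(1-\alpha)$ be .95 or 95%, which is the standard Confidence Level for Confidence Intervals. The corresponding $z_{\alpha/2}$ is $z_{.025} = 1.96$.
[Figure: normal curve centered at µ with central area $1-\alpha = .95$, tail areas $\alpha/2 = .025$ on each side, and limits at $\mu \pm 1.96 \left( \frac{\sigma}{\sqrt{n}} \right)$.]
The interval
\[ \bar{x} \pm 1.96 \left( \frac{\sigma}{\sqrt{n}} \right) \]
will catch the mean µ 95% of the time (or with confidence 95%).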
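The interval for the case of known $\sigma$ can be computed as follows. This is a minimal sketch: the sample values and the value of $\sigma$ are made up for illustration, and $z_{\alpha/2}$ is obtained from the standard library rather than from a table.

```python
from statistics import NormalDist, mean

# Hypothetical sample and known population standard deviation (illustrative only).
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2, 10.4]
sigma = 0.3
alpha = 0.05

n = len(sample)
x_bar = mean(sample)                       # estimator of mu
z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}, about 1.96 for alpha = .05
error = z * sigma / n ** 0.5               # error of estimation
lower, upper = x_bar - error, x_bar + error

print(f"{x_bar:.3f} +/- {error:.3f} -> ({lower:.3f}, {upper:.3f})")
```

When $\sigma$ is unknown, the same structure applies with s in place of $\sigma$ and a t critical value in place of z; the standard library has no t quantile function, so that value would have to come from a table or another package.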
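Estimation of the Difference of Population Means µ1 − µ2
A $(1-\alpha)$ Confidence Interval for $\mu_1 - \mu_2$ is given by:
\[ (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
or
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{(\alpha/2,\, n_1+n_2-2)} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]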
where the subscripts 1 and 2 refer to population 1 and population 2.
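Here the estimator of $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2$, which has either a Z distribution ($\sigma_1$, $\sigma_2$ known) or a T distribution ($\sigma_1$, $\sigma_2$ unknown), and
\[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \quad \text{or} \quad \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
are, respectively, the standard deviations of $(\bar{x}_1 - \bar{x}_2)$.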
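For the case of known $\sigma_1$ and $\sigma_2$, the interval can be sketched in a few lines. The summary statistics below are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary statistics for two independent samples (illustrative only).
x1_bar, sigma1, n1 = 52.0, 6.0, 40
x2_bar, sigma2, n2 = 49.5, 5.0, 50
alpha = 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)                # z_{alpha/2}
std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # standard deviation of x1_bar - x2_bar
diff = x1_bar - x2_bar                                 # estimator of mu1 - mu2
lower, upper = diff - z * std_error, diff + z * std_error

print(f"{diff:.2f} +/- {z * std_error:.3f} -> ({lower:.3f}, {upper:.3f})")
```

With these made-up numbers the interval does not contain 0, which would suggest that the two population means differ.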
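Estimation of the Population Proportion p
A $(1-\alpha)$ Confidence Interval for p is given by:
\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \]
Here $\hat{p}$ (sample proportion) is the estimator of p, $\hat{q} = 1 - \hat{p}$, and $z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$ is the error of estimation, because for large n $\hat{p}$ has, approximately, a normal distribution with a standard deviation equal to $\sqrt{\frac{\hat{p}\hat{q}}{n}}$.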
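The proportion interval follows the same pattern. In this sketch the counts are hypothetical (120 successes in 400 trials).

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical counts (illustrative only).
successes, n = 120, 400
alpha = 0.05

p_hat = successes / n                     # sample proportion, estimator of p
q_hat = 1 - p_hat
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
error = z * sqrt(p_hat * q_hat / n)       # error of estimation
lower, upper = p_hat - error, p_hat + error

print(f"{p_hat:.3f} +/- {error:.3f} -> ({lower:.3f}, {upper:.3f})")
```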
1.3 HYPOTHESIS TESTING
This procedure is used to test the value of one or more population parameters. The
procedure consists of setting a pair of mutually exclusive hypotheses (Null Hypothesis H0
and Alternate Hypothesis HA or H1) about the value of the parameter and deciding whether
or not the Null Hypothesis can be rejected based on sample information. In the following
cases we will consider only the set of hypotheses of equality and non-equality of the
population parameter.
Hypothesis Testing for the Population Mean µ
H0: µ = µ0
H1: µ ≠ µ0
Under H0,
\[ \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \]
follows a Z distribution.
Test Statistic:
\[ Z^* = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \]
Decision Rule: Reject H0 if $|Z^*| > z_{\alpha/2}$ or if $P_v < \alpha$, where $P_v = 2P(Z > |Z^*|)$ is the p-value.
If H0 is true, then Z* will follow a Z distribution. If the Z* value is too large or too small,
then the null hypothesis is rejected. For rejecting the Null Hypothesis, Z* is compared
with the critical values $z_{\alpha/2}$ and $-z_{\alpha/2}$, where $\alpha$ is the level of significance of the test, that is, the
probability of a Type I error.
If $\sigma$ is unknown, then the test statistic is
\[ T^* = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \]
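The two forms of the decision rule (critical value and p-value) can be checked against each other. The following sketch assumes $\sigma$ is known; all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary statistics and hypothesized mean (illustrative only).
x_bar, mu0, sigma, n = 10.1, 10.0, 0.3, 36
alpha = 0.05

Z = NormalDist()                              # standard normal distribution
z_star = (x_bar - mu0) / (sigma / sqrt(n))    # test statistic Z*
z_crit = Z.inv_cdf(1 - alpha / 2)             # critical value z_{alpha/2}
p_value = 2 * (1 - Z.cdf(abs(z_star)))        # Pv = 2 P(Z > |Z*|)

reject_by_critical_value = abs(z_star) > z_crit
reject_by_p_value = p_value < alpha
print(z_star, z_crit, p_value, reject_by_critical_value, reject_by_p_value)
```

Both rules always lead to the same decision, since $|Z^*| > z_{\alpha/2}$ holds exactly when $P_v < \alpha$.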
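Hypothesis Testing for Two Population Means µ1 - µ2
H0: µ1 = µ2
H1: µ1 ≠ µ2
Under H0,
\[ \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \]
follows a Z distribution.
Test Statistic:
\[ Z^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \]
Decision Rule: Reject H0 if $|Z^*| > z_{\alpha/2}$ or if $P_v < \alpha$.
If $\sigma_1$ and $\sigma_2$ are unknown, then the test statistic is:
\[ T^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]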
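For paired samples the test statistic is:
\[ T^* = \frac{\bar{D}}{S_D / \sqrt{n}} \]
where
\[ \bar{D} = \frac{\sum_{i=1}^{n} D_i}{n} \quad \text{and} \quad S_D^2 = \frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1} \]
and the $D_i$ are the differences between matched observations.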
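The paired statistic can be computed directly from the differences. This is a sketch with made-up before/after measurements on 10 subjects; the critical value 2.262 is the usual table value of $t_{(.025,\, 9)}$, entered by hand because the standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical matched observations (illustrative only).
before = [72, 68, 75, 80, 66, 71, 77, 69, 73, 70]
after = [75, 70, 74, 84, 69, 73, 80, 70, 77, 72]

d = [a - b for a, b in zip(after, before)]   # differences D_i
n = len(d)
d_bar = mean(d)                              # mean difference
s_d = stdev(d)                               # S_D, computed with divisor n - 1
t_star = d_bar / (s_d / sqrt(n))             # paired test statistic T*

t_crit = 2.262                               # table value t(.025, 9), assumed, not computed
print(t_star, abs(t_star) > t_crit)
```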
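Hypothesis Testing for the Population Proportion p
H0: $p = p_0$
H1: $p \ne p_0$
Under H0,
\[ \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \]
follows a Z distribution, where $q_0 = 1 - p_0$.
Test Statistic:
\[ Z^* = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \]
Decision Rule: Reject H0 if $|Z^*| > z_{\alpha/2}$ or $P_v < \alpha$.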
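A short sketch of the proportion test, with hypothetical counts (112 successes in 200 trials) and a hypothesized value $p_0 = 0.5$. Note that the standard error uses $p_0$ and $q_0$, not the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data and hypothesized proportion (illustrative only).
successes, n = 112, 200
p0 = 0.5
alpha = 0.05

p_hat = successes / n
q0 = 1 - p0
z_star = (p_hat - p0) / sqrt(p0 * q0 / n)              # test statistic Z*
p_value = 2 * (1 - NormalDist().cdf(abs(z_star)))      # two-sided p-value
reject = p_value < alpha

print(z_star, p_value, reject)
```

With these numbers the p-value exceeds .05, so H0 would not be rejected at the 5% level.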
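Hypothesis Testing for the Population Variance σ²
H0: $\sigma^2 = \sigma_0^2$
H1: $\sigma^2 \ne \sigma_0^2$
Under H0,
\[ \frac{S^2 (n - 1)}{\sigma_0^2} \]
follows $\chi^2_{(n-1)}$.
Test Statistic:
\[ \chi^{2*} = \frac{S^2 (n - 1)}{\sigma_0^2} \]
Decision Rule: Reject H0 if $\chi^{2*} > \chi^2_{(\alpha/2,\, n-1)}$ or $\chi^{2*} < \chi^2_{(1-\alpha/2,\, n-1)}$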
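The variance test reduces to one ratio and two table lookups. In the sketch below the sample variance, sample size and $\sigma_0^2$ are hypothetical, and the critical values 19.023 and 2.700 are the usual table values for 9 degrees of freedom at $\alpha = .05$, entered by hand because the standard library has no chi-square quantile function.

```python
# Hypothetical summary statistics (illustrative only).
s2 = 9.0          # sample variance S^2
n = 10            # sample size
sigma0_sq = 4.0   # hypothesized variance sigma_0^2

chi2_star = s2 * (n - 1) / sigma0_sq   # test statistic

# Table values for 9 degrees of freedom and alpha = .05 (assumed, not computed here).
chi2_upper = 19.023   # chi-square(alpha/2, n-1)
chi2_lower = 2.700    # chi-square(1 - alpha/2, n-1)

reject = chi2_star > chi2_upper or chi2_star < chi2_lower
print(chi2_star, reject)
```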
Hypothesis Testing for Two Population Variances σ1², σ2²
H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 \ne \sigma_2^2$
Under H0,
\[ \frac{S_1^2}{S_2^2} \]
follows an $F_{(n_1-1,\, n_2-1)}$ distribution.
Test Statistic:
\[ F^* = \frac{S_1^2}{S_2^2} \]
Decision Rule: Reject H0 if $F^* > F_{(n_1-1,\, n_2-1,\, \alpha)}$
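A minimal sketch of the F test, following the decision rule as stated above. The sample variances and sizes are hypothetical, and the critical value 3.18 is the usual table value of $F_{(9,\, 9,\, .05)}$, entered by hand because the standard library has no F quantile function.

```python
# Hypothetical sample variances and sizes (illustrative only).
s1_sq, n1 = 12.0, 10
s2_sq, n2 = 10 * 0.3, 10   # 3.0

f_star = s1_sq / s2_sq     # test statistic F* = S1^2 / S2^2

# Table value F(n1 - 1, n2 - 1, alpha) for 9 and 9 degrees of freedom, alpha = .05 (assumed).
f_crit = 3.18

reject = f_star > f_crit
print(f_star, reject)
```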