Data Analysis and Statistical Methods
Statistics 651
http://www.stat.tamu.edu/~suhasini/teaching.html
Lecture 8 (MWF) The binomial and introducing the normal distribution
Suhasini Subba Rao
Lecture 8 (MWF) Introducing the normal distribution
The binomial: mean and variance
• Suppose that each individual in the population can have either a success
(1) or a failure (0).
• Let p denote the proportion of the entire population who have a success
(the proportion of 1s in the population).
• Suppose a random sample of size n is drawn and let Sn be the number
of successes (number of 1s) in that sample.
• Sn is a random variable taking values in {0, 1, . . . , n} (e.g. S4 is the
number of successes out of 4 and has the outcomes {0, 1, 2, 3, 4}).
• Sn has all the properties of a random variable: we can associate a
probability to each outcome (the binomial distribution) and it has a
probability plot. Since it has a probability plot, it must have a center
and a spread, therefore it has a mean and a variance.
– The mean of a binomial is n × p.
– The variance of a binomial is n × p × (1 − p); the standard deviation
is √(n × p × (1 − p)).
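The formulas for the mean, variance and standard deviation can be checked numerically. The following is a minimal sketch; the function name `binomial_summary` is a hypothetical helper chosen for illustration and is not part of the lecture.

```python
import math

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a Binomial(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, math.sqrt(variance)

# Illustration with n = 4 trials and success probability p = 0.2.
mean, variance, sd = binomial_summary(4, 0.2)
print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.2f}")
```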
Example 1: n= 4 and p=0.2
• Suppose there are 4 exam questions, each answer is either wrong or
right, and we give it the value {0, 1}. The probability of a success is
P (X = 1) = 0.2 (the probability of getting it right by random guessing)
and the probability of a failure is P (X = 0) = 0.8.
• We are interested in the number of successes out of 4, this is the random
variable S4. Using the arguments we gave earlier we can show that:
\[
\begin{aligned}
P(S_4 = 0) &= (0.8)^4, \\
P(S_4 = 1) &= 4 \times (0.8)^3 \times (0.2)^1, \\
P(S_4 = 2) &= 6 \times (0.8)^2 \times (0.2)^2, \\
P(S_4 = 3) &= 4 \times (0.8)^1 \times (0.2)^3, \\
P(S_4 = 4) &= (0.2)^4.
\end{aligned}
\]
Hence we can plot the histogram, which has a center and a spread.
The mean of S4 is 4 × 0.2 = 0.8 and the standard deviation is
√(4 × 0.2 × 0.8) = 0.8.
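The five probabilities in this example can be reproduced with a few lines of code. This is a sketch only: `binomial_pmf` is a hypothetical helper name, and the counting factor (1, 4, 6, 4, 1) is computed with Python's built-in `math.comb`.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(S_n = k): number of ways to place k successes among n trials,
    times the probability of any one such arrangement."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.2
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(probs):
    print(f"P(S4 = {k}) = {prob:.4f}")

# The probabilities of all possible outcomes should add up to one.
print("total =", round(sum(probs), 10))
```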
The x-axis corresponds to the 5 different possible outcomes (in survey
terms: none say yes, one says yes, two say yes, three say yes and all say
yes). Since this is a discrete numerical variable the y-axis can correspond
to a probability.
Example 2: n=50 and p=0.2
• Suppose there are 50 exam questions and there is a 20% chance of getting
each one correct. The number of correct answers follows a binomial
distribution, Bin(50, 0.2).
• Its relative frequency histogram is the binomial distribution.
In Statcrunch you can make a plot of it (stat -> calculators ->
binomial), use n = 50 and p = 0.2.
• Being completely clueless and answering randomly means that the
average number of questions I get right will be 20% of 50, this is
0.2 × 50 = 10 (formula mean = n × p).
• The variation is measured with a standard deviation. The variation
in my score (the standard deviation) is s = √(50 × 0.2 × 0.8) = √8 ≈ 2.83
(using the formula that the standard deviation of a binomial is
√(n × p × (1 − p))). Look at the distribution given in Statcrunch, does
this look representative of the spread?
• The distribution is not symmetric but it is close to symmetric (locally)
about 10 (compare that with an exam with only 15 questions, where the
distribution looks ‘less’ symmetric and bell shaped).
• Observe that the probability of, say, scoring 10 out of 50 is quite small,
despite this being the average grade if one were to randomly guess.
This is because as the sample size grows the chance of any one outcome
becomes very small. This is the reason we are interested in the chance of
an interval, for example, scoring between 5 and 15.
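To see the difference between the chance of a single outcome and the chance of an interval, the sketch below compares P(S50 = 10) with the probability of scoring between 5 and 15. It assumes both endpoints are included in the interval, which the lecture does not state; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(S_n = k) for a Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 50, 0.2

# Chance of scoring exactly the average grade.
p_exactly_10 = binomial_pmf(10, n, p)

# Chance of scoring in the interval 5, 6, ..., 15 (assumed inclusive).
p_5_to_15 = sum(binomial_pmf(k, n, p) for k in range(5, 16))

print(f"P(S50 = 10)       = {p_exactly_10:.3f}")
print(f"P(5 <= S50 <= 15) = {p_5_to_15:.3f}")
```

The single outcome has a small probability even though it is the most typical score, while the interval around it captures most of the probability.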
Example 3: n=50 and p=0.5
• Suppose now the exam is a true or false exam. The probability of getting
a question correct by randomly guessing is 50%. There are 50 questions
on the exam, so the grade in the exam follows a Bin(50, 0.5).
• I am interested in the number correct out of 50, i.e. S50 (the number of
successes out of 50). The average number of successes is 50 × 0.5 = 25.
Because S50 is a random variable, it has a histogram (distribution) and
thus a variance (measure of spread). Its variance is 50 × 0.5 × 0.5 = 12.5.
This measures how spread out the distribution is from the mean.
• In Statcrunch, make a plot of the histogram (stat -> calculators
-> binomial), use n = 50 and p = 0.5.
• We see that it is symmetric about 25.
Plots in summary
Summary of binomial plots
Suppose that the number of questions in the exam was just 4:
• We showed that if P(X = 1) = 0.2 and P(X = 0) = 0.8, then for n = 4 the
mean is 4 × 0.2 = 0.8 (the standard deviation is √(4 × 0.2 × 0.8) = 0.8)
and the histogram is right skewed. This means we are more likely to
observe small values of Sn (in terms of surveys this means few people say
yes).
• On the other hand if the chance of success is 0.8, i.e. P(X = 1) = 0.8
and P(X = 0) = 0.2, then for n = 4, the mean is 4 × 0.8 = 3.2 (the
variance is 4 × 0.8 × 0.2 = 0.64) and the histogram is left skewed, so
large values of Sn are more likely.
• If P(X = 1) = P(X = 0) = 1/2, then for n = 4, the mean is 4 × 0.5 = 2
and we are most likely to observe values in the middle of the interval
[0, 4]. This time the histogram is symmetric (about 2).
• Now suppose the number of people we sample increases (we go from
n = 4 to n = 50). What we observe is that around the peak of the
histogram there is a symmetry (regardless of whether overall there is
symmetry or not).
In other words, regardless of the overall skew, about the peak it is close
to symmetric, with a similar shape.
• As an exercise in Statcrunch make plots of the histogram for p = 0.05
and let the sample size grow. See how the histogram looks less and less
skewed about the center.
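The same exercise can be approximated numerically. The sketch below uses the skewness (the third standardized moment) as a single-number summary of asymmetry; this measure is not used in the lecture and is introduced here only as an illustration, and the sample sizes are arbitrary choices. A value near zero indicates a nearly symmetric histogram.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(S_n = k) for a Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_skewness(n, p):
    """Third standardized moment, summed directly over the outcomes 0..n."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return sum(((k - mean) / sd) ** 3 * binomial_pmf(k, n, p)
               for k in range(n + 1))

p = 0.05
sample_sizes = [10, 50, 200, 500]  # arbitrary illustrative choices
skews = [binomial_skewness(n, p) for n in sample_sizes]
for n, skew in zip(sample_sizes, skews):
    print(f"n = {n:4d}: skewness = {skew:.3f}")
```

The printed skewness shrinks towards zero as n grows, which matches what the histograms show.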
Binomial p = 0.05 for various n
• In statistics it is not the number Sn out of n that we are interested in
(for example, the number of people who say they will vote for a party
out of a random sample of 500 individuals), but the proportion. In other
words, the proportion of the sample who say they will vote for a party.
This sample proportion p̂ = Sn/n is an estimator of the true proportion
p of people who will vote for a party.
• The sample proportion is random (since Sn is random). Therefore it has
a distribution. The distribution of p̂ has the same shape as that of Sn.
– The plots above show that if n is quite large and p and 1 − p are not
too small, the distribution of Sn (and p̂) looks ‘bell shaped’.
– The distribution of Sn is centered about np (the mean of the binomial)
with standard deviation √(np(1 − p)).
– The distribution of p̂ is centered about p with standard deviation
√(p(1 − p)/n).
• In other words, the distribution of the sample proportions becomes more
and more bell shaped.
• What you are seeing is the central limit theorem coming into play, that
the sample proportion becomes more bell shaped as the sample size
grows.
• Note, it is not the distribution of the sample that gets more
bell shaped as the sample size increases, but the distribution of
the sample proportion or sample mean (the estimator) that gets
more bell shaped. You cannot change the original distribution, it is
what it is. Sometimes it is bell shaped, a lot of the time it is not.
• You would have observed something similar from HW1, Q5, when you
plotted the averages of M&Ms.
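A small simulation can illustrate the point. In the sketch below the original population consists only of 0s and 1s (far from bell shaped), yet the sample proportions computed from repeated samples cluster around p with spread close to √(p(1 − p)/n). The values of p, n, the number of replications and the seed are arbitrary assumptions made for illustration.

```python
import random
import statistics
from math import sqrt

random.seed(651)  # fixed seed so the run is repeatable

def sample_proportion(n, p):
    """Draw n independent 0/1 outcomes with success chance p and
    return the proportion of 1s in the sample."""
    successes = sum(1 for _ in range(n) if random.random() < p)
    return successes / n

p, n, replications = 0.05, 200, 2000
p_hats = [sample_proportion(n, p) for _ in range(replications)]

empirical_mean = statistics.mean(p_hats)
empirical_sd = statistics.pstdev(p_hats)
theoretical_sd = sqrt(p * (1 - p) / n)

print(f"mean of sample proportions = {empirical_mean:.4f} (p = {p})")
print(f"sd of sample proportions   = {empirical_sd:.4f}")
print(f"sqrt(p(1 - p)/n)           = {theoretical_sd:.4f}")
```

Plotting a histogram of `p_hats` for increasing n would show the bell shape emerging, in the same way as the Statcrunch plots.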
What is that bell shape? The normal distribution
• We often find that the distribution of random variables that arise in
nature have a distinctive shape.
• This distinctive bell-shaped curve is called a normal distribution.
It arises all over the place:
– The distribution of bullets when fired at a target.
– The outcomes of social surveys.
– Biological data (such as the heights of women).
• The normal distribution is a family of densities which are different but
have certain characteristics in common.
• The normal distribution (sometimes called the Gaussian) is the most
commonly used distribution in statistics.
The normal distribution (cont.)
• It is completely characterised by two parameters, the mean and variance.
– The mean µ.
– The variance σ².
Formally the density function of the normal distribution looks like
(you don’t have to remember this!):
\[
y = f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
\]
This is a symmetric curve which is centered about µ with ‘spread’ σ.
See handout: normal distribution introduction.pdf.
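The density formula can be written directly as a function. The sketch below is illustrative only: the name `normal_density` and the values µ = 10, σ = 2 are assumptions and are not taken from the lecture. It checks that the curve is symmetric about µ and has its highest point there.

```python
from math import exp, pi, sqrt

def normal_density(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)

mu, sigma = 10.0, 2.0  # hypothetical values for illustration

# Symmetry: the curve has the same height 1.5 units either side of mu.
left = normal_density(mu - 1.5, mu, sigma)
right = normal_density(mu + 1.5, mu, sigma)
peak = normal_density(mu, mu, sigma)

print(f"f(mu - 1.5) = {left:.4f}, f(mu + 1.5) = {right:.4f}, f(mu) = {peak:.4f}")
```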
Different normal distributions
Fit the distribution to female human, male human, gorilla and giraffe.
Calculating probabilities
• What we want to do is use the fact that observations come from a normal
distribution to calculate probabilities. For example, suppose female
heights are normally distributed with mean 64.5 inches and standard
deviation 2.5 inches. I want to use this information to:
– Calculate things like percentiles. Jane is 71 inches tall, what is her
percentile (i.e. the proportion of people who are less than 71 inches
tall)?
– If someone is in the 90th percentile, how tall are they?
• In order to calculate these percentiles (probabilities), we need to utilize
the normality of heights.
• But first, before doing this, we need to introduce the z-transform. This
is a transformation which ‘measures’ the number of standard deviations the
data is from the mean. For example 71 inches is (71 − 64.5)/2.5 = 2.6
standard deviations from the mean.
• Once the z-transform has been evaluated, we can use the standard normal
to calculate probabilities.
• We start by calculating probabilities in the z-transform world, all of which
use the standard normal.
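The two height questions can be sketched in code. The lecture works with printed normal tables; the example below substitutes Python's standard-library `statistics.NormalDist` (available from Python 3.8) for the table lookup, and `z_transform` is a hypothetical helper name.

```python
from statistics import NormalDist

mu, sigma = 64.5, 2.5        # mean and sd of female heights (inches)
standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_transform(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Jane is 71 inches tall: z-transform first, then a standard normal probability.
z_jane = z_transform(71, mu, sigma)
percentile_jane = standard_normal.cdf(z_jane)

# 90th percentile: find the z value first, then undo the z-transform.
z_90 = standard_normal.inv_cdf(0.90)
height_90 = mu + z_90 * sigma

print(f"z for Jane = {z_jane:.1f}, percentile = {percentile_jane:.4f}")
print(f"90th percentile height = {height_90:.2f} inches")
```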
The standard normal - page 1090 of Longnecker and Ott
• The normal tables give the probabilities P(Z < z) in the special case
Z ∼ N(0, 1) (the so-called standard normal):
– mean is zero (µ = 0)
– variance is one (σ² = 1).
• Look at the normal tables. Suppose we want to use them to evaluate
P(Z < b). The row and column labels on the sides of the table together
give b, and the inside of the table yields the probability P(Z < b).
• Suppose we want to evaluate P(Z ≤ 1.23). Since 1.23 = 1.2 + 0.03, the
first column gives the 1.2 value and the first row gives the 0.03 value. We
find the 1.2 and 0.03 values and locate the value in the inside of the
table where this row and column intersect.
• This intersection point is the probability, that is P (Z ≤ 1.23) = 0.8907.
• The area under the graph is the probability, which corresponds to the
value given in the table.
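The table lookup can be mimicked in code as a check. This sketch uses `statistics.NormalDist` from the Python standard library in place of the printed table; the variable names `row_label` and `column_label` are illustrative assumptions.

```python
from statistics import NormalDist

# The table splits z into a row label (first decimal) and a column label
# (second decimal), e.g. 1.23 = 1.2 + 0.03.
row_label, column_label = 1.2, 0.03
z = row_label + column_label

# The entry inside the table is the area under the curve to the left of z.
prob = NormalDist().cdf(z)
print(f"P(Z <= {z:.2f}) = {prob:.4f}")
```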
Examples - standard normal
The following exercise will seem dry (because there are not many real
life populations which follow a standard normal distribution). However,
calculating standard normal probabilities will be very useful in calculating
nonstandard normal probabilities (which are widely used).
(a) Evaluate P(0.6 < Z ≤ 1.3).
(b) (i) P(Z ≤ −1.1), (ii) P(Z ≤ 0.6), (iii) P(Z ≤ 3.0), (iv) P(Z ≤ −2.12).
(c) How do we interpret P(Z ≤ −1.1) and P(Z ≤ 3.0)?
(d) (i) P(Z > −1.1), (ii) P(Z > 0.6), (iii) P(Z > 3.0), (iv) P(Z > −2.12).
(e) (i) P(−1.1 < Z ≤ 0.6), (ii) P(−2.12 < Z ≤ 3.0), (iii) P(−2.12 < Z ≤ 0).
Look at the handout
http://www.stat.tamu.edu/~suhasini/teaching651/standard_normal_tables.pdf
for the solutions.
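Answers obtained from the tables can be checked with three small helpers, one for each type of question in the exercise. This is a sketch using `statistics.NormalDist` rather than the tables; the helper names are hypothetical, and table answers may differ in the last digit because table entries are rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_less_equal(b):
    """P(Z <= b): area to the left of b."""
    return Z.cdf(b)

def prob_greater(a):
    """P(Z > a): one minus the area to the left of a."""
    return 1 - Z.cdf(a)

def prob_between(a, b):
    """P(a < Z <= b): area to the left of b minus area to the left of a."""
    return Z.cdf(b) - Z.cdf(a)

part_a = prob_between(0.6, 1.3)
print(f"(a)     P(0.6 < Z <= 1.3) = {part_a:.4f}")
print(f"(b)(i)  P(Z <= -1.1)      = {prob_less_equal(-1.1):.4f}")
print(f"(d)(i)  P(Z > -1.1)       = {prob_greater(-1.1):.4f}")
```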
Calculating probabilities using Statcrunch
Probabilities can also be calculated in Statcrunch. Go to Stat →
Calculators and select Normal. Put in the correct mean and standard
deviation (0 and 1 for the standard normal).
A ‘practical’ example
• Suppose that Z is the weight of an alien species (that has learnt the
mystery of zero and negative weight). We will assume that this species
of aliens has a weight distribution which follows a standard normal, with
mean zero and standard deviation (and variance) equal to one. A plot of
the density (remember this is similar to a histogram) is given below.
• From the plot you can see that an alien is equally likely to have a
positive weight as a negative weight. Indeed the chance of the alien’s
weight being positive is 50% (since the density is symmetric about zero).
• Again, by symmetry, the chance of its weight being more than 2 is the
same as the chance of it being less than −2.
• Furthermore, the weight of this alien species can take any value from
negative infinity to positive infinity.
• However, most of the time (in fact more than 99% of the time) if you
randomly select one of these aliens, their weight will be in the interval
(−3, 3). We now illustrate why:
• Draw the normal distribution and show what P(−3.0 ≤ Z ≤ 3) is on it.
Evaluate this probability.
• We see that P(−3.0 ≤ Z ≤ 3) = P(Z ≤ 3) − P(Z < −3) = 0.9987 −
0.0013 = 0.9974, which is large. Hence we are likely to draw alien weights
in this interval.
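The claims about the alien weights can be verified numerically. The sketch below again uses `statistics.NormalDist` in place of the tables. Because table entries are rounded to four decimal places, the unrounded answer for the interval can differ from the table-based one in the last digit.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Symmetry about zero: half of the weights are positive.
p_positive = 1 - Z.cdf(0)

# Symmetry again: the two tails beyond 2 and below -2 have equal area.
p_above_2 = 1 - Z.cdf(2)
p_below_minus_2 = Z.cdf(-2)

# Area between -3 and 3: area left of 3 minus area left of -3.
p_within_3 = Z.cdf(3) - Z.cdf(-3)

print(f"P(Z > 0) = {p_positive:.2f}")
print(f"P(Z > 2) = {p_above_2:.4f}, P(Z < -2) = {p_below_minus_2:.4f}")
print(f"P(-3 <= Z <= 3) = {p_within_3:.4f}")
```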
Given a probability, locating the value on the x-axis
• Suppose Z ∼ N (0, 1) (Z is a random variable with a standard normal
distribution).
Question 1 We want to find the value of t such that P(Z ≤ t) = 0.8. For
example, if the weight of a randomly selected person followed a standard
normal distribution, then P(Z ≤ t) = 0.8 means that the probability that
a randomly selected person weighs less than t (we want to find this t)
is equal to 0.8.
• Solution 1 To find t, look inside the normal tables and locate 0.8
(it is very, very important that you look inside the table, not on the
sides). Then read off the row and column labels, which together give the
value 0.84. Hence P(Z ≤ 0.84) ≈ 0.8, and t = 0.84.
• Question 2 We want to find the value of t such that P (Z > t) = 0.02.
• Solution 2 First draw this.
We know that if P (Z > t) = 0.02, then P (Z ≤ t) = 1 − 0.02 = 0.98.
Look inside the tables to find 0.98. You should see approximately 2.06.
Hence P (Z ≤ 2.06) = 0.98. Hence t = 2.06.
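Both questions run the table "backwards": from a probability to a point on the x-axis. A minimal sketch of the same idea, using the `inv_cdf` method of `statistics.NormalDist` instead of searching inside the table, is given below. The table answers (0.84 and approximately 2.06) are the nearest tabulated entries, so they agree with the computed values only to about two decimal places.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Question 1: find t with P(Z <= t) = 0.8.
t1 = Z.inv_cdf(0.8)

# Question 2: P(Z > t) = 0.02 is the same as P(Z <= t) = 1 - 0.02 = 0.98.
t2 = Z.inv_cdf(1 - 0.02)

print(f"Question 1: t = {t1:.4f}")
print(f"Question 2: t = {t2:.4f}")
```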