STP 420 SUMMER 2005
STP 420
INTRODUCTION TO APPLIED STATISTICS
NOTES
PART 2 – PROBABILITY AND INFERENCE
CHAPTER 4
PROBABILITY: THE STUDY OF RANDOMNESS
4.1
Randomness
The results of tossing a coin or choosing an SRS cannot be predicted in advance.
The language of probability
Random – does not mean haphazard but instead is a description of some kind of order
that emerges only in the long run
Consider the experiment of tossing a coin. The proportion of tosses that give a head
seems to approach 0.5 in the long run.
E.g., tossing a coin 10 times: H H T H T H T T H H
# of heads/total # of tosses = 6/10 = 0.6
As the experiment is repeated many times it seems that the proportion of heads
approaches 0.5.
Randomness and probability
Random – individual outcomes are uncertain but there is nonetheless a regular
distribution of outcomes in a large number of repetitions.
The probability of any outcome of a random phenomenon is the proportion of times the
outcome would occur in a very long series of repetitions also called long-term relative
frequency.
Thinking about randomness
Outcome of a coin toss
Random sample
We never really observe a probability exactly, because the repetitions could go on
infinitely.
These repetitions are independent trials, i.e., the outcome of one trial must not influence
any other outcome.
The computer can help in doing many repetitions of an experiment through simulations.
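As a minimal sketch of such a simulation (the function name and seed are illustrative, not from the notes), we can toss a simulated fair coin many times and watch the proportion of heads settle near 0.5:

```python
import random

def proportion_heads(n_tosses, seed=0):
    """Toss a fair coin n_tosses times; return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The proportion is erratic for small n but settles near 0.5 as n grows.
for n in (10, 1000, 100000):
    print(n, proportion_heads(n))
```

For small `n` the proportion can be far from 0.5 (as in the 6/10 example above); only the long-run proportion is pinned down.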
The uses of probability
Tossing coin, tossing dice, dealing shuffled cards, spinning a roulette wheel
Games of chance are ancient but were not studied by mathematicians until the 16th and 17th
centuries (Blaise Pascal and Pierre de Fermat).
Gambling uses these games of chance, which are still with us.
4.2
Probability models
A probability model has two parts:
1. A list of possible outcomes
2. A probability for each outcome
Sample space (S) – of a random phenomenon is the set of all possible outcomes.
Eg. Toss a coin
S = {heads, tails} or S = {H, T} – 2 different outcomes
"Toss a coin 4 times" is a vaguer description of the outcome. Listing every sequence of
heads and tails, the outcomes are:
HHHH
HHHT
HHTH
HTHH
THHH
HHTT
HTHT
HTTH
THHT
THTH
TTHH
HTTT
THTT
TTHT
TTTH
TTTT
- 16 different outcomes
More useful may be counting the number of heads in 4 tosses; this count is called a random variable
S = {0, 1, 2, 3, 4}
The long-run proportion of getting 0 heads in 4 tosses, i.e. the probability of 0 heads, is 1/16
The probability of getting 1 head in 4 tosses is 4/16
The probability of getting 2 heads in 4 tosses is 6/16
The probability of getting 3 heads in 4 tosses is 4/16
The probability of getting 4 heads in 4 tosses is 1/16
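These probabilities can be checked by brute force. A short sketch (not part of the notes) enumerates all 16 equally likely sequences and counts heads in each:

```python
from itertools import product
from collections import Counter

# Enumerate all 2**4 = 16 equally likely sequences of 4 tosses
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]
counts = Counter(seq.count("H") for seq in outcomes)

# Probability of k heads = (# sequences with k heads) / 16
probs = {k: counts[k] / 16 for k in range(5)}
print(probs)  # {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}
```

The counts 1, 4, 6, 4, 1 recover exactly the fractions 1/16, 4/16, 6/16, 4/16, 1/16 listed above.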
Intuitive probability
We need to assign probabilities to single outcomes and to sets of outcomes (events)
Event – outcome or set of outcomes of a random phenomenon (subset of a sample space)
1. Any probability is between 0 and 1, since all proportions must be between 0 and 1.
2. All possible outcomes together must have a probability of 1.
3. The probability that an event does not occur is 1 minus the probability that the
   event does occur.
4. If two events have no outcomes in common, the probability that one or the other
   occurs is the sum of their individual probabilities.
Probability rules
1. The probability P(A) of any event A satisfies 0 ≤ P(A) ≤ 1.
2. If S is the sample space in a probability model, then P(S) = 1.
3. The complement of any event A is the event that A does not occur (Ac):
   P(Ac) = 1 − P(A)
4. Two events A and B are disjoint if they have no outcomes in common and so can
   never occur simultaneously. If A and B are disjoint,
   P(A or B) = P(A) + P(B)   (addition rule for disjoint events)
These rules can be easily seen in a Venn diagram
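The rules can also be checked on a small finite sample space. The sketch below (events chosen for illustration) uses one roll of a fair die, with exact fractions so nothing is lost to rounding:

```python
from fractions import Fraction

# Sample space for one roll of a fair die; each outcome has probability 1/6
S = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event & S), len(S))

A = {2, 4, 6}   # roll is even
B = {1, 3}      # roll is 1 or 3; A and B are disjoint

assert 0 <= P(A) <= 1              # Rule 1
assert P(S) == 1                   # Rule 2
assert P(S - A) == 1 - P(A)        # Rule 3: complement rule
assert P(A | B) == P(A) + P(B)     # Rule 4: addition rule for disjoint events
print(P(A), P(A | B))  # 1/2 5/6
```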
Assigning probabilities: finite number of outcomes (finite sample space)
Assign a probability (must be between 0 and 1) to each individual outcome. Sum of these
probabilities must equal 1. The probability of an event is the sum of the probabilities of
the outcomes making up the event.
Assigning probabilities: equally likely outcomes
Equally likely is based on some balanced phenomenon.
Eg.
1. The two faces on a coin (equally shaped, so the coin seems equally likely to land
   on either face)
2. The six faces on a die
3. The 10 digits in a random number table
Equally likely outcomes
If a random phenomenon has k possible outcomes, all equally likely, then each individual
outcome has probability 1/k. The probability of any event A is
P(A) = (count of outcomes in A) / (count of outcomes in S) = (count of outcomes in A) / k
Independence and the multiplication rule
Two events A and B are independent if knowing that one occurs does not change the
probability that the other occurs. If A and B are independent,
P(A and B) = P(A)P(B) is the multiplication rule for independent events
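A quick check of the multiplication rule (an illustrative sketch, not from the notes): for two independent fair coin tosses, enumerate the 4 equally likely outcomes and compare P(A and B) with P(A)P(B):

```python
from itertools import product
from fractions import Fraction

# Two independent fair coin tosses: 4 equally likely outcomes
S = list(product("HT", repeat=2))
P = lambda event: Fraction(sum(1 for o in S if event(o)), len(S))

A = lambda o: o[0] == "H"   # first toss is heads
B = lambda o: o[1] == "H"   # second toss is heads

# Because the tosses are independent, the probabilities multiply
assert P(lambda o: A(o) and B(o)) == P(A) * P(B) == Fraction(1, 4)
```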
4.3
Random variables
Random variable – variable whose value is a numerical outcome of a random
phenomenon
Discrete random variables
Discrete random variable X has a finite number of possible values. The probability
distribution is:
Value of X:    x1  x2  x3  …  xk
Probability:   p1  p2  p3  …  pk
The probabilities pi must satisfy:
1. 0 ≤ pi ≤ 1
2. p1 + p2 + … + pk = 1
The probability of an event A is the sum of the probabilities pi of the particular xi making
up the event.
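As a sketch (reusing the 4-toss distribution from earlier; the event chosen is illustrative), a discrete distribution can be stored as a dict of exact probabilities, and the probability of an event is the sum of the relevant pi:

```python
from fractions import Fraction

# Distribution of X = number of heads in 4 tosses of a fair coin
dist = {0: Fraction(1, 16), 1: Fraction(4, 16), 2: Fraction(6, 16),
        3: Fraction(4, 16), 4: Fraction(1, 16)}

assert sum(dist.values()) == 1   # requirement 2: the p_i sum to 1

# P(A) for the event A = "at least 3 heads" is the sum of the relevant p_i
p_at_least_3 = sum(p for x, p in dist.items() if x >= 3)
print(p_at_least_3)  # 5/16
```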
Probability histogram – histogram having probabilities as the vertical axis and the
outcomes as the horizontal axis
Continuous random variables
Continuous random variable X takes all values in an interval of numbers. The
probability distribution of X is described by a density curve. The probability of any
event A is the area under the density curve and above the values of X that make up the
event A.
Remember that P(X = a) = 0 for any outcome a in a continuous distribution X
We have to work with intervals instead so that we can compute the area under the curve
on that interval.
Also, the total area under a density curve is 1, corresponding to the fact that the total
probability of all outcomes is 1.
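For the simplest continuous case, the uniform distribution on [0, 1], the density is flat at height 1 and areas are just rectangle widths. A sketch (the helper name is illustrative):

```python
def p_interval(a, b):
    """P(a < X < b) for X uniform on [0, 1]: the area under the
    density curve f(x) = 1 between a and b, clipped to [0, 1]."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)

assert p_interval(0.25, 0.75) == 0.5   # rectangle of width 0.5, height 1
assert p_interval(0.5, 0.5) == 0.0     # P(X = a) = 0 for any single point a
```

The second assertion illustrates the point above: a single value has zero area, so only intervals carry probability.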
Normal distributions as probability distributions
There are infinitely many normal distributions, X ~ N(μ, σ), where μ specifies the mean
and σ specifies the standard deviation.
Standardizing each of these normal distributions gives us the standard normal
distribution, Z is N(0, 1) where the mean is 0 and the standard deviation is 1, and we can
then use the standard normal tables to compute these areas (probabilities).
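The same standardization can be done in code instead of a table. The sketch below (the numbers 100 and 15 are an illustrative example, not from the notes) expresses Φ(z) through the error function, Φ(z) = (1 + erf(z/√2))/2:

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma), by standardizing to Z ~ N(0, 1)."""
    z = (x - mu) / sigma                  # Z = (X - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P(X < 115) for X ~ N(100, 15) equals P(Z < 1) ≈ 0.8413,
# the same value a standard normal table gives for z = 1.00
print(round(normal_cdf(115, 100, 15), 4))  # 0.8413
```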
Standard normal random variable: Z = (X − μ) / σ
4.4
Means and variances of random variables
The mean of a probability distribution is μ
The mean of a random variable is called the expected value
The mean of a random variable
If X is a discrete random variable with distribution
Value of X:    x1  x2  x3  …  xk
Probability:   p1  p2  p3  …  pk
The mean of X is μX = x1p1 + x2p2 + … + xkpk = Σ xipi
It is the sum of the products of the outcomes and their respective probabilities.
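This sum of products is a one-line computation. A sketch using the 4-toss distribution from earlier (exact fractions keep the answer clean):

```python
from fractions import Fraction

def mean(dist):
    """Expected value: sum of outcome * probability over the distribution."""
    return sum(x * p for x, p in dist.items())

# X = number of heads in 4 fair tosses
dist = {0: Fraction(1, 16), 1: Fraction(4, 16), 2: Fraction(6, 16),
        3: Fraction(4, 16), 4: Fraction(1, 16)}
print(mean(dist))  # 2
```

The expected number of heads in 4 fair tosses is 2, matching the intuition that half the tosses should be heads.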
Statistical estimation and the law of large numbers
Law of large numbers
Draw independent observations at random from any population with finite mean μ.
Decide how accurately you would like to estimate μ. As the number of observations
drawn increases, the mean x̄ of the observed values eventually approaches the mean μ
of the population as closely as you specified and then stays that close.
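A simulation sketch of the law (seed and sample sizes are illustrative): roll a fair die repeatedly and watch x̄ drift toward the population mean μ = 3.5:

```python
import random

def sample_mean(n, seed=1):
    """Mean of n independent rolls of a fair die (population mean mu = 3.5)."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# x-bar wanders for small n but closes in on mu = 3.5 as n grows
for n in (100, 10000, 1000000):
    print(n, sample_mean(n))
```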
Thinking about the law of large numbers
The law of large numbers states that, as the number of trials increases, the observed
proportion of an outcome approaches a fixed value in the long run.
Eg.
for a coin, P(H) = P(T) = ½ since there are 2 equally likely outcomes
for a die, P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6 since there are 6 equally likely
outcomes
Law of small numbers
For a small number of trials, the observed proportions may be very different from the
long-run probabilities. This can be misleading, and one has to be careful when making
decisions or drawing conclusions based on a small number of trials.
Beyond the basics – more laws of large numbers
Is there a winning system for gambling?
People create their own structures for determining how much to bet from play to play.
What if observations are not independent?
Rules for means
1. If X is a random variable and a and b are fixed numbers, then
   μ(a+bX) = a + b·μX
2. If X and Y are random variables, then μ(X+Y) = μX + μY
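Both rules can be verified by direct enumeration. A sketch (the distributions are illustrative; note that rule 2 needs no independence, but the joint enumeration below assumes it so probabilities multiply):

```python
from fractions import Fraction

dist_X = {1: Fraction(1, 2), 3: Fraction(1, 4), 5: Fraction(1, 4)}
dist_Y = {0: Fraction(1, 2), 2: Fraction(1, 2)}
mu_X = sum(x * p for x, p in dist_X.items())
mu_Y = sum(y * p for y, p in dist_Y.items())

# Rule 1: mu_{a+bX} = a + b*mu_X, checked by transforming every outcome
a, b = 2, 3
mu_transformed = sum((a + b * x) * p for x, p in dist_X.items())
assert mu_transformed == a + b * mu_X

# Rule 2: mu_{X+Y} = mu_X + mu_Y (here X, Y independent, so P(x,y) = px*py)
mu_sum = sum((x + y) * px * py
             for x, px in dist_X.items() for y, py in dist_Y.items())
assert mu_sum == mu_X + mu_Y
```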
Variance of a Discrete Random Variable
Suppose that X is a discrete random variable whose distribution is
Value of X:    x1  x2  x3  …  xk
Probability:   p1  p2  p3  …  pk
and that μX is the mean of X. The variance of X is
σX² = (x1 − μX)²p1 + (x2 − μX)²p2 + … + (xk − μX)²pk = Σ (xi − μX)²pi
The standard deviation σX of X is the square root of the variance.
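As a sketch, the definition translates directly into code; for the 4-toss distribution the variance comes out to exactly 1 (the binomial value n·p·(1−p) = 4 · ½ · ½):

```python
from fractions import Fraction

def mean_var(dist):
    """Mean and variance of a discrete distribution {x_i: p_i}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

# X = number of heads in 4 fair tosses
dist = {0: Fraction(1, 16), 1: Fraction(4, 16), 2: Fraction(6, 16),
        3: Fraction(4, 16), 4: Fraction(1, 16)}
print(mean_var(dist))  # mean = 2, variance = 1
```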
Rules for Variances
1. If X is a random variable and a and b are fixed numbers, then
   σ²(a+bX) = b²σX²
2. If X and Y are independent random variables, then
   σ²(X+Y) = σX² + σY²  and  σ²(X−Y) = σX² + σY²
   (the addition rules for variances)