STP 420 SUMMER 2005
STP 420 INTRODUCTION TO APPLIED STATISTICS
NOTES PART 2 – PROBABILITY AND INFERENCE

CHAPTER 4  PROBABILITY: THE STUDY OF RANDOMNESS

4.1 Randomness

The results of tossing a coin, or of choosing an SRS, cannot be predicted in advance.

The language of probability
Random – does not mean haphazard; instead it describes a kind of order that emerges only in the long run.

Consider the experiment of tossing a coin. The proportion of tosses that give a head seems to approach 0.5 in the long run.
Eg. tossing a coin 10 times: H H T H T H T T H H
# of heads / total # of tosses = 6/10 = 0.6
As the experiment is repeated many more times, the proportion of heads approaches 0.5.

Randomness and probability
Random – individual outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions.
The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions, also called the long-term relative frequency.

Thinking about randomness
Eg. the outcome of a coin toss; a random sample.
We never observe a probability exactly, because the repetitions could go on infinitely. The repetitions must be independent trials, i.e., the outcome of one trial must not influence any other outcome. A computer can help in doing many repetitions of an experiment through simulations.

The uses of probability
Tossing coins, rolling dice, dealing shuffled cards, spinning a roulette wheel: games of chance are ancient, but they were not studied by mathematicians until the 16th and 17th centuries (Blaise Pascal and Pierre de Fermat). Gambling uses these games of chance and is still with us.

4.2 Probability models

A probability model has two parts:
1. A list of possible outcomes
2. A probability for each outcome

Sample space (S) – of a random phenomenon, the set of all possible outcomes.
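The notes above mention that a computer can do many repetitions of an experiment through simulations. A minimal sketch of the long-run proportion of heads (the function name and seed are illustrative, not from the notes):

```python
import random

def proportion_of_heads(n_tosses, seed=1):
    """Simulate n_tosses fair-coin tosses; return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The proportion is erratic for small n but settles near 0.5 in the long run.
for n in (10, 1_000, 100_000):
    print(n, proportion_of_heads(n))
```

With only 10 tosses the proportion can easily be 0.3 or 0.7; with 100,000 tosses it reliably lands within about a percent of 0.5, which is the long-term relative frequency the notes describe.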
Toss a coin: S = {heads, tails} or S = {H, T} – 2 different outcomes.

"Toss a coin 4 times" is vague until we list the outcomes:
HHHH
HHHT HHTH HTHH THHH
HHTT HTHT HTTH THHT THTH TTHH
HTTT THTT TTHT TTTH
TTTT
– 16 different outcomes

More compact is to count the number of heads in the 4 tosses, called a random variable:
S = {0, 1, 2, 3, 4}
The proportion of times we get a given number of heads in 4 tosses equals the probability of that number of heads:
P(0 heads) = 1/16
P(1 head) = 4/16
P(2 heads) = 6/16
P(3 heads) = 4/16
P(4 heads) = 1/16

Intuitive probability
We need to assign probabilities to single outcomes and to sets of outcomes (events).
Event – an outcome or set of outcomes of a random phenomenon (a subset of the sample space).
1. Any probability is between 0 and 1, since all proportions must be between 0 and 1.
2. All possible outcomes together must have probability 1.
3. The probability that an event does not occur is 1 minus the probability that the event does occur.
4. If two events have no outcomes in common, the probability that one or the other occurs is the sum of their individual probabilities.

Probability rules
1. The probability P(A) of any event A satisfies 0 ≤ P(A) ≤ 1.
2. If S is the sample space in a probability model, then P(S) = 1.
3. The complement of any event A is the event that A does not occur, written A^c; P(A^c) = 1 – P(A).
4. Two events A and B are disjoint if they have no outcomes in common and so can never occur simultaneously.
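The 16-outcome sample space and the complement rule above can be checked by brute-force enumeration (a sketch; the code is not part of the original notes):

```python
from itertools import product
from collections import Counter

# All 16 equally likely outcomes of tossing a coin 4 times.
sample_space = list(product("HT", repeat=4))
assert len(sample_space) == 16

# X = number of heads; tally how many of the 16 outcomes give each value of X.
heads_counts = Counter(outcome.count("H") for outcome in sample_space)
print(heads_counts)  # counts 1, 4, 6, 4, 1 for X = 0, 1, 2, 3, 4

# Complement rule: P(at least one head) = 1 - P(no heads) = 1 - 1/16.
p_no_heads = heads_counts[0] / len(sample_space)
print(1 - p_no_heads)  # 0.9375
```

The tally reproduces the 1/16, 4/16, 6/16, 4/16, 1/16 distribution listed above.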
If A and B are disjoint, then P(A or B) = P(A) + P(B) – the addition rule for disjoint events.
These rules can easily be seen in a Venn diagram.

Assigning probabilities: finite number of outcomes (finite sample space)
Assign a probability (between 0 and 1) to each individual outcome; the sum of these probabilities must equal 1. The probability of an event is then the sum of the probabilities of the outcomes making up the event.

Assigning probabilities: equally likely outcomes
Equally likely outcomes arise from some balanced phenomenon. Eg.
1. The two faces of a coin (equally shaped, so the coin seems equally likely to land on either face)
2. The six faces of a die
3. The 10 digits in a random number table

Equally likely outcomes
If a random phenomenon has k possible outcomes, all equally likely, then each individual outcome has probability 1/k. The probability of any event A is
P(A) = (count of outcomes in A) / (count of outcomes in S) = (count of outcomes in A) / k

Independence and the multiplication rule
Two events A and B are independent if knowing that one occurs does not change the probability that the other occurs.
If A and B are independent, then P(A and B) = P(A)P(B) – the multiplication rule for independent events.

4.3 Random variables

Random variable – a variable whose value is a numerical outcome of a random phenomenon.

Discrete random variables
A discrete random variable X has a finite number of possible values. Its probability distribution is:

Value of X:   x1  x2  x3  …  xk
Probability:  p1  p2  p3  …  pk

The probabilities pi must satisfy:
1. 0 ≤ pi ≤ 1
2. p1 + p2 + … + pk = 1

The probability of an event A is the sum of the probabilities pi of the particular xi making up the event.

Probability histogram – a histogram with probabilities on the vertical axis and the outcomes on the horizontal axis.

Continuous random variables
A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve.
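The equally-likely counting formula and the multiplication rule for independent events can both be verified by enumerating two die rolls (an illustrative sketch, not from the notes; exact fractions avoid rounding):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(A) = (count of outcomes in A) / (count of outcomes in S)."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

# A: first die shows 6; B: second die shows 6.
p_a = prob(lambda r: r[0] == 6)                    # 1/6
p_b = prob(lambda r: r[1] == 6)                    # 1/6
p_both = prob(lambda r: r[0] == 6 and r[1] == 6)   # 1/36

# Independence: P(A and B) = P(A) * P(B).
print(p_both == p_a * p_b)  # True
```

Because the two dice do not influence each other, the probability of double sixes factors into (1/6)(1/6) = 1/36, exactly as the counting gives.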
The probability of any event A is the area under the density curve above the values of X that make up A.
Remember that P(X = a) = 0 for any single outcome a of a continuous distribution X. We therefore work with intervals instead, computing the area under the curve over the interval. The total area under a density curve is 1, corresponding to the total probability.

Normal distributions as probability distributions
There are infinitely many normal distributions, X ~ N(μ, σ), where μ specifies the mean and σ specifies the standard deviation. Standardizing any of them gives the standard normal distribution Z ~ N(0, 1), with mean 0 and standard deviation 1, and we can then use the standard normal tables to compute areas (probabilities).
Standard normal random variable: Z = (X – μ)/σ

4.4 Means and variances of random variables

The mean of a probability distribution is denoted μ. The mean of a random variable is also called its expected value.

The mean of a random variable
If X is a discrete random variable with distribution

Value of X:   x1  x2  x3  …  xk
Probability:  p1  p2  p3  …  pk

the mean of X is
μ_X = x1p1 + x2p2 + … + xkpk = Σ xipi
It is the sum of the products of the outcomes and their respective probabilities.

Statistical estimation and the law of large numbers

Law of large numbers
Draw independent observations at random from any population with finite mean μ. Decide how accurately you would like to estimate μ. As the number of observations drawn increases, the mean x̄ of the observed values eventually approaches the mean μ of the population as closely as you specified and then stays that close.

Thinking about the law of large numbers
The law of large numbers says that, as the number of trials increases, the long-run proportion of an outcome approaches a fixed value.
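The standardize-then-look-up-the-table step from the normal-distribution section above can be sketched in code, with the error function standing in for the printed Z table (function names are illustrative, not from the notes):

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function (replaces the table)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob_below(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma): standardize, then use the Z 'table'."""
    z = (x - mu) / sigma           # Z = (X - mu) / sigma
    return standard_normal_cdf(z)

# Eg. X ~ N(100, 15): P(X <= 115) = P(Z <= 1), about 0.8413 from the table.
print(round(normal_prob_below(115, 100, 15), 4))
```

The computation mirrors working by hand: convert x to a z-score first, then evaluate the standard normal area, so one function covers every N(μ, σ).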
For a coin, P(H) = P(T) = 1/2, since there are 2 equally likely outcomes; for a die, P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6, since there are 6 equally likely outcomes.

Law of small numbers
For a small number of trials, the resulting proportions may be very different from the long-run probabilities. This can be misleading, and one has to be careful when making decisions or drawing conclusions from a small number of trials.

Beyond the basics – more laws of large numbers
Is there a winning system for gambling? People create their own structures for determining how much to bet from play to play. What if observations are not independent?

Rules for means
1. If X is a random variable and a and b are fixed numbers, then μ_(a+bX) = a + bμ_X.
2. If X and Y are random variables, then μ_(X+Y) = μ_X + μ_Y.

Variance of a discrete random variable
Suppose X is a discrete random variable whose distribution is

Value of X:   x1  x2  x3  …  xk
Probability:  p1  p2  p3  …  pk

and that μ_X is the mean of X. The variance of X is
σ²_X = (x1 – μ_X)²p1 + (x2 – μ_X)²p2 + … + (xk – μ_X)²pk = Σ (xi – μ_X)²pi
The standard deviation σ_X of X is the square root of the variance.

Rules for variances
1. If X is a random variable and a and b are fixed numbers, then σ²_(a+bX) = b²σ²_X.
2. If X and Y are independent random variables, then σ²_(X+Y) = σ²_X + σ²_Y and σ²_(X–Y) = σ²_X + σ²_Y.
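The mean and variance formulas, and rule 1 for variances, can be checked numerically using the 4-toss heads distribution from earlier in these notes (a sketch; the constants a and b are arbitrary choices for illustration):

```python
# Distribution of X = number of heads in 4 tosses (from the notes above).
xs = [0, 1, 2, 3, 4]
ps = [1/16, 4/16, 6/16, 4/16, 1/16]

mu = sum(x * p for x, p in zip(xs, ps))                # mean: sum of xi*pi
var = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))   # variance: sum of (xi - mu)^2 * pi
print(mu, var)  # 2.0 1.0

# Rule 1 for variances: Var(a + bX) = b^2 * Var(X); try a = 3, b = 2.
a, b = 3, 2
var_abx = sum(((a + b * x) - (a + b * mu)) ** 2 * p for x, p in zip(xs, ps))
print(var_abx == b ** 2 * var)  # True
```

Shifting by a does not change the spread, while scaling by b multiplies the variance by b², which is exactly what the direct computation confirms.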