A Quick Exploration of Probability for Statistics (Part 1)

Consider a statistical experiment. That's an observation (a measurement) whose outcome may not be certain beforehand. Examples are

• Ask a person about their preferred ice cream flavor
• Observe the color of the eyes of a person
• Measure the voltage produced by a battery
• Measure the body temperature of an individual
• Measure the time between bus arrivals at a bus stop

Let's call the outcome of this experiment X, which we take to be a number (if the observation is qualitative, like eye color or preferred flavor, we assume we have set up a coding convention).
In general, we don't know before actually observing or measuring what value will be taken by X, but we assume we know what values are possible. Suppose, for simplicity, that X can only take a finite number of values (e.g., body temperatures assumed to be between 95.0 and 109.0, in increments of 0.1), say, x1, x2, ..., xn. Then the result of the experiment will be X = xj if the measurement turns out to be xj (e.g., if the temperature is measured to be 98.6, the result of the experiment is X = 98.6).
Of course, we may be taking more than one observation, so each one will have its outcome, and we could consider the combined results as, for example, X = xj, Y = yk, Z = zl, and so on.

A probabilistic model is a procedure that assigns a number between 0 and 1 to the outcome of our observations, 0 representing an impossible outcome, and 1 a sure outcome. To make sure we do not run into paradoxes and inconsistencies, we do this in a specific way.
Sample Space and Events
For some fairly deep reasons, it is best to set up our model by defining a set S, which is usually called the sample space, and think of it as the collection of all possible outcomes (we don't need to be too specific about what we mean by that), and think of our outcome result X as a function mapping S to a set of numbers:

X : S → R

so that the result X = xj is the set of elements K in S such that, if s ∈ K, X(s) = xj. These functions are called random variables. You may want to quickly review your knowledge of set theory at this point.¹ We are thus interested in subsets of S, defined by things like {s | X(s) = xj}, or, more generally, {s | X(s) = xj, Y(s) = yk, Z(s) = zl}. We will call these sets events, and there are some technical assumptions that we need to make about these sets, which you can check in a more mathematically oriented report.
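To make this concrete, here is a minimal Python sketch of these ideas, using a hypothetical two-coin-toss experiment (the names S, X, and event are ours, not part of the notes): the sample space is a set of outcomes, a random variable is a function on it, and an event is the resulting subset.

    # A tiny sample space: the four outcomes of tossing two coins.
    S = {"HH", "HT", "TH", "TT"}

    # A random variable is a function from S to the numbers; here X counts heads.
    def X(s):
        return s.count("H")

    # The event "X = 1" is the subset of S whose elements are mapped to 1 by X.
    event = {s for s in S if X(s) == 1}
    print(event)  # {'HT', 'TH'} (set order may vary)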
We will associate a number between 0 and 1 to each of these sets, as suggested above. This number is called a probability, and we denote the probability of the set E as P[E]. The extreme values correspond to impossibility (probability 0) and certainty (probability 1).
Right now, temporarily, let's not worry about how we are doing the association; we just assume we can. For consistency reasons, the following properties must be satisfied:²

• If A and B are events, and if they have no common points (we write this as A ∩ B = ∅), then

P[A ∪ B] = P[A] + P[B]

• P[S] = 1, that is, something surely has to happen.
If you are familiar with Venn diagrams, and think of the probability of an event (a set) as some kind of "mass" associated with the set, you will find the following more general result obvious (it is easy to prove, from the previous statements and a bit of set theory):

P[A ∪ B] = P[A] + P[B] − P[A ∩ B]
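As a quick illustration, the following sketch checks this union formula (and the complement rule derived below) numerically, under the assumption, made purely for illustration, that the four outcomes of the two-coin space are equally likely, so that P[E] = |E|/|S|.

    # Four equally likely outcomes (an assumption made purely for illustration),
    # so that P[E] = |E| / |S|.
    S = {"HH", "HT", "TH", "TT"}

    def P(E):
        return len(E) / len(S)

    A = {"HH", "HT"}  # first coin shows heads
    B = {"HH", "TH"}  # second coin shows heads

    # P[A ∪ B] = P[A] + P[B] - P[A ∩ B]
    assert P(A | B) == P(A) + P(B) - P(A & B)

    # Complement rule (derived below): P[A^c] = 1 - P[A]
    assert P(S - A) == 1 - P(A)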
¹ We use the following symbols here. "x is an element of the set A" is written x ∈ A; "A is a subset of B" is written A ⊂ B; the set of elements that are both in A and in B (called the intersection of A and B) is written A ∩ B; the set of elements that are either in A, or in B, or in both (called the union of A and B) is written A ∪ B; the set of elements in S, our container set, that are not in A, called the complement of A, is variously denoted by S \ A, Ā, or Ac (we will use the last here); the complement of S is the empty set (the set with no elements), denoted by ∅.
² To develop the theory in a complete and effective way, there are some additional considerations to be made, especially about countable collections of events. This is not anything we have to deal with in our context, and is delegated to your mathematical probability class, if you'll take one.
Note that, when considering several observations, we are considering the intersection of the corresponding events. That is,

{s | X(s) = xj, Y(s) = yk, Z(s) = zl} = {s | X(s) = xj} ∩ {s | Y(s) = yk} ∩ {s | Z(s) = zl}    (1)
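With events represented as sets, this identity can be checked directly; here is a small sketch, again on the hypothetical two-coin space.

    # Two coins: X is the first coin, Y the second (1 for heads, 0 for tails).
    S = {"HH", "HT", "TH", "TT"}

    def X(s):
        return 1 if s[0] == "H" else 0

    def Y(s):
        return 1 if s[1] == "H" else 0

    # {s | X(s) = 1, Y(s) = 1} equals the intersection of the two single events.
    joint = {s for s in S if X(s) == 1 and Y(s) == 1}
    assert joint == {s for s in S if X(s) == 1} & {s for s in S if Y(s) == 1}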
A consequence of these properties is that, since for any event A ∪ Ac = S, P[Ac] = 1 − P[A]. In particular, the complement of S is the empty set, and P[∅] = 0, denoting the impossible event.
Conditional Probabilities and Independence

Since we usually deal with more than one observation, the joint probabilities as in expression (1) above are really important to us. In general, they may be difficult to assess, and knowing each of the individual probabilities is normally not enough.
To address these questions it turns out to be very useful to define how one observation would affect the others. In other words, if, for example, X turns out to be equal to x, how does this new knowledge affect the likelihood that Y will turn out to be equal to y?

Definition. We define the conditional probability of X = x, given that Y = y, as

P[X = x | Y = y] = P[X = x, Y = y] / P[Y = y]    (2)

Note how we need P[Y = y] > 0 for conditional probabilities to be defined (you cannot examine how the occurrence of an event which cannot occur would affect the probabilities of other events).
Typical simple examples are like the following: suppose we throw a die, and assume that each of the six possible outcomes has the same probability (hence, because all six together exhaust the sample space, which has probability 1, each has probability 1/6). Let X be the outcome of the toss. Suppose now that you don't know the value of X, but have been told that the outcome is an even number; let's denote this by Y = 0 (Y = 1 would denote that the outcome is odd). Now the probabilities for X change: for example, the outcome cannot be 1, 3, or 5, so

P[X = 1 | Y = 0] = 0

because {X = 1} ∩ {Y = 0} = ∅. On the other hand, P[Y = 1] = 1/2 (since {Y = 1} is equal to {X = 1} ∪ {X = 3} ∪ {X = 5}, each of which has probability 1/6, and which do not have common points), while {X = 1} ⊂ {Y = 1}, so that P[X = 1, Y = 1] = P[X = 1]. Combining this with (2), we find that

P[X = 1 | Y = 1] = (1/6) / (1/2) = 2/6 = 1/3
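Computations like this one are easy to double-check by enumerating the sample space. The sketch below reproduces P[X = 1 | Y = 1] = 1/3 via definition (2); the equal weights on the faces are, as above, a modeling assumption.

    from fractions import Fraction

    die = [1, 2, 3, 4, 5, 6]  # each face assumed to have probability 1/6

    def P(event):
        # classical probability: favorable faces over total faces
        return Fraction(sum(1 for x in die if event(x)), len(die))

    p_joint = P(lambda x: x == 1 and x % 2 == 1)  # P[X = 1, Y = 1]
    p_odd = P(lambda x: x % 2 == 1)               # P[Y = 1]
    print(p_joint / p_odd)                        # 1/3, as computed above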
There are a couple of consequences of (2) that are useful. The first is the equivalent formula

P[X = x, Y = y] = P[X = x | Y = y] P[Y = y]

(sometimes called the multiplication formula), which shows how we need to know conditional probabilities in order to be able to determine joint probabilities. The second consequence is called Bayes' Formula, and follows from the multiplication formula:

P[Y = y | X = x] = P[X = x | Y = y] P[Y = y] / P[X = x]    (3)
While this formula is the starting point of a different approach to statistical analysis than the one with which we will be mostly concerned, it is also an interesting formalization of so-called inductive arguments. In a separate file (look for the file Conditional.pdf), we'll show how it can be used to reach solid conclusions in some simple, but fun, problems.
We can make one additional observation about the denominator in (3). Suppose Y can take only the values y1, y2, ..., ym, with known probabilities P[Y = yk]. Then, since one (and only one) of these has to occur, we can say that the event {X = x} can be decomposed into its intersections with these m events, which have no point in common. Consequently, we will have that

P[X = x] = P[X = x | Y = y1] P[Y = y1] + P[X = x | Y = y2] P[Y = y2] + ... + P[X = x | Y = ym] P[Y = ym]

a formula that is sometimes called the Total Probability Formula, and often comes in handy when working with Bayes' Formula.
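Here is a short numerical sketch of Bayes' Formula with the Total Probability Formula supplying the denominator; the values of P[Y = yk] and P[X = x | Y = yk] are hypothetical numbers made up for illustration.

    # Hypothetical numbers, purely for illustration.
    P_Y = {"y1": 0.3, "y2": 0.7}            # known probabilities P[Y = yk]
    P_X_given_Y = {"y1": 0.9, "y2": 0.2}    # conditionals P[X = x | Y = yk]

    # Total Probability Formula for the denominator P[X = x].
    P_X = sum(P_X_given_Y[y] * P_Y[y] for y in P_Y)

    # Bayes' Formula (3): P[Y = y1 | X = x].
    posterior = P_X_given_Y["y1"] * P_Y["y1"] / P_X
    print(round(posterior, 4))  # 0.6585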
When performing repeated statistical observations we often try to avoid a situation where one outcome affects the probability of another one. For this to hold, we need

P[X = x | Y = y] = P[X = x]

and looking at (2), we see that this is equivalent to

P[X = x, Y = y] = P[X = x] P[Y = y]

When this is the case (for all x and y), we say that the random variables X and Y are independent. More generally, we say that n random variables X1, X2, ..., Xn are independent if, for all values that they can take, we have

P[X1 = x1, X2 = x2, ..., Xn = xn] = P[X1 = x1] P[X2 = x2] ... P[Xn = xn]

Referring to a previous comment, this is the case when joint probabilities can be computed from the individual probabilities alone. This simplification makes it very tempting to assume independence, even when there is no evidence for it, or, worse, when it positively is not so. Many mistakes in statistics have been caused by faulty independence assumptions.
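The sketch below illustrates both sides of the definition on the 36-point two-dice space (equal weights are again a modeling assumption): the two dice are independent of each other, while a die and its own parity are not.

    from fractions import Fraction
    from itertools import product

    # All 36 equally likely outcomes of two fair dice (a modeling assumption).
    S = list(product(range(1, 7), repeat=2))

    def P(event):
        return Fraction(sum(1 for s in S if event(s)), len(S))

    # The two dice are independent: P[X = 1, Y = 2] = P[X = 1] P[Y = 2].
    assert P(lambda s: s[0] == 1 and s[1] == 2) == \
           P(lambda s: s[0] == 1) * P(lambda s: s[1] == 2)

    # A die and its own parity are not: P[X = 1, X odd] != P[X = 1] P[X odd].
    assert P(lambda s: s[0] == 1 and s[0] % 2 == 1) != \
           P(lambda s: s[0] == 1) * P(lambda s: s[0] % 2 == 1)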
Classical Probability and Other Options

Curiously enough, the first studies in quantitative probability concerned games of chance (coin flips, dice games, card games, roulette, and so on). In most cases, the events in such games can be constructed starting from a finite set of events that we can assume have the same likelihood of happening. Typical examples are flips of a fair coin (no reason to assume that we will end up with more heads than tails or vice versa), and throws of a fair die. The assumption of not having any reason to believe that one outcome is more likely than another is called the principle of sufficient reason, and it provides a tool, when applicable, to calculate probabilities from first principles. It also leads to interesting, and sometimes surprising, results that are a lot of fun.
Note: The applicability of this approach, sometimes called Classical Probability, is limited to the case when we have a finite set of possible outcomes, and it is reasonable to assume that a basic set of these can be assigned equal probabilities. Standard examples are provided by games of chance such as those mentioned. For example, when tossing two dice, whose resulting values we write as X = i, Y = j, with i and j taking integer values from 1 to 6, it is easy to conclude that we can assign equal probabilities to the 36 events of the form {X = i, Y = j}, each having probability 1/36. With a little effort, we can then calculate the probabilities of X + Y taking specific values (for example, P[X + Y = 7] = 1/6, since this result comes from the union of the six events {X = 1, Y = 6}, {X = 2, Y = 5}, {X = 3, Y = 4}, {X = 4, Y = 3}, {X = 5, Y = 2}, {X = 6, Y = 1}).
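When Classical Probability applies, the counting can be delegated to a computer; this sketch enumerates the 36 equally likely pairs and recovers P[X + Y = 7] = 1/6.

    from fractions import Fraction
    from itertools import product

    # The 36 equally likely outcomes {X = i, Y = j}, with i, j = 1, ..., 6.
    outcomes = list(product(range(1, 7), repeat=2))

    favorable = sum(1 for i, j in outcomes if i + j == 7)
    print(Fraction(favorable, len(outcomes)))  # 1/6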
As you can see, if Classical Probability is viable, the problem reduces to a counting problem. Don't let that fact fool you into thinking that it will always be an easy problem: counting things when there are a lot of them can be extremely difficult, if we don't have several hundred or thousand years available.
In general, we will need other tools. Sometimes, using limit theorems similar to those that we will look at shortly can provide us with a priori models for complex systems (we will give a couple of examples). At other times, the same limit theorems justify the use of observed frequencies, when an experiment is repeated many times, resulting in something like an empirical probability.³ This is what we will be concerned with: using statistics to specify or to validate a specific probabilistic model.
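As a taste of this empirical point of view, the simulation below (a sketch; the sample sizes and the seed are arbitrary choices) tosses a simulated fair coin repeatedly and watches the observed frequency of Heads settle near 1/2.

    import random

    random.seed(0)  # fixed seed so the run is reproducible

    for n in (100, 10_000, 1_000_000):
        heads = sum(random.random() < 0.5 for _ in range(n))
        print(n, heads / n)  # observed frequency of Heads, settling near 0.5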
As a side note, we should remark that even this approach is not always available: especially in the social sciences we may face situations where the observation will be a one-off event (no possibility of repeating it many times), and we still would like to assign probabilities. In this case, it has been proposed to consider subjective probabilities: probabilities considered as a measure of personal confidence in the likelihood of specific results. This is an approach strongly connected with Bayesian statistics that has had a revival of interest in recent years, especially since elaborating on it often requires working with complicated probability expressions, and the advent of powerful, inexpensive computing resources has made this approach much more practical. We will not be concerned with this line of research in this class, but be aware that it is out there.
³ As we will see, if we are careful, we can interpret probabilities as is usually done, that is, as a limit value of the frequency of occurrence when we repeat our observations a sufficient number of times: we expect roughly half of a sequence of fair coin tosses to result in Heads, and half in Tails.