STA111 - Lecture 1
Welcome to STA111!
Some basic information:
• Instructor: Víctor Peña (email: [email protected])
• Course Website: http://stat.duke.edu/~vp58/sta111.
1 What is the difference between Probability and Statistics?
Unfortunately, I don’t think there is a simple answer to this question (some might even argue that there
isn’t one at all!). Justin Rising gives this answer on Quora:
In probability, we’re given a model, and asked what kind of data we’re likely to see. In statistics,
we’re given data, and asked what kind of model is likely to have generated it.
This definition is not completely satisfactory (as we will see as we learn more about probability and statistics), but it is a good start. Let’s illustrate this with a typical example (see for example the answer by John
D. Cook on StackExchange).
Suppose we have a bag with a total of 100 jelly beans. Some of them are red, and some of them are
green. The probabilist knows the proportion of red to green jelly beans, and wants to know, for example,
the probability of drawing 2 red jelly beans in a row. The statistician doesn’t know the proportion of red
to green jelly beans, and wants to estimate it after having drawn 2 red jelly beans in a row. Let’s stick with
this example for a bit. The statistician is also interested in:
1. Quantifying how precise the estimate is. Suppose that the statistician has drawn 98 jelly beans and
all of them are red. It seems clear that the estimate based on 98 beans will be more “precise”
(in some sense) than the original estimate based on a sample of 2 beans.
2. Deciding how many jelly beans to draw in order to achieve sufficient precision. Drawing
jelly beans out of a bag is boring, so the statistician might not want to draw all 100 beans and learn the
proportion with certainty; estimating the proportion sufficiently well might be enough.
3. Investigating whether the assumed probabilistic framework corresponds with reality. Imagine that
the statistician draws 30 red jelly beans in a row, but knows that the proportion of red/green jelly
beans should be roughly 50%. The statistician was planning on estimating the proportion under the
assumption that the jelly beans are mixed well. After seeing this, the statistician suspects that it might
not be a reasonable assumption – whoever put the jelly beans in that bag might have put all the green
beans first and didn’t mix the beans at all, so it is pretty likely that the first 30 beans are all red, even
if the true proportion is 50%.
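To make the contrast concrete, here is a short Python sketch of the jelly bean example (the 60 red / 40 green
split and drawing without replacement are assumptions made just for this illustration): the probabilist knows
the composition of the bag and computes the probability of 2 red beans in a row, while the statistician only
sees the draws and estimates the proportion of red beans from them.

import random

# Probabilist's side: the composition of the bag is KNOWN.
# (The 60 red / 40 green split is an assumption for this sketch.)
bag = ["red"] * 60 + ["green"] * 40

# Exact probability of drawing 2 red beans in a row (without replacement).
p_two_red = (60 / 100) * (59 / 99)
print("P(2 red in a row) =", round(p_two_red, 4))

# Statistician's side: the composition is UNKNOWN; only the draws are observed.
random.shuffle(bag)
sample = bag[:2]                                   # draw 2 beans
estimate = sample.count("red") / len(sample)
print("sample:", sample, "-> estimated proportion of red:", estimate)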
There are other types of questions statisticians are interested in. For example, some statisticians study
how, when, and under which assumptions we can infer causal relationships from data (e.g. infer a causal
relationship between smoking and cancer from data). We will talk about this later in the course.
Harvey Motulsky on StackExchange proposes the following diagram, which summarizes the section pretty well:
Probability:  General → Specific    Population → Sample    Model → Data
Statistics:   General ← Specific    Population ← Sample    Model ← Data
In this course we will cover probability first and then move on to statistics (labs will be an exception!).
2 Basic Probability
Here I will follow Chapter 1 of our textbook pretty closely.
2.1 Sets, Experiments, Sample Spaces, and Events
For us, a set is simply a collection of objects. We can define a set by listing its elements (for example,
A = {a, e, i, o, u} or B = {1, 2, 3, 4}) or by giving a complete description (for instance, A is the set of
vowels, B is the set of positive integers strictly less than 5). An experiment will be “anything” whose
outcome is not yet known to us but for which we know the set of possible outcomes in advance. The sample
space is the set of possible outcomes. An event is a subset of the sample space. For example:
• Tossing a coin twice is an experiment with sample space equal to
{heads/heads, heads/tails, tails/heads, tails/tails}.
An example of an event is “obtaining the same outcome twice”, which corresponds to the subset
{heads/heads, tails/tails}
of the sample space.
• Rolling a die is an experiment with sample space equal to {1, 2, 3, 4, 5, 6}. The event “obtaining an
odd number” corresponds to the subset {1, 3, 5} of the sample space.
• Asking ourselves whether Duke basketball will win the national championship in 2016 also counts as
an experiment, since the set of possible outcomes is known ({yes, no}) but the outcome is something
we don’t know yet.
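As a quick illustration of these definitions (a sketch, not part of the formal development), the two-coin sample
space and the event “obtaining the same outcome twice” can be written out explicitly, for example in Python:

from itertools import product

# Sample space of tossing a coin twice: all ordered pairs of outcomes.
sample_space = set(product(["heads", "tails"], repeat=2))

# The event "obtaining the same outcome twice" is a subset of the sample space.
same_twice = {outcome for outcome in sample_space if outcome[0] == outcome[1]}
print(sorted(same_twice))             # [('heads', 'heads'), ('tails', 'tails')]
print(same_twice <= sample_space)     # True: the event is a subset of the sample space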
Exercise 1. Come up with 4 examples of experiments. Specify their sample space and give an example of
an event for each of them.
2.2 Interpretations of Probability
There are many different interpretations of probability, and philosophers (and some statisticians) are still
debating the issue. Here are very rough explanations of three interpretations of probability we will use in this course:
• Principle of indifference: Break down the sample space until you can convince yourself (and others) that
there is no reason to consider one outcome more likely than another. Then assign equal probabilities
to all of them. For example, if we are rolling a die and it is “fair”, we can say that the outcomes
1,2,3,4,5,6 are equally likely because of “symmetry” or “physics”. Then, the probability of an event is
defined as (number of favorable outcomes)/(number of possible outcomes).
• Limiting frequencies: This one is easier to understand. The probability of an event can be interpreted
as the limiting frequency with which it would occur if we were to repeat the experiment ad infinitum. For
instance, we can interpret the probability of the event “rolling a die and getting a 6” as the long-run
proportion of times we get a 6 as we roll the die again and again (see the simulation sketch after this list).
Single events such as “the next time I roll a die I will get a 6” or “Duke basketball will win the national
championship in 2016” don’t fit very well here.
• Degree of belief: The probability of an event is your degree of belief that it will happen. Different
people have different opinions and, given an event, two agents can assign different probabilities. If my
beliefs about uncertain propositions are coherent and I want to update them in light of data, probability
calculus is the way to do it.
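To get a feel for the limiting-frequency interpretation, here is a small simulation sketch in Python (the seed
and the number of rolls are arbitrary choices): the running proportion of sixes settles near 1/6 as the number
of rolls grows.

import random

random.seed(111)                      # arbitrary seed, only for reproducibility
rolls = 100_000                       # "again and again", but finitely many times
sixes = 0
for i in range(1, rolls + 1):
    if random.randint(1, 6) == 6:     # roll a fair die
        sixes += 1
    if i in (10, 100, 1_000, 10_000, 100_000):
        print(f"after {i:>6} rolls, proportion of sixes = {sixes / i:.4f}")
# The printed proportions approach 1/6 ≈ 0.1667 as the number of rolls grows.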
We will not spend much time discussing the pros and cons of each of them, and our interpretation will
depend on the context. Maybe we could use different symbols for the different interpretations, but (almost)
nobody does that. If you are interested, you can take a look at this article or ask me.
It turns out that the mathematical definition of probability doesn’t depend on how we interpret it. A
probability measure will be defined as a function that maps events to numbers between 0 and 1 and satisfies
some properties. Before we introduce the mathematical definition of probability, though, we need to brush
up on some basic set theory.
2.3 Basic Set Theory
The empty set ∅ is the set containing no elements. The symbol ∈ denotes set membership and ∉ denotes
that an element is not a member of a set. If A and B are sets, A is a subset of B (A ⊂ B) if x ∈ A implies
x ∈ B. Two sets are equal if A ⊂ B and B ⊂ A. Now we define some operations on sets:
• Union: x ∈ A ∪ B if x ∈ A or x ∈ B (or both).
• Intersection: x ∈ A ∩ B if x ∈ A and x ∈ B.
• Complement: (with respect to a universal set Ω) Let A ⊂ Ω. Then x ∈ Aᶜ if x ∈ Ω but x ∉ A.
• Set difference: Let A and B be subsets of Ω. Then A \ B = A ∩ Bᶜ; that is, A \ B consists of the x ∈ Ω
such that x ∈ A and x ∉ B.
• Cardinality: |A| is the number of elements in A.
• Power set: P(A) is the collection of all subsets of A.
Two sets A and B are said to be disjoint if A ∩ B = ∅ (i.e. they have no elements in common).
Examples:
• Let Ω = {0, 1, 2, 3, 4, 5, 6, 7, 8} and A = {0, 1, 2, 3, 4}, B = {2, 3}, C = {3, 4, 5, 7}. Then A ∪ B = A,
A ∩ B = B, B ∪ C = {2, 3, 4, 5, 7}, B ∩ C = {3}, A ∪ C = {0, 1, 2, 3, 4, 5, 7}, A \ B = {0, 1, 4},
Aᶜ = {5, 6, 7, 8}, |A| = 5, |B| = 2, P(B) = {∅, {2}, {3}, {2, 3}}, etc.
• Let N₀ = {0, 1, 2, 3, ...}, O = {1, 3, 5, 7, ...}, E = {0, 2, 4, 6, ...}. Then O ⊂ N₀, E ⊂ N₀,
O ∪ E = N₀, O ∩ E = ∅, etc.
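These operations map directly onto Python's built-in set type; here is a sketch that checks the first example
above (the variable names simply mirror the notation):

Omega = {0, 1, 2, 3, 4, 5, 6, 7, 8}      # the universal set Ω
A = {0, 1, 2, 3, 4}
B = {2, 3}
C = {3, 4, 5, 7}

print(A | B == A)             # union: A ∪ B = A              -> True
print(A & B == B)             # intersection: A ∩ B = B       -> True
print(sorted(B | C))          # B ∪ C                         -> [2, 3, 4, 5, 7]
print(sorted(A - B))          # set difference A \ B          -> [0, 1, 4]
print(sorted(Omega - A))      # complement of A (w.r.t. Ω)    -> [5, 6, 7, 8]
print(len(A), len(B))         # cardinalities |A|, |B|        -> 5 2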
Exercise 2. Let Ω be the universal set and let A, B ⊂ Ω. Answer the following questions, justifying your
answers.
1. What is A ∪ Aᶜ?
2. What is A ∩ Aᶜ?
3. Assume A ⊂ B. What are A ∩ B and A ∪ B?
We finish this section with a couple of useful results:
• De Morgan’s laws: (A₁ ∪ A₂ ∪ · · · ∪ Aₙ)ᶜ = A₁ᶜ ∩ A₂ᶜ ∩ · · · ∩ Aₙᶜ and (A₁ ∩ A₂ ∩ · · · ∩ Aₙ)ᶜ =
A₁ᶜ ∪ A₂ᶜ ∪ · · · ∪ Aₙᶜ.
• Inclusion-Exclusion formula: |A ∪ B| = |A| + |B| − |A ∩ B|.
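Neither result is hard to check on a small example; here is a quick Python sanity check using the sets from
the example above (a check on one particular example, not a proof):

Omega = {0, 1, 2, 3, 4, 5, 6, 7, 8}
A = {0, 1, 2, 3, 4}
B = {2, 3}
C = {3, 4, 5, 7}

def complement(S):
    """Complement with respect to the universal set Omega."""
    return Omega - S

# De Morgan's laws for these particular sets
print(complement(A | B | C) == (complement(A) & complement(B) & complement(C)))   # True
print(complement(A & B & C) == (complement(A) | complement(B) | complement(C)))   # True

# Inclusion-exclusion: |A ∪ C| = |A| + |C| - |A ∩ C|
print(len(A | C) == len(A) + len(C) - len(A & C))                                 # True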
2.4 Mathematical Definition of Probability
Let Ω be the sample space of an experiment and let A be the collection of events, which is a suitable collection
of subsets of Ω. A probability measure P is a function that takes events in A as inputs, returns numbers as
outputs, and satisfies:
1. P(A) ≤ 1 for all A ∈ A.
2. P(Ω) = 1.
3. If A₁, A₂, ... are disjoint events, then P(∪ᵢ Aᵢ) = Σᵢ P(Aᵢ).
That is, it assigns a number between 0 and 1 to every event, the probability of the whole sample space is 1,
and the probability of a union of disjoint events equals the sum of the probabilities of those events. If A is an
event, the interpretation of P(A) is “the probability that A happens”.
Example:
• Suppose we toss a fair coin twice. Let H denote heads and T denote tails, so the sample space
is {HH, HT, TH, TT}. Since the coin is not loaded, the outcomes are all equally likely: P({HH}) = P({HT}) =
P({TH}) = P({TT}) = 1/4. The probability of obtaining the same outcome twice is P({TT} ∪ {HH}),
and since {TT} and {HH} are disjoint, we have P({TT} ∪ {HH}) = P({TT}) + P({HH}) = 1/4 + 1/4 = 1/2.
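Under the equally-likely assumption, this calculation is just counting; here is a small Python sketch of the same
computation:

from fractions import Fraction
from itertools import product

sample_space = set(product("HT", repeat=2))              # {HH, HT, TH, TT} as tuples
same_twice = {s for s in sample_space if s[0] == s[1]}   # {HH, TT}

# With equally likely outcomes, P(event) = |event| / |sample space|.
prob = Fraction(len(same_twice), len(sample_space))
print(prob)                                              # 1/2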
Properties: Let Ω be the sample space and let A, A₁, and A₂ be events:
1. P(∅) = 0.
2. 0 ≤ P(A) ≤ 1.
3. If A₁ ⊂ A₂, then P(A₁) ≤ P(A₂).
4. If Aᶜ = Ω \ A is the complement of A, then P(Aᶜ) = 1 − P(A).
5. P(A₁ ∪ A₂) = P(A₁) + P(A₂) − P(A₁ ∩ A₂).
6. P(A₁) = P(A₁ ∩ A₂) + P(A₁ ∩ A₂ᶜ).
Exercise 3. Show the properties above.
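Exercise 3 asks for proofs from the three defining conditions, but as a quick sanity check (not a proof) the
properties can also be verified numerically on a small finite sample space, for example a fair die in Python
(the specific events A1 and A2 below are arbitrary choices):

from fractions import Fraction

Omega = {1, 2, 3, 4, 5, 6}                    # sample space of rolling a fair die

def P(event):
    """Probability of an event (a subset of Omega) under equally likely outcomes."""
    return Fraction(len(event), len(Omega))

A1 = {1, 3, 5}                                # the event "odd number"
A2 = {1, 2, 3}                                # another event; A1 ∩ A2 = {1, 3}

print(P(set()) == 0)                                       # property 1
print(0 <= P(A1) <= 1)                                     # property 2
print(P({1, 3}) <= P(A1))                                  # property 3, since {1, 3} ⊂ A1
print(P(Omega - A1) == 1 - P(A1))                          # property 4
print(P(A1 | A2) == P(A1) + P(A2) - P(A1 & A2))            # property 5
print(P(A1) == P(A1 & A2) + P(A1 & (Omega - A2)))          # property 6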