Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 5: Probability Concepts April 17 In Chapter 5: 5.1 What is Probability? 5.2 Types of Random Variables 5.3 Discrete Random Variables 5.4 Continuous Random Variables 5.5 More Rules and Properties of Probability Definitions • Random variable ≡ a numerical quantity that takes on different values depending on chance • Population ≡ the set of all possible values for a random variable • Event ≡ an outcome or set of outcomes • Probability ≡ the proportion of times an event is expected to occur in the population Ideas about probability are founded on relative frequencies (proportions) in populations. Probability Illustrated • In a given year, there were 42,636 traffic fatalities in a population of N = 293,655,000 • If I randomly select a person from this population, what is the probability they will experience a traffic fatality by the end of that year? ANS: The relative frequency of this event in the population = 42,636/ 293,655,000 = 0.0001452. Thus, Pr(traf. fatality) = 0.0001452 (about 1 in 6887 1/.0001452) Probability as a repetitive process Experiments sample a population in which 20% of observations are positives. This figure shows two such experiments. The sample proportion approaches the true probability of selection as n increases. Subjective Probability Probability can be used to quantify a level of belief Probability Verbal expression 0.00 Never 0.05 Seldom 0.20 Infrequent 0.50 As often as not 0.80 Very frequent 0.95 Highly likely 1.00 Always §5.2: Random Variables • Random variable ≡ a numerical quantity that takes on different values depending on chance • Two types of random variables • Discrete random variables: a countable set of possible outcome (e.g., the number of cases in an SRS from the population) • Continuous random variable: an unbroken continuum of possible outcome (e.g., the average weight of an SRS of newborns selected from the population (Xeno’s paradox…) • §5.3: Discrete Random Variables Probability mass function (pmf) ≡ a mathematical relation that assigns probabilities to all possible outcomes for a discrete random variables • Illustrative example: “Four Patients”. Suppose I treat four patients with an intervention that is successful 75% of the time. Let X ≡ the variable number of success in this experiment. This is the pmf for this random variable: x 0 1 2 3 4 Pr(X=x) 0.0039 0.0469 0.2109 0.4219 0.3164 Discrete Random Variables The pmf can be shown in tabular or graphical form x 0 1 2 3 4 Pr(X=x) 0.0039 0.0469 0.2109 0.4219 0.3164 Properties of Probabilities • Property 1. Probabilities are always between 0 and 1 • Property 2. A sample space is all possible outcomes. The probabilities in the sample space sum to 1 (exactly). • Property 3. The complement of an event is “the event not happening”. The probability of a complement is 1 minus the probability of the event. • Property 4. Probabilities of disjoint events can be added. Properties of Probabilities In symbols • Property 1. 0 ≤ Pr(A) ≤ 1 • Property 2. Pr(S) = 1, where S represent the sample space (all possible outcomes) • Property 3. Pr(Ā) = 1 – Pr(A), Ā represent the complement of A (not A) • Property 4. If A and B are disjoint, then Pr(A or B) = Pr(A) + Pr(B) Properties 1 & 2 Illustrated “Four patients” pmf Property 1. 0 ≤ Pr(A) ≤ 1 Note that all individual probabilities are between 0 and 1. Property 2. Pr(S) = 1 Note that the sum of all probabilities = .0039 + .0469 + .2109 + .4219 + .3164 = 1 Property 3 Illustrated Property 3. Pr(Ā) = 1 – Pr(A), “Four patients” pmf As an example, let A represent 4 successes. Pr(A) = .3164 Ā A Let Ā represent the complement of A (“not A”), which is “3 or fewer”. Pr(Ā) = 1 – Pr(A) = 1 – 0.3164 = 0.6836 Property 4 Illustrated “Four patients” pmf Property 4. Pr(A or B) = Pr(A) + Pr(B) for disjoint events Let A represent 4 successes Let B represent 3 successes B A Since A and B are disjoint, Pr(A or B) = Pr(A) + Pr(B) = 0.3164 + 0.4129 = 0.7293. The probability of observing 3 or 4 successes is 0.7293 (about 73%). Area Under the Curve (AUC) • The area under curves (AUC) on a pmf corresponds to probability • In this figure, Pr(X = 2) = area of shaded region = height × base = .2109 × 1.0 = 0.2109 “Four patients” pmf Cumulative Probability • “Cumulative probability” refers to probability of that value or less • Notation: Pr(X ≤ x) • Corresponds to AUC to the left of the point (“left tail”) .2109 .0469 .0039 Example: Pr(X ≤ 2) = shaded “tail” = 0.0039 + 0.0469 + 0.2109 = 0.2617 §5.4 Continuous Random Variables Continuous random variables form a continuum of possible values. As an illustration, consider the spinner in this illustration. This spinner will generate a continuum of random numbers between 0 to 1 §5.4: Continuous Random Variables A probability density functions (pdf) is a mathematical relation that assigns probabilities to all possible outcomes for a continuous random variable. The pdf for our random spinner is shown here. 0.5 The shaded area under the curve represents probability, in this instance: Pr(0 ≤ X ≤ 0.5) = 0.5 Examples of pdfs • pdfs obey all the rules of probabilities • pdfs come in many forms (shapes). Here are some examples: Uniform pdf Normal pdf Chi-square pdf Exercise 5.13 pdf The most common pdf is the Normal. (We study the Normal pdf in detail in the next chapter.) Area Under the Curve • As was the case with pmfs, pdfs display probability with the area under the curve (AUC) • This histogram shades bars corresponding to ages ≤ 9 (~40% of histogram) • This shaded AUC on the Normal pdf curve also corresponds to ~40% of total. x 12 1 f ( x) e 2 2