
BIOINF 2118 01-Introduction to Probability 2013-01-08

Probability and Statistics

Probability: Process → Data
"Given a process or mechanism, after many repetitions what kinds of outcomes (data) can we expect?"

Statistics: Data → Process
"Given some data, what can we say about the process or mechanism that gave rise to the data?"

Example: Diagnostic testing

A patient arrives in the clinic. The doctor suspects that the patient suffers from a particular illness. The true state, "healthy" or "sick", is unknown; therefore the doctor orders a diagnostic test.

True state of patient = $\theta$ = "the process".
The result = $X$ = "the data".

Top arrow: probability: If we knew the process, then the probabilities of the result would be "known" (at least roughly, from previous patients' data).
Bottom arrow: statistics: After the test, we still do not know the true state of the patient, but from the data $X$ we now have better knowledge.

                         DATA
PROCESS                  X=negative   X=positive   X=indeterminate   TOTAL
$\theta$ = "healthy"     0.95         0.03         0.02              1.00
$\theta$ = "sick"        0.03         0.95         0.02              1.00

TABLE 1: each row is a model; the collection of rows is a model family.

Here $\theta$ is the unknown "true state of nature", and $X$, the test result, is an observation. A test result $X$ is generated by a process under a particular state of nature $\theta$.

The sample space is the set of possible observations, $\mathcal{X}$ = {negative, positive, indeterminate}.
The parameter space is the set of possible "states of nature", $\Theta$ = {healthy, sick}.

Each table entry is a conditional probability $\Pr(X = x \mid \theta)$. The "healthy" row is a probability distribution, $\Pr(\cdot \mid \theta = \text{healthy})$; the "sick" row is another probability distribution, $\Pr(\cdot \mid \theta = \text{sick})$. For example, if $\theta$ = "healthy", the probability distribution is: Pr(X=negative) = 0.95, Pr(X=positive) = 0.03, Pr(X=indeterminate) = 0.02. The pair of rows is a model family (or a model).

Each column is a likelihood function, $L$. (We write $L : \Theta \to \mathbb{R}$.) For example, if X=negative is observed, then $L(\theta = \text{healthy}) = 0.95$ and $L(\theta = \text{sick}) = 0.03$.
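The row and column readings of TABLE 1 can be checked with a short sketch. It is written in Python purely for illustration; the names `model` and `likelihood` are hypothetical and do not come from the notes. Each row is stored as a distribution over test results, and a likelihood function is obtained by fixing an observed result and reading down its column.

```python
# TABLE 1 as a nested dict: model[theta][x] = Pr(X = x | theta).
# The numbers are copied from TABLE 1; the names are illustrative only.
model = {
    "healthy": {"negative": 0.95, "positive": 0.03, "indeterminate": 0.02},
    "sick":    {"negative": 0.03, "positive": 0.95, "indeterminate": 0.02},
}

def likelihood(x):
    """Fix the observation x and return {theta: Pr(X = x | theta)} (one column)."""
    return {theta: dist[x] for theta, dist in model.items()}

# Each row is a probability distribution, so it sums to one.
row_sums = {theta: round(sum(dist.values()), 10) for theta, dist in model.items()}

# A column is a likelihood function; its values need not sum to one.
lik_negative = likelihood("negative")
column_sum = round(sum(lik_negative.values()), 10)

print(row_sums)
print(lik_negative)
print(column_sum)
```

The rows sum to one because each is a distribution over the sample space; the "negative" column sums to less than one, which is one way to see that a likelihood is not a distribution over the parameter space.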
The values of a likelihood function are NOT probabilities for $\theta$. (The columns of TABLE 1 do not add to one.)

Now suppose that the prevalence of the disease is 10%. Prevalence = $\Pr(\theta = \text{sick})$. The following table is the joint distribution of $\theta$ and $X$.

                    $\theta$ = "healthy"   $\theta$ = "sick"   TOTAL
X=negative          0.855                  0.003               0.858
X=positive          0.027                  0.095               0.122
X=indeterminate     0.018                  0.002               0.020
TOTAL               0.90                   0.10                1.00

TABLE 2: joint probabilities for each combination

Interpretations of Probability

• Frequency interpretation: "$\Pr(X = 2 \mid \theta = \text{sick}) = 0.95$" means: "If I do the test repeatedly on a large number of sick patients, then in the long run roughly 95% of the test results will equal 2."

• Subjective (Bayesian) interpretation, before data is observed: $\Pr(\theta = \text{sick}) = 0.10$, which means: "Given what I know now, my current belief is that there's a 10% chance that this patient is sick." Connection with decision-making: this belief can be read as a willingness to gamble that the patient is sick if the payoff is above the ratio 9-to-1 (0.9/0.1), but not if it is below 9-to-1.

• Subjective (Bayesian) interpretation, after data is observed:

                         X=negative (1)   X=positive (2)   X=indeterminate (3)
$\theta$ = "healthy"     0.9965           0.221            0.9
$\theta$ = "sick"        0.0035           0.779            0.1
TOTAL                    1.0000           1.000            1.0

TABLE 3: conditional probabilities, given X

For example, $\Pr(\theta = \text{sick} \mid X = \text{positive}) = 0.779$, which means: "Given what I knew before, plus what I know now (the data), my current belief is that there's a 77.9% chance that this patient is sick." Table 3 combines the two types of probability: belief and frequency. Now the gambling odds are 0.779/(1 − 0.779) ≈ 3.52.

"Statistics" is assessing whether the patient is healthy or sick, after observing $X$. We saw this above, in the form of the posterior probability, 0.779.

When the prevalence is not known, we have to use the frequency interpretation of probability, using the models in TABLE 1. A great tool is the likelihood ratio, LR:

$$\mathrm{LR}(x) = \frac{\Pr(X = x \mid \theta = \text{sick})}{\Pr(X = x \mid \theta = \text{healthy})}$$
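The numbers in TABLE 2 and TABLE 3 can be reproduced from TABLE 1 and the 10% prevalence. The following is a minimal sketch in Python with hypothetical variable names. It assumes the likelihood ratio is the "sick" likelihood divided by the "healthy" likelihood, which is what the value 0.03/0.95 quoted for a negative result suggests. The last step uses the odds form of Bayes' rule (posterior odds = prior odds × LR), a standard identity that the notes do not state explicitly.

```python
# Inputs: TABLE 1 (Pr(X = x | theta)) and the 10% prevalence from the text.
model = {
    "healthy": {"negative": 0.95, "positive": 0.03, "indeterminate": 0.02},
    "sick":    {"negative": 0.03, "positive": 0.95, "indeterminate": 0.02},
}
prior = {"healthy": 0.90, "sick": 0.10}
outcomes = ["negative", "positive", "indeterminate"]

# TABLE 2: joint probability Pr(theta, X = x) = Pr(theta) * Pr(X = x | theta).
joint = {t: {x: prior[t] * model[t][x] for x in outcomes} for t in model}

# Marginal distribution of X: sum the joint probabilities over theta.
marginal = {x: sum(joint[t][x] for t in model) for x in outcomes}

# TABLE 3: posterior Pr(theta | X = x) = joint / marginal.
posterior = {x: {t: joint[t][x] / marginal[x] for t in model} for x in outcomes}

# Likelihood ratio (assumed orientation: sick over healthy) and odds form.
lr = {x: model["sick"][x] / model["healthy"][x] for x in outcomes}
prior_odds = prior["sick"] / prior["healthy"]
posterior_odds = {x: prior_odds * lr[x] for x in outcomes}

for x in outcomes:
    print(x, round(marginal[x], 3), round(posterior[x]["sick"], 4),
          round(lr[x], 3), round(posterior_odds[x], 3))
```

For a positive result the posterior probability of "sick" comes out near 0.779 and the posterior odds near 3.52, in line with TABLE 3. The likelihood ratio itself does not use `prior`, which is why it remains usable when the prevalence is unknown.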
LR(X=1) = 0.03/0.95 ≈ 1/32, LR(X=2) = 0.95/0.03 ≈ 32, LR(X=3) = 0.02/0.02 = 1.

Here we see that the observation X=negative lowers the probability of "sick", and X=positive raises it. Observing X=indeterminate does not provide any information, as reflected in LR = 1. For each value of $X$, we can see in which direction the probability changes, but we cannot say what the final probability is because we do not know the initial probability.

Experiments

• An experiment is any process in which the outcome is uncertain.
• Examples: rolling a die, conducting a clinical trial, conducting a survey, getting married, ….
• The sample space $\mathcal{X}$ is the set of possible outcomes.
• Example: For our diagnostic test, $\mathcal{X}$ = {1, 2, 3}. For rolling a die, $\mathcal{X}$ = {1, 2, 3, 4, 5, 6}.

Sets and Subsets

• A sample space $\mathcal{X}$ is a set.
• An outcome is an element of the sample space, $x \in \mathcal{X}$.
• An event is a subset of the sample space. For example, $A = \{2, 4, 6\}$ is the event of rolling an even number with a die.
• An event A implies another event B if every outcome in A also belongs to B. This relation is denoted $A \subset B$, "A is a subset of B".
• A parameter space $\Theta$ is a set.
• A hypothesis is a subset of $\Theta$.

Empty, Finite and Infinite Sets

• The empty set contains no outcomes. It is denoted by $\emptyset$. For all events A, $\emptyset \subset A$.
• Sets may be finite or infinite. For example, the die sample space {1, 2, 3, 4, 5, 6} is finite.
• Infinite sets may be countably infinite or uncountably infinite. $\mathcal{X}$ = [0,1] is uncountable. The set of positive integers {1, 2, 3, …} is countable (but infinite).

Union, Intersection, Complement

Concept                    Symbol                R function
union (either/or)          $\cup$                union( )
intersection (both)        $\cap$                intersect( )
complement (not)           $A^c$                 setdiff( )
empty set, or null set     $\emptyset$ or { }    NULL, character(0)
set product                $\times$              expand.grid( )
subset                     $\subset$             all(is.element( ))
element of                 $\in$                 is.element( )

Disjoint Events

• A and B are disjoint if and only if $A \cap B = \emptyset$.
• Events $A_1, A_2, \ldots$ are disjoint or mutually exclusive if, for every $i \neq j$, $A_i \cap A_j = \emptyset$.
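The set operations listed above with their R functions have close analogues in other languages. As an illustration only, here is a sketch using Python's built-in `set` type and `itertools.product` on the die example; the event names `even` and `low` are made up for this example.

```python
from itertools import product

sample_space = {1, 2, 3, 4, 5, 6}   # rolling a die
even = {2, 4, 6}                    # event: an even number is rolled
low = {1, 2, 3}                     # event: a number of at most 3 is rolled

union = even | low                  # either/or
intersection = even & low           # both
odd = sample_space - even           # complement of `even` within the sample space
empty = set()                       # the empty set

is_subset = {2} <= even             # "rolling a 2" implies "rolling an even number"
is_element = 4 in even              # outcome 4 belongs to the event
are_disjoint = even.isdisjoint(odd) # no outcome in common

# Set product: all (state of nature, test result) pairs.
pairs = set(product({"healthy", "sick"}, {1, 2, 3}))

print(union, intersection, odd, is_subset, is_element, are_disjoint, len(pairs))
```

The complement is computed as a set difference from the sample space, which mirrors the use of `setdiff( )` in the table.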
Semi-formal definition of probability

A probability space is a sample space $\mathcal{X}$, together with a mapping Pr from events in the sample space to [0,1] (in mathematical notation, $\Pr : 2^{\mathcal{X}} \to [0,1]$) that satisfies three axioms:

Axiom 1: For every event $A$, $\Pr(A) \geq 0$. (To be technically correct: there may be very esoteric sets which cannot be assigned a probability.)
Axiom 2: $\Pr(\mathcal{X}) = 1$.
Axiom 3: For every "countable" sequence of disjoint events $A_1, A_2, \ldots$,

$$\Pr\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \Pr(A_i).$$

Some probability theorems

[The theorem statements are missing from the extracted text.]

Some formal definitions:

Given a parameter space $\Theta$ and a sample space $\mathcal{X}$, a model family indexed by $\Theta$ is a set of probability distributions $\{\Pr(\cdot \mid \theta) : \theta \in \Theta\}$.

When $X = x$ is observed, the likelihood function is the function $L : \Theta \to \mathbb{R}$ defined by $L(\theta) = \Pr(X = x \mid \theta)$. (Later, we'll modify this slightly for "continuous distributions".)

STUDY CAREFULLY ALL NOTATION AND DEFINITIONS.

The likelihood function: a function on $\Theta$, the parameter space.
The probability model: a distribution on $\mathcal{X}$, the sample space.
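On a finite sample space the three axioms can be checked by brute force. The sketch below is an illustration in Python, not part of the original notes: it assumes a fair die (each outcome has probability 1/6), enumerates all events, and uses exact fractions to avoid rounding. For a finite sample space, Axiom 3 reduces to additivity over disjoint events, which is checked here pairwise. The complement rule at the end is a standard consequence of the axioms.

```python
from fractions import Fraction
from itertools import chain, combinations

sample_space = frozenset(range(1, 7))
# Assumption for illustration: a fair die, so each outcome has mass 1/6.
point_mass = {x: Fraction(1, 6) for x in sample_space}

def pr(event):
    """Probability of an event (a subset of the sample space)."""
    return sum((point_mass[x] for x in event), Fraction(0))

def all_events(space):
    """Every subset of the sample space, i.e. the power set."""
    items = sorted(space)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

events = all_events(sample_space)

axiom1 = all(pr(a) >= 0 for a in events)
axiom2 = pr(sample_space) == 1
axiom3 = all(
    pr(a | b) == pr(a) + pr(b)
    for a in events for b in events if a.isdisjoint(b)
)

# A standard consequence of the axioms: Pr(A^c) = 1 - Pr(A).
complement_rule = all(pr(sample_space - a) == 1 - pr(a) for a in events)

print(len(events), axiom1, axiom2, axiom3, complement_rule)
```

Replacing `point_mass` with any non-negative weights that sum to one gives another valid probability space on the same sample space.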
