Probability and Statistics
for Particle Physics
Javier Magnin
CBPF – Brazilian Center for Research in
Physics
Rio de Janeiro - Brazil
Outline
• Course: three one-hour lectures
• 1st lecture:
• General ideas / Preliminary concepts
• Probability and statistics
• Distributions
• 2nd lecture:
• Error matrix
• Combining errors / results
• Parameter fitting and hypothesis testing
• 3rd lecture:
• Parameter fitting and hypothesis testing (cont.)
• Examples of fitting procedures
1st lecture
Preliminary concepts
• Two types of experimental results:
  - Determination of the numerical value of some physical quantity → parameter determination
  - Testing whether a particular theory is consistent with data → hypothesis testing
• In real life there is a degree of overlap between the two types above
• We will go through both types of results in these lectures
Why estimate errors ?
Example:
• Consider the accepted value of the speed of light:
c = 2.998 × 10^8 m/s
• Assume that a new measurement gives
c′ = (3.09 ± x) × 10^8 m/s
Question: are these two numbers consistent ?
• If x = ±0.15, the new determination is consistent with the accepted value.
• If x = ±0.01, the two values are inconsistent → there is evidence for a change in the speed of light !
• If x = ±2, the two values are consistent, but the accuracy is so low that it is impossible to detect a change in c !
Random and systematic errors
Consider the experiment of determining the decay constant λ of a radioactive source:
• Count how many decays n are observed in a time interval Δt
• Determine the decay rate, λ ≈ n/(N Δt), where N is the number of decaying nuclei
• Random errors:
  - inherent statistical error in counting events
  - uncertainty in the mass of the sample
  - timing of the period for which the decays are observed
• Systematic errors:
  - efficiency of the counter used to detect the decays
  - background (i.e. particles coming from other sources)
  - purity of the radioactive sample
  - calibration errors
Probability
Suppose you repeat an experiment several times. Even if you are able to keep the essential conditions constant, the repetitions will produce different results.
The result of an individual measurement is unpredictable, even when the possible results of a series of measurements have a well-defined distribution.
Definition:
The probability p of obtaining a specific result when performing one measurement or trial is defined as

p = (number of times that result occurs) / (total number of measurements or trials)
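This frequency definition can be illustrated with a short simulation. The following is a sketch, not part of the lecture; the helper `estimate_probability` is a hypothetical name:

```python
import random

def estimate_probability(trial, outcome, n_trials=100_000, seed=42):
    """p = (number of times the result occurs) / (total number of trials)."""
    rng = random.Random(seed)
    hits = sum(trial(rng) == outcome for _ in range(n_trials))
    return hits / n_trials

# Rolling a fair die and asking for a six: true p = 1/6 ~ 0.167
p_six = estimate_probability(lambda rng: rng.randint(1, 6), 6)
print(round(p_six, 3))
```

The estimate approaches the true probability as the number of trials grows (more on the 1/√N behaviour later).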
Rules of probability
1. If P(A) is the probability of a given event A, then 0 ≤ P(A) ≤ 1.
2. The probability P(A+B) that at least one of A or B occurs satisfies P(A+B) ≤ P(A) + P(B). The equality holds only if A and B are exclusive events.
3. The probability P(AB) of obtaining both A and B is P(AB) = P(A|B)P(B) = P(B|A)P(A), where P(A|B) is the probability of obtaining A given that B has occurred (P(A|B) is known as the conditional probability of A given B).
4. Rule 3 defines the conditional probability as P(A|B) = P(AB)/P(B).
Comments (about the rules !)
1. P(A+B) = P(A) + P(B) − P(AB), to avoid double counting !
2. In terms of counts, P(A|B) = P(AB)/P(B) = (N_C/N)/(N_B/N) = N_C/N_B, where N_B is the number of trials in which B occurs and N_C the number in which both A and B occur.
3. In general, P(A|B) ≠ P(B|A).
4. If P(A|B) = P(A), then A and B are independent, which is equivalent to saying that P(AB) = P(A)P(B).
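These rules can be checked exactly for a single throw of a fair die; the events A and B below are chosen purely for illustration:

```python
from fractions import Fraction

outcomes = set(range(1, 7))     # one throw of a fair die
A = {2, 4}                      # an even result below 5
B = {2, 4, 6}                   # any even result

def P(event):
    return Fraction(len(event), len(outcomes))

# Comment 1: P(A+B) = P(A) + P(B) - P(AB)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Rule 4: conditional probabilities from counts
P_A_given_B = P(A & B) / P(B)   # = 2/3
P_B_given_A = P(A & B) / P(A)   # = 1
assert P_A_given_B != P_B_given_A   # comment 3: P(A|B) != P(B|A)
assert P_A_given_B != P(A)          # comment 4: A and B are not independent
print(P_A_given_B, P_B_given_A, P(A))
```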
Example: use of conditional probability
Measurement of the mass difference Δm = m(K_L) − m(K_S) from the K⁰p and K̄⁰p cross sections.

K⁺ + p → K⁰ + π⁺ + p (production)
K⁰ → π⁺ + π⁻ (decay)

• K⁰ are detected in the decay (event B)
• We want to measure K⁰p → K⁰p (event A)
• We are interested in P(AB) = P(B|A)P(A)
Bayes theorem
Let the sample space Ω be spanned by n mutually exclusive and exhaustive sets B_i, with Σ_i P(B_i) = 1. If A is also a set belonging to Ω, then

P(B_i|A) = P(A|B_i) P(B_i) / Σ_j P(A|B_j) P(B_j)
Example (of the Bayes theorem)
Consider three drawers B1, B2, B3, each with two coins. B1 has two gold coins, B2 has one gold and one silver, and B3 has two silver coins.
• Select a drawer at random and pick a coin from it.
• Supposing that the coin is gold, what is the probability of having a second gold coin in the same drawer (or, what is the probability of having selected drawer B1, given that a gold coin was selected)?
If A is the event of first picking a gold coin, then
P(A|B1) = 1; P(A|B2) = 1/2; P(A|B3) = 0
Since the drawer is selected at random,
P(B1) = P(B2) = P(B3) = 1/3
Hence, from Bayes theorem it follows that

P(B1|A) = P(A|B1) P(B1) / Σ_j P(A|B_j) P(B_j) = (1 × 1/3) / (1 × 1/3 + 1/2 × 1/3 + 0 × 1/3) = 2/3
Thus, although the probability of selecting the drawer B1
is 1/3, the observation that the first coin drawn was gold
doubles the probability that the drawer B1 had been
selected
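The same posterior can be computed directly from Bayes theorem; a minimal sketch with exact fractions:

```python
from fractions import Fraction

# Priors: each drawer is selected at random
prior = {"B1": Fraction(1, 3), "B2": Fraction(1, 3), "B3": Fraction(1, 3)}
# Likelihood of first picking a gold coin from each drawer
likelihood = {"B1": Fraction(1), "B2": Fraction(1, 2), "B3": Fraction(0)}

# Denominator of Bayes theorem: sum_j P(A|Bj) P(Bj)
evidence = sum(likelihood[b] * prior[b] for b in prior)
posterior = {b: likelihood[b] * prior[b] / evidence for b in prior}
print(posterior["B1"])  # 2/3
```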
Statistics
Probability: we start with a well-defined problem, and calculate from it the possible outcomes of a specific experiment.
Statistics: we use data from an experiment to deduce the rules or laws relevant to the experiment.
Statistics
Statistics → two different problems (with some overlap):
Parameter determination: use data to measure the parameters of a given model or theory.
Hypothesis testing: use data to test the hypotheses of a model.
Probability (from theory to data):
• Toss a coin. If P(heads) = 0.5, how many heads do you get in 50 tosses?
• Given a sample of K* mesons of known polarization, what is the forward/backward asymmetry?

Statistics (from data to theory):
• If you observe 27 heads in 50 tosses, what is the value of P(heads)? (*)
• Of 1000 K* mesons, 600 were observed to decay forward. What is the K* polarization? (**)

(*) Parameter determination: deduce a quantity and its error
(**) Hypothesis testing: check a theory
Distributions
In general, the result of repeating
the same measurement many times
does not lead to the same result
Experiment:
• measure the length of one side of your table
10 times and display the results in a
histogram.
• What happens if you repeat the
measurement 50 times ?
Distribution n(x): describes how often a value of the variable x occurs in a sample.
Measuring the table...
Sample: the total number of measurements of the side of the table, N.
[Figure: histograms of n vs. x for two different bin sizes]
x can be a discrete or continuous variable, distributed in a finite or infinite range.
Mean and variance: center and width of a distribution
Assume a set of N separate measurements x_i of the same physical quantity.
Mean: x̄ = (1/N) Σ_i x_i
Variance: s² = (1/N) Σ_i (x_i − x̄)²
Convention: μ and σ² are the true values of the mean and the variance, respectively.
[Figure: histogram of the entries in each x bin (66 entries in total)]
The mean x̄ is known to an accuracy of σ/√N (more about 1/√N later).
s² will not change by increasing N, but the variance of the mean, σ²/N, decreases with N.
σ² is a measure of how the distribution spreads out, and not a measure of the error in x̄ !
x̄ is better determined as N increases; σ² is a property of the distribution.
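The distinction between the spread s and the error on the mean σ/√N can be checked numerically. A sketch with simulated measurements (the true mean 10.0 and width 0.5 are arbitrary choices):

```python
import math
import random

rng = random.Random(0)
# N measurements of the same quantity, true mean 10.0, true sigma 0.5
N = 10_000
xs = [rng.gauss(10.0, 0.5) for _ in range(N)]

mean = sum(xs) / N
var = sum((x - mean) ** 2 for x in xs) / N   # s^2: spread of the data
err_on_mean = math.sqrt(var / N)             # sigma / sqrt(N): error on the mean

print(mean, math.sqrt(var), err_on_mean)
```

Doubling N leaves s² essentially unchanged but shrinks the error on the mean by √2.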
Continuous distributions
For a large number of events N, the histogram approaches a continuous distribution f(x), with
Mean: μ = ∫ x f(x) dx
Variance: σ² = ∫ (x − μ)² f(x) dx
where f(x) is normalized to unity.
Special distributions
Binomial distribution
N independent trials, each of which has only two possible outcomes: success, with probability p, or failure, with probability (1 − p). The probability of exactly r successes is

P(r) = [N! / (r! (N − r)!)] p^r (1 − p)^(N − r)

• p^r: probability of obtaining successes on r attempts
• (1 − p)^(N − r): probability of failure in the remaining N − r trials
• N!/(r!(N − r)!): number of orderings of the successes and failures
• Symmetric if p = (1 − p)
• r: non-negative integer; 0 < p < 1, p real
Mean: Np
Variance: Np(1 − p)
Where does the binomial distribution apply ?
• Throw a die N times. What is the probability of obtaining a 6 on exactly r occasions ?
• The angles that the decay products from a given source make with some fixed axis are measured. If the expected distribution is known, what is the probability of observing r decays in the forward hemisphere (θ < π/2) from a total sample of N decays ?
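The die example can be evaluated directly from the binomial formula; a short sketch (the choice of N = 10 throws and r = 2 sixes is illustrative):

```python
from math import comb

def binomial_pmf(r, N, p):
    """P(r successes in N independent trials with success probability p)."""
    return comb(N, r) * p**r * (1 - p)**(N - r)

# Probability of exactly two sixes in 10 throws of a fair die
p_two = binomial_pmf(2, 10, 1/6)
# Mean Np and variance Np(1-p)
mean = 10 * (1/6)
var = 10 * (1/6) * (5/6)
print(round(p_two, 4), mean, var)
```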
Properties and limiting cases
• N → ∞, p → 0 with Np = μ fixed → Poisson
• N → ∞, with Np and N(1 − p) large → Gaussian
Poisson distribution
Probability of observing r independent events in a time interval Δt, when the counting rate is μ and the expected number of events in the time interval is λ = μΔt:

P(r) = λ^r e^(−λ) / r!

Limiting case of the binomial distribution when N → ∞ and p → 0 with Np = λ fixed.
• r: non-negative integer
• λ: positive real number
Mean = Variance = λ
Where does the Poisson distribution apply ?
• Particle emission from a radioactive source: if particles are emitted from the source at an average rate μ (= number of particles emitted per unit time), the number of particles r emitted in a time interval Δt follows a Poisson distribution.
Additive property of independent Poisson variables
Assume that you have a radioactive emitting source in a medium where there are radioactive background emissions at a rate μ_b. The radioactive source emits at a rate μ_x.
What is the probability distribution for the emission of source + background ?
Let r be the number of particles emitted by source + background, with expected counts λ_x and λ_b in the time interval. Then

P(r) = Σ_{k=0}^{r} [λ_x^k e^(−λ_x)/k!] [λ_b^(r−k) e^(−λ_b)/(r − k)!] = (λ_x + λ_b)^r e^(−(λ_x + λ_b)) / r!

where the sum is evaluated using the binomial formula: the sum of independent Poisson variables is again a Poisson variable, with mean λ_x + λ_b.
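The additive property is easy to verify numerically: convolving two Poisson distributions reproduces a single Poisson with the summed mean. A sketch (the means 2.0 and 3.0 are arbitrary):

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(r events) for a Poisson distribution with mean lam."""
    return lam**r * exp(-lam) / factorial(r)

mu_x, mu_b = 2.0, 3.0  # hypothetical source and background means

def convolved(r):
    # Probability of r total counts: sum over the split between source and background
    return sum(poisson_pmf(k, mu_x) * poisson_pmf(r - k, mu_b) for k in range(r + 1))

# The convolution equals a single Poisson with mean mu_x + mu_b
for r in range(10):
    assert abs(convolved(r) - poisson_pmf(r, mu_x + mu_b)) < 1e-12
print("source + background is Poisson with mean", mu_x + mu_b)
```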
Gaussian distribution
General form:

f(x) = 1/(σ√(2π)) exp(−(x − μ)²/(2σ²))

• x: continuous variable
• μ: mean; σ²: variance
• The prefactor 1/(σ√(2π)) ensures the normalization ∫ f(x) dx = 1
Standard form (μ = 0, σ = 1): f(x) = (1/√(2π)) exp(−x²/2)

Properties
• I[μ−σ, μ+σ] ≈ 0.68 I[−∞, +∞]: about 68% of the area lies within one σ of the mean
• Symmetric w.r.t. μ
• If the x_i are Gaussian with mean μ and variance σ², then x̄ = (1/n) Σ_i x_i is Gaussian with mean μ and variance σ²/n
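The 68% property can be checked from the Gaussian cumulative distribution, which can be written with the error function:

```python
from math import erf, sqrt

def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution of a Gaussian with mean mu and width sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 0.0, 1.0
# Fraction of the area within one sigma of the mean
frac_1sigma = gaussian_cdf(mu + sigma, mu, sigma) - gaussian_cdf(mu - sigma, mu, sigma)
print(round(frac_1sigma, 4))  # ~0.6827
```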
Where does the Gaussian distribution apply ?
• The result of repeating an experiment many times produces a spread of answers whose distribution is approximately Gaussian.
• If the individual errors contributing to the final answer are small, the approximation to a Gaussian is especially good.
Limiting cases:
• Binomial → Poisson when N → ∞ and p → 0 with Np = μ fixed
• Poisson → Gaussian when μ → ∞
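The binomial → Poisson limit can be verified numerically; a sketch comparing the two distributions for large N and small p with Np fixed (N = 10 000 and μ = 5 are illustrative choices):

```python
from math import comb, exp, factorial

def binomial_pmf(r, N, p):
    return comb(N, r) * p**r * (1 - p)**(N - r)

def poisson_pmf(r, lam):
    return lam**r * exp(-lam) / factorial(r)

# N -> infinity, p -> 0 with Np = mu fixed: the binomial approaches a Poisson
N, mu = 10_000, 5.0
max_diff = max(abs(binomial_pmf(r, N, mu / N) - poisson_pmf(r, mu))
               for r in range(20))
print(max_diff)  # small: the two distributions nearly coincide
```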