Basics of probability theory and mathematical statistics

An experiment is a situation involving chance or probability that leads to results called outcomes. In the problem above, the experiment is spinning the spinner. An outcome is the result of a single trial of an experiment. The possible outcomes are landing on yellow, blue, green or red. An event is one or more outcomes of an experiment. One event of this experiment is landing on blue. Probability is the measure of how likely an event is.

Definitions
Probability is the numerical measure of the likelihood that an event will occur. Its value is between 0 and 1, and the sum of the probabilities of all events is 1. On this scale, 0 means the event is impossible, .5 means it is as likely as not (50/50), and 1 means it is certain.

Experimental vs. Theoretical
Experimental probability: P(event) = (number of times event occurs) / (total number of trials)
Theoretical probability: P(E) = (number of favorable outcomes) / (total number of possible outcomes)

Identifying the Type of Probability
You draw a marble out of the bag, record the color, and replace the marble. After 6 draws, you record 2 red marbles, so P(red) = 2/6 = 1/3. This is experimental probability: the result is found by repeating an experiment. Over the 6 trials the tally showed red on 2 draws and blue on 4, giving experimental probabilities of 1/3 for red and 2/3 for blue.

The complement of A is everything in the sample space S that is NOT in A. If the rectangular box is S and the white circle is A, then everything in the box that is outside the circle is Ac, which is the complement of A.

Theorem: Pr(Ac) = 1 − Pr(A)
Example: If A is the event that a randomly selected student is male, and the probability of A is 0.6, what is Ac and what is its probability? Ac is the event that a randomly selected student is female, and its probability is 0.4.

The union of A and B (denoted A ∪ B) is everything in the sample space that is in either A or B or both. In the Venn diagram, the union of A and B is the whole white area.
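The contrast between theoretical and experimental probability is easy to demonstrate with a short simulation. A minimal Python sketch, assuming a fair four-color spinner like the one described above (the seed and trial count are arbitrary choices, not from the slides):

```python
import random

random.seed(1)
colors = ["yellow", "blue", "green", "red"]

# Theoretical probability: favorable outcomes / possible outcomes.
theoretical = 1 / len(colors)          # P(blue) = 1/4

# Experimental probability: number of times event occurs / total trials.
trials = 100_000
hits = sum(1 for _ in range(trials) if random.choice(colors) == "blue")
experimental = hits / trials

print(theoretical)                     # 0.25
print(round(experimental, 2))          # close to 0.25 for many trials

# Complement rule: Pr(not blue) = 1 - Pr(blue).
print(1 - theoretical)                 # 0.75
```

As the number of trials grows, the experimental value settles near the theoretical one, which is the point the following slides make with the marble experiment.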
The intersection of A and B (denoted A ∩ B) is everything in the sample space that is in both A and B. In the Venn diagram, the intersection of A and B is the pink overlapping area.

Example
A family is planning to have 2 children. Suppose boys (B) and girls (G) are equally likely. What is the sample space S? S = {BB, GG, BG, GB}

Example continued
If E is the event that both children are the same sex, what does E look like and what is its probability? E = {BB, GG}. Since boys and girls are equally likely, each of the four outcomes in the sample space S = {BB, GG, BG, GB} is equally likely and has a probability of 1/4. So Pr(E) = 2/4 = 1/2 = 0.5.

Example continued
Recall that E = {BB, GG} and Pr(E) = 0.5. What is the complement of E and what is its probability? Ec = {BG, GB}, and Pr(Ec) = 1 − Pr(E) = 1 − 0.5 = 0.5.

Example continued
If F is the event that at least one of the children is a girl, what does F look like and what is its probability? F = {BG, GB, GG}, and Pr(F) = 3/4 = 0.75.

Recall: E = {BB, GG} with Pr(E) = 0.5, and F = {BG, GB, GG} with Pr(F) = 0.75. What is E ∩ F? {GG}. What is its probability? 1/4 = 0.25.

Recall: E = {BB, GG} with Pr(E) = 0.5, and F = {BG, GB, GG} with Pr(F) = 0.75. What is E ∪ F? {BB, GG, BG, GB} = S. What is the probability of E ∪ F? 1. If you add the separate probabilities of E and F together, do you get Pr(E ∪ F)? Let's try it: Pr(E) + Pr(F) = 0.5 + 0.75 = 1.25 ≠ 1 = Pr(E ∪ F). Why doesn't it work? We counted GG (the intersection of E and F) twice.

A formula for Pr(E ∪ F)
Pr(E ∪ F) = Pr(E) + Pr(F) − Pr(E ∩ F)
If E and F do not overlap, then the intersection is the empty set, and the probability of the intersection is zero. When there is no overlap, Pr(E ∪ F) = Pr(E) + Pr(F).

Independent Events
We can deduce an important result from the conditional law of probability: if B has no effect on A, then P(A|B) = P(A) and we say the events are independent (the probability of A does not depend on B).
So P(A|B) = P(A ∩ B) / P(B) becomes P(A) = P(A ∩ B) / P(B), or P(A ∩ B) = P(A) P(B).

Independent Events
Tests for independence: P(A|B) = P(A), or P(B|A) = P(B), or P(A ∩ B) = P(A) P(B).

The Multiplication Rule
If events A and B are independent, then the probability of the two events A and B occurring in a sequence (or simultaneously) is
P(A and B) = P(A ∩ B) = P(A) P(B)
This rule can extend to any number of independent events. Two events are independent if the occurrence of the first event does not affect the probability of the occurrence of the second event. More on this later.

Mutually Exclusive
Two events A and B are mutually exclusive if and only if P(A ∩ B) = 0. In a Venn diagram this means that event A is disjoint from event B: disjoint circles mean A and B are mutually exclusive, overlapping circles mean A and B are not mutually exclusive.

The Addition Rule
The probability that at least one of the events A or B will occur, P(A or B), is given by
P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
If events A and B are mutually exclusive, then the addition rule is simplified to
P(A or B) = P(A ∪ B) = P(A) + P(B)
This simplified rule can be extended to any number of mutually exclusive events.

Conditional Probability
Conditional probability is the probability of an event occurring, given that another event has already occurred. Conditional probability restricts the sample space. The conditional probability of event B occurring, given that event A has occurred, is denoted by P(B|A) and is read as "probability of B, given A." We use conditional probability when two events occurring in sequence are not independent. In other words, the fact that the first event (event A) has occurred affects the probability that the second event (event B) will occur.

Conditional Probability
Formula for conditional probability:
P(A|B) = P(A ∩ B) / P(B)   or   P(B|A) = P(A ∩ B) / P(A)
You are often better off working conditional probabilities out by looking at the sample space; otherwise use the formula.
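The addition rule and the conditional probability formula can be verified mechanically by enumerating the two-children sample space from the example above. A short Python sketch (the `pr` helper is our own illustration, not a library function):

```python
from fractions import Fraction
from itertools import product

# Sample space for two children, each equally likely to be B or G.
S = [a + b for a, b in product("BG", repeat=2)]   # ['BB', 'BG', 'GB', 'GG']

def pr(event):
    """Classical probability: favorable outcomes / possible outcomes."""
    return Fraction(len([s for s in S if s in event]), len(S))

E = {"BB", "GG"}          # both children the same sex
F = {"BG", "GB", "GG"}    # at least one girl

# Addition rule: Pr(E ∪ F) = Pr(E) + Pr(F) - Pr(E ∩ F)
assert pr(E | F) == pr(E) + pr(F) - pr(E & F)

# Conditional probability: P(E|F) = P(E ∩ F) / P(F)
p_e_given_f = pr(E & F) / pr(F)
print(p_e_given_f)        # 1/3
```

Using `Fraction` keeps the probabilities exact, so the identities hold with `==` rather than floating-point tolerance.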
Assigning Probabilities
Two basic requirements for assigning probabilities:
1. The probability assigned to each experimental outcome must be between 0 and 1, inclusive. If we let Ei denote the ith experimental outcome and P(Ei) its probability, this requirement can be written as 0 ≤ P(Ei) ≤ 1 for all i.
2. The sum of the probabilities for all the experimental outcomes must equal 1.0. For n experimental outcomes, this requirement can be written as P(E1) + P(E2) + … + P(En) = 1.

Classical Method
If an experiment has n possible outcomes, this method assigns a probability of 1/n to each outcome.
Example. Experiment: rolling a die. Sample space: S = {1, 2, 3, 4, 5, 6}. Probabilities: each sample point has a 1/6 chance of occurring.

THEORETICAL PROBABILITY
I have a quarter. My quarter has a heads side and a tails side. Since my quarter has only 2 sides, there are only 2 possible outcomes when I flip it: it will land on either heads or tails.

THEORETICAL PROBABILITY
When I flip my coin, the probability that my coin will land on heads is 1 in 2. What is the probability that my coin will land on tails?

Theoretical Probability
Right! There is a 1 in 2 probability that my coin will land on tails. A probability of 1 in 2 can be written in three ways:
As a fraction: 1/2
As a decimal: .50
As a percent: 50%

Theoretical Probability
I have three marbles in a bag: 1 marble is red, 1 marble is blue, 1 marble is green. I am going to take 1 marble from the bag. What is the probability that I will pick out a red marble?

Theoretical Probability
Since there are three marbles and only one is red, I have a 1 in 3 chance of picking out a red marble. I can write this in three ways:
As a fraction: 1/3
As a decimal: .33
As a percent: 33%

Experimental Probability
Experimental probability is found by repeating an experiment and observing the outcomes.
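The repeat-and-tally procedure just defined can be sketched in a few lines of Python. The bag and the draw count mirror the marble example that follows; the seed is an arbitrary choice, and the particular tallies will differ from the ones shown on the slides:

```python
import random

random.seed(7)
bag = ["red", "blue", "green"]      # one marble of each color

# Draw with replacement and tally the outcomes.
draws = 6
tally = {"red": 0, "blue": 0, "green": 0}
for _ in range(draws):
    tally[random.choice(bag)] += 1  # the marble goes back into the bag

for color in bag:
    print(color, tally[color], f"{tally[color]}/{draws}")
```

Each `tally[color]/draws` is an experimental probability; with only 6 draws it will usually differ from the theoretical 1/3.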
Experimental Probability
Remember the bag of marbles? The bag has only 1 red, 1 green, and 1 blue marble in it, so there are a total of 3 marbles in the bag. Theoretical probability says there is a 1 in 3 chance of selecting a red, a green or a blue marble.

Experimental Probability
Draw 1 marble from the bag. It is a red marble. Record the outcome on the tally sheet.

Experimental Probability
Put the red marble back in the bag and draw again. This time you drew a green marble. Record this outcome on the tally sheet.

Experimental Probability
Place the green marble back in the bag. Continue drawing marbles and recording outcomes until you have drawn 6 times. (Remember to place each marble back in the bag before drawing again.)

Experimental Probability
After 6 draws your chart will look similar to this. Look at the red column: of our 6 draws, we selected a red marble 2 times.

Draw    red   blue  green
1       1
2                   1
3                   1
4       1
5             1
6                   1
Total   2     1     3

Experimental Probability
The experimental probability of drawing a red marble was 2 in 6. This can be expressed as a fraction (2/6 or 1/3), a decimal (.33), or a percentage (33%).

Experimental Probability
Notice the experimental probability of drawing a red, blue or green marble:
red: 2/6 = 1/3, blue: 1/6, green: 3/6 = 1/2.

Comparing Experimental and Theoretical Probability
Look at the chart below. Is the experimental probability always the same as the theoretical probability?

            red    blue   green
Exp. Prob.  1/3    1/6    1/2
Theo. Prob. 1/3    1/3    1/3

Comparing Experimental and Theoretical Probability
In this experiment, the experimental and theoretical probabilities of selecting a red marble are equal.
Comparing Experimental and Theoretical Probability
The experimental probability of selecting a blue marble (1/6) is less than the theoretical probability (1/3). The experimental probability of selecting a green marble (1/2) is greater than the theoretical probability (1/3).

Point and Interval Estimation of the Parameters of a Normally Distributed Variable. The Concept of Statistical Estimation.

What is statistics?
Statistics is a branch of mathematics that provides techniques to analyze whether or not your data is significant (meaningful). Statistical applications are based on probability statements. Nothing is "proved" with statistics; statistics are reported. Statistics report the probability that similar results would occur if you repeated the experiment.

Statistics deals with numbers
We need to know the nature of the numbers collected.
Continuous variables: the type of numbers associated with measuring or weighing; they can take any value in a continuous interval of measurement.
Examples: weight of students, height of plants, time to flowering.
Discrete variables: the type of numbers that are counted or categorical.
Examples: numbers of boys, girls, insects, plants.

Standard Deviation and Variance
Standard deviation and variance are the most common measures of total risk. They measure the dispersion of a set of observations around the mean observation.

Standard Deviation and Variance (cont'd)
General equation for variance:
Variance = σ² = Σ_{i=1}^{n} prob(xᵢ) (xᵢ − x̄)²
If all outcomes are equally likely:
σ² = (1/n) Σ_{i=1}^{n} (xᵢ − x̄)²

Standard Deviation and Variance (cont'd)
Equation for standard deviation:
Standard deviation = σ = √σ² = √( Σ_{i=1}^{n} prob(xᵢ) (xᵢ − x̄)² )

1. The Normal distribution: parameters m and σ (or σ²)
Comment: if m = 0 and σ = 1 the distribution is called the standard normal distribution.
[Figure: density curves of a normal distribution with m = 50, σ = 15 and a normal distribution with m = 70, σ = 20.]

The probability density of the normal distribution
f(x) = (1 / (σ√(2π))) e^(−(x − m)² / (2σ²)),  −∞ < x < ∞
If a random variable X has a normal distribution with mean m and variance σ², then we will write X ~ N(m, σ²).

The Chi-square distribution
The Chi-square (χ²) distribution with n degrees of freedom has density
f(x) = (1 / (2^(n/2) Γ(n/2))) x^(n/2 − 1) e^(−x/2)  for x ≥ 0,
f(x) = 0  for x < 0.

[Figure: χ² density curves for n = 4, 5 and 6.]

Basic Properties of the Chi-Square distribution
1. If z has a standard normal distribution, then z² has a χ² distribution with 1 degree of freedom.
2. If z1, z2, …, zn are independent random variables each having a standard normal distribution, then U = z1² + z2² + … + zn² has a χ² distribution with n degrees of freedom.
3. Let X and Y be independent random variables having χ² distributions with n1 and n2 degrees of freedom respectively; then X + Y has a χ² distribution with n1 + n2 degrees of freedom.

Basic Properties of the Chi-Square distribution (continued)
4.
Let x1, x2, …, xn be independent random variables having χ² distributions with n1, n2, …, nn degrees of freedom respectively; then x1 + x2 + … + xn has a χ² distribution with n1 + … + nn degrees of freedom.
5. Suppose X and Y are independent random variables, with X and X + Y having χ² distributions with n1 and n (n > n1) degrees of freedom respectively; then Y has a χ² distribution with n − n1 degrees of freedom.

The non-central Chi-squared distribution
If z1, z2, …, zn are independent random variables each having a normal distribution with mean mᵢ and variance σ² = 1, then U = z1² + z2² + … + zn² has a non-central χ² distribution with n degrees of freedom and non-centrality parameter λ = Σ_{i=1}^{n} mᵢ².

Mean and Variance of the non-central χ² distribution
If U has a non-central χ² distribution with n degrees of freedom and non-centrality parameter λ = Σ_{i=1}^{n} mᵢ², then
E[U] = n + λ = n + Σ_{i=1}^{n} mᵢ²
Var[U] = 2n + 4λ
If U has a central χ² distribution with n degrees of freedom, λ is zero, thus E[U] = n and Var[U] = 2n.

Estimation of Population Parameters
Statistical inference refers to making inferences about a population parameter through the use of sample information. The sample statistics summarize sample information and can be used to make inferences about the population parameters. There are two approaches to estimating population parameters:
Point estimation: obtain a single value estimate for the population parameter.
Interval estimation: construct an interval within which the population parameter will lie with a certain probability.

Point Estimation
In attempting to obtain point estimates of population parameters, the following questions arise: What is a point estimate of the population mean? How good an estimate do we obtain through the methodology that we follow?
Example: What is a point estimate of the average yield on ten-year Treasury bonds?
To answer this question, we use a formula that takes sample information and produces a number.

Point Estimation
A formula that uses sample information to produce an estimate of a population parameter is called an estimator. A specific value of an estimator obtained from the information in a specific sample is called an estimate.
Example: We said that the sample mean is a good estimate of the population mean. The sample mean is an estimator; a particular value of the sample mean is an estimate.

Interval Estimation
In the probabilistic interpretation, we say that a 95% confidence interval for a population parameter means that, in repeated sampling, 95% of such confidence intervals will include the population parameter.
In the practical interpretation, we say that we are 95% confident that the 95% confidence interval will include the population parameter.

Constructing Confidence Intervals
Confidence intervals have similar structures:
Point Estimate ± Reliability Factor × Standard Error
The reliability factor is a number based on the assumed distribution of the point estimate and the level of confidence. The standard error is that of the sample statistic providing the point estimate.

Confidence Interval for the Mean of a Normal Distribution with Known Variance
If X̄ is the sample mean, then we are interested in the confidence interval such that the following probability is .9:
.9 = P(−1.645 ≤ Z ≤ 1.645)
   = P(−1.645 ≤ (X̄ − m) / (σ/√n) ≤ 1.645)
   = P(−1.645 σ/√n ≤ X̄ − m ≤ 1.645 σ/√n)
   = P(X̄ − 1.645 σ/√n ≤ m ≤ X̄ + 1.645 σ/√n)

Confidence Interval for the Mean of a Normal Distribution with Known Variance
Following the above expression for the structure of a confidence interval, we rewrite the confidence interval as X̄ ± 1.645 σ/√n.
Note that from the standard normal density
P(Z ≤ 1.65) = F_Z(1.65) = 0.95,  P(Z > 1.65) = 1 − F_Z(1.65) = 0.05

Confidence Interval for the Mean of a Normal Distribution with Known Variance
Given this result, and that the level of confidence for this interval (1 − α) is .90, we conclude that the area under
the standard normal to the left of −1.65 is 0.05, and the area under the standard normal to the right of 1.65 is 0.05. Thus, the two reliability factors represent the cutoffs −z_{α/2} and z_{α/2} for the standard normal.

Confidence Interval for the Mean of a Normal Distribution with Known Variance
In general, a 100(1 − α)% confidence interval for the population mean m, when we draw samples from a normal distribution with known variance σ², is given by
X̄ ± z_{α/2} σ/√n
where z_{α/2} is the number for which P(Z ≥ z_{α/2}) = α/2.

Confidence Interval for the Mean of a Normal Distribution with Known Variance
Note: we typically use the following reliability factors when constructing confidence intervals based on the standard normal distribution:
90% interval: z_{0.05} = 1.65
95% interval: z_{0.025} = 1.96
99% interval: z_{0.005} = 2.58
Implication: as the degree of confidence increases, the interval becomes wider.

Confidence Interval for the Mean of a Normal Distribution with Known Variance
Example: Suppose we draw a sample of 100 observations of returns on the Nikkei index, assumed to be normally distributed, with sample mean 4% and standard deviation 6%. What is the 95% confidence interval for the population mean?
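This example can be worked numerically before reading the answer. A minimal sketch using only the figures given in the text (n = 100, sample mean .04, standard deviation .06, reliability factor 1.96):

```python
from math import sqrt

n, xbar, sigma = 100, 0.04, 0.06   # figures from the example above
z = 1.96                           # reliability factor for a 95% interval

se = sigma / sqrt(n)               # standard error of the sample mean
lower, upper = xbar - z * se, xbar + z * se

print(round(se, 4))                # 0.006
print(round(lower, 5), round(upper, 5))
```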
The standard error is .06/√100 = .006. The confidence interval is .04 ± 1.96(.006).

Confidence Interval for the Mean of a Normal Distribution with Unknown Variance
In a more typical scenario, the population variance is unknown. Note that, if the sample size is large, the previous results can be modified as follows: the population distribution need not be normal, the population variance need not be known, and the sample standard deviation will be a sufficiently good estimator of the population standard deviation. Thus, the confidence interval for the population mean derived above can be used by substituting s for σ.

Confidence Interval for the Mean of a Normal Distribution with Unknown Variance
However, if the sample size is small and the population variance is unknown, we cannot use the standard normal distribution. If we replace the unknown σ with the sample standard deviation s, the quantity
t = (X̄ − m) / (s/√n)
follows Student's t distribution with (n − 1) degrees of freedom.

Confidence Interval for the Mean of a Normal Distribution with Unknown Variance
The t-distribution has mean 0 and (n − 1) degrees of freedom. As the degrees of freedom increase, the t-distribution approaches the standard normal distribution. t-distributions also have fatter tails, but as the degrees of freedom increase (df = 8 or more) the tails become less fat and resemble those of a normal distribution.

Confidence Interval for the Mean of a Normal Distribution with Unknown Variance
In general, a 100(1 − α)% confidence interval for the population mean m, when we draw small samples from a normal distribution with an unknown variance σ², is given by
X̄ ± t_{n−1,α/2} s/√n
where t_{n−1,α/2} is the number for which P(t_{n−1} ≥ t_{n−1,α/2}) = α/2.

Confidence Interval for the Population Variance of a Normal Population
Suppose we have obtained a random sample of n observations from a normal population with variance σ² and that the sample variance is s².
A 100(1 − α)% confidence interval for the population variance is
(n − 1)s² / χ²_{n−1,α/2} ≤ σ² ≤ (n − 1)s² / χ²_{n−1,1−α/2}

End
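As a closing check, the variance interval above can be computed for a concrete case. A minimal sketch with a hypothetical sample of our own invention; the two χ² critical values for 9 degrees of freedom are standard tabulated values:

```python
from statistics import variance

# Hypothetical sample of n = 10 observations from a normal population.
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.4, 4.6]
n = len(sample)
s2 = variance(sample)              # sample variance s^2 (divisor n - 1)

# Tabulated chi-square critical values for n - 1 = 9 d.f., alpha = 0.05.
chi2_upper = 19.023                # chi^2_{9, 0.025}
chi2_lower = 2.700                 # chi^2_{9, 0.975}

# 95% CI for sigma^2: (n-1)s^2/chi2_upper <= sigma^2 <= (n-1)s^2/chi2_lower
lower = (n - 1) * s2 / chi2_upper
upper = (n - 1) * s2 / chi2_lower
print(round(lower, 4), round(upper, 4))
```

Note that the interval is not symmetric around s², because the χ² distribution itself is skewed.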