ST3905
Lecturer: Supratik Roy
Email: [email protected]; (Unix): [email protected]
Phone: ext. 3626

What do we want to do?
1. What is statistics?
2. Describing information: summarization, visual and non-visual representation
3. Drawing conclusions from information: managing uncertainty and incompleteness of information

Resources
1. Recommended textbook: Probability and Statistics for Engineering and the Sciences, Jay L. Devore. International Thomson Publishing.
2. Software: R; homepage: www.r-project.org

Describing Information
1. Why summarize information?
2. Visual representation (aka graphical descriptive statistics)
3. Non-visual representation (numerical measures)
4. Classical techniques vs modern IT

Stem and Leaf Plot
Decimal point is 2 places to the right of the colon
0 : 8
1 : 000011122233333333333344444
1 : 55555566666677777778888888899999999999
2 : 0000000111111111111222222233333333444444444
2 : 555556666666666777778889999999999999999
3 : 000000001111112222333333333444
3 : 55555555666667777777888888899999999
4 : 0122234
4 : 55555678888889
5 : 111111134
5 : 555667
6 : 44
6 : 7

Pie-Chart
[Figure: pie charts of the categories algebra, complex, diffgeom, reals, statistics]

DotChart
[Figure: dot chart (scale 10-30) of Child Care, Health Services, Community Centers, Family & Youth and Other, for Old Suburb, Coast County and New Suburb]

Histogram
[Figure: histogram (counts 0-15) of my.sample, 50 samples from a t distribution with 5 d.f., range about -4 to 2]

Histogram-Categorical
[Figure: bar chart of counts by state.region: Northeast, South, North Central, West]

Rules for Histograms
1. Height of rectangle proportional to frequency of class.
2. No. of classes proportional to sqrt(total no. of observations) [not a hard and fast rule].
3. In case of categorical data, keep rectangle widths identical, and bases of rectangles separate.
4.
Best, if possible, let the software do it.

Data
 [1] -0.053626486 -0.828128399  0.214910482  0.346570399
 [5] -0.849316517  0.001077376  0.736191791  1.417540397
 [9] -2.382332275 -2.699019949 -0.111907192  1.384903284
[13]  2.113286699 -1.828108272 -1.108280724  0.131883612
[17] -0.394494473  0.829806888  0.023178033  0.019839537
[21] -0.346280222 -0.251981108  1.159853307 -0.249501904
[25] -1.342704742 -2.012653224 -1.535503208  0.869806233
[29] -1.313495887 -0.244408426 -0.998886998 -1.446769605
[33]  1.224528053 -0.410163230  0.032230907 -0.137297112
[37] -2.717620031 -0.728570438  0.034697116  2.202863874
[41] -0.170794163  0.353651680 -0.673296374  3.136364814
[45] -1.260108638 -0.367334893 -0.652217259 -0.301847039
[49]  0.315180215  0.190766333

Tabulation
Class   Tally                 freq
-3,-2   ////                    4
-2,-1   //// //                 7
-1,0    //// //// //// ///     18
0,1     //// //// ////         14
1,2     ////                    4
2,3     //                      2
3,4     /                       1
Total                          50

Box-Plot - I
[Figure: box plot, scale 1-7]

Box Plot – II
[Figure: box plots by age group: 18-24, 25-34, 35-44, 45-54, 55-64, 65+]

Box Plot – III
[Figure: box plots of payoff (200-800), NJ Pick-it Lottery (5/22/75-3/16/76), by leading digit (0-9) of winning numbers]

Non-Visual (numerical measures)
1. Pictures vs. quantitative measures
2. Criteria for selection of a measure – purpose of study
3. Qualities that a measure should have
4. We live in an uncertain world – chances of error

Measures of Location
1. Mean
2. Mode
3. Median

Location: mean, median
Algebra test scores (n = 25):
43 50 41 69 52 38 51 54 43 47 54 51 70 58 44 54 52 32 42 70 50 49 56 59 38
Mean = 50.68
10% trimmed mean of scores = 50.33333
Median = 51

Location: Non-classical
An M-estimate of location is a solution mu of the equation sum(psi((y-mu)/s)) = 0.
Data set: car.miles
(bisquare) 204.5395
(Huber's) 204.2571

Tabular method of computing (mean)
Class   freq   Class midpt   Rel. freq   r.f. x midpt
-3,-2     4       -2.5         0.08        -0.20
-2,-1     7       -1.5         0.14        -0.21
-1,0     18       -0.5         0.36        -0.18
0,1      14        0.5         0.28         0.14
1,2       4        1.5         0.08         0.12
2,3       2        2.5         0.04         0.10
3,4       1        3.5         0.02         0.07
Total    50                                -0.16
Mean = sum of r.f. x midpt = -0.16.

Tabular method of computing (coded, A = -0.5, d = 1)
Class   freq   Class midpt (x)   x' = (x-A)/d   Rel. freq   r.f. x x'
-3,-2     4        -2.5              -2            0.08        -0.16
-2,-1     7        -1.5              -1            0.14        -0.14
-1,0     18        -0.5               0            0.36         0
0,1      14         0.5               1            0.28         0.28
1,2       4         1.5               2            0.08         0.16
2,3       2         2.5               3            0.04         0.12
3,4       1         3.5               4            0.02         0.08
Total    50                                                     0.34
Mean = A + d x 0.34 = -0.5 + 0.34 = -0.16, as before.

Measures of Scale (aka Dispersion)
1. Variance (unbiased): sum((x-mean(x))^2)/(N-1)
2. Variance (biased): sum((x-mean(x))^2)/N
3. Standard deviation: sqrt(variance)

Tabular method of computing (variance, coded, A = -0.5, d = 1)
Class   Class midpt (x)   x' = (x-A)/d   x'^2   Rel. freq   r.f. x x'^2
-3,-2       -2.5              -2           4       0.08        0.32
-2,-1       -1.5              -1           1       0.14        0.14
-1,0        -0.5               0           0       0.36        0
0,1          0.5               1           1       0.28        0.28
1,2          1.5               2           4       0.08        0.32
2,3          2.5               3           9       0.04        0.36
3,4          3.5               4          16       0.02        0.32
Total                                                          1.74
Variance (biased version) = d^2 [1.74 - (0.34)^2] = 1.6244, using the coded mean 0.34 from the previous table.

Robust measures of scale
1. The MAD scale estimate generally has very small bias compared with other scale estimators when there is "contamination" in the data.
2. Tau-estimates and A-estimates also have 50% breakdown, but are more efficient for Gaussian data.
3. The A-estimate that scale.a computes is redescending, so it is inappropriate if it is necessary that the scale estimate always increase as the size of a datapoint is increased. However, the A-estimate is very good if all of the contamination is far from the "good" data.

Comparison of scale measures
MAD(corn.yield) = 4.15128
scale.tau(corn.yield) = 4.027753
scale.a(corn.yield) = 4.040902
var(corn.yield) = 19.04191
sqrt(var(corn.yield)) = 4.363703
N.B. To really compare, you have to compare for various probability distributions as well as various sample sizes.

Probability
1. Concept of an experiment on random observables
2.
Sets and Events, Random Variables, Probability:
(a) Set of all basic outcomes = sample space = S.
(b) An element of S, or a union of elements of S, = an event (a singleton event = simple event, else compound).
(c) A numerical function that associates an event with a number(s) = random variable.
(d) A map from the set of events E into [0,1] obeying certain rules = probability.

Examples of Probability
Consider the toss of a single coin:
1. A single throw: only two possible outcomes – Head or Tail.
2. Two consecutive throws: four possible outcomes – (Head, Head), (Head, Tail), (Tail, Head), (Tail, Tail).
3. Unbiased coin: P(Head turns up) = 0.5.
4. Define the r.v. X by X(Head) = 1, X(Tail) = 0. Then P(X=1) = 0.5, P(X=0) = 0.5.

Axioms of Probability
1. 0 <= P(A) <= 1 for any event A
2. P[A ∪ B] = P[A] + P[B] if A, B are disjoint sets/events
3. P[S] = 1

Basic Formulae-I
1. P[A'] = 1 - P[A]
2. P[A ∩ B] = 0 if A, B are disjoint
3. P[A ∪ B] = P[A] + P[B] - P[A ∩ B]
4. P[A ∪ B ∪ C] = P[A] + P[B] + P[C] - P[A ∩ B] - P[A ∩ C] - P[B ∩ C] + P[A ∩ B ∩ C]

Basic Formulae-I-Examples-1
Consider the coin-tossing experiment with three consecutive tosses, Head or Tail being equally likely in any throw.
Sample space = {HHH, HHT, HTH, HTT, THT, THH, TTH, TTT}
Define A = {there are at least 2 Heads}; P(A) = 0.5
Define B = {there is at least 1 Tail}; P(B) = 0.875
A ∩ B = {HHT, THH, HTH}; P[A ∩ B] = 3/8
P[A ∪ B] = P[A] + P[B] - P[A ∩ B] = 1

Basic Formulae-I-Examples-2,3,4
[Venn diagrams illustrating A ∩ B, A ∪ B, and B' (the complement of B)]

Basic Formulae – I-Examples
1. A family that owns 2 cars is selected, and for both the older car and the newer car we note whether the car was manufactured in America, Europe or Asia. (a) What are the possible outcomes of this experiment? (b) Which outcomes are contained in the event that one car is American and the other is non-American? (c) Which outcomes are contained in the event that at least one car is non-American?
2.
In a certain residential suburb, 60% of all households subscribe to the metropolitan newspaper published in a nearby city, 80% subscribe to the local afternoon paper, and 50% of all households subscribe to both papers. If a household is selected at random, what is the probability that it subscribes to (1) at least one of the two papers, (2) exactly one newspaper?

Basic Formulae - II
1. Counting principle: for an ordered sequence to be formed from N groups G1, G2, ..., GN with sizes k1, k2, ..., kN, the total no. of sequences that can be formed is k1 x k2 x ... x kN.
2. For any positive integer m, m! is read as "m-factorial" and defined by m! = m(m-1)(m-2)...3.2.1.
3. An ordered sequence of k objects taken from a set of n distinct objects is called a permutation of size k of the objects, and is denoted by Pk,n = n(n-1)...(n-k+1) = n!/(n-k)!.
4. Any unordered subset of size k from a set of n distinct objects is called a combination, denoted Ck,n = Pk,n/k! = n!/[k!(n-k)!].

Basic Formulae – II-Example
1. A student wishes to commute first to a junior college for two years and then to a state college campus. Within commuting range there are four junior colleges and three state colleges. How many choices of junior college and state college are available to her? If junior colleges are denoted by 1,2,3,4 and state colleges by a,b,c, the choices are (1,a),(1,b),...,(4,c), a total of 12 choices. With n1 = 4 and n2 = 3, N = n1 x n2 = 12 without a list.
2. There are 8 teaching assistants available for grading papers in a particular course. The first exam consists of 4 questions, and the professor wishes to select a different assistant to grade each question (only 1 assistant per question). In how many ways can assistants be chosen to grade the exam? Ans. P4,8 = (8)(7)(6)(5) = 1680.

Basic Formulae – II-Examples
1. Consider the set {A,B,C,D,E}. We know that there are 5!/(5-3)! = 60 permutations of size 3. There are 6 permutations of size 3 consisting of the elements {A,B,C}, since these 3 can be ordered 3.2.1 = 3!
= 6 ways: (A,B,C), (A,C,B), (B,A,C), (B,C,A), (C,A,B) and (C,B,A). These 6 permutations are equivalent to the single combination {A,B,C}. Similarly, for any other combination of size 3, there are 3! permutations, each obtained by ordering the 3 objects. Thus 60 = P3,5 = C3,5 x 3!, so C3,5 = 60/3! = 10. These 10 combinations are {A,B,C}, {A,B,D}, {A,B,E}, {A,C,D}, {A,C,E}, {A,D,E}, {B,C,D}, {B,C,E}, {B,D,E}, {C,D,E}.

Basic Formulae – II-Example
1. The Student Engineers Council at a certain college has one student representative from each of the 6 engineering majors (civil, food, electrical, industrial, materials, and mechanical). In how many ways can (a) both a council president and a vice-president be selected? (b) a president, a vice-president, and a secretary be selected? (c) two members be selected for the President's Council?
2. A real estate agent is showing homes to a prospective buyer. There are 10 homes in the desired price range listed in the area. The buyer has time to visit only 3 of them. (a) In how many ways could the 3 homes be chosen if the order of visiting is considered? (b) In how many ways could the 3 homes be chosen if the order is unimportant? (c) If 4 of the homes are new and 6 have been previously occupied, and if the 3 homes to visit are randomly chosen, what is the prob. that all 3 are new?

Basic Formulae-III
1. Pk,n = n!/(n-k)!
2. Ck,n = n!/[k!(n-k)!]
3. For any two events A and B with P(B) > 0, the conditional probability of A given (that) B (has occurred) is defined by P(A|B) = P(A ∩ B)/P(B) [= 0 if P(B) = 0].
4. Let A, B be disjoint and exhaustive (A ∪ B = S), and C be any event with P[C] > 0. Then P(C) = P(C|A)P(A) + P(C|B)P(B) [Law of Total Probability].
5. Let A, B be disjoint and exhaustive, and C be any event with P[C] > 0. Then P(A|C) = P(C|A)P(A)/[P(C|A)P(A) + P(C|B)P(B)] [Bayes' Theorem].

Basic Formulae-III-examples
1. Suppose that of all individuals buying a certain PC, 60% include a word processing program in their purchase, 40% include a spreadsheet program, and 30% include both types of programs.
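The counting formulas above are easy to check numerically. The course software is R (where factorial() and choose() are built in), but the arithmetic is identical in any language; here is a small Python sketch, with the helper names permutations and combinations being illustrative rather than from the slides:

```python
from math import comb, factorial, perm

# Permutation: ordered sequences of size k from n distinct objects
def permutations(k, n):
    return factorial(n) // factorial(n - k)    # Pk,n = n!/(n-k)!

# Combination: unordered subsets of size k from n distinct objects
def combinations(k, n):
    return permutations(k, n) // factorial(k)  # Ck,n = Pk,n/k!

# 8 TAs, 4 questions, a different TA per question: P4,8
print(permutations(4, 8))        # 1680, matching (8)(7)(6)(5)

# C3,5 = 10 subsets of size 3 from {A,B,C,D,E}
print(combinations(3, 5))        # 10

# Real-estate example: P(all 3 visited homes are new) = C3,4 / C3,10
print(comb(4, 3) / comb(10, 3))  # 4/120, about 0.0333
```

The built-ins math.perm(8, 4) and math.comb(5, 3) (Python 3.8+) give the same values directly.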
Consider randomly selecting a purchaser and let A = {word processing program included} and B = {spreadsheet program included}. Then P(A) = 0.6, P(B) = 0.4, and P(both included) = P(A ∩ B) = 0.30. Given that the selected individual included a spreadsheet program, the probability that a word processing program was also included is P(A|B) = P(A ∩ B)/P(B) = 0.30/0.40 = 0.75.

Basic Formulae-III-examples
1. A chain of video stores sells 3 different brands of VCRs. Of its VCR sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20% are brand 3. Each manufacturer offers a 1-year warranty on parts and labour. It is known that 25% of brand 1's VCRs require warranty repair work, whereas the corresponding percentages for brands 2 and 3 are 20% and 10%, respectively. (a) What is the probability that a randomly selected purchaser has a VCR that will need repair while under warranty?
2. Let Ai = {brand i is purchased} for i = 1,2,3 and B = {needs repair}. The given data imply P(A1) = 0.5, P(A2) = 0.3, P(A3) = 0.2, and P(B|A1) = 0.25, P(B|A2) = 0.20, P(B|A3) = 0.10.

Basic Formulae-III-examples
1. Only 1 in 1000 adults is afflicted with a rare disease for which a diagnostic test has been developed. The test is such that when an individual actually has the disease, a positive result will occur 99% of the time, while an individual without the disease will show a positive test result only 2% of the time. If a randomly selected individual is tested and the result is positive, what is the probability that the individual has the disease? Let A1 = {individual has the disease}, A2 = {individual does not have the disease}, B = {positive test result}. Then P(A1) = 0.001, P(A2) = 0.999, P(B|A1) = 0.99, and P(B|A2) = 0.02. P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) = 0.02097, and P(A1|B) = P(A1 ∩ B)/P(B) = 0.047.

Basic Formulae-IV
1. Two events A and B are independent if P(A|B) = P(A ∩ B)/P(B) = P(A), and are dependent otherwise.
2. Two events A and B are independent if and only if P(A ∩ B) = P(A)P(B).

Random Variables - Discrete
1.
A discrete set is a set that is either finite or can be mapped one-to-one into a subset of the set of natural numbers.
2. A discrete random variable is a r.v. which takes values in a discrete set consisting of numbers.
3. The probability distribution or probability mass function (pmf) of a discrete r.v. X is defined for every number x by p(x) = P(X=x) = P(all s in S: X(s) = x). [P[X=x] is read "the probability that the r.v. X assumes the value x".] Note p(x) >= 0, and the sum of p(x) over all possible x is 1.

Random Variables - Discrete
1. Bernoulli trials (a coin toss is a particular example): the random variable X takes two values, 1 and 0.
2. Notation: P[X=1] = p, 0 < p < 1 (note that this automatically implies P[X=0] = 1-p).
3. A general (arbitrary) discrete random variable can be denoted by an uppercase letter, say X.
4. The discrete values that can be taken by X are x1, x2, x3, ..., xn (assuming that the total no. of values possible is n).
5. Typically, the corresponding probability masses are denoted by p1, p2, ..., pn.

Cumulative Distribution Function
1. The probability distribution or probability mass function of a discrete r.v. is defined for every number x by p(x) = P(X=x) = P(all s in S: X(s) = x).
2. The cumulative distribution function (cdf) F(x) of a discrete r.v. X with pmf p(x) is defined for every number x by F(x) = P(X <= x) = Σ_{y: y <= x} p(y).
3. For any number x, F(x) is the probability that the observed value of X will be at most x.
4. For any two numbers a, b with a <= b, P(a <= X <= b) = F(b) - F(a-), where a- represents the largest possible X value that is strictly less than a.

Discrete R.V.-illustration
1. Consider the Bernoulli r.v. X with P[X=1] = p, 0 < p < 1. The probability mass function can be given by p(x) = p^x (1-p)^(1-x), x = 0, 1.
2. The cumulative distribution function (cdf) is F(x) = P(X <= x) = 0 for x < 0, = 1-p for 0 <= x < 1, = 1 for x >= 1.
3. [Figure: the cdf is a step function with jumps at 0 and 1]

Operations on RV's
1. Expectation of a RV
2. Expectations of functions of RV's
3.
Special Cases: Moments, Covariance

Expected Values of Random Variables
1. Let X be a discrete r.v. with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X) or μX, is E(X) = μX = Σ_{x in D} x p(x).
2. Note that E(X) may not always exist. Consider p(x) = k/x^2.
3. For Bernoulli X, E(X) = p.1 + (1-p).0 = p.
4. E(a + bX) = a + bE(X) [linearity property of expectation].

Expected Values of functions of Random Variables
1. Let X be a discrete r.v. with set of possible values D and pmf p(x). The expected value or mean value of f(X), denoted by E(f(X)), is E(f(X)) = Σ_{x in D} f(x) p(x).
2. Example: variance. Var(X) = V(X) = E[X - E(X)]^2 = E(X^2) - [E(X)]^2.
3. Variance of Bernoulli X: E(X-p)^2 = E(X^2) - p^2 = 1.p - p^2 = p(1-p).
4. The classical expression for the variance of n numbers x1, x2, ..., xn is simply the variance of a r.v. X that takes the values x1, x2, ..., xn, each with probability 1/n.

Expected Values of functions of Random Variables
1. E[a + bX] = a + bE[X]; Var(a + bX) = b^2 Var(X).
2. Standard deviation, aka s.d., is sqrt(Var(X)).
3. Let X be the r.v. with pmf
   x     3   4   5
   p(x)  .3  .4  .3
   E(X) = 3(0.3) + 4(0.4) + 5(0.3) = 4.0
   Var(X) = (3-4)^2 (0.3) + (4-4)^2 (0.4) + (5-4)^2 (0.3) = 0.6
   s.d.(X) = 0.77

Expected Values of functions of Random Variables
1. Let X be the r.v. with pmf
   x     0    1    2    3    4
   p(x)  .08  .15  .45  .27  .05
   Find E(X), Var(X), s.d.(X).

R.V.D - Binomial
1. Binomial experiment: total number of a particular outcome in a sequence of trials with only two possible outcomes.
2. The binomial r.v. X with parameters (n,p), denoted for short by BIN(n,p), is defined by P[X=x] = Cx,n p^x (1-p)^(n-x), x = 0,1,2,...,n.
3. X = X1 + X2 + ... + Xn, where the Xk's are independent Bernoulli r.v.s.
4. E[X] = np (Exercise!); Var[X] = np(1-p).

R.V.D – Binomial-2
Consider the outcomes for a binomial experiment with 4 trials:
Outcome  x  Prob.           Outcome  x  Prob.
SSSS     4  p^4             FSSS     3  p^3(1-p)
SSSF     3  p^3(1-p)        FSSF     2  p^2(1-p)^2
SSFS     3  p^3(1-p)        FSFS     2  p^2(1-p)^2
SSFF     2  p^2(1-p)^2      FSFF     1  p(1-p)^3
SFSS     3  p^3(1-p)        FFSS     2  p^2(1-p)^2
SFSF     2  p^2(1-p)^2      FFSF     1  p(1-p)^3
SFFS     2  p^2(1-p)^2      FFFS     1  p(1-p)^3
SFFF     1  p(1-p)^3        FFFF     0  (1-p)^4

R.V.D – Binomial-3
Each of six randomly selected cola drinkers is given a glass containing cola S and one containing cola F. The glasses are identical in appearance except for a code on the bottom to identify the cola. Suppose there is actually no tendency among cola drinkers to prefer one cola to the other. Then p = P(a selected individual prefers S) = 0.5, so with X = the number among the six who prefer S, X ~ BIN(6, 0.5). Thus
P(X = 3) = C3,6 (0.5)^3 (0.5)^3 = 20 (0.5)^6 = 0.313,
and the probability that at least three prefer S is
P(X >= 3) = Σ_{x=3}^{6} Cx,6 (0.5)^x (0.5)^(6-x) = 42 (0.5)^6 = 0.656.

R.V.D – Binomial-4
E(X) = np. Proof (first method):
E(X) = Σ_{x=0}^{n} x Cx,n p^x (1-p)^(n-x) = Σ_{x=0}^{n} x [n!/(x!(n-x)!)] p^x (1-p)^(n-x)
     = np Σ_{x=1}^{n} [(n-1)!/((x-1)!(n-x)!)] p^(x-1) (1-p)^((n-1)-(x-1))
     = np Σ_{x=0}^{n-1} [(n-1)!/(x!(n-1-x)!)] p^x (1-p)^(n-1-x) = np.
Second method: recognize that X = Σ_{k=1}^{n} Xk, where each Xk is a Bernoulli random variable. Use the linearity property of expectations, i.e., E(X) = Σ_{k=1}^{n} E(Xk) = np.

R.V.D – Binomial-5
Var(X) = np(1-p). Proof (first method):
E(X^2) = E(X(X-1) + X) = E[X(X-1)] + E[X]
       = np + Σ_{x=0}^{n} x(x-1) [n!/(x!(n-x)!)] p^x (1-p)^(n-x)
       = np + n(n-1)p^2 Σ_{x=2}^{n} [(n-2)!/((x-2)!(n-x)!)] p^(x-2) (1-p)^((n-2)-(x-2))
       = np + n(n-1)p^2 Σ_{x=0}^{n-2} [(n-2)!/(x!(n-2-x)!)] p^x (1-p)^(n-2-x)
       = np + n(n-1)p^2.
Hence Var(X) = E(X^2) - (np)^2 = np + n^2 p^2 - np^2 - (np)^2 = np - np^2 = np(1-p).
Second method: recognize that X = Σ_{k=1}^{n} Xk, where each Xk is a Bernoulli random variable. Use the linearity property of expectations, as extended to variances of independent r.v.s, i.e., Var(X) = Σ_{k=1}^{n} Var(Xk) = np(1-p).

R.V.D – Poisson
1. A Poisson r.v. can be thought of as a limit of a binomial experiment where n is very large and np approaches a limit, say λ.
2. The Poisson r.v. X with parameter λ, denoted for short by POI(λ), is defined by P[X=x] = e^(-λ) λ^x / x!, x = 0,1,2,...
3. E[X] = λ (Exercise!); Var[X] = λ.

R.V.D – Poisson-2
Let X denote the number of flaws on the surface of a randomly selected boiler of a certain type. Suppose X has a Poisson distribution with λ = 5. Then the probability that a randomly selected boiler has exactly two flaws is
P(X = 2) = e^(-5) 5^2 / 2! = 0.084.
The probability that a boiler contains at most two flaws is
P(X <= 2) = Σ_{x=0}^{2} e^(-5) 5^x / x! = e^(-5) (1 + 5 + 25/2) = 0.125.

R.V.D – Poisson-3 (as an approximation to the binomial)
If a publisher of non-technical books takes great pains to ensure that its books are free of typographical errors, so that the probability of any given page containing at least one such error is 0.005 and errors are independent from page to page, what is the probability that one of its 400-page novels will contain exactly 1 page with errors? At most 3 pages with errors?
With S denoting a page containing at least 1 error and F an error-free page, the number X of pages containing at least 1 error is a binomial r.v. with n = 400 and p = 0.005, so np = 2. Then
P(X = 1) = b(1; 400, 0.005) ≈ p(1; 2) = e^(-2) 2^1 / 1! = 0.271.

R.V.D – Poisson-4
E(X) = λ:
E(X) = Σ_{x=0}^{∞} x e^(-λ) λ^x / x! = λ Σ_{x=1}^{∞} e^(-λ) λ^(x-1) / (x-1)! = λ Σ_{x=0}^{∞} e^(-λ) λ^x / x! = λ e^(-λ) e^λ = λ.
Var(X) = λ:
Var(X) = E[X(X-1)] + E[X] - (E[X])^2 = Σ_{x=0}^{∞} x(x-1) e^(-λ) λ^x / x! + λ - λ^2
       = λ^2 Σ_{x=2}^{∞} e^(-λ) λ^(x-2) / (x-2)! + λ - λ^2 = λ^2 e^(-λ) e^λ + λ - λ^2 = λ.

Random Variables - Continuous
1. A continuous random variable is a r.v.
which takes values in an interval on the real number line. (If multivariate, then in the two-dimensional real plane, etc.)
2. The probability distribution or probability density function (pdf) of a continuous r.v. X is defined by a function, say f(x), such that P[a <= X <= b] = ∫_a^b f(x) dx.
3. I.e., the probability that X lies between a and b is given by the area under the graph of f(x), enclosed on the x-axis by a and b.
4. If X is a continuous r.v., then for any constant c, P[X=c] = 0.

Cumulative distribution functions
1. The cumulative distribution function (cdf) of a continuous r.v. X with pdf f(x) is defined by F(x) = P[X <= x] = ∫_{-∞}^{x} f(t) dt.
2. The density f(x) is obtained by differentiating F(x) as a function of x.
3. The expectation of a continuous r.v. X with pdf f(x) is defined by E(X) = μ = ∫_{-∞}^{∞} x f(x) dx.
4. The variance of a continuous r.v. X with pdf f(x) and expectation μ is defined by Var(X) = ∫_{-∞}^{∞} (x - μ)^2 f(x) dx.

R.V.C - Uniform
1. The uniform r.v. X with parameters (a,b), denoted for short by UNIF(a,b), with a < b, is defined by the density f(x) = 1/(b-a) if a < x < b, and 0 otherwise.
2. F(x) = 0 if x <= a, = (x-a)/(b-a) if a < x < b, = 1 if x >= b.
3. E[X] = (a+b)/2 (Exercise!); Var[X] = ? (Find out!)

R.V.C - Exponential
1. The exponential r.v. X with parameter θ, denoted for short by EXP(θ), with θ > 0, is defined by the density f(x) = (1/θ) exp(-x/θ) if 0 < x, and 0 otherwise.
2. F(x) = 0 if x <= 0, = 1 - exp(-x/θ) if 0 < x.
3. E[X] = θ (Exercise!); Var[X] = ? (Find out!)

R.V.C – Normal or Gaussian
1. The normal r.v. X with parameters (μ, σ^2), denoted for short by N(μ, σ^2), with σ^2 > 0 and -∞ < μ < ∞, is defined by the density
2. f(x) = (1/(σ sqrt(2π))) exp(-(x-μ)^2 / (2σ^2)).
3. E[X] = μ (Exercise!); Var[X] = σ^2 (Exercise!)
4. It has a density symmetric about its mean. All the measures of central tendency – mean, mode, median – are the same.
5. It occurs as the most common limiting distribution for averages of random variables, i.e., averages of a large no. of r.v.s can be well approximated by it, for most r.v.s.
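The continuous-distribution facts above lend themselves to a quick numerical sanity check. A minimal Python sketch using only the standard library (in R, punif and pnorm play the same role); the grid size and the parameters μ = 100, σ = 15 are arbitrary illustrations, not from the slides:

```python
from statistics import NormalDist

# UNIF(a, b): E[X] = (a+b)/2, checked by a midpoint Riemann sum of x*f(x)
a, b = 2.0, 10.0
n = 100_000
dx = (b - a) / n
mean = sum((a + (i + 0.5) * dx) / (b - a) * dx for i in range(n))
print(round(mean, 6))                   # 6.0 = (a+b)/2

# N(mu, sigma^2): P(mu - sigma <= X <= mu + sigma) is about 0.6827
X = NormalDist(mu=100, sigma=15)
p = X.cdf(115) - X.cdf(85)
print(round(p, 4))                      # 0.6827

# Standardizing: Y = (X - mu)/sigma is N(0, 1), so the probability matches
Z = NormalDist()
print(round(Z.cdf(1) - Z.cdf(-1), 4))   # 0.6827
```

The last two lines agree exactly, illustrating the standardization rule Y = (X - μ)/σ.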
R.V.C – Normal or Gaussian-2
1. If X follows N(μ1, σ1^2), Y follows N(μ2, σ2^2), and X and Y are independent, then aX + bY, where a, b are real constants, follows N(aμ1 + bμ2, a^2 σ1^2 + b^2 σ2^2).
2. The above result can be extended to any finite number of independent normal random variables.
3. If X follows N(0, 1), then X is called a standard normal r.v., and the corresponding distribution function is called the standard normal distribution.
4. If X follows N(μ, σ^2), then Y = (X - μ)/σ follows N(0, 1).

Percentiles of a continuous R.V.
1. Let p be a no. between 0 and 1. The (100p)th percentile of the distribution of a continuous r.v. X with density f(x), denoted by η(p), is defined by p = ∫_{-∞}^{η(p)} f(x) dx.

Gaussian or Normal Distribution
Sample as Random Observables
Parametric Inference
Tests of Hypothesis
Hypothesis Tests for Normal Population
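The percentile definition can be made concrete. For the normal, the inverse cdf is available in Python's statistics.NormalDist (qnorm in R); for EXP(θ), the defining equation p = F(η(p)) can be solved in closed form from the cdf given earlier. A sketch, with θ = 3 an arbitrary illustration:

```python
from math import exp, log
from statistics import NormalDist

# The (100p)th percentile eta(p) solves p = F(eta(p)).

# Standard normal: eta(0.5) = 0 (mean = median), eta(0.975) is about 1.96
Z = NormalDist()                    # N(0, 1)
print(Z.inv_cdf(0.5))               # 0.0
print(round(Z.inv_cdf(0.975), 3))   # 1.96

# EXP(theta): F(x) = 1 - exp(-x/theta), so eta(p) = -theta * log(1 - p)
theta = 3.0
median = -theta * log(1 - 0.5)      # = theta * log(2), about 2.079
# Plugging the percentile back into the cdf recovers p:
print(round(1 - exp(-median / theta), 6))   # 0.5
```

The same inversion underlies the normal-population hypothesis tests previewed above, where critical values such as η(0.975) = 1.96 come from the standard normal cdf.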