FOR ALL CALCULATIONS WRITE OUT THE FULL CALCULATOR DISPLAY AS YOUR ANSWER. DON'T ROUND.

Graphical Representation:

HISTOGRAMS: Used for CONTINUOUS DATA only. Area of bar $\propto$ frequency.
$\text{Frequency Density} = \dfrac{\text{Frequency}}{\text{Class Width}}$

STEM & LEAF DIAGRAMS:
Advantages: Retains original data. Highlights trends (resembles a Bar Chart). Illustrates skewness.
Disadvantages: Time consuming to draw. Does not highlight any averages.

BOX & WHISKER DIAGRAMS:
[Diagram: box and whisker plot marking Lowest Value, LQ, Median, UQ and Highest Value against a scale from 10 to 60 %]
Advantages: Highlights trends. Illustrates skewness. Highlights average. Highlights outliers.
Disadvantages: Time consuming to draw. Does not retain original data.
-VE SKEW: $Q_2 - Q_1 > Q_3 - Q_2$; mean < median < mode.
+VE SKEW: $Q_2 - Q_1 < Q_3 - Q_2$; mode < median < mean.
SYMMETRICAL: $Q_2 - Q_1 = Q_3 - Q_2$; mode = median = mean.

Averages & Measures of Dispersion:

MEAN: $\bar{x} = \dfrac{\sum x}{n}$ or $\bar{x} = \dfrac{\sum fx}{\sum f}$

STD DEV: $\sigma = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2}$ or $\sigma = \sqrt{\dfrac{\sum fx^2}{\sum f} - \bar{x}^2}$

MEDIAN & QUARTILES for raw data:
Median = $\dfrac{n+1}{2}$th value ALWAYS.
Lower Quartile = median of the values BELOW the median.
Upper Quartile = median of the values ABOVE the median.

MEDIAN by INTERPOLATION:
$\text{Median} = b + \dfrac{c}{f_c}\left(\tfrac{1}{2}n - CF\right)$
where $b$ = lower class boundary; $c$ = class width; $f_c$ = class frequency; $CF$ = cumulative frequency up to the 'median class'.

CODING: used to simplify the arithmetic of finding the mean / standard deviation of a Grouped Frequency Distribution:
$y = \dfrac{\text{class midpoint} - \text{midpoint of middle class}}{\text{(uniform) class width}}$
To decode:
- Mean: reverse the code: multiply by class width, then add on the midpoint of the middle class.
- Standard Deviation: ONLY multiply by class width.

Probability:

ADDITION LAW: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
MULTIPLICATION LAW: $P(A \cap B) = P(A \mid B) \times P(B)$
MUTUALLY EXCLUSIVE: NO INTERSECTION, so $P(A \cap B) = 0$ and $P(A \cup B) = P(A) + P(B)$
PROOF OF INDEPENDENCE: $P(A \mid B) = P(A)$

Random Variables:

$E(X) = \sum x\,p(x)$
$E(X^2) = \sum x^2\,p(x)$
$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$

THE DISCRETE UNIFORM DISTRIBUTION: X values (outcomes) MUST begin at 1 and be consecutive.
$p(x) = \dfrac{1}{n}$
$E(X) = \dfrac{n+1}{2}$
$\mathrm{Var}(X) = \dfrac{1}{12}(n+1)(n-1)$

SCALING RANDOM VARIABLES: Normally used when a Discrete Uniform Distribution has been 'scaled', ie. x=1,2,3,…,n changed to y=3,5,7,…2n+1.
$E(aX + b) = aE(X) + b$
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
$E(aX + bY) = aE(X) + bE(Y)$
$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$ (for independent X and Y)

Regression & Correlation:

EQN of REGRESSION LINE: $y = bx + a$ where $b = \dfrac{s_{xy}}{s_{xx}}$, $a = \bar{y} - b\bar{x}$ and
$s_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$
$s_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$
$s_{xy} = \sum xy - \dfrac{\sum x \sum y}{n}$
REMEMBER WHEN ASKED YOU MUST ALWAYS INTERPRET THE GRADIENT IN CONTEXT.

PMCC: $r = \dfrac{s_{xy}}{\sqrt{s_{xx}\,s_{yy}}}$ where $-1 \le r \le 1$

r value         Description
-1              Perfect negative correlation
-1 to -0.9      Very Strong negative correlation
-0.9 to -0.7    Strong negative correlation
-0.7 to -0.4    Moderate negative correlation
-0.4 to -0.2    Weak negative correlation
-0.2 to 0.2     No correlation
0.2 to 0.4      Weak positive correlation
0.4 to 0.7      Moderate positive correlation
0.7 to 0.9      Strong positive correlation
0.9 to 1        Very Strong positive correlation
1               Perfect positive correlation

The Normal Distribution:

STANDARDISING: $Z = \dfrac{X - \mu}{\sigma}$, where $X \sim N(\mu, \sigma^2)$ and $Z \sim N(0, 1)$.
NB. Mean & Variance are parameters.

TABLES: Values in tables are $\Phi(z)$, the areas to the LEFT of z values (LESS THAN).
THINK about the sign of Z: negative if left of centre.

Useful Regions:
[Diagrams: shaded regions under the normal curve, labelled $\Phi(z)$, $1 - \Phi(z)$, $\Phi(z_1) - \Phi(z_2)$ and $\Phi(z_1) + \Phi(z_2) - 1$]

WORKING BACKWARDS: Remember to ADD any simultaneous equations so that the μ's cancel.
Lower tail z values cannot be looked up directly: look up 1 minus the probability and remember to make z negative.
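As a rough check on hand working, three of the formulas on this sheet (median by interpolation, the regression line with the PMCC, and standardising) can be sketched in Python. This is only an illustrative sketch: the function names and every input value below are made up for the example and are not part of the sheet. `NormalDist` from Python's standard `statistics` module stands in for the Φ(z) tables.

```python
import math
from statistics import NormalDist


def interpolated_median(b, c, f_c, n, cf):
    """Median = b + (c / f_c) * (n / 2 - CF) for grouped data.

    b = lower class boundary, c = class width, f_c = class frequency,
    n = total frequency, cf = cumulative frequency up to the median class.
    """
    return b + (c / f_c) * (n / 2 - cf)


def regression_and_pmcc(xs, ys):
    """Return (a, b, r) for the line y = bx + a and the PMCC r."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = s_xy / s_xx                      # gradient
    a = sum_y / n - b * (sum_x / n)      # intercept: a = y-bar - b * x-bar
    r = s_xy / math.sqrt(s_xx * s_yy)    # PMCC
    return a, b, r


def prob_less_than(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2): standardise, then use Phi(z)."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)           # Phi(z): area to the LEFT of z


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
print(interpolated_median(b=20, c=10, f_c=8, n=30, cf=11))   # 25.0
print(regression_and_pmcc([1, 2, 3, 4], [3, 5, 7, 9]))       # (1.0, 2.0, 1.0)
print(prob_less_than(50, mu=50, sigma=4))
```

The code keeps full floating-point precision throughout, in line with the instruction at the top of the sheet not to round intermediate values.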