Summation Notation

The symbol $\sum$ means sum. If $x_1, x_2, \ldots, x_n$ is a set of n numbers (data), then

$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$

is the sum of those numbers. We often abbreviate $\sum_{i=1}^{n} x_i$ as $\sum x$. The expression $\frac{1}{n} \sum x$ represents the average (mean) value of the numbers. If $x_1, x_2, \ldots, x_n$ is a sample data set, then n is called the sample size.

Example. If a data set is 3, 8, 4, 5, 3, 4, 6 (n = 7), compute:
a) $\sum x = 3 + 8 + 4 + 5 + 3 + 4 + 6 = 33$
b) $\sum x^2 = 3^2 + 8^2 + 4^2 + 5^2 + 3^2 + 4^2 + 6^2 = 175$
c) $\sum x^2 - (\sum x)^2 = 175 - 33^2 = 175 - 1089 = -914$

2.3 – Numerical Measures of Central Tendency

[Figure: dot plot of some sample data]

Where is the center of this distribution? Common measures of the center (or central tendency) of a set of numbers are the mean, the median, and the mode.

The mean (average) value of sample data x is called the sample mean and is denoted by $\bar{x}$.

Sample Mean: $\bar{x} = \frac{\sum x}{n}$, where n is the sample size (= number of observations).

The mean (average) value of a variable x in the whole population is called the population mean and is denoted by µ. The population mean µ is rarely known or computable. It is a parameter of a model that mathematically describes the population.

Example. The mean of -1, 0, 2, 5 is $\bar{x} = \frac{-1 + 0 + 2 + 5}{4} = \frac{6}{4} = 1.5$

The median of sample data x is the middle value (if n is odd) or the average of the two middle values (if n is even) of the observations arranged in ascending order.

Example.
a) Find the median of 5, 2, 7, 1, 0.
Solution: Sorted data: 0, 1, 2, 5, 7; median = 2
b) Find the median of 5, 2, 7, 1, 0, 4.
Solution: Sorted data: 0, 1, 2, 4, 5, 7; median = (2 + 4)/2 = 3

The mode is the value that occurs most frequently in a data set. When the data consist of a large number of different observations, we instead define the modal class, which is the interval (or intervals) in a histogram of the data that contains the largest number of observations.

Example.
a) The mode of the data set 0, 2, 3, 1, 4, 3, 2, 3, 5 is 3.
b) For the R&D data from Section 2.1 the modal class is [7, 8).

Exercise. Given data 1, 2, 7, 10, 2, 4, 4, 13, 2.
a) Find the mean.
b) Find the median.
c) Find the mode.
d) Make the dot plot and mark all three values.
e) Do a) and b) using a TI-83 and Excel.

Geometric Interpretation

Median = value that divides a histogram into two equal halves
Mean = balancing point of a histogram (or a dot plot)

[Figure: histogram with the median dividing the area into two 50% halves, and the mean marked]

2.4 – Measures of Variability

A) 3, 3, 3, 3, 3, 3, 3, 3, 3 NO VARIABILITY
B) 3, 2, 3, 4, 4, 3, 3, 3, 2 SMALL VARIABILITY
C) 2, 6, 4, 1, 3, 0, 2, 5, 4 LARGE VARIABILITY

Common measures of variability are the range, the variance, and the standard deviation.

Range = maximum – minimum

Example. For the above data sets the ranges are:
Range for the set A = 3 – 3 = 0
Range for the set B = 4 – 2 = 2
Range for the set C = 6 – 0 = 6

Deviation = $x - \bar{x}$ = difference between a data value and the sample mean

Example. For the data set 0, 1, 2, 3, 3, 4, 4, 5, 6 the sample mean is $\bar{x} = 28/9 \approx 3.1$, and the deviation of x = 5 from the mean is $x - \bar{x} = 5 - 3.1 = 1.9$

Sample Variance = $s^2$ = "average" squared deviation of data values from their mean: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$. The word "average" is in quotes because the sum is divided by n – 1 rather than n.

Sample Standard Deviation = s = square root of the variance: $s = \sqrt{s^2}$

Example. Compute the mean, the variance, and the standard deviation for the data set C) 2, 6, 4, 1, 3, 0, 2, 5, 4: a) by hand, b) using a TI-83, c) using Excel.

a) The sample size is n = 9.
The sample mean = $\bar{x} = \frac{27}{9} = 3$
The sample variance = $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{1 + 9 + 1 + 4 + 0 + 9 + 1 + 4 + 1}{8} = \frac{30}{8} = 3.75$
The sample standard deviation = $s = \sqrt{3.75} \approx 1.94$

b) TI-83 commands:
STAT - EDIT - [choose] 1:Edit… - ENTER - [enter data in column L1]
STAT - CALC - [choose] 1:1-Var Stats… - ENTER - L1 [2nd 1] - ENTER

c) Excel functions: sample mean =AVERAGE(block), sample standard deviation =STDEV(block)

Exercise. Given data 1, 2, 7, 10, 2, 4, 4, 13, 2.
a) Find the mean.
b) Find the deviation of x = 2 from the mean.
c) Find the range.
d) Find the variance $s^2$ and the standard deviation s for this data set (any method).
e) From each data value subtract the mean and divide the result by s; that is, for each x compute $z = \frac{x - \bar{x}}{s}$. Then compute the mean and the standard deviation for the set of z's.
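
Besides the TI-83 and Excel, the worked examples in these notes can be checked with a few lines of Python. The sketch below is not part of the original course material. It uses Python's standard `statistics` module and assumes the sample variance divides by n – 1, which is what Excel's STDEV does. The variable names are illustrative only.

```python
import statistics

# Summation example: 3, 8, 4, 5, 3, 4, 6 (n = 7)
x = [3, 8, 4, 5, 3, 4, 6]
sum_x = sum(x)                       # sum of x
sum_x2 = sum(v ** 2 for v in x)      # sum of x^2
diff = sum_x2 - sum_x ** 2           # (sum of x^2) - (sum of x)^2

# Measures of central tendency, using the example data sets from Section 2.3
mean_example = statistics.mean([-1, 0, 2, 5])
median_odd = statistics.median([5, 2, 7, 1, 0])       # n odd: middle value
median_even = statistics.median([5, 2, 7, 1, 0, 4])   # n even: average of two middle values
mode_example = statistics.mode([0, 2, 3, 1, 4, 3, 2, 3, 5])

# Measures of variability for data set C from Section 2.4
c = [2, 6, 4, 1, 3, 0, 2, 5, 4]
data_range = max(c) - min(c)
c_mean = statistics.mean(c)
c_var = statistics.variance(c)   # sample variance, divides by n - 1 (assumed definition)
c_sd = statistics.stdev(c)       # sample standard deviation, square root of c_var

# Standardized values z = (x - mean) / s, as in part e) of the last exercise,
# computed here for data set C rather than for the exercise data
z = [(v - c_mean) / c_sd for v in c]
z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)

print("sums:", sum_x, sum_x2, diff)
print("center:", mean_example, median_odd, median_even, mode_example)
print("spread of C:", data_range, c_mean, c_var, round(c_sd, 2))
print("z-scores of C: mean", round(z_mean, 6), "sd", round(z_sd, 6))
```

Replacing the lists with the exercise data 1, 2, 7, 10, 2, 4, 4, 13, 2 gives a way to check answers worked out by hand.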