Descriptive Statistics II

By the end of this class you should be able to:
• describe the meaning of, and calculate, the mean and standard deviation of a sample
• estimate normal proportions based on the mean and standard deviation
• plot a histogram with alternative scaling

Palm: Sections 7.1, 7.2
Please download cordbreak1.mat and FWtemperature.txt.

Exercise
• Download FWTemperature.txt
• Read it into MATLAB
• Prepare a single figure with two plots
  – a histogram of March highs (row 2)
  – a histogram of April highs (row 4)
• Label these plots fully
• Print out your commands and the resulting figure

Review: Quantifying Variation

Mean (central tendency)
>> mean(x)
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Standard deviation (spread)
>> std(x)
$s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Reading the standard deviation formula term by term:
• difference: $x_i - \bar{x}$ is the deviation of each point about the mean
• squared: makes all values positive
• summation: yields one number
• divide by $n - 1$: normalizes the sum based on the degrees of freedom

Formula, MATLAB, and Excel equivalents:
• Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
  – MATLAB: >> mean(variable)
  – Excel: =average(range)
• Sample standard deviation: $s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
  – MATLAB: >> std(variable)
  – Excel: =stdev(range)

The Normal (Gaussian) Distribution
[Figure: probability density (scaled frequency) versus standard deviations from the mean, from -4 to 4, with the mode, the mean, and the (population) standard deviation marked.]

Note on Sample and Population Statistics
• Sample (the estimate from a sample of the whole population): mean $\bar{x}$, standard deviation $s$ (or $s_x$)
• Population (the true value from the entire population): mean $\mu$ (or $\mu_x$), standard deviation $\sigma$
• As $n \to \infty$: $\bar{x} \to \mu$ and $s \to \sigma$

Expected Proportions for Known $\sigma$
[Figure: normal curve, probability density (scaled frequency) versus standard deviations from the mean $\mu$, showing the percentage of observations in each range.]
• within $\pm 1\sigma$ of the mean: 68%
• within $\pm 2\sigma$ of the mean: 95.5%
• within $\pm 3\sigma$ of the mean: 99.7%

[Figure: the same curve with the central 68% and the two tails shaded.]
Because the curve is symmetric, the part outside $\pm 1\sigma$ splits evenly between the two tails: $(100 - 68)/2 = 16\%$ in each.

Proportions Problem
Data analysis of the breaking strength of a certain fabric shows that it is normally distributed with a mean of 200 lb and a variance ($\sigma^2$) of 9.
• Estimate the percentage of fabric samples that will have a breaking strength between 197 lb and 203 lb.
• Estimate the percentage of fabric samples that will have a breaking strength no less than 194 lb.

Cord Breaking Distribution with Normal Curve
[Figure: histogram of scaled frequency (axis scaled by $10^{-3}$) versus breaking force (n), from 145 to 365, overlaid with the normal curve below.]
$p(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$

Review: Types of Histograms
Here x is the data, z the counts per bin, c the bin centers returned by hist, b a vector of chosen bin centers, and w the bin width.

• Absolute frequency: the absolute count in each bin
  – Formula: z
  – Use: for a quick picture
  – MATLAB:
    >> hist(x, n)
• Relative frequency: the fraction of the total count in each bin
  – Formula: z / sum(z)
  – Use: to compare samples when total counts differ
  – MATLAB:
    >> [z, c] = hist(x)
    >> zr = z/sum(z)
    >> bar(c, zr)
• Scaled frequency: the fraction of the total area in each bin
  – Formula: z / (sum(z) * bin width)
  – Use: to compare samples when bin sizes differ
  – MATLAB:
    >> [z, c] = hist(x, b)
    >> zs = z/(sum(z)*w)
    >> bar(c, zs)

Additional Example (not covered in class)
Looking at two sets of data
• Look at a histogram of the second set of data, 'cord2'
• How would you compare it to 'cord', the first set of data?
• What problems do you run into?
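
As a check on the Proportions Problem, the two percentages can also be computed numerically. The sketch below is not part of the original slides. It assumes only the stated mean (200 lb) and variance (9), and builds the standard normal cumulative distribution from the built-in erf function; the names Phi, pBetween, and pAtLeast are illustrative.

```matlab
% Sketch: numerical check of the Proportions Problem (mu = 200 lb, variance = 9).
mu = 200;            % mean breaking strength, lb
sigma = sqrt(9);     % standard deviation = sqrt(variance) = 3 lb

% Standard normal cumulative distribution, written in terms of erf
Phi = @(z) 0.5 * (1 + erf(z / sqrt(2)));

% Fraction between 197 lb and 203 lb, i.e. within mu -/+ 1*sigma
pBetween = Phi((203 - mu) / sigma) - Phi((197 - mu) / sigma);

% Fraction no less than 194 lb, i.e. at or above mu - 2*sigma
pAtLeast = 1 - Phi((194 - mu) / sigma);

fprintf('Between 197 and 203 lb: %.1f %%\n', 100 * pBetween);
fprintf('No less than 194 lb:    %.1f %%\n', 100 * pAtLeast);
```

The results agree with the rule-of-thumb estimates: about 68% for the ±1σ range, and 95.5% plus half of the remaining 4.5%, or roughly 97.7%, for everything above μ − 2σ.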
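
The three histogram scalings from the review can be compared side by side. This is a minimal sketch on synthetic data, not the class cordbreak1.mat data. The sample values, the bin width w, and the bin centers b are all assumptions chosen for illustration.

```matlab
% Synthetic stand-in data (assumption: NOT the class cordbreak1.mat data)
x = 250 + 40 * randn(1, 500);     % hypothetical breaking forces

w = 20;                           % bin width (assumed)
b = 130:w:370;                    % bin centers, evenly spaced by w

[z, c] = hist(x, b);              % z = absolute counts, c = bin centers
zr = z / sum(z);                  % relative frequency: bar heights sum to 1
zs = z / (sum(z) * w);            % scaled frequency: bar areas sum to 1

% Normal curve from the sample mean and standard deviation
m = mean(x);
s = std(x);
xx = linspace(min(b), max(b), 200);
p = exp(-(xx - m).^2 / (2 * s^2)) / (s * sqrt(2 * pi));

subplot(3, 1, 1); bar(c, z);  ylabel('Absolute frequency');
subplot(3, 1, 2); bar(c, zr); ylabel('Relative frequency');
subplot(3, 1, 3); bar(c, zs); hold on; plot(xx, p, 'r'); hold off;
ylabel('Scaled frequency'); xlabel('Breaking force');
```

Only the scaled version shares a vertical scale with the probability density, which is why the normal curve is overlaid on that panel alone. Newer MATLAB releases also provide histogram with a 'Normalization' option; hist is used here to match the commands in the slides.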