Frequency Distribution and Variation
Prepared by E.G. Gascon

Frequency Distributions
Frequency distribution (quantitative data): a table that shows classes or intervals of data entries together with the number of entries in each class.
Frequency f of a class: the number of data entries in the class.
Lower class limit = least number that can belong to the class.
Upper class limit = greatest number that can belong to the class.
Class width = distance between the lower (or upper) limits of consecutive classes (not the difference between the lower and upper limits within a single class).
Range: the difference between the maximum and minimum data entries.
Class boundaries: the numbers that separate classes without forming gaps between them.

Constructing a Frequency Distribution
1. Decide on the number of classes (could be arbitrary).
2. Find the range = highest value - lowest value.
3. Find the class width: divide the range by the number of classes (round up to the next whole number if the result is a decimal).
4. Decide the class limits.
5. Tally the data entries in each class.
6. Count the tally marks to find the frequency of each class.
7. Total the frequencies.

Creating a Histogram in Excel
There are several ways, depending on the version.

Household Income Example

Midpoint    Frequency in Thousands
2500        814
7500        1389
12500       1268
20000       2203
30000       1722
42500       2243
62500       2030
87500       868

Enter the data (enter each midpoint as text by typing a ' in front of it, ex: '250).
Select the data and create a column chart.

[Figure: column chart of Frequency in Thousands (vertical axis, 0 to 2500) against Midpoint Income (horizontal axis)]

Creating a Histogram in Excel-p2
Make the bars touch by changing the gap width to 0: right-click on the bars, select "Format Data Series", and set "gap width = 0".

[Figure: the same chart, Frequency in Thousands against Midpoint Income, with the bars touching]

Measures of Central Tendency
Mean: the sum of the data divided by the number of entries. Affected by outliers (values that lie far from the majority of entries).
Median: the middle of the data when the data set is ordered.
If the data set has an odd number of entries, the median is the middle data entry.
If the data set has an even number of entries, the median is the mean of the two middle entries.
Mode: the data entry that occurs with the greatest frequency.
If no entry is repeated, the data set has no mode.
If two entries occur with the same greatest frequency, each entry is a mode and the data set is called bimodal.
The mode is the only measure of central tendency that can be used to describe non-numeric data; when working with quantitative data, it is rarely used.

Measures of Variation
Range: the difference between the maximum and minimum data entries in the set.
Deviation: for an entry x in a population data set, the difference between the entry and the mean of the data set.
Variance: the average of the squared deviations (not easily calculated in a large sample, so….
Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Sample standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Interpretation of the Standard Deviation
The size of the standard deviation tells us how spread out the data are from the mean. For data with an approximately bell-shaped distribution:
~68% of the data lie within 1 standard deviation of the mean (1 times the size of the SD on either side of the mean)
~95% of the data lie within 2 standard deviations of the mean (2 times the size of the SD on either side of the mean)
~99.7% of the data lie within 3 standard deviations of the mean (3 times the size of the SD on either side of the mean)

Standard score (z-score): the number of standard deviations a given value x falls from the mean, $z = \frac{x - \text{mean}}{\text{standard deviation}}$.
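
The measures of central tendency and variation described above can be sketched in a few lines of Python. This is only an illustration: the data set is hypothetical (it does not come from the slides), the helper name z_score is made up for this example, and the z-score here uses the sample mean and sample standard deviation as an assumption.

```python
import math
import statistics
from collections import Counter

# Hypothetical data set, chosen only for illustration (not from the slides).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Mean: sum of the data divided by the number of entries.
mean = sum(data) / n

# Median: middle entry of the ordered data, or the mean of the
# two middle entries when the number of entries is even.
median = statistics.median(data)

# Mode: entries with the greatest frequency; no mode if nothing repeats.
counts = Counter(data)
top = max(counts.values())
modes = [value for value, c in counts.items() if c == top] if top > 1 else []

# Range: maximum entry minus minimum entry.
data_range = max(data) - min(data)

# Deviation of each entry: the entry minus the mean.
deviations = [x - mean for x in data]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum(d ** 2 for d in deviations) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = math.sqrt(sample_variance)


def z_score(x, mean, sd):
    """Number of standard deviations that x falls from the mean."""
    return (x - mean) / sd


print("mean:", mean)
print("median:", median)
print("modes:", modes)
print("range:", data_range)
print("sample variance:", round(sample_variance, 3))
print("sample standard deviation:", round(sample_sd, 3))
print("z-score of 9:", round(z_score(9, mean, sample_sd), 3))
```

The formulas are written out by hand so that each line matches a definition above; in practice, statistics.variance and statistics.stdev from the Python standard library compute the same sample variance and sample standard deviation.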