Data
• Quantitative data are numerical observations.
– Example: the ages of students in a class. Age is quantitative because it is recorded as a number.
• Qualitative data are categorical observations.
– Example: marital status (married, single, divorced, widowed).

Data (continued)
• Variable: a characteristic that assumes different values.
– Gender is a variable whose values include male and female.
• Grouping data
– Use more than 5 and fewer than 15 classes.
• Use Sturges' rule to determine the number of classes: k ≈ 1 + 3.322 log10(n), where n is the number of observations.
• Use a bar graph to describe qualitative data.

Frequency Distribution
• Relative frequency = frequency of a given data value / total frequency.
– Relative frequency is a fraction of the total frequency.
– Example: 25% of the total number of students passed the test.
• Cumulative frequency shows the number of observations below or above a certain value.

Measure of Central Tendency
• Extreme values (outliers) are very high or very low compared to the rest of the data.
– Draw a box plot to determine the number of extreme values.
• When there are no extreme values, use the mean as the measure of central tendency. x̄ (x-bar) denotes a sample mean.
• When there are extreme values, use the median as the measure of central tendency.

Measure of variation in data
• Range is the simplest measure of variation: range = highest value - lowest value.
– Not a good measure because it uses only two values of the data set.
• Standard deviation measures the dispersion of the data from the mean of the data set.
– Better than the range because it uses all of the data in the data set.

Mean of data
• When data are given without a frequency column, add all values of x and divide by the total number of data values.
• For repeated data, multiply each data value by its frequency and add the products. Divide the sum by the sum of all frequencies.
• For grouped data, get the class mark (midpoint) of each class and use it as the data value for that class.

Median of data
• Arrange the data in ascending order and find the middle number. This applies to an odd number of data values; the position of the median is (n+1)/2.
• When the number of data values is even, find the two middle numbers and take their arithmetic mean. The two middle numbers are at positions n/2 and (n/2)+1.

Mode
• The mode is the value that occurs most frequently in the data set.
• For the mode of repeated and grouped data, follow the formulas given in class.
• If there are more than two modes in a given set of data, the sense of central tendency is obscured.

Quartiles
• The first quartile (Q1) cuts off the lower-tail area of 25% of the distribution. If the first quartile mark of test #1 is 60, then 25% of the students received less than 60.
• The second quartile (Q2) cuts off the lower 50% of the distribution.
• The third quartile (Q3) separates the lower 75% of the distribution from the upper 25%.

Skewness
• The distribution is skewed to the right when the right tail is longer than the left one: mean > median > mode.
• The distribution is skewed to the left when the left tail is longer than the right one: mean < median < mode.
• The distribution is symmetric when the left side is the mirror image of the right.

Empirical Rule
• For a distribution that is symmetric and unimodal, about 68% of the data fall within 1 standard deviation of the mean.
• About 95% of the data fall within 2 standard deviations of the mean.
• About 99.7% of the data fall within 3 standard deviations of the mean.
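
The measures summarized in these notes can be tried out in a few lines of Python. The sketch below is only an illustration: the `scores` list is made-up data, `share_within` is a hypothetical helper name, and the quartiles use the default "exclusive" method of the standard library, which may differ slightly from the convention used in class.

```python
import math
import statistics
from collections import Counter

# Hypothetical test scores, invented for illustration only.
scores = [55, 60, 60, 65, 70, 70, 70, 75, 80, 85, 90]
n = len(scores)

# Sturges' rule: suggested number of classes, rounded up to a whole number.
num_classes = math.ceil(1 + math.log2(n))

# Frequency table and relative frequency (frequency / total frequency).
freq = Counter(scores)
rel_freq = {x: f / n for x, f in freq.items()}

# Mean of repeated data: sum of (value * frequency) / sum of frequencies.
mean_from_freq = sum(x * f for x, f in freq.items()) / sum(freq.values())

# Median (middle value of the sorted data) and mode (most frequent value).
median = statistics.median(scores)
mode = statistics.mode(scores)

# Variation: range and sample standard deviation.
data_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)

# Quartiles Q1, Q2, Q3 (default "exclusive" method of statistics.quantiles).
q1, q2, q3 = statistics.quantiles(scores, n=4)


def share_within(data, k):
    """Fraction of values within k sample standard deviations of the mean."""
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(1 for x in data if abs(x - m) <= k * s) / len(data)


print("classes (Sturges):", num_classes)   # 5
print("mean:", round(mean_from_freq, 2))
print("median:", median, "mode:", mode)    # 70 and 70
print("range:", data_range)                # 35
print("sample sd:", round(sample_sd, 2))
print("Q1, Q2, Q3:", q1, q2, q3)
for k in (1, 2, 3):
    print(f"share within {k} sd:", round(share_within(scores, k), 3))
```

With only 11 observations the shares printed by the last loop will not match 68%, 95%, and 99.7% closely; the empirical rule is an approximation for symmetric, unimodal distributions and works best with larger data sets.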