Welcome to Week 03 Thurs
MAT135 Statistics
[Image: http://media.dcnews.ro/image/201109/w670/statistics.jpg]

Review: Descriptive Statistics
graphs, n, max, min, each observation, frequencies, “averages”
And… measures of variability!

Descriptive Statistics
Averages tell where the data tends to pile up.
Another good way to describe data is how spread out it is.

Variability
Measures of variability tell how close to the “average” the sample data tend to be.
Just like measures of central tendency, there are several measures of variability:
Range = max – min
Variance: $s^2 = \dfrac{\sum (\text{obs} - \bar{x})^2}{n - 1}$
Standard deviation: $s = \sqrt{\text{variance}}$

Statistics vs Parameters
Statistic    Parameter
n            N
x̄            μ
s²           σ²
s            σ

Questions?

Descriptive Statistics
Other numbers and calculations can be used to summarize our data.

Frequencies
Frequency – the number in a category
Number of Users: 9 18 15 8 1 0 1
Cumulative frequency – the number of observations that fall in that category or a previous category.
This can only be done if the categories can be ordered.

Cumulative Frequencies
How many observations occur in a given category and any previous ordered categories:

Minutes Internet Usage    Number of Users    Cumulative Number of Users
1-20                      9                  9
21-40                     18                 27
41-60                     15                 42
61-80                     8                  50
81-100                    1                  51
101-120                   0                  51
121+                      1                  52

The last value is always “n”, the sample size.
The histogram for a cumulative frequency distribution is called an “ogive”.

Cumulative Frequencies
Data table: A B A B A C B B
Cum Freq distribution:
A: 3
A or B: 3 + 4 = 7
A, B or C: 8
n = 8
Histogram: [figure]

Note that the last category in a cumulative frequency ALWAYS has the value n.
Note also that a cumulative frequency cannot get smaller as you move up the categories. It can stay the same (if the category count is 0).
An ogive typically forms an “s” shape.

Questions?

Fractiles
Another way of describing frequency data
A measure of position
Based on the ogive (cumulative frequency) or ordered data

How to do it:
find n
order the data
divide the data into the # of pieces you want, each with an equal # of members

quartile - four pieces
percentile - 100 pieces

FRACTILES IN-CLASS PROBLEM 6
17 88 33 30 11 41 46 62 5 78 31 54
Step 1: Find n!
n = 12
What’s next?

FRACTILES IN-CLASS PROBLEM 7
5 11 17 30 31 33 41 46 54 62 78 88
Order the data!
What if you split it into equal halves? How many observations would be in each half?

FRACTILES IN-CLASS PROBLEM 8
5 11 17 30 31 33 | 41 46 54 62 78 88
Poof! 6 observations in each half!
This is the 50th percentile or the “median”.

FRACTILES IN-CLASS PROBLEM 9,12
5 11 17 30 31 33 | 41 46 54 62 78 88
The 50th percentile or the “median” = (33 + 41)/2 = 37

FRACTILES IN-CLASS PROBLEM 10
5 11 17 30 31 33 41 46 54 62 78 88
What if you wanted quartiles? How many observations would be in each quartile? Where would the splits be?

FRACTILES IN-CLASS PROBLEM 11,13
5 11 17 | 30 31 33 | 41 46 54 | 62 78 88
Poof! 3 observations in each quartile!
1st quartile = (30 + 17)/2 = 23.5
3rd quartile = (62 + 54)/2 = 58

Fractiles
Quartiles and percentiles are common, others not so much.
The median is also common, but it is called “the median” rather than “the 50th percentile” or “2nd quartile”.

Questions?
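The cumulative-frequency table and the in-class fractile problem can be checked with a short Python sketch. It is only an illustration of the steps on the slides: the function names are made up for this example, and the rule for odd n (leave the middle value out of both halves) is an assumption, since the slides only work through an even n.

```python
from itertools import accumulate

# Counts copied from the "Number of Users" column of the Internet-usage table.
categories = ["1-20", "21-40", "41-60", "61-80", "81-100", "101-120", "121+"]
counts = [9, 18, 15, 8, 1, 0, 1]

# Running totals in category order: each entry is the count for that
# category plus all previous categories.
cumulative = list(accumulate(counts))


def split_median(ordered):
    """Median of already-sorted data: the middle value, or the mean of the two middle values."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """1st quartile, median, 3rd quartile by splitting the ordered data into halves.

    Assumption (not from the slides): when n is odd, the middle value
    is excluded from both halves.
    """
    ordered = sorted(data)               # step: order the data
    n = len(ordered)                     # step: find n
    lower = ordered[: n // 2]            # lower half
    upper = ordered[(n + 1) // 2:]       # upper half
    return split_median(lower), split_median(ordered), split_median(upper)


data = [17, 88, 33, 30, 11, 41, 46, 62, 5, 78, 31, 54]
q1, median, q3 = quartiles(data)

for category, total in zip(categories, cumulative):
    print(f"{category:>8}: {total}")
print("Q1 =", q1, "median =", median, "Q3 =", q3)
```

The last cumulative value equals the sample size (52), and the quartiles match the slide answers (23.5, 37, 58). Other quartile conventions, such as the default method of Python's `statistics.quantiles`, may give slightly different values for the same data.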
Variability
Another measure of variability:
Interquartile range (IQR): IQR = 3rd quartile – 1st quartile
The interquartile range is in the same units as the original data (like the range and standard deviation “s”).

FRACTILES IN-CLASS PROBLEM 14
5 11 17 30 31 33 41 46 54 62 78 88
What is the IQR for our data?
1st quartile = (30 + 17)/2 = 23.5
3rd quartile = (62 + 54)/2 = 58
So the IQR is…
IQR = 58 - 23.5 = 34.5

Questions?

Continuous Distributions
You can think of smooth quantitative data graphs as a series of skinnier and skinnier bars.
When the width of the bars reaches “zero”, the graph is perfectly smooth.
SO, a smooth quantitative (continuous) graph can be thought of as a bar chart where the bars have width zero.

Normal Distribution
The most popular continuous graph in statistics is the NORMAL DISTRIBUTION.
Two descriptive statistics completely define the shape of a normal distribution:
Mean µ
Standard deviation σ

Normal Distribution
Suppose we have a normal distribution, µ = 12, σ = 2.
If µ = 12: [graph centered at 12]
If µ = 12 and σ = 2: [graph with axis labels 6 8 10 12 14 16 18]

Normal Distribution PROJECT QUESTION
Suppose we have a normal distribution, µ = 10, σ = 5.
Axis labels: ? ? ? 10 ? ? ?
Answer: -5 0 5 10 15 20 25

Normal Distribution
We can change any normally distributed variable into a standard normal.
One with:
mean = 0
standard deviation = 1

Normal Distribution PROJECT QUESTION
For the standard normal distribution, µ = 0, σ = 1.
Axis labels: ? ? ? ? ? ? ?
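Returning to the measures of variability: the range, variance, standard deviation, and IQR can all be computed for the in-class data with Python's standard `statistics` module. This is a minimal sketch, not part of the course materials. The quartiles are taken as the medians of the lower and upper halves, as in the in-class problem, which works as written here because n is even.

```python
import statistics

data = [17, 88, 33, 30, 11, 41, 46, 62, 5, 78, 31, 54]
ordered = sorted(data)
n = len(ordered)

# Range = max - min
data_range = max(data) - min(data)

# Sample variance (divides by n - 1) and its square root, s.
variance = statistics.variance(data)
s = statistics.stdev(data)

# Quartiles as medians of the lower and upper halves of the ordered data.
half = n // 2
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[n - half:])
iqr = q3 - q1

print("range    =", data_range)
print("variance =", round(variance, 2))
print("s        =", round(s, 2))
print("IQR      =", iqr)
```

The range, s, and IQR come out in the same units as the data, while the variance is in squared units.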
Normal Distribution
For the standard normal distribution, µ = 0, σ = 1.
Axis labels: -3 -2 -1 0 1 2 3
The standard normal is also called “z”.

Normal Distribution
To calculate a “z-score”:
Take your value x
Subtract the mean µ
Divide by the standard deviation σ
z = (x - µ)/σ

Normal Distribution IN-CLASS PROBLEMS
Suppose we have a normal distribution, µ = 10, σ = 2.
z = (x - µ)/σ = (x - 10)/2
Calculate the z values for x = 9, 10, 15

x     z
9     (9 - 10)/2 = -1/2
10    (10 - 10)/2 = 0
15    (15 - 10)/2 = 5/2

On the graph: axis labels -3 -2 -1 0 1 2 3, with -1/2 and 5/2 marked.

Empirical Rule

Questions?

You survived!
Turn in your classwork!
Don’t forget your homework due next week!
Have a great rest of the week!
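The z-score steps above can be written as a short Python sketch. The function names are made up for this example. The second function runs the formula in reverse (x = µ + zσ), which is one way to read the axis labels on the normal-curve slides.

```python
def z_score(x, mu, sigma):
    """Take the value x, subtract the mean mu, divide by the standard deviation sigma."""
    return (x - mu) / sigma


def axis_labels(mu, sigma):
    """Values of x that sit at z = -3, -2, ..., 3 for a normal curve with the given mu and sigma."""
    return [mu + z * sigma for z in range(-3, 4)]


# In-class problems: mu = 10, sigma = 2, x = 9, 10, 15
z_values = [z_score(x, 10, 2) for x in (9, 10, 15)]
print(z_values)             # [-0.5, 0.0, 2.5]

print(axis_labels(12, 2))   # [6, 8, 10, 12, 14, 16, 18]
print(axis_labels(10, 5))   # [-5, 0, 5, 10, 15, 20, 25]
print(axis_labels(0, 1))    # [-3, -2, -1, 0, 1, 2, 3]
```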