Introducing the Study of Statistics
Henan University of Technology – November 2013
© 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

What is Statistics?
Seven Steps
1. Population
2. Sample
3. Variable
4. Distribution
5. Probability
6. Tables
7. Inferences
Try to learn what each of these seven steps really means and you will be well on your way to mastering statistics. Descriptive statistics covers the first six steps, but it's important to understand why we study statistics, which is Step 7.

Seven Steps
• There are seven steps, so if you cannot follow them, which step is the problem? This course concerns Steps 1 to 6, but for the uses of statistics we refer to Step 7 as well.
• Note down the problem area for you (if any) and seek help about it. Resources include: lecturer, tutors, internet, textbooks, website and PowerPoints. All of these resources are available to help you learn.

What is a Distribution?
• A distribution is an arrangement of values of a variable showing their observed or theoretical frequency of occurrence.
• Distributions may be based on discrete data or continuous data. What is the difference? Discrete data take separate, countable values; continuous data can take any value within a range.
• Throws of a die = discrete data
• Share market movements = continuous data

Distributions of Each Type
Discrete: Binomial, Poisson
Continuous: Normal, Uniform, Exponential, Student's t, Chi-squared, F
We will be looking at these distributions during the course.
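
To make the idea of observed versus theoretical frequency concrete, here is a minimal Python sketch. It assumes a fair six-sided die simulated with the standard library's random module; the function name observed_frequencies, the 600 throws and the fixed seed are illustrative choices, not part of the course material.

```python
import random
from collections import Counter
from fractions import Fraction


def observed_frequencies(throws, seed=0):
    """Throw a simulated fair die `throws` times and return the
    relative frequency of each face (the observed distribution)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(throws))
    # Counter returns 0 for a face that never appeared.
    return {face: counts[face] / throws for face in range(1, 7)}


# Theoretical distribution of a fair die: every face has probability 1/6.
theoretical = {face: Fraction(1, 6) for face in range(1, 7)}

if __name__ == "__main__":
    observed = observed_frequencies(600)
    for face in range(1, 7):
        print(face, round(observed[face], 3), theoretical[face])
```

The observed frequencies should sit close to 1/6 without matching it exactly; the gap usually narrows as the number of throws grows.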

Distributions and Variables
A distribution is an arrangement of values of a variable showing their observed or theoretical frequency of occurrence.
A variable is a characteristic of a population. Throw two dice and you get 36 possible pairs, i.e. a population of 36. An example of a characteristic is the sum of the two dice.
A random variable is a rule that assigns a number to each outcome. For example, the sum of the two dice has 11 possible values (2 to 12).

Distributions and Variables
How many possible outcomes are there for the difference between two dice? Only 6 (0 to 5, taking the larger value minus the smaller). It is a different characteristic or rule, so it has a different set of values, but it's still a random variable on the population of 36 possible pairs.
This is the reason we have so many distributions: we want to look at different rules within the data to get the information we need.

What is Statistics? Seven Steps: 1. Population
A population is the group of all of the items of interest. For example, all of the fish in a lake. Or all of the people in China. Or all of the pandas in China.
It can be a group of things: the possible number of combinations when you throw two dice, share market changes in the Shanghai stock market, or interest rates paid by Chinese banks.
A parameter is a measurement of the population, e.g. the population mean or the population standard deviation.

What is Statistics? Seven Steps: 2. Sample
A sample is a set of data drawn from the population. For example, catching some fish in a lake. A survey of some Chinese people. Studying some pandas. Shanghai stock market changes between 1990 and 2005.
Populations can be very, very large, so it's easier to study samples and use that evidence to support or oppose a point of view.

What is Statistics? Seven Steps: 3. Variable
A variable is a characteristic of a population. For example, fish in a lake over 20 centimetres in length. Pandas over 30 kilogrammes in weight.
A variable is a characteristic of a population, but a parameter is a measurement of a population. For example, the mean outcome of one die throw is 3.5 (a parameter), but one throw is a variable with 6 possible outcomes.

What is Statistics? Seven Steps: 4. Distribution
A distribution is an arrangement of values of a variable showing their observed or theoretical frequency of occurrence.
Throw a single die 24 times and observe how often the number 6 appears. Try again with the number 1 this time. These are two observed distributions. The theoretical distribution is, on average, one 6 every 6 throws and one 1 every 6 throws, i.e. four of each expected in 24 throws.

What is Statistics? Seven Steps: 5. Probability
Probability is likelihood: the chance of an event occurring. The event needs to be clearly defined and understood.
An event is often a collection of simpler events. The probability of an event is the sum of the probabilities of its simple events.
The probability of throwing a die and getting a value less than or equal to 3 is 1/6 + 1/6 + 1/6 = 1/2.

What is Statistics? Seven Steps: 6. Tables
Tables list out the probabilities of actual events at varying levels of significance. The lower the level of significance, the higher the degree of confidence we can have in our proposed value, using the table's value to compare with our proposed value.
Tables vary according to the probability distribution being used. Some tables list critical values (for selected levels of significance) rather than a full set of numbers.

What is Statistics? Seven Steps: 7. Inferences
Inferential statistics develops methods to draw conclusions about characteristics of a population, based on sample data. It is logical to move from descriptive to inferential statistics.
Statistics aims to get new understanding: to get useful information from lots of data.
Similar problems are likely to have similar solutions. But we need to build confidence before making such judgements.
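
The two-dice figures quoted in the slides (36 pairs, 11 possible sums, 6 possible differences) and the probability of throwing 3 or less can be checked by enumeration. The sketch below is one way to do it in Python, under two assumptions that are not stated in the slides: the "difference" is taken as the absolute difference, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Population: all 36 ordered pairs from two six-sided dice.
pairs = list(product(range(1, 7), repeat=2))

# Two different random variables (rules) on the same population.
sum_counts = Counter(a + b for a, b in pairs)
diff_counts = Counter(abs(a - b) for a, b in pairs)  # absolute difference (assumption)

# Theoretical distribution of the sum: each value's share of the 36 pairs.
sum_distribution = {
    value: Fraction(count, len(pairs))
    for value, count in sorted(sum_counts.items())
}

# Probability of an event as the sum of the probabilities of its simple events:
# faces 1, 2 and 3 each contribute 1/6.
p_at_most_3 = sum(Fraction(1, 6) for face in range(1, 7) if face <= 3)

print(len(pairs))           # 36
print(sorted(sum_counts))   # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(sorted(diff_counts))  # [0, 1, 2, 3, 4, 5]
print(p_at_most_3)          # 1/2
```

Using Fraction keeps the probabilities exact, so the result prints as 1/2 and not as a rounded decimal.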

Statistic
Finally, the word ‘statistic’:
• Recall that we use the word ‘parameter’ to refer to a measurement of a population.
• ‘Population’ covers all of the items of interest, while ‘sample’ covers only a part of the population.
• We use the word ‘statistic’ to refer to a measurement of a sample.
• A sample statistic is a guide to the value of a population parameter.
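
As a rough illustration of a sample statistic acting as a guide to a population parameter, the Python sketch below treats the six faces of a fair die as the population, so the parameter is the mean of 3.5 mentioned earlier. The function name sample_mean, the sample sizes and the fixed seed are assumptions made for this example only.

```python
import random
from statistics import mean

# Population: the six equally likely faces of a fair die.
population = [1, 2, 3, 4, 5, 6]

# Parameter: a measurement of the population (its mean, 3.5).
population_mean = mean(population)


def sample_mean(sample_size, seed=0):
    """Draw `sample_size` simulated throws and return their mean,
    which is a statistic (a measurement of the sample)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    sample = [rng.choice(population) for _ in range(sample_size)]
    return mean(sample)


if __name__ == "__main__":
    print("parameter:", population_mean)
    for n in (10, 100, 10000):
        print("sample size", n, "statistic:", sample_mean(n))
```

Larger samples tend to give a statistic closer to the parameter, which is why a sample can stand in for a population that is too large to measure in full.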