Chapter 16
Data Analysis: Testing for Significant Differences

Copyright © 2003 by The McGraw-Hill Companies, Inc. All rights reserved.

The Value of Testing for Differences in Data

Basic statistical techniques bring “structure” to the raw data that the research team has captured.
Every data set needs to be summarized to discern what the entire set of responses means.
The output from statistical analysis can be displayed graphically, adding a fresh perspective to a decision-maker’s understanding of an information problem or market opportunity.

SPSS Applications Database

Easy-to-use software like SPSS for Windows has changed the way statistics is taught and learned:
Class participants no longer have to learn a system of elaborate code to conduct data analysis.
Data are entered, items are chosen from pull-down menus, and options can be “clicked” to create graphs and perform simple or complex analyses.
Generally, SPSS for Windows has improved the quality of life for:
1. Research teams applying statistics.
2. Teachers teaching statistics.
3. Students learning statistics.

Bar Charts: An Example

[Figure: bar chart of Frequency (0 to 100) against Importance (1 to 6).]

Line Charts: An Example

[Figure: line chart; vertical axis 0 to 100, horizontal axis 1 to 6.]
Pie Charts: An Example

[Figure: pie chart with six slices labeled Very Important, Important, Somewhat Important, Somewhat Unimportant, Unimportant, and Very Unimportant; the slice values shown are 26.3%, 19.7%, 6.6%, 6.6%, 23.4%, and 17.1%.]

Statistical Analysis Techniques

Descriptive Statistics: Used by researchers to summarize sample data.
Univariate Statistics: Used when a researcher investigates one variable at a time.
Bivariate Statistics: Used when a researcher investigates two variables at a time.
Multivariate Statistics: In a broad sense, any simultaneous analysis of more than two variables.

Descriptive Statistics

This type of statistics describes sample data and often leads to subsequent analyses.
What is the average income of the sample?
How old is the average employee in Company X?
How different are the employees’ ages in Company X?
How spread out are the income data that have been drawn as a sample of the population?

Descriptive Statistics – cont’d

Three Groups
Central Tendency of the Variable: Mean, Median, Mode
Dispersion: Range, Variance (or Standard Deviation), Coefficient of Variation
Shape of the Distribution: Skewness, Kurtosis

Measures of Central Tendency

Average: A single value that is typical or representative of a group of numbers.
An average is frequently referred to as a measure of central tendency.
The most commonly known measures of central tendency are the mean, median, and mode.
A measure of central tendency describes the center of a distribution.
Measures of Central Tendency – cont’d

Mean: The most commonly used measure of central tendency.
The sum of the values in a data set divided by the number of values in the set.
The computation of the mean is based on all values of a set of data.

Median: The value of the middle item when the numbers are arranged in order of magnitude.
A positional average.
Not defined algebraically, as the mean is.
In some cases, cannot be computed exactly, as the mean can.
Is centrally located.

Measures of Central Tendency – cont’d

Mode: The value that occurs most frequently in the set, that is, the value with the highest frequency.
When there are two or more modes in a set of data, the data are called bimodal or multimodal.
By definition, is not affected by extreme values.
Is easy to compute with a set of discrete data.
With grouped data, the value of the mode may be greatly affected by the method of designating the class intervals.

Frequency Distribution

Raw Data: The collected data that have not been organized numerically.
Frequency: The number of times a value is repeated.
Frequency Distribution: A distribution of data that summarizes the number of times a certain value of a variable occurs, often expressed as a percentage.

Measures of Shapes of Frequency Distributions

In addition to averages and dispersions, two other measures are used to describe the characteristics of a group of data: skewness and kurtosis.
Skewness: Indicates the direction of an asymmetrical distribution, either leaning toward higher values or lower values.
Kurtosis: Indicates the relative peakedness of the frequency distribution.

Figure 16.5 Skewness of a Distribution

Measures of Central Tendency

I. The Mean is the arithmetic average of the sample. For example, 5.5 is the mean of the following sample: (1,2,3,4,5,6,7,8,9,10)
II. The Median is the middle value of a rank-ordered distribution. For example, 5.5 is the median of the following sample: (1,2,3,4,5,6,7,8,9,10)
III. The Mode is the most frequently mentioned response, or number, in a data set. There is no mode in the following sample, because every value occurs exactly once: (1,2,3,4,5,6,7,8,9,10)

Measures of Dispersion

The Range is the distance between the largest and smallest values in a data set. For example, 9 is the range for the following data set: (1,2,3,4,5,6,7,8,9,10)
The Variance is the average squared deviation about the mean of a distribution of values. What is the variance for the data set listed above?
The Standard Deviation is the square root of the variance; it indicates the typical distance of the distribution values from the mean. What is the standard deviation for the data set listed above?

What are Univariate Statistics?

Univariate Statistics: Statistics used when a researcher investigates only one variable at a time.
More specifically, this type of statistics is used when only one measurement of each element in the sample is taken, or when multiple measurements of each element are taken but each variable is analyzed independently.
Univariate statistical techniques can be further broken down according to whether the data are nominal, ordinal, interval, or ratio scaled.
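
The two questions on the Measures of Dispersion slide can be answered with a few lines of code. The following is a minimal sketch in Python using only the standard-library `statistics` module; the variable names are illustrative and not from the slides. Because the slides do not say whether the variance should be divided by n (population form) or by n - 1 (sample form), both are computed.

```python
import statistics

# Sample used on the slides above.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(sample)        # arithmetic average
median = statistics.median(sample)    # middle of the rank-ordered values
modes = statistics.multimode(sample)  # every value tied for most frequent

data_range = max(sample) - min(sample)

# Population form: summed squared deviations divided by n.
pop_variance = statistics.pvariance(sample)
pop_std_dev = statistics.pstdev(sample)

# Sample form: summed squared deviations divided by n - 1.
sample_variance = statistics.variance(sample)
sample_std_dev = statistics.stdev(sample)

# All ten values occur once, so no single value stands out as the mode.
has_single_mode = len(modes) == 1

print(f"mean={mean}, median={median}, range={data_range}")
print(f"population variance={pop_variance:.2f}, sd={pop_std_dev:.2f}")
print(f"sample variance={sample_variance:.2f}, sd={sample_std_dev:.2f}")
print(f"single mode? {has_single_mode}")
```

For this data set the population variance is 8.25 and the sample variance is about 9.17, which is why it matters to state which divisor is being used.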
An Overview of Hypothesis Testing

A Statistical Hypothesis (or simply Hypothesis): An assumption or informed guess made about a population characteristic.
It can be defined as an unproven statement or proposition about something under investigation by a researcher.
Before accepting or rejecting any hypothesis, marketing managers test it to determine the likelihood of its being true.
It can either be rejected or not rejected.

An Overview of Hypothesis Testing (cont.)

Null hypothesis: The hypothesis to be tested for possible acceptance or rejection. A null hypothesis is usually denoted by the symbol H0.
Alternative hypothesis: An assumption believed to be true if the null hypothesis is false. An alternative hypothesis is denoted by H1.
In a given test, there is usually only one null hypothesis, but there may be several alternative hypotheses.

Terminology

Degrees of Freedom: The number of values that can vary freely in a set of values under certain conditions.
Statistical Significance: Differences in findings that are unlikely to have been caused by chance or sampling error alone.

Steps for Testing Hypotheses

1. State the null and alternative hypotheses.
2. Select a suitable test statistic and its distribution.
3. Select the level of significance and critical values.
4. State the decision rule.
5. Collect relevant data and perform the calculations.
6. Make a decision.
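
The six steps listed above can be traced in code. The sketch below is only an illustration under stated assumptions: it uses a two-tailed z-test with a known population standard deviation, and the hypothesized mean, the standard deviation, and the ratings are all hypothetical numbers, not data from the chapter. It relies on `NormalDist` from Python's standard-library `statistics` module.

```python
from statistics import NormalDist, fmean

# Step 1: state the hypotheses (hypothetical values).
#   H0: population mean rating = 5.0;  H1: population mean rating != 5.0
mu0 = 5.0
sigma = 1.2  # ASSUMPTION: population standard deviation is known

# Step 2: choose the test statistic; here z, which follows the standard
# normal distribution when H0 is true.
std_normal = NormalDist()

# Step 3: choose the level of significance and find the critical values.
alpha = 0.05
z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed, about 1.96

# Step 4: state the decision rule before looking at the data.
def reject_h0(z_value):
    """Reject H0 when z falls in either tail of the rejection region."""
    return abs(z_value) > z_crit

# Step 5: collect the data (hypothetical ratings) and do the calculations.
ratings = [5.9, 6.2, 4.8, 5.5, 6.0, 5.1, 6.4, 5.7,
           4.9, 6.1, 5.3, 5.8, 6.3, 5.0, 5.6, 6.2]
std_error = sigma / len(ratings) ** 0.5
z = (fmean(ratings) - mu0) / std_error
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# Step 6: make the decision.
decision = "reject H0" if reject_h0(z) else "do not reject H0"
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}")
print(decision)
```

With these made-up ratings the sample mean is 5.675, so z is about 2.25, which exceeds the critical value and leads to rejecting H0. With an unknown population standard deviation and a small sample, a t-test would be the usual choice instead.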
Figure 16.6 A General Procedure for Hypothesis Testing

Step 1: Formulate H0 and H1
Step 2: Select Appropriate Test
Step 3: Choose Level of Significance, α
Step 4: Collect Data and Calculate Test Statistic
Step 5: a) Determine Probability Associated with Test Statistic (TSCAL), or b) Determine Critical Value of Test Statistic (TSCR)
Step 6: a) Compare with Level of Significance, α, or b) Determine if TSCAL falls into (Non) Rejection Region
Step 7: Reject or Do Not Reject H0
Step 8: Draw Marketing Research Conclusion

Procedure for Testing Hypotheses

Step 1: State the Null and Alternative Hypotheses

The null hypothesis can be stated as there being no difference between the two given values; that is, the difference is zero.

Step 2: Select a Suitable Test Statistic and Its Distribution

To decide on the appropriate test statistic, researchers should consider the shape and characteristics of the sampling distribution.
The test statistic is calculated from the sample data, and its sampling distribution is used to decide whether the null hypothesis may be rejected.
Popular tests of hypotheses are the t-test, the F-test, and the chi-square goodness-of-fit test.

Step 3: Select the Level of Significance and Critical Values

Two Types of Problems, or Errors, Can Result
Type I Error: The researcher rejects a null hypothesis that is actually true.
Type II Error: The researcher accepts (fails to reject) a null hypothesis that is actually not true.

Step 3: Select the Level of Significance and Critical Values (cont.)
Level of Significance Specifying Type I Error (α): The maximum probability of making a Type I error specified in a hypothesis test.
The level of significance is usually specified before a test is made.
The value of 5% (α = 0.05) or 1% (α = 0.01) is frequently used to set the level of significance, although other values may also be used.

Step 3: Select the Level of Significance and Critical Values (cont.)

Two-Tailed Tests and One-Tailed Tests: The level of significance may be represented by a portion of the area under the normal curve in two ways:
1. Hypothesis tests in which the level of significance is represented by both tails under the normal curve are called two-tailed tests or two-sided tests.
2. If the level of significance is represented by only one tail, the tests are called one-tailed tests or one-sided tests.

Step 4: State the Decision Rule

The decision rule specifies the conditions under which the null hypothesis may be rejected, given the sample results.
It is based on the level of significance and is stated prior to data collection.
Reject the null hypothesis if the difference between the sample mean and the hypothesized population mean falls into a rejection region. Otherwise, accept (that is, do not reject) the null hypothesis.

Step 5: Collect Relevant Data and Perform the Calculations

Collect the relevant information.
Perform the calculations.

Step 6: Make a Decision

Refer back to our decision rule (Step 4).
We reject the null hypothesis when the computed value falls in the rejection region, or accept (do not reject) it when the computed value falls in the acceptance region.

Figure 16.8 Probability of z With a One-Tailed Test

[Figure: standard normal curve for a one-tailed test; chosen confidence level = 95%, chosen level of significance α = .05, critical value z = 1.645.]

Selected Hypothesis Tests – cont’d

t-Test and t-Distribution (a.k.a. Student’s t-distribution)
The t-distribution is a bell-shaped, symmetric distribution that is used for testing small samples (n ≤ 30).
It can be used to test a hypothesis about a mean when the population standard deviation (σ) is unknown and the sample size is considered small, usually 30 or fewer.
When a t-distribution is used to test a hypothesis, the test is called a t-test.
The distribution of the values of t is not normal, but its use and shape are somewhat analogous to those of the standard normal distribution of z.

Developing a Hypothesis

The first “stage” in testing a hypothesis is to develop one to be tested!
A hypothesis allows the research team to compare two groups of respondents to see if there are important differences between them.
A hypothesis is developed before any data are collected by the research team.
A hypothesis is developed as part of the overall research plan agreed to by the research team and the client.

Statistical Significance: Type I Error

A Type I error occurs when a research team rejects the null hypothesis when it is true.
The probability of making this error is often referred to as alpha (α).
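
What α means in practice can be illustrated with a small simulation. This is a sketch under assumptions rather than anything from the chapter: the population mean and standard deviation, the sample size, the number of trials, and the function name `type_i_error_rate` are all hypothetical. Samples are repeatedly drawn from a population in which the null hypothesis is true, a one-tailed z-test is run at α = .05 (critical value about 1.645, as in Figure 16.8), and the share of wrongly rejected null hypotheses is counted.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(trials=20000, n=25, alpha=0.05, seed=1):
    """Estimate how often a true H0 is rejected by a one-tailed z-test."""
    rng = random.Random(seed)
    mu0, sigma = 50.0, 10.0  # hypothetical population; H0: mu = mu0 is TRUE
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / std_error
        if z > z_crit:  # falls in the rejection region although H0 is true
            rejections += 1
    return z_crit, rejections / trials

z_crit, rate = type_i_error_rate()
print(f"critical z = {z_crit:.3f}")
print(f"share of true null hypotheses rejected = {rate:.3f}")
```

The estimated share should land close to the chosen α of .05, since every rejection in this setup is a Type I error by construction.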
Statistical Significance: Type II Error

A Type II error occurs when a research team accepts (fails to reject) the null hypothesis when the alternative hypothesis is true.
The probability of making this error is often referred to as beta (β).

Figure 16.7 Type I Error (α) and Type II Error (β)

Univariate Hypothesis Testing: An Example

One-Sample Statistics

Variable               N    Mean   Std. Deviation   Std. Error Mean
X2-Competitive Price   50   2.22   1.15             .16

One-Sample Test (Test Value = 5.5)

Variable               t         df   Sig. (2-tailed)   Mean Difference   95% CI Lower   95% CI Upper
X2-Competitive Price   -20.203   49   .000              -3.28             -3.61          -2.95

Bivariate Hypothesis Testing: An Example

Group Statistics (Satisfaction level)

Gender   N    Mean   Std. Deviation   Std. Error Mean
Male     20   4.35   1.04             .23
Female   30   5.07   .78              .14

Bivariate Hypothesis Testing: An Example

Independent Samples Test (Satisfaction level)

Levene’s Test for Equality of Variances: F = 3.415, Sig. = .071

t-test for Equality of Means

                              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       -2.775   48       .008              -.72              .26                     -1.24          -.20
Equal variances not assumed   -2.624   33.048   .013              -.72              .27                     -1.27          -.16

Analysis of Variance: ANOVA

In the “language-game” of marketing research, ANOVA stands for the “analysis of variance”.
ANOVA is a statistical technique that tells a research team whether three or more means are statistically different from one another.

Analysis of Variance: n-Way ANOVA

In the language-game of marketing research, an n-way ANOVA is a statistical technique that allows the research team to explore the effects of several independent variables simultaneously.

Analysis of Variance: MANOVA

In the language-game of marketing research, there is also a technique called MANOVA. MANOVA considers the mean differences for a group of dependent measures, exploring several dependent variables across several independent variables.

Summary of Learning Objectives

Understand the mean, median, and mode as measures of central tendency.
Understand the range and standard deviation of a frequency distribution as measures of dispersion.
Understand how to graph measures of central tendency.
Understand the difference between independent and related samples.
Explain hypothesis testing and assess potential error in its use.
Understand univariate and bivariate statistical tests.
Apply and interpret the results of the ANOVA and n-way ANOVA statistical methods.
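
As a check on the univariate and bivariate examples earlier in the chapter, the t values in the SPSS output can be approximately reproduced from the reported summary statistics. The sketch below is illustrative only: the function names are made up, the formulas are the standard one-sample, pooled-variance, and Welch (unequal-variance) t statistics, and the inputs are the rounded means and standard deviations printed in the tables, so the results differ slightly from the SPSS figures, which were computed from the unrounded raw data.

```python
from math import sqrt

def one_sample_t(mean, sd, n, test_value):
    """t for H0: population mean equals test_value."""
    std_error = sd / sqrt(n)
    return (mean - test_value) / std_error

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t assuming equal variances; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t without assuming equal variances; returns (t, df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    std_error = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / std_error, df

# X2-Competitive Price: N = 50, mean 2.22, sd 1.15, test value 5.5.
t_one = one_sample_t(2.22, 1.15, 50, 5.5)

# Satisfaction level: Male (n=20, 4.35, 1.04) vs. Female (n=30, 5.07, .78).
t_pooled, df_pooled = pooled_t(4.35, 1.04, 20, 5.07, 0.78, 30)
t_welch, df_welch = welch_t(4.35, 1.04, 20, 5.07, 0.78, 30)

print(f"one-sample t = {t_one:.2f} (SPSS: -20.203, df 49)")
print(f"pooled t = {t_pooled:.2f}, df = {df_pooled} (SPSS: -2.775, df 48)")
print(f"Welch t = {t_welch:.2f}, df = {df_welch:.1f} (SPSS: -2.624, df 33.048)")
```

The recomputed values come out within a few hundredths of the reported t statistics, which is consistent with rounding in the published means and standard deviations.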