Thoughts, Tips and Suggestions for Teaching Statistics for Today's Students
David M. Levine, Baruch College (CUNY)
[email protected]

The First Day of Class
• First impressions are critically important in everything you do in life.
• This is the most important class of the semester.
• You need to set the tone to create a new impression that the course will be important to their business education.

DSI Seattle WA 2015

The Typical Introductory Business Statistics Course
• Overview/orientation
• Tables and Charts/Descriptive Statistics
• Probability and Probability Distributions
• Confidence Intervals and Hypothesis Testing
• Regression

Additions?
Statistics as a way of thinking and problem-solving. Use a problem-solving framework such as DCOVA (see References 1 - 4):
• Define your business objective and the variables for which you want to reach conclusions
• Collect the data from appropriate sources
• Organize the data collected
• Visualize the data by constructing charts
• Analyze the data to reach conclusions and present those results

Additions? continued
• Descriptive Analytics
- Drilling down
- Multidimensional contingency tables
- Slicers
- Big data
• Predictive Analytics
- Increased emphasis on p-values
- Regression
- Logistic regression and classification and regression trees
- (not possible in one-semester course)

Reductions?
• Reduce Probability: no more than 30 minutes to define terms
• Reduce Probability distributions: cover only the normal distribution
• Reduce Hypothesis testing: cover only basic concepts, difference between means, difference between proportions (needed in A-B testing common in online presentation systems)

Tell A Story
• Each example should tell a story
• Focus on an application from a functional area of business – accounting, economics/finance, management, marketing, information systems
• For every story, use the DCOVA steps of Define, Collect, Organize, Visualize, and Analyze

Tables and Charts/Descriptive Statistics
• Organizing and Visualizing Categorical Data
• Summary tables
• Bar charts
• Pie charts
• Pareto diagrams
• Two-way contingency tables
• Multiway contingency tables
• Drilling down/Excel slicers

Experiment 1
Web designers tested a new call to action button on their webpage. Every visitor to the webpage was randomly shown either the original call to action button (the control) or the new variation. The metric used to measure success was the download rate: the number of people who downloaded the file divided by the number of people who saw that particular call to action button. Results of the experiment yielded the following:

Variations                        Downloads   Visitors
Original Call to Action Button          351      3,642
New Call to Action Button               485      3,556

Results
Approximately 9.6% of the web site visitors who were shown the original call to action button downloaded the file as compared to approximately 13.6% of the web site visitors who were shown the new call to action button.
The results were highly statistically significant, showing that the download rate was higher for the new call to action button. There was 95% confidence that the actual difference in the download rate between the original and new call to action buttons was between approximately 2.5 and 5.5 percentage points.

Experiment 2
Web designers tested a new web design on their webpage. Every visitor to the webpage was randomly shown either the original web design (the control) or the new variation. The metric used to measure success was the download rate: the number of people who downloaded the file divided by the number of people who saw that particular web design. Results of the experiment yielded the following:

Variations             Downloads   Visitors
Original web design          305      3,427
New web design               353      3,751

Results
Approximately 8.9% of the web site visitors who were shown the original web design downloaded the file as compared to approximately 9.4% of the web site visitors who were shown the new web design. The results showed that there was insufficient statistical evidence that the download rate was higher for the new web design.

Experiment 3
Web designers now tested two factors simultaneously – the call to action button and the new web design. Every visitor to the webpage was randomly shown one of the following:
• Old call to action button with old web design
• New call to action button with old web design
• Old call to action button with new web design
• New call to action button with new web design
Again, the metric used to measure success was the download rate: the number of people who downloaded the file divided by the number of people who saw that particular call to action button and web design.
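The figures reported for Experiments 1 and 2 can be checked with a short calculation. The sketch below is one way to do it in Python, assuming a pooled two-proportion Z test with an upper-tail alternative and an unpooled standard error for the confidence interval. The slides do not state which procedure or software produced the reported results, and the function name `compare_download_rates` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_download_rates(x_old, n_old, x_new, n_new, confidence=0.95):
    """Compare two download rates (assumed method: two-proportion Z test)."""
    p_old = x_old / n_old
    p_new = x_new / n_new
    diff = p_new - p_old

    # Test statistic: pooled standard error under H0 (equal download rates).
    pooled = (x_old + x_new) / (n_old + n_new)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = diff / se_pooled
    # Upper-tail p-value (assumption: H1 is "new rate is higher").
    p_value = 1 - NormalDist().cdf(z)

    # Confidence interval: unpooled standard error of the difference.
    se_unpooled = sqrt(p_old * (1 - p_old) / n_old + p_new * (1 - p_new) / n_new)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {"p_old": p_old, "p_new": p_new, "z": z, "p_value": p_value, "ci": ci}


# Counts taken from the Experiment 1 and Experiment 2 tables.
exp1 = compare_download_rates(351, 3642, 485, 3556)
exp2 = compare_download_rates(305, 3427, 353, 3751)

for name, result in (("Experiment 1", exp1), ("Experiment 2", exp2)):
    low, high = result["ci"]
    print(
        f"{name}: old={result['p_old']:.3f}, new={result['p_new']:.3f}, "
        f"z={result['z']:.2f}, p={result['p_value']:.4f}, "
        f"95% CI=({low:.3f}, {high:.3f})"
    )
```

Under these assumptions, Experiment 1 gives rates of about 9.6% and 13.6% with a 95% interval of roughly 2.5 to 5.5 percentage points, while Experiment 2 gives a p-value well above 0.05, in line with the conclusions stated on the slides.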
Results of Experiment 3 yielded the following:

                                        Downloads
Call to Action Button   Web Design    Yes      No    Total
Old                     Old            83     917    1,000
New                     Old           137     863    1,000
Old                     New            95     905    1,000
New                     New           170     830    1,000
Total                                 485   3,515    4,000

• Old call to action button with old web design: 8.3% downloaded the file
• New call to action button with old web design: 13.7% downloaded the file
• Old call to action button with new web design: 9.5% downloaded the file
• New call to action button with new web design: 17.0% downloaded the file

Results
Notice that the results for the first three combinations of call to action button and web design were similar to the first two experiments. However, when the new call to action button was combined with the new web design, there was a multiplicative or synergistic effect: having both together produced a gain larger than the sum of the two separate gains (8.7 percentage points over the old/old baseline, versus 5.4 + 1.2 = 6.6 points). This effect could only be discovered by simultaneously varying the two factors and was not seen in the first two experiments, when only one factor was varied at a time.

Pedagogical Point
• Your analytical process worked as you added variables and determined whether unforeseen relationships were uncovered.
• Drilling down with the additional factor enabled you to uncover an unforeseen relationship on the likelihood of downloading the file that was not apparent when only one factor was studied at a time.

Excel Slicers
• A panel of clickable buttons that appears superimposed over a worksheet.
• Each slicer panel corresponds to one of the variables that is under study.
• Each button in a variable’s slicer panel represents a unique value of the variable that is found in the data under study.
• You can create a slicer for any variable that has been associated with a PivotTable and not just the variables that you have physically inserted into a PivotTable.
This allows you to work with more than three or four variables at the same time in a way that avoids creating an overly complex multidimensional contingency table that would be hard to read.

Excel Slicers (continued)
• By clicking buttons in slicer panels you can ask questions of the data you have collected, one of the basic methods of business analytics.
• This contrasts with the methods of organizing data, which allow you to observe data relationships but not ask about the presence or absence of specific relationships.
• Because a set of slicers can give you a “heads-up” about the data you have collected, using a set of slicers mimics the function of a business analytics dashboard.

An Excel Slicer
[Figure: PivotTable (Count of Category) filtered by slicers; visible labels: Growth, Four, Mid-Cap, Grand Total]

Descriptive Statistics
• Measures of central tendency – mean, median, mode
• Measures of variation – range, variance, standard deviation, coefficient of variation, Z scores
• Shape: skewness and kurtosis
• Exploring data – quartiles, interquartile range, five-number summary, boxplot

Probability and Probability Distributions
• Probability – no more than 30 – 60 minutes
• Do an example without formulas
• Define terms
• Make sure students know that the smallest value is 0 and the largest value is 1
• Probability distributions – cover only the normal distribution
• No need to explicitly cover the binomial distribution

Sampling Distributions and Confidence Intervals
• Focus on the concept of the sampling distribution and the Central Limit Theorem.
• Show a chart of what happens as sample size is increased for different populations
• Develop the concept of a confidence interval, possibly with different samples taken from a population
• Cover confidence intervals and sample size determination only for the mean and for the proportion

Hypothesis Testing
• Don’t try to cover too many different tests. The more tests you try to cover, the less students will understand.
• Cover fundamental concepts using a one-sample test for the mean or the proportion to develop the concept of the p-value.
• Test for difference between means
• Test for difference between proportions (Z or chi-square)

Regression
• Only simple linear regression in a one-semester undergraduate course
• Use software; don’t compute regression coefficients
• Focus on interpretation
• Residual analysis

Logistic Regression
Predicting a categorical dependent variable
• Cannot use least squares regression
• Odds ratio
• Logistic regression model
• Predicting probability of an event of interest
• Deviance statistic
• Wald statistic

Example
Predicting the likelihood of upgrading to a premium credit card based on the monthly purchase amount and whether the account has multiple cards

Classification and Regression Trees
Decision trees that split data into groups based on the values of independent or explanatory (X) variables.
• Not affected by the distribution of the variables
• Splitting determines which values of a specific independent variable are useful in predicting the dependent (Y) variable
• Using a categorical dependent Y variable results in a classification tree
• Using a numerical dependent Y variable results in a regression tree
• Rules for splitting the tree
• Pruning back a tree
• If possible, divide data into training sample and validation sample

Example
Predicting the likelihood of upgrading to a premium credit card based on the monthly purchase amount and whether the account has multiple cards (same example used in logistic regression)

Example
Predicting sales of energy bars based on price and promotion expenses (could use same example as in multiple regression)

References
1. Berenson, M. L., D. M. Levine, and K. A. Szabat, Basic Business Statistics, 13th Ed. (Boston, MA: Pearson Education, 2015)
2. Levine, D. M. and D. F. Stephan, “Teaching Introductory Business Statistics Using the DCOVA Framework”, Decision Sciences Journal of Innovative Education, Vol. 9, September 2011, pp. 393-397
3. Levine, D. M., D. F. Stephan, and K. A. Szabat, Statistics for Managers Using Microsoft Excel, 8th Ed. (Boston, MA: Pearson Education, 2017)
4. Levine, D. M., K. A. Szabat, and D. F. Stephan, Business Statistics: A First Course, 7th Ed. (Boston, MA: Pearson Education, 2016)
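The slide on Classification and Regression Trees mentions rules for splitting a tree without naming one. As an illustration only, the sketch below uses Gini impurity, one common splitting criterion, to pick a threshold on a single numerical X variable for a binary Y variable. The data values are made up for the demonstration and do not come from the credit card example; `gini` and `best_split` are hypothetical helper names.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # proportion of 1s in the group
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split on X."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best_threshold, best_impurity = None, float("inf")
    # Candidate thresholds: midpoints between consecutive distinct X values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        # Impurity of the split: size-weighted average of the two groups.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity


# Made-up toy data: monthly purchase amount (X) and upgraded yes=1 / no=0 (Y).
purchases = [10, 20, 30, 40, 50, 60]
upgraded = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(purchases, upgraded)
print(f"Split at X <= {threshold}, weighted Gini impurity = {impurity}")
```

A full tree repeats this search within each resulting group and across every X variable, then prunes back, ideally judged on a separate validation sample as the slide suggests.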