Thoughts, Tips and Suggestions for Teaching
Statistics for Today's Students
David M. Levine, Baruch College (CUNY)
The First Day of Class
• First impressions are critically important in
everything you do in life.
• This is the most important class of the semester.
• You need to set the tone to create a new
impression that the course will be important to
their business education.
DSI Seattle WA 2015
The Typical Introductory
Business Statistics Course
• Overview/orientation
• Tables and Charts/Descriptive Statistics
• Probability and Probability Distributions
• Confidence Intervals and Hypothesis Testing
• Regression
Additions?
Statistics as a way of thinking and problem-solving. Use a problem-solving
framework such as DCOVA (see References 1 - 4):
• Define your business objective and the variables for which you
want to reach conclusions
• Collect the data from appropriate sources
• Organize the data collected
• Visualize the data by constructing charts
• Analyze the data to reach conclusions and present those results
Additions? continued
• Descriptive Analytics
- Drilling down
- Multidimensional contingency tables
- Slicers
- Big data
• Predictive Analytics
- Increased emphasis on p-values
- Regression
- Logistic regression and classification and regression trees
(not possible in one-semester course)
Reductions?
• Reduce Probability: no more than 30 minutes to
define terms
• Reduce Probability distributions: cover only the
normal distribution
• Reduce Hypothesis testing: cover only basic
concepts, difference between means, difference
between proportions (needed in A-B testing
common in online presentation systems)
Tell A Story
• Each example should tell a story
• Focus on an application from a functional area of
business – accounting, eco/finance, management,
marketing, information systems
• For every story, use the DCOVA steps of Define,
Collect, Organize, Visualize, and Analyze
Tables and Charts/Descriptive Statistics
• Organizing and Visualizing Categorical Data
• Summary tables
• Bar charts
• Pie charts
• Pareto diagrams
• Two-way contingency tables
• Multiway contingency tables
• Drilling down/Excel slicers
Experiment 1
Web designers tested a new call to action button on their
webpage. Every visitor to the webpage was randomly
shown either the original call to action button (the
control) or the new variation. The metric used to
measure success was the download rate: the number of
people who downloaded the file divided by the number
of people who saw that particular call to action button.
Results of the experiment yielded the following:
Variation                         Downloads   Visitors
Original call to action button          351      3,642
New call to action button               485      3,556
Results
Approximately 9.6% of the web site visitors who were shown the
original call to action button downloaded the file as compared to
approximately 13.6% of the web site visitors who were shown the new
call to action button.
The results were highly statistically significant showing that the
download rate was higher for the new call to action button. There was
95% confidence that the actual difference in the download rate between
the original and new call to action buttons was between approximately
2.5% and 5.5%.
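The percentages and the interval above can be reproduced with a few lines of code. The sketch below assumes the usual large-sample (normal approximation) confidence interval for a difference between two proportions with an unpooled standard error; the slides do not say which software or formula was used, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (p1, p2, lower, upper) for the difference p2 - p1."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error, the usual choice for a confidence interval
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p2 - p1
    return p1, p2, diff - z * se, diff + z * se

# Experiment 1: original button (351 of 3,642) vs. new button (485 of 3,556)
p_old, p_new, lower, upper = diff_proportion_ci(351, 3642, 485, 3556)
print(f"original: {p_old:.1%}, new: {p_new:.1%}")
print(f"95% CI for the difference: {lower:.1%} to {upper:.1%}")
```

Because the whole interval lies above zero, it agrees with the conclusion that the new button has the higher download rate.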
Experiment 2
Web designers tested a new web design on their webpage.
Every visitor to the webpage was randomly shown
either the original web design (the control) or the new
variation. The metric used to measure success was the
download rate: the number of people who downloaded
the file divided by the number of people who saw that
particular web design. Results of the experiment
yielded the following:
Variation              Downloads   Visitors
Original web design          305      3,427
New web design               353      3,751
Results
Approximately 8.9% of the web site visitors
who were shown the original web design
downloaded the file as compared to
approximately 9.4% of the web site visitors who
were shown the new web design.
The results showed that there was insufficient
statistical evidence that the download rate was
higher for the new web design.
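The "insufficient evidence" conclusion can be checked with a test for the difference between two proportions. The sketch below assumes a one-sided Z test with a pooled proportion; the slides do not state which test was run, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """One-sided Z test of H0: p2 <= p1 against H1: p2 > p1."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the two groups share one proportion, so pool the counts
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail area
    return z, p_value

# Experiment 2: original design (305 of 3,427) vs. new design (353 of 3,751)
z_stat, p_value = two_proportion_z_test(305, 3427, 353, 3751)
print(f"Z = {z_stat:.2f}, one-sided p-value = {p_value:.3f}")
```

The p-value is well above 0.05, so the small observed difference (8.9% versus 9.4%) could easily arise by chance.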
Experiment 3
Web designers now tested two factors simultaneously – the call to action
button and the new web design. Every visitor to the webpage was randomly
shown one of the following:
• Old call to action button with old web design
• New call to action button with old web design
• Old call to action button with new web design
• New call to action button with new web design
Again, the metric used to measure success was the download rate: the
number of people who downloaded the file divided by the number of people
who saw that particular call to action button and web design. Results of the
experiment yielded the following:
Call to Action Button   Web Design   Downloaded: Yes   Downloaded: No   Total
Old                     Old                       83              917   1,000
New                     Old                      137              863   1,000
Old                     New                       95              905   1,000
New                     New                      170              830   1,000
Total                                            485            3,515   4,000
• Old call to action button with old web design: 8.3% downloaded the file
• New call to action button with old web design: 13.7% downloaded the file
• Old call to action button with new web design: 9.5% downloaded the file
• New call to action button with new web design: 17.0% downloaded the file
Results
Notice that the results for the first three combinations of call to
action button and web design were similar to the first two
experiments. However, when the new call to action button was
combined with the new web design, there was a multiplicative
or synergistic effect: having both together raised the download
rate by more than the two separate effects combined (8.7
percentage points versus 5.4 + 1.2 = 6.6). This effect could only
be discovered by simultaneously varying the two factors and was
not seen in the first two experiments, when only one factor was
varied at a time.
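The synergy can be quantified as an interaction: the lift from changing both factors minus the sum of the lifts from changing each one alone. The sketch below is a descriptive calculation on the four download rates from the table, not a significance test; the variable names are illustrative.

```python
VISITORS = 1000  # each combination was shown to 1,000 visitors

# (call to action button, web design): number of downloads
downloads = {
    ("old", "old"): 83,
    ("new", "old"): 137,
    ("old", "new"): 95,
    ("new", "new"): 170,
}
rate = {combo: count / VISITORS for combo, count in downloads.items()}

baseline = rate[("old", "old")]
button_effect = rate[("new", "old")] - baseline    # new button only
design_effect = rate[("old", "new")] - baseline    # new design only
combined_effect = rate[("new", "new")] - baseline  # both changed

# Positive interaction: the combination beats the sum of the separate lifts
interaction = combined_effect - (button_effect + design_effect)
print(f"button: {button_effect:.1%}, design: {design_effect:.1%}")
print(f"combined: {combined_effect:.1%}, interaction: {interaction:.1%}")
```

A one-factor-at-a-time experiment only ever observes `button_effect` or `design_effect`, so it cannot reveal the interaction term.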
Pedagogical Point
• Your analytical process worked as you added variables and
determined whether unforeseen relationships were uncovered.
• Drilling down with the additional factor enabled you to
uncover an unforeseen relationship affecting the likelihood of
downloading the file that was not apparent when only one of
the factors was studied.
Excel Slicers
• A panel of clickable buttons that appears superimposed over a worksheet.
• Each slicer panel corresponds to one of the variables that is under study.
• Each button in a variable’s slicer panel represents a unique value of the
variable that is found in the data under study.
• You can create a slicer for any variable that has been associated with a
PivotTable, not just the variables that you have physically inserted into the
PivotTable. This allows you to work with more than three or four variables
at the same time in a way that avoids creating an overly complex
multidimensional contingency table that would be hard to read.
Excel Slicers (continued)
• By clicking buttons in slicer panels you can ask questions of the data
you have collected, one of the basic methods of business analytics.
• This contrasts with the methods of organizing data, which allow you to
observe data relationships but not ask about the presence or absence of
specific relationships.
• Because a set of slicers can give you a “heads-up” about the data you
have collected, using a set of slicers mimics the function of a business
analytics dashboard.
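Outside Excel, the idea behind a slicer can be sketched as a filter: each clicked button keeps only the rows that have that value, and the summary table is recomputed from what remains. The records, variable names, and the `apply_slicers` function below are hypothetical and only illustrate the logic; they are not how Excel implements slicers.

```python
from collections import Counter

# Hypothetical fund records, made up for illustration
funds = [
    {"category": "Growth", "market_cap": "Mid-Cap", "rating": "Four"},
    {"category": "Growth", "market_cap": "Large", "rating": "Three"},
    {"category": "Value", "market_cap": "Mid-Cap", "rating": "Five"},
    {"category": "Value", "market_cap": "Small", "rating": "Four"},
    {"category": "Growth", "market_cap": "Small", "rating": "Four"},
]

def apply_slicers(records, **selected):
    """Keep records whose value for every sliced variable is among the clicked buttons."""
    return [
        r for r in records
        if all(r[variable] in values for variable, values in selected.items())
    ]

# "Click" Mid-Cap in the market_cap slicer and Four in the rating slicer
subset = apply_slicers(funds, market_cap={"Mid-Cap"}, rating={"Four"})
counts = Counter(r["category"] for r in subset)
print(dict(counts))
```

Each call asks a specific question of the data (here, "which categories have four-star mid-cap funds?"), which is the behavior the bullets above describe.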
An Excel Slicer
Count of Category   Column Labels
Row Labels          Growth   Grand Total
Four                     1             1
  Mid-Cap                1             1
Grand Total              1             1
Descriptive Statistics
• Measures of Central Tendency – mean, median, mode
• Measures of variation – range, variance, standard deviation,
coefficient of variation, Z scores
• Shape: skewness and kurtosis
• Exploring data – quartiles, interquartile range, five-number
summary, boxplot
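Most of the measures listed above are available in Python's standard `statistics` module. The sketch below uses a small hypothetical data set. Skewness and kurtosis are omitted because the standard library has no function for them, and quartile conventions differ among packages, so other software may report slightly different quartiles.

```python
from statistics import mean, median, multimode, quantiles, stdev, variance

times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]  # hypothetical data, in minutes

# Central tendency
center = {"mean": mean(times), "median": median(times), "mode": multimode(times)}

# Variation
spread = {
    "range": max(times) - min(times),
    "variance": variance(times),  # sample variance (n - 1 denominator)
    "std_dev": stdev(times),
}
spread["coeff_of_variation_pct"] = 100 * spread["std_dev"] / center["mean"]
z_scores = [(t - center["mean"]) / spread["std_dev"] for t in times]

# Exploring data: quartiles, interquartile range, five-number summary
q1, q2, q3 = quantiles(times, n=4)
iqr = q3 - q1
five_number = (min(times), q1, q2, q3, max(times))

print(center)
print(spread)
print("five-number summary:", five_number, "IQR:", iqr)
```

The five-number summary is exactly what a boxplot draws, so computing it first makes the chart easier to interpret.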
Probability and Probability
Distributions
• Probability – no more than 30 – 60 minutes
• Do an example without formulas
• Define terms
• Make sure students know that the smallest
value is 0 and the largest value is 1
• Probability distributions – cover only the
normal distribution
• No need to explicitly cover the binomial
distribution
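If the normal distribution is the only one covered, software can stand in for the table lookups. The sketch below uses `statistics.NormalDist` with a hypothetical mean and standard deviation chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical example: download time is normal with mean 7 s and sd 2 s
download_time = NormalDist(mu=7, sigma=2)

p_below_9 = download_time.cdf(9)                          # P(X < 9)
p_between = download_time.cdf(9) - download_time.cdf(5)   # P(5 < X < 9)
cutoff_90 = download_time.inv_cdf(0.90)                   # 90th percentile

print(f"P(X < 9) = {p_below_9:.4f}")
print(f"P(5 < X < 9) = {p_between:.4f}")
print(f"90% of downloads finish within {cutoff_90:.2f} seconds")
```

Every probability returned lies between 0 and 1, which reinforces the point made on this slide.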
Sampling Distributions and Confidence
Intervals
• Focus on the concept of the sampling distribution and the
central limit theorem.
• Show chart of what happens as sample size is increased for
different populations
• Develop concept of confidence interval possibly with
different samples taken from a population
• Cover confidence intervals and sample size determination
only for mean and for proportion
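A short simulation can generate the "what happens as sample size is increased" picture. The sketch below assumes a right-skewed (exponential) population with mean 1 and standard deviation 1; the population, the sample sizes, and the number of repetitions are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

def skewed_draw():
    """One value from a right-skewed population (exponential, mean 1, sd 1)."""
    return random.expovariate(1.0)

def sample_means(draw, n, repetitions=2000):
    """Means of many independent samples of size n."""
    return [mean(draw() for _ in range(n)) for _ in range(repetitions)]

for n in (2, 5, 30):
    means = sample_means(skewed_draw, n)
    # Theory: the standard error of the mean is sigma / sqrt(n) = 1 / sqrt(n)
    print(f"n={n:>2}: mean of means={mean(means):.3f}, "
          f"sd of means={stdev(means):.3f}, theory={n ** -0.5:.3f}")
```

Plotting a histogram of `means` for each `n` shows the distribution becoming narrower and more symmetric as the sample size grows, even though the population itself is skewed.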
Hypothesis Testing
• Don’t try to cover too many different tests. The more tests
you try to cover, the less that students will understand.
• Teach the fundamental concepts using a one-sample test for the mean or
the proportion in order to develop the concept of the p-value.
• Test for difference between means
• Test for difference between proportions (Z or chi-square)
Regression
• Only simple linear regression in a one-semester
undergraduate course
• Use software; don’t compute regression coefficients
• Focus on interpretation
• Residual analysis
Logistic Regression
Predicting a categorical dependent variable
• Cannot use least squares regression
• Odds ratio
• Logistic regression model
• Predicting probability of an event of interest
• Deviance statistic
• Wald statistic
Example
Predicting the likelihood of upgrading to a premium credit card
based on the monthly purchase amount and whether the
account has multiple cards
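To show how the pieces fit together, the sketch below turns a logistic regression equation into a predicted probability and an odds ratio. The coefficients are hypothetical values chosen for illustration, not estimates from real data, and the units of the purchase variable are also an assumption. Fitting the model and computing the deviance and Wald statistics would be left to statistical software.

```python
from math import exp

# Hypothetical coefficients, for illustration only (not fitted to real data)
B0 = -6.9             # intercept
B_PURCHASES = 0.14    # per unit of monthly purchase amount
B_EXTRA_CARDS = 2.8   # 1 if the account has multiple cards, else 0

def upgrade_probability(purchases, has_extra_cards):
    """Predicted probability of upgrading, from the logistic regression model."""
    log_odds = B0 + B_PURCHASES * purchases + B_EXTRA_CARDS * has_extra_cards
    odds = exp(log_odds)
    return odds / (1 + odds)  # always between 0 and 1

p_with = upgrade_probability(36, 1)
p_without = upgrade_probability(36, 0)

# The odds ratio for a 0/1 predictor is exp(coefficient)
odds_ratio_extra_cards = exp(B_EXTRA_CARDS)

print(f"P(upgrade | purchases=36, multiple cards) = {p_with:.3f}")
print(f"P(upgrade | purchases=36, single card)   = {p_without:.3f}")
print(f"odds ratio for multiple cards = {odds_ratio_extra_cards:.1f}")
```

Unlike a least squares line, the predicted value can never fall below 0 or above 1, which is why least squares regression is not used for a categorical dependent variable.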
Classification and Regression Trees
Decision trees that split data into groups based on the values of
independent or explanatory (X) variables.
• Not affected by the distribution of the variables
• Splitting determines which values of a specific independent variable are useful
in predicting the dependent (Y) variable
• Using a categorical dependent Y variable results in a classification tree
• Using a numerical dependent Y variable results in a regression tree
• Rules for splitting the tree
• Pruning back a tree
• If possible, divide data into training sample and validation sample
Example
Predicting the likelihood of upgrading to a premium credit card
based on the monthly purchase amount and whether the
account has multiple cards (same example used in logistic
regression)
Example
Predicting sales of energy bars based on price and promotion
expenses (could use same example as in multiple regression)
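The splitting idea can be shown on a single numerical X variable. The sketch below assumes Gini impurity as the splitting rule, which is one common choice; the slides do not name a rule. The data are hypothetical, and real tree software also handles multiple variables, pruning, and validation samples.

```python
def gini(labels):
    """Gini impurity of a group of 0/1 labels (0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best single split on xs."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal X values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical data: monthly purchase amount and whether the customer upgraded (1 = yes)
purchases = [10, 15, 22, 30, 38, 45, 52, 60]
upgraded = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, impurity = best_split(purchases, upgraded)
print(f"best split: purchases < {threshold} (weighted Gini = {impurity:.4f})")
```

Because the rule only compares orderings of X values, it does not depend on the distribution of the variable, which matches the first bullet in the list above.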
References
1. Berenson, M. L., D. M. Levine, and K. A. Szabat, Basic Business Statistics, 13th Ed.,
(Boston, MA: Pearson Education, 2015)
2. Levine, D. M. and D. F. Stephan, "Teaching Introductory Business Statistics Using the DCOVA
Framework", Decision Sciences Journal of Innovative Education, Vol. 9, September 2011, pp. 393-397
3. Levine, D. M., D. F. Stephan, and K. A. Szabat, Statistics for Managers Using Microsoft Excel, 8th
Ed., (Boston, MA: Pearson Education, 2017)
4. Levine, D. M., K. A. Szabat, and D. F. Stephan, Business Statistics: A First Course, 7th Ed.,
(Boston, MA: Pearson Education, 2016)