CHAPTER 8
Data Analysis and Statistical Methods: Univariate and Bivariate Analyses
© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Dr. Ravi Zacharias – Are all non-Christians going
to hell?
The Use of Statistical Methods in Marketing
Research
- statistical analysis helps distinguish “signal” from “noise” and contrast their relative size
- data analysis procedures
  - univariate, bivariate & multivariate procedures
  - regression methods
  - factor analysis
  - cluster and latent class analyses
  - multidimensional scaling
  - conjoint analysis
- interdependence methods – to elucidate the structure of a set of variables
- dependence methods – to help explain a separate dependent variable
Q. 1. What are the steps in the data analysis
process?
Overview of Data Analysis Process
Coding
Transcribing
Data cleaning
Variable specification and recoding
Selecting a data-analysis strategy
Q. 2. What questions help researchers
identify appropriate analytical techniques
after data collection?
Overview of Data Analysis Procedures
- to choose a data analysis technique, first determine:
  - the number of variables to be analyzed together
    • focus on a single variable: univariate analysis
    • considering two variables: bivariate analysis
    • considering many variables at once: multivariate analysis
  - whether data are to be described or used to make inferences
    • descriptive statistics: summary measures of sample data
    • inferential statistics: probability theory used to make statements about the population
  - what level of measurement is available in the variable(s) of interest
    • nominal scale
    • ordinal scale
    • interval scale
Q. 3. What are two types of analytical
statistics?
Analytical Statistics
Univariate
Descriptive
• Measures of central tendency (Mean, Median, Mode)
• Measures of dispersion (Standard Deviation)
Inferential
• Z Test
• T-Test
Analytical Statistics
Bivariate
Descriptive
• Correlation Coefficient
• Regression
Inferential
• Chi-Square Test
Q. 4. What are the steps involved in the
univariate data analysis?
Overview of Univariate Data Analysis Procedures
FIGURE 8.1 Overview of univariate data analysis procedures
Q. 5. Define Descriptive Statistics.
Descriptive Statistics
Descriptive statistics provide summary measures of the data
contained in all the elements of a sample,
particularly measures of central tendency,
which describe where the bulk of the data
are located, and measures of dispersion, which
describe how “spread out” the data values
are around the central measure.
Descriptive Statistics
- central tendency – where is the bulk of the data?
  - mean – average value, used for interval data
  - median – middle value of the data, used for ordinal data
  - mode – the category of a variable that occurs most often, used for nominal data (a variable can be bimodal)
- dispersion – how spread out are the data?
  - standard deviation – used for interval data
  - absolute frequencies – number of items in the sample in each category of the variable, used for nominal data
  - relative frequencies – proportion of items in the sample in each category of the variable, used for nominal data
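As a small illustration of the measures listed on this slide, the sketch below computes each of them with Python's standard library. The two data lists are hypothetical and invented only for the example; the ratings are assumed to be interval-scaled.

```python
from collections import Counter
from statistics import mean, median, multimode, stdev

# Hypothetical sample data, invented for illustration only.
ratings = [3, 5, 4, 4, 2, 5, 4]               # assumed interval-scaled
brands = ["A", "B", "A", "C", "B", "A", "B"]  # nominal categories

# Central tendency: mean (interval), median (ordinal), mode (nominal).
central = {
    "mean": mean(ratings),
    "median": median(ratings),
    "modes": multimode(brands),  # every most-frequent category; two here (bimodal)
}

# Dispersion for interval data: sample standard deviation (n - 1 denominator).
spread = stdev(ratings)

# Frequencies for nominal data.
absolute_freq = Counter(brands)
relative_freq = {k: v / len(brands) for k, v in absolute_freq.items()}

print(central, round(spread, 3), dict(absolute_freq), relative_freq)
```

`multimode` is used rather than `mode` so that a bimodal variable reports both of its most frequent categories.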
Mean
Standard Deviation
Dr. Ravi Zacharias – What is the future of our
culture?
Q. 6. Define Null Hypothesis and Alternative
Hypothesis.
Hypothesis Testing
- null hypothesis (H0) – a specific statement, opposed to the alternative hypothesis, subjected to statistical testing; a statement that a population parameter takes on a particular value
  - with enough evidence, can be rejected in favor of the alternative hypothesis
  - never accepted as valid, only unable to be rejected
- alternative hypothesis (H1) – a specific statement, opposed to the null hypothesis, subjected to statistical testing
- the sampling distribution reveals whether the sample value is too different from the null hypothesis value to have occurred through sampling error alone
- “one-tailed” test when the alternative hypothesis is directional
Null Hypotheses??
A statement of “equality”
“No relationship between the variables you
are studying”
“No differences between the two groups you
are trying to compare”
- There will be no difference between the average scores of freshmen and seniors in my Financial Stewardship class
- There will be no relationship between absenteeism and grades achieved in my Production and Operations class
Research Hypotheses??
A statement of “inequality”
“There is a relationship between the variables
you are studying”
“There are differences between the two
groups you are trying to compare”
There will be a difference between the
average scores of freshmen and seniors in
my Financial Stewardship class
There will be a relationship between
absenteeism and grades achieved in my
Production and Operations class
Null vs. Research Hypotheses
Null Hypotheses        Research Hypotheses
Equality               Inequality
Population based       Sample based
Indirectly tested      Directly tested
Greek symbols          Roman symbols
Implied hypotheses     Stated hypotheses
Hypotheses
Null Hypotheses
- There is no difference in the frequency or the proportion of occurrences in each category.
- H0: P1 = P2 = P3
Research Hypotheses
- There is a difference in the frequency or proportion of occurrences in each category.
- H1: P1, P2, and P3 are not all equal
Q. 7. What are two possible errors in
hypothesis testing?
Hypothesis Testing
(continued)
- Type I error (α): H0 is true and is rejected
  - occurs when the sample value happens to fall in the extreme tail (an outlier) of the sampling distribution under H0
  - significance level – tolerable level of Type I error (α)
  - confidence level of the test (1 - α) – likelihood of being correct (not rejecting H0 when it is true)
- Type II error (β): H0 is false and is not rejected
  - occurs when a sample from a different sampling distribution happens to yield a value that is likely under H0
  - power of the test (1 - β) – probability of rejecting a false null hypothesis
- For a given sample size, β increases as α decreases
  - have to balance “tolerable” degrees of the two types of error, α and β
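The trade-off between α and β can be made concrete with a small calculation. The sketch below assumes an upper-tailed z-test with known standard deviation; the function name and all numbers (hypothesized mean 100, true mean 103, σ = 10, n = 50) are hypothetical choices for illustration, not values from the chapter.

```python
from statistics import NormalDist

def beta_one_sided_z(mu0, mu1, sigma, n, alpha):
    """Type II error probability for an upper-tailed z-test of H0: mu = mu0
    when the true mean is mu1 (> mu0) and sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - alpha)   # critical z for the chosen alpha
    se = sigma / n ** 0.5                      # standard error of the mean
    cutoff = mu0 + z_crit * se                 # reject H0 when sample mean > cutoff
    # Probability the sample mean stays below the cutoff when the true mean is mu1.
    return NormalDist(mu1, se).cdf(cutoff)

# Hypothetical scenario: same sample size, three significance levels.
for alpha in (0.10, 0.05, 0.01):
    beta = beta_one_sided_z(mu0=100, mu1=103, sigma=10, n=50, alpha=alpha)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

With the sample size held fixed, each smaller α moves the cutoff further from the hypothesized mean, so β grows and power shrinks.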
Hypothesis Testing
(continued)
TABLE 9-1 Summary of Hypothesis Testing Errors

                     True condition
Sample conclusion    H0 is true               H0 is false
Do not reject H0     Correct decision         Type II error
                     Confidence level         Probability = β
                     Probability = 1 – α
Reject H0            Type I error             Correct decision
                     Significance level       Power of the test
                     Probability = α          Probability = 1 – β
Q. 8. What are two methods used in
controlling hypothesis testing errors?
Bivariate Analytical Techniques
1. Significance Level (p<0.05) – toleration of error
2. Confidence level (95%) – likelihood of being correct
Q. 9. What are the steps in hypothesis
testing?
Steps in Hypothesis Testing
1. Formulate null (H0) and alternative (H1) hypotheses
2. Select the appropriate statistical test for the data type
3. Specify the significance level, α
4. Determine the critical (tabulated) value of the test statistic for the given α
5. Perform the statistical test, yielding a computed value of the statistic
6. If the computed test statistic from step 5 is greater than
the tabulated value from step 4, the null hypothesis is
rejected.
The information in the hypothesis test is summarized in a
single quantity called the p-value.
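Steps 3 to 6 and the p-value can be sketched for one simple case. The code below assumes an upper-tailed test whose statistic follows the standard normal distribution; the function name and the example statistic of 2.1 are hypothetical.

```python
from statistics import NormalDist

def upper_tailed_decision(test_statistic, alpha):
    """Compare a computed z statistic with its critical value (steps 4-6)
    and report the matching p-value. Assumes an upper-tailed z-test."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)      # step 4: tabulated value for alpha
    p_value = 1 - std_normal.cdf(test_statistic)  # chance of a value this large under H0
    reject = test_statistic > critical            # step 6: decision rule
    return critical, p_value, reject

# Hypothetical computed statistic from step 5.
critical, p_value, reject = upper_tailed_decision(test_statistic=2.1, alpha=0.05)
print(round(critical, 3), round(p_value, 4), reject)
```

For this kind of test the two routes agree: the statistic exceeds the critical value exactly when the p-value falls below α.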
Q. 10. What are the two inferential statistical
tests available for analyzing interval data?
Inferential Statistics
- z-test
  - is the sample mean $\bar{X}$ likely to have come from a population with the hypothesized mean value μ = μ0?
  - if σ is known, then $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
  - if σ is unknown and n is large (generally >30), then $z = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
  - If the calculated z value is large enough to occur by sampling error alone less than α of the time, the null hypothesis is rejected in favor of the alternative hypothesis with 1 - α confidence.
  - For proportions, when np(1-p) ≥ 5, $z = \frac{p - \pi}{\sqrt{p(1 - p)/n}}$, where p = sample and π = hypothesized proportions.
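A minimal sketch of the two z statistics on this slide follows. The function names and the example numbers are hypothetical. The proportion version takes its standard error from the sample proportion p, as the slide writes it; some texts use the hypothesized proportion π there instead.

```python
from math import sqrt

def z_for_mean(sample_mean, mu0, sd, n):
    """z statistic for a mean; sd is sigma if known, or the sample s when n is large."""
    return (sample_mean - mu0) / (sd / sqrt(n))

def z_for_proportion(p, pi0, n):
    """z statistic for a proportion, with the standard error based on the
    sample proportion p (the slide's form)."""
    return (p - pi0) / sqrt(p * (1 - p) / n)

# Hypothetical examples.
z_mean = z_for_mean(sample_mean=52, mu0=50, sd=10, n=100)
z_prop = z_for_proportion(p=0.5, pi0=0.4, n=100)
print(z_mean, z_prop)
```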
Inferential Statistics
(continued)
- t-test
  - If n < 30 and σ is unknown, the t-test should be used
    • population must be known to be normally distributed
    • degrees-of-freedom must be known (for means, df = n-1)
  - holds for any sample size
  - population standard deviation (σ) is estimated by the sample standard deviation, s
  - critical values of the t statistic are in Table A-2, page 659
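The slide does not spell out the statistic itself, so the sketch below assumes the standard one-sample form, t = (sample mean − μ0) / (s / √n), with df = n − 1. The function name and the five data values are hypothetical. The critical value is still looked up in a t table, since Python's standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """One-sample t statistic and its degrees of freedom (df = n - 1).
    The population standard deviation is estimated by the sample s."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical small sample (n < 30), assumed to come from a normal population.
t_stat, df = one_sample_t([12, 15, 11, 14, 13], mu0=12)
print(round(t_stat, 4), df)
```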
Dr. Ravi Zacharias – Why the Bible?
Q. 11. What is the inferential statistical test
used for analyzing nominal data?
Inferential Statistics
(continued)
- chi-square test for nominal data
  - compares hypothesized population distribution (“E”) against observed distribution (“O”): $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
  - for a univariate chi-square test, df = k - 1
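The chi-square statistic is a short sum over the k categories. The sketch below computes it with df = k − 1; the function name and the observed and expected counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Univariate chi-square: sum of (O - E)^2 / E over k categories, df = k - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of categories")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Hypothetical counts for three categories (both columns sum to 100).
chi2, chi2_df = chi_square_statistic(observed=[30, 50, 20], expected=[25, 50, 25])
print(chi2, chi2_df)
```

The result is then compared with the tabulated chi-square critical value for the chosen α and df.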
Bivariate Procedures
Bivariate analysis determines whether the values of one
variable offer useful information about the values of another.
FIGURE 8.4 Bivariate data analysis procedures
Q. 12. What questions help researchers
identify appropriate bivariate analytical
techniques after data collection?
Bivariate Analytical Techniques
1. Relationship
2. Association
3. Prediction
Q. 13. Define ANOVA.
ANOVA (Analysis of Variance)
A statistical technique for examining the difference among
means for two or more populations.
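The slide gives only the definition, so the sketch below assumes the usual one-way ANOVA computation: the F ratio is the between-group mean square divided by the within-group mean square. The function name and the three small groups are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA: between-group mean square divided by
    within-group mean square, with df = (k - 1, N - k)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three populations.
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 4), df_b, df_w)
```

A large F suggests the group means differ by more than the within-group variation would explain; the ratio is judged against an F table with the two df values.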
Dr. Ravi Zacharias – How do I know God is
working in my life?
Descriptive Statistics for Bivariate
Analysis
Regression reveals how independent variables relate to a
dependent variable and aids predictions based on this.
- linear correlation coefficient ($r_{XY}$)
  - measure of the linear relationship between X and Y:
    $r_{XY} = \frac{Cov(X, Y)}{\sqrt{Cov(X, X)\,Cov(Y, Y)}} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, s_x s_y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} = \frac{1}{n - 1} \sum_{i=1}^{n} \left(\frac{X_i - \bar{X}}{s_x}\right) \left(\frac{Y_i - \bar{Y}}{s_y}\right)$
  - If r = 0, there is no linear relationship between the variables
- coefficient of determination ($r^2$) – the exact percentage of variation shared by 2 variables
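A minimal sketch of the correlation coefficient, using the sums-of-squares form of the formula, is shown below. The function name and the five data pairs are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Linear correlation coefficient from sums of cross-products and squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical paired observations.
r = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
r_squared = r ** 2  # coefficient of determination
print(round(r, 4), round(r_squared, 4))
```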
Descriptive Statistics for Bivariate Analysis
(continued)
- partitioning the sum of squares:
- fitting the regression line
  - to predict $\hat{Y}_i$ based on $X_i$, estimate a regression line $\hat{Y}_i = a + bX_i$ that minimizes the sum of squared errors:
    $b = \frac{SS_{XY}}{SS_{XX}} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
    $a = \bar{Y} - b\bar{X}$
- F-test – ratio of regression variance to error variance
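The slope, intercept, and F ratio can be sketched together. The slide does not state the degrees of freedom, so the code assumes the usual ones for simple regression (1 for regression, n − 2 for error). The function name and the data are hypothetical.

```python
from statistics import mean

def fit_simple_regression(xs, ys):
    """Least-squares line Y-hat = a + b*X, plus the F ratio of regression
    variance to error variance (assumed df: 1 and n - 2)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b = ss_xy / ss_xx        # slope
    a = y_bar - b * x_bar    # intercept
    fitted = [a + b * x for x in xs]
    ss_regression = sum((f - y_bar) ** 2 for f in fitted)       # explained part
    ss_error = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained part
    f_ratio = (ss_regression / 1) / (ss_error / (n - 2))
    return a, b, f_ratio

# Hypothetical paired observations.
a, b, f_ratio = fit_simple_regression([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(a, 4), round(b, 4), round(f_ratio, 4))
```

The two sums of squares in the function add up to the total variation of Y around its mean, which is the partition the F ratio is built from.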
Inferential Statistics for Bivariate Analysis
- regression coefficient: how many standard errors is the sample value (b) from the hypothesized value (β): $t = \frac{b - \beta}{s_b}$
- population means: how many standard errors is the difference between the means from zero: $t = \frac{\bar{X} - \bar{Y}}{s_{pool}}$, where $s_{pool}$ is the pooled standard error
- nominal association: check chi-square of cross-tabulation tables for significance
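The comparison of two population means can be sketched as follows. The slide does not define the pooled standard error, so the code assumes the common equal-variance form: the two sample variances are pooled, weighted by their degrees of freedom, and the result is scaled by (1/n1 + 1/n2). The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(xs, ys):
    """Two-sample t statistic with a pooled standard error (equal variances
    assumed), and df = n1 + n2 - 2."""
    n1, n2 = len(xs), len(ys)
    pooled_var = ((n1 - 1) * variance(xs) + (n2 - 1) * variance(ys)) / (n1 + n2 - 2)
    s_pool = sqrt(pooled_var * (1 / n1 + 1 / n2))  # pooled standard error
    return (mean(xs) - mean(ys)) / s_pool, n1 + n2 - 2

# Hypothetical scores for two groups.
t_two, df_two = pooled_t([4, 5, 6], [1, 2, 3])
print(round(t_two, 4), df_two)
```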