Statistical Analysis of
Data
Graziano and Raulin
Research Methods: Chapter 5
This multimedia product and its contents are protected under copyright law. The following are prohibited by
law: (1) Any public performance or display, including transmission of any image over a network; (2)
Preparation of any derivative work, including the extraction, in whole or in part, of any images; (3) Any rental,
lease, or lending of the program.
Copyright © Allyn & Bacon (2007)
Individual Differences

A fact of life
– People differ from one another
– People differ from one occasion to another


Most psychological variables have small
effects compared to individual differences
Statistics give us a way to detect such
subtle effects
Descriptive Statistics


Are used to describe the data
Many types of descriptive statistics
– Frequency distributions
– Summary measures
– Graphical representations of the data


A way to visualize the data
The first step in any statistical analysis
Frequency Distributions

First step in organization of data
– Can see how the scores are distributed



Used with all types of data
Illustrate relationships between
variables in a cross-tabulation
Simplify distributions by using a
grouped frequency distribution
Creating Frequency
Distributions


Create a column
with all possible
scores
Count the number
of people that fall
into each score
– Some frequencies
may be zero (no
one had that score)

Can only do a
frequency
distribution if:
– The scores are not
continuous
– The range of scores
is not too large
(becomes unwieldy)
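The two steps above can be sketched in Python. This is only an illustration: the scores and the function name `frequency_distribution` are made up for the example and do not come from the chapter.

```python
from collections import Counter

def frequency_distribution(scores, low, high):
    """Return (score, frequency) pairs for every possible score from low to high."""
    counts = Counter(scores)
    # List every possible score, including those nobody obtained (frequency 0).
    return [(score, counts[score]) for score in range(low, high + 1)]

# Hypothetical scores on a 1-6 scale (illustrative data only).
scores = [3, 5, 2, 3, 6, 3, 5, 2, 6, 3]
table = frequency_distribution(scores, 1, 6)

print("Score  Frequency")
for score, freq in table:
    print(f"{score:>5}  {freq:>9}")
```

Scores 1 and 4 appear in the table with a frequency of zero, because the column of possible scores is built first and the counting is done second.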
Creating a Grouped
Frequency Distribution


Start by creating
about 10-15 equal
sized intervals
sufficient to cover
the range of scores
Count the number
of people in each
interval


Necessary
whenever the
distribution is
continuous
Useful when the
range of scores is
large
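A grouped frequency distribution can be sketched the same way. The reaction times, the interval width, and the function name `grouped_frequency` below are assumptions made for illustration; the example uses only six intervals to keep the output short, whereas the slide recommends about 10-15.

```python
def grouped_frequency(scores, width, start):
    """Count scores in equal-width intervals [lower, lower + width)."""
    top = max(scores)
    table = []
    lower = start
    while lower <= top:
        upper = lower + width
        # A score belongs to the interval if lower <= score < upper.
        count = sum(1 for s in scores if lower <= s < upper)
        table.append((lower, upper, count))
        lower = upper
    return table

# Hypothetical reaction times in milliseconds (a continuous variable).
times = [423.7, 551.2, 612.9, 487.0, 731.4, 583.3, 664.8, 512.5, 948.1, 620.6]
grouped = grouped_frequency(times, width=100, start=400)

for lower, upper, count in grouped:
    print(f"{lower}-{upper}: {count}")
```

Using half-open intervals means every score falls into exactly one interval, so the frequencies still add up to the number of participants.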
Cross-Tabulation

A way to see the relationship between
two nominal or ordinal variables
– When done with score data, it is usually
done as a scatter plot (covered later)

Create a set of cells by listing the
values of one variable as columns and
the values of the other as rows
Cross-Tabulation Example
              Males   Females   Total
Democrats       4        5        9
Republicans     6        1        7
Other           7        1        8
Total          17        7       24
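The cell counts and totals in this example can be rebuilt from individual records. The sketch below assumes each participant is stored as a (party, sex) pair; the record list is constructed to match the example counts and is not data from the chapter.

```python
from collections import Counter

# One (party, sex) record per participant, built to match the example counts.
records = (
    [("Democrats", "Males")] * 4 + [("Democrats", "Females")] * 5
    + [("Republicans", "Males")] * 6 + [("Republicans", "Females")] * 1
    + [("Other", "Males")] * 7 + [("Other", "Females")] * 1
)

cells = Counter(records)                             # one count per cell
row_totals = Counter(party for party, _ in records)  # totals for each party
col_totals = Counter(sex for _, sex in records)      # totals for each sex

for party in ("Democrats", "Republicans", "Other"):
    print(party, cells[(party, "Males")], cells[(party, "Females")], row_totals[party])
print("Total", col_totals["Males"], col_totals["Females"], len(records))
```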
Graphing Data


Visual displays are often easier to
comprehend
Two types of graphs covered here
– Histograms
– Frequency Polygons
Histograms


A bar graph, as
shown at the right
Can be used to
graph either
– Data representing
discrete categories
– Data representing
scores from a
continuous variable
[Figure: Sample Histogram. Frequency (0 to 60) plotted against Scores (1 to 6)]
Graphing 2 Distributions


Possible to graph
two or more
distributions to see
how they compare
Note that one of
the two groups in
this histogram was
the same group
graphed previously
[Figure: Sample Histogram comparing two groups. Frequency (0 to 80) plotted against Scores (1 to 6)]
Frequency Polygon
Like a histogram
except that the
frequency is shown
with a dot, with the
dots connected
[Figure: Frequency Polygon. Frequency (0 to 60) plotted against Scores (1 to 6)]
Two Frequency Polygons

Can compare two or
more frequency
polygons on the
same scale
Easier to compare
groups because the
graph appears less
cluttered than
multiple histograms
[Figure: Frequency Polygons for two groups. Frequency (0 to 80) plotted against Scores (1 to 6)]
Shapes of Distributions


Many psychological
variables are
distributed normally
The distribution is
skewed if scores
bunch up at one
end
Measures of Central
Tendency

Mode: the most frequently occurring score
– Easy to compute from frequency distribution

Median: the middle score in a distribution
– Less affected than the mean by a few deviant
scores

Mean: the arithmetic average
– Most commonly used central tendency measure
– Used in later inferential statistics
Finding the Mode



Easiest way to find the mode is to
construct a frequency distribution first
Find the score with the largest
frequency
If there are two or more scores that
are tied for the largest frequency,
report each of them
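A minimal sketch of this procedure, with ties reported as the slide describes. The function name `modes` and the sample scores are illustrative assumptions.

```python
from collections import Counter

def modes(scores):
    """Return every score tied for the largest frequency, in ascending order."""
    counts = Counter(scores)          # the frequency distribution
    largest = max(counts.values())    # the largest frequency
    return sorted(score for score, freq in counts.items() if freq == largest)

single_mode = modes([3, 4, 2, 5, 7, 5])   # 5 occurs twice, all others once
tied_modes = modes([1, 2, 2, 3, 3])       # 2 and 3 each occur twice
print(single_mode, tied_modes)
```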
Computing the Median


Order the scores from smallest to
largest
Determine the middle score [(N+1)/2]
– If 7 scores, the middle is the fourth score
[(7+1)/2]=4
– If 10 scores, the middle score is half way
between the 5th and 6th scores
[(10+1)/2]=5.5
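The (N+1)/2 rule can be written out directly. This is a sketch for illustration; the function name `middle_score` and the sample data are not from the chapter.

```python
def middle_score(scores):
    """Median found with the (N + 1) / 2 position rule."""
    ordered = sorted(scores)             # order from smallest to largest
    position = (len(ordered) + 1) / 2    # 1-based position of the middle score
    if position.is_integer():
        return ordered[int(position) - 1]
    # Position such as 5.5: average the scores on either side (5th and 6th).
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_case = middle_score([3, 4, 2, 5, 7, 5, 9])    # 7 scores, use the 4th
even_case = middle_score(list(range(1, 11)))      # 10 scores, between 5th and 6th
print(odd_case, even_case)
```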
Computing the Mean



Compute the mean
of 3, 4, 2, 5, 7, & 5
Sum the numbers
(26)
Count the numbers
(6)

Plug these values
into the equations
$\bar{X} = \frac{\sum X}{N}$
$\bar{X} = \frac{26}{6} = 4.33$
Measuring Variability



Range: lowest to highest score
Average Deviation: average
distance from the mean
Variance: average squared distance
from the mean
– Used in later inferential statistics

Standard Deviation: square root of
variance
The Range

Computing the Range
– Find the lowest score
– Find the highest score
– Subtract the lowest from the highest
score

Easy to compute, but unstable
because it relies on only two scores
The Average Deviation

Computing the average deviation
– Compute the mean
– Compute the distance of each score from the
mean (absolute distance, ignore sign)
– Sum those distances and divide by the number
of scores

Easy to understand conceptually, but rarely
used because it does not have good
statistical properties
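The three steps translate almost line for line into code. The function name `average_deviation` is an assumption for this sketch; the second data set reuses the six scores from the mean example.

```python
def average_deviation(scores):
    """Mean absolute distance of the scores from their mean."""
    mean = sum(scores) / len(scores)
    distances = [abs(x - mean) for x in scores]   # ignore the sign
    return sum(distances) / len(scores)

simple = average_deviation([2, 4, 6, 8])           # mean 5, distances 3, 1, 1, 3
from_slide = average_deviation([3, 4, 2, 5, 7, 5]) # the scores used for the mean
print(simple, round(from_slide, 2))
```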
The Variance

Computing the Variance
– Compute the mean
– Compute the distance of each score from the
mean
– Square those distances
– Sum those squared distances and divide by the
degrees of freedom (N - 1)

Good statistical properties, but this measure
of variability is in squared units
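As a sketch, the steps can be coded by hand and checked against Python's `statistics.variance`, which also divides by N - 1. The function name `sample_variance` is an illustrative assumption; the scores are the six used in the mean example.

```python
import statistics

def sample_variance(scores):
    """Sum of squared distances from the mean, divided by N - 1."""
    mean = sum(scores) / len(scores)
    squared = [(x - mean) ** 2 for x in scores]
    return sum(squared) / (len(scores) - 1)   # degrees of freedom

scores = [3, 4, 2, 5, 7, 5]
by_hand = sample_variance(scores)
from_library = statistics.variance(scores)
print(round(by_hand, 2), round(from_library, 2))
```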
The Standard Deviation

Computing the Standard Deviation
– Compute the variance
– Take the square root of the variance

This measure, like the variance, has
good statistical properties and is
measured in the same units as the
mean
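A short sketch of the two steps, again using the six scores from the mean example. It relies on `statistics.variance` and `statistics.stdev` from the Python standard library, both of which use N - 1.

```python
import math
import statistics

scores = [3, 4, 2, 5, 7, 5]
variance = statistics.variance(scores)   # step 1: sample variance (N - 1)
sd = math.sqrt(variance)                 # step 2: square root of the variance

# statistics.stdev performs both steps in a single call.
sd_direct = statistics.stdev(scores)
print(round(sd, 2), round(sd_direct, 2))
```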
Measures of Relationship

Pearson product-moment correlation
– Used with interval or ratio data

Spearman rank-order correlation
– Used when one variable is ordinal and the
second is at least ordinal

Scatter plots
– Visual representation of a correlation
– Helps to identify nonlinear relationships
Correlations

Range from –1.00 to +1.00
– A -1.00 means a perfect negative
relationship (as one score decreases, the other
increases a predictable amount)
– +1.00 means a perfect positive
relationship
– 0.00 means that there is no relationship
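The three landmark values can be reproduced with small made-up data sets. The sketch below uses the standard deviation-score formula for the Pearson correlation; the function name `pearson_r` and all data are illustrative assumptions, not material from the chapter.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises as x rises
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises
r_zero = pearson_r(x, [1, 2, 3, 2, 1])        # y rises, then falls
print(round(r_positive, 2), round(r_negative, 2), round(r_zero, 2))
```

In the third data set the scores are clearly related (y rises and then falls), yet r is zero. A coefficient of 0.00 therefore rules out only a linear relationship.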
Linear Relationships



Correlation coefficients are sensitive only to
linear relationships
Linear relationships mean that the points of
a scatter plot cluster around a straight line
Should always look at the scatter plot to see
whether the correlation coefficient is
appropriate
Regression


Using a correlation to predict one
variable from knowing the score on
the other variable
Usually a linear regression (finding the
best fitting straight line for the data)

Best illustrated in a scatter plot with
the regression line also plotted (see
Figure 5.6)
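A minimal sketch of fitting the best-fitting straight line by ordinary least squares and using it to predict a score. The data, the function name `fit_line`, and the choice of least squares as the fitting criterion are assumptions for illustration; this does not reproduce Figure 5.6.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cross / ss_x
    intercept = mean_y - slope * mean_x   # the line passes through the two means
    return slope, intercept

# Hypothetical paired scores (illustrative data only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
predicted = intercept + slope * 6         # predicted y for a new x of 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```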
Reliability Indices



Test-retest reliability and interrater
reliability are indexed with a Pearson
product-moment correlation
Internal consistency reliability is
indexed with coefficient alpha
Details on these computations are
included on the Student Resource
Website
Standard Scores
(Z-scores)



A way to put scores on a common scale
Computed by subtracting the mean from the
score and dividing by the standard deviation
Interpreting the Z-score
– Positive Z-scores are above the mean; negative
Z-scores are below the mean
– The larger the absolute value of the Z-score, the
further the score is from the mean
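The computation can be sketched as follows. The function name `z_scores` and the data are illustrative; the sketch assumes the sample standard deviation (N - 1), in line with the variance slide.

```python
import statistics

def z_scores(scores):
    """Convert raw scores to Z-scores: (score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)   # sample standard deviation (N - 1)
    return [(x - mean) / sd for x in scores]

z = z_scores([2, 4, 6, 8])          # mean is 5
print([round(value, 2) for value in z])
```

Scores below the mean (2 and 4) come out negative, scores above it (6 and 8) come out positive, and 2 and 8 get the largest absolute Z-scores because they lie furthest from the mean.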
Inferential Statistics



Used to draw inferences about
populations on the basis of samples
Sometimes called “statistical tests”
Provide an objective way of
quantifying the strength of the
evidence for a hypothesis
Populations and Samples



Population: the larger group of all
participants of interest
Sample: a subset of the population
Samples almost never represent
populations perfectly (sampling error)
– Not really an error
– Just the natural variability that you can
expect from one sample to another
The Null Hypothesis



States that there is NO difference between
the population means
Compare sample means to test the null
hypothesis
Population parameters & sample statistics
– Population parameter: descriptive statistic
computed from everyone in the population
– Sample statistic: a descriptive statistic
computed from everyone in your sample
Statistical Decisions

Either Reject or Fail to Reject the null
hypothesis
– Rejecting the null hypothesis suggests that there is a
difference in the populations sampled
– Failing to reject means the evidence is not strong enough to conclude that a difference exists
– Decision is based on probability
– Alpha: the statistical decision criterion used in testing the
null hypothesis
– Traditionally, alpha is set to small values (.05 or .01)

Always a chance for error in our decision
Statistical Decision
Process
                           Reject Null         Retain Null
                           Hypothesis          Hypothesis
Null Hypothesis is True    Type I Error        Correct Decision
Null Hypothesis is False   Correct Decision    Type II Error
Testing for Mean
Differences



t-test for independent groups: tests
mean difference of two independent groups
Correlated t-test: tests mean difference of
two correlated groups
Analysis of Variance: tests mean
differences in two or more groups
– Groups may or may not be independent
– Also capable of evaluating factorial designs
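To make the first of these tests concrete, here is a sketch of the t statistic for two independent groups. The slides do not give the formula; the code assumes the common pooled-variance version (equal population variances), and the function name `independent_t` and the data are made up for illustration.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_error, df

# Hypothetical scores for two independent groups (illustrative data only).
t_value, df = independent_t([5, 6, 7, 8, 9], [2, 3, 4, 5, 6])
print(round(t_value, 2), df)
```

The resulting t is compared with the critical value for the chosen alpha and degrees of freedom. With 8 degrees of freedom and a two-tailed alpha of .05 the critical value is 2.306, so a t of 3.00 would lead to rejecting the null hypothesis.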
Power of a Statistical Test



Sensitivity of the procedure to detect real
differences between populations
A function of both the statistical test and the
precision of the research design
Increasing the sample size increases the
power
– Larger samples estimate the population
parameters more precisely
Effect Size



Indicates the size of the group
differences
Unlike the statistical test, the effect
size is NOT affected by the size of the
sample
More details on effect size
– In Chapter 15
– On the Student Resource Website
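The slides do not say which effect size index is meant. As one common choice, the sketch below computes a standardized mean difference (Cohen's d): the difference between the group means divided by the pooled standard deviation. The index, the function name `cohens_d`, and the data are all assumptions made for illustration.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_diff / math.sqrt(pooled_var)

# Hypothetical scores for two groups (illustrative data only).
d = cohens_d([5, 6, 7, 8, 9], [2, 3, 4, 5, 6])
print(round(d, 2))
```

The formula divides by a standard deviation rather than a standard error, so the sample size does not enter it the way it enters a test statistic.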
Statistical versus
Practical Significance

Statistical significance: Is the observed
group difference unlikely to be due to
sampling error?
– Can get statistical significance, even with very small
population differences if the sample size is large enough

Practical significance looks at whether the
difference is large enough to be of value in
a practical sense
– More concerned with the effect size
Meta-Analysis


Relatively new statistical technique
Allows researchers to statistically
combine the results of several studies
to get a sense of how powerful the
effect is
– Discussed in more detail in Chapter 15
Summary


Statistics allow us to detect and evaluate
group differences that are small compared
to individual differences
Descriptive versus inferential statistics
– Descriptive statistics describe the data
– Inferential statistics are used to draw inferences about
population parameters on the basis of sample statistics

Statistics objectify evaluations, but do not
guarantee correct decisions