Statistical Evaluation of Data
Chapter 15
Descriptive / inferential
• Descriptive statistics are methods that help
researchers organize, summarize, and simplify
the results obtained from research studies.
• Inferential statistics are methods that use the
results obtained from samples to help make
generalizations about populations.
Statistic / parameter
• A summary value that describes a sample is
called a statistic (e.g., M = 25, s = 2).
• A summary value that describes a population
is called a parameter (e.g., µ = 25, σ = 2).
Frequency Distributions
One method of simplifying and organizing a set
of scores is to group them into an organized
display that shows the entire set.
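As a sketch of the idea, a minimal Python example (hypothetical quiz scores) that groups scores into a simple frequency table:

from collections import Counter

scores = [3, 5, 4, 5, 5, 2, 4, 3, 5, 4]            # hypothetical quiz scores

freq = Counter(scores)                              # score -> frequency
for score in sorted(freq, reverse=True):
    print(score, freq[score], "#" * freq[score])    # crude text histogram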
Example
Histogram & Polygon
Bar Graphs
How to interpret?
http://www.transparency.org/cpi2014/results
Other types of graphs
https://freedomhouse.org/report/freedom-net/freedom-net-2014#.VIDtzDEc7Ak
Central tendency
The goal of central tendency is to identify the
value that is most typical or most representative
of the entire group.
Central tendency
• The mean is the arithmetic average.
• The median measures central tendency by
identifying the score that divides the
distribution in half.
• The mode is the most frequently occurring
score in the distribution.
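A minimal Python sketch (standard library only, hypothetical scores) illustrating the three measures:

import statistics

scores = [2, 3, 3, 4, 5, 7, 8]        # hypothetical scores

print(statistics.mean(scores))        # arithmetic average (about 4.57)
print(statistics.median(scores))      # middle score of the ordered set (4)
print(statistics.mode(scores))        # most frequently occurring score (3)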
Variability
Variability is a measure of the spread of scores
in a distribution.
1. Range (the difference between min and max)
2. Standard deviation describes the average
distance from the mean.
3. Variance measures variability by computing
the average squared distance from the mean.
Variance is the basic index of variability.
Variance = (Sum of Squares) / N
SD = SQRT(Variance)

Worked example (N = 10):

X      X - M    (X - M)²
10       4        16
 7       1         1
 9       3         9
 8       2         4
 7       1         1
 6       0         0
 5      -1         1
 4      -2         4
 3      -3         9
 1      -5        25
Total = 60, Mean M = 60 / 10 = 6, SS = 70

Variance = SS / N = 70 / 10 = 7
SD = SQRT(7) ≈ 2.65
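A minimal Python sketch (standard library only) that reproduces the calculation above:

import math

scores = [10, 7, 9, 8, 7, 6, 5, 4, 3, 1]

n = len(scores)
mean = sum(scores) / n                        # 60 / 10 = 6
ss = sum((x - mean) ** 2 for x in scores)     # sum of squared deviations = 70
variance = ss / n                             # 70 / 10 = 7
sd = math.sqrt(variance)                      # ≈ 2.65

print(mean, ss, variance, round(sd, 2))       # 6.0 70.0 7.0 2.65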
Non-numerical Data
Proportion or percentage in each category.
For example,
• 43% prefer the Democratic candidate,
• 28% prefer the Republican candidate,
• 29% are undecided.
Correlations
A correlation is a statistical value that measures
and describes the direction and degree of
relationship between two variables.
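For the most common case, two quantitative variables, a minimal Python sketch (hypothetical data) of Pearson's r computed from deviations around the means:

import statistics

# Hypothetical paired scores on two quantitative variables
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = statistics.mean(x), statistics.mean(y)
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
r = cov / (statistics.pstdev(x) * statistics.pstdev(y))
print(round(r, 2))    # ≈ 0.77: a fairly strong positive relationship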
Types of correlation
Y \ X            Quantitative X         Ordinal X                           Nominal X
Quantitative Y   Pearson r              Biserial r_b                        Point Biserial r_pb
Ordinal Y        Biserial r_b           Spearman rho / Tetrachoric r_tet    Rank Biserial r_rb
Nominal Y        Point Biserial r_pb    Rank Biserial r_rb                  Phi, C, Lambda (λ)

• Phi is for dichotomous data only.
• C is Pearson's contingency coefficient.
• Cramer's V is an alternative coefficient for nominal data.
• Lambda (λ) is the Goodman and Kruskal lambda coefficient.
http://www.andrews.edu/~calkins/math/edrm611/edrm13.htm
Regression
Regression
• Whenever a linear relationship exists, it is
possible to compute the equation for the
straight line that provides the best fit for the
data points.
• The process of finding the linear equation is
called regression, and the resulting equation is
called the regression equation.
Where is the regression line?
[Scatterplot: STRENGTH (70–120) plotted against WEIGHT (140–220)]
Which one is the regression line?
[Scatterplot: STRENGTH (70–120) plotted against WEIGHT (140–220)]
Regression equation
All linear equations have the same general
structure and can be expressed as
• Y = bX + a (for example, Y = 2X + 1)
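A minimal Python sketch (standard library only, hypothetical data) showing how the slope b and intercept a of the best-fitting line are found with the usual least-squares formulas:

import statistics

# Hypothetical paired scores (X = predictor, Y = outcome)
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]            # these happen to lie exactly on Y = 2X + 1

mx, my = statistics.mean(x), statistics.mean(y)
# Least-squares slope: covariance of X and Y divided by the variance of X
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx                 # the line passes through (mean of X, mean of Y)

print(b, a)                     # 2.0 1.0, i.e. Y = 2X + 1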
Standardized form
• Often the regression equation is reported in
standardized form, which means that the
original X and Y scores were standardized, or
transformed into z-scores, before the
equation was computed.
z_Y = β · z_X
Multiple Regression
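In its general form (standard notation, not taken from the slide itself), multiple regression extends the single-predictor equation to several predictors:

Y = b1·X1 + b2·X2 + … + a

where each b is the slope (regression coefficient) for its predictor and a is the intercept.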
INFERENTIAL STATISTICS
Sampling Error
Random samples
No treatment
Is the difference due to a sampling
error?
Random samples
Violent / Nonviolent TV
Is the difference due to a sampling
error?
• Sampling error is the naturally occurring
difference between a sample statistic and the
corresponding population parameter.
• The problem for the researcher is to decide
whether the 4-point difference was caused by
the treatments (the different television
programs) or is just a case of sampling error.
Hypothesis testing
• A hypothesis test is a statistical procedure that
uses sample data to evaluate the credibility of
a hypothesis about a population.
5 elements of a hypothesis test
1. The Null Hypothesis
The null hypothesis is a statement about the
population, or populations, being examined, and
always says that there is no effect, no change, or no
relationship.
2. The Sample Statistic
The data from the research study are used to
compute the sample statistic.
5 elements of a hypothesis test
3. The Standard Error
Standard error is a measure of the average, or standard,
distance between a sample statistic and the corresponding
population parameter.
The "standard error of the mean," s_M, refers to the standard deviation
of the distribution of sample means taken from a population.
4. The Test Statistic
A test statistic is a mathematical technique for comparing the
sample statistic with the null hypothesis, using the standard
error as a baseline.
M 1 M 2
t
sm
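To make the pieces concrete, a minimal Python sketch (standard library only; the scores are hypothetical) of an independent-samples t statistic, using the pooled estimate of the standard error of the mean difference as the baseline:

import math
import statistics

# Hypothetical scores for two treatment conditions (means differ by 4 points,
# echoing the 4-point difference discussed earlier)
group1 = [8, 9, 10, 11, 12]     # M1 = 10
group2 = [4, 5, 6, 7, 8]        # M2 = 6

n1, n2 = len(group1), len(group2)
m1, m2 = statistics.mean(group1), statistics.mean(group2)

# Pooled variance of the two samples (df = n1 + n2 - 2)
sp2 = (sum((x - m1) ** 2 for x in group1) +
       sum((x - m2) ** 2 for x in group2)) / (n1 + n2 - 2)

s_m = math.sqrt(sp2 / n1 + sp2 / n2)    # standard error of the mean difference
t = (m1 - m2) / s_m
print(round(t, 2))                      # 4.0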
5 elements of a hypothesis test
5. The Alpha Level ( Level of Significance)
The alpha level, or level of significance, for a
hypothesis test is the maximum probability that the
research result was obtained simply by chance.
A hypothesis test with an alpha level of .05, for
example, means that the test demands that there is
less than a 5% (.05) probability that the results are
caused only by chance.
Reporting Results from a Hypothesis
Test
• In the literature, significance levels are
reported as p values.
For example, a research paper may report a
significant difference between two treatments
with p < .05. The expression p < .05 simply means
that there is less than a .05 probability that the
result is caused by chance.
Errors in Hypothesis Testing
If a researcher is misled by the results from the
sample, it is likely that the researcher will reach
an incorrect conclusion.
Two kinds of errors can be made in hypothesis
testing.
Type I Errors
• A Type I error occurs when a researcher finds evidence
for a significant result when, in fact, there is no effect
(no relationship) in the population.
• The error occurs because the researcher has, by chance,
selected an extreme sample that appears to show the
existence of an effect when there is none.
• The consequence of a Type I error is a false report. This
is a serious mistake.
• Fortunately, the likelihood of a Type I error is very
small, and the exact probability of this kind of mistake
is known to everyone who sees the research report.
Type II error
• A Type II error occurs when sample data do
not show evidence of a significant effect
when, in fact, a real effect does exist in the
population.
• This often occurs when the effect is so small
that it does not show up in the sample.
Factors that Influence the Outcome of
a Hypothesis Test
1. The Sample Size
A difference found with a large sample is
more likely to be significant than the same result
found with a small sample.
2. The Size of the Variance
When the variance is small, the data show a
clearer mean difference between the two
treatments, so the same difference is more likely
to be significant, as the sketch below illustrates.
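A small illustration (hypothetical numbers, reusing the pooled-variance t formula sketched earlier) of how the same 4-point mean difference produces a larger test statistic when the sample is larger or the variance is smaller:

import math

def t_stat(m1, m2, var, n):
    # Independent-samples t with equal group sizes n and equal variance var
    s_m = math.sqrt(var / n + var / n)   # standard error of the mean difference
    return (m1 - m2) / s_m

# The same 4-point mean difference under different conditions
print(round(t_stat(10, 6, 25, 5), 2))    # small sample, large variance -> t ≈ 1.26
print(round(t_stat(10, 6, 25, 50), 2))   # large sample, large variance -> t ≈ 4.0
print(round(t_stat(10, 6, 4, 5), 2))     # small sample, small variance -> t ≈ 3.16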
Effect Size
• Knowing whether a difference is significant is
not enough. We also need to know the size of
the effect.
Measuring Effect Size with Cohen’s d
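As a reminder of the standard definition, Cohen's d expresses the mean difference between two treatments in standard-deviation units:

d = (M1 - M2) / s, where s is the sample standard deviation (pooled across groups).

By convention, d ≈ 0.2 is considered a small effect, 0.5 a medium effect, and 0.8 a large effect.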
Measuring Effect Size as a Percentage
of Variance (r²)
The effect size can also be measured by calculating the
percentage of variance in the scores that is accounted for
(predicted) by the treatment effect.
df = (n1 - 1) + (n2 - 1)
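For a two-group t test, the percentage of variance accounted for can be computed directly from the test statistic (standard formula, assuming the df given above):

r² = t² / (t² + df)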
Questions?
Group Discussion
• Identify the two basic concerns with using a
correlation to measure split-half reliability and
explain how these concerns are addressed by
Spearman-Brown, K-R 20, and Cronbach’s
alpha.
• Identify the basic concern with using the
percentage of agreement as a measure of
inter-rater reliability and explain how this
concern is addressed by Cohen’s kappa.