Using statistics in small-scale language education research
Jean Turner
© Taylor & Francis 2014

Calculate and report descriptive statistics.

Create and review a histogram.*

Calculate and interpret the Shapiro–Wilk statistic.
*a.k.a. frequency distribution
Student #   Score   Student #   Score
1st         4       12th        13
2nd         5       13th        13
3rd         7       14th        13
4th         8       15th        14
5th         8       16th        14
6th         9       17th        14
7th         9       18th        15
8th         10      19th        15
9th         10      20th        15
10th        10      21st        15
11th        13

Mean = 11.14286

Median = 13

Mode = 13 and 15

Range = 11 points

Standard deviation = 3.43927
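As a check on the values above, here is a minimal R sketch that recomputes each statistic from the 21 scores in the table. The variable names (`scores`, `score_modes`, and so on) are illustrative assumptions, not names from the slides. The standard deviation is the sample version (dividing by n - 1), which is what R's `sd()` returns.

```r
# The 21 scores from the table (variable name is an assumption)
scores <- c(4, 5, 7, 8, 8, 9, 9, 10, 10, 10,
            13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15)

score_mean   <- mean(scores)
score_median <- median(scores)

# Base R has no function for the statistical mode,
# so tabulate the scores and keep the most frequent value(s)
counts      <- table(scores)
score_modes <- as.numeric(names(counts)[counts == max(counts)])

score_range <- max(scores) - min(scores)
score_sd    <- sd(scores)  # sample standard deviation (n - 1 in the denominator)

print(c(mean = score_mean, median = score_median,
        range = score_range, sd = score_sd))
print(score_modes)
```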

The descriptive statistics give a sense of ...
◦ central tendency
◦ dispersion

The histogram gives a sense of...
◦ the general shape of the distribution
◦ the possibility of outlier scores
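Both points can also be inspected numerically. The sketch below is only an illustration: it tabulates the histogram bins without drawing them, and it flags possible outliers with the 1.5 × IQR boxplot rule. That rule is an assumption made here for the example; the slides do not name a specific outlier criterion.

```r
# The 21 scores from the table (variable name is an assumption)
scores <- c(4, 5, 7, 8, 8, 9, 9, 10, 10, 10,
            13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15)

# Compute the histogram bins without plotting, to see the shape as counts
h <- hist(scores, breaks = 10, plot = FALSE)
bins <- data.frame(lower = head(h$breaks, -1),
                   upper = h$breaks[-1],
                   count = h$counts)
print(bins)

# Flag scores beyond 1.5 x IQR from the hinges (the boxplot rule, assumed here)
flagged <- boxplot.stats(scores)$out
print(flagged)
```

For these scores the rule flags nothing, so the low scores are part of a skewed distribution rather than isolated outliers under this criterion.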

In Parametric Statistics Land...
◦ Researchers believe their data will match the normal distribution model.

The hypothesis that one of these researchers would propose is:
◦ Null hypothesis: The data are (probably) normally distributed.
How likely is it that the scores are normally distributed?
The Shapiro–Wilk statistic tests that hypothesis!

Enter the data.
>mydata = c(4, 5, 7, 8, 8, 9, 9, 10, 10, 10,
13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15)

Calculate descriptive statistics. (Remember how?)
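>summary(mydata)   # minimum, quartiles, median, mean, maximum
>subset(table(mydata), table(mydata) == max(table(mydata)))   # mode(s)
>sd(mydata)   # standard deviation
>max(mydata) - min(mydata)   # range: maximum score minus minimum score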

Make a histogram.
>hist(mydata, col = "orange", breaks = 10)

Calculate the Shapiro–Wilk statistic.
>shapiro.test (mydata)
Shapiro–Wilk normality test
data: mydata
W = 0.9002, p-value = 0.03527

The observed value of the Shapiro–Wilk statistic is:
W = 0.9002

The probability of a W this small or smaller, if the data come from a normal distribution, is:
p-value = 0.03527
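The two values can also be pulled out of the test result instead of being read off the printed output. This is a minimal sketch; `result`, `w_value`, and `p_value` are names chosen for the example.

```r
mydata <- c(4, 5, 7, 8, 8, 9, 9, 10, 10, 10,
            13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15)

# shapiro.test() returns a list-like "htest" object
result <- shapiro.test(mydata)

w_value <- unname(result$statistic)  # the observed W
p_value <- result$p.value            # the p-value for that W

cat("W =", round(w_value, 4), "\n")
cat("p-value =", round(p_value, 5), "\n")
```

Storing the values this way makes it easy to report them or compare p with .05 in later steps.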
What does this mean? Are the data probably normally distributed or not?

For the Shapiro–Wilk statistic:
◦ If p is more than .05, we do not reject the null hypothesis. The data are consistent with a normal distribution, although the test cannot prove that they are normally distributed.
◦ If p is less than .05, we reject the null hypothesis. The data are probably not normally distributed.
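The comparison of p with .05 can be written as a small helper. This is a sketch only: `decide_normality` and its `alpha` argument are hypothetical names introduced for illustration, and the wording of the returned messages is an assumption about how one might phrase the decision.

```r
# Hypothetical helper: turn a Shapiro-Wilk p-value into a decision
decide_normality <- function(p, alpha = 0.05) {
  if (p < alpha) {
    "reject the null hypothesis: the data are probably not normally distributed"
  } else {
    "do not reject the null hypothesis: the data are consistent with normality"
  }
}

print(decide_normality(0.03527))  # p below .05
print(decide_normality(0.7225))   # p above .05
```

In practice the p-value would come from `shapiro.test(mydata)$p.value` rather than being typed in by hand.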

Oh, p = 0.03527 is less than .05.
◦ The null hypothesis is probably not true.
◦ I reject it at the .05 significance level!
◦ The data are probably not normally distributed.

Check homework practice problem #19 from Chapter Two.
The null hypothesis: The data are (probably) normally distributed.

Enter the data.
>spanish.vocab = c(41, 33, 32, 29, 27, 27, 26, 24, 19,
19, 18, 17, 14)
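Before running the test, the descriptive statistics can be computed for this data set as well. The sketch below wraps the calculations in a small function; `describe_scores` is a hypothetical helper name, not something from the slides or from base R.

```r
spanish.vocab <- c(41, 33, 32, 29, 27, 27, 26, 24, 19,
                   19, 18, 17, 14)

# Hypothetical helper: collect the descriptive statistics in one list
describe_scores <- function(x) {
  counts <- table(x)
  list(
    n      = length(x),
    mean   = mean(x),
    median = median(x),
    modes  = as.numeric(names(counts)[counts == max(counts)]),
    range  = max(x) - min(x),
    sd     = sd(x)  # sample standard deviation
  )
}

vocab_stats <- describe_scores(spanish.vocab)
str(vocab_stats)
```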

>shapiro.test(spanish.vocab)
Shapiro–Wilk normality test
data: spanish.vocab
W = 0.958, p-value = 0.7225

The observed value of the Shapiro–Wilk statistic is:
W = 0.958

The probability of a W this small or smaller, if the data come from a normal distribution, is:
p-value = 0.7225
I’m reminding myself…

For the Shapiro–Wilk statistic:
◦ If p is more than .05, we do not reject the null hypothesis. The data are consistent with a normal distribution, although the test cannot prove that they are normally distributed.
◦ If p is less than .05, we reject the null hypothesis. The data are probably not normally distributed.

For the Spanish data, p = .7225, which is greater than .05.
◦ The null hypothesis is not rejected.
◦ The test found no evidence that the data depart from normality.
◦ It is reasonable to treat the data as normally distributed.