MATH 10043
CHAPTER 3
EXAMPLES & DEFINITIONS
Section 3.2
Definition: When we’re interested in the middle or center of a data set, we’re looking at
measures of central tendency. These are generally called averages. The four we will
study are mean, median, mode, and midrange.
Definition: The arithmetic mean of a data set is found by adding the data values and
dividing the total by the number of data values. One can think of mean as a balance
point of a data set.
$$\bar{x} = \frac{\sum x}{n} = \frac{\text{sum of the data values}}{\text{number of data values}}$$
Notation:
$\bar{x}$ -- sample mean (This is a statistic.)
µ -- population mean (This is a parameter.) [Greek letter lower-case ‘mu’]
Ex. A) A committee of TCU students has the ages below. (a) Find the mean age of
this group of students. (b) One more student joins the committee, a non-traditional
student with an age of 57. Find the new mean.
18   20   19   21   19   22   21
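As a way to check hand work on Ex. A, here is a minimal Python sketch of the mean definition above. The function name `mean` and the variable names are illustrative choices, not part of the course materials.

```python
def mean(values):
    """Arithmetic mean: sum of the data values divided by the number of values."""
    return sum(values) / len(values)

ages = [18, 20, 19, 21, 19, 22, 21]   # committee ages from Ex. A
original_mean = mean(ages)

new_ages = ages + [57]                 # the non-traditional student joins
new_mean = mean(new_ages)

print(original_mean, new_mean)
```

Comparing the two results shows how far a single large value can pull the mean.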
Definition: The median of a data set is found by locating
• the single middle value of the data set if n is odd,
• the mean of the two middle values if n is even,
in an ORDERED list of values.
One can think of the median, $\tilde{x}$, as the physical center of a data set.
Ex. B) Find the median of both the original committee of students in example A, and
the new committee.
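The two cases in the median definition (n odd, n even) can be sketched in Python as follows. This is only an illustration for checking Ex. B; the function name `median` is an assumption made for the example.

```python
def median(values):
    """Middle value of the ordered list (n odd) or mean of the two middle values (n even)."""
    ordered = sorted(values)           # the definition requires an ORDERED list
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

original = [18, 20, 19, 21, 19, 22, 21]    # ages from Ex. A
with_new_member = original + [57]

median_original = median(original)
median_new = median(with_new_member)
print(median_original, median_new)
```

Unlike the mean, the median barely moves when the 57-year-old joins the committee.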
•SORTING A LIST USING THE CALCULATOR (optional)
STAT
ENTER (EDIT menu)
clear list if necessary
enter data into a list
STAT
SORTA (option 2)
enter list number -- for example, 2ND L1
ENTER
To see sorted list: STAT EDIT menu
•USING THE CALCULATOR TO GET DESCRIPTIVE STATISTICS:
[NOTE: use this ONLY to CHECK your answers until after the first exam!]
STAT
CALC menu
choose 1-VAR STAT; enter list number -- for example, 2ND L1
ENTER
use down arrow to see entire display
Ex. C) Find the median of the list of student ages below. How does the median of this
list compare with that of example B? Find the mean of the data set below. How does
this value compare with the median?
57   21   19   19   71   19   21   65
Definition: The mode of a data set is the most frequently occurring value in a data set.
When two values occur with the same greatest frequency, we say that the data set is
bimodal. When no value is repeated, OR if MORE than two data values have the same
greatest frequency, we say that the data set has no mode.
Definition: The midrange of a data set is the value midway between the lowest and
highest values in the data set.
$$\text{midrange} = \frac{L + H}{2}$$
Ex. D) (a) Find the mode of the data set given in example C. (b) Find the midrange of
the data set.
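The mode and midrange definitions can be sketched in Python as below, using the data from example C. The helper names `modes` and `midrange` are illustrative. The sketch follows the convention stated in these notes: an empty result stands for "no mode", which covers both the case where no value repeats and the case where more than two values tie for the greatest frequency.

```python
from collections import Counter

def modes(values):
    """Return [m] for one mode, [m1, m2] if bimodal, or [] for 'no mode'."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                       # no value is repeated
        return []
    winners = sorted(v for v, c in counts.items() if c == top)
    return winners if len(winners) <= 2 else []   # more than two ties: no mode

def midrange(values):
    """Value midway between the lowest (L) and highest (H) data values."""
    return (min(values) + max(values)) / 2

ages_c = [57, 21, 19, 19, 71, 19, 21, 65]   # data from example C
mode_c = modes(ages_c)
midrange_c = midrange(ages_c)
print(mode_c, midrange_c)
```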
Section 3.3
Definition: Measures of dispersion measure the spread or variability of the data set.
Definition: Range – the difference between the largest and smallest values in a data
set.
Ex. E) Find the range of the test scores given below:
63   75   92   80   92   90   80   71   57
Definition: Standard deviation – roughly, the average distance of the data values
from their mean.
Notation:
s -- sample standard deviation (This is a statistic.)
σ -- population standard deviation (This is a parameter.) [Greek lower-case ‘sigma’]
Ex. F) Find the standard deviation of the test scores given below.
63   75   92   80   92   90   80   71   57
Variance & standard deviation:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \qquad s = \sqrt{s^2}$$
•USING THE CALCULATOR TO CALCULATE STANDARD DEVIATION:
STAT
EDIT menu
(clear lists if necessary)
enter x values (data values) into List 1
calculate the mean of the data set
define L2 = L1 - mean
ENTER
define L3 = L2^2
ENTER
sum list 3 (see instructions below)
use sum of list 3 in variance formula
•SUMMING A LIST ON THE CALCULATOR:
***If you are on the list screen, you must do 2ND QUIT
2ND LIST key [the LIST key is the same as the STAT key]
go to MATH menu
choose SUM (option 5)
2ND L#
ENTER
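The list-by-list calculator procedure for the standard deviation can be mirrored in Python. The sketch below uses the test scores from Ex. F; the variable names are illustrative, with comments showing which calculator list each one corresponds to.

```python
import math

scores = [63, 75, 92, 80, 92, 90, 80, 71, 57]   # L1: data values from Ex. F
n = len(scores)
x_bar = sum(scores) / n                          # mean of the data set

deviations = [x - x_bar for x in scores]         # L2 = L1 - mean
squared = [d ** 2 for d in deviations]           # L3 = L2^2

variance = sum(squared) / (n - 1)                # s^2 = sum of L3 divided by (n - 1)
s = math.sqrt(variance)                          # s = square root of the variance

print(round(x_bar, 2), round(variance, 2), round(s, 2))
```

The deviations in the second list always sum to zero (up to rounding), which is why they are squared before being added.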
Ex. G) Find the standard deviation of the data set given.
3   9   13   16   16   23
Ex. H) Calculate the mean and standard deviation of the high-school student ages
below:
Age   Frequency
14    5
15    6
16    8
17    7
18    4
Σ     30
•USING THE CALCULATOR TO FIND DESCRIPTIVE STATISTICS WITH
FREQUENCY DATA:
STAT
EDIT menu
(clear lists if necessary)
enter x values (data values) into List 1
enter f values (frequencies) into List 2
STAT
CALC menu
choose 1 VAR STAT L1, L2 (the comma key is above the 7 key)
ENTER
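For frequency data, each value counts as many times as its frequency. A minimal Python sketch of that idea, using the table from Ex. H, is shown below. The variable names are illustrative, and the sketch assumes the sample formulas (dividing by n - 1) used earlier in this section.

```python
import math

ages = [14, 15, 16, 17, 18]    # x values (List 1)
freqs = [5, 6, 8, 7, 4]        # f values (List 2)

n = sum(freqs)                 # total number of data values
x_bar = sum(x * f for x, f in zip(ages, freqs)) / n

# Each squared deviation is weighted by how often its value occurs.
variance = sum(f * (x - x_bar) ** 2 for x, f in zip(ages, freqs)) / (n - 1)
s = math.sqrt(variance)

print(n, round(x_bar, 2), round(s, 2))
```

The same result comes from writing out all 30 ages individually and applying the ordinary formulas.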
Ex. I) (a) Make a dot plot of the data given below. (b) Use the fact that $\bar{x}$ = 15 and
s = 4.3 to locate the data values on the dot plot within 1, 2, and 3 standard deviations of
the mean. (c) How many data values are within 1 standard deviation of the mean?
What percent? (d) How many data values are within 2 standard deviations of the
mean? What percent? (e) How many data values are within 3 standard deviations of
the mean? What percent?
2    11   13   13   13   14   14   14   15   15
15   15   16   16   16   17   17   18   21   25
Empirical Rule: if the data are approximately normally distributed,
• approximately 68% of the data will lie within one standard deviation of
the mean,
• approximately 95% of the data will lie within two standard deviations
of the mean,
• approximately 99.7% of the data will lie within three standard
deviations of the mean.
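To compare a real data set against the Empirical Rule, one can count how many values fall within 1, 2, and 3 standard deviations of the mean. The Python sketch below does this for the Ex. I data, using the stated values of 15 for the mean and 4.3 for s; the function name `count_within` is illustrative.

```python
data = [2, 15, 11, 15, 13, 16, 13, 16, 13, 16,
        14, 17, 14, 17, 14, 18, 15, 21, 15, 25]   # data from Ex. I
x_bar, s = 15, 4.3                                 # values given in Ex. I

def count_within(values, center, spread, k):
    """Count the values lying within k standard deviations of the mean."""
    low, high = center - k * spread, center + k * spread
    return sum(1 for v in values if low <= v <= high)

counts = {k: count_within(data, x_bar, s, k) for k in (1, 2, 3)}
for k, c in counts.items():
    print(f"within {k} sd: {c} of {len(data)} ({100 * c / len(data):.0f}%)")
```

The percentages for a small data set like this one need not match 68%, 95%, and 99.7% closely, since the rule describes approximately normal data.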
Ex. J) If we have 20 values in a normally distributed data set, approximately how many
values should be within 1 standard deviation of the mean?
Ex. K) Consider a data set having a normal distribution.
(a) What percentage of the data is less than the mean of this data set? (b) What
percentage of the data is greater than a value that is one standard deviation below the
mean? (c) What percentage of the data is between two standard deviations below the
mean and three standard deviations above the mean?
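Questions like Ex. K combine the Empirical Rule with the symmetry of the normal curve: half of each "within k standard deviations" percentage lies on each side of the mean. The sketch below encodes that reasoning for whole-number z-values from -3 to 3 only. The function names are illustrative, and the results are the rule's approximations rather than exact normal probabilities.

```python
# Empirical Rule: approximate percent of data within k standard deviations of the mean.
WITHIN = {0: 0.0, 1: 68.0, 2: 95.0, 3: 99.7}

def percent_below(z):
    """Approximate percent of data below a point z standard deviations from the mean."""
    half = WITHIN[abs(z)] / 2          # by symmetry, half lies on each side of the mean
    return 50 + half if z >= 0 else 50 - half

def percent_above(z):
    return 100 - percent_below(z)

def percent_between(z_low, z_high):
    return percent_below(z_high) - percent_below(z_low)

part_a = percent_below(0)          # less than the mean
part_b = percent_above(-1)         # greater than one sd below the mean
part_c = percent_between(-2, 3)    # between two sd below and three sd above
print(part_a, part_b, round(part_c, 2))
```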
PRACTICE PROBLEMS OVER SECTION 3.3
1. A May 25, 2005 article on KidsHealth.org reported that nationally, 6- to 10-year-olds get an average of
$2 to $5 per week for their allowance. A sample of parents of children from a local elementary school
were surveyed about the allowance of their 6- to 10-year-olds; the results are below. (a) Does the mean
allowance of local children seem consistent with the national average? (b) Find the median, mode, and
midrange of the data. (c) Find the standard deviation of the data.
3   4   5   5   8   5   8   10   5   6
6   8   7   6   5   5   4   3    5   2
2. Suppose you know that the standard deviation of a data set equals zero. What does this tell you
about the data values themselves?
3. If a data set consists of values measured in ounces, what is the unit of measure for the standard
deviation? For the variance?
4. A survey of the time to complete construction on several projects is summarized below. Estimate the
mean and standard deviation of construction times. [USE 1-VAR STAT – no work to show!]
Construction times (in months)   Number of Projects
3 ≤ x < 6                        2
6 ≤ x < 9                        10
9 ≤ x < 12                       12
12 ≤ x < 15                      9
15 ≤ x < 18                      7
5. Consider a data set having a normal distribution [it may be helpful to draw sketches]. What percentage
of the data is: (a) greater than the mean of this data set? (b) within one standard deviation of the mean?
(c) greater than a value that is one standard deviation above the mean? (d) less than a value that is
two standard deviations below the mean? (e) between one standard deviation below the mean and two
standard deviations above the mean?
6. The mean mileage of a tire is 30,000 miles, with a standard deviation of 2500 miles. If we assume that
the mileage is normally distributed, approximately what percentage of all such tires will last for:
(a) between 22,500 and 37,500 miles? (b) more than 25,000 miles? (c) fewer than 32,500 miles?
[Problems 5 – 6 above adapted from Elementary Statistics by Johnson & Kuby, 8th edition.]
Section 3.4
Definition: Measures of relative standing indicate the position of a data value in
terms of its data set.
Definition: A z-score (standard score) gives the position of a data value in terms of
standard deviations from its mean.
z-score:
$$z = \frac{x - \bar{x}}{s}$$
Ex. L) The mean height of adult males is 69 inches, with a standard deviation of 2.8
inches. The mean height of adult females is 65.5 inches, with a standard deviation of
2.5 inches. Former NBA player Michael Jordan is 6 feet 6 inches tall, and former
WNBA player Rebecca Lobo is 6 feet, 4 inches tall. Which of these individuals is taller
for his or her gender?
Ex. M) A student knows that her point score on a test was 628, and that her standard
score was 2.41. She was able to find out that the mean of the test was 614. What was
the standard deviation of the test?
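The z-score formula, and its rearrangement to solve for the standard deviation, can be sketched in Python using the numbers from Ex. L and Ex. M. The function names `z_score` and `sd_from_z` are illustrative, and heights are converted to inches before use.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def sd_from_z(x, mean, z):
    """Rearranged z-score formula: s = (x - mean) / z."""
    return (x - mean) / z

# Ex. L: convert feet and inches to inches, then standardize within each group.
jordan_z = z_score(6 * 12 + 6, 69, 2.8)     # 78 inches, adult males
lobo_z = z_score(6 * 12 + 4, 65.5, 2.5)     # 76 inches, adult females

# Ex. M: the score, mean, and z-score are known; solve for the standard deviation.
test_sd = sd_from_z(628, 614, 2.41)

print(round(jordan_z, 2), round(lobo_z, 2), round(test_sd, 2))
```

The larger z-score identifies the person who is taller relative to his or her own group, even though the raw heights point the other way.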
Definition: Quartiles divide an ordered data set into four equal parts.
Definition: The five-number summary of a data set is a list giving the low (minimum),
Q1, median (Q2), Q3, and high (maximum) of the data set.
Definition: The interquartile range of a data set is the range of the quartiles:
IQR = Q3 – Q1.
Ex. N) Find (a) the quartiles, (b) the five-number summary, and (c) the interquartile
range of each data set below:
A:   11   21   14   7   10   17   18   4   18   7   9
B:   11   21   14   7   10   17   18   4   18   7   9   21
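Quartile conventions vary between textbooks, calculators, and software. The sketch below assumes one common rule: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, with the overall median left out of both halves when n is odd. If the course uses a different rule, the quartiles may differ slightly. All function names are illustrative.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (low, Q1, median, Q3, high) using the median-of-halves rule (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # When n is odd, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

def iqr(values):
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1

set_a = [11, 21, 14, 7, 10, 17, 18, 4, 18, 7, 9]
set_b = set_a + [21]

summary_a, summary_b = five_number_summary(set_a), five_number_summary(set_b)
print(summary_a, iqr(set_a))
print(summary_b, iqr(set_b))
```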
PRACTICE PROBLEMS OVER SECTION 3.4
1. Two intelligence tests are the ETS and the WISC exams. Suppose Kate scores a 625 on the ETS,
which is designed to have a mean of 500 and a standard deviation of 100. Liz scores a 121 on the WISC,
which has a mean of 100 and standard deviation of 15. Who scored higher?
2. Find the quartiles, five-number summary, and interquartile range of the following data sets:
a.   12   21   25   38   24   9   21   29   32
b.   5    26   46   35   15   27   43   34   16   32
     41   32   20   31   38   25   21   38   14   32
3◊. An exam produced grades with a mean score of 74.2 and a standard deviation of 11.5. Rounding to
two decimal places, find the z-score for each of the following test scores:
(a) 54   (b) 79   (c) 93
4. A nationally administered test has a mean of 500 and a standard deviation of 100. If your standard
score (i.e. z-score) on the test was 1.80, what was your actual test score?
5. What does it mean to say that a data value has a standard score of (a) 1.5? (b) –2.1?
(c) In general, what does the standard score measure?
6◊. Which x-value has the higher value relative to the set of data from which it comes?
A: x = 85, where $\bar{x}$ = 72 and s = 8.
B: x = 93, where $\bar{x}$ = 87 and s = 5.
7. Which x-value has the lower value relative to the set of data from which it comes?
C: x = 24.1, where $\bar{x}$ = 25.7 and s = 1.8.
D: x = 33.2, where $\bar{x}$ = 34.1 and s = 4.3.
8. Why does the z-score for a value from a normal distribution usually lie between –3 and +3?
9◊. A sample has a mean of 120 and a standard deviation of 20. Find the value of x that corresponds to
each of these standard scores:
(a) z = 0   (b) z = 1.20   (c) z = -1.40   (d) z = 2.05
10. During a series of fitness tests at school, Bob was told that his z-score for the Softball Throw was
1.25, and that the standard deviation for the school was 16 feet. Bob knows that he threw the softball 179
feet during the test. Find the mean Softball Throw distance for the school.
11. During a series of fitness tests at school, Bill was told that his z-score for sit-ups was –0.75, and that
the mean for the school was 70 sit-ups. Bill knows that he did 61 sit-ups during the test. Find the
standard deviation of sit-ups for the school.
[Problems 3 – 11 above adapted from Elementary Statistics by Johnson & Kuby, 8th edition.]