Look at the above figure and note that when a variable is normally distributed, the mean, median, and
mode are the same number.
• When the variable is skewed to the left (i.e., negatively skewed), the mean shifts to the left the most, the
median shifts to the left the second most, and the mode is the least affected by the presence of skew in the
data.
• Therefore, when the data are negatively skewed, this happens:
mean < median < mode.
• When the variable is skewed to the right (i.e., positively skewed), the mean shifts to the right the
most, the median shifts to the right the second most, and the mode is the least affected.
• Therefore, when the data are positively skewed, this happens:
mean > median > mode.
• If you go to the end of the curve, to where it is pulled out the most, you will see that the order goes
mean, median, and mode as you "walk up the curve" for both negatively and positively skewed curves.
You can use the following two rules to provide some information about skewness even when you cannot see a
line graph of the data (i.e., all you need is the mean and the median):
1. Rule One. If the mean is less than the median, the data are skewed to the left.
2. Rule Two. If the mean is greater than the median, the data are skewed to the right.
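The two rules can be sketched in a few lines of Python. This is only an illustration: the helper name `skew_direction` and both small data sets are made up for the example, and the rule is a rule of thumb rather than a guarantee.

```python
from statistics import mean, median

def skew_direction(data):
    """Apply the mean-versus-median rule of thumb (hypothetical helper)."""
    m, med = mean(data), median(data)
    if m < med:
        return "skewed to the left"
    if m > med:
        return "skewed to the right"
    return "no skew indicated"

# Made-up data: one very low value pulls the mean below the median.
low_tail = [10, 60, 70, 75, 80]
# Made-up data: one very high value pulls the mean above the median.
high_tail = [20, 25, 30, 40, 90]

print(skew_direction(low_tail))
print(skew_direction(high_tail))
```

In the first list the mean is 59 and the median is 70, so Rule One applies. In the second the mean is 41 and the median is 30, so Rule Two applies.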
Unit 1: One Variable Statistics.
1. The following data represent the Algebra test scores of 20 students:
29, 31, 67, 67, 69, 70, 71, 72, 75, 77, 78, 80, 83, 85, 87, 90, 90, 91, 91, 93
a) Create a stem-and-leaf plot.
b) Create a histogram.
c) What are the mean, mode, and median?
d) Create a box-plot.
e) What is the interquartile range?
f) What is the standard deviation?
g) Suppose that a 21st score of 5 is added. Which of the measures in (c) is likely to change the most?
Will the standard deviation increase or decrease?
h) Describe the shape, center, and spread of this data.
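One way to check parts (c), (f), and (g) of problem 1 is Python's standard `statistics` module. The sketch below assumes the sample standard deviation is wanted (the worksheet does not say which version to use), and the helper name `summarize` is made up for the example.

```python
from statistics import mean, median, multimode, stdev

scores = [29, 31, 67, 67, 69, 70, 71, 72, 75, 77,
          78, 80, 83, 85, 87, 90, 90, 91, 91, 93]
with_low_score = scores + [5]  # part (g): a 21st score of 5

def summarize(data):
    """Return the measures asked for in parts (c) and (f)."""
    return {
        "mean": mean(data),
        "median": median(data),
        "modes": multimode(data),  # every value tied for most frequent
        "stdev": stdev(data),      # sample standard deviation (assumption)
    }

before = summarize(scores)
after = summarize(with_low_score)

for name in ("mean", "median", "modes", "stdev"):
    print(f"{name:>6}: {before[name]} -> {after[name]}")
```

The mean drops from 74.8 to about 71.5 while the median only moves from 77.5 to 77, and the standard deviation goes up because the new score lies far from the rest.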
2. Describe the shape of the following distributions. Use words like skewed to the left (fewer values on the
left), skewed to the right (fewer values on the right), uniform, unimodal (one mode tower), bimodal (two
mode towers), gap, outlier, symmetric, and bell shaped.
[Distribution figures not reproduced.]
3. Use the following box-plot to answer the questions.
a) What is the interquartile range?
b) What is the maximum?
c) What is the minimum?
d) Draw a histogram that best models this box-plot.
4. A set of data has a mean of 56.7 and a standard deviation of 2.5. What would the new mean and new
standard deviation be if each data value is increased by 5?
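Problem 4 gives only summary statistics, so the sketch below uses made-up values to show the general behaviour: adding a constant to every value moves the mean by that constant and leaves the standard deviation unchanged.

```python
from statistics import mean, pstdev

# Made-up values; the problem gives only the mean and standard deviation.
data = [52.0, 55.5, 57.0, 58.5, 60.5]
shift = 5
shifted = [x + shift for x in data]

# The mean moves by exactly the shift.
print(mean(data), mean(shifted))
# The spread around the mean does not change.
print(pstdev(data), pstdev(shifted))
```

By the same reasoning, the data in problem 4 would have a new mean of 61.7 and the same standard deviation of 2.5.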
5. The data below shows the ages of teachers: 23, 25, 25, 27, 29, 30, 31, 32, 33, 34, 36, 37, 40, 41, 42, 43, 55
a) What measure of spread is more appropriate for this data (mean, mode, median, or standard deviation)?
b) Are there any outliers? (Explain…)
c) Describe the shape of this data.
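A common way to answer the outlier question is the 1.5 × IQR rule, sketched below for the teacher ages. The worksheet does not state which outlier test or quartile convention it expects, so both are assumptions here: quartiles are taken as the medians of the lower and upper halves, and the helper name `quartiles` is made up for the example.

```python
from statistics import median

ages = [23, 25, 25, 27, 29, 30, 31, 32, 33, 34, 36, 37, 40, 41, 42, 43, 55]

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves.

    For an odd number of values the middle value is left out of both halves.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)

q1, q3 = quartiles(ages)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr    # values below this count as outliers
high_fence = q3 + 1.5 * iqr   # values above this count as outliers
outliers = [x for x in ages if x < low_fence or x > high_fence]

print(q1, q3, iqr)
print(low_fence, high_fence, outliers)
```

Under these assumptions the fences are 9.25 and 59.25, so 55 stands apart from the other ages but is not flagged as an outlier by this rule.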
6. A teacher interviewed 200 students and found the following results:
          Sophomore   Junior   Senior
Male      .2          .1       .25
Female    .1          .25      .1
a) How many senior males did the teacher interview?
b) If the student was a sophomore, what gender was it more likely to be?
c) How many more female juniors were there than male juniors?
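The conversion from relative frequencies to head counts in problem 6 can be sketched as follows. The sketch assumes the flattened values read as Male .2, .1, .25 and Female .1, .25, .1 for Sophomore, Junior, and Senior; the helper name `count` is made up for the example.

```python
total_students = 200

# Relative frequencies: rows are gender, columns are class year (assumed layout).
rel_freq = {
    "Male":   {"Sophomore": 0.20, "Junior": 0.10, "Senior": 0.25},
    "Female": {"Sophomore": 0.10, "Junior": 0.25, "Senior": 0.10},
}

def count(gender, year):
    """Convert one relative frequency into a head count."""
    return round(rel_freq[gender][year] * total_students)

senior_males = count("Male", "Senior")
junior_gap = count("Female", "Junior") - count("Male", "Junior")
# Among sophomores, the gender with the larger relative frequency is more likely.
likelier_sophomore = max(rel_freq, key=lambda g: rel_freq[g]["Sophomore"])

print(senior_males, junior_gap, likelier_sophomore)
```

Each count is simply the relative frequency multiplied by the 200 students interviewed; `round` only guards against floating-point noise.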
7. The following are test scores of a Biology test: 55, 59, 60, 61, 65, 65, 65, 70, 72, 73, 74, 75, 76, 77, 78, 82, 88,
93, 94, 95
a) Create a stem-and-leaf plot.
b) Create a histogram.
c) What are the mean, mode, and median?
d) Create a box-plot.
e) What is the interquartile range?
f) What is the standard deviation?
g) Suppose that a 21st score of 5 is added. Which of the measures in (c) is likely to change the most?
Will the standard deviation increase or decrease?
8. Use the following box-plot to answer the questions
a) What is the interquartile range?
b) What are the maximum and minimum?
9. A set of data has a mean of 44.7 and a standard deviation of 1.3. What would the new mean and new
standard deviation be if each data value is decreased by 3?
10. The data below shows household incomes: 23500, 26000, 28000, 29000, 32500, 36500, 38500, 42200,
55000, 58500, 62000, 67000, 112200
a) What measure of spread is more appropriate for this data (mean, mode, median, or standard deviation)?
b) Are there any outliers? (Explain…)
c) Describe the shape of this data.
d) Suppose that a millionaire is added to the data set. Which measure of spread will increase the most?
11. 400 people were surveyed. Each person was asked whether he or she prefers Drama or Comedy movies.
The results are shown in the relative frequency table below:
Age       21-30   31-40   >41   Total
Drama     .25     .15     .20   .6
Comedy    .30     .06     .04   .4
Total     .55     .21     .24   1
a) How many more people prefer drama than comedy?
b) How many more 21-30 year olds prefer comedy than drama?
c) How many total people prefer drama?
12. Match the following histogram (E-H) to the correct box-plot (A-D):
13. The following table shows the population of various towns:
City          Population (thousands)
Eastville     30
Northville    40
Southville    56
Westville     48
Centerville   51
Wayoutville   53
Salemville    64
Suppose that Winstonville, with a population of 26,000, is added to the data set. Which of the following is true?
A. The mean increases
B. The range remains the same
C. The standard deviation increases
D. The interquartile range increases
14. A chair lift can hold a maximum of 40000 pounds and 200 people. A safety inspector is determining if the
chair lift is safe or not. Which measure of central tendency of people’s weight would be most useful to
determine if the chair lift is safe?
A. Mean
B. Median
C. Mode
D. Standard Deviation
15. The number of points scored by a football team in the first ten games of a season are: 10, 14, 17, 20,
35, 35, 38, 42. What would happen to the data distribution if the team scored 27, 27, and 28 in the next three
games?
A. The standard deviation would increase.
B. The data distribution would become more peaked and more widely spread.
C. The data distribution would become less peaked and less widely spread.
D. The data distribution would become more peaked and less widely spread.
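The effect asked about in problem 15 can be checked numerically. The sketch below uses the scores exactly as listed in the problem and the population standard deviation, which is an assumption since the worksheet does not say which version to use.

```python
from statistics import mean, pstdev

scores = [10, 14, 17, 20, 35, 35, 38, 42]  # scores as listed in the problem
next_games = [27, 27, 28]
combined = scores + next_games

# Mean and spread before the three new games.
print(mean(scores), pstdev(scores))
# Mean and spread after adding scores that sit close to the mean.
print(mean(combined), pstdev(combined))
```

The new scores fall close to the existing mean of 26.375, so they add height near the center of the distribution and pull the standard deviation down.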
16. Which of the following is most likely to have the highest standard deviation? To have the lowest
interquartile range?
A. The average age of 100 people surveyed at the mall
B. The average age of 100 people living in a retirement center
C. The weight of 100 new-born babies
D. The number of TVs that 100 households have
17. Which of the following statements is true?
A. A data set with high standard deviation means that there is low variability.
B. A data set with low standard deviation means that the data is spread out.
C. As the amount of data increases, the standard deviation usually increases.
D. A data set with many outliers will have a high standard deviation.
18. Which of the following data sets has the highest interquartile range? Lowest standard deviation?
A. Integers from 1-20
B. Even integers from 2-40
C. The prime numbers from 1-100
D. List of ages of 20 different teenagers.
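Three of the four data sets in problem 18 are fully specified, so their spread can be computed directly; option D depends on which teenagers are chosen and is left out. The sketch below assumes quartiles are the medians of the lower and upper halves and uses the population standard deviation; the helper name `iqr` is made up for the example.

```python
from statistics import median, pstdev

def iqr(data):
    """Interquartile range using medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

sets = {
    "A": list(range(1, 21)),       # integers from 1 to 20
    "B": list(range(2, 41, 2)),    # even integers from 2 to 40
    # Primes up to 100, found by trial division.
    "C": [n for n in range(2, 101)
          if all(n % d for d in range(2, int(n ** 0.5) + 1))],
}

for label, values in sets.items():
    print(label, iqr(values), round(pstdev(values), 2))
```

Ages of teenagers can only range from 13 to 19, which is worth keeping in mind when comparing option D against the printed values.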
19. The math club has 50 members. Each member was asked if he preferred Algebra or Geometry. The results
are shown in the relative frequency table below:
            Underclassmen   Upperclassmen   Total
Geometry    .36             .04             .4
Algebra     .28             .32             .6
Total       .64             .36             1.00
Which is true?
A. Eight more underclassmen prefer Geometry than Algebra.
B. Thirty-six students are upperclassmen
C. Fourteen more upperclassmen prefer Algebra than Geometry
D. 32% of the students preferred Algebra.
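Each statement in problem 19 can be tested by turning the relative frequencies into counts out of 50 members. The sketch below assumes the table reads Geometry .36 and .04, Algebra .28 and .32 for underclassmen and upperclassmen; the helper name `count` is made up for the example.

```python
members = 50

# Relative frequencies: rows are subject, columns are class level (assumed layout).
rel_freq = {
    "Geometry": {"Underclassmen": 0.36, "Upperclassmen": 0.04},
    "Algebra":  {"Underclassmen": 0.28, "Upperclassmen": 0.32},
}

def count(subject, group):
    """Convert one relative frequency into a number of members."""
    return round(rel_freq[subject][group] * members)

upperclassmen = count("Geometry", "Upperclassmen") + count("Algebra", "Upperclassmen")
algebra_share = sum(rel_freq["Algebra"].values())

# Each entry records whether the corresponding statement holds.
checks = {
    "A": count("Geometry", "Underclassmen") - count("Algebra", "Underclassmen") == 8,
    "B": upperclassmen == 36,
    "C": count("Algebra", "Upperclassmen") - count("Geometry", "Upperclassmen") == 14,
    "D": round(algebra_share, 2) == 0.32,
}
print(checks)
```

A frequent slip is reading a relative frequency such as .36 as a count of 36; multiplying by the 50 members first avoids it.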
ACTIVATOR:
15, 18, 21, 7, 29, 20, 9, 23, 25, 25, 29, 14, 8, 18, 26, 28, 27, 19, 7, 26
Use the set of data to do the following:
• Write the numbers in order.
• Find the median for the set of numbers.
• Find the mean for the set of numbers.
• Find the range for the set of numbers.
• Find the mode for the set of numbers.
• Find the outliers, if any, for the set of numbers.
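The first five activator tasks can be checked with Python's standard `statistics` module, as in the sketch below. It reports every value tied for most frequent, since this data set has more than one mode; the outlier task is left out because the activator does not say which test to use.

```python
from statistics import mean, median, multimode

data = [15, 18, 21, 7, 29, 20, 9, 23, 25, 25,
        29, 14, 8, 18, 26, 28, 27, 19, 7, 26]

ordered = sorted(data)  # the numbers written in order
print(ordered)
print("median:", median(ordered))
print("mean:", mean(ordered))
print("range:", ordered[-1] - ordered[0])
print("modes:", multimode(ordered))  # all values tied for most frequent
```

With 20 values the median is the average of the 10th and 11th ordered values, 20 and 21.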