Practice problems from chapters 2 and 3
Question-1. For each of the following variables, indicate whether it is quantitative
or qualitative and specify which of the four levels of measurement (nominal, ordinal,
interval, and ratio) is most appropriate.
a) Class standing (i.e., letter grades) of students of a statistics class
b) Admitting diagnosis of patients admitted to a mental health clinic
c) Weights of babies born in a hospital during a year
d) Gender of babies born in a hospital during a year
e) Under-arm temperature of day-old infants born in a hospital.
Answer: a) Qualitative and Ordinal; b) Qualitative and Nominal; c) Quantitative
and Ratio; d) Qualitative and Nominal; e) Quantitative and Interval.
Question-2. Consider the following sample data set:
0.3, 0.6, 0.9, 1.3, 0.4, 0.6, 1.2, 1.4, 1.1, 0.2, 0.2
a) Find the mean, median, standard deviation, and range.
b) Find the interquartile range.
c) Find the 45th and 87th percentiles.
Answer: Sort data in ascending order:
0.2, 0.2, 0.3, 0.4, 0.6, 0.6, 0.9, 1.1, 1.2, 1.3, 1.4
Recall that Lk = the location of the kth percentile in the sorted data, and Pk =
the data value at that location.
a) Mean=0.75; Median: since n=11 is an odd number, the median is the 6th value = 0.6
Standard deviation:
$$s = \sqrt{\frac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)}} = \sqrt{\frac{11(8.16) - (8.2)^2}{11(11-1)}} = 0.45$$
Range=1.4-0.2=1.2
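As a cross-check on part a), the same figures can be reproduced with a short Python sketch. The variable names are illustrative and not part of the original solution; the shortcut formula is the one used above, and `statistics.stdev` is included only for comparison.

```python
import statistics

# Sample data from Question-2
data = [0.3, 0.6, 0.9, 1.3, 0.4, 0.6, 1.2, 1.4, 1.1, 0.2, 0.2]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
data_range = max(data) - min(data)

# Shortcut formula: s = sqrt((n*sum(x^2) - (sum x)^2) / (n*(n-1)))
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
s_shortcut = ((n * sum_x2 - sum_x ** 2) / (n * (n - 1))) ** 0.5

# Library value of the sample standard deviation, for comparison
s_library = statistics.stdev(data)

print(round(mean, 2), median, round(s_shortcut, 2), round(data_range, 1))
```

Both routes to the standard deviation should agree up to floating-point round-off.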
b) Interquartile range = Q3 − Q1.
• To find P25, or the first quartile, find $L_{25} = \frac{kn}{100}$, where k=25 and n=11: L25 =
0.25 ∗ 11 = 2.75. This is not a whole number, so round up to the 3rd value in the sorted data,
i.e., P25 = Q1 = 0.3.
• To find P75, or the third quartile, find $L_{75} = \frac{kn}{100}$, where k=75 and n=11: L75 =
0.75 ∗ 11 = 8.25. This is not a whole number, so round up to the 9th value in the sorted data, i.e.,
P75 = Q3 = 1.2.
• Interquartile range = 1.2 − 0.3 = 0.9.
c) Find P45 and P87.
• To find P45, calculate $L_{45} = \frac{kn}{100}$, where k=45 and n=11: L45 = 0.45 ∗ 11 = 4.95.
This is not a whole number, so round up to the 5th value in the sorted data, i.e., P45 = 0.6.
• To find P87, calculate $L_{87} = \frac{kn}{100}$, where k=87 and n=11: L87 = 0.87 ∗ 11 = 9.57.
This is not a whole number, so round up to the 10th value in the sorted data, i.e., P87 = 1.3.
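The locator rule used in parts b) and c) can be sketched in Python as follows. The function name `percentile_value` is hypothetical. The branch for a whole-number location (average that value and the next one) is an assumption: it is the usual companion to the round-up rule, but it is never needed for this data set, where every location is fractional.

```python
def percentile_value(data, k):
    """Return P_k using the locator rule L_k = k*n/100 (k an integer, 1..99)."""
    if not 0 < k < 100:
        raise ValueError("k must be an integer between 1 and 99")
    ordered = sorted(data)
    n = len(ordered)
    # Integer arithmetic avoids floating-point round-off in k*n/100.
    whole, remainder = divmod(k * n, 100)
    if remainder:
        # Fractional location: round up to position whole+1 (1-based),
        # which is index `whole` in 0-based indexing.
        return ordered[whole]
    # Whole-number location (ASSUMPTION): average this value and the next.
    return (ordered[whole - 1] + ordered[whole]) / 2


data = [0.3, 0.6, 0.9, 1.3, 0.4, 0.6, 1.2, 1.4, 1.1, 0.2, 0.2]
q1 = percentile_value(data, 25)
q3 = percentile_value(data, 75)
print(q1, q3, percentile_value(data, 45), percentile_value(data, 87))
```

Note that library routines such as `statistics.quantiles` interpolate between values and may give slightly different answers from this textbook rule.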
Question-3. A study of physical fitness tests for 12 randomly selected Pre-Medical
students measured their exercise capacity (in minutes). The following data resulted:
34, 19, 33, 30, 43, 36, 32, 41, 31, 31, 37, 18
a) Find the mean, the median, and the mode for the students’ exercise capacity.
b) Find the standard deviation and the variance for the sample data of the students’
exercise capacity.
c) Provide the five number summary for the students’ exercise capacity.
d) Find the percentile corresponding to 36 minutes.
e) Find P24 .
Answer:
a) Mean=32.08; Median = $\frac{\text{6th value} + \text{7th value}}{2} = \frac{32+33}{2} = 32.5$; Mode=31
b) Standard deviation:
$$s = \sqrt{\frac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)}} = \sqrt{\frac{12(12971) - (385)^2}{12(12-1)}} = \sqrt{56.26} = 7.5,$$
and the variance is $s^2 = 56.26$.
c) Five-number summary: Min=18, Q1 = 30.5, Q2 = 32.5, Q3 = 36.5, Max=43.
d) The number of data points less than 36 is 8. Percentile of 36 = $\frac{8}{12} \ast 100 = 67$.
e) L24 = 0.24 ∗ 12 = 2.88, which is not a whole number, so P24 = 3rd value of the sorted data = 30.
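The Question-3 figures can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative, and the percentile of 36 follows the rule used in the answer (number of values below 36, divided by n, times 100).

```python
import statistics

# Exercise capacity (minutes) from Question-3
minutes = [34, 19, 33, 30, 43, 36, 32, 41, 31, 31, 37, 18]
n = len(minutes)

mean = statistics.mean(minutes)
median = statistics.median(minutes)
mode = statistics.mode(minutes)

variance = statistics.variance(minutes)  # sample variance, divisor n - 1
stdev = statistics.stdev(minutes)        # square root of the sample variance

# Percentile of the value 36: share of observations strictly below it
below = sum(1 for x in minutes if x < 36)
percentile_of_36 = round(below / n * 100)

print(round(mean, 2), median, mode, round(variance, 2), round(stdev, 1), percentile_of_36)
```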
Question-4. Suppose we call observations in a population of normally distributed data unusual if they are either at least 2 standard deviations above the mean or at least 2
standard deviations below the mean. What percent are unusual?
Answer: 5%, since by the empirical rule about 95% of the data lie within 2 standard deviations of the mean.
Question-5. Suppose the distribution of grades in your statistics class is normal, with
mean = 83.4, s = 7.0. There are 120 students in the class. If your score is 97.4 in the
class, roughly how many students have scores higher than you?
Answer: Here z-score = $\frac{97.4 - 83.4}{7} = 2$. You may think of the z-score as
the k in the 68 − 95 − 99.7 rule, that is, k = 2. By this rule, when k = 2, about 95% of
the scores lie within two standard deviations of the mean. The remaining 5% is split
between the two tails of the normal curve, so 5%/2 = 2.5% of the class has a score
greater than yours.
If there are 120 people in the class then about (.025)*(120) = 3 students have higher
scores.
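The arithmetic in this answer can be laid out as a small Python sketch. The dictionary of empirical-rule percentages and the variable names are illustrative. The last lines use `statistics.NormalDist` (Python 3.8+) to show the exact normal tail area, which is slightly smaller than the rounded 2.5% from the rule.

```python
from statistics import NormalDist

mean, sd, class_size, score = 83.4, 7.0, 120, 97.4

z = (score - mean) / sd  # z-score of the given exam score

# 68-95-99.7 rule: share of data within k standard deviations of the mean
within = {1: 0.68, 2: 0.95, 3: 0.997}
k = round(z)
upper_tail = (1 - within[k]) / 2        # half of what lies outside +/- k sd
students_above = upper_tail * class_size

# Exact tail area under the standard normal curve, for comparison only
exact_tail = 1 - NormalDist().cdf(z)
exact_students = exact_tail * class_size

print(round(z, 2), round(students_above, 1), round(exact_students, 1))
```

The empirical rule is an approximation, so "about 3 students" is the intended answer even though the exact tail gives a slightly smaller count.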
Question-6. Listed below are the thorax lengths (in millimeters) of a sample of
male fruit flies. Based on these sample values, is a thorax length of 0.68 mm unusual?
Why or why not?
0.72, 0.90, 0.84, 0.68, 0.84, 0.90, 0.92, 0.84, 0.64, 0.84, 0.76
Answer: Mean x̄ = 0.81, standard deviation s = 0.094, z-score = $\frac{x - \bar{x}}{s} = \frac{0.68 - 0.81}{0.094} = -1.38$.
This score lies between -2 and 2. So 0.68 is not unusual.
Question-7. A woman wrote to Dear Abby and claimed that she gave birth 305 days
after a visit from her husband, who was in the Navy. Lengths of pregnancies have a mean
of 268 days and a standard deviation of 15 days. Find the z-score for 305 days. Is such
a length unusual? What do you conclude?
Answer: Mean x̄ = 268, standard deviation s = 15, z-score = $\frac{x - \bar{x}}{s} = \frac{305 - 268}{15} = 2.47$.
This score is greater than 2 but less than 3, so 305 days is somewhat unusual. We can call
a 305-day pregnancy unusual.
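The z-score test used in Question-6 and Question-7 can be written as a pair of small helpers. The function names and the default cutoff of 2 are illustrative; the cutoff matches the "at least 2 standard deviations" convention from Question-4.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def is_unusual(z, cutoff=2.0):
    """Flag a value whose z-score is at least `cutoff` away from 0."""
    return abs(z) >= cutoff


# Question-6: thorax length of 0.68 mm, using the rounded mean and s
z_fly = z_score(0.68, 0.81, 0.094)

# Question-7: pregnancy lasting 305 days
z_pregnancy = z_score(305, 268, 15)

print(round(z_fly, 2), is_unusual(z_fly))
print(round(z_pregnancy, 2), is_unusual(z_pregnancy))
```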
Question-8. If a sample has a mean of 55 and a standard deviation of 6, use the empirical
rule (show the normal curve) to determine the interval in which you would expect 95% of the data
to lie. Assume the data show a bell-shaped distribution.
Answer: x̄ ± ks. Here k = 2, x̄ = 55, and s = 6. Interval is: 55 ± 2 ∗ 6 = [43, 67]
Question-9. Use the sample data listed below to find the coefficient of variation for
each of the two samples. Compare and interpret the result in your own words.
Heights (in.) of men: 71, 66, 72, 69, 68, 69
Lengths (mm) of cuckoo eggs: 19.7, 21.7, 21.9, 22.1, 22.1, 22.3, 22.7, 22.9, 23.9
Answer:
Heights of men: mean x̄ = 69.17, standard deviation s = 2.14, CV = $\frac{s}{\bar{x}} \times 100 = \frac{2.14}{69.17} \times 100 = 3.09\%$.
Lengths of cuckoo eggs: mean x̄ = 22.14, standard deviation s = 1.13, CV = $\frac{s}{\bar{x}} \times 100 = \frac{1.13}{22.14} \times 100 = 5.1\%$.
The variation relative to the mean in the cuckoo egg lengths is greater, by a factor of
about 1.7, than the variation relative to the mean in the men’s heights.
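A short Python sketch of the coefficient-of-variation comparison follows. The helper name `coefficient_of_variation` is illustrative; it uses the sample standard deviation (divisor n − 1), as in the answer above.

```python
import statistics


def coefficient_of_variation(sample):
    """CV in percent: sample standard deviation divided by the mean, times 100."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


heights = [71, 66, 72, 69, 68, 69]                             # inches
eggs = [19.7, 21.7, 21.9, 22.1, 22.1, 22.3, 22.7, 22.9, 23.9]  # millimeters

cv_heights = coefficient_of_variation(heights)
cv_eggs = coefficient_of_variation(eggs)
ratio = cv_eggs / cv_heights  # how much larger the eggs' relative spread is

print(round(cv_heights, 2), round(cv_eggs, 2), round(ratio, 1))
```

Because the CV is unit-free, the two samples can be compared even though one is measured in inches and the other in millimeters.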
Question-10. Use Chebyshev’s theorem to find what percent of the values will fall
between 10 and 26 for a data set with mean of 18 and standard deviation of 2.
Answer: As we don’t know the population distribution, we have to use Chebyshev’s
inequality. Here x̄ = 18 and s = 2. Consider the interval [x̄ − ks, x̄ + ks]. The length of
this interval = (x̄ + ks) − (x̄ − ks) = x̄ + ks − x̄ + ks = 2ks. According to the question
we have
2ks = 26 − 10
2 ∗ k ∗ 2 = 16
4k = 16
$\frac{4k}{4} = \frac{16}{4}$
k = 4
To get the proportion of values, use Chebyshev’s rule, which gives $1 - \frac{1}{k^2} = 1 - \frac{1}{4^2} = 1 - \frac{1}{16} = 0.94$.
That is, at least 94% of the values lie in the interval from 10 to 26.
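The same steps (solve for k from the interval width, then apply 1 − 1/k²) can be sketched in Python. The function name is hypothetical, and the sketch assumes the interval is centred on the mean, as it is in this question.

```python
def chebyshev_lower_bound(mean, sd, low, high):
    """Return (k, minimum proportion of values in [low, high]) by Chebyshev's theorem.

    Assumes the interval is symmetric about the mean, so its width equals 2*k*sd.
    """
    if abs((low + high) / 2 - mean) > 1e-9:
        raise ValueError("interval must be centred on the mean")
    k = (high - low) / (2 * sd)
    if k <= 1:
        raise ValueError("Chebyshev's theorem is informative only for k > 1")
    return k, 1 - 1 / k ** 2


k, bound = chebyshev_lower_bound(mean=18, sd=2, low=10, high=26)
print(k, bound)
```

The bound is a minimum: it holds for any distribution, so the true proportion for a particular data set may be larger.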
Question-11. The box plot is created for wait time (in minutes) from the hospital’s
Emergency Room. These wait times are based on a sample of 160 patients during the
month of January. (See Figure 1)
a) What is an approximate value of the inter-quartile range for the above data?
Figure 1: Boxplot
Answer: IQR = 14 − 8.5 = 5.5
b) If a patient waited for 22 minutes, can this time be declared as potential outlier?
Justify your answer.
Answer: Yes, because 22 is greater than the upper fence value 20.
c) Estimate the number of patients who waited less than 14 minutes. Comment on the
shape of the distribution of wait times.
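Answer: About 75% of the patients, i.e., roughly 0.75 ∗ 160 = 120 patients, since 14 minutes is approximately the third quartile. The shape of the distribution is skewed to the left.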
Question-12. How many different 9-letter code words can be made using the symbols
%, %, %, %, &, &, &, +, +?
Answer: $\frac{9!}{4!\,3!\,2!} = 1260$, as there are nine items where four are alike, three are alike,
and two are alike.
Question-13. How many different ways can 5 identical tubes of tartar control toothpaste, 3 identical tubes of bright white toothpaste, and 4 identical tubes of mint toothpaste be arranged in a grocery counter display? (Answer: 27,720).
Answer: $\frac{12!}{5!\,3!\,4!} = 27720$, as there are twelve items in all where five are alike, three are
alike, and four are alike.
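The counting rule used in Question-12 and Question-13 (n! divided by the factorial of each group size) can be checked with a few lines of Python. The function name `distinct_arrangements` is illustrative.

```python
from math import factorial


def distinct_arrangements(group_sizes):
    """Number of distinct orderings of items that fall into groups of identical items."""
    total = factorial(sum(group_sizes))
    for size in group_sizes:
        # Dividing by size! removes orderings that differ only by
        # swapping identical items within the same group.
        total //= factorial(size)
    return total


code_words = distinct_arrangements([4, 3, 2])  # Question-12: %, &, + symbols
displays = distinct_arrangements([5, 3, 4])    # Question-13: toothpaste tubes

print(code_words, displays)
```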
Question-14. Six men and seven women apply for two identical jobs. If the jobs are
filled at random, find the following:
(a) The probability that both are filled by men.
(b) The probability that both are filled by women.
(c) The probability that one man and one woman are hired.
(d) The probability that the one man and the one woman who are twins are hired.
Answer:
(a) The random variable X, counts the number of men in a sample of two drawn
without replacement from a population of size 13. As probability is the relative
frequency of the event of interest in the sample space, we need to compute a ratio.
The denominator of the ratio counts the number of ways two applicants can be
drawn without replacement from a pool of 13 applicants. Order is not important
here. In the numerator, we compute the number of ways two men can be selected
from the six male applicants and none of the women can be selected from the seven
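female applicants. Then we apply the counting rule, which gives the number of ways
two men can be selected and none of the women can be selected together. The
probability is then $\frac{\binom{6}{2}\binom{7}{0}}{\binom{13}{2}} = 0.192$.
(b) $\frac{\binom{6}{0}\binom{7}{2}}{\binom{13}{2}} = 0.269$.
(c) $\frac{\binom{6}{1}\binom{7}{1}}{\binom{13}{2}} = 0.538$.
(d) $\frac{\binom{1}{1}\binom{1}{1}\binom{5}{0}\binom{6}{0}}{\binom{13}{2}} = 0.013$.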
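The four probabilities in Question-14 can be verified with `math.comb` (Python 3.8+). This is a checking sketch with illustrative variable names; each numerator counts the favourable selections and the denominator counts all ways to choose 2 of the 13 applicants.

```python
from math import comb

men, women, jobs = 6, 7, 2
total = comb(men + women, jobs)  # all unordered pairs of applicants

p_two_men = comb(men, 2) * comb(women, 0) / total
p_two_women = comb(men, 0) * comb(women, 2) / total
p_one_each = comb(men, 1) * comb(women, 1) / total

# (d) The twins: one specific man and one specific woman are chosen,
# and none of the other 5 men or 6 women.
p_twins = comb(1, 1) * comb(1, 1) * comb(5, 0) * comb(6, 0) / total

print(round(p_two_men, 3), round(p_two_women, 3),
      round(p_one_each, 3), round(p_twins, 3))
```

Parts (a), (b), and (c) cover every possible outcome, so their probabilities sum to 1, which is a useful sanity check.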