CLABE Statistics
Homework assignment - Problem sheet 1
1.
A supervisor of a plant kept records of the time (in seconds) that employees needed to complete
a particular task. The data are summarized as follows:
Time        30-40   40-50   50-60   60-80   80-100   100-150
Frequency   10      15      20      30      24       20
(a) Graph the data with a histogram.
(b) Discuss possible errors occurring if the frequency is used instead of the density.
SOLUTION
(a) The histogram is the following:
[Histogram: Time on the horizontal axis (class limits 30, 40, 50, 60, 80, 100, 150), Density on the vertical axis (ticks 0.003, 0.008, 0.013, 0.017)]
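(b) In this case frequencies cannot be used in place of densities because the class intervals
have different widths. If the bar height is the frequency, the area of a wide class (for example
100-150) overstates its share of the data. The wrong histogram is plotted below.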
[Wrong histogram: Time on the horizontal axis (class limits 30, 40, 50, 60, 80, 100, 150), relative Frequency on the vertical axis (ticks 0.084, 0.126, 0.168, 0.252)]
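The distinction between frequency and density in Problem 1 can be checked numerically. The following is a minimal Python sketch, not part of the original solution; the variable names are illustrative. It computes density as relative frequency divided by class width, so that the area of each bar equals the relative frequency of its class.

```python
# Class limits and absolute frequencies from the table in Problem 1.
limits = [30, 40, 50, 60, 80, 100, 150]
freqs = [10, 15, 20, 30, 24, 20]

n = sum(freqs)  # total number of observations

# Relative frequency and width of each class.
rel_freqs = [f / n for f in freqs]
widths = [hi - lo for lo, hi in zip(limits[:-1], limits[1:])]

# Density = relative frequency / class width (bar AREA = relative frequency).
densities = [rf / w for rf, w in zip(rel_freqs, widths)]

for lo, hi, rf, d in zip(limits[:-1], limits[1:], rel_freqs, densities):
    print(f"{lo:>3}-{hi:<3}  rel.freq = {rf:.3f}  density = {d:.4f}")

# Sanity check: the total area under a density histogram is 1.
total_area = sum(d * w for d, w in zip(densities, widths))
print(f"total area = {total_area:.3f}")
```

The class 60-80 has the largest relative frequency but not the largest density, which is exactly the distortion the wrong histogram introduces.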
2.
You want to buy a new house and, for this reason, last week you visited 10 different flats which
are on sale. All the flats are located in the same area and are similar in size and other
basic characteristics. The prices (X, in thousands of euros) of the flats you visited are given
below
176, 153, 215, 185, 168, 197, 159, 162, 181, 160
(a) Calculate the mean and the standard deviation of the data.
(b) Provide the five-number summary of the data.
(c) Represent the data in a histogram with the following values as delimiters for the interval
classes: 150; 170; 190; 220.
(d) You know that Y = aX + b, where a and b are two positive constants. Calculate the values
of a and b in the case where ȳ = 489 and s_Y = 48.59.
SOLUTION
(a) The mean is x̄ = 175.6, the sample variance is s_X^2 = 377.82 and the standard deviation
is s_X = 19.44.
(b) The data, in increasing order, are
153 159 160 162 168 176 181 185 197 215
and therefore the five-number summary is given by
min = 153
Q1 = (159 + 160)/2 = 159.5
Me = Q2 = (168 + 176)/2 = 172
Q3 = (185 + 197)/2 = 191
Max = 215.
(c) The histogram is given below
[Histogram: price on the horizontal axis (class limits 150, 170, 190, 220), Density on the vertical axis (ticks 0.0067, 0.0150, 0.0250)]
(d) Since s_Y = |a| s_X, we have 48.59 = |a| × 19.44 and, therefore, |a| = 2.5. It is said in the
text of the problem that a is positive, so that a = 2.5. Furthermore, ȳ = a x̄ + b, that is,
489 = 2.5 × 175.6 + b and b = 489 − 2.5 × 175.6 = 50.
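Parts (a) and (d) of Problem 2 can be reproduced with Python's standard `statistics` module. This is a sketch under the same assumptions as the solution: the sample standard deviation uses the divisor n − 1, and a is positive. The variable names are illustrative. Because the code does not round s_X to 19.44 first, a and b come out close to, but not exactly, 2.5 and 50.

```python
import statistics

# Prices of the 10 flats, in thousands of euros (Problem 2).
prices = [176, 153, 215, 185, 168, 197, 159, 162, 181, 160]

x_bar = statistics.mean(prices)       # sample mean
var_x = statistics.variance(prices)   # sample variance, divisor n - 1
s_x = statistics.stdev(prices)        # sample standard deviation

# Given in part (d): mean and standard deviation of Y = a*X + b.
y_bar, s_y = 489, 48.59

# s_Y = |a| * s_X and a > 0, so a = s_Y / s_X; then b = y_bar - a * x_bar.
a = s_y / s_x
b = y_bar - a * x_bar

print(f"mean = {x_bar:.1f}, variance = {var_x:.2f}, sd = {s_x:.2f}")
print(f"a = {a:.2f}, b = {b:.1f}")
```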
3.
A sample of students at a local high school were asked what their plans were after graduation.
Possible responses were college (C), military (M), work (W), and other (O). The following data
were collected.
M C C W O C W W W C
C C M W O O M W O O
M W W C C C C C C C
(a) Create a frequency table for the data.
(b) Create a bar chart for the data. Use relative frequency for the vertical axis.
(c) What proportion of students in the sample plan to enter college after graduation?
(d) What proportion of students in the sample plan to either work or enter the military after
graduation?
SOLUTION
(a) The frequency table for the data is as follows
Plans after graduation
category   freq.   rel.freq
C          13      0.43
M           4      0.13
O           5      0.17
W           8      0.27
Total      30      1.00
(b) The bar chart for the data is as follows
[Bar chart: Plans after graduation (C, M, O, W) on the horizontal axis, relative frequency on the vertical axis (ticks 0.0 to 0.5)]
(c) The proportion of students in the sample who plan to enter college after graduation is equal
to 0.43.
(d) The proportion of students in the sample who plan to either work or enter the military after
graduation is equal to 0.4.
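The frequency table and the two proportions of Problem 3 can be tallied with `collections.Counter`. This is a minimal illustrative sketch; the variable names are not from the original sheet.

```python
from collections import Counter

# The 30 responses from Problem 3.
responses = (
    "M C C W O C W W W C "
    "C C M W O O M W O O "
    "M W W C C C C C C C"
).split()

counts = Counter(responses)   # absolute frequencies per category
n = len(responses)

# Relative frequencies, in alphabetical order of category.
rel = {cat: counts[cat] / n for cat in sorted(counts)}

for cat in sorted(counts):
    print(f"{cat}  {counts[cat]:>2}  {rel[cat]:.2f}")

# (c) proportion planning college; (d) proportion planning work or military.
p_college = rel["C"]
p_work_or_military = rel["W"] + rel["M"]
print(f"college: {p_college:.2f}, work or military: {p_work_or_military:.2f}")
```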
4.
The following data set contains the number of calories for chicken sandwiches at Burger King.
670, 920, 1160, 340, 570, 750, 930, 770, 970, 260, 370, 310, 800, 450, 520, 630
(a) Find the range of this data set.
(b) Find the five-number summary.
(c) Find the interquartile range.
(d) Construct a boxplot.
SOLUTION
(a) The range of the given data set is
R = 1160 − 260 = 900;
(b) The five-number summary is: min = 260, Q1 = (370 + 450)/2 = 410, Me = (630 + 670)/2 = 650,
Q3 = (800 + 920)/2 = 860, Max = 1160;
(c) The interquartile range is IQR = 860 − 410 = 450.
(d) The boxplot of the data is given below
[Boxplot: Calories axis with ticks 400, 600, 800, 1000]
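The quartiles in these solutions appear to follow one particular convention: take position p(n + 1) in the sorted data and, when that position is not an integer, average the two neighbouring order statistics. That rule is inferred from the worked answers, not stated in the sheet, and library quantile functions may use other conventions. A Python sketch of it, with the hypothetical helper name `quartile`:

```python
import math

def quartile(sorted_data, p):
    """Value at 1-based position p*(n+1); if the position is not an integer,
    return the average of the two neighbouring order statistics.
    Assumes n is large enough that the position lies between 1 and n."""
    n = len(sorted_data)
    pos = p * (n + 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return (sorted_data[lo - 1] + sorted_data[hi - 1]) / 2

# Calories of the chicken sandwiches (Problem 4).
calories = [670, 920, 1160, 340, 570, 750, 930, 770,
            970, 260, 370, 310, 800, 450, 520, 630]
data = sorted(calories)

five_number = (data[0], quartile(data, 0.25), quartile(data, 0.5),
               quartile(data, 0.75), data[-1])
data_range = data[-1] - data[0]
iqr = five_number[3] - five_number[1]

print("five-number summary:", five_number)
print("range:", data_range, "IQR:", iqr)
```

The same helper applied to the sorted prices of Problem 2 gives 159.5, 172 and 191, matching that solution.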
5.
A researcher in an alcoholism treatment center, interested in summarizing the length of stay
in the center for first-time patients, randomly selects 14 records of individuals institutionalized
within the previous two years. The lengths of stay in the center, in days, are as follows
7, 6, 6, 3, 8, 5, 4, 5, 7, 5, 4, 5, 6, 6, 7.
(a) Create a frequency table for the data;
(b) provide a suitable graphical representation of the data;
(c) calculate the sample mean and variance of the data.
SOLUTION
(a) The frequency table of the data is given below.
Length of stay
category   freq.   rel.freq   cum. freq.
3          1       0.071      0.071
4          2       0.142      0.214
5          3       0.214      0.428
6          4       0.286      0.714
7          3       0.214      0.928
8          1       0.071      1.00
Total      14      1.00
(b) These data can be suitably represented by means of a barplot for discrete variables
[Barplot: Length of stay on the horizontal axis (ticks 2 to 9), Relative freq. on the vertical axis (ticks 0.07, 0.14, 0.21, 0.29)]
(c) If we denote by X the variable of interest then x̄ = 5.64, s_X^2 = 1.94 and s_X = 1.39.
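The mean and variance in part (c) can be computed directly from the frequency table of part (a). The sketch below is illustrative only: it takes the table in the solution (14 records) as its input and uses the sample variance with divisor n − 1, which is what the reported value 1.94 corresponds to.

```python
from itertools import accumulate

# Frequency table from the solution of Problem 5: value -> frequency.
table = {3: 1, 4: 2, 5: 3, 6: 4, 7: 3, 8: 1}

n = sum(table.values())                    # number of records
rel = [f / n for f in table.values()]      # relative frequencies
cum = list(accumulate(rel))                # cumulative relative frequencies

# Mean and sample variance computed from grouped counts.
mean = sum(x * f for x, f in table.items()) / n
var = sum(f * (x - mean) ** 2 for x, f in table.items()) / (n - 1)
sd = var ** 0.5

for (x, f), r, c in zip(table.items(), rel, cum):
    print(f"{x}  {f}  {r:.3f}  {c:.3f}")
print(f"mean = {mean:.2f}, variance = {var:.2f}, sd = {sd:.2f}")
```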
6.
The owner of a bakery is interested in investigating the purchase behavior of his customers. For
this purpose, 12 receipts issued in June are randomly extracted. The amounts in euro of the
selected receipts are as follows
4.05, 6.60, 3.45, 17.70, 14.25, 1.95, 10.20, 8.10, 8.70, 10.80, 5.55, 1.95
(a) Calculate the mean and the standard deviation of the data.
(b) Calculate the median and the interquartile range of the data.
(c) Represent the data in a histogram with the following values as delimiters for the interval
classes: 1; 2; 5; 9; 11; 20.
(d) The amount of every receipt includes a tax computed as 21% of the net value of the
receipt. You decide to carry out the analysis with respect to the net amounts. Compute
points (a) and (b) above with respect to this new set of data.
SOLUTION
(a) The mean, the variance and the standard deviation are
x̄ = 7.78, s_X^2 = 23.93 and s_X = 4.89,
respectively.
(b) If we write the data points in increasing order we obtain the sequence
1.95 1.95 3.45 4.05 5.55 6.60 8.10 8.70 10.20 10.80 14.25 17.70
and it follows that the first, second and third quartiles are Q1 = (3.45 + 4.05)/2 = 3.75,
Q2 = (6.60 + 8.10)/2 = 7.35 and Q3 = (10.20 + 10.80)/2 = 10.5 respectively. Hence, the
median of the data is Me = 7.35 whereas the interquartile range is IQR = 10.5 − 3.75 = 6.75.
(c) The required histogram is represented below; note that the density of every bin can be read
from the vertical axis.
[Histogram of receipt values: euro on the horizontal axis (class limits 1, 2, 5, 9, 11, 20), density on the vertical axis (ticks 0.019, 0.056, 0.083, 0.167)]
(d) Let Y denote the receipt net values. Then Y = (1/1.21) × X is a linear transformation of X so
that
• ȳ = (1/1.21) × x̄ = 6.43;
• s_Y^2 = (1/1.21)^2 × s_X^2 = 16.34;
• s_Y = (1/1.21) × s_X = 4.04.
It is also straightforward to see that multiplying every data point by a positive constant does
not change the ordering of the values and, therefore, the median and the interquartile
range of Y can be computed by multiplying the corresponding quantities of X by 1/1.21.
Hence
• the median of Y is (1/1.21) × 7.35 = 6.07;
• the interquartile range of Y is (1/1.21) × 6.75 = 5.58.
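The scaling rules used in part (d) can be verified by computing the statistics of the net amounts directly and comparing them with the rescaled statistics of the gross amounts. The following Python sketch is illustrative; the names `gross`, `net` and `c` are not from the original sheet.

```python
import statistics

# Gross receipt amounts in euro (Problem 6).
gross = [4.05, 6.60, 3.45, 17.70, 14.25, 1.95,
         10.20, 8.10, 8.70, 10.80, 5.55, 1.95]

c = 1 / 1.21                     # net = gross / 1.21 (21% tax on the net value)
net = [c * x for x in gross]

# Statistics computed directly on the net amounts.
mean_net = statistics.mean(net)
var_net = statistics.variance(net)   # sample variance, divisor n - 1
sd_net = statistics.stdev(net)
median_net = statistics.median(net)

# The same quantities obtained by rescaling the gross statistics.
mean_scaled = c * statistics.mean(gross)
var_scaled = c ** 2 * statistics.variance(gross)
sd_scaled = c * statistics.stdev(gross)
median_scaled = c * statistics.median(gross)

print(f"mean = {mean_net:.2f}, variance = {var_net:.2f}, "
      f"sd = {sd_net:.2f}, median = {median_net:.2f}")
```

The mean, standard deviation and median scale by c, while the variance scales by c squared.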
7.
Consider the following sample of five values and corresponding weights:
xi   4.6   3.2   5.4   2.6   5.2
wi   8     3     6     2     5
(a) Calculate the arithmetic mean of the xi values without weights.
(b) Calculate the weighted mean of the xi values.
SOLUTION
(a) The arithmetic mean is equal to 4.2;
(b) The weighted mean is equal to 110/24 = 4.583.
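Both means of Problem 7 follow from their definitions: the arithmetic mean is the sum of the xi divided by their number, and the weighted mean is the sum of wi × xi divided by the sum of the wi. A short illustrative Python sketch:

```python
# Values and weights from Problem 7.
x = [4.6, 3.2, 5.4, 2.6, 5.2]
w = [8, 3, 6, 2, 5]

# (a) Arithmetic mean: every value counts equally.
arithmetic_mean = sum(x) / len(x)

# (b) Weighted mean: each value counts in proportion to its weight.
weighted_sum = sum(xi * wi for xi, wi in zip(x, w))
weighted_mean = weighted_sum / sum(w)

print(f"arithmetic mean = {arithmetic_mean:.1f}")
print(f"weighted mean   = {weighted_mean:.3f}")
```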
8.
Let X be the age of the students enrolled in an on-line macroeconomics course. The ages of a
sample of 12 students are
21 22 27 36 18 19 22 23 22 28 36 33
(a) Calculate the mean, the median and the modal age.
(b) Calculate the mean of the following variables:
i) Y = X/(40 − X);
ii) W = X/40 − X.
SOLUTION
(a) The mean is x̄ = 25.58. The data points, in increasing order, are
18 19 21 22 22 22 23 27 28 33 36 36
and it is easy to see that Me = (22 + 23)/2 = 22.5 whereas the modal age is 22.
(b) Since Y is NOT a linear transformation of X, in order to compute the mean of Y one has to
apply the transformation to every data point to obtain that ȳ = 2.91. On the other hand,
W is a linear transformation of X and therefore w̄ = x̄/40 − x̄ = −24.94.
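The contrast between the two transformations in Problem 8 can be seen numerically. The Python sketch below is illustrative, with names not taken from the original sheet. It shows that plugging x̄ into the nonlinear formula for Y gives a value quite different from ȳ, whereas for the linear W the shortcut and the point-by-point computation agree.

```python
import statistics

# Ages of the 12 students (Problem 8).
ages = [21, 22, 27, 36, 18, 19, 22, 23, 22, 28, 36, 33]

x_bar = statistics.mean(ages)
median_age = statistics.median(ages)
modal_age = statistics.mode(ages)

# Y = X / (40 - X) is nonlinear: transform every data point, then average.
y_bar = statistics.mean([x / (40 - x) for x in ages])
# Plugging the mean into the formula does NOT give the mean of Y.
y_plug_in = x_bar / (40 - x_bar)

# W = X/40 - X is linear, so the shortcut through x_bar is valid.
w_bar_direct = statistics.mean([x / 40 - x for x in ages])
w_bar_shortcut = x_bar / 40 - x_bar

print(f"mean = {x_bar:.2f}, median = {median_age}, mode = {modal_age}")
print(f"y_bar = {y_bar:.2f}, plug-in value = {y_plug_in:.2f}")
print(f"w_bar = {w_bar_direct:.2f} (direct), {w_bar_shortcut:.2f} (shortcut)")
```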