Chapters 1-4 - Psyc 2021 M – Data Analysis I

EXERCISES Chapter 1
Questions 1 through 4 refer to the following set of scores (X): -3, 0, -5, 2
1. For the data above, what is ΣX + 2?
a. 2
b. -4
c. 12
d. 22
2. For these data, what is ΣX²?
a. 0
b. -12
c. 100
d. 38
3. For the data above, what is Σ(X+1)²?
a. 30
b. 1
c. 42
d. none of above
4. For the data above, what is (ΣX)²?
a. 0
b. 36
c. 115
d. 209
5. 5/9 divided by 2/3 =
a. 3/4
b. 5/6
c. 10/27
d. 7/12
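The summation questions above turn on the order of operations: whether squaring or adding happens before or after summing. A minimal Python sketch for checking hand calculations follows; the variable names are illustrative and not part of the course material.

```python
from fractions import Fraction

# Scores from Questions 1 through 4
X = [-3, 0, -5, 2]

sum_x_plus_2 = sum(X) + 2                       # ΣX + 2: sum first, then add 2
sum_of_squares = sum(x ** 2 for x in X)         # ΣX²: square each score, then sum
sum_shifted_sq = sum((x + 1) ** 2 for x in X)   # Σ(X+1)²: add 1, square, then sum
square_of_sum = sum(X) ** 2                     # (ΣX)²: sum first, then square

# Question 5: dividing by a fraction is multiplying by its reciprocal
quotient = Fraction(5, 9) / Fraction(2, 3)

print(sum_x_plus_2, sum_of_squares, sum_shifted_sq, square_of_sum, quotient)
```

Comparing `sum_of_squares` with `square_of_sum` shows that ΣX² and (ΣX)² are generally different quantities.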
6. Which of the following is NOT measured on a nominal scale?
a. eye colour
b. party affiliation
c. telephone numbers
d. gender
e. none of the above; all are nominal scales
7. Which of the following is measured on an interval scale?
a. height
b. weight
c. temperature
d. time
e. all of the above are interval scales
8. If your data consist of numbers, it is not valid to perform arithmetic operations on
those numbers if your data were derived from a ______ scale.
a. continuous
b. discrete
c. interval
d. nominal
9. A subject’s attitude towards the death penalty is assessed on a scale that consists
of: strongly opposed, somewhat opposed, slightly opposed, neutral, slightly for,
somewhat for, strongly for. Which of the following scales is being used?
a. nominal
b. ordinal
c. interval
d. ratio
10. What are the four measuring scales? State their characteristics and permissible
operations.
Scale          Characteristics          Possible operations
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
EXERCISES Chapter 2
1. If your data have been measured on a nominal scale, the most appropriate way to
display the distribution is by means of _______.
a. a frequency polygon
b. a histogram
c. a bar graph
d. a cumulative frequency curve
2. The largest cumulative frequency in a cumulative frequency distribution = ___.
a. 1.0
b. 100
c. N
d. 100/N
3. The prime purpose of grouping data is _______.
a. to facilitate calculation
b. to reduce the amount of calculation
c. to get an exact representation of all the data
d. to get an overall picture of the whole group
4. The number of groups used _______.
a. is the range divided by the size of the interval
b. depends to some extent on the number of data (N)
c. is mostly between 8 and 12
d. all of the above
5. Some interval sizes are preferred over others because _______.
a. they are more convenient
b. they are traditional
c. they assure more accuracy of representation
d. none of the above; interval size is whatever results if one divides the range
by the number of groups
14. A cumulative frequency graph requires that the data are at least of a(n) ______
scale.
a. nominal
b. ordinal
c. interval
d. ratio
15. The cumulative frequency graph coordinates the cumulative frequency (Y) with
the ______ of the variable (X).
a. lower limit
b. upper limit
c. midpoint
d. any of above
16. In a histogram the X axis can be labelled with _____
a. the exact limits of the intervals
b. the midpoints
c. the apparent limits (whole numbers)
d. any of the above
17. Which are the 3 factors to consider when choosing a suitable way to group data?
(1) ________________; (2) ________________; (3) ___________________
1. A group of 60 people started out on a marathon. The distances (in km) run by the
various people are listed below:
40   4  21  15  22  14  12  19  23  22
19  10  19  15  14  23  25   8  13  13
19  29  20  17  17  21  29  16  24  34
34  31  30  31  22  30   8  25  24  24
26  22  36  23  22  23  21  21  15  11
18   7  38   6  10  13  25  22  22   9
a. Group the data in an appropriate fashion.
b. Graph the data above by using a
- histogram
- frequency polygon
- ogive of Galton
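One possible way to group the marathon data is sketched below in Python. The interval size of 5 km is an assumption made for illustration (it yields 9 groups); the exercise leaves the choice of grouping to the reader, and other interval sizes are equally defensible.

```python
from itertools import accumulate

# Distances (km) from the marathon exercise, in the order listed
distances = [
    40, 4, 21, 15, 22, 14, 12, 19, 23, 22,
    19, 10, 19, 15, 14, 23, 25, 8, 13, 13,
    19, 29, 20, 17, 17, 21, 29, 16, 24, 34,
    34, 31, 30, 31, 22, 30, 8, 25, 24, 24,
    26, 22, 36, 23, 22, 23, 21, 21, 15, 11,
    18, 7, 38, 6, 10, 13, 25, 22, 22, 9,
]

WIDTH = 5  # assumed interval size, not prescribed by the exercise

# Lower apparent limits of the intervals 0-4, 5-9, ..., 40-44
lowest = (min(distances) // WIDTH) * WIDTH
highest = (max(distances) // WIDTH) * WIDTH
lower_limits = list(range(lowest, highest + 1, WIDTH))

# Frequency of each interval (used for the histogram and frequency polygon)
frequencies = [sum(lo <= d < lo + WIDTH for d in distances) for lo in lower_limits]

# Running totals give the cumulative frequencies plotted in an ogive
cumulative = list(accumulate(frequencies))

for lo, f, cf in zip(lower_limits, frequencies, cumulative):
    print(f"{lo:2d}-{lo + WIDTH - 1:2d}  f={f:2d}  cf={cf:2d}")
```

The last cumulative frequency equals N, which is a quick check that no score was lost in the grouping.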
2. Jim decides to improve his French by reading on a regular basis. He sets himself
the goal of reading every day except Fridays and covering an ideal total of 100 pages
per week or a minimum of 80. The number of pages he reads is as follows:
Monday      15 pages
Tuesday     20 pages
Wednesday    5 pages
Thursday    10 pages
Saturday    25 pages
Sunday      15 pages
Use a cumulative frequency distribution to represent the amount of reading that
Jim has achieved throughout the week.
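The running totals for Jim's week can be checked with a short Python sketch. The names `met_minimum` and `met_ideal` are illustrative only; they compare the weekly total with the 80-page minimum and the 100-page ideal stated in the exercise.

```python
from itertools import accumulate

days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Saturday", "Sunday"]
pages = [15, 20, 5, 10, 25, 15]

# Cumulative pages read by the end of each reading day
cumulative_pages = list(accumulate(pages))
for day, total in zip(days, cumulative_pages):
    print(f"{day:<10} {total:3d}")

weekly_total = cumulative_pages[-1]
met_minimum = weekly_total >= 80    # minimum goal
met_ideal = weekly_total >= 100     # ideal goal
```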
3. The ogive of Galton is a highly effective tool in behaviour modification. Graphing
accomplishments in a cumulative fashion for a determined unit of time (e.g. a
week) has a strong motivating effect since progress appears more salient when
made more visible. Suppose a student wants to increase her/his study behaviour to
total a fixed number of hours (e.g. 15 hours) per week. S/he charts the cumulative
study time for those days where studying is possible, leaving out those days where
studying cannot or will not be done because of prior obligations.
The X-axis could look like the one below, or use any other set of days. The upper limit of
each value is 12:00 midnight.
Mon | Tue | Thur | Fri | Sun |
Generate your own data.
EXERCISES Chapter 3
1. The prime purpose of the central tendency is _______
a. to precisely reflect the value of all the data
b. to respect the individuality of each of the data points
c. to reflect the group as a whole
d. all of the above
2. Imagine that a friend of yours has earned 24 credits of A’s, 12 credits of B’s, and
12 credits of C’s. If an A is worth 8 points, a B = 6 points, and a C = 4 points,
what is your friend’s GPA?
a. 6.0
b. 6.5
c. 6.66
d. 7.0
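The GPA question is a weighted mean: each grade's point value counts once per credit. The Python sketch below assumes a C is worth 4 points, which continues the 8, 6 scale and produces one of the listed answer options; the dictionary layout is illustrative.

```python
# (credits earned, points per credit) for each letter grade
# Assumption: a C is worth 4 points
grades = {"A": (24, 8), "B": (12, 6), "C": (12, 4)}

total_credits = sum(credits for credits, _ in grades.values())
total_points = sum(credits * points for credits, points in grades.values())

gpa = total_points / total_credits                                # weighted by credits
unweighted = sum(points for _, points in grades.values()) / len(grades)

print(gpa, unweighted)
```

The unweighted mean ignores that twice as many credits were earned at the A level, which is why it comes out lower.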
3. If a constant is subtracted from all the scores in the distribution _______.
a. the standard deviation will be unchanged
b. the standard deviation will become negative
c. the standard deviation may become negative
d. a constant should also be subtracted from the standard deviation
4. Here is a set of data that describes a population:
101, 104, 106, 107, 108, 109, 110, 112, 115
Determine the following measures. SHOW YOUR WORK.
a. the mean
b. the variance of the population
c. the standard deviation of the population
d. the Sum of Squares (SS)
e. the median
f. the mode
g. the range
5. Subtract 100 from each of the values listed above (if you have not done so previously).
Determine again the measures a through g.
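Hand calculations for Questions 4 and 5 can be checked with the Python sketch below. The helper `describe` is a hypothetical name; the range is taken as highest minus lowest score, which is an assumption, since some texts define the range using the exact limits instead.

```python
import math
import statistics

population = [101, 104, 106, 107, 108, 109, 110, 112, 115]

def describe(values):
    """Return population measures a through g for a list of scores."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)   # Sum of Squares
    variance = ss / n                           # population variance: divide by N
    return {
        "mean": mean,
        "variance": variance,
        "sd": math.sqrt(variance),
        "SS": ss,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # lists every value if none repeats
        "range": max(values) - min(values),     # assumed definition: highest - lowest
    }

original = describe(population)
shifted = describe([x - 100 for x in population])
print(original)
print(shifted)
```

Subtracting the constant moves the mean and median by 100 but leaves SS, the variance, the standard deviation, and the range unchanged.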
6. What is the one advantage of using the inter-quartile range rather than the range?
7. If data are measured on an interval / ratio scale, the usual choice of statistics is
the mean and the standard deviation. Give one reason why a researcher may find
a. that the median and inter-quartile range better describe the data.
___________________________________________________________________
b. that the mode and range better describe the data.
_________________________________________________________________
8. On the basis of examination performance, an instructor identifies the following
subgroups of students:
a. those with a percentile rank of 90 or higher
b. those with a percentile rank of 10 or less
c. those with percentile ranks between 40 and 49
d. those with percentile ranks between 51 and 60
With which subgroup would the instructor choose to work if he was looking for the easiest
way of raising the median of the whole class? Explain.
_____________________________________________________________________
9. Calculate the mean and the standard deviation for the grouped frequency
distribution of examination grades.
Class Interval    f
95 – 99           1
90 – 94           1
85 – 89           3
80 – 84           5
75 – 79          11
70 – 74          15
65 – 69           9
60 – 64           7
55 – 59           5
50 – 54           2
45 – 49           0
40 – 44           1
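For grouped data, each class interval is usually represented by its midpoint. The Python sketch below follows that convention as an assumption and reports the standard deviation both ways (dividing by N and by N - 1), since the exercise does not say whether the grades are treated as a population or a sample.

```python
import math

# (lower apparent limit, upper apparent limit, frequency) from the table above
classes = [
    (95, 99, 1), (90, 94, 1), (85, 89, 3), (80, 84, 5),
    (75, 79, 11), (70, 74, 15), (65, 69, 9), (60, 64, 7),
    (55, 59, 5), (50, 54, 2), (45, 49, 0), (40, 44, 1),
]

n = sum(f for _, _, f in classes)

# Each interval is represented by its midpoint, e.g. (95 + 99) / 2 = 97
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
mean = sum_fx / n

# Sum of squared deviations of the midpoints, weighted by frequency
ss = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes)
sd_population = math.sqrt(ss / n)      # divides by N
sd_sample = math.sqrt(ss / (n - 1))    # divides by N - 1

print(n, round(mean, 2), round(sd_population, 2), round(sd_sample, 2))
```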
10. Using the data above, draw a cumulative frequency distribution; graphically
determine the mean and the inter-quartile range.
11. Describe a situation where all three measures of central tendency will fall together.
12. Which measure of central tendency (mean, median, or mode) would be most
appropriate for describing each of the following sets of data?
Note that what is required to answer these questions is neither memorization nor plugging
numbers into formulas but some general knowledge and the readiness to make use of it.
a) Heart rates for a group of men before going on a 10 km run
b) Amounts of time participants spend solving a brainteaser, with some
participants unable to solve it
c) Religious affiliation of students in a statistics class
d) Heights in cm for a group of girls in a third-grade class
e) Household income of residents in Toronto
EXERCISES Chapter 4
Given a normal distribution of scores with a mean of 60 and s² = 36, use this
information to answer Questions 1 - 3.
1. Approximately what percentage of scores fall below the test score of 50?
a. 45.25%
b. 4.75%
c. 95.25%
d. 91%
2. What is a likely range for these data?
a. 0 to 100
b. 0 to 120
c. 45 to 75
d. there is no way of knowing without additional information
3. What is the z score corresponding to a test score of 50?
a. -1.67
b. +1.67
c. -2.1
d. +2.1
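Questions 1 and 3 rest on the same two steps: take the square root of the variance to get the standard deviation, then convert the raw score to a z score. A minimal Python sketch using the standard library's `statistics.NormalDist` in place of a printed z table:

```python
import math
from statistics import NormalDist

mean = 60
variance = 36
sd = math.sqrt(variance)        # s = 6, because s² = 36

z = (50 - mean) / sd            # z score of a test score of 50
below = NormalDist().cdf(z)     # proportion of the standard normal curve below z

print(round(z, 2), round(100 * below, 2))
```

A printed table is entered with z rounded to two decimals (-1.67), so the table value of 4.75% differs slightly from the unrounded result computed here.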
4. What is the purpose of standardizing scores?
5. Normal distributions can have means and standard deviations of any value.
The standard normal distribution always has a mean of ____ and a standard
deviation of ____. Why?
6. Assuming that the average height (μ) for 14-year-old girls is 1.50 m, and the
standard deviation (σ) is 7.5 cm, calculate the z score for a female who is
a. 1.50 m
b. 1.40 m
c. 1.45 m
d. 1.70 m
e. 1.60 m
7. Calculate the height of a 14-year-old girl whose z score is
a. z = +1
b. z = +2.5
c. z = -.5
d. z = -1.5
e. z = 0
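Questions 6 and 7 are the two directions of the same formula, z = (X - μ) / σ and X = μ + zσ. Note that μ is given in metres and σ in centimetres, so one of them has to be converted first. A short Python sketch, with hypothetical helper names, working in centimetres:

```python
MU_CM = 150.0     # 1.50 m expressed in centimetres
SIGMA_CM = 7.5

def z_score(height_cm):
    """Convert a height in cm to a z score."""
    return (height_cm - MU_CM) / SIGMA_CM

def height_from_z(z):
    """Convert a z score back to a height in cm."""
    return MU_CM + z * SIGMA_CM

for h in (150, 140, 145, 170, 160):
    print(f"{h} cm -> z = {z_score(h):+.2f}")
for z in (1, 2.5, -0.5, -1.5, 0):
    print(f"z = {z:+.1f} -> {height_from_z(z):.2f} cm")
```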
8. Determine the percentage of z-scores in a normal distribution that are ____
a. ≥ +1.25
b. ≤ +1.0
c. ≥ -1.58
d. ≤ +2.10
e. ≤ -0.80
9. Suppose you were asked about the percentage of observations that fall at z = 1.5.
If the question can be answered, what is the answer? If it cannot be answered, why can’t it?
10. Which proportion of observations fall between the z-values of ___
a. 0 and 2.9
b. -.5 and +1.5
c. -2.0 and +1.2
d. -.5 and -1.5
e. +1.3 and +1.8
f. -1.4 and -0.7
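Areas under the standard normal curve can be cross-checked against the z table with the Python sketch below. The three helper functions are illustrative names, and only a few of the sub-questions are shown as examples.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def pct_at_or_below(z):
    """Percentage of the distribution at or below z."""
    return 100 * Z.cdf(z)

def pct_at_or_above(z):
    """Percentage of the distribution at or above z."""
    return 100 * (1 - Z.cdf(z))

def proportion_between(z_low, z_high):
    """Proportion of the distribution between two z values."""
    return Z.cdf(z_high) - Z.cdf(z_low)

print(round(pct_at_or_above(1.25), 2))          # Question 8a
print(round(pct_at_or_below(1.0), 2))           # Question 8b
print(round(proportion_between(-0.5, 1.5), 4))  # Question 10b
```

An area between two z values is always the difference of two cumulative areas, whichever side of the mean the values fall on.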
11. IQ scores are normally distributed with a mean of μ = 100 and a standard
deviation of σ = 15.
a. Find the two IQ scores between which the middle 60% of the distribution fall.
b. The school board wants to establish special classes for the students whose IQ
scores fall into the lower 5%. How low an IQ will a student have to have to be
eligible for such a special class?
c. A person gets an IQ score of 125. What is his/her percentile rank?
d. What is the probability of finding a person with an IQ of 150 or higher?
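The IQ questions combine both directions of the z table: from a proportion to a score, and from a score to a proportion. The Python sketch below uses `statistics.NormalDist` with its inverse function `inv_cdf`; the variable names are illustrative, and results will differ slightly from answers based on a table with rounded z values.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# The middle 60% leaves 20% in each tail
middle_low, middle_high = iq.inv_cdf(0.20), iq.inv_cdf(0.80)

# Score below which the lowest 5% of the distribution fall
cutoff_low_5 = iq.inv_cdf(0.05)

# Percentile rank: percentage of scores at or below 125
percentile_125 = 100 * iq.cdf(125)

# Probability of a score of 150 or higher
p_150_plus = 1 - iq.cdf(150)

print(round(middle_low, 1), round(middle_high, 1))
print(round(cutoff_low_5, 1), round(percentile_125, 1), round(p_150_plus, 5))
```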
12. The average height of five-year-old boys is known to be X̄_b = 109.3 cm with a
standard deviation of s_b = 4.3 cm. The average height for girls is X̄_g = 104.7 cm with a
standard deviation of s_g = 5.8 cm. In a randomly selected group of 100 boys and 100
girls,
a. How many girls will be taller than the average boy?
b. How many boys will be taller than the average girl?
Hint: If this question gives you trouble, it may be because you are trying to take a
short-cut by simply plugging numbers into a formula. Instead, go step by step,
starting out with a rough sketch to visualize what you are looking for. The first
question asks about the number of girls, therefore your first sketch will show the
distribution of girls, marked by the mean of the girls at the centre of the distribution.
The value 109.3 is the average for the boys and in the centre of the boys’
distribution, but in the girls’ distribution it is just an X-value among others. Mark
that value in the girls’ distribution. Where does it fall relative to their mean? Which
region within the distribution are you looking for to find the girls that are taller than
the average boy? In which column will it be found?
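The step-by-step reasoning in the hint can be mirrored in a short Python sketch: locate the other group's mean inside each distribution, then take the area above it. It assumes both height distributions are normal, which the exercise implies but does not state outright.

```python
from statistics import NormalDist

boys = NormalDist(mu=109.3, sigma=4.3)
girls = NormalDist(mu=104.7, sigma=5.8)
GROUP_SIZE = 100

# a. Area of the girls' distribution above the boys' mean
p_girls_taller = 1 - girls.cdf(boys.mean)

# b. Area of the boys' distribution above the girls' mean
p_boys_taller = 1 - boys.cdf(girls.mean)

print(round(GROUP_SIZE * p_girls_taller), round(GROUP_SIZE * p_boys_taller))
```

The two answers do not add up to 100 because each is measured in a different distribution with its own standard deviation.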
13. Sam scored 75% on his English exam, in which the average was 60 with a
standard deviation of 15. He scored 60% on his Russian exam, in which the
average was 50 with a standard deviation of 5. Relative to his classmates, on
which exam did Sam perform better? Why?
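Comparing performance across exams with different means and standard deviations calls for standardized scores. A minimal Python sketch, with `z_score` as an illustrative helper:

```python
def z_score(score, mean, sd):
    """Distance of a score from the class mean, in standard deviation units."""
    return (score - mean) / sd

z_english = z_score(75, 60, 15)
z_russian = z_score(60, 50, 5)

print(z_english, z_russian)
```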
14. John Doe and Tom Smith are two friends taking the same course. The class is a
very large one and students are graded on a curve, i.e. relative to each other. Five
tests are written with the following results:
John Doe     85   40   60   80   35
Tom Smith    65   45   60   65   65
Class μ      65   50   55   70   50
σ            20    5   10   20    5
John Doe calculates that his average score is the same as that of his friend Tom, i.e.
X̄_JD = X̄_TS = 300/5 = 60, and that therefore they should be getting the same grade.
However, the instructor gives a grade to Tom that is higher than the one he gives to
John. How does the instructor justify the difference in grades?
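One way to explore this question is to convert every test score to a z score using that test's class mean and standard deviation, then average the z scores. The Python sketch below does this; averaging z scores is an assumption about how grading on a curve might be carried out, not a method stated in the exercise.

```python
john = [85, 40, 60, 80, 35]
tom = [65, 45, 60, 65, 65]
class_mean = [65, 50, 55, 70, 50]
class_sd = [20, 5, 10, 20, 5]

def z_scores(scores):
    """Standardize each test score against that test's class mean and sd."""
    return [(x - m) / s for x, m, s in zip(scores, class_mean, class_sd)]

john_z = z_scores(john)
tom_z = z_scores(tom)
john_mean_z = sum(john_z) / len(john_z)
tom_mean_z = sum(tom_z) / len(tom_z)

print(john_z, john_mean_z)
print(tom_z, tom_mean_z)
```

Raw totals are equal for both students, yet their standing relative to the class differs from test to test.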