STA-2023 Statistics for Business, Supplementary Exercises

STA-2023
Supplementary Exercises
(Revised: Oct. 2013)
Chapter 2: Descriptive Statistics
1. Consider the annual dividend yields (in dollars) for a random sample
of 30 stocks:
20.5, 15.4, 16.9, 13.4, 8.8, 19.5, 12.7, 7.8, 14.3, 22.1,
15.6, 5.4, 23.3, 19.2, 20.8, 24.1, 17.0, 11.8, 9.2, 12.6,
9.9, 28.6, 18.4, 16.8, 15.9, 27.8, 21.9, 15.2, 12.0, 5.1
a) Complete the following frequency table:
Class Boundaries | Frequency | Rel. Freq. (3 dec.) | Perc. Freq.
4.95 – 8.95      |           |                     |
8.95 – 12.95     |           |                     |
12.95 – 16.95    |           |                     |
16.95 – 20.95    |           |                     |
20.95 – 24.95    |           |                     |
24.95 – 28.95    |           |                     |
b) How many stocks had a dividend yield more than $20.95?
c) What percent of stocks had a dividend yield less than $12.95?
d) What are the boundaries of the modal class?
e) Construct a percent frequency histogram.
f) Use your calculator’s statistical functions to compute the sample
mean and sample standard deviation (round to 2 decimals).
g) Determine the Five Number Summary.
h) Identify the outliers, if any, in this data set. Justify your answer.
Hint: Calculate the lower and upper fences.
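As a cross-check for part (a), the tally can be sketched in Python. This is only a sketch, not part of the exercise: the names `yields`, `boundaries` and `table` are illustrative, and the code assumes the 30 yields and the class boundaries exactly as listed above.

```python
# Dividend yields from exercise 1 (30 stocks).
yields = [
    20.5, 15.4, 16.9, 13.4, 8.8, 19.5, 12.7, 7.8, 14.3, 22.1,
    15.6, 5.4, 23.3, 19.2, 20.8, 24.1, 17.0, 11.8, 9.2, 12.6,
    9.9, 28.6, 18.4, 16.8, 15.9, 27.8, 21.9, 15.2, 12.0, 5.1,
]

# Class boundaries from the table in part (a).
boundaries = [4.95, 8.95, 12.95, 16.95, 20.95, 24.95, 28.95]

n = len(yields)
table = []
for lower, upper in zip(boundaries, boundaries[1:]):
    # Boundaries carry two decimals and the data one, so no value sits on a boundary.
    freq = sum(1 for y in yields if lower < y < upper)
    rel = round(freq / n, 3)           # relative frequency, 3 decimals
    pct = round(100 * freq / n, 1)     # percent frequency
    table.append((lower, upper, freq, rel, pct))

for lower, upper, freq, rel, pct in table:
    print(f"{lower:5.2f} - {upper:5.2f}  {freq:2d}  {rel:.3f}  {pct:5.1f}%")
```

The frequency column should add up to the sample size, which is a quick way to catch a tallying mistake.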
2. The PFA-100 is a medical device that measures the platelet function
of subjects by simulating the clotting process of the blood. It requires
only a small blood sample and the results are reported in less than
five minutes as “closure times”. The following data are the closure
times (in seconds) for 25 healthy individuals:
105, 92, 80, 94, 113, 97, 95, 79, 106, 88, 83, 99, 95,
91, 117, 76, 114, 102, 93, 95, 84, 108, 93, 106, 87
a) Construct a Stem & Leaf diagram for this data set. Specify the
units for the stems and leaves
b) What were the shortest and longest closure times?
c) What was the most frequent closure time?
d) How many subjects had a closure time less than 90 seconds?
e) What percent of these 25 subjects had closure times of at least 100
seconds?
f) Use your calculator’s statistical functions to compute the sample
mean and sample standard deviation (round to the nearest integer)
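A stem & leaf diagram like the one in part (a) can be checked with a short Python sketch. It assumes stems in tens of seconds and leaves in single seconds; the function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict

# Closure times (seconds) from exercise 2.
closure_times = [
    105, 92, 80, 94, 113, 97, 95, 79, 106, 88, 83, 99, 95,
    91, 117, 76, 114, 102, 93, 95, 84, 108, 93, 106, 87,
]

def stem_and_leaf(values):
    """Return {stem: sorted leaves}, with stem = tens and leaf = ones."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(closure_times)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:2d} | {leaves}")
```

Reading the rows from top to bottom gives the sorted data, so the minimum, maximum and mode can be read directly from the diagram.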
3. Troponin-I is an enzyme related to muscle activity. An immunochemistry test based on Troponin-I showed the following results
(ng/dL) for 30 subjects with chest pain admitted to the emergency
room of a hospital:
3.1, 4.3, 2.5, 1.7, 5.2, 3.4, 2.0, 3.6, 6.5, 4.8,
1.8, 2.9, 1.6, 3.1, 5.4, 3.9, 3.8, 2.3, 4.3, 4.5,
3.7, 6.9, 2.2, 1.5, 1.8, 5.7, 2.8, 5.0, 2.8, 2.6
a) Construct a stem & leaf plot for this data set. Specify the measurement
units for the stems and leaves.
b) What were the lowest and highest Troponin-I measurements?
c) What’s the class interval with the highest frequency?
d) How many subjects had a Troponin-I level of 5.0 (ng/dL) or above?
e) What percent of subjects had Troponin-I level below 3.0 ng/dL?
f) Obtain the median based on the stem & leaf plot
4. Determine the Five Number Summary for the weights of 21 male
FIU students. Identify any outlier that may exist using the lower and
upper fences.
121, 173, 157, 165, 170, 161, 142, 171, 184, 115, 172, 159, 187, 166,
158, 163, 145, 196, 172, 130, 171
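The Five Number Summary and the fences can be sketched in Python as follows. One assumption needs flagging: quartiles are computed here as the medians of the lower and upper halves of the sorted data (leaving out the overall median when n is odd), which is the rule many graphing calculators use. Other software may report slightly different quartiles. The function names are illustrative.

```python
from statistics import median

# Weights of 21 male students from exercise 4.
weights = [121, 173, 157, 165, 170, 161, 142, 171, 184, 115, 172,
           159, 187, 166, 158, 163, 145, 196, 172, 130, 171]

def five_number_summary(data):
    """Min, Q1, median, Q3, max using the median-of-halves rule (an assumption)."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]     # values below the median position
    upper = xs[-half:]    # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def fences(q1, q3):
    """Lower and upper fences: 1.5 * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

lo, q1, q2, q3, hi = five_number_summary(weights)
lf, uf = fences(q1, q3)
outliers = [w for w in weights if w < lf or w > uf]
print(lo, q1, q2, q3, hi, lf, uf, outliers)
```

Any observation below the lower fence or above the upper fence is flagged as an outlier; the same two functions can be reused for the Table 1 data sets.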
5. The blood type for twenty adult subjects was recorded (see data
below).
a) Construct a freq. table and bar graph for this categorical data set
b) What is the most frequent blood type?
c) How many subjects have a blood type different from “O”?
d) What percent of subjects have either blood type A or B?
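For categorical data such as blood types, a frequency table amounts to counting labels. A minimal Python sketch, assuming the twenty blood types listed in Table 1 (the variable names and the text bar graph are illustrative only):

```python
from collections import Counter

# Blood types of the twenty subjects, as listed in Table 1.
blood_types = ["A", "O", "B", "B", "A", "AB", "O", "A", "A", "B",
               "AB", "A", "AB", "B", "B", "A", "AB", "O", "A", "B"]

counts = Counter(blood_types)
n = len(blood_types)

# Frequency table with a crude text bar graph, most frequent type first.
for blood_type, freq in counts.most_common():
    bar = "#" * freq
    print(f"{blood_type:>2}  {freq:2d}  {100 * freq / n:5.1f}%  {bar}")

not_o = n - counts["O"]                                   # part (c)
pct_a_or_b = 100 * (counts["A"] + counts["B"]) / n        # part (d)
```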
6. Twenty one biology majors conducted an experiment during a Bio
lab class. Each student measured their own rock’s density by
weighing the rock (in grams) and then dropping the rock into a
cylinder with a known amount of water to record the volume, in
milliliters, of the rock (see data below, Table 1).
a) Use your calculator’s statistical capabilities to find the mean and
standard deviation
b) Obtain the Five Number Summary
c) Determine the presence of outliers by calculating the upper and
lower fences. Justify your answer
7. Sixteen FIU students determined their own Body Mass Index (BMI)
(see data below, Table 1). Weight and height measurements were
used by each student to calculate his/her BMI
a) Use your calculator’s statistical capabilities to find the mean and
standard deviation
b) Obtain the Five Number Summary
c) Determine the presence of outliers by calculating the upper and
lower fences. Justify your answer
8. Each of 23 biology scholars blew Carbon Dioxide into a tube
containing the phenol red indicator to turn the solution from red to
yellow (i.e. basic to acidic). Each student then added their own piece
of Elodea into the solution, closed the tube, covered it with foil and
let it sit for 10 minutes. Once the 10 minutes had passed, the pH of
the solution was recorded, the foil removed and the tube placed
directly in the light for 10 minutes. Once the 10 minutes had elapsed,
the pH was recorded again (see data below, Table 1).
a) Obtain the Five Number Summary for the pH measurements
before and after light effect
b) Determine the presence of outliers for each data set by calculating
the upper and lower fences. Justify your answer
c) Use your calculator’s statistical capabilities to find the mean and
standard deviation
d) Do you observe a substantive difference in the location of center
before and after light effect? Explain
9. The pulse rates before and after exercise were determined for 15 FIU
science majors (see data below, Table 1)
a) Obtain the Five Number Summary for the pulse rates before and
after exercise
b) Determine the presence of outliers for each data set by calculating
the upper and lower fences. Justify your answer
c) Do you observe a substantive difference in the location of center
before and after exercise based on the medians?
Table 1
BMI:
21.6, 23.0, 19.2, 20.4, 27.4, 21.6, 20.1, 27.1, 23.6, 23.0, 19.3, 27.1,
22.6, 27.3, 22.1, 24.7
Rock Density:
1.58, 1.80, 1.81, 1.50, 1.09, 2.11, 1.20, 1.80, 2.06, 1.90, 2.15, 2.43,
1.75, 1.70, 2.02, 3.85, 1.15, 3.10, 1.60, 2.69, 1.40
pH Before:
5.04, 5.10, 5.07, 5.23, 5.25, 5.60, 5.34, 4.93, 5.33, 4.95, 5.65, 5.61,
5.54, 5.19, 5.49, 5.28, 5.01, 5.01, 4.90, 4.87, 4.69, 4.91, 5.33
pH After:
6.15, 6.28, 6.35, 6.47, 6.55, 6.13, 5.68, 6.23, 6.26, 6.09, 6.44, 6.25,
6.40, 6.54, 5.99, 6.04, 6.21, 5.88, 6.02, 6.23, 6.21, 6.04, 6.07
Pulse Before (Pulse B):
88, 92, 96, 72, 80, 78, 74, 72, 88, 57, 62, 83, 63, 67, 67
Pulse After (Pulse A):
184, 192, 168, 124, 140, 172, 130, 136, 176, 147, 126, 172, 127, 119, 119
Blood Type:
A, O, B, B, A, AB, O, A, A, B, AB, A, AB, B, B, A, AB, O, A, B
Chapter 3: Probability
1. Consider the random experiment consisting of rolling a fair die twice
(Remember the 36 outcome sample space)
(a) List all sample points included in the following events:
A: the sum of the two rolls is less than four
B: the sum of the two rolls is greater than nine
C: observing the same result for both rolls
D: the sum of the two rolls is an even number
(b) Find P(A), P(Bᶜ), P(B ∩ C) and P(A ∪ D), where Bᶜ denotes the
complement of B.
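Because the 36 outcomes are equally likely, each probability is a count of sample points divided by 36. The enumeration can be sketched in Python as below; the set names mirror the events A to D, and `prob` is a helper introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first roll, second roll).
sample_space = list(product(range(1, 7), repeat=2))

A = {o for o in sample_space if sum(o) < 4}        # sum less than four
B = {o for o in sample_space if sum(o) > 9}        # sum greater than nine
C = {o for o in sample_space if o[0] == o[1]}      # same result on both rolls
D = {o for o in sample_space if sum(o) % 2 == 0}   # even sum

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(sample_space))

p_a = prob(A)
p_not_b = 1 - prob(B)       # complement rule
p_b_and_c = prob(B & C)     # intersection
p_a_or_d = prob(A | D)      # union

print(sorted(A), p_a, p_not_b, p_b_and_c, p_a_or_d)
```

Using `Fraction` keeps the results exact, so they can be compared directly with hand-counted answers such as 3/36.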
2. A survey was conducted on 200 cars at a given Miami dealership.
The survey involved two questions:
• Type of Transmission
  Automatic (A)
  Manual (M)
• Car Model
  Sedan: 4 doors (F)
  Coupe: 2 doors (T)
  Hatchback: 3 doors (H)
The results of the survey are summarized in the following contingency
table where the rows represent the type of transmission and columns the
car model:
      F    T    H
A    64   46   36
M    16   10   28
a) If a car is chosen at random, among the 200 cars included in the
study, find the probability of the following events:
• Choosing an Automatic car
• Choosing a Sedan car
• Choosing a car that is both Automatic and Sedan
• Choosing a car that is either Automatic or Sedan
b) Find the following conditional probabilities
• P(M | H)
• P(H | M)
c) Are the events Manual transmission (M) and Hatchback model (H)
mutually exclusive? Explain.
d) Are the events Manual transmission (M) and Hatchback model (H)
independent? Justify your answer numerically.
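The conditional probabilities and the independence check can be sketched in Python from the table counts. The nested dictionary `counts` and the helper names are illustrative, and independence is tested with the multiplication rule P(M ∩ H) = P(M)·P(H), which is equivalent to comparing P(M | H) with P(M).

```python
# Counts from the contingency table: counts[transmission][model].
counts = {
    "A": {"F": 64, "T": 46, "H": 36},
    "M": {"F": 16, "T": 10, "H": 28},
}
total = sum(sum(row.values()) for row in counts.values())

def p_transmission(t):
    """Marginal probability of a transmission type (row total / grand total)."""
    return sum(counts[t].values()) / total

def p_model(m):
    """Marginal probability of a car model (column total / grand total)."""
    return sum(row[m] for row in counts.values()) / total

def p_joint(t, m):
    """Probability of a transmission type and a car model together."""
    return counts[t][m] / total

p_m_given_h = p_joint("M", "H") / p_model("H")
p_h_given_m = p_joint("M", "H") / p_transmission("M")

# Mutually exclusive events would share no cars at all.
mutually_exclusive = counts["M"]["H"] == 0
# Independent events satisfy P(M and H) = P(M) * P(H).
independent = abs(p_joint("M", "H") - p_transmission("M") * p_model("H")) < 1e-9

print(p_m_given_h, p_h_given_m, mutually_exclusive, independent)
```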
3. A group of 200 files in a medical clinic classifies the patients by
gender and by type of diabetes (I or II). The grouping is shown as
follows. The table gives the number of patients in each classification:
      F    M
I    70   50
II   40   40
If one patient is chosen at random from the 200 files,
a) What is the probability that the patient has Type I diabetes?
b) What is the probability that the patient is male?
c) What is the probability that the patient chosen is male or the patient
chosen has Type I diabetes?
d) What is the probability that the patient is male, given the patient has
Type I diabetes?
Chapter 4: Probability Distributions
Part I: Discrete random variables
1. The random variable “X” designates the number of weekly
breakdowns for a photocopy machine. The table below describes the
probability distribution for “X”:
X       0     1     2     3     4
P(X)   .20   .40   .25   .10   .05
a) Find the probability of the following events for any randomly chosen
week and interpret the result for each part:
• Observing fewer than 2 machine breakdowns
• Observing at most 2 machine breakdowns
• Observing at least 3 machine breakdowns
• Observing exactly 4 machine breakdowns
b) Compute and interpret the mean of “X”
c) How many machine breakdowns are expected over a 25-week
period? Justify your answer numerically
d) Compute the Standard Deviation of “X”. Interpret the result
e) Draw a point probability graph for this problem
f) Find P(X < μ – σ)
g) Calculate the probability that “X” is more than two standard
deviations above the mean.
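The mean and standard deviation of a discrete random variable follow from μ = Σ x·P(x) and σ² = Σ (x − μ)²·P(x). A minimal Python sketch using the table above (the dictionary `pmf` and the helper `prob` are names chosen for this illustration):

```python
from math import sqrt

# Probability distribution of X, the number of weekly breakdowns.
pmf = {0: 0.20, 1: 0.40, 2: 0.25, 3: 0.10, 4: 0.05}

mean = sum(x * p for x, p in pmf.items())                    # mu = sum of x * P(x)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # sigma squared
sd = sqrt(variance)

def prob(condition):
    """Add up P(x) over all values x that satisfy the condition."""
    return sum(p for x, p in pmf.items() if condition(x))

p_fewer_than_2 = prob(lambda x: x < 2)
p_at_most_2 = prob(lambda x: x <= 2)
p_at_least_3 = prob(lambda x: x >= 3)
p_below_one_sd = prob(lambda x: x < mean - sd)       # part (f)
p_above_two_sd = prob(lambda x: x > mean + 2 * sd)   # part (g)
expected_25_weeks = 25 * mean                        # part (c)

print(mean, sd, p_below_one_sd, p_above_two_sd, expected_25_weeks)
```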
2. Assume that 30% (hypothetical figure) of all students at public state
universities belong to ethnic minorities.
a) Find the probability of observing the following events among a
random sample of 15 public state university students:
• Exactly four belong to ethnic minorities (use the binomial
formula and your calculator)
• At most 6 belong to ethnic minorities (use the binomial table)
• At least 8 belong to ethnic minorities (use the binomial table)
• Fewer than 12 but more than 8 belong to ethnic minorities
(use the binomial table)
b) What is the expected number of students in this sample that
belong to ethnic minorities? Justify your answer numerically.
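The binomial formula P(X = k) = C(n, k)·pᵏ·(1 − p)ⁿ⁻ᵏ and the cumulative table values can be reproduced with a few lines of Python. This is a sketch with illustrative function names; it uses only `math.comb` from the standard library.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k), the quantity tabulated in a cumulative binomial table."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 15, 0.30
p_exactly_4 = binom_pmf(4, n, p)
p_at_most_6 = binom_cdf(6, n, p)
p_at_least_8 = 1 - binom_cdf(7, n, p)                  # complement of X <= 7
p_9_to_11 = binom_cdf(11, n, p) - binom_cdf(8, n, p)   # 8 < X < 12
expected = n * p                                       # mean of a binomial

print(p_exactly_4, p_at_most_6, p_at_least_8, p_9_to_11, expected)
```

Note how "at least 8" and "more than 8 but fewer than 12" are rewritten in terms of cumulative probabilities, exactly as when reading a binomial table.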
3. A local newspaper claims that 60% of the items advertised in its
classifieds section are sold within a week of the first appearance of
the ad.
a) If 25 advertised items from last month are randomly selected, find
the probability of observing the following events:
• Fewer than 20 were sold within a week
• More than 10 were sold within a week
• Between 10 and 20 inclusive were sold within a week
b) What is the expected number of items sold within a week among
random samples of 25 items? Justify your answer numerically
4. Suppose the number of errors on income tax forms processed by an
accounting firm averages 2.5 per month, and assume that the errors
occur at random and independently of one another. What is the
probability that during the first month of the next tax season the
following events will be observed for this accounting firm?
• No errors (use the Poisson formula)
• At most two errors (use the Poisson formula)
• Between 1 and 3 errors inclusive (use the Poisson formula)
5. Students arrive at a given booth of the GC food court at an average
rate of 3.2 people per minute. Assuming that student arrivals follow a
Poisson process,
a) Find the probability of observing the following events in a given
period of one minute:
• Exactly 3 students arrive (use the Poisson formula)
• At least 2 students arrive (use the Poisson formula)
• Fewer than 3 students arrive (use the Poisson formula)
b) Find the probability of observing the same events in a two minute
period
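The Poisson formula P(X = k) = e^(−λ)·λᵏ / k! can be sketched in Python as below. The key step in part (b) is that the Poisson mean scales with the length of the interval, so a two-minute period uses λ = 2 × 3.2 = 6.4. Function and variable names are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

rate_per_minute = 3.2

def booth_probabilities(minutes):
    lam = rate_per_minute * minutes   # mean number of arrivals in the period
    return {
        "exactly_3": poisson_pmf(3, lam),
        "at_least_2": 1 - poisson_cdf(1, lam),   # complement of X <= 1
        "fewer_than_3": poisson_cdf(2, lam),     # X <= 2
    }

one_minute = booth_probabilities(1)
two_minutes = booth_probabilities(2)
print(one_minute)
print(two_minutes)
```

Results computed this way can differ in the third decimal from hand calculations that round e^(−λ) early.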
Chapter 4: Probability Distributions
Part II: Continuous random variables
1. The IQ score of the 5th grade children population in a school district
has a normal distribution with a mean 105 and standard deviation 10.
a) Draw the normal curve of IQ scores for this population
b) If a child is chosen at random from this population, find the
probability of observing the following events. For each part, graph
the probability as the associated area under the normal curve of IQ
scores.
• An IQ score less than 90
• An IQ score of 130 or higher
• An IQ score between 84 and 128
c) What is the IQ score (rounded to the nearest whole number) that has
90% of the defined population below it? Show the graph supporting
your solution.
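Normal probabilities are areas under the curve, which the standard library class `statistics.NormalDist` can evaluate directly. The sketch below is one way to check table-based answers; results may differ slightly from a z table because the table rounds z to two decimals. Variable names are illustrative.

```python
from statistics import NormalDist

# IQ scores: normal with mean 105 and standard deviation 10.
iq = NormalDist(mu=105, sigma=10)

p_below_90 = iq.cdf(90)                   # area to the left of 90
p_130_or_more = 1 - iq.cdf(130)           # area to the right of 130
p_84_to_128 = iq.cdf(128) - iq.cdf(84)    # area between 84 and 128

# Score with 90% of the population below it (the 90th percentile).
iq_90th_percentile = round(iq.inv_cdf(0.90))

print(p_below_90, p_130_or_more, p_84_to_128, iq_90th_percentile)
```

The same pattern (cdf for areas, inv_cdf for percentiles) applies to the other normal-distribution exercises in this part.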
2. A big tech company measures the emotional intelligence of job
applicants with a particular test. Last year the scores had a mean
of 85 points and a standard deviation of 6. If the distribution of
scores was Normal (bell shaped),
a) What percent of candidates exceeded 80 points?
b) What percent of candidates scored between 75 and 90 points?
c) A candidate accepted for a given position fell at the 96th percentile
of the classification. What was the candidate score (round to one
decimal)?
3. The time it takes to drain a fully charged battery in a typical smart
phone has a mean of 8.2 hours with a standard deviation of 1.1
hours. Assuming that this time is normally distributed,
a) Find the probability of observing the following events for a fully
charged smart phone:
• The battery is drained in under 7 hours
• The battery is drained after more than 9 hours
• The battery is drained in between 7 and 9 hours
b) Determine the number of hours (rounded to one decimal) that
limits the top 25% of batteries.
4. The fuel consumption of a Boeing 787 aircraft in cruising mode
averages 3210 gallons per hour. Assume that the consumption is
normally distributed with a standard deviation of 180 gallons per
hour. If a Boeing 787 is in cruising mode, what is the probability (as
a percentage) the fuel consumption is
a. Less than 3400 gallons per hour?
b. More than 3000 gallons per hour?
c. Between 3100 and 3200 gallons per hour?
Chapter 6: Estimation with Confidence Intervals
1. Assume that data from Chapter 2 exercises #6-9 come from simple
random samples. Estimate the population mean for each problem using a
95% confidence level. What assumption is required for the validity of the
procedure used?
2. A survey was conducted on home gardening in Miami-Dade County. To
that end a random sample of 65 households with backyard gardens was
selected. The sample mean size of the gardens was 579 square feet with a
sample standard deviation of 148 square feet.
a) Determine a point estimate for the mean size μ of all Miami-Dade
backyard home gardens.
b) Find a 90% confidence interval (rounded to the nearest integer) for
the population mean μ and interpret the result in the context of the
problem.
c) Find a 99% confidence interval (rounded to the nearest integer) for μ.
Compare the width of this interval estimate to the result from part (b)
d) Discuss the precision of the two previous interval estimates
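A confidence interval for a mean has the form x̄ ± (critical value)·s/√n. The Python sketch below assumes the large-sample z critical value is appropriate because n = 65; a course that uses the t distribution here would get a slightly wider interval. The function name `z_interval_mean` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_mean(xbar, s, n, confidence):
    """Large-sample z interval for a population mean (assumes n is large)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * s / sqrt(n)                             # margin of error
    return xbar - margin, xbar + margin

ci_90 = z_interval_mean(579, 148, 65, 0.90)
ci_99 = z_interval_mean(579, 148, 65, 0.99)

width_90 = ci_90[1] - ci_90[0]
width_99 = ci_99[1] - ci_99[0]
print(ci_90, ci_99, width_90, width_99)
```

Comparing the two widths shows the trade-off asked about in parts (c) and (d): higher confidence requires a larger margin of error.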
3. A poll asked people in a given city whether they intend to vote for a
given candidate in a political race. Among a random sample of 613
registered voters, 324 expressed support for this candidate.
a) Determine a point estimate (rounded to 3 decimals) for “p”, the
proportion of all registered voters in this city that support the given
candidate.
b) Find a 98% confidence interval (rounded to 3 decimals) for “p”
and interpret the result in the context of the problem.
c) If the sample size is increased, what will be the effect on the
precision of our interval estimate? Explain.
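For a proportion, the interval is p̂ ± z·√(p̂(1 − p̂)/n). A minimal Python sketch under the usual large-sample assumption (the function name is illustrative, and results may differ in the third decimal from hand calculations that round p̂ or z first):

```python
from math import sqrt
from statistics import NormalDist

def z_interval_proportion(successes, n, confidence):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n                                # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)           # margin of error
    return p_hat, p_hat - margin, p_hat + margin

p_hat, lower, upper = z_interval_proportion(324, 613, 0.98)
print(round(p_hat, 3), lower, upper)

# Same data, four times the sample size: the margin of error is halved.
_, lower_4n, upper_4n = z_interval_proportion(4 * 324, 4 * 613, 0.98)
```

The last line illustrates part (c): because n sits under a square root, quadrupling the sample size cuts the width of the interval in half.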
4. Consider a study investigating the physiological changes that
accompany laughter. Ninety six randomly selected students from
some college watched film clips designed to evoke laughter. During
the laughing period, the researchers measured the heart rate (beats
per minute) of each subject, with the following results: sample
mean = 75.3 and sample std. dev. = 6.2. Obtain a 97% confidence
interval (rounded to one decimal) for the population mean μ of all
students at this college.
5. Suppose that a random sample of twenty runners among the world
class category of marathon runners was chosen. Their best time over
the last year was obtained, resulting in a sample mean of 134.6
minutes with a sample standard deviation of 5.4 minutes.
a) Obtain a 95% confidence interval (rounded to one decimal) for
the population mean μ of all runners in this class.
b) What will be the impact on the interval’s precision if the sample
size is cut in half? Explain
6. The proportion of students who use a cell phone on college
campuses across the country has increased tremendously over the
past few years. Four hundred students were selected at random from
a large university with 350 of them indicating the use of cell phones
on campus. Find a 94% confidence interval (rounded to 3 decimals)
for “p”, the proportion of students who use a cell phone on this
university campus, and interpret the result in the context of the
problem.
Chapter 7: Hypothesis Testing based on a single sample
For the following problems conduct the five step procedure of hypothesis
testing discussed in class
1. Using sample data from exercise #1, chapter 2, test the hypothesis that
the mean yield of all income producing stocks is higher than $15. Use a
significance level α = 0.05
2. Using sample data from exercise #2, chapter 2, test the hypothesis that
mean PFA closure time of the healthy population is less than 100. Use a
significance level α = 0.01
3. Using sample data from exercise #3, chapter 2, test the hypothesis that
mean Troponin-I level of the chest pain patient population differs from 3.0
ng/dL. Use a significance level α = 0.10
4. Using sample data from exercise #7, chapter 2, test the hypothesis that
mean BMI of the FIU student population is less than 25. Use a
significance level α = 0.05
5. Using sample data from exercise #3, chapter 6, test the hypothesis
that a majority of registered voters in this city support the given
candidate. Use a significance level α = 0.01
6. The average retail price for tomatoes in Miami-Dade County last
year was $1.45 per pound. A recent study considered a sample of 14
different supermarkets in this county that gave an average of $1.54
with a standard deviation of 9 cents.
a) Do these data provide sufficient evidence at the 5% significance
level to conclude that the current mean price for tomatoes in
Miami-Dade is higher than last year?
b) Determine the p-value for this test and interpret the result.
7. A recent poll of 603 randomly selected youngsters from a given city
revealed that 62 of them are currently drug users. (Note: figures are
hypothetical.)
a) Do these data provide sufficient evidence at the 1% significance level
to conclude that the current percentage of drug users among the
given youngster population is different from 11%?
b) Calculate and interpret the p-value for this test.
8. A study claims that the nationwide average annual tuition for private
high schools is less than $7000. A random sample of 55 private high
schools had an average annual tuition of $6625 and a sample
standard deviation of $1210. Test the claim at a 10% level of
significance. Make the conclusion and find the p-value for the test.
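The test statistic and p-value for a large-sample test about a mean can be sketched in Python as follows. The sketch assumes the z distribution is used because n = 55 is large; the function name and the `alternative` argument are conventions invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, s, n, mu0, alternative):
    """Large-sample z test; alternative is 'less', 'greater' or 'two-sided'."""
    ts = (xbar - mu0) / (s / sqrt(n))    # test statistic
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(ts)                  # left-tail area
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(ts)              # right-tail area
    else:
        p_value = 2 * (1 - std_normal.cdf(abs(ts)))   # both tails
    return ts, p_value

# Exercise 8: Ha: mu < 7000, alpha = 0.10.
ts, p_value = z_test_mean(6625, 1210, 55, 7000, "less")
reject = p_value < 0.10
print(round(ts, 2), round(p_value, 4), reject)
```

Rejecting when the p-value is below α leads to the same decision as comparing the test statistic with the rejection region.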
9. A random sample of 175 students is taken from a large university on
the West coast to estimate the proportion of students whose parents
bought a car for them when they left for college. When interviewed,
91 students in the sample responded that their parents bought them a
car.
a) Do these data provide sufficient evidence at the 4% significance level
to conclude that the proportion of all students from this West Coast
University whose parents bought a car for them when they left for
college is less than 55%?
b) Calculate and interpret the p-value for this test.
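For a test about a proportion, the standard error uses the hypothesized value p₀ rather than p̂. A minimal Python sketch of a left-tailed test, with an illustrative function name and under the usual large-sample assumption:

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion_less(successes, n, p0):
    """Left-tailed large-sample z test of Ho: p >= p0 against Ha: p < p0."""
    p_hat = successes / n
    ts = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0
    p_value = NormalDist().cdf(ts)                # left-tail area
    return p_hat, ts, p_value

# Exercise 9: 91 of 175 students, Ha: p < 0.55, alpha = 0.04.
p_hat, ts, p_value = z_test_proportion_less(91, 175, 0.55)
reject = p_value < 0.04
print(round(p_hat, 2), round(ts, 2), round(p_value, 4), reject)
```

The p-value is the probability, computed assuming p = 0.55, of a sample proportion at least as small as the one observed.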
Answers to selected supplementary exercises
Chapter 2: Descriptive Statistics
1. b) 6 stocks; c) 33.3%; d) $12.95 - $16.95;
f) x̄ = $16.07, s = $6.06;
g) Min = 5.10, Q1 =12.00, Q2 = 15.75, Q3 = 20.50,
Max = 28.60;
h) No Outliers
2. b) Min = 76 sec, Max = 117 sec; c) Mode = 95 sec;
d) 7 subjects; e) 32%; f) x̄ = 95.7 sec, s = 11.2 sec
3. b) Min = 1.5, Max = 6.9; c) Modal class: 2.0 – 2.9; d) 6 subjects;
e) 43.3%; f) Median = 3.25
4. a) Min = 115, Q1 = 151, Q2 = 165, Q3 = 172, Max = 196;
b) Fences: LF = 119.5, UF = 203.5; c) One outlier at the lower
end = 115 pounds
5. b) Blood type A; c) 17 subjects; d) 65%
6. a) x̄ = 1.94, s = 0.66
b) Min = 1.09, Q1 = 1.54, Q2 = 1.80, Q3 = 2.13,
Max = 3.85
c) Fences: LF = 0.65, UF = 3.02; Two outliers at the upper end:
3.10 and 3.85
7. a) x̄ = 23.1, s = 2.9
b) Min = 19.2, Q1 = 21.0, Q2 = 22.8, Q3 = 25.9,
Max = 27.4
c) Fences: LF = 13.65, UF = 33.25 (1.5 × IQR = 1.5 × 4.9 = 7.35); No outliers
Chapter 3: Probability
1. a) A = { (1,1), (1,2), (2,1) }
B = { (4,6), (5,5), (5,6), (6,4), (6,5), (6,6) }
C = { (1,1), (2,2), (3,3), (4,4), (5,5), (6,6) }
b) P(A) = 3/36 = .083
P(Bᶜ) = 30/36 = .833
P(B ∩ C) = 2/36 = .056
P(A ∪ D) = 20/36 = .556
2. a) P(A) = .73, P(F) = .40, P(A ∩ F) = .32, P(A ∪ F) = .81
b) P( M | H ) = .44, P( H | M ) = .52
c) No, because 28 cars are both Manual and Hatchback, so the
intersection is not empty
d) No, because P(M | H) ≠ P(M) [.44 ≠ .27]
3. a) P(I) = .60; b) P(M) = .45; c) P(M ∪ I) = 0.80; d) P(M | I) = 0.42
Chapter 4, Part I: Discrete random variables
1. a) P(X < 2) = .60, P(X ≤ 2) = .85, P(X ≥ 3) = .15,
P(X = 4) = .05
b) μ = E(x) = 1.4
c) 35 machine breakdowns
d) σ = √1.14 = 1.07
f) P(X < μ – σ) = P(X < 0.33) = P(X = 0) = 0.20
g) P(X > μ + 2σ) = P(X > 3.54) = P(X = 4) = 0.05
2. a) P(X = 4) = .219; P(X ≤ 6) = .869, P(X ≥ 8) = .050,
P(8 < X < 12) = .015
b) E(x) = 4.5
3. a) P(X < 20) = .971, P(X > 10) = .966,
P(10 ≤ X ≤ 20) = .978
b) E(x) = 15
4. P(X = 0) = .082, P(X ≤ 2) = .543, P(1 ≤ X ≤ 3) = .676
5. a) P(X = 3) = .223, P(X ≥ 2) = .829, P(X < 3) = .380
b) P(X = 3) = .074, P(X ≥ 2) = .987, P(X < 3) = .047
Chapter 4, Part II: Continuous random variables
1. b) P(X < 90) = .0668, P(X ≥ 130) = .0062,
P(84 ≤ X ≤ 128) = .9714
c) IQ score = 118
2. a) P(X > 80) = 79.67%
b) P(75 ≤ X ≤ 90) = 74.92%
c) Test score = 95.5
3. a) P(X < 7) = .1379, P(X > 9) = .2327,
P(7 ≤ X ≤ 9) = 0.6294
b) Drain time = 8.9 hours
4. a) P(X < 3400) = 85.54%
b) P(X > 3000) = 87.90%
c) P(3100 ≤ X ≤ 3200) = 20.52%
Chapter 6: Estimation
2. a) Point estimate: x̄ = 579 ft²
b) 90% C.I. for µ: (549 , 609) ft²
c) 99% C.I. for µ: (532 , 626) ft²
d) For a given sample size, as the confidence level increases the
interval estimate becomes wider and less precise due to a larger
margin of error
3. a) Point estimate = 0.529 or 52.9%
b) Interval estimate for “p”: (0.482 , 0.576) or (48.2% , 57.6%)
c) For a given confidence level, as the sample size increases the
margin of error decreases making the interval estimate
narrower and more precise
4. Confidence limits for µ: 75.3 +/- 1.4 or (73.9 , 76.7)
5. a) 95% confidence limits for µ: 134.6 +/- 2.5 or
(132.1 , 137.1)
b) For a given confidence level, as the sample size decreases the
margin of error increases making the interval estimate wider
and less precise
6. Confidence limits for “p”: 0.875 +/- 0.031 or
(0.844 , 0.906)
Chapter 7: Hypothesis testing
1. Ho: μ ≤ 15 Ha: μ > 15
RR = { TS > 1.645 }
TS = 0.97
Decision: Fail to reject Ho
Conclusion: Insufficient evidence to conclude that the mean yield
of all income producing stocks is higher than $15.
2. Ho: μ ≥ 100 Ha: μ < 100
RR = { TS < -2.492 }
TS = -1.93
Decision: Fail to reject Ho
Conclusion: Insufficient evidence to conclude that the mean PFA
closure time of the healthy population is less than 100 seconds.
3. Ho: μ = 3.0 Ha: μ ≠ 3.0
RR = { TS > 1.645 or TS < -1.645 }
TS = 1.92
Decision: Reject Ho
Conclusion: Sufficient evidence to conclude that the mean
Troponin-I level of the chest pain patient population differs from
3.0 ng/dL
4. Ho: μ ≥ 25 Ha: μ < 25
RR = { TS < -1.753 }
TS = -2.66
Decision: Reject Ho
Conclusion: Sufficient evidence to conclude that the mean BMI
of the FIU student population is less than 25.
5. Ho: p ≤ 0.50 Ha: p > 0.50
RR = { TS > 2.33 }
TS = 1.43
Decision: Fail to reject Ho
Conclusion: Insufficient evidence to conclude that the majority
of all registered voters in this city support the given candidate.
6. Ho: μ ≤ 1.45 Ha: μ > 1.45
RR = { TS > 1.771 }
TS = 3.75
Decision: Reject Ho
Conclusion: Sufficient evidence to conclude that the current
mean price of tomatoes is higher than last year
7. Ho: p = 0.11 Ha: p ≠ 0.11
RR = { TS < -2.575 or TS > 2.575 }
TS = -0.55
Decision: Fail to reject Ho
Conclusion: Insufficient evidence to conclude that the population
proportion of drug users differs from 11%
8. Ho: μ ≥ 7000 Ha: μ < 7000
RR = { TS < -1.28 }
TS = -2.30
Decision: Reject Ho
Conclusion: Sufficient evidence to conclude that the nationwide
average annual tuition for private high schools is less than $7000
9. Ho: p ≥ 0.55 Ha: p < 0.55
RR = { TS < -1.75 }
TS = -0.80
Decision: Fail to reject Ho
Conclusion: Insufficient evidence to conclude that the proportion
of all students from this West Coast University whose parents
bought a car when they left for college is less than 55%