Preparatory questions for Quiz-1 on Statistics course.
Spring semester, 2016
Instructor: Elchin Rashidov
Department: International School of Economics
Azerbaijan State University of Economics
1) You have conducted a survey and asked 8 students at the university how many
absences they received last semester. Their responses were:
10, 0, 8, 0, 7, 3, 10, and 2.
What are the mean, standard deviation and coefficient of variation for this data?
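The arithmetic behind Question 1 can be sketched with Python's standard `statistics` module. This is a minimal illustration, assuming the eight responses are treated as a sample (so the standard deviation divides by n - 1); the variable names are illustrative and not part of the course material.

```python
import statistics

# Responses from Question 1 (absences reported by 8 students).
absences = [10, 0, 8, 0, 7, 3, 10, 2]

mean = statistics.mean(absences)
# Sample standard deviation (divides by n - 1); treating the data as a
# sample is an assumption of this sketch.
std_dev = statistics.stdev(absences)
# Coefficient of variation: standard deviation relative to the mean, in percent.
coeff_var = std_dev / mean * 100

print(f"mean = {mean}, s = {std_dev:.4f}, CV = {coeff_var:.2f}%")
```

If the eight students were regarded as the whole population instead, `statistics.pstdev` would be the matching call.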
2) The operations manager of a plant that manufactures tires wants to compare the actual
inner diameters of two grades of tires, each of which is expected to be 575
millimeters. A sample of five tires of each grade was selected, and the results
representing the inner diameters of the tires, ranked from smallest to largest, are as
follows:
Grade X: 568 570 575 578 584
Grade Y: 573 574 575 577 578
For each of the two grades of tires, compute the mean, median, and standard
deviation. Which grade of tire is providing better quality? Explain.
3) Chebyshev’s theorem is used to approximate the proportion of observations for any
data set, regardless of the shape of the distribution. Assume that a distribution has a
mean of 200 and standard deviation of 20.
a) Approximately what proportion of the observations is between 140 and 260?
b) Approximately what proportion of the observations is between 170 and 230?
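As a hedged sketch of the Chebyshev calculation in Question 3: for an interval that extends k standard deviations on each side of the mean (k > 1), at least 1 - 1/k^2 of the observations fall inside it. The helper name `chebyshev_lower_bound` is hypothetical.

```python
def chebyshev_lower_bound(mean, std_dev, low, high):
    """Minimum proportion of observations inside a symmetric interval [low, high]."""
    # k is the half-width of the interval measured in standard deviations.
    k = (high - mean) / std_dev
    if abs((mean - low) / std_dev - k) > 1e-12 or k <= 1:
        raise ValueError("interval must be symmetric about the mean with k > 1")
    return 1 - 1 / k**2

print(chebyshev_lower_bound(200, 20, 140, 260))  # k = 3
print(chebyshev_lower_bound(200, 20, 170, 230))  # k = 1.5
```

The result is a lower bound ("at least this proportion"), not an exact proportion, because the theorem makes no assumption about the shape of the distribution.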
4) Consider a population of 8000 mutual funds that primarily invest in large
companies. You have determined that the mean one-year total percentage return
achieved by all the funds is 12 and that the standard deviation is 2.
According to the Chebyshev rule, how many funds are expected to be within one
standard deviation of the mean?
5) What does the Z score measure? Calculate the Z scores for the following data set:
10, 0, 3, 10, 7, 3, 0, and 7.
6) What are the differences among the mean, median, and mode, and what are the
advantages and disadvantages of each?
7) How do the empirical rule and the Chebyshev rule differ?
8) How do the covariance and the coefficient of correlation differ?
9) Explain what is meant by “correlation is not causation.” Give an example.
10) The management at a small manufacturing plant has noticed that the price of steel has
increased significantly over the past several years. Looking over their records, they
find that over the four-year period, prices have increased by 40%. They expect this
same trend in prices for next year. In budgeting for next year, by how much should
they expect prices to increase?
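One way to approach Question 10 is the geometric mean rate of change. The sketch below assumes that the 40% figure is the total increase over the four years and that prices compound at a constant yearly rate; both are readings of the question, not facts stated beyond it.

```python
# Total growth over the period, as stated in Question 10.
total_increase = 0.40
years = 4

# Geometric mean rate: the constant yearly rate that compounds to the total.
annual_rate = (1 + total_increase) ** (1 / years) - 1

print(f"{annual_rate:.2%}")
```

Dividing 40% by 4 would overstate the yearly rate, because it ignores compounding.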
11) Consider a population of 8000 mutual funds that primarily invest in large
companies. You have determined that the mean one-year total percentage return
achieved by all the funds is 12 and that the standard deviation is 2.
According to the Chebyshev rule, at least 93.75% of these funds are expected to have
one-year total returns between what two amounts?
12) Consider a population of 50,000 applicants applying to the State Student Admission
Committee (SSAC) to become a bachelor student.
You have conducted a mock-exam with sample applicants, and determined that the
distribution is normal, the mean score is 400 and the standard deviation is 100.
According to the empirical rule, what percentage of these applicants are expected to
score between 500 and 600?
13) Consider a population of 50,000 applicants applying to the State Student Admission
Committee (SSAC) to become a bachelor student.
You have conducted a mock-exam with sample applicants, and determined that the
distribution is normal, the mean score is 400 and the standard deviation is 100.
The Ministry of Finance can fund the tuition of 8000 students only. SSAC has hired
you as a consultant to help them in setting the exam passing score for applicants to
qualify for the free tuition (i.e. funded by the government). How would you set the
exam passing score for the free tuition? Show your calculations.
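The cutoff in Question 13 can be sketched with `statistics.NormalDist` from the Python standard library. The sketch assumes the mock-exam distribution (mean 400, standard deviation 100) applies to all 50,000 applicants; variable names are illustrative.

```python
from statistics import NormalDist

applicants = 50_000
funded_places = 8_000
scores = NormalDist(mu=400, sigma=100)

# Share of applicants that can be funded, and the score exceeded by that share.
funded_share = funded_places / applicants
cutoff = scores.inv_cdf(1 - funded_share)

print(f"funded share = {funded_share:.2f}, passing score = {cutoff:.1f}")
```

The empirical rule gives nearly the same answer by hand: about 16% of a normal distribution lies more than one standard deviation above the mean, which points to a passing score of roughly 500.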
14) Consider a population of 50,000 applicants applying to the State Student Admission
Committee (SSAC) to become a bachelor student.
You have conducted a mock-exam with sample applicants, and determined that the
distribution is normal, the mean score is 400 and the standard deviation is 100.
Universities demand 33000 students only. SSAC has hired you as a consultant to help
them in setting the exam passing score to meet that demand. How would you set the
exam passing score to meet that demand? Show your calculations.
15) Consider a population of 50,000 applicants applying to the State Student Admission
Committee (SSAC) to become a bachelor student.
You have conducted a mock-exam with sample applicants, and determined that the
distribution is normal, the mean score is 400 and the standard deviation is 100. SSAC
decided to set the passing score at 300. How many applicants will qualify to become
students?
16) Consider a population of 50,000 applicants applying to the State Student Admission
Committee (SSAC) to become a bachelor student.
You have conducted a mock-exam with sample applicants, and determined that the
distribution is normal, the mean score is 400 and the standard deviation is 100.
The government has decided to provide a stipend of 10,000 USD to students scoring
more than 600. Ministry of Finance employs you to estimate the yearly budget of this
stipend for these students. Perform the job and show your calculations.
17) The annual percentage returns on two stocks over a 7-year period were as follows:
Stock A: 4.% 10% 15% -10% -20% 8.% 5%
Stock B: 6% 4% 3% 7% 8% 6% 5%
a) Compare the means and standard deviations of these two population distributions.
b) Compute an appropriate measure for determining the average yearly growth for both
stocks. Which stock has a greater return (rate of growth)?
c) Compute an appropriate measure of dispersion for both stocks to measure the
volatility of both stocks. Which stock is more volatile?
18) One of the students has recorded the number of students eating lunch at the
University dining hall on the ground floor over 110 days last semester. These data are
presented below.
# of Students       # of Days
>160 but < 190      11
>190 but < 220      27
>220 but < 250      42
>250 but < 280      23
>280 but < 310      7
a) Estimate the number of lunches eaten there by students during the semester.
b) Assuming there are 500 students in the university, how many lunches, on average,
does each student eat in the University dining hall during the semester?
19) One of the students has recorded the number of students eating lunch at the
University dining hall on the ground floor over 110 days last semester. These data are
presented below.
# of Students       # of Days
>160 but < 190      11
>190 but < 220      27
>220 but < 250      42
>250 but < 280      23
>280 but < 310      7
The owner of the dining hall hires you as a consultant to report the mean and
standard deviation of the number of students showing up for lunch each day. Show your calculations.
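For grouped data such as the lunch counts in Questions 18 and 19, the mean and standard deviation can be approximated from class midpoints. The sketch below assumes every day in a class is represented by that class's midpoint and uses the sample (n - 1) form of the variance; the names are illustrative.

```python
import math

# Class limits and day counts from the frequency table.
classes = [(160, 190), (190, 220), (220, 250), (250, 280), (280, 310)]
days = [11, 27, 42, 23, 7]

# Assumption: each day in a class is represented by the class midpoint.
midpoints = [(low + high) / 2 for low, high in classes]
n = sum(days)

# Estimated total number of lunches over all recorded days.
total_lunches = sum(f * m for f, m in zip(days, midpoints))
mean = total_lunches / n
# Sample variance for grouped data (divides by n - 1).
variance = sum(f * (m - mean) ** 2 for f, m in zip(days, midpoints)) / (n - 1)
std_dev = math.sqrt(variance)

print(f"n = {n}, total = {total_lunches:.0f}, mean = {mean:.2f}, s = {std_dev:.2f}")
```

The same `total_lunches` figure is the estimate asked for in Question 18(a); dividing it by the 500 students gives the per-student average in 18(b).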
20) Suppose you pass the admission test organized by the State Student Admission
Committee (SSAC) to become a bachelor student. You have been told that your score
is at the 84th percentile. Calculate the score. Assume a normal distribution with a mean
score of 400 and a standard deviation of 100.
21) Suppose you pass the admission test organized by the State Student Admission
Committee (SSAC) to become a bachelor student. There are 50,000 applicants. SSAC
informs you that your score is 600. How many applicants performed worse than you
did? Assume a normal distribution with a mean score of 400 and a standard
deviation of 100.
22) In general, which of the covariance and the sample correlation coefficient is a more
useful measure of the relationship between the two variables? Explain.
23) The probability that a student eats lunch at the University dining is 0.32. The
probability that a student is female is 0.62. The probability that a student eats lunch
at the University dining and is female is 0.21. What is the probability that a randomly
chosen student either eats at the dining or is female?
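Question 23 is an application of the general addition rule, P(A or B) = P(A) + P(B) - P(A and B). A minimal sketch with illustrative variable names:

```python
p_lunch = 0.32    # P(eats at the dining hall)
p_female = 0.62   # P(female)
p_both = 0.21     # P(eats at the dining hall and female)

# General addition rule: subtract the overlap so it is not counted twice.
p_either = p_lunch + p_female - p_both

print(round(p_either, 2))
```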
24) In a recent survey about Russian policy in Syria, 62 % of the respondents said that
they support Russian policy in Syria. Females comprised 53% of the sample, and of
the females, 46% supported Russian policy in Syria.
A person is selected at random. What is the probability that the person we select is
female and supports Russian policy in Syria?
25) Are the events “does not support Russian policy in Syria” and “female” statistically
independent? Why or why not?
26) Suppose we select a supporter of Russian policy in Syria, what is the probability that
the person we select is female?
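Questions 24 to 26 rest on the multiplication rule, the definition of conditional probability, and the definition of independence. The sketch below works only from the three figures given in Question 24; the variable names are illustrative.

```python
p_support = 0.62                # P(supports)
p_female = 0.53                 # P(female)
p_support_given_female = 0.46   # P(supports | female)

# Multiplication rule: P(female and supports).
p_female_and_support = p_female * p_support_given_female

# Conditional probability: P(female | supports).
p_female_given_support = p_female_and_support / p_support

# Independence check: the events are independent only if
# P(does not support | female) equals P(does not support).
p_not_support = 1 - p_support
p_not_support_given_female = 1 - p_support_given_female
independent = abs(p_not_support_given_female - p_not_support) < 1e-9

print(round(p_female_and_support, 4), round(p_female_given_support, 4), independent)
```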
Quiz 2
QUESTIONS SET #1
1. Discuss the normal distribution, give an example of such a distribution, and explain why you think it might
be a normal distribution; discuss and illustrate what happens when you change mean and standard
deviation.
2. Explain the standardized normal random variable, apply the transformation formula, and give an example.
3. Discuss and give an example of the use of the table for the cumulative standardized normal distribution.
4. Explain the sampling distribution in general and the sampling distribution of the mean and give an
example.
5. Explain the unbiased property of the sample mean and give an example.
6. Explain standard error of the mean and give an example.
7. Explain the central limit theorem and illustrate graphs.
PROBLEMS SET
1. The amount of time you have to wait at a particular stoplight is uniformly distributed between zero and
two minutes.
a. What is the probability that you have to wait between 15 and 45 seconds for the light?
b. Eighty percent of the time, the light will change before you have to wait how long?
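For a uniform distribution, probabilities are proportional to interval length. The sketch below expresses the stoplight problem in seconds (two minutes taken as 120 seconds); the helper names are hypothetical.

```python
# Waiting time in seconds, uniform on [low, high].
low, high = 0.0, 120.0

def uniform_prob(a, b):
    """P(a < X < b) for X uniform on [low, high]."""
    a, b = max(a, low), min(b, high)
    return max(b - a, 0.0) / (high - low)

def uniform_percentile(p):
    """Value below which a fraction p of waiting times fall."""
    return low + p * (high - low)

print(uniform_prob(15, 45))       # part (a)
print(uniform_percentile(0.80))   # part (b), in seconds
```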
2. You have recently joined a country club. The number of times you expect to play golf in a month is
represented by a random variable with a mean of 10 and a standard deviation of 2.2. Assume you pay
monthly membership fees of $500 per month and pay an additional $50 per round of golf.
a. What is your average monthly bill from the country club?
b. What is the standard deviation of your monthly bill from the country club?
3. Let the random variable X follow a normal distribution with a mean of 17.1 and a standard deviation of
3.2.
a. What is P(X > 16)? Answer: 0.6331
b. What is P(15 < X < 20)? Answer: 0.5640
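The stated answers to Problem 3 can be checked with `statistics.NormalDist`. The sketch computes each probability twice: once from the exact normal CDF, and once with the Z score rounded to two decimals, as a printed table lookup would do. That the stated answers come from a table lookup is an assumption, though the rounded values agree with them to four decimals.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=17.1, sigma=3.2)
std_normal = NormalDist()  # mean 0, standard deviation 1

# Exact values from the normal CDF.
p_a_exact = 1 - x_dist.cdf(16)
p_b_exact = x_dist.cdf(20) - x_dist.cdf(15)

def z_table(x):
    """Z score rounded to two decimals, as a printed table lookup would use."""
    return round((x - 17.1) / 3.2, 2)

# Table-style values based on the rounded Z scores.
p_a_table = 1 - std_normal.cdf(z_table(16))
p_b_table = std_normal.cdf(z_table(20)) - std_normal.cdf(z_table(15))

print(f"(a) exact {p_a_exact:.4f}, table {p_a_table:.4f}")
print(f"(b) exact {p_b_exact:.4f}, table {p_b_table:.4f}")
```

The small gap between the exact and table-style values is rounding in Z, not an error in either method.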
4. In a recent survey about US policy in Iraq, 62 % of the respondents said that they support US policy in
Iraq. Females comprised 53% of the sample, and of the females, 46% supported US policy in Iraq.
a. Is the event “does not support US policy in Iraq” statistically independent from gender? Why or
why not?
b. Suppose we select a supporter of US policy in Iraq, what is the probability that the person we
select is female?
5. Fred’s Surfboard Shop makes surfboards by hand. The number of surfboards that Fred makes during a
week depends on the wave conditions. Fred has estimated the following probabilities for surfboard
production for the next week.
Number of Surfboards    Probability
5                       0.13
6                       0.22
7                       0.31
8                       0.17
9                       0.13
10                      0.04
Let A be the event that Fred produces more than seven surfboards. Let B be the event that Fred
produces exactly six surfboards.
a. What is the probability of event A?
b. Are events A and B collectively exhaustive? Why?
c. Are events A and B mutually exclusive? Why?
d. What is the probability of the complement of A?
e. What is the probability of the intersection of events A and B? Why?
6. The number of orders that come into a mail-order sales office each month is normally distributed with a
mean of 298 and a standard deviation of 15.4.
a. What is the probability that in a particular month the office receives more than 310 orders?
ANSWER: 0.2177
b. The probability is 0.3 that the sales office receives less than how many orders? ANSWER: 290.0
7. The length of time it takes to be seated at a local restaurant on Friday night is normally distributed with a
mean of 15 minutes and a standard deviation of 4.75 minutes.
a. What is the probability that you have to wait more than 20 minutes to be seated? ANSWER:
0.1469
b. What is the probability that you have to wait between 13 and 16 minutes to be seated?
ANSWER: 0.246
QUESTIONS SET #2
1. Explain the sampling frame, why it is important, and give an example not used in the class and textbook.
2. Explain the probabilistic sampling, advantages and disadvantages, and give an example not used in the
class and textbook.
3. Explain the non-probabilistic sampling, advantages and disadvantages, and give an example not used in
the class and textbook.
4. Explain the judgmental and convenience samplings, advantages and disadvantages of each, and give an
example not used in the class and textbook.
5. Explain the simple random and systematic samplings, advantages and disadvantages of each, and give
an example not used in the class and textbook.
6. Explain the stratified and cluster samplings, advantages and disadvantages of each, and give an
example not used in the class and textbook.
7. Explain the sampling with and without replacement, and give an example not used in the class and
textbook.
8. Explain the coverage error, and give an example not used in the class and textbook. Discuss ethical
issues related to that and give an example.
9. Explain the sampling error, and give an example not used in the class and textbook. Discuss ethical
issues related to that and give an example.
10. Explain the non-response error, and give an example not used in the class and textbook. Discuss ethical
issues related to that and give an example.
11. Explain the measurement error, and give an example not used in the class and textbook. Discuss ethical
issues related to that and give an example.
Quiz 3
QUESTIONS SET
1. Explain the sampling distribution in general and the sampling distribution of the mean and give an
example. Explain the central limit theorem and illustrate graphs.
2. Explain standard error of the mean and give an example. Why does the standard error of the mean
decrease as the sample size, n, increases?
3. Why is the sample mean an unbiased estimator of the population mean? Why does the sampling
distribution of the mean follow a normal distribution for a large enough sample size, even though the
population may not be normally distributed?
4. What is the difference between a population distribution and a sampling distribution? Under what
circumstances does the sampling distribution of the proportion approximately follow the normal
distribution?
5. When are you able to use the t distribution to develop the confidence interval estimate for the mean?
6. Why is it true that for a given sample size, n, an increase in confidence is achieved by widening (and
making less precise) the confidence interval? Why can you never really have 100% confidence of
correctly estimating the population characteristic of interest?
7. What is the difference between a null hypothesis, and an alternative hypothesis? What is the difference
between a Type I error and a Type II error? What is meant by the power of a test?
8. What is the six-step critical value approach to hypothesis testing?
PROBLEMS SET #1
8. An auditor needs to estimate the percentage of times a company fails to follow an internal control
procedure. A sample of 50 from a population of 1,000 items is selected, and in 7 instances, the internal
control procedure was not followed.
a. Construct a 90% one-sided confidence interval estimate for the population proportion of items in
which the internal control procedure was not followed.
b. If the tolerable exception rate is 0.15, what should the auditor conclude?
9. Assuming that the population is normally distributed, construct a 95% confidence interval for the
population mean, based on the following sample: 11, 12, 13, 14, 50.
Change the number 50 to 15 and recalculate the confidence interval. Using these results, describe the
effect of an outlier (i.e., an extreme value) on the confidence interval.
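The two intervals in this problem can be sketched as follows. Python's standard library has no t quantile function, so the critical value for 95% confidence with 4 degrees of freedom is hard-coded from a t table (about 2.7764); the helper `t_interval` is hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of the t distribution with 4 degrees of
# freedom, taken from a t table (the standard library has no t quantile).
T_CRIT_95_DF4 = 2.7764

def t_interval(sample, t_crit):
    """Confidence interval for the mean: x-bar +/- t * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    margin = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - margin, mean + margin

with_outlier = t_interval([11, 12, 13, 14, 50], T_CRIT_95_DF4)
without_outlier = t_interval([11, 12, 13, 14, 15], T_CRIT_95_DF4)

print("with 50:", tuple(round(v, 2) for v in with_outlier))
print("with 15:", tuple(round(v, 2) for v in without_outlier))
```

The single extreme value shifts the sample mean and inflates the sample standard deviation, so the first interval is far wider than the second.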
10. The market research director for Dotty’s Department Store wants to study women’s spending on
cosmetics. A survey of the store’s customers is designed in order to estimate the proportion of women
who purchase their cosmetics primarily from Dotty’s Department Store and the mean yearly amount that
women spend on cosmetics. A previous survey found that the standard deviation of the amount women
spend on cosmetics in a year is approximately $18.
a. What sample size is needed to have 99% confidence of estimating the population mean amount
spent to within + or - $5?
b. How many of the store’s credit card holders need to be selected to have 90% confidence of
estimating the population proportion to within + or - 0.045?
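Sample-size questions of this kind follow from solving the margin-of-error formula for n and rounding up. The sketch below is one way to do that; the helper names are hypothetical, and for the proportion it assumes the conservative value p = 0.5 because the problem gives no prior estimate.

```python
import math
from statistics import NormalDist

def z_two_sided(confidence):
    """Critical Z value leaving (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_mean(confidence, sigma, error):
    """Smallest n with margin of error <= error when sigma is known."""
    return math.ceil((z_two_sided(confidence) * sigma / error) ** 2)

def n_for_proportion(confidence, error, p=0.5):
    """Smallest n for a proportion; p = 0.5 is the conservative choice."""
    z = z_two_sided(confidence)
    return math.ceil(z**2 * p * (1 - p) / error**2)

print(n_for_mean(0.99, sigma=18, error=5))    # part (a)
print(n_for_proportion(0.90, error=0.045))    # part (b)
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed the target.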
11. The telephone company has the business objective of wanting to estimate the proportion of households
that would purchase an additional telephone line if it were made available at a substantially reduced
installation cost. Data are collected from a random sample of 500 households. The results indicate that
135 of the households would purchase the additional telephone line at a reduced installation cost.
a. Construct a 99% confidence interval estimate for the population proportion of households that
would purchase the additional telephone line.
b. How would the manager in charge of promotional programs concerning residential customers use
the results in (a)?
12. The manager of a paint supply store wants to estimate the actual amount of paint contained in 1-gallon
cans purchased from a nationally known manufacturer. The manufacturer’s specifications state that the
standard deviation of the amount of paint is equal to 0.02 gallon. A random sample of 50 cans is
selected, and the sample mean amount of paint per 1-gallon can is 0.995 gallon.
a. Construct a 99% confidence interval estimate for the population mean amount of paint included in
a 1-gallon can.
b. On the basis of these results, do you think that the manager has a right to complain to the
manufacturer? Why?
c. Must you assume that the population amount of paint per can is normally distributed here? Explain.
d. Construct a 95% confidence interval estimate. How does this change your answer to (b)?
13. If n = 36, sample mean = 75, S = 24, and assuming that the population is normally distributed, construct a
95% confidence interval estimate for the population mean.
14. If a quality control manager wants to estimate, with 95% confidence, the mean life of light bulbs to within
+ or - 20 hours and also assumes that the population standard deviation is 100 hours, how many light
bulbs need to be selected?
15. If the manager of a paint supply store wants to estimate, with 95% confidence, the mean amount of paint
in a 1-gallon can to within + or - 0.004 gallon and also assumes that the standard deviation is 0.02 gallon,
what sample size is needed?
PROBLEMS SET #2
1. A population has four members (called A, B, C, and D). You would like to select a random sample of n =
2, which you decide to do in the following way: Flip a coin; if it is heads, the sample will be items A and B;
if it is tails, the sample will be items C and D. Although this is a random sample, it is not a simple random
sample. Explain why.
2. You want to select a random sample of n = 1 from a population of three items (which are called A, B, and
C). The rule for selecting the sample is as follows: Flip a coin; if it is heads, pick item A; if it is tails, flip the
coin again; this time, if it is heads, choose B; if it is tails, choose C. Explain why this is a probability
sample but not a simple random sample.
3. Time spent using e-mail per session is normally distributed, with population mean = 8 minutes and
population standard deviation = 2 minutes. If you select a random sample of 25 sessions,
a. what is the probability that the sample mean is between 7.8 and 8.2 minutes?
b. what is the probability that the sample mean is between 7.5 and 8 minutes?
c. If you select a random sample of 100 sessions, what is the probability that the sample mean is
between 7.8 and 8.2 minutes?
d. Explain the difference in the results of (a) and (c).
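Because the population in Problem 3 is normal, the sample mean is also normal with standard error sigma / sqrt(n). A minimal sketch of the calculation, with a hypothetical helper name:

```python
import math
from statistics import NormalDist

def prob_sample_mean_between(low, high, mu, sigma, n):
    """P(low < x-bar < high) when x-bar is normal with standard error sigma / sqrt(n)."""
    x_bar = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))
    return x_bar.cdf(high) - x_bar.cdf(low)

p_a = prob_sample_mean_between(7.8, 8.2, mu=8, sigma=2, n=25)
p_b = prob_sample_mean_between(7.5, 8.0, mu=8, sigma=2, n=25)
p_c = prob_sample_mean_between(7.8, 8.2, mu=8, sigma=2, n=100)

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}")
```

Quadrupling the sample size halves the standard error, which is why the same interval captures a larger share of sample means in (c) than in (a).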
4. A political pollster is conducting an analysis of sample results in order to make predictions on election
night. Assuming a two-candidate election, if a specific candidate receives at least 55% of the vote in the
sample, that candidate will be forecast as the winner of the election. If you select a random sample of 100
voters, what is the probability that a candidate will be forecast as the winner when
a. the population percentage of her vote is 50.1%?
b. the population percentage of her vote is 60%?
c. the population percentage of her vote is 49% (and she will actually lose the election)?
d. If the sample size is increased to 400, what are your answers to (a) through (c)? Discuss.
5. Junk bonds reported strong returns in 2009. The population of junk bonds earned a mean return of 57.5%
in 2009. Assume that the returns for junk bonds were distributed as a normal random variable, with a
mean of 57.5 and a standard deviation of 20. If you selected a random sample of 16 junk bonds from this
population, what is the probability that the sample would have a mean return
a. less than 50?
b. between 40 and 60?
c. greater than 40?
6. A simple random sample of full-time employees is selected from a company list containing the names of
all full-time employees in order to evaluate job satisfaction.
a. Give an example of possible coverage error.
b. Give an example of possible non-response error.
c. Give an example of possible sampling error.
d. Give an example of possible measurement error.
7. Many consumer groups feel that the U.S. Food and Drug Administration (FDA) drug approval process is
too easy and, as a result, too many drugs are approved that are later found to be unsafe. On the other
hand, a number of industry lobbyists have pushed for a more lenient approval process so that
pharmaceutical companies can get new drugs approved more easily and quickly. Consider a null
hypothesis that a new, unapproved drug is unsafe and an alternative hypothesis that a new, unapproved
drug is safe.
a. Explain the risks of committing a Type I or Type II error.
b. Which type of error are the consumer groups trying to avoid? Explain.
c. Which type of error are the industry lobbyists trying to avoid? Explain.
8. Do students at your school study more than, less than, or about the same as students at other schools?
BusinessWeek reported that at the top 50 schools, students studied an average of 14.6 hours per week.
Set up a hypothesis test to try to prove that the mean number of hours studied at your school is different
from the 14.6-hour-per-week benchmark reported by BusinessWeek.
a. State the null and alternative hypotheses.
b. What is a Type I error for your test?
c. What is a Type II error for your test?