Download 3 - JustAnswer

3.
a. What score was earned by more students than any other score? Why? 84. It's the mode.
b. How many students scored between 68 and 94 on the exam? 25 students. Because 50% of the
students lie between the 1st and 3rd quartiles and there are 50 students.
c. What was the highest score earned on the exam? 98 (see below for calculations).
Let a be the lowest score and b be the highest score.
(a+b)/2 = 72 (that's the midrange)
b-a = 52 (that's the range)
So we can solve these equations: b = 52+a
(2a+52)/2 = 72 --> a+26 = 72 --> a = 46 --> b = 98.
d. What was the lowest score earned on the exam? 46 (see above for calculations)
e. How many students scored within three standard deviations of the mean? By Chebyshev's
Theorem, at least 1-1/9 = 0.889 (88.9%) of the 50 students scored within that interval. That is
at least 44.44 students, so at least 45 students.
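The arithmetic in parts (c) through (e) can be checked with a short Python sketch. The variable names are illustrative and not part of the original problem; the values 72, 52, and 50 are the midrange, range, and class size used above.

```python
# Solve midrange = (a + b) / 2 = 72 and range = b - a = 52 for a and b.
midrange = 72
score_range = 52
lowest = midrange - score_range / 2    # a
highest = midrange + score_range / 2   # b

# Chebyshev's Theorem: at least 1 - 1/k^2 of the data lies
# within k standard deviations of the mean.
k = 3
n_students = 50
min_fraction = 1 - 1 / k**2
min_students = min_fraction * n_students

print(lowest, highest)
print(round(min_fraction, 3), round(min_students, 2))
```

Chebyshev's bound is only a lower limit, so the true count may be higher than the figure printed here.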
Show your calculations leading to your standard deviation and variance on #2.
2.
A math test was given with the following results:
80, 69, 92, 75, 88, 37, 98, 92, 90, 81, 32, 50, 59, 66, 67, 66
Find the range, standard deviation, and variance for the scores.
To find the range we subtract the minimum number from the maximum number:
32, 37, 50, 59, 66, 66, 67, 69, 75, 80, 81, 88, 90, 92, 92, 98
Range = 98-32 = 66
Finding the variance, it’s helpful to use a table:
x-bar = 71.375
x     x - x-bar    (x - x-bar)^2
32    -39.375      1550.391
37    -34.375      1181.641
50    -21.375       456.891
59    -12.375       153.141
66     -5.375        28.891
66     -5.375        28.891
67     -4.375        19.141
69     -2.375         5.641
75      3.625        13.141
80      8.625        74.391
81      9.625        92.641
88     16.625       276.391
90     18.625       346.891
92     20.625       425.391
92     20.625       425.391
98     26.625       708.891
5787.75 <--- sum of the last column
385.85 <--- divide the sum by n-1
So the variance is 385.85.
The standard deviation is simply the square root of the variance: sqrt(385.85) = 19.64
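As a cross-check on the hand calculation, the same quantities can be computed with Python's standard `statistics` module. This is only a verification sketch; `statistics.variance` and `statistics.stdev` use the n-1 (sample) divisor, matching the table above.

```python
import statistics

scores = [80, 69, 92, 75, 88, 37, 98, 92, 90, 81, 32, 50, 59, 66, 67, 66]

score_range = max(scores) - min(scores)
mean = statistics.mean(scores)
variance = statistics.variance(scores)  # sample variance, divides by n-1
std_dev = statistics.stdev(scores)      # square root of the sample variance

print(score_range, mean, round(variance, 2), round(std_dev, 2))
```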
Be sure to include your calculations to help me find your errors. #6
6.
An animal trainer obtained the following data (Table A) in a study of reaction times of dogs (in
seconds) to a specific stimulus. He then selected another group of dogs that were much older than the
first group and measured their reaction times to the same stimulus. The data are shown in Table B.
Classes      Table A Frequency   Table B Frequency
2.3 – 2.9    10                  1
3.0 – 3.6    12                  3
3.7 – 4.3    6                   4
4.4 – 5.0    8                   16
5.1 – 5.7    4                   14
5.8 – 6.4    2                   4
Find the variance and standard deviation for the two distributions above. Compare the variation of the
data sets. Decide if one data set is more variable than the other.
Be sure to include your calculations.
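The formula for the variance of grouped data is:
s^2 = [ sum(f*m^2) - ((sum(f*m))^2)/n ] / (n - 1)
where f is the frequency in a particular category, m is the midpoint of that category, and n is the
total number of observations (the sum of the frequencies).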
We can use a table to find this answer as well (note that there are 42 observations in each table):
TABLE A
Low    High   Midpoint, m   Frequency, f   f*m     f*m^2
2.3    2.9    2.6           10             26      67.6
3.0    3.6    3.3           12             39.6    130.68
3.7    4.3    4.0           6              24      96
4.4    5.0    4.7           8              37.6    176.72
5.1    5.7    5.4           4              21.6    116.64
5.8    6.4    6.1           2              12.2    74.42
sum(f*m) = 161
(sum(f*m))^2 = 25921
((sum(f*m))^2)/n = 25921/42 = 617.167
sum(f*m^2) = 662.06
So the variance is
s^2 = (662.06 - 617.167)/(n - 1) = 44.893/41 = 1.095
and the standard deviation is s = sqrt(1.095) = 1.046.
TABLE B
Low    High   Midpoint, m   Frequency, f   f*m     f*m^2
2.3    2.9    2.6           1              2.6     6.76
3.0    3.6    3.3           3              9.9     32.67
3.7    4.3    4.0           4              16      64
4.4    5.0    4.7           16             75.2    353.44
5.1    5.7    5.4           14             75.6    408.24
5.8    6.4    6.1           4              24.4    148.84
sum(f*m) = 203.7
(sum(f*m))^2 = 41493.69
((sum(f*m))^2)/n = 41493.69/42 = 987.945
sum(f*m^2) = 1013.95
And the variance is
s^2 = (1013.95 - 987.945)/(n - 1) = 26.005/41 = 0.634
and the standard deviation is s = sqrt(0.634) = 0.796.
Table A has the larger variance (1.095 versus 0.634), so the reaction times of the first group are
more variable than those of the older dogs.
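The grouped-data calculation for both tables can be sketched in Python as follows. The function name `grouped_variance` is a hypothetical helper, not something from the original solution; it applies the shortcut formula with the n-1 divisor to class midpoints and frequencies.

```python
import math

def grouped_variance(midpoints, freqs):
    """Sample variance of grouped data from class midpoints and frequencies."""
    n = sum(freqs)
    sum_fm = sum(f * m for f, m in zip(freqs, midpoints))
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, midpoints))
    return (sum_fm2 - sum_fm**2 / n) / (n - 1)

midpoints = [2.6, 3.3, 4.0, 4.7, 5.4, 6.1]
freq_a = [10, 12, 6, 8, 4, 2]   # Table A
freq_b = [1, 3, 4, 16, 14, 4]   # Table B

var_a = grouped_variance(midpoints, freq_a)
var_b = grouped_variance(midpoints, freq_b)

print(round(var_a, 3), round(math.sqrt(var_a), 3))
print(round(var_b, 3), round(math.sqrt(var_b), 3))
```

Comparing `var_a` with `var_b` gives a direct way to decide which group of dogs shows more variation.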
1.
Explain the difference between a discrete and a continuous random variable. Give two examples
of each type of random variable.
A discrete random variable takes on a countable set of values. Examples are the number of traffic
accidents at an intersection on a particular day and the number of defects found on a production
line on a particular day.
A continuous random variable can take on any value in a particular interval. Examples are the time
between calls at a phone bank and the weight of a particular item.
2.
Determine whether each of the distributions given below represents a probability distribution.
Justify your answer.
a)
x      1     2     3     4
P(x)   1/8   1/8   3/8   1/8
No, because the probabilities sum to less than 1.
b)
x
3
P (x)
6
0.2
8
0
1
No, because the probabilities sum to more than 1.
c)
x      20    30    40    50
P(x)   0.3   0.2   0.1   0.4
Yes, the probabilities sum to 1.
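The two conditions being tested (every probability lies between 0 and 1, and the probabilities sum to 1) can be written as a small check. The function name is a hypothetical helper, and only tables (a) and (c) are used here.

```python
from fractions import Fraction
import math

def is_probability_distribution(probs):
    """Return True if every value is in [0, 1] and the values sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    return in_range and math.isclose(float(sum(probs)), 1.0)

dist_a = [Fraction(1, 8), Fraction(1, 8), Fraction(3, 8), Fraction(1, 8)]
dist_c = [0.3, 0.2, 0.1, 0.4]

print(is_probability_distribution(dist_a), sum(dist_a))
print(is_probability_distribution(dist_c))
```

`math.isclose` is used instead of `==` so that floating-point rounding in sums such as 0.3 + 0.2 + 0.1 + 0.4 does not cause a false rejection.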
3.
Four cards are selected, one at a time, from a standard deck of 52 cards. Let x represent the
number of aces drawn in a set of 4 cards.
a.
If this experiment is completed without replacement, explain why x is not a binomial random
variable.
When you don't replace the cards in the deck, the number of cards in the deck changes, so the
probability of drawing an ace on the second draw is not the same as on the first draw (4/52 on the
first draw, then 3/51 or 4/51 on the second, depending on the first card). The
binomial distribution requires a series of identical, independent Bernoulli trials.
b.
If this experiment is completed with replacement, explain why x is a binomial random variable.
This is because each draw has the same probability of an ace as the last, and the result of the
previous draw does not influence the current draw. That satisfies all the requirements of the
binomial distribution: independent, identical Bernoulli trials.
4.
How does the bell-shaped curve for the sampling distribution of sample means for samples of
size n = 100 compare to the bell-shaped curve for the sampling distribution of sample means for
samples of size n = 60?
The bell-shaped curve for the sample mean for a sample of 100 is taller and narrower than the one
with sample size 60 because the variance of the sample average is var(x)/n. Larger sample sizes
result in smaller variances, so the curve is taller and narrower when the sample size is 100 than
it is when it is 60. Both curves are centered at the population mean.
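A quick numerical illustration of the var(x)/n relationship: the standard deviation of the sample mean (the standard error) is sigma/sqrt(n). The population standard deviation `sigma = 1.0` below is an assumed value chosen only for illustration; the problem does not give one.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 1.0  # assumed population standard deviation (not given in the problem)
se_60 = standard_error(sigma, 60)
se_100 = standard_error(sigma, 100)

print(round(se_60, 4), round(se_100, 4))
```

Whatever value of sigma is used, the n = 100 curve has the smaller spread.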
5.
What are the characteristics of the normal distribution? Why is the normal distribution important
in statistical analysis? Provide an example of an application of the normal distribution.
It is a continuous, symmetric distribution with a tell-tale bell shape. Its two parameters are the mean
and the standard deviation.
The normal distribution is frequently used when working with the sample mean. The central limit
theorem gives the result that the sample average is approximately normal when the sample size is large
(e.g., over 30). So one of the most common applications of the normal distribution is in working with
the sample average. Other applications include regression, the t-test, and ANOVA.
6.
In your own words describe the standard normal distribution. Explain why it can be used to find
probabilities for all normal distributions.
The standard normal distribution is a special case of the normal distribution. It has mean = 0 and
standard deviation = 1. It can be used to find probabilities for all normal distributions because the
formula z = (x - mean)/(standard deviation) converts any observation x from any normal distribution
to the standard normal distribution.
7.
Explain why the normal distribution can be used as an approximation to the binomial
distribution. What conditions must be met to use the normal distribution to approximate the binomial
distribution? Why is a correction for continuity necessary?
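As long as n*p>5 and n*(1-p)>5, we can approximate the binomial distribution with the normal
distribution. The central limit theorem allows this, because a binomial count is a sum of n
independent Bernoulli trials. Typically, these requirements are met through
having a large sample size, also one of the requirements of the central limit theorem.
A correction for continuity is necessary because the binomial distribution is discrete while the
normal distribution is continuous, so a single count such as x = 2 is represented by the interval
from 1.5 to 2.5.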
8.
a.
Consider a binomial distribution with 15 identical trials and a probability of success of 0.5.
Find the probability that x = 2 using the binomial tables.
P(X=2) = 0.0032
b.
Use the normal approximation to find the probability that x = 2
Using the continuity correction factor, we need to calculate P(1.5<x<2.5) where x follows a normal
distribution with mean = n*p = 7.5 and a standard deviation = sqrt(n*p*(1-p)) = 1.936.
So, finding the z-scores gives:
Z= (1.5 - 7.5)/1.936 = -3.098
Z= (2.5 - 7.5)/1.936 = -2.582
We can find, from the table, that P(-3.098<z<-2.582) = 0.005- 0.001 = 0.004
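The exact binomial probability and the normal approximation with the continuity correction can be compared in a few lines of Python. This is a verification sketch using `math.comb` and `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 15, 0.5, 2

# Exact binomial probability P(X = k).
exact = comb(n, k) * p**k * (1 - p)**(n - k)

# Normal approximation: same mean and standard deviation as the binomial,
# with the continuity correction P(k - 0.5 < X < k + 0.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

print(round(exact, 4), round(approx, 4))
```

The approximation is rough this far out in the tail, which is why the two figures agree only to about the third decimal place.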
9.
The diameters of oranges in a certain orchard are normally distributed with a mean of 5.26
inches and a standard deviation of 0.50 inches.
a) Find the z-score:
(4.5-5.26)/0.5 = -1.52
From the table we find that P(z<-1.52) = 0.064.
So 6.4% of the oranges have diameter less than 4.5 inches.
b) Find the z-score:
(5.12-5.26)/0.5 = -0.28
From the table, we find that P(z>-0.28) = 0.610
So, we expect 61% of the oranges to have a diameter of more than 5.12 inches.
c.
A random sample of 100 oranges is gathered and the mean diameter obtained was 5.12. If
another sample of 100 is taken, what is the probability that its sample mean will be greater than 5.12
inches?
Since we know that the original observations follow a normal distribution, we know that the sample
mean follows a normal distribution as well. In this case, the mean is 5.26 inches with a standard
deviation of 0.5/sqrt(100) = 0.05.
So, we can find the z-score:
(5.12-5.26)/0.05 = -2.8
And we can find from the table that P(z>-2.8) = 0.997
That is, we can expect that 99.7% of the time, a sample of 100 oranges will have a mean diameter of
more than 5.12 inches.
d.
Why is the z-score used in answering (a), (b), and (c)?
Because we have access to a standard normal table. Without the standard normal table, we wouldn't
be able to find the probabilities.
e.
Why is the formula for z used in (c) different from that used in (a) and (b)?
In (a) and (b) we are working with probabilities that relate to individual observations. In problem (c)
we are working with the sample mean. They follow a different distribution (i.e., they have a different
standard deviation), so the formula for the z-score divides by 0.5/sqrt(100) rather than by 0.5.
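The three orange probabilities can be reproduced without a table by using `statistics.NormalDist`. This is a checking sketch only; the names `diameter` and `sample_mean` are illustrative. The only difference between the individual-orange case and the sample-mean case is the standard deviation passed in.

```python
from math import sqrt
from statistics import NormalDist

# Distribution of individual orange diameters (inches).
diameter = NormalDist(mu=5.26, sigma=0.50)
p_a = diameter.cdf(4.5)        # P(x < 4.5)
p_b = 1 - diameter.cdf(5.12)   # P(x > 5.12)

# Distribution of the mean diameter of a sample of 100 oranges.
sample_mean = NormalDist(mu=5.26, sigma=0.50 / sqrt(100))
p_c = 1 - sample_mean.cdf(5.12)  # P(x-bar > 5.12)

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
```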
10.
Assume that the population of heights of male college students is approximately normally
distributed with mean m of 68 inches and standard deviation s of 3.75 inches. A random sample of 16
heights is obtained.
x is normally distributed with a mean of 68 inches and a standard deviation of 3.75 inches.
b.
Find the proportion of male college students whose height is greater than 70 inches.
(70-68)/3.75 = 0.533
P(z>0.533)=0.297
So, the proportion of male college students whose height is greater than 70 inches is 0.297.
c.
Describe the distribution of x-bar, the mean of samples of size 16.
The sample average is normally distributed with mean of 68 inches and standard deviation of
3.75/sqrt(16) = 0.9375
d.
Find the mean and standard error of the distribution.
The sample average is normally distributed with mean of 68 inches and standard deviation of
3.75/sqrt(16) = 0.9375
e.
Find P (x-bar > 70) = 0.016 (see work below):
z=(70-68)/0.9375 = 2.133
P(z>2.133) = 0.016
f.
Find P (x-bar < 67) = 0.143 (see work below):
z=(67-68)/0.9375 = -1.066
P(z<-1.066) = 0.143
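The z-score work for the heights problem can be checked the same way. The sketch below assumes only the mean of 68, the standard deviation of 3.75, and the sample size of 16 given in the problem; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

height = NormalDist(mu=68, sigma=3.75)             # individual heights
mean_16 = NormalDist(mu=68, sigma=3.75 / sqrt(16)) # means of samples of 16

p_b = 1 - height.cdf(70)   # P(x > 70)
p_e = 1 - mean_16.cdf(70)  # P(x-bar > 70)
p_f = mean_16.cdf(67)      # P(x-bar < 67)

print(round(mean_16.stdev, 4))
print(round(p_b, 3), round(p_e, 3), round(p_f, 3))
```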
Part I T/F & Multiple Choice
1. False
2. False
3. True
4. False
5. True
6. False
7. C. 0.714
8. D. The result of one trial does not affect the probability of success on any other trial
9. B. P(z < 0)
10. C. n = 100, p = 0.05
Part II. Short Answers & Computational Questions
1. Find the following probabilities:
a.
Events A and B are mutually exclusive events defined on a common sample space. If P (A) = 0.4
and P(A or B) = 0.9, find P(B).
Mutually exclusive means that P(A and B) = 0.
So P(A or B) = P(A) + P(B) - P(A and B)
0.9 = 0.4 + P(B) - 0
P(B) = 0.9 - 0.4 = 0.5
b.
Events A and B are defined on a common sample space. If P(A) = 0.20, P(B) = 0.40, and P(A or
B) = 0.56, find P(A and B)
By the same rule
P(A or B) = P(A) + P(B) - P(A and B)
0.56 = 0.2 + 0.4 - P(A and B)
P(A and B) = 0.6 - 0.56 = 0.04
2.
Classify the following as discrete or continuous random variables
a. continuous
b. discrete
c. discrete
d. continuous
3.
A small bag of Skittles candies has the following assortment: red (10), blue (2), orange (5),
brown (21), green (0), and yellow (18). Construct the probability distribution for x.
What is x? To what does it refer? The count of reds drawn? Blues?
Greens?
4.
Find the mean and standard deviation of the following probability distribution:
x      1     2     3
P(x)   0.3   0.5   0.2
Mean = 1*0.3 + 2*0.5 + 3*0.2 = 0.3 + 1 + 0.6 = 1.9
Standard deviation = sqrt of (0.3*(1-1.9)^2+0.5*(2-1.9)^2 + 0.2*(3-1.9)^2) = 0.7
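The mean and standard deviation of a discrete distribution follow directly from their definitions, as this short sketch shows. The variable names are illustrative.

```python
from math import sqrt

values = [1, 2, 3]
probs = [0.3, 0.5, 0.2]

# Mean: sum of x * P(x).
mean = sum(x * p for x, p in zip(values, probs))

# Variance: sum of P(x) * (x - mean)^2; the standard deviation is its root.
variance = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```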
5.
In testing a new drug, researchers found that 5% of all patients using it will have a mild side
effect. A random sample of 11 patients using the drug is selected. Find the probability that:
a)
exactly two will have this mild side effect
11!/(9!*2!) * 0.05^2 * 0.95^9 = 0.0867
b)
at least one will have this mild side effect.
1-(0.95^11) = 0.4312
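Both side-effect probabilities come from the binomial formula with n = 11 and p = 0.05. The sketch below uses a hypothetical helper `binomial_pmf` built on `math.comb` to confirm them.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 11, 0.05
p_exactly_two = binomial_pmf(2, n, p)
p_at_least_one = 1 - binomial_pmf(0, n, p)  # complement of "none"

print(round(p_exactly_two, 4), round(p_at_least_one, 4))
```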
6.
A large shipment of TV sets is accepted upon delivery if an inspection of ten randomly selected
TV sets yields no more than one defective TV.
a) Find the probability that this shipment is accepted if 5% of the total shipment is defective.
(10 * 0.05 * 0.95^9) + 0.95^10 = 0.914
b)
Find the probability that this shipment is not accepted if 10% of this shipment is defective
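The probability that the shipment is accepted is (10 * 0.1 * 0.90^9) + 0.90^10 = 0.736
So the probability that it is not accepted is 1 - 0.736 = 0.264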
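The acceptance rule (no more than one defective set among ten) is a cumulative binomial probability, and rejection is its complement. The helper name `prob_accept` below is hypothetical and used only for this sketch.

```python
from math import comb

def prob_accept(defect_rate, n=10, max_defects=1):
    """P(at most max_defects defective sets in a sample of n)."""
    return sum(
        comb(n, k) * defect_rate**k * (1 - defect_rate)**(n - k)
        for k in range(max_defects + 1)
    )

accept_5 = prob_accept(0.05)        # part (a): accepted at 5% defective
reject_10 = 1 - prob_accept(0.10)   # part (b): not accepted at 10% defective

print(round(accept_5, 3), round(reject_10, 3))
```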
7.
X has a normal distribution with a mean of 75.0 and a standard deviation of 2.5. Find the
following probabilities:
a)
P(x < 70.0) = 0.022750132
b)
P(72.5 < x < 80.0) = 0.818594614
c)
P(x >82.5) = 0.001349898
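These three values correspond to z-scores of -2, -1 to 2, and 3. A short check with `statistics.NormalDist` (variable names illustrative):

```python
from statistics import NormalDist

x = NormalDist(mu=75.0, sigma=2.5)

p_a = x.cdf(70.0)                 # P(x < 70.0)
p_b = x.cdf(80.0) - x.cdf(72.5)   # P(72.5 < x < 80.0)
p_c = 1 - x.cdf(82.5)             # P(x > 82.5)

print(round(p_a, 6), round(p_b, 6), round(p_c, 6))
```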
8.
Find the value of z such that 40% of the distribution lies between it and the mean.
We need the value z such that P(0 < Z < z) = 0.40. Since P(Z < 0) = 0.5, this means
P(Z < z) = 0.5 + 0.4 = 0.90.
Using the table, we find that P(Z < 1.28) = 0.90, so z = 1.28.
Since the distribution is symmetric, the negative version of this number also works: 40% of the
distribution lies between z = -1.28 and the mean.
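Table lookups in the reverse direction (from an area to a z-value) can be checked with `NormalDist.inv_cdf`. The sketch below computes two quantities that are easy to confuse: the z-value with 40% of the area between it and the mean, and the bounds of the middle 40% of the distribution. Variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# 40% of the area between the mean and z: P(Z < z) = 0.5 + 0.4 = 0.9.
z_between = std_normal.inv_cdf(0.90)

# Middle 40% of the distribution: 30% of the area lies below the lower bound.
lower_middle = std_normal.inv_cdf(0.30)
upper_middle = -lower_middle

print(round(z_between, 2), round(lower_middle, 3), round(upper_middle, 3))
```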
9.
Assume that the average annual salary for a worker in the United States is $27500 and that the
annual salaries for Americans are normally distributed with a standard deviation equal to $6250. Find
the following:
a)
What percentage of Americans earn below $18000?
P(x<18000) = 0.0643
b)
What percentage of Americans earn above $40000?
P(x>40000) = 0.0228
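The z-scores behind these two answers are (18000-27500)/6250 = -1.52 and (40000-27500)/6250 = 2. A short check, with illustrative variable names:

```python
from statistics import NormalDist

salary = NormalDist(mu=27500, sigma=6250)

z_low = (18000 - salary.mean) / salary.stdev
z_high = (40000 - salary.mean) / salary.stdev

p_below_18000 = salary.cdf(18000)
p_above_40000 = 1 - salary.cdf(40000)

print(z_low, z_high)
print(round(p_below_18000, 4), round(p_above_40000, 4))
```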