Download y - Lake-Sumter State College

Write the equation of the line that passes through the points (– 2 , 4) and (3 , – 6)
Equation: y = mx + b
$m = \frac{y_1 - y_2}{x_1 - x_2} = \frac{(4) - (-6)}{(-2) - (3)} = \frac{10}{-5} = -2$
To find b: y = mx + b then y = -2x + b; let x = 3 and y = –6 and you get –6 = (-2)(3) + b
Solve for b and you get 0, which means the line crosses the y axis at 0.
Final Answer: y = -2x + 0 or y = -2x (m = -2 is the slope and b = 0 is the y intercept)
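The two-point calculation above can be checked with a short Python sketch. The function name `line_through` is hypothetical (it is not part of the original handout), and exact fractions are used only so the slope and intercept come out without rounding.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points.

    Hypothetical helper for illustration; not from the original handout.
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    m = Fraction(y1 - y2, x1 - x2)  # slope = (y1 - y2) / (x1 - x2)
    b = y2 - m * x2                 # solve y = m*x + b for b using the second point
    return m, b

m, b = line_through((-2, 4), (3, -6))
print(m, b)  # -2 0
```

Either point can be used to solve for b; the sketch uses (3, –6), as the worked solution does.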
Linear regression equation made easy
Write the regression equation given:
x    3    1    3    5
y    5    8    6    4
Linear regression equation (Sample)
y = mx + b
x is the independent variable
m is the slope and b is the y intercept.
Formula to find the slope: $m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$
Formula to find the y intercept the easy way → $b = \bar{y} - m\bar{x}$
To find $\bar{y}$ find the mean of the y coordinates. To find $\bar{x}$ find the mean of the x coordinates.
OR you can use this formula: $b = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2}$
Make a table like this:
x                    y                    x•y                     x²
The x coordinates    The y coordinates    Multiply the x and y    Square the x
go here.             go here.             coordinates.            coordinate.
∑x                   ∑y                   ∑x•y                    ∑x²
Notation
n      → The number of data pairs
∑x     → The sum of the x values
∑y     → The sum of the y values
∑x²    → The sum of x squared
(∑x)²  → The x values should be added up, then the sum is squared
∑xy    → Multiply x and y, then add the products
Example 1: Write the linear regression equation given the coordinates:
(3, 5); (1, 8); (3, 6) and (5, 4).
Put the x coordinates in the first column, the y coordinates in the second column, the product of the x and y coordinates in the third column, and the square of each x coordinate in the fourth column. Then add up each column.
x          y          x•y          x²
3          5          15           9
1          8          8            1
3          6          18           9
5          4          20           25
∑x = 12    ∑y = 23    ∑x•y = 61    ∑x² = 44
Substitute the numbers from above into this formula to find m, the slope of the line.
$m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{4 \cdot 61 - 12 \cdot 23}{4 \cdot 44 - 12^2} = \frac{-32}{32} = -1$
Now find b (the y-intercept). To find b use this formula: $b = \bar{y} - m\bar{x}$.
You already found m, and $\bar{x}$ and $\bar{y}$ (x bar and y bar) are the averages of the x and y coordinates.
$\bar{y} = \frac{23}{4} = 5.75$ and $\bar{x} = \frac{12}{4} = 3$
b = 5.75 – (–1)(3) = 8.75
To find x bar (the mean of the x coordinates) take the sum of the x coordinates and divide the sum by the number of x coordinates, which in this case is 4. To find y bar (the mean of the y coordinates) take the sum of the y coordinates and divide the sum by the number of y coordinates, which in this case is 4. Substitute the numbers into the formula and solve for b.
You can also use this formula to find b.
$b = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2} = \frac{23 \cdot 44 - 12 \cdot 61}{4 \cdot 44 - 12^2} = \frac{280}{32} = 8.75$
To finish the problem substitute m and b into y = mx + b:
Final answer: y = –1x + 8.75
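As a check on Example 1, the column sums and the two formulas can be sketched in Python. The function name `regression_from_sums` is an assumption made for illustration; the arithmetic follows the slope formula and the "easy way" intercept formula b = ȳ – m·x̄ given in this section.

```python
def regression_from_sums(points):
    """Return (m, b) of the least-squares line y = m*x + b.

    Hypothetical helper for illustration; it mirrors the table method:
    sum each column, then substitute into the formulas.
    """
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are equal: the slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = sum_y / n - m * sum_x / n                   # b = y_bar - m * x_bar
    return m, b

m, b = regression_from_sums([(3, 5), (1, 8), (3, 6), (5, 4)])
print(m, b)  # -1.0 8.75
```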
Example
Jason is a local real estate agent in upstate New York who wants to compare the price of a
cottage located on Otter Lake to the number of blocks a cottage is from Otter Lake. The table
shown below shows a sample of sales and location data. Write the linear regression equation
for this data.
Number of blocks from the lake (x)    Price of a cottage (y)    (x)(y)       x²
0                                     $380,000                  0            0
1                                     $299,000                  299,000      1
2                                     $248,000                  496,000      4
3                                     $189,000                  567,000      9
4                                     $175,000                  700,000      16
5                                     $163,000                  815,000      25
∑x = 15                               ∑y = 1,454,000            ∑xy = 2,877,000    ∑x² = 55
$m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{(6 \times 2{,}877{,}000) - (15 \times 1{,}454{,}000)}{6(55) - 225} = \frac{-4{,}548{,}000}{105} = -43{,}314.28571$
$b = \bar{y} - m\bar{x}$
$\bar{y} = \frac{1{,}454{,}000}{6} \approx 242{,}333.333$ and $\bar{x} = \frac{15}{6} = 2.5$
b = 242,333.333 – (–43,314.28571)(2.5) ≈ 350,619.05
y = –43,314.29x + 350,619.05
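The cottage example can be re-computed without rounding any intermediate value. The sketch below uses the equivalent deviation form of the slope, m = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)², which is a swapped-in formulation rather than the table method shown in the handout; the variable names are assumptions. Rounding ȳ to 242,333 before substituting shifts b by a few tenths, so the sketch keeps full precision until the final print.

```python
from statistics import mean

# Data from the Otter Lake table
blocks = [0, 1, 2, 3, 4, 5]
prices = [380_000, 299_000, 248_000, 189_000, 175_000, 163_000]

x_bar = mean(blocks)   # 2.5
y_bar = mean(prices)   # about 242,333.33

# Deviation form of the least-squares slope (algebraically equal to the sum formula)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(blocks, prices))
s_xx = sum((x - x_bar) ** 2 for x in blocks)

m = s_xy / s_xx
b = y_bar - m * x_bar  # intercept from the means, no intermediate rounding

print(round(m, 2), round(b, 2))  # -43314.29 350619.05
```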
Example # 1
Since 1990, fireworks usage nationwide has grown, as shown in the table below, where t
represents the number of years since 1990, and p represents the fireworks usage in millions of
pounds.
Number of years since 1990 (t)                           0       2       4      6        7        8        9        11
Firework usage per year in millions of pounds (p)        67.6    88.8    119    120.1    132.5    118.3    159.2    161.6
a) Find the equation of the linear regression model for this set of data, where t is the
independent variable. Round all values to four decimal places.
Answer: p = 8.1875t + 72.7860
b) Using this equation, determine in what year fireworks usage would have reached 99 million
pounds.
Answer: 1993
c) Based on this linear model, how many millions of pounds of fireworks would be used in the
year 2008? Round your answer to the nearest tenth.
Answer: 220.2
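The three fireworks answers can be reproduced with a short Python sketch. The variable names are assumptions, and part b) is read here as "the calendar year during which the model first reaches 99 million pounds", which is one reasonable interpretation of the question.

```python
# Data from the fireworks table
years_since_1990 = [0, 2, 4, 6, 7, 8, 9, 11]
usage = [67.6, 88.8, 119, 120.1, 132.5, 118.3, 159.2, 161.6]

n = len(years_since_1990)
sum_t = sum(years_since_1990)
sum_p = sum(usage)
sum_tp = sum(t * p for t, p in zip(years_since_1990, usage))
sum_t2 = sum(t * t for t in years_since_1990)

# a) slope and intercept from the sum formulas
m = (n * sum_tp - sum_t * sum_p) / (n * sum_t2 - sum_t ** 2)
b = (sum_p - m * sum_t) / n

# b) solve 99 = m*t + b for t, then take the calendar year containing that t
t_99 = (99 - b) / m
year_99 = 1990 + int(t_99)

# c) predicted usage in 2008, i.e. t = 18
p_2008 = m * (2008 - 1990) + b

print(round(m, 4), round(b, 4), year_99, round(p_2008, 1))
```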
Question # 2
The accompanying table illustrates the number of movie theaters showing a popular film and
the film’s weekly gross earnings, in millions of dollars.
Number of Theaters (x)                      443     455     493     530     569     657     723     1064
Gross Earnings (y) (millions of dollars)    2.57    2.65    3.37    4.05    4.76    4.76    5.15    9.35
a) Write the linear regression equation for this set of data, rounding all values to two decimal
places.
b) Using this linear regression equation, find the approximate gross earnings, in millions of
dollars, generated by 610 theaters. Round all your answers to two decimal places.
c) Find the minimum number of theaters that would generate at least 7.65 million dollars in
gross earnings in one week.
Answers:
a) y = .01x – 1.80
b) 4.3
c) 945
Correlation Coefficient
In statistics the correlation coefficient indicates the strength and direction of a linear
relationship between two random variables.
r        → The linear correlation coefficient for a sample
ρ (rho)  → The linear correlation coefficient for a population
Formula
$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2} \cdot \sqrt{n\sum y^2 - (\sum y)^2}}$
Notation
n      → The number of data pairs
∑x     → The sum of the x values
∑y     → The sum of the y values
∑x²    → The sum of x squared
(∑x)²  → The x values should be added up, then the sum is squared
∑xy    → Multiply x and y, then add the products
∑y²    → The sum of y squared
(∑y)²  → The y values should be added up, then the sum is squared
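A minimal Python sketch of the sample correlation coefficient, built from the same sums listed in the notation. The function name `correlation` is hypothetical, and the data are the four points (3, 5), (1, 8), (3, 6), (5, 4) from the first regression example, reused here only as an illustration.

```python
from math import sqrt

def correlation(points):
    """Return the sample linear correlation coefficient r.

    Hypothetical helper for illustration; it substitutes the column
    sums into the formula for r.
    """
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_y2 = sum(y * y for _, y in points)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        raise ValueError("r is undefined when all x or all y values are equal")
    return numerator / denominator

r = correlation([(3, 5), (1, 8), (3, 6), (5, 4)])
print(round(r, 4))  # -0.9562
```

A value this close to –1 is consistent with the negative slope found for the same points.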
Positive correlation: If x and y have a strong positive linear correlation, r is close to +1. An r
value of exactly +1 indicates a perfect positive fit and this occurs only when the data points all
lie exactly on a straight line. A correlation greater than .8 is described as strong and a
correlation less than .5 is described as weak.
[Figure: scatter plot of Y against X; correlation "r" is close to 1]
Negative correlation: If x and y have a strong negative linear correlation, r is close to -1. An r
value of exactly -1 indicates a perfect negative fit and this occurs only when the data points all
lie exactly on a straight line.
[Figure: scatter plot of Y against X; correlation "r" is close to -1]
Zero correlation: If there is no linear correlation or a weak linear correlation, r is close to 0.
A value near zero means that there is no linear relationship between the two variables; the
points may be scattered randomly or follow a nonlinear pattern.
[Figure: scatter plot of Y against X; correlation "r" is close to 0]
1)
The graph shown below has what type of correlation?
a) Negative correlation
b) Positive correlation
c) Zero correlation
d) Non linear correlation
[Figure: scatter plot of Y against X]
2)
The graph shown below has what type of correlation?
a) Negative correlation
b) Positive correlation
c) Zero correlation
d) Non linear correlation
[Figure: scatter plot of Y against X]
3)
What is the best approximate value of the correlation coefficient for the graph shown
below?
a) – .85
b) – .16
c) .23
d) .90
[Figure: scatter plot of Y against X]
4)
Kathy determined the kinetic energy (KE) of an object at different velocities (V) and
found the linear correlation coefficient between KE and V to be +.8. Which graph shown
below shows this relationship?
[Figure: four scatter plots of KE against V, labeled a) through d)]
5)
Which plot shows the strongest correlation?
[Figure: four scatter plots of Y against X, labeled a) through d)]
6)
Which plot shows a correlation closest to – 1?
[Figure: four scatter plots of Y against X, labeled a) through d)]
7)
Which plot shows a negative correlation?
[Figure: four scatter plots of Y against X, labeled a) through d)]
Question Generator# 1
Frequency Tables
Make a frequency table with 6 classes using the data below. Show the class limits, class
boundaries, frequency, class midpoints and relative frequency. Then construct a frequency
histogram.
One-way commuting distances in miles for 60 workers in downtown Dallas are given below:
13  07  12  06  34  14  47  25  45  02
13  26  10  08  01  14  41  10  03  21
08  13  28  24  16  19  04  07  36  37
20  15  16  15  17  31  17  03  11  46
24  08  40  17  18  12  27  16  04  14
23  09  29  12  02  06  12  18  09  16
Class Limits (Lower – Upper)    Frequency    Class boundary    Class Midpoint    Relative Frequency
Question Generator # 2
5 number summary
1)
From the stem and leaf plot shown below find the 5 number summary and make a
horizontal box and whisker plot.
2 │ 3, 5, 1, 0, 6, 4
3 │ 0, 7, 3, 5, 2
4 │ 9, 3, 5, 1, 7
5 │ 6, 0, 1, 7, 3, 9
Key: 4│3 = 43
2)
Which of the following could be affected if you have an extremely large upper outlier?
a) Q1
b) Q2
c) Q3
3)
What is the IQR for the following data? {3, 7, 4, 9, 1, 2, 5, 6, 8}
4)
A data value is considered to be an outlier if …
a) It lies between – 1.5 × IQR and 1.5 × IQR
b) It lies between Q1 – 1.5 × IQR and Q3 + 1.5 × IQR
c) It lies between Q1 and Q3
d) It is smaller than Q1 – 1.5 × IQR or larger than Q3 + 1.5 × IQR
5)
True or false: An outlier does not affect Q2
6)
Using the accompanying box and whisker plot shown below, what is the median?
[Figure: box and whisker plot above a number line running from 55 to 100 in steps of 5]
7)
From the given data shown below find the 5 number summary, then make a horizontal
box and whisker plot and check for potential outliers. {90, 93, 65, 88, 83, 90, 85, 83, 83,
85, 88, 75, 73, 90, 88, 88, 93, 83, 40, 85, 93, 83}
Question Generator # 3
Mean, Median, Mode and Symbol Sampler
01) Students selected at random were asked how many times they had attended a movie at a
commercial theatre since the beginning of the school year.
The responses were {1, 5, 7, 1, 4, 6}. Find the numerical values of:
a) n
b) ∑x
c) ∑x2
d) (∑x)2
e)
x
f)
x
g) The mode
02) Using the histogram shown below find the numerical value of the mean rounded to the
nearest whole number, then find the interval the median and mode are located in.
[Figure: histogram titled "Temperatures in April @ 7:00 am"; vertical axis: frequency (0 to 10); horizontal axis: temperature, with classes 50 – 57, 58 – 65, 66 – 73, 74 – 81, 82 – 89]
Question Generator # 4
Standard Deviation
01)
If the mean and median are located to the left of the mode the normal curve will be
shifted to the ___________?
02)
Using the normal curve what % of the data will be located …
a) Within 2 standard deviations of the mean?
b) Outside 3 standard deviations of the mean?
03)
Using Chebyshev’s rule what % of the data will be located within …
a) 3 standard deviations of the mean?
b) 2.75 standard deviations of the mean?
04)
Using the table below, which shows the distribution of bowling scores, find $\bar{x}$ and the
standard deviation (s) to the nearest tenth.
Interval       Frequency
91 – 110       10
111 – 130      11
131 – 150      8
151 – 170      4
171 – 190      6
191 – 210      5
05)
Given $\bar{x}$ = 19 and s = 6, find the range of scores that are located within 2 standard
deviations of the mean.
06)
Bayonne High School has 17 students on its bowling team. Each student bowled one
game and their scores are listed below.
a) Find the population standard deviation of these scores.
b) How many of these scores fall within one standard deviation of the mean?
Scores (x)    Frequency
140           4
145           3
150           2
160           3
170           2
180           2
194           1
Question Generator # 5
Probability
01)
The diagram below shows a square dartboard. The side of the dartboard measures 20
inches. The circular shaded region at the center has a diameter that measures 10 inches.
If darts thrown at the board are equally likely to land anywhere on the board, what is
the probability that a dart does not land in the shaded region?
[Figure: square dartboard with 20 inch sides and a shaded circular region at the center]
2)
If the P(A) = .76 the P(B) = .56 and P(A and B) = .43 what is P(A or B)?
3)
Burger Heaven offers 4 types of burgers, 5 types of beverages and 3 types of desserts. If
a meal consists of 1 burger, one beverage and one dessert, how many possible meals can
be chosen?
4)
A drawer contains 3 red paperclips, 4 green paperclips and 5 blue paperclips. One
paperclip is taken from the drawer and then replaced. Then another paperclip is taken
from the drawer. What is the probability that the first paperclip is red and the second
paperclip is blue?
5)
A drawer contains 3 red paperclips, 4 green paperclips and 5 blue paperclips. One
paperclip is taken from the drawer and not replaced. Then another paperclip is taken
from the drawer. What is the probability that the first paperclip is red and the second
paperclip is blue?
6)
How many elements are in the sample space of rolling one die? Now list the sample
space.
7)
How many elements are in the sample space of tossing 3 pennies?
List the sample space.
8)
There are two entry doors and 3 staircases in the school. How many ways are there to
enter the building and go to the 2nd floor?
9)
The 4 aces are removed from a deck of cards. Then a coin is tossed and one of the aces is
chosen. What is the probability of getting a head on the coin and the ace of hearts?
Draw the sample space.
10)
A die is rolled. What is the probability that the number is even or less than 4?
11)
A pair of dice is rolled and the sum recorded. What is the probability of rolling an even
number?
12)
A standard deck of cards is shuffled. What is the probability of choosing the 5 of
diamonds?
13)
A spinner contains eight equal regions, numbered 1 to 8. The arrow has an equally
likely chance of landing on one of the eight regions. If the arrow lands on the line it is
spun again. What is the probability that the arrow lands on an odd number?
Question Generator # 6
Probability
1)
The spinner shown to the right is spun once, find ……
a) P(it will land on a number that is a factor of 35)
b) P(6 or 2)
c) P(4 or even)
d) P(of a natural number less than 9)
e) P(0)
[Figure: spinner with numbered regions]
2)
A card is selected from a regular deck of cards, find …
a) P(of a face card)
b) P(not a queen)
c) P(red card or jack)
3)
There is a 35% chance of rain tomorrow. What is the probability of no rain?
4)
A box contains five white balls, three black balls and two red balls. How many red balls
must be added to the box so the probability of drawing a red ball is 3/4?
5)
The probability of rolling a number less than 5 is 4/6 and the probability of rolling a 5 or
greater is 2/6. The odds in favor of rolling a number less than 5 are?
6)
Adam bought a package of marbles and sorted all of them by color as shown in the
graph below.
[Figure: bar graph; vertical axis: number of marbles (0 to 5); horizontal axis: color (red, white, black, yellow)]
a) What was the total number of marbles in the package?
b) If one marble was selected at random, find the probability it was black, red or yellow?
c) If two marbles were selected at random, with replacement, find the probability one of the
two marbles was red?
d) If two marbles were selected at random, without replacement, find the probability neither
marble was red or black?
e) If two marbles were selected at random, with replacement, find the probability one of the
two marbles was blue?
Question Generator # 7
Probability Distribution
1)
Find the expected number of girls in a two child family?
For question 2, 3 and 4 determine whether a probability distribution is given. If no tell
me why it’s no and if it’s yes find the mean and standard deviation.
2)
X       1      2      3      4      5      6      7
P(X)    .15    .2     .1     .2     .1     .15    .1
3)
X       1      2      3      4      5      6      7
P(X)    .23    .13    .04    .17    .09    .11    .2
4)
X       1      2      3      4      5      6      7
P(X)    -.12   .33    .04    .27    -.05   .14    .39
5)
The probability distribution for a random variable x is given below. Your job is to find
the variance of the distribution.
X       -3     -2     0      2      3
P(X)    .1     .3     .2     .3     .1
6)
Find the value of y that makes the following table a probability distribution.
X       -2     -1     0      1      2
P(X)    .13    .17    y      .19    .22
7)
In a restaurant the following probability distribution was obtained for the number of
items a person ordered for a large pizza.
Number of items:     0      1      2      3      4
Probability P(x)     .3     .4     .2     .06    .04
Find the mean number of toppings for the distribution and can you conclude that
people like pizza with a lot of toppings?
8)
A franchise of small grocery stores has kept a record of the number of bad checks
passed in its stores. They used the data to get a probability distribution for the number
of bad checks passed in a store each week. In the table below x = the number of bad
checks and P(x) is the probability that x bad checks will be passed in a week.
X       0      1      2      3      4
P(X)    .3     .3     .2     .1     .1
a) Calculate the expected number of bad checks the chain will get in one week.
b) Calculate the standard deviation for the number of bad checks.
c) What is the probability that no bad checks will be passed in a week?
d) What is the probability that two or more bad checks will be passed in a week?
9)
The table shown below gives a probability distribution for a random variable x. What is
the probability that x is an odd number?
X       1      2      3      4      5      6      7
P(X)    .15    .2     .1     .2     .1     .15    .1
10)
Pyramid Lake, Nevada is one of the best places in the lower 48 states to catch a trophy
cutthroat trout. In this table, x = the number of fish caught in a six hour period. The
percentage data are the percentages of fishermen who caught x fish in a 6 hour period.
x       0      1      2      3      4 or more
P(x)    44%    36%    15%    4%     1%
a) Find the probability that a fisherman selected at random catches one or more fish in
a 6 hour period.
b) Find the probability that a fisherman selected at random catches two or more fish in
a 6 hour period.
c) Compute μ
d) Compute σ
Question Generator # 8
Binomial Distribution
1)
Rose is the last person to compete in a basketball free throw contest. To win, Rose must
be successful in at least 4 out of 5 throws. If the probability that Rose will be successful
on any single throw is .75, what is the probability Rose will win the contest?
2)
If Kathy randomly guesses at 10 multiple choice questions and each question has 4
possible choices, find the probability that she guesses exactly 3 questions correctly.
3)
When you call 411 for directory assistance 90% of the time you are given a correct
number. If 12 requests are selected at random what is the probability of at most 3
wrong phone numbers?
For questions 4, 5 and 6, determine whether the following examples result in a binomial
distribution. If you say no, explain why you said no.
4)
Surveying 100 college students by asking them how many credits they are taking.
5)
Spinning a roulette wheel 12 times and finding the number of times the outcome is red.
6)
Guessing the answer to 20 multiple choice questions then determining if the answers are
right or wrong.
Question 7, 8 and 9 use table A1 to find the probability given …
7)
n = 7 ; P(x) = .01 and x = 1
8)
n = 9 ; P(x) = .3 and x = 1, 2 and 3
9)
n = 14 ; P(x) = .95 and x = At least 11
Question 10 and 11 use the binomial formula to find the probability given …
10)
n = 10 ; P(x) = 1/3 and r = 4
11)
n = 13 ; P(x) = .75 and r = 6 and 7
12)
A basket contains 3 balls, 1 red, 1 blue and 1 yellow. Three balls are selected with
replacement. Find the probability no red ball is selected.
13)
The probability that an individual selected has blue eyes is .4 find the probability that at
least one of the eight people selected has blue eyes.
14)
If the probability that a family selected at random will be audited by the IRS is .0378
your job here is to write the binomial distribution formula to determine the probability
that exactly 5 of 20 families will be audited by the IRS this year.
15)
Six cards are selected from a regular deck of playing cards. Find the probability that
exactly 3 picture cards are selected.
16)
Fifty seven percent of the companies in the USA use networking to recruit workers.
What is the probability that in a survey of ten companies exactly half of them use
networking to recruit workers?
17)
If a die is rolled 3 times, what is the probability you will get a 6 all three times?
18)
Which of the following is not a property of a binomial distribution?
a) The number of trials is fixed.
b) There are exactly two outcomes for each trial.
c) The individual trials are dependent on each other.
d) The probability of success is the same for each trial.
Question Generator # 9
C.L.T.
1)
Human body temperatures have a mean of 96.6° F with a standard deviation of .62° F.
If a sample of 26 people is randomly selected, find the probability of getting a mean
body temperature less than 98.2° F.
2)
For women aged 18-24 systolic blood pressures are normally distributed with a mean of
114.8 and a standard deviation of 13.1 If 12 women in that age bracket are randomly
selected find the probability that their mean systolic blood pressure is greater than 120.
3)
Suppose scores for a certain test follow a normal distribution with a mean of 63.6 and a
standard deviation σ of 2.5. If one test is selected at random find the probability that a
score is between 63 and 65.
4)
Suppose scores for a certain test follow a normal distribution with a mean of 63.6 and a
standard deviation σ of 2.5. If 75 tests are selected at random find the probability that
the mean score is between 63 and 65.
5)
Heights of women are normally distributed with a mean of 63.9 inches and a standard
deviation σ of 2.5 inches. If you take a sample of ten women, find the probability that
the average height is between 62 inches and 67 inches.
6)
IQ scores are normally distributed with a mean of 100 and a standard deviation σ of 15.
If 25 people are selected find the probability the mean is between 95 and 105.
Question Generator # 10
Confidence Intervals
1)
If you found a 95% confidence interval of 18.8 months to 48 months for the mean
duration of imprisonment of all East German political prisoners, determine the
margin of error.
2)
Find the sample size required to have a margin of error of 12 months with a 99% confidence
level, given a σ of 42 months.
3)
Phone calls received by the 411 operator are normally distributed. A random sample of
16 days showed the 411 requests had an average of 233 phone calls per day with a
standard deviation of 31 calls. Find an 80% confidence interval for the mean number of
411 requests received per day.