Math 116 – Activity 1 – Part 1 - Montgomery College Student Web

M116 – TI 83/84 CALCULATOR – CH 1
Section 1.2 – Random number generator
1) Select 3 students at random from your Statistics class.
a) There are 28 students in your class. Select 3 students at random.
Use the TI-83 calculator to generate 3 random integers from 1 to 28.
The instruction in the home screen of your calculator should read:
randInt(1,28,3)
Here are the steps to accomplish this:
Press MATH, arrow right to PRB, and select 5:randInt(
Type 1,28,3)
Notice: the “,” is the black key above the key for the number 7.
Press ENTER
b) List the three numbers obtained. If any of the numbers are repeated, press ENTER again
and select as many numbers as necessary to complete the list of three different integers.
c) Check with a classmate. Are his/her numbers the same as yours? Explain.
d) Check with the class roster shown on the transparency to name the students selected.
e) Comment on the importance of random selection.
f) You have 5 minutes to get to know the 3 students in your list.
g) How can we do this using the random number table from our book?
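For readers working away from the calculator, the same draw can be sketched in Python. This is only an illustration and not part of the TI-83 steps: rand_int and distinct_sample are hypothetical helper names, and Python's random module stands in for the calculator's generator.

```python
import random

CLASS_SIZE = 28    # students are numbered 1..28, as in part (a)
SAMPLE_SIZE = 3

def rand_int(low, high, count, rng=random):
    """Mimic randInt(low, high, count): integers in [low, high], repeats allowed."""
    return [rng.randint(low, high) for _ in range(count)]

def distinct_sample(low, high, count, rng=random):
    """Keep drawing, as in part (b), until `count` different integers are listed."""
    chosen = []
    while len(chosen) < count:
        n = rng.randint(low, high)
        if n not in chosen:      # skip repeated numbers
            chosen.append(n)
    return chosen

draw = rand_int(1, CLASS_SIZE, SAMPLE_SIZE)             # may contain repeats
students = distinct_sample(1, CLASS_SIZE, SAMPLE_SIZE)  # three different numbers
print(draw, students)
```

Two people running this will usually list different numbers, which is worth keeping in mind for part (c).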
1
M116 – NOTES – CH 1
Section 1.1 – Exploring Sampling Techniques

Simple random sample

Systematic random sampling:
Step 1: Divide the population size by the sample size, and round the result down to the
nearest whole number. Call it m.
Step 2: Select at random a number between 1 and m, and call it k.
Step 3: Select for the sample those members of the population that are
numbered k, k+m, k+2m, etc.
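The three steps above can be sketched in Python. The function name systematic_sample and the population size of 90 are assumptions made for illustration (the handout does not give the enrollment), and the sketch stops once the requested sample size is reached.

```python
import random

def systematic_sample(population_size, sample_size, rng=random):
    """Return the member numbers chosen by systematic random sampling."""
    m = population_size // sample_size   # Step 1: divide and round down
    k = rng.randint(1, m)                # Step 2: random start between 1 and m
    # Step 3: take members k, k+m, k+2m, ... until the sample is full
    return [k + i * m for i in range(sample_size)]

# Hypothetical population of 90 students, sample of 35
picked = systematic_sample(90, 35)
print(picked)
```

With 90 students and a sample of 35, m is 2, so the sample is every second student starting from student 1 or student 2.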

Cluster sampling:
Step 1: Divide the population into groups (clusters)
Step 2: Obtain a simple random sample of the clusters
Step 3: Use all the members of the clusters obtained in step 2 as the
sample.

Stratified random sampling with proportional allocation
Step 1: Divide the population into subpopulations (strata)
Step 2: From each stratum, obtain a simple random sample of size proportional to the
size of the stratum.
Step 3: Use all the members obtained in Step 2 as a sample.
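As a rough sketch of proportional allocation, the Python below draws from each stratum a simple random sample whose size matches that stratum's share of the population. The class names, class sizes, and the helper stratified_sample are all made up for illustration. Rounding each share can make the total differ slightly from the requested sample size when the shares are not whole numbers.

```python
import random

def stratified_sample(strata, sample_size, rng=random):
    """strata maps a stratum name to the list of its members."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # Step 2: sample size proportional to the size of the stratum
        share = round(sample_size * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample   # Step 3: all selected members form the sample

# Hypothetical strata: three classes with 30, 20 and 10 students
strata = {
    "class A": [f"A{i}" for i in range(1, 31)],
    "class B": [f"B{i}" for i in range(1, 21)],
    "class C": [f"C{i}" for i in range(1, 11)],
}
sample = stratified_sample(strata, 12)
print(sample)
```

Here the 12 selections split 6, 4 and 2 across the three classes, mirroring the 30 : 20 : 10 class sizes.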

Convenience sampling
2) Exploring sampling techniques
Consider the following population
Students who are enrolled in Professor Aronne’s 3 Statistics classes during the Spring
semester of 2007
a) Discuss how you would select a simple random sample of 35 students from this population.
i) Without a random number generator.
ii) With a random number generator.
b) Discuss how you would select 35 students by using the systematic method.
c) Discuss how you would select a sample from this population by using each of the following
methods:
i) Convenience method
ii) Stratified method
iii) Cluster method
2
M116 – TI 83/84 CALCULATOR – CH 2
Section 2.2 – Using the calculator to
Create a New List, Sort, and construct a Frequency Distribution
3) To get ready for this activity, create a new list labeled GLUCO
Here are the steps to accomplish this:
Press STAT
Select 1:Edit
Arrow right and up until the cursor is on the name of the last list of
your editor (the name has to be highlighted)
Arrow right and type the name of the new list: GLUCO
Press ENTER
Enter the data from problem 2 page 67. Press ENTER after each entry. All the numbers
should go into the same list.
4) Construct a frequency distribution of 6 classes for the GLUCO data.
a) Calculate the class width:
class width = (largest value – lowest value) / (number of classes), rounded up
b) Use the smallest number as the lower limit of the first class. Obtain all other lower limits
by adding the class width. Then write the upper limits.
Classes | Frequency
c) In order to determine the frequencies we are going to SORT the list GLUCO, and then
explore the list to count how many values are in each of the classes.
To SORT the list press STAT, select 2:SortA(
Press 2nd STAT to select the list GLUCO
Press ENTER
Then, get into the editor by pressing STAT, 1:Edit and scroll down to determine the
frequencies.
Count how many numbers are in each class and record on the table from part b.
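Parts (a) through (c) can be checked with a short Python sketch. The 20 values below are made up for illustration and are not the GLUCO data from the textbook; frequency_distribution is a hypothetical helper. One assumption to note: for whole-number data the sketch always moves up to the next whole number, even when the quotient is already whole, so that the largest value still fits in the last class.

```python
import math

def frequency_distribution(data, num_classes):
    """Return (class width, [(lower limit, upper limit, frequency), ...])."""
    low, high = min(data), max(data)
    # (a) class width, rounded up to the next whole number
    width = math.floor((high - low) / num_classes) + 1
    table = []
    for i in range(num_classes):
        lower = low + i * width      # (b) lower limits: add the class width
        upper = lower + width - 1    # upper limit sits just below the next lower limit
        freq = sum(1 for x in data if lower <= x <= upper)   # (c) count the values
        table.append((lower, upper, freq))
    return width, table

# Made-up whole-number data
data = [45, 52, 58, 61, 63, 67, 70, 71, 74, 75,
        78, 80, 83, 85, 88, 90, 94, 99, 104, 109]
width, table = frequency_distribution(data, 6)
print("class width:", width)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
```

For this made-up list the range is 64, so 64/6 is rounded up to a class width of 11 and the first class is 45-55.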
3
d) Using your results from part (b), complete the following table:
Class limits | Class boundaries | Class midpoint | Frequency | Relative frequency | Cumulative frequency
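The remaining columns follow mechanically from the class limits and frequencies. The sketch below assumes whole-number data, so each boundary sits 0.5 beyond the limit, and it uses made-up classes and frequencies rather than the GLUCO results.

```python
from itertools import accumulate

# Made-up (lower limit, upper limit, frequency) rows
classes = [(45, 55, 2), (56, 66, 3), (67, 77, 5),
           (78, 88, 5), (89, 99, 3), (100, 110, 2)]

n = sum(f for _, _, f in classes)                        # total number of values
cumulative = list(accumulate(f for _, _, f in classes))  # running totals

rows = []
for (lower, upper, f), cum in zip(classes, cumulative):
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # halfway between adjacent limits
        "midpoint": (lower + upper) / 2,
        "frequency": f,
        "relative": f / n,                         # share of all values
        "cumulative": cum,
    })

for row in rows:
    print(row)
```

The last cumulative frequency should equal the number of data values, which is a quick check on the counts.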
e) Sketch the corresponding histogram and label.
f) Use the same graph to sketch the corresponding frequency polygon for the data.
g) Sketch the corresponding ogive
h) Sketch a Stem and Leaf plot for the GLUCO data (from #2 on page 67).
i) Sketch a Dot Plot for the GLUCO data. Dot plots are explained in problem #17 on pages 73
and 74
4
M116 – TI 83/84 CALCULATOR – CH 2
Section 2.2 – Using the calculator to Sketch Histograms for Raw Data
5) Use the calculator to sketch a histogram for the data stored in GLUCO.
Here are the steps to accomplish this:
1st: Set up the histogram
Press 2nd Y= [STAT PLOT]
Select 1:Plot1… (or any other plot)
Turn the plot ON by pressing ENTER
Arrow down and to the right to select the histogram
Indicate GLUCO for the location of the data in Xlist
To select GLUCO press 2nd STAT[LIST], scroll down
and press ENTER to select
Indicate 1 for Freq
(Notice: Press ALPHA 1)
2nd: Set up the WINDOW. To sketch a histogram with a specific class width,
we need to set up the window values according to the specifications given
below.
You will need some numbers from the classes produced on the previous page.
Press WINDOW
Use the following values:
Xmin = lower class limit of the first class
Xmax = lower class limit of the next class beyond the data
(Xmin + (number of classes)*(class width))
Xscl = class width
Ymin = -5
Ymax = a number larger than the largest frequency
(try any number, then adjust if necessary)
Yscl = 1
Xres = 1
Press GRAPH
3rd: Read the frequencies
Press TRACE and arrow to the right to read the classes and
frequencies.
Make sure the classes agree with the ones obtained on the previous page. Sketch the
histogram here.
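To see why these WINDOW values reproduce the hand-built classes, here is a small Python sketch. It assumes each bar counts the values from its left edge up to, but not including, the next edge, which is how TRACE labels the bars (min on the left, max< on the right). The helper names and the data are made up for illustration.

```python
def window_settings(first_lower_limit, class_width, num_classes):
    """X-axis WINDOW values described in the steps above."""
    return {
        "Xmin": first_lower_limit,
        "Xmax": first_lower_limit + num_classes * class_width,
        "Xscl": class_width,
    }

def histogram_counts(data, xmin, xscl, num_classes):
    """Count the values falling in each bar [edge, edge + Xscl)."""
    counts = [0] * num_classes
    for x in data:
        i = int((x - xmin) // xscl)   # index of the bar that holds x
        if 0 <= i < num_classes:
            counts[i] += 1
    return counts

# Made-up data with first lower limit 45, class width 11, 6 classes
data = [45, 52, 58, 61, 63, 67, 70, 71, 74, 75,
        78, 80, 83, 85, 88, 90, 94, 99, 104, 109]
window = window_settings(45, 11, 6)
counts = histogram_counts(data, window["Xmin"], window["Xscl"], 6)
print(window, counts)
```

For whole-number data, a bar running from 45 up to but not including 56 holds the same values as the class 45-55, so the frequencies read with TRACE should match the table.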
5
M116 – TI 83/84 CALCULATOR – CH 2
6) Use the calculator to sketch a histogram for the grouped data from part 4-d
(Use L1, L2)
Enter midpoints into L1 and frequencies into L2
In the STAT PLOT window, when you select the histogram, indicate L1 for Xlist
and L2 for Freq
If you still have the same WINDOW selections as indicated on the previous page,
press GRAPH and TRACE to check on the class limits and frequencies.
7) Explore the feature ZOOM 9:ZoomStat. Press TRACE, arrow to the right and observe
the frequencies. Are they the same as the ones obtained before?
What is the class width? What are the class limits of the first and second class?
6
M116 – TI 83/84 CALCULATOR – CH 3
Sections 3.1-3.4 – Using the calculator to
Find the Mean, Median, Standard Deviation, and 5-number Summary
8) Use the data from problem 2, page 67, which you have stored into the list GLUCO, to
find the mean, standard deviation and the 5-number summary

Raw Data (list of all 70 numbers listed on page 67)
Instructions in the home screen should read 1-Var Stats GLUCO
Press STAT
Arrow to CALC
Select 1:1-Var Stats
Select the list GLUCO from the 2nd STAT (LIST) menu
Press ENTER

Grouped Data (use midpoints and frequencies. See page 4)
Instructions in the home screen should read: 1-Var Stats L1,L2
Enter midpoints into a list (L1),
Enter frequencies into another list (L2)
Press STAT
Arrow over to CALC
Press 1:1-Var Stats
Select L1, L2
Press ENTER

Observe the values obtained for the raw data and for the grouped data. Are they the
same? If not, why is that? Which answers are exact?
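The comparison between raw and grouped results can be previewed in Python. The values, midpoints and frequencies below are made up for illustration (they are not the GLUCO data), and the grouped formulas treat every value in a class as if it sat at the class midpoint, which is the approximation behind 1-Var Stats L1,L2.

```python
import math
import statistics

# Made-up raw data
data = [45, 52, 58, 61, 63, 67, 70, 71, 74, 75,
        78, 80, 83, 85, 88, 90, 94, 99, 104, 109]
raw_mean = statistics.mean(data)
raw_sd = statistics.stdev(data)   # sample standard deviation

# The same data grouped into six classes (midpoints and frequencies)
midpoints = [50, 61, 72, 83, 94, 105]
freqs = [2, 3, 5, 5, 3, 2]
n = sum(freqs)

# Weighted mean: every value in a class is replaced by its midpoint
grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
# Weighted sample variance, dividing by n - 1
grouped_var = sum(f * (m - grouped_mean) ** 2
                  for m, f in zip(midpoints, freqs)) / (n - 1)
grouped_sd = math.sqrt(grouped_var)

print("raw:    ", raw_mean, raw_sd)
print("grouped:", grouped_mean, grouped_sd)
```

For this made-up list the raw mean is 77.3 and the grouped mean is 77.5, close but not equal.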
7
M116 – TI 83/84 CALCULATOR – CH 3
Section 3.4 – Using the calculator to construct Box-and-Whisker Plots, and TRACE to
find the 5-number summary
9) Use the data from problem 2, page 67 (which is stored into the list GLUCO), to
construct a box plot
Here are the steps to accomplish this:
Press 2nd Y= [STAT PLOT]
Turn one Plot ON, and make sure all others are OFF.
Arrow down and right to select the box plot that shows the outliers
Select GLUCO for Xlist (from the 2nd STAT[LIST] menu)
Select 1 for Freq
Press ZOOM 9 (this automatically opens an appropriate window)
Press TRACE and use the left-right arrows to obtain the 5-number
summary
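A Python sketch of the 5-number summary is given below for checking small examples by hand. It assumes the quartiles are the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd) and that outliers lie more than 1.5 IQR beyond the quartiles. The function names and the data are made up for illustration.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                                   # values below the median position
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]  # values above it
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

def outliers(data):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Made-up data with one large value
values = [2, 4, 5, 7, 8, 9, 11, 12, 30]
print(five_number_summary(values))   # (2, 4.5, 8, 11.5, 30)
print(outliers(values))              # [30]
```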
_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_
10) Constructing the Box Plot and Histogram for the same data
Here are the steps to accomplish this:
Turn ON a second plot
Select a histogram for the data stored in list GLUCO
Press GRAPH
If necessary, press the WINDOW key and select a larger number for Y-max to provide
enough space to graph the histogram and the box-plot.
_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_
8
M116 – NOTES – CH 3
Section 3.2 - Chebyshev’s Theorem
For any set of data (either population or sample) and for any constant k greater than 1, the
proportion of the data that must lie within k standard deviations on either side of the mean is at
least 1-1/k^2
For any set of data
• At least 75% of the data fall within the interval from µ - 2σ to µ + 2σ
(Within 2 standard deviations from the mean)
• At least 89% of the data fall within the interval from µ - 3σ to µ + 3σ
(Within 3 standard deviations from the mean)
• At least 93.8% of the data fall within the interval from µ - 4σ to µ + 4σ
(Within 4 standard deviations from the mean)
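The three percentages come straight from 1-1/k^2 with k = 2, 3 and 4. A minimal Python check, with chebyshev_lower_bound as a hypothetical helper name:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k greater than 1")
    return 1 - 1 / k**2

for k in (2, 3, 4):
    print(k, chebyshev_lower_bound(k))
```

The exact values are 3/4, 8/9 and 15/16, which the notes round to 75%, 89% and 93.8%.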
11) Use the GLUCO data (from #2, page 67) to determine a Chebyshev interval about
the mean in which
a) At least 75% of the data fall
b) At least 89% of the data fall
c) At least 93.8% of the data fall
d) Explore the SORTED data which is in the GLUCO list and determine the actual percentage
of values which lies
i) Within two standard deviations from the mean
ii) Within three standard deviations from the mean
iii) Within four standard deviations from the mean
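Part (d) amounts to counting how many values land inside mean ± k standard deviations. The sketch below does that count in Python on made-up data (not the GLUCO list), using the sample standard deviation; proportion_within is a hypothetical helper name.

```python
import statistics

def proportion_within(data, k):
    """Fraction of the values lying within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)          # sample standard deviation
    low, high = mean - k * sd, mean + k * sd
    inside = sum(1 for x in data if low <= x <= high)
    return inside / len(data)

# Made-up data with one large value
values = [2, 4, 5, 7, 8, 9, 11, 12, 30]
for k in (2, 3, 4):
    print(k, proportion_within(values, k), ">=", 1 - 1 / k**2)
```

Whatever the data, each actual proportion should come out at or above the Chebyshev minimum for the same k.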
9
M116 – NOTES
Empirical Rule and Range Rule of Thumb
Empirical Rule (section 6.1)
For a distribution that is symmetrical and bell-shaped (normal distribution)
• About 68% of the data fall within the interval from µ - σ to µ + σ
(Within 1 standard deviation of the mean)
• About 95% of the data fall within the interval from µ - 2σ to µ + 2σ
(Within 2 standard deviations of the mean)
• About 99.7% of the data fall within the interval from µ - 3σ to µ + 3σ
(Within 3 standard deviations of the mean)
Range rule of thumb (section 6.2)
The range rule of thumb is based on the principle that for many data sets (symmetrical, bell
shaped), the vast majority (such as 95%) of sample values lie within two standard deviations of
the mean.
To roughly estimate the standard deviation, use:
s ~ (highest value – lowest value)/4
To roughly estimate the minimum and maximum “usual” sample values, use:
Minimum “usual” value ~ mean – 2 * standard deviation
Maximum “usual” value ~ mean + 2 * standard deviation
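As a quick arithmetic sketch of the two rules, the Python below uses made-up numbers (lowest value 45, highest value 109, mean 77.3). The function names are hypothetical.

```python
def estimate_sd(lowest, highest):
    """Range rule of thumb: s is roughly the range divided by 4."""
    return (highest - lowest) / 4

def usual_range(mean, sd):
    """Minimum and maximum 'usual' values: mean - 2*sd and mean + 2*sd."""
    return mean - 2 * sd, mean + 2 * sd

s = estimate_sd(45, 109)             # (109 - 45) / 4 = 16.0
low, high = usual_range(77.3, s)     # roughly 45.3 and 109.3
print(s, low, high)
```

Values below the minimum or above the maximum "usual" value would be considered unusual under this rule.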
11-e) Do the percentages obtained in part (d) on the previous page suggest that the
GLUCO distribution is bell-shaped?
11-f) Which values of the GLUCO data are usual, and which are unusual?
10
M116 – TI 83/84 CALCULATOR – CH 2-3
Loading Data Sets into your calculator

Here are the data sets that will be loaded into your calculator. Come to my office
to get them
CRGVL = Regular Coke Volume (oz)
PRGVL = Regular Pepsi Volume (oz)
CDTWT = Diet Coke Weight (lb)
CRGWT = Regular Coke Weight (lb)
FHED = Head circumferences of Two-Month-Old Baby-Girls (cm)
MHED = Head circumferences of Two-Month-Old Baby-Boys (cm)
FEMAL = ages of Females who finished a recent New York City Marathon
MALE = ages of Males who finished a recent New York City Marathon

Uploading a list from the memory into the editor of the calculator
Upload the data from the CRGVL list into the editor.
Press STAT
Select 1:Edit
Arrow up and to the right until we get into a list that has NO NAME
Press 2nd STAT[LIST]
Arrow down and select the list CRGVL and press ENTER twice.
11
M116 – TI 83/84 CALCULATOR – CH 2-3
12) What does the distribution of Volumes of Regular Coke cans look like?
Before constructing any graphs, think about the following:
a) Think about selecting a sample of regular Coke cans, recording their volumes, and using the
calculator to sketch a histogram. What do you think the histogram will look like? What shape
will this distribution have?
b) Now let’s look at the data that we have in CRGVL. Is your prediction correct?
c) Now use the calculator to sketch a histogram for the data set CRGVL. Is the histogram
what you predicted? Comment on the results.
Also, press TRACE and write the classes and frequencies obtained.
13) Let’s observe two graphs together for the same data set
Set up a second STAT PLOT with a box plot for the data CRGVL. Press ZOOM 9:ZoomStat. You
may need to press the WINDOW key of the calculator and change the Ymax to fit both
graphs.
Write the five-number summary for the data.
12
M116 – TI 83/84 CALCULATOR – CH 2-3
Comparing Data Sets
14) Do you think the distribution of volumes for regular Pepsi will look the same as the
one for regular Coke?
CRGVL = Regular Coke Volume (oz)
PRGVL = Regular Pepsi Volume (oz)
a) Now let’s look at box plots for both distributions CRGVL and PRGVL. Turn both plots
ON and press ZOOM 9:ZoomStat to select a window. Is it what you predicted? Comment on your
results.
Use the scale provided below as a guide to sketch the box plots.
b) Record the 5 number summary and the outliers for each of the distributions.
c) Also mention the smallest and largest number of the distributions which are not outliers.
_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_
d) Which of the two data sets has
(i) A larger standard deviation?
(ii) A larger mean?
(iii) A larger median?
(iv) A larger IQR?
(v) A larger range?
e) The 75th percentile of the CRGVL distribution is about __________ which is the same as
the _________th percentile of the PRGVL distribution.
f) For each distribution, give the range of the middle 50% of the data.
13
M116 – TI 83/84 CALCULATOR – CH 2-3
15) Comparing Weights of Diet Coke and Regular Coke by using Box Plots
CDTWT = Diet Coke Weight (lb)
CRGWT = Regular Coke Weight (lb)
Before constructing any graphs, think about both box plots.
a) Do you think they will have the same length (range)?
Will they have the same minimum and maximum, or will one of the plots be farther to the
right than the other? If so, which will be to the right?
b) Construct a box plot for each of the distributions of the weights of regular and diet Coke.
Display both plots in the same window. Is it what you predicted? Compare the graphs and
determine whether there appears to be a significant difference between the two distributions.
If so, provide a possible explanation for the difference.
Use the scale provided below as a guide to sketch the box plots.
Record the 5 number summary and the outliers for each of the distributions.
Also mention the smallest and largest number of the distributions which are not outliers.
_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_
16) Go back and look at each of the graphs obtained throughout this activity. Think
about the measures of the center and variation for the data set. (Chapter 3)
Do you think the mean and median are the same? If not, which one will be larger? Think
about the standard deviation. Will it be a large number or a small number? Try to tie concepts
from chapters 2 and 3 together.
14
M116 – NOTES
Choosing an appropriate number to describe the data
Measuring the center of a distribution

The mean cannot resist the influence of extreme observations. It is not a resistant measure of
the center

The median is a resistant measure of the center.

If the distribution is symmetric, the mean and median are the same.

If the distribution is close to symmetric, the mean and median are very close in values.

In a skewed distribution, the mean is farther out in the long tail than is the median
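The difference in resistance is easy to see with numbers. In the Python sketch below, the incomes (in thousands of dollars) are made up for illustration; one very large value is added to a small, roughly symmetric list.

```python
import statistics

# Made-up incomes, in thousands of dollars
incomes = [28, 31, 33, 35, 38, 40, 45]
with_extreme = incomes + [400]       # add one extreme observation

print(statistics.mean(incomes), statistics.median(incomes))
print(statistics.mean(with_extreme), statistics.median(with_extreme))
```

The single extreme value pulls the mean from about 35.7 up to 81.25, while the median only moves from 35 to 36.5.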
Measuring the spread of a distribution – Box Plots and the 5-number summary

The minimum and maximum values show the full spread of the data (but they may be outliers)

The interquartile range marks the spread of the middle half of the data.

In a symmetric distribution, the first and third quartiles are equally distant from the median

In most distributions that are skewed to the right, the third quartile will be farther above the
median than the first quartile

The standard deviation measures spread by looking at how far the observations are from their
mean
Choosing measures of center and spread

The five-number summary is usually better than the mean and standard deviation for
describing a skewed distribution or a distribution with strong outliers.

Use the mean and standard deviation only for reasonably symmetric distributions that are free
of outliers.
Example 1: Distributions of incomes are usually skewed to the right. Which measure of the center is
more appropriate? Why?

Reports about incomes and other strongly skewed distributions usually give the median rather
than the mean.
Example 2: The mean and median selling price of existing single-family homes sold in June 2002
were $163,900 and $210,900. Which of these numbers is the mean and which is the median? Explain
how you know.
15