STAT 10020
Minitab - Lab 2
Part 1 - Probability
1. Consider the sample space when two dice are thrown together. One die is green and the
other, red. There are 36 outcomes: (1, 1), (1, 2) ... (1, 6), (2, 1) ... (6, 6). Enter these sample points
into a Minitab worksheet. Let C1 represent the number on the green die and let C2 represent the
number on the red die. Make sure that each of the 36 possible red-green pairs appears exactly
once in your worksheet. We are interested in the sum of the numbers on the two dice, so make a
new column called Result containing the sum of C1 and C2 (remember we saw how to do this in
Lab 1).
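As an optional cross-check outside Minitab, the same sample space can be sketched in Python. This is only an illustration, not part of the lab: the names `sample_space` and `result` are assumptions chosen to mirror columns C1, C2 and Result.

```python
from itertools import product

# Each die shows 1..6; product() pairs every green value with every red value,
# so each (green, red) pair appears exactly once, like the rows of C1 and C2.
faces = range(1, 7)
sample_space = list(product(faces, faces))

# The Result column: the sum of the two dice for each sample point.
result = [green + red for green, red in sample_space]

print(len(sample_space))        # number of rows the worksheet should have
print(sample_space[:3], result[:3])
```

If the row count printed here differs from the number of rows in your worksheet, a pair is probably missing or duplicated.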
2. Now, we want to get the probability of each result in C3 (the Result column) occurring. Use code like this:
MTB > TALLY C3;
SUBC > COUNTS;
SUBC > PERCENTS.
Here, we are asking Minitab to display the number of times each value is seen in C3. We also ask
it to display the percentage of the time each value is seen. Notice the syntax we use. The
semicolon at the end of the first line tells Minitab not to process the command immediately but
instead to let us enter a subcommand. The full stop at the end of the third line tells Minitab that we
have finished entering subcommands and it is time to process the whole command.
Look at the Session window. To get the probability of a particular result (4, for example) divide the
number of times it occurs (the number in the Counts column) by the number of points in the sample
space. How does this probability relate to the percentage column?
Fill in the probability of each result in the following table:
Result | Probability
2      |
3      |
4      |
5      |
6      |
7      |
8      |
9      |
10     |
11     |
12     |
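The count-to-probability step described above can also be sketched in Python as a way of checking the table once you have filled it in. This is a rough illustration under the assumption that the sample space is the 36 equally likely pairs; the helper names `probability` and `percent` are hypothetical and do not correspond to anything in Minitab.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Rebuild the Result column: one sum per (green, red) pair.
result = [green + red for green, red in product(range(1, 7), repeat=2)]

counts = Counter(result)   # plays the role of the Counts column from TALLY
n_points = len(result)     # number of points in the sample space


def probability(value):
    """Count for this result divided by the size of the sample space."""
    return Fraction(counts[value], n_points)


def percent(value):
    """The same ratio expressed as a percentage, as TALLY's PERCENTS does."""
    return 100 * counts[value] / n_points


for value in sorted(counts):
    print(value, counts[value], probability(value), round(percent(value), 2))
```

Using `Fraction` keeps the probabilities exact, which makes it easier to compare them with values written as fractions of the sample space.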
3. Suppose we want to roll two dice 50 times and see what pattern the results follow. We
could roll them by hand and record the result each time, but Minitab gives us an easier way to do it.
Picking a number at random from the Result column is really the same as throwing a pair of dice
and recording the result. So we could write each number in the Result column on a piece of paper,
put all 36 pieces into a hat, and pull one out. Or we could just ask Minitab to choose a number from
the Result column at random, repeat that 50 times, and put all 50 numbers into a new column.
Click on Calc – Random Data – Sample from Columns.
In the dialogue box:
- Enter the number of throws you want in the sample.
- Choose the column to sample from. You want to simulate throwing two dice with the sum of the two being recorded. (Which column is the sample space for such an experiment?)
- Enter the next available column (in this case, C4) to store the sample.
- Tick the box to sample with replacement.
You should see 50 numbers appear in C4. Why did we choose to sample with replacement?
Now we’re going to compare the percentage of times each result occurred with the probability of
each result occurring. Type in the Session window:
MTB > TALLY C3 C4;
SUBC > PERCENTS.
Compare the two columns. Are the percentages shown similar for both?
Repeat the simulation, but take a sample of 5,000 (store the results in C5). This is equivalent to
rolling a pair of dice 5,000 times. Tally the results in this new column and compare them with the
results in C3. What do you notice? Is there a difference between the pattern of numbers in C4 and
C5?
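For readers who want to see the same idea outside Minitab, here is a minimal Python sketch of the simulation, assuming the Result column is the 36 sums built earlier. The function names `simulate`, `percentages` and `max_gap` and the seed value are assumptions made for this illustration only.

```python
import random
from collections import Counter
from itertools import product

# The Result column: one sum for each of the 36 (green, red) pairs.
result = [green + red for green, red in product(range(1, 7), repeat=2)]


def simulate(n_throws, rng):
    # choices() draws with replacement, like ticking the replacement box.
    return rng.choices(result, k=n_throws)


def percentages(values):
    # Percentage of the time each possible sum (2..12) appears in values.
    counts = Counter(values)
    return {v: 100 * counts[v] / len(values) for v in range(2, 13)}


def max_gap(simulated, exact):
    # Largest difference between a simulated and an exact percentage.
    return max(abs(simulated[v] - exact[v]) for v in exact)


rng = random.Random(10020)                   # arbitrary seed, for repeatability
exact = percentages(result)                  # corresponds to tallying C3
small = percentages(simulate(50, rng))       # corresponds to C4
large = percentages(simulate(5000, rng))     # corresponds to C5

print(max_gap(small, exact), max_gap(large, exact))
```

Changing the seed gives a different sample each time, just as repeating the Minitab simulation does.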
4. Now consider an experiment where two coins are tossed. We are interested in the number
of heads – 0, 1 or 2. In a new worksheet or in new columns of your current worksheet, enter the
sample space for this experiment, with one column representing coin 1 and another representing
coin 2. Use the techniques you have just learned to find the probability of the following outcomes:
i. Exactly 1 head:
ii. At least 1 head:
Use Minitab to simulate 10 tosses of 2 coins.
What percentage of the time does exactly 1 head occur?
What percentage of the time does at least 1 head occur?
Do these agree with the probabilities you calculated above?
Repeat the simulation, but simulate 100 and then 1,000 tosses. How do the percentages of each
outcome occurring agree with the probabilities now? Why is this?
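The coin experiment follows the same pattern, and a short Python sketch may help if you want to check your Minitab output independently. It assumes two fair coins; the names `heads` and `share` are hypothetical and chosen only for this example.

```python
import random
from itertools import product

# Sample space for two coins: one entry per (coin 1, coin 2) pair.
sample_space = list(product("HT", repeat=2))

# Number of heads for each sample point, like a Result column.
heads = [pair.count("H") for pair in sample_space]


def share(values, condition):
    # Fraction of values for which condition(value) is true.
    return sum(1 for v in values if condition(v)) / len(values)


p_exactly_one = share(heads, lambda h: h == 1)
p_at_least_one = share(heads, lambda h: h >= 1)

rng = random.Random(2)  # arbitrary seed, for repeatability
for n_tosses in (10, 100, 1000):
    tosses = rng.choices(heads, k=n_tosses)  # sampling with replacement
    print(n_tosses,
          share(tosses, lambda h: h == 1),
          share(tosses, lambda h: h >= 1))
```

The loop prints the simulated proportions for each sample size so they can be set beside `p_exactly_one` and `p_at_least_one`.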
Part 2 – Data Analysis
1. Go to the class page (www.ucd.ie/statdept/classpages/stat_10020.html) and download the
data set called Pulse.mtw. Save it in the Minitab folder on your H drive. (Remember, this is the
folder where you saved your work from Lab 1).
2. In Minitab, click on File – Open Worksheet to open the worksheet. You will see data in
several columns from an experiment to study pulse rate. In this experiment, volunteers recorded
their resting pulse rate (shown in the column Pulse1). They were then randomised into two groups,
one of which ran for one minute and the other of which sat still for one minute (recorded in the
column Ran). They then took their pulses again (recorded in Pulse2). Height, weight, sex and some
other background information are also recorded.
3. We want a histogram of the variable Pulse1. Click on Graph – Histogram and choose
Simple Histogram. In the dialogue box that you see, double-click on Pulse1.
4. Now draw a dotplot of the same data. (Click on Graph – Dotplot). From the two graphs
you’ve drawn, can you locate the lowest and highest values of resting pulse rate?
5. Look at the Ran column. It contains 1 or 2, depending on whether the subject ran or not.
We’re going to try to decide which value means that the subject ran by looking at the second pulse
measurement. Go to Graph – Dotplot and choose With Groups. You should see a dialogue box
like this:
- Choose the variable you want to graph: Pulse2
- Choose the variable you want to group by: Ran
Which group do you think ran for one minute?
6. Next we are going to compare descriptive statistics for the two groups. There are two ways
of doing this. The first is to separate the data in the Pulse2 column according to the number in the
Ran column. Click on Data - Unstack Columns. You should see a dialogue box like this:
- The variable you want to separate: Pulse2
- How you want to separate the variable: by the values in Ran
Now you should have two new columns containing the values from Pulse2. Use Stat - Basic
Statistics - Display Descriptive Statistics and select Pulse2_1 and Pulse2_2 as the variables.
Compare the means and medians. Does this agree with the impression you got from the dotplot
earlier?
7. There is a simpler way to do this. Go to Stat - Basic Statistics - Display Descriptive
Statistics again. This time, select Pulse2 as the variable and check the box marked "By variable".
Put "Ran" in this box and click OK, and you should see the descriptive statistics displayed
separately for those who did and did not run. Are they the same as your results from Step 6? (They
should be!)
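To see why Steps 6 and 7 must agree, the following Python sketch splits a column by a grouping column and then summarises each group. The numbers are made up for illustration and are not taken from Pulse.mtw, so they say nothing about which group ran; the variable names are likewise assumptions.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical toy values, NOT real pulse data: a measurement column
# and a grouping column of codes 1 and 2 (standing in for Pulse2 and Ran).
pulse2 = [10, 20, 30, 40, 50, 60]
ran = [1, 2, 1, 2, 1, 2]

# Step 6 idea: "unstack" the measurements into one list per group code.
groups = defaultdict(list)
for value, code in zip(pulse2, ran):
    groups[code].append(value)

# Step 7 idea: summarise the measurement separately for each group code.
summary = {code: (mean(values), median(values))
           for code, values in groups.items()}

for code in sorted(summary):
    print(code, summary[code])
```

Both steps summarise exactly the same subsets of rows, so the statistics cannot differ between them.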
8. Now we’ll compare the heights of male and female volunteers. You can use the method
described in either Step 6 or Step 7 - whichever one makes more sense to you. How does the
mean height of males compare with that of females?
9. Save your work. Click on File - Save Project. Remember to save your work in your home
directory, which is shown as your student number or by the letter H. If you save it anywhere else
you won't be able to access it again.
ASSIGNMENT
Submit this assignment at the beginning of your next lab, 2 weeks from today. Assignments
must be submitted before class; any that are completed during class will be considered late
and you will not receive credit for them. Include appropriate output from Minitab to support
your answers. Remember to put your name and student number, as well as the lab time and
room you attend, on your assignment.
1. Consider an experiment where counters numbered 1 to 5 are placed in each of two bags.
One counter is drawn from each bag and then replaced. In a new Minitab worksheet,
produce the sample space for the experiment. (Hint: you should have 25 outcomes in total).
a) What is the probability of drawing any given number from the first bag?
b) Suppose we want to add the numbers on the two counters we drew. What code
would you use in Minitab to make a third column containing the sum of the numbers on the
two counters? Call this column "Total".
c) What is the probability of getting a total of 3?
d) What is the probability of getting a total of 4 or less?
e) What is the probability of getting a total of 5 or more?
2. Simulate 50 runs of this experiment.
a) What percentage of the time was the result 3?
b) What percentage of the time was the result 4 or less?
c) What percentage of the time was the result 5 or more?
Simulate 10,000 runs of this experiment.
d) What percentage of the time was the result 3?
e) What percentage of the time was the result 4 or less?
f) What percentage of the time was the result 5 or more?
g) Why are the percentages closer to the true probabilities now?
3. Using the data from the simulation of 50 trials, draw a dotplot. Do you expect it to be left or
right skewed or symmetric?
Is it in fact left or right skewed, or is it symmetric?
REVISION SUMMARY
After this lab you should be able to:
- Tally a column (using both counts and percentages)
- Use the subcommand function
- Take a random sample from a column
- Generate graphs and summary statistics
- Unstack data into new columns.