Minitab 16
Excel 2010
Reference Guide
Prepared for MATH201/MATH202
Bryan Crissinger
University of Delaware
Department of Mathematical Sciences
Contents
Introduction
Troubleshooting (Minitab)
Troubleshooting (Excel)
Getting More Help
Descriptive Statistics
  Frequency Table (Minitab)
  Bar Chart (Minitab)
  Bar Chart (Excel)
  Pie Chart (Minitab)
  Pie Chart (Excel)
  Creating New Variables (Minitab)
  Creating New Variables (Excel)
  Descriptive Statistics (Minitab)
  Descriptive Statistics (Excel)
  Dotplot (Minitab)
  Histogram (Minitab)
  Histogram (Excel)
  Boxplot (Minitab)
  Stem-and-Leaf Plot (Minitab)
  Normal Probability Plot (Minitab)
  Graphical Summary (Minitab)
  Scatterplot (Minitab)
  Scatterplot (Excel)
Probability Distributions
  Binomial Distribution (Minitab)
  Binomial Distribution (Excel)
  Poisson Distribution (Minitab)
  Poisson Distribution (Excel)
  Normal Distribution (Minitab)
  Normal Distribution (Excel)
  t Distribution (Minitab)
  t Distribution (Excel)
  Chi-Square Distribution (Minitab)
  Chi-Square Distribution (Excel)
  F Distribution (Minitab)
  F Distribution (Excel)
One-Sample Inference
  z-interval and z-test for μ (Minitab)
  t-test and t-interval for μ (Minitab)
  t-interval for μ (Excel)
  Interval and Test for p (Minitab)
  Interval and Test for σ² (Minitab)
Two-Sample Inference
  t-Interval and t-Test for μ₁ − μ₂ Using Independent Samples (Minitab)
  t-Test for μ₁ − μ₂ Using Independent Samples (Excel)
  t-Interval and t-Test for μ₁ − μ₂ Using Paired Samples (Minitab)
  t-Test for μ₁ − μ₂ Using Paired Samples (Excel)
  z-Interval and z-Test for p₁ − p₂ Using Independent Samples (Minitab)
  Test and Interval for σ₁²/σ₂² Using Independent Samples (Minitab)
  F-Test for σ₁²/σ₂² Using Independent Samples (Excel)
ANOVA
  One-Way ANOVA (Minitab)
  One-Way ANOVA (Excel)
  Two-Way ANOVA (Minitab)
  Two-Way ANOVA (Excel)
  Interaction Plot (Minitab)
Contingency Tables
  Chi-Square Test for One-Way Table (Minitab)
  Chi-Square Test for Two-Way Table (Minitab)
Regression
  Regression (Minitab)
  Regression (Excel)
Control Charts
  x̄ Chart (Minitab)
  R Chart (Minitab)
  P Chart (Minitab)
Time Series
  Time Series Plot (Minitab)
  Moving Averages (Minitab)
  Single Exponential Smoothing (Minitab)
  Trend Analysis (Minitab)
  Trend Analysis (Excel)
  Seasonal Regression Models
Introduction
Minitab
Modern statistical practice always involves the use of software to do data analysis. We will use Minitab
(version 16) or Excel (2010 or 2013) to do many statistical analyses and it will be beneficial for you to use
such a package to do homework problems that require the use of software, especially where hand
computations are burdensome. 101 Ewing is not open to students other than during lab times, so if you
need to use Minitab, there are two options:
• Computer labs on campus where Minitab software is installed: 305 Pearson Hall, 111 and 113
  MacDowell Hall, Smith Hall Computing Site, B&E Lab in Purnell
• Download Minitab at www.onthehub.com/minitab. This site allows you to try Minitab free for 30 days,
  rent Minitab for 6 months ($30), or buy a copy ($100). Minitab is currently available for Windows only.
Excel
While Excel is designed as spreadsheet software, it can perform basic data analysis functions as well.
Since Excel is used extensively in business, it's worthwhile to learn about its statistical capabilities (and
limitations).
To enable the data analysis features in Excel, you must make sure two add-ins are activated.
Excel 2007: Click on the Office Button in the upper left corner of the window and then select
Excel Options.
Excel 2010 or 2013: Choose File > Options.
Choose Add-Ins from the menu at the left. You should see the following active application add-ins:
Analysis ToolPak
Analysis ToolPak - VBA
If not, make sure Excel Add-ins is selected in the Manage box at the bottom of the window and click Go.
Check the two Analysis ToolPak add-ins and click OK. You may have to restart Excel for the change to
take effect.
These two add-ins activate the Data Analysis option under the Data tab at the top of your Excel window.
Textbook Datasets
You can load most of the datasets referenced in your textbook directly into Minitab and Excel, rather
than entering the data by hand. The files are located on the CD that comes with your textbook.
Troubleshooting (Minitab)
Problem: In Minitab Express, when I try to copy and paste a data set into the worksheet, everything
ends up in one column.
Solution: Minitab Express may recognize the columns for data copied from Excel better than from
other sources. Try copying/pasting into an Excel spreadsheet first. If the same thing happens in
Excel, check out the Troubleshooting (Excel) section below to separate the data into columns in
Excel. Once you do that, copy from Excel and paste into your Minitab Express worksheet.

Problem: I'm using File > Open Project… to open a data file but the data file doesn't show up in the
dialog box.
Solution: If the data file is a Minitab worksheet, you must use File > Open Worksheet… instead of
File > Open Project….

Problem: Minitab shows a column contains text data, e.g. C3-T, but I only have numbers in the column.
Solution: It's possible you may have typed or copied a non-numeric character in one of the cells.
Try using Data > Change Data Type > Text to Numeric… and store the numeric columns in a new column.

Problem: In a dialog box, a column I need to use is not shown in the list of available columns on
the left.
Solution: Try clicking in the box where you want to use the column first. If that doesn't work, it
could be that Minitab is expecting a numeric data column and the column you're trying to use
contains text data. Text data columns are indicated with a T suffix in the column heading, e.g. C3-T.

Problem: Paired t-test or regression: I get an error message that my two columns must be of the
same length.
Solution: Both columns must have the same number of values (can include missing values).

Problem: 2-Sample t-test: I get an error that there must be exactly two distinct subgroups.
Solution: You may be using stacked data format where there are more than 2 distinct values in the
grouping variable column (subscripts).
Troubleshooting (Excel)
Problem: When I try to copy and paste a data set into the worksheet, everything ends up in one column.
Solution: Copy the data as usual. Select the cell in the top left corner of the space where you want
to paste the data. Right-click on this cell, select Paste Special…, and choose Text as the source.
If that doesn't work, highlight the column you pasted, choose Data > Text to Columns, and choose
Delimited as the file type. Select delimiters from the list until the columns are shown separated
properly.
If that fails, try importing the data to Minitab first, copy the Minitab worksheet, and paste into
Excel.

Problem: There are no statistics displayed when I request descriptive statistics.
Solution: You must check the Summary statistics box in the dialog box.

Problem: A histogram is not displayed when I request a histogram.
Solution: You must check the Chart Output box in the dialog box.

Problem: I get this error message: "Input range contains non-numeric data."
Solution: If an input range contains a column label, check the Labels or Labels in First Row box in
the dialog box. Otherwise, there may be non-numeric characters in the column (usually indicated by
left-justified numbers).

Problem: Paired t-test or regression: I get an error message that my two columns must have the same
number of rows.
Solution: Excel doesn't handle missing data for either a paired analysis of means or regression.
All columns used must contain only non-missing values and be of the same length.
Getting More Help
This guide is designed to be a quick-reference tool, not an exhaustive reference.
Minitab
Help Buttons: You can access documentation for specific dialog boxes by clicking on the Help button
in the dialog box.
Help > Help: Access help by topic as well as use the index and search features.
Help > Methods and Formulas: This feature shows the methods and formulas used in the procedures you
specify.

Excel
? Button: Click on the ? in the upper right of the window to access Microsoft's help for Excel.
Note: The Excel/DDXL or Excel/XLSTAT parts of the Using Technology sections at the ends of the
chapters will not apply unless you install the DDXL add-in (on the CD accompanying the textbook).
We do not use this add-in as it is not generally available with Excel.

Textbook: Using Technology Sections
You'll find documentation on using many of the tools listed in this guide and others in the Using
Technology sections at the end of each chapter.
Frequency Table (Minitab)
The data should be entered in the Minitab worksheet with one row per observation. In
this example we have data on several students in a class. The variables are Student ID
and Class.
Choose Stat > Tables > Tally Individual Variables.
In the dialog box, select the categorical
variable you want to summarize with a
frequency table. The default frequency
table includes only category counts
(frequencies) but you can also request
percents (similar to relative
frequencies) in this dialog box, as
shown here.
The frequency table will be displayed in the Session window. Note that misspelled
words are considered a different category.
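For readers who want to check a tally outside Minitab, the same counts and percents can be sketched
in Python. The class labels below are made-up illustrative data, not the guide's worksheet. The
misspelled "Frehsman" is tallied as its own category, which mirrors the behavior noted above.

```python
from collections import Counter

# Hypothetical class labels, one entry per student (one row per observation).
classes = ["Freshman", "Frehsman", "Junior", "Junior", "Senior",
           "Sophomore", "Sophomore", "Sophomore"]

counts = Counter(classes)          # category -> frequency
total = sum(counts.values())

# Print categories alphabetically with count and percent.
for category in sorted(counts):
    percent = 100 * counts[category] / total
    print(f"{category:<10} {counts[category]:>5} {percent:>7.2f}")
print(f"{'N=':<10} {total:>5}")
```

Correcting the spelling in the data before tallying merges the two categories.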
Bar Chart (Minitab)
The data should be entered in the Minitab worksheet with one row per observation. In
this example we have data on several students in a class. The variables are Student ID
and Class.
Choose Graph > Bar Chart.
Choose the Simple bar chart option and specify Class as the categorical variable you
want to chart. You can customize the chart by using the six buttons in the dialog box.
The chart will display in a separate graph window.
[Figure: Chart of Class. Bar chart of Count (0 to 6) by Class, with categories Frehsman, Freshman,
Junior, Senior, and Sophomore.]
Bar Chart (Excel)
Enter the frequency table in the worksheet; Excel won't do this automatically for you.
Highlight the cells containing the frequency table and choose Insert > Column for a
vertical bar chart or Insert > Bar for a horizontal bar chart.
Pie Chart (Minitab)
The data should be entered in the Minitab worksheet with one row per observation. In
this example we have data on several students in a class. The variables are Student ID
and Class.
Choose Graph > Pie Chart.
The default pie chart requires data in the format as shown above with one row per
observation (Chart counts of unique values). Specify Class as the categorical variable
you want to chart. You can customize the chart by using the six buttons in the dialog
box. Use the Labels button and the Slice Labels tab to label the pie slices with the
category percentages.
The pie chart will be shown in a separate graph window.
[Figure: Pie Chart of Class. Slices for the categories Frehsman, Freshman, Junior, Senior, and
Sophomore, labeled with percentages.]
Pie Chart (Excel)
Enter the frequency table in the worksheet; Excel won't do this automatically for you.
Highlight the cells containing the frequency table and choose Insert > Pie.
An easy way to add the percentages to the chart is to select the first Chart Layout after
you've created the pie chart. Click on the chart title to edit it.
[Figure: Excel pie chart titled "Class Rank", with slices labeled by count and percentage.]
Creating New Variables (Minitab)
You can create new columns using the data in existing columns quickly and easily.
Option 1: Calc > Row Statistics…
Here we want to compute an average price for each
of the stocks in the worksheet using the four existing
prices.
Input variables: select the four columns of prices
Store result in: specify a name for the new column
where Minitab will store the average price for each
stock
Minitab computes the average prices and stores them in the worksheet in the next
available column.
Option 2: Calc > Calculator…
Store result in variable: specify a name for
the new column where Minitab will store the
average price for each stock
Expression: write the expression for the
calculation you want to do
This option gives you more flexibility in the
kinds of calculations you can do.
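As a rough illustration of what a row-wise calculation produces, here is a minimal Python sketch
of the average-price example. The stock names and prices are hypothetical; only the idea (one new
value per row, computed from that row's four prices) comes from the guide.

```python
from statistics import mean

# Hypothetical worksheet: one row per stock, four price columns.
prices = {
    "AAA": [10.0, 12.0, 11.0, 13.0],
    "BBB": [50.0, 48.0, 52.0, 50.0],
}

# One new value per row: the average of that row's four prices.
avg_price = {stock: mean(row) for stock, row in prices.items()}
print(avg_price)
```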
Creating New Variables (Excel)
In Excel, you create new columns by writing formulas in the cells. Begin the formula
with an equal sign.
Then copy the cell containing the formula and paste it to all the cells below it. An easy
way to do that is to drag the lower right corner of the cell straight down. The cursor will
change from a white cross to a black cross.
Since we did not use any absolute cell references (e.g. $B$2), the references
automatically change to the appropriate rows.
Descriptive Statistics (Minitab)
Option 1: Use Stat > Basic Statistics > Display Descriptive Statistics… to obtain
many numeric descriptive statistics for columns of numeric data.
Here we have data on the number of soft drinks consumed per week for a sample of
males and a sample of females. The data are shown in the worksheet in two different
formats:
Unstacked Format (C1 and C2): Each sample of data has its own column.
Stacked Format (C3 and C4): Both samples are stacked in C3 with a column of
gender indicators in C4.
In practice, we typically have grouped data in only one format.
Variables: Specify the column(s) containing the data you want to summarize.
If you want to summarize data separately for different groups, specify the analysis as
shown below depending on whether you have unstacked or stacked data.
Unstacked
Stacked
The output will display in the
session window. Note the
slight differences in the output
for the two data formats. Many
tools in Minitab can
accommodate data in either
format.
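The difference between the two layouts can be made concrete with a short Python sketch. The values
are hypothetical; the code converts stacked data (one value column plus one group column) into
unstacked form (one list per group) and then summarizes each group.

```python
from statistics import mean, stdev

# Hypothetical stacked data: values in one list, group labels in another
# (like C3 and C4 in the worksheet described above).
number = [3, 5, 2, 8, 4, 6, 1, 7]
gender = ["Male", "Female", "Male", "Female", "Male", "Female", "Male", "Female"]

# Unstack: collect each group's values into its own list (like C1 and C2).
unstacked = {}
for value, group in zip(number, gender):
    unstacked.setdefault(group, []).append(value)

# Summarize each group separately.
for group, values in unstacked.items():
    print(group, "N =", len(values), "Mean =", mean(values),
          "StDev =", round(stdev(values), 3))
```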
Option 2: Use Calc > Column Statistics… to obtain a single statistic for a numeric
data column.
Specify the Statistic you want.
Input variable: Specify the column
containing the data you want to summarize.
By default, the output will display in the
Session window unless you specify an
optional storage column.
Descriptive Statistics (Excel)
Use Data > Data Analysis > Descriptive Statistics.
Here we have data in unstacked format.
Input Range: Highlight the cells containing
the data.
Grouped By: Columns
This tells Excel we have unstacked data.
Labels in First Row: Check this box if the
Input Range contains column labels. If not,
don't check the box. Checking the box tells
Excel to ignore what's in the first row of the
Input Range.
Summary Statistics: You must check this
box for the output to display.
The output will be displayed in a separate worksheet. Make the columns containing the
labels (here A and C) wider so that the entire labels are displayed.
Dotplot (Minitab)
Use Graph > Dotplot…
The dialog box gives you several input options depending on the format of the data.
One Y, Simple: Use this for one-sample
analysis, i.e. a single column of numeric data.
One Y, With Groups: Use for stacked data.
Multiple Y's, Simple: Use for unstacked data.
Multiple Y's, With Groups: Use for stacked
data having multiple response/comparison
variables.
While you have the option to stack the dots in the two dotplots with grouped data, this is
not recommended as it makes features such as shape difficult to determine.
One Y, Simple: Displays the distribution of number of soft drinks combined for males
and females.
[Figure: Dotplot of Number. x-axis: Number, 2 to 12.]
Multiple Y's, Simple: Uses the unstacked data to display the distribution of number of
soft drinks separately for males and females.
[Figure: Dotplot of NumMales, NumFemales. One row of dots per variable; x-axis: Data, 2 to 12.]
One Y, With Groups: Same comparison analysis using the stacked data.
[Figure: Dotplot of Number by Gender (Female, Male). x-axis: Number, 2 to 12.]
Histogram (Minitab)
Use Graph > Histogram…
The dialog box gives you several input options depending on the format of the data.
Simple: Use this for one-sample analysis, i.e. a
single column of numeric data.
With Outline and Groups: Use for stacked data.
You also have the option of having Minitab draw the best-fitting normal distribution on the
histogram.
Simple: Displays the distribution of number of soft drinks combined for males and
females.
[Figure: Histogram of Number. x-axis: Number, 1.5 to 12.0; y-axis: Frequency.]
Group Comparisons
An alternative to using the With Outline and Groups option to create multiple histograms
for grouped data (as the picture can get messy with all the outlines overlaid) is to use
the Simple option and the Multiple Graphs… button.
Stacked Data
Graph Variables: Specify the single
column of numeric data, i.e. the
response/comparison variable.
Click on Multiple Graphs… and request that
the multiple histograms be shown In
separate panels of the same graph.
Also check the box so that the X axis scales
and bins will be the same for the histograms.
By Variable tab: Specify Gender as the "by
variable", i.e. the grouping variable.
It is important that the x-axis scales be identical to allow for an accurate comparison of the
features of the distributions.
[Figure: Histogram of Number, with separate panels for Female and Male. x-axis: Number, 1.5 to 12.0;
y-axis: Frequency; panel variable: Gender.]
Unstacked Data
Graph Variables: Specify both numeric
data columns.
Click on Multiple Graphs… and request that
the multiple histograms be shown In
separate panels of the same graph.
Also check the box so that the X axis scales
and bins will be the same for the histograms.
Note that the scales of the frequency axes are different; these could be made identical by checking
the Same Y in the Multiple Graphs… dialog.
[Figure: Histogram of NumMales, NumFemales, in separate panels. y-axis: Frequency.]
You can change the default interval (bin) definitions by right-clicking on the bars of the
histogram once it's created and selecting Edit Bars…
Under the Binning tab change Interval Type
to Cutpoint.
Define the Cutpoint positions with a list of the
endpoints of the intervals. You can do this
long-hand: 0 1 2 3 4 5 6 7 8 9 10 11 12
or by using Minitab short-hand as shown
here.
0:12/1 requests intervals starting at 0 and
ending at 12 each with width of 1.
[Figure: Histogram of Number with cutpoints 0, 1, ..., 12. x-axis: Number; y-axis: Frequency.]
Note: Minitab's interval/bin definitions use the [ , ) convention. For example, in the
above histogram, the interval [1, 2) includes the 4 subjects who drink 1 soft drink per
week.
Note: You can also use the Midpoint Interval Type and change the Number of
intervals to the desired number without having to specify the individual endpoints.
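The cutpoint shorthand and the [ , ) convention can be illustrated with a short Python sketch. The
data are hypothetical, and the function `count_left_closed` is an illustrative helper that mimics
only the counting rule described in the note above, not Minitab's own implementation.

```python
# Cutpoints equivalent to the shorthand 0:12/1 -> 0, 1, 2, ..., 12.
cutpoints = list(range(0, 13, 1))

# Hypothetical soft-drink counts per week.
data = [1, 1, 2, 2, 2, 3, 5, 6, 6, 8, 11]

def count_left_closed(values, cuts):
    """Count values in each interval [cuts[i], cuts[i+1])."""
    counts = [0] * (len(cuts) - 1)
    for v in values:
        for i in range(len(cuts) - 1):
            if cuts[i] <= v < cuts[i + 1]:
                counts[i] += 1
                break
    return counts

counts = count_left_closed(data, cutpoints)
for lo, hi, c in zip(cutpoints, cutpoints[1:], counts):
    print(f"[{lo}, {hi}): {c}")
```

Under this rule a value of exactly 2 is counted in [2, 3), not in [1, 2).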
Histogram (Excel)
In addition to having the data entered in the worksheet, you must enter a list of the
endpoints of the intervals (bins) for the histogram somewhere in the worksheet.
Use Data > Data Analysis > Histogram.
Input Range: Highlight the column of
cells containing the data.
Bin Range: Highlight the column of
cells containing the interval endpoints.
Labels: Check this box if both the Input
Range and Bin Range contain column
labels. If not, don't check the box.
Checking the box tells Excel to ignore
what's in the first row of these ranges.
Chart Output: You must check this box for the histogram to display.
Once the histogram is created, select the bars portion
of the graph and right-click. Choose Format Data
Series… and change the Gap Width to 0% (No Gap).
Excel outputs both the histogram and a frequency
table showing the counts in each interval (bin). The
Bin Endpoints in the frequency table are the upper
endpoints of each interval. For example, there are 5
observations in the interval (6, 8], 3 observations in the
interval (8, 10], etc.
In order to construct multiple histograms, you must do a separate analysis for each
group. Be sure to use the same Bin Range for each.
Note: Excel's interval/bin definitions use the ( , ] convention. For example, in the above
histogram, the interval (4, 6] includes the 7 subjects who drink either 5 or 6 soft drinks
per week.
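The ( , ] convention can be sketched in Python in the same spirit. The bin endpoints and data are
hypothetical. Each value is counted in the first bin whose upper endpoint is greater than or equal
to it, and values above the last endpoint fall into a final "More" slot, as in Excel's output table.

```python
from bisect import bisect_left

# Hypothetical bin range (upper endpoints), as entered in the worksheet.
bin_endpoints = [2, 4, 6, 8, 10, 12]

# Hypothetical soft-drink counts per week.
data = [1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 12]

# counts[i] holds values v with bin_endpoints[i-1] < v <= bin_endpoints[i];
# the first bin takes everything <= 2, and a final "More" slot takes
# anything above the last endpoint.
counts = [0] * (len(bin_endpoints) + 1)
for v in data:
    counts[bisect_left(bin_endpoints, v)] += 1

for upper, c in zip(bin_endpoints + ["More"], counts):
    print(upper, c)
```

Here the two values equal to 6 are counted in (4, 6], whereas a [ , ) rule would place them in the
next interval up.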
Boxplot (Minitab)
Use Graph > Boxplot…
For this example we'll use data on length of time in practice (years) for a sample of
physicians.
The dialog box gives you several input options depending on the format of the data.
One Y, Simple: Use this for one-sample
analysis, i.e. a single column of numeric data.
One Y, With Groups: Use for stacked data.
Multiple Y's, Simple: Use for unstacked data.
Multiple Y's, With Groups: Use for stacked
data having multiple response/comparison
variables.
Here we show an analysis of length of time in practice (YRSPRAC) by specialty
(SPEC). Since these data are stacked (single numeric column of times and a
categorical column of specialty indicators), we'll use the One Y, With Groups input
option.
Graph variables: Specify the single
column of numeric data, i.e. the
response/comparison variable.
Categorical variables for grouping:
Specify the categorical grouping variable.
You can display the boxplot horizontally instead of
vertically (the default) by using the Scale… button
and checking the box to Transpose value and
category scales.
[Figure: Boxplot of YRSPRAC by SPEC (MED, SURG), displayed horizontally. x-axis: YRSPRAC, 0 to 40.]
The output shows an outlier at 40 years in practice in the Surgery specialty sample. Minitab
classifies any observations outside the inner fences as outliers and shows them as asterisks in the
boxplot.
The whiskers extend to the most extreme observations just inside the inner fences.
Hover the mouse pointer over the boxplot to see some descriptive statistics and numeric features of
the boxplot.
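The inner-fence rule can be sketched in Python. This is an illustration under stated assumptions:
the data are hypothetical, the fences are taken to be 1.5 × IQR beyond the quartiles (the usual
boxplot convention), and the quartiles come from Python's `statistics.quantiles`, whose default
method may not match Minitab's quartile calculation in every case.

```python
from statistics import quantiles

# Hypothetical years-in-practice values for one specialty.
years = [2, 4, 5, 7, 8, 10, 11, 12, 14, 15, 40]

# Quartiles; the default "exclusive" method interpolates at (n + 1) * p.
q1, median, q3 = quantiles(years, n=4)
iqr = q3 - q1

# Inner fences, assuming the usual 1.5 * IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [y for y in years if y < lower_fence or y > upper_fence]
inside = [y for y in years if lower_fence <= y <= upper_fence]
whiskers = (min(inside), max(inside))   # most extreme points inside the fences

print("Q1, Q3:", q1, q3)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers, "whiskers end at:", whiskers)
```

In this made-up sample only the value 40 lies beyond the upper fence, so it would be drawn as an
asterisk and the upper whisker would stop at 15.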
Stem-and-Leaf Plot (Minitab)
Use Graph > Stem-and-Leaf…
Graph variables: Specify a column of
numeric data.
By variable: Specify the categorical grouping
variable.
Note: The by variable needs to be coded with numbers in the worksheet for you to be able to use it
here. In our data example, we'd first have to create a new column with numeric codes: for example,
1 = SURG, 2 = MED.
You can do this automatically using Data > Code > Text to Numeric…
This particular graph is displayed in the Session window.
The leftmost column in the stem-and-leaf plot shown is called the depths. The numbers
are cumulative counts of numbers of leaves in each row, starting from each extreme and
increasing up to the row containing the median. The depth for the row containing the
median is a simple count of the number of leaves in that row and is indicated in
parentheses.
In this example, there are 16 leaves in the first row, 21 in the second row for a total of
37, 2 leaves in the last row, 1 in the next to last row for a total of 3, etc. The median is
in the third row. There are 23 leaves in that row.
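The depth column can be reproduced from the row counts with a small Python sketch. The function
`depths` is an illustrative helper, not part of Minitab, and it assumes the median falls inside a
single row. The first three and last two counts follow the example above; the counts in between are
hypothetical fillers chosen only to complete the list.

```python
def depths(row_counts):
    """Return the depth column for a stem-and-leaf plot.

    row_counts: number of leaves in each row, top to bottom.
    The median row is reported as "(count)"; rows above it get cumulative
    counts from the top, rows below it get cumulative counts from the bottom.
    Sketch only: assumes the median falls inside a single row.
    """
    n = sum(row_counts)
    middle = (n + 1) / 2              # position of the median
    # Cumulative counts from the top locate the median row.
    from_top, running = [], 0
    for c in row_counts:
        running += c
        from_top.append(running)
    median_row = next(i for i, cum in enumerate(from_top) if cum >= middle)

    result = []
    for i, c in enumerate(row_counts):
        if i < median_row:
            result.append(str(from_top[i]))
        elif i == median_row:
            result.append(f"({c})")
        else:
            result.append(str(sum(row_counts[i:])))   # count from the bottom
    return result

# First three and last two counts follow the example in the text;
# the middle counts are hypothetical fillers.
counts = [16, 21, 23, 20, 15, 10, 4, 1, 2]
print(depths(counts))
```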
Normal Probability Plot (Minitab)
Use Graph > Probability Plot… > Single
Using the Multiple option overlays several normal probability plots on the same set of
axes which can look rather jumbled.
Graph variables: Specify a column of
numeric data.
You can request multiple normal
probability plots for grouped data in a way
similar to that for histograms. See the
documentation for Histogram (Minitab) for
details.
The extent to which the pattern of points deviates from a straight line is an indication as to the
lack of fit of a normal model for the data. Minitab also provides confidence bands and the
Anderson-Darling test for normality:
H0: data come from a normal model
Ha: data do not come from a normal model
[Figure: Probability Plot of YRSPRAC, Normal - 95% CI. x-axis: YRSPRAC; y-axis: Percent.
Legend: Mean 14.60, StDev 9.161, N 112, AD 0.954, P-Value 0.015.]
Graphical Summary (Minitab)
This tool displays many important numeric statistics alongside several graphical
summaries for numeric variables.
Use Stat > Basic Statistics > Graphical Summary…
The data shown are eruption data for several eruptions of the Old Faithful geyser in
Yellowstone National Park, Wyoming. Here we summarize the actual time until the next
eruption in minutes (ATM).
Graph variables: Specify a column of numeric
data.
By variables (optional): Specify an optional
categorical variable for grouping if you want
separate analyses for different groups.
Confidence level: Specify a confidence level for
confidence intervals for the population mean,
median, and standard deviation.
In addition to many basic descriptive statistics, Minitab shows the results of a formal test
for normality (see Normal Probability Plot) and several confidence intervals.
The histogram is editable, just like the output of Graph > Histogram so that, for example,
you can change the interval/bin definitions by right-clicking on one of the bars (see
Histogram).
[Figure: Summary for ATM; histogram with 95% confidence intervals for the mean and median]
Anderson-Darling Normality Test
A-Squared      1.65
P-Value <      0.005
Mean           76.352
StDev          16.494
Variance       272.044
Skewness       -0.07322
Kurtosis       -1.40627
N              54
Minimum        49.000
1st Quartile   60.000
Median         82.000
3rd Quartile   91.000
Maximum        107.000
95% Confidence Interval for Mean:    (71.850, 80.854)
95% Confidence Interval for Median:  (65.000, 85.643)
95% Confidence Interval for StDev:   (13.865, 20.362)
Scatterplot (Minitab)
There are several ways to obtain a scatterplot of
(x,y) pairs of numeric data. The data must be
entered in the worksheet with two columns for the
numeric data and where the rows represent each
observation.
Here we have 24 orange juice samples, 6 from
each of 4 brands, with the pectin content and a
measure of sweetness recorded for each.
Basic scatterplots: Use Graph > Scatterplot…
You can also add regression lines and use
different plotting symbols for different groups.
We show two examples here.
Simple: Specify the columns containing the y-axis and x-axis coordinates.
[Figure: Scatterplot of SweetIndex vs Pectin]
With Regression and Groups: Specify the columns containing the y-axis and x-axis
coordinates.
Categorical variables for grouping: Specify the column containing the group
indicators.
[Figure: Scatterplot of SweetIndex vs Pectin, grouped by Brand (A, B, C, D)]
Scatterplot with some regression output: Use Stat > Regression > Fitted Line
Plot…
Response (Y): Specify the column
containing the y-axis coordinates.
Predictor (X): Specify the column
containing the x-axis coordinates.
Type of Regression Model: Specify
the relationship between x and y; the
default is Linear; you may also specify a Quadratic or Cubic model.
The graph output shows the regression line overlaid on the scatterplot as well as the
estimated regression equation and some basic regression statistics. The Session
window shows these statistics as well as the regression ANOVA table.
[Figure: Fitted Line Plot, SweetIndex = 6.252 - 0.002311 Pectin; S 0.214998, R-Sq 22.9%, R-Sq(adj) 19.4%]
Regression Analysis: SweetIndex versus Pectin
The regression equation is
SweetIndex = 6.252 - 0.002311 Pectin
S = 0.214998   R-Sq = 22.9%   R-Sq(adj) = 19.4%
Analysis of Variance
Source      DF  SS       MS        F     P
Regression   1  0.30140  0.301402  6.52  0.018
Error       22  1.01693  0.046224
Total       23  1.31833
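To see how the summary statistics relate to the ANOVA table, the following Python sketch recomputes S, R-Sq, R-Sq(adj), and F from the sums of squares and degrees of freedom shown in the output. The variable names are our own; the formulas are the standard ones for simple linear regression, and small differences from Minitab's values are expected because the inputs are rounded.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, ss_error, ss_total = 0.30140, 1.01693, 1.31833
df_regression, df_error, df_total = 1, 22, 23

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

s = math.sqrt(ms_error)                           # S: standard error of regression
r_sq = ss_regression / ss_total                   # R-Sq
r_sq_adj = 1 - ms_error / (ss_total / df_total)   # R-Sq(adj)
f_stat = ms_regression / ms_error                 # F statistic

print(f"S = {s:.6f}")
print(f"R-Sq = {100 * r_sq:.1f}%")
print(f"R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"F = {f_stat:.2f}")
```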
Scatterplot (Excel)
The data must be entered in the worksheet with two columns
for the numeric data and where the rows represent each
observation. The column containing the x-axis coordinates
must be first.
Here we have 24 orange juice samples, the pectin content,
and measure of sweetness for each.
Highlight both columns of data and then choose Insert >
Scatter.
Axis labels should be added: with the graph window active,
choose Layout > Axis Titles and select a format for the
Horizontal and Vertical axes.
You can also delete the legend and delete or change the
chart title.
A regression line can be overlaid on the scatterplot by right-clicking on one of the points and selecting Add Trendline…
[Figure: Scatterplot of SweetIndex vs Pectin]
Binomial Distribution (Minitab)
Use Graph > Probability Distribution Plot… > View Probability.
Distribution tab:
Select the Binomial distribution from the dropdown box.
Number of trials: Specify n.
Event probability: Specify p.
Shaded Area tab: The following dialogs show how to specify several kinds of
probability calculations.
To input values of x and compute probabilities, specify that you want to Define Shaded
Area By X Value.
Below are several examples where the random variable X has a Binomial distribution
with n = 10 and p = .3.
P(X ≥ 4) = .3504
[Figure: Distribution Plot, Binomial, n=10, p=0.3; shaded area 0.3504 for X ≥ 4]
P(X ≤ 4) = .8497
[Figure: Distribution Plot, Binomial, n=10, p=0.3; shaded area 0.8497 for X ≤ 4]
P(4 ≤ X ≤ 6) = .3398
[Figure: Distribution Plot, Binomial, n=10, p=0.3; shaded area 0.3398 for 4 ≤ X ≤ 6]
Note: To get individual probabilities of the form P(X = k), use the Middle option and
specify the same values for x1 and x2.
P(X = 4) = .2001
[Figure: Distribution Plot, Binomial, n=10, p=0.3; shaded area 0.2001 at X = 4]
Binomial Distribution (Excel)
In a blank cell, type an equal sign to insert a function and select More Functions…
from the drop-down list of functions.
Select the BINOMDIST
function and complete the
dialog box.
Number_s: Specify the value of k.
Trials: Specify n.
Probability_s: Specify p.
Cumulative: Specify TRUE to compute P(X ≤ k) or FALSE to compute P(X = k).
In this example we compute P(X ≤ 4) = .849731667 for n = 10 and p = .3. The result
is shown in the dialog box as soon as you specify all four inputs.
Here we compute P(X = 4) = .200120949 by specifying FALSE for Cumulative.
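As a cross-check on the BINOMDIST results, the same probabilities can be computed directly from the binomial formula. The sketch below uses only Python's standard library; the helper names binom_pmf and binom_cdf are our own and are not part of Excel or Minitab.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k), summing the individual probabilities from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n, p = 10, 0.3
print(binom_cdf(4, n, p))                         # P(X <= 4), Cumulative = TRUE
print(binom_pmf(4, n, p))                         # P(X = 4), Cumulative = FALSE
print(1 - binom_cdf(3, n, p))                     # P(X >= 4) = 1 - P(X <= 3)
print(binom_cdf(6, n, p) - binom_cdf(3, n, p))    # P(4 <= X <= 6)
```

The last two lines show how the right-tail and middle probabilities from the Minitab examples follow from cumulative probabilities.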
Poisson Distribution (Minitab)
Use Graph > Probability Distribution Plot… > View Probability.
Distribution tab:
Select the Poisson distribution from the dropdown box.
Mean: Specify λ.
Shaded Area tab: The following dialogs show how to specify several kinds of
probability calculations.
To input values of x and compute probabilities, specify that you want to Define Shaded
Area By X Value.
Below are several examples where the random variable X has a Poisson distribution
with λ = 3.8.
P(X ≥ 6) = .1844
[Figure: Distribution Plot, Poisson, Mean=3.8; shaded area 0.1844 for X ≥ 6]
P(X ≤ 3) = .4735
[Figure: Distribution Plot, Poisson, Mean=3.8; shaded area 0.4735 for X ≤ 3]
P(2 ≤ X ≤ 7) = .8525
[Figure: Distribution Plot, Poisson, Mean=3.8; shaded area 0.8525 for 2 ≤ X ≤ 7]
Note: To get individual probabilities of the form P(X = k), use the Middle option and
specify the same values for x1 and x2.
P(X = 4) = .1944
[Figure: Distribution Plot, Poisson, Mean=3.8; shaded area 0.1944 at X = 4]
Poisson Distribution (Excel)
In a blank cell, type an equal sign to insert a function and select More Functions…
from the drop-down list of functions.
Select the POISSON function
and complete the dialog box.
X: Specify the value of k.
Mean: Specify λ.
Cumulative: Specify TRUE to compute P(X ≤ k) or FALSE to compute P(X = k).
In this example we compute P(X ≤ 4) = .667843601 for λ = 3.8. The result is shown in
the dialog box as soon as you specify all three inputs.
Here we compute P(X = 4) = .194358757 by specifying FALSE for Cumulative.
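The POISSON results can be checked the same way from the Poisson formula. This is a minimal standard-library Python sketch; the helper names poisson_pmf and poisson_cdf are our own.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_cdf(k, lam):
    """P(X <= k), summing the individual probabilities from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


lam = 3.8
print(poisson_cdf(4, lam))                          # P(X <= 4), Cumulative = TRUE
print(poisson_pmf(4, lam))                          # P(X = 4), Cumulative = FALSE
print(1 - poisson_cdf(5, lam))                      # P(X >= 6) = 1 - P(X <= 5)
print(poisson_cdf(7, lam) - poisson_cdf(1, lam))    # P(2 <= X <= 7)
```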
Normal Distribution (Minitab)
Use Graph > Probability Distribution Plot… > View Probability.
Distribution tab:
Select the Normal distribution from the
drop-down box.
Mean: Specify μ.
Standard deviation: Specify σ.
Note: The default normal distribution is
the standard normal.
Shaded Area tab: The following dialogs show how to specify several kinds of
probability calculations.
To input values of x and compute probabilities, specify that you want to Define Shaded
Area By X Value.
Below are several examples where the random variable X has a standard normal
distribution.
P(X ≥ 1.28) = .1003
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.1003 for X ≥ 1.28]
P(X ≤ 1.28) = .8997
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.8997 for X ≤ 1.28]
P(|X| > 1.96) = P(X < −1.96) + P(X > 1.96) = .0250 + .0250 = .0500
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.02500 in each tail beyond ±1.96]
P(−2 < X < 1) = .8186
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.8186 for −2 < X < 1]
Note: For continuous distributions, P(X = k) = 0 since there is no area under the curve
at a point.
P(X = 1.55) = 0
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0 at X = 1.55]
To input probability values and compute x values, specify that you want to Define
Shaded Area By Probability.
Find x₀ such that P(X > x₀) = .90.
x₀ = −1.282
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.9 to the right of −1.282]
Find x₀ such that P(X < x₀) = .75.
x₀ = .6745
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.75 to the left of 0.6745]
Find x₀ such that P(|X| > x₀) = .01.
x₀ = 2.576
[Figure: Distribution Plot, Normal, Mean=0, StDev=1; shaded area 0.005 in each tail beyond ±2.576]
Normal Distribution (Excel)
In a blank cell, type an equal sign to insert a function and select More Functions…
from the drop-down list of functions.
Select the NORMDIST
function and complete the
dialog box.
X: Specify the value of k.
Mean: Specify μ.
Standard_dev: Specify σ.
Cumulative: Specify TRUE to compute P(X ≤ k). Specifying FALSE will compute f(k), the
height of the normal curve at X = k.
In this example we compute P(X ≤ −.55) = .291159687 for the standard normal
distribution. The result is shown in the dialog box as soon as you specify all four inputs.
To input probability values and compute x values, use the NORMINV function.
Find x₀ such that P(X ≤ x₀) = .95.
x₀ = 1.644853627
Probability: Specify the cumulative probability, i.e. the area under the curve to the left of x₀.
Mean: Specify μ.
Standard_dev: Specify σ.
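For readers who want to verify the NORMDIST and NORMINV results outside Excel, Python's standard library provides statistics.NormalDist, whose cdf and inv_cdf methods play the same roles. The sketch below reproduces the two Excel examples and two of the Minitab examples; the variable names are our own.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)   # standard normal distribution

p_lower = z.cdf(-0.55)               # P(X <= -.55), like NORMDIST with TRUE
x0 = z.inv_cdf(0.95)                 # x0 with P(X <= x0) = .95, like NORMINV
p_two_tail = 2 * (1 - z.cdf(1.96))   # P(|X| > 1.96), using symmetry
p_between = z.cdf(1) - z.cdf(-2)     # P(-2 < X < 1)

print(p_lower, x0, p_two_tail, p_between)
```

A non-standard normal distribution is handled by passing different mu and sigma values.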
t Distribution (Minitab)
Use Graph > Probability Distribution Plot… > View Probability.
Distribution tab: Select the t distribution
from the drop-down box.
Degrees of freedom: Specify the degrees
of freedom.
Shaded Area tab: The following dialogs show how to specify several kinds of
probability calculations.
To input values of x (t values) and compute probabilities, specify that you want to
Define Shaded Area By X Value.
Below are several examples where the random variable X has a t distribution with 12
degrees of freedom.
P(t > 2.179) = .02499
[Figure: Distribution Plot, T, df=12; shaded area 0.02499 for X > 2.179]
P(t ≤ 3.055) = .9950
[Figure: Distribution Plot, T, df=12; shaded area 0.9950 for X ≤ 3.055]
P(|t| > 1.356) = P(t < −1.356) + P(t > 1.356) = .1000 + .1000 = .2000
[Figure: Distribution Plot, T, df=12; shaded area 0.1000 in each tail beyond ±1.356]
P(1.5 ≤ t ≤ 2.5) = .06577
[Figure: Distribution Plot, T, df=12; shaded area 0.06577 for 1.5 ≤ X ≤ 2.5]
To input probability values and compute x values, specify that you want to Define
Shaded Area By Probability.
Find t₀ such that P(t ≥ t₀) = .05.
t₀ = 1.782
[Figure: Distribution Plot, T, df=12; shaded area 0.05 to the right of 1.782]
Find t₀ such that P(|t| > t₀) = .01.
t₀ = 3.055
[Figure: Distribution Plot, T, df=12; shaded area 0.005 in each tail beyond ±3.055]
t Distribution (Excel)
In a blank cell, type an equal sign to insert a function and select More Functions…
from the drop-down list of functions.
There are several functions you can use
depending on what you want to do.
In the following examples we'll use a t-distribution
with 6 degrees of freedom.
Finding lower-tailed probabilities:
Select the T.DIST function
and complete the dialog box.
X: Specify the t value.
Deg_freedom: Specify the degrees of freedom.
Cumulative: Specify TRUE to compute P(t ≤ −1.76). Specifying FALSE will compute
f(−1.76), the height of the t curve at −1.76.
In this example we compute P(t ≤ −1.76) = .064447607. The result is shown in the
dialog box as soon as you specify all three inputs.
Finding upper-tailed probabilities:
Select the T.DIST.RT
function and complete the
dialog box.
X: Specify the t value.
Deg_freedom: Specify the degrees of freedom.
In this example we compute P(t > 2.73) = .017093009. The result is shown in the
dialog box as soon as you specify both inputs.
Finding two-tailed probabilities:
Select the T.DIST.2T function and complete the dialog box.
X: Specify the t value. This must be a positive value.
Deg_freedom: Specify the degrees of freedom.
In this example we compute P(|t| > 1.56) = .169778183. The result is shown in the
dialog box as soon as you specify both inputs.
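The three kinds of tail probability are related through the symmetry of the t curve. Python's standard library has no t distribution, so the sketch below integrates the t density numerically with Simpson's rule; this is an illustrative approach of our own (helper names t_pdf and t_cdf are hypothetical), not how Excel computes T.DIST.

```python
import math


def t_pdf(x, df):
    """Height of the t curve with df degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(t <= x) via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # area under the curve between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


df = 6
lower = t_cdf(-1.76, df)           # lower tail, like T.DIST with TRUE
upper = 1 - t_cdf(2.73, df)        # upper tail, like T.DIST.RT
two = 2 * (1 - t_cdf(1.56, df))    # two tails, like T.DIST.2T
print(lower, upper, two)
```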
To input probability values and compute t values, use these functions:
Find t₀ such that P(t ≤ t₀) = .95.
t₀ = 1.943180281
Select the T.INV function and complete the dialog box.
Probability: Specify the cumulative probability, i.e. the area under the curve to the left of t₀.
Deg_freedom: Specify the degrees of freedom.
Find t₀ such that P(|t| > t₀) = .05.
t₀ = 2.446911851
Select the T.INV.2T function and complete the dialog box.
Probability: Specify the total tail probability, i.e. the area under the curve to the left of −t₀
plus the area to the right of t₀.
Deg_freedom: Specify the degrees of freedom.
Chi-Square Distribution (Minitab)
Use Graph > Probability Distribution Plot… > View Probability.
Distribution tab: Select the Chi-Square
distribution from the drop-down box.
Degrees of freedom: Specify the degrees
of freedom.
Shaded Area tab: The following dialogs show how to specify several kinds of
probability calculations.
To input values of x² and compute probabilities, specify that you want to Define
Shaded Area By X Value.
Below are several examples where the random variable X² has a Chi-Square
distribution with 9 degrees of freedom.
P(X² > 1.735) = .995
[Figure: Distribution Plot, Chi-Square, df=9; shaded area 0.9950 for X > 1.735]
P(X² ≤ 1.735) = .005
[Figure: Distribution Plot, Chi-Square, df=9; shaded area 0.005001 for X ≤ 1.735]
To input probability values and compute x² values, specify that you want to Define
Shaded Area By Probability.
Find x₀² such that P(X² > x₀²) = .05.
x₀² = 16.92
[Figure: Distribution Plot, Chi-Square, df=9; shaded area 0.05 to the right of 16.92]
Find x₁² and x₂² such that P(X² < x₁² or X² > x₂²) = .05.
x₁² = 2.700, x₂² = 19.02
[Figure: Distribution Plot, Chi-Square, df=9; shaded area 0.025 to the left of 2.700 and 0.025 to the right of 19.02]
Chi-Square Distribution (Excel)
In a blank cell, type an equal sign to insert a function and select More Functions…
from the drop-down list of functions.
There are several functions you can use
depending on what you want to do.
In the following examples we'll use a Chi-Square
distribution with 9 degrees of freedom.
Finding lower-tailed probabilities:
Select the CHISQ.DIST
function and complete the
dialog box.
X: Specify the x² value.
Deg_freedom: Specify the degrees of freedom.
Cumulative: Specify TRUE to compute P(X² ≤ 14.6837). Specifying FALSE will compute
f(14.6837), the height of the chi-square curve at 14.6837.
In this example we compute P(X² ≤ 14.6837) = .900001297. The result is shown in the
dialog box as soon as you specify all three inputs.
Finding upper-tailed probabilities:
Select the CHISQ.DIST.RT
function and complete the
dialog box.
X: Specify the x² value.
Deg_freedom: Specify the degrees of freedom.
In this example we compute P(X² > 3.325) = .950005451. The result is shown in the
dialog box as soon as you specify both inputs.
To input probability values and compute x² values, use these functions:
Find x₀² such that P(X² ≤ x₀²) = .90.
x₀² = 14.68365657
Select the CHISQ.INV function and complete the dialog box.
Probability: Specify the cumulative probability, i.e. the area under the curve to the left of x₀².
Deg_freedom: Specify the degrees of freedom.
Find x₀² such that P(X² > x₀²) = .05.
x₀² = 16.9189776
Select the CHISQ.INV.RT function and complete the dialog box.
Probability: Specify the upper-tailed probability, i.e. the area under the curve to the right of x₀².
Deg_freedom: Specify the degrees of freedom.
F Distribution (Minitab)
Use Graph > Probability Distribution Plot… > View Probability.
Distribution tab: Select the F distribution
from the drop-down box.
Numerator df: Specify the numerator
degrees of freedom.
Denominator df: Specify the denominator
degrees of freedom.
Shaded Area tab: The following dialogs show how to specify several kinds of
probability calculations.
To input values of f and compute probabilities, specify that you want to Define Shaded
Area By X Value.
Below are several examples where the random variable F has an F distribution with 4
numerator degrees of freedom and 10 denominator degrees of freedom.
P(F > 3.48) = .04993
[Figure: Distribution Plot, F, df1=4, df2=10; shaded area 0.04993 for X > 3.48]
To input probability values and compute f values, specify that you want to Define
Shaded Area By Probability.
Find f₀ such that P(F > f₀) = .025.
f₀ = 4.468
[Figure: Distribution Plot, F, df1=4, df2=10; shaded area 0.025 to the right of 4.468]
F Distribution (Excel)
In a blank cell, type an equal sign to insert a function and select More Functions…
from the drop-down list of functions.
There are several functions you can use
depending on what you want to do.
In the following examples we'll use an F
distribution with 4 numerator degrees of
freedom and 10 denominator degrees of
freedom.
Finding upper-tailed probabilities:
Select the F.DIST.RT
function and complete the
dialog box.
X: Specify the f value.
Deg_freedom1: Specify the numerator degrees of freedom.
Deg_freedom2: Specify the denominator degrees of freedom.
In this example we compute P(F > 5.99) = .010023913. The result is shown in the
dialog box as soon as you specify all three inputs.
To input probability values and compute f values, use these functions:
Find f₀ such that P(F > f₀) = .10.
f₀ = 2.605336431
Select the F.INV.RT function and complete the dialog box.
Probability: Specify the upper-tailed probability, i.e. the area under the curve to the right of f₀.
Deg_freedom1: Specify the numerator degrees of freedom.
Deg_freedom2: Specify the denominator degrees of freedom.
z-interval and z-test for μ (Minitab)
Use Stat > Basic Statistics > 1-Sample Z…
As an example, we use body temperature data collected on a sample of healthy adults.
We'll assume that σ = .75.
Samples in columns: Specify the column containing the raw data.
Summarized data: Use this option if you have only the sample size (n) and the
sample mean (x̄) available.
Standard deviation: Enter σ. If you want to do a z-test with σ unknown (i.e. for large
sample sizes), find the sample standard deviation (s) first and enter it here.
If you want to perform a hypothesis test, check the box and enter μ₀ as the
Hypothesized mean. By default, Minitab calculates only a confidence interval for μ.
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level
for the interval. The default is 95%.
Alternative: Specify a different direction for Hₐ. The default is not equal (≠) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence
interval. Changing this option will provide one-sided confidence intervals.
The output shows the hypotheses,
H₀: μ = 98.6
Hₐ: μ ≠ 98.6
indicates that the z-test statistic and confidence interval are computed using σ = .75,
and displays the endpoints of the confidence interval and the hypothesis test results
(z-test statistic and p-value), in addition to some descriptive statistics.
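The calculations behind this output can be sketched in Python as follows. The values σ = .75 and μ₀ = 98.6 come from the example; the sample mean and sample size are hypothetical, since the guide's worksheet data are not reproduced here, and the function name one_sample_z is our own.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar, n, sigma, mu0, conf_level=0.95):
    """Two-tailed z-test and z-interval for a mean with known sigma."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                       # standard error of the mean
    z = (xbar - mu0) / se                      # z-test statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - (1 - conf_level) / 2)
    return z, p_value, (xbar - z_crit * se, xbar + z_crit * se)


# sigma and mu0 are from the example; xbar and n are hypothetical.
z, p_value, ci = one_sample_z(xbar=98.25, n=130, sigma=0.75, mu0=98.6)
print(z, p_value, ci)
```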
t-test and t-interval for μ (Minitab)
Use Stat > Basic Statistics > 1-Sample t…
As an example, we use earnings per share figures (EPS) collected on a sample of
publicly traded companies.
Samples in columns: Specify the column containing the raw data.
Summarized data: Use this option if you only have the sample size (n), the sample
mean (x̄), and the sample standard deviation (s) available.
If you want to perform a hypothesis test, check the box and enter μ₀ as the
Hypothesized mean. By default, Minitab calculates only a confidence interval for μ.
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level
for the interval. The default is 95%.
Alternative: Specify a different direction for Hₐ. The default is not equal (≠) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence
interval. Changing this option will provide one-sided confidence intervals.
The output shows the hypotheses,
H₀: μ = 4
Hₐ: μ < 4
displays the hypothesis test results (t-test statistic and p-value), and the upper
bound of a one-sided confidence interval (since the alternative hypothesis is specified
as one-tailed), in addition to some descriptive statistics.
t-interval for μ (Excel)
Excel does not output results of a hypothesis test for μ but you can use it to construct a
confidence interval using the t procedure.
Enter the raw data in a column of the worksheet.
Use Data > Data Analysis > Descriptive Statistics.
Input Range: Highlight the cells containing the
data.
Grouped By: Columns
If the Input Range contains more than 1 column of
data, Excel does a separate analysis for each
one.
Labels in First Row: Check this box if the Input
Range contains column labels. If not, don't check
the box. Checking the box tells Excel to ignore
what's in the first row of the Input Range.
Summary Statistics: Check this box for the descriptive statistics to display.
Confidence Level for Mean: Check this box and specify the desired confidence level
(95% is the default).
In addition to the univariate descriptive statistics, Excel displays the margin of error
for the confidence interval, calculated as
$t_{\alpha/2} \, \dfrac{s}{\sqrt{n}}$.
The confidence interval is then
$\bar{x} \pm t_{\alpha/2} \, \dfrac{s}{\sqrt{n}}$.
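The margin of error and interval can be sketched in Python as below. The sample values are hypothetical, and the critical value is passed in as an argument because Python's standard library has no t distribution; here we reuse the two-tailed value 2.446911851 for 6 degrees of freedom quoted in the t Distribution (Excel) section, which matches a sample of size 7. The function name t_interval is our own.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Return (lower, upper, margin) for a t-interval for the mean."""
    n = len(data)
    margin = t_crit * stdev(data) / sqrt(n)   # t * s / sqrt(n)
    xbar = mean(data)
    return xbar - margin, xbar + margin, margin


sample = [4.1, 3.8, 5.2, 4.7, 3.9, 4.4, 5.0]   # hypothetical data, n = 7
T_CRIT_95_DF6 = 2.446911851                    # two-tailed .05, 6 df

lower, upper, margin = t_interval(sample, T_CRIT_95_DF6)
print(lower, upper, margin)
```

The margin value corresponds to the "Confidence Level" line that Excel adds to the descriptive statistics.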
Interval and Test for p (Minitab)
Use Stat > Basic Statistics > 1 Proportion…
As an example, we use the results of a set of coin tosses where 9 were heads and 21
were tails.
Samples in columns: Specify the column containing the raw data, i.e. a column of
"success"/"failure" indicators.
Summarized data: Use this input option if you only know the Number of events
("successes") and Number of trials (n).
If you want to perform a hypothesis test, check the box and enter p₀ as the
Hypothesized proportion. By default, Minitab calculates only a confidence interval for p.
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level
for the interval. The default is 95%.
Alternative: Specify a different direction for Hₐ. The default is not equal (≠) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence
interval. Changing this option will provide one-sided confidence intervals.
Use test and interval based on normal distribution: Check this box to use the
z-interval and z-test procedures if the sample size requirements are met. If not, leave the
box unchecked to use more robust procedures based on the binomial distribution.
The output shows the hypotheses,
H₀: p = .5
Hₐ: p ≠ .5
displays the endpoints of the confidence interval and the hypothesis test results
(z-test statistic and p-value), in addition to the sample counts and sample proportion of
successes.
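For the coin-toss example (9 heads in 30 tosses, p₀ = .5), the normal-based calculations can be sketched in Python as follows. This assumes the usual textbook formulas, with p₀ in the standard error of the test statistic and the sample proportion in the standard error of the interval; Minitab's displayed values may differ slightly in rounding. The function name one_proportion_z is our own.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0, conf_level=0.95):
    """Two-tailed z-test and z-interval for a proportion (normal approximation)."""
    std_normal = NormalDist()
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)         # test uses p0
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)    # interval uses p_hat
    return p_hat, z, p_value, (p_hat - margin, p_hat + margin)


p_hat, z, p_value, ci = one_proportion_z(successes=9, n=30, p0=0.5)
print(p_hat, z, p_value, ci)
```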
Interval and Test for σ² (Minitab)
Use Stat > Basic Statistics > 1 Variance…
As an example, we use earnings per share figures (EPS) collected on a sample of
publicly traded companies.
Data: Choose the input option.
Samples in columns: Specify this option
if the raw data are in a column of the
worksheet.
Sample standard deviation/variance:
Specify this option if you have only
summary statistics available for the
sample data.
If you want to perform a hypothesis test, check the box and enter σ₀² as the Value if
you choose Hypothesized variance. Enter σ₀ as the Value if you choose
Hypothesized standard deviation. By default, Minitab calculates only confidence
intervals for σ and σ².
To change the interval/test defaults, use Options…
Confidence level: Specify a different
confidence level for the interval. The default is
95%.
Alternative: Specify a different direction for Hₐ.
The default is not equal (≠) for a two-tailed test.
Leave this as not equal to obtain the usual "two-tailed" confidence interval. Changing
this option will provide one-sided confidence intervals.
The output shows the hypotheses,
H₀: σ² = 9
Hₐ: σ² ≠ 9
displays the endpoints of the confidence intervals for both σ and σ², and the hypothesis
test results, in addition to summary statistics of the sample data. Note that two different
methods are used depending on the assumption you want to make about the shape of the
population distribution from which the data are sampled.
Test and CI for One Variance: EPS
Method
Null hypothesis         Sigma-squared = 9
Alternative hypothesis  Sigma-squared not = 9
The chi-square method is only for the normal distribution.
The Bonett method is for any continuous distribution.
Statistics
Variable  N  StDev  Variance
EPS       5  1.41   1.99
95% Confidence Intervals
Variable  Method      CI for StDev  CI for Variance
EPS       Chi-Square  (0.85, 4.06)  (0.72, 16.46)
          Bonett      (0.86, 3.80)  (0.75, 14.42)
Tests
Variable  Method      Test Statistic  DF  P-Value
EPS       Chi-Square  0.89            4   0.147
          Bonett      -               -   0.103
Since the Chi-square procedures are sensitive to the normal distribution assumption, it's
a good idea to verify the data are closely normally distributed (e.g. with a normal
probability plot) before using them.
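The chi-square test statistic in this output is (n − 1)s²/σ₀². The Python sketch below recomputes it from the rounded summary statistics and approximates the two-tailed p-value, assuming the p-value is twice the smaller tail area. It relies on a closed form for the chi-square cumulative probability that holds only when the degrees of freedom are even (here df = 4); the helper name chi2_cdf_even_df is our own, and the results differ slightly from Minitab's because the inputs are rounded.

```python
from math import exp


def chi2_cdf_even_df(x, df):
    """P(X^2 <= x) for a chi-square distribution with an even df (closed form)."""
    if df % 2 != 0:
        raise ValueError("this closed form only applies to even df")
    half_x = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half_x / j      # builds (x/2)^j / j!
        total += term
    return 1 - exp(-half_x) * total


n, variance, sigma0_sq = 5, 1.99, 9     # values from the output above
df = n - 1
stat = df * variance / sigma0_sq        # chi-square test statistic
lower_tail = chi2_cdf_even_df(stat, df)
p_two_sided = 2 * min(lower_tail, 1 - lower_tail)
print(stat, p_two_sided)
```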
t-Interval and t-Test for μ₁ − μ₂ Using Independent Samples (Minitab)
Use Stat > Basic Statistics > 2-Sample t…
Our example is data collected on the length of cuts (in feet) of columns from two
different saws (A and B). We're interested in whether these data show enough
evidence that one saw is cutting the columns shorter, on average, than the other saw.
The data are shown in the worksheet in both unstacked format (C1 and C2) and
stacked format (C3 and C4-T).
Samples in one column: Use this
option for stacked data. Specify the
column containing the numeric
response/comparison variable (Samples)
and the column containing the grouping
variable (Subscripts).
Note: When you use this option, Minitab
uses the group which comes first
alphanumerically in the Subscripts
column as the first term in the difference
in means. Here it would be μ_A − μ_B.
Samples in different columns: Use this option for unstacked data. Specify both
columns of numeric values.
Note: When you use this option, the group whose data is specified as the First column
is used as the first term in the difference in means. Here it would be μ_B − μ_A.
Summarized data: Use this option if you have only the sample sizes and descriptive
statistics available for each sample.
Assume equal variances: Check this box if you want to assume equal population
variances, i.e. σ₁² = σ₂². The default is to not assume equal variances. The differences
in the tests are shown below.
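Assumption: $\sigma_1^2 = \sigma_2^2$
Test Statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Degrees of Freedom: $n_1 + n_2 - 2$
Assumption: $\sigma_1^2 \neq \sigma_2^2$
Test Statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Degrees of Freedom: $\left\lfloor \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}} \right\rfloor$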
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level
for the interval. The default is 95%.
Test difference: Specify the value of D₀. The default is 0.
Alternative: Specify a different direction for Hₐ. The default is not equal (≠) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence interval.
Changing this option will provide one-sided confidence intervals.
The output shows descriptive statistics for both groups.
In addition, the difference in means on which we're making inference is shown (μ_A − μ_B)
as well as the point estimate for the difference (x̄₁ − x̄₂) based on the data. The last two
lines display the results of the inference procedures: the endpoints of the confidence
interval for μ_A − μ_B and the results of the test of
H₀: μ_A − μ_B = 0
Hₐ: μ_A − μ_B ≠ 0.
Two-Sample T-Test and CI: Length, Saw
Two-sample T for Length
Saw  N  Mean    StDev   SE Mean
A    9  8.0489  0.0372  0.012
B    9  8.0700  0.0224  0.0075
Difference = mu (A) - mu (B)
Estimate for difference: -0.0211
95% CI for difference: (-0.0524, 0.0102)
T-Test of difference = 0 (vs not =): T-Value = -1.46  P-Value = 0.168  DF = 13
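The T-Value and DF in this output can be reproduced from the summary statistics with the unequal-variances formulas. The Python sketch below applies those formulas, along with the pooled version for comparison, to the rounded means and standard deviations shown above; the function names welch_t and pooled_t are our own.

```python
from math import floor, sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """t statistic and df without assuming equal population variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t = (mean1 - mean2 - d0) / sqrt(v1 + v2)
    df = floor((v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))
    return t, df


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """t statistic and df assuming equal population variances."""
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    t = (mean1 - mean2 - d0) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Summary statistics for saws A and B, as rounded in the output above.
saw_a = (8.0489, 0.0372, 9)
saw_b = (8.0700, 0.0224, 9)

print(welch_t(*saw_a, *saw_b))    # unequal variances (Minitab's default)
print(pooled_t(*saw_a, *saw_b))   # equal variances assumed
```

Because the two samples have the same size, the two t statistics coincide here and only the degrees of freedom differ.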
t-Test for μ₁ − μ₂ Using Independent Samples (Excel)
Use Data > Data Analysis > t-Test: Two-Sample Assuming Equal Variances
or Data > Data Analysis > t-Test: Two-Sample Assuming Unequal Variances
depending on whether you want to assume equal population variances (σ₁² = σ₂²) or not.
The differences in the tests are shown below.
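Assumption: $\sigma_1^2 = \sigma_2^2$
Test Statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Degrees of Freedom: $n_1 + n_2 - 2$
Assumption: $\sigma_1^2 \neq \sigma_2^2$
Test Statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Degrees of Freedom: $\left\lfloor \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}} \right\rfloor$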
Our example is data collected on the length of cuts (in feet) of columns from two
different saws (A and B). We're interested in whether these data show enough
evidence that one saw is cutting the columns shorter, on average, than the other saw.
The data are shown in the worksheet in unstacked format.
Variable 1 Range: Highlight the cells
containing the numeric responses for the first
group.
Variable 2 Range: Highlight the cells
containing the numeric responses for the
second group.
Note: The group whose data is specified as
the Variable 1 Range is used as the first term
in the difference in means. Here it would be μ_A − μ_B.
Hypothesized Mean Difference: Specify D₀.
Labels: Check this box if the Input Ranges contain column labels. If not, don't check
the box. Checking the box tells Excel to ignore what's in the first row of the Input
Ranges.
Alpha: Specify α for the test.
The output contains descriptive statistics for both groups and results
of the hypothesis test of
H₀: μ_A − μ_B = 0.
Both one-tail and two-tail p-values and critical values are reported.
t-Test: Two-Sample Assuming Unequal Variances
                              A         B
Mean                          8.048889  8.07
Variance                      0.001386  0.0005
Observations                  9         9
Hypothesized Mean Difference  0
df                            13
t Stat                        -1.45831
P(T<=t) one-tail              0.084245
t Critical one-tail           1.770933
P(T<=t) two-tail              0.16849
t Critical two-tail           2.160369
IMPORTANT NOTE: Excel does not report one-tailed p-values and critical values
correctly, in general. The one-tailed "p-value" reported is actually the area under the
t-curve in the upper or lower tail, depending on whether the value of the test statistic
"t Stat" is positive or negative, respectively. Excel mistakenly assumes that "t Stat" will
be positive if you have an upper-tailed test and negative if you have a lower-tailed test,
which usually happens, but not always. It is poor statistical practice to base the
direction of the hypotheses on the data. Also, the one-tailed "t Critical" reported is
actually the absolute value of the critical value, so it could have the wrong sign
depending on what the alternative hypothesis is.
Two-tailed p-values are reported correctly but there is a second critical value in the
lower tail having the same absolute value with a negative sign.
The following chart shows the correct rejection regions and p-values for the
corresponding tests. The highlighted values are the correct values.
Alternative Hypothesis        Rejection Region    p-value
$H_a: \mu_A - \mu_B < 0$      $t < -1.770933$     $P(T < -1.45831) = .084245$
$H_a: \mu_A - \mu_B > 0$      $t > 1.770933$      $P(T > -1.45831) = 1 - .084245 = .915755$
$H_a: \mu_A - \mu_B \neq 0$   $|t| > 2.160369$    $P(|T| > |-1.45831|) = .16849$
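The correction described in the note can be written as a small Python helper. This is a sketch, not part of Excel: the function name `corrected_p_values` is hypothetical, and it assumes the value Excel labels "P(T<=t) one-tail" is the tail area beyond "t Stat" on the side of its sign, as the note explains.

```python
def corrected_p_values(t_stat, excel_one_tail_p):
    """Return (lower-tailed, upper-tailed, two-tailed) p-values.

    excel_one_tail_p is taken to be the area beyond t_stat in the lower
    tail when t_stat is negative and in the upper tail when it is positive.
    """
    if t_stat < 0:
        lower, upper = excel_one_tail_p, 1 - excel_one_tail_p
    else:
        lower, upper = 1 - excel_one_tail_p, excel_one_tail_p
    two_tailed = 2 * excel_one_tail_p
    return lower, upper, two_tailed

# Values from the unequal-variances output for saws A and B
lower, upper, two_tailed = corrected_p_values(-1.45831, 0.084245)
print(lower, upper, two_tailed)
```

The alternative hypothesis, chosen before looking at the data, decides which of the three values to use.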
t-Interval and t-Test for $\mu_1 - \mu_2$ Using Paired Samples (Minitab)
Use Stat > Basic Statistics > Paired t…
The example data are distances (yards) a golf ball was driven off a tee for a sample of
golfers. Each golfer hit two balls, one of brand A and one of brand B. The data are
shown in unstacked format.
Samples in columns: Use this option for unstacked data. Specify both columns of
numeric values.
Note: When you use this option, the group whose data is specified as the First sample
is used as the first term in the difference in means. Here it would be $\mu_A - \mu_B$.
Summarized data (differences): Use this option if you have only the sample size and
descriptive statistics available for the paired differences.
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level for the interval. The default is 95%.
Test difference: Specify the value of $D_0$. The default is 0.
Alternative: Specify a different direction for $H_a$. The default is not equal ($\neq$) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence interval.
Changing this option will provide one-sided confidence intervals.
The output shows descriptive statistics for both groups and the paired differences.
The last two lines display the results of the inference procedures: the endpoints of the
confidence interval for $\mu_A - \mu_B$ and the results of the test of
$H_0: \mu_A - \mu_B = 0$
$H_a: \mu_A - \mu_B \neq 0$.
Paired T-Test and CI: A, B

Paired T for A - B

             N    Mean  StDev  SE Mean
A           17  256.18  18.70     4.54
B           17  251.41  14.93     3.62
Difference  17    4.76   9.09     2.20

95% CI for mean difference: (0.09, 9.44)
T-Test of mean difference = 0 (vs not = 0): T-Value = 2.16  P-Value = 0.046
Note: An alternative is to use Stat > Basic Statistics > 1-Sample Z… or Stat > Basic
Statistics > 1-Sample t… and use the paired differences as the input data.
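Since the paired procedure is a one-sample t procedure on the differences, the last lines of the output can be approximated from the Difference row alone. The Python sketch below assumes the rounded summary values shown in the output, and the critical value 2.120 (95% confidence, 16 degrees of freedom) is supplied here from a t table rather than taken from Minitab.

```python
import math

n = 17             # number of pairs
mean_diff = 4.76   # sample mean of the differences A - B
sd_diff = 9.09     # sample standard deviation of the differences
d0 = 0.0           # hypothesized mean difference
t_crit = 2.120     # assumed: t table value for 95% confidence, n - 1 = 16 df

se = sd_diff / math.sqrt(n)                      # SE Mean of the differences
t_value = (mean_diff - d0) / se                  # paired t statistic
ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
print(round(se, 2), round(t_value, 2), tuple(round(x, 2) for x in ci))
```

The results agree with the Minitab output up to the rounding of the inputs.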
t-Test for $\mu_1 - \mu_2$ Using Paired Samples (Excel)
Use Data > Data Analysis > t-test: Paired Two-Sample for Means.
The example data are distances (yards) a golf ball was driven off a tee for a sample of
golfers. Each golfer hit two balls, one of brand A and one of brand B. The data are
shown in unstacked format.
Variable 1 Range: Highlight the cells containing the numeric responses for the first
group.
Variable 2 Range: Highlight the cells containing the numeric responses for the second
group.
Note: The group whose data is specified as the Variable 1 Range is used as the first
term in the difference in means. Here it would be $\mu_A - \mu_B$.
Hypothesized Mean Difference: Specify $D_0$.
Labels: Check this box if the Input Ranges contain column labels. If not, don't check
the box. Checking the box tells Excel to ignore what's in the first row of the Input
Ranges.
Alpha: Specify $\alpha$ for the test.
The output contains descriptive statistics for both groups and the Pearson correlation
coefficient ($r$) between the two samples. Also shown are the results of the hypothesis
test of $H_0: \mu_A - \mu_B = 0$. Both one-tail and two-tail p-values and critical values
are reported.

t-Test: Paired Two Sample for Means

                                      A         B
Mean                           256.1765  251.4118
Variance                       349.7794  223.0074
Observations                         17        17
Pearson Correlation            0.877617
Hypothesized Mean Difference          0
df                                   16
t Stat                         2.162019
P(T<=t) one-tail               0.023056
t Critical one-tail            1.745884
P(T<=t) two-tail               0.046112
t Critical two-tail            2.119905
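The Pearson correlation is what ties the two Variance rows to the paired test statistic. As a rough Python check, assuming only the rounded values in the output above, the variance of the differences can be rebuilt as $s_A^2 + s_B^2 - 2 r s_A s_B$ and used to recover "t Stat" (the hypothesized mean difference is 0 here).

```python
import math

n = 17
mean_a, mean_b = 256.1765, 251.4118
var_a, var_b = 349.7794, 223.0074
r = 0.877617  # Pearson correlation between the two samples

# Variance of the paired differences A - B, rebuilt from the summary values
var_diff = var_a + var_b - 2 * r * math.sqrt(var_a * var_b)
se_diff = math.sqrt(var_diff / n)
t_stat = (mean_a - mean_b) / se_diff
print(round(math.sqrt(var_diff), 2), round(t_stat, 4))
```

A strong positive correlation shrinks the variance of the differences, which is why pairing can give a larger test statistic than an independent-samples analysis of the same numbers.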
IMPORTANT NOTE: Excel does not report one-tailed p-values and critical values
correctly, in general. The one-tailed "p-value" reported is actually the area under the
t-curve in the upper or lower tail, depending on whether the value of the test statistic
"t Stat" is positive or negative, respectively. Excel mistakenly assumes that "t Stat" will
be positive if you have an upper-tailed test and negative if you have a lower-tailed test,
which usually happens, but not always. It is poor statistical practice to base the
direction of the hypotheses on the data. Also, the one-tailed "t Critical" reported is
actually the absolute value of the critical value, so it could have the wrong sign
depending on what the alternative hypothesis is.
Two-tailed p-values are reported correctly but there is a second critical value in the
lower tail having the same absolute value with a negative sign.
The following chart shows the correct rejection regions and p-values for the
corresponding tests. The highlighted values are the correct values.
Alternative Hypothesis        Rejection Region    p-value
$H_a: \mu_A - \mu_B < 0$      $t < -1.745884$     $P(T < 2.162019) = 1 - .023056 = .976944$
$H_a: \mu_A - \mu_B > 0$      $t > 1.745884$      $P(T > 2.162019) = .023056$
$H_a: \mu_A - \mu_B \neq 0$   $|t| > 2.119905$    $P(|T| > |2.162019|) = .046112$
Note: An alternative is to use Data > Data Analysis > Descriptive Statistics and use
the paired differences as the input data. This approach will only output the margin of
error for a paired t-interval for $\mu_1 - \mu_2$.
z-Interval and z-Test for $p_1 - p_2$ Using Independent Samples (Minitab)
Use Stat > Basic Statistics > 2 Proportions…
The example data are indicators of whether or not a washing machine needed repairs in
the first five years of operation for a sample of the company's machines and a sample of
a competitor's machines. The data are shown in both unstacked format (C1-T and C2-T)
and stacked format (C3-T and C4-T). 13 of the company's 100 machines and 26 of the
competitor's 100 machines needed repairs.
Samples in one column: Use this option for
stacked data. Specify the column containing
the categorical response/comparison variable
(Samples) and the column containing the
grouping variable (Subscripts).
Note: When you use this option, Minitab uses the group which comes first
alphanumerically in the Subscripts column as the first term in the difference in
proportions. Here it would be $p_{Company} - p_{Competitor}$.
Samples in different columns: Use this option for unstacked data. Specify both
columns of categorical responses.
Note: When you use this option, the group whose data is specified as the First column
is used as the first term in the difference in proportions. Here it would be
$p_{Company} - p_{Competitor}$.
Summarized data: Use this option if you have only the sample sizes (Trials) and
numbers of "successes" (Events) for each sample.
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level for the interval. The default is 95%.
Test difference: Specify the value of $D_0$. The default is 0.
Alternative: Specify a different direction for $H_a$. The default is not equal ($\neq$) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence interval.
Changing this option will provide one-sided confidence intervals.
Use pooled estimate of p for test: Check this box if you want Minitab to compute the
z-test statistic as
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ is the pooled estimate of $p$ and $x_1$ and $x_2$ are the
numbers of "successes" (Events) in each group. This is only appropriate under the
assumption that $p_1 = p_2 = p$, i.e. $p_1 - p_2 = 0$, so only check this box if you specify
$D_0 = 0$.
If you do not check the box, Minitab computes the z-test statistic as
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}.$$
Checking or not checking the box has no effect on the confidence interval for $p_1 - p_2$.
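To see how much the pooled-estimate choice matters, both versions of the statistic can be computed for the washing machine counts (13 of 100 and 26 of 100). The Python sketch below is illustrative only; the function names are invented for this example, and the two-tailed p-value comes from the standard normal distribution via `math.erfc`.

```python
import math

def two_proportion_z(x1, n1, x2, n2, d0=0.0, pooled=False):
    """z statistic for p1 - p2, with or without the pooled estimate of p."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        if d0 != 0:
            raise ValueError("the pooled estimate is only appropriate when D0 = 0")
        p = (x1 + x2) / (n1 + n2)
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return ((p1 - p2) - d0) / se

def two_sided_p(z):
    """Two-tailed p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

z_pooled = two_proportion_z(13, 100, 26, 100, pooled=True)
z_unpooled = two_proportion_z(13, 100, 26, 100)
print(round(z_pooled, 2), round(two_sided_p(z_pooled), 3))
print(round(z_unpooled, 2), round(two_sided_p(z_unpooled), 3))
```

For these data the pooled version is the one that agrees with the Z = -2.32 shown in the example output.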
The output shows the numbers of events (X), the sample sizes (N), and sample
proportions (Sample p) for both samples.
In addition, the difference in proportions on which we're making inference is shown
($p_1 - p_2$) as well as the point estimate for the difference ($\hat{p}_1 - \hat{p}_2$) based on the data.
The next two lines display the results of the z-procedures: the endpoints of the
confidence interval for $p_1 - p_2$ and the results of the test of
$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 \neq 0$.
The last line shows the p-value for Fisher's exact test of hypotheses. Use this test when
the large samples condition (required for the validity of the z-procedures) fails.
Test and CI for Two Proportions

Sample   X    N  Sample p
1       13  100  0.130000
2       26  100  0.260000

Difference = p (1) - p (2)
Estimate for difference:  -0.13
95% CI for difference:  (-0.238331, -0.0216688)
Test for difference = 0 (vs not = 0):  Z = -2.32  P-Value = 0.020

Fisher's exact test: P-Value = 0.031
Test and Interval for $\sigma_1^2 / \sigma_2^2$ Using Independent Samples (Minitab)
Use Stat > Basic Statistics > 2 Variances…
Our example is data collected on the length of cuts (in feet) of columns from two
different saws (A and B). We're interested in whether these data show enough
evidence that the saws differ in the variation in the lengths of cuts. The data are shown
in the worksheet in both unstacked format (C1 and C2) and stacked format (C3 and C4-T).
Data: Specify the input option.
Samples in one column: Use this
option for stacked data format.
Specify the numeric
response/comparison variable
(Samples) and the categorical
grouping variable (Subscripts).
Note: When you use this option, Minitab uses the group which comes first
alphanumerically in the Subscripts column as the numerator term in the ratio of
variances. Here it would be $\sigma_A^2 / \sigma_B^2$.
Samples in different columns: Use this option for unstacked data. Specify both
columns of numeric responses.
Note: When you use this option, the group whose data is specified as the First column
is used as the numerator term in the ratio of variances. Here it would be
$\sigma_A^2 / \sigma_B^2$.
Sample standard deviations/variances: Use these options if you have only
descriptive statistics available for both groups. The group whose summaries are
specified as the First sample is used as the numerator term in the ratio of variances.
To change the interval/test defaults, use Options…
Confidence level: Specify a different confidence level for the interval. The default is 95%.
Hypothesized ratio: Specify the Value of the null hypothesized ratio of standard
deviations (default) or variances. The default Value is 1, i.e. $H_0: \sigma_1 / \sigma_2 = 1$ or
$H_0: \sigma_1^2 / \sigma_2^2 = 1$.
Alternative: Specify a different direction for $H_a$. The default is not equal ($\neq$) for a
two-tailed test. Leave this as not equal to obtain the usual "two-tailed" confidence interval.
Changing this option will provide one-sided confidence intervals.
The output shows the hypotheses,
$H_0: \sigma_A^2 / \sigma_B^2 = 1$
$H_a: \sigma_A^2 / \sigma_B^2 \neq 1$
displays descriptive statistics for each group, and point estimates of the population
ratios of standard deviations and variances: $s_1 / s_2$ as the point estimate of
$\sigma_1 / \sigma_2$ and $s_1^2 / s_2^2$ as the point estimate of $\sigma_1^2 / \sigma_2^2$. Next are the
endpoints of the confidence intervals for both $\sigma_1 / \sigma_2$ and $\sigma_1^2 / \sigma_2^2$, and the
hypothesis test results.
Note that two different methods are used depending on the assumption you want to
make about the shapes of the two population distributions from which the data are
sampled. Since the F procedures are sensitive to the normal distribution assumption,
it's a good idea to verify that both samples of data are closely normally distributed (e.g.
with a normal probability plot) before using them.
Test and CI for Two Variances: Length vs Saw

Method

Null hypothesis         Variance(A) / Variance(B) = 1
Alternative hypothesis  Variance(A) / Variance(B) not = 1
Significance level      Alpha = 0.05

Statistics

Saw  N  StDev  Variance
A    9  0.037     0.001
B    9  0.022     0.001

Ratio of standard deviations = 1.665
Ratio of variances = 2.772

95% Confidence Intervals

Distribution  CI for StDev    CI for Variance
of Data       Ratio           Ratio
Normal        (0.791, 3.506)  (0.625, 12.290)
Continuous    (0.446, 3.441)  (0.199, 11.844)

Tests

                                          Test
Method                          DF1  DF2  Statistic  P-Value
F Test (normal)                   8    8       2.77    0.171
Levene's Test (any continuous)    1   16       0.87    0.365
F-Test for $\sigma_1^2 / \sigma_2^2$ Using Independent Samples (Excel)
Use Data > Data Analysis > F-Test Two-Sample for Variances.
Our example is data collected on the length of cuts (in feet) of columns from two
different saws (A and B). We're interested in whether these data show enough
evidence that the saws differ on the variation in the lengths of cuts. The data are shown
in the worksheet in unstacked format.
Variable 1 Range: Highlight the cells
containing the numeric responses for the
first group.
Variable 2 Range: Highlight the cells
containing the numeric responses for the
second group.
Note: The group whose data is specified as the Variable 1 Range is used as the
numerator term in the ratio of variances. Here it would be $\sigma_A^2 / \sigma_B^2$.
Labels: Check this box if the Input Ranges contain column labels. If not, don't check
the box. Checking the box tells Excel to ignore what's in the first row of the Input
Ranges.
Alpha: Specify $\alpha$ for the test.
The output contains descriptive statistics for both groups and the results of the
hypothesis test of $H_0: \sigma_A^2 / \sigma_B^2 = 1$.
Only "one-tail p-values" and "one-tail critical values" are reported.

F-Test Two-Sample for Variances

                            A       B
Mean                 8.048889    8.07
Variance             0.001386  0.0005
Observations                9       9
df                          8       8
F                    2.772222
P(F<=f) one-tail     0.085334
F Critical one-tail  3.438101
IMPORTANT NOTE: Excel does not report one-tailed p-values and critical values correctly,
in general. The one-tailed "p-value" reported is actually the area under the F-curve in
the upper or lower tail, depending on the value of the test statistic "F." Excel mistakenly
assumes that "F" will be relatively large if you have an upper-tailed test and relatively
small if you have a lower-tailed test, which usually happens, but not always. The
one-tailed "F Critical" reported also corresponds to the direction Excel chooses for the
alternative. It is poor statistical practice to base the direction of the hypotheses on the
data.
To conduct a two-tailed test,
$H_0: \sigma_A^2 / \sigma_B^2 = 1$
$H_a: \sigma_A^2 / \sigma_B^2 \neq 1$
use the Excel function F.TEST.
In a blank cell, type an equal sign to insert a function and select More Functions… from
the drop-down list of functions.
Select the F.TEST function and complete the dialog box.
Array 1: Highlight the cells containing the numeric responses for the first group. You
may highlight the column labels if you choose; Excel will ignore any non-numeric data.
Array 2: Highlight the cells containing the numeric responses for the second group. You
may highlight the column labels if you choose; Excel will ignore any non-numeric data.
Note: The group whose data is specified as the Array 1 range is used as the numerator
term in the ratio of variances. Here it would be $\sigma_A^2 / \sigma_B^2$.
The two-tailed p-value (.170667608 here) is shown in the dialog box as soon as you
specify both arrays.
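The link between the Data Analysis output and the F.TEST result can be checked with a few lines of Python. This is a sketch under the assumption that the two-tailed p-value is twice the smaller tail area, which is consistent with the numbers in this example; the helper names are made up for illustration, and the inputs are the rounded values from the Excel output.

```python
def f_ratio(var1, var2):
    """F statistic: the first sample's variance over the second's."""
    return var1 / var2

def two_tailed_from_one_tail(one_tail_p):
    """Double the smaller tail area to get a two-tailed p-value."""
    return 2 * min(one_tail_p, 1 - one_tail_p)

f_stat = f_ratio(0.001386, 0.0005)                 # variances of saws A and B
p_two_tailed = two_tailed_from_one_tail(0.085334)  # from P(F<=f) one-tail
print(round(f_stat, 3), round(p_two_tailed, 6))
```

Swapping the two arrays inverts the F ratio but leaves the two-tailed p-value unchanged.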
One-Way ANOVA (Minitab)
Stacked Data
Use Stat > ANOVA > One-Way…
All response data must be in a single column with a
second column indicating the levels of the factor. In the
worksheet at the right, we have random samples of 10
scores from each of 3 different training programs.
Response: Specify the response variable.
Factor: Specify the column containing the factor levels.
Use the Comparisons… button to specify a
multiple comparisons analysis.
Here we request Tukey's adjustment for multiple
comparisons so that the family (experimentwise)
error rate remains at .05 (5%).
The output shows the ANOVA table, individual confidence intervals for the population
means (NOT adjusted for multiple comparisons), and the multiple comparisons analysis
(if requested).
One-way ANOVA: Score versus Program

Source   DF      SS    MS     F      P
Program   2    62.1  31.0  0.32  0.728
Error    27  2605.3  96.5
Total    29  2667.4

S = 9.823   R-Sq = 2.33%   R-Sq(adj) = 0.00%

Level   N    Mean   StDev
1      10  71.200  10.261
2      10  70.300   9.627
3      10  67.800   9.566

[Character plot: Individual 95% CIs For Mean Based on Pooled StDev, axis from 65.0 to 80.0]

Pooled StDev = 9.823

Grouping Information Using Tukey Method

Program   N    Mean  Grouping
1        10  71.200  A
2        10  70.300  A
3        10  67.800  A

Means that do not share a letter are significantly different.

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Program

Individual confidence level = 98.04%

Program = 1 subtracted from:

Program    Lower  Center   Upper
2        -11.803  -0.900  10.003
3        -14.303  -3.400   7.503

Program = 2 subtracted from:

Program    Lower  Center  Upper
3        -13.403  -2.500  8.403

[Character plots of the simultaneous intervals, axis from -14.0 to 7.0]
Unstacked Data
Use Stat > ANOVA > One-Way (Unstacked)…
Response data must be in separate columns, one for
each sample. In the worksheet at the right, we have
random samples of 10 scores from each of 3 different
training programs.
Responses (in separate columns): Specify the columns containing the response
data.
Use the Comparisons… button to specify a multiple comparisons analysis.
Here we request Tukey's adjustment for multiple
comparisons so that the family (experimentwise)
error rate remains at .05 (5%).
The output shows the ANOVA table, individual confidence intervals for the population
means (NOT adjusted for multiple comparisons), and the multiple comparisons analysis
(if requested).
One-way ANOVA: Score_1, Score_2, Score_3

Source  DF      SS    MS     F      P
Factor   2    62.1  31.0  0.32  0.728
Error   27  2605.3  96.5
Total   29  2667.4

S = 9.823   R-Sq = 2.33%   R-Sq(adj) = 0.00%

Level     N    Mean   StDev
Score_1  10  71.200  10.261
Score_2  10  70.300   9.627
Score_3  10  67.800   9.566

[Character plot: Individual 95% CIs For Mean Based on Pooled StDev, axis from 65.0 to 80.0]

Pooled StDev = 9.823

Grouping Information Using Tukey Method

          N    Mean  Grouping
Score_1  10  71.200  A
Score_2  10  70.300  A
Score_3  10  67.800  A

Means that do not share a letter are significantly different.

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons

Individual confidence level = 98.04%

Score_1 subtracted from:

           Lower  Center   Upper
Score_2  -11.803  -0.900  10.003
Score_3  -14.303  -3.400   7.503

Score_2 subtracted from:

           Lower  Center  Upper
Score_3  -13.403  -2.500  8.403

[Character plots of the simultaneous intervals, axis from -14.0 to 7.0]
One-Way ANOVA (Excel)
Enter the data in the worksheet in unstacked format, i.e. the
responses in separate columns, one for each sample. In
the worksheet at the right, we have random samples of 10
scores from each of 3 different training programs.
Use Data > Data Analysis > ANOVA: Single Factor.
Input Range: Specify the input data as the range of cells
containing the response columns.
Grouped By: Select columns. (You can also input the response data in separate rows,
one for each sample, but this is not typically done. In that case, select Rows here.)
Labels in First Row: Check this box if the Input Ranges contain column labels. If not,
don't check the box. Checking the box tells Excel to ignore what's in the first row of the
Input Range.
Alpha: The level of significance for critical values. The default is .05.
Note: Excel does not do multiple comparisons.
The output shows summary statistics for each sample and the ANOVA table which
includes the critical value for the F-test statistic.
Anova: Single Factor

SUMMARY
Groups   Count  Sum  Average  Variance
Score_1     10  712     71.2  105.2889
Score_2     10  703     70.3  92.67778
Score_3     10  678     67.8  91.51111

ANOVA
Source of Variation        SS  df        MS         F   P-value    F crit
Between Groups       62.06667   2  31.03333  0.321614  0.727717  3.354131
Within Groups          2605.3  27  96.49259

Total                2667.367  29
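The ANOVA table can be rebuilt from the SUMMARY table alone, which is a useful check on how the two parts of the output fit together. The Python sketch below assumes only the counts, averages, and variances printed by Excel for the three training programs; the variable names are illustrative.

```python
# group -> (count, average, variance), as printed in Excel's SUMMARY table
groups = {
    "Score_1": (10, 71.2, 105.2889),
    "Score_2": (10, 70.3, 92.67778),
    "Score_3": (10, 67.8, 91.51111),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups and within-groups sums of squares
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
print(round(ss_between, 3), round(ss_within, 1), round(f_stat, 4))
```

The p-value and critical value need the F distribution, which this sketch leaves to Excel or Minitab.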
Two-Way ANOVA (Minitab)
Data must be entered in stacked format with one column
for the numeric responses and one column for each
factor. In this example, we have factor A at 3 levels and
factor B at 2 levels.
There are two ways to do the analysis in Minitab. One
way is to use Stat > ANOVA > Two-Way…
Response: Specify the column containing the responses.
Row factor / Column factor: Specify columns containing the levels of each factor.
The output will be the ANOVA table.
Two-way ANOVA: Score versus FactorA, FactorB

Source       DF     SS      MS     F      P
FactorA       2  118.5   59.25  1.50  0.296
FactorB       1  108.0  108.00  2.73  0.149
Interaction   2   46.5   23.25  0.59  0.584
Error         6  237.0   39.50
Total        11  510.0

S = 6.285   R-Sq = 53.53%   R-Sq(adj) = 14.80%
Note that this tool does not allow for multiple comparisons. To do multiple comparisons
in a two-way ANOVA, use the General Linear Model tool:
Stat > ANOVA > General Linear Model
Responses: Specify the column
containing the responses.
Model: Specify the sources of variation
you want to account for. Enter both
columns containing the factor levels. To
account for an interaction between two
factors, specify an interaction term using
an asterisk, as shown. You should only
specify an interaction term when there are
multiple observations per treatment.
The output will show the ANOVA table as well as a list of the factors and levels.
General Linear Model: Score versus FactorA, FactorB

Factor   Type   Levels  Values
FactorA  fixed       3  1, 2, 3
FactorB  fixed       2  1, 2

Analysis of Variance for Score, using Adjusted SS for Tests

Source           DF  Seq SS  Adj SS  Adj MS     F      P
FactorA           2  118.50  118.50   59.25  1.50  0.296
FactorB           1  108.00  108.00  108.00  2.73  0.149
FactorA*FactorB   2   46.50   46.50   23.25  0.59  0.584
Error             6  237.00  237.00   39.50
Total            11  510.00

S = 6.28490   R-Sq = 53.53%   R-Sq(adj) = 14.80%
Use the Comparisons… button to request a multiple comparisons analysis.
Row or Column Means Comparisons
Terms: Specify the column containing the
levels of the row or column factor, as
shown here.
Method: Select a multiple comparisons
method.
Check the Confidence interval approach
for displaying the results.
The output shows a confidence interval (adjusted for multiple comparisons) for the
difference, $\mu_i - \mu_j$, between each pair of row means ($\mu_1, \mu_2, \ldots, \mu_a$) or column
means ($\mu_1, \mu_2, \ldots, \mu_b$). Here there are 3 levels of Factor A giving 3 row means
($\mu_1, \mu_2, \mu_3$) and $\binom{3}{2} = 3$ pairs.
Tukey 95.0% Simultaneous Confidence Intervals
Response Variable Score
All Pairwise Comparisons among Levels of FactorA

FactorA = 1 subtracted from:

FactorA   Lower  Center   Upper
2        -18.89  -5.250   8.388
3        -11.39   2.250  15.888

FactorA = 2 subtracted from:

FactorA   Lower  Center  Upper
3        -6.138   7.500  21.14

[Character plots of the simultaneous intervals, axis from -12 to 24]

Example: A 95% confidence interval for $\mu_2 - \mu_1$ is $(-18.89, 8.388)$.
Treatment/Cell Means Comparisons
Terms: Specify the interaction term using
an asterisk, as shown here.
Method: Select a multiple comparisons
method.
Check the Confidence interval approach
for displaying the results.
The output shows a confidence interval (adjusted for multiple comparisons) for the
difference, $\mu_{ij} - \mu_{kl}$, between each pair of treatment means
($\mu_{11}, \mu_{12}, \ldots, \mu_{ab}$). Here there are 6 treatment means
($\mu_{11}, \mu_{12}, \mu_{21}, \mu_{22}, \mu_{31}, \mu_{32}$) and $\binom{6}{2} = 15$ pairs.
Tukey 95.0% Simultaneous Confidence Intervals
Response Variable Score
All Pairwise Comparisons among Levels of FactorA*FactorB

FactorA = 1
FactorB = 1  subtracted from:

FactorA  FactorB   Lower  Center  Upper
1        2        -17.02   8.000  33.02
2        1        -31.02  -6.000  19.02
2        2        -21.52   3.500  28.52
3        1        -19.02   6.000  31.02
3        2        -18.52   6.500  31.52

FactorA = 1
FactorB = 2  subtracted from:

FactorA  FactorB   Lower  Center  Upper
2        1        -39.02  -14.00  11.02
2        2        -29.52   -4.50  20.52
3        1        -27.02   -2.00  23.02
3        2        -26.52   -1.50  23.52

FactorA = 2
FactorB = 1  subtracted from:

FactorA  FactorB   Lower  Center  Upper
2        2        -15.52   9.500  34.52
3        1        -13.02  12.000  37.02
3        2        -12.52  12.500  37.52

FactorA = 2
FactorB = 2  subtracted from:

FactorA  FactorB   Lower  Center  Upper
3        1        -22.52   2.500  27.52
3        2        -22.02   3.000  28.02

FactorA = 3
FactorB = 1  subtracted from:

FactorA  FactorB   Lower  Center  Upper
3        2        -24.52  0.5000  25.52

[Character plots of the simultaneous intervals, axis from -25 to 50]

Example: A 95% confidence interval for $\mu_{32} - \mu_{12}$ is $(-26.52, 23.52)$.
Two-Way ANOVA (Excel)
Enter the data in the worksheet in a format that resembles
the design (rows for factor A and columns for factor B). In
this example, we have factor A at 3 levels and factor B at
2 levels. There are 2 observations in each treatment.
If there is more than one observation per treatment, use
Data > Data Analysis > ANOVA: Two-Factor With
Replication.
Input Range: Specify the input data as the range of cells containing the data table.
Include the labels in the range.
Rows per sample: Indicate how many observations were made in each treatment.
Alpha: The level of significance for critical values. The default is .05.
Note: Excel does not do multiple comparisons.
The output gives a table of descriptive statistics (count, sum, average, variance) of the
responses in each treatment as well as across each row and column. The ANOVA
table is displayed at the bottom.
Anova: Two-Factor With Replication

SUMMARY       1      2  Total
1
Count         2      2      4
Sum         140    156    296
Average      70     78     74
Variance      2     72     46

2
Count         2      2      4
Sum         128    147    275
Average      64   73.5  68.75
Variance      2   40.5  44.25

3
Count         2      2      4
Sum         152    153    305
Average      76   76.5  76.25
Variance      8  112.5  40.25

Total
Count         6      6
Sum         420    456
Average      70     76
Variance   31.2   49.2

ANOVA
Source of Variation     SS  df     MS         F   P-value    F crit
Sample               118.5   2  59.25       1.5  0.296296  5.143253
Columns                108   1    108  2.734177  0.149307  5.987378
Interaction           46.5   2  23.25  0.588608  0.584233  5.143253
Within                 237   6   39.5

Total                  510  11
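For a balanced design, the sums of squares in the ANOVA table follow directly from the cell averages and variances in the SUMMARY table. The Python sketch below assumes the six treatment summaries printed by Excel (2 observations per treatment) and uses illustrative variable names; it covers the balanced case only.

```python
# (FactorA level, FactorB level) -> (count, average, variance) from the SUMMARY table
cells = {
    (1, 1): (2, 70.0, 2.0), (1, 2): (2, 78.0, 72.0),
    (2, 1): (2, 64.0, 2.0), (2, 2): (2, 73.5, 40.5),
    (3, 1): (2, 76.0, 8.0), (3, 2): (2, 76.5, 112.5),
}
r = 2  # observations per treatment (balanced design assumed)
a_levels = sorted({i for i, _ in cells})
b_levels = sorted({j for _, j in cells})
n_a, n_b = len(a_levels), len(b_levels)

grand = sum(mean for _, mean, _ in cells.values()) / (n_a * n_b)
row_mean = {i: sum(cells[(i, j)][1] for j in b_levels) / n_b for i in a_levels}
col_mean = {j: sum(cells[(i, j)][1] for i in a_levels) / n_a for j in b_levels}

ss_a = r * n_b * sum((row_mean[i] - grand) ** 2 for i in a_levels)   # "Sample"
ss_b = r * n_a * sum((col_mean[j] - grand) ** 2 for j in b_levels)   # "Columns"
ss_cells = r * sum((mean - grand) ** 2 for _, mean, _ in cells.values())
ss_ab = ss_cells - ss_a - ss_b                                       # "Interaction"
ss_error = sum((n - 1) * var for n, _, var in cells.values())        # "Within"

ms_error = ss_error / (n_a * n_b * (r - 1))
f_a = (ss_a / (n_a - 1)) / ms_error
f_b = (ss_b / (n_b - 1)) / ms_error
f_ab = (ss_ab / ((n_a - 1) * (n_b - 1))) / ms_error
print(ss_a, ss_b, ss_ab, ss_error)
print(round(f_a, 4), round(f_b, 4), round(f_ab, 4))
```

Excel labels the row factor "Sample" and the column factor "Columns", so `ss_a` corresponds to the Sample line of the table.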
If there is only one observation per cell (as in a
block design), use Data > Data Analysis >
ANOVA: Two-Factor Without Replication.
In this example of a block design, we have a factor
at 3 levels and a blocking variable at 5 levels.
Input Range: Specify the input data as the range of cells containing the data table.
Labels: Check this box if you highlighted the row/column labels in the input range.
Alpha: The level of significance for critical values. The default is .05.
Note: Excel does not do multiple comparisons.
The output shows descriptive statistics of the responses across each row and column.
The ANOVA table is displayed at the bottom.
Anova: Two-Factor Without Replication

SUMMARY  Count       Sum   Average  Variance
1            3  164.0231  54.67437  139.9776
2            3  86.89977  28.96659  133.6837
3            3  167.0332  55.67772  1312.929
4            3  101.9388   33.9796  1603.765
5            3  218.3436  72.78119  158.9724

1            5  285.2015  57.04029  383.9797
2            5  235.5977  47.11954  778.1437
3            5  217.4393  43.48785  1341.297

ANOVA
Source of Variation        SS  df        MS         F   P-value    F crit
Rows                 3807.157   4  951.7892  1.226824  0.372009  3.837853
Columns              492.1318   2  246.0659  0.317171  0.736959   4.45897
Error                6206.523   8  775.8153

Total                10505.81  14
Interaction Plot (Minitab)
Data must be entered in stacked format with one column
for the numeric responses and one column for each
factor. In this example, we have factor A at 3 levels and
factor B at 2 levels.
Use Stat > ANOVA > Interactions Plot…
Responses: Specify the column containing the responses.
Factors: Specify the columns containing the levels of the two factors.
Display full interaction plot matrix: Check this box to construct the interaction plot
two ways: Factor A levels as x-axis and Factor B levels as different plotting symbols;
Factor B levels as x-axis and Factor A levels as different plotting symbols.
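[Figure: Interaction Plot for Score (Data Means). Full interaction plot matrix with two
panels: FactorB levels (1, 2) on the x-axis with a separate line for each FactorA level
(1, 2, 3), and FactorA levels (1, 2, 3) on the x-axis with a separate line for each FactorB
level (1, 2). The vertical axes run from 65 to 80.]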
Chi-Square Test for One-Way Table (Minitab)
Use Stat > Tables > Chi-Square Goodness-of-Fit Test (One Variable)…
If the data are summarized as frequencies of the categories, use the first input option:
Observed counts: Specify the column containing the frequencies.
Category names (optional): Specify the column containing the category names.
If you have raw categorical data in a single column with one row per observation, use
the second input option:
Categorical data: Specify the column containing the
categorical data.
Under Test, specify what the null hypothesis of the test looks like:
Equal proportions: Use this option if the null hypothesis specifies equal proportions:
$H_0: p_1 = p_2 = \cdots = p_k = \frac{1}{k}$
Specific proportions: Use this option if the null hypothesis specifies different
proportions. Specify a column in the worksheet that contains the proportions specified
in the null hypothesis. Here we use $H_0: p_{Blue} = .3,\ p_{Red} = .4,\ p_{Yellow} = .3$.
Important Note: If you use the Observed counts input option, the specific proportions
must be in the same order as the categories appear in the frequency table. If you use
the Categorical data input option, the specific proportions must correspond to the
alphabetical order of the categories in the data.
Here, since we're using the Categorical data input option, the null hypothesized
proportion for Blue comes first.
The output shows the frequency table with the observed counts, null hypothesized
proportions, expected counts, and chi-square contributions, $\frac{(n_i - E_i)^2}{E_i}$, as well as the
chi-square test statistic value and p-value.

Chi-Square Goodness-of-Fit Test for Categorical Variable: Color

                      Test               Contribution
Category  Observed  Proportion  Expected    to Chi-Sq
Blue            13         0.3      13.2      0.00303
Red             11         0.4      17.6      2.47500
Yellow          20         0.3      13.2      3.50303

 N  N*  DF   Chi-Sq  P-Value
44   0   2  5.98106    0.050
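As a check on how the columns of this output relate to each other, the sketch below recomputes the expected counts, the contributions $\frac{(n_i - E_i)^2}{E_i}$, and the test statistic from the observed counts and null proportions shown for Color. The variable names are illustrative. The p-value line uses a shortcut that holds only for DF = 2, where the chi-square upper-tail probability equals $e^{-x/2}$; other degrees of freedom would need a statistics library or a table.

```python
import math

observed = {"Blue": 13, "Red": 11, "Yellow": 20}        # observed counts from the output
null_props = {"Blue": 0.3, "Red": 0.4, "Yellow": 0.3}   # proportions specified by H0

n = sum(observed.values())
expected = {c: n * null_props[c] for c in observed}
contributions = {c: (observed[c] - expected[c]) ** 2 / expected[c] for c in observed}
chi_sq = sum(contributions.values())
df = len(observed) - 1

# Closed form valid only when df == 2: P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

for c in observed:
    print(f"{c:7s} expected={expected[c]:.1f} contribution={contributions[c]:.5f}")
print(f"Chi-Sq={chi_sq:.5f} DF={df} P-Value={p_value:.3f}")
```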
Chi-Square Test for Two-Way Table (Minitab)
There are two menu options, depending on how the data are shown in the worksheet.
Raw/stacked data format: two columns holding the category
values of each variable
Here we have 3 random samples of 50 Plain, Peanut, and Dark
Chocolate M&Ms and observe the color of each.
Use Stat > Tables > Cross Tabulation and Chi-Square.
For rows: Specify the column containing the categories you'd like to use as the rows of
the 2-way contingency table.
For columns: Specify the column containing the categories you'd like to use as the
columns of the 2-way contingency table.
Use the Chi-Square… button to specify the Chi-Square analysis. You can also display the
expected counts.
The output shows the contingency table and two versions of the chi-square test. Our
textbook uses the Pearson Chi-Square test.
Tabulated statistics: Type, Color

Rows: Type   Columns: Color

        Blue  Brown  Green  Orange  Red  Yellow  All
Dark       9      4      5       7   14      11   50
Peanut     9      5      6       8   13       9   50
Plain     11      8      7      10    9       5   50
All       29     17     18      25   36      25  150

Cell Contents:  Count

Pearson Chi-Square = 6.105, DF = 10, P-Value = 0.806
Likelihood Ratio Chi-Square = 6.241, DF = 10, P-Value = 0.795
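To show where the Pearson statistic comes from, the following sketch recomputes it from the cell counts in the Type by Color table. Each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows − 1)(columns − 1). The variable names are illustrative, and the p-value is left to Minitab or a chi-square table.

```python
colors = ["Blue", "Brown", "Green", "Orange", "Red", "Yellow"]
counts = {
    "Dark":   [9, 4, 5, 7, 14, 11],
    "Peanut": [9, 5, 6, 8, 13, 9],
    "Plain":  [11, 8, 7, 10, 9, 5],
}

row_totals = {row: sum(cells) for row, cells in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(row_totals.values())

chi_sq = 0.0
for row, cells in counts.items():
    for j, observed in enumerate(cells):
        expected = row_totals[row] * col_totals[j] / grand_total
        chi_sq += (observed - expected) ** 2 / expected

df = (len(counts) - 1) * (len(colors) - 1)
print(f"Pearson Chi-Square = {chi_sq:.3f}, DF = {df}")
```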
Summarized count data: worksheet holds
the contingency table counts
Here we have the results of 500 father/son
pairs classified by both the father's and the
son's occupation. The father's occupation is
shown on the rows and the son's on the
columns.
Use Stat > Tables > Chi-Square Test
(Two-Way Table in Worksheet)…
Columns containing the table: Specify the
columns of the worksheet containing the cell counts.
The output shows the contingency table with expected counts and the individual
contributions to the chi-square statistic, along with the results of the chi-square test.
Chi-Square Test: Prof/Bus, Skilled, Unskilled, Farmer

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

       Prof/Bus  Skilled  Unskilled  Farmer  Total
    1        55       38          7       0    100
          34.20    41.40      16.00    8.40
         12.650    0.279      5.063   8.400

    2        79       71         25       0    175
          59.85    72.45      28.00   14.70
          6.127    0.029      0.321  14.700

    3        22       75         38      10    145
          49.59    60.03      23.20   12.18
         15.350    3.733      9.441   0.390

    4        15       23         10      32     80
          27.36    33.12      12.80    6.72
          5.584    3.092      0.613  95.101

Total       171      207         80      42    500

Chi-Sq = 180.874, DF = 9, P-Value = 0.000
Regression (Minitab)
To perform regression analysis,
data must be entered in the
worksheet with one row per
observation and one column for
each variable term in the
regression model.
Here we have 24 orange juice
samples, 6 from each of 4
brands, the pectin content, and
measure of sweetness for each.
We use 3 dummy/indicator
variables to indicate the 4 levels
of brand.
Use Stat > Regression > Regression…
Response: Specify the column containing
the y-variable data.
Predictors: Specify the column(s) containing
the data corresponding to each term in the
model. Here we fit a simple linear model of
the form
$y = \beta_0 + \beta_1 Pectin + \epsilon$
Use the Graphs… button to request residual
plots.
Residuals for Plots: Specify the type of
residuals.
Regular: $y_i - \hat{y}_i$
Standardized: $\frac{y_i - \hat{y}_i}{s}$
Deleted: $\frac{y_i - \hat{y}_i^*}{s^*}$
where $\hat{y}_i^*$ and $s^*$ are calculated based on the
model fit to the data where observation $i$ is omitted.
Residual Plots: Request plots individually or a Four-in-one display.
Residuals versus the variables: Request residuals (on the y-axis) plotted against any
variable in the worksheet.
Use the Options… button to request
additional analyses such as Variance
inflation factors (VIF) for each term in the
model and the Durbin-Watson statistic to
test for first-order autocorrelation.
Prediction intervals for new
observations: Use this option to request
confidence intervals and prediction
intervals for a particular set of x's. Specify
the values of the x variables in the same
order as their terms appear in the model, i.e. in the same order as you entered the
terms in the Predictors box. Here we predict Sweetness for 250 ppm of Pectin.
The output of this analysis is displayed in both the session window and several graph
windows.
Regression Analysis: SweetIndex versus Pectin

The regression equation is
SweetIndex = 6.25 - 0.00231 Pectin

Predictor        Coef    SE Coef      T      P
Constant       6.2521     0.2366  26.42  0.000
Pectin     -0.0023106  0.0009049  -2.55  0.018

S = 0.214998   R-Sq = 22.9%   R-Sq(adj) = 19.4%

Analysis of Variance
Source          DF       SS       MS     F      P
Regression       1  0.30140  0.30140  6.52  0.018
Residual Error  22  1.01693  0.04622
Total           23  1.31833

Unusual Observations
Obs  Pectin  SweetIndex     Fit  SE Fit  Residual  St Resid
  1     220      5.2000  5.7437  0.0552   -0.5437    -2.62R
 11     410      5.4000  5.3047  0.1453    0.0953     0.60 X
 16     383      5.3000  5.3671  0.1222   -0.0671    -0.38 X

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.

Predicted Values for New Observations
New Obs     Fit  SE Fit            95% CI            95% PI
      1  5.6744  0.0443  (5.5825, 5.7664)  (5.2192, 6.1297)

Values of Predictors for New Observations
New Obs  Pectin
      1     250
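The fit, confidence interval, and prediction interval for 250 ppm of Pectin can be reproduced, up to rounding, from the summary values Minitab prints. The sketch below assumes the usual formulas, fit ± t × SE Fit for the confidence interval and fit ± t × sqrt(S² + SE Fit²) for the prediction interval. The critical value 2.0739 (t with 22 degrees of freedom, 95% two-sided) is not part of the output and is supplied here from a t table.

```python
import math

b0, b1 = 6.2521, -0.0023106   # Constant and Pectin coefficients from the output
s = 0.214998                  # S from the output
se_fit = 0.0443               # SE Fit for the new observation, from the output
t_crit = 2.0739               # assumed: t(0.025, 22 df) taken from a t table

pectin = 250
fit = b0 + b1 * pectin

ci_half = t_crit * se_fit
pi_half = t_crit * math.sqrt(s ** 2 + se_fit ** 2)

ci = (fit - ci_half, fit + ci_half)
pi = (fit - pi_half, fit + pi_half)

print(f"Fit = {fit:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"95% PI = ({pi[0]:.4f}, {pi[1]:.4f})")
```

Because the inputs are already rounded, the last digit may differ slightly from the Minitab output. The prediction interval is wider because it also accounts for the variation of an individual observation around the line.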
[Figure: Residual Plots for SweetIndex (four-in-one display): Normal Probability Plot (Percent vs. Residual), Versus Fits (Residual vs. Fitted Value), Histogram (Frequency vs. Residual), and Versus Order (Residual vs. Observation Order).]
[Figure: Residuals Versus Pectin (response is SweetIndex).]
To fit separate, parallel lines to the data, one for each brand, we fit the multiple
regression model
$y = \beta_0 + \beta_1 Pectin + \beta_2 Brand_A + \beta_3 Brand_B + \beta_4 Brand_C + \epsilon$
Use Stat > Regression > Regression… and specify 4 predictor terms. We use the
Options… button to request a confidence interval and prediction interval for a pectin
content of 250 ppm from Brand B.
Regression Analysis: SweetIndex versus Pectin, Brand_A, Brand_B, Brand_C

The regression equation is
SweetIndex = 6.19 - 0.00193 Pectin - 0.024 Brand_A - 0.023 Brand_B
             - 0.108 Brand_C

Predictor       Coef   SE Coef      T      P
Constant      6.1924    0.2894  21.40  0.000
Pectin     -0.001927  0.001151  -1.68  0.110
Brand_A      -0.0238    0.1322  -0.18  0.859
Brand_B      -0.0233    0.1362  -0.17  0.866
Brand_C      -0.1081    0.1465  -0.74  0.470

S = 0.227640   R-Sq = 25.3%   R-Sq(adj) = 9.6%

Analysis of Variance
Source          DF       SS       MS     F      P
Regression       4  0.33375  0.08344  1.61  0.213
Residual Error  19  0.98458  0.05182
Total           23  1.31833

Source   DF   Seq SS
Pectin    1  0.30140
Brand_A   1  0.00009
Brand_B   1  0.00407
Brand_C   1  0.02819

Unusual Observations
Obs  Pectin  SweetIndex     Fit  SE Fit  Residual  St Resid
  1     220      5.2000  5.7446  0.0932   -0.5446    -2.62R
 11     410      5.4000  5.3789  0.1866    0.0211     0.16 X

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.

Predicted Values for New Observations
New Obs     Fit  SE Fit            95% CI            95% PI
      1  5.6873  0.0956  (5.4873, 5.8873)  (5.1705, 6.2040)

Values of Predictors for New Observations
New Obs  Pectin   Brand_A  Brand_B   Brand_C
      1     250  0.000000     1.00  0.000000
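The role of the three indicator variables is easiest to see by evaluating the fitted equation by hand. The sketch below uses the coefficient estimates from this output: each brand shifts the intercept while the Pectin slope stays the same, which is what makes the lines parallel. The helper names are illustrative, and the label "D" for the fourth (baseline) brand is hypothetical, since the guide does not name it.

```python
coef = {
    "Constant": 6.1924,
    "Pectin": -0.001927,
    "Brand_A": -0.0238,
    "Brand_B": -0.0233,
    "Brand_C": -0.1081,
}

def indicators(brand):
    """Dummy coding: A, B, C each get an indicator; the baseline brand is all zeros."""
    return {f"Brand_{b}": 1 if brand == b else 0 for b in "ABC"}

def predict(pectin, brand):
    """Fitted SweetIndex for a given pectin content and brand label."""
    x = {"Pectin": pectin, **indicators(brand)}
    return coef["Constant"] + sum(coef[name] * value for name, value in x.items())

print(indicators("B"))               # the predictor values entered for Brand B
print(round(predict(250, "B"), 4))   # fitted value for 250 ppm, Brand B
print(round(predict(250, "D"), 4))   # baseline brand: intercept and Pectin term only
```

The Brand B value agrees with the Fit in the output up to rounding of the printed coefficients.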
Regression (Excel)
To perform regression analysis, data
must be entered in the worksheet
with one row per observation and
one column for each variable term in
the regression model.
All columns containing data for the
predictor terms in the model must be
in one contiguous block of columns.
Here we have 24 orange juice
samples, 6 from each of 4 brands,
the pectin content, and measure of
sweetness for each.
We use 3 dummy/indicator variables
to indicate the 4 levels of brand.
Use Data > Data Analysis > Regression.
Input Y Range: Specify the column
containing the y-variable data.
Input X Range: Specify the column or block
of columns containing the data corresponding
to each term in the model. Here we fit a
multiple regression model of the form
$y = \beta_0 + \beta_1 Pectin + \beta_2 Brand_A + \beta_3 Brand_B + \beta_4 Brand_C + \epsilon$.
Labels: Check this box if the input ranges include the column names.
Residual Plots: Request a residual plot of residuals vs. each predictor variable.
Normal Probability Plots: Request a normal probability plot of the response (y) data.
This is not really the plot we want to look at; we'd like a normal probability plot of the
residuals (see Minitab).
The output shows some regression summary statistics, the ANOVA table, and table of
parameter estimates. The interval estimates are confidence intervals for the $\beta$
parameters.
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.503153
R Square           0.253163
Adjusted R Square  0.095935
Standard Error     0.22764
Observations       24

ANOVA
            df        SS        MS        F  Significance F
Regression   4  0.333754  0.083438  1.61016          0.2127
Residual    19   0.98458   0.05182
Total       23  1.318333

           Coefficients  Standard Error    t Stat   P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept       6.19237        0.289351   21.4009  9.27E-15   5.586752   6.797988     5.586752     6.797988
Pectin         -0.00193        0.001151  -1.67519  0.110277   -0.00434   0.000481     -0.00434     0.000481
Brand_A        -0.02377        0.132192  -0.17982  0.859196   -0.30045    0.25291     -0.30045      0.25291
Brand_B        -0.02326        0.136232  -0.17076  0.866215    -0.3084   0.261873      -0.3084     0.261873
Brand_C        -0.10809        0.146541  -0.73762  0.469758    -0.4148   0.198622      -0.4148     0.198622
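The Lower 95% and Upper 95% columns are confidence intervals of the form coefficient ± t × standard error. As a sketch, the code below recomputes them from the Coefficients and Standard Error columns. The critical value 2.0930 (t with 19 residual degrees of freedom, 95% two-sided) is an assumption taken from a t table and does not appear in the Excel output.

```python
T_CRIT = 2.0930  # assumed: t(0.025, 19 df) from a t table

# (coefficient, standard error) pairs copied from the Excel output
estimates = {
    "Intercept": (6.19237, 0.289351),
    "Pectin":    (-0.00193, 0.001151),
    "Brand_A":   (-0.02377, 0.132192),
    "Brand_B":   (-0.02326, 0.136232),
    "Brand_C":   (-0.10809, 0.146541),
}

intervals = {
    term: (b - T_CRIT * se, b + T_CRIT * se)
    for term, (b, se) in estimates.items()
}

for term, (low, high) in intervals.items():
    print(f"{term:9s} ({low:.5f}, {high:.5f})")
```

Small differences in the last digits come from the rounding of the printed coefficients. An interval that contains 0, as all four slope intervals do here, is consistent with the large p-values for those terms.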
[Figures: residual plots of Residuals versus each predictor (Pectin Residual Plot, Brand_A Residual Plot, Brand_B Residual Plot, Brand_C Residual Plot) and a Normal Probability Plot of SweetIndex versus Sample Percentile.]
$\bar{x}$ Chart (Minitab)
Use Stat > Control Charts > Variables Charts for Subgroups > X-bar…
There are two ways to input the data: stacked and
unstacked.
Stacked Format
All observations for a chart are in one column: Data for
each sample are stacked in a single column. Here we show
data for 10 samples of size 4 in a single column.
Subgroup sizes: Enter the sample size for each sample ($n$).
Use Xbar Options… to further specify
the analysis:
Use the Estimate tab to specify the
method for estimating the process
standard deviation ($\sigma$). Our text uses the
Rbar ($\bar{R}$) method.
Use the S Limits tab to request control
limits and zone boundaries at 1-, 2-, and
3-standard deviations from the centerline.
Enter the multiples separated by spaces.
Note: "Standard deviation" here refers to
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. If the Rbar method of estimating
$\sigma$ is used, $\sigma_{\bar{x}} \approx \frac{\bar{R}/d_2}{\sqrt{n}}$.
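As a numerical illustration of this note, the sketch below computes the 1-, 2-, and 3-standard-deviation limits for the stacked example (10 samples of size 4). It uses the centerline 9.880 from the Xbar Chart of X and $\bar{R}$ = 3.813 from the R Chart of X shown in this guide. The constant $d_2$ = 2.059 for $n$ = 4 is the standard control chart table value and is an assumption here, since the guide does not list it.

```python
import math

x_double_bar = 9.880   # centerline reported on the Xbar Chart of X
r_bar = 3.813          # average range reported on the R Chart of X
n = 4                  # subgroup size
D2 = 2.059             # assumed: standard d2 constant for n = 4

sigma_hat = r_bar / D2                   # Rbar estimate of the process sigma
sigma_xbar = sigma_hat / math.sqrt(n)    # standard deviation of the sample mean

limits = {k: x_double_bar + k * sigma_xbar for k in (-3, -2, -1, 1, 2, 3)}
for k in sorted(limits):
    print(f"{k:+d}SL = {limits[k]:.3f}")
```

Under these assumptions the computed limits agree with the values Minitab prints on the chart to three decimals.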
Use the Tests tab to request all relevant
tests for special causes.
Note: The default settings of K
correspond to the tests in our text.
The output is an $\bar{x}$ chart. If any of the tests for special causes reveal
unusual patterns/points, they will be noted in the session window and flagged
on the chart. Here, none of the 8 tests reveal unusual patterns.
[Figure: Xbar Chart of X. x-axis: Sample (1 to 10); y-axis: Sample Mean. Centerline $\bar{\bar{x}}$ = 9.880; +1SL = 10.806, +2SL = 11.732, +3SL = 12.658; -1SL = 8.954, -2SL = 8.028, -3SL = 7.102.]
Unstacked Format
Observations for a subgroup are in one row of
columns: Data for each sample are in a row of the
worksheet and each row shows a different sample. Here we show data for 10 samples
of size 3. Specify the columns containing the data for each sample.
Using the same Xbar options as above gives the following chart.
[Figure: Xbar Chart of X1, ..., X3. x-axis: Sample (1 to 10); y-axis: Sample Mean. Centerline $\bar{\bar{x}}$ = 103.04; +1SL = 110.43, +2SL = 117.82, +3SL = 125.21; -1SL = 95.65, -2SL = 88.26, -3SL = 80.86.]
R Chart (Minitab)
Use Stat > Control Charts > Variables Charts for Subgroups > R…
There are two ways to input the data: stacked and unstacked.
Stacked Format
All observations for a chart are in one column: Data for
each sample are stacked in a single column. Here we show
data for 10 samples of size 4 in a single column.
Subgroup sizes: Enter the sample size for each sample ($n$).
Use R Options… to further specify the
analysis:
Use the Estimate tab to specify the
method for estimating the process
standard deviation ($\sigma$). Our text uses the
Rbar ($\bar{R}$) method.
Use the S Limits tab to request control
limits and zone boundaries at 1-, 2-, and 3-standard deviations from the centerline.
Enter the multiples separated by spaces.
Note: "Standard deviation" here refers to
the standard deviation of the sample
range, $\sigma_R$.
Use the Tests tab to request all relevant
tests for special causes.
Note: The default settings of K correspond
to the tests in our text.
The output is an R chart. If any of the tests for special causes reveal
unusual patterns/points, they will be noted in the session window and
flagged on the chart. Here, none of the 4 tests reveal unusual patterns.
[Figure: R Chart of X. x-axis: Sample (1 to 10); y-axis: Sample Range. Centerline $\bar{R}$ = 3.813; +1SL = 5.442, +2SL = 7.070, +3SL = 8.699; -1SL = 2.185, -2SL = 0.556, -3SL = 0.]
Unstacked Format
Observations for a subgroup are in one row of columns: Data for each sample are
in a row of the worksheet and each row shows a different sample. Here we show data
for 10 samples of size 3. Specify the columns containing the data for each sample.
Using the same R options as above gives the following chart.
[Figure: R Chart of X1, ..., X3. x-axis: Sample (1 to 10); y-axis: Sample Range. Centerline $\bar{R}$ = 21.67; +1SL = 33.04, +2SL = 44.42, +3SL = 55.79; -1SL = 10.30, -2SL = 0, -3SL = 0.]
P Chart (Minitab)
The data must be arranged in the worksheet with a column
containing the number of "successes" for each sample.
Here we have 15 samples of 20 items, each of which we
classify as defective or not. The number defective for each
sample is shown.
Use Stat > Control Charts > Attributes Chart > P…
Variables: Specify the column containing the number of successes for each sample.
Subgroup sizes: Enter the sample size for each sample ($n$).
Use P Chart Options… to further specify
the analysis:
Use the S Limits tab to request control
limits and zone boundaries at 1-, 2-, and 3-standard deviations from the centerline.
Enter the multiples separated by spaces.
Note: "Standard deviation" here refers to
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$. $p$ is estimated by $\bar{p}$, the
overall proportion of successes.
Use the Tests tab to request all relevant
tests for special causes.
Note: The default settings of K correspond
to the tests in our text.
The output is a P chart. If any of the tests for special causes reveal unusual
patterns/points, they will be noted in the session window and flagged on the chart.
Here, none of the 4 tests reveal unusual patterns.
[Figure: P Chart of Num Defective. x-axis: Sample (1 to 15); y-axis: Proportion. Centerline $\bar{p}$ = 0.0867; +1SL = 0.1496, +2SL = 0.2125, +3SL = 0.2754; -1SL = 0.0238, -2SL = 0, -3SL = 0.]
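The limits on this chart follow from $\sigma_{\hat{p}} = \sqrt{\bar{p}(1-\bar{p})/n}$. The sketch below recomputes them from the centerline $\bar{p}$ = 0.0867 and the subgroup size $n$ = 20. Limits that would fall below zero are set to 0, which is how the chart reports -2SL and -3SL. Because $\bar{p}$ is taken as printed (rounded to four decimals), the results match the chart only to about three decimals.

```python
import math

p_bar = 0.0867   # centerline reported on the P chart (rounded)
n = 20           # items per sample

sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)

# A proportion cannot be negative, so lower limits are floored at 0.
limits = {k: max(0.0, p_bar + k * sigma_p) for k in (-3, -2, -1, 1, 2, 3)}
for k in sorted(limits):
    print(f"{k:+d}SL = {limits[k]:.4f}")
```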
Time Series Plot (Minitab)
To plot a time series obtained at regular intervals of time,
record the time series values in the worksheet in a single
column. Here we have the quarterly price (dollars) of a
stock for 4 years.
Use Stat > Time Series > Time Series Plot and choose
Simple.
Series: Specify the column containing the time series values.
The default x-axis is a time
index that begins at 1 with the
first time point (here, first
quarter of 2009).
[Figure: Time Series Plot of Price. x-axis: Index (1 to 16); y-axis: Price (about 40 to 90).]
Moving Averages (Minitab)
Record the time series values in the worksheet in a single
column. Here we have the quarterly price (dollars) of a
stock for 4 years.
Use Stat > Time Series > Moving Average…
Variable: Specify the column containing the time series values.
MA length: Specify the number of time points to include in the moving average.
Center the moving averages: Check this box to compute the moving average at time
$i$ ($MA_i$) using the window centered at the time point. The default is to use a window
ending at time $i$.
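The difference between the default window (ending at time $i$) and the centered window can be seen in a small example. This is a minimal sketch with a hypothetical function name and made-up data. It handles only odd window lengths when centering and makes no claim about how Minitab treats even lengths.

```python
def moving_average(series, length, centered=False):
    """Return moving averages, with None where the window is incomplete."""
    if centered and length % 2 == 0:
        raise ValueError("this sketch only centers odd window lengths")
    half = length // 2
    result = []
    for i in range(len(series)):
        # Centered: window spans i-half .. i+half. Default: window ends at i.
        start = i - half if centered else i - length + 1
        end = start + length
        if start < 0 or end > len(series):
            result.append(None)
        else:
            result.append(sum(series[start:end]) / length)
    return result

prices = [10, 20, 30, 40, 50]   # hypothetical values, not the guide's stock data
trailing = moving_average(prices, 3)
centered = moving_average(prices, 3, centered=True)
print(trailing)
print(centered)
```

With a length of 3, the same averages appear in both versions, but the centered version places each one at the middle time point of its window rather than the last.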
Use the Storage… button to store the moving
averages in the worksheet.
Use the Graphs… button to plot the
smoothed vs. the actual values. The default
is to plot the predicted (previous smoothed
value) vs. the actual values.
The output shows the original time series (actual values), the smoothed series using
moving averages, and three accuracy measures.
[Figure: Moving Average Plot for Price. Series shown: Actual and Smoothed; x-axis: Index (1 to 16); y-axis: Price. Moving Average Length: 3. Accuracy Measures: MAPE 8.5700, MAD 6.1674, MSD 53.6898.]
Single Exponential Smoothing (Minitab)
Record the time series values in the worksheet in a single
column. Here we have the quarterly price (dollars) of a
stock for 4 years.
Use Stat > Time Series > Single Exp Smoothing…
Variable: Specify the column containing the time series values.
Weight to Use in Smoothing: Select the Use radio button and specify the weight, $w$.
The default is to use an optimizing algorithm to determine the weight.
Under Options… set the initial smoothed value as
the first actual value (K = 1). This matches the
technique our text uses.
Use the Storage… button to store the smoothed
values in the worksheet.
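With the initial smoothed value set to the first actual value, single exponential smoothing can be written as $E_1 = Y_1$ and $E_t = wY_t + (1-w)E_{t-1}$. The sketch below applies that recursion. The function name and the data are hypothetical, and the recursion is the standard textbook form rather than something read from Minitab's output.

```python
def exponential_smoothing(series, w):
    """Single exponential smoothing with E_1 equal to the first actual value."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed

prices = [10, 20, 30]   # hypothetical values, not the guide's stock data
smoothed = exponential_smoothing(prices, w=0.5)
print(smoothed)
```

A weight close to 1 makes the smoothed series follow the actual values closely, while a weight close to 0 gives a smoother series that reacts slowly to changes.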
Use the Graphs… button to plot the smoothed
vs. the actual values. The default is to plot the
predicted (previous smoothed value) vs. the
actual values.
The output shows the original time series (actual values), the smoothed series using
single exponential smoothing, and three accuracy measures.
Note: Minitab refers to the smoothing weight, $w$, as "Alpha."
[Figure: Smoothing Plot for Price (Single Exponential Method). Series shown: Actual and Smoothed; x-axis: Index (1 to 16); y-axis: Price. Smoothing Constant: Alpha 0.7. Accuracy Measures: MAPE 10.8279, MAD 7.5114, MSD 94.7440.]
Trend Analysis (Minitab)
Record the time series values in the worksheet in a single
column. Here we have the quarterly price (dollars) of a
stock for 4 years.
Use Stat > Time Series > Trend Analysis…
Variable: Specify the column containing the time series values.
Model Type: Select the type of trend model you want to fit. Here we select a linear
trend model (linear regression model) of the form $Y_t = \beta_0 + \beta_1 t + \epsilon$ where
$t = 1, 2, 3, \ldots, n$. Minitab creates this column for you so that you don't need to specify a
predictor column.
Use Generate forecasts to compute predicted values of the time series for a specified
number of time periods past the end of the observed series. Here we request a
forecasted price for first quarter 2013.
The graph shows the estimated trend model, the original time series (actual values), the
predicted values using the estimated trend model (fits), and the forecasted values,
along with three accuracy measures. These summaries are also shown in the session
window where the values of the forecasts are also displayed.
Trend Analysis for Price

Data      Price
Length    16
NMissing  0

Fitted Trend Equation
Yt = 43.53 + 2.93*t

Accuracy Measures
MAPE   8.8131
MAD    5.7672
MSD   42.1620

Forecasts
Period  Forecast
17       93.3646

[Figure: Trend Analysis Plot for Price, Linear Trend Model Yt = 43.53 + 2.93*t. Series shown: Actual, Fits, and Forecasts; x-axis: Index; y-axis: Price. Accuracy Measures: MAPE 8.8131, MAD 5.7672, MSD 42.1620.]

Note: You can also use Stat > Regression >
Regression… to obtain the same estimated trend
model. You must first add a time index column
having the values $1, 2, 3, \ldots, n$ to the worksheet and
then specify that column as the Predictor column.
Use the Options… button to request
forecasts by specifying the value of $t$ for the
desired time period.
The output will be formatted differently but
will show the same estimated model and
forecasts.
Note: Be careful when using p-values and
confidence intervals in a time series context
as the assumption of independent errors
may be violated.
Regression Analysis: Price versus t

The regression equation is
Price = 43.5 + 2.93 t

Predictor    Coef  SE Coef      T      P
Constant   43.532    3.640  11.96  0.000
t          2.9313   0.3765   7.79  0.000

S = 6.94155   R-Sq = 81.2%   R-Sq(adj) = 79.9%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  2921.5  2921.5  60.63  0.000
Residual Error  14   674.6    48.2
Total           15  3596.1

Unusual Observations
Obs     t  Price    Fit  SE Fit  Residual  St Resid
 12  12.0  91.96  78.71    2.18     13.26     2.01R

R denotes an observation with a large standardized residual.

Predicted Values for New Observations
New Obs    Fit  SE Fit           95% CI           95% PI
      1  93.36    3.64  (85.56, 101.17)  (76.55, 110.18)

Values of Predictors for New Observations
New Obs     t
      1  17.0
Trend Analysis (Excel)
Record the time series values in the worksheet in
a single column. Here we have the quarterly price
(dollars) of a stock for 4 years.
Add a time index column having the values
$1, 2, 3, \ldots, n$ to the worksheet.
Use Data > Data Analysis > Regression to fit
the trend model to the data. Here we fit the linear
trend model (linear regression model) of the form
$Y_t = \beta_0 + \beta_1 t + \epsilon$
Input Y Range: Specify the column containing the time series values.
Input X Range: Specify the column containing the time index column.
Labels: Check this box if the input ranges contain the column labels.
From the regression output, you can obtain the estimated trend model and use it to
generate forecasts.
Note: Be careful when using p-values and confidence intervals in a time series context
as the assumption of independent errors may be violated.
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.901338
R Square           0.81241
Adjusted R Square  0.799011
Standard Error     6.941553
Observations       16

ANOVA
            df        SS        MS         F  Significance F
Regression   1  2921.513  2921.513  60.63097        1.87E-06
Residual    14  674.5922  48.18516
Total       15  3596.105

           Coefficients  Standard Error    t Stat   P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept        43.532        3.640181  11.95875  9.78E-09   35.72459   51.33942     35.72459     51.33942
t              2.931328        0.376459  7.786589  1.87E-06   2.123905   3.738751     2.123905     3.738751
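Generating a forecast from this output amounts to plugging a future time index into the estimated trend equation. The sketch below uses the Intercept and t coefficients from the Excel output and forecasts period 17, the first quarter after the 16 observed quarters. The function name is illustrative.

```python
b0 = 43.532      # Intercept from the Coefficients column
b1 = 2.931328    # coefficient of the time index t

def forecast(t):
    """Estimated trend value at time index t."""
    return b0 + b1 * t

n = 16                       # number of observed quarters
next_period = n + 1
print(f"Forecast for t = {next_period}: {forecast(next_period):.4f}")
```

The result agrees with the forecast of 93.3646 that Minitab's Trend Analysis reports for period 17.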
Seasonal Regression Models
Record the time series values
in the worksheet in a single
column. Here we have the
quarterly price (dollars) of a
stock for 4 years.
Include columns for $l - 1$
dummy/indicator variables to
represent $l$ seasons. Here we
use 3 dummy variables to
represent the 4 quarters.
If you want to incorporate a
trend component in the model,
add a time index column to the
worksheet.
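The worksheet layout described here can be sketched as follows: for quarterly data, three indicator columns mark quarters 1 to 3, and an optional time index column carries the trend. The function name is hypothetical, and treating the fourth quarter as the baseline (all indicators 0) is an assumption, since any one quarter can serve as the baseline.

```python
def seasonal_design(n_obs, n_seasons=4, trend=False):
    """Build rows of l - 1 season indicators, plus a time index if trend=True."""
    rows = []
    for t in range(1, n_obs + 1):
        season = (t - 1) % n_seasons + 1            # 1, 2, ..., l, then repeat
        # Assumed baseline: the last season gets zeros in every indicator.
        row = [1 if season == s else 0 for s in range(1, n_seasons)]
        if trend:
            row.append(t)
        rows.append(row)
    return rows

seasonal_only = seasonal_design(16)
seasonal_trend = seasonal_design(16, trend=True)
for row in seasonal_trend[:5]:
    print(row)   # columns: x1, x2, x3, t
```

Each row corresponds to one quarter of the series, so these columns can be entered next to the Price column and used as the predictors in the regression.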
Refer to Regression (Minitab) or Regression (Excel) in this guide to fit the desired
model to the data:
Components        Model
Seasonal only     $Y_t = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon$
Seasonal + Trend  $Y_t = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 t + \epsilon$