Key points

• An estimate is an indication of the value of an unknown quantity based on observed data.
• A population is the entire collection of people or things you are interested in.
• A census is a measurement of all the units in the population.
• A population parameter is a number that results from measuring all the units in the population.
• A sampling frame is the specific data from which the sample is drawn, e.g., a telephone book.
• A unit of analysis is the type of object of interest, e.g., arsons, fire departments, firefighters.
• A sample is a subset of some of the units in the population.
• A statistic is a number that results from measuring all the units in the sample.
• Statistics derived from samples are used to estimate population parameters.
• N = the number of cases in the sampling frame
• n = the number of cases in the sample
• NCn = the number of combinations (subsets) of n from N
• f = n/N = the sampling fraction
• The mean x̄ and the standard deviation s for the sample are statistics. They are used as estimates of the parameters. Statistics are variables.
• A sample mean, denoted x̄ (pronounced "x-bar"), is an average of n observations. It measures the center of the observed data values.
• A sample standard deviation, denoted s, is an average deviation of n observations. It measures the spread or dispersion of the observed data values.
• As you increase the sample size, regardless of the shape of the population you sample from, the distribution of the sample mean (look at its histogram) becomes more bell-shaped.

INFERENTIAL STATISTICS

Table of Contents

SAMPLE
SAMPLING
REASONS FOR SAMPLING
    ECONOMY
    TIME FACTOR
    VERY LARGE POPULATION
    PARTLY ACCESSIBLE POPULATIONS
    THE DESTRUCTIVE NATURE OF THE OBSERVATION
    ACCURACY AND SAMPLING
BIAS AND ERROR IN SAMPLING
SAMPLING ERROR
NON SAMPLING ERROR
POPULATION PARAMETER AND SAMPLE STATISTICS
PROBABILITY (RANDOM) SAMPLING
    Simple Random Sampling
    Stratified Random Sampling
    Systematic Random Sampling
    Cluster Random Sampling
    Multistage Random Sampling
    Sequential Random Sampling
NON PROBABILITY SAMPLING
    Purposive Sampling
    Quota Sampling
    Convenience Sampling
DIFFERENCES BETWEEN RANDOM AND NON RANDOM SAMPLING
Sampling techniques: Advantages and disadvantages
How to Choose the Best Sampling Method
SAMPLING DISTRIBUTION
    Why the sampling distribution is important
    Central Limit Theorem
    Variability of a Sampling Distribution
SAMPLING DISTRIBUTION OF MEANS
    Sampling distribution in case of without replacement
    Sampling distribution of difference between means
    Sampling distribution of proportions
    Sampling distribution of differences of proportions
Objectives
SAMPLING
SAMPLE:
A sample is a group of units selected from a larger group (the population). By studying the
sample it is hoped to draw valid conclusions about the larger group.
A sample is generally selected for study because the population is too large to study in its
entirety. The sample should be representative of the general population. This is often best
achieved by random sampling (probability sampling). Also, before collecting the sample, it is
important that the researcher carefully and completely defines the population, including a
description of the members to be included.
Example
In a classroom of 30 students in which half the students are male and half are female, a representative
sample might include six students: three males and three females.
SAMPLING
Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population. There are two major categories of sampling:
• Probability sampling
• Non-probability sampling
Examples
1. Conducting a poll to predict the winner of an upcoming election
2. Inspecting a sample of parts to determine if the entire lot meets requirements
3. Sometimes "measuring" or "testing" something destroys it. The government requires
automakers who want to sell cars in the U.S. to demonstrate that their cars can
survive certain crash tests. Obviously, the company can't be expected to crash every
car, to see if it survives! So the company crashes only a sample of cars.
REASONS FOR SAMPLING
ECONOMY:
There is an economic advantage of using a sample in research. Obviously, taking a sample
requires fewer resources than a census.
TIME FACTOR:
A sample may provide you with needed information quickly. For example, suppose you are a doctor and a contagious disease has broken out in a village within your area of jurisdiction; it is killing within hours and nobody knows what it is. You are required to conduct quick tests to help save the situation. If you attempt a census of those affected, they may be long dead by the time you arrive with your results. In such a case, just a few of those already infected could be used to provide the required information.
VERY LARGE POPULATION:
Many populations about which inferences must be made are quite large. For example, consider the population of high school seniors in the United States of America, a group numbering 4,000,000. The responsible government agency has to plan for how they will be absorbed into the different departments and even the private sector. Employers would like to have specific knowledge about the students' plans in order to make compatible plans to absorb them during the coming year. But the sheer size of the population makes it practically impossible to conduct a census. In such a case, selecting a representative sample may be the only way to get the required information from high school seniors.
PARTLY ACCESSIBLE POPULATIONS:
There are some populations that are so difficult to get access to that only a sample can be used, such as people in prison, crashed aeroplanes in the deep sea, or presidents. The inaccessibility may be economic or time related. Consider, for example, natural disasters such as a flood that occurs only once every 100 years, or the flood of Noah's day, which has never occurred again.
THE DESTRUCTIVE NATURE OF THE OBSERVATION:
Sometimes the act of observing the desired characteristic of a unit of the population destroys
it for the intended use. Good examples of this occur in quality control. For example to test
the quality of a fuse, to determine whether it is defective, it must be destroyed. To obtain a
census of the quality of a lorry load of fuses, you have to destroy all of them. This is contrary
to the purpose served by quality-control testing. In this case, only a sample should be used to
assess the quality of the fuses.
ACCURACY AND SAMPLING:
A sample may be more accurate than a census. A sloppily conducted census can provide less
reliable information than a carefully obtained sample.
BIAS AND ERROR IN SAMPLING
Sampling bias is a tendency to favor the selection of units that have particular characteristics.
A sample is expected to mirror the population from which it comes; however, there is no
guarantee that any sample will be precisely representative of the population from which it
comes. Chance may dictate that a disproportionate number of atypical observations will be made; in the fuse-testing case, for example, the sample of fuses may contain a higher or lower proportion of faulty fuses than the true population proportion.
SAMPLING ERROR
Sampling error is incurred when the statistical characteristics of a population are estimated
from a subset, or sample, of that population. Since the sample does not include all members
of the population, statistics on the sample, such as means and quantiles, generally differ from
parameters on the entire population. For example, if one measures the height of a thousand
individuals from a country of one million, the average height of the thousand is typically not
the same as the average height of all one million people in the country. Since sampling is
typically done to determine the characteristics of a whole population, the difference between
the sample and population values is considered a sampling error. "Increasing the sample size can decrease the sampling error."
NON SAMPLING ERROR
A non-sampling error is a statistical error caused by human error to which a specific statistical analysis is exposed.
These errors can include, but are not limited to, data entry errors, biased questions in a
questionnaire, biased processing/decision making, inappropriate analysis conclusions and
false information provided by respondents.
POPULATION PARAMETER AND SAMPLE STATISTICS
A parameter is a value, usually unknown (and which therefore has to be estimated or tested),
used to represent a certain population characteristic. For example, the population mean is a
parameter that is often used to indicate the average value of a quantity.
Within a population, a parameter is a fixed value which does not vary. Each sample drawn
from the population has its own value of any statistic that is used to estimate this parameter.
For example, the mean of the data in a sample is used to give information about the overall
mean in the population from which that sample was drawn.
For example, say you want to know the mean income of the subscribers to a particular
magazine—a parameter of a population. You draw a random sample of 100 subscribers and
determine that their mean income is $27,500 (a statistic). You conclude that the population mean income µ is likely to be close to $27,500 as well. This example is one of statistical inference.
In this notation, the population standard deviation is written "σ" and the population variance "σ²".
PROBABILITY (RANDOM) SAMPLING
Simple Random Sampling
In statistics, a simple random sample is a subset of individuals (a sample) chosen from a
larger set (a population). Each individual is chosen randomly and entirely by chance, such
that each individual has the same probability of being chosen at any stage during the
sampling process, and each subset of k individuals has the same probability of being chosen
for the sample as any other subset of k individuals. This process and technique is known as
simple random sampling. A simple random sample is an unbiased surveying technique.
In small populations and often in large ones, such sampling is typically done "without
replacement", i.e., one deliberately avoids choosing any member of the population more
than once. Although simple random sampling can be conducted with replacement instead,
this is less common and would normally be described more fully as simple random
sampling with replacement.
Example 1
Let’s say you have a population of 1,000 people and you wish to choose a simple random
sample of 50 people. First, each person is numbered 1 through 1,000. Then, you generate a
list of 50 random numbers and those individuals assigned those numbers are the ones you
include in the sample.
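As a rough illustration of the numbering-and-drawing procedure just described, the sketch below uses Python's standard library; the population and sample sizes follow the example, but the code itself is an added illustration rather than part of the original text.

```python
import random

# Hypothetical sampling frame: 1,000 people identified by the numbers 1 to 1,000.
population = list(range(1, 1001))

# Draw a simple random sample of 50 without replacement; every person, and every
# subset of 50 people, has the same chance of being selected.
sample = random.sample(population, k=50)

print(sorted(sample))
```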
Example 2
Figure 1.1: An example of simple random sampling of 10 subjects, represented by the red 'stickmen', selected at random from a total of 50 subjects.
Case Study: Selecting a simple random sample of students
A simple random sample of 25 students is to be selected from a school of 500 students. Using
a list of all 500 students, each student is given a number (1 to 500), and these numbers are
written on small pieces of paper. All the 500 papers are put in a box, after which the box is
shaken vigorously to ensure randomization. Then, 25 papers are taken out of the box, and the
numbers are recorded. The students belonging to these numbers will constitute the simple
random sample.
Stratified Random sampling:
Stratification is the process of dividing members of the population into homogeneous
subgroups before sampling.
Example 1
Figure 1.2
The 50 subjects in Figure 1.2 have been stratified (divided) into two subgroups – one of 30
subjects (outlined in blue), and one of 20 subjects (outlined in green). A sample of 10
subjects has been selected, but they have not been picked entirely at random. Instead, 6 have
been selected at random from the 30 blue subjects and 4 have been selected at random from
the 20 green subjects, to ensure that the blue and green individuals are proportionately
represented in the sample of 10 selected individuals.
Example 2
A survey is conducted on household water supply in a district comprising 2,000 households,
of which 400 (or 20%) are urban and 1,600 (or 80%) are rural. It is suspected that in urban
areas the access to safe water sources is much more satisfactory than in rural areas. A
decision is made to sample 200 households altogether, but to include 100 urban households and 100 rural households.
Proportionate and disproportionate stratification:
Representation of the subgroups can be proportionate or disproportionate. For example, if
you wanted to sample 100 farmers from a population of farmers in which 90% are male and
10% are female, a proportionate stratified sample would select 90 males and 10 females. But
you may want to know more about the women farmers than is possible in a sample of only
ten subjects. So you can select a disproportionate stratified sample, for example, you could
select 50 males and 50 females.
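The proportionate and disproportionate allocations described above can be sketched as follows; the frame of 1,000 farmers (90% male, 10% female) and the helper function are illustrative assumptions, not data from the text.

```python
import random

# Hypothetical sampling frame: 1,000 farmers, 90% male and 10% female.
frame = [("farmer_%04d" % i, "male" if i < 900 else "female") for i in range(1000)]

def stratified_sample(frame, allocation):
    """Draw a simple random sample of the requested size from each stratum."""
    selected = []
    for stratum, size in allocation.items():
        members = [unit for unit, group in frame if group == stratum]
        selected.extend(random.sample(members, size))
    return selected

proportionate = stratified_sample(frame, {"male": 90, "female": 10})
disproportionate = stratified_sample(frame, {"male": 50, "female": 50})
print(len(proportionate), len(disproportionate))   # 100 100
```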
Systematic Random Sampling:
It is a method of selecting sample members from a larger population, according to a random
starting point and a fixed, periodic interval. Typically, every "nth" member is selected from
the total population for inclusion in the sample population.
Example 1
An example of systematic sampling: every tenth subject is selected from a total of 50 subjects.
Example 2
A systematic sample is to be selected from 1,200 students from the same school. The
required sample size is 100. The study population is 1,200 and the sample size is 100, so a
systematic sampling interval is found by dividing the study population by the sample size:
1,200 ÷ 100 = 12
the sampling interval is therefore 12.
The number of the first student to be included in the sample should be chosen randomly, for
example by blindly picking one out of twelve pieces of paper, numbered 1 to 12. If number 6
is picked, then every twelfth student will be included in the sample, starting with student
number 6, until 100 students have been selected.
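A minimal sketch of this procedure in Python is given below; the roll of 1,200 students and the sample size of 100 follow Example 2, while the variable names are assumptions of the sketch.

```python
import random

# Hypothetical student roll: 1,200 students numbered 1 to 1,200.
students = list(range(1, 1201))
sample_size = 100

interval = len(students) // sample_size     # 1,200 / 100 = 12
start = random.randint(1, interval)         # random starting point between 1 and 12

# Take every 12th student, beginning at the randomly chosen start.
sample = students[start - 1::interval]
print(len(sample), sample[:5])
```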
Cluster random Sampling:
Cluster sampling is a sampling technique used when "natural" but relatively homogeneous
groupings are evident in a statistical population. It is often used in marketing research. In this
technique, the total population is divided into groups (or clusters) and a simple random
sample of the groups is selected. Then the required information is collected from a simple
random sample of the elements within each selected group.
Example 1
Let’s say that a researcher is studying the academic performance of high school students in
the United States and wanted to choose a cluster sample based on geography. First, the
researcher would divide the entire population of the United States into clusters, or states.
Then, the researcher would select either a simple random sample or a systematic random
sample of those clusters/states. Let’s say he/she chose a random sample of 15 states and
he/she wanted a final sample of 5,000 students. The researcher would then select those 5,000
high school students from those 15 states either through simple or systematic random
sampling.
Example 2
The most common cluster used in research is a geographical cluster. For example, a
researcher wants to survey academic performance of high school students in Spain.
1. He can divide the entire population (population of Spain) into different clusters
(cities).
2. Then the researcher selects a number of clusters depending on his research through
simple or systematic random sampling.
3. Then, from the selected clusters (randomly selected cities) the researcher can either
include all the high school students as subjects or he can select a number of subjects
from each cluster through simple or systematic random sampling.
The important thing to remember about this sampling technique is to give all the clusters
equal chances of being selected.
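A two-stage version of this idea (select clusters at random, then sample within the selected clusters) can be sketched as below; the 20 schools of 50 students each are hypothetical values chosen for the illustration.

```python
import random

# Hypothetical population: 20 schools (clusters), each with its own student list.
clusters = {"school_%02d" % i: ["school_%02d_student_%02d" % (i, j) for j in range(50)]
            for i in range(20)}

# Stage 1: choose 5 clusters at random, giving every cluster an equal chance.
chosen = random.sample(list(clusters), 5)

# Stage 2: draw a simple random sample of 10 students within each chosen cluster.
sample = [student
          for name in chosen
          for student in random.sample(clusters[name], 10)]
print(len(sample))   # 5 clusters x 10 students = 50
```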
Multistage random sampling:
Multistage sampling is a complex form of cluster sampling. Cluster sampling is a type of
sampling which involves dividing the population into groups (or clusters). Then, one or more
clusters are chosen at random and everyone within the chosen cluster is sampled.
Example 1
For instance, when a polling organization samples US voters, it does not draw a simple random sample (SRS). Since voter lists are compiled by counties, it might first draw a sample of the counties and then sample within the selected counties. This illustrates two stages. In some instances, even more stages might be used. At each stage, a stratified random sample might be drawn on gender, race, income level, or any other useful variable on which information is available before sampling.
Example 2
For example, household surveys conducted by the Australian Bureau of Statistics begin by
dividing metropolitan regions into 'collection districts' and selecting some of these collection
districts (first stage). The selected collection districts are then divided into blocks, and blocks
are chosen from within each selected collection district (second stage). Next, dwellings are
listed within each selected block, and some of these dwellings are selected (third stage). This
method makes it unnecessary to create a list of every dwelling in the region; such a list is needed only for the selected blocks.
NON PROBABILITY SAMPLING
Sequential Random Sampling:
Sequential sampling is a non-probability sampling technique wherein the researcher picks a
single or a group of subjects in a given time interval, conducts his study, analyzes the results
then picks another group of subjects if needed and so on.
Example 1
If a business organization wanted to determine the need for a new product, it might use
sequential sampling as part of its research process. The business might distribute
questionnaires to a selected group of potential customers asking for response to questions or
scenarios that would help to measure the perceptions of the responders to the idea for the
potential product.
Example 2
A manufacturing plant might pull every fourth product created with a new type of material or process off the assembly line for close evaluation. The testing of that sample
portion of the output would verify whether or not the new material or process contributed to
the making of a final product that met required specifications.
Purposive sampling:
A purposive, or judgmental, sample is one that is selected based on the knowledge of a
population and the purpose of the study. It is based on the researcher’s own expertise.
Example 1
If a researcher is studying the nature of school spirit as exhibited at a school pep rally, he or
she might interview people who did not appear to be caught up in the emotions of the crowd
or students who did not attend the rally at all. In this case, the researcher is using a purposive
sample because those being interviewed fit a specific purpose or description.
Example 2
In a study wherein a researcher wants to know what it takes to graduate summa cum laude in
college, the only people who can give the researcher first-hand advice are the individuals who graduated summa cum laude. With this very specific and very limited pool of individuals that can be considered as subjects, the researcher must use judgmental sampling.
Quota Sampling
A quota sample is a survey design in which interviewers recruit respondents according to a
set of guidelines that will result in an overall sample with certain proportions of people with
various social characteristics.
Example 1
The requirement might be to produce a collection of interviews that is evenly divided
between men and women, has certain percentages of people from different races and age
categories etc.
Example 2
Let’s say, for example, that you want to obtain a proportional quota sample of 100 people
based on gender. First you would need to find out the proportion of the population that is
men and the proportion that is women. If you found out the larger population is 40% women
and 60% men, you would need a sample of 40 women and 60 men for a total of 100
respondents. You would start sampling and continue until you got those proportions and then
you would stop. So, if you’ve already got 40 women for the sample, but not 60 men, you
would continue to sample men and discard any legitimate women respondents that came
along.
Convenience sampling:
A convenience sample is simply one in which the researcher uses any subjects that are
available to participate in the research study. This could mean stopping people on a street corner as they pass by or surveying passersby in a mall. It could also mean surveying friends,
students, or colleagues that the researcher has regular access to.
Example 1
Let's say that a professor at a university is interested in studying drinking behaviors among college students. The professor teaches a Sociology 101 class of mostly college freshmen and decides to use his or her class as the study sample. He or she passes out
surveys during class for the students to complete and hand in.
Example 2
Convenience sampling is often used when statistical data gathered from a specific group of people are desired. For example, if a company wants to find out which flavor of pizza sells best among college students, it could poll students at a nearby local college, although such a convenience sample may not accurately represent all college students.
DIFFERENCES BETWEEN RANDOM AND NON RANDOM SAMPLING
The differences between Probability (Random) Sampling and Non-Probability (Non-Random) Sampling are summarized below.

Probability (Random) Sampling | Non-Probability (Non-Random) Sampling
Allows the use of statistics; tests hypotheses | Exploratory research; generates hypotheses
Can estimate population parameters | Population parameters are not of interest
Eliminates bias | Adequacy of the sample can't be known
Must have random selection of units | Cheaper, easier, quicker to carry out
Sampling techniques: Advantages and disadvantages
Simple
    Description: Random sample from the whole population.
    Advantages: Highly representative if all subjects participate; the ideal.
    Disadvantages: Not possible without a complete list of population members; potentially uneconomical to achieve; can be disruptive to isolate members from a group; time-scale may be too long, and the data/sample could change.

Stratified
    Description: Random sample from identifiable groups (strata), subgroups, etc.
    Advantages: Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata lists.
    Disadvantages: More complex, requires greater effort than simple random; strata must be carefully defined.

Cluster
    Description: Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units.
    Advantages: Possible to select randomly when no single list of population members exists, but local lists do; data collected on groups may avoid introduction of confounding by isolating members.
    Disadvantages: Clusters in a level must be equivalent, and some natural ones are not for essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ).

Stage
    Description: Combination of cluster (randomly selecting clusters) and random or stratified random sampling of individuals.
    Advantages: Can make up a probability sample by random selection at stages and within groups; possible to select a random sample when population lists are very localized.
    Disadvantages: Complex; combines the limitations of cluster and stratified random sampling.

Purposive
    Description: Hand-pick subjects on the basis of specific characteristics.
    Advantages: Ensures balance of group sizes when multiple groups are to be selected.
    Disadvantages: Samples are not easily defensible as being representative of populations, due to potential subjectivity of the researcher.

Quota
    Description: Select individuals as they come to fill a quota by characteristics proportional to populations.
    Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics.
    Disadvantages: Not possible to prove that the sample is representative of the designated population.

Snowball
    Description: Subjects with desired traits or characteristics give names of further appropriate subjects.
    Advantages: Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals).
    Disadvantages: No way of knowing whether the sample is representative of the population.

Volunteer, accidental, convenience
    Description: Either asking for volunteers, or the consequence of not all those selected finally participating, or a set of subjects who just happen to be available.
    Advantages: Inexpensive way of ensuring sufficient numbers for a study.
    Disadvantages: Can be highly unrepresentative.
How to Choose the Best Sampling Method
In this section, we illustrate how to choose the best sampling method by working through a
sample problem. Here is the problem:
Problem Statement
At the end of every school year, the state administers a reading test to a sample of third
graders. The school system has 20,000 third graders, half boys and half girls. There are
1000 third-grade classes, each with 20 students.
The maximum budget for this research is $3600. The only expense is the cost to proctor
each test session. This amounts to $100 per session.
The purpose of the study is to estimate the reading proficiency of third graders, based on
sample data. School administrators want to maximize the precision of this estimate
without exceeding the $3600 budget. What sampling method should they use?
Finding the "best" sampling method is a four-step process. We work through each step
below.
• List goals. This study has two main goals: (1) maximize the precision of the estimate and (2) stay within budget.
• Identify potential sampling methods.
• Test methods. A key part of the analysis is to test the ability of each potential sampling method to satisfy the research goals. Specifically, we will want to know the level of precision and the cost associated with each potential method. For our test, we use the standard error to measure precision. The smaller the standard error, the greater the precision.
• Choose the best method. In this example, the cost of each sampling method is identical, so none of the methods has an advantage on cost. However, the methods do differ with respect to precision (as measured by standard error). Cluster sampling provides the most precision (i.e., the smallest standard error); so cluster sampling is the best method.
SAMPLING DISTRIBUTION
1) The sampling distribution is a theoretical distribution of a sample statistic.
2) There is a different sampling distribution for each sample statistic.
3) The sampling distribution of the mean is a special case of the sampling distribution.
4) The Central Limit Theorem relates the parameters of the sampling distribution of the mean to the population model and is very important in statistical thinking.
Why is the sampling distribution important?
We use the sampling distribution of a statistic to determine the probability that the value of
the statistic is like other possible sample values. It helps us determine the likelihood of error
in concluding there is a relationship when there is not, or in concluding that two statistics are
different.
The sampling distribution is derived assuming the null hypothesis is correct. The sampling
distribution says, if there is no relationship between x and y, these are the statistics we would
expect and their associated probabilities.
Central Limit Theorem
The central limit theorem states that the sampling distribution of any statistic will be normal
or nearly normal, if the sample size is large enough.
How large is "large enough"? As a rough rule of thumb, many statisticians say that a
sample size of 30 is large enough. If you know something about the shape of the sample
distribution, you can refine that rule. The sample size is large enough if any of the following
conditions apply.
• The population distribution is normal.
• The sample distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sample distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.
The exact shape of any normal curve is totally determined by its mean and standard
deviation. Therefore, if we know the mean and standard deviation of a statistic, we can find
the mean and standard deviation of the sampling distribution of the statistic (assuming that
the statistic came from a "large" sample).
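The theorem is easy to see in a small simulation; the skewed population, seed, number of draws, and sample sizes below are illustrative choices added here, not values taken from the text.

```python
import random
import statistics

random.seed(1)
# A deliberately skewed (exponential-like) population of 100,000 values.
population = [random.expovariate(1.0) for _ in range(100_000)]

def sample_means(n, draws=2000):
    """Means of repeated simple random samples of size n."""
    return [statistics.mean(random.sample(population, n)) for _ in range(draws)]

for n in (2, 30):
    means = sample_means(n)
    print("n=%2d  mean of sample means=%.3f  sd of sample means=%.3f"
          % (n, statistics.mean(means), statistics.stdev(means)))
# As n grows, the sample means cluster more tightly around the population mean
# (about 1.0) and their histogram looks increasingly bell-shaped.
```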
Sampling distribution
The distribution of a sample statistic is called its sampling distribution.
OR
The probability distribution of all possible means of samples of a given size is the distribution of the sample means; statisticians call this the sampling distribution of the mean.
Suppose that we draw all possible samples of size n from a given population. Suppose further
that we compute a statistic (e.g., a mean, proportion, standard deviation) for each sample.
The probability distribution of this statistic is called a sampling distribution.
Variability of a Sampling Distribution:
The variability of a sampling distribution is measured by its standard deviation or variance. It depends on three factors:
• N: the number of observations in the population;
• n: the number of observations in the sample;
• the way the random sample is chosen (with or without replacement).
If the population size is much larger than the sample size, the sampling distribution has roughly the same sampling error whether sampling is done with or without replacement. The sampling error is smaller when we sample without replacement and the sample is a significant fraction (say, one tenth) of the population.
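For reference, the relationships that the worked cases below verify can be written compactly; this is a standard summary added for convenience, using the same notation as the rest of the section.

```latex
% Mean and variance of the sampling distribution of the sample mean
\[
E(\bar{X}) = \mu, \qquad
V(\bar{X}) = \frac{\sigma^{2}}{n} \quad \text{(with replacement / infinite population)},
\]
\[
V(\bar{X}) = \frac{\sigma^{2}}{n}\cdot\frac{N-n}{N-1} \quad \text{(without replacement from a finite population of size } N\text{)}.
\]
```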
What is the difference between sampling and population distribution?
A sampling distribution is the distribution of a sample statistic, while the population distribution is the distribution of the variable of interest over all the units of the population we selected for drawing our conclusions, i.e., our area of interest. A sampling distribution is built up from repeated random samples of that population.
SAMPLING DISTRIBUTION OF MEANS
In order to demonstrate the properties of a sampling distribution, let us consider a simple example. Suppose that our population consists of the N = 5 numbers 1, 2, 3, 4, 5. The mean (µ) and the standard deviation (σ) of this population are given by
µ = ΣX/N = (1 + 2 + 3 + 4 + 5)/5 = 3
σ = √[Σ(X - µ)²/N] = √(10/5) = 1.4142
CASE I: SAMPLING DISTRIBUTION WITH REPLACEMENT (N = 5, n = 2)
Suppose that we draw all possible samples of size n = 2 with replacement and then compute the sample mean x̄ for each sample. There are Nⁿ = 5² = 25 samples of size 2 which can be drawn with replacement. These samples are
Sample | Mean | Sample | Mean | Sample | Mean | Sample | Mean
(1,1) | 1 | (2,3) | 2.5 | (3,5) | 4 | (5,2) | 3.5
(1,2) | 1.5 | (2,4) | 3 | (4,1) | 2.5 | (5,3) | 4
(1,3) | 2 | (2,5) | 3.5 | (4,2) | 3 | (5,4) | 4.5
(1,4) | 2.5 | (3,1) | 2 | (4,3) | 3.5 | (5,5) | 5
(1,5) | 3 | (3,2) | 2.5 | (4,4) | 4 |  |
(2,1) | 1.5 | (3,3) | 3 | (4,5) | 4.5 |  |
(2,2) | 2 | (3,4) | 3.5 | (5,1) | 3 |  |
x̄ | Tally | f | Pr(x̄) | x̄·Pr(x̄) | x̄²·Pr(x̄)
1 | I | 1 | 1/25 | 1/25 | 1/25
1.5 | II | 2 | 2/25 | 3/25 | 4.5/25
2 | III | 3 | 3/25 | 6/25 | 12/25
2.5 | IIII | 4 | 4/25 | 10/25 | 25/25
3 | IIII/ | 5 | 5/25 | 15/25 | 45/25
3.5 | IIII | 4 | 4/25 | 14/25 | 49/25
4 | III | 3 | 3/25 | 12/25 | 48/25
4.5 | II | 2 | 2/25 | 9/25 | 40.5/25
5 | I | 1 | 1/25 | 5/25 | 25/25
Σf = 25 | Σ x̄·Pr(x̄) = 75/25 = 3 | Σ x̄²·Pr(x̄) = 250/25 = 10

E(x̄) = Σ x̄·Pr(x̄) = 3
V(x̄) = E(x̄²) - [E(x̄)]² = 10 - (3)² = 1

To prove the functional relationships:
1. E(x̄) = µ
µ = ΣX/N = 15/5 = 3 and E(x̄) = 3, hence 3 = 3.
2. V(x̄) = σ²/n
σ² = Σ(X - µ)²/N = 10/5 = 2, so σ²/n = 2/2 = 1, and V(x̄) = 1, hence 1 = 1, proved.
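Case I can also be checked mechanically. The sketch below is an added verification (not part of the original worked example): it enumerates the 25 samples and confirms E(x̄) = µ and V(x̄) = σ²/n.

```python
from itertools import product
from fractions import Fraction

# Population for Case I and all 25 ordered samples of size 2 drawn with replacement.
population = [1, 2, 3, 4, 5]
samples = list(product(population, repeat=2))          # 5**2 = 25 samples

means = [Fraction(a + b, 2) for a, b in samples]
E_xbar = sum(means) / len(means)
V_xbar = sum(m**2 for m in means) / len(means) - E_xbar**2

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

print(E_xbar == mu)           # True: E(x-bar) = mu = 3
print(V_xbar == sigma2 / 2)   # True: V(x-bar) = sigma^2 / n = 2/2 = 1
```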
CASE II: SAMPLING DISTRIBUTION WITH REPLACEMENT (N = 5, n = 3)
Suppose that we draw all possible samples of size n = 3 with replacement and then compute the sample mean x̄ for each sample. There are Nⁿ = 5³ = 125 samples of size 3 which can be drawn with replacement. These samples are
Sample | Mean | Sample | Mean | Sample | Mean | Sample | Mean | Sample | Mean
1,1,1 | 1 | 2,1,1 | 1.33 | 3,1,1 | 1.67 | 4,1,1 | 2 | 5,1,1 | 2.33
1,1,2 | 1.33 | 2,1,2 | 1.67 | 3,1,2 | 2 | 4,1,2 | 2.33 | 5,1,2 | 2.67
1,1,3 | 1.67 | 2,1,3 | 2 | 3,1,3 | 2.33 | 4,1,3 | 2.67 | 5,1,3 | 3
1,1,4 | 2 | 2,1,4 | 2.33 | 3,1,4 | 2.67 | 4,1,4 | 3 | 5,1,4 | 3.33
1,1,5 | 2.33 | 2,1,5 | 2.67 | 3,1,5 | 3 | 4,1,5 | 3.33 | 5,1,5 | 3.67
1,2,1 | 1.33 | 2,2,1 | 1.67 | 3,2,1 | 2 | 4,2,1 | 2.33 | 5,2,1 | 2.67
1,2,2 | 1.67 | 2,2,2 | 2 | 3,2,2 | 2.33 | 4,2,2 | 2.67 | 5,2,2 | 3
1,2,3 | 2 | 2,2,3 | 2.33 | 3,2,3 | 2.67 | 4,2,3 | 3 | 5,2,3 | 3.33
1,2,4 | 2.33 | 2,2,4 | 2.67 | 3,2,4 | 3 | 4,2,4 | 3.33 | 5,2,4 | 3.67
1,2,5 | 2.67 | 2,2,5 | 3 | 3,2,5 | 3.33 | 4,2,5 | 3.67 | 5,2,5 | 4
1,3,1 | 1.67 | 2,3,1 | 2 | 3,3,1 | 2.33 | 4,3,1 | 2.67 | 5,3,1 | 3
1,3,2 | 2 | 2,3,2 | 2.33 | 3,3,2 | 2.67 | 4,3,2 | 3 | 5,3,2 | 3.33
1,3,3 | 2.33 | 2,3,3 | 2.67 | 3,3,3 | 3 | 4,3,3 | 3.33 | 5,3,3 | 3.67
1,3,4 | 2.67 | 2,3,4 | 3 | 3,3,4 | 3.33 | 4,3,4 | 3.67 | 5,3,4 | 4
1,3,5 | 3 | 2,3,5 | 3.33 | 3,3,5 | 3.67 | 4,3,5 | 4 | 5,3,5 | 4.33
1,4,1 | 2 | 2,4,1 | 2.33 | 3,4,1 | 2.67 | 4,4,1 | 3 | 5,4,1 | 3.33
1,4,2 | 2.33 | 2,4,2 | 2.67 | 3,4,2 | 3 | 4,4,2 | 3.33 | 5,4,2 | 3.67
1,4,3 | 2.67 | 2,4,3 | 3 | 3,4,3 | 3.33 | 4,4,3 | 3.67 | 5,4,3 | 4
1,4,4 | 3 | 2,4,4 | 3.33 | 3,4,4 | 3.67 | 4,4,4 | 4 | 5,4,4 | 4.33
1,4,5 | 3.33 | 2,4,5 | 3.67 | 3,4,5 | 4 | 4,4,5 | 4.33 | 5,4,5 | 4.67
1,5,1 | 2.33 | 2,5,1 | 2.67 | 3,5,1 | 3 | 4,5,1 | 3.33 | 5,5,1 | 3.67
1,5,2 | 2.67 | 2,5,2 | 3 | 3,5,2 | 3.33 | 4,5,2 | 3.67 | 5,5,2 | 4
1,5,3 | 3 | 2,5,3 | 3.33 | 3,5,3 | 3.67 | 4,5,3 | 4 | 5,5,3 | 4.33
1,5,4 | 3.33 | 2,5,4 | 3.67 | 3,5,4 | 4 | 4,5,4 | 4.33 | 5,5,4 | 4.67
1,5,5 | 3.67 | 2,5,5 | 4 | 3,5,5 | 4.33 | 4,5,5 | 4.67 | 5,5,5 | 5
x̄ | f | Pr(x̄) | x̄·Pr(x̄) | x̄²·Pr(x̄)
1 | 1 | 1/125 | 1/125 | 1/125
1.33 | 3 | 3/125 | 4/125 | 5.33/125
1.67 | 6 | 6/125 | 10/125 | 16.67/125
2 | 10 | 10/125 | 20/125 | 40/125
2.33 | 15 | 15/125 | 35/125 | 81.67/125
2.67 | 18 | 18/125 | 48/125 | 128/125
3 | 19 | 19/125 | 57/125 | 171/125
3.33 | 18 | 18/125 | 60/125 | 200/125
3.67 | 15 | 15/125 | 55/125 | 201.67/125
4 | 10 | 10/125 | 40/125 | 160/125
4.33 | 6 | 6/125 | 26/125 | 112.67/125
4.67 | 3 | 3/125 | 14/125 | 65.33/125
5 | 1 | 1/125 | 5/125 | 25/125
Σf = 125 | Σ x̄·Pr(x̄) = 375/125 = 3 | Σ x̄²·Pr(x̄) = 1208.33/125 = 9.67

E(x̄) = Σ x̄·Pr(x̄) = 3
V(x̄) = E(x̄²) - [E(x̄)]² = 9.67 - (3)² = 0.67

To prove the functional relationships:
1. E(x̄) = µ
µ = ΣX/N = 15/5 = 3 and E(x̄) = 3, hence 3 = 3.
2. V(x̄) = σ²/n
σ²/n = 2/3 = 0.67 and V(x̄) = 0.67, hence 0.67 = 0.67, proved.
CASE III: SAMPLING DISTRIBUTION WITHOUT REPLACEMENT (N = 5, n = 2)
Suppose now we draw all possible samples of size 2 from our population 1, 2, 3, 4, 5 without replacement, and compute the sample mean for each sample. There are NCn = 5C2 = 10 such samples:

Sample | Mean (x̄) | Sample | Mean (x̄)
(1,2) | 1.5 | (2,4) | 3
(1,3) | 2 | (2,5) | 3.5
(1,4) | 2.5 | (3,4) | 3.5
(1,5) | 3 | (3,5) | 4
(2,3) | 2.5 | (4,5) | 4.5
x̄ | Tally | f | Pr(x̄) | x̄·Pr(x̄) | x̄²·Pr(x̄)
1.5 | I | 1 | 1/10 | 1.5/10 | 2.25/10
2 | I | 1 | 1/10 | 2/10 | 4/10
2.5 | II | 2 | 2/10 | 5/10 | 12.5/10
3 | II | 2 | 2/10 | 6/10 | 18/10
3.5 | II | 2 | 2/10 | 7/10 | 24.5/10
4 | I | 1 | 1/10 | 4/10 | 16/10
4.5 | I | 1 | 1/10 | 4.5/10 | 20.25/10
Σf = 10 | Σ x̄·Pr(x̄) = 30/10 = 3 | Σ x̄²·Pr(x̄) = 97.5/10 = 9.75

Prove the results:
1. E(x̄) = µ
µ = ΣX/N = 15/5 = 3 and E(x̄) = Σ x̄·Pr(x̄) = 3, so 3 = 3.
2. V(x̄) = (σ²/n)·(N - n)/(N - 1)
V(x̄) = E(x̄²) - [E(x̄)]² = 9.75 - (3)² = 0.75
σ² = Σ(X - µ)²/N = [(1-3)² + (2-3)² + (3-3)² + (4-3)² + (5-3)²]/5 = 10/5 = 2
(σ²/n)·(N - n)/(N - 1) = (2/2)·(3/4) = 0.75
0.75 = 0.75, hence proved.
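As with Case I, this case can be verified by brute force; the short check below is an added illustration showing the finite-population correction factor at work.

```python
from itertools import combinations
from fractions import Fraction

# Case III: samples of size 2 drawn without replacement from {1, 2, 3, 4, 5}.
population = [1, 2, 3, 4, 5]
N, n = len(population), 2
samples = list(combinations(population, n))            # C(5, 2) = 10 samples

means = [Fraction(sum(s), n) for s in samples]
E_xbar = sum(means) / len(means)
V_xbar = sum(m**2 for m in means) / len(means) - E_xbar**2

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N

print(E_xbar == mu)                                     # True: E(x-bar) = 3
print(V_xbar == (sigma2 / n) * Fraction(N - n, N - 1))  # True: (sigma^2/n)*(N-n)/(N-1) = 0.75
```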
CASE IV: SAMPLING DISTRIBUTION WITHOUT REPLACEMENT (N = 4, n = 3)
A population consists of the four numbers 2, 4, 6, 8. All possible samples of size n = 3 which can be drawn without replacement from this population (NCn = 4C3 = 4 samples) are given below.

Sample | Mean
(2,4,6) | 4
(2,4,8) | 4.67
(2,6,8) | 5.33
(4,6,8) | 6
x̄ | Tally | f | Pr(x̄) | x̄·Pr(x̄) | x̄²·Pr(x̄)
4 | I | 1 | 1/4 | 4/4 | 16/4
4.67 | I | 1 | 1/4 | 4.67/4 | 21.80/4
5.33 | I | 1 | 1/4 | 5.33/4 | 28.40/4
6 | I | 1 | 1/4 | 6/4 | 36/4
Σf = 4 | Σ x̄·Pr(x̄) = 20/4 = 5 | Σ x̄²·Pr(x̄) = 102.20/4 = 25.55
To prove the results:
1. E(x̄) = µ
E(x̄) = Σ x̄·Pr(x̄) = 5 and µ = ΣX/N = (2 + 4 + 6 + 8)/4 = 5, hence 5 = 5.
2. V(x̄) = (σ²/n)·(N - n)/(N - 1)
V(x̄) = E(x̄²) - [E(x̄)]² = 25.55 - 25 = 0.55
σ² = Σ(X - µ)²/N = [(2-5)² + (4-5)² + (6-5)² + (8-5)²]/4 = 20/4 = 5
(σ²/n)·(N - n)/(N - 1) = (5/3)·(1/3) = 5/9 ≈ 0.55
Hence proved.
CASE V: SAMPLING DISTRIBUTION OF DIFFERENCE BETWEEN
MEANS
Suppose we have two infinite populations I and II with means µ1 and µ2 and standard deviations σ1 and σ2 respectively. Let x̄1 be the mean of a sample of size n1 from population I and x̄2 be the mean of a sample of size n2 from population II, drawn independently of the first sample. The means of all samples of size n1 from population I yield a sampling distribution of x̄1 with mean µ1 and standard deviation σ1/√n1; similarly, the means of all samples of size n2 from population II yield a sampling distribution of x̄2 with mean µ2 and standard deviation σ2/√n2.
From all combinations of these samples from the two populations we can obtain a distribution of the differences of means, x̄1 - x̄2, which is called the sampling distribution of differences of means. The mean and variance of this sampling distribution are given by
µx̄1-x̄2 = µ1 - µ2
(σx̄1-x̄2)² = V(x̄1 - x̄2) = V(x̄1) + V(x̄2) = σ1²/n1 + σ2²/n2
Suppose that population I consists of the 3 numbers (4, 6, 8) and population II consists of the 3 numbers (1, 2, 3).
For populations I and II: N1 = 3, n1 = 2, N2 = 3, n2 = 2. The possible samples (drawn with replacement) and their means are:
Sample from I | x̄1 | Sample from II | x̄2
(4,4) | 4 | (1,1) | 1
(4,6) | 5 | (1,2) | 1.5
(4,8) | 6 | (1,3) | 2
(6,4) | 5 | (2,1) | 1.5
(6,6) | 6 | (2,2) | 2
(6,8) | 7 | (2,3) | 2.5
(8,4) | 6 | (3,1) | 2
(8,6) | 7 | (3,2) | 2.5
(8,8) | 8 | (3,3) | 3
Difference table (x̄1 - x̄2):
x̄1 \ x̄2 | 1 | 1.5 | 2 | 1.5 | 2 | 2.5 | 2 | 2.5 | 3
4 | 3 | 2.5 | 2 | 2.5 | 2 | 1.5 | 2 | 1.5 | 1
5 | 4 | 3.5 | 3 | 3.5 | 3 | 2.5 | 3 | 2.5 | 2
6 | 5 | 4.5 | 4 | 4.5 | 4 | 3.5 | 4 | 3.5 | 3
5 | 4 | 3.5 | 3 | 3.5 | 3 | 2.5 | 3 | 2.5 | 2
6 | 5 | 4.5 | 4 | 4.5 | 4 | 3.5 | 4 | 3.5 | 3
7 | 6 | 5.5 | 5 | 5.5 | 5 | 4.5 | 5 | 4.5 | 4
6 | 5 | 4.5 | 4 | 4.5 | 4 | 3.5 | 4 | 3.5 | 3
7 | 6 | 5.5 | 5 | 5.5 | 5 | 4.5 | 5 | 4.5 | 4
8 | 7 | 6.5 | 6 | 6.5 | 6 | 5.5 | 6 | 5.5 | 5
d = x̄1 - x̄2 | f | Pr(d) | d·Pr(d) | d²·Pr(d)
1 | 1 | 1/81 | 1/81 | 1/81
1.5 | 2 | 2/81 | 3/81 | 4.5/81
2 | 5 | 5/81 | 10/81 | 20/81
2.5 | 6 | 6/81 | 15/81 | 37.5/81
3 | 10 | 10/81 | 30/81 | 90/81
3.5 | 10 | 10/81 | 35/81 | 122.5/81
4 | 13 | 13/81 | 52/81 | 208/81
4.5 | 10 | 10/81 | 45/81 | 202.5/81
5 | 10 | 10/81 | 50/81 | 250/81
5.5 | 6 | 6/81 | 33/81 | 181.5/81
6 | 5 | 5/81 | 30/81 | 180/81
6.5 | 2 | 2/81 | 13/81 | 84.5/81
7 | 1 | 1/81 | 7/81 | 49/81
Σf = 81 | E(d) = Σ d·Pr(d) = 324/81 = 4 | E(d²) = Σ d²·Pr(d) = 1431/81 = 17.67
Prove the results:
1. E(d) = µ1 - µ2
E(d) = 4
µ1 = (4 + 6 + 8)/3 = 6 and µ2 = (1 + 2 + 3)/3 = 2, so µ1 - µ2 = 4
Hence proved: 4 = 4.
2. V(d) = σ1²/n1 + σ2²/n2
σ1² = [(4-6)² + (6-6)² + (8-6)²]/3 = 8/3
σ2² = [(1-2)² + (2-2)² + (3-2)²]/3 = 2/3
σ1²/n1 + σ2²/n2 = (8/3)·(1/2) + (2/3)·(1/2) = 5/3 = 1.67
V(d) = E(d²) - [E(d)]² = 17.67 - 16 = 1.67
Hence proved: 1.67 = 1.67.
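The whole construction can be replayed in a few lines; the enumeration below is an added check (not part of the original example) that the 81 differences have mean µ1 - µ2 and variance σ1²/n1 + σ2²/n2.

```python
from itertools import product
from fractions import Fraction

pop1, pop2 = [4, 6, 8], [1, 2, 3]      # populations I and II from the example
n1 = n2 = 2

def mean(sample):
    return Fraction(sum(sample), len(sample))

means1 = [mean(s) for s in product(pop1, repeat=n1)]   # 9 sample means from population I
means2 = [mean(s) for s in product(pop2, repeat=n2)]   # 9 sample means from population II
diffs = [m1 - m2 for m1 in means1 for m2 in means2]    # 81 differences x1-bar - x2-bar

E_d = sum(diffs) / len(diffs)
V_d = sum(d**2 for d in diffs) / len(diffs) - E_d**2

mu1, mu2 = mean(pop1), mean(pop2)                      # 6 and 2
var1 = sum((x - mu1) ** 2 for x in pop1) / 3           # 8/3
var2 = sum((x - mu2) ** 2 for x in pop2) / 3           # 2/3

print(E_d == mu1 - mu2)                # True: E(d) = 6 - 2 = 4
print(V_d == var1 / n1 + var2 / n2)    # True: V(d) = 4/3 + 1/3 = 5/3
```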
SAMPLE PROPORTION
A proportion is a certain fraction of the total that possesses an attribute of interest. Let us take an example to understand the concept of a proportion.
Suppose a student guesses at the answer on every question in a 300-question exam. If he gets
60 questions correct, then his proportion of correct guesses is 60/300=.20. If he gets 75
questions correct, then his proportion of correct guesses is 75/300=.25. The proportion of
correct guesses is simply the number of correct guesses divided by the total number of
questions.
Now, let X denote the number of successes out of a sample of n observations. If each
observation is a success with probability p independently of the other observations, then X is
a binomial random variable with parameters n and p. Furthermore, the proportion of successes in the sample is also a random variable and is computed as p = X/n.
Sampling distribution of proportions:
Consider an experiment that results in a success on each trial with probability p or a failure with probability q = 1 - p. To obtain a sample of size n we perform n trials of the experiment, so we are sampling from an infinite population. For example, the population may be all possible tosses of a fair coin, in which the probability of getting a head (success) is p = 1/2. For the number of successes, the mean would be µ = np and the standard deviation σ = √(npq).
Question 1: A population consists of five members. The marital status of each member is given below, where M and S stand for married and single respectively.

Member | 1 | 2 | 3 | 4 | 5
Marital status | M | S | M | S | S
a) Determine the proportion of married members in the population
b) Select all possible samples of two members from this population (i) with replacement, (ii)
without replacement and compute the proportion of married members in each sample.
Solution:
a) Since there are 2 married members in a population of N = 5, the proportion of married members is P = 2/5 = 0.4, or 40%.
b) There are Nⁿ = 5² = 25 possible samples of size n = 2 which can be drawn with replacement from the population. These samples are given below.
Sample | p | Sample | p | Sample | p | Sample | p
(1,1) | 1 | (2,4) | 0 | (4,2) | 0 | (5,5) | 0
(1,2) | 0.5 | (2,5) | 0 | (4,3) | 0.5 |  |
(1,3) | 1 | (3,1) | 1 | (4,4) | 0 |  |
(1,4) | 0.5 | (3,2) | 0.5 | (4,5) | 0 |  |
(1,5) | 0.5 | (3,3) | 1 | (5,1) | 0.5 |  |
(2,1) | 0.5 | (3,4) | 0.5 | (5,2) | 0 |  |
(2,2) | 0 | (3,5) | 0.5 | (5,3) | 0.5 |  |
(2,3) | 0.5 | (4,1) | 0.5 | (5,4) | 0 |  |
p | Tally | f | Pr(p) | p·Pr(p) | p² | p²·Pr(p)
0 | IIII/ IIII | 9 | 9/25 | 0 | 0 | 0
0.5 | IIII/ IIII/ II | 12 | 12/25 | 6/25 | 0.25 | 3/25
1 | IIII | 4 | 4/25 | 4/25 | 1 | 4/25
Σf = 25 | E(p) = Σ p·Pr(p) = 10/25 = 0.4 | E(p²) = Σ p²·Pr(p) = 7/25 = 0.28
To prove the results:
1) E(p) = P
P = 0.4 and E(p) = 0.4, so 0.4 = 0.4.
2) V(p) = pq/n
V(p) = E(p²) - [E(p)]² = 0.28 - (0.4)² = 0.12
pq/n = (0.4)(1 - 0.4)/2 = 0.12
0.12 = 0.12, hence proved.
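The same verification can be done by enumeration; the sketch below is an added check of part b) with replacement, confirming E(p) = P and V(p) = pq/n.

```python
from itertools import product
from fractions import Fraction

# Marital status of the five members: M = married, S = single.
population = ["M", "S", "M", "S", "S"]
n = 2
P = Fraction(population.count("M"), len(population))   # true proportion P = 2/5

# All 25 samples of size 2 drawn with replacement, and the proportion married in each.
props = [Fraction(s.count("M"), n) for s in product(population, repeat=n)]

E_p = sum(props) / len(props)
V_p = sum(p**2 for p in props) / len(props) - E_p**2

print(E_p == P)                 # True: E(p) = P = 0.4
print(V_p == P * (1 - P) / n)   # True: V(p) = pq/n = 0.12
```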
In the case of sampling without replacement, there are NCn = 5C2 = 10 samples of size n = 2:
Sample | Proportion | Sample | Proportion
(1,2) | 0.5 | (2,4) | 0
(1,3) | 1 | (2,5) | 0
(1,4) | 0.5 | (3,4) | 0.5
(1,5) | 0.5 | (3,5) | 0.5
(2,3) | 0.5 | (4,5) | 0
p | Tally | f | Pr(p) | p·Pr(p) | p² | p²·Pr(p)
0 | III | 3 | 3/10 | 0 | 0 | 0
0.5 | IIII/ I | 6 | 6/10 | 3/10 | 0.25 | 1.5/10
1 | I | 1 | 1/10 | 1/10 | 1 | 1/10
Σf = 10 | E(p) = Σ p·Pr(p) = 4/10 = 0.4 | E(p²) = Σ p²·Pr(p) = 2.5/10 = 0.25
Prove the results:
1) E(p) = P
E(p) = 0.4 and P = 0.4, so 0.4 = 0.4.
2) σp = √[(pq/n)·(N - n)/(N - 1)]
V(p) = E(p²) - [E(p)]² = 0.25 - (0.4)² = 0.09, so σp = √0.09 = 0.3
√[(pq/n)·(N - n)/(N - 1)] = √[((0.4)(0.6)/2)·(3/4)] = √0.09 = 0.3
0.3 = 0.3, hence proved.
SAMPLING DISTRIBUTION OF DIFFERENCES OF PROPORTIONS
Consider independent samples of sizes n1 and n2 drawn at random from two binomial populations with parameters p1, q1 and p2, q2 respectively. We denote the proportion of successes in each sample by P1 and P2. From all combinations of these samples from the two populations we can obtain the sampling distribution of the differences P1 - P2, which is called the sampling distribution of differences of proportions. Its mean and standard deviation are given below.
Mean: µP1-P2 = p1 - p2
Standard deviation: σP1-P2 = √(p1q1/n1 + p2q2/n2)
Question 1: Let P1 denote the proportion of odd numbers in a random sample of size n1 = 2 drawn with replacement from a finite population of size N1 = 3: (3, 6, 9). Similarly, let P2 denote the proportion of odd numbers in a random sample of size n2 = 3 drawn with replacement from a finite population of size N2 = 2: (6, 7). Form the sampling distribution of P1 - P2, find its mean and variance, and verify the results.
Solution:
In population 1, N1 = 3 and n1 = 2, so there are 3² = 9 possible samples which can be drawn with replacement from this population. These samples are
Sample | Proportion | Sample | Proportion
(3,3) | 1 | (6,6) | 0
(3,6) | 1/2 | (6,9) | 1/2
(3,9) | 1 | (9,3) | 1
(6,3) | 1/2 | (9,6) | 1/2
(9,9) | 1 |  |
In population 2, N2 = 2 and n2 = 3, so there are 2³ = 8 possible samples which can be drawn with replacement from this population. These samples are

Sample | Proportion | Sample | Proportion
(6,6,6) | 0 | (7,6,6) | 1/3
(6,6,7) | 1/3 | (7,6,7) | 2/3
(6,7,6) | 1/3 | (7,7,6) | 2/3
(6,7,7) | 2/3 | (7,7,7) | 1
Difference table (P1 - P2):
P1 \ P2 | 0 | 1/3 | 1/3 | 1/3 | 2/3 | 2/3 | 2/3 | 1
0 | 0 | -1/3 | -1/3 | -1/3 | -2/3 | -2/3 | -2/3 | -1
1/2 | 1/2 | 1/6 | 1/6 | 1/6 | -1/6 | -1/6 | -1/6 | -1/2
1/2 | 1/2 | 1/6 | 1/6 | 1/6 | -1/6 | -1/6 | -1/6 | -1/2
1/2 | 1/2 | 1/6 | 1/6 | 1/6 | -1/6 | -1/6 | -1/6 | -1/2
1/2 | 1/2 | 1/6 | 1/6 | 1/6 | -1/6 | -1/6 | -1/6 | -1/2
1 | 1 | 2/3 | 2/3 | 2/3 | 1/3 | 1/3 | 1/3 | 0
1 | 1 | 2/3 | 2/3 | 2/3 | 1/3 | 1/3 | 1/3 | 0
1 | 1 | 2/3 | 2/3 | 2/3 | 1/3 | 1/3 | 1/3 | 0
1 | 1 | 2/3 | 2/3 | 2/3 | 1/3 | 1/3 | 1/3 | 0

P1 - P2 | f | f(P1 - P2) | (P1 - P2)·f(P1 - P2) | (P1 - P2)² | (P1 - P2)²·f(P1 - P2)
-1 | 1 | 1/72 | -1/72 | 1 | 1/72
-2/3 | 3 | 3/72 | -2/72 | 4/9 | 4/216
-1/2 | 4 | 4/72 | -2/72 | 1/4 | 1/72
-1/3 | 3 | 3/72 | -1/72 | 1/9 | 1/216
-1/6 | 12 | 12/72 | -2/72 | 1/36 | 1/216
0 | 5 | 5/72 | 0 | 0 | 0
1/6 | 12 | 12/72 | 2/72 | 1/36 | 1/216
1/3 | 12 | 12/72 | 4/72 | 1/9 | 4/216
1/2 | 4 | 4/72 | 2/72 | 1/4 | 1/72
2/3 | 12 | 12/72 | 8/72 | 4/9 | 16/216
1 | 4 | 4/72 | 4/72 | 1 | 4/72
Σf = 72 | Σ (P1 - P2)·f(P1 - P2) = 12/72 | Σ (P1 - P2)²·f(P1 - P2) = 48/216
Prove the results:
Mean:
µP1-P2 = Σ (P1 - P2)·f(P1 - P2) = 12/72 = 1/6
Variance:
(σP1-P2)² = Σ (P1 - P2)²·f(P1 - P2) - (µP1-P2)² = 48/216 - 1/36 = 7/36
The proportions of odd numbers in populations 1 and 2 are p1 = 2/3 and p2 = 1/2 respectively.
1) µP1-P2 = p1 - p2 = 2/3 - 1/2 = 1/6
2) (σP1-P2)² = p1(1 - p1)/n1 + p2(1 - p2)/n2 = (2/3)(1/3)/2 + (1/2)(1/2)/3 = 1/9 + 1/12 = 7/36
which agrees with the results obtained above.
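The agreement can also be confirmed by enumerating every pair of samples; the check below is an added illustration, not part of the original solution.

```python
from itertools import product
from fractions import Fraction

pop1, n1 = [3, 6, 9], 2        # proportion of odd numbers: p1 = 2/3
pop2, n2 = [6, 7], 3           # proportion of odd numbers: p2 = 1/2

def odd_proportion(sample):
    return Fraction(sum(x % 2 for x in sample), len(sample))

P1 = [odd_proportion(s) for s in product(pop1, repeat=n1)]   # 9 samples
P2 = [odd_proportion(s) for s in product(pop2, repeat=n2)]   # 8 samples
diffs = [a - b for a in P1 for b in P2]                      # 72 differences P1 - P2

E_d = sum(diffs) / len(diffs)
V_d = sum(d**2 for d in diffs) / len(diffs) - E_d**2

p1, p2 = Fraction(2, 3), Fraction(1, 2)
print(E_d == p1 - p2)                                        # True: 2/3 - 1/2 = 1/6
print(V_d == p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)        # True: 1/9 + 1/12 = 7/36
```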
ASSESSMENT QUESTIONS
Question 1: A population consists of 4 numbers: 3, 7, 11, 15. Considering all possible samples of size n = 2 which can be drawn from this population, find i) the population mean, ii) the population standard deviation, iii) the mean of the sampling distribution of means, and iv) the standard deviation of the sampling distribution of means. Verify iii) and iv) directly from i) and ii) by a suitable formula.
1. Compute µx̄, (σx̄)² and σx̄ directly, without forming the frequency/sampling distribution of means, if the sampling is without replacement, and thus verify the results.
Question 2: Random samples of size 2 are selected from the finite population consisting
of the numbers 3, 5, 7, 9, 11, 13.
a) Find the mean and standard deviation of this population.
b) List the 15 possible random samples (n = 2) that can be selected from this population and
calculate their means.
c) Use the results of part (b) to construct the sampling distribution of the means of these
samples.
d) Calculate the mean µ and variance σ² of the probability distribution in part (c) and compare them with the results obtained in part (a).
Question 3: A city currently does not have a National Football League team. Fifty-four
percent of all the city’s residents are in favor of attracting an NFL team. A random sample of
1000 of the city’s residents is selected, and asked if they would want an NFL team.
A. What is the probability the percentage of those residents polled who are in favor of
attracting an NFL team is less than 50%?
B. What is the probability the percentage of those residents polled who are in favor of
attracting an NFL team is more than 3% from the actual percentage of 54%?
Question4: Let P1 denote the proportion of even numbers in a random sample of size n1=2
without replacement from a population of size N1=3 consisting of value 4, 6, 9 similarly let
P2 denotes the proportion of even numbers in a random sample of size n2=2 without
replacement from a population size N2=3 consisting of values 2, 3, 5. Find the mean and the
variance of the differences of two proportions and verify the results.
1. µP1-P2 = Σ (P1 - P2) f(P1 - P2)
2. (σP1-P2)² = p1(1 - p1)/n1 + p2(1 - p2)/n2
Question5: A population consists of 6 values 1, 3, 5, 7, 9, 11.Take all the possible samples
of size 2 which can be drawn i) with replacement ii) without replacement from this
population. Find the sample means and form a sampling distribution of the mean in case of
with replacement. Compute mean and variance directly in case of without replacement. Find
the means and variances and verify the results
i) E(x̄) = µ, V(x̄) = σ²/n, and for ii) E(x̄) = µ, V(x̄) = (σ²/n)·(N - n)/(N - 1).
Question 6: The weights of 1,000 students of a college are normally distributed with a mean of 68.5 kg and a standard deviation of 2.7 kg. If 200 random samples of 25 students each are obtained from this population, find the expected mean and standard deviation of the sampling distribution of means if sampling is done i) with replacement, ii) without replacement, and verify the respective results.
Question 7: Draw all possible random samples of size n1 = 2 with replacement from a population 3, 4, 5; similarly, draw all possible random samples of size n2 = 2 with replacement from another finite population 1, 2, 3. a) Find the sample means x̄1 and x̄2 and the possible differences between the sample means of the two populations. b) Form the sampling distribution of x̄1 - x̄2, compute its mean and variance, and verify the results for the difference between means.
Question 8: A population consists of 7 numbers 1, 1, 2, 3, 4, 5, 6. Draw all possible samples of size n = 3 without replacement from this population and find the sample proportion of odd numbers in each sample. Construct the sampling distribution of proportions and verify the respective results:
1) E(p) = P
2) σp = √[(pq/n)·(N - n)/(N - 1)]
Objectives
1. When each member of a population has an equally likely chance of being selected, this is
called:
a) A nonrandom sampling method
b) A quota sample
c) A snowball sample
d) An equal probability selection method
2. Which of the following techniques yields a simple random sample?
a) Choosing volunteers from an introductory psychology class to participate
b) Listing the individuals by ethnic group and choosing a proportion from within each
ethnic group at random.
c) Numbering all the elements of a sampling frame and then using a random number
table to pick cases from the table.
d) Randomly selecting schools, and then sampling everyone within the school.
3. Which of the following is not true about stratified random sampling?
a) It involves a random selection process from identified subgroups
b) Proportions of groups in the sample must always match their population proportions
c) Disproportional stratified random sampling is especially helpful for getting large
enough subgroup samples when subgroup comparisons are to be done
d) Proportional stratified random sampling yields a representative sample
4. Which of the following statements are true?
a) The larger the sample size, the greater the sampling error
b) The more categories or breakdowns you want to make in your data analysis, the
larger the sample needed
c) The fewer categories or breakdowns you want to make in your data analysis, the
larger the sample needed
d) As sample size decreases, so does the size of the confidence interval
5. Which of the following formulae is used to determine how many people to include in the
original sampling?
a) Desired sample size / (Desired sample size + 1)
b) Proportion likely to respond / Desired sample size
c) Proportion likely to respond / Population size
d) Desired sample size / Proportion likely to respond
6. Which of the following sampling techniques is an equal probability selection method (i.e.,
EPSEM) in which every individual in the population has an equal chance of being selected?
a) Simple random sampling
b) Systematic sampling
c) Proportional stratified sampling
d) Cluster sampling using the PPS technique
e) All of the above are EPSEM
7. Which of the following is not a form of nonrandom sampling?
a) Snowball sampling
b) Convenience sampling
c) Quota sampling
d) Purposive sampling
e) They are all forms of nonrandom sampling
8. Which of the following will give a more “accurate” representation of the population from
which a sample has been taken?
a) A large sample based on the convenience sampling technique
b) A small sample based on simple random sampling
c) A large sample based on simple random sampling
d) A small cluster sample
9. Sampling in qualitative research is similar to which type of sampling in quantitative
research?
a) Simple random sampling
b) Systematic sampling
c) Quota sampling
d) Purposive sampling
10. Which of the following would generally require the largest sample size?
a) Cluster sampling
b) Simple random sampling
c) Systematic sampling
d) Proportional stratified sampling
Answers:
1. D   2. C   3. B   4. B   5. D   6. E   7. E   8. C   9. D   10. A
1) Choose the pair of symbols that completes the sentence: __________ is a parameter, whereas __________ is a statistic.
a) N, µ
b) n, s
c)
d)
2) In which of the following situations would σx̄ = σ/√n be the correct formula to use for computing σx̄?
a) Sampling from infinite population without replacement
b) Sampling from infinite or finite population without replacement
c) Sampling from finite population without replacement
d) Both b and c but not a
3) Suppose that a population has standard deviation 5. What is the standard deviation of the sampling distribution of the mean for samples of size n = 25?
a) 5
b) 25
c) 1
d) 0.2
4) A border patrol checkpoint that stops every 10th passenger van is using
a) Stratified sampling
b) Cluster sampling
c) Systematic sampling
d) Sequential sampling
5) Standard error of the mean is the standard deviation of the
a) Population
b) Statistic
c) Sample
d) Sampling distribution of means
6) If samples of size n are drawn without replacement from a population of size N with mean µ and variance σ², the standard error of the sample mean would be
a) σ/√n
b) σ²/n
c) (σ/√n)·√((N - n)/(N - 1))
d) (σ²/n)·((N - n)/(N - 1))
7) In sampling without replacement
a) n<N
b) n>N
c) n≤N
d) n≥N
8) The standard error increases when the sample size is
a) Increased
b) Decreased
c) small
d) Large
9) The number of possible samples drawn with replacement, as compared to without replacement, would be
a) more
b) Less
c) Equal
d) None
10) The difference between a statistic and a parameter is called
a) Sampling distribution
b) Sampling error
c) Systematic error
d) Non sampling error
11) Which of the following statements best describes the relationship between a parameter
and a statistic?
a) A parameter has a sampling distribution with the statistic as its mean.
b) A parameter has a sampling distribution that can be used to determine what values the statistic is likely to have in repeated samples.
c) A parameter is used to estimate a statistic.
d) A statistic is used to estimate a parameter.
12) The sampling distribution of x̄ is the
a) probability distribution of the sample mean
b) probability distribution of the sample proportion
c) mean of the sample
d) mean of the population
13) A simple random sample of 100 observations was taken from a large population. The
sample mean and the standard deviation were determined to be 80 and 12 respectively. The
standard error of the mean is
a. 1.20
b. 0.12
c. 8.00
d. 0.80
14) The probability distribution of all possible values of the sample proportion is the
A. probability density function of p̂
B. sampling distribution of x̄
C. same as p̂, since it considers all possible values of the sample proportion
D. sampling distribution of p̂
15) Since the sample size is always smaller than the size of the population, the sample mean
A. must always be smaller than the population mean
B. must be larger than the population mean
C. must be equal to the population mean
D. can be smaller, larger, or equal to the population mean
16) The standard deviation of all possible x̄ values is called the
A. standard error of proportion
B. standard error of the mean
C. mean deviation
D. central variation
17) As the sample size becomes larger, the sampling distribution of the sample mean
approaches a
A. binomial distribution
B. Poisson distribution
C. normal distribution
D. chi-square distribution
STATISTICAL INFERENCE: HYPOTHESIS TESTING
Statistical inference:
Statistical inference is the process of drawing conclusions from data
that are subject to random variation, for example, observational
errors or sampling variation.
It is concerned with making predictions or inferences about a population from observations
and analyses of a sample. This means that by using the results of a sample drawn from a
particular population, we can draw conclusions about the whole population and its
characteristics. Keep in mind that the sample should be large enough to be a representative
part of the population.
Hypothesis:
A statistical hypothesis is a claim (assertion, statement, belief or assumption) about an unknown
population parameter value.
For example, an investment company may claim that the average return across all its investments is
20 percent. To test such claims, sample data are collected and analyzed. On the basis of the
sample findings, the hypothesized value of the population parameter is either rejected or not rejected.
STEPS IN HYPOTHESIS TESTING
Specification of hypotheses:

Null Hypothesis
An assumption to be tested for possible rejection is called the null hypothesis and is denoted
by H0. The rejection of the null hypothesis leads to the acceptance of an alternative
hypothesis.

Alternative Hypothesis
Any hypothesis that is different from the null hypothesis and is set up in parallel to the null
hypothesis is called an alternative hypothesis and is denoted by H1.
Types of Hypothesis
Directional:
Directional hypotheses are those where one can predict the direction (the effect of one variable
on the other as 'positive' or 'negative').
For example: girls perform better than boys ('better than' shows the direction predicted).
One tail test:
A one-tailed test looks for either an increase or a decrease in the parameter. In a one-tailed
test, the critical region has just one part (one tail of the distribution). If our sample value lies
in this region, we reject the null hypothesis in favor of the alternative.
Suppose we are looking for a definite decrease. Then the critical region will be in the left tail.
Note, however, that in this one-tailed test the value of the parameter under the null hypothesis
can be as high as you like.
Non - Directional
Non-directional hypotheses are those where one does not predict the kind of effect but can
state that a relationship exists between variable 1 and variable 2.
For example: there will be a difference in the performance of girls and boys (without defining
what kind of difference).
Two tail test:
A two-tailed test looks for any change in the parameter (either an increase or a decrease). A
two-tailed t-test splits the significance level in half, placing half in each tail. The
null hypothesis in this case is a particular value, and there are two alternative possibilities,
one positive and one negative. The critical value of t, t crit, is written with both a plus and a
minus sign (±). For example, the critical value of t when there are ten degrees of freedom
(df = 10) and α is set to 0.05 is t crit = ±2.228.
Level of significance:
The significance level of a statistical hypothesis test is a fixed probability of wrongly
rejecting the null hypothesis H0 when it is in fact true. It is the probability of a type I error
(explained below) and is set by the investigator in relation to the consequences of such an
error. That is, we want to make the significance level as small as possible in order to protect
the null hypothesis and to prevent, as far as possible, the investigator from inadvertently
making false claims.

The significance level is usually denoted by α: Significance Level = P(type I error) = α.
Usually, the significance level is chosen to be 0.05 (or equivalently, 5%).
Type 1 error:
In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it
is in fact true; that is, H0 is wrongly rejected.
For example, in a clinical trial of a new drug, the null hypothesis might be that the
new drug is no better, on average, than the current drug; i.e.
H0: There is no difference between the two drugs on average.
A type I error would occur if we concluded that the two drugs produced different
effects when in fact there was no difference between them.
Type II Error:
A type II error occurs when the null hypothesis H0 is not rejected when it is in fact false.
In the previous example, a type II error would occur if it was concluded that the two drugs
produced the same effect, i.e. that there is no difference between the two drugs on average,
when in fact they produced different effects. A type II error is frequently due to sample sizes
being too small.

The probability of a type II error is generally unknown, but it is symbolized by β and written
P(type II error) = β.
              H0 is true                                H0 is false
Accept H0     Correct decision with confidence (1−α)    Type II error (β)
Reject H0     Type I error (α)                          Correct decision with confidence (1−β)

Table showing Type I and Type II errors
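The meaning of α in this table can also be illustrated with a short simulation. The following Python sketch (the population mean, sample size, and random seed are illustrative assumptions, not values from this text) repeatedly tests a true null hypothesis and records how often it is wrongly rejected; the observed rate should be close to the chosen significance level.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, trials = 0.05, 30, 10_000
rejections = 0
for _ in range(trials):
    sample = rng.normal(loc=50, scale=10, size=n)   # H0 (mu = 50) is actually true here
    t_stat, p_value = stats.ttest_1samp(sample, popmean=50)
    if p_value < alpha:                             # rejecting a true H0 is a Type I error
        rejections += 1
print("Observed Type I error rate:", rejections / trials)   # close to 0.05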
Which Error Is More Dangerous?
Both errors can be dangerous, depending on the situation faced. For example, if the null
hypothesis in court is that the accused is innocent, then convicting an innocent person (a
Type I error) may be considered more dangerous than letting a guilty person go free (a
Type II error). Similarly, other situations can be faced where a Type II error would be the
more dangerous one.
Test statistics:
A test statistic is a quantity calculated from our sample of data. Its value is used to decide
whether or not the null hypothesis should be rejected in our hypothesis test.
Case      Parameter(s) and conditions                         Test statistic
Case 1:   one mean µ, σ (or σ²) known                         z = (x̄ − µ) / (σ/√n)
Case 2:   two means µ1, µ2, σ known                           z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(σ1²/n1 + σ2²/n2)
Case 3:   one mean µ, σ unknown, n > 30                       z = (x̄ − µ) / (s/√n)
Case 4:   two means µ1, µ2, σ unknown, n > 30                 z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(s1²/n1 + s2²/n2)
Case 5:   one mean µ, σ unknown, n < 30                       t = (x̄ − µ) / (s/√n), with n − 1 d.f.
Case 6:   two paired means, σ unknown, n < 30                 t = (d̄ − µd) / (Sd/√n), with n − 1 d.f.
Case 7:   one proportion p                                    z = (p̂ − p) / √(pq/n)
Case 8:   two proportions p1, p2                              z = (p̂1 − p̂2) / √(p̄(1 − p̄)(1/n1 + 1/n2))
Case 9:   one variance σ²                                     χ² = (n − 1)s² / σ², with n − 1 d.f.
Case 10:  two variances σ1², σ2²                              F = s1² / s2²
Case 11:  regression intercept α                              t = (a − α) / S.E.(a), with n − 2 d.f.
Case 12:  regression slope β                                  t = (b − β) / S.E.(b), with n − 2 d.f.
Calculation:
Calculate the standard error of the sample statistic. Use the standard error to convert the
observed value of the sample statistic to a standardized value.
Critical region:
The critical region CR, or rejection region RR, is a set of values of the test statistic for which
the null hypothesis is rejected in a hypothesis test. That is, the sample space for the test
statistic is partitioned into two regions; one region (the critical region) will lead us to reject
the null hypothesis H0, the other will not. So, if the observed value of the test statistic is a
member of the critical region, we conclude "Reject H0"; if it is not a member of the critical
region then we conclude "Do not reject H0".
Conclusion:
The final conclusion is made by comparing the test statistic (which is a summary of the
information observed in the sample) to the decision rule. The final conclusion will be either
to reject the null hypothesis (because the sample data are very unlikely if the null hypothesis
is true) or not to reject the null hypothesis (because the sample data are not very unlikely).
SPECIAL NOTE
Acceptance of hypothesis:
In real-life cases, when we reject the null hypothesis, it does not mean that we are accepting
the alternate hypothesis. That is because a hypothesis test does not determine which
hypothesis is true, or even which is most likely; it only assesses whether the available
evidence is sufficient to reject the null hypothesis.
Conclusion:
When we are testing a hypothesis, we are not actually acting on the situation; we are only
working on paper. So we cannot say that we are making a decision on the basis of the null and
alternate hypotheses; we say instead that we are drawing a conclusion. As discussed above,
when we reject the null hypothesis, it does not mean that we are accepting the alternate
hypothesis: we conclude that we reject the null, but we do not decide that we accept the
alternate.
Understanding through example:
Look at it in terms of "innocent until proven guilty" in a courtroom: As the person analyzing
data, you are the judge. The hypothesis test is the trial, and the null hypothesis is the
defendant. The alternative hypothesis is like the prosecution, which needs to make its
case beyond a reasonable doubt (say, with 95% certainty).
If the evidence presented doesn't prove the defendant is guilty beyond a reasonable doubt,
you still have not proved that the defendant is innocent. But based on the evidence, you can't
reject that possibility.
So how would that verdict be announced? It enters the court record as "Not guilty". That
phrase is perfect: "Not guilty" doesn't mean the defendant is innocent, because that has not
been proven. It just means the prosecution couldn't prove its case to the necessary, "beyond a
reasonable doubt" standard. It failed to convince the judge to abandon the assumption of
innocence.
If you follow that rationale, then you can see that "failure to reject the null" is just the
statistical equivalent of "not guilty." In a trial, the burden of proof falls to the prosecution.
When analyzing data, the entire burden of proof falls to the sample data you've collected. Just
as "not guilty" is not the same thing as "innocent," neither is "failing to reject" the same as
"accepting" the null hypothesis.
So the next time you're looking to hang around at the local Nulls Angels clubhouse,
remember that "failing to reject the null" is not "accepting the null." Knowing the difference
just might get Tiny to buy you a drink.
Z Table:
How to Read a Z Table:
Example: Percent of Population between 0 and 0.45
Start at the row for 0.4, and read along until 0.45: there is the value 0.1736
And 0.1736 is 17.36%
So 17.36% of the population is between 0 and 0.45 Standard Deviations from the Mean
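The same lookup can be reproduced with the standard normal distribution in Python instead of the printed table; the 0.45 value below is the one from the example above.

from scipy.stats import norm

area = norm.cdf(0.45) - norm.cdf(0.0)   # area between 0 and 0.45 standard deviations
print(round(area, 4))                    # 0.1736, i.e. 17.36% of the population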
T Table:
How to read a t table:
First fix the value of alpha and the degrees of freedom. Then locate the degrees of freedom
down the left-hand side of the table and the value of alpha along the top. The entry at the
intersection of that row and column is the required critical value.
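The same lookup can be done with scipy's t distribution; the alpha and degrees of freedom below are illustrative choices, and the critical value returned is for a one-tailed area of alpha.

from scipy.stats import t

alpha, df = 0.05, 14                 # illustrative values; df = n - 1 for a one-sample test
t_critical = t.ppf(1 - alpha, df)    # upper-tail critical value
print(round(t_critical, 3))          # about 1.761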
CHI SQUARE
GOODNESS OF FIT
When an analyst attempts to fit a statistical model to observed data, he or she may wonder
how well the model actually reflects the data. How "close" are the observed values to those
which would be expected under the fitted model? One statistical test that addresses this issue
is the chi-square goodness of fit test. This test is commonly used to test association of
variables in two-way tables where the assumed model of independence is evaluated against
the observed data. In general, the chi-square test statistic is of the form
χ² = Σ (observed − expected)² / expected
If the computed test statistic is large, then the observed and expected values are not close and
the model is a poor fit to the data.
The following are properties of the goodness-of-fit test:
- The data are the observed frequencies. This means that there is only one data value for
each category.
- The degrees of freedom are one less than the number of categories, not one less than the
sample size.
- It is always a right-tail test.
- It has a chi-square distribution.
- The value of the test statistic doesn't change if the order of the categories is switched.
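A minimal Python sketch of the goodness-of-fit test follows; the observed counts are hypothetical values for a die tested against a uniform (fair) model, not data from this text.

from scipy.stats import chisquare

observed = [18, 22, 16, 14, 12, 18]          # counts in each category (hypothetical data)
expected = [sum(observed) / 6] * 6           # uniform model: equal expected frequency
chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(chi2, p_value)                         # large chi2 / small p means a poor fit (right-tail test)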
TEST OF INDEPENDENCE
In the test for independence, the claim is that the row and column variables are independent
of each other. This is the null hypothesis.
The multiplication rule said that if two events were independent, then the probability of both
occurring was the product of the probabilities of each occurring. It is a key to working the
test for independence. If you end up rejecting the null hypothesis, then the assumption must
have been wrong and the row and column variable are dependent. Remember, all hypothesis
testing is done under the assumption the null hypothesis is true.
The test statistic used is the same as the chi-square goodness-of-fit test. The principle behind
the test for independence is the same as the principle behind the goodness-of-fit test. The test
for independence is always a right tail test.
In fact, you can think of the test for independence as a goodness-of-fit test where the data is
arranged into table form. This table is called a contingency table.
The test statistic has a chi-square distribution when the following assumptions are met:
- The data are obtained from a random sample.
- The expected frequency of each category must be at least 5.

The following are properties of the test for independence:
- The data are the observed frequencies.
- The data are arranged into a contingency table.
- The degrees of freedom are the degrees of freedom for the row variable times the degrees
of freedom for the column variable. It is not one less than the sample size; it is the
product of the two degrees of freedom.
- It is always a right-tail test.
- It has a chi-square distribution.
- The expected value is computed by taking the row total times the column total and
dividing by the grand total.
- The value of the test statistic doesn't change if the order of the rows or columns is
switched.
- The value of the test statistic doesn't change if the rows and columns are interchanged
(transpose of the matrix).
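A minimal Python sketch of the test for independence follows; the contingency table below is hypothetical and only illustrates how the expected values and degrees of freedom described above are produced.

import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[30, 20, 10],
                  [20, 30, 40]])
chi2, p_value, dof, expected = chi2_contingency(table)
print(chi2, p_value, dof)       # dof = (rows - 1) * (columns - 1) = 2
print(expected)                 # row total * column total / grand total, as described above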
TEST FOR HOMOGENEITY
The test for homogeneity is a method, based on the chi-square statistic, for testing whether
two or more multinomial distributions are equal.
When is this test used?
- The data are multinomial data in a contingency table or two-way cross-classification
table. All expected values are at least 5.
- Another rule of thumb is that there are more than 4 cells, the average of the expected
values is at least 5, and the smallest expected value is at least 1.
- The cells have counts or frequencies. The test doesn't work if the data are percentages or
relative frequencies!
- Either the row totals or the column totals are fixed.
- The data come from multiple samples which are independent.
- This test is used to see if the different samples come from populations with the same
distribution. Each cell will have a frequency or count. It is necessary to find row,
column, and grand totals. There will be i categories and j samples or distributions. The
notation assumes the categories are the rows and the samples are the columns. If they
are switched, all the calculations and results will be the same.
How to decide which of the three χ² tests above is the appropriate one to use?
Goodness of Fit: Use the Goodness of Fit Test when you want to decide whether a
population with unknown distribution "fits" a known distribution. In this case there will be a
single qualitative survey question or a single outcome of an experiment from a single
population. Goodness of fit is typically used to see if the population is uniform (all outcomes
occur with equal frequency), the population is normal, or the population is the same as
another population with known distribution. The null and alternative hypotheses are:
H0: The population fits the given distribution.
Ha: The population does not fit the given distribution.
Independence: Use the Test for Independence when you want to decide whether two
variables are independent or dependent. In this case there will be two qualitative survey
questions or experiments and a contingency table will be constructed. The goal is to see if
the two variables are unrelated (independent) or related (dependent). The null and alternative
hypotheses are:
H0: The two variables are independent.
Ha: The two variables are dependent.
Homogeneity: Use the Test for Homogeneity when you want to decide if two populations
with unknown distribution have the same distribution as each other. In this case there will be
a single qualitative survey question or experiment given to two different populations. The
null and alternative hypotheses are:
H0: The two populations follow the same distribution.
Ha: The two populations have different distributions.
FISHER’s EXACT TEST
The Fisher's Exact test procedure calculates an exact probability value for the relationship
between two dichotomous variables, as found in a two by two cross table. The program
calculates the difference between the data observed and the data expected, considering the
given marginal and the assumptions of the model of independence. It works in exactly the
same way as the Chi-square test for independence; however, the Chi-square gives only an
estimate of the true probability value, an estimate which might not be very accurate if the
marginal is very uneven or if there is a small value (less than five) in one of the cells. In such
cases the Fisher exact test is a better choice than the Chi-square. However, in many cases
the Chi-square is preferred because the Fisher exact test is difficult to calculate.
The probability of observing a given set of frequencies a, b, c and d in a 2 x 2 contingency
table, given fixed row and column marginal totals and sample size n, is:
p = [(a + b)! (a + c)! (c + d)! (b + d)!] / [a! b! c! d! n!]
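A minimal Python sketch follows, using a hypothetical 2x2 table; the binomial-coefficient expression computed below is algebraically equivalent to the factorial formula above, and scipy's routine sums such probabilities to get the test's p-value.

from math import comb
from scipy.stats import fisher_exact

a, b, c, d = 3, 1, 1, 3                       # hypothetical cell counts
odds_ratio, p_value = fisher_exact([[a, b], [c, d]], alternative='two-sided')

n = a + b + c + d
# Probability of exactly this table, given the fixed margins:
prob_table = comb(a + b, a) * comb(c + d, c) / comb(n, a + c)
print(round(prob_table, 4), round(p_value, 4))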
SPECIAL CASE OF CONTINGENCY
A 2×2 contingency table shows the frequencies of occurrence of all combinations of the
levels of two dichotomous variables, in a sample of size N. A schematic form of such a
table is given by the figure below.
A research question of interest is often whether the variables summarized in a
contingency table are independent of each other. The test to determine if this is so
depends on which, if any, of the margins are fixed, either by design or for the purposes
of the analysis. For example, in a randomized trial in which the number of subjects to be
randomized to each treatment group has been specified, the row margins would be fixed
but the column margins would not (it is customary to use rows for treatments and
columns for outcomes). In a matched study, however, in which one might sample 100
cases (smokers, say) and 1000 controls (non–smokers), and then test each of these 1100
subjects for the presence or absence of some exposure that may have predicted their own
smoking status (perhaps a parent who smoked), it would be the column margins that are
fixed. In a random and unconstrained sample, in which each subject sampled is then cross-
classified by two attributes (say smoking status and gender), neither margin would be
fixed. Finally, in Fisher's famous tea-tasting experiment, in which a lady was to guess
whether the milk or the tea infusion was first added to the cup by dividing 8 cups into
two sets of 4, both the row and the column margins would be fixed by the design. Yet in
the first case mentioned, that of a randomized trial with fixed row margins but not fixed
column margins, the column margins may be treated as fixed for the purposes of the
analysis, so as to ensure exactness.
When the row and column margins are fixed, either by design or for the analysis,
independence can be tested using Fisher's exact test. This test is based on the
hypergeometric distribution and it is computationally intensive, especially in large samples.
Therefore, Fisher advocated the use of Pearson's statistic,
χ² = n(ad − bc)² / [(a + b)(c + d)(a + c)(b + d)]
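As a quick check, this shortcut formula can be compared against scipy's general Pearson chi-square computation with the continuity correction turned off; the cell counts below are hypothetical.

from scipy.stats import chi2_contingency

a, b, c, d = 20, 30, 25, 15                      # hypothetical cell counts
n = a + b + c + d
chi2_shortcut = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

chi2, p, dof, _ = chi2_contingency([[a, b], [c, d]], correction=False)
print(round(chi2_shortcut, 4), round(chi2, 4))   # the two values agree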
F-Test
The F-distribution is formed by the ratio of two independent chi-square variables divided by
their respective degrees of freedom.
Since F is formed by chi-square, many of the chi-square properties carry over to the F
distribution.
The F-values are all non-negative
The distribution is non-symmetric
The mean is approximately 1
There are two independent degrees of freedom, one for the numerator, and one for the
denominator.
There are many different F distributions, one for each pair of degrees of freedom.
The F-test is designed to test if two population variances are equal. It does
this by comparing the ratio of two variances. So, if the variances are equal,
the ratio of the variances will be 1.
If the null hypothesis is true, then the F test-statistic given above can be simplified
(dramatically). This ratio of sample variances will be the test statistic used. If the null hypothesis
is false, then we will reject the null hypothesis that the ratio was equal to 1 and our
assumption that they were equal.
There are several different F-tables. Each one has a different level of significance. So, find
the correct level of significance first, and then look up the numerator degrees of freedom and
the denominator degrees of freedom to find the critical value.
You will notice that all of the tables only give level of significance for right tail tests.
Because the F distribution is not symmetric, and there are no negative values, you may not
simply take the opposite of the right critical value to find the left critical value. The way to
find a left critical value is to reverse the degrees of freedom, look up the right critical value,
and then take the reciprocal of this value.
For example, the critical value with 0.05 on the left with 12 numerator and 15 denominator
degrees of freedom is found by taking the reciprocal of the critical value with 0.05 on the
right with 15 numerator and 12 denominator degrees of freedom.
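A minimal Python sketch of this left-tail trick follows: reverse the degrees of freedom, look up the right-tail critical value, and take its reciprocal.

from scipy.stats import f

left_direct = f.ppf(0.05, dfn=12, dfd=15)          # lower 5% point with (12, 15) d.f.
right_swapped = f.ppf(0.95, dfn=15, dfd=12)        # upper 5% point with (15, 12) d.f.
print(round(left_direct, 4), round(1 / right_swapped, 4))   # the two numbers match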
Assumptions:
- The larger variance should always be placed in the numerator.
- The test statistic is F = s1²/s2², where s1² > s2².
- Divide alpha by 2 for a two-tail test and then find the right critical value.
- If standard deviations are given instead of variances, they must be squared.
- When the degrees of freedom aren't given in the table, go with the value with the larger
critical value (this happens to be the smaller degrees of freedom). This is so that you are
less likely to reject in error (a type I error).
- The populations from which the samples were obtained must be normal.
- The samples must be independent.
How to read an F table:
- Find the column that corresponds to the relevant numerator degrees of freedom, r1.
- Find the three rows that correspond to the relevant denominator degrees of freedom, r2.
- Find the one row, from the group of three rows, that is headed by the probability of
interest, whether it is 0.01, 0.025, or 0.05.
- Determine the F-value where the r1 column and the probability row intersect.
Analysis of Variance-ANOVA
ONE-WAY ANOVA
A One-Way Analysis of Variance is a way to test the equality of three or more means at one
time by using variances.
Assumptions

The populations from which the samples were obtained must be normally or
approximately normally distributed.

The samples must be independent.

The variances of the populations must be equal.
Hypothesis
The null hypothesis will be that all population means are equal; the alternative hypothesis is
that at least one mean is different.
In the following, lower case letters apply to the individual samples and capital letters apply to
the entire set collectively. That is, n is one of many sample sizes, but N is the total sample
size.
Grand Mean
The grand mean of a set of samples is the total of all the data values divided by the
total sample size. This requires that you have all of the sample data available to
you, which is usually the case, but not always. It turns out that all that is necessary
to find perform a one-way analysis of variance are the number of samples, the
sample means, the sample variances, and the sample sizes.
Another way to find the grand mean is to find the weighted average of the sample
means. The weight applied is the sample size.
Total Variation
The total variation (not variance) comprises the sum of the squares of the
differences of each data value from the grand mean.
There is the between group variation and the within group variation. The whole idea behind
the analysis of variance is to compare the ratio of between group variance to within group
variance. If the variance caused by the interaction between the samples is much larger when
compared to the variance that appears within each group, then it is because the means aren't
the same.
Between Group Variation
The variation due to the interaction between the samples is denoted SS
(B) for Sum of Squares Between groups. If the sample means are close
to each other (and therefore the Grand Mean) this will be small. There
are k samples involved with one data value for each sample (the sample
mean), so there are k-1 degrees of freedom.
The variance due to the interaction between the samples is denoted MS(B) for Mean Square
Between groups. This is the between group variation divided by its degrees of freedom. It is
also denoted by s_b².
Within Group Variation
The variation due to differences within individual samples denoted SS
(W) for Sum of Squares Within groups. Each sample is considered
independently, no interaction between samples is involved. The degree
of freedom is equal to the sum of the individual degrees of freedom for
each sample. Since each sample has degrees of freedom equal to one
less than their sample sizes, and there are k samples, the total degrees of
freedom is k less than the total sample size: df = N - k.
The variance due to the differences within individual samples is denoted MS(W) for Mean
Square Within groups. This is the within group variation divided by its degrees of freedom.
It is also denoted by s_w². It is the weighted average of the variances (weighted with the
degrees of freedom).
F test statistic
Recall that an F variable is the ratio of two independent chi-square variables divided by their
respective degrees of freedom. Also recall that the F test statistic is the ratio of two sample
variances, well, it turns out that's exactly what we have here. The F test statistic is found by
dividing the between group variance by the within group variance. The degrees of freedom
for the numerator are the degrees of freedom for the between group (k-1) and the degrees of
freedom for the denominator are the degrees of freedom for the within group (N-k).
Summary Table
All of this sounds like a lot to remember, and it is. However, there is a table which makes
things really nice.
Source      SS                df       MS                      F
Between     SS(B)             k − 1    MS(B) = SS(B)/(k − 1)   F = MS(B)/MS(W)
Within      SS(W)             N − k    MS(W) = SS(W)/(N − k)
Total       SS(W) + SS(B)     N − 1
Notice that each Mean Square is just the Sum of Squares divided by its degrees of freedom,
and the F value is the ratio of the mean squares. Do not put the largest variance in the
numerator, always divide the between variance by the within variance. If the between
variance is smaller than the within variance, then the means are really close to each other and
you will fail to reject the claim that they are all equal. The degrees of freedom of the F-test
are in the same order they appear in the table.
Decision Rule
The decision will be to reject the null hypothesis if the test statistic from the table is greater
than the F critical value with k-1 numerator and N-k denominator degrees of freedom.
If the decision is to reject the null, then at least one of the means is different. However, the
ANOVA does not tell you where the difference lies. For this, you need another test, either
the Scheffe' or Tukey test.
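A minimal Python sketch of a one-way ANOVA follows; the three groups of scores are hypothetical and only illustrate the F statistic and decision described above.

from scipy.stats import f_oneway

group1 = [85, 86, 88, 75, 78, 94, 98, 79, 71, 80]
group2 = [91, 92, 93, 85, 87, 84, 82, 88, 95, 96]
group3 = [79, 78, 88, 94, 92, 85, 83, 85, 82, 81]
f_stat, p_value = f_oneway(group1, group2, group3)
print(round(f_stat, 3), round(p_value, 3))
# A significant F only says the means are not all equal; a follow-up test such as
# Scheffe or Tukey is needed to locate the difference.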
TWO-WAY ANOVA
The two-way analysis of variance is an extension to the one-way analysis of variance. There
are two independent variables (hence the name two-way).
Assumptions
- The populations from which the samples were obtained must be normally or
approximately normally distributed.
- The samples must be independent.
- The variances of the populations must be equal.
- The groups must have the same sample size.
Hypothesis
There are three sets of hypotheses with the two-way ANOVA. The null hypotheses for each
of the sets are given below:
- The population means of the first factor are equal. This is like the one-way ANOVA for
the row factor.
- The population means of the second factor are equal. This is like the one-way ANOVA
for the column factor.
- There is no interaction between the two factors. This is similar to performing a test for
independence with contingency tables.
Factors
The two independent variables in a two-way ANOVA are called factors. The idea is that
there are two variables, factors, which affect the dependent variable. Each factor will have
two or more levels within it, and the degrees of freedom for each factor is one less than the
number of levels.
Treatment Groups
Treatment Groups are formed by making all possible combinations of the two factors. For
example, if the first factor has 3 levels and the second factor has 2 levels, then there will be
3x2=6 different treatment groups.
Main Effect
The main effect involves the independent variables one at a time. The interaction is ignored
for this part. Just the rows or just the columns are used, not mixed. This is the part which is
similar to the one-way analysis of variance. Each of the variances calculated to analyze the
main effects are like the between variances
Interaction Effect
The interaction effect is the effect that one factor has on the other factor. The degrees of
freedom here are the product of the two degrees of freedom for each factor.
Within Variation
The Within variation is the sum of squares within each treatment group. You have one less
than the sample size (remember all treatment groups must have the same sample size for a
two-way ANOVA) for each treatment group. The total number of treatment groups is the
product of the number of levels for each factor. The within variance is the within variation
divided by its degrees of freedom. The within group is also called the error.
F-Tests
There is an F-test for each of the hypotheses, and the F-test is the mean square for each main
effect and the interaction effect divided by the within variance. The numerator degrees of
freedom come from each effect, and the denominator degrees of freedom is the degrees of
freedom for the within variance in each case.
Two-Way ANOVA Table
It is assumed that main effect A has a levels (and A = a-1 df), main effect B has b levels (and
B = b-1 df), n is the sample size of each treatment, and N = abn is the total sample size.
Notice the overall degree of freedom is once again one less than the total sample size.
Source               SS               df                    MS       F
Main Effect A        given            A: a − 1              SS/df    MS(A) / MS(W)
Main Effect B        given            B: b − 1              SS/df    MS(B) / MS(W)
Interaction Effect   given            A*B: (a − 1)(b − 1)   SS/df    MS(A*B) / MS(W)
Within               given            N − ab = ab(n − 1)    SS/df
Total                sum of others    N − 1 = abn − 1
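A minimal sketch of this table in Python follows, assuming the pandas and statsmodels packages are available; the factor levels and response values are made-up illustrations of a balanced design with two levels of A, three levels of B, and four observations per cell.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.DataFrame({
    "A": ["a1", "a1", "a1", "a2", "a2", "a2"] * 4,
    "B": ["b1", "b2", "b3", "b1", "b2", "b3"] * 4,
    "y": [23, 27, 30, 25, 29, 35, 22, 26, 31, 27, 30, 33,
          24, 28, 29, 26, 31, 36, 21, 25, 32, 28, 29, 34],
})
model = ols("y ~ C(A) * C(B)", data=data).fit()      # main effects and interaction
anova_table = sm.stats.anova_lm(model, typ=2)        # SS, df, F, p for each source
print(anova_table)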
MULTIPLE COMPARISON TEST
In a one-way ANOVA, the F statistic tests whether the treatment effects are all equal, i.e. that
there are no differences among the means of the J groups. A significant F value indicates that
there are differences in the means, but it does not tell you where those differences are; e.g.,
group 1's mean might be different from group 2's mean but not different from group 3's
mean.
To isolate where the differences are, you could do a series of pairwise t-tests. The problem
with this is that the significance levels can be misleading. For example, if you have 7 groups,
there will be 21 pairwise comparisons of means; using the .05 level of significance, you
would expect about one statistically significant difference even if no differences exist.
Therefore, various methods have been developed for doing multiple comparisons of group
means.
LSD
LSD stands for Least Significant Difference t test. This test does not control the overall
probability of rejecting the hypotheses that some pairs of means are different when in fact
they are equal; i.e., it doesn't matter whether you are comparing 1 pair of means or 100, no
adjustment is made for the number of comparisons. The test statistic is the usual two-sample
t statistic computed with the within-group mean square, MSW:
t = (x̄_i − x̄_j) / √( MSW (1/n_i + 1/n_j) )
BONFERRONI
The Bonferroni adjustment is the simplest. It basically multiplies each of the
significance levels from the LSD test by the number of tests performed, i.e.
J*(J-1)/2
If this value is greater than 1, then a significance level of 1 is used.
SIDAK
While simple, the Bonferroni adjustment actually overcompensates for the fact that
multiple comparisons are being made; e.g., if you do 21 tests, the probability is NOT
1.05 that at least one of them will be significant at the .05 level; rather, it is 1 − 0.95^21
= 0.659. The Sidak adjustment computes the level of significance as
1 − (1 − LSD significance)^(J(J−1)/2)
SCHEFFE
The Scheffe test takes a somewhat different approach. The Scheffe test computes an F
statistic with d.f. = J − 1, N − J:
Scheffe F = t²_LSD / (J − 1)
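A minimal Python sketch of the Bonferroni and Sidak adjustments follows; the number of groups and the unadjusted p-value are hypothetical.

J = 7
m = J * (J - 1) // 2          # number of pairwise comparisons (21 when J = 7)
p_lsd = 0.03                  # hypothetical unadjusted (LSD) p-value from one comparison

p_bonferroni = min(1.0, p_lsd * m)        # multiply by the number of tests, cap at 1
p_sidak = 1 - (1 - p_lsd) ** m            # 1 - (1 - p)^m
print(m, round(p_bonferroni, 4), round(p_sidak, 4))

# The text's example: with 21 tests at the .05 level, the chance of at least one
# spurious "significant" result is 1 - 0.95**21, about 0.659.
print(round(1 - 0.95 ** 21, 3))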
PROBLEMS RELATED TO EACH CASE
Case 1: (For testing μ when δ or δ 2 is known)
Q1:
The mean lifetime of electric light bulbs produced by a company has in the past been 1120
hours with standard deviation of 125 hours. A sample of 100 electric bulbs recently chosen
from a supply of newly produced bulbs showed a mean lifetime of 1070 hours. Test the
hypothesis that the mean lifetime of bulbs has not changed, using 5% levels of significance.
Solution:
Step 1: Specification of hypothesis
Ho: μ= 1120 hours (The mean life time of the bulbs has not changed)
Two Tailed
H1: μ≠ 1120 hours (the mean life time of bulbs has changed)
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is known.
Step 3: Test Statistics
z = (x̄ − μ) / (σ/√n)
Step 4: Calculation
z cal = (1070 − 1120) / (125/√100)
z cal = −(50/125)(10)
z cal = −4
Step 5: Critical Region
Reject Ho if |z cal| ≥ z tab.
Here z tab = 1.96, and |−4| ≥ 1.96.
Step 6: Conclusion
Since the calculated value of z exceeds the critical value z tab = 1.96 (z lies in the critical
region), we reject Ho at the 5% level of significance. We therefore conclude that the mean
lifetime of the bulbs has changed.
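The same calculation can be reproduced in Python with the standard normal distribution; the numbers below are the ones from the worked example.

from math import sqrt
from scipy.stats import norm

x_bar, mu0, sigma, n, alpha = 1070, 1120, 125, 100, 0.05
z = (x_bar - mu0) / (sigma / sqrt(n))
z_critical = norm.ppf(1 - alpha / 2)          # two-tailed critical value, about 1.96
p_value = 2 * norm.cdf(-abs(z))               # two-tailed p-value
print(round(z, 2), round(z_critical, 2), p_value)   # -4.0, 1.96, reject H0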
Q2:
The mean weight of a tablet of a certain drug is claimed to be 50 mg. A sample of 100 tablets
showed a mean weight of 50.15 mg with a standard deviation of 0.4 mg. using a 1% level of
significance, can we conclude that the desired weight is not properly maintained.
Step 1: Specification of hypothesis
Ho: μ= 50 mg (the weight of the tablet is properly maintained)
Two Tailed
H1: μ≠50 mg (the weight of the tablet is not properly maintained)
Step 2: Level of Significance
α= 0.01 (1 %)
Standard deviation is known.
Step 3: Test Statistics
z = (x̄ − μ) / (σ/√n)
Step 4: Calculation
z cal = (50.15 − 50) / (0.4/√100)
z cal = (0.15/0.4)(10)
z cal = 3.75
Step 5: Critical Region
Reject Ho if;
│z cal│≥ z tab
So;
z tab= 2.58
│3.75│ ≥ 2.58
Step 6: Conclusion
Since the calculated value of z is greater than the critical value z tab = 2.58 (z lies in the
critical region), we reject Ho at the 1% level of significance. We therefore conclude that the
weight of the tablet is not properly maintained.
Practice Questions
Q1:
A manufacturer supplies the rear axles for U.S Postal Service mail trucks. These axles must
be able to withstand 80,000 pounds per square inch in stress tests, but an excessively strong
axle raises production costs significantly. Long experience indicates that the standard
deviation of the strength of its axles is 4,000 pounds per square inch. The manufacturer
selects a sample of 100 axles from production, tests them, and finds that the mean stress
capacity of the sample is 79,600 pounds per square inch.
Q2:
It has been found from experience that the mean breaking strength of a particular brand of
threads is 9.63N with a standard deviation of 1.40N. Recently a sample of 36 pieces of
threads showed a mean breaking strength of 8.93N. Can we conclude at 5% and 1% levels of
significance that the threads have become inferior?
Case 2: (For testing μ 1 and μ2 when δ or δ 2 is known)
Q1:
A firm believes that the tires produced by process A on an average last longer than tires
produced by process B. To test this belief, random samples of tires produced by the two
processes were tested and the results are:
Process    Sample Size    Average Lifetime (in km)    Standard Deviation (in km)
A          50             22,400                      1,000
B          50             21,800                      1,000
Is there evidence at a 5% level of significance that the firm is correct in its belief?
Solution:
Step 1: Specification of hypothesis
Let us take the null hypothesis that the tires produced by process A do not, on average, last
longer than the tires produced by process B, that is
Ho: (μ1 − μ2) ≤ 0
H1: (μ1 − μ2) > 0          Upper Tail
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is known.
Step 3: Test Statistics
z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(σ1²/n1 + σ2²/n2)
Step 4: Calculation
z cal = (22,400 − 21,800) / √(1000²/50 + 1000²/50)
z cal = 600 / √(20,000 + 20,000)
z cal = 600 / 200
z cal = 3
Step 5: Critical Region
Reject Ho if;
z cal > z tab
So;
z tab= 1.65
3 > 1.65
Step 6: Conclusion
Since the calculated value of z exceeds the critical value z tab = 1.645 (z lies in the critical
region), we reject Ho at the 5% level of significance. We therefore conclude that the tires
produced by process A last longer than those produced by process B.
Q2:
A random sample of size 6 from a normal population with variance 24 gave mean= 15. A
sample of size 8 from a normal population with variance 80 gave mean = 13. Test Ho:
μ1 − μ2 = 0 against H1: μ1 − μ2 ≠ 0.
Solution:
Step 1: Specification of hypothesis
Ho: μ1 = μ2, i.e. μ1 − μ2 = 0
H1: μ1 − μ2 ≠ 0          Two Tailed
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is known.
Step 3: Test Statistics
z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(σ1²/n1 + σ2²/n2)
Step 4: Calculation
z cal = ((15 − 13) − 0) / √(24/6 + 80/8)
z cal = 2 / √(4 + 10)
z cal = 2 / √14
z cal = 0.535
Step 5: Critical Region
Reject Ho if |z cal| ≥ z tab.
For a two-tailed test at the 5% level, z tab = 1.96, and 0.535 ≱ 1.96.
Step 6: Conclusion
Since the calculated value of z is less than the critical value z tab (z does not lie in the
critical region), we do not reject Ho at the 5% level of significance.
Practice Questions
Q1:
On an examination in a statistics course, the average marks of 50 boys was 72 with a
population standard deviation of 8, while the average marks of 45 girls was 75. Test the
hypothesis at (a) 5% and (b) 1% level of significance that the boys‘ performance is inferior to
that of the girls.
Q2:
Two samples A and B detailed below were taken from normal populations of standard
deviation 2.5. Decide whether the difference of sample means is significant at the 0.05 level
of significance.
A:   16   18   23   26   19   24   25   23   21   22
B:   20   21   23   25   27   24   26   24   28   25
Case 3: (For testing μ when δ or δ 2 is unknown and n > 30)
Q1:
A company claims that the average lifetime of his product is 2000 hours. A random sample
of 64 products is put on test and their lifetime in hours is recorded. The following sums are
obtained from the lifetimes: Σx = 127808 and Σ(x-xˉ )2 = 9694.6. Test the hypothesis that the
manufacturer is overestimating the lifetimes of the products. Take α = 0.01.
Solution:
Step 1: Specification of hypothesis
Ho: μ = 2000 hours
Lower Tail
H1: μ < 2000 hours
Step 2: Level of Significance
α= 0.01 (1 %)
Standard deviation is unknown.
Step 3: Test Statistics
z = (x̄ − μ) / (s/√n)
Step 4: Calculation
z cal = (1997 − 2000) / (12.31/√64)
z cal = −3(8)/12.31
z cal = −1.95
Step 5: Critical Region
Reject Ho if z cal < −z tab.
For a lower-tailed test at the 1% level, z tab = 2.33, and −1.95 ≮ −2.33.
Step 6: Conclusion
Since the calculated value of z does not fall in the critical region, we do not reject Ho at the
1% level of significance. The sample does not provide sufficient evidence that the
manufacturer is overestimating the lifetime of the products.
Q2:
Individual filing of income tax returns prior to 30th June had an average refund of Rs. 1200.
Consider the population of last-minute filers who file their returns during the last week of
June. For a random sample of 400 individuals who filed a return between 25 and 30 June, the
sample mean refund was Rs. 1054 and the sample standard deviation was Rs. 1600. Using a
5% level of significance, test the belief that individuals who wait until the last week of June
to file their returns get a higher refund than early filers.
Solution:
Step 1: Specification of hypothesis
Ho: μ ≥ 1200
Lower Tail
H1: μ < 1200
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is unknown.
Step 3: Test Statistics
z = (x̄ − μ) / (s/√n)
Step 4: Calculation
z cal = (1054 − 1200) / (1600/√400)
z cal = −146/80
z cal = −1.825
Step 5: Critical Region
Reject Ho if z cal < −z tab.
For a lower-tailed test at the 5% level, z tab = 1.645, and −1.825 < −1.645.
Step 6: Conclusion
Since the calculated value of z falls in the critical region, we reject Ho at the 5% level of
significance. We therefore conclude that the mean refund of last-minute filers is less than
Rs. 1200.
Practice Questions
Q1:
A package device is set to fill detergent powder packets with a mean weight of 5 kg, with
standard deviation of 0.21kg. The weight of packets can be assumed to be normally
distributed. The weight of packets is known to drift upwards over a period of time due to
machine fault, which is not tolerable. A random sample of 100 packets is taken and weighted.
This sample has a mean weight of 5.03kg. Can we conclude that the mean weight produced
by the machine has increased? Use a 5% level of significance.
Q2:
The mean life span of a sample of fluorescent LEDs produced by a company is found to be
1600 days with a standard deviation of 150 days. Test the hypothesis that the mean life span
of fluorescent LEDs produced in general is higher than the mean life of 1570 days at α = 0.01
level of significance.
Case 4: (For testing μ 1 and μ2 when δ or δ 2 is unknown and n > 30)
Q1:
An experiment was conducted to compare the mean time in days required to recover from the
common cold for a person given daily dose of 4mg of vitamin C versus those who were not
given a vitamin supplement. Suppose that 35 adults were randomly selected for each
treatment category and that the mean recovery times and standard deviations for the two
groups were as follows:
                             Vitamin C    No Vitamin Supplement
Sample size                  35           35
Sample mean                  5.8          6.9
Sample standard deviation    1.2          2.9
Test the hypothesis that the use of vitamin C reduces the mean time required to recover
from a common cold and its complications, at the level of significance α =0.05.
Solution:
Step 1: Specification of hypothesis
Ho: (μ1 − μ2) ≥ 0  (vitamin C does not reduce the mean recovery time)
H1: (μ1 − μ2) < 0  (vitamin C reduces the mean recovery time)          Lower Tail
where μ1 denotes the vitamin C group mean and μ2 the no-supplement group mean.
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is unknown.
Step 3: Test Statistics
z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(s1²/n1 + s2²/n2)
Step 4: Calculation
z cal = ((5.8 − 6.9) − 0) / √(1.2²/35 + 2.9²/35)
z cal = −1.1 / √(0.041 + 0.240)
z cal = −1.1 / 0.530
z cal = −2.075
Step 5: Critical Region
Reject Ho if z cal < −z tab.
For a lower-tailed test at the 5% level, z tab = 1.645, and −2.075 < −1.645.
Step 6: Conclusion
Since the calculated value of z falls in the critical region, we reject Ho at the 5% level of
significance. We therefore conclude that vitamin C reduces the mean time required to recover
from the common cold.
Q2:
The education testing Service conducted a study to investigate differences between the scores
of female and male students on the Mathematics Aptitude Test. The study identified a
random sample of 562 female and 852 male students who had achieved the same high score
on the mathematics portion of the test. That is, the female and male students viewed as
having similar high ability in mathematics. The verbal scores for the two samples are given
below:
                             Female    Male
Sample mean                  547       525
Sample standard deviation    83        78
Do the data support the conclusion that given populations of female and male students
with similar high ability in mathematics, the female students will have a significantly
high verbal ability? Test at α =0.05 significance level. What is your conclusion?
Solution:
Step 1: Specification of hypothesis
Ho: (μ1 − μ2) ≤ 0  (female verbal ability is not higher)
H1: (μ1 − μ2) > 0  (female verbal ability is higher)          Upper Tail
where μ1 denotes the female mean and μ2 the male mean.
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is unknown.
Step 3: Test Statistics
z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(s1²/n1 + s2²/n2)
Step 4: Calculation
z cal = ((547 − 525) − 0) / √(83²/562 + 78²/852)
z cal = 22 / √(12.258 + 7.140)
z cal = 22 / 4.404
z cal = 4.995
Step 5: Critical Region
Reject Ho if z cal > z tab.
For an upper-tailed test at the 5% level, z tab = 1.645, and 4.995 > 1.645.
Step 6: Conclusion
Since the calculated value of z exceeds the critical value z tab (z lies in the critical region),
we reject Ho at the 5% level of significance and conclude that the female students have
significantly higher verbal ability.
Practice Questions
Q1:
The mean height of 50 male students of group I is 68.2 inches with a standard deviation of
2.5 inches, while the male students of group II have a mean height of 67.5 inches with a
standard deviation of 2.8 inches. Test the hypothesis that male students of group I are taller
than male students of group II at the 0.05 level of significance.
Q2:
A farmer claims that the average yield of corn of variety-----variety B by at least 12 bushels
per acre.----- with a standard deviation of 6.28 bushels per acre, while variety B yielded on
average 77.8 bushels per acre with a standard deviation of 5.64 bushels per acre. Test the
farmer‘s claim using a 0.05 level of significance.
Case 5: (For testing μ when δ or δ 2 is unknown and n < 30)
Q1:
Researchers are interested in whether the mean level of enzyme B in a certain population is
different from 120. They measure levels of enzyme B in a sample of 15 individuals and find
that the mean, xˉ = 96 and the sample standard deviation, s = 35.
Step 1: Specification of hypothesis
Ho: μ = 120
H1: μ ≠ 120          Two Tailed
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is unknown
Step 3: Test Statistics
t = (x̄ − μ) / (s/√n)          with n − 1 d.f.
Step 4: Calculation
t cal = (96 − 120) / (35/√15)
t cal = −2.65
Step 5: Critical Region
Reject Ho if |t cal| ≥ t tab (n−1).
Here t tab (14) = 2.145 (d.f. = 15 − 1 = 14), and |−2.65| ≥ 2.145.
Step 6: Conclusion
Since the calculated value of t exceeds the critical value t tab (t lies in the critical region),
we reject Ho at the 5% level of significance and conclude that the mean level of enzyme B
differs from 120.
Q2:
The Average breaking strength of steel rods is specified to be 18.5 thousand kg. For this a
sample of 14 rods was tested. The mean and standard deviation obtained were 17.85 and
1.955, respectively. Test the significance of the deviation.
Step 1: Specification of hypothesis
Ho: μ= 18.5
Two Tailed
H1: μ≠ 18.5
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is unknown.
Step 3: Test Statistics
t = (x̄ − μ) / (s/√n)          with n − 1 d.f.
Step 4: Calculation
t cal = (17.85 − 18.5) / (1.955/√14)
t cal = −1.24
Step 5: Critical Region
Reject Ho if;
|t cal| ≥ t tab (n−1).
Here t tab (13) = 2.16 (d.f. = 14 − 1 = 13), and |−1.24| < 2.16.
Step 6: Conclusion
Since the calculated value of t does not exceed the critical value t tab (t does not lie in the
critical region), we do not reject Ho at the 5% level of significance; the deviation is not
significant.
Practice Questions
Q1:
An automobile tire manufacturer claims that the average life of a particular grade of tire is
more than 20,000 km when used under normal conditions. A random sample of 16 tires was
tested and a mean and standard deviation of 22,000 km and 5,000 km respectively were
computed. Assuming the life of the tires in km to be approximately normally distributed,
decide whether the manufacturer‘s claim is valid.
Q2:
A random sample of 22 fifth-grade pupils has a grade point average of 5.0 in maths with a
standard deviation of 0.452, where marks range from 1 (worst) to 6 (excellent). The grade
point average (GPA) of all fifth-grade pupils over the last five years is 4.7. Is the GPA of the
22 pupils different from the population's GPA?
Pupil:         1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20   21   22
Grade points:  5    5.5  4.5  5    5    6    5    5    4.5  5    5    4.5  4.5  5.5  4    5    5    5.5  4.5  5.5  5    5.5

Mean = 5.0,   Variance = 0.2045
Case 6: (For testing μ 1 and μ2 when δ or δ 2 is unknown and n < 30)
Q1: A researcher interested in employee satisfaction and productivity measured the
number of units produced by employees at a plant before and after a company-wide pay
raise occurred. The researcher hypothesized that production would be higher after the raise
compared to before the raise. Assume that the difference scores are normally distributed and
let α = 0.05.
Participant    Before    After
1              7         7
2              4         5
3              8         9
4              8         9
5              6         6
6              6         6
7              5         5
8              5         4
9              7         7
Solution:
Step 1: Specification of hypothesis
Ho: There is no difference in the number of units produced before and after the raise, or the
number of units produced was higher before the raise.
H1: The number of units produced was higher after the raise.          One-Tailed
Step 2: Level of Significance
α= 0.05 (5 %)
Standard deviation is unknown.
Step 3: Test Statistics
t = (d̄ − μd) / (Sd/√n)          with n − 1 d.f.
Step 4: Calculation
Calculate the difference scores and the intermediate numbers for the SS formula:
Participant    Before    After    D     D²
1              7         7        0     0
2              4         5       −1     1
3              8         9       −1     1
4              8         9       −1     1
5              6         6        0     0
6              6         6        0     0
7              5         5        0     0
8              5         4       +1     1
9              7         7        0     0

n = 9,   ΣD = −2,   ΣD² = 4,   D̄ = −2/9 = −0.222
SSD = ΣD² − (ΣD)²/n = 4 − (−2)²/9 = 3.55
SD = √( SSD / (n − 1) ) = √( 3.55 / (9 − 1) ) = 0.666

Apply the formula:
t cal = (D̄ − μd) / (SD/√n)          with n − 1 d.f.
t cal = (−0.222 − 0) / (0.666/√9) = −0.222/0.222
t cal = −0.999
Step 5: Critical Region
Reject Ho if;
│t cal│> t tab (n-1)
So;
t tab= 1.86
(d.f 9-1 = 8)
│-0.999 │≯ 1.86
Step 6: Conclusion
Since the calculated value of t does not exceed the critical value t tab (n−1) (t does not lie
in the critical region), we do not reject Ho at the 5% level of significance.
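The same paired-sample calculation can be reproduced in Python; the before and after values are the ones from the worked example's table.

from scipy.stats import ttest_rel

before = [7, 4, 8, 8, 6, 6, 5, 5, 7]
after  = [7, 5, 9, 9, 6, 6, 5, 4, 7]
t_stat, p_two_sided = ttest_rel(before, after)
p_one_sided = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2
print(round(t_stat, 3), round(p_one_sided, 3))   # t is about -1.0; do not reject H0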
Q2:
A sociologist is interested in the decay of long-term memory compared to the number of
errors in memory that an individual made after 1 week and after 1 year for a specific crime
event. Participants viewed a videotape of a bank robbery and were asked a number of
specific questions about the video 1 week after viewing it. They were asked the same
questions 1 year after seeing the video. The number of memory errors was recorded for each
participant at each time period. The researchers asked whether or not there was a significant
difference in the number of errors in the two time periods. Assume that the difference scores
are normally distributed and let α = 0.05.
Subject    One Week    One Year
1          5           7
2          4           5
3          6           9
4          8           9
5          6           6
6          5           6
7          4           5
8          5           4
9          7           7
Solution:
Step 1: Specification of hypothesis
Ho: There is no difference in the number of errors made at 1 week and at 1 year.
Two Tailed
H1: There is a difference in the number of errors made at 1 week and at 1 year.
Step 2: Level of Significance
α = 0.05 (5 %)
Standard deviation is unknown.
Step 3: Test Statistics
t = (d̄ − μd) / (Sd/√n)          with n − 1 d.f.
Step 4: Calculation
Calculate the difference scores and the intermediate numbers for the SS formula:
Subject    One Week    One Year    D     D²
1          5           7          −2     4
2          4           5          −1     1
3          6           9          −3     9
4          8           9          −1     1
5          6           6           0     0
6          5           6          −1     1
7          4           5          −1     1
8          5           4          +1     1
9          7           7           0     0

n = 9,   ΣD = −8,   ΣD² = 18,   D̄ = −8/9 = −0.888

SSD = ΣD² − (ΣD)²/n = 18 − (−8)²/9 = 10.88
SD = √( SSD / (n − 1) ) = √( 10.88 / (9 − 1) ) = 1.166
Apply the formula:
t cal = (D̄ − μd) / (SD/√n)          with n − 1 d.f.
t cal = (−0.888 − 0) / (1.1666/√9)
t cal = −2.2855
Step 5: Critical Region
Reject Ho if;
│t cal│ ≥ t tab (n-1)
So;
t tab= 2.306
(d.f 9-1 = 8)
│-2.2855│≱ 2.306
Step 6: Conclusion
Since the calculated value of t does not exceed the critical value t tab (n−1) (t does not lie
in the critical region), we do not reject Ho at the 5% level of significance.
Practice Questions
Q1:
Suppose you are interested in developing a counseling technique to reduce stress within
marriages. You randomly select two samples of married individuals out of ten churches in
the association. You provide Group 1 with group counseling and study materials. You
provide Group 2 with individual counseling and study materials.
At the conclusion of the treatment period, you measure the level of marital stress in the group
members. Here are the scores:
Group 1
Group 2
25 17 29 29 26 24 27 33
21 26 28 31 14 27 29 23
23 14 21 26 20 27 26 32
18 25 32 23 16 21 17 20
20 32 17 23 20 30 26 12
26 23
7 18 29 32 24 19
Q2:
A professor wants to empirically measure the impact of stated course objectives and learning
outcomes in one of her classes. The course is divided into four major units, each with an
exam. For units I and III, she instructs the class to study lecture notes and reading
assignments in preparation for the exams.
For units II and IV, she provides clearly written instructional objectives. These objectives
form the basis for the exams in units II and IV. Test scores from I, III are combined, as are
scores from II, IV, giving a total possible score of 200 for the ―with‖ and ―without‖
objectives conditions. Do students achieve significantly better when learning and testing are
tied together with objectives? (α=0.01)
Here is a random sample of 10 scores from her class:
Subject    Without Objectives (I, III)    With Objectives (II, IV)
1          165                            182
2          178                            189
3          143                            179
4          187                            196
5          186                            188
6          127                            153
7          138                            154
8          155                            178
9          157                            169
10         171                            191
           ΣX = 1607                      ΣY = 1779
Case 7: (For testing p when σ or σ² is known)
Q1:
Suppose that you interview 1000 exiting voters about who they voted for governor. Of
the 1000 voters, 550 reported that they voted for the democratic candidate. Is there sufficient
evidence to suggest that the democratic candidate will win the election at the .01 level?
Solution:
Step 1: Specification of hypothesis
H0: p =.5
H1: p >.5
Upper Tail
Step 2: Level of Significance
α = 0.01 (1%)
Step 3: Test Statistics
z = (p̂ − p)/√(pq/n)
Step 4: Calculation
z cal = (0.55 − 0.5)/√(0.5(1 − 0.5)/1000)
z cal = 3.16
Step 5: Critical Region
Reject Ho if;
z cal > z tab
So;
z tab = z 0.01 = 2.326
3.16 > 2.326
Step 6: Conclusion
Since the calculated value of z cal exceeds the critical value of z tab (z lies in the critical region),
we reject Ho at the 1% level of significance. So we can conclude that the democratic candidate will win.
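As an illustration only, a minimal Python sketch of the one-proportion z statistic used above (p̂ = 550/1000 and the null value p = 0.5 are taken from the question):

from math import sqrt

p0 = 0.5            # hypothesised proportion under H0
p_hat = 550 / 1000  # sample proportion of voters for the Democratic candidate
n = 1000

z_cal = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
print(round(z_cal, 2))  # 3.16, which exceeds the upper-tail critical value 2.326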
Q2:
The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are very
satisfied with the service they receive. To test this claim, the local newspaper surveyed 100
customers, using simple random sampling. Among the sampled customers, 73 percent say
they are very satisfied. Based on these findings, can we reject the CEO's hypothesis that 80%
of the customers are very satisfied? Use a 0.05 level of significance.
Step 1: Specification of hypothesis
H0: p = 0.80
H1: p ≠ 0.80
Two Tailed
Step 2: Level of Significance
α= 0.05 (5 %)
Step 3: Test Statistics
z = (p̂ − p)/√(pq/n)
Step 4: Calculation
z cal = (0.73 − 0.8)/√(0.8(0.2)/100)
z cal = −1.75
Step 5: Critical Region
Reject Ho if;
│z cal│≥ z tab
So;
z tab=1.96
│-1.75│≱ 1.96
Step 6: Conclusion
The above calculation shows us that 1.75 is not in the rejection region. Therefore we fail to
reject H0.
Practice Questions
Q3:
1500 randomly selected pine trees were tested for traces of the Bark Beetle infestation. It
was found that 153 of the trees showed such traces. Test the hypothesis that more
than 10% of the Tahoe trees have been infested. (Use a 5% level of significance)
Q4:
Suppose the CEO claims that at least 80 percent of the company's 1,000,000 customers are
very satisfied. Again, 100 customers are surveyed using simple random sampling. The result:
73 percent are very satisfied. Based on these results, should we accept or reject the CEO's
hypothesis? Assume a significance level of 0.05.
Case 8: (For testing p̂₁ and p̂₂ when σ or σ² is known)
Q1:
Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The
company states that the drug is equally effective for men and women. To test this claim, they
choose a simple random sample of 100 women and 200 men from a population of 100,000
volunteers.
At the end of the study, 38% of the women caught a cold; and 51% of the men caught a cold.
Based on these findings, can we reject the company's claim that the drug is equally effective
for men and women? Use a 0.05 level of significance.
Solution:
Step 1: Specification of hypothesis
H0: p1 = p2
H1: p1 ≠ p2
Two Tailed
Step 2: Level of Significance
α = 0.05 (5 %)
Step 3: Test Statistics
z = (p̂₁ − p̂₂) / √[ p̄(1 − p̄)(1/n₁ + 1/n₂) ]
Step 4: Calculation
p̄ = [p̂₁n₁ + p̂₂n₂]/(n₁ + n₂)
p̄ = [(0.38 × 100) + (0.51 × 200)]/(100 + 200)
p̄ = 140/300 = 0.467
SE = √[ p̄(1 − p̄)(1/n₁ + 1/n₂) ]
SE = √[ 0.467 × 0.533 × (1/100 + 1/200) ]
SE = √0.003733 = 0.061
z cal = (0.38 − 0.51)/0.061
z cal = −2.13
Step 5: Critical Region
Reject Ho if;
│z cal│≥ z tab
So;
z tab = 1.96
|−2.13| = 2.13 ≥ 1.96
Step 6: Conclusion
The above calculation shows us that 2.13 lies in the rejection region. Therefore we reject H0 and
conclude that the drug's effectiveness differs for men and women.
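A minimal Python sketch of the pooled two-proportion z statistic used above; the proportions and sample sizes are those given in the question:

from math import sqrt

p1, n1 = 0.38, 100   # women: proportion who caught a cold, sample size
p2, n2 = 0.51, 200   # men

p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)              # pooled proportion = 0.467
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # standard error = 0.061
z_cal = (p1 - p2) / se

print(round(p_pool, 3), round(se, 3), round(z_cal, 2))  # 0.467 0.061 -2.13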
Q2:
Suppose the previous example is stated a little bit differently. Suppose the Acme Drug
Company develops a new drug, designed to prevent colds. The company states that the drug
is more effective for women than for men. To test this claim, they choose a simple random
sample of 100 women and 200 men from a population of 100,000 volunteers.
Step 1: Specification of hypothesis
H0: p1 = p2
Lower Tail
H1: p1 < p2
Step 2: Level of Significance
α = 0.05 (5%)
Step 3: Test Statistics
z = (p̂₁ − p̂₂) / √[ p̄(1 − p̄)(1/n₁ + 1/n₂) ]
Step 4: Calculation
p̄ = [p̂₁n₁ + p̂₂n₂]/(n₁ + n₂) = [(0.38 × 100) + (0.51 × 200)]/(100 + 200) = 140/300 = 0.467
SE = √[ p̄ × (1 − p̄) × (1/n₁ + 1/n₂) ] = √[ 0.467 × 0.533 × (1/100 + 1/200) ] = √0.003733 = 0.061
z cal = (0.38 − 0.51)/0.061
z cal = −2.13
Step 5: Critical Region
Reject Ho if;
z cal < -z tab
So;
z tab = 1.645
−2.13 < −1.645
Step 6: Conclusion
The above calculation shows us that - 2.13 is in the rejection region. Therefore we reject H0.
Practice Questions
Q1:
Consider a production process that produced 10,000 widgets in January and experienced a
total of 100 rejected widgets after a quality control inspection (i.e., failure rate = 0.01,
success rate = 0.99). A Six Sigma project was deployed to fix this problem and by March the
improvement plan was in place. In April, the process produced 8,000 widgets and
Q2:
Researchers want to test the effectiveness of a new anti-anxiety medication. In clinical
testing, 64 out of 200 people taking the medication report symptoms of anxiety. Of the people
receiving a placebo, 92 out of 200 report symptoms of anxiety. Is the medication working
any differently than the placebo? Test this claim using alpha = 0.05.
CASE 9: (For testing one variance)
Q1:
A cigarette manufacturer wishes to test the claim that the variance of nicotine content of its
cigarettes is 0.644. Nicotine content is measured in milligrams and is assumed normally
distributed. A sample of 20 cigarettes has a standard deviation of 1.00 milligram. At α = 0.05,
is there enough evidence to reject the manufacturer's claim?
Solution:
Step 1: Specification of hypothesis
H0: σ2 = 0.644
H1: σ2 ≠ 0.644
Two Tailed
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
χ² = (n − 1)s²/σ², with n − 1 d.f.
Step 4: Calculation
χ² = (20 − 1)(1.0)²/0.644
χ² = 29.5, with 19 d.f.
Step 5: Critical region
Reject H0 if;
χ²cal ≥ χ²tab (n−1)
So;
χ²tab (n−1) = 32.852
29.5 ≱ 32.852
Step 6: Conclusion
We fail to reject H0, i.e. we do not have enough evidence to reject the manufacturer's claim
that the variance of the nicotine content of the cigarettes is equal to 0.644.
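The chi-square statistic for a single variance can be reproduced with a few lines of Python (values from the worked example; the critical value is read from the table exactly as in Step 5):

n = 20
s = 1.00          # sample standard deviation (mg)
sigma2_0 = 0.644  # hypothesised population variance

chi2_cal = (n - 1) * s**2 / sigma2_0
print(round(chi2_cal, 2))  # 29.5 with 19 d.f.; compare with the tabled value 32.852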
Q2:
A pharmaceutical company is considering the purchase of new bottling machines to increase
efficiency. The factory currently makes use of machines that fill cough syrup bottles whose
volume of medicine has a standard deviation of 1.6 mL. The new machine they are
considering was tested on 30 bottles, producing a batch with a standard deviation of 1.25 mL.
Does the new machine produce less variability, i.e. a variance less than 2.56 (a standard deviation less than 1.6 mL), at the 0.05 significance level?
Solution:
Step 1: Specification of hypothesis
H0: σ2 ≥ 2.56
H1: σ2 < 2.56
Lower Tail
Step2: Level of significance
α = 0.05
Step 3: Test statistics
χ² = (n − 1)s²/σ², with n − 1 d.f.
Step 4: Calculation
χ² = (30 − 1)(1.25)²/2.56
χ² = 17.700, with 29 d.f.
Step 5: Critical region
Reject H0 if;
χ²cal < χ²tab (n−1)
So;
χ²tab (n−1) = 17.708 (lower 5% point of χ² with 29 d.f.)
17.700 < 17.708
Step 6: Conclusion
We reject H0, i.e. the new machine produces a variance less than 2.56 (a standard deviation below 1.6 mL).
Practice Questions
Q1:
In a study in which the subjects were 15 patients suffering from pulmonary sarcoid disease,
blood gas determinations were made. The variance of the sample was 450. Test the
hypothesis that the population variance is less than 250.
Q2:
A nutritionist claims that the standard deviation of the number of calories in 1 tablespoon of
the major brands of pancake syrup is 60. A sample of major brands of syrup is selected, and
the number of calories is shown. At α = 0.10, can the claim be rejected?
53  210  100  200  100  220  210  100  240  200  100  210  100  210  100  210  100  60
CASE 10: (For testing ratio of variances)
Q1:
The variability in the amount of impurities present in a batch of chemicals used for a
particular process depends on the length of time that the process is in operation. Suppose a
sample of size 25 is drawn from the normal process which is to be compared to a sample of a
new process that has been developed to reduce the variability of impurities.
           Sample 1   Sample 2
n          25         25
variance   1.04       0.51
Solution:
Step 1: Specification of hypothesis
H0: σ12 = σ 22
H1: σ 12> σ 22
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
F = s₁²/s₂², with ν₁ and ν₂ d.f.
Step 4: Calculation
F = 1.04/0.51
F = 2.04, with 24 and 24 d.f.
Step 5: Critical region
Reject H0 if;
F cal > F tab (ν1, ν2)
So;
F tab (ν1, ν2) = 1.9838
2.04 > 1.9838
Step 6:
We reject H0 and conclude that the variability in the new process (Sample 2) is less than the
variability in the original process (Sample 1).
Q2:
A math test is given in two classrooms. In the first classroom (21 students) the mean was
84.3 and the variance was 16.8. In the second classroom (16 students) the mean was 83.7
with a variance of 42.6. Are the two classroom variances different?
Solution:
Step 1: Specification of hypothesis
H0: σ12 = σ 22
H1: σ 12> σ 22
Upper Tail
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
F = s₁²/s₂², with ν₁ and ν₂ d.f.
Step 4: Calculation
F = 42.6/16.8
F = 2.54, with 15 and 20 d.f.
Step 5: Critical region
Reject H0 if;
F cal > F tab (ν1, ν2)
So;
F tab (ν1, ν2) = 2.2033
2.54 > 2.2033
Step 6:
We reject H0 and conclude that the variances are significantly different.
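A minimal Python sketch of the variance-ratio statistic used above; by convention the larger sample variance is placed in the numerator:

s2_larger = 42.6   # larger sample variance (second classroom, n = 16)
s2_smaller = 16.8  # smaller sample variance (first classroom, n = 21)

f_cal = s2_larger / s2_smaller  # numerator d.f. = 15, denominator d.f. = 20
print(round(f_cal, 2))          # 2.54, compared with F tab(15, 20) = 2.2033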
Practice Questions
Q1:
A tire manufacturer claims that the variance of the diameters in a certain tire model is 8.6. A
random sample of ten tires has a variance of 4.3. At α = 0.01, is there enough evidence to
reject the manufacturer's claim? Assume the population is normally distributed.
Q2:
A manufacturer wishes to determine whether there is less variability in the silver plating done
by Company 1 than that done by Company 2. Independent random samples yield the
following results. Do the populations have different variances?
CASE 11: (For testing α)
Q1:
H0: α = 0
H1: α ≠ 0
Two Tailed
α=5%
t = (a − α)/S.E.(a) = (5.8253 − 0)/0.82702 = 7.0437
Reject H0 if |t cal| ≥ t tab, where t tab = t α/2 (d.f. = 16) = 2.120
|7.0437| ≥ 2.120
So, reject H0.
CASE 12: (For testing β)
Q1:
H0: β = 0
H1: β ≠ 0
Two Tailed
α= 5%
t = (b − β)/S.E.(b) = (0.5676 − 0)/0.0183 = 31.01
Reject H0 if |t cal| ≥ t tab, where t tab = t α/2 (d.f. = 16) = 2.120
|31.01| ≥ 2.120
So, reject H0.
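These two tests reduce to dividing each estimated coefficient by its standard error; a minimal Python sketch using the values quoted above (small differences from the quoted 7.0437 and 31.01 are only rounding):

# Estimated regression coefficients and their standard errors from Cases 11 and 12
a, se_a = 5.8253, 0.82702   # intercept and its standard error
b, se_b = 0.5676, 0.0183    # slope and its standard error

t_alpha = (a - 0) / se_a    # test of H0: alpha = 0
t_beta = (b - 0) / se_b     # test of H0: beta = 0
print(round(t_alpha, 3), round(t_beta, 3))  # 7.044 31.016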
GOODNESS OF FIT TEST (PROBLEMS)
Q1:
If we toss a die 150 times and observe the following distribution of rolls, is the die fair?
Face           1    2    3    4    5    6
No. of rolls   22   21   22   27   22   36
Solution:
Step 1: Specification of hypothesis
H0: The distribution is binomial with n = 6, with p unspecified (estimated from the data)
Two Tailed
H1: The distribution is not binomial
Step 2: Level of significance
α = 0.05
Step 3: Test Statistics
χ²cal = Σ (oᵢ − eᵢ)²/eᵢ, with n − 1 − k d.f., where n = number of categories and k = number of estimated parameters.
Step 4: Calculation
We fit a binomial distribution to these data. First estimate p from the sample mean, then construct
the table of observed and expected frequencies.

μ = Σf(x)/Σf = 564/150 = 3.76 = np
p = 3.76/6 = 0.6267
q = 1 − p = 1 − 0.6267 = 0.3733

Face (x)   No. of rolls (f)   f(x)   pᵢ = C(n,x)·pˣ·qⁿ⁻ˣ   eᵢ = pᵢ·Σf   (oᵢ − eᵢ)²/eᵢ
1          22                 22     0.0275                4.125        77.45
2          21                 42     0.115                 17.25        0.815
3          22                 66     0.257                 38.55        7.105
4          27                 108    0.322                 48.33        9.41
5          22                 110    0.216                 32.4         3.338
6          36                 216    0.60                  90           32.4
Total      Σf = 150           Σf(x) = 564                               χ²cal = 129.88
Step 5: Critical region
Reject H0 if;
χ²cal ≥ χ²tab (n−1−k)
So;
χ²tab (n−1−k) = 11.143
129.88 ≥ 11.143
Step 6: Conclusion
Since the above inequality holds, we reject H0, i.e. the distribution is not binomial.
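A minimal Python sketch of the goodness-of-fit statistic, using the observed rolls and the rounded expected counts from the table above; because the expected counts are carried at the table's rounding, the total comes out near 130 rather than exactly the 129.88 quoted:

# Observed rolls and expected counts taken from the fitted-binomial table above
observed = [22, 21, 22, 27, 22, 36]
expected = [4.125, 17.25, 38.55, 48.33, 32.4, 90]

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_cal = sum(contributions)
print([round(c, 2) for c in contributions], round(chi2_cal, 1))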
Practice Questions
Q1:
The letter distribution of the 5 most popular letters in the English language is known to be
approximately
letter   E    T    N    R    O
freq.    29   21   17   17   16
That is when E, T, N, R, O appear, on average 29 times out of 100 it is an E and not the other
4. This information is useful in cryptography to break some basic secret codes. Suppose a
text is analyzed and the number of E, T, N, R and O's are counted. The following distribution
is found
letter   E     T     N    R    O
freq.    100   110   80   55   14
Do a chi-square goodness of fit hypothesis test to see if the letter proportions for this text
are pE=.29, pT=.21, pN=.17, pR=.17, pO=.16 or are different.
Q2:
A new casino game involves rolling 3 dice. The winnings are directly proportional to the
total number of sixes rolled. Suppose a gambler plays the game 100 times, with the following
observed counts:
Number of Sixes   0    1    2    3
Number of Rolls   48   35   15   3
The casino becomes suspicious of the gambler and wishes to determine whether the dice are
fair. What do they conclude?
TEST OF INDEPENDENCE (PROBLEMS)
Q1: Suppose you have the following categorical data set.
Table. Incidence of three types of malaria in three tropical regions.
            Asia   Africa   South America   Totals
Malaria A   31     14       45              90
Malaria B   2      5        53              60
Malaria C   53     45       2               100
Totals      86     64       100             250
Step 1: Specification of hypothesis
H0: There is no relationship between location and type of malaria.
Two Tailed
H1: There is a relationship between location and type of malaria.
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
χ² = Σ (O − E)²/E, with (r − 1)(c − 1) d.f.
Step 4: Calculation
We could now set up the following table:
Observed   Expected   |O − E|   (O − E)²   (O − E)²/E
31         30.96      0.04      0.0016     0.0000516
14         23.04      9.04      81.72      3.546
45         36.00      9.00      81.00      2.25
2          20.64      18.64     347.45     16.83
5          15.36      10.36     107.33     6.99
53         24.00      29.00     841.00     35.04
53         34.40      18.60     345.96     10.06
45         25.60      19.40     376.36     14.70
2          40.00      38.00     1444.00    36.10

Chi Square = 125.516
Degrees of Freedom = (c − 1)(r − 1) = 2(2) = 4
Step 5: Critical region
Reject H0 if;
χ²cal ≥ χ²tab ((c−1)(r−1))
So;
χ²tab ((c−1)(r−1)) = 9.488
125.516 ≥ 9.488
Step 6: Conclusion
Thus, we would reject the null hypothesis that there is no relationship between location and
type of malaria.
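The expected counts and the chi-square statistic for this table can be reproduced with a short Python sketch (standard library only):

# Observed counts: rows = malaria type (A, B, C), columns = Asia, Africa, South America
observed = [
    [31, 14, 45],
    [2, 5, 53],
    [53, 45, 2],
]

row_totals = [sum(row) for row in observed]        # 90, 60, 100
col_totals = [sum(col) for col in zip(*observed)]  # 86, 64, 100
grand_total = sum(row_totals)                      # 250

chi2_cal = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected count
        chi2_cal += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (3-1)(3-1) = 4
print(round(chi2_cal, 2), df)                      # about 125.52, with 4 d.f.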
Q2:
Suppose you conducted a drug trial on a group of animals and you hypothesized that the
animals receiving the drug would show increased heart rates compared to those that did not
receive the drug. You conduct the study and collect the following data:
              Heart Rate Increased   No Heart Rate Increase   Total
Treated       36                     14                       50
Not treated   30                     25                       55
Total         66                     39                       105
Step 1: Specification of hypothesis
Upper Tail
H0: The proportion of animals whose heart rate increased is independent of drug treatment.
H1: The proportion of animals whose heart rate increased is associated with drug treatment.
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
χ² = Σ (O − E)²/E, with (r − 1)(c − 1) d.f.
Step 4: Calculation
Observed   Expected   |O − E|   (O − E)²   (O − E)²/E
36         31.42      4.58      20.976     0.667
14         18.57      4.57      20.884     1.124
30         34.57      4.57      20.884     0.604
25         20.42      4.58      20.976     1.027

Chi square = 3.422
Degrees of Freedom = (2 − 1) × (2 − 1) = 1
Step 5: Critical region
Reject H0 if;
χ²cal ≥ χ²tab ((c−1)(r−1))
So;
χ²tab ((c−1)(r−1)) = 3.841
3.422 ≱ 3.841
Step 6: Conclusion
The calculated χ² value of 3.422 is less than the table value of 3.841, so we do not reject H0.
Practice Questions
Q1:
In a certain town, there are about one million eligible voters. A simple random sample of
10000 eligible voters was chosen to study the relationship between sex and participation in
the last election. The results are summarized in the following 2X2 (read two by two)
contingency table:
              Men    Women
Voted         2792   3591
Didn't vote   1486   2131
We want to check whether being a man or a woman (columns) is independent of having
voted in the last election (rows). In other words is "sex and voting independent"?
Q2:
Each respondent in the Current Population Survey of March 1993 was classified as
employed, unemployed, or outside the labor force. The results for men in California aged 35-44 can be
cross-tabulated by marital status, as follows:

                     Married   Widowed, divorced, or separated   Never married
Employed             679       103                               114
Unemployed           63        10                                20
Not in labor force   42        18                                25
Men of different marital status seem to have different distributions of labor force
status. Or is this just chance variation? (You may assume the table results from a
simple random sample.)
TEST FOR HOMOGENEITY (PROBLEMS)
Q1:
We had selected a random sample of 20 males from the population of males in the school and
another, independent, random sample of 16 females from the population of females in the
school. Within each sample we classify the students as democrat, republican and
independent.
Do a chi square test of homogeneity to see if there is a difference between political party
preferences on the basis of gender, from the data given below;
         Democrat   Republican   Independent   Totals
Male     11         7            2             20
Female   7          8            1             16
Totals   18         15           3             36
Step 1: Specification of hypothesis
H0: Political party preference is independent of gender
Two Tailed
H1: Political party preference is dependent on gender
Step 2: Level of significance
α = 0.05
Step 3: Test Statistics
χ²cal = (N²/AB)·[ Σ(aᵢ²/cᵢ) − A²/N ], with n − 1 d.f.,
where A and B are the two sample (row) totals, aᵢ the counts in the first row, and cᵢ the column totals.
Step 4: Calculation
aᵢ    aᵢ²   aᵢ²/cᵢ
11    121   6.72
7     49    3.26
2     4     1.33

χ²cal = (36²/(20 × 16)) × [ 11.313 − 400/36 ]
χ²cal = 0.822
Step 5: Critical region
Reject H0 if;
χ²cal ≥ χ²tab (n−1)
So;
χ²tab (n−1) = 12.832
0.822 ≱ 12.832
Step 6: Conclusion
We fail to reject H0 i.e. political party preference is independent of gender.
Q2: In a study of the television viewing habits of children, a developmental psychologist
selects a random sample of 300 first graders - 100 boys and 200 girls. Each child is asked
which of the following TV programs they like best: The Lone Ranger, Sesame Street, or The
Simpsons. Results are shown in the contingency table below.
Viewing Preferences
               Lone Ranger   Sesame Street   The Simpsons   Row total
Boys           50            30              20             100
Girls          50            80              70             200
Column total   100           110             90             300
Do the boys' preferences for these TV programs differ significantly from the girls'
preferences? Use a 0.05 level of significance.
Step 1: Specification of hypothesis
H0: Boys' preferences for these TV programs do not differ from girls' preferences
Two Tailed
H1: Boys' preferences for these TV programs differ from girls' preferences
Step 2: Level of significance
α = 0.05
Step 3: Test Statistics
χ²cal = (N²/AB)·[ Σ(aᵢ²/cᵢ) − A²/N ], with n − 1 d.f.
Step 4: Calculation
aᵢ    aᵢ²    aᵢ²/cᵢ
50    2500   25
30    900    8.18
20    400    4.44

χ²cal = (300²/(100 × 200)) × [ 37.63 − 10000/300 ]
χ²cal = 19.32
Step 5: Critical region
Reject H0 if;
χ²cal ≥ χ²tab (n−1)
So;
χ²tab (n−1) = 12.832
19.32 ≥ 12.832
Step 6: Conclusion
We reject H0, i.e. boys' preferences for these TV programs differ from girls' preferences.
Practice Questions
Q1:
A survey of drivers was taken to see if they had been in an accident during the previous year,
and if so was it a minor or major accident. The results are tabulated by age group:
Accident Type
AGE        None   Minor   Major
under 18   67     10      5
18-25      42     6       5
26-40      75     8       4
40-65      56     4       6
over 65    57     15      1
Do a chi-squared hypothesis test of homogeneity to see if there is a difference in
distributions based on age.
Q2:
To determine if there was an association between race and opinions about schools,
researchers surveyed 3 randomly selected groups of parents and asked them "Are high
schools in your state doing an excellent, good, fair or poor job, or don't you know enough to
say?"
FISHER'S EXACT TEST (PROBLEMS)
Q1:
Use Fisher's exact test to test the hypothesis that inoculation is independent of immunity from
attack among a population exposed to a certain disease.
               Not inoculated   Inoculated
Not attacked   3                5
Attacked       10               2
Solution:
Step 1: Specification of hypothesis
H0: Inoculation is independent of immunity
Two Tailed
H1: Inoculation is dependent on immunity
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
P = [ (a + b)! (a + c)! (c + d)! (b + d)! ] / [ a! b! c! d! n! ]
Step 4: Calculations
TABLE 0
               Not inoculated   Inoculated   Total
Not attacked   3                5            8
Attacked       10               2            12
Total          13               7            20
P0 = 0.0477

TABLE 1
               Not inoculated   Inoculated   Total
Not attacked   3 − 1 = 2        5 + 1 = 6    8
Attacked       10 + 1 = 11      2 − 1 = 1    12
Total          13               7            20
P1 = 0.0043

TABLE 2
               Not inoculated   Inoculated   Total
Not attacked   3 − 2 = 1        5 + 2 = 7    8
Attacked       10 + 2 = 12      2 − 2 = 0    12
Total          13               7            20
P2 = 0.0001
Step 5: Critical region
Reject H0 if;
2(Grand P) ≤ α = 0.05
Grand P = P0 + P1 + P2 = 0.0521
So;
2(Grand P) = 0.1042
Step 6: Conclusion
Since 2(Grand P) = 0.1042 > 0.05, we do not reject H0; the data do not provide sufficient evidence
that inoculation is related to immunity from attack.
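The table probabilities P0, P1 and P2 are hypergeometric probabilities and can be checked with a short Python sketch using math.comb:

from math import comb

def table_prob(a, b, c, d):
    # Hypergeometric probability of a 2x2 table (cells a, b / c, d) with fixed margins
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

# TABLE 0, 1 and 2 from the worked example, cells in the order a, b, c, d
tables = [(3, 5, 10, 2), (2, 6, 11, 1), (1, 7, 12, 0)]
probs = [table_prob(*t) for t in tables]

grand_p = sum(probs)                              # one-sided tail probability
print([round(p, 4) for p in probs])               # [0.0477, 0.0043, 0.0001]
print(round(grand_p, 4), round(2 * grand_p, 4))   # 0.0521 0.1042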
Practice Questions
Q1:
Suppose, in a fictitious experiment, 4 subjects in an Experimental Group and 4 subjects in a
Control Group are asked to solve an anagram problem. Three of the 4 subjects in the
Experimental Group and none of the subjects in the Control Group solved the problem. Table
below shows the results in a contingency table.
                Experimental   Control   Total
Solved          3              0         3
Did Not Solve   1              4         5
Total           4              4         8
Perform Fisher's exact test.
ONE WAY ANOVA PROBLEM
A study compared the number of hours of relief provided by five different brands of antacid
administered to 25 different people, each with stomach acid considered strong. The results
are given below.
         A        B        C        D        E        TOTAL
         4.4      5.8      4.8      2.9      4.6
         4.6      5.2      5.9      2.7      4.3
         4.5      4.9      4.9      2.9      3.8
         4.1      4.7      4.6      3.9      5.2
         3.8      4.6      4.3      4.3      4.4
Tj       21.4     25.2     24.5     16.7     22.3     110
Tj²      457.96   635.04   600.25   278.89   497.29   2469.43
Σxij²    92.02    127.94   121.51   57.81    100.49   500

C.F = T..²/(n × r) = 110²/25 = 484
TSS = Σxij² − C.F = 500 − 484 = 16
BSS = ΣTj²/r − C.F = 2469.43/5 − 484 = 9.886
WSS = TSS − BSS = 16 − 9.886 = 6.114
SOV            SS      d.f   MS       F ratio
b/w samples    9.886   4     2.4715   8.08
Within (WSS)   6.114   20    0.3057
Total          16      24
Step 1: Specification of hypothesis
Ho: µ1 = µ2 = µ3 = µ4 = µ5
H1: at least one mean differs
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
F = MSB/MSE, with d.f. (ν1, ν2)
Step 4: Calculation
F cal = 8.08
Step 5: Critical region
Reject H0 if;
F cal ≥ F tab (ν1, ν2)
So;
F tab (4, 20) = 2.87
8.08 ≥ 2.87
Step 6: Conclusion
We reject H0 and conclude that the five brands do not all give the same mean hours of relief.
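A minimal Python sketch of the same one-way ANOVA computed from the raw data; because the hand calculation above carries the grand total as 110 (rather than 110.1), the sums of squares and F ratio differ slightly, but the decision at the 5% level is the same:

# Hours of relief for the five antacid brands (columns A-E of the table above)
groups = {
    "A": [4.4, 4.6, 4.5, 4.1, 3.8],
    "B": [5.8, 5.2, 4.9, 4.7, 4.6],
    "C": [4.8, 5.9, 4.9, 4.6, 4.3],
    "D": [2.9, 2.7, 2.9, 3.9, 4.3],
    "E": [4.6, 4.3, 3.8, 5.2, 4.4],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = sum(all_values) / len(all_values)

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups.values())

df_between = len(groups) - 1               # 4
df_within = len(all_values) - len(groups)  # 20

f_cal = (ss_between / df_between) / (ss_within / df_within)
print(round(ss_between, 3), round(ss_within, 3), round(f_cal, 2))  # roughly 9.006 5.884 7.65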
Practice Questions
Q1: The training methods were compared to see whether they lead to greater productivity
after training. The following are productivity measures for individuals trained by each
method.
Method 1   45   40   50   39   53   44
Method 2   59   43   47   51   39   49
Method 3   41   37   43   40   52   37
At the 0.05 LOS do the three training methods lead to different levels of productivity?
Q2:
A research study was conducted to examine the clinical efficacy of a new antidepressant.
Depressed patients were randomly assigned to one of three groups: a placebo group, a group
that received a low dose of the drug, and a group that received a moderate dose of the drug.
After four weeks of treatment, the patients completed the Beck Depression Inventory. The
higher the score, the more depressed the patient. The data are presented below. Compute the
appropriate test.
Placebo   Low Dose   Moderate Dose
38        22         14
47        19         26
39        8          11
25        23         18
42        31         5
TWO WAY ANOVA PROBLEMS:
Q1: When a restaurant server writes a friendly note or draws a "happy face" on your
restaurant check, is this just a friendly act or is there a financial incentive? Psychologists
conducted a randomized experiment to investigate whether drawing a happy face on the back
of a restaurant bill increased the average tip given to the server. One female server and one
male server in a Philadelphia restaurant either did or did not draw a happy face on checks
during the experiment. In all they drew happy faces on 45 checks and did not draw happy
faces on 44 checks. The sequence of drawing the happy faces or not was random.
Complete the following two-way ANOVA table and then perform the appropriate F tests for
main effects and interaction and state your conclusions.
Source        DF    SS        MS       F
Message       ---   14.7      ---      ---
Gender        ---   ---       2602.0   ---
Interaction   ---   438.7     ---      ---
Error         ---   ---       109.8    ---
Total         ---   12407.9   ---      ---

Completed table:

Source        DF   SS        MS       F
Message       1    14.7      14.7     0.134
Gender        1    2602.0    2602.0   23.7
Interaction   1    438.7     438.7    4.0
Error         85   9333.0    109.8    -
Total         88   12407.9   -        -
Solution:
TEST 1
Step 1: Specification of hypothesis
H0: No main effect of message
H1: A main effect of message exists
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
F0 = MSS / MSE
Step 4: Calculation
F0 = 14.7/109.8 = 0.134 with numerator degrees of freedom 1 and denominator degrees of
freedom 85.
Step 5: Conclusion
This test statistic corresponds to a p-value of 0.7152. We do not have evidence to reject
the null hypothesis that there is no main effect of message on the average amount a server gets
tipped.
TEST 2
Step 1: Specification of hypothesis
H0: No main effect of gender
H1: A main effect of gender exists
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
F0 = MSS / MSE
Step 4: Calculation
F0 = 2602/109.8 = 23.7 with numerator d.f = 1 and
Denominator df = 85.
Step 5: Conclusion
This test statistic corresponds to a p-value of less than .0001. We have very strong evidence
that a main effect of gender does exist.
TEST 3
Step 1: Specification of hypothesis
H0: No interaction effect between gender and message
H1: An interaction effect between gender and message exists.
Step 2: Level of significance
α = 0.05
Step 3: Test statistics
F0 = MSS / MSE
Step 4: Calculation
F0 = 438.7/109.8 = 4.0 with numerator d.f = 1 and denominator d.f = 85
Step 5: Conclusion
This test statistic corresponds to a p-value of 0.0487. We have evidence to reject the null
hypothesis of no interaction at the α = 0.05 level. We have reason to believe that there is an
interaction effect.
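The three F ratios follow directly from the mean squares in the completed table; a minimal Python sketch:

# Mean squares from the completed two-way ANOVA table above
ms_error = 109.8
ms = {"Message": 14.7, "Gender": 2602.0, "Interaction": 438.7}

for effect, mean_square in ms.items():
    f_ratio = mean_square / ms_error   # each F has 1 and 85 degrees of freedom
    print(effect, round(f_ratio, 2))   # Message 0.13, Gender 23.7, Interaction 4.0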
Practice Questions
Q1: As a budding psychologist, you wonder whether you can teach old dogs new tricks. So
you go to the pound and adopt 15 old dogs and 15 puppies. Then you attempt to teach each
of the 30 dogs one of the standard dog tricks, "sit", "stay", and "roll over." Teaching only
one trick to each dog, you keep a record of how many days it takes before they learn the
tricks. The results of your experiment are listed in the table below. Use that data to conduct
a two-way analysis of variance to determine if old dogs can learn new tricks.
Type of Trick
                    "Sit"        "Shake"      "Roll Over"
                    (Column 1)   (Column 2)   (Column 3)
Puppies (Row 1)     2            4            6
                    1            5            9
                    3            4            7
                    1            6            8
                    2            7            10
Old Dogs (Row 2)    2            9            13
                    5            10           12
                    2            11           15
                    4            13           17
                    3            7            13
STATISTICAL INFERENCE-ESTIMATION
"Statistical inference is the art of drawing conclusions or inferences about a population
from the limited information contained in a sample."
Statistical inference is the process of drawing conclusions from data that are subject to random
variation, for example, observational errors or sampling variation. Statistical inference is used to
describe systems of procedures that can be used to draw conclusions from datasets arising from
systems affected by random variation, such as observational errors, random sampling, or random
experimentation.
Inferential Statistics
Inferential statistics is concerned with making predictions or inferences about a population from
observations and analyses of a sample. That is, we can take the results of an analysis using a sample
and can generalize it to the larger population that the sample represents. In order to do this,
however, it is imperative that the sample is representative of the group to which it is being
generalized.
To address this issue of generalization, we have tests of significance. A Chi-square or T-test, for
example, can tell us the probability that the results of our analysis on the sample are representative
of the population that the sample represents. In other words, these tests of significance tell us the
probability that the results of the analysis could have occurred by chance when there is no
relationship at all between the variables we studied in the population we studied.
Examples of inferential statistics include linear regression analyses, logistic regression
analyses, ANOVA, correlation analyses, structural equation modeling, and survival analysis, to
name a few.
• Two important areas of statistical inference are estimation and testing of hypotheses.
Hypothesis:
A statistical hypothesis is a claim (assertion, statement, belief or assumption) about an
unknown population parameter value. For example an investment company claims that the
average return across all its investments is 20 percent and so on. To test such claims sample
data are collected and analyzed. On the basis of sample findings hypothesized value of
population parameter is accepted or rejected.
Estimation:
Estimation is the method used to estimate the value of a population parameter from the value of
the corresponding sample statistic. For example, a company needs to understand consumer
awareness of its products. In the following example the decision maker needs to examine the
following concepts that are useful for drawing statistical inference about an unknown value
of population or process parameter.
The procedures of making judgements about a population parameter from a sample statistic are
referred to as statistical estimation or simply estimation. Estimation is further divided into point
estimation and interval estimation.
POINT ESTIMATION
The object of point estimation is to obtain a single number from the sample which will represent the
unknown value of the population parameter. Population parameter (population mean, population
variance, population proportion etc.) are estimated from the corresponding sample statistics (sample
mean, sample variance, sample proportion). A statistic used to estimate a population parameter is
called a point estimator, or simply an estimator, and the specific numerical value which we obtain for
an estimator in a given problem is called an estimate.
CRITERIA FOR POINT ESTIMATORS:
A point estimator is considered a good estimator if it satisfies various criteria.
• Unbiasedness
• Consistency
• Efficiency
• Sufficiency
UNBIASEDNESS
An estimator is defined to be unbiased if the statistic used as an estimator has its expected value
equal to the true value of the population parameter being estimated.
CONSISTENCY
An estimator is said to be consistent if the statistic used as an estimator becomes closer and
closer to the population parameter being estimated as the sample size n increases.
EFFICIENCY
An unbiased estimator is defined to be efficient if the variance of its sampling distribution is smaller
than that of the sampling distribution of any other unbiased estimator of the same parameters.
SUFFICIENCY
An estimator is defined to be sufficient if the statistic used as estimator uses all the information that
is contained in the sample. Any statistic that is not computed from all values in the sample is not a
sufficient estimator.
CONFIDENCE INTERVAL
A point estimator (e.g. a sample mean) calculated from sample data, provides a single number as an
estimate of the population parameter. A point estimator cannot be expected to be exactly equal to
the population parameter. For example the mean of a sample taken from a population may assume
different values for different samples. A sample mean obtained from one sample cannot be equal to
the population mean. We therefore estimate an interval of values within which the population
parameter may be expected to lie with a certain degree of confidence. A range of values used to
estimate a population parameter is known as interval estimation by confidence interval and the
interval (a, b) that will include the population parameter with a high probability (e.g. 0.90, 0.95 or
0.99) is known as confidence interval. The 90%, 95% or 99% confidence interval shows that we are
90%, 95% or 99% confident that our computed interval does in fact contain the unknown
population parameter. The limits ‘a’ and ‘b’ are called the lower and upper confidence limit of the
interval; the probability 0.90 0.95 or 0.99 is called the confidence coefficient or level of confidence
and is denoted by 1-alpha. The probability that the interval does not contain the parameter is
denoted by alpha. Since the total area under the probability curve is one, the level of confidence is
always equal to 1 − α. For the sample mean this gives:

Pr[ −Z α/2 ≤ Z ≤ Z α/2 ] = 1 − α
Pr[ −Z α/2 ≤ (X̄ − µ)/(σ/√n) ≤ Z α/2 ] = 1 − α
Pr[ −Z α/2 · σ/√n ≤ X̄ − µ ≤ Z α/2 · σ/√n ] = 1 − α
Pr[ X̄ − Z α/2 · σ/√n ≤ µ ≤ X̄ + Z α/2 · σ/√n ] = 1 − α

In short, we can write the 100(1 − α)% C.I. for µ when σ is known as

X̄ ± Z α/2 · σ/√n
TABLE:
CONFIDENCE INTERVAL   Z VALUE
90%                   1.65
95%                   1.96
99%                   2.58
99.9%                 3.291
CONFIDENCE INTERVAL FOR POPULATION MEAN WHEN POPULATION VARIANCE IS KNOWN
We want to construct a 95% confidence interval for the population mean when the population
variance is known. Let X̄ denote the sample mean. We know that the sampling distribution of X̄ is
normal with mean µ and standard deviation σ/√n. Thus the statistic

Z = (X̄ − µ)/(σ/√n)

will be normally distributed with mean zero and standard deviation one. The range

X̄ ± 1.96 · σ/√n

is called the 95% confidence interval for the population mean.
QUESTION:
Given a simple random sample of 25 observations from a normal population whose mean is unknown
and whose standard deviation is 5, suppose the sample mean is found to be 45. Find (i) a 95% and
(ii) a 99% confidence interval for the population mean.
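A minimal Python sketch of the interval asked for in this question, using the z values from the table above:

from math import sqrt

x_bar, sigma, n = 45, 5, 25        # sample mean, known population SD, sample size

for conf, z in [(95, 1.96), (99, 2.58)]:
    margin = z * sigma / sqrt(n)
    print(f"{conf}% CI: ({x_bar - margin:.2f}, {x_bar + margin:.2f})")
# 95% CI: (43.04, 46.96)   99% CI: (42.42, 47.58)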
LARGE SAMPLE CONFIDENCE INTERVAL FOR POPULATION MEAN WHEN
POPULATION VARIANCE IS UNKNOWN
In establishing a confidence interval for the mean we have used the value of the population standard
deviation in determining the width of the interval. But the standard deviation of the population, like
the mean of the population, is usually unknown. In this situation, when the sample size is large
(n > 30), the population standard deviation may be approximated by the sample standard deviation S.
Thus when n is large and the population standard deviation is unknown, a 100(1 − α)% confidence
interval for μ is given by

X̄ ± Z α/2 · S/√n
QUESTION:
The mean and standard deviation of the maximum loads supported by 60 cables are 11.09 and 0.73
tons respectively. Find i) 95% ii) 99% confidence interval for mean of the maximum loads of all
cables produced by the company.
QUESTION:
To estimate the average weekly income of unskilled workers in a large city, an investigator collects
weekly income data from a random sample of 75 unskilled workers. The mean and standard
deviation are found to be Rs. 127 and Rs. 15 respectively. Compute a) 90% and b) 80% confidence
interval for the mean weekly income?
CONFIDENCE INTERVAL FOR DIFFERENCE BETWEEN POPULATION MEANS
From each of two populations an independent random sample is drawn. The sample means, X̄1 and X̄2, are
calculated. The difference X̄1 − X̄2 is an unbiased estimator of the difference between the two
population means, µ1 − µ2. The variance of this estimator is σ1²/n1 + σ2²/n2, so a 100(1 − α)%
confidence interval for µ1 − µ2 is

(X̄1 − X̄2) ± Z α/2 · √(σ1²/n1 + σ2²/n2)
QUESTION:
A research team is interested in the difference between serum uric acid levels in patients with and
without Down's syndrome. In a large hospital for the treatment of the mentally retarded, a sample of
12 individuals with Down's syndrome yielded a mean of X̄1 = 4.5 mg/100 ml. In a general hospital a
sample of 15 normal individuals of the same age and sex were found to have a mean value of
X̄2 = 3.4 mg/100 ml. If it is reasonable to assume that the two populations of values are normally
distributed with variances equal to 1 and 1.5, find the 95 percent confidence interval for µ1 − µ2.
Given: X̄1 = 4.5, σ1² = 1, n1 = 12; X̄2 = 3.4, σ2² = 1.5, n2 = 15.
QUESTION:
Two independent samples of 100 machinists and 100 carpenters are taken to estimate the difference
between the weekly wages of the two categories of workers. The sample mean wages are X̄1 = Rs. 345
for machinists and X̄2 = Rs. 340 for carpenters. The population variances are σ1² = 196 for machinists
and σ2² = 204 for carpenters. Determine (a) a 90% and (b) a 99% confidence interval for the true
difference between the average wages of machinists and carpenters.
LARGE SAMPLE CONFIDENCE INTERVAL FOR DIFFERENCE BETWEEN TWO POPULATION MEANS
When the population variances σ1² and σ2² are unknown and the populations are not necessarily
normally distributed, we can still obtain a confidence interval for the difference between two
population means provided the sample sizes are large. In this situation, when n1 > 30 and n2 > 30,
σ1² and σ2² are approximated by s1² and s2² respectively, and a 100(1 − α)% confidence interval
for the difference between the two population means is

(X̄1 − X̄2) ± Z α/2 · √(s1²/n1 + s2²/n2)
QUESTION:
Students from schools A and B are compared on the basis of their scores on an aptitude test. Two
random samples of 90 and 100 students are selected from schools. The sample means are 76.4 and
81.2 where as the sample standard deviations are 8.2 and 7.6 respectively. Establish a 98%
confidence interval for the difference in population mean scores between students of schools A and
B?
CONFIDENCE INTERVAL FOR PROPORTIONS
Just as we established a confidence interval for the mean, we can obtain a confidence interval for the
binomial parameter p. The interval is based on the estimator p̂, the proportion of successes in a
sample of size n. We noted earlier that the distribution of p̂ is approximately normal with mean p
and standard deviation √(pq/n) when p is not too close to 0 or 1. Thus a 100(1 − α)% confidence
interval for p is

p̂ ± Z α/2 · √(pq/n)

This interval depends on the population proportion p, which is generally unknown. However, when n is
large, p may be approximated by the sample proportion p̂. Thus for large n, a 100(1 − α)% confidence
interval for p is

p̂ ± Z α/2 · √(p̂q̂/n)
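A minimal Python sketch of this large-sample interval for a proportion; the counts used here (40 successes in 200 trials) are purely hypothetical and only illustrate the mechanics:

from math import sqrt

# Hypothetical illustration: 40 successes observed in a sample of n = 200
successes, n = 40, 200
p_hat = successes / n
q_hat = 1 - p_hat

z = 1.96                                   # 95% confidence
margin = z * sqrt(p_hat * q_hat / n)
print(f"({p_hat - margin:.3f}, {p_hat + margin:.3f})")  # (0.145, 0.255)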
QUESTION:
In a random sample of 100 articles, 10 are found to be defective. Obtain a 95% confidence interval
for the true proportion of defectives in population of such articles?
QUESTION:
In a random sample of 500 farmers in a certain rural area, 41 were found to be unemployed. Compute
a 99% confidence interval for the rate of unemployment in that area.
CONFIDENCE INTERVALS FOR DIFFERENCE BETWEEN PROPORTIONS
For large n1 and n2, a 100(1 − α)% confidence interval for the difference of two binomial parameters
p1 − p2 is given by

(p̂1 − p̂2) ± Z α/2 · √(p̂1q̂1/n1 + p̂2q̂2/n2)

where p̂1 and p̂2 are the proportions of successes in random samples of sizes n1 and n2 respectively.
QUESTION:
In random samples of 400 adults and 600 teenagers who watched a certain TV programme, 100
adults and 300 teenagers indicated that they liked it. Construct (a) a 95% and (b) a 99% confidence
interval for the difference in proportions of all adults and all teenagers who watched the programme
and liked it.
CONFIDENCE INTERVAL FOR POPULATION MEAN BASED ON SMALL SAMPLES
The procedure for determining a confidence interval for the population mean based on small samples
is the same as for large samples, except that we use the t-distribution instead of the standard
normal distribution. A confidence interval for the population mean can be computed when the
population variance is unknown and the sample is small. In general, if the population distribution is
normal and σ² is unknown, a 100(1 − α)% confidence interval for the mean is given by

X̄ ± t α/2 · S/√n, with n − 1 degrees of freedom.

A comparison of this confidence interval with the large-sample formula shows that for small samples
we replace z by t, and we replace σ by S, which is the sample estimate of σ. As n increases, both
methods tend towards agreement.
QUESTION:
A sample of 12 measurements of the breaking strength of cotton threads gave a mean of 209 grams
and a standard deviation is 35 grams. Find 95% and 99% confidence limit for the actual mean
breaking strength?
QUESTION:
Five measurements of the reaction time of an individual to a certain stimulus were recorded as 0.28,
0.30, 0.27, 0.33, 0.31 seconds. Find 95% and 99% confidence interval for the actual mean reaction
time?
CONFIDENCE INTERVAL FOR DIFFERENCE BETWEEN POPULATION MEANS μ1 − μ2 BASED ON SMALL SAMPLES
Take random samples of sizes n1 and n2 from normal populations with variances σ1² and σ2²
respectively, and let X̄1 and X̄2 be the respective sample means. A confidence interval for the
difference between the two population means μ1 and μ2 can be computed when the population variances
are unknown and the sample sizes are small. If σ1² = σ2², we can estimate the common variance by the
pooled estimate sp² given by

sp² = [ (n1 − 1)s1² + (n2 − 1)s2² ] / (n1 + n2 − 2)

Hence the 100(1 − α)% confidence interval for μ1 − μ2 is

(X̄1 − X̄2) ± t α/2 · sp · √(1/n1 + 1/n2), with (n1 + n2 − 2) degrees of freedom.
QUESTION:
Two random samples of sizes n1 = 9 and n2 = 16 from two independent populations having normal
distributions give the means and standard deviations X̄1 = 64, X̄2 = 59, s1 = 6 and s2 = 5.
Find a 95% confidence interval for μ1 − μ2, assuming that σ1 = σ2.
QUESTION:
A random sample of 10 university professors gave their salaries in thousands Rs. 13, 11, 19,
15, 22, 20, 14, 17, 14, 15. Another random sample 5 college professors gave their salaries in
thousands RS. 9, 12, 8, 10, 16. Construct a 95% confidence interval for the difference between
means of the salaries of universities and college professors assuming that their variances are equal?
CONFIDENCE INTERVAL FOR PAIRED OBSERVATIONS
Now consider estimation procedures for the difference of two population means when the samples
are not independent and the variances of the two populations are not necessarily equal. When the two
samples are paired, the differences d1, d2, …, dn constitute a single random sample from a population
of differences which is normally distributed with mean μd and variance σd². A 100(1 − α)% confidence
interval for μd is d̄ ± t α/2 · Sd/√n, with n − 1 degrees of freedom.
QUESTION:
Twenty college freshmen were divided into 10 pairs, each member of the pair having approximately
the same I.Q. One member of each pair was selected at random and assigned to a mathematics section
using programmed materials only. The other member of each pair was assigned to a section in which
the teacher lectured. At the end of the semester each group was given the same examination and the
following results were recorded:
Pair   Programmed Material   Lecturer
1      76                    81
2      70                    52
3      85                    87
4      58                    70
5      91                    86
6      75                    77
7      82                    90
8      64                    73
9      79                    85
10     88                    83
Find a 98% confidence interval for the true difference between the two learning procedures.
DEGREES OF FREEDOM
In many statistical problems we are required to determine the degrees of freedom. This refers to a
positive whole number that indicates the lack of restrictions in our calculations. The degree of
freedom is the number of values in a calculation that we can vary.
Student t Distribution
Degrees of freedom play an important role when using the Student t-score table. There are actually
several t-score distributions. We differentiate between these distributions by use of degrees of
freedom. Here the probability distribution that we use depends upon the size of our sample. If our
sample size is n, then the number of degrees of freedom is n - 1. For instance, a sample size of 22
would require us to use the row of the t-score table with 21 degrees of freedom.
Chi-Square Distribution
The use of a chi-square distribution also requires the use of degrees of freedom. Here, in an
identical manner as with the t distribution, the sample size determines which distribution to use. If
the sample size is n, then there are n - 1 degrees of freedom.
Standard deviation
Another place where degrees of freedom show up is in the formula for the standard deviation. This
occurrence is not as overt, but we can see it if we know where to look. To find a standard
deviation we are looking for the "average" deviation from the mean. However after subtracting the
mean from each data value and squaring the differences, we end up dividing by n - 1 rather
than n as we might expect.
The presence of the n - 1 comes from the number of degrees of freedom. Since the n data values and
the sample mean are being used in the formula, there are n - 1 degrees of freedom.
Advanced Techniques
More advanced statistical techniques use more complicated ways of counting the degrees of
freedom. When calculating the test statistic for two means with independent samples
of n1and n2 elements, the number of degrees of freedom has quite a complicated formula. It can be
estimated by using the smaller of n1 - 1 and n2 - 1
Another example of a different way to count the degrees of freedom comes with an F test. In
conducting an F test we have k samples each of size n. The degrees of freedom in the numerator
are k - 1 and in the denominator is k (n - 1).
Chi-square
A chi-square test is a statistical test commonly used for testing independence and goodness of fit.
Testing independence determines whether two or more observations across two populations are
dependent on each other (that is, whether one variable helps to estimate the other). Testing for
goodness of fit determines if an observed frequency distribution matches a theoretical frequency
distribution. In both cases the equation to calculate the chi-square statistic is

χ² = Σ (O − E)²/E

where O equals the observed frequency and E the expected frequency. The results of a chi-square
test, along with the degrees of freedom, are used with a previously calculated table of chi-square
distributions to find a p-value. The p-value can then be used to determine the significance of the test
Data used in a chi-square analysis have to satisfy the following conditions:
• randomly drawn from the population,
• reported in raw counts of frequency,
• measured variables must be independent,
• observed frequencies cannot be too small,
• values of independent and dependent variables must be mutually exclusive.
There are two types of chi-square test.
The chi-square test for goodness of fit, which compares the expected and observed values to determine
how well an experimenter's predictions fit the data; and the chi-square test for independence, which
compares two sets of categories to determine whether the two groups are distributed differently among
the categories.
QUESTION:
Calculate the chi-square value and state the null and alternative hypotheses, with α = 0.05.

               LUNGS   GUMS   THROAT   BONES   SKIN   TOTAL
ADDICTED       23      78     100      20      29     250
NOT ADDICTED   4       11     13       25      7      60
TOTAL          27      89     113      45      36     310
QUESTION:
If 5 coins were tossed 1000 times and the number of heads is given below, test whether a binomial
distribution gives a satisfactory fit to these data.

NO. OF HEADS   0    1     2     3     4     5
f              38   144   342   287   164   25
ANOVA
A study compared the number of hours of relief provided by five different brands of antacid
administered to 25 different people, each with stomach acid considered strong. The results are given below.
         A        B        C        D        E        TOTAL
         4.4      5.8      4.8      2.9      4.6
         4.6      5.2      5.9      2.7      4.3
         4.5      4.9      4.9      2.9      3.8
         4.1      4.7      4.6      3.9      5.2
         3.8      4.6      4.3      4.3      4.4
Tj       21.4     25.2     24.5     16.7     22.3     110
Tj²      457.96   635.04   600.25   278.89   497.29   2469.43
Σxij²    92.02    127.94   121.51   57.81    100.49   500
C.F = T..²/(n × r) = 110²/25 = 484
TSS = Σxij² − C.F = 500 − 484 = 16
BSS = ΣTj²/r − C.F = 2469.43/5 − 484 = 9.886
WSS = TSS − BSS = 16 − 9.886 = 6.114
SOV            SS      d.f   MS       F ratio
b/w samples    9.886   4     2.4715   8.08
Within (WSS)   6.114   20    0.3057
Total          16      24
1. Specification of hypothesis
Ho: µ1 = µ2 = µ3 = µ4 = µ5
H1: at least one mean differs
2. Level of significance
α = 5%
3. Test statistics
F = MSB/MSE, with d.f. (ν1, ν2)
4. Calculation
F cal = 8.08
5. Critical region
Reject Ho if F cal ≥ F tab; F 0.05 (4, 20) = 2.87
6. Conclusion
8.08 ≥ 2.87, so Ho is rejected: the brands do not all give the same mean hours of relief.
QUESTION:
The training methods were compared to see whether they lead to greater productivity after training.
The following are productivity measures for individuals trained by each method.
Method 1   45   40   50   39   53   44
Method 2   59   43   47   51   39   49
Method 3   41   37   43   40   52   37
At the 0.05 LOS do the three training methods lead to different levels of productivity?
Table of contents:
4.1 INTRODUCTION
4.1.1- Qualitative Vs Quantitative variable
4.1.2- Correlation & Causation
4.1.3- Correlation& Regression
4.2 CORRELATION ANALYSIS
4.2.1- Definition
4.2.2- Properties
4.2.3- Solved Example
4.2.4.1- Example 1
4.2.4.2- Example 2
4.2.4.3-Example 3
4.3 SCATTER PLOT
4.4.1- Why scatter plot?
4.4.2- Types of scatter plot
4.4.2.1- Positive correlation
4.4.2.2- Negative correlation
4.4.2.3- No correlation
4.4.2.4- Curvilinear relation
4.4.2.5-Strong positive correlation
4.4.2.6-Strong negative correlation
4.4- SIMPLE LINEAR REGRESSION
4.5.1- Simple linear regression model
4.5.2- Estimating unknown regression coefficients α, β.
4.5.3- Solved Example
4.5- STATISTICAL INFERENCE - CORRELATION
4.6.1- Hypothesis testing for ⍴.
4.6.2- Confidence interval for………………??
4.6- STATISTICAL INFERENCE - REGRESSION
4.7.1- hypothesis test for α & β
4.7.2- Confidence interval for α & β
CORRELATION & SIMPLE REGRESSION:
Definition:
The strength of the relationship between any two variables is called correlation. It is denoted by "r".
Correlation analysis is the statistical tool we can use to describe the degree to which one
variable is linearly related to another.
INTRODUCTION:
Correlation is a quantitative estimate of the relationship between two or more variables. When we
refer to simple correlation, we are measuring the strength and direction of the relation between
a dependent variable (Y) and a single independent variable (X); this is why it is called simple
correlation. This is in contrast to multiple correlation, where we measure the strength and direction
of the relationship between a dependent variable (Y) and more than one independent variable (X).
Why it is used:
Correlation Analysis is used in conjunction with regression analysis to measure how well the
regression line explains the variation of the dependent variable.
Quantitative vs. Qualitative
Quantitative variables are variables measured on a numeric scale. Height, weight, response time,
subjective rating of pain, temperature, and score on an exam are all examples of quantitative
variables.
Qualitative variables are variables with no natural sense of ordering. They are therefore
measured on a nominal scale. For instance, hair color (Black, Brown, Gray, Red, and Yellow) is
a qualitative variable.
• Both variables are qualitative: we analyze an association through a comparison of conditional
probabilities and graphically represent the data using contingency tables. Examples of qualitative
variables are gender and class standing.
• Both variables are quantitative: while analyzing this situation we consider how one variable,
called a response variable, changes in relation to changes in the other variable, called an
explanatory variable. Graphically we use scatter plots to display two quantitative variables.
Examples are age, height, weight (i.e. things that are measured).
• One variable is categorical and the other is quantitative, for instance height and gender. These
are best compared by using side-by-side box plots to display any differences or similarities in the
center and variability of the quantitative variable (e.g. height) across the categories (e.g. Male
and Female).
Correlation and causation
If there is a significant linear correlation between two variables, then one of five situations can
be true:
• There is a direct cause and effect relationship.
• There is a reverse cause and effect relationship.
• The relationship may be caused by a third variable.
• The relationship may be caused by complex interactions of several variables.
• The relationship may be coincidental.
If we conduct a study and we establish a strong correlation does this mean we also have
causation? That is, if two variables are related does that imply that one variable causes the other
to occur?
Consider smoking cigarettes and lung cancer; does smoking cause lung cancer? Initially this was
answered as YES, but this was based on a strong correlation between smoking and lung cancer.
Not until scientific research verified that smoking can lead to lung cancer was causation
established. If you were to review the history of cigarette warning labels, the first mandated label
only mentioned that smoking was hazardous to your health. Not until 1981 did the label mention
that smoking causes lung cancer. To establish causation one must rule out the possibility
of lurking variable(s). The best method to accomplish this is through a solid design of your
experiment, preferably one that uses a control group.
Regression and correlation
Regression and correlation analyses are statistical tools that, when properly used, can significantly
help people to make decisions. Unfortunately they are often misused. As a result, decision makers
often make inaccurate forecasts and less than desirable decisions.
CORRELATION ANALYSIS
Definition
It is a technique to determine the degree to which variables are linearly related to another.
Correlation analysis measures the relationship between two items, for example, a security's price
and an indicator. The resulting value (called the "correlation coefficient") shows if changes in
one item (e.g., an indicator) will result in changes in the other item (e.g., the security's price).
Spearman's rank correlation coefficient
It is written in short as the Greek letter rho (ρ), or sometimes as r_s. It is a number that shows
how closely two sets of data are linked. It can only be computed on data that can be put in order,
from highest to lowest.
For example, if you have data for how expensive different computers are, and data for how fast the
computers are, you could see if they are linked, and how closely they are linked, using ρ.
Correlation Coefficient “r”:
Properties:
• The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables. The linear correlation coefficient is sometimes referred to as the Pearson product moment correlation coefficient in honor of its developer, Karl Pearson.
• The mathematical formula for computing r is
  r = [ n∑xy − (∑x)(∑y) ] / √{ [ n∑x² − (∑x)² ][ n∑y² − (∑y)² ] }
  where n is the number of pairs of data.
• The value of r is such that −1 ≤ r ≤ +1. The + and − signs are used for positive linear correlations and negative linear correlations, respectively.
• Positive correlation: If x and y have a strong positive linear correlation, r is close to +1. An r value of exactly +1 indicates a perfect positive fit. Positive values indicate a relationship between the x and y variables such that as values for x increase, values for y also increase.
• Negative correlation: If x and y have a strong negative linear correlation, r is close to −1. An r value of exactly −1 indicates a perfect negative fit. Negative values indicate a relationship between x and y such that as values for x increase, values for y decrease.
• No correlation: If there is no linear correlation or a weak linear correlation, r is close to 0. A value near zero means that there is little or no linear relationship between the two variables.

• Note that r is a dimensionless quantity; that is, it does not depend on the units employed.
• A perfect correlation of ±1 occurs only when the data points all lie exactly on a straight line. If r = +1, the slope of this line is positive. If r = −1, the slope of this line is negative.
• A correlation greater than 0.8 is generally described as strong, whereas a correlation less than 0.5 is generally described as weak. These values can vary based upon the "type" of data being examined. A study utilizing scientific data may require a stronger correlation than a study using social science data.
Coefficient of Determination, r² or R²:
• The coefficient of determination, r², is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable. It is a measure that allows us to determine how certain one can be in making predictions from a certain model/graph.
• The coefficient of determination is the ratio of the explained variation to the total variation.
• The coefficient of determination is such that 0 ≤ r² ≤ 1, and denotes the strength of the linear association between x and y.
• The coefficient of determination represents the proportion of the total variation in y that is explained by the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.

• The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passed exactly through every point on the scatter plot, it would be able to explain all of the variation. The further the line is from the points, the less it is able to explain.
Correlation Example
EXAMPLE: 1
The following sample populations represent a perfect positive linear correlation.
X = [-8.1, 1.0, -14.3, 4.2, -10.1, 4.3, 6.3, 5.0, 15.1, -2.2]
Y = [-9.8, -0.7, -16.0, 2.5, -11.8, 2.6, 4.6, 3.3, 13.4, -3.9]
Compute the correlation coefficient of X and Y.
CORRELATE(X, Y)
R=1.00000
EXAMPLE: 2
The following sample populations represent a high negative linear correlation.
X = [1.8, -2.7, 0.7, -0.5, -1.3, -0.9, 0.6, -1.5, 2.5, 3.0]
Y = [-4.7, 9.8, -3.7, 2.8, 5.1, 3.9, -3.6, 5.8, -7.3, -7.4]
Compute the correlation coefficient of X and Y:
CORRELATE(X, Y)
R = -0.979907
EXAMPLE: 3
The following sample populations represent a poor linear correlation.
X = [-1.8, 0.1, -0.1, 1.9, 0.5, 1.1, 1.9, 0.3, -0.2, -1.0]
Y = [1.5, -1.0, -0.6, 1.1, 0.7, -0.7, 1.1, -0.1, 0.6, -0.1]
Compute the correlation coefficient of X and Y:
CORRELATE(X, Y)
r=0.0322859
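The CORRELATE() calls above come from the software used for these examples; as a rough equivalent, the same three data sets can be checked with NumPy as sketched below (np.corrcoef is assumed to be an acceptable way of obtaining Pearson's r and, from it, r²).

import numpy as np

x1 = np.array([-8.1, 1.0, -14.3, 4.2, -10.1, 4.3, 6.3, 5.0, 15.1, -2.2])
y1 = np.array([-9.8, -0.7, -16.0, 2.5, -11.8, 2.6, 4.6, 3.3, 13.4, -3.9])

x2 = np.array([1.8, -2.7, 0.7, -0.5, -1.3, -0.9, 0.6, -1.5, 2.5, 3.0])
y2 = np.array([-4.7, 9.8, -3.7, 2.8, 5.1, 3.9, -3.6, 5.8, -7.3, -7.4])

x3 = np.array([-1.8, 0.1, -0.1, 1.9, 0.5, 1.1, 1.9, 0.3, -0.2, -1.0])
y3 = np.array([1.5, -1.0, -0.6, 1.1, 0.7, -0.7, 1.1, -0.1, 0.6, -0.1])

for name, x, y in [("Example 1", x1, y1), ("Example 2", x2, y2), ("Example 3", x3, y3)]:
    r = np.corrcoef(x, y)[0, 1]   # off-diagonal entry of the 2x2 correlation matrix
    print(f"{name}: r = {r:.4f}, r^2 = {r**2:.4f}")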
SCATTER PLOT
The Scatter Diagram is a tool for determining the potential correlation between two different
sets of variables, i.e., how one variable changes with the other variable. This diagram simply
plots pairs of corresponding data from two variables, which are usually two variables in a
process being studied. The scatter diagram does not determine the exact relationship between
the two variables, but it does indicate whether they are correlated or not. It, by itself, also does
not predict cause and effect relationships between these variables.
Why is it used?
1) To quickly confirm a hypothesis that two variables are correlated.
2) To provide a graphical representation of the strength of the relationship between two variables.
3) To serve as a follow-up step to a cause-effect analysis, to establish whether a change in an identified cause can indeed produce a change in its identified effect.
To make a scatter diagram for two variables requiring confirmation of correlation, the
following simple steps are usually followed:
1) Collect pairs of data for the two variables and tabulate them;
2) Draw the x- and y-axes of the diagram, along with the scales that increase to the right for the
x-axis and upward for the y-axis;
3) Assign the data for one variable to the x-axis (the independent variable) and the data for the other variable to the y-axis (the dependent variable);
4) Plot the data pairs on the scatter diagram, encircling (as many times as necessary) all data
points that are repeated.
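A minimal Python sketch of these steps is given below, using matplotlib and invented data (the variable names and values are purely illustrative, not from the text).

import matplotlib.pyplot as plt

temperature = [18, 20, 22, 24, 26, 28, 30]    # hypothetical independent variable
sales = [110, 118, 131, 140, 152, 160, 171]   # hypothetical dependent variable

plt.scatter(temperature, sales)               # plot the data pairs
plt.xlabel("Temperature (independent variable)")
plt.ylabel("Sales (dependent variable)")
plt.title("Scatter diagram")
plt.show()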
Interpretation of the resulting scatter diagram is as simple as looking at the pattern formed by
the points. If the data points plotted on the scatter diagram are all over the place with no
discernible pattern whatsoever, then there is no correlation at all between the two variables of
the scatter diagram. An example of a scatter diagram that shows no correlation is shown in
Figure 1.
Figure 1: A Scatter Diagram showing no correlation
There is positive correlation between two sets of data if an increase in the x-value results in an
increase in the y-value. Figure 2a shows a scatter diagram that exhibits positive correlation.
Note that in such a correlation, the data points constitute a perceivable diagonal line that goes
from the lower left to the upper right corner.
Not all sets of data pairs will exhibit a strong positive correlation, even if an increase in the x-value generally results in an increase in the y-value.
An example of this 'weak' type of positive correlation is shown in the scatter diagram of Figure
2b, which is said to exhibit just a 'possible positive correlation.' This scatter diagram still shows
a perceivable diagonal line going in the upper right direction, but the points are more spread
apart than in a scatter diagram with strong positive correlation.
Figure 2: Scatter Diagrams showing positive correlation (a, left) and just a possible positive
correlation (b, right)
If the scatter diagram formed also shows a perceivable diagonal line, but the line is going in a
direction opposite that of positive correlation (i.e., from the upper left to the lower right corner)
as shown in Figure 3a, then the data pairs are exhibiting negative correlation. This means that y
decreases as x increases. Again, the negative correlation is strong if the line formed by the data points is narrow and well defined.
If the negative correlation is not strong, resulting in data points that are not closely packed
together, then there is just a 'possible negative correlation.' An example of a scatter diagram for
such type of correlation is shown in Figure 3b.
Figure 3: Scatter Diagrams showing negative correlation (a, left) and just a possible negative
correlation (b, right)
Determining the exact nature of correlation between variables can lead to benefits. These
include:
1) Better understanding of cause-effect relationships.
2) Reduction of data gathering requirements.
3) Establishment of more effective process controls.
4) Easier development of check-and-balance schemes, etc.
SIMPLE LINEAR REGRESSION
When you think of regression, you think of prediction. A regression uses the historical
relationship between an independent and a dependent variable to predict the future values of the
dependent variable. When one independent variable is used in a regression, it is called a simple
regression; when two or more independent variables are used, it is called a multiple regression.
As you can see, there are several different classes of regression procedures, with each having
varying degrees of complexity and explanatory power. The most basic type of regression is that
of simple linear regression. A simple linear regression uses only one independent variable, and
it describes the relationship between the independent variable and dependent variable as a
straight line
Regression Model:
In simple linear regression, the model used to describe the relationship between a single
dependent variable y and a single independent variable x is
y = a0 + a1x + ε
a0 and a1 are referred to as the model parameters, and ε is a probabilistic error term that accounts for the variability in y that cannot be explained by the linear relationship with x. If the error term were not present, the model would be deterministic; in that case, knowledge of the value of x would be sufficient to determine the value of y.
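As a rough illustration of fitting this model, the sketch below uses NumPy's polyfit as the least squares routine on invented data; it is a sketch only, not the text's own example.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical independent variable
y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])   # hypothetical dependent variable

a1, a0 = np.polyfit(x, y, deg=1)    # slope and intercept of the least squares line
residuals = y - (a0 + a1 * x)       # estimates of the error term for each observation

print(f"estimated model: y = {a0:.3f} + {a1:.3f} x")
print("residuals:", np.round(residuals, 3))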
Least squares method.
Either a simple or a multiple regression model is initially posed as a hypothesis concerning the relationship between the dependent and independent variables. The least squares method is the most widely used procedure for developing estimates of the model parameters.
The following table lists the monthly sales and advertising expenditures for all of last year by a
digital electronics company.
In this case, you would plot last year's data for monthly sales and advertising expenditures as
shown on the scatter plot below. (Data for independent and dependent variables must be from the
same period of time.)
Scatter plots are effective in visually identifying relationships between variables. These
relationships can be expressed mathematically in terms of a correlation coefficient, which is
commonly referred to as a correlation. Correlations are indices of the strength of the relationship
between two variables. They can be any value from –1 to +1. (Correlations are covered in greater
detail in the Covariance and Correlation topic of this section.)
When you use regression to predict future values of the dependent variable, the ideal correlation
between the independent and dependent variable is high—in absolute value terms, somewhere in
the range between 0.5 - 0.99. Viewing the scatter plot above, you can see that there appears to be
some degree of correlation between the level of advertising expenditure and product awareness.
When calculated, this correlation equals .89. This historical data will enable you to predict the
relationship between the two variables in the future, before any further expense is incurred. In
order to make these predictions, a regression line must be drawn from the information appearing
in the scatter plot.
Regression Line
The figure below is the same as the scatter plot above, with the addition of a regression line fitted
to the historical data.
The regression line is the line with the smallest possible set of distances between itself and each
data point. As you can see, the regression line touches some data points, but not others. The
distances of the data points from the regression line are called error terms.
A regression line will always contain error terms because, in reality, independent variables are
never perfect predictors of the dependent variables. There are many uncontrollable factors in the
business world. The error term exists because a regression model can never include all possible
variables; some predictive capacity will always be absent, particularly in simple regression.
The typical procedure for finding the line of best fit is called the least-squares method. This
calculation is usually performed using computer software. In this calculation, the best fit is found
by taking the difference between each data point and the line, squaring each difference, and
adding the values together. The least-squares method is based upon the principle that the sum of
the squared errors should be made as small as possible so the regression line has the least error.
Once this line is determined, it can be extended beyond the historical data to predict future levels
of product awareness, given a particular level of advertising expenditure.
The extension of the line of regression requires the assumption that the underlying process
causing the relationship between the two variables is valid beyond the range of the sample data.
Regression is a powerful business tool due to its ability to predict future relationships between
variables such as these.
When you run a regression in Excel or in a statistics program, the program will provide you with
a report. The details of these reports, and the definition of all the terms included in the report, are
beyond the scope of the course.
Equation of a Regression Line
You may recall the equation of a straight line from your review of the Linear Functions topic in
the Algebra section of this course.
Variables, constants, and coefficients are represented in the equation of a line as f(x) = mx + b, where:
• x represents the independent variable.
• f(x) represents the dependent variable.
• The constant b denotes the y-intercept; this will be the value of the dependent variable if the independent variable is equal to zero.
• The coefficient m describes the movement in the dependent variable as a result of a given movement in the independent variable.
In finance, linear regressions are commonly used to describe the returns of an individual security
(dependent variable) compared to the returns of the market in general (independent variable).
The equation for the simple linear regressions used to describe security movements is also a
straight line and is expressed in a format, which, while similar, does contain a couple of twists.
The equation below is a regression equation for a straight line describing the relationship between the returns of security i and the market in general.

r_i = α + β·r_m + ε_i, where:
• r_i represents the return of security i and is the dependent variable;
• r_m represents the return of the market in general and is the independent variable;
• β, the slope of the regression line, describes the level of movement in security i as a result of a unit of movement in the market in general;
• α is the y-intercept of the regression line;
• ε_i is an error term that describes the distance between an actual data point and the corresponding point on the regression line.
The graph below provides a visual depiction of this regression line. The returns of the market in
general are represented in this graph by the returns of the S&P 500—a common surrogate for
market returns.
You may be familiar with discussions in financial circles about the beta (β) of a security being a
measure of the security's risk. The risk measure of beta is calculated using regression techniques.
Beta, the slope of the regression line, was described above as the level of movement in the
returns of a given security for each unit of movement in the market in general. A security with a
high beta is considered risky and will experience big swings in its returns as compared to those
of the market. A security with a low beta is considered less risky and will have returns that
fluctuate less than those of the market. The alpha term (α) in the regression equation of a security
represents the security's propensity to move independent of the market. The alpha and beta of a
security cannot be observed directly but are estimated, based on the past performance of a
security, through regression analysis.
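A minimal sketch of estimating alpha and beta from return data is shown below; the return series are invented, and np.polyfit is used simply as a convenient least squares routine, not as the method of any particular data provider.

import numpy as np

market_returns = np.array([0.01, -0.02, 0.015, 0.03, -0.01, 0.02])      # hypothetical r_m
security_returns = np.array([0.012, -0.03, 0.02, 0.045, -0.012, 0.03])  # hypothetical r_i

beta, alpha = np.polyfit(market_returns, security_returns, deg=1)

print(f"beta (slope)      = {beta:.3f}")    # sensitivity to market movements
print(f"alpha (intercept) = {alpha:.4f}")   # return independent of the market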
Example
The example data in Table 1 are plotted in Figure 1. You can see that there is a positive
relationship between X and Y. If you were going to predict Y from X, the higher the value of X,
the higher your prediction of Y
Table 1: Example data
X      Y
1.00   1.00
2.00   2.00
3.00   1.30
4.00   3.75
5.00   2.25
Figure 1: A scatter plot of the example data.
Linear regression consists of finding the best-fitting straight line through the points. The
best-fitting line is called a regression line. The black diagonal line in Figure 2 is the regression
line and consists of the predicted score on Y for each possible value of X. The vertical lines from
the points to the regression line represent the errors of prediction. As you can see, the red point is
very near the regression line; its error of prediction is small. By contrast, the yellow point is
much higher than the regression line and therefore its error of prediction is large.
Figure 2: A scatter plot of the example data. The black line consists of the predictions, the points are the actual data, and the vertical lines between the points and the black line represent errors of prediction. The error of prediction for a point is the value of the point minus the predicted value (the value on the line).
Table 2 shows the predicted values (Y') and the errors of prediction (Y − Y'). For example, the first point has a Y of 1.00 and a predicted Y (called Y') of 1.21. Therefore, its error of prediction is −0.21.
Table 2: Example data
X      Y      Y'      Y−Y'     (Y−Y')²
1.00   1.00   1.210   −0.210   0.044
2.00   2.00   1.635    0.365   0.133
3.00   1.30   2.060   −0.760   0.578
4.00   3.75   2.485    1.265   1.600
5.00   2.25   2.910   −0.660   0.436
You may have noticed that we did not specify what is meant by "best-fitting line". By far, the
most commonly-used criterion for the best-fitting line is the line that minimizes the sum of the
squared errors of prediction. That is the criterion that was used to find the line in Figure 2. The
last column in Table 2 shows the squared errors of prediction. The sum of the squared errors of
prediction shown in Table 2 is lower than it would be for any other regression line.
The formula for a regression line is Y' = b(X) + A
Where Y' is the predicted score, b is the slope of the line, and A is the Y intercept. The equation
for the line in Figure 2 is
Y' = 0.425X + 0.785
For X = 1,
Y' = (0.425) (1) + 0.785 = 1.21.
For X = 2,
Y' = (0.425) (2) + 0.785 = 1.64.
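The line reported above can be checked directly; the short sketch below recomputes the predicted values and the sum of squared errors for the data of Table 1.

X = [1.00, 2.00, 3.00, 4.00, 5.00]
Y = [1.00, 2.00, 1.30, 3.75, 2.25]

b, A = 0.425, 0.785   # slope and intercept of the regression line above
for x, y in zip(X, Y):
    y_pred = b * x + A
    print(f"X={x:.2f}  Y={y:.2f}  Y'={y_pred:.3f}  error={y - y_pred:+.3f}")

sse = sum((y - (b * x + A)) ** 2 for x, y in zip(X, Y))
print(f"sum of squared errors = {sse:.3f}")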
Example
Thus we can derive from the data in table below:
From this we get that the slope b which minimizes S², the least squares estimate, is given by
b = ∑(x − x̄)(y − ȳ) / ∑(x − x̄)²
and the intercept by
a = ȳ − b x̄.
This is convenient because we have already calculated all the components of these equations in the calculation of the correlation coefficient. Applying those figures to the formulae for the regression coefficients, we have:
Therefore, in this case, the equation for the regression of y on x becomes
y = 1.033x − 82.4.
This means that, on average, for every increase in height of 1 cm the increase in anatomical dead space is 1.033 ml over the range of measurements made.
The line representing the equation is shown superimposed on the scatter diagram of the data in
figure shown above. The way to draw the line is to take three values of x, one on the left side of
the scatter diagram, one in the middle and one on the right, and substitute these in the equation,
as follows:
If x = 110, y = (1.033 x 110) - 82.4 = 31.2
If x = 140, y = (1.033 x 140) - 82.4 = 62.2
If x = 170, y = (1.033 x 170) - 82.4 = 93.2
Example
Suppose we measured the height and weight of a random sample of adults in shopping malls in
the U.S. We want to predict weight from height in the population.
Table 2.1
Ht    Wt
61    105
62    120
63    120
65    160
65    120
68    145
69    175
70    160
72    185
75    210
N = 10
Mean:                     67      150
Variance (S²):            20.89   1155.5
Standard Deviation (S):   4.57    33.99
Correlation (r) = .94
It is customary to talk about the regression of Y on X, hence the regression of weight on height in our example. The regression equation for our example is Y = −316.86 + 6.97X, where −316.86 is the intercept (a) and 6.97 is the slope (b).
We could also write this as weight = −316.86 + 6.97(height). The slope value means that for each additional inch of height, we expect weight to increase by approximately 7 pounds (this describes differences between people of different heights, not change within a person).
The intercept is the value of Y that we expect when X is zero. So a person 0 inches tall would be predicted to weigh −316.86 pounds. Of course we do not find people who are zero inches tall, and we do not find people with negative weight. It is often the case in psychology that the value of the intercept has no meaningful interpretation.
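A brief sketch of the same regression in Python is shown below; np.polyfit is assumed to be an acceptable least squares routine, and the prediction at 70 inches is included only as a usage example.

import numpy as np

height = np.array([61, 62, 63, 65, 65, 68, 69, 70, 72, 75])
weight = np.array([105, 120, 120, 160, 120, 145, 175, 160, 185, 210])

b, a = np.polyfit(height, weight, deg=1)   # slope and intercept
print(f"weight = {a:.2f} + {b:.2f} * height")          # about -316.9 + 6.97 * height
print(f"predicted weight at 70 inches: {a + b * 70:.1f} lb")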
STATISTICAL INFERENCE - Correlation
Hypothesis testing and confidence intervals
Formulas:
s_y.x = √[ ∑(y_i − ŷ_i)² / (n − 2) ]
S.E.(b) = s_y.x / √[ ∑(x_i − x̄)² ]
Suppose that we took 7 mice and measured their body weight and their length from nose to tail.
We obtained the following results and want to know if there is any relationship between the
measured variables. [To keep the calculations simple, we will use small numbers]
Mouse   Units of weight (x)   Units of length (y)
1       1                     2
2       4                     5
3       3                     8
4       4                     12
5       8                     14
6       9                     19
7       8                     22
Procedure
(1) Plot the results on graph paper. This is the essential first step, because only then can we see
what the relationship might be - is it linear, logarithmic, sigmoid, etc?
In our case the relationship seems to be linear, so we will continue on that assumption. If it does
not seem to be linear we might need to transform the data.
(2) Set out a table as follows and calculate ∑x, ∑y, ∑x², ∑y², ∑xy, x̄ (mean of x) and ȳ (mean of y).
          Weight (x)   Length (y)   x²     y²      xy
Mouse 1   1            2            1      4       2
Mouse 2   4            5            16     25      20
Mouse 3   3            8            9      64      24
Mouse 4   4            12           16     144     48
Mouse 5   8            14           64     196     112
Mouse 6   9            19           81     361     152
Mouse 7   8            22           64     484     176
Total     ∑x = 37      ∑y = 82      ∑x² = 251   ∑y² = 1278   ∑xy = 553
Mean      x̄ = 5.286    ȳ = 11.714

(3) Calculate ∑(x − x̄)² = ∑x² − (∑x)²/n = 55.429 in our case.
(4) Calculate ∑(y − ȳ)² = ∑y² − (∑y)²/n = 317.429 in our case.
(5) Calculate ∑(x − x̄)(y − ȳ) = ∑xy − (∑x)(∑y)/n (this can be positive or negative) = 119.571.
(6) Calculate r (correlation coefficient):
r = ∑(x − x̄)(y − ȳ) / √[ ∑(x − x̄)² · ∑(y − ȳ)² ] = 0.9014 in our case.
(7) Look up r in a table of correlation coefficients (ignoring + or - sign). The number of degrees
of freedom is two less than the number of points on the graph (5 df in our example because we
have 7 points). If our calculated r value exceeds the tabulated value at p = 0.05 then the
correlation is significant. Our calculated value (0.9014) does exceed the tabulated value (0.754).
It also exceeds the tabulated value for p = 0.01 but not for p = 0.001. If the null hypothesis were true (that there is no relationship between length and weight), we would obtain a correlation coefficient as high as this less than 1 time in 100. So we can be confident that weight and length are positively correlated in our sample of mice.
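The same calculation can be checked with SciPy, which returns both r and a p-value, so the table lookup at 5 degrees of freedom is handled automatically; a minimal sketch follows.

from scipy.stats import pearsonr

weight = [1, 4, 3, 4, 8, 9, 8]
length = [2, 5, 8, 12, 14, 19, 22]

r, p_value = pearsonr(weight, length)
print(f"r = {r:.4f}, p-value = {p_value:.4f}")   # r is about 0.90, p < 0.01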
Suppose that we had the following results from an experiment in which we measured the growth
of a cell culture (as optical density) at different pH levels.
pH     Optical density
3      0.10
4      0.20
4.5    0.25
5      0.32
5.5    0.33
6      0.35
6.5    0.47
7      0.49
7.5    0.53
We plot these results (see below) and they suggest a straight-line relationship.
Using the same procedures as for correlation, set out a table as follows and calculate ∑x, ∑y, ∑x², ∑y², ∑xy, x̄ (mean of x) and ȳ (mean of y).

pH (x)   Optical density (y)   x²      y²       xy
3        0.10                  9       0.01     0.3
4        0.20                  16      0.04     0.8
4.5      0.25                  20.25   0.0625   1.125
5        0.32                  25      0.1024   1.6
5.5      0.33                  30.25   0.1089   1.815
6        0.35                  36      0.1225   2.1
6.5      0.47                  42.25   0.2209   3.055
7        0.49                  49      0.2401   3.43
7.5      0.53                  56.25   0.2809   3.975
Total    ∑x = 49   ∑y = 3.04   ∑x² = 284   ∑y² = 1.1882   ∑xy = 18.2
Mean     x̄ = 5.444   ȳ = 0.3378

Now calculate ∑(x − x̄)² = ∑x² − (∑x)²/n = 17.22 in our case.
Calculate ∑(y − ȳ)² = ∑y² − (∑y)²/n = 0.1614 in our case.
Calculate ∑(x − x̄)(y − ȳ) = ∑xy − (∑x)(∑y)/n (this can be positive or negative) = +1.649.
Now we want to use regression analysis to find the line of best fit to the data. We have done nearly all the work for this in the calculations above.
The regression equation for y on x is y = a + bx, where b is the slope and a is the intercept (the point where the line crosses the y axis).
We calculate b as:
b = ∑(x − x̄)(y − ȳ) / ∑(x − x̄)² = 1.649 / 17.22 = 0.0958 in our case.
We calculate a as:
a = ȳ − b x̄.
From the known values of ȳ (0.3378), x̄ (5.444) and b (0.0958) we thus find a = −0.1837.
So the equation for the line of best fit is: y = 0.096x - 0.184 (to 3 decimal places).
To draw the line through the data points, we substitute into this equation. For example:
When x = 4, y = (0.096 × 4) − 0.184 = 0.200, so one point on the line has the (x, y) coordinates (4, 0.200);
When x = 7, y = (0.096 × 7) − 0.184 = 0.488, so another point on the line has the (x, y) coordinates (7, 0.488).
It is also true that the line of best fit always passes through the point with co-ordinates (x̄, ȳ), so we actually need only one other calculated point in order to draw a straight line.
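A short sketch of the same calculation in Python follows. Note that the table above rounds ∑x² to 284, so the hand-calculated slope (0.0958) differs slightly from the value this code produces from the raw data.

import numpy as np

pH = np.array([3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5])
od = np.array([0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53])

sxy = np.sum((pH - pH.mean()) * (od - od.mean()))   # sum of products about the means
sxx = np.sum((pH - pH.mean()) ** 2)                 # sum of squares of x about its mean

b = sxy / sxx                    # slope of the line of best fit
a = od.mean() - b * pH.mean()    # intercept
print(f"b = {b:.4f}, a = {a:.4f}")   # compare with b = 0.0958, a = -0.1837 above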
EXAMPLE
Question?
Solution
Confidence Interval for α:
Pr[ a − t_α/2(ν)·S.E.(a) ≤ α ≤ a + t_α/2(ν)·S.E.(a) ] = 1 − α
S.E.(a) = s_y.x · √( 1/n + x̄² / ∑(x_i − x̄)² )
        = 0.82702
Pr[ 5.82539 − 2.583 × 0.8270 ≤ α ≤ 5.82539 + 2.583 × 0.8270 ]
Pr[ 3.69 ≤ α ≤ 7.961 ]
Confidence Interval for β:
Pr[ b − t_α/2(ν)·S.E.(b) ≤ β ≤ b + t_α/2(ν)·S.E.(b) ] = 1 − α
S.E.(b) = s_y.x / √∑(x_i − x̄)²
= 2.58072 / √19828
= 2.58072 / 140.8119
= 0.0183
Pr[ 0.5676 − 2.583 × 0.0183 ≤ β ≤ 0.5676 + 2.583 × 0.0183 ]
Pr[ 0.5676 − 0.0473 ≤ β ≤ 0.5676 + 0.0473 ]
Pr[ 0.5203 ≤ β ≤ 0.6149 ]
Hypothesis Testing Of α:
H0: α = 0
H1: α ≠ 0
α=5%
t = (a − α) / S.E.(a)
= (5.8253 − 0) / 0.82702
= 7.0437
Reject H0 if |t_cal| ≥ t_tab, where t_tab = t_α/2(ν) = 2.583.
Since |7.0437| ≥ 2.583, reject H0.
Hypothesis testing of β:
H0: β = 0
H1: β ≠ 0
α= 5%
t = (b − β) / S.E.(b)
= (0.5676 − 0) / 0.0183
= 31.01
Reject H0 if |t_cal| ≥ t_tab, where t_tab = t_α/2(ν) = 2.583.
Since |31.01| ≥ 2.583, reject H0.
NON PARAMETRIC TESTING
INTRODUCTION
Until now, everything that we have done in statistics has been based on the assumption that the data are normally distributed. But sometimes we do not know the true distribution of the data. In such cases, we use non parametric tests.
A non parametric test is one that makes no assumption about the specific shape of the
population from which a sample is drawn.
It is also called a distribution free test.
When we know what distribution we are dealing with, it is much more practical and useful to use a particular test designed for that specific purpose. For example, if we are dealing with a normal distribution, we can use a z, t or F test for performing inferences about parameters. But when we do not know about the population, it is better to use a test that works for any kind of distribution. In this way, we will be prepared for any condition.
A non parametric test should be used instead of its parametric counterpart whenever:
1. Data are of the nominal or ordinal scales of measurement;
2. There are definite outliers; or
3. Data are of interval or ratio scale measurement but one or more other assumptions, such as the normality of the underlying population distribution, are not met.
When our data is normally distributed, the mean is equal to the median and we use the mean as
our measure of center. However, if our data is skewed, then the median is a much better measure
of center. Therefore, just like the Z, t and F tests made inferences about the population mean(s),
nonparametric tests make inferences about the population median(s).
Advantages of Non parametric Testing
1. Fewer assumptions about the population. Non parametric tests do not assume the
population has any specific distribution.
2. The techniques can be applied when sample sizes are very small.
3. Samples with data of the nominal and ordinal scales of measurement can be tested.
4. In most cases, computations are easier than its parametric counterpart.
Disadvantages of Non parametric Testing
1. Compared to a parametric test, the information in the data is used less efficiently, and the
power of the test will be lower.
2. They are less sensitive than their parametric counterparts when the assumptions of the parametric method are met. Therefore, larger differences are needed before rejecting the null hypothesis.
Number of samples                      Non-parametric test                         Parametric test
One                                    Wilcoxon signed rank test, one sample       t-test, z-test, one sample
Two (dependent samples)                Wilcoxon signed rank test, paired samples   t-test, z-test, paired samples
Two (independent samples)              Wilcoxon rank sum test                      t-test, z-test, two independent samples
More than two (dependent samples)      Friedman test                               Randomized block analysis of variance
More than two (independent samples)    Kruskal-Wallis test                         One-way analysis of variance
THE RUNS TEST FOR RANDOMNESS
The runs test evaluates the randomness of a series of observations by analyzing the number of
runs it contains. A run is the consecutive appearance of one or more observations that are similar.
Assumptions
1. The sample data are arranged according to some scheme (such as time series).
2. The data falls into two separate categories (such as above and below a specific value).
3. The runs test is based on the order in which the data occur; not on the frequency of the data.
PROCEDURE
• State the null and alternate hypotheses:
  Ho: the sequence is random
  H1: the sequence is not random
• For nominal data with two categories:
  1. Determine n1 and n2, the number of observations of each type.
  2. Count the number of runs, T.
• For ordinal, interval or ratio data:
  1. Determine the median, m, of the values.
  2. Identify each data value with a (+) if x ≥ m and with a (−) if x < m.
  3. Determine n1 and n2, the number of (+) and (−) observations.
  4. Count the number of runs, T.
• Test statistic:
  z = [ T − (2·n1·n2/n + 1) ] / √[ 2·n1·n2·(2·n1·n2 − n) / (n²·(n − 1)) ]
Where
T= the number of runs
n1= the number of observations of the first type
n2= the number of observations of the second type
n= the total number of observations, n1+n2

• Critical region
• Conclusion
EXAMPLE
A political activist claims to have "randomly" stopped persons at a street corner and asked them to sign his petition and give their age. During his first half hour on the street, 30 people signed the document and gave their age, in the order shown below.
The ages of the signers, in the order in which they signed:
30  33  15  59  35  29  68  69  38  43
15  36  35  30  61  74  56  47  68  18
22  12  58  45  65  64  49  38  58  45
At the 0.05 LOS, evaluate the randomness of the ages for this sequence of 30 respondents.
SOLUTION
Ho: the ages are in a random order
H1: the ages are not in a random order
α = 0.05
z = [ T − (2·n1·n2/n + 1) ] / √[ 2·n1·n2·(2·n1·n2 − n) / (n²·(n − 1)) ]
Age   Sign      Age   Sign
30    −         74    +
33    −         56    +
15    −         47    +
59    +         68    +
35    −         18    −
29    −         22    −
68    +         12    −
69    +         58    +
38    −         45    +
43    −         65    +
15    −         64    +
36    −         49    +
35    −         38    −
30    −         58    +
61    +         45    +
The age values shown in table have a median of 44 years. The ages are converted to (+) when
they are 44 or greater than 44.
Total number of runs = T = 10.
There are n1 = 15 and n2 = 15 and the total sample size is n1 + n2= 30
Substituting the values in the formula
Z = -2.23
For a two-tail test at the 0.05 LOS, the critical z values are −1.96 and +1.96. The observed z value is outside these limits, and the null hypothesis is rejected.
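A compact Python sketch of this runs test calculation is given below; the helper code is not from the text, just a direct translation of the z formula above.

import math

ages = [30, 33, 15, 59, 35, 29, 68, 69, 38, 43, 15, 36, 35, 30, 61,
        74, 56, 47, 68, 18, 22, 12, 58, 45, 65, 64, 49, 38, 58, 45]

s = sorted(ages)
median = (s[len(s) // 2 - 1] + s[len(s) // 2]) / 2   # 44 for these 30 ages
signs = ['+' if a >= median else '-' for a in ages]

n1 = signs.count('+')
n2 = signs.count('-')
T = 1 + sum(1 for i in range(1, len(signs)) if signs[i] != signs[i - 1])   # number of runs

n = n1 + n2
mean_T = 2 * n1 * n2 / n + 1
var_T = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
z = (T - mean_T) / math.sqrt(var_T)
print(f"n1 = {n1}, n2 = {n2}, runs T = {T}, z = {z:.2f}")   # about z = -2.23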
PROBLEMS
1. The News and Clarion kept a record of the gender of people who called the circulation office to complain about delivery problems with the Sunday paper. For a recent Sunday these data are as follows:
F  F  F  M  M  F  M  F  F  F  F  M  M  M
F  M  F  M  F  F  F  F  M  M  M  M  M  M
Using the 0.05 LOS, test this sequence for randomness.
SIGN TEST METHOD:
It is one of the simplest statistical tests, and it focuses on the median rather than the mean as a measure of central tendency. The only assumption made in performing the test is that the variables come from a continuous distribution. It is called the sign test because we use pluses and minuses as the new data in performing the calculations. We illustrate its use with a single sample and a paired sample. It is useful when we are not able to use the t test because the assumption of normality has been violated.
ONE SAMPLE:
To perform the test, we replace each observation by a plus sign or a minus sign depending on whether the observation is above or below Mo, the hypothesized value of the population median. We discard any observation that equals Mo and reduce the sample size accordingly. We denote the total number of plus and minus signs by n. The test statistic X is defined as the number of times the less frequent sign (plus or minus) occurs. Under the null hypothesis, the sampling distribution of X is binomial with parameters 1/2 and n. We determine the critical region by calculating the binomial probabilities. To reach the significance level α, we add the probabilities from both tails in the case of a two-tailed test; in the case of a one-tailed test, the probabilities in the desired tail are added to reach α. When the population is assumed to be symmetric, the null hypothesis may be stated as
Ho: μ = μo
TWO SAMPLE:
Let Xi and Yi denote the observations from the first sample and the second sample respectively. We replace the difference Xi − Yi by a plus sign if Xi > Yi and by a minus sign if Xi < Yi, and we ignore the pair if Xi = Yi, i.e., zero differences are dropped from the analysis. Let n represent the number of plus and minus signs and let X stand for the number of times the less frequent sign (plus or minus) occurs. Then the sampling distribution of X is binomial with parameters 1/2 and n. The rest of the procedure is the same as in the one-sample sign test.
In case the sample sizes are not equal, some of the values of the larger sample are to be discarded (the data must be from matched-pair samples). The two-sample sign test may be used to test the hypothesis Ho: μ1 = μ2 when the underlying populations are assumed to be symmetric.
With large n we use the normal approximation to the binomial distribution b(n, 1/2). The statistic X is then approximately normal with mean n/2 and standard deviation √(n/4). In other words, the test statistic under Ho becomes
z = (X − n/2) / √(n/4), without correction for continuity,
z = (X ± 1/2 − n/2) / √(n/4), with correction for continuity.
We reject or accept Ho by applying the usual decision rules. In applying the normal approximation to the binomial distribution, n is taken as large when both np and nq are at least 5. As p = 1/2, we can therefore use the normal approximation when n exceeds 10.
The procedure for testing the hypothesis that the population median has a specified value Mo, in
case of one sample, is given below:
Test of null hypothesis
Ho: population medians M=Mo
Ha: population median M ≠ Mo (or M > Mo, or M < Mo)
Significance level α.
The test statistic is X, the number of times the less frequent sign (+ or −) occurs, and it is binomially distributed.
If n, the number of pluses and minuses, exceeds 10, the test statistic is
z = (X − n/2) / √(n/4)
which, if Ho is true, is approximately standard normal.
Computations. Subtract Mo, the hypothesized value of the population median, from each observation of the sample, i.e., find the differences Xi − Mo. Write down a + sign if the difference is positive and a − sign if the difference is negative; ignore the zero differences, if any. Denote by n the number of + and − signs (i.e., non-zero differences) and by X the number of times the less frequent sign occurs. Compute either the extreme probabilities of the binomial variable X or the value of z, as the case may be. The critical region depends on the test statistic, the alternative hypothesis and the significance level α. Apply the usual decision rule to reject or accept the null hypothesis.
The procedure in the case of the two-sample sign test is the same except for the following two steps:
Ho: the two populations are identical, or they have equal medians, M1 = M2. It is tested against an appropriate alternative hypothesis.
Computation. Subtract each observation of the second sample, say Yi, from the corresponding observation of the first sample, say Xi, i.e., find the differences Xi − Yi. Write a plus sign if the difference is positive and a minus sign if the difference is negative. Discard zero differences, if any; and so on.
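A minimal Python sketch of the one-sample sign test follows; the data are invented for illustration, and scipy.stats.binom is assumed to be available for the binomial probabilities.

from scipy.stats import binom

data = [7.2, 8.1, 6.9, 7.8, 8.4, 7.0, 7.5, 8.2, 6.8, 8.0]   # hypothetical sample
M0 = 7.0                                                    # hypothesized median

diffs = [x - M0 for x in data if x != M0]   # discard observations equal to M0
n = len(diffs)
plus = sum(1 for d in diffs if d > 0)
minus = n - plus
X = min(plus, minus)   # the number of times the less frequent sign occurs

# two-tailed p-value from the binomial distribution B(n, 1/2)
p_value = min(1.0, 2 * binom.cdf(X, n, 0.5))
print(f"n = {n}, plus = {plus}, minus = {minus}, X = {X}, p = {p_value:.3f}")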
EXAMPLE: 1
You're an analyst for Chef-Boy-R-Dee. You've asked 7 people to rate a new ravioli on a 5-point Likert scale (1 = terrible to 5 = excellent). The ratings are: 2, 5, 3, 4, 1, 4, 5. At the .05 level, is there evidence that the median rating is less than 3?
Solution:
Ho: median = 3
Ha: median < 3
α = .05
S = 2 (ratings 1 and 2 are less than 3 in the data 2, 5, 3, 4, 1, 4, 5)
p-value = P(X ≥ 2) = 1 − P(X ≤ 1) = .937 (binomial table, n = 7, p = 0.50)
Reject Ho if the p-value ≤ α. Since .937 > .05, do not reject Ho.
Conclusion: there is no evidence that the median is less than 3.
EXAMPLE 2.
An experimenter wants to determine the effectiveness of a certain reducing diet. 12 persons were put on the diet; their weights before and after they tried the diet are shown below.

Person          1    2    3    4    5    6    7    8    9    10   11   12
Weight before   202  154  183  180  228  164  139  165  175  245  237  163
Weight after    195  154  178  199  220  157  135  180  108  106  227  155
Use the sign test at the 5% significance level to test the hypothesis that the diet is not effective against the alternative that it is effective.
Solution:
Ho: the diet is not effective, which is equivalent to testing the hypothesis Ho: P[+ sign] = P[− sign] = 1/2, and
Ha: the diet is effective, i.e., P > 1/2 (one-tailed test)
α = 0.05
X = the number of times the less frequent sign occurs. Under Ho, X is binomially distributed.
Calculation:
Subtracting the weights after from the weights before they tried the diet, and writing down a plus sign for each positive difference and a minus sign for each negative difference, we get
+ 0 + − + + + − − + + +
Thus n = 11, the total number of + and − signs (ignoring the zero difference), and X = 3, the number of minus signs (the less frequent sign). The probability of observing 3 or fewer minus signs is therefore
P(X ≤ 3) = [C(11,0) + C(11,1) + C(11,2) + C(11,3)] (1/2)^11 = (1 + 11 + 55 + 165)/2048 = 232/2048 = 0.113
Reject Ho if P ≤ 0.05.
Conclusion: since the computed probability 0.113 is greater than 0.05, we cannot reject Ho; the data do not provide sufficient evidence that the diet is effective.
PROBLEMS
1. The following data show employee rates of defective work before and after a change in the wage incentive plan. Compare the following two sets of data to see whether the change lowered the defective units produced. Use the 0.10 LOS.

Before   8   7   6   9   7   10   8   6   5   8   10   8
After    5   8   6   9   8   10   7   5   6   9   8    6
2. After collecting data on the amount of air pollution in Los Angeles, the Environmental Protection Agency decided to issue strict new rules to govern the amount of hydrocarbons in the air. For the next year, it took monthly measurements of this pollutant and compared them to the preceding year's measurements for corresponding months. Based on the following data, does the EPA have enough evidence to conclude with 95% confidence that the new rules were effective in lowering the amount of hydrocarbons in the air? To justify these laws for another year, it must conclude at α = 0.10 that they are effective. Will these laws still be effective next year?

Month       Jan   Feb   March   April   May   June   July
Last year   7.0   6.0   5.4     5.9     3.9   5.7    6.9
This year   5.3   6.1   5.6     5.7     3.7   4.7    6.7
WILCOXON SIGNED RANK TEST FOR ONE SAMPLE
For one sample, the Wilcoxon signed rank method tests whether the sample could have been drawn from a population having a hypothesized value as its median. It is the nonparametric equivalent of the one-sample z or t test. This procedure makes no assumption except that the sample we have is randomly taken from a population with a symmetric frequency distribution. The symmetry assumption does not imply normality, simply that there seem to be roughly the same number of values above and below the median.
Following are the steps of the Wilcoxon signed rank test:
• State the null and alternative hypotheses:
         Two-tail test   Left-tail test   Right-tail test
  Ho:    m = mo          m ≥ mo           m ≤ mo
  H1:    m ≠ mo          m < mo           m > mo
  where m = the population median and mo = a value that has been specified.
• Level of significance (α).
• Test statistic, W:
  - For each of the observed values, calculate di = xi − mo.
  - Ignoring observations where di = 0, rank the |di| values so the smallest |di| will have a rank of 1. If there are ties, assign each of the tied rows the average of the ranks they are occupying.
  - For observations where xi > mo, list the rank in the R+ column.
  - The test statistic is the sum of the R+ column, W = ∑R+.
• Critical value of W:
  The Wilcoxon signed rank table lists the lower and upper critical values for various levels of significance, with n = the number of observations for which di ≠ 0. The rejection region will be in either one or both tails, depending on the null hypothesis being tested.
• Conclusion
EXAMPLE
An environmental activist believes her community's drinking water contains at least the 40 ppm (parts per million) limit recommended by health officials for a certain metal. In response to her claim, the health department samples and analyzes drinking water from a sample of 11 households in the community. The results are residue levels of 39, 20.2, 40, 32.2, 30.5, 26.5, 42.1, 45.6, 42.1, 29.9, and 40.9 ppm. At the 0.05 LOS, can we conclude that the community's drinking water might equal or exceed the 40 ppm recommended limit?
SOLUTION
Specification of hypothesis:
Ho: m ≥40 ppm
H1: m < 40 ppm
LOS:
α= 0.05
Test statistics and calculations:
Observed
Household Concentration
di = xi –mo
| di |
Rank
R+
R-
xi
A
39
-1.0
1.0
2
2
B
20.2
-19.8
19.8
10
10
C
40
0
0
-
INFERENTIAL STATISTICS
197
NON PARAMETRIC TESTING
D
32.2
-7.8
7.8
6
6
E
30.5
-9.5
9.5
7
7
F
26.5
-13.5
13.5
9
9
G
42.1
2.1
2.1
3.5
3.5
H
45.6
5.6
5.6
5
5
I
42.1
2.1
2.1
3.5
3.5
J
29.9
-10.1
10.1
8
K
40.9
0.9
0.9
1
8
1
13.0
42.0
W= ∑R+ = 13.0
The critical value of W can be found from the table of critical values for the Wilcoxon signed
rank test. For n = 10 nonzero differences and α = 0.05 the critical value is 11. The test statistics,
W= 13 exceeds 11, and the null hypothesis cannot be rejected at the 0.05 level. At 0.05 level, we
are unable to reject the possibility that the city’s water supply might have at least 40 ppm of the
metal.
NOTE:
For n = the number of observations for which di ≠ 0, the Wilcoxon signed rank statistic will be W = 0 if all the di values are negative, and W = n(n+1)/2 if all the di values are positive.
THE NORMAL APPROXIMATION
When the number of observations for which di ≠ 0 is n ≥ 20, a z test will be a close
approximation to the Wilcoxon signed rank test. This is possible because the W distribution
approaches a normal curve as n becomes larger. The format for this approximation is shown
below:
z = [ W − n(n+1)/4 ] / √[ n(n+1)(2n+1) / 24 ]
Where W = sum of R+ ranks
n = the number of observations for which di ≠ 0
Our example has only n = 10 nonzero differences so this approximation will be rougher than if n
were 20 or larger.
Substituting the values of W and n into this expression, we get z = -1.48. The critical value of z
will be -1.645 for a left tail test. Since the calculated value of z lies to the right of this critical
value, the null hypothesis cannot be rejected
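The drinking-water example can also be checked with SciPy's wilcoxon function, as sketched below; its reporting conventions differ slightly from the table method above, but with alternative="less" the statistic it reports should match the W = 13 found earlier.

from scipy.stats import wilcoxon

residue = [39, 20.2, 40, 32.2, 30.5, 26.5, 42.1, 45.6, 42.1, 29.9, 40.9]
diffs = [x - 40 for x in residue]   # di = xi - 40 ppm; household C gives a zero

stat, p_value = wilcoxon(diffs, zero_method="wilcox", alternative="less")
print(f"W = {stat}, one-tailed p-value = {p_value:.3f}")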
PROBLEMS
1. According to the director of a country tourist bureau, there is a median of 10 hours of
sunshine per day during the summer months. For a random sample of 20 days during the
past three summers, the number of hours of sunshine has been recorded as shown below.
Use the 0.05 levels in evaluating the director’s claim.
8  9  8  10  9  7  7  9  7  7
9  8  11  9  10  7  8  11  8  12 hours
2. Use the Wilcoxon signed rank test for the following randomly selected observations in
examining whether the population median could be greater than 37.0. LOS is 0.01.
34.6  40.0  33.8  47.7  41.4  40.2  47.0
39.5  36.1  48.1  39.1  45.0  45.7  46.6
WILCOXON SIGNED RANK TEST FOR COMPARING PAIRED
SAMPLES
The Wilcoxon signed rank test for paired samples is the non parametric equivalent of the paired sample z or t test. It is used when we want to make inferences about the median difference between the two populations.
This test too assumes that the data are continuous and of the interval or ratio scales of
measurement. The population of d values is assumed to be symmetric or nearly so, but need not
be normally distributed or have any other specific shape. In this application, the measurement of
interest is the difference between paired observations i.e., di = xi -yi
Applying the Wilcoxon signed rank test to paired sample is nearly identical to its use of one
sample.
The steps are as below:
• State the null and alternative hypotheses:
         Two-tail test   Left-tail test   Right-tail test
  Ho:    md = 0          md ≥ 0           md ≤ 0
  H1:    md ≠ 0          md < 0           md > 0
  where md = the population median of di = xi − yi.
• Determine the level of significance.
• Test statistic, W:
  - For each of the matched pair values, calculate the difference between the two responses, di = xi − yi.
  - List the set of n absolute differences.
  - Rank the absolute values of the differences.
  - Group all the positive differences and the negative differences separately (under R+ and R−).
  - The sum of the ranks of positive differences, R+, is the Wilcoxon signed rank test statistic.
• Critical value of W
• Conclusion
An example will help to give a clearer picture of this application.
EXAMPLE
An athletics coach wishes to test the value to his athletes of an intensive period of weight training, and so he selects twelve 400 m runners from his region and records their times, in seconds, to complete this distance. They then undergo his plan of weight training and have their times for 400 m measured again. The table below summarizes the results.
Athlete   A     B     C     D     E     F     G     H     I     J     K     L
Before    51.0  49.8  49.5  50.1  51.6  48.9  52.4  50.6  53.1  48.6  52.9  53.4
After     50.6  50.4  48.9  49.1  51.6  47.6  53.5  49.9  51.0  48.5  50.6  51.7
Use the Wilcoxon signed rank test to investigate the hypothesis that the training program will
significantly improve athletes’ time for the 400 meters.
SOLUTION
Ho: md = 0 (the population median of di = xi -yi is 0 i.e. training programme has no effect)
H1: md> 0 (the population median of di = xi -yiis greater than 0 i.e. training programme
improves time)
Level of significance = 0.05
Athlete   Before   After   di     |di|   Rank   R+     R−
A         51.0     50.6    0.4    0.4    2      2
B         49.8     50.4    −0.6   0.6    3.5           3.5
C         49.5     48.9    0.6    0.6    3.5    3.5
D         50.1     49.1    1.0    1.0    6      6
E         51.6     51.6    0.0    0.0    −
F         48.9     47.6    1.3    1.3    8      8
G         52.4     53.5    −1.1   1.1    7             7
H         50.6     49.9    0.7    0.7    5      5
I         53.1     51.0    2.1    2.1    10     10
J         48.6     48.5    0.1    0.1    1      1
K         52.9     50.6    2.3    2.3    11     11
L         53.4     51.7    1.7    1.7    9      9
Sums                                            55.5   10.5
W= ∑R+ = 55.5
As the table shows, the difference is calculated for each pair of the values, the absolute values for
di are obtained, and then they are ranked. Finally the ranks associated with positive differences
are listed in the R+ column and added to get the observed value of the test statistic, W.
The test statistic is W = 55.5. For n= 11 nonzero differences and α = 0.05, the Wilcoxon signed
rank table gives lower and upper critical values for W of 14 and 52 respectively. The observed
value W = 55.5 falls outside these limits so the null hypothesis is rejected. Based on this data the
weight training programme does improve the athletes’ time for 400 m.
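A minimal sketch of the same analysis with SciPy is shown below; wilcoxon is applied to the paired before/after times with alternative="greater", which asks whether the before-minus-after differences tend to be positive (i.e., whether times improved).

from scipy.stats import wilcoxon

before = [51.0, 49.8, 49.5, 50.1, 51.6, 48.9, 52.4, 50.6, 53.1, 48.6, 52.9, 53.4]
after  = [50.6, 50.4, 48.9, 49.1, 51.6, 47.6, 53.5, 49.9, 51.0, 48.5, 50.6, 51.7]

stat, p_value = wilcoxon(before, after, zero_method="wilcox", alternative="greater")
print(f"W = {stat}, one-tailed p-value = {p_value:.3f}")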
PROBLEMS
1. Eight subjects were asked to perform a simple puzzle assembly under normal conditions
and under conditions of stress. During the stress condition the subjects were told that a
mild shock would be delivered 3 minutes after the start of the experiment and every 30
seconds thereafter until the task was completed. Blood pressure readings were taken
under both conditions. Data in the accompanying table represent the highest reading
during the experiment. Do the data present sufficient evidence to indicate higher blood
pressure readings during conditions of stress? Test at 0.01 level of significance.
Subject   1     2     3     4     5     6     7     8
Normal    126   117   115   118   118   128   125   120
Stress    130   118   125   120   121   125   130   120
2. The ministry of Defense is considering which of the two shoe leathers it should adopt for
its new army boot. They are particularly interested in how boots made from these leathers
wear, and so 15 soldiers are selected at random and each man wears one boot of each type. After six months the wear, in mm, for each boot is recorded as follows:
Soldier     1     2     3     4     5     6     7     8
Leather A   5.4   2.6   4.3   1.1   3.3   6.6   4.4   3.5
Leather B   4.7   3.2   3.8   2.3   3.6   7.2   4.4   3.9
3. Nine experts rated two brands of Colombian Coffee in a taste testing experiment. A
rating on a 7 point scale (1 = extremely unpleasing and 7 = extremely pleasing) is given
for each of four characteristics: taste, aroma, richness and acidity. The following table
displays the summated ratings- accumulated over all four characteristics.
Expert   Brand A   Brand B
C.C.     24        26
S.E.     27        27
E.G.     19        22
B.L.     24        27
C.M.     22        25
C.N.     26        27
G.N.     27        26
R.M.     25        27
P.V.     22        23
WILCOXON RANK SUM TEST FOR COMPARING TWO
INDEPENDENT SAMPLES
Wilcoxon signed rank test for matched pairs cannot be performed when either the two samples
are independent or the two samples have different sizes. For such situations, Wilcoxon rank sum
test is used. The Wilcoxon Rank Sum test is used to test for a difference between two samples.
It is the nonparametric counterpart to the two-sample Z or t test. Instead of comparing two
population means, we compare two population medians. It has the same assumptions as the
previous test.
INFERENTIAL STATISTICS
204
NON PARAMETRIC TESTING
This test is also known as Mann Whitney U Test.
Procedure for performing this test is:
• State the null and alternative hypotheses:
         Two-tail test   Left-tail test   Right-tail test
  Ho:    m1 = m2         m1 ≥ m2          m1 ≤ m2
  H1:    m1 ≠ m2         m1 < m2          m1 > m2
  where m1 and m2 are the population medians.
• Level of significance.
• Test statistic, W:
  - Select the smaller of the two samples as sample 1. In case the sample sizes are equal, either of the samples can be considered as sample 1.
  - Rank the combined data values as if they were from a single group. (The smallest data value gets the rank of 1, the next smallest 2, and so on.) If there is a tie between values, each tied value gets the average of the ranks the tied values occupy.
  - List the ranks for data values from sample 1 in the R1 column and the ranks for data values from sample 2 in the R2 column.
  - The calculated or observed value of the test statistic is W = ∑R1.
• Critical value of W:
  The Wilcoxon rank sum table lists lower and upper critical values for the test, with n1 and n2 as the number of observations in the respective samples. The rejection region will be in either one or both tails, depending on the null hypothesis being tested.
• Conclusion
EXAMPLE
In evaluating the flexibility of rubber tie-down lines, an inspector selects random samples of the straps and counts the number of 360 degree twists each will withstand before breaking. For 7 lines from one production lot, the number of turns before breaking is 112, 105, 83, 102, 144, 85 and 50. For ten lines from a second production lot, the number of turns before breaking is 91, 183, 141, 219, 105, 138, 146, 84, 134 and 106. At the 0.05 level, can it be concluded that the two production lots have the same median flexibility?
SOLUTION
Ho:m1 = m2
H1:m1 ≠ m2
α= 0.05
Sample 1   Rank      Sample 2   Rank
112        10        91         5
105        7.5       183        16
83         2         141        13
102        6         219        17
144        14        105        7.5
85         4         138        12
50         1         146        15
                     84         3
                     134        11
                     106        9
∑R1 = 44.5           ∑R2 = 108.5

W = ∑R1 = 44.5
The sum of rank of values from sample 1 is W = 44.5, the observed value of test
statistics. From the table, for n1 = 7 and n2= 10 at 0.05 level, the lower and upper critical
values of W are 43 and 83 respectively. Since the calculated value W= 44.5 lies within
the limits, the null hypothesis cannot be rejected at 0.05 level. Our conclusion is that the
median flexibility of the two lots could be the same.
The Normal Approximation:
When n1 ≥ 10 and n2 ≥ 10, a normal approximation to the Wilcoxon rank sum test can be
used. The format is as follows:
Z-test approximation to the Wilcoxon rank sum test:
Test statistic:
z = [ W − n1(n + 1)/2 ] / √[ n1·n2·(n + 1) / 12 ]
Although our example falls short of the n1 ≥ 10 and n2 ≥ 10 rule of thumb, it will be used
to demonstrate the normal approximation. The results will be more “approximate” than if
the rule of thumb had been met.
Substituting W = 44.5, n1 = 7, n2 = 10 and n = 17 into the above expression we find that
z = -1.81. For two tail test at 0.05 level the critical z values are -1.96 and +1.96. The
calculated z is between these limits; therefore, null hypothesis cannot be rejected.
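For comparison, the sketch below runs the same example through SciPy's Mann-Whitney U test; assuming a recent SciPy where the reported statistic is U for the first sample, the rank sum W can be recovered as U + n1(n1 + 1)/2.

from scipy.stats import mannwhitneyu

lot1 = [112, 105, 83, 102, 144, 85, 50]
lot2 = [91, 183, 141, 219, 105, 138, 146, 84, 134, 106]

u_stat, p_value = mannwhitneyu(lot1, lot2, alternative="two-sided")
w = u_stat + len(lot1) * (len(lot1) + 1) / 2   # rank sum W for sample 1 (44.5 here)
print(f"U = {u_stat}, W = {w}, two-tailed p-value = {p_value:.3f}")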
PROBLEMS
1. Given the two samples below, use the Wilcoxon rank sum test to test the null
hypothesis that the population medians are equal against their alternative that they are
not equal.
Sample
1
Sample
40
35
44
42
46
28
39
50
37
45
27
35
32
34
49
37
48
49
51
45
44
34
36
50
49
37
48
2
2. Many states are considering lowering the blood-alcohol level at which a driver is
designated as driving under the influence (DUI) of alcohol. An investigator for a
legislative committee designed the following test to study the effect of alcohol on
reaction time. Ten participants consumed a specified amount of alcohol. Another
group of ten participants consumed the same amount of a nonalcoholic drink, a
placebo. The two groups did not know whether they were receiving alcohol or the
placebo. The twenty participants’ average reaction times (in seconds) to a series of
simulated driving situations are reported in the following table. Does it appear that
alcohol consumption increases reaction time?

Placebo   0.90   0.37   1.63   0.83   0.95   0.78   0.86   0.61   0.38   1.97
Alcohol   1.46   1.45   1.76   1.44   1.11   3.07   0.98   1.27   2.56   1.32
KRUSKALWALLIS TEST
It is a non-parametric counterpart for one-way analysis of variance used to determine if three
or more samples originate from the same distribution. It is similar to the Mann-Whitney U
test, but applicable to more than two sample groups.
Assumptions:
1) Variables has a continuous distribution
2) The data are at least ordinal.
3) Samples are independent.
4) The populations are not assumed to be normally distributed or to have equal variances.
SOLVING KRUSKAL WALLIS TEST METHOD:

• State the null and alternative hypotheses.
  In this test the null hypothesis is that the medians of the k populations are the same, i.e., m1 = m2 = … = mk. The test is one-tailed and is carried out as follows:
  Ho: m1 = m2 = … = mk (the population medians are equal)
  H1: at least one mj differs from the others (the population medians are not equal)
• Level of significance.
• Test statistic:
  - Rank the combined data values as if they were from a single group. The smallest value gets the rank of 1, the next smallest gets 2, and so on; in case of a tie, each of the tied values gets their average rank.
  - Add the ranks for the data values from each of the k groups, obtaining ∑R1, ∑R2, through ∑Rk.
  - The calculated value of the test statistic is
    H = [ 12 / (n(n + 1)) ] ∑ (Rk² / nk) − 3(n + 1)
    where the sum is over the k samples and
    n = the sum of the sample sizes in all samples
    Rk = the sum of the ranks in the kth sample
    nk = the size of the kth sample
• Critical value:
  The distribution of H is closely approximated by the chi-square distribution. Whenever each sample size is at least 5, and for α = the level of significance for the test, the critical H is the chi-square value for which df = k − 1 and the upper-tail area is α. If the calculated H exceeds the critical value, the null hypothesis is rejected; otherwise it cannot be rejected.
• Conclusion
EXAMPLE:
As production manager, you want to see if 3 filling machines have different filling times. You assign
15 similarly trained & experienced workers, 5 per machine, to the machines. At the .05 level, is there a
difference in the distribution of filling times?
Machinery 1    Machinery 2    Machinery 3
25.40          23.40          20.00
26.31          21.80          22.20
24.10          23.50          19.75
23.74          22.75          20.60
25.10          21.60          20.40
SOLUTION
Ho: Identical Distribution.
H1: At Least 2 Differ.
α = .05
H = [12 / (n(n + 1))] × Σ(Rk²/nk) − 3(n + 1),  with ν = k − 1 degrees of freedom
Calculations:
Machinery 1   Rank      Machinery 2   Rank      Machinery 3   Rank
25.40         14        23.40         9         20.00         2
26.31         15        21.80         6         22.20         7
24.10         12        23.50         10        19.75         1
23.74         11        22.75         8         20.60         4
25.10         13        21.60         5         20.40         3
∑R1 = 65                ∑R2 = 38                ∑R3 = 17
Substituting the values in the formula we get
H = [12 / (15 × 16)] × (65²/5 + 38²/5 + 17²/5) − 3(16) = 11.58
Critical value:
The critical value for H is the chi-square statistic corresponding to an upper-tail area of α = 0.05 and df = k − 1 = 2. Referring to the chi-square table, this is found to be 5.991.
Reject Ho if:
Hcal ≥ Htab
11.58 ≥ 5.991
The calculated H = 11.58 exceeds 5.991, so the null hypothesis is rejected at the 0.05 level. We conclude that the three populations do not all have the same median.
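As an illustration only (not part of the original text), the following Python sketch recomputes H for the filling-machine data, first directly from the rank-sum formula above and then with scipy.stats.kruskal as a cross-check; it assumes scipy is installed.

from scipy.stats import rankdata, kruskal

m1 = [25.40, 26.31, 24.10, 23.74, 25.10]
m2 = [23.40, 21.80, 23.50, 22.75, 21.60]
m3 = [20.00, 22.20, 19.75, 20.60, 20.40]

groups = [m1, m2, m3]
combined = m1 + m2 + m3
ranks = rankdata(combined)
n = len(combined)

# Split the ranks back into their groups and apply the H formula.
H, start = 0.0, 0
for g in groups:
    r = ranks[start:start + len(g)].sum()
    H += r ** 2 / len(g)
    start += len(g)
H = 12 / (n * (n + 1)) * H - 3 * (n + 1)
print(round(H, 2))                      # 11.58

H_scipy, p_value = kruskal(m1, m2, m3)  # same statistic (no ties here)
print(round(H_scipy, 2), round(p_value, 4))

Both approaches give H = 11.58; kruskal also returns the chi-square p-value for df = 2.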
PROBLEMS:
1. For the following independent and random sample, use 0.05 level of significance in
testing whether the population medians could be equal.
Sample 1    Sample 2    Sample 3
31.3        24.9        36.0
30.7        20.8        37.7
35.4        22.2        31.0
36.1        24.9        28.4
30.3        21.4        31.7
25.5        24.1        32.6
2. In testing three different rubber compounds, a tire manufacturer finds the tread lives of the tires made from each to be as shown below. At the 0.05 level of significance, could the three compounds deliver the same median tread life?

Design 1:  34, 38, 33, 30, 30
Design 2:  46, 43, 39, 46, 36
Design 3:  48, 39, 33, 35, 41
3. In an agricultural test, each of four organic compounds is applied to a sample of plants. At the end of 4 weeks, the heights of the plants are as shown below. At the 0.025 level, are the compounds equally effective in promoting plant growth?

Formula 1:  18, 18, 20, 20, 18
Formula 2:  9, 13, 20, 16, 13
Formula 3:  14, 8, 8, 17, 8
Statistical Quality Control
Introduction:
Statistics:
Statistics requires a good amount of data in order to obtain reliable results. The science of statistics handles these data in order to draw certain conclusions. Its techniques find extensive applications in quality control, production planning and control, business charts, linear programming, etc.
Inferential Statistics:
Inferential statistics is a valuable tool because it allows us to look at a small sample and make statements about the whole population.
Samples must be pulled RANDOMLY from a population so that the sample truly represents the population. Every unit in a population must have an equal chance of being selected for the sample to be truly random. The distribution or shape of the data is important to know for analytical purposes.
• The most common distribution is the bell-shaped or normal distribution.
• Parameters can be estimated from sample statistics. Two of the most common parameters are the mean and standard deviation.
• The mean (or average, denoted by μ) measures central tendency; it is estimated by the sample mean, or x-bar.
• The standard deviation (σ) measures the spread of the data and is estimated by the sample standard deviation.
Quality:
In manufacturing, a measure of excellence or a state of being free from defects, deficiencies and significant variations. It is brought about by strict and consistent commitment to certain standards that achieve uniformity of a product in order to satisfy specific customer or user requirements. Quality is a relative term and is generally explained with reference to the end use of the product. Quality is thus defined as fitness for purpose.
Dimensions of Quality
• Performance - Will it do the intended job?
• Reliability - How often does it fail?
• Durability - How long does the product last?
• Serviceability - How easy is it to repair the product?
• Aesthetics - What does the product look like?
• Features - What does the product do?
• Perceived Quality - What is the reputation of the company or its product?
• Conformance to Standards - Is the product made exactly as the designer intended?
Control:
Control is a system for measuring and checking or inspecting a phenomenon. It suggests when to inspect, how often to inspect and how much to inspect. Control ascertains the quality characteristics of an item, compares them with prescribed quality standards and separates defective items from non-defective ones.
Statistical Quality Control (SQC) is the term used to describe the set of statistical tools used by quality professionals. SQC is used to analyze quality problems and solve them. Statistical quality control refers to the use of statistical methods in the monitoring and maintaining of the quality of products and services. All the tools of SQC are helpful in evaluating the quality of services, and SQC uses different tools to analyze quality problems.
Statistical quality control
One SQC method, referred to as acceptance sampling, can be used when a decision must be made to accept or reject a group of parts or items based on the quality found in a sample.
Objective of Statistical Quality Control
Quality control is very important for every company. Quality control includes the service quality given to customers, company management leadership, commitment of management, continuous improvement, fast response, actions based on facts, employee participation and a quality-driven culture.
The main objectives of the quality control module are to control material reception, internal rejections, client claims and supplier evaluations, together with the corrective actions related to their follow-up. These systems and methods guide all quality activities. The development and use of performance indicators is linked, directly or indirectly, to customer requirements and satisfaction, and to management.
Three SQC Categories
Statistical quality control (SQC) is the term used to describe the set of statistical tools used by quality professionals. SQC encompasses three broad categories:
1. Descriptive statistics
   e.g. the mean, standard deviation, and range
2. Statistical process control (SPC)
   Involves inspecting the output from a process
   Quality characteristics are measured and charted
   Helpful in identifying in-process variations
3. Acceptance sampling, used to randomly inspect a batch of goods to determine acceptance or rejection. Does not help to catch in-process problems.
Descriptive statistics involves describing quality characteristics and relationships. SPC involves inspecting a random sample of the output from a process for a given characteristic. Acceptance sampling involves batch sampling by inspection.
Sources of Variation
• Variation exists in all processes.
• Variation can be categorized as either:
  • Common or random causes of variation
    • Random causes that we cannot identify
    • Unavoidable
    • e.g. slight differences in process variables like diameter, weight, service time, temperature
  • Assignable causes of variation
    • Causes can be identified and eliminated
    • e.g. poor employee training, worn tool, machine needing repair
Traditional Statistical Tools
Descriptive statistics include:
• The mean - a measure of central tendency.
• The range - the difference between the largest and smallest observations in a set of data.
• The standard deviation - measures the amount of data dispersion around the mean.
• The distribution or shape of the data - normal (bell-shaped) or skewed.
Analysis of Patterns on Control Charts
When do you have a problem with your process?
• One or more points outside of the control limits
• A run of at least seven points (up, down, or above or below the center line)
• Two or three consecutive points outside the 2-sigma warning limits, but still inside the control limits
• Four or five consecutive points beyond the 1-sigma limits
• An unusual or nonrandom pattern in the data
Setting Control Limits
Control limits are set from the percentage of values expected under the normal curve; in choosing them one balances risks such as the Type I error of signaling a problem when none exists.
SPC Methods - Control Charts
• Control charts show sample data plotted on a graph with CL, UCL, and LCL.
• Control charts for variables are used to monitor characteristics that can be measured, e.g. length, weight, diameter, time.
• Control charts for attributes are used to monitor characteristics that have discrete values and can be counted, e.g. % defective, number of flaws in a shirt, number of broken eggs in a box.
The normal distribution is the basis for the charts and requires the following assumptions:
• The quality characteristic to be monitored is adequately modelled by a normally distributed random variable.
• The parameters μ and σ for the random variable are the same for each unit, and each unit is independent of its predecessors or successors.
• The inspection procedure is the same for each sample and is carried out consistently from sample to sample.
Control Charts
The control chart is a graph used to study how a process changes over time. Data are
plotted in time order. A control chart always has a central line for the average, an
upper line for the upper control limit and a lower line for the lower control limit.
Lines are determined from historical data. By comparing current data to these lines,
you can draw conclusions about whether the process variation is consistent (in
control) or is unpredictable (out of control, affected by special causes of variation).
When to use a control chart?
• Controlling ongoing processes by finding and correcting problems as they occur.
• Predicting the expected range of outcomes from a process.
• Determining whether a process is stable (in statistical control).
• Analyzing patterns of process variation from special causes (non-routine events) or common causes (built into the process).
• Determining whether the quality improvement project should aim to prevent specific problems or to make fundamental changes to the process.
Basic components of control charts
• A centerline, usually the mathematical average of all the samples plotted;
• Lower and upper control limits defining the constraints of common cause variations;
• Performance data plotted over time.
Types of the control charts
1. Variables control charts
   • Variable data are measured on a continuous scale. For example: time, weight, distance or temperature can be measured in fractions or decimals.
   • Applied to data with a continuous distribution.
2. Attributes control charts
   • Attribute data are counted and cannot have fractions or decimals. Attribute data arise when you are determining only the presence or absence of something: success or failure, accept or reject, correct or not correct. For example, a report can have four errors or five errors, but it cannot have four and a half errors.
   • Applied to data following a discrete distribution.
Variables Control Charts
Suppose we have a scatter plot with a response variable on the vertical axis and a
representation of time (such as hours, shifts, days, weeks, or months) on the
horizontal axis. This scatter plot shows the nature of the response over time. For
example, we might see trends, shifts, sudden jumps, and so on. If we add horizontal
limit lines to the plot to indicate standards, the scatter plot becomes a control chart.
When the plotted points fall inside these limit lines, the process yielding the response is said
to be in control. When the process yields responses that are outside these limits, the
process is said to be out-of-control.
The limit lines set a range of 'normal behavior.' They are based on past experience
with the process and give a frame of reference for judging current outcomes. Because
of natural variation in the process, the responses will not be exactly the same. They
will bounce up and down. As long as the response stays within the limits, we need
take no corrective action. However, once a measurement occurs outside the limits, we
must investigate the cause and take appropriate corrective action.
• Use x-bar and R-bar charts together.
• They are used to monitor different variables.
• X-bar and R-bar charts reveal different problems.
Control charts for variables use:
• x-bar charts to monitor the changes in the mean of a process (central tendencies);
• R-bar charts to monitor the dispersion or variability of the process.
• A system can show acceptable central tendencies but unacceptable variability, or
• a system can show acceptable variability but unacceptable central tendencies.
X-Bar Chart:
The X-bar chart monitors the process location over time, based on the average of a
series of observations, called a subgroup.
X-bar / Range charts are used when you can rationally collect measurements in
groups (subgroups) of between two and ten observations. Each subgroup
represents a "snapshot" of the process at a given point in time. The charts' x-axes
are time based, so that the charts show a history of the process. For this reason, data should be time-ordered; that is, entered in the sequence in which it was generated. If this is not the case, then trends or shifts in the process may not be detected, but instead attributed to random (common cause) variation.
For subgroup sizes greater than ten, use X-bar / Sigma charts, since the range
statistic is a poor estimator of process sigma for large subgroups. In fact, the
subgroup sigma is always a better estimate of subgroup variation than subgroup
range. The popularity of the Range chart is only due to its ease of calculation,
dating to its use before the advent of computers.
For subgroup sizes equal to one, an Individual-X / Moving Range chart can be
used.
X-bar charts are efficient at detecting relatively large shifts in the process average, typically shifts of ±1.5 sigma or larger. The larger the subgroup, the more sensitive the chart will be to shifts, provided a rational subgroup can be formed.
Mean or x-bar chart:
Control over the average quality is exercised by the control chart of averages, typically called the X-bar chart. The construction of an x-bar chart is based on the central limit theorem, which states that regardless of the distribution of the population, the distribution of x-bar (the mean of each sample drawn from the population) will tend to follow a normal distribution as the sample size increases. The theorem also implies that:
• the mean of the sample means, denoted by x-double-bar, will equal the mean of the population μ;
• the standard deviation of the sampling distribution of x-bar, σx-bar, will be the population standard deviation divided by the square root of the sample size n, i.e. σx-bar = σ/√n.
The Chart Construction Process: in order to construct x-bar and R charts, we must first find our upper and lower control limits. This is done by utilizing the following formulae:
• In the case when μ and σ are known:
  UCL = μ + 3σ/√n
  LCL = μ − 3σ/√n
  CL = x-double-bar or μ
While theoretically possible, these formulas cannot be used directly when we know neither the population process mean nor the population standard deviation; both must then be estimated from the process itself. First, the R chart is constructed. If the R chart validates that the process variation is in statistical control, the x-bar chart is constructed.
• In the case when σ and μ are not known:
1. Find the mean of each subgroup, x-bar(1), x-bar(2), x-bar(3), ..., x-bar(k), and the grand mean of all subgroups, x-double-bar = [x-bar(1) + x-bar(2) + … + x-bar(k)]/k.
2. Find the UCL and LCL using the following equations:
   UCL = x-double-bar + A2·R-bar
   CL = x-double-bar
   LCL = x-double-bar − A2·R-bar
3. Plot the LCL, UCL, center line, and subgroup means on graph paper.
4. Interpret the data to determine whether the process is in control.
A2 can be found from the table given below.
The second column of the control chart constants table (given later with the R chart factors) gives the value of A2 corresponding to the subgroup or sample size n.
Example of Constructing an X-bar Chart:
A quality control inspector at the Cocoa Fizz soft drink company has taken three samples with four observations each of the volume of bottles filled. If the standard deviation of the bottling operation is 0.2 ounces, use the data below to develop control charts with limits of 3 standard deviations for the 16 oz. bottling operation.
• Center line and control limit formulas:
x-double-bar = (x-bar1 + x-bar2 + … + x-bark)/k,   σx-bar = σ/√n
where k is the number of sample means and n is the number of observations within each sample.
UCLx-bar = x-double-bar + z·σx-bar
LCLx-bar = x-double-bar − z·σx-bar

          Observation 1  Observation 2  Observation 3  Observation 4  Sample mean (x-bar)  Sample range (R)
Time 1    15.8           16.0           15.8           15.9           15.875               0.2
Time 2    16.1           16.0           15.8           15.9           15.975               0.3
Time 3    16.0           15.9           15.9           15.8           15.9                 0.2
X-Bar Control Chart
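A small Python sketch (illustrative, not from the text) reproduces the x-bar limits for this bottling example using the stated sample means and the known process sigma of 0.2 ounces:

# 3-sigma x-bar limits for the bottling example with known process sigma.
xbars = [15.875, 15.975, 15.9]      # sample means from the table
xbarbar = sum(xbars) / len(xbars)   # grand mean, about 15.92

sigma = 0.2                         # known process standard deviation
n = 4                               # observations per sample
sigma_xbar = sigma / n ** 0.5       # 0.2 / sqrt(4) = 0.1
z = 3                               # 3-sigma limits

UCL = xbarbar + z * sigma_xbar      # about 16.22 oz
LCL = xbarbar - z * sigma_xbar      # about 15.62 oz
print(round(xbarbar, 2), round(UCL, 2), round(LCL, 2))

With z = 3 this gives a centre line of about 15.92 and control limits of roughly 16.22 and 15.62 ounces.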
R chart: A control chart for dispersion:
An x-bar chart is used to plot the location (mean) values for each sample, whereas an R chart is used to plot the variation of each sample as measured by the sample range. The range (R) chart is used to control the variability or dispersion in the quality of a product in the process of production. For each sample of size n we calculate the sample range; the frequency distribution of these ranges is approximately normal, with its own mean and standard deviation σR. Thus we can use this distribution of sample ranges to establish control limits so that the variability in the process can be controlled.
The procedure for constructing an R chart is as follows:
• Take random samples, each of size n, and let Ri be the value of the range in the ith sample.
• The range of a sample is found by subtracting the minimum value from the maximum value: R = Rmax − Rmin.
• The mean of the sample ranges, R-bar, is given by R-bar = (R1 + R2 + … + Rk)/k = ∑R/k, where k is the number of samples.
• The control limits are set as under:
  CL = R-bar
  UCL = R-bar + 3σR
  LCL = R-bar − 3σR
where σR is the standard error of the range, i.e. the standard deviation of the ranges of all possible samples of the given size from the population. The average range R-bar provides an estimate of the mean of the random variable of interest.
Example of x-bar and R charts
Step 1: Calculate the sample means, sample ranges, mean of the means, and mean of the ranges.

Sample   Obs 1    Obs 2     Obs 3     Obs 4     Obs 5     Avg      Range
1        10.68    10.689    10.776    10.798    10.714    10.732   0.116
2        10.79    10.86     10.601    10.746    10.779    10.755   0.259
3        10.78    10.667    10.838    10.785    10.723    10.759   0.171
4        10.59    10.727    10.812    10.775    10.73     10.727   0.221
5        10.69    10.708    10.79     10.758    10.671    10.724   0.119
6        10.75    10.714    10.738    10.719    10.606    10.705   0.143
7        10.79    10.713    10.689    10.877    10.603    10.735   0.274
8        10.74    10.779    10.11     10.737    10.75     10.624   0.669
9        10.77    10.773    10.641    10.644    10.725    10.710   0.132
10       10.72    10.671    10.708    10.85     10.712    10.732   0.179
11       10.79    10.821    10.764    10.658    10.708    10.748   0.163
12       10.62    10.802    10.818    10.872    10.727    10.768   0.250
13       10.66    10.822    10.893    10.544    10.75     10.733   0.349
14       10.81    10.749    10.859    10.801    10.701    10.783   0.158
15       10.66    10.681    10.644    10.747    10.728    10.692   0.103
Averages                                                  10.728   0.2204
Step 2. Determine the control limit formulas and the necessary tabled values.

x-bar chart control limits:
UCL = x-double-bar + A2·R-bar
LCL = x-double-bar − A2·R-bar
R chart control limits:
UCL = D4·R-bar
LCL = D3·R-bar

n     A2     D3     D4
2     1.88   0      3.27
3     1.02   0      2.57
4     0.73   0      2.28
5     0.58   0      2.11
6     0.48   0      2.00
7     0.42   0.08   1.92
8     0.37   0.14   1.86
9     0.34   0.18   1.82
10    0.31   0.22   1.78
11    0.29   0.26   1.74

where R-bar = ∑R/15 (the mean of the 15 sample ranges) and R = largest observation − smallest observation within each sample.
Steps 3 & 4: Calculate the x-bar chart values and plot them.
UCL = x-double-bar + A2·R-bar = 10.728 + 0.58(0.2204) = 10.856
LCL = x-double-bar − A2·R-bar = 10.728 − 0.58(0.2204) = 10.601
(x-bar chart: the 15 sample means plotted against sample number, with the center line at 10.728 and control limits UCL = 10.856 and LCL = 10.601.)
Steps 5 & 6: Calculate the R chart values and plot them.
UCL = D4·R-bar = (2.11)(0.2204) = 0.465
LCL = D3·R-bar = (0)(0.2204) = 0
(R chart: the 15 sample ranges plotted against sample number, with the center line at R-bar = 0.2204, UCL = 0.465 and LCL = 0.)
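The five steps can be checked with the short Python sketch below (illustrative, not part of the original text); it uses the sample averages and ranges from the table together with the A2, D3 and D4 factors for subgroups of size 5.

# Recompute x-double-bar, R-bar and the x-bar / R chart limits.
xbars = [10.732, 10.755, 10.759, 10.727, 10.724, 10.705, 10.735, 10.624,
         10.710, 10.732, 10.748, 10.768, 10.733, 10.783, 10.692]
ranges = [0.116, 0.259, 0.171, 0.221, 0.119, 0.143, 0.274, 0.669,
          0.132, 0.179, 0.163, 0.250, 0.349, 0.158, 0.103]

A2, D3, D4 = 0.58, 0, 2.11          # factors for subgroup size n = 5

xbarbar = sum(xbars) / len(xbars)   # about 10.728
rbar = sum(ranges) / len(ranges)    # about 0.2204

ucl_x = xbarbar + A2 * rbar         # about 10.856
lcl_x = xbarbar - A2 * rbar         # about 10.601
ucl_r = D4 * rbar                   # about 0.465
lcl_r = D3 * rbar                   # 0

print(round(xbarbar, 3), round(rbar, 4))
print(round(ucl_x, 3), round(lcl_x, 3), round(ucl_r, 3), lcl_r)

It reproduces x-double-bar ≈ 10.728, R-bar ≈ 0.2204 and the limits 10.856, 10.601, 0.465 and 0 found above.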
There may arise two situations:
1. If σR is known, then:
   a. CL = d2·σR
   b. UCL = R-bar + d2·σR
   c. LCL = R-bar − d1·σR
   where d1 and d2 are constants that depend on the sample size, as shown above.
2. If σR is not known, then its value can be estimated and the limits become:
   CL = R-bar
   UCL = D4·R-bar
   LCL = D3·R-bar
Thus the control limits for the R chart are obtained.
Range Chart:
The lower and upper control limits for the range chart are calculated using the formulas:
1. If the variation in R is known:
   UCL = R-bar + 3σR
   LCL = R-bar − 3σR
2. If the variation in R is unknown:
   UCL = D4·R-bar
   LCL = D3·R-bar
Range chart limits when n = 1: the moving ranges of size two replace the usual ranges in the formulas above. All calculations remain the same after this substitution, except that there are only k − 1 ranges to plot.
Control Chart for Range (R)
• Factors for three-sigma control limits; center line and control limit formulas:
R-bar = (0.2 + 0.3 + 0.2)/3 = 0.233
UCLR = D4·R-bar = 2.28(0.233) = 0.53
LCLR = D3·R-bar = 0.0(0.233) = 0.0
R-Bar Control Chart
Second Method for the X-bar Chart, Using R-bar and the A2 Factor:
This method is used when the sigma of the process distribution is not known.
• Control limits solution:
R-bar = (0.2 + 0.3 + 0.2)/3 = 0.233
UCLx-bar = x-double-bar + A2·R-bar = 15.92 + 0.73(0.233) = 16.09
LCLx-bar = x-double-bar − A2·R-bar = 15.92 − 0.73(0.233) = 15.75
Chart: X-bar
  Focuses on: the process average
  Subgroup sample size: two and above
  Advantages: Does a good job at detecting sudden, large jumps in the process average. Simple to understand. Used often, so there is a large body of knowledge about its use.
  Disadvantages: Slow to detect drifts in the average. Not good at detecting small changes in the process average.

Chart: R
  Focuses on: variability
  Subgroup sample size: two and above
  Advantages: Good at detecting sudden jumps. Easy to compute and understand.
  Disadvantages: Ignores a lot of information about the variability, especially when the subgroup size is large.
CONTROL CHART FOR ATTRIBUTES:
Control charts for attributes are used to determine whether the products under inspection satisfy certain characteristics or not. In other words, the attribute (quality characteristic) charts are typically based on classification of products or services as defective or non-defective. This class of charts does not include any measurement of variation comparable to an R chart derived from the range of samples. They are similar to R charts in the sense that their control limits are three standard errors away from the means of all possible values.
C chart: control chart for defects per unit:
Sometimes the characteristics representing the quality of a product or service are countable in nature and the data are obtained by counting, for example whether a machine is working or idle, defects in automobiles or machine components, or services rendered by a restaurant, a store or a bank, and so on.
The c chart is used in situations wherein the opportunities for a defect in each production unit, or for a complaint from a customer, are very large, while the probability of occurrence per unit tends to be very small or constant. The outcome of such a sampling can be described by a Poisson distribution.
Example: During an examination of pieces of equal length, the following numbers of defects were observed: 2, 3, 4, 0, 5, 6, 7, 4, 3, 2. Draw a control chart for the number of defects and comment on whether the process is under control or not.
SOLUTION
Step 1: Let c denote the number of defects per piece. Then the average number of defects in the 10 samples is
C-bar = ∑C/N = (2 + 3 + 4 + 0 + 5 + 6 + 7 + 4 + 3 + 2)/10 = 3.6
Thus the control limits are:
UCL = C-bar + 3√C-bar = 3.6 + 3√3.6 = 3.6 + 5.692 = 9.292
LCL = C-bar − 3√C-bar = 3.6 − 5.692 = −2.092, taken as 0
The control chart for C based on these limits is then drawn: a graph with three lines, the center line at C-bar = 3.6, the upper control limit at 9.292 and the lower control limit at 0, on which the number of defects in each sample is plotted. Any point falling outside these limits would indicate that the process is out of control; since all the observed counts lie within the limits, the process is under control.
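A minimal Python sketch (not from the text) of the same c-chart calculation, which also checks whether any sample falls outside the limits:

# c-chart limits for the defects-per-piece example.
defects = [2, 3, 4, 0, 5, 6, 7, 4, 3, 2]

c_bar = sum(defects) / len(defects)        # 3.6
ucl = c_bar + 3 * c_bar ** 0.5             # about 9.29
lcl = max(0.0, c_bar - 3 * c_bar ** 0.5)   # negative, so set to 0

out_of_control = [c for c in defects if c > ucl or c < lcl]
print(c_bar, round(ucl, 3), lcl, out_of_control)   # empty list -> in control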
P chart: control chart for proportions of defectives:
The p chart is designed to control the percentage (or proportion) of defectives per
sample and is based on the distribution of proportion (or fraction) defectives in each
sample. The assumption that attributes classified as either good or bad follow the binomial distribution implies that
a) There are only two possible outcomes (good or defective)
b) The outcomes occur randomly, and
c) The probability of either outcome remains unchanged for each trial.
Since the number of defects C per unit can be converted into a fraction (proportion) defective by dividing C by the sample size, the p-chart may be used in place of the C-chart.
The p-chart has two advantages over the C-chart:
1. Expressing the defectives as a percentage or proportion of the given population is more meaningful.
2. When the sample size varies from sample to sample, the p-chart provides a more meaningful and simpler presentation.
If the sample size is constant, the primary difference between the C-chart and the p-chart lies only in the computation of the control limits.
The steps for construction of the control limits for a p-chart are as follows:
1. Compute the proportion of defective items in each sample by dividing the number of defectives xi recorded in a sample by its size ni:
   p1 = x1/n1, p2 = x2/n2, …
   In general, p = number of defectives / sample size n.
2. Obtain the mean and variance of p from all samples combined, i.e. the average proportion defective
   p-bar = (total number of defectives in all the samples combined) / (total number of items in all the samples combined)
         = (p1 + p2 + p3 + … + pn)/n
   and
   σp² = p-bar·q-bar/n = p-bar(1 − p-bar)/n
3. The control limits for the p-chart are given by
   UCL = p-bar + 3σp = p-bar + 3√[p-bar(1 − p-bar)/n]
   LCL = p-bar − 3σp = p-bar − 3√[p-bar(1 − p-bar)/n]
where σp is the standard error (standard deviation) of the proportion. While constructing the p-chart it is generally preferred to express the results in terms of percent defective rather than fraction defective; the percent defective is 100p.
Control Charts for Attributes – P-Charts & C-Charts
Attributes are discrete events: yes/no, pass/fail. Use P-charts for quality characteristics that are discrete and involve yes/no or good/bad decisions, e.g.:
• Number of leaking caulking tubes in a box of 48
• Number of broken eggs in a carton
Use C-charts for discrete defects when there can be more than one defect per unit, e.g.:
• Number of flaws or stains in a carpet sample cut from a production run
• Number of complaints per customer at a hotel
P-Chart Example: A production manager for a tire company has inspected the number of defective tires in five random samples with 20 tires in each sample. The table below shows the number of defective tires in each sample of 20 tires. Calculate the control limits.

Sample   Number of defective tires   Number of tires in each sample   Proportion defective
1        3                           20                               .15
2        2                           20                               .10
3        1                           20                               .05
4        2                           20                               .10
5        1                           20                               .05
Total    9                           100                              .09

Solution:
CL = p-bar = (total defectives)/(total inspected) = 9/100 = .09
σp = √[p-bar(1 − p-bar)/n] = √[(.09)(.91)/20] = 0.064
UCLp = p-bar + zσp = .09 + 3(.064) = .282
LCLp = p-bar − zσp = .09 − 3(.064) = −.102, taken as 0
P- Control Chart
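As an illustration (not part of the original text), the Python sketch below recomputes the p-chart limits for the tire data, using the totals from the table (9 defectives in 100 tires, samples of n = 20):

# p-chart limits for the tire example.
defectives = [3, 2, 1, 2, 1]
n = 20                                            # tires per sample

p_bar = sum(defectives) / (n * len(defectives))   # 9/100 = 0.09
sigma_p = (p_bar * (1 - p_bar) / n) ** 0.5        # about 0.064
ucl = p_bar + 3 * sigma_p                         # about 0.282
lcl = max(0.0, p_bar - 3 * sigma_p)               # negative, so 0

proportions = [d / n for d in defectives]         # 0.15, 0.10, 0.05, ...
print(round(sigma_p, 3), round(ucl, 3), lcl)
print([p for p in proportions if p > ucl or p < lcl])  # empty -> in control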
C-Chart Example:
The number of weekly customer complaints is monitored in a large hotel using a c-chart. Develop three-sigma control limits using the data in the table below.

Week:                  1   2   3   4   5   6   7   8   9   10   Total
Number of complaints:  3   2   3   1   3   3   2   1   3   1    22

Solution:
CL = c-bar = (total complaints)/(number of samples) = 22/10 = 2.2
UCLc = c-bar + z√c-bar = 2.2 + 3√2.2 = 6.65
LCLc = c-bar − z√c-bar = 2.2 − 3√2.2 = −2.25, taken as 0
C- Control Chart
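A minimal Python sketch (not from the text) that wraps the c-chart limit formulas in a small helper function and applies it to the weekly complaints data:

def c_chart_limits(counts):
    """Return (centre line, UCL, LCL) for a c chart with 3-sigma limits."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * c_bar ** 0.5
    return c_bar, c_bar + half_width, max(0.0, c_bar - half_width)

complaints = [3, 2, 3, 1, 3, 3, 2, 1, 3, 1]   # ten weeks, 22 complaints in total
cl, ucl, lcl = c_chart_limits(complaints)
print(cl, round(ucl, 2), lcl)                  # 2.2, about 6.65, 0

It returns the centre line 2.2, an upper limit of about 6.65 and a lower limit of 0, as above.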
Acceptance Sampling
Acceptance sampling is a statistical measure used in quality control. A company cannot test every one of its products, either because testing would ruin the products or because the volume of products is too large. Acceptance sampling solves this by testing a sample of the product for defects. The process involves a batch size, a sample size and the number of defects acceptable in the batch. It allows a company to measure the quality of a batch with a specified degree of statistical certainty without having to test every unit of product. The statistical reliability of a sample is generally measured by a t-statistic.
Probability is a key factor in acceptance sampling, but it is not the only factor. If a company makes a million products and tests 10 units, finding one defect, an assumption would be made on the basis of probability that 100,000 of the 1,000,000 are defective. However, this could be a grossly inaccurate representation. More reliable conclusions can be drawn by increasing the sample size beyond 10 and by doing more than just one test and averaging the results. When done correctly, acceptance sampling is a very effective tool in quality control.
SAMPLING PLANS
A “lot,” or batch, of items can be inspected in several ways, including the use of
single, double, or sequential sampling.
Single Sampling
Two numbers specify a single sampling plan: the number of items to be sampled (n) and a pre-specified acceptable number of defects (c). If the number of defects found in the sample is less than or equal to the acceptance number c, the whole batch will be accepted. If there are more than c defects, the whole lot will be rejected or subjected to 100% screening.
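As a rough illustration (not part of the original text), the Python sketch below encodes a single sampling plan's accept/reject rule and, using the binomial distribution, the probability that a lot with a given true fraction defective would be accepted; the plan parameters n = 50 and c = 2 are hypothetical.

from math import comb

def accept(defects_found, c):
    """Accept the lot when the number of defects found is at most c."""
    return defects_found <= c

def prob_accept(n, c, p):
    """Binomial probability of finding at most c defectives in a sample of n."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n, c = 50, 2
print(accept(1, c), accept(3, c))          # True, False
print(round(prob_accept(n, c, 0.02), 3))   # chance of accepting a 2%-defective lot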
Double Sampling
Often a lot of items is so good or so bad that we can reach a conclusion about its quality by taking a smaller sample than would have been used in a single sampling plan. If the number of defects in this smaller sample (of size n1) is less than or equal to some lower limit (c1), the lot can be accepted. If the number of defects exceeds an upper limit (c2), the whole lot can be rejected. But if the number of defects in the n1 sample is between c1 and c2, a second sample (of size n2) is drawn. The cumulative results determine whether to accept or reject the lot. This concept is called double sampling.
Sequential Sampling
Multiple sampling is an extension of double sampling, with smaller samples used
sequentially until a clear decision can be made. When units are randomly selected
from a lot and tested one by one, with the cumulative number of inspected pieces and
defects recorded, the process is called sequential sampling.
If the cumulative number of defects exceeds an upper limit specified for that sample,
the whole lot will be rejected. Or if the cumulative number of rejects is less than or
equal to the lower limit, the lot will be accepted. But if the number of defects falls
within these two boundaries, we continue to sample units from the lot. It is possible in
some sequential plans for the whole lot to be tested, unit by unit, before a conclusion
is reached.
Selection of the best sampling approach (single, double, or sequential) depends on the types of products being inspected and their expected quality level. A very low-quality batch of goods, for example, can be identified quickly and more cheaply with sequential sampling. This means that the inspection, which may be costly and/or destructive, can end sooner. On the other hand, in many cases a single sampling plan is easier and simpler for workers to conduct, even though the number sampled may be greater than under other plans.
Acceptance quality level (AQL):
This is the minimum level of quality acceptable in a given lot. It is expressed as a decimal or percentage of defectives in a lot that can be considered satisfactory by the consumer. For example, if 20 defectives in a lot of 1000 items are considered acceptable, then AQL = 20/1000 = 2%.
Advantages and limitations of statistical quality control:
ADVANTAGES:
• Reduction in cost: since only a fraction of the output is inspected, the cost of inspection is greatly reduced.
• Greater efficiency: not only is there a reduction in cost, but efficiency also goes up because much of the boredom is avoided, the work of inspection being considerably reduced.
• Easy to apply: an excellent feature of quality control is that it is easy to apply. Once the system is established, it can be operated even by a person who has not had extensive specialized training or a higher mathematical background. It may appear difficult, but because the statistical principles are actually based on common sense, the quality control method finds wide application.
• Early detection of faults: quality control ensures an early detection of faults and hence a minimum waste of rejected production. The moment a sample point falls outside the control limits it is taken to be a danger signal and the necessary corrective action is taken. On the other hand, with 100% inspection, unwanted variations in quality may be detected only at a stage when a large amount of faulty product has already been produced, and there would thus be a big wastage. A control chart, by contrast, provides a graphic picture of how the production is proceeding and tells management where to look for trouble.
LIMITATIONS:
Despite its several advantages, it should be remembered that quality control is not a cure for all quality evils. The techniques of quality control should not be applied mechanically; instead they should be matched to the process being studied. The application of a standard procedure without adequate study of the process is extremely dangerous.
Practice Questions:
Q no 1: The following data refer to defects found at the inspection of the first 10 samples of size 100. Use them to obtain the upper and lower control limits for the percentage defective in samples of 100, and represent the first ten sample results in the chart you prepare, showing the central line and the control limits.

Sample number:      1   2   3   4   5   6   7   8   9   10
No. of defectives:  2   1   1   3   2   3   4   2   2   0
Q no 2: The average fraction of defectives in 23 samples, each of 2000 rubber belts, is found to be 16%. Indicate how to construct the relevant control chart with its upper and lower control limits, and tell whether the process is under control or not.
Q no 3: The following table gives the number of defects observed in 10 woollen carpets passing as satisfactory. Construct the control chart for the number of defects.

Carpet no:       1   2   3   4   5   6   7   8   9   10
No. of defects:  3   4   5   6   3   3   5   3   6   2
Q no 4: Todd Olmstead is the Meals on Wheels dispatcher for the Atlanta metropolitan area. He wants meals delivered to clients within 30 minutes of leaving the kitchens; meals with longer delivery times tend to be too cold when they arrive. Each of his 10 volunteer drivers is responsible for delivering 15 meals daily. Over the past month, Todd has recorded the percentage of each day's 150 meals that were delivered on time:
Day    % on time      Day    % on time
1      89.33          16     89.33
2      81.33          17     89.33
3      95.33          18     78.67
4      88.67          19     94.00
5      96.00          20     94.00
6      86.67          21     99.33
7      98.00          22     95.33
8      84.00          23     94.67
9      90.67          24     92.67
10     80.67          25     81.33
11     88.00          26     89.33
12     86.67          27     99.33
13     96.67          28     90.67
14     85.33          29     92.00
15     78.67          30     88.00
a) Help Todd construct a p chart from these data
b) How does your chart show that the attribute "fraction of meals delivered on time" is out of control?
c) What action do you recommend for Todd?
Q no 5: The following table gives the inspection data relating to 10 samples of 100 items each, concerning the production of bottle corks. Construct a control chart.

Sample no:           1     2     3     4     5     6     7     8     9     10
Size of sample:      100   100   100   100   100   100   100   100   100   100
No. of defectives:   5     3     3     6     5     6     8     10    10    4
Fraction defective:  0.05  0.03  0.03  0.06  0.05  0.06  0.08  0.10  0.10  0.04
Q no 6: A food company puts mango juice into cans advertised as containing 10 ounces of the juice. The weights of the cans immediately after filling are taken for 20 samples by a random method (at intervals of 30 minutes); each sample includes 4 cans. The sample values are tabulated in the following table. The weights in the table are given in units of 0.01 ounces in excess of 10 ounces. For example, the weight of the juice drained from the first can of the first sample is 10.15 ounces, which exceeds 10 ounces by 0.15 ounces (10.15 − 10 = 0.15); since the unit in the table is 0.01 ounces, the excess is recorded as 15 units in the table. Construct the control chart.
Sample number   I    II   III   IV
1               15   12   13    20
2               10   8    8     14
3               8    15   17    10
4               12   17   11    12
5               18   13   15    4
6               20   16   14    20
7               15   19   23    17
8               13   23   14    16
9               9    8    18    5
10              6    10   24    20
11              5    12   20    15
12              3    15   18    18
13              6    18   12    10
14              12   9    15    18
15              15   15   6     16
16              18   17   8     15
17              13   16   5     4
18              10   20   8     10
19              5    15   10    12
20              6    14   2     14
Q no 7: The following data show the values of the sample mean and range for 10 samples of size 5 each. Calculate the values for the central line and control limits for the mean chart and the range chart, and determine whether the process is in control.

Sample:  1     2     3     4     5     6     7     8     9     10
Mean:    11.2  11.8  10.8  11.6  11.0  9.6   10.4  9.6   10.6  10.0
Range:   7     4     8     5     7     4     8     4     7     9

Conversion factors required for n = 5 are given in the table of control chart constants.
Q no 8: Draw a suitable chart for the following data, pertaining to the number of foreign-coloured threads (considered as defects) in 15 pieces of cloth of 2 m × 2 m in a certain make of synthetic fibre, and state your conclusion:
7, 12, 3, 20, 21, 5, 4, 3, 10, 8, 0, 9, 6, 7, 20.
Q no 9: After finding out that his luggage arrived in San Antonio while his destination was Omaha, Will Richardson, a statistician for USA Airlines, decided to do some research. For the last three weeks, Will has sampled 200 passengers daily and determined the percentage of luggage delivered to the expected destination, with the results given below:
Day   Percent correct     Day   Percent correct
1     0.89                12    0.94
2     0.91                13    0.97
3     0.93                14    0.94
4     0.95                15    0.95
5     0.94                16    0.92
6     0.96                17    0.93
7     0.92                18    0.92
8     0.91                19    0.91
9     0.93                20    0.93
10    0.90                21    0.89
11    0.88
a) Help Will construct a p chart from these data.
b) Is this luggage delivery process in control?
Qno 10: The Take-Charge Company produces batteries. From time to time a random
sample of six batteries is selected from the output and the voltage of each battery is
measured, to be sure that the system is under control. Here are statistics on 16 such
samples.
a. What type of control chart should be used here? Why?
b. What is the centre line of the chart?
c. What is the lower control limit?
d. The upper control limit?
e. Draw the control chart on a piece of graph paper.
f. What does the graph indicate?
Sample   Mean   Range        Sample   Mean   Range
1        4.99   0.41         9        5.01   0.49
2        4.87   0.57         10       5.19   0.56
3        4.85   0.59         11       5.40   0.44
4        5.26   0.74         12       5.15   0.63
5        5.09   0.74         13       5.00   0.35
6        5.02   0.21         14       4.89   0.45
7        5.13   0.56         15       4.99   0.54
8        5.09   0.92         16       5.05   0.33
Q no 11: A machine is set to deliver items of a given weight. Ten samples of size 5 each were recorded, with the following results:

Sample:        1    2    3    4    5    6    7    8    9    10
Mean (X-bar):  15   17   15   18   17   14   18   15   17   16
Range (R):     7    7    4    9    8    7    12   4    11   5

Calculate the values for the central line and control limits for the mean chart and the range chart, and then comment on the state of control. Conversion factors for n = 5 are given in the table above.
Q no 12: Ross Darrow is a flight operations analyst for Spacious Skies. He has been assigned the task of monitoring flights at the company's hub airport in the southeast. Each day, Spacious Skies has 240 takeoffs scheduled from this hub. Ross has been concerned about the fraction of flights with late departures, and four weeks ago he instituted procedures designed to reduce that fraction. Use the data for the last 30 weekdays to construct a p chart to see whether his new procedures have been successful. What further action, if any, should Ross consider?
Number of late departures per weekday, weeks 1-6:

Week 1:  M 26   T 19   W 26   TH 22   F 24
Week 2:  M 19   T 19   W 20   TH 18   F 18
Week 3:  M 17   T 9    W 13   TH 10   F 12
Week 4:  M 14   T 14   W 13   TH 9    F 10
Week 5:  M 12   T 15   W 14   TH 15   F 16
Week 6:  M 18   T 17   W 16   TH 18   F 17
Q no 13: R&H Bloch is a large accounting firm specializing in the preparation of individual federal tax returns. The firm is very conservative in its practices and tries to avoid having more than 2% of its clients audited. As part of a summer internship, Jane Bloch has been asked to see whether this goal is being met on a consistent basis. For each week during a 16-week interval centered on April 15 of last year, she has randomly selected 125 returns prepared by the firm.
Week ending   # audited      Week ending   # audited
2/25          2              4/22          3
3/04          1              4/29          1
3/11          2              5/06          1
3/18          3              5/13          3
3/25          5              5/20          2
4/01          4              5/27          2
4/08          5              6/03          3
4/15          6              6/10          2
a) Are significantly more than 2% of R&H Bloch's clients being audited? State and test an appropriate hypothesis using all 2000 returns in Jane's sample.
b) Notwithstanding your result in part (a), construct a p chart based on Jane's data. Is there anything evident in the chart that Jane should bring to the attention of the partners in the firm?