Download 2 - Olympia College

DIPLOMA IN
INFORMATION TECHNOLOGY
MODULE LEARNING GUIDE
STATISTICAL ANALYSIS
Version 2: MAY 2007
1.0 INTRODUCTION
The use of statistics in decision making has become widespread in virtually every occupational field. In support of
good decision-making, the purpose of this module is to provide students with knowledge and skills in statistical principles.
Fundamental principles discussed include data collection and sampling, measures of central tendency and location,
correlation, regression, probability distributions, hypothesis testing, time series and index numbers. After completing this
module, students should be able to collect, present, analyse and interpret data.
2.0 AIMS
The broad aim of this programme is to enable students to develop the skills of data collection, data presentation,
data analysis and interpretation.
3.0 PROGRAMME LEARNING OUTCOMES
Learners who successfully complete this module will:
- Examine changes over time in data from applied science or business, paying attention to trends, cyclical
variations, and irregular variations.
- Explore, describe and apply regression techniques to data from business or applied science to identify relationships,
gauge the strength of relationships, make predictions and detect study effectiveness.
- Identify problems associated with producing a particular set of data to measure opinion or identify a trend.
- Apply probability laws or concepts to real-life situations.
- Apply binomial and normal probability distributions to real-life situations.
- Apply the Chi-square distribution to real-life situations.
4.0 AUDIENCE AND PRE-REQUISITES
This module requires students to have a prior understanding of the basic elements of Business Mathematics and
Economics.
5.0 OVERVIEW OF THE MODULE
Students will be exposed to the various forms of Statistical Analysis, such as:
- Data collection methods
- Tabulation of data, data presentation and data analysis, such as averages and dispersion
- Correlation and regression analysis
- Probability and probability distributions
- Hypothesis testing using test criteria such as the Z, t and Chi-square tests
- Applications of time series and index numbers
6.0 INSTRUCTIONAL PLAN AND RESOURCES
6.1 An electronic calculator must be used for the teaching of this module.
The base source of material to be used in the design of the teaching-learning schedule will be the print-based module
material provided to both staff and students. Other resources will be included to supplement it and fill in gaps, especially
the latest developments or recent changes that are not covered in the module material.
Class Teaching Schedule
Week 1. Lecture: Statistical Analysis (An Introduction)
Tutorial/Activities:
- Discuss a basic understanding of the types of statistics.
- Conduct an activity about the types of data and data collection methods.
- Divide students into groups for a short discussion on collecting data with various approaches.
Week 2. Lecture: Survey & Sampling
Tutorial/Activities:
- Review the relationship between censuses and samples.
- Divide students into groups for a short discussion on surveys.
- Conduct an activity on the application of sampling methods.
Week 3. Lecture: The Contribution of Graphical Data to Research
Tutorial/Activities:
- Summarize and present the data that has been collected.
- Divide students into groups to present data.
- Conduct exercises on the classification of data.
Week 4. Lecture: Measures of Location and Dispersion
Tutorial/Activities:
- Learn how the arithmetic mean, median and mode are calculated from grouped and ungrouped data.
- Learn how the range, quartile deviation, mean deviation and standard deviation are calculated.
- Compute the coefficient of variation and skewness.
Week 5. Lecture: Regression Analysis
Tutorial/Activities:
- Use graph paper to draw scatter diagrams to visualize the relationship between two variables.
- Use regression analysis to estimate an equation to predict future values of dependent variables.
Week 6. Lecture: Correlation Analysis
Tutorial/Activities:
- Learn how correlation analysis describes the degree to which two variables are linearly related to each other.
- Use the coefficient of determination as a measure of the strength of the relationship between two variables.
Week 7. Lecture: Probability and Discrete Probability Distribution
Tutorial/Activities:
- Give exercises on using various probability concepts to make decisions.
- Provide more advanced application questions to test students' understanding of the Poisson distribution and the Binomial distribution.
Week 8. Lecture: Normal Distribution
Tutorial/Activities:
- Give application exercises on using the normal distribution.
- Discuss the characteristics of the normal distribution.
Week 9. Lecture: Sampling Distribution & Estimation
Tutorial/Activities:
- Apply the t-Distribution for small samples.
- Discuss exercises on using the t-Distribution and Z-Distribution methods.
- Review various confidence interval methods for means, proportions and means based on small samples.
- Determine the sample size n.
7.0 ASSESSMENT STRATEGY
7.1 AIM
The aim of the assessment strategy is to identify formal practices and procedures for assessing and appraising the
performance of participants so that judgments and decisions can be reached concerning:
- The progression of participants through the programme.
- How well participants have met the programme learning outcomes through the combination of the individual
module learning outcomes.
- The provision of feedback information to participants concerning their performance and how they adhered to the
generic assessment criteria and the module-specific assessment criteria.
The underpinning principles which drive the assessment strategies adopted for this programme are the profile of the
target participants and the programme itself (its philosophy and associated learning outcomes).
7.2 ASSESSMENT INSTRUMENTS:
Reference is to be made to the appendix on assessment instruments.
LEARNING SUGGESTIONS AND GUIDELINES
Week 1: Statistical Analysis (An Introduction)
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Define statistics
- Explain how statistics can be of value in business
- Explain the basic concepts of statistics
- Provide a brief history of statistics
- Provide a basic understanding of the types of statistics
- Present a review of the types of data and data collection methods
Learning outcomes to be attained:
- Understand the need for statistics in business applications
- Learn about the different types of statistics
- Identify the types of data
- Gain insight into various data collection methods
Readings and preparation to be undertaken:
a) From the Learning Material: Section 1a & Section 1c
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 1
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. Design a questionnaire for any of the following:
- AIDS awareness
- TV habits
- Study habits
- Video games
- Family values
2. Postal questionnaires and interviews are two methods of collecting data. List the advantages of each method.
3. One of the main defects of surveys by interview is the problem of interview bias. What is interview bias and how
can the problem be minimized?
4. Explain the difference between primary and secondary data, giving suitable examples, and state the advantages and
disadvantages of each source.
5. Explain whether each of the following items describes a statistic or a parameter.
- Your grade-point average
- The percentage of persons who respond to an opinion survey by agreeing that our president is doing a good job.
- The average age of all U.S. citizens as reported by the 1990 census.
- The average income for all persons who are members of the American Medical Association.
- Sales made by General Foods Corp. in July as an indicator of annual sales.
6. Description and inference are the two broad subareas of statistics. Identify which subarea classification, description
or inference, describes each of the following items better:
- Computing a baseball player's batting average
- Using the first-quarter summary of production to project second-quarter production
- A bar chart that displays the corporate revenues for the current year, the preceding year, two years ago, and so on.
- A survey estimate of corporate year-end assets at somewhere between $1.2 million and $1.4 million.
- Estimating an assembly worker's weekly output as 1200 units plus or minus 100 units from eight hours of
work performance.
7. Comment on each of the following as a potential sample survey question. Is the question clear? Is it slanted toward
a desired response?
- Does your family use food stamps?
- Which of the following best represents your opinion on gun control?
  The government should confiscate our guns.
  We have the right to keep and bear arms.
- A national system of health insurance should be favored because it would provide health insurance for
everyone and would reduce administrative costs.
- In view of escalating environmental degradation and incipient resource depletion, would you favor economic
incentives for recycling of resource-intensive consumer goods?
Week 2: Survey & Sampling
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Discuss the guidelines for conducting a survey
- Define population, sample and sampling frame
- Define sampling and state the merits and demerits of sampling
- Describe the major forms of spatial sampling for selecting samples from phenomena that vary across the
landscape
- Explain the various types of sampling methods, such as probability sampling and non-probability sampling
Learning outcomes to be attained:
- Learn about survey activity through group work
- Understand the difference between a sample and a population
- Understand the merits and demerits of sampling
- Understand the various sampling methods that can be used
Readings and preparation to be undertaken:
a) From the Learning Material: Section 1b
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 1
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. Indicate whether a sample or a population is described in each of the following items.
- An opinion survey
- A biannual agriculture survey
- A quarterly stockholders of assets and liabilities
- The collection of all grades assigned for this class.
2. Define the target population for the following studies:
- prior programming experience among first-year tertiary computing students
- infertility among the wild echidna (native anteater) population
- variations in the durability of jogging shoes
- variation in heavy-metal content of a river downstream from a lead smelter
3. A non-profit organization is conducting a door-to-door opinion poll on municipal day-care centres. The
organization has devised a scheme for random sampling of houses, and plans to conduct the poll on weekdays
from noon to 5 p.m. Will this scheme produce a random sample?
4. Bob Peterson, public relations manager for Piedmont Power and Light, has implemented an institutional advertising
campaign to promote energy consciousness among its customers. Peterson, anxious to know if the campaign has
been effective, plans to conduct a telephone survey of area residents. He plans to look in the telephone book and
select random numbers with addresses that correspond to the company's service area. Will Peterson's sample be a
random one?
5. At the U.S. Mint in Philadelphia, 10 machines stamp out pennies in lots of 50. These lots are arranged sequentially
on a single conveyor belt, which passes an inspection station. An inspector decides to use systematic sampling in
inspecting the pennies and is trying to decide whether to inspect every fifth or every seventh lot of pennies. Which
is better? Why?
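One way to explore the U.S. Mint question is to simulate which machines each sampling interval would reach. The sketch below is only illustrative: it assumes the lots arrive on the belt in a repeating cycle of the 10 machines, which the exercise does not state explicitly, and the function name `machines_inspected` is hypothetical.

```python
# Assumption (not stated in the exercise): lots arrive in a repeating cycle,
# lot 0 from machine 0, lot 1 from machine 1, ..., lot 10 from machine 0 again.
N_MACHINES = 10


def machines_inspected(step, n_lots=1000, start=0):
    """Return the set of machines whose lots are inspected when every
    `step`-th lot is taken, beginning with lot index `start`."""
    return {lot % N_MACHINES for lot in range(start, n_lots, step)}


every_fifth = machines_inspected(5)
every_seventh = machines_inspected(7)

print("Every 5th lot reaches machines:", sorted(every_fifth))
print("Every 7th lot reaches machines:", sorted(every_seventh))
```

Under that assumption, an interval that shares a factor with the number of machines keeps returning to the same few machines, while an interval with no common factor eventually visits all of them.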
6. The state occupational board has decided to do a study of work-related accidents within the state, to examine some
of the variables involved in the accidents, for example, the type of job, the cause of the accident, the extent of the
injury, the time of day, and whether the employer was negligent. It has been decided that 250 of the 2500 work-related
accidents reported last year in the state will be sampled. The accident reports are filed by date in a filing
cabinet. Marsha Gulley, a department employee, has proposed that the study use a systematic sampling technique
and select every tenth report in the file for the sample. Would her plan of systematic sampling be appropriate here?
Explain.
7. Bob Bennett, product manager for Clipper Mowers Company, is interested in looking at the kinds of lawn mowers
used throughout the country. Assistant product manager Mary Wilson has recommended a stratified random-sampling
process in which the cities and communities studied are separated into substrata, depending on the size
and nature of the community. Mary Wilson proposes the following classification:
Category    Type of Community
Urban       Inner city (population 100,000)
Suburban    Outlying areas of cities or smaller communities (pop. 20,000 to 100,000)
Rural       Small communities (fewer than 20,000 residents)
Is stratified random sampling appropriate here?
8. A Senate study on the issue of self-rule for the District of Columbia involved surveying 2000 people from the
population of the city regarding their opinions on a number of issues related to self-rule. Washington, D.C., is a
city in which many neighborhoods are poor and many neighborhoods are rich, with very few neighborhoods
falling between the extremes. The researchers who were administering the survey had reasons to believe that the
opinions expressed on the various questions would be highly dependent upon income. Which method was more
appropriate, stratified sampling or cluster sampling? Explain briefly.
Week 3: The Contribution of Graphical Data to Research
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Define the contribution that graphical displays of data can make to the presentation process
- Provide students with a comprehensive understanding of data collection methods
- Distinguish between a diagram and a graph
- Describe types of diagrams such as the bar diagram and the pie diagram
- Describe types of graphs such as the line graph, histogram, frequency polygon and ogive (cumulative frequency
curve)
- Describe the stem-and-leaf plot
Learning outcomes to be attained:
- Learn about data collection activity through group work
- Understand the sources and approaches to use when collecting data
- Understand the basic definitions and concepts of data collection and presentation
- Ability to present data as charts, graphs and illustrative diagrams
Readings and preparation to be undertaken:
a) From the Learning Material: Section 2
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 2
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
Billson, J. (2002). The Power of Focus Groups for Social and Policy Research. Skywood Press.
Wadsworth, Y. (1997). Do it yourself social research (2nd ed.). St. Leonards, NSW, Australia: Allen and Unwin.
Berry, M. J. A. and Linoff, G. (1997). Data Mining Techniques for Marketing, Sales, and Customer Support. New York: Wiley.
Keppel, G. (1991). Design and Analysis: A Researcher's Handbook. Englewood Cliffs, NJ: Prentice-Hall.
Tukey, J. W. (1977). Exploratory Data Analysis. Reading, MA: Addison-Wesley.
Cox, B.G., et al. (eds.) (1995), Business Survey Methods, Wiley.
Groves, R.M. (1988), Telephone Survey Methodology, Wiley.
Groves, R.M. (1989), Survey Errors and Survey Costs, Wiley.
d) Discuss the following questions and explain their rationale
1. Construct the stem-and-leaf plot for the following data:
53, 47, 59, 66, 36, 69, 84, 77, 42, 57, 51, 60, 78, 63, 46, 63, 42, 55, 63, 48, 75, 60, 8, 80, 44, 59, 60, 75, 49, 63.
2. The data in the table below describe the weekly take-home pay for 20 semi-skilled
laborers. Using the classes $300.00 - $309.99, $310.00 - $319.99, ..., $340.00 - $349.99, set up a frequency
distribution table and plot the appropriate histogram and frequency polygon.
Weekly Take-Home Pay ($)
319.12  331.50  320.76  325.42  333.98
326.81  348.39  321.67  315.38  340.89
324.79  337.24  331.47  304.12  327.02
313.48  326.67  326.67  321.19  327.11
3. There are five hospitals in a Health District, and they classify the number of beds in each hospital as follows:
Hospital: Foothills, General, Southern, Healthview, St John
Bed type: Maternity, Surgical, Medical, Psychiatric
Number of beds: 24, 86, 82, 25, 38, 85, 55, 22, 6, 45, 30, 30, 0, 30, 30, 65, 0, 24, 35, 76
4. Sales in four regions are given in the following table. Draw a pie chart to represent these:
Region    Sales
North     25
South     10
East      45
West      25
Total     100
5. Construct a frequency distribution from the following set of data showing the number of minutes 100 customers
occupy their seats in a college cafeteria.
29 67 34 39 23 66 24 37 45 58
51 73 31 15 51 47 35 46 72 37
48 58 31 31 41 45 40 35 45 63
35 34 56 34 26 41 62 22 37 82
56 43 47 35 56 28 41 19 28 45
39 30 67 37 38 55 31 35 27 35
54 73 51 61 27 38 44 54 23 49
30 33 33 96 68 40 46 28 34 16
92 49 22 22 41 62 45 53 52 70
59 43 35 34 29 48 61 35 63 36
6. The Degree of Reading Power (DRP) test is often used to measure the reading ability
of children. Here are the DRP scores of 44 third-grade students, measured during research on ways to improve
reading performance:
40 26 39 14 42 18 25 43 46 27 19
47 19 26 35 34 15 44 40 38 31 46
52 25 35 35 33 29 34 41 49 28 52
47 35 48 22 33 41 51 27 14 54 45
Make a stem plot of these data. Then, make a histogram.
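As a possible check on a hand-drawn stem plot of the DRP scores, the following minimal sketch groups each score by its tens digit (the stem) and lists the units digits (the leaves) in ascending order. The helper name `stem_and_leaf` is illustrative, and splitting on tens is an assumption that suits two-digit scores like these.

```python
from collections import defaultdict

# DRP scores copied from the exercise above (44 values).
drp_scores = [
    40, 26, 39, 14, 42, 18, 25, 43, 46, 27, 19,
    47, 19, 26, 35, 34, 15, 44, 40, 38, 31, 46,
    52, 25, 35, 35, 33, 29, 34, 41, 49, 28, 52,
    47, 35, 48, 22, 33, 41, 51, 27, 14, 54, 45,
]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(stems)


plot = stem_and_leaf(drp_scores)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

The length of each row of leaves is the class frequency, so the same grouping can be reused for the histogram with classes 10-19, 20-29, and so on.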
7. In 1994, there were 12,263,000 undergraduate students in U.S. colleges. According
to the U.S. Department of Education, there were 117,000 American Indian students, 674,000 Asian, 1,317,000
non-Hispanic black, 968,000 Hispanic, and 8,916,000 non-Hispanic white students. In addition, 269,000
foreign undergraduates were enrolled in U.S. colleges. Present these data in a graph.
8. What is meant by ogive?
Draw two ogive curves from the following data:
Class interval         Frequency
0 to less than 5       7
5 to less than 10      10
10 to less than 15     16
15 to less than 20     23
20 to less than 25     25
25 to less than 30     13
30 to less than 35     17
35 to less than 40     10
40 to less than 45     14
45 to less than 50     10
50 to less than 55     5
9. Hourly wage rates (RM) for 25 workers are as follows:
4.11  4.25  4.90  5.30  5.20
5.05  6.15  5.80  4.65  5.60
4.43  5.25  4.54  4.50  4.25
4.14  4.85  4.29  4.80  4.40
5.50  4.40  6.05  5.15  4.90
Construct a frequency distribution for the above data using equal class intervals, with the first class defined as
RM4.00 and under RM4.50. Prepare a histogram too.
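The tallying step for the hourly wage exercise can be sketched in a few lines. This is only one way to do it: the class width of RM0.50 follows from the first class given in the question, while the conversion to whole sen (to avoid floating-point edge cases at class boundaries) and the name `class_lower_bound` are choices made here for illustration.

```python
from collections import Counter

# Hourly wage rates (RM) copied from the exercise above.
wages = [
    4.11, 4.25, 4.90, 5.30, 5.20, 5.05, 6.15, 5.80, 4.65, 5.60,
    4.43, 5.25, 4.54, 4.50, 4.25, 4.14, 4.85, 4.29, 4.80, 4.40,
    5.50, 4.40, 6.05, 5.15, 4.90,
]

CLASS_WIDTH_SEN = 50  # RM0.50, matching "RM4.00 and under RM4.50"


def class_lower_bound(wage):
    """Return the lower class limit, in sen, of the class containing `wage`."""
    sen = round(wage * 100)  # work in whole sen so 4.50 falls in the 4.50 class
    return (sen // CLASS_WIDTH_SEN) * CLASS_WIDTH_SEN


freq = Counter(class_lower_bound(w) for w in wages)
for lower in sorted(freq):
    upper = lower + CLASS_WIDTH_SEN
    print(f"RM{lower / 100:.2f} and under RM{upper / 100:.2f}: {freq[lower]}")
```

The printed counts are the heights of the histogram bars, one bar per class.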
Week 4: Measures of Location and Dispersion
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Define discrete and continuous variables
- Develop discrete and continuous frequency distributions
- Find measures of central location and dispersion such as the mean, median, mode, standard deviation and
coefficient of variation
Learning outcomes to be attained:
- Ability to calculate the mean, median and mode from both ungrouped and grouped data
- Ability to find the variance and standard deviation from both ungrouped and grouped data
Readings and preparation to be undertaken:
a) From the Learning Material: Section 2
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 3
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
Billson, J. (2002). The Power of Focus Groups for Social and Policy Research. Skywood Press.
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. The numbers of resignations received by a certain firm per month during 1988 were:
8, 3, 5, 3, 4, 3, 1, 0, 3, 4, 0, 7
Calculate the arithmetic mean, mode and median.
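For ungrouped data such as the monthly resignation counts, the three measures can be checked with Python's standard `statistics` module. The sketch below simply applies `mean`, `median` and `mode` to the twelve values from the exercise; it is a way to verify hand calculations, not a substitute for showing the working.

```python
import statistics

# Monthly resignations during 1988, copied from the exercise above.
resignations = [8, 3, 5, 3, 4, 3, 1, 0, 3, 4, 0, 7]

mean = statistics.mean(resignations)      # sum of values divided by the count
median = statistics.median(resignations)  # average of the two middle sorted values
mode = statistics.mode(resignations)      # most frequently occurring value

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```

Because there are twelve values, the median is the average of the sixth and seventh values once the data are sorted.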
2. Calculate the mean, median and mode for the number of types purchased annually by each individual from the
following data:
No of types purchased    No of people
1                        2
2                        4
3                        8
4                        3
5                        3
6                        2
7                        2
8                        4
9                        6
3. Define the following:
- mean
- median
- mode
- standard deviation
- coefficient of variation
4. The table below shows the marks obtained by 40 students in a class test. Find the standard deviation and variance:
Marks             30   40   50   60   70   80   90
No of Students    4    6    12   10   5    2    1
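The variance and standard deviation of the class-test marks can be worked from the frequency table by weighting each mark by its frequency. The sketch below assumes the population form of the variance (dividing by n rather than n - 1), since the 40 students are treated as the whole class; if the module material uses the sample form, the divisor would change.

```python
import math

# Marks and frequencies copied from the class-test table above.
marks = [30, 40, 50, 60, 70, 80, 90]
students = [4, 6, 12, 10, 5, 2, 1]

n = sum(students)
mean = sum(x * f for x, f in zip(marks, students)) / n

# Population variance: frequency-weighted mean of squared deviations.
# Assumption: divide by n (all 40 students), not n - 1.
variance = sum(f * (x - mean) ** 2 for x, f in zip(marks, students)) / n
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.1f}")
print(f"variance = {variance:.1f}, standard deviation = {std_dev:.2f}")
```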
5. The following data shows the wages of a group of employees:
Wages group (hourly rate in cents)    No. of employees
50 and under 60                       5
60 and under 70                       25
70 and under 80                       134
80 and under 90                       85
90 and under 100                      9
100 and under 110                     43
110 and under 120                     34
- Calculate the mode, median, mean and standard deviation.
- Find the coefficient of variation and quartile deviation.
6. From the 140 children whose urinary concentrations of lead were investigated, 40 were chosen who were aged at
least 1 year but less than 5 years. The following concentrations of copper (in
) were found.
0.70, 0.45, 0.72, 0.30, 1.16, 0.69, 0.83, 0.74, 1.24, 0.77,
0.65, 0.76, 0.42, 0.94, 0.36, 0.98, 0.64, 0.90, 0.63, 0.55,
0.78, 0.10, 0.52, 0.42, 0.58, 0.62, 1.12, 0.86, 0.74, 1.04,
0.65, 0.66, 0.81, 0.48, 0.85, 0.75, 0.73, 0.50, 0.34, 0.88
Find the median, range and quartiles.
Week 5: Regression Analysis
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Use scatter diagrams to visualize the relationship between two variables
- Use regression analysis to estimate an equation to predict future values of dependent variables
Learning outcomes to be attained:
- Identify situations where regression analysis is appropriate
- Ability to draw a scatter diagram
- Know when to apply regression analysis
Readings and preparation to be undertaken:
a) From the Learning Material: Section 3a
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 5
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. Following are the advertising expenses and sales value over 5 months for a company:
Month                           Jan    Feb    Mar    Apr    May
Advertising expenses ($'000)    15     15     11     11     19
Sales value ($'000)             120    160    140    100    180
- Construct a linear regression equation showing the sales value as dependent on advertising expenses.
- Estimate the sales value if the advertising expenses amounted to $13,000.
- Draw a scatter diagram to show the best-fit line for the above linear regression equation based on the above
figures.
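The least squares line for the advertising exercise can be computed directly from the usual formulas, slope b = Sxy / Sxx and intercept a = mean(y) - b * mean(x). The sketch below is a minimal illustration using the five monthly figures; the function name `least_squares` is hypothetical, and both variables are kept in $'000 as in the table.

```python
# Figures copied from the exercise above, both in $'000.
advertising = [15, 15, 11, 11, 19]
sales = [120, 160, 140, 100, 180]


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares(advertising, sales)
predicted_sales = intercept + slope * 13  # advertising of $13,000 is 13 in $'000

print(f"sales = {intercept:.2f} + {slope:.2f} * advertising")
print(f"estimated sales at $13,000 advertising: {predicted_sales:.2f} ($'000)")
```

With only five observations the fitted line is sensitive to each point, so the estimate is best read alongside the scatter diagram.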
2. A machine runs at different speeds; the higher the speed, the sooner the part has to be replaced. Trial observations
produced the following data:
Speed (revolutions per minute)    Life of drill head (hours)
18                                162
20                                154
20                                171
21                                165
23                                128
26                                138
26                                129
31                                125
32                                106
32                                97
40                                95
41                                103
42                                109
43                                69
- Plot the figures on a scatter diagram.
- Determine the equation of the regression line.
- Plot the line on the scatter diagram and estimate the life of the drill head if the machine operates at 30
revolutions per minute.
3. The following data has been collected over eight periods:
Period    Units of output    Total cost ($)
1         10,000             32,000
2         20,000             39,000
3         40,000             58,000
4         25,000             44,000
5         30,000             52,000
6         40,000             61,000
7         50,000             70,000
8         45,000             64,000
- Draw a scatter diagram.
- Draw a straight line that best fits the data.
- Give the equation of the line and estimate the cost likely to be incurred at the output levels of 26,000 units
and 48,750 units.
4. The following table shows the increase in average earnings of male employees in the United Kingdom between
1975 and 1981.
Year    Average earnings
1975    59
1976    70
1977    77
1978    87
1979    99
1980    122
1981    132
- Find the equation of the least squares regression line that would enable you to forecast earnings in future years.
- Plot the data and draw your regression line on the same graph.
- Make a forecast of the earnings for 1982 and comment.
Week 6: Correlation Analysis
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Identify situations where correlation analysis between two variables is appropriate
- Calculate the correlation coefficient, perform a statistical test on the coefficient and interpret the correlation
coefficient
- Differentiate between the Product Moment Correlation Coefficient and the Coefficient of Determination
- Determine rank correlation by using Spearman's Rank
Learning outcomes to be attained:
- Identify situations where correlation analysis is appropriate
- Ability to understand the difference between the Product Moment Correlation Coefficient and the Coefficient of
Determination
- Ability to perform statistical tests on the coefficient of correlation and interpret it
- Ability to understand Spearman's Rank approach
Readings and preparation to be undertaken:
a) From the Learning Material: Section 3a
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 4
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. Following are the quantities of a product and its total production cost.
Product quantity ('000 units)    10   20   40   25   30   40   50   45
Production cost ($'000)          32   39   58   44   52   61   70   64
- Calculate the coefficient of correlation to explain the extent of correlation between product quantity and
production cost.
- Calculate the coefficient of determination to explain the extent of variation in production cost caused by the
variation in product quantity.
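The product moment correlation coefficient for the quantity and cost figures can be computed as r = Sxy / sqrt(Sxx * Syy), and the coefficient of determination is then r squared. The following sketch applies those formulas to the eight pairs from the exercise; the function name `pearson_r` is illustrative.

```python
import math

# Figures copied from the exercise above.
quantity = [10, 20, 40, 25, 30, 40, 50, 45]  # '000 units
cost = [32, 39, 58, 44, 52, 61, 70, 64]      # $'000


def pearson_r(x, y):
    """Product moment correlation coefficient of paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(quantity, cost)
r_squared = r ** 2  # share of the variation in cost explained by quantity

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```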
2. Following are the marks scored by a group of 10 students in two different progress tests:
Student             A    B    C    D    E    F    G    H    I    J
Accounting marks    50   70   60   30   40   90   80   75   65   55
Costing marks       40   60   50   20   55   65   70   90   80   75
Calculate the coefficient of rank correlation and comment on the ranking of marks for the 10 students between the accounting test and
costing test.
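Spearman's rank correlation for the two sets of marks can be sketched with the formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between a student's two ranks. The code below assumes rank 1 goes to the highest mark and relies on there being no tied marks in either test, which holds for this data; tied values would need averaged ranks, which this sketch does not handle.

```python
# Marks copied from the exercise above, students A to J in order.
accounting = [50, 70, 60, 30, 40, 90, 80, 75, 65, 55]
costing = [40, 60, 50, 20, 55, 65, 70, 90, 80, 75]


def ranks(values):
    """Rank values with 1 for the highest. Assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


rank_acc = ranks(accounting)
rank_cost = ranks(costing)

n = len(accounting)
sum_d_squared = sum((a - c) ** 2 for a, c in zip(rank_acc, rank_cost))
spearman = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(f"sum of d^2 = {sum_d_squared}, rank correlation = {spearman:.3f}")
```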
3. Take the data given below and construct a scatter diagram. Find the correlation coefficient and coefficient of
determination for this data.
X    10   12   14   16   18   20   22   24   26   28
Y    25   24   22   20   19   17   13   12   11   10
4. A farmer has recorded the number of fertilizer applications to each of the fields in one section of the farm and, at
harvest time, records the weight of crop per acre. The results are given in the accompanying table:
X    1    2    4    5    6    8    10
Y    2    3    4    7    12   10   7
5. Use the following data to calculate the coefficient of correlation and the coefficient of determination. Can you
draw conclusions from your results?
X    4    2    6    7    8    5    2    4
Y    10   5    15   16   19   14   8    11
Week 7: Probability and Discrete Probability Distribution
Over the week of lecture and tutorial, the focus will be to undertake the following:
- Explain how probability can help in understanding the consequences of dependencies
- Discuss and implement the rules of probability under statistical dependence and statistical independence
- Identify the range of business situations which require different probability distributions
- Identify when and how to apply Binomial Distributions and Poisson Distributions
Learning outcomes to be attained:
- Understand the rules of probability
- Solve probability questions
- Understand conditional probability
- Understand the concept of Binomial Distributions and Poisson Distributions
Readings and preparation to be undertaken:
a) From the Learning Material: Section 6a
b) Additional reading: Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand, Reading 6 & Reading 7
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. The following table is based on observing a random sample of n=200 individuals who entered a gift shop at an
airline terminal:
Sex       Purchase    No Purchase
Male      40          40
Female    40          80
- What is the probability that a randomly selected individual is female?
- What is the probability that a randomly selected individual is female and made a purchase?
- What is the probability that a randomly selected individual is female or made a purchase?
- What is the probability that a randomly selected individual is female given that a purchase is made?
- What is the probability that a purchase was made given that a randomly selected individual is female?
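The five probabilities asked for in the gift shop exercise follow from the counts in the table using the addition rule and the definition of conditional probability. The sketch below keeps the answers as exact fractions with Python's `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction

# Counts copied from the gift shop table above (n = 200).
counts = {
    ("male", "purchase"): 40,
    ("male", "no purchase"): 40,
    ("female", "purchase"): 40,
    ("female", "no purchase"): 80,
}

n = sum(counts.values())
female = sum(c for (sex, _), c in counts.items() if sex == "female")
purchase = sum(c for (_, bought), c in counts.items() if bought == "purchase")
female_and_purchase = counts[("female", "purchase")]

p_female = Fraction(female, n)
p_female_and_purchase = Fraction(female_and_purchase, n)
# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_female_or_purchase = Fraction(female + purchase - female_and_purchase, n)
# Conditional probability: restrict the denominator to the given event
p_female_given_purchase = Fraction(female_and_purchase, purchase)
p_purchase_given_female = Fraction(female_and_purchase, female)

print(p_female, p_female_and_purchase, p_female_or_purchase)
print(p_female_given_purchase, p_purchase_given_female)
```

Note that the last two answers differ because the conditioning event, and so the denominator, is different in each case.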
2. What is the probability that the sum of the faces in two rolls of a die is:
- Less than 6
- Equal to 6
- Greater than 6
3. A survey of 2000 customers was conducted to determine their purchasing behaviour regarding two products. It
was found that during the past summer 500 had purchased brand A, 350 had purchased brand B and 125 had
purchased both brands. If a person is selected at random from this group, what is the probability that the person:
- Would have purchased brand A
- Would have purchased brand A, but not brand B
- Would have purchased brand A, brand B or both
- Would not have purchased either brand
4. A group of people consists of 30 men, of whom 10 disagree with the proposal, and 70 women, of whom 40 disagree
with the proposal. What is the probability that a person selected from the group is a man or disagrees with the
proposal?
5. The probability of a machine breakdown is 0.1 and the probability of a stoppage of material supply is 0.3. What is the
probability of both a machine breakdown and a stoppage of material supply?
6. A group of ten people consists of 5 men and 5 women. What is the probability that the second person selected
from the group is a man, if the first person selected was a man? What is the probability that the second person
selected from the group is a woman, if the first person selected from the group was a man?
7. A company minibus has 10 passenger seats. In a routine run, it is estimated that the probability of any passenger
seat being filled is 0.42. What are the mean and variance of the binomial distribution of the number of passengers
on a routine run? Calculate the probability that on a routine run:
- There will be no passengers
- There will just be one passenger
- There will be exactly two passengers
- There will be at 3 passengers
A firm which produces half-inch diameters rubber hose estimates that on average there are 0.4 flaws per 10 meter
length. Assuming that flaws occur randomly, what is the probability that:
There are no flaws in a 20 meter length
There is more than 1 flaw in a 10 meter length
There are more than 2 flaws in a 20 meter length
Metal components are subjected to rigorous breaking tests. The probability that a component will break during
such test is 0.4. If such components were tested on one particular occasion, what is the probability that:
3 of them will break
2,3 or 4 will break
0 or 1 will break
10. The number of road accidents at a certain traffic roundabout has been found to have a mean of 0.8 accidents per
week. Calculate the probabilities that:
- There will be at least 2 accidents in a particular week
- There will be exactly 3 accidents in a particular three-week period
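The questions above rely on three kinds of calculation: probabilities read from a table of counts, binomial probabilities, and Poisson probabilities. The following Python sketch shows one possible way to set them up, using the figures from the gift shop, minibus and road accident questions. The function and variable names are illustrative assumptions, not part of the module material, and the road accident counts are assumed to follow a Poisson distribution.

```python
from math import comb, exp, factorial

# Gift shop question: counts copied from the table (n = 200).
counts = {
    ("male", "purchase"): 40,
    ("male", "no purchase"): 40,
    ("female", "purchase"): 40,
    ("female", "no purchase"): 80,
}
total = sum(counts.values())

p_female = sum(v for (sex, _), v in counts.items() if sex == "female") / total
p_purchase = sum(v for (_, action), v in counts.items() if action == "purchase") / total
p_female_and_purchase = counts[("female", "purchase")] / total
# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_female_or_purchase = p_female + p_purchase - p_female_and_purchase
# Conditional probability: P(A | B) = P(A and B) / P(B)
p_female_given_purchase = p_female_and_purchase / p_purchase


def binomial_pmf(k, n, p):
    """P(X = k) when X is Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) when X is Poisson with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


# Minibus question: 10 seats, each filled with probability 0.42.
seats, p_filled = 10, 0.42
binomial_mean = seats * p_filled                       # np
binomial_variance = seats * p_filled * (1 - p_filled)  # np(1 - p)
p_no_passengers = binomial_pmf(0, seats, p_filled)

# Road accident question: mean of 0.8 per week (Poisson assumed).
weekly_rate = 0.8
p_at_least_two = 1 - poisson_pmf(0, weekly_rate) - poisson_pmf(1, weekly_rate)
# Over three weeks the Poisson mean scales to 3 * 0.8 = 2.4.
p_three_in_three_weeks = poisson_pmf(3, 3 * weekly_rate)

print(f"P(female or purchase)      = {p_female_or_purchase:.3f}")
print(f"P(female | purchase)       = {p_female_given_purchase:.3f}")
print(f"Binomial mean, variance    = {binomial_mean:.3f}, {binomial_variance:.3f}")
print(f"P(at least 2 accidents)    = {p_at_least_two:.4f}")
print(f"P(exactly 3 in three weeks) = {p_three_in_three_weeks:.4f}")
```

The same two helper functions can be reused for the rubber hose and metal component questions by changing the parameters. "More than" and "at least" events are easiest to handle through the complement, as done for the accident question.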
Week 8: Normal Distribution
Over the week of lecture and tutorial, the focus will be to undertake the following:



Identify the Normal Distribution as the most important probability distribution in statistics
Discuss the characteristics of Normal Distributions
Apply Standard Normal Deviate (Z) scores
Learning outcomes to be attained:

Apply the Normal Distribution

Understand the different structures of the Normal Distribution

Calculate Standard Normal (Z) scores

Use the table of Standard Normal probabilities
Readings and preparation to be undertaken:
a) From the Learning Material:
Section 6a
b) Additional reading
Lower Hutt(0221, Statistical Analysis, The Open Polytechnic of New Zealand
Reading 8
c) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
d) Discuss the following questions and explain their rationale
1. A recent study conducted by the Public Health Service found that males who smoke average 34 cigarettes per day.
The number of cigarettes smoked per day is normally distributed with a standard deviation of 8. If a male smoker
is selected at random, what is the probability that he smokes:
- More than 2 packs (40 cigarettes) per day
- Less than one pack per day
- Less than 1½ packs per day
2. Bolts are produced with a mean diameter of 0.8cm and any bolt whose diameter is outside the range of 0.78cm to
0.852cm is considered substandard. Assuming that the diameter is normally distributed:
- Find the standard deviation if 2.2% of the bolts are substandard
- An alteration of the production method changes the standard deviation to 0.005cm and does not alter the mean.
Find the percentage of bolts which are now substandard.
3. It is known from past experience that the life of a machine component is approximately normally distributed
with a mean of 200 hours and a standard deviation of 4 hours. Calculate the probability that a randomly
selected component has a life of:
- At least 206 hours
- Less than 198 hours
- Between 204 and 208 hours
4. An aptitude test, which is marked out of 100, is widely used in colleges. From past experience it is known that the
marks are normally distributed and the average mark is 56 with a standard deviation of six. If, in a
particular college, 250 students take the test, calculate:
- The percentage of students expected to obtain more than 47 marks
- The expected number of students obtaining more than 70 marks
- The mark below which 33% of the students are expected to score
5. The amount dispensed by a drink vending machine has a normal distribution with a mean of 205ml and a standard
deviation of 5ml. State the proportion of the drinks containing:
- Less than 212ml
- Less than 200ml
- Between 197.5ml and 210ml
- If the standard deviation remains 5ml, to what value must the mean be changed if approximately 75% of the
drinks are to contain more than 200ml?
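Each normal distribution question above follows the same pattern: convert the value to a Z score, Z = (x - mean) / standard deviation, then look up the Standard Normal probability. The Python sketch below illustrates this using `statistics.NormalDist` from the standard library in place of the printed table, with the figures from the cigarette and vending machine questions. The helper name `z_score` and the variable names are assumptions made for illustration only.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Standardise x: number of standard deviations from the mean."""
    return (x - mean) / sd


standard_normal = NormalDist()  # mean 0, standard deviation 1

# Cigarette question: mean 34 per day, standard deviation 8, 20 per pack.
mean_cigs, sd_cigs = 34, 8
p_more_than_40 = 1 - standard_normal.cdf(z_score(40, mean_cigs, sd_cigs))
p_less_than_20 = standard_normal.cdf(z_score(20, mean_cigs, sd_cigs))
p_less_than_30 = standard_normal.cdf(z_score(30, mean_cigs, sd_cigs))

# Vending machine question, last part: find the mean so that 75% of
# drinks exceed 200ml, i.e. P(X < 200) = 0.25, with sd fixed at 5ml.
sd_ml, threshold_ml = 5, 200
z_lower_quarter = standard_normal.inv_cdf(0.25)  # negative Z value
required_mean = threshold_ml - z_lower_quarter * sd_ml

print(f"P(more than 40)  = {p_more_than_40:.4f}")
print(f"P(less than 20)  = {p_less_than_20:.4f}")
print(f"P(less than 30)  = {p_less_than_30:.4f}")
print(f"Required mean    = {required_mean:.2f} ml")
```

Working backwards from a probability to a value, as in the last step, uses the inverse of the table lookup: find the Z score for the given tail area, then solve Z = (x - mean) / sd for the unknown quantity.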
Week 9: Sampling Distribution & Estimation
Over the week of lecture and tutorial, the focus will be to undertake the following:

Calculate confidence intervals for population parameters from small samples

Determine t-distributions and Z-distributions

Estimate the characteristics of population by observing the characteristics of a sample

Use a two-tail confidence interval

Find the confidence interval using survey data
Learning outcomes to be attained:

Apply the t-distribution and Z-distribution to the estimation of the mean and proportion for single or
multiple samples

Understand the sampling distribution of the mean and proportion

Calculate point estimates and confidence intervals for the mean and the proportion of single and multiple samples
Readings and preparation to be undertaken:
a) From the Learning Material:
Section 6a
b) Recommended text
David S. Moore and George P. McCabe, Introduction to the Practice of Statistics (IPS), 3rd ed., W.H. Freeman and Company, New York
Donald Waters, 1994, Quantitative Methods for Business, Addison-Wesley, USA
John A. Ingram & Joseph G. Monks (1992), Statistics for Business and Economics, 2nd ed., The Dryden Press, Harcourt Brace Jovanovich College
Sheldon P. Gordon & Florence S. Gordon, Contemporary Statistics (1994), McGraw-Hill, USA
c) Discuss the following questions and explain their rationale
1. In a random sample of 200 garages it was found that 79% sold car batteries at below the list price recommended by
the manufacturer.
- Estimate the proportion of all garages selling below the list price.
- Calculate a 99% confidence interval for this estimate.
2. In 2000 a simple random sample of 100 sales invoices was taken from a very large population of sales invoices.
The average value of sales was found to be $18.50 with a standard deviation of $6.00.
- Obtain the 95% confidence interval for the true average value of sales.
- How large a simple random sample would have been required so as to be 95% confident that the sample mean did
not differ from the true mean by more than ±0.05?
3. Using the following data, obtain the 99% confidence limits for the population mean number of children:
No. of children    No. of families
0                  30
1                  40
2                  45
3                  30
4                  20
5                  15
6                  12
7                  8
4. A random sample of 1000 manufactured items is inspected and 250 are found to contain defects. What is the likely
range of the proportion defective in the population of items? (Use 98% confidence limits)
5. A random sample of 100 student examination scripts has been selected. If 12 were found to have
marks below 40, calculate a 95% confidence interval for the true percentage of scripts with marks below 40.
What sample size would be necessary if we wish to estimate the percentage of scripts with
marks below 40 in the whole school to within 2% with 95% confidence?
6. A sample of 40 sardine cans was randomly selected and 8 were found to be defective. The production manager
feels that the current percentage of defectives is too high. Construct an appropriate 99% confidence interval.
7. A sample of 80 bottles of brandy was randomly selected and found to have a mean of 650 liters and a standard
deviation of 30 liters. A week later a second sample of 60 bottles of brandy was selected and found to have a mean of
655 liters with a standard deviation of 25 liters. Construct a 92% confidence interval for the change in the average
volume of brandy.
8. The following table shows the shoe sizes and the respective number of shoes in a shoe market:
Shoe size    Frequency
1            10
2            28
3            42
4            50
5            20
Estimate the population mean shoe size at the 95% confidence level.
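The confidence interval questions above share one structure: point estimate plus or minus a critical value times the standard error. The Python sketch below shows this for a proportion and for a mean, using the garage and sales invoice figures. It assumes large samples, so the Z-distribution is used throughout. For small samples the critical value would come from the t-distribution instead, which the Python standard library does not provide. All function names are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-tailed Z critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def proportion_interval(p_hat, n, confidence):
    """Large-sample interval: p_hat +/- z * sqrt(p_hat(1 - p_hat) / n)."""
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_interval(mean, sd, n, confidence):
    """Large-sample interval: mean +/- z * sd / sqrt(n)."""
    margin = z_critical(confidence) * sd / sqrt(n)
    return mean - margin, mean + margin


def sample_size_for_mean(sd, max_error, confidence):
    """Smallest n with z * sd / sqrt(n) <= max_error."""
    return ceil((z_critical(confidence) * sd / max_error) ** 2)


# Garage question: 79% of 200 garages, 99% confidence.
garages_ci = proportion_interval(0.79, 200, 0.99)

# Sales invoice question: mean $18.50, sd $6.00, n = 100, 95% confidence.
invoices_ci = mean_interval(18.50, 6.00, 100, 0.95)
invoices_n = sample_size_for_mean(6.00, 0.05, 0.95)

print(f"Garages 99% CI:  {garages_ci[0]:.4f} to {garages_ci[1]:.4f}")
print(f"Invoices 95% CI: {invoices_ci[0]:.3f} to {invoices_ci[1]:.3f}")
print(f"Invoices needed for a 0.05 margin: {invoices_n}")
```

The sample size formula comes from rearranging the margin of error, E = z × sd / √n, to give n = (z × sd / E)², rounded up to the next whole number. A hand calculation with z rounded to 1.96 may give a slightly different n from the unrounded value used here.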