Chapter 8
Random Sampling: Planning Ahead for Data Gathering
Chapter 8
Random Sampling and
Sampling Distributions
Using a Small Group
to Represent the Population.
Importance of TS 3
“…the need of the great society for instruments of
analysis by which an invisible and most
stupendously difficult environment can be made
intelligible.”
Walter Lippmann in Public Opinion
Desired Properties of the Instrument
1. Efficiency
a. Cost
b. Speed – short cycle
2. Reasonable Reliability
3. Fairness
Target Populations for TQM
Figure 1: Statistics for Total Quality Management
[Figure: Global Competition and Customer Satisfaction; the Joiner Triangle (Quality, Scientific Approach, All One Team); Scientific Approach: 1. PDCA Cycle, 2. Data Driven; Statistical Process Management; Employee Survey; Customer Survey]
Statistical Thinking for Management
1. Identify the relevant population
2. Collect data using a valid design
3. Make inference on the key characteristics of the target population
4. Formulate actions using sensible criteria, including experimentation for
process improvement
Strategy for Tracking Target Populations
1. Focus on a few “key parameters.”
The population mean
The population proportion
2. Collect a set of “representative” observations; the sample size is n.
3. “Approximate” the parameter value with reasonable accuracy.
“Statistical Inference”: the instrument for implementing this strategy.
Examples
1. The CFO of a credit card company wants to determine the
proportion of cardholders who pay more than the
required minimum monthly payment.
2. A national credit company wants to learn the proportion
of people using the company’s credit cards for rental
payment.
3. A marketing manager of a cruise company would like to
know the average family expenditure for vacation for
the target customer group.
4. A newly opened up-scale supermarket wants to find out
the percentage of the customers in its trade area who
redeem store coupons for free grocery items.
5. Human Resource Director wants to learn the percentage
of employees who believe that top management reads
suggestions by employees.
6. TIME/CNN wants to report the popularity of a certain
political measure among US voters.
Summary of Examples
No. | Target Population | Parameter | Estimator
1 | All cardholders of the bank | Proportion | Sample Proportion
2 | All cardholders of the credit company | Proportion | Sample Proportion
3 | All customers of the target group | Mean Vacation Expenditure | Sample Mean
4 | All customers in the trade area | Proportion | Sample Proportion
5 | All employees | Proportion | Sample Proportion
6 | Your own? | |
  | All reported maintenance expenditures | Total expenditure = N × Mean, μ | Sample Mean
INF_TS:
Statistical Inference Tool Set
1. Random Sampling
For fair representation, and also for
assuring reasonable accuracy
2. Sampling Distribution
What is the expected size of the estimation
error?
3. Inference Procedure
How to account for the estimation error in
the statement about the population
parameter?
(a) Confidence Interval
(b) Hypothesis Testing
Relating Key Concepts of Ch. 8 to Probability Notions in Ch. 7
Chapter 8 | Chapter 7
1. Random Sampling | 1. Random Experiment (betting on a roulette game n times)
2. Estimator / Statistic | 2. Random Variable (the average gain per bet)
3. Sampling Distribution | 3. Probability Distribution
INF_TS1: Random Sampling
Definition:
A random sample must satisfy:
Tools for Selecting a Random Sample
(a) Frame
(b) Random number table
Random Number Table
(a) Properties
(b) How To Use
Example:
N = 870: use 3-digit numbers
Starting point
Selection route
Table 8.2.1
• For example
– Starting in row 21, column 3
– We find 52794, then 01466
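Row | Col 1 | Col 2 | Col 3 | Col 4 | Col 5 | Col 6
19 | 17594 | 10116 | 55483 | 96219 | 85493 | 96955
20 | 09584 | 23476 | 09243 | 65568 | 89128 | 36747
21 | 81677 | 62634 | 52794 | 01466 | 85938 | 14565
22 | 45849 | 01177 | 13773 | 43523 | 69825 | 03222
23 | 97252 | 92257 | 90419 | 01241 | 52516 | 66293
24 | 26232 | 77422 | 76289 | 57587 | 42831 | 87047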
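The table-reading procedure can be sketched in code. This is only an illustration under stated assumptions, not a prescribed routine: the frame is assumed to be numbered 1 to N = 870, the first three digits of each five-digit entry are used as the label, the selection route is assumed to run to the right along row 21, and out-of-range or repeated labels are skipped. The function name `select_from_table` is hypothetical.

```python
def select_from_table(entries, population_size, sample_size, digits=3):
    """Pick frame labels by reading random-number-table entries in order."""
    chosen = []
    for entry in entries:
        label = int(entry[:digits])  # leading digits of the table entry
        # Skip labels outside 1..N and labels that were already selected.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen


# Row 21 of the table, read from column 3 to the right (assumed route).
row_21_from_column_3 = ["52794", "01466", "85938", "14565"]
sample = select_from_table(row_21_from_column_3, population_size=870, sample_size=3)
print(sample)  # [527, 14, 859]
```

Entries 52794 and 01466 give units 527 and 14 of the frame; a label such as 999 or 000 would simply be passed over.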
Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
1 | 51449 | 39284 | 85527 | 67168 | 91284 | 19954 | 91166 | 70918 | 85957 | 19492
2 | 16144 | 56830 | 67507 | 97275 | 25982 | 69294 | 32841 | 20861 | 83114 | 12531
3 | 48145 | 48280 | 99481 | 13050 | 81818 | 25282 | 66466 | 24461 | 97021 | 21072
4 | 83780 | 48351 | 85422 | 42978 | 26088 | 17869 | 94245 | 26622 | 48318 | 73850
5 | 95329 | 38482 | 93510 | 39170 | 63683 | 40587 | 80451 | 43058 | 81923 | 97072
6 | 11179 | 69004 | 34273 | 36062 | 26234 | 58601 | 47159 | 82248 | 95968 | 99722
7 | 94631 | 52413 | 31524 | 02316 | 27611 | 15888 | 13525 | 43809 | 40014 | 30667
8 | 64275 | 10294 | 35027 | 25604 | 65695 | 36014 | 17988 | 02734 | 31732 | 29911
9 | 72125 | 19232 | 10782 | 30615 | 42005 | 90419 | 32447 | 53688 | 36125 | 28456
10 | 16463 | 42028 | 27927 | 48403 | 88963 | 79615 | 41218 | 43290 | 53618 | 68082
11 | 10036 | 66273 | 69506 | 19610 | 01479 | 92338 | 55140 | 81097 | 73071 | 61544
12 | 85356 | 51400 | 88502 | 98267 | 73943 | 25828 | 38219 | 13268 | 09016 | 77465
13 | 84076 | 82087 | 55053 | 75370 | 71030 | 92275 | 55497 | 97123 | 40919 | 57479
14 | 76731 | 39755 | 78537 | 51937 | 11680 | 78820 | 50082 | 56068 | 36908 | 55399
15 | 19032 | 73472 | 79399 | 05549 | 14772 | 32746 | 38841 | 45524 | 13535 | 03113
16 | 72791 | 59040 | 61529 | 74437 | 74482 | 76619 | 05232 | 28616 | 98690 | 24011
17 | 11553 | 00135 | 28306 | 65571 | 34465 | 47423 | 39198 | 54456 | 95283 | 54637
18 | 71405 | 70352 | 46763 | 64002 | 62461 | 41982 | 15933 | 46942 | 36941 | 93412
19 | 17594 | 10116 | 55483 | 96219 | 85493 | 96955 | 89180 | 59690 | 82170 | 77643
20 | 09584 | 23476 | 09243 | 65568 | 89128 | 36747 | 63692 | 09986 | 47687 | 46448
21 | 81677 | 62634 | 52794 | 01466 | 85938 | 14565 | 79993 | 44956 | 82254 | 65223
22 | 45849 | 01177 | 13773 | 43523 | 69825 | 03222 | 58458 | 77463 | 58521 | 07273
23 | 97252 | 92257 | 90419 | 01241 | 52516 | 66293 | 14536 | 23870 | 78402 | 41759
24 | 26232 | 77422 | 76289 | 57587 | 42831 | 87047 | 20092 | 92676 | 12017 | 43554
25 | 87799 | 33602 | 01931 | 66913 | 63008 | 03745 | 93939 | 07178 | 70003 | 18158
26 | 46120 | 62298 | 69126 | 07862 | 76731 | 58527 | 39342 | 42749 | 57050 | 91725
27 | 53292 | 55652 | 11834 | 47581 | 25682 | 64085 | 26587 | 92289 | 41853 | 38354
28 | 81606 | 56009 | 06021 | 98392 | 40450 | 87721 | 50917 | 16978 | 39472 | 23505
29 | 67819 | 47314 | 96988 | 89931 | 49395 | 37071 | 72658 | 53947 | 11996 | 64631
30 | 50458 | 20350 | 87362 | 83996 | 86422 | 58694 | 71813 | 97695 | 28804 | 58523
31 | 59772 | 27000 | 97805 | 25042 | 09916 | 77569 | 71347 | 62667 | 09330 | 02152
32 | 94752 | 91056 | 08939 | 93410 | 59204 | 04644 | 44336 | 55570 | 21106 | 76588
33 | 01885 | 82054 | 45944 | 55398 | 55487 | 56455 | 56940 | 68787 | 36591 | 29914
34 | 85190 | 91941 | 86714 | 76593 | 77199 | 39724 | 99548 | 13827 | 84961 | 76740
35 | 97747 | 67607 | 14549 | 08215 | 95408 | 46381 | 12449 | 03672 | 40325 | 77312
36 | 43318 | 84469 | 26047 | 86003 | 34786 | 38931 | 34846 | 28711 | 42833 | 93019
37 | 47874 | 71365 | 76603 | 57440 | 49514 | 17335 | 71969 | 58055 | 99136 | 73589
38 | 24259 | 48079 | 71198 | 95859 | 94212 | 55402 | 93392 | 31965 | 94622 | 11673
39 | 31947 | 64805 | 34133 | 03245 | 24546 | 48934 | 41730 | 47831 | 26531 | 02203
40 | 37911 | 93224 | 87153 | 54541 | 57529 | 38299 | 65659 | 00202 | 07054 | 40168
41 | 82714 | 15799 | 93126 | 74180 | 94171 | 97117 | 31431 | 00323 | 62793 | 11995
42 | 82927 | 37884 | 74411 | 45887 | 36713 | 52339 | 68421 | 35968 | 67714 | 05883
43 | 65934 | 21782 | 35804 | 36676 | 35404 | 69987 | 52268 | 19894 | 81977 | 87764
44 | 56953 | 04356 | 68903 | 21369 | 35901 | 86797 | 83901 | 68681 | 02397 | 55359
45 | 16278 | 17165 | 67843 | 49349 | 90163 | 97337 | 35003 | 34915 | 91485 | 33814
46 | 96339 | 95028 | 48468 | 12279 | 81039 | 56531 | 10759 | 19579 | 00015 | 22829
47 | 84110 | 49661 | 13988 | 75909 | 35580 | 18426 | 29038 | 79111 | 56049 | 96451
48 | 49017 | 60748 | 03412 | 09880 | 94091 | 90052 | 43596 | 21424 | 16584 | 67970
49 | 43560 | 05552 | 54344 | 69418 | 01327 | 07771 | 25364 | 77373 | 34841 | 75927
50 | 25206 | 15177 | 63049 | 12464 | 16149 | 18759 | 96184 | 15968 | 89446 | 07168
INF_TS2: Sampling Distribution
Definition:
The probability distribution of the estimator.
It describes the performance of the estimator
over repeated independent samplings
from the target population.
Visualizing:
[Figure: From the population, the one sample of n units that you actually take (“you do”) yields one value of the statistic (estimator); the many other samples of n units that “you imagine” each yield their own value of the statistic (estimator).]
A histogram of these imagined
values represents the sampling
distribution of this statistic
The Sampling Distribution – Key Results
A. Expected Value and SD
Parameter Desired | Estimator | Expected Value | SD
Population Mean, \(\mu\) | Sample Mean, \(\bar{X}\) | \(\mu\) | \(\sigma/\sqrt{n}\)
Population Proportion, \(\pi\) | Sample Proportion, \(p\) | \(\pi\) | \(\sqrt{\pi(1-\pi)/n}\)
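1. The sample mean (proportion) is unbiased: its expected value equals the parameter it estimates.
2. The accuracy of the sample mean or the
sample proportion increases as the
sample size increases, because the SD shrinks in proportion to \(1/\sqrt{n}\).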
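The two key results can be checked by simulation. The sketch below is illustrative only: the population (the squares of 1 to 100) is a made-up, deliberately skewed example, samples are drawn with replacement so that draws are independent, and `simulate_sample_means` is a hypothetical helper name.

```python
import math
import random
import statistics

# Hypothetical, deliberately skewed population: 1^2, 2^2, ..., 100^2.
population = [i ** 2 for i in range(1, 101)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)


def simulate_sample_means(population, n, repetitions, seed=0):
    """Draw `repetitions` random samples of size n (with replacement)
    and return the sample mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=n))
        for _ in range(repetitions)
    ]


for n in (4, 25, 100):
    means = simulate_sample_means(population, n, repetitions=5000)
    print(
        f"n={n:3d}  "
        f"mean of sample means={statistics.fmean(means):8.1f} (mu={mu:.1f})  "
        f"SD of sample means={statistics.pstdev(means):7.1f} "
        f"(sigma/sqrt(n)={sigma / math.sqrt(n):.1f})"
    )
```

For each n, the average of the simulated sample means should sit close to the population mean, and their spread should sit close to \(\sigma/\sqrt{n}\), shrinking as n grows.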
B. Central Limit Theorem
Predicts the shape of the sampling
distribution
• Individuals in population
Highly non-normal distribution
Mean \(\mu\), standard deviation \(\sigma\)
• Averages of n = 3 individuals
Non-normal, but less so
Same mean \(\mu\)
Smaller std. deviation \(\sigma_{\bar{X}} = \sigma/\sqrt{3}\)
• Averages of n = 10 individuals
Close to normal
Same mean \(\mu\)
Smaller std. deviation \(\sigma_{\bar{X}} = \sigma/\sqrt{10}\)
[Figure: one histogram per case, with dollar amounts ($1k, $2k) on the horizontal axis and counts on the vertical axis]
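The change of shape described above can be illustrated numerically. This sketch assumes a hypothetical right-skewed population (an exponential distribution with mean $1k, not the population behind the slide's figure) and uses skewness as a rough measure of non-normality, since a normal distribution has skewness 0. The helper names are made up for the example.

```python
import random
import statistics


def skewness(values):
    """Sample skewness; 0 for a symmetric (e.g. normal) distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def skewness_of_averages(n, repetitions=20000, seed=1):
    """Skewness of averages of n independent draws from a hypothetical
    exponential population with mean 1000 (dollars)."""
    rng = random.Random(seed)
    averages = [
        statistics.fmean([rng.expovariate(1 / 1000) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return skewness(averages)


for n in (1, 3, 10):
    print(f"n={n:2d}  skewness of averages = {skewness_of_averages(n):.2f}")
```

The skewness should fall toward 0 as n goes from 1 to 3 to 10, mirroring the progression from "highly non-normal" to "close to normal".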
Relating to Las Vegas Roulette
Playing the roulette n times and computing the
average gain per play
is equivalent to
taking a random sample of size n and computing
the sample mean.
Example: Survey of Family Vacation
Expenditure
The corresponding roulette game:
The wheel has N faces; each face shows the
vacation expenditure of a household.
\(x_i\), the value on face i = the expenditure of
household i.
You spin the wheel so that each face is
equally likely to appear, with probability 1/N.
Roulette for Random Sampling
Game: You Play n Times
[Figure: roulette wheel with faces labeled \(x_1, x_2, \ldots, x_i, \ldots\)]
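Analysis:
Let X = the outcome of one spin.
\[ E(X) = \frac{1}{N}x_1 + \frac{1}{N}x_2 + \cdots + \frac{1}{N}x_N = \mu \quad \text{(population mean)} \]
\[ \mathrm{Var}(X) = \frac{1}{N}(x_1-\mu)^2 + \frac{1}{N}(x_2-\mu)^2 + \cdots + \frac{1}{N}(x_N-\mu)^2 = \sigma^2 \quad \text{(population variance)} \]
\[ SD(X) = \sigma \quad \text{(population standard deviation)} \]
The Total for playing n games:
\[ E(\text{Total}) = n\mu \]
\[ SD(\text{Total}) = \sqrt{n}\,\sigma \]
The average per game:
\[ E(\bar{X}) = E\!\left(\frac{\text{Total}}{n}\right) = \frac{n\mu}{n} = \mu \]
\[ SD(\bar{X}) = SD\!\left(\frac{\text{Total}}{n}\right) = \frac{\sqrt{n}\,\sigma}{n} = \frac{\sigma}{\sqrt{n}} \]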
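The roulette analysis can be verified exactly on a small wheel by listing every possible sequence of spins. The sketch below assumes a hypothetical wheel with N = 4 faces and n = 3 spins; the face values are invented for illustration.

```python
import itertools
import math

# Hypothetical wheel with N = 4 faces (household expenditures in dollars).
faces = [200, 500, 800, 2500]
N = len(faces)

mu = sum(faces) / N                                        # E(X)
sigma = math.sqrt(sum((x - mu) ** 2 for x in faces) / N)   # SD(X)

n = 3  # number of spins
# Every ordered sequence of n spins is equally likely (probability 1/N**n),
# so averaging over all sequences gives exact expectations.
averages = [sum(spins) / n for spins in itertools.product(faces, repeat=n)]

mean_of_averages = sum(averages) / len(averages)
sd_of_averages = math.sqrt(
    sum((a - mean_of_averages) ** 2 for a in averages) / len(averages)
)

print(mean_of_averages, mu)                    # agree up to rounding
print(sd_of_averages, sigma / math.sqrt(n))    # agree up to rounding
```

Because all \(N^n\) outcomes are enumerated, no simulation error is involved: the mean of the averages equals \(\mu\) and their standard deviation equals \(\sigma/\sqrt{n}\) up to floating-point rounding.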
INF_TS2A: Standard Error
Definition:
Estimated Standard Deviation of the Estimator
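A. Standard Error of the Mean = \(S/\sqrt{n}\)
Standard Deviation of the Average \(\bar{X}\) = \(\sigma/\sqrt{n}\)
Problem: the standard deviation cannot be computed without the population
parameter \(\sigma\), so the sample standard deviation \(S\) is used in its place.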
Example
Sample Size n = 100
Sample Average \(\bar{X}\) = $633.91
Sample Standard Deviation S = $311.49
What is the expected estimation error?
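One way to answer the question is to compute the standard error of the mean, \(S/\sqrt{n}\), from the figures in the example. The variable names below are chosen for this illustration only.

```python
import math

n = 100          # sample size
x_bar = 633.91   # sample average, in dollars
s = 311.49       # sample standard deviation, in dollars

# Standard error of the mean: S / sqrt(n).
standard_error = s / math.sqrt(n)
print(f"Standard error of the mean: ${standard_error:.2f}")  # $31.15
```

So when $633.91 is used to estimate the population mean, the estimate is typically off by roughly $31.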
B. Standard Error of the Proportion = \(\sqrt{p(1-p)/n}\)
Standard Deviation of p = \(\sqrt{\pi(1-\pi)/n}\)
Problem:
Example
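A small numerical illustration of the standard error of the proportion follows. The survey figures are hypothetical, chosen only to show the arithmetic; they do not come from the slides, and the function name is made up for the example.

```python
import math


def standard_error_of_proportion(p, n):
    """Estimated SD of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical survey: 40 of n = 100 sampled cardholders pay more than
# the required minimum, so p = 0.40. These numbers are illustrative only.
se = standard_error_of_proportion(p=0.40, n=100)
print(f"Standard error of the proportion: {se:.3f}")  # 0.049
```

The sample proportion p stands in for the unknown population proportion \(\pi\), just as S stands in for \(\sigma\) in the standard error of the mean.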
Appendix
Terminology of Sampling
• Population: Size = N
• Sample: Size = n
• Census
• Representative Sample (Random Sample)
Terminology of Sampling (cont.)
• Biased Sample: Non-random sample
• Sampling With/Without Replacement
• Frame
• Pilot Study
Parameter, Statistic & Estimator
• Parameter and Statistic
• Estimator and Estimate
• Estimation Error (Sampling Error)