```
BA 275
Agenda
• Summarizing Quantitative Data
  - The Stem-and-Leaf Display
  - The Box and Whisker Plot
  - The Empirical Rule
• Quiz #2
Announcement
• TA: Shen. 8:00 – 9:50 am, Friday at 328 Bx.
Quiz #1
Part 1. One of the conclusions of the Nationwide Personal
Transportation Survey was that "In 1990, women drove 76%
more on average than they did in 1969. However, women still
drove 7,000 miles less on average than men in a year."
Question 1. What is the numerical value of 76% an example of?
A. A population parameter based on driving behavior in 1990
B. A population parameter based on comparing driving behavior in
1969 and 1990
C. A sample statistic based on driving behavior in 1990
D. A sample statistic based on comparing driving behavior in 1969
and 1990
E. None of the above
Quiz #1
Part 1. One of the conclusions of the Nationwide Personal
Transportation Survey was that "In 1990, women drove 76%
more on average than they did in 1969. However, women still
drove 7,000 miles less on average than men in a year."
Question 2. What is the numerical value of 7,000 miles an
example of?
A. A population parameter based on comparing driving behavior in
1969 and 1990
B. A sample statistic based on driving behavior in 1990
C. A sample statistic based on comparing driving behavior in 1969
and 1990
D. A population parameter based on driving behavior in 1990
E. None of the above
Quiz #1
Part 3. A college researcher surveys a random sample
of 500 students. Among other things, the survey
asks whether students have observed any instance
of racial discrimination on campus. One hundred
students return completed surveys, and 53 report
having observed such discrimination. The
researcher then reports that racial discrimination is
common on campus. This report is:
A. Invalid because the sample is such a small percent
of the student body
B. Invalid because it is a volunteer sample
C. Valid because the sample is random
D. Valid because the sampling frame is the entire
student population
Statistical Analysis
POPULATION
  parameters: μ, σ², σ, p, etc.
Selecting a random sample: X1, X2, …, Xn
Describing uncertainty:
  Random variables, Probability, Distributions
  Discrete: binomial distribution
  Continuous: normal distribution
  Sampling distribution of the sample mean
Sample of size n: x1, x2, …, xn
  statistics: x̄, s², s, p̂, etc.
Organizing data:
  Qualitative
  Quantitative
Drawing conclusions from data:
  Estimation
  Hypothesis Testing
  Regression Analysis
  Contingency Tables
CEO Data
NO   AGE   SALARY   EDUCATION
 1    53      145   Bachelors
 2    36      291   Masters
 3    48      659   Doctorate
 4    53      298   Masters
 5    46      250   Doctorate
 6    50      291   Bachelors
 7    59      296   Bachelors
 8    48      388   Masters
 9    43      621   Masters
10    45       58   Bachelors
 :     :        :   :
54    55      736   Bachelors
55    51      368   Bachelors
56    58      217   Bachelors
57    44      206   Bachelors
58    51      536   Masters
59    70      213   Masters
60    43      573   Bachelors
Summary Statistics
                      AGE        SALARY
---------------------------------------------------
Count                 60         60
Average               51.4667    403.883
Median                50.0       350.0
Mode                  50.0
Variance              79.609     47815.6
Standard deviation    8.92239    218.668
Minimum               32.0       21.0
Maximum               74.0       1103.0
Range                 42.0       1082.0
Lower quartile        45.5       250.0
Upper quartile        57.0       539.5
Interquartile range   11.5       289.5
---------------------------------------------------
Stem-and-Leaf Display for AGE
   2    3|23
   5    3|678
  11    4|013344
  25    4|55556677788889
 (12)   5|000000112333
  23    5|555666677889
  11    6|0111223
   4    6|99
   2    7|04
Box-and-Whisker Plot for Salary
[Figure: box-and-whisker plot of Salary, axis from 0 to 1200]
Box-and-Whisker Plot for Salary
[Figure: box-and-whisker plot of Salary, axis from 0 to 1200, with invisible lines marked at Q1 – 1.5 x IQR and Q3 + 1.5 x IQR]
IQR = 289.5
1.5 x IQR = 434.25
Q1 = 250.0
Q1 – 1.5 x IQR = -184.25
Q3 = 539.5
Q3 + 1.5 x IQR = 973.75
Smallest observation within the invisible lines = Minimum
Largest observation within the invisible lines ≠ Maximum
Two or Multiple-Sample Comparison
SG+: Compare / Multiple Samples / Multiple-Sample Comparison
     Compare / Two Samples / Two-Sample Comparison
Summary Statistics for AGE
EDUCATION    Count    Average    Median    Standard deviation
-------------------------------------------------------------
Bachelors    25       51.92      53.0      8.10309
Doctorate    10       47.3       47.0      9.00679
Masters      22       53.8636    52.5      9.34164
None         3        44.0       45.0      6.55744
-------------------------------------------------------------
Total        60       51.4667    50.0      8.92239

EDUCATION    Maximum    Lower quartile    Upper quartile
-------------------------------------------------------------
Bachelors    69.0       47.0              57.0
Doctorate    62.0       41.0              50.0
Masters      74.0       48.0              58.0
None         50.0       37.0              50.0
-------------------------------------------------------------
Total        74.0       45.5              57.0
Two or Multiple-Sample Comparison
SG+: Compare / Multiple Samples / Multiple-Sample Comparison
     Compare / Two Samples / Two-Sample Comparison
[Figure: box-and-whisker plots of AGE by EDUCATION (Bachelors, Doctorate, Masters, None), AGE axis from 32 to 82]
Questions: CEO
1. What is the mean age? The mean salary?
2. What is the median age? The median salary?
3. How many CEOs in the sample are younger than 45 years old?
4. How many CEOs in the sample are making more than $540K a year?
5. If the youngest and oldest CEOs are removed from the sample, what is the new median? What is the new mean? Does the new variance (or standard deviation) increase or decrease?
Exercise 1.1
• In your own words, define and give an example of each of the following statistical terms.
  - Population
  - Sample
  - Parameter
  - Statistic
```
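
The stem-and-leaf display for AGE lists every observation, so the sample can be rebuilt from it and checked against the Summary Statistics slide. The Python sketch below does that, and it also recomputes the box-plot "invisible lines" for SALARY from the quartiles given on the slides. It assumes a leaf unit of 1 (so `3|2` means 32), which the transcript does not state. The names `STEM_AND_LEAF`, `expand`, and `fences` are made up for this illustration and are not part of the course software (SG+).

```python
import statistics

# Stem-and-leaf rows for AGE, copied from the display as (stem, leaves).
# Assumption: leaf unit is 1, so stem 3 with leaf 2 is the age 32.
STEM_AND_LEAF = [
    (3, "23"),
    (3, "678"),
    (4, "013344"),
    (4, "55556677788889"),
    (5, "000000112333"),
    (5, "555666677889"),
    (6, "0111223"),
    (6, "99"),
    (7, "04"),
]


def expand(rows):
    """Turn (stem, leaves) rows into the list of individual values."""
    return [10 * stem + int(leaf) for stem, leaves in rows for leaf in leaves]


def fences(q1, q3):
    """Return (Q1 - 1.5 x IQR, Q3 + 1.5 x IQR), the box plot's invisible lines."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


ages = expand(STEM_AND_LEAF)

# Compare these with the AGE column of the Summary Statistics slide.
print("count   :", len(ages))
print("mean    :", round(statistics.mean(ages), 4))
print("median  :", statistics.median(ages))
print("mode    :", statistics.mode(ages))
print("variance:", round(statistics.variance(ages), 3))  # n - 1 divisor

# Counting question from the CEO slide: ages below 45.
print("younger than 45:", sum(age < 45 for age in ages))

# Invisible lines for SALARY, using Q1 = 250.0 and Q3 = 539.5 from the slides.
lower, upper = fences(250.0, 539.5)
print("salary fences:", lower, upper)
```

Run as a script, it prints the count, mean, median, mode, and sample variance of the rebuilt ages, the number of ages below 45, and the two salary fences. `statistics.variance` divides by n - 1, which is the convention that reproduces the variance shown on the slide. The salary maximum of 1103.0 lies beyond the upper fence, which is why the largest observation within the invisible lines is not the maximum.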