Problems

Definitions
Population: A collection, or set, of individuals, objects, or
events whose properties are to be analyzed.
Sample: A subset of the population.
We desire knowledge about an entire population, but it is
most often prohibitively expensive to study every member, so
we select a representative sample from the population and
study the individual items in the sample.
Descriptive Statistics: The collection, presentation, and
description of the sample data.
Inferential Statistics: The technique of interpreting the
values resulting from the descriptive techniques and
making decisions and drawing conclusions about the
population.
Section 1.1, Page 4
1
Definitions
Parameter: A numerical value summarizing all the data of a
population. For example, the average high school grade
point of all Shoreline Students is 3.20. We often use Greek
letters to identify parameters, μ = 3.20.
Statistic: A numerical value summarizing the sample data.
For example, the average grade point of a sample of
Shoreline Students is 3.18. We would use the symbol
x̄ = 3.18.
The statistic corresponds to the parameter. We usually don’t
know the value of the parameter, so we take a sample and
estimate it with the corresponding statistic.
Sampling Variation: While the parameter of a population is
considered a fixed number, the corresponding statistic will
vary from sample to sample. Also, different populations give
rise to more or less sampling variability. Considering the
variable age, samples of 60 students from a community
college would have less variability than samples of the same
size from a Seattle neighborhood.
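The idea of a fixed parameter and a varying statistic can be seen in a small simulation. The sketch below is illustrative only: the population of grade points is invented (it is not Shoreline data), and the seed, population size, and spread are assumptions chosen for the demonstration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 5,000 grade points, invented for illustration.
population = [min(4.0, max(0.0, rng.gauss(3.2, 0.4))) for _ in range(5000)]

# The parameter: one fixed number for this population.
mu = statistics.mean(population)

# The statistic: the mean of each sample of 60 differs from sample to sample.
sample_means = [statistics.mean(rng.sample(population, 60)) for _ in range(5)]

print(f"parameter mu = {mu:.3f}")
for i, x_bar in enumerate(sample_means, start=1):
    print(f"sample {i}: x-bar = {x_bar:.3f}")
```

Each run of the loop draws a different sample, so the printed sample means scatter around the single population mean.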
Section 1.1, Page 4
2
Variables
Variable: A characteristic of interest about each
element of a population.
Data: The set of values collected for the variable from
each of the elements that belong to the sample.
Variability: The extent to which data values for a
particular variable differ from each other.
Numerical or Quantitative Variable: A variable that
quantifies an element of the population. The HS
grade point of a student is a numerical variable.
Numerical variables are numbers for which math
operations make sense. The average grade point of a
sample makes sense.
Continuous Numerical Variable: The variable can take
on an uncountable number of values between
two points on the number line. An example is the
weight of people.
Discrete Numerical Variable: The variable can take on
a countable number of values between two points on
a number line. An example is the price of statistics
textbooks.
Section 1.1, Page 8
3
Variables (2)
Categorical or Qualitative Variable: A variable that describes or
categorizes an element of a population. The gender of a person
would be a categorical variable. The categories are male and
female.
Nominal Categorical Variable: A categorical variable that uses a
number to describe or name an element of a population. An
example is a telephone area code. It is a number, but not a
numerical variable used in math operations. The average area
code does not make sense.
Ordinal Categorical Variable: A categorical variable that
incorporates an ordered position or ranking. An example would
be a survey response that ranks “very satisfied” ahead of
“satisfied” ahead of “somewhat satisfied.” Limited math
operations may be done with ordinal variables.
Section 1.1, Page 8
4
Problems
Problems, Page 19
5
Problems
Problems, Page 18
6
Problems
Section 1.3, Page 20
7
Observational Studies and
Experiments
Observational Study: Researchers collect data without
modifying the environment or controlling the process being
observed. Surveys and polls are observational studies.
Observational studies cannot establish causality.
Example: For a randomly selected high school,
researchers collect data on each student's grade point and
whether the student has music training, to see if there is a
relationship between the two variables.
Experiments: Researchers collect data in a controlled
environment. The investigator controls or modifies the
environment and observes the effect of a variable under
study. Experiments can establish causality.
Example: Randomly divide a sample of people with
migraine headaches into control and treatment groups.
Give the treatment group an experimental medication and
the control group a placebo, and then measure and
compare the reduction of frequency and severity of
headaches for both groups.
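The random division into groups in the migraine example could be sketched as follows. The participant IDs, the group size of 20, and the seed are hypothetical; only the idea of random assignment comes from the slide.

```python
import random

rng = random.Random(7)  # fixed seed so the assignment is repeatable

# Hypothetical participant IDs; a real study would use its own roster.
participants = [f"P{i:02d}" for i in range(1, 21)]

# Shuffle a copy, then split it in half: chance alone decides the groups.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment = shuffled[:half]   # would receive the experimental medication
control = shuffled[half:]     # would receive the placebo

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```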
Section 1.3, Page 12
8
Single-Stage Sampling Methods
Single-stage sampling: A sample design in which the
elements of the sampling frame are treated equally and there
is no subdividing or partitioning of the frame.
Simple Random Sample: Sample selected in such a way
that every element of the population has an equal
probability of being selected and all samples of size n have
an equal probability of being selected.
Example: Select a simple random sample of 6
students from a class of 30.
1. Number the students from 1 to 30 on the roster.
2. Get 6 non-repeating random numbers between 1
and 30.
3. The six students who match the six random
numbers are the sample.
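The three steps above can be sketched in a few lines. This is a minimal illustration, assuming the roster is simply numbered 1 to 30; the seed is arbitrary.

```python
import random

rng = random.Random(2024)  # arbitrary seed, only for repeatability

# Step 1: the roster numbers 1..30 stand in for the students.
roster_numbers = range(1, 31)

# Step 2: draw 6 distinct numbers; sample() never repeats an element.
sample = sorted(rng.sample(roster_numbers, 6))

# Step 3: the students with these roster numbers form the sample.
print(sample)
```

Because `random.sample` draws without replacement, every set of 6 roster numbers is equally likely, which is the defining property of a simple random sample.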
Section 1.3, Page 13
9
Multistage Sampling Designs
Multistage Sampling: A sample design in which the
elements of the sampling frame are subdivided and the
sample is chosen in more than one stage.
Stratified Random Sampling: A sample is selected by
stratifying the population, or sampling frame, and then
selecting a number of items from each of the strata by
means of a simple random sampling technique.
The strata are usually subgroups of the sampling frame
that are homogeneous but different from each other.
Example: Select a sample of six students from a
class of 30 so that the sample contains an equal
number of males and females.
1. List the males and females separately.
2. Take a simple random sample of 3 students
from each group.
3. The six students selected are the sample.
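A sketch of the stratified procedure follows. The split of the class into 14 males and 16 females is an assumption made for the example; the slide gives only the class size of 30.

```python
import random

rng = random.Random(11)  # arbitrary seed, only for repeatability

# Step 1: list the strata separately (hypothetical split of the 30 students).
males = [f"M{i}" for i in range(1, 15)]     # 14 students
females = [f"F{i}" for i in range(1, 17)]   # 16 students

# Step 2: take a simple random sample of 3 from each stratum.
male_sample = rng.sample(males, 3)
female_sample = rng.sample(females, 3)

# Step 3: the six selected students are the sample.
sample = male_sample + female_sample
print(sample)
```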
Section 1.3, Page 15
10
Multistage Sampling Designs (2)
Cluster Sample: A sample obtained by stratifying the
population, or sampling frame, and then selecting
some or all of the items from some, but not all, of the
strata.
The strata are usually easily identified subgroups of
the sampling frame that are similar to each other.
This is often the most economical way to sample a
large population.
Example: Take a sample of 300 Catholics in
the Seattle Area.
1. Get a list of the Catholic Parishes in the
Seattle area.
2. Take a random sample of 3 parishes.
3. In each parish, select a simple random
sample of 100 parishioners.
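The cluster procedure can be sketched the same way. Everything about the parishes here is invented for illustration: the number of parishes (12), their membership sizes, and the member labels are all assumptions.

```python
import random

rng = random.Random(3)  # arbitrary seed, only for repeatability

# Step 1: a hypothetical list of 12 parishes, each with 400 to 1,500 members.
parishes = {
    f"Parish {k}": [f"{k}-{m}" for m in range(rng.randint(400, 1500))]
    for k in range(1, 13)
}

# Step 2: take a random sample of 3 parishes (the clusters).
chosen = rng.sample(sorted(parishes), 3)

# Step 3: within each chosen parish, take a simple random sample of 100.
sample = [person for name in chosen for person in rng.sample(parishes[name], 100)]

print(chosen, len(sample))
```

Only the three chosen parishes need a full membership list, which is why this design tends to be economical for large populations.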
Section 1.3, Page 16
11
Problem
a. Find the mean, variance, and standard deviation.
b. Find the 5-number summary.
c. Make a box-and-whisker display and label the numbers.
d. Calculate the interquartile range and the range.
e. Describe the shape of the distribution.
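The data set for this problem is not reproduced in the transcript, so the sketch below uses a small hypothetical data set to show how parts a, b, and d might be computed. Quartile conventions differ between textbooks, calculators, and software; the `"inclusive"` method used here is one common choice and is an assumption, not necessarily the course's convention.

```python
import statistics

# Hypothetical data; the original problem's data are not in the transcript.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12]

# Part a: mean, sample variance (divides by n - 1), and standard deviation.
mean = statistics.mean(data)
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Part b: five-number summary (min, Q1, median, Q3, max).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number = (min(data), q1, median, q3, max(data))

# Part d: interquartile range and range.
iqr = q3 - q1
data_range = max(data) - min(data)

print(f"mean={mean:.2f} variance={variance:.2f} sd={std_dev:.2f}")
print("five-number summary:", five_number)
print("IQR:", iqr, "range:", data_range)
```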
Problems, Page 50
12
Summary of Probability Formulas
Equally Likely Outcomes: P(A) = n(A)/n
Complement: P(A) = 1 – P(not A); P(not A) = 1 – P(A)
General Addition Rule:
P(A or B) = P(A) + P(B) – P(A and B)
If A and B are disjoint, P(A and B) = 0.
Then the Special Addition Rule applies:
P(A or B) = P(A) + P(B)
General Multiplication Rule:
P(A and B) = P(A)×P(B|A)
If A and B are independent, P(B|A) = P(B)
Then the Special Multiplication Rule:
P(A and B) = P(A)×P(B)
Odds
If the odds for A are a:b, then the odds against A are
b:a. The probability of A is a/(a+b). The probability of
not A is b/(b+a)
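The formulas above can be checked on a small sample space. The sketch below uses one roll of a fair die with two events chosen only for illustration (A = "even", B = "greater than 3"); exact fractions avoid rounding.

```python
from fractions import Fraction

# Sample space for one roll of a fair die; all outcomes equally likely.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # illustrative event: the roll is even
B = {4, 5, 6}   # illustrative event: the roll is greater than 3

def prob(event):
    """Equally likely outcomes: P(event) = n(event) / n."""
    return Fraction(len(event & outcomes), len(outcomes))

p_a, p_b = prob(A), prob(B)
p_a_and_b = prob(A & B)
p_a_or_b = prob(A | B)
p_not_a = prob(outcomes - A)
p_b_given_a = p_a_and_b / p_a   # conditional probability P(B|A)

# Complement, general addition rule, general multiplication rule.
assert p_a == 1 - p_not_a
assert p_a_or_b == p_a + p_b - p_a_and_b
assert p_a_and_b == p_a * p_b_given_a

def prob_from_odds(a, b):
    """If the odds for A are a:b, then P(A) = a / (a + b)."""
    return Fraction(a, a + b)

print(p_a_or_b, p_b_given_a, prob_from_odds(3, 2))
```

Here P(B|A) = 2/3 differs from P(B) = 1/2, so A and B are not independent and the special multiplication rule would not apply.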
Chapter 4
13
Problems
Problems, Page 95
14
Problem
Problems, Page 95
15
Problems
Problems, Page 97
16
Problems
Problems, Page 99
17
Z Score Problems
Problems, Page 52
18
Problems
Problems, Page 132
19
Problems
6.51 IQ scores are normally distributed with a
mean of 100 and a standard deviation of 16. Find
the following:
a. The 66th percentile.
b. The 80th percentile.
c. The minimum score required to be in the top
10%.
d. The minimum score to be in the top 25%.
6.52 Find the two z-scores that bound the middle
30% of the standard normal distribution.
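One possible way to check answers to 6.51 and 6.52 is with the inverse normal function in Python's standard library, which plays the same role as an inverse-normal calculator function. This is a sketch of the approach, not the course's prescribed method; the variable names are illustrative.

```python
from statistics import NormalDist

# 6.51: IQ scores modeled as normal with mean 100 and standard deviation 16.
iq = NormalDist(mu=100, sigma=16)

p66 = iq.inv_cdf(0.66)            # a. 66th percentile
p80 = iq.inv_cdf(0.80)            # b. 80th percentile
top10_cutoff = iq.inv_cdf(0.90)   # c. top 10% starts at the 90th percentile
top25_cutoff = iq.inv_cdf(0.75)   # d. top 25% starts at the 75th percentile

# 6.52: the middle 30% leaves 35% in each tail, so the bounds are the
# 35th and 65th percentiles of the standard normal distribution.
z = NormalDist()
z_upper = z.inv_cdf(0.65)
z_lower = -z_upper                # by symmetry

print(f"{p66:.1f} {p80:.1f} {top10_cutoff:.1f} {top25_cutoff:.1f}")
print(f"z between {z_lower:.3f} and {z_upper:.3f}")
```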
Problems, Page 133
20
Problems
Problems, Page 149
21
Problems
Problems, Page 151
22
Problems
Problems, Page 50
23
Problems
Problems, Page 179
24
Problems
Test the claim that the BMI of the cardiovascular
technologists is different than the BMI of the general
population. Use α = .05. Assume the population of the
BMI of the cardiovascular technologists is normal.
a. State the necessary hypotheses.
b. Is the sampling distribution normal? Why?
c. Find the p-value.
d. State your conclusion.
e. If you made an error, what type of error did you make?
Problems, Page 181
25
Problems
Problems, Page 179
26
Problems
a. Find the 98% confidence interval.
b. Find the critical value.
c. Find the margin of error.
d. Find the standard error.
e. What assumption must we make about the
population to have a t-sampling distribution?
f. What are the proper words to describe the
confidence interval?
g. If you wanted to have a margin of error of one
minute and the 98% confidence interval for this
data, how large must the sample be?
Problems, Page 205
27
Problems
a. Find the p-value.
b. State your conclusion.
c. What is the name of the probability model used for
the sampling distribution?
d. What is the mean of the sampling distribution?
e. What is the value of the standard error?
f. If your conclusion is in error, what type of error is it?
Problems, Page 205
28
Problems
Problems, Page 208
29
Problems
a. Check the conditions for a normal sampling distribution.
b. State the hypotheses.
c. Find the p-value.
d. State your conclusion.
e. If you make an error in your conclusion, what type is it?
f. Find the mean of the sampling distribution.
g. Find the standard error of the sampling distribution.
Problems, Page 207
30
Dependent and Independent
Samples
Section 10.1, Page 208
31
Problems
a. Test the hypothesis that the people
increased their knowledge. Use α = .05
and assume normality. State the
appropriate hypotheses.
b. Find the p-value and state your
conclusion.
c. Find the 90% confidence interval for
the mean estimate of the increase in
test scores.
Problems, Page 231
32
Problems
a. State the hypotheses (assume normality).
b. Find the p-value and state your conclusion.
c. Find the 95% confidence interval for the
difference of the means, Gouda – Brie.
d. Find the mean and standard error of the sampling
distribution.
Problems, Page 232
33
Problems
a. State the appropriate hypotheses.
b. Find the p-value and state your conclusion.
c. What model is used for the sampling distribution and
what is the mean of the sampling distribution and its
standard error?
d. Find the 98% confidence interval for the difference in
proportions, men – women.
Problems, Page 234
34
Summary of Chi-Square Applications
Goodness of Fit Test
Given one categorical variable with a fixed set of
proportions for the categories.
Ha: The observed data does not fit the proportions.
Calculate expected values (Ho true proportion * total
observations)
Observed and Expected data in List Editor
PRGM: GOODFIT
Test for Independence
Given two categorical variables measured on the
same population.
Ha: The variables are not independent (They are
related)
Observed data in Matrix Editor
Stat-Tests-χ2 Test
Test for Homogeneity
Given one categorical variable and two or more
populations.
Ha: The proportions for the categories are not the
same for all populations.
Observed data in Matrix Editor
Stat-Tests-χ2 Test
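For the goodness-of-fit case, the expected values and the test statistic can be computed by hand as sketched below. The observed counts are hypothetical (60 rolls of a die, invented for this illustration), and the statistic uses the standard formula, the sum of (observed – expected)² / expected over the categories.

```python
# Hypothetical observed counts for 60 rolls of a die (faces 1..6).
observed = [14, 8, 9, 9, 10, 10]

# Ho: every face has proportion 1/6.
null_proportions = [1 / 6] * 6

# Expected value = Ho true proportion * total observations.
total = sum(observed)
expected = [p * total for p in null_proportions]

# Chi-square statistic: sum of (O - E)^2 / E over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of categories - 1.
df = len(observed) - 1

print(f"chi-square = {chi_square:.2f}, df = {df}")
```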
Chapter 12, Summary
35
Chi-Square Distribution
Fair Die Example
Now we need a sampling distribution for the χ² statistic,
2.2, so we can calculate the probability of getting a χ²
≥ 2.2 when the true proportions are all equal to 1/6.
χ² Distribution for 5 df
This is a distribution of all possible χ² statistics
calculated from all possible samples of 60
observations when there are 6 proportions or cells.
Note that the degrees of freedom equal the number
of proportions – 1.
Finding the p-value on the TI-83, Given χ² Stat, df
PRGM – CHI2DIST
LOWER BOUND: 2.2
UPPER BOUND: 2ND E99
df: 5
Output: P-VALUE = 0.8208
The null hypothesis cannot be rejected.
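The same upper-tail probability can be reproduced without the calculator program. The sketch below is one possible implementation using only the standard library: it evaluates the chi-square upper tail through a series for the regularized incomplete gamma function. The function name is an assumption, and the series is intended for moderate statistic values like the one here, not for extreme tails.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    # Series for the regularized lower incomplete gamma function P(a, x/2):
    # P = (x/2)^a * e^(-x/2) / Gamma(a + 1) * sum_n (x/2)^n / ((a+1)...(a+n))
    term = 1.0
    total = 1.0
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = math.exp(a * math.log(half_x) - half_x - math.lgamma(a + 1)) * total
    return 1.0 - lower

# Fair die example: chi-square statistic 2.2 with 5 degrees of freedom.
p_value = chi_square_sf(2.2, 5)
print(f"p-value = {p_value:.4f}")
```

A p-value this large is far above any usual α, which agrees with the conclusion that the null hypothesis cannot be rejected.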
Section 11.2, Page 240
36
Problems
a. Perform a hypothesis test to see if the
preferences are not all the same. State the
hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the model used for the
sampling distribution?
Problems, Page 252
37
Problems
a. Perform a hypothesis test to see if the
preferences are not all the same. State the
hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the model used for the
sampling distribution?
Problems, Page 252
38
Problems
a. Test the hypothesis that the size of
community reared in is independent of the
size of community residing in. State the
appropriate hypotheses.
b. Find the p-value and state your conclusion
c. What is the name of the sampling
distribution?
d. What are the necessary conditions, and are
they satisfied? What is the value of the
smallest expected cell?
Section 11.3, Page 254
39
The F-Distribution
4. Each sample must be from a normal distribution.
Sec 10.5, Page 226
40
Problem
Set up the problem so that the F-Stat > 1.
a. State the necessary hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the model used for the
sampling distribution?
Problems, Page 234
41
Problems
a. State the necessary hypotheses.
b. Sketch the side-by-side box plots. Does it appear
that the means are all the same?
c. Find the p-value and state your conclusion.
d. What is the name of the model used for the
sampling distribution?
Sec 12.1, Page 268
42
Problems
City           Sample Size   Sample Mean   Sample St. Dev.
Atlanta             6           24.67            7.76
Boston              7           33.00            9.56
Dallas              7           30.86            7.58
Philadelphia        5           32.20            7.47
Seattle             5           27.40            9.40
St. Louis           6           25.83           10.03
a. Test the hypothesis that the mean commute
times are not all the same. State the appropriate
hypotheses.
b. Find the p-value and state your conclusion.
c. What is the name of the sampling distribution?
d. What is the F-Statistic, the df numerator and df
denominator?
Problems, Page 268
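Part d can be approached from the summary statistics alone. The sketch below applies the standard one-way ANOVA formulas (mean square between groups divided by mean square within groups) to the sample sizes, means, and standard deviations listed for the six cities. It assumes each triple of numbers in the listing is size, mean, standard deviation in that order; the p-value is left to a calculator or table because the F distribution is not in Python's standard library.

```python
# (sample size, sample mean, sample standard deviation) for each city,
# read from the problem's summary listing.
groups = {
    "Atlanta": (6, 24.67, 7.76),
    "Boston": (7, 33.00, 9.56),
    "Dallas": (7, 30.86, 7.58),
    "Philadelphia": (5, 32.20, 7.47),
    "Seattle": (5, 27.40, 9.40),
    "St. Louis": (6, 25.83, 10.03),
}

k = len(groups)                                   # number of groups
n_total = sum(n for n, _, _ in groups.values())   # total observations

# Grand mean: the size-weighted average of the group means.
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group and within-group sums of squares.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())

df_num = k - 1          # df numerator
df_den = n_total - k    # df denominator

f_stat = (ss_between / df_num) / (ss_within / df_den)

print(f"F = {f_stat:.3f}, df numerator = {df_num}, df denominator = {df_den}")
```

Because the inputs are rounded summary values, the result may differ slightly from a calculation on the raw commute times.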
43