Probability and Statistics v. 2016 - West Jefferson Hills School District

2015 - 2016
Probability & Statistics
In this course, students will summarize, represent, and interpret data on a single count or measurement
variable and on two categorical and quantitative variables. Students will also interpret linear models,
understand and evaluate random processes underlying statistical experiments, and make inferences/justify
conclusions from sample surveys, experiments, and observational studies. The course follows a non-theoretical approach, without formal proofs, explaining concepts intuitively, and supporting them with
abundant examples.
Course Information:
Frequency & Duration: Daily for 42 minutes
Text: Elementary Statistics, 9th edition, by Allan G. Bluman (McGraw-Hill Education).
Content: Descriptive and Inferential Statistics
Duration: Aug./Sept. (3 weeks)
Essential Question: How do we use statistics to analyze raw data?
Skill:
Make inferences and justify conclusions.
Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.
Interpret categorical and quantitative data.
Understand and evaluate random processes underlying statistical experiments.
Understand statistics as a process for making inferences about population parameters based on a random sample from that population.
Understand discrete versus continuous data.
Understand the four levels of measurement.
Understand the five different sampling methods.
Understand independent versus dependent variables.
Understand the Hawthorne Effect.
Assessment:
Define statistics.
Explain the difference between descriptive and inferential statistics.
How does a population differ from a sample?
What is a variable?
What is meant by a biased sample?
Is the number of pages in your statistics book discrete or continuous?
Resources: Chapter 1, pages 1 through 40.
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Descriptive Statistics- collection, organization, and presentation of data; Experimental
Study- the researcher manipulates one of the variables and tries to determine how the
manipulation influences the other variables; Hawthorne Effect- changing your behavior
because you know you’re in an experiment; Inferential Statistics- generalizing from samples
to populations, hypothesis testing, and making predictions; Observational Study- when the
researcher observes what is currently happening or what has already happened and then draws
conclusions; Population- all subjects that are being studied; Qualitative Variables- variables
that have distinct categories according to some characteristic or attribute; Quantitative
Variables- variables that can be counted or measured; Sample- small group selected from the
population; Statistics- the science of conducting studies to collect, organize, summarize,
analyze, and draw conclusions from data; Types of Sampling Methods- random: all members
have an equal chance of being selected, systematic: sample obtained by selecting every kth
member where k is a counting number, stratified: sample obtained by breaking the population
into subgroups and then selecting from each subgroup/stratum, cluster: divide the population into
sections/clusters and then select, convenience: using subjects/people who are convenient to us
Comments:
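The sampling methods defined in the vocabulary can be illustrated with a short Python sketch. The roster of 100 numbered students, the grade-level strata, the homeroom clusters, and every function name are hypothetical and chosen only for illustration. Convenience sampling is left out because it involves no random selection.

```python
import random

def simple_random_sample(population, n, rng):
    """Random: every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic: random start among the first k members, then every kth member."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified: select randomly from each subgroup (stratum)."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Cluster: randomly choose whole clusters and keep every member of them."""
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

rng = random.Random(2016)        # fixed seed so the sketch is repeatable
students = list(range(1, 101))   # hypothetical roster: students numbered 1..100
strata = {"grade 11": students[:50], "grade 12": students[50:]}   # assumed subgroups
homerooms = [students[i:i + 20] for i in range(0, 100, 20)]       # assumed clusters

random_10 = simple_random_sample(students, 10, rng)
every_10th = systematic_sample(students, 10, rng)
five_per_grade = stratified_sample(strata, 5, rng)
two_homerooms = cluster_sample(homerooms, 2, rng)
```

Passing the random generator in as an argument keeps each function repeatable, which makes it easier to compare the samples the different methods produce.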
Content: Frequency Distributions and Graphs
Duration: Sept./Oct. (4 weeks)
Essential Question: How does a graph help us to better understand data?
Skill:
Summarize, represent, and interpret data on a single count or measurement variable.
Represent data with plots on the real number line (dot plots, histograms, and box plots).
Summarize categorical data for two categories in two-way frequency tables.
Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.
Assessment:
Given raw data, make a frequency distribution, categorical frequency distribution, grouped frequency distribution, cumulative frequency distribution, and/or an ungrouped frequency distribution.
Given raw data, make a histogram, ogive, dot plot, and/or stem-and-leaf plot.
Given a stem-and-leaf plot, find the range, median, and interquartile range of the data.
Resources: Chapter 2, pages 41 through 108
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Bar Graph- represents the data by using vertical or horizontal bars whose lengths represent frequencies
of the data; Categorical Frequency Distribution- placing raw data into specific categories;
Compound Bar Graph- bar graph that compares two or more groups at once; Cumulative Frequency
Distribution- distribution that shows the number of data values that are less than or equal to a specific
value; Dot Plot- statistical graph in which each data value is plotted as a point/dot above the horizontal
axis; Frequency Distribution- the organization of raw data in table form, using classes and frequencies;
Grouped Frequency Distribution- used when the range of data is large and the data must be grouped into
classes that are more than one unit in width; Frequency Polygon- graph that displays the data by using lines
that connect points plotted for the frequencies at the midpoints of the classes; Histogram- graph that
displays data in contiguous vertical or horizontal bars that represent frequencies; Ogive- graph that
represents the cumulative frequencies for the classes in a frequency distribution; Pareto Chart-
frequency distribution displayed as a bar graph where the bars are listed in order from tallest to shortest;
Pie Graph- circle that is divided into wedges according to the percentage of frequencies in each
category; Raw Data- data in original, unorganized form; Stem-and-Leaf Plot- data plot that uses part
of the data value as the stem and part of the data value as the leaf; Time-Series Graph- represents data
that occur over a specific period of time; Ungrouped Frequency Distribution- using single data values
for each class because the range of data values is small
Comments:
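As a rough illustration of the assessment tasks above, the following Python sketch builds a grouped frequency distribution, the cumulative frequencies an ogive would plot, and a stem-and-leaf plot. The test scores, the class limits (50 to 99 in classes of width 10), and the function names are all hypothetical.

```python
from collections import defaultdict
from itertools import accumulate
from statistics import median

scores = [52, 67, 71, 71, 75, 78, 83, 84, 84, 84, 90, 95]  # hypothetical raw data

def grouped_frequency(data, lowest_limit, width, n_classes):
    """Count how many values fall in each class; assumes every value fits a class."""
    counts = [0] * n_classes
    for x in data:
        counts[(x - lowest_limit) // width] += 1
    classes = [(lowest_limit + i * width, lowest_limit + (i + 1) * width - 1)
               for i in range(n_classes)]
    return list(zip(classes, counts))

def stem_and_leaf(data):
    """Use the tens digit as the stem and the ones digit as the leaf."""
    plot = defaultdict(list)
    for x in sorted(data):
        plot[x // 10].append(x % 10)
    return dict(plot)

distribution = grouped_frequency(scores, lowest_limit=50, width=10, n_classes=5)
frequencies = [count for _, count in distribution]
cumulative = list(accumulate(frequencies))   # running totals, as plotted on an ogive

plot = stem_and_leaf(scores)
data_range = max(scores) - min(scores)       # high value minus low value
middle = median(scores)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because the leaves are stored in sorted order, the range and median can be read directly from the printed plot, which is what the stem-and-leaf assessment item asks for.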
Content: Data Description
Duration: Oct./Nov. (5 weeks)
Essential Question: Which measure of central tendency best explains the raw data?
Skill:
Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.
Interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers).
Use data from a sample survey to estimate a population mean or proportion; develop a margin of error through the use of simulation models for random sampling.
Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages.
Recognize that there are data sets for which such a procedure is not appropriate.
Use calculators, spreadsheets, and tables to estimate areas under the normal curve.
Assessment:
Given raw data, make a boxplot.
Given raw data, calculate the interquartile range.
Given raw data, calculate standard deviation.
Apply the Empirical Rule for normal data.
Calculate a Z-score and a percentile.
Resources: Chapter 3, pages 109 - 184
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Bimodal- two modes; Boxplot- graph of data that shows the low, the high, the median, and the
quartiles; Chebyshev’s Theorem- the proportion of values from a data set that will fall within k
standard deviations of the mean will be at least $1 - 1/k^2$, where $k > 1$; Coefficient of Variation- standard
deviation divided by the average, expressed as a percentage; Deciles- divide the data into 10 equal groups;
Median- the midpoint of the data after it is placed in numerical order; Empirical Rule- in a normal distribution,
68% of the data falls within one standard deviation of the mean, 95% falls within two standard deviations, and 99.7% falls
within three standard deviations; Interquartile Range- Q3 - Q1; Midrange- an estimation of the
middle of the data, (lowest value + highest value)/2; Multimodal- more than two modes; Outlier- extremely high or extremely low data
value; Parameter- characteristic or measure obtained by using all of the data values from a specific
population; Percentile- divides the data into 100 equal groups; Population Mean- the average of the
entire population; Population Variance- average of the squares of the distance each value is from the
mean; Quartiles- divide the data into 4 equal groups; Rounding Rule- round the mean to one
more decimal place than the raw data; Sample Mean- the average of a sample; Standard
Deviation- square root of the variance; Statistic- characteristic or measure obtained by using the data
values from a sample; Types of Distributions- symmetric, skewed-right, skewed-left; Range- high
value minus the low value; Unimodal- one mode; Weighted Mean- the mean that is used when not all
values are equally represented; Z-Score/Standard Score- $Z = (x - \bar{x})/\sigma$
Comments:
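Several of the measures defined in this unit can be computed with a short Python sketch. The data set and function names are hypothetical. Two conventions here are assumptions rather than anything stated in the course outline: quartiles are taken as the medians of the lower and upper halves of the ordered data, and outliers are flagged with the common 1.5 x IQR rule. Other quartile conventions give slightly different values.

```python
from statistics import mean, median, stdev

data = [5, 6, 12, 13, 15, 18, 22, 50]   # hypothetical raw data

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered), median(ordered[-half:])

def z_score(x, center, spread):
    """Standard score: how many standard deviations x lies from the center."""
    return (x - center) / spread

def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 1 - 1 / k**2

q1, q2, q3 = quartiles(data)
iqr = q3 - q1                                    # Interquartile Range: Q3 - Q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

x_bar = mean(data)                               # sample mean
s = stdev(data)                                  # sample standard deviation
z_for_50 = z_score(50, x_bar, s)                 # standard score of the largest value

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}, outliers={outliers}")
print(f"mean={x_bar}, s={s:.2f}, z(50)={z_for_50:.2f}")
print(f"Chebyshev, k=2: at least {chebyshev_lower_bound(2):.0%} of values")
```

The five values a boxplot needs (low, Q1, median, Q3, high) come straight out of `quartiles` together with `min(data)` and `max(data)`.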
Content: Probability and Counting Rules
Duration: Dec./Jan. (5 weeks)
Essential Question: Would you feel safer if you flew across the US on a commercial airline or if you drove yourself?
Skill:
Use permutations and combinations to compute probabilities of compound events to solve problems.
Calculate the expected value of a random variable; interpret it as the mean of the probability distribution.
Understand the conditional probability of A given B as P(A and B)/P(B), and interpret independence of A and B as saying that the conditional probability of A given B is the same as the probability of A, and the conditional probability of B given A is the same as the probability of B.
Define a random variable for a quantity of interest by assigning a numerical value to each event in a sample space; graph the corresponding probability distribution using the same graphical displays as for data distributions.
Develop a probability distribution for a random variable defined for a sample space in which theoretical probabilities can be calculated; find the expected value. For example, find the theoretical probability distribution for the number of correct answers obtained by guessing on all five questions of a multiple-choice test where each question has four choices, and find the expected grade under various grading schemes.
Weigh the possible outcomes of a decision by calculating the expected value.
Apply the Addition Rule, P(A or B) = P(A) + P(B) - P(A and B), and interpret the answer in terms of the model.
Apply the general Multiplication Rule in a uniform probability model, P(A and B) = P(A)P(B|A) = P(B)P(A|B), and interpret the answer in terms of the model.
Assessment:
Evaluate ₁₀P₇ and ₈C₃.
A coin is flipped and a die is rolled. What is the probability that you get heads and an odd number?
A license plate consists of 3 letters followed by 3 numbers. How many license plates are possible if repeats are/aren’t allowed?
Resources: Chapter 4, pages 186 - 255
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Combination- the selection of distinct objects where order is not important; Complement of an
Event- the set of outcomes in the sample space that are not included in the outcomes of the event;
Empirical Probability- probability from data; Equally-Likely Events- events that have the same
probability of occurring; Independent Events- two events where the outcome of the first does not affect the
probability of the second; Fundamental Counting Rule- the number of possible outcomes of an
experiment is the product of the number of choices or decisions that have to be made along the way;
Probability- the chance of an event occurring; Law of Large Numbers- as the number
of trials of an experiment gets very large, the empirical probability will approach the theoretical probability;
Mutually Exclusive Events- two events that cannot occur at the same time; Outcome- the result of a
single trial of a probability experiment; Permutation- the arrangement of n objects in a specific order;
Sample Space- set of all possible outcomes
Comments:
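The counting and probability problems in this unit can be checked with a short Python sketch (Python 3.8 or later for `math.perm` and `math.comb`). The `probability` helper and the two die events (an even roll, a roll above three) are hypothetical, introduced only to show the Addition Rule and conditional probability on a small sample space of equally likely outcomes.

```python
from fractions import Fraction
from itertools import product
from math import comb, perm

# Permutations and combinations from the assessment
permutations_10_7 = perm(10, 7)   # 10P7: ordered arrangements of 7 objects out of 10
combinations_8_3 = comb(8, 3)     # 8C3: unordered selections of 3 objects out of 8

# Fundamental Counting Rule: 3 letters followed by 3 digits
plates_with_repeats = 26**3 * 10**3
plates_without_repeats = perm(26, 3) * perm(10, 3)

def probability(event, sample_space):
    """Classical probability: favorable outcomes over equally likely outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))

# Coin flip and die roll: list the whole sample space, then count
coin_and_die = list(product("HT", range(1, 7)))
p_heads_and_odd = probability(lambda o: o[0] == "H" and o[1] % 2 == 1, coin_and_die)

# Addition Rule and conditional probability on one die
die = list(range(1, 7))
def is_even(x):      # hypothetical event A
    return x % 2 == 0
def above_three(x):  # hypothetical event B
    return x > 3

p_a = probability(is_even, die)
p_b = probability(above_three, die)
p_a_and_b = probability(lambda x: is_even(x) and above_three(x), die)
p_a_or_b = p_a + p_b - p_a_and_b    # P(A or B) = P(A) + P(B) - P(A and B)
p_a_given_b = p_a_and_b / p_b       # P(A | B) = P(A and B) / P(B)
independent = p_a_given_b == p_a    # independence would require P(A | B) = P(A)

print(permutations_10_7, combinations_8_3)
print(plates_with_repeats, plates_without_repeats)
print(p_heads_and_odd, p_a_or_b, p_a_given_b, independent)
```

Using `Fraction` keeps every probability exact, so the results can be compared directly with answers worked by hand.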
Content: Discrete Probability Distributions
Duration: January (3 weeks)
Essential Question: When is it appropriate to create a binomial, Poisson, hypergeometric, or multinomial distribution?
Skill:
Develop a probability distribution for a random variable defined for a sample space in which probabilities are assigned empirically; find the expected value.
Calculate the probability of independent events using the Binomial Theorem.
Assessment:
Create a probability distribution from data.
What is the probability of getting 9 out of 10 right on a 4-choice multiple-choice test by guessing?
Resources: Chapter 5, pages 257 - 309
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Binomial Distribution- the outcomes of a binomial experiment and the corresponding probabilities;
Discrete Probability Distribution- the values a random variable can assume and the corresponding
probabilities of the values; Expected Value- the sum of each outcome times its probability (theoretical
average of the variable); Hypergeometric Distribution- distribution of a variable that has two outcomes when
sampling is done without replacement; Poisson Distribution- a discrete probability distribution
used when the independent variables occur over a period of time; Random Variable- variable whose values are
determined by chance
Comments:
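The guessing question in the assessment is a binomial experiment with n = 10 trials and success probability p = 1/4. A minimal Python sketch follows; the function name `binomial_pmf` is a hypothetical helper, and the five-question distribution is included as a second illustration of expected value.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, each with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_guess = Fraction(1, 4)   # four answer choices, one of them correct

# Probability of exactly 9 correct out of 10 by guessing
p_nine_of_ten = binomial_pmf(9, 10, p_guess)

# Full distribution for guessing on five questions, with its expected value
five_question = {x: binomial_pmf(x, 5, p_guess) for x in range(6)}
expected_correct = sum(x * prob for x, prob in five_question.items())

print(p_nine_of_ten, float(p_nine_of_ten))
for x, prob in five_question.items():
    print(x, prob)
print("expected number correct:", expected_correct)
```

The expected value computed from the distribution agrees with the shortcut n x p for a binomial variable, which is a useful check on the table.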
Content: The Normal Distribution
Duration: February (2 weeks)
Essential Question: What is normal?
Skill:
Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages.
Identify the properties of a normal distribution.
Identify distributions as symmetric or skewed.
Find the area under the standard normal distribution, given various z values.
Find probabilities for a normally distributed variable by transforming it into a standard normal variable.
Find specific data values for given percentages, using the standard normal distribution.
Use the central limit theorem to solve problems involving sample means for large samples.
Use the normal approximation to compute probabilities for a binomial variable.
Assessment:
Find the area under the standard normal distribution curve to the left of z = 2.09.
What are the characteristics of a normal distribution?
What is the total area under the standard normal distribution curve?
What percentage of the area falls below the mean? Above the mean?
Find P(0 < z < 0.92) using the standard normal curve.
An adult has on average 5.2 liters of blood. Assume the variable is normally distributed and has a standard deviation of 0.3 liters. Find the percentage of people who have less than 5.4 liters of blood in their system.
Given a set of raw data, determine if the data are normally distributed.
A.C. Nielsen reported that children between the ages of 2 and 5 watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children between the ages of 2 and 5 are randomly selected, find the probability that the mean of the number of hours they watch television will be greater than 26.3 hours.
If a baseball player’s batting average is 0.320 (32%), find the probability that the player will get at most 26 hits in 100 at bats.
Resources: Chapter 6, pages 312 - 368
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Central Limit Theorem- as the sample size n increases without limit, the shape of the distribution of
the sample means taken with replacement from a population with mean 𝜇 and standard deviation 𝜎 will
approach a normal distribution; Correction for Continuity- correction employed when a continuous
distribution is used to approximate a discrete distribution; Normal Distribution- random variable that
has a probability distribution whose graph is continuous, bell-shaped, and symmetric; Sampling Error-
difference between the sample measure and the corresponding population measure; Standard Normal
Distribution- normal distribution with a mean of 0 and a standard deviation of 1
Comments:
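The normal-curve problems in the assessment can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is only a sketch of one way to do it. The variable names are hypothetical, and the answers may differ in the third or fourth decimal place from table lookups, because a z table rounds z to two decimal places first.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Areas under the standard normal curve
area_left_of_2_09 = Z.cdf(2.09)
area_0_to_0_92 = Z.cdf(0.92) - Z.cdf(0)

# Blood volume: mean 5.2 liters, standard deviation 0.3 liters
blood = NormalDist(mu=5.2, sigma=0.3)
p_less_than_5_4 = blood.cdf(5.4)

# Central Limit Theorem: the mean of n = 20 children has standard error sigma / sqrt(n)
sample_means = NormalDist(mu=25, sigma=3 / sqrt(20))
p_mean_above_26_3 = 1 - sample_means.cdf(26.3)

# Normal approximation to the binomial, with the correction for continuity:
# "at most 26 hits" becomes the area to the left of 26.5
n, p = 100, 0.320
hits = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
p_at_most_26_hits = hits.cdf(26.5)

for label, value in [
    ("area left of z = 2.09", area_left_of_2_09),
    ("P(0 < z < 0.92)", area_0_to_0_92),
    ("P(blood < 5.4 L)", p_less_than_5_4),
    ("P(sample mean > 26.3 h)", p_mean_above_26_3),
    ("P(at most 26 hits)", p_at_most_26_hits),
]:
    print(f"{label}: {value:.4f}")
```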
Content: Hypothesis Testing
Duration: February/March (5 weeks)
Essential Question: How much better is better?
Skill:
Use data from a randomized experiment to compare two treatments; use simulations to decide if differences between parameters are significant.
Evaluate reports based on data.
Assessment:
Find the critical value(s) for a left-tailed test with 𝛼 = 0.10. Draw the appropriate figure, showing the critical region.
Explain the difference between a one-tailed and a two-tailed test.
What is meant by the critical region?
Using the z table, find the critical value if 𝛼 = 0.02, left-tailed test.
Conjecture: the average age of community college students is 24.6. State the null and alternative hypotheses.
In Pennsylvania, the average IQ score is 101.5. The variable is normally distributed, and the population standard deviation is 15. A school superintendent claims that the students in her school district have an IQ higher than the average of 101.5. She selects a random sample of 30 students and finds the mean of the test scores is 106.4. Test the claim at 𝛼 = 0.05.
Find the critical value for 𝛼 = 0.05 with d.f. = 16 for a right-tailed test.
The Gallup Crime Survey states that 23% of gun owners are women. A researcher believes that in the area where he lives, the percentage is less than 23%. He randomly selects a sample of 100 gun owners and finds that 11% of the gun owners are women. At 𝛼 = 0.01, is the percentage of female gun owners in his area less than 23%?
Find the critical chi-square value for 15 degrees of freedom when 𝛼 = 0.05 and the test is right-tailed.
Resources: Chapter 8, pages 413 - 486
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Alternative Hypothesis (Research Hypothesis)- a statistical hypothesis that states the existence of a
difference between a parameter and a specified value, or states that there is a difference between two
parameters; Critical or Rejection Region- range of test values that indicates that there is a significant
difference and that the null hypothesis should be rejected; Critical Value- separates the critical region
from the noncritical region; Hypothesis Testing- a decision-making process for evaluating claims
about a population; Level of Significance- the maximum probability of committing a type I error; One-Tailed
Test- indicates that the null hypothesis should be rejected when the test value is in the critical
region on one side of the mean. A one-tailed test is either a right-tailed test or a left-tailed test,
depending on the direction of the inequality of the alternative hypothesis; Noncritical or Nonrejection
Region- range of test values that indicates that the difference was probably due to chance and that the
null hypothesis should not be rejected; Null Hypothesis- statistical hypothesis that states that there is
no difference between a parameter and a specified value, or that there is no difference between two
parameters; P-Value- the probability of getting a sample statistic (such as the mean) or a more extreme
sample statistic in the direction of the alternative hypothesis when the null hypothesis is true; Statistical
Hypothesis- a conjecture about a population parameter which may or may not be true; Statistical
Test- uses the data obtained from a sample to make a decision about whether the null hypothesis should
be rejected. The numerical value obtained from a statistical test is called the test value; T-Test- a
statistical test for the mean of a population, used when the population is normally or
approximately normally distributed and 𝜎 is unknown; Two-Tailed Test- the null hypothesis should be
rejected when the test value is in either of the two critical regions; Type I Error- occurs when you reject
the null hypothesis when it is true; Type II Error- occurs if you do not reject the null hypothesis when it
is false; Z Test- a statistical test for the mean of a population. It can be used either when 𝑛 ≥ 30 or
when the population is normally distributed and 𝜎 is known
Comments:
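The two z-test problems in the assessment can be sketched in Python with the standard library. The function names are hypothetical. The critical values come from the inverse of the standard normal distribution rather than a printed z table, so they carry more decimal places than a table shows. The t and chi-square critical values asked for elsewhere in the assessment are not covered, since the standard library has no t or chi-square distribution.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution

def z_test_mean(sample_mean, claimed_mean, sigma, n):
    """Test value for a mean when the population standard deviation is known."""
    return (sample_mean - claimed_mean) / (sigma / sqrt(n))

def z_test_proportion(p_hat, p, n):
    """Test value for a proportion, using the hypothesized proportion p in the standard error."""
    return (p_hat - p) / sqrt(p * (1 - p) / n)

# IQ claim. H0: mu = 101.5, H1: mu > 101.5 (right-tailed), alpha = 0.05
z_iq = z_test_mean(106.4, 101.5, 15, 30)
critical_iq = Z.inv_cdf(1 - 0.05)
reject_iq = z_iq > critical_iq     # is the test value in the critical region?
p_value_iq = 1 - Z.cdf(z_iq)

# Gun-owner claim. H0: p = 0.23, H1: p < 0.23 (left-tailed), alpha = 0.01
z_prop = z_test_proportion(0.11, 0.23, 100)
critical_prop = Z.inv_cdf(0.01)
reject_prop = z_prop < critical_prop

# Critical values for the two left-tailed lookups in the assessment
critical_left_010 = Z.inv_cdf(0.10)
critical_left_002 = Z.inv_cdf(0.02)

print(f"IQ: z = {z_iq:.2f}, critical = {critical_iq:.3f}, reject H0: {reject_iq}")
print(f"Proportion: z = {z_prop:.2f}, critical = {critical_prop:.3f}, reject H0: {reject_prop}")
```

Comparing the test value with the critical value and comparing the P-value with 𝛼 lead to the same decision; the sketch computes both for the IQ problem so the two approaches can be checked against each other.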
Content: Correlation and Regression
Duration: April/May (4 weeks)
Essential Question: How can we use a regression equation to make predictions?
Skill:
Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.
Compute (using technology) and interpret the correlation coefficient of a linear fit.
Distinguish between correlation and causation.
Assessment:
Given data, calculate the line of best fit using a graphing utility.
Calculate a line of best fit by hand.
Interpret the correlation coefficient.
Interpret the slope of the line of best fit.
Test the significance of a correlation coefficient.
Resources: Chapter 10, pages 550 - 608
Standards: This content is beyond the scope of the PA Core Mathematics Standards.
Vocabulary:
Dependent Variable- variable that cannot be controlled or manipulated; Extrapolation- making
predictions beyond the bounds of the data; Independent Variable- variable that can be controlled or
manipulated; Linear Correlation Coefficient- computed from the sample data and measures the
strength and direction of the relationship between two quantitative variables; Population Correlation
Coefficient- the correlation computed by using all possible pairs of data values (x, y) taken from a
population; Prediction Interval- interval estimate of a predicted value of y when the regression
equation is used and a specific value of x is given; Regression Line- line of best fit; Residual- the
difference between the actual and predicted y value; Scatter Plot- graph of the ordered pairs (x, y) of
numbers consisting of the independent variable (x) and the dependent variable (y); T-Test- a statistical
test for the mean of a population used when the population is normally distributed and the population
standard deviation is unknown
Comments:
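A line of best fit "by hand" amounts to a few sums, which the following Python sketch carries out. The five (x, y) pairs and the function name are hypothetical. The test value uses the usual formula t = r * sqrt((n - 2) / (1 - r^2)) for the significance of a correlation coefficient, to be compared with a t critical value for n - 2 degrees of freedom from a table.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]   # hypothetical independent variable
y = [2, 4, 5, 4, 5]   # hypothetical dependent variable

def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y' = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)   # linear correlation coefficient
    return slope, intercept, r

slope, intercept, r = least_squares(x, y)

# Residual: actual y value minus the y value predicted by the regression line
predicted = [intercept + slope * a for a in x]
residuals = [actual - fit for actual, fit in zip(y, predicted)]

# Test value for the significance of r, with n - 2 degrees of freedom
n = len(x)
t_value = r * sqrt((n - 2) / (1 - r ** 2))

print(f"y' = {intercept:.2f} + {slope:.2f}x, r = {r:.4f}, t = {t_value:.4f}")
```

The residuals of a least-squares line always sum to zero (up to rounding), which gives a quick check on a hand calculation.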