Final Project
• Some details on your project
– Goal is to collect some numerical data pertinent to
some question and analyze it using one of the
statistical tests we’ve discussed in class
– You will be graded on all aspects of the task from
the nature of the question to the execution of the
statistical test
Final Project
• Some examples:
– Does the price of oil correlate with the price of
gasoline?
• Approach: record daily price of oil and the price of gas at some
gas station over several weeks and run a correlation
– Is Calgary colder/windier/rainier than Edmonton?
• Collect data from Environment Canada’s web site
– Do Canadians score more than other NHL players?
• Collect data from any sports section or website
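For the oil and gasoline example, "run a correlation" can be sketched in a few lines. This is only an illustration, assuming a Pearson correlation is the intended statistic; the function name `pearson_r` and the price lists are made up for the sketch and are not real data.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products and sums of squared deviations from each mean
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Placeholder numbers invented for illustration, NOT real prices
oil_price = [70.0, 72.0, 71.0, 75.0, 78.0]
gas_price = [1.10, 1.12, 1.11, 1.16, 1.19]

r = pearson_r(oil_price, gas_price)
print(f"r = {r:.3f}")
```

In Excel, the built-in `CORREL` function computes the same coefficient.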
Final Project
• Guidelines:
– Use readily available observational data
• Don’t run an experiment unless you check with me first!!!
– Keep questions simple and straightforward
• Get your idea checked by Farshad before you proceed
– Plan to do your project with Excel or some stats
program
• Turn in the data, the relevant statistics, and one or two
sentences explaining your question and the answer. Everything should fit on one page.
Some Review
• A population is a really big bunch of
numbers
• A sample is some of the numbers from
a population
Some Review
• All sets of numbers have a distribution
– The population has a mean
– A sample has a mean that is probably
similar but not necessarily the same as the
population
– The population has a standard deviation
– A sample has a standard deviation that is
probably similar but not necessarily the
same as the population
Some Review
• If we think in terms of standard
deviation, we can know things like
whether or not a single number is very
different from the mean of a population
Some Review
• But often we’re not interested in single
numbers - we’ve collected a sample and
computed a mean
• That mean comes from a population of
sample means (you just happened to pick
one of them)
• The mean of the distribution of sample means
is the mean of the population
• The standard deviation of the sample means
is the standard error
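The two claims about the distribution of sample means can be checked with a small simulation. This is a sketch under assumptions: the population (normal, mean 100, SD 15), the sample size, and the number of samples are all invented, and it uses the standard formula for the standard error, the population SD divided by the square root of n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20,000 normally distributed numbers
population = [random.gauss(100, 15) for _ in range(20_000)]
pop_mean = mean(population)
pop_sd = pstdev(population)

n = 25              # size of each sample (assumed)
num_samples = 2000  # how many samples we draw (assumed)
sample_means = [mean(random.sample(population, n)) for _ in range(num_samples)]

mean_of_means = mean(sample_means)   # should be close to pop_mean
sd_of_means = pstdev(sample_means)   # should be close to the standard error
standard_error = pop_sd / sqrt(n)

print(f"population mean      = {pop_mean:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"SD of sample means   = {sd_of_means:.2f}")
print(f"pop SD / sqrt(n)     = {standard_error:.2f}")
```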
Some Review
• If we think in terms of standard errors,
we can know things like whether a
particular mean is very different from
the mean of a population
Keep these ideas straight
• If we think in terms of
standard deviation, we
can know things like
whether or not a single
number is very different
from the mean of a
distribution
$z_i = \frac{x_i - \bar{x}}{S_x}$
• If we think in terms of
standard errors, we can
know things like
whether a particular
mean is very different
from the mean of a
population
$Z_{\bar{x}} = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
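The two Z scores on this slide differ only in what goes in the denominator. A minimal sketch, assuming the standard error is computed as the population standard deviation divided by the square root of n; the function names and the example numbers are hypothetical.

```python
from math import sqrt

def z_for_score(x, dist_mean, dist_sd):
    """How many standard deviations a single number is from the mean."""
    return (x - dist_mean) / dist_sd

def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """How many standard errors a sample mean is from the population mean."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Invented numbers: population mean 100, standard deviation 15
z_single = z_for_score(130, 100, 15)             # one score of 130
z_mean = z_for_sample_mean(106, 100, 15, n=25)   # a mean of 106 from n = 25
print(z_single, z_mean)
```

With these numbers a single score 30 points above the mean and a sample mean only 6 points above it give the same Z, because the standard error (15 / 5 = 3) is much smaller than the standard deviation.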
Some Review
• We use the Z table to look up the
probability that a particular Z score
came from any normal population
• Since the population of sample means
is approximately normal (Central Limit
Theorem), we can use the same Z table to
look up the probability that a sample mean
came from a population with a particular mean
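A Z table lookup can be reproduced in code with the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8 or later); the helper name `upper_tail_probability` and the example Z score are assumptions for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def upper_tail_probability(z):
    """Probability of a Z score at least this large under a standard normal."""
    return 1 - standard_normal.cdf(z)

# Example: how often would chance alone produce Z >= 2.0?
p = upper_tail_probability(2.0)
print(f"P(Z >= 2.0) = {p:.4f}")
```

The same function works whether the Z score came from a single number or from a sample mean, as long as the Z score was computed with the matching denominator (standard deviation or standard error).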
Now a Real Example
• Break into groups of 10
• Write down your heights in inches
• Compute the mean of your n=10
sample
• Compute the standard deviation
• Hand it all in to Fraser
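The two computations in this exercise might look like the following. The heights are invented placeholders, and the sketch assumes the sample standard deviation with n - 1 in the denominator, which is what `statistics.stdev` computes; use `statistics.pstdev` if the class convention divides by n.

```python
from statistics import mean, stdev

# Ten invented heights in inches, for illustration only
heights_in = [64, 66, 67, 68, 68, 69, 70, 71, 72, 75]

sample_mean = mean(heights_in)
sample_sd = stdev(heights_in)  # divides by n - 1

print(f"n = {len(heights_in)}, mean = {sample_mean}, SD = {sample_sd:.2f}")
```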
Critical Z Value
• In our examples we’ve been testing the
hypothesis that one sample has a mean that
is higher (or lower) than a population mean
• Let’s turn this around a bit…let’s work
backwards
Critical Z Value
• How much bigger would a sample mean have to be
so that there’s only a 5% chance that it came from a
particular population?
[Figure: Gaussian (Normal) Distribution, probability plotted against score from -4 to 4. The alpha = .05 threshold marks the point with 95% of the area below it and 5% in the upper tail.]
What Z score?
Critical Z Value
• This is sometimes called the critical Z value or Zcrit:
Zcrit (one-tailed) = 1.64
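Working backwards from alpha to a critical value is an inverse lookup in the Z table. A minimal sketch using `statistics.NormalDist.inv_cdf`; the population mean, standard deviation, and sample size are hypothetical, and the standard error is assumed to be the population SD divided by the square root of n.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
# Z score with 95% of the standard normal below it and 5% above
z_crit_one_tailed = NormalDist().inv_cdf(1 - alpha)

# Convert the critical Z back into a sample mean (invented numbers)
pop_mean, pop_sd, n = 100.0, 15.0, 25
standard_error = pop_sd / sqrt(n)
cutoff_mean = pop_mean + z_crit_one_tailed * standard_error

print(f"Zcrit (one-tailed) = {z_crit_one_tailed:.3f}")
print(f"sample mean must exceed {cutoff_mean:.2f}")
```

The exact value is closer to 1.645; printed Z tables usually show it rounded to 1.64 or 1.65.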

Directional vs. Bidirectional
Tests
• In our examples we’ve been testing the
hypothesis that one sample has a mean that
is higher (or lower) than a population mean
• We call this a directional or “one-tailed” test
• What does that one-tailed bit mean !?
Directional vs. Bidirectional
Tests
• We were checking to see if our sample had a
mean far enough into the positive tail of the
distribution and ignoring the negative tail
Directional vs. Bidirectional
Tests
• Often we haven’t made a directional
hypothesis, but have simply predicted “a
difference”
• In that situation, if we kept the one-tailed cutoff, we
would be twice as likely to make a Type I error: the
sample mean could, by chance, be in either tail!
Directional vs. Bidirectional
Tests
• What would the critical Z value be so that
there is a 5% chance that a mean is beyond it
in either direction?
[Figure: Gaussian (Normal) Distribution, probability plotted against score from -4 to 4. The alpha = .05 threshold is split between the two tails: 2.5% in the lower tail, 95% in the middle, and 2.5% in the upper tail.]
What Z score?
Directional vs. Bidirectional
Tests
• Thus:
Zcrit (two-tailed) = ±1.96
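The two-tailed value comes from putting half of alpha in each tail. The sketch below, again using `statistics.NormalDist`, also shows why reusing the one-tailed cutoff for a two-sided question would double the chance of a Type I error; the variable names are chosen for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# Two-tailed: 2.5% in each tail, so look up the 97.5th percentile
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

# If the one-tailed cutoff were applied in both directions instead,
# the total area beyond it would be about 10%, not 5%
z_crit_one_tailed = std_normal.inv_cdf(1 - alpha)
error_rate_if_reused = 2 * (1 - std_normal.cdf(z_crit_one_tailed))

print(f"Zcrit (two-tailed) = +/-{z_crit_two_tailed:.2f}")
print(f"Type I rate with one-tailed cutoff in both tails = {error_rate_if_reused:.2f}")
```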
