Statistical Inference Based on a Single Sample
Point and Interval Estimation of Parameters
Population parameters: Mean (µ), Variance (σ2)
Sample (a subset of the population) statistics: Mean (x̄), Variance (s2)
x̄ is the best point estimate of µ
s2 is the best point estimate of σ2
In general, a statistic (i.e., a function of the sample) is considered
the best estimate of the corresponding population parameter.
100(1-α)% confidence interval:
A 100(1-α)% confidence interval for μ is an interval (L, U) constructed in such a way
that 100(1-α)% of all similarly constructed intervals, in repeated sampling, would
contain the true parameter value μ.
Understanding the Concept of a Confidence Interval
An analogy: a machine produces items, 95% good and 5% bad. If we randomly select only
one item, then, since 95% of all items are good, we say with 95% confidence that the
selected item is a good item.
Similarly, a 95% confidence interval formula for μ produces intervals: good intervals
enclose μ (95% of the time) and bad intervals fail to enclose μ (5% of the time). A sample
of n values gives only one interval. Since 95% of all intervals constructed by the formula
enclose μ, we can say with 95% confidence that the interval obtained from the observed
sample encloses μ.
Examples of Some Estimation Problems:
Estimate the mean sentence complexity score of all children.
Estimate the mean pulse rate of all U.S. adult males who jog at least 15 miles per week.
Estimate the mean body temperature of all adults in the USA.
Estimate the proportion of adults who were victims of a crime.
Estimate the proportion of all hourly employees earning the minimum wage.
Confidence Interval Formula
Standard confidence interval for a parameter θ:
(Point estimate of θ - margin of error of estimation, Point estimate of θ + margin of error of estimation)
Examples (just for your understanding; we will not do any calculations now)
What factors/parameters affect the width of a CI, and how?
If you increase n, the width will decrease, everything else being the same.
If you increase the degree of confidence, the width of the CI will increase, everything
else being the same. A numerical sketch of both effects appears below.
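The following Python sketch is an added illustration (not from the original slides), using assumed summary values x̄ = 50 and s = 10; it computes the usual t-interval x̄ ± t(α/2, n-1)·s/√n for several sample sizes and confidence levels to show both effects on the width.

```python
# A minimal sketch (assumed numbers) showing how CI width depends on
# the sample size n and the confidence level, using the usual t-interval
#   x-bar ± t_{alpha/2, n-1} * s / sqrt(n).
import math
from scipy import stats

xbar, s = 50.0, 10.0          # assumed sample mean and standard deviation

def t_interval(xbar, s, n, conf):
    """Return (lower, upper) of the 100*conf% t-interval for the mean."""
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

for n in (10, 40, 160):                    # larger n -> narrower interval
    for conf in (0.90, 0.95, 0.99):        # higher confidence -> wider interval
        lo, hi = t_interval(xbar, s, n, conf)
        print(f"n={n:4d} conf={conf:.2f} width={hi - lo:6.2f}")
```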
Statistical Inference Based on One Sample - Tests of Hypothesis
● Instead of estimating parameters, the goal of a test of hypothesis is to
see if the data provide sufficient evidence to support a research
hypothesis about population parameters (e.g., μ, σ2).
● Elements of a test of hypothesis problem
1. Alternative or Research Hypothesis (Ha)
– A statement that contradicts the null hypothesis. It represents the researcher's claim
about the population parameter. Accept Ha only when the data provide sufficient
evidence to establish its truth.
2. Null Hypothesis (H0)
– A statement about the value(s) of population parameter(s) which we accept as
true until proven false.
Null (H0) versus Research (Ha) hypotheses
– H0 needs no proof; Ha needs proof/evidence/support from data.
• "A person is innocent" is H0 in our court system and needs no proof.
• "A person is guilty" is Ha in a court system and must be proven by evidence.
More on Elements of a Test of Hypothesis
3. Test Statistic and its sampling distribution under H0
– A test statistic is a function of the sample and H0. For any given sample and null
hypothesis, it gives a score that helps us decide whether to reject H0. The
decision depends on the sampling probability distribution of the test statistic
when H0 is true.
4. Rejection Region (Summary scores for rejecting H0)
– It consists of all values of the test statistic for which H0 is rejected.
– The rejection region is selected in such a way that the probability of rejecting a
true H0 is equal to α (a small number, usually 0.05).
– The value of α is referred to as the level of significance.
5. Conclusion
– Reject H0 if the summary score (obtained from the test formula) falls in the rejection
region (or if the p-value < α). Otherwise, do not reject H0.
6. P-value or significance probability
– The probability, computed assuming H0 is true, of obtaining a sample at least as
unfavorable to H0 as the observed sample.
Remarks:
- If the p-value is smaller than α, then reject H0.
- If the observed test score in step 3 falls in the rejection region, then reject H0.
Examples of some test of hypothesis problems
● A consumer group purchases 49 family-size 20-ounce bottles of ketchup,
weighs the contents of each, and finds that the mean weight is 19.86 ounces
and s = 0.22 ounces. Do the data provide sufficient evidence for the
consumer group to conclude that the true mean fill per family-size bottle is
less than 20 ounces? (A worked sketch of this test appears after these examples.)
● A long-range missile missed its target by an average of 0.88 miles. A new
steering device is supposed to increase accuracy, and a random sample of
36 missiles was equipped with this new mechanism and tested. These 36
missiles missed by distances with a mean of 0.76 miles and a standard
deviation of 0.04 miles. Is there sufficient evidence to conclude that the
new steering system lowers the mean miss distance?
● A car manufacturer wants to test a new engine to determine whether it
meets the new air pollution standard (true mean emission must be less than 20
parts per million of carbon). Ten engines manufactured for testing purposes
yield the following emission levels, with mean = 17.57 and std dev = 2.9522:
• 15.6, 16.2, 22.5, 20.5, 16.4, 19.4, 19.6, 17.9, 12.7, 14.9
Do the data supply sufficient evidence to allow the manufacturer to conclude
that this type of engine meets the pollution standard?
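As an added sketch (not part of the original slides), the ketchup example can be worked as a one-sample, left-tailed t-test of H0: μ = 20 against Ha: μ < 20 using the reported summary numbers:

```python
# One-sample, left-tailed t-test for the ketchup example:
#   H0: mu = 20   vs   Ha: mu < 20,  with n = 49, x-bar = 19.86, s = 0.22.
import math
from scipy import stats

n, xbar, s, mu0, alpha = 49, 19.86, 0.22, 20.0, 0.05

t_stat = (xbar - mu0) / (s / math.sqrt(n))        # test statistic
p_value = stats.t.cdf(t_stat, df=n - 1)           # left-tail probability under H0

print(f"t = {t_stat:.2f}, p-value = {p_value:.4g}")
if p_value < alpha:
    print("Reject H0: evidence that the true mean fill is less than 20 ounces.")
else:
    print("Do not reject H0.")
```

With these values, t ≈ -4.45, so the p-value is far below α = 0.05 and H0 would be rejected.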
Two Types of Errors
● Decision Table and Error Rates
                                        H0 is True          H0 is False
  Sufficient evidence to reject H0      Type I Error        Correct Decision
  Insufficient evidence to reject H0    Correct Decision    Type II Error
Each cell of the above 2-by-2 table is an event;
prob(Type I Error) = α ; prob(Type II Error) = β
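To make prob(Type I Error) = α concrete, the following simulation sketch (an addition using synthetic normal data, not from the slides) repeats a test many times with H0 true; at α = 0.05, roughly 5% of the repetitions reject H0.

```python
# Simulation sketch: with H0 true and alpha = 0.05, about 5% of repeated
# one-sample t-tests reject H0 (the Type I error rate).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu0, sigma, n, alpha, reps = 20.0, 0.25, 49, 0.05, 10_000

rejections = 0
for _ in range(reps):
    sample = rng.normal(mu0, sigma, size=n)     # data generated with H0 true
    t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)
    if p_value < alpha:                         # two-sided test rejects
        rejections += 1

print(f"Empirical Type I error rate: {rejections / reps:.3f}  (should be near {alpha})")
```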
Examples of Cases with Type I Error
● Example 1 (New York Times; January 3, 2011).
– Thirty years after Mr. Dupree was imprisoned for rape and robbery,
prosecutors in Dallas declared him innocent in light of new DNA
evidence.
• H0: Mr. Dupree was innocent
• Ha: Mr. Dupree was guilty
• Conclusion: Court found him guilty
– A true null hypothesis was rejected by the court and thus
a TYPE I ERROR was committed in this case.
Examples of Possible Type II Error
● Burlington Free Press:
– Isaac Turnbaugh, of Randolph, Vt., was found not guilty in 2004 of
first-degree murder in the killing of Declan Lyons (24) who was shot
dead in 2002 while working at the American Flatbread Co., a local pizza
restaurant. Turnbaugh, 28, phoned the Randolph police in July and
confessed to shooting Lyons in the head with a rifle.
– If this Free Press story was right, a Type II error was committed by the
jury.
● Casey Anthony’s Trial
– Casey Anthony was found not guilty.
• No Type I error in this case (why?).
– If Casey was not the killer, then it was a correct decision.
• Type II error?
– If Casey was the killer, then the jury committed a Type II error by saying that there is insufficient
evidence to support Ha
● Are you familiar with O.J. Simpson’s case?
– No Type I error.
– Type II error ?
• Yes if he was actually guilty
Summary - Elements of a Test of Hypothesis
● Null Hypothesis (H0)
● Alternative or Research Hypothesis (Ha)
● Test Formula
● Rejection Region
● Calculation of the test statistic and conclusion
● P-value or significance probability
● Two Types of Errors (can’t have both in one test)
● Note: a test is one-tailed if the rejection region is
located on one side of the distribution, otherwise the
test is two-tailed.
Understanding statistical concepts associated with test of hypothesis
● The smaller the p-value associated with a test of hypothesis, the stronger the support
for (a) null hypothesis (b) research hypothesis.
» Ans. (b)
● Which elements of a test of hypothesis can be specified prior to collecting data?
• (a) Null hypothesis (b) Research hypothesis (c) α (d) all of the above
» Ans (d)
● If we do not reject the null hypothesis, we conclude that
• (a) There is enough evidence to infer that the alternative hypothesis is true
• (b) There is enough evidence to infer that the null hypothesis is true
• (c) There is insufficient evidence to support the alternative hypothesis
» Ans. (c)
● What is a Type I error?
• (a) rejecting the null hypothesis when it is false
• (b) rejecting the null hypothesis when it is true
• (c) α
• (d) both (b) and (c)
» Ans. (b)
● What is a Type II error?
• (a) rejecting the null hypothesis when it is false
• (b) failing to reject the null hypothesis when it is false
• (c) rejecting the null hypothesis when it is true
• (d) β
» Ans. (b)
Remarks
● Type I Error: Reject the null hypothesis H0 when H0 is true
• α = p(Type I Error)
● Type II Error: Accept H0 when the research hypothesis Ha is true
• β = p(Type II Error)
● α and β move in opposite directions; a small α results in a large β.
● A two-tailed test rejects H0: µ = µ0 in favor of Ha: µ ≠ µ0 for all
µ0 that fall outside the 100(1-α)% confidence interval for µ (a small
sketch of this duality follows these remarks).
● If you use α = 0.05 for your test, then you are allowed to reject a
true null hypothesis 5% of the time in repeated application of
your test rule.
● If the p-value of a test is 0.04 (say) and you reject H0, then,
under your test rule, 4% of the time you would reject a true null
hypothesis. The p-value must be less than α to reject H0.
● A test is called one-tailed or one-sided if the rejection region is
located on one tail/side of the distribution of your test statistic.
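The following sketch (an addition using synthetic data) illustrates the duality noted above: a two-sided t-test at level α rejects H0: µ = µ0 exactly when µ0 falls outside the 100(1-α)% confidence interval.

```python
# Sketch of the test/interval duality: the two-sided t-test at level alpha
# rejects H0: mu = mu0 exactly when mu0 lies outside the 100(1-alpha)% CI.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=10.0, scale=2.0, size=30)   # synthetic data
alpha = 0.05

xbar = sample.mean()
sem = stats.sem(sample)
lo, hi = stats.t.interval(1 - alpha, len(sample) - 1, loc=xbar, scale=sem)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")

for mu0 in (9.5, 10.5, 12.0):
    _, p_value = stats.ttest_1samp(sample, popmean=mu0)
    reject = p_value < alpha
    outside = not (lo <= mu0 <= hi)
    print(f"mu0={mu0:5.1f}  reject H0: {reject}  mu0 outside CI: {outside}")  # the two columns agree
```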
Collecting Data and the concept of random error or noise
● Designed Study
o Analysts control the treatments (independent variables) and the assignment of units to
treatments, and record the value of the response variable.
● Observational Study
o Analysts observe both the treatments and the response variable on a random sample of units.
● Goal:
o Determine whether there is any real difference between treatments and estimate the differences.
● Experimental Noise or Error or Random Error
o The observed results or outcomes of an experiment when repeated/replicated under essentially
the same conditions may not be identical. The difference in results that occurs from one
repetition to another or from one unit to another identical unit is known as noise or error. Thus an
error is unavoidable or unexplained variation in the response variable.
● Some widely used designs:
o Independent samples design or completely randomized design
o Paired design or before-after design or pre-post study
o Randomized complete block design (paired design is a special case of this design)
Simple Linear Regression
● In many situations, the mean of a population is not viewed as a
constant, but rather as a variable, dependent on the value of another
variable. For example,
- mean sale price (y) of houses may depend on square feet (x) of
living spaces
- mean starting salary (y) of a college graduate may depend on
GPA (x)
- mean sales amount (y) of ice cream may depend on the temperature
(x)
● In simple linear regression, the mean (also called expected value) of
a dependent variable y is assumed to be linearly related (straight-line
relationship) to a single quantitative independent variable x as
● µy = E(y) = β0 + β1x
● Understanding intercept (β0) and slope (β1):
- For one unit increase in the value of x, the mean of y is increased
(decreased) by β1 if β1 is positive (negative).
- β0 represents the mean of y when x = 0.
Simple Linear Regression Example
• Age and height of children
• Age = Age in Months;
• Height = average height in cm of 161 children of each age (not all data values are
available)
• Age: 18 19 20 21 22 23 24 25 26 27 28 29
…….
• Height: 76.1 77.0 78.1 78.2 78.8 79.7 79.9 81.1 81.2 81.8 82.8 83.5 …….
Mean Height = a + b*Age
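As an added sketch (not from the slides), the listed (Age, Height) pairs can be used to estimate the intercept a and slope b of Mean Height = a + b*Age by least squares:

```python
# Least-squares fit of Mean Height = a + b*Age using the twelve
# (Age, Height) pairs listed above.
import numpy as np

age = np.array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
height = np.array([76.1, 77.0, 78.1, 78.2, 78.8, 79.7,
                   79.9, 81.1, 81.2, 81.8, 82.8, 83.5])

b, a = np.polyfit(age, height, deg=1)     # slope b, intercept a
print(f"Fitted line: Mean Height = {a:.2f} + {b:.3f} * Age")
print("Interpretation: each additional month of age is associated with "
      f"about {b:.2f} cm more in mean height.")
```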
Correlation Coefficient
• Recall Scatterplots: These plots are used to describe the relationship
between two quantitative variables.
• The correlation coefficient is a quantitative measure of the strength of
linear relationship between two variables x and y.
• Properties of r: ( -1 ≤ r ≤ 1)
• The value of r lies between -1 and +1 including +1 (a perfect positive
correlation) and -1 (a perfect negative correlation).
• If there is absolutely no correlation present, the value of r is zero.
• The closer the number is to 1 or -1, the stronger the correlation, or the
stronger the relationship between the two variables.
• The closer the number is to 0, the weaker the correlation.
• For example, something with a fairly strong positive correlation might have a
value of 0.8, whereas something with a weak negative correlation might
have a value of -0.3, etc.
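As an added sketch, the correlation coefficient r for the Age/Height values listed earlier can be computed directly; its value is close to +1, matching the strong linear trend in the data.

```python
# Correlation coefficient r between Age and Height for the values listed
# in the simple linear regression example.
import numpy as np

age = np.array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
height = np.array([76.1, 77.0, 78.1, 78.2, 78.8, 79.7,
                   79.9, 81.1, 81.2, 81.8, 82.8, 83.5])

r = np.corrcoef(age, height)[0, 1]        # off-diagonal entry of the 2x2 matrix
print(f"r = {r:.3f}   (close to +1: strong positive linear relationship)")
```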
Multiple Linear Regression
● In simple linear regression, the mean of a population is viewed as a
variable, dependent on the value of another variable (e.g., the mean height (y)
of children is assumed to depend on age (x)).
● In multiple linear regression, the mean (also called expected value)
of a dependent variable y is assumed to be related to k (≥ 2)
independent variables as
● µy = E(y) = β0 + β1 x1 + β2 x2 + … + βk xk
● Understanding β-parameters
- For one unit increase in the value of x1, the mean of y is increased (decreased)
by β1 if β1 is positive (negative), for given values of all other variables.
- β0 represents the mean of y when all other variables are set at zero.
● Example: Insurance agents determine your automobile premium (Y)
based on your answers to several questions listed in the application
form. Some of the important questions are listed below (a small fitting
sketch with hypothetical data follows this list):
• DMV record (X1)
• Age (X2)
• Driver education course (X3)
• School/College GPA/Education (X4)
• Distance and miles of driving (X5)
• Equipment (alarm/anti-theft devices, airbags) in your car (X6)
• Number of cars insured (X7)
• Place (X8)
• Gender (X9)
• Race (X10)
• Marital Status (X11)
• Credit Score (X12)
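To illustrate how such a model is fit, here is a minimal sketch (my addition, with entirely hypothetical data and only two of the predictors above, Age X2 and miles of driving X5); it is not an insurer's actual pricing model.

```python
# Minimal multiple linear regression sketch with hypothetical data:
#   Premium = b0 + b1*Age + b2*Miles   (only two of the listed predictors).
import numpy as np

# Hypothetical records: (Age in years, Miles driven per year, Premium in dollars)
age     = np.array([19,    25,    32,    41,   47,    55,   62], dtype=float)
miles   = np.array([15000, 12000, 14000, 9000, 11000, 7000, 6000], dtype=float)
premium = np.array([2100,  1750,  1600,  1300, 1350,  1050, 980], dtype=float)

# Design matrix with a column of ones for the intercept b0.
X = np.column_stack([np.ones_like(age), age, miles])
b0, b1, b2 = np.linalg.lstsq(X, premium, rcond=None)[0]   # least-squares fit

print(f"Fitted model: Premium = {b0:.1f} + {b1:.2f}*Age + {b2:.4f}*Miles")
```

Here b1 is interpreted as the change in mean premium per additional year of age, holding miles driven fixed, exactly as described for β-parameters above.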