```
Summary 2 – The Chi-square tests
The CHI-SQUARE distribution:
Suppose we have a normal population with standard deviation σ, and we take every
sample of fixed size n and compute the statistic

    \chi^2 = \frac{(n-1)s^2}{\sigma^2}

It just happens that such a statistic has a well-known distribution called the
chi-square distribution, with n-1 degrees of freedom. There are many other
statistics which also have a chi-square distribution.
A typical value of a random variable that has a chi-square distribution is denoted as χ²,
the square of the Greek letter chi.
Characteristics:
• The distribution is a continuous distribution.
• χ² values are never negative.
• The shape of the curve depends on the parameter DEGREES OF FREEDOM, which
  is related to the sample size. For each number of degrees of freedom there is
  one chi-square distribution curve.
• For a small number of degrees of freedom, the distribution is markedly skewed to the
  right.
• The skewness disappears rapidly as degrees of freedom increase. For df > 30, the
  distribution is approximately normal.
Solving problems by using the chi-square distributions:
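Problem 1. Construction of the confidence interval for the variance of a population

    \frac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}

Here \chi^2_{n-1,\,p} is the chi-square value with n-1 degrees of freedom that has
area p to its right.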
Problem 2. Test of independence by using contingency tables.
H0: The variables are independent.
H1: The variables are not independent.
Test: the statistic

    \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}

has a chi-square distribution with df = (r-1)(c-1), where o_i and e_i are the observed
and expected frequencies of cell i, and r and c are the numbers of rows and columns
of the table.
Note: The expected frequency for any cell of a contingency table may be obtained by
multiplying the total of the row to which it belongs by the total of the column to which it
belongs and then dividing by the grand total for the entire table.
If \chi^2_{observed} > \chi^2_{table}, reject the null hypothesis and accept that there is a certain
dependency between the variables under consideration.
Problem 3. Goodness of fit problem.
To test whether the discrepancies between the observed frequencies and the expected
frequencies can be attributed to chance we use the same statistic as in the preceding
section. We want to determine how well a group of experimental values (observed) fit
or agree with certain theoretical results (as expected by the theory).
df = k - m - 1, where k = number of terms in the summation and m = number of
parameters that have to be estimated on the basis of the sample data.
Inferential Statistics Summary

Parameter                           | Nature of population        | Size of sample | Is σ known or unknown? | Type of distribution
------------------------------------+-----------------------------+----------------+------------------------+-----------------------------------
Mean μ                              | Normal                      | Small (n<30)   | Known                  | normalcdf
Mean μ                              | Normal                      | Small (n<30)   | Unknown                | Tcdf, n-1 degrees of freedom
Mean μ                              | Unknown                     | Small (n<30)   | Unknown                | Non-parametric test
Mean μ                              | Unknown                     | Large (n>30)   | Unknown                | normalcdf
Standard deviation σ                | Normal                      |                |                        | Chi-square, n-1 degrees of freedom
Comparing two variances σ1² and σ2² | Both populations are normal |                |                        | F distribution
```
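The statistic (n-1)s²/σ² described at the start of the summary can be explored with a small simulation. The sketch below is only an illustration: the function names, the sample size n = 5, σ = 2 and the number of repetitions are assumptions chosen for the example, not values from the summary. It relies on one known property that the summary does not state, namely that the mean of a chi-square distribution equals its degrees of freedom.

```python
import random

def chi_square_statistic(sample, sigma):
    """Compute (n - 1) * s^2 / sigma^2 for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance s^2 uses the n - 1 denominator.
    s_squared = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (n - 1) * s_squared / sigma ** 2

def simulate_statistics(n, sigma, repetitions, seed=0):
    """Draw many samples of size n from a normal population and
    return the chi-square statistic of each one."""
    rng = random.Random(seed)
    return [
        chi_square_statistic([rng.gauss(0.0, sigma) for _ in range(n)], sigma)
        for _ in range(repetitions)
    ]

# Hypothetical settings: n = 5 gives n - 1 = 4 degrees of freedom.
values = simulate_statistics(n=5, sigma=2.0, repetitions=5000)
average = sum(values) / len(values)
print(f"average statistic: {average:.2f} (degrees of freedom: 4)")
print(f"smallest value: {min(values):.4f} (never negative)")
```

The average of the simulated values should land near n-1, and no value is negative, in line with the characteristics listed in the summary.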
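Problem 1 can be worked through numerically once the two chi-square table values are known. The following is a minimal sketch, assuming the critical values are read from a chi-square table and passed in by hand. The function name and the data (n = 10, s² = 4.0, 95% confidence) are hypothetical. The table values 19.023 and 2.700 are the standard entries for 9 degrees of freedom with right-tail areas 0.025 and 0.975.

```python
def variance_confidence_interval(n, sample_variance, chi2_upper, chi2_lower):
    """Return (low, high) bounds for the population variance.

    chi2_upper: table value with area alpha/2 to its right (df = n - 1)
    chi2_lower: table value with area 1 - alpha/2 to its right (df = n - 1)
    """
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    numerator = (n - 1) * sample_variance
    # Dividing by the larger table value gives the smaller bound.
    return numerator / chi2_upper, numerator / chi2_lower

# Hypothetical example: n = 10, s^2 = 4.0, 95% confidence, df = 9.
low, high = variance_confidence_interval(10, 4.0, 19.023, 2.700)
print(f"{low:.3f} < sigma^2 < {high:.3f}")
```

Taking the square root of both bounds gives the corresponding interval for the standard deviation σ.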
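The procedure in Problem 2 (expected frequencies from row and column totals, then the sum of (o - e)²/e) can be sketched in a few lines. This is an illustrative sketch, not a reference implementation: the function names and the 2×2 table of counts are made up for the example, and the critical value 3.841 is the standard table entry for 1 degree of freedom at the 0.05 level.

```python
def expected_frequencies(observed):
    """Expected count of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(observed):
    """Return (statistic, df) for an r x c contingency table."""
    expected = expected_frequencies(observed)
    statistic = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            if e == 0:
                raise ValueError("a row or column total is zero")
            statistic += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table of observed counts.
observed_table = [[20, 30],
                  [30, 20]]
statistic, df = chi_square_independence(observed_table)
critical_value = 3.841  # chi-square table, df = 1, alpha = 0.05
reject_independence = statistic > critical_value
print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_independence}")
```

The decision follows the rule in the summary: the null hypothesis of independence is rejected when the observed statistic exceeds the table value.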
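Problem 3 uses the same statistic with df = k - m - 1. A minimal sketch, assuming a hypothetical die rolled 120 times and tested against the fair-die expectation of 20 per face: no parameters are estimated from the data, so m = 0. The function name and counts are invented for the example, and 11.070 is the standard table value for 5 degrees of freedom at the 0.05 level.

```python
def chi_square_goodness_of_fit(observed, expected, estimated_params=0):
    """Return (statistic, df) with df = k - m - 1.

    k is the number of terms in the summation and m (estimated_params)
    is the number of parameters estimated from the sample data.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - estimated_params - 1
    return statistic, df

# Hypothetical counts for faces 1..6 of a die rolled 120 times.
observed_counts = [25, 17, 15, 23, 24, 16]
expected_counts = [20] * 6  # fair die: 120 / 6 per face
fit_statistic, fit_df = chi_square_goodness_of_fit(observed_counts, expected_counts)
fit_critical = 11.070  # chi-square table, df = 5, alpha = 0.05
reject_fit = fit_statistic > fit_critical
print(f"chi-square = {fit_statistic:.3f}, df = {fit_df}, reject H0: {reject_fit}")
```

Here the statistic stays below the table value, so the discrepancies between observed and expected frequencies can be attributed to chance.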