Summary 2: The chi-square tests

The chi-square distribution: Suppose we have a normal population with standard deviation $\sigma$, and we take every sample of fixed size $n$ and compute the statistic

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}.$$

It just happens that this statistic has a well-known distribution called the chi-square distribution, with $n-1$ degrees of freedom. Many other statistics also have a chi-square distribution. A typical value of a random variable that has a chi-square distribution is denoted $\chi^2$, the square of the Greek letter chi.

Characteristics:

- The distribution is continuous, and its values are never negative.
- The shape of the curve depends on the parameter called degrees of freedom (df), which is related to the sample size.
- For each number of degrees of freedom there is one chi-square distribution curve.
- For a small number of degrees of freedom, the distribution is markedly skewed to the right.
- The skewness diminishes as the degrees of freedom increase. For df > 30, the distribution is approximately normal.

Solving problems by using the chi-square distribution:

Problem 1. Construction of the confidence interval for the variance of a population.

$$\frac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$$

Here $\chi^2_{n-1,\,p}$ is the value with area $p$ to its right under the chi-square curve with $n-1$ degrees of freedom, so the larger critical value gives the lower limit.

Problem 2. Test of independence by using contingency tables.

$H_0$: The variables are independent.
$H_1$: The variables are not independent.

Test: the statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$$

has approximately a chi-square distribution with df = $(r-1)(c-1)$, where $o_i$ and $e_i$ are the observed and expected frequencies of the $k$ cells, and $r$ and $c$ are the numbers of rows and columns of the table.

Note: The expected frequency for any cell of a contingency table may be obtained by multiplying the total of the row to which it belongs by the total of the column to which it belongs, and then dividing by the grand total for the entire table.

If $\chi^2_{\text{observed}} > \chi^2_{\text{table}}$, reject the null hypothesis and accept that there is some dependency between the variables under consideration.

Problem 3. Goodness of fit problem.
To test whether the discrepancies between the observed frequencies and the expected frequencies can be attributed to chance, we use the same statistic as in the preceding section. We want to determine how well a group of experimental (observed) values fits or agrees with certain theoretical results (the values expected by the theory).

df = $k - m - 1$, where $k$ = number of terms in the summation and $m$ = number of parameters that have to be estimated on the basis of the sample data.

Inferential Statistics Summary

| Parameter | Nature of population | Size of sample | Is $\sigma$ known or unknown? | Type of distribution |
|---|---|---|---|---|
| Mean | Normal | | Known | normalcdf |
| Mean | Normal | Small (n < 30) | Unknown | Tcdf, n-1 degrees of freedom |
| Mean | Unknown | Small (n < 30) | Unknown | Non-parametric test |
| Mean | Unknown | Large (n > 30) | Unknown | normalcdf |
| Standard deviation | Normal | | | Chi-square, n-1 degrees of freedom |
| Comparing two variances $\sigma_1^2$ and $\sigma_2^2$ | Both populations are normal | | | F distribution |
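The opening claim, that the statistic (n-1)s²/σ² computed from normal samples follows a chi-square distribution with n-1 degrees of freedom, can be checked informally by simulation. The sketch below is only an illustration: the function name `simulate_statistic`, the sample size, the value of sigma, and the number of trials are all assumptions chosen for the example, not part of the original summary.

```python
import random
import statistics


def simulate_statistic(n, sigma, trials, seed=0):
    """Draw `trials` normal samples of size n and return (n-1)*s^2/sigma^2 for each.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance (n-1 in the denominator)
        values.append((n - 1) * s2 / sigma ** 2)
    return values


values = simulate_statistic(n=5, sigma=2.0, trials=10000)

# Values are never negative.
print("minimum:", min(values))
# The mean of a chi-square variable equals its degrees of freedom (here n-1 = 4).
print("mean:", statistics.mean(values))
# With few degrees of freedom the curve is skewed right, so the median sits below the mean.
print("median:", statistics.median(values))
```

With only 4 degrees of freedom the right skew is easy to see; repeating the run with a larger n should bring the mean and the median closer together.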
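The confidence interval of Problem 1 divides (n-1)s² by two chi-square critical values. As a rough sketch of that calculation using only the Python standard library, the code below computes the critical values numerically instead of reading them from a table. The helper names `chi2_cdf`, `chi2_upper_critical`, and `variance_confidence_interval` are hypothetical, the data are made up, and the code assumes the subscript on the critical value means "area to the right". In practice a printed table or a statistics package would normally supply the critical values.

```python
import math
import statistics


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the power series of the regularized lower incomplete gamma
    function P(df/2, x/2); adequate for small and moderate x.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_upper_critical(area_right, df):
    """Value c with P(X > c) = area_right, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > area_right:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > area_right:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def variance_confidence_interval(sample, confidence=0.95):
    """Confidence interval for the population variance of a normal population."""
    n = len(sample)
    s2 = statistics.variance(sample)
    alpha = 1.0 - confidence
    larger_crit = chi2_upper_critical(alpha / 2.0, n - 1)         # area alpha/2 to the right
    smaller_crit = chi2_upper_critical(1.0 - alpha / 2.0, n - 1)  # area 1-alpha/2 to the right
    return (n - 1) * s2 / larger_crit, (n - 1) * s2 / smaller_crit


# Made-up measurements, for illustration only.
sample = [4.1, 5.2, 4.8, 5.5, 4.9, 5.1, 4.7, 5.3, 5.0, 4.6]
low, high = variance_confidence_interval(sample)
print("sample variance:", statistics.variance(sample))
print("95% interval for the variance:", low, high)
```

The interval is not symmetric around s², which reflects the right skew of the chi-square curve.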
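The test of independence in Problem 2 can be sketched in a few lines: build the expected frequencies from the row and column totals, sum (o - e)²/e over all cells, and compare the result with the table value for (r-1)(c-1) degrees of freedom. The function name `chi_square_independence` and the observed counts are hypothetical. The critical value 5.991 is the standard table value for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, expected table) for a contingency table.

    `observed` is a list of rows, each a list of observed frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected frequency of a cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Made-up 2 x 3 table, for illustration only.
observed = [[10, 20, 30],
            [20, 20, 20]]
statistic, df, expected = chi_square_independence(observed)

TABLE_VALUE = 5.991  # chi-square table, df = 2, area 0.05 to the right
print("statistic:", statistic, "df:", df)
if statistic > TABLE_VALUE:
    print("Reject H0: the variables appear to be dependent.")
else:
    print("Do not reject H0: no evidence of dependence at the 0.05 level.")
```

For these made-up counts the statistic is 16/3 (about 5.33), which is below the table value, so independence is not rejected.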
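The goodness-of-fit calculation of Problem 3 uses the same sum of (o - e)²/e, but with df = k - m - 1. A minimal sketch follows, assuming a fair six-sided die as the theoretical model, so no parameters are estimated and m = 0. The function name `goodness_of_fit` and the observed counts are hypothetical. The critical value 11.070 is the standard table value for 5 degrees of freedom at the 0.05 level.

```python
def goodness_of_fit(observed, expected, estimated_params=0):
    """Return (statistic, degrees of freedom) for a goodness-of-fit test.

    df = k - m - 1, with k the number of categories and m the number of
    parameters estimated from the sample data.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - estimated_params - 1
    return statistic, df


# Made-up counts from 120 rolls of a die; a fair die predicts 20 per face.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6
statistic, df = goodness_of_fit(observed, expected)

TABLE_VALUE = 11.070  # chi-square table, df = 5, area 0.05 to the right
print("statistic:", statistic, "df:", df)
print("reject fair-die model:", statistic > TABLE_VALUE)
```

If the expected frequencies came from a fitted model instead, for example a Poisson distribution whose mean was estimated from the data, `estimated_params` would be 1 and the degrees of freedom would drop by one.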