PSYCHOLOGICAL STATISTICS
IV Semester
COMPLEMENTARY COURSE
B Sc COUNSELLING PSYCHOLOGY (2011 Admission)

UNIVERSITY OF CALICUT
SCHOOL OF DISTANCE EDUCATION
Calicut University P.O., Malappuram, Kerala, India 673 635

STUDY MATERIAL
Complementary Course, B Sc Counselling Psychology, IV Semester
PSYCHOLOGICAL STATISTICS

Prepared by: Dr. Vijayakumari K., Associate Professor, Farook Teacher Training College, Farook College P.O., Feroke
Scrutinized by: Prof. C. Jayan, Department of Psychology, University of Calicut
Layout: Computer Section, SDE
© Reserved

CONTENTS
MODULE 1 - HYPOTHESIS TESTING
MODULE 2 - NORMAL DISTRIBUTION
MODULE 3 - ANALYSIS OF VARIANCE

MODULE 1

Objectives:
1. To know about various techniques of hypothesis testing
2. To develop an understanding of when and how each hypothesis testing technique is applied

INTRODUCTION

It is usually impossible or impractical to study all the elements of a population in order to arrive at conclusions. Instead, a researcher selects an appropriate sample from the population and studies that sample. From the sample values, the population values are inferred using inferential statistics, the branch of statistics which helps in inferring population values. It uses the concept of probability to deal with uncertainty in decision making. Inferential statistics has two functions:
1. Estimation, that is, estimating the parameter (population value) from the statistic (sample value).
2. Testing of hypotheses, that is, testing some hypothesis about the population from which the sample is drawn.
Statistical inference thus makes it possible to form an idea about population values (which are unknown) from sample values (which are known).
UNIT 1 HYPOTHESIS TESTING

HYPOTHESIS

Hypotheses are assumptions about population values, and they may be of different types. If we want to know whether a coin is unbiased, the hypothesis stated may be 'the coin is unbiased'. The coin may be tossed a number of times, say 200. Suppose one got 80 heads and 120 tails. Using this information (the sample values), statistics helps to test the hypothesis 'the coin is unbiased', arriving at a conclusion of either accepting the hypothesis or rejecting it. A hypothesis test is a statistical method that uses sample values to evaluate a hypothesis about a parameter.

Hypotheses are stated in different forms. In statistics there are two types of hypotheses: the null hypothesis and the alternative hypothesis. The null hypothesis (denoted as H0) states that in the general population there is no change, no difference or no relationship. In an experiment with treatment A for the control group and treatment B for the experimental group, the investigator may be interested to know which treatment has the greater effect on the dependent variable Y. Then the hypothesis may be stated as 'there is no significant difference in the mean Y scores of the control and experimental groups after the treatment'. This hypothesis is stated in null form as it asserts no difference between the groups.

If the study is to find out whether there is a gender difference in mechanical aptitude, the null hypothesis will be 'there is no significant gender difference in the mean scores of mechanical aptitude'. In the case of finding whether the variables X and Y are related or not, the null hypothesis will be 'the two variables X and Y are not related' or 'there is no significant relationship between the variables X and Y'. If the population mean score of a variable X is 60, one can test whether the obtained sample mean indicates a difference from that value. Here the null hypothesis is 'the population mean is equal to 60'.
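A minimal sketch in Python of how the coin hypothesis above could be tested, using the large-sample normal approximation for a proportion (the critical values 1.96 and 2.58 used here are introduced later in this unit):

```python
import math

# Coin example: 200 tosses, 80 heads. Under H0 ('the coin is unbiased',
# p = 0.5) the sample proportion of heads is approximately normal with
# standard error sqrt(p(1 - p)/n), so a critical ratio can be formed.
n, heads = 200, 80
p0 = 0.5
p_hat = heads / n
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
print(round(z, 2))   # -2.83
```

Since |z| = 2.83 exceeds 1.96, the sample of 80 heads in 200 tosses gives grounds to reject 'the coin is unbiased' at the 0.05 level.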
The hypothesis directly opposite to the null hypothesis is known as the alternative hypothesis (denoted as H1). This hypothesis states that there is a change, a difference or a relationship in the general population. In the first case, the alternative hypothesis will be 'there is a significant difference in the mean Y scores of the control and experimental groups after the treatment'. The second one will be 'there is a significant gender difference in the mean scores of mechanical aptitude'. The third hypothesis can be stated as 'the two variables X and Y are related'. In the fourth example the alternative hypothesis can be written as 'the population mean is not equal to 60'.

All these alternative hypotheses state that there will be some type of change, but they give no indication of the direction of the change or the nature of the relation. The experimental and control groups differ significantly, but which group is high or low is not stated in the hypothesis. Such hypotheses, in which there is no indication of the direction of change or relation, are called non-directional hypotheses. Tests used to test such hypotheses are non-directional tests or two-tailed tests. If there is an indication of the direction of change or the nature of the relation, the hypothesis is a directional one. In the above examples, if the hypothesis is 'the experimental group has a higher mean Y score than the control group after the treatment', it is directional, as there is a clear indication that the experimental group is better than the control group on the variable Y. Similarly, if the hypothesis 'the two variables X and Y are related' is stated as 'there is a significant positive relation between the variables X and Y', it is directional, as there is an indication of the nature of the relationship (positive or negative).

If µ1 and µ2 are two population means and we have to test whether these means are equal or not, the null hypothesis can be stated as H0: µ1 = µ2 against the alternative hypothesis H1: µ1 ≠ µ2.
Here the hypothesis formed is non-directional and the test used will be a two-tailed test. If the researcher wishes to know whether µ1 is greater than µ2, the hypotheses will be

H0: µ1 ≤ µ2 against H1: µ1 > µ2

Here the hypotheses are directional and hence the test will be a one-tailed test. Statistical tests are designed to test the null hypothesis, and based on this decision the alternative hypothesis is rejected or accepted.

HYPOTHESIS TESTING

Hypothesis testing deals with drawing conclusions about population values based on sample values. That is, here we are taking decisions about parameters based on the sample values. Whenever we take a decision about accepting or rejecting H0, four alternatives are possible, and our decision will be one among the four:
1. Accept H0 when H0 is not true
2. Accept H0 when H0 is true
3. Reject H0 when H0 is not true
4. Reject H0 when H0 is true

The population values are unknown, and hence we do not know whether H0 is true or false. But based on the evidence from the sample we take a decision to accept or reject H0. This is similar to the judiciary: the accused may be innocent or guilty, and based on the evidence before the court the judge pronounces the person innocent or guilty, which may be correct or wrong. Similarly, based on the evidence from the sample, the researcher takes a decision to accept or reject H0. Among the four alternatives, the second and third are correct decisions and the other two are incorrect. One cannot completely avoid the chance of error in taking decisions. The four decisions can be represented in a 2x2 (two by two) table as below.

             H0 is true                           H0 is false
Reject H0    Incorrect decision (Type I error)    Correct decision
Accept H0    Correct decision                     Incorrect decision (Type II error)

The two errors possible in hypothesis testing are rejecting H0 when H0 is true and accepting H0 when H0 is false. The first type of error is known as the Type I error and the second as the Type II error. That is, the Type I error is the error committed by rejecting H0 when it is true; the probability of the Type I error is known as the level of significance of the test, denoted α. The Type II error is the error committed by accepting H0 when H0 is not true; the probability of the Type II error is denoted β, and 1 − β is known as the power of the test.

It should be noted that if we decrease the probability of the Type I error, the chance of rejecting H0 decreases, that is, the chance of accepting H0 increases, leading to an increase in the probability of the Type II error. The reverse also holds. That is, as α decreases, β increases, and as α increases, β decreases. To standardise practice, in research α is fixed in advance. In the behavioural sciences α is usually taken as 0.01 or 0.05. α = 0.01 means that the probability of rejecting H0 when it is actually true is 0.01; more clearly, the probability of finding a mean difference when no such difference exists is 0.01. Put another way, if one conducted the study 100 times on samples from a population in which H0 is true, in about 99 of them the researcher would correctly find no significant difference, and only in about one would a significant difference appear by chance. If α = 0.05, the probability of finding a difference when actually there is no difference between the group means is 0.05, or the probability of accepting a true null hypothesis is 95 percent.

SAMPLING DISTRIBUTION AND STANDARD ERROR

Sampling distributions are the distributions formed by sample values. For example, if we want to study the EQ of adolescents in Kerala, we will take a sample from the population (say of size 1000), measure their EQ, and calculate the mean and other descriptive statistics.
If the same procedure is continued with different samples of the same size, we will get a set of arithmetic means of EQ. Each mean score is an estimate of the population mean. But since we are not measuring the EQ of each and every member of the population, the mean score obtained from a sample may differ from the population mean. The distribution formed by the sample mean values is known as the sampling distribution of the mean. If the set is formed by calculating the correlation between two variables, the distribution is a sampling distribution of the correlation. Generally, a sampling distribution is derived from a population distribution that is known or assumed. A number of sampling distributions (one for each sample size) are possible from a population, and sampling distributions of two or more statistics are possible from the same population. Sampling distributions help the researcher to calculate the errors due to chance involved in making generalisations about the population on the basis of samples.

The standard deviation of a sampling distribution is called the standard error (SE). The SE of the sampling distribution of the mean is σ/√n, where σ is the standard deviation of the population distribution (or its estimate) and n is the sample size. The standard error gives an idea about the unreliability of the sample. Moreover, confidence limits within which the parameter values are expected to lie can be formed with the help of the SE.

TESTS OF SIGNIFICANCE FOR LARGE SAMPLES

Though there is no clear-cut line of demarcation between large and small samples, if the size of a sample exceeds 30 it can statistically be considered a large sample. Tests of significance used for large samples are different from those for small samples, because the assumption of normality will not be satisfied by small samples.
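The relation SE = σ/√n can be illustrated by simulation: draw many samples of the same size from a population, compute each sample mean, and compare the standard deviation of those means with σ/√n. A minimal sketch (the population mean of 100 and SD of 15 are hypothetical values chosen for illustration):

```python
import math
import random

random.seed(1)

# Draw many samples of size N from a population with SD SIGMA and
# compare the SD of the sample means (the empirical standard error)
# with the theoretical value SIGMA / sqrt(N).
MU, SIGMA, N, SAMPLES = 100.0, 15.0, 100, 5000

means = []
for _ in range(SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(sum(sample) / N)

grand_mean = sum(means) / SAMPLES
empirical_se = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / (SAMPLES - 1))
theoretical_se = SIGMA / math.sqrt(N)   # 15 / 10 = 1.5
print(round(empirical_se, 2), theoretical_se)
```

The empirical value comes out very close to 1.5, the value given by σ/√n.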
Two-tailed test for the difference between the means of two samples (large independent samples)

Suppose two independent random samples (different groups, e.g. boys and girls, or students of class A and class B) of sizes n1 and n2 respectively (n1, n2 > 30) are drawn from populations with standard deviations σ1 and σ2. To test the null hypothesis that there is no significant difference between the means of the two samples,

H0: µ1 = µ2 against the alternative hypothesis H1: µ1 ≠ µ2

the critical ratio is calculated as

Critical ratio CR = (x̄1 − x̄2) / SE, where SE = √(σ1²/n1 + σ2²/n2)

If this value is greater than 1.96, the null hypothesis is rejected at the 0.05 level. (If it is greater than 2.58, the null hypothesis is rejected at the 0.01 level.)

Illustration: The emotional intelligence of two groups A and B was measured, and the mean, standard deviation and sample size of each group are given below. Test whether there is a significant difference in the mean emotional intelligence scores of the two groups.

          Mean   SD    N
Group A    75    15   150
Group B    70    20   250

H0: µ1 = µ2 against H1: µ1 ≠ µ2 (the two groups do not differ significantly in their mean emotional intelligence scores, against there being a significant difference between the groups).

CR = (75 − 70) / √(15²/150 + 20²/250) = 5 / √(1.5 + 1.6) = 5 / 1.76 = 2.84

Since the calculated value is greater than the tabled value 2.58 required for significance at the 0.01 level, H0 is rejected. That is, groups A and B differ significantly in their mean emotional intelligence (α ≤ 0.01).

TESTS OF SIGNIFICANCE FOR SMALL SAMPLES

When the size of the sample is less than 30, one cannot assume that the sampling distribution of the statistic is approximately normal, or that the values given by the sample data are sufficiently close to the population values. (That is, the sample value need not be a true estimate of the population value.) While dealing with small samples, one will be more interested in testing a given hypothesis than in estimating the population value.
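The critical-ratio computation for the emotional intelligence illustration above can be sketched as follows:

```python
import math

# Critical ratio for two large independent samples:
# CR = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2)
def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# Group A: mean 75, SD 15, N 150; Group B: mean 70, SD 20, N 250
cr = critical_ratio(75, 15, 150, 70, 20, 250)
print(round(cr, 2))   # 2.84, which exceeds 2.58: H0 rejected at the 0.01 level
```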
For example, if a correlation of .38 is reported from a sample of 10 individuals, instead of estimating the population value, one will be interested in finding out whether this value could have arisen from an uncorrelated population.

STUDENT'S t DISTRIBUTION

The theoretical work of W. S. Gosset on the t distribution was published in 1908 under the pen name 'Student'. The t distribution is accordingly known as 'Student's t distribution' or the 'Student distribution'. The t distribution is used when the sample size is less than or equal to 30 and the population standard deviation is unknown. The t statistic is calculated using the formula

t = (x̄ − µ) √n / s, where s = √( Σ(x − x̄)² / (n − 1) )

PROPERTIES OF THE t DISTRIBUTION
1. The t distribution ranges from minus infinity to plus infinity.
2. The t distribution varies as n varies.
3. It is symmetrical with respect to the ordinate at the mean.
4. The variance of the distribution is greater than one and approaches one as the sample size becomes large. That is, as the sample size increases, the t distribution approaches a normal distribution.

(A figure here compares t distributions for different degrees of freedom with the normal curve: as the degrees of freedom increase, that is, as the sample size increases, the t distribution approaches the normal curve.)

Student's t distribution is used to test the significance of various results obtained from small samples. It can be used to test the significance of the mean of a random sample in the following way. To test whether the mean of a sample drawn from a normal population deviates significantly from a stated value (a specified value or a population mean) when the population standard deviation is unknown, the t variate is calculated using the formula

t = (x̄ − µ) √n / s

where x̄ is the mean of the sample, µ is the population mean or the specified value, n is the sample size, and s = √( Σ(x − x̄)² / (n − 1) ).

If the calculated value of t exceeds t0.05 (the tabled value of t at n − 1 degrees of freedom for significance at the 0.05 level), the difference between x̄ and µ is significant at the 0.05 level; if it is less than t0.05, the difference is not significant at the 0.05 level. If the calculated value is greater than t0.01, the difference is significant at the 0.01 level.

For example, if the mean lifetime of ten bulbs is found to be 4400 hrs with standard deviation 589 hrs, test the hypothesis that the average lifetime of the bulbs is 4000 hrs.

t = (4400 − 4000) √10 / 589 = 2.148

Degrees of freedom = n − 1 = 9; therefore t0.05 for 9 df = 2.262 (from the table). Since the calculated value is less than t0.05, the difference between x̄ and µ is not significant at the 0.05 level. That is, the average life of the bulbs can be taken as 4000 hrs.

Test of significance of the difference between two means (small independent samples)

If x̄1 and x̄2 are the means of two small independent samples of sizes n1 and n2 with standard deviations s1 and s2 respectively, then to test whether the two means differ significantly, t can be calculated using the formula

t = (x̄1 − x̄2) / √[ ( (Σ(x1 − x̄1)² + Σ(x2 − x̄2)²) / (n1 + n2 − 2) ) × (1/n1 + 1/n2) ]

This formula can be rewritten as

t = (x̄1 − x̄2) / ( S √(1/n1 + 1/n2) ), where S = √( ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) )

If the calculated value is greater than the tabled value of t for n1 + n2 − 2 degrees of freedom (t0.05), the difference is significant at the 0.05 level.

Illustration: The effect of two drugs on weight loss was studied in two samples of patients, sample A (size 5) and sample B (size 7). Sample A (using drug 1) had a mean loss of weight of 12 kg with a standard deviation of 1.12, and sample B (using drug 2) had a mean loss of weight of 11 kg with a standard deviation of 2.31. Find whether there is a significant difference in the efficacy of the two drugs.

x̄1 = 12, s1 = 1.12, n1 = 5; x̄2 = 11, s2 = 2.31, n2 = 7

S = √( (4 × 1.12² + 6 × 2.31²) / 10 ) = √3.70 = 1.92

t = (12 − 11) / (1.92 × √(1/5 + 1/7)) = 1 / (1.92 × 0.586) = 0.89

Since the calculated value is less than the tabled value of t for 10 degrees of freedom at the 0.05 level of significance (t0.05 = 2.228), the difference is not significant at the 0.05 level. That is, there is no significant difference in the efficacy of the two drugs.

Test of significance of the difference between two means (small dependent samples)

If the samples are dependent, that is, paired observations, the difference between the means can be tested using the formula

t = d̄ √n / s

where d̄ is the mean of the differences and s = √( Σ(d − d̄)² / (n − 1) ) = √( (Σd² − n d̄²) / (n − 1) ). Here t is based on n − 1 degrees of freedom.

Illustration: In an experimental study, a researcher obtained the pre-test and post-test scores of 10 participants as below. Test whether the pre-test and post-test mean scores differ significantly.

Individual No.:  1   2   3   4   5   6   7   8   9  10
Pre-test:       44  40  61  52  32  44  70  41  67  72
Post-test:      53  38  69  57  46  39  73  48  73  74
d:               9  -2   8   5  14  -5   3   7   6   2
d²:             81   4  64  25 196  25   9  49  36   4

d̄ = 4.7, Σd² = 493

s = √( (493 − 10 × 4.7²) / 9 ) = √30.23 = 5.50

t = d̄ √n / s = 4.7 × √10 / 5.50 = 2.70

t0.05 for 9 df = 2.26 (from the table). The calculated t value exceeds the tabled value, hence the difference is significant at the 0.05 level. That is, the pre-test and post-test mean scores differ significantly at the 0.05 level.

Test of significance of an observed correlation coefficient

Suppose a random sample is taken from a bivariate normal population (that is, two variables are involved in the population). To test the hypothesis that the correlation coefficient of the population is zero, i.e., that the two variables in the population are uncorrelated, t can be calculated using the formula

t = r √(n − 2) / √(1 − r²), with n − 2 degrees of freedom

If the calculated t value exceeds the tabled value of t at the 0.05 level for n − 2 degrees of freedom, the value of r is significant at the 0.05 level.
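The three small-sample t statistics worked above (the one-sample bulbs example, the pooled two-sample drugs example, and the paired pre-test/post-test example) can be reproduced with a short Python sketch; the function names are ours, and the pooled formula uses (n − 1)-weighted sample variances:

```python
import math

def one_sample_t(mean, mu, s, n):
    # t = (xbar - mu) * sqrt(n) / s, with n - 1 degrees of freedom
    return (mean - mu) * math.sqrt(n) / s

def pooled_t(m1, s1, n1, m2, s2, n2):
    # pooled SD from (n - 1)-weighted sample variances, df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))

def paired_t(pre, post):
    # t = dbar * sqrt(n) / s_d, with n - 1 degrees of freedom
    d = [b - a for a, b in zip(pre, post)]
    n = len(d)
    dbar = sum(d) / n
    s = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
    return dbar * math.sqrt(n) / s

print(round(one_sample_t(4400, 4000, 589, 10), 2))   # bulbs example: 2.15
print(round(pooled_t(12, 1.12, 5, 11, 2.31, 7), 2))  # drugs example: 0.89
pre = [44, 40, 61, 52, 32, 44, 70, 41, 67, 72]
post = [53, 38, 69, 57, 46, 39, 73, 48, 73, 74]
print(round(paired_t(pre, post), 2))                 # pre/post example: 2.7
```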
If t is less than the tabled value, the data are consistent with the hypothesis of an uncorrelated population.

Illustration: The correlation coefficient obtained for two variables is 0.42 for a sample of 27 pairs of observations. Test whether the correlation obtained is significant.

t = 0.42 × √(27 − 2) / √(1 − 0.42²) = 2.1 / 0.908 = 2.31

The tabled value of t at the 0.05 level for 25 degrees of freedom is 2.060. Since the calculated value is greater than the tabled value, the correlation obtained is significant at the 0.05 level.

MODULE 2
NORMAL DISTRIBUTION

Objectives:
1. To know about the characteristics of the normal distribution
2. To become familiar with the concepts of skewness and kurtosis of frequency curves
3. To gain knowledge about various measures of skewness and kurtosis

Normal distribution

The normal distribution was originally investigated by DeMoivre (1667–1754) to describe the results of games of chance (gambling). The distribution was defined precisely by Pierre-Simon Laplace (1749–1827) and put in its more usual form by Carl Friedrich Gauss (1777–1855). It was Francis Galton (1822–1911) who gave the normal distribution a central role in psychological theory, especially in the theory of mental abilities. Mathematically the normal distribution is defined as

f(x) = (1 / (σ√(2π))) e^( −(x − µ)² / (2σ²) )

where e and π are constants (π = 3.1416 and e = 2.7183), and µ and σ are the mean and standard deviation of the set of scores.

The normal distribution has a significant role in statistical analysis because:
1. Many of the dependent variables with which we deal are commonly assumed to be normally distributed in the population.
2. Many of the statistical techniques used to make inferences about the values of a variable assume that the variable is normally distributed.
3.
The theoretical distribution of the hypothetical set of sample means obtained by drawing an infinite number of samples from a specified population can be shown to be approximately normal under a wide variety of conditions. The concepts of sampling distribution and sampling error are closely connected to the concept of the normal distribution.

The general characteristics of a normal distribution are:
1. It is symmetrical with respect to the ordinate at the mean. The left side of the normal curve is a mirror image of the right side.
2. Mean, median and mode coincide.
3. Fifty percent of the scores are below the mean and fifty percent above it. Most of the scores pile up around the mean, and extreme scores are relatively rare.
4. The height of the vertical line (ordinate) is maximum at the mean.
5. The curve has no boundaries in either direction (the curve is asymptotic to the X-axis and extends from −∞ to +∞).
6. The percentage areas around the mean are:
   a. Mean to Mean ± 1σ: 34.13% on each side
   b. Mean + 1σ to Mean + 2σ: 13.59% (likewise Mean − 1σ to Mean − 2σ: 13.59%)
   c. Mean + 2σ to Mean + 3σ: 2.15% (likewise Mean − 2σ to Mean − 3σ: 2.15%)
That is, 68.26% of the total area of the curve lies between mean − 1σ and mean + 1σ; 95.44% of the total area falls between mean − 2σ and mean + 2σ; and 99.73% of the total area lies between mean − 3σ and mean + 3σ. Hence for practical purposes it is assumed that the normal curve extends from mean − 3σ to mean + 3σ.

Application

The normal distribution is a good model for many naturally occurring distributions, so it is very useful in making inferences about populations. The major applications of the normal curve can be listed as below.
1.
When the distribution is normally or nearly normally distributed, the normal probability table can be used to calculate the percentage of cases, or the percentage of the total area, that falls between the mean and a given σ distance from the mean.
2. The normal curve is used to convert a raw score into a standard score: z = (X − M) / σ.
3. The normal curve is useful in calculating the percentile ranks of scores.
4. The normal curve is used for normalizing a given frequency distribution.

SKEWNESS AND KURTOSIS

Two distributions may have the same mean and standard deviation but may differ widely in their overall appearance. Equal means and standard deviations do not guarantee the equality of two distributions. Measures of skewness and kurtosis give a clear picture of the overall appearance of a distribution.

Skewness

The term 'skewness' means lack of symmetry. When a distribution is not symmetrical (asymmetrical) it is called a skewed distribution. A measure of skewness gives the direction and the extent of the skewness. In a symmetrical distribution the mean, median and mode are identical, and the spread of the frequencies is the same on both sides of the centre point of the curve. The more the mean moves away from the mode, the larger the asymmetry or skewness.
A skewed distribution can be either positively skewed or negatively skewed. In a positively skewed distribution the value of the mean is the greatest and that of the mode the least, with the median lying between them; the frequencies are spread out over a wide range of values on the high-value (right-hand) side of the distribution, giving an excess tail on the right. In a negatively skewed distribution the value of the mode is the greatest and that of the mean the least, with the median again lying between the two; here the tail is more extended on the left-hand side of the curve.

(Figure: negatively skewed distribution, normal distribution, positively skewed distribution.)

Measurement of skewness

The extent of skewness is calculated using either

Skewness = (Mean − Mode) / SD (Karl Pearson's coefficient of skewness)

or

Skewness = (Q3 + Q1 − 2 Median) / (Q3 − Q1) (Bowley's coefficient of skewness)

In Karl Pearson's method there is no limit on the value of skewness, but in Bowley's method the value of skewness ranges from −1 to +1. In both cases a value of zero indicates that the curve is symmetrical (non-skewed).

Kurtosis

The word kurtosis means 'bulginess'. Kurtosis is the degree of flatness or peakedness in the region of the mode of a frequency curve. Thus kurtosis gives an idea of how far a distribution is peaked or flat compared to a normal distribution. A normal curve is said to be mesokurtic; if the curve is more peaked than the normal curve, it is leptokurtic; and if the curve is flatter than the normal curve, it is platykurtic. A leptokurtic curve has a narrower central portion and higher tails than the normal curve, while a platykurtic curve has a broader central portion and lower tails.

(Figure: mesokurtic (normal) curve, platykurtic curve, leptokurtic curve.)

Measurement of kurtosis

A formula for calculating kurtosis in terms of percentiles is

Kurtosis = Q / (P90 − P10), where Q is the quartile deviation, (Q3 − Q1) / 2

For a normal distribution the value of kurtosis calculated using this formula is 0.263. If the value is less than 0.263 the distribution is leptokurtic, and if it is greater than 0.263 it is platykurtic.
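The three measures above can be sketched as simple functions; the quartile and percentile values passed in below are hypothetical, chosen to match a normal distribution with mean 50 and SD 10:

```python
# Pearson's and Bowley's measures of skewness, and the percentile
# measure of kurtosis described above.
def pearson_skewness(mean, mode, sd):
    return (mean - mode) / sd

def bowley_skewness(q1, median, q3):
    return (q3 + q1 - 2 * median) / (q3 - q1)

def percentile_kurtosis(q1, q3, p10, p90):
    # Q / (P90 - P10), where Q = (Q3 - Q1) / 2
    return ((q3 - q1) / 2) / (p90 - p10)

# Hypothetical quartiles/percentiles of a normal distribution with
# mean 50 and SD 10 (Q1/Q3 at -/+0.6745 SD, P10/P90 at -/+1.2816 SD)
print(pearson_skewness(50, 50, 10))                               # 0.0
print(bowley_skewness(43.26, 50, 56.74))                          # ~0 (symmetric)
print(round(percentile_kurtosis(43.26, 56.74, 37.18, 62.82), 3))  # 0.263
```

Both skewness measures come out at zero for the symmetric case, and the percentile kurtosis reproduces the mesokurtic value 0.263.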
MODULE 3
ANALYSIS OF VARIANCE

Objectives:
1. To know about ANOVA
2. To know about the assumptions of ANOVA
3. To know about one-way and two-way ANOVA

ANALYSIS OF VARIANCE (ANOVA)

When we have to test the significance of the difference between the means of two random samples, we use the test of significance of the difference between means, or t-test. But if there are more than two groups, using the t-test becomes laborious. For example, if five groups are to be compared, ten t-values must be calculated to find out whether any two of the group means differ significantly. Analysis of variance helps one to find out whether any of these groups differ significantly in their means. Instead of a large number of t-tests, ANOVA uses a single test, the F-test, in which variances are compared (one-way ANOVA). Though ANOVA is used to test the significance of differences between means, it is known as analysis of variance because it analyses two types of variance: between-groups variance and within-groups variance. The between-groups variance is the variance of the group means, and the within-groups variance is the mean of the variances of the scores within each group. The F-value is calculated using the formula

F = Between-groups variance / Within-groups variance

From the table of F-values one can determine whether the groups differ significantly in their means. If the calculated value is greater than the tabled value of F for (k − 1), (N − k) degrees of freedom, the mean difference between at least two groups in the set is significant; if the calculated value is less than the tabled value, no mean difference between any pair of groups is significant at the level of significance considered.

Basic assumptions of ANOVA

ANOVA is a parametric test, and certain assumptions have to be satisfied in order to use it for statistical inference.
1. The distribution of the dependent variable in the population should be normal (assumption of normality).
2.
The groups, drawn on certain criteria, should be randomly selected from the sub-populations having those criteria (assumption of randomness).
3. The subgroups under study should have the same variability (assumption of homogeneity of variance).

Computation for analysis of variance (one-way)

Step 1. Correction term: C = (Σx)² / N, where the x are the individual measures from all groups combined and N is the total number of observations in all the groups.

Step 2. Total sum of squares: SSt = Σx² − C.

Step 3. Sum of squares between means: SSb = (Σx1)²/n1 + (Σx2)²/n2 + … − C, where Σxj is the total of the scores in group j and nj is its size.

Step 4. Sum of squares within groups: SSw = SSt − SSb.

Step 5. Calculation of F:

F = (SSb / dfb) / (SSw / dfw)

where dfb = k − 1 is the degrees of freedom for the between-groups sum of squares (k being the number of groups) and dfw = N − k is the degrees of freedom for the within-groups sum of squares.

Step 6. Arriving at a conclusion: if the calculated value is greater than the tabled value, reject H0; otherwise accept H0.

Analysis of variance for factorial designs

If there is more than one independent variable, one-way ANOVA cannot be used; we then use ANOVA for a factorial design. In ANOVA the independent variables are usually called factors, and the different categories of these variables are called levels. For example, if we consider the dependent variable to be hostility among students and the independent variables to be sex and home environment, then sex and home environment are the factors; male and female are the levels of the factor sex, and the levels of home environment are the categories formed on the basis of home environment. In this case we can find out the main effects of the two factors and the interaction effect of the two factors.
That is, we can test the significance of the difference in hostility between male and female students, test the significance of the differences in hostility among students from the various home environments, and test whether the influence of home environment on hostility differs across the levels of the factor sex. In this case the total variance is divided into four parts: the main effect of sex, the main effect of home environment, the interaction effect of the two factors, and the residual or within-groups variance.

If two factors are involved, the ANOVA is known as two-way ANOVA; if there are three factors, it is three-way ANOVA; and in general a design with two or more factors is a factorial design. When using ANOVA it is conventional to state the numbers of levels of the factors. In the above example, with sex and home environment as factors, the ANOVA used is a two-way ANOVA with a 2x3 design (sex has two levels, male and female; home environment is here assumed to have three levels; the design is read as 'two by three' even though written in the form of a multiplication). Thus an ANOVA with a 3x2x2 design means that three factors are involved in the analysis, the first with three levels and the second and third with two levels each; here the ANOVA used is a three-way ANOVA.
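The six computational steps of one-way ANOVA given in this module can be sketched in Python on three hypothetical groups of scores:

```python
# One-way ANOVA following Steps 1-6 above, applied to three
# hypothetical groups of five scores each.
groups = [
    [10, 12, 11, 13, 14],
    [15, 17, 16, 18, 19],
    [11, 10, 12, 11, 13],
]
scores = [x for g in groups for x in g]
N, k = len(scores), len(groups)

correction = sum(scores) ** 2 / N                                    # Step 1
ss_total = sum(x ** 2 for x in scores) - correction                  # Step 2
ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction  # Step 3
ss_within = ss_total - ss_between                                    # Step 4

f_ratio = (ss_between / (k - 1)) / (ss_within / (N - k))             # Step 5
print(round(f_ratio, 2))   # 22.51
# Step 6: compare with the tabled F for (k - 1, N - k) = (2, 12) df
```

Here the calculated F is far above the tabled F for (2, 12) degrees of freedom at the 0.05 level, so for these hypothetical data H0 would be rejected: at least two of the group means differ significantly.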