STATISTICAL HYPOTHESIS TESTING

By Dr. K.R. Sundaram, Professor & Head, Department of Biostatistics, All India Institute of Medical Sciences, New Delhi-110029

Workshop on "Essentials of Epidemiology and Research Methods", October 8-12, 2003, Surajkund, Faridabad

STATISTICAL METHODS

(A) Descriptive methods
(B) Inference methods

(A) Descriptive Methods: Statistical methods used for describing (summarizing) the collected data: statistical tables, diagrams and graphs, computation of averages, location parameters, proportions and percentages, deviation measures, correlation measures and regression analysis.

(B) Inference Methods: Statistical methods used for making inferences (generalizations) from the results obtained from the sample to the population from which the sample was selected.

Two important questions raised in scientific studies

(A) How reliable are the results obtained? ESTIMATION
(B) How probable is it that the differences between the observed results and those expected on the basis of the hypothesis have been produced by chance alone? TEST OF STATISTICAL SIGNIFICANCE, by computing the chance element.

Important terms / concepts concerned with Statistical Inference:

Standard Error
Null Hypothesis
Confidence Interval
Alternate Hypothesis
Type-I error (level of significance / 'p' value / 'α' value)
Type-II (β) error
Probability and probability distributions or statistical distributions (Normal, Binomial, Poisson etc.)
Test Statistic (Test Criterion)
Critical Ratio and decision making

Notations used:

Statistical figure        Population   Sample
Number of subjects        N            n
Value of observation      -            X
Mean                      M (μ)        m (X̄)
Proportion                P            p
Standard deviation        σ            s
Variance                  σ²           s²
Correlation coefficient   ρ            r

Concept of Standard Error (SE)

Standard Deviation (SD): the average amount of deviation of the different sample values from the mean value.
$SD = \sqrt{\sum (X - m)^2 / n}$, where X = sample value, m = mean value in the sample and n = sample size.

Standard Error (SE): the average amount of deviation of the different sample mean values from the population (true) mean value.

$SE = \sqrt{\sum (m - \mu)^2 / r}$, where $\mu$ = grand (combined) mean = estimate of the population mean and r = number of samples.

Computing SE with the above formula is difficult and may not be feasible, because it needs many samples. Hence SE is usually computed from one randomly selected sample of adequate size, as follows:

$SE = SD / \sqrt{n}$

Probability

Relative frequency, or probable chance, with which an event is expected to occur on an average in the long run.
Ratio of the number of occurrences of a favorable event to the total number of occurrences of all possible events.
No conclusion can be drawn with 100% certainty (confidence).
Probability is the measurement of the chance / uncertainty / subjectivity associated with a conclusion.

Two Types of Probability: (A) Mathematical (B) Statistical

(A) Mathematical probability: An experiment or a trial where the probabilities of occurrence of the various events / possibilities are already established mathematically.
Examples:
(1) Probability of getting a head when a coin is tossed
(2) Probability of getting a five when a die is thrown
(3) Probability of getting the ace of spades from a deck of cards

(B) Statistical / Empirical probability: An experiment or a trial is required to find out the probabilities of occurrence of the various events / possibilities.
Examples:
(1) Probability of getting a boy in the first pregnancy
(2) Probability of a couple getting twins
(3) Probability of improvement after treatment for a specified period
(4) Probability of getting lung cancer in smokers
(5) Probability of an association of sedentary type of work with diabetes
(6) Probability that drug A is better than drug B in curing a disease

Probability Distributions

Several types of probabilities are computed on the basis of a few basic theorems.
A series of probabilities associated with the various occurrences / outcomes / possibilities of events in an experiment / trial / study generates a probability distribution.

Basically, there are three types of probability distributions: Binomial, Poisson and Normal.

Binomial and Poisson distributions: for discrete variables
Normal distribution: for continuous variables

The most important probability distribution in statistical inference is the Normal (Gaussian) distribution, which generates the Normal (Gaussian) curve.

Properties of the Normal Curve:

(1) It is bell shaped and symmetrical.
(2) The three types of averages (the mean, the median and the mode) coincide.
(3) The total area under the normal curve is equal to 1.
(4) Fifty percent of the sample values lie to the left of the perpendicular drawn at the middle (the mean) and the remaining 50% lie to the right of this line.
(5) Mean - 1 SD and Mean + 1 SD include about 68% of the sample values.
(6) Mean - 2 SD and Mean + 2 SD include about 95% of the sample values.
(7) Mean - 3 SD and Mean + 3 SD include about 99.7% of the sample values.
(8) Theoretically, the curve touches the horizontal line only at infinity.
(9) (Sample value - Mean) / SD, which is called the Standard Normal Deviate or Z-score, is distributed with a mean of 0 and an SD of 1, whatever the variable may be. This is a very important property. Inference theory is based on this property.

Estimation of Population Parameters

Two types of Estimation

(1) Point estimation (Estimation without Confidence): Values of the mean, proportion, correlation coefficient etc. computed from the sample serve as estimates of the population parameters. Such an estimate is a single value and is called a Point estimate.
(2) Interval estimation (Estimation with Confidence): A lower limit (LL) and an upper limit (UL) are computed from the sample values. It can be said, with a certain amount of confidence, that the population value (true value) of the parameter will lie within these limits. These limits are called Confidence limits or Interval estimates.

The LL and UL estimates for the population mean are given as:

mean - C * SE and mean + C * SE

C = Confidence coefficient, $SE = SD / \sqrt{n}$, n = sample size (* = multiplication sign).

For 95% confidence, C = 1.96; for 99% confidence, C = 2.58; for 99.9% confidence, C = 3.29.

Example-1: In a study of a sample of 100 subjects, the mean systolic blood pressure was found to be 120 mm of Hg with a standard deviation of 10 mm of Hg. Find the 95% confidence limits for the population mean of systolic blood pressure.

$SE = SD / \sqrt{n} = 10 / \sqrt{100} = 10 / 10 = 1$
LL: mean - 1.96 * 1 = 120 - 1.96 = 118.04
UL: mean + 1.96 * 1 = 120 + 1.96 = 121.96

i.e., the population mean value of systolic blood pressure will lie between 118.04 and 121.96, and we can make this statement with 95% confidence.

Example-2: In a study of 10,000 persons in a town, 100 of them were found to be affected by tuberculosis. Find the 99% confidence limits for the population prevalence rate.

$SE = \sqrt{pq / n}$, where p = (100 / 10000) * 100 = 1% and q = 100 - p = 100 - 1 = 99%
$SE = \sqrt{(1 \times 99) / 10000} = 0.0995$
LL = p - 2.58 * 0.0995 = 1 - 0.2567 = 0.7433 = 0.74%
UL = p + 2.58 * 0.0995 = 1 + 0.2567 = 1.2567 = 1.26%

i.e., the population prevalence rate of tuberculosis will lie between 0.74% and 1.26%, and we can say this with 99% confidence.

Statistical Hypothesis

A declarative statement about the parameters of the population or about the form of the distribution of the variable in the population.

Examples:
1. Mean systolic blood pressure (M) in normal subjects of 30 years of age in the population is equal to 120 mm, i.e., M = 120.
2. Mean cholesterol value in hypertension patients (M1) > mean cholesterol value in normals (M2), i.e., M1 > M2.
3. Percentage of babies born with low birth weight to anaemic women (P1) is greater than that to normal women (P2), i.e., P1 > P2.
4. Occurrence of lung cancer is associated with smoking.
5. Birth weights of children are normally distributed.

Null Hypothesis (Ho)

No difference in average values or percentages between two or several populations.
Examples:
(1) Mean cholesterol value in hypertension patients (M1) = mean cholesterol value in normals (M2)
(2) Percentage of babies born with low birth weight to anaemic women (P1) = percentage of babies born with low birth weight to normal women (P2)
(3) No association between lung cancer and smoking

Alternative Hypothesis (H1), two sided

There is a difference in average values or percentages between two or several populations: M1 ≠ M2, P1 ≠ P2

Alternative Hypothesis (H1), one sided

M1 > M2 or M2 > M1; P1 > P2 or P2 > P1
Examples:
(1) Mean cholesterol value in hypertension patients (M1) > mean cholesterol value in normals (M2)
(2) Percentage of babies born with low birth weight to anaemic women (P1) > percentage of babies born with low birth weight to normal women (P2)
(3) There is an association between lung cancer and smoking: prevalence of lung cancer is higher in smokers than in non-smokers

TYPE-I & TYPE-II ERRORS

Consider the following 2 x 2 table:

              Ho True              Ho False
Accept Ho     (no error)           β (Type-II error)
Reject Ho     α (Type-I error)     (no error)

Type-I error: α (level of significance)
= probability of rejecting Ho when it is actually true
= probability of finding an effect when actually there is no effect.

The p-value computed from the data is compared with α. It measures the strength of evidence by indicating the probability that a result at least as extreme as that observed would occur by chance alone if Ho were true.

1 - α = Confidence coefficient
= probability of accepting Ho when it is true
= probability of not finding an effect when actually there is no effect.
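The confidence limits of Example-1 and Example-2 in the interval estimation material above can be recomputed with a few lines of Python. This is only a minimal sketch: the function names are illustrative, not part of the original slides, and the confidence coefficients 1.96 and 2.58 are the values quoted in the text.

```python
import math

def mean_confidence_limits(mean, sd, n, c=1.96):
    """Return (LL, UL) = mean -/+ c * SE, with SE = SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - c * se, mean + c * se

def percent_confidence_limits(cases, n, c=2.58):
    """Return (LL, UL) in percent, with SE = sqrt(p * q / n) and q = 100 - p."""
    p = 100.0 * cases / n
    q = 100.0 - p
    se = math.sqrt(p * q / n)
    return p - c * se, p + c * se

# Example-1: n = 100, mean SBP = 120, SD = 10, 95% confidence (C = 1.96)
ll_bp, ul_bp = mean_confidence_limits(120, 10, 100, c=1.96)

# Example-2: 100 tuberculosis cases among 10,000 persons, 99% confidence (C = 2.58)
ll_tb, ul_tb = percent_confidence_limits(100, 10000, c=2.58)

print(f"SBP: {ll_bp:.2f} to {ul_bp:.2f}")
print(f"TB prevalence: {ll_tb:.2f}% to {ul_tb:.2f}%")
```

Passing a different value of c (for instance 3.29) gives the limits at another confidence level without changing anything else.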
Type-II error: β
= probability of accepting Ho when it is actually false
= probability of not finding an effect when actually there is an effect.

1 - β = Power of the test
= probability of rejecting Ho when it is false
= probability of finding an effect when actually there is an effect.

• When the null hypothesis is rejected, the type-I error is to be stated. Maximum error allowed: 5%, i.e., minimum confidence required: 95%.
• When the null hypothesis is accepted, the type-II error is to be stated. Maximum error allowed: 20%, i.e., minimum power required: 80%.
• When the null hypothesis is rejected at the chosen level of significance, the sample size, whatever it may be, can be taken as adequate; but
• when the null hypothesis is accepted, the adequacy of the sample size has to be checked, before accepting Ho, by computing the power of the test.

Testing the Statistical Significance of a Hypothesis

Testing the statistical significance of a hypothesis is the process of calculation, using the sample results, to decide whether the null hypothesis can be accepted or has to be rejected.

Steps:
1. State the null hypothesis: Ho
2. State the alternate hypothesis: H1 (one sided / tailed or two sided / tailed)
3. State the distribution of the sample statistic or of the difference (Normal, Student's 't' or chi-square).
4. State the level of significance (α, or type-I error) desired.
5. Compute the Test Statistic: TS = (difference in parameter values) / (SE of the difference)
6. Find the Critical Ratio (CR) from the statistical table at the chosen level of significance.
7. Take the decision:
   a. If TS < CR: accept Ho, i.e., the difference in parameter values is not statistically significant.
   b. If TS > CR: reject Ho and accept H1, i.e., the difference in parameter values is statistically significant.
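The steps above can be sketched in code for the case where the test statistic follows the Normal distribution and the alternative is two sided. This is a hedged illustration, not the author's procedure: the function names are invented for the example, the difference of 5 units with an SE of 2 is a hypothetical input, and the critical ratio is taken from Python's `statistics.NormalDist` instead of a printed table.

```python
from statistics import NormalDist

def critical_ratio(alpha):
    """Two-sided critical ratio of the standard Normal distribution for level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def decide(difference, se_difference, alpha=0.05):
    """Steps 5 to 7: compute TS, look up CR, compare them."""
    ts = abs(difference) / se_difference
    cr = critical_ratio(alpha)
    decision = "reject Ho" if ts > cr else "accept Ho"
    return ts, cr, decision

# Hypothetical inputs, for illustration only: difference = 5, SE of difference = 2
ts, cr, decision = decide(5.0, 2.0, alpha=0.05)
print(f"TS = {ts:.2f}, CR = {cr:.2f}: {decision}")

# Critical ratios for the three usual levels of significance
for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(critical_ratio(alpha), 2))
```

For small samples the critical ratio would have to come from Student's 't' distribution with the appropriate degrees of freedom, which the standard library does not provide.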
If p < 0.05, Confidence (C) > 95%; if p < 0.01, C > 99%; and if p < 0.001, C > 99.9%.

Guidelines, Steps and Examples in Tests of Significance

(A) Continuous variable:

(1) Ho: Null Hypothesis: μ1 = μ2
μ1 = mean gain in weight of infants who received supplementary diet
μ2 = mean gain in weight of infants who did not receive supplementary diet

(2) H1: Alternate Hypothesis: μ1 ≠ μ2

(3-a) If the population distribution of gain in weight in both groups is NORMAL (either known from earlier studies or established from the random samples), or both sample sizes are large (n1 and n2 > 30), the TEST STATISTIC is Z and the test is called the NORMAL TEST.

(3-b) If n1 or n2 or both n1 and n2 < 30, the TEST STATISTIC is Student's 't' and the test is called Student's 't' TEST.

(4) Level of Significance (α: Type-I error): If α = 0.05, Confidence (C) = 95%; if α = 0.01, C = 99%; if α = 0.001, C = 99.9%.

(5) Test Statistic or Test Criterion (Z), if Normal or n1, n2 > 30:

$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$

where $\bar{X}_1$ and $\bar{X}_2$ are the mean values in the two samples and $S_1^2$ and $S_2^2$ are the corresponding variances ($S_1$ and $S_2$ being the standard deviations).

(6) Critical Ratio (C.R.): If α = 0.05, C.R. = 1.96; if α = 0.01, C.R. = 2.58; and if α = 0.001, C.R. = 3.29.

(7) Taking the decision on the difference in means between the two groups:

    Z < 1.96   Not significant (Ho is acceptable)   (p > 0.05)
(a) Z > 1.96   Significant                          (p < 0.05)
(b) Z > 2.58   Highly significant                   (p < 0.01)
(c) Z > 3.29   Very highly significant              (p < 0.001)

(Ho is rejected in (a), (b) and (c).)

Various Tests of Statistical Significance

(a) To test the statistical significance of the difference between the sample mean and the population mean

Ho: $\bar{X} = \mu$; H1: $\bar{X} \neq \mu$; α = 0.05, CR = 1.96

$TC = Z = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$

Example: Mean SBP in the population = 120, mean SBP in the sample = 115 (n = 100, SD = 20)

$Z = (120 - 115) / (20 / \sqrt{100}) = 2.5$, i.e., TC > CR.
p < 0.05. The means in the population and the sample are significantly different, or the sample does not represent the population with respect to SBP.

(b) To test the statistical significance of the difference in mean values between two populations

(1) Large samples:

$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$

If Z < 1.96, the difference in means between the two populations can be considered statistically not significant.

Test of Homogeneity of Variances (Fisher's 'F')

• One of the assumptions which has to be satisfied for applying Student's t test is homogeneity of variances in the two populations. This is tested by computing Fisher's F statistic:

$F = \sigma_1^2 / \sigma_2^2$ for $(n_1 - 1)$, $(n_2 - 1)$ d.f. $(\sigma_1 > \sigma_2)$

• If the computed F value is less than the critical ratio of F at $(n_1 - 1)$, $(n_2 - 1)$ d.f., the assumption of homogeneity of variances in the two populations can be accepted. Otherwise, the variances in the two populations are heterogeneous.

(2) Small samples (n1 or n2 or both n1 and n2 < 30) and $\sigma_1 = \sigma_2$: homogeneity of variances in the two populations is assumed and accepted.

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{S \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $S = \sqrt{\dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$

Critical ratio values depend upon the degrees of freedom $(n_1 + n_2 - 2)$.

(3) Small samples (n < 30) and $\sigma_1 \neq \sigma_2$: homogeneity of variances in the two populations is not accepted. In such a case the modified 't' test has to be applied.

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$, $\quad t' = \dfrac{t_{n_1 - 1} \dfrac{S_1^2}{n_1} + t_{n_2 - 1} \dfrac{S_2^2}{n_2}}{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$

where $t_{n_1 - 1}$ and $t_{n_2 - 1}$ are the tabulated critical values of t at the chosen level of significance with $(n_1 - 1)$ and $(n_2 - 1)$ d.f.

If t > t', p < 0.05 (significant); if t < t', p > 0.05 (not significant).

Example: Weight (kg) of school-going (A) and non-school-going (B) children of 5 years of age in slum areas

(1) n1 and n2 > 30

Population   Sample size   Mean   S.D.
A            100           17.4   3.0
B            100           13.2   2.5

Z = 10.76 (p < 0.001), i.e., $\mu_A \neq \mu_B$ ($\mu_A > \mu_B$)

(2) n1 and n2 < 30 ($\sigma_1 = \sigma_2$)

Population   Sample size   Mean   S.D.
A            15            17.4   3.0
B            10            13.2   2.5

$F = (3.0)^2 / (2.5)^2 = 1.44 < 3.00$ (for 14 and 9 d.f. at α = 0.05).
Hence, the assumption of homogeneity of variances in the two populations can be accepted.

t = 3.65 > 2.81 (for 23 d.f. at α = 0.01) but < 3.77 (for 23 d.f. at α = 0.001), i.e., p < 0.01, i.e., $\mu_A \neq \mu_B$ ($\mu_A > \mu_B$)

(3) n1 and n2 < 30 and $\sigma_1 \neq \sigma_2$

Population   Sample size   Mean   S.D.
A            15            17.4   1.8
B            10            13.2   4.2

$F = (4.2)^2 / (1.8)^2 = 5.44 > 2.65$ (for 9 and 14 d.f. at α = 0.05), i.e., the assumption of homogeneous variances in the two populations cannot be accepted ($\sigma_1 \neq \sigma_2$) and hence the modified 't' test has to be applied.

t = 2.98 > 2.25 (t' at α = 0.05) but < 3.22 (t' at α = 0.01), i.e., p < 0.05, i.e., $\mu_A \neq \mu_B$ ($\mu_A > \mu_B$)

(4) Paired samples:

$t = \dfrac{\bar{d}}{S_d / \sqrt{n}}$

where $\bar{d}$ = mean of the differences and $S_d$ = SD of the differences; degrees of freedom = n - 1.

Systolic B.P.

Patient number   1     2     3     4     5     6     7     8     9     10
Before drug      160   150   170   130   140   170   160   160   120   140
After drug       140   110   165   140   145   120   130   110   120   130

                    Mean   S.D.
Before drug         150    17.00
After drug          131    17.13
Change (decrease)   19     22.46

$t = 19 / (22.46 / \sqrt{10}) = 2.67 > 2.26$ (t at α = 0.05 with 9 d.f.), i.e., p < 0.05, i.e., the decrease of 19 units, on an average, in the systolic BP after giving the drug is statistically significant at the 5% level of significance.

(5) Analysis of Variance (ANOVA)

• To test the statistical significance of the differences in mean values of a variable among different groups (more than TWO groups).
• In the case of two groups, Student's 't' test is applied.
• The added advantage of ANOVA is that the total variance can be partitioned into different components (due to several factors), which enhances the validity of the comparison of the means among the different groups.
• This is not possible in the case of the 't' test.

Designs

Basically, THREE important experimental designs are used in ANOVA. They are:
1. Completely Randomized Design (CRD) (One-way ANOVA)
2. Randomized Complete Block Design (RCBD) (Two-way or Multiple-way ANOVA)
3. Repeated Measures Design (Before & After Design) (Two-way, Between TIME Analysis)

• 1. CRD: If only ONE FACTOR affecting the study variable is studied, the Completely Randomized Design (CRD) / One-way ANOVA is used.
Example: The study population consists only of children who are severely malnourished, and a clinical trial is conducted to study the efficacy of three methods (diet, drug and placebo) in increasing their weight.

• 2. RCBD: If TWO or more factors affecting the study variable are studied, OR if the study elements in the population are HETEROGENEOUS with respect to one or more factors in addition to the main factor studied, the Randomized Complete Block Design (RCBD) / Two-way or Multiple-way ANOVA is used.
Example:
• The population consists of children who are mildly, moderately or severely malnourished, and a clinical trial is conducted to study the efficacy of three methods (diet, drug and placebo) in increasing their weight.
• Here, the children are classified according to their malnourishment status and, within each group, are randomly allocated to the three methods of treatment.
• This design enhances the validity of the comparison of the mean weight increase among the three groups as compared to the Completely Randomized Design.

• 3. Repeated Measures Design: If the values of a variable are recorded for the subjects BEFORE and AFTER an INTERVENTION (more than once after the intervention), the Repeated Measures Design is adopted for a valid comparison of the mean values of the variable between the various timings of recording, taking into consideration the variation between the subjects.
Example: Blood pressure values of hypertension patients were recorded before, ONE week after and TWO weeks after giving a drug. To test the statistical significance of the differences in mean BP among the THREE timings of recording, Repeated Measures Analysis will enable us to make a more valid comparison.

Homogeneity of variances

Before applying the ANOVA test, the HOMOGENEITY (EQUALITY) of VARIANCES of the variable in the different groups has to be tested.
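The two-sample and paired comparisons worked through earlier (weights of school-going and non-school-going children, and systolic BP before and after the drug) can be recomputed from the figures given in those examples. The sketch below uses illustrative function names that are not from the slides, and it only produces the test statistics; the critical values of t still have to be read from a table.

```python
import math
from statistics import mean, stdev

def unpooled_statistic(m1, s1, n1, m2, s2, n2):
    """(mean1 - mean2) / sqrt(S1^2/n1 + S2^2/n2): Z for large samples, modified t otherwise."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t with a pooled SD; returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    return (m1 - m2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2)), df

def paired_t(before, after):
    """Paired t = mean(d) / (SD(d) / sqrt(n)); returns (t, degrees of freedom)."""
    d = [b - a for b, a in zip(before, after)]
    return mean(d) / (stdev(d) / math.sqrt(len(d))), len(d) - 1

# (1) Large samples: n1 = n2 = 100
z_large = unpooled_statistic(17.4, 3.0, 100, 13.2, 2.5, 100)
# (2) Small samples, equal variances assumed
t_equal, df_equal = pooled_t(17.4, 3.0, 15, 13.2, 2.5, 10)
# (3) Small samples, unequal variances (modified t)
t_unequal = unpooled_statistic(17.4, 1.8, 15, 13.2, 4.2, 10)
# (4) Paired samples: systolic BP of the 10 patients
before = [160, 150, 170, 130, 140, 170, 160, 160, 120, 140]
after = [140, 110, 165, 140, 145, 120, 130, 110, 120, 130]
t_pair, df_pair = paired_t(before, after)

print(f"Z = {z_large:.2f}")
print(f"t (pooled) = {t_equal:.2f} with {df_equal} d.f.")
print(f"t (modified) = {t_unequal:.2f}")
print(f"t (paired) = {t_pair:.2f} with {df_pair} d.f.")
```

`statistics.stdev` divides by n - 1, which is the form of the SD that reproduces the 22.46 reported for the BP differences.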
The most commonly used test is BARTLETT's test. If this test shows non-significance, ANOVA can be applied to the original values of the variable. If it shows statistical significance, an appropriate transformation (log, square root, inverse etc.) has to be applied to the original values before applying ANOVA.

MULTIPLE RANGE TESTS

If the Analysis of Variance gives a statistically significant F value for the treatment variation (i.e., if the ANOVA shows statistically significant differences in the mean values among the groups), an appropriate Multiple Range Test is to be applied to find out the significantly different pairs of groups. The most commonly used Multiple Range Test is the Student-Newman-Keuls (SNK) test.

PROBLEMS IN ANOVA

(1) ONE-WAY ANOVA (COMPLETELY RANDOMIZED DESIGN)

A study was conducted to investigate the effect of supplementary nutrition, a drug and a placebo in increasing the weight of severely malnourished children. Fifteen severely malnourished children were randomly divided into three groups A, B and C. Group A was given supplementary nutrition, Group B the drug and Group C the placebo. Gain in weight in these children was noted after one month of treatment. Test whether the differences in weight gain, on an average, among the three groups are statistically significant or not at the 5% level of significance. Also test whether the difference between any two groups is statistically significant or not at the 5% level of significance.

Gain in weight (kg)

A      B      C      Total
0.20   0.10   0.05   0.35
0.15   0.10   0.10   0.35
0.10   0.05   0.05   0.20
0.30   0.15   0.05   0.50
0.25   0.20   0.15   0.60

ANOVA TABLE

Source of variation   d.f.   S.S.     M.S.S.   F      p
Between groups        2      0.0373   0.0186   4.91   < 0.05
Error                 12     0.0460   0.0038
Total                 14     0.0833

d.f.: degrees of freedom; S.S.: sum of squares; M.S.S.: mean sum of squares; F: F statistic; p: level of significance

F at α = 0.05 with 2, 12 d.f. = 3.89; F at α = 0.01 with 2, 12 d.f. = 6.93

Computed F (4.91) > 3.89, but < 6.93.
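The sums of squares in the one-way ANOVA table above can be recomputed directly from the fifteen weight gains. The following is a minimal sketch using the usual correction-factor method; the function name and the returned dictionary keys are illustrative choices, and the F critical values still come from a table.

```python
def one_way_anova(groups):
    """One-way ANOVA (CRD): returns sums of squares, degrees of freedom and F."""
    values = [x for g in groups for x in g]
    n_total = len(values)
    correction = sum(values) ** 2 / n_total          # (grand total)^2 / N
    total_ss = sum(x * x for x in values) - correction
    between_ss = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    error_ss = total_ss - between_ss
    df_between = len(groups) - 1
    df_error = n_total - len(groups)
    f_value = (between_ss / df_between) / (error_ss / df_error)
    return {
        "total_ss": total_ss, "between_ss": between_ss, "error_ss": error_ss,
        "df_between": df_between, "df_error": df_error, "F": f_value,
    }

# Gain in weight (kg) in groups A, B and C
group_a = [0.20, 0.15, 0.10, 0.30, 0.25]
group_b = [0.10, 0.10, 0.05, 0.15, 0.20]
group_c = [0.05, 0.10, 0.05, 0.05, 0.15]

result = one_way_anova([group_a, group_b, group_c])
for key, value in result.items():
    print(key, round(value, 4))
```

With unrounded mean squares the F value comes out near 4.87; the 4.91 in the table arises when the error mean square is first rounded to 0.0038. Either way the value lies between the tabulated 3.89 and 6.93.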
i.e., the differences in gain in weight, on an average, among the three groups of children are statistically significant (p < 0.05; confidence = 95%).

Multiple Comparison Test: Since the ANOVA gave a significant F value, we have to find out which groups are significantly different by applying a multiple comparison test. The most commonly used multiple comparison test is the Student-Newman-Keuls (SNK) test.

Treatment group            A      B      C
Mean gain in weight (kg)   0.20   0.12   0.08

On applying the SNK test using statistical software, it was found that the gain in weight in severely malnourished children who received supplementary diet was significantly larger, on an average, than in those who received placebo (p < 0.05; confidence = 95%). However, the differences observed in gain in weight between those who received supplementary diet and drug, or between those who received drug and placebo, were statistically not significant (p > 0.05).

(2) TWO-WAY ANOVA (Randomized Complete Block Design, RCBD)

In a clinical trial to test the efficacy of two drugs and a placebo on the sleeping hours of mental patients, it was thought that the age of the patient could also influence the sleeping hours. Hence, the patients were stratified according to their age group and then randomly distributed into the three treatment groups.

IMPROVEMENT IN SLEEPING HOURS

Age group (years)   A     B     Placebo   Total
24-34               2.3   1.6   0.6       4.5
35-44               2.0   1.4   0.4       3.8
45-54               1.8   1.0   0.3       3.1
55 and more         1.2   0.8   0.3       2.3

ANOVA TABLE

Source of variation   d.f.                                    S.S.     M.S.S.   F      p
Due to age            (r-1) = 3                               0.89     0.297    8.2    < 0.05
Due to drug           (p-1) = 2                               4.0825   2.0412   56.4   < 0.001
Error                 (n-1) - (r-1) - (p-1) = n-r-p+1 = 6     0.2175   0.0362
Total                 (n-1) = 11                              5.19

Conclusions: The variation in improvement due to age is significant (p < 0.05), i.e., accounting for the variation due to age has helped in reducing the error mean sum of squares, i.e., in improving the precision of the estimate.
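The two-way ANOVA table for the sleeping-hours trial can be recomputed from the 4 x 3 table of improvements, treating age groups as blocks and the drugs as treatments with one observation per cell. This is a sketch under that assumption; the function name and dictionary keys are illustrative and not from the slides.

```python
def two_way_anova(table):
    """RCBD ANOVA: table[i][j] is the value for block i (row) and treatment j (column)."""
    r, p = len(table), len(table[0])
    values = [x for row in table for x in row]
    correction = sum(values) ** 2 / (r * p)
    total_ss = sum(x * x for x in values) - correction
    block_ss = sum(sum(row) ** 2 for row in table) / p - correction
    column_totals = [sum(row[j] for row in table) for j in range(p)]
    treatment_ss = sum(t * t for t in column_totals) / r - correction
    error_ss = total_ss - block_ss - treatment_ss
    df_error = (r - 1) * (p - 1)
    ms_error = error_ss / df_error
    return {
        "total_ss": total_ss, "block_ss": block_ss,
        "treatment_ss": treatment_ss, "error_ss": error_ss,
        "F_block": (block_ss / (r - 1)) / ms_error,
        "F_treatment": (treatment_ss / (p - 1)) / ms_error,
        "df_error": df_error,
    }

# Improvement in sleeping hours: rows are age groups, columns are A, B, placebo
sleep = [
    [2.3, 1.6, 0.6],   # 24-34
    [2.0, 1.4, 0.4],   # 35-44
    [1.8, 1.0, 0.3],   # 45-54
    [1.2, 0.8, 0.3],   # 55 and more
]

anova = two_way_anova(sleep)
for key, value in anova.items():
    print(key, round(value, 4))
```

The recomputed figures agree with the table up to rounding in the last digits. The same function can be applied to a before/after layout in which each subject is a block and each time point is a column.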
Differences in mean improvement in sleeping hours among the three treatment groups are statistically significant (p < 0.001).

Drug      Mean improvement in sleeping hours
Drug A    1.825 (A)
Drug B    1.200 (B)
Placebo   0.400 (C)

On applying the SNK test using statistical software, it was found that the improvement in sleeping hours with drug A was significantly higher than that with drug B and placebo (p < 0.01), and that with drug B was significantly higher, on an average, than that with placebo.

(3) TWO-WAY ANOVA (RCB design where the individuals themselves serve as blocks)

Systolic blood pressure values of 10 patients before treatment, 1 week after treatment and 2 weeks after treatment are given below. Test whether the change (reduction) in systolic blood pressure 1 week and 2 weeks after treatment is statistically significant or not.

Sl. No.   Before   After 1 week   After 2 weeks   Total
1         170      160            140             470
2         165      160            135             460
3         180      170            140             490
4         175      165            135             475
5         165      160            135             460
6         180      160            140             480
7         175      170            145             490
8         160      150            125             435
9         155      140            120             415
10        165      145            120             430
Total     1690     1580           1335            4605
Mean      169      158            133.5

TWO-WAY ANOVA TABLE

Source of variation    d.f.   S.S.      M.S.S.   F       p
Between Time (T)       2      6605.0    3302.5   260.2   < 0.001
Between Patients (P)   9      2024.17   224.9    17.7    < 0.001
Error (E)              18     228.33    12.69
Total                  29     8857.5

Conclusions: Variation due to patients was found to be statistically significant at α = 0.001, i.e., the variation in BP among patients is statistically significant. After accounting for this variation, the differences in mean BP among the three time periods are found to be statistically significant (p < 0.001). On applying the SNK test, it was found that the reductions in BP from before treatment to 1 week after treatment and to 2 weeks after treatment were statistically significant (p < 0.001). The reduction from 1 week to 2 weeks after treatment is also statistically significant (p < 0.001).

INFERENCE METHODS for DISCRETE VARIABLES

Estimation:
1. Point estimate: proportion, percentage, ratio, rate
2. Interval estimate: 95%, 99% or 99.9% confidence intervals for a proportion or percentage

Point estimate:
1. Proportion of persons diagnosed as cases in a survey of diabetes (p = 0.14 or 14%)
2. Proportion of smokers with lung cancer (p = 0.24 or 24%)
3. Sex ratio: 970 females / 1000 males; doctor / population ratio: 1 : 10,000
4. Birth rate, death rate etc.

Interval estimate: $S.E. = \sqrt{pq / n}$

(1) If p = 0.14 and n = 900, $S.E. = \sqrt{(0.14 \times 0.86) / 900} = 0.0116$
95% confidence limits: p - 1.96 SE and p + 1.96 SE: 0.1173 and 0.1627

(2) If p = 24% and n = 10,000, SE = 0.43
99% confidence limits: p - 2.58 SE and p + 2.58 SE: 22.9 and 25.1

Tests of Significance:
1. Z-test (proportion)
2. $\chi^2$ test (2 x 2, 2 x n, r x n)
3. Matched $\chi^2$ test (McNemar's) (2 x 2 or p x p)

Example: The distribution of children according to their sex and nutritional grading is given in the table below:

                   Nutritional grading
Sex      Normal    Gr I      Gr II     Gr III/IV   Total
Male     25 (18)   45 (42)   25 (30)   5 (10)      100
Female   11 (18)   39 (42)   35 (30)   15 (10)     100
Total    36 (18)   84 (42)   60 (30)   20 (10)     200 (100)

(1) 2 x 2 Contingency Table:

Sex   Normal    Malnourished   Total
M     25 (18)   75 (82)        100
F     11 (18)   89 (82)        100
T     36        164            200

Malnourished = Gr I, Gr II, Gr III and Gr IV

Ho: No association between sex and nutritional status
H1: There is an association between sex and nutritional status

Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ with 1 d.f. (degree of freedom), where O = observed number and E = expected number.

Degrees of freedom is the number of independent cells (groups) in the data. If there are four cells, the d.f. will be 1, since there is only one independent cell: the numbers in the other three cells can be determined by subtracting the available cell number from the corresponding marginal totals.

$\chi^2 = 6.64$ = 6.64 (critical ratio with 1 d.f. at the 1% level of significance), i.e., p = 0.01.

When the expected number in any cell is less than 5, which may happen in the case of small samples and rare events, a continuity correction has to be applied in the formula: (O - E) should be replaced by (|O - E| - 0.5), giving

$\chi^2 = \sum \dfrac{(|O - E| - 0.5)^2}{E}$

Since the sample sizes in males and females are large and the expected numbers in all four cells are more than 5, the continuity correction need not be applied for this data.

Conclusions: The association between the sex of the child and nutritional status is statistically significant at the 1% level. The proportion of male children with normal nutrition (25%) is significantly higher than that of female children (11%). This statement can be made with 99% confidence.

In the case of a 2 x 2 contingency table, the statistical significance of the association can also be tested by applying the Proportion test.

(2) Proportion Test:

$z = \dfrac{|p_1 - p_2| - \dfrac{1}{2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $p = \dfrac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$ and $q = 1 - p$

The term $\dfrac{1}{2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)$ is to be included in the formula only in the case of small sample sizes and if the expected number in any cell is less than 5.

For the present data the samples are large, so the correction term (0.01) is left out:

$z = (0.25 - 0.11) / 0.0543 = 2.58$ = CR of 2.58 at the 1% level of significance (p = 0.01)

i.e., the proportion of male children with normal nutrition (25%) is significantly higher than that of female children (11%). This statement can be made with 99% confidence.

(3) 2 x n Table: In the example giving data on the nutritional grading of children, there are four nutritional groups (Normal, Gr I, Gr II, Gr III/IV) and two sexes (males and females).

Degrees of freedom = (4 - 1) * (2 - 1) = 3
$\chi^2 = 12.54 > 11.35$ (p < 0.01)

i.e., the association between sex and nutritional grading of children is statistically significant at the 1% level of significance (confidence = 99%).

(4) Matched $\chi^2$ test: To test the significance of the association between two categorical variables in correlated samples, the matched $\chi^2$ test due to McNemar has to be applied.
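The chi-square values for the 2 x 2 and 2 x n tables above, and the z value of the proportion test, can be recomputed from the observed counts. The sketch below is illustrative: the function names are not from the slides, no continuity correction is applied (matching the large-sample case described in the text), and the critical ratios still come from a table.

```python
import math

def chi_square(observed):
    """Pearson chi-square for an r x c table of observed counts; returns (statistic, d.f.)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n    # expected number
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

def proportion_z(x1, n1, x2, n2):
    """z for the difference of two proportions, without continuity correction."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# 2 x 2 table: sex (M, F) by nutritional status (normal, malnourished)
chi2_2x2, df_2x2 = chi_square([[25, 75], [11, 89]])
# 2 x 4 table: sex by nutritional grading (Normal, Gr I, Gr II, Gr III/IV)
chi2_2x4, df_2x4 = chi_square([[25, 45, 25, 5], [11, 39, 35, 15]])
# Proportion test: 25 of 100 males versus 11 of 100 females with normal nutrition
z = proportion_z(25, 100, 11, 100)

print(f"chi-square (2 x 2) = {chi2_2x2:.2f} with {df_2x2} d.f.")
print(f"chi-square (2 x 4) = {chi2_2x4:.2f} with {df_2x4} d.f.")
print(f"z = {z:.2f}, z squared = {z * z:.2f}")
```

For a 2 x 2 table the square of the uncorrected z equals the uncorrected chi-square, which is why the two tests lead to the same conclusion here.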
McNemar's $\chi^2 = \dfrac{(|b - c| - 1)^2}{b + c}$

The '- 1' needs to be included in the formula when the sample size is small.

The data in the table below give the results (+ve and -ve) of two tests, TA and TB, done on 100 subjects to diagnose the presence of a certain disease. TA is the existing test, which is expensive, and TB is the new test, which is comparatively cheaper. It has to be investigated whether the results of the two tests are statistically comparable, so that, if they are found comparable, test A can be replaced by the less expensive test B.

Example:

                 T-A (expensive, but confirmative)
T-B (cheap)      +           -           Total
+                8 (a)       12 (c)      20
-                8 (b)       72 (d)      80
Total            16 (16%)    84 (84%)    100

McNemar's $\chi^2$ = 0.8 (less than 3.84, the critical ratio with 1 d.f. at the 5% level of significance), i.e., the discrepancy in the results is statistically not significant. The results of the two tests agree well. Test A can be replaced by test B.

NON-PARAMETRIC STATISTICAL METHODS

The meaning of the word "Science" as given in the dictionary is "the truth ascertained by observation, experiment and induction". A vast amount of time, money and energy is being spent by society today in the pursuit of science. As anyone engaged in scientific work knows, the processes of observation, experiment and induction do not always lay bare the "truth". One experiment with one set of observations may lead two scientists to two different conclusions. The purpose of the body of methods known as "STATISTICS" is to provide the means for measuring the amount of subjectivity that goes into the scientist's conclusion.

• This is accomplished by setting up a theoretical model for the experiment in terms of probability.
• The laws of probability are applied to this model in order to determine the (chance) probabilities of the various possible outcomes of the experiment, under the assumption that chance alone determines the outcome of the experiment.
• The experimenter then has an objective basis for deciding whether the observed outcome was the result of the treatment that was applied or whether it could have occurred by chance alone.
• Although it is sometimes difficult to describe an appropriate theoretical model for the experiment, the real difficulty often comes after the model has been defined, in finding the probabilities associated with the model.

Many reasonable models have been invented for which probability solutions have been found. This body of statistics, i.e., applying a probability model to make inferences from the sample or experiment in order to arrive at valid conclusions, is known as "PARAMETRIC STATISTICAL METHODS" (for example, Student's t test and the F test).

In the parametric methods, exact solutions are found for an approximately suitable probability model. However, in the late 1930s a different approach to the problem of finding probabilities began to gather momentum. This approach involves making few assumptions in the model and using simple, unsophisticated methods to find the desired probability. Thus, approximate solutions to exact problems were found, as opposed to exact solutions to approximate problems. This new package of statistical methods came to be known as "NON-PARAMETRIC METHODS".

Advantages of non-parametric statistics over parametric statistics:
1. Simpler models
2. Easy computability
3. No assumption about the form of the population distribution of the variable
4. No need of a larger sample for making inferences

In applying a parametric inference model, the specific form of the distribution of the variable in the population is required. Also, the computation is sometimes not easy and hence not quick. However, randomness of the sample is required in applying non-parametric methods, as in the case of parametric methods.
There are no parameters such as the mean and standard deviation in the non-parametric models; hence the name NON-PARAMETRIC METHODS.

Since the assumption of a specific form of distribution of the variable is not required, non-parametric methods are also known as 'DISTRIBUTION FREE METHODS'.

Since non-parametric methods are based on RANKS, they are also called RANKING METHODS or ORDER STATISTICS.

Since the development of non-parametric methods has taken place only recently, comparable methods have not yet been developed for every inference method used in parametric statistics. However, most of the commonly used parametric inference methods have corresponding non-parametric methods.

Non-parametric methods may be applied when:-
1. The form of distribution of the values of the variable in the population(s) is not known.
2. Sample size is very small.
3. The researcher does not have the mathematical background to understand and apply the parametric methods. Of course, this is not a compromise.
4. The researcher would like to make inferences as quickly as possible.

It has been shown by some researchers that the power of many non-parametric methods is lower than that of the corresponding parametric methods. Hence, it is suggested that one should try one's best to apply the parametric inference methods if the conditions for applying such methods are met. This can sometimes be achieved by a suitable transformation of the values of the variables. If all these approaches fail, then the only way of arriving at conclusions with some validity and robustness is by applying the non-parametric methods.

1. Wilcoxon's Rank Sum test: For testing whether two independent samples with respect to a variable come from the same population or not, i.e., does one population tend to yield larger values than the other, or are the two medians equal? Corresponds to the Normal test (Z) or the Student's 't' test for two independent samples.
2. Wilcoxon's Signed Rank test: For testing whether the differences observed in the values of the variable between two correlated populations ( before and after design ) are statistically significant or not. Corresponds to the Paired 't' test in parametric methods.
3. Kruskal-Wallis One-way Analysis of Variance: For testing whether several independent samples come from the same population or not. Corresponds to One-way Analysis of Variance in parametric methods.
4. Friedman's Two-way Analysis of Variance: For testing whether the differences observed in the values of the variable between different time periods are statistically significant or not. Corresponds to the Two-way Analysis of Variance in parametric methods.

All the non-parametric methods can be applied manually by ranking the observations appropriately and doing simple computation.

Computer packages: BMDP, SPSS, SAS and SYSTAT

Statistical Estimation:

                                 Parametric                            Non-parametric
1. Representative value          Mean, Median, Mode                    Median, Mode
2. Variation                     Standard Deviation (SD)               Quartile Deviation, Range
3. Correlation                   Pearson's product-moment              Spearman's rank
                                 corr. coefficient                     corr. coefficient
4. Intervals for the estimate    Mean ± SD                             Quartiles (Q10-Q90), Percentiles (P3-P97)

Statistical Tests of Significance

1. Comparison between two independent populations:
                 Parametric           Non-parametric
   Continuous    Z-test, t-test       Wilcoxon's Rank Sum test
   Discrete      Z-test               χ²-test

2. Comparison between two correlated populations:
                 Parametric           Non-parametric
   Continuous    Paired 't' test      Wilcoxon's Signed Rank test
   Discrete      ---                  McNemar's χ²-test

3. Comparison among several independent populations:
                 Parametric           Non-parametric
   Continuous    One-way Anova        Kruskal-Wallis One-way Anova
   Discrete      ---                  χ²-test

4. Comparison among several correlated populations:
                 Parametric           Non-parametric
   Continuous    Two-way Anova        Friedman's Two-way Anova
   Discrete      ---                  McNemar's χ²-test

EXAMPLES:

( A ) Independent samples: Intelligence quotient ( IQ ) of 5 normally nourished children ( NN ) and 4 malnourished children ( MN ), aged 4 years, are given below:-
NN : 60, 80, 120, 130, 100
MN : 50, 60, 100, 45
Null hypothesis: IQs in the two groups are statistically the same, on an average.
On applying Wilcoxon's Rank Sum test using statistical software, p = 0.11. Since p is greater than 0.05, the difference in IQ values in the two groups is statistically not significant and the hypothesis of identical IQ values, on average, in the two groups is not rejected.

( B ) Paired ( repeated ) samples: IQ values before ( b ) and after ( a3 ) giving the diet for three months:-
Before ( b )  : 40, 60, 55, 65, 43, 70, 80, 60
After ( a3 )  : 50, 80, 50, 70, 40, 60, 90, 85
On applying Wilcoxon's Signed Rank test ( the test for paired samples ) using the statistical software, p = 0.18. Since the p value is greater than 0.05, the difference in IQ values after giving the diet for three months is not statistically significant and the null hypothesis ( Ho ) of no difference in IQ after giving the diet is not rejected.

( C ) Independent samples, more than two groups: Intelligence quotient ( IQ ) of 5 normally nourished children ( NN ), 4 moderately malnourished children ( MN ) and 5 severely malnourished children ( SN ), aged 4 years, are given below:-
NN : 60, 80, 120, 130, 100
MN : 50, 60, 100, 45
SN : 50, 40, 60, 35, 65
On applying Kruskal-Wallis One-way Analysis of Variance, p = 0.0438, i.e., the differences in IQ among the three groups, on an average, are statistically significant. On applying the Multiple range test, it can be inferred that the differences in IQ between NN & MN and between MN & SN are statistically not significant and the difference between NN & SN is significant at the 5% level.
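Examples ( A ), ( B ) and ( C ) can be checked by ranking the observations by hand, as the text says. The following is a minimal sketch in plain Python, not the output of any of the packages named above. It makes several assumptions: the function names are illustrative; the p values come from large-sample approximations (normal for the two Wilcoxon tests, chi-square with 2 degrees of freedom for Kruskal-Wallis) with corrections for ties; the rank sum test uses a continuity correction and the signed rank test does not, a choice made here because the text does not say which software or settings produced its p values; and the paired data of example ( B ) are read as eight before/after pairs, the last pair being ( 60, 85 ).

```python
import math


def midranks(values):
    """Ranks 1..N in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def tie_term(values):
    """Sum of t^3 - t over every group of t tied values."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(t ** 3 - t for t in counts.values())


def two_sided_p(z):
    """Two-sided tail area of the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def rank_sum_test(x, y):
    """Wilcoxon rank sum test: (rank sum of x, approximate two-sided p)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    w = sum(midranks(pooled)[:n1])
    mean = n1 * (n + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term(pooled) / (n * (n - 1)))
    z = max(0.0, abs(w - mean) - 0.5) / math.sqrt(var)  # continuity-corrected
    return w, two_sided_p(z)


def signed_rank_test(before, after):
    """Wilcoxon signed rank test: (sum of positive ranks, approximate p)."""
    d = [a - b for b, a in zip(before, after) if a != b]  # zero differences dropped
    abs_d = [abs(v) for v in d]
    n = len(d)
    w_plus = sum(r for r, v in zip(midranks(abs_d), d) if v > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term(abs_d) / 48
    return w_plus, two_sided_p((w_plus - mean) / math.sqrt(var))


def kruskal_wallis_3(g1, g2, g3):
    """Kruskal-Wallis H (tie-corrected) and p for exactly three groups."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        r = sum(ranks[start:start + len(g)])
        total += r * r / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    h /= 1 - tie_term(pooled) / (n ** 3 - n)
    return h, math.exp(-h / 2)  # chi-square upper tail, exact for 2 df


nn = [60, 80, 120, 130, 100]
mn = [50, 60, 100, 45]
sn = [50, 40, 60, 35, 65]
before = [40, 60, 55, 65, 43, 70, 80, 60]   # assumed reading of example (B)
after3 = [50, 80, 50, 70, 40, 60, 90, 85]

w, p_a = rank_sum_test(nn, mn)
w_plus, p_b = signed_rank_test(before, after3)
h, p_c = kruskal_wallis_3(nn, mn, sn)
print(f"(A) rank sum of NN = {w}, p = {p_a:.2f}")
print(f"(B) sum of positive ranks = {w_plus}, p = {p_b:.2f}")
print(f"(C) H = {h:.2f}, p = {p_c:.3f}")
```

Under these assumptions the three p values come out close to the 0.11, 0.18 and 0.0438 quoted in the examples. With samples this small, software that uses exact tables rather than the large-sample approximation may report somewhat different values.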
( D ) Paired ( repeated ) samples, more than two occasions: IQ of 8 malnourished children of 4 years of age, before and after giving some nutritious diet for three months ( a3 ) and for six months ( a6 ), are given below:-
Before ( b )  : 40, 60, 55, 65, 43, 70, 80, 60
After ( a3 )  : 50, 80, 50, 70, 40, 60, 90, 85
After ( a6 )  : 70, 90, 100, 90, 75, 65, 70, 120
On applying Friedman's Two-way Analysis of Variance, p = 0.093, i.e., the differences in IQ after giving nutritious food for three and six months are statistically not significant at the 5% level. These data do not show that giving nutritious food for three or six months increases the IQ.

WISH YOU ALL A VERY FRUITFUL, USEFUL AND MEANINGFUL RESEARCH.

THANK YOU
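Example ( D ) can also be checked by hand: rank the three occasions within each child, add the ranks for each occasion, and compare the rank sums. The following is a minimal sketch in plain Python under stated assumptions: the function name `friedman_test` is illustrative; the flattened data of example ( D ) are read column by column as eight ( before, a3, a6 ) triples; no child has tied values across occasions; and the p value uses the chi-square approximation with 2 degrees of freedom, which is why the sketch accepts exactly three occasions.

```python
import math


def friedman_test(rows):
    """Friedman's test for 3 occasions: (rank sums, chi-square, approximate p).

    rows holds one (occasion 1, occasion 2, occasion 3) triple per subject.
    """
    n, k = len(rows), 3
    rank_sums = [0] * k
    for row in rows:
        if len(row) != k or len(set(row)) != k:
            raise ValueError("this sketch needs 3 untied values per subject")
        # Rank the occasions within this subject: 1 = lowest, 3 = highest.
        by_value = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(by_value, start=1):
            rank_sums[j] += rank
    stat = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return rank_sums, stat, math.exp(-stat / 2)  # chi-square upper tail, 2 df


# Assumed reading of the data in example (D).
before = [40, 60, 55, 65, 43, 70, 80, 60]
after3 = [50, 80, 50, 70, 40, 60, 90, 85]
after6 = [70, 90, 100, 90, 75, 65, 70, 120]

rank_sums, stat, p = friedman_test(list(zip(before, after3, after6)))
print(f"rank sums (b, a3, a6) = {rank_sums}")
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

With this reading of the data the rank sums are 13, 14 and 21, the statistic is 4.75 and the approximate p value is 0.093, which agrees with the value quoted in example ( D ).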
