STATISTICS
Meena Ganapathy

MEANING
Origin of the word :- Latin "status", Italian "statistica", German "Statistik", French "statistique".
Statistic (singular) :- One value associated with one unit, e.g., the weight of one person.
Statistics (plural) :- A set of such values, e.g., the weights of several persons.
Statistics as a branch of science (singular) :- It is a combination of logic & mathematics.

DIFFERENT BRANCHES OF STATISTICS
1) Medical statistics
2) Health statistics
3) Vital statistics
4) Biostatistics

STATISTICS
It is the branch of science which deals with the techniques of collection, compilation, presentation & analysis of data & the logical interpretation of the results.

USE OF STATISTICS
1. To collect data in the best possible way.
2. To describe the characteristics of a group or a situation.
3. To analyze data & to draw conclusions from such analysis.

DEFINITIONS
Variable :- A characteristic that takes different values in different persons, places or things. E.g., Ht, Wt, B.P., age. It is denoted by a capital letter, e.g., X. E.g., X: height, with observations x1, x2, x3, x4, ..., xn, where n = total number of observations.

ATTRIBUTE
A qualitative characteristic like sex or nationality is called an attribute.

CONSTANT
A characteristic which does not change its value or nature is considered a constant. E.g., blood group, sex.

OBSERVATION
An event or its measurement. E.g., B.P. is the event & 120/80 mm of Hg is the measurement.

OBSERVATION UNIT
The source that gives the observation, such as an object, a person, etc.

DATA
A set of values recorded on one or more observational units is called data. It gives numerical observations about the observational units. E.g., Ht, Wt, age.

SYMBOLS
= equal to
< less than
> greater than
≤ less than or equal to
≥ greater than or equal to
≠ not equal to
∑ summation

SHORT FORMS
A.M. :- Arithmetic mean
H.M. :- Harmonic mean
G.M. :- Geometric mean
C.V. :- Coefficient of variation
S.E. :- Standard error
S.D. :- Standard deviation
D.F. :- Degrees of freedom
C.I. :- Confidence interval
E :- Expected value of a cell of a contingency table
O :- Observed value of a cell of a contingency table
N :- Population size
n :- Sample size
α :- Level of significance (L.O.S.)
H0 :- Null hypothesis
H1 :- Alternative hypothesis

TYPES OF DATA
Qualitative and quantitative
Discrete and continuous
Primary and secondary
Grouped and ungrouped

QUALITATIVE & QUANTITATIVE DATA
Qualitative data :- It is also called enumeration data. It represents a particular quality or attribute; there is no notion of measurement. It is classified by counting the individuals having the same characteristic. E.g., sex, religion, blood group.

QUANTITATIVE DATA
It is also called measurement data. It is obtained by measuring the characteristic in each individual. E.g., Ht, Wt, BP, Hb.

DISCRETE & CONTINUOUS
Discrete :- Here we always get a whole number. E.g., no. of people dying in road accidents, no. of vials of polio vaccine.
Continuous :- Here there is a possibility of getting fractions like 1.2, 2.1, 3.81, i.e., the variable can take all possible values in a certain range. E.g., Ht, Wt, temperature.

PRIMARY AND SECONDARY
Primary :- Data collected originally by the investigator, for the first time & directly from the individuals, are called primary data. They give precise information. E.g., a survey by the investigator to find the no. of alcoholic persons in the Karvenagar area.
Secondary :- When data collected by some other person are used, the data are called secondary data. E.g., census, hospital records.

UNGROUPED AND GROUPED
Ungrouped :- When the data are presented in raw form, they are called ungrouped data. E.g., marks of 5 students: 20, 30, 25, 20, 30.
Grouped :- When the ungrouped data are arranged according to groups, they are called grouped data. E.g.,
Marks    Students
20       2
30       2
25       1

METHODS OF DATA COLLECTION
Observation :- Visual, instrument
Instrument properties :- Reliability, validity
Interviews & self-administered questionnaires
Use of documentary sources (secondary data)

CLASSIFICATION OF DATA
Definition :- The process of arranging data into groups or classes according to similar characteristics is called classification, & the groups so formed are called classes or class intervals.

OBJECTIVES OF CLASSIFICATION OF DATA
1. It condenses the data.
2. It omits unnecessary information.
3. It reveals the important features of the data.
4. It facilitates comparison with other data.
5. It enables further analysis, like computation of averages & dispersion (variability) of the data.

FREQUENCY
A) Frequency
Definition :- The no. of times a value of the variable is repeated is called its frequency.
B) Cumulative class frequency
Definition :- Cumulative frequency is formed by adding the frequency of each class to the total frequency of the previous classes. It indicates the no. of observations below the upper limit of the class.

REPRESENTATIVE SYMBOLS
Representative value            Sample    Population
1. Mean                         x̄         μ
2. SD                           s         σ
3. Variance                     s²        σ²
4. Proportion                   p         P
5. Complement of proportion     q         Q

DATA PRESENTATION
Meena Ganapathy

METHODS OF PRESENTATION OF DATA
Tabulation.
Charts and diagrams.
[Diagram: layout of a table, showing the stub heading & stub (rows), the caption heading & subheadings (columns), the body of the table & the row & column totals]

IMPORTANT POINTS IN MAKING A TABLE
Table no. :- Needed if many tables are present.
Title :- Should be short.
Head note :- Whatever is not covered in the title can be written in the head note, e.g., the units.
Caption :- Column heading, according to the characteristics.
Stub :- Row heading / subheading.
Body :- Content.
Foot note :- Short forms.
Source note :- The source of the data. It is important because it shows the reliability of the table.

RULES AND GUIDELINES FOR TABULAR PRESENTATION
1. The table must be numbered.
2. A brief & self-explanatory title must be given to each table.
3. The headings of columns & rows must be clear, sufficient, concise & fully defined.
4. The data must be presented according to size or importance, chronologically, alphabetically or geographically.
5. The table should not be too large.
6. Foot notes should be given whenever necessary, providing additional information, sources or explanatory notes.

TYPES OF TABLE
1. One way table / simple table
2. Two way table
3. Complex table

1. ONE WAY TABLE / SIMPLE TABLE
When only one characteristic is described in a table, it is called a simple table.

EXAMPLE OF ONE WAY TABLE
Class interval    Tally marks    Frequency
3-4               IIII           5
5-6               II             2
7-8               IIII           5
9-10              III            3

TWO WAY TABLE
In this table the data are classified according to two characteristics. It gives information about two interrelated characteristics.

Frequency distribution table for qualitative data: distribution of types of anemia according to sex
Sex       Types of anemia             Total
Boys      160      85       15        260
Girls     190      120      45        355
Total     350      205      60        615

COMPLEX TABLE
When information is collected regarding 3 or 4 characteristics & tabulated according to these characteristics, the table is called a complex table.

EXAMPLE OF COMPLEX TABLE
Fasting blood    Male                      Female                    Total
glucose          51-60 yrs    61-70 yrs    51-60 yrs    61-70 yrs
120-129          4            4            2            2            12
130-139          1            3            3            1            8
140-149          2            4            1            3            10
150-159          2            3            3            2            10
160-169          4            5            3            3            15
170-179          5            4            5            4            18
180-189          1            2            1            1            5
Total            19           25           18           16           78

ADVANTAGES OF GRAPHS & DIAGRAMS
1. Information is presented in condensed form.
2. Facts are presented in a more effective & impressive manner as compared to tables.
3. Easy to understand for a layman.
4. Create an effect which lasts for a longer time.
5. Facilitate comparison.
6. Help in revealing patterns.
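
WORKED EXAMPLE: FREQUENCY & CUMULATIVE FREQUENCY
The definitions of frequency & cumulative frequency given earlier can be illustrated with a short Python sketch. It uses the marks of 5 students from the ungrouped data example (20, 30, 25, 20, 30). The function name frequency_table is only illustrative & is not part of the original notes.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(values):
    """Return (value, frequency, cumulative frequency) rows, sorted by value."""
    counts = Counter(values)               # frequency: how often each value occurs
    ordered = sorted(counts)               # distinct values in ascending order
    freqs = [counts[v] for v in ordered]
    cum_freqs = list(accumulate(freqs))    # running total of the frequencies
    return list(zip(ordered, freqs, cum_freqs))


marks = [20, 30, 25, 20, 30]  # marks of 5 students (example data from the notes)
table = frequency_table(marks)

print(f"{'Marks':>5} {'Frequency':>9} {'Cumulative':>10}")
for value, freq, cum in table:
    print(f"{value:>5} {freq:>9} {cum:>10}")
```

The last cumulative frequency equals the total no. of observations, which is a quick check that no value has been lost while grouping.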

DISADVANTAGES
Give approximate results rather than exact values.
Give only a general idea.
Not sufficient for statistical analysis.

TYPES OF DIAGRAMS FOR QUALITATIVE DATA
Bar :- Simple, multiple or complex, component & proportional
Pie or sector
Pictograms
Shaded map / contour / spot maps

BAR DIAGRAMS
It is used to compare variables possessed by one or more groups.

SIMPLE BAR DIAGRAM
Here only one variable is presented.
Bars are at a uniform distance from one another.
It can be drawn vertically or horizontally.
Each diagram should have a title & a source note.
[Figure: simple bar diagram "No. of dependents at home"; X axis: no. of dependents (None, 1, 2, 3, 4, 5 and above); Y axis: no. of subjects]

PIE OR SECTOR DIAGRAMS
When the data are presented as the sum of different components of one qualitative characteristic, we use pie diagrams.
[Figure: pie diagram "Patients' age distribution in percentage"; age groups 19-29, 30-39, 40-49 & 50-59; sector labels 21%, 34%, 19% & 26%]

PICTOGRAMS
These diagrams are useful for lay people. E.g., a village map indicating the temple, trees, etc.

SPOT MAPS
In this diagram, the location of each case of an illness, death, etc. is marked on a map of the area with a spot, dot or any other symbol.

TYPES OF DIAGRAMS FOR QUANTITATIVE DATA
Histograms
Frequency polygon
Frequency curve
Cumulative frequency curve
Line graph
Scatter diagram
Population pyramid

HISTOGRAMS
It is the graphical representation of a frequency distribution.
It is a series of adjacent rectangles (bars) erected on the class intervals.
The areas of these bars denote the frequencies of the respective class intervals.
X axis :- The base of each bar shows the width of the class interval.
Y axis :- Frequency / no. of observations.
[Figure: example chart with series East, West & North over the 1st to 4th quarters]

FREQUENCY POLYGON
It is a representation of categories of continuous & ordered data, similar to a histogram.
It can be drawn in two ways: using a histogram or without using a histogram.
Uses :- It is used when two or more sets of data are illustrated on the same diagram, such as temperature & pulse, birth & death rates, etc.
[Figure: frequency polygon with two series (Series1, Series2)]

SCATTER DIAGRAMS
It is prepared after tabulation in which the frequencies of two variables have been cross-classified.
It is a graphic representation of the correlation between two variables.
[Figure: scatter plot (Series1)]

LINE DIAGRAMS
It is used to show the trend of events with the passage of time, e.g., rising & falling.
[Figure: line graph with two series (Series1, Series2)]
[Figure: combined line & bar diagram (Series1, Series2)]

MEASURES OF CENTRAL TENDENCY
Mode :- The value that occurs most frequently.
Median :- The point below & above which 50% of cases fall.
Mean :- The mathematical average (sum of scores divided by the total no. of scores).
The level of measurement plays a role in which measure of central tendency you use:
Mean :- Interval & ratio
Mode :- Nominal
Median :- Ordinal

VARIABILITY / DISPERSION
Extent to which scores deviate from each other.
A group may be homogeneous (similar scores) or heterogeneous (widely spread scores).
Range :- Highest score minus lowest score, i.e., the distance between the high & low scores.
Standard deviation (SD) :- Based on the difference between each individual score & the mean. E.g., weight of person A = 150 lbs, mean = 140 lbs, deviation = +10.
SD is roughly the average deviation from the mean.
Formula (for a sample of n scores) :- s = √( ∑(x - x̄)² / (n - 1) )

BIVARIATE STATISTICS
Associations between 2 variables
Correlations

INFERENTIAL STATISTICS
Hypothesis testing
Null hypothesis (H0) :- No actual relationship between the variables. E.g., there will be no difference in grant writing ability between nurses who attend & nurses who do not attend the research short course.
Decision :- Accept the null H0 or reject the null H0.
Type I error :- Rejecting the null when it is actually true.
Type II error :- Accepting the null when it is actually false.
Level of significance :- The probability of committing a Type I error. It is set by the researcher, usually at .05.
Lowering the risk of a Type I error increases the risk of a Type II error.

PARAMETRIC TESTS
Involve estimation of at least one parameter
Interval level / ratio scale data
Assume variables are normally distributed

NONPARAMETRIC TESTS
Nominal or ordinal level data
Fewer restrictions about distributions

Between subjects testing :- E.g., men versus women
Within subjects testing :- Same group compared pre & post-intervention

DIFFERENCES BETWEEN 2 GROUP MEANS
Parametric :- t-test for independent groups, paired t-test
Nonparametric :- Mann-Whitney U, Wilcoxon signed rank test

DIFFERENCES BETWEEN 3 OR MORE GROUP MEANS
Parametric
One-way analysis of variance (ANOVA) :- F ratio test, followed by post-hoc tests to see which groups differ from each other (LSD, Bonferroni)
Multifactor (factorial) ANOVA :- 2 or more IVs. Usually for more complex analyses, e.g., human behavior, feelings.
Repeated measures ANOVA :- 3 or more measures of the same DV for each subject. E.g., subjects exposed to 3 or more different treatment conditions, or 3 or more data collection points of the DV over time (longitudinal).
Nonparametric "analysis of variance" :- Kruskal-Wallis test

TESTING DIFFERENCES IN PROPORTIONS
DV is nominal level
Chi-square test

RELATIONSHIPS BETWEEN 2 VARIABLES
Pearson's r (interval level)
Spearman's rho or Kendall's tau (ordinal)

POWER ANALYSIS
The probability of obtaining a significant result is called the power of a statistical test.
Insufficient power :- Greater risk of a Type II error
4 components
Significance level :- The more stringent it is, the lower the power
Sample size :- As it increases, power increases
Population effect size (gamma, γ) :- How strong the effect of the IV is on the DV
Power (1 - β) :- Probability of rejecting the null H0 when it is false

MULTIVARIATE STATISTICS
Linear regression (simple or multiple) :- Makes predictions about phenomena. R = correlation. R² = proportion of variance in Y accounted for by the combined Xs.
Analysis of covariance (ANCOVA) :- Tests the significance of differences between group means after adjusting scores on the DV to eliminate the effects of covariate(s). E.g., anxiety pre & post biofeedback therapy: one hospital = treatment, one hospital = control; post anxiety = DV; hospital condition = IV; pre anxiety scores = covariate.
Discriminant analysis :- Predicts group membership
E.g., nurses who graduate versus those who drop out; cancer patients who adhere to treatment versus those who don't

Logistic regression
Binomial logistic regression :- DV is categorical (2 groups). Gives the odds of belonging to one group.
Multinomial logistic regression :- DV is categorical (more than 2 groups). Gives the odds of belonging to one group.
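
WORKED EXAMPLE: CHI-SQUARE TEST
The chi-square test for differences in proportions, & the short forms O (observed), E (expected) & D.F., can be illustrated with a minimal Python sketch. It uses the two-way table of types of anemia according to sex. Two things are assumptions here: the boys' middle cell is taken as 85, the value implied by the printed row & column totals, & the function name chi_square is illustrative only. The critical value 5.991 is the standard tabulated chi-square value for D.F. = 2 at the .05 level.

```python
# Observed counts (O) from the two-way anemia table. The notes do not name
# the three anemia types, so the columns are left unnamed here.
observed = [
    [160, 85, 15],    # boys (85 inferred from the printed totals: an assumption)
    [190, 120, 45],   # girls
]


def chi_square(table):
    """Return the chi-square statistic and degrees of freedom of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected count (E) under the null hypothesis of no association
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


statistic, df = chi_square(observed)

CRITICAL_5PCT_DF2 = 5.991  # tabulated critical value for D.F. = 2, level of significance .05
reject_null = statistic > CRITICAL_5PCT_DF2

print(f"chi-square = {statistic:.2f}, D.F. = {df}, reject H0 at .05: {reject_null}")
```

With these counts the statistic is about 9.09 on 2 degrees of freedom, which exceeds 5.991, so the null hypothesis of no association between sex & type of anemia would be rejected at the .05 level.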