PRESENTATION OF DATA
• TEXT FORM
• TABULATION
• DRAWINGS

TABULAR PRESENTATION
• A table is a systematic arrangement of data into vertical columns and horizontal rows.
• The process of arranging data into rows and columns is called tabulation.

TABULATION
• Simple table
• Complex table
– Principles
  • Tables should be numbered.
  • Each table has a title that is brief and self-explanatory.
  • Headings of columns and rows should be clear.
  • Data must be presented according to size, importance, chronologically, alphabetically or geographically.
  • The table should not be too large.
  • Footnotes may be given.

STATISTICAL TABLE
• THE TITLE
• THE STUB
• THE BOX-HEAD
• THE BODY

SIMPLE TABLE
Table 1. Population of Pakistan

Year    Population (millions)
1901    16.6
1911    19.4
1921    21.1
1931    23.6
1941    28.3

COMPOUND TABLE
Table III. Sex wise fatality rate of untreated patients

Attribute    Men    Women    Total
Attacked     40     30       70
Deaths       12     8        20
%age died    30     26.7     28.6

COMPOUND TABLE
Table II. Colour choices of medical students of shirts

Sex       White    Blue    Yellow    Green    Pink    Total
Male      60       125     20        10       75      290
Female    55       45      0         25       5       130
Total     115      170     20        35       80      420

COMPOUND TABLE
Table II. Colour choices of medical students about shirt

Sex      White    Blue    Green    Pink    Yellow    Total
Boys     10       60      55       45      22        192
Girls    55       45      25       5       0         130
Total    65       105     80       50      22        322

TABLE I. Population of Punjab & Baluchistan (thousands)

          Punjab                      Baluchistan
Census    Male     Female    Total    Male    Female    Total
1961      13643    11938     25581    640     521       1161
1971      19942    17566     37508    1272    1133      2405

DATA
• Arrangement of data is based on
  • Classification
  • Purpose of table
• Data may be arranged
  • Alphabetically
  • Geographically
  • According to magnitude
  • Historically
  • By customary classes
  • By progressive arrangement.

FREQUENCY DISTRIBUTION
• Is a tabular arrangement of data in which various items are arranged into classes and the number of items falling in each class (class frequency) is stated.
• Grouped data
• Class limits
• Class interval

FREQUENCY DISTRIBUTION
• Data is split into groups called class intervals.
• The number of items (frequency) is written in the adjacent column.

FREQUENCY DISTRIBUTION TABLE
TABLE II. Age distribution of patients on Monday

Age      Number of patients
0-4      23
5-9      21
10-14    43
15-19    10
20-24    6

FREQUENCY DISTRIBUTION TABLE
TABLE II. Weight of medical students of SZMC

Weight (kg)     Number of students
35-39           42
40-44           35
45-49           83
50-54           70
55-59           36
60 and above    28

DESCRIPTIVE STATISTICS
• Descriptive statistics comprises those methods concerned with collecting and describing a set of data so as to yield meaningful information.

STATISTICAL INFERENCE
• Statistical inference comprises those methods concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data.

ANALYSIS OF DATA
• When characteristic and frequency are both variable
• Calculations are:
  • Averages
  • Percentiles
  • Standard deviation
  • Standard error
  • Correlation and
  • Regression coefficients.

NORMAL
• Normal is not the mean or a central value but the accepted range of variation on either side of the mean or average.
– Normal BP is not the mean but a range between 100 and 140 (mean 120 ± 20).
• Values even higher or lower than this range can still occur.

MEASURE OF CENTRAL LOCATION / TENDENCY
• Any measure indicating the centre of a set of data or observations, arranged in an increasing or decreasing order of magnitude.
• A single value which represents all the values of the distribution in a definite way.
• Most commonly used measures of central location are
– Mean
– Median
– Mode

MEASURE OF CENTRAL TENDENCY "AVERAGE"
• What is the average or central value?
• How are the values dispersed around this value?
• What is the degree of scatter?
• Is the distribution normal (shape of distribution)?

AVERAGE
• Average value of a characteristic is the one central value around which all other observations are dispersed.
• 50% of observations lie above and
• 50% of values lie below the central value.
• Most of the normal observations lie close to the central value.
• A few of the too large or too small values lie far away at the ends.
• It helps to find which group is better off by comparing the average of one group with that of another.

MEAN
• Most commonly used average.
• It is the value obtained by dividing the sum of the values by their number, i.e. summing up all observations and dividing the total by the number of observations.

MEAN
• It implies the arithmetic average or arithmetic mean, which is obtained by summing up all the observations and dividing by the total number of observations, e.g.
• ESRs of 7 patients are 7, 5, 4, 6, 4, 5, 9.
• Mean = (7+5+4+6+4+5+9)/7 = 40/7 = 5.71

MEDIAN
• When all observations are arranged in either ascending or descending order, the middle observation is called the median, i.e. the mid value of the series.
• Median is a better indicator of central value when one or more of the lowest or highest observations are wide apart or not so evenly distributed.

MEDIAN
• Observations: 83, 75, 81, 79, 71, 95, 75, 77, 84, 79, 75, 71, 73, 91, 93.
• In ascending order: 71, 71, 73, 75, 75, 75, 77, 79, 79, 81, 83, 84, 91, 93, 95.
• Median = 79 (the 8th of the 15 ordered observations)

MODE
• Most frequently occurring value or observation in a series, i.e. the most common or most fashionable value.
• 85, 75, 81, 79, 71, 95, 75, 77, 75, 90, 71, 75, 79, 95, 75, 77, 84, 75, 81, 75.
• Mode = 75.

NORMAL DISTRIBUTION
• Normal curve
• Smooth, bell shaped, bilaterally symmetrical curve
• Total area = 1
• Mean, median and mode coincide (are equal).
• In the standard normal curve, mean = 0 and standard deviation = 1.
• Area between X̄ ± 1 SD = 68.3%
• X̄ ± 2 SD = 95.5%
• X̄ ± 3 SD = 99.7%

[Figure: NORMAL DISTRIBUTION. Histogram "Admitted patients in SZH"; x-axis: age group (0-9 to 60-69), y-axis: no. of patients]
[Figure: POSITIVELY SKEWED. Histogram "Age wise patients visiting SZH"; x-axis: age group (0-9 to 60-69), y-axis: no. of patients]
[Figure: NEGATIVELY SKEWED. Histogram "Age wise patients visiting SZH"; x-axis: age group (0-9 to 60-69), y-axis: no. of patients]
[Figure: Normal distribution histogram of weights of students; x-axis: weights, y-axis: no. of students]
[Figure: curves labelled NORMAL DISTRIBUTION, POSITIVELY SKEWED, NEGATIVELY SKEWED]

VARIABILITY
• Biological data are variable.
• Two measurements in man are variable.
• Cure rates are not equal but variable.
• Heights of students in the same age group are not the same but variable.
• Heights of students in one area are not the same as in another place but variable.
• Variability is essentially a normal character.
• It is a biological phenomenon.

TYPES OF VARIABILITY
• Biological variability
• That which occurs within certain accepted biological limits. It occurs by chance.
– Individual variability
– Periodical variability
– Class, group or category variability
– Sampling variability or sampling error

REAL VARIABILITY
– When the difference between two readings, observations, or values of classes or samples is more than the defined limits in the universe, it is said to be real variability. The cause is external factors, e.g. a significant difference in cure rates may be due to a better drug and not due to chance.

EXPERIMENTAL VARIABILITY
• Errors or differences due to materials, methods or procedures employed in the study, or defects in the techniques involved in the experiment.
– Observer error
– Instrumental error
– Sampling error.

MEASURES OF VARIABILITY
• How individual observations are dispersed around the mean of a large series.
• Measures of variability of individual observations.
– Range
– Mean deviation
– Standard deviation
– Coefficient of variation
• Measures of variability of samples
– Standard error of mean
– Standard error of difference between two means
– Standard error of proportion
– Standard error of difference between two proportions
– Standard error of correlation coefficient
– Standard error of regression coefficient.

RANGE
• It is the difference between the highest and lowest values or figures in a given sample.
• Example: 83, 75, 81, 79, 71, 90, 75, 95, 77, 94.
• Range = 71 to 95, i.e. 95 − 71 = 24.

RANGE
• Range defines the normal limits of a biological characteristic.
• It is the simplest measure of dispersion.
• Usually employed as a measure of variability in medical practice.
• It indicates the distance between the lowest and highest values.
• It ignores all observations except the two extreme values on which it is based.
• Normal range covers observations falling within 95% confidence limits.

MEAN DEVIATION
• It is the average of the deviations from the arithmetic mean, taken without regard to sign.
• $\text{M.D.} = \frac{\sum |X - \bar{X}|}{n}$
• Example: 83, 75, 81, 79, 71, 95, 75, 77, 84, 90.

MEAN DEVIATION

D BP (X)    Mean (X̄)    Deviation from mean (X − X̄)
83          81           2
75          81           −6
81          81           0
79          81           −2
71          81           −10
95          81           14
75          81           −6
77          81           −4
84          81           3
90          81           9
Total 810                56 (signs ignored)

M.D. = 56/10 = 5.6

STANDARD DEVIATION
• Most frequently used measure of deviation
• "Root-mean-square deviation"

SD

Observation    No. of students
142.5          3
145            8
147.5          15
150            45
152.5          90
155            155
157.5          194
160 (M)        195
162.5          136
165            93
167.5          42
170            16
172.5          6
175            2

M = 160, SD = 5

[Figure: SD. Histogram "Weights of students (Kg)"; x-axis: weight, y-axis: no. of students]

NORMAL DISTRIBUTION
• Range, mean ± 1 SD = 160 ± 5 = 155 to 165 cm
– 68.27% of the observations
• Range, mean ± 2 SD = 160 ± 2 × 5 = 150 to 170 cm
– 95.45% of the observations
• Range, mean ± 3 SD = 160 ± 3 × 5 = 145 to 175 cm
– 99.73% of the observations
• 3 observations < −3 SD and 2 observations > +3 SD (5 of the 1,000 observations, i.e. 0.5%) fall outside this range.
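The measures of variability described above can be checked with a short script. This is only a minimal sketch: the function names are illustrative and not part of the lecture, the data are the ten BP readings from the mean deviation table, and the standard deviation divides by n, following the literal "root-mean-square deviation" wording (many texts divide by n − 1 for a sample). The coverage of mean ± k SD is computed from the normal curve with the standard library's `math.erf`.

```python
import math

# The ten BP readings from the mean deviation table.
bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]


def value_range(xs):
    """Difference between the highest and lowest values."""
    return max(xs) - min(xs)


def mean_deviation(xs):
    """Average of the absolute deviations from the arithmetic mean."""
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def standard_deviation(xs):
    """Root-mean-square deviation from the mean.

    Assumption: divides by n, as the phrase "root-mean-square" suggests;
    dividing by n - 1 is the usual choice for a sample.
    """
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def normal_coverage(k):
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))


print("Range:", value_range(bp))             # 24
print("Mean deviation:", mean_deviation(bp))  # 5.6
print("SD:", round(standard_deviation(bp), 2))

# Ranges for the series with M = 160 and SD = 5.
mean, sd = 160, 5
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    print(f"mean ± {k} SD = {low} to {high}: "
          f"{normal_coverage(k) * 100:.2f}% of observations")
```

The loop reproduces the three ranges (155 to 165, 150 to 170, 145 to 175) together with their theoretical coverage of 68.27%, 95.45% and 99.73%.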
RELATIVE VARIATE (Z)
• Deviation from the mean in a normal distribution or curve is called the relative or standard normal deviate.
• It is measured in terms of SD and tells us how much higher or lower an observation is than the mean, in units of SD.
• $Z = \frac{\text{observation} - \text{mean}}{\text{SD}} = \frac{X - \bar{X}}{\text{SD}}$

RANGE
• Easy to understand
• Easy to calculate
• Useful as a rough measure of variation
• Value may be greatly changed by an extreme value
• Highly unstable measure of variation.

MEAN DEVIATION
• Simple to understand and interpret.
• Affected by the value of every observation
• Less affected by absolute variation
• Not suited for any mathematical treatment.

SD
• Affected by the value of every observation
• It avoids algebraic fallacy
• Less affected by fluctuations of sampling than other measures of dispersion
• Has a definite mathematical meaning
• Has great practical utility in sampling and statistical inference.

QUESTION
• Average weight of a baby at birth is 3.05 kg with an SD of 0.39 kg. Assuming a normal distribution, would you regard
  a) a weight of 4 kg as abnormal?
  b) a weight of 2.5 kg as normal?

PERCENTAGE
• Is the number of units with a certain characteristic divided by the total number of units, multiplied by 100.

PROPORTION
• It is a numerical expression that compares one part of the study units to the whole.

RATIO
• It is a numerical expression which indicates the relationship in quantity, amount or size between two or more parts.

SAMPLING
• Not possible to include each and every member
• Not possible to examine all people of a country
• Not possible to test the efficacy of a drug on all patients
• Cooking of rice
• Collection is costly and time consuming
• Blood test

POPULATION
• Population
• Sample
• Parameter: a value calculated from a population
– Mean (μ)
– Standard deviation (σ)
• Sample
– Mean (X̄)
– Standard deviation (s)

SAMPLING
• Sample is a part of the population
• Estimation of population parameters
• To test the hypothesis about the population from which the sample was drawn.
• Inferences are applied to the whole population, but generalizations are valid only if the sample is sufficiently large and representative of the population (unbiased).

SAMPLING
• Sampling units are a breakdown of the population into smaller parts which are distinct and non-overlapping, so that each member or element of the population belongs to one and only one sampling unit.
• When a list of all individuals, households, schools or industries is drawn up, it is called a sampling frame.

SAMPLE
• A representative sample is one from which we can draw valid inferences regarding the population parameters.
• It is representative of the population under study.
• It is large enough but not too large.
• The selected elements must be properly approached, included and interviewed.

CONFIDENCE INTERVAL
• It is the interval or range of values which most likely encompasses the true population value.
• It is the extent to which a particular sample value deviates from the population value.
• A range or an interval around the sample value
• This range or interval is called the confidence interval.
• Upper and lower limits are called confidence limits.

C.I.
• A random sample of 11 three-year-old children was taken; the sample mean was 16 kg and the standard deviation 2 kg. The standard error is 0.6 kg. Find the C.I.

STANDARD ERROR
• Standard error is the standard deviation of the means of different samples of a population.
• Standard error of the mean
• S.E. is a measure which enables us to judge whether the mean of a given sample is within the set confidence limits of a population or not.
• $\text{S.E.} = \frac{\text{SD}}{\sqrt{n}}$

SAMPLING TECHNIQUES
• SIMPLE RANDOM SAMPLING
• SYSTEMATIC SAMPLING
• STRATIFIED SAMPLING
• MULTISTAGE SAMPLING
• CLUSTER SAMPLING
• MULTIPHASE SAMPLING
• CONVENIENCE SAMPLING
• QUOTA SAMPLING
• SNOWBALL SAMPLING

SAMPLE SIZE
• $L = \frac{2\sigma}{\sqrt{n}}$, so $\sqrt{n} = \frac{2\sigma}{L}$ and $n = \frac{4\sigma^2}{L^2}$
Example:
1. Mean pulse rate = 70, population standard deviation (σ) = 8 beats. Calculate the sample size.
2. Mean SBP = 120, SD = 10. Calculate n.

SAMPLE SIZE
• Qualitative data
• $n = \frac{4pq}{L^2}$
e.g.
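The standard error, confidence interval and quantitative sample-size formulas above can be sketched in a few lines. This is an illustration under stated assumptions, not the lecture's own worked solution: the function names are hypothetical, L is read as the allowable error, the multiplier 2 is the same rounded factor that appears in L = 2σ/√n (approximately 95% confidence), and the values of L passed to `sample_size` are made up because the exercises do not state one.

```python
import math


def standard_error(sd, n):
    """S.E. = SD / sqrt(n)."""
    return sd / math.sqrt(n)


def confidence_interval(mean, se, multiplier=2.0):
    """Interval mean ± multiplier × S.E.

    Assumption: multiplier 2 (about 95% confidence), matching the
    factor used in L = 2σ/√n.
    """
    return mean - multiplier * se, mean + multiplier * se


def sample_size(sigma, allowable_error):
    """n = 4σ² / L², rounded up to a whole number of subjects."""
    return math.ceil(4 * sigma ** 2 / allowable_error ** 2)


# C.I. exercise: 11 children, sample mean 16 kg, SD 2 kg.
se = standard_error(2, 11)
low, high = confidence_interval(16, se)
print(f"S.E. = {se:.2f} kg, C.I. = {low:.1f} to {high:.1f} kg")

# Sample size for σ = 8 beats; L = 2 beats is a hypothetical
# allowable error chosen only to make the formula concrete.
print("n =", sample_size(8, 2))
```

With the C.I. exercise figures this gives an S.E. of about 0.6 kg and an interval of roughly 14.8 to 17.2 kg. Because n grows with the inverse square of L, halving the allowable error quadruples the required sample.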