STATISTICS
Meena Ganapathy
MEANING
Statistics
Latin: status
Italian: statistica
German: Statistik
French: statistique
Statistic (singular): one value, e.g., the wt of one person
Statistics (plural): several values, e.g., the wts of many persons
Statistics as a singular noun is a branch of science: it is the
combination of logic & mathematics.
DIFF. BRANCHES OF
STATISTICS
 1) Medical Statistics
 2) Health statistics
 3) Vital statistics
 4) Biostatistics
STATISTICS
 It is the branch of science which deals with the techniques of
collection, compilation, presentation & analysis of data & the logical
interpretation of the results.
USE OF STATISTICS
 1. To collect the data in the best possible way.
 2. To describe the characteristics of a group or a situation.
 3. To analyze data & to draw conclusions from such analysis.
DEFINITION
 Variable :- A characteristic that takes different values in different
persons, places or things.
 E.g. Ht, Wt, B.P., Age
 It is denoted by capital X
 E.g., X: ht
 x1, x2, x3, x4 ... xn
 n = total number of observations
ATTRIBUTE
 A qualitative characteristic like sex or nationality is called an
attribute
CONSTANT
 The characteristic which does not change its value or nature is
considered as constant
 E.g. blood group, sex
OBSERVATION
 An event or its measurement, e.g., B.P. is the event & 120/80
mm of Hg is its measurement
OBSERVATION UNIT
 The source that gives the observation, such as an object, a person etc.
DATA
 A set of values recorded on one or more observational units is
called data. It gives numerical observations about the observational
units.
 e.g., Ht, Wt, Age.
 = equal to
 < less than
 > greater than
 ≤ less than or equal to
 ≥ greater than or equal to
 ≠ not equal to
 ∑ summation
Short forms
A.M.- arithmetic mean
H.M.- harmonic mean
G.M.- Geometric mean
C.V.- Coefficient of variation
S.E.- Standard error
S.D.- Standard deviation
D.F.- Degrees of freedom
C.I.- Confidence interval
 E :- Expected value of a cell of a contingency table
 O :- Observed value of a cell of a contingency table
 N :- Population size
 n :- Sample size
 α :- Level of significance (L.O.S.)
 Ho :- Null hypothesis
 H1 :- Alternative hypothesis
TYPES OF DATA
 Qualitative and quantitative
 Discrete and continuous
 Primary and Secondary
 Grouped and ungrouped
QUALITATIVE &
QUANTITATIVE DATA
 Qualitative data :- It is also called enumeration data. It represents a
particular quality or attribute; there is no notion of measurement. It
can be classified by counting the individuals having the same
characteristic.
 E.g. Sex, religion, blood group
QUANTITATIVE DATA
 It is also called measurement data. It is obtained by measuring the
characteristic, so each observation is a numerical value of the variable.
 E.g. Ht, Wt, BP, HB
DISCRETE & CONTINUOUS
 Discrete :- Here we always get a whole number.
 E.g. no. of people dying in road accidents, no. of vials of polio
vaccine.
 Continuous :- Here fractional values like 1.2, 2.1, 3.81 are possible,
i.e., the variable can take all possible values in a certain range.
 E.g., Ht, Wt, temp
PRIMARY AND SECONDARY
Primary :- Data obtained directly from an individual
give precise information, i.e., data collected
originally by the investigator for the first time are called
primary data.
E.g. the no. of alcoholic persons in the Karvenagar
area, found by the investigator.
Secondary :- When data collected by somebody
else are used, they are called secondary data.
E.g. Census, hospital records
UNGROUPED AND
GROUPED
 Ungrouped :- When the data is presented in raw form, it is
called ungrouped data
 E.g. Marks of 5 students
 20, 30, 25, 20, 30
 Grouped :- When the ungrouped data is arranged according
to groups, it is called grouped data.
 E.g.
Marks    Students
20       2
30       2
25       1
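The step from ungrouped to grouped data can be sketched in a few lines of Python. This is only an illustration using the marks from the example above; the variable names are assumptions made for the sketch.

```python
from collections import Counter

# Ungrouped data: marks of the 5 students from the example above.
marks = [20, 30, 25, 20, 30]

# Grouped data: count how many students obtained each mark.
grouped = Counter(marks)

print("Marks    Students")
for mark, students in grouped.items():
    print(f"{mark:<8} {students}")
```

Counter keeps the marks in the order they first appear, so the printed rows follow the order of the raw list.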
METHODS OF DATA
COLLECTION
 Observation
   Visual
   Instrument
 Instrument properties
   Reliability
   Validity
 Interviews & self-administered questionnaires
 Use of documentary sources (secondary data)
CLASSIFICATION OF DATA
 Definition :- The process of arranging data into groups or classes
according to similar characteristics is called classification, & the
groups so formed are called classes or class intervals.
OBJECTIVES OF
CLASSIFICATION OF DATA
 1. It condenses the data.
 2. It omits unnecessary information.
 3. It reveals the important features of the data.
 4. It facilitates comparison with other data.
 5. It enables further analysis, like computation of averages & dispersion
(variability) of the data.
FREQUENCY
 A) Frequency
 Definition :- The no. of times a variable value is repeated is called its
frequency.
 B) Cumulative class frequency
 Definition :- Cumulative frequency is formed by adding the frequency of
each class to the total frequency of the previous classes. It indicates the no.
of observations less than (<) the upper limit of the class.
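A short sketch of how cumulative frequencies are built up class by class. The class intervals and frequencies are borrowed from the one way table example in the data presentation part of these slides; the variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Class intervals and their frequencies (one way table example).
class_intervals = ["3-4", "5-6", "7-8", "9-10"]
frequencies = [5, 2, 5, 3]

# Each cumulative frequency = frequency of the class + total of all previous classes.
cumulative = list(accumulate(frequencies))

for interval, freq, cum in zip(class_intervals, frequencies, cumulative):
    print(f"{interval:<6} frequency = {freq}   cumulative frequency = {cum}")
```

The last cumulative frequency always equals the total no. of observations.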
 Representatives & their symbols
Representative                 Sample    Population
1. Mean                        x̄         μ
2. SD                          s         σ
3. Variance                    s²        σ²
4. Proportion                  p         P
5. Complement of proportion    q         Q
DATA PRESENTATION
Meena Ganapathy
METHODS OF PRESENTATION
OF DATA
 Tabulation.
 Charts and diagrams.
[Figure: layout of a table, showing the stub heading, the caption (column) heading with its subheadings, the stub, the body of the table, and the row & column totals]
IMPORTANT POINTS IN
MAKING A TABLE
 Table No. :- Needed if many tables are present
 Title :- Should be brief
 Head note :- Whatever is not covered in the title can be written in the head
note.
 E.g. expressing units
 Caption :- Column heading
 According to characteristics
 Stub :- Row heading
 Subheading
 Body :- Content
 Foot note :- Short forms or explanatory notes
 Source note :- The source of the data; it is important because it shows the
reliability of the table.
RULES AND GUIDELINES FOR
TABULAR PRESENTATION
 1. Tables must be numbered.
 2. A brief & self-explanatory title must be given to each table.
 3. The headings of columns & rows must be clear, sufficient,
concise & fully defined.
 4. The data must be presented according to size or importance,
chronologically, alphabetically or geographically.
 5. A table should not be large.
 6. Foot notes should be given whenever necessary, providing
additional information, sources or explanatory notes.
TYPES OF TABLE
 1.One way table/simple table
 2.Two way table
 3.Complex table
1.ONE WAY TABLE/ SIMPLE
TABLE
 When only one characteristic is described in a table, it
is called a simple table
EXAMPLE OF ONE WAY TABLE
Class interval    Tally mark    Frequency
3-4               IIII          5
5-6               II            2
7-8               IIII          5
9-10              III           3
TWO WAY TABLE
 In this table, data is classified according to two characteristics; it
gives information about two interrelated characteristics.
Frequency distribution table (qualitative data): distribution of types of
anemia according to sex
Sex      Types of anemia           Total
Boys     160      85       15      260
Girls    190      120      45      355
Total    350      205      60      615
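The row & column totals of a two way table can be checked with a small sketch like the one below. The counts are those of the anemia table above; since the names of the three anemia types are not given on the slide, the columns are kept unnamed, and the variable names are assumptions.

```python
# Observed counts for the three (unnamed) types of anemia, by sex.
counts = {
    "Boys": [160, 85, 15],
    "Girls": [190, 120, 45],
}

# Row totals: one total per sex.
row_totals = {sex: sum(values) for sex, values in counts.items()}

# Column totals: one total per type of anemia.
column_totals = [sum(column) for column in zip(*counts.values())]

# Grand total: all subjects in the table.
grand_total = sum(row_totals.values())

print(row_totals, column_totals, grand_total)
```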
COMPLEX TABLE
 When information is collected regarding 3 or 4 characteristics & tabulated
according to these characteristics, the table is called a
complex table.
EXAMPLE OF COMPLEX TABLE
Fasting blood    Male                    Female                  Total
glucose          51-60 yrs   61-70 yrs   51-60 yrs   61-70 yrs
120-129          4           4           2           2           12
130-139          1           3           3           1           8
140-149          2           4           1           3           10
150-159          2           3           3           2           10
160-169          4           5           3           3           15
170-179          5           4           5           4           18
180-189          1           2           1           1           5
Total            19          25          18          16          78
ADVANTAGES OF GRAPHS &
DIAGRAMS
 1. Information is presented in condensed form.
 2. Facts are presented in a more effective & impressive manner as
compared to tables.
 3. Easy to understand for a layman.
 4. Create an effect which lasts for a longer time.
 5. Facilitate comparison.
 6. Help in revealing patterns.
DISADVANTAGES
 Approximate results instead of accuracy
 Gives only a general idea
 Not sufficient for statistical analysis
TYPES OF DIAGRAMS FOR
QUALITATIVE DATA
 Bar: Simple, Multiple or complex, Component & Proportional
 Pie or Sector
 Pictograms
 Shaded Map / Contour / Spot Maps
BAR DIAGRAMS
 It is used to compare variables possessed by one or more groups.
SIMPLE BAR DIAGRAM
 Here only one variable is presented
 Bars are at uniform distance from one another
 It can be drawn vertically or horizontally
 Each should have title & source note
[Figure: simple bar diagram, "No. of dependents at home". X axis: No. of dependents (None, 1, 2, 3, 4, 5 and above). Y axis: No. of subjects (0 to 120). Bar labels shown: 103, 97, 47, 34, 21, 17]
PIE OR SECTOR
DIAGRAMS
 When the data is presented as the sum of different components of
one qualitative characteristic, we use pie diagrams.
[Figure: pie diagram, "Patients age distribution in percentage". Age groups: 19-29, 30-39, 40-49, 50-59. Sector labels shown: 21%, 34%, 19%, 26%]
PICTOGRAMS
 These diagrams are useful for lay people. E.g., a village map
indicating temple, trees etc.
SPOT MAPS
 In this diagram, a map of an area is marked with the location of each case of an
illness, death etc., identified with spots, dots or any other
symbol.
TYPES OF DIAGRAMS FOR
QUANTITATIVE DATA
Histograms
Frequency polygon
Frequency curve
Cumulative frequency curve
Line graph
Scatter diagram
Population Pyramid
HISTOGRAMS
 It is the graphical representation
of a frequency distribution. It is a
series of adjacent rectangles (bars) erected
on the class intervals.
 The areas of these bars denote the
frequencies of the respective class
intervals.
 X axis: the base of each bar shows the
width of the class interval
 Y axis: frequency / no. of
observations
[Figure: sample chart with series East, West & North for 1st Qtr to 4th Qtr; Y axis from 0 to 90]
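Because the area of each bar denotes the frequency, classes of unequal width need bar heights equal to frequency divided by class width. The sketch below illustrates this; the class limits and frequencies are hypothetical values chosen only for the illustration.

```python
# Hypothetical classes with unequal widths: (lower limit, upper limit, frequency).
classes = [(0, 10, 20), (10, 20, 30), (20, 40, 30), (40, 80, 20)]

heights = []
for lower, upper, frequency in classes:
    width = upper - lower
    # Height such that area (width x height) equals the class frequency.
    height = frequency / width
    heights.append(height)
    print(f"{lower}-{upper}: width = {width}, frequency = {frequency}, height = {height}")
```

When all class intervals have the same width, the heights are simply proportional to the frequencies, so frequency can be plotted directly on the Y axis.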
FREQUENCY POLYGON
 It is a representation of categories of continuous & ordered data,
similar to a histogram. It can be drawn in two ways: using a histogram or
without using a histogram.
 Uses: it is used when two or more sets of data are illustrated on the same
diagram, such as temperature & pulse, birth & death rates etc.
[Figure: frequency polygon with two series (Series1, Series2) over points 1 to 7; Y axis from 0 to 350]
SCATTER DIAGRAMS
 It is prepared after tabulation in which the frequencies of two
variables have been cross-classified
 It is a graphic representation of the correlation between two variables
[Figure: scatter plot of Series1; X axis from 0 to 15, Y axis from 0 to 700]
LINE DIAGRAMS
 It is used to show the trends of events with the passage of time.
E.g., rising & falling
[Figure: line graph of Series1 & Series2 over points 1 to 7; Y axis from 0 to 700]
[Figure: combined line & bar chart of Series1 & Series2 over points 1 to 7; axes from 0 to 14 and from 0 to 700]
MEASURES OF CENTRAL
TENDENCY
 Mode - value that occurs most frequently
 Median - point below and above which 50% of cases fall
 Mean - mathematical average (sum of scores divided by the total no.
of scores)
 Level of measurement plays a role in which central tendency
measure you use
 Mean - interval & ratio
 Mode - nominal
 Median - ordinal
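The three measures can be computed with Python's standard statistics module. This sketch reuses the marks of 5 students from the ungrouped data example; the variable names are assumptions.

```python
import statistics

# Marks of 5 students (ungrouped data example).
marks = [20, 30, 25, 20, 30]

mean = statistics.mean(marks)        # sum of scores / no. of scores
median = statistics.median(marks)    # middle value of the sorted scores
modes = statistics.multimode(marks)  # every value that occurs most frequently

print("Mean:", mean)
print("Median:", median)
print("Mode(s):", modes)
```

Here 20 and 30 both occur twice, so the data has two modes; multimode returns all of them instead of picking one.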
VARIABILITY / CENTRAL
DISPERSION
 Extent to which scores deviate from each other
 Homogeneous
 Heterogeneous
 Range = highest score - lowest score
 Distance between high & low scores
Standard Deviation (SD)
 Deviation = difference between an individual score and the mean
 Weight of person A = 150 lbs
 Mean = 140 lbs
 Deviation = +10
 SD (average deviation from the mean)
Formula
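One standard form, assumed here rather than taken from the slide, is the sample SD: s = square root of [ ∑(x - x̄)² / (n - 1) ]. Dividing by N instead of n - 1 gives the population SD (σ). A minimal sketch with the marks of 5 students used earlier:

```python
import math
import statistics

# Marks of 5 students (ungrouped data example).
marks = [20, 30, 25, 20, 30]

mean = statistics.mean(marks)

# Deviation of each score from the mean, then the sum of squared deviations.
deviations = [x - mean for x in marks]
sum_of_squares = sum(d ** 2 for d in deviations)

# Assumed formulas: divide by n - 1 for a sample, by N for a population.
sample_sd = math.sqrt(sum_of_squares / (len(marks) - 1))
population_sd = math.sqrt(sum_of_squares / len(marks))

# Range = highest score - lowest score.
score_range = max(marks) - min(marks)

print("Deviations:", deviations)
print("Sample SD:", sample_sd)
print("Population SD:", population_sd)
print("Range:", score_range)
```

The same results are available directly as statistics.stdev(marks) and statistics.pstdev(marks).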
BIVARIATE STATISTICS
 Associations between 2 variables
 Correlations
INFERENTIAL STATISTICS
 Hypothesis testing
 Null hypothesis (Ho)
 No actual relationship between variables
 E.g., there will be no difference in grant writing ability between nurses
who attend and nurses who do not attend the research short course
 Accept the null Ho
 Reject the null Ho
 Type I Error
 Rejecting the null when it is actually true
 Type II Error
 Accepting the null when it is actually false
 Level of significance
 Probability of committing a Type I Error
 Set by the researcher
 Usually set at p = .05
 Lowering the risk of a Type I error increases the risk of a Type II error
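The meaning of the level of significance can be illustrated by simulation: when Ho is actually true, a test at the .05 level should wrongly reject Ho in about 5% of studies. The sketch below assumes a simple two-sided z-test with known SD = 1, which is an illustrative choice and not a test named on the slide; all names and settings are assumptions.

```python
import random
from statistics import NormalDist

ALPHA = 0.05        # level of significance set by the researcher
N_PER_SAMPLE = 30   # subjects per simulated study (hypothetical)
N_TRIALS = 4000     # number of simulated studies (hypothetical)

rng = random.Random(0)      # fixed seed so the run is repeatable
std_normal = NormalDist()   # standard normal distribution


def two_sided_p_value(sample):
    """p value of a z-test of Ho: population mean = 0, assuming known SD = 1."""
    sample_mean = sum(sample) / len(sample)
    z = sample_mean * len(sample) ** 0.5
    return 2 * (1 - std_normal.cdf(abs(z)))


# Ho is true in every simulated study, so each rejection is a Type I error.
rejections = 0
for _ in range(N_TRIALS):
    sample = [rng.gauss(0, 1) for _ in range(N_PER_SAMPLE)]
    if two_sided_p_value(sample) < ALPHA:
        rejections += 1

type_i_rate = rejections / N_TRIALS
print("Observed Type I error rate:", type_i_rate)
```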
PARAMETRIC TESTS
 Involve estimation of at least one parameter
 Interval level data / Ratio scale
 Assume variables are normally distributed
NONPARAMETRIC TESTS
 Nominal or ordinal level data
 Fewer restrictions about distributions
 Between subjects testing
 Men versus women
 Within subjects testing
 Same group compared pre and post-intervention
DIFFERENCES BETWEEN 2
GROUP MEANS
 Parametric
 t-tests for independent groups
 Paired t-tests
 Nonparametric
 Mann-Whitney U test
 Wilcoxon signed rank test
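A minimal sketch of the t statistic for two independent groups, assuming the pooled-variance (equal variances) form of the test. The scores are hypothetical and the function name is an assumption made for illustration.

```python
import math
import statistics

# Hypothetical scores of two independent groups.
group_a = [12, 15, 14, 10, 13]
group_b = [9, 11, 10, 8, 12]


def independent_t(a, b):
    """Return (t, degrees of freedom) for the pooled-variance independent t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_variance = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / df
    # Standard error of the difference between the two means.
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df


t_value, degrees_of_freedom = independent_t(group_a, group_b)
print("t =", round(t_value, 3), " D.F. =", degrees_of_freedom)
```

The computed t is then compared with the critical value of the t distribution for the given D.F. and level of significance.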
DIFFERENCES BETWEEN 3
OR MORE GROUP MEANS
 Parametric
 One-Way Analysis of Variance (ANOVA)
 F ratio test
 Post-hoc tests to see which groups differ from each other
 LSD; Bonferroni
 Multifactor ANOVA
 2 or more IVs
 Usually for more complex analyses
 E.g., human behavior, feelings
 Repeated Measures ANOVA
 3 or more measures of the same DV for each subject
 E.g., subjects exposed to 3 or more different treatment conditions
 3 or more data collection points of the DV over time (longitudinal)
 Nonparametric 'analysis of variance'
 Kruskal-Wallis test
TESTING DIFFERENCES
IN PROPORTIONS
 DV is nominal level
 Chi square test
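A sketch of the chi square calculation, using the observed counts (O) of the anemia by sex table from the two way table example. Expected counts (E) are row total x column total / grand total, and chi square is the sum of (O - E)² / E over all cells. The variable names are assumptions.

```python
# Observed counts (O): rows = Boys, Girls; columns = three types of anemia.
observed = [
    [160, 85, 15],
    [190, 120, 45],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected count (E) of each cell = row total x column total / grand total.
expected = [
    [row_total * column_total / grand_total for column_total in column_totals]
    for row_total in row_totals
]

# Chi square = sum over all cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)

# D.F. = (no. of rows - 1) x (no. of columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(column_totals) - 1)

CRITICAL_VALUE = 5.991  # chi square critical value for 2 D.F. at the .05 level
print("Chi square =", round(chi_square, 2), " D.F. =", degrees_of_freedom)
print("Reject Ho at .05:", chi_square > CRITICAL_VALUE)
```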
RELATIONSHIPS
BETWEEN 2 VARIABLES
 Pearson's r (interval level)
 Spearman's rho or Kendall's tau (ordinal)
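A minimal sketch of the two coefficients. Pearson's r uses the actual values; Spearman's rho is computed here as Pearson's r on the ranks, assuming there are no tied values. The data and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two lists of equal length."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank each value from 1 (smallest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho = Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: hours of study and test score of 5 students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 57, 70, 90]

r = pearson_r(hours, scores)
rho = spearman_rho(hours, scores)
print("Pearson's r:", round(r, 3))
print("Spearman's rho:", round(rho, 3))
```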
POWER ANALYSIS
 The probability of obtaining a significant result is called the power of
a statistical test
 Insufficient power - greater risk of Type II error
 4 components
 Significance level - the more stringent, the lower the power
 Sample size - as it increases, power increases
 Population effect size (gamma, γ) - how strong the effect of the IV is on
the DV
 Power (1 - β) - probability of rejecting a false null Ho
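How the components relate can be sketched for one simple case. The code below assumes a two-sided one-sample z-test with known SD, which is an illustrative choice and not a test named on the slide; the effect size here is the standardized difference (true mean - Ho mean) / SD, and the function name is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution


def power_two_sided_z(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known SD) for a given effect size."""
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(-z_critical + shift) + std_normal.cdf(-z_critical - shift)


# Larger sample size: power increases.
print("n = 30:", round(power_two_sided_z(0.5, 30), 3))
print("n = 60:", round(power_two_sided_z(0.5, 60), 3))

# More stringent significance level: power decreases.
print("alpha = .05:", round(power_two_sided_z(0.5, 30, alpha=0.05), 3))
print("alpha = .01:", round(power_two_sided_z(0.5, 30, alpha=0.01), 3))
```

With an effect size of 0 the function returns the significance level itself, i.e., the probability of a Type I error.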
MULTIVARIATE
STATISTICS
 Simple linear regression
 Make predictions about phenomena
 R - correlation
 R² - proportion of variance in Y accounted for by the combined Xs
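A minimal least-squares sketch of simple linear regression with one X. The data are hypothetical, and the function name is an assumption made for illustration.

```python
def simple_linear_regression(x, y):
    """Return (slope, intercept, r_squared) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: proportion of variance in Y accounted for by X.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r_squared = simple_linear_regression(x, y)
print("Y =", round(intercept, 2), "+", round(slope, 2), "* X")
print("R squared:", round(r_squared, 2))

# Prediction for a new value of X.
predicted = intercept + slope * 6
print("Predicted Y for X = 6:", round(predicted, 2))
```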
 Analysis of Covariance (ANCOVA)
 Tests the significance of differences between group
means after adjusting scores on the DV to eliminate the
effects of covariate(s)
 E.g., anxiety pre and post biofeedback therapy
 One hospital = treatment
 One hospital = control
 Post anxiety = DV; hospital condition = IV
 Pre anxiety scores = covariate
 Discriminant Analysis
 Predicts group membership
 Nurses who graduate versus drop outs
 Cancer patients who adhere to treatment versus those who don't
 Logistic Regression
 Binomial Logistic Regression
 DV is categorical (2 groups)
 Odds of belonging to one group
 Multinomial Logistic Regression
 DV is categorical (> 2 groups)
 Odds of belonging to one group
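The link between probability and odds in binomial logistic regression can be sketched as follows. The coefficients are hypothetical and are not fitted to any data; the function names are assumptions made for illustration.

```python
import math

# Hypothetical coefficients of a binomial logistic regression with one IV.
INTERCEPT = -1.0
SLOPE = 0.5


def probability(x):
    """Predicted probability of belonging to the group coded 1."""
    z = INTERCEPT + SLOPE * x
    return 1 / (1 + math.exp(-z))


def odds(p):
    """Odds = probability of belonging / probability of not belonging."""
    return p / (1 - p)


for x in [0, 2, 4]:
    p = probability(x)
    print(f"X = {x}: probability = {p:.3f}, odds = {odds(p):.3f}")

# A one-unit increase in X multiplies the odds by exp(SLOPE), the odds ratio.
odds_ratio = odds(probability(3)) / odds(probability(2))
print("Odds ratio per unit of X:", round(odds_ratio, 3))
```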