Statistical Techniques in
LIS Research
Dr S K Savanur
Senior Faculty
Department of LIS
Joshi-Bedekar College
Thane-400601
Decisions
• Technical, Managerial and Life-Related
• Routine and Special
• Decisions imply the Unknown and the Future
• Decisions: Resources and Estimations
• Decisions = Observations + Processing
• Better Observation = Better Data
• Core Theme is Decision Making
25 May 2017
Decision Making
• Decisions are about entities
• Entities have attributes/properties
• Attributes are variables
• Variables take different values
• Variables come from objectives/hypotheses
• Measurement of a variable is data
Note: Variables, Measurement and Data.
Types of Variables
• Quantitative Vs Qualitative
• Continuous Vs Discrete
• Dependent Vs Independent
• [Tells what is to be measured and what data need to be collected.]
Types of Variables Contd.
• Continuous Variable: Eg: change in tumor volume or diameter,
age, height, BP, obsolescence rate, age of manuscripts/books, etc.
Commonly used point estimates: mean, median
• Binary Variables: Observations (i.e., dependent variables) that
occur in one of two possible states (0 or 1). Eg: improved/not
improved, completed/failed task, yes/no, male/female,
response, progression, >50% reduction in tumor size,
reference/lending, borrowed/on the shelf. Commonly used point
estimates: proportion, relative risk, odds ratio
• Time-to-Event (Survival) Variables: Eg: time to progression,
time to death, time to relapse, cut-off temperature, voltage, time
to discard, time to binding etc. Commonly used point estimates:
median survival, k-year survival, hazard ratio
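The point estimates named for each variable type can be sketched with Python's standard library. All data values below are hypothetical, and the time-to-event example assumes every event was observed (no censoring), so its median is a plain median rather than a survival-curve estimate.

```python
from statistics import mean, median

# Continuous variable: hypothetical ages of books, in years.
book_ages = [2.5, 4.0, 7.5, 10.0, 31.0]
mean_age = mean(book_ages)
median_age = median(book_ages)

# Binary variable: 1 = borrowed, 0 = on the shelf (hypothetical records).
borrowed = [1, 0, 0, 1, 1, 0, 1, 1]
proportion_borrowed = sum(borrowed) / len(borrowed)

# Time-to-event variable: years until each book was discarded.
# Assumption: no censoring, so the median time is the ordinary median.
years_to_discard = [3, 5, 8, 12, 20]
median_time = median(years_to_discard)

print(mean_age, median_age)   # 11.0 7.5
print(proportion_borrowed)    # 0.625
print(median_time)            # 8
```

Note how the single old book (31 years) pulls the mean well above the median, which is why both are listed as point estimates for continuous data.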
Measurement
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale
Some Definitions
• Population: The complete set of
individuals or objects that the
investigator is interested in studying
• Sample: A subset of the population that
is actually being studied
• Essence of Statistics
Essence of Statistics
• Plural Vs Singular: "statistics" (the discipline) Vs a "statistic" (one summary measure)
• Statistic: Summary measure of a sample
• Parameter: Summary measure of a population
• Summary Measures: Mean, Median, Mode, SD,
Coefficient of Correlation, Regression Coefficient
etc.
When to use which summary measure
• Mean: Interval and Ratio Scales
• Median: Ordinal, Interval and Ratio Scales
• Mode: Nominal, Ordinal, Interval and Ratio Scales
• SD: Interval and Ratio Scales
• Association
  - Coefficient of Correlation: Two Variables
  - Interval and Ratio Scale: Pearson
  - Ordinal Scale: Spearman
  - Nominal and Ordinal Scale: Chi-Square
• Regression Coefficient
• Single Variable: Mean, Median, Mode, SD
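The difference between the Pearson and Spearman coefficients listed above can be sketched in plain Python. This is a minimal illustration, not a substitute for SPSS or a statistics library: the helper names (`pearson`, `average_ranks`, `spearman`) and the data are hypothetical. Spearman is computed here as Pearson applied to ranks, with tied values sharing the mean of their ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient, suited to interval/ratio data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: shelf age of a title (years) and loans last year.
age = [1, 2, 3, 4, 5]
loans = [40, 30, 22, 21, 5]

print(round(pearson(age, loans), 3))
print(round(spearman(age, loans), 3))
```

Because loans fall steadily as age rises, the rank-based coefficient reaches -1 while the Pearson coefficient stays slightly short of it; Spearman only asks whether the ordering agrees, which is why it fits ordinal data.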
Two Aspects of Statistics
• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
• Concerned with describing or characterizing
the obtained sample data
• Use of summary measures—typically measures
of central tendency and spread
• Measures of central tendency include the
mean, median, and mode.
• Measures of spread include the range,
variance and standard deviation.
• These summary measures obtained from
sample data are called statistics.
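As a minimal sketch, the measures of central tendency and spread listed above can be computed with Python's built-in `statistics` module. The sample (books issued per day) is hypothetical.

```python
import statistics

# Hypothetical sample: number of books issued on ten working days.
issued = [12, 15, 15, 18, 20, 22, 15, 30, 25, 18]

central = {
    "mean": statistics.mean(issued),
    "median": statistics.median(issued),
    "mode": statistics.mode(issued),
}
spread = {
    "range": max(issued) - min(issued),
    "variance": statistics.variance(issued),  # sample variance (divides by n - 1)
    "sd": statistics.stdev(issued),           # sample standard deviation
}

for name, value in {**central, **spread}.items():
    print(f"{name:>8}: {value:.2f}")
```

`statistics.variance` and `statistics.stdev` use the sample (n - 1) formulas; `pvariance` and `pstdev` are the population versions.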
Inferential Statistics
• Involves using obtained sample statistics
to estimate the corresponding population
parameters.
• Most common inference is using a sample
mean to estimate a population mean.
• In short, it leads to an estimation.
• Sample → Statistic
• Population → Parameter
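The sample-to-population step can be illustrated with a small simulation. This is only a sketch: the population is synthetic (hypothetical loan periods drawn from a normal distribution), and the seed, sizes and variable names are arbitrary choices.

```python
import random
import statistics

random.seed(2017)  # fixed seed so the sketch is repeatable

# Synthetic population: 10,000 hypothetical loan periods in days.
population = [random.gauss(14, 4) for _ in range(10_000)]
parameter = statistics.mean(population)   # population parameter

# Simple random sample of 400 items, drawn without replacement.
sample = random.sample(population, 400)
statistic = statistics.mean(sample)       # sample statistic

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In practice the parameter is unknown, which is the reason for sampling; the simulation computes it only to show how close the statistic comes.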
Point Estimation
• A “Point Estimate” is a one-number summary of the data.
• Examples:
• Dose-Finding Trials: MTD (Maximum Tolerated Dose)
• Safety and Efficacy Trials: Response, Median Survival
• Comparative Trials: Odds Ratio, Hazard Ratio etc.
Odds Ratio and Hazard Ratio
• The odds ratio is a measure of effect size, describing the strength
of association or non-independence between two binary data values. It
is used as a descriptive statistic, and plays an important role in logistic
regression. Unlike other measures of association for paired binary data
such as the relative risk, the odds ratio treats the two variables being
compared symmetrically, and can be estimated using some types of non-random samples.
• The hazard ratio (HR) is the ratio of the hazard rates corresponding
to the conditions described by two levels of a treatment/variable. For
example, in a drug study, the treated population may die at twice the
rate per unit time of the control population; the hazard ratio would then be 2.
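A worked sketch of these ratios, using hypothetical counts. The function names are illustrative, and the hazard-ratio helper assumes a constant hazard in each group (events divided by total follow-up time), which is a simplification of how hazard ratios are usually estimated from survival models.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]:
    rows are the two groups, columns are event / no event."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk of the event in group 1 divided by the risk in group 2."""
    return (a / (a + b)) / (c / (c + d))

def hazard_ratio(events_1, time_1, events_2, time_2):
    """Ratio of two event rates per unit of follow-up time.
    Assumption: the hazard is constant within each group."""
    return (events_1 / time_1) / (events_2 / time_2)

# Hypothetical 2x2 table: 20 of 100 exposed and 10 of 100 unexposed
# subjects experience the event.
a, b, c, d = 20, 80, 10, 90
print(odds_ratio(a, b, c, d))     # 2.25
print(relative_risk(a, b, c, d))  # 2.0

# Hypothetical drug study: 40 deaths over 1000 person-years in the treated
# group against 20 deaths over 1000 person-years in the control group.
print(hazard_ratio(40, 1000, 20, 1000))  # 2.0
```

The odds ratio (2.25) and relative risk (2.0) differ even on the same table; they agree closely only when the event is rare.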
Range Estimate
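A range (interval) estimate reports a span of plausible values for a parameter rather than a single number. As a minimal sketch, assuming the intended range estimate is a confidence interval for the mean, the interval can be computed with a normal approximation. The function name and the data are hypothetical, and for a sample this small a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    centre = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical sample: number of books issued on ten working days.
issued = [12, 15, 15, 18, 20, 22, 15, 30, 25, 18]
low, high = mean_confidence_interval(issued)
print(f"point estimate: {mean(issued)}")
print(f"95% range estimate: {low:.2f} to {high:.2f}")
```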
Reliability, Validity and Normality
• A test is reliable if it gives the same reading every
time
• Eg: Reliable friend, reliable data, inputs etc
• A test is valid when it is testing what it is supposed
to test
• Eg: Valid ticket, valid instrument [Spring balance/weighing
machine]
• A normal frequency distribution has
Mean=Median=Mode and no skewness
• Kurtosis measures how flat-topped or peaked a curve is
• Parametric Vs Nonparametric
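The normality checks mentioned above (Mean=Median=Mode, skewness, kurtosis) can be sketched with moment-based formulas. The helper names and both data sets are hypothetical; the formulas are the simple population-moment versions, and software packages often apply small-sample corrections that give slightly different values.

```python
from statistics import mean, median, mode

def moment(data, k):
    """k-th central moment of the data."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Zero for a symmetric distribution, positive for a long right tail."""
    return moment(data, 3) / moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Zero for a normal curve; negative values indicate a flatter top."""
    return moment(data, 4) / moment(data, 2) ** 2 - 3

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed = [1, 1, 1, 2, 2, 3, 4, 8, 14]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    print(name, mean(data), median(data), mode(data),
          round(skewness(data), 2), round(excess_kurtosis(data), 2))
```

For the symmetric data the mean, median and mode coincide and the skewness is zero; for the skewed data the mean exceeds the median, which exceeds the mode.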
Parametric and Non-Parametric Tests
• A parametric statistical test makes assumptions about the parameters
(defining properties) of the population distribution(s) from which one's
data are drawn.
• A non-parametric test makes no such assumptions.
• Strictly, "Non-parametric Test" is a null category, as all statistical tests assume one
thing or another about the properties of the source population(s).
• The following are non-parametric Tests
  – Chi-square Tests
  – Fisher Exact Probability test
  – The Mann-Whitney Test
  – The Wilcoxon Signed-Rank Test
  – The Kruskal-Wallis Test
  – The Friedman Test
• Non-parametric tests are sometimes spoken of as "distribution-free" tests.
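As one worked instance from the list above, a chi-square test of independence on a 2x2 table can be sketched in plain Python. The function name and the counts are hypothetical, no continuity correction is applied, and the p-value formula used here holds only for 1 degree of freedom (a 2x2 table).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.
    Returns (statistic, p_value); no continuity correction is applied."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical survey: rows = students / faculty,
# columns = prefer print / prefer e-book.
table = [[30, 20], [15, 35]]
statistic, p_value = chi_square_2x2(table)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

The test works on counts in categories, so it suits nominal data and needs no assumption that the underlying variable is normally distributed.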
Hypothesis Testing
• Null Hypothesis: H0
• Alternative Hypothesis: Ha
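How H0 and Ha are used can be sketched with a two-sided one-sample z-test. The function name, the figures and the 5% significance level are hypothetical choices, and the test assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 against Ha: mu != mu0,
    assuming the population standard deviation sigma is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Hypothetical question: is the mean loan period still 14 days?
# H0: mu = 14, Ha: mu != 14; a sample of 64 loans averages 15.2 days,
# and sigma is taken to be 4 days.
z, p_value, reject_h0 = one_sample_z_test(15.2, mu0=14, sigma=4, n=64)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Rejecting H0 at the chosen level of significance means the sample is unlikely under H0; it does not prove Ha.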
Analysis
• 1. Processing of Data
  – 1.1 Editing
  – 1.2 Coding
  – 1.3 Classification
  – 1.4 Tabulation
  – 1.5 Percentages
  – 1.6 Graphic Presentation
• 2. Analysis of Data
  – 2.1 Descriptive and Causal Analysis
    – 2.1.1 Uni-dimensional Analysis
      – 2.1.1.1 Central Tendency
      – 2.1.1.2 Dispersion
      – 2.1.1.3 Skewness
      – 2.1.1.4 One-way ANOVA
      – 2.1.1.5 Index Number
      – 2.1.1.6 Time Series Analysis
    – 2.1.2 Bi-variate Analysis
      – 2.1.2.1 Correlation
      – 2.1.2.2 Regression
      – 2.1.2.3 Association
      – 2.1.2.4 Two-way ANOVA
    – 2.1.3 Multi-variate Analysis
      – 2.1.3.1 Multiple Regression
      – 2.1.3.2 Multiple Discriminant Analysis
      – 2.1.3.3 Multi-ANOVA
      – 2.1.3.4 Canonical Analysis
      – 2.1.3.5 Factor Analysis, Cluster Analysis etc.
  – 2.2 Inferential or Statistical Analysis
    – 2.2.1 Estimation of Parameter Values
      – 2.2.1.1 Point Estimation
      – 2.2.1.2 Range/Interval Estimation
    – 2.2.2 Testing of Hypothesis
      – 2.2.2.1 Parametric Tests
        – 2.2.2.1.1 z-test
        – 2.2.2.1.2 t-test
        – 2.2.2.1.3 ANOVA
      – 2.2.2.2 Non-Parametric Tests
        – 2.2.2.2.1 Chi-square Tests
        – 2.2.2.2.2 Fisher Exact Probability test
        – 2.2.2.2.3 The Mann-Whitney Test
        – 2.2.2.2.4 The Wilcoxon Signed-Rank Test
        – 2.2.2.2.5 The Kruskal-Wallis Test
        – 2.2.2.2.6 The Friedman Test
Functional Statistics
• Statistics is the judgment of choosing and sequencing
the tests
• It is more a matter of understanding when to do what
• Then MS-Excel or SPSS takes care of the data
• First test for normality, reliability and validity
• Then central tendency, if required
• Choose the level of significance
• Multi-Criteria Decision Making
• Thank You