Lecture 4
HO 4
-1-
The scientific hypothesis
In our research we must provide evidence of validity based
on:
• Observation
• Case-study
• Co-relation
• Differentiation, and
• Experimentation
Unfortunately many consider only the last of these as the only valid
form of research. This is incorrect.
The availability and applicability of a particular level of research relates
to the specificity of the research question and the type, form and
availability of observational data and existence of previous knowledge.
CSCI 6960- Research Methods
© Houman Younessi 2013
Observation:
This is the least constrained of all scientific research methods
and is not bound to a very strong and specific hypothesis. In
this type of research, “subjects” are observed in the natural
setting so that patterns of behavior or trends might be
observed and abstractions made of such patterns.
Example:
The work of Porter and Votta (1994) that observes the
professional habits of software engineers.
Porter, A.; Votta, L.; “An experiment to assess different defect detection methods for software requirements inspection”;
Proceedings of the 16th ICSE; Sorrento, Italy; 1994.
Case-Study
This is somewhat higher in constraint in that the researcher
intervenes with the subject’s functioning to some degree, for
example by asking questions or requiring subjects to conduct
certain tasks. Quite a lot of research in both information
systems and software engineering is based on this research
method.
Co-relation
In this method we are interested in quantifying the relationship
between two or more variables. There is thus a need for a higher
degree of constraint than in a case-study.
Example:
Examine if there is a relationship between “Years of programming
experience”, and “The number of defects injected when coding”
by a programmer, given the programming environment.
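As a rough illustration of what "quantifying the relationship" could look like for this example, the sketch below computes a Pearson correlation coefficient. The data values and the names `pearson_r`, `years_experience` and `defects_injected` are invented for illustration; they do not come from any study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: one entry per programmer, same programming environment.
years_experience = [1, 2, 3, 5, 8, 10, 12, 15]
defects_injected = [30, 27, 25, 20, 16, 15, 11, 9]

r = pearson_r(years_experience, defects_injected)
print(f"r = {r:.3f}")  # negative here: more experience, fewer defects
```

A value of r near -1 or +1 indicates a strong linear relationship and a value near 0 a weak one. The function divides by zero if either sample is constant, which a fuller version would need to guard against.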
Correlation IS NOT Causality
[Figure: two series, DIV and DEF, plotted against year (1880 to 2020) on a vertical scale of 0 to 100. Correlation: 0.994987]
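One common way such a high correlation arises without any causal link is that both quantities simply grow over time. The sketch below builds two invented series that each follow their own upward trend plus unrelated small fluctuations; none of the numbers relate to the DIV and DEF data in the figure.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def changes(series):
    """Decade-to-decade changes, which removes a steady trend."""
    return [later - earlier for earlier, later in zip(series[:-1], series[1:])]

years = list(range(1880, 2020, 10))  # 14 decades, 1880..2010

# Hypothetical fluctuations, chosen by hand and unrelated to each other.
wiggle_a = [0, 2, -1, 1, -2, 2, 0, -1, 1, -2, 2, 0, -1, 1]
wiggle_b = [1, -1, 2, 0, -2, 1, 2, -1, 0, 1, -2, 2, -1, 0]

# Each series = its own linear trend in time + its own fluctuation.
series_a = [5 * t + w for t, w in enumerate(wiggle_a)]
series_b = [6 * t + w for t, w in enumerate(wiggle_b)]

r_levels = pearson_r(series_a, series_b)
r_changes = pearson_r(changes(series_a), changes(series_b))
print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

The levels correlate very strongly because both series depend on time, while the decade-to-decade changes correlate only weakly. Neither series influences the other; time is the shared factor.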
Differentiation:
This is an explicit comparison between two or more groups
of subjects in terms of one concept of interest. In this type
of research, all constraints governing the groups must
be the same except for the single concept of interest that
defines each group. This variable is called a “pre-existing
variable” and is not under the control of the researcher.
Example:
Compare the reliability of a program written by novices
versus experienced programmers when all other conditions
are identical.
Experimentation:
In this type of research, subjects are assigned to groups
without bias. In other words there is NO pre-existing
variable. An explicit comparison is then made between such
groups.
Example:
Programmers randomly assigned to groups to evaluate the
efficacy of two programming techniques.
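A minimal sketch of unbiased assignment, assuming a simple shuffle-and-deal scheme. The subject labels, the function name `assign_randomly` and the seed are hypothetical; a real study might need stratification or blocking, which this sketch does not attempt.

```python
import random

def assign_randomly(subjects, n_groups=2, seed=None):
    """Shuffle the subjects, then deal them round-robin into n_groups groups."""
    rng = random.Random(seed)   # a fixed seed makes the assignment repeatable
    shuffled = list(subjects)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical pool of twelve programmers.
programmers = [f"P{i:02d}" for i in range(1, 13)]
technique_a, technique_b = assign_randomly(programmers, seed=2013)

print("Technique A:", technique_a)
print("Technique B:", technique_b)
```

Because group membership depends only on the shuffle, no characteristic of a programmer (experience, preference, team) can systematically determine which technique they are given.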
It is interesting that much of the empirical research in information
systems or software engineering that is labeled “experimental”
is in fact either differential or co-relational.
It is very difficult indeed to set up and conduct a true
experiment in computer science.
Important:
If it is equally (or nearly so) possible and practical to set up
higher-constraint empirical research (e.g. an experiment),
then a lower-constraint one should not be set up.
Empirical Validity
Internal Validity
External Validity
Empirical Validity:
Construct a well-formed hypothesis and test it with respect
to validity and in relation to type 1 and type 2 errors.
There is only rarely any “proof” in science, mostly the
demonstration that the claim that has been made is a reasonable one.
Test of Hypothesis:
• Type 1 error
• Type 2 error
A well formed hypothesis
A well-formed hypothesis is in the form of a specific assertion
that lends itself to mathematical proof or statistical comparison.
It is customary to present most hypotheses in the “null” form.
A null hypothesis states that “There is no statistically significant
difference between the measures of A and B”.
If the observed inferential statistics are very different, we
reject the null hypothesis.
But how different is very different?
To answer this question, we define a cut-off point. We measure the
probability p of obtaining data at least as extreme as those observed
if the null hypothesis is true. If this probability is small, the observed
data would be unlikely under the null hypothesis, and we reject it.
The cut-off point for this decision is the alpha level (α), which
corresponds to a confidence level of (1-α).
But what is an appropriate value for the alpha level?
That depends on the rigor of the data, the procedures and the
reliability required of the results. In hard sciences it is usually
low, such as 0.05 or 0.01. In social sciences values of up
to 0.1 or even 0.2 have been used.
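One way to see the probability p and the alpha cut-off working together is an exact permutation test, sketched below. The technique is a stand-in chosen because it needs no distribution tables; the slides do not prescribe it. The defect counts and the name `permutation_p_value` are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means.

    Under the null hypothesis the group labels are interchangeable, so we
    relabel the pooled data in every possible way and count how often the
    difference in means is at least as large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    relabelings = 0
    for chosen in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in chosen)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            extreme += 1
        relabelings += 1
    return extreme / relabelings

# Hypothetical defect counts per module under two programming techniques.
technique_a = [12, 15, 11, 14, 13]
technique_b = [18, 21, 17, 20, 19]

alpha = 0.05
p = permutation_p_value(technique_a, technique_b)
print(f"p = {p:.4f}")
print("reject the null hypothesis" if p < alpha else "fail to reject the null hypothesis")
```

With these invented numbers only 2 of the 252 possible relabelings are as extreme as the observed split, so p falls below both 0.05 and 0.01. Enumerating every relabeling is practical only for small samples.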
It is however always possible that the researcher’s decision
to accept or reject the null hypothesis based on the alpha level is
wrong. There could be two types of errors:
Type 1 error
This is when the researcher rejects the null hypothesis when
in fact it is true (a false positive).
or
Type 2 error
This is when the researcher accepts (fails to reject) the null
hypothesis when in fact it is false (a false negative).
Of course the probability of committing a type 1 error is equal
to the alpha level (α).
But we cannot set α=0 to avoid all type 1 errors due to
existence of type 2 errors (β).
Decreasing α without doing anything to increase the rigor or
validity of the data (or our procedures) would automatically
increase the value of β.
As we wish to reduce BOTH error types, the only solution is
to increase the rigor of our research.
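The trade-off can be made concrete for one simple case. The sketch below assumes a one-sided z-test of a mean with known standard deviation, where the probability of a type 2 error is β = Φ(z₁₋α − δ√n/σ). This setting, the function name `type_2_error` and the numbers are illustrative assumptions, and a larger sample size stands in here for "more rigor".

```python
from statistics import NormalDist

def type_2_error(alpha, effect, sigma, n):
    """Probability of a type 2 error (beta) for a one-sided z-test.

    alpha  : probability of a type 1 error (must be strictly between 0 and 1)
    effect : true shift of the mean away from the null value (delta >= 0)
    sigma  : known population standard deviation
    n      : sample size
    """
    std_normal = NormalDist()                  # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha)     # rejection threshold under the null
    shift = effect * n ** 0.5 / sigma          # how far the true mean moves the statistic
    return std_normal.cdf(z_crit - shift)

# Hypothetical setting: true effect of half a standard deviation.
for alpha in (0.10, 0.05, 0.01):
    for n in (16, 64):
        beta = type_2_error(alpha, effect=0.5, sigma=1.0, n=n)
        print(f"alpha={alpha:.2f}  n={n:3d}  beta={beta:.3f}")
```

For a fixed sample size, each decrease in alpha raises beta. For a fixed alpha, the larger sample lowers beta. Note that alpha=0 is not even a valid input: the rejection threshold would be infinite and the test could never reject.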
There are, depending on the scale used in the measurement, a
number of tests that can help in determining whether to accept
or reject a hypothesis. These include:
1. Tests to determine differences in population means:
Simple t-Test
Correlated t-Test
Analysis of Variance (ANOVA)
2. Tests of goodness of fit:
Chi-square (χ²) test
Kolmogorov-Smirnov Test
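As one small worked instance from this list, the sketch below computes a chi-square goodness-of-fit statistic and compares it with the tabulated critical value for 3 degrees of freedom at α = 0.05 (7.815). The defect counts, the assumption of equally sized modules and the name `chi_square_statistic` are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 100 defects found across four modules of equal size.
observed_defects = [18, 22, 30, 30]
# Null hypothesis: defects are spread evenly, so 25 are expected per module.
expected_defects = [25, 25, 25, 25]

statistic = chi_square_statistic(observed_defects, expected_defects)

# Tabulated critical value for df = 4 - 1 = 3 at alpha = 0.05.
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815

print(f"chi-square = {statistic:.2f}")
if statistic > CRITICAL_VALUE_DF3_ALPHA_05:
    print("reject the null hypothesis of an even spread")
else:
    print("fail to reject the null hypothesis of an even spread")
```

Here the statistic is 4.32, below the critical value, so these invented counts give no grounds at the 0.05 level to reject an even spread of defects.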
3. Tests of correlation:
Pearson Correlation Test
Spearman Correlation Test
Kendall’s tau Test
Point-biserial Test
Partial Correlation Test
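To show how two of these coefficients relate, the sketch below computes Spearman's coefficient as the Pearson coefficient of the ranks, with tied values sharing their average rank. The helper names and the data are hypothetical, and the significance test that would normally follow is omitted.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """1-based ranks; tied values all receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1   # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data with a monotonic but non-linear relationship.
module_size = [1, 2, 3, 4, 5]
review_minutes = [1, 4, 9, 16, 25]

print(f"Pearson  r   = {pearson_r(module_size, review_minutes):.3f}")
print(f"Spearman rho = {spearman_rho(module_size, review_minutes):.3f}")
```

Because Spearman's coefficient depends only on ordering, it reaches 1 for any strictly increasing relationship, whereas Pearson's coefficient reaches 1 only when the relationship is linear.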