Download What is a research question? (From topics to questions)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Linear regression wikipedia , lookup

Interaction (statistics) wikipedia , lookup

Least squares wikipedia , lookup

Regression analysis wikipedia , lookup

Data assimilation wikipedia , lookup

Coefficient of determination wikipedia , lookup

Transcript
Some Key Concepts in
Research (cont’d) &
Quantitative methods




Hypothesis: An informed proposition which must
be testable against evidence.
Methodology – How can we go about acquiring that
knowledge?
Methods – What precise procedures can we use to
acquire it?
Sources – Which data can we collect?



Reliability, rigour and validity
Reliability of data – more likely to provide
agreement among independent observers (e.g.
Another researcher will achieve the same results);
Rigour – sufficient evidence to support findings;
appropriate methodology; consideration of
competing evidence; generalisability.
◦ Triangulation: using more than one method or
source of data, so that findings can be crosschecked

Overall validity of research
◦ Internal validity – coherent internal logic and
consistency; adequate demonstration of cause
and effect; consideration of negative evidence; it
must make sense within its own terms: e.g. What
do I need to do to test the rel between dem &
dev...
◦ External validity generalisability – is sampling
diverse enough? E.g. ‘all latin american
countries...’ based on 3 countries; is context
thickly described?; can concepts be applied to
other settings?




Quantitative
◦
◦
◦
◦
Sampling
Questionnaires
statistical analysis
SPSS
◦
◦
◦
◦
Interviewing
focus groups
PAR techniques
textual analysis
Qualitative
Mixed methods
What serves my research/the type of data I
have access to best?





What is your population? (group you are studying, e.g.
100)
Too large to study all? - Sample 20/100
Logic is that you analyse data from sample but
extrapolate / draw conclusions for whole population
Key question = how representative is your sample? –
need to follow logic of your research question
Key questions
 How big will sample be, and why?
 How will it be chosen, and why?
 What claims will you make for its representativeness?

Anything that can be measured, and can differ
across entities or across time;

Purpose of research (though not always…) to find out ◦ Why does Y happen?
(Y = poverty levels among farmers in LDCs; dependent
variable)
◦ Does X cause Y?
(X = Global trade/agricultural regime; independent
variable)
◦ Or, do A, B and/or C cause Y?
(A = population growth; B = climate change; C = economic
policies; further independent variables).

Y = your dependent variable

X = your independent variable


A, B or C
further independent variables that may impact
on/cause change in Y
Intervening variables
may change the normal relationship between X and
Y

Correlation yes, but how do we infer causation?
High correlation does not necessarily equal
causation (sun, ice-cream and crime!)
◦ Variables (say X and Y) must be related
◦ A time order between variables must be demonstrated –
e.g. X (the cause) preceding Y (the effect)
◦ Must have a plausible theory showing causal links between
the two variables
◦ Must consider and eliminate plausible rival hypotheses
(Punch, 2005: 49)



Quantitative (statistical) analysis is the study of the
variation in variables and the co-variation between
them.
Conducted for large number of cases (though
definitions of ‘large’ vary)
Looks for patterns in data and the possibility of
generalising from this to larger population
◦ findings due to chance variation or are they ‘statistically
significant’?

A correlation exists between two variables when
their values move together
◦ e.g. taller people tend to be heavier
◦ May have negative correlations, e.g. thicker clouds lead to
diminished sunlight

Pearson’s coefficient (r):
◦ A measure of the strength and direction of the correlation
between two variables
◦ from -1 to 1: the higher the value (the closer to -1 or 1),
the greater the correlation.



An effect is termed as ‘significant’ if it is unlikely to
occur as a fluke, but there is a good chance that it
will re-occur in instances where generalised, and
the more cases examined the better.
Significance generally accepted at a 95% confidence
level, with a less than 5% chance that the null
hypothesis (a disproving of your hypothesis) is
true.
P-value – probability, measure of statistical
significance.

p (probability) = measure of statistical significance
◦ E.g. p<0.05 means that I’m likely to be wrong
(the findings came about by chance) 5% (5 out of
each 100 times) of the time
◦ Therefore, there is a 95% chance that my findings
can be generalised to the whole population
◦ This is generally the level required for results to
be ‘statistically significant’

Relationships between variables – cross tabulations
e.g. Table X: Wealth and educational attainment
Completed 2˚ school
Yes
No
Total

Wealthy
83%
17%
100%
Middle income Poor
60%
26%
40%
74%
100%
100%
Can use for many more variables (multivariate
analysis)

Multivariate analysis
◦ Example: Surveying students and measuring a
number of variables
◦ Finding: On basis of bivariate analysis, findings =
physics students taller than language students
◦ Tests confirmed, but not correct to conclude
course choice is the factor
◦ Another variable, gender, shows that the physics
course is mostly male.

Multivariate analysis
◦ Say have two sources of variance (wealth and
gender) with respect to your dept. variable
(educational attainment) and want to know about
differences between groups, within groups and
interactions of these…
Table Y: Gender, wealth and educational attainment
Women
Men
Wealthy
77%
88%
Middle income
53%
67%
Poor
12%
33%
◦ Mean (average, the centre of distribution of
scores)
◦ Median (middle value) measurements listed in
ascending order and the measurement in the
middle is chosen
◦ Mean and median generally quite close, otherwise
‘skewed’ distribution
◦ Mode (most frequently occurring value) and
◦ Variation – standard deviation (the higher the SD,
the greater the deviation from mean)
Example: ODA levels as
% of GNI from 23
OECD countries, 2011
Mean: Sum of %,
divided by 23.
Median: The middle %
of this set of
observations
Mode: The % of GNI
that most frequently
appears in the data
Australia
Austria
Belgium
4 983
1 111
2 807
0.34
0.27
0.54
Canada
Denmark
Finland
5 457
2 931
1 406
0.32
0.85
0.53
France
Germany
Greece
12 997
14 093
425
0.46
0.39
0.15
Ireland
Italy
Japan
914
4 326
10 831
0.51
0.20
0.18
Korea
Luxembourg
Netherlands
1 328
409
6 344
0.12
0.97
0.75
New Zealand
Norway
Portugal
424
4 934
708
0.28
1.00
0.31
Spain
Sweden
Switzerland
4 173
5 603
3 076
0.29
1.02
0.45
13 832
30 924
0.56
0.20
United Kingdom
United States

Variation – standard deviation is a measure
of the dispersion of the data (the greater
the deviation from mean, the higher the SD)

E.g. Mean (average) ODA as % GNI = .49%

Greece: 0.15; Sweden: 1.02

Frequency distributions histograms/scatterplot, pie-charts





Nominal – ‘named’ e.g. 1 = Female; 2 = Male, not
ranked.
Ordinal – comparative, directional scale, can be
used for ranking e.g. 1 = highest and 6 = lowest
(e.g. Freedom House scores; ODA levels)
Interval – gaps of equal measure, e.g. height,
temperature, years – gaps of equal size.
Dummy - presence or absence of a specific
characteristic, e.g. 1 = Irish Aid programme
country; 0 = non-Irish Aid programme country.
Important if you are inputting data






Statistical Package for the Social Sciences
Capable of handling vast quantities of information
/ variables from structured surveys
Basic and advanced statistical analysis
◦ Descriptive statistics
◦ Cross tabulations
◦ correlations
Graphical representation
Package is on DCU computers (limited copies can
also be put on student’s laptops under the DCU
licensing agreement)
Buy a manual!