What is a research question? (From topics to questions)

Data collation and analysis

Quantitative
– Sampling
– Questionnaires
– Statistical analysis
– SPSS
– Interpreting quantitative data (exercise)

Qualitative
– Interviewing
– Focus groups
– PAR techniques
– Textual analysis
– Interviewing and focus groups (exercise)

Some key concepts

Formulating and testing your hypothesis

Reliability of data – consistency over time (quant); internal
consistency

Rigour – sufficient evidence (appropriate methodology;
consideration of competing evidence; generalisability)
– Triangulation
– Documentation

Validity (vs bias)
– Overall validity of research
– Internal validity (internal logic and consistency – elimination of rival
hypotheses; consideration of negative evidence; cross-validation of
findings with other parts of data; ‘member checking’)
– External validity (generalisability – is sampling diverse enough?; is
context thickly described?; can concepts be applied to other settings?)
Research design – Selecting variables

Purpose of research – generally (though not always…)
– Why does Y happen?
– Does X cause Y?
– Do A, B and/or C cause Y?

Y = your dependent variable (by convention)

The factors you are examining which may impact
on/cause Y (e.g. X, A, B or C) are your independent
variables

Need to decide these in advance through literature
analysis (though you can adapt them as you go on)
Sampling




What is your population? (group you are studying)
Too large to study all? - Sample
Logic is that you analyse data from sample but extrapolate / draw
conclusions for whole population
Key question = how representative is your sample? – need to follow logic of
your research question
– Random (probability) sampling (general representativeness) – may be stratified
to ensure particular categories included (women, men, youth, landowners…)
– Purposive sampling (specific groups, variables)
– Key informants

Key questions
– How big will sample be, and why?
– How will it be chosen, and why?
– What claims will you make for its representativeness?
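
A minimal sketch of stratified random sampling, assuming a made-up population of 60 women and 40 men. The function name `stratified_sample` and its parameters are illustrative, not part of any package. It draws the same fraction at random from each category so that every category is included:

```python
import random

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum (category)."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 women and 40 men, identified by number.
population = [("woman", i) for i in range(60)] + [("man", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda p: p[0], fraction=0.1)

counts = {g: sum(1 for p in sample if p[0] == g) for g in ("woman", "man")}
print(counts)
```

A 10% sample here keeps the 60:40 split of the population, which is the kind of claim about representativeness the key questions above ask you to justify.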
Questionnaires / Interview Schedules


Formal, structured, closed (quant) vs semi-formal, semi-structured, open-ended (qual)
Structured questionnaires
– Principally for ‘what’ questions
– Standardised questions (tick box, assign value)

Semi-structured interview schedules
– Principally for ‘why’ or ‘how’ questions
– Standardised set of topics / areas to be explored

Back to research question
– What type of information/data do you need?

Generally useful to pilot either technique
Rigour in questionnaire surveys and interview schedules

Misinterpretation of question
Interviewer bias
With respect to surveys, correlation is not equal to
causation – rigour required in interpretation (supplement
with qualitative data)
Triangulation
– Different methods
– Different informants
– Different researchers

Documentation
– Tape recordings, photography etc…
– Reflexivity – record of evolving analysis, decisions, directions
taken in research process
Surveys and interviews compared
(Table 7.2, p. 171, Thomas and Mohan)
Design – complete vs. evolving
Selection of informants – initial vs. evolving
Selection of questions – complete vs. modified
Nature of questions – identical vs. tailored
Choice of research participants – random vs. purposive
sample
Data analysis – statistical vs. qualitative analysis
Validation methods – probability criteria vs. triangulation
Most appropriate use – ‘what’ (statistical extrapolation to
wider population) vs. ‘why’ & ‘how’

Statistical Analysis (the logic)



Quantitative (statistical) analysis consists of the study of the
variation in variables and the co-variation between them.
Conducted for large number of cases (though definitions of
‘large’ vary)
Looks for patterns in data and the possibility of generalising
from this to larger population
– Are findings due to chance variation or are they ‘statistically significant’?

Basic descriptive statistics
– Mean (average), mode (most frequently occurring value) and median
(middle value)
– Variation – standard deviation (the higher the SD, the greater the
deviation from mean)
– Frequency distributions - histograms, pie-charts
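
As a small illustration of the descriptive statistics listed above, the sketch below uses Python's standard `statistics` module on an invented list of household sizes (the data are hypothetical):

```python
import statistics

# Hypothetical household sizes reported by ten survey respondents.
household_sizes = [2, 3, 3, 4, 4, 4, 5, 6, 7, 12]

mean = statistics.mean(household_sizes)      # arithmetic average
median = statistics.median(household_sizes)  # middle value of the sorted data
mode = statistics.mode(household_sizes)      # most frequently occurring value
sd = statistics.stdev(household_sizes)       # sample standard deviation

print(mean, median, mode, round(sd, 2))
```

The single large household pulls the mean (5) above the median (4), which is why it is worth reporting more than one measure.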
Types of variable

Level of measurement   Example                          Additional info contained
Interval var.          43% Lab; 10% Lib Dem; 47% Con    Quantity (Con is 4% more than Labour)
Ordinal var.           1=Lab, 2=Lib Dem, 3=Con          Order (left-right wing)
Nominal var.           1=Lab, 2=Con, 3=Lib Dem          Mutual exclusivity
Dummy var.             0=Non-Labour, 1=Labour           n/a
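
A brief sketch of how the nominal and dummy codings in the table might be produced from raw answers. The list of votes is invented for illustration:

```python
# Hypothetical survey answers to "Which party did you vote for?"
votes = ["Lab", "Con", "Lib Dem", "Lab", "Con", "Con"]

# Nominal coding: the numbers are labels only; their order means nothing.
nominal_codes = {"Lab": 1, "Con": 2, "Lib Dem": 3}
nominal = [nominal_codes[v] for v in votes]

# Dummy coding: 1 = Labour, 0 = Non-Labour.
labour_dummy = [1 if v == "Lab" else 0 for v in votes]

# The mean of a dummy variable is the share of cases coded 1.
labour_share = sum(labour_dummy) / len(labour_dummy)
print(nominal, labour_dummy, labour_share)
```

Averaging the nominal codes would be meaningless, whereas the average of the dummy variable has a clear reading (the proportion voting Labour).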
Relationships between variables
(Correlations)
Statistical analysis looks at relations between variables and
allows us to assess the chances of being wrong when
drawing conclusions
A presumed causal effect is ‘strong’ if it appears to have
extensive effects – causes big changes (graphs)
It is ‘widespread’ if it occurs in many different
circumstances – need to use multiple variables to
examine this
It is ‘significant’ if it is unlikely to have occurred as a fluke, so
there is a good chance that it will re-occur when generalised –
generally accepted at the 95% level (though the more cases examined
the better)

Relationships between variables
Cross tabulations / bivariate analysis
e.g. Table X: Wealth and educational attainment

                 Completed 2˚ school
                 Yes     No      Total
Wealthy          83%     17%     100%
Middle income    60%     40%     100%
Poor             26%     74%     100%

Can use for many more variables (multivariate analysis)
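
A cross tabulation of this kind can be built by counting cases per cell and converting each row to percentages. The sketch below uses a handful of invented respondent records (not the data behind Table X):

```python
from collections import Counter

# Hypothetical respondent records: (wealth group, completed secondary school?)
records = (
    [("Wealthy", "Yes")] * 5 + [("Wealthy", "No")] * 1
    + [("Poor", "Yes")] * 1 + [("Poor", "No")] * 3
)

cell_counts = Counter(records)               # cases per (row, column) cell
row_totals = Counter(w for w, _ in records)  # cases per wealth group

def row_percent(wealth, answer):
    """Percentage of a wealth group giving a particular answer."""
    return 100 * cell_counts[(wealth, answer)] / row_totals[wealth]

for wealth in ("Wealthy", "Poor"):
    print(wealth, [round(row_percent(wealth, a)) for a in ("Yes", "No")])
```

Percentages are taken across each row so that groups of different sizes can be compared directly.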
Assessing the strength of the
relationship
Correlations between variables (r or R) (=relationship) –
Pearson coefficient



2 variables related if their values tend to move together
(e.g. taller people tend to be heavier)
May have negative correlations (e.g. thicker clouds lead
to diminished sunlight)
r/R ranges between -1 and +1; the closer the value is to -1 or +1,
the stronger the correlation (0 = no linear relationship)
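
For illustration, the Pearson coefficient can be computed by hand as below. The function name `pearson_r` and both small data sets are invented; they simply mirror the height/weight and cloud/sunlight examples:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

# Hypothetical data: taller people tend to be heavier (positive r).
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 90]

# Hypothetical data: thicker cloud, less sunlight (negative r).
cloud_cover = [1, 2, 3, 4, 5]
sunlight_hours = [9, 7, 6, 3, 1]

r_height_weight = pearson_r(heights_cm, weights_kg)
r_cloud_sun = pearson_r(cloud_cover, sunlight_hours)
print(round(r_height_weight, 2), round(r_cloud_sun, 2))
```

Both made-up relationships are very strong; only the sign differs, showing the direction in which the values move together.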
Assessing the strength of the
relationship
General guidelines on strengths of correlations

Strength of           Interpretation with     Interpretation with
correlation (r/R)     indiv-level data        aggregate data
0.00-0.06             Trivial                 Trivial
0.07-0.19             Slight                  Trivial
0.20-0.34             Moderate                Slight
0.35-0.49             Strong                  Moderate
0.50-0.65             Spectacular             Strong
0.66-0.80             Highly spectacular      Very strong
0.81-0.95             Suspect                 Spectacular
0.96-1.00             Very suspect            Suspect
Relationships between more than 2
variables

Multivariate analysis
– Say you have two sources of variance (wealth and
gender) with respect to your dependent variable
(educational attainment) and want to know about
differences between groups, within groups and
interactions of these…
Table Y: Gender, wealth and educational attainment

                 Women   Men
Wealthy          77%     88%
Middle income    53%     67%
Poor             12%     33%
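
One simple way to read Table Y for interactions is to compare the gender gap within each wealth group. The sketch below copies the percentages from the table; the variable names are illustrative:

```python
# Percentages with the educational attainment reported in Table Y.
table_y = {
    "Wealthy": {"Women": 77, "Men": 88},
    "Middle income": {"Women": 53, "Men": 67},
    "Poor": {"Women": 12, "Men": 33},
}

# Gap between men and women within each wealth group, in percentage points.
gender_gap = {wealth: row["Men"] - row["Women"] for wealth, row in table_y.items()}
print(gender_gap)
```

The gap grows from 11 points among the wealthy to 21 points among the poor, which suggests that gender and wealth interact rather than acting separately.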
Relationships between more than 2
variables

Simple correlation = relationship between 2 variables
– Pearson coefficient (r)
– from -1 to 1 – the closer the number to -1 or 1, the stronger the
correlation

Multiple correlation = relationship between more than 2
variables
– Most common method used in calculating the effects of several
variables is regression analysis
Relationships between more than 2
variables

Regression analysis was developed by geneticists to study
the way in which parents who are much taller/shorter than
average tend to have children whose height moves back
towards the average, or regresses to the mean

Measure = R squared (guidelines on strengths as in
previous table for r/R)
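
A minimal sketch of regression with a single independent variable, fitted by least squares, with R squared computed as the share of variation in Y that the line accounts for. The data (years of schooling and income) and the function names are invented for illustration. Real analyses with several independent variables would normally be run in a package such as SPSS.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys accounted for by the fitted line."""
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total

# Hypothetical data: X = years of schooling, Y = income (arbitrary units).
years_schooling = [6, 8, 10, 12, 14, 16]
income = [11, 14, 15, 19, 20, 23]

slope, intercept = fit_line(years_schooling, income)
r2 = r_squared(years_schooling, income, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(r2, 2))
```

With only one independent variable, R squared is simply the square of the Pearson coefficient r.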
Statistical inference / inferential
statistics

How to generalise from sample

Key question: How likely am I to be wrong if I infer that my findings
apply to population as a whole? = statistical significance

p (probability) = measure of statistical significance – obtained
through a significance test such as the chi-square test
– E.g. p<0.05 means that, if there were no real relationship in the
population, findings like mine would come about by chance less than 5%
of the time (fewer than 5 out of each 100 samples)
– Therefore, I accept no more than a 5% risk of wrongly inferring a
relationship when I generalise to the whole population
– This is generally the level required for results to be ‘statistically
significant’
– Though caution… again depends on size of sample
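
As a rough illustration of how a chi-square test yields p, the sketch below uses the Table X percentages and ASSUMES 100 respondents in each wealth group. The slides give no sample sizes, so the counts are hypothetical. It compares the observed counts with the counts expected if wealth and schooling were unrelated:

```python
import math

# Assumed counts: Table X percentages with a hypothetical 100 respondents
# per wealth group. Columns: completed school = Yes, No.
observed = [
    [83, 17],  # Wealthy
    [60, 40],  # Middle income
    [26, 74],  # Poor
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Count expected in this cell if the two variables were unrelated.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)  # 2 here

# For exactly 2 degrees of freedom, p = exp(-chi_square / 2).
p_value = math.exp(-chi_square / 2)
print(round(chi_square, 1), degrees_of_freedom, p_value < 0.05)
```

With far fewer respondents per group the same percentages would give a much smaller chi-square value, which is why the caution about sample size matters.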
Determining causation

Correlation yes, but how do we infer causation?

Variables (say X and Y) must be related
A time order between variables must be demonstrated –
e.g. X (the cause) preceding Y (the effect)
Must have a plausible theory showing causal links
between the two variables
Must consider and eliminate plausible rival hypotheses



(Punch, 2005: 49)
Data analysis: SPSS



Statistical Package for the Social Sciences
Capable of handling vast quantities of information / variables from
structured surveys
Basic and advanced statistical analysis
– Descriptive statistics
– Cross tabulations
– Correlations
– Regressions
– ANOVA
– Forecasting, time-series
– Reliability tests
Graphical representation
Package is on DCU computers (limited copies can also be put on
students’ laptops under the DCU licensing agreement)
Buy a manual!
All with me so far?
Interpreting survey-based data
(Exercise)

See Table (photocopy)

Your research question: What was the effect of SAPs on
the growth of the informal sector in Tanzania?
Using data from informal sector survey of 1991

1. What are the dependent and independent variables?
2. What can the data tell you in relation to your question?
3. What can the data not tell you? (its limitations)