Quantitative analysis
Sonia Williams
Northern College of Acupuncture
19th February 2011
Numbers, numbers……
Measurable values
• Height, weight, age
• Can calculate:
• Average/mean
• Median
• Mode
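A minimal sketch of the three summary values, using Python's standard `statistics` module. The heights are hypothetical and are not data from the talk.

```python
import statistics

# Hypothetical heights in metres (illustrative only).
heights = [1.60, 1.67, 1.67, 1.72, 1.80]

mean_height = statistics.mean(heights)      # arithmetic average
median_height = statistics.median(heights)  # middle value once sorted
mode_height = statistics.mode(heights)      # most frequent value

print(mean_height, median_height, mode_height)
```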
Parametric statistics
Continuous variables
– height, weight, age expressed in exact terms
– e.g. 1.67 m; 71.5 kg; 25.5 years.
Non-continuous variables
– height, weight, age expressed in groupings
– e.g. 1.5-1.7 m; 70-75 kg; 20 to <25 yrs.
Distribution curve: height, weight, IQ, etc.
Continuous variable
Comparing 2 groups
E.g. shoe sizes men/women?
Is there a statistically significant difference
between them?
Parametric stats.
e.g. t-tests
Comparing means
And standard deviations
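As a rough sketch of what a t-test does with the means and standard deviations, the following computes Welch's t statistic (one common variant, chosen here as an assumption) for two independent groups. The shoe sizes are hypothetical, and `welch_t` is an illustrative helper name.

```python
import math
import statistics

# Hypothetical shoe sizes (illustrative only, not data from the talk).
men = [8, 9, 9, 10, 10, 11]
women = [5, 5, 6, 6, 7, 7]

def welch_t(a, b):
    """Return Welch's t statistic for two independent samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / standard_error

t_value = welch_t(men, women)
print(round(t_value, 2))
```

The larger the t value (in either direction), the less plausible it is that the two groups share the same mean. Turning t into a P value needs the t distribution, which a package such as SPSS handles.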
Other uses of numbers in
quantitative data……
Categorical data
• E.g. gender
• Yes/no answers
Presenting categorical data
• 4 categories
• Visually presented
Comparing categorical data
          Acup-   Acup+   total
male        30      10      40
female      30      30      60
total       60      40     100
Sample size = 100
Comparing…….
• 40 males & 60 females
• 40 had received acupuncture
while 60 had not.
• Was there a significant
difference in the proportion
of males & females receiving
acupuncture?
• Chi squared test used
• ANSWER=? Ask SPSS
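For readers without SPSS to hand, here is a hedged sketch of the Pearson chi squared calculation for the 2x2 table above (30 and 10 for males, 30 and 30 for females), with no continuity correction. SPSS may also report a corrected figure, so its output can differ slightly.

```python
import math

# Observed counts from the table: rows = male, female; columns = Acup-, Acup+.
observed = [[30, 10],
            [30, 30]]

row_totals = [sum(row) for row in observed]        # males, females
col_totals = [sum(col) for col in zip(*observed)]  # Acup-, Acup+
n = sum(row_totals)                                # sample size

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count expected if gender and acupuncture were unrelated.
        expected = row_totals[i] * col_totals[j] / n
        chi_squared += (count - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom; in that case the upper-tail
# probability of chi squared equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_squared / 2))

print(round(chi_squared, 2), round(p_value, 3))
```

On these counts the uncorrected statistic is 6.25 with P of about 0.012, which is below 0.05.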
Probability values (P)
• Probability of
heads (or of tails)
on a single toss = 1
in 2 or 50% (or 0.5)
• Probability of 2
consecutive heads
= 1 in 2 AND 1 in
2 = 1 in 4 or 25%
(or 0.25)
Probability values (P)
• How many times
would you need to
get consecutive
tails to reach a
probability value
less than 0.05?
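One way to work this out is to keep halving the probability until it drops below 0.05. The sketch below assumes a fair coin and independent tosses.

```python
threshold = 0.05
tosses = 0
probability = 1.0

# Each extra tail multiplies the probability of the whole run by 0.5.
while probability >= threshold:
    tosses += 1
    probability *= 0.5

print(tosses, probability)  # 5 0.03125
```

Four tails in a row has probability 0.0625, so it takes five to get below 0.05.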
Probability values (P)
• P<0.05 is
conventionally
treated as
statistically
significant.
• If there were no
real effect, a result
at least this extreme
would occur by
chance less than 5%
of the time
• or 1 in 20
• P<0.01 is 1% or 1
in 100
• P<0.001 is 0.1% or
1 in 1000
Sources of error in statistics
• Assuming that an
association is the
same as causation.
• The link may be
spurious
• There may be a
confounding
variable
Sources of error in statistics
which one will be true?
• Type 1 error (false
positive): the effect
you concluded was
real was not
Sources of error in statistics
which one will be true?
• Type 2 error (false
negative): the effect
you concluded was
not real was
Data entry: hardware?
Punch card machine
Data analysis
Life is easier now & less noisy!
• SPSS
• Comprehensive set of flexible tools that
can be used to accomplish a wide variety
of data analysis tasks.
• Data collection instrument
• Data analysis
• Graphic presentations
• Statistical analysis
Creating datasets
• What experimental design?
• Which variables?
• What values do these variables assume?
• How can the data be coded to make data
entry easier?
• Devise a code book to help you
• Make sure you ‘clean’ the data, as errors
in data entry can occur (10% check +
frequency check)
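A frequency check can be sketched as counting how often each code appears and flagging any code that is not in the code book. The codes and the entered column below are hypothetical.

```python
from collections import Counter

# Hypothetical code book entry: 1 = male, 2 = female, 9 = missing.
valid_codes = {1, 2, 9}

# Hypothetical entered data containing one typing error (the value 22).
gender_column = [1, 2, 2, 1, 22, 2, 9, 1]

frequencies = Counter(gender_column)

# Any code outside the code book is a likely data entry error.
invalid = {code: count for code, count in frequencies.items()
           if code not in valid_codes}

print(frequencies)
print(invalid)
```

The 10% check is a separate step: re-entering or re-reading a random tenth of the records against the original forms.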
Choose appropriate scales &
measures
Questionnaires
• Closed questions: easy to code: inflexible
• Semi-structured questions: harder to code: more
flexible
• May need to add to dataset as ‘unexpected
answers’ become apparent
• Open-ended questions: bit of a nightmare: need
to go through & document all possible answers
before devising suitable coding system
Questionnaires: try to avoid…
• Long complex questions
• Double negatives
• Double-barrelled questions
• Jargon or abbreviations
• Culture-specific terms
• Words with double meanings
• Leading questions
• Emotionally loaded words
Developing a codebook
• Decide how you will go about:
– Defining and labelling each of the variables
– Assigning numbers to each of the possible
responses
– Each question or section of a question must
have a variable name which:
• Must be unique, must begin with a letter, and cannot include
punctuation
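The naming rules above can be checked mechanically. This is a minimal sketch: the pattern allows only letters and digits, which is a stricter reading than some packages apply, and `invalid_names` and the example code book are hypothetical.

```python
import re

# Assumption: a name is a letter followed by letters or digits only.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def invalid_names(names):
    """Return names that break a rule or repeat an earlier name."""
    seen = set()
    problems = []
    for name in names:
        key = name.lower()  # assumption: uniqueness ignores letter case
        if not NAME_PATTERN.fullmatch(name) or key in seen:
            problems.append(name)
        seen.add(key)
    return problems

# Hypothetical code book: variable name -> {numeric code: label}.
codebook = {
    "gender": {1: "male", 2: "female"},
    "acup": {0: "no", 1: "yes"},
}

print(invalid_names(list(codebook)))
print(invalid_names(["gender", "2age", "acup?", "gender", "q1a"]))
```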
Data entry: issues to consider
Variables:
• Categorical
• Continuous/discrete
How to deal with multiple responses
(where more than one response may be
given to a single question)
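One common way to handle multiple responses, shown here as a sketch rather than the approach the talk prescribes, is to give each possible answer its own yes/no variable. The question and options are hypothetical.

```python
# Hypothetical question: "Which therapies have you tried?"
options = ["acupuncture", "massage", "osteopathy"]

def code_multiple_response(ticked):
    """Return one 0/1 variable per option for a single respondent."""
    return {option: int(option in ticked) for option in options}

row = code_multiple_response({"acupuncture", "osteopathy"})
print(row)  # {'acupuncture': 1, 'massage': 0, 'osteopathy': 1}
```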
Outcomes?
• Frequencies?
• Cross tabulations?
• Visual display?
• Statistical analysis?
• Can the output be entered into Word, if
necessary?
necessary