Quantitative analysis
Sonia Williams
Northern College of Acupuncture
19th February 2011

Numbers, numbers……

Measurable values
• Height, weight, age
• Can calculate:
  – Average/mean
  – Median
  – Mode

Parametric statistics
• Continuous variables: height, weight, age expressed in exact terms, e.g. 1.67 m; 71.5 kg; 25.5 years.
• Non-continuous variables: height, weight, age expressed in groupings, e.g. 1.5-1.7 m; 70-75 kg; 20 to <25 years.

Distribution curve: height, weight, IQ, etc.
• Continuous variable

Comparing 2 groups
• E.g. shoe sizes men/women?
• Is there a statistically significant difference between them?
• Parametric stats, e.g. t-tests
• Comparing means and standard deviations

Other uses of numbers in quantitative data……

Categorical data
• E.g. gender
• Yes/no answers

Presenting categorical data
• 4 categories
• Visually presented

Comparing categorical data

          Acup-   Acup+   Total
Male        30      10      40
Female      30      30      60
Total       60      40     100

Sample size = 100

Comparing…….
• 40 males & 60 females
• 40 had received acupuncture while 60 had not.
• Was there a significant difference in the proportion of males & females receiving acupuncture?
• Chi squared test used
• ANSWER=? Ask SPSS

Probability values (P)
• Probability of heads OR tails = 1 in 2 or 50% (or 0.5)
• Probability of 2 consecutive heads = 1 in 2 AND 1 in 2 = 1 in 4 or 25% (or 0.25)
• How many times would you need to get consecutive tails to reach a probability value less than 0.05?
• P<0.05 is conventionally treated as statistically significant.
• If chance alone were at work, a result like this would turn up less than 5% of the time, or fewer than 1 in 20
• P<0.01 is 1% or 1 in 100
• P<0.001 is 0.1% or 1 in 1000

Sources of error in statistics
• Assuming that an association is the same as causation.
• The link may be spurious
• There may be a confounding variable

Sources of error in statistics: which one will be true?
• Type 1 error (false positive): the one you thought was true was not
• Type 2 error (false negative): the one you thought would not be true was

Data entry: hardware?
• Punch card machine

Data analysis
Life is easier now & less noisy!
• SPSS
• Comprehensive set of flexible tools that can be used to accomplish a wide variety of data analysis tasks.
• Data collection instrument
• Data analysis
• Graphic presentations
• Statistical analysis

Creating datasets
• What experimental design?
• Which variables?
• What values do these variables assume?
• How can the data be coded to make data entry easier?
• Devise a code book to help you
• Make sure you ‘clean’ the data, as errors in data entry can occur (10% check + frequency check)

Choose appropriate scales & measures

Questionnaires
• Closed questions: easy to code, but inflexible
• Semi-structured questions: harder to code, but more flexible
  – May need to add to dataset as ‘unexpected answers’ become apparent
• Open-ended questions: bit of a nightmare; need to go through & document all possible answers before devising a suitable coding system

Questionnaires: try to avoid…
• Long complex questions
• Double negatives
• Double-barrelled questions
• Jargon or abbreviations
• Culture-specific terms
• Words with double meanings
• Leading questions
• Emotionally loaded words

Developing a codebook
• Decide how you will go about:
  – Defining and labelling each of the variables
  – Assigning numbers to each of the possible responses
  – Each question or section of a question must have a variable name, which must be unique, must begin with a letter, and cannot include punctuation

Data entry: issues to consider
Variables:
• Categorical
• Continuous/discrete
• How to deal with multiple responses (where more than one response may be given to a single question)

Outcomes?
• Frequencies?
• Cross tabulations?
• Visual display?
• Statistical analysis?
• Is the output amenable to entry into Word, if necessary?
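
Worked example: the two questions the slides leave open (the chi squared test on the male/female acupuncture table, and the number of consecutive tails needed to get below 0.05) can be checked by hand or with a few lines of code. The sketch below is only an illustration in Python, not the SPSS procedure the slides refer to. The function names `chi_squared_2x2` and `flips_needed` are made up for this example, and the test shown is the plain Pearson chi squared with no continuity correction, which is an assumption; SPSS also reports a continuity-corrected figure for 2x2 tables, and that figure will differ.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi squared statistic and p-value for a 2x2 table of counts.

    Assumption: no continuity (Yates) correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if sex and acupuncture were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom; for 1 df the upper-tail
    # probability of chi squared is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def flips_needed(threshold=0.05, p_single=0.5):
    """Smallest run of identical outcomes whose probability is below threshold."""
    n = 1
    while p_single ** n >= threshold:
        n += 1
    return n


# Rows: male, female. Columns: Acup-, Acup+ (counts from the slide).
observed = [[30, 10],
            [30, 30]]

chi2, p_value = chi_squared_2x2(observed)
print(f"chi squared = {chi2:.2f}, p = {p_value:.4f}")
print("consecutive tails needed:", flips_needed())
```

Under these assumptions the statistic comes out at 6.25 with P of about 0.012, which is below 0.05 but not below 0.01, and five consecutive tails are needed (0.5 to the power 5 is about 0.031, while four tails gives about 0.063).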