Research Methods in Psychology AS
Descriptive Statistics

There was widespread panic today, with fans fainting as top girl band ‘Central Tendency’ revealed their true values: mean, median and mode.
Top? More like pretty average.

Hi! The name’s Ave Rage and I’m a pretty MEAN character. I get really MEAN because people add up all the facts about me and then divide them by the total number. I suppose that makes me an average guy most of the time.

I’m MEAN and powerful because I make use of all the data. Those weaklings MEDIAN and MODE chuck most of it away, but I can only be used to measure data that:
1. Starts from true zero, e.g., physical quantities such as time, height, weight
2. Is on a scale of fixed units separated by equal intervals that allow us to make accurate comparisons, e.g., someone completing a memory task in 20 seconds did it twice as fast as someone taking 40 seconds
I’m also affected by extreme scores.

Extreme scores
• Time in seconds to solve a puzzle: 135, 109, 95, 121, 140
• Mean = 600 secs ÷ 5 participants = 120 secs
• Add a 6th participant, who stares at it for 8 mins (480 secs): 135, 109, 95, 121, 140, 480
• Mean = 1080 secs ÷ 6 participants = 180 secs
Get out of here, kid! You’re taking too long. You’re about to wreck my experiment!

Median
• Middle value of scores arranged in rank order
• Half the scores will lie above the median, half below it
• Unlike the MEAN, it can be used on ranked data, e.g., placing a group of people in order of position on a memory test rather than counting their actual score
• Unaffected by extremes, so we can use it on data with a skewed distribution, where results would be a bit one-sided
[Figure: graph of a skewed distribution]

Do you know where my middle is?
Odd number of scores: 2, 3, 5, 6, 7, 10, 14 median = 6 (middle value)
Even number of scores: 2, 3, 5, 6, 7, 10, 14, 15 median = (6 + 7) ÷ 2 = 6.5

Disadvantages of the median
1. Does not work well on small data sets, where changing a single score can shift it noticeably, e.g.,
   10, 12, 13, 14, 18, 19, 22, 22 median = 16
   10, 12, 13, 14, 15, 19, 22, 22 median = 14.5
2. Not as powerful as the MEAN: on ranked data we can only say that one value is higher than another, not by how much

Mode
• Most frequently occurring value in a data set, e.g., 2, 4, 6, 7, 7, 7, 10, 12 mode = 7
• Unaffected by extremes, as we’re just looking at the most common value rather than its position
• Can be used on basic data forming nominal categories. We could do a frequency count on these, e.g., number of people preferring vanilla, strawberry or chocolate ice-cream
Those mongrels are just sooooooo common

Disadvantages of the Mode
• Small changes can make a big difference, e.g.,
  1. 3, 6, 8, 9, 10, 10 mode = 10
  2. 3, 3, 6, 8, 9, 10 mode = 3
• Can be bi/multimodal, e.g., 3, 5, 8, 8, 10, 12, 16, 16, 20 has two modes, 8 and 16
So we’re too common for you, now? You can calculate the MEDIAN and MEAN if you want to waste time…we’re off to have fun!
But the MODE doesn’t tell me much. What about the rest of the data? There’s this really interesting figure…

Measures of central tendency are always accompanied by a measure of dispersion

When using the …    Use …
Mean                Standard deviation
Median              Interquartile range or range
Mode                Range

Measures of Dispersion
Describe how spread out the values in a data set are: the range, the interquartile range and the standard deviation

Range
• The difference between the highest and lowest scores in a set of data
• Quick to calculate
• Gives us a basic measure of how much the data varies
• Tells us nothing about data in the middle of a set of scores
• Affected by outlying values

Interquartile range
• This measures the spread of the middle 50% of scores
• Avoids extreme scores lying in the top 25% and bottom 25%
• Still uses only half of the available data

Standard Deviation
• Measures the variability of our data, i.e., how scores spread out in relation to the mean score
• Allows us to make statements about probability: how likely or unlikely a given value is to occur
• Most powerful measure of dispersion, as all the data is used
• Cannot be used on ranked data or on data from categories
• Data must form a normal distribution curve, as SD is affected by skewed data

[Figure: normal distribution curve with the mean at the centre and the x-axis marked from -3 SD to +3 SD. About 68% of scores lie within 1 SD of the mean, 95% within 2 SD and 99% within 3 SD.]

The AS syllabus doesn’t require you to work out SD, but you must know why we use it and what it means. However, previous students found this much easier to understand when they saw how it was calculated and how it related to data in a study, so stick with it if you can.

Formula for calculating standard deviation

$$ s = \sqrt{\frac{\sum d^2}{N - 1}} $$

I knew I should’ve done art
It’s really not so bad! You can do it!
s = the standard deviation we are trying to calculate
√ = square root
∑ = sum of (add up)
d² = the squared deviation from the mean for each value
N − 1 = number of scores, less one to allow for sampling error

The easiest way to calculate SD is to put all your data into a very simple table…
Come on, I don’t think you’re even trying, but it’s not hard.

Let’s look at some test scores on a reaction time task
85 86 94 95 96 107 108 108 109 112

Make a table like this

Test Scores   Mean   Difference (d)   Difference Squared (d²)
85
86
94
95
96
107
108
108
109
112

Calculate the mean of the data set
• 85 + 86 + 94 + 95 + 96 + 107 + 108 + 108 + 109 + 112 = 1000
• 1000 ÷ 10 (number of participants) = 100

Test Scores   Mean   Difference (d)   Difference Squared (d²)
85            100
86            100
94            100
95            100
96            100
107           100
108           100
108           100
109           100
112           100

Find the difference between each of your results and the mean score to give you column d.
Then square all the values of d to give you the next column, d².
Add up all the figures in the d² column to give you the total sum for use in the formula.

Test Scores   Mean   Difference (d)   Difference Squared (d²)
85            100    -15              225
86            100    -14              196
94            100    -6               36
95            100    -5               25
96            100    -4               16
107           100    +7               49
108           100    +8               64
108           100    +8               64
109           100    +9               81
112           100    +12              144
                                      ∑d² = 900

You now have all the figures you need to place into the equation

$$ s = \sqrt{\frac{\sum d^2}{N - 1}} $$

$$ s = \sqrt{\frac{900}{10 - 1}} $$
900 is the sum of d²; 10 is your number of participants.

$$ s = \sqrt{\frac{900}{9}} $$
Subtract 1 from your number of participants to allow for errors in sampling method, then divide the top number by the bottom number.

$$ s = \sqrt{100} $$
Find the square root of this figure.

$$ s = 10 $$
This gives you your standard deviation.

But what does it actually mean in terms of the reaction time task? I’m still very confused
The mean time taken to do the task was 100 seconds, but our SD figure shows us that individual performances typically varied from the mean by about 10 seconds. Most people would’ve taken somewhere between 90 and 110 seconds to complete the task.

Here’s an example where we can compare performance between two differing conditions in a repeated measures design

Participant Number   Control Condition (before caffeine)   Experimental Condition (after caffeine)
1                    6                                     5
2                    5                                     5
3                    7                                     4
4                    9                                     3
5                    8                                     8
6                    5                                     4
7                    6                                     5
8                    7                                     6
9                    8                                     5
10                   6                                     7
Raw scores in seconds for a reaction time task

Summary data table of reaction time scores in control (before caffeine) and experimental (after caffeine) conditions

                     Control Condition (time in secs)   Experimental Condition (time in secs)
Mean                 6.7                                5.2
Median               6.5                                5.0
Mode                 6                                  5
Standard Deviation   1.34                               1.48

You can see from the table that performance after caffeine was faster than performance before caffeine. However, the SD tells us that there was greater variation in the scores in the experimental condition. Scores in the control condition were closer to their mean, so the control mean is more representative of its data as there’s less variation in them.

SD provides us with detailed information about data spread that the range cannot.
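
If you’d like to check the sums above without a calculator, here is a minimal Python sketch. It is not part of the AS syllabus and only one possible way of doing it. The names sample_sd and summarise are made up for this illustration; the data are the puzzle times, test scores and caffeine scores from the slides. The SD follows the same steps as the table method: find each difference from the mean, square it, add the squares up, divide by N − 1, then take the square root.

```python
import math
import statistics


def sample_sd(scores):
    """Standard deviation using the table method: sqrt(sum of d^2 / (N - 1))."""
    mean = sum(scores) / len(scores)
    squared_differences = [(score - mean) ** 2 for score in scores]  # the d^2 column
    return math.sqrt(sum(squared_differences) / (len(scores) - 1))


def summarise(scores):
    """Mean, median, mode(s) and SD for one set of scores (hypothetical helper)."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "modes": statistics.multimode(scores),  # a list, so bimodal data shows both modes
        "sd": round(sample_sd(scores), 2),
    }


# Data taken from the slides
puzzle_times = [135, 109, 95, 121, 140]
test_scores = [85, 86, 94, 95, 96, 107, 108, 108, 109, 112]
control = [6, 5, 7, 9, 8, 5, 6, 7, 8, 6]          # before caffeine
experimental = [5, 5, 4, 3, 8, 4, 5, 6, 5, 7]     # after caffeine

if __name__ == "__main__":
    # One extreme score drags the mean from 120 to 180,
    # while the median only moves from 121 to 128.
    with_extreme = puzzle_times + [480]
    print(statistics.mean(puzzle_times), statistics.mean(with_extreme))
    print(statistics.median(puzzle_times), statistics.median(with_extreme))

    print(sample_sd(test_scores))   # 10.0, matching the worked example
    print(summarise(control))
    print(summarise(experimental))
```

Python’s built-in statistics.stdev also divides by N − 1, so it should agree with sample_sd; the longhand version is used here only because it mirrors the d and d² columns of the table.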