Measures of Central Tendency: The Average

The Aim
By the end of this lecture, students will know the measures of central tendency and be able to calculate them using SPSS.

The Goals
• Be able to list the measures of central tendency.
• Be able to calculate the measures of central tendency using SPSS.
• Be able to draw a histogram using SPSS and evaluate the shape of the distribution.
• Be able to explain the weighted average.
• Be able to apply weighting using SPSS.

1- We cannot get an idea of the data as a whole unless we summarize the data we have obtained in some way.
2- A graphical display can be a good way to summarize.
3- We can get a general idea by calculating specific characteristics of the data. We can find a value that represents our data and calculate how the observations are distributed around this value.

The measures of central tendency:
• Arithmetic mean
• Median
• Mode
• Geometric mean
• Weighted mean

Arithmetic Mean
The arithmetic mean is indicated by a line above x in the formula:
\[ \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} \]
Using the Greek capital sigma (the summation sign), the formula is written as follows:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
or
\[ \bar{x} = \frac{\sum x}{n} \]

Person No      1    2    3    4    5    6    7    8    9
Age (years)   10   13   15   16   16   18   19   20   20

The arithmetic mean of the ages in the data set above is (10+13+15+16+16+18+19+20+20)/9 = 16.33.

Median
• When we sort the data by size, the middle value is called the "median".
• If the number of observations is even, the average of the two middle values is taken.

Person No      1    2    3    4    5    6    7    8    9
Age (years)   10   13   15   16   16   18   19   20   20

Here the middle (5th) value is 16.

Mode
• The mode is not a frequently used measure of central tendency.
• The value that occurs most often in the data set is called the "mode".
• A data set can have more than one mode.
• If each value occurs only once, there is no mode.

Person No      1    2    3    4    5    6    7    8    9
Age (years)   10   13   15   16   16   18   19   20   20

Here 16 and 20 each occur twice, so the data set has two modes.

Geometric Mean
1- When our data are skewed, using the arithmetic mean is not appropriate.
2- When the data are skewed to the right (the tail of the bell curve points to the right on a histogram), taking the logarithm of each value (base 10 or base e) may give a new data set that is symmetrical.
3- We can take the arithmetic mean of these logarithm values.

• To return to the original unit, a back-transformation (antilog) is required.
• The resulting value is called the geometric mean.
• In general, the geometric mean is close to the median and smaller than the arithmetic mean.

www.aile.net/agep/istat/08_09/diyabet.sav
• When we examine the "Weight" variable in the data set, we see in the histogram that the tail of the bell curve points to the right (right skewed).
• Now let us take the logarithm of "Weight" to create a new variable.

• Transform > Compute Variable > [Write "logWeight" in the "Target Variable" field and "LG10(Weight)" in the "Numeric Expression" field] > OK
• A new variable named "logWeight" will appear in our SPSS data set. Now let us look at the histogram of this variable:
• Graphs > Interactive > Histogram [Drag the "logWeight" variable onto the X-axis. Then click the "Histogram" tab and tick the "Normal curve" box] > OK

[Figure: histogram of logWeight with normal curve; x-axis: LogWeight (1.60 to 2.20), y-axis: Count]

• Let us calculate the arithmetic mean of our "Weight" variable:
• Analyze > Descriptive Statistics > Descriptives [Drag the "Weight" variable into the "Variable(s)" area] > OK

Descriptive Statistics
                       N     Minimum   Maximum   Mean     Std. Deviation
Weight                 424   33.0      160.0     74.266   15.1381
Valid N (listwise)     424

Now let us take the arithmetic mean of the "logWeight" variable:
Analyze > Descriptive Statistics > Descriptives [Drag the "logWeight" variable into the "Variable(s)" area] > OK

Descriptive Statistics
                       N     Minimum   Maximum   Mean     Std. Deviation
LogWeight              424   1.52      2.20      1.8625   .08428
Valid N (listwise)     424

To interpret the value clinically, we need to convert it back to the original unit of the "Weight" variable. We must take the antilogarithm of 1.862: Antilog(1.862) = 10^1.862 = 72.777 kg. This is the geometric mean of "Weight".

To calculate the median and mode of the "Weight" variable with SPSS:
Analyze > Descriptive Statistics > Frequencies [Drag the "Weight" variable into the "Variable(s)" area. Click the "Statistics" button. Under "Central Tendency", tick the "Median" and "Mode" boxes] > Continue > OK

Weighted Mean
• We use the weighted mean when some values of a variable are more important than others.
• We assign a coefficient (weight) to each value in our sample. We multiply each value by its coefficient and add up the products. Then we divide by the sum of the coefficients.

• Example: we examine the number of daily discharges at the hospitals in our city.
• Suppose that our variable is: "How many patients were discharged from your hospital today?" The data obtained for 3 hospitals in our province are as follows:

                       Hospital 1   Hospital 2   Hospital 3
Discharged patients    20           5            50

We see that the third hospital discharged the most patients. The mean number of discharges is 25. We cannot get an idea of the hospitals' workload without knowing their bed capacity.

Suppose that the bed capacities are as follows:

                       Hospital 1   Hospital 2   Hospital 3
Discharged patients    20           5            50
Bed capacity           50           50           400

We can get a better idea if we weight the number of discharges by bed capacity. Let us apply the formula: (20x50 + 5x50 + 50x400)/(50+50+400) = 42.5 discharges.

                       Hospital 1   Hospital 2   Hospital 3   Mean
Discharged patients    20           5            50           25
Bed capacity           50           50           400          166.6
Weighted discharges    66.6*        16.6         20.8
* 20 x 166.6 / 50

As a result, we observe that the weighted average (42.5 patients) is much higher than what appears at first (25 patients), and we see that hospital 1 works at the highest intensity.
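The hospital calculation above can also be sketched outside SPSS. The following is a minimal Python illustration, not part of the original lecture; the function name `weighted_mean` and the variable names are chosen here for the example only.

```python
# Discharges and bed capacities for the three hospitals in the example.
discharged = [20, 5, 50]
bed_capacity = [50, 50, 400]


def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Unweighted mean: (20 + 5 + 50) / 3
plain_mean = sum(discharged) / len(discharged)

# Weighted by bed capacity: (20*50 + 5*50 + 50*400) / (50 + 50 + 400)
capacity_weighted = weighted_mean(discharged, bed_capacity)

print(plain_mean)         # 25.0
print(capacity_weighted)  # 42.5
```

With equal weights the function returns the ordinary arithmetic mean, which is one way to check that the weighting is applied as intended.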
• We know that the number of children is affected by age: as age increases, people tend to have more children.
www.aile.net/agep/istat/08_09/diyabet.sav
• Let us weight the "children" (number of children) variable by age.
• Before weighting, note that the arithmetic mean of the "children" variable is 6.38.

• Data > Weight Cases > Weight cases by > [Drag the "Age" variable into the "Frequency Variable" area] > OK
• When we now take the arithmetic mean of the number of children ("children"), we will see that it is 6.61.
• This process is also called "correcting the number of children by age".
• In international statistics, figures such as mortality rates are reported with correction (weighting) according to population or other variables.

Comparison of the mean types

Arithmetic mean
  Positive:
  - All values are used.
  - It is defined algebraically and can be used mathematically.
  - Its sampling distribution is known.
  Negative:
  - It is affected by outliers.
  - It is affected by skewed data (see section: data conversion).

Median
  Positive:
  - It is not affected by extreme values.
  - It is not affected by skewed data.
  Negative:
  - A large part of the information is ignored.
  - It is not defined algebraically.
  - It is affected by the sampling distribution.

Mode
  Positive:
  - It can easily be detected for categorical data.
  Negative:
  - A large part of the information is ignored.
  - It is not defined algebraically.
  - Its sampling distribution is unknown.

Geometric mean
  Positive:
  - Before back-transformation, it has the same advantages as the arithmetic mean.
  - It is suitable for right-skewed data.
  Negative:
  - It only works if the log transformation produces a symmetrical distribution.

Weighted mean
  Positive:
  - It has the same advantages as the arithmetic mean.
  - It takes into account the relative importance of each observation.
  - It is defined algebraically.
  Negative:
  - The weights must be known or calculated.

Summary
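As a recap, the measures covered in this lecture can be sketched on the nine ages from the earlier example. This is a minimal Python illustration using the standard `statistics` module (Python 3.8 or later is assumed); it is not part of the original SPSS workflow, and the variable names are chosen here for the example only.

```python
import math
import statistics

# The nine ages from the worked example.
ages = [10, 13, 15, 16, 16, 18, 19, 20, 20]

arithmetic_mean = statistics.mean(ages)   # 147 / 9
median = statistics.median(ages)          # middle (5th) value of the sorted data
modes = statistics.multimode(ages)        # every value tied for most frequent

# Geometric mean by the lecture's procedure:
# take base-10 logs, average them, then take the antilog.
mean_of_logs = statistics.mean(math.log10(a) for a in ages)
geometric_mean = 10 ** mean_of_logs

print(round(arithmetic_mean, 2))  # 16.33
print(median)                     # 16
print(modes)                      # [16, 20]

# The log-based result agrees with the library's own geometric mean,
# and it is smaller than the arithmetic mean for these data.
print(math.isclose(geometric_mean, statistics.geometric_mean(ages)))  # True
print(geometric_mean < arithmetic_mean)                               # True
```

The same antilog step applies to the SPSS output: `10 ** 1.862` converts the mean of "logWeight" back to kilograms.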