ADNAN MENDERES UNIVERSITY
Faculty of Engineering
Measures of central tendency and variability
MAT 254 Probability & Statistics, Week #3
Olcay ÜZENGİ AKTÜRK, Assoc. Prof. Dr.

Data Types
• Primary
  • Ours
• Secondary
  • Not ours
• Qualitative data (words)
  • Blue, short
• Quantitative data (numbers)
  • 1, 2.5, 10000
  • Discrete (countable)
    • 1 car, 206 students
  • Continuous (measurable)
    • 165 cm, 52.5 kg

Sampling Techniques
• Probability Sampling (every member of the population has an equal chance)
  • Simple Random Sampling (lottery)
  • Systematic Sampling (every 4th sample)
  • Stratified Sampling (from each area)
  • Cluster or Area Sampling (form clusters)
  • Multi-stage Sampling (multi-stage)
• Non-probability Sampling (samples are selected based on an inclusion rule)

Presentation of Data

Frequency table:
Age    Frequency
12     2
13     13
14     27
15     4

Pie chart: Percentage of causes of child death in Egypt
Cause             Percentage
Diarrhea          50%
Chest infection   30%
Congenital        10%
Accident          10%

Stem-and-leaf display:
Stem   Leaves
1      7,8
2      0,3,3,4,5,6,7,8,9
3      4,4,5,5,7,8,8,8,8,9,9,9
4      2,3,4,4,5,6,6,8,9
5      0,0,0

Other Graphical Methods

Box Plot (Box and Whisker)
Labelled parts: lowest value, lower quartile, median, upper quartile, highest value, inter-quartile range, range.

Scatter Plot
[Figure: Correlation between Doppler velocimetry (RI) and baby birth weight. Vertical axis: RI; horizontal axis: baby weight in kg.]

How can you represent a huge amount of data (numbers) by using only one (or two) number(s)?
Minimum of them? Maximum of them? Average of them? ????

"CENTRAL TENDENCY"
• In statistics, a measure of CENTRAL TENDENCY is a single value that attempts to describe a set of data by identifying the central position within that set of data. As such, measures of central tendency are sometimes called measures of central location.
• The most commonly used statistics for measuring the center of a set of data, arranged in order of magnitude, are the mean, median, and mode.
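As a minimal sketch of the three measures just named, the stem-and-leaf display shown earlier can be expanded back into raw values and summarised with Python's standard `statistics` module. The name `stem_leaf` and the reading of each stem as a tens digit are assumptions made for illustration; the slides do not state the units of these data.

```python
import statistics

# Stem-and-leaf display from the slide; stems are ASSUMED to be tens digits.
stem_leaf = {
    1: [7, 8],
    2: [0, 3, 3, 4, 5, 6, 7, 8, 9],
    3: [4, 4, 5, 5, 7, 8, 8, 8, 8, 9, 9, 9],
    4: [2, 3, 4, 4, 5, 6, 6, 8, 9],
    5: [0, 0, 0],
}

# Rebuild the raw observations: stem 3 with leaf 4 becomes 34.
values = sorted(
    10 * stem + leaf
    for stem, leaves in stem_leaf.items()
    for leaf in leaves
)

n = len(values)
mean = statistics.mean(values)      # sum of all values divided by n
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequently occurring value

print(f"n = {n}, lowest = {values[0]}, highest = {values[-1]}")
print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```

With 35 observations the median is the 18th sorted value, and the mode is the leaf that repeats most often within a single stem.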
Mean (average)    Median (middle)    Mode (most)

Sample Mean \(\bar{x}\)
The mean (arithmetic mean or average) of a set of data is found by adding up all the items and then dividing by the number of items. The mean of a sample is denoted by \(\bar{x}\) (read "x bar").
Data: 2, 3, 7, 7
\(\bar{x} = \frac{2 + 3 + 7 + 7}{4} = \frac{19}{4} = 4.75\)

Sample Mean \(\bar{x}\)
Staff    1    2    3    4    5    6    7    8    9      10
Salary   1$   1$   1$   1$   2$   2$   2$   2$   100$   100$
\(\bar{x} = \frac{1 + 1 + 1 + 1 + 2 + 2 + 2 + 2 + 100 + 100}{10} = \frac{212}{10} = 21.2\,\$\)

Trimmed Mean \(\bar{x}_{tr(\cdot)}\)
A trimmed mean is computed by "trimming away" a certain percentage of the largest and the smallest values. For example, the 10% trimmed mean is found by eliminating the largest 10% and the smallest 10% of the values and computing the average of the remaining values.
For the salary data above:
\(\bar{x}_{tr(10)} = \frac{1 + 1 + 1 + 2 + 2 + 2 + 2 + 100}{8} = \frac{111}{8} \approx 13.88\,\$\)
\(\bar{x}_{tr(20)} = \frac{1 + 1 + 2 + 2 + 2 + 2}{6} = \frac{10}{6} \approx 1.67\,\$\)

Sample Median \(\tilde{x}\)
Data: 2, 3, 7, 7 (even number of values; the middle two are 3 and 7)
\(\tilde{x} = \frac{3 + 7}{2} = 5\)
Data: 2, 2, 3, 7, 7 (odd number of values; the middle one is 3)
\(\tilde{x} = 3\)

Sample Mode
The most frequently occurring value among the observations.
Data: 2, 3, 7, 7         mode: 7          (unimodal)
Data: 2, 2, 3, 7, 7      modes: 2 and 7   (bimodal)

Sample #1: 1, 51, 101, 151, 201       \(\bar{x} = 101\); \(\tilde{x} = 101\); no mode!
Sample #2: 99, 100, 101, 102, 103     \(\bar{x} = 101\); \(\tilde{x} = 101\); no mode!
Measures of spread or variability ????

Sample Range
The difference between the lowest and the highest value of the sample.
Data: 2, 3, 7, 7
The range is \(7 - 2 = 5\).

Variance & Standard Deviation
\(s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2\), \(s = \sqrt{s^2}\)

Data: 1, 3, 5, 11
\(\bar{x} = \frac{1 + 3 + 5 + 11}{4} = 5\)
Variance:
\(s^2 = \frac{(1-5)^2 + (3-5)^2 + (5-5)^2 + (11-5)^2}{4-1}\)
\(s^2 = \frac{(-4)^2 + (-2)^2 + (0)^2 + (6)^2}{3}\)
\(s^2 = \frac{16 + 4 + 0 + 36}{3} = \frac{56}{3} \approx 18.667\)
Standard deviation:
\(s = \sqrt{18.667} \approx 4.32\)

Population vs. Sample
Commonly used Symbols for a Sample and for a Population.

Example from textbook

END of LECTURE #3
MAT254-02 Probability & Statistics
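The worked examples in this lecture can be checked with a short script. The following is a minimal sketch using only Python's standard `statistics` module; the helper `trimmed_mean` and the variable names are hypothetical, and the helper assumes that the number of values cut from each end is the trimming percentage of the sample size, rounded down. For the salary data, the eight values left after 10% trimming sum to 111, so the sketch reports 111/8 = 13.875.

```python
import statistics


def trimmed_mean(data, percent):
    """Mean after dropping `percent`% of the values from EACH end (hypothetical helper)."""
    ordered = sorted(data)
    k = int(len(ordered) * percent / 100)  # values cut per end, rounded down
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)


small = [2, 3, 7, 7]                            # mean / median / range example
salaries = [1, 1, 1, 1, 2, 2, 2, 2, 100, 100]   # salary example, in $
spread = [1, 3, 5, 11]                          # variance example

mean_small = statistics.mean(small)
mean_salary = statistics.mean(salaries)
tr10 = trimmed_mean(salaries, 10)   # drops one 1 and one 100
tr20 = trimmed_mean(salaries, 20)   # drops two 1s and both 100s
median_small = statistics.median(small)
modes_bimodal = statistics.multimode([2, 2, 3, 7, 7])
sample_range = max(small) - min(small)
s2 = statistics.variance(spread)    # sample variance, divides by n - 1
s = statistics.stdev(spread)        # square root of the sample variance

print(f"mean of {small}: {mean_small}")
print(f"mean salary: {mean_salary}, 10% trimmed: {tr10}, 20% trimmed: {tr20:.2f}")
print(f"median: {median_small}, modes: {modes_bimodal}, range: {sample_range}")
print(f"variance: {s2:.3f}, standard deviation: {s:.2f}")
```

For a sample with no repeated value, such as 1, 51, 101, 151, 201, `statistics.multimode` returns every value, which corresponds to the "no mode" case on the slide. Note that `statistics.variance` and `statistics.stdev` use the sample formulas with n - 1 in the denominator; the population versions are `statistics.pvariance` and `statistics.pstdev`.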