Section 3.4
Measures of Position
Standard Score
• A z score or standard score for a value is
obtained by subtracting the mean from the
value and dividing the result by the
standard deviation.
– The symbol for a standard score is z.
– The z score represents the number of
standard deviations that a data value falls
above or below the mean.
For a sample: $z = \frac{X - \bar{X}}{s}$
For a population: $z = \frac{X - \mu}{\sigma}$
SAT Scores
The following data represent a sample of the SAT
scores.
798 789 675 669 658 647 642 639 637 635
631 624 624 616 608 606 605 601 599 597
597 595 593 589 589 577 574 570 568 555
555 554 554 552 552 550 549 547 544 539
534 532 530 530 527 527 521 519 518 513
512 511 509 508 499 498 498 497 488 488
482 480 479 477 476 473 470 467 464 461
452 452 451 448 447 445 443 442 418 417
414 414 413 413 409 406 404 403 401 395
394 389 382 381 364 360 351 348 334 319
The mean of the data is 513.0 and the standard deviation is
94.7.
Example
Which is more unusual:
1. A student who scored 75 on a statistics
test that had a mean of 65 and a standard
deviation of 6, or
2. A person who scored 351 on the SAT?
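One way to work this example is to convert both scores to z scores and compare their distances from 0. The sketch below assumes that "more unusual" means the larger absolute z score, and it uses the SAT mean (513.0) and standard deviation (94.7) stated above. The function name z_score is only illustrative.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


# Statistics test: score 75, mean 65, standard deviation 6
z_test = z_score(75, 65, 6)

# SAT: score 351, mean 513.0, standard deviation 94.7 (from the sample above)
z_sat = z_score(351, 513.0, 94.7)

# Assumption: the score farther from the mean in standard-deviation units
# (larger absolute z) is treated as the more unusual one.
more_unusual = "statistics test score" if abs(z_test) > abs(z_sat) else "SAT score"

print(f"z (statistics test) = {z_test:.2f}")
print(f"z (SAT)             = {z_sat:.2f}")
print("More unusual:", more_unusual)
```

Under that reading, the SAT score is slightly farther from its mean (about 1.71 standard deviations below) than the test score is from its mean (about 1.67 standard deviations above).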
z Facts
• When all data for a variable are transformed into
z scores, the resulting distribution will have a
mean of 0 and a standard deviation of 1.
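This fact can be checked numerically. The sketch below standardizes a small data set and recomputes the mean and standard deviation of the resulting z scores. The values are the Saturn Ion MPG data used elsewhere in this section, chosen only for convenience, and the sample standard deviation (statistics.stdev) is an assumption about which formula applies.

```python
from statistics import mean, stdev

# City MPG values for the Saturn Ion, as listed in this section
mpg = [17, 18, 18, 19, 20, 22, 23, 23, 24, 24, 25, 26]

x_bar = mean(mpg)   # sample mean
s = stdev(mpg)      # sample standard deviation (assumed here)

# Transform every value into its z score
z_scores = [(x - x_bar) / s for x in mpg]

z_mean = mean(z_scores)
z_std = stdev(z_scores)

print(f"mean of z scores: {z_mean:.6f}")
print(f"standard deviation of z scores: {z_std:.6f}")
```

Floating-point arithmetic may leave the mean a tiny distance from exactly 0, so any check should allow a small tolerance.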
Percentiles
• Percentiles divide the data set into 100
equal groups.
Finding the Value
• The formula for finding a value in a data
set that corresponds to a given percentile
is
$i = \frac{k}{100}(n + 1)$
– where n is the total number of values and k is
the percentile.
Example
• For the data set of the city miles per gallon
rating for the Saturn Ion, find the value
corresponding to the 25th percentile and
the 60th percentile.
• The data in ascending value:
17 18 18 19 20 22
23 23 24 24 25 26
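The position formula can be applied to this example as sketched below. The slides do not say what to do when the position i is not a whole number. This sketch assumes linear interpolation between the two neighbouring values, which is one common convention and not necessarily the one intended here. The function names are illustrative.

```python
def percentile_position(k, n):
    """Position i = (k / 100) * (n + 1), counted from 1 in the sorted data."""
    return k * (n + 1) / 100


def value_at_percentile(sorted_data, k):
    """Value at the k-th percentile of data already sorted in ascending order.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring data values.
    """
    n = len(sorted_data)
    i = percentile_position(k, n)
    whole = int(i)      # 1-based position at or just below i
    frac = i - whole    # fractional part of the position
    if whole < 1:
        return sorted_data[0]
    if whole >= n:
        return sorted_data[-1]
    low = sorted_data[whole - 1]
    high = sorted_data[whole]
    return low + frac * (high - low)


mpg = [17, 18, 18, 19, 20, 22, 23, 23, 24, 24, 25, 26]

pos25 = percentile_position(25, len(mpg))
pos60 = percentile_position(60, len(mpg))
p25 = value_at_percentile(mpg, 25)
p60 = value_at_percentile(mpg, 60)

print(f"25th percentile: position {pos25}, value {p25}")
print(f"60th percentile: position {pos60}, value {p60}")
```

With n = 12 the positions are 3.25 and 7.8. Under the interpolation assumption the 25th percentile lies a quarter of the way from the 3rd value (18) to the 4th value (19), and the 60th percentile lies between the 7th and 8th values, which are both 23.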
Percentile Formula
The percentile corresponding to a given
value x is computed by using the following
formula:
$\text{Percentile} = \frac{\text{number of values below } x}{\text{total number of values}} \times 100$
Round this number to the nearest integer.
Example
• For the data set of the city miles per gallon
rating for the Saturn Ion, find the percentile
rank of a car that had 24 MPG.
• The data in ascending value:
17 18 18 19 20 22
23 23 24 24 25 26
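The percentile formula above can be sketched in a few lines. The function name percentile_rank is illustrative, and the code follows the formula exactly as stated in this section (values strictly below x, divided by the total count).

```python
def percentile_rank(data, x):
    """Percentile of x: (number of values below x) / (total number of values) * 100,
    rounded to the nearest integer."""
    below = sum(1 for value in data if value < x)
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    return round(below / len(data) * 100)


mpg = [17, 18, 18, 19, 20, 22, 23, 23, 24, 24, 25, 26]

rank_24 = percentile_rank(mpg, 24)
print("Percentile rank of 24 MPG:", rank_24)
```

Eight of the twelve values fall below 24, so the formula gives 8/12 × 100 ≈ 66.7, which rounds to 67.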
Quartile
• Quartiles divide the data set into four
equal groups.
• The four groups are separated by Q1, Q2,
and Q3.
• Note that Q1 is the same as the 25th
percentile, Q2 is the same as the 50th
percentile, and Q3 is the same as the 75th
percentile.
Finding the Quartiles
Step 1: Arrange the data in order from
lowest to highest.
Step 2: Find the median of the data values.
This is Q2.
Step 3: Find the median of the data values
that fall below Q2. This is Q1.
Step 4: Find the median of the data values
that fall above Q2. This is Q3.
Example
• Find the quartiles for the items on the
Taco Bell menu:
• The data in ascending order:
180 290 290 340 430 450 470 540 600
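The four steps can be sketched as follows. This is a minimal illustration that assumes "the data values that fall below Q2" means the lower half of the sorted list by position, leaving out the middle value when the number of values is odd. The function name quartiles is illustrative.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves steps."""
    ordered = sorted(data)                 # Step 1: order the data
    n = len(ordered)
    q2 = median(ordered)                   # Step 2: the median is Q2
    lower_half = ordered[: n // 2]         # values below Q2 (by position)
    upper_half = ordered[(n + 1) // 2 :]   # values above Q2 (by position)
    q1 = median(lower_half)                # Step 3
    q3 = median(upper_half)                # Step 4
    return q1, q2, q3


menu = [180, 290, 290, 340, 430, 450, 470, 540, 600]
q1, q2, q3 = quartiles(menu)
print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3)
```

For these nine values the median is the 5th value (430), the lower half is 180, 290, 290, 340, and the upper half is 450, 470, 540, 600.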
Interquartile Range
• The interquartile range (IQR) is defined
as the difference between Q3 and Q1: IQR = Q3 − Q1.
• It is the range of the middle 50% of the
data.
– The interquartile range is used to determine
outliers.
– It is used as a measure of variability in
exploratory data analysis.
Deciles
• Deciles divide the data set into 10 equal
groups.
• Deciles are denoted D1, D2, D3, …, D9.
• The deciles correspond to the percentiles
P10, P20, P30, …, P90.
Outliers
• An outlier is an extremely high or
extremely low data value when compared
to the rest of the data values.
Identifying Outliers
Step 1: Arrange the data in order and find Q1 and
Q3.
Step 2: Find the interquartile range.
Step 3: Multiply the IQR by 1.5.
Step 4: Subtract the value obtained in step 3 from
Q1 and add the value to Q3.
Step 5: Check the data set for any data value that
is smaller than Q1 − 1.5(IQR) or larger
than Q3 + 1.5(IQR).
Example
• Check the following data set for outliers.
2 23 29 31 37 41 43 71
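The outlier steps can be sketched as below. Two assumptions are made: the data set is read as the eight values 2, 23, 29, 31, 37, 41, 43, 71, and Q1 and Q3 are found as medians of the lower and upper halves by position. The function name find_outliers is illustrative.

```python
from statistics import median


def find_outliers(data):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR rule."""
    ordered = sorted(data)                   # Step 1: order the data
    n = len(ordered)
    q1 = median(ordered[: n // 2])           # median of the lower half
    q3 = median(ordered[(n + 1) // 2 :])     # median of the upper half
    iqr = q3 - q1                            # Step 2: interquartile range
    step = 1.5 * iqr                         # Step 3
    lower_fence = q1 - step                  # Step 4
    upper_fence = q3 + step
    # Step 5: flag values strictly smaller than the lower fence
    # or strictly larger than the upper fence
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers


# Assumption: the example data are these eight values
values = [2, 23, 29, 31, 37, 41, 43, 71]
lower_fence, upper_fence, outliers = find_outliers(values)
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
```

Under these assumptions Q1 = 26 and Q3 = 42, so the IQR is 16 and the fences are 2 and 66. The value 71 lies above the upper fence. The value 2 sits exactly on the lower fence, so it is not flagged by a strict "smaller than" comparison.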