Statistics
The collection, evaluation, and interpretation of data
Statistics
Descriptive Statistics
Describe collected data
Inferential Statistics
Generalize and evaluate a population based on sample data
Data
Categorical or Qualitative Data
Values that possess names or labels
Color of M&Ms, breed of dog, etc.
Numerical or Quantitative Data
Values that represent a measurable quantity
Population, number of M&Ms, number of defective parts, etc.
Data Collection
Sampling
Random
Systematic
Stratified
Cluster
Convenience
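As a rough illustration of three of the sampling methods listed above, the following Python sketch draws samples from a made-up population of 20 part numbers. The function names, the population, and the two strata (`shift_a`, `shift_b`) are assumptions for illustration only and do not come from the slides.

```python
import random


def simple_random_sample(population, k, rng):
    """Random sampling: every member has the same chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, step, start=0):
    """Systematic sampling: take every `step`-th member from index `start`."""
    return population[start::step]


def stratified_sample(strata, k_per_stratum, rng):
    """Stratified sampling: draw a random sample from each subgroup."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


rng = random.Random(0)          # fixed seed so the sketch is repeatable
parts = list(range(1, 21))      # hypothetical part numbers 1..20
strata = {"shift_a": parts[:10], "shift_b": parts[10:]}  # hypothetical groups

random_pick = simple_random_sample(parts, 5, rng)
systematic_pick = systematic_sample(parts, step=4)
stratified_pick = stratified_sample(strata, 2, rng)

print(random_pick)
print(systematic_pick)
print(stratified_pick)
```

In practice the starting index for a systematic sample is usually chosen at random; it is fixed at 0 here only to keep the example easy to follow.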
Graphic Data Representation
Histogram
Frequency distribution graph
Frequency Polygons
Frequency distribution graph
Bar Chart
Categorical data graph
Pie Chart
Categorical data graph showing percentages of a whole
Measures of Central Tendency
Mean $\bar{x}$
Arithmetic average
Sum of all data values divided by the number of data values within the array
$\bar{x} = \frac{\sum x}{n}$
Most frequently used measure of central tendency
Strongly influenced by outliers (very large or very small values)
Measures of Central Tendency
Determine the mean value of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
$\bar{x} = \frac{\sum x}{n}$
$\bar{x} = \frac{48 + 63 + 62 + 49 + 58 + 2 + 63 + 5 + 60 + 59 + 55}{11}$
$\bar{x} = \frac{524}{11}$
$\bar{x} \approx 47.64$
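The same arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation on the slide; the variable names are chosen here for illustration.

```python
data = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]

total = sum(data)   # sum of all data values
n = len(data)       # number of data values in the array
mean = total / n    # arithmetic average

print(total, n, round(mean, 2))  # 524 11 47.64
```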
Measures of Central Tendency
Median
Data value that divides a data array into two equal groups
Data values must be ordered from lowest to highest
Useful in situations with skewed data and outliers (e.g., wealth management)
Measures of Central Tendency
Determine the median value of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Organize the data array from lowest to highest value.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
Select the data value that splits the data set evenly.
Median = 58
What if the data array had an even number of values?
5, 48, 49, 55, 58, 59, 60, 62, 63, 63
Average the two middle values: Median = (58 + 59) / 2 = 58.5
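A short Python sketch of the median procedure, covering both the odd-length array and the even-length array from this slide. The helper name `median` is chosen here for illustration; for an even number of values it averages the two middle values, which is the usual convention.

```python
def median(values):
    """Return the middle value of the sorted data (mean of the two middle
    values when the number of values is even)."""
    ordered = sorted(values)          # order from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]
even = [5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

print(median(odd))   # 58
print(median(even))  # 58.5
```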
Measures of Central Tendency
Mode
Most frequently occurring response within a data array
Usually the highest point of the distribution curve
May not be typical
May not exist at all
Unimodal, bimodal, and multimodal
Measures of Central Tendency
Determine the mode of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Mode = 63
Determine the mode of
48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55
Mode = 63 and 59 (bimodal)
Determine the mode of
48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55
Mode = 63, 59, and 48 (multimodal)
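The three mode examples above can be reproduced with `statistics.multimode` from the Python standard library (Python 3.8 or later). The wrapper name `modes` and the descending sort are choices made here for readability.

```python
from statistics import multimode


def modes(values):
    """Return every most-frequent value, sorted from highest to lowest."""
    return sorted(multimode(values), reverse=True)


one_mode = modes([48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55])
two_modes = modes([48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55])
three_modes = modes([48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55])

print(one_mode)     # [63]
print(two_modes)    # [63, 59]
print(three_modes)  # [63, 59, 48]
```

When every value occurs exactly once, `multimode` returns all of the values, which corresponds to the case where a mode may not exist at all.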
Data Variation
Measure of data scatter
Range
Difference between the lowest and highest data value
Standard Deviation
Square root of the variance
Range
Calculate by subtracting the lowest value from the highest value.
$R = h - l$
Calculate the range for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
$R = h - l$
$R = 63 - 2$
$R = 61$
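As a quick check of the range calculation, a minimal Python sketch; the names `highest`, `lowest`, and `data_range` are chosen here for illustration.

```python
data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

highest = max(data)
lowest = min(data)
data_range = highest - lowest   # R = h - l

print(data_range)  # 61
```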
Standard Deviation
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$
$s$ is for a sample, not a population ($\sigma$ denotes the population standard deviation)
1. Calculate the mean $\bar{x}$.
2. Subtract the mean from each value and then square the difference.
3. Sum all squared differences.
4. Divide the summation by the number of values in the array minus 1.
5. Calculate the square root of the quotient.
Standard Deviation
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$
Calculate the standard deviation for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
1. $\bar{x} = \frac{\sum x}{n} = \frac{524}{11} \approx 47.64$
2. $(x - \bar{x})^2$
$(2 - 47.64)^2 = 2083.01$
$(5 - 47.64)^2 = 1818.17$
$(48 - 47.64)^2 = 0.13$
$(49 - 47.64)^2 = 1.85$
$(55 - 47.64)^2 = 54.17$
$(58 - 47.64)^2 = 107.33$
$(59 - 47.64)^2 = 129.05$
$(60 - 47.64)^2 = 152.77$
$(62 - 47.64)^2 = 206.21$
$(63 - 47.64)^2 = 235.93$
$(63 - 47.64)^2 = 235.93$
Standard Deviation
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$
Calculate the standard deviation for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
3. $\sum (x - \bar{x})^2$
2083.01 + 1818.17 + 0.13 + 1.85 + 54.17 + 107.33 + 129.05 + 152.77 + 206.21 + 235.93 + 235.93 = 5,024.55
4. $\frac{\sum (x - \bar{x})^2}{N - 1} = \frac{5{,}024.55}{10} \approx 502.46$
5. $s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}} = \sqrt{502.46} \approx 22.42$
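The five steps can be followed one by one in Python and compared with `statistics.stdev`, which also computes the sample standard deviation (dividing by N - 1). This is a sketch with variable names chosen here; because it keeps the unrounded mean, the intermediate sums differ slightly from the slide's hand-rounded values, but the final result agrees to two decimal places.

```python
import math
import statistics

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]
n = len(data)

mean = sum(data) / n                             # step 1: mean
squared_diffs = [(x - mean) ** 2 for x in data]  # step 2: squared differences
sum_sq = sum(squared_diffs)                      # step 3: sum them
variance = sum_sq / (n - 1)                      # step 4: divide by N - 1
s = math.sqrt(variance)                          # step 5: square root

print(round(s, 2))                        # 22.42
print(round(statistics.stdev(data), 2))   # 22.42
```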
Graphing Frequency Distribution
Random variable: numerical assignment of each outcome of a chance experiment
A coin is tossed 3 times. Assign the variable X to represent the number of heads occurring in the three tosses.
Toss Outcome    X Value
HHH             3
HHT             2
HTH             2
THH             2
HTT             1
THT             1
TTH             1
TTT             0
X = 1 when? HTT, THT, TTH
Graphing Frequency Distribution
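Probability: the calculated likelihood that a given value of the outcome variable will occur within an experiment
Toss Outcome    X Value
HHH             3
HHT             2
HTH             2
THH             2
HTT             1
THT             1
TTH             1
TTT             0
$P_x = \frac{F_x}{F_a}$
with $F_x$ read as the frequency of outcomes whose value is $x$ and $F_a$ as the frequency of all outcomes
x    P(x)
0    $P_0 = \frac{1}{8}$
1    $P_1 = \frac{3}{8}$
2    $P_2 = \frac{3}{8}$
3    $P_3 = \frac{1}{8}$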
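The probability table for the three coin tosses can be rebuilt by enumerating all eight outcomes and counting heads. The sketch below assumes a fair coin, so every outcome is equally likely; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes, e.g. "HHH", "HHT", ...
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# X = number of heads in each outcome; count how often each X value occurs
frequency = Counter(outcome.count("H") for outcome in outcomes)
total = len(outcomes)

# P(x) = (outcomes with value x) / (all outcomes)
distribution = {x: Fraction(frequency[x], total) for x in sorted(frequency)}

for x, p in distribution.items():
    print(x, p)
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```

The probabilities sum to 1, as they must for a complete distribution.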
Graphing Frequency Distribution
Histogram
x    P(x)
0    $P_0 = \frac{1}{8}$
1    $P_1 = \frac{3}{8}$
2    $P_2 = \frac{3}{8}$
3    $P_3 = \frac{1}{8}$
[Figure: histogram of P(x) plotted against x]
Histogram
Available airplane passenger seats one week before departure
[Figure: histogram; x-axis: open seats, y-axis: percent of the time]
What information does the histogram provide the airline carriers?
What information does the histogram provide prospective customers?