Statistics 1: Statistical Measures
Pearson: Chapter 9.1 p279 – 283; 9.2 p288 – 291
Homework: Exercise 9.1 Q10 a,b
Exercise 9.2 p306 Q2, 5, 6
Haese & Harris: Chapter 20 p501 – 517, 521 - 525
Homework: Ex 20A Q4 Ex 20B.1 Q 5, 9, 12
Ex 20B.2 Q3 Ex 20B.3 Q2
Ex 20D Q1, 2
Basic Definitions
Using the GDC: (see “using the GDC” on Moodle for more help with Ti84 and TiNspire)
Discrete Data:
Press STAT and then 1:Edit;
Enter values into L1 (frequencies in L2)
Press STAT > CALC, 1:1-var stats
List = L1; FreqList is either blank or L2,
mean, median, standard deviation,
Q1, Q3, max, min are displayed
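To check the GDC output by hand, the 1-Var Stats calculation for a frequency table can be sketched in Python. This is only a rough sketch: the function name one_var_stats is hypothetical, the values and frequencies are made-up illustration data (not taken from these notes), and only some of the displayed statistics are reproduced.

```python
import statistics

def one_var_stats(values, freqs=None):
    """Rough equivalent of 1-Var Stats with List = values, FreqList = freqs."""
    if freqs is None:
        # A blank FreqList means every value counts once.
        freqs = [1] * len(values)
    # Expand the frequency table into the full list of data values.
    data = []
    for v, f in zip(values, freqs):
        data.extend([v] * f)
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "pop_sd": statistics.pstdev(data),  # population standard deviation
        "min": min(data),
        "max": max(data),
    }

# Made-up data: the value 1 occurs 2 times, 2 occurs 3 times, 3 occurs 5 times.
stats = one_var_stats([1, 2, 3], [2, 3, 5])
print(stats)
```

Leaving out the second argument corresponds to leaving FreqList blank on the GDC.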
Grouped Data:
Press STAT and then 1:Edit;
Enter midpoint values into L1 (frequencies in L2)
Press STAT > CALC, 1:1-var stats
List = L1; FreqList = L2,
the mean is an “estimate”; the median & quartiles will be incorrect
Measures of Central Tendency
• In a frequency table, the mean (x̄) is the sum of (frequency × data value) ÷ total frequency.
• For grouped data, the sum of (midpoint of the interval × frequency) ÷ total frequency gives an estimate of the mean. (Do not use your GDC to calculate the median of grouped data.)
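The midpoint method for grouped data can be sketched in Python as follows. The function name grouped_mean_estimate is hypothetical, and the class intervals and frequencies are made-up illustration data, not taken from these notes.

```python
def grouped_mean_estimate(intervals, freqs):
    """Estimate the mean of grouped data using the midpoint of each interval."""
    midpoints = [(lo + hi) / 2 for lo, hi in intervals]
    total = sum(freqs)  # total number of data values
    weighted_sum = sum(m * f for m, f in zip(midpoints, freqs))
    return weighted_sum / total

# Made-up grouped data: class intervals and their frequencies.
intervals = [(0, 10), (10, 20), (20, 30)]
freqs = [4, 10, 6]
estimate = grouped_mean_estimate(intervals, freqs)
print(estimate)
```

The result is only an estimate because every value in an interval is treated as if it were equal to the midpoint.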
Measures of Central Tendency
1. Calculate the mean, mode and median of this data.
2. The table shows the scores of competitors in a competition.
The mean score is 34. Find the value of k.
3.
Histograms
Histograms are a frequency graph for grouped (usually continuous) data.
There are no gaps between the bars. The data can be entered into the GDC by using the midpoints
of each interval for “x”.
4. The histogram shows the house prices in thousands of Australian dollars (AUD) of a random sample of houses in a certain town in Australia.
[Histogram: Frequency (0 to 45, in steps of 5) against house price in thousands of dollars (0 to 300, in steps of 60)]
(a) How many houses are there in the sample?
(b) Write down the modal group for house prices.
(c) Calculate an estimate for the mean.
(d) Find the probability of choosing a house at random which costs less than 60 000 AUD or more than 240 000 AUD.
(e) Given that a house costs more than 120 000 AUD, find the probability that it costs between 180 000 and 240 000 AUD.
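The reasoning behind the two probability parts of question 4 can be sketched in Python. The frequencies below are hypothetical and are NOT read from the histogram, so the numbers are not the answers to the question; only the method carries over.

```python
# Hypothetical frequencies per price interval (thousands of AUD).
# These are made up for illustration and do NOT come from the histogram.
freq = {(0, 60): 5, (60, 120): 15, (120, 180): 20, (180, 240): 8, (240, 300): 2}
total = sum(freq.values())

# P(price < 60 or price > 240): a house cannot be in both groups,
# so the two frequencies are simply added.
p_extreme = (freq[(0, 60)] + freq[(240, 300)]) / total

# P(180 < price < 240, given price > 120): the "given" condition shrinks
# the sample to the houses costing more than 120 thousand AUD.
above_120 = freq[(120, 180)] + freq[(180, 240)] + freq[(240, 300)]
p_conditional = freq[(180, 240)] / above_120

print(p_extreme, p_conditional)
```

The key design point is the denominator: the whole sample for the first probability, but only the houses satisfying the condition for the second.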
Measures of Spread
• Range = the maximum value – the minimum value
• Lower or First Quartile (Q1) = the value below which the first 25% of the ordered data lie. It is the median of the values below the median. Often called the 25th percentile.
• Median or Second Quartile (Q2) = the value below which the first 50% of the ordered data lie. Often called the 50th percentile.
• Upper or Third Quartile (Q3) = the value below which the first 75% of the ordered data lie. It is the median of the values above the median. Often called the 75th percentile.
• Interquartile Range = Q3 – Q1 = the spread of the middle 50% of the data
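The definitions above can be sketched in Python, taking Q1 and Q3 as the medians of the values below and above the median. The function name five_number_summary is hypothetical and the data set is made up for illustration. Other software may use a different quartile convention and give slightly different values.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the median
    upper = ordered[half + (n % 2):]  # values above the median
    q1 = statistics.median(lower)
    q2 = statistics.median(ordered)
    q3 = statistics.median(upper)
    return ordered[0], q1, q2, q3, ordered[-1]

# Made-up data set with 9 values.
data = [2, 4, 5, 7, 8, 10, 11, 13, 15]
minimum, q1, q2, q3, maximum = five_number_summary(data)
data_range = maximum - minimum
iqr = q3 - q1
print(minimum, q1, q2, q3, maximum, data_range, iqr)
```

When the number of values is odd, the median itself is left out of both halves, which matches "the values below the median" and "the values above the median".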
Box & Whisker Plots
• A visual representation of the “5 number summary” found on the GDC.
• Each section represents 25% of the data.
• When drawing one, rule the lines, and note that the whiskers do not pass through the box.
Box & Whisker Plots
5. A boxplot has been drawn to show the distribution of marks (out of 100) in a test for a particular class.
a) What was the i) highest mark ii) lowest mark scored?
b) What was the median test score for the class?
c) What was the range of marks scored for this test?
d) What percentage of students scored 60% or more for the test?
e) What was the interquartile range for this test?
f) The top 25% of students scored a mark between ______ & ______
g) If you scored 70 for this test, would you be in the top 50% of students in the class?
6. The following diagram is a box and whisker plot for a set of data. The interquartile range is 20 and the range is 40.
a) Write down the median value.
b) Find the value of
i. a
ii. b
Outliers & Box and Whisker Plots
• Outliers are values that are much smaller or larger than the other values.
• Outliers are defined as being more than 1.5 x IQR from the nearest quartile, i.e. below Q1 – 1.5 x IQR or above Q3 + 1.5 x IQR.
• Outliers are represented on a box and whisker plot by a *, so the “minimum” or “maximum” value is the smallest or largest value within 1.5 x IQR of the nearest quartile. It is possible to have more than one outlier.
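The 1.5 x IQR test can be sketched in Python as follows. The function name find_outliers is hypothetical, the data set is made up for illustration, and the quartiles are taken as the medians of the lower and upper halves of the ordered data, which is an assumption about the convention in use.

```python
import statistics

def find_outliers(data):
    """Return (outliers, lowest non-outlier, highest non-outlier)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = statistics.median(ordered[:half])            # median of the lower half
    q3 = statistics.median(ordered[half + (n % 2):])  # median of the upper half
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    # The whiskers of the boxplot end at the extreme values that are not outliers.
    return outliers, min(inside), max(inside)

# Made-up data set with one unusually small and one unusually large value.
data = [1, 10, 11, 12, 13, 14, 15, 16, 30]
outliers, whisker_low, whisker_high = find_outliers(data)
print(outliers, whisker_low, whisker_high)
```

The two whisker values returned are the “minimum” and “maximum” to draw on the boxplot, with each outlier marked separately by a *.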
7. Test the following data for outliers and hence construct a boxplot for the data: