LESSON: Error bars and measures of dispersion
FOCUS QUESTION: How can I depict uncertainty and variability in data?
This lesson discusses various ways of putting error bars on graphs.
In this lesson you will:
Examine measures of spread or dispersion.
Display error bars on different types of charts.
Get additional practice with plot properties.
Contents
DATA FOR THIS LESSON
SETUP FOR LESSON
EXAMPLE 1: Load the data about New York contagious diseases
EXAMPLE 2: Compute the overall mean and standard deviation of measles and chickenpox
EXAMPLE 3: Compare the overall average and SD of monthly counts of measles and chickenpox
EXAMPLE 4: Compute mean and standard deviation of monthly measles cases by year
EXAMPLE 5: Plot the SD error bars for measles monthly counts by year
EXAMPLE 6: Plot the SD error bars on a bar chart for measles
EXAMPLE 7: Compute median, MAD and IQR by month for measles
EXAMPLE 8: Plot median monthly measles with IQR for error bars
EXAMPLE 9: Plot IQR and MAD error bars on the same graph
SUMMARY OF SYNTAX
DATA FOR THIS LESSON
File: NYCDiseases.mat
Description: The data set contains the monthly totals of the number of new cases of measles, mumps, and chicken pox for New York City during the years 1931-1971. The file is organized into the following variables:
measles - an array containing the monthly cases of measles
mumps - an array containing the monthly cases of mumps
chickenPox - an array containing the monthly cases of chicken pox
years - a vector containing the years 1931 through 1971
The data was extracted from the Hipel-McLeod Time Series Datasets Collection, available at http://www.stats.uwo.ca/faculty/aim/epubs/mhsets/readme-mhsets.html . The data was first published in: Yorke, J.A. and London, W.P. (1973). "Recurrent Outbreaks of Measles, Chickenpox and Mumps", American Journal of Epidemiology, Vol. 98, pp. 469.
SETUP FOR LESSON
Create an ErrorBars directory on your V: drive and make it your current directory.
Download NYCDiseases.mat to your ErrorBars directory.
Create an ErrorBarLesson script file in your ErrorBars directory.
EXAMPLE 1: Load the data about New York contagious diseases
Create a new cell in which you type and execute:
load NYCDiseases.mat; % Load the disease data
You should see the measles, mumps, chickenPox, and years variables in the Workspace Browser.
EXAMPLE 2: Compute the overall mean and standard deviation of measles and chickenpox
Create a new cell in which you type and execute:
measlesAver = mean(measles(:)); % Calculate overall average measles
measlesSD = std(measles(:), 1); % Calculate overall std measles
chickenPoxAver = mean(chickenPox(:)); % Calculate overall average chickenpox
chickenPoxSD = std(chickenPox(:), 1); % Calculate overall std chickenpox
You should see the following variables in your Workspace Browser:
measlesAver - overall average of measles
measlesSD - overall standard deviation of measles
chickenPoxAver - overall average of chickenpox
chickenPoxSD - overall standard deviation of chickenpox
Note: Passing 1 as the second argument makes std normalize by N (the population standard deviation) instead of by N - 1 (the sample standard deviation, which is the default).
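To see what the second argument of std changes, the following minimal sketch applies both normalizations to a small made-up vector. The values in x are illustrative and are not taken from NYCDiseases.mat.

```matlab
% Illustrative data only (not from NYCDiseases.mat)
x = [2 4 4 4 5 5 7 9];

sdSample = std(x, 0);       % Normalizes by N-1 (the default)
sdPopulation = std(x, 1);   % Normalizes by N, as in EXAMPLE 2

% The same N-normalized value computed directly from the definition
sdByHand = sqrt(mean((x - mean(x)).^2));

fprintf('N-1: %.4f   N: %.4f   by hand: %.4f\n', ...
    sdSample, sdPopulation, sdByHand);
```

For large N the two values are nearly identical; the choice matters mainly for small samples.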
EXERCISE 1: Create variables to hold the overall average and overall standard deviation of the traffic data of Lesson 1.
EXAMPLE 3: Compare the overall average and SD of monthly counts of measles and chickenpox
Create a new cell in which you type and execute:
figure
hold on % Keep both error bars on the same axes
errorbar(1, measlesAver./1000, measlesSD./1000, 'rs'); % Measles: red square
errorbar(2, chickenPoxAver./1000, chickenPoxSD./1000, 'ko'); % Chickenpox: black circle
hold off
xlabel('Disease')
ylabel('Monthly averages (in thousands)')
title('Childhood diseases NYC: 1931-1971 (SD error bars)')
set(gca, 'XTickMode', 'manual', 'XTick', 1:2, ...
    'XTickLabelMode', 'manual', 'XTickLabel', {'Measles', 'Chicken Pox'})
You should see a Figure Window with a labeled error bar plot:
EXERCISE 2: Copy the code in EXAMPLE 3 and modify it to also include mumps.
EXAMPLE 4: Compute mean and standard deviation of monthly measles cases by year
Create a new cell in which you type and execute:
measlesByYearAver = mean(measles, 2); % Average monthly measles by year
measlesByYearSD = std(measles, 1, 2); % Std monthly measles by year
You should see the following variables in your Workspace Browser:
measlesByYearAver - a 41 x 1 array of average monthly measles cases by year
measlesByYearSD - a 41 x 1 array of standard deviations of monthly measles cases by year
Note: Passing 1 as the second argument makes std normalize by N (the population standard deviation) instead of by N - 1 (the sample standard deviation, which is the default).
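The last argument of mean and std selects the dimension that is collapsed. The 41 x 1 results above suggest that measles is stored with one row per year and one column per month, although that layout is inferred here rather than stated by the data file. A minimal sketch on a made-up 2 x 3 matrix shows the shapes involved:

```matlab
% Illustrative data only: 2 "years" (rows) by 3 "months" (columns)
A = [1 2 3;
     4 6 8];

rowMeans = mean(A, 2);     % 2 x 1: one mean per row
rowSDs   = std(A, 1, 2);   % 2 x 1: one SD per row, normalized by N
colMeans = mean(A, 1);     % 1 x 3: one mean per column

disp(size(rowMeans))       % Shape of the by-row result
disp(size(colMeans))       % Shape of the by-column result
```

Using dimension 2 gives one value per year, as in this example; dimension 1 gives one value per month.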
EXERCISE 3: Create variables to hold the average and standard deviation by hour of the traffic data of Lesson 1.
EXERCISE 4: Plot the mean with SD error bars for the traffic data. Use the data computed from Exercise 3.
EXAMPLE 5: Plot the SD error bars for measles monthly counts by year
Create a new cell in which you type and execute:
figure
errorbar(years, measlesByYearAver./1000, measlesByYearSD./1000, 'ks');
xlabel('Year');
ylabel('Monthly averages (in thousands)')
title('Measles NYC: 1931-1971 (SD error bars)')
set(gca, 'YLimMode', 'manual', 'YLim', [0, 20])
You should see a Figure Window with a labeled error bar plot:
EXAMPLE 6: Plot the SD error bars on a bar chart for measles
Create a new cell in which you type and execute:
figure
hold on
errorbar(years, measlesByYearAver./1000, measlesByYearSD./1000, 'ks');
bar(years, measlesByYearAver./1000, 'FaceColor', [0.5, 0.5, 1])
plot(years, measlesByYearAver./1000, 'LineStyle', 'none', ...
'Marker', 's', 'MarkerEdgeColor','k', 'MarkerFaceColor','r')
hold off
xlabel('Year');
ylabel('Monthly averages (in thousands)')
title('Measles NYC: 1931-1971 (SD error bars)')
set(gca, 'YLimMode', 'manual', 'YLim', [0, 20])
You should see a Figure Window with a labeled error bar plot:
EXAMPLE 7: Compute median, MAD and IQR by month for measles
Create a new cell in which you type and execute:
measlesByMonthMedian = median(measles, 1); % Median by month
measlesByMonthMAD = mad(measles, 1, 1); % Median absolute deviation by month
measlesByMonthIQR = prctile(measles, [25, 75]); % 25th and 75th percentiles by month
You should see the following 3 variables in your Workspace Browser:
measlesByMonthMedian - the median measles cases by month
measlesByMonthMAD - the median absolute deviation (MAD) by month
measlesByMonthIQR - the 25th and 75th percentiles (the endpoints of the interquartile range, IQR) of measles cases by month
The rows of measlesByMonthIQR correspond to the percentiles, and the columns correspond to the months.
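The median, MAD, and percentiles are less sensitive to extreme values than the mean and standard deviation. The following minimal sketch uses a made-up column with one large outlier to show the difference, and a made-up two-column matrix to show the shape that prctile returns. Note that mad and prctile require the Statistics Toolbox.

```matlab
% Illustrative data only: one column with a single large outlier
x = [1; 2; 3; 4; 100];

xMean   = mean(x);      % Pulled upward by the outlier
xMedian = median(x);    % Unaffected by the outlier
xSD     = std(x, 1);    % Inflated by the outlier
xMAD    = mad(x, 1);    % Median absolute deviation (flag 1)
xMeanAD = mad(x, 0);    % Mean absolute deviation (flag 0)

% Two columns in, two rows of percentiles out (one column per input column)
X = [x, x + 10];
Q = prctile(X, [25, 75]);   % Q(1, :) = 25th, Q(2, :) = 75th percentiles
disp(size(Q))
```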
EXAMPLE 8: Plot median monthly measles with IQR for error bars
Create a new cell in which you type and execute:
xPositions = 1:12;
lowerDist = measlesByMonthMedian - measlesByMonthIQR(1, :); % Bottom bar
upperDist = measlesByMonthIQR(2, :) - measlesByMonthMedian; % Top bar
figure
errorbar(xPositions, measlesByMonthMedian./1000, ...
    lowerDist./1000, upperDist./1000, '-m*')
xlabel('Month');
ylabel('Cases in thousands')
title('Measles cases in NYC: 1931-1971')
legend('Median (IQR error bars)', 'Location', 'Northeast') % Upper right
You should see the following 3 variables in your Workspace Browser:
lowerDist - lengths of the lower wings of the IQR error bars (median minus 25th percentile)
upperDist - lengths of the upper wings of the IQR error bars (75th percentile minus median)
xPositions - vector with the values 1..12
You should see a Figure Window with median/IQR error bars:
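The four-argument form of errorbar expects wing lengths measured from the plotted value, not the percentile values themselves, which is why the example subtracts before plotting. A minimal sketch of the same computation on a small made-up matrix (the values and the names lowerLen and upperLen are illustrative; prctile requires the Statistics Toolbox):

```matlab
% Illustrative data only: 6 observations (rows) for 3 groups (columns)
X = [ 1  2  3;
      2  4  6;
      3  6  9;
      4  8 12;
      5 10 15;
     50 60 70];

med = median(X, 1);           % Median of each column
Q = prctile(X, [25, 75]);     % Row 1: 25th percentiles, row 2: 75th

lowerLen = med - Q(1, :);     % Distance from median down to 25th percentile
upperLen = Q(2, :) - med;     % Distance from median up to 75th percentile

figure
errorbar(1:3, med, lowerLen, upperLen, '-m*')  % Wings end at the percentiles
```

Because the data is skewed, the upper wings come out longer than the lower ones, and the two lengths always sum to the IQR.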
EXERCISE 5: Copy the code for EXAMPLE 8 into a new cell. Add a line graph of the average monthly measles
cases (black line, no markers or error bars). Update the legend appropriately.
EXERCISE 6: Create a new figure in which you plot the yearly averages for measles and chickenpox on the
same graph. The graphs should have SD error bars.
EXAMPLE 9: Plot IQR and MAD error bars on the same graph
Create a new cell in which you type and execute:
figure
hold on
errorbar(xPositions-0.1, measlesByMonthMedian./1000, ...
    lowerDist./1000, upperDist./1000, 'm*')
errorbar(xPositions+0.1, measlesByMonthMedian./1000, ...
    measlesByMonthMAD./1000, 'ks')
hold off
xlabel('Month');
ylabel('Median in thousands')
title('Measles cases in NYC: 1931-1971')
legend('IQR error bars', 'MAD error bars', 'Location', 'Northeast')
You should see a Figure Window with two sets of error bars:
SUMMARY OF SYNTAX
MATLAB syntax
Description
errorbar(Y, E)
Create a plot of the values of Y similar to plot(Y). The corresponding values in
E give the length of each wing of the error bars that extend above and below the
corresponding values in Y.
errorbar(X, Y, E)
Create a plot similar to errorbar(Y, E) except that this function uses the values
of X for the horizontal positions rather than using the integers 1, 2, ... .
errorbar(X, Y, L, U)
Create a plot similar to errorbar(X, Y, E) except that this function uses the
values of L and U to determine the lengths of the lower and upper wings of the
error bars, respectively.
mad(X)
Compute the average or mean absolute deviation for the array X across the first non-singleton dimension. For 2D arrays, this computes the mean absolute deviation across the rows (resulting in the mean absolute deviations of the columns).
mad(X, 0, 1)
Compute the average or mean absolute deviation for the array X across dimension 1 (resulting in the mean absolute deviations of the columns). Note: If the second argument is 1, we compute the median absolute deviation.
mad(X, 0, 2)
Compute the average or mean absolute deviation for the array X across dimension 2 (resulting in the mean absolute deviations of the rows). Note: If the second argument is 1, we compute the median absolute deviation.
Y = prctile(X, p)
Compute a vector of the percentiles of the vector X. The vector p specifies the percentiles. When X is a 2D array, the i-th row of Y contains the percentiles p(i) of the columns of X.
std(X)
Compute the standard deviation normalized by N - 1 (the sample standard deviation) for the array X across the first non-singleton dimension. For 2D arrays, this computes the standard deviation across the rows (resulting in the standard deviations of the columns).
std(X, 0, 1)
Compute the standard deviation normalized by N - 1 (the sample standard deviation) for the array X across dimension 1 (resulting in the standard deviations of the columns). Note: If the second argument is 1, the standard deviation is normalized by N (the population standard deviation).
std(X, 0, 2)
Compute the standard deviation normalized by N - 1 (the sample standard deviation) for the array X across dimension 2 (resulting in the standard deviations of the rows). Note: If the second argument is 1, the standard deviation is normalized by N (the population standard deviation).
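As a quick side-by-side illustration of the three errorbar call forms listed in the summary, here is a minimal sketch with made-up values (y, e, and x are illustrative and unrelated to the disease data):

```matlab
% Illustrative values only
y = [3 5 4];          % Heights of the plotted points
e = [0.5 1.0 0.8];    % Wing lengths
x = [10 20 30];       % Horizontal positions

figure
subplot(1, 3, 1); errorbar(y, e);            % x defaults to 1, 2, 3
subplot(1, 3, 2); errorbar(x, y, e);         % Explicit x positions
subplot(1, 3, 3); errorbar(x, y, e, 2 * e);  % Upper wings twice the lower
```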
This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 10-Feb-2015.
Please contact [email protected] with comments or suggestions. The photo shows rate of measles vaccination worldwide (WHO 2007) http://en.wikipedia.org/wiki/File:Measles_vaccination_worldwide.png .
Published with MATLAB® 8.3