LESSON: Error bars and measures of dispersion

FOCUS QUESTION: How can I depict uncertainty and variability in data?

This lesson discusses various ways of putting error bars on graphs. In this lesson you will:
  Examine measures of spread or dispersion.
  Display error bars on different types of charts.
  Practice further with plot properties.

Contents
  DATA FOR THIS LESSON
  SETUP FOR LESSON
  EXAMPLE 1: Load the data about New York contagious diseases
  EXAMPLE 2: Compute the overall mean and standard deviation of measles and chickenpox
  EXAMPLE 3: Compare overall average and SD of monthly counts of measles and chickenpox
  EXAMPLE 4: Compute mean and standard deviation of monthly measles cases by year
  EXAMPLE 5: Plot the SD error bars for measles monthly counts by year
  EXAMPLE 6: Plot the SD error bars on a bar chart for measles
  EXAMPLE 7: Compute median, MAD and IQR by month for measles
  EXAMPLE 8: Plot median monthly measles with IQR for error bars
  EXAMPLE 9: Plot IQR and MAD error bars on the same graph
  SUMMARY OF SYNTAX

DATA FOR THIS LESSON

File: NYCDiseases.mat

Description: The data set contains the monthly totals of the number of new cases of measles, mumps, and chicken pox for New York City during the years 1931-1971. The file is organized into the following variables:
  measles      an array containing the monthly cases of measles
  mumps        an array containing the monthly cases of mumps
  chickenPox   an array containing the monthly cases of chicken pox
  years        a vector containing the years 1931 through 1971

The data was extracted from the Hipel-McLeod Time Series Datasets Collection, available at http://www.stats.uwo.ca/faculty/aim/epubs/mhsets/readme-mhsets.html . The data was first published in: Yorke, J.A. and London, W.P. (1973). "Recurrent Outbreaks of Measles, Chickenpox and Mumps", American Journal of Epidemiology, Vol. 98, pp. 469.

SETUP FOR LESSON

Create an ErrorBars directory on your V: drive and make it your current directory.
Download NYCDiseases.mat to your ErrorBars directory. Create an ErrorBarLesson script file in your ErrorBars directory.

EXAMPLE 1: Load the data about New York contagious diseases

Create a new cell in which you type and execute:

    load NYCDiseases.mat;   % Load the disease data

You should see measles, mumps, chickenPox, and years variables in the Workspace Browser.

EXAMPLE 2: Compute the overall mean and standard deviation of measles and chickenpox

Create a new cell in which you type and execute:

    measlesAver = mean(measles(:));          % Calculate overall average measles
    measlesSD = std(measles(:), 1);          % Calculate overall std measles
    chickenPoxAver = mean(chickenPox(:));    % Calculate overall average chickenpox
    chickenPoxSD = std(chickenPox(:), 1);    % Calculate overall std chickenpox

You should see the following variables in your Workspace Browser:
  measlesAver      overall average of measles
  measlesSD        overall standard deviation of measles
  chickenPoxAver   overall average of chickenpox
  chickenPoxSD     overall standard deviation of chickenpox

Note: The second argument of 1 tells std to normalize by N (the population formula) rather than by N-1 (the sample standard deviation, which is the default).

EXERCISE 1: Create variables to hold the overall average and overall standard deviation of the traffic data of Lesson 1.

EXAMPLE 3: Compare overall average and SD of monthly counts of measles and chickenpox

Create a new cell in which you type and execute:

    figure
    hold on
    errorbar(1, measlesAver./1000, measlesSD./1000, 'rs');
    errorbar(2, chickenPoxAver./1000, chickenPoxSD./1000, 'ko');
    hold off
    xlabel('Disease')
    ylabel('Monthly averages (in thousands)')
    title('Childhood diseases NYC: 1931-1971 (SD error bars)')
    set(gca, 'XTickMode', 'manual', 'XTick', 1:2, ...
        'XTickLabelMode', 'manual', 'XTickLabel', {'Measles', 'Chicken Pox'})

You should see a Figure Window with a labeled error bar plot.

EXERCISE 2: Copy the code in EXAMPLE 3 and modify it to also include mumps.
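The note in EXAMPLE 2 about normalizing by N can be checked on a small vector. The following is a minimal sketch, not part of the original lesson: the values in x are made up for easy arithmetic and are not taken from NYCDiseases.mat, and the names popSD and sampleSD are introduced here only for illustration.

```matlab
% Hypothetical data (not from NYCDiseases.mat), chosen for easy arithmetic
x = [2 4 4 4 5 5 7 9];
n = numel(x);

% Sum of squared deviations from the mean
ss = sum((x - mean(x)).^2);

popSD    = sqrt(ss / n);         % population formula: divide by N
sampleSD = sqrt(ss / (n - 1));   % sample formula: divide by N - 1

% std with second argument 1 divides by N; with 0 (the default) by N - 1
fprintf('std(x, 1) = %.4f, by hand = %.4f\n', std(x, 1), popSD);
fprintf('std(x, 0) = %.4f, by hand = %.4f\n', std(x, 0), sampleSD);
```

For these eight values the mean is 5 and the squared deviations sum to 32, so the population formula gives exactly 2 while the sample formula gives a slightly larger value. With 41 years of monthly counts the two versions differ very little, but it is worth stating which one the error bars show.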

EXAMPLE 4: Compute mean and standard deviation of monthly measles cases by year

Create a new cell in which you type and execute:

    measlesByYearAver = mean(measles, 2);     % Average monthly measles by year
    measlesByYearSD = std(measles, 1, 2);     % Std monthly measles by year

You should see the following variables in your Workspace Browser:
  measlesByYearAver   a 41 x 1 array of average monthly measles cases by year
  measlesByYearSD     a 41 x 1 array of standard deviations of monthly measles cases by year

Note: As in EXAMPLE 2, the second argument of 1 tells std to normalize by N (the population formula) rather than by N-1 (the sample standard deviation).

EXERCISE 3: Create variables to hold the average and standard deviation by hour of the traffic data of Lesson 1.

EXERCISE 4: Plot the mean with SD error bars for the traffic data. Use the data computed from Exercise 3.

EXAMPLE 5: Plot the SD error bars for measles monthly counts by year

Create a new cell in which you type and execute:

    figure
    errorbar(years, measlesByYearAver./1000, measlesByYearSD./1000, 'ks');
    xlabel('Year');
    ylabel('Monthly averages (in thousands)')
    title('Measles NYC: 1931-1971 (SD error bars)')
    set(gca, 'YLimMode', 'manual', 'YLim', [0, 20])

You should see a Figure Window with a labeled error bar plot.

EXAMPLE 6: Plot the SD error bars on a bar chart for measles

Create a new cell in which you type and execute:

    figure
    hold on
    errorbar(years, measlesByYearAver./1000, measlesByYearSD./1000, 'ks');
    bar(years, measlesByYearAver./1000, 'FaceColor', [0.5, 0.5, 1])
    % Redraw the markers on top of the bars
    plot(years, measlesByYearAver./1000, 'LineStyle', 'none', ...
        'Marker', 's', 'MarkerEdgeColor', 'k', 'MarkerFaceColor', 'r')
    hold off
    xlabel('Year');
    ylabel('Monthly averages (in thousands)')
    title('Measles NYC: 1931-1971 (SD error bars)')
    set(gca, 'YLimMode', 'manual', 'YLim', [0, 20])

You should see a Figure Window with a labeled error bar plot.

EXAMPLE 7: Compute median, MAD and IQR by month for measles

Create a new cell in which you type and execute:

    measlesByMonthMedian = median(measles, 1);        % Median by month
    measlesByMonthMAD = mad(measles, 1, 1);           % Median absolute deviation by month
    measlesByMonthIQR = prctile(measles, [25, 75]);   % 25th and 75th percentiles

You should see the following 3 variables in your Workspace Browser:
  measlesByMonthMedian   the median measles by month
  measlesByMonthMAD      median absolute deviation (MAD) by month
  measlesByMonthIQR      25th and 75th percentiles (the ends of the interquartile range, IQR) of measles by month

The rows of measlesByMonthIQR correspond to the percentiles, and the columns correspond to the months.

EXAMPLE 8: Plot median monthly measles with IQR for error bars

Create a new cell in which you type and execute:

    xPositions = 1:12;
    lowerDist = measlesByMonthMedian - measlesByMonthIQR(1, :);   % Lower wing lengths
    upperDist = measlesByMonthIQR(2, :) - measlesByMonthMedian;   % Upper wing lengths
    figure
    errorbar(xPositions, measlesByMonthMedian./1000, ...
        lowerDist./1000, upperDist./1000, '-m*')
    xlabel('Month');
    ylabel('Cases in thousands')
    title('Measles cases in NYC: 1931-1971')
    legend('Median (IQR error bars)', 'Location', 'Northeast')   % Upper right

You should see the following 3 variables in your Workspace Browser:
  lowerDist    lengths of the lower wings of the IQR error bars for the median
  upperDist    lengths of the upper wings of the IQR error bars for the median
  xPositions   vector with the values 1..12

You should see a Figure Window with median/IQR error bars.

EXERCISE 5: Copy the code for EXAMPLE 8 into a new cell. Add a line graph of the average monthly measles cases (black line, no markers or error bars). Update the legend appropriately.
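EXAMPLE 7 and EXAMPLE 8 turn percentiles into error bar lengths, which is easy to get backwards. The following is a minimal sketch on a small made-up matrix, not part of the original lesson: counts, med, q, and madByHand are hypothetical names, and the numbers are invented rather than taken from NYCDiseases.mat. It uses prctile, the same function as EXAMPLE 7, and computes the median absolute deviation by hand so the definition is visible.

```matlab
% Hypothetical counts: rows are years, columns are months (not real data)
counts = [10 40 25;
          12 55 30;
           9 35 20;
          20 80 45;
          11 50 28];

med = median(counts, 1);             % median of each column (month)
q   = prctile(counts, [25, 75], 1);  % row 1: 25th, row 2: 75th percentile

% errorbar(X, Y, L, U) expects wing lengths, not the percentile values
lowerDist = med - q(1, :);           % distance from median down to the 25th
upperDist = q(2, :) - med;           % distance from median up to the 75th

% Median absolute deviation by hand: median of |value - column median|.
% repmat is used so the sketch does not rely on implicit expansion.
medRows   = repmat(med, size(counts, 1), 1);
madByHand = median(abs(counts - medRows), 1);

disp([q(1, :); med; q(2, :)])        % 25th percentile, median, 75th per month
disp(madByHand)                      % one MAD value per month
```

The column medians here are 11, 50, and 28, and the hand-computed median absolute deviations are 1, 10, and 3; the latter should agree with mad(counts, 1, 1). Because the two wing lengths are measured separately from the median, the error bars can be asymmetric, which is the reason EXAMPLE 8 uses the four-argument form of errorbar.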

EXERCISE 6: Create a new figure in which you plot the yearly averages for measles and chickenpox on the same graph. The graphs should have SD error bars.

EXAMPLE 9: Plot IQR and MAD error bars on the same graph

Create a new cell in which you type and execute:

    figure
    hold on
    errorbar(xPositions-0.1, measlesByMonthMedian./1000, ...
        lowerDist./1000, upperDist./1000, 'm*')
    errorbar(xPositions+0.1, measlesByMonthMedian./1000, ...
        measlesByMonthMAD./1000, 'ks')
    hold off
    xlabel('Month');
    ylabel('Median in thousands')
    title('Measles cases in NYC: 1931-1971')
    legend('IQR error bars', 'MAD error bars', 'Location', 'Northeast')

You should see a Figure Window with two sets of error bars.

SUMMARY OF SYNTAX

MATLAB syntax and description:

errorbar(Y, E)
  Create a plot of the values of Y similar to plot(Y). The corresponding values in E give the length of each wing of the error bars that extend above and below the corresponding values in Y.

errorbar(X, Y, E)
  Create a plot similar to errorbar(Y, E) except that this function uses the values of X for the horizontal positions rather than using the integers 1, 2, ... .

errorbar(X, Y, L, U)
  Create a plot similar to errorbar(X, Y, E) except that this function uses the values of L and U to determine the lengths of the lower and upper wings of the error bars, respectively.

mad(X)
  Compute the mean absolute deviation for the array X across the first nonsingleton dimension. For 2D arrays, this computes the mean absolute deviation down the rows (resulting in the mean absolute deviations of the columns).

mad(X, 0, 1)
  Compute the mean absolute deviation for the array X across dimension 1 (resulting in the mean absolute deviations of the columns). Note: If the second argument is 1, the function computes the median absolute deviation instead.

mad(X, 0, 2)
  Compute the mean absolute deviation for the array X across dimension 2 (resulting in the mean absolute deviations of the rows). Note: If the second argument is 1, the function computes the median absolute deviation instead.

Y = prctile(X, p)
  Compute a vector of the percentiles of the vector X. The vector p specifies the percentiles. When X is a 2D array, the percentiles are computed for each column, and the ith row of Y contains the percentiles p(i).

std(X)
  Compute the sample standard deviation (normalized by N-1) for the array X across the first nonsingleton dimension. For 2D arrays, this computes the standard deviation down the rows (resulting in the standard deviations of the columns).

std(X, 0, 1)
  Compute the sample standard deviation (normalized by N-1) for the array X across dimension 1 (resulting in the standard deviations of the columns). Note: If the second argument is 1, the normalization is by N instead, which gives the population standard deviation used in the examples of this lesson.

std(X, 0, 2)
  Compute the sample standard deviation (normalized by N-1) for the array X across dimension 2 (resulting in the standard deviations of the rows). Note: If the second argument is 1, the normalization is by N instead, which gives the population standard deviation used in the examples of this lesson.

This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 10-Feb-2015. Please contact [email protected] with comments or suggestions. The photo shows the rate of measles vaccination worldwide (WHO 2007) http://en.wikipedia.org/wiki/File:Measles_vaccination_worldwide.png .

Published with MATLAB® 8.3