Chapter 3 study guide - Germantown School District

Thursday, September 6:
3.1 Displaying Categorical Data: Bar and Pie Charts
The first rule of data analysis is: make a picture. The second rule of data analysis is: make a picture.
The third rule of data analysis is: ______________________
Def: A ________________ displays the possible values of a variable and how often it takes them.
Depending on the type of data and what you are looking for, there are a number of ways to organize and
display the distribution.
When a variable is categorical, the first step is usually to construct a ________________________,
which displays the possible categories and the number of observations in each category.
Make        Frequency
American
Asian
European
Total
Def: The ______________________ for a particular category is the fraction or proportion of the time
that the category appears in the data set.
Make        Relative Frequency
American
Asian
European
Total
Note: the total relative frequency should be 1 (or 100%) except for rounding error.
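The relative frequency calculation can be sketched in Python as follows. This is only an illustration: the counts are hypothetical (the table above is filled in during class), and the function name `relative_frequencies` is made up for this example.

```python
def relative_frequencies(counts):
    """Return each category's share of the total count."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

# Hypothetical counts, not the class data.
counts = {"American": 12, "Asian": 18, "European": 10}
rel = relative_frequencies(counts)

for make, share in rel.items():
    print(f"{make:10s} {share:.3f}")
# The relative frequencies should add to 1 except for rounding error.
print(f"{'Total':10s} {sum(rel.values()):.3f}")
```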
There are several ways to display categorical data.
Bar Charts and Relative Frequency Bar Charts:
Def: The __________________ says that the area occupied by a part of the graph should correspond to
the magnitude of the value it represents. For example, when making a _________________, the
rectangles should all be the same width so only the height determines the area.



The horizontal axis should include the variable name and the possible categories. The bars
should have some space between them to indicate they are freestanding and can be arranged in
any order.
The vertical axis can be frequency or relative frequency and should include a numeric scale
starting at 0
Relative frequency bar charts make it easier to compare multiple distributions, especially when
the sample sizes are different
________________________________________________: are useful when comparing categories that
form “parts of the whole”.
• Label variable and categories
• Segmented bar charts are basically rectangular pie charts
• Segmented bar charts make it easier to compare multiple distributions
3.2 Displaying Numerical Data: Dotplots and Stemplots
A ______________ is a simple way to display numerical data when the data set is reasonably small.
Note: it is important to label the axis and scale clearly
The 4 Key Features of a Distribution are: ________________________________________________
Words used to describe the ____________ of a distribution:
There is no specific definition of “unusual values”, but here are some things to consider:
Def: ____________: data values that fall out of the pattern of the rest of the distribution
Def: ____________: isolated groups of values
Def: ________: large spaces between values
We will learn more specific ways to describe center and spread, but for now:
To identify the _________ of a distribution, try to find a typical value or the single value that would best
describe the rest of the data. This is often the _________________ or the _______________________.
To describe the ____________ of a distribution, consider how close the data values are to the center. If
the points are consistently close to the center, there isn’t much spread (or variability). For example,
“most of the data is within ___ units of the center.”
A _____________________ plot is another way to display a relatively small numerical data set. In a
stem-and-leaf plot, the _____ is the first part of the number and the _____ is the last part of the number.
ex: male weights {97,102,117,128,130,132,139,147,154,162,166,189,225}
Male Weights
 9 | 7
10 | 2
11 | 7
12 | 8
13 | 029
14 | 7
15 | 4
16 | 26
17 |
18 | 9
19 |
20 |
21 |
22 | 5
key: 14 | 7 = 147 pounds





The numbers to the left of the line are the stems (hundreds and tens digits) and the numbers to
the right of the line are the leaves (units digits).
You must include a key (with units) and a label/title
Leaves should be single digits (no commas)
It is best if the leaves are in numerical order, but it is not required.
Stemplots will look very similar to a dotplot of the same data, but a stemplot preserves the
individual data values
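As a rough sketch, the stemplot rules above can be automated in Python. The function name `stemplot` is hypothetical, and the code assumes whole-number data where the units digit is the leaf.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows: stem = all digits but the last, leaf = units digit."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in numerical order
        leaves[v // 10].append(v % 10)
    rows = []
    # List every stem from smallest to largest so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem:>3} | " + "".join(str(d) for d in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows

male_weights = [97, 102, 117, 128, 130, 132, 139, 147, 154, 162, 166, 189, 225]
rows = stemplot(male_weights)
print("Male Weights")
print("\n".join(rows))
print("key: 14 | 7 = 147 pounds")
```

Because every data value can be read back from a stem and a leaf, the plot keeps the individual values, which a dotplot or histogram does not.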
Back-to-back stemplots are useful for comparing distributions. For example, given the following female
weights, make a back-to-back stemplot to compare female and male weights.
Female Weights = {93, 99, 100, 104, 109, 111, 113, 113, 121, 125, 126, 128, 142, 159, 185}
When a data set is very compact, it is often useful to __________________ to stretch the display to
investigate the shape. This is sometimes called a ___________________.
ex: body temperatures: {96.3, 97.6, 97.8, 97.9, 98.1, 98.1, 98.3, 98.5, 98.6, 98.6, 98.7, 98.8, 99.0, 99.5}
When a data set is very spread out, it is often useful to _______________ the data to shrink the display.
ex: grocery bills: {10.53, 13.67, 15.01, 18.30, 20.89, 27.07, 32.82, 37.57, 52.36}
HW #1: 3.2, 3, 11, 12, 15-20 (check odd answers yourself)
Monday, September 10: 3.3 Displaying Numerical Data: Frequency Distributions and Histograms
Stem-and-leaf plots and dotplots are very good for displaying small data sets. However, when there are
a large number of observations, _________________________ and _____________ are a better choice.
Frequency Distributions and Histograms for Discrete Numerical Data:
ex: Number of siblings



When each bar corresponds to only one value, center each bar above its corresponding value.
Make sure you label axes and scales. The vertical axis should always start at 0.
The bars in a histogram should touch (unlike bar charts for categorical data)
Relative Frequency Histograms:
A relative frequency histogram looks the same as a regular histogram, except that it will have relative
frequency (percent of the total) rather than frequency (number of observations) on the vertical axis.
Relative frequency histograms are particularly useful for comparing distributions with different sample
sizes, since the vertical axis will be on the same scale.
Frequency Distributions and Histograms for Continuous Numerical Data:
ex: hair length of students
Since this data is continuous, there are no “natural” categories to place the data. In this case, we will
define our own categories, called ___________. There is no perfect way to create classes, but classes
should always be the same length and never overlap or leave any gaps. Ideally, you should use between
4-10 classes.
What if an observation falls exactly on a boundary? As a convention, we will put boundary values into
the upper class. For example, suppose you decided on class lengths of 3 inches starting at 0. Then, the
boundaries would be 0”, 3”, 6”, etc. We define the class from 0-3 to be 0 ≤ x < 3 (written 0-<3), so an
observation of 3” would fall into the 3-6 class.
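The boundary convention can be checked with a short Python sketch. The hair lengths below are hypothetical, and `class_index` and `frequency_table` are names invented for this example; the only rule taken from the notes is that a value on a boundary goes into the upper class.

```python
def class_index(x, start=0.0, width=3.0):
    """Index of the class holding x, where class i covers start + i*width <= x < start + (i+1)*width."""
    return int((x - start) // width)

def frequency_table(data, start, width, n_classes):
    """Count the observations in each equal-width class."""
    counts = [0] * n_classes
    for x in data:
        counts[class_index(x, start, width)] += 1
    return counts

# Hypothetical hair lengths in inches; 3.0 and 6.0 sit exactly on boundaries.
hair = [0.5, 2.0, 3.0, 3.5, 5.9, 6.0, 8.0, 11.5]
counts = frequency_table(hair, start=0, width=3, n_classes=4)

for i, count in enumerate(counts):
    low = i * 3
    print(f"{low}-<{low + 3}: {count}")
```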
Note: There are methods for creating histograms with unequal classes, but we are skipping them. If a
frequency distribution has non-bounded classes, such as “12 or more”, a histogram cannot be made
(think of the area principle).
HW #2: 3.32, 34, 36
Tuesday, September 11:
3.3 Continued: Cumulative Relative Frequency Distributions
Instead of wanting to know what percent of the data falls into a particular class, we often want to know
what percent falls below a certain value. To make this possible, we will compute the _____________
___________________________for each class, which is the sum of the relative frequency of that class
and all the classes below it.
For example, consider the following frequency distribution of test scores:
Score      Frequency   Relative Frequency   Cumulative Relative Frequency
0-<10      2           .01
10-<20     7           .035
20-<30     9           .045
30-<40     12          .06
40-<50     23          .115
50-<60     34          .17
60-<70     48          .24
70-<80     39          .195
80-<90     19          .095
90-<100    7           .035
Total      200         1

What proportion scored less than 30?

Less than 90?

What proportion scored at least 40?

At least 70?

What proportion scored 45?

About what proportion scored 50-<70?
(.675-.265 = .41)
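One way to fill in the cumulative column is sketched below in Python, using the frequencies from the test score table. The variable names are made up for this example; running totals are taken on the whole-number frequencies first to avoid rounding drift.

```python
from itertools import accumulate

lower_bounds = list(range(0, 100, 10))            # 0, 10, ..., 90
frequencies = [2, 7, 9, 12, 23, 34, 48, 39, 19, 7]
total = sum(frequencies)

relative = [f / total for f in frequencies]
# Cumulative relative frequency: this class plus all classes below it.
cumulative = [running / total for running in accumulate(frequencies)]

for low, f, r, c in zip(lower_bounds, frequencies, relative, cumulative):
    label = f"{low}-<{low + 10}"
    print(f"{label:9s} {f:4d} {r:7.3f} {c:7.3f}")

# Proportion scoring 50-<70: cumulative through 60-<70 minus cumulative through 40-<50.
between_50_and_70 = cumulative[6] - cumulative[4]
print(round(between_50_and_70, 3))
```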
Def: The graph of a cumulative relative frequency distribution is called an ________ (not in book).

About what proportion scored less than 45?

What score separates the lower half of the scores from the upper half of scores?
Def: the Pth _______________ of a distribution is the value in the distribution such that P percent of the
observations lie at that level or below.
• What is the 30th percentile for this distribution?

What is the 70th percentile?

What is the median?
Given the following ogive, construct the corresponding relative frequency histogram.
4.1 Describing the Center of a Data Set
For the last few days, we learned graphical methods to describe a data set. Now, we will learn more
precise, numerical methods to describe a data set.
The 2 most common measures of center are the mean and median.
Def: The _________________________ = $\mu$ = the average of all the values in the entire population.
However, since we rarely study the entire population, we estimate the population mean ($\mu$) with the
sample mean ($\bar{x}$).
Def: The ____________________ is the average of all the values in a sample from a population:
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$, where n = sample size



Suppose that the average height of the entire class is 66” (that is, $\mu$ = 66”).
If we were to randomly select 5 students and find their average height, we may find $\bar{x}$ = 65.4” in one
sample and $\bar{x}$ = 67.2” in another.
This illustrates the concept of __________________________________.
Note: It is a convention in statistics to use Greek letters to denote population characteristics, also called
________________. The values we use to estimate parameters are called __________________.
Note: The mean is also the balancing point of a distribution.
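Def: The _____________ is the middle score. To find which value is the middle score, put the data in
order and calculate (n+1)/2. The median will be the (n+1)/2 score in the ordered list. If n is even,
(n+1)/2 falls halfway between two positions, so the median is the average of those two scores.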
Def: The ________, or modal value is the most frequent observation. We rarely use mode as a measure
of center, but it still can reveal interesting information about a data set.
Find the mean and median of the sample below:
{12, 20, 19, 8, 17, 23, 255, 12}
It turns out that the 255 was recorded incorrectly and should have been 25. What effect will this have?
Def: a _____________________________ is a measure that is not affected by outliers.

The median is resistant, but the mean is not since an outlier like 255 has no effect on the median
but a big effect on the mean.
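The effect of the recording error can be checked with a few lines of Python. This is a sketch using the standard `statistics` module on the sample above; the variable names are made up for this example.

```python
from statistics import mean, median

recorded = [12, 20, 19, 8, 17, 23, 255, 12]
# Replace the mistaken 255 with the intended value of 25.
corrected = [25 if x == 255 else x for x in recorded]

for label, data in (("recorded", recorded), ("corrected", corrected)):
    print(f"{label:9s} mean = {mean(data):6.2f}   median = {median(data):5.1f}")
```

The median stays the same in both versions while the mean moves a great deal, which is what it means for the median to be resistant and the mean not.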

The median is often used to describe the center of a skewed distribution, such as home prices. In
non-symmetrical distributions, the mean does not do a good job describing a “typical” value
since it is heavily influenced by extreme values.
In a skewed left distribution, which is greater, mean or median?
Skewed right?
When the data is grouped into classes, use the _____________ of each class to estimate the mean
SAT score    Frequency
200-<300     1
300-<400     3
400-<500     11
500-<600     22
600-<700     13
700-<800     8
Total        58
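A sketch of the grouped-data estimate in Python follows. It assumes the representative value for each class is its midpoint (an assumption about the blank above), and it uses the SAT frequencies from the table; the variable names are illustrative.

```python
# (lower bound, upper bound, frequency) for each SAT class in the table
classes = [
    (200, 300, 1),
    (300, 400, 3),
    (400, 500, 11),
    (500, 600, 22),
    (600, 700, 13),
    (700, 800, 8),
]

n = sum(freq for _, _, freq in classes)
# Treat every score in a class as if it sat at the class midpoint (assumption).
weighted_sum = sum((low + high) / 2 * freq for low, high, freq in classes)
grouped_mean = weighted_sum / n

print(f"n = {n}, estimated mean = {grouped_mean:.1f}")
```

The result is only an estimate, since the individual scores inside each class are unknown.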
Def: A _____________________ is a resistant measure which tries to avoid the influence of outliers by
eliminating values from the low and high end of the distribution. For example, to find a 10% trimmed
mean when n = 50, the data must be put in order and the lowest 5 (since 5 is 10% of 50) and the highest
5 values will be eliminated. The mean will then be computed with the remaining 40 values.
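The trimming procedure can be sketched in Python as below. The data set is hypothetical (49 ordinary values plus one extreme value, so n = 50 as in the example), and `trimmed_mean` is a name invented here; the sketch assumes a trimming percentage below 50.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent` percent of the values from each end."""
    data = sorted(values)
    k = int(len(data) * percent / 100)   # how many values to drop from each end
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)

# Hypothetical sample: the whole numbers 1 to 49 and one outlier of 1000.
sample = list(range(1, 50)) + [1000]

print("ordinary mean:   ", sum(sample) / len(sample))
print("10% trimmed mean:", trimmed_mean(sample, 10))
```

With n = 50 and 10% trimming, k is 5, so the lowest 5 and highest 5 values are dropped and the mean is taken over the remaining 40.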
Find the sample mean and the 10% trimmed mean for the data below. Which value do you think will be
bigger? Why?
[Stemplot of male weights: stems 9 through 22. Leaf groups as listed from the top stem down: 7, 25, 47, 258, 025789, 14788, 479, 26, 13, 9, 15, 5]
key: 14 | 7 = 147 pounds
What happens as the trimming percentage gets closer to 50%?
AP Question
HW #3: Ogive Worksheet, 4.2, 3, 5, 7, 9, 16, 17 (Read 108-110 if we don’t finish trimmed mean)
Thursday, September 13: 4.2 Measures of Spread
Knowing the shape and center of a data set gives us some understanding of its distribution, but we do
not know the whole story until we investigate the __________, or _________________.
The ________ = maximum value - minimum value
• The range is one number, not two!
• Although it is simple to calculate, the range has a big weakness: it is not resistant to outliers at
all.
Instead of using only 2 data points, better measures of spread consider the entire data set. One way to do
this is to estimate the average deviation from the mean. This quantity is called the ____________
___________________________.
Find and interpret the standard deviation:
Number of pencils in backpack: 1, 13, 11, 1, 5, 9, 6, 5, 8, 11
Note: If we had the entire ________________, we would use $\sigma$ (lower case sigma) to denote the
population standard deviation, which describes how spread out all the values are from the true
population mean ($\mu$). In the calculation we would use $\mu$ instead of $\bar{x}$ in the numerator and n in the
denominator instead of n-1. However, since we rarely study entire populations, we usually estimate the
population standard deviation ($\sigma$) with the sample standard deviation (s). Always assume data sets are
from samples unless otherwise told!
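The difference between dividing by n-1 and dividing by n can be seen in a short Python sketch using the pencil data above. The variable names are made up for this example, and the "population" version is shown only for comparison, as if these ten backpacks were the whole population.

```python
from math import sqrt

pencils = [1, 13, 11, 1, 5, 9, 6, 5, 8, 11]
n = len(pencils)
x_bar = sum(pencils) / n

# Sum of squared deviations from the mean (the numerator in both formulas).
sum_sq = sum((x - x_bar) ** 2 for x in pencils)

s = sqrt(sum_sq / (n - 1))   # sample standard deviation: divide by n - 1
sigma = sqrt(sum_sq / n)     # population-style calculation: divide by n

print(f"mean = {x_bar}, s = {s:.3f}, divide-by-n version = {sigma:.3f}")
```

Dividing by n-1 always gives a slightly larger value than dividing by n, and the gap shrinks as the sample size grows.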
Note: The square of the sample standard deviation is called the ________________________. It will
come in handy later.
What if the last value in the data set above was 41 instead of 11? Calculate the new sample SD. Is the
standard deviation a resistant measure of spread?
Can s = 0?
Can s be negative?
4.2/4.3 More Measures of Spread / Boxplots
The ________________________________ (IQR) is the range of the middle 50% of the data.
IQR = Q3 - Q1, where
Q1 is the first quartile (the point that divides the lowest 25% of the data from the upper 75%)
Q3 is the third quartile (the point that divides the lowest 75% of the data from the upper 25%)
Note: The median is the second quartile, although it is rarely referred to that way.
To find Q1 and Q3, put the data in order and find the median. Q1 is the median of all the data less than
the median and Q3 is the median of all the data above the median.
Find the range and IQR for the following male weights:
[Stemplot of male weights: stems 9 through 22. Leaf groups as listed from the top stem down: 7, 25, 47, 258, 025789, 14788, 479, 26, 13, 9, 15, 5]
key: 14 | 7 = 147 pounds
If we were to change the 225 to 525, would the IQR change? The range? Which is resistant?
Note: if there were an odd number of points so that the median is one of the observations in the set, that
observation is not considered part of the upper or lower half (although different books and software
disagree).
ex: {1, 2, 5, 9, 11, 16, 27}
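The quartile method described above (median of each half, leaving the median out of both halves when n is odd) can be sketched in Python. The function name `quartiles` is hypothetical, and other books and software may use slightly different rules, as the note says.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # values below the middle position
    upper = data[-half:]    # values above the middle position
    return median(lower), median(data), median(upper)

q1, med, q3 = quartiles([1, 2, 5, 9, 11, 16, 27])
print("Q1 =", q1, " median =", med, " Q3 =", q3, " IQR =", q3 - q1)
```

When n is odd, the integer division leaves the middle observation out of both slices; when n is even, the two slices split the data exactly in half.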
The ___________________________ of a data set is the minimum, Q1, median, Q3, maximum
A __________ is a graphical display that uses the five-number summary to picture a distribution.


Note: the length of the box = ___________________________
Note: the length of the entire plot = __________________
If a distribution has ___________, they are usually marked separately
To determine if there are outliers we place boundaries (fences) around the main part of the data.
• lower fence = Q1 – 1.5 IQR
• upper fence = Q3 + 1.5 IQR
• Any observations that are outside of these fences are considered outliers.
• If there are outliers, the whisker extends to the most extreme observation that is not an outlier.
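The fence rule can be sketched in Python as follows. To keep the example checkable, it uses the 13 male weights from section 3.2 rather than the larger stemplot, and the function name `find_outliers` is invented for this illustration; quartiles are found by the median-of-halves method from these notes.

```python
from statistics import median

def find_outliers(values):
    """Return (lower fence, upper fence, list of outliers) using the 1.5 IQR rule."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])      # median of the lower half
    q3 = median(data[-half:])     # median of the upper half
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

male_weights = [97, 102, 117, 128, 130, 132, 139, 147, 154, 162, 166, 189, 225]
low, high, outliers = find_outliers(male_weights)
print("fences:", low, high, " outliers:", outliers)

# Changing 225 to 525 leaves the quartiles (and fences) alone but creates an outlier.
changed = [525 if x == 225 else x for x in male_weights]
print(find_outliers(changed))
```

For these 13 weights, the largest value of 225 falls just inside the upper fence, so no point is flagged until it is changed to 525.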
Are there outliers in the example above about male weights? Draw a boxplot of the data and describe
what you see (tell a story with SCSU).



Boxplots are nice because we can quickly identify the center (median) and spread (range and
IQR).
We can also identify symmetry or skewness, but NOT how many modes the distribution has.
Boxplots are especially useful for comparing distributions, but make sure they are on the same
scale!
Note: the book uses the term “mild outlier” for points farther than 1.5 IQR, but less than 3 IQR away
from the quartiles and the term “extreme outliers” for any points farther than 3 IQR beyond the quartiles.
For our class, the distinction isn’t important. Just call them all outliers.
Note: Boxplots that mark the outliers separately are sometimes called modified boxplots. Always mark
outliers!
AP Question:
HW #4 4.19, 21, 25, 29, 30, 32, 33 (bring TI-83 tomorrow)
Monday, September 17: 4.4 More on Standard Deviation/Using the TI-83

Entering data in a list: Stat: Edit
{4, 9, 11, 13, 13, 15, 15, 16, 16, 16, 17, 17, 18, 19, 20, 20}

Naming lists for storage: From the heading of L6, press the right arrow. Then name the list.
You can find it using the List menu or scrolling to the right of L6.

Making histograms: Stat Plot: Plot 1 graph #3
-zoom: 9 zoomstat makes a nice window
-trace will show class boundaries and frequencies
-window: Xscl changes bar width

Making boxplots: Stat Plot graph #4, #5
-#4 shows outliers and #5 does not
-zoom: 9 zoomstat makes a nice window
-trace will show 5 number summary and outliers

Checking for normality: normal quantile plot (Stat Plot: graph #6)

Calculating summary statistics: after data has been entered into L1, choose Stat: calc:1-var stats,
L1.

Making graphs from frequency tables: Enter the values into L1 and the frequencies in L2. Then,
in the Stat plot menu choose L1 for Xlist and L2 for Freq. Make sure to enter 1 for Freq when
you are finished!

Calculating summary statistics from frequency tables: Enter the values into L1 and the
frequencies in L2: Stat: calc: 1-var stats, L1, L2

Other operations with lists: Suppose you had data in L1 and you wanted to add 5 to each
observation. In the header for L2, enter L1+5 and press Enter.

To sort data in a list: Stat: Edit: SortA or SortD
For the following sample of IQ scores, calculate the mean and standard deviation. Then, sketch a
dotplot and label the mean, mean +/- 1 SD, 2 SD, 3 SD. Finally, calculate the percentage of the data that
is within each set of boundaries.
{73, 79, 85, 91, 98, 101, 102, 103, 105, 108, 110, 111, 112, 117, 118, 119, 121, 122, 131, 136}
HW #5: 4.22-4.24, 4.34
Tuesday, September 18: 4.4 More on Standard Deviation
Def: Chebyshev’s Rule states that for any distribution, the proportion of observations that are within k
standard deviations of the mean is at least $1 - \frac{1}{k^2}$.
At least what proportion of data will be within 2 standard deviations of the mean?
3?
4?
Suppose that the distribution of incomes for baseball players has a mean of 2.6 million dollars with a
standard deviation of 5.1 million dollars. At least 75% of players will have incomes between which two
values?
Note: Chebyshev’s rule works for all distributions, but it usually underestimates the proportion of data
within each set of boundaries.
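Chebyshev's Rule is simple enough to tabulate with a few lines of Python. This sketch uses the baseball income figures from above; the function name `chebyshev_bound` is made up for this example.

```python
def chebyshev_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    return 1 - 1 / k**2

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_bound(k):.4f}")

# Baseball incomes (millions of dollars): at least 75% corresponds to k = 2.
mean_income, sd_income = 2.6, 5.1
k = 2
low = mean_income - k * sd_income
high = mean_income + k * sd_income
print(f"at least {chebyshev_bound(k):.0%} of incomes lie between {low:.1f} and {high:.1f}")
```

The lower boundary comes out negative, which no income can be. This is one sign of how conservative the rule is for a strongly skewed distribution.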
Def: The 68-95-99.7 Rule (also called the EMPIRICAL RULE) states that if the data set can be well
approximated by a normal curve, then
-approximately 68% of the observations will be within 1 standard deviation of the mean
-approximately 95% of the observations will be within 2 standard deviations of the mean
-approximately 99.7% of the observations will be within 3 standard deviations of the mean
Note: a normal curve, also called a bell curve, is single peaked, symmetrical, and mound shaped. To
identify one SD from the mean, find the inflection points (where the graph changes from concave up to
concave down).
The heights of students are approximately normal with mean 66” and standard deviation 3”.
• sketch the curve

approximately 95% of heights will be within which 2 values?

what proportion of students will have heights between 63” and 69”?

between 63” and 72”?

between 57” and 69”?
Note: The empirical rule only works for distributions that are approximately normal.
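The kind of reasoning used in the height questions can be sketched in Python. This is only an illustration of the 68-95-99.7 Rule: the helper `proportion_between` is hypothetical, and it assumes both boundaries sit a whole number of standard deviations (0 to 3) from the mean.

```python
# Approximate proportion within k standard deviations of the mean (empirical rule).
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}

def proportion_between(low, high, mean, sd):
    """Approximate proportion between low and high for a roughly normal distribution."""
    def signed_area(x):
        k = round((x - mean) / sd)              # whole number of SDs from the mean
        half = WITHIN[abs(k)] / 2 if k else 0.0  # area between the mean and x
        return half if k > 0 else -half
    return signed_area(high) - signed_area(low)

# Heights: approximately normal with mean 66 inches and SD 3 inches.
print(round(proportion_between(63, 69, 66, 3), 4))
print(round(proportion_between(63, 72, 66, 3), 4))
print(round(proportion_between(57, 69, 66, 3), 4))
```

The idea is to split each interval at the mean and use symmetry: half of 68%, 95%, or 99.7% lies on each side.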
Note: It is very rare for an observation to be more than 3 standard deviations from the mean in a
distribution that is approximately normal!
Measures of Location (using the standard deviation as a ruler):
Suppose that a professional soccer team has the money to sign one additional player and they are
considering adding either a goalie or a forward. The goalie has a 90% save percentage and the forward
averages 1.2 goals a game.
In this league, the average goalie saves 86% of shots with a standard deviation of 5% while the average
forward scores 0.9 goals per game with a standard deviation of 0.2. Who is the better player at his
position?
The goalie is 4% higher than the average and the forward is 0.3 goals higher than average. But, since
we are comparing different units, we cannot just say the goalie is better since 4 > 0.3. To make
comparisons possible, we consider where each player falls in their respective distributions.
90  86
 0.8 standard deviations above the goalie mean.
5
1.2  0.9
 1.5 standard deviations above the forward mean.
The forward is
0.2
The goalie is
When we are using the standard deviation as a ruler to measure how far an observation is above or
below the mean, we are using a ________________________________, or z-score.
$z = \frac{x - \bar{x}}{s}$
Using standardized scores has many advantages. Since z-scores have no units, we can compare values
that are measured on different scales, with different units, or from different populations.
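The soccer comparison can be reproduced with a small Python sketch. The function names `z_score` and `value_from_z` are made up for this example; the second one simply reverses the formula to turn a z-score back into a value on a given scale.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def value_from_z(z, mean, sd):
    """Value that sits z standard deviations from the mean."""
    return mean + z * sd

goalie = z_score(90, 86, 5)         # save percentage vs. league goalies
forward = z_score(1.2, 0.9, 0.2)    # goals per game vs. league forwards

print(f"goalie z = {goalie:.1f}, forward z = {forward:.1f}")
print("better relative to position:", "forward" if forward > goalie else "goalie")
```

Because z-scores have no units, the two players can be compared directly even though one is measured in percent and the other in goals per game.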
Suppose that the distribution of male heights has a mean of 69 inches with a standard deviation of 3
inches and the distribution of female heights has a mean of 64 inches with a standard deviation of 2.5
inches.
Who is taller, relatively speaking, a 72 inch male or a 67 inch female?
An exclusive club only allows members who are especially tall. The height requirement for women is
72 inches. What would the equivalent requirement be for men?
Shifting and Rescaling Data
Suppose that I took a random sample of 9 students and recorded their score on a recent quiz. Their
scores are listed below:
30, 35, 36, 40, 40, 43, 45, 45, 50
Sketch a dotplot for this distribution and list all the summary statistics for this data set (mean, standard
deviation, median, quartiles, IQR, range).
[Blank axis for dotplot: scale from 0 to 100, marked every 20]
Now, suppose that I was feeling especially generous and added 5 points to each score. Sketch the new
distribution (same scale!) and recalculate the summary statistics. Which ones changed? Which did not?
Did the shape change?
[Blank axis for dotplot: scale from 0 to 100, marked every 20]
Now, go back to the original data and assume that the quiz was out of 50 points. To convert these scores
to percents, we could multiply each of them by 2 (since 2 x 50 = 100). Sketch the new distribution
(same scale!) and recalculate the summary statistics. Which ones changed? Which did not? Did the
shape change?
[Blank axis for dotplot: scale from 0 to 100, marked every 20]
A Manhattan Taxi cab company studied the lengths of taxi rides and computed the following statistics
(in miles): mean = 4.8, standard deviation = 4.2, median = 3.6, Q1 = 1.8, Q3 = 5.9, IQR = 4.1
If they converted the mileage measurements to km, what will the summary statistics be in km? (1 mile ≈
1.61 km)
If the cost of a taxi ride is $2.50 plus $3.20 per mile, what are the summary statistics for the distribution
of costs?
Note: When we convert measurements to z-scores, we are simply shifting and rescaling the data.
Subtracting the mean shifts the center (mean) of the distribution to 0. Dividing by the standard deviation
rescales the data by making the spread (standard deviation) equal to 1. Neither of these operations
changes the shape of the distribution.
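The effect of shifting and rescaling can be checked numerically with a Python sketch like the one below. It uses the nine quiz scores from this section; the helper `summarize` is a name invented for this example and reports only a few of the summary statistics.

```python
from statistics import mean, median, stdev

def summarize(data):
    """A few summary statistics: two for center, two for spread."""
    return {
        "mean": mean(data),
        "median": median(data),
        "sd": stdev(data),                 # sample standard deviation
        "range": max(data) - min(data),
    }

scores = [30, 35, 36, 40, 40, 43, 45, 45, 50]
shifted = [x + 5 for x in scores]    # add 5 points to every score
scaled = [x * 2 for x in scores]     # convert a 50-point quiz to percents

for label, data in (("original", scores), ("plus 5", shifted), ("times 2", scaled)):
    stats = summarize(data)
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

Adding a constant moves the measures of center by that constant and leaves the measures of spread alone; multiplying by a constant multiplies both center and spread by it.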
HW #6: 4.35-41, 44, JMP worksheet
Thursday, September 20: Review chapters 3-4
Matching Distributions Activities
AP Question:
AP Question:
HW #7: 3.46, 3.49, 4.51, 4.52, 4.58, 4.59, 4.61, 4.64
Monday, September 24: Test chapters 3 and 4
Using JMP-Intro for Univariate Data Analysis
1: Entering the data
Open JMP-Intro and choose “New Data Table” from the File menu. Click on the first column to
change its name. To add more columns, double click immediately to the right of the last column.
JMP-Intro makes you specify the variable type for each column (JMP calls this “modeling type”). It
will choose different plots and analyses based on the variable type.
• All numerical variables are called “Continuous” (even if they are discrete). This is the default
setting unless you change it. These variables are labeled “C”.
• Categorical variables are called “Nominal” (nominal means name) or “Ordinal” (ordinal
variables have a natural order such as first place, second place, etc.) These variables are
abbreviated “N” or “O”. When you try to enter a word or letter, JMP will ask you if you want to
change the variable to a character column. Choosing yes will make it a Nominal (N) variable.
Enter the following data for a sample of 20 students.
Class (N)   Gender (N)   AP Score (C)   Final Exam (C)
soph        m            5              92
soph        f            5              85
soph        m            4              88
soph        f            4              72
soph        m            3              72
jun         f            5              91
jun         m            4              68
jun         f            3              76
jun         m            1              51
jun         f            1              33
sen         m            5              99
sen         f            4              91
sen         m            4              85
sen         f            4              82
sen         m            3              83
sen         f            3              74
sen         m            3              72
sen         f            2              48
sen         m            2              67
sen         f            1              31
2: Analyzing the Distribution of a Categorical Variable:
In the Analyze Menu, choose Distribution. In the Dialog box, enter “Class” for Y and click OK. This
will produce a bar chart, segmented bar chart, and a table of frequencies and relative frequencies.
If you click on the red arrow under “Distribution” and choose Stack, the bar chart will become
horizontal. Clicking the red arrow under “Class” will give you other options (we will learn about most
of these later). For now, look under histogram options and choose “count axis”. This will put the
frequencies on the y-axis. “Prob. axis” will display the relative frequencies.
Another way to enter categorical data is with a frequency table. For example, if you enter the data
below, to get a bar chart, choose Analyze: Distribution, enter “class” for Y and “freq” for Frequency.
Class   Frequency
soph    5
jun     5
sen     10
3: Analyzing the Distribution of a Numerical Variable
In the Analyze Menu, choose Distribution. In the dialog box, enter “Final Exam” for Y and hit OK.
This produces a histogram, boxplot, and a variety of summary statistics. Like before, you can change to
a horizontal layout by selecting stack from the red arrow under “distribution.”
• On the histogram, you can double click on the x-axis to change the class size. Likewise, you can
choose the hand tool from the menu bar and grab the histogram to change the classes.
• On the boxplot, the red bracket is the densest 50% and the diamond is the 95% confidence
interval for the mean.
• By clicking on the red arrow under final exam, you can adjust the histogram, display a normal
quantile plot to assess normality, a stem and leaf plot, a CDF plot (ogive), and do other things
reserved for second semester.
4: More fun stuff
In the Analyze: Distribution menu, choose all 4 variables for Y. Make sure you can see all 4 graphs.
Now, click on a row in the data table. This student will be highlighted in all four plots. Next, click on a
bar in any of the plots. The students in this bar will be highlighted in all the other graphs and on the data
table. How do the sophomores compare to the juniors? The males and females? You can also create
new data tables with subsets of selected data (Tables: subset).
Suppose we wanted to make 2 separate graphs to compare the final exam scores for males and females.
There are 2 ways to do this. With the data table we have, the easiest way is to go to the Analyze:
Distribution menu, choose final exam for Y and gender for the box called “By”. Make the display
horizontal and choose “Uniform Scaling” under both red arrows (under Distribution). This will make
the graphs easy to compare.
The second way to compare 2 distributions is to enter each in a separate column (one for male scores
and another for female scores). Then, choose both columns for Y and choose uniform scaling for the
graphs.
It is relatively easy to import data from other applications, such as Excel. Cutting and pasting works
best, but you can also try to open another file with JMP and follow the importing directions. To include
output from JMP-Intro in a word processing document, choose the “fat plus” from the menu bar (the
“dotted box” on a Mac) and highlight what you want. Then copy and paste!
Homework:
1. Analyze the distribution of Final exam scores for all students. Cut and paste the histogram, boxplot,
stem-and-leaf plot, and summary statistics.
2. Analyze the distribution of final exam scores for sophomores, juniors, and seniors. Cut and paste the
histograms and make sure to use the same scale. Compare the distributions in a few sentences.
Remember, shape, center, spread, and unusual values!
3. Currently, AP Score should be a continuous variable. To change this, double click on the column
heading and choose “ordinal” under modeling type. Analyze the distribution under both modeling types.
How is the output different?
4. Cut and paste a segmented bar chart for class.