LAB ACTIVITY 3
Due Friday, Sept. 9 at 11:59pm
(Use ‘ClassData_FA2016.MTW’ Dataset in ANGEL)
Activity 1: Graphical Summaries for Quantitative Data
1. First we will investigate the variable ‘Height’.
Create a histogram for ‘Height’: Graph > Histogram > Simple > OK > select ‘Height’ > OK
a. Does it look like there are some data points that are… ridiculous?
b. Let’s get rid of the ridiculous data points! Make a boxplot of ‘Height’ using
Graph > Boxplot > Simple > select ‘Height’ > OK
How many outliers can you see? __________________
Find out which observations represent the five outliers that have height 10 inches or less.
You can do this by hovering your mouse over the little stars on the boxplot. This will tell
you which observations provided these values. If you can only see a couple outliers,
double-click on them, then select ‘Identical Points’ > Jitter > OK.
What rows are these observations in? __________, __________, _______, _________,
and _____________.
Now use this information to replace those outlying heights with an asterisk (*). This will
tell Minitab to treat them as ‘missing’, and we won’t mess up the analysis. Remember,
raw data can be messy!
c. After removing these observations, make the histogram again. How would you describe
the shape of the distribution?
Roughly Symmetric
Left-Skewed
Right-skewed
Let’s get the numerical summary for this variable as well.
Stat > Basic Statistics > Display Descriptive Statistics > click in the Variables box > select ‘Height’ > OK
d. What is the mean?
e. What is the standard deviation (StDev)?
f. Complete the 5-number summary below:
Minimum =
Q1 =
Median =
Q3 =
Maximum =
g. Write a sentence that interprets the median.
h. Give the value that completes the following sentence: About 1/4 of the students are less
than ___ inches tall.
i. Give the value that completes the following sentence: About 1/4 of the students are more
than ___ inches tall.
j. What is an interval that describes the middle one half of the students’ heights?
k. Calculate the Inter-Quartile Range (IQR): Q3 – Q1
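For readers who want to check the same kind of summary outside Minitab, here is a minimal Python sketch. The heights are hypothetical values made up for illustration, not the class dataset, and `math.nan` stands in for Minitab's asterisk (*) for missing values. Quartile conventions differ between software packages, so the results may not match Minitab's output exactly.

```python
import math
import statistics

# Hypothetical heights in inches (made up; NOT from ClassData_FA2016.MTW).
heights = [64, 70, 5, 68, 72, 66, 6, 61, 75, 69, 67]

# Mark implausible heights (10 inches or less) as missing,
# similar in spirit to typing * into the Minitab worksheet.
cleaned = [h if h > 10 else math.nan for h in heights]

# Drop the missing values before computing any summaries.
valid = sorted(h for h in cleaned if not math.isnan(h))

# statistics.quantiles with n=4 returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(valid, n=4)
iqr = q3 - q1

print("Mean    =", statistics.mean(valid))
print("StDev   =", round(statistics.stdev(valid), 2))
print("Minimum =", valid[0])
print("Q1      =", q1)
print("Median  =", median)
print("Q3      =", q3)
print("Maximum =", valid[-1])
print("IQR     =", iqr)
```

The interval from Q1 to Q3 printed here is the one that covers the middle half of the (hypothetical) heights, and its width is the IQR.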
2. Now let’s compare the variable ‘Height’ for the different genders. Create side-by-side boxplots:
Graph > Boxplot > One Y With Groups > OK > put ‘Height’ in the ‘Graph variables’ box > put
‘Gender’ in the ‘Categorical variables for grouping’ box > OK. Now, considering only the
boxplots for ‘Male’ and ‘Female’:
a. Which gender, M or F, has the higher median?
b. Which gender, M or F, has the larger middle box?
c. Which gender, M or F, has more outliers?
d. Are there any other noticeable differences between genders in their distribution of height?
Creating a side-by-side boxplot like this one is one of the first steps in answering the following
question: Is there a statistically significant difference in height between college aged men and
women? More on this later in the semester.
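The comparison that the side-by-side boxplots show visually can also be sketched numerically. The short Python example below groups hypothetical (gender, height) records, made up for illustration and not taken from the class dataset, and reports each group's median and IQR. The IQR is the length of the middle box in a boxplot.

```python
import statistics
from collections import defaultdict

# Hypothetical (gender, height in inches) records; NOT the class data.
records = [
    ("F", 62), ("F", 64), ("F", 65), ("F", 67), ("F", 70),
    ("M", 66), ("M", 69), ("M", 70), ("M", 72), ("M", 75),
]

# Collect the heights for each gender.
by_group = defaultdict(list)
for gender, height in records:
    by_group[gender].append(height)

# Median and IQR (box length) for each group.
group_stats = {}
for gender, values in by_group.items():
    q1, median, q3 = statistics.quantiles(values, n=4)
    group_stats[gender] = {"median": median, "iqr": q3 - q1}
    print(f"{gender}: median = {median}, IQR = {q3 - q1}")
```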
Activity 2: Empirical Rule
1. Car and truck speeds at a particular location have approximately a bell-shaped
distribution with mean = 65 mph and standard deviation = 5 mph.
Draw a picture to show the approximate distribution of the speeds. Make sure to label
important values on the horizontal axis such as the mean, the mean plus and minus one
standard deviation, the mean plus and minus two standard deviations, and the mean plus
and minus three standard deviations.
a. About 68% of cars and trucks travel between ______ and _____ mph at this
location.
b. About 95% of cars and trucks travel between _____ and _______ mph at this
location.
c. About 99.7% of cars and trucks travel between _____ and _____ mph at this
location.
2. The typical amount of sleep per night for college students has a bell-shaped distribution
with a mean of 7 hours and a standard deviation equal to 1.7 hours.
Use the empirical rule (and a picture, if you want) to complete the following sentences:
a. About 68% of college students typically sleep between _____ and _____ hours.
b. About 95% of college students typically sleep between _____ and _____ hours.
c. About 99.7% of college students typically sleep between _____ and _____ hours.
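The arithmetic behind the empirical rule is the same in every problem: go one, two, and three standard deviations on either side of the mean. A minimal Python sketch follows. The function name `empirical_rule_intervals` and the example values (mean 100, standard deviation 15) are hypothetical and chosen only for illustration, so substitute the numbers from each problem.

```python
def empirical_rule_intervals(mean, sd):
    """Return {coverage label: (low, high)} for 1, 2, and 3 standard deviations."""
    coverages = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        label: (mean - k * sd, mean + k * sd)
        for k, label in coverages.items()
    }

# Hypothetical bell-shaped variable with mean 100 and standard deviation 15.
intervals = empirical_rule_intervals(mean=100, sd=15)
for label, (low, high) in intervals.items():
    print(f"About {label} of values fall between {low} and {high}")
```

These intervals are only approximate, and they apply only when the distribution is roughly bell-shaped.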
3. A sample of n=500 individuals is asked how many hours they typically spend using a
computer in a week. The mean response is 8.3 hours and the standard deviation is 7.2
hours. Explain why the distribution cannot be bell-shaped.