Activity 3: Z-Scores and Graphing Data/Distributions
The SPSS file for tonight’s class is a dataset on NBA players. Descriptive statistics (Z-scores and
data distributions are still descriptive) can be used for any number of reasons; this time we will be
analyzing the performance of individual basketball players. As you fill in the information below, please
type in RED or BOLD font so it is easy to see.
A. Download the SPSS file from the web site, ‘NBAScoring’. Open this file using SPSS.
This file contains data from the entire 82-game 2008-2009 NBA season. All NBA players were
eligible to be in the file, but I trimmed out Centers (where it was their only position listed) and players
who played in fewer than 20 games (they likely had a major injury or were released/signed in-season). Please
note that the data are summarized as each player’s per-game average. You should find each player’s name, along
with the most commonly used basketball statistics. Take a second to look over the data and make sure
you understand the organization and naming. The “variable view” can be used to see the full
explanation of each statistical abbreviation (e.g., @3PP is the player’s 3-point shooting percentage).
B. Z-scores, Points/game
Points scored per game is an often-discussed statistic. Calculate the mean, median and
standard deviation for the points statistic (PTS):
1) Points Mean =
2) Points Median =
3) Points Standard Deviation =
4) What is our sample size (n)? =
5) Judging only from the different measures (mean, median), is this distribution positively or negatively
skewed?
6) Tim Duncan’s average points per game was 19.3 this season. What is his z-score? What is the
interpretation of this?
You can use the website below to enter a z-score (select one-sided) and calculate what percentage of
players score above/below a z-score of 1.5. This assumes a normal distribution – but we’ll ignore that
for the sake of the example. Check it out and report the values below:
http://www.measuringusability.com/pcalcz.php
7) About what percentage of players score above 19.3 points? Below?
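The Part B calculations can also be checked by hand. The sketch below shows one way to do that in Python. The point values are made up for illustration (they are not from the NBAScoring file), and it assumes the sample standard deviation (n − 1 denominator), which is what SPSS reports. The one-sided percentages assume a normal distribution, as the online calculator does.

```python
from statistics import NormalDist, mean, median, stdev

# Hypothetical per-game scoring averages; NOT taken from the NBAScoring file.
pts = [4.2, 6.1, 7.5, 8.0, 9.3, 10.4, 12.8, 15.0, 19.3, 25.4]

pts_mean = mean(pts)
pts_median = median(pts)
pts_sd = stdev(pts)  # sample standard deviation (n - 1 denominator)

# A mean above the median suggests a positive (right) skew, and vice versa.
skew_direction = "positive" if pts_mean > pts_median else "negative"


def z_score(x, m, s):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - m) / s


z = z_score(19.3, pts_mean, pts_sd)

# One-sided areas under the standard normal curve (assumes normality).
pct_below = NormalDist().cdf(z) * 100
pct_above = 100 - pct_below

print(f"mean={pts_mean:.2f} median={pts_median:.2f} sd={pts_sd:.2f} skew={skew_direction}")
print(f"z={z:.2f} below={pct_below:.1f}% above={pct_above:.1f}%")
```

Swapping in the real PTS column (exported from SPSS) for the `pts` list should give values that match the Descriptives output.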
C. Z-scores Continued
You could go through and calculate all the z-scores by hand, or use the “compute” feature to
create your own equation – but it’s far easier to let SPSS do the work for you. Go to “Analyze” >
“Descriptive Statistics” > “Descriptives”. Move the “points” variable into the Variables box. Now, make
sure the box in the lower left corner is checked, “Save standardized values as variables”. Checking this
box will automatically calculate a z-score for every player and save it as a new variable, “ZPTS”. Hit “Ok”
to run the analysis. Go back to your data sheet and scroll to the right; the new variable should be there.
8) Determine the mean and standard deviation of this new variable, “ZPTS”. You should know these
answers already – but prove it to yourself anyway. Mean = , SD =
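The “Save standardized values as variables” step can be mimicked outside SPSS. This is a minimal Python sketch that uses made-up point values (not the NBAScoring data) and assumes the sample standard deviation. The helper name `standardize` is hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return a z-score for every value, using the sample SD (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical points-per-game values; not from the NBAScoring file.
pts = [4.2, 6.1, 7.5, 8.0, 9.3, 10.4, 12.8, 15.0, 19.3, 25.4]
zpts = standardize(pts)

# Mean and SD of the standardized variable (compare with question 8).
print(round(mean(zpts), 6), round(stdev(zpts), 6))
```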
D. Fun with Z-Scores
As discussed, z-scores are great for making comparisons between different variables. You could,
for example, determine whether a player was a better scorer than he was a rebounder, or a better defender
than scorer. Create a z-score variable for Points (already done), Total Rebounds, Assists, Steals, and
Blocks. Go back to part C and redo the steps for the new variables.
Once you’ve made all your new z-score variables, find Dwight Howard (center/forward for the
Orlando Magic). The names are in alphabetical order (by first name). If you have not sorted your data
file, Dwight Howard should be case #102. List his z-scores for each of the variables below.
9) Z-scores:
Points =
Total Rebounds =
Assists =
Steals =
Blocks =
10) Compared to the rest of the players in this dataset – what is Dwight Howard the best at? What is he
the worst at?
Points, total rebounds, and assists are probably the most discussed statistics in basketball. Use
the “Transform” > “Compute” function to create a new variable called “ZTotal”. Add up the z-scores for
Points, Rebounds, and Assists (ZPTS + ZTRB + ZAST). This new variable will be a summary of the z-scores
for all the players. After you have made the variable, go back to your data sheet and sort the file in
descending order by ZTotal (go to “Data” > “Sort Cases”).
11) Who are the top 3 players with the highest ZTotal Score? What does this mean?
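The Compute and Sort Cases steps for ZTotal can be sketched in Python as follows. The player names and per-game averages are invented for illustration, and the helper `standardize` is a hypothetical name. It assumes z-scores built from the sample standard deviation.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical players and per-game averages; not from the NBAScoring file.
players = ["Player A", "Player B", "Player C", "Player D"]
pts = [25.0, 12.0, 8.0, 15.0]
trb = [5.0, 11.0, 3.0, 6.0]
ast = [7.0, 1.5, 2.0, 9.0]

zpts, ztrb, zast = standardize(pts), standardize(trb), standardize(ast)

# ZTotal = ZPTS + ZTRB + ZAST, as in the Transform > Compute step.
ztotal = [p + r + a for p, r, a in zip(zpts, ztrb, zast)]

# Sort in descending order by ZTotal, like Data > Sort Cases.
ranking = sorted(zip(players, ztotal), key=lambda row: row[1], reverse=True)
for name, total in ranking:
    print(f"{name}: {total:+.2f}")
```

Because each z-score is measured in standard deviations, the three statistics contribute on the same scale even though their raw units differ.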
D. Graphing – Frequency Distributions
Injuries and durability are a critical component of professional sports. Create a frequency
distribution of games played. Go to “Analyze” > “Descriptive Statistics” > “Frequencies”. Move “Games
Played” into the variables box. This time, in the lower left-hand corner, make sure that the “Display
Frequencies Table” box is checked. This will create a frequency distribution for Games Played.
Remember, I eliminated players who played in fewer than 20 games – but…
12) What percentage of players played in all 82 games?
13) What percentage of players played in 59 or fewer games?
14) What percentage of players played in 90% or more of the NBA season (≥ 74 games)?
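A frequency table with percent and cumulative percent columns, similar in spirit to the SPSS Frequencies output, can be sketched in Python. The games-played values are invented for illustration, and `percent_where` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical games-played values; not from the NBAScoring file.
games = [82, 82, 79, 75, 74, 70, 66, 59, 59, 45, 33, 21]

n = len(games)
counts = Counter(games)

# Frequency table: value, count, percent, cumulative percent (ascending order).
cumulative = 0.0
table = []
for value in sorted(counts):
    percent = counts[value] / n * 100
    cumulative += percent
    table.append((value, counts[value], percent, cumulative))
    print(f"{value:>3} {counts[value]:>3} {percent:6.1f} {cumulative:6.1f}")


def percent_where(predicate):
    """Percentage of players whose games-played value satisfies predicate."""
    return sum(1 for g in games if predicate(g)) / n * 100


pct_all_82 = percent_where(lambda g: g == 82)
pct_59_or_fewer = percent_where(lambda g: g <= 59)
pct_74_or_more = percent_where(lambda g: g >= 74)
```

The “59 or fewer” figure corresponds to the cumulative percent on the 59 row, while the “74 or more” figure is 100 minus the cumulative percent on the last row below 74.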
Create a histogram for the games played variable. Go to “Graphs” > “Legacy Dialogs” >
“Histogram”. Move “Games Played” into the variable box and hit “Ok”.
15) Describe this distribution in a sentence or two.
E. Graphing – Box and Whisker Plots/BoxPlots
Create a box plot for both scoring and turnovers. Go to “Graphs” > “Legacy Dialogs” >
“BoxPlot”. Select “Summaries of separate variables”. Where it asks you what you want the plots to
represent, move over the points and turnovers variables (do not use the z-scores here). Click “Ok” to
create the plots. SPSS tries to make things easy for you by highlighting scores and providing the case
numbers of subjects that are more than 3 standard deviations above or below the mean – these could
be outliers. However, we know this isn’t the case in this dataset.
16) Who are the two “outliers” in points?
17) Who is this turnover machine, with more than 3 SD’s of turnovers above the mean?
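Flagging unusually high or low values can be sketched in Python. This example uses the widely used 1.5 × IQR fence convention for box plots, which is an assumption of the sketch; the quartile method and cutoffs SPSS applies may differ from it. The player names and turnover values are invented.

```python
from statistics import quantiles

# Hypothetical turnovers-per-game values; not from the NBAScoring file.
turnovers = {
    "Player A": 1.0, "Player B": 1.2, "Player C": 1.5, "Player D": 1.8,
    "Player E": 2.0, "Player F": 2.1, "Player G": 2.4, "Player H": 2.6,
    "Player I": 3.0, "Player J": 6.5,
}

# Quartiles using the statistics module's default ("exclusive") method.
q1, _, q3 = quantiles(list(turnovers.values()), n=4)
iqr = q3 - q1

# Fences: anything beyond 1.5 * IQR from the box edges is flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

flagged = [
    (name, value)
    for name, value in turnovers.items()
    if value < lower_fence or value > upper_fence
]
print(flagged)
```

A flagged value is only a candidate outlier; whether it is an error or a genuinely extreme player is a judgment about the data.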
F. Graphing – Scatterplots
Create a scatterplot of Minutes played per game and points. Go to “Graphs” > “Legacy Dialogs”
> “Scatter/Dot”. There are many options here, but we will keep it simple. Choose “Simple Scatter”.
Move “Minutes Played” to the x-axis and “Points Scored” onto the y-axis.
18) Describe the plot in one or two sentences. Do the highest scoring players play more minutes
(because they’re good) – or does playing more minutes mean the players will score more points (more
opportunities)?
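The strength of the linear relationship in a scatterplot can be summarized with a Pearson correlation. The sketch below computes it from scratch in Python on invented minutes and points values (not the NBAScoring data); `pearson_r` is a hypothetical helper name. A correlation describes how closely the points follow a line but says nothing about which variable drives the other.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-game values; not from the NBAScoring file.
minutes = [12.0, 18.5, 24.0, 30.2, 35.1, 38.0]
points = [3.1, 6.0, 9.5, 14.2, 19.3, 25.4]

r = pearson_r(minutes, points)
print(f"r = {r:.3f}")
```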
Create one more scatterplot. This time, put Points scored on the x-axis and the z-score variable for
points that we made earlier (ZPTS) on the y-axis. Plot the data.
19) Describe the plot in one or two sentences. Why does it look like it does?
20) Save this Word file, add your homework below later in the week, print it off, and let me know that
you completed it before class next Monday night.