Activity 3: Z-Scores and Graphing Data/Distributions

The SPSS file for tonight’s class is a dataset on NBA players. Descriptive statistics (z-scores and data distributions are still descriptive) can be used for any number of reasons; this time we will be analyzing the performance of individual basketball players. As you fill in the information below, please type in RED or BOLD font so it is easy to see.

A. Download the SPSS file from the web site, ‘NBAScoring’. Open this file using SPSS. This file contains data from the entire 82-game 2008-2009 NBA season. All NBA players were eligible to be in the file, but I trimmed out Centers (where it was their only position listed) and players who played in fewer than 20 games (they likely had a major injury or were released/signed in-season). Please note that the data is summarized as each player’s per-game average. You should find their name, along with the most commonly used basketball statistics. Take a second to look over the data and make sure you understand the organization and naming. The “variable view” can be used to see the full explanation of each statistical abbreviation (e.g., @3PP is the player’s 3-point shooting percentage).

B. Z-scores, Points/game

Points scored per game is an often-discussed statistic. Calculate the mean, median and standard deviation for the points statistic (PTS):

1) Points Mean =
2) Points Median =
3) Points Standard Deviation =
4) What is our sample size (n)? =
5) Judging only from the different measures (mean, median), is this distribution positively or negatively skewed?
6) Tim Duncan’s average points per game was 19.3 this season. What is his z-score? What is the interpretation of this?

You can use the website below to enter a z-score (select one-sided) and calculate what percentage of players score above/below a z-score of 1.5. This assumes a normal distribution, but we’ll ignore that for the sake of the example.
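As a rough cross-check on the website, the same calculation can be sketched in Python. The value, mean, and standard deviation below are made-up placeholders, not numbers from the NBAScoring file; substitute the values you found in questions 1 and 3. Like the website, this sketch assumes a normal distribution.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

def percent_below_and_above(z):
    """Return (% below, % above) a z-score under a standard normal curve."""
    below = NormalDist().cdf(z) * 100
    return below, 100 - below

# Hypothetical placeholders, NOT the real dataset values.
example_value = 16.0
example_mean = 10.0
example_sd = 6.0

z = z_score(example_value, example_mean, example_sd)  # (16 - 10) / 6 = 1.0
below, above = percent_below_and_above(z)
print(f"z = {z:.2f}: {below:.1f}% below, {above:.1f}% above")
# prints "z = 1.00: 84.1% below, 15.9% above"
```

The "% above" figure corresponds to the one-sided result the website reports for a positive z-score.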
Check out the website and report the values below: http://www.measuringusability.com/pcalcz.php

7) About what percentage of players score above 19.3 points? Below?

C. Z-scores Continued

You could go through and calculate all the z-scores by hand, or use the “compute” feature to create your own equation, but it’s far easier to let SPSS do the work for you. Go to “Analyze” > “Descriptive Statistics” > “Descriptives”. Move the “points” variable into the Variables box. Now, make sure the box in the lower left corner, “Save standardized values as variables”, is checked. Checking this box will automatically calculate a z-score for every player and save it as a new variable, “ZPTS”. Hit “Ok” to run the analysis. Go back to your data sheet and scroll to the right; the new variable should be there.

8) Determine the mean and standard deviation of this new variable, “ZPTS”. You should know these answers already, but prove it to yourself anyway. Mean = , SD =

D. Fun with Z-Scores

As discussed, z-scores are great for making comparisons between different variables. You could, for example, determine whether a player was a better scorer than rebounder, or a better defender than scorer. Create a z-score variable for Points (already done), Total Rebounds, Assists, Steals, and Blocks. Go back to part C and redo the steps for the new variables. Once you’ve made all your new z-score variables, find Dwight Howard (center/forward for the Orlando Magic). The names are in alphabetical order (by first name). If you have not sorted your data file, Dwight Howard should be case #102. List his z-scores for each of the variables below.

9) Z-scores:
Points =
Total Rebounds =
Assists =
Steals =
Blocks =
10) Compared to the rest of the players in this dataset, what is Dwight Howard the best at? What is he the worst at?

Points, total rebounds, and assists are probably the most discussed statistics in basketball. Use the “Transform” > “Compute” function to create a new variable called “ZTotal”.
Add up the z-scores for Points, Rebounds, and Assists (ZPTS + ZTRB + ZAST). This new variable will be a summary of the z-scores for all the players. After you have made the variable, go back to your data sheet and sort the file in descending order by ZTotal (go to “Data” > “Sort Cases”).

11) Who are the 3 players with the highest ZTotal scores? What does this mean?

D. Graphing – Frequency Distributions

Injuries and durability are a critical component of professional sports. Create a frequency distribution of games played. Go to “Analyze” > “Descriptive Statistics” > “Frequencies”. Move “Games Played” into the variables box. This time, in the lower left hand corner, make sure that the “Display Frequencies Table” box is checked. This will create a frequency distribution for Games Played. Remember, I eliminated players who played in fewer than 20 games, but…

12) What percentage of players played in all 82 games?
13) What percentage of players played in 59 or fewer games?
14) What percentage of players played in 90% or more of the NBA season (74 or more games)?

Create a histogram for the games played variable. Go to “Graphs” > “Legacy Dialogs” > “Histogram”. Move “Games Played” into the variable box and hit “Ok”.

15) Describe this distribution in a sentence or two.

E. Graphing – Box and Whisker Plots/BoxPlots

Create a box plot for both scoring and turnovers. Go to “Graphs” > “Legacy Dialogs” > “BoxPlot”. Select “Summaries of separate variables”. Where it asks you what you want the plots to represent, move over the points and turnovers variables (do not use the z-scores here). Click “Ok” to create the plots. SPSS tries to make things easy for you by highlighting scores and providing the case numbers of subjects that are more than 3 standard deviations above or below the mean; these could be outliers. However, we know this isn’t the case in this dataset.

16) Who are the two “outliers” in points?
17) Who is this turnover machine, with turnovers more than 3 SDs above the mean?

F. Graphing – Scatterplots

Create a scatterplot of minutes played per game and points. Go to “Graphs” > “Legacy Dialogs” > “Scatter/Dot”. There are many options here, but we will keep it simple. Choose “Simple Scatter”. Move “Minutes Played” onto the x-axis and “Points Scored” onto the y-axis.

18) Describe the plot in one or two sentences. Do the highest scoring players play more minutes (because they’re good), or does playing more minutes mean the players will score more points (more opportunities)?

Create one more scatterplot; this time, put Points Scored on the x-axis and the ZPoints variable we made earlier on the y-axis. Plot the data.

19) Describe the plot in one or two sentences. Why does it look like it does?

20) Save this Word file and, later in the week, add your homework below, print it off, and let me know that you completed it before class next Monday night.
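For anyone curious about what SPSS is doing in parts C and D, the standardizing and ZTotal steps can be sketched in Python. The three players and their numbers below are invented for illustration and are not from the NBAScoring file. The sketch assumes SPSS standardizes with the sample standard deviation (dividing by n - 1), and it borrows the variable names ZPTS, ZTRB, and ZAST from the handout.

```python
from statistics import mean, stdev

# Invented per-game averages, NOT taken from the NBAScoring file.
players = [
    {"name": "Player A", "PTS": 22.0, "TRB": 5.0, "AST": 7.0},
    {"name": "Player B", "PTS": 10.0, "TRB": 11.0, "AST": 2.0},
    {"name": "Player C", "PTS": 4.0, "TRB": 2.0, "AST": 3.0},
]

def add_z_scores(rows, column):
    """Add a 'Z<column>' entry to every row: (value - mean) / sample SD."""
    values = [row[column] for row in rows]
    m, sd = mean(values), stdev(values)  # stdev divides by n - 1
    for row in rows:
        row["Z" + column] = (row[column] - m) / sd

# Mirrors checking "Save standardized values as variables" for each statistic.
for column in ("PTS", "TRB", "AST"):
    add_z_scores(players, column)

# Mirrors Transform > Compute (ZTotal = ZPTS + ZTRB + ZAST),
# then Data > Sort Cases in descending order.
for row in players:
    row["ZTotal"] = row["ZPTS"] + row["ZTRB"] + row["ZAST"]
players.sort(key=lambda row: row["ZTotal"], reverse=True)

for row in players:
    print(f'{row["name"]}: ZPTS={row["ZPTS"]:.2f}, ZTotal={row["ZTotal"]:.2f}')
```

Swapping in real columns exported from SPSS should reproduce the saved z-score variables, provided the n - 1 assumption holds for your version.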