Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations

Outline and Definitions

2.1 Summarizing Categorical Data (Single Variable)

A] Frequency, Relative Frequency and % Frequency Distribution

i) Frequency Distribution: tabular summary of data showing the frequency (number) of data items in each non-overlapping class

ii) Relative Frequency Distribution: tabular summary of data showing the relative frequency of data items in each non-overlapping class
Relative Frequency: fraction or proportion of the total number of data items belonging to the class
Relative Frequency = Frequency / n, where n = total # observations

iii) % Frequency Distribution: tabular summary of data showing the % frequency of data items in each non-overlapping class
% Frequency = Relative Frequency x 100 = (Frequency / n) x 100, where n = total # observations

Example: [example not reproduced in transcript]

B] Bar Chart and Pie Chart: additional types of graphics used to describe categorical data

i) Bar Charts
Bar charts can be used for frequency, relative frequency and percent frequency distributions.
On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes.
A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis).
The bars are separated to emphasize the fact that each class is a separate category.

ii) Pie Charts
Pie charts are used with relative and % frequency distributions.
The circle is subdivided into sectors, one per class; each sector's angle is the class's relative frequency multiplied by 360 degrees.
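The definitions in 2.1 can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are hypothetical and do not come from the course material.

```python
from collections import Counter


def categorical_distributions(data):
    """Return {class: (frequency, relative frequency, % frequency)}."""
    n = len(data)  # n = total number of observations
    counts = Counter(data)  # frequency of each non-overlapping class
    return {
        label: (freq, freq / n, freq * 100 / n)
        for label, freq in counts.items()
    }


def pie_angles(data):
    """Sector angle in degrees per class: relative frequency x 360."""
    n = len(data)
    return {label: freq * 360 / n for label, freq in Counter(data).items()}


# Hypothetical sample of 10 categorical observations
drinks = ["Cola", "Tea", "Cola", "Water", "Cola",
          "Tea", "Water", "Cola", "Tea", "Cola"]

table = categorical_distributions(drinks)
for label, (freq, rel, pct) in table.items():
    print(f"{label:<6} {freq:>3} {rel:>6.2f} {pct:>6.1f}%")

angles = pie_angles(drinks)
print(angles)
```

The relative frequencies sum to 1 and the sector angles sum to 360 degrees, which is a quick check that every observation was assigned to exactly one class.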
2.2 Summarizing Quantitative Data (Single Variable)

A] Frequency Distribution
Key is to develop non-overlapping classes that are neither too broad nor too narrow
Guideline for # classes: Use between 5 and 20 classes, with larger data sets requiring more classes than smaller ones
Guideline for width of classes: Use classes of equal width
Approximate class width = (Largest Data Value - Smallest Data Value) / # classes

B] Histogram
Histograms can be used for frequency, relative frequency and percent frequency distributions.
Unlike a bar chart, there is no separation between rectangles of adjacent classes (shows continuous nature of data).
Variable of interest is normally on the horizontal axis and frequency, relative frequency or percent frequency on the vertical axis.
Distribution shapes shown by a histogram:
i) Symmetric Distribution: left tail is mirror image of right tail
ii) Skewed Left: longer tail to the left
iii) Skewed Right: longer tail to the right

C] Cumulative Distributions
i) Cumulative Frequency Distribution: shows the # items with values less than or equal to each upper class limit
ii) Cumulative Relative Frequency Distribution: proportion of items with values less than or equal to each upper class limit
iii) Cumulative % Frequency Distribution: % of items with values less than or equal to each upper class limit

Ogive: a line graph of a cumulative distribution
Data values are shown on the horizontal axis and the cumulative frequency, cumulative relative frequency or cumulative % frequency is shown on the vertical axis.
Because the ogive is a continuous line, each class must be represented by a single plotted value.
Ogive class values are the values halfway between adjacent class limits, that is, between the upper class limit of one class and the lower class limit of the next (59.5 is used for class 50-59, 69.5 is used for class 60-69, etc.). This preserves continuity.
A cumulative value of 0 is plotted to show that no data exist below the lowest class (for whole numbers, at 0.5 below the lower limit of the lowest class).
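The class-width guideline, the frequency distribution and the ogive plotting points from 2.2 can be worked through in a short Python sketch. The scores, the function names and the choice of five classes starting at 50 are assumptions made for illustration; only the 50-59 / 59.5 convention comes from the notes.

```python
def approximate_class_width(data, num_classes):
    """(Largest data value - smallest data value) / number of classes."""
    return (max(data) - min(data)) / num_classes


def frequency_distribution(data, first_lower, width, num_classes):
    """Count whole-number data in classes such as 50-59, 60-69, ..."""
    classes = []
    for k in range(num_classes):
        lower = first_lower + k * width
        upper = lower + width - 1  # whole-number upper class limit
        freq = sum(lower <= x <= upper for x in data)
        classes.append((lower, upper, freq))
    return classes


def ogive_points(classes):
    """(x, cumulative frequency) pairs for plotting an ogive."""
    # Start at 0, half a unit below the lowest class limit
    points = [(classes[0][0] - 0.5, 0)]
    running = 0
    for lower, upper, freq in classes:
        running += freq
        # Halfway between this upper limit and the next lower limit
        points.append((upper + 0.5, running))
    return points


# Hypothetical whole-number exam scores
scores = [52, 57, 61, 64, 66, 68, 70, 73, 75, 75,
          78, 79, 81, 84, 86, 88, 91, 93, 95, 99]

# The guideline gives roughly 9.4, rounded up here to a convenient width of 10
print(approximate_class_width(scores, 5))

classes = frequency_distribution(scores, first_lower=50, width=10, num_classes=5)
points = ogive_points(classes)
for lower, upper, freq in classes:
    print(f"{lower}-{upper}: {freq}")
print(points)
```

Rounding the approximate width up to a convenient number is a common practice rather than a rule stated in the notes; the last ogive point should always equal the total number of observations.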
2.4 Crosstabulations (i.e., Cross Tabs) and Scatter Plots
Both reveal the relationship between two variables

Cross Tabs: 2 variables represented in tabular format
Use the Pivot Table option in Excel to generate the table
Not restricted by the type of variable (Quantitative/Categorical); any combination will work
Example: [example not reproduced in transcript]

Scatter Diagram and Trend Line
Scatter Diagram: graphical representation of 2 quantitative variables
Pay special attention to the general pattern of the plotted points, which reveals the overall relationship between the variables
Trend Line: straight line approximating the relationship between the variables; evaluate its slope to judge whether the relationship is positive, negative or not apparent
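Both tools in 2.4 can be illustrated without Excel. The Python sketch below builds a crosstabulation by counting pairs and fits a trend line by least squares. Least squares is an assumption here, since the notes do not say how the trend line is fitted, and all names and data are hypothetical.

```python
from collections import Counter


def crosstab(rows, cols):
    """Count observations for every (row value, column value) pair."""
    counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Counter returns 0 for pairs that never occur
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


def trend_line(x, y):
    """Least-squares slope and intercept (assumed fitting method)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical categorical pair: quality rating vs. price band
quality = ["Good", "Very Good", "Good", "Excellent", "Very Good", "Good"]
price = ["$10-19", "$20-29", "$10-19", "$30-39", "$20-29", "$20-29"]
table = crosstab(quality, price)
for rating, row in table.items():
    print(rating, row)

# Hypothetical quantitative pair: number of ads vs. sales
ads = [1, 2, 3, 4, 5]
sales = [41, 44, 50, 53, 57]
slope, intercept = trend_line(ads, sales)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A positive slope indicates a positive relationship, a negative slope a negative one, and a slope near zero suggests no apparent linear relationship.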