Download Descriptive Statistics - Home | University of Pittsburgh

Descriptive Statistics
Exploratory Data Analysis (EDA)
• Searching for patterns in the data
Data Organization
• Coding sheets/scripts
• Stacked vs. unstacked format
• Data entry problems - transcription errors
• Grouped vs. individual data
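The stacked vs. unstacked distinction can be illustrated with a small sketch. The group names and scores below are made up for illustration, and `stack` and `unstack` are hypothetical helper names, not functions from any statistics package.

```python
# Unstacked (wide) format: one column of scores per condition.
# The condition names and scores are invented example data.
unstacked = {
    "control": [12, 15, 11],
    "treatment": [18, 17, 20],
}

def stack(wide):
    """Convert wide data to stacked (long) rows of (group, score)."""
    return [(group, score) for group, scores in wide.items() for score in scores]

def unstack(long_rows):
    """Convert stacked (group, score) rows back to one list per group."""
    wide = {}
    for group, score in long_rows:
        wide.setdefault(group, []).append(score)
    return wide

# Stacked (long) format: one row per observation, with a group label column.
stacked = stack(unstacked)
for row in stacked:
    print(row)
```

In the stacked layout every score sits in a single column next to its group label, which is the layout many statistics programs expect for grouped analyses.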
Graphing and Visualizing Data
• Bar graphs - best when the independent variable (IV) is categorical (nominal)
• Line graphs - best for illustrating functional relationships between variables
Line Graph Terminology
• Increasing vs. decreasing
• Positive vs. negative acceleration
• Monotonic vs. non-monotonic
• Floor vs. ceiling values
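As a rough sketch of the terms above, the functions below classify a series of equally spaced y-values. The function names and data are hypothetical, and the acceleration check assumes the usual reading that positive acceleration means the slope is getting larger from one step to the next.

```python
def is_monotonic(ys):
    """True if the series never decreases or never increases."""
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

def acceleration(ys):
    """Classify how the slope changes across equally spaced points."""
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    second = [b - a for a, b in zip(diffs, diffs[1:])]
    if second and all(s > 0 for s in second):
        return "positive"
    if second and all(s < 0 for s in second):
        return "negative"
    return "mixed or none"

# Invented example curves.
learning_curve = [10, 30, 45, 55, 60]   # increasing, gains shrink toward a ceiling
growth_curve = [1, 2, 4, 8, 16]         # increasing, gains grow
inverted_u = [1, 5, 8, 5, 1]            # rises then falls

print(is_monotonic(learning_curve), acceleration(learning_curve))
print(is_monotonic(growth_curve), acceleration(growth_curve))
print(is_monotonic(inverted_u))
```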
Scatterplots
• Best for two dependent variables (DVs) when illustrating correlations or linearity
Pie Charts
• Best for proportions or percentages
Tables vs. Graphs
• Numerical precision is lost in a graph, but relationships in the data are shown more clearly
Frequency Distribution of Data
• Histogram
– Columns should not be too narrow or too wide
– Positive vs. negative skew
– Normal distributions
– Outliers
• Stemplot
– Stem + leaf
– All values are seen
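A stemplot can be sketched in a few lines. This version assumes non-negative whole-number scores, uses the tens as the stem and the units digit as the leaf, and runs on invented data; `stemplot` is a hypothetical helper name.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: stem (tens) | leaves (units digits, sorted)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | {''.join(str(d) for d in leaves[stem])}")
    return lines

scores = [52, 67, 71, 58, 63, 75, 88, 61, 79, 64]  # invented scores
for line in stemplot(scores):
    print(line)
# 5 | 28
# 6 | 1347
# 7 | 159
# 8 | 8
```

Unlike a histogram, every original value can be read back from the plot.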
Measures of Centers of Distributions
• Center
– Mode
– Median
– Mean
• Choosing a center:
– Skew - median
– Normal - mean
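The reason for preferring the median under skew can be seen in a short sketch using Python's standard `statistics` module. Both data sets are invented; the second replaces one score with an extreme value to create positive skew.

```python
import statistics

symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
skewed = [4, 5, 5, 6, 6, 6, 7, 7, 40]   # same data with one extreme high score

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    print(
        name,
        "mode:", statistics.mode(data),
        "median:", statistics.median(data),
        "mean:", round(statistics.mean(data), 2),
    )
```

The mode and median stay at 6 in both sets, while the mean is pulled upward by the single extreme score.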
Measures of Spread of Distributions
• Range
– Problems: sensitive to outliers; scores in between not taken into account
• Interquartile range
• Variance
• Standard deviation
• Choice of measure of spread:
– If there are outliers or skew, choose the interquartile range
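The measures of spread listed above can be compared on a small invented data set, with and without an outlier. This sketch assumes the sample (n - 1) variance and standard deviation, and uses the "inclusive" quartile method of `statistics.quantiles`; other quartile conventions give slightly different values. `value_range` and `iqr` are hypothetical helper names.

```python
import statistics

def value_range(values):
    """Range: largest score minus smallest score."""
    return max(values) - min(values)

def iqr(values):
    """Interquartile range: 3rd quartile minus 1st quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

scores = [2, 4, 4, 5, 6, 7, 9]
with_outlier = scores + [40]

for name, data in [("scores", scores), ("with outlier", with_outlier)]:
    print(
        name,
        "range:", value_range(data),
        "IQR:", iqr(data),
        "variance:", round(statistics.variance(data), 2),
        "SD:", round(statistics.stdev(data), 2),
    )
```

The range, variance, and standard deviation all jump when the outlier is added, while the interquartile range changes only slightly.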
The 5-Number Summary
• min and max scores
• 1st quartile, median, and 3rd quartile scores
• Boxplot - visual representation of the 5-number summary
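A minimal sketch of the 5-number summary, assuming the "inclusive" quartile convention of Python's `statistics.quantiles` (textbooks and software differ on how quartiles are interpolated). The function name and the data are illustrative only.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

scores = [2, 4, 4, 5, 6, 7, 9]  # invented scores
labels = ["min", "Q1", "median", "Q3", "max"]
for label, value in zip(labels, five_number_summary(scores)):
    print(f"{label}: {value}")
```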
Statistical Measures of Description
• Used to describe data; measures of association between variables
Measures of Association
• Pearson product-moment correlation coefficient (Pearson r)
• Used for two variables with continuous values
• Positive and negative relationships are denoted by the sign of the coefficient
• Magnitude of the number tells you the degree of linear relationship between the two variables
• Factors that affect magnitude of the Pearson r:
– Outliers
– Restricted range of scores
– Non-normal distribution of scores
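The Pearson coefficient can be computed directly from its definition: the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The sketch below uses invented data (hours studied and test scores) and a hypothetical `pearson_r` helper to show how a single outlier changes the result.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 60, 68, 70, 75, 79]

r_clean = pearson_r(hours, scores)
# Add one extreme point: many hours studied but a very low score.
r_outlier = pearson_r(hours + [9], scores + [20])

print("r without outlier:", round(r_clean, 3))
print("r with outlier:", round(r_outlier, 3))
```

With these numbers, a strong positive correlation drops to roughly zero after one outlying point is added.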
Other Correlation Tests
• Point-biserial correlation
– Used when one variable is dichotomous
• Spearman rank order correlation (rho)
– Used to assess whether the relationship between two variables is monotonic (computed on ranks)
• Phi coefficient
– Used when both variables are dichotomous
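All three of these coefficients can be obtained from the Pearson formula applied to recoded data: ranks for Spearman's rho, and 0/1 codes for the dichotomous variable(s) in the point-biserial and phi coefficients. The sketch below illustrates that relationship with invented data; `pearson_r`, `ranks`, and `spearman_rho` are hypothetical helper names.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson r computed on ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Monotonic but non-linear relationship: rho is 1, Pearson r is below 1.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("Pearson r:", round(pearson_r(x, y), 3), "Spearman rho:", spearman_rho(x, y))

# Point-biserial: one dichotomous variable (coded 0/1), one continuous.
group = [0, 0, 0, 1, 1, 1]
score = [3, 4, 5, 6, 7, 8]
print("Point-biserial:", round(pearson_r(group, score), 3))

# Phi: both variables dichotomous (coded 0/1).
passed = [1, 1, 1, 0, 0, 0]
attended = [1, 1, 0, 0, 0, 1]
print("Phi:", round(pearson_r(passed, attended), 3))
```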
Other Analyses
• Linear regression analyses
– Used for predicting scores and values
• Multivariate analyses
– Used for analyzing more than one variable at a time
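As a sketch of how linear regression predicts scores, the code below fits a straight line by ordinary least squares with one predictor and uses it to predict a new value. The data and the names `fit_line` and `predict` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("predicted y at x = 6:", round(predict(6, slope, intercept), 2))
```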