Download Descriptive Statistics - Home | University of Pittsburgh

Descriptive Statistics
Exploratory Data Analysis (EDA)
• Searching for patterns in the data
Data Organization
• Coding sheets/scripts
• Stacked vs. unstacked format
• Data entry problems - transcription errors
• Grouped vs. individual data
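The stacked vs. unstacked distinction can be sketched in a few lines. This is only an illustration: the group names, the scores, and the helper names `stack` and `unstack` are made up for the example and are not from the slides.

```python
# Hypothetical scores for two conditions (illustrative data only).
# Unstacked: one column (list) of scores per group.
unstacked = {
    "control": [4, 6, 5],
    "treatment": [7, 9, 8],
}


def stack(table):
    """Convert unstacked data to stacked rows of (group, score)."""
    return [(group, score) for group, scores in table.items() for score in scores]


def unstack(rows):
    """Convert stacked (group, score) rows back to one list of scores per group."""
    table = {}
    for group, score in rows:
        table.setdefault(group, []).append(score)
    return table


# Stacked: one row per observation, with the group as a coded column.
stacked = stack(unstacked)
for row in stacked:
    print(row)
```

Stacked data keeps one observation per row, which is the layout most analysis software expects; unstacked data is often easier to read and to enter by hand.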
Graphing and Visualizing Data
• Bar graphs - best when the IV is categorical (nominal)
• Line graphs - best for illustrating functional relationships between variables
Line Graph Terminology
• Increasing vs. decreasing
• Positive vs. negative acceleration
• Monotonic vs. non-monotonic
• Floor vs. ceiling values
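Two of these terms can be checked numerically from the plotted y values. The sketch below assumes equally spaced x values and takes "positive acceleration" to mean that the slope keeps growing (second differences above zero); that reading, and the function names, are assumptions made for illustration.

```python
def is_monotonic(ys):
    """True if the series never decreases or never increases."""
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


def acceleration(ys):
    """Classify how the slope changes, assuming equally spaced x values."""
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    second = [b - a for a, b in zip(diffs, diffs[1:])]
    if second and all(s > 0 for s in second):
        return "positive"
    if second and all(s < 0 for s in second):
        return "negative"
    return "mixed or none"


print(is_monotonic([1, 4, 9, 16]), acceleration([1, 4, 9, 16]))  # True positive
print(is_monotonic([1, 5, 8, 10]), acceleration([1, 5, 8, 10]))  # True negative
print(is_monotonic([1, 3, 2, 4]))                                # False
```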
Scatterplots
• Best for two DVs when illustrating correlations or linearity
Pie Charts
• Best for proportions or percentages
Tables vs. Graphs
• Numerical precision is lost in a graph, but relationships in the data are shown more clearly
Frequency Distribution of Data
• Histogram
– Columns should not be too narrow or too wide
– Positive vs. negative skew
– Normal distributions
– Outliers
• Stemplot
– Stem + leaf
– All values are seen
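A stemplot is simple enough to build by hand. The sketch below assumes non-negative whole-number scores, uses the tens digit(s) as the stem and the units digit as the leaf; the scores and the function name `stemplot` are invented for the example.

```python
from collections import defaultdict


def stemplot(values):
    """Return stem-and-leaf lines (stem = tens, leaf = units) for non-negative integers."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves[stem])
        lines.append(f"{stem} | {digits}".rstrip())
    return lines


scores = [12, 15, 21, 21, 28, 40]  # hypothetical scores
for line in stemplot(scores):
    print(line)
```

Unlike a histogram, every original value can be read back from the plot, and the empty stem for the 30s makes the gap before 40 easy to spot.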
Measures of Center of Distributions
• Center
– Mode
– Median
– Mean
• Choosing a center:
– Skew - median
– Normal - mean
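The reason for preferring the median under skew can be seen with a small example. The two data sets below are hypothetical; they differ only in the largest score. The helper name `centers` is also just for illustration.

```python
import statistics


def centers(data):
    """Return the mode, median, and mean of a list of scores."""
    return {
        "mode": statistics.mode(data),
        "median": statistics.median(data),
        "mean": statistics.mean(data),
    }


symmetric = [2, 3, 4, 4, 5, 6]   # roughly symmetric scores
skewed = [2, 3, 4, 4, 5, 30]     # same scores with one extreme high value

print(centers(symmetric))  # mode, median, and mean are all 4
print(centers(skewed))     # mode and median stay at 4; the mean rises to 8
```

A single extreme score pulls the mean toward the tail, while the median and mode do not move.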
Measures of Spread of Distributions
• Range
– Problems: sensitive to outliers; scores in between not taken into account
• Interquartile range
• Variance
• Standard deviation
• Choice of measure of spread:
– If there are outliers or skew, choose the interquartile range
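The four measures can be compared on the same data. This is a sketch under stated assumptions: the scores are hypothetical, the variance and standard deviation are the sample versions (dividing by n - 1), and quartiles use the "inclusive" method of Python's `statistics.quantiles`; other quartile conventions give slightly different values.

```python
import statistics


def spread(data):
    """Return range, interquartile range, sample variance, and sample SD."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": statistics.variance(data),  # divides by n - 1
        "sd": statistics.stdev(data),
    }


symmetric = [2, 3, 4, 4, 5, 6]   # hypothetical scores
skewed = [2, 3, 4, 4, 5, 30]     # same scores with one outlier

print(spread(symmetric))
print(spread(skewed))
```

With the outlier, the range grows from 4 to 28 and the variance and standard deviation grow with it, while the interquartile range stays at 1.5 for both data sets.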
The 5-Number Summary
• min and max scores
• 1st quartile, median, and 3rd quartile scores
• Boxplot - visual representation of the 5-number summary
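The five numbers can be computed directly. The sketch below uses hypothetical scores and the "inclusive" quartile method from Python's `statistics` module; textbooks and software differ on how quartiles are interpolated, so treat the quartile values as one convention among several.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of scores."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))


scores = [1, 3, 5, 7, 9]  # hypothetical scores
print(five_number_summary(scores))  # (1, 3.0, 5.0, 7.0, 9)
```

In a boxplot, the box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the min and max.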
Statistical Measures of Description
• Used to describe data; measures of association between variables
Measures of Association
• Pearson product-moment correlation coefficient (Pearson r)
• Used for two variables with continuous values
• Positive and negative relationships denoted by the sign
• Magnitude of the number tells you the degree of linear relationship between the two variables
• Factors that affect magnitude of the Pearson r:
– Outliers
– Restricted range of scores
– Non-normal distribution of scores
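The coefficient can be computed from its definition, the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The data below are hypothetical and chosen so the effect of a single outlier is easy to see.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]                       # perfectly linear, positive
print(pearson_r(xs, ys))                    # 1.0
print(pearson_r(xs, ys[::-1]))              # -1.0 (negative relationship)

# Adding one outlying point (6, 0) drops r from 1.0 to about 0.14.
print(pearson_r(xs + [6], ys + [0]))
```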
Other Correlation Tests
• Point-biserial correlation
– Used when one variable is dichotomous
• Spearman rank order correlation (rho)
– Used to assess how monotonic a relationship is (computed on ranks)
• Phi coefficient
– Used when both variables are dichotomous
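All three coefficients can be obtained from the Pearson formula: Spearman's rho is Pearson's r applied to ranks, and the point-biserial and phi coefficients are Pearson's r with one or both variables coded 0/1. The sketch below shows rho and phi; the data and helper names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# A monotonic but non-linear relationship: rho is 1.0, Pearson r is below 1.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(spearman_rho(xs, ys), pearson_r(xs, ys))

# Phi: both variables dichotomous, coded 0/1 (here phi is 1/3).
a = [1, 1, 1, 0, 0, 0]
b = [1, 1, 0, 1, 0, 0]
print(pearson_r(a, b))
```

The point-biserial case works the same way: pass one 0/1 coded list and one list of continuous scores to `pearson_r`.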
Other Analyses
• Linear regression analyses
– Used for predicting scores and values
• Multivariate analyses
– Used for analyzing more than one variable at a time
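Prediction with simple linear regression can be sketched as a least-squares line fit followed by plugging in a new x value. The data and the function names `fit_line` and `predict` are hypothetical, and only the one-predictor case is shown.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted score for a new x value."""
    return intercept + slope * x


xs = [1, 2, 3, 4]   # hypothetical predictor scores
ys = [3, 5, 7, 9]   # hypothetical outcome scores
slope, intercept = fit_line(xs, ys)
print(slope, intercept)               # 2.0 1.0
print(predict(5, slope, intercept))   # 11.0
```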