Notes 10

GS/PPAL 6200 3.00 Sections M&N
Research Methods and Information Systems
A QUANTITATIVE RESEARCH PROJECT
(1) DATA COLLECTION
(2) DATA DESCRIPTION
(3) DATA ANALYSIS
Variables Vocabulary
• Nominal (or Categorical) Variables: variables
composed of discrete, unordered categories
(e.g., Male/Female/Neither)
• Ordinal Variables: variables whose categories can be
ordered but whose intervals are not equal across the range
(e.g., frequencies measured as Always/Usually/Rarely/Never)
• Interval Variables: variables that can be measured
with equidistant units (Age in years; Amount
studied in hours per month)
DATA DESCRIPTION
• Describing Nominal, Ordinal, and Interval Data
with Graphs, Charts, Tables
• Describing the Interval Data (CGPA) with
Descriptive Statistics
– Mean, range, minimum value, maximum value,
standard deviation, standard error
– Deciles, Quintiles, Percentiles
– Z-scores
Graphs, Charts and Tables
• Create a Table of Cases #1-10 for CGPA, Hours
Studied, and Main Reasons for Attending
University
• Graph a scatterplot of CGPA and Hours
Studied for these 10 cases
• Draw a Pie Chart showing the Main Reasons
for Attending University for these 10 cases
Describing the CGPA Data
• Refer to Cases #1-10 in our survey results
• What is the sample mean of CGPA for these 10
observations?
• Consider the student graduating with CGPA =
6.33
• How many grade points above the mean is
this student?
• How does this student’s CGPA compare with
that of other students?
Ranking Concepts: Quintiles and Deciles
• Order the 10 observations from lowest CGPA
to highest
• Divide the number of observations by 5 to
obtain the quintile units (10/5 = 2); in which
quintile is the student with CGPA = 6.33?
• Divide the number of observations by 10 to
obtain the decile units (10/10 = 1); in which
decile is the student with CGPA = 6.33?
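The quintile and decile lookups above can be sketched in Python. This is a minimal illustration, not part of the original notes: the helper name `rank_group` is hypothetical, and the rank of 4 for CGPA = 6.33 is an assumption taken from the percentile table later in these notes.

```python
import math


def rank_group(rank: int, n: int, groups: int) -> int:
    """Return which of `groups` equal-sized groups a 1-based rank falls in.

    Hypothetical helper: observations are assumed to be sorted from lowest
    to highest, and n is assumed to divide evenly into the groups.
    """
    if not 1 <= rank <= n:
        raise ValueError("rank must be between 1 and n")
    per_group = n / groups  # 10/5 = 2 per quintile, 10/10 = 1 per decile
    return math.ceil(rank / per_group)


n = 10
rank = 4  # assumed rank of CGPA = 6.33 among the 10 ordered observations

quintile = rank_group(rank, n, groups=5)
decile = rank_group(rank, n, groups=10)
print(f"quintile: {quintile}, decile: {decile}")
```

Under that assumed rank, the student falls in the 2nd quintile and the 4th decile. Tied values would need a rule for assigning ranks, which this sketch does not address.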
Statistical Description of the Data
• Distribution: Frequency – what is the
frequency of each value?
• Central Tendency: Mean, Median, and Mode –
what are the measures of the “centre” of the
data?
• Dispersion: Range and Standard Deviation –
what is the spread around the centre (Mean)?
Frequency and Central Tendency
• How many times does the value CGPA = 7.67
occur? … CGPA = 6.33? What is this
occurrence as a % of the total # observations?
• What is the Mean of the CGPA in the sample?
• What is the Median of the CGPA?
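The frequency, mean, and median questions can be explored with Python's standard library. The values below are hypothetical and chosen only for illustration; they are not the survey results for Cases #1-10.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical CGPA values for illustration only; NOT the survey data.
cgpa = [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 7.0, 8.0, 8.0, 9.0]

counts = Counter(cgpa)           # frequency of each distinct value
value = 7.0
frequency = counts[value]        # how many times the value occurs
share = 100 * frequency / len(cgpa)  # occurrence as a % of all observations

sample_mean = mean(cgpa)
sample_median = median(cgpa)     # average of the two middle values when n is even

print(f"{value} occurs {frequency} times ({share:.0f}% of observations)")
print(f"mean = {sample_mean:.2f}, median = {sample_median:.2f}")
```

Replacing the `cgpa` list with the actual survey values would answer the questions on this slide.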
Dispersion around the Mean
• Standard Deviation (True): dispersion of values
around the true population mean = σ
• Standard Deviation (Sample): dispersion of
values around the sample mean = s
• Standard Error of the Mean (SEM): standard
deviation of the sample mean (avgX) as an estimate
of the population mean (μ), estimated from the
sample standard deviation s
Dispersion about the Sample Mean
• What is the range of CGPA in the sample?
• What is the standard deviation (average
distance from the mean) of the sample?
– Calculate the difference between each sample
value and the sample mean.
– Square each result and add the squared values
together (“Sum of Squares”)
– Divide by n-1 (“Degrees of Freedom”) *
– Take square root of the answer (*)
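The four steps above can be sketched as a short Python function. The function name `sample_std` and the data are hypothetical (the values are not the survey results); the result is cross-checked against `statistics.stdev`, which also divides by n-1.

```python
import math
from statistics import stdev


def sample_std(values):
    """Sample standard deviation, following the four steps in the notes."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    avg = sum(values) / n
    # Step 1: difference between each sample value and the sample mean
    deviations = [x - avg for x in values]
    # Step 2: square each difference and add them up ("Sum of Squares")
    sum_of_squares = sum(d * d for d in deviations)
    # Step 3: divide by n-1 ("Degrees of Freedom")
    variance = sum_of_squares / (n - 1)
    # Step 4: take the square root
    return math.sqrt(variance)


# Hypothetical CGPA values for illustration only; NOT the survey data.
data = [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 7.0, 8.0, 8.0, 9.0]

s = sample_std(data)
cgpa_range = max(data) - min(data)  # range = maximum value - minimum value
print(f"range = {cgpa_range}, s = {s:.3f}, library check = {stdev(data):.3f}")
```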
Standard Error of the Mean
• If a standard error of the mean (SEM) is
calculated by the sample standard deviation
divided by the square root of the sample size,
what is the SEM of this 10-case sample?
• At a 95% confidence level, what is the Confidence Interval (CI)?
At 95% confidence, the margin of error = +/- 1.96*SEM = 0.86
• If the hypothesized mean is 6.5, is the mean of
this sample significantly different at the 95% confidence level?
Upper Limit = Sample Mean + 0.86 = 6.96
Lower Limit = Sample Mean – 0.86 = 5.24
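The interval arithmetic can be checked with a short Python sketch. The notes do not state the sample standard deviation or the sample mean directly, so two values here are assumptions backed out of the slide: a sample mean of 6.10 (implied by the limits 6.96 and 5.24 with a margin of 0.86) and an SEM of 0.86/1.96. The function names are hypothetical.

```python
import math


def standard_error(s: float, n: int) -> float:
    """SEM = sample standard deviation divided by the square root of n."""
    return s / math.sqrt(n)


def confidence_interval(sample_mean: float, sem: float, z: float = 1.96):
    """Return (lower, upper) limits using the normal multiplier z."""
    margin = z * sem
    return sample_mean - margin, sample_mean + margin


# Assumptions backed out of the slide, not stated in it directly:
sample_mean = 6.10      # implied by 6.96 - 0.86 and 5.24 + 0.86
sem = 0.86 / 1.96       # implied by margin = 1.96 * SEM = 0.86

lower, upper = confidence_interval(sample_mean, sem)
hypothesized_mean = 6.5
# The sample mean differs significantly only if the hypothesized mean
# falls outside the interval.
significantly_different = not (lower <= hypothesized_mean <= upper)

print(f"95% interval: [{lower:.2f}, {upper:.2f}]")
print(f"significantly different from {hypothesized_mean}: {significantly_different}")
```

With these numbers, 6.5 lies inside the interval, so the sample mean is not significantly different from the hypothesized mean. When the sample standard deviation s is known, `standard_error(s, 10)` gives the SEM for the 10-case sample directly.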
Dispersion about the True Mean
• For a comparison of the academic performance of this
student with the rest of the graduating class, it is good
to look at where the student is ranked in the class but better
to look at this in relation to the dispersion around the
mean
• How many standard deviations is a CGPA of 6.33 away
from the population mean (assume μ = 6.47)?
• If the standard deviation of CGPA for all graduating
classes is known (σ = 1.48), then the CGPA is 0.095
standard deviations below the population mean:
(6.33 – 6.47)/1.48 = -0.095 = z-score
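The z-score calculation can be reproduced in Python with the values given in the notes (x = 6.33, μ = 6.47, σ = 1.48). The function name `z_score` is hypothetical.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z = z_score(6.33, mu=6.47, sigma=1.48)
print(round(z, 3))  # -0.095
```

A negative z-score means the value lies below the mean; its magnitude gives the distance in standard deviations.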
Ranking Concept: Percentile
• For a student who graduates with a 6.33 CGPA, what percentile is the student in? If
the student is in the 46th percentile, 46% of the CGPA values of the other
graduating students fall below this grade.
• Unlike Quintiles and Deciles, scales in Percentile ranks are not of equal
intervals; they are adjusted for the dispersion (frequency) of the numbers
• The percentile of the observation ranked i is calculated by p_i = 100 (i-0.5)/n, where n = total number
of observations and i = the rank
• What percentile is the CGPA = 6.33 for this student relative to the sample
of 10 graduating students?
CGPA    i    Percentile = p_i = 100 (i-0.5)/n (for n = 10)
4.17    1    = 100*(1 - 0.5)/10 = 5
4.17    2    = 100*(2 - 0.5)/10 = 15
5.00    3    = 100*(3 - 0.5)/10 = 25
6.33    4    = 100*(4 - 0.5)/10 = 35
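The percentile formula 100 (i-0.5)/n can be evaluated for the first four ranks with a small Python sketch. The function name `percentile_rank` is hypothetical; n = 10 and the ranks 1 to 4 come from the table.

```python
def percentile_rank(i: int, n: int) -> float:
    """Percentile of the observation ranked i out of n: 100 * (i - 0.5) / n."""
    if not 1 <= i <= n:
        raise ValueError("rank i must be between 1 and n")
    return 100 * (i - 0.5) / n


n = 10
percentiles = [percentile_rank(i, n) for i in range(1, 5)]
print(percentiles)  # [5.0, 15.0, 25.0, 35.0]
```

By this formula, the observation at rank 4 (CGPA = 6.33 in the table) sits at the 35th percentile of the sample.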
Percentile Calculations in Excel
• A CGPA of 6.33 is at the 40th percentile of the sample and the 46th percentile
of the population, assuming the population is
normally distributed.
RESULT    EXCEL COMMAND
0.4       PERCENTRANK.EXC(data array, 6.33, TRUE)
0.4623    NORM.DIST(6.33, 6.47, 1.48, TRUE), for mean 6.47 and std dev 1.48
If the z-score is -0.095 (for same mean and standard deviation), Excel
generates:
0.4621    NORM.S.DIST(-0.095, TRUE)
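For readers without Excel, the two normal-distribution lookups can be approximated with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This is an analogue of the Excel commands, not a reproduction of them, and the last decimal place may differ from the Excel output because of rounding.

```python
from statistics import NormalDist

# Analogue of NORM.DIST(6.33, 6.47, 1.48, TRUE):
# cumulative probability below 6.33 for mean 6.47 and std dev 1.48
population = NormalDist(mu=6.47, sigma=1.48)
p_from_value = population.cdf(6.33)

# Analogue of NORM.S.DIST(-0.095, TRUE):
# cumulative probability below z = -0.095 on the standard normal
p_from_z = NormalDist().cdf(-0.095)

print(f"from CGPA value: {p_from_value:.4f}")
print(f"from z-score:    {p_from_z:.4f}")
```

Both results are close to 0.46, which corresponds to the 46th percentile of the population under the normality assumption.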