Download Performance Element 3.04

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Data mining wikipedia , lookup

Misuse of statistics wikipedia , lookup

Time series wikipedia , lookup

Transcript
INDICATOR 3.05
INTERPRET MARKETING INFORMATION TO TEST
HYPOTHESES AND/OR TO RESOLVE ISSUES.
NEED FOR PHYSICAL INSPECTION OF
PAPER-BASED MARKETING DATA
• If errors are caught early, they can be corrected.
• Look for standardizations and continuity
• Missing values/nonuniform data entry which means
changing the value or formula of the response codes.
• Make sure information exists as required
• Data may not be presented in the form needed for
further analysis
• ie
• incomplete responses (skips questions)
• data entry error (wrong entry/answer; recorded improperly –choice
is B but entered as C)
• Inconsistencies in responses
QUESTIONS TO ASK WHEN SCREENING
INITIAL DATA
• Where did the data come from?
• Who did research?
• Who is source?
• Have the data been peer reviewed?
• Professionals have looked at the study before it was published
and concluded that the study’s authors pretty much followed
the rules of good scientific research
• How were the data collected?
• If survey, were the respondents selected at random?
• No cherry picking (selecting an area you know will give you the
answers you hope for)
• Be skeptical when dealing with comparisons
• A correlation (relationship) DOES NOT mean causation
• Be aware of numbers taken out of context
• Numbers picked to look interesting may mean something
entirely difference once it is placed in a different context
ACTIONS THAT CAN BE TAKEN WITH
INCOMPLETE RECORDS
• Cleanse the data which includes
dropping problematic data.
• If caught early and still in field, can
correct at once
• Recontact/reinterview respondent
• Check responses for each question as
interview progresses
• Drop incomplete data from survey (in
part or in full)
REASONS FOR DATA ENTRY ERRORS
THAT LIMIT THE USE OF MARKETING
RESEARCH DATA
•Wrong entry
•Failure of data entry
technology
•Information is not recorded
properly
REASONS THAT QUESTIONABLE
ENTRIES ARE FOUND IN RAW
MARKETING RESEARCH DATA
•Honesty
•Inconsistencies in responses
ROLE OF CODING IN PROCESSING
MARKETING DATA
• Procedure converts the edited raw
data into numbers or symbols
• Code book is created
• Identifies all of the variable names
and numbers associated with each
possible response to each question
GENERALLY ACCEPTED CODING
PRINCIPLES
• Specify categories/classes
• ie Male = M or 1 Female = F or 2
•
•
•
•
Locate only one character in each column
Use only numeric codes
Use as many columns for fields assigned to variables
Use standard coded for no information
• ie No answers = 9
• Code in respondent ID on each record
• to tie coded data to questionnaire
IMPORTANCE OF EDITING CODED
INFORMATION
• Must have systematic procedure for assessing the
questionnaires developed by the research analyst
• Cleary defined decision rules should be in place
and followed
• MUST not allow subjectivity to enter into editing
process
• Data editors should be intelligent, experienced, and
objective
DATA-ENTRY OPTIONS
• On-line direct data entry
• Computer Assisted Telephone Interviewing (CATI)
• interviewer is sitting behind a terminal and asks the questions
that appear on the screen. The respondent's answer is then
typed into computer by the interviewer.
• Optical scanning
• For highly structured questionnaires
• Keyboarding
• Data entry via a computer keyboard; requires verification
PAPER-BASED MARKETING INFORMATION
VS.
COMPUTER-BASED MARKETING INFORMATION
• Computer based software saves time and money.
• The programs have error checking built in to
prevent common errors.
• The availability of the item bank makes it easier to
build questionnaires.
IMPORTANCE OF USING A CODEBOOK TO DESCRIBE
DATA AND TO INDICATE WHERE AND HOW THEY
CAN BE ACCESSED
• Shows each question in the questionnaire
• Shows the variable name and label
• Shows the variable type, values and value labels.
DATA ANALYSIS
• After data is collected it must be analyzed so that
conclusions can be made.
• Data summarization: is the process of describing a
data set by computing a small number of measures
that explain the data set
PURPOSE OF DATA SUMMARIZATION
• Summarizes the data
• Make the data understandable
• Communicates patterns
• Shows how the sample findings relate
to the overall population
WHAT IS DESCRIPTIVE STATISTICS?
• Descriptive statistics are used simply to describe the
sample you are concerned with.
• Descriptive statistics use measures of central
tendency to explain data.
WHAT IS CENTRAL TENDENCY?
• A measure of central tendency is a measure that
tells us where the middle of a bunch of data lies. It
explains where data falls.
• The three most common measures of central
tendency are the mean, the median, and the
mode.
MEAN
• It is simply the sum of the numbers divided by the
number of numbers in a set of data.
• This is also known as average.
• Ex. What is the mean number of brothers and
sisters?
• The responses are as follows: 2, 3, 1, 0, 9, 2, 3, 2, 4, 2.
• 2.8
MEDIAN
• The median is the middle score in a distribution of
scores that have been ranked in numerical order.
• What is the median number of brothers and sisters?
• The responses are as follows: 2, 3, 1, 0, 9, 2, 3, 2, 4, 2.
•2
• A disadvantage of the median is if there is a large
set of numbers, it would be time consuming to
place each in order of size
MODE
• The mode is the most frequently occurring score in
a set of scores.
• Ex. A researcher is interested in the effect of family
size on self-esteem. To begin this study, 10 students
are each asked how many brothers and sisters they
have. The responses are as follows: 2, 3, 1, 0, 9, 2, 3,
2, 4, 2. What is the mode for this set of data?
• 2
WHEN TO USE THE MEAN, MEDIAN,
AND MODE
 The mean can be used to find an average/ typical
number. Use the mean when numbers are closely
related. If however, the numbers are very widely
spread, are very unevenly distributed, or contain
extreme values, e.g. 9, 10, 13, 17, 23, 30, 45; or a
hundred values of 10 and one value of 50 then the
mean can be misleading,.
 Use the median if looking for the central point.
 Mode – When you want to determine the most
common response
WHEN DO YOU USE MEASURES OF
VARIABILITY?
• Measure of variablity are used when the researcher
wants to know the extent to which the data set
differ from one another.
• Range, and Standard Deviation are computations
for measuring variability.
RANGE
 The range is the difference between the highest and
lowest scores in a distribution.
 A social researcher would like to know how many
digits people in different age categories can recall
with only one presentation of a list. She creates
random lists of digits and presents them to
participants. The number of digits recalled by the
first 10 participants is as follows: 5, 9, 6, 10, 9, 7, 8, 7, 9,
12. What is the range of this data set?
7
STANDARD DEVIATION
• The standard deviation is a measure of how spread
out your data are. How far away is the data from
the average.