ERT446
FINAL YEAR PROJECT 2
DATA PREPARATION & COLLECTION
PRESENTED BY:
Munira Bt Mohamed Nazari
School of Bioprocess Engineering
Universiti Malaysia Perlis
Purposes of collecting data
Establishing the parameters of a system.
Establishing benchmark data.
Data collection strategies
Market research
"Objective" research
Proof of theory
Decision support
Propaganda
Belief justification
Types of data
According to research type
Qualitative: non-numerical measurements, e.g. thick, thin, slow, fast.
Quantitative: numerical measurements, e.g. weight, length, temperature.
Scales of Data
Ordinal
Nominal
Interval
Ratio
Nominal data scale
The nominal data scale is the lowest level of data.
Nominal scales are qualitative rather than quantitative.
Quantitative information can only be obtained by counting the number of occurrences with a particular property.
Nominal data have no order; the values serve only as identity labels.
A nominal scale has no zero.
The numbers in the table are not the nominal scale itself; they are only counts for each category.
Also called categorical data.

Hair color    Number
Black         47
Brown         16
Gray          7
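As a minimal sketch of the counting idea above, the following Python snippet tallies nominal labels. The list of observations is hypothetical and was built only to reproduce the counts in the hair color example; the variable names are illustrative.

```python
from collections import Counter

# Hypothetical raw observations: one hair-color label per person,
# constructed to match the counts in the hair color example (47, 16, 7).
observations = ["Black"] * 47 + ["Brown"] * 16 + ["Gray"] * 7

# Counting occurrences is the only quantitative summary a nominal scale supports.
counts = Counter(observations)

# The mode (most frequent category) is meaningful; a mean or median is not,
# because the categories have no order.
mode_category, mode_count = counts.most_common(1)[0]

for category, n in counts.items():
    print(f"{category}: {n}")
print("Most common category:", mode_category)
```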
Ordinal data scale
Has an order (unlike nominal data).
The intervals between the numbers are not necessarily equal.
There is no "true" zero point.
Has the properties of nominal data.
In the example, it is reasonable to say that grade is an ordinal scale because fail/pass/good/very good/excellent form a sequence that would not make sense in any other order.
Also called ordered data.

Grade        Number
Excellent    47
Very good    26
Good         21
Pass         15
Fail         7
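A short sketch of what "order without equal intervals" allows in practice: grades can be compared and a median grade can be found, but grades cannot be subtracted or averaged. The counts come from the grade example; the function names are illustrative assumptions.

```python
# Grades listed from lowest to highest; the list position is the rank.
GRADE_ORDER = ["Fail", "Pass", "Good", "Very good", "Excellent"]
rank = {grade: position for position, grade in enumerate(GRADE_ORDER)}

# Counts taken from the grade example.
grade_counts = {"Excellent": 47, "Very good": 26, "Good": 21, "Pass": 15, "Fail": 7}


def is_better(grade_a, grade_b):
    """Comparisons are valid on an ordinal scale."""
    return rank[grade_a] > rank[grade_b]


def median_grade(counts):
    """Walk up the ordered grades until half of the observations are covered."""
    total = sum(counts.values())
    midpoint = (total + 1) / 2
    cumulative = 0
    for grade in GRADE_ORDER:
        cumulative += counts[grade]
        if cumulative >= midpoint:
            return grade
    raise ValueError("counts must contain at least one observation")


print(is_better("Good", "Pass"))
print(median_grade(grade_counts))
```

Only the ranks are used here. The distance between "Pass" and "Good" is never assumed to equal the distance between "Good" and "Very good".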
Interval data scale
Has equal intervals between values.
Has the properties of nominal and ordinal scales.
Has no true zero point.
More sophisticated than the nominal and ordinal scales.
Day          Temperature (°C)
Monday       29
Tuesday      28
Wednesday    30
Thursday     31
Friday       32

Also called score data.
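Because the intervals are equal, differences and averages are meaningful on an interval scale. The sketch below uses the temperatures from the example; the variable names are illustrative.

```python
from statistics import mean

# Daily temperatures in degrees Celsius, from the example.
temperatures_c = {
    "Monday": 29,
    "Tuesday": 28,
    "Wednesday": 30,
    "Thursday": 31,
    "Friday": 32,
}

values = list(temperatures_c.values())

# Averages are valid because every one-degree step has the same size.
average_c = mean(values)

# Day-to-day differences are also valid on an interval scale.
daily_changes = [later - earlier for earlier, later in zip(values, values[1:])]

print("Average temperature:", average_c)
print("Day-to-day changes:", daily_changes)
```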
Ratio data scale
The difference between any two pairs of values that are the same "distance apart" is the same anywhere on the scale, and ratios between values are meaningful.
The data has a true zero point.
The closest to the real number system.
For example: the Kelvin scale of temperature. This scale has an absolute zero. Thus, a temperature of 300 kelvin is twice as high as a temperature of 150 kelvin.
Also called score data.
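The Kelvin example can be checked with a few lines of arithmetic. This is a sketch only: the 30 °C and 15 °C readings are hypothetical values chosen to show why a ratio taken on the Celsius scale (interval, no true zero) is misleading.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvin (absolute zero is -273.15 degrees Celsius)."""
    return celsius + 273.15


# Ratio scale: kelvin has a true zero, so this ratio is meaningful.
kelvin_ratio = 300.0 / 150.0

# Interval scale: hypothetical Celsius readings.
warm_c, cool_c = 30.0, 15.0
celsius_ratio = warm_c / cool_c  # numerically 2.0, but not physically meaningful

# The same two temperatures compared on the ratio (kelvin) scale.
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print("300 K / 150 K =", kelvin_ratio)
print("30 C / 15 C on the Celsius scale =", celsius_ratio)
print("Same temperatures in kelvin, ratio =", round(true_ratio, 3))
```

30 °C is not "twice as hot" as 15 °C; measured from absolute zero the two temperatures differ by only about five percent.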
Scale       Order    Interval    Origin
Nominal     none     none        none
Ordinal     yes      unequal     none
Interval    yes      equal       none
Ratio       yes      equal       zero
As you go from nominal to interval and ratio scales, you get more information about the thing being measured.
Example:
Nominal Scales:
Do you use CNN for online news?
Yes/No
Ordinal Scales:
How many times do you use CNN in a day?
(a) 0 times a day
(b) 1-5 times a day
(c) more than 5 times a day
Interval/Ratio Scales:
How many times do you use CNN in a day?
_____ times a day
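The three versions of the CNN question carry different amounts of information. As a sketch, an exact count (interval/ratio) can always be reduced to the ordinal category or to the nominal yes/no answer, but not the other way round. The function names are illustrative.

```python
def to_ordinal_category(times_per_day):
    """Reduce an exact daily count to the ordinal answer options (a), (b), (c)."""
    if times_per_day < 0:
        raise ValueError("a count cannot be negative")
    if times_per_day == 0:
        return "(a) 0 times a day"
    if times_per_day <= 5:
        return "(b) 1-5 times a day"
    return "(c) more than 5 times a day"


def to_nominal_answer(times_per_day):
    """Reduce an exact daily count to the nominal yes/no answer."""
    return "Yes" if times_per_day > 0 else "No"


# Two different exact answers collapse into the same ordinal and nominal answers.
for count in (2, 4):
    print(count, to_ordinal_category(count), to_nominal_answer(count))
```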
Types of data
According to source
Primary: original data collected for a specific purpose.
• Direct observation
• Experimentation
• Survey
• Interviews
Secondary: data collected by someone else for another purpose.
• Trade journals
• Newspapers
• Press releases
• Demographic data
• Industry analysts' reports
• Marketing research reports
• Public opinion polls
Key Data Collection Techniques
Observations
Surveys
Interviews
Experimentation
Observations
Observation means that the situation of interest is monitored directly.
Observation does not tell why it happened.
Used for quantitative research.
It can be conducted in two ways:
• Mechanically
• Personally
Surveys
Surveys, or questioning, involve using a questionnaire (the data collection instrument) to ask respondents questions to secure the desired information.
Used for quantitative research.
Questionnaires may be administered by:
• Mail (slow; low response rate)
• Telephone (easy to administer; allows data to be collected quickly at a relatively low cost)
• Computer/internet (rapid; low cost)
• In-person
Interviews
Focus group:
A focus group is a small group (6-8) of carefully selected people (respondents), headed by a moderator, who deliberate on a certain topic. Focus groups are used to generate concepts and hypotheses.
In-depth interview:
An in-depth interview is an unstructured, direct,
personal interview in which a single respondent is
probed by a highly skilled interviewer to uncover
underlying motivations, beliefs, attitudes and feelings
on a topic.
Used in qualitative research.
Experimentation
Experimentation involves selecting matched groups, giving them different experimental treatments while controlling for other related factors, and checking for differences in the responses of the experimental group and the control group.
Data in an experiment may be collected through:
• Observation
• Surveys
Experimentation can take the form of:
• Laboratory experiments
• Field experiments
• Clinical experiments
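Checking for differences between the experimental group and the control group can be sketched as a comparison of group means. The response values below are hypothetical and exist only to make the example runnable; a real analysis would also test whether the difference is statistically significant.

```python
from statistics import mean

# Hypothetical responses for two matched groups (illustration only).
control_group = [12.1, 11.8, 12.4, 12.0]
experimental_group = [13.0, 13.4, 12.9, 13.1]


def mean_difference(experimental, control):
    """Return the experimental group mean minus the control group mean."""
    return mean(experimental) - mean(control)


difference = mean_difference(experimental_group, control_group)
print(f"Difference in mean response: {difference:.3f}")
```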
Considerations for Selecting a Data Collection Technique
Technical adequacy: reliability, validity,
freedom from bias, etc.
Practicality: cost, political consequences,
duration, personnel needs, etc.
Ethics: protection of human rights,
privacy, legality, environment, etc.
Data Preparation
Data preparation involves:
• Checking or logging the data in
• Checking the data for accuracy
• Entering the data into the computer
• Transforming the data
• Developing and documenting a database structure that integrates the various measures
Logging the Data
Set up a procedure for logging the data and keeping track of it until you are ready to do a comprehensive data analysis.
A database that lets you access the data at any time is recommended.
Retain data records for at least 5-7 years.
Checking the Data For Accuracy
As soon as data is received, you should screen it for accuracy. In some circumstances, doing this right away will allow you to go back to the sample to clarify any problems or errors.
There are several questions you should ask as part of this initial data screening:
Are the responses legible/readable?
Are all important questions answered?
Are the responses complete?
Is all relevant contextual information included (e.g., date, time, place, researcher)?
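Some of the screening questions above can be automated once responses are stored as records. The sketch below assumes a hypothetical record layout (a dictionary with an "answers" entry plus the contextual fields); legibility still has to be judged by a person.

```python
# Contextual fields named in the screening questions.
REQUIRED_CONTEXT = ("date", "time", "place", "researcher")


def screen_record(record, important_questions):
    """Return a list of problems found in one response record (empty if none)."""
    problems = []
    answers = record.get("answers", {})
    for question in important_questions:
        answer = answers.get(question)
        if answer is None or str(answer).strip() == "":
            problems.append(f"missing answer: {question}")
    for field in REQUIRED_CONTEXT:
        if not record.get(field):
            problems.append(f"missing context: {field}")
    return problems


# Hypothetical records in an assumed layout.
complete_record = {
    "date": "2024-03-01", "time": "10:30", "place": "Lab 2", "researcher": "A",
    "answers": {"q1": "Yes", "q2": "3"},
}
incomplete_record = {
    "date": "2024-03-01", "time": "", "place": "Lab 2", "researcher": "A",
    "answers": {"q1": "Yes", "q2": "  "},
}

print(screen_record(complete_record, ["q1", "q2"]))
print(screen_record(incomplete_record, ["q1", "q2"]))
```

Running the check as soon as each record arrives leaves time to go back to the respondent while the problem can still be clarified.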
Developing a Database Structure
Two options are available for developing a database:
• Database programs (e.g., Microsoft Access, Claris FileMaker)
• Statistical programs (e.g., SPSS, SAS, Minitab, Data Desk)
Entering the Data into the Computer
Type the data directly.
Check it for errors.
Summarize the data.
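A minimal sketch of the check-then-summarize steps: entered values are compared against a plausible range before any summary is computed. The weights and the range limits are hypothetical; 720.0 stands in for a typing slip.

```python
from statistics import mean, median

# Hypothetical typed-in weights in kilograms (illustration only).
entered_weights_kg = [61.5, 58.0, 720.0, 64.2, 70.1]

# Assumed plausibility limits for this example.
MIN_KG, MAX_KG = 30.0, 200.0


def find_suspect_entries(values, low, high):
    """Return (position, value) pairs that fall outside the plausible range."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


suspects = find_suspect_entries(entered_weights_kg, MIN_KG, MAX_KG)
suspect_positions = {i for i, _ in suspects}

# Summarize only the entries that passed the check.
clean = [v for i, v in enumerate(entered_weights_kg) if i not in suspect_positions]
summary = {
    "n": len(clean),
    "min": min(clean),
    "max": max(clean),
    "mean": mean(clean),
    "median": median(clean),
}

print("Entries to re-check against the source forms:", suspects)
print("Summary of accepted entries:", summary)
```

Flagged entries are set aside for checking against the original forms rather than silently deleted.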
Data Transformations
Transform the raw data into variables
that are usable in the analysis.
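The slide does not name specific transformations. As an assumed illustration, two common ones are replacing a missing-value code and reverse-scoring a negatively worded 1-5 questionnaire item. The missing-value code and the item responses below are hypothetical.

```python
MISSING_CODE = -9          # assumed code used at data entry for "no answer"
SCALE_MIN, SCALE_MAX = 1, 5  # assumed 1-5 response scale


def clean_missing(value):
    """Replace the missing-value code with None so it cannot enter calculations."""
    return None if value == MISSING_CODE else value


def reverse_score(value):
    """Flip a response on the 1-5 scale (1 becomes 5, 2 becomes 4, and so on)."""
    if value is None:
        return None
    return SCALE_MAX + SCALE_MIN - value


# Hypothetical raw responses to one negatively worded item.
raw_item = [1, 5, -9, 4, 2]
usable_item = [reverse_score(clean_missing(v)) for v in raw_item]

print(usable_item)
```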
Any questions?
THANK YOU…