Download Week 7 - Angelfire

COMMUNITY DENTAL
HEALTH
Jan Ladas
BIOSTATISTICS CONTINUED
Previously discussed:
• Descriptive statistical techniques
• The first measures of spread / central tendency
Information about central tendency is important.
Equally important is information about the
spread of data in a set.
Algonquin College - Jan Ladas
VARIABILITY/DISPERSION
Three terms associated with variability /
dispersion:
• Range
• Variance
• Standard Deviation
(They describe the spread around the
central tendency)
VARIABILITY/DISPERSION
Range:
The numerical difference between the highest and
lowest scores
• Subtract the lowest score from the highest score
e.g.: c = {19, 21, 73, 4, 102, 88}
Range = 102 – 4 = 98
n.b.: easy to find but unreliable
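The range calculation above can be sketched in a few lines of Python. The extra score of 500 is a made-up value, added only to suggest why the range is considered unreliable: a single extreme score changes it completely.

```python
# Scores from the slide example
scores = [19, 21, 73, 4, 102, 88]

# Range = highest score minus lowest score
score_range = max(scores) - min(scores)
print(score_range)  # 98

# Hypothetical extra score (not from the slides): one outlier dominates the range
with_outlier = scores + [500]
outlier_range = max(with_outlier) - min(with_outlier)
print(outlier_range)  # 496
```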
VARIABILITY/DISPERSION
Variance:
The measure of average deviation or spread of scores
around the mean
- Based on each score in the set
Calculation:
1. Obtain the mean of the distribution
2. Subtract the mean from each score to obtain a
deviation score
3. Square each deviation score
4. Add the squared deviation scores
5. Divide the sum of the squared deviation scores by the
number of subjects in the sample
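The five steps above can be followed one at a time in Python. This is a minimal sketch; the scores are hypothetical, chosen only so the arithmetic comes out in whole numbers. It divides by the number of subjects, as the slide describes. Many textbooks divide by n - 1 when estimating a population variance from a sample, which gives a slightly larger value.

```python
import statistics

# Hypothetical scores (not from the slides)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# 1. Obtain the mean of the distribution
mean = sum(scores) / len(scores)            # 5.0

# 2. Subtract the mean from each score to obtain a deviation score
deviations = [x - mean for x in scores]

# 3. Square each deviation score
squared_deviations = [d ** 2 for d in deviations]

# 4. Add the squared deviation scores
sum_of_squares = sum(squared_deviations)    # 32.0

# 5. Divide by the number of subjects
variance = sum_of_squares / len(scores)     # 4.0
print(variance)

# The standard library gives the same divide-by-n result
print(statistics.pvariance(scores))
```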
VARIABILITY/DISPERSION
Standard Deviation of a set of scores is the positive square
root of the variance
- a number which tells how much the data is spread
around its mean
The standard deviation is always equal to the square root of the variance.
Interpretation of Variance and Standard Deviation:
“The greater the dispersion around the mean of the
distribution, the greater the standard deviation and
variance”
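A short Python illustration of that statement, using two hypothetical score sets (not from the slides) that share the same mean but differ in spread. `statistics.pvariance` and `statistics.pstdev` divide by the number of scores, matching the variance calculation described earlier.

```python
import math
import statistics

# Hypothetical score sets with the same mean (5) but different spread
narrow = [4, 5, 5, 5, 6]
wide = [1, 3, 5, 7, 9]

for name, data in (("narrow", narrow), ("wide", wide)):
    variance = statistics.pvariance(data)
    sd = math.sqrt(variance)  # standard deviation = positive square root of variance
    print(f"{name}: mean = {statistics.mean(data)}, "
          f"variance = {variance:.2f}, SD = {sd:.2f}")

# The more dispersed set has the larger variance and standard deviation
print(statistics.pstdev(wide) > statistics.pstdev(narrow))  # True
```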
KURTOSIS
Kurtosis of a data set relates to how tall and
thin, or short and flat the data set is.
• Leptokurtic = tall and thin
• Mesokurtic = normal, about average
• Platykurtic = short and flat
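The slides describe kurtosis only in words. One common way to put a number on it, assumed here and not taken from the slides, is excess kurtosis: the fourth central moment divided by the squared variance, minus 3. Values above 0 correspond roughly to leptokurtic, near 0 to mesokurtic, and below 0 to platykurtic. Both data sets below are hypothetical.

```python
def excess_kurtosis(data):
    """Moment-based excess kurtosis (an assumed formula, not from the slides)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance (divide by n)
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: evenly spread scores vs. scores piled up at the mean
flat = [1, 2, 3, 4, 5]
peaked = [3, 3, 3, 3, 3, 3, 3, 3, 1, 5]

print(round(excess_kurtosis(flat), 2))    # -1.3 (platykurtic)
print(round(excess_kurtosis(peaked), 2))  # 2.0 (leptokurtic)
```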
NORMAL CURVE (BELL)
• A population distribution which appears very commonly
in life science
• Bell-shaped curve that is symmetrical around the mean of
the distribution
• Called “normal” because its shape occurs so often
• May vary from narrow (pointy) to wide (flat)
distribution
• The mean of the distribution is the focal point from
which all assumptions may be made
• Think in terms of percentages – easier to interpret the
distribution
THE NORMAL CURVE
The most commonly used frequency distribution in biostatistics.
Characteristics:
1. Total area under the curve is equal to 1.00 or 100%
2. Mean = mode = median
3. The area under the curve is divided into segments of
equal width, each one standard deviation wide
4. The proportion of area under the curve between:
A the mean and 1 SD (+ or -): 34.13%
B the 1st and 2nd SD: 13.59%
C the 2nd and 3rd SD: 2.14%
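The areas under the normal curve can be computed with Python's standard library. This sketch uses the standard normal cumulative distribution written in terms of `math.erf`; the helper names `normal_cdf` and `area_between` are illustrative.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_between(z_low, z_high):
    """Proportion of area between two points measured in SD units."""
    return normal_cdf(z_high) - normal_cdf(z_low)

print(round(area_between(0, 1) * 100, 2))  # 34.13  (mean to 1 SD)
print(round(area_between(1, 2) * 100, 2))  # 13.59  (1st to 2nd SD)
print(round(area_between(2, 3) * 100, 2))  # 2.14   (2nd to 3rd SD)

# Because the curve is symmetrical, the area within 1 SD on both sides of the mean
print(round(area_between(-1, 1) * 100, 2))  # 68.27
```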
RESEARCH TECHNIQUES
Inferential Statistics
(Statistical Inference)
• Techniques used to provide a basis for
generalizing about the probable characteristics
of a large group when only a portion of the
group is studied
• The mathematical result can be applied to the larger
population
DEFINITIONS RELATING TO
RESEARCH TECHNIQUES
Population:
• Entire group of people, items, materials, etc. with at
least one basic defined characteristic in common
• Contains all subjects of interest
• A complete set of actual or potential observations
e.g. all Ontario dentists or all brands of toothpaste
Sample:
• A subset (representative portion) of the population
• Does not have exactly the same characteristics as the
population but can be made truly representative by
using probability sampling methods and by using an
adequate sample size (5 types of “sampling”)
DEFINITIONS RELATING TO
RESEARCH TECHNIQUES
Parameters:
• Numerical descriptive measures of a population
obtained by collecting a specific piece of
information from each member of the
population
• Number inferred from sample statistics
e.g.: 2,000 women over age 50 with heart disease
DEFINITIONS RELATING TO
RESEARCH TECHNIQUES
Statistic:
• A number describing a sample characteristic.
Results from manipulation of sample data
according to certain specified procedures
• A characteristic of a sample chosen for study
from the larger population
e.g.: 210 women out of 500 with diabetes have
heart problems
DEFINITIONS RELATING TO
RESEARCH TECHNIQUES
Statistics:
• Characteristics of samples used to infer
parameters (characteristics of populations)
• A set of tools for collecting, organizing,
presenting and analyzing numerical facts or
observations
Survey:
• The process of collecting descriptive data from a
population
SAMPLING PROCEDURES
5 Types of Samples:
1. A random sample – by chance
2. A stratified sample – categorized then
random
3. A systematic sample – every nth item
4. A judgment sample – prior knowledge
5. A convenience sample – readily available
RANDOM SAMPLE
1. A random sample is one in which every element
in the population has an equal and independent
chance of being selected. This method is
preferred when possible because it equalizes the
effect of variables not under investigation but
which may influence the observations. It also
controls possible selection bias on the part of the
researcher.
Sample = 1000 / 5000 students from 50 universities
Lottery numbers or names in a hat
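Drawing names from a hat can be imitated with `random.sample`, which picks items without replacement so that every element has the same chance of selection. A minimal sketch, assuming a hypothetical population of 5000 student ID numbers; the fixed seed is there only to make the example repeatable.

```python
import random

# Hypothetical population: 5000 student ID numbers (not real data)
population = list(range(1, 5001))

rng = random.Random(42)  # fixed seed only so the example is repeatable
sample = rng.sample(population, k=1000)  # drawn without replacement

print(len(sample))       # 1000
print(len(set(sample)))  # 1000, no student is picked twice
```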
STRATIFIED RANDOM
SAMPLE
2. Stratified random sampling is employed when it
may be necessary to select elements of the
population according to certain subgroups or
categories, e.g. age or gender. This method allows
for the control of the variable on which
categorization is made. Sample subjects are then
randomly chosen from the population making up
each category.
e.g.: List of names per university – random
selection 1/5 of names
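The university example can be sketched as follows: the population is first split into categories (strata), then a random 1/5 of the names is drawn within each category. The university names and name lists are hypothetical placeholders.

```python
import random

rng = random.Random(42)  # fixed seed only so the example is repeatable

# Hypothetical strata: one list of names per university
names_by_university = {
    "University A": [f"A-{i}" for i in range(1, 101)],  # 100 names
    "University B": [f"B-{i}" for i in range(1, 51)],   # 50 names
}

one_in = 5  # select 1/5 of the names in each stratum

stratified_sample = {
    university: rng.sample(names, k=len(names) // one_in)
    for university, names in names_by_university.items()
}

for university, chosen in stratified_sample.items():
    print(university, len(chosen))
```

Each university contributes in proportion to its size (20 and 10 names here), which is how stratification controls the variable used for categorization.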
SYSTEMATIC SAMPLE
3. Systematic samples are selected by deciding
to observe every nth item in the population.
This method is not random because not every
element in the population has an equal and
independent chance for selection.
Every 5th from a list – odd or even numbers
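Selecting every 5th item from a list is a one-line slice in Python. The numbered list of 50 items is a hypothetical stand-in for a real sampling frame.

```python
# Hypothetical list of 50 numbered items
population = list(range(1, 51))

step = 5
# Start at the 5th item (index 4), then take every 5th item after it
systematic_sample = population[step - 1::step]

print(systematic_sample)  # [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
```

Once the starting point and the interval are fixed, every selected item is determined in advance, which is why the slide describes the method as not random.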
JUDGEMENT SAMPLE
4. A judgement sample has characteristics
similar to that of a stratified random sample. It
is sample selection done when the researcher,
with prior knowledge of the population or
question under investigation, arbitrarily chooses
certain criteria for representation, e.g. income,
educational levels, place of residence etc.
Could be biased.
CONVENIENCE SAMPLE
5. A convenience sample is chosen because
it is most readily available. It may or may
not be representative of the larger
population. Convenience samples are often
chosen on the basis of geographical
accessibility.
Reliability is questionable – could be biased.
VARIABLES
The items of a study that are measured.
Independent Variable(s) (intervention):
• All the factors that influence the characteristics
which are under investigation
• Some of the Independent Variables will be
manipulated as part of the study or experiment
= “controlled”
e.g.: age, gender, type of oral hygiene aid, amount
of drug administered
VARIABLES
Independent Variable(s) (intervention):
“Uncontrolled” variables cannot be manipulated:
• Subject’s prior experience
• Subject’s knowledge base
• Subject’s emotional state
• Subject’s values, beliefs
e.g.: dental hygienist evaluating tooth brushing
method for children = “controlled variable”
VARIABLES
Dependent Variable(s)
• The measurable result or outcome which the researcher
hopes will change or not change as a result of the
intervention
• Their values are determined by all of the independent
variables operational at the time of the study (both
controlled and uncontrolled)
n.b.: called dependent because result depends on
independent variable
e.g.: subject’s plaque scores / gingival condition
(measured before and after)
Result depends on method used.
POTENTIAL PATHOGENS ON
NON-STERILE GLOVES
Method = experimental
- Brief outline of experiment
Independent variables = items of a study that are
measured = the intervention
- Gloves – material and origin
- Petri dishes with growth substances
- Time and temperature of incubation
- Testing methods for identification
- Soap – type, amount and use
- Air exposure etc.
POTENTIAL PATHOGENS ON
NON-STERILE GLOVES
Dependent variable = measurable result
= The types and numbers of microorganisms found on the tested gloves
CONCEPT OF
SIGNIFICANCE
Probability – P (symbol)
When using inferential statistics, we often deal
with statistical probability.
• The expected relative frequency of a particular
outcome by chance or likelihood of something
occurring
• Coin toss
PROBABILITY
Rules of probability:
1. The (P) of any one event occurring is some value from
0 to 1 inclusive
2. The sum of the probabilities of all possible outcomes
in an experiment must equal 1
* Numerical values can never be negative nor greater
than 1
P = 0: event will never happen
P = 1: event will always happen
PROBABILITY
Calculating probability:
P = Number of possible successful outcomes
/ Number of all possible outcomes
e.g.: Coin flip:
1 successful outcome of heads
/ 2 possible outcomes = P = 0.5 or 50%
e.g.: Throw of a die:
1 successful outcome
/ 6 possible outcomes = P ≈ 0.167 or 16.7%
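The same ratio can be worked out with Python's `fractions` module, which keeps the values exact. The helper name `probability` is illustrative; the sketch assumes a fair coin and a fair six-sided die, and it also checks the rule that the probabilities of all possible outcomes add up to 1.

```python
from fractions import Fraction

def probability(successful_outcomes, possible_outcomes):
    """P = number of successful outcomes / number of all possible outcomes."""
    return Fraction(successful_outcomes, possible_outcomes)

p_heads = probability(1, 2)  # fair coin: heads
p_six = probability(1, 6)    # fair die: one chosen face

print(p_heads, float(p_heads))        # 1/2 0.5
print(p_six, round(float(p_six), 4))  # 1/6 0.1667

# The probabilities of all six faces of the die sum to 1
total = sum(probability(1, 6) for _ in range(6))
print(total)  # 1
```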
HYPOTHESIS TESTING
• The first step in determining statistical
significance is to establish a hypothesis
• To answer questions about differences or
to test credibility about a statement
e.g.: ? – does brand X toothpaste really
whiten teeth more than brand Y ?
HYPOTHESIS TESTING
Null hypothesis (Ho) = there is no statistically
significant difference between brand X and
brand Y
Positive hypothesis = brand X does whiten more
* Ho – most often used as the hypothesis
* Ho – assumed to be true
Therefore the purpose of most research is to
examine the truth of a theory or the effectiveness
of a procedure and make them seem more or less
likely!
HYPOTHESIS
CHARACTERISTICS
A hypothesis must have these characteristics in order to be researchable.
Feasible
• Adequate number of subjects
• Adequate technical expertise
• Affordable in time and money
• Manageable in scope
Interesting to the investigator
Novel
• Confirms or refutes previous findings
• Extends previous findings
• Provides new findings
HYPOTHESIS
CHARACTERISTICS
Ethical
Relevant
• To scientific knowledge
• To clinical and health policy
• To future research direction
SIGNIFICANCE LEVEL
A number (α = alpha) that acts as a cut-off point below
which we agree that a difference exists = Ho is rejected.
Alpha is almost always either 0.01, 0.05 or 0.10.
• Represents the amount of risk we are willing to take of
being wrong in our conclusion
P < 0.10 = 10% chance
P < 0.05 = 5% chance
P < 0.01 = 1% chance (cautious)
• Critical value cut-off point of sample is set before
conducting the study (usually P < 0.05)
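The cut-off rule can be written as a single comparison. In this sketch the function name `reject_null` and the P value of 0.03 are hypothetical; a real P value would come from a statistical test, which is outside the scope of these slides.

```python
def reject_null(p_value, alpha=0.05):
    """Reject Ho when the P value falls below the chosen significance level."""
    return p_value < alpha

p_value = 0.03  # hypothetical result of a statistical test

for alpha in (0.10, 0.05, 0.01):
    decision = "reject Ho" if reject_null(p_value, alpha) else "do not reject Ho"
    print(f"alpha = {alpha:.2f}: {decision}")
```

The same result is significant at the 0.10 and 0.05 levels but not at the more cautious 0.01 level, which is why alpha has to be fixed before the study is carried out.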
ERRORS
Type I (Alpha):
• Is made when we reject the null hypothesis
when, in fact, it is true, therefore could lead to
practicing worthless treatments that do not
work.
Type II (Beta):
• Is made when we do not reject the null
hypothesis when, in fact, it is false, therefore
could lead to overlooking a promising treatment.
e.g.: the law – “innocent” or “guilty”
DEGREE OF FREEDOM
(d.f.)
• Most tests for statistical significance require application
of the concept of d.f.
• d.f. refers to the number of observed values which are free to
vary after we have placed certain restrictions on the data
collected
* d.f. usually equals the sample size minus 1
e.g.: 8, 2, 15, 10, 15, 7, 3, 12, 15, 13 (sum = 100)
d.f. = number of values (10) minus 1 = 9
• Takes chance into consideration
• A penalty for uncertainty, so the larger the sample the
less the penalty
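One way to read the example, sketched in Python: if the total of the ten values is fixed at 100 (the restriction), nine values can vary freely, but the tenth is then forced. The variable names are illustrative.

```python
# Values from the slide example
values = [8, 2, 15, 10, 15, 7, 3, 12, 15, 13]

total = sum(values)  # 100, the restriction placed on the data

# Once the first nine values are known, the last one is no longer free to vary
free_values = values[:-1]
last_value = total - sum(free_values)
print(last_value)  # 13

degrees_of_freedom = len(values) - 1
print(degrees_of_freedom)  # 9
```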