Probability and Statistics
Chapter 1 Notes
I. Section 1-1
A. Statistics is the science of collecting, organizing, analyzing,
and interpreting data in order to make decisions.
1. Data is information coming from observations, counts,
measurements, or responses.
a. There are 2 types of data sets.
1) Population – the collection of all outcomes,
responses, measurements, or counts that are of
interest.
a) The set of all possible measurements, counts
or observations that are of interest in a
particular study.
2) Sample – A subset of the population.
a) Since it is usually impractical or even impossible
in terms of time or money to obtain every
possible response, we must often rely on
information obtained from a sample.
b) Random Sample – A sample in which every
member of the population has an equal
chance of being selected.
b. There are 2 types of numerical descriptions
1) Parameter – A numerical description of a
population characteristic.
2) Statistic – A numerical description of a sample
characteristic.
B. Branches of Statistics
1. Descriptive Statistics
a. The branch of statistics that involves the organization,
summarization, and display of data.
2. Inferential Statistics
a. The branch of statistics that involves using a sample to
draw conclusions about a population.
1) A basic tool in the study of inferential statistics is
probability.
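A minimal Python sketch of the parameter/statistic distinction follows. The population (ages of 1,000 club members) is made-up data and all variable names are assumptions for illustration only.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a club (made-up data).
rng = random.Random(1)  # fixed seed so the sketch is repeatable
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: a numerical description of a population characteristic.
population_mean = statistics.mean(population)

# Statistic: the same description computed from a sample (a subset).
sample = rng.sample(population, 50)  # 50 members drawn at random, no repeats
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Inferential statistics uses the sample mean to draw a conclusion about the population mean, which in practice is usually unknown.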
Assignment:
Classwork: Page 8, #1-38 ALL
II. Section 1-2
A. Types of Data
1. Qualitative Data
a. Attributes, labels or non-numerical entries.
2. Quantitative Data
a. Numerical measurements or counts.
B. Levels of Measurement
1. Nominal Data
a. Consists of names, categories, qualities, or labels.
Example: type of car you drive.
b. Can put data into categories, but we are unable to
determine if one piece of data is better or higher than
another.
c. When numbers are used as labels, such as on an athletic
jersey, they are classified as nominal data.
2. Ordinal Data
a. Designations or numerical rankings which can be
arranged in ascending or descending order.
1) TV ratings for #1 show, #2 show, etc.
b. We can compare rankings as to which is higher; however,
it does not make sense to subtract one rank value from
another.
1) Differences in rankings are not meaningful
computations.
3. Interval Data
a. Can be subtracted to find the difference between two
values, put in order, and put into categories.
b. Data is numerical; 0 can be used to indicate a position in
time or space. However, the zero at this level does not
correspond to “none” of the specific variable being
measured.
1) Zero degrees on a Fahrenheit or Celsius thermometer
does not indicate that there is absolutely no heat present.
c. Differences between data values are meaningful but it
does not make sense to compare one data value as being
twice (or any multiple of) another.
1) Two common examples are calendar dates (or times of
day) and temperature in degrees Fahrenheit or Celsius.
4. Ratio Data
a. The highest level of measurement.
1) The number of gallons of gasoline you put into your
car today.
b. There is a zero on this scale which is interpreted as
“none” of the variable in question.
1) It is possible to put zero gallons of gas into your tank
today.
2) This is called an “inherent” zero.
c. It is meaningful to say one measure is two times, or three
times, as much as another.
1) You may have put twice as much gas in your car today
as you did last week.
5. How to tell Interval data from Ratio data.
a. Does the expression “twice as much” have any meaning in
the context of the data?
1) $2 is twice as much as $1, so these data points are at
the ratio level.
2) A temperature of 2 degrees (Fahrenheit or Celsius) is NOT
twice as warm as 1 degree, so these data points are at the
interval level.
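The "twice as much" test can be checked numerically. The sketch below is an illustration with made-up values: a ratio of two temperatures changes when the unit changes (interval level), while a ratio of two amounts of money does not (ratio level). The function and variable names are assumptions.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval level: 20 degrees C looks like "twice" 10 degrees C ...
low_c, high_c = 10.0, 20.0
ratio_in_celsius = high_c / low_c

# ... but the same two temperatures are 50 and 68 degrees F,
# so the "ratio" depends on the unit and has no real meaning.
low_f = celsius_to_fahrenheit(low_c)
high_f = celsius_to_fahrenheit(high_c)
ratio_in_fahrenheit = high_f / low_f

# Ratio level: money has an inherent zero, so the ratio
# is the same whether we count dollars or cents.
dollars_a, dollars_b = 1.0, 2.0
ratio_in_dollars = dollars_b / dollars_a
ratio_in_cents = (dollars_b * 100) / (dollars_a * 100)

print(ratio_in_celsius, ratio_in_fahrenheit)  # the two values differ
print(ratio_in_dollars, ratio_in_cents)       # the two values agree
```

Differences between temperatures remain meaningful in either unit, which is why temperature still qualifies as interval data.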
III. Section 1-3
A. Design of a Statistical Study
1. Identify the variable(s) of interest (the focus) and the
population of the study.
2. Develop a detailed plan for collecting data.
3. Collect the data.
4. Describe the data, using descriptive statistics techniques.
5. Interpret the data and make decisions about the population
using inferential statistics.
6. Identify any possible errors.
B. Data Collection
1. Do an Observational Study
a. Observe and measure characteristics of interest of part
of a population, but do NOT change existing conditions.
2. Do an Experiment
a. Apply a treatment to part of a population and observe
responses or results.
b. Observe another part of the population as a control
group.
1) May use a placebo in place of the treatment being
tested.
3. Use a simulation
a. Use a mathematical or physical model to reproduce the
conditions of a situation or process.
1) Simulations allow us to study situations that are
impractical or even dangerous to create in real life.
a) Testing the effects of alcohol on a pilot’s ability to
fly is best done in a flight simulator.
b) Predicting how quickly and how far a disease may
spread is also best done using a computer model.
2) Simulations often save time and/or money.
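As a small illustration of using a mathematical model to reproduce a process, the Python sketch below simulates rolling two dice many times to estimate how often the sum is 7. The dice scenario, the function name, and the number of trials are assumptions chosen for illustration; they do not come from these notes.

```python
import random

def estimate_probability_of_seven(trials, seed=0):
    """Simulate rolling two fair dice `trials` times and return the
    fraction of rolls whose sum is 7."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == 7:
            hits += 1
    return hits / trials

estimate = estimate_probability_of_seven(100_000)
print(f"estimated P(sum = 7): {estimate:.3f}")  # theoretical value is 1/6
```

Running 100,000 physical dice rolls would take hours; the simulation finishes almost instantly, which is the time saving the notes describe.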
4. Use a survey
a. A survey is an investigation of one or more
characteristics of a population.
1) Usually carried out on people by asking them to
respond to questions.
b. It’s important to word the questions so that they do not
lead to biased results.
C. Experimental Design
1. Experiments must be carefully designed in order to produce
meaningful, unbiased, results.
a. The Hawthorne effect occurs in an experiment when
subjects change their behavior simply because they know
they are participating in an experiment.
2. Three key elements of a well-designed experiment are
control, randomization, and replication.
a. Control
1) It is important to control as many influential factors as
possible in a study.
2) A confounding variable occurs when an experimenter
cannot tell the difference between the effects of
different factors in an experiment.
3) Placebo effect occurs when a subject reacts favorably
to a placebo when in fact they have been given no
medical treatment at all.
a) Blinding is a technique used in which the subject
does not know whether he or she is receiving a real
treatment or a placebo.
b) An experiment is double-blind when neither the
subjects nor the experimenter knows which
individual subjects are receiving a treatment and
which are receiving a placebo.
1. The experimenter only finds out which subjects
are which after all the data have been collected.
b. Randomization is a process of randomly assigning
subjects to different treatment groups.
1) Randomized block design – Divide subjects with
similar characteristics into blocks, and then randomly
split each block up into different treatment groups.
2) Matched-pairs design – Subjects are paired up
according to a similarity.
a) One subject in each pair is randomly selected to
receive one treatment, while the other one gets
another, different treatment.
c. Replication is the repetition of an experiment using a
large group of subjects.
1) The larger the sample size, the better.
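The two randomization designs above can be sketched in Python as follows. The subject labels, the age-group characteristic, the pairings, and both function names are hypothetical and used only for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical subjects, each tagged with an age group (made-up data).
subjects = [
    ("S01", "under 40"), ("S02", "under 40"),
    ("S03", "under 40"), ("S04", "under 40"),
    ("S05", "40 and over"), ("S06", "40 and over"),
    ("S07", "40 and over"), ("S08", "40 and over"),
]

def randomized_block_assignment(subjects, rng):
    """Divide subjects into blocks by shared characteristic, then randomly
    split each block between a treatment group and a control group."""
    blocks = {}
    for name, characteristic in subjects:
        blocks.setdefault(characteristic, []).append(name)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for name in shuffled[:half]:
            assignment[name] = "treatment"
        for name in shuffled[half:]:
            assignment[name] = "control"
    return assignment

def matched_pairs_assignment(pairs, rng):
    """Within each pair, randomly choose which subject receives
    treatment A; the other subject receives treatment B."""
    assignment = {}
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        assignment[first] = "A"
        assignment[second] = "B"
    return assignment

assignment = randomized_block_assignment(subjects, rng)

# Hypothetical pairs of similar subjects.
pairs = [("S01", "S02"), ("S03", "S04"), ("S05", "S06"), ("S07", "S08")]
pair_assignment = matched_pairs_assignment(pairs, rng)

print(assignment)
print(pair_assignment)
```

In both designs the researcher decides the groupings, but chance alone decides who receives which treatment.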
D. Sampling Techniques
1. Census – a count or measure of an entire population.
a. Provides complete information, but is often too costly or
difficult to perform.
2. Sampling – a count or measure of part of a population.
a. Researcher must ensure that the sample is
representative of the population.
1) This is necessary to ensure that inferences about a
population are valid.
a) Sampling error – the difference between the
results of a sample and those of the population.
b. Random sample – a sample in which every member of
the population has an equal chance of being selected.
1) Methods of sampling randomly
a) Simple Random Sample – assign each member of
the population a number and then randomly select
the numbers that you will survey.
1. Random number table (Appendix B of the book)
a. Randomly pick a starting point
b. Read off digits in groups whose length matches the
number of digits in your population size.
c. Record the numbers, ignoring those that are
larger than the population size.
2. Calculator
a. Press MATH, select PRB, then press 5 (randInt).
b. Enter the number that you started with when
assigning labels to your population, then a
comma, then the last number you assigned,
comma, and the sample size you wish to use.
1) The calculator will generate the requested
quantity of random numbers.
3. If you do not want to have any member of the
population included in the sample twice, the
sampling process is said to be without
replacement.
4. If you don’t care if a member of the population
is included twice, the sampling process is said to
be with replacement.
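The same simple random sample can be drawn in software. The Python sketch below assumes a hypothetical population of 200 members labeled 1 through 200 and a sample size of 10; it shows sampling both without and with replacement.

```python
import random

rng = random.Random(7)   # fixed seed so the sketch is repeatable
population_size = 200    # members labeled 1..200 (hypothetical)
sample_size = 10

labels = range(1, population_size + 1)

# Without replacement: no member can be selected twice.
without_replacement = rng.sample(labels, sample_size)

# With replacement: every draw is independent, so repeats are possible.
# This mirrors generating random integers between the first and last label.
with_replacement = [rng.randint(1, population_size) for _ in range(sample_size)]

print(sorted(without_replacement))
print(sorted(with_replacement))
```

If a random-integer generator is used and repeats are not wanted, duplicates have to be discarded and replaced with new draws.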
b) Stratified Sample
1. Separate population into two or more subsets,
called strata, using some similar characteristic.
a. Randomly select members of each stratum to
make up your sample.
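A minimal Python sketch of a stratified sample, assuming a hypothetical school of 200 students with grade level as the shared characteristic. The data and the function name are made up for illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical population: 50 students in each of four grade levels.
grades = ["freshman", "sophomore", "junior", "senior"]
population = [(f"{grade}-{n}", grade) for grade in grades for n in range(1, 51)]

def stratified_sample(population, per_stratum, rng):
    """Separate the population into strata, then randomly select
    `per_stratum` members from every stratum."""
    strata = {}
    for member, stratum in population:
        strata.setdefault(stratum, []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

sample = stratified_sample(population, per_stratum=5, rng=rng)
print(sample)
```

Every stratum is guaranteed to be represented, which a simple random sample of the same size cannot promise.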
c) Cluster Sample
1. When the population is already divided into
subsets that are very similar to each other, you
could randomly select a number of entire
groups (not all the groups) and do your data
collection on those groups.
a. We call these groups clusters.
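A minimal Python sketch of a cluster sample, assuming a hypothetical school whose students are already divided into 10 homerooms of 20 students each. The data and the function name are made up for illustration.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: 10 homerooms with 20 students each.
clusters = {
    f"homeroom {h}": [f"H{h}-student{s}" for s in range(1, 21)]
    for h in range(1, 11)
}

def cluster_sample(clusters, number_of_clusters, rng):
    """Randomly select whole clusters, then include every member
    of each selected cluster in the sample."""
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

chosen, sample = cluster_sample(clusters, 3, rng)
print(chosen)
print(len(sample))
```

Contrast this with a stratified sample: a stratified sample takes some members from every group, while a cluster sample takes every member from some groups.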
d) Systematic Sample
1) Each member of the population is assigned a
number.
a. Put the members of the population in order
somehow.
b. Randomly select a starting point.
c. Choose an interval, n (for example, the population
size divided by the desired sample size).
d. Survey every nth member of the population
from your starting point.
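A minimal Python sketch of a systematic sample, assuming a hypothetical ordered population of 100 members and a desired sample of 10. Here the interval is computed as the population size divided by the sample size, which is one common convention and an assumption of this sketch rather than something stated in these notes.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population: members already in order, labeled 1..100.
population = list(range(1, 101))
sample_size = 10

# Assumed convention: interval = population size // sample size.
interval = len(population) // sample_size

# Random starting position somewhere within the first interval.
start = rng.randrange(interval)

# Take the starting member and every `interval`-th member after it.
sample = population[start::interval]

print(sample)
```

Only the starting point is left to chance; once it is chosen, the rest of the sample is fully determined by the interval.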
e) Convenience Sample
1) NOT RECOMMENDED!!
a. Simply select those members of the
population who are readily available.