Download Data - Amarillo ISD Blogs

Transcript
Science Skills
AP Biology
Science has principles
 Science seeks to explain the natural world
and its explanations are tested using
evidence from the natural world
 Science assumes we can learn about the
natural world by gathering evidence
Science is a process
 Scientific ideas are developed through reasoning
 Scientific claims are examined using collected
evidence
 Scientific claims are subject to peer review and
replication
McClure-Ottmers
Science is a process
 No such thing as “The Scientific Method”
Science involves continuous observations, questions,
multiple hypotheses and more observations
Science seldom concludes and never proves
Science is a process--Theories
 Central to scientific thinking
 Overarching explanations that make sense of some
aspect of nature
 Based on evidence
 Allow scientists to make valid predictions
 Tested in many ways
 Supported, modified or replaced as new evidence appears
 Give scientists frameworks within which to work
 Big ideas within which scientists test specific hypotheses
Characteristics of Science
 Conclusions of science are reliable,
though tentative
 Science is not democratic
Science is based
on evidence, not
votes
Characteristics of science
 Science is non-dogmatic
Not based on faith or belief systems
 Science cannot make moral or aesthetic
decisions
Developing Hypotheses
 Proposed explanations
 Tentatively explain something observed
 Must be testable and falsifiable
 Can be supported through evidence, but not proven
 Proposed as statements, not questions
Types of Hypotheses
 Null Hypothesis
States that there is no relationship between 2 variables;
findings probably occurred due to chance events
 Alternative hypothesis
States that there is a relationship between 2 variables; findings
probably did NOT occur due to chance events
 Scientists often state both types of hypotheses in order to analyze
results statistically
What effect does fertilizer have on the
growth of bermuda grass in West Texas?
 H0—If fertilizer was added to the soil where bermuda
grass grows, then no extra growth of the grass would be
observed.
 Ha1—If fertilizer was added to the soils where bermuda
grass grows, then the grass would grow at a faster rate
than grass without fertilizer.
 Ha2—If fertilizer was added to the soils where bermuda
grass grows, then the grass would grow at a slower rate
than grass without fertilizer.
Experimental Design
1. Determine variables
Dependent
Variable measured in an experiment
Independent
Variable changed in an experiment
Controlled/constant
Variables that are held constant in an experiment
Experimental Design
2. Designing a procedure
Level of treatment
Value set for the independent variable
Replicates
Experiments cannot be valid if conclusions are
only based on one or two individuals
Procedures usually repeated several times with
several individuals
Control group
Independent variable is either held constant or
omitted
Different from controlled variables!
Experimental Design
3. Making Predictions
Based on the experiment written in the form
of if/then statements
Built into a working hypothesis!
“If the hypothesis is true, then the results of the
experiment will be…”
Provides critical analysis of experimental
design
Used to evaluate results of experiment
Collecting Data
 What kind of data is needed to answer
question asked?
 Categories of Questions in Biology:
compare phenomena, events or populations
Is A different from B?
look for association between variables
How are A and B correlated?
Collecting Data
 Decide how data should be collected so
that question can be answered—do this
BEFORE running experiment!
 English statistician R.A. Fisher once said,
“To consult the statistician after an experiment is
finished is often merely to ask him to conduct a
post mortem examination. He can perhaps say
what the experiment died of.”
Data
 Qualitative
Not numerical
Usually subjective
 Quantitative
Numerical
Lends itself to statistical analysis
Two types
Discrete
Finite values
Integers or Bucket categories such as “red” or “tall”
Continuous
Infinite number of values
Forms a continuum
Which graph shows continuous data? Which shows discrete data?
[Figure: Graph A and Graph B. Adapted from iLoveBiology.net]
Data
 Data collected will usually be
Parametric—normal distribution
Nonparametric
Frequencies
Statistical Tests and Graph Styles
Comparative statistics
--compare phenomena, events, or populations
--Is A different from B?
Parametric data (normal data): bar graph
Nonparametric data: box-and-whisker plot
Frequency data (counts): bar graph or pie chart
Adapted from iLoveBiology.net
Association statistics
--look for associations between variables
--How are A and B correlated?
Parametric data and nonparametric data: scatterplot
Adapted from iLoveBiology.net
Elements of Effective Graphs
 Informative Title
 Easily identifiable lines/bars
 Axes clearly labeled with units
X—independent variable
Y—dependent
 Uniform intervals
 Clarify whether data starts at origin (0,0)
Line should not extend to origin if data does not
start there
 Line should not extend past last point
 Include standard error bars when appropriate
Bar Graphs
 Use to
Visually compare categorical or count data
Visually compare calculated means with error
bars for normal data
Bar Graphs
 Examples of questions where bar graphs
might be produced
Are the spines on fish in one lake without
predators shorter than the spines on fish in
another lake with predators?
Are the leaves of ivy grown in the sun
different from the leaves of ivy grown in the
shade?
Bar Graphs
 Standard error bars provide more
information about how different two means
may be from each other (sample standard
error)
Scatterplots
 Use when comparing one measured
variable against another
 Can calculate linear regression line if
relationship is thought to be linear
use to help determine statistical correlation
between x and y variables
infer possibility of causal mechanisms
r = correlation coefficient
Range: -1 to +1
Stronger relationship as r gets closer to -1 or +1
r near 0 indicates little or no linear relationship
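As a hedged illustration of the correlation coefficient, the sketch below computes Pearson's r for paired measurements with the Python standard library only. The function name `pearson_r` and the sample values are made up for illustration and do not come from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours of light (x) vs. plant height in cm (y).
light = [1, 2, 3, 4, 5]
height = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(light, height)
print(round(r, 3))
```

A value close to +1 here reflects the nearly straight-line increase in the made-up data; it does not by itself show that light causes the growth.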
Box and Whisker Plots
 Allow graphical comparison of two
samples of nonparametric data
appropriate descriptive statistics to use with
graph are median and quartile values
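The median and quartile values behind a box-and-whisker plot can be sketched as follows. This is a minimal example using Python's `statistics.quantiles` with its default method; the helper name `five_number_summary` and the data are assumptions for illustration, and other quartile conventions give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points between quarters.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

# Hypothetical nonparametric sample.
sample_values = [1, 2, 3, 4, 5, 6, 7]
summary = five_number_summary(sample_values)
print(summary)
```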
Histograms
 Frequency diagrams
 Use when an investigation involves measurement
data
Used to display distribution of data
 Provides representation of central tendencies and
spread of data
Use to determine whether data is parametric or
nonparametric
 Must set up
Bins
Uniform range intervals that cover entire range of data
Range of units
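Setting up uniform bins can be sketched in a few lines. This is only an illustration: the function name `histogram_counts`, the bin width, and the leaf widths are hypothetical, and empty bins are simply omitted from the result.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall into each uniform bin [low, low + bin_width)."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = {}
    for v in values:
        if v < start:
            raise ValueError("start must not exceed the smallest value")
        index = int((v - start) // bin_width)   # which bin the value lands in
        low = start + index * bin_width         # lower edge of that bin
        counts[low] = counts.get(low, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical leaf widths in cm, grouped into 1 cm bins starting at 3 cm.
widths_cm = [3.1, 3.4, 4.0, 4.2, 4.4, 4.9, 5.3, 5.5, 6.8]
counts = histogram_counts(widths_cm, bin_width=1.0, start=3.0)
for low, n in counts.items():
    print(f"{low:.1f}-{low + 1.0:.1f} cm: {'#' * n}")
```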
Line Graphs
 Used when data
on both axes are
continuous
 Dots indicate
measurements
that were actually
made
Using Graphs
 Estimation—Interpolation/Extrapolation
 Calculating Rate--Use slope
$m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$
$\text{Slope} = \frac{\text{Rise}}{\text{Run}}$
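A small sketch of using slope to calculate a rate and to estimate a value between two measured points. The function names and the readings are hypothetical; the estimate assumes the relationship is a straight line between the two points.

```python
def slope(x1, y1, x2, y2):
    """Rate of change between two points: rise over run."""
    if x2 == x1:
        raise ValueError("x1 and x2 must differ (run cannot be zero)")
    return (y2 - y1) / (x2 - x1)

def interpolate(x1, y1, x2, y2, x):
    """Estimate y at x on the straight line through the two points."""
    return y1 + slope(x1, y1, x2, y2) * (x - x1)

# Hypothetical readings: 2 mL of gas at 1 min, 8 mL at 4 min.
rate = slope(1, 2, 4, 8)               # mL per minute
estimate = interpolate(1, 2, 4, 8, 3)  # estimated volume at 3 min
print(rate, estimate)
```

Calling `interpolate` with an x outside the measured range is extrapolation, which is less reliable because nothing confirms the trend continues.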
[Figure: example line graphs. Adapted from iLoveBiology.net]
Positive slope: rate increasing
Negative slope: rate decreasing
Zero slope: constant rate
Axis break: indicates some values were skipped
Why bother with data analysis?
 Appropriate techniques allow generation
of measures of confidence that lead to
greater precision
 Allows you to make claims with
confidence
 Allows you to decide whether results you
observe are due to chance or some real
difference
Descriptive Statistics
 Used to estimate important parameters of sample
data set
 Allows us to estimate how well sample data
represent true population
 Allows data to be summarized
 Can show variation, standard error, and
confidence that sufficient data have been
collected
Descriptive Statistics
 Examples
Sample standard deviation
Describes variability in data
Measurements of central tendency
Mean, median, mode
Range (a crude measure of spread)
Sample standard error of sample mean
Confidence Intervals
Helps determine confidence in sample mean
Inferential Statistics
 Includes tools and methods that rely on
probability theory and distributions to
determine precise estimates of true
population parameters from sample data
Population vs. Sample
 Often, researchers want
to investigate a
population (N)
may not be feasible to
collect data for every
member of entire
population
 sample (n)
smaller group of
members of a population
selected to represent
population.
must be random
Adapted from iLoveBiology.net
If sample is not collected randomly, it may
not closely reflect original population.
This is called sampling bias.
Adapted from iLoveBiology.net
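One way to avoid sampling bias is to let a random number generator choose the sample. The sketch below is an assumption-laden illustration: the population of 200 numbered leaves, the sample size, and the helper name `draw_random_sample` are all hypothetical.

```python
import random

def draw_random_sample(population, n, seed=None):
    """Pick n distinct members so that each member is equally likely to be chosen."""
    rng = random.Random(seed)   # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population: leaves numbered 1 to 200.
leaf_ids = list(range(1, 201))
sample = draw_random_sample(leaf_ids, 30, seed=1)
print(sorted(sample))
```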
Data Analysis
 Investigations involving measurement data
Construct histogram
Determine whether data has normal distribution
Could you have a sample distribution that doesn’t
“look” parametric but does represent a normally
distributed population of measurements?
Small sample size
Measurement error
Sampling error—random or nonrandom?
Descriptive Statistics
 Allows data to be summarized/Highlights trends or patterns in data
 Sample Mean
Average of all data entries
Measure of central tendency for normally distributed data
 Population Mean-- µ
Average of all data from all members of a population
 Median
Middle value
Good measure of central tendency for skewed distributions
 Mode
Most common value
Suitable for bimodal distributions and qualitative data
 Range
Difference between smallest and largest value
Crude indication of data spread
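These summary measures can be computed with Python's built-in `statistics` module. The sketch below uses the six values from the standard deviation example in these slides; the second list, used to show a bimodal case, is made up.

```python
import statistics

data = [2, 5, 9, 12, 15, 17]

data_mean = statistics.mean(data)       # average of all entries
data_median = statistics.median(data)   # middle value (mean of 9 and 12 here)
data_range = max(data) - min(data)      # largest minus smallest value

# multimode returns every most-common value, so it copes with bimodal data.
modes = statistics.multimode([1, 2, 2, 3, 3])

print(data_mean, data_median, data_range, modes)
```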
Measuring Spread in Data
 Variance (s²) and standard deviation (s)
measure how far a data set is spread out.
 Variance of zero: all values in data set are
identical
[Figure: variance shown as distance from the mean]
Measuring Spread of Data
 Differences from mean are squared to
calculate variance
So…units of variance are not same as units in
original data set
 Standard deviation=square root of variance
Expressed in same units as original data set
So….more useful than variance!
Standard Deviation
Smaller Standard deviation
shows values clustered
tightly around mean
Larger Standard deviation
shows values spread out
widely from mean
Standard Deviation
Data: 2, 5, 9, 12, 15, 17
1. Calculate mean: 60/6 = 10
2. Find difference between each
term and mean
x     difference    squared difference
2     (2-10)        (2-10)² = 64
5     (5-10)        (5-10)² = 25
9     (9-10)        (9-10)² = 1
12    (12-10)       (12-10)² = 4
15    (15-10)       (15-10)² = 25
17    (17-10)       (17-10)² = 49
Total                          168
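The transcript stops at the total of the squared differences. A minimal sketch of the full calculation follows, assuming the sample formulas (divide by n - 1, then take the square root), which matches the "sample standard deviation" named elsewhere in these slides.

```python
from math import sqrt

data = [2, 5, 9, 12, 15, 17]
n = len(data)

mean = sum(data) / n                             # 60 / 6 = 10
squared_diffs = [(x - mean) ** 2 for x in data]  # 64, 25, 1, 4, 25, 49
sum_of_squares = sum(squared_diffs)              # 168

variance = sum_of_squares / (n - 1)              # sample variance, s squared
std_dev = sqrt(variance)                         # sample standard deviation, s

print(sum_of_squares, variance, round(std_dev, 2))
```

The standard deviation comes out in the same units as the original data, while the variance is in squared units.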
Standard Deviation
 mean & standard deviation help estimate
characteristics of population from a single sample
Inferential Statistics--SE
Reliability of the Mean
Interpreting & Communicating Results
 Study data to decide whether hypothesis
is supported or falsified
 Present conclusions in a scientific paper
Peer reviewed
Published in scientific journal
 Ideas, procedures, results, analyses and
conclusions critically scrutinized by other
scientists
Hypothesis Testing
 Hypothesis testing does not allow proof
or acceptance of the alternative to the null
hypothesis!
 Testing allows us to find support for the
alternative hypothesis by rejecting the
null hypothesis.
Hypothesis Testing
 Formal process to determine whether to
reject null hypothesis
1. state hypotheses—null and alternative
should be mutually exclusive
2. Determine which test statistic to use
3. Analyze sample data and find value of test
statistic
4. Interpret results—if value is unlikely based
on null hypothesis then reject
Example—English Ivy Leaves
 Do shady English ivy leaves have a
larger surface area than sunny
English ivy leaves?
 Propose Hypotheses
 H0 = The true population mean width of
ivy leaves grown in the shade is the
same as the true population mean
width of ivy leaves grown in the sun.
 H1 = The true population mean width of
ivy leaves grown in the shade is larger
than the true population mean width of
ivy leaves grown in the sun.
Example—English Ivy Leaves
Sampling
 Choose smaller samples
instead of entire population
Why?
How?
Random and unbiased
 Collected and measured max
width in cm of 30 leaves from
each habitat
Example
 Just looking at this data
in this form does not
answer question
Example
 Data Analysis
determine confidence in data collected
Is difference between two groups real or due to
some chance event?
 Data measurements
Units are cm
continuous measurement data
not counts or categories
 What is first step?
Construct histogram to check for normal
distribution!
Normally distributed?
Close enough!
Example
 Since Data are Parametric
Calculate Descriptive Statistics
Mean
Standard deviation
Calculate Inferential Statistic
Standard Error
Example
 Produce bar graph to compare means including
error bars of ±1 SE
Do SE bars overlap?
Would SE bars overlap
if ±2 SE were graphed?
What does SE suggest
about two populations?
Use SE statistic as inference
to describe confidence that
means of samples represent
true population means
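The questions about overlapping error bars can be explored with a short sketch. It assumes the usual formula for the standard error of the mean, SE = s / sqrt(n); the leaf widths below are hypothetical and are not the data collected in the example.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Sample standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

def error_bars_overlap(sample_a, sample_b, k=1):
    """True if the mean +/- k*SE intervals of the two samples overlap."""
    low_a = mean(sample_a) - k * standard_error(sample_a)
    high_a = mean(sample_a) + k * standard_error(sample_a)
    low_b = mean(sample_b) - k * standard_error(sample_b)
    high_b = mean(sample_b) + k * standard_error(sample_b)
    return low_a <= high_b and low_b <= high_a

# Hypothetical leaf widths in cm.
shade = [5.1, 5.6, 6.0, 6.3, 5.8, 6.1]
sun = [4.2, 4.8, 4.5, 5.0, 4.4, 4.7]

print(error_bars_overlap(shade, sun, k=1))  # bars of +/- 1 SE
print(error_bars_overlap(shade, sun, k=2))  # bars of +/- 2 SE
```

Bars that do not overlap even at ±2 SE hint at a real difference between the populations, but a formal test is still needed to support that claim.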
Example
 Non-overlapping SE bars suggest that there may be a statistically
significant difference between the two populations
 A more rigorous statistical test must be performed to
confirm that the two populations differ from one another
Example
 Most biological studies establish a critical value of
the probability of whether results occur by chance
alone
 When observations deviate from the predictions, how
much variation should be tolerated before rejecting
null hypothesis?
In biological investigations, a 5% critical value is often used
as a decision point for rejecting null hypothesis.
A more stringent critical value (1% or 0.1%) may be set, for example in the
life-and-death issues often associated with medical studies
Example
 For two leaf populations
p=0.016%
less than 5% critical value
reject null hypothesis that there is no difference
between means of two populations
 provides support for alternative hypothesis
leaves in shady areas are larger than leaves found in the
sun in English ivy plants
 Only provides support for alternative hypothesis—
doesn’t cause you to accept it!
 Additional studies
chlorophyll amounts, leaf area, stomata densities, or light
response curves.
More Hypothesis Testing
—Chi Square Test
 Use with frequency counts
 Test to see if data supports null hypothesis
No difference between observed and expected values
Any difference is due to chance
 Compare observed and expected values
Is variance from expected values due to random chance?
Is there another factor influencing data?
$\chi^2 = \sum \frac{(o - e)^2}{e}$
o = observed value, e = expected value
Chi-Square Example
 An ecologist is studying habitat preferences of
periwinkles on the rocky coast line of the New
England Coast.
 She hypothesizes that more periwinkles will be found
closer to the tide line.
 To test her hypothesis, she collects data by
counting the number of periwinkles within a 0.5 m²
quadrat sample that she observes on a rocky coast
line location at low tide.
 Determine if the difference in number of
periwinkles observed in each location is
statistically significant.
Distance from low tide    # of periwinkles observed
At low tide line          35
1 m above low tide        24
2 m above low tide        10
3 m above low tide        3
4 m above low tide        2
Total                     75
Null Hypothesis: There is no difference in the number of
periwinkles observed at each of the water levels.
If the null hypothesis is not rejected, then there is no evidence of a
difference in the distribution of periwinkles on the shoreline
Category     o     e     o-e    (o-e)²   (o-e)²/e
Low tide     35    21    14     196      9.33
1 m above    34    21    13     169      8.05
2 m above    10    21    -11    121      5.76
3 m above    3     21    -18    324      15.43
4 m above    2     21    -19    361      17.19
χ² = 55.76
 The critical p-value (significance level) is a predetermined
choice of how certain we want to be.
 Smaller critical p-values: more confidence we
can claim.
 A critical value of p = 0.05 means that we can claim 95%
confidence.
 Compare chi-square value to table of
values according to the number of degrees
of freedom
df = number of categories – 1
df = 5-1=4
 If χ² value is less than critical value,
fail to reject null hypothesis.
difference is not statistically significant
 If χ² value is greater than or equal to
critical value, reject null hypothesis.
difference is statistically significant
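The whole chi-square procedure can be sketched in code. This example takes the observed counts and the expected value of 21 per category from the worked table as given; the critical values are the standard table entries for p = 0.05. The names `chi_square` and `CRITICAL_0_05` are chosen for illustration.

```python
# Critical chi-square values at p = 0.05, indexed by degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_square(observed, expected):
    """Sum of (o - e)^2 / e over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [35, 34, 10, 3, 2]   # counts from the worked table
expected = [21] * 5             # expected value used in the worked table

chi2 = chi_square(observed, expected)
df = len(observed) - 1                    # number of categories - 1
reject_null = chi2 >= CRITICAL_0_05[df]   # decision rule at p = 0.05

print(round(chi2, 2), df, reject_null)
```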
 Reject the null hypothesis: the χ² value is far greater than the
critical value of 9.49 (df = 4, p = 0.05).
 The periwinkles are not evenly distributed across the
distances from the low tide line.
 If the null hypothesis were true, a variance between observed and
expected results this large would occur from random chance
alone less than 5% of the time.