A Statistical Analysis of Seedlings Planted in the Encampment Forest Association
By: Tony Nixon
Project Overview
• The Encampment Forest Association
• The Nature Conservancy
• White Pine and White Cedar Restoration
• Silvicultural Management
• GIS Database
General Information of Database
• 3,321 total records in database pertaining to white pine, white cedar, and
yellow birch seedlings.
• 1,987 useable records for statistical analysis that contain plant year
information about white pine and white cedar with no yellow birch
seedlings.
• Important database fields (“variables”) for statistical measure include
year planted, species, height, vigor (measure of health), canopy cover,
canopy species, and date updated.
• Plant year from 2004 to present.
Various Graphs Representing Overall Data
[Figure: Seedlings Planted by Year. Time plot representing the number of seedlings planted by year, 2004 to 2013 (x-axis: Year; y-axis: Seedlings Planted).]
[Figure: Species. Pie chart showing the abundance of each species planted: White Pine 75%, White Cedar 19%, Yellow Birch 6%.]
Determining Samples for Analysis
• It is assumed that seedlings are the same height and vigor when planted.
• All records were updated in 2012 – 2013 for measuring changes in height,
vigor, canopy cover, and competition.
• Samples will be created using a method of stratified random sampling and
multistage sampling.
• Seedlings must all be planted in the same year.
• Seedlings must be of the same species.
• Seedlings must be located within the same sample site. This reduces
variance due to different environmental conditions.
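The grouping rules above can be sketched in code. This is a minimal illustration of the stratification step only (not the multistage step), assuming hypothetical field names `plant_year`, `species`, and `site` and made-up records; it is not the query actually run against the GIS database.

```python
import random
from collections import defaultdict

def stratified_sample(records, per_stratum, seed=0):
    """Group records by (plant_year, species, site) and draw a simple
    random sample of up to `per_stratum` records from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        key = (rec["plant_year"], rec["species"], rec["site"])
        strata[key].append(rec)
    return {
        key: rng.sample(group, min(per_stratum, len(group)))
        for key, group in strata.items()
    }

# Hypothetical records; field names and values are illustrative only.
records = [
    {"plant_year": 2008, "species": "white pine", "site": "A", "height": 30},
    {"plant_year": 2008, "species": "white pine", "site": "A", "height": 34},
    {"plant_year": 2008, "species": "white pine", "site": "A", "height": 28},
    {"plant_year": 2010, "species": "white pine", "site": "B", "height": 7},
    {"plant_year": 2010, "species": "white cedar", "site": "B", "height": 9},
]
samples = stratified_sample(records, per_stratum=2)
for key, group in sorted(samples.items()):
    print(key, len(group))
```

Every sample returned this way shares one plant year, one species, and one site, which is the condition the bullets above require.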
Correlation Test
• A correlation test was performed on white pine seedlings planted in 2004.
• The response variable is seedling height and the explanatory variable is canopy cover.
• Results show a correlation value of r = -0.524.
• This indicates a moderate negative correlation between height and canopy cover. In other
words, seedlings under a higher percentage of canopy cover tend to grow more slowly.
The pattern is clearest at a canopy cover of 0 through 15, where the tallest seedlings
are found.
[Figure: Height vs. Canopy Cover. Scatterplot of Height (inches) against Canopy Cover, with fitted line y = -0.3157x + 38.181 and R² = 0.2746.]
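For reference, the correlation value can be computed as sketched below. The data pairs are hypothetical and serve only to show the calculation; the final lines check that the reported r = -0.524 agrees with the reported R² = 0.2746.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (canopy cover, height in inches) pairs, for illustration only.
canopy = [0, 10, 25, 40, 60, 80]
height = [45, 50, 30, 28, 22, 15]
r = pearson_r(canopy, height)
print("r =", round(r, 3), " r squared =", round(r * r, 3))

# Consistency check on the reported values: for a simple linear fit,
# R squared equals the square of r.
reported_r = -0.524
print(round(reported_r ** 2, 4))  # 0.2746
```

An R² of 0.2746 means canopy cover accounts for roughly 27% of the variation in height in this sample, so other factors explain most of the spread.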
Variable Inference about Population
• A growth rate (inches/year) is calculated from the height of seedling based on year
planted and year the data was updated.
• All seedlings are white pine and were planted in 2008. They reside in the same
sample site.
• By testing this variable at this sample site, an inference can be made on the
population for these environmental conditions.
• The sample mean and standard deviation are calculated and shown below along
with the distribution. The sample appears relatively Normal, with
a couple of outliers that may skew the data slightly to the right.
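A minimal sketch of the growth rate calculation, assuming it is simply height divided by the number of years between planting and the last update (the function and parameter names are hypothetical). This matches the Forest Service figure used later: 54 inches at 8 years is 6.75 inches per year.

```python
def growth_rate(height_in, year_planted, year_updated):
    """Average growth in inches per year since planting.

    Assumes the height at planting is negligible, so the rate is just
    height divided by elapsed years.
    """
    years = year_updated - year_planted
    if years <= 0:
        raise ValueError("year_updated must be later than year_planted")
    return height_in / years

# 54 inches after 8 years, the reference figure quoted for white pine.
reference_rate = growth_rate(54, 2004, 2012)
print(reference_rate)  # 6.75
```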
One Sample t Test
• State: According to the National Forest Service, early white pine growth is
relatively slow and varies from year to year. On average, white pine seedlings are
54 inches tall at 8 years of age, which is a growth rate of about 6.75 inches per
year. Based on the data, do the current conditions of the sample site suggest a
relatively healthy habitat, or does the site require more management?
• Formulate: The null hypothesis is that the mean growth rate at this sample site is no
different from the average growth rate determined by the Forest Service (6.75 inches
per year). Use the one sample t test to assess this.
• Solve: The sample is relatively Normal with a sample size > 15, so the t test will
yield approximately accurate results. Calculate the probability of observing a sample
mean at least this far from 6.75 if the null hypothesis were true. The two-sided
P-value is 0.1724 (17.24%), which is not enough evidence to reject the null hypothesis.
• Conclude: The null hypothesis cannot be rejected. In other words, the data are
consistent with an average growth rate at the sample site. This suggests the growing
conditions are acceptable and do not call for further management.
One Sample t Test Calculations
Sample Size: 78
Degrees of Freedom: 77
Sample Mean: 7.112
Sample Standard Deviation: 2.321
Standard Error: 0.26280
Mean Based on Null Hypothesis: 6.75
One Sample t Statistic: 1.37747
Two sided P-value: 0.1724
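The calculations above can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library and approximates the P-value by numerically integrating the t density with Simpson's rule; this is an illustrative choice, not necessarily how the original figures were produced.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value via Simpson's rule on the t density over [0, |t|]."""
    b = abs(t)
    if b == 0.0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 2 * (0.5 - area)

# Summary statistics from the slide.
n, mean, sd, mu0 = 78, 7.112, 2.321, 6.75
se = sd / math.sqrt(n)        # standard error of the mean
t_stat = (mean - mu0) / se    # one sample t statistic
p_value = two_sided_p(t_stat, n - 1)
print(round(se, 5), round(t_stat, 5), round(p_value, 4))
```

The results agree with the table to rounding: a standard error near 0.2628, a t statistic near 1.377, and a two-sided P-value near 0.172.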
One Sample t Confidence Interval
• State: What is the mean growth rate of white pine seedlings under the
environmental conditions of the sample site?
• Formulate: Use the one sample t confidence interval to estimate the
population mean growth rate with 95% confidence.
• Solve: Calculate the confidence interval based on the sample mean and
standard deviation. Based on the calculations below, the confidence
interval is 7.112 +/- 0.523.
• Conclude: With 95% confidence, the mean growth rate under these
environmental conditions is between 6.589 and 7.635 inches per year
for similarly aged seedlings.
One Sample t Confidence Interval Calculations
Sample Size: 78
Degrees of Freedom: 77
Sample Mean: 7.112
Sample Standard Deviation: 2.321
Standard Error: 0.26280
Critical Value t* on 95%: 1.991
Confidence Interval (+/-): 0.52324
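As a quick check of the arithmetic, the interval is the sample mean plus or minus t* times the standard error. The sketch below takes t* = 1.991 directly from the table rather than computing it.

```python
import math

# Summary statistics and critical value as reported on the slide.
n, mean, sd = 78, 7.112, 2.321
t_star = 1.991

se = sd / math.sqrt(n)     # standard error of the mean
margin = t_star * se       # margin of error
lower, upper = mean - margin, mean + margin
print(round(margin, 5), round(lower, 3), round(upper, 3))
# 0.52324 6.589 7.635
```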
Two Sample t Test
• State: Using what is known about one population under a given set of environmental
conditions, a test can be done to make an inference about another population. A second
sample site, with white pines planted in 2010 at another location, is selected. Is there
a significant difference between the population growth rates?
• Formulate: The null hypothesis is that the two sample sites have the same mean growth
rate, meaning the two sets of environmental conditions have no real effect on this rate.
Use the two sample t test to assess this.
• Solve: The new sample site also has a relatively Normal distribution with a sample
size > 15, as shown below. Calculate the probability of observing a difference in sample
means at least this large if the two sites had the same growth rate. The two-sided
P-value is less than 0.0001, which is strong evidence against the null hypothesis.
• Conclude: The null hypothesis is rejected. In other words, the sample
sites have different conditions that lead to different growth rates.
Sample 1 Statistics (2008 Growth Rate)
Sample Size: 78
Degrees of Freedom: 77
Sample Mean: 7.112
Sample Standard Deviation: 2.321
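Sample 2 Statistics (2010 Growth Rate)
Sample Size: 49
Degrees of Freedom: 48
Sample Mean: 2.996 (implied by the reported mean difference of 4.116; the slide repeats the Sample 1 value of 7.112)
Sample Standard Deviation: 1.273 (implied by the reported t test denominator of 0.31959; the slide repeats the Sample 1 value of 2.321)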
Two Sample t Test Calculations
Degrees of Freedom: 48
Mean Difference X1 - X2: 4.116
t Test Denominator: 0.31959
Two Sample t Statistic: 12.87907
Two sided P-value: < 0.001
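The two sample t statistic can be sketched from summary statistics as below, using the conservative degrees of freedom (the smaller sample size minus one). The Sample 2 mean and standard deviation used here (2.996 and 1.273) are assumptions, back-calculated from the reported mean difference of 4.116 and the reported denominator of 0.31959.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two sample t statistic with conservative degrees of freedom."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # t test denominator
    t_stat = (mean1 - mean2) / se_diff
    df = min(n1, n2) - 1
    return se_diff, t_stat, df

# Sample 1 values are from the slides; Sample 2 mean and SD are assumed
# (back-calculated from the reported difference and denominator).
se_diff, t_stat, df = two_sample_t(7.112, 2.321, 78, 2.996, 1.273, 49)
print(round(se_diff, 4), round(t_stat, 2), df)
```

With these inputs the denominator comes out near 0.3196 and the t statistic near 12.88 on 48 degrees of freedom, in line with the table.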
Two Sample t Confidence Interval
• State: How much smaller is the growth rate of sample site two compared to
sample site one, with 95% confidence?
• Formulate: Compute a two sample t confidence interval using a 95%
critical value.
• Solve: Determine the critical t* value based on the degrees of freedom from the
smaller sample. Based on the calculations below, the confidence
interval is 4.116 +/- 0.643.
• Conclude: With 95% confidence, sample site one has a mean
growth rate between 3.473 and 4.759 inches per year greater than sample
site two.
Two Sample t Confidence Interval Calculations
Degrees of Freedom: 48
Mean Difference X1 - X2: 4.116
Critical Value t* on 95%: 2.011
Confidence Interval (+/-): 0.643
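The critical value t* is normally read from a table or software. The sketch below shows one way to recover it with only the standard library, by bisection on a numerically integrated t distribution, and then forms the interval from the reported mean difference and denominator. The numerical method is an illustrative choice, not the one used for the slides.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, confidence=0.95):
    """Find t* with P(-t* < T < t*) = confidence by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Mean difference and denominator as reported on the slides.
diff, se_diff, df = 4.116, 0.31959, 48
t_star = t_critical(df)
margin = t_star * se_diff
print(round(t_star, 3), round(margin, 3))
print(round(diff - margin, 3), round(diff + margin, 3))
```

Using the degrees of freedom of the smaller sample gives a slightly larger t* than a software-computed value would, so the resulting interval is conservative.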
Limitations and Bias
• There will always be errors and a level of bias with statistical inferences and
human related tests.
• Human error in data collection when measuring heights and estimating
canopy cover.
• Data entry errors when recording measurements.
• Bias from selecting sample sites without a true method of random
sampling.
• Bias from choosing only protected seedlings, which does not account for other
damaging agents such as animal browsing.
Overall Conclusions
• Time Plot and Pie Chart: Based on the plots, the
Encampment Forest Association has recently taken an interest in restoring its white
pine populations.
• Correlation Inference: There is a negative correlation between seedling height
and canopy cover; therefore, a management practice of releasing white
pine stands may help increase the growth rate of seedlings.
• One Sample Inferences: Since the sample site tested shows no evidence of
differing from the average growth rate, the environmental conditions at this site can be
noted, and only minimal management practices are needed to maintain
these growing conditions. Also, when determining future plantation sites, these
environmental conditions should be sought.
• Two Sample Inferences: By comparing populations under different environmental
conditions, the optimal growing locations can be found, and time can be saved
by focusing management on sites with significantly lower growth rates.
Personal Reflection and Final Thoughts
• Focused on material related to my degree. Used ArcGIS software and its tools
throughout the project to aid in statistical analysis.
• Learned the functionality of the software's statistical components and
how they enhance the overall project.
• Used queries to build different samples and look for trends to test.
• Saw the power of statistics for work related projects.
• Statistics can improve the efficiency of work and help determine the materials
and management required.
• Will use inferential statistics in my capstone for predicting population
variables.
Questions??