A Statistical Analysis of Seedlings Planted in the Encampment Forest Association
By: Tony Nixon

Project Overview
• The Encampment Forest Association
• The Nature Conservancy
• White Pine and White Cedar Restoration
• Silvicultural Management
• GIS Database

General Information of Database
• 3,321 total records in the database pertaining to white pine, white cedar, and yellow birch seedlings.
• 1,987 usable records for statistical analysis that contain plant year information about white pine and white cedar, with no yellow birch seedlings.
• Important database fields ("variables") for statistical measures include year planted, species, height, vigor (a measure of health), canopy cover, canopy species, and date updated.
• Plant years range from 2004 to present.

Various Graphs Representing Overall Data
[Figure: "Seedlings Planted by Year". Time plot representing the number of seedlings planted by year, 2004 to 2013.]
[Figure: "Species". Pie chart showing the abundance of each species planted: White Pine 75%, White Cedar 19%, Yellow Birch 6%.]

Determining Samples for Analysis
• It is assumed that seedlings are the same height and vigor when planted.
• All records were updated in 2012 – 2013 for measuring changes in height, vigor, canopy cover, and competition.
• Samples are created using stratified random sampling and multistage sampling.
• Seedlings must all be planted in the same year.
• Seedlings must be of the same species.
• Seedlings must be located within the same sample site. This reduces variance due to different environmental conditions.

Correlation Test
• A correlation test was performed on white pine seedlings planted in 2004.
• The response variable is seedling height and the explanatory variable is canopy cover.
• Results show a correlation value of r = -0.524.
• This indicates a negative correlation between height and canopy cover.
In other words, seedlings under a higher percentage of canopy cover tend to be shorter. The effect is clearest at a canopy cover of 0 through 15, where the tallest seedlings are found.

[Figure: "Height vs. Canopy Cover". Scatterplot of height (inches) against canopy cover, with fitted line y = -0.3157x + 38.181 and R² = 0.2746.]

Variable Inference about Population
• A growth rate (inches/year) is calculated from the seedling height, the year planted, and the year the data was updated.
• All seedlings are white pine planted in 2008, and they are located in the same sample site.
• By testing this variable at this sample site, an inference can be made about the population growing under these environmental conditions.
• The sample mean and standard deviation are calculated and shown below along with the distribution. The sample is approximately Normal, with a couple of outliers that may skew the data slightly to the right.

One Sample t Test
• State: According to the National Forest Service, early white pine growth is relatively slow and varies from year to year. On average, white pine seedlings are 54 inches tall at 8 years of age, which is a growth rate of about 6.75 inches per year. Based on the data, do the current conditions of the sample site suggest a relatively healthy habitat, or does the site require more management?
• Formulate: The null hypothesis is that the mean growth rate at this sample site is no different from the average growth rate determined by the Forest Service (6.75 inches per year). Use the one sample t test to assess this.
• Solve: The sample is relatively Normal with a sample size > 15, so the t test will yield approximately accurate results. Calculate the two sided P-value: the probability, if the null hypothesis were true, of observing a sample mean at least as far from 6.75 as the one obtained. The P-value is 0.1724 (17.24%), which is not enough evidence to reject the null hypothesis.
• Conclude: The null hypothesis cannot be rejected.
In other words, the growth rate at the sample site is consistent with the average. The growing conditions therefore appear acceptable, and the data do not indicate a need for further management.

One Sample t Test Calculations
Sample Size: 78
Degrees of Freedom: 77
Sample Mean: 7.112
Sample Standard Deviation: 2.321
Standard Error: 0.26280
Mean Based on Null Hypothesis: 6.75
One Sample t Statistic: 1.37747
Two sided P-value: 0.1724

One Sample t Confidence Interval
• State: What is the growth rate of white pine seedlings under the environmental conditions of the sample site?
• Formulate: Use the one sample t confidence interval to estimate the population mean growth rate with 95% confidence.
• Solve: Calculate the confidence interval from the sample mean and standard deviation. Based on the calculations below, the confidence interval is 7.112 +/- 0.523.
• Conclude: With 95% confidence, the mean growth rate under these environmental conditions is between 6.589 and 7.635 inches per year for similarly aged seedlings.

One Sample t Confidence Interval Calculations
Sample Size: 78
Degrees of Freedom: 77
Sample Mean: 7.112
Sample Standard Deviation: 2.321
Standard Error: 0.26280
Critical Value t* on 95%: 1.991
Confidence Interval (+/-): 0.52324

Two Sample t Test
• State: Given what is known about one population of environmental conditions, a test can be done to compare it with another population. Another sample site is selected, with white pines planted in 2010 at another location. Is there a significant difference between the population growth rates?
• Formulate: The null hypothesis is that the two sample sites have the same mean growth rate, that is, the two sets of environmental conditions have no real effect on this rate. Use the two sample t test to assess this.
• Solve: The new sample site also has a relatively Normal distribution with a sample size > 15, as shown below.
Calculate the two sided P-value: the probability, if the two sites had the same mean growth rate, of observing a difference in sample means at least as large as the one obtained. The P-value is less than 0.0001, which is strong evidence against the null hypothesis.
• Conclude: The null hypothesis is rejected. In other words, the sample sites have different mean growth rates, which suggests that their conditions differ.

Sample 1 Statistics (2008 Growth Rate)
Sample Size: 78
Degrees of Freedom: 77
Sample Mean: 7.112
Sample Standard Deviation: 2.321

Sample 2 Statistics (2010 Growth Rate)
Sample Size: 49
Degrees of Freedom: 48
Sample Mean: 7.112
Sample Standard Deviation: 2.321
(The mean and standard deviation listed here repeat the Sample 1 values. The reported mean difference of 4.116 implies a Sample 2 mean of about 2.996.)

Two Sample t Test Calculations
Degrees of Freedom: 48
Mean Difference X1 - X2: 4.116
t Test Denominator: 0.31959
Two Sample t Statistic: 12.87907
Two sided P-value: < 0.001

Two Sample t Confidence Interval
• State: How much smaller is the growth rate at sample site two than at sample site one, with 95% confidence?
• Formulate: Compute a two sample t confidence interval using the 95% critical value.
• Solve: Determine the critical t* value based on the degrees of freedom from the smaller sample. Based on the calculations below, the confidence interval is 4.116 +/- 0.643.
• Conclude: With 95% confidence, the mean growth rate at sample site one is between 3.473 and 4.759 inches per year greater than at sample site two.

Two Sample t Confidence Interval Calculations
Degrees of Freedom: 48
Mean Difference X1 - X2: 4.116
Critical Value t* on 95%: 2.011
Confidence Interval (+/-): 0.643

Limitations and Bias
• There will always be errors and a level of bias with statistical inferences and human related tests.
• Human error in data collection when measuring heights and estimating canopy cover.
• Data recording errors during entry.
• Bias from selecting sample sites without a true method of random sampling.
• Bias from choosing only protected seedlings, which does not account for other damaging agents such as animal browsing.
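The t calculations reported in the tables above can be re-derived from the summary statistics alone. The following is a minimal sketch using only the Python standard library, not the method used in the original project. The function names (t_pdf, two_sided_p) are hypothetical, and the P-value is approximated by numerically integrating the t density with Simpson's rule, which is an assumption made here to avoid external packages. The two sample part uses the reported mean difference and t test denominator directly rather than the listed Sample 2 statistics.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Approximate two sided P-value: 1 - 2 * P(0 < T < |t|), via Simpson's rule."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    return max(0.0, 1.0 - 2.0 * central)

# One sample t test and confidence interval (values from the slides)
n, mean, sd, mu0 = 78, 7.112, 2.321, 6.75
t_star_one = 1.991                      # reported 95% critical value, df = 77
se = sd / math.sqrt(n)                  # standard error of the mean
t_one = (mean - mu0) / se               # one sample t statistic
p_one = two_sided_p(t_one, n - 1)
margin_one = t_star_one * se
ci_one = (mean - margin_one, mean + margin_one)

# Two sample t test and confidence interval (reported difference and denominator)
diff, se_diff, df_two = 4.116, 0.31959, 48
t_star_two = 2.011                      # reported 95% critical value, df = 48
t_two = diff / se_diff
p_two = two_sided_p(t_two, df_two)
margin_two = t_star_two * se_diff
ci_two = (diff - margin_two, diff + margin_two)

print(f"one sample: SE={se:.5f} t={t_one:.5f} P={p_one:.4f} CI={ci_one}")
print(f"two sample: t={t_two:.3f} P={p_two:.2e} CI={ci_two}")
```

The recomputed values should agree with the tables to rounding. The two sample t statistic differs slightly in the later decimals because the denominator used here is already rounded to five places.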
Overall Conclusions
• Time Plot and Pie Chart: Based on the plots, the Encampment Forest Association has recently taken an interest in restoring its white pine populations.
• Correlation Inference: There is a negative correlation between seedling height and canopy cover; therefore, a management practice of releasing white pine stands may help increase the growth rate of seedlings.
• One Sample Inferences: Since the sample site tested shows growth rates consistent with the average, the environmental conditions at this site can be noted, and only minimal management is needed to maintain good growing conditions. When choosing future plantation sites, these environmental conditions should be sought.
• Two Sample Inferences: By comparing different populations of environmental conditions, the best growing locations can be found, and time can be saved by focusing management on sites with significantly lower growth rates.

Personal Reflection and Final Thoughts
• Focused on material related to my degree. Used ArcGIS software and its tools throughout the project to aid in statistical analysis.
• Learned the functionality of the software's statistical components and how they enhance the overall project.
• Used querying to select different samples and look for trends to test.
• The power of statistics for work related projects.
• Using stats to improve the efficiency of work and to determine the materials and management required.
• Will use inferential statistics in my capstone for predicting population variables.

Questions?
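Supplementary note on the correlation inference: the reported correlation and the fitted line from the Height vs. Canopy Cover plot can be checked against each other. This is a small sketch under the assumption that the line is an ordinary least squares fit with a single explanatory variable, in which case r is the square root of R² carrying the sign of the slope. The function name predicted_height is hypothetical.

```python
import math

# Values reported on the Height vs. Canopy Cover plot
slope = -0.3157       # inches of height per unit of canopy cover
intercept = 38.181    # predicted height (inches) at zero canopy cover
r_squared = 0.2746

# In simple linear regression, r has the same sign as the slope
r = math.copysign(math.sqrt(r_squared), slope)

def predicted_height(canopy_cover):
    """Height (inches) predicted by the fitted line for a given canopy cover."""
    return intercept + slope * canopy_cover

print(f"r = {r:.3f}")
for cover in (0, 15, 50, 100):
    print(f"canopy cover {cover:>3}: predicted height {predicted_height(cover):.1f} in")
```

An R² of 0.2746 means canopy cover accounts for roughly 27% of the variation in height in this sample, so the fitted line describes a tendency rather than a tight prediction for any single seedling.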