Presented at the Information Management Workshop for Forest Dynamic Plot Database 2009
Nantou, TAIWAN, June 15th, 2009

The CTFS database workshop II
Smithsonian Tropical Research Institute (STRI), Panamá
September 29 to October 6, 2008

Participating nations:
• Brasil - 1 plot
• Canada - 1 plot
• Colombia - 2 plots
• Ecuador - 1 plot
• DR of Congo - 1 plot
• US North America Temperate:
  ◦ Wisconsin - 1 plot
  ◦ Maryland - 1 plot
  ◦ Hawaii - 2 plots
  ◦ Puerto Rico - 1 plot

• A 16 ha plot, 500 m N-S and 320 m E-W
• 16 x 25 grid: 400 quadrats of 20 x 20 m
• 16 sub-quadrats of 5 x 5 m per quadrat
• Plant census every 5 years; 4 censuses already
• Measurements include: location, size, point of measurement, mortality, damage
• LUQ CTFS plot data: location, dbh, mortality code, point of measurement

Directed at Neotropical and African plot sites
• Present and implement the newly designed database system
  ◦ Load data from each (completed) plot into the database system
  ◦ Work with data reports created by the database system
  ◦ Work with the database editor for minor changes to the data
  ◦ Demonstrate and examine the data entry program
  ◦ Make concrete plans for how the databases will be distributed: web applications; sharing level up to each scientist

• Store and manage:
  ◦ enormous amount of plot data
  ◦ store annual changes
  ◦ store versions: tracking the history of data; modifications and corrections
• Minimize errors:
  ◦ Tree measurement errors
  ◦ Mismatching errors in the database or in the field
• Can be customized to add site-specific data

• Use of R to check species, quadrats, codes, matching tags, etc.
• LUQ spent 2-3 days filtering data for the 3 censuses
• Major problem: tagging

Major problems:
1. X,Y coordinate definition: local to quadrat, not to plot
2. Dead stem vs dead tree: the CTFS database records dead trees, LUQ records dead stems
3. Repetitions of records having the same <tag, stemtag>

Solutions:
1. Reduced the X,Y coordinates in Paradox scripts
2. Series of queries to determine that all stems were dead
3. Extracted records for further inspection

• E.g., "Duplicate Tag; Tree is in quadrat=1013": a false duplication error
• E.g., "Another stem has larger dbh?": LUQ's main stem was not always the one with the biggest diameter

Are these real problems that will cause substantial error in the analysis of these data?
• Data gathering protocols?
• Conceptual design?
DO WE NEED TO STANDARDIZE AT THE METHODOLOGY AND DATA ENTRY LEVEL?

Using private online forms
• Step-by-step set of forms
• Convert species, codes, quadrat files into CTFS' database format

My assessment of the system
• Great quality control tool
• Easy to use, once your data is "bug" free
• PURPOSE OR CONCEPT BEHIND:
  ◦ Database designed for storage and management of data in a standard way,
  ◦ NOT FOR SHARING WITH OTHERS
• "Forces" standardization in some ways:
  ◦ Data design: codes and measurement definitions
  ◦ Data gathering protocols

Some Thoughts
• Importance of IM workshops:
  ◦ Mix IM and scientists
  ◦ Learning the system together
  ◦ Both groups discussing the scientists' needs and use of the system
• IM and scientists relationship:
  ◦ IM reason to exist: facilitate science
  ◦ Scientists need IM to manage large amounts of data
• When spreadsheets are just not enough
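
The three fixes listed under "Solutions" (coordinate reduction, all-stems-dead queries, and extraction of repeated <tag, stemtag> records) could be sketched roughly as follows. This is only an illustration in Python, not the Paradox scripts or R checks LUQ actually used. It assumes that "reducing" coordinates means converting plot-wide X,Y into quadrat-local X,Y on the 20 x 20 m grid, and the field names `status`, `px` and `py`, the function names, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict

QUADRAT_SIZE = 20.0  # metres; the plot is divided into 20 x 20 m quadrats


def to_quadrat_local(px, py, size=QUADRAT_SIZE):
    """Reduce plot-wide (px, py) to (quadrat column, quadrat row, local x, local y).

    Assumption: plot-wide coordinates start at 0 in one corner of the plot.
    """
    col, local_x = divmod(px, size)
    row, local_y = divmod(py, size)
    return int(col), int(row), local_x, local_y


def duplicate_keys(records):
    """Return every (tag, stemtag) pair that occurs in more than one record."""
    counts = Counter((r["tag"], r["stemtag"]) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


def dead_trees(records):
    """Return tags of trees whose stems are ALL dead.

    A tree counts as dead only when every one of its stem records is dead,
    which bridges stem-level mortality and tree-level mortality.
    """
    stem_is_dead = defaultdict(list)
    for r in records:
        stem_is_dead[r["tag"]].append(r["status"] == "dead")
    return sorted(tag for tag, flags in stem_is_dead.items() if all(flags))


# Made-up sample records, for illustration only (not real plot data).
records = [
    {"tag": "1001", "stemtag": "1", "status": "dead", "px": 45.5, "py": 130.0},
    {"tag": "1001", "stemtag": "2", "status": "alive", "px": 45.5, "py": 130.0},
    {"tag": "1002", "stemtag": "1", "status": "dead", "px": 12.0, "py": 7.5},
    {"tag": "1002", "stemtag": "2", "status": "dead", "px": 12.0, "py": 7.5},
    {"tag": "1003", "stemtag": "1", "status": "alive", "px": 300.0, "py": 480.0},
    {"tag": "1003", "stemtag": "1", "status": "alive", "px": 300.0, "py": 480.0},
]

if __name__ == "__main__":
    print(to_quadrat_local(45.5, 130.0))
    print(duplicate_keys(records))
    print(dead_trees(records))
```

Records returned by `duplicate_keys` would still need manual inspection, since some reported repetitions can turn out to be false duplication errors.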