Increasing Relational Capabilities Between PSMFC Databases
With Focus on BPA-Sponsored PSMFC Projects
September 17, 2014
205 Spokane Street, Suite 100, Portland, OR 97202

Meeting materials (agenda and presentations) are available on the PNAMP website: http://www.pnamp.org/event/4817

Morning Session

9:00 AM Welcome, Introductions
Chris Wheaton & Jen Bayer

Attendees: Chris Wheaton, Jen Bayer, Bob Ryznar, Bill Kinney, John Tenney, Van Hare, Mike Banach, George Nandor, Steve Pastor, Brett Holycross, Greg Wilke, Jim Longwill, Dan Webb, Nicole Tancreto, Rebecca Scully, Katie Pierson, Craig White, Marianne McClure, Brandon Chockley, Tom Pansky
Phone: Bill Bosch, Cedric Cooney, Jason Vogel, Phil Roger, Rich Carmichael, Tiffani Marsh, Tom Iverson, Stacy Springer, Dan Rawding, Russell Scranton, Micki Varney, Henry Franzoni

9:15 AM Data Sharing in the AKFIN Program (Alaska Fisheries Information Network)
Bob Ryznar
PowerPoint Presentation

Large commercial fisheries database, funded by NOAA Fisheries (www.akfin.org)
Primary purpose is to provide complex data sets to fisheries analysts and economists to support the Council’s decision-making process.
Initiated in 1999; developed comprehensive datasets in 2009; embedded data manager Michael Fey in the Council offices (helped move things along and satisfy their primary customer)
Standardization of technology between AKFIN and PacFIN began in 2012; promotes increased efficiency and provides cost savings
[AKFIN and PacFIN are both NOAA-funded fisheries information projects at PSMFC. They support the collection, processing, analysis, and reporting of fisheries statistics for fisheries off the coasts of Washington, Oregon, California, Alaska, and British Columbia.]
Receive wholesale data loads (providers don’t standardize or convert their data prior to submission; AKFIN just takes whatever is available and works with it from there in-house)
  o This is possible because the total poundage caught is what is represented in the AKFIN database and it does not require further extrapolation.
Format of the Business Intelligence Dashboard (and specific outputs offered) was driven by user needs and wants
Developers are key to the program in terms of saving individual entities’ time; the bulk of what AKFIN does is ad hoc data requests, which are now handled through AKFIN (Developers Toolbox is available to write queries, pull data, and deliver smaller data sets as needed)
Successes include describing and defining the regional data cycle and increasing transparency between decision makers and data providers; areas for improvement are in helping users understand where the data is coming from and getting greater participation from
Biggest success is having staff supported and relationships built with partners

Discussion:
In Alaska, data reported must be made up of 4 or more catch and processor records for privacy and confidentiality purposes. Federal standard is 3 or more records.
Program really started to take shape when it became the clear feed for data to the Council. Previously many established data pathways existed, requiring many one-to-one interactions to get data assembled for the Council’s use.
Fish species covered by AKFIN data include anything that’s commercially landed (salmon, herring, crab, halibut, pollock, etc.).
AKFIN receives observer data from Alaska Fisheries Science Center as well, which allows for total catch projections.
Council will support access to 10 years of data. Fish tickets are available going back to 1969 (data quality is the issue with older records).
E-Landings started with Crab Rationalization Program in 2003 and have since been expanded to other Alaska fisheries.
AKFIN does not integrate their datasets with those of RMIS, PTAGIS, etc. Users have not asked for this, which means that AKFIN is not dealing with the same set of users as the other data projects. The AKFIN product is total catch in Alaska by species; users would have to go to RMIS or PTAGIS for details to help with analysis of the catch.
Working on assisting the genetics lab in Juneau (they need catch information to do their analyses)
Tribal harvest is incorporated into PacFIN
Not everyone who works with the data knows that AKFIN is the database that provides them the data.
They are starting to edge into “analysis” with fleet and community profiles.
The AKFIN model is that they get the data from providers and convert it into a common form. The PacFIN model is that the data is collected in a rigid framework. Bob prefers the AKFIN model because it is flexible and can accommodate change better.
Data sharing in AKFIN is all unofficial; they use established guidelines.

10:00 AM Needs & opportunities for sharing & relating across databases
Group Discussion

What needs to be done to relate the various databases at PSMFC? What are the issues to be addressed? What are problems encountered?
Steve P- are we talking about truly relational databases or portals?
What opportunities currently exist? What could be added easily?
Rich C- need to define the questions and the need for relationships, and areas for improvement in the analysis
Marianne- cannot currently analyze survival rates from CWT and PIT data; would be nice to have a way to relate both to one another
Steve P- would be useful to be able to get a larger view of different brood year returns
Chris W- Should we have an ongoing process for sharing information and discussion of possible improvements?
Dan R- currently have to pull data out of RMIS and PTAGIS and work with it to get the summaries needed. Is there a possibility of getting those summaries directly from the databases?
John T- can create custom reports from PTAGIS; other entities have that function at the moment
Dan R- those entities don’t do everything that is needed; a lot of time is spent on queries that aren’t available within a database or between databases
Rich C- much of the analysis done requires additional information to qualify, modify, or expand it; it may not be easy to write standard queries for those types of analysis. Do we want the databases to actually do analysis? Or do we want high-quality, error-checked data available in the databases?
John T- envisions the databases as a place to aggregate easily accessible, high-quality, error-checked data; can do custom reports as needed
Steve P- there could be a new site (a portal) that would allow you to access all three databases at once and easily get what you need for your analysis
Rich C- need to make sure that what is pulled from the various databases is comparable and compatible
Van- it is more about identifying common linkages to help the user find data across the three databases for a given stream, watershed, dam, etc.; this wouldn’t actually combine the data, it would just help the user locate the data for their area of interest
Marianne- common data links should be available somewhere for the analysts to locate
Chris W- How can we make it easier for the user to find what they want? Should there be a roll-up of that information into some sort of summary report? Are there standardized roll-up reports for each database? Sees three possible areas of discussion:
  1. Integrate query systems to increase efficiency and simplify data searches?
  2. Should there be an integrated analysis and interpretation of all the data for a population, stock, etc.?
  3. Should we produce standard roll-up reports (either separately or together) for some key attributes that are most useful to people?
George N- RMIS has canned reports available online, and can put together reports as requested to be made available on their site
Dan R- are there standardized queries used by many groups that could be made available?
Jason V- would like to see a common PIT Tag Array query available
Tiffani M- sees PTAGIS as a repository because there is enough difference in how different groups make their calculations that it makes creating a standardized query difficult

10:45 AM Opportunities and issues related to possible integration and data sharing between RMIS and PTAGIS (and StreamNet) databases
John Tenney
PowerPoint Presentation
See also www.ptagis.org

Infrastructure upgrades completed in 2013 (database, reporting system, related processes)
2014-15 Program Objectives include refining and documenting upgraded systems, the 2015 PIT Tag Workshop (Jan 27-29), upgrading the P3 Tagging Software (P4), and evolving the PTAGIS MRR data model
Have limited support in PTAGIS for CWT (have a comment flag that indicates presence of CWT and a text field for tag code); not easy to query this information from PTAGIS; of the 4.1 million CWT records in PTAGIS, fewer than 50 records actually include the code
Currently not a lot of opportunity to link the PTAGIS and CWT datasets, but Council wants to see it happen
MRR Data Model can be expanded during the upgrade; created a forum for requested changes and proposed additions to the data model
Goal is to promote CWT and other Marks in the PTAGIS MRR Model by adding a specialty field in P4 to capture that information; fields would be optional but would be exclusive to the type of tag; new fields can be queried and/or can link out to other systems (RMIS, etc.)
Already have a public web API to get information on a particular tag code (also available with observation data, mark sites)

Discussion:
Marianne- Are the CWT fields blank because they don’t have access to the code information? If so, then they will always be blank.
Need to get people to understand why it’s helpful for them to add that information. From the perspective of a PTAGIS user, knowing that there is a CWT is useful, but knowing what the CWT code is may not matter as much
Steve P- this could be a good demonstration opportunity; could put a V detector at the raceways for CWT detection
John T- don’t want contributors to have to work through fields that don’t apply to them (can set it up to allow users to select the fields they want to see and use)
Dan R- PTAGIS has information on the tagged fish, time of capture, and location of capture; RMIS has number of CWT juveniles and number of released juveniles, so you can get abundance information
Chris W- group needs to make sure that the right people are at the table for making decisions on database field changes and updates
George- would not be difficult to add a PIT tag field to RMIS, but would need a way to link it to a table of PIT tags associated with each CWT release
Marianne- would be more appropriate to add CWT info to PTAGIS due to regional scope of use
Micki- real-time info provided by PTAGIS during return and recovery was useful in understanding what happened to the CWT fish in the DIT groups and MSFs
Rich- What is the objective for providing connectivity to a broad audience when the people responsible for analyzing the data already have that connectivity?
Dan R- should allow the CWT tag code to be entered into PTAGIS; what people do with it from there is up to them
John T- it’s incumbent upon the people using the data to give guidance and input as to what they need to do their analysis; additional tag type fields will be added to PTAGIS
Henry F- is double tagging of fish being discouraged due to mortality rates? Don’t hatchery release databases have all this PIT tag and CWT info?
Rich C- lots of hatchery programs are releasing fish with both PIT tags and CWT
Marianne- important to have this information in a place where people can access it over time, rather than it residing on an individual person’s desktop
Chris W- Let’s not be overcautious and prevent collecting of the actual tag data associated with a particular fish, if the database can be set up to collect the information without too much hassle

1:00 PM Linking PSMFC data systems through a common GIS: progress to date and future opportunities/goals
Van Hare
PowerPoint Presentation

GIS at PSMFC has been centralized to provide support to all Commission projects, provide continuity amongst the projects, reduce duplication of effort, and build communication
Goal 1- reference data to common map features; working on creating a common spatial framework among projects, common referencing systems, common identifiers, etc.
Goal 2- cross-reference map features to common areas of interest amongst the projects (StreamNet and PTAGIS do this currently; may not always be appropriate for all types of data within all systems)
Goal 3- coordinated data sharing for core map layers
Still need to figure out details and address concerns
Proposed focus would be publishing the map features with links into the different project data systems through common IDs
Can leverage PSMFC’s existing ArcGIS platform for data sharing
Follow best practices and work with PNAMP and others to meet end-user and funding source needs

Discussion:
Chris W- Would the end user eventually see an interactive map-based portal?
Van- This plan would support that by being transparent and documented, and could allow for that discussion to happen in the future. Initially it would support the concept of getting projects to start talking to one another and map things in the same way. People want to know what is happening in a particular location on a map.
Steve- Would be nice to have a common graphic interface (click a location and toggle boxes for the information you want displayed from the different systems for that location)
Marianne- Each agency is responsible for adding their own location codes, but who resolves inconsistencies?
Van- codes can be different, but they all need to point to the same spot on the map. Inconsistencies are dealt with as they come up, and conventions are in place and being developed as needed with the projects.
Tom P- appreciates the effort towards efficiency of mapping things once and sharing amongst projects. Build a “front door” for managers and users to access the data more easily
Henry F- how are management questions answered by tying together the databases spatially?
Russell- would like to be able to query by populations in a location for the three projects
Rich C- need to be more specific regarding the outputs and outcomes of the GIS component
Jason- the only database that’s worked across the region is PTAGIS, because they have dictated what goes in it and users follow their rules for use. It is difficult to do crosswalk work when the names of the facilities aren’t even agreed on.
Cedric- hatchery information may be contained in many locations, but it hasn’t been a targeted dataset for some time. Resources need to be in place to update datasets.
Van- this work is already being done at the Commission level for StreamNet, PTAGIS, and RMPC. Not looking to be driven by any management questions at this point.

1:45 PM Common PSMFC “data use agreement or policy”
Overview of current data sharing in PTAGIS, RMIS, StreamNet
Mike Banach
PowerPoint Presentation

What are the rules currently in place? Each project has its own approach…
StreamNet
  Main Database- No restrictions in use mentioned. Citation guide available.
  Data Store- data submitters can specify any restrictions they like
RMIS
  No restrictions in use mentioned. Citation guide available.
  Data availability and process dictated by international treaty
PTAGIS
  Formal “Data Use Policy” prescribed by Steering Committee and available on every page of website
Issues: PSMFC has no enforcement authority if a data use agreement is ignored, projects have different authorities under which they operate, RMIS can’t limit their data, not all data is from a BPA project

Discussion:
What would be the scope of a common approach? Is there a problem big enough to bother addressing? If there is no enforcement ability, then what, if anything, would change? What would a new agreement look like?
Russell- would like to see consistency amongst the databases
Chris W- all databases have User Agreements that work for them; StreamNet should look at those and mirror what would work for their needs; there is a requirement for all federally collected data to be placed in a secure, accessible repository. Data Store will be a secure repository, and StreamNet is working with partners to develop an agreement so that they will be comfortable placing their data in the Data Store
Mike- Coordinated Assessments also has a user agreement
Bill B- need to recognize that users may not fully understand the nuances of the data they are looking at
Russell- metadata records should accompany datasets to help the user understand what they are seeing
Chris W- StreamNet Data Store User Agreement started with the PTAGIS agreement and has been modified by committee. It appears that each of the programs has specific needs and history that require their own unique strategies

2:30 PM Potential for increasing relational capabilities: next steps
Chris Wheaton

Lesson learned: AKFIN created a system which efficiently provided the information the Council needed to do its job, and they did it better than other systems that had been in place
How can BPA-funded projects do this? GIS standardization component could be successful in supporting this
Integration of key functions among the projects?
Common GIS framework?
Making links easier for the user to go from one database to the other to find information on the same group or population of fish? Build that front door?
More differences than similarities among the data use agreements
Marianne- after an analytical method was established for RMIS data, the web page was updated to do it. Where there is an established cross-agency standardized methodology, that could be considered for addition to the other projects.
Jen- need to come back around to the specific needs and details brought up by the users
George- RMIS has evolved due to changing needs of its users
Chris- It was productive to talk together and learn what each of us is doing. Thanks to everyone for their presentations and the discussions.

Adjourn
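
Illustrative note: the "common IDs" idea from the GIS session (each project keeps its own location codes, but all codes point to the same map feature) could be sketched roughly as below. This is only a minimal sketch under stated assumptions. The location codes, feature IDs, and function names are hypothetical and invented for illustration; they do not come from PTAGIS, RMIS, StreamNet, or any PSMFC system.

```python
from collections import defaultdict

# Hypothetical crosswalk: (project, project-specific location code) -> common map feature ID.
# Every code and ID here is made up for illustration only.
CROSSWALK = {
    ("PTAGIS", "SITE_A"): "FEATURE_001",
    ("RMIS", "LOC-17"): "FEATURE_001",
    ("StreamNet", "STR_0042"): "FEATURE_001",
    ("PTAGIS", "SITE_B"): "FEATURE_002",
}


def build_index(crosswalk):
    """Invert the crosswalk: common feature ID -> {project: [local codes]}."""
    index = defaultdict(lambda: defaultdict(list))
    for (project, code), feature_id in crosswalk.items():
        index[feature_id][project].append(code)
    return index


def locate(index, feature_id):
    """Return, per project, the local codes a user would look up for one map feature."""
    # .get() avoids creating an empty entry for an unknown feature ID.
    return {project: sorted(codes) for project, codes in index.get(feature_id, {}).items()}


index = build_index(CROSSWALK)
print(locate(index, "FEATURE_001"))
```

Consistent with Van's comment, nothing here combines the underlying data; the lookup only tells a user which code to query in each project's own system for a given location.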