Increasing Relational Capabilities Between PSMFC Databases
With Focus on BPA-Sponsored PSMFC Projects
September 17, 2014
205 Spokane Street, Suite 100
Portland, OR 97202
Meeting materials (agenda and presentations) are available on the PNAMP website: http://www.pnamp.org/event/4817
Morning Session
9:00 AM
Welcome, Introductions
Chris Wheaton & Jen Bayer
Attendees: Chris Wheaton, Jen Bayer, Bob Ryznar, Bill Kinney, John Tenney, Van Hare, Mike Banach, George
Nandor, Steve Pastor, Brett Holycross, Greg Wilke, Jim Longwill, Dan Webb, Nicole Tancreto, Rebecca Scully,
Katie Pierson, Craig White, Marianne McClure, Brandon Chockley, Tom Pansky
Phone: Bill Bosch, Cedric Cooney, Jason Vogel, Phil Roger, Rich Carmichael, Tiffani Marsh, Tom Iverson, Stacy
Springer, Dan Rawding, Russell Scranton, Micki Varney, Henry Franzoni
9:15 AM
Data Sharing in the AKFIN Program
(Alaska Fisheries Information Network)
Bob Ryznar
PowerPoint Presentation
Large commercial fisheries database, funded by NOAA Fisheries (www.akfin.org)
Primary purpose is to provide complex data sets to fisheries analysts and economists to support the Council’s decision-making process.
Initiated in 1999, developed comprehensive datasets in 2009, embedded data manager Michael Fey in the Council
offices (helped move things along and satisfy their primary customer)
Standardization of technology between AKFIN and PacFIN began in 2012; it promotes increased efficiency and provides
cost savings. [AKFIN and PacFIN are both NOAA-funded fisheries information projects at PSMFC. They support the
collection, processing, analysis, and reporting of fisheries statistics for fisheries off the coasts of Washington, Oregon,
California, Alaska, and British Columbia.]



• Receive wholesale data loads (providers don’t standardize or convert their data prior to submission; AKFIN takes
whatever is available and works with it in-house)
  o This is possible because the AKFIN database represents the total poundage caught, which does not require
  further extrapolation.
• Format of the Business Intelligence Dashboard (and the specific outputs offered) was driven by user needs and wants
• Developers are key to the program in terms of saving individual entities’ time. The bulk of what AKFIN does is ad hoc
data requests, which are now handled through AKFIN (a Developers Toolbox is available to write queries, pull
data, and deliver smaller data sets as needed)
Successes include describing and defining the regional data cycle and increasing transparency between decision makers
and data providers; areas for improvement are in helping users understand where the data is coming from and getting
greater participation from
Biggest success is having staff supported and relationships built with partners
Discussion:
In Alaska, data reported must be made up of 4 or more catch and processor records for privacy & confidentiality
purposes. Federal standard is 3 or more records.
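The minimum-record rule above can be illustrated with a short sketch. The record layout, the field names (`species`, `area`, `pounds`), and the function `summarize_with_suppression` are hypothetical and are not AKFIN's actual implementation. The sketch also assumes that "records" can simply be counted per summary cell; it only shows the idea of withholding any total built from fewer than the minimum number of records (4 in Alaska, 3 under the federal standard).

```python
from collections import defaultdict

ALASKA_MIN_RECORDS = 4   # threshold noted in the discussion above
FEDERAL_MIN_RECORDS = 3  # federal standard noted in the discussion above


def summarize_with_suppression(records, min_records=ALASKA_MIN_RECORDS):
    """Total pounds per (species, area); cells with too few records become None."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for rec in records:
        key = (rec["species"], rec["area"])
        counts[key] += 1
        totals[key] += rec["pounds"]
    # Withhold the total when the cell is built from fewer than min_records records.
    return {
        key: (totals[key] if counts[key] >= min_records else None)
        for key in counts
    }
```

A cell that is withheld under the Alaska threshold may still be reportable under the federal one, which is why the threshold is a parameter rather than a constant in this sketch.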
Program really started to take shape and form when it became the clear feed for data to the Council. Previously many
established data pathways existed, requiring many 1-1 interactions to get data assembled for the Council’s use.
Fish species covered by AKFIN data include anything that’s commercially landed (salmon, herring, crab, halibut, pollock,
etc.). AKFIN also receives observer data from the Alaska Fisheries Science Center, which allows for total catch projections.
Council will support access to 10 years of data. Fish tickets are available going back to 1969 (data quality is the issue
with older records). E-Landings started with Crab Rationalization Program in 2003 and have since been expanded to
other Alaska fisheries.
AKFIN does not integrate their datasets with those of RMIS, PTAGIS, etc. (users have not asked for this, which means that
AKFIN is not dealing with the same set of users as the other data projects). The AKFIN product is total catch in Alaska by
species; users would have to go to RMIS or PTAGIS for details to help with analysis of the catch
Working on assisting the genetics lab in Juneau (they need catch information to do their analyses)
Tribal harvest is incorporated into PacFIN
Not everyone who works with the data knows that AKFIN is the database that provides them the data. They are starting
to edge into “analysis” with fleet and community profiles.
The AKFIN model is that they get the data from providers and convert it into a common form. The PacFIN model is that the data is
collected in a rigid framework. Bob prefers the AKFIN model because it is flexible and can accommodate change better.
Data sharing in AKFIN is all unofficial; they use established guidelines.
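The AKFIN approach described above (accept data as providers send it, then convert it in-house to a common form) could look roughly like the following sketch. The provider names, column names, and units are invented for illustration and do not reflect any real AKFIN or provider schema.

```python
KG_PER_LB = 0.45359237  # exact definition of the pound in kilograms

# Hypothetical per-provider mappings from native columns to the common form.
PROVIDER_MAPPINGS = {
    "provider_a": {"species": "SPECIES_NAME", "weight": "LBS", "unit": "lb"},
    "provider_b": {"species": "spp", "weight": "weight_kg", "unit": "kg"},
}


def to_common_form(provider, row):
    """Convert one provider-native row into the common record layout."""
    mapping = PROVIDER_MAPPINGS[provider]
    weight = float(row[mapping["weight"]])
    if mapping["unit"] == "kg":
        weight = weight / KG_PER_LB  # store every weight in pounds
    return {
        "provider": provider,
        "species": str(row[mapping["species"]]).strip().lower(),
        "pounds": weight,
    }
```

Under this kind of design, accommodating a new provider or a changed submission format means adding or editing one mapping entry rather than asking the provider to reformat, which is one way to read the flexibility Bob describes.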
10:00 AM
Needs & opportunities for sharing & relating
across databases
Group Discussion
What needs to be done to relate the various databases at PSMFC? What are the issues to be addressed? What are
problems encountered?
Steve P- are we talking about truly relational databases or portals? What opportunities currently exist? What could be
added easily?
Rich C- need to define the questions/ need for relationships and areas for improvement in the analysis
Marianne- cannot currently analyze survival rates from CWT and PIT data; would be nice to have a way to relate both to
one another
Steve P- would be useful to be able to get a larger view of different brood year returns
Chris W- Should we have an ongoing process for sharing information and discussion of possible improvements?
Dan R- currently have to pull data out of RMIS and PTAGIS and work with it to get the summaries needed. Is there a
possibility of getting those summaries directly from the databases?
John T- can create custom reports from PTAGIS; other entities have that function at the moment
Dan R- those entities don’t do everything that is needed; a lot of time is spent on queries that aren’t available
within a database or between databases
Rich C- much of the analysis done requires additional information to qualify it/ modify it/ expand it; may not be easy to
write standard queries for those types of analysis. Do we want the databases to actually do analysis? Or do we want
high-quality, error-checked data available in the databases?
John T- envisions the databases as a place to aggregate easily accessible high-quality, error-checked data; can do custom
reports as needed
Steve P- there could be a new site that would allow you to access all three databases at once (a portal) to easily access
what you need for your analysis
Rich C- Need to make sure that what is pulled from the various databases is comparable and compatible
Van- more about identifying common linkages to help the user find data across the three databases for a given stream,
watershed, dam, etc.- wouldn’t be actually combining the data, would just help the user to locate the data for the area
of their interest.
Marianne- common data links should be available somewhere for the analysts to locate
Chris W- How can we make it easier for the user to find what they want? Should there be a roll-up of that information
into some sort of summary report? Are there standardized roll-up reports for each database? Sees three possible areas
of discussion:
1. Integrate query systems to increase efficiency and simplify data searches?
2. Should there be an integrated analysis and interpretation of all the data for a population, stock, etc.?
3. Should we produce standard roll-up reports (either separately or together) for some key attributes that are
most useful to people?
George N- RMIS has canned reports available online, and can put together reports as requested to be made available on
their site
Dan R- are there standardized queries used by many groups that could be made available?
Jason V- would like to see a PIT Tag Array common query available
Tiffani M- sees PTAGIS as a repository because there is enough difference in how different groups make their
calculations that it makes creating a standardized query difficult
10:45 AM
Opportunities and issues related to possible integration
and data sharing between RMIS and PTAGIS
(and StreamNet) databases
John Tenney
PowerPoint Presentation
See also www.ptagis.org
Infrastructure upgrades completed in 2013 (database, reporting system, related processes)
2014-15 Program Objectives include refining and documenting upgraded systems, the 2015 PIT Tag Workshop (Jan 27-29),
upgrading the P3 tagging software to P4, and evolving the PTAGIS MRR data model
Have limited support in PTAGIS for CWT (have a comment flag that indicates presence of CWT and a text field for the tag
code); not easy to query this information from PTAGIS; of the 4.1 million CWT records in PTAGIS, fewer than 50 records
actually include the code
Currently not a lot of opportunity to link the PTAGIS and CWT datasets, but Council wants to see it happen
MRR Data Model can be expanded during the upgrade; created a forum for requested changes and proposed additions
to the data model
Goal is to promote CWT and other marks in the PTAGIS MRR Model by adding a specialty field in P4 to capture that
information; fields would be optional but would be exclusive to the type of tag; new fields can be queried and/or can
link out to other systems (RMIS, etc.)
Already have a public web API to get information on a particular tag code (also available with observation data, mark
sites)
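The proposal above for optional fields that are exclusive to a tag type can be sketched as a small record type. This is only an illustration: the class `MarkRecord` and the field names `pit_tag_code`, `other_mark_type`, and `cwt_code` are hypothetical and are not the actual PTAGIS MRR data model or P4 field names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarkRecord:
    """Hypothetical record with an optional field that is exclusive to one tag type."""
    pit_tag_code: str
    other_mark_type: Optional[str] = None  # e.g. "CWT"; None when there is no other mark
    cwt_code: Optional[str] = None         # optional; only valid when other_mark_type == "CWT"

    def __post_init__(self):
        # The CWT code may be left blank, but it may not be set for a non-CWT mark.
        if self.cwt_code is not None and self.other_mark_type != "CWT":
            raise ValueError("cwt_code is only valid when other_mark_type is 'CWT'")
```

Because the code field stays optional, contributors who only know that a CWT is present can still record that fact, while those who know the code can enter it so that it can later be queried or linked out to another system.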
Discussion:
Marianne- Are the CWT fields blank because they don’t have access to the code information? If so, then they will always
be blank. Need to get people to understand why it’s helpful for them to add that information.
From the perspective of a PTAGIS user, knowing that there is a CWT is useful but knowing what the CWT code is may not
matter as much
Steve P- this could be a good demonstration opportunity, could put a V detector at the raceways for CWT detection
John T- don’t want contributors to have to work through fields that don’t apply to them (can set it up to allow users to
select the fields they want to see/ use)
Dan R- PTAGIS has information on the tagged fish, time of capture and location of capture; RMIS has number of CWT
juveniles and number of released juveniles so you can get abundance information
Chris W- group needs to make sure that the right people are at the table for making decisions on database field
changes/ updates
George- would not be difficult to add a PIT tag field to RMIS; but would need a way to link it to a table of PIT tags
associated with each CWT release.
Marianne- would be more appropriate to add CWT info to PTAGIS due to regional scope of use
Micki- real time info provided by PTAGIS during return/ recovery was useful in understanding what happened to the
CWT fish in the DIT groups and MSFs
Rich- What is the objective for providing connectivity to a broad audience when the people responsible for analyzing the
data already have that connectivity?
Dan R- should allow the CWT tag code to be entered into PTAGIS; what people do with it from there is up to them
John T- it’s incumbent upon the people using the data to give guidance and input as to what they need to do their
analysis; additional tag type fields will be added to PTAGIS
Henry F- is double tagging of fish being discouraged due to mortality rates? Don’t hatchery release databases have all
this PIT tag and CWT info?
Rich C- lots of hatchery programs are releasing fish with both PIT tags and CWT
Marianne- important to have this information in a place where people can access it over time, rather than it residing on
an individual person’s desktop
Chris W- Let’s not be overcautious and prevent collecting of the actual tag data associated with a particular fish, if the
database can be set up to collect the information without too much hassle
1:00 PM
Linking PSMFC data systems through a common GIS:
progress to date and future opportunities/goals
Van Hare
PowerPoint Presentation
GIS at PSMFC has been centralized to provide support to all Commission projects and provide continuity amongst all the
projects, reduce duplication of effort, and build communication
Goal 1- reference data to common map features; working on creating common spatial framework among projects,
common referencing systems, common identifiers, etc.
Goal 2- cross reference map features to common areas of interest amongst the projects (StreamNet and PTAGIS do this
currently; may not always be appropriate for all types of data within all systems)
Goal 3- coordinated data sharing for core map layers
• Still need to figure out details and address concerns
• Proposed focus would be publishing the map features with links into the different project data systems through
common IDs
• Can leverage PSMFC’s existing ArcGIS platform for data sharing
• Follow best practices and work with PNAMP and others to meet end-user and funding source needs
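The idea of linking map features into the different project data systems through common IDs can be sketched as a simple crosswalk table. All project codes and feature IDs below are made up for illustration; they are not real StreamNet, PTAGIS, or RMIS location codes, and the function names are hypothetical.

```python
# Hypothetical crosswalk: (project, project-specific location code) -> common feature ID.
CROSSWALK = {
    ("StreamNet", "SN-0001"): "FEATURE-42",
    ("PTAGIS", "PT-0001"): "FEATURE-42",
    ("RMIS", "RM-0001"): "FEATURE-42",
    ("PTAGIS", "PT-0002"): "FEATURE-77",
}


def codes_for_feature(feature_id):
    """Return each project's own code(s) for one shared map feature."""
    result = {}
    for (project, code), fid in CROSSWALK.items():
        if fid == feature_id:
            result.setdefault(project, []).append(code)
    return result


def same_location(project_a, code_a, project_b, code_b):
    """True when two project-specific codes point to the same map feature."""
    fid_a = CROSSWALK.get((project_a, code_a))
    fid_b = CROSSWALK.get((project_b, code_b))
    return fid_a is not None and fid_a == fid_b
```

In this sketch each project keeps its own codes; only the mapping to the shared feature ID has to be maintained centrally, so the data themselves are not combined.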
Discussion:
Chris W- Would the end user eventually see an interactive map-based portal?
Van- This plan would support that by being transparent and documented, and could allow for that discussion to happen
in the future. Initially it would support the concept of getting projects to start talking to one another and map things in
the same way. People want to know what is happening in a particular location on a map.
Steve- Would be nice to have a common graphic interface (click a location and toggle boxes for the information you
want displayed from the different systems for that location)
Marianne- Each agency is responsible for adding their own location codes, but who resolves inconsistencies?
Van- codes can be different, but they all need to point to the same spot on the map. Inconsistencies are dealt with as
they come up, and conventions are in place and being developed as needed with the projects.
Tom P- appreciates the effort towards efficiency of mapping things once and sharing amongst projects. Build a “front
door” for managers and users to access the data more easily
Henry F- how are management questions answered by tying together the databases spatially?
Russell- would like to be able to query by populations in a location for the three projects
Rich C- need to be more specific regarding the outputs and outcomes of the GIS component
Jason- the only database that’s worked across the region is PTAGIS because they have dictated what goes in it and
contributors follow their rules for use. It is difficult to do crosswalk work when the names of the facilities aren’t even agreed on.
Cedric- hatchery information may be contained in many locations, but it hasn’t been a targeted dataset for some time.
Resources need to be in place to update datasets.
Van- this work is already being done at the Commission level for StreamNet, PTAGIS, and RMPC. Not looking to be
driven by any management questions at this point.
1:45 PM
Common PSMFC “data use agreement or policy”
Overview of current data sharing in PTAGIS, RMIS,
StreamNet
Mike Banach
PowerPoint Presentation
What are the rules currently in place? Each project has its own approach…
StreamNet
• Main Database- No restrictions in use mentioned. Citation guide available.
• Data Store- data submitters can specify any restrictions they like
RMIS
• No restrictions in use mentioned. Citation guide available.
• Data availability and process dictated by international treaty
PTAGIS
• Formal “Data Use Policy” prescribed by Steering Committee and available on every page of website
Issues: PSMFC has no enforcement authority if a data use agreement is ignored, projects have different authorities
under which they operate, RMIS can’t limit their data, not all data is from a BPA project
Discussion:
What would be the scope of a common approach? Is there a problem big enough to bother addressing? If there is no
enforcement ability, then what if anything, would change? What would a new agreement look like?
Russell- would like to see consistency amongst the databases
Chris W- all databases have User Agreements that work for them; StreamNet should look at those and mirror what
would work for their needs; there is a requirement for all federally collected data to be placed in a secure, accessible repository.
The Data Store will be a secure repository, and StreamNet is working with partners to develop an agreement so that they will
be comfortable placing their data in the Data Store
Mike- Coordinated Assessments also has a user agreement
Bill B- need to recognize that users may not fully understand the nuances of the data they are looking at
Russell- metadata records should accompany datasets to help the user understand what they are seeing
Chris W- StreamNet Data Store User Agreement started with the PTAGIS agreement and has been modified by
committee. It appears that each of the programs has specific needs and history that require their own unique strategies
2:30 PM
Potential for increasing relational capabilities: next steps
Chris Wheaton
Lesson learned: AKFIN created a system which efficiently provided the information the Council needed to do its job, and
they did it better than other systems that had been in place
How can BPA-funded projects do this? GIS standardization component could be successful in supporting this
Integration of key functions among the projects? Common GIS framework? Making links easier for the user to go from
one database to the other to find information on the same group or population of fish? Build that front door?
More differences than similarities among the data use agreements
Marianne- after an analytical method was established for RMIS data, the web page was updated to do it. Where
there is an established cross-agency standardized methodology, it could be considered for addition to the other
projects.
Jen- need to come back around to the specific needs/ details brought up by the users
George- RMIS has evolved due to changing needs of its users
Chris- It was productive to talk together and learn what each of us is doing. Thanks to everyone for their presentations
and the discussions.
Adjourn