BlueBRIDGE – 675680
www.bluebridge-vres.eu

Annex A: Proposal Template
Green highlighted areas to be completed by applicant

BlueBRIDGE Competitive Call – Data management services for SMEs
Building Research environments for fostering Innovation, Decision making, Governance and Education to support Blue growth
March 2017

Full title of your project
Acronym of your proposal (optional)

BlueBRIDGE receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 675680.

Disclaimer

BlueBRIDGE (675680) is a Research and Innovation Action (RIA) co-funded by the European Commission under the Horizon 2020 research and innovation programme.

The goal of BlueBRIDGE, Building Research environments for fostering Innovation, Decision making, Governance and Education to support Blue growth, is to support capacity building in interdisciplinary research communities actively involved in increasing the scientific knowledge of the marine environment, its living resources, and its economy, with the aim of providing a better ground for informed advice to competent authorities and of enlarging the spectrum of growth opportunities as addressed by the Blue Growth societal challenge.

This document is the application form for the BlueBRIDGE Open Call for SMEs. The document has been produced with the funding of the European Commission. The content of this publication is the sole responsibility of the BlueBRIDGE Consortium and its experts, and it cannot be considered to reflect the views of the European Commission.

BlueBRIDGE Open Call Application Form Page 2 of 20

Table of contents

1 Background and Qualifications
2 Problem Statement and Objectives
3 Business Impact and Sustainability
4 Requested BlueBRIDGE resources
5 Open Access

1 BACKGROUND AND QUALIFICATIONS

Provide a brief company profile, including information on who your customers are, an overview of the activities that you perform, and your qualifications (including the technical expertise that you have in house). Maximum 500 words.

Remark: The information in this section may be used in public documents and reports by the BlueBRIDGE consortium.

2 PROBLEM STATEMENT AND OBJECTIVES

Outline the problem statement that BlueBRIDGE can help you to address and describe the objectives that you want to achieve through this proposal. These objectives should be those achievable within your proposed action, not through subsequent development. Preferably, they should be stated in a measurable and verifiable form. Maximum 500 words.

3 BUSINESS IMPACT AND SUSTAINABILITY

Describe how setting up the proposed BlueBRIDGE collaborative environment may impact your growth, innovation, business, service portfolio, research, etc. If you are planning to use the BlueBRIDGE services/resources to improve a current process, please describe how this will change your process (a detailed description of your current process would be useful for evaluators). Please also indicate whether you envisage using BlueBRIDGE beyond the duration of the proposal. Maximum 500 words.

4 REQUESTED BLUEBRIDGE RESOURCES

Please select from the given list which BlueBRIDGE data sources, services, technologies, data analytics and algorithms will be required for your application.

Data Sources
Columns: Name | Description/Link | Examples | Do you need this resource?

Biological and Ecological List of Names
  Description/Link: Biological and ecological authoritative and comprehensive list of names of marine organisms, including information on synonymy.
  Examples: Catalogue of Life, World Register of Marine Species (WoRMS), World Register of Deep-Sea Species (WoRDSS). Taxonomic, trophic level and life history traits data from FishBase.

Biological and Ecological Data
  Description/Link: Biological and ecological data evidence about more than 1.6 million species, collected over three centuries of natural history exploration and including current observations from citizen scientists, researchers and automated monitoring programmes.
  Examples: Global Biodiversity Information Facility (GBIF), Ocean Biogeographic Information System (OBIS).

Chemical & physical variables with global geospatial coverage
  Examples: World Ocean Atlas, EMODnet, Copernicus Marine Environmental Monitoring System, Planet OS, GEBCO.
  Description/Link (variable: coverage):
  - Apparent Oxygen Utilization: annual, seasonal, and monthly
  - Dissolved Oxygen
  - Oxygen Saturation
  - Ice Concentration, thickness, velocity
  - Chlorophyll
  - Mass Concentration of Chlorophyll in Sea Water
  - Mole Concentration of Dissolved Oxygen in Sea Water
  - Nitrate in Sea Water
  - Phosphate in Sea Water
  - Phytoplankton expressed as carbon in sea water
  - Net Primary Productivity of Carbon
  - Nitrate: annual, seasonal, and monthly
  - Phosphate: annual, seasonal, and monthly
  - Salinity: monthly average coverage
  - Sea Surface Height: monthly average coverage
  - Sea Water Salinity: annual, seasonal, and monthly
  - Sea Water Temperature: annual, seasonal, and monthly
  - Silicate: annual, seasonal, and monthly
  - Temperature: monthly average coverage
  - Wind Speed: monthly ASCAT global wind
  - Wind Stress: monthly ASCAT global wind
  - Zonal Velocity: monthly average coverage

Services & Technologies
Columns: Name | Description/Link | Characteristic | Do you need this resource?

RStudio
  Description/Link: RStudio makes R easier to use. It includes a code editor and debugging and visualization tools.
  Characteristic: RStudio server is configured with 16 cores and 16 GB RAM.

Data Miner
  Description/Link: DataMiner Manager is a computational engine for performing data analytics operations. Specifically, it offers a single point of access for performing data analytics on heterogeneous data, which may reside either at the client side, in the form of comma-separated values files, or be remotely hosted, possibly in a database.
  Characteristic: Data Miner cluster is configured to support high-throughput computing on 100 cores and 100 GB RAM.
  Do you need this resource? Mandatory if you need to run an algorithm or a model on the infrastructure.

Spatial Data Infrastructure
  Description/Link: The Spatial Data Infrastructure includes:
  - GeoServer cluster to manage vector data accessible via OGC WMS and WFS protocols
  - GeoNetwork to manage spatially referenced metadata accessible via OGC CSW protocol
  - THREDDS Data Server cluster to manage NetCDF, OpenDAP, and HDF5 datasets accessible via OPeNDAP protocol
  Characteristic: The Spatial Data Infrastructure is configured to support the storage of spatially referenced datasets up to 0.5 TB of disk space.

Storage Infrastructure
  Description/Link: The Storage infrastructure supports storage of files organized in directories. Access policies can be associated with directories: private to a single user, restricted to specified users, or shared with all users of the VRE.
  Characteristic: The Storage infrastructure is configured to support the storage of files up to 2 TB of disk space.

Relational Database
  Description/Link: Relational Database with transactional replication.
  Characteristic: The Relational Database cluster is configured to support storage of up to 0.5 TB of disk space.

Social Framework
  Description/Link: All applications running on the infrastructure are made accessible through a portal. It includes facilities for the management of users, for communicating with users via posts and notifications, for managing access policies, etc.
  Characteristic: The Social Framework cluster is configured to manage up to 500 users and 5 VREs.

SmartGears Framework
  Description/Link: The SmartGears framework makes your Tomcat-based application runnable on the infrastructure. On behalf of the application, it manages authentication, authorization, accounting, monitoring, and alerting.
  Characteristic: Performance evaluation in aquaculture; techno-economic investment analysis and what-if analysis; aquaculture training VRE news.
  Do you need this resource? Mandatory if Tomcat-based applications must be hosted on the infrastructure.

Data Harmonization
  Description/Link: The Data Harmonization facility supports the semi-automatic harmonization of time series with respect to code lists and controlled vocabularies. It provides a suite with which human curators can define tailored templates for harmonizing series of time series.
  Characteristic: The Data Harmonization facility can be used to harmonize time series of up to 1 M observations for each iteration of the harmonization process.

Data Publication
  Description/Link: Species distribution maps generation; production of indicators; facilities for creating and managing enhanced documents; generation of standard ISO 19139 metadata for geospatial datasets.
  Characteristic: The Data Publication facility allows products to be published in the VRE, making them available either to all members of the VRE or as open access.

Data Analytics
Columns: Name | Description/Link | Examples | Do you need this service?

Facilities for species occurrence and geospatial datasets processing
  Description/Link: Time Series Analysis, Time Geo Chart, XYExtractor, ZExtraction, Raster Data Publisher, ESRI-GRID Extraction, Maps Comparison
  Examples: The Scalable Data Mining VRE
  Published examples:
  - Coro, Gianpaolo, Pasquale Pagano, and Anton Ellenbroek. "Comparing heterogeneous distribution maps for marine species." GIScience & Remote Sensing 51.5 (2014): 593-611.
  - Coro, Gianpaolo, et al. "Automatic classification of climate change effects on marine species distributions in 2050 using the AquaMaps model." Environmental and Ecological Statistics 23.1 (2016): 155-180.

Facilities for performing data mining tasks on tabular and computer science data
  Description/Link: Feed Forward Neural Network Regressor, Feed Forward Neural Network Trainer, Dbscan, Kmeans, Lof, Xmeans, WEB App Publisher, Quality Analysis, Generic Charts, Stat Val
  Examples: The Tabular Data Lab VRE
  Published examples:
  - Candela, Leonardo, et al. "Species distribution modeling in the cloud." Concurrency and Computation: Practice and Experience (2013).
  - Coro, Gianpaolo, et al. "Parallelizing the execution of native data mining algorithms for computational biology." Concurrency and Computation: Practice and Experience 27.17 (2015): 4630-4644.

Facilities for the management and supervision of ecosystems
  Description/Link: Absence Cells from AquaMaps, HRS, Absence Generation from Obis, Estimate Monthly Fishing Effort, Ecopath with Ecosim, Estimate Fishing Activity, SEADATANET Interpolator, Species Maps from Points, BiOnym, Whole Steps Vpa Iccat Bft E
  Examples: The Biodiversity Lab VRE
  Published examples:
  - Coro, Gianpaolo, et al. "Improving data quality to build a robust distribution model for Architeuthis dux." Ecological Modelling 305 (2015): 29-39.
  - Coro, Gianpaolo, Luigi Fortunati, and Pasquale Pagano. "Deriving fishing monthly effort and caught species from vessel trajectories." OCEANS-Bergen, 2013 MTS/IEEE. IEEE, 2013.

Facilities for the development of optimized feeding and growth models
  Description/Link: Simulfishkpis
  Examples: Performance and Evaluation in Aquaculture VRE

Facilities for supporting decision making and strategic investment analysis and doing better planning in the aquaculture domain
  Description/Link: Mpa Intersect V2
  Examples: Protected Area Impact Maps VRE, Aquaculture Atlas Generation VRE

Algorithms
Columns: Name | Description/Link | Requirements | Do you need this resource?

Feed Forward Neural Network Regressor
  Description/Link: The algorithm simulates a real-valued vector function using a trained Feed Forward Artificial Neural Network and returns a table containing the function's actual inputs and the predicted outputs.
  Requirements: Requires the DataMiner Cluster

Feed Forward Neural Network Trainer
  Description/Link: The algorithm trains a Feed Forward Artificial Neural Network using an online Back-Propagation procedure and returns the training error and a binary file containing the trained network.
  Requirements: Requires the DataMiner Cluster

Dbscan
  Description/Link: A clustering algorithm for real-valued vectors that relies on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. A maximum of 4000 points is allowed.
  Requirements: Requires the DataMiner Cluster

Kmeans
  Description/Link: A clustering algorithm for real-valued vectors that relies on the k-means algorithm, i.e. a method aiming to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. A maximum of 4000 points is allowed.
  Requirements: Requires the DataMiner Cluster

Lof
  Description/Link: Local Outlier Factor (LOF). A clustering algorithm for real-valued vectors that relies on the Local Outlier Factor algorithm, i.e. an algorithm for finding anomalous data points by measuring the local deviation of a given data point with respect to its neighbours. A maximum of 4000 points is allowed.
  Requirements: Requires the DataMiner Cluster

Xmeans
  Description/Link: A clustering algorithm for occurrence points that relies on the X-Means algorithm, i.e. an extended version of the K-Means algorithm improved by an ImproveStructure part. A maximum of 4000 points is allowed.
  Requirements: Requires the DataMiner Cluster

Time Series Analysis
  Description/Link: An algorithm applying signal processing to a non-uniform time series. A maximum of 10000 distinct points in time is allowed to be processed. The process uniformly samples the series, then extracts hidden periodicities and signal properties. The sampling period is the shortest time difference between two points. Finally, by using Caterpillar-SSA, the algorithm forecasts the time series. The output shows the detected periodicity, the forecasted signal and the spectrogram.
  Requirements: Requires the DataMiner Cluster

Time Geo Chart
  Description/Link: An algorithm producing an animated GIF displaying quantities as colors in time. The color indicates the sum of the values recorded in a country.
  Requirements: Requires the DataMiner Cluster

XYExtractor
  Description/Link: An algorithm to extract values associated with an environmental feature repository (e.g. NetCDF, ASC, GeoTiff files, etc.). A grid of points at a certain resolution is specified by the user, and values are associated with the points from the environmental repository. It accepts as input one geospatial repository ID (via its UUID in the infrastructure spatial data repository, recoverable through the GeoExplorer portlet) or a direct link to a file, together with the specification of time and space. The algorithm produces one table containing the values associated with the selected bounding box.
  Requirements: Requires the DataMiner Cluster

ZExtraction
  Description/Link: An algorithm to extract the Z values from a geospatial features repository (e.g. NetCDF, ASC, GeoTiff files, etc.). The algorithm analyses the repository and automatically extracts the Z values according to the resolution wanted by the user. It produces one chart of the Z values and one table containing the values.
  Requirements: Requires the DataMiner Cluster

Absence Cells from AquaMaps
  Description/Link: An algorithm producing cells and features (HCAF) for a species, containing absence points taken from an AquaMaps distribution.
  Requirements: Requires the DataMiner Cluster

HRS
  Description/Link: An algorithm that calculates the Habitat Representativeness Score, i.e. an indicator of whether a specific survey coverage, or another environmental features dataset, contains data that are representative of all available habitat variable combinations in an area.
  Requirements: Requires the DataMiner Cluster

Absence Generation from Obis
  Description/Link: An algorithm to estimate absence records from survey data in OBIS. Based on the work in Coro, G., Magliozzi, C., Berghe, E. V., Bailly, N., Ellenbroek, A., & Pagano, P. (2016). Estimating absence locations of marine species from data of scientific surveys in OBIS. Ecological Modelling, 323, 61-76.
  Requirements: Requires the DataMiner Cluster

Raster Data Publisher
  Description/Link: This algorithm publishes a raster file as a map or dataset in the e-Infrastructure. NetCDF-CF files are encouraged, as WMS and WCS maps will be produced using this format. For other types of files (GeoTiff, ASC, etc.) only the raw datasets will be published. The resulting map or dataset will be accessible to the VRE participants via the VRE GeoExplorer.
  Requirements: Requires the DataMiner Cluster

Estimate Monthly Fishing Effort
  Description/Link: An algorithm that estimates fishing exploitation at 0.5 degrees resolution from activity-classified vessel trajectories. It produces a table with csquare codes, latitudes, longitudes and resolution, and the associated overall fishing hours in the time frame of the vessels' activity. It requires each activity point to be classified as Fishing or other. This algorithm is based on the paper 'Deriving Fishing Monthly Effort and Caught Species' (Coro et al. 2013, in proc. of OCEANS - Bergen, 2013 MTS/IEEE). Example of input table (NAFO anonymised data): http://goo.gl/3auJkM
  Requirements: Requires the DataMiner Cluster

Ecopath with Ecosim
  Description/Link: Ecopath with Ecosim (EwE) is a free ecological/ecosystem modeling software suite. This algorithm implementation expects a model and a configuration file as inputs; the result of the analysis is returned as a zip archive. References: Christensen, V., & Walters, C. J. (2004). Ecopath with Ecosim: methods, capabilities and limitations. Ecological Modelling, 172(2), 109-139.
  Requirements: Requires the DataMiner Cluster

Estimate Fishing Activity
  Description/Link: An algorithm that estimates activity hours (fishing or other) from vessel trajectories, adds bathymetry information to the table, and classifies (point by point) the fishing activity of the involved vessels according to two algorithms: one based on speed (activity_class_speed output column) and the other based on speed and bathymetry (activity_class_speed_bath output column). The algorithm produces new columns containing this information. This algorithm is based on the paper 'Deriving Fishing Monthly Effort and Caught Species' (Coro et al. 2013, in proc. of OCEANS - Bergen, 2013 MTS/IEEE). Example of input table (NAFO anonymised data): http://goo.gl/3auJkM
  Requirements: Requires the DataMiner Cluster

ESRI-GRID Extraction
  Description/Link: An algorithm to extract values associated with an environmental feature repository (e.g. NetCDF, ASC, GeoTiff files, etc.). A grid of points at a certain resolution is specified by the user, and values are associated with the points from the environmental repository. It accepts as input one geospatial repository ID (via its UUID in the infrastructure spatial data repository, recoverable through the GeoExplorer portlet) or a direct link to a file, together with the specification of time and space. The algorithm produces one ESRI GRID ASCII file containing the values associated with the selected bounding box.
  Requirements: Requires the DataMiner Cluster

SEADATANET Interpolator
  Description/Link: A connector for the SeaDataNet infrastructure. This algorithm invokes the Data-Interpolating Variational Analysis (DIVA) SeaDataNet service to interpolate spatial data. The model uses GEBCO bathymetry data and requires, among other parameters, an estimate of the maximum spatial span of the correlation between points and the signal-to-noise ratio. It can interpolate up to 10,000 points randomly taken from the input table. As output, it produces a NetCDF file with a uniform grid of values. This interpolation model is described in Troupin et al. 2012, 'Generation of analysis and consistent error fields using the Data Interpolating Variational Analysis (Diva)', Ocean Modelling, 52-53, 90-101.
  Requirements: Requires the DataMiner Cluster

WEB App Publisher
  Description/Link: This algorithm publishes in the e-Infrastructure a zip file containing a Web site based on HTML and JavaScript. It generates a public URL to the application that can be shared.
  Requirements: Requires the DataMiner Cluster

Maps Comparison
  Description/Link: An algorithm for comparing two OGC/NetCDF maps in a way that is seamless to the user. The algorithm assesses the similarities between two geospatial maps by comparing them in a point-to-point fashion. It accepts as input the two geospatial maps (via their UUIDs in the infrastructure spatial data repository, recoverable through the GeoExplorer portlet) and some parameters affecting the comparison, such as the z-index, the time index, and the comparison threshold. Note: in the case of WFS layers it makes comparisons on the last feature column.
  Requirements: Requires the DataMiner Cluster

Quality Analysis
  Description/Link: An evaluator algorithm that assesses the effectiveness of a distribution model by computing the Receiver Operating Characteristics (ROC), the Area Under Curve (AUC) and the Accuracy of a model.
  Requirements: Requires the DataMiner Cluster

Species Maps from Points
  Description/Link: An algorithm to produce a GIS map from a probability distribution made up of x,y coordinates and a certain resolution.
  Requirements: Requires the DataMiner Cluster

Generic Charts
  Description/Link: An algorithm producing generic charts of attributes vs. quantities. Charts are displayed per quantity column. Histograms, scatter and radar charts are produced for the top ten quantities. A Gaussian distribution reports overall statistics for the quantities.
  Requirements: Requires the DataMiner Cluster

BiOnym
  Description/Link: An algorithm implementing BiOnym, a flexible workflow approach to taxon name matching. The workflow allows several taxa-name matching algorithms to be activated and returns the list of possible transcriptions for a list of input raw species names with possible authorship indication.
  Requirements: Requires the DataMiner Cluster

Stat Val
  Description/Link: Statistical validation of a bipartite weighted network.
  Requirements: Requires the DataMiner Cluster

Simulfishkpis
  Description/Link: Creates simulation models for fish production KPIs in aquaculture. Imports data from the SimulFish Growth database via URLs. The calculated KPIs are FCR, SFR, and Mortality, using regression models generated by GAMs and MARs methodologies.
  Requirements: Requires the DataMiner Cluster

Whole Steps Vpa Iccat Bft E
  Description/Link: ICCAT (Eastern) Bluefin Tuna Stock Assessment. This set of R and Fortran code has been provided by ICCAT and IFREMER to execute a whole stock assessment workflow.
  Requirements: Requires the DataMiner Cluster

Mpa Intersect V2
  Description/Link: An algorithm to compute areas of geomorphic features in an EEZ or ECOREGION area and in its intersecting Marine Protected Areas (MPAs).
  Requirements: Requires the DataMiner Cluster

If you need any other specific resource which is not part of the current list, or you need to integrate your own software/application (for example R, Java, JavaScript, Python, etc.) in the virtual environment, please describe below what you need and why (max. 1 page).

5 OPEN ACCESS

Indicate your willingness to share your data. Identify the type of data that you will be sharing and an overall percentage. The more data you share on the BlueBRIDGE infrastructure, the more access to resources we will provide.
Open access is not mandatory but strongly encouraged. Maximum 500 words.