Download Configuring Metadata Stores

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

IMDb wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

SQL wikipedia , lookup

Oracle Database wikipedia , lookup

Microsoft Access wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Ingres (database) wikipedia , lookup

Btrieve wikipedia , lookup

Team Foundation Server wikipedia , lookup

Functional Database Model wikipedia , lookup

Concurrency control wikipedia , lookup

PL/SQL wikipedia , lookup

Database wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Relational model wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

PostgreSQL wikipedia , lookup

Database model wikipedia , lookup

ContactPoint wikipedia , lookup

Clusterpoint wikipedia , lookup

Transcript
ANU Metadata Stores
System Administrator’s Manual
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia
License.
Table of Contents
Overview ................................................................................................................................................. 3
Licence .................................................................................................................................................... 3
Acknowledgement .................................................................................................................................. 3
Components ............................................................................................................................................ 3
Architecture ............................................................................................................................................ 4
Prerequisites ........................................................................................................................................... 4
PostgreSQL .......................................................................................................................................... 4
Java Runtime Environment ................................................................................................................. 4
Tomcat ................................................................................................................................................ 5
Optional: Fuseki Sparql Server ............................................................................................................ 5
Configuration .......................................................................................................................................... 5
Tomcat ................................................................................................................................................ 5
Set up access to the Manager web application .............................................................................. 5
Maven ................................................................................................................................................. 5
Configuring Metadata Stores .............................................................................................................. 6
Setup Database ............................................................................................................................... 6
Optional: Sparql HTTP Server .......................................................................................................... 9
Setup Miscellaneous ....................................................................................................................... 9
Building ................................................................................................................................................. 10
Dependencies.................................................................................................................................... 10
Build Process ......................................................................................................................................... 10
Clone the source repository .............................................................................................................. 10
Execute Maven Build ........................................................................................................................ 10
Deployment .......................................................................................................................................... 10
Deployment using Tomcat Manager................................................................................................. 11
Deployment using Maven Tomcat Plugin ......................................................................................... 11
F.A.Q...................................................................................................................................................... 11
How do I encrypt a database password? .......................................................................................... 11
How do I create/modify reports? ..................................................................................................... 12
Troubleshooting .................................................................................................................................... 12
SSL Exceptions ................................................................................................................................... 12
Overview
This document lists and explains the steps required to deploy and maintain an instance of the ANU
Metadata Stores software.
Licence
Use of ANU Metadata Stores is governed by the GNU GPL3 licence.
Acknowledgement
This project is supported by the Australian National Data Service (ANDS). ANDS is supported by the
Australian Government through the National Collaborative Research Infrastructure Strategy Program
and the Education Investment Fund (EIF) Super Science Initiative.
Components
1. Metadata Store
This is the primary component of the project. The Metadata Store provides the means to
update, combine, and serve data from various sources. At this point in time we are
retrieving information from the ANU Aries, LDAP, Data Commons and Digital Collections
systems. The store can be presented as either a standalone jar file or as a component in
web services.
Currently the Standalone JAR file updates information via command line options while as a
component of the web services it serves information to clients.
2. Metadata Store Web Services
The Metadata Store Web Services provide access for external clients to retrieve information
from the Metadata Store. The web service provides the ability to perform some queries to
find the data and serves the information for clients in JSON or XML format.
Included with the Metadata Store Web Services are some html pages that provide some
administrative functionality such as confirming or denying possible relationships and to
provide access to Metadata Store reports.
3. Metadata Store Search Web Application
The Metadata Store Search Service provides the ability for users to search and view
information about objects in the Metadata Store.
4. OAI-PMH Harvester
The harvester can harvest data from external OAI-PMH systems to allow for processing at a
later date/time. It pulls the data into individual items so that they can then be processed by
the Metadata Store. Examples of systems that data may be retrieved from are the Data
Commons and Digital Collections.
Architecture
Prerequisites
The Metadata Store is a set of Java applications that require a server (or virtual machine) capable of
running an operating system that supports Oracle Java. Refer to
http://www.oracle.com/technetwork/java/javase/downloads/index.html for a list of operating
systems capable of hosting a Java Runtime Environment.
We are running Red Hat Enterprise Linux Server 5.8 running on a virtual machine with 2GB of RAM.
1. PostgreSQL. Available from http://www.postgresql.org/download/
2. Java Runtime Environment
a. Tomcat 7.x
If using Linux as the operating system some of the aforementioned programs may be available in
your distribution’s repository. Check your package manager for more details.
PostgreSQL
The ANU Metadata Store holds information in a PostgreSQL database.
Java Runtime Environment
Installation of the Java Runtime Environment (JRE) is specific to the platform that it is being installed
on. Refer to your operating system’s user manual for details. The Java Development Kit (JDK) is
required if you intend to perform remote debugging.
Tomcat
Tomcat is an application server that can be used for hosting web applications. Tomcat has a build-in
web server that can be used to serve HTTP requests without the need for a dedicated web server
such as Apache HTTP Server. We have developed our application using Tomcat 7.0.27.
Optional: Fuseki Sparql Server
Fuseki is a SPARQL 1.1 HTTP Server. Information about Fuseki can be found at:
http://jena.apache.org/documentation/serving_data/index.html .
If desired there is an option to use a SPARQL Server to perform searches on the data in the Metadata
Store. We are using Fuseki, however there should be the ability to use any SPARQL 1.1 HTTP Server
that includes the complete SPARQL 1.1 standard.
Configuration
Tomcat
If you wish to encrypt a password held in configuration files it is advisable to add the java system
property value named app.pwd. To do this the option:
-Dapp.pwd=”AnEncryptionPassword”
Should be added to either the JAVA_OPTS or CATALINA_OPTS environment variable that will be
used when tomcat starts.
Set up access to the Manager web application
Accessing the Manager application through the web interface requires a Tomcat user to be set up
with the manager-gui role. Accessing the same application through a scripted interface, such as
through the Maven Tomcat Plugin requires a Tomcat user to be set up with the manager-script
role. Refer to http://tomcat.apache.org/tomcat-7.0-doc/manager-howto.html for details on how to
configure the Manager application.
Maven
For Maven to deploy applications to Tomcat, you will need to create one or more profiles that
include information such as the URL where the Tomcat instance is hosted along with the credentials
to use to access its manager application. Refer to the section Set up access to the Manager web
application to setup the users whose credentials you’d like to use to deploy applications through
Maven.
To create a profile for a tomcat instance, open the settings.xml file. Refer to
http://maven.apache.org/settings.html to find the location of settings.xml. In the file, add the
following XML:
<servers>
…
<server>
<id>INSTANCE_ID</id>
<username>USERNAME</username>
<password>PASSWORD</password>
</server>
…
</servers>
<profiles>
…
<profile>
<id>PROFILE_ID</id>
<properties>
<maven.tomcat.url>http://HOSTNAME:PORT/manager/text</maven.tomc
at.url>
<maven.tomcat.server>INSTANCE_ID</maven.tomcat.server>
</properties>
</profile>
…
</profiles>
Parameter
INSTANCE_ID
USERNAME
PASSWORD
PROFILE_ID
HOSTNAME
PORT
INSTANCE_ID
Value
An arbitrary ID assigned to the username and
password to be used for sending requests to the
Manager application.
The username to which manager-script role is
assigned.
Password for the username above.
An arbitrary ID assigned to the tomcat instance
to which a Web application will deploy.
Generally, you’d have one profile for the
development tomcat instance, one for testing
and one for production.
The fully qualified hostname where the tomcat
instance is located. For example, metadatastores.com .
The port on which the tomcat instance is
listening on. For example, 8080 .
The ID assigned to the combination of username
and password to be used for deploying
applications through the Manager application.
Configuring Metadata Stores
Setup Database
Metadata Stores requires two databases to be created and configured and one to be configured.
One database contains the stores information, the other harvested information.
Store Database
The Store database holds the retrieved metadata to serve to clients. The different types of objects
i.e. People, Publications, Grants, etc. are contained within the same set of tables however they are
discriminated via the ext_system column in the Item table.
Figure 1: Metadata Store Schema
The Stores database is created and initial values filled out via executing scripts found In the
metadata-stores/store/src/main/sql directory.
Create the Stores Database in a PostgreSQL instance by executing the following commands:
psql
psql
psql
psql
–Upostgres –fcreate_database.sql
–Umsuser –fcreate_tables.sql – dmetadatastoresdb
–Umsuser –fcreate_sequences.sql - dmetadatastoresdb
–Umsuser –f3_add_data.sql –dmetadatastoresdb
Then execute each of the SQL files in the format YYYYMMDD_NAME.sql in order:
psql –Umsuser –fYYYYMMDD_NAME.sql – dmetadatastoresdb
To establish a connection between the store and the database created, modify the following
properties in store.cfg.xml found in the store/src/main/resources directory.
Property
Value
hibernate.connection.driver_class Change to an appropriate value if using a database other than
PostgreSQL
hibernate.connection.url
The URL of the database. For example,
jdbc:postgresql://hostname:1234/dbname
hibernate.connection.user
Username to use to connect to the database. For example,
dcuser
hibernate.connection.password
hibernate.dialect
Password to use to connect to the database.
Change to an appropriate value if using a database server other
than PostgreSQL
Harvest Database
The harvest database contains two tables, one to indicate from where harvesting should occur, the
other to hold the content that has been harvested.
Figure 2: Harvester Schema
The Harvest database is created in initial values filled out via executing scripts found in the
metadata-stores/stores-harvester/src/main/sql directory.
Create the Harvest Database in a PostgreSQL instance by executing the following commands:
psql –Upostgres –fcreate_database.sql
psql –Uharvestuser -fcreate_tables.sql -dharvestdb
To establish a connection between the harvester and the database created, modify the following
properties in harvester.cfg.xml found in the stores-harvester/src/main/resources
directory.
Property
Value
hibernate.connection.driver_class Change to an appropriate value if using a database other than
PostgreSQL
hibernate.connection.url
The URL of the database. For example,
jdbc:postgresql://hostname:1234/dbname
hibernate.connection.user
Username to use to connect to the database. For example,
dcuser
hibernate.connection.password
hibernate.dialect
Password to use to connect to the database.
Change to an appropriate value if using a database server other
than PostgreSQL
Aries Database
The Aries database is an existing database that contains research information. To establish a
connection between the harvester and the database created, modify the aries.cfg.xml found in
the aries/src/main/resources directory. The user attempting to update information will
require an account that has access to the Aries database.
Property
hibernate.connection.driver_class
Value
Change to an appropriate value if using a database other than
SQL Server
hibernate.connection.url
The URL of the database. For example,
jdbc:jtds:sqlserver://hostname:1234/dbname;instance
=XXXX
hibernate.connection.user
Note: Connecting to the SQL server database from an
environment other than a standard windows login (e.g. from
Linux) you will also need the domain to connect to.
Username to use to connect to the database. For example,
dcuser
hibernate.connection.password
hibernate.dialect
Note: This property is not necessary if logged in to a windows
computer with a user who has access to the Aries database.
Password to use to connect to the database.
Note: This property is not necessary if logged in to a windows
computer with a user who has access to the Aries database.
Change to an appropriate value if using a database server
other than SQL Server.
Optional: Sparql HTTP Server
The Sparql HTTP Server is an optional database that is utilised for performing searches on the data.
To establish a connection and update the data, modify the file rdf.properties found in the
store/src/main/resources directory.
Property
rdf.object.uri
rdf.relation.uri
rdf.server
rdf.server.query
rdf.server.update
Value
The prefix uri for objects
The prefix uri to indicate relations
The Sparql server location
The path for SPARQL queries
The path for SPARQL updates
To actually use the RDF for searches prior to compilation of the code the class
au.edu.anu.metadatastores.store.search.SearchService requires a modification to the line:
private static Search search_ = DBSearch.getSingleton();
to:
private static Search search_ = RDFService.getSingleton();
Setup Miscellaneous
LDAP
To configure the retrieval of data from LDAP the following properties will need to be updated in the
file ldap/src/main/resources/ldap.properties.
Property
Value
ldap.uri
This is the URL of the LDAP service
ldap.baseDN
The base distinguished name for the LDAP service
ldap.peopleDn The distinguished name for finding people within the LDAP service
Harvesting Locations
Insert rows into the location table of the harvest database. This should include the system, the url of
the OAI provider, the metadata prefix, how frequently the data should be harvested (in milliseconds)
and an original harvest date to calculate when to next harvest the data It will harvest the next. If for
whatever reason you do not wish to harvest the entire feed but rather from a specific date, this can
be done via setting the last harvest date.
Currently there are two locations that have code that will process their data, Data Commons and
Digital Collections. The default values for these are DATA_COMMONS and DIGITAL_COLLECTIONS
respectively. These can be renamed, and for processing to occur the
harvest.location.datacommons and harvest.location.digitalcollections values will
need to be updated in the store.properties. store.properties is located in the
store/src/main/resources directory.
Building
Dependencies
As Metadata Stores is a Maven project its dependencies will automatically be pulled from a Maven
repository.
Build Process
ANU Metadata Stores uses Maven as its build tool. To build the projects from source perform the
following steps:
Clone the source repository
Clone the GitHub repository where ANU Metadata Stores source code is hosted https://github.com/anu-doi/metadata-stores.git
Execute Maven Build
Execute the following command in the directory where the ANU Metadata Stores repository has
been cloned to compile and build the project into JAR and WAR files:
mvn clean package
If any tests fail, run:
mvn clean package –DskipTests
It is also possible to create a runnable jar for the store. In which case navigate to the store directory
and execute:
mvn clean package assembly:single -DskipTests
Deployment
Once the project has been built, the generated WAR files can be deployed to Tomcat by using the
Tomcat manager application or using Maven itself.
Deployment using Tomcat Manager
Assuming the Tomcat instance is hosted at http://localhost:8080/, Open Tomcat Manager
application at http://localhost:8080/manager. Scroll down to the section titled ‘Deploy’. Click on the
browse button, select the WAR file for the application you would like to deploy in the file browser
dialog box and select OK. Then Click on the Deploy button to deploy the application to Tomcat.
Deployment using Maven Tomcat Plugin
The applications can be deployed to Tomcat by executing the following in the project directory:
mvn tomcat:deploy-only –p PROFILE_ID
where PROFILE_ID is the profile ID assigned to a tomcat instance. Refer to section Configuring
Maven.
This command only deploys the web applications in the metadata-stores project without executing
the package phase. The modules must already be packaged for them to be deployed. If you’d like to
package the application and deploy using a single command, execute the following:
mvn tomcat:undeploy
F.A.Q
How do I encrypt a database password?
Download the latest distribution release from jasypt (http://www.jasypt.org/) and unpack the zip file.
Navigate to the bin directory and execute the following command:
> encrypt input=”TheDatabasePassword” password=”the-app.pwd-value”
algorithm=”PBEWithMD5AndDES”
The encrypted value will then be returned below the line containing ‘---OUTPUT---’
This value should then be placed in the appropriate hibernate configuration file surrounded by
‘ENC()’. For example the Hibernate password property becomes:
<property name="hibernate.connection.password">
ENC(lP+z+2cKMGlM4vRsM5tl9d72FAJTiF6f)
</property>
Note: the app.pwd value is as set up in the Tomcat Configuration.
How do I create/modify reports?
The reports have been created with iReport 4.7.0 (http://community.jaspersoft.com/project/ireportdesigner ) and are compiled and run using the jasper reports engine
(http://community.jaspersoft.com/project/jasperreports-library ).
After creating/modifying the report and subreports in iReport to satisfaction place the jrxml files in
the services/src/main/webapp/WEB-INF/reports directory. The reports will be
automatically compiled when the web service starts. Modifications to the ReportResource and
ReportGenerator classes may also be required to be able to access the reports from the front end on
the web server.
Troubleshooting
SSL Exceptions
If the tomcat instance is configured for SSL connections from clients, it is vital that the private key
and public certificate are correctly configured. Refer to the section SSL Configuration HOW-TO at
http://tomcat.apache.org/tomcat-7.0-doc/ssl-howto.html .
If using a self-signed certificate on the server, all clients connecting to it must have the certificate in
its trusted certs store. This includes applications within Tomcat that act as clients to other
applications in the same Tomcat instance.