Hadoop & NoSQL Database Project
Spring 2016
MARKLOGIC DATABASE
Aashi Rastogi (0997297), Sanket Patel (0999383)

Introduction:
NoSQL means non-SQL or non-relational: a NoSQL database provides a mechanism to store and retrieve data that is modelled in ways other than the tables of a relational database. NoSQL databases are widely used today because of their simplicity of design, ease of scaling out, and control over availability.

Types of NoSQL Databases:
• Key-value based
• Column oriented
• Graph oriented
• Document based
• Multi-model

A multi-model database is designed to support multiple data models for a single application. MarkLogic is a NoSQL database that uses a multi-model design.

What is MarkLogic?
MarkLogic is marketed as an Enterprise NoSQL database. It is optimized for structured and unstructured data and lets you store, manage, query, and search across JSON, XML, RDF (triple store), geospatial data, text, and large binaries. MarkLogic handles data in a schema-agnostic fashion and has a built-in application server, which leads to faster time-to-results. It provides capabilities such as ACID transactions, high availability and disaster recovery, and security. MarkLogic is designed to work with Hadoop and helps you make better use of that technology. It can also be deployed in the cloud, which removes the need to maintain hardware and provides the benefits of elasticity. It also has built-in application services and text search capabilities. Acting as a triple store with inference capabilities, it lets you discover new facts.

How it works?
MarkLogic uses XML documents as its data model and stores the documents within a transactional repository. It indexes the words and values from each loaded document, as well as the document structure. Because of its Universal Index, MarkLogic requires neither advance knowledge of the document structure nor complete adherence to a particular schema.
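To make the indexing idea concrete, the following is a toy sketch in Python of an index that records both words and element paths for XML documents. It is only an illustration under our own assumptions, not MarkLogic's actual Universal Index: the function name build_toy_index, the sample documents, and their URIs are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_toy_index(documents):
    """Map each word and each element path to the set of document URIs containing it."""
    word_index = defaultdict(set)
    path_index = defaultdict(set)

    def walk(element, parent_path, uri):
        path = f"{parent_path}/{element.tag}"
        path_index[path].add(uri)
        # Index the words of the text directly inside this element.
        for word in re.findall(r"\w+", (element.text or "").lower()):
            word_index[word].add(uri)
        for child in element:
            walk(child, path, uri)
            # Text that follows a child element still belongs to this element.
            for word in re.findall(r"\w+", (child.tail or "").lower()):
                word_index[word].add(uri)

    for uri, xml_text in documents.items():
        walk(ET.fromstring(xml_text), "", uri)
    return word_index, path_index


# Two hypothetical documents with unrelated structures; no schema is declared.
docs = {
    "/movies/1.xml": "<movie><title>Gravity</title><year>2013</year></movie>",
    "/people/1.xml": "<person><name>Alfonso</name><award>Oscar for Gravity</award></person>",
}
word_index, path_index = build_toy_index(docs)

# A word query needs no knowledge of either document's structure.
print(sorted(word_index["gravity"]))
# A structural query narrows the search to documents that contain a given path.
print(sorted(path_index["/movie/title"]))
```

Both documents are found by the word query even though they share no structure, which is the point of indexing words and structure together instead of requiring a schema up front.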
MarkLogic Server clusters on commodity hardware using a shared-nothing architecture and differentiates itself in the market by supporting massive scale with fast performance: customer deployments have scaled to hundreds of terabytes of source data while maintaining sub-second query response times. In addition to XML, MarkLogic can store JSON, text, and binary documents. In releases before MarkLogic 8, JSON documents were transformed internally to XML for indexing; MarkLogic 8 stores JSON natively. Text documents are indexed as if each were an XML text node without a parent. Binary documents are unindexed by default, with the option to index their metadata and extracted contents.

Motivations:
Data has changed over time, and the technology used to handle it has changed as well. First came the hierarchical era, in which data was tied to its application. Then came the relational era, in which data is stored independently of the application but all of it has to fit into tables. What if the data does not fit? Then one has to widen the table or chop the data up to make it fit. That cannot be done for all data, unstructured data in particular, and this led to the growth of NoSQL databases.

Characteristics:
The following characteristics are discussed:
1. Flexible Data Model
2. Search and Query
3. Semantics
4. Scalability and Elasticity
5. ACID Transactions
6. High Availability and Disaster Recovery
7. Hadoop Integration
8. Bitemporal

1. Flexible Data Model – MarkLogic can natively store and rapidly query JSON, XML, RDF, and more, providing a single platform for all data. The document-centric data model is schema-agnostic, which provides flexibility in modelling data.

Fig: How MarkLogic reads documents (schema-agnostic).

2. Search and Query – MarkLogic indexes data on load and makes it immediately searchable. Its Universal Index works like a search engine index. The Universal Index keeps track of words, phrases, and values in documents.
It also indexes the structure of documents, thus providing context for search. Because data is indexed the way a search engine indexes it, queries are fast. These indexes make it possible to run complex queries across multiple data types.

3. Semantics – Semantics provides an approach to modelling data that focuses on relationships and context. It links two entities together based on the relationship between them to form a triple. These triples form a graph that has no hierarchy.

4. Scalability and Elasticity – MarkLogic is designed with a shared-nothing architecture, which means that each node has its own memory and disk. It scales horizontally in clusters on commodity hardware to hundreds of nodes, petabytes of data, and billions of documents, and still processes tens of thousands of transactions per second.

5. ACID Transactions – ACID stands for atomicity, consistency, isolation, and durability. MarkLogic provides ACID transactions using MVCC (multi-version concurrency control). In an MVCC system, changes are tracked with a timestamp number on each document. The database uses these timestamps to ensure that all users see consistent data.

6. High Availability and Disaster Recovery – MarkLogic achieves HA/DR using a shared-nothing architecture that provides redundancy for failover and high-performance scaling, with no single point of failure. It can back up selected components or the entire database, secured with SSL out of the box. It also supports incremental backups, which back up only the changes made since the previous incremental or full backup.

7. Hadoop Integration – Hadoop is popular because it is designed to cheaply store large amounts of data in the Hadoop Distributed File System (HDFS) and run large-scale MapReduce jobs for batch analysis. MarkLogic is positioned as a database for Hadoop because it can run alongside the Hadoop ecosystem, acting as the database that powers real-time, transactional applications.

8. Bitemporal – Bitemporal data management gives you a full and accurate picture of your data at every point in time, which is particularly useful in regulated industries. It enables you to get better answers from today's, tomorrow's, and yesterday's data. You can go back in time and explore data, manage historical data across systems, ensure data integrity, and do complex bitemporal analysis.

Brief Manual –

Supported Platforms:
MarkLogic Server is supported on the following platforms:
• Microsoft Windows Server 2012 (x64), Microsoft Windows Server 2008 (x64), Windows 7 and 8 64-bit (x64)*
• Sun Solaris 10 (x64)
• Red Hat Enterprise Linux 7 (x64)** ***
• Red Hat Enterprise Linux 6 (x64)** *** ****
• SUSE Linux Enterprise Server 11 (x64) SP3** ***
• CentOS 6 (x64)** ***
• Amazon Linux 2013.03 (x64)** ***
• Mac OS X 10.8 or 10.9*****

* Microsoft Windows 7 and Windows 8 are supported for development only. If MarkLogic Server fails to start up on Windows with the error 'the application failed to initialize properly (0xc0150002)', then a dependency is missing from your environment and you need to download and install the following DLL for 64-bit versions of Windows: http://www.microsoft.com/downloads/details.aspx?FamilyID=eb4ebe2d-33c0-4a47-9dd4-b9a6d7bd44da&DisplayLang=en. Additionally, if you get an error on startup saying you need MSVCR100.dll, then install the Microsoft Visual C++ 2010 SP1 Redistributable Package (x64): http://www.microsoft.com/en-us/download/details.aspx?id=13523.
** The deadline I/O scheduler is required on Red Hat Linux platforms. The deadline scheduler is optimized to ensure efficient disk I/O for multi-threaded processes, and MarkLogic Server can have many simultaneous threads. For information on the deadline scheduler, see the Red Hat documentation (for example, http://www.redhat.com/magazine/008jun05/features/schedulers/).
*** The redhat-lsb, glibc, and gdb packages are required on Red Hat Linux.
Additionally, on 64-bit Red Hat Linux, both the 32-bit and the 64-bit glibc packages are required.
**** Red Hat Linux 6 (x64) is also supported in a VMware ESXi 5.0 (installed on bare metal) environment.
***** Mac OS X is supported for development only. Conversion (Office and PDF) and entity enrichment are not available on Mac OS X. Mac OS X 10.8 or 10.9 (Mountain Lion or Mavericks) on a 64-bit capable processor is required (http://support.apple.com/kb/HT3696).

Installing MarkLogic Server:
This section describes the procedure for installing MarkLogic Server on each platform. Perform the procedure corresponding to the platform on which you are installing.

Windows x64
1. Shut down and uninstall the previous release of MarkLogic Server (if you are upgrading from 7.0, 6.0, or 5.0, see Upgrading from Release 7.0, 6.0, or 5.0; if you are upgrading from 8.0-1 or later, see Removing MarkLogic Server).
2. Download the MarkLogic Server installation package to your desktop. The latest installation packages are available from http://developer.marklogic.com.
3. Double-click the MarkLogic-8.0-1-amd64.msi icon to start the installer. If you are installing a release other than 8.0-1, double-click the appropriately named installer icon.
4. The Welcome page displays. Click Next.
5. Select Typical.
6. Click Install.
7. Click Finish.

Red Hat Linux x64
1. Shut down and uninstall the previous release of MarkLogic Server (if you are upgrading from 7.0, 6.0, or 5.0, see Upgrading from Release 7.0, 6.0, or 5.0; if you are upgrading from 8.0-1 or later, see Removing MarkLogic Server).
2. Download the package to /tmp or another location using your web browser. The latest installation packages are available from http://developer.marklogic.com.
If you are using Firefox or another browser that is configured to associate rpm files, the browser will prompt you for the root password (if you are not already running as root) and you can follow the prompts to complete the installation. When the installation is complete, you can skip the next step. Otherwise, continue to the next step.
3. As the root user, install the package with the following command:
     rpm -i /tmp/MarkLogic-8.0-1.x86_64.rpm
   If you are installing a release other than 8.0-1, replace the characters 8.0-1 in the line above with the appropriate release number.
4. If you are using HDFS, make sure the server is configured to use HDFS with a Hadoop HDFS client and any needed environment variables set in the /etc/sysconfig/MarkLogic file. For details, see HDFS Storage in the Query Performance and Tuning Guide.

Sun Solaris x64
1. Shut down and uninstall the previous release of MarkLogic Server (see Removing MarkLogic Server).
2. Download the package to /var/spool/pkg using your web browser. The latest installation packages are available from http://developer.marklogic.com.
3. Unpack the compressed tar file in /var/spool/pkg with the following shell commands:
     % cd /var/spool/pkg
     % uncompress MARKlogic-8.0-1-amd64.tar.Z
     % tar xf MARKlogic-8.0-1-amd64.tar
     % rm MARKlogic-8.0-1-amd64.tar
   If you are installing a release other than 8.0-1, replace the characters 8.0-1 in the lines above with the appropriate release number.
4. As the root user, install the package with the following command:
     # pkgadd MARKlogic

Mac OS X
1. Download the MarkLogic Server installation package to your desktop. The latest installation packages are available from http://developer.marklogic.com.
2. Double-click the MarkLogic-8.0-1-x86_64.dmg icon to open the folder that contains the MarkLogic-8.0-1-x86_64.pkg installer. Double-click the installer to start it.
3. The Welcome page displays. Click Continue.
4. In the Select a Destination window, select a destination to install MarkLogic Server, or click Continue to select the default destination.
5. In the Installation Type window, click Install. An Installation window appears that displays the progress of the installation.
6. When the installation Summary window appears, click Close.
7. A MarkLogic control window appears from which you can start/stop MarkLogic Server, open the Admin Interface, and view the Error Log.

The following table shows the installation directory (<marklogic-dir>) and the default data directory for each platform:

Platform        Installation Directory         Default Data Directory (for configuration and log files)
Windows         c:\Program Files\MarkLogic\    c:\Program Files\MarkLogic\Data
Red Hat Linux   /opt/MarkLogic                 /var/opt/MarkLogic
Sun Solaris     /opt/MARKlogic                 /var/opt/MARKlogic
Mac OS X        ~/Library/MarkLogic            ~/Library/Application Support/MarkLogic/Data

The default forest directory is the same as the default data directory if the optional data directory is not specified during forest creation. On UNIX platforms, if you want MarkLogic Server to use another location for its default data directory, make your data directory (/var/opt/MarkLogic on Linux and /var/opt/MARKlogic on Solaris) a soft link to the alternate location.

Starting MarkLogic Server:
MarkLogic Server will automatically start when the computer reboots. To start MarkLogic Server without rebooting, perform the following for the platform on which you are running:

Windows
Select Start > Programs > MarkLogic Server > Start MarkLogic Server. When you start MarkLogic Server from the Start menu, the Windows service configuration for MarkLogic Server is set to start automatically. Also, if you are using Windows Vista or Windows 7, to start the service you must right-click the Start MarkLogic Server link in the Start menu and choose Run as Administrator, then choose to allow the action.

Red Hat Linux
As the root user, enter the following command:
     /etc/init.d/MarkLogic start

Sun Solaris
As the root user, enter the following command:
     /etc/init.d/MarkLogic start

Mac OS X
Select System Preferences > MarkLogic to open the MarkLogic control window. Click Start MarkLogic Server.

This starts all of the App Servers that are configured on your MarkLogic Server.

Configuring the First and Subsequent Hosts:
The configuration procedures differ depending on whether you run MarkLogic Server in a cluster configuration or on a single host. The procedures are as follows:
• Configuring a Single Host or the First Host in a Cluster
• Configuring an Additional Host in a Cluster
• Leaving a Cluster and Becoming a Single Host

If you are configuring MarkLogic Server as a standalone host, or if this is the first host in a cluster configuration, follow the instructions in Configuring a Single Host or the First Host in a Cluster. Otherwise, follow the instructions in Configuring an Additional Host in a Cluster. If you are upgrading a cluster to a new release, see Upgrading a Cluster to a New Maintenance Release of MarkLogic Server in the Scalability, Availability, and Failover Guide. The security database and the schemas database must be on the same host, and that host should be the first host you upgrade when upgrading a cluster.

Configuring a Single Host or the First Host in a Cluster
To configure this installation as a single host, or as the first host in a cluster, perform the following steps:
1. Install MarkLogic and start MarkLogic as described in Installing MarkLogic Server and Starting MarkLogic Server.
2. Log into the Admin Interface in a browser. It is on port 8001 of the host on which MarkLogic is running (for example, on the localhost, http://localhost:8001). The Server Install page appears.
3. Click OK to continue.
4. Wait for the server to restart.
5. After the server restarts, you will be prompted to join a cluster.
6. Click Skip.
7. You will be prompted to create an admin user. Enter the login name and password for the admin user.
8. Click OK.
9. You will be prompted to log in with your admin username and password. You will now see the Admin Interface.

Configuring an Additional Host in a Cluster
All hosts in a cluster have to be on the same platform. To configure this installation as an additional host in a cluster of the same platform, perform the following steps:
1. On the node you want to add to an existing cluster, install MarkLogic and start MarkLogic, as described in Installing MarkLogic Server and Starting MarkLogic Server.
2. Log into the Admin Interface in a browser. It is on port 8001 of the host on which MarkLogic is running (for example, on the localhost, http://localhost:8001). The Server Install page appears.
3. Click OK to continue.
4. Wait for the server to restart.
5. After the server restarts, you will be prompted to join a cluster.
6. Enter the DNS name or the IP address of one of the machines in the cluster. For instance, if this is the second host you are installing, you can enter the DNS name of the first host you installed.
7. Click OK.
8. You will be prompted for an admin username and password. You can use the admin username and password you created when installing the first host. Click OK.
9. Select a Group to which to assign this host. Click OK.
10. Click OK to confirm that you are joining the cluster.
11. You have now joined the cluster.
12. Click OK to transfer the cluster configuration information. You have completed the process to join a cluster and will now see the Admin Interface.

Leaving a Cluster and Becoming a Single Host
If your host is currently in a cluster of multiple hosts, and you would like to leave the cluster and switch to a single host environment, follow the steps in this section. A host cannot leave a cluster if there are still forests assigned to it or if it has any foreign clusters associated with it; you must delete all forests assigned to the host and de-couple any clusters associated with the host before you can leave the cluster.
However, you can delete only the configuration for a forest; the forest data then remains on the filesystem, allowing you to add the forest back to the host after changing the configuration. For instructions on adding a forest to a host, see the Administrator's Guide.

Perform the following steps to leave the cluster to which a host is connected:
1. Run the Admin Interface from the host you want to remove from the cluster.
2. Click the Hosts icon in the left menu tree. The Host Summary page appears.
3. Click the name of the host you want to remove from the cluster, either from the left menu tree or from the Host Summary page. The Host Configuration page appears. The Leave button only appears if the Admin Interface is running from this host.
4. Click the Leave button.
5. Click OK to confirm leaving the cluster.
6. The host restarts to load the new configuration.
7. Follow the instructions in sections 'Configuring a Single Host or the First Host in a Cluster' or 'Configuring an Additional Host in a Cluster' as appropriate.

Entering a License Key
MarkLogic will run without a license key, but after installing MarkLogic you should enter a valid key for what you are licensed for. At any time, you can change the license key for a host from the Host Status page. You might need to change the license key if your license key expires, if you need to use some features that are not covered in your existing license key, if you upgrade your hardware with more CPUs and/or more cores, if you need a license that covers a larger database, if you require different languages, or for various other reasons. Changing the license key sometimes results in an automatic restart of MarkLogic (for example, if your new license enables a new language).

To change the license key for a host, perform the following steps using the Admin Interface:
1. Click the Hosts icon on the left tree menu.
2. Click the name of the host for which you want to change the license key, either on the tree menu or the summary page. The Host Configuration page appears.
3. Click the Status tab. The Host Status page appears.
4. Click the License Key button. The License Key Entry page appears.
5. Enter your new license key information. For information about licensing of MarkLogic Server, contact your MarkLogic sales representative.
6. After entering valid information in the Licensee and License Key fields, click OK. If it needs to, MarkLogic will automatically restart, and the new license key will take effect.

Checking for the Correct Software Version
After logging in with your admin username and password, the Admin Interface appears. In the left corner of the Admin Interface, the version number and product edition are displayed. To view more details about the release of MarkLogic Server that is installed and licensed, complete the following steps:
1. Click the Hosts icon on the left tree menu.
2. Select the name of the host you just installed, either from the left menu tree or from the Host Summary page.
3. Click the Status tab. The Host Status page appears.
4. Check that <version> is correct.

To begin using MarkLogic Server, see the following document:
• Getting Started With MarkLogic Server
Otherwise, you are finished with the Admin Interface for now. You have successfully installed MarkLogic on your system.

Configuring MarkLogic Server on UNIX Systems to Run as a Non-daemon User
On UNIX-based systems (Linux and Solaris), MarkLogic runs as the UNIX user named daemon. This section describes how to change a configuration to run as a different named UNIX user. This procedure must be run by the root user. Additionally, the root user is still required for installing and uninstalling MarkLogic and for starting and stopping MarkLogic from the startup scripts.

To modify an installation to run as a user other than daemon, perform the following steps:
1. In a command window on the machine on which you installed MarkLogic, log in as the root user.
2. Make sure MarkLogic is stopped. If it is still running, stop it as follows:
   Red Hat Linux: as the root user, enter the following command:
     /etc/init.d/MarkLogic stop
   Sun Solaris: as the root user, enter the following command:
     /etc/init.d/MarkLogic stop
3. Edit the configuration file for your platform using a text editor such as vi.
   Platform        Configuration File to Edit
   Red Hat Linux   /etc/sysconfig/MarkLogic
   Sun Solaris     /etc/MarkLogic.conf
4. In the file, edit the MARKLOGIC_USER environment variable to name the user as which you want MarkLogic Server to run. For example, if you want it to run as a user named raymond, change the following line:
     MARKLOGIC_USER=daemon
   to the following:
     MARKLOGIC_USER=raymond
5. Save the changes to the /etc/sysconfig/MarkLogic or /etc/MarkLogic.conf file.
6. If you have not yet started MarkLogic after performing a clean installation (that is, after installing into a directory where MarkLogic has never been installed), then you are done and you can skip the rest of the steps in this procedure. If you have an existing installation (for example, if you are upgrading to a maintenance release), then continue with the following steps.
7. For all of the MarkLogic files owned by daemon, you need to change the owner to the new user. This includes all forest data and all of the configuration files. By default, the forest data is in the following directories:
   Platform        Default Data Directory (for configuration and log files, and default forest directory)
   Red Hat Linux   /var/opt/MarkLogic
   Sun Solaris     /var/opt/MARKlogic
   For example, on a Linux system, perform a command similar to the following, which changes the owner to the user specified earlier in the /etc/sysconfig/MarkLogic file:
     chown -R raymond /var/opt/MarkLogic
   Make sure to change the owner for all forests in the system; otherwise forests will fail to mount upon startup. Note that the above command only changes the owner for forests installed in the default directory. You need to run a similar command on the data directory of each forest for which a data directory is specified.
8. When you have completed all the file and directory ownership changes, start MarkLogic as described in Starting MarkLogic Server.

Once you have performed this procedure, all new files created by MarkLogic are created with the new user ownership; there will be no need to change any ownership again. The configuration changes you made to the startup scripts need to be merged in during any upgrade of MarkLogic (because the installation installs a new version of the startup scripts). Under Linux, the uninstallation process saves an old version of the scripts (for example, /etc/sysconfig/MarkLogic.rpmsave), so you can use that version to merge in your changes. If you perform a clean installation (not an upgrade installation), however, you will need to run this entire procedure again.

Removing MarkLogic Server
To remove MarkLogic from your system, complete the following steps:
1. Stop MarkLogic by performing the following action for the platform on which you are running:
   Windows: select Start > Programs > MarkLogic Server > Stop MarkLogic Server. If you are using Windows Vista or Windows 7, to stop the service you must right-click the Stop MarkLogic Server link in the Start menu and choose Run as Administrator, then choose to allow the action.
   Red Hat Linux: as the root user, enter the following command:
     /etc/init.d/MarkLogic stop
   Sun Solaris: as the root user, enter the following command:
     /etc/init.d/MarkLogic stop
   Mac OS X: select System Preferences > MarkLogic to open the MarkLogic control window. Click Stop MarkLogic Server.
2. Once the server is stopped, uninstall the MarkLogic package by performing the following action for the platform on which you are running:
   Windows: use the Add/Remove Programs Control Panel to uninstall MarkLogic.
   Red Hat Linux: as the root user, enter the following command:
     rpm -e MarkLogic
   Sun Solaris: as the root user, enter the following command:
     pkgrm MARKlogic
   Mac OS X: no action is necessary when upgrading. If you want to remove the user data and do a fresh install, then remove the following directory:
     ~/Library/Application Support/MarkLogic/Data
   To entirely remove MarkLogic, remove the following directories:
     ~/Library/MarkLogic
     ~/Library/Application Support/MarkLogic
     ~/Library/StartupItems/MarkLogic
     ~/Library/PreferencePanes/MarkLogic.prefPane
   To make Mac OS X completely forget it ever had a MarkLogic installation, run the following command from a terminal window:
     sudo pkgutil --forget com.marklogic.server

Using this procedure to remove MarkLogic from your system will not remove user data (configuration information, XQuery files used by HTTP or XDBC servers, or forest content). This data is left in place to simplify the software upgrade process. If you wish to remove the user data, you must do so manually using standard operating system commands.

Database Definition and Manipulation:

Creating a New Database
Perform the following steps to create a new database:
1. Click the Databases icon in the left tree menu.
2. Click the Create tab at the top right. The Create Database page displays.
3. Enter the name of the database. This is the name the system will use to refer to this database.
4. Select a security database to be associated with this database. We recommend selecting Security as the security database.
5. Select a schema database to be associated with this database.
6. You may leave the rest of the parameters unchanged or set them according to your needs.
7. Click OK. Your database is now created.
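The same task can also be scripted. The following is a minimal sketch in Python that builds the HTTP request for creating a database through MarkLogic's Management REST API. It rests on several assumptions that are not part of the procedure above: that the Manage App Server listens on port 8002, that the host is localhost, that the database is called oscars (a hypothetical name), and that the schema database is the default one named Schemas. The helper names are our own.

```python
import json
import urllib.request


def build_create_database_request(host, name, security_db="Security", schema_db="Schemas"):
    """Build (but do not send) a request that asks the Management API for a new database."""
    payload = {
        "database-name": name,
        "security-database": security_db,
        "schema-database": schema_db,
    }
    return urllib.request.Request(
        url=f"http://{host}:8002/manage/v2/databases",  # assumed Manage App Server port
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, user, password):
    """Send a request using HTTP digest authentication and return the status code."""
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_manager.add_password(None, request.full_url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_manager)
    )
    with opener.open(request) as response:
        return response.status


# "oscars" is a hypothetical database name used only for illustration.
request = build_create_database_request("localhost", "oscars")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The script only prints the request it would make. To run it against a live server you would call send(request, user, password) with the admin credentials created during installation.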
You can now attach forests to the database. Creating a database is a "hot" admin task.

Attaching and/or Detaching Forests to/from a Database
In order to query content in a forest, the forest must be attached to a database. Forests can be moved from one database to another (detached from one database and attached to another). Detaching a forest from a database does not delete the forest; the forest remains on the host on which it was created with the data intact. However, before you attach the forest to another database, ensure that the new database has the same configuration as the old database. If the configuration of the new database is different and the reindexer enable setting is set to true on the new database, the forest will begin reindexing to match the database configuration as soon as it is attached.

Perform the following steps using the Admin Interface to attach or detach one or more forests to a database:
1. Click the database to which you want to attach forests.
2. Click the Forests icon for the database. The Database Forest Configuration page appears.
3. Check the box corresponding to the forest(s) you want to attach to the database. You can also uncheck forests you want to detach from the database.
4. Click OK. The forests you attached or detached are now reflected in the database configuration.
Attaching and detaching a forest to a database are "hot" admin tasks.

Viewing Database Settings
To view the settings for a particular database, perform the following steps:
1. Click the Databases icon on the left tree menu.
2. Locate the database for which you want to view settings, either in the tree menu or in the Database Summary table.
3. Click the name of the database for which you want to view the settings.
4. View the settings.
5. Click Forests, Triggers, Content Processing, Fragment Roots, Fragment Parents, Element-Word-Query-Throughs, Phrase-Throughs, Phrase-Arounds, Element Indexes and Attribute Indexes to view settings specific to those aspects of the database.

Loading Documents into a Database
You can use the Admin Interface to load documents into the database. The documents will be loaded with the default permissions and added to the default collections of the user with which you logged into the Admin Interface. To load a set of documents into a database, perform the following steps:
1. Click the Databases icon on the left tree menu.
2. Click the database into which you want to load the documents.
3. Click the Load tab near the top right.
4. Enter the name of the directory in which the documents are located. This directory must be accessible by the host from which the Admin Interface is currently running.
5. Enter a filter for the names of the documents to be loaded (for example, *.xml to load all files with an xml extension). For an exact match, enter the full name of the document.
6. Click OK to proceed.
7. The load confirmation screen lists all documents in the specified directory matching the specified filter. Click OK to complete the load.
The documents are loaded into the database. The URI path of each document is the same as its filesystem path.

Merging a Database
You can merge all of the forest data in the database using the Admin Interface. The Merge button allows you to explicitly merge the forest data for this database. To explicitly merge the database, complete the following procedure:
1. Click the Databases icon on the left tree menu.
2. Decide which database you want to merge.
3. Click the database name, either on the tree menu or the summary page. The Database Configuration page displays.
4. Click the Merge button on the Database Configuration page. A confirmation message displays.
5. Confirm that you want to merge the forest data in this database and click OK.
Merging data in a database is a "hot" admin task; the changes take effect immediately.

Reindexing a Database
You can reindex all of the document data in the database using the Admin Interface. The reindex operation sets the reindexer timestamp to the current system timestamp, which causes a reindex and refragment operation on all fragments in the database that have a timestamp equal to or less than that timestamp (assuming reindexer enable is set to true). The Reindex button forces a complete reindex/refragment operation on the database. To reindex the database, complete the following procedure:
1. Click the Databases icon on the left tree menu.
2. Decide which database you want to reindex.
3. Click the database name, either on the tree menu or the summary page. The Database Configuration page displays.
4. Click the Reindex button on the Database Configuration page. A confirmation message displays.
5. Confirm that you want to reindex this database and click OK.
Reindexing data in a database is a "hot" admin task; the changes take effect immediately.

Clearing a Database
You can clear all of the forest content from the database using the Admin Interface. Clearing a database deletes all of the content from all of the forests in the database, but leaves the database configuration intact. To clear all data from a database, complete the following procedure:
1. Click the Databases icon on the left tree menu.
2. Decide which database you want to clear.
3. Click the database name, either on the tree menu or the summary page. The Database Configuration page displays.
4. Click the Clear button on the Database Configuration page. A confirmation message displays.
5. Confirm that you want to clear the forest data from this database and click OK.
Clearing a database is a "hot" admin task; the changes take effect immediately.
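The merge, reindex, and clear procedures above follow the same pattern, so they lend themselves to scripting. The sketch below, in Python, builds the corresponding requests for MarkLogic's Management REST API without sending them. The port 8002, the host localhost, the database name oscars, and the mapping from Admin Interface buttons to operation names are assumptions made for illustration and should be checked against the REST API reference for your release.

```python
import json
import urllib.request

# Admin Interface button -> operation name assumed for the Management REST API.
OPERATIONS = {
    "merge": "merge-database",
    "reindex": "reindex-database",
    "clear": "clear-database",
}


def build_operation_request(host, database, action):
    """Build (but do not send) a request that runs one maintenance operation."""
    if action not in OPERATIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return urllib.request.Request(
        url=f"http://{host}:8002/manage/v2/databases/{database}",  # assumed port
        data=json.dumps({"operation": OPERATIONS[action]}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "oscars" is a hypothetical database name used only for illustration.
for action in ("merge", "reindex", "clear"):
    request = build_operation_request("localhost", "oscars", action)
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Sending these requests would require authenticating as an admin user. As in the Admin Interface, a clear operation removes all content from the forests of the database, so it deserves the same confirmation step in any script.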
Deleting a Database

A database cannot be deleted if there are any HTTP, WebDAV, or XDBC servers that refer to the database. Deleting a database detaches the forests that are attached to it, but does not delete them. The forests remain on the hosts on which they were created, with the data intact. Perform the following steps to delete a database:

1. Click the Databases icon on the left tree menu.
2. Locate the database you want to delete, either in the tree menu or in the Database Summary table.
3. Click the name of the database that you want to delete.
4. Click the Delete button near the top right.
   Note: Clicking the Clear button clears all of the forests attached to this database, removing all of the data from the forests. Clicking the Delete button removes the database configuration, but does not delete the data stored in the forests.
5. Assuming that no HTTP, WebDAV, or XDBC servers refer to the database, a delete confirmation screen appears. Click OK. The database is now permanently deleted.

Deleting a database is a "hot" admin task.

Application: Oscar Search Application using Application Builder and its Snapshots

Overview of Application Builder

Using Application Builder requires no coding on your part. Its user interface is easy to use, while its search applications can have many high-end search features such as a search box with Google-style search grammar, search suggestions, faceted navigation, and results visualization widgets. It scales to huge database sizes while maintaining its speed. The generated application uses the Search API and can be used as is or customized with your own code. You can define many aspects of an application, such as:

- Facets
- Details appearing on the search result page
- Content display control via item rendering
- Visualization widgets for search results

Typically, building an application is an iterative process.
To begin, you must have a representative content set loaded in a database with any needed indexes already set up. If your content is not complete or not completely indexed, you can still generate an application and modify it as you modify your content.

Setting Up and Starting Application Services

Application Builder is bundled with MarkLogic Server Application Services. On a fresh installation of MarkLogic 8, Application Builder is preconfigured and ready to use. For an upgrade installation, your existing application data remains intact, although some renaming of your Application database and App Server may occur during the installation process. This section describes the following scenarios:

- Clean Installation
- Starting Application Services

Clean Installation

When you install MarkLogic Server for the first time, the installation process does the following:

- Creates an HTTP App Server named App-Services on port 8000 for Application Services
- Creates a database named App-Services to store the Application Builder application documents

Starting Application Services

To start Application Services, open a browser and go to your server's port 8000. For example, if your browser runs on the same machine as MarkLogic Server, open the following URL:

http://localhost:8000/appservices

When MarkLogic Server prompts you for a username and password, enter them for a user with either the admin or app-builder role.

Building the Oscars Application

Application Builder includes a template to build a sample application based on Oscar awards data from Wikipedia. To build this application, go through the Application Builder wizard as follows:

1. Start Application Builder by going to the following URL (if MarkLogic Server is installed on a different host or your App Server uses a different port, substitute those values): http://localhost:8000/appservices
2. On the Application Builder Applications screen, click New Example Application.
3. The New Example Application screen appears.
   a. Enter a name for the application, in this case Oscars.
   b. Select New Database and enter a database name, in this case Oscars.
   c. Click Create Application.
4. Application Builder creates an Oscars App Server, forest, and database and then displays the Search tab page.
5. On the Search page, you can accept the default search constraints and facets or change the settings.
6. Click the next button at the upper right to go to the Assemble tab page.
7. On the Assemble page, you can accept the default widgets and layout or change the settings.
8. Click the next button at the upper right to go to the Results tab page.
9. On the Results page, you can accept the defaults for the contents of an individual search result or change the settings.
10. Click the next button to move to the Sort tab page. The Sort tab page displays.
11. On the Sort page, you can accept the default search results ordering(s) or change the settings.
12. Click the next button to go to the Content tab page. On the Content page, you can control how the application renders content as XHTML for web browsers.
13. Click the next button to move to the Appearance tab page. On the Appearance page, you can specify your application's title and overall look and feel.
14. Click the next button to go to the Deploy tab page. The Deploy page appears.
15. On the Deploy page, select New App Server. (You can only select Existing App Server if an App Server is already configured for this application.) Accept the default values or provide the App Server with a name and port number.
16. Click the Deploy button and confirm. Application Builder creates and configures the new App Server and opens a new window where it launches the new application. This may take a short while. When Application Builder prompts you to log in, enter a username and password and click OK.
17. Test the Oscars application by entering search terms or clicking the browse links to narrow the displayed results.
Loading the Complete Set of Oscars Data

Initially, Application Builder only loads a few sample data files for use by the Oscars application. To load the full 20 MB content set, use the following steps:

1. In the Information Studio Flows section of the Application Services page, click New Flow. The Flow Editor appears.
2. Click Change Collector at the bottom of the Collect section.
3. In the Select A Collector window, select Oscars Example Data Loader.
4. In the Configure Settings window, select Done. Do not make any changes in this window.
5. The Collect section of the Flow now shows that the Oscars Example Data Loader is configured and the URL from which it will download the data.
6. In the Load section, select oscars as the Destination Database and click Document Settings.
7. In the Document Settings window, change the URI Structure Configuration to /oscars/{$filename}{$dot-ext} and click Done.
8. Click Start Loading to load the content into the database. The data downloads automatically over your Internet connection, and a spinning icon appears until the load is complete. When the load is done, you will see different count values and additional facet values.

Using the Oscars Sample Application

The Oscars sample application enables you to search, browse, and display articles about Oscar award winners from the last nine decades. It uses the Search API's standard features, including query text parsing, faceted navigation, snippeting, and many more. While you can learn about the application by playing with it, this section highlights some of its main features, including:

- Keyword Searching, Search Suggestions, and Parsing
- Browsing with Facets
- Search Result Page
- Displaying Content Details

Keyword Searching, Search Suggestions, and Parsing

You can enter keywords into the search box and press return to search the database. For example, a search for raymond shows snippets for the first 10 of 37 results. You can also search using constraints.
For example, the following query text finds everything about the actor Dustin Hoffman:

actor:"Dustin Hoffman"

This is not a standard full-text search but a constraint: it matches all documents in which a particular element in the source XML, <actor>, has the content Dustin Hoffman. You can combine the constraint with other terms to further narrow the results:

buck actor:"Dustin Hoffman"

When you click any link in the user interface, notice that the query text in the search box shows the current query.

Browsing with Facets

The left side of the application shows facets used to browse through the content. When you click a facet, it narrows the overall search results to those results in the facet's category, while keeping the existing categories or search terms active. For example, if you first click the Award:Best Director browse link, then the Decade:1970s link, and then the Winners:True link, your results are all of the 1970s winners of the Best Director award. Each of the browse facets shows a count of how many of its results match your current query.

Search Result Page

The search result page shows a link with a text summary of the content, highlighted snippets of the content matching your search, and other information about the search match. Clicking the result link takes you to the content details.

Displaying Content Details

The content details page includes the complete content for the search result. The rendering is based on the configuration specified on Application Builder's Content Display page. The page's style is based on the skin you chose and on any custom CSS entered on the Appearance page.

Using Application Builder to Modify the Oscars Sample Application

This section describes how to add a year facet to the Oscars application. With the year facet, you can first drill down on results with the decade facet, then drill down further using the year facet. The year facet uses the same index as the decade facet.
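The facet drill-down described in this section can be pictured with a small in-memory sketch. This is only an illustration of the idea, not how MarkLogic or the Search API implements facets: the records, field names, and helper functions below are all hypothetical. It shows a decade value and a year value both derived from the same underlying year, each selection narrowing the result set while earlier selections stay active, and a count per facet value for the current results.

```python
from collections import Counter

# Hypothetical records standing in for Oscar documents (illustrative only).
NOMINATIONS = [
    {"name": "Nominee A", "award": "Best Director", "year": 1972, "winner": True},
    {"name": "Nominee B", "award": "Best Director", "year": 1972, "winner": False},
    {"name": "Nominee C", "award": "Best Director", "year": 1975, "winner": True},
    {"name": "Nominee D", "award": "Best Actor", "year": 1975, "winner": True},
    {"name": "Nominee E", "award": "Best Director", "year": 1981, "winner": True},
]


def facet_value(record, facet):
    """Return a record's value for a facet; decade is derived from year."""
    if facet == "decade":
        return f"{record['year'] // 10 * 10}s"
    return record[facet]


def narrow(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(facet_value(r, facet) == value for facet, value in selected.items())
    ]


def facet_counts(records, facet):
    """Count how many of the current results fall under each facet value."""
    return Counter(facet_value(r, facet) for r in records)


# Drill down: decade first, then inspect the year counts inside that decade.
seventies = narrow(NOMINATIONS, decade="1970s")
print(sorted(facet_counts(seventies, "year").items()))

# Combine several facets, as in the Best Director / 1970s / winners example.
winners = narrow(NOMINATIONS, award="Best Director", decade="1970s", winner=True)
print([r["name"] for r in winners])
```

Because every selection is applied as an additional filter, choosing a year after a decade can only shrink the result set, which matches the drill-down behaviour described above.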
To create a year facet, do the following:

1. Start Application Builder (for example, open http://localhost:8000/appservices in a browser).
2. On the Application Builder Applications page, click the application name that you used for your Oscars sample application (in this case, Oscars).
3. Click the Search tab. At the bottom of the Search page, click Add New.
4. In the New Constraint dialog, click Range.
   a. Enter year for the Name and select year for the Source Index.
   b. Click Create Range Constraint. Application Builder creates the constraint.
5. From the Oscars pull-down menu, click Deploy Now.
6. Application Builder compiles and deploys the new application code to your App Server's modules database. During deployment, a status page appears in a new window. When Application Builder is done, the newly modified application, including the new year facet, replaces the status page.

Test the facet by doing a search, selecting a decade, and then selecting a year to find the results for a single year from that decade.

Conclusion:

MarkLogic is designed to handle the volume, variety, and velocity of Big Data like other NoSQL solutions, and it has the enterprise features that made last-generation relational databases so reliable. MarkLogic also gives a way to manage hierarchical content, distributed graph data, and XML and RDF in the same database. MarkLogic has dramatically accelerated the deployment of products and services while greatly reducing the costs of content loading and design, translating into even faster research cycles and clinical diagnoses, thanks to a new generation of solutions that help professionals find exactly the information they need, when they need it most.
