Graph Database - Neo4j
ISQS3358, Spring 2016
Graph Database
• A graph database is a database that uses
graph structures for semantic queries with
nodes, edges and properties to represent
and store data.
• Graph databases employ nodes, properties,
and edges.
• Nodes represent entities such as people, businesses, accounts, or
any other item you might want to keep track of.
• Properties are pertinent information that relate to nodes.
• Edges are the lines that connect nodes to nodes, or nodes to
properties and they represent the relationship between the two.
• Most of the important information is stored in the edges.
https://en.wikipedia.org/wiki/Graph_database
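To make the node / property / edge vocabulary concrete, here is a minimal sketch in Python. It is not how Neo4j stores data; the class names, the sample entities and the `neighbours` helper are all hypothetical and only illustrate the model described above.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """An entity such as a person, business or account."""
    label: str
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    """A directed, typed relationship between two nodes."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)

# Hypothetical sample data, invented for illustration only.
alice = Node("Person", {"name": "Alice"})
acme = Node("Business", {"name": "Acme"})
account = Node("Account", {"number": "12-345"})

edges = [
    Edge(alice, acme, "WORKS_AT", {"since": 2014}),
    Edge(alice, account, "OWNS"),
]

def neighbours(node, rel_type):
    """Follow all outgoing edges of the given type from a node."""
    return [e.end for e in edges if e.start is node and e.rel_type == rel_type]
```

Note that the question "where does Alice work?" is answered entirely by following an edge, which is the sense in which the important information lives in the edges.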
Graph Database
• What are graph databases & When to use a graph database,
3’54”, https://www.youtube.com/watch?v=Zg4EWwgADLk
• Graph database case – money laundering, 3’26”
https://www.youtube.com/watch?v=41qdmKIIMz0
• Graph databases:
  • Neo4J, 5’11”
  • Titan, 4’51”
  • GraphX
• Use Cases for Neo4j
Neo4j
About Neo4j
Introduced in 2010
Open Source tool
Java-based
Graph database
Neo4j is a database designed for network-oriented
data
It uses Cypher as its graph query language
Neo4j, the Graph Database
•A Graph Database:
• A Property Graph contains Nodes and Relationships, with
Properties on both
• Perfect for highly connected data
•A Graph Database:
• A declarative query language, called Cypher
• Scalable: could have a social network of multiple earths
• High-performance and reliability with High-availability
Neo4J Model
Neo4j Storage Record Layout
Traversals – how do they work?
•Relationship Expanders: given (a path to) a node, returns
Relationships to continue traversing from that node
•Evaluators: given (a path to) a node, returns whether to:
• Continue traversing on that branch (i.e. expand) or not
• Include (the path to) the node in the result set or not
•Then a projection to Path, Node or Relationship applied to
each path in the result set
•Uniqueness level: policy for when it is ok to revisit a node that
has already been visited
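The four pieces above (expander, evaluator, projection, uniqueness) can be sketched in a few lines of Python. This is a toy illustration, not Neo4j's Java traversal API: the graph, the function names and the `node_global_unique` flag are assumptions made for this example.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relationship_type, neighbour).
GRAPH = {
    "alice": [("KNOWS", "bob"), ("KNOWS", "carol")],
    "bob": [("KNOWS", "dave"), ("WORKS_AT", "acme")],
    "carol": [("KNOWS", "dave")],
    "dave": [],
    "acme": [],
}

def knows_expander(path):
    """Relationship expander: relationships to continue on from the path's end node."""
    return [(rel, nxt) for rel, nxt in GRAPH[path[-1]] if rel == "KNOWS"]

def depth_evaluator(path, max_depth=2):
    """Evaluator: returns (include path in result?, continue expanding?)."""
    depth = len(path) - 1
    return depth >= 1, depth < max_depth

def traverse(start, expander, evaluator, node_global_unique=True):
    """Breadth-first traversal returning the included paths."""
    visited = {start}
    queue = deque([[start]])
    results = []
    while queue:
        path = queue.popleft()
        include, expand = evaluator(path)
        if include:
            results.append(path)
        if not expand:
            continue
        for _rel, nxt in expander(path):
            if node_global_unique:
                # Uniqueness policy: never revisit a node anywhere in the traversal.
                if nxt in visited:
                    continue
                visited.add(nxt)
            elif nxt in path:
                # Looser policy: a node may not repeat within a single path.
                continue
            queue.append(path + [nxt])
    return results

paths = traverse("alice", knows_expander, depth_evaluator)
end_nodes = [p[-1] for p in paths]  # projection of each path to a Node
```

With the stricter uniqueness policy, "dave" is reached only once (via "bob"); with `node_global_unique=False` the path through "carol" is returned as well, which shows how the uniqueness level changes the result set.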
Cypher - Just convenient traversal descriptions?
•Builds on the same infrastructure as Traversals (Expanders),
but not on the full Traversal system
•Uses graph pattern matching for traversing the
graph
• Recursive matching with backtracking
START x=... matching x-->y, x-->z, y-->z, z-->a-->b, z-->b
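The idea of recursive matching with backtracking can be sketched as follows. This is an illustrative Python toy, not Cypher's actual matcher: the edge set and the choice of node 1 as the starting point for x are hypothetical, and the sketch does not enforce Cypher's rules about reusing relationships. The pattern list encodes the six arrows of the pattern above.

```python
# Hypothetical directed graph as a set of (source, target) edges.
EDGES = {
    (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5),
    (5, 1),  # an extra edge that takes part in no match
}

# x-->y, x-->z, y-->z, z-->a-->b, z-->b written as one (from, to) pair per arrow.
PATTERN = [("x", "y"), ("x", "z"), ("y", "z"), ("z", "a"), ("a", "b"), ("z", "b")]

def match(pattern, binding):
    """Yield every variable binding that satisfies all remaining pattern edges.

    Assumes the two variables of a pattern edge are distinct.
    """
    if not pattern:
        yield dict(binding)  # every arrow matched: emit a copy of the binding
        return
    (src_var, dst_var), rest = pattern[0], pattern[1:]
    for src, dst in EDGES:
        # An already-bound variable must agree with this candidate edge.
        if binding.get(src_var, src) != src or binding.get(dst_var, dst) != dst:
            continue
        added = [v for v in (src_var, dst_var) if v not in binding]
        binding[src_var], binding[dst_var] = src, dst
        yield from match(rest, binding)  # recurse on the remaining arrows
        for v in added:
            del binding[v]  # backtrack: undo only what this step bound

# Start from a fixed node for x (node 1 is an arbitrary choice for this example).
matches = list(match(PATTERN, {"x": 1}))
```

Each recursive call tries to extend the current partial binding with one more arrow; when no edge fits, control returns to the caller, which undoes its own bindings and tries the next candidate.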
Neo4j Adoption
Benefits of using Neo4J
•Organizes data in Networks
•Representation is natural and intuitive
•High performance traversal over domain data
•Captures semi-structured data easily, which is
awkward to do in a relational database
•Encourages agile methodologies
•Lower maintenance costs
•Shorter development times
Drawbacks
•Since Neo4j utilizes a navigational model, it is
hard to execute arbitrary queries
• Ex: “How many of my customers over age 25 with a last name
that starts with an F have purchased items in the last two
months?”
•Lacks tool and framework support
From SQL to Cypher
•Cypher queries end with a RETURN clause,
rather than beginning with what you want to return as
in SQL
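A small side-by-side may help. The sketch below runs the SQL version against an in-memory SQLite table and keeps the Cypher version as a plain string for comparison only (it is not executed here). The `person` table, the `Person` label and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", 31), ("Bob", 22), ("Cy", 40)],
)

# SQL: what to return comes first, then the source and the filter.
sql = "SELECT name FROM person WHERE age > 25"
rows = sorted(r[0] for r in conn.execute(sql))

# Cypher: describe the pattern and the filter first, RETURN comes last.
# Shown for comparison only; running it would need a Neo4j server.
cypher = "MATCH (p:Person) WHERE p.age > 25 RETURN p.name"
```

Reading the two strings left to right shows the difference in clause order: SQL opens with the projection, while Cypher closes with it.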
Where is Neo4j used?
•Master Data Management
•Network and Data Centre
•Real-Time Recommendations
•Identity and Access Management
•Digital Asset Management
•Fraud Detection
•Social Media
Combining Neo4J and Hadoop
• Hadoop is good for data crunching, but the end results are
flat files, which make it hard to visualize your network data.
• Neo4J is perfect for working with networked data
•Method:
• Prepare data using Hive, whose queries are translated into MapReduce jobs
• The output of the MapReduce jobs is used to create nodes and relationships in
Neo4J
• Make Neo4J’s batch importer read the files from the cluster directly
• Perform necessary steps to describe the nodes, relationships and their
properties.
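The step of turning flat job output into node and relationship descriptions could look roughly like the sketch below. The slides do not specify the file layout, so everything here is an assumption: the tab-separated input, the `Person` label and `KNOWS` type, and the `:ID` / `:LABEL` / `:START_ID` / `:END_ID` / `:TYPE` header names, which follow the convention of Neo4j's CSV import tool and should be checked against the documentation for the version in use.

```python
import csv
import io

# Hypothetical flat-file output of a Hive/MapReduce job:
# one "source<TAB>target" pair per line.
FLAT_OUTPUT = "alice\tbob\nalice\tcarol\nbob\tcarol\n"

def to_import_csv(flat_text, label="Person", rel_type="KNOWS"):
    """Split flat pairs into a nodes CSV and a relationships CSV (as strings)."""
    pairs = [line.split("\t") for line in flat_text.splitlines() if line]
    names = sorted({name for pair in pairs for name in pair})

    # Nodes file: one row per distinct entity, with an ID column and a label.
    nodes = io.StringIO()
    writer = csv.writer(nodes, lineterminator="\n")
    writer.writerow(["name:ID", ":LABEL"])  # assumed header convention
    writer.writerows([name, label] for name in names)

    # Relationships file: one row per pair, referring to nodes by ID.
    rels = io.StringIO()
    writer = csv.writer(rels, lineterminator="\n")
    writer.writerow([":START_ID", ":END_ID", ":TYPE"])  # assumed header convention
    writer.writerows([src, dst, rel_type] for src, dst in pairs)

    return nodes.getvalue(), rels.getvalue()

nodes_csv, rels_csv = to_import_csv(FLAT_OUTPUT)
```

In a real pipeline the same split would more likely be done inside the Hive or MapReduce job itself so that the importer can read the resulting files from the cluster; the Python version only shows the shape of the two outputs.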
Case Study
Demo – Neo4j
Demo..
Install Neo4J
http://neo4j.com/downloadthanks/?edition=community&flavour=winstall64&release=2.3.3&_ga=1.73460564.71683671.1458926380
Big Data Exercises