.NET Database Technologies: Using NoSQL databases

NoSQL: “Not only SQL”
• Alternatives to the ubiquitous relational database which may be superior in specific application scenarios
• Object-oriented databases (ODBMS)
  - They came, they saw, they....
  - ...didn’t conquer, but they are still around
• NoSQL databases
  - The new kids on the block
  - General term applied to a range of different non-relational database systems
  - Largely emerging to meet the needs of large-scale Web 2.0 applications

Object-oriented databases
• ODBMSs use the same data model as object-oriented programming languages
  - No object-relational impedance mismatch, because the model is uniform
• An object database combines the features of an object-oriented language and a DBMS (language binding)
  - Treat data as objects
      • object identity
      • attributes and methods
      • relationships between objects
  - Extensible type hierarchy
      • inheritance, overloading and overriding as well as customised types

ODBMS history
• Object Database Manifesto
  - Paper published in 1989 (Atkinson et al.)
• Some ODBMS products
  - Early 1990s: Gemstone, Objectivity
  - Late 1990s: Versant, ObjectStore, Poet, Matisse
  - 2000s: db4o, Cache
• ODMG (Object Data Management Group)
  - 1993: ODMG 1.0 standard
  - 1997: ODMG 2.0
  - 1999: ODMG 3.0, then ODMG disbanded
  - 2005: ODMG reformed, working towards new standard

ODMG
• Object Database Management Group (ODMG) founded in 1991
  - Standardisation body including all major ODBMS vendors
• Define a standard to increase the portability across different ODBMS products
• Mirroring the SQL standard for RDBMS
  - Object Model
  - Object Definition Language (ODL)
  - Object Query Language (OQL)
  - Language bindings
      • C++, Smalltalk and Java bindings

Characteristics of ODBMS
• Support complex data models with no mapping issues
• Tight integration with an object-oriented programming language (persistent programming language)
• High performance in suitable application scenarios
• Different products scale from small-footprint embedded databases (e.g. db4o) to large-scale, highly concurrent systems (e.g. Versant V/OD)

Persistence patterns and ODBMS
• Some of Fowler’s patterns are specific to the use of a relational database, e.g.
  - Data Mapper
  - Foreign Key Mapping
  - Metadata Mapping
  - Single-table Inheritance, etc.
• Some are not specific to the data storage model and are relevant when using an ODBMS, e.g.
  - Identity Map
  - Unit of Work
  - Repository
  - Lazy Loading

db4o
• Open-source object-database engine
  - Now owned by Versant
  - Complements their own V/OD product
• Can be used in embedded or client-server modes
  - Embed in application simply by including DLLs
• Native object database
  - Stores .NET (or Java) objects directly with no special requirements on classes
  - Other ODBMSs (e.g. V/OD) require classes to be marked as persistent through bytecode manipulation and also store class definitions
  - Tight integration with application, but the trade-off is limited ad hoc querying and reporting
  - Can replicate data to relational database if required

IObjectContainer
• IObjectContainer interface is implemented by objects which provide access to database
  - IObjectContainer is roughly equivalent to EF ObjectContext
  - Unit of Work pattern if transparent persistence is enabled (see later)
• Can access DB in embedded mode (direct file access) or client-server mode (local or remote)
  - IObjectServer instance required in client-server mode
• IObjectContainer instances created by factory classes, e.g. Db4oEmbedded
• Queries on IObjectContainer return IObjectSet (except LINQ queries)

Viewing data and ad hoc querying
• ObjectManager Enterprise
  - Visual Studio plug-in
  - Browsing and drag-and-drop queries
• LINQPad
  - Need to include db4o DLLs and namespaces for stored classes
  - Executes LINQ queries and visualises results

db4o query APIs
• Query-by-example (QBE)
  - Very limited: no comparisons, ranges, etc.
• Simple Object Data Access (SODA)
  - Build query by navigating graph and adding constraints to nodes
• Native Queries
  - Expressed completely in programming language
  - Type-safe
  - Optimised to SODA query at runtime if possible
• LINQ
  - .NET version only, not available in Java

Activation
• Objects are stored in DB as an object graph
• If db4o is configured to cascade-on-activate (eager loading) then retrieving one object could potentially load a large number of related objects
• Fixed activation depth limits depth of traversal of graph when retrieving objects
  - Default value is 5
• Can then explicitly activate related objects when needed
• Lazy loading can be configured with transparent activation
• Classes need to be “instrumented” at build time by running Db4oTool.exe
  - Code injected into assembly so that classes implement IActivatable interface

Update depth
• Similar considerations apply to updates
• Storing an updated object could cause unnecessary updates to related objects
• Fixed update depth limits depth of traversal of graph when storing objects
  - Default value is 1
• Can configure transparent persistence which allows changes to be tracked
  - Only changed objects are updated in database
  - Behaves like change tracking in, for example, Entity Framework
  - Unit of Work

PI (persistence ignorance)?
• Stores POCOs without any need for mapping, so yes
• Transparent Activation requires that classes implement a specific interface
• But this is done at build time so domain classes don’t need any specific code
• Has parallels with dynamic proxies in ORMs:
  - Objects are instances of the domain classes themselves, which have been modified ‘under the hood’ at build time
  - Compare with dynamic proxy classes, which derive from domain classes and are created ‘under the hood’ at run time

Further reading
• www.odbms.org
  - Resource portal
• db4o Tutorial
  - Included in product download
• The Definitive Guide to db4o (Apress)

NoSQL databases
• New breed of databases that are appearing largely in response to the limitations of existing relational databases
• Typically:
  - Support massive data storage (petabyte+)
  - Distribute storage and processing across multiple servers
• Contrast in architecture and priorities compared to relational databases
• Hence term NoSQL
• “Not only SQL”: absence of SQL is not a requirement

NoSQL features
• Wide variety of implementations, but some features are common to many of them:
• Schema-less
• Shared-nothing architecture
• Elasticity
• Sharding and asynchronous replication
• BASE, not ACID
  - Basically Available
  - Soft state
  - Eventually consistent

MapReduce
• Algorithm for dividing a workload into units suitable for parallel processing
• Useful for queries against large sets of data: the query can be distributed to hundreds or thousands of nodes, each of which works on a subset of the target data
• The results are then merged together, ultimately yielding a single “answer” to the original query
• Example: get total word count of a large number of documents
  - Map: calculate word count of each document
      • Each node works on a subset of the overall data set
      • Results emitted to intermediate storage
  - Reduce: calculate total of intermediate results

Brewer’s CAP theorem
• Can optimize for only two of three priorities in a distributed database:
• Consistency
  - All clients have same view of the data
  - Requires atomicity, transaction isolation
• Availability
  - Every request received by a non-failing node must result in a response
• Partition Tolerance
  - Partitions happen if certain nodes can’t communicate
  - No set of failures less than total network failure is allowed to cause the system to respond incorrectly

Implications of CAP theorem
• Any two properties can be achieved
• CP
  - If messages between nodes are lost then system waits
  - Possible that no response is returned at all
  - No inconsistent data returned to client
• CA
  - No partitions, system will always respond and data is consistent
• AP
  - Response always returned even if some messages between nodes are lost
  - Different nodes may have different views of the data
• Choose a database whose priorities match the application
  - http://blog.nahurst.com/visual-guide-to-nosql-systems

Using a NoSQL database in a .NET application
• Application typically makes connection to remote cluster
• Some (but not many) NoSQL databases are supported by native .NET clients
  - Handle “mapping” from .NET objects to data model
• Many NoSQL databases are accessed through a REST interface
  - Application must construct request and handle response format, e.g. JSON
  - Application can be written in any suitable language
• Azure Table Storage is Microsoft’s NoSQL storage for cloud-based applications
• However the data is accessed, you need to understand the data model, which will be significantly different from a typical relational database or object model

NoSQL database types and examples
• Key/value Databases
  - These manage a simple value or row, indexed by a key
  - e.g. Voldemort, Vertica
• Big table Databases
  - “a sparse, distributed, persistent multidimensional sorted map”
  - e.g. Google BigTable, Azure Table Storage, Amazon SimpleDB
• Document Databases
  - Multi-field documents (or objects) with JSON access
  - e.g. MongoDB, RavenDB (.NET specific), CouchDB
• Graph Databases
  - Manage nodes, edges, and properties
  - e.g. Neo4j, sones

MongoDB
• Scalable, high-performance, open source, document-oriented database
• Stores JSON-style (actually BSON) documents with dynamic schema
• Replication, high availability and auto-sharding
• Supports document-based queries and map/reduce
• Command line tools:
  - mongod: starts server as a service or daemon
  - mongo: client shell
• Store documents defined as JSON
• Retrieved documents from query displayed as JSON

MongoDB and HTTP
• Admin console at http://<server name>:28017
• REST interface on http://<server name>:28017
  - Enabled by starting server with mongod --rest
  - Server responds to RESTful HTTP requests, e.g.
      • http://127.0.0.1:28017/company/Employee/?filter_Name=Fernando
  - Response is in JSON format
  - Could be consumed by client-side code in Ajax application

MongoDB .NET driver
• Can access documents as instances of Document class
• Represents document as key-value pairs
• Or, can serialize POCOs to database format (JSON)
• Deserialize database documents to POCOs
• Supports LINQ queries
• MapReduce queries can be expressed as LINQ queries

MongoDB schema design
• Collections are essentially named groupings of documents
  - Roughly equivalent to relational database tables
• Less "normalization" than a relational schema because there are no server-side joins
• Generally, you will want one database collection for each of your top-level objects
  - Don’t want a collection for every "class"; instead, embed objects
[Figure: relational schema vs. document schema]

Document example
• Save: [code listing]
• Query: [code listing]
• http://www.10gen.com/video/mongosv2010/schemadesign

MongoDB in C# applications - PI?
• Up to a point
• Collection class needs Id property of a specific type (MongoDB.Oid)
• Object model needs to be designed with document schema in mind

Further reading
• http://nosql-database.org/
• http://www.nosqlpedia.com/
• http://www.mongodb.org/
• http://www.codeproject.com/KB/database/MongoDBCS.aspx
  - Nice code example for C# and MongoDB
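
Worked sketch: MapReduce word count
The word-count example from the MapReduce slide can be sketched in a few lines of Python. This is only an illustration of the map and reduce steps, not how any particular NoSQL product implements them. The function names, the `nodes` parameter and the use of a thread pool to stand in for separate servers are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def map_word_count(document):
    """Map step: word count of a single document."""
    return len(document.split())


def map_partition(partition):
    """Work done by one (simulated) node on its subset of the documents."""
    return [map_word_count(doc) for doc in partition]


def reduce_counts(intermediate):
    """Reduce step: total of the intermediate per-document counts."""
    return sum(intermediate)


def total_word_count(documents, nodes=3):
    # Split the data set so that each simulated node gets a subset.
    partitions = [documents[i::nodes] for i in range(nodes)]
    # Threads stand in for separate servers (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        per_node = list(pool.map(map_partition, partitions))
    # Collect the emitted results ("intermediate storage"), then reduce.
    intermediate = [count for counts in per_node for count in counts]
    return reduce_counts(intermediate)


docs = ["the quick brown fox", "jumps over", "the lazy dog"]
print(total_word_count(docs))  # 4 + 2 + 3 = 9
```

The map step needs no knowledge of other documents, which is what makes it safe to distribute; only the reduce step sees the combined intermediate results.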
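
Worked sketch: activation depth
The effect of a fixed activation depth, described on the Activation slide, can be shown with a toy in-memory model. This is not the db4o API: the `Node` class, the `activated` flag and the `activate` function are hypothetical stand-ins, and the depth convention used here (depth 1 activates only the object itself) is an assumption of the sketch.

```python
class Node:
    """Toy persistent object: 'activated' means its fields have been loaded."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.activated = False


def activate(obj, depth):
    """Activate obj and its related objects, stopping at the given depth."""
    if depth <= 0:
        return
    obj.activated = True
    for child in obj.children:
        activate(child, depth - 1)


def build_chain(length):
    """Build a linked chain n0 -> n1 -> ... of the given length."""
    head = None
    for i in reversed(range(length)):
        head = Node(f"n{i}", [head] if head else [])
    return head


def walk(obj):
    """Yield every object reachable from obj, parents before children."""
    yield obj
    for child in obj.children:
        yield from walk(child)


chain = build_chain(7)
activate(chain, 5)  # 5 is the default depth quoted on the slide
nodes = list(walk(chain))
print([n.name for n in nodes if n.activated])  # n0 to n4 only

# Objects beyond the depth limit are activated explicitly when needed.
activate(nodes[5], 1)
print(nodes[5].activated, nodes[6].activated)  # True False
```

Transparent activation removes the need for the explicit second call by activating an object the first time it is accessed.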
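
Worked sketch: embedding objects in a document
The schema design advice (one collection per top-level object, with related objects embedded rather than joined) can be illustrated without a database. The sketch below uses plain Python dictionaries and the standard `json` module. The Employee name comes from the REST example URL; the address fields and the `to_document` helper are hypothetical and exist only for this illustration.

```python
import json

# Relational style: one "table" per class, linked by a foreign key.
employees = [{"id": 1, "name": "Fernando"}]
addresses = [
    {"employee_id": 1, "kind": "home", "city": "Glasgow"},
    {"employee_id": 1, "kind": "work", "city": "Edinburgh"},
]


def to_document(employee, address_rows):
    """Embed the employee's address rows inside a single document."""
    return {
        "_id": employee["id"],
        "Name": employee["name"],
        "Addresses": [
            # Drop the foreign key: the nesting now expresses the relationship.
            {key: value for key, value in row.items() if key != "employee_id"}
            for row in address_rows
            if row["employee_id"] == employee["id"]
        ],
    }


doc = to_document(employees[0], addresses)
print(json.dumps(doc, indent=2))
```

Reading the employee back now takes a single document fetch with no join, at the cost of having to design the object model around the document shape.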
