* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Document
Oracle Database wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Concurrency control wikipedia , lookup
Relational model wikipedia , lookup
High-performance database technology for rock-solid IoT solutions Gints Ernestsons, Clusterpoint founder LATA conference, 28.01.2016. Key facts about Clusterpoint Founded: 2006 Team size: 32 Engineering: 25 Privately held, VC backed 4.8 m investments to date Product: database software Market share: 100s of installations Partners: 7, Cloud partners: 2 Cloud DBaaS : from Q1/2015 Key Personnel CEO Founder, Visionary CTO, Founder Algorithms Architect DB Software Architect Business Dev Director Zigmars Rasscevskis Gints Ernestsons Jurgis Orups Martins Krikis Janis Sermulins Peteris Janovskis 8 years in Google; Engineering manager of the Web search backend (Zurich); IMO silver medal 15 years CTO in Lursoft; 8 years CEO in Clusterpoint; 25 years as a technology entrepreneur and investor 9 years runs Clusterpoint core software engineering team, expert in C/C++, NoSQL, Big data search 4 years in Intel (USA), 4 years in Tieto; PhD from Yale University; Lecturer on Algorithms 5 years in Google MSc from MIT; Intel Research (USA) IOI 2x Gold medallist; 12 years in Oracle; Alliance & Channel Director Central and East Europe Selected list of our customers and partners Ousting ORACLE, Microsoft SQL, MySQL and SEARCH platforms in 24/7 services We operate cloud database infrastructure in Europe and USA Already > 5000 users, only 8 months in a program Dallas, US Cloud DBaaS started in Q1/2015 Riga, Europe Explosion of IoT data is inevitable: we are at the very beginning! Internet of Things Units Installed Base by Category (Millions of Units) | Gartner Nov 2015 Internet of Things Endpoint Spending by Category (Billions of Dollars) | Gartner Nov 2015 25000 3500 3000 20000 2500 15000 2000 10000 1500 1000 5000 500 0 2014 2015 Consumer Business: Cross-Industry 2016 2020 Business: Vertical-Specific 0 2014 2015 Consumer Business: Cross-Industry 2016 Business: Vertical-Specific Gartner, Inc. forecasts that 6.4 billion connected things will be in use worldwide in 2016, up 30 percent from 2015, and will reach 20.8 billion by 2020. In 2016, 5.5 million new things will get connected every day. 2020 Product: hybrid operational database, analytics and search platform Hyper converged platform that uses JSON open standards XML ACID OLAP OLTP SIEM DWH WEB HPC TEXT Use cases MB ► GB ► TB ► PB HybridSQL Secure, high-performance, distributed data management at scale We solve performance problems where relational databases fail Blazing fast Unlimited scalability performance Up to 1000x faster MB ► GB ► TB ► PB Bulletproof transactions, instant text search and security Reduces your TCO by 80% over your database life-time Distributed architecture delivers high-performance computing RDBMS CLUSTERPOINT ACID Simultaneous Time execution of parallel computing tasks using fast & secure transactions Reliability of legacy RDBMS without its complexity, at 1000x its speed All-in-one platform: DBMS, SEARCH, one API and one COST Document database with Search platform with data JavaScript/SQL + high relevance ranking, including performance transactions full-text & geospatial data Bulletproof ACID transactions (patent filed, US) Scalable high-availability Real-time online web and distributed computing mobile analytics in Big data (sharding, replication) (no need for map-reduce) Kill complexity! Boost performance! Nail search! Cut your cost! Up to 1000x faster RDBMS w ACID- Search platform, full-text transactions index Tons of your integration efforts and application “spaghetti” code High availability Online analytics shards, replicas platform Custom “stitching” all platforms Cut 80% off your TCO ONE API: JS/SQL No systems integration required Replace 2 software platforms with 1 to decrease your TCO by 80% Commercial RDBMS + SEARCH Open source RDBMS + SEARCH Clusterpoint database 14 000 0 0 20% / 3 x 2800 DIY / 0 3 x 7200 10 000 0 0 20% / 3 x 2 000 DIY / 0 0 10 000 0 0 DBMS + SEARCH integration through custom application software code (developer months) 3m / 15 000 6m / 30 000 1m / 5000 DBMS high-availability clustering option or custom HA integration (developer months) 2m / 10 000 4m / 20 000 0 Operate & scale integrated DBMS + SEARCH + custom application code (developer months) 9m / 45 000 18m / 90 000 0 118 400 140 000 26 600 Budget for 100-users company, in $ DBMS software license (enterprise edition) DBMS software maintenance (3 years) SEARCH PLATFORM (SEARCH) license SEARCH PLATFORM maintenance fee (3 year) DBMS client software access licenses (100 users) Your TCO over 3-years life-span, in $: By using multiple platforms, your security problems are snowballing Major security alert as 40,000 MongoDB databases left Attackers targeting Elasticsearch remote code unsecured on the internet execution hole The Register InformationAge The Odd Couple: Hadoop and Data US Department of Homeland Security Calls On Computer Security Users to Disable Java ZDnet Forbes MySQL Multiple Bugs Let Remote Users Access and Modify Bash bug leaves Linux users Data and Deny Service shellshocked Security Tracker WindowsSecurity Manage all your data, indexes and replicas with solid security All your mission-critical data in one DBMS, analytics and search platform Big data cluster, replicas, backups ONE API: JS/SQL ACID transactions Ordinary relational SQL database XML BLOB JSON Search and analytics data/indexes Develop your application software code scalable from day-one OPEX, TCO Save > 80% Database life-cycle Test Year 1 Year 2 WRITE ONCE and decrease lifetime cost of your web or mobile application Year 3 Year 4 Year N Why pay extra for high-end features? Use out-of-the-box! FAULT-TOLERANCE LOAD BALANCING HIGH-AVAILABILITY SCALE OUT ABILITY replica 1 replica 2 replica 3 Why document-oriented database architecture? Flexibility! Manage all your data in open industry standards: XML and JSON Easily includes other data models: tables, text, pictures, graphs, links etc Ordinary RDBMS: cost of changes escalates with software stack Relational database Cumulative cost ( ORM software model ) OPEX cost Launch 75 15 35 40 40 10 20 ORM Search Analytics & Reporting High availability clustering Life 45 d + 90 d + 6 months + 1 year time 5 Document database: cost of changes goes down to minimum Document database Cumulative cost ( XML / JSON data model ) OPEX cost Launch 70 75 60 40 5 20 Document model (de-normalization) Analytics & Reporting Search HA Life 1 year (rebuild application) + 6 months + 90 d +45 d time Smart IoT meters: storing data in documents vs database raws Millions of smart meters Billions of measurements Meter Time Volts Amps Cost Meter A day, a month or a year(s) data 1 00:00 { ... } ... 10:00 {220 0.25 0.05 } 10:15 { 240 0.65 0.10 } 10:30 { ... } ... 23:45 { ... } address 2 00:00 { ... } ... 10:00 {230 0.50 0.10 } 10:15 { 230 0.50 0.10 } 10:30 { ... } ... 23:45 { ... } ... name 3 00:00 { ... } ... 10:00 { Not available } 10:15 { Signal loss } 10:30 { ... } ... 23:45 { ... } ... photo 1 2 10:00 220 10.00 230 0.25 0.50 0.05 0.10 3 ... 10:00 180 ... ... 0.30 ... 0.03 ... 1 2 10:15 240 10.15 230 0.65 0.50 0.10 0.10 3 ... 10:15 180 ... ... 0.30 ... 0.03 ... Ordinary database stores individual measurements (1000s per meter) Fast degrading performance Document database stores all data on individual meters as rich text objects Instant search Top performance Automatically create and maintain fast full-text search index <id> <title> </title> RANKING INDEX indexes Ordinary database indexing model Full database content indexing SQL query: tens of seconds Our query: milliseconds Complex queries requiring steep learning Web-style free text SEARCH and analytical curve JS/SQL queries Ultra-fast database index for ranked search and online analytics Your original data in % documents XML & JSON RANKING INDEX strings tags names values Distributed storage dates numbers words relations architecture Index tree is organized into a graph, enabling you to set up your own search ranking (weighting) rules Ranking index delivers endless scale out ability to your data MB ► GB ► TB ► PB Organized as a modular graph, it enables to distribute data and computing Ordinary databases overload and overwhelm users with data Two main performance problems with ordinary databases Disrupt your competition with fast and relevant full-text search Use ranking Relevance Free text queries of search results at subsecond latency Having billions of data? Programmable filter that delivers superior search relevance Scientists: ranking is a game-changing technology in databases Very Large Data Bases Conference 7th International Workshop on Ranking in Databases, 2013 “ the sheer amount of data makes it almost impossible to process queries in the traditional compute-thensort approach ” “ Facing explosion of data ... the user would be overwhelmed by too many unranked results “ Ranking delivers superior search experience in your database Shop Map application application Address 100% Company 75% Product 50% Same data, different ranking rules for your free text search queries (think voice in future) Product 100% Company 75% Address 50% Easily configure your own ranking rules for your business needs Your own data items (fields) in Least relevant Most relevant your XML or JSON database Product 0% 50% 100% Company 0% 50% 100% Address 0% 50% 100% Email 0% 50% 100% Category 0% 50% 100% When free text search hits data with higher rankings, results are sorted up-front Enjoy instantly relevant search in your data using only free text <query> Plain text: java developer London Phrases: “John Smith” Simple, superfast, userfriendly web- Wildcards: Joh* Smi* or “John Smi*” style SEARCH Patterns: John Sm?th Substitutes: John Sm[iy]th </query> With ranking you can implement ranked pagination: 1 2 3 . . . Ranking your database structure Ranking your documents <document> 100% Title 50% 10% Body Comments <id> <title> ( when tag weightings are equal ) </title> Ranking your search query terms w1^100% w2^+30% w3^-20% Ranking density of context hits ..w1...w2 ........ w3 ........ Two problems solved Real-time Big Data SEARCH RANKING INDEX integer 0 ..... 232 milliseconds Ranked pagination (1 2 3 ..) solves information overload problem Page: 1 2 3 4 5 more Limited waiting time by users Limited network Limited screen estate bandwidth Fast and relevant database search in your web and mobile applications Scale to billions of documents without search performance loss Milliseconds for a JavaScript/SQL Minutes ... hours query in Clusterpoint database for a SQL query in legacy RDBMS MB RANKING INDEX GB TB PB JSON XML Constant query latency enables real-time Big data search and analytics Clusterpoint Cloud Database as a Service (DBaaS) We safely and efficiently manage your databases for you AND We instantly scale on-demand! Clusterpoint Cloud is always ON, 99.99% available & reliable REST API http(s) JS/SQL tcp/ip Our cloud computing is using cost-efficient on-demand model Resources Conventional Provisioning Model, $ Save 3x-10x Cost-Efficient Model, $ DB Time Thank you!