Download Document

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Transcript
High-performance database technology
for rock-solid IoT solutions
Gints Ernestsons, Clusterpoint founder
LATA conference, 28.01.2016.
Key facts about Clusterpoint
Founded: 2006
Team size: 32
Engineering: 25
Privately held, VC backed
4.8 m investments to date
Product: database software
Market share: 100s of installations
Partners: 7, Cloud partners: 2
Cloud DBaaS : from Q1/2015
Key Personnel
CEO
Founder,
Visionary
CTO,
Founder
Algorithms
Architect
DB Software
Architect
Business
Dev Director
Zigmars
Rasscevskis
Gints Ernestsons
Jurgis
Orups
Martins
Krikis
Janis
Sermulins
Peteris
Janovskis
8 years in Google;
Engineering manager of
the Web search backend
(Zurich); IMO silver
medal
15 years CTO in
Lursoft; 8 years CEO
in Clusterpoint;
25 years as a
technology
entrepreneur and
investor
9 years runs
Clusterpoint core
software engineering
team, expert
in C/C++, NoSQL, Big
data search
4 years in Intel
(USA), 4 years in
Tieto;
PhD from Yale
University; Lecturer
on Algorithms
5 years in Google
MSc from MIT; Intel
Research (USA)
IOI 2x Gold medallist;
12 years in Oracle;
Alliance & Channel
Director Central and
East Europe
Selected list of our customers and partners
Ousting ORACLE, Microsoft SQL, MySQL and SEARCH platforms in 24/7 services
We operate cloud database infrastructure in Europe and USA
Already > 5000 users,
only 8 months
in a program
Dallas, US
Cloud DBaaS
started in
Q1/2015
Riga, Europe
Explosion of IoT data is inevitable: we are at the very beginning!
Internet of Things Units Installed Base by
Category (Millions of Units) | Gartner Nov 2015
Internet of Things Endpoint Spending by Category
(Billions of Dollars) | Gartner Nov 2015
25000
3500
3000
20000
2500
15000
2000
10000
1500
1000
5000
500
0
2014
2015
Consumer
Business: Cross-Industry
2016
2020
Business: Vertical-Specific
0
2014
2015
Consumer
Business: Cross-Industry
2016
Business: Vertical-Specific
Gartner, Inc. forecasts that 6.4 billion connected things will be in use worldwide in
2016, up 30 percent from 2015, and will reach 20.8 billion by 2020. In 2016, 5.5
million new things will get connected every day.
2020
Product: hybrid operational database, analytics and search platform
Hyper converged
platform that uses
JSON
open standards
XML
ACID
OLAP
OLTP
SIEM
DWH
WEB
HPC
TEXT
Use cases
MB ► GB ► TB ► PB
HybridSQL
Secure, high-performance, distributed data
management at scale
We solve performance problems where relational databases fail
Blazing fast
Unlimited scalability
performance
Up to 1000x
faster
MB ► GB ► TB ► PB
Bulletproof transactions, instant text search and security
Reduces your TCO by 80% over your database life-time
Distributed architecture delivers high-performance computing
RDBMS
CLUSTERPOINT
ACID
Simultaneous
Time
execution of
parallel
computing tasks
using fast &
secure
transactions
Reliability of legacy RDBMS without its complexity, at 1000x its speed
All-in-one platform: DBMS, SEARCH, one API and one COST
Document database with
Search platform with data
JavaScript/SQL + high
relevance ranking, including
performance transactions
full-text & geospatial data
Bulletproof ACID
transactions (patent filed,
US)
Scalable high-availability
Real-time online web and
distributed computing
mobile analytics in Big data
(sharding, replication)
(no need for map-reduce)
Kill complexity! Boost performance! Nail search! Cut your cost!
Up to 1000x faster
RDBMS w ACID-
Search platform, full-text
transactions
index
Tons of your integration efforts and application
“spaghetti” code
High availability
Online analytics
shards, replicas
platform
Custom “stitching” all platforms
Cut 80% off your TCO
ONE
API:
JS/SQL
No systems integration required
Replace 2 software platforms with 1 to decrease your TCO by 80%
Commercial
RDBMS + SEARCH
Open source
RDBMS + SEARCH
Clusterpoint
database
14 000
0
0
20% / 3 x 2800
DIY / 0
3 x 7200
10 000
0
0
20% / 3 x 2 000
DIY / 0
0
10 000
0
0
DBMS + SEARCH integration through custom
application software code (developer months)
3m / 15 000
6m / 30 000
1m / 5000
DBMS high-availability clustering option or
custom HA integration (developer months)
2m / 10 000
4m / 20 000
0
Operate & scale integrated DBMS + SEARCH +
custom application code (developer months)
9m / 45 000
18m / 90 000
0
118 400
140 000
26 600
Budget for 100-users company, in $
DBMS software license (enterprise edition)
DBMS software maintenance (3 years)
SEARCH PLATFORM (SEARCH) license
SEARCH PLATFORM maintenance fee (3 year)
DBMS client software access licenses (100 users)
Your TCO over 3-years life-span, in $:
By using multiple platforms, your security problems are snowballing
Major security alert as 40,000 MongoDB databases left
Attackers targeting Elasticsearch remote code
unsecured on the internet
execution hole
The Register
InformationAge
The Odd Couple: Hadoop and Data
US Department of Homeland Security Calls On Computer
Security
Users to Disable Java
ZDnet
Forbes
MySQL Multiple Bugs Let Remote Users Access and Modify
Bash bug leaves Linux users
Data and Deny Service
shellshocked
Security Tracker
WindowsSecurity
Manage all your data, indexes and replicas with solid security
All your mission-critical data in one DBMS, analytics
and search platform
Big data cluster, replicas, backups
ONE API:
JS/SQL
ACID transactions
Ordinary relational SQL database
XML
BLOB
JSON
Search and analytics data/indexes
Develop your application software code scalable from day-one
OPEX, TCO
Save > 80%
Database life-cycle
Test
Year 1
Year 2
WRITE ONCE
and decrease lifetime cost of your
web or mobile
application
Year 3
Year 4
Year N
Why pay extra for high-end features? Use out-of-the-box!
FAULT-TOLERANCE
LOAD BALANCING
HIGH-AVAILABILITY
SCALE OUT ABILITY
replica 1
replica 2
replica 3
Why document-oriented database architecture? Flexibility!
Manage all your data in open industry
standards:
XML and JSON
Easily includes other data models: tables, text, pictures, graphs, links etc
Ordinary RDBMS: cost of changes escalates with software stack
Relational database
Cumulative cost
( ORM software model )
OPEX cost
Launch
75
15
35
40
40
10
20
ORM
Search
Analytics & Reporting
High availability clustering
Life
45 d
+ 90 d
+ 6 months
+ 1 year
time
5
Document database: cost of changes goes down to minimum
Document database
Cumulative cost
( XML / JSON data model )
OPEX cost
Launch
70
75
60
40
5
20
Document model (de-normalization)
Analytics & Reporting
Search
HA
Life
1 year (rebuild application)
+ 6 months
+ 90 d
+45 d
time
Smart IoT meters: storing data in documents vs database raws
Millions of smart meters
Billions of measurements
Meter Time
Volts
Amps Cost
Meter A day, a month or a year(s) data
1
00:00 { ... } ... 10:00 {220 0.25
0.05 } 10:15 { 240 0.65 0.10 }
10:30 { ... } ... 23:45 { ... } address
2
00:00 { ... } ... 10:00 {230 0.50
0.10 } 10:15 { 230 0.50 0.10 }
10:30 { ... } ... 23:45 { ... } ... name
3
00:00 { ... } ... 10:00 { Not
available } 10:15 { Signal loss }
10:30 { ... } ... 23:45 { ... } ... photo
1
2
10:00 220
10.00 230
0.25
0.50
0.05
0.10
3
...
10:00 180
...
...
0.30
...
0.03
...
1
2
10:15 240
10.15 230
0.65
0.50
0.10
0.10
3
...
10:15 180
...
...
0.30
...
0.03
...
Ordinary database stores individual
measurements (1000s per meter)
Fast degrading performance
Document database stores all data on individual
meters as rich text objects
Instant search
Top performance
Automatically create and maintain fast full-text search index
<id>
<title>
</title>
RANKING INDEX
indexes
Ordinary database indexing model
Full database content indexing
SQL query: tens of seconds
Our query: milliseconds
Complex queries requiring steep learning
Web-style free text SEARCH and analytical
curve
JS/SQL queries
Ultra-fast database index for ranked search and online analytics
Your original data in
%
documents
XML
&
JSON
RANKING
INDEX
strings
tags
names
values
Distributed storage
dates
numbers
words
relations
architecture
Index tree is organized into a graph, enabling you to set up your own
search ranking (weighting) rules
Ranking index delivers endless scale out ability to your data
MB ► GB ► TB ► PB
Organized as a modular graph, it enables to distribute data and computing
Ordinary databases overload and overwhelm users with data
Two main performance problems with ordinary databases
Disrupt your competition with fast and relevant full-text search
Use ranking
Relevance
Free text queries
of search results
at subsecond latency
Having
billions of
data?
Programmable filter that delivers superior search relevance
Scientists: ranking is a game-changing technology in databases
Very Large Data Bases Conference
7th International Workshop on Ranking in
Databases, 2013
“ the sheer amount of data makes it almost impossible to process queries in the traditional compute-thensort approach ”
“ Facing explosion of data ... the user would be overwhelmed by too many unranked results “
Ranking delivers superior search experience in your database
Shop
Map
application
application
Address
100%
Company
75%
Product
50%
Same data, different
ranking rules for your
free text search queries
(think voice in future)
Product
100%
Company
75%
Address
50%
Easily configure your own ranking rules for your business needs
Your own data items (fields) in
Least relevant
Most relevant
your XML or JSON database
Product
0%
50%
100%
Company
0%
50%
100%
Address
0%
50%
100%
Email
0%
50%
100%
Category
0%
50%
100%
When free text search hits data with higher rankings, results are sorted up-front
Enjoy instantly relevant search in your data using only free text
<query>
Plain text:
java developer London
Phrases:
“John Smith”
Simple, superfast, userfriendly web-
Wildcards:
Joh* Smi* or “John Smi*”
style
SEARCH
Patterns:
John Sm?th
Substitutes:
John Sm[iy]th
</query>
With ranking you can implement ranked pagination: 1 2 3 . . .
Ranking your database structure
Ranking your documents
<document>
100%
Title
50%
10%
Body
Comments
<id>
<title>
( when tag weightings are equal )
</title>
Ranking your search query terms
w1^100% w2^+30% w3^-20%
Ranking density of context hits
..w1...w2 ........ w3 ........
Two problems solved
Real-time Big Data SEARCH
RANKING
INDEX
integer 0 ..... 232
milliseconds
Ranked pagination (1 2 3 ..) solves information overload problem
Page: 1 2 3 4 5 more
Limited waiting
time by users
Limited network
Limited screen estate
bandwidth
Fast and relevant database search in your web and mobile applications
Scale to billions of documents without search performance loss
Milliseconds for a JavaScript/SQL
Minutes ... hours
query in Clusterpoint database
for a SQL query in
legacy RDBMS
MB
RANKING
INDEX
GB
TB
PB
JSON
XML
Constant query latency enables real-time Big data search and analytics
Clusterpoint Cloud Database as a Service (DBaaS)
We safely and efficiently
manage your databases for you
AND
We instantly scale on-demand!
Clusterpoint Cloud is always ON, 99.99% available & reliable
REST API
http(s)
JS/SQL
tcp/ip
Our cloud computing is using cost-efficient on-demand model
Resources
Conventional
Provisioning
Model, $
Save 3x-10x
Cost-Efficient
Model, $
DB
Time
Thank you!