In-Memory Database
전준민, 정주성, 이한민, 곽하녹
1
Table of Contents
1. Introduction
2. Disk Resident DB vs In-Memory DB
3. Column Store
4. Durability
5. Data Overflow
6. Products of IMDB
7. Optimization Aspects on IMDB
2
1. Introduction
What is In-Memory Database (IMDB)?
Architecture
Rise of the IMDB
Applications
Myths about IMDB
3
What is In-Memory Database (IMDB)?
• An in-memory database system is a database management
system that stores data entirely in main memory.
4
Architecture
• Fast data access
• Algorithms optimized for main memory
• Efficient memory usage
• Durability
6
Rise of the IMDB
• Multicore Processors
• Cheaper and Bigger Memories
• Demands on Fast Databases
7
Applications
• Low-latency, high volume systems
10
Myths about IMDB
• Given the same amount of RAM, disk DBs can perform at the same
speed as IMDBs (by using caching technology).
• If a RAM disk is created and a traditional disk DB is deployed on
it, it delivers the same performance as an in-memory database.
• Both are myths: a disk DB still carries overhead from
• write on disk
• buffer manager
• indexes for disk
• redundant data
11
2. Disk Resident DB (DRDB) vs In-Memory DB (IMDB)
DRDB vs IMDB : Overview
Indexes
Concurrency Control
12
DRDB vs IMDB : Overview [1]
Aspect           DRDB                             IMDB
File I/O         Carries file I/O burden          No file I/O burden
Storage Usage    Assumes storage is abundant      Uses storage more efficiently
Algorithms       Algorithms optimized for disk    Algorithms optimized for memory
CPU Cycles       More CPU cycles                  Fewer CPU cycles
Persistence      Non-volatile                     Volatile
Lock             Fine locks                       Coarse locks
13
Indexes: B+-Tree in DRDB [2]
• Some index structures keep redundant data in order to
reduce disk I/O.
14
Indexes: T-Tree in IMDB [3]
• Indexes in IMDBs focus on reducing memory
consumption and CPU cycles.
• In 1986, Lehman and Carey proposed the T-tree as
an index structure for main memory databases.
• T-tree indexes are more efficient than B-trees in that they
require less memory space and fewer CPU cycles.
15
Indexes: T-Tree in IMDB
• The T-tree evolved from AVL Trees and B-Trees.
16
Indexes: Hash indexes in IMDB
• Hash indexes are used for key-value based in-memory
databases (cache servers) such as Redis and Memcached.
17
Concurrency Control
• In DRDBs, locks are fine-grained (low-level locking granules).
• To reduce contention
• To increase parallelism
• In IMDBs, locks are coarse-grained because transactions finish quickly.
• Locking granules can be a relation or the entire database
• No need to look up a lock hash table
• Serial scheduling is enough in most cases
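A minimal Python sketch of the coarse-grained idea above, assuming a toy key-value store: one database-level lock serializes all transactions, so there is no per-row lock table to consult. The names `SerialDatabase` and `transfer` are hypothetical and not taken from any product discussed in these slides.

```python
import threading


class SerialDatabase:
    """Toy in-memory store guarded by a single database-level lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # the only lock in the system

    def run(self, transaction):
        # Transactions execute one at a time (serial schedule),
        # so no fine-grained lock lookup is needed.
        with self._lock:
            return transaction(self._data)


def transfer(src, dst, amount):
    """Build a transaction that moves `amount` from src to dst."""
    def txn(data):
        if data[src] < amount:
            return False
        data[src] -= amount
        data[dst] += amount
        return True
    return txn


db = SerialDatabase()
db.run(lambda d: d.update({"a": 100, "b": 0}))

threads = [
    threading.Thread(target=db.run, args=(transfer("a", "b", 1),))
    for _ in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

balances = db.run(lambda d: dict(d))
print(balances)  # {'a': 50, 'b': 50}
```

The trade-off is that concurrent transactions wait on one lock, which is only attractive when each transaction is short, as the slide argues for memory-resident data.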
18
3. Column Store
What is Column Store?
Benefits of Column Store
Delta Storage
19
What is Column Store?
• Column Store
• stores data tables as columns of
data rather than as rows of data
20
Benefits of Column Store [4]
• Column stores are more suitable for IMDBs than row stores
• Better parallelism
• Better compression
• Faster data access
• Using parallel processing
• Especially for aggregations
21
22
Benefits of Column Store: Parallelism [5]
• Column storage can easily be separated into equal parts
which leads to effective parallel processing.
• Highly parallelized scan operations are available which are
faster than indexed searches.
• The row store cannot compete when processing is set-oriented
and requires column operations, and most applications rely on
set-oriented processing rather than direct tuple access.
23
Benefits of Column Store: Parallelism
• Highly parallelized scan operations using column stores are
faster than using just ordinary indexes.
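The partitioning idea can be sketched in Python as follows. This is only an illustration under stated assumptions: `split_equal` and `parallel_sum` are hypothetical helpers, and threads in CPython do not give true CPU parallelism for pure-Python work, so the sketch shows the structure of a partitioned scan rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_equal(column, parts):
    """Split a column into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(column), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(column[start:end])
        start = end
    return chunks


def parallel_sum(column, workers=4):
    """Scan each chunk in its own worker, then combine the partial sums."""
    chunks = split_equal(column, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, chunks))


sales = list(range(1, 1001))
total = parallel_sum(sales)
print(total)  # 500500
```

Because every chunk is an independent slice of one column, the workers share no state until the final combine step.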
24
Benefits of Column Store: Compression
• Column stores allow highly efficient compression because
columns often contain only a few distinct values.
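One common way to exploit a small number of distinct values is dictionary encoding. The slides do not name a specific scheme, so the following Python sketch is an assumption chosen for illustration: each distinct value is stored once, and the column itself becomes a list of small integer codes.

```python
def dictionary_encode(column):
    """Return (sorted distinct values, per-row integer codes)."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in column]
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


country = ["KR", "US", "KR", "KR", "JP", "US", "KR"]
dictionary, codes = dictionary_encode(country)

print(dictionary)  # ['JP', 'KR', 'US']
print(codes)       # [1, 2, 1, 1, 0, 2, 1]
```

With three distinct values, each code needs only two bits in a packed representation, regardless of how long the original strings are.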
25
Delta Storage [6]
• Since writing on compressed column stores in real time is
inefficient, delta storage techniques are used.
• Delta Storage
• optimized for write operations
• Main Storage
• compressed column store
26
Delta Storage
• INSERT
• An INSERT statement will insert a new record into the delta storage. The
merge process will move the record from delta to main.
• DELETE
• A DELETE statement will select the record and mark it as invalid by setting
a flag (for main or delta). The merge process will delete the record from
memory once there is no open transaction active for it anymore.
• UPDATE
• An UPDATE statement will insert a new version of the record. The merge
process will move the latest version from delta to main. Old versions will
be deleted once there is no open transaction active for them anymore.
27
Delta Storage: Simplified View of Insert-Only Approach
28
Delta Storage
• The merge process starts when the delta storage grows big
enough.
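The write path described on the preceding slides can be sketched in Python as below. This is a simplified model under explicit assumptions: `DeltaMainTable` and `merge_threshold` are hypothetical names, a plain dict stands in for the compressed main storage, and open transactions are ignored, so old versions are dropped at merge time.

```python
_DELETED = object()  # marker standing in for the "invalid" flag


class DeltaMainTable:
    def __init__(self, merge_threshold=3):
        self.main = {}    # stands in for the compressed, read-optimized store
        self.delta = []   # append-only, write-optimized list of versions
        self.merge_threshold = merge_threshold

    def insert(self, key, value):
        self.delta.append((key, value))
        self._maybe_merge()

    def update(self, key, value):
        # Insert-only approach: an update appends a new version.
        self.insert(key, value)

    def delete(self, key):
        self.delta.append((key, _DELETED))
        self._maybe_merge()

    def read(self, key):
        # The newest version in the delta wins over the main storage.
        for k, v in reversed(self.delta):
            if k == key:
                return None if v is _DELETED else v
        return self.main.get(key)

    def _maybe_merge(self):
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def merge(self):
        # Move the latest versions into main and drop deleted records.
        for k, v in self.delta:
            if v is _DELETED:
                self.main.pop(k, None)
            else:
                self.main[k] = v
        self.delta.clear()


t = DeltaMainTable(merge_threshold=3)
t.insert("a", 1)
t.update("a", 2)
before_merge = (t.read("a"), dict(t.main))
t.insert("b", 10)   # third delta entry triggers the merge
t.delete("b")
print(before_merge)                    # (2, {})
print(t.main, t.read("b"))             # {'a': 2, 'b': 10} None
```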
29
4. Durability
Logging and Checkpointing
Command Logging
NVM Logging
30
Durability
• Durability is difficult to support in IMDBs because main memory is volatile
• Many IMDBs add durability via the following mechanisms
• Checkpoints
• Transaction logging
31
Checkpointing
• Checkpoints in DRDB
• Bring pages on disk up to date
• Reduce the work of recovery
• Checkpoints in IMDB
• Make a copy of the data on disks (snapshot)
• Truncate the logs
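A rough Python sketch of the IMDB checkpoint cycle, assuming a toy key-value store: `CheckpointedStore` is a hypothetical class, and the `snapshot` and `redo_log` attributes merely stand in for files on disk.

```python
import copy


class CheckpointedStore:
    def __init__(self):
        self.memory = {}     # volatile working data
        self.snapshot = {}   # stands in for the checkpoint image on disk
        self.redo_log = []   # stands in for the log file on disk

    def put(self, key, value):
        self.redo_log.append((key, value))  # log the change first
        self.memory[key] = value

    def checkpoint(self):
        # Copy the in-memory data to disk, then truncate the log.
        self.snapshot = copy.deepcopy(self.memory)
        self.redo_log.clear()

    def crash_and_recover(self):
        # Memory contents are lost; reload the snapshot and replay
        # only the log records written since the last checkpoint.
        self.memory = copy.deepcopy(self.snapshot)
        for key, value in self.redo_log:
            self.memory[key] = value


store = CheckpointedStore()
store.put("x", 1)
store.put("y", 2)
store.checkpoint()
store.put("x", 3)
log_length = len(store.redo_log)
store.crash_and_recover()
print(log_length, store.memory)  # 1 {'x': 3, 'y': 2}
```

Recovery time depends on the log length, which is why truncating the log at each checkpoint matters.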
32
Logging and Checkpointing
[Figure: logging and checkpointing. Labels: Transaction; Memory (Tablespace, Log Buffer); log sync; Physical Disk (REDO Log File, Checkpoint Image File)]
• Problem
• Log I/O becomes the bottleneck
• How long do we need to keep the log?
• Until the next checkpoint
33
Logging and Checkpointing [7]
• TPC-C benchmarking on DRDBs (New Order transaction)
• Logging takes up a significant portion of the processing cost
• The portion is larger for IMDBs
34
Command Logging [8]
• Light-weight, coarse-grained logging technique
• Logical logging
• Advantages
• Write substantially fewer bytes per transaction than physical logging
• Reduce run time overhead
• Disadvantages
• Slow recovery
• Mitigated because failures that require recovery to ensure system
availability are much less frequent
• About 1.5X higher throughput than a main-memory optimized
implementation of physical logging
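The difference in log volume can be sketched in Python as follows. Everything here is an illustrative assumption: `add_bonus`, `PROCEDURES`, and the JSON record format are hypothetical and do not reflect the format of any real system. The command record holds only the procedure name and its parameters, while the physical record holds the after-image of every modified row.

```python
import json


def add_bonus(table, dept, amount):
    """Deterministic stored procedure: raise salaries in one department."""
    changed = []
    for emp_id, row in table.items():
        if row["dept"] == dept:
            row["salary"] += amount
            changed.append((emp_id, dict(row)))
    return changed


PROCEDURES = {"add_bonus": add_bonus}


def run_with_logs(table, name, *args):
    changed = PROCEDURES[name](table, *args)
    command_record = json.dumps([name, *args])  # logical: what was invoked
    physical_record = json.dumps(changed)       # physical: every new row image
    return command_record, physical_record


def replay(table, command_record):
    # Recovery re-executes the transaction, which is why it is slower.
    name, *args = json.loads(command_record)
    PROCEDURES[name](table, *args)


def make_table():
    # Stands in for the state restored from the last checkpoint.
    return {
        i: {"dept": "eng" if i % 2 == 0 else "ops", "salary": 100}
        for i in range(100)
    }


live = make_table()
command_record, physical_record = run_with_logs(live, "add_bonus", "eng", 10)

recovered = make_table()
replay(recovered, command_record)

print(command_record)  # ["add_bonus", "eng", 10]
print(len(command_record) < len(physical_record), recovered == live)  # True True
```

Replay is only correct when the logged procedures are deterministic, which is the assumption this sketch relies on.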
35
NVM Logging [9]
• NVM (Non-Volatile Memory)
• low read/write latency like DRAM
• persistent writes like SSD
Property          DRAM  NAND Flash  NVM
Byte-addressable  Yes   No          Yes
Capacity          1X    4X          2-4X
Latency           1X    400X        3-5X
37
NVM+DRAM Architecture
• DBMS relies on both DRAM and NVM
38
5. Data Overflow
Anti-caching
Project Siberia
39
Data Overflow
• Datasets may not fit in DRAM
• IMDB Solutions
• Anti-caching
• Project Siberia
40
Anti-caching [10]
• Used in H-Store
• Cold data is moved to disk in a transactionally safe manner
• Bloom filter used for tracking data
• Manage cold data by maintaining an LRU chain
41
Anti-caching
• Fine-grained eviction
• eviction is performed at tuple-level, not page-level
• Non-blocking fetches
• a transaction that accesses evicted data is simply aborted and then
restarted at a later point
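A toy Python sketch of tuple-level eviction with abort-and-restart, under stated assumptions: `AntiCache`, `EvictedAccess`, and `run_transaction` are hypothetical names, a dict stands in for the on-disk store, and the fetch runs inline here rather than in the background.

```python
from collections import OrderedDict


class EvictedAccess(Exception):
    """Raised when a transaction touches a tuple that lives on disk."""


class AntiCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.hot = OrderedDict()  # LRU chain: least recently used first
        self.cold = {}            # stands in for the on-disk store

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        self.cold.pop(key, None)
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)  # mark as most recently used
            return self.hot[key]
        if key in self.cold:
            raise EvictedAccess(key)   # abort instead of blocking
        raise KeyError(key)

    def fetch(self, key):
        """Bring one evicted tuple back into memory."""
        self.put(key, self.cold[key])

    def _evict(self):
        # Eviction works on individual tuples, not pages.
        while len(self.hot) > self.capacity:
            key, value = self.hot.popitem(last=False)
            self.cold[key] = value


def run_transaction(store, keys):
    """Abort on an evicted access, fetch the tuple, then restart."""
    restarts = 0
    while True:
        try:
            return [store.get(k) for k in keys], restarts
        except EvictedAccess as miss:
            store.fetch(miss.args[0])
            restarts += 1


store = AntiCache(capacity=2)
for key, value in [("t1", "A"), ("t2", "B"), ("t3", "C")]:
    store.put(key, value)

values, restarts = run_transaction(store, ["t1", "t3"])
print(values, restarts, sorted(store.cold))  # ['A', 'C'] 1 ['t2']
```

Note that this sketch would keep restarting if one transaction needed more tuples than `capacity` allows in memory at once; a real system has to handle that case.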
43
Project Siberia [11]
• Used in Hekaton
• Automatically and transparently maintain cold data on
cheaper secondary storage
• Allow more data to fit in memory
• Log-based management of cold data
44
6. Products of IMDB
H-Store / VoltDB
Hekaton
SAP HANA
In-memory NoSQL Databases
45
Products of IMDB
46
H-Store / VoltDB
• Distributed row-based in-memory relational database
• Targeted for high-performance OLTP processing
• Light-weight logging strategy
• Anti-caching
47
Hekaton
• Memory-optimized OLTP engine
• Fully integrated into Microsoft SQL Server
• Multi-version concurrency control
• Project Siberia
48
SAP HANA
• A distributed in-memory database designed to
integrate OLTP and OLAP
• Provides rich data analytics functionality by offering multiple
query language interfaces (e.g., standard SQL, SQLScript,
MDX, WIPE, FOX and R)
49
SAP HANA
• Three-level column-oriented unified table structure
50
In-memory NoSQL Databases
• RAMCloud
• Distributed in-memory key-value store designed for low latency, high
availability and high memory utilization
• Bitsy
• Embeddable in-memory graph database that implements the
Blueprints API, with ACID guarantees on transactions based on
optimistic concurrency control
51
Comparison of IMDB Systems [12]
Category              System    Data Model             Workloads          Indexes                        Fault Tolerance                       Memory Overflow
Relational Databases  H-Store   relation (row)         OLTP               hashing, B+-tree, binary tree  command logging, checkpoint, replica  anti-caching
Relational Databases  Hekaton   relation (row)         OLTP               latch-free hashing, Bw-tree    logging, checkpoint, replica          Project Siberia
Relational Databases  SAP HANA  relation, graph, text  OLTP, OLAP         timeline index                 logging, checkpoint, standby server   table/partition-level swapping
NoSQL Databases       RAMCloud  key-value              object operations  hashing                        logging, replica                      N/A
Graph Databases       Bitsy     graph                  OLTP               N/A                            logging, backup                       N/A
Note: Bitsy uses optimistic concurrency control.
52
7. Optimization Aspects on IMDB
53
Optimization Aspects on IMDB [12]
Aspects              Concerns                                       Related Work
Index                cache consciousness, time/space efficiency     T-Tree, CSS-Trees, CSB+-Trees, BD-Tree
Data Layout          cache consciousness, space efficiency          columnar layout, HANA Hybrid Store, log structure
Concurrency Control  overhead, correctness                          virtual snapshot, transactional memory, MVCC
Query Processing     code locality, time efficiency                 stored procedure, JIT compilation, sorting
Fault Tolerance      durability, correlated failures, availability  group commit and log coalescing, NVM, command logging, remote logging
Data Overflow        locality, paging, hot/cold classification      anti-caching, Hekaton Siberia, data compression, virtual memory management, pointer swizzling
54
References
[1] Garcia-Molina, Hector, and Kenneth Salem. "Main memory database systems: An
overview." Knowledge and Data Engineering, IEEE Transactions on 4.6 (1992): 509-516.
[2] Comer, Douglas. "Ubiquitous B-tree." ACM Computing Surveys (CSUR) 11.2 (1979):
121-137.
[3] Lehman, Tobin J., and Michael J. Carey. "A study of index structures for main memory
database management systems." Conference on Very Large Data Bases. Vol. 294. 1986.
[4] Abadi, Daniel J., Samuel R. Madden, and Nabil Hachem. "Column-stores vs. row-stores:
how different are they really?" Proceedings of the 2008 ACM SIGMOD
international conference on Management of data. ACM, 2008.
[5] Plattner, Hasso. "A common database approach for OLTP and OLAP using an
in-memory column database." Proceedings of the 2009 ACM SIGMOD International
Conference on Management of data. ACM, 2009.
[6] Färber, Franz, et al. "The SAP HANA Database--An Architecture Overview." IEEE Data
Eng. Bull. 35.1 (2012): 28-33.
55
References
[7] Harizopoulos, Stavros, et al. "OLTP through the looking glass, and what we found
there." Proceedings of the 2008 ACM SIGMOD international conference on Management
of data. ACM, 2008.
[8] Malviya, Nirmesh, et al. "Rethinking main memory OLTP recovery." Data Engineering
(ICDE), 2014 IEEE 30th International Conference on. IEEE, 2014.
[9] DeBrabant, Justin, et al. "A Prolegomenon on OLTP Database Systems for Non-Volatile
Memory." Proceedings of the VLDB Endowment 7.14 (2014).
[10] DeBrabant, Justin, et al. "Anti-caching: A new approach to database management
system architecture." Proceedings of the VLDB Endowment 6.14 (2013): 1942-1953.
[11] Eldawy, Ahmed, Justin Levandoski, and Paul Larson. "Trekking through siberia:
Managing cold data in a memory-optimized database." Proceedings of the VLDB
Endowment 7.11 (2014).
[12] Zhang, Hao, et al. "In-memory big data management and processing: A survey."
(2015).
56