WORK WITH MULTIPLE HOT TERABYTES IN JVMS
PER MINBORG, @PMINBORG
CTO, SPEEDMENT, INC.

SCENARIO
Diagram labels: Application; In-JVM-Cache; In-Memory Solution; >1 TB; Web Shop, Stock Trade, Bank, Machine learning, etc.; Source of Truth

PROS OF IN-MEMORY
• Improved performance
• Consistent performance
• Cost reduction (server, AWS and licenses)

CHALLENGES OF IN-MEMORY
• Optimized Speed
• Cost and size of Memory
• Consistency, Restart, DB impact, etc.
• Organization and size of JVMs

OPTIMIZED SPEED
• No matter how advanced a database you use, it is data locality that really counts
• Eventually, memory will cost less than x $/GB (pick any x)

LATENCIES USING THE SPEED OF LIGHT
• Database query (1 s)
• Disk seek: LA
• TCP (DC): SJ
• SSD: Oakland
• Main memory
• CPU L3 cache
• CPU L2 cache
• CPU L1 cache

CHALLENGES OF IN-MEMORY
• Optimized Speed
• Cost and size of Memory
• Consistency, Restart, DB impact, etc.
• Organization and size of JVMs

HOW MUCH DOES 1 GB COST?

BACK TO THE FUTURE
• $67,000,000,000
• $720,000
• $5
• $0.04
Source: http://www.jcmit.com/memoryprice.htm

CHALLENGES OF IN-MEMORY
• Optimized Speed
• Cost and size of Memory
• Consistency, Restart, DB impact, etc.
• Organization and size of JVMs

CACHE SYNCHRONIZATION STRATEGIES

DUMP AND LOAD
• Dumps are reloaded periodically
• All data elements are reloaded
• Data remains unchanged between reloads
• System restart is just a reload

POLL
• Data is evicted, refreshed or marked as old
• Evicted elements are reloaded
• Data changes all the time
• System restart either warms up the cache or uses a cold cache

CACHE SYNCHRONIZATION STRATEGIES

REACTIVE PERSISTENT CACHING
• Changed data is captured in the database
• Changed data events are pushed into the cache
• Events are grouped in transactions
• Cache updates are persisted
• Data changes all the time
• On system restart, the missed events are replayed

COMPARISON
Columns: Dump and Load Caching | Poll Caching | Reactive Persistent Caching
Max Data Age: Dump period | Eviction time | Replication latency (ms)
Lookup Performance: Consistently instant | ~20% slow | Consistently instant
Consistency: Eventually consistent | Inconsistent (stale data) | Eventually consistent
Database Cache Update Load: Total size | Depends on eviction time and access pattern | Rate of change
Restart: Complete reload | Eviction time | Down time update rate * time -> 10% of down

CHALLENGES OF IN-MEMORY
• Optimized Speed
• Cost and size of Memory
• Consistency, Restart, DB impact, etc.
• Organization and size of JVMs

BIG JVMS WITH TERABYTES OF DATA
• Scale Up
  • One large JVM handles all data
  • Map memory to (SSD backed) files
  • Several JVMs can share data via the file system
  • Instant restart
• Scale Out
  • Have several JVMs in a network
  • Use sharding between nodes
  • Redundant nodes

CONVENTIONAL JAVA APPLICATIONS
• Java objects live on the heap and are garbage collected periodically
• Garbage collection times increase with the Java heap size
• Garbage collection times increase with the Java heap mutation rate
• "The app has hit the GC wall"
• Hard to meet reasonable SLAs with JVMs larger than about 16 GB
• 10 TB data and 10 GB JVMs -> ~1000 JVMs

OFF HEAP STORAGE
• Stores data outside of the Java heap
• The garbage collector does not see the content
• Scales up to terabytes of main memory in a single JVM
• Use any number of nodes for scale-out solutions

PERSISTENT SCALE OUT CACHE
• Persists data in files or memory-mapped files
• SSD backing device recommended
• 1.3 GB/s reload per node
  • 10 GB in 6 s
  • 100 GB in 1 min
  • 1 TB in 10 min
• 6.5 GB/s reload in a system with 10 nodes (1 active and 1 backup)
  • 10 GB in 1 s
  • 100 GB in 12 s
  • 1 TB in 2 min
• 65 GB/s reload in a system with 100 nodes, 1 TB in 12 s

COMPRESSED OOPS IN JAVA 8
• Using -XX:+UseCompressedOops (on by default) together with -XX:ObjectAlignmentInBytes=16
• A 64-bit JVM can use "compressed" memory references
• This allows the heap to be up to 64 GB without the overhead of 64-bit object references
• As all objects must be 8- or 16-byte aligned, the lower 3 or 4 bits of the address are always zero and do not need to be stored. With 16-byte alignment the heap can reference 4 billion * 16 bytes, or 64 GB
• Uses 32-bit references
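
The compressed-reference arithmetic on the slide above can be checked with a few lines of Java. This is only an illustrative sketch: the class and method names (CompressedOopsLimit, maxHeapBytes) are made up for this example and are not part of the JVM or of Speedment. It assumes a 32-bit compressed reference counts alignment units rather than bytes, as the slide describes.

```java
// Illustrative only: names are hypothetical, not a JVM or Speedment API.
final class CompressedOopsLimit {

    // A 32-bit compressed reference can address 2^32 aligned slots,
    // so the addressable heap is 2^32 * alignment bytes.
    public static long maxHeapBytes(int objectAlignmentInBytes) {
        return (1L << 32) * objectAlignmentInBytes;
    }

    public static void main(String[] args) {
        for (int alignment : new int[] {8, 16}) {
            long gib = maxHeapBytes(alignment) >> 30; // bytes -> GiB
            System.out.println(alignment + "-byte alignment -> " + gib + " GiB");
        }
        // prints:
        // 8-byte alignment -> 32 GiB
        // 16-byte alignment -> 64 GiB
    }
}
```

With 8-byte alignment the limit is 32 GB, which is why the 64 GB figure on the slide depends on setting -XX:ObjectAlignmentInBytes=16.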
JVM SIZE SWEET SPOT
• 50 GB off heap per node
• 20 nodes per terabyte
• 40 nodes per terabyte with minimum redundancy

CONCLUSIONS
• Get speed by keeping your data close to the application
• RAM is cheap, getting bigger and ever cheaper
• Consistent solution with Reactive Persistent Caching
• Reactive Persistent Caching imposes minimum load on restart and on the DB
• Scale-up solutions can be in the terabytes with virtual memory or file-mapped memory
• Scale-out solutions can use nodes of about 50 GB

SOLUTION
Diagram labels: Application; In-JVM-Cache; Web Shop, Stock Trade, Bank, Machine learning, etc.; >1 TB; Source of Truth

SPEEDMENT
• Java application development tool
• In-JVM-memory cache
• Database SQL Reflector (CDC, Change Data Capture)
• Pluggable storage engines (Speedment, Chronicle Map, Hazelcast, Grid Gain, etc.)
• Code generation tool -> automatic domain model extraction from databases
• Transaction-aware

SPEEDMENT SCALE UP ULTRA-LOW LATENCY CACHE
• Ultra-low latency (runs in the same JVM as the application)
• Millions of TPS
• Latencies measured in microseconds
• Supports file mapping
• Terabytes of data
• O(1) for equality operations
• O(log(N)) for other operations

SPEEDMENT SQL REFLECTOR
• Detects changes in a database
• Will preserve transactions
• Buffers the changes
• Sees data as it was persisted
• Can replay the changes later on
• Detects changes from any source
• Will preserve order
Diagram labels: INSERT, UPDATE, DELETE; Database

DOWNLOAD TRIAL @ WWW.SPEEDMENT.COM
• Connect to your existing SQL DB
• Automatic schema analysis
• Push and play

OFFERINGS
• Complete solutions for in-memory hot big data
• Software licenses
• Service and support
• Consulting

[email protected]
@Speedment
www.speedment.com