Oracle Database In-Memory
New Features 12c Release 2
Gavin Soorma
Agenda
• What benefits can In-Memory provide
• In-Memory architecture and components
• How do we configure In-Memory in the database
• What's new in Oracle 12c Release 2
  ◦ Dynamically increase In-Memory area on the fly
  ◦ In-Memory support for Active Data Guard
  ◦ Optimize join performance via Join Groups
  ◦ In-Memory Expressions
  ◦ In-Memory Fast Start
The In-Memory Option
• In-Memory option included in Oracle Database Enterprise Edition 12.1.0.2
• Separate licensable option
• Transparently accelerates analytic queries by orders of magnitude
• Businesses benefit from better decisions made in real time, resulting in lower costs, improved productivity, and increased competitiveness
• No application changes required
• The entire database need not be in memory, as is the case with some competing products
Best of both worlds
• Even mainly OLTP databases will occasionally have analytical workloads
• A single database can now efficiently support both OLTP and data warehouse workloads
• Dual-format architecture
  ◦ Enables data to be maintained in the existing Oracle row format, ideal for OLTP operations (buffer cache)
  ◦ Also enables data to be maintained in a new, purely in-memory columnar format, optimized for analytical processing (In-Memory Column Store)
Dual-Format Architecture
• OLTP: few rows, many columns
• Data Warehouse: millions of rows, few columns
In-Memory boosts OLTP performance
• OLTP is slowed by analytic indexes
• Column store replaces analytic indexes
Configure the In-Memory Area
• The In-Memory Area is an optional SGA component that contains the IM column store
• The IM column store does not replace the buffer cache, but supplements it
• It is not a cache: objects do not age out!
• Data can now be stored in memory in both row and column format
• The In-Memory Area is controlled by the INMEMORY_SIZE initialization parameter
• By default, the size of the In-Memory Area is 0, which means the IM column store is disabled
Configure the In-Memory Area
• The In-Memory Area is allocated out of the SGA_TARGET initialization parameter setting
  ◦ Example: SGA_TARGET = 10 GB and INMEMORY_SIZE = 4 GB
  ◦ 40% of the SGA_TARGET setting is then allocated to the In-Memory Area
• The In-Memory Area is subdivided into two pools:
  ◦ 1 MB pool, used to store the actual column data populated into memory
  ◦ 64 KB pool, used to store metadata about the objects that are populated into the IM column store
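The sizing example above can be sketched roughly as follows. The 4G value is only the slide's example figure, and the sketch assumes the In-Memory Area is currently disabled, so the new size is written to the spfile and picked up at the next instance restart.

```sql
-- Reserve 4 GB of the SGA for the In-Memory Area (example value from the slide).
-- Assumption: INMEMORY_SIZE is currently 0, so a restart is needed afterwards.
ALTER SYSTEM SET INMEMORY_SIZE = 4G SCOPE=SPFILE;

-- After the restart: one row per pool (the 1 MB pool and the 64 KB pool),
-- showing how much memory is allocated and how much is in use.
SELECT pool, alloc_bytes, used_bytes, populate_status
FROM   v$inmemory_area;
```

If SGA_TARGET is left at 10 GB, the remaining 6 GB is what the buffer cache, shared pool and other SGA components have to share.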
In-Memory Architecture
[Diagram: the In-Memory Area holds columnar data (IMCUs and IMEUs) in the 1 MB pool and metadata (SMUs) in the 64 KB pool. The IMCO background process directs worker processes (w000, w001, w002), which populate and repopulate the IMCUs, IMEUs, and SMUs.]
In-Memory Compression Units (IMCU)
• An IMCU is a compressed, read-only storage unit that contains data for one or more columns
• An IMCU is analogous to a tablespace extent
• An IMCU has two parts:
  ◦ A set of Column Compression Units (CUs)
  ◦ A header that contains metadata such as the IM storage index
• An IM storage index stores the minimum and maximum values for all columns within the IMCU
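One hedged way to see whether IM storage indexes are helping is to run an analytic query with a selective predicate against a populated table and then look at the session's In-Memory scan statistics. The query below filters on a name prefix rather than hard-coding statistic names, since the exact set of statistics varies by release.

```sql
-- Run this in the same session, right after the analytic query of interest.
-- Statistics with "pruned" in the name, where present, relate to IMCUs that
-- were skipped because the min/max values ruled them out.
SELECT n.name, s.value
FROM   v$mystat   s
JOIN   v$statname n ON n.statistic# = s.statistic#
WHERE  n.name LIKE 'IM scan%'
AND    s.value > 0
ORDER  BY n.name;
```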
Snapshot Metadata Units (SMUs)
• A Snapshot Metadata Unit (SMU) contains metadata and transactional information for an associated IMCU
• SMUs are stored in the 64 KB pool of the In-Memory Area
• Each IMCU maps to one SMU in the 64 KB pool, which holds the metadata about that IMCU
• The IMCU is a read-only structure; the SMU is modified because it contains the transaction journal
Transaction Journal
• Every SMU contains a transaction journal, which tracks row modifications
• The database uses the transaction journal to keep the IMCU transactionally consistent
• The database uses the buffer cache to process DML, just as when the IM column store is not enabled
• Example: an UPDATE statement modifies a row that is held in an IMCU
  ◦ The rowid of the modified row is added to the transaction journal, and the row is marked as stale in the IMCU as of the SCN of the DML statement
  ◦ If a query needs to access the new version of the row, the database obtains the row from the database buffer cache
Transaction Journal
• If a query accesses the columnar data and discovers modified rows, it can obtain the corresponding rowids from the transaction journal and then retrieve the modified rows from the buffer cache
In-Memory Expression Units (IMEUs)
• An In-Memory Expression Unit (IMEU) is a storage container for materialized In-Memory Expressions and user-defined virtual columns
• Every IMEU maps to exactly one IMCU and covers the same row set
• The IMEU contains expression results for the data contained in its associated IMCU
• When the IMCU is populated, the associated IMEU is also populated
Expression Statistics Store
• The Expression Statistics Store (ESS) is a repository maintained by the optimizer to store statistics about expression evaluation
• The ESS resides in the SGA and also persists on disk in the SYSAUX tablespace
• The database uses the ESS to determine whether an expression is “hot” (frequently accessed), and thus a candidate for an IM expression
• IM expressions are exposed as system-generated virtual columns, prefixed by the string SYS_IME
Populating the In-Memory Column Store
• Population is a streaming mechanism that converts row data into columnar format and then compresses it
• New INMEMORY attribute for tables and materialized views
• Only objects with the INMEMORY attribute are populated into the IM column store
• The INMEMORY attribute can be specified on a tablespace, table, partition, or materialized view
• Repopulation is threshold-based and is supplemented by trickle repopulation
Populating the In-Memory Column Store
ALTER TABLESPACE ts_data DEFAULT INMEMORY;
ALTER TABLE sales INMEMORY;
ALTER TABLE sales INMEMORY NO INMEMORY(prod_id);
ALTER TABLE sales MODIFY PARTITION SALES_Q1_1998 NO INMEMORY;
• Objects are populated into the IM column store either in a prioritized list immediately after the database is opened, or after they are scanned (queried) for the first time
ALTER TABLE customers INMEMORY PRIORITY CRITICAL;
• Oracle Database In-Memory Advisor (Doc ID 1965343.1)
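To check what has actually been populated after statements like the ones above, something along these lines may be useful. SH and SALES are placeholder schema and table names, and the explicit DBMS_INMEMORY.POPULATE call is optional: it simply avoids waiting for the first scan of a table that has no priority set.

```sql
-- Optional: request population now instead of waiting for the first query.
-- 'SH' and 'SALES' are placeholder names, not taken from the slides.
BEGIN
  DBMS_INMEMORY.POPULATE('SH', 'SALES');
END;
/

-- One row per populated (or partly populated) segment.
-- BYTES_NOT_POPULATED = 0 means the whole segment is in the column store.
SELECT owner, segment_name, partition_name,
       inmemory_priority, populate_status,
       bytes, inmemory_size, bytes_not_populated
FROM   v$im_segments
ORDER  BY owner, segment_name, partition_name;
```

Comparing BYTES with INMEMORY_SIZE gives a rough idea of the compression achieved for each segment.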
In-Memory Background Processes
• The In-Memory Coordinator Process (IMCO) initiates population and repopulation of columnar data
• Space Management Worker Processes (Wnnn) populate or repopulate data on behalf of IMCO
• During population, Wnnn processes are responsible for creating IMCUs, SMUs, and IMEUs
• Space Management Worker Processes also wake up at frequent intervals to initiate repopulation for objects with stale columnar data
Repopulating the In-Memory Column Store
• As the number of modifications increases, so does the size of the transaction journals in the SMUs
• This increases the amount of data that must be fetched from the database buffer cache, because current columnar data is not available in the In-Memory column store
• The more stale entries there are in an IMCU, the slower the scan of the IMCU becomes
• To avoid degrading query performance through journal access, background processes repopulate modified objects back into the column store
Repopulating the In-Memory Column Store
• Two mechanisms: threshold-based repopulation and trickle repopulation
• The database repopulates an IMCU when the number of stale entries in the transaction journal reaches a staleness threshold
• Trickle repopulation supplements threshold-based repopulation by periodically refreshing stale columnar data even when the threshold has not been reached
• INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT limits the maximum percentage of background populate servers that can be used for trickle repopulation
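As a sketch of how the repopulation settings might be adjusted, the statements below set the number of populate servers and the share of them available to trickle repopulation. The values 4 and 10 are purely illustrative and should be sized against the CPU capacity of the server.

```sql
-- Upper limit on background populate servers (illustrative value).
ALTER SYSTEM SET INMEMORY_MAX_POPULATE_SERVERS = 4 SCOPE=BOTH;

-- Percentage of those servers that trickle repopulation may use
-- (illustrative value; setting it to 0 turns trickle repopulation off).
ALTER SYSTEM SET INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT = 10 SCOPE=BOTH;

-- Confirm the current settings.
SELECT name, display_value
FROM   v$parameter
WHERE  name IN ('inmemory_max_populate_servers',
                'inmemory_trickle_repopulate_servers_percent');
```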
Dynamically increase size of In-Memory Column Store
• When more memory is required for the IM column store, its size can be increased dynamically
• Before 12.2, resizing required an instance restart
• The size of the IM column store cannot be decreased dynamically
• The new size of the IM column store must be at least 128 MB greater than the current INMEMORY_SIZE setting
• There must be enough memory available in the SGA to increase the size of the IM column store dynamically to the new value
ALTER SYSTEM SET INMEMORY_SIZE = 500M SCOPE=BOTH;
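A worked version of the 128 MB rule, assuming purely for illustration that INMEMORY_SIZE is currently 4G: the smallest valid new value is 4096M + 128M = 4224M.

```sql
-- Check the current settings first.
SELECT name, display_value
FROM   v$parameter
WHERE  name IN ('inmemory_size', 'sga_target');

-- See which SGA components exist and how large they are.
SELECT name, bytes, resizeable
FROM   v$sgainfo;

-- Assumption: INMEMORY_SIZE is currently 4G, so 4224M is the minimum increase.
ALTER SYSTEM SET INMEMORY_SIZE = 4224M SCOPE=BOTH;
```

Because the column store cannot be shrunk dynamically, an increase that turns out to be too generous can only be undone with a restart.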
In-Memory Expressions
• Analytic queries often contain complex expressions in the select list or WHERE clause predicates that must be evaluated for every row processed by the query
• Evaluating these complex expressions can be very resource intensive and time consuming
• IM expressions speed up analytic queries on large data sets by precomputing computationally intensive expressions
In-Memory Expressions
• The Expression Statistics Store (ESS) automatically tracks frequently evaluated (“hot”) expressions
• Invoke the IME_CAPTURE_EXPRESSIONS procedure, which is part of the DBMS_INMEMORY_ADMIN package
  ◦ The procedure queries the ESS and identifies the 20 most frequently accessed (“hottest”) expressions in the specified time range
• Captured expressions are added to the table as hidden virtual columns
• Under the direction of the In-Memory Coordinator Process (IMCO), Space Management Worker Processes (Wnnn) load IM expressions into IMEUs during population and repopulation
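The capture step described above might look like the following. The sketch assumes the INMEMORY_EXPRESSIONS_USAGE parameter has not been set to DISABLE, and that the session has the privileges needed to run DBMS_INMEMORY_ADMIN and read DBA_IM_EXPRESSIONS.

```sql
BEGIN
  -- 'CUMULATIVE' considers expression statistics gathered since the database
  -- was created; 'CURRENT' restricts the capture to the past 24 hours.
  DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('CUMULATIVE');

  -- Optional: populate the captured expressions now rather than waiting
  -- for the next repopulation of the affected IMCUs.
  DBMS_INMEMORY_ADMIN.IME_POPULATE_EXPRESSIONS();
END;
/

-- List the captured expressions; the SYS_IME... names are the hidden
-- virtual columns added to each table.
SELECT owner, table_name, column_name, sql_expression
FROM   dba_im_expressions;
```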
In-Memory Expressions
SELECT employee_id, last_name, salary, commission_pct,
       ROUND(salary*12/52,2) AS "weekly_sal",
       12*(salary*NVL(commission_pct,0)+salary) AS "ann_comp"
FROM   employees
ORDER BY "ann_comp";
In-Memory Fast Start
• In-Memory population is a CPU-bound operation: data is reformatted into columnar format and compressed before it is placed in memory
• With IM FastStart enabled, IMCUs from the In-Memory column store are checkpointed to the FastStart area on disk at periodic intervals
• On subsequent database restarts, the column store is populated from the IMCUs stored in the FastStart area on disk rather than by querying the base tables
• This relieves the CPU overhead of population at the cost of additional disk space
In-Memory Fast Start
• Specify a tablespace for the FastStart area using the DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE procedure
• The Space Management Worker Processes (Wnnn) write IMCUs to the SecureFiles LOB named SYSDBinstance_name_LOBSEG$
• Transactional consistency checks compare the System Change Number (SCN) at which the IM FastStart checkpoint was taken for the IMCU with the most recent SCN
• Depending on the result of this check, the IMCU is:
  ◦ Populated from the FastStart area
  ◦ Populated from the FastStart area with some rows marked invalid (due to data modification after the IMCU was written to the FastStart area)
  ◦ Discarded completely and populated from the base table on disk
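Enabling FastStart could be sketched as below. FS_TBS is a placeholder tablespace name, the 10G size is illustrative, and the DATAFILE clause without a file name assumes Oracle Managed Files is configured.

```sql
-- Placeholder tablespace for the FastStart area (assumes Oracle Managed Files).
CREATE TABLESPACE fs_tbs DATAFILE SIZE 10G AUTOEXTEND ON;

-- Designate it as the FastStart area.
BEGIN
  DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE('FS_TBS');
END;
/

-- Confirm which tablespace is currently assigned.
SELECT DBMS_INMEMORY_ADMIN.GET_FASTSTART_TABLESPACE FROM dual;
```

Only one FastStart tablespace can be in use at a time, so it is worth sizing it with the whole column store in mind.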
In-Memory and Active Data Guard
• The IM column store can be enabled in an Oracle Active Data Guard standby database
• The IM column store can be configured only on the primary database, only on a standby database, or on both the primary and standby databases
• The primary database can support the transactional workload while the standby database supports the analytic workload
• The INMEMORY_ADG_ENABLED initialization parameter is set to TRUE on the standby database instance
In-Memory and Active Data Guard
• Set the INMEMORY attribute with the DISTRIBUTE FOR SERVICE clause on all objects to be populated in the IM column store in the standby database
• Create a service that connects to the ADG standby database
• An object is populated in the column store only on database instances on which the service is active
• Example: set INMEMORY_SIZE=0 on the primary database and INMEMORY_SIZE=50G on the ADG standby
In-Memory and Active Data Guard
• The redo generated on the primary database for all DML statements includes metadata indicating whether the change is to an INMEMORY object
• If an INMEMORY object is modified, the standby database invalidates the modified rows just as the primary database does, using the transaction journal and Snapshot Metadata Unit (SMU) to track the changes
• The repopulation mechanism works the same way in a standby database as it does in a primary database
ALTER TABLE sales INMEMORY DISTRIBUTE FOR SERVICE reporting_standby_svc;
(command issued on the primary)
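On the standby side, the configuration described in this section might be sketched as follows. The 50G value comes from the earlier example; the sketch assumes the standby uses an spfile and is restarted afterwards, and that the service named in the DISTRIBUTE FOR SERVICE clause is running on the standby.

```sql
-- Run on the Active Data Guard standby instance.
ALTER SYSTEM SET INMEMORY_SIZE = 50G SCOPE=SPFILE;
ALTER SYSTEM SET INMEMORY_ADG_ENABLED = TRUE SCOPE=SPFILE;

-- After the restart, and once the service is active on the standby,
-- check which segments have been populated there.
SELECT segment_name, populate_status, bytes_not_populated
FROM   v$im_segments
ORDER  BY segment_name;
```

Running the same V$IM_SEGMENTS query on the primary should return no rows for the table if the service is active only on the standby.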
In-Memory and Automatic Data Optimization
• Automatic Data Optimization (ADO) was introduced in Oracle Database 12c Release 1 to automate Information Lifecycle Management (ILM) tasks
• ADO supports both compression tiering and storage tiering, using policies defined at the row or segment level on tables and partitions
• New in Oracle 12.2: ADO also manages the contents of the IM column store
• User-defined policies move objects in and out of the IM column store
In-Memory and Automatic Data Optimization
• User-defined policies can also adjust the compression level of objects within the IM column store
  ◦ SET INMEMORY: enables the INMEMORY attribute
  ◦ MODIFY INMEMORY: changes the columnar compression level of the object from a lower to a higher level
  ◦ NO INMEMORY: removes the object from the IM column store
ALTER TABLE sales_mar ILM ADD POLICY SET INMEMORY AFTER 14 DAYS OF CREATION;
• Use case: a table that is initially subject to heavy DML but later mainly to analytic queries
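The other two policy types could be sketched in the same way. The day counts are illustrative, sales_mar is reused from the example above, and the sketch assumes Heat Map tracking is switched on, since ADO relies on it to evaluate NO MODIFICATION and NO ACCESS conditions.

```sql
-- ADO policies depend on Heat Map activity tracking.
ALTER SYSTEM SET HEAT_MAP = ON SCOPE=BOTH;

-- Raise the In-Memory compression level once the data has stopped changing.
ALTER TABLE sales_mar ILM ADD POLICY
  MODIFY INMEMORY MEMCOMPRESS FOR QUERY HIGH
  AFTER 30 DAYS OF NO MODIFICATION;

-- Evict the table from the IM column store once nobody queries it any more.
ALTER TABLE sales_mar ILM ADD POLICY
  NO INMEMORY
  AFTER 90 DAYS OF NO ACCESS;

-- Review the policies that now exist in the current schema.
SELECT policy_name, action_type, condition_type, condition_days
FROM   user_ilmdatamovementpolicies;
```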
Hash Joins In-Memory 12.1.0.2
1) Hash table created in memory (PGA)
2) VEHICLES table (NAME column) is accessed in memory via full scan; data is uncompressed
3) SALES table (NAME column) is accessed in memory via full scan; data is uncompressed, the hashing algorithm is applied, and the data is sent to the hash join
4) Join completed by probing the hash table in memory to find matching rows
Local Dictionary
• Each IMCU has a local dictionary, which holds a list of distinct values and their corresponding dictionary codes (compression symbols)
• The IMCU stores only the dictionary code rather than the original value
Optimizing Joins with Join Groups
• Join groups have been introduced to improve the performance of hash joins on tables in the IM column store
• A join group is a set of columns on which a set of tables is frequently joined
CREATE INMEMORY JOIN GROUP jgroup_name (sales(name), vehicles(name));
• The join group tells the IM column store that the NAME column in both the VEHICLES and SALES tables should share the same compression dictionary (the common dictionary)
Common Dictionary
Local Dictionary in each IMCU now has references to the Common Dictionary
Common Dictionary created when Join Group is created
Optimizing Joins with Join Groups
• The database operates on compressed data as it is stored in the columnar store
• There is no costly decompression of data before the hash join is performed
• In a hash join based on a join group, the database uses an array instead of building a hash table in the PGA, which is faster and uses fewer resources
• The database stores codes for each join column value in a common dictionary
• The database joins on the codes rather than on the actual column values
Optimizing Joins with Join Groups
• The database automatically creates a common dictionary in the IM column store when a join group is defined on the underlying columns
• The common dictionary enables the join columns to share the same dictionary codes
• The local dictionary now stores references to the values stored in the common dictionary
• No decompression of data for hash joins
• Reduced memory requirements for performing hash joins
Optimizing Joins with Join Groups
CREATE INMEMORY JOIN GROUP sh.sales_products_jg
(sh.sales(prod_id), sh.products(prod_id));
ALTER TABLE sh.sales INMEMORY;
ALTER TABLE sh.products INMEMORY;
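To confirm that the join group above exists and that both columns share a dictionary, a check along these lines may help. It assumes both tables have been populated (or repopulated) after the join group was created, since the common dictionary is built at population time.

```sql
-- Both member columns of a join group should report the same GD_ADDRESS,
-- the address of the shared (common) dictionary.
SELECT joingroup_name, table_name, column_name, gd_address
FROM   dba_joingroups;

-- A join on the join group column that can benefit from the shared codes.
SELECT p.prod_id, COUNT(*) AS sales_rows
FROM   sh.sales s
JOIN   sh.products p ON s.prod_id = p.prod_id
GROUP  BY p.prod_id;
```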
Thanks for attending!!
Please send your feedback and any questions to:
Gavin Soorma
[email protected]