WebFOCUS Hyperstage
Analyze/Report from Large Volumes of Data
Information Builders
May 11, 2012
Information Builders (Canada) Inc.
WebFOCUS
Higher Adoption & Reuse with Lower TCO
• Mobile Applications
• Visualization & Mapping
• Data Updating
• Predictive Analytics
• Enterprise Search
• High Performance Data Store
• Performance Management
• Reporting
• Query & Analysis
• MS Office & e-Publishing
• Dashboards
• Information Delivery
• Business to Business
• Data Warehouse & ETL
• Data Profiling & Data Quality
• Master Data Management
• Business Activity Monitoring
Extensions to the WebFOCUS platform allow you to build more application types at a lower cost
WebFOCUS
High Performance Data Store
The Business Challenge
Big Data
Copyright 2007, Information Builders. Slide 4
Today’s Top Data-Management Challenge
Big Data and Machine Generated Data
[Chart: Data Storage over Time, comparing Machine-Generated Data with Human-Generated Data]
IT managers try to mitigate these response times ...
How Performance Issues are Typically Addressed – by Pace of Data Growth
Approach                               High Growth   Low Growth
Tune or upgrade existing databases         75%           66%
Upgrade server hardware/processors         70%           54%
Upgrade/expand storage systems             60%           33%
Archive older data on other systems        44%           30%
Upgrade networking infrastructure          32%           21%
Don't Know / Unsure                         4%            7%
When organizations have long running queries that limit the business, the response
is often to spend much more time and money to resolve the problem
Source: Keeping Up with Ever-Expanding Enterprise Data (Joseph McKendrick, Unisphere Research, October 2010)
Classic Approaches and Challenges
Data Warehousing
[Diagram labels: Limited Resources and Budget; More Data, More Data Sources; More Kinds of Output Needed by More Users, More Quickly]
[Data sources shown: Real time data; Multiple databases; External Sources]
Traditional Data Warehousing
• Labour intensive: heavy indexing, aggregations and partitioning
• Hardware intensive: massive storage; big servers
• Expensive and complex
Classic Approaches and Challenges
Data Warehousing – Growing Demands
New Demands:
Larger transaction volumes driven by the internet
Impact of Cloud Computing
More -> Faster -> Cheaper
Data Warehousing Matures:
Near real time updates
Integration with master data management
Data mining using discrete business transactions
Provision of data for business critical applications
Early Data Warehouse Characteristics:
Integration of internal systems
Monthly and weekly loads
Heavy use of aggregates
Classic Approaches and Challenges
Dealing with Large Data
INDEXES
CUBES/OLAP
Classic Approaches and Challenges
Limitations of Indexes
• Increased Space requirements
• Sum of Index Space requirements can exceed the source DB
• Index Management
• Increases Load times
• Building the index
• Predefines a fixed access path
Classic Approaches and Challenges
Limitations of OLAP
• Cube technology has limited scalability
• Number of dimensions is limited
• Amount of data is limited
• Cube technology is difficult to update (add Dimension)
• Usually requires a complete rebuild
• Cube builds are typically slow
• New design results in a new cube
Limitations of Rows
These Solutions Contribute to Operational Limitations
1. Impediments to business agility:
   • Users must wait for DBAs to create indexes or other tuning structures, which delays access to data.
   • Indexes significantly slow data-loading operations and increase the size of the database, sometimes by a factor of two.
2. Loss of data and time fidelity:
   • ETL operations are typically performed in batch during non-business hours.
   • Batch loads delay access to data and often result in mismatches between operational and analytic databases.
3. Limited ad hoc capability:
   • Response times for ad hoc queries increase as the volume of data grows.
   • Unanticipated queries (where DBAs have not tuned the database in advance) can result in unacceptable response times.
4. Unnecessary expenditures:
   • Attempts to improve performance using hardware acceleration and database tuning schemes raise the capital costs of equipment and the operational costs of database administration.
   • The added complexity of managing a large database diverts operational budgets away from more urgent IT projects.
Pivoting Your Perspective:
Columnar Technology ...
The Limitation of Rows
• The Ubiquity of Rows
Row-based databases are ubiquitous because so many of our most important business systems are transactional.
[Diagram: a table of 30 columns by 50 million rows]
Row-oriented databases are well suited for transactional environments, such as a call center where a customer’s entire record is required when their profile is retrieved and/or when fields are frequently updated.
But disk I/O becomes a substantial limiting factor, since a row-oriented design forces the database to retrieve all column data for any query.
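As a rough worked example of the disk I/O point above, the sketch below compares how many bytes a full scan would read from a row layout versus a column layout. The 30 columns and 50 million rows come from the slide; the 8-byte value width and the 3-column query are assumptions made only for illustration, and compression and indexes are ignored.

```python
# Figures from the slide: 30 columns, 50 million rows.
ROWS = 50_000_000
COLUMNS = 30

# Assumptions for illustration only (not from the slide).
BYTES_PER_VALUE = 8      # every column treated as a fixed 8-byte value
COLUMNS_IN_QUERY = 3     # the query needs 3 of the 30 columns


def bytes_scanned_row_store(rows, columns, width):
    """A full scan of a row layout reads every column of every row."""
    return rows * columns * width


def bytes_scanned_column_store(rows, columns_needed, width):
    """A column layout reads only the columns the query references."""
    return rows * columns_needed * width


row_bytes = bytes_scanned_row_store(ROWS, COLUMNS, BYTES_PER_VALUE)
col_bytes = bytes_scanned_column_store(ROWS, COLUMNS_IN_QUERY, BYTES_PER_VALUE)

print(f"row layout:    {row_bytes / 1e9:.1f} GB scanned")
print(f"column layout: {col_bytes / 1e9:.1f} GB scanned")
print(f"reduction:     {row_bytes // col_bytes}x")
```

Under these assumptions the row layout scans 12.0 GB and the column layout 1.2 GB, a 10x difference that comes purely from skipping unused columns.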
Pivoting Your Perspective
Columnar Technology
Employee Id   Name     Location   Sales
1             Smith    New York   50,000
2             Jones    New York   65,000
3             Fraser   Boston     40,000
4             Fraser   Boston     70,000
Row Oriented
(1, Smith, New York, 50000; 2, Jones, New York, 65000; 3, Fraser, Boston, 40000; 4, Fraser, Boston, 70000)
• Works well if all the columns are needed for every query
• Efficient for transactional processing if all the data for the row is available
Column Oriented
(1, 2, 3, 4; Smith, Jones, Fraser, Fraser; New York, New York, Boston, Boston; 50000, 65000, 40000, 70000)
• Works well with aggregate results (sum, count, avg)
• Only columns that are relevant need to be touched
• Consistent performance with any database design
• Allows for very efficient compression
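The two layouts above can be sketched in a few lines of Python using the four-employee table from the slide. This is an in-memory illustration only, not how any particular database stores data; the function and key names are made up for the example.

```python
# Row-oriented: one tuple per employee (id, name, location, sales).
rows = [
    (1, "Smith", "New York", 50000),
    (2, "Jones", "New York", 65000),
    (3, "Fraser", "Boston", 40000),
    (4, "Fraser", "Boston", 70000),
]

# Column-oriented: one list per column, values kept in row order.
columns = {
    "employee_id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "location": [r[2] for r in rows],
    "sales": [r[3] for r in rows],
}


def total_sales_row_store(table):
    # Every full record is visited just to reach its sales field.
    return sum(record[3] for record in table)


def total_sales_column_store(cols):
    # Only the "sales" column is read; the other columns are not touched.
    return sum(cols["sales"])


def fetch_employee_column_store(cols, position):
    # Rebuilding one whole record needs a lookup in every column,
    # which is why row layouts suit transactional access.
    order = ("employee_id", "name", "location", "sales")
    return tuple(cols[name][position] for name in order)


print(total_sales_row_store(rows))
print(total_sales_column_store(columns))
print(fetch_employee_column_store(columns, 1))
```

Both totals are the same; the difference is how much data each layout has to read to produce them.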
WebFOCUS Hyperstage
Introducing
WebFOCUS Hyperstage
• Mission
  o Improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration
• What is WebFOCUS Hyperstage?
  o High performance analytic data store
  o Designed to handle business-driven queries on large volumes of data without IT intervention
  o Easy to implement and manage, Hyperstage provides the answers your business users need at a price you can afford
• Advantages
  o Dramatically increases performance of WebFOCUS applications
  o Disk footprint reduced with a powerful compression algorithm = faster response time
  o Embedded ETL for seamless migration of existing analytical databases
  o No change in query or application required
  o Includes optimized Hyperstage Adapter
  o WebFOCUS metadata can be used to define hierarchies and drill paths to navigate the star schema
Introducing WebFOCUS Hyperstage
How it is architected
[Diagram labels: Hyperstage Engine; Knowledge Grid; Compressor; Bulk Loader]
Combines a columnar database with intelligence we call the Knowledge Grid to deliver fast query responses.
Improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration
Unmatched Administrative Simplicity
• No Indexes
• No data partitioning
• No Manual tuning
Introducing WebFOCUS Hyperstage
What it means for Customers
Self-managing: 90% less administrative effort
Low-cost: More than 50% less than alternative solutions
Scalable, high-performance: Up to 50 TB using a single industry-standard server
Fast queries: Ad hoc queries are as fast as anticipated queries, so users have total flexibility
Compression: Data compression of 10:1 to 40:1 means far less storage is needed, and the entire database might even fit in memory
Introducing WebFOCUS Hyperstage
How it works
Creates information (metadata) about the data and, upon load, automatically:
  o Stores it in the Knowledge Grid (KG)
  o KG is loaded into memory
  o KG is less than 1% of the compressed data size
Uses the metadata when processing a query to eliminate or reduce the need to access data:
  o The less data that needs to be accessed, the faster the response
  o Sub-second responses when answered by the KG
Architecture Benefits
  o No need to partition data, create/maintain indexes or projections, or tune for performance
  o Ad hoc queries are as fast as static queries, so users have total flexibility
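The idea of using metadata to avoid touching data can be sketched as follows. This is a simplified illustration of the general technique (per-block minimum/maximum statistics used to skip or pre-answer blocks), not Hyperstage's actual Knowledge Grid implementation; the `DataPack` class, the pack size of 4, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataPack:
    values: List[int]   # stands in for a compressed block on disk
    minimum: int        # metadata kept in memory
    maximum: int        # metadata kept in memory
    count: int          # metadata kept in memory


def build_packs(column: List[int], pack_size: int) -> List[DataPack]:
    """Split a column into packs and record min/max/count at load time."""
    packs = []
    for start in range(0, len(column), pack_size):
        chunk = column[start:start + pack_size]
        packs.append(DataPack(chunk, min(chunk), max(chunk), len(chunk)))
    return packs


def count_greater_than(packs: List[DataPack], threshold: int) -> Tuple[int, int]:
    """Count values > threshold; return (matches, packs_opened)."""
    matches = 0
    opened = 0
    for pack in packs:
        if pack.maximum <= threshold:
            continue                    # no value can match: skip the pack
        if pack.minimum > threshold:
            matches += pack.count       # every value matches: metadata suffices
        else:
            opened += 1                 # mixed pack: the data must be read
            matches += sum(1 for v in pack.values if v > threshold)
    return matches, opened


sales = [10, 12, 15, 18, 40, 45, 52, 60, 90, 95, 97, 99]
packs = build_packs(sales, pack_size=4)
print(count_greater_than(packs, 50))
```

With these sample values, only one of the three packs has to be opened to answer the query; the other two are resolved from metadata alone.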
WebFOCUS Hyperstage Engine
How it works
Column Orientation
Smarter Architecture
Knowledge Grid: statistics and metadata “describing” the super-compressed data
• No maintenance
• No query planning
• No partition schemes
• No DBA
Data Packs: data stored in manageably sized, highly compressed data packs
Data compressed using algorithms tailored to data type
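To illustrate what "algorithms tailored to data type" can mean in a column store, the sketch below applies run-length encoding to a repetitive text column and delta encoding to an increasing integer column. These are common columnar techniques chosen for illustration; the slides do not say which algorithms Hyperstage actually uses, and all function names here are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


def delta_encode(numbers):
    """Store the first number, then the difference to each next number."""
    if not numbers:
        return []
    return [numbers[0]] + [b - a for a, b in zip(numbers, numbers[1:])]


def delta_decode(deltas):
    """Rebuild the original numbers by accumulating the differences."""
    result = []
    total = 0
    for delta in deltas:
        total += delta
        result.append(total)
    return result


# Columns taken from the employee table shown earlier in the deck.
locations = ["New York", "New York", "Boston", "Boston"]
employee_ids = [1, 2, 3, 4]

encoded_locations = run_length_encode(locations)
encoded_ids = delta_encode(employee_ids)

print(encoded_locations)
print(encoded_ids)
```

Because a column holds values of a single type, a scheme can be chosen per column; a row layout interleaves types, which makes this harder.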
Summary
Business Intelligence – Meeting Requirements
WebFOCUS Hyperstage
The Big Deal
• No indexes
• No partitions
• No views
• No materialized aggregates
• Value proposition
  o Low IT overhead
  o Allows for autonomy from IT
  o Ease of implementation
  o Fast time to market
  o Less Hardware
  o Lower TCO
No DBA Required!
WebFOCUS Hyperstage Adapter
What it looks like
Example – Focus to Hyperstage Compression
243639 Rows
Q&A