www.pwc.com/technologyforecast
Technology Forecast: Remapping the database landscape
Issue 1, 2015
The realities of
polyglot persistence in
mainstream enterprises
Ritesh Ramesh describes how NoSQL and Hadoop get
used in retail environments.
Interview conducted by Alan Morrison and Bo Parker
Ritesh Ramesh is a chief
technologist for the global data
and analytics organization at PwC.
PwC: You’ve been the technical lead
on a number of big data projects at
retailers. What’s a typical database
challenge they’re encountering?
RR: Some clients run both traditional and
nontraditional databases, with Hadoop
and a NoSQL database¹ doing ingestion and
pre-processing. Their customer analytics may
run on a traditional database management
system—for financial reporting to the CFO
on sales, for instance. And they might have
a NoSQL solution, or perhaps Hadoop [an
open-source software framework], that they
use to acquire and process clickstream data.
These clients do a lot of pre-processing on
their clickstream data. They’ve learned that
a traditional database cannot handle their
typical daily acquisition of tens of gigabytes
of files, rising to terabytes every month.
For other retailers, recommendation engines
and personalization of websites are classic
reasons to use NoSQL databases. Everyone
wants to personalize customer-facing portals
and other interfaces for specific customers.
I’ve seen clients use NoSQL when their
websites are critical to their business models.
In addition, a lot of people are using Hadoop
and NoSQL for innovation pilots. For example,
someone wants to build a mobile app with a
new organization, or wants to do something
edgy, and they don’t want to work with a
traditional relational-database management
system and database administrators. They do
these pilots and, before you know it, they’re
using NoSQL for some customer-facing apps.
I don’t really see NoSQL being used directly
for any types of enterprise reporting
and dashboards. I think the traditional
databases that host enterprise data
warehouses are going to be used for that.
PwC: What are some of the other
advantages to NoSQL in that context?
RR: The fact that you can start with a
data model that doesn’t have a schema
increases your flexibility during app-design
iterations. That cuts your time to market.
Developers typically work in an object-oriented environment when they’re trying
to build an application, whether it’s a mobile
or online app. NoSQL schema flexibility
also aligns nicely with object orientation.
As a result, you can build powerful
customer-facing applications a lot faster.
It’s really just a case of simplicity with
NoSQL. For example, NoSQL doesn’t need
a normalized data model, which makes it
possible for developers to focus on building
solutions and own the process of storing and
retrieving data themselves. On top of that,
NoSQL emerged in the cloud computing era,
so most NoSQL options are cloud-ready.
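The point about schema flexibility and object orientation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any particular product's API: the `documents` dictionary stands in for a document store, and the `Cart`, `CartItem`, and `CartV2` classes and their field names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CartItem:
    sku: str
    quantity: int


@dataclass
class Cart:
    customer_id: str
    items: list = field(default_factory=list)


@dataclass
class CartV2(Cart):
    # A later design iteration adds a field; no migration step is needed.
    coupon_code: str = ""


# Hypothetical in-memory stand-in for a document store: key -> JSON text.
documents = {}


def save(key, obj):
    """Serialize the application object as-is and store it under one key."""
    documents[key] = json.dumps(asdict(obj))


def load(key):
    """Return the stored document as a plain dictionary."""
    return json.loads(documents[key])


save("cart:c-100", Cart("c-100", [CartItem("sku-1", 2)]))
save("cart:c-200", CartV2("c-200", [CartItem("sku-9", 1)], "SPRING"))

print(load("cart:c-100"))  # older shape, stored before coupon_code existed
print(load("cart:c-200"))  # newer shape, stored alongside the older one
```

Documents of both shapes sit side by side, which is the flexibility described above. The trade-off is that code reading the data has to tolerate a missing `coupon_code`.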
PwC: Can you give us an example?
RR: Sure. Let’s say I’m developing an
ecommerce site that wants to give customers
the option of receiving their receipts by email.
In that case, NoSQL lends itself to what I
call storing data by aggregate points. If a
retailer is using a key-value or document
database, that retailer just needs some
kind of identifier to manage the data, or
simply the customer’s name. Once that’s
done, the retailer quickly gets a complete
receipt for the order in one chunk. Then it’s
easy to send that receipt to the customer’s
mobile device, or by email to wherever.
Now, in the traditional world, that same
data will be modeled in 20 to 25 tables,
which is good if you have to slice and dice
the data. For slicing and dicing, it’s probably
better to start with a relational model. But
a key-value or document database is best
for pushing out entire-order receipts.
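The receipt example can be sketched as follows. This is a minimal illustration, assuming a plain dictionary as a stand-in for a key-value or document database; the order IDs, field names, and values are invented for the sketch.

```python
import json

# Hypothetical key-value store: order ID -> complete receipt as JSON text.
receipts = {}


def put_receipt(order_id, receipt):
    """Store the whole receipt as one aggregate under the order ID."""
    receipts[order_id] = json.dumps(receipt)


def get_receipt(order_id):
    """One lookup returns the complete receipt, ready to email or display."""
    return json.loads(receipts[order_id])


put_receipt("order-1001", {
    "customer": {"name": "A. Shopper", "email": "shopper@example.com"},
    "lines": [
        {"sku": "sku-1", "qty": 2, "unit_price": 5.00},
        {"sku": "sku-7", "qty": 1, "unit_price": 12.50},
    ],
    "total": 22.50,
})
put_receipt("order-1002", {
    "customer": {"name": "B. Buyer", "email": "buyer@example.com"},
    "lines": [{"sku": "sku-1", "qty": 3, "unit_price": 5.00}],
    "total": 15.00,
})


def units_sold_by_sku():
    """Slicing across orders means scanning and unpacking every aggregate."""
    totals = {}
    for raw in receipts.values():
        for line in json.loads(raw)["lines"]:
            totals[line["sku"]] = totals.get(line["sku"], 0) + line["qty"]
    return totals


print(get_receipt("order-1001")["total"])
print(units_sold_by_sku())
```

Fetching a receipt is a single key lookup, while the cross-order question has to walk every stored receipt. That second case is the kind of slicing and dicing where a relational model is the better starting point.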
PwC: What’s your view of the new
database environment that’s
emerging?
RR: It will be a polyglot environment going
forward. Clients will need a tightly integrated
heterogeneous set of both emerging and
traditional technology components to
manage all types of internal and external
data. NoSQL is not going to be this alien
technology coming into the enterprise and
then destroying everything else. People will
be forced to manage a hybrid environment.
They won’t say, “Oh, I’m going to standardize
my enterprise data warehouse on NoSQL
or Hadoop.” That’s not going to happen.
That’s why I say that NoSQL is likely better
for operational applications. For example,
companies who have brick-and-mortar stores
and an online ecommerce presence want their
offline and online data in the same place.
NoSQL can be used as a backend operational
data store to funnel in all their point-of-sale
data from their stores, together with all the
purchasing data from their websites. They
can also do this at a very low cost. NoSQL
brings the price point down so companies
can scale up their operations. When that
happens, a company’s price-to-performance
ratio decreases over the long term.
PwC: Some people we’ve interviewed
question NoSQL’s consistency.
RR: If you think about it, NoSQL provides
you with implicit transactional consistency.
In the retail example I mentioned earlier, if I
use an OLTP [online transactional processing]
database, such as Oracle, the result will be data
spread across seven tables. So now when I write
the transaction, I have to be worried about who
might be trying to read it while I’m writing it.
In the case of NoSQL, consistency is implicit,
because all you’re going to do is take the order
ID and use it as a key to write the entire receipt
into the associated value field. It’s almost
like you’re writing the transaction in a single
action, because that’s how you’re going to
retrieve it. It’s already ID’d and date-stamped.
Enterprises will soon figure out that NoSQL
delivers implicit consistency. In Cassandra, for
instance, it’s very unlikely that you’ll write an
order receipt in six different places. Instead,
that receipt will only occupy one place.
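The contrast between the two write paths can be sketched in Python. The assumptions are loud ones: three tables stand in for the seven mentioned above, SQLite stands in for the OLTP database, and a dictionary stands in for the NoSQL store. The sketch does not model concurrent readers, and real NoSQL products differ in the guarantees they give for a single-key write.

```python
import json
import sqlite3

# Relational side: one order is spread across several tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (order_id TEXT, sku TEXT, qty INTEGER);
    CREATE TABLE payments (order_id TEXT, amount REAL);
""")


def write_relational(order_id, customer, lines, amount):
    """Several inserts; the transaction keeps them all-or-nothing."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer))
        conn.executemany(
            "INSERT INTO order_lines VALUES (?, ?, ?)",
            [(order_id, sku, qty) for sku, qty in lines],
        )
        conn.execute("INSERT INTO payments VALUES (?, ?)", (order_id, amount))


# Key-value side: hypothetical store, order ID -> entire receipt.
receipts = {}


def write_aggregate(order_id, customer, lines, amount):
    """A single put: the receipt is either stored whole or not at all."""
    receipts[order_id] = json.dumps(
        {"customer": customer, "lines": lines, "amount": amount}
    )


order_lines = [("sku-1", 2), ("sku-7", 1)]
write_relational("order-1001", "A. Shopper", order_lines, 22.50)
write_aggregate("order-1001", "A. Shopper", order_lines, 22.50)
```

On the relational side, the transaction is what stops a reader from seeing an order without its lines or payment. On the key-value side there is only one value to write, which is the sense in which consistency is described as implicit.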
PwC: What about BI [business
intelligence] and NoSQL?
RR: I ask for the client’s definition of BI
and analytics before we take on any BI
strategy project. The trend we see is a whole
spectrum of sophistication in BI needs
at the business-function level.
Our team even created a taxonomy for BI
by business role. We said BI for digital
marketing means these things, BI for store
operations means these things, and BI
for supply chain means these things.
Different roles require different definitions,
of course. So it’s no surprise that a hybrid
SQL [structured query language]/NoSQL
environment is often the outcome of a BI
strategy project. The managers of store
operations, responsible for keeping track of
inventory, only ask for real-time information
in specific formats sent to their tablets.
They don’t need or want the rest of the BI
data, because they don’t have time to deal
with it. NoSQL is a great solution for these
managers. Operational NoSQL solutions
are becoming an efficient way to give
internal and external stakeholders access
to near-real-time information.
PwC: What role is NoSQL playing
in the data-integration challenge
at retailers? What is the NoSQL
access strategy without a query
language that is standard
across NoSQL technologies?
RR: Data access in NoSQL is often through
an API [application programming interface].
If done the right way, it works well. When
companies are using NoSQL to run their
websites or mobile services, they just put in
the data. APIs give you flexibility. Some APIs
can have a customer ID with seven columns.
Other APIs can specify a customer ID with
six columns, or whatever meets the business
need. So NoSQL access through the API is
probably a good way to go, compared to
the way things are done with a traditional
relational database—especially for people
who use NoSQL for data integration.
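One way to read the point about APIs exposing different sets of columns is sketched below. This is an illustrative assumption, not a description of any specific NoSQL product: the `customers` dictionary, the `get_customer` function, and all field names are hypothetical.

```python
# Hypothetical stored record: one customer ID with seven attributes.
customers = {
    "c-100": {
        "name": "A. Shopper",
        "email": "shopper@example.com",
        "phone": "555-0100",
        "city": "Chicago",
        "loyalty_tier": "gold",
        "last_order_id": "order-1001",
        "marketing_opt_in": True,
    }
}


def get_customer(customer_id, fields=None):
    """Return the record for a customer ID, optionally limited to some fields.

    Each caller names the fields it needs, so different API consumers can
    see different shapes of the same stored record.
    """
    record = customers[customer_id]
    if fields is None:
        return dict(record)
    return {name: record[name] for name in fields if name in record}


# A store-operations tablet might ask for two fields ...
print(get_customer("c-100", ["name", "loyalty_tier"]))
# ... while a marketing service asks for a different subset.
print(get_customer("c-100", ["email", "city", "marketing_opt_in"]))
```

The store holds one record per customer ID, and the API decides which attributes each caller receives, so the shape can follow the business need rather than a fixed table definition.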
¹ Structured query language, or SQL, is the dominant query language associated with relational databases. NoSQL stands for not only
structured query language. In practice, the term NoSQL is used loosely to refer to non-relational databases designed for distributed
environments, rather than the associated query languages. PwC uses the term NoSQL, despite its inadequacies, to refer to non-relational
distributed databases because it has become the default term of art. See the section “Database evolution becomes a revolution” in
“Enterprises hedge their bets with NoSQL databases” at http://www.pwc.com/us/en/technology-forecast/2015/remapping-databaselandscape/features/enterprises-nosql-databases.jhtml for more information on relational versus non-relational database technology.
To have a deeper conversation about remapping the database
landscape, please contact:
Gerard Verweij
Principal and US Technology
Consulting Leader
+1 (617) 530 7015
Chris Curran
Chief Technologist
+1 (214) 754 5055
Oliver Halter
Principal, Data and Analytics Practice
+1 (312) 298 6886
Bo Parker
Managing Director
Center for Technology and Innovation
+1 (408) 817 5733
About PwC’s Technology Forecast
Published by PwC’s Center for Technology
and Innovation (CTI), the Technology
Forecast explores emerging technologies
and trends to help business and technology
executives develop strategies to capitalize on
technology opportunities.
Recent issues of the Technology Forecast have
explored a number of emerging technologies
and topics that have ultimately become
many of today’s leading technology and
business issues. To learn more about the
Technology Forecast, visit
www.pwc.com/technologyforecast.
© 2015 PricewaterhouseCoopers LLP, a Delaware limited liability partnership. All rights reserved. PwC refers to the US member firm, and may sometimes refer to the PwC network. Each member firm
is a separate legal entity. Please see www.pwc.com/structure for further details. This content is for general information purposes only, and should not be used as a substitute for consultation with
professional advisors. MW-15-1351 LL