www.pwc.com/technologyforecast

Technology Forecast: Remapping the database landscape
Issue 1, 2015

The realities of polyglot persistence in mainstream enterprises

Ritesh Ramesh describes how NoSQL and Hadoop get used in retail environments.

Interview conducted by Alan Morrison and Bo Parker

Ritesh Ramesh is a chief technologist for the global data and analytics organization at PwC.

PwC: You’ve been the technical lead on a number of big data projects at retailers. What’s a typical database challenge they’re encountering?

RR: Some clients run both traditional and nontraditional databases, with the Hadoop and NoSQL database [1] doing ingestion and pre-processing. Their customer analytics may run on a traditional database management system, for financial reporting to the CFO on sales, for instance. And they might have a NoSQL solution or perhaps Hadoop [an open-source software framework] that they use to acquire and process clickstream data. These clients do a lot of pre-processing on their clickstream data. They’ve learned that a traditional database cannot handle their typical daily acquisition of tens of gigabytes of files, rising to terabytes every month.

For other retailers, recommendation engines and personalization of websites are classic reasons to use NoSQL databases. Everyone wants to personalize customer-facing portals and other interfaces for specific customers. I’ve seen clients use NoSQL when their websites are critical to their business models.

In addition, a lot of people are using Hadoop and NoSQL for innovation pilots. For example, someone wants to build a mobile app with a new organization, or wants to do something edgy, and they don’t want to work with a traditional relational-database management system and database administrators. They do these pilots and, before you know it, they’re using NoSQL for some customer-facing apps.

I don’t really see NoSQL being used directly for any types of enterprise reporting and dashboards.
I think the traditional databases that host enterprise data warehouses are going to be used for that.

PwC: What are some of the other advantages to NoSQL in that context?

RR: The fact that you can start with a data model that doesn’t have a schema increases your flexibility during app-design iterations. That cuts your time to market. Developers typically work in an object-oriented environment when they’re trying to build an application, whether it’s a mobile or online app. NoSQL schema flexibility also aligns nicely with object orientation. As a result, you can build powerful customer-facing applications a lot faster.

It’s really just a case of simplicity with NoSQL. For example, NoSQL doesn’t need a normalized data model, which makes it possible for developers to focus on building solutions and own the process of storing and retrieving data themselves. On top of that, NoSQL emerged in the cloud computing era, so most NoSQL options are cloud-ready.

PwC: Can you give us an example?

RR: Sure. Let’s say I’m developing an ecommerce site, and some ecommerce sites might want customers to have the option of receiving their receipts by email. In that case, NoSQL lends itself to what I call storing data by aggregate points. If a retailer is using a key-value or document database, that retailer just needs some kind of identifier to manage the data, or simply the customer’s name. Once that’s done, the retailer quickly gets a complete receipt for the order in one chunk. Then it’s easy to send that receipt to the customer’s mobile device, or by email to wherever.

Now, in the traditional world, that same data will be modeled in 20 to 25 tables, which is good if you have to slice and dice the data.
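The receipt example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dictionary stands in for the key-value store, and the `save_receipt` and `load_receipt` helpers, the field names, and the sample order are all hypothetical rather than taken from any particular NoSQL product.

```python
import json

# Hypothetical in-memory stand-in for a key-value store; a real deployment
# would use a NoSQL database rather than a Python dict.
receipt_store = {}


def save_receipt(order_id, receipt):
    """Write the whole receipt as one value under the order ID."""
    receipt_store[order_id] = json.dumps(receipt)


def load_receipt(order_id):
    """Read the complete receipt back with a single key lookup."""
    return json.loads(receipt_store[order_id])


# Illustrative order; every field name and value here is made up.
order = {
    "order_id": "ORD-1001",
    "customer": {"name": "A. Shopper", "email": "shopper@example.com"},
    "items": [
        {"sku": "TSHIRT-M", "qty": 2, "unit_price": 12.50},
        {"sku": "MUG-01", "qty": 1, "unit_price": 8.00},
    ],
    "placed_at": "2015-03-01T10:15:00Z",
}

save_receipt(order["order_id"], order)

# One lookup returns customer, items, and timestamp together, with no joins.
receipt = load_receipt("ORD-1001")
total = sum(item["qty"] * item["unit_price"] for item in receipt["items"])
print(f"Receipt {receipt['order_id']} for {receipt['customer']['name']}: {total:.2f}")
```

The whole order travels as one value, which is what makes emailing a receipt a single read. The same shape makes cross-order questions awkward, since every stored value has to be opened to answer them.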
For slicing and dicing, it’s probably better to start with a relational model. But a key-value or document database is best for pushing out entire-order receipts.

PwC: What’s your view of the new database environment that’s emerging?

RR: It will be a polyglot environment going forward. Clients will need a tightly integrated heterogeneous set of both emerging and traditional technology components to manage all types of internal and external data. NoSQL is not going to be this alien technology coming into the enterprise and then destroying everything else.

People will be forced to manage a hybrid environment. They won’t say, “Oh, I’m going to standardize my enterprise data warehouse on NoSQL or Hadoop.” That’s not going to happen. That’s why I say that NoSQL is likely better for operational applications.

For example, companies that have brick-and-mortar stores and an online ecommerce presence want their offline and online data in the same place. NoSQL can be used as a backend operational data store to funnel in all their point-of-sale data from their stores, together with all the purchasing data from their websites. They can also do this at a very low cost. NoSQL brings the price point down so companies can scale up their operations. When that happens, a company’s price-to-performance ratio decreases over the long term.

PwC: Some people we’ve interviewed question NoSQL’s consistency.

RR: If you think about it, NoSQL provides you with implicit transactional consistency. In the retail example I mentioned earlier, if I use an OLTP [online transactional processing] database, such as Oracle, the result will be data spread across seven tables. So now when I write the transaction, I have to be worried about who might be trying to read it while I’m writing it.
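The multi-table concern can be made concrete with a small sketch. It assumes an in-memory SQLite database as a stand-in for an OLTP system and uses two hypothetical tables in place of the seven mentioned. The `write_order` and `count_rows` helpers are illustrative names, not part of any product discussed here.

```python
import sqlite3

# In-memory SQLite as a stand-in for an OLTP database; the two tables are a
# simplified, hypothetical slice of a normalized order schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE order_items ("
    "order_id TEXT NOT NULL REFERENCES orders(order_id), "
    "sku TEXT NOT NULL, qty INTEGER NOT NULL)"
)
conn.commit()


def write_order(order_id, customer, items):
    """Insert the header row and all item rows as one atomic unit."""
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if it raises, so no partial order is left behind.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer))
        conn.executemany(
            "INSERT INTO order_items VALUES (?, ?, ?)",
            [(order_id, sku, qty) for sku, qty in items],
        )


def count_rows(table):
    # Table names come from this script only, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


write_order("ORD-1001", "A. Shopper", [("TSHIRT-M", 2), ("MUG-01", 1)])

try:
    # The second item has no quantity, so the NOT NULL constraint fails and
    # the header row inserted just before it is rolled back as well.
    write_order("ORD-1002", "B. Buyer", [("MUG-01", 1), ("CAP-02", None)])
except sqlite3.IntegrityError:
    pass

print(count_rows("orders"), count_rows("order_items"))
```

Because one order touches several tables, the writes have to be wrapped in a transaction so that a failure, or a concurrent reader, never sees a header without its items.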
In the case of NoSQL, consistency is implicit, because all you’re going to do is take the order ID and use it as a key to write the entire receipt into the associated value field. It’s almost like you’re writing the transaction in a single action, because that’s how you’re going to retrieve it. It’s already ID’d and date-stamped.

Enterprises will soon figure out that NoSQL delivers implicit consistency. In Cassandra, for instance, it’s very unlikely that you’ll write an order receipt in six different places. Instead, that receipt will only occupy one place.

PwC: What about BI [business intelligence] and NoSQL?

RR: I ask for the client’s definition of BI and analytics before we take on any BI strategy project. What we see as a trend is that there is a whole spectrum of sophistication in BI needs at the business function level. Our team even created a taxonomy for BI by business role. We said BI for digital marketing means these things, BI for store operations means these things, and BI for supply chain means these things. Different roles require different definitions, of course. So it’s no surprise that a hybrid SQL [structured query language]/NoSQL environment is often the outcome of a BI strategy project.

The managers of store operations, responsible for keeping track of inventory, only ask for real-time information in specific formats sent to their tablets. They don’t need or want the rest of the BI data, because they don’t have time to deal with it. NoSQL is a great solution for these managers. Operational NoSQL solutions are becoming an efficient way to give internal and external stakeholders access to near real-time information.

PwC: What role is NoSQL playing in the data-integration challenge at retailers? What is the NoSQL access strategy without a query language that is standard across NoSQL technologies?

RR: Data access in NoSQL is often through an API [application programming interface]. If done the right way, it works well.
When companies are using NoSQL to run their websites or mobile services, they just put in the data. APIs give you flexibility. Some APIs can have a customer ID with seven columns. Other APIs can specify a customer ID with six columns, or whatever meets the business need. So NoSQL access through the API is probably a good way to go, compared to the way things are done with a traditional relational database, especially for people who use NoSQL for data integration.

[1] Structured query language, or SQL, is the dominant query language associated with relational databases. NoSQL stands for not only structured query language. In practice, the term NoSQL is used loosely to refer to non-relational databases designed for distributed environments, rather than the associated query languages. PwC uses the term NoSQL, despite its inadequacies, to refer to non-relational distributed databases because it has become the default term of art. See the section “Database evolution becomes a revolution” in “Enterprises hedge their bets with NoSQL databases” at http://www.pwc.com/us/en/technology-forecast/2015/remapping-databaselandscape/features/enterprises-nosql-databases.jhtml for more information on relational versus non-relational database technology.

To have a deeper conversation about remapping the database landscape, please contact:

Gerard Verweij
Principal and US Technology Consulting Leader
+1 (617) 530 7015
[email protected]

Chris Curran
Chief Technologist
+1 (214) 754 5055
[email protected]

Oliver Halter
Principal, Data and Analytics Practice
+1 (312) 298 6886
[email protected]

Bo Parker
Managing Director, Center for Technology and Innovation
+1 (408) 817 5733
[email protected]

About PwC’s Technology Forecast

Published by PwC’s Center for Technology and Innovation (CTI), the Technology Forecast explores emerging technologies and trends to help business and technology executives develop strategies to capitalize on technology opportunities.
Recent issues of the Technology Forecast have explored a number of emerging technologies and topics that have ultimately become many of today’s leading technology and business issues. To learn more about the Technology Forecast, visit www.pwc.com/technologyforecast.

© 2015 PricewaterhouseCoopers LLP, a Delaware limited liability partnership. All rights reserved. PwC refers to the US member firm, and may sometimes refer to the PwC network. Each member firm is a separate legal entity. Please see www.pwc.com/structure for further details. This content is for general information purposes only, and should not be used as a substitute for consultation with professional advisors.

MW-15-1351 LL