Copyright © 2015, Oracle and/or its affiliates. All rights reserved.

Finding new business potential with Big Data Analytics
Carsten Frisch, Oracle Business Analytics
DOAG 2015 Business Solutions Conference
Darmstadt, 10 June 2015

Speaker
» Carsten Frisch
» Senior Sales Consultant
» Business Analytics Big Data Discovery Lead - DE/CH Cluster
» Contact: +49 (0)6103 397-380
» [email protected]

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Monetizing New Insights
Business Cases for Big Data and the Discovery Lab

Financial Services
Enabling Rich Customer Experience Across Channels Is A Key Focus For Banks
Customers have become more demanding, and their loyalties are diffused by low switching costs. The customer experience expectations for banking services (across channels) are being reset by the experiences provided by retailers and online providers elsewhere.
360 degree view of customer (channels shown): Email, Mail, Sales, Branch, Phone, Mobile, Online, ATM
Source: Redefining Customer Experience, Infosys Whitepaper; PWC Report 2012

Banks Need To Move Towards Personalization And Targeted Marketing To Enhance Customer Experience
Top 3 Emerging Changes in Customer Behavior That Impact Banking (% of respondents)
» Using Direct and Self-Service Channels: 63%
» Seeking Better, More Personal Advice: 49%
» Price Sensitivity, Discount Seeking: 44%
Customer Demand…
» More personalized services, offers and enhanced customer experience
» More relevant services and transparent access to information across all channels consistently
» Increase simplicity, self-control, mobility of banking services
Customers are making web / mobile their primary channel of interaction with their banks. These channels are already heavily personalized, and there is a rising demand from customers for more personalized services and offers.
Source: Enhancing The Banking Customer Value Proposition Through Technology-led Innovation, Accenture
Market Challenges Are Compelling Banks To Focus On Customer Insight And Real-Time Offers
Industry challenges and key Big Data capabilities:

ENHANCE CUSTOMER EXPERIENCE
Develop deep client relationships by offering superior service
» Analyze internal customer logs and social media activity to generate indications of customer dissatisfaction, allowing time to act
» Analyze behavior profiles, spending habits, and segmentation to gain a view on customer risk and enhance risk management capabilities

REAL-TIME OFFERS
Rapid time to market and improved customer value
» Generate real-time, context sensitive, targeted offers based on analytical insights on spending patterns
» Leverage insights from social media during various stages of product and service development

OPTIMIZE OPERATIONS
Provide more visibility into performance in order to facilitate timely and cost effective management of operations
» Discover opportunities to achieve greater efficiency across global operations
» Understand and forecast performance and drive strategies that improve operations and financial results

Source: Oracle Financial Services Industry Solutions Overview; Oracle Insight; PWC Report

Leveraging Big Data for Competitive Advantage in FS
Diagram labels: Customer Insight, Social Media Sentiment & Engagement, Big Data Augmentation, Personalised Services, New Product Launch, Optimise Operations, Data Monetisation, Real Time Offers, New Revenue Streams, Context Sensitive Offers / Ads, Location Based Offers / Ads, Compliance Processes, Fraud Detection, Information as a Service, Risk Management, Fast Data, Quality of Models, Financial risk, Security risk

Digital Business, Data-driven Decisioning
Characteristics of Digital Business Leaders
» They ‘Reframe’ Challenges: looking at them from new perspectives and multiple angles
» They Sprint: they work at pace, researching, testing and evaluating current ideas while generating new ones
» They Appreciate That Failure Can Be Good and are not afraid of new ideas
» They Convert Data Into Value: they invest heavily in analyzing their own data and data from external sources to establish patterns and unnoticed opportunities

Data-driven Decisions: Data Science + Knowledge Discovery
1. Identify (business) question: Become clear about all aspects of the decision to be taken or the problem to be solved. Try to identify alternatives to your perception.
2. Verify earlier findings: Find out who has investigated this or a similar problem in the past and which approach was taken.
3. Design of a solution model: Formulate a detailed hypothesis of how specific variables might influence the result of the chosen model.
4. Gather all necessary data: Gather all available information about the variables of your hypothesis. A dataset might address your business question directly, or its relevance might need to be derived.
5. Analyse the data: Apply a statistical model and evaluate the correctness of the approach. Repeat this procedure until the right method has been identified.
6. Present & implement results: Frame the results obtained in a comprehensible story. This kind of presentation is intended to motivate decision makers and relevant stakeholders to take action.
Non-analysts and executives should take a closer look at steps 1 and 6 of the analysis process if they plan to make use of statistical analysis.
Adapted from Thomas H. Davenport, Harvard Business Manager 2013
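The "Analyse the data" step above (apply a model, evaluate it, repeat until a suitable method is found) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the two candidate models (`fit_mean`, `fit_line`) and the helper `select_model` are hypothetical names chosen for this example, and mean squared error on a hold-out set stands in for whatever "correctness" measure a real project would choose.

```python
import random

def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Second candidate: ordinary least squares with one input variable."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_model(candidates, train, holdout):
    """Fit every candidate on the training data, score it on the hold-out data."""
    results = {}
    for name, fit in candidates.items():
        model = fit(*train)
        results[name] = mse(model, *holdout)
    best = min(results, key=results.get)
    return best, results

# Synthetic data (assumption): a linear relationship plus noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
train = (xs[:150], ys[:150])
holdout = (xs[150:], ys[150:])

best, results = select_model({"mean_only": fit_mean, "linear": fit_line}, train, holdout)
print(best, results)
```

Scoring on data the model has not seen is the design choice that matters here: it keeps the "repeat until the right method has been identified" loop from simply rewarding the most flexible candidate.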
Vertical and Horizontal Data Science Skills
Vertical (The Specialist)
» Deep technical skills
» Eigenvalues, Lasso-related regressions
» Experts in Bayesian networks, R, Support Vector Machine
» Hadoop, NoSQL, Data Modeling, DW (Data Warehouse)
Horizontal (The Unicorn)
» Cross-discipline knowledge
» Machine Learning & Statistics
» Visualization skills
» Domain expertise
» Storytelling experts
» Programming experience
» Aware of pitfalls & rules of thumb
Look for the individual Unicorn or build a Data Science Team?

Enabling Data-driven Innovations in Organizations
Roles shown in the diagram:
» Executive: Decisions affecting strategy and direction
» Data Scientists: Information analysis to meet strategic goals
» Business Analysts: Day-to-day performance of a business unit
» Information Consumer: Reporting on individual transactions
» Automated Process: Decisions affecting execution of an individual transaction
Further diagram labels: Perf. Mgmt.; ACC (Knowledge Discovery, Insight); BICC (Dynamic Dashboards and Reports, Volume and Fixed Reporting); Knowledge Driven Business Process

Analytical Competence Center (ACC)
» Separate group reporting to CxO, not part of a Business Intelligence Competence Center (BICC)
» Mission: broadening the adoption of Analytics across the organization
» Skilled resource pool of Data Scientists, Statisticians and Business Experts
» Data-driven approach (not development-driven) with privileged access to enterprise data sources
» Group will be assigned to projects for a limited time

Discovery Lab
Information Management – Conceptual View
Diagram labels:
» Inputs: Data Streams, Structured Enterprise Data, Other Data
» Execution (BICC): Event Engine, Data Reservoir, Data Factory, Enterprise Information Store, Business Intelligence
» Outputs: Actionable Events, Actionable Insights, Actionable Information
» Line of governance between Execution and Innovation
» Innovation (ACC): Discovery Lab, fed by Events & Data, producing Discovery Output
Source: Oracle White Paper “Information Management and Big Data – A Reference Architecture”

Discovery Lab: Design Pattern
» Iterative development approach: data oriented, NOT development oriented
» Small group of highly skilled individuals (aka “Data Scientists” or a team organized as an Analytical Competence Center, ACC) with privileged access to enterprise data sources
» Specific focus on identifying commercial value for exploitation
» Wide range of tools and techniques applied
» Typically separate infrastructure, but could also be a unified Reservoir if resources are managed effectively
» Data provisioned through Data Factory or own ETL processes

Discovery Lab: Activity Cycles
[Figure: activity cycles of the Discovery Lab; no text recoverable]
Discovery Lab: Data Provisioning
General BI flow: The majority of BI development activity will be from existing sources, performed by the BICC developing new reports to existing or new channels.
Data Factory flow: Raw Data, Pre-Built Intelligence Assets, Virtualisation & Information Services
ACC: The ACC may quickly develop new reporting through mashups from any available internal and external sources and may use advanced analytical tools for innovative analysis.
Diagram labels:
» Analysis, Processing & Delivery: Scorecards, Charts & Graphs, Ad Hoc Query & Analysis Tools, Intelligence Analysis Tools, OLAP Tools, Forecasting & Simulation Tools, Reporting Tools, Dashboards & Reports, Query & Search Tools, Statistics Tools
» Discovery Lab & Development Environment: Sandbox – Project 1, Sandbox – Project 2, Sandbox – Project 3; Data Science (Primary Toolset); Data store; Analytical Processing
» ACC tools: Data Modelling Tools, Programming & Scripting, Data & Text Mining Tools, Faceted Query Tools, Data Quality & Profiling, Graphical rendering tools

Unified: Big Data Management and Analytics…
Experiment, Prototype & Collaborate: Hadoop (HDFS) Data Reservoir on BDA, holding polystructured data as tables in Hadoop; Oracle Big Data Discovery; Oracle R for Hadoop
Productize, Secure & Govern: Data Warehouse in Oracle Database on Exadata, holding structured data as tables in the DB; Oracle Advanced Analytics (Data Mining, Oracle R Enterprise); Oracle BI Foundation Suite (ROLAP/MOLAP, Mobile, …) on the Exalytics In-Memory Appliance; Oracle SQL queries
Oracle Big Data SQL: SQL join between tables in the DB and tables in Hadoop
» Quickly find, explore, transform, analyze and share discoveries in Big Data Discovery
» Publish results to the Hadoop Distributed File System (HDFS)
» Use to build predictive models with Oracle R for Hadoop
» Connect published HDFS files to secure Oracle DB using Oracle Big Data SQL
» No data movement required
» Seamlessly extends existing DWH and BI investments with non-traditional data in Hadoop

Need To Get Analytic Value Fast
Data Uncertainty: 80% effort typically spent on evaluating and preparing data
» Not familiar and overwhelming
» Potential value not obvious
» Requires significant manipulation
Tool Complexity: Overly dependent on scarce and highly skilled resources
» Early Hadoop tools only for experts
» Existing BI tools not designed for Hadoop
» Emerging solutions lack broad capabilities

Oracle Big Data Discovery

Oracle Big Data Discovery: The Visual Face of Hadoop
find, explore, transform, discover, share

Oracle Big Data Discovery: Components
Hadoop Cluster (Oracle Big Data Appliance or Commodity Hardware with Cloudera CDH 5.)
Oracle Big Data Discovery Workloads:
Studio
• Web UI: Find, Explore, Transform, Discover, Share
In-Memory Discovery Indexes (BDD node)
• DGraph: Search, Guided Navigation, Analytics
Data Processing, Workflow & Monitoring
• Profiling: catalog entry creation, data type & language detection, schema configuration
• Sampling: dgraph (index) file creation
• Transforms: >100 functions
• Enrichments: location (geo), text (cleanup, sentiment, entity, key-phrase, whitelist tagging)
Self-Service Provisioning & Data Transfer
• Personal Data: Upload CSV and XLS to HDFS
Hadoop 2.x diagram labels: name node, data nodes, Metadata (HCatalog), Workload Mgmt (YARN), Filesystem (HDFS), MapReduce, Spark, Other Hadoop Workloads
Further diagram labels: Hive, Pig, Oracle Big Data SQL (Oracle Big Data Appliance only)

Oracle Big Data Discovery: Preparation of Data Sources
Data sources have to be created as Hive tables and registered in the Hive Metastore. Examples shown:
» Hive table definition for fixed-width or delimited files
» Hive table with a standard Regex SerDe (“Serializer-Deserializer”) that uses regular expressions to map more complex file structures into regular table columns
» Hive table using a custom-developed SerDe to map nested file structures of a JSON file into regular table columns

Oracle Big Data Discovery: Preparation of Data Sources
There are multiple ways to get new data sets loaded…
» Big Data Discovery: Upload of XLS and CSV files and automatic Hive table creation
» HUE (Hadoop User Experience): Upload of various file formats, table creation wizards, web-based Hive query client
» Hive Command Line: Interface is similar to the MySQL command line

Oracle Big Data Discovery: Preparation of Data Sources
…or by using your favorite Data Integration / ETL Tool
Knowledge modules shown (between File (FS/HDFS), Hive, HBase, Oracle DB and any RDBMS):
» IKM File To Hive (Load Data)
» IKM Hive Transform
» IKM Hive Control Append
» IKM File-Hive To Oracle (OLH, OSCH)
» LKM HBase to Hive
» IKM SQL to Hive-HBase-File (SQOOP)
» IKM Hive to HBase
» IKM File-Hive to SQL (SQOOP)
Oracle Data Integrator 12.1.3 with Advanced Big Data Option (supporting HDFS, Hive, HBase, Sqoop, Pig, Spark)

Oracle Big Data Discovery: Data Ingestion
Data Processing Workflow including Profiling and Enrichment
Flow shown: the table access_logs (100m rows) in Hive / HCatalog is reduced to access_logs (1m rows), which passes through the Profiling and Enrichment Process into BDD as access_logs (1m rows).
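Two ideas from the slides above, mapping raw log lines into table columns with a regular expression and reducing a large table to a smaller sample before profiling, can be illustrated with a small Python sketch. It does not use Hive or Big Data Discovery: the log format, the pattern `LOG_PATTERN` and the helper names are assumptions made for this example, and reservoir sampling is a generic technique swapped in here because the slides do not say how BDD draws its sample.

```python
import random
import re

# Hypothetical pattern for an Apache-style access log line. A Hive Regex SerDe
# works on the same principle: each capture group becomes one table column.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Map one raw log line to a dict of columns, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def reservoir_sample(rows, k, seed=0):
    """Draw k rows uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)          # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive bounds
            if j < k:
                sample[j] = row         # replace with probability k / (i + 1)
    return sample

# Synthetic input (assumption): 10,000 well-formed log lines.
lines = [
    f'10.0.0.{i % 250} - - [10/Jun/2015:10:00:00 +0200] "GET /page/{i} HTTP/1.1" 200 {100 + i}'
    for i in range(10_000)
]

parsed = (row for row in map(parse_line, lines) if row is not None)
sample = reservoir_sample(parsed, k=100)
print(len(sample), sample[0])
```

Because the sampler consumes a generator, the full set of parsed rows never has to be held in memory, which is the property that matters when the source table is far larger than the sample.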
Sample shown: 1M of 100M rows

Demonstration: Oracle Big Data Discovery

Catalog
» Access a rich, interactive catalog of all data in Hadoop
» Familiar search and guided navigation for ease of use
» See data set summaries, user annotation and recommendations
» Provision personal and enterprise data to Hadoop via self-service

Explore
» Visualize all attributes by type
» Sort attributes by information potential
» Assess attribute statistics, data quality and outliers
» Use scratch pad to uncover correlations between attributes

Transform
» Intuitive, user driven data wrangling
» Extensive library of powerful data transformations and enrichments
» Preview results, undo, commit and replay transforms
» Test on sample data then apply to full data set in Hadoop

Transform – User friendly…
Preferred method for the Business Analyst

Transform – … but flexible (based on Groovy Programming Language / Library)
Preferred method for IT / Data Engineer / Data Scientist …

Discover
» Join and blend data for deeper perspectives
» Easy usage: compose project pages via drag and drop
» Use powerful search and guided navigation to ask questions
» See new patterns in rich, interactive data visualizations

Share
» Share projects, bookmarks and snapshots with others
» Build galleries and tell big data stories
» Collaborate and iterate as a team
» Publish blended data to HDFS for leverage in other tools
Data Discovery & Analytics

Data Discovery & Analytics Lifecycle
Typical Effort

Data Discovery & Analytics Lifecycle
More Time left for Analysis and Interpretation of Results

Analytics: More Data Variety available – Better Results
Example: Marketing Campaigns. Getting "lift" on responders
Data Mining-based prediction results with Response Modelling including hundreds of input variables like:
» Demographic data
» Purchase POS transactional data
» Polystructured data, text & comments
» Spatial location data
» Long term vs. recent historical behaviour
» Web visits
» Sensor data
» …
[Figure: lift chart. x-axis: Population Size (% of Total Cases), 0 to 100; y-axis: % of Positive Responders, 0 to 100; curves: Naïve Guess or Random, Model with 20 variables, Model with 75 variables, Model with 250 variables]

Oracle Advanced Analytics
Native SQL Data Mining/Analytic Functions + High-performance R Integration
Oracle R Enterprise (ORE)
» Allows distributed processing of huge data volumes
» Benefits from DB features, e.g. security and SQL access
» RStudio = GUI for Data Analysts
Oracle Data Mining (ODM)
» Implemented in the Oracle Database kernel
» Direct access via PL/SQL API and SQL operators
» Oracle Data Miner GUI embedded in SQL Developer
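The lift chart in the marketing campaign example above plots, for each share of the population contacted in order of model score, the share of all positive responders reached. A minimal Python sketch of that calculation follows. It is not Oracle Data Mining or Oracle R Enterprise code: the data is synthetic, and `cumulative_gains` is a hypothetical helper written for this illustration.

```python
import random

def cumulative_gains(scores, labels, steps=10):
    """Return (% of population, % of positive responders reached) pairs.

    Cases are ranked by model score, highest first, as in a lift chart.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive responder")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    curve = []
    for step in range(1, steps + 1):
        cutoff = round(n * step / steps)
        reached = sum(label for _, label in ranked[:cutoff])
        curve.append((100.0 * cutoff / n, 100.0 * reached / total_pos))
    return curve

# Synthetic data (assumption): about 10% responders, who tend to score higher.
rng = random.Random(7)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(5000)]
scores = [rng.gauss(1.0 if label else 0.0, 1.0) for label in labels]

curve = cumulative_gains(scores, labels)
for population_pct, responders_pct in curve:
    # A random model would reach the same percentage of responders as of population.
    print(f"{population_pct:5.1f}% contacted -> {responders_pct:5.1f}% of responders "
          f"(random: {population_pct:5.1f}%)")
```

The gap between each row's responder percentage and the random baseline is the "lift"; per the slide, models with more input variables are expected to push the curve further above the diagonal.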