Intel Analytics
Rich Pilling, EMEA Director of Professional Services & Analytics, Intel
May 2014
Intel Confidential - Do Not Forward

What happens online in 60 seconds?
Source: http://blog.qmee.com/qmee-online-in-60-seconds/

“The future got to us, before we got to it.”
Terry Gilliam, Monty Python

SEMANTICS!

Virtuous Cycle of Computing
DEVICES, SERVICES, DATACENTER, DEVICES … and so on
Other brands and names are the property of their respective owners.

SEMANTIC INFORMATION IS FUEL FOR THE CYCLE

MACHINE LEARNING ON THIS
[Timeline figure: 1985, 1995, 2005, 2015. Labels: enterprise, NoSQL, RDF, Docs + Semantics.]

This model...
• Danny : isBrotherOf : Nezih
• food cart : uses : bicycles
• Frank : isFriendsWith : Mohit
• Frank : isFriendsWith : Ted
• Frank : likes : bicycles
• Frank : likes : food carts
• Ivy : isFriendsWith : Kushal
• Ivy : isFriendsWith : Ted
• Ivy : likes : bicycles
• Ivy : likes : food carts
• Kushal : isFriendsWith : Mohit
• Kushal : isFriendsWith : Nezih
• Nezih : isFriendsWith : Ted
• Ted : likes : bicycles
... infers this interest.
[Graph figure: Danny, Frank, Ivy, Kushal, Mohit, Nezih and Ted joined by "friends" and "brothers" edges, with "likes" edges to Bicycles and Food Cart, a "uses" edge from Food Cart to Bicycles, and one edge marked "Likes?".]

IMAGINE THE POSSIBILITIES

Imagine the enjoyment
[Figure: graph of channel viewing behavior showing current popular surfing patterns. Programs (for example SH002463130000 and EP005544723744) are shaded by Program Importance (Centrality), from High to Low.]
Changes in surfing behavior may predict customer churn.

Imagine the security

Imagine the satisfaction
Preference and Similarity Recommendations
[Figure: graph of User and Movie nodes. 1.7MM nodes, 23.9MM edges.]
Nodes shown:
• userId: A0A22A5
• title: The Departed, genre: Crime drama, cast: [L. DiCaprio, M. Damon]
• title: Scarface, genre: Crime drama, cast: [Al Pacino, M. Pfeiffer]
• title: The Godfather, genre: Crime drama, cast: [M. Brando, Al Pacino]
Edge labels shown: similar topic (weight=0.03), similar cast (weight=0.67), prefers (weight=11.8), and one further edge with weight=14.98.

A yoga ball graph. Really!?!
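The "This model... infers this interest" slide earlier in the deck can be made concrete with a small sketch. The deck does not say how the inference is made, so the method here is an assumption: a simple neighbor vote, where an item is suggested to a person once for each friend who already likes it. The function names build_indexes and infer_interests are hypothetical and are not part of any Intel toolkit. The triples are the ones listed on the slide; treating isFriendsWith as symmetric is also an assumption.

```python
from collections import Counter, defaultdict

# Subject : predicate : object triples as listed on the slide.
TRIPLES = [
    ("Danny", "isBrotherOf", "Nezih"),
    ("food cart", "uses", "bicycles"),
    ("Frank", "isFriendsWith", "Mohit"),
    ("Frank", "isFriendsWith", "Ted"),
    ("Frank", "likes", "bicycles"),
    ("Frank", "likes", "food carts"),
    ("Ivy", "isFriendsWith", "Kushal"),
    ("Ivy", "isFriendsWith", "Ted"),
    ("Ivy", "likes", "bicycles"),
    ("Ivy", "likes", "food carts"),
    ("Kushal", "isFriendsWith", "Mohit"),
    ("Kushal", "isFriendsWith", "Nezih"),
    ("Nezih", "isFriendsWith", "Ted"),
    ("Ted", "likes", "bicycles"),
]


def build_indexes(triples):
    """Index friendships (assumed symmetric) and likes by person.

    Triples with other predicates (isBrotherOf, uses) are ignored here.
    """
    friends = defaultdict(set)
    likes = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "isFriendsWith":
            friends[subject].add(obj)
            friends[obj].add(subject)
        elif predicate == "likes":
            likes[subject].add(obj)
    return friends, likes


def infer_interests(person, triples):
    """Return (item, votes) pairs: one vote per friend who likes the item.

    Items the person already likes are skipped. This neighbor vote is an
    illustrative assumption, not the method used in the deck.
    """
    friends, likes = build_indexes(triples)
    already_liked = likes.get(person, set())
    votes = Counter()
    for friend in friends.get(person, set()):
        for item in likes.get(friend, set()):
            if item not in already_liked:
                votes[item] += 1
    return votes.most_common()


if __name__ == "__main__":
    for name in ("Ted", "Mohit", "Nezih"):
        print(name, infer_interests(name, TRIPLES))
```

Under this assumption, Ted's only suggestion is food carts, backed by two friends (Frank and Ivy). A production system would more likely use a weighted or probabilistic method over a far larger graph.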
You may actually need a model like this
• When the problem is an information network
• When a graph is a natural way of expressing the algorithm
• When you want to study specific relationships
• When you want faster machine learning or solvers on sparse data
[Figure labels: triangle count, central influence, shortest path, sub networks.]

But there are challenges.
Handling all that data.
Finding people good at both handling all that data and data analysis.
Putting exploratory work into production fast enough to keep up with the competition.

Congratulations! You are a data scientist.

It’s a demanding job
[Workflow figure: Ingest & Clean, Engineer Features, Structure Model, Train Model, Learn, Query & Analyze, Visualize.]
• Painful data ingestion and preparation
• Skills shortage at the intersection of systems engineering and data analysis
• Workflows that are not designed with loopbacks in mind
• Composing the pipeline is DIY
• Few tools for analyzing semantics at scale

IMAGINE A PLATFORM FOR DATA SCIENTISTS
DOCS + SEMANTICS + MACHINE LEARNING

Delivering it
Intel Analytics Toolkit
[Stack figure, top to bottom:]
• DATA SCIENCE SERVER (Query and Scripting)
• BIG DATA API
• MACHINE LEARNING AND STATISTICS, DATA WRANGLING
• Useful String Manipulation, Useful Math Operators, Graph Construction Tools, Graphical Algorithms, Classical Algorithms
• MarkLogic, APACHE HADOOP, APACHE SPARK
• FILESYSTEMS AND NOSQL STORAGE
• A UNIFIED DOCUMENT + SEMANTIC STORE
• HW PLATFORM

Bringing a full spectrum of possibilities
Graph Approach

Algorithm                                     Category                  Applications/Use Cases
Loopy Belief Propagation (LBP)                Structured Prediction     Personalized recommendations, image de-noising
Label Propagation                             Structured Prediction     Personalized recommendations
Alternating Least Squares (ALS)               Collaborative Filtering   Recommenders
Conjugate Gradient Descent (CGD)              Collaborative Filtering   Recommenders
Connected Components                          Graph Analytics           Network manipulation, image analysis
Latent Dirichlet Allocation (LDA)             Topic Modeling            Document clustering
Structure Attribute                           Clustering                Network analysis, consumer segmentation
K-Truss                                       Clustering                Social network analysis
KNN*                                          Clustering                Recommenders
Logistic Regression*                          Classification            Fraud detection
Random Forest*                                Classification            Fraud detection, consumer segmentation
Generalized Linear Model (Binomial, Poisson)  Non-linear Curve Fitting  Forecasting, pricing, market mix models
Association Rule Mining                       Data Mining               Market basket analysis, recommenders
Frequent Pattern Mining*                      Data Mining               Pattern recognition

Reimagining 2014
• New partnerships in big data
• Contributions to the open source community
• The Intel Analytics Toolkit: COMING SOON

SEMANTICS + MACHINE LEARNING
TOGETHER AT LAST!

Intel Enablement Professional Services
Solution Focused
We have helped many other organizations through business change processes, and we know that technology alone is not the answer. Unlike many other organizations, we offer a complete solution to your problems.
Partner focused: we bring sub-partners in and support partners with our talent.
We cover all the core Big Data Coverage Areas.
We bring project expertise,

Intel Enablement Professional Services Offerings
Plan, Implement, Operate
Standard Offerings: Customized by Product

Business Analysis
• Helps you define ROI and use cases, and build consensus
• Workshops: Planning, Architecture, Security, Data Science
• Shows how Intel technology unlocks value

Deployment
• PoC deployments
• Production deployments
• Healthcheck/Tuning projects

Staff Augmentation
• Flexible staffing models
• Gives you access to scarce skills
• Data Scientist Services
• Leverages all of Intel

Security Assessment
• Ensures your security and compliance
• Prevents embarrassment, legal issues, and regulatory noncompliance
• System penetration testing
• Application threat model & review

Operations Services
• Admin as a Service
• Data Science as a Service
• Private custom training
• Process/procedure development
• Ongoing application support

Architecture Design
• Delivers a scalable architecture and plan
• Creates a sizing plan for current and future state

Analytics Analysis
• Enhances your use-case development
• Creates predictive analytic models

Legal Disclaimers
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number

Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization

No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security

Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Other names and brands may be claimed as the property of others.

Copyright © 2013, Intel Corporation. All rights reserved.