Intel Analytics
Rich Pilling
EMEA Director of Professional Services & Analytics — Intel
May, 2014
Intel Confidential — Do Not Forward
What happens online in 60 seconds?
Source: http://blog.qmee.com/qmee-online-in-60-seconds/
“The future got to us, before we got to it.”
Terry Gilliam, Monty Python
SEMANTICS!
Virtuous Cycle of Computing
DEVICES → SERVICES → DATACENTER → DEVICES … and so on
Other brands and names are the property of their respective owners.
SEMANTIC INFORMATION IS FUEL FOR THE CYCLE
MACHINE LEARNING ON THIS
[Figure: timeline with markers at 1985, 1995, 2005, and 2015. Labels: enterprise, NoSQL, RDF, Docs + Semantics]
Likes?
Danny : isBrotherOf : Nezih
Food cart : uses : bicycles
Frank : isFriendsWith : Mohit
Frank : isFriendsWith : Ted
Frank : likes : bicycles
Frank : likes : food carts
Ivy : isFriendsWith : Kushal
Ivy : isFriendsWith : Ted
Ivy : likes : bicycles
Ivy : likes : food carts
Kushal : isFriendsWith : Mohit
Kushal : isFriendsWith : Nezih
Nezih : isFriendsWith : Ted
Ted : likes : bicycles
[Figure: the triples above drawn as a graph. Nodes: Danny, Frank, Ivy, Kushal, Mohit, Nezih, Ted, Bicycles, Food Cart. Edge labels: likes, friends, brothers]
This model... infers this interest.
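The idea of inferring an interest from subject : predicate : object triples can be sketched in a few lines of Python. This is only an illustration: the slide does not say which rule the model uses or which interest it infers, so the rule here (suggest an item when at least two of a person's friends like it), the function name infer_interests, and the min_friends parameter are all assumptions. The triples are transcribed from the slide, with isFriendsWith treated as symmetric.

```python
from collections import Counter, defaultdict

# Triples transcribed from the slide: (subject, predicate, object).
TRIPLES = [
    ("Danny", "isBrotherOf", "Nezih"),
    ("food cart", "uses", "bicycles"),
    ("Frank", "isFriendsWith", "Mohit"),
    ("Frank", "isFriendsWith", "Ted"),
    ("Frank", "likes", "bicycles"),
    ("Frank", "likes", "food carts"),
    ("Ivy", "isFriendsWith", "Kushal"),
    ("Ivy", "isFriendsWith", "Ted"),
    ("Ivy", "likes", "bicycles"),
    ("Ivy", "likes", "food carts"),
    ("Kushal", "isFriendsWith", "Mohit"),
    ("Kushal", "isFriendsWith", "Nezih"),
    ("Nezih", "isFriendsWith", "Ted"),
    ("Ted", "likes", "bicycles"),
]


def infer_interests(triples, min_friends=2):
    """Hypothetical rule: suggest an item to a person when at least
    `min_friends` of their friends like it and the person does not yet."""
    friends = defaultdict(set)
    likes = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "isFriendsWith":
            # Assumption: friendship is symmetric.
            friends[subject].add(obj)
            friends[obj].add(subject)
        elif predicate == "likes":
            likes[subject].add(obj)

    inferred = {}
    for person in sorted(friends):
        # Count how many friends like each item.
        counts = Counter(
            item for friend in friends[person] for item in likes.get(friend, ())
        )
        suggestions = sorted(
            item
            for item, n in counts.items()
            if n >= min_friends and item not in likes.get(person, ())
        )
        if suggestions:
            inferred[person] = suggestions
    return inferred


INFERRED = infer_interests(TRIPLES)
```

Under this assumed rule, the only suggestion is food carts for Ted, because two of Ted's friends (Frank and Ivy) like them. A production system would more likely use a learned model, such as the label propagation or collaborative filtering approaches listed later in the deck.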
IMAGINE THE POSSIBILITIES
Imagine the enjoyment
[Figure: graph of channel viewing behavior showing current popular surfing patterns. Legend: Program Importance (Centrality), from High to Low. Example node labels: SH002463130000, EP005544723744]
Changes in surfing behavior may predict customer churn.
Imagine the security
Imagine the satisfaction
Preference and Similarity Recommendations
1.7MM Nodes, 23.9MM Edges
[Figure: property graph with User and Movie nodes]
User: userId: A0A22A5
Movie: title: The Departed; genre: Crime drama; cast: [L. DiCaprio, M. Damon]
Movie: title: Scarface; genre: Crime drama; cast: [Al Pacino, M. Pfeiffer]
Movie: title: The Godfather; genre: Crime drama; cast: [M. Brando, Al Pacino]
Edge: similar topic, weight=0.03
Edge: similar cast, weight=0.67
Edge: prefers, weight=11.8
Edge: weight=14.98
A yoga ball graph. Really!?!
You may actually need a model like this
• When the problem is an information network
• When a graph is a natural way of expressing the algorithm
• When you want to study specific relationships
• When you want faster machine learning or solvers on sparse data
[Figure labels: triangle count, central influence, shortest path, sub networks]
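To make the graph computations named on this slide concrete (triangle count, central influence, shortest path), here is a minimal pure-Python sketch. The edge list and all function names are hypothetical and chosen only for illustration. Degree centrality stands in for "central influence", which the slide does not define, and nothing here reflects how the Intel Analytics Toolkit implements these algorithms.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected graph: one triangle (A, B, C) plus a tail C-D-E.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_count(adj):
    """Count each triangle once, at its smallest node."""
    count = 0
    for u in adj:
        higher = [v for v in adj[u] if v > u]
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                count += 1
    return count


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    denom = max(len(adj) - 1, 1)
    return {node: len(neighbors) / denom for node, neighbors in adj.items()}


def shortest_path(adj, source, target):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adj.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


ADJ = build_adjacency(EDGES)
```

On this toy graph, triangle_count(ADJ) finds the single triangle, node C has the highest degree centrality, and shortest_path(ADJ, "A", "E") goes through C and D. At the scale mentioned elsewhere in the deck (millions of nodes and edges), these would run on a distributed engine rather than in-memory Python.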
But there are challenges.
• Handling all that data.
• Finding people good at both handling all that data and data analysis.
• Putting exploratory work into production fast enough to keep up with the competition.
Congratulations! You are a data scientist.
It’s a demanding job
[Figure: workflow stages: Ingest & Clean, Engineer Features, Structure Model, Train Model, Learn, Query & Analyze, Visualize]
• Painful data ingestion and preparation
• Skills shortage at intersection of systems engineering and data analysis
• Workflows that are not designed with loopbacks in mind
• Composing pipeline is DIY
• Few tools for analyzing semantics at scale
IMAGINE A PLATFORM FOR DATA SCIENTISTS
DOCS + SEMANTICS + MACHINE LEARNING
Delivering it
Intel Analytics Toolkit
• DATA SCIENCE SERVER (Query and Scripting)
• BIG DATA API
• MACHINE LEARNING AND STATISTICS: Graphical Algorithms, Classical Algorithms
• DATA WRANGLING: Useful String Manipulation, Useful Math Operators, Graph Construction Tools
• MarkLogic: A UNIFIED DOCUMENT + SEMANTIC STORE
• APACHE HADOOP, APACHE SPARK
• FILESYSTEMS AND NOSQL STORAGE
• HW PLATFORM
Bringing a full spectrum of possibilities
Graph Approach
Algorithm | Category | Applications/Use Cases
Loopy Belief Propagation (LBP) | Structured Prediction | Personalized recs, image de-noising
Label Propagation | Structured Prediction | Personalized recommendations
Alternating Least Squares (ALS) | Collaborative Filtering | Recommenders
Conjugate Gradient Descent (CGD) | Collaborative Filtering | Recommenders
Connected Components | Graph Analytics | Network manipulation, image analysis
Latent Dirichlet Allocation (LDA) | Topic Modeling | Document Clustering
Structure Attribute | Clustering | Network analysis, consumer seg
K-Truss | Clustering | Social network analysis
KNN* | Clustering | Recommenders
Logistic Regression* | Classification | Fraud detection
Random Forest* | Classification | Fraud detection, consumer seg
Generalized Linear Model (Binomial, Poisson) | Non-linear Curve Fitting | Forecasting, pricing, market mix models
Association Rule Mining | Data Mining | Market basket analysis, recommenders
Frequent Pattern Mining* | Data Mining | Pattern Recognition
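As an illustration of one entry in this list, association rule mining for market basket analysis, the following sketch computes support and confidence for pairwise rules. The transactions, thresholds, and the function name pair_rules are all hypothetical. It is a brute-force count over item pairs, not the algorithm used by the Intel Analytics Toolkit, which the deck does not describe.

```python
from itertools import combinations

# Hypothetical shopping baskets.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(transactions, min_support=0.5, min_confidence=0.9):
    """Return rules (lhs, rhs, support, confidence) over item pairs.

    support(lhs, rhs) = share of baskets containing both items
    confidence(lhs -> rhs) = baskets with both / baskets with lhs
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for basket in transactions:
        items = sorted(set(basket))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


RULES = pair_rules(BASKETS)
```

With these baskets and thresholds, the only rule that survives is butter → bread: both baskets containing butter also contain bread, and the pair appears in half of all baskets.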
Reimagining 2014
• New partnerships in big data
• Contributions to the open source community
• The Intel Analytics Toolkit – COMING SOON
SEMANTICS + MACHINE LEARNING
TOGETHER AT LAST!
Intel Enablement Professional Services
Solution Focused
• We have helped many other organizations through business change processes, and know that technology alone is not the answer.
• Unlike many other organizations we offer a complete solution to your problems.
• Partner focused – sub partners in and support partners with our talent
• We cover all the core Big Data Coverage Areas.
• We bring project expertise,
Intel Enablement Professional Services Offerings
Plan | Implement | Operate
Standard Offerings – Customized by Product

Business Analysis
• Helps you define ROI, use-case, and build consensus
• Workshops: Planning, Architecture, Security, Data Science
• Shows how Intel technology unlocks value

Deployment
• PoC deployments
• Production deployments
• Healthcheck/Tuning projects

Staff Augmentation
• Flexible staffing models
• Provides you access to limited skills
• Data Scientist Services
• Leverages all of Intel

Security Assessment
• Ensures your security and compliance
• Prevents embarrassment, legal issues, and regulatory noncompliance
• System penetration testing
• Application threat model & review

Operations Services
• Admin as a Service
• Data Science as a Service
• Private custom training
• Process/procedure development
• Ongoing application support

Architecture Design
• Delivers a scalable architecture and plan
• Creates sizing plan for current and future state

Analytics Analysis
• Enhances your use-case development
• Creates predictive analytic models
Legal Disclaimers
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number
Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization
No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security
Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Other names and brands may be claimed as the property of others.
Copyright © 2013, Intel Corporation. All rights reserved.