VADIS S.A. European Leader
in Predictive Modelling & Data Mining
Vadis Smart Toolbox
For Big Data Analytics
Structuring useful information from the Internet
Internet + List of URI -> Crawler -> Raw HTML websites
Re-encoding: Retrieve the encoding of the page (header and meta-information) and re-encode it in UTF-8 if necessary. Output: raw HTML websites in UTF-8 encoding.
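The re-encoding step can be sketched as follows. This is only an illustration under stated assumptions, not the Vadis implementation: the function names `detect_charset` and `to_utf8` are hypothetical, the charset is looked up first in the HTTP Content-Type header and then in a meta tag, and UTF-8 is assumed when neither declares an encoding.

```python
import re
from typing import Optional

# Matches both <meta charset="..."> and <meta http-equiv=... content="...; charset=...">
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?\s*([\w-]+)', re.IGNORECASE)


def detect_charset(raw: bytes, content_type: Optional[str] = None,
                   default: str = "utf-8") -> str:
    """Return the declared charset: HTTP header first, then meta tag, else default."""
    if content_type:
        match = re.search(r'charset=["\']?([\w-]+)', content_type, re.IGNORECASE)
        if match:
            return match.group(1).lower()
    match = META_CHARSET.search(raw[:4096])  # declarations sit near the top of the page
    if match:
        return match.group(1).decode("ascii").lower()
    return default  # assumption: undeclared pages are treated as UTF-8


def to_utf8(raw: bytes, content_type: Optional[str] = None) -> bytes:
    """Re-encode a raw page to UTF-8 only if its declared charset differs."""
    charset = detect_charset(raw, content_type)
    if charset in ("utf-8", "utf8"):
        return raw
    try:
        text = raw.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: latin-1 can decode any byte sequence
        text = raw.decode("latin-1")
    return text.encode("utf-8")
```

The sketch leaves the original meta declaration in the page untouched; a fuller version would probably rewrite it to match the new encoding.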
Minifier: Remove all unnecessary blank spaces and HTML attributes in order to reduce file size. Output: minified HTML website.
Cleaning: Remove content according to a parametrizable whitelist (CSS, JS, event handlers, HTML tags, …). Output: cleaned HTML contents or pure text contents.
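As an illustration of the cleaning step, the sketch below reduces an HTML page to pure text with the Python standard library. It is a minimal example, not the Vadis component: the class `TextExtractor`, the function `html_to_text` and the `skip_tags` parameter are hypothetical names, and the configurable list here names the tags whose content is dropped.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of configurable tags."""

    def __init__(self, skip_tags=("script", "style")):
        super().__init__(convert_charrefs=True)
        self.skip_tags = set(skip_tags)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # All attributes (including event handlers such as onload) are ignored
        if tag in self.skip_tags:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Chunks are joined with a space, so a word split by inline markup
        # (e.g. wor<b>ld</b>) would come out as two tokens
        return re.sub(r"\s+", " ", " ".join(self._chunks)).strip()


def html_to_text(html, skip_tags=("script", "style")):
    parser = TextExtractor(skip_tags)
    parser.feed(html)
    parser.close()
    return parser.text()
```

Collapsing runs of whitespace in `text()` also covers the blank-space part of the minifier step for pure-text output.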
Smart Business Intelligence Methodologies
and Processes boosting Sales Effectiveness,
Risk Management and Big Data Analytics
Robust Predictive
Modelling Tool
Vadis has gathered numerous advanced tools in its Smart
Toolbox to deal with Big Data Analytics:
»» Robust Predictive Modelling Tool RANK®, built on the long
and deep analytics expertise of the Vadis Consulting and R&D teams.
»» Massive Modelling Platform, a web-based second-level
automation of RANK methodologies to automatically and
massively build and validate predictive models, especially powerful
for industrializing personalized CRM action deployment and
optimization.
»» Data Quality Detection and Correction, including Data Cleansing,
Deduplication and Fuzzy Name Matching based on a Graph Data
approach.
»» Generic Automated Deduplication Processes able to manipulate,
compare and treat hundreds of millions of linked records on a weekly
basis.
»» A Generic Process to build Customer Single Views for Predictive
Modelling and profile exploration.
»» A full cycle of Web Text Mining Processes composed of
independent components for Data Collection, Unstructured
Data Processing, Content Detection and Classification, Entity
Extraction, Text Enrichment and Language Detection, leading to Graph
Mining of discovered relations between identified entities to
construct Structured Data from the Internet.
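To make the idea of fuzzy name matching on a graph concrete, here is a small sketch. It is an assumption about what such an approach could look like, not a description of the Vadis process: names are nodes, an edge links two names whose similarity (computed here with the standard library's `difflib.SequenceMatcher`) reaches a hypothetical `threshold`, and each connected component is treated as one group of duplicates.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase, drop dots and commas, collapse whitespace."""
    return " ".join(name.lower().replace(".", " ").replace(",", " ").split())


def similarity(a, b):
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def duplicate_clusters(names, threshold=0.85):
    """Group names into connected components of the 'similar enough' graph."""
    parent = list(range(len(names)))  # union-find forest over name indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Pairwise comparison: fine for a demo, quadratic in the number of names
    for i, j in combinations(range(len(names)), 2):
        if similarity(names[i], names[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())
```

Comparing every pair does not scale to hundreds of millions of records; at that volume some form of candidate pre-selection would be needed before any pairwise scoring.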
StopWords Removal: Remove from textual documents all occurrences of the stopwords given as input. Inputs: Pure Texts, StopWords. Output: Pure Texts (without StopWords).
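The stopword removal step could be sketched as below. The function name `remove_stopwords` is hypothetical, matching is case-insensitive by assumption, and the simple word tokenizer also discards punctuation, which the original component may or may not do.

```python
import re

TOKEN = re.compile(r"\w+")  # runs of letters, digits and underscores


def remove_stopwords(text, stopwords):
    """Return the text with every occurrence of the given stopwords removed."""
    stop = {word.lower() for word in stopwords}
    kept = [token for token in TOKEN.findall(text) if token.lower() not in stop]
    return " ".join(kept)
```

For example, `remove_stopwords("The crawler retrieves the raw HTML of a page.", ["the", "of", "a"])` returns `"crawler retrieves raw HTML page"`.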
Topic Model Trainer: Extract topics from a set of textual documents. Input: Pure Text Contents. Output: Topic Model.
Optional Manual Filtering: Selection of the most relevant documents through human validation.
Document Classifier Model Trainer: Build a model that is able to categorize documents into the right category. Diagram labels: Selected Pure Text Contents, Pure Texts Neutral Set, Classifier Model.
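The brochure does not say which algorithm the document classifier uses. Purely as an illustration, the sketch below trains a multinomial Naive Bayes model with add-one smoothing, a common baseline for text categorization. The class name, the whitespace tokenizer and the use of a "neutral" label for the neutral set are all assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)       # documents per category
        self.word_counts = defaultdict(Counter)   # word frequencies per category
        for doc, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, document):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(document):
                if word in self.vocabulary:        # unseen words are ignored
                    score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A possible usage, with selected documents for one category and a neutral set as the contrast class:

```python
model = NaiveBayesClassifier().fit(
    ["bank loan credit risk", "credit risk scoring bank",
     "football match goal", "tennis match score"],
    ["finance", "finance", "neutral", "neutral"],
)
model.predict("credit risk of a loan")
```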
Process diagram, steps P1 and P2:
P1 - Extract Content. Diagram labels: Internet, List of Topics/Keywords, Search Engine API, List of URI, Pure Text Contents.
P2 - Modeling. Diagram labels: Merged Pure Text Contents, StopWords, Topic Model.
Using Rank to efficiently detect
rare events and profiles from Big
Data Sources: Sky is the limit
»» Propensity to buy, to have, to
grow
»» Up and cross-selling
»» Churn prediction
»» VIP prediction
»» Risk scoring
»» Fault detection
»» Fraud detection
»» Corruption detection
»» Disease detection
»» Spam detection
»» …
P3 - Computing Relevance. Diagram labels: Classifier Model, Topic Relevancy, Extraction Model, Classifier Relevancy.
P4 - Validation & Filtering. Output: Selected Documents.
P5 - Entities extraction. Output: List of Entities.
Vadis Smart Toolbox has been
internationally benchmarked in
various domains. It has been used
by Vadis Experts to build their own
automated processes and solutions
to serve their prestigious clients.
Adopt this ready-to-use or
customizable Smart Toolbox to
build your own Big Analytics
Platform under your control, easy to use
for inexperienced operators, for your own business.
Rank - Turnkey prediction software:
fast, robust and easy to use. It excels
at efficiently finding rare events or
profiles and their amplitude in Big
Data sources. It has been continuously
benchmarked in KDD contests and has
been proven efficient in all domains:
Banking, Telco, Retail, E-Commerce,
Texts, Medicine, ...
You will be served by Vadis professional and highly experienced delivery
and R&D teams for ready-to-use or customizable solutions and processes.
»» Do you have problems uniquely identifying your own customers
in your databases or in public databases?
Try out our deduplication processes and you’ll be amazed by their
efficiency and accuracy.
»» Do you have a general Data Quality Control need?
Our automatic processes will save you significant time and money and
guarantee you a robust Quality Process.
»» Do you want to build your own databases from Public Data sources?
We can serve you with tailor-made crawlers. You’ll be pleased by the
power of Public Data combined with your internal data.
»» Are you planning to organize your diversified structured and
text-based information?
Vadis can help you build easily accessible databases with a search
interface, a graph-mining-based structure, visualization and
navigation.
Should you have any unsolved Data Problems,
please do not hesitate to contact us.
VADIS s.a. Boulevard de l’Humanité 292, B - 1190 Brussels, BELGIUM
www.vadis.com [email protected] +32 2 894 28 00
Editor in charge: Claude Delcour