Download May 2016 - TMA Associates

May 2016
(Formerly Speech Strategy News)
Editor, William Meisel
Microsoft introduces Skype Bots and a Bot framework for outside developers
Simply send a message to a specialized bot and get a reply
At Microsoft’s Build 2016 developers’
conference at the end of March, Microsoft CEO
Satya Nadella said that “bots are the new
applications.” He also spoke of a world where
“human language is the UI layer” and Microsoft
is participating in “conversational canvases,” a
term he applied to any app where people are
conversing in natural language, including email,
chat, and SMS.
Announcements during the conference
backed up this point of view: Microsoft said that
outside developers can now create “intelligent” bots
that work inside of Skype and independently of
Skype using a Microsoft cloud service.
Continued on page 28
Google discusses the future of search
Being able to react to natural language and sustain a dialog
On March 1, Behshad Behzadi, Google’s
director of conversational search, gave a keynote
address at SMX West in San Jose. As Google
sees it, the intent of conversational search is to
conduct a dialog with the user, rather than the
usual one-and-gone interaction in search.
Behzadi discussed the future of search,
including voice search and Now on Tap. (In
Android, when you touch and hold the Home
button, Now on Tap uses what’s on your screen
to show you related details, apps, and actions.)
He started the talk by showing video clips of
Captain Kirk talking to the Star Trek computer
and clips from the movie Her to show the kind of
interaction Google is targeting.
Continued on page 30
Facebook supports contact with outside company “bots” within Messenger
Natural language texting to companies for customer service, purchases, or services
Every month, over 900 million people around
the world communicate with friends, families
and over 50 million businesses on Facebook
Messenger. It’s the second most popular app on
iOS, and was the fastest growing app in the US
in 2015. In April, the company launched the
Messenger Platform in Beta with bots and a
Send/Receive API.
Continued on page 31
LUI News
May 2016
Table of Contents

Microsoft introduces Skype Bots and a Bot framework for outside developers ... 1
  Simply send a message to a specialized bot and get a reply
Google discusses the future of search ... 1
  Being able to react to natural language and sustain a dialog
Facebook supports contact with outside company “bots” within Messenger ... 1
  Natural language texting to companies for customer service, purchases, or services
Editor’s Notes ... 5
  Today’s route to increasing computer power—and its role in lowering the cost of the Language User Interface (LUI)
  Bill Meisel, Publisher & Editor
Amazon has sold three million Echos, adds more features and skills ... 6
  Alexa available on Echo-like device from Invoxia
Nuance testing “Mix,” a Natural Language Understanding tool ... 7
  Extrapolates from a relatively small number of examples
Spare5 provides crowdsourcing service to label unstructured data ... 8
  Data labeling specialists scored by machine learning
Expert System combines semantics and machine learning ... 9
  Cogito Studio designed for customized text analysis
Kik messaging app launches “Bot Shop” ... 10
  New API allows outside developers to build bots for Kik
Taco Bell builds a bot for workplaces that use Slack’s messaging platform ... 10
  Natural language text interaction to take food orders
Interactions and Arise Virtual Solutions partner for Voice Virtual Assistants ... 11
  Automated customer care on Arise’s platform using Interactions’ human-aided system
Aspect Software will support customer chatbots through Facebook Messenger ... 11
  Chatbot messaging with Aspect’s Natural Language Understanding and agent backup
Nina Virtual Assistant from Nuance used in Swedbank customer service ... 12
  Natural-language text interaction
AgentBot and Zendesk partner to offer virtual agent for ticketing ... 13
  Integration allows easy transfer to agents when necessary
Inbenta launches “Hybrid Chat,” to integrate human and automated chat ... 13
  System can change options when agents aren’t available
Signpost unveils Mia, AI-driven CRM ... 14
  Automated, personalized customer relationship assistant using machine learning
noHold expands its virtual assistants to multiple business units in a company ... 14
  Includes Human Resources, IT, Legal, Marketing, Sales and Support
AYLIEN adds News option to its NLP text analysis API ... 15
  SDKs for Natural Language Processing, Information Retrieval, and Machine Learning tools
AI-driven virtual assistant from Kasisto powers India’s first mobile-only bank ... 15
  Text inquiries in natural language simulate banking assistant
Toyota is forming a new data science company in partnership with Microsoft ... 16
  Toyota Connected has a goal of simplifying technology so it’s easier to use in vehicles
Mobvoi releases new in-car app for information and entertainment ... 16
  Chinese company supports voice interaction in Android OS
Amazon’s Alexa featured on a smartwatch from Chinese company iMCO ... 17
  Paired wirelessly with Android or iOS phones
Samsung ARTIK modules can support speech recognition and NLU ... 17
  SoundHound’s Houndify adds the technology to connected devices
Conversica launches AI assistant for automotive service ... 18
  Automatically maintains contact with current and potential service customers
Baidu Research and Peel collaborate on voice-enabled smart home products ... 19
  “In the future, it will be as easy to talk to your devices as it is to talk to the person next to you.”
LumenVox updates its speech recognition and TTS for IVR systems ... 20
  Adds partners using its technology
Google’s annual Founders’ Letter ... 20
  “Over time, the computer itself—whatever its form factor—will be an intelligent assistant helping you through your day.”
Google launching a new machine learning platform ... 21
  Free limited access to create custom models, with pre-trained models including a Speech API
Google expands hands-free operation in Android ... 22
  Button control by voice aids those with disabilities
New Samsung Galaxy models Include Sensory’s TrulyHandsfree Voice Control ... 22
  Samsung Galaxy S7 and S7 Edge smartphones continue a long-term relationship with Sensory
Speech-enabled Unibet Sports Betting App uses Artificial Solutions’ Teneo ... 23
  Natural-language interaction even while watching a streaming game
Speech Processing Solutions adds speech-to-text service to its dictation software ... 23
  Philips SpeechLive dictation service now available as a cloud service
SYSTRAN API Platform enables translation and natural language processing ... 24
  SYSTRAN.io supports multiple languages with cloud-based Application Programming Interface
Winscribe and Speech Processing Solutions expand dictation offerings ... 24
  Availability of speech-to-text transcription services on Philips SpeechAir Android device
Mattersight and Voci will license the other company’s products ... 25
  Voci’s transcription engine and Voci Mattersight’s Behavioral Analytics
Audeme offers speech recognition and synthesis for Arduino platform ... 25
  Speaker-independent voice control with up to 150 commands
Microsoft Translator adds features on iOS ... 26
  Offline translation and webpage translation using Deep Neural Nets
Nvidia unveils processor for AI and creates “deep learning supercomputer” ... 27
  Turnkey system claims to deliver the equivalent throughput of 250 x86 servers
Intelligent Voice offers speech-to-text based on Graphical Processing Units ... 28
  Nvidia GPUs allow up to 400 times real-time processing of speech
News briefs ... 33
  Elon Musk’s OpenAI releases first AI tool ... 33
  [24]7 introduces customer acquisition cloud service for marketers ... 33
  Microsoft Windows 10 Mobile test build includes support for Cortana in more languages ... 33
  IBM discusses recent advances in conversational speech recognition ... 33
  IBM notes growth in Watson services ... 34
  IBM and the University of Illinois to pioneer next-generation cognitive computing systems for applications such as multimodal education ... 34
  SparkCognition uses IBM Watson in assessing security risks ... 34
  IBM and SAP agree to combine complementary services, including IBM Cognitive Computing and SAP HANA Business Suite, available on-premise and in the cloud ... 35
  IBM partners with American Cancer Society on Watson Cancer Advisor ... 35
  IBM Health Corps to use Watson to tackle global health disparities ... 35
  IBM teaming with Sesame Street to aid in early learning through Watson technologies ... 35
  NTT uses machine learning to detect cyber-crime ... 35
  NTT Comm and IPsoft partner to launch an automated cognitive agent service ... 36
  Google upgrades its open-source TensorFlow machine learning framework to a distributed version ... 36
  Google open-sources Walt, a tool that measures lag for touch and voice commands ... 36
  Android N preview 2 lets you change the pitch of Google’s text-to-speech voice ... 36
  Yahoo reportedly preparing a mobile personal assistant ... 36
  Rage Frameworks’ linguistics tool analyzes documents, adds deployments ... 36
  NeoSpeech releases Canadian French TTS voice ... 37
  NeoSpeech integrates Bitcode support in its text-to-speech software for iOS ... 37
  Conexant introduces far-field microphone processing software for Qualcomm Hexagon DSP ... 37
  Fortemedia’s updated FM1388 Series IC provides voice processing solutions for Apple CarPlay ... 37
  Apple TV activates “live tune-in” feature launched through Siri ... 37
  e-djuster launches mobile solution for contents inventory and claims management with speech recognition ... 38
  Lexalytics provides text analytics that run on an Android device, targeted at developers to include in their apps ... 38
  GMA Consulting and TermSet offer document-centric solutions for financial sector ... 38
  Nuance selected by CHRISTUS Health for enterprise-wide speech recognition and clinical documentation improvement deployment ... 39
  Recent Windows 10 update build apparently includes Cortana “find my phone” feature ... 39
  CallMiner partners with Ultracomms to add its interaction analytics solutions to Ultracomms’ PCI-compliant cloud contact center ... 39
  Hitachi introduces in-store sales representative robot ... 39
  Thomson Reuters signs an agreement with FiscalNote to add automated legislative tracking solution to Thomson Reuters Regulatory Intelligence ... 40
  Forum announces the public launch of the VoiceXML 2.1 Developer Certification Exam ... 40
  ASTi to enhance RAF Ch-47 trainers including speech recognition ... 40
  Max Sound’s High Definition Audio now available for iPhones ... 40
  BodyWorn body-worn camera system allows entering notes with speech-to-text ... 41
  MIT and PatternEx develop machine learning AI to detect cyberattacks, using machine learning to cluster similar potential problems for human analysts ... 41
  Sentient Technologies uses AI to help sell shoes ... 41
  SRI International spins off robotics company to make body suit that enhances movement ... 42
  Chevron employs AI to improve operations ... 42
  Wise.io introduces content discovery capability for customer support ... 42
  Intoware uses Nuance speech recognition to direct aircraft maintenance hands-free ... 42
  Does this look like a robot? ... 43
  Narita International Airport in Japan testing a speech-to-speech translation application on shuttle buses ... 43
  Perfect Pitch uses recorded scripted responses to allow off-shore agents to sound native ... 43
  IndianTTS launches Hindi text-to-speech solution ... 43
  Microsoft collaborates with Narrative Science to add automated natural-language narratives to its Power BI visuals ... 44
  Microsoft updates Cortana on the iPhone ... 44
  Florida Hospital achieves significant quality improvements plus $72.5 million in increased reimbursement with Nuance Clinical Documentation Improvement ... 44
  Apple will pay $24.9 million in a long-running lawsuit over the origins of Siri, more suits likely ... 44
Statistics and Surveys ... 45
  More than 75% of businesses rank the importance of having a mobile application as high ... 45
  89% of consumers expect and prefer conversational interactions with customer service ... 45
  Botego CEO predicts 2017 will be the year of the bots with $2 billion market size ... 45
  Millennials have the lowest tolerance for errors and delays, but reward good service with loyalty ... 46
  73% of smart home owners already use voice commands ... 46
  Global virtual reality headset revenues projected to reach $895 million in 2016 ... 46
  YouTube and Netflix lead Web video delivery ... 46
  Speech analytics market worth $1.60 billion by 2020 ... 47
  Intelligent Virtual Assistant market is expected to exceed $3 million by 2020 ... 47
  The “global intelligent voice” industry to reach $19 billion by 2020 ... 47
  More than three billion Android phones in use globally by 2020 ... 47
  Global smartphone shipments this year have fallen ... 47
  Annual gains in worldwide ad spending will hover around 6% through 2020 ... 47
  Mobile advertising nears $29 billion annually in US ... 48
Financial Notes ... 48
  Aspect Software makes progress in restructuring ... 48
  MetaMind acquired by Salesforce ... 48
  $30M round for AI marketing firm Persado ... 48
  X.ai secures $23M in Series B funding ... 48
  Artificial intelligence startup DigitalGenius raises $4M to automate customer service ... 49
  nGUVU raises $3 million to bring gamification and machine learning to contact centers ... 49
  Security startup Illumio raises $100 million ... 49
  Shopify acquires Kit CRM to further “conversational commerce” ... 49
  Mobify acquires Pathful and its machine learning technology ... 50
  Coveo grows with its “intelligent search” products ... 50
  Almax Analytics closes seed round to provide AI for news insights in capital markets ... 50
  SparkCognition closes a $6 million Series B funding round ... 50
  Vivint Smart Home raises $100M in equity funding ... 50
People ... 51
  Avaya appoints Steve Joyner to Head of Sales Engineering, Europe ... 51
For Further Information on Companies Mentioned in this Issue ... 51
Blog (with a chance to comment!) ... 58
  The Software Society (www.thesoftwaresociety.com) ... 58
Editor’s Notes
Today’s route to increasing computer power—and its role in lowering the cost of
the Language User Interface (LUI)
Bill Meisel, Publisher & Editor
Moore’s Law is of course not a physical law,
and it is reaching its limits in terms of the
number of transistors that can fit on a chip; the
size of an atom is, at the very least, an eventual limit. But
two trends may continue the growth of
affordable computing power beyond those
limits, in effect extending the implications of
Moore’s Law. The key implications of Moore’s
Law were always the decreasing cost of
computing, and that trend may continue beyond
stuffing more transistors on a chip.
The first of the two trends allowing this
continued reduced cost of computing is cloud
computing. It lowers the cost of IT support by
centralizing IT, providing the advantage of
scale. And it makes powerful and scalable
computing resources available even to
companies whose size would make it difficult to
support such resources internally.
The second trend is driven by specialized
chips that can perform parallel processing on
each chip, rather than the sequential processing
of a single CPU on a chip—the classical
microprocessor. These parallel-processing chips
can be general-purpose, such as the new server
chips introduced by Intel at the end of March;
the new Xeon E5-2600 v4 family includes up to
22 calculating engines on each chip, up from a
maximum of 18 on prior models.
Such parallel-processing chips can also be
specialized, such as new Nvidia Tesla P100
chips with 15 billion transistors each, using a
Graphical Processing Unit (GPU) architecture.
The Nvidia DGX-1 deep learning system (p.
27), built on the new Nvidia chips, provides the
throughput of 250 CPU-based servers,
networking, cables, and racks in a single box,
according to the company.
The DGX-1 is specialized to create Deep
Neural Network (DNN) solutions from large
databases that can then be deployed for
operation on more conventional processors. It
reflects the growing importance of AI and
natural language processing (including digital
assistants and “bots”) to companies. The chips
and architecture could be used by companies
offering a cloud-based machine learning service,
such as those by Microsoft, Amazon, and the
one just introduced by Google (p. 21). Google in
fact indicated that its cloud machine learning
service is powered in part by GPUs.
Reflecting the interest in such parallel-processing architectures, IBM researchers
unveiled a new generation of “neurosynaptic”
computing chips with networks of simulated
neurons, intended to be used in “cognitive
computers” that learn as they are used. The
company’s first two prototype chips have
already been fabricated and are currently
undergoing testing. The company and its
university collaborators also announced they
have been awarded approximately $21 million in
new funding from the Defense Advanced
Research Projects Agency (DARPA) for Phase
2 of the Systems of Neuromorphic Adaptive
Plastic Scalable Electronics (SyNAPSE) project.
The “Language User Interface” (LUI) is likely
the next step in user-friendly design to augment
or, in some cases, replace the Graphical User
Interface (GUI) that has served us so well
historically in easing the use of digital systems.
Machine learning and DNNs have proved
effective in creating the natural language
understanding required for digital assistants and
similar applications of the LUI. And the LUI
could benefit from IBM’s “cognitive
computer.” Intelligent Voice claims that it now
has the “world’s fastest commercially available
speech to text appliance,” based around Nvidia
GPU technology (p. 27).
These trends could continue to boost the
amount of computing power available per dollar
over time, perhaps even faster than Moore’s
Law for specific uses such as machine learning.
The LUI will be a likely beneficiary of these
trends.
Amazon has sold three million Echos, adds more features and skills
Alexa available on Echo-like device from Invoxia
A survey of 2,000 U.S. customers by
Consumer Intelligence Research Partners led
them to estimate Amazon has sold 3 million
Echos. They found that close to half of US
Amazon customers are aware of the product.
Analysts have struggled to explain the success
of the product. Judging from comments by
consumers who express “affection” for the
device, it may be the first successful “social
robot,” a device that is prized for its utility but
is also viewed almost fondly, like a pet.
The company is doing well financially.
Amazon reported a first-quarter profit of $513
million, helped by a 28% jump in sales to
$29.1 billion. The most
profitable unit was Amazon Web Services, its
cloud-computing platform.
Invoxia’s Triby device incorporates Alexa
Amazon previously announced that Alexa was
available for devices beyond its Echo. What
appears to be the first example of that is an
announcement by Invoxia that its Triby portable
speaker system (see image) incorporates Alexa.
Triby has a magnetic frame that lets
you attach it, for example, to a refrigerator;
some analysts consider this to make it a
kitchen-oriented device. It features a built-in
speaker and microphone that can be used to
listen to Internet radio or as a hands-free
speakerphone.
It appears, however, that its main function is
to act as a digital assistant that can provide
hands-free control for getting information from
the Web, including Wikipedia, weather, news
and sports. It can also support a shopping list,
and other features similar to the Echo. Since it
has a small e-ink screen, it can be used to leave a
message for other members of the household
(without draining the battery).
Triby costs $199. Invoxia received backing
from Amazon’s Alexa Fund last year.
Triby portable speaker with Alexa
New features
You can now ask Alexa to add events directly
to your Google Calendar after you go to Settings
in your Alexa smartphone App and tap
Calendar. Then you can make requests such as:
• “Alexa, add an event to my calendar.”
• “Alexa, add ‘brunch with Mom’ to my calendar for Saturday at 10 AM.”
Alexa can also check your Google Calendar for events:
• “Alexa, when is my next event?”
• “Alexa, what’s on my calendar today?”
Alexa can now set alarms that repeat daily or
on the same day every week through the Alexa
app. You can set one alarm for weekdays and
another for the weekend. You can use the app to
set a default sound—including custom tones by
Alec Baldwin and Missy Elliott and brand-new
ones from Jason Schwartzman and Dan Marino.
Alexa can now access election news and
content “written exclusively for Alexa,” the
company’s virtual assistant, by Washington Post
political blogger Chris Cillizza. The political
blog entries from Cillizza are read in Alexa’s
voice. (The Washington Post is owned by
Amazon CEO Jeff Bezos.)
New tool for skills
Amazon has added the Smart Home Skill API
to the Alexa Skills Kit. The API makes it faster
and easier for device makers to build the Skills
that sync their products up with Alexa, and it
standardizes the vocabulary that they’ll use, too.
If I make a smart thermostat and sync it up with
Alexa using the Smart Home Skill API, I’ll be
using common terminology that Alexa already
knows. That means that Alexa will be able to
control my thermostat with basic commands
like, “Turn the heat up” or, “Set the thermostat
to 70” without me needing to program any of it.
A new Syfy skill will give fans of the
entertainment source behind-the-scenes
previews, scheduling, and episode info. Alexa
has exhaustive knowledge of Syfy’s schedule as
far out as 14 days in the future. Alexa knows
about the most recent episodes of current Syfy
shows; if you need an update, just ask her what
happened in a show you missed. Alexa is also
prepared to give you a quick sneak peek on the
next episode of your favorite Syfy shows.
Giant Spoon has created an advertising
agency app on the Amazon Echo platform, in
order to educate and entertain clients, marketers,
and industry influentials. Giant Spoon hopes its
new Amazon Echo app serves as inspiration to
tell brand stories on Alexa (and maybe make a
few people laugh along the way). Some available
commands:
• “Alexa, ask Giant Spoon for an idea.”
• “Alexa, ask Giant Spoon for advice.”
• “Alexa, ask Giant Spoon for help brainstorming.”
These commands will initiate over 50
different advertising and marketing ideas created
by Giant Spoon. Some responses:
• “Subtly promise consumers eternal youth.”
• “Think glocal act lobal.”
• “A branded food truck. But, sexier.”
Giant Spoon will be sending a notice to all
clients showcasing the new Echo app and
offering to build custom experiences on the
platform for brands.
New skills
A new 1-800-Flowers skill will let you choose
from among four different arrangements to say
happy Mother's Day, happy birthday, I love you,
or thanks. For example, you can request, “Alexa,
ask 1-800-Flowers to send Becky flowers.”
Boston Children’s Hospital released a new
app for Amazon’s Alexa. Called KidsMD, the
software lets Alexa offer simple health advice to
parents about their children’s fever and
medication dosing. For example, you can ask
whether symptoms like fever, cough, or
headache warrant a call to the doctor, or get
weight- or age-specific dosing guidelines for
acetaminophen or ibuprofen.
Alexa can call a plumber, a carpenter, or
another skilled professional
using HomeAdvisor’s new app. Users will be
asked for their zip code and phone number,
allowing the requested tradesperson to call them
back to schedule a visit.
Nuance testing “Mix,” a Natural Language Understanding tool
Extrapolates from a relatively small number of examples
As part of its Developer program, Nuance
Communications is testing a natural language
service called Nuance Mix. Those allowed in the
beta program get access to Mix.nlu, the
company’s web-based Natural Language
Understanding (NLU) tool to develop NLU
models. The Mix.nlu tool is for developers
interested in adding natural language
interactions into mobile apps, IoT, wearables,
and other use cases. The tool is designed to
translate a text input (which might come from a
speech recognition engine) into an actionable
command. Nuance gives examples using
“intents” and “concepts” in JSON (JavaScript
Object Notation), a lightweight data-interchange
format:
• “Play some funky jazz” = user_intent: change_playlist, type: funky_jazz
• “Turn the lights on in the kitchen” = user_intent: lights_on, location: home_kitchen
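The two intent-and-concept examples can be written out as JSON along the following lines. This is only an illustrative sketch in Python: the "text" and "concepts" field names and the helper functions are assumptions made here for readability, not Nuance's published schema; only user_intent, type, and location come from the examples.

```python
import json


def interpretation(text, user_intent, **concepts):
    """Bundle an utterance with its intent and concept values.

    The "text" and "concepts" keys are hypothetical field names chosen
    for this sketch; they are not taken from Nuance documentation.
    """
    return {"text": text, "user_intent": user_intent, "concepts": concepts}


# The two examples quoted in the article.
EXAMPLES = [
    interpretation("Play some funky jazz", "change_playlist", type="funky_jazz"),
    interpretation("Turn the lights on in the kitchen", "lights_on",
                   location="home_kitchen"),
]


def to_json(item):
    """Serialize one interpretation as a JSON string."""
    return json.dumps(item, indent=2)


print(to_json(EXAMPLES[0]))
```

An application receiving such a structure would branch on user_intent and read the concept values as parameters of the action.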
The tool works from text examples that you
supply of what users might say, each labeled with
the appropriate action. Nuance said that Mix.nlu learns
patterns in your user samples, so eventually it
will understand more samples than what you
explicitly put in (using machine learning).
Nuance Mix is specifically focused on
consumer device and mobile app experiences for
IoT. Nuance does have offerings for the
healthcare and enterprise vertical markets, with
language models and services that meet the
needs of those industries specifically. Greg Pal,
vice president of marketing strategy and
business development, Nuance Enterprise
Division, noted in an interview that the same
model can be used across multiple channels, including text
(e.g., SMS).
Mark Hanson, Senior Director of Nuance’s
Cognitive Innovations group, in an interview,
noted that the tool is intended for use cases
where there is limited data with the text and
intent labeled. The tool doesn’t use what is
usually considered “machine learning,” which
can require tens of thousands of labeled examples or
more for accuracy. Hanson noted that most of
the use cases targeted by the tool are situations
in which only “small data,” not “big data,” are
available. He said the tool creates a “dynamic
grammar,” a representation that can parse an
utterance to determine its intent and type.
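The article does not say how Mix.nlu builds its “dynamic grammar,” so the following is only a toy illustration of the general idea of parsing an utterance into an intent and a type from a handful of labeled samples. It swaps in simple slot templates compiled to regular expressions; the templates, function names, and slot normalization are all assumptions made for this sketch.

```python
import re

# Hypothetical labeled samples: a template with {slot} markers plus its intent.
SAMPLES = [
    ("play some {type}", "change_playlist"),
    ("turn the lights on in the {location}", "lights_on"),
]


def compile_template(template):
    """Turn a template such as 'play some {type}' into a compiled regex."""
    # re.split with a capture group alternates literal text and slot names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)       # literal words
        else:
            pattern += f"(?P<{part}>.+)"     # named slot
    return re.compile(f"^{pattern}$", re.IGNORECASE)


GRAMMAR = [(compile_template(t), intent) for t, intent in SAMPLES]


def parse(utterance):
    """Return the intent and slot values for an utterance, or None."""
    text = utterance.strip().rstrip(".!?")
    for pattern, intent in GRAMMAR:
        match = pattern.match(text)
        if match:
            slots = {name: value.lower().replace(" ", "_")
                     for name, value in match.groupdict().items()}
            return {"user_intent": intent, **slots}
    return None


print(parse("Play some smooth soul"))
```

Because the slot accepts any wording, two samples already cover requests that were never listed explicitly, which is the “small data” property Hanson describes; a production tool would need far more robust matching than this.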
To use speech input, a developer can use
Nuance’s SpeechKit 2.x or a websockets (web
protocol) interface with Mix functionality. The
HTTP interface v1.0 or earlier SpeechKit
versions are not available for use with the Mix
Beta.
Mix.nlu currently supports US English as part
of the beta. Nuance indicates it will be adding
additional language support throughout 2016.
Nuance previously announced that its Dragon
Drive connected car platform and Nuance Mix
voice and natural language understanding
(NLU) developer platform provide automakers
with a set of capabilities to create intelligent and
conversational voice experiences for cars that
talk to ‘things’ such as consumer electronics and
smart devices (SSN, April 2016, p. 7).
Spare5 provides crowdsourcing service to label unstructured data
Data labeling specialists scored by machine learning
It’s obvious that AI is “hot,” and “machine
learning” is often cited as a key driver of major
advances in applications and services using such
techniques as natural language interpretation.
The major bottleneck for most companies using
machine learning technology is obtaining the
labeled data to drive machine learning. Machine
learning simply extrapolates from examples that
are labeled with the outcome that the resulting
algorithm is to predict. Similarly, searching for
content (e.g., an image that fits certain criteria)
often requires labeling that content, even if the
search technique is more conventional.
According to IDG, unstructured data is growing
at the rate of 62% annually, and, by 2022, 93%
of all data will be unstructured.
Spare5 is attacking this data-labeling
bottleneck with a service, Intelligent
Crowdsourcing Platform, to provide companies
a way to convert their unstructured data into
labeled data. The Platform leverages a known
community of specialists to accomplish custom
micro-tasks that, filtered for quality, allow
product owners to train artificial intelligence
models, improve their search and browse
experiences, and augment their directories.
Spare5’s customers include Avvo (an online
legal services marketplace), Expedia, Getty
Images, GoPro (wearable cameras), and
Sentient Technologies (AI software, p. 41).
Spare5’s platform applies a combination of
human insights and machine learning to solve
the increasingly complex problem of utilizing
unstructured data, including images, video,
social media content, and text messages.
Spare5’s data solution leverages a secure
network of qualified individuals, and the
company claims the ability to engage the right
human in the right loop to deliver the best
insights into unstructured data. Spare5’s
“Reputation Engine” applies machine learning
to rate each individual’s performance by
domain.
Spare5 offers a subscription service.
Companies can join the platform, gain access to
a variety of task templates, and work with a
company member. Pricing varies depending on
customization, task complexity, and specialty,
the company indicated.
Myles Brundage, Director at Sentient
Technologies, said, “With Spare5’s unique
ability to access people with specific domain
experience, we are able to quickly validate our
AI-generated models by comparing them to how
people perceive certain nuances between
different retail products.”
Steve Heck, CTO of Getty Images, said, “The
old adage that a picture is worth a thousand
words is true, but it’s just the beginning. Spare5
is providing us breakthrough value by delivering
nuanced human insights into our photos, at a
value and scale that was unthinkable just a year
ago.”
Spare5 filters task results for accuracy through
a quality assurance process that includes
Spare5’s proprietary machine learning
algorithms. As customers use the platform over
time, the process improves. Using a variety of
SDKs and APIs, the data can be integrated into
existing data workflows and exported to produce
top-line business reports.
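Spare5’s algorithms are proprietary, so the following Python sketch is only a generic illustration of how per-worker ratings could feed a quality filter: each contributor’s label is weighted by a reputation score, and a label is accepted only when its weighted share clears a threshold. The function name, the scores, the 0.5 default for unrated workers, and the 0.6 threshold are all hypothetical.

```python
from collections import defaultdict


def weighted_label(votes, reputation, min_share=0.6):
    """Pick the label with the largest reputation-weighted vote share.

    votes: list of (worker_id, label) pairs for one task.
    reputation: dict mapping worker_id to a weight in (0, 1] (hypothetical scores).
    Returns (label, share); label is None when the winner falls below min_share.
    """
    totals = defaultdict(float)
    for worker, label in votes:
        # Unrated workers get a neutral weight (an assumption of this sketch).
        totals[label] += reputation.get(worker, 0.5)
    if not totals:
        return None, 0.0
    best = max(totals, key=totals.get)
    share = totals[best] / sum(totals.values())
    return (best if share >= min_share else None), share


votes = [("ann", "cat"), ("bob", "cat"), ("cy", "dog")]
reputation = {"ann": 0.9, "bob": 0.8, "cy": 0.3}
label, share = weighted_label(votes, reputation)
```

Tasks that return `None` could be routed to additional specialists rather than discarded, which is one way a process like this might improve as more ratings accumulate.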
“Our mission is to tap the world’s potential
brainpower,” said Matt Bencke, Founder and
CEO of Spare5. “Businesses need specialized
human insights to solve complex data problems.
It used to be somewhere between impossible and
impractical to crowdsource specialized insights
at scale, with confidence and speed. Not
anymore. We love seeing our customers get the
help they need to interpret unstructured data,
while freeing up their employees to focus on
their core competencies. There is a profound
difference when the right human intelligence
powers machine learning.”
Expert System combines semantics and machine learning
Cogito Studio designed for customized text analysis
Expert System, which characterizes its
business as “multilingual cognitive computing
technology for the effective management of
unstructured information,” announced the
release of Cogito Studio, a product developed by
its Cogito Labs for developing customized
semantic applications for text analytics,
including the analysis, categorization, and
extraction of information. (Expert
System’s Cogito Labs were established in 1994
in Modena, Italy. Today, the company has R&D
facilities in Rovereto and Naples, Italy;
Grenoble, France; Madrid, Spain; and in the
Silicon Valley and Washington DC areas.) Onix
Networking Corp., a provider of IT solutions
and services to government and corporate
customers, and Expert System also announced
that Onix will be integrating Expert System’s
Cogito technology into both commercial and
federal market solutions.
Cogito Studio
Cogito Studio combines both semantics
(formal representations of human language) and
deep learning techniques (machine learning).
Cogito Studio will help companies optimize the
launch of new projects by automatically learning
new knowledge, such as that for a specific
domain, by applying its semantic technology
that reads and understands words in context.
This approach can exceed the limitations of deep
learning because it reduces the need to manually
acquire large volumes of data and to deal with
cases not covered well in the data.
Marco Varone, President and CTO, Expert
System, said, “We believe that we can make
significant contributions to the field of artificial
intelligence. In our vision of AI, typical deep
learning algorithms for automatic learning and
knowledge extraction can be made more
effective when combined with algorithms based
on a comprehension of text and on knowledge
structured in a manner similar to that of
humans.”
Onix and Expert System
Onix delivers a full range of services
including consulting, deployment planning,
implementation and support for knowledge
discovery solutions. Onix also specializes in
cloud computing technologies.
Onix will be integrating Expert System's
Cogito technology into their solutions. Dal
VanDervort, Vice President Sales at Onix, said,
“Expert System's Cogito guards against
cognitive bias by applying multiple, diverse
worldviews through the use of community/topic-specific taxonomies. We are thrilled to be
offering customers this new capability as part of
our suite of offerings.”
Kik messaging app launches “Bot Shop”
New API allows outside developers to build bots for Kik
Kik Interactive launched its Bot Shop. The
Bot Shop features bots you can chat with (by
text) from within the Kik messaging app. The
bots you can address are from their partners,
including Vine, Funny Or Die, Riffsy, Sephora,
and The Weather Channel, with more promised.
You can find the Bot Shop in Kik by tapping on
the search magnifying glass, then “Find People,”
or, on the web, visit bots.kik.com. The company
also invited developers to start building bots for
Kik using a new bot API, available
at dev.kik.com.
This continues a trend away from users
downloading more apps (see Microsoft Bot
article, p. 1). People are spending more time in
chat apps like Kik instead of in other apps. Last year, 1.4
billion people used a chat app, according
to eMarketer, and they spend a lot of time in
those messaging apps. Kik said its research
shows that people spend 35 minutes per session
in Kik.
Kik CEO Ted Livingston said in a recent
interview, “Chat is going to be the next great
operating system. Apps will come to be thought
of as the new browsers; bots will be the new
websites. This is the beginning of a new
internet.”
New features help make the bot experience
smoother. Kik’s web bubbles (“wubbles”)
provide a way of displaying rich media in
conversation threads. “Suggested responses”
now take over the whole keyboard space and
allow a user to select options while chatting to a
bot without having to type anything in. There is
a new “mentions” feature, which allows you to
call a bot into a conversation by typing the “@”
symbol, allowing bots to be participants in chats
with friends.
Taco Bell builds a bot for workplaces that use Slack’s messaging platform
Natural language text interaction to take food orders
Taco Bell and its ad agency, Deutsch, built an
experimental bot on the Slack collaboration and
communications platform that uses Wit.ai’s AI
technology to take orders and even crack a
couple of jokes. (Wit.ai is now part of
Facebook.) The TacoBot is the digital assistant
version of the cashiers that take your order at its
restaurants. Taco Bell built the bot for
workplaces that use Slack’s messaging platform
to communicate internally. Participants can ask
TacoBot to place an order for them.
The bot uses natural text commands that
support a variety of ways of making a request,
e.g., “Can I have a burrito” or “Let me get a
burrito” or “Burrito, please.” It retains context:
you can order a taco in one line and then later
type “no cheese,” and TacoBot will understand
that you don’t want cheese on the taco you had
previously ordered.
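The context behavior described above can be illustrated with a toy Python sketch. It is not Wit.ai’s or Taco Bell’s implementation: the menu, the class name, and the regular expressions are hypothetical, and a production bot would rely on a trained language-understanding model rather than patterns. The point is only that a follow-up such as “no cheese” is applied to the most recently ordered item.

```python
import re

MENU = ("burrito", "taco", "quesadilla")  # hypothetical menu for illustration


class OrderContext:
    """Keeps the running order so follow-up messages can modify earlier items."""

    def __init__(self):
        self.items = []  # each item: {"name": str, "without": [str]}

    def handle(self, text):
        text = text.lower()
        # A modifier such as "no cheese" applies to the most recent item, if any.
        modifier = re.match(r"\s*(?:no|without)\s+(\w+)", text)
        if modifier and self.items:
            self.items[-1]["without"].append(modifier.group(1))
            return self.items[-1]
        # Otherwise look for a menu item anywhere in the request.
        for name in MENU:
            if re.search(rf"\b{name}\b", text):
                item = {"name": name, "without": []}
                self.items.append(item)
                return item
        return None  # nothing recognized


ctx = OrderContext()
ctx.handle("Can I have a taco")
ctx.handle("no cheese")
```

After these two messages, `ctx.items` holds a single taco with cheese listed under `without`.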
“We are at a point of switching up how we use
computers,” said Deutsch’s senior VP and
creative technology director, Martin
Legowiecki. “It used to be we had to talk like
computers.”
TacoBot is still in private beta mode with a
few companies that use Slack. Taco Bell and
Deutsch have plans to bring the bot to Facebook
Messenger and Amazon’s Echo. The hope is that
eventually, TacoBot will be able to remember
people’s order histories to make better
recommendations.
Interactions and Arise Virtual Solutions partner for Voice Virtual Assistants
Automated customer care on Arise’s platform using Interactions’ human-aided system
Interactions provides speech and natural
language technology for customer care (SSN,
March 2016, p. 1). Interactions and Arise
Virtual Solutions, a provider of crowdsourced
business process outsourcing (BPO) services,
announced that they have partnered to expand
the delivery of conversational Virtual Assistant
solutions for customer care. Arise will offer
Interactions Virtual Assistant solutions to its
customers, combining Interactions’ “Adaptive
Understanding” technology with Arise’s virtual
customer care platform.
The combined offering allows consumers to
accomplish tasks in self-service or transition to a
live agent when human interaction is needed.
Arise’s crowdsourcing platform provides
flexibility and reduces the cost of the human-augmented aspect of the virtual interaction. John
Meyer, CEO of Arise, said, “Interactions’
offerings enable us to expand our portfolio by
providing greater conversational self-service
options while call centers using our platform
handle those calls that are better suited for live
agents, either due to complexity or business
practices…This partnership is an ideal
alternative for businesses looking to bring
offshore customer care back on-shore, by
providing both higher quality and lower cost
point.”
Aspect Software will support customer chatbots through Facebook Messenger
Chatbot messaging with Aspect’s Natural Language Understanding and agent backup
Aspect Software, a provider of integrated
consumer engagement, workforce optimization,
and back-office solutions on premises and in the
cloud (LUI News, April 2016, p. 27), announced
an initiative to accelerate consumer brand
transactions and interactions through Facebook
Messenger, using the new “bots” support from
Facebook (p. 1). Aspect cited new research that
found nearly 40% of consumers would rather
use messaging apps like Facebook Messenger
for customer service versus a phone call. The
Aspect Consumer Experience Index study found
that 73% of consumers want companies to offer
more self-service options for customer service.
Aspect sees Facebook Messenger rapidly
becoming a critical customer service and
engagement channel. The company claims that
Messenger is a natural extension to Aspect’s
Customer Experience Platform’s (CXP) omnichannel capabilities: another conversational
channel for customers to self-serve on, in
addition to IVR, mobile Web, SMS, and Twitter.
The integration between Aspect CXP and
Facebook Messenger helps create chatbots using
Natural Language Understanding (NLU) in
more than a dozen languages to match a
customer query to the right response. The
chatbots can engage consumers in a
conversation to get more information when
needed or complete multi-step transactions.
And when agent assistance is needed,
conversations can be transferred seamlessly
using Aspect’s software, with full contact center
integration and appropriate routing to the right
agent without consumers needing to repeat
themselves. CXP’s design-once, deploy-anywhere support means that chatbots designed
for Messenger can easily be deployed on SMS
and Twitter. Aspect also says that its
Professional Services group’s years of experience in
conversational user interface design guarantee
that enterprise bots built on Aspect CXP begin
with a high level of quality.
Joe Gagnon, SVP and Chief Customer
Strategy Officer, said, “Intelligent, automated
messaging has the potential to create highly
engaging and interactive conversations for
consumers, and chatbots offer the promise for
timely and intuitive consumer engagement. But
brand interaction in isolation from the customer
service ecosystem, regardless of the medium or
application, puts businesses at risk of falling into
the same customer service failures of the past.
With 900 Million active users on Messenger,
Facebook is poised to take messaging to a new
and very exciting level, and we’re thrilled to be
elevating the quality of company-consumer
interactions on the platform to which more and
more consumers are moving.”
Aspect Customer Service on Facebook
Messenger can be integrated and implemented
alongside existing contact center and self-service solutions, even if they aren’t Aspect
platforms, according to the company. It allows
consumers to opt-in to outbound messages like
important notifications, payment reminders, or
sales promotions, and respond to inbound
inquiries with natural dialogues powered by
Aspect NLU.
Due to early interest, Aspect is inviting more
companies interested in testing customer service
interaction on Facebook Messenger to
participate in a free introductory production
pilot. Participating companies will have access
to Aspect’s NLU platform as well.
Nina Virtual Assistant from Nuance used in Swedbank customer service
Natural-language text interaction
Nuance Communications announced that
Swedbank Group, a major financial institution
in Sweden, Estonia, Latvia, and Lithuania, is
using Nuance Nina, an intelligent virtual
assistant that delivers a conversational customer
service experience to enable self-service
capabilities and quick and easy access to
information for both Swedbank customers and
service agents. With the natural language system
on the Swedbank Web site, banking customers
type their questions to the Swedbank virtual
assistant in order to find answers to their
questions and identify the financial services that
are best suited for their needs. In addition,
Swedbank customer service agents are using the
new system to quickly find information for
customers, reducing the amount of time
customers must spend on the phone seeking
answers to their questions. As a result, Nina has
helped Swedbank improve the customer
experience for consumers and agents alike,
reaching 78% first-contact resolution within the
first three months.
The virtual assistant provides a “chat
experience” based on the Nuance Nina platform,
using Nuance’s Natural Language
Understanding (NLU) technology. The web site
shows a chat box headed “Hi, how can we help
you?” (in Swedish), with the chatbox containing
the instructions “Feel free to ask your question
here!” Because more than 75% of Swedbank
customers prefer to conduct their banking via a
mobile app or the bank’s web site, the virtual
assistant is helping to guide customers quickly
to the answers that they seek, limiting the need
to call in to the bank with additional questions.
Swedbank’s virtual assistant is currently
answering 8 out of every 10 customer questions,
according to Nuance. Customer adoption of the
virtual assistant has been positive, with an
average of 30,000 conversations having
occurred per month within just the first three
months of the deployment.
In the future, Swedbank plans to expand Nina
to reach more of the customer base, followed by
its digital platforms, Mobile Bank and iPad
Bank. This includes the addition of transactional
capabilities to Nina Web, removing the need for
escalation to Swedbank’s contact center for
many service-related inquiries; new value-added
services via chat and call-back; increased
integration with a greater depth of customer data
for more precise routing of customers to the
right contact center agent; and increased
self-service (for example, for qualification and
assistance in response to a call-to-action in a
sales campaign).
Robert Weideman, executive vice president
and general manager, Nuance Communications,
said, “Since Nina was introduced, we’ve seen
the clear impact that these sorts of interactive,
conversational experiences can have when it
comes to improving the consumer’s experience,
and we’re also seeing tangible business benefits,
particularly on the web.”
AgentBot and Zendesk partner to offer virtual agent for ticketing
Integration allows easy transfer to agents when necessary
AgentBot creates text-based natural-language
virtual assistants for customer service. Zendesk
provides a web-based customer service system
for dealing with inbound ticket requests from
any channel—email, web, social, phone, or chat.
The companies announced a partnership
integrating their tools. The integration combines
the virtual agent solution offered by AgentBot
with the possibility of incorporating second-level assistance forms integrated with Zendesk.
With this partnership, both companies will
include new functionalities, giving their support
service solutions and their automatic customer
service more agility in the resolution of clients’
requests. It will be possible to complement all
the benefits of instantaneous automatic customer
service with the possibility of transferring
requests related to sales or complex matters
directly to Zendesk.
AgentBot indicates it uses an advanced
language understanding engine to generate
responses. The virtual agent retains important
information and recognizes different means of
expressing the same concept thanks to its
own evolving dictionary. It is designed to
understand the errors we make daily when
texting, as well as local regionalisms, while
giving more importance to products and
services.
Inbenta launches “Hybrid Chat,” to integrate human and automated chat
System can change options when agents aren’t available
Inbenta specializes in Natural Language
Processing and semantic search to improve the
customer experience online (SSN, April 2016, p.
19). The firm announced the global availability
of a new “Hybrid Chat” service. This hybrid
approach combines Inbenta’s self-service
Virtual Assistant support with its on-demand Live Chat technology. Now, when a
customer is not satisfied with an answer
provided by the virtual agent, they can instantly
open a conversation with a live agent who will
already have the detailed history for a seamless
user experience.
Here’s how Hybrid Chat works:
§ Using Inbenta’s self-service technology,
customers seeking support will immediately
see an interactive “Help” window appear on
the homepage of the company’s website.
§ Once engaged, the customer is greeted by a
customized Virtual Assistant avatar; the
avatar can be programmed to speak
questions and responses.
§ Customers will begin the conversation by
typing their question(s) into the chat box; the
virtual assistant will then access the
company’s knowledge base to find the most
relevant answer(s) based on Natural
Language Processing.
§ If the customer is satisfied, the user closes
out of the window and proceeds through
their journey; otherwise, they are offered a
connection to a live agent.
§ The full chat conversation is instantly sent to
the responding agent so they can pick up the
conversation wherever the Virtual Assistant
left off. There’s no need to make the
customer repeat themselves.
Hybrid Chat has the ability to detect whether
agents are online or not. Support teams can
make real-time changes to the “Help” widget so
that customers will either have access to a
Virtual Assistant, a Virtual Assistant + Chat, or
simply the Chat feature when support teams are
unavailable.
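The hand-off logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Inbenta’s code: the class, function names, and mode strings are hypothetical, and the sketch takes one reading of the description, namely that the widget falls back to the Virtual Assistant alone when no agent is online.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Conversation state shared by the virtual assistant and live agents."""
    transcript: list = field(default_factory=list)
    handler: str = "virtual_assistant"

    def say(self, speaker: str, text: str) -> None:
        self.transcript.append((speaker, text))


def widget_mode(agents_online: bool, assistant_enabled: bool = True) -> str:
    """Choose what the Help widget offers (mode names are hypothetical)."""
    if agents_online and assistant_enabled:
        return "assistant+chat"
    if agents_online:
        return "chat"
    return "assistant"  # assumed fallback when no agent is available


def escalate(session: ChatSession, satisfied: bool, agents_online: bool):
    """Hand the full transcript to a live agent when the customer is not satisfied."""
    if satisfied or not agents_online:
        return None  # the conversation stays with the virtual assistant
    session.handler = "live_agent"
    return list(session.transcript)  # copy passed to the responding agent


session = ChatSession()
session.say("customer", "Where is my order?")
session.say("assistant", "You can track it from your account page.")
handoff = escalate(session, satisfied=False, agents_online=True)
```

Because the agent receives the whole transcript, the customer does not have to repeat the question after the transfer.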
Signpost unveils Mia, AI-driven CRM
Automated, personalized customer relationship assistant using machine learning
Signpost announced the launch of Mia—
artificial intelligence that helps companies
connect with customers. Mia automatically
collects customer data such as email, phone, and
purchase information to create a customer
record. Using this data, Mia stays connected
with customers and prospects to drive word-of-mouth reviews and repeat business. Mia uses
machine learning to eliminate the need for a
traditional CRM and other forms of manual
marketing.
“Mia is the future of communication between
businesses and consumers,” claimed Signpost
CEO Stuart Wall. “B2C companies have long
shied away from CRM systems, because they’re
complicated, time-consuming, and built for
companies with dedicated sales teams. The
average CRM forces users to manually upload
leads, cross-reference data streams like email,
text, and purchases; and, perhaps most onerous,
they require users to curate email marketing
campaigns. Mia removes that friction by
automatically capturing and optimizing every
customer interaction with zero effort from the
user.”
Users simply set objectives they would like to
achieve, such as five-star reviews, customer
referrals, or new customers. Mia learns
and changes her behavior with every customer
interaction. She generates follow-up and
ongoing communication through email and
SMS.
Mia has data on more than 16 million US
consumers across more than 6,000 companies
that are Signpost customers. From this
proprietary data set she learns what types of
communication work best and applies that to her
email or SMS, driving targeted communication.
Signpost recently raised a $20M Series C
round. The company is backed by Google
Ventures, Spark Capital, Georgian Partners, and
OpenView Ventures.
noHold expands its virtual assistants to multiple business units in a company
Includes Human Resources, IT, Legal, Marketing, Sales and Support
noHold provides web-based self-service
solutions through a natural-language virtual
assistant using text interaction. Its flagship
product, Support Advisor, thus typically answers
support and customer care type questions.
The company has expanded to include
internally facing enterprise Virtual Assistants
such as:
§ HR Advisor: allows employees to help
themselves with human resources questions;
§ IT Advisor: for the enterprise help desk;
§ Legal Advisor: to enforce compliance;
§ Marketing Advisor: to prequalify sales
opportunities;
§ Sales Advisor: to increase conversion rates;
and
§ Support Advisor: to reduce support costs.
According to Diego Ventura, CEO of noHold,
“When creating a Virtual Assistant for a specific
organization within the enterprise, you must take
into consideration the different features needed
by those particular business units that will
optimize the Virtual Assistant. For example, we
discovered that when a Call Center Agent is on
the phone with a customer, they need to have a
sense of the questions they will be asking and
why. In these situations, the Virtual Assistant
cannot always ask questions serially, so we
implemented the ‘Look Ahead’ feature,
designed to show the Agent why the Virtual
Assistant would be asking those questions.”
AYLIEN adds News option to its NLP text analysis API
SDKs for Natural Language Processing, Information Retrieval, and Machine Learning tools
AYLIEN Text API is a package of Natural
Language Processing, information retrieval, and
machine learning tools for extracting meaning
and insight from textual and visual content. The
company has introduced a more specialized
version for summarizing news content, their
News API.
Text API
AYLIEN indicated that it had broad use-cases
in mind when designing and developing the
Application Programming Interface, so
developers could use it for applications such as
organizing legal documents (such as patents).
However, the company notes that the API is
“slightly geared towards analyzing news and
Social Media content.”
The tool allows search by keywords, entities
(people, organizations, brands), and topics. You
can filter stories by sentiment towards People,
Organizations, Brands, etc., and specify topics
and categories that matter to you. You can also
track specific news outlets and authors. You can
further narrow down results to a specific region.
The company describes features of the tool:
§ Article Extraction strips HTML documents
of ads, navigation elements, and anything
that gets in the way of understanding the
text.
§ Summarization extracts key sentences from a
text, leaving only the most important
concepts.
§ Classification tags a text with metadata from
up to 500 categories.
§ Entity Extraction lists organizations, phone
numbers, currency amounts, and individuals
mentioned in a text.
§ Concept Extraction goes beyond Entity
Extraction to provide Linked Data for topics
mentioned, including semantic types and
Uniform Resource Identifiers.
§ Automatic Hashtag Suggestion helps get
more exposure for content on Social Media.
§ Sentiment Analysis summarizes the tone of a
text—positive or negative, subjective or
objective.
§ Language Detection ensures that you and the
text in question are speaking the same
language.
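As a rough picture of what “extracting key sentences” can mean, the Python sketch below scores each sentence by the average frequency of its words in the whole text and keeps the top scorers in their original order. This is a textbook frequency heuristic, not AYLIEN’s algorithm, and it does not call the AYLIEN API; the function name and the small stopword list are assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to", "was"}


def content_words(sentence):
    """Lowercased words of a sentence with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


text = ("Bots answer customer questions. "
        "Bots route hard questions to agents. "
        "The weather was nice.")
summary = summarize(text, max_sentences=2)
```

Here the two sentences about bots share frequent words and outrank the unrelated third sentence.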
News API
The News API is specifically tuned to help
summarize news sources. It allows users to
search, source, and index news and blog content
from across the Web in real time. The service
crawls and indexes thousands of news sources
every day and analyzes their content using an
NLP-powered Text Analysis Engine to provide a
flexible news data source.
Developers will find interactive API
documentation and code snippets (JavaScript,
Python, PHP, Go, C#, and Ruby) for the News
API on the AYLIEN website.
AI-driven virtual assistant from Kasisto powers India’s first mobile-only bank
Text inquiries in natural language simulate banking assistant
Kasisto, which spun out of SRI International
in 2014, is supporting a new generation of
banking services that are mobile and accessible
using natural language text. Its conversational
AI platform powers customizable virtual
assistants that enable banks to engage with their
customers.
Kasisto announced that its KAI platform is
behind the virtual assistant in digibank, a new
mobile-only bank launched in India. In
addition, digibank creator DBS Bank has taken
a minority stake in Kasisto.
DBS Bank, Singapore’s largest bank and a
leading bank in Asia, announced digibank,
India’s first mobile-only bank, in April.
Breaking away from conventional banking
norms, digibank is a completely paperless,
signatureless, and branchless bank.
Account opening can be done at an extensive
network of outlets run by DBS’ partners; for a
start, this includes over 500 cafes across India.
There is no paperwork involved; instead,
customer authentication is done purely using the
Aadhaar card, a biometrics-enabled ID which
has been issued to over 1 billion Indians.
Customer service is provided by a Kasisto-based
24x7 AI-driven virtual assistant. Customers can
converse with the virtual assistant to get their
queries answered or banking transactions
performed, with requests such as, “What is my
account balance?,” “Show me past transactions,”
or “Pay Amit 100 rupees.” Today, the assistant
can anticipate and answer some 10,000 customer
questions, with new knowledge continually
added.
DBS CEO Piyush Gupta said, “With the
advent of technology, banking as we know it is
being completely transformed. digibank places
an entire bank in our customers’ hands.”
Toyota is forming a new data science company in partnership with Microsoft
Toyota Connected has a goal of simplifying technology so it’s easier to use in vehicles
Toyota is forming a new data science
company, Toyota Connected, in partnership
with Microsoft, that’s designed to free
customers “from the tyranny of technology.”
Zack Hicks, the company’s CEO and Toyota
Motor America’s Chief Information Officer,
explained that the goal was making the
increasing number of options created by Internet
connectivity less difficult to use. The company
wants to create more intuitive user interfaces,
such as speech interaction, to address the
problem. “I think people are really tired of
fumbling with multiple devices and having this
disjointed experience,” Hicks said when Toyota
announced the venture.
Toyota Connected will research options such
as linking with other vehicles so they can report
weather and traffic conditions to people driving
the same route. Microsoft engineers will work
with the company at its headquarters in Plano,
Texas, where Toyota is moving its US
operations. Microsoft bought a 5% equity stake
in Toyota Connected, which is organized as a
separate corporation.
Toyota says the new company will support
research into artificial intelligence and robots, as
well as analyze data from vehicle sensors and
cameras, so algorithms can be developed for
self-driving cars. Toyota Connected will use
Microsoft’s Azure cloud computing platform to
collect and analyze data. Drivers would have to
opt in to all of the data reporting, and Toyota
would disclose what data is being shared, the
company said.
In January, Toyota began a $1 billion, five-year investment in Toyota Research Institute,
which is setting up centers near Stanford
University and Massachusetts Institute of
Technology. Leading the effort is Gill Pratt, the
former top robotics engineer for the US military.
Mobvoi releases new in-car app for information and entertainment
Chinese company supports voice interaction in Android OS
After receiving an investment from Google,
the Chinese AI company Mobvoi is moving into
in-car infotainment. Mobvoi has just released
an Android app for drivers, Wenwen In Car—
“drive and ask.” The new app leverages Mobvoi
technology used in its voice search engine
Chumenwenwen. The Android app shows four
large buttons on the right-hand side and a
speech button on the left.
Wenwen In Car allows drivers hands-free
operation through voice interaction. It supports
traffic updates, navigation, points of interest,
and calling. The app can also be used to play
music and request information. The application
can be engaged by saying “Nihao Wenwen.”
Wenwen In Car uses Mobvoi’s speech
recognition, TTS, proactive search, mobile
search, and semantic analysis.
Amazon’s Alexa featured on a smartwatch from Chinese company iMCO
Paired wirelessly with Android or iOS phones
Chinese company iMCO Technology has
introduced a smartwatch called the CoWatch
(see image), expected to be available in June in
limited quantities. It features a high-resolution
touchscreen and Wi-Fi and Bluetooth
connectivity, and it also runs on the Cronologics
OS, an operating system designed by Google
and Android veterans. CoWatch is the first
smartwatch to feature Alexa, the voice-interactive digital assistant from Amazon. As
the voice of the Amazon Echo (p. 6), which
doesn’t have a screen, Alexa focuses on voice
interaction, where many digital assistants default
to a screen (usually a web search) when
stumped. Thus, Alexa is more suited to a
smartwatch than most digital assistants.
The CoWatch was launched on crowd-funding
site Indiegogo in April. The watch can
synchronize with smartphones running on
Android 5.0 or higher, or iPhones with iOS 9. It
features a 400 x 400 pixel Super AMOLED
display at 1.39 inches.
iMCO CoWatch
The watch is fitted with a dual-core MIPS
processor, 1GB RAM and 8GB storage. It also
includes sensors such as an accelerometer,
magnetometer, gyro, vibrating motor, and a
heart rate sensor.
The watch comes with a charging cradle, and
it takes one hour to charge up to 70%. The
wearable device has a 300mAh battery that
promises up to 32 hours of usage.
Samsung ARTIK modules can support speech recognition and NLU
SoundHound’s Houndify adds the technology to connected devices
The open Samsung ARTIK platform provides
all the essential hardware and software to
support the Internet of Things (IoT). Samsung
calls it “the end-to-end, integrated IoT platform
that transforms the process of building,
launching, and managing IoT products.” It’s an
entire integrated ecosystem, from silicon to
development tools to cloud, plus an extensive
array of technology and development partners.
SoundHound announced as partner
One recently announced partner is
SoundHound. SoundHound has a consumer
product, Hound, that is essentially a general
mobile personal assistant for iOS and Android,
with a focus on specific talents. Hound uses
SoundHound’s Speech-to-Meaning technology
to showcase an impressive smartphone
experience (judging from the examples of that
experience shown by Katie McMahon, VP &
General Manager, SoundHound, at the Mobile
Voice Conference). Hound showcases the
company’s Houndify platform, designed to be
used by application developers to support speech
understanding and voice search in their
applications. The collaboration between the
company’s new Houndify platform and the
Samsung ARTIK platform ecosystem enables
connected devices to have the power of a voice-enabled conversational interface.
Samsung and SoundHound worked closely to
enable Houndify across the entire Samsung
ARTIK family of modules. The collaboration
resulted in an optimized version of the Houndify
SDK that can run on the Samsung ARTIK 1
module, using extremely low memory and
processing power, while unlocking the full
power of the Houndify platform. The new SDK
is available on Houndify.com to all developers
interested in the ARTIK platform, and is being
actively integrated with devices that can be
shipped to hundreds of millions of consumers in
the near future, with some shipping later this
year, according to SoundHound.
Keyvan Mohajer, Founder and CEO,
SoundHound, said, “We are a strong believer in
the IoT revolution – the world of connected
devices around us. We also believe that the most
compelling, and often the only form of
communication with these connected devices is
using voice and natural language. The Samsung
ARTIK platform enables developers to more
easily create a connected device, and the
Houndify platform enables them to quickly and
easily add a voice-enabled, conversational
interface to whatever they build.”
Samsung’s Otto
Samsung created Otto, an Echo-like device to
show off ARTIK technology (see image). It can
respond to voice inquiries. It includes a video
camera that lets you look around when not at
home, like a surveillance camera (which has
already garnered some negative publicity as a
hacker’s target). It’s not yet a product, however.
VoiceBox on the Artik Platform
VoiceBox Technologies, a global provider of
contextual voice and natural language
understanding (NLU) technologies for
automotive, mobile and IoT products,
announced the intent to become a Samsung
Artik Cloud Platform collaborator.
VoiceBox indicated that IoT developers face a
number of challenges that are solved by using
the Artik Cloud Platform. Hurdles like
managing data from devices using a variety of
protocols, ensuring interoperability across siloed
ecosystems, working with heterogeneous data,
reducing latency, protecting data privacy, and
accessing data from disparate clouds can be
addressed through Artik tools and services.
“Our vision of delivering a common,
intelligent IoT interface that is multi-device,
cross-device and multi-user, is quickly extended
by adopting standards-bridging platforms
like Artik Cloud,” said Mike Kennewick,
VoiceBox's CEO.
Conversica launches AI assistant for automotive service
Automatically maintains contact with current and potential service customers
Conversica, a provider of sales conversion
management software, offers an AI persona that
helps sales and marketing personnel close sales
(LUI News, April 2016, p. 41). The sales
assistant engages potential customers in two-way
email conversations to uncover their intent
and connect them with a salesperson.
In April, the company announced the launch
of the Conversica automotive service assistant,
an AI-based solution for automotive dealerships
and their service departments that engages new
car buyers, as well as present and past service
customers, in natural, two-way email
conversations to get them into the service drive.
This automated yet natural engagement with
service customers frees up service advisors to
focus on the day’s service appointments and
builds a long-lasting relationship between dealer
and customer.
According to Conversica, the service center is
the most profitable department in a dealership
with a 72% gross revenue margin and is the
main point of contact for creating a lifelong
relationship with the customer. Moreover, 82%
of car buyers that service with their dealership
will buy their next car from that dealership and
consequently, over a lifetime, one customer is
worth on average over $500,000 in revenue from
car purchases, servicing, and parts. Therefore,
keeping a customer engaged with the dealership
at every stage of ownership is vital to future
revenue.
With artificial intelligence for auto service,
Conversica also delivers valuable information
about potential service customers to the service
department. With details from the AI
conversation, service advisors will be prepared
to engage each customer when, where and how
that person prefers. The auto service assistant is
designed to handle situations such as:
§ Engaging service leads in real time as they
come in through the dealer website;
§ Encouraging new car buyers to make their
first service appointment;
§ Engaging new car buyers who have never
scheduled a service;
§ Reengaging customers who have been in for
service but not returned; and
§ Following up after service appointments to
gauge customer satisfaction and identify
areas for improvement.
Baidu Research and Peel collaborate on voice-enabled smart home products
“In the future, it will be as easy to talk to your devices as it is to talk to the person next to you.”
Peel offers a popular universal remote app
called Peel Smart Remote for smartphones and
tablets. It has more than 150 million users in 200
countries and executes 10 billion monthly
remote commands. It uses the infrared
functionality built into many Samsung phones,
as well as the HTC One, to control TVs, DVRs,
and cable boxes. An iPhone user can take
advantage of its other features, such as its show
and channel trackers, or pair it with a $50 Pronto
to take advantage of its remote control
capabilities.
Baidu Research, a division of Baidu, Inc.,
announced a technology collaboration with Peel
where Baidu’s Deep Speech technology will be
integrated into Peel’s AI-based platform for
home control to create voice-enabled smart
home products. Deep Speech is a speech
recognition system developed using “end-to-end
deep learning” by Baidu Research’s Silicon
Valley AI Lab (SVAIL).
The companies demoed a beta version of
Peel’s voice-based remote at the GPU
Technology Conference in San Jose, CA in
April. The Peel demo used speech recognition to
access live TV, DVR, and streaming content
seamlessly across devices. For example, using
voice commands, a user can switch between
House of Cards on Roku and Game of Thrones
on cable TV, or ask to see a line-up of comedy
shows or programs about the US presidential
election.
Baidu’s Adam Coates, who leads the SVAIL
team, said, “Speech recognition is at an
inflection point. In the future, it will be as easy
to talk to your devices as it is to talk to the
person next to you. We are excited about the
potential of the collaboration with Peel to bring
that experience to users.”
Peel Co-founder and Chief Product Officer
Bala Krishnan added: “This collaboration is
opening up new exciting possibilities for our
users. Voice command and artificial intelligence
are the foundation for the next generation of
Peel universal home control.”
LumenVox updates its speech recognition and TTS for IVR systems
Adds partners using its technology
LumenVox has released version 14.2 of its
technology supporting speech recognition and
text-to-speech in an Interactive Voice Response
(IVR) customer service environment. The
company now has 57 TTS voices available. The
new additions include Polish Female
“Agnieszka” and “Ewa” voices, a Polish Male
“Jacek” voice and an American English Male
child “Justin” voice. LumenVox TTS now
covers 24 languages.
The Dashboard configuration utility has been
extended to provide more options, including the
ability to configure the “client property”
settings, making it easier for users to configure
LumenVox servers. A new feature that allows
the LumenVox Manager Service to be restarted
directly through the Dashboard interface was
also added.
The company added two new features to its
Speech Tuner: more control over the
naming of saved audio files, and a flexible new
accuracy classification feature that lets
companies see the hypothetical effects on
individual interactions if the confidence
threshold were changed.
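The idea behind such a threshold "what-if" analysis can be sketched in a few lines of Python. This is an illustrative sketch only, not LumenVox's Speech Tuner or its API: the `Interaction` record, the 0-to-1 confidence scale, and the four outcome labels are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged recognition result (hypothetical record layout)."""
    utterance_id: str
    confidence: float  # recognizer confidence, assumed to lie in [0, 1]
    correct: bool      # whether a human reviewer judged the result correct


def classify(interaction: Interaction, threshold: float) -> str:
    """Label one interaction as it would be handled at the given threshold."""
    accepted = interaction.confidence >= threshold
    if accepted:
        return "correct_accept" if interaction.correct else "false_accept"
    return "false_reject" if interaction.correct else "correct_reject"


def what_if(interactions, threshold: float) -> Counter:
    """Count each outcome class for a hypothetical threshold."""
    return Counter(classify(i, threshold) for i in interactions)


# Made-up log entries, for illustration only.
log = [
    Interaction("u1", 0.92, True),
    Interaction("u2", 0.55, True),
    Interaction("u3", 0.48, False),
    Interaction("u4", 0.71, False),
]

for threshold in (0.5, 0.8):
    print(threshold, dict(what_if(log, threshold)))
```

Raising the threshold in this toy log turns one false accept into a correct reject, at the cost of rejecting one utterance that had been recognized correctly. That trade-off is what a tuning tool of this kind would expose.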
LumenVox announced new partners,
including Forty 7 Ronin and
InfinityCTI. They were awarded LumenVox
Skills Certification, which demonstrates a
partner’s capability to deliver high-quality
speech solutions based on the LumenVox speech
automation suite.
Forty 7 Ronin specializes in Voice User
Interface Design, IVR development, and speech
recognition. The company provides voice
solutions supported by the LumenVox Speech
Recognizer and Text-To-Speech Server on the
Genesys, Aspect, and Avaya IVR platforms.
InfinityCTI supports speech-enabled
applications on the Avaya Aura Experience
Portal. The company utilizes the LumenVox
speech automation suite to provide an IVR
system that automates membership renewal,
card replacement, and credit card payment
services over the phone for AAA Arizona. The
InfinityCTI speech-enabled application
increased the call completion rate and reduced
the agent load in the AAA Arizona Call Center,
according to InfinityCTI. The enhanced tuning
services provided by InfinityCTI with the
LumenVox Speech Application Tuner have
helped improve customer satisfaction with the
call center.
Google’s annual Founders’ Letter
“Over time, the computer itself—whatever its form factor—will be an intelligent assistant helping
you through your day.”
On April 28, Google posted its annual
Founders’ Letter, introduced by Alphabet CEO
Larry Page and written by Google CEO Sundar
Pichai. One particular summarizing comment
seems key:
“Looking to the future, the next big step will
be for the very concept of the ‘device’ to fade
away. Over time, the computer itself—whatever
its form factor—will be an intelligent assistant
helping you through your day. We will move
from mobile first to an AI first world.”
A few other excerpts:
Search and assistance
“…today we are about one thing above all
else: making information and knowledge
available for everyone…the majority of our
searches come from mobile, and an increasing
number of them via voice.”
“You should be able to move seamlessly
across Google services in a natural way, and get
assistance that understands your context,
situation, and needs…Smart assistance should
understand all of these things and be helpful at
the right time, in the right way.”
Machine learning and artificial intelligence
“…creating artificial intelligence that can
help us in everything from accomplishing our
daily tasks and travels, to eventually tackling
even bigger challenges like climate change and
cancer diagnosis.”
More great content, in more places
“Our focus on our core mission has led us to
many efforts over the years to improve
discovery, creation, and monetization of
content—from indexing images, video, and the
news, to building platforms like Google Play
and YouTube.”
Powerful computing platforms
“Android…has more than 1.4 billion 30-day-active
devices—and growing. Today’s
proliferation of ‘screens’ goes well beyond
phones, desktops, and tablets. Already, there are
exciting developments as screens extend to your
car, like Android Auto, or your wrist, like
Android Wear.”
“Most of these computing experiences are
very likely to be built in the cloud.”
Enterprise
“As we look to our long-term investments in
our productivity tools supported by our machine
learning and artificial intelligence efforts, we see
huge opportunities to dramatically improve how
people work.”
Building for everyone
“The Internet is one of the world’s most
powerful equalizers, and we see it as our job to
make it available to as many people as
possible.”
“Making this possible is a lot more
complicated than simply translating a product or
launching a local country domain.”
“For us, technology is not about the devices or
the products we build. Those aren’t the end-goals. Technology is a democratizing force,
empowering people through information.”
Google launching a new machine learning platform
Free limited access to create custom models, with pre-trained models including a Speech API
In April, at the NEXT Google Cloud Platform
user conference, Google officials announced a
new cloud-based machine-learning platform.
The company is offering a limited free trial.
Google Cloud Machine Learning
provides modern machine learning services,
with some pre-trained models and a platform to
generate custom-tailored models. The neural-net-based
machine learning platform is claimed
to have better training performance
and increased accuracy compared to other large-scale
deep learning systems. The pre-trained
models use REST APIs (RESTful web
Application Programming Interfaces). The cloud
platform is a scalable and distributed training
infrastructure, powered by Graphics Processing
Units (see Editor’s Notes, p. 5).
A developer can create a model with the
TensorFlow framework, open-sourced by
Google, that powers many Google products,
from Google Photos to Google Cloud Speech
(p. 21). Companies can use the TensorFlow
SDK to train models locally on sample data sets
and use the Google Cloud Platform for training
at scale. In future phases, Google indicated that
models trained using Cloud Machine Learning
can be downloaded for local execution.
The service is integrated with other Google
Cloud Data platform products such as Google
Cloud Storage or Google BigQuery to ease the
process of training machine learning models
from data. Major Google applications use the
Cloud Machine Learning platform, including the
Google search app itself (voice search), Photos
(image search), Translate, and Inbox (Smart
Reply).
“In the future almost everything will be done
in the cloud,” Google Chief Executive Officer
Sundar Pichai said in a speech at the company’s
Google Cloud Platform NEXT conference. “For
years we have been investing in scaling up our
infrastructure to do this.”
Google will put “thousands of people” to
work on the systems that support its cloud over
the next few years, in part to build the data
center and network infrastructure needed to
handle many large corporate customers,
according to Alphabet, Inc. Chairman Eric
Schmidt.
Google’s Speech API can stream text results,
returning partial recognition results as they
become available, with the recognized text
appearing immediately while the user is speaking.
Alternatively, the Speech API can return recognized
text from audio stored in a file.
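The difference between partial (interim) and final results can be illustrated with a small Python sketch. It does not call Google's service: `fake_streaming_recognizer`, `Result`, and `transcribe` are hypothetical stand-ins that only mimic the shape of a streaming interface, under the assumption that each interim result carries the full hypothesis so far.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class Result:
    text: str
    is_final: bool


def fake_streaming_recognizer(words: Iterable[str]) -> Iterator[Result]:
    """Stand-in for a streaming recognizer (not a real API).

    Yields a growing interim hypothesis after each word, then one final result.
    """
    heard = []
    for word in words:
        heard.append(word)
        yield Result(" ".join(heard), is_final=False)
    yield Result(" ".join(heard), is_final=True)


def transcribe(results: Iterable[Result],
               on_partial: Callable[[str], None] = print) -> str:
    """Show interim text as it arrives and return the final transcript."""
    final_text = ""
    for result in results:
        if result.is_final:
            final_text = result.text
        else:
            on_partial(result.text)  # e.g. update the on-screen text
    return final_text


if __name__ == "__main__":
    stream = fake_streaming_recognizer(["turn", "on", "the", "lights"])
    print("final:", transcribe(stream))
```

An application would typically display interim text for responsiveness but act only on the final result, since interim hypotheses can still change.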
The company says its speech-recognition
service is also free during a preview phase, after
which there will be a charge.
There is also a Cloud Vision API and Cloud
Translate API. Google will charge companies
between 60 cents and $5 for every 1,000 uses of
its Cloud Vision system. This lets users
automatically detect faces, landmarks, logos and
other features in images. Google says the more
companies use the services, the cheaper it will
be for them; there is also a free tier for 1,000 or
fewer uses during a month.
The Google Cloud Speech API
The Google Cloud Speech API enables
converting audio to text by using neural network
models. The API recognizes over 80 languages
and variants. Google said the API can support
transcription through an application’s
microphone or enable voice control in an
application, among other use cases.
Google expands hands-free operation in Android
Button control by voice aids those with disabilities
Google Voice Access for Android, recently
launched in beta testing, is intended to bring
increased functionality to people who have
accessibility issues, e.g., difficulty seeing
objects at handheld distance or pressing buttons,
both a challenge to smartphone use. Among
other things, Voice Access adds discrete button
control. The app assigns numbers to every
button that it’s possible for a user to press.
Users can operate both keyboards and
websites by calling out the number displayed
next to the element they want to press. In
addition, users can say “open Chrome” or “go
home” to navigate around the phone or interact
with the screen by saying “click next” or “scroll
down,” for example. The Voice Access beta disables
the phone’s touch screen while Voice Access is
running.
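The numbering scheme described above can be sketched as follows. This is a toy model in Python, not Android's accessibility API or Google's implementation: the `Element` type, the screen contents, and the rule that numbers follow screen order are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Element:
    label: str
    clickable: bool


def number_targets(elements):
    """Assign 1, 2, 3, ... to the clickable elements, in screen order."""
    clickable = [e for e in elements if e.clickable]
    return {n: e for n, e in enumerate(clickable, start=1)}


def resolve(spoken: str, targets):
    """Map a spoken number such as "2" to its element, or None if unknown."""
    spoken = spoken.strip()
    if spoken.isdecimal():
        return targets.get(int(spoken))
    return None


# Hypothetical screen contents.
screen = [
    Element("Back", clickable=True),
    Element("Page title", clickable=False),
    Element("Next", clickable=True),
]

targets = number_targets(screen)
chosen = resolve("2", targets)
print(chosen.label if chosen else "no such target")
```

Numbering only the pressable elements keeps the spoken vocabulary small, so a short digit string is enough to select any control on screen.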
Google has also improved its screen reader on
its Chromebooks. Every Chromebook now
comes with a built-in screen reader called
ChromeVox, enabling the blind or visually
impaired to read the screen using text-to-speech
software. The latest version, ChromeVox Next
Beta, includes a simplified keyboard shortcut
model, a new caption panel to display speech as
text and Braille output, and a new set of
navigation sounds.
Google Docs now enables users to type, edit,
and format documents using voice commands,
making it easier for people who can’t use a
touchscreen to edit documents. Google also
pointed out that it is continuing to work closely
with Freedom Scientific, a provider of assistive
technology products, to improve the Google
Docs and Drive experience with the JAWS
screen reader.
New Samsung Galaxy models include Sensory’s TrulyHandsfree Voice Control
Samsung Galaxy S7 and S7 Edge smartphones continue a long-term relationship with Sensory
Sensory Inc. announced that Samsung
continues to utilize TrulyHandsfree technology
to provide a speech trigger and speech
recognition experience on their Samsung Galaxy
S7 and Galaxy S7 Edge smartphones. Samsung
continues to utilize Sensory’s embedded
TrulyHandsfree speech recognition engine that
works without an Internet connection.
Sensory technologies have been shipping on
Samsung Galaxy smartphone products since
2011, on every Samsung Galaxy smartphone
product since the S2, as well as Samsung’s
Galaxy Note tablets and Galaxy Gear and
Galaxy Gear S2 smart watches.
TrulyHandsfree supports US English, UK
English, French, German, Italian, Japanese,
Korean, Mandarin Chinese, Portuguese,
Russian, and Spanish. Sensory TrulyHandsfree
has ultra-low-power deeply embedded ports
available for leading DSP/MCU IP cores from
ARM, Cadence, CEVA, NXP CoolFlux,
Synopsys and Verisilicon, as well as for
integrated circuits from Audience, Avnera,
Cirrus Logic, Conexant, DSPG, Fortemedia,
Intel, Invensense, Microsemi, NXP, Qualcomm,
QuickLogic, Realtek, STMicroelectronics, TI,
and Yamaha.
Speech-enabled Unibet Sports Betting App uses Artificial Solutions’ Teneo
Natural-language interaction even while watching a streaming game
Artificial Solutions announced a new speech-enabled
intelligent interactive feature in
the Unibet Sports Betting App from Malta-based
Unibet that allows users to place a bet by speaking
in natural language. The feature uses the
Teneo platform from Artificial Solutions, which enables
enterprises to rapidly build a range of natural
language applications, from digital employees
and mobile personal assistants to wearables and
IoT interfaces, all from a single platform.
Unibet, one of Europe’s largest and fastest
growing online gaming operators, with over 13
million customers globally, is making its mark
by making betting simpler and more intuitive.
Using Artificial Solutions’ Natural Language
Interaction (NLI) technology, Unibet is
experimenting with ways of eliminating some of
the current frictions in online betting—such as
in navigation and finding the desired bet.
With Unibet’s new NLI-enabled app the
customer simply needs to say something such as
“Five quid on Chelsea to win” or “A tenner on a
3-0 City win” and the app does the rest, guiding
the customer until the bet is placed. Should any
ambiguity occur, the app will either make some
safe assumptions or ask the customer for
clarification.
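A toy Python sketch shows the general pattern of filling "slots" from an utterance and asking a follow-up question when one is missing. It is not the Teneo platform or Unibet's logic: the slang table, the regular expressions, and the prompts are hypothetical and cover only the two example phrases quoted above.

```python
import re

# Hypothetical slang-to-pounds table; a real system would need far more coverage.
STAKES = {"five quid": 5, "a fiver": 5, "a tenner": 10, "ten quid": 10}


def parse_bet(utterance: str) -> dict:
    """Extract stake, team, and optional exact score from a short bet phrase."""
    text = utterance.lower().strip()
    slots = {"stake": None, "team": None, "score": None}

    for phrase, amount in STAKES.items():
        if text.startswith(phrase):
            slots["stake"] = amount
            text = text[len(phrase):]
            break

    score = re.search(r"\b(\d+)-(\d+)\b", text)
    if score:
        slots["score"] = (int(score.group(1)), int(score.group(2)))

    team = re.search(r"\bon (?:a \d+-\d+ )?([a-z ]+?)(?: to win| win)\b", text)
    if team:
        slots["team"] = team.group(1).strip()
    return slots


def next_prompt(slots: dict):
    """Return a clarifying question for the first missing slot, else None."""
    if slots["stake"] is None:
        return "How much would you like to stake?"
    if slots["team"] is None:
        return "Which team are you backing?"
    return None


for phrase in ("Five quid on Chelsea to win", "A tenner on a 3-0 City win"):
    bet = parse_bet(phrase)
    print(bet, next_prompt(bet))
```

A production system would also have to resolve ambiguous names such as "City" against the current fixtures, which is where the safe assumptions or clarifying questions mentioned above come in.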
NLI is particularly useful when coupled with
Unibet’s live streaming portfolio (Unibet
streams over 30,000 major events per year). Bets
can be placed without having to exit the stream
and risk missing a crucial play.
Will Mace, Head of Strategic Development at
Unibet, said, “As speech-enabled apps become
more commonplace, people expect interactions
to be intelligent, fast and responsive. They are
fed-up with clunky menu driven interfaces;
instead customers are demanding sophisticated
understanding of their spoken words.”
Speech Processing Solutions adds speech-to-text service to its dictation software
Philips SpeechLive dictation service now available as a cloud service
Speech Processing Solutions, a global
provider of professional dictation solutions, has
just launched the latest update of its Philips
SpeechLive cloud dictation workflow solution.
The latest release now offers a fully-integrated
speech recognition service that turns recordings
into text. Philips SpeechLive also allows users to
send dictation from anywhere to the cloud and
have their human assistant transcribe the files
for them. People using dictation can also use the
SpeechLive transcription service, where
industry-specific and trained professionals type
up the recordings.
Real estate agents, journalists, finance
professionals, and people in the legal, healthcare
and insurance industry save time and money by
having their recordings automatically typed up
for them. Notes, tasks, and documents can
simply be recorded with a Philips voice recorder
or the Philips dictation recorder smartphone app
and the SpeechLive speech recognition service
transcribes the files almost immediately. The
new service is available for 21 different
languages. As an introductory offer, Philips is
giving its customers 10 free speech recognition
minutes per user every month.
SYSTRAN API Platform enables translation and natural language processing
SYSTRAN.io supports multiple languages with cloud-based Application Programming Interface
SYSTRAN Software, a global 48-year-old
language translation technology company,
opened its language intelligence technology to
multi-national companies of all sizes with a
cloud-based Application Program Interface
(API)—SYSTRAN.io. SYSTRAN.io is based
on the same language translation technology that
powers SYSTRAN’s enterprise offering used by
Symantec, Cisco, Airbus, Ford, Toyota, BNP
Paribas, Daimler, Barclays, defense and security
organizations such as the US intelligence
community, NATO, Interpol, and language
service providers. Platform customers include
ADP, Adobe, and Apple, which integrates a
language translation widget on the dashboard of
its MacBook. To attract application developer
interest, the API will allow 1 million characters
to be translated free each month.
SYSTRAN.io is a hosted 50-language toolkit
that features:
§ Real-time text-to-text translation;
§ Voice-to-text recognition and transcription;
§ Data extraction and restructuring;
§ Dictionary Management;
§ Sentiment analysis of user-generated
content; and
§ Anonymization.
The company said that SYSTRAN.io is a
natural solution for collaboration platforms and
team messaging apps such as Slack, which offer
simplified communication with team members
around the world. A real-time language
translation tool ensures that virtual team
members who speak different languages are
being heard for true collaboration. SYSTRAN.io
can be integrated for real-time translation into
collaboration platforms. For customer support
teams, real-time translation of complaints,
customer feedback, or service outages reduces
call volumes and increases customer
satisfaction.
Winscribe and Speech Processing Solutions expand dictation offerings
Availability of speech-to-text transcription services on Philips SpeechAir Android device
Speech Processing Solutions specializes in
professional dictation. Founded in Austria in
1954 as part of Philips, the company has
products such as the new portable Philips
SpeechAir, Philips Pocket Memo voice recorder,
the Philips SpeechMike Premium dictation
microphone, and the Philips dictation recorder
app for smartphones. The latest, Philips
SpeechLive, brings secure dictation workflow to
the cloud.
Winscribe, which provides digital dictation,
speech recognition, and document workflow
management software, further expanded its
partnership with Speech Processing Solutions
with a new mobile app and full support of the
Philips SpeechAir smart voice recorder. The
Philips SpeechAir brings together advances in
dictation hardware and sound technology with
the comfort and familiarity of a modern
smartphone. Users get an Android device with a
touchscreen, camera, Wi-Fi, and Bluetooth, but
still benefit from outstanding sound quality, a
comfortable slide-switch, extended battery life,
professional voice file editing, and maximum
security controls.
The paired solution has been designed to save
users time and resources by allowing them to
work in a more flexible manner and enabling
fast and efficient communication of important
information. Users can dictate their reports,
letters, time, directions to support staff, and
more – from anywhere at any time – and send
them off for transcription or to be automatically
converted from voice to text, using Winscribe’s
speech recognition software solutions.
Winscribe’s mobile speech productivity suite of
applications is also available for Android, iOS,
BlackBerry, and Windows smartphones and
tablets.
Mattersight and Voci will license the other company’s products
Voci’s transcription engine and Mattersight’s Behavioral Analytics
Mattersight Corporation and Voci
Technologies announced a strategic partnership
whereby Mattersight will license Voci’s V-Blaze
transcription engine and call recording
converters and Voci will license Mattersight’s
Behavioral Analytics Cloud, in order to elevate
the customer experience for both companies’
customers. Mattersight has developed analytics
algorithms detecting human emotion and
personality hosted in Mattersight’s secure
Behavioral Analytics Cloud.
Voci has developed a fast, low-latency
speech-to-text transcription engine. Currently
available in English and Spanish, it provides
fully punctuated transcripts and emotional
information, and can also be used to protect
customers’ privacy by automatically redacting
sensitive Payment Card Industry (PCI)
information. These capabilities support both
post-call analytics and live real-time analytics.
Voci has also created a telephony adapter that
can be configured to ingest call recordings from
the most popular third-party call recording
systems.
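Automatic redaction of payment card numbers from a transcript can be approximated with a pattern match plus a checksum. The Python sketch below is a generic illustration, not Voci's method: it assumes card numbers appear as digits (not spelled-out words) and uses the standard Luhn checksum to avoid masking arbitrary long numbers. The function names and the `[REDACTED]` mask are made up for the example.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by
# single spaces or hyphens (an assumption about how the transcript is formatted).
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:      # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_cards(transcript: str, mask: str = "[REDACTED]") -> str:
    """Replace Luhn-valid card-like digit runs with a mask."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return mask if luhn_valid(digits) else match.group()

    return CARD_PATTERN.sub(replace, transcript)


# 4111 1111 1111 1111 is a widely used test card number.
print(redact_cards("My card is 4111 1111 1111 1111, thanks."))
```

The checksum keeps order numbers and phone numbers from being masked by accident, although a real compliance system would need to cover spoken digits, expiry dates, and security codes as well.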
Combining its proprietary algorithms,
personality insights, real-time alerting, and
portal infrastructure with Voci’s transcription
engine, Mattersight will be able to provide
customers with a set of analytics applications, all
served from the company’s secure Behavioral
Analytics Cloud, including:
§ The feeding of hundreds of data attributes
and the underlying call transcriptions to its
customers’ big data and CRM applications;
§ Improved real-time alerting for financial
compliance and business process
monitoring; and
§ Enhancement of Mattersight’s predictive
modeling of Customer Satisfaction Score
and Net Promoter Score.
Voci will leverage Mattersight’s Behavioral
Analytics Cloud to provide its Speech Analytics,
Business Intelligence, Electronic Discovery, and
Compliance customers with a highly secure,
PCI- and HIPAA-compliant cloud
solution.
Audeme offers speech recognition and synthesis for Arduino platform
Speaker-independent voice control with up to 150 commands
Arduino provides an open-source electronics
platform called “Arduino” based on easy-to-use
hardware and software. It has its own
programming language and development
environment.
Audeme, an auditory sensor and solution
provider that enables the maker community with
audio recognition products, announced the
availability of MOVI, a cloudless speech
recognizer and synthesizer for Arduino,
available through a programming interface to
makers of all technical levels.
MOVI can be programmed from the Arduino
Integrated Development Environment (IDE),
allowing users to have the shield recognize up to
150 of their own full-sentence voice commands.
MOVI is speaker-independent and can support
programmed conversations with projects using
the speech synthesizer. Compatible with most
Arduino-compatible boards, MOVI retails for
$89.90. The open source community has already
ported MOVI to two other platforms:
Raspberry Pi and the Huzzah esp8266 from
Adafruit.
Bertrand Irissou, CEO at Audeme, noted that
local recognition alleviates the privacy concerns
that arise when recognition is done in the cloud, and cited
home alarm systems, home automation, games,
as well as applications to help those living with
disabilities, as possible uses of the technology.
Dr. Gerald Friedland, CTO at Audeme, added,
“As few as 4 lines of codes is all that’s required
to start recognizing an English sentence.”
Microsoft Translator adds features on iOS
Offline translation and webpage translation using Deep Neural Nets
Continuing its move to make its technologies
available on OSs other than Windows,
Microsoft announced two new features for
the free Microsoft Translator app for iOS. In
addition to the text, conversation, and image
translation already available, Microsoft added
support for offline translation (i.e., not
connected to the Internet) and webpage
translation.
Until now, iPhone users needed an Internet
connection if they wanted to translate on their
mobile devices. Now, by downloading the
Microsoft Translator app and the needed offline
language packs, iOS users can get translations
comparable to the cloud solution even when
they are not connected to the Internet.
The update for iOS includes a new Safari
extension which lets users translate web pages
within their Safari browser. After you have
turned the extension on, clicking on “Microsoft
Translator” from the list of available extensions
will translate the page automatically.
The new offline language packs use the
same Deep Neural Network technology the
company recently introduced in the Microsoft
Translator app for Android. Deep Neural
Networks have been used for almost a year by
the Microsoft Translator online cloud service to
deliver high-quality translations to Microsoft
Translator apps and Bing.com/translator. They
are also used to power the speech translation
technology in the new speech translation
API and Skype Translator.
In conjunction with this release, Microsoft
also added 34 new languages to the list of
offline languages supported by Microsoft
Translator. Microsoft gave examples of the use
of the Microsoft Translator, illustrating its
flexibility:
§ Get from the airport or a conference to your
hotel by pinning preplanned translations to
your favorites.
§ Respond to something you didn’t plan ahead
for and get quick translations of short
phrases by typing or speaking into your
phone. You can also speak the phrase into
your Apple Watch if your phone is in your
pocket or purse.
§ Translate instant messages, texts and other
content by simply copy-pasting it from and
to the Translator app.
§ Translate signs and restaurant menus with
the image translation feature. This also
works with pictures you receive by email or
save from online sites or social media posts.
§ Download the new offline language packs so
you’ll be sure to be able to translate text and
images if you don’t have an internet
connection.
§ Use the text to speech feature to let the app
do the talking and ensure you have the right
pronunciation.
§ Use the conversation feature to engage in
natural conversations to find out the best
local restaurants from the concierge, a cab
driver, or maybe just someone you happen to
meet.
§ View that restaurant’s website in your own
language before you visit using the new
Safari extension.
Nvidia unveils processor for AI and creates “deep learning supercomputer”
Turnkey system claims to deliver the equivalent throughput of 250 x86 servers
Nvidia sells Graphics Processing Units
(GPUs) that perform specialized parallel
processing operations for displaying graphical
information. Early in April, the company
announced it is going beyond that role and
attempting to provide a full platform to
accelerate Artificial Intelligence applications
that use deep neural networks. Nvidia
announced a chip intended to accelerate such
applications, the Tesla P100, and an architecture
using multiple Tesla P100s in a cloud-computing environment; this “deep learning
supercomputer” is designed to accelerate the
development of such algorithms from large
databases.
The new Tesla P100 chip, designed for use in
corporate data centers, achieves very high
performance by packing 15 billion transistors on
a piece of silicon. That is roughly twice as many
as Nvidia’s prior high-end graphics processor
and some new server chips introduced by Intel
at the end of March expanding its Xeon line.
The new Xeon E5-2600 v4 family includes up to
22 calculating engines on each chip, up from a
maximum of 18 on prior models. (The use of
parallel processing within chips to accelerate
computing power is the subject of this month’s
editorial, p. 5.)
The Nvidia DGX-1 supercomputer, using the
company’s “Pascal” architecture, is designed
specifically for deep learning. It comes fully
integrated with hardware, deep learning
software, and development tools for easier
deployment. It is a turnkey system that contains
the new P100 chips, delivering the equivalent
throughput of 250 Intel x86 servers, according to
Nvidia.
“Artificial intelligence is the most far-reaching technological advancement in our
lifetime,” said Jen-Hsun Huang, CEO and co-founder of Nvidia. “It changes every industry,
every company, everything. It will open up
markets to benefit everyone. Data scientists and
AI researchers today spend far too much time on
home-brewed high performance computing
solutions. The DGX-1 is easy to deploy and was
created for one purpose: to unlock the powers of
superhuman capabilities and apply them to
problems that were once unsolvable.”
The Nvidia DGX-1 deep learning system,
built on Nvidia Tesla P100 GPUs, is based on a
new Nvidia Pascal GPU architecture. It provides
the throughput of 250 CPU-based servers,
networking, cables and racks in a single box.
The DGX-1 includes the Nvidia NVLink high-speed interconnect for maximum application
scalability and new half-precision instructions to
deliver more than 21 teraflops of peak
performance for deep learning. The DGX-1
systems equipped with Tesla P100 GPUs can
deliver over 12 times faster training than four-way Nvidia Maxwell architecture-based
solutions from just one year ago.
Yann LeCun, director of AI Research at
Facebook, which introduced bots for its
Messenger system (p. 1), said in the Nvidia
announcement, “As neural nets become larger
and larger, we not only need faster GPUs with
larger and faster memory, but also much faster
GPU-to-GPU communication, as well as
hardware that can take advantage of reduced-precision arithmetic. This is precisely what
Pascal delivers.”
“Microsoft is developing super deep neural
networks that are more than 1,000 layers,” said
Xuedong Huang, chief speech scientist at
Microsoft Research. “Nvidia Tesla P100’s
impressive horsepower will enable Microsoft’s
CNTK [Computational Network Toolkit] to
accelerate AI breakthroughs.”
Andrew Ng, chief scientist at Baidu (p. 19),
said, “AI computers are like space rockets: The
bigger the better. Pascal’s throughput and
interconnect will make the biggest rocket we’ve
seen yet.”
Massachusetts General Hospital will use
Nvidia’s new DGX-1 deep-learning
supercomputer. Nvidia is partnering with the
hospital’s Clinical Data Science Center to
advance health care with AI in order to improve
the detection, diagnosis, treatment, and
management of diseases.
The DGX-1 software includes the Nvidia
Deep Learning GPU Training System (called
DIGITS), a complete, interactive system for
designing deep neural networks (DNNs). It also
includes the newly released Nvidia CUDA Deep
Neural Network library (cuDNN) version 5, a
GPU-accelerated library of primitives for
designing DNNs, along with optimized
versions of several widely used deep learning
frameworks: Caffe, Theano, and Torch. The
DGX-1 additionally provides access to cloud
management tools, software updates, and a
repository for containerized applications
(encapsulating an application in a container with
its own operating environment).
General availability for the Nvidia DGX-1
deep learning system in the United States is in
June, and in other regions beginning in the third
quarter, direct from Nvidia and from some
systems integrators.
Intelligent Voice offers speech-to-text based on Graphics Processing Units
Nvidia GPUs allow up to 400 times real-time processing of speech
Intelligent Voice claims that it now has the
“world’s fastest commercially available speech
to text appliance” based on Nvidia GPU
technology (previous article). Operating at up to
400x real-time, Intelligent Voice leverages
CUDA programming (the parallel computing
platform and application programming interface
model created by Nvidia). The technology can
process large databases of speech quickly for
tasks such as analytics.
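To put the "400x real-time" figure in perspective, the arithmetic can be sketched as below. The 10,000-hour archive is a hypothetical example, not a figure from Intelligent Voice, and the calculation assumes the peak rate is sustained across the whole job.

```python
def processing_seconds(audio_seconds, realtime_factor):
    """Time needed to process audio at a given multiple of real time."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# Hypothetical archive: 10,000 hours of recorded calls.
archive_hours = 10_000
hours_needed = processing_seconds(archive_hours * 3600, 400) / 3600
print(f"{archive_hours} hours of audio at 400x real time: {hours_needed} hours")
```

At that rate, one hour of audio takes nine seconds, which is what makes bulk analytics over large call archives practical.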
Intelligent Voice indicates that it allows a
company to sift through expanding data sets
without the need for expensive transcription
services.
The software goes beyond
transcription, using machine learning and natural
language techniques to produce a
“SmartTranscript,” learning what is important in
a telephone call, extracting the information, and
storing a visual representation of the call.
Intelligent Voice’s search and alert makes it
possible to tackle agent performance issues,
address data security concerns, and monitor
physical access to data. Information is taken
directly from a network and quickly assessed for
quality. The system also uses the data to fine-tune
itself for future processing and indexing.
Quick search for specific content is enabled by
the company’s JumpTo technology.
SmartTranscript is a single, self-contained
HTML file that contains not only original
audio/video, but also a transcript of what has
been said. This is augmented by Intelligent
Voice’s topic extraction technology which gives
an instant snapshot of the key things said in the
file. This allows the user to immediately engage
with the content, understanding what has been
said, and to navigate intuitively from topic to
topic, guided by an advanced playback function.
Intelligent Voice claims a user can within
seconds understand the key topics in audio or
video, without the need to listen to the whole
file.
Microsoft Bots (cont.)
Continued from page 1
Similar announcements from Facebook on
bots in its Messenger service (p. 1) and Kik for
its messaging service (p. 10) are part of a trend
toward using natural language to make digital
systems easier to use, which this newsletter
summarizes as the Language User Interface
(LUI). However, the announcements suggest a
further trend: going directly to an independent
application without launching it. Otherwise, it is
typical to download an application and install it
before using it. And one must first find the app
to start it up, which is sometimes not easy as apps
proliferate. Even after launching the app, one
generally must next go through the home screen
to get to any real functionality. The “bot” model
allows simply calling the application, and usually
a function within the app, from Cortana or
Messenger with one natural-language command
to use the app’s functionality. Amazon has a
similar functionality with its “skills” launched
through its digital assistant Alexa. The trend
clearly has backing from influential and
deep-pocketed champions. This trend emphasizes a
point I’ve made in several forums, including this
newsletter: that the general personal assistants
are becoming gateways to company-specific
applications that should be bots themselves, and
that every company will eventually find it as
necessary to have a bot or digital assistant as it
is to have a web site today. Microsoft’s announcements
make it easier for a company to do so by
adopting its platform for bots.
Skype Bots
Microsoft’s Skype is a VoiceOverIP (VoIP)
communications solution that makes local and
global connections. Nadella indicated that the
next stage is adding conversation to the platform
itself. Cortana is available in Skype, and can
now connect users to any Skype contact with a
voice command.
At its developers’ conference, Microsoft
introduced Skype Bots, a new way to bring
expertise, products, services, and entertainment
into daily messaging on Skype. Initially, one
will interact with Skype bots by texting them
and receiving a text reply. Later, they will be
available by voice in audio and video calling.
A new Skype client, available on Windows,
Android, iPhone, and iPad, supports Skype Bots.
A preview on these platforms is currently
available in Skype in Australia, Canada, England,
Ireland, India, New Zealand, Singapore, and the US.
A new Skype Bot developer program was
launched. The Skype Bots Platform for
developers includes the SDK, API, and
Workflows.
The Cortana and Skype demonstrations at the
developers’ conference showed how Cortana can
help you get things done directly in your Skype
chats. For example, Cortana proactively helps
you find information, manages your calendar,
and connects to other Bots, all without leaving
Skype.
Cortana enhancements
New features allow asking Cortana to send
a contact a document you “worked on
yesterday,” with Cortana surfacing the file you
are most likely referring to. Cortana can also use
Skype to interact with customers in a bot-like manner.
New Outlook integration means that Cortana
will be able to check your email messages and
calendar to offer suggestions based on your
schedule.
Microsoft’s Terry Myerson, Executive Vice
President, Windows and Devices Group, said in
a blog post, “And only Cortana works across
your devices: you can complete certain
notification-based tasks on your Windows or
Android phone, such as receiving and sending
text messages, on your PC.”
And developers can now build “Proactive
Actions” on Cortana. Microsoft offers a Cortana
Developer Preview.
Microsoft is not ignoring its basic Windows
business, however. Nadella said Windows 10 “is
off to a fantastic start” with over 270 million
active devices, outpacing Windows 7 by 145%.
The Bot Framework for conversational
intelligent assistants
Microsoft took the Bot Framework outside of
Skype as well. Microsoft is positioning its Bot
Framework as a platform where developers can
build bots to run in many different environments
and across services such as Facebook, Slack,
and Line.
The Framework uses Microsoft’s new
Language Understanding Intelligent Service
(LUIS) for natural language understanding
support. See the end of this article for more on
LUIS.
A Skype Bot SDK is available, and
developers can integrate it into both audio and
video experiences. The bots will also come to
Skype for Android, iOS, and HoloLens in
addition to Windows.
For bots outside of Skype, Microsoft
announced the Bot Framework to connect bots
to most major messaging apps and teach them to
better understand natural language. Developers
can also browse a bot directory to find bots with
the functionality they are seeking. These
interactions can be brokered through Cortana,
meaning you could theoretically start a chat with
a Domino’s bot after saying “Hey Cortana, order
a pizza.”
Microsoft’s Cognitive Services also supplies
22 free APIs to integrate into your product.
There is an Application Programming Interface
for cloud-based speech recognition and text-to-speech, for example. It’s free up to 5,000
transactions per month.
Microsoft’s cross-platform REST API
(RESTful web Application Programming
Interfaces) enables speech capabilities on all
internet-connected devices. Every major
platform, including Android, iOS, Windows, and
third-party IoT devices, is supported. The REST
API offers speech-to-text, text-to-speech, and
language understanding capabilities delivered
through the cloud.
Language Understanding Intelligent
Service
LUIS is a new offering, currently in beta
mode. It is cloud-based, free with usage under
100,000 transactions per month on Azure. The
tool is not “machine learning” designed for large
databases, but can work from a very small
number of examples of what a user might type
and what the “intent” and “activities” are for
that action. The tool generalizes beyond the
specific examples, allowing you to list
“activities” relevant to your task (e.g., “run” or
“bike ride” for an exercise monitoring
application). It has built-in understanding of
concepts such as reminders that you can use
without creating them (but which can be
extended).
Microsoft has provided a video demonstrating
LUIS that shows how the tool is used. The tool
is most suitable for tasks where the context is
limited.
Once your application is deployed and traffic
starts to flow into the system, LUIS uses active
learning to improve itself. In the active learning
process, LUIS identifies the interactions that it is
relatively unsure of, and asks you to later label
those transactions according to intent and
entities to improve handling of those cases.
Google search (cont.)
Continued from page 1
He noted that a big step forward in 2012 was
the use of Knowledge Graphs, which allowed
understanding of “things, not strings.” Currently,
Google’s representation of things includes two
billion entities, 54 billion facts, 38,000 types of
entities—and is growing.
Behzadi noted that the future is mobile, and
search has to adapt to that environment. In the
mobile world, people increasingly use speech.
Voice search is growing faster than typed
search. He said that speech recognition
today works, with a word error rate of 8%. With
speech, people use more natural sentences
instead of query language, e.g., “What’s the
weather like in Paris?” versus “weather Paris.”
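For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognizer's output and a reference transcript, divided by the number of reference words. A minimal sketch follows; the function name and the example sentences are illustrative, not from Behzadi's talk.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of seven reference words.
wer = word_error_rate("what is the weather like in paris",
                      "what is the weather in paris")
print(round(wer, 3))
```

Under this definition, an 8% rate corresponds to roughly eight word errors per hundred reference words.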
The future of search is to build the ultimate
assistant, Behzadi claimed. The ultimate
assistant should understand the world and your
relationship to it, as well as your situation in the
world currently (your current context).
He showed a series of demos illustrating
different types of results the user might be
seeking:
§ Answers about general knowledge such as
“Show me how to make chicken soup.”
§ Answers about you such as “When is my
next flight?”
§ Personal search, such as searching your
email, calendar, and photos for upcoming
meetings.
§ Using apps, such as asking Google to play a
song by title.
§ Actions, such as setting an alarm.
Behzadi illustrated the role of dialog with an
example. He asked, “How high is Rigi?” and the
question wasn’t understood. But then he said,
“mountains in the Alps,” getting a list of
mountains, which Google then listed off. When
he then asked, “How high is Rigi?,” Google
gave him the height of the mountain Rigi.
(Editor’s Note: Google has not named its search,
whatever the form, other than Google, so the
natural language assistant has the same name as
the company, which is also a verb, as when you
“Google” something. Sorry if this leads to
confusion that Apple and Microsoft, for
example, have avoided by giving their
equivalent functionality the names Siri and
Cortana. It’s clear that search has evolved into a
digital assistant functionality, as the movie
examples at the start of Behzadi’s talk confirm.)
A similar use of dialog results when you ask,
“Show me pictures of Wales” and get back
pictures of whales. If you then say, “w-a-l-e-s,” Google corrects its interpretation and
shows pictures of Wales.
A demonstration of the future of Now on Tap
showed Behzadi in a chat talking about a
restaurant reservation. Now on Tap showed
options proactively, and a reservation was made
with two taps.
Perhaps we are moving toward what we have
considered science fiction faster than some
expect.
CEO Sundar Pichai gave further views on
where Google is heading in the company’s
annual Founders’ Letter (p. 20) and the
company launched a new cloud-based machine
learning platform (p. 21).
Facebook bots (cont.)
Continued from page 1
In announcing the initiative at a company
developers conference, CEO Mark Zuckerberg
summarized the goal: “We think you should
message a business just the way you would
message a friend.” Implicit in that statement and
obvious from demonstrations is that one can use
“natural language” in what amounts to text
messages to a specialized company “bot.” More
than 30 partners are said to be working with
Facebook to develop bots.
Bots
Bots can provide anything from automated
subscription content like weather and traffic
updates, to customized communications like
receipts, shipping notifications, and live
automated messages by interacting directly with
people inside the Messenger application. The
Messenger Send/Receive API will support not
only sending and receiving text, but also images
and interactive rich bubbles containing multiple
calls-to-action. Developers can also set a
welcome screen for their threads to set context
as well as different controls.
Facebook is immediately providing
developers and businesses access to documents
to build bots for Messenger, and submit them for
review. Facebook will gradually accept and approve
submissions to ensure the best experiences for
everyone on Messenger. The Bot Framework is
summarized in the following graphic from
Facebook.
Facebook’s Bot Framework
Facebook noted the availability of Wit.ai’s
Bot Engine to enable developers to build more
complex bots that can interpret intent from
natural language, and continuously learn to get
better over time. Wit.ai is now a subsidiary of
Facebook, and is apparently the technology
behind Facebook’s digital assistant M.
Facebook has built discovery tools such as
plugins for websites and a prominent search
option in Messenger. The company has made it
easier to connect to a bot (and other users) with
Messenger Usernames, Links, and Codes. A bot
can have a code that makes it easier to interact
with that bot without making them “friends” or
other complexities. Usernames can avoid the
issue of some people or companies having
similar names. A user can tap or click any
Messenger Link to open Messenger directly to a
thread with a person or business.
In addition, Facebook News Feed ads will
enable the opening of threads on Messenger and
a new customer-matching feature will allow
messages that are usually sent through SMS to
be sent on Messenger.
Facebook is experimenting with charging
businesses to send re-engagement messages to
people who’ve already voluntarily started a
conversation with them. These “Sponsored
Messages” are currently in testing with a small
number of advertisers. Facebook will also be
able to earn revenue with “Click To Message”
News Feed ads.
With the danger of such contacts becoming
spam, people will be able to mute and block
communications that they don’t want to receive.
Facebook indicates it has strict policies for
developers and businesses to uphold and will
have review processes to ensure they carefully
evaluate how their community is responding.
While they are not yet common in the U.S.
and Europe, chat bots have taken off in Asia,
where messaging services such as WeChat help
users schedule doctor’s appointments, shop for
the latest styles, play games or the lottery, and
even send money to friends.
Several businesses are already available
through Messenger. KLM Royal Dutch Airlines
recently began allowing passengers to check in,
get flight updates, make travel changes and talk
to customer service reps in its Messenger app.
You can hail a ride on Uber and Lyft by tapping
a new transportation option inside Messenger.
You can ask hotel chain Hyatt about your
accommodations, and you can track your
purchases through online retailer Everlane.
Facebook recently announced more than two
dozen additional chatbot partnerships including
the Wall Street Journal, CNN, Disney, Staples,
Shopify, and 1-800-Flowers. Bank of America
indicated it will use Facebook Messenger to
reach customers with real-time alerts and other
yet-to-be revealed features.
To browse for a pair of shoes on Facebook
Messenger, you can now text a message to the
mobile shopping start-up Spring. Spring will
ask you for a preferred price range for the shoes
and show you what it thinks you might like.
A new Fandango bot is providing prompts
for moviegoers to locate nearby theaters and
movies playing at those theaters. It will also
provide trailers, film synopses, ratings, run
times, and direct access to advance ticketing.
A concern with outside bots
There have been some online complaints
about poor performance of some bots attached to
Facebook, e.g., the weather option. This
represents a general problem for the bot
model—the quality of outside bots isn’t under
the control of the hosting company (Facebook in
this case), but the performance is associated
with the hosting company’s bot functionality.
This should ease as the outside bots mature, but
could be a significant problem if the hosting
companies don’t differentiate the functionality
that they directly support from that of third
parties.
Supporting announcements
Aspect Software indicated it would support
customer chatbots through Facebook Messenger
(p. 11). Chatfuel offers tools for creating
Facebook bots and a bot hosting service.
News briefs
Elon Musk’s OpenAI releases first AI tool
Elon Musk co-founded OpenAI, a nonprofit dedicated to releasing cutting-edge artificial
intelligence research for free. The nonprofit released the first fruits of its work, a tool called OpenAI
Gym for developing and comparing different reinforcement learning algorithms, which provide a
way for a machine to learn through positive and negative feedback. OpenAI Gym includes code and
examples to help others get started with reinforcement learning.
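As a rough illustration of learning through positive and negative feedback, the sketch below uses an epsilon-greedy strategy on a two-choice problem. It is plain Python and does not use OpenAI Gym; the function name, reward probabilities, and parameters are all assumptions chosen for illustration.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner; each action returns a reward of +1 or -1."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # running average reward per action
    counts = [0] * len(reward_probs)    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the best average so far.
            action = max(range(len(values)), key=values.__getitem__)
        # Positive feedback with the action's (hidden) success probability.
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


# The second action succeeds 80% of the time, the first only 20%.
values, counts = run_bandit([0.2, 0.8])
print(values, counts)
```

The learner is never told which action is better; it discovers this from the feedback alone and ends up choosing the better action far more often.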
OpenAI also announced two new recruits, including Pieter Abbeel, an associate professor at UC
Berkeley, and an expert on applying reinforcement learning to robots.
[24]7 introduces customer acquisition cloud service for marketers
[24]7 announced in April that it has launched the [24]7 Customer Acquisition Cloud, a suite of
marketing technology solutions. The suite is based on the acquisition of Campanja, which
specializes in paid search bid optimization, and EngageClick, which supports personalized
marketing. Combined with [24]7’s intent-driven, predictive customer engagement platform, the
[24]7 Customer Acquisition Cloud is claimed to give marketers insight into the entire customer
journey, from initial search through to the point of purchase.
Personalized content and optimized experiences ensure consumers are getting the best marketing
offer, at the optimal time and in the ideal format. A machine learning system, integrated with several
data sources and data platforms, combines software for predictive prospecting (what prospects might
like to buy), retargeting (determining what consumers wanted in the past), and predictive optimized
targeting (forecasting what they will likely want in the future) to determine outcomes for each
consumer. This adaptive system then selects the best digital content delivery channel to present a
customized offer for each individual customer.
Microsoft Windows 10 Mobile test build includes support for Cortana in more languages
A recent Windows 10 Mobile test build adds support for Cortana in Spanish (Mexico),
Portuguese (Brazil), and French (Canada).
IBM discusses recent advances in conversational speech recognition
George Saon of IBM published a blog on “Recent Advances in Conversational Speech
Recognition.” The blog, with references, discussed the core technology IBM employs in its Watson
speech-to-text. He indicated that, on the acoustic side, IBM uses a fusion of two powerful deep
neural networks that predict context-dependent phones from the input audio. The models were
trained on 2000 hours of publicly available transcribed audio from the Switchboard, Fisher and
CallHome corpora.
On the language modeling side, IBM uses a sequence of language models (LMs) that are
progressively more refined. The baseline is an n-gram LM estimated on a variety of publicly
available corpora such as Switchboard, Fisher, Gigaword, and Broadcast News and Conversations.
The hypotheses obtained by decoding with this LM are re-ranked with an exponential class-based
language model called model M. The M stands for Medium, meaning that it’s neither too big nor too
small. Lastly, IBM rescores the candidate sentences with a neural network LM to obtain the final
output.
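The general idea of scoring candidate sentences with an n-gram LM can be sketched as follows. This is a toy bigram model with add-one smoothing trained on three made-up sentences; it is not IBM's system, and it covers only the n-gram stage, not model M or the neural network LM. All names and data are illustrative assumptions.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(words[1:])
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability of a sentence."""
    contexts, bigrams, vocab_size = model
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(words, words[1:])
    )


def rerank(hypotheses, model):
    """Order candidate transcripts from most to least likely under the LM."""
    return sorted(hypotheses, key=lambda h: log_prob(h, model), reverse=True)


model = train_bigram([
    "we recognize speech",
    "machines recognize speech well",
    "a nice beach",
])
ranked = rerank(["wreck a nice beach", "recognize speech"], model)
print(ranked[0])
```

In a multi-pass pipeline of the kind Saon describes, a ranking like this would be refined by the later, more expensive models rather than taken as final.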
Saon said IBM is currently working on integrating these technologies into IBM Watson’s
speech-to-text service.
IBM notes growth in Watson services
IBM announced accelerated adoption of Watson’s cognitive capabilities globally. “IBM is
receiving an outpouring of interest from partners in Watson’s cognitive capabilities and supporting
industry-changing use cases around the globe,” said Stephen Gold, Vice President, IBM Watson
Group. “We are thrilled to be working with some of the industry’s best and brightest entrepreneurs
and established organizations.”
IBM is investing more than $1 billion into the Watson Group, focusing on research, development
and bringing cloud-delivered cognitive applications and services to market. This includes $100
million earmarked for direct investment to support IBM’s ecosystem of start-ups and businesses
building cognitive apps made with Watson. Currently more than 3,300 organizations and
entrepreneurial individuals have shared their ideas for creating cognitive apps that redefine how
businesses and consumers make decisions, IBM indicated.
IBM and the University of Illinois to pioneer next-generation cognitive computing systems for
applications such as multimodal education
IBM Research announced plans for a multi-year collaboration with the University of Illinois
Urbana-Champaign to create the Center for Cognitive Computing Systems Research (C3SR),
which will be housed within the College of Engineering on the Urbana campus. Opening in the
summer of 2016, the C3SR will integrate and advance scientific frontiers in both machine learning
and heterogeneous computing systems optimized for new cognitive computing workloads.
The C3SR will build and optimize integrated systems such as state-of-the-art cognitive
computing systems modeled on IBM’s Watson technology that can master a subject area by learning
from multimedia and multi-modal educational content. Such systems will efficiently ingest vast
amounts of data including videos, lecture notes, homework, and textbooks, and reason through this
knowledge effectively enough to be able to eventually pass a college level exam. The optimized
computing systems developed by the C3SR are expected to perform orders of magnitude better than
today’s systems that run cognitive applications.
With the increased computational demands of cognitive computing, the researchers will further
optimize Power Systems for cognitive workloads. Researchers will have access to the
OpenPOWER Foundation’s systems technology as well as technical development and support
from IBM Systems Group. The new hardware designs and cognitive algorithms will be released to
the open source community and OpenPOWER Foundation, of which both IBM and the University of
Illinois are members.
SparkCognition uses IBM Watson in assessing security risks
SparkCognition, which calls itself a “Cognitive Security Analytics” company and recently
raised $6 million (p. 50), and IBM announced that clients, including ExamSoft Worldwide (which
does education testing), are tapping the power of Watson to transform how businesses make use of
unstructured data to enhance Security Analytics. SparkCognition’s MindSpark platform and
“Cognitive Fingerprinting” technology model physical and virtual assets, continuously learn from
data, and derive intelligent insights to secure and protect assets.
Cloud application security is a growing challenge compounded by increasingly sophisticated
attack patterns and underlying configuration complexity in the assets and infrastructure being
protected. Using IBM Watson technology through the cloud, SparkCognition’s Cognitive Security
Insights (CSI) service discovers unfolding threats and provides users with specific information they
need to know about remediation and defense mechanisms. SparkCognition’s CSI offering works by
collecting massive amounts of security data and applying machine learning and AI algorithms to
find trends, patterns and anomalies.
IBM and SAP agree to combine complementary services, including IBM Cognitive Computing
and SAP HANA Business Suite, available on-premise and in the cloud
IBM and Germany-based SAP plan to co-innovate solutions that increase customer value
through cognitive extensions, enhanced customer and user experiences, and industry-specific
functionality—all enabled with SAP Business Suite 4 SAP HANA (SAP S/4HANA) software,
available on-premise and in the cloud. The companies intend to co-locate resources in Walldorf,
Germany, and Palo Alto, Calif., deepening a long-standing partnership.
“The future of business strategy and business value will proceed from the foundational elements
of this announcement—cognitive, cloud and the design of consumer-quality experiences in every
industry,” said Bridget van Kralingen, senior vice president, IBM Global Business Services. “We’re
formalizing a complementary set of capabilities to simplify and speed outcomes for clients evolving
to become cognitive enterprises.”
IBM partners with American Cancer Society on Watson Cancer Advisor
IBM and the American Cancer Society (ACS) announced a partnership to develop a Watson-based
advisor for people fighting cancer, bringing the cognitive power of IBM’s Watson not only
to physicians treating cancer, but also to those suffering from the disease.
IBM Chairman, President and CEO Ginni Rometty announced the collaboration at the 13th
Annual World Health Care Congress. The goal of the new cancer health advisor is to provide cancer
patients, survivors, and caregivers with ACS resources and guidance personalized to each
individual’s fight against cancer. An early 2017 release of the offering is expected.
IBM Health Corps to use Watson to tackle global health disparities
During the 13th Annual World Health Care Congress in April, IBM unveiled IBM Health Corps,
a global pro-bono program focused on tackling health disparities using IBM Watson’s cognitive
tools and analytics. IBM partners with health organizations across the world, contributing the
time and expertise of teams of IBM experts for three weeks on the ground to help partner
organizations expand health access and services and improve health systems and outcomes.
The new service initiative will bring IBM’s top talent and cognitive technologies
to help communities address health challenges such as primary care gaps, health worker shortages,
and access to safe water and nutritious food.
IBM teaming with Sesame Street to aid in early learning through Watson technologies
Studies show that early learning can affect a child’s ability for a lifetime. IBM is teaming up
with Sesame Workshop, the non-profit behind Sesame Street, to develop a new suite of preschool
products that could range from consumer apps and toys, to educational tools for schools. The
company says it entered into this partnership because it wants to provide personalized learning to as
many kids around the world as possible. Harriet Green, IBM’s GM for Watson IoT, Commerce and
Education, told CNNMoney that not enough kids have access to the right level of education at the
right time in their lives.
By taking what Sesame Workshop has learned about childhood education over the past few
decades, and giving it to Watson, IBM hopes to develop software that helps fill in some of these
gaps.
NTT uses machine learning to detect cyber-crime
NTT Com announced it has succeeded in automatically detecting infections of unknown
malware in real time with more than 99% accuracy using machine learning. The capability has been
added to NTT Com’s Managed Security Services solution under the WideAngle brand.
NTT Com and IPsoft partner to launch an automated cognitive agent service
NTT Communications (a subsidiary of NTT) has agreed with IPsoft to launch an automated
cognitive agent service this summer. It is said to leverage AI for smooth interaction with customers
in both English and Japanese. The service will run on cloud-based systems and will continue to learn
from dialogue with customers. In addition to automating initial responses to customers who contact
a call center or enterprise/store contact point, it will also issue invoices, send emails, and mail
documents.
Google upgrades its open-source TensorFlow machine learning framework to a distributed
version
Google is releasing a distributed version of its open-source TensorFlow software that allows it
to run across multiple machines—up to hundreds at a time. The result is the ability to evaluate
models in an acceptable amount of time. With the release of TensorFlow 0.8, Google noted that
TensorFlow is the most popular machine learning framework on code repository GitHub.
TensorFlow is a software library for numerical computation using data flow graphs that could be
used to simulate a neural net, for example. Nodes in the graph represent mathematical operations,
while the graph edges represent the multidimensional data arrays (tensors) communicated between
them.
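The data flow idea can be illustrated with a toy graph in plain Python. This is only a sketch of the concept and does not use the TensorFlow API; the `Node` class and the helper functions are hypothetical names chosen for illustration. Each node holds one operation, and each link to an input node plays the role of an edge carrying an array.

```python
# Toy data flow graph: nodes are operations, edges carry arrays ("tensors").
# This is NOT the TensorFlow API; every name here is hypothetical.

class Node:
    def __init__(self, op, inputs=()):
        self.op = op            # function applied at this node
        self.inputs = inputs    # upstream nodes; each link is a graph edge

    def evaluate(self, cache=None):
        # Evaluate upstream nodes first, memoizing so shared nodes run once.
        cache = {} if cache is None else cache
        if self not in cache:
            args = [n.evaluate(cache) for n in self.inputs]
            cache[self] = self.op(*args)
        return cache[self]


def constant(value):
    # A source node with no inputs that simply emits its value.
    return Node(lambda: value)


def matmul(a, b):
    # Plain-Python matrix product of two 2-D lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add(a, b):
    # Element-wise sum of two 2-D lists with the same shape.
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def relu(a):
    # Element-wise max(0, x), a common neural-net activation.
    return [[max(0.0, x) for x in row] for row in a]


def dense_layer(x, w, b):
    # One neural-net layer expressed as three graph nodes: relu(x @ w + b).
    return Node(relu, (Node(add, (Node(matmul, (x, w)), b)),))


x = constant([[1.0, 2.0]])
w = constant([[1.0, -1.0], [0.5, 0.5]])
b = constant([[0.0, 0.0]])
layer_output = dense_layer(x, w, b).evaluate()
```

Because the graph is built before it is evaluated, a framework is free to decide where each node runs, which is what makes spreading the work across machines possible.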
Google open-sources Walt, a tool that measures lag for touch and voice commands
In April, Google described “Walt,” software that people can use to figure out how long it takes
for a device to respond to touch or voice input. Google has been using Walt to do performance tests
on Android devices and Chromebooks. The company made the software available under an open
source Apache license on GitHub.
Walt requires hardware (a microcontroller, an accelerometer board, and a laser), but the entire
kit shouldn’t cost more than $50, Google software engineer Mark Koudritsky wrote in a blog post.
Koudritsky explained, “An important innovation in WALT (a descendant of QuickStep) is that it
synchronizes an external hardware clock with the Android device or Chromebook to within a
millisecond. This allows it to measure input and output latencies separately as opposed to measuring
a round-trip latency.”
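A rough sketch of why a synchronized clock matters: once the offset between the external hardware clock and the device clock is known, timestamps from both sides can be compared directly, so input and output delays fall out separately. The code below is not WALT's implementation. The function names, the ping-based offset estimate, and the event names are assumptions made for illustration, with all times in milliseconds.

```python
def estimate_offset(samples):
    """Estimate (device_clock - external_clock) from ping exchanges.

    Each sample is (t_send, t_device, t_recv): the external clock's send time,
    the device clock's reading when it handled the ping, and the external
    clock's time when the reply came back. The exchange with the smallest
    round trip is the least disturbed, so its midpoint is used.
    """
    t_send, t_device, t_recv = min(samples, key=lambda s: s[2] - s[0])
    return t_device - (t_send + t_recv) / 2.0


def split_latencies(offset, touch_hw, event_device, draw_device, light_hw):
    """Return (input_latency, output_latency) on the device clock.

    touch_hw:     external-clock time the sensor detected the physical touch
    event_device: device-clock time the touch event reached the app
    draw_device:  device-clock time the app issued its screen update
    light_hw:     external-clock time the sensor saw the screen change
    """
    touch_on_device = touch_hw + offset   # convert hardware time to device time
    light_on_device = light_hw + offset
    input_latency = event_device - touch_on_device
    output_latency = light_on_device - draw_device
    return input_latency, output_latency


offset = estimate_offset([(0.0, 1005.0, 10.0), (20.0, 1021.0, 22.0)])
latencies = split_latencies(offset, touch_hw=100.0, event_device=1115.0,
                            draw_device=1120.0, light_hw=160.0)
```

Without the offset, only the total from touch to screen change (60 ms on the external clock in this made-up example) could be measured, with no way to tell how much of it was input and how much was output.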
Android N preview 2 lets you change the pitch of Google’s text-to-speech voice
The second developer preview of Android N, the next big release of Google’s mobile operating
system, comes with an update to the settings for Google’s core text-to-speech engine. There are now
sliders that allow you to fine-tune the speech rate and pitch of the voice.
Yahoo reportedly preparing a mobile personal assistant
Yahoo CEO Marissa Mayer, in a recent presentation, mentioned Index, a potential
entrant in the personal assistant competition. She noted that it “accomplishes tasks on your behalf.”
Rage Frameworks’ linguistics tool analyzes documents, adds deployments
Rage Frameworks, a provider of knowledge-based automation technology and services,
announced new deployments of its deep learning technology known as Rage AI across several
global financial services, consumer products, and manufacturing firms. The challenges these
organizations faced required the understanding and interpretation of complex documents and
integration of other transaction data from enterprise resource planning (ERP) systems to identify
significant cost efficiencies and compliance conformance. RAGE AI incorporates linguistics-based
tools to understand the meaning of documents and interpret them. It can operate completely
unsupervised or with assistance by human experts.
“Most artificial intelligence solutions involving natural language today are based purely on
patterns in the data with no understanding of what they are processing. They will require highly
homogenous data or suffer the curse of dimensionality,” said Venkat Srinivasan, CEO of RAGE
Frameworks. “Additionally, they are all black boxes with no or little ability to provide the users
with their reasoning. This is true of Google, IBM Watson, and the scores of others who have rushed
to leverage computational statistics-based deep learning. With RAGE AI, we are breaking the mold
and creating truly intelligent machines that can substantiate their decisions.”
NeoSpeech releases Canadian French TTS voice
Text-to-speech provider NeoSpeech released a Canadian French voice, Leo. He joins Chloe to
give consumers a male and female Canadian French option for synthesized speech. Leo comes with
a customizable dictionary that enables users to customize the way he says certain words or phrases
and add industry-specific jargon or slang to their audio.
NeoSpeech integrates Bitcode support in its text-to-speech software for iOS
Text-to-speech software provider NeoSpeech has released version 11.11 of its iOS VoiceText
Embedded SDK. The new version supports Bitcode. Apple introduced Bitcode last June to allow the
App Store to re-optimize apps for each kind of device before they’re delivered to the user.
Conexant introduces far-field microphone processing software for Qualcomm Hexagon DSP
Conexant Systems and Qualcomm Technologies, a subsidiary of Qualcomm Incorporated,
announced that Conexant’s AudioSmart software has been integrated into the Qualcomm Hexagon
Digital Signal Processor (DSP) family. Enabling speech recognition and voice control from a
distance in smart platforms requires overcoming substantial challenges related to echo cancellation,
background noise, microphone speaker position, and more. Conexant’s AudioSmart software
improves voice control in smart phones, smart home applications, wearables, robots, IoT devices,
and more.
Fortemedia’s updated FM1388 Series IC provides voice processing solutions for Apple CarPlay
Fortemedia announced it has launched new versions of its FM1388 integrated circuit tailored
for Apple CarPlay and ITU P.1100/P.1110 compliance. The FM1388-1 and FM1388-2 Voice
Processing ICs enable OEMs and ODMs to develop automotive infotainment systems that provide
quality hands-free voice communication, voice control, and speech recognition services. The
FM1388-1 works with single-microphone systems while the FM1388-2 supports multiple-microphone
systems, providing advanced features enabled by Fortemedia’s smart microphone array system.
Fortemedia’s latest generation of FM1388-1 and FM1388-2 Voice Processing ICs includes
Advanced Microphone Array Processing (AMAP), Advanced Acoustic Echo Cancellation (AEC),
and newly developed Smart Spectral Analysis (SSA) technologies to ensure high-quality voice
operation for both wideband and narrowband hands-free calling with Apple CarPlay and other
automotive infotainment voice calling applications. For multiple-microphone systems Fortemedia
supports flexible microphone placements both in the car console and in traditional car cabin
overhead locations. In addition to providing a superior voice experience during hands-free phone
calls, the FM1388-1 and FM1388-2 offer a voice recognition enhancement mode that assists speech
recognition systems to provide better performance when operating in the challenging car cabin
environment.
The FM1388-1 and FM1388-2 are shipping in prototype quantities and are currently being
incorporated in a number of OEM and ODM automotive infotainment products, according to the
company.
Apple TV activates “live tune-in” feature launched through Siri
Apple’s “Live Tune-In” feature for the most recent generation of Apple TV is officially live.
This feature allows an Apple TV user to ask Siri to automatically transport them to the livestream of
a tvOS app from the home screen. The “Live Tune-In” feature was officially part of the tvOS 9.2
update that Apple released in March. The update also brought Bluetooth keyboard support to the
Apple TV, folders for apps, dictation and Siri support for searching through the tvOS store.
Apple is promoting three apps at the moment for use with the “Live Tune-In” feature: ESPN,
Disney XD, and CBS. When a user tells Siri to watch ESPN live, for example, Siri automatically opens
the app and starts the live stream.
e-djuster launches mobile solution for contents inventory and claims management with
speech recognition
e-djuster announced the launch of its e-xclaim mobile solution, a Software-as-a-Service (SaaS)
contents inventory and claims management solution serving the insurance industry. The service
supports speech recognition, enabling adjusters and field content claims specialists to remotely
complete content inventory claims faster, with greater accuracy, and with the ability to operate on a
connected or unconnected basis to a cellular or Wi-Fi network. The mobile solution provides a
seamless upload capability to e-xclaim, where the pricing and valuation process is automatically
initiated without delay.
Compared to traditional claims methods, the e-xclaim valuation platform improves claims
processing productivity by more than 20% and indemnity performance by greater than 15% through
access to the industry’s highest quality matching product data, according to the company.
Lexalytics provides text analytics that run on an Android device, targeted at developers to
include in their apps
Lexalytics, a provider of cloud and on-premise text analytics solutions, is launching Salience for
Android, a native text analytics package built on machine learning. By providing native text
analytics, all processing stays local on the phone so analysis results never go back to the cloud,
ensuring end-user privacy, according to the vendor.
With Salience for Android, application developers can offer mobile users natural language
processing and analytics for an app that uses text, including email, SMS and chat, reviews,
comparison shopping, social media, travel, and hospitality, so they can gain insights and useful
information to improve productivity and simplify day-to-day activities.
With Salience for Android, developers can bring apps to market with functionality such as
immediately alerting a user about an email or post that is especially negative and incendiary, or
positive and praiseworthy; displaying a daily summary of emails from important contacts; providing
a list of any to-do’s throughout the day, week or month; summarizing the latest information in the
sports world from a favorite team; removing politics-related content or any content the user might
want stricken from their social feed; highlighting buzz-worthy events taking place in the upcoming
weekend; and warning users when they’re about to send out a text they may regret later.
Other features include named entity extraction, summarization, imperative sentence extraction,
and query-based categorization.
GMA Consulting and TermSet offer document-centric solutions for financial sector
GMA Consulting, an IT consultancy specializing in solutions for the financial services industry,
and TermSet, a provider of software which automatically creates metadata and taxonomies for
Office documents, PDFs, and emails within Microsoft SharePoint, announced a partnership to offer
advanced document-centric solutions for firms in the financial services sector. Ben Grey, a Director
at GMA Consulting, said, “The first area we identified where TermSet could benefit our clients was
managing research documents. Both buy and sell side firms produce and consume large amounts of
research documents. When a piece of research material is uploaded to a SharePoint library, TermSet
can automatically produce a headline summary, classify the sentiment, and automatically tag with
metadata such as asset class, industry sector, company name, and region / country. This will make it
far easier for users to find documents they are interested in, particularly if SharePoint’s advanced
search capabilities are used as well. The second area we identified is legal document management.
TermSet can help extract key information such as matter reference and counterparty names and tag
documents with this information. This would negate the need for legal departments to manually
classify documents, which is hugely time consuming and inefficient.”
TermSet uses Natural Language Processing and Machine Learning to improve search,
governance and navigation within SharePoint.
Nuance selected by CHRISTUS Health for enterprise-wide speech recognition and clinical
documentation improvement deployment
Nuance Communications announced it has been selected by CHRISTUS Health, one of the
ten largest Catholic not-for-profit health systems in the US, for an enterprise-wide physician
documentation improvement initiative to improve how physicians work with inpatient and
ambulatory Electronic Health Records, with the additional benefit of improving financial
performance. This broad implementation replaces multiple competitive systems, adding Nuance’s
full suite of clinical documentation solutions including Nuance’s back-end medical transcription, as
well as Nuance PowerScribe 360, and Nuance Dragon Medical for real-time speech-enabled
physician documentation.
Recent Windows 10 update build apparently includes Cortana “find my phone” feature
Microsoft is working on Cortana features in the upcoming Windows 10 Anniversary Update.
One of those new features is, according to rumors, a new “Find my phone” option. Asking Cortana
on build 14295 to “find my phone” appears to work, according to one report—the command results
in Cortana saying that she’s now looking for it.
CallMiner partners with Ultracomms to add its interaction analytics solutions to Ultracomms’
PCI-compliant cloud contact center
CallMiner, Inc. and Ultracomms, a European cloud contact center services provider,
announced a partnership to provide contact centers across the UK with advanced cloud-based
Interaction Analytics solutions from CallMiner. Ultracomms recently achieved Payment Card
Industry Data Security Standard (PCI DSS) v3.1 Level 1 accredited service provider
status for its entire platform. By adding the CallMiner Eureka interaction analytics solution,
Ultracomms customers can capture and analyze 100% of customer interactions across all
communication channels, including calls, chats, emails and social media. This delivers a number of
benefits, including:
§ Helping organizations improve contact center performance by analyzing all interactions for all
agents, rather than basing decisions on a small sample of calls;
§ Making full FCA compliance very simple by providing access to all interactions;
§ Identifying best practice and agent strengths so that not only can the skills of individual agents
be matched to tasks but also the performance of the whole team can be raised;
§ Introducing a competitive spirit amongst agents by providing each agent with their own
dashboard that anonymously compares their performance with best-performing agents;
§ Identifying the optimal path for customer interactions so that best practice can be shared across
the team; and
§ Enabling contact center managers to become more effective by switching their time from
listening to a small sample of calls to identifying and delivering more targeted and intelligent
coaching based on all interactions.
Hitachi introduces in-store sales representative robot
SoftBank’s Pepper robot has been used as a conversational in-store guide for shoppers, and
Hitachi has launched a competitor. Called by the catchy name of EMIEW3, the roughly 3-foot-tall
unit can determine when customers need help and then approach them without prompting, Hitachi
said. (A questionable feature if, for example, it frightens a child or startles an adult.) The intelligence
is delivered over a network connection, which allows it to be helped by cameras in the venue
observing a customer’s movements. The device will reportedly go on sale in 2018.
Thomson Reuters signs an agreement with FiscalNote to add automated legislative tracking
solution to Thomson Reuters Regulatory Intelligence
Thomson Reuters announced that it has signed an agreement with FiscalNote, a provider of
legislative and regulatory analytics and insight. The agreement supplies predictive legislative
analytics to Thomson Reuters Regulatory Intelligence (TRRI), a global solution that provides clients
a focused view allowing them to manage regulatory risk.
Under the agreement, FiscalNote will provide TRRI users with high-accuracy insight into the
likelihood of a piece of legislation’s passage and the factors important to it. FiscalNote uses
machine learning and natural language processing to create models analyzing open government data.
These models allow FiscalNote to automatically analyze how legislation is going to fare by
examining and understanding the importance of various factors such as legislators, committee
assignments, actions taken, bill versions, and amendments.
VoiceXML Forum announces the public launch of the VoiceXML 2.1 Developer Certification Exam
Voice eXtensible Markup Language (VoiceXML) has long been a popular way to control
voice-interactive IVR systems. The VoiceXML Forum was created in 1999 with the mission to promote
and to accelerate the worldwide adoption of VoiceXML-based applications.
The VoiceXML Forum announced the launch of the VoiceXML 2.1 competency-based
certification for IVR developers. VoiceXML developers will be able to attest to their skills and
expertise within the IVR community and to potential employers.
This Certification Exam expands upon a previous developer certification offered by the Forum to
match the evolution of industry standards in the following ways:
§ VoiceXML 2.0 test material has been greatly expanded and revised for clarity.
§ Extended to cover VoiceXML 2.1 through comprehensive inclusion of assertions relating to
material from the VoiceXML 2.1 specification.
§ Includes updated content reflecting the final draft CCXML, SRGS, SISR, and SSML
specifications.
ASTi to enhance RAF CH-47 trainers including speech recognition
Advanced Simulation Technology (ASTi) equipment will be fitted on a suite of CH-47 Mk6
Chinook helicopter weapons system trainers for the British Royal Air Force (RAF), the company
announced. The RAF's cabin trainer and two flight deck trainers will be equipped with ASTi’s
Telestra 4 and Simulated Environment for Realistic ATC (SERA) systems. The systems will equip
the trainers with audio and communications capabilities.
ASTi’s SERA system creates an immersive and fully automated ATC and external radio
environment for enhanced aircrew training by generating artificially intelligent (AI) entities that
represent pilots, other aircraft and air traffic controllers who communicate in the same simulation.
The system uses speech recognition to allow pilot trainees to communicate with AI controllers
during all phases of flight training. The speech recognition system replicates regional accents and
eliminates the need for additional staff role players.
Max Sound’s High Definition Audio now available for iPhones
Max Sound Corporation announced that the next generation iOS MAX-D HD Audio App
version 2.01 is now available for free download at the Apple App Store. MAX-D HD is designed
specifically to bring High Definition audio to today’s worldwide streaming audience. Version 2.01
resynthesizes today’s compressed audio in real time, restoring much of what was lost during the
compression process, including the highs, lows, and mid-ranges. In addition, MAX-D has a natural
audio safe mode that allows the App user to choose to enjoy their music under 85 decibels to avoid
hearing damage.
For those people who don't stream music, MAX-D Version 2.01 allows anyone with an iTunes
playlist to listen in MAX-D HD. MAX-D HD version 3.0 will offer a paid subscription version of
the App in the near future. The Subscription App will work with additional streaming audio/video
services, speech recognition, and Apple CarPlay.
BodyWorn body-worn camera system allows entering notes with speech-to-text
The latest upgrade to the BodyWorn smart body-worn camera system from Utility adds the ability
to instantly play back recorded HD video, classify video by retention categories, and enter
notes via text entry and speech recognition. BodyWorn can be configured to allow police officers to
view recorded video on their body-worn device immediately after recording to improve the accuracy
of their notes.
Robert McKeeman, CEO of Utility, said, “The Police Department does not have to buy new
body camera hardware to get new features and capabilities, or wait years for the camera hardware to
be replaced. The Department, of course, decides which features and capabilities to activate as part of
their overall policy-based recording strategy. Our mission is to provide Police Chiefs and Policy
Makers with the most up-to-date configurable technology so they can make policy decisions without
being limited by the technology.”
MIT and PatternEx develop AI to detect cyberattacks, using machine learning to cluster
similar potential problems for human analysts
A new artificial intelligence platform developed by the Massachusetts Institute of Technology
(MIT) and PatternEx can identify up to 85% of cyberattacks, according to a new research paper.
Dubbed AI2, the platform is said to be significantly better at predicting cyberattacks than similar
systems because it continuously incorporates new input provided by human experts.
“Today’s security systems usually fall into one of two categories: man or machine,” Adam
Conner-Simon from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) wrote
in a post on the MIT News site. “So-called ‘analyst-driven solutions’ rely on rules created by human
experts and therefore miss any attacks that don’t match the rules. Meanwhile, today’s
machine-learning approaches rely on ‘anomaly detection,’ which tends to trigger false positives that both
create distrust of the system and end up having to be investigated by humans, anyway.” The MIT
and PatternEx platform attempts to merge those two approaches.
AI2 predicts attacks by combing through data and detecting suspicious activity by clustering it
into meaningful patterns using unsupervised machine learning, according to researchers at MIT. It
then presents the activity to human analysts who confirm which events are actual attacks. AI2 then
incorporates that feedback into its models for the next set of data.
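The loop described above can be sketched in a few lines of Python. This is an illustration only, not the method from the MIT and PatternEx paper: it swaps in a simple distance-from-centroid outlier score for the unsupervised step and a nearest-labeled-neighbor rule for the feedback step, and every name in it is hypothetical.

```python
import math


def distance(a, b):
    # Euclidean distance between two equal-length feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def outlier_scores(events):
    # Unsupervised step (simplified): distance from the centroid of the batch.
    n = len(events)
    center = [sum(e[i] for e in events) / n for i in range(len(events[0]))]
    return [distance(e, center) for e in events]


class FeedbackModel:
    """Stores analyst labels and uses them to re-weight later scores."""

    def __init__(self):
        self.labeled = []  # list of (event, is_attack) pairs

    def add_labels(self, events, labels):
        self.labeled.extend(zip(events, labels))

    def adjust(self, event, score):
        # Boost events nearest to a confirmed attack, damp events nearest to
        # a confirmed false alarm, and leave scores alone before any feedback.
        if not self.labeled:
            return score
        _, is_attack = min(self.labeled,
                           key=lambda item: distance(item[0], event))
        return score * 2.0 if is_attack else score * 0.5


def review_round(events, model, analyst, k):
    """Rank events, show the top k to the analyst, and learn from the labels."""
    scores = [model.adjust(e, s) for e, s in zip(events, outlier_scores(events))]
    ranked = sorted(range(len(events)), key=lambda i: scores[i], reverse=True)
    shown = [events[i] for i in ranked[:k]]
    model.add_labels(shown, [analyst(e) for e in shown])
    return shown


# Stand-in for the human analyst: in this toy data, attacks have a large
# first feature. A real deployment would ask a person.
def analyst(event):
    return event[0] > 5


model = FeedbackModel()
first = review_round([(0, 0), (0, 1), (1, 0), (10, 0), (0, 10)], model, analyst, k=2)
second = review_round([(0, 0), (1, 1), (9, 1), (1, 9)], model, analyst, k=1)
```

In the second batch, the two unusual events are equally far from the centroid, so the unsupervised score alone cannot separate them. The analyst's earlier labels break the tie in favor of the event that resembles a confirmed attack.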
Sentient Technologies uses AI to help sell shoes
Sentient Technologies uses AI to make e-commerce more effective with its Sentient
Aware software. Sentient Aware does not require shoppers to interact with navigation menus and
checkbox filters to narrow down results. Based on the positive response from its shoppers to
the initial rollout of Sentient Aware, SHOES.COM is expanding the use of Sentient Aware to a
number of new shoe categories on its Canadian site SHOEme.ca. SHOES.COM’s “Visual Filter”
uses the Sentient AI to gain a deep, highly detailed understanding of the shoes available in the
catalog. As consumers click on product images, it rapidly learns shoppers’ unique style preferences.
Visual Filter provides instant recommendations of available shoes. These highly personalized
choices can then be selected for purchase or saved for later.
SHOES.COM has also released a new user interface for Sentient Aware on their site, and
rebranded the experience as Smart Shopper. Since the launch of the Sentient Aware-powered Visual
Filter last November, customers using the experience have, on average, a 16% higher average order
size when checking out compared to the rest of SHOEme.ca’s customers.
SRI International spins off robotics company to make body suit that enhances movement
Nonprofit research organization SRI International is spinning off part of its robotics
division into a new company called Superflex. Superflex will focus on “human augmentation,”
robotic augmentations for people, with the focus on helping the disabled or preventing disabilities
in people with jobs that entail repetitive physical stress that might lead to disabilities.
Superflex is also the name of the company’s prototype, a full-body suit filled with soft muscle-like
actuators that detect movements and give them a boost. The intent is to be the difference that lets
someone in physical therapy walk normally, or helps an individual in a job requiring repetitive
physical stress avoid future problems. Some DARPA funding at SRI is motivated by support for
soldiers that carry heavy packs or armaments all day.
Chevron employs AI to improve operations
According to The Motley Fool, Chevron is currently using AI (probably machine learning in
particular) to identify new well locations and stimulation candidates in California. Using AI software
to analyze the company’s historical well performance data has reportedly allowed the company to
drill in better locations, yielding a 30% rise in production over conventional methods. Chevron is also
using predictive models to analyze the performance of thousands of pieces of rotating equipment to
detect failures before they occur, avoiding unplanned shutdowns and lowering repair expenses.
The increased production and lower repair costs have translated to more profit per well. Overall,
analysts believe artificial intelligence and digital technologies will unlock trillions of dollars of
productivity gains over the next decade in the industrial sector, with the oil and gas sector being one
of the prime beneficiaries.
Wise.io introduces content discovery capability for customer support
Wise.io, a provider of machine learning applications to help enterprises provide a better
customer experience, launched an intelligent content discovery and creation capability for customer
support organizations. Using machine learning to read and understand past support conversations,
“Wise Support” automatically identifies and surfaces similar agent responses that are not already
part of a company’s curated content library. With Wise.io, the creative work of front-line agents,
often lost in the noise, can be used to drive efficiency and uniformity across the customer care
organization.
Intoware uses Nuance speech recognition to direct aircraft maintenance hands-free
U.K.-based Intoware has developed speech recognition software called WorkfloPlus that guides
techs through performance and reporting of aircraft maintenance tasks hands-free. Chief Technology
Officer James Woodall explains the software gives audio instructions on each step to be followed,
receives the results of each step spoken into a digital device, and stamps each interaction with GPS
location and time. The Intoware software can be used on Android, iOS, or Windows smartphones or
wearable devices. “It solves the problem of giving instructions on each step, guiding mechanics
through each step and knowing that each step has been followed,” Woodall says. “You can’t skip
steps; it won’t allow that.”
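The behavior Woodall describes (one instruction at a time, no skipped steps, each result stamped with time and location) can be sketched as a small ordered checklist. This is not Intoware's code; the class name, method names, and record fields below are hypothetical, and speech input and output are left out.

```python
from datetime import datetime, timezone


class GuidedProcedure:
    """Ordered checklist that refuses out-of-order reports (hypothetical)."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.log = []  # one record per completed step, in order

    def current_instruction(self):
        # The instruction to read out next, or None when all steps are done.
        if len(self.log) < len(self.steps):
            return self.steps[len(self.log)]
        return None

    def report(self, step_index, result, location, now=None):
        # Only the next pending step may be reported, so none can be skipped.
        if step_index != len(self.log) or step_index >= len(self.steps):
            raise ValueError("steps must be completed in order")
        stamp = now or datetime.now(timezone.utc)
        self.log.append({
            "step": self.steps[step_index],
            "result": result,
            "location": location,  # e.g. a (latitude, longitude) pair
            "time": stamp,
        })


procedure = GuidedProcedure(["Check oil level", "Inspect rotor blades"])
try:
    procedure.report(1, "no cracks", (51.5, -0.1))  # attempt to skip step 0
    skipped = True
except ValueError:
    skipped = False
procedure.report(0, "oil level normal", (51.5, -0.1))
next_instruction = procedure.current_instruction()
```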
Intoware uses Nuance’s VoCon 3200 speech recognition software and builds WorkfloPlus on
top of that. The company makes only the software, not the hardware.
Background noise cancelation is necessary to avoid false triggers in frequently loud maintenance
environments. WorkfloPlus currently uses algorithms to cancel background noise and improve
speech recognition. Woodall recommends that device manufacturers consider installing Kopin’s
noise-canceling chips to achieve the same effect.
Does this look like a robot?
Jia Jia (pictured) is an experimental robot
built at the University of Science and
Technology of China (USTC). Jia Jia has been
created to mimic humans in as many ways as
possible. The mouth lip-syncs words, she looks
around to scan the environment, and she
processes the events around her.
Perhaps ironically, given that China is often
viewed as a source of low-cost human labor, the
country has made a priority of developing robot
technology.
Jia Jia
Narita International Airport in Japan testing a speech-to-speech translation application on
shuttle buses
Narita International Airport in Japan is testing a speech-to-speech translation mobile
application NariTra for iOS and Android devices on shuttle buses running between two terminals.
The plan is to roll it out across the entire airport by the time the 2020 Tokyo Olympic and
Paralympic Games take place. The free multilingual translation app was created by the airport using
a multilingual speech translation engine developed by the National Institute of Information and
Communications Technology of Japan (NICT). It employs noise-canceling techniques and can
provide speech-to-speech translations between Japanese and English, Mandarin Chinese, Korean,
and Indonesian. It also supports text-to-text translation in these languages, as well as in Thai,
French, and Spanish.
Perfect Pitch uses recorded scripted responses to allow off-shore agents to sound native
Perfect Pitch’s customer contact center relies on scripted, recorded responses that human agents
select and trigger, essentially reducing the agents to speech recognition engines. The
company’s approach is an attempt to address complaints about understanding foreign call center
representatives with heavy accents.
The company helps businesses use offshore call centers while having the representative
sound local to the customer, wherever the customer lives. The technology has been used to
facilitate over 100 million minutes of conversation, according to the company.
IndianTTS launches Hindi text-to-speech solution
Indian Text to Speech Private Limited has announced the availability of its Hindi text-to-speech
solution for IVR systems. This technology is said to engage the customer in regional Indian
languages. IndianTTS’s Hindi product includes a specialized name conversion tool that is said to
achieve 85% accurate Indian naming conversion.
Microsoft collaborates with Narrative Science to add automated natural-language narratives to
its Power BI visuals
Microsoft Power BI (Business Intelligence) transforms company data into rich visuals.
Microsoft announced a collaboration with Narrative Science resulting in a new feature: Narratives
for Power BI, which automatically communicates insights from all connected Power BI data sources
in natural language. As you interact with your data and visualizations, Narratives for Power BI
dynamically delivers insights in narrative form, simulating what an analyst would write.
This custom feature uses Narrative Science Quill, a natural language generation platform.
Narratives can be generated directly from the wide range of Power BI’s connected data sources and
can be used to uncover trends hidden in visualizations. Dynamic narratives continuously update as
you interact with your data and visualizations, providing natural language insights driven by
descriptive, diagnostic, and predictive analytics.
Microsoft updates Cortana on the iPhone
Microsoft updated Cortana on the Apple iPhone, adding some new functionality and fixing
some issues. Version 1.5.5 includes the ability to launch some popular apps with your voice and an
improved homepage loading experience.
Florida Hospital achieves significant quality improvements plus $72.5 million in increased
reimbursement with Nuance Clinical Documentation Improvement
Nuance Communications announced that Florida Hospital has realized significant
improvements in its Case Mix Index (CMI), resulting in a $72.5 million increase in appropriate
reimbursement since the implementation of Nuance CDI. (Clinical Documentation Improvement is a
process typically used in hospitals with specialists who review clinical documents and provide
feedback to physicians.) In addition, Florida Hospital reduced its observed-to-expected mortality
rates by 48% and achieved ICD-10 compliance. (ICD is the International Statistical Classification of
Diseases and Related Health Problems, a medical classification list standardized by the World
Health Organization.)
Florida Hospital deployed Nuance’s CDI program at Florida Hospital in summer 2014 and
completed an expansion across eight affiliated hospitals by May 2015. Florida Hospital will expand
its program and leverage the enhanced Nuance Clintegrity CDI solution embedded within Cerner’s
Millennium Electronic Health Record and revenue cycle solutions. This integration will provide
clinicians and clinical documentation specialists with a highly efficient workflow that supports the
quality of physician documentation with minimal disruptions. Nuance clients will also leverage
computer-assisted physician documentation (CAPD) technology, which provides automated CDI
clarifications to physicians, integrated into Cerner’s Document Quality Review solution.
Apple will pay $24.9 million in a long-running lawsuit over the origins of Siri, more suits likely
A lawsuit, which dates back to 2011, alleges that Apple's virtual personal assistant Siri was
developed by researchers at Rensselaer Polytechnic Institute (RPI), which was awarded a patent
for the technology in 2007. RPI then licensed the patent to Dynamic Advances. Apple agreed to
provide Dynamic Advances’ parent company Marathon Patent Group with $5 million now, and
hand over an additional $20 million after certain conditions are met. In announcing the settlement,
the non-practicing entity said, “Dynamic Advances believes RPI has unreasonably withheld its
consent to the reasonable royalty rate set forth in the settlement agreement between Dynamic
Advances and Apple, and that issue may have to be resolved in arbitration.” The SEC
statement announcing the deal also makes clear that Dynamic Advances is going to find other
targets, as the company states that it “believes that other voice recognition products infringe the ‘798
patent.”
A Taiwanese university has also sued Apple for alleged patent infringement in its Siri voice
assistant. Taiwan's National Cheng Kung University alleged in the lawsuit, filed in a US district
court, that Apple's Siri feature infringes on two of the school’s U.S. patents dealing with speech
recognition technology. The university is demanding Apple pay a still undetermined amount in
damages, and that the court order an injunction on Apple's use of Siri as a feature on its iPhones and
iPads.
Statistics and Surveys
More than 75% of businesses rank the importance of having a mobile application as high
451 Research asked businesses to rate the importance of providing customers with services from
mobile devices. Over three-quarters ranked two goals as high (8-10 on a scale of 10): creating a
mobile shopping experience via a mobile web site, and creating a mobile application for customers,
for uses such as shopping or customer service.
89% of consumers expect and prefer conversational interactions with customer service
Nuance Communications shared results of a recent global survey relating to consumer
preferences and expectations around customer self-service. 89% of consumers want to engage in
conversation with virtual assistants to quickly find information instead of searching through Web
pages or a mobile app on their own. In the phone channel, the majority of consumers indicated they
prefer to engage with a system that lets them speak naturally when calling a business. Further
results:
§ 73% of consumers want their conversation with customer service to be personalized.
§ 64% of consumers want their customer service to be proactive in nature, with suggestions and
reminders.
§ Consumers want a conversational, personalized and proactive interaction throughout the entire
service experience, including authentication, with 83% of respondents seeking an alternative to
passwords and PINs and the majority eager to use voice biometrics as the method to identify
themselves.
Botego CEO predicts 2017 will be the year of the bots with $2 billion market size
Botego has nine years of experience in developing bots for web, mobile, Facebook, Slack, and
other platforms. Customers include Johnson & Johnson, Coca-Cola, and Unilever. Currently, 40+
bots developed by Botego deliver an average of 5 million answers per month, according to the
company.
Ekim Nazim Kaya, CEO of Botego, expects 2017 to be a big year for bots. He shared his
predictions in a recent blog post. Kaya bases his optimism on the following:
§ Siri, Cortana, Echo, and now Messenger M represent a strong trend;
§ Apps such as WhatsApp, Line, and Telegram have billions of users;
§ Using a bot running on these platforms is much more efficient than downloading an app for
every brand;
§ Millennials, who will account for 40% of all consumers, prefer using self-service channels over
talking to a customer service rep; and
§ Analysts from Gartner and Deloitte predict that autonomous software will take part in most
economic transactions by 2020.
Millennials have the lowest tolerance for errors and delays, but reward good service with
loyalty
J.D. Power released their first Millennials Insight Report: The Customer Experience
Perspective, defining the makeup and customer experience preferences of Millennials—those born
between 1982 and 1994.
The report concluded that Millennials have the lowest tolerance for errors and delays of any
generation studied; they simply expect things to work. However, when there is a problem and
it is resolved fully, Millennials are substantially more likely than Boomers to reuse a product or
service.
Unlike other generations that tend to buy things for status, image, or brand loyalty, Millennials
are most likely to make a purchase decision based on value for money—across virtually every
product category.
Millennials are less concerned than other generations about privacy. They accept the erosion of
privacy as inevitable and are generally willing to have their information collected if it comes with
benefits in the form of targeted offers and personalized services.
Despite having lower accumulated wealth, less income, and higher debt than other generations,
Millennials are much more optimistic about the economy and their own personal financial outlook.
73% of smart home owners already use voice commands
According to The NPD Group Connected Intelligence Connected Home Automation Report,
64% of smart home product owners used a smartphone to control or monitor their home automation
devices. Additionally, 73% of smart home owners already use voice commands, with 61% of those
consumers expressing interest in using voice to control more products in their homes.
“This reliance on smartphones to control and monitor the smart home is due, in part, to app
compatibility, as nearly all home automation devices have an iPhone or Android app,” said John
Buffone, executive director, Connected Intelligence. “As apps and devices become more intuitive,
voice recognition – and thus, voice control – will begin to play a more prominent role in the further
development of the smart home.”
The NPD Group’s Retail Tracking Service has also recorded that Home Automation sales are up
41% year-over-year in 2015 versus 2014. This tracking service’s metrics incorporate not only
systems controllers such as thermostats, but also the broad range of smart capabilities across
technologies such as power, sensors, lighting, security/monitoring, locks, and kits.
Global virtual reality headset revenues projected to reach $895 million in 2016
According to the latest research from Strategy Analytics, global virtual reality headset revenues
will reach $895 million in 2016 with 77% of that value accounted for by newly launched premium
devices from Oculus, HTC, and Sony. These three brands, however, will account for only 13% of
volume in 2016 as lower-priced smartphone-based devices will dominate in the 12.8 million unit
virtual reality headset market. The analyst firm sees 2016 as a pivotal year for virtual reality given a
confluence of factors, and also one where managing expectations will be paramount given a dearth
of available content and the technical limitations of entry-level virtual reality.
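The revenue and volume shares quoted above imply very different average selling prices for premium and entry-level headsets. The following is a rough back-of-the-envelope sketch using only the Strategy Analytics figures in this item; the function name and the simple two-tier split are illustrative assumptions, not part of the report.

```python
def implied_asp(total_revenue, total_units, revenue_share, unit_share):
    """Average selling price of a segment, given its share of revenue and of units."""
    return (total_revenue * revenue_share) / (total_units * unit_share)


TOTAL_REVENUE = 895e6  # USD, projected 2016 headset revenue quoted above
TOTAL_UNITS = 12.8e6   # projected 2016 headset units quoted above

# Premium tier (Oculus, HTC, Sony): 77% of revenue on 13% of units.
premium_asp = implied_asp(TOTAL_REVENUE, TOTAL_UNITS, 0.77, 0.13)

# Everything else: the remaining 23% of revenue on 87% of units.
other_asp = implied_asp(TOTAL_REVENUE, TOTAL_UNITS, 1 - 0.77, 1 - 0.13)

print(f"Premium headsets: about ${premium_asp:,.0f} per unit")
print(f"All other headsets: about ${other_asp:,.0f} per unit")
```

On these figures the premium devices work out to roughly $414 per unit against roughly $18 for the smartphone-based devices, which is consistent with the report's point that low-priced devices dominate volume while premium devices dominate value.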
YouTube and Netflix lead Web video delivery
In broadcasting, over-the-top content (OTT) refers to delivery of audio, video, and other media
over the Internet without the involvement of a multiple-system operator in the control or distribution
of the content. Netflix continues to grow its user base in the US, with 126.9 million people expected
to use it this year, according to eMarketer’s latest forecast on OTT video usage. That equates to
67.9% of OTT video users. Among the OTT service providers eMarketer tracks, only YouTube has
more users than Netflix—176.1 million, which equates to 94.3% of OTT users.
Speech analytics market worth $1.60 billion by 2020
According to a new market research report on speech analytics published by
MarketsandMarkets, the market is estimated to grow from $589.2 million in 2015 to $1.60
billion by 2020, at an estimated compound annual growth rate (CAGR) of 22.0% from 2015 to
2020.
The telecom, IT and outsourcing segment is expected to contribute the largest market share in
the Speech Analytics Market in 2015. The travel and hospitality segment is expected to grow at a
rapid rate from 2015 to 2020.
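For readers who want to check forecasts like this one, the compound-growth arithmetic is simple. The sketch below uses only the MarketsandMarkets figures quoted above; the helper names `project` and `implied_cagr` are illustrative, not taken from the report.

```python
def project(start_value, cagr, years):
    """Value after compounding start_value at the given annual rate for `years` years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


START_2015 = 589.2  # USD millions, 2015 estimate quoted above
YEARS = 5           # 2015 to 2020

# Forward check: compound the 2015 figure at the stated 22.0% CAGR.
forecast_2020 = project(START_2015, 0.22, YEARS)

# Reverse check: the rate implied by the rounded $1.60 billion endpoint.
rate = implied_cagr(START_2015, 1600.0, YEARS)

print(f"2020 forecast at 22.0% CAGR: ${forecast_2020:,.0f} million")
print(f"CAGR implied by $1.60 billion in 2020: {rate:.1%}")
```

Compounding $589.2 million at 22.0% for five years gives a little under $1.6 billion, so the quoted start value, end value, and growth rate are mutually consistent to within rounding.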
Intelligent Virtual Assistant market is expected to exceed $3 billion by 2020
Grand View Research estimated that the global intelligent virtual assistant market was $352
million in 2012, and is expected to grow at a CAGR of 31.7% from 2013 to 2020. Growing focus on
efficient customer interaction facilitated by virtual assistants is expected to drive the market over the
forecast period.
Large enterprises were the dominant consumers of intelligent virtual assistant services, and
accounted for over 80% of the overall market in 2012. Market prospects for small and medium
enterprises (SME) are expected to be positive, with adoption rates expected to increase considerably
over the next six years. Demand from travel, utilities, telecommunication, etc. is expected to be a
major opportunity for market participants.
The “global intelligent voice” industry to reach $19 billion by 2020
The “global intelligent voice industry” was estimated at $4.75 billion in 2014 and should see
30% growth per year, reaching $19 billion by 2020, according to a study from ReportsnReports.
Propelled by big data, mobile Internet, cloud computing, and other technologies, the global
intelligent voice industry is predicted to grow 30.7% year over year to hit $6.21 billion in 2015.
As the application of speech recognition technology in intelligent in-vehicle, smart home, and
wearable devices goes deeper, the market will maintain rapid growth, reaching an estimated
$19.17 billion in 2020.
With the intense involvement of Internet giants including Google, Microsoft, and Apple around
2010, the global intelligent voice industry has gradually evolved from an oligopoly to monopolistic
competition. In 2015, speech recognition leader Nuance Communications still took first place with a
market share of 31.1 percent but suffered a significant decline; Google, Microsoft, Apple, and
IFLYTEK witnessed rapid share growth, standing at 20.7 percent, 13.4 percent, 12.9 percent, and
6.7 percent, respectively.
The report identifies the major speech recognition players as Nuance, Apple, Google,
Microsoft, IBM, MindMeld, and Speaktoit.
More than three billion Android phones in use globally by 2020
451 Research estimated that the number of Android phones in use will rise from 2.132 billion in
2015 to 3.029 billion in 2020.
Global smartphone shipments this year have fallen
According to a report from Strategy Analytics, global smartphone shipments in the first quarter of
this year fell 3% compared with the first quarter of 2015, from 345 million units to 334.6 million.
This is the first time smartphone unit sales have slipped since the device’s introduction.
Annual gains in worldwide ad spending will hover around 6% through 2020
Spending on paid media ads worldwide will climb 5.7% in 2016 to $542.55 billion, propelled by
increased investments in digital advertising. Worldwide ad spending will reach $674.24 billion by
the end of 2020, with annual gains hovering between 5% and 6%.
Mobile advertising nears $29 billion annually in US
This year, in the US, eMarketer estimates that advertisers will spend $28.72 billion to reach
their targets on mobile devices.
Financial Notes
Aspect Software makes progress in restructuring
Aspect Software, a provider of fully integrated consumer engagement, workforce optimization,
and back-office solutions on premises and in the cloud, updated the status of its strategic action to
facilitate its long-term growth. On March 9, 2016, the company commenced a pre-arranged chapter
11 case in support of a proposed restructuring agreement with existing lenders. The restructuring is
expected to facilitate the reduction of more than $320 million of indebtedness, a new first lien
facility, and an infusion of new capital to enable growth.
Since the announcement of its restructuring just a month ago, the Company has agreed upon and
filed the underlying documentation to implement the transaction and, more importantly, secured
consent from nearly all of its existing lenders. More specifically:
§ 100% support from the Company’s first lien secured lenders
§ Approximately 80% support from the Company’s second lien secured lenders
§ Angel Island Capital, an existing stakeholder in Aspect and an affiliate of Golden Gate Capital
(Aspect’s current majority equity shareholder), has agreed to invest new capital and continue as
an equity holder in Aspect following its reorganization
§ All trade debt will be paid in full in cash as part of the restructuring; moreover, no committee of
creditors has been formed
§ Overwhelming support for the restructuring will facilitate an expedited exit from the
restructuring process.
MetaMind acquired by Salesforce
Salesforce has acquired MetaMind, which has a question-and-answering technology for both
visual and textual inputs. MetaMind indicated it would extend Salesforce’s data science capabilities
by embedding deep learning within the Salesforce platform.
Salesforce plans to integrate MetaMind’s technology into Salesforce services. MetaMind will
discontinue the services it offers directly.
$30M round for AI marketing firm Persado
Persado has a “persuasion automation” platform, natural language processing technology that
automates the creation of the persuasive language used in digital marketing and other applications.
The company, which optimizes marketing pitches for a variety of corporate clients such as Citibank,
Microsoft, Verizon Wireless, Sears, and American Express, just closed a $30 million Series C
funding round. Investment bank Goldman Sachs led the round, while previous investors — Bain
Capital Ventures, StarVest Partners, American Express Ventures, and Citi Ventures — also
participated.
The company currently employs more than 200 people across nine global locations and
generates cognitive content for display ads, Facebook, email, website landing pages, SMS, and
mobile push notifications.
X.ai secures $23M in Series B funding
x.ai, Inc., an artificial intelligence company, announced that it has secured an additional $23
million in Series B financing led by Two Sigma Ventures. In addition, DCM Ventures and
Work-Bench Ventures have joined in the round of funding as new investors. All of x.ai’s existing
investors have also participated. The investment will support the continued expansion of x.ai’s data
science team as well as the development of its customer acquisition and enterprise sales teams.
To date, a select number of customers have had beta access to x.ai’s AI personal assistant (Amy
or Andrew Ingram) (Speech Strategy News, March 2015, p. 23). Customers simply cc Amy, and
she takes over the job of scheduling their meetings.
Artificial intelligence startup DigitalGenius raises $4M to automate customer service
DigitalGenius announced its Human+AI customer service platform, along with a $4.1 million
seed investment. The platform integrates with existing customer service software suites — like
Salesforce, Zendesk, and Oracle — to automate the most repetitive parts of customer service
through AI and machine learning-powered chatbots. Salesforce was part of the deal.
nGUVU raises $3 million to bring gamification and machine learning to contact centers
nGUVU, a developer of gamification and machine learning software for contact centers, has just
received $3 million in funding from Brightspark Venture Capital and Desjardins Venture
Capital. Pierre Donaldson, Chairman and CEO of nGUVU, said, “Our mission is to revolutionize
contact centers by putting the agent at the core of our strategy. The opportunity before us is huge
because there are over 4 million agents in North America and 20 million agents worldwide. There is
no other solution today that uses gamification and machine learning to improve the agent
experience.”
Security startup Illumio raises $100 million
Having emerged from stealth mode late last year, Illumio announced $100 million of Series C funding
from new investors BlackRock Funds and Accel Partners, plus existing investors. The company
offers an Adaptive Security Platform (ASP). According to the company, traditional perimeter- and
network-centric security products are no longer sufficient in a world where applications and
workloads increasingly need to work dynamically across on-premise data centers and public cloud
services. Firewalls, intrusion protection systems and advanced threat protection appliances are
widely deployed to secure interactions at the perimeter - but, says Illumio, these tools offer little
protection within enterprise data centers and in the public cloud, where much of today’s traffic flow
and data reside.
Illumio’s Adaptive Security Platform addresses the problem by taking a granular approach to
security. There are two elements to the ASP: an agent (the Virtual Enforcement Node, or VEN) that
attaches to Linux or Windows workloads running on physical and virtual machines, be they in
on-premise data centers or in the cloud; and a centralized (on-premise or cloud-based) server, the Policy
Compute Engine (PCE), which receives telemetry from the VENs to build a map of the
dependencies between classified workloads in multi-tiered applications. This map can then be used
to build application-specific security policies based on explicitly allowed interactions between the
constituent workloads. Policies are written in natural language (and translated into network actions
by the VENs) rather than arcane firewall rules.
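The idea of policies built from explicitly allowed interactions between labeled workloads can be illustrated with a small default-deny sketch. This is not Illumio's API or policy language; the `Workload` and `Rule` types, the role labels, and the port numbers are all hypothetical, chosen only to show the allow-list principle described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    role: str  # hypothetical label, e.g. "web", "app", "db"
    app: str   # hypothetical label for the application the workload belongs to


@dataclass(frozen=True)
class Rule:
    src_role: str
    dst_role: str
    port: int


def is_allowed(policy, src, dst, port):
    """Default-deny check: traffic passes only if a rule explicitly allows it."""
    if src.app != dst.app:
        return False  # this sketch permits no cross-application traffic
    return Rule(src.role, dst.role, port) in policy


# Allow-list for one three-tier application: web -> app -> db, nothing else.
policy = {Rule("web", "app", 8080), Rule("app", "db", 5432)}

web = Workload("web-1", "web", "orders")
app = Workload("app-1", "app", "orders")
db = Workload("db-1", "db", "orders")

print(is_allowed(policy, web, app, 8080))  # True: explicitly allowed
print(is_allowed(policy, web, db, 5432))   # False: web may not reach db directly
```

Because rules refer to roles rather than IP addresses, the same policy keeps applying when a workload moves between a data center and a public cloud, which is the property the company emphasizes.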
Shopify acquires Kit CRM to further “conversational commerce”
Shopify, a Canadian e-commerce company that develops computer software for
online stores and retail point-of-sale systems, is among the latest to invest in chat bots. The
e-commerce platform has acquired Kit CRM as it looks to further what it calls its “conversational
commerce” strategy.
Kit, which got its start in 2013, is positioned as a virtual marketing assistant that helps online
merchants market their stores and engage customers via text and other messaging. Kit enables
businesses to build Facebook ads, email customers, sponsor Instagram photos, post updates to their
Facebook Page and offer recommendations through a chat interface. Kit also introduced an API that
lets it interact with other apps in the Shopify App Store. Terms of the deal were not disclosed.
Mobify acquires Pathful and its machine learning technology
Mobile customer engagement company Mobify announced the acquisition of Pathful, a provider
of advanced machine learning-based technology for behavior-based targeting. Terms of the
acquisition were not disclosed.
Founded in 2011, Pathful helps retailers and B2B customers understand how visitors interact
with content, capturing micro interactions on the web as well as with desktop and mobile devices to
surface content that engages with customers and increases conversions. The technology is being
integrated with Mobify’s Mobile Customer Engagement Platform, enabling retailers that use content
marketing and marketing automation to acquire and qualify customers by understanding the entire
customer journey, from acquisition through conversion.
Coveo grows with its “intelligent search” products
Coveo, which provides “intelligent search,” announced another consecutive record-breaking
quarter for Q1 2016. Coveo reported CY Q1 2016 revenue bookings growth of 107% and GAAP
revenue growth of 109% over CY Q1 2015. During CY Q1 2016 Coveo expanded its key verticals
with multiple new customers in energy, healthcare, financial services, and technology.
“This is an exciting time for Coveo with our significant technology advances in machine
learning, analytics, and search intelligence in the cloud,” said Louis Tetu, Coveo CEO. Coveo plans
to open new offices in Montreal and Silicon Valley, expand its R&D headquarters in Quebec City,
and hire an additional 70 staff this year.
Almax Analytics closes seed round to provide AI for news insights in capital markets
Almax Analytics, which provides “Artificial Intelligence for News Insights in Capital Markets,”
announced a seed round has been closed in London. Investors include executives from MCSI Inc.,
Aviva plc, a founding member of RiskMetrics Group Inc., the Fintech SEIS Fund and Jonas
Dromberg, a Finnish technology investor and former Bloomberg Bureau Chief. Results of this
first-to-market service for analyzing news are expected by July, with Series A financing plans already in
process.
Almax Analytics delivers actionable insights by putting the content of news into context and
running deep analysis across the entire network of affected companies.
SparkCognition closes a $6 million Series B funding round
Artificial intelligence and cybersecurity company SparkCognition (p. 34) has closed a $6
million Series B funding round from investors that include Verizon Ventures and CME Ventures.
The startup plans to spend the new money on gaining customers for its machine-learning technology
that aims to predict when a company’s systems might fail or get hacked.
Vivint Smart Home raises $100M in equity funding
Vivint Smart Home, a provider of smart home technology and services, raised $100 million in
equity funding. The round was co-led by tech investor Peter Thiel and investment firm Solamere
Capital. Today, the company has more than one million customers and revenue of more than $650
million.
Vivint Smart Home offers a custom platform with integrated smart home products, including
smart door locks, thermostat, cameras, doorbell camera, cloud storage, and an array of sensors. In
addition to its product suite, the company has integrated smart home products into its Vivint Sky
platform, including the Amazon Echo and the Nest Learning Thermostat. Vivint also offers in-home
consultation, professional installation and support delivered by professionals, as well as 24-7
customer care and monitoring.
People
Avaya appoints Steve Joyner to Head of Sales Engineering, Europe
Avaya announced it has appointed Steve Joyner as Head of Sales Engineering for Europe.
Steve has over 25 years of experience in the technology industry. He has undertaken a range of
technical roles at companies including GEC Plessey, Nortel, and Avaya in Europe and the Middle
East. Most recently he was a manager within the European Sales Engineering team at Avaya.
For Further Information on Companies Mentioned in this Issue
Company | Location | Business | Contact info
1-800-Flowers | -- | Online florist | www.1800flowers.com
24/7 Inc. ([24]7) | Campbell, CA | Customer service solutions | www.247-inc.com
451 Research | New York, NY | Research firm | (212)505-3030; https://451research.com
Adafruit | -- | Electronics kits | www.adafruit.com
Advanced Simulation Technology Inc. (ASTi) | Herndon, VA | Military system simulation | www.asti-usa.com
AgentBot | San Francisco, CA | Text-based virtual agent | (415)849-2288; http://agentbot.net
Almax Analytics | Helsinki, Finland | Analysis of news for company impact | www.almaxanalytics.com
Amazon | Seattle, WA | Product sales on the Web, Echo, and more | www.amazon.com
Amazon Web Services | Seattle, WA | Technology infrastructure platform in the cloud | http://aws.amazon.com
American Cancer Society | Atlanta, GA | Non-profit organization | www.cancer.org
American Express | New York, NY | Credit card service | www.americanexpress.com
American Express Ventures | Palo Alto, CA | Venture capital firm | www.americanexpress.com/us/content/amexventures
Apple | Cupertino, CA | Personal computers, music players, wireless phones | www.apple.com
Arduino | -- | Open-source electronics platform | www.arduino.cc
Arise Virtual Solutions | Miramar, FL | Crowd-sourcing of agents | www.arise.com
Artificial Solutions | Stockholm, Sweden | Virtual assistant development tools | +46 8 663 54 50; www.artificialsolutions.com
Aspect Software | Chelmsford, MA | Customer Service platform | (978)25- 7900; www.aspect.com
Audeme | Sunnyvale, CA | Audio recognition | www.audeme.com
Avaya Inc. | Basking Ridge, NJ | Enterprise telephony solutions | (908)953-6000; www.avaya.com
Avvo | Seattle, WA | Online legal services marketplace | www.avvo.com
AYLIEN | Dublin, Ireland | Natural language processing and data services provider | +353-1-5983-168; http://aylien.com
Baidu Research (division of Baidu, Inc.) | Beijing, China, and Silicon Valley | Research division | http://research.baidu.com
Baidu, Inc. | Beijing, China | Web search in Chinese | http://ir.baidu.com
Bain Capital Ventures | Boston, MA | Venture capital fund | www.baincapitalventures.com
Bank of America | -- | Bank | www.bankofamerica.com
BodyWorn | Atlanta, GA | Wearable cameras | www.bodyworn.com
Botego | New York, NY | Virtual agents | www.botego.com
Brightspark Ventures | Canada | Venture capital | www.brightspark.com
CallMiner | Waltham, MA | Speech analytics | (781)547-5666; www.callminer.com
Center for Cognitive Computing Systems Research (C3SR) | Urbana, IL | Research center | http://illinois.edu
Cerner | Kansas City, MO | Healthcare solutions | (816)201-1024; www.cerner.com
Chatfuel | San Francisco, CA | Creating and hosting chatbots | http://chatfuel.com
Chevron | San Ramon, CA | Oil company | www.chevron.com
CHRISTUS Health | Irving, TX | Not-for-profit health system | www.christushealth.org
Citi Ventures | Palo Alto, CA | Venture capital firm | www.ventures.citi.com
Citibank | New York, NY | Bank | 212-559-1719; www.citigroup.com
Coca-Cola | Atlanta, GA | Soft drinks | www.coca-cola.com
Conversica | Foster City, CA | Lead engagement software | www.conversica.com
Cronologics Corporation | San Mateo, CA | Wearable technology company | https://cronologics.com
DARPA (Defense Advanced Research Projects Agency) | Arlington, VA | Research support | www.darpa.mil
DBS Bank | Singapore | Financial services group | www.dbs.com
Deutsche | -- | Ad agency | www.deutsch.com
digibank | India | Online bank | --
DigitalGenius | New York, NY | AI-powered customer service | (212)266-0090; http://digitalgenius.com
Dynamic Advances | -- | Patent licensing firm | www.marathonpg.com/patentportfolio/subsidiarylisting/detail/546/dynamic-advances
e-djuster | San Antonio, TX | Claim content valuation for insurers | www.e-djuster.ca
eMarketer | New York, NY | Market research | (212)763-6010; www.emarketer.com
ExamSoft Worldwide | Boca Raton, FL | Education testing | http://learn.examsoft.com
Expedia | Bellevue, WA | Travel service | 1-800-EXPEDIA; www.expedia.com
Expert System | Rockville, MD | Semantic software | www.expertsystem.com
Facebook | Palo Alto, CA | Social web service | www.facebook.com
Fandango | Santa Monica, CA | Ticketing service | 1-800-FANDANGO; www.fandango.com
FiscalNote | Washington, DC | Artificial intelligence for analysis of government data | www.fiscalnote.com
Florida Hospital | Orlando, FL | Hospital system | www.floridahospital.com
Fortemedia | Sunnyvale, CA | Fab-less semiconductor company with array microphone | (408)861-8088; www.fortemedia.com
Forty 7 Ronin | Colorado Springs, CO | VUI Design and IVR Development | (719)445-8054; www.forty7ronin.com
Freedom Scientific | St. Petersburg, FL | Solutions for visually impaired | (727)803-8000; www.FreedomScientific.com
Genesys | Daly City, CA | Customer service and contact center software and services | (650)466-1100; www.genesyslab.com
Getty Images | Chicago, IL | Image licensing | (312)344 4500; www.gettyimages.com
Giant Spoon | Los Angeles, CA | Creative agency | www.giantspoon.com
GMA Consulting | London, UK | IT consulting for financial services | www.gmaconsulting.com
Goldman Sachs | New York, NY | Investment firm | (212)902-1000; www.goldmansachs.com
Google (part of Alphabet) | Mountain View, CA, and Cambridge, MA | Voice and directory search | (650)253-0000; www.google.com; www.google.com/mobile; www.grandcentral.com
GoPro | -- | Wearable cameras | http://gopro.com
HealthTap | Palo Alto, CA | Medical information mobile apps | www.healthtap.com
Hexa Reports | Felton, CA | Market research | www.hexareports.com
Hitachi Group | Tokyo, Japan | Semiconductors and robots | www.hitachi.com/gateway
HomeAdvisor | Golden, CO | Digital home services marketplace | www.homeadvisor.com
HTC | Taiwan, R.O.C. | Smartphone and PDA Phone devices | +886-3-3753252; www.htc.com
IBM | Somers, NY | Information systems | (877)426-3774; www.ibm.com
IBM Health Corps | -- | Organization to uncover global health disparities | www.ibmhealthcorps.org
Illumio | Sunnyvale, CA | Security system | (669)800-5000; www.illumio.com
iMCO Technology | China | Smartwatch | --
Inbenta | Sunnyvale, CA | Intelligent virtual assistants | (408)213-8771; www.inbenta.com
Indian Text to Speech Private Limited | Ahmedabad, India | Text-to-speech software | www.indianTTS.com
Indiegogo | -- | Crowd-funding site | www.indiegogo.com
InfinityCTI | San Diego, CA | Interactive Voice Response systems | (800)795-1546; www.infinitycti.com
Intel Corporation | Santa Clara, CA | Semiconductors | www.intel.com
Intelligent Voice | London, UK | eDiscovery and Compliance solutions | www.intelligentvoice.com
Interactions Corporation | Franklin, MA | Virtual agent services for call centers and speech recognition technology | (317)810-2800; www.interactions.com
International Data Group (IDG) | Framingham, MA | Market research | www.idg.com
Intoware | Nottingham, UK | Construction on-site workflow tool | www.intoware.com
Invoxia | -- | Digital assistant device | www.invoxia.com
IPSoft | New York, NY | IT services and cognitive computing | www.ipsoft.com
J.D. Power and Associates | Westlake Village, CA | Surveys and studies | www.jdpower.com
Johnson & Johnson | New Brunswick, NJ | Healthcare products | (732)524-0400; www.jnj.com
Kasisto | California | Conversational virtual assistant | (650)762-6450; www.kasisto.com
Kik Interactive | Waterloo, Canada | Smartphone messenger with a built-in browser | http://kik.com
Kit CRM | San Francisco, CA | Manages social marketing | https://kitcrm.com
Kopin | Westboro, MA | Hands-free displays and headsets | (508)870-5959; www.kopin.com
Lexalytics | Boston, MA | Text and sentiment analysis technologies | (617)249-1049; www.lexalytics.com
LumenVox LLC | San Diego, CA | Speech recognition technology and development tools | (858)707-0707; www.lumenvox.com
Marathon Patent Group | Los Angeles, CA | Patent licensing firm | (703)232-1701; www.marathonpg.com
MarketsandMarkets | Dallas, TX | Market research | (888)600-6441; www.marketsandmarkets.com
Massachusetts General Hospital | Boston, MA | Hospital | www.massgeneral.org
Massachusetts Institute of Technology (MIT) | Cambridge, MA | University | www.mit.edu
Mattersight | Chicago, IL | Analytics for customer-employee interactions | (877)235-6925; www.mattersight.com
Max Sound | San Diego, CA | Apps for enhanced Audio, Video and Data transmissions | http://maxd.audio
MetaMind | Palo Alto, CA | Artificial Intelligence technology | www.metamind.io
Microsoft | Redmond, WA | Various applications, products, and services | (206)454-2030; www.microsoft.com
MindMeld (formerly Expect Labs) | San Francisco, CA | Natural language interpretation platform | https://mindmeld.com
Mobify | Vancouver, BC, Canada | Mobile customer engagement platform | www.mobify.com
Mobvoi (Chumenwenwen) | Beijing, China | Mobile voice search and apps | www.mobvoi.com
Narrative Science | Chicago, IL | Software writing news articles | (312)477-0590; www.narrativescience.com
National Cheng Kung
University
National Institute of
Information and
Communications
Technology of Japan
(NICT)
NeoSpeech
Nest Labs (acquired by
Google)
Netflix
nGUVU
noHold
NPD Group
NTT (Nippon Telegraph
and Telephone
Corporation)
NTT Communications
(subsidiary of NTT)
Nuance Communications
Nvidia
Oculus VR
Onix Networking Corp.
OpenAI
OpenPOWER Foundation
Opus Research
Pathful
PatternEx
Taiwan
University
http://english.web.ncku.edu.tw
Tokyo, Japan
Santa Clara,
CA
Research center
Text-to-speech and other
speech technologies
Advanced thermostat
and fire alarm
www.nict.go.jp
Movie rental
www.netflix.com
Cloud games
Web-based self-service
solutions
www.nguvu.com
Market research
(516)625-0700; www.npd.com
Palo Alto, CA
Beverly Hills,
CA
Montréal,
Canada
Milpitas, CA
Port
Washington,
NY
Tokyo, Japan
Tokyo, Japan
Burlington, MA
Santa Clara,
CA
Irvine, CA
Lakewood, OH
Telephony and other
solutions
Telephony and other
solutions
Speech technology,
applications, and
services
Graphics chips
Virtual reality headsets
IT solutions and services
--
Artificial Intelligence research
--
Technical organization
San Francisco, CA
Conversational Access Technologies research and consulting
Vancouver, BC, Canada
Machine learning-based technology for behaviour-based customer targeting
Peel
Perfect Pitch
Persado, Inc.
San Jose, CA
Mountain View,
CA
Utah
New York, NY
Qualcomm Inc.
San Diego, CA
Rage Frameworks
Raspberry Pi
ReportsnReports.com
Dedham, MA
--
Dallas, TX
AI for cyber security
Home entertainment and
home control
Call center services
"Persuasion Automation"
Chips and wireless
devices
Knowledge-based
automation technology
ARM software platform
Market research
www.neospeech.com
www.nest.com
www.nohold.com
(81)3-3509-3101; www.ntt.co.jp
www.ntt.com
(617)428-4444; www.nuance.com
(408)486-2000; www.nvidia.com
(949)502-2070; www.oculusvr.com
www.onixnet.com
http://openai.sourceforge.net
openpowerfoundation.org
(415)904-7666; www.opusresearch.net
(866)662-0786; www.pathful.com
(408)416-5322; www.patternex.com
www.peel.com
www.perfectpitchtech.com
(646)678-3400; www.persado.com
(619)651-7942; www.qualcomm.com
www.rageframeworks.com
www.raspberrypi.org
www.ReportsnReports.com
Sears Holdings
San Francisco,
CA
Seoul, South
Korea
Walldorf,
Germany
Hoffman
Estates, IL
Sensory, Inc.
Santa Clara,
CA
Sentient Technologies
San Francisco, CA
Artificial intelligence software
Salesforce
Samsung Electronics
SAP
CRM and sales support
software
Wireless telephones and
TVs
(415)901-7000; www.salesforce.com
Enterprise software
+49 180 534-34-24; www.sap.com
Retailer
Embedded speech
recognition and speaker
ID
www.searsholdings.com
www.samsung.com
(408)625-3300; www.sensory.com
Staples
(415)422-9886; www.sentient.ai
New York, NY
Children's entertainment and education
www.sesameworkshop.org
Ottawa, Canada
Software for online stores and retail point-of-sale systems
www.shopify.com
New York, NY
CRM and marketing automation technology
(855)606-4900; www.signpost.com
San Francisco, CA
Collaboration and communications platform
https://slack.com
Minato-ku, Tokyo, Japan
Pepper robot
www.softbank.jp/en/robot
Tokyo, Japan
Consumer electronics, including wearable
www.sony.com
Santa Clara, CA
Music identification and natural language speech interaction
(408)441-3200; www.soundhound.com; www.houndify.com
Seattle, WA
Data enrichment, data clean-up and data labeling
www.spare5.com
Dallas, TX
Security analytics
www.sparkcognition.com
Palo Alto, CA
Personal assistant mobile app
www.speaktoit.com
Vienna, Austria and Alpharetta, GA
Dictation solutions
www.dictation.philips.com
--
Shopping service
www.shopspring.com
Menlo Park, CA
Speech recognition and language R&D
(650)859-2000; www.sri.com
Stanford, CA
University
(650)723-2300; www.stanford.edu
Framingham, MA
Office supplies and equipment
www.staples.com
StarVest Partners
New York, NY
Venture capital
Strategy Analytics
Superflex (SRI
International)
Swedbank
Syfy
Newton, MA
Market reports
--
Sweden
--
Seoul, South Korea
Body suit for disabled
Bank
Entertainment
Sesame Workshop
Shopify
Signpost
Slack
SoftBank Robotics
Sony Corporation
SoundHound
Spare5
SparkCognition
Speaktoit, Inc.
Speech Processing
Solutions (Philips)
Spring
SRI International
Stanford University
Systran Software
Machine translation
(212)863-2500; www.starvestpartners.com
(617)614-0700; www.strategyanalytics.net
https://www.sri.com/sites/default/files/brochures/superflex.pdf
www.swedbank.com
www.syfy.com
www.systransoft.com
Taco Bell
Downey, CA
TensorFlow
--
TermSet
Thomson Reuters
Toyota
Toyota Research Institute
(TRI)
London, UK
New York, NY
Japan
Research
group
Ultracomms
Unibet
Fareham, UK
Gzira, Malta
Unilever
New York, NY
University of Illinois Urbana
Urbana, IL
University of Science and
Technology of China
(USTC)
Hefei, China
Utility
Decatur, Georgia
Restaurant chain
Open-source software
library for machine
intelligence
Search, governance and
navigation within
SharePoint
Information provider
Automobiles and robots
University
http://en.ustc.edu.cn
WinScribe
Chicago, IL
Wise.io
Berkeley, CA
Wit.ai
x.ai
Palo Alto, CA
Yahoo
YouTube
Zendesk
New York, NY
Santa Clara, CA
San Bruno, CA
San Francisco, CA
+44 203 086 8080; www.termset.com
http://thomsonreuters.com
www.toyota.com
www.ultracomms.com
www.unibet.com
(212) 906-4694; (212) 906-4666 (fax); www.unilever.com
www..illinois.edu
VoiceXML Forum
Voicebox Technologies
www.tensorflow.org
Cloud-based contact
center services
Betting application
Food, home, and
personal care products
University
Body-worn cameras
Bedminster, NJ
Wireless telephone services
Provo, UT
Smart home technology
Pittsburgh, PA
Speech recognition technology
Bellevue, WA
Conversational voice technology
New York, NY
Voice eXtensible Markup Language
Verizon Wireless
Vivint Smart Home
Voci Technologies
Incorporated
www.tacobell.com
Dictation solutions
Customer service
support using machine
learning
Natural language
interpretation tool
AI-based virtual assistant
Web resources
Online video clips
Web-based help desk
software
www.utility.com
www.verizonwireless.com
www.vivint.com
(412)621-9310; www.vocitec.com
(425)968-7900; www.voicebox.com
(732)465-6486; www.voicexml.org
(866)494-6727; www.winscribe.com
www.wise.io
https://wit.ai
https://x.ai
(408)731-3300; www.yahoo.com
(650)253-0000; www.youtube.com
(415)418-7506; www.zendesk.com
Blog (with a chance to comment!)
The Software Society (www.thesoftwaresociety.com)
THE HUMAN-COMPUTER CONNECTION
· The Language User Interface -- Be there or be late
· Will Artificial Intelligence someday dominate humans?
· The end of advertising?
· Redistributing income won’t solve the jobs problem
· Conference Emphasizes the Critical Role of Language Technology in the Usability of Digital Systems
I wish to subscribe to Speech Strategy News for one year (12 issues), payable in US$ on US bank—
Individual*, PDF, 6 monthly issues: $215
Corporate*, PDF, 6 monthly issues: $750*
Individual*, PDF, 12 monthly issues: $425
Corporate*, PDF, 12 monthly issues: $1,495*
* Corporate subscriptions: Unlimited users within a corporation for PDF version with Web access through corporate
password. Individual subscriptions cannot be shared (neither passwords nor electronic copies).
Please invoice me.
Or go to www.tmaa.com/subscribetossn
Please send information on your consulting.
Name:
Company:
Address:
Check enclosed, payable to TMA Associates
(in U.S. $ on a U.S. bank).
Invoice me.
Charge my—
Visa MasterCard American Express
City, State
ZIP/Postal code
Card #
Country
Expiration date:
Email (required for email alerts or a Web subscription):
Signature:
_______________________________________________
Phone:
Copyright TMA Associates 2016; All rights reserved. TMA Associates, P.O. Box 570308, Tarzana, CA 91357-0308 USA. Tel: (818) 708-0962.
LUI News (formerly Speech Strategy News) is published twelve times per year by TMA Associates, Editor: William S. Meisel. Trademarks mentioned
in this publication are the property of the companies mentioned; they are used editorially. The material herein is based on data from sources
believed to be reliable, but is not guaranteed as to accuracy and does not purport to be complete. From time to time, the author or TMA Associates
may have consulting assignments, advisory positions, own stock, or have other business relations with organizations in speech recognition and
associated areas, including companies discussed in this newsletter. LUI News and Speech Strategy News are trademarks of TMA Associates.
A mystery novel by Bill Meisel: Technically Dead. It takes place in the near future
with technology familiar to readers of this newsletter.
Check out the reviews!
Readers' Favorite https://readersfavorite.com/book-review/technically-dead
US Review of Books www.theusreview.com/reviews/Technically-Dead-by-WilliamMeisel.html
Kirkus Reviews www.kirkusreviews.com/book-reviews/william-meisel/technically-dead/
Foreword Clarion Reviews www.forewordreviews.com/reviews/technically-dead