The Future Of Voice: Part 1

Chris Bunch 6th April 2018
Voice technology

Voice technologies are everywhere. You can’t avoid seeing someone using voice for “something”, whether that be a five-year-old playing their favourite music on an Echo, or an adult in a pub settling a debate via smartphone.

We decided to put together a series of blog posts commenting on the future of voice technology. In this post Chris Bunch, Head of Cloudreach Europe, looks at how enterprises are adopting this technology.

 

The future of voice technology in the enterprise

 

In the enterprise world, AWS launched Alexa for Business at re:Invent with some basic controls and skills, whilst Microsoft are embedding Cortana in pretty much everything, as are Apple with Siri and Google with their Assistant. As with so many things in tech, there’s a war – in part for mass adoption, and in part for the data itself. Not solely from an advertising/targeting perspective, as one might think, but from a machine learning perspective – the more data you have, the more accurate your models can become.

It is definitely fair to say, though, that most people don’t use voice that often – especially in an enterprise context. Whilst I use Assistant on my phone a lot, I barely use Siri at all on the various MacBooks/iDevices I own. And I’m typing this blog by hand, not dictating it ;-). However, the more I, and others, get used to voice as an interface, the more we’ll demand it at work.

 

Where are enterprises using Voice?

Cloudreach have seen some initial demand recently, with several Alexa-based projects completed in the past few months. I wanted to share some of the high-level details from two of them:

1) This project was for a nationalised transportation company in Europe. This was really our first foray into the world of Alexa for a paying client, beyond hackathons, and we created a Skill* enabling end users to book taxi services with the company, purely through voice. Awesome, and perhaps a more pleasant experience than fiddling with your Uber/Lyft app at an airport.

*Skills are basically Alexa’s version of an app; see here for a public list of the Skills AWS and partners offer, mostly consumer-facing of course.

Lessons learned? It opened my eyes personally to how quickly we could get a working solution together, as the client already had some solid APIs available to integrate with. It also highlighted the importance of a reliable Internet connection, without which, these technologies struggle – no one wants Alexa to tell them about a network timeout… We also realised that we need Amazon (in this example) to provide more native language support beyond English (note that German, French and Japanese are now available), for true global adoption to be able to kick in.

2) For an industrial client in Germany, we developed a solution to connect their forklift trucks back to their service centre. Why? Well, their drivers typically don’t report errors until they become major problems, as they regard fault reporting as painful. Humans, eh? So, we simplified this using a nice big blue IoT button to allow them to fault report quickly and from their cabs.

Lessons learned? Not that many on the Voice side actually, but it’s fair to say that an Alexa <-> Azure AD <-> SAP integration is fiddlier than one would like.
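
To make the first project a little more concrete, here is a minimal sketch of what the back end of a taxi-booking Skill could look like as a Python Lambda handler. It is not the code we shipped: the intent name (BookTaxiIntent), the slot name (pickup), the booking URL and the eta_minutes field are all hypothetical, and the only part taken from Amazon is the general shape of the JSON request and response for a custom Skill. It also shows the "network timeout" lesson in practice: if the booking API cannot be reached, the user hears a polite apology rather than an error.

```python
import json
import urllib.request

# Hypothetical endpoint: the client's real booking API is not described here.
BOOKING_API_URL = "https://api.example.com/taxi/bookings"


def build_speech_response(text, end_session=True):
    """Wrap plain text in the JSON envelope a custom Alexa Skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def book_taxi(pickup, timeout_seconds=3):
    """POST a booking to the (hypothetical) client API and return its JSON reply."""
    payload = json.dumps({"pickup": pickup}).encode("utf-8")
    request = urllib.request.Request(
        BOOKING_API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_seconds) as reply:
        return json.loads(reply.read().decode("utf-8"))


def lambda_handler(event, context, booking_client=book_taxi):
    """Route an incoming Alexa request and answer in plain speech.

    booking_client is injectable so the handler can be tested offline.
    """
    request = event.get("request", {})
    intent = request.get("intent", {})

    if request.get("type") == "LaunchRequest":
        return build_speech_response(
            "Where would you like to be picked up?", end_session=False
        )
    if request.get("type") != "IntentRequest" or intent.get("name") != "BookTaxiIntent":
        return build_speech_response("Sorry, I can only book taxis at the moment.")

    # Assumed slot name; a real Skill defines its slots in the interaction model.
    pickup = intent.get("slots", {}).get("pickup", {}).get("value")
    if not pickup:
        return build_speech_response(
            "Where would you like to be picked up?", end_session=False
        )

    try:
        booking = booking_client(pickup)
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return build_speech_response(
            "Sorry, I couldn't reach the taxi booking service. "
            "Please try again in a moment."
        )

    eta = booking.get("eta_minutes")  # hypothetical field in the API reply
    if eta is None:
        return build_speech_response(f"Your taxi from {pickup} is booked.")
    return build_speech_response(
        f"Your taxi from {pickup} is booked and should arrive in about {eta} minutes."
    )
```

The short timeout is deliberate: a spoken interface feels broken far sooner than a spinner on a screen does, so it seems better to fail fast and say so.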

 

What’s going on elsewhere?

Away from Cloudreach, what other moves are we seeing in the industry in general?

Well, in the race for enterprise collaboration Microsoft are using voice as a differentiator by adding Cortana functionality into their Teams product. Would you like some automated meeting notes taken for you? Automatically translated into multiple languages? Sounds cool. How about when they’re automatically attributed to you via face recognition? Maybe not…

This page documents some Alexa use cases, including how WeWork use it for room booking, fault reporting, etc. More generally, AWS are pushing integrations with SaaS platforms like SFDC – they’re currently basic, but adding useful functionality in places, e.g. updating an Account record while travelling back from a meeting.

 

What’s not so great?

Well, as argued here, the functionality is pretty limited and targeted at simple requests currently – i.e. this is first-generation technology right now, and may not change the world for a few more iterations. I think that’s a fair statement, although with the Machine Learning wars very much active, and training data increasing dramatically every day, I think things will evolve faster than one might expect.

I mentioned language support above, but within language there’s another complex nuance – accents. As noted by Wired, and a few people in our Edinburgh office(!), voice tech doesn’t always work that well for those speaking with an accent, whether regional or national. This is fair, especially when combined with a noisy environment, and a hard problem to solve – not just for machines. I once chatted to a group of people in a bar in Scotland, and came away thinking they were Danish. Turns out they were from Glasgow.

An interesting, thought-provoking article appeared a few weeks back, asking whether voice assistants should auto-report crimes, given they’re “always listening”. Ignoring the many technical challenges with this (e.g. distinguishing a child screaming when punched by a sibling, vs. someone being genuinely attacked), it opens an interesting moral and legal can of worms (always think “GDPR” folks!) regarding the recording of speech in the workplace and related permissions, storage, assurances, etc.

 

What does the future hold?

One thing that is important to note is that voice technology can be a real game changer right now for those with certain disabilities. In these cases voice technology can provide a much easier interface to the world. As I’ve remarked before, this empathetic view of the world is very much one that Microsoft are leading with from the top down, and something more of us could and should think about more frequently.

For more comments on the uses of voice technology in the consumer world, watch this space for our follow-up post from resident Voice expert Neil Stewart.

Overall, it’s definitely early days for voice in the enterprise – but I would argue now is the time to start experimenting, now is the time to build skills in this area, and now is the time to start innovating. If you don’t, someone else will.