Voice recognition technology: the good, the bad and the ugly
Brands should start to find their own ‘voice’ or risk playing catch up when we enter a browserless world, writes Nick Graham
I’ve had my Google Home for less than six months and it’s been a (metaphorical) life-saver. Not only do I know when to pack an umbrella in spring, I also know what time it is in Australia so I can message friends. I’ve put a stop to meaningless conversations with my wife (“What was Kanye West’s fourth album called?”), creating more time for me to gloat.
However, beyond the few useful moments I’ve experienced, overall I’m not impressed: The most returned phrase I’ve heard is “Sorry, I can’t help with that yet”.
What do you mean you can’t help with that yet? You’ve now reached the big stage, the mainstream. You’ve been hailed as one of the most promising societal trends this year. You’ll be in 40% of UK households this year and can now be picked up for less than the cost of a coffee machine. And you can’t help me with that yet?
Voice features on almost every conference agenda today, as brands and agencies try to figure out how best to use it. And, as with any new tech, we are seeing mixed success. For every great initiative there are some clangers, as brands wrestle with how to tread the fine line between value added and pain in the ass.
The Good:
Perhaps the most anticipated use is powered by Apple’s CarPlay and Google’s Android Auto, whose tech takes away the distraction of screens and dials from drivers, allowing them to change the radio station or get mapping instructions using voice alone.
Nat Geo has developed Bravo Tango Brain Training, an app that offers mental health guidance tailored for military veterans, to be accessed in the privacy of their own home. This is just a step away from help being offered to sufferers of early-onset dementia, offering cues to short-term and long-term memories.
Finally, using voice (and body movements), Pepper, SoftBank’s human-shaped robot, can interact with people by reading their emotions, offering everybody a unique experience based on their current and previous interactions. While this is neither mainstream nor for everybody, this level of intelligent companionship could be a genuine life-changer for a lot of people.
The Bad:
Smart speakers are a personal extension of our quest for knowledge and are tailored to our tastes, preferences and habits. As a result, there has been much indignation when brands have tried to be clever, attempting to influence or take over the gadgets by initiating voice command through TV ads.
A 15-second spot from Burger King last year hijacked Google Home devices in the US by asking “OK Google, what is in a Whopper”, to which devices within earshot would respond with a prepopulated response from Wikipedia. While the ad won awards, it was also described by judges as ‘an abuse of technology’ and ‘invasive’. The very fact that a piece of work described in this way was celebrated so fervently goes to show how intrigued the ad industry is by the ways in which smart speakers can work for brands, and that it considers a little overstepping of the mark just fine.
However, brands should be careful here. Given how dominant distrust of brands and the media has recently become, consumers may not take kindly to having their devices hijacked for the purposes of a third, uninvited, party.
The (potentially) Ugly:
Hijacking the output is one thing, but take it further and smart speakers could be responsible for far more dangerous mimicry. Software can now mimic anybody’s voice based on a small recording sample. Although this can be used in very creative and harmless ways – such as the attempt by CereProc to give voice to the speech due to be delivered by JFK the day he was assassinated – there is no reason for it to stop there. Imagine the implications of somebody’s voice – and therefore a unique part of their identity – being used in a damaging situation, with listeners unaware that technology has played a part.
It’s interesting to listen to the tone and temperament of people speaking to a smart speaker and wonder if their attitude is reflective of their general manner with, say, waiting staff. When it comes to 21st century office etiquette there are two types of people: those who say please and thank you to Alexa, and those who don’t. Recently, ChildWise published a report warning parents that kids who demand information from their smart speakers might become more aggressive in the future, specifically with women, given most AI voices are female. In response, Amazon released an Alexa function that allowed parents to set ‘please’ and ‘thank you’ as requirements when issuing voice commands.
While smart speakers continually develop their responses to our demands and needs, who is responsible when those demands become more serious and emotionally charged? 41% of people who own a smart speaker suggest it’s like speaking with another person. I would love to see the introduction of smart speaker intelligence that can identify a human who is grieving, depressed or lonely, and respond accordingly, perhaps serving supportive content in response to their situation.
The Hopeful:
Voice as a medium is only on the up, and brands should start to find their own ‘voice’ or risk playing catch up when we enter a browserless world.
But, as with any new technology, we, as an industry, are primed to act fast, grab all the glory and be ‘innovative’, worrying about the consequences later. When it comes to such a personal platform and relatively intimate style of consumer interaction, however, brands should ask themselves if they are ready to really add value or if they risk turning people off their brand for a chance to say they were one of the first to help ruin the voice experience for everyone else. Don’t be sorry, just don’t do it! OK Google?
Nick Graham is digital strategist, MC&C Media