Human and AI
Marisa Tschopp
How to integrate Voice User Interfaces
The variety of creative ideas, solutions, jokes and videos emerging at this time is remarkable: from the greenhouse that serves as a mini restaurant, to floor signs for keeping distance, to the Bavarian drive-in with fresh schnitzel to go. All these measures serve the purpose of physical (or social) distancing. The aim is to reduce the number of contact points where viruses and bacteria are present and can spread, especially in public spaces. Exogenous shocks such as the corona pandemic not only give free rein to creativity but also bring remarkable speed, especially in implementation. The focus is clearer, the pressure is high and lethargy has no place. Critical voices, too, are often neglected. These are the best conditions for innovation, for better or worse. Even if voice control per se is not a new idea, it is an innovation. Voice user interfaces are not yet widely used in Switzerland, although machine speech processing has made considerable progress since the 1960s. Back then, IBM proudly presented a shoebox-sized machine that could process 10 numbers and a few words by voice.
While almost every second American owns a smart speaker, and Amazon’s voice assistant Alexa in particular is very strongly represented, the Swiss experience of VUIs is based mostly on the digital assistants on their smartphones (Amazon is not yet present in the Swiss market). Apple’s Siri can make appointments or shopping lists by voice command; the Google Assistant can call someone or play music without the user having to open an app. Anyone in Switzerland who has a TV box from the provider Swisscom can now use the wake word “Hey Swisscom” to watch the latest programmes or operate the smart home without a remote control, provided everything is set up accordingly.
Saving a few seconds may not win a Nobel Prize, but it has to be said that voice user interfaces are actually quite mature technically. What is lacking is adoption, because VUIs have an image problem rather than a tech problem. Good technology alone is by no means enough to be successful; success also requires good communication with all facets of marketing. Many things are theoretically possible with voice, but does #voicefirst make sense? A meaningful option would be voice control in public spaces, such as doors that open by voice, packages that can be signed for by voice, or food orders in hospital rooms or at McDonald’s.
The problem, however, is that voice control in public spaces is rather difficult, as many people simply feel uncomfortable talking to a machine in public. Could gesture control be a relevant alternative here? Google is working on several projects; for example, music playing on a smart speaker can be stopped by holding up a hand. Can this be transferred safely to applications in public space? It is a fact that touch points have to be reduced, but this requires systematically thinking through the possible applications of VUI. This includes at least the following three levels of feasibility: technological, regulatory and ethical.
Voice User Interface Strategy: Is it feasible? Is it allowed? Is it reasonable?
It is clear that just because something is technologically feasible, using a VUI does not always make sense. The distinction between public and private space is essential. Calling out the PIN when withdrawing money from an ATM? While it would be desirable to eliminate the physical entry of the PIN, voice control is hardly the right tool for this.
Things get a little more delicate when a VUI is feasible in technological and regulatory terms but ethically questionable. A VUI is not only an extension of the user journey; it can also make jobs or services, such as customer service, obsolete. Classic examples are call centers in the insurance and banking industries. Under the guise of relieving employees of monotonous tasks, or of offering 24-hour service for those customers who want to discuss mortgage interest rates at night, jobs could be cut to save money in the long run. But the economic benefit is not the only benefit that is relevant. Prof. Dr. R. Hofstetter from the University of Lucerne differentiates even further into functional, process-related, emotional and social benefits. An evaluation according to this pattern goes somewhat deeper than the three levels of feasibility and makes it possible to use quantitative benchmarks as decision support, so that a decision rests on several criteria.
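To make the idea of quantitative decision support more tangible, here is a minimal sketch in Python. It first applies the three feasibility questions as a gate and then computes a weighted score over the four benefit types named above. The weights, the 0 to 10 ratings and the two example use cases are purely illustrative assumptions and are not taken from the lectures or from Prof. Hofstetter's work.

```python
from dataclasses import dataclass, field

# Hypothetical weights for the four benefit types named in the text.
# They are assumptions for illustration and sum to 1.0.
BENEFIT_WEIGHTS = {
    "functional": 0.4,
    "process": 0.3,
    "emotional": 0.2,
    "social": 0.1,
}


@dataclass
class UseCase:
    name: str
    feasible: bool    # Is it feasible? (technological)
    allowed: bool     # Is it allowed? (regulatory)
    reasonable: bool  # Is it reasonable? (ethical)
    # Hypothetical ratings from 0 (no benefit) to 10 (high benefit).
    benefits: dict = field(default_factory=dict)


def passes_feasibility(use_case: UseCase) -> bool:
    """All three feasibility questions must be answered with yes."""
    return use_case.feasible and use_case.allowed and use_case.reasonable


def benefit_score(use_case: UseCase) -> float:
    """Weighted sum of the benefit ratings; missing ratings count as 0."""
    return sum(
        weight * use_case.benefits.get(category, 0)
        for category, weight in BENEFIT_WEIGHTS.items()
    )


def evaluate(use_case: UseCase):
    """Return the rounded benefit score, or None if a feasibility gate fails."""
    if not passes_feasibility(use_case):
        return None
    return round(benefit_score(use_case), 2)


# Two illustrative use cases with made-up ratings.
door = UseCase(
    "voice-operated door", feasible=True, allowed=True, reasonable=True,
    benefits={"functional": 8, "process": 6, "emotional": 5, "social": 7},
)
atm_pin = UseCase(
    "spoken PIN at an ATM", feasible=True, allowed=True, reasonable=False,
    benefits={"functional": 6, "process": 5, "emotional": 1, "social": 1},
)

for case in (door, atm_pin):
    print(case.name, evaluate(case))
```

In this sketch a use case that fails any of the three feasibility questions is discarded before its benefits are scored at all, which reflects the order of the argument above: feasibility first, benefits second.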
Today’s society, influenced by the power of tech giants such as Amazon or Google, tends to look for a technical solution first for many problems (tech solutionism). But such tech patches often only alleviate the symptoms of deep-seated problems, do not address the cause and in the worst case create even more problems. Even the corona pandemic cannot be conjured away by an app (see the contact tracing debate). Embedded in a sensible system (testing, distancing, etc.), however, technology can very well make a valuable contribution. It should be a matter of course that the motto “move fast, break things” is not followed and that democratic values are the guiding principle, far away from surveillance capitalism.
Finally, it should not be forgotten that there is something else besides the whole debate about functionality and regulation: the enthusiasm for new technologies. Many people with an affinity for technology love to try out new gadgets and push the limits of what is possible in a playful way. It is often forgotten that a VUI can be fun and can therefore have a positive influence on the user experience and on emotional attachment to a company. Why not try something out? Think about an exciting application that is safe, legal and fun. The problem is obvious: time is money, and development is still very expensive, especially if you want to build it yourself. Switching to Alexa, should Amazon at some point enter the Swiss market, would be cheaper, for example, but Amazon is not exactly known for a good reputation in data management.
The corona pandemic is both a crisis and a driver of innovation. Voice user interfaces (VUI) could experience an upswing as a way to reduce physical contact points. Not only must a distinction be made between public and private space; the different benefits (what benefits does a VUI have?) must also be systematically evaluated in a well-defined use case backed by existing data. There are a variety of conceivable scenarios and the benefits are quite exciting: a good VUI can, for example, reduce screen time or cognitive load. This means that it allows multitasking, such as controlling the navigation system by voice while driving. But there are also major challenges, such as including people who cannot speak well or at all. In summary, if resources are available, it is crucial that the use case is adequate and that a VUI strategy is developed from the beginning, in an interdisciplinary way and with simultaneous prototyping.
This article was written as part of the continuing education course Voice User Interface Strategy at the Lucerne University of Applied Sciences and Arts. The contents reflect the author’s opinion and are based on the lectures of Markus Maurer (Farner Lab), Prof. Dr. Reto Hofstetter (University of Lucerne) and Tim Kahle (169Labs).