Voice User Interface Strategy - Opportunities and Risks of Voice Control

by Marisa Tschopp
time to read: 10 minutes

Key points

How to integrate Voice User Interfaces

  • Voice User Interface, also known as VUI, refers to user interfaces that are controlled by voice
  • VUIs are not yet very widespread in Switzerland
  • Due to the Corona Pandemic (among other things), VUIs could be relevant in the future
  • An early theoretical discussion of opportunities, risks and structural requirements is important
  • In addition to the VUI strategy, benefit evaluation and prototyping should start at the beginning and continue throughout

The Corona Pandemic has kept the world in uncertainty since the beginning of 2020, plunging many nations and people into a deep crisis. Hardly anyone will be able to continue as before in the next few years; behaviour and processes will rightly be put to the test. The motto is: Necessity is the mother of invention. Courses are attended, work and events are moved to the virtual world, and a wide variety of ideas are being developed to improve the current situation and the future. In the Research Department, we are investigating whether voice user interfaces (VUI), i.e. machines that are controlled by speech, will experience an upswing as a result of stricter hygiene requirements. The potential of VUIs will be evaluated in the coming weeks. After all, if VUIs become really interesting in four or five years' time, one should start working on the topic in advance.

The variety of creative ideas, problem solutions, jokes and videos at this time is remarkable: from the greenhouse that serves as a mini restaurant, to floor signs for keeping distance, to the Bavarian drive-in with fresh Schnitzel to go. All these measures serve the purpose of physical (or social) distancing, i.e. keeping physical distance. The aim is to offer fewer contact points where viruses and bacteria are present and spread, especially in public spaces. Exogenous shocks such as the Corona Pandemic not only give free rein to creativity, but also bring remarkable speed, especially in implementation. The focus is clearer, the pressure is high and lethargy has no place. Critical voices are often ignored. These are the best conditions for innovation, for better or worse. Voice control per se is not a new idea, but it is still an innovation. Voice user interfaces are not yet widely used in Switzerland, although machine speech processing has made considerable progress since the 1960s. Back then, IBM proudly presented a shoebox-sized machine that could process 10 numbers and a few words by voice.

While almost every second American has a smart speaker, and Amazon's voice assistant Alexa in particular is very strongly represented in the USA, the VUI experiences of the Swiss are based mainly on the digital assistants on their smartphones (Amazon is not yet present in the Swiss market). Apple's Siri can make appointments or shopping lists by voice command, and the Google Assistant can call someone or play music without the user having to open the app. Anyone in Switzerland who has a TV box from the provider Swisscom can now use the wake word "Hey Swisscom" to watch the latest programmes on TV or operate the smart home without a remote control, provided everything is set up accordingly.

Voice User Interfaces have an Image Problem

Saving a few seconds may not win a Nobel Prize at first glance, but it has to be said that Voice User Interfaces are actually quite well set up technically. What is lacking is adoption, because VUIs have an image problem rather than a tech problem. Good technology alone is by no means enough to be successful; success also requires good communication, including all facets of marketing. Many things are theoretically possible with voice, but does #voicefirst make sense? A meaningful option would be voice control in public spaces, such as doors that open by voice, packages that can be signed for by voice, or food orders in hospital rooms or at McDonald's.

The problem, however, is that voice control in public spaces is rather difficult, as many people simply feel uncomfortable talking to a machine in public. Could control by gestures be a relevant alternative here? Google is working on several projects; for example, music playing on a smart speaker can be stopped by holding up a hand. Can this be transferred safely to applications in public space? It is a fact that touch points have to be reduced, but this requires thinking systematically through the possible applications of VUI. This includes at least the following three levels of feasibility: technological, regulatory and ethical.

Voice User Interface Strategy: Is it feasible? Is it allowed? Is it reasonable?

It is clear that technological feasibility alone does not mean that using a VUI makes sense. The distinction between public and private space is essential. Calling out the PIN when withdrawing money from an ATM? While it would be desirable to eliminate the physical entry of the PIN, voice control is hardly the right tool for this.
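The three questions in the heading can be read as a simple gate that a use case has to pass on every level. The following Python sketch is only an illustration of that idea: the class name, the field names and the yes/no answers for the ATM example are assumptions made here, not part of the course material.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FeasibilityCheck:
    """Hypothetical record of the three feasibility questions for one use case."""
    use_case: str
    technologically_feasible: bool  # Is it feasible?
    legally_permitted: bool         # Is it allowed?
    ethically_reasonable: bool      # Is it reasonable?

    def failed_levels(self) -> List[str]:
        """Return the names of the levels on which the use case fails."""
        levels = {
            "technological": self.technologically_feasible,
            "regulatory": self.legally_permitted,
            "ethical": self.ethically_reasonable,
        }
        return [name for name, ok in levels.items() if not ok]

    def passes(self) -> bool:
        """A use case passes only if no level fails."""
        return not self.failed_levels()


# Illustrative judgement only: the answers below are assumptions for the ATM example.
atm_pin = FeasibilityCheck(
    use_case="Speaking the PIN aloud at a public ATM",
    technologically_feasible=True,
    legally_permitted=True,
    ethically_reasonable=False,
)
print(atm_pin.passes(), atm_pin.failed_levels())
```

Treating the levels as a gate rather than a score reflects the point that a single "no" is enough to rule out a use case, however good the technology is.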

Voice User Interface Strategies are a Balancing Act on many Levels

Things get a little more delicate when a VUI is technologically feasible and permitted by regulation, but ethically questionable. VUI is not only an extension of the user journey, but can also make jobs or services, such as customer service, obsolete. Classic examples are call centers in the insurance and banking industry. Under the guise of relieving employees of monotonous tasks or offering 24-hour service for those customers who want to discuss mortgage interest rates at night, jobs could be cut in order to save money in the long run. But the economic benefit is not the only benefit that is relevant. Prof. Dr. R. Hofstetter from the University of Lucerne differentiates further into functional, process-related, emotional and social benefits. An evaluation according to this pattern goes somewhat deeper than the three levels of feasibility. It makes it possible to use quantitative benchmarks as decision support and thus to base a decision on several criteria.
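One way to turn such a multi-criteria evaluation into a quantitative benchmark is a weighted score across the benefit types. The sketch below is a minimal illustration under stated assumptions: the 1 to 5 rating scale, the weights, the function name and the example ratings are all hypothetical and are not taken from Prof. Hofstetter's lecture.

```python
from typing import Dict

# Benefit types named in the text; the scoring scheme itself is an assumption.
BENEFIT_DIMENSIONS = ("economic", "functional", "process", "emotional", "social")


def benefit_score(ratings: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the ratings (assumed scale: 1 = low, 5 = high)."""
    missing = [d for d in BENEFIT_DIMENSIONS if d not in ratings or d not in weights]
    if missing:
        raise ValueError(f"missing rating or weight for: {missing}")
    total_weight = sum(weights[d] for d in BENEFIT_DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(ratings[d] * weights[d] for d in BENEFIT_DIMENSIONS)
    return weighted_sum / total_weight


# Hypothetical ratings for "food orders by voice in hospital rooms".
hospital_ratings = {
    "economic": 2, "functional": 4, "process": 4, "emotional": 3, "social": 3,
}
equal_weights = {d: 1.0 for d in BENEFIT_DIMENSIONS}

print(benefit_score(hospital_ratings, equal_weights))  # prints 3.2
```

Comparing several candidate use cases with the same weights gives a rough ranking; the weights themselves are a strategic choice and should be agreed on before the ratings are collected.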

Techsolutionism: Technology is not the Solution for Everything

Today’s society, influenced by the power of tech giants such as Amazon or Google, tends to look for a technical solution first for many problems (techsolutionism). But such tech patches often only alleviate the symptoms of deep-seated problems, do not address the cause and, in the worst case, create even more problems. Even the Corona Pandemic cannot be conjured away by an app (contact tracing debate). But embedded in a sensible system (testing, distance, etc.), technology can very well make a valuable contribution. It should go without saying that the motto "move fast, break things" is not followed and that democratic values are the guiding theme, far away from surveillance capitalism.

VUI can be Fun

Finally, it should not be forgotten that there is something else besides the whole debate about functionality and regulation: the enthusiasm for new technologies. Many people with an affinity for technology love to try out new gadgets and push the limits of what is possible in a playful way. It is often forgotten that VUI can be fun and can therefore have a positive influence on the user experience and on emotional attachment to a company. Why not try something out? Think about an exciting application that is safe, legal and fun. The problem is obvious: time is money, and development is still very expensive, especially if you want to build the system yourself. Switching to Alexa, should Amazon at some point enter the Swiss market, would be cheaper, for example, but Amazon is not exactly known for a good reputation in data management.

Conclusion

The Corona Pandemic is a crisis and at the same time a driver of innovation. Voice User Interfaces (VUI) could experience an upswing as a way to reduce physical contact points. Not only must a distinction be made between public and private space, but the different benefits (What benefits does a VUI have?) must also be systematically evaluated in a well-defined use case backed by existing data. There are a variety of conceivable scenarios and the benefits are quite exciting: for example, a good VUI can reduce screen time or cognitive load. This means that it allows multitasking, such as controlling the navigation system by voice while driving. But there are also major challenges, such as including people who cannot speak well or cannot speak at all. In summary, if resources are available, it is crucial that the use case is adequate and that a VUI strategy is developed in an interdisciplinary way from the beginning, with prototyping running in parallel.

Notes on the Voice User Interface Strategy series

This article was written in the course of the continuing education course Voice User Interface Strategy of the Lucerne University of Applied Sciences and Arts. Contents reflect the author’s opinion and are based on the lectures of Markus Maurer (Farner Lab), Prof. Dr. Reto Hofstetter (University of Lucerne) and Tim Kahle (169Labs).

About the Author

Marisa Tschopp (Dr. rer. nat., University of Tübingen) is actively engaged in research on Artificial Intelligence from a human perspective, focusing on psychological and ethical aspects. She has shared her expertise on TEDx stages, among others, and represents Switzerland on gender issues within the Women in AI Initiative. (ORCID 0000-0001-5221-5327)
