Voice User Interface Design

An Overview of Brand Personas

by Marisa Tschopp

time to read: 14 minutes

Keypoints

This is how you produce a face with a voice

Voice user interfaces are becoming increasingly popular
In addition to technical and functional aspects, brand personas play a role in the design process
A brand persona is a conceptual representation of an immaterial voice application and influences acceptance and user behavior
Embedded in the overall design process, the examination of brand personas brings advantages on the consumer and organizational psychological level

Without much fuss the Swiss Federal Railway SBB has launched a Google Action, an app for the Google Assistant (analogous to the individually buildable skills for Alexa), which runs via the voice control of the digital assistant. Command input by voice creates a new access and experience for users. Whoever instructs the Google Assistant to talk to SBB is the first to be greeted by the SBB bells that are familiar to public transport users. A digital assistant called Silba then provides information on timetables and prices. When Silba speaks to you, an image of Silba will involuntarily appears in your head. What does it look like? Does it have a nose? Is the image friendly? Professional? Conservative? These and many other questions about the design of voice user interfaces are asked by consumer researchers, conversational designers and marketing practitioners when it comes to the topic of brand personas.

Insight into our research on Voice User Interfaces

The Titanium Research Department has been monitoring developments in the field of digital assistants (often referred to as conversational AI) for several years. Various topics have already been considered, starting with the functioning, safety aspects and development potential of language assistants and the role of anthropomorphism, i.e. the humanization of machines. The core statement of the latter article is that only a conversation with a language assistant that is as natural as possible will lead to great success in terms of acceptance and benefit. We are still quite far away today from that, despite the impressive developments on the technological level, such as the performance of the restaurant reservation through the Google Duplex, which quite successfully pretended the restaurant to be a human being. Building on the initial work, embedded in a larger interdisciplinary research project, the A-IQ was developed, which is a kind of IQ test for voice-controlled digital assistants and is thus intended to serve as a playful way of exploring the competencies of voice assistants. The practical implementation of the gained knowledge was carried out in the development of an Alexa Skill for the VulDB (a vulnerability database which documents vulnerabilities, thus supporting administrators and security experts in their work).

With this approach of Research from Within we want to continuously build up a deep understanding of language as a control instrument of the future, both on a theoretical and practical level. Voice User Interfaces are considered to have great potential, but there are a number of open questions regarding the design of technology implementation in companies, which due to their complexity urgently require a transdisciplinary dialogue.

VUI: It is complicated

If VUI’s had a relationship statute, it would be: It’s complicated. There are no off-the-shelf applications that can be bought quickly, without creating further dependencies. The development and implementation needs commitment, time and money. Because how, for example, the voice interfaces are programmed, i.e. whether the digital assistant says Hey or Good afternoon, has a female, male or synthetic voice, makes humorous jokes or whether it should make recommendations on safety measures during the corona crisis, has an effect on perception. If a person hears a voice, an image is created in his head, that is inevitable. In an fascinating study at the JK University of Linz, researchers had test persons paint pictures of how they imagine the machine, depending on different voice sounds. For example, machines with a human voice, were more often painted with a nose, while mental representations of voices with synthetic/computer generated sound have more often been given wheels (Mara et al., 2020).

Which voice fits which robot. Mara et al., 2020

From the point of view of consumer psychology it is useful to understand how the different variables affect the perception of the individual user and what effect this has on acceptance and thus user behaviour. One of these variables is the so-called Brand Persona.

The role of personas in conversation design

If we are in the context of Voice User Interfaces where no avatar or mascot is associated to the voice, such as Clippy, we are (in short) talking about a Brand Persona, which is a purely mental manifestation in the minds of the users. A bundle of impressions of a voice applications becomes a mental picture that is supposed to reflect the values or image of a company, the needs of the target group in combination with the task to be performed by the voice application. This is important to understand, because:

Invisible Interface is not Faceless (Study by Metadesign on Voice Branding)

Wally Brill, Head Of Conversation Design Advocacy & Education at Google, has been working for over 20 years on the question of how to give a voice application a character and experiments with the various variables that are supposed to shape the image in the minds of consumers, including for example:

Audio signals: Like voice pitch and speed, additional sounds
Visual signals: Like the Google color pattern of the microphone or the type of wave motion during sound recording
Different forms of interaction: Like inserting emojis in the text track, if screen is available

The way of speaking or the jargon seems to be particularly important. It enables the programmers to influence the perception of the brand persona in a targeted way. The simplified example of how many possibilities there are to say ‘yes, shows how complex the process is to choose the right words. Does the assistant speak rather informally and humorously or, as in the example of SBB, rather formally and emotionlessly?

Formal YES	Yes, Gladly, Ok, [confirmation in the form of paraphrases…]
Informal YES	Sure, Okidoki, mhm, yup, 👍 [emoji thumbs up….]

Using the example of the brand persona, it quickly becomes clear: If an organization does not know its image, its values, its relationship to the user and its tasks, then it becomes difficult to choose the right ‘Yes’. What should the digital assistant look like to the customer? As a servant? As a boss? As a friend and helper? These questions have to be answered before a voice application goes into technical construction.

On a side note: With a voice project you can kill two (or more) birds with one stone. From an organizational psychological point of view, this preliminary exercise is worthwhile in itself, because in a participatory process, involving employees and managers, they work on these exciting questions of corporate culture and look for creative solutions to communicate their mission to the outside world. This can have a positive effect on the employees’ identification with the company, like a kind of team-building measure, which in the ideal case even brings economic benefits.

Mind the Gaps: Voicebot != Chatbot and other tips

In this context, the difference between chatbots and voicebots will be discussed in addition. Voicebots are not simply voicified chatbots or FAQs and differ from chatbots in many ways, especially in the design of the conversations and the brand personas, like Karen Kaushansky, Conversational Designer at Google, preaches in many talks and reports. Text and voice work completely different, have different design options, e.g. a text-based chatbot can show more options while a pure voice application without screen should be limited to two or three answer options. For example, in our Alexa Skill for the VulDB, we have two or three spoken suggestions, while more information can be displayed on the screen of the Alexa Echo Show.

However, a spoken dialogue is based on a completely different pattern or design. A dialogue can run in many different directions. A text, on the other hand, is guided in a controlled manner and is often very logically structured. This does not mean that dialogues cannot be logical, but they appear in a completely different way. The decisive difference can be shown very well by comparing it with the chatbot model: In a chatbot, it is always possible to offer several options for the further course of the conversation via text output (The design of a voice user interface – anything but simple, by Farner Lab, 2020)

Above all else, however, it must be remembered that the design of the brand persona is not the first and most important step, as first the use case, the type of access (e.g. mobile phone, smart speaker?), and the multimodality (e.g. screen, no screen?) must be clarified:

Let’s go back to the Example SBB: What distinguishes the Google Action as a voice application from a text-based chatbot? For those who are already voice users, it is simply convenient and corresponds to their preferences. But can Silba also inspire new non #Voicefirst followers? Our conclusion is: Yes and No. Is this an ideal use case to produce a Google Action?

Surely on a playful level a certain personal touch was generated by the nice SBB sound. On the other hand, the normal Google Assistant already produces very good results when you ask for the next train to Zurich, for example. So there are still a few questions left unanswered regarding the SBB Google Action, even though the application certainly needs to be given a certain amount of kudos regarding their attention to detail. It remains to be seen whether Silba will prevail, the use case does not stand out very clearly in any case. It is one of the most important steps in the design process to identify exactly this Raw Chicken Moment. The raw chicken moment is the aha-moment, and refers to the characteristics of a situation in which both hands are not operational, making a voice application really useful.

Conclusion

In addition to technical and functional aspects from previous reports, this article focused on the topic of brand persona in voice user interface design. Brand personas are a part of the design process that shapes the image of the voice application, i.e. the perception of the consumer. Through its design elements, the brand persona influences the trust and sympathy of the consumers and thereby affects the user and adoption behavior. Before a brand persona is designed, the requirements must be clarified: (1) There is a good use case (raw chicken moment) and the necessary (2) infrastructure, data and resources for a voice application. A readiness check can help here, such as this Voice Readiness Check. What happens when the basis is right?

Step 1: Clear understanding of the own image + values
Step 2: Development of the brand persona with various tools
Step 3: Transdisciplinary cooperation in speech design + programming (Yes, they exist, the Voice Architects!)

The development of the brand persona is perhaps not the most critical of all tasks in voice user interface design, but it is certainly one of the fun, creative tasks that can also be valuable on the level of corporate culture.

Notes on the Voice User Interface Strategy series

This article was written within the course Voice User Interface Strategy of the Lucerne University of Applied Sciences and Arts. Contents reflect the opinion of the author and are based on the lectures of Markus Maurer (Farner Lab), Karen Kaushansky and Wally Brill (Google Assistant Team) and Dr. Laura Dreessen (VUIagency).

Part 1: Strategies
Part 2: Design
Part 3: Prototyping
Part 4: Insights

References

Mara, M., Schreibelmayr, S., & Berger, F. (2020). Hearing a Nose? User Expectations of Robot Appearance Induced by Different Robot Voices. In HRI’20 Companion: ACM/IEEE Int. Conf. on Human-Robot Interaction, March 23-26, 2020, Cambridge, UK.

About the Author

Marisa Tschopp has studied Organizational Psychology at the Ludwig-Maximilians-University in Munich. She is conducting research about Artificial Intelligence from a humanities perspective, focusing on psychological and ethical aspects. She held different talks like for TEDx and is also board member of the global Women in AI (WAI) initiative. (ORCID 0000-0001-5221-5327)

You want to evaluate or develop an AI?

Our experts will get in contact with you!

Human and AI Art

Marisa Tschopp

Conversational Commerce

Marisa Tschopp

ChatGPT & Co.

Marisa Tschopp

TEDxBoston Countdown to AGI

Marisa Tschopp

You want more?

Further articles available here

You need support in such a project?

Our experts will get in contact with you!

You want more?

Further articles available here

Voice User Interface Design

An Overview of Brand Personas

Keypoints

Insight into our research on Voice User Interfaces

VUI: It is complicated

The role of personas in conversation design

Mind the Gaps: Voicebot != Chatbot and other tips

Conclusion

Notes on the Voice User Interface Strategy series

References

About the Author

Links

Tags

You want to evaluate or develop an AI?

Human and AI Art

Conversational Commerce

ChatGPT & Co.

TEDxBoston Countdown to AGI

You want more?

You need support in such a project?

You want more?