Do voicebots understand emotions?

I am thrilled to share the fifth little video made for Apifonica’s “Niebanalne pytania o voiceboty” series – “Remarkable questions about voicebots”. The topic of today’s video is: do voicebots understand emotions? The film features a special cameo, or rather – a cameow!

What is sentiment analysis?

We call the feature in question “sentiment analysis”. It’s a great tool for checking the sentiment of large amounts of text. Let’s say a company wants to quickly learn how it is generally perceived online. It might download tweets containing its name and run sentiment analysis on them. The result shows whether people in general express positive or negative emotions towards the company. The company might also go deeper and see what exactly that sentiment is: rage, satisfaction, happiness, etc.
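For the text side, this is easy to try yourself. Below is a minimal sketch of the idea using NLTK’s VADER analyzer; the two tweets are made-up example data, and a real pipeline would pull them from an API:

```python
# Minimal text sentiment analysis sketch using NLTK's VADER.
# The "tweets" below are invented examples, not real data.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the VADER lexicon

tweets = [
    "Their support team solved my issue in minutes, great service!",
    "Still waiting for a refund after three weeks. Unacceptable.",
]

analyzer = SentimentIntensityAnalyzer()
for tweet in tweets:
    # 'compound' is an overall score from -1 (very negative) to +1 (very positive)
    score = analyzer.polarity_scores(tweet)["compound"]
    if score >= 0.05:
        label = "positive"
    elif score <= -0.05:
        label = "negative"
    else:
        label = "neutral"
    print(f"{label:8} {score:+.2f}  {tweet}")
```

Aggregating scores like these over thousands of tweets gives the overall picture described above.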

Now, of course, we can do the same with voice. Imagine a text consisting of three words, like “Great. Splendid. Awesome.” I might pronounce it in a highly sarcastic way, expressing a lot of contempt – something that would get lost in writing. Spoken language conveys far more information about emotions through the way something is said: the prosody, tension, vowel quality, etc.
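Those vocal cues can be measured. As a hedged sketch – not how any particular vendor does it – here is how one might extract two basic prosodic features, pitch and loudness, with the librosa library. The file “clip.wav” is a hypothetical recording of the phrase above:

```python
# Sketch: extracting prosodic cues from a speech recording with librosa.
# "clip.wav" is a hypothetical file; the features shown are illustrative.
import librosa
import numpy as np

y, sr = librosa.load("clip.wav", sr=16000)

# Fundamental frequency (pitch) contour; a flat, low contour on
# "Great. Splendid. Awesome." could hint at sarcasm rather than enthusiasm.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)

# Root-mean-square energy contour (a rough proxy for loudness).
rms = librosa.feature.rms(y=y)[0]

print("mean pitch (Hz):", np.nanmean(f0))        # unvoiced frames are NaN
print("pitch variability (Hz):", np.nanstd(f0))
print("mean energy:", rms.mean())

# A voice emotion model would feed features like these into a classifier.
```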

Usage implications

Why would we use it in a voicebot? Well, some customers would like to have their end users categorised depending on their sentiment. If an angry client calls, we might want to give them priority and transfer them to a real person immediately, as in the sketch below.
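Purely as an illustration of the idea – this is not Apifonica’s product or API, and every name here is hypothetical – the routing decision might boil down to something like:

```python
# Hypothetical sketch of sentiment-based call routing -- the approach
# discussed (and questioned) in this post, not anyone's real system.
def route_call(sentiment_score: float, angry_threshold: float = -0.5) -> str:
    """Return a routing decision for a sentiment score in [-1, 1]."""
    if sentiment_score < angry_threshold:
        # Caller sounds angry: jump the queue, hand over to a human agent.
        return "transfer_to_agent_priority"
    # Otherwise, let the voicebot try to resolve the issue itself.
    return "continue_with_voicebot"

print(route_call(-0.8))  # -> transfer_to_agent_priority
print(route_call(0.3))   # -> continue_with_voicebot
```

This approach is, however, problematic.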

Firstly, it means we are making assumptions about someone’s emotions. The technology is quite reliable, yes, but it’s still ethically questionable to simply assume a caller’s annoyance or dissatisfaction.

Secondly, we might transfer such a client even if their issue is easily solvable by the voicebot. Instead of a quick resolution, they now have to wait for an agent, getting even more annoyed.

The third implication is perhaps the most important: there is a cultural bias here. Men are more likely to express negative sentiments, while women are, culturally, less likely to express anger. This means a technology based on sentiment analysis might prioritise callers – or simply take different actions – depending on, basically, gender.

So, do you use it?

Well… the technology is there, yes. But do we use it? At Apifonica – no, we don’t. There is just not much of a use case here. We want to treat customers equally and let the voicebot – or an agent – help everyone out, no matter how angry or pleased they are.

Another thing worth mentioning is the EU AI Act, which classifies most voice sentiment analysis technologies (and their use cases) as high risk. Without going into much detail – the EU recognises that these features need a specific, detailed ethical framework to ensure safety.

The video is in Polish and you can view it here:
