How Speaker Identification Affects Emotion Detection

Caro Bauer

Speech is a highly personal and powerful medium of communication, and every voice carries a wealth of distinctive information. Software providers can use this information in concrete application scenarios such as emotion detection. Speaker identification is an important part of voice-based technologies such as voice and speech recognition, and it can provide speaker-specific information such as gender and age. So how does it affect a technology like emotion detection from speech?

Advantages of Speaker Identification

Because every person is unique, every voice has characteristics that belong to that speaker alone. Current speech and emotion detection systems suffer from this variation in voice characteristics across speakers. To resolve this issue and deliver the best outcome for every individual, we apply speaker identification.

“Through speaker identification, our emotion detection technology can work better and more accurately. The speaker-dependent technology enables more accurate results than speaker-independent technology, because we can respond to the characteristics of the speakers,” says Prof. Dr. Felix Burkhardt, Director Research at audEERING.
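To illustrate why knowing the speaker can help, here is a minimal sketch of one common idea: normalizing a voice feature against each speaker's own baseline instead of a global one. It is an illustration under stated assumptions, not audEERING's implementation. The function name `normalize_per_speaker`, the speaker labels, and the use of a single pitch value per utterance are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_speaker(samples):
    """Z-normalize each value against its own speaker's mean and spread.

    `samples` is a list of (speaker_id, value) pairs. The value stands in
    for any voice feature; mean pitch in Hz is assumed here for illustration.
    """
    by_speaker = defaultdict(list)
    for speaker_id, value in samples:
        by_speaker[speaker_id].append(value)

    # Per-speaker baseline: (mean, standard deviation).
    stats = {}
    for speaker_id, values in by_speaker.items():
        spread = pstdev(values)
        # Guard against division by zero when a speaker has no variation.
        stats[speaker_id] = (mean(values), spread if spread > 0 else 1.0)

    return [(value - stats[s][0]) / stats[s][1] for s, value in samples]


# Hypothetical data: two speakers with very different natural pitch ranges.
samples = [("anna", 200.0), ("anna", 220.0), ("ben", 100.0), ("ben", 120.0)]
z_scores = normalize_per_speaker(samples)
print(z_scores)
```

After normalization, a raised pitch looks the same for both speakers, even though their absolute pitch values differ by about 100 Hz. Without the speaker label, the same values would mostly reflect who is talking rather than how they are talking.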

Self-learning AI Adapts to Speakers

However, in the long run, important information about a speaker’s characteristics would be lost as soon as different people use the same system. For this reason, speaker identification should be combined with a self-learning technique so that the system keeps adapting on its own. A speaker adaptation scheme is introduced that offers fast short-term and detailed long-term adaptation. The resulting adaptation profiles are then used for an efficient speaker recognition system, which enables the adaptation to track different speakers. In the long term, this approach provides an optimal adaptation.
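The interplay of short-term adaptation, long-term adaptation, and speaker tracking can be sketched roughly as follows. This is a simplified illustration under assumptions, not the scheme used in the product. The names `SpeakerProfile` and `AdaptiveTracker`, the update rates, the distance threshold, and the use of one scalar feature (where a real system would use a richer speaker representation) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpeakerProfile:
    short_term: float  # fast-moving estimate, reacts within a few utterances
    long_term: float   # slow-moving estimate, refined over many utterances

    def update(self, value, fast=0.5, slow=0.05):
        # Exponential moving averages with two different rates (assumed values).
        self.short_term += fast * (value - self.short_term)
        self.long_term += slow * (value - self.long_term)


class AdaptiveTracker:
    """Assigns each observation to the closest profile, or creates a new one."""

    def __init__(self, max_distance=30.0):
        self.profiles = {}                # speaker index -> SpeakerProfile
        self.max_distance = max_distance  # assumed threshold for "same speaker"

    def observe(self, value):
        # Recognition step: find the profile whose long-term estimate is nearest.
        best_id, best_dist = None, float("inf")
        for speaker_id, profile in self.profiles.items():
            dist = abs(value - profile.long_term)
            if dist < best_dist:
                best_id, best_dist = speaker_id, dist

        # Unknown voice: start a new profile instead of overwriting an old one.
        if best_id is None or best_dist > self.max_distance:
            best_id = len(self.profiles)
            self.profiles[best_id] = SpeakerProfile(value, value)

        # Adaptation step: only the recognized speaker's profile is updated.
        self.profiles[best_id].update(value)
        return best_id


tracker = AdaptiveTracker()
# Hypothetical feature values from two alternating speakers.
assigned = [tracker.observe(v) for v in [210.0, 205.0, 110.0, 215.0, 105.0]]
print(assigned)
```

Because each observation updates only the profile it was matched to, a second person using the system does not erase what was learned about the first.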

Data Privacy Has to Be Guaranteed

Data privacy is key to us and must always be guaranteed when applying artificial intelligence. As a German company, we work according to the General Data Protection Regulation (GDPR) and the German Telemedia Act (TMG), so that no data is stored on remote servers. Furthermore, users can always delete all of their data. You can see this rule applied transparently in our COVID-19 study on our recording platform AI SoundLab.