Human in the Loop – How Do We Create AI?

Tags: annotation, artificial intelligence, humans, machine learning, technology, transparency

November 26, 2021

Dagmar Damazyn

Artificial intelligence relies on humans. To develop AI technology, as we do at audEERING, we need to understand human perception. That’s why we work with a trained in-house annotation team: a group of people diverse in age, gender, background, and present circumstances, who provide the knowledge our AI technology learns from. In this blog post we focus on the human side of the machine learning process. How do we at audEERING create AI?

Putting Humans in the Middle

Everyday perception enables us to recognize the emotional state of the person we are talking to in many different situations. Most of the time this happens unconsciously. A typical example is a call with a good friend. You ask your friend, “How are you?” and the answer is, “I’m fine.” Now imagine your reaction is, “Oh, what’s wrong? Do you want to talk?” What does your friend sound like, despite the positive words, that makes you react this way?

Making the Unconscious Conscious

In the process of human machine learning, we need to give the algorithm essential input. The basis is a large amount of training data. For Voice AI, that data is audio. Once the data is available, the next step is to decide what you want to find out from it. The clearer the goal, the clearer the question can be. The annotators’ answers are the input used to train the AI. While answering, the annotator tries to work out what his or her perception is focusing on. This is the moment when the unconscious becomes conscious.
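The annotation step described above can be sketched in a few lines of Python. This is only a minimal illustration, not audEERING’s actual pipeline: the clip IDs, the label names, and the majority-vote rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical annotations: each audio clip was rated by several annotators.
# Clip IDs and label names are invented for illustration.
annotations = {
    "clip_001": ["sad", "sad", "neutral"],
    "clip_002": ["happy", "happy", "happy"],
    "clip_003": ["angry", "neutral", "sad"],
}


def aggregate(labels):
    """Return (majority_label, agreement) for one clip.

    agreement is the share of annotators who chose the majority label.
    Ties go to the label that appears first in the list.
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# One training label per clip, plus how strongly the annotators agreed.
training_labels = {clip: aggregate(labels) for clip, labels in annotations.items()}

for clip, (label, agreement) in training_labels.items():
    print(f"{clip}: {label} (agreement {agreement:.2f})")
```

In this sketch, a low agreement value such as the one for the hypothetical clip_003 shows that the annotators perceived the same voice differently, which is worth knowing before the label is used for training.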

The Carpenter Effect

According to the Carpenter effect, hearing a harsh voice can cause a scratchy feeling in one’s own throat. The Carpenter effect describes how we come to understand another person’s physiological movements by unconsciously imitating them. In speech therapy, this effect is used to detect signs of a person’s voice disorder. Moreover, the voice conveys personality, and humans, as empathic beings, are receptive to it. That’s why you can distinguish between what your friend says and the way they say it.

AI as the Mirror of Human Perception

How can AI become discriminatory, for example with regard to gender and origin? AI technology learns from the given data sources and their labels. As humans, we distinguish between languages in everyday life because of particular linguistic phenomena, such as a language’s word structure and phonetic sound. We also distinguish between female and male ways of speaking. Humans are socialized beings, trained in a sense to interpret particular attributes. Additionally, there are real physiological differences between male and female vocal tracts, which affect how the voice is produced acoustically. When such distinctions find their way into the labels, the AI mirrors them.

audEERING’s Role as a World-Leading Innovator in Voice AI

Developing AI with responsibility towards humanity, science, and transparency in technology means understanding the data. In essence, it means understanding human perception. How do people perceive what they hear? Which personal background information influences human perception? To maintain diversity, diversity must be created. This cannot be done one-dimensionally; it must be captured at all levels. Accordingly, developing AI can add value beyond the technology itself. It means gaining insight into the complexity of the human mind and perception. And for that, the complexity of the human mind is indispensable.

To learn more about audEERING’s technology, have a look at devAIce. We also look forward to hearing from you for an individual conversation about the possibilities of audEERING’s technology.