How a computer recognizes human emotions

Still think computers are soulless tin boxes? Think again! They are already learning to recognize sadness, joy and anger in people.

Human feelings show not so much in words as in facial expressions, which reveal far more than most of us realize. Even when we do not want to reveal our thoughts, our body language and facial expressions give us away. By an often-quoted estimate, 90 per cent of communication is non-verbal; this may surprise laypeople, but it has long been a rule of thumb for communication professionals. Many of these signals are beyond our control: they appear involuntarily, regardless of our origin or cultural background.

This is especially true of microexpressions, facial expressions that flash across the face for just a split second and are not subject to conscious control. They are also very difficult to fake, which is why they are considered a fairly reliable emotional signalling system. As a rule, they are invisible to the untrained eye, but a camera captures them without difficulty. Software then analyzes the faces using algorithms from the field of affective computing, usually sorting the expressions into six or seven categories.

According to the Facial Action Coding System (FACS), developed in the 1970s by Paul Ekman and Wallace Friesen, these include anger, fear, contempt, disgust, sadness, surprise and happiness. More advanced systems work with more than 20 measured values. Facial expressions of emotion do not depend on cultural factors, as Ekman's studies among people in Papua New Guinea, far from the media and cultural influences of other countries, showed. According to this view, emotions are expressed in the same way throughout the world: they are universal and innate.

Can AI identify attackers?

The programs have now become capable enough to analyze images in real time, which opens up a huge range of possible applications. Since the beginning of the year, the United States Transportation Security Administration (TSA) has been testing biometric facial recognition in a pilot program to check passengers’ identities against their documents.

It is easy to imagine AI also being used to recognize emotions, for example to identify possible terrorists among passengers. Companies are already using emotion recognition to improve their business performance.

Disney knows in advance when the audience will laugh

Disney uses face recognition technology to evaluate audiences’ emotional reactions. An algorithm called a factorized variational autoencoder (FVAE) was developed to track the facial expressions of people watching movies. After analyzing a viewer’s face for ten minutes, it can predict how that face will react during the rest of the film.

FVAE represents images of viewers’ faces as a series of numbers based on certain features: one number for how much a face is smiling, another for how wide the eyes are open, and so on. The Disney team applied FVAE to more than 3,000 viewers watching multiple films and tracked 68 measurement points per face, resulting in a data set of 16 million facial landmarks. Given enough data, the system can accurately predict a person’s reactions after just a few minutes of observation.
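The idea of predicting the rest of a viewing from its first minutes can be illustrated with a deliberately simplified sketch. It is not Disney's FVAE: it swaps the variational autoencoder for a plain linear factor model and assumes that per-frame "time factors" have already been learned from other viewers. The function name, the three latent factors and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def predict_rest_of_film(time_factors, observed):
    """Fit one viewer's latent vector to the frames seen so far,
    then predict that viewer's reaction for every frame of the film.

    time_factors: array (n_frames, k), assumed learned from other viewers.
    observed:     array (n_seen,), this viewer's reactions in the first frames.
    """
    n_seen = len(observed)
    # Least-squares fit of the k viewer-specific numbers to the observed frames.
    viewer_vec, *_ = np.linalg.lstsq(time_factors[:n_seen], observed, rcond=None)
    # Reaction at each frame is modelled as time_factors[frame] . viewer_vec.
    return time_factors @ viewer_vec

# Synthetic demonstration: 100 frames, 3 latent factors, noise-free reactions.
rng = np.random.default_rng(0)
time_factors = rng.normal(size=(100, 3))
true_viewer = np.array([0.8, -0.2, 0.5])       # hypothetical viewer
full_reaction = time_factors @ true_viewer

predicted = predict_rest_of_film(time_factors, full_reaction[:10])
max_error = float(np.max(np.abs(predicted - full_reaction)))
```

In this noise-free toy the ten observed frames are enough to recover the viewer's three numbers, so the prediction for the remaining ninety frames matches. Real facial data is noisy and the real model is nonlinear, which is why far more data is needed in practice.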

Incidentally, the technology is not limited to faces. FVAE could, for example, analyze how trees react to wind depending on their type and size.

The voice gives emotions away too

In addition to facial expressions and posture, our voice gives away our emotional state. That is reason enough for researchers around the world to work on automated emotion recognition from speech.

Back in 2016, Matthew Fernandez and Akash Krishnan, students at the Massachusetts Institute of Technology and Stanford University, developed an algorithm that can recognize dozens of emotions in human speech. Their algorithm, called Simple Emotion, tracks acoustic characteristics of speech such as pitch, volume and changes in tone, and compares them with a library of sounds and tones. It identifies the emotion by finding the closest match in that catalogue.
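The "closest match in a catalogue" step can be sketched as a nearest-neighbour lookup. The following is only an illustration of that general technique, not the Simple Emotion code: the three features (pitch, volume, tone variation, each scaled to the range 0 to 1), the reference values and the function name are all invented for the example.

```python
import math

# Hypothetical reference library: emotion -> (pitch, volume, tone variation),
# each feature assumed to be pre-scaled to the range 0..1.
LIBRARY = {
    "anger":   (0.8, 0.9, 0.9),
    "sadness": (0.2, 0.3, 0.1),
    "joy":     (0.9, 0.7, 0.6),
}

def closest_emotion(features, library=LIBRARY):
    """Return the label whose reference vector is nearest to `features`
    by Euclidean distance."""
    return min(library, key=lambda label: math.dist(features, library[label]))

loud_and_agitated = closest_emotion((0.85, 0.85, 0.85))
quiet_and_flat = closest_emotion((0.25, 0.30, 0.15))
```

Here `loud_and_agitated` comes out as "anger" and `quiet_and_flat` as "sadness". A real system would extract the features from audio and compare against many recorded samples per emotion rather than a single reference point.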

Speech analysis tools may be of interest to companies that want to improve their customer service. Few things irritate hotline callers more than being put through, after a long wait, to an indifferent call centre employee or a robot. This is where an algorithm can help, by giving the agent real-time feedback on the caller's emotional state. The caller may then come away feeling taken seriously and understood. For call centre employees, this means less stress. The tool can also be used for quality assurance or training.

American psychologist Paul Ekman distinguished six basic emotions.
They cannot be learned; they are innate: fear, anger, sadness, joy, disgust and surprise.

But voice and facial expressions are not the only things that give away your emotions. The Moxo wrist-worn device relies on skin resistance instead. As with a lie detector, changes in skin resistance provide information about the emotion prevailing at the moment. The device is intended primarily for use in market research.

How AI reads “between the lines”

Things are somewhat more complicated with text. How can feelings be deduced from written words and sentences when even human readers can’t always work them out (remember literature lessons at school!)? In 2017 Bjarke Felbo, a Danish Fellow at the Massachusetts Institute of Technology, developed a particularly original way of teaching artificial intelligence to read “between the lines”. His main tool is emoji.

Felbo originally wanted to develop a system that would better recognize racist Twitter posts. But he soon realized that many posts cannot be interpreted correctly without understanding irony or sarcasm. Since Twitter users cannot rely on facial expressions, body language or tone of voice, they need other means to make their messages come across as intended: they use emojis, explains Iyad Rahwan, Felbo’s research supervisor at MIT. “The neural network has learned the connection between a certain way of expression and emoji.”

Emoji: attention, sarcasm!

Using an algorithm called DeepMoji, the researchers analyzed 1.2 billion tweets containing a total of 64 different emojis. First, they taught the system to predict which emoji would be used with a particular message, depending on whether it expresses happiness, sadness, laughter or something else. The system then learned to recognize sarcasm from an existing data set of labelled examples.
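The two-stage idea, first learning to predict emoji and then reusing that knowledge for sarcasm, can be shown with a toy sketch. It is not DeepMoji, which is a neural network: simple word counts stand in for the network, and the emoji names, the tiny data sets and all function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_emoji_model(tweets):
    """Stage 1: count which emoji accompanies each word."""
    counts = defaultdict(Counter)
    for text, emoji in tweets:
        for word in text.lower().split():
            counts[word][emoji] += 1
    return counts

def top_emoji(emoji_model, text):
    """Predict the most likely emoji for a text, or None if no word is known."""
    total = Counter()
    for word in text.lower().split():
        total.update(emoji_model.get(word, {}))
    return total.most_common(1)[0][0] if total else None

def train_sarcasm_model(emoji_model, labelled):
    """Stage 2: learn which predicted emoji tends to go with sarcastic text."""
    votes = defaultdict(Counter)
    for text, sarcastic in labelled:
        emoji = top_emoji(emoji_model, text)
        if emoji is not None:
            votes[emoji][sarcastic] += 1
    return {emoji: c.most_common(1)[0][0] for emoji, c in votes.items()}

def is_sarcastic(emoji_model, sarcasm_model, text):
    """Classify a text via its predicted emoji; unknown cases count as not sarcastic."""
    return sarcasm_model.get(top_emoji(emoji_model, text), False)

# Invented toy data: a large emoji-tagged corpus would go here in practice.
emoji_model = train_emoji_model([
    ("great another monday", "unamused"),
    ("love this sunny day", "smile"),
    ("great job team", "smile"),
    ("another delay wonderful", "unamused"),
])
# A much smaller set of examples labelled sarcastic (True) or not (False).
sarcasm_model = train_sarcasm_model(emoji_model, [
    ("another monday", True),
    ("sunny day", False),
])

delay_is_sarcastic = is_sarcastic(emoji_model, sarcasm_model, "another delay")
team_is_sarcastic = is_sarcastic(emoji_model, sarcasm_model, "love this team")
```

The point of the design is that the second stage needs only a handful of labelled examples, because the first stage has already learned from the emoji which wording carries which tone.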

The researchers have even set up a website that demonstrates the emoji part of the system. The program automatically attaches one or more suitable emojis to an English text and seems to work quite well. Only Donald Trump’s tweets cause trouble: they clearly confuse DeepMoji, just as they confuse flesh-and-blood readers.

The meaning and purpose of pattern recognition

Once the hype around the new technical capabilities has subsided, the question of the deeper meaning of emotion recognition remains. After all, machines equipped with such AI do not generate feelings of their own; they do not even understand them. They merely analyze endless rows of numbers, persistently and unwaveringly. The many forms of expression are presented to the algorithms as images and graphs, which image recognition then checks for patterns and features. This can give people the illusion that they are dealing with a sensitive conversation partner.

Such programs will no doubt soon be able to pass any Turing test. But that success owes a good deal to the fact that human understanding is itself based on pattern recognition and always looks for the familiar in the unfamiliar. Every Rorschach test relies on this. So the fear remains that the foundation is being laid here for even greater control or even more sophisticated manipulation. Or the hope that a sensible application will yet be found.