Continuous emotion recognition with spatiotemporal convolutional neural networks

Teixeira, Thomas, Granger, Eric and Koerich, Alessandro Lameiras. 2021. "Continuous emotion recognition with spatiotemporal convolutional neural networks". Applied Sciences, vol. 11, no. 24.
Scopus citation count: 3.

Granger-E-2021-23853.pdf - Published version
License: Creative Commons CC BY.

Abstract

Facial expressions are one of the most powerful ways to depict specific patterns in human behavior and describe the human emotional state. However, despite the impressive advances in affective computing over the last decade, automatic video-based systems for facial expression recognition still cannot correctly handle variations in facial expression among individuals, or cross-cultural and demographic aspects. Indeed, recognizing facial expressions is a difficult task, even for humans. This paper investigates the suitability of state-of-the-art deep learning architectures based on convolutional neural networks (CNNs) for continuous emotion recognition in long video sequences captured in the wild. To this end, several 2D CNN models designed to model spatial information are extended to allow spatiotemporal representation learning from videos, considering a complex and multi-dimensional emotion space in which continuous values of valence and arousal must be predicted. We have developed and evaluated convolutional recurrent neural networks, which combine 2D CNNs with long short-term memory units, and inflated 3D CNN models, which are built by inflating the weights of a pre-trained 2D CNN model during fine-tuning on application-specific videos. Experimental results on the challenging SEWA-DB dataset show that these architectures can be effectively fine-tuned to encode spatiotemporal information from successive raw pixel images and achieve state-of-the-art results on this dataset.
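
As a rough illustration of the weight inflation mentioned in the abstract, the sketch below turns a single 2D kernel into a 3D one. It assumes one common scheme: the 2D weights are repeated along the temporal axis and divided by the temporal depth. The abstract does not state the exact scheme the authors used, and the function names here (inflate_kernel, response_2d, response_3d) are hypothetical. Plain Python lists stand in for tensors.

```python
def inflate_kernel(kernel_2d, depth):
    """Repeat a 2D kernel `depth` times along time and rescale by 1/depth.

    Assumed scheme (not confirmed by the abstract): with this scaling, a clip
    made of `depth` identical frames yields the same response as the original
    2D kernel applied to one frame.
    """
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return [[[w / depth for w in row] for row in kernel_2d]
            for _ in range(depth)]


def response_2d(kernel_2d, patch_2d):
    """Dot product of a 2D kernel with an image patch of the same size."""
    return sum(w * x
               for k_row, p_row in zip(kernel_2d, patch_2d)
               for w, x in zip(k_row, p_row))


def response_3d(kernel_3d, clip):
    """Dot product of a 3D kernel with a clip of same-sized patches."""
    return sum(response_2d(k, frame) for k, frame in zip(kernel_3d, clip))


# Hypothetical 3x3 edge-like kernel and a 3x3 image patch.
kernel = [[1.0, 0.0, -1.0],
          [2.0, 0.0, -2.0],
          [1.0, 0.0, -1.0]]
patch = [[3.0, 1.0, 0.0],
         [4.0, 1.0, 5.0],
         [9.0, 2.0, 6.0]]

depth = 4
kernel_3d = inflate_kernel(kernel, depth)
static_clip = [patch] * depth  # the same frame repeated over time

r2d = response_2d(kernel, patch)
r3d = response_3d(kernel_3d, static_clip)
```

On a static clip the inflated kernel reproduces the 2D response, so the pre-trained spatial features are a sensible starting point before fine-tuning on video lets the temporal slices diverge.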

Document type: Peer-reviewed journal article
Professor:
Granger, Éric
Lameiras Koerich, Alessandro
Affiliation: Systems Engineering, Software and Information Technology Engineering
Deposit date: 24 Jan. 2022 17:21
Last modified: 03 Mar. 2022 15:37
URI: https://espace2.etsmtl.ca/id/eprint/23853
