Emotion Transformer : attention model for pose-based emotion recognition
Pedro V. V. Paiva, Josué J. G. Ramos, Marina L. Gavrilova, Marco A. G. Carvalho
ARTICLE
English
This paper was presented at the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP), 2023.
Acknowledgments: This work is supported in part by FAPESP (São Paulo Research Foundation), grant number 2020/07074-3, by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) [88881.690185/2022-01], by the Canada NSERC Discovery Grant on Machine Intelligence for Biometric Security, and by the Canada Defense and Security, IDEaS Innovation Network Establishment Grant AutoDefence. The authors are grateful to the Renato Archer IT Center for its infrastructure support and to MSc. Yajurv Bhatia for his insightful comments and suggestions.
Abstract: Capturing humans' emotional states from images in real-world scenarios is a key problem in affective computing, which has various real-life applications. Emotion recognition methods can enhance video games to increase engagement, help students stay motivated during e-learning sessions, or make interaction more natural in social robotics. Body movements, a crucial component of non-verbal communication, remain less explored in the domain of emotion recognition, while facial expression-based methods are widely investigated. Transformer networks have been successfully applied across several domains, bringing significant breakthroughs. Transformers' self-attention mechanism captures relationships between different features across different spatial locations, allowing contextual information extraction. In this work, we introduce Emotion Transformer, a self-attention architecture leveraging spatial configurations of body joints for Body Emotion Recognition. Our approach is based on the visual transformer linear projection function, allowing the conversion of 2D joint coordinates to a regular matrix representation. The matrix projection then feeds a standard transformer multi-head attention architecture. The developed method allows a more robust correlation between joint movements over time to recognize emotions using contextual information learning. We present an evaluation benchmark for acted emotional sequences extracted from movie scenes using the BoLD dataset. The proposed methodology outperforms several state-of-the-art architectures, proving the effectiveness of the method.
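The abstract's pipeline (flatten per-frame 2D joint coordinates, apply a ViT-style linear projection to obtain one token per frame, then run self-attention over the frame sequence) can be sketched minimally as below. This is an illustrative sketch with hypothetical dimensions and random weights, not the authors' implementation; the joint count, embedding size, and single-head attention are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (T, d) sequence of frame tokens; single-head scaled dot-product attention
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
T, J, d = 16, 18, 32                       # frames, joints, embedding dim (illustrative)
poses = rng.normal(size=(T, J, 2))         # 2D joint coordinates per frame
W_proj = rng.normal(size=(J * 2, d))       # ViT-style linear projection of flattened joints
tokens = poses.reshape(T, J * 2) @ W_proj  # one token per frame
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(tokens, Wq, Wk, Wv)   # contextualized frame representations
print(out.shape)                           # (16, 32)
```

In a full model, the attended frame representations would be pooled and passed to a classification head over emotion categories; the paper uses multi-head attention rather than the single head shown here.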
FUNDAÇÃO DE AMPARO À PESQUISA DO ESTADO DE SÃO PAULO - FAPESP
2020/07074-3
COORDENAÇÃO DE APERFEIÇOAMENTO DE PESSOAL DE NÍVEL SUPERIOR - CAPES
88881.690185/2022-01
Open access
Source
Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. Setúbal: SciTePress, 2023. p. 274-281.