arXiv:2312.07507 [cs.CV]

NAC-TCN: Temporal Convolutional Networks with Causal Dilated Neighborhood Attention for Emotion Understanding

Published 2023-12-12, Version 1

In emotion recognition from video, a key improvement has been to model emotions over time rather than from a single frame. Many architectures address this task, including GRUs, LSTMs, Self-Attention, Transformers, and Temporal Convolutional Networks (TCNs). However, these methods suffer from high memory usage, a large number of operations, or poor gradients. We propose Neighborhood Attention with Convolutions TCN (NAC-TCN), which combines the benefits of attention and Temporal Convolutional Networks while ensuring that causal relationships are understood, resulting in reduced computation and memory cost. We accomplish this by introducing a causal version of Dilated Neighborhood Attention and combining it with convolutions. On standard emotion recognition datasets, our model achieves comparable, better, or state-of-the-art performance relative to TCNs, TCAN, LSTMs, and GRUs while requiring fewer parameters. We publish our code online for easy reproducibility and use in other projects.
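To make the idea of causal dilated neighborhood attention concrete, the following is a minimal NumPy sketch for a single head over a 1D sequence. It is not the authors' implementation: the function name, the `kernel_size` and `dilation` parameters, the absence of learned projections, and the handling of early time steps (neighbors before the start of the sequence are simply masked out) are all assumptions made for illustration. Each time step attends only to itself and to earlier steps spaced `dilation` apart, so no information flows backward from the future.

```python
import numpy as np


def causal_dilated_neighborhood_attention(q, k, v, kernel_size=3, dilation=1):
    """Illustrative single-head sketch (hypothetical helper, not from the paper).

    q, k, v: arrays of shape (T, D). Returns an array of shape (T, D).
    Step t attends to steps t, t - dilation, ..., t - (kernel_size - 1) * dilation.
    """
    T, D = q.shape
    # Backward-looking offsets only: 0, d, 2d, ...
    offsets = np.arange(kernel_size) * dilation
    # idx[t, j] is the position of the j-th neighbor of step t.
    idx = np.arange(T)[:, None] - offsets[None, :]
    valid = idx >= 0                      # neighbors before the sequence start are masked
    idx_clipped = np.clip(idx, 0, None)   # safe indices for gathering

    # Scaled dot-product scores between each query and its gathered neighbors.
    scores = np.einsum("td,tjd->tj", q, k[idx_clipped]) / np.sqrt(D)
    scores = np.where(valid, scores, -np.inf)

    # Numerically stable softmax over the neighborhood axis.
    # Offset 0 (the step itself) is always valid, so each row has a finite max.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted sum of the gathered neighbor values.
    return np.einsum("tj,tjd->td", weights, v[idx_clipped])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(10, 4))
    out = causal_dilated_neighborhood_attention(x, x, x, kernel_size=3, dilation=2)
    print(out.shape)
```

Because every query sees at most `kernel_size` keys, the attention cost in this sketch grows as T × `kernel_size` rather than T², which is the kind of saving the abstract attributes to restricting attention to a local causal neighborhood.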

Comments: 8 pages, presented at ICVIP 2023

Categories: cs.CV

Keywords: temporal convolutional networks, causal dilated neighborhood attention, emotion understanding, standard emotion recognition datasets, high memory usage

Related articles:

arXiv:1611.05267 [cs.CV] (Published 2016-11-16)

Temporal Convolutional Networks for Action Segmentation and Detection

Colin Lea, Michael D. Flynn, Rene Vidal, Austin Reiter, Gregory D. Hager

arXiv:2001.08702 [cs.CV] (Published 2020-01-23)

Lipreading using Temporal Convolutional Networks

Brais Martinez, Pingchuan Ma, Stavros Petridis, Maja Pantic

arXiv:2011.12986 [cs.CV] (Published 2020-11-25)

Sign language segmentation with temporal convolutional networks