arXiv:2008.09748 [cs.CV]

Multidomain Multimodal Fusion For Human Action Recognition Using Inertial Sensors

Zeeshan Ahmad, Naimul Khan

Published 2020-08-22 (Version 1)

One of the major reasons for the misclassification of multiplex actions during action recognition is the unavailability of complementary features that provide semantic information about the actions. In different domains, these features are present at different scales and intensities. In the existing literature, features are extracted independently in different domains, but the benefits of fusing these multidomain features are not realized. To address this challenge and to extract a complete set of complementary information, in this paper we propose a novel multidomain multimodal fusion framework that extracts complementary and distinct features from different domains of the input modality. We transform the input inertial data into signal images, and then make the input modality multidomain and multimodal by transforming the spatial-domain information into the frequency and time-spectrum domains using the Discrete Fourier Transform (DFT) and the Gabor wavelet transform (GWT), respectively. Features in the different domains are extracted by Convolutional Neural Networks (CNNs) and then fused by Canonical Correlation based Fusion (CCF) to improve the accuracy of human action recognition. Experimental results on three inertial datasets show the superiority of the proposed method over the state-of-the-art.
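To make the pipeline concrete, the following is a minimal NumPy/SciPy sketch of the three domain views and a canonical-correlation fusion step. Every name here (make_signal_image, dft_image, gabor_image, cca_fuse) and every parameter choice is an illustrative assumption, not the paper's implementation: the paper's exact signal-image row permutation, Gabor filter bank, per-domain CNN architectures, and precise CCF variant differ in detail.

import numpy as np
from scipy.ndimage import convolve

def make_signal_image(window, rows=24):
    # Row-stack shifted copies of the inertial channels so that every
    # channel ends up adjacent to every other one (illustrative scheme;
    # the paper uses its own row-permutation rule).
    c, _ = window.shape
    stacked = [window[(i + j) % c] for i in range(rows // c) for j in range(c)]
    return np.stack(stacked, axis=0)                    # (rows, t)

def dft_image(img):
    # Frequency-domain view: log-magnitude of the 2-D DFT of the signal image.
    return np.log1p(np.abs(np.fft.fft2(img)))

def gabor_image(img, freq=0.25, theta=0.0, sigma=3.0, size=15):
    # Time-spectrum view: response to one real Gabor kernel; the paper
    # applies a Gabor wavelet transform (effectively a bank of such filters).
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * freq * xr)
    return convolve(img, kernel, mode="nearest")

def cca_fuse(X, Y, dim=32):
    # Canonical Correlation based Fusion: compute the canonical directions
    # of two CNN feature sets via whitening + SVD, project, and concatenate
    # (a summation variant, Xc @ Wx + Yc @ Wy, is equally common).
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + 1e-6 * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + 1e-6 * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Lx_inv = np.linalg.inv(np.linalg.cholesky(Sxx))     # whitening transforms
    Ly_inv = np.linalg.inv(np.linalg.cholesky(Syy))
    U, _, Vt = np.linalg.svd(Lx_inv @ Sxy @ Ly_inv.T)   # canonical pairs
    Wx, Wy = Lx_inv.T @ U[:, :dim], Ly_inv.T @ Vt[:dim].T
    return np.concatenate([Xc @ Wx, Yc @ Wy], axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    window = rng.standard_normal((6, 128))              # fake 6-channel inertial window
    spatial = make_signal_image(window)                 # spatial-domain image
    freq, spec = dft_image(spatial), gabor_image(spatial)
    # Stand-ins for per-domain CNN embeddings (one CNN per domain in the paper):
    feats_a, feats_b = (rng.standard_normal((200, 64)) for _ in range(2))
    fused = cca_fuse(feats_a, feats_b, dim=16)
    print(spatial.shape, freq.shape, spec.shape, fused.shape)  # (24,128) x3, (200,32)

The sketch fuses only two domains for brevity; with three domains, CCF is typically applied pairwise or hierarchically before the final classifier.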

Related articles:
arXiv:2407.06162 [cs.CV] (Published 2024-06-02)
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and A Hybrid Model
arXiv:2010.16073 [cs.CV] (Published 2020-10-29)
CNN based Multistage Gated Average Fusion (MGAF) for Human Action Recognition Using Depth and Inertial Sensors
arXiv:2105.13533 [cs.CV] (Published 2021-05-28)
Inertial Sensor Data To Image Encoding For Human Action Recognition