|
pp. 2477-2490
S&M4448 Research paper
https://doi.org/10.18494/SAM6134
Published: May 12, 2026
Multiscale Network Leveraging Wavelet Scattering Feature Maps for Emotion Recognition from Acoustic Sensor Data
Na Ying, Mengfan Yu, Shunpeng Wu, Xinyu Lin, Du Jiang, and Yinfeng Fang
(Received January 9, 2026; Accepted April 9, 2026)
Keywords: human–robot interaction, emotion recognition, multiscale neural network, high-resolution networks, wavelet scattering feature maps
Reliable emotion recognition from acoustic sensor signals is a pivotal yet challenging component of natural and empathetic human–robot interaction. We propose a dedicated, deployable audio processing module. The core of this module is a novel integration of the Wavelet Scattering Transform (WST), used as a robust preprocessing front-end, with a multiscale neural network. Specifically, the WST converts raw, sensor-acquired audio signals into translation-invariant and deformation-stable feature maps, effectively mitigating input variability. Building upon this, we construct a Wavelet Scattering Feature Map–Multiscale Network (WSM-MSN) that combines hierarchical Convolutional Neural Network (CNN) branches, which extract fine-grained local affective features, with a High-Resolution Network (HRNet) branch, which captures and fuses multiscale contextual dependencies, thereby significantly enhancing recognition precision. Extensive evaluations on four datasets of affective acoustic signals (EMODB, RAVDESS, IEMOCAP, and eNTERFACE’05) demonstrate the module’s superiority. It achieves consistent unweighted average recall (UAR) improvements of 1.62%, 5.28%, 4.44%, and 3.37%, respectively, over traditional scattering methods and surpasses the other algorithms compared.
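The UAR figures reported above are means of per-class recall, so each emotion class contributes equally regardless of how many utterances it has. A minimal sketch of that metric follows; the function name and the toy labels are hypothetical illustrations, not taken from the paper or its datasets.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (macro-averaged recall).

    Each class present in y_true counts once, however many samples it has.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not from any of the four datasets).
y_true = ["angry", "angry", "angry", "angry", "sad", "sad"]
y_pred = ["angry", "angry", "angry", "sad", "sad", "angry"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the recall is 3/4 for "angry" and 1/2 for "sad", so UAR is 0.625 while plain accuracy is 4/6. This gap is why UAR is commonly preferred when emotion classes are imbalanced.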
Corresponding author: Na Ying
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article: Na Ying, Mengfan Yu, Shunpeng Wu, Xinyu Lin, Du Jiang, and Yinfeng Fang, Multiscale Network Leveraging Wavelet Scattering Feature Maps for Emotion Recognition from Acoustic Sensor Data, Sens. Mater., Vol. 38, No. 5, 2026, pp. 2477-2490. |