Sensors and Materials

Print: ISSN 0914-4935
Online: ISSN 2435-0869
Sensors and Materials is an international peer-reviewed open access journal that provides a forum for researchers working in multidisciplinary fields of sensing technology.

Sensors and Materials is covered by Science Citation Index Expanded (Clarivate Analytics), Scopus (Elsevier), and other databases.

Publisher
MYU K.K.
Sensors and Materials
1-23-3-303 Sendagi,
Bunkyo-ku, Tokyo 113-0022, Japan
Tel: 81-3-3827-8549
Fax: 81-3-3827-8547

Sensors and Materials, Volume 37, Number 6(3) (2025)
Copyright(C) MYU K.K.

pp. 2489-2500
S&M4068 Research Paper of Special Issue
https://doi.org/10.18494/SAM5558
Published: June 25, 2025

Robust Speaker Recognition in Voice Sensing Environments with Specific Background Noises Using Deep Learning of Hybridized Speech Enhancement Generative Adversarial Network and Convolutional Neural Network for Smart Manufacturing

Ing-Jr Ding and Meng-Chuan Hsieh

(Received January 18, 2025; Accepted June 2, 2025)

Keywords: speaker recognition, deep learning, hybridized SEGAN-CNN, SEGAN, VGG-16 CNN

Identity recognition using a person's specific biometric characteristics has recently become a popular technique. Compared with image-sensor-data-based face and fingerprint recognition, speaker recognition using the acoustic characteristics of a speaker's uttered voice offers an alternative. In dark environments or when fingers are dirty, acoustics-based speaker recognition can accomplish identity recognition with satisfactory accuracy. In practical application scenarios, however, speaker recognition inevitably encounters speech mixed with background noise. Utterances containing undesired background noise from a specific environment cannot be closely matched with the preestablished speaker models, which causes inaccurate identity recognition results. To tackle this issue, we present a deep-learning-based method for speaker recognition in a noisy environment that hybridizes two different types of deep learning model, the speech enhancement generative adversarial network (SEGAN) and the convolutional neural network (CNN), called hybridized SEGAN-CNN. SEGAN removes the specific background noise from the noisy utterance, and the CNN then classifies the identities of the speakers without noise effects, so the task becomes speaker recognition in a clear environment, in which the robustness of speaker recognition can be effectively maintained. The results of experiments using a voice command phrase mixed with motor operation noise for robot navigation control in a simulated factory environment demonstrate the effectiveness of the proposed speaker recognition method.
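
The enhance-then-classify structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the moving-average filter is only a stand-in for the SEGAN enhancement stage, the nearest-template matcher is only a stand-in for the VGG-16 CNN classifier, and all function names, speaker labels, and signal values are hypothetical. The sketch shows only how the two stages are chained.

```python
from typing import Callable, Dict, List, Sequence

Signal = List[float]


def moving_average_enhancer(noisy: Sequence[float], width: int = 3) -> Signal:
    """Placeholder enhancement stage (NOT SEGAN): centered moving average."""
    half = width // 2
    enhanced = []
    for i in range(len(noisy)):
        # Window is truncated at the signal edges.
        window = noisy[max(0, i - half): i + half + 1]
        enhanced.append(sum(window) / len(window))
    return enhanced


def nearest_template_classifier(
    templates: Dict[str, Signal],
) -> Callable[[Sequence[float]], str]:
    """Placeholder classification stage (NOT a CNN): nearest stored template."""

    def classify(signal: Sequence[float]) -> str:
        def squared_distance(name: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(signal, templates[name]))

        return min(templates, key=squared_distance)

    return classify


def recognize_speaker(
    noisy: Sequence[float],
    enhance: Callable[[Sequence[float]], Signal],
    classify: Callable[[Sequence[float]], str],
) -> str:
    """Two-stage pipeline: remove noise first, then classify the result."""
    return classify(enhance(noisy))


# Hypothetical toy data: two "speakers" and one utterance with additive noise.
templates = {"speaker_a": [1.0] * 8, "speaker_b": [-1.0] * 8}
noise = [0.5, -0.5] * 4
noisy_utterance = [s + n for s, n in zip(templates["speaker_a"], noise)]

classify = nearest_template_classifier(templates)
print(recognize_speaker(noisy_utterance, moving_average_enhancer, classify))
# prints "speaker_a"
```

Because both stages are passed in as callables, a trained enhancement network and a trained classifier could replace the placeholders without changing `recognize_speaker`.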

Corresponding author: Ing-Jr Ding

This work is licensed under a Creative Commons Attribution 4.0 International License.

Cite this article
Ing-Jr Ding and Meng-Chuan Hsieh, Robust Speaker Recognition in Voice Sensing Environments with Specific Background Noises Using Deep Learning of Hybridized Speech Enhancement Generative Adversarial Network and Convolutional Neural Network for Smart Manufacturing, Sens. Mater., Vol. 37, No. 6, 2025, pp. 2489-2500.