pp. 4781-4799
S&M3831 Research Paper of Special Issue
https://doi.org/10.18494/SAM5202
Published: November 19, 2024
Applying the Generative Model Integrated with the Diffusion Technique to Improve Virtual Sample Generation in Environmental Sound Classification
Yao-San Lin and Mei-Ling Huang
(Received June 24, 2024; Accepted October 23, 2024)
Keywords: GAN, generative model, diffusion technique, ESC, virtual sample generation
We propose a novel framework for environmental sound classification (ESC) to address the challenge of insufficient training samples in sound recognition systems for manufacturing environments. Traditional systems often perform poorly because samples are scarce, so in this research we use generative adversarial networks (GANs) to generate virtual sound samples and augment existing datasets. The proposed method integrates a robust Bayesian inference approach with a modified GAN architecture to generate high-quality synthetic samples, particularly for rare events and emergencies on production lines. The framework aims to enhance the stability and performance of ESC systems by expanding training data in a controlled manner. Experimental results demonstrate the potential of this approach to reduce sample collection costs and improve the practical application of ESC technology in manufacturing systems. Key aspects discussed include technological innovation, cost-effectiveness, implementation challenges, and ethical considerations related to synthetic audio data generation. The results of this research are expected to advance ESC’s real-time monitoring and anomaly detection capabilities in diverse manufacturing environments.
Corresponding author: Mei-Ling Huang
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article: Yao-San Lin and Mei-Ling Huang, Applying the Generative Model Integrated with the Diffusion Technique to Improve Virtual Sample Generation in Environmental Sound Classification, Sens. Mater., Vol. 36, No. 11, 2024, p. 4781-4799.