pp. 3209-3220
S&M2333 Research Paper of Special Issue
https://doi.org/10.18494/SAM.2020.2860
Published: October 9, 2020
Voice Recognition and Marking Using Mel-frequency Cepstral Coefficients
Jia-Shing Sheu and Ching-Wen Chen
(Received March 14, 2020; Accepted June 17, 2020)
Keywords: mel-frequency, Hamming window, speech recognition, fast Fourier transform, microphone array
A real-time voice recognition and marking system was developed in this study to automatically distinguish the voices of different speakers. A microphone array was installed for audio reception. The system processed the audio in five stages: pre-emphasis, framing and Hamming windowing, fast Fourier transform, mel-frequency conversion, and mel-frequency cepstral coefficient computation, with processing times of 0.001, 0.305, 0.205, 0.049, and 0.546 s, respectively. The total processing time was less than 1.5 s. Unique eigenvalues were obtained for each sound. The results indicated that the proposed system, which is an example of intelligent recording, can be used to automatically record speech in meetings or during classes.
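The five stages named in the abstract follow the usual mel-frequency cepstral coefficient (MFCC) recipe, which can be sketched in a few dozen lines. The following is a minimal, dependency-free illustration and not the authors' implementation. The pre-emphasis coefficient (0.97), frame length (256), hop size (128), filter count (20), coefficient count (12), and all function names are assumptions chosen for illustration; none of them comes from the abstract.

```python
import cmath
import math


def pre_emphasis(signal, alpha=0.97):
    """High-pass filter y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frame_and_window(signal, frame_len, hop):
    """Split the signal into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[signal[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def power_spectrum(frame):
    """One-sided power spectrum of a single windowed frame."""
    n = len(frame)
    spec = fft(frame)
    return [abs(spec[k]) ** 2 / n for k in range(n // 2 + 1)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge
            bank[i - 1][k] = (k - left) / (center - left)
        for k in range(center, right):      # peak and falling edge
            bank[i - 1][k] = (right - k) / (right - center)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (m + 0.5) / n) for m, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=12):
    """Return one list of n_coeffs cepstral coefficients per frame (assumed defaults)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for frame in frame_and_window(pre_emphasis(signal), frame_len, hop):
        power = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        log_energies = [math.log(max(e, 1e-10)) for e in energies]  # floor avoids log(0)
        features.append(dct2(log_energies, n_coeffs))
    return features
```

Calling `mfcc(samples, 8000)` on a list of audio samples returns one coefficient vector per frame. Such vectors are the kind of per-sound features that a recognizer could compare across speakers, although the abstract does not specify how the comparison is performed in the proposed system.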
Corresponding author: Jia-Shing Sheu
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article: Jia-Shing Sheu and Ching-Wen Chen, Voice Recognition and Marking Using Mel-frequency Cepstral Coefficients, Sens. Mater., Vol. 32, No. 10, 2020, pp. 3209-3220.