Sensors and Materials

Print: ISSN 0914-4935
Online: ISSN 2435-0869
Sensors and Materials is an international peer-reviewed open access journal that provides a forum for researchers working in multidisciplinary fields of sensing technology.

Sensors and Materials is covered by Science Citation Index Expanded (Clarivate Analytics), Scopus (Elsevier), and other databases.

Publisher
MYU K.K.
Sensors and Materials
1-23-3-303 Sendagi,
Bunkyo-ku, Tokyo 113-0022, Japan
Tel: 81-3-3827-8549
Fax: 81-3-3827-8547

Sensors and Materials, Volume 38, Number 5(2) (2026)
Copyright(C) MYU K.K.

pp. 2723-2738
S&M4465 Research paper
https://doi.org/10.18494/SAM6144
Published: May 22, 2026

Taiwanese Sign Language Recognition and Natural Sentence Generation System Based on Spatiotemporal Graph Convolutional Networks and Distilled Bidirectional Encoder Representations from Transformers

Neng-Sheng Pai, Li-An Weng, Pi-Yun Chen, and Lian-Sheng Hong

(Received December 22, 2025; Accepted April 14, 2026)

Keywords: sign language recognition, natural sentence generation, MediaPipe, ST-GCN, DistilBERT

We present a Taiwanese Sign Language (TSL) recognition and natural sentence generation system for continuous sign language recognition, whereas most existing approaches address isolated sign recognition. The proposed system combines a spatiotemporal graph convolutional network (ST-GCN) with a language generation model based on distilled bidirectional encoder representations from transformers (DistilBERT), with the aim of reducing communication barriers for the deaf and hard-of-hearing community. First, a camera sensor captures sign language videos, and MediaPipe extracts human body key points from the video sequences. These spatiotemporal key point representations are then processed by the ST-GCN model to perform sign recognition. Finally, the recognized sign sequences are translated into fluent and natural sentences by a fine-tuned DistilBERT model. Experimental evaluations, including a frame sampling analysis, are conducted on a self-collected dataset consisting of 42 classes of TSL videos. The results indicate that uniformly sampling video sequences to 70 frames yields the best recognition performance for the ST-GCN model. For sentence generation, 24 predefined Chinese sentence templates are used to fine-tune the DistilBERT model, and the results indicate that the proposed method can achieve accurate and natural sentence generation under low-resource training conditions. Overall, the proposed system combines a lightweight model architecture with robust gesture recognition accuracy and natural language generation quality, which supports its effectiveness and feasibility for continuous sign language translation and language generation tasks.
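The abstract reports that uniformly sampling each video sequence to 70 frames gave the best ST-GCN recognition performance, but it does not describe how the sampling is implemented. The following is a minimal Python sketch of one common way to do it, under stated assumptions: the function names `uniform_sample_indices` and `resample_sequence` are hypothetical, each frame is treated as an opaque item (for example, one frame's key points), and clips shorter than the target are filled by repeating frames, which may differ from the authors' handling.

```python
def uniform_sample_indices(num_frames: int, target: int = 70) -> list[int]:
    """Return `target` indices spread evenly over 0 .. num_frames - 1.

    Assumption: the first and last frames are always kept. If the clip
    has fewer than `target` frames, some indices repeat.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if target <= 0:
        raise ValueError("target must be positive")
    if target == 1:
        return [0]
    # Integer arithmetic avoids floating-point rounding at the endpoints.
    return [(i * (num_frames - 1)) // (target - 1) for i in range(target)]


def resample_sequence(frames: list, target: int = 70) -> list:
    """Pick `target` frames from `frames` at evenly spaced positions."""
    return [frames[i] for i in uniform_sample_indices(len(frames), target)]
```

With this sketch, a 140-frame clip is reduced to 70 frames that still start at frame 0 and end at frame 139, and a 30-frame clip is stretched to 70 frames by duplication, so every sequence reaches the recognizer with the same length.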

Corresponding author: Pi-Yun Chen

This work is licensed under a Creative Commons Attribution 4.0 International License.

Cite this article
Neng-Sheng Pai, Li-An Weng, Pi-Yun Chen, and Lian-Sheng Hong, Taiwanese Sign Language Recognition and Natural Sentence Generation System Based on Spatiotemporal Graph Convolutional Networks and Distilled Bidirectional Encoder Representations from Transformers, Sens. Mater., Vol. 38, No. 5, 2026, p. 2723-2738.