|
pp. 1463-1479
S&M4388 Research paper
https://doi.org/10.18494/SAM5947
Published: March 23, 2026
Monocular 2D Baseball Swing Pose Estimation Using Multistacked Hourglass Networks
Yu-Liang Hsu and Yu-Ming Lo
(Received September 24, 2025; Accepted February 13, 2026)
Keywords: monocular, 2D human pose estimation, baseball swing motions, multistacked hourglass network
Human pose estimation from monocular 2D images is a fundamental yet challenging task in computer vision, with wide-ranging applications in human–computer interaction, animation, and behavior detection. With the rapid development of deep learning techniques, monocular human pose estimation has advanced considerably in both 2D and 3D research. In this study, we implement multistacked hourglass (MSH) networks to accurately estimate 2D baseball swing poses. Images captured by a monocular camera are fed into the MSH networks, which estimate the 2D keypoint coordinates of the human pose. The proposed MSH networks are validated on the Max Planck Institute for Informatics (MPII) Human Pose dataset to demonstrate their feasibility and effectiveness for 2D human pose estimation. In addition, the MSH network trained on the MPII Human Pose dataset is used to estimate the 2D keypoint coordinates of baseball swing poses from monocular 2D images. The experimental results show that the MSH networks achieve an average percentage of correct keypoints (PCK@0.5) of 90.2% on the MPII Human Pose dataset and 94.0% on baseball swing motions.
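For readers unfamiliar with the PCK metric reported above, the following is a minimal sketch of how a PCK@0.5 score can be computed for one image. It is not the authors' evaluation code: the function name `pck`, the `ref_length` normalizer (for example a head or torso size, which the abstract does not specify), and all coordinate values are assumptions chosen for illustration.

```python
import math


def pck(pred, truth, ref_length, alpha=0.5):
    """Percentage of predicted keypoints within alpha * ref_length of ground truth.

    pred, truth: lists of (x, y) pixel coordinates in the same keypoint order.
    ref_length:  normalizing length in pixels (assumption: e.g. head or torso size).
    alpha:       threshold fraction; 0.5 corresponds to PCK@0.5.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    threshold = alpha * ref_length
    # A keypoint counts as correct when its Euclidean error is within the threshold.
    correct = sum(
        1
        for (px, py), (tx, ty) in zip(pred, truth)
        if math.hypot(px - tx, py - ty) <= threshold
    )
    return 100.0 * correct / len(pred)


# Hypothetical ground-truth and predicted keypoints (pixels), for illustration only.
truth = [(100.0, 50.0), (120.0, 80.0), (90.0, 140.0), (130.0, 200.0)]
pred = [(103.0, 54.0), (121.0, 79.0), (90.0, 175.0), (128.0, 203.0)]

score = pck(pred, truth, ref_length=40.0, alpha=0.5)
print(f"PCK@0.5 = {score:.1f}%")
```

With a reference length of 40 pixels, the threshold is 20 pixels, so the third keypoint (35 pixels off) is counted as incorrect and the other three as correct. A dataset-level score would average this quantity over all annotated keypoints and images.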
Corresponding author: Yu-Liang Hsu
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article: Yu-Liang Hsu and Yu-Ming Lo, Monocular 2D Baseball Swing Pose Estimation Using Multistacked Hourglass Networks, Sens. Mater., Vol. 38, No. 3, 2026, pp. 1463-1479. |