pp. 1193-1209
S&M3981 Research Paper of Special Issue

https://doi.org/10.18494/SAM5220

Published: March 31, 2025

Approach to Deep-learning-based Visual Relationship Detection for Scene Analysis

Ming-Yuan Shieh, Po-Kuan Wu, and Neng-Sheng Pai

(Received June 28, 2024; Accepted March 17, 2025)

Keywords: visual relationship detection, scene analysis, convolutional neural networks, vision transformer
This paper presents the implementation of a deep-learning-based visual relationship detection system for scene analysis. The system first uses convolutional neural networks (CNNs) to detect and localize the objects in a scene. It then applies a vision transformer model to infer the relationships between the detected objects, which supports the analysis of spatial layouts and of behavioral interactions among objects within the scene. The system also integrates a user interface that displays the relationships among objects and the model’s inference outcomes. In addition to whole-scene analysis, users can select two distinct subjects of interest and explore their relationships with other objects in the same scene. The system then analyzes only the selected subjects and excludes results that do not meet the specified criteria, so that users can direct the analysis toward the results they need.
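The subject-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Relation` record, the `filter_relations` helper, the confidence threshold `min_score`, and the sample triplets are all hypothetical, and the sketch assumes the relationship model outputs (subject, predicate, object) triplets with a score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical record for one inferred relationship triplet."""
    subject: str    # label of the subject object, e.g. "person"
    predicate: str  # inferred relationship, e.g. "riding"
    obj: str        # label of the related object
    score: float    # model confidence (assumed to lie in [0, 1])


def filter_relations(relations, selected_subjects, min_score=0.5):
    """Keep triplets whose subject is one of two user-selected subjects.

    The score threshold stands in for the "specified criteria" mentioned
    in the abstract; the actual criteria are not stated there.
    """
    chosen = set(selected_subjects)
    if len(chosen) != 2:
        raise ValueError("exactly two distinct subjects must be selected")
    return [
        r for r in relations
        if r.subject in chosen and r.score >= min_score
    ]


# Illustrative (made-up) model output for one scene.
relations = [
    Relation("person", "riding", "bicycle", 0.91),
    Relation("dog", "next to", "person", 0.78),
    Relation("car", "behind", "bicycle", 0.66),
    Relation("person", "holding", "umbrella", 0.32),
]

kept = filter_relations(relations, ["person", "dog"])
for r in kept:
    print(f"{r.subject} {r.predicate} {r.obj} ({r.score:.2f})")
```

With these sample triplets, the `car` relation is dropped because its subject was not selected, and the low-scoring `person holding umbrella` relation is dropped by the assumed threshold.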
Corresponding author: Neng-Sheng Pai

This work is licensed under a Creative Commons Attribution 4.0 International License.

Cite this article: Ming-Yuan Shieh, Po-Kuan Wu, and Neng-Sheng Pai, Approach to Deep-learning-based Visual Relationship Detection for Scene Analysis, Sens. Mater., Vol. 37, No. 3, 2025, p. 1193-1209.