|
pp. 2415-2433
S&M4445 Report
https://doi.org/10.18494/SAM5949
Published: May 12, 2026

FALCON-HiMRAN: A Dual-stage RGB–D Sensor–based Scene Classification Framework with Cross-modal Fusion and Graph Reasoning

Nouf Abdullah Almujally, Ting Wu, Muhammad Waqas Ahmed, Ahmad Jalal, and Hui Liu
(Received September 29, 2025; Accepted April 13, 2026)

Keywords: cross-modal sensor fusion, multimodal sensing, FALCON, sensor noise, sensor-driven perception
In this study, we address key limitations of RGB–depth (D) sensing systems, including depth noise, sensor misalignment, missing depth values, and performance degradation under low illumination. We propose FALCON-HiMRAN, a dual-stage RGB–D sensor-driven scene classification framework comprising a feature-aligned lightweight cross-modal fusion (FALCON) network and a hierarchical multi-region aggregation network (HiMRAN), designed to enhance the reliability and interpretability of multimodal sensing systems. The proposed method integrates data acquired from structured-light and time-of-flight RGB–D sensors and uses the FALCON network to mitigate modality inconsistencies and sensor-induced noise. Furthermore, HiMRAN performs region-level reasoning through graph-based modeling of spatial relationships. Experimental evaluation on benchmark RGB–D datasets demonstrates improved robustness under challenging sensing conditions such as occlusion, illumination variation, and depth degradation. The proposed framework advances sensor-based perception systems by enabling more reliable scene understanding from imperfect multimodal sensor data. Remaining challenges include real-time deployment and the handling of extreme sensor noise in outdoor environments.
Corresponding authors: Ahmad Jalal and Hui Liu

This work is licensed under a Creative Commons Attribution 4.0 International License.

Cite this article
Nouf Abdullah Almujally, Ting Wu, Muhammad Waqas Ahmed, Ahmad Jalal, and Hui Liu, FALCON-HiMRAN: A Dual-stage RGB–D Sensor–based Scene Classification Framework with Cross-modal Fusion and Graph Reasoning, Sens. Mater., Vol. 38, No. 5, 2026, pp. 2415-2433. |