pp. 3413-3420
S&M4510 Research paper
https://doi.org/10.18494/SAM6189
Published: June 26, 2026
Learning Marine Spatial Information Data Using a Korean Speech-based Large Language Model
Je Hyung Tak, Yun Soo Choi, Min Sung Kim, and Chan Woo Lee
(Received November 24, 2025; Accepted June 4, 2026)
Keywords: marine spatial information, spatial information, LLM, RAG, fine tuning
In previous studies, large language models (LLMs) were fine-tuned on Korean utterance data with the low-rank adaptation (LoRA) method to improve performance in small-scale computing environments. Gradient checkpointing and gradient accumulation were also employed to address computational resource limitations, enabling efficient fine-tuning under constrained computing conditions. The purpose of this study is to develop a marine geospatial information LLM. To this end, the LLM previously fine-tuned on Korean utterance data was enhanced by integrating marine geospatial information data through a retrieval-augmented generation (RAG) framework, a document-based inference approach. First, domain-specific terminology learning was conducted using the International Hydrographic Organization Dictionary (S-32), which provides standardized definitions. Next, data on current speed, current direction, wind speed, and wind direction were collected from the Badanuri marine information service provided by the Korea Hydrographic and Oceanographic Agency and incorporated into the RAG knowledge base. Finally, S-101 and S-102 datasets were preprocessed to extract bathymetric depth information, which was also integrated into the RAG framework. In conclusion, this study demonstrates the feasibility of developing an LLM specialized in marine geospatial information in a resource-constrained environment and enhances the practical applicability of the proposed LLM through RAG-based knowledge integration.
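The retrieval step of a RAG pipeline like the one summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the knowledge-base entries, station and grid-cell names, numeric values, and the function names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are all hypothetical placeholders, and a simple bag-of-words cosine similarity stands in for whatever embedding-based retriever the study actually used.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base entries. The wording and numbers are placeholders,
# not data taken from S-32, Badanuri, S-101, or S-102.
KNOWLEDGE_BASE = [
    "Glossary entry: 'sounding' is a measured depth of water.",
    "Observation at station A: current speed 0.5 m/s, current direction 90 degrees.",
    "Observation at station A: wind speed 4.0 m/s, wind direction 180 degrees.",
    "Bathymetry grid cell B: depth 12.3 m.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Prepend the retrieved context to the question before it goes to the LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Calling `build_prompt("What is the wind speed at station A?", KNOWLEDGE_BASE)` would yield a prompt whose context section lists the best-matching entries first; the fine-tuned LLM would then generate its answer from that prompt rather than from its parameters alone.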
Corresponding author: Yun Soo Choi
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article
Je Hyung Tak, Yun Soo Choi, Min Sung Kim, and Chan Woo Lee, Learning Marine Spatial Information Data Using a Korean Speech-based Large Language Model, Sens. Mater., Vol. 38, No. 6, 2026, p. 3413-3420.