pp. 2195-2204
S&M 3316 Research Paper of Special Issue
https://doi.org/10.18494/SAM4410
Published: July 13, 2023

Image Caption Generation Using Scoring Based on Object Detection and Word2Vec

Tadanobu Misawa, Nozomi Morizumi, and Kazuya Yamashita
(Received March 30, 2023; Accepted June 6, 2023)

Keywords: image caption generation, deep learning, object detection, Word2Vec, scoring
Generating descriptive text from images, known as caption generation, is a noteworthy research field with potential applications such as aiding the visually impaired. Recently, numerous methods based on deep learning have been proposed. Previous methods learn the relationship between image features and captions from a large dataset of image–caption pairs. However, it is difficult to correctly learn all objects, object attributes, and relationships between objects. As a result, incorrect captions are occasionally generated; for instance, a caption may describe objects that are not present in the image. In this study, we propose a scoring method using object detection and Word2Vec to output a caption that correctly describes the objects in the image. First, multiple candidate captions are generated. Subsequently, object detection is performed, and a score is calculated for each caption using the labels obtained from object detection and the nouns extracted from that caption. Finally, the caption with the highest score is output. Experimental evaluation on the Microsoft Common Objects in Context (MSCOCO) dataset demonstrates that the proposed method is effective in improving the accuracy of caption generation.
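The abstract describes the scoring step only at a high level. The Python sketch below illustrates one plausible reading of it, assuming a caption's score is the average Word2Vec similarity between each detected object label and its closest noun in the caption; the pretrained model file, the helper names, and the aggregation rule are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of object-detection/Word2Vec-based caption scoring.
# Assumptions (not from the paper): GoogleNews word2vec vectors, NLTK for
# noun extraction, and max-similarity-per-label averaged over all labels.
from gensim.models import KeyedVectors
import nltk  # assumes 'punkt' and 'averaged_perceptron_tagger' data are installed

# Hypothetical pretrained embedding file; any word2vec-format model would do.
word2vec = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)


def extract_nouns(caption: str) -> list[str]:
    """Extract the nouns from a caption using NLTK POS tagging."""
    tokens = nltk.word_tokenize(caption.lower())
    return [word for word, tag in nltk.pos_tag(tokens) if tag.startswith("NN")]


def caption_score(detected_labels: list[str], caption: str) -> float:
    """Score a caption by how well its nouns cover the detected object labels."""
    nouns = [n for n in extract_nouns(caption) if n in word2vec]
    labels = [l for l in detected_labels if l in word2vec]
    if not nouns or not labels:
        return 0.0
    # For each detected label, take the similarity to its closest caption noun,
    # then average over all labels (assumed aggregation).
    return sum(
        max(word2vec.similarity(label, noun) for noun in nouns) for label in labels
    ) / len(labels)


def select_caption(detected_labels: list[str], candidates: list[str]) -> str:
    """Return the candidate caption with the highest score."""
    return max(candidates, key=lambda c: caption_score(detected_labels, c))


# Example usage with hypothetical detector output and candidate captions.
if __name__ == "__main__":
    labels = ["dog", "frisbee"]
    candidates = [
        "a dog jumping to catch a frisbee in a park",
        "a cat sitting on a couch next to a laptop",
    ]
    print(select_caption(labels, candidates))
```

Under these assumptions, a candidate whose nouns ("dog", "frisbee") align with the detector's labels outranks one describing absent objects, which is the behavior the abstract attributes to the scoring step.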
Corresponding author: Tadanobu Misawa
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article: Tadanobu Misawa, Nozomi Morizumi, and Kazuya Yamashita, Image Caption Generation Using Scoring Based on Object Detection and Word2Vec, Sens. Mater., Vol. 35, No. 7, 2023, pp. 2195-2204.