pp. 91-104
S&M3499 Research Paper of Special Issue
https://doi.org/10.18494/SAM4531
Published: January 24, 2024
Applying Deep Learning Neural Network with Randomly Downscaled Image and Data Augmentation to Multiscale Image Enlargement
Ming-Tsung Yeh, Wei-Yin Lo, Yi-Nung Chung, and Hong-Yi Cai
(Received May 29, 2023; Accepted November 30, 2023)
Keywords: image enlargement, advanced CAE, residual network, data augmentation
Digital images are used extensively in entertainment, education, research, medicine, and industry, and many of them must be resized for better presentation. Image resizing is generally performed by conventional image processing techniques. However, the enlarged image usually contains unacceptable amounts of noise, blurring, and jagged edges, whereas a high resolution (HR) is usually required for output images. Learning-based methods that enlarge the input image and reconstruct an HR output give better results, but they require substantial external training datasets. In this study, we propose a randomly downscaled image and data augmentation (RDIDA) module that shrinks images by a random scale and produces multiscale samples, reducing the dependence on preparing large datasets for the training stage. We also propose the image enlargement neural network (IENN), a deep learning architecture based on an advanced convolutional autoencoder (CAE), to address the poor quality of output images. The proposed IENN with RDIDA accepts multiscale inputs and effectively enlarges images to specific sizes with high resolution. This learning-based approach differs from other methods in its use of multiple residual networks. The encoder of the advanced CAE structure captures the features of the original image, and the decoder with a residual structure then creates an enlarged image of HR quality. Used for enlargement, the CAE network effectively denoises the image and reduces distortions, mitigating the drawbacks of conventional processing. Our experimental results show that the validation peak signal-to-noise ratio (PSNR) of the proposed model exceeded 29.55 dB at epoch 30 of the training stage. Furthermore, the model achieved an average PSNR above 26 dB on all test samples, demonstrating robust performance.
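The two quantities the abstract relies on, a randomly chosen downscaling factor and the PSNR between a reference and a reconstructed image, can be illustrated with a minimal Python sketch. It is not the authors' RDIDA module or IENN: the function names, the scale range of 0.25 to 0.75, the nearest-neighbor sampling, and the 8-bit peak value of 255 are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-sized grayscale images.

    Images are lists of rows of pixel values; max_value=255 assumes 8-bit data.
    """
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")
    count = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for r, e in zip(reference, estimate)
        for a, b in zip(r, e)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def random_downscale(image, min_scale=0.25, max_scale=0.75, rng=None):
    """Shrink an image by a random factor (hypothetical stand-in for RDIDA).

    Uses nearest-neighbor sampling, which is an assumption; the scale range
    is also illustrative. Returns the smaller image and the factor used.
    """
    rng = rng or random.Random()
    scale = rng.uniform(min_scale, max_scale)
    height, width = len(image), len(image[0])
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))
    small = [
        [
            image[min(height - 1, int(y * height / new_h))][
                min(width - 1, int(x * width / new_w))
            ]
            for x in range(new_w)
        ]
        for y in range(new_h)
    ]
    return small, scale
```

Calling `random_downscale` repeatedly on one HR image yields low-resolution inputs at different scales, each paired with the original as the training target, while `psnr` scores an enlarged output against that original.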
Corresponding author: Ming-Tsung Yeh
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article
Ming-Tsung Yeh, Wei-Yin Lo, Yi-Nung Chung, and Hong-Yi Cai, Applying Deep Learning Neural Network with Randomly Downscaled Image and Data Augmentation to Multiscale Image Enlargement, Sens. Mater., Vol. 36, No. 1, 2024, pp. 91-104.