pp. 3255-3270
S&M4501 Research paper
https://doi.org/10.18494/SAM6159
Published: June 18, 2026
Dataset Distillation on Medical Dataset Using Genetic Algorithm
Takumi Sato, Yasumasa Tamura, and Masahito Yamamoto
(Received January 5, 2026; Accepted June 11, 2026)
Keywords: dataset distillation, genetic algorithm, MedMNIST
MedMNIST is a collection of medical image classification datasets whose images were acquired with a variety of techniques, including computed tomography, optical coherence tomography, and microscopy. Compressing MedMNIST datasets with dataset distillation is challenging in the images per class (IPC) = 1 setting. Methods that update the compressed dataset without using gradients remain largely unexplored for this problem, especially on MedMNIST. We propose a simple dataset distillation method, based on the Deep Learning Evolutionary Algorithm (DL-EA), that optimizes synthetic images and focuses mainly on MedMNIST. The method updates synthetic images and evaluates them using a subset of the training data. Because DL-EA is a neural network (NN) training method, it cannot be applied to dataset distillation directly; our main contribution is to extend it so that it updates synthetic images rather than NN weights. We also made two further improvements: (1) we added the Laplace crossover method to optimize the synthetic images, and (2) to reduce the computational cost of each generation, we used only a subset of the training dataset. We evaluated our method on MNIST and eight medical datasets included in MedMNIST, all of which have previously been evaluated with gradient-based dataset distillation, focusing mainly on IPC = 1. On MNIST and all eight MedMNIST datasets, our method achieves higher accuracy than a strong random selection baseline.
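As an illustration of the Laplace crossover step named in the abstract, the following is a minimal sketch of one common formulation applied to two flattened synthetic images. The abstract does not give the exact variant used, so everything here is an assumption: the function name `laplace_crossover`, the parameters `loc` and `scale`, the choice to draw one coefficient per pixel, and the clipping of offspring to [0, 1]. It uses only the Python standard library; a practical implementation would more likely operate on arrays.

```python
import math
import random


def laplace_crossover(parent_a, parent_b, loc=0.0, scale=0.5, rng=random):
    """Return two children from two flattened images (lists of floats in [0, 1]).

    Hypothetical helper: loc, scale, and the per-pixel sampling are
    illustrative assumptions, not settings taken from the paper.
    """
    child_a, child_b = [], []
    for xa, xb in zip(parent_a, parent_b):
        # Inverse-CDF sample of a Laplace(loc, scale) variate.
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
        if rng.random() <= 0.5:
            beta = loc - scale * math.log(u)
        else:
            beta = loc + scale * math.log(u)
        # Both children are shifted by the same amount, beta * |xa - xb|.
        spread = abs(xa - xb)
        # Clip to the valid pixel range (an assumption of this sketch).
        child_a.append(min(1.0, max(0.0, xa + beta * spread)))
        child_b.append(min(1.0, max(0.0, xb + beta * spread)))
    return child_a, child_b


# Usage: cross two random 28x28 "images" stored as flat lists.
rng = random.Random(0)
image_a = [rng.random() for _ in range(28 * 28)]
image_b = [rng.random() for _ in range(28 * 28)]
child_1, child_2 = laplace_crossover(image_a, image_b, rng=rng)
```

Because the offset scales with the distance between the parents, pixels on which the parents already agree are left unchanged, while pixels on which they differ are explored more widely.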
Corresponding author: Takumi Sato
This work is licensed under a Creative Commons Attribution 4.0 International License.
Cite this article
Takumi Sato, Yasumasa Tamura, and Masahito Yamamoto, Dataset Distillation on Medical Dataset Using Genetic Algorithm, Sens. Mater., Vol. 38, No. 6, 2026, pp. 3255-3270.