Congratulations to Jeong Woon
[Title]
Mix-Spectrum for Generalization in Visual Reinforcement Learning
[Journal]
IEEE Access, Vol. 13, pp. 7939-7950, 2025.
[Authors]
Jeongwoon Lee and Hyoseok Hwang*
[Summary]
To improve the generalization of visual reinforcement learning (RL) agents, this study proposes Mix-Spectrum, a novel frequency-based image augmentation method that increases data diversity while preserving semantic information. Mix-Spectrum combines existing methods to improve performance, is applicable to all types of visual RL algorithms, and demonstrates superior zero-shot generalization performance in experiments.
[Key Figures]
Synthetic images from DMControl-GB augmented with Mixup (top) and Mix-Spectrum (bottom) under the same mixing ratio. Mix-Spectrum preserves the semantic information of the image better than Mixup.
The framework of Mix-Spectrum. The method first applies Random Convolution to the reference image to increase amplitude diversity. It then applies the Fast Fourier Transform (FFT) to the original and reference images to obtain their amplitudes and phases, and mixes the two amplitudes to get the mixed amplitude. Finally, it obtains the augmented image by applying the Inverse Fast Fourier Transform (IFFT) to the mixed amplitude combined with the original phase.
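The steps in the framework caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `random_convolution` and `mix_spectrum`, the linear mixing rule with ratio `lam`, and the single shared random kernel used as a simplified stand-in for Random Convolution are all assumptions made for this example.

```python
import numpy as np


def random_convolution(image, rng, kernel_size=3):
    """Filter every channel with one randomly sampled kernel.

    Assumption: a simplified stand-in for Random Convolution,
    used here only to perturb the reference image.
    """
    kernel = rng.normal(size=(kernel_size, kernel_size))
    pad = kernel_size // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    height, width = image.shape[:2]
    out = np.zeros_like(image, dtype=np.float64)
    # Accumulate the shifted, weighted copies of the padded image.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width, :]
    return out


def mix_spectrum(image, reference, lam, rng):
    """Mix amplitudes of two H x W x C float images, keeping the phase of `image`.

    Assumption: amplitudes are mixed linearly with ratio `lam` in [0, 1].
    """
    reference = random_convolution(reference, rng)

    # FFT over the two spatial axes, separately for each channel.
    f_image = np.fft.fft2(image, axes=(0, 1))
    f_reference = np.fft.fft2(reference, axes=(0, 1))

    amp_image, phase_image = np.abs(f_image), np.angle(f_image)
    amp_reference = np.abs(f_reference)

    # Mixed amplitude, original phase.
    amp_mixed = (1.0 - lam) * amp_image + lam * amp_reference
    mixed = amp_mixed * np.exp(1j * phase_image)

    # The imaginary part is numerical noise for real-valued inputs.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))
```

With `lam = 0` the sketch returns the original image unchanged, and with `lam = 1` the output carries the reference amplitude with the original phase, which is why the spatial structure of the original image is retained.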
[Key Results]
Comparison with other methods on the Color easy, Color hard, Video easy, and Video hard benchmarks in DMControl-GB. We report the mean and standard deviation over five random seeds. (·) denotes the standard deviation. The best results are in bold.
Training performance of SAC (top), DrQ (middle), and SVEA (bottom) on five tasks of DMControl-GB. The x-axis shows the number of frames in millions, and the y-axis shows the episode return.