In this episode we discuss Catch Missing Details: Image Reconstruction with Frequency Augmented
by Xinmiao Lin, Yikang Li, Jenhao Hsiao, Chiuman Ho, Yu Kong. The paper proposes a new architecture, the Frequency Augmented VAE (FA-VAE), to address the rapid loss of reconstruction quality that popular VQ-VAE models suffer as the compression rate increases. The architecture incorporates a Frequency Complement Module (FCM) to capture missing frequency information and a Dynamic Spectrum Loss (DSL) to balance frequencies for optimal reconstruction. The paper also introduces a Cross-Attention Autoregressive Transformer (CAT) to improve generation quality and image-text semantic alignment in text-to-image synthesis. Experiments on benchmark datasets show that FA-VAE and CAT outperform state-of-the-art methods in their respective tasks.
CVPR 2023 – Catch Missing Details: Image Reconstruction with Frequency Augmented