In this paper we propose a new paradigm for facial video compression. We leverage the generative capacity of GANs such as StyleGAN to represent and compress a video, covering both intra and inter coding. Each frame is inverted into the latent space of StyleGAN, in which the optimal compression is learned. To do so, a diffeomorphic latent representation is learned using a normalizing-flow model, where an entropy model can be optimized for image coding. In addition, we propose a new perceptual loss that is more efficient than existing alternatives. Finally, an entropy model for video inter coding with latent residuals is also learned in the previously constructed latent representation. Our method (SGANC) is simple, faster to train, and achieves better results for image and video coding than state-of-the-art codecs such as VTM and AV1, as well as recent deep learning techniques. In particular, it drastically reduces perceptual distortion at low bit rates.
Video coding using learned latent GAN compression
Research Paper / Oct 2022
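The abstract describes a pipeline of GAN inversion, a normalizing-flow mapping of the latent, and entropy coding of the (residual) latent for inter frames. The sketch below is not the authors' code: the toy inverter, the single affine-coupling step, the frame size, and the factorized Gaussian entropy proxy are all illustrative assumptions intended only to show how the pieces fit together.

```python
# Hypothetical sketch of the SGANC-style pipeline: frame -> latent w (GAN inversion),
# w -> z via a normalizing flow, quantization, and bit-rate estimation with an
# entropy model; inter frames code the latent residual against the previous frame.
import torch
import torch.nn as nn

LATENT_DIM = 512            # assumed StyleGAN W-space dimensionality
FRAME_SHAPE = (3, 64, 64)   # toy frame size for this sketch


class ToyInverter(nn.Module):
    """Stand-in for a GAN-inversion encoder mapping a frame to a latent w."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, LATENT_DIM))

    def forward(self, frame):
        return self.net(frame)


class AffineCoupling(nn.Module):
    """One affine-coupling step of a normalizing flow (invertible by construction)."""
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * (dim - self.half)))

    def forward(self, w):
        w1, w2 = w[:, :self.half], w[:, self.half:]
        s, t = self.net(w1).chunk(2, dim=1)
        z2 = w2 * torch.exp(s) + t            # invertible affine transform of one half
        return torch.cat([w1, z2], dim=1)


def estimate_bits(z_hat, mu, sigma):
    """Factorized Gaussian entropy proxy: bits needed to code the quantized latent."""
    normal = torch.distributions.Normal(mu, sigma)
    p = normal.cdf(z_hat + 0.5) - normal.cdf(z_hat - 0.5)
    return -torch.log2(p.clamp_min(1e-9)).sum()


inverter = ToyInverter()
flow = AffineCoupling(LATENT_DIM)
mu = torch.zeros(LATENT_DIM)
sigma = torch.ones(LATENT_DIM)    # in the paper the entropy model is learned, not fixed

frames = torch.rand(2, *FRAME_SHAPE)               # two consecutive frames
with torch.no_grad():
    z_prev = flow(inverter(frames[0:1]))           # intra frame: code its latent directly
    z_curr = flow(inverter(frames[1:2]))
    z_prev_hat = torch.round(z_prev)               # quantization before entropy coding
    residual_hat = torch.round(z_curr - z_prev_hat)  # inter frame: code the latent residual

    intra_bits = estimate_bits(z_prev_hat, mu, sigma)
    inter_bits = estimate_bits(residual_hat, mu, sigma)
    print(f"intra ~{intra_bits.item():.0f} bits, inter residual ~{inter_bits.item():.0f} bits")
```

In this reading of the abstract, the flow makes the latent distribution easy to model, so a simple factorized entropy model suffices for both the intra latent and the inter residual; the actual loss terms (including the proposed perceptual loss) and architectures are detailed in the paper itself.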