We propose a new paradigm for facial video compression. We leverage the generative capacity of GANs such as StyleGAN to represent and compress a video, covering both intra and inter coding. Each frame is inverted into the latent space of StyleGAN, from which the optimal compression is learned. To this end, a diffeomorphic latent representation is learned using a normalizing-flows model, in which an entropy model can be optimized for image coding. In addition, we propose a new perceptual loss that is more efficient than its counterparts. Finally, an entropy model for residual-based video inter coding is learned in the same latent representation. Our method (SGANC) is simple, faster to train, and achieves better results for image and video coding than state-of-the-art codecs such as VTM and AV1 as well as recent deep-learning techniques. In particular, it drastically reduces perceptual distortion at low bit rates.
Video coding using learned latent GAN compression
Research Paper / Oct 2022
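The coding pipeline described in the abstract (invert each frame into a latent, map it through an invertible flow into a space where a factorized entropy model is easy to fit, then quantize and measure the entropy-coding cost) can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the paper's implementation: the 512-d latents are random stand-ins for StyleGAN inversions, the "flow" is a single affine whitening step rather than a learned normalizing-flows model, and the entropy model is a fixed unit Gaussian.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# Stand-in for StyleGAN inversions: a batch of 16 frames, each a 512-d latent.
latents = rng.normal(loc=2.0, scale=3.0, size=(16, 512))

# One affine "flow" step (a real normalizing flow stacks many invertible
# layers): map latents toward a unit Gaussian, where a factorized entropy
# model is cheap to fit. The map is exactly invertible.
mu, sigma = latents.mean(axis=0), latents.std(axis=0)
z = (latents - mu) / sigma          # forward pass
recon = z * sigma + mu              # inverse pass recovers the latents

# Quantize in the flow space and estimate the bit cost of the quantized
# symbols under a standard-normal entropy model.
def gauss_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

q = np.round(z)
p = np.array([gauss_cdf(v + 0.5) - gauss_cdf(v - 0.5) for v in q.ravel()])
total_bits = float(-np.log2(np.clip(p, 1e-12, None)).sum())
bits_per_dim = total_bits / q.size
```

Because the flow is invertible, quantization in the flow space is the only source of loss; the rate is then just the negative log-likelihood of the quantized symbols under the entropy model, which is the quantity an arithmetic coder would (approximately) achieve.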