This paper presents an end-to-end Artificial Neural Network (ANN)-based compression framework developed for the video compression task of the Challenge on Learned Image Compression (CLIC) at CVPR 2021. In this framework, the video frames are divided into Groups Of Pictures (GOPs), in which each frame can be encoded in Intra or Inter mode. In Intra mode, an auto-encoder compresses the pixel values directly. For Inter frames, we leverage bi-directional prediction with reference frame signaling, which allows for efficient hierarchical GOP temporal structures. The motion information, computed from the luminance component, and the prediction residuals are compressed using dedicated auto-encoder structures whose layers are conditioned on the GOP structure. The network is trained fully end-to-end, from scratch. The results demonstrate the promise of end-to-end approaches.
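To make the idea of a hierarchical GOP temporal structure concrete, the sketch below builds the coding order for a generic dyadic GOP of the kind used in conventional random-access configurations. It is an illustration only: the abstract does not specify the actual GOP layout, so the power-of-two GOP size, the intra-coded boundary frames, and the names `FrameCode` and `hierarchical_gop_order` are all assumptions made here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FrameCode:
    index: int                       # display-order position inside the GOP
    mode: str                        # "intra" or "inter"
    refs: Optional[Tuple[int, int]]  # (past, future) references, None for intra
    layer: int                       # temporal layer: 0 at GOP boundaries


def hierarchical_gop_order(gop_size: int) -> List[FrameCode]:
    """Return the coding order for one dyadic hierarchical GOP.

    Assumption (not from the paper): both boundary frames are intra-coded,
    and every other frame is bi-directionally predicted from the two
    nearest frames that have already been coded.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")

    order = [
        FrameCode(0, "intra", None, 0),
        FrameCode(gop_size, "intra", None, 0),
    ]
    # Split intervals breadth-first: a midpoint is coded only after both
    # of its endpoints, so its two references are always available.
    intervals = [(0, gop_size)]
    layer = 1
    while intervals:
        next_intervals = []
        for left, right in intervals:
            if right - left < 2:
                continue  # no frame strictly between the endpoints
            mid = (left + right) // 2
            order.append(FrameCode(mid, "inter", (left, right), layer))
            next_intervals += [(left, mid), (mid, right)]
        intervals = next_intervals
        layer += 1
    return order
```

With `gop_size=8`, the frames are coded in the order 0, 8, 4, 2, 6, 1, 3, 5, 7, and frame 4 is predicted from frames 0 and 8. Because each Inter frame lists its two references explicitly, a decoder following this order always has both references available before it needs them.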
Hierarchical Temporal Structure for End-to-End Neural Network-based Video Compression
Research Paper / Jun 2021