Recent work in video compression has shown that de-correlating residuals with multiple 2D transforms, rather than a single transform, improves compression efficiency. The candidate transforms are tested competitively inside the video encoder, and the one with the lowest Rate Distortion Optimization (RDO) cost is selected. However, the encoder must then signal a syntax element indicating the chosen transform for each residual block so that the decoder can reconstruct the pixels. Conventionally, this transform index is binarized using fixed-length coding, with a CABAC context model attached to it. In this work, we propose a novel method that uses a Convolutional Neural Network (CNN) to predict the chosen transform index from the quantized coefficient block. The prediction probabilities are used to binarize the index with variable-length coding instead of fixed-length coding. Results show that employing this modified transform index coding scheme inside HEVC achieves up to 0.59% BD-rate gain.
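The difference between the two binarizations can be illustrated with a minimal sketch. It assumes a truncated unary code assigned by probability rank, which is only one possible variable-length scheme; the abstract does not specify the exact code, and the function names and the probability values below are hypothetical, not taken from the paper.

```python
import math


def fixed_length_bins(index, num_transforms):
    """Fixed-length binarization: every index costs ceil(log2(N)) bins."""
    width = max(1, math.ceil(math.log2(num_transforms)))
    return format(index, f"0{width}b")


def truncated_unary(rank, max_rank):
    """Truncated unary: `rank` ones, then a terminating zero unless rank == max_rank."""
    return "1" * rank + ("0" if rank < max_rank else "")


def probability_ranked_bins(index, probs):
    """Binarize `index` by its rank under the predicted probabilities.

    Assumption: the most probable transform gets the shortest codeword.
    Ties are broken by the smaller index so the ordering is deterministic.
    """
    order = sorted(range(len(probs)), key=lambda i: (-probs[i], i))
    rank = order.index(index)
    return truncated_unary(rank, len(probs) - 1)


# Hypothetical predicted probabilities for four candidate transforms.
probs = [0.10, 0.70, 0.15, 0.05]

for idx in range(len(probs)):
    print(idx, fixed_length_bins(idx, len(probs)), probability_ranked_bins(idx, probs))
```

With these made-up probabilities, the most likely index (1) is written with a single bin, whereas fixed-length coding spends two bins on every index. The scheme only pays off when the prediction is usually right, since poorly ranked indices receive longer codewords than the fixed-length ones.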
CNN-BASED TRANSFORM SYNTAX PREDICTION IN ADAPTIVE MULTIPLE TRANSFORMS FRAMEWORK TO ASSIST ENTROPY CODING IN HEVC