CNN-BASED TRANSFORM SYNTAX PREDICTION IN ADAPTIVE MULTIPLE TRANSFORMS FRAMEWORK TO ASSIST ENTROPY CODING IN HEVC




Research Paper / EUSIPCO 2017 / Aug 2017 / Machine/Deep Learning/AI, Image Processing, Video Coding

Recent work in video compression has shown that de-correlating residuals with multiple 2D transforms, rather than a single transform, improves compression efficiency. The candidate transforms are tested competitively inside the video encoder, and the one with the lowest Rate Distortion Optimization (RDO) cost is selected. However, the encoder must then signal a syntax element identifying the chosen transform for each residual block so that the decoder can reconstruct the pixels correctly. Conventionally, this transform index is binarized with fixed-length coding and a CABAC context model is attached to it. In this work, we propose a novel method that uses a Convolutional Neural Network (CNN) to predict the chosen transform index from the quantized coefficient block. The prediction probabilities are then used to binarize the index with variable-length coding instead of fixed-length coding. Results show that employing this modified transform index coding scheme inside HEVC yields a BD-rate gain of up to 0.59%.
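
The difference between the two binarization schemes can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the exact variable-length code, so the sketch assumes a truncated unary code over the probability-ranked indices, a hypothetical set of four candidate transforms, and made-up probabilities standing in for the CNN output. All function names are illustrative.

```python
import math


def fixed_length_bins(index, num_transforms):
    """Fixed-length binarization: every index costs ceil(log2(N)) bins."""
    width = max(1, math.ceil(math.log2(num_transforms)))
    return format(index, f"0{width}b")


def _rank_order(probs):
    """Transform indices sorted from most to least probable (ties by index)."""
    return sorted(range(len(probs)), key=lambda i: (-probs[i], i))


def variable_length_bins(index, probs):
    """Assumed scheme: code the probability rank of the index in truncated unary.

    Rank r is written as r ones followed by a zero; the last rank omits
    the terminating zero because no longer codeword exists.
    """
    rank = _rank_order(probs).index(index)
    if rank == len(probs) - 1:
        return "1" * rank
    return "1" * rank + "0"


def decode_variable_length(bins, probs):
    """Invert variable_length_bins using the same predicted probabilities."""
    rank = bins.count("1")  # number of leading ones equals the rank
    return _rank_order(probs)[rank]


# Hypothetical CNN output for one quantized coefficient block (illustrative only).
probs = [0.1, 0.6, 0.2, 0.1]

for idx in range(len(probs)):
    print(idx, fixed_length_bins(idx, len(probs)), variable_length_bins(idx, probs))
```

In this example the most probable index (1) costs a single bin instead of two, while the least probable indices cost three. If the predicted probabilities matched the true selection frequencies, the expected cost would be about 1.6 bins per block rather than 2. The decoder can invert the mapping only because the prediction is computed from the quantized coefficient block, which it is assumed to have parsed before the transform index.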