The ubiquitous deployment of 4G/5G technology has made it a critical infrastructure for society that will facilitate the delivery and adoption of emerging applications and use cases (extended reality, automation, robotics, to name but a few). These new applications require high throughput and low latency in both uplink and downlink for optimal performance, while…
Learning-based point cloud (PC) compression is a promising research avenue to reduce the transmission and storage costs for PC applications. Existing learning-based methods to compress PC attributes employ variational autoencoders (VAE) or normalizing flows (NF) to learn compact signal representations. However, VAEs leverage a lower-dimensional bottleneck that…
Achieving successful variable bitrate compression with computationally simple algorithms from a single end-to-end learned image or video compression model remains a challenge. Many approaches have been proposed, including conditional auto-encoders, channel-adaptive gains for the latent tensor or uniformly quantizing all elements of the latent tensor. This paper …
Video postproduction pipelines will increasingly benefit from artificial intelligence tools. For instance, the automatic extraction of specific objects helps the postproduction workflow. In particular, boom mic removal could be accelerated and color chart detection could result in a more efficient color pipeline. For now, the segmentation of these objects is us…
We report a new neural backdoor attack, named Hibernated Backdoor, which is stealthy, aggressive and devastating. The backdoor is planted in a hibernated mode to avoid being detected. Once deployed and fine-tuned on end devices, the hibernated backdoor turns into the active state that can be exploited by the attacker. To the best of our knowledge, this is the fi…
The universality of the point cloud format enables many 3D applications, making the compression of point clouds a critical phase in practice. Sampled as discrete 3D points, a point cloud approximates 2D surface(s) embedded in 3D with a finite bit-depth. However, the point distribution of a practical point cloud changes drastically as its bit-depth increases, req…
In this paper, we address the problem of view synthesis from large baseline light fields by turning a sparse set of input views into a Multi-plane Image (MPI). Because available datasets are scarce, we propose a lightweight network that does not require extensive training. Unlike the latest approaches, our model does not learn to estimate RGB layers but only encodes…
Human character animation is often critical in entertainment content production, including video games, virtual reality or fiction films. To this end, deep neural networks drive most recent advances through deep learning (DL) and deep reinforcement learning (DRL). In this article, we propose a comprehensive survey on the state-of-the-art approaches based on eith…
The 3rd Generation Partnership Project (3GPP) Release 18 has initiated a comprehensive study of Artificial Intelligence (AI)/Machine Learning (ML) use cases for Air Interface, e.g., Channel State Information (CSI) feedback enhancement, beam management, and positioning accuracy enhancement. In order to advance the adoption of AI/ML in 5G and towards 6G, it …
Deep bi-prediction blending. This paper presents a learning-based method to improve bi-prediction in video coding. In conventional video coding solutions, block-based motion compensation from already decoded reference pictures stands out as the main tool used to predict the current frame. In particular, bi-predicted blocks, i.e., blocks that combine two differ…
Building upon a digital transformation, Industry 4.0 (I4.0) aims to build the factories of the future, which feature additional flexibility, increasingly connected infrastructures and automated processes. 5G is playing a paramount role in this transformation, as it can offer high bandwidth, reliable and low latency wireless connectivity to meet the stringent …
Recently, learning methods have been designed to create Multiplane Images (MPIs) for view synthesis. While MPIs are extremely powerful and facilitate high quality renderings, a great amount of memory is required, making them impractical for many applications. In this paper, we propose a learning method that optimizes the available memory to render compact and ad…
Despite many modern applications of Deep Neural Networks (DNNs), the large number of parameters in the hidden layers makes them unattractive for deployment on devices with storage capacity constraints. In this paper we propose a Data-Driven Low-rank (DDLR) method to reduce the number of parameters of pretrained DNNs and expedite inference by imposing low-rank st…
Deep learning based automatic modulation classification (AMC) has received significant attention owing to its potential applications in both military and civilian use cases. Recently, data-driven subsampling techniques have been utilized to overcome the challenges associated with computational complexity and training time for AMC. Beyond these direct advantages …
This paper presents CompressAI, an open-source library that provides custom operations, layers, models and tools to research, develop, and evaluate end-to-end image and video codecs. In particular, CompressAI includes pre-trained models and evaluation tools to compare learned methods with traditional codecs. Multiple models from the state-of-the-art on learned e…
This document describes the winning solution to the GNN Challenge 2020 organized by the Barcelona Neural Networking
Music source separation is the task of isolating individual instruments which are mixed in a musical piece. This task is particularly challenging, and even state-of-the-art models can hardly generalize to unseen test data. Nevertheless, prior knowledge about individual sources can be used to better adapt a generic source separation model to the observed signal. …
We present a novel learning-based approach to synthesize new views of a light field image. In particular, given the four corner views of a light field, the presented method estimates any in-between view. We use three sequential convolutional neural networks for feature extraction, scene geometry estimation and view selection. Compared to state-of-the-art approac…
Learning from multi-label data in an interactive framework is a challenging problem as algorithms must withstand some additional constraints: in particular, learning from few training examples in a limited time. A recent study of multi-label classifier behaviors in this context has identified the potential of the ensemble method “Random Forest of Predictive Clus…
The backdoor attack raises a serious security concern for deep neural networks by fooling a model into misclassifying certain inputs designed by an attacker. In particular, the trigger-free backdoor attack is very challenging to detect and mitigate. It targets one or a few specific samples, called target samples, to misclassify them to a target class. Without …
Generating complex discrete distributions remains one of the challenging problems in machine learning. Existing techniques for generating complex distributions with high degrees of freedom depend on standard generative models like Generative Adversarial Networks (GAN), Wasserstein GAN, and associated variations. Such models are based on an optimization involv…
The challenging propagation environment, combined with the hardware limitations of mmWave systems, gives rise to the need for accurate initial access beam alignment strategies with low latency and high achievable beamforming gain. Much of the recent work in this area focuses either on one-sided beam alignment or on joint beam alignment methods where both sides of …
High-quality facial attribute editing in videos is a challenging problem as it requires the modifications to be realistic and consistent throughout the video frames. Previous works address the problem with auto-encoder architectures and rely on adversarial training to ensure the attribute editing and the temporal consistency of the results. However, many algorit…
The latest standard, Versatile Video Coding (VVC), aims to improve the compression efficiency by saving around 50% of bitrate at the same quality compared to its predecessor, High Efficiency Video Coding (HEVC). However, this comes with a significant rise in computational complexity due to the newly added tools on the encoder side. This paper proposes a speed-up tec…
The futures of AI and wireless networks are intricately intertwined. On the one hand, AI is a potent tool for automating the deployment and management of wireless networks. The next-generation wireless network, on the other hand, can support the training and deployment of AI models by providing an ocean of multi-modal data and a distributed computation resource.…
In recent years, we have seen the development of integrated plenoptic sensors, where multiple pixels are placed under one microlens. These sensors are mainly used by cameras and smartphones to drive the autofocus of the main lens, and they often take the form of dual-pixels with 2 rectangular sub-pixels. We study the evolution of dual-pixels, the so-called quad-pixel sensor …
While advances in Machine Learning have revolutionized certain areas (computer vision, robotics, natural language processing, etc.), the application in wireless communications has been less dramatic. One limiting factor is the (potentially) high computational complexity. Yet another important inhibitor is the lack of realistic datasets. To fully understand the p…
This paper describes the MediaEval 2021 Predicting Media Memorability task. After first being proposed at MediaEval 2018, the Predicting Media Memorability task is in its 4th edition this year, as the prediction of short-term and long-term video memorability remains a challenging task. This year, two datasets of videos are used: first, as in the 2020 task, a sub…
The compressed beamforming algorithm is used in the current Wi-Fi standard to reduce the beamforming feedback overhead (BFO). However, with each new amendment of the standard, the number of supported antennas in Wi-Fi devices increases, leading to increased BFO and hampering the throughput despite using compressed beamforming. In this paper, a novel index-based meth…
Federated learning (FL) is a distributed machine learning technique developed and pursued in part to address data privacy and security issues. Participant selection is critical to determine the latency of the training process in a heterogeneous FL architecture, where users with different hardware setups and wireless channel conditions communicate with their base…
Low-power wide-area networks (LPWANs) bring exceptional networking capabilities that will enable the massive roll-out of the Internet of Things (IoT). Among these capabilities are the support of ultra-low power consumption devices – up to 10 years of battery life – and connectivity up to tens of kilometers. The Long Range (LoRa) protocol has captured the researc…