A Haptic Experience may take many shapes or forms, ranging from simple UI notifications on a phone, to fully immersive multi-sensorial experiences. In this context, dedicated Haptic authoring tools have been flourishing and the literature on Haptic design introduced many concepts and guidelines to improve Haptic Experiences. In this paper, we propose to go...
RESEARCH PAPER / Apr 2024
/
Machine learning/ Deep learning /Artificial Intelligence,
Computer Vision,
Image processing
The production, transmission, and display of video content require significant amounts of energy. Whether broadcasting or streaming a video, its display on modern televisions is responsible for a significant proportion of the energy consumption. This paper proposes a framework for analyzing and processing High Dynamic Range (HDR) video frames that...
In this work we propose several full-color metagrating solutions for single waveguide-based Augmented and Virtual Reality near-eye display systems. The presented solutions are based on a combination of reflective and/or transmissive diffraction gratings inside or outside a waveguide. The proposed designs have high intensity across a wide angular range. Applying...
Using a collection of publicly available links to short-form video clips averaging 6 seconds in duration, 1,275 users manually annotated each video multiple times to indicate both long-term and short-term memorability of the videos. The annotations were gathered as part of an online memory game and...
RESEARCH PAPER / Oct 2021
/
Audio processing,
Neural network,
Machine learning/ Deep learning /Artificial Intelligence
Music source separation is the task of isolating individual instruments which are mixed in a musical piece. This task is particularly challenging, and even state-of-the-art models can hardly generalize to unseen test data. Nevertheless, prior knowledge about individual sources can be used to better adapt a generic source separation model...
RESEARCH PAPER / Sep 2021
/
Video coding,
Machine learning/ Deep learning /Artificial Intelligence,
Image processing,
Computer Graphics
Recently, learning methods have been designed to create Multiplane Images (MPIs) for view synthesis. While MPIs are extremely powerful and facilitate high quality renderings, a great amount of memory is required, making them impractical for many applications. In this paper, we propose a learning method that optimizes the available memory...
RESEARCH PAPER / Sep 2021
/
Optics,
Machine learning/ Deep learning /Artificial Intelligence,
Image processing
In recent years, we have seen the development of integrated plenoptic sensors, where multiple pixels are placed under one microlens. Such sensors are mainly used by cameras and smartphones to drive the autofocus of the main lens, and they often take the form of dual-pixels with 2 rectangular sub-pixels. We study...
We present a new method for reconstructing a 4D light field from a random set of measurements. A 4D light field block can be represented by a sparse model in the Fourier domain. As such, the proposed algorithm reconstructs the light field, block by block, by selecting frequencies of the...
Recently, advances in transform coding have contributed to significant bitrate savings for the next generation of video coding. In particular, the combination of different discrete trigonometric transforms (DTTs) was adopted in the Joint Video Exploration Team (JVET) solution, as well as the Benchmark Set (BMS) of the future video...
'Style transfer' among images has recently emerged as a very active research topic, fuelled by the power of convolutional neural networks (CNNs), and has fast become a very popular technology in social media. This paper investigates the analogous problem in the audio domain: How to transfer the style of a...
A wide color gamut (WCG) display has great color rendering capability and offers the opportunity to achieve a pleasing and realistic appearance in terms of image quality. To take full advantage of the large display gamut, a new gamut extension algorithm (GEA) is proposed based on a new color appearance...
Thanks to the increasing number of images stored in the cloud, external image similarities can be leveraged to efficiently compress images by exploiting inter-image correlations. In this paper, we propose a novel image prediction scheme for cloud storage. Unlike current state-of-the-art methods, we use a semi-local approach to exploit inter-image...
To enable light fields of large environments to be captured, they would have to be sparse, i.e. with a relatively large distance between views. Such sparseness, however, causes subsequent processing to be much more difficult than would be the case with dense light fields. This includes segmentation. In this paper,...
Plenoptic cameras enable a variety of novel post-processing applications, including refocusing and single-shot 3D imaging. To achieve high accuracy, such applications typically require knowledge of intrinsic camera parameters. One such parameter is the location of the main lens' optical center relative to the sensor, which is required for modeling radially...
In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation...
The proposed Single Layer SDR backward compatible HDR video distribution solution detailed in this paper, named SL-HDR1, and standardized in ETSI TS 103 433 specification, aims at addressing these issues. SL-HDR1 leverages SDR distribution networks and services already in place. It enables both high quality HDR rendering on HDR-enabled CE...
Light field acquisition devices allow capturing scenes with unmatched postprocessing possibilities. However, the huge amount of high-dimensional data poses challenging problems to light field processing in interactive time. In order to enable light field processing with a tractable complexity, in this paper, we address the problem of light field oversegmentation....
We consider example-guided audio source separation approaches, where the audio mixture to be separated is supplied with source examples that are assumed to match the sources in the mixture both in frequency and time. These approaches were successfully applied to tasks such as source separation by humming, score-informed music source...
This paper describes a light field scalable compression scheme based on the sparsity of the angular Fourier transform of the light field. A subset of sub-aperture images (or views) is compressed using HEVC as a base layer and transmitted to the decoder. An entire light field is reconstructed from this...
This paper deals with the unification of local and non-local signal processing on graphs within a single convolutional neural network (CNN) framework. Building upon recent works on graph CNNs, we propose to use convolutional layers that take as inputs two variables, a signal and a graph, allowing the network to...
Wearable optical technologies are emerging to keep users safe, powered-up and entertained
Recent work in video compression has shown that using multiple 2D transforms instead of a single transform in order to de-correlate residuals provides better compression efficiency. These transforms are tested competitively inside a video encoder and the optimal transform is selected based on the Rate Distortion Optimization (RDO) cost. However,...
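The competitive selection described above can be illustrated with a minimal sketch. It assumes the usual Lagrangian form of the RDO cost, J = D + λR; the transform names, distortion values, and bit counts are hypothetical and are not taken from the paper.

```python
def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


def select_transform(candidates, lagrange_multiplier):
    """Return the name of the candidate transform with the lowest RD cost.

    `candidates` maps a transform name to a (distortion, rate_bits) pair
    measured after coding the residual block with that transform.
    """
    def cost_of(name):
        distortion, rate_bits = candidates[name]
        return rd_cost(distortion, rate_bits, lagrange_multiplier)

    return min(candidates, key=cost_of)


# Hypothetical measurements for one residual block (illustrative only).
candidates = {
    "DCT-II": (100.0, 40),
    "DST-VII": (90.0, 48),
    "identity": (150.0, 20),
}

best_at_high_lambda = select_transform(candidates, 2.0)
best_at_low_lambda = select_transform(candidates, 0.5)
```

With these made-up numbers, a larger multiplier penalizes rate more heavily, so the selected transform can change with λ.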
Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help for this task, professional rotoscoping tools rely on parametric curves that offer the artists a much better interactive control on the definition,...
Predicting interestingness of media content remains an important but challenging research subject. The difficulty comes first from the fact that, besides being a high-level semantic concept, interestingness is highly subjective and its global definition has not yet been agreed upon. This paper presents the use of up-to-date deep learning techniques for...
Light-field (LF) is foreseen as an enabler for the next generation of 3D/AR/VR experiences. However, lack of unified representation, storage and processing formats, variant LF acquisition systems and capture-specific LF processing algorithms prevent cross-platform approaches and constrain the advancement and standardization process of the LF information. In this work we...
This paper addresses the example-based stylization of videos. Style transfer aims at editing an image so that it matches the style of an example. This topic has recently been investigated massively, both in the industry and academia. The difficulty lies in how to capture the style of an image. For...
With the explosion of Virtual Reality technologies, the production and usage of omnidirectional images (a.k.a. 360 images) is presenting new challenges in the domains of compression, transmission and rendering. The evaluation of the quality of images generated by these technologies is therefore paramount. As the exploration of 360 images...
The quantity and diversity of data in Light-Field videos make this content valuable for many applications such as mixed and augmented reality or post-production in the movie industry. Some of these applications require a large parallax between the different views of the Light-Field, making the multi-view capture a better option...
In this paper, we present a complete processing pipeline for focused plenoptic cameras. In particular, we propose 1) a new algorithm for microlens center calibration fully in the Fourier domain, 2) a novel algorithm for depth map computation using a stereo focal stack, and 3) a depth-based rendering algorithm that...
The human visual system (HVS) non-linearly processes light from the real world, allowing us to perceive detail over a wide range of illumination. Although models that describe this non-linearity are constructed based on psycho-visual experiments, they generally apply to a limited range of illumination and therefore may not fully explain...
Summary form only given. This paper presents two sets of modifications to the band offset type of the Sample Adaptive Offset technique in HEVC. First, some constraints on the SAO semantics are added to solve a sub-optimal syntax issue and to exploit the actual range information of reconstructed samples. Next, the classification...
This paper presents an adaptive clipping technique with optimized syntax in the video coding Joint Exploratory Model (JEM), which exploits the signal characteristics of the video sequence. The component-wise clipping bounds are coded for each slice. Two encoding methods leveraging the efficiency of the proposed technique are then described. The...
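As a rough illustration of component-wise clipping bounds, the sketch below assumes that each component's bounds are simply the minimum and maximum of the original samples in the slice and that reconstructed samples are clamped to them. This is an assumption made for illustration, not the JEM syntax or the paper's encoding methods, and all sample values are hypothetical.

```python
def clipping_bounds(original_samples):
    """Return (lower, upper) bounds for one component of one slice.

    Assumption: the bounds are the min and max of the original samples.
    """
    return min(original_samples), max(original_samples)


def clip_samples(samples, lower, upper):
    """Clamp every sample into the closed interval [lower, upper]."""
    return [min(max(s, lower), upper) for s in samples]


# Hypothetical 8-bit samples for one slice, per colour component.
original = {
    "Y": [16, 40, 200, 235],
    "Cb": [100, 128, 140],
    "Cr": [90, 128, 160],
}
reconstructed = {
    "Y": [12, 41, 198, 240],
    "Cb": [98, 129, 143],
    "Cr": [91, 127, 158],
}

# One pair of bounds per component, as they would be signalled per slice.
bounds = {name: clipping_bounds(values) for name, values in original.items()}

clipped = {
    name: clip_samples(values, *bounds[name])
    for name, values in reconstructed.items()
}
```

Samples that overshoot the range of the original signal are pulled back to the nearest bound; samples already inside the range are left unchanged.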
In this paper, we propose a novel scheme for scalable image coding based on the concept of epitome. An epitome can be seen as a factorized representation of an image. Focusing on spatial scalability, the enhancement layer of the proposed scheme contains only the epitome of the input image. The...
In this paper we tackle the problem of single channel audio source separation driven by descriptors of the sounding object's motion. As opposed to previous approaches, motion is included as a soft-coupling constraint within the nonnegative matrix factorization framework. The proposed method is applied to a multimodal dataset of instruments...
We propose a novel informed source separation method for audio object coding based on a recent sampling theory for smooth signals on graphs. Assuming that only one source is active at each time-frequency point, we compute an ideal map indicating which source is active at each time-frequency point at the...
“To be considered for the 2017 IEEE Jack Keil Wolf ISIT Student Paper Award.” In this paper we study the problem of noisy tensor completion for tensors that admit a canonical polyadic or CANDECOMP/PARAFAC (CP) decomposition with one of the factors being sparse. We present general theoretical error bounds for...
The migration from high-definition TV to ultrahigh definition (UHD) is already underway. In addition to an increase of picture spatial resolution, UHD potentially provides more color by introducing a wider color gamut, and better contrast by moving from standard dynamic range (SDR) to high dynamic range (HDR). The transition from...
Some recent smartphones offer a so-called audio zoom feature, which focuses sound capture in the front direction while progressively attenuating surrounding sounds along with video zoom. This paper proposes a complete implementation of such a function involving two major steps. First, the targeted sound source is extracted by a novel approach...
This work concerns sampling of smooth signals on arbitrary graphs. We first study a structured sampling strategy for such smooth graph signals that consists of a random selection of few pre-defined groups of nodes. The number of groups to sample to stably embed the set of $k$-bandlimited signals is driven...
To reproduce the appearance of real world scenes, a number of color appearance models have been proposed thanks to adapted psycho-visual experiments. Most of them were designed and intended for a limited dynamic range, or address only dynamic range compression applications. However, given the increasing availability of displays with higher...
HDR Solution presentation
In this paper, we introduce a novel graph representation for interactive light field segmentation using Markov Random Field (MRF). The greatest barrier to the adoption of MRF for light field processing is the large volume of input data. The proposed graph structure exploits the redundancy in the ray space in order to reduce the...
In this paper we propose a new method to automatically select the rank of linear transforms during supervised learning. Our approach relies on a sparsity-enforcing element-wise soft-thresholding operation applied after the linear transform. This novel approach to supervised rank learning has the important advantage that it is very simple to...
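The element-wise soft-thresholding operation mentioned above can be sketched as follows. This only illustrates the operation applied after a linear transform; the matrix, input, and threshold are hypothetical, and how the paper uses the result to select the rank is not shown in the truncated abstract.

```python
def linear_transform(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def soft_threshold(values, threshold):
    """Shrink each value toward zero by `threshold`; small values become 0."""
    result = []
    for v in values:
        magnitude = abs(v) - threshold
        if magnitude <= 0:
            result.append(0.0)
        else:
            result.append(magnitude if v > 0 else -magnitude)
    return result


# Hypothetical 3x2 transform and input (illustrative values only).
W = [
    [1.0, 0.0],
    [0.1, 0.1],
    [0.0, 2.0],
]
x = [1.0, 1.0]

z = soft_threshold(linear_transform(W, x), 0.5)
```

Outputs whose magnitude falls below the threshold are set exactly to zero, which is what makes the operation sparsity-enforcing.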
This paper describes a novel scheme to reduce the quantization noise of compressed videos and improve the overall coding performance. The proposed scheme first clusters noisy patches of the compressed sequence. Then, at the encoder side, linear mappings are learned for each cluster between the noisy patches and...
Light field imaging has recently been made available to the mass market by Lytro and Raytrix commercial cameras. Thanks to a grid of microlenses put in front of the sensor, a plenoptic camera simultaneously captures several images of the scene under different viewing angles, providing an enormous advantage for post-capture applications, e.g., depth estimation and image refocusing. In...
This paper addresses the estimation of accurate long-term dense motion fields from videos of complex scenes. With computer vision applications such as video editing in mind, we exploit optical flows estimated with various inter-frame distances and combine them through multi-step integration and statistical selection (MISS). In this context, managing numerous...
Displays' new rendering capabilities combined with the ever-growing number of video applications have fueled the emergence of new video formats addressing wider color gamut and larger frame size. Thus, the need for scalable compression technology to provide backward compatibility with legacy devices and capitalize on the superior compression performance of...
The movie industry has been using Unmanned Aerial Vehicles as a new tool to produce increasingly complex and aesthetic camera shots. However, the shooting process currently relies on manual control of the drones, which makes it difficult and sometimes inconvenient to work with. In this paper we address...
With the advent of ultra-high-definition TV services, high dynamic range (HDR) and wide color gamut (WCG) have become two highly desired image quality improvements for delivering immersive video experiences to the consumer mass market. Capture and rendering technologies have reached a level of maturity that now allows HDR and WCG...
As the video industry begins deployment of ultrahigh-definition TV in both professional and consumer markets, including support for higher dynamic range and wider color gamut services is considered essential within the industry. Higher dynamic range and wider color gamut offer end users a significantly enhanced viewing experience by supporting intensity...
5G will deliver the next level of experience and enable a diverse array of 5G services. Learn about eHealth services, pervasive video and high quality content, and enabling tactile internet in this 2015 Mobile World Congress presentation.
Check out this 2014 IBC presentation on User Aware Video!
BLOG / Oct 2015
/
iot,
interoperability,
wotio,
data,
internet of things,
connected devices
/ Posted By: wotio team