
Results for Computer Vision





Overview of The MediaEval 2021 Predicting Media Memorability Task
RESEARCH PAPER / Dec 2021 / Computer Vision, Machine learning/Deep learning/Artificial Intelligence
This paper describes the MediaEval 2021 Predicting Media Memorability task. After first being proposed at MediaEval 2018, the Predicting Media Memorability task is in its 4th edition this year, as the prediction of short-term and long-term video memorability remains a challenging task. This year, two datasets of videos are used:...
On the hidden treasure of dialog in video question answering
RESEARCH PAPER / Oct 2021 / Computer Vision
High-level understanding of stories in video such as movies and TV shows from raw data is extremely challenging. Modern video question answering (VideoQA) systems often use additional human-made sources like plot synopses, scripts, video descriptions or knowledge bases. In this work, we present a new approach to understand the whole...
Disentangled Face Attribute Editing for High Quality Videos
RESEARCH PAPER / Oct 2021 / Computer Vision, Neural network, Machine learning/Deep learning/Artificial Intelligence
High quality facial attribute editing in videos is a challenging problem as it requires the modifications to be realistic and consistent throughout the video frames. Previous works address the problem with auto-encoder architectures and rely on adversarial training to ensure the attribute editing and the temporal consistency of the results....
Inplace knowledge distillation with teacher assistant for improved training of flexible neural networks
RESEARCH PAPER / Aug 2021 / Neural network, Machine learning/Deep learning/Artificial Intelligence, Computer Vision
Deep neural networks (DNNs) have recently achieved great success in many machine learning tasks including computer vision and speech recognition. However, existing DNN models are computationally expensive and memory demanding, hindering their deployment in devices with low memory and computational resources or in applications with strict latency requirements. In addition,...
CHG synthesis using layer-based method and perspective projection images
RESEARCH PAPER / Dec 2020 / Computer Vision
From its Nobel prize winning discovery by Denis Gabor over half a century ago, Holography has been alternately put in the spotlight as a promising technique for its capacity of displaying 3D scenes, to be later forgotten regarding the complexity of performing holographic recording outside from optical laboratories. The later...
High Resolution Face Age Editing
RESEARCH PAPER / Oct 2020 / Computer Vision
Face age editing has become a crucial task in film post-production, and is also becoming popular for general purpose photography. Recently, adversarial training has produced some of the most visually impressive results for image manipulation, including the face aging/de-aging task. In spite of considerable progress, current methods often present visual...
A Generative Adversarial Approach for 2D Human Pose Estimation Completion and Upsampling
RESEARCH PAPER / Sep 2020 / Computer Vision
Human Pose Estimation is a low-level task useful for surveillance, human action recognition, and scene understanding at large. It also offers promising perspectives for the animation of synthetic characters. For all these applications, and especially the latter, estimating the positions of many joints is desirable for improved performance and realism....
Structural Inpainting
RESEARCH PAPER / Oct 2018 / Computer Vision, Machine/Deep learning/AI
Scene-agnostic visual inpainting remains very challenging despite progress in patch-based methods. Recently, Pathak et al. [26] have introduced convolutional "context encoders" (CEs) for unsupervised feature learning through image completion tasks. With the additional help of adversarial training, CEs turned out to be a promising tool to complete complex structures in...
Towards Mobile Diminished Reality
RESEARCH PAPER / Oct 2018 / Immersive/AR/VR/MR, Computer Vision
We present a diminished reality application running live on consumer mobile devices. In our pre-observation-based approach, the clean 3D scene, free of undesired objects, is scanned beforehand and reconstructed as a high resolution textured 3D model. At runtime, objects added in a region of interest are efficiently removed by projecting...
Compressive 4D Light Field Reconstruction Using Orthogonal Frequency Selection
RESEARCH PAPER / Oct 2018 / Image Processing, Light Field, Computer Vision
We present a new method for reconstructing a 4D light field from a random set of measurements. A 4D light field block can be represented by a sparse model in the Fourier domain. As such, the proposed algorithm reconstructs the light field, block by block, by selecting frequencies of the...
Interestingness Prediction & its Application to Immersive Content
RESEARCH PAPER / Sep 2018 / Computer Vision, Immersive/AR/VR/MR
Which parts or objects are interesting in a content? In this paper we first propose three computational models to automatically predict interestingness rankings of areas/objects inside a 2D picture. We based our modeling on previous experimental findings to ensure reliability of the prediction when compared to the human assessment of...
Deep Learning for Image Memorability Prediction
RESEARCH PAPER / Apr 2018 / Computer Vision, Machine/Deep learning/AI
Memorability of media content such as images and videos has recently become an important research subject in computer vision. This paper presents our computation model for predicting image memorability, which is based on a deep learning architecture designed for a classification task. We exploit the use of both convolutional neural...
Photometric Registration Using Specular Reflections and Application to Augmented Reality
RESEARCH PAPER / Apr 2018 / Immersive/AR/VR/MR, Computer Vision
Photometric registration consists in blending real and virtual scenes in a visually coherent way. To achieve this goal, both reflectance and illumination properties must be estimated. These estimates are then used, within a rendering pipeline, to virtually simulate the real lighting's interaction with the scene. In this paper, we are...
Color gamut compression for multiple production color gamuts
RESEARCH PAPER / Feb 2018 / Image Processing, Computer Vision, Color Management
A wide color gamut (WCG) display has great color rendering capability and offers the opportunity to achieve a pleasing and realistic appearance in terms of image quality. To take full advantage of the large display gamut, a new gamut extension algorithm (GEA) is proposed based on a new color appearance...
Scattering Features for Multimodal Gait Recognition
RESEARCH PAPER / Nov 2017 / Machine/Deep Learning/AI, IoT, Computer Vision
We consider the problem of identifying people on the basis of their walk (gait) pattern. Classical approaches to tackle this problem are based on, e.g., video recordings or piezoelectric sensors embedded in the floor. In this work, we rely on acoustic and vibration measurements, obtained from a microphone and a...
Supervised Structured Binary Codes for Image Search
RESEARCH PAPER / Oct 2017 / Machine/Deep Learning/AI, Computer Vision
For large-scale visual search, highly compressed yet meaningful representations of images are essential. Structured vector quantizers based on product quantization and its variants are usually employed to achieve such compression while minimizing the loss of accuracy. Yet, unlike binary hashing schemes, these unsupervised methods have not yet benefited from the...
MoFA: Model-based deep convolutional face autoencoder for unsupervised monocular reconstruction
RESEARCH PAPER / Oct 2017 / Machine/Deep Learning/AI, Image Processing, Computer Vision
In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation...
Illumination Estimation using Cast Shadows for Realistic Augmented Reality Applications
RESEARCH PAPER / Oct 2017 / Immersive/AR/VR/MR, Computer Vision
Augmented Reality (AR) scenarios aim to provide realistic blending between real world and virtual objects. A key factor for realistic AR is thus a correct illumination simulation. This consists in estimating the characteristics of real light sources and using them to model virtual lighting. In this paper, we briefly introduce...
Super-Rays for Efficient Light-Field Processing
RESEARCH PAPER / Oct 2017 / Image processing, Light Field, Computer Vision, Volumetric Imaging
Light field acquisition devices allow capturing scenes with unmatched postprocessing possibilities. However, the huge amount of high-dimensional data poses challenging problems to light field processing in interactive time. In order to enable light field processing with a tractable complexity, in this paper, we address the problem of light field oversegmentation....
Multimodality and Deep Learning when predicting Media
RESEARCH PAPER / Sep 2017 / Machine/Deep Learning/AI, Computer Vision
This paper summarizes the computational models that Technicolor proposes to predict interestingness of images and videos within the MediaEval 2017 Predicting Media Interestingness Task. Our systems are based on deep learning architectures and exploit the use of both semantic and multimodal features. Based on the obtained results, we discuss our findings...
MediaEval 2017 Predicting Media Interestingness Task
RESEARCH PAPER / Sep 2017 / Machine/Deep Learning/AI, Computer Vision
In this paper, the Predicting Media Interestingness task, which is running for the second year as part of the MediaEval 2017 Benchmarking Initiative for Multimedia Evaluation, is presented. For the task, participants are expected to create systems that automatically select images and video segments that are considered to be the...
Learn to unify local and non-local signal processings with graph CNN
RESEARCH PAPER / Sep 2017 / Machine/Deep Learning/AI, Computer Vision, Image Processing
This paper deals with the unification of local and non-local signal processing on graphs within a single convolutional neural network (CNN) framework. Building upon recent works on graph CNNs, we propose to use convolutional layers that take as inputs two variables, a signal and a graph, allowing the network to...
Automated Light Composting with Rendered Images
RESEARCH PAPER / Sep 2017 / Immersive/AR/VR/MR, Computer Vision
Lighting is a key element in photography. Professional photographers often work with complex lighting setups to directly capture an image close to the targeted one. Some photographers reversed this traditional workflow. Indeed, they capture the scene under several lighting conditions, then combine the captured images to get the expected one....
Mixed Illumination Analysis in Single Image for Color Grading
RESEARCH PAPER / Jul 2017 / Image Processing, Computer Vision, Production Workflow
Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help for this task, professional rotoscoping tools rely on parametric curves that offer the artists a much better interactive control on the definition,...
Context-aware Clustering and Assessment of Photo Collections
RESEARCH PAPER / Jul 2017 / Computer Vision
To ensure that all important moments of an event are represented and that challenging scenes are correctly captured, both amateur and professional photographers often opt for taking large quantities of photographs. As such, they are faced with the tedious task of organizing large collections and selecting the best images among...
ROAM: a Rich Object Appearance Model with Application to Rotoscoping
RESEARCH PAPER / Jul 2017 / Image Processing, Computer Vision, Production Workflow
Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help for this task, professional rotoscoping tools rely on parametric curves that offer the artists a much better interactive control on the definition,...
Kernel square-loss exemplar machines for image retrieval
RESEARCH PAPER / Jul 2017 / Machine/Deep Learning/AI, Computer Vision
Zepeda and Perez [41] have recently demonstrated the promise of the exemplar SVM (ESVM) as a feature encoder for image retrieval. This paper extends this approach in several directions: We first show that replacing the hinge loss by the square loss in the ESVM cost function significantly reduces encoding time...
Video Style Transfer by Adaptive Patch Sampling
RESEARCH PAPER / Jun 2017 / Image Processing, Computer Vision, Machine/Deep Learning/AI
This paper addresses the example-based stylization of videos. Style transfer aims at editing an image so that it matches the style of an example. This topic has recently been investigated massively, both in the industry and academia. The difficulty lies in how to capture the style of an image. For...
Dataset and Pipeline for Multi-View Light Field Video
RESEARCH PAPER / May 2017 / Image processing, Light Field, Computer Vision, Volumetric Imaging
The quantity and diversity of data in Light-Field videos makes this content valuable for many applications such as mixed and augmented reality or post-production in the movie industry. Some of such applications require a large parallax between the different views of the Light-Field, making the multi-view capture a better option...
An Image Rendering Pipeline for Focused Plenoptic Cameras
RESEARCH PAPER / May 2017 / Image processing, Light Field, Computer Vision, Volumetric Imaging
In this paper, we present a complete processing pipeline for focused plenoptic cameras. In particular, we propose 1) a new algorithm for microlens center calibration fully in the Fourier domain, 2) a novel algorithm for depth map computation using a stereo focal stack, and 3) a depth-based rendering algorithm that...
Structured sampling and fast reconstruction of smooth graph signals
RESEARCH PAPER / Feb 2017 / Image Processing, Computer Vision, Machine/Deep Learning/AI
This work concerns sampling of smooth signals on arbitrary graphs. We first study a structured sampling strategy for such smooth graph signals that consists of a random selection of few pre-defined groups of nodes. The number of groups to sample to stably embed the set of $k$-bandlimited signals is driven...
Predicting Interestingness of Visual Content
RESEARCH PAPER / Jan 2017 / Machine/Deep Learning/AI, Computer Vision
The ability of multimedia data to attract and keep people’s interest for longer periods of time is gaining more and more importance in the fields of information retrieval and recommendation, especially in the context of the ever growing market value of social media and advertising. In this chapter we introduce...
Towards a Perceptually-Motivated Color Space for High Dynamic Range Imaging
RESEARCH PAPER / Dec 2016 / Image Processing, Computer Vision
To reproduce the appearance of real world scenes, a number of color appearance models have been proposed thanks to adapted psycho-visual experiments. Most of them were designed and intended for a limited dynamic range, or address only dynamic range compression applications. However, given the increasing availability of displays with higher...
Technicolor@MediaEval 2016 Predicting Media Interestingness Task
RESEARCH PAPER / Oct 2016 / Computer Vision, Machine/Deep Learning/AI
This paper presents the work done at Technicolor regarding the MediaEval 2016 Predicting Media Interestingness Task, which aims at predicting the interestingness of individual images and video segments extracted from Hollywood movies. We participated in both the image and video subtasks.
MediaEval 2016 Predicting Media Interestingness Task
RESEARCH PAPER / Oct 2016 / Computer Vision, Machine/Deep Learning/AI
This paper provides an overview of the Predicting Media Interestingness task that is organized as part of the MediaEval 2016 Benchmarking Initiative for Multimedia Evaluation. The task, which is running for the first year, expects participants to create systems that automatically select images and video segments that are considered to be the...
Approximate search with quantized sparse representations
RESEARCH PAPER / Oct 2016 / Computer Vision, Machine/Deep Learning/AI
This paper tackles the task of storing a large collection of vectors, such as visual descriptors, and of searching in it. To this end, we propose to approximate database vectors by constrained sparse coding, where possible atom weights are restricted to belong to a finite subset. This formulation encompasses, as...
SPLeaP: Soft Pooling of Learned Parts for Image Classification
RESEARCH PAPER / Oct 2016 / Computer Vision, Machine/Deep Learning/AI
The aggregation of image statistics – the so-called pooling step of image classification algorithms – as well as the construction of part-based models, are two distinct and well-studied topics in the literature. The former aims at leveraging a whole set of local descriptors that an image can contain (through spatial...
Light Field Segmentation Using a Ray Based Graph Structure
RESEARCH PAPER / Oct 2016 / Image processing, Light Field, Computer Vision, Volumetric Imaging
In this paper, we introduce a novel graph representation for interactive light field segmentation using Markov Random Field (MRF). The greatest barrier to the adoption of MRF for light field processing is the large volume of input data. The proposed graph structure exploits the redundancy in the ray space in order to reduce the...
Supervised Learning Of Low-Rank Transforms For Image Retrieval
RESEARCH PAPER / Sep 2016 / Image Processing, Computer Vision, Machine/Deep Learning/AI
In this paper we propose a new method to automatically select the rank of linear transforms during supervised learning. Our approach relies on a sparsity-enforcing element-wise soft-thresholding operation applied after the linear transform. This novel approach to supervised rank learning has the important advantage that it is very simple to...
Reflectance and Illumination Estimation for Realistic Augmentations of Real Scenes
RESEARCH PAPER / Sep 2016 / Immersive/AR/VR/MR, Computer Vision
The acquisition of surface material properties and lighting conditions is a fundamental step for photo-realistic Augmented Reality (AR). In this paper, we present a new method for the estimation of diffuse and specular reflectance properties of indoor real static scenes. Using an RGB-D sensor, we further estimate the 3D position...
On plenoptic sub-aperture view recovery
RESEARCH PAPER / Sep 2016 / Optics, Light Field, Image Processing, Computer Vision
Light field imaging is recently made available to the mass market by Lytro and Raytrix commercial cameras. Thanks to a grid of microlenses put in front of the sensor, a plenoptic camera simultaneously captures several images of the scene under different viewing angles, providing an enormous advantage for post-capture applications, e.g., depth estimation and image refocusing. In...
The CNN News Footage Dataset: Enabling Supervision in Image Retrieval
RESEARCH PAPER / Aug 2016 / Computer Vision, Machine/Deep Learning/AI
Image retrieval in large image databases is an important problem that drives a number of applications. Yet the use of supervised approaches that address this problem has been limited due to the lack of large labeled datasets for training. Hence, in this paper we introduce two new datasets composed of...
Experiencing the interestingness concept within and between pictures
RESEARCH PAPER / Feb 2016 / Computer Vision, Machine/Deep Learning/AI
Interestingness is the quantification of the ability of an image to induce interest in a user. Because defining and interpreting interestingness remain unclear in the literature, we introduce in this paper two new notions, intra- and inter-interestingness, and investigate a novel set of dedicated experiments. More specifically, we propose four experimental protocols: 1/ object ranking with...


   © COPYRIGHT 2023 INTERDIGITAL, INC. ALL RIGHTS RESERVED.
