Interns at InterDigital: Exploring Immersive and Metavideo Standards
Throughout 2023, InterDigital’s Rennes Office welcomed more than a dozen brilliant interns and researchers to explore and contribute to the industry-shaping research and innovation being conducted at InterDigital.
Before concluding their time at InterDigital, each of the interns reflected on the work they led and the experiences they shared working in InterDigital’s esteemed labs with world-class engineers and inventors.
We thank each of our interns for their insights, their diligence, and the hard work they contributed to the Lab’s research and our InterDigital team. Learn more about the interns and their work below.
Majd Alghaddaf: Quality of Experience for XR
Tom Roy: MPEG-I Standards on Haptics and Scene Description
Jim Pavan: Semantic-based technology for Video Conferences
Tom Le Menn: Synthetic Video Data Generation
Mohammed Aroui: Implicit Representations for Image and Video Compression
Kilian Ravon: Energy Reduction Solutions for Video Displays and Screens
Emmanuel Sampaio: Benchmarking Energy Aware Image Processing Methods
Antoine de Saint-Martin: Evaluating the Meta Quest Pro2
Quality of Experience for XR
Throughout the year, Majd Alghaddaf worked on quality of experience (QoE) for XR. He conducted an extensive review of the state of the art in QoE, with the goal of helping the Video Lab contribute and propose new metrics for analysis in the AR-specific field. Majd developed a deep understanding of the subject and supported the Lab on two patents related to this research.
Majd Alghaddaf: “I've been interning at InterDigital for the past 6 months, studying the quality of experience of augmented reality applications. This experience has granted me a unique perspective on the field of research and standards that I could not have obtained anywhere else. Not only did I gain an abundance of knowledge on augmented reality and its rapid evolution, but I got to participate in the filing of two patents related to my internship. I could not have done this without my amazing team, comprised of experts who have been ever so kind to me throughout my entire time here. I thank everyone that I have come across at this company, for they have all shared a piece of their expertise.”
Demonstrating MPEG-I Standards on Haptics and Scene Description
Throughout his internship, Tom Roy worked on the development of a demonstration showcasing the capabilities of the upcoming MPEG-I standards on Haptics and Scene Description.
The demonstration is a sensorial immersive experience using a virtual reality headset, haptic-enabled controllers, and a haptic vest. The proposed experience is an interactive go-kart race-like game where interactions with the environment will trigger spatialized haptic feedback to provide an enhanced sense of immersion.
Tom Roy: “Interning at InterDigital was a turning point in my studies and allowed me to discover the world of research. At InterDigital, I have worked with amazing people that have broadened my engineering and my professional skills. I have been working on haptics in immersive experiences, a new way for people to perceive sensations. I am grateful to InterDigital for the opportunity, and I am really happy to continue my research as a PhD student.”
Exploring Semantic-based technology for Video Conferences
InterDigital’s Video Lab is working on semantic-based technology for video conferences, which depends on 3D human facial data to correctly estimate semantic features such as facial expressions, lighting conditions (e.g., light positions or specular reflections), and background and hair segmentation. Recent efforts to generate 3D synthetic data have shown that such data matches real data in accuracy for machine learning systems. During his internship, Jim Pavan worked on establishing a parametric 3D face model based on internal 3D assets, while developing a scripting framework to generate and render synthetic training images.
Jim Pavan: “My experience here at InterDigital has been interesting and I learned many things about the field of computer graphics. Working with the Interactive Media team has been very cool and instructive. InterDigital’s employees deserve recognition for their hard work and personal investment.”
Facilitating Synthetic Video Data Generation
During his time at InterDigital, Tom Le Menn worked on synthetic video data generation of dynamic hair using the Unreal Engine. The main objective of his work was to obtain realistic synthetic data for PhD #2 in the Nemo project, InterDigital’s joint lab with INRIA. The PhD explores encoding and modeling hair data for semantic-based video conference technology. InterDigital’s MetaVideo group has worked on hair modeling, encoding, and segmentation for a few years in coordination with post-production studios and has built a large VFX haircut database. Tom leveraged this database to automate and facilitate the creation of synthetic video data.
Tom Le Menn: “I've been an intern at InterDigital since March, working on a realistic grooming database generator to assist the work of a future thesis. The internship made me work with various interesting solutions in the digital imaging domain such as Houdini, Unreal, and Blender. Grooming is a highly specific domain, and working on it is challenging; it involves a lot of research and a need to find new ideas. It is very rewarding to attempt to push the boundaries and to innovate. Besides the subject, working at InterDigital comes with many advantages and is very pleasant; everyone on the team is very nice.”
Developing Implicit Representations for Image and Video Compression
This year, Mohammed Aroui pursued his PhD and worked on implicit representations for image and video compression with InterDigital. An implicit representation is the overfitting of the image function using a neural network. Specifically, Mohammed developed a spatially adaptive model selection scheme, tied to a superpixel partitioning of the image, that selects the optimal neural network according to its capacity for a given tradeoff between rate and distortion. In other words, lightweight neural representations are used for homogeneous areas, while bigger neural nets are used for complex, textured image portions.
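The selection idea described above can be illustrated with a minimal sketch. It is not Mohammed's actual method: it assumes each image region has already been fitted with a few candidate networks, and it picks the one that minimizes a standard Lagrangian cost J = D + λR. The `Candidate` class, the `select_model` helper, and all rate and distortion numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate network for a region (all values are hypothetical)."""
    name: str          # label for the network size
    rate_bits: float   # assumed cost of storing the network's weights
    distortion: float  # assumed reconstruction error after overfitting


def select_model(candidates, lam):
    """Return the candidate minimizing the cost J = D + lam * R."""
    return min(candidates, key=lambda c: c.distortion + lam * c.rate_bits)


# Made-up measurements: a flat region is already well described by the
# small network, while a textured region needs the larger one.
flat_region = [Candidate("small", 1000, 2.0), Candidate("large", 8000, 1.5)]
textured_region = [Candidate("small", 1000, 60.0), Candidate("large", 8000, 5.0)]

lam = 0.001  # assumed tradeoff weight between rate and distortion
choices = {
    "flat": select_model(flat_region, lam).name,
    "textured": select_model(textured_region, lam).name,
}
print(choices)  # {'flat': 'small', 'textured': 'large'}
```

With these illustrative numbers, raising the tradeoff weight (for example to 1.0) makes rate dominate the cost, and the small network is then selected for the textured region as well.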
Mohammed Aroui: “My internship at InterDigital was a great learning experience where I developed many new skills. I learned so much about implicit neural representations and their application to image and video compression, and I was able to contribute to two patents. The highlight was getting to work with an amazing team and brilliant individuals who guided me throughout my internship.
“Overall, it was an enriching summer where I grew professionally and made connections with other colleagues. I’m thankful for the opportunity and will carry the knowledge I gained throughout my career.”
Exploring Energy Reduction Solutions for Video Displays and Screens
Kilian Ravon worked in the context of climate change, with the objective of reducing the energy needed to display images and videos on screens such as smartphones and TV sets. Kilian investigated different chromaticity-based methods, derived from properties of visual perception and its optical illusions, such as heterochromatic flicker fusion, and developed spatial and temporal algorithmic processing. Beyond developing the methods with classic computer vision algorithms, machine learning-based networks were developed to mimic the most promising technique investigated. Kilian actively worked on a demonstration proving the efficiency of the method and is an author or co-author of three patents.
Kilian Ravon: “I did my internship at InterDigital from February to August, and it was a very rewarding experience. I was able to work on the energy reduction of displays. After getting to grips with the existing technology, I improved and adapted it to other use cases, which led to the writing of three patents. This experience gave me an insight into private research and its culture. I also acquired a lot of technical knowledge, whether in image processing or in understanding colors and their perception. I'd like to thank the Energy Aware Media team for making me feel so welcome and for sharing their expertise with me.”
Benchmarking Energy Aware Image Processing Methods
During his internship, Emmanuel Sampaio worked in the context of sustainability, with the objective of reducing the energy required to display images and videos on screens such as smartphones and TV sets. To that end, Emmanuel investigated image processing methods that modify the visual content to be displayed. He tested and benchmarked a set of state-of-the-art methods alongside our InterDigital methods, and to the best of our knowledge, this is the first comprehensive benchmark related to energy-aware images. The benchmark provides new insights and perspectives, and the results will be published for the scientific community in the coming months.
Emmanuel Sampaio: “During my six-month internship, we worked to benchmark different models that have as a major goal the reduction of the image’s power consumption. We addressed in this work a wide range of models: from simple image processing to sophisticated artificial intelligence models that can learn how to manipulate an image to achieve a power reduction goal. This benchmark allows us to improve our understanding of the image processing related with power reduction and enables us to implement improvements on our own models. It is important to mention that over the next months we intend to publish the benchmark results.”
Evaluating the Meta Quest Pro2
During this project, Antoine de Saint-Martin worked with the Meta Quest Pro2 headset, first to evaluate the headset's performance and then to explore challenges related to anchoring and the creation of digital twins. Antoine’s work was done from the perspective of MPEG Scene Description (SD), and he developed a parser to support the anchoring and interactivity extensions provided by InterDigital. Upon completion of the internship, Antoine’s research on the Meta Quest Pro2 headset will be integrated into the test bed developed by InterDigital’s Video Lab.
Antoine de Saint-Martin: “After having to evaluate and compare performances of several headsets for the first month, I started working on the META Quest Pro2 headset. My research focused on the anchoring capabilities of the headset and on the implementation of software used to map the area surrounding the user, like furniture for example. This surrounding is characterized by multiple digital twins, each one fits with a real object. I did some tests on user interactions over digital twins with hand tracking, to see if it fits well with the physics of the real world, and I implemented an MPEG extension called MPEG_anchor to be able to export my created scene from the headset to another device in a glTF file. This exported file can be used in the XR test bed demo developed by my mentor and the Video Lab.”