What’s Next for Video in 2022?
InterDigital remains at the forefront of innovation, often developing solutions and contributing to standards years before a technology’s deployment. With this foresight and industry expertise in mind, we asked two of InterDigital’s leading video researchers, Philippe Guillotel and Gaëlle Martin-Cocher, to explain what opportunities might be ahead for video in 2022 and beyond.
IDCC Comms: Some believe 2022 may be the year for immersive video. What milestones and/or challenges do you foresee in the coming year around immersive video? How are InterDigital labs shaping this change?
Gaëlle: The release of new, diverse devices with a range of capabilities, like see-through glasses, light-field displays, and head-mounted displays (HMDs) indicates that immersive media has a bright future but will take various forms. In our world of research and standards, different standardized formats will likely be developed to support the new terminals and markets that enable this future.
As innovators, we wish to drive greater immersivity in more experiences and services, but the quality of experience differs between Augmented Reality (AR) and Virtual Reality (VR) services, between 2D and 3D displays, and between powerful and power-constrained devices. Latency remains a critical challenge for immersive and augmented reality experiences, and traditional video metrics may need to be adapted to support new media formats and interactive experiences.
In our video research at InterDigital, we are exploring mesh and point-cloud video and their integration with 2D and multiview video formats in a scene description delivered over HTTP and RTP, to provide the technical enablers for deploying immersive services. Specifically, we are engaged in developing the MPEG Scene Description format to enable rich experiences with timed media in AR and VR. Our labs are also focused on multimedia transport protocols for compressed point-cloud formats, alongside our participation in a new call for proposals on mesh compression.
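As a rough illustration of what such a scene description can look like, here is a minimal sketch in Python. MPEG Scene Description builds on glTF 2.0; the structure below is loosely modeled on its MPEG_media extension, but the field names are simplified for illustration and the URI is a placeholder.

```python
# Simplified, illustrative sketch of a glTF-style scene description that
# references a timed media stream alongside static 3D assets. Field names
# are abbreviated for illustration, not copied from the specification.
import json

scene = {
    "scenes": [{"nodes": [0, 1]}],
    "nodes": [
        {"name": "room_mesh"},        # static 3D geometry
        {"name": "captured_person"},  # node fed by a timed media stream
    ],
    "extensions": {
        "MPEG_media": {
            "media": [{
                "name": "point_cloud_stream",
                "alternatives": [
                    # e.g. a compressed point-cloud stream fetched over HTTP
                    {"uri": "https://example.com/person.mp4",
                     "mimeType": "video/mp4"}
                ],
            }]
        }
    },
}
print(json.dumps(scene, indent=2))
```

The key design point is that the scene graph stays a static, cacheable document while the heavy, time-varying media (point clouds, meshes, video textures) is referenced and streamed separately.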
IDCC Comms: Energy-aware media is growing in importance as industry and consumers consider the energy impacts of technology. What is this concept, and what’s on the horizon this year?
Philippe: Energy-aware media is an important but complex topic because it involves every element of the video creation, distribution, and consumption chain. As such, it will take time to be implemented and adopted by all industry and consumer stakeholders, though government agencies might push for it given early public support.
At InterDigital, our research approach has been to build the foundations for more energy-efficient media distribution by specifying metadata about the content that can be used along the distribution chain. For example, different video streams can be generated that offer different quality and bit-rate tradeoffs and different energy profiles. The user can then choose a less energy-consuming stream, even if its quality is slightly lower.
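As a hypothetical sketch of how a client might use such metadata, the snippet below picks among pre-encoded representations carrying illustrative energy annotations. The field names (quality, energy_wh_per_min) and values are invented for the example, not taken from any published specification.

```python
# Hypothetical client-side selection among streams that carry
# energy metadata, as described above. All values are illustrative.
REPRESENTATIONS = [
    {"id": "hd",  "bitrate_kbps": 6000, "quality": 0.95, "energy_wh_per_min": 1.8},
    {"id": "sd",  "bitrate_kbps": 2500, "quality": 0.88, "energy_wh_per_min": 1.1},
    {"id": "eco", "bitrate_kbps": 1200, "quality": 0.82, "energy_wh_per_min": 0.7},
]

def pick_stream(min_quality: float, prefer_energy_saving: bool) -> dict:
    """Return the least energy-hungry representation that still meets
    the user's minimum quality, or fall back to the best quality one."""
    acceptable = [r for r in REPRESENTATIONS if r["quality"] >= min_quality]
    if prefer_energy_saving and acceptable:
        return min(acceptable, key=lambda r: r["energy_wh_per_min"])
    return max(REPRESENTATIONS, key=lambda r: r["quality"])

print(pick_stream(min_quality=0.85, prefer_energy_saving=True)["id"])  # -> "sd"
```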
Our research is also exploring challenges in transmission. In current network architectures, routers compute routes based on the best tradeoff between delay and loss; our research looks at how a route with similar delay and loss performance might consume less energy by limiting the number of intermediate nodes and processes it uses.
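The toy sketch below illustrates this idea under simplified assumptions: among candidate paths whose end-to-end delay stays within a small tolerance of the best achievable delay, it prefers the path traversing the fewest nodes, used here as a crude proxy for energy. The graph, delays, and tolerance are made-up example values.

```python
# Toy energy-aware route selection: accept a tiny delay penalty in
# exchange for fewer intermediate hops. All numbers are illustrative.
GRAPH = {  # node -> {neighbor: delay_ms}
    "A": {"B": 5, "C": 4},
    "B": {"D": 5},
    "C": {"E": 3},
    "E": {"D": 4},
    "D": {},
}

def all_paths(src, dst, path=None):
    """Enumerate loop-free paths from src to dst."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in GRAPH[src]:
        if nxt not in path:
            yield from all_paths(nxt, dst, path)

def delay(path):
    return sum(GRAPH[a][b] for a, b in zip(path, path[1:]))

def energy_aware_route(src, dst, slack_ms=2):
    paths = list(all_paths(src, dst))
    best_delay = min(map(delay, paths))
    acceptable = [p for p in paths if delay(p) <= best_delay + slack_ms]
    return min(acceptable, key=len)  # fewest hops among acceptable paths

print(energy_aware_route("A", "D"))  # -> ['A', 'B', 'D']
```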
Another path our teams are exploring is the perceptual aspects of energy consumption. By exploiting appropriate properties of the human visual system, it is possible to reduce the average light level emitted by some video content by 3% to 10% with no perceptual degradation. This reduced light level will hopefully translate into some energy reduction.
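A minimal sketch of the underlying operation is shown below, assuming a uniform 5% luminance reduction (within the 3% to 10% range mentioned above). A real system would adapt the factor per scene and account for display characteristics; this is purely illustrative.

```python
# Minimal sketch: uniformly scale a frame's light output down by a
# small factor that the human visual system tolerates. Illustrative only.
import numpy as np

def dim_frame(frame_rgb: np.ndarray, reduction: float = 0.05) -> np.ndarray:
    """Scale an 8-bit RGB frame's light output down by `reduction`."""
    scaled = frame_rgb.astype(np.float32) * (1.0 - reduction)
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

frame = np.full((4, 4, 3), 200, dtype=np.uint8)  # dummy mid-gray frame
print(dim_frame(frame)[0, 0])                    # -> [190 190 190]
```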
IDCC Comms: With significant industry hype around the “metaverse,” do you think current streaming solutions are sufficient to deliver that level of immersivity?
Philippe: The metaverse has become a buzzword, and it’s important for the industry to develop consensus around the definition of this new technology space. There are several concepts, technologies, and even application areas that need to be defined first, so that we know what we are talking about. This is among our Research and Innovation lab’s current concerns and efforts.
Moreover, in light of the pandemic, demand has grown for more efficient communication and collaboration, and immersive environments can be a solution if appropriate display devices are provided. While current HMDs are not the solution, in the longer term I see more potential in new types of screens (immersive setups with panoramic and multi-view displays) or lightweight glasses. Of course, intermediate steps will improve current 2D video solutions, but these will likely be insufficient for fully immersive AR/VR solutions in which consumers can interact and collaborate in a 3D environment.
To achieve this future, networks will have to deliver the necessary performance. The robustness of network architectures has been really impressive throughout the pandemic, and it gives me confidence that the same will be true for more complex environments. In addition, the coming 5G and 6G networks, the tactile internet, and other ongoing initiatives will offer more efficient networks for the metaverse, with low latency (critical for interactivity), high throughput, and more functionalities. I know that our network teams are working on those aspects, and I am sure they will successfully contribute to making this happen.
IDCC Comms: What breakthroughs in video interactivity are you hoping or preparing for this year?
Philippe: For truly collaborative social environments, interactivity is key. This means not just talking to people, but also sharing objects and engaging in different types of interactions on diverse topics in dynamic environments. It also means that all our senses, from vision to hearing to touch, should be stimulated so that emotion can be transmitted. To achieve this goal, we need efficient scene representation formats associated with representations of the users, both virtual and real, and the link between the two. There are several initiatives, especially in standardization (ISO/IEC SC29 and SC24), working in that direction, and our teams are currently contributing to these efforts.
To be more specific, I think one main challenge we must overcome is the link between the real and the virtual. The two must be synchronized and coherent, otherwise users will not be comfortable enough to use the technology. In addition, haptics, or touch, will gain significance, because you cannot fully interact with people or objects without touching them and receiving sensory feedback. Our haptics team has been strongly involved since mid-2020 in finalizing the MPEG haptics standard for this new media type.
IDCC Comms: In video standards, what critical issues are being addressed this year?
Gaëlle: In addition to tackling critical challenges in latency and driving consistent management of media in diverse scene representations, we are engaged in standards exploring the compression of non-natural video content, like computer-generated content, mixed media, and gaming, while defining and addressing mesh and point-cloud needs. From this research we envision many different coding paradigms, such as the use of prior information: in immersive cloud gaming, for example, parameters from the game engine can be used to guide encoding algorithms or to create new end-to-end compression solutions, as the sketch below suggests.
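As a hedged illustration of that idea, this sketch projects camera motion the game engine already knows into a per-block motion-vector hint that an encoder could refine, instead of estimating motion from pixels alone. The encoder interface and the pinhole-style projection are hypothetical simplifications.

```python
# Hypothetical sketch: seed an encoder's motion search with motion the
# game engine already knows, rather than searching from scratch.
def engine_motion_hint(camera_delta, depth):
    """Project the engine's known camera translation into a per-block
    motion-vector hint (a crude pinhole approximation)."""
    dx, dy, _ = camera_delta
    scale = 1.0 / max(depth, 1e-3)  # nearer blocks move more on screen
    return (dx * scale, dy * scale)

def encode_block(block_xy, depth, camera_delta, search_range=8):
    hint = engine_motion_hint(camera_delta, depth)
    # A real encoder would refine around `hint` within a small window,
    # cutting motion-estimation cost versus a full search from (0, 0).
    return {"block": block_xy, "mv_seed": hint, "range": search_range}

print(encode_block((16, 16), depth=2.0, camera_delta=(4.0, 0.0, 0.0)))
```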
Philippe: AI is also growing in importance in research and standards, and we are exploring the impact of deep learning technologies on video coding, as well as on representing video content in different ways. These new representation formats are expected to reduce data rates while also providing new functionalities, such as intrinsic editing or adaptation capabilities.
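To make the shape of such a pipeline concrete, here is a conceptual sketch of end-to-end learned compression that uses an untrained, random linear transform as a stand-in for learned analysis and synthesis networks. Real systems train these jointly with a rate estimate, so this illustrates only the structure, not the performance.

```python
# Conceptual sketch of end-to-end learned compression: an "analysis"
# transform, quantization, and a "synthesis" transform back to pixels.
# The random transform is a stand-in for a trained neural network.
import numpy as np

rng = np.random.default_rng(0)
D, LATENT = 64, 16                        # 8x8 block -> 16-dim latent
analysis = rng.standard_normal((LATENT, D)) / np.sqrt(D)
synthesis = np.linalg.pinv(analysis)      # stand-in for a learned decoder

def compress(block8x8: np.ndarray) -> np.ndarray:
    latent = analysis @ block8x8.reshape(-1)
    return np.rint(latent)                # quantization is where rate is saved

def decompress(latent_q: np.ndarray) -> np.ndarray:
    return (synthesis @ latent_q).reshape(8, 8)

block = rng.integers(0, 256, (8, 8)).astype(np.float32)
rec = decompress(compress(block))
print("MSE:", float(np.mean((block - rec) ** 2)))
```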
IDCC Comms: Are there any new or emerging issues unique to this year and the current environment?
Philippe: We cannot ignore the obvious impact of the pandemic and the need for communications systems and architectures that enable efficient remote work, education, and socialization. Eventually, we will reach a future in which the physical and the remote, the real and the virtual, are used at the same time and cooperate with each other. That will require appropriate formats and APIs, maybe something like a metaformat!
Privacy and user data protection are also significant considerations as we ask the question: how will we manage the protection of users and guarantee appropriate use of their personal data within fully digital environments? This may tie in with the integration of blockchain technologies into media consumption and delivery.