Realism in Augmented Reality
White Paper / Jun 2017 / ar/vr

Despite the relatively recent commercial availability of consumer devices, various market forecasts are expecting a rapid increase in the number of users and the market value of augmented reality (AR) content in the coming years.  When it comes to media consumption, the quality of the experience is the major driving factor for success, as seen in the popular mobile gaming sensation from Niantic, Pokémon Go. With future AR developments, the realism of the content will be a vital factor in determining overall experience quality; this white paper showcases key aspects of realism in the context of AR and introduces related solutions developed through InterDigital's Innovation Partners open innovation initiative.

Motivation / background
The promise of augmented reality (AR) is the ability to embed virtual information directly into our physical environment, so realistically rendered that virtual elements become indistinguishable from real ones. Using AR to add realistic virtual elements to our physical surroundings will allow for new kinds of experiences where the world is improved by virtual content blending seamlessly with the real, all tailored to meet our individual needs and desires. Early head-mounted display (HMD) devices, such as the Microsoft HoloLens, do a fine job of overlaying virtual information directly on top of our normal view. As they become more widely adopted by consumers, the level of realism and the quality of the illusion will become significantly more important.
With the advent of these devices, we are in the early days of a completely new form of media. The transition is imminent, as HMDs are almost at the level of usability expected by the average consumer, and the computing power enabling real-time high-end graphics, limited to desktop computers only a few years ago, is rapidly becoming commonplace on mobile computing platforms.
Despite the relatively recent commercial availability of consumer devices, various market forecasts expect a rapid increase in the number of users and the market value of AR content in the coming years. Indeed, Pokémon Go, a mobile game developed by Niantic, became a worldwide phenomenon in a matter of weeks. Its success has demonstrated that even more primitive forms of mobile AR can take off quickly when the content and usability are right. This underscores a significant point: when it comes to media consumption, the quality of the experience is the major driving factor for success. In the near future, AR will be used as an ingredient for completely new kinds of experiences, new types of entertainment, and productive applications.
However, when crafting novel AR experiences, the details will be extremely important, as they can mean the difference between being just one app among thousands and being a killer app, the next Pokémon Go. With future AR experiences, the realism of the content will be a key factor in determining the quality of the overall experience. This white paper will highlight some key aspects of realism in the context of AR and introduce related solutions developed by InterDigital's Innovation Partners open innovation initiative.
Varieties of realism
The realism of AR is a measure of how realistic combinations of virtual and physical elements look, feel, and behave, and it is a quality that can be defined from several viewpoints. One categorization originally proposed by Ferwerda [1] divides the realism of computer graphics into three varieties:
1. Physical realism: The image provides the same visual stimulation as the scene.
2. Photorealism: The image produces the same visual response as the scene.
3. Functional realism: The image provides the same visual information as the scene.
Ignoring optical filtering and scattering in the eye, physical realism means that the image produced by the computer has to be an accurate point-by-point representation of the spectral irradiance values at a particular viewpoint in a real world scene. A computer-generated image should contain identical representations of all objects in all light energy spectral and intensity ranges that the real world scene features.
Photorealism means that the produced image needs to be photometrically realistic. The image produced by the computer has to produce the same visual response as the real world scene, even if the physical energies between the two vary in spectral and intensity ranges.
The criterion for functional realism suggests that the image produced by the computer has the same visual information as the real world scene. Information helps the user to understand meaningful properties of objects in a scene, such as shapes, sizes, positions, motions, and materials, and therefore enables the user to perform useful visual tasks.
[1] Ferwerda, James A. "Three varieties of realism in computer graphics." Electronic Imaging 2003. International Society for Optics and Photonics, 2003.
Innovation Partners | White Paper
The emphasis of academic research on computer graphics has traditionally been on photorealism, which has led to the steadily increasing quality of real-time rendered 3D graphics. Despite its close connection with computer graphics, realism in AR needs to be considered from a wider perspective. AR is not only about visual appearance. AR experiences are meant to be consumed as a part of the real physical environment and, as such, they are interactive, reacting to both the user and the environment. To address the interactive and physical aspects of AR, we propose criteria for rating AR experiences that are similar to Ferwerda's for computer graphics, with some subtle, and some not-so-subtle, modifications:
1. Physical realism in AR: Virtual and physical elements are indistinguishable from each other; all elements look and behave as if they were physically part of the reality.
2. Photorealism in AR: All virtual elements look as if they are physically part of the real world scene that the user is observing.
3. Functional realism in AR: All virtual elements behave according to the physical reality. They represent the virtual object correctly in relation to the physical space and help the user perform tasks assisted by AR.
With a closer examination of each of these aspects of realism in AR, we can determine how close we are to achieving full realism, and what novel solutions have recently been developed to move us closer to that goal.
Physical realism in AR
Full physical realism with true Holodeck-like experiences will remain a distant ultimate goal for AR. Steps towards increasing physical realism in AR involve developing not only how virtual elements appear to the viewer, but also interactivity - how a viewer can literally touch, feel, and manipulate virtual objects directly with their bare hands. Moving towards fully natural and intuitive interaction requires far more complexity and richness than the traditional point-and-click style lifted from Windows-based user interfaces, which is prevalent in early AR applications. Advanced AR applications require completely new methods for input and output, as well as a different approach to crafting the user experience.
From an input and output technologies perspective, enabling a new level of interaction requires accurate detection and structural understanding of the physical environment and the users within it. Methods for enabling users to directly manipulate virtual elements by touching them are also needed. This requires not only accurate detection of user actions and gestures, but also a means of providing a realistic haptic sensation of virtual objects, so that the user "feels" what they would if they were touching real physical objects.
An accurate structural understanding of the physical environment requires the use of various depth-sensing technologies, which solve the biggest challenges, to an extent. Depth cameras, or RGB-D sensors as they are often called, consist of a camera sensor providing regular 2D RGB information and a depth sensor operating on a structured-light or time-of-flight principle, providing depth information in the form of depth map images. Data collected by such RGB-D sensors enable reconstruction of full 3D models of the local physical environment surrounding the user. These RGB-D sensors are now being integrated with many of the AR HMD devices in development.
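As a rough illustration of how a depth map image becomes 3D geometry, the sketch below back-projects each valid depth pixel into a 3D point using a standard pinhole camera model. This is only a minimal sketch under stated assumptions: the function name depth_map_to_points and the intrinsic parameters fx, fy, cx, cy are hypothetical and do not come from the white paper, and a real reconstruction pipeline would also need camera pose tracking and fusion of many frames.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def depth_map_to_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project a depth map into camera-space 3D points.

    Assumptions (not from the source document): depth values are metric
    distances along the optical axis, a value <= 0 marks an invalid pixel,
    and (fx, fy, cx, cy) are pinhole intrinsics expressed in pixels.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):        # v: pixel row (image y)
        for u, z in enumerate(row):        # u: pixel column (image x)
            if z <= 0:
                continue                   # skip pixels with no depth reading
            x = (u - cx) * z / fx          # pinhole model: u = fx * x / z + cx
            y = (v - cy) * z / fy          # pinhole model: v = fy * y / z + cy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with two invalid pixels (hypothetical values).
    demo_depth = [[0.0, 2.0], [1.0, 0.0]]
    for point in depth_map_to_points(demo_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(point)
```

The resulting point set is expressed in the coordinate frame of the sensor; combining points from several viewpoints into one model is the part that depends on the tracking accuracy discussed elsewhere in this paper.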
We are also seeing mobile phones with embedded RGB-D sensors entering consumer markets, led by Google's Tango (formerly Project Tango), which paved the way for RGB-D mobile device sensor integration, allowing the development of new applications and uses.
RGB-D sensors can also be used for the detection of user hand poses and gestures, which form a control input for AR applications. Despite the relatively good accuracy of these solutions, there is room for new innovations to better support natural input. Only a limited number of hand gestures can be accurately recognized from RGB-D data today. Robust detection of individual physical objects, of user interaction with them, and of the context of use are areas of study still in their infancy. Speech recognition has often been used as an alternative or supplementary input method for cases where hand gestures and direct manipulation do not suffice. Traditional input methods, such as a keyboard, do not work well with AR.
Compared to other relevant interaction technology areas, haptic feedback is both the single most important area and one that is severely lacking in easy-to-deploy solutions. Haptic feedback, the sensation of touch that we get when we are in contact with physical objects, is an essential input channel we use when carrying out tasks in everyday life. At this time, a number of AR haptic feedback approaches are under consideration, ranging from ultrasonic to mechanical haptic force feedback. These approaches tend to be experimental in nature. The haptic approach used is very application-specific, as no single approach is suitable for all generic, easy-to-deploy applications. Until radically new approaches exist, physical proxy objects and the replacement of haptic feedback with other sensory channels, such as audio, will often need to be used. Even with these shortcomings, goals for increasing physical realism in AR are worthwhile.
Any solution that increases physical realism, even one with limited use cases and poor interaction fidelity, can result in a dramatic change in the nature of the experience. Often, when several sub-optimal input and output methods are used in combination, a user's brain will fuse the feedback from different sensory systems, resulting in a jump to a completely new level of immersion compared to the use of a single sensory channel such as vision.
One example of increasing physical realism with a combination of feedback channels is a solution developed by InterDigital to provide haptic feedback with relatively flat physical proxy objects. AR is used to generate visual feedback and track the user's hand for input. In this solution, instead of matching the full 3D shape of the virtual object, the proxy objects are compressed along the depth axis, resulting in a physical proxy that is easy to produce with current 3D or 2.5D printing methods and is easier to handle than full 3D objects matching the virtual elements. When a user touches the physical proxy object, the point of touch is detected and a simultaneous AR visualization of the virtual object is aligned with the touch point. The visualization substitutes for the depth removed by the compression, and the user feels as if they are really touching the full 3D shape of the virtual object.
Photorealism in AR
From the photorealistic point of view, the focus is on the image quality that the AR system is capable of producing. The goal of high photorealism is to combine virtual elements with a real world view so that the virtual elements are indistinguishable from real ones using visual inspection alone. To achieve such photorealistic visual quality, virtual elements need to be lit with lighting identical to that present in the environment, and the visual implications of the virtual elements on the physical environment need to be considered. This is easier said than done.
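To make the idea of lighting virtual elements consistently with the environment a little more concrete, the following minimal sketch shades a virtual surface point with a simple Lambertian model, using an ambient term estimated from camera pixels. It is an illustration under loud assumptions, not InterDigital's method: the function names estimate_ambient and shade_lambert are hypothetical, averaging the camera image is only a crude stand-in for real light estimation, and the light direction and color are assumed to be supplied by some separate estimation step.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]
Vec3 = Tuple[float, float, float]


def estimate_ambient(pixels: Sequence[RGB]) -> RGB:
    """Crude ambient light estimate: the mean color of the camera pixels.

    Assumption: color channels are linear values in the range [0, 1].
    """
    count = len(pixels)
    if count == 0:
        return (0.0, 0.0, 0.0)
    return (
        sum(p[0] for p in pixels) / count,
        sum(p[1] for p in pixels) / count,
        sum(p[2] for p in pixels) / count,
    )


def shade_lambert(
    albedo: RGB,
    normal: Vec3,
    light_dir: Vec3,
    light_color: RGB,
    ambient: RGB,
) -> RGB:
    """Lambertian shading of one surface point of a virtual object.

    Assumption: normal and light_dir are unit vectors, and light_dir
    points from the surface towards the light source.
    """
    # Diffuse term is proportional to the cosine of the incidence angle,
    # clamped so that surfaces facing away receive no direct light.
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return (
        min(1.0, albedo[0] * (ambient[0] + light_color[0] * n_dot_l)),
        min(1.0, albedo[1] * (ambient[1] + light_color[1] * n_dot_l)),
        min(1.0, albedo[2] * (ambient[2] + light_color[2] * n_dot_l)),
    )


if __name__ == "__main__":
    # Hypothetical camera samples and light parameters.
    ambient = estimate_ambient([(0.2, 0.2, 0.2), (0.2, 0.2, 0.2)])
    print(shade_lambert((1.0, 1.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0),
                        (0.5, 0.5, 0.5), ambient))
```

Even this simplified model shows why the problem is hard in practice: every input other than the albedo has to be inferred from the real environment at run time.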
Lighting solutions that simulate lighting effects and capture the look of the real world are required. A detailed understanding of the environment's structure and materials, the simulation of light interaction between the virtual and real elements, and the seamless embedding of the computer-generated synthetic elements with the real world view make this a challenging problem to solve. For some of these tasks, solutions already exist, but others, such as seamless composition, are quite hard to solve and require novel approaches.
Seamless composition of virtual elements blended with the real world view has specific challenges. For example, with optical see-through AR, the viewer sees an unobstructed view of the real world, and virtual elements need to be overlaid on top of it. With any of the currently available display technologies used for existing AR HMDs, the mismatch of image quality between the virtual and real view causes virtual elements to stand out clearly from the composited view. This is often even further highlighted by mismatched lighting, unwanted transparency, and erroneous occlusions featured on the virtual elements.
The dynamic range of the human visual system and the accommodation of visual perception to prevailing conditions further complicate the reproduction of virtual objects. Existing displays can reproduce only a fraction of the dynamic range of the human visual system, and optical see-through display systems generally suffer from image transparency, which causes virtual elements to appear semi-transparent with a ghost image of the background environment bleeding through. Many of these shortcomings need to be solved by the display hardware, but solutions that enable better matching of lighting conditions between virtual and real elements, correct handling of occlusions, and better tone range matching can significantly improve the photorealism of AR.
InterDigital has developed solutions that enable automatic adjustment of virtual content elements so that the content as a whole features uniform lighting conditions, thus improving the photorealism of the AR, as well as methods enabling full re-lighting of real-time captured 3D content. These enable use cases where the visual appearance of a user is captured in real time and augmented to photorealistic quality in another physical space with a significantly different kind of lighting setup.
Functional realism in AR
Functional realism is the least rigid variety of AR realism. When looking at realism from this point of view, the quality comes from how accurately the AR content manages to convey meaningful information. As an example, in an AR-assisted maintenance use case where the goal for the AR system is to help a maintenance worker see which parts of the machinery need to be worked on, it is essential that the AR system can point out the correct parts and effectively illustrate operational details with an easy-to-understand graphical style. Functional realism is a measure of how effectively the AR system helps the user perform the assigned task.
When considering functional realism, the augmented elements can be rendered with any rendering style, be it cel shading, line drawing, or photorealistic textured materials. For the user to be able to carry out the task with the help of AR, the rendering style is irrelevant as long as it helps the AR system provide the information necessary to execute the task. When focusing on the effectiveness of information transfer, a much wider range of content types can be considered and compared.
The realism here is not so much dependent on the details of the rendering, but rather on underlying technology. Accuracy of 3D tracking (i.e., knowing where the AR device is in relation to the environment), understanding of the environment, object recognition, and context recognition are some of the key elements that enable an AR application to display correct information at the correct moment for the user, thus providing needed information efficiently.
If the aim of AR technology is to move towards a ubiquitous computing dream, where users wear AR HMDs throughout their daily lives, then functional realism becomes a very important consideration. AR devices need to be able to adjust operation according to many different contexts, always aiming for the most efficient way of delivering needed information to the user at the moment when it's needed. In addition to core technologies, such as robust 3D tracking and context recognition, solutions aimed at improving functional realism can include ways to extract richer information from the environment, which in turn can be used to tailor content delivered back to the user.
InterDigital has been developing a concept system for collecting and refining information from different user environments, which can then be used to improve functional realism. With this solution, the data from device sensors (such as RGB-D and motion sensors) embedded in AR HMDs are collected in a continuous and centralized manner in order to create an extremely rich information repository. From this information repository, deeper insight into the environment, users, and relationships between them can be extracted with further analysis.
Such extracted information enables the improvement of functional realism for various AR experiences and services, as well as the development of completely new kinds of services that can identify situations where users need assistance, match virtual experiences with physical environments, and offer the potential to perform virtual product placement for advertisers.
Conclusions
We have discussed some views on the realism of AR experiences and how it differs from realism in computer graphics. Clearly, computer graphics are an essential component of AR and, as such, the two have much in common when viewed from the perspective of realism. However, we have also pointed out that there are significant differences between the two, especially as AR has a much more profound connection with the physical environment and the interaction between virtual, real, and user. Based on these observations, we have extended the viewpoint of realism from the purely visual aspects addressed by computer graphics to a wider perspective that also considers the interaction between virtual content, the real physical environment, and the user.
With this wider perspective, the visual aspects of realism are not necessarily the ones most severely lacking solutions, and we believe solutions that add physical realism in AR, such as haptic feedback and natural interaction, are very much in demand, since they carry the potential of elevating the realism and immersion of AR to the next level. We have also introduced recent solutions developed by InterDigital aimed at improving the realism of AR, and we want to conclude this white paper by encouraging everyone to continue to push the limits of realism in AR.
Innovation Partners www.innovation-partners.com
WP_201705_007