
Improving coding and delivery of video by exploiting the oblique effect
Research Paper / Feb 2014

Yuriy A. Reznik and Rahul Vanam
InterDigital Communications, Inc., 9710 Scranton Road, San Diego, CA 92121 USA
E-mail: {yuriy.reznik, rahul.vanam}@interdigital.com

Abstract: The oblique effect refers to lower visual sensitivity to diagonally oriented spatial oscillations than to horizontal and vertical ones. To exploit this phenomenon, we propose applying an adaptive anisotropic low-pass filter to video prior to encoding. We then describe the design of such a filter. Through experiments, we demonstrate that the use of this filter can yield appreciable bitrate savings compared to conventional filtering and encoding of the same content.

I. INTRODUCTION

It is known that, for a human observer, horizontal and vertical lines are more visible than diagonal ones. This so-called “oblique effect” has been known for at least seven decades. Early characterizations of it can be found in papers by Campbell et al. [1], [2], and Kelly [3]. Analytic models of the Contrast Sensitivity Function (CSF) incorporating this effect have been proposed by Daly [4] and Barten [5], [6].

A number of creative uses of this effect have also been proposed in the past. These include special sampling techniques, such as “diagonal sampling”, deployed in early printing and halftoning systems [7], and “dot-interlaced sampling”, deployed in early analog TV systems [8], [9]. The effect was also exploited by several advanced quantization techniques proposed for JPEG compression [10], [11]. A survey of several other known uses can be found in [9].

However, this effect does not seem to be fully appreciated or exploited by modern video systems. Most images and videos are now captured and processed using a square pixel grid, not a diagonal one. Spatial filters (resampling, anti-aliasing, etc.) are commonly implemented in a separable fashion, resulting in an extended rather than shortened frequency response in oblique directions.
Video encodings employing standards such as H.264/AVC [12] are now commonly produced using “flat” quantization weight matrices, suggesting that the oblique effect is not exploited by quantization and coding schemes either.

The objectives of this paper are: (1) to propose one possible way this effect can be exploited in the context of a video coding and delivery system, and (2) to quantify the gains that it can provide.

In this work, we assume that video is delivered by an adaptive system¹, shown in Figure 1. In this system, the transmitter knows the parameters of the reproduction setup (viewing distance, pixel density, contrast, etc.), which are delivered through a feedback loop from the receiver. These parameters are passed to a pre-processing filter. The function of this filter is to remove spatial oscillations that are invisible under the given viewing conditions. By removing such oscillations, the filter simplifies the video content, thereby leading to more efficient encoding.

¹ Related publications advocating the use of this adaptation model for mobile video streaming are [13], [14].

[Figure 1 blocks: perceptual pre-processing filter, encoder, network, decoder, display; sensors report distance, display parameters, and ambient light.]
Fig. 1. Architecture of an adaptive video delivery system exploiting the oblique effect. The pre-processing filter is used to remove spatial oscillations that are invisible under the current reproduction setup.

Fig. 2. 3D model of the Contrast Sensitivity Function [4], plotted as contrast sensitivity over horizontal frequency f_x (cpd) and vertical frequency f_y (cpd). A dent in the oblique directions indicates lower visual sensitivity for such frequencies.

It is the pre-processing filter in Figure 1 that we use to exploit the oblique effect. In the next section, we describe our proposed design of such a filter.
In Section III, we describe the experimental setup and characterize the effectiveness of our design. Conclusions are drawn in Section IV.

II. DESIGN OF A PRE-PROCESSING FILTER

A. Basic principles

We start with a characterization of the oblique effect through a model of the Contrast Sensitivity Function (CSF) of human vision [4], [6]. A typical shape of this model is shown in Figure 2. The model describes the relationship between frequencies of spatial oscillations and their contrast sensitivity thresholds. It is understood that spatial oscillations with contrast sensitivities below the CSF surface are detectable, and the ones above it are not detectable by human observers with normal vision. As shown in Figure 2, the oblique effect manifests itself as a dent in the CSF surface along diagonal directions.

Fig. 3. Illustration of the concept of spatial frequency f = 1/β (cycles per degree). Here β is the angle subtended by one period of the spatial oscillation, n is the wavelength, and d is the distance between the viewer and the screen.

Spatial frequencies in the CSF model are usually expressed in cycles per degree [cpd]. As further explained in Figure 3, the mapping between a spatial frequency f and a wavelength n in the pixel domain can be obtained as follows:

$$f = \frac{1}{\beta}\ \text{[cpd]}, \quad \text{where } \beta = 2\arctan\left(\frac{n}{2 d \rho}\right) \times \frac{180}{\pi}\ [^\circ], \tag{1}$$

where d is the distance between the viewer and the screen, and ρ is the pixel density (pixels per unit of length, in the same length unit as d, so that the argument of the arctangent is dimensionless).

Contrast sensitivity values in the CSF model are defined as reciprocals of the contrast thresholds $C_T$. In turn, contrast values are computed using the Michelson formula:

$$C = \frac{L_{\max} - L_{\min}}{L_{\max} + L_{\min}},$$

where $L_{\max}$ and $L_{\min}$ denote the maximum and minimum luminous intensities of an oscillation.

B. Limits imposed by the display

We first note that there ought to be a Nyquist frequency implied by the resolution of the display. Setting the wavelength in (1) to its shortest representable value, n = 2 pixels, we immediately obtain:

$$f_{D,\mathrm{Nyq}} = \frac{\pi}{360}\left[\arctan\left(\frac{1}{d \rho}\right)\right]^{-1}\ \text{[cpd]}. \tag{2}$$

We next look at contrast sensitivity limits.
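To make (1) and (2) concrete, the following Python sketch evaluates both expressions and the inverse mapping from a frequency in cpd back to a wavelength in pixels. The function names and the example viewing setup are illustrative assumptions and do not come from the experiments in this paper.

```python
import math


def spatial_frequency_cpd(wavelength_px, distance, pixel_density):
    """Eq. (1): frequency (cycles per degree) of an oscillation whose period
    spans `wavelength_px` pixels, for a viewer at `distance` from a screen with
    `pixel_density` pixels per unit length (same length unit as `distance`)."""
    beta_deg = 2.0 * math.degrees(
        math.atan(wavelength_px / (2.0 * distance * pixel_density)))
    return 1.0 / beta_deg


def display_nyquist_cpd(distance, pixel_density):
    """Eq. (2): Nyquist frequency of the display, i.e. Eq. (1) with n = 2."""
    return math.pi / (360.0 * math.atan(1.0 / (distance * pixel_density)))


def wavelength_px_for_frequency(freq_cpd, distance, pixel_density):
    """Inverse of Eq. (1): wavelength in pixels for a frequency in cpd."""
    beta_rad = math.radians(1.0 / freq_cpd)
    return 2.0 * distance * pixel_density * math.tan(beta_rad / 2.0)


# Hypothetical setup: viewer 24 inches from a 100 pixels-per-inch screen.
nyquist = display_nyquist_cpd(distance=24.0, pixel_density=100.0)
print(f"Display Nyquist frequency: {nyquist:.2f} cpd")
```

The inverse mapping is what a pre-processing filter would need in practice: a cutoff expressed in cpd has to be converted into a pixel-domain wavelength (or its reciprocal, a cutoff in cycles per pixel) before a digital filter can be designed.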
Let $L^D_{\max}$ and $L^D_{\min}$ denote the peak and base luminance of a display. Then, for any oscillation rendered on this device:

$$L^D_{\min} \le L_{\min} \le L_{\max} \le L^D_{\max}.$$

Applying these inequalities, it follows that

$$C = \frac{L_{\max} - L_{\min}}{L_{\max} + L_{\min}} \le \frac{L^D_{\max} - L^D_{\min}}{L^D_{\max} + L^D_{\min}} = C_D,$$

where $C_D$ is the contrast of the display. It can also be expressed as $C_D = \frac{CR - 1}{CR + 1}$, where $CR = L^D_{\max} / L^D_{\min}$ is the contrast ratio. It further follows that the contrast sensitivity $S = 1/C$ of any oscillation realizable by such a display cannot be less than

$$S_D = \frac{1}{C_D} = \frac{CR + 1}{CR - 1}. \tag{3}$$

C. Region of visible spatial frequencies

We now find the set of outermost points $f_c(\theta)$ where the CSF surface reaches the “sensitivity floor” $S_D$ imposed by the display:

$$f_c(\theta) = \max\left\{f : \mathrm{CSF}(f, \theta) = S_D\right\}, \quad \theta \in [0^\circ, 90^\circ]. \tag{4}$$

This function $f_c(\theta)$ can be understood as the boundary of the region of frequencies that are visible under the current viewing conditions.

Fig. 4. Frequency characteristics of (a) an idealized oblique filter (red), and (b) a conventional uniform low-pass filter (blue).

Using Daly’s CSF model [4], the function $f_c(\theta)$ can be further characterized as follows. Let $f'_c$ be the solution of (4) for the cardinal directions: $f'_c = f_c(0^\circ) = f_c(90^\circ)$. Then:

$$f_c(\theta) = f'_c \cdot \left(\frac{1 - \mu}{2}\cos(4\theta) + \frac{1 + \mu}{2}\right), \tag{5}$$

where µ = 0.78 is a constant. We show a plot of the function $f_c(\theta)$ in Figure 4. In the same plot, we also show the frequency characteristic of a conventional separable filter with cutoff $f'_c$. It can be observed that the region bounded by $f_c(\theta)$ is much smaller than the passband of the separable filter.

D. Design of an oblique filter

Our next task is to design a low-pass filter with the anisotropic frequency characteristic defined by (5). Here, we offer a very simple approximate solution; the derivation of a more elaborate design is left for future research. Our proposed solution is based on the approximation shown in Figure 5.
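As a quick numerical check of (3) and (5), the Python sketch below evaluates the display sensitivity floor and the angle-dependent cutoff. The function names and the example cardinal cutoff of 20 cpd are illustrative assumptions, not values reported in this paper.

```python
import math

MU = 0.78  # constant from Eq. (5)


def display_sensitivity_floor(contrast_ratio):
    """Eq. (3): lowest contrast sensitivity realizable on a display with the
    given contrast ratio CR = L_max / L_min (CR must exceed 1)."""
    return (contrast_ratio + 1.0) / (contrast_ratio - 1.0)


def oblique_cutoff(theta_deg, fc_cardinal, mu=MU):
    """Eq. (5): cutoff frequency at orientation theta (degrees), given the
    cutoff fc_cardinal that holds in the horizontal and vertical directions."""
    theta = math.radians(theta_deg)
    return fc_cardinal * ((1.0 - mu) / 2.0 * math.cos(4.0 * theta)
                          + (1.0 + mu) / 2.0)


# Hypothetical cardinal cutoff of 20 cpd, sampled every 15 degrees.
for angle in (0, 15, 30, 45, 60, 75, 90):
    print(angle, round(oblique_cutoff(angle, 20.0), 2))
```

At θ = 45° the expression in (5) reduces to µ f'_c = 0.78 f'_c. A square passband whose per-axis cutoff is µ/√2 ≈ 0.55 of f'_c has its corner at exactly that radial frequency, which is one plausible reading of the 0.55 factor used in the simplified design; the text does not state this derivation explicitly.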
The maximum cutoff, applied to the cardinal directions, remains $f'_c$, while the minimum cutoff, applied to oscillations along 45°, is approximately $0.55 f'_c$. As shown in Figure 6, this filter is implementable as a mix of three rectangular filters, which also allows a separable implementation.

We use the Lanczos kernel [15] to implement one-dimensional low-pass filters with a programmable cutoff frequency. The mapping between $f'_c$ or $0.55 f'_c$ and the corresponding pixel-domain cutoffs is done using (1).

The 2D filtering operation is realized by four one-dimensional filtering stages. We first apply 1D filters with $f_c = 0.55 f'_c$ and $f_c = f'_c$ along the rows of an input image to generate filtered images A1 and A2, respectively. We then compute a difference image A3 = A2 − A1. Next, we apply a filter with $f_c = f'_c$ to A1 along columns to obtain filtered image A4. Similarly, a filter with $f_c = 0.55 f'_c$ is applied to A3 along columns to obtain filtered image A5. Finally, the output filtered image is obtained as the sum of A4 and A5. The complexity of this anisotropic filter is about twice that of a regular 2D filtering operation.

E. Filtering example

We illustrate the operation of our proposed filter in Figure 7. Sub-figure (a) shows the star-shaped, synthetically generated image used in this test. Sub-figure (b) shows the amplified difference between the filtered images produced by a separable 2D filter with cutoff $f'_c$ and by our oblique filter. It can be observed that differences in the filtering results are apparent only for oblique edges.

Fig. 5. Simplified frequency characteristic of an oblique filter that is realizable using separable filters.

Fig. 6. The oblique filter in Figure 5 can be realized by adding the frequency components in (a) and (b) and subtracting (c).

III. EXPERIMENT SETUP AND RESULTS

A. Video sequences and encoder settings

In our experiments, we used the standard test sequences listed in Table I. We also used the x264 [16] video encoder in our tests.
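Returning briefly to the filter of Section II-D, its four-stage procedure can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions rather than the implementation used in the experiments: the kernel is a sinc response multiplied by a Lanczos window and normalized to unit DC gain, the tap count (`half_width`) and the reflective border handling are arbitrary choices, and cutoffs are given in cycles per pixel (the reciprocal of the wavelength n obtained from (1)). All function names are hypothetical.

```python
import numpy as np


def lanczos_lowpass_kernel(cutoff, half_width=8):
    """1D low-pass kernel: ideal sinc response times a Lanczos (sinc) window.
    `cutoff` is in cycles per pixel (0 < cutoff <= 0.5). Assumed design."""
    k = np.arange(-half_width, half_width + 1)
    ideal = 2.0 * cutoff * np.sinc(2.0 * cutoff * k)   # ideal low-pass taps
    window = np.sinc(k / (half_width + 1.0))           # Lanczos window
    kernel = ideal * window
    return kernel / kernel.sum()                       # unit gain at DC


def filter_axis(image, kernel, axis):
    """Convolve every 1D slice of `image` along `axis` with `kernel`,
    using reflective padding so the output keeps the input shape."""
    pad = len(kernel) // 2
    pad_width = [(0, 0)] * image.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(image, pad_width, mode="reflect")
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)


def oblique_filter(image, cutoff_cardinal, diag_factor=0.55, half_width=8):
    """Four 1D stages described in Section II-D (A1..A5 follow the text)."""
    image = np.asarray(image, dtype=np.float64)
    k_hi = lanczos_lowpass_kernel(cutoff_cardinal, half_width)
    k_lo = lanczos_lowpass_kernel(diag_factor * cutoff_cardinal, half_width)
    a1 = filter_axis(image, k_lo, axis=1)   # rows, cutoff 0.55 f'c
    a2 = filter_axis(image, k_hi, axis=1)   # rows, cutoff f'c
    a3 = a2 - a1                            # horizontal band between cutoffs
    a4 = filter_axis(a1, k_hi, axis=0)      # columns, cutoff f'c
    a5 = filter_axis(a3, k_lo, axis=0)      # columns, cutoff 0.55 f'c
    return a4 + a5
```

The resulting passband is the union of two rectangles, one extending to f'_c vertically and 0.55 f'_c horizontally and the other the reverse, which matches the "add (a) and (b), subtract (c)" description of Figure 6. A grating whose horizontal and vertical frequencies both lie between the two cutoffs is therefore suppressed, while a purely horizontal or vertical grating at the same frequency passes.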
The encoder was applied to both the original and the filtered versions of the test sequences. To achieve the same level of quality in encodings of the original and filtered versions of a sequence, we used fixed-QP rate control and applied the same QPs to all such encodings. The QP values selected for each sequence are shown in Table I. These QPs were found to produce encodings of the original (non-filtered) sequences at approximately 10 Mbps and 5 Mbps, which we consider practically relevant operating points.

TABLE I
TEST SEQUENCES, AND QPS SELECTED TO ACHIEVE 10 MBPS AND 5 MBPS. ALL SEQUENCES ARE 1920×1080, 25 FPS.

Sequence             10 Mbps             5 Mbps
name                 QP    PSNR (dB)     QP    PSNR (dB)
IntoTrees [17]       27    35.7          30    34.3
DucksTakeOff [17]    38    28.2          42    26.1
Parkjoy [17]         36    28.7          40    26.2
Bluesky [18]         24    41.3          28    39.2

Fig. 7. (a) Star test image. (b) Difference between the outputs of the conventional uniform filter and the oblique filter, magnified 100×.

B. Viewing conditions

In our experiments, we assumed 8 fixed viewing distances, resulting in observation angles in the range from 7° to 35°. The relationship between observation angle γ, viewing distance d, and display width w is:

$$\tan\left(\frac{\gamma}{2}\right) = \frac{w}{2 d}. \tag{6}$$

We also fixed the lighting conditions and measured the characteristics of the display. We used a dim room (with ambient illuminance of around 30 lux) and a 17” monitor with an effective contrast ratio (measured under the given lighting conditions) of around 300:1. Additionally, we measured the average screen luminance throughout playback of our video sequences and found it to be close to 50 cd/m².

C. Test process and verification

Given the above conditions, we next estimated the cutoff frequencies $f'_c$. We set the parameters of Daly’s CSF model [4] as follows:
• light adaptation level: l = 50 cd/m²,
• eccentricity: ε = 0,
• angular image size: i² = γ² (computed using (6)),
• viewing distance: d [for each point in our tests],
• absolute peak sensitivity: P = 200,
and then used this model to find the points

$$f'_c = \max\left\{f : \mathrm{CSF}(f, 0^\circ) = \frac{CR + 1}{CR - 1}\right\},$$

where CR is the estimated contrast ratio of our screen.

The obtained cutoff values $f'_c$ were subsequently passed to our oblique filter, as well as to the uniform (separable) filter. The original and all filtered versions of each sequence were then encoded. To ensure the same level of quality in encodings of the original and filtered sequences, we used the same encoder settings and the same fixed QPs.

We also performed simultaneous double-stimulus viewing of the encoded original and filtered sequences by a panel of 5 viewers. These tests confirmed that, under the specified viewing conditions, the encoded original and filtered sequences exhibit no significant differences. This was confirmed for the outputs of both the oblique and the uniform filters.

The final results of our tests are comparisons of:
• size of encoded original vs. uniform filtered sequences,
• size of encoded original vs. oblique filtered sequences,
• size of encoded uniformly filtered vs. oblique filtered sequences.
The first two comparisons are indicative of the absolute gains achievable by an adaptive system employing a perceptual pre-filter vs. conventional encoding. The last comparison is indicative of the gains achievable specifically by our oblique filter vs. conventional uniform filtering used in the same workflow.

D. The results

The results for each sequence in our tests are presented in Figures 8-11.
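The figures report bitrate savings in percent. Assuming the usual definition, namely the relative reduction in encoded size with respect to a reference encoding, the three comparisons listed in Section III-C can be computed as in the Python sketch below. The function names and the example sizes are hypothetical and are not measurements from this paper.

```python
def bitrate_savings_percent(reference_size, test_size):
    """Percentage by which `test_size` is smaller than `reference_size`."""
    if reference_size <= 0:
        raise ValueError("reference_size must be positive")
    return 100.0 * (1.0 - test_size / reference_size)


def compare_encodings(original_size, uniform_size, oblique_size):
    """The three comparisons of Section III-C for one sequence and one QP."""
    return {
        "uniform_vs_original": bitrate_savings_percent(original_size, uniform_size),
        "oblique_vs_original": bitrate_savings_percent(original_size, oblique_size),
        "oblique_vs_uniform": bitrate_savings_percent(uniform_size, oblique_size),
    }


# Made-up encoded sizes in bytes, for illustration only.
print(compare_encodings(original_size=1000, uniform_size=500, oblique_size=450))
```

Under this definition the third figure is relative to the uniformly filtered encoding, not to the original, so the three percentages are linked multiplicatively: the fraction of bits remaining after oblique filtering equals the fraction remaining after uniform filtering times the fraction remaining in the oblique-vs-uniform comparison.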
We first observe that both oblique and uniform filtering achieve very significant (over 50%) bitrate savings as viewing distance increases and observation angles become small. This indicates that adaptation to viewing distance and other conditions by means of pre-filtering can be worthwhile.

We also observe that the use of the oblique filter in our tests resulted in additional savings of up to 5–10% (as shown by the blue curves in Figures 8-11). These gains seem most pronounced in the range of viewing angles from 12° to 25°. We also note that for some sequences, such as “IntoTree” and “BlueSky”, the oblique filter produces gains even for wide (30°–35°) viewing angles. This suggests that in some cases the use of the oblique filter may be meaningful even without precise knowledge of the characteristics of the reproduction setup.

[Figures 8-11 each contain two panels plotting bitrate savings (%) against viewing angle (degrees), with three curves: oblique filter vs. no filter, uniform filter vs. no filter, and oblique filter vs. uniform filter.]

Fig. 8. Bitrate savings for sequence “IntoTree”. Reference rates: (a) 10 Mbps, (b) 5 Mbps.

Fig. 9. Bitrate savings for sequence “Parkjoy”. Reference rates: (a) 10 Mbps, (b) 5 Mbps.

Fig. 10. Bitrate savings for sequence “Bluesky”. Reference rates: (a) 10 Mbps, (b) 5 Mbps.

Fig. 11. Bitrate savings for sequence “DucksTakeOff”. Reference rates: (a) 10 Mbps, (b) 5 Mbps.

IV. CONCLUSIONS

We have proposed an adaptation mechanism and the design of a pre-filter that exploit the visibility limits implied by the contrast sensitivity function and the oblique effect for coding and delivery of visual information.

Through experiments, we have shown that the use of our adaptation model may yield significant (over 50%) bitrate savings compared to a conventional coding and delivery approach. We have also shown that exploitation of the oblique effect in this delivery model results in additional savings of up to 5–10%. These gains were most consistent for viewing angles in the range of 12°–25°. We also observed gains for wider viewing angles, but only for a subset of the sequences in our tests.

REFERENCES

[1] F. W. Campbell, J. J. Kulikowski, and J. Levinson, “The effect of orientation on the visual resolution of gratings,” The Journal of Physiology, vol. 187, no. 2, pp. 427–436, 1966.
[2] F. W. Campbell and J. J. Kulikowski, “Orientational selectivity of the human visual system,” The Journal of Physiology, vol. 187, no. 2, pp. 437–445, 1966.
[3] D. H. Kelly, “No oblique effect in chromatic pathways,” J. Opt. Soc. Am., vol. 65, pp. 1512–1514, Dec. 1975.
[4] S. J. Daly, “Visible differences predictor: an algorithm for the assessment of image fidelity,” in SPIE/IS&T 1992 Symposium on Electronic Imaging: Science and Technology, pp. 2–15, SPIE, 1992.
[5] P. Barten, “Contrast sensitivity of the human eye,” Japan Display ’92, pp. 751–754, 1992.
[6] P. G. J. Barten, “Formula for the contrast sensitivity of the human eye,” Proc. SPIE 5294, pp. 231–238, 2003.
[7] R. Ulichney, Digital Halftoning. The MIT Press, 1987.
[8] Y. Ninomiya et al., “An HDTV broadcasting system utilizing a bandwidth compression technique-MUSE,” IEEE Trans. Broadcasting, vol. BC-33, no. 4, pp. 130–160, 1987.
[9] W. E. Glenn, “Digital image compression based on visual perception,” in Digital Images and Human Vision, A. B. Watson, ed., pp. 63–72, MIT Press, Cambridge, MA, 1993.
[10] W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard. Springer, 1993.
[11] W. Zeng, S. Daly, and S. Lei, “An overview of the visual optimization tools in JPEG 2000,” Signal Processing: Image Communication, vol. 17, pp. 85–104, 2002.
[12] T. Wiegand, G. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Trans. CSVT, vol. 13, no. 7, pp. 560–576, 2003.
[13] Y. Reznik et al., “User-adaptive mobile video streaming,” in Visual Communications and Image Processing, pp. 1–1, 2012.
[14] Y. Reznik, “User-adaptive mobile video streaming using MPEG-DASH,” IEEE COMSOC MMTC E-Letter, vol. 8, no. 2, pp. 39–41, 2013.
[15] C. Duchon, “Lanczos filtering in one and two dimensions,” Journal of Applied Meteorology, vol. 18, no. 8, pp. 1016–1022, 1979.
[16] “x264 encoder.” http://www.videolan.org/developers/x264.html.
[17] “SVT test set.” ftp://vqeg.its.bldrdoc.gov/HDTV/SVT MultiFormat/.
[18] C. Keimel et al., “Visual quality of current coding technologies at high definition IPTV bitrates,” in IEEE MMSP, pp. 390–393, 2010.