Adaptive Bilateral Filter for Video and Image Upsampling

Research Paper / Jan 2012



Rahul Vanam and Yan Ye

InterDigital Communications, LLC, 9710 Scranton Road, San Diego, CA 92121 USA

ABSTRACT

Upsampling is a post-processing method for increasing the spatial resolution of an image or video. Most video

players and image viewers support upsampling functionality. Sometimes upsampling can introduce blurring,

ringing, and jaggedness artifacts in the upsampled video or image thereby lowering its visual quality. In this paper,

we present an adaptive bilateral interpolation filter for upsampling a video or image by an arbitrary upsampling

factor, and show that it mitigates most of the artifacts produced by conventional upsampling methods.

Keywords: Upsampling, ringing artifact, jaggedness, sinc filter, Bilateral filter, Lanczos filter, Sobel operator.

1. INTRODUCTION

With the advances of wireless networks and the increasing popularity of video-capable mobile devices such

as tablets and smartphones, consumption of digital video content has been increasing rapidly. Video hosting

websites such as YouTube and Dailymotion often store the same video content in a variety of bit rates and

resolutions. When a user requests a certain video over wireless networks, usually the lower resolution and lower

bit rate version is delivered due to bandwidth constraints. Further, if the encoded video has a resolution lower

than the screen resolution on the device, the video is usually upsampled before being displayed to fit the whole

screen. Sometimes the upsampling operation can introduce artifacts such as ringing, jaggedness, and blurriness,

thereby lowering the visual quality. This calls for an upsampler that introduces minimal

artifacts, thereby ensuring good visual quality.

Tomasi and Manduchi [1] presented a bilateral filter for smoothing an image while preserving its edges. Bilateral

filters have since found several applications, including denoising, contrast management, depth reconstruction,

data fusion, 3D fairing, and upsampling [2]. Adaptive bilateral filters have been used for sharpness enhancement

and denoising of images [3].

Several upsampling methods use bilateral filters. Hung and Siu [4] presented an edge-directed interpolation

using a weighted least squares estimator, where the weights of the estimator were modeled by a bilateral

filter. Han et al. [5] used a bilateral filter to decompose an image into detail layer and base layer images, which

were then denoised, interpolated, and combined to generate an upsampled image. Yang and Hong [6] presented a

bilateral interpolation filter for image upsampling. Although this approach alleviates the ringing artifacts produced

by ideal sinc-based filters, it produces jaggedness around slant edges. Jaggedness or staircase artifacts are

commonly produced by bilateral filters [2]. In this paper, we present an upsampling scheme that uses an adaptive

bilateral interpolation filter. We compare our scheme against fixed bilateral and Lanczos interpolation filters,

and demonstrate that our scheme reduces jaggedness artifacts in the upsampled video.

The remainder of this paper is organized as follows. In Section 2, we describe the adaptive bilateral filter.

We describe our upsampling scheme in Section 3. Details of our experiments and results are provided in Section

4, and we conclude in Section 5.

Further author information:

R.V.: E-mail: rahul.vanam@interdigital.com

Y.Y.: E-mail: yan.ye@interdigital.com

2. ADAPTIVE BILATERAL FILTER

In this section, we shall first describe the bilateral filter and then the adaptive bilateral filter. The bilateral filter

operation is given by

$$
y[m,n] = \sum_{k}\sum_{l} R^{-1}(k,l)\, h_d(m,n;k,l)\, h_r(x[m,n], x[k,l])\, x[k,l], \tag{1}
$$

where x[m,n] is the input image, y[m,n] is the filtered image, R(k, l) normalizes the filter coefficients so that they sum to

unity, and hd(.) and hr(.) are the domain and range filters, respectively. Gaussian filters are often used for both the

domain and range filters [1, 3], and are defined as follows

$$
h_d(m,n;m_0,n_0) = e^{-\left(\frac{(m-m_0)^2+(n-n_0)^2}{2\sigma_d^2}\right)}
\quad\text{and}\quad
h_r(x[m,n], x[m_0,n_0]) = e^{-\left(\frac{(x[m,n]-x[m_0,n_0])^2}{2\sigma_r^2}\right)}, \tag{2}
$$

where [m0, n0] is the center pixel of the window W_{m0,n0} = {[m,n] : [m,n] ∈ [m0−N, m0+N] × [n0−N, n0+N]}, and

σd and σr are the standard deviations of the domain and range filters, respectively. Therefore,

the domain filter gives higher weights to pixels closer to the center pixel [m0, n0], while the range filter gives

higher weights to pixels whose gray-scale values are closer to that of the center pixel x[m0, n0]. Thus, when a bilateral

filter operates on an edge pixel, it behaves as an elongated Gaussian filter oriented along the edge direction, since

it assigns higher weights to neighboring edge pixels and smaller weights to pixels in the gradient direction [3].
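As a rough illustration of Equations (1) and (2), the following sketch filters a small gray-scale image stored as a list of lists. The function name, the default parameter values, the test image, and the choice to truncate the window at the image border are assumptions made for this example only; the normalizer R is taken to be the sum of the combined weights in the window.

```python
import math

def bilateral_filter(x, N=1, sigma_d=1.0, sigma_r=20.0):
    """Sketch of Equations (1)-(2) on a 2-D list of gray-scale values."""
    rows, cols = len(x), len(x[0])
    y = [[0.0] * cols for _ in range(rows)]
    for m0 in range(rows):
        for n0 in range(cols):
            acc = 0.0   # weighted sum of the pixels in the window
            norm = 0.0  # sum of the weights (plays the role of R)
            # Assumption: the window is simply truncated at the border.
            for m in range(max(0, m0 - N), min(rows, m0 + N + 1)):
                for n in range(max(0, n0 - N), min(cols, n0 + N + 1)):
                    # Domain filter: closer pixels get larger weights.
                    h_d = math.exp(-((m - m0) ** 2 + (n - n0) ** 2)
                                   / (2 * sigma_d ** 2))
                    # Range filter: similar gray levels get larger weights.
                    h_r = math.exp(-((x[m][n] - x[m0][n0]) ** 2)
                                   / (2 * sigma_r ** 2))
                    acc += h_d * h_r * x[m][n]
                    norm += h_d * h_r
            y[m0][n0] = acc / norm
    return y

# Hypothetical test image: a vertical step edge with one noisy sample.
image = [
    [10, 10, 200, 200],
    [10, 14, 200, 200],
    [10, 10, 200, 200],
]
filtered = bilateral_filter(image)
```

In this example the noisy sample is pulled toward its dark neighbors, while the bright side of the edge contributes almost nothing because its range weights are close to zero. This is the edge-preserving behavior described above.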

Zhang and Allebach [3] introduced an adaptive bilateral filter by modifying Equation (2) to include an offset ζ in

the range filter equation, and by adapting ζ and σr to local image characteristics. The domain and range filters of

an adaptive bilateral filter are defined as follows

$$
h_d(m,n;m_0,n_0) = e^{-\left(\frac{(m-m_0)^2+(n-n_0)^2}{2\sigma_d^2}\right)}
\quad\text{and}\quad
h_r(x[m,n], x[m_0,n_0]) = e^{-\left(\frac{(x[m,n]-x[m_0,n_0]-\zeta[m_0,n_0])^2}{2\sigma_r^2}\right)}. \tag{3}
$$

The parameter ζ controls the sharpness of the image [3]. When ζ moves the center of the range filter closer to the local mean, the filtered image

appears blurrier, while shifting it away from the local mean sharpens the filtered image.
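The effect of the offset in Equation (3) can be seen in a small numerical sketch. The function name and the sample values below are hypothetical and chosen only for illustration; the sketch shows that ζ moves the peak of the range filter from x[m0, n0] to x[m0, n0] + ζ.

```python
import math

def adaptive_range_weight(x_mn, x_center, zeta, sigma_r):
    """Range filter of Equation (3): a Gaussian centered at x_center + zeta."""
    return math.exp(-((x_mn - x_center - zeta) ** 2) / (2 * sigma_r ** 2))

center = 100.0                      # hypothetical center pixel value
neighbors = [90.0, 100.0, 110.0]    # hypothetical neighboring pixel values

# Range weights of the three neighbors for two different offsets.
weights_by_zeta = {
    zeta: [adaptive_range_weight(v, center, zeta, sigma_r=10.0)
           for v in neighbors]
    for zeta in (0.0, 10.0)
}
```

With ζ = 0 the neighbor equal to the center pixel receives the largest weight, which is the ordinary range filter of Equation (2). With ζ = 10 the largest weight moves to the neighbor at 110.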

3. ADAPTIVE BILATERAL FILTER FOR IMAGE OR VIDEO UPSAMPLING

Figure 1 illustrates our approach for upsampling an image or video by factors of Mh/Nh and Mv/Nv in the horizontal and vertical

directions, respectively. The input image or frame is first horizontally upsampled followed by vertical upsampling

as shown in Figure 1(a). In this section, we shall describe our approach for upsampling a video in YUV 4:2:0

format.

3.1 Upsampling chroma

For upsampling chroma, we use a conventional sinc interpolation filter as illustrated in Figure 1(c). The input is

first upsampled by a factor of M through zero insertion, followed by low-pass filtering and decimation by a factor of N. The following

sinc filter is used for horizontal upsampling

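$$
f_h[n] = w[n]\,\mathrm{sinc}\!\left(\frac{\pi n}{M_h}\right),
\quad\text{where}\quad
w[n] = e^{-\frac{(n-u)^2}{2\sigma^2 u^2}}
\quad\text{and}\quad
u = \frac{N-1}{2}, \tag{4}
$$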

where N is the length of the filter, and σ is the standard deviation of the Gaussian window. We use a similar

sinc filter during vertical upsampling. Equation (4) can be reduced to a polyphase filter bank that requires fewer

operations during upsampling, as illustrated in Figure 2. If x[k] is the input (since we use separable

filters, the input is a one-dimensional vector), its corresponding phase Φ is computed by

$$
\Phi = \mathrm{mod}(M_h k, N_h), \tag{5}
$$

where mod(.) is the modulus operator. A phase filter $f_h^{(\Phi)}$ is obtained by decimating the sinc filter fh as follows

$$
f_h^{(\Phi)} = f_h[j], \quad\text{where } j = \Phi + iM_h \text{ and } j < N. \tag{6}
$$
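The construction of the windowed sinc taps in Equation (4) and their split into phase filters in Equation (6) can be sketched as follows. The function names, the filter length, the value of σ, and the upsampling numerator are hypothetical. Equation (4) writes the sinc argument in terms of n; this sketch additionally assumes that the sinc is centered on the same tap u as the Gaussian window, and that sinc(t) denotes sin(t)/t.

```python
import math

def sinc(t):
    """Unnormalized sinc, sin(t)/t, with the limit value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(t) / t

def windowed_sinc(N, M_h, sigma):
    """Gaussian-windowed sinc taps in the spirit of Equation (4).

    Assumption: the sinc is centered at u = (N - 1) / 2, like the window.
    """
    u = (N - 1) / 2
    taps = []
    for n in range(N):
        w = math.exp(-((n - u) ** 2) / (2 * sigma ** 2 * u ** 2))
        taps.append(w * sinc(math.pi * (n - u) / M_h))
    return taps

def phase_filter(f_h, phase, M_h):
    """Equation (6): keep the taps f_h[j] with j = phase + i * M_h, j < N."""
    return f_h[phase::M_h]

M_h = 2                                        # hypothetical numerator
f_h = windowed_sinc(N=9, M_h=M_h, sigma=0.5)   # hypothetical N and sigma
bank = [phase_filter(f_h, phase, M_h) for phase in range(M_h)]
```

Each phase filter holds roughly N/Mh taps, so producing one output sample costs a fraction of the multiplications that the full-length filter would need. This is the saving the polyphase structure is meant to provide.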

3.2 Upsampling luma

In this section, we describe our approach for upsampling luma in the horizontal direction. The same scheme is

used for vertical upsampling. Our scheme consists of an edge detector, a sinc filter, and an adaptive bilateral

filter as illustrated in Figure 1(b).

We perform edge detection by applying horizontal and vertical Sobel operators defined below on the input

frame

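$$
G_h = \begin{bmatrix}
1 & 2 & 0 & -2 & -1 \\
4 & 8 & 0 & -8 & -4 \\
6 & 12 & 0 & -12 & -6 \\
4 & 8 & 0 & -8 & -4 \\
1 & 2 & 0 & -2 & -1
\end{bmatrix},
\quad
G_v = \begin{bmatrix}
-1 & -4 & -6 & -4 & -1 \\
-2 & -8 & -12 & -8 & -2 \\
0 & 0 & 0 & 0 & 0 \\
2 & 8 & 12 & 8 & 2 \\
1 & 4 & 6 & 4 & 1
\end{bmatrix}, \tag{7}
$$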

to obtain horizontal and vertical gradients (∆x and ∆y), respectively. Gradient magnitude (g) and angle (θ) are

then computed as follows

$$
g = \sqrt{\Delta x^2 + \Delta y^2}, \qquad
\theta = \tan^{-1}\!\left(\frac{\Delta y}{\Delta x}\right). \tag{8}
$$

The gradient information of the i-th input pixel is defined as $\mathbf{g}_i = (g_i, \theta_i)$, and a vector of gradient information

corresponding to N pixels is defined as $G = (\mathbf{g}_1, \ldots, \mathbf{g}_i, \ldots, \mathbf{g}_N)$. During edge detection, if the gradient magnitude g is

greater than the threshold T , we consider the input pixel to be an edge pixel and compute its gradient angle θ,

otherwise we set θ = 0. The threshold T is determined heuristically.
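The edge detection step of Equations (7) and (8) can be sketched for a single pixel as follows. The function names, the threshold value, and the test image are hypothetical. The sketch also makes two assumptions that are not stated in the text: the kernels are applied by correlation to pixels at least two samples away from the border, and the angle is folded into the range [0°, 180°) so that it can be compared against angular ranges given in degrees.

```python
import math

# 5x5 Sobel operators of Equation (7).
G_H = [[1, 2, 0, -2, -1],
       [4, 8, 0, -8, -4],
       [6, 12, 0, -12, -6],
       [4, 8, 0, -8, -4],
       [1, 2, 0, -2, -1]]
G_V = [[-1, -4, -6, -4, -1],
       [-2, -8, -12, -8, -2],
       [0, 0, 0, 0, 0],
       [2, 8, 12, 8, 2],
       [1, 4, 6, 4, 1]]

def apply_kernel(x, kernel, m0, n0):
    """Correlate a 5x5 kernel with the window centered at (m0, n0)."""
    return sum(kernel[i][j] * x[m0 + i - 2][n0 + j - 2]
               for i in range(5) for j in range(5))

def gradient_info(x, m0, n0, T):
    """Return (g, theta) of Equation (8); theta is 0 for non-edge pixels."""
    dx = apply_kernel(x, G_H, m0, n0)
    dy = apply_kernel(x, G_V, m0, n0)
    g = math.hypot(dx, dy)
    if g <= T:
        return g, 0.0
    # Assumption: fold the angle into [0, 180) degrees.
    theta = math.degrees(math.atan2(dy, dx)) % 180.0
    return g, theta

# Hypothetical frame: dark rows above bright rows, so the gradient is vertical.
frame = [[0] * 5, [0] * 5, [0] * 5, [100] * 5, [100] * 5]
g, theta = gradient_info(frame, 2, 2, T=100)
```

Using `atan2` instead of a plain arctangent of Δy/Δx avoids a division by zero when Δx = 0; after folding into [0°, 180°) the two give the same edge orientation.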

The input is simultaneously upsampled using a sinc filter defined in Equation (4) to yield an upsampled

pixel p. As shown in Figure 1(b), the edge information from the input pixels is used to decide the use of an

adaptive bilateral filter in upsampling. Specifically, we examine the gradient angle θ corresponding to the two

input pixels on either side of a pixel to be interpolated. For example, in Figure 3, ‘c’ is a pixel to be interpolated

horizontally, and ‘a’ and ‘b’ are its neighboring input pixels. Let the edge information of a and b be $\mathbf{g}_a = (g_a, \theta_a)$

and $\mathbf{g}_b = (g_b, \theta_b)$, respectively.

Based on θa and θb, we decide whether the switch in Figure 1(b) is turned on or off (when the switch is on, the

adaptive bilateral filter is included in upsampling; otherwise it is not). If any one of the following three conditions is

true, the switch is turned off, and the output pixel p from the sinc filter is used as the final

upsampled pixel.

1. if ((0° ≤ θa ≤ α1) || (α2 ≤ θa ≤ 180°)) && ((0° ≤ θb ≤ α1) || (α2 ≤ θb ≤ 180°))

2. if (α3 < θa < α4) || (α3 < θb < α4)

3. if (α5 < θa < α6) || (α5 < θb < α6),

where α1 = 85°, α2 = 95°, α3 = 25°, α4 = 75°, α5 = 105°, and α6 = 155°.

Figure 1. Adaptive bilateral filter for image and video upsampling. (a) Video/image is upsampled in horizontal and

vertical dimensions. (b) Upsampling process for Luminance component in each dimension. Based on the edge information

the output of the sinc filter is either used as the upsampled output or used as a parameter to the adaptive bilateral filter.

(c) Upsampling process for Chrominance component in each dimension.
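The three switch-off conditions can be transcribed into a small decision function. This is a literal sketch of the conditions and α values as listed above; the function and variable names are hypothetical, and angles are assumed to be given in degrees in the range 0 to 180.

```python
# alpha_1 .. alpha_6 in degrees, as listed in the text.
ALPHA = (85.0, 95.0, 25.0, 75.0, 105.0, 155.0)

def bilateral_switch_on(theta_a, theta_b, alpha=ALPHA):
    """Return True when the adaptive bilateral filter is used.

    The switch is off when any of the three listed conditions holds,
    in which case the sinc output p is the final upsampled pixel.
    """
    a1, a2, a3, a4, a5, a6 = alpha

    def cond1_side(t):
        # One half of condition 1, evaluated for a single neighbor.
        return (0.0 <= t <= a1) or (a2 <= t <= 180.0)

    cond1 = cond1_side(theta_a) and cond1_side(theta_b)
    cond2 = (a3 < theta_a < a4) or (a3 < theta_b < a4)
    cond3 = (a5 < theta_a < a6) or (a5 < theta_b < a6)
    return not (cond1 or cond2 or cond3)

# Hypothetical neighbor angles for pixels 'a' and 'b'.
use_bilateral = bilateral_switch_on(theta_a=90.0, theta_b=10.0)
```

Keeping the thresholds in a single tuple makes it easy to experiment with other angular ranges without touching the decision logic.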

Our adaptive bilateral filter is similar to Equation (2). Instead of the Gaussian filter we use a sinc-based

filter defined in Equation (4) as the domain filter. The range filter is made adaptive to the edge information G

and is defined as follows

$$
h_r(x[m,n], p, G) = e^{-\frac{(x[m,n]-p)^2}{2\sigma_r^2(G)}}, \tag{9}
$$

where σr is the standard deviation of the range filter, and p is the output of the sinc filter.

Our range filter is similar to the range filter defined in Equation (3). Note that ζ in Equation (3)

was used to adapt the sharpness of the output image. Since we do not consider sharpening during upsampling,

we set ζ = 0.

To reduce computational complexity, our adaptive bilateral filter is implemented as a polyphase filter bank

as illustrated in Figure 4. Each phase filter in the filter bank is defined as

$$
h_{pf}(x[m,n], p, G, \Phi) =
\begin{cases}
f_h^{(\Phi)}\, h_r(x[m,n], p, G), & 0 < \Phi < M \\
f_h^{(0)}, & \text{else,}
\end{cases} \tag{10}
$$

where hr is defined in Equation (9). Based on θa and θb, the standard deviation of the range filter is adapted

as follows

$$
\sigma_r(G) =
\begin{cases}
150, & 75^\circ < \theta < 85^\circ \\
150, & 95^\circ < \theta < 105^\circ \\
150, & 5^\circ < \theta < 25^\circ \\
150, & 155^\circ < \theta < 175^\circ \\
50, & \text{else,}
\end{cases} \tag{11}
$$

where θ corresponds to either θa or θb. The adaptation of σr based on angular ranges is illustrated in Figure

5. For pixels belonging to horizontal or vertical edges, corresponding to θ = 0°, 180°, and 90°, a

smaller σr is used, which results in stronger bilateral filtering. For pixels belonging to an angular edge, a larger σr

is used, which results in milder range filtering. In the following section, we will show that adapting σr to θ reduces

jaggedness in the upsampled frame.

Figure 2. Horizontal upsampling using a polyphase filter bank. p is the output upsampled pixel.

Figure 3. Pixel to be interpolated is labeled as ‘c’, and ‘a’ and ‘b’ are its neighboring input pixels.

Figure 4. Adaptive bilateral filter implemented as a polyphase filter bank.

Figure 5. Adapting the standard deviation of the range filter σr based on the angular ranges of θa or θb.
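Equations (9) to (11) can be combined into a short sketch of one adaptive phase filter. All function names, tap values, and sample values below are hypothetical. Two assumptions go beyond the text: the output is normalized by the sum of the combined weights, by analogy with R in Equation (1), and the caller decides whether θa or θb is passed to `sigma_r`.

```python
import math

def sigma_r(theta):
    """Equation (11): range-filter standard deviation for an angle in degrees."""
    if (5 < theta < 25) or (75 < theta < 85) \
            or (95 < theta < 105) or (155 < theta < 175):
        return 150.0
    return 50.0

def range_weight(x_mn, p, sigma):
    """Equation (9): Gaussian on the distance from the sinc output p."""
    return math.exp(-((x_mn - p) ** 2) / (2 * sigma ** 2))

def adaptive_phase_output(samples, phase_taps, p, sigma, phase):
    """Equation (10): apply the range filter to the taps when phase != 0.

    Assumption: the result is normalized by the sum of the weights.
    """
    if phase == 0:
        weights = list(phase_taps)
    else:
        weights = [f * range_weight(x, p, sigma)
                   for f, x in zip(phase_taps, samples)]
    norm = sum(weights)
    return sum(w * x for w, x in zip(weights, samples)) / norm

# Hypothetical input samples around an edge and hypothetical phase taps.
samples = [10.0, 12.0, 200.0, 205.0]
taps = [-0.1, 0.6, 0.6, -0.1]
p = sum(f * x for f, x in zip(taps, samples)) / sum(taps)  # sinc output
out = adaptive_phase_output(samples, taps, p, sigma_r(80.0), phase=1)
```

For a fixed difference between a sample and p, the range weight is larger with σr = 150 than with σr = 50, so angular edges are filtered more mildly than horizontal or vertical ones.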

4. RESULTS

In this section, we compare our upsampling approach with a fixed bilateral interpolation filter and a Lanczos

interpolation filter [7]. We derive the fixed bilateral filter by setting σr = 150 in Equation (9). The following test

images/videos are used in our experiments: Cactus, Foreman, and Baboon, which are shown in Figure 6. The

Cactus and Foreman test videos were downsampled from 1920 × 1080 and 352 × 288 to 640 × 480 and 132 × 108,

respectively, using an ideal sinc interpolation filter given in Equation (4).

The Cactus, Foreman, and Baboon test videos are upsampled to 1920 × 1080, 352 × 288, and 1024 × 1024,

respectively. We crop the regions of interest in the upsampled frames and illustrate them in Figure 7. Figures

7(a) and (b) illustrate the background building of the upsampled Foreman image. The Lanczos filter is found to

produce significant jaggedness along slant edges, while the fixed bilateral filter produces mild jaggedness, and

our approach produces the least jaggedness.

Figure 6. Test images. (a) Cactus (640 × 480), (b) Foreman (132 × 108), and (c) Baboon (512 × 512).

Figures 7(c) and (d) illustrate the whiskers in the upsampled Baboon image. The Lanczos filter produces jaggedness

along the edges (whiskers), which is subdued in the upsampled frame generated by the fixed and adaptive

bilateral filters. Figure 7(e) illustrates a segment of the upsampled Cactus video. Both the Lanczos and fixed bilateral

filters produce strong jaggedness along the edge, while jaggedness is mostly subdued in our approach.

5. CONCLUSIONS

In this paper, we present an image and video upsampling scheme that uses an adaptive bilateral interpolation

filter. We adapt our bilateral interpolation filter based on the edge angles of neighboring input pixels. Compared

to fixed bilateral and Lanczos interpolation filters, our approach produces cleaner upsampled images with fewer

jaggedness artifacts.

REFERENCES

[1] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proc. 6th Int. Conf. Computer

Vision, 1998, pp. 839–846.

[2] S. Paris, P. Kornprobst, J. Tumblin, and F. Durand, “Bilateral filtering: Theory and applications,”

Foundations and Trends in Computer Graphics and Vision, vol. 4, no. 1, pp. 1–74, 2009.

[3] B. Zhang and J. P. Allebach, “Adaptive bilateral filter for sharpness enhancement and noise removal,” IEEE

Transactions on Image Processing, vol. 17, no. 5, pp. 664–678, 2008.

[4] K-W. Hung and W-C. Siu, “Improved image interpolation using bilateral filter for weighted least square

estimation,” in Proc. IEEE ICIP, 2010, pp. 3297–3300.

[5] J-W. Han, J-H. Kim, S-H. Cheon, J-O. Kim, and S-J. Ko, “A novel image interpolation method using the

bilateral filter,” IEEE Transactions on Consumer Electronics, vol. 56, pp. 175–181, 2010.

[6] S. Yang and K. Hong, “Bilateral interpolation filters for image size conversion,” in Proc. IEEE ICIP, 2005,

pp. 986–989.

[7] C. E. Duchon, “Lanczos filtering in one and two dimensions,” Journal of Applied Meteorology, vol. 18, no.

8, pp. 1016–1022, 1979.

Figure 7. Cropped upsampled images corresponding to Foreman (a) and (b); Baboon (c) and (d); and Cactus (e).

Columns, from left to right: Lanczos filter, fixed bilateral filter, and our approach.
