Adaptive Bilateral Filter for Video and Image Upsampling
Research Paper / Jan 2012

Rahul Vanam and Yan Ye

InterDigital Communications, LLC, 9710 Scranton Road, San Diego, CA 92121 USA


Upsampling is a post-processing method for increasing the spatial resolution of an image or video. Most video
players and image viewers support upsampling functionality. Sometimes upsampling can introduce blurring,
ringing, and jaggedness artifacts in the upsampled video or image thereby lowering its visual quality. In this paper,
we present an adaptive bilateral interpolation filter for upsampling a video or image by an arbitrary upsampling
factor, and show that it mitigates most of the artifacts produced by conventional upsampling methods.

Keywords: Upsampling, ringing artifact, jaggedness, sinc filter, Bilateral filter, Lanczos filter, Sobel operator.


With the advances of wireless networks and the increasing popularity of video-capable mobile devices such
as tablets and smartphones, consumption of digital video content has been increasing rapidly. Video hosting
websites such as YouTube and Dailymotion often store the same video content in a variety of bit rates and
resolutions. When a user requests a certain video over wireless networks, usually the lower resolution and lower
bit rate version is delivered due to bandwidth constraints. Further, if the encoded video has a resolution lower
than the screen resolution on the device, the video is usually upsampled before being displayed to fit the whole
screen. The upsampling operation can introduce artifacts such as ringing, jaggedness, and blurriness,
thereby lowering the visual quality. This motivates an upsampler that introduces minimal artifacts and
thus preserves visual quality.

Tomasi and Manduchi [1] presented a bilateral filter for smoothing an image while preserving its edges. Bilateral
filters have since found several applications, including denoising, contrast management, depth reconstruction,
data fusion, 3D fairing, and upsampling [2]. Adaptive bilateral filters have been used for sharpness enhancement
and denoising of images [3].

Several upsampling methods use bilateral filters. Hung and Siu [4] presented an edge-directed interpolation
using a weighted least-squares estimator, where the weights of the estimator were modeled by a bilateral
filter. Han et al. [5] used a bilateral filter to decompose an image into detail-layer and base-layer images, which
were then denoised, interpolated, and combined to generate an upsampled image. Yang and Hong [6] presented a
bilateral interpolation filter for image upsampling. Although this approach alleviates the ringing artifacts produced
by ideal sinc-based filters, it produces jaggedness around slanted edges. Jaggedness, or staircasing, is
commonly produced by bilateral filters [2]. In this paper, we present an upsampling scheme that uses an adaptive
bilateral interpolation filter. We compare our scheme against fixed bilateral and Lanczos interpolation filters,
and demonstrate that our scheme reduces jaggedness artifacts in the upsampled video.

The remainder of this paper is organized as follows. In Section 2, we describe the adaptive bilateral filter.
We describe our upsampling scheme in Section 3. Details of our experiments and results are provided in Section
4, and we conclude in Section 5.


In this section, we first describe the bilateral filter and then the adaptive bilateral filter. The bilateral filter
operation is given by

$$
y[m,n] = \sum_{k}\sum_{l} R^{-1}(k,l)\, h_d(m,n;k,l)\, h_r(x[m,n], x[k,l])\, x[k,l], \tag{1}
$$

where x[m,n] is the input image, y[m,n] is the filtered image, R(k, l) normalizes the filter coefficients to
unity, and hd(.) and hr(.) are the domain and range filters, respectively. Gaussian filters are often used for both
the domain and range filters [1,3]; they are defined as follows:

$$
\begin{aligned}
h_d(m,n;m_0,n_0) &= e^{-\frac{(m-m_0)^2+(n-n_0)^2}{2\sigma_d^2}}, \\
h_r(x[m,n], x[m_0,n_0]) &= e^{-\frac{(x[m,n]-x[m_0,n_0])^2}{2\sigma_r^2}},
\end{aligned} \tag{2}
$$

where [m0, n0] is the center pixel of the window $W_{m_0,n_0} = \{[m,n] : [m,n] \in [m_0-N, m_0+N] \times [n_0-N, n_0+N]\}$,
and σd and σr are the standard deviations of the domain and range filters, respectively. The domain filter
therefore gives higher weights to pixels closer to the center pixel [m0, n0], while the range filter gives
higher weights to pixels whose gray-scale values are closer to that of the center pixel, x[m0, n0]. Thus, when the bilateral
filter operates on an edge pixel, it behaves as an elongated Gaussian filter oriented along the edge direction, since
it assigns higher weights to neighboring edge pixels and smaller weights to pixels in the gradient direction [3].
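
As a concrete illustration of Equations (1) and (2), the following is a minimal Python sketch of a Gaussian bilateral filter; it is not the authors' implementation. It assumes a grayscale image stored as a list of rows, a window truncated at the image borders, and normalization by the sum of the weights. The function name `bilateral_filter` and the sample parameter values are hypothetical.

```python
import math

def bilateral_filter(x, N, sigma_d, sigma_r):
    """Bilateral filter over a (2N+1) x (2N+1) window; x is a list of rows."""
    rows, cols = len(x), len(x[0])
    y = [[0.0] * cols for _ in range(rows)]
    for m0 in range(rows):
        for n0 in range(cols):
            acc = 0.0   # weighted sum of the window pixels
            norm = 0.0  # sum of weights, used to normalize the coefficients to unity
            for m in range(max(0, m0 - N), min(rows, m0 + N + 1)):
                for n in range(max(0, n0 - N), min(cols, n0 + N + 1)):
                    # Domain weight: decays with spatial distance from the center.
                    h_d = math.exp(-((m - m0) ** 2 + (n - n0) ** 2) / (2 * sigma_d ** 2))
                    # Range weight: decays with gray-level difference from the center.
                    h_r = math.exp(-((x[m][n] - x[m0][n0]) ** 2) / (2 * sigma_r ** 2))
                    acc += h_d * h_r * x[m][n]
                    norm += h_d * h_r
            y[m0][n0] = acc / norm
    return y

# A vertical step edge: columns 0-2 are dark, columns 3-5 are bright.
step = [[10.0] * 3 + [200.0] * 3 for _ in range(5)]
smoothed = bilateral_filter(step, N=2, sigma_d=1.5, sigma_r=20.0)
```

Because the gray-level gap across the step is much larger than the chosen σr, pixels on the far side of the edge receive negligible range weight, so the step survives the smoothing.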

Zhang and Allebach [3] introduced an adaptive bilateral filter by including an offset ζ in the range filter of
Equation (2), and adapting ζ and σr to local image characteristics. The domain and range filters of
an adaptive bilateral filter are defined as follows:

$$
\begin{aligned}
h_d(m,n;m_0,n_0) &= e^{-\frac{(m-m_0)^2+(n-n_0)^2}{2\sigma_d^2}}, \\
h_r(x[m,n], x[m_0,n_0]) &= e^{-\frac{(x[m,n]-x[m_0,n_0]-\zeta)^2}{2\sigma_r^2}}.
\end{aligned} \tag{3}
$$


The parameter ζ controls the sharpness of the image [3]. When ζ is moved closer to the local mean, the filtered image
appears blurrier, while shifting ζ away from the local mean sharpens the filtered image.
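
The effect of the offset on the range filter can be sketched in a few lines of Python. This sketch assumes that ζ enters the Gaussian as x[m,n] − x[m0,n0] − ζ, following the form used in [3]; the helper name `adaptive_range_weight` and the sample values are hypothetical.

```python
import math

def adaptive_range_weight(x_mn, x_center, zeta, sigma_r):
    """Gaussian range weight with an offset zeta (assumed form, see lead-in)."""
    return math.exp(-((x_mn - x_center - zeta) ** 2) / (2 * sigma_r ** 2))

# With zeta = 0 the weight peaks at pixels equal to the center value.
w_plain = adaptive_range_weight(100.0, 100.0, zeta=0.0, sigma_r=20.0)
# A nonzero zeta moves the peak to pixels equal to x_center + zeta.
w_shifted = adaptive_range_weight(120.0, 100.0, zeta=20.0, sigma_r=20.0)
# The center value itself is then down-weighted.
w_center = adaptive_range_weight(100.0, 100.0, zeta=20.0, sigma_r=20.0)
```

Under this assumption, choosing ζ amounts to choosing which gray level the range filter favors within the window.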


Figure 1 illustrates our approach for upsampling an image or video by $M_h/N_h$ and $M_v/N_v$ in the horizontal and vertical
directions, respectively. The input image or frame is first upsampled horizontally and then vertically,
as shown in Figure 1(a). In this section, we describe our approach for upsampling a video in YUV 4:2:0.

3.1 Upsampling chroma

For upsampling chroma, we use a conventional sinc interpolation filter as illustrated in Figure 1(c). The input is
first upsampled by a factor of M through zero insertion, then low-pass filtered, and finally decimated by a factor of N. The following
sinc filter is used for horizontal upsampling:

$$
f_h[n] = w[n]\,\operatorname{sinc}(\cdot), \quad \text{where } w[n] = e^{-\frac{(n-u)^2}{2\sigma^2 u^2}} \text{ and } u = \frac{N-1}{2}, \tag{4}
$$

where N is the length of the filter, and σ is the standard deviation of the Gaussian window. We use a similar
sinc filter for vertical upsampling. Equation (4) can be reduced to a polyphase filter bank that requires fewer
operations during upsampling, as illustrated in Figure 2. If x[k] is the input (since we use separable
filters, the input is a one-dimensional vector), its corresponding phase Φ is computed by

$$
\Phi = \operatorname{mod}(M_h k, N_h), \tag{5}
$$

where mod(.) is the modulus operator. A phase filter $f_h^{\Phi}$ is obtained by decimating the sinc filter $f_h$ as follows:

$$
f_h^{\Phi}[i] = f_h[j], \tag{6}
$$

where j = Φ + iMh and j < N.
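
The decimation step that produces the phase filters can be sketched in Python as follows. The sketch covers only the tap selection j = Φ + iMh with j < N. The prototype filter is an assumption: the sinc cutoff (1/max(M, N_dec)), the window centering, and all names and parameter values here are hypothetical choices made for illustration, not values taken from the paper.

```python
import math

def sinc(t):
    """Normalized sinc, sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def windowed_sinc(length, M, N_dec, sigma):
    """Gaussian-windowed sinc prototype with taps n = 0 .. length-1.

    Assumed details: cutoff 1/max(M, N_dec), window centered at
    u = (length - 1) / 2.
    """
    u = (length - 1) / 2
    cutoff = max(M, N_dec)
    return [
        math.exp(-((n - u) ** 2) / (2 * sigma ** 2 * u ** 2)) * sinc((n - u) / cutoff)
        for n in range(length)
    ]

def phase_filter(f_h, phase, M):
    """Keep the taps f_h[j] with j = phase + i*M and j < len(f_h)."""
    return f_h[phase::M]

# Hypothetical example: a 25-tap prototype split into 3 phase filters.
f_h = windowed_sinc(length=25, M=3, N_dec=1, sigma=0.5)
bank = [phase_filter(f_h, phase, 3) for phase in range(3)]
```

Each output sample then needs only the taps of one phase filter rather than the full prototype, which is where the saving in operations comes from.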

3.2 Upsampling luma

In this section, we describe our approach for upsampling luma in the horizontal direction. The same scheme is
used for vertical upsampling. Our scheme consists of an edge detector, a sinc filter, and an adaptive bilateral
filter as illustrated in Figure 1(b).

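We perform edge detection by applying the horizontal and vertical Sobel operators defined below to the input,

$$
G_h = \begin{bmatrix}
1 & 2 & 0 & -2 & -1 \\
4 & 8 & 0 & -8 & -4 \\
6 & 12 & 0 & -12 & -6 \\
4 & 8 & 0 & -8 & -4 \\
1 & 2 & 0 & -2 & -1
\end{bmatrix}, \quad
G_v = \begin{bmatrix}
-1 & -4 & -6 & -4 & -1 \\
-2 & -8 & -12 & -8 & -2 \\
0 & 0 & 0 & 0 & 0 \\
2 & 8 & 12 & 8 & 2 \\
1 & 4 & 6 & 4 & 1
\end{bmatrix}, \tag{7}
$$

to obtain horizontal and vertical gradients (∆x and ∆y), respectively. Gradient magnitude (g) and angle (θ) are
then computed as follows:

$$
g = \sqrt{\Delta x^2 + \Delta y^2}, \qquad \theta = \tan^{-1}\!\left(\frac{\Delta y}{\Delta x}\right). \tag{8}
$$
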
The gradient information of the i-th input pixel is defined as $\mathbf{g}_i = (g_i, \theta_i)$, and the vector of gradient information
corresponding to N pixels is defined as $G = (\mathbf{g}_1, \ldots, \mathbf{g}_i, \ldots, \mathbf{g}_N)$. During edge detection, if the gradient magnitude g is
greater than a threshold T, we consider the input pixel to be an edge pixel and compute its gradient angle θ;
otherwise, we set θ = 0. The threshold T is determined heuristically.
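
The edge-detection step can be sketched in Python as below, using the two 5 × 5 operators of Equation (7). Several details are assumptions made for illustration: the kernels are applied by correlation (no flipping), the angle is computed with `atan2` and folded into [0°, 180°), only interior pixels are handled, and the threshold value in the example is hypothetical.

```python
import math

G_H = [[1, 2, 0, -2, -1],
       [4, 8, 0, -8, -4],
       [6, 12, 0, -12, -6],
       [4, 8, 0, -8, -4],
       [1, 2, 0, -2, -1]]
G_V = [[-1, -4, -6, -4, -1],
       [-2, -8, -12, -8, -2],
       [0, 0, 0, 0, 0],
       [2, 8, 12, 8, 2],
       [1, 4, 6, 4, 1]]

def gradient_info(img, r, c, threshold):
    """Return (g, theta in degrees) at an interior pixel (r, c).

    img is a list of rows; (r, c) must lie at least 2 pixels from every border.
    """
    dx = dy = 0.0
    for i in range(5):
        for j in range(5):
            pixel = img[r + i - 2][c + j - 2]
            dx += G_H[i][j] * pixel
            dy += G_V[i][j] * pixel
    g = math.hypot(dx, dy)
    if g <= threshold:
        return g, 0.0  # not an edge pixel: the angle is set to 0
    theta = math.degrees(math.atan2(dy, dx)) % 180.0
    return g, theta

# Horizontal step edge (dark rows above bright rows), probed at an interior pixel.
edge = [[100.0 if r >= 4 else 0.0 for _ in range(7)] for r in range(7)]
g, theta = gradient_info(edge, 3, 3, threshold=1000.0)
```

For this step edge the horizontal gradient cancels and the vertical gradient dominates, so the angle comes out at 90°.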

The input is simultaneously upsampled using the sinc filter defined in Equation (4) to yield an upsampled
pixel p. As shown in Figure 1(b), the edge information from the input pixels determines whether the
adaptive bilateral filter is used in upsampling. Specifically, we examine the gradient angles θ of the two
input pixels on either side of the pixel to be interpolated. For example, in Figure 3, 'c' is a pixel to be interpolated
horizontally, and 'a' and 'b' are its neighboring input pixels. Let the edge information of a and b be $\mathbf{g}_a = (g_a, \theta_a)$
and $\mathbf{g}_b = (g_b, \theta_b)$, respectively.

Figure 1. Adaptive bilateral filter for image and video upsampling. (a) Video/image is upsampled in horizontal and
vertical dimensions. (b) Upsampling process for Luminance component in each dimension. Based on the edge information
the output of the sinc filter is either used as the upsampled output or used as a parameter to the adaptive bilateral filter.
(c) Upsampling process for Chrominance component in each dimension.

Based on θa and θb, we decide whether the switch in Figure 1(b) is turned on or off (when the switch is on, the
adaptive bilateral filter is included in upsampling; otherwise it is not). If any one of the following three conditions is
true, the switch is turned off, and the output pixel p of the sinc filter is used as the final
upsampled pixel.

1. if ((0° ≤ θa ≤ α1) || (α2 ≤ θa ≤ 180°)) && ((0° ≤ θb ≤ α1) || (α2 ≤ θb ≤ 180°))
2. if (α3 < θa < α4) || (α3 < θb < α4)
3. if (α5 < θa < α6) || (α5 < θb < α6),

where α1 = 85°, α2 = 95°, α3 = 25°, α4 = 75°, α5 = 105°, and α6 = 155°.
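
The three switch conditions translate directly into a small predicate. The Python sketch below encodes them exactly as listed, with angles in degrees; the function and helper names are hypothetical.

```python
ALPHA = (85.0, 95.0, 25.0, 75.0, 105.0, 155.0)  # alpha_1 .. alpha_6 in degrees

def switch_is_off(theta_a, theta_b, alpha=ALPHA):
    """Return True when the sinc output p is used directly (switch off)."""
    a1, a2, a3, a4, a5, a6 = alpha

    def in_outer_ranges(theta):
        # Angle lies in [0, alpha_1] or [alpha_2, 180].
        return 0.0 <= theta <= a1 or a2 <= theta <= 180.0

    cond1 = in_outer_ranges(theta_a) and in_outer_ranges(theta_b)
    cond2 = a3 < theta_a < a4 or a3 < theta_b < a4
    cond3 = a5 < theta_a < a6 or a5 < theta_b < a6
    return cond1 or cond2 or cond3

# Both neighbors have angle 0, so condition 1 holds and the switch is off.
off_flat = switch_is_off(0.0, 0.0)
# Both neighbors are at 90 degrees: no condition holds, so the switch is on.
off_vertical = switch_is_off(90.0, 90.0)
```

Keeping the thresholds in one tuple makes it easy to experiment with other angular ranges without touching the logic.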

Our adaptive bilateral filter is similar to Equation (2). Instead of the Gaussian filter, we use the sinc-based
filter defined in Equation (4) as the domain filter. The range filter is made adaptive to the edge information G
and is defined as follows:

$$
h_r(x[m,n], p, G) = e^{-\frac{(x[m,n]-p)^2}{2\sigma_r^2(G)}}, \tag{9}
$$

where σr is the standard deviation of the range filter, and p is the output of the sinc filter.

Our range filter is similar to the range filter defined in Equation (3). Note that ζ in Equation (3)
is used to adapt the sharpness of the output image. Since we do not consider sharpening during upsampling,
we set ζ = 0.

To reduce computational complexity, our adaptive bilateral filter is implemented as a polyphase filter bank,
as illustrated in Figure 4. Each phase filter in the filter bank is defined as

$$
h_{pf}(x[m,n], p, G, \Phi) =
\begin{cases}
f_h^{\Phi}\, h_r(x[m,n], p, G), & 0 < \Phi < M \\
f_h^{\Phi}, & \text{else},
\end{cases} \tag{10}
$$
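
To make the first case of the phase filter concrete, the Python sketch below weights each domain tap by the range weight of Equation (9), computed relative to the sinc output p. Normalizing by the sum of the combined weights is an assumption carried over from Equation (1), and the sketch does not guard against a zero sum. The tap values, pixel values, and function names are hypothetical.

```python
import math

def range_weight(x, p, sigma_r):
    """Gaussian range weight relative to the sinc output p."""
    return math.exp(-((x - p) ** 2) / (2 * sigma_r ** 2))

def adaptive_phase_output(pixels, taps, p, sigma_r):
    """Combine domain taps with range weights and normalize (assumed)."""
    weights = [t * range_weight(x, p, sigma_r) for x, t in zip(pixels, taps)]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, pixels)) / total

# Four input pixels straddling an edge, with hypothetical phase-filter taps.
pixels = [10.0, 10.0, 200.0, 200.0]
taps = [-0.05, 0.8, 0.3, -0.05]
p = sum(t * x for t, x in zip(taps, pixels))           # plain sinc output
out = adaptive_phase_output(pixels, taps, p, 50.0)     # strong range filtering
out_mild = adaptive_phase_output(pixels, taps, p, 1e9) # range filter nearly flat
```

With a small σr the output is pulled toward the side of the edge that p is closer to, while a very large σr makes the range weights nearly uniform and reproduces the sinc output.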


where hr is defined in Equation (9). Based on θa and θb, the standard deviation of the range filter is adapted
as follows:

$$
\sigma_r(G) =
\begin{cases}
150, & 75^\circ < \theta < 85^\circ \\
150, & 95^\circ < \theta < 105^\circ \\
150, & 5^\circ < \theta < 25^\circ \\
150, & 155^\circ < \theta < 175^\circ \\
50, & \text{else},
\end{cases} \tag{11}
$$

where θ corresponds to either θa or θb. The adaptation of σr based on angular ranges is illustrated in Figure
5. For pixels belonging to horizontal or vertical edges, corresponding to θ = 0°, 180°, and 90°, respectively, a
smaller σr is used, which results in stronger bilateral filtering. For pixels belonging to an angular edge, a larger σr
is used, which results in milder range filtering. In the following section, we show that adapting σr to θ reduces
jaggedness in the upsampled frame.

Figure 2. Horizontal upsampling using a polyphase filter bank. p is the output upsampled pixel.

Figure 3. Pixel to be interpolated is labeled as 'c', and 'a' and 'b' are its neighboring input pixels.

Figure 4. Adaptive bilateral filter implemented as a polyphase filter bank.

Figure 5. Adapting the standard deviation of the range filter σr based on the angular ranges of θa or θb.
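
The mapping from edge angle to σr in Equation (11) is a simple lookup. The Python sketch below encodes the four angular ranges with strict inequalities, as in the equation; the function name `sigma_r` is hypothetical.

```python
def sigma_r(theta):
    """Range-filter standard deviation for a gradient angle in degrees."""
    slanted_ranges = ((75.0, 85.0), (95.0, 105.0), (5.0, 25.0), (155.0, 175.0))
    if any(low < theta < high for low, high in slanted_ranges):
        return 150.0  # larger sigma_r: milder range filtering
    return 50.0       # smaller sigma_r: stronger range filtering

# Angles at 0 and 90 degrees get the small value; 80 degrees gets the large one.
values = [sigma_r(t) for t in (0.0, 90.0, 80.0)]
```

The call would be made with either θa or θb, as described above.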


In this section, we compare our upsampling approach with a fixed bilateral interpolation filter and a Lanczos
interpolation filter [7]. We derive the fixed bilateral filter by setting σr = 150 in Equation (9). The following test
images and videos are used in our experiments: Cactus, Foreman, and Baboon, shown in Figure 6. The
Cactus and Foreman test videos were downsampled from 1920 × 1080 and 352 × 288 to 640 × 480 and 132 × 108,
respectively, using an ideal sinc interpolation filter given in Equation (4).

The Cactus, Foreman, and Baboon test images and videos are upsampled to 1920 × 1080, 352 × 288, and 1024 × 1024,
respectively. We crop the regions of interest in the upsampled frames and show them in Figure 7. Figures
7(a) and (b) show the background building of the upsampled Foreman image. The Lanczos filter
produces significant jaggedness along slanted edges, the fixed bilateral filter produces mild jaggedness, and
our approach produces the least jaggedness.

Figure 6. Test images. (a) Cactus (640× 480), (b) Foreman (132× 108), and (c) Baboon (512× 512)

Figures 7(c) and (d) show the whiskers in the upsampled Baboon image. The Lanczos filter produces jaggedness
along the edges (whiskers), which is subdued in the upsampled frames generated by the fixed and adaptive
bilateral filters. Figure 7(e) shows a segment of the upsampled Cactus video. Both the Lanczos and fixed bilateral
filters produce strong jaggedness along the edge, while jaggedness is mostly subdued in our approach.


In this paper, we present an image and video upsampling scheme that uses an adaptive bilateral interpolation
filter. We adapt our bilateral interpolation filter based on the edge angles of neighboring input pixels. Compared
to fixed bilateral and Lanczos interpolation filters, our approach produces cleaner upsampled images with fewer
jaggedness artifacts.


[1] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proc. 6th Int. Conf. Computer
Vision, 1998, pp. 839–846.

[2] S. Paris, P. Kornprobst, J. Tumblin, and F. Durand, “Bilateral filtering: Theory and applications,”
Foundations and Trends in Computer Graphics and Vision, vol. 4, no. 1, pp. 1–74, 2009.

[3] B. Zhang and J. P. Allebach, “Adaptive bilateral filter for sharpness enhancement and noise removal,” IEEE
Transactions on Image Processing, vol. 17, no. 5, pp. 664–678, 2008.

[4] K-W. Hung and W-C. Siu, “Improved image interpolation using bilateral filter for weighted least square
estimation,” in Proc. IEEE ICIP, 2010, pp. 3297–3300.

[5] J-W. Han, J-H. Kim, S-H. Cheon, J-O. Kim, and S-J. Ko, “A novel image interpolation method using the
bilateral filter,” IEEE Transactions on Consumer Electronics, vol. 56, pp. 175–181, 2010.

[6] S. Yang and K. Hong, “Bilateral interpolation filters for image size conversion,” in Proc. IEEE ICIP, 2005,
pp. 986–989.

[7] C. E. Duchon, “Lanczos filtering in one and two dimensions,” Journal of Applied Meteorology, vol. 18, no.
8, pp. 1016–1022, 1979.

Figure 7. Cropped upsampled images corresponding to Foreman (a) and (b); Baboon (c) and (d); and Cactus (e).
Columns, left to right: Lanczos filter, fixed bilateral filter, and our approach.