
Enabling QoE-based Scheduling for Video Teleconferencing via PSNR Time Series Prediction

 

Liangping Ma (a), Yuriy Reznik (a), Rahul Vanam (a), and Gregory Sternberg (b)

 

(a) InterDigital Communications, Inc., San Diego, CA 92121, USA;
(b) InterDigital Communications, Inc., King of Prussia, PA 19406, USA

 

ABSTRACT

 

The ultimate goal of network resource allocation for video teleconferencing is to optimize the Quality of Experience (QoE) of the video. The IPPP video coding structure with macroblock intra refresh is widely used for video teleconferencing. With such a video coding structure, the loss of a frame generally causes error propagation to subsequent frames. A resource allocation decision of a communication network determines the QoE, given that other conditions such as viewing conditions are fixed. Therefore, to optimize the QoE, a communication network needs to be able to accurately predict the QoE for each of its resource allocation decisions and then select the decision corresponding to the best QoE. In our previous work, we reduced the QoE prediction problem to one of predicting the per-frame PSNR time series. The accuracy of the proposed per-frame PSNR prediction method was demonstrated, however, only for low-resolution video sequences. In this paper, we show via simulations that the per-frame PSNR prediction method achieves good performance for higher-resolution video sequences as well.

 

Keywords: QoE, video, prediction, scheduling, network.

 

1. INTRODUCTION

 

In our previous work [1], we proposed a QoE prediction scheme for video teleconferencing and evaluated the scheme on Common Intermediate Format (CIF) video sequences, which have a resolution of 352 × 288 pixels. This resolution is low, especially considering that state-of-the-art mobile devices, such as smartphones, support much higher resolutions. This motivates us to evaluate the proposed scheme [1] with higher-resolution videos.

 

For completeness, we give the background for our previous work [1]. Video teleconferencing is generally real time and widely uses the IPPP video coding structure, where the first frame is an intra-coded frame and each P frame uses the frame immediately preceding it as the reference for motion compensated prediction. To meet the stringent delay requirement, the encoded video is typically delivered by the RTP/UDP protocol, which is lossy in nature. When a packet loss occurs, the associated video frame as well as the subsequent frames will be affected, which is called error propagation. Packet loss information can be fed back to the video sender or multipoint control unit (MCU) (which may perform transcoding) via protocols such as the RTP Control Protocol (RTCP) to trigger video-side adaptation, such as the insertion of an intra-coded frame to stop error propagation. However, the feedback delay is at least about one round trip time (RTT). Additionally, to alleviate error propagation, macroblock intra refresh (encoding some macroblocks of each video frame in the intra mode) is often used.

 

We note that a video frame is generally mapped to one or multiple packets (or slices in the case of H.264/AVC), and thus a packet loss does not necessarily lead to the loss of a whole frame. We focus on whole frame losses and leave the more general packet losses for future work.

 

Although there is no difference between the P frames in the IPPP video coding structure, the impact of dropping a P frame can differ dramatically from frame to frame. For illustration, we drop one frame at a time, apply a simple error concealment technique (frame copy), and evaluate the MSE of the 10 immediately affected decoded frames, consisting of the dropped frame and the 9 frames thereafter. The reason for looking at only 10 frames instead of the whole video is that RTCP is typically used in video teleconferencing, so the video receiver can inform the video sender of a packet loss. For

 


 

 

 

a frame rate of 30 frames per second, 10 frames correspond to an RTCP feedback delay of 1/3 second. One potential cause of the variation in the impact of dropping a P frame is the difference in the sizes of the P frames. For a communication network, the impact should be evaluated per unit of network resource consumed; therefore, we normalize the impact by the frame size. As an example, Fig. 1 shows the average MSE per 1000 bytes of data as a function of the dropped frame for the RaceHorses video with resolution 832 × 480 pixels, encoded in H.264/AVC with a quantization parameter (QP) of 30. The normalized MSE corresponding to dropping frame 30 is 1429, while that corresponding to dropping frame 39 is 449, a difference of more than a factor of 3. This presents an opportunity for a communication network to intelligently drop certain video packets in the event of network congestion to optimize the video quality.
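To make the measurement above concrete, the following Python sketch (not the authors' code) computes the normalized impact of dropping a single frame. It assumes the loss-free decoded frames are available as numpy arrays, the coded frame sizes are known, and a hypothetical helper decode_with_drop(k) re-decodes the sequence with frame k lost and frame-copy concealment applied.

import numpy as np

def frame_mse(a, b):
    # Mean squared error between two decoded frames (uint8 arrays).
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def drop_impact(frames_ref, decode_with_drop, frame_size_bytes, k, window=10):
    # Average MSE over the `window` frames affected by dropping frame k,
    # normalized per 1000 bytes of the dropped frame (as in Fig. 1).
    frames_err = decode_with_drop(k)  # hypothetical: decode with frame k lost + frame copy
    last = min(k + window, len(frames_ref))
    avg_mse = np.mean([frame_mse(frames_ref[l], frames_err[l]) for l in range(k, last)])
    return avg_mse / (frame_size_bytes[k] / 1000.0)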

 

[Figure 1. The impact of dropping a P frame: the average MSE per 1000-byte data for the immediately affected 10 frames as a function of the dropped frame, for the RaceHorses video.]

 

The QoE prediction scheme that we proposed [1] has low delay, computational complexity, and communication overhead, enabling a network to allocate network resources so as to optimize the QoE. Specifically, with such a scheme, the network knows the resulting QoE for each resource allocation decision (e.g., dropping certain frames in the network), so that it can perform optimal resource allocation by selecting the decision corresponding to the best QoE. Because of the predictive nature of the problem and the fact that the prediction has to be done within a communication network, we need a QoE model that is feasible to compute, which motivates us to consider those (e.g., [2, 3]) that use the per-frame PSNR time series as the input. A per-frame PSNR time series is a sequence of PSNR values, each indexed by its corresponding frame number. The QoE prediction problem then reduces to one of predicting the per-frame PSNR. The proposed QoE prediction scheme is jointly implemented by the video sender (or MCU) and the communication network. Simulation results show that the proposed per-frame PSNR prediction method can achieve an average error much less than 1 dB.

 

To predict the per-frame PSNR, we can look at the channel distortion and the source distortion separately. The challenge for PSNR prediction is the channel distortion. We briefly review related work on channel distortion. Some aspects of the channel distortion model in the seminal work [4] are adopted in our work. An additive exponential model proposed in [5] is shown to have good performance. However, the determination of the model requires some information (the motion reference ratio) about the predicted video frames to be known a priori. This is possible only if the encoder generates all the video frames up to the predicted frame, introducing significant delay. For example, to predict the channel distortion 10 frames ahead, assuming a frame rate of 30 frames per second, the delay would be 333 ms. A model taking into account the cross-correlation among multiple frame losses is proposed for channel distortion in [6]. However, in the parameter estimation, the whole video sequence needs to be known in advance, making it infeasible for real-time applications. Pixel-level channel distortion prediction models are proposed for optimizing the video encoder [7], which, although accurate, are overkill for the problem we consider. Thus, in this paper, we consider the simpler frame-level distortion prediction.

 


 

 

 

The remainder of the paper is organized as follows. Section 2 describes the QoE prediction scheme, Section 3 gives the simulation results on the per-frame PSNR prediction for higher-resolution videos, and Section 4 concludes the paper.

 

2. QOE PREDICTION

 

For completeness, we introduce the QoE prediction scheme we proposed previously [1].

 

2.1 Choosing QoE Models

 

Subjective video quality testing is the ultimate method to measure the video quality perceived by the human visual system (HVS). However, subjective testing requires playing the video to a group of human subjects under stringent testing conditions [8] and collecting their ratings of the video quality, which is time consuming, expensive, and unable to provide real-time assessment results even a posteriori, not to mention predicting the video quality.

 

Alternatively, QoE models can be constructed by relating QoS metrics to video QoE [2, 3, 9, 10]. The ITU-T Recommendation G.1070 [10] considers the packet loss rate rather than the packet loss pattern in modeling the QoE, which is insufficient, as indicated by our example in Section 1, where the pattern is a single frame loss. The model in [9] has the same problem. Recommendation G.1070 also requires extensive offline subjective testing to construct a large number of QoE models, as well as the extraction of certain video features (e.g., degree of motion) [11] during prediction to achieve the desired accuracy, making it unsuitable for real-time applications.

 

The QoE model proposed in [2] uses statistics extracted from the per-frame peak signal-to-noise ratio (PSNR) time series, which are QoS metrics, as the model input. Some of the statistics are the minimum, maximum, standard deviation, the 90th and 10th percentiles, and the difference in PSNR between two consecutive frames. Although the average PSNR of a video sequence is generally considered a flawed video quality metric, the model in [2] is shown to outperform the Video Quality Metric (VQM) [12] and Structural Similarity (SSIM) [13] in terms of correlation with subjective testing results. With the choice of such QoE models, the QoE prediction problem reduces to one of predicting the per-frame PSNR time series.
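As a rough illustration, statistics of the kind listed above could be extracted from a per-frame PSNR time series as in the following Python sketch; the exact feature set and pooling used by the model in [2] are not reproduced here.

import numpy as np

def psnr_series_features(psnr):
    # psnr: 1-D array of per-frame PSNR values in dB.
    psnr = np.asarray(psnr, dtype=np.float64)
    diffs = np.abs(np.diff(psnr))  # PSNR change between consecutive frames
    return {
        "min": psnr.min(),
        "max": psnr.max(),
        "std": psnr.std(),
        "p10": np.percentile(psnr, 10),
        "p90": np.percentile(psnr, 90),
        "max_consecutive_diff": diffs.max() if diffs.size else 0.0,
    }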

 

2.2 The Proposed QoE Prediction Approach

 

Before discussing various approaches to QoE prediction, we note that the pattern of packet losses is important, because the video quality, or specifically the statistics of the per-frame PSNR time series, depends not only on how many frame losses have occurred, but also on where they have occurred in the video sequence.

 

There are three approaches to QoE prediction. In a sender-only approach, the per-frame PSNR time series for each frame loss pattern is obtained by simulation at the video sender. However, the number of frame loss patterns grows exponentially with the number of video frames. Even if the amount of computation is not an issue, the resulting per-frame PSNR time series need to be sent to the communication network, generating excessive communication overhead.

 

In a network-only approach, the network decodes the video and determines the channel distortion for different packet loss patterns. However, the video quality depends not only on the channel distortion, but also on the distortion from source coding. Lacking access to the original video, the network cannot know the source distortion, making the QoE prediction inaccurate. Also, this approach does not scale when the network serves a very large number of video teleconferencing sessions simultaneously. Finally, this approach may not be suitable when the video packets are encrypted.

 

We propose a joint approach that involves both the video sender (or MCU) and the network. The video sender obtains the channel distortion for single frame losses and passes the results, along with the source distortion, to the network. The network knows the QoE model and, for each resource allocation decision, calculates the total distortion for each frame (and hence the per-frame PSNR time series) by utilizing the linearity assumption for multiple frame losses [4, 5]. This approach eliminates virtually all the communication overhead of the sender-only approach, takes into account the source distortion absent in the network-only approach, and does not require video encoding/decoding in the network.

 


 

 

 

[Figure 2. System architecture of the video sender. Blocks: Video Encoder (delay t1), Video Decoder (delay t2), Channel Distortion Model (delay t3), delay elements (m delay units, delays t1+t2, t3, t3−t2), and an Annotation block feeding the Network; labeled signals include F(n), G(n), ds(n), d0(n), α(n−m), and γ(n−m).]

 

2.3 Per-frame PSNR Prediction

 

As mentioned before, frame-level channel distortion prediction is appropriate for the problem we focus on. We first look at the video sender side, whose architecture is shown in Fig. 2. Let the number of pixels in a frame be N. Let F(n), a vector of length N, be the nth original frame, and let F(n, i) denote pixel i of F(n). Let F̂(n) be the reconstructed frame corresponding to F(n) without frame loss, and let F̂(n, i) be pixel i of F̂(n). The original video frame F(n) is fed into the video encoder, which generates packet G(n) after a delay of t1 seconds. Packet G(n) may represent multiple NAL units, which we refer to collectively as a packet for convenience. Packet G(n) is then fed into the video decoder to generate the reconstructed frame F̂(n), which takes t2 seconds. Note that in a typical video encoder, this reconstruction is already in place. Let the distortion due to source coding for F(n) be ds(n). Then,

 

d_s(n) = \frac{1}{N} \sum_{i=1}^{N} \bigl( F(n,i) - \hat{F}(n,i) \bigr)^2, \qquad (1)

 

which is readily available at the video encoder.
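A minimal Python sketch of Eq. (1), assuming the original frame F(n) and the encoder's reconstruction F̂(n) are available as numpy arrays:

import numpy as np

def source_distortion(F_n, F_hat_n):
    # ds(n): mean squared error between the original frame and the encoder's reconstruction.
    diff = F_n.astype(np.float64) - F_hat_n.astype(np.float64)
    return np.mean(diff ** 2)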

 

As mentioned earlier, the construction of the channel distortion model in [5] requires some information (the motion reference ratio) about the predicted video frames to be known in advance, which results in significant delay. To address this problem, we propose using the current packet G(n) and the previously generated packets G(n−1), ..., G(n−m) to train a channel distortion model. In Fig. 2, D represents a delay of one inter-frame time interval. The training takes t3 seconds. Note that t3 ≥ t2, because the Channel Distortion Model needs to decode at least one frame. The values of the parameters of the model are then sent to the Annotation block for annotation. Also annotated is the source distortion ds(n). The annotated packet is then sent to the communication network.

 

We now look at the details of the channel distortion model. Prior results show that a linear model performs well in practice [4, 5]. For each frame loss, we define a function h(k, l) [5], which models how much distortion the loss of frame k causes to frame l, for l ≥ k:

 

h(k, l) = d_0(k)\, \frac{e^{-\alpha(k)(l-k)}}{1 + \gamma(k)(l-k)}, \qquad (2)

 

where d0(k) is the channel distortion for frame k resulting from the loss of frame k only (together with the error concealment), and α(k) and γ(k) are parameters that depend on frame k. In this paper, we consider a simple error concealment

 


 

 

 

scheme, namely, frame copy. Hence the distortion due to the loss of frame k (and only frame k) is

 

d_0(k) = \frac{1}{N} \sum_{i=1}^{N} \bigl( \hat{F}(k,i) - \hat{F}(k-1,i) \bigr)^2. \qquad (3)

 

In (2), γ(k) is called the leakage, which describes the efficiency of loop filtering in removing the artifacts introduced by motion compensation and transformation [4]. The term e^{−α(k)(l−k)} captures the error propagation in the case of pseudo-random macroblock intra refresh. In [4], a linear function (1 − (l − k)β), where β is the intra refresh rate, is proposed. We do not use this linear function, because the macroblock intra refresh scheme in [4] is cyclic, while the one used in our simulation software (JVT JM 16.2 [14]) is pseudo-random. The linear model states that the impact vanishes after 1/β frames (the intra refresh update interval for the cyclic scheme), which is not the case for the pseudo-random scheme, as suggested by our simulation results. An exponential model such as the one in [5] is better. However, the model in [5] fails to capture the impact of loop filtering, while our model does capture it. The values of α(k) and γ(k) can be obtained by methods such as least squares or least absolute error by fitting simulation data. In Fig. 2, the video sender drops packet G(n−m) from the packet sequence G(n), G(n−1), ..., G(n−m), performs video decoding, measures the channel distortions, and finds the values of α(n−m) (denoted α̂(n−m)) and γ(n−m) (denoted γ̂(n−m)) in (2), with the substitution k = n−m, that minimize the error between the measured distortions and the predicted distortions.
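The fitting step could, for instance, be done with an off-the-shelf nonlinear least-squares routine, as in the Python sketch below. The measured distortions d_meas and d0 are assumed to come from the sender's decoding experiment described above; the paper specifies only least squares or least absolute error, not a particular solver.

import numpy as np
from scipy.optimize import curve_fit

def fit_channel_model(d_meas, d0):
    # Fit alpha(k) and gamma(k) of Eq. (2) to the measured channel distortions.
    # d_meas[j]: measured distortion of frame k + j when frame k alone is dropped.
    x = np.arange(len(d_meas), dtype=np.float64)  # l - k = 0, 1, 2, ...

    def h(x, alpha, gamma):
        return d0 * np.exp(-alpha * x) / (1.0 + gamma * x)

    popt, _ = curve_fit(h, x, np.asarray(d_meas, dtype=np.float64),
                        p0=(0.05, 0.05), bounds=(0.0, np.inf))
    alpha_hat, gamma_hat = popt
    return alpha_hat, gamma_hat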

 

We next look at the network side. We assume that the network has packets G(n), G(n−1), ..., G(n−L) available. Let I(k) be the indicator function, which is 1 if frame k is dropped and 0 otherwise. A packet loss pattern can be characterized by a sequence of I(k)'s. For convenience we denote a pattern by a vector P := (I(n), I(n−1), ..., I(0)). The channel distortion of frame l ≥ n−L resulting from P is then predicted as

 

\hat{d}_c(l, P) = \sum_{k=0}^{l} I(k)\, \hat{h}(k, l), \qquad (4)

 

where the linearity assumption for multiple frame losses in [4, 5] is used, and

 

\hat{h}(k, l) = d_0(k)\, \frac{e^{-\hat{\alpha}(k-m)(l-k)}}{1 + \hat{\gamma}(k-m)(l-k)}. \qquad (5)

 

We note that the model in (4) can be improved, for example, by considering the cross-correlation of frame losses [6]. However, as mentioned earlier, the model in [6] is not suitable for real-time applications, and its complexity is very high. The simple model in (4) proves to be reasonably accurate [4, 5].

 

In order to predict the per-frame PSNR for a particular packet loss pattern P, the network needs to know the source distortion as well. The total distortion prediction can be represented as

 

\hat{d}(l, P) = \hat{d}_c(l, P) + \hat{d}_s(l), \qquad (6)

 

where d̂s(l) = ds(l) for n ≥ l ≥ n−L, and d̂s(l) = ds(n) for l > n, and where we have applied the assumption that the channel distortion and the source distortion are independent, which has been shown to be quite accurate [15]. Note that the source distortion estimate d̂s(l) for n ≥ l ≥ n−L is exact and readily available at the video sender, and it is included in the annotation of the L+1 packets G(n), ..., G(n−L).

 

The PSNR prediction for frame l ≥ n−L with packet loss pattern P is then

\widehat{\mathrm{PSNR}}(l, P) = 10 \log_{10}\!\bigl( 255^2 / \hat{d}(l, P) \bigr). \qquad (7)

The per-frame PSNR time series is then \{\widehat{\mathrm{PSNR}}(l, P)\}, where l is the time index. The time series is a

 

function of P. Thus, to generate the best time series, the network chooses the optimal P among those that are feasible under the resource constraint. Note that part of P, i.e., I(n−L−1), I(n−L−2), ..., I(0), is already determined, because each frame between 0 and n−L−1 has been either delivered or dropped. The variables subject to optimization are the remaining part of P, i.e., I(n−L), ..., I(n). We define the prediction length λ as the number of frames to be predicted. That is, if the nth frame is to be dropped, then the predictor predicts frames n through n+λ. Note that it is not necessary to predict many frames, since it takes the video encoder not more than one RTT to receive feedback about a frame loss.
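Putting Eqs. (4)-(7) together, the network-side computation for one candidate loss pattern might look like the following Python sketch, with the annotated parameters collected into dictionaries keyed by frame number (a simplification of the indexing in the paper, assuming the α̂/γ̂ values are already aligned to the frame they describe):

import numpy as np

def h_hat(k, l, d0, alpha_hat, gamma_hat):
    # Eq. (5): predicted distortion that the loss of frame k causes to frame l >= k.
    x = l - k
    return d0[k] * np.exp(-alpha_hat[k] * x) / (1.0 + gamma_hat[k] * x)

def predict_psnr(loss_pattern, frames, d0, alpha_hat, gamma_hat, ds):
    # loss_pattern: set of dropped frame numbers; frames: frame numbers to predict.
    # d0, alpha_hat, gamma_hat, ds: per-frame parameters annotated by the video sender.
    psnr = {}
    for l in frames:
        d_c = sum(h_hat(k, l, d0, alpha_hat, gamma_hat)
                  for k in loss_pattern if k <= l)       # Eq. (4): additive frame losses
        d_total = d_c + ds[l]                            # Eq. (6): channel + source distortion
        psnr[l] = 10.0 * np.log10(255.0 ** 2 / d_total)  # Eq. (7)
    return psnr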

 


 

 

 

3. SIMULATION RESULTS

 

We evaluate the performance of the proposed per-frame PSNR prediction method via simulation. We consider both single frame losses and multiple frame losses. The 832 × 480 RaceHorses video sequence is used. For m = 10, L = 5, and λ = 8, Fig. 3 shows the prediction for frames l ≥ 49 when frame 49 is dropped, and Fig. 4 shows the prediction for frames l ≥ 45 when frames 45 and 48 are dropped.

 

[Figure 3. The per-frame PSNR prediction for a single frame loss at frame 49: actual vs. predicted PSNR (dB) as a function of frame number.]

 

[Figure 4. The per-frame PSNR prediction for two frame losses at frames 45 and 48: actual vs. predicted PSNR (dB) as a function of frame number.]

 

For single frame losses, we drop a frame, evaluate the prediction errors, then drop the next frame, and so on. The prediction error is measured by the absolute per-frame PSNR prediction error, defined as the absolute value of the difference between the actual per-frame PSNR and the predicted value, both in dB. We plot the cumulative distribution function (CDF) of the absolute per-frame PSNR prediction error in Fig. 5. We consider prediction lengths of 8 (blue dashed lines) and 5 (red solid lines). The mean values of the absolute prediction error are 0.74 dB and 0.52 dB for prediction lengths 8 and 5, respectively.
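For reference, the error metric and its empirical CDF can be computed as in the following Python sketch, given matched arrays of actual and predicted per-frame PSNR values collected over all simulated frame drops (an illustration, not the authors' evaluation code):

import numpy as np

def abs_prediction_errors(actual_psnr, predicted_psnr):
    # Absolute per-frame PSNR prediction error in dB.
    return np.abs(np.asarray(actual_psnr, dtype=np.float64)
                  - np.asarray(predicted_psnr, dtype=np.float64))

def empirical_cdf(errors):
    # Sorted error values and cumulative probabilities, suitable for plotting the CDF.
    e = np.sort(np.asarray(errors, dtype=np.float64))
    p = np.arange(1, e.size + 1) / e.size
    return e, p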

 

For multiple losses, we consider a particular frame loss pattern: two frame losses with a gap of 2 frames in

 


 

 

 

between. The CDF of the absolute per-frame PSNR prediction error is shown in Fig. 6. The mean values of the absolute prediction error are 0.88 dB and 0.63 dB for prediction lengths 8 and 5, respectively.

 

[Figure 5. The CDF of the per-frame PSNR prediction error for single frame losses, for prediction lengths 8 and 5.]

 

[Figure 6. The CDF of the per-frame PSNR prediction error for two frame losses with a gap of 2 frames in between, for prediction lengths 8 and 5.]

 

4. CONCLUSION

 

In our previous work, we proposed a QoE prediction scheme that allows a communication network to optimize resource allocation for video teleconferencing. By using QoE models that take the per-frame PSNR time series as the input, the QoE prediction problem reduces to a per-frame PSNR prediction problem. Simulation results showed that the proposed per-frame PSNR prediction method achieves an average prediction error well below 1 dB for relatively low-resolution CIF videos. In this paper, we show that similar performance is achieved for higher-resolution (about 4× the pixel count) videos.

 


 

 

 

Acknowledgements

 

The authors would like to thank Dr. Robert A. DiFazio and Mr. Christopher Wallace of InterDigital Innovation Labs for helpful comments.

 

REFERENCES

 

1. L. Ma, T. Xu, G. Sternberg, A. Balasubramanian, and A. Zeira, “Model-based QoE prediction to enable

better user experience for video teleconferencing,” in IEEE International Conference on Acoustics, Speech,

and Signal Processing (ICASSP), May 2013.

 

2. C. Keimel, T. Oelbaum, and K. Diepold, “Improving the prediction accuracy of video quality metrics,”

in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, Texas,

March 2010, pp. 2442–2445.

 

3. C. Keimel, M. Rothbucher, H. Shen, and K. Diepold, "Video is a cube: Multidimensional analysis and video quality metrics," IEEE Signal Processing Magazine, pp. 41–49, Nov. 2011.

 

4. K. Stuhlmuller, N. Farber, M. Link, and B. Girod, “Analysis of video transmission over lossy channels,”

IEEE Journal on Selected Areas in Communications, vol. 18, no. 6, pp. 1012–1032, June 2000.

 

5. U. Dani, Z. He, and H. Xiong, “Transmission distortion modeling for wireless video communication,” in

IEEE Global Telecommunications Conference (GLOBECOM), Dec 2005.

 

6. Y. J. Liang, J. G. Apostolopoulos, and B. Girod, “Analysis of packet loss for compressed video: Effect of

burst losses and correlation between error frames,” IEEE Trans. Circuits and Systems for Video Technology,

vol. 18, no. 7, pp. 861–874, July 2008.

 

7. Z. Chen and D. Wu, “Prediction of transmission distortion for wireless video communication: Algorithm

and application,” Journal of Visual Communication and Image Representation, vol. 21, no. 8, pp. 948–964,

Nov. 2010.

 

8. Subjective Video Quality Assessment Methods for Multimedia Applications, ITU-T Recommendation P.910, Sep. 1999.

 

9. M. Venkataraman and M. Chatterjee, “Inferring video QoE in real time,” IEEE Network, pp. 4–13, Jan./Feb.

2011.

 

10. Opinion Model for Video-Telephony Applications, ITU-T Recommendation G.1070, 2007.

 

11. J. Joskowicz and J. C. López Ardao, "Enhancements to the opinion model for video-telephony applications," in Proceedings of the 5th International Latin American Networking Conference (LANC), 2009, pp. 87–94.

 

12. M. Pinson and S. Wolf, "A new standardized method for objectively measuring video quality," IEEE Trans. Broadcasting, vol. 50, no. 3, pp. 312–322, Sep. 2004.

 

13. Z. Wang, L. Lu, and A. C. Bovik, "Video quality assessment based on structural distortion measurement," Signal Processing: Image Communication, vol. 19, no. 2, pp. 121–132, Feb. 2004.

 

14. ITU, "H.264/AVC reference software," online: iphome.hhi.de/suehring/tml/download/, Oct. 2012.

 

15. Z. He, J. Cai, and C. W. Chen, "Joint source channel rate-distortion analysis for adaptive mode selection and rate control in wireless video coding," IEEE Trans. Circuits and Systems for Video Technology, vol. 12, no. 6, pp. 511–523, June 2002.

 
