RELATIONSHIP BETWEEN DCT-II, DCT-VI, AND DST-VII TRANSFORMS
Yuriy A. Reznik
9710 Scranton Rd, San Diego, CA 92121, USA
Discrete Sine Transfonns of type VII (DST-VII) have recently
received considerable interest in video coding. In this paper,
we show that there exists a direct connection between DST
VII and DCT-II transfonns, allowing their joint computation
for certain transfonn sizes. This connection also yields fast
algorithms for constructing DCT-VI and DCT-VII.
Index Terms- KLT, DCT-II, DCT-VI, DST-VII, factor
izations, video coding.
The Discrete Cosine Transfonns of types II and IV are among
most fundamental, well understood, and much appreciated
tools in data compression. The DCT-II is used at the core of
standards for image and video compression, such as JPEG,
ITU-T H.26x-series, and MPEG 1-4 standards . The
DCT-IV is used in audio coding algorithms, such as ITU-T
Rec. G.722.1, MPEG-4 AAC, and others . Such transforms
are very well studied, and a number of efficient technique ex
ists for their computation [1, 3, 4, 5, 6, 7].
Much less known are so-called "odd" sinusoidal trans
fonns: Discrete Cosine and Sine Transfonns of types V, VI,
VII, and VIII. Existence of some of such transfonns was dis
covered by A. Jain in 1979 . A complete tabulation was de
veloped in 1985 by Wang and Hunt . However, not much
work has followed. Surveys of related results can be found
in [10, 3].
Recently, DST of types VI and VII have surfaced as useful
tools in image and video coding. In 2010, Han, Saxena, and
Rose have shown that DST-VII produce good approximations
of Karhunen-Loeve Transform (KLT) for model of residual
signals after Intra-prediction . This was subsequently val
idated in the course of experimental work on ISO/IEC/ITU-T
High Efficiency Video Coding (HEVC) standard [12, l3].
The adoption of DST-VII in HEVC has prompted a dis
cussion on the existence of fast algorithms for computing of
such transfonns [12, 14]. This question was addressed in
2011 by Chivukula and Reznik, who have established
connection between DST-VII and DFT.
978-1-4799-0356-6/13/$31.00 ©2013 IEEE 5642
This paper offers an alternative solution by establishing a
mapping between DST-VII, DCT-VI, and DCT-II. This map
ping yields fast algorithms not only for DST-VIIVII, but also
for DCT-VINII, as well as possible joint factorizations of
such transfonns. The obtained mapping may also be of inter
est from methodological standpoint, as it suggests additional
connections between DST-VII, DCT-VI and KLT of residual
and mixed signals.
The rest of this paper is organized as follows. Section 2
provides definitions. Section 3 establishes mapping between
DCT-II, DCT-VI and DST-VII transfonns. Section 4 explains
how this mapping can be used to construct fast algorithms.
Discussion and concluding remarks are offered in Section 5.
Let N be the length of data sequence. The matrices of Dis
crete Fourier Transfonn (DFT) and Discrete Cosine and Sine
transforms of types II, III, IV, VI, and VII will be defined as
DFT: [FN ] mn
DCT-II: [cy] mn
DCT-III: [cyI] mn
DCT-IV: [C'zn mn
DCT-VI: [Ct�l] mn =
DCT-VII: [Ct�l] mn =
DST-VI: [stI] mn
DST-VII: [SVII] = N mn
e_j2n;;n, m, n E [O, N-l]
cosm(2;�1)7r, m, nE[O, N-l]
(2m+ l)n7r [0 N 1] cos 2N ' m, n E , -7r (2m+l)(n+ l) E [0 N-l] cos 4N ,m, n ,
m(2n+ l)7r [0 N] cos 2N+l ' m, n E ,
(2m+ l)n7r [0 N] cos 2N+l ' m, n E ,
. (m+l)(2n+ l)7r E [0 N-l] Sill 2N+l , m, n , . (2m+l)(n+ l)7r E [0 N-l] Sill 2N+l , m, n ,
In the above definitions, we have intentionally omitted
nonnalization constants (such as y!2/N and Ai = [l/a��
conventionally used in definition of DCT-II) as they don't af
fect factorization structures of the transfonns. Sub-indices N
or N + 1 indicate lengths of the transfonns. We follow Wang
and Hunt's convention of coupling N -point DST-VINII with
N + I-point DCT-VINII .
As easily noticed, transfonns of types II and III, as well
as VI and VII are closely related:
(CII)T _ CIII. (CVI )T _ CVII . (SVI)T _ SVII
N - N , N+l - N+l' N - N .
1--- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 1
i 2N+1-point OCT-II i
, I 1 Yo Yo 1 Xo
r - - - - - - - - - - - - - - - - -
��i�t-oCT =ii --------------1
� : 1 Xo
: N-point : :_����V�I_!
i �t;'�I� :
I _______ !
i �t;'�I� :
1 I _______ ! 1
I ' !...---------------------------------------------------:
Fig. 1. Computing DCT-II of composite sizes: (a) split of 2N + I-point DCT-II into an N + I-point DCT-VI and N-point
DST-VI; (b) computation of DCT-II of length N(2N + 1)_
3. RELATIONSIDP BETWEEN 2N + I-POINT
DCT-II, N + I-POINT DCT-VI, AND N-POINT
In this section we prove the following statement.
Theorem 1. The following holds:
where Q2N+l is a matrix, such that when applied to a vector
x, it produces the following sign alterations and reordering:
( - I ) iXN+ Hi
i = O, _ .. , N,
i = O, . . . , N - 1,
and IN and J N are N x N identity and order-reversal matri
Proof Let us consider a 2N + I-long input sequence X
Xo, . . . , X2N, and apply DCT-II over it:
7f(2n + I)k
L Xn cos 2(2N + 1) , k = 0 , . . . , 2N. n=O
We first look at even output values (k = 2i, i = 0, ... ,N):
7f(2n + 1)2i
L Xn cos 2(2N + 1) n=O
7f(2n + I)i
L Xn cos 2N + 1 . n=O
We split this sum as follows:
ell � 7f(2n + I)i
X2i - L Xn cos 2N + 1 n=O
� 7f(2n + I)i L Xn cos 2N + 1 n=N+ l
7f(2(2N - n) + I)i
L Xn cos 2N + 1 n=N+ l
( ) � 7f 2n + 1 i L X2N -n cos 2N + 1 ' n=O
which implies that
, , , ,
,,: :0 Xo
x, ��--_H�+,������--�����:� �
x, -i-..,L-fH.�_+_<i--"-::..:=-=-=-'---'-'i;---__r--"-'--�___:_� x,
x, -i-���_+_�------��----� __ ���:�'�: � ,
x, �4T--����--����--�-4_,��ir�t+: Xs
x, �4---��, �����
, " ,
: sln(t ) : : : ______________________________________ J :
Fig. 2. Flow-graph of Winograd's factorization of DFT of length 9, and flow-graphs of 9-point DCT-II, 5 -point DCT-VI, and
4-point DST-VII implied by mappings (1,3).
where XC v I is a DCT-VI transform over the first N + 1 ele
ments of input sequence x, and X,CV1 is a DCT-VI transform
over the following input:
, _ [ X2N-n, xn - 0 ,
if n = 0 , . . . , N - 1,
We now turn our attention to the odd output values (k =
2i + 1, i = 0 , . . . , N - 1):
7r(2n + 1)(2i + 1)
� Xn cos 2( 2N + 1) n=D
( )i+l � . 7r
(N -n)(2i + 1)
-1 � X2N -n Sill --'----'-'---'-
n=D 2N + 1
We split this sum as follows:
XCII ( )i �
. 7r(N -n)(2i + 1)
2i+ 1 + -1 � X2N -n Sill ---'--2N.,....,-'+-'-1---'-n=D
(_l)i+l � . 7r
(N-n)(2i+l) � X2N-n Sill 2N + 1 n=N+l
. 7r(n + 1) (2i + 1)
-1 XN-l-n Sill , 2N +1 n=D
which implies that
where XSVll is an N-point DST-VII transform over a se
Xn = XN+l+n, n = 0 , . . . , N - 1,
VII and XS is an N -point DST-VII transform over a sequence
Xn=XN-l-n, n=O, ... , N - l.
By combining all these mappings we arrive at expression (1).
We present a flowgraph of the resulting mapping between
DCT-II, DCT-VI, and DST-VII transforms in Figure 1.a. Only
2N additions, permutations and sign changes are needed to
convert output of DCT-VI, and DST-VII into DCT-II.
4. FAST ALGORITHMS FOR COMPUTING DCT-II,
DCT-VI, AND DST-VII
4. 1. Connection to DFT
From (1) it follows that fast computation of DCT-VI and DST
VII can be reduced to computing subsets of DCT-II. Accord
ing to Heideman  it is also known that computing of DCT
II of odd numbers is equivalent to computing same-length
DFT. Considering 2N + I-point transforms, we can summa
rize Heideman's result as follows:
H ( [R(F2N+l)]rows D, ... ,N ) H (3) 2N+l - 1 ['2s(F )] 2, 2N+l rows N+l, ... ,2N
where R (F2N+d and '2s (F2N+l) denote real and imaginary
parts of the DFT transform matrix of size 2N + 1, and Hl and
H2 are some permutation and sign-inversion matrices .
In combination with (1) this formula shows that an N + 1-
point DCT-VI, an N-point DST-VII and an 2N + I-point
DCT-II can be computed by mapping to a 2N + I-point DFT.
Since many algorithms for computing of DFT are readily
available (see e.g. ), this automatically leads to to fast
algorithms for computing DCT-VI and DST-VII.
2N+1-point OCT-II 2N-point OCT-II
Fig. 3. Conceptual illustration of decompositions of (a) 2N + I-point DCT-II (1) and (b) 2N -point DCT-II (4).
4.2. Examples of fast algorithms for N = 4
We use Winograd DFT module of length 9 shown in Fig
ure 2.a. This particular factorization comes from . By
using this ftowgraph and mappings (3) and (1) we easily ob
tain 9-point DCT-I1, 5-point DCT-VI, and 4-point DST-VII.
This is shown in Figure 2.b.
We note that all these algorithms are very efficient in
terms of multiplicative complexity. Thus, obtained 9-point
DCT-II requires only 8 non-trivial multiplications. In con
trast, the least complex algorithms for computing DCT-ll of
size 8 (nearest dyadic-size) requires 11 multiplications.
The obtained 4-point DST-VII is also very efficient: it
uses only 5 multiplications. This factorization is immediately
suitable for implementing an integer approximation of DST
VII transform defined in HEVC standard .
Finally, factorization of a 5-point DCT-VI shown in Fig
ure 2.b needs only 3 real multiplications and 2 shifts (multi
plications by factors 1/2).
4.3. Fast computing of transforms of length 2k N (2N + 1)
It is known that a transform of a composite length N = pq,
where p and q are co-prime, can be decomposed into a cas
cade of p q-point transforms and q p-point transforms fol
lowed by pq - p - q - 1 additions. This class of techniques
is called Prime Factor Algorithms (PFA) [17, 18].
In Figure 1.b, we show how to compute DCT-II of length
N(2N + 1). This factorization includes 2N + 1 N-point DCT
II sub-transforms, and additionally N 2N + I-point DCT
II transforms, which, in turn include N -point DST-VII as
part of their ftowgraph. Hence, a system that implements
and uses N-point DCT-II and DST-VII, can easily compute
an N(2N + 1) transform by reusing them. Same principle
more generally applies to computing transforms of lengths
2k N(2N + 1).
Embedded factorization structures including DST-VII
blocks in ftowgraphs for DCT-II can be of interest to hard
ware implementations, as it offers potential for reducing the
area, cost, and power usage of a circuit responsible for com
S. DISCUSSION AND CONCLUDING REMARKS
We notice that decomposition (1) looks very similar to the
well-known split of even-sized DCT-I1 (see, e.g. ):
where P2N is a certain permutation matrix. This split leads
to recursive construction due to reappearance of DCT-ll in the
upper part of decomposition. In contrast, our decomposition
of 2N + I-point DCT-I1 (1) does not immediately lead to a
In Figure 3 we offer conceptual illustration of both de
compositions (1) and (4). Input data samples are denoted as
YN, ... ,YI,ZO,XI, ... ,XN in a 2N + I-point case (a), and
YN, ... , YI, Xl, ... , XN in 2N-point case (b). It is shown that
the lower (right) portion of DCT-I1 transform becomes es
sentially equivalent to DST-VII (or DCT-IV) transform over
residual samples Y; = Yi - Xi, while the upper (right) por
tion of DCT-I1 transform becomes essentially equivalent to
DCT-VI (or DCT-II) transform over sums: x; = Xi + Yi,
(i = 1, . . . , N). In the 2N + I-point case, the upper trans
form also absorbs the middle sample ZOo
This illustration may be insightful for understanding
meanings of the involved transforms. For instance, in sig
nal processing, it is customary to think of DCT-II as an
approximation of KLT for 1-st order Markov source with
high correlation coefficient. Decomposition in Figure 3.a
shows that DST-VII, as well as DCT-VI (with some permuta
tions and sign changes) can be understood as approximations
of KLT over residual or mixed signals with progressively
increasing distances between samples. Similarly, decom
position in Figure 3.b shows that in case of a 2N-sample
arrangement, it is DCT-IV and DCT-II that can be understood
as approximations of KLT over residual and mixed signals.
The obtained relationship (1) may also be instrumental in
showing that DST-VII-based coding of Intra-prediction resid
ual is essentially equivalent to performing L -shaped DCT
II, where one part of L -shape corresponds to boundary pix
els, and the other part absorbs pixels predicted based on this
boundary. A design of direction-adaptive transforms based on
similar idea was proposed in .
 K. R. Rao and P. Yip, Discrete Cosine Transform: Algo
rithms, Advantages, Applications, Academic Press, Boston,
 R. K. Chivukula and Y. A. Reznik, "Efficient implementation
of a class of MDCTIIMDCT filterbanks for speech and audio
coding applications," in IEEE Int. Con! Acoust., Speech, Sig
nal Processing (ICASSP), July 2008, pp. pp. 213-216.
 V. Britanak, K. R. Rao, and P. Yip, Discrete Cosine and Sine
Transforms: General Properties, Fast Algorithms and Inte
ger Approximations, Academic Press - Elsevier, Oxford, UK,
 M. T. Heideman, "Computation of an odd-length DCT from
a real-valued DFT of the same length," IEEE Trans. Signal
Processing, vol. 40, no. 1, pp. 54-61, 1992.
 C. W. Kok, "Fast algorithm for computing Discrete Cosine
Transform," IEEE Trans. Signal Processing, vol. 45, no. 3, pp.
 E. Feig and S. Winograd, "On the multiplicative complexity
of Discrete Cosine Transforms (Corresp.)," IEEE Trans. Infor
mation Theory, vol. IT-38, pp. 1387-1391, 1992.
 C. Loeffler, A. Ligtenberg, and G.S. Moschytz, "Practical
fast I-D DCT algorithms with 11 multiplications," in IEEE
Int. Con! Acoust., Speech, Signal Processing (ICASSP), May
1989, vol. 2, pp. 988-991.
 A. K. Jain, "A sinusoidal family of unitary transforms," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. PAMI-
1, no. 4, pp. 356 -365, Oct. 1979.
 Z. Wang and B. R. Hunt, 'The discrete W transform," Applied
Mathematics and Computation, vol. 16, no. 1, pp. 19 - 48,
 S. A. Martucci, "Symmetric convolution and the Discrete Sine
and Cosine Transforms," IEEE Trans. Signal Processing, vol.
SP-42, pp. 1038-1051,1994.
[ll] J. Han, A. Saxena, and K. Rose, 'Towards jointly optimal
spatial prediction and adaptive transform in video/image cod
ing;' in IEEE Int. Con! Acoust., Speech, Signal Processing
(ICASSP), March 2010, pp. 726-729.
 A. Saxena and F. Fernandes, "CE7: Mode-dependent
DCT/DST for intra prediction in video coding," commitee in
put document JCTVC-D033, ISO/IEC/ITU-T Joint Collabora
tive Team on Video Coding, Daegu, Korea, January 2011.
 B. Bross, W-J. Han, G. Sullivan, J. Ohm, and T. Wiegland,
"High Efficiency Video Coding (HEVC) text specification
draft 6," ITU-T and ISO/IEC JCTVC-H1003, Feb 2012.
 F. Fernandes A. Saxena and Y. Reznik, "Fast transforms for
intra-prediction-based image and video coding;' in Data Com
pression Conference (DCC), March 2013 - submitted.
 R. K. Chivukula and Y. A. Reznik, "Fast computing of discrete
cosine and sine transforms of types vi and vii," in Applications
of Digital Image Processing XXXIV, A. G. Tescher, Ed., 2011,
vol. 8135 of Proc. SPlE, pp. 813505-1-14.
 C. S. Burrus and T. W. Parks, DFT/FFT and Convolution Al
gorithms and Implementation, Wiley-Interscience, New York,
 P.P.N. Yang and MJ. Narasimha, "Prime factor decompo
sition of the discrete cosine transform," in IEEE Int. Con!
Acoust., Speech, Signal Processing (ICASSP), March 1985, pp.
 B. G. Lee, "Input and output index mapping for a prime-factor
decomposed computation of discrete cosine transform," IEEE
Trans. Acoust., Speech, Signal Processing, vol. 37, no. 2, pp.
237 - 244, Feb. 1989.
 S. S. Tsai C-L. Chang, M. Makar and B. Girod, "Direction
adaptive partitioned block transform for color image coding,"
IEEE Transactions on Image Processing, , no. 7, pp. 1740-
1755, July 2010.