Introduction
In this paper, we present area- and power-efficient
architectures for the implementation of the integer discrete cosine
transform (DCT) of different lengths to be used in High Efficiency
Video Coding (HEVC). We show that an efficient constant matrix-
multiplication scheme can be used to derive parallel architectures
for 1-D integer DCT of different lengths. We also show that
the proposed structure can be reused for DCTs of lengths
4, 8, 16, and 32 with a throughput of 32 DCT coefficients
per cycle irrespective of the transform size. Moreover, the
proposed architecture could be pruned to reduce the complexity
of implementation substantially with only a marginal effect on
the coding performance.
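
To illustrate what a constant matrix-multiplication scheme means for the
shortest transform length, the following minimal sketch computes a 4-point
integer DCT in two ways: as a plain matrix-vector product, and as an
even/odd butterfly in which each constant multiplication is replaced by
shifts and additions. The matrix entries (64, 83, 36) are those of the
HEVC 4-point core transform. The function names are hypothetical, the
scaling and rounding stages of HEVC are omitted, and the code is not the
architecture proposed in the paper; it only shows the kind of arithmetic
such an architecture maps to hardware.

```python
# HEVC 4-point core transform matrix (integer approximation of the DCT).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct4_matrix(x):
    """Reference version: plain matrix-vector product y = C4 * x."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def mul36(v):
    """36 * v using shifts and one addition (36 = 32 + 4)."""
    return (v << 5) + (v << 2)


def mul83(v):
    """83 * v using shifts and additions (83 = 64 + 16 + 2 + 1)."""
    return (v << 6) + (v << 4) + (v << 1) + v


def dct4_shift_add(x):
    """Multiplier-free version (illustrative, not the paper's design)."""
    # Input butterfly: sums feed the even outputs, differences the odd ones.
    a0, a1 = x[0] + x[3], x[1] + x[2]
    b0, b1 = x[0] - x[3], x[1] - x[2]
    # Even outputs need only the constant 64, i.e. a left shift by 6.
    y0 = (a0 + a1) << 6
    y2 = (a0 - a1) << 6
    # Odd outputs use the constants 83 and 36.
    y1 = mul83(b0) + mul36(b1)
    y3 = mul36(b0) - mul83(b1)
    return [y0, y1, y2, y3]


if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(dct4_matrix(sample))
    print(dct4_shift_add(sample))
```

Both functions return the same coefficients for any integer input. The
shift-add form makes explicit that no general-purpose multiplier is
needed, since every matrix entry is a fixed constant.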