Introduction
- Vector quantization (VQ) is commonly used in the compression of image
and speech signals. In vector quantization, each block (vector) of the input
signal is approximated by a reproduction vector (codevector) from a predesigned
set of vectors (codebook). This representative codevector, the nearest neighbor
of the source vector, gives the least dissimilarity (distortion) among all the
codevectors in the codebook. Compression is achieved by transmitting or storing
the indices associated with the codevectors instead of the codevectors
themselves, because the indices require far fewer bits. Figure 1 shows the
principle of the resulting encoder and decoder.
- Vector quantization is the extension of scalar quantization. In signal
processing, converting an analog source (continuous in time and amplitude)
into a digital source (discrete in time and amplitude) involves two operations:
sampling and quantization. In the following, a source is regarded as a
stochastic process described by a probability density function; a particular
realization of such a source is a signal. Sampling converts a continuous-time
signal into a discrete-time signal, a set of data samples. Quantization
converts a continuous-amplitude signal into a discrete-amplitude signal.
- Vector quantization is characterized by its dimension, equal
to the number of data samples in a set that is quantized jointly as a single
vector. Vector quantization thus approximates an infinite set of vectors by a
limited set of vectors. This approximation can be regarded as a lossy
compression method characterized by its distortion and its
compression rate. The distortion measures the loss of information induced
by the quantization; for images this loss appears as a degradation of details.
The compression rate measures the gain obtained by representing the samples
with a limited set instead of an infinite one. In terms of transmission, a
large compression rate implies a low bit rate, and a low compression rate
implies a large bit rate, to transmit the same number of data samples in the
same time.
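- As a worked example with assumed, purely illustrative numbers: quantizing an
8-bit grayscale image in blocks of 4×4 pixels gives a dimension of k = 16, i.e.
16 · 8 = 128 bits per source vector. With a codebook of N = 256 codevectors,
each vector is replaced by an index of log₂ 256 = 8 bits, so the bit rate drops
from 8 to 8/16 = 0.5 bits per sample, a compression ratio of 16:1 (ignoring the
cost of transmitting the codebook itself).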
- In image processing, vector quantization is used as a lossy compression
method to reduce psychovisual as well as statistical redundancies in the image
data.
- Vector quantization is based on two principal theories: Shannon
rate-distortion theory and high-resolution theory. The two theories are
complementary in the sense that Shannon rate-distortion theory prescribes the
best possible performance of quantization for a given bit rate and
asymptotically large dimension, while high-resolution theory prescribes the
best possible performance for a given dimension and asymptotically large bit
rate.
- Vector quantization based on Shannon rate-distortion theory exploits
the interdependencies of the data samples to gain performance, in particular
to transmit more information at the same bit rate. With increasing dimension
these interdependencies approximate the probability density function of the
source. As a result, Shannon rate-distortion theory leads to a small codebook
with a large dimension.
- Vector quantization based on high-resolution theory instead assumes a
uniform probability density function for the source. In case of non-uniformity,
a high bit rate is imposed to attain a locally uniform probability density. A
high bit rate means transmitting more codevectors for the same amount of
information, resulting in a large codebook with a small dimension. When both
the dimension and the bit rate are large, the two theories merge. Obviously,
each theory leads to its own design philosophy.
1.1.1 Encoder and Decoder
- The quantization can be decomposed into two operations: a lossy encoder
α and a reproduction decoder β. The lossy encoder α is the mapping
from the set of source vectors X ⊂ ℝ^k to the
index set I = {1, …, i, …, N}, so a k-dimensional vector x ∈ X is represented
by an index i ∈ I. Here the set of vectors X is the infinite set of all
possible combinations of k data samples, and the index set I is a finite set
of N indices. The reproduction decoder β is the mapping from the index set I
to the reproduction set Y ⊂ ℝ^k.
- To each index i a reproduction vector y_i ∈ Y is associated, where
Y = {y_1, …, y_i, …, y_N}. The original set of source vectors X is partitioned
into N subsets (regions), and each vector falling into a subset is represented
by the associated reproduction vector. The reproduction set Y thus
approximates the source with fewer vectors.
- The reproduction set Y is also called the codebook and its reproduction
vectors codevectors. In fact, in most vector quantization methods the
reproduction set Y is used as the codebook to encode and to decode the
vectors, as presented in Figure 1. Depending on the theory, the encoding and
decoding operations vary or can be decomposed further.
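- The mappings α and β can be made concrete with a minimal sketch in Python
(assuming numpy, a squared-error distortion, and an already designed codebook;
the function names are illustrative):

```python
import numpy as np

def encode(x, codebook):
    """Lossy encoder alpha: map source vector x to the index of its
    nearest codevector (full search over the codebook)."""
    # Squared Euclidean distance from x to every codevector.
    distortions = np.sum((codebook - x) ** 2, axis=1)
    return int(np.argmin(distortions))

def decode(i, codebook):
    """Reproduction decoder beta: map index i back to codevector y_i."""
    return codebook[i]

# Toy example: N = 4 codevectors of dimension k = 2.
codebook = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
x = np.array([0.9, 0.2])
i = encode(x, codebook)   # transmitted/stored: just the index i
y = decode(i, codebook)   # reproduction vector, here [1.0, 0.0]
```

Only the index i needs to be transmitted; with N = 4 codevectors it fits in
log₂ 4 = 2 bits.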
1.1.2 Distortion Measure
- The difference between the source vector and its associated codevector is
the distortion (quantization error, quantization noise), measured by a cost
function. Commonly used cost functions are norms, so the difference
between the source vector and the codevector becomes a distance d. In image
processing the Euclidean norm is the most widely used distortion measure.
For vector quantization, however, the squared Euclidean norm or mean squared
error is applied to simplify the computation. It has been shown that the mean
squared error does not correlate well with human quality requirements; in
particular, details in an image such as edges and breaks get blurred when the
mean squared error is used.
- Given the source vector x and the codevector y_i, the squared Euclidean
distance or mean squared error is given by

d(x, y_i) = ‖x − y_i‖² = Σ_{j=1}^{k} (x_j − y_{i,j})².
1.1.3 Codebook
- With a distortion measure and the probability density function of the
source, an optimal codebook can be designed: optimal in the sense that the
average distortion between the source vectors x and their codevectors y_i is
minimal.
- Commonly this distortion is expressed by the peak signal-to-noise ratio

PSNR = 10 · log₁₀(x_peak² / D) dB,

where D is the average distortion (mean squared error) and the peak value
x_peak is the maximal attainable source sample (255 for 8-bit images).
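- A small sketch of this measure (assuming numpy and 8-bit image data; names
are illustrative):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images,
    using the mean squared error over all samples."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```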
- The set of source vectors is thus partitioned such that each vector x has a
nearest-neighbor codevector y_i. Mathematically, this partition is called a
Voronoi or Dirichlet partition, and the codevectors are the centroids of the
regions. Figure 2 represents the two-dimensional case.
- In general the probability density function of the source is rarely known.
To circumvent this problem, rate-distortion theory and high-resolution theory
each offer a solution.
- The latter solution is to choose a large number of codevectors. The
associated regions are small and highly structured, as in a lattice. Instead
of searching for the nearest neighbor, the number of codevectors is raised
such that for each vector x the distortion is approximately constant. As
mentioned, this results in a local uniformity of the source probability
density function and a high bit rate.
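- For the simplest such lattice, the scaled integer lattice Δ·Z^k, the nearest
codevector is found by componentwise rounding rather than by a search. A
minimal sketch (assuming numpy; the names are illustrative):

```python
import numpy as np

def lattice_encode(x, step):
    """Nearest point of the scaled integer lattice step * Z^k:
    no codebook search, just componentwise rounding."""
    return np.round(x / step).astype(int)   # integer lattice coordinates

def lattice_decode(indices, step):
    """Map lattice coordinates back to the reproduction vector."""
    return indices * step

x = np.array([0.23, 1.78, -0.41])
idx = lattice_encode(x, step=0.5)   # [0, 4, -1]
y = lattice_decode(idx, 0.5)        # reproduction [0.0, 2.0, -0.5]
```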
- The other solution is to use a training set that best represents the source
to optimize the codebook. To achieve this, a clustering algorithm is used.
One such algorithm is the Lloyd algorithm. It iteratively improves a codebook
by alternately optimizing the encoder for the decoder (partitioning the
training set into regions, the Voronoi regions, such that the average
distortion for the given training set is minimal) and the decoder for the
encoder (replacing the codevectors by the centroids of these regions). This is
repeated until the average distortion converges to a lower limit set by
Shannon rate-distortion theory, as sketched below. The codebook is then
optimal for the training set but not necessarily for the set of source
vectors. Other clustering algorithms are known, such as the pairwise nearest
neighbor algorithm by Ward and Equitz, the k-means algorithm, neural net
approaches, simulated annealing and stochastic relaxation algorithms, to
mention just a few.
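- A minimal sketch of such a Lloyd iteration (assuming numpy and a
squared-error distortion; random initialization from the training set and a
fixed iteration count replace a proper initialization and convergence test,
and the names are illustrative):

```python
import numpy as np

def lloyd(training_set, num_codevectors, iterations=50, seed=0):
    """Iteratively improve a codebook on a training set:
    (1) nearest-neighbor partition, (2) centroid update."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook with randomly chosen training vectors.
    codebook = training_set[rng.choice(len(training_set),
                                       num_codevectors,
                                       replace=False)].astype(float)
    for _ in range(iterations):
        # Encoder step: assign each training vector to its nearest codevector.
        d = ((training_set[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d.argmin(axis=1)
        # Decoder step: replace each codevector by the centroid of its region.
        for i in range(num_codevectors):
            members = training_set[nearest == i]
            if len(members) > 0:    # keep the old codevector for empty regions
                codebook[i] = members.mean(axis=0)
    return codebook
```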
1.2 Classification of Vector Quantization
- Vector quantization can be classified into vector quantization with and
without memory. Memoryless vector quantization encodes each source vector
separately, whereas vector quantization with memory additionally exploits the
interdependencies between source vectors. A coarse criterion is the rate:
most memoryless vector quantizers have a fixed bit rate, and most vector
quantizers with memory have a variable bit rate.
1.2.1 Vector Quantization without Memory
- In practice, there are two different approaches to get reasonable vector
quantization performance. One keeps the original optimal codebook and uses
fast algorithms for the nearest neighbor search; this is named
unconstrained or unstructured vector quantization. The other uses
simple search algorithms and, as a consequence of this simplicity, an
approximation of the optimal codebook; this is named constrained or
structured vector quantization.
Unconstrained Vector Quantization
To encode a source vector, the closest codevector (nearest neighbor) in the
optimal codebook is determined using different search methods.
Full Search Vector Quantization For a given source vector, the
distortion to each codevector in the codebook is computed. The codevector with
the minimal distortion is chosen as the reproduction vector.
Fast Search Vector Quantization To speed up the search, only
a subset of the codevectors is considered. Which codevectors have to be
considered is decided using an inherent property of the source vectors. Such
properties are obtained by a transformation like principal component analysis,
the Walsh-Hadamard transformation, the discrete cosine transformation, or
hyperplane partitioning. Another method is to use a geometrical criterion such
as the triangle inequality to exclude codevectors from the search, as sketched
below.
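One way such an exclusion can work: if r = d(x, y_best) is the distance to the
best codevector found so far and d(y_j, y_best) > 2r, then by the triangle
inequality d(x, y_j) ≥ d(y_j, y_best) − r > r, so y_j cannot beat y_best and
its distortion need not be computed. A minimal sketch (assuming numpy and
precomputed inter-codevector distances; names are illustrative):

```python
import numpy as np

def fast_encode(x, codebook, cdist):
    """Nearest-neighbor search that skips codevectors ruled out by the
    triangle inequality. cdist[a, b] = distance between codevectors a, b."""
    best, r = 0, np.linalg.norm(x - codebook[0])
    for j in range(1, len(codebook)):
        # If y_j is more than 2r from the current best codevector, it
        # cannot be closer to x than y_best: d(x, y_j) >= cdist[j, best] - r > r.
        if cdist[j, best] > 2.0 * r:
            continue
        d = np.linalg.norm(x - codebook[j])
        if d < r:
            best, r = j, d
    return best

codebook = np.random.default_rng(1).normal(size=(64, 16))
cdist = np.linalg.norm(codebook[:, None] - codebook[None, :], axis=2)
x = np.random.default_rng(2).normal(size=16)
# Gives the same result as a full search, with fewer distance computations.
assert fast_encode(x, codebook, cdist) == int(
    np.argmin(np.linalg.norm(codebook - x, axis=1)))
```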
Constrained Vector Quantization
To encode a source vector, a constrained codebook is used. In this case the
chosen codevector is only an approximation of the nearest neighbor, resulting
in a suboptimal encoding.
Tree Structured Quantization The codebook is structured as a tree; beginning at the root, the branch with the minimal resulting distortion is chosen at each level (see the sketch after this list).
Classified Vector Quantization The codebook is divided into several sub-codebooks, and the sub-codebook for a given source vector is selected by an appropriate criterion. This criterion is based on an inherent property of the source vectors.
Product Code Techniques The set of source vectors is divided into approximately independent subsets, each encoded with its own codebook. Depending on the source, different methods are used, known as partitioned vector quantization, mean removed or mean residual vector quantization, shape gain vector quantization, pyramid vector quantization, polar vector quantization, and multistage or residual vector quantization.
Lattice Vector Quantization The codebook is a regular lattice in which all regions have the same shape, size and orientation.
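A minimal sketch of the tree-structured search mentioned above (assuming
numpy and a complete binary tree of test vectors whose leaves are the
codevectors; names are illustrative):

```python
import numpy as np

def tree_encode(x, tree):
    """Descend a complete binary tree of test vectors; at each level follow
    the child with the smaller distortion. tree[0] is the root, the children
    of node n are nodes 2n+1 and 2n+2, and the leaves are the codevectors."""
    n = 0
    while 2 * n + 1 < len(tree):            # until a leaf is reached
        left, right = 2 * n + 1, 2 * n + 2
        d_left = np.sum((x - tree[left]) ** 2)
        d_right = np.sum((x - tree[right]) ** 2)
        n = left if d_left <= d_right else right
    return n                                 # index of the chosen leaf
```

The search cost drops from O(N) to O(log₂ N) distortion computations, at the
price of a possibly suboptimal codevector.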
1.2.2 Vector Quantization with Memory