In VQ, the input samples are quantized in groups (vectors), producing one quantization index per vector [6]. Usually, the quantization indexes are much shorter than the vectors they represent, which is what produces the compression. However, it is also possible to use an entropy codec to further compress the quantization indexes.
Vector Quantization (VQ) can remove auto-correlation from the encoded signal and is therefore more efficient in RD (rate/distortion) terms [2] than Scalar Quantization (SQ).
VQ takes as input blocks of \(L\) elements (samples of speech, pixels of an image, etc.), builds a code-book (a set of \(L\)-dimensional vectors, called code-vectors, selected to be representative of the input), usually by means of an adaptive algorithm, compares each input vector with all the code-vectors, and outputs the index of the code-vector that minimizes some distortion measurement. In other words, if \(\mathbf{x}\) is an input vector and \(\mathbf{y}_j\) is the selected code-vector (of the code-book \(\mathcal{C}=\{\mathbf{y}_i; i=0,1,\cdots,K-1\}\)), it is satisfied that \begin{equation} ||\mathbf{x}-\mathbf{y}_j||^2 \le ||\mathbf{x}-\mathbf{y}_i||^2\qquad \forall \mathbf{y}_i\in \mathcal{C}. \end{equation}
Because the decoder builds exactly the same code-book, it can retrieve the decompressed code-vector given its quantization index. Notice that, although the encoder likely has to perform a considerable amount of computation in order to build the code-book and to find the closest code-vector, decoding consists simply of a table lookup.
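The following NumPy sketch illustrates this encode/decode asymmetry; the code-book and the input vector are toy values chosen only for the example:

\begin{verbatim}
import numpy as np

def vq_encode(x, codebook):
    # Index of the code-vector that minimizes the squared Euclidean distance.
    distances = np.sum((codebook - x)**2, axis=1)
    return int(np.argmin(distances))

def vq_decode(j, codebook):
    # Decoding is just a table lookup.
    return codebook[j]

# Toy code-book: K=4 code-vectors of dimension L=2.
C = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
x = np.array([0.9, 0.2])
j = vq_encode(x, C)   # -> 2
y = vq_decode(j, C)   # -> array([1., 0.])
\end{verbatim}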
If \(K\) is the number of code-vectors in the code-book and the input is split into vectors of length \(L\), the quantizer will use \(\lceil \log_2 K\rceil = S\) bits per input vector. Therefore, the rate would be \(S/L\) bits/sample.
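For example, with \(K=256\) code-vectors and vectors of \(L=4\) samples, \(S=\lceil\log_2 256\rceil=8\) bits are output per vector, that is, \(8/4=2\) bits/sample.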
If the source output is correlated, vectors of source output values will tend to fall in clusters [6]. The idea is to put in \(\mathcal{C}\) a set of code-vectors that minimizes the distortion of VQ. In other words, we must split the signal space (of dimension \(L\)) into a set of \(K\) regions \begin{equation} V_k=\{\mathbf{x}: d(\mathbf{x},\mathbf{y}_k) < d(\mathbf{x},\mathbf{y}_i), \forall i\ne k\}, \end{equation} of arbitrary shape.
This is a combinatorial optimization problem in a rather large search space, which usually makes it impossible to determine a global optimum in reasonable time. This is the reason why VQ methods compute, at best, only a “local” optimum [1]. Even so, VQ is widely used; palettized (indexed-color) images, for example, are an instance of VQ, as the sketch below shows.
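For instance, Pillow can palettize an image, which is VQ with \(L=3\) (RGB pixels) and \(K\) palette entries; the file names below are placeholders:

\begin{verbatim}
from PIL import Image

img = Image.open("photo.png").convert("RGB")  # each pixel is an L=3 vector
pal = img.quantize(colors=16)                 # K=16 code-vectors (the palette)
pal.save("photo_16colors.png")                # one quantization index per pixel
\end{verbatim}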
If we know \(K\) (the size of the code-book \(\mathcal{C}\)), we can find \(\mathcal{C}\) using the K-means algorithm [3, 6]. This algorithm computes \(\mathcal{C}\) as the set of centroids of the regions \(V_k\).
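As a sketch, SciPy's scipy.cluster.vq module provides both steps: kmeans() computes the centroids and vq() assigns each vector to its region. The training set here is synthetic, only for illustration:

\begin{verbatim}
import numpy as np
from scipy.cluster.vq import kmeans, vq

rng = np.random.default_rng(0)
training = rng.normal(size=(10000, 2))      # synthetic vectors, L=2

K = 16
codebook, distortion = kmeans(training, K)  # C = the K centroids
indices, _ = vq(training, codebook)         # one index per input vector
reconstruction = codebook[indices]          # decoder: table lookup
\end{verbatim}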
The LBG algorithm [4] is a generalization of the Lloyd algorithm [5], in which \(K\) is estimated from the input vectors and then the K-means algorithm is used to compute the \(L\)-dimensional centroids of \(\mathcal{C}\).
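A simplified sketch of the splitting idea behind LBG follows; it assumes \(K\) is a power of two and omits the distortion-based stopping rule of the original algorithm:

\begin{verbatim}
import numpy as np

def lbg(training, K, eps=1e-3, lloyd_iters=20):
    # Start from the global centroid and double the code-book
    # until it holds K code-vectors.
    codebook = training.mean(axis=0, keepdims=True)
    while len(codebook) < K:
        # Split each code-vector into two slightly perturbed copies.
        codebook = np.vstack([codebook*(1 + eps), codebook*(1 - eps)])
        for _ in range(lloyd_iters):
            # Lloyd step 1: assign each vector to its nearest code-vector.
            d = ((training[:, None, :] - codebook[None, :, :])**2).sum(axis=2)
            nearest = d.argmin(axis=1)
            # Lloyd step 2: move each code-vector to its region's centroid.
            for k in range(len(codebook)):
                members = training[nearest == k]
                if len(members) > 0:
                    codebook[k] = members.mean(axis=0)
    return codebook
\end{verbatim}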
The distortion generated by VQ depends not only on \(L\) and \(K\), but also on:
[1] W. Burger and M.J. Burge. Digital Image Processing: An Algorithmic Introduction Using Java. Springer, 2016.
[2] V. González-Ruiz. Information Theory.
[3] John A. Hartigan and Manchek A. Wong. Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):100–108, 1979.
[4] Yoseph Linde, Andres Buzo, and Robert Gray. An algorithm for vector quantizer design. IEEE Transactions on Communications, 28(1):84–95, 1980.
[5] Stuart Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.
[6] K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.