Scalar (Digital) Quantization

Vicente González Ruiz

December 24, 2022

Contents

 1 Definition
 2 Uniform SQ (USQ)
  2.1 Mid-rise USQ
  2.2 Mid-tread USQ
  2.3 Mid-tread USQ with deadzone
 3 Non-uniform quantization
  3.1 Companded quantization [6]
  3.2 PDF-optimized quantization
 4 Adaptive quantization
  4.1 Forward adaptive quantization
 5 Backward adaptive quantization
  5.1 The Jayant quantizer [4]
  5.2 Adapting with a scale factor
 6 RD performance
 7 Perceptual quantization
  7.1 In audio
  7.2 In images
 8 Quantization error
 9 RD optimized quantizer design
 10 References

1 Definition

Scalar (Digital) Quantization [6, 7] (see Fig. 1) is a technique in which each source sample is quantized independently of the other samples; therefore, a quantization index \({\mathbf k}_i\) is produced for each input sample \({\mathbf s}_i\) [2].

Figure 1: Scalar quantization and dequantization of a signal.

A \(K\)-level Scalar Quantizer (SQ) \(Q\) performs a partition of the domain of \(\mathbf s\) into \(K\) cells \({\mathbf C}_k, k = 1, \cdots , K\) and associates to any \({\mathbf s}_i\) the quantization index \({\mathbf k}_i=k\) if \({\mathbf s}_i\in {\mathbf C}_k\). In other words, \begin {equation} Q({\mathbf s}_i) = k \Leftrightarrow {\mathbf d}_{k-1} < {\mathbf s}_i \le {\mathbf d}_k, \end {equation} where \({\mathbf d}_{k-1}\) and \({\mathbf d}_k\) are the boundaries of the cell \({\mathbf C}_k\).

The inverse quantizer \(Q^{-1}\) estimates \({\mathbf s}_i\) knowing \({\mathbf k}_i\) and possibly the PDF (Probability Density Function) \(p_{\mathbf S}({\mathbf s})\), using a reconstruction level \({\mathbf r}_k\in ]{\mathbf d}_{k-1}, {\mathbf d}_k]\), generating the output \begin {equation} \tilde {\mathbf s}_i = {\mathbf r}_k. \end {equation}

The smallest and the largest of all the cell boundaries are called the decision boundaries of \(Q\).
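As an illustration, here is a minimal sketch of this generic quantizer/dequantizer pair (in Python with NumPy; the decision and representation levels are hypothetical, and the indexes run from \(0\) to \(K-1\) instead of \(1\) to \(K\)):

```python
import numpy as np

# Hypothetical decision levels (cell boundaries) and one representation
# level per cell (K = 6 cells).
d = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])        # decision levels
r = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])  # representation levels

def Q(s):
    """Quantizer: return the index k of the cell ]d_{k-1}, d_k] of each sample."""
    return np.digitize(s, d, right=True)

def iQ(k):
    """Inverse quantizer: replace each index by its reconstruction level r_k."""
    return r[k]

s = np.array([-1.7, 0.2, 1.49])
k = Q(s)
print(k, iQ(k))  # [1 3 4] [-1.5  0.5  1.5]
```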

2 Uniform SQ (USQ)

In a USQ, all decision levels are equally spaced by a distance known as the Quantization Step Size (QSS) \(\Delta \), satisfying that the domain of the input signal is divided into intervals of constant size \begin {equation} \Delta ={\mathbf d}_{i+1}-{\mathbf d}_i={\mathbf r}_{i+1}-{\mathbf r}_i, \end {equation} where \({\mathbf d}_i\) is the \(i\)-th decision level and \({\mathbf r}_i\) is the \(i\)-th representation level.

In USQs, the quantization error \(\mathbf e\) depends on \(\Delta \) and can be modeled as a noise signal that: (1) is uncorrelated with the input \(\mathbf s\), (2) is white, and (3) follows a uniform distribution over \([-\Delta /2, \Delta /2]\).
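Under this model, the average energy (variance) of the quantization noise is \begin {equation} \sigma ^2_{\mathbf e}=\frac {1}{\Delta }\int _{-\Delta /2}^{\Delta /2}e^2\,de=\frac {\Delta ^2}{12}. \end {equation}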

2.1 Mid-rise USQ

In mid-rise quantizers the reconstructed signal \(\tilde {\mathbf s}\) is never 0, even if \({\mathbf s}_i=0\) for some \(i\). The mapping process in a mid-rise quantizer can be described as \begin {equation} {\mathbf k}_i = \Big \lfloor \frac {{\mathbf s}_i}{\Delta } \Big \rfloor , \label {eq:mid-rise} \end {equation} and the inverse mapping by \begin {equation} \tilde {\mathbf s}_i = \Delta \Big ({\mathbf k}_i + \frac {1}{2}\Big ). \label {eq:inverse_mid-rise} \end {equation}

Figure 2: A uniform mid-rise quantizer (see the notebook Uniform Midrise Scalar Quantization). \(\Delta =1\) and \(K=13\) (the decision boundaries have been ignored). The decision levels (\(\mathbf d\)) are \(\{\cdots ,-3,-2,-1,0,1,2,3,\cdots \}\) and the representation levels (\(\mathbf r\)) are \(\{\cdots ,-2.5,-1.5,-0.5,0.5,1.5,2.5,\cdots \}\).
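A minimal sketch of the two previous equations in Python (assuming NumPy arrays as input):

```python
import numpy as np

def midrise_quantize(s, delta):
    """Mid-rise mapping: k_i = floor(s_i / delta)."""
    return np.floor(s / delta).astype(int)

def midrise_dequantize(k, delta):
    """Mid-rise inverse mapping: the reconstruction is delta * (k_i + 1/2)."""
    return delta * (k + 0.5)

s = np.array([-1.2, 0.0, 0.4, 2.7])
k = midrise_quantize(s, delta=1)
print(k)                         # [-2  0  0  2]
print(midrise_dequantize(k, 1))  # [-1.5  0.5  0.5  2.5]  (never 0)
```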

2.2 Mid-tread USQ

In mid-tread quantizers the reconstructed signal is \(0\) when \({\mathbf s}_i=0\). The mapping process in a mid-tread quantizer can be described as \begin {equation} {\mathbf k}_i = \mathrm {round}\Big ( \frac {{\mathbf s}_i}{\Delta } \Big ), \label {eq:mid-tread} \end {equation} and the inverse mapping by \begin {equation} \tilde {\mathbf s}_i = \Delta {\mathbf k}_i. \label {eq:inverse_mid-tread} \end {equation}

Figure 3: A uniform mid-tread quantizer (see the notebook Uniform Midrise Scalar Quantization). \(\Delta =1\) and \(K=12\) (the decision boundaries have been ignored). The decision levels (\(\mathbf d\)) are \(\{\cdots ,-2.5,-1.5,-0.5,0.5,1.5,2.5,\cdots \}\) and the representation levels (\(\mathbf r\)) are \(\{\cdots ,-2,-1,0,1,2,\cdots \}\).
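Analogously, a sketch of the mid-tread mapping (note that np.round() rounds halves to the nearest even integer, a minor deviation from the textbook round()):

```python
import numpy as np

def midtread_quantize(s, delta):
    """Mid-tread mapping: k_i = round(s_i / delta)."""
    return np.round(s / delta).astype(int)

def midtread_dequantize(k, delta):
    """Mid-tread inverse mapping: the reconstruction is delta * k_i."""
    return delta * k

s = np.array([-1.2, 0.0, 0.4, 2.7])
k = midtread_quantize(s, delta=1)
print(k)                          # [-1  0  0  3]
print(midtread_dequantize(k, 1))  # [-1.  0.  0.  3.]  (0 is preserved)
```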

2.3 Mid-tread USQ with deadzone

In a USQwD (USQ with Deadzone), the quantization bin containing \({\mathbf s}_i=0\) has size \(2\Delta \). Deadzone quantizers tend to remove the electronic noise (which usually has a small amplitude compared to the input signal \(\mathbf s\)), precisely where the SNR (Signal-to-Noise Ratio) is the lowest.¹

Figure 4: A uniform deadzone quantizer (see the notebook Uniform Midrise Scalar Quantization). \(\Delta =1\) and \(K=12\) (the decision boundaries have been ignored). The decision levels (\(\mathbf d\)) are \(\{\cdots ,-3,-2,-1,1,2,3,\cdots \}\) and the representation levels (\(\mathbf r\)) are \(\{\cdots ,-2,-1,0,1,2,\cdots \}\).
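A common formulation of the deadzone mapping (a sketch; this is one of several variants found in the literature) simply drops the sign before flooring:

```python
import numpy as np

def deadzone_quantize(s, delta):
    """Deadzone mapping: k_i = sign(s_i) * floor(|s_i| / delta), so every
    sample with |s_i| < delta falls into the (2*delta)-wide zero bin."""
    return (np.sign(s) * np.floor(np.abs(s) / delta)).astype(int)

def deadzone_dequantize(k, delta):
    """Inverse mapping: the reconstruction is delta * k_i (0 is exact)."""
    return delta * k

s = np.array([-1.2, -0.9, 0.0, 0.4, 2.7])
k = deadzone_quantize(s, delta=1)
print(k)                          # [-1  0  0  0  2]
print(deadzone_dequantize(k, 1))  # [-1.  0.  0.  0.  2.]
```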

3 Non-uniform quantization

Uniform quantizers are efficient (minimize the distortion) only if the input samples (for example, the gray-scale values of the pixels of an image) are uniformly distributed among the quantization bins [1]. However, when the probability of the samples is not “flat” (or the number of bins is small compared to the number of possible input values), we can use better quantizers. For example, if we know that most of the samples are small integer values (positive and negative²), a quantizer such as gray_SQ_companded.ipynb can be more suitable than a uniform one³.

If we know that the input signal \(\mathbf s\) does not follow a uniform distribution, it is possible to use a variable \(\Delta \) to minimize the quantization error \(\mathbf e\) in those samples that are more probable.

3.1 Companded quantization [6]

Companding (COMpressing + exPANDING) quantization is used when most of the samples are concentrated around the \(0\) value. For example, most of the audio that is interesting to humans has a low volume (for this reason, companded quantizers are used in telephony).

In a companded codec, the original signal is mapped through a compressor, quantized using a uniform quantizer, and re-mapped using the corresponding expander, resulting in a logarithmic quantization, centered at \(0\). A good example is the \(\mu \)-law codec.

Figure 5: Insights of a companded quantizer/dequantizer. See the notebook Companded Quantization.
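A sketch of \(\mu \)-law companded quantization (assuming samples normalized to \([-1,1]\); the mid-tread USQ and the value of \(\Delta \) are illustrative choices):

```python
import numpy as np

mu = 255.0  # mu-law parameter (the value used in North American telephony)

def compress(x):
    """mu-law compressor (x normalized to [-1, 1])."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def expand(y):
    """mu-law expander (the inverse of the compressor)."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def companded_quantize(s, delta):
    """Compress, then quantize uniformly (mid-tread USQ)."""
    return np.round(compress(s) / delta).astype(int)

def companded_dequantize(k, delta):
    """Dequantize uniformly, then expand."""
    return expand(delta * k)

s = np.array([-0.5, -0.01, 0.001, 0.2])
delta = 2 / 32  # 32 uniform bins over the compressed range [-1, 1]
k = companded_quantize(s, delta)
print(k, companded_dequantize(k, delta))
```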

3.2 PDF-optimized quantization

This non-uniform quantizer is a generalization of the companded quantizer, where the input samples can follow any distribution. Now, with the idea of minimizing the distortion (in general, in terms of the MSE), we choose a smaller \({\mathbf \Delta }_i\) where the signal samples appear most often.

The most widely used PDF-optimized quantizer is the Lloyd-Max quantizer [5], whose authors developed an iterative algorithm for determining the decision and representation levels.

Figure 6: A Lloyd-Max quantizer.


Notice that the Lloyd-Max quantizer is equivalent to using K-means [3] if we know the parameter \(K\) (the number of representation levels). In this case, the centroids computed by K-means become the representation levels, and the decision levels lie at the midpoints between consecutive centroids. If we don’t know \(K\), we must use the Lloyd Algorithm [3], which also estimates \(K\).
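A minimal sketch of this iteration (the training samples and the Laplacian source are illustrative; for the MSE, the representation levels are the cell centroids and the decision levels are the midpoints between consecutive representation levels):

```python
import numpy as np

def lloyd_max(s, K, iters=50):
    """Estimate the K representation levels (r) and the K-1 interior decision
    levels (d) that locally minimize the MSE over the training samples s."""
    r = np.linspace(s.min(), s.max(), K)  # initial representation levels
    for _ in range(iters):
        d = (r[:-1] + r[1:]) / 2          # decision levels: midpoints between r's
        k = np.digitize(s, d)             # classify each sample into a cell
        for j in range(K):                # representation levels: cell centroids
            if np.any(k == j):
                r[j] = s[k == j].mean()
    return d, r

s = np.random.default_rng(0).laplace(size=10_000)  # a non-uniform source
d, r = lloyd_max(s, K=8)
print(np.round(d, 2))
print(np.round(r, 2))
```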

4 Adaptive quantization

Adaptive quantizers modify \(\Delta \) dynamically, depending on the local characteristics of \(\mathbf s\).

4.1 Forward adaptive quantization

5 Backward adaptive quantization

5.1 The Jayant quantizer [4]

5.2 Adapting with a scale factor

6 RD performance

Normally, RD curves are convex (this can be seen in the notebook A Comparison of Scalar Quantizers). This means that:

  1. At low bit-rates the distortion decreases faster than at high bit-rates.
  2. If we have a scalable code-stream (we can decide how the code-stream will be decompressed), we should be aware that some parts of the code-stream can decrease the distortion faster (per bit) than others.

As can also be seen in that notebook, the performance of the quantizers is not the same: usually mid-rise and mid-tread perform better than deadzone at intermediate bit-rates, but deadzone is the best at low bit-rates (excluding Lloyd-Max). Deadzone has also another advantage over mid-rise and mid-tread: when \(\Delta \) is a power of 2 (which corresponds to a bit-plane encoding), the obtained RD point is near optimal in the RD space. Finally, the Lloyd-Max quantizer reaches the highest performance because it is adapted to the distribution of the input.
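These experiments can be reproduced by sweeping \(\Delta \) and measuring one RD point per value, using for example the zero-order entropy of the indexes as the rate and the MSE as the distortion (a sketch with a mid-tread USQ and a synthetic Gaussian source):

```python
import numpy as np

def entropy(k):
    """Zero-order entropy (bits/sample) of the quantization indexes."""
    _, counts = np.unique(k, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

s = np.random.default_rng(0).normal(size=10_000)
for delta in (0.125, 0.25, 0.5, 1.0, 2.0):
    k = np.round(s / delta).astype(int)  # mid-tread USQ
    s_tilde = delta * k                  # reconstruction
    R = entropy(k)                       # rate (bits/sample)
    D = np.mean((s - s_tilde) ** 2)      # distortion (MSE)
    print(f"delta={delta}: R={R:.2f} bits/sample, D={D:.4f}")
```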

7 Perceptual quantization

7.1 In audio

An important consideration is the relative perceptual importance of the input samples. This leads to a weighting of the MSE at the output. The weighting function can be derived through experiments that determine the “level of just noticeable noise”. For example, in subband coding, as expected, high-frequency subbands tolerate more noise because the HAS (Human Auditory System) is less sensitive at them.
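For instance, a perceptually weighted MSE over a set of subbands could be computed as follows (a sketch; the weights are hypothetical and would come from the just-noticeable-noise experiments mentioned above):

```python
import numpy as np

# Hypothetical perceptual weights: higher-frequency subbands get smaller
# weights because the HAS tolerates more noise in them.
w = [1.0, 0.8, 0.5, 0.2]

def weighted_mse(subbands, subbands_tilde):
    """Perceptually weighted MSE between the original and the reconstruction."""
    return sum(w_b * np.mean((sb - sb_t) ** 2)
               for w_b, sb, sb_t in zip(w, subbands, subbands_tilde))

rng = np.random.default_rng(0)
subbands = [rng.normal(size=1_000) for _ in w]
# Quantize every subband with a mid-rise USQ (delta = 0.5).
subbands_tilde = [0.5 * (np.floor(sb / 0.5) + 0.5) for sb in subbands]
print(weighted_mse(subbands, subbands_tilde))
```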

7.2 In images

Normally, humans can hardly distinguish more than 64 different colors.

8 Quantization error

The quantization error can be modeled as a random signal \(\mathbf {e}\) that is added to the original one \(\mathbf {s}\), and the energy of this error signal \(\langle \mathbf {e},\mathbf {e}\rangle \) depends on the QSS and the values of \(\mathbf {s}\).
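A quick empirical check of this model (a sketch with a mid-tread USQ; notice that for large \(\Delta \), where the uniform-noise assumption breaks down, the measured energy departs from the \(\Delta ^2/12\) prediction):

```python
import numpy as np

s = np.random.default_rng(1).normal(size=10_000)
for delta in (0.25, 0.5, 1.0, 2.0):
    e = s - delta * np.round(s / delta)  # quantization error of a mid-tread USQ
    print(delta,
          np.dot(e, e) / len(e),         # measured error energy per sample
          delta ** 2 / 12)               # prediction of the uniform-noise model
```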

9 RD optimized quantizer design

So far, we have designed the image compressor in two steps: (1) quantize the pixels ... trying to minimize the distortion, and (2) entropy encode (compress) the quantization indexes. However, if we assume that the entropy encoding performance is proportional to some rate-prediction metric, such as the entropy, it is possible to design the quantizer minimizing the two RD variables (rate and distortion) at the same time [6].

However, there is not (yet) any known reason to think that this approach generates better RD curves than the one in which both parameters (R and D) are minimized independently. Besides, the joint optimization of R and D makes the quantizer dependent on the entropy codec, which reduces the flexibility of the encoding system.
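A sketch of such a joint design, choosing the \(\Delta \) that minimizes the Lagrangian cost \(J=D+\lambda R\) over a set of candidates (the candidate list and \(\lambda \) are hypothetical, and the zero-order entropy is used as the rate-prediction metric):

```python
import numpy as np

def entropy(k):
    """Zero-order entropy (bits/sample) of the quantization indexes."""
    _, counts = np.unique(k, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_delta(s, deltas, lam):
    """Return the QSS that minimizes J = D + lam * R among the candidates."""
    costs = []
    for delta in deltas:
        k = np.round(s / delta).astype(int)  # mid-tread USQ
        D = np.mean((s - delta * k) ** 2)    # distortion (MSE)
        R = entropy(k)                       # rate proxy (bits/sample)
        costs.append(D + lam * R)
    return deltas[int(np.argmin(costs))]

s = np.random.default_rng(0).normal(size=10_000)
print(best_delta(s, deltas=[0.25, 0.5, 1.0, 2.0], lam=0.05))
```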

10 References

[1]   V. González-Ruiz. Scalar Quantization.

[2]   V. González-Ruiz. Signal Quantization.

[3]   John A. Hartigan and Manchek A. Wong. Algorithm AS 136: A K-means clustering algorithm. Journal of the Royal Statistical Society, Series C (Applied Statistics), 28(1):100–108, 1979.

[4]   Nuggehally S. Jayant. Digital coding of speech waveforms: PCM, DPCM, and DM quantizers. Proceedings of the IEEE, 62(5):611–632, 1974.

[5]   Stuart Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.

[6]   K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.

[7]   M. Vetterli, J. Kovačević, and V.K. Goyal. Foundations of Signal Processing. Cambridge University Press, 2014.

¹ Notice that, by definition, dead-zone quantizers should not be considered uniform, and that all dead-zone quantizers, by definition, are mid-tread.

² As happens, for example, with audio CD data.

³ At least, from a pure-distortion point of view.
