Scalar (Digital) Quantization [6, 7] (see Fig. 1) is a technique in which each source sample is quantized independently of the other samples; therefore, a quantization index \({\mathbf k}_i\) is produced for each input sample \({\mathbf s}_i\) [2].
A \(K\)-level Scalar Quantizer (SQ) \(Q\) partitions the domain of \(\mathbf s\) into \(K\) cells \({\mathbf C}_k = ]{\mathbf d}_{k-1}, {\mathbf d}_k], k = 1, \cdots , K\), where the values \({\mathbf d}_k\) are called the decision boundaries (or decision levels) of \(Q\), and associates to any \({\mathbf s}_i\) the quantization index \({\mathbf k}_i\) if \({\mathbf s}_i\in {\mathbf C}_k\). In other words, \begin{equation} Q({\mathbf s}_i) = {\mathbf k}_i \Leftrightarrow {\mathbf d}_{k-1} < {\mathbf s}_i \le {\mathbf d}_k. \end{equation}
The inverse quantizer \(Q^{-1}\) estimates \({\mathbf s}_i\) knowing \({\mathbf k}_i\) and possibly the PDF (Probability Density Function) \(p_{\mathbf S}({\mathbf s})\), using a reconstruction (or representation) level \({\mathbf r}_k\in ]{\mathbf d}_{k-1}, {\mathbf d}_k]\), generating the output \begin{equation} \tilde {\mathbf s}_i = {\mathbf r}_k. \end{equation}
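For example, a minimal sketch of such a generic \(K\)-level SQ in Python (the decision and representation levels used below are illustrative choices, not prescribed values):

```python
# A generic K-level scalar quantizer: map each sample to the index of
# the cell ]d_{k-1}, d_k] that contains it, and reconstruct with r_k.
import numpy as np

def quantize(s, d):
    # side="left" assigns a sample equal to a boundary d_k to cell k.
    return np.searchsorted(d[1:-1], s, side="left") + 1

def dequantize(k, r):
    # Replace each quantization index by its representation level.
    return r[k - 1]

d = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])   # decision levels (K = 4 cells)
r = np.array([-3.0, -1.0, 1.0, 3.0])        # one representation level per cell
s = np.array([-3.5, -0.1, 0.0, 1.9, 3.9])
k = quantize(s, d)
print(k, dequantize(k, r))
```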
In a Uniform Scalar Quantizer (USQ), all decision levels are equally spaced by a distance known as the Quantization Step Size (QSS) \(\Delta \), so that the domain of the input signal is divided into intervals of constant size \begin{equation} \Delta ={\mathbf d}_{i+1}-{\mathbf d}_i={\mathbf r}_{i+1}-{\mathbf r}_i, \end{equation} where \({\mathbf d}_i\) is the \(i\)-th decision level and \({\mathbf r}_i\) is the \(i\)-th representation level.
In USQs, the quantization error \(\mathbf e\) depends on \(\Delta \) and can be modeled as a noise signal that: (1) is uncorrelated with the input \(\mathbf s\), (2) is white (its energy is evenly spread over all frequencies), and (3) follows a uniform distribution over \([-\Delta /2, \Delta /2]\) (and therefore has variance \(\Delta ^2/12\)).
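This model can be checked numerically. A sketch (using the rounding-based, mid-tread USQ described below):

```python
# Quantize uniformly distributed samples and inspect the error statistics.
import numpy as np

rng = np.random.default_rng(0)
delta = 0.5
s = rng.uniform(-8, 8, 100_000)
s_tilde = delta * np.round(s / delta)      # uniform (mid-tread) quantization
e = s - s_tilde

print("error range   :", e.min(), e.max())          # approx. [-delta/2, delta/2]
print("error variance:", e.var(), "vs delta^2/12 =", delta**2 / 12)
print("corr(e, s)    :", np.corrcoef(e, s)[0, 1])   # approx. 0
```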
In mid-rise quantizers the reconstructed signal \(\tilde {\mathbf s}\) is never 0, even if \({\mathbf s}_i=0\) for some \(i\). The mapping process in a mid-rise quantizer can be described as \begin{equation} {\mathbf k}_i = \Big \lfloor \frac {{\mathbf s}_i}{\Delta } \Big \rfloor , \label{eq:mid-rise} \end{equation} and the inverse mapping by \begin{equation} \tilde {\mathbf s}_i = \Delta \Big ({\mathbf k}_i + \frac {1}{2}\Big ). \label{eq:inverse_mid-rise} \end{equation}
In mid-tread quantizers the reconstructed signal is \(0\) when \({\mathbf s}_i=0\). The mapping process in a mid-tread quantizer can be described as \begin{equation} {\mathbf k}_i = \mathrm {round}\Big ( \frac {{\mathbf s}_i}{\Delta } \Big ), \label{eq:mid-tread} \end{equation} and the inverse mapping by \begin{equation} \tilde {\mathbf s}_i = \Delta {\mathbf k}_i. \label{eq:inverse_mid-tread} \end{equation}
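Both pairs of mappings translate directly into NumPy. A sketch (note that np.round rounds halves to the nearest even integer):

```python
import numpy as np

def midrise_quantize(s, delta):
    return np.floor(s / delta).astype(int)          # k_i = floor(s_i / delta)

def midrise_dequantize(k, delta):
    return delta * (k + 0.5)                        # s~_i = delta * (k_i + 1/2)

def midtread_quantize(s, delta):
    return np.round(s / delta).astype(int)          # k_i = round(s_i / delta)

def midtread_dequantize(k, delta):
    return delta * k                                # s~_i = delta * k_i

s = np.array([-1.2, -0.4, 0.0, 0.4, 1.2])
print(midrise_dequantize(midrise_quantize(s, 1.0), 1.0))   # never 0
print(midtread_dequantize(midtread_quantize(s, 1.0), 1.0)) # 0 maps to 0
```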
In a USQwD (USQ with Deadzone), the bin around \({\mathbf s}_i=0\) is \(2\Delta \) wide, so all samples with \(|{\mathbf s}_i| < \Delta \) are mapped to the index \(0\). Deadzone quantizers tend to remove the electronic noise (which usually has a small amplitude compared to the input signal \(\mathbf s\)), precisely where the SNR (Signal-to-Noise Ratio) is the lowest.1
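A sketch of one common formulation, in which the deadzone is the interval \(]-\Delta , \Delta [\) and the reconstruction offset of \(\Delta /2\) is an illustrative choice:

```python
import numpy as np

def deadzone_quantize(s, delta):
    # The zero bin spans ]-delta, delta[, i.e., it is 2*delta wide.
    return (np.sign(s) * np.floor(np.abs(s) / delta)).astype(int)

def deadzone_dequantize(k, delta):
    # Reconstruct inside each cell; index 0 maps back to exactly 0.
    return np.sign(k) * (np.abs(k) + 0.5) * delta

s = np.array([-1.7, -0.9, -0.1, 0.0, 0.1, 0.9, 1.7])
k = deadzone_quantize(s, 1.0)
print(k)                             # low-amplitude samples -> 0
print(deadzone_dequantize(k, 1.0))
```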
Uniform quantizers are efficient (minimize the distortion) only if the input samples (for example, the gray-scale values of the pixels) are uniformly distributed among the quantization bins [1]. However, when the probability distribution of the samples is not “flat” (or the number of bins is small compared to the number of possible input values), we can use better quantizers. For example, if we know that most of the samples are small integer values (positive and negative2), a quantizer such as the one of gray_SQ_companded.ipynb can be more suitable than a uniform one3.
If we know that the input signal \(\mathbf s\) does not follow a uniform distribution, it is possible to use a variable \(\Delta \) to minimize the quantization error \(\mathbf e\) for those samples that are more probable.
Companding (COMpressing + exPANDING) quantization is used when most of the samples are concentrated around the \(0\) value. For example, most of the (interesting) audio for humans has a low volume (for this reason, companded quantizers are used in telephony).
In a companded codec, the original signal is mapped through a compressor, quantized using a uniform quantizer, and re-mapped using the corresponding expander, resulting in a logarithmic quantization centered at \(0\). A good example is the \(\mu \)-law codec.
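The following sketch builds such a codec around the \(\mu \)-law compressor, assuming \(\mu =255\) (the value used in North American telephony) and an input normalized to \([-1, 1]\); the uniform stage is mid-tread:

```python
import numpy as np

MU = 255.0

def compress(x):
    # mu-law compressor: sign(x) * ln(1 + mu*|x|) / ln(1 + mu)
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def expand(y):
    # mu-law expander (the inverse mapping of compress)
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU

def companded_quantize(s, delta):
    return np.round(compress(s) / delta).astype(int)

def companded_dequantize(k, delta):
    return expand(delta * k)

s = np.array([-0.5, -0.01, 0.001, 0.25])
k = companded_quantize(s, 2 / 256)     # 256 uniform levels over [-1, 1]
print(k, companded_dequantize(k, 2 / 256))
```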
This non-uniform quantizer is a generalization of the companded quantizer, where the input samples can follow any distribution. Now, with the idea of minimizing the distortion (in general, in terms of the MSE), we choose a smaller \({\mathbf \Delta }_i\) where the signal samples appear most often.
The most commonly used PDF-optimized quantizer is the Max-Lloyd quantizer [5], whose authors developed an iterative algorithm for determining the decision and representation levels.
Notice that the Max-Lloyd quantizer is equivalent to using K-means [3] if we know the \(K\) parameter (the number of representation levels). In this case, the centroids computed by K-means (the means of the samples assigned to each region) are the representation levels. If we don’t know \(K\), we must use the Lloyd Algorithm [3], which also estimates \(K\).
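The following sketch runs the (empirical) Lloyd-Max iteration for a fixed \(K\): it alternately places the decision boundaries at the midpoints between consecutive representation levels and moves each representation level to the mean (centroid) of the samples of its cell. The Laplacian source and the quantile initialization are illustrative choices:

```python
import numpy as np

def lloyd_max(s, K, iters=50):
    # Initialize the representation levels with K quantiles of s.
    r = np.quantile(s, (np.arange(K) + 0.5) / K)
    for _ in range(iters):
        d = (r[:-1] + r[1:]) / 2               # boundaries: midpoints of r
        k = np.searchsorted(d, s)              # assign samples to cells
        for j in range(K):                     # levels: cell centroids
            if np.any(k == j):
                r[j] = s[k == j].mean()
    return r

rng = np.random.default_rng(0)
s = rng.laplace(0, 1, 10_000)                  # peaked, non-uniform source
print(lloyd_max(s, 4))                         # levels crowd around 0
```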
Adaptive quantizers modify \(\Delta \) dynamically, depending on the local characteristics of \(\mathbf s\).
Forward adaptive quantizers compute \(\Delta \) from the input samples themselves and transmit it as side information:

1. While there are samples in \(s\):
   (a) Read the next block of samples.
   (b) Compute and output the \(\Delta \) to be used for the block.
   (c) Quantize and output the block using \(\Delta \).

The corresponding dequantizer performs:

1. While there is data in the input:
   (a) Read \(\Delta \).
   (b) Read and dequantize the next block of quantization indices using \(\Delta \).

Backward adaptive quantizers derive \(\Delta \) from the previously generated quantization indices, so no side information needs to be transmitted:

1. Initialize \(\Delta \).
2. While \(s\) is not exhausted:
   (A) Quantize the next sample using \(\Delta \).
   (B) Update \(\Delta \), using only the quantization indices generated so far.

The dequantizer mirrors the adaptation:

1. Initialize \(\Delta \) to the same value.
2. While \(\hat {s}\) (the sequence of quantization indices) is not exhausted:
   (A) Dequantize the next index using \(\Delta \).
   (B) Update \(\Delta \) with the same rule as the quantizer.
A Jayant quantizer [4] defines Step 2.B as follows: define a multiplier \(M_l\) for each quantization level \(l\), where for the inner levels \(M_l<1\) and for the outer levels \(M_l>1\), and compute:
\[ \Delta ^{[n]} = \Delta ^{[n-1]}{M_l}^{[n-1]}, \]
where \(\Delta ^{[n-1]}\) is the previous quantization step and \({M_l}^{[n-1]}\) the level multiplier used for the \((n-1)\)-th (previous) sample. Thus, if the previous quantization (of sample \(n-1\)) used a \(\Delta ^{[n-1]}\) that was too small (producing outer quantization levels), then \(\Delta ^{[n]}\) will be larger, and vice versa.
Most Jayant quantizers clip \(\Delta \) to a range \([\Delta _{\text {min}}, \Delta _{\text {max}}]\): the lower bound prevents \(\Delta \) from collapsing to zero in those contexts where \(s\) is zero or very close to zero, and the upper bound improves the adaptation to smaller samples after a sequence of bigger ones (preventing \(\Delta \) from growing without limit):
\[ \begin {array}{ll} \text {if}~\Delta ^{[n]}<\Delta _{\text {min}}~\text {then}~\Delta ^{[n]} = \Delta _{\text {min}},\\ \text {if}~\Delta ^{[n]}>\Delta _{\text {max}}~\text {then}~\Delta ^{[n]} = \Delta _{\text {max}}. \end {array} \]
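A sketch of the resulting backward-adaptive quantizer (the multipliers, the initial \(\Delta \), and the clipping limits are illustrative choices):

```python
import numpy as np

def jayant_quantize(s, M, delta0=1.0, dmin=0.01, dmax=100.0):
    delta, k = delta0, []
    for sample in s:
        ki = int(round(sample / delta))              # Step 2.A: quantize
        ki = np.clip(ki, -(len(M) - 1), len(M) - 1)  # stay inside the levels
        k.append(ki)
        # Step 2.B: adapt Delta with the multiplier of the used level,
        # clipping the result to [dmin, dmax].
        delta = np.clip(delta * M[abs(ki)], dmin, dmax)
    return k

M = [0.8, 0.9, 1.0, 1.2, 1.6]     # inner levels < 1, outer levels > 1
s = np.concatenate([np.full(10, 8.0), np.full(10, 0.05)])
print(jayant_quantize(s, M))      # Delta grows, then shrinks again
```

Notice that the update depends only on the emitted indices, so the dequantizer can reproduce the same sequence of \(\Delta \)'s without any side information.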
A Jayant quantizer adapts the quantization step to the dynamic range of the signal using a set of multipliers. A similar effect can be obtained by dividing the input signal by a scale factor defined iteratively as \begin{equation} \alpha ^{[n]} = \alpha ^{[n-1]}M_l^{[n-1]}. \end{equation}
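A sketch of this gain-adaptive variant, mirroring the previous Jayant example (again, M and the initial values are illustrative):

```python
def gain_adaptive_quantize(s, M, delta=1.0, alpha0=1.0):
    alpha, k = alpha0, []
    for sample in s:
        ki = int(round(sample / (alpha * delta)))    # quantize s_i / alpha
        ki = max(-(len(M) - 1), min(len(M) - 1, ki))
        k.append(ki)
        alpha *= M[abs(ki)]        # alpha^[n] = alpha^[n-1] * M_l^[n-1]
    return k

M = [0.8, 0.9, 1.0, 1.2, 1.6]
print(gain_adaptive_quantize([8.0] * 10 + [0.05] * 10, M))
```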
Normally, RD curves are convex (this can be seen in the notebook A Comparison of Scalar Quantizers). This means that the distortion decreases monotonically with the rate, and that each additional bit of rate reduces the distortion less than the previous one.
As can also be seen in that notebook, the performance of the quantizers is not the same: usually, mid-rise and mid-tread perform better than deadzone at intermediate bit-rates, but deadzone is the best at low bit-rates (excluding Lloyd-Max). Deadzone has another advantage over mid-rise and mid-tread: when \(\Delta \) is a power of 2 (which corresponds to a bit-plane encoding), the obtained RD point is near-optimal in the RD space. Finally, the Lloyd-Max quantizer reaches the highest performance because it is adaptive.
An important consideration is the relative perceptual importance of the input samples. This leads to a weighting of the MSE at the output. The weighting function can be derived through experiments that determine the “level of just noticeable noise”. For example, in subband coding, as expected, high-frequency subbands tolerate more noise because the HAS (Human Auditory System) is less sensitive at those frequencies.
Similarly, in the case of images, humans can normally hardly distinguish more than 64 different colors.
The quantization error can be modeled as a random signal \(\mathbf {e}\) that is added to the original one \(\mathbf {s}\), and the energy of this error signal \(\langle \mathbf {e},\mathbf {e}\rangle \) depends on the QSS and the values of \(\mathbf {s}\).
So far, we have designed the image compressor in two steps: (1) quantize the pixels ... trying to minimize the distortion, and (2) entropy-encode (compress) the quantization indexes. However, if we assume that the entropy encoding performance is proportional to some rate-prediction metric, such as the entropy, it is possible to design the quantizer minimizing the two RD variables (rate and distortion) at the same time [6].
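For example, taking the zero-order entropy of the quantization indices as the rate metric, we can search for the \(\Delta \) that minimizes the Lagrangian cost \(J = D + \lambda R\). A sketch (the Laplacian source, the candidate \(\Delta \)'s, and \(\lambda = 0.1\) are illustrative choices):

```python
import numpy as np

def entropy(k):
    # Zero-order entropy of the indices, in bits per sample.
    p = np.bincount(k - k.min()) / len(k)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

rng = np.random.default_rng(0)
s = rng.laplace(0, 1, 100_000)
lam, best = 0.1, None
for delta in (0.1, 0.25, 0.5, 1.0, 2.0):
    k = np.round(s / delta).astype(int)          # mid-tread quantization
    R, D = entropy(k), ((s - delta * k) ** 2).mean()
    J = D + lam * R                              # joint RD cost
    best = min(best or (J, delta), (J, delta))
    print(f"delta={delta}: R={R:.2f} bits, D={D:.4f}, J={J:.4f}")
print("best delta for lambda=0.1:", best[1])
```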
However, there is not (yet) any known reason to think that this approach generates better RD curves than the one in which both parameters (R and D) are minimized independently. Besides, the joint optimization of R and D makes the quantizer dependent on the entropy codec, which reduces the flexibility of the encoding system.
[1] V. González-Ruiz. Scalar Quantization.
[2] V. González-Ruiz. Signal Quantization.
[3] John A. Hartigan and Manchek A. Wong. Algorithm AS 136: A K-means clustering algorithm. Journal of the Royal Statistical Society, Series C (Applied Statistics), 28(1):100–108, 1979.
[4] Nuggehally S. Jayant. Digital coding of speech waveforms: PCM, DPCM, and DM quantizers. Proceedings of the IEEE, 62(5):611–632, 1974.
[5] Stuart Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.
[6] K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.
[7] M. Vetterli, J. Kovačević, and V.K. Goyal. Foundations of Signal Processing. Cambridge University Press, 2014.
1Notice that, strictly speaking, dead-zone quantizers should not be considered uniform, and that all dead-zone quantizers are, by definition, mid-tread.
2Such as it happens with audio CD data.
3At least, from a pure-distortion point of view.