The DWT (Discrete Wavelet Transform)

Vicente González Ruiz - Depto Informática - UAL

December 28, 2022

1 The transform
  1.1 1D case
   1.1.1 An example of DWT: the Haar DWT
   1.1.2 Multiresolution analysis
   1.1.3 Filters
  1.2 2D case
  1.3 Multiresolution
   1.3.1 Multiresolution
   1.3.2 Multiresolution
  1.4 Quantization
   1.4.1 Quantization
   1.4.2 Quantization
2 References

1 The transform

Technically, the DWT (Discrete Wavelet Transform) is a linear basis expansion which computes a critically-sampled octave-band decomposition [4, 1]. As the rest of transforms:

The output of the DWT describes the (input) signal in a set of subbands, producing the so called a decomposition.
The index of the subband is related to the frequency of the signal. For example, in the case of the images, the position of the coefficients in the subbands is related to the spatial area where the corresponding pixels are found.

1.1 1D case

The 1D-DWT convolves the input signal with a set of basis functions (you can see the basis functions of several DWTs in the Wavelet Browser) defined by a cascade of filters banks, building a dyadic signal decomposition in the frequency domain.

1.1.1 An example of DWT: the Haar DWT

The DWT is also described by the filters of the analysis matrix transform (the matrix $\mathbf K$ in previous milestones). Haar defined the analysis downsampling filters \begin {equation} \begin {bmatrix} {\mathbf l}^{s+1}_n \\ {\mathbf h}^{s+1}_n \end {bmatrix} = \begin {bmatrix} \frac {1}{\sqrt {2}} & \frac {1}{\sqrt {2}} \\ \frac {1}{\sqrt {2}} & \frac {-1}{\sqrt {2}} \end {bmatrix} \begin {bmatrix} {\mathbf l}^s_{2n} \\ {\mathbf l}^s_{2n+1} \end {bmatrix}, \label {eq:Haar_transform} \end {equation} where the superindex $s$ denotes the subband, and $n$ refers to the $n$-th element of the signal. By definition (and notice that this holds for all DWTs), \begin {equation} {\mathbf l}^0={\mathbf x}. \end {equation}

Figure 1: Haar filters’s response in the frequency domain (see this notebook). $|{\mathbf K}_i(e^{j\omega })|$ denotes the modulus of the Fourier transform of ${\mathbf K}_i$.

Figure 2: Daubechies-5 filters’s response in the frequency domain (see this notebook).

Figure 3: Biorthiogonal-3.5 filters’s response in the frequency domain (see this notebook).

Figure 4: Phase response of the Haar filters (see this notebook).

Figure 5: Phase response of the Daubechies-5 filters (see the notebook Study of some wavelet filters). Notice that we obtained phase responses are not completely linear (it is not a straight line).

As it can be seen in the Fig. 1, ${\mathbf K}_0$ is a low-pass filter and ${\mathbf K}_1$ is a high-pass filter (this holds for all DWTs). Considering the Haar filter, we can conclude that:

There exists aliasing between the filters (this is also true for all DWTs) and this is a drawback because:
1. The same information can be found in both subbands ($\mathbf l$ and $\mathbf h$). Therefore, the concentration of the energy in one of the subbands is smaller.
2. As it can be seen also in Eq. \eqref{eq:Haar_transform}, the subbands are downsampled by 2 (action usually represented by $\downarrow 2$) and both subbands should have, at most, a bandwidth of $\frac {1}{2}$ radians/sample in order to avoid the aliasing during the subsampling, thus increasing the perceptible quality of the low-pass subband $\mathbf l$.
These problems can be only solved using filters that have transfer functions that overlaps a smaller area¹ (see Figs. 2 and 3).
There is not phase distortion (the phase of the filters is a linear function of the frequency, see the Fig. 4). This means that both coefficients ${\mathbf l}^{s+1}_n$ and ${\mathbf h}^{s+1}_n$ refers to the same section of signal $\{{\mathbf l}^s_{2n}, {\mathbf l}^s_{2n+1}\}$, allowing the design of entropy codecs for the decomposition which exploit the correlation between subbands. Notice that this does not hold for all filters (see Fig. 5).

Figure 6: Computation of the $S$-levels DWT and the generated subbands.

Eq. \eqref{eq:Haar_transform} computes the 1-levels Haar DWT. A more general expresion for this equation (and this holds for all DWTs) is \begin {equation} {\mathbf l}^{s+1} | {\mathbf h}^{s+1} = \text {DWT}({\mathbf l}^s), \label {eq:DWT} \end {equation} where $\cdot |\cdot $ denotes the concatenation of subbands. As can be seen in the Fig. 6, it’s possible to compute the $S$-levels DWT, using Eq. \eqref{eq:DWT} iteratively over the low-frequency subband, generating the decomposition \begin {equation} {\mathbf l}^S_0 | {\mathbf h}^S_0 | {\mathbf h}^{S-1}_0 {\mathbf h}^{S-1}_1 | {\mathbf h}^{S-2}_0 {\mathbf h}^{S-2}_1 {\mathbf h}^{S-2}_2 {\mathbf h}^{S-2}_3 | \cdots | {\mathbf h}^1_0 h^1_1 \cdots {\mathbf h}^1_{2^{n-1}-1}=\text {DWT}^S({\mathbf l}^0), \label {eq:S_levels_DWT} \end {equation} where \begin {equation} n = \log _2(N) \end {equation} where $N$ is the number of samples.

In the case of the Haar filters, the synthesis (inverse) $1$-levels DWT (that we will denote by $\text {DWT}^{-1}$) is the result of solving the coefficients ${\mathbf l}^{s+1}_n$ and ${\mathbf h}^{s+1}_n$ in the Eq. \eqref{eq:Haar_transform}. In general, for longer DWT filters we are going to have more coefficients, but the same procedure can be used for finding the synthesis transform. Therefore, it can be written that \begin {equation} {\mathbf l}^s = \text {DWT}^{-1}({\mathbf l}^{s+1} | {\mathbf h}^{s+1}), \end {equation} and $\text {DWT}^{-S}$, the $S$-levels synthesis DWT, can be computed as it is also described in the Fig. 6, by simply reversing the analysis steps.

1.1.2 Multiresolution analysis

In the decomposition described in Eq. \eqref{eq:S_levels_DWT} there are $S+1$ subbands, and therefore, it is possible to compute $S+1$ resolution levels of the analyzed signal ${\mathbf l}^0$:

${\mathbf l}^S$: the resolution level $S$ of ${\mathbf l}^0$ (no $\text {DWT}^{-1}$ has been applied in the synthesis process), with have only one sample ${\mathbf l}^S_0$ that uses to be the arithmetic mean of ${\mathbf l}^0$. The coefficient ${\mathbf l}^S_0$ is also called the DC (Direct Current) coefficient. On the contrary, the rest of coefficents (high-frequency subbands) are called the AC (Alternating Current) coefficients.
${\mathbf l}^{S-1}=\text {DWT}^{-1}({\mathbf l}^S | {\mathbf h}^S)$, the resolution level $S-1$ of ${\mathbf l}^0$, with two samples.
${\mathbf l}^{S-2}=\text {DWT}^{-1}({\mathbf l}^{S-1} | {\mathbf h}^{S-1})$, the resolution level $S-1$ of ${\mathbf l}^0$, with four samples.
$\vdots $
The resolution level $0$, ${\mathbf l}^0=\text {DWT}^{-1}({\mathbf l}^1 | {\mathbf h}^1)$.

1.1.3 Filters

Figure 7: A 2-channels PRFB (Perfect Reconstruction Filter Bank). See the notebook A Perfect Reconstruction Filter Bank (PRFB).

All DWTs are defined by 4 filters (using a similar notation that the followed by PyWavelets) (see Fig. 7):

Analysis (downsampling) low-pass filter $\tilde \phi $ (the decomposition scaling function), which computes ${\mathbf l}^{s+1}$ from ${\mathbf l}^s$.
Analysis (downsampling) high-pass filter $\tilde \psi $ (the decomposition wavelet function), which computes ${\mathbf h}^{s+1}$ from ${\mathbf l}^s$.
Synthesis (upsampling) high-pass filter $\phi $ (the reconstruction scaling function), which computes $\{{\mathbf l}^s_{2n}\}$ from ${\mathbf l}^{s+1}$ and ${\mathbf h}^{s+1}$.
Synthesis (upsampling) high-pass filter $\psi $ (the reconstruction wavelet function), which computes $\{{\mathbf l}^s_{2n+1}\}$ from ${\mathbf l}^{s+1}$ and ${\mathbf h}^{s+1}$.

In the context of the DWT, orthogonality provides intersubband decorrelation, which basically means that the contribution of each subbands to the reconstruction (and this holds for all DWTs) of the signal are independent. Orthogonal filters can be recognized because:

$\tilde \phi \bot \tilde \psi $ and $\phi \bot \psi $, where $\bot $ denotes orthogonality.
Their taps are all different (the filters are asymmetric).
The taps of the synthesis filters can be determined simply by “reflecting” the taps of the analysis filters.
In general, except for the Haar DWT, orthogonal filters generate phase distortion.

On the other hand, in a biorthogonal DWT:

$\tilde \phi \bot \tilde \psi $, $\phi \bot \psi $, $\psi \bot \tilde \phi $ and $\tilde \psi \bot \phi $.
All the filters are symmetric.
The filters satisfy that \begin {equation} \psi =(-1)^n\tilde \phi ;~n\in \mathbb {N}, \end {equation} and \begin {equation} \phi =(-1)^n\tilde \psi ;~n\in \mathbb {N} \end {equation} (the odd taps are multiplied by $-1$). Therefore, if we have the analysis filters we can deduce the synthesis filters and viceversa.
No phase distortion is generated.
In orthogonal DWTs, all the subbands have the same gain, but this is not true for biorthogonal DWTs.

1.2 2D case

(Digital) Images are 2D (2-Dimensional, discrete in space and amplitude) signals. The 2D-DWT of an image can be computed using (1) separable 1D filters, and (2) nonseparable 2D filters [3]. Except in very special cases, all 2D-DWT implementations use separable filters by simplicity. The Figure 8 show an example of basis functions used by a $3$-levels 2D-DWT.

Figure 8: 2D-DWT basis functions (see the notebook 2D-DWT basis).

Separability in the DWT context means that we can compute the 1-levels 2D-DWT using the 1D filters, by applying them to each dimension, and using in-place operations. This procedure has been described in the Algorithms 1.2, 1.2 and 1.2, where ${\mathbf X}_{r,*}$ refers to the $r$-th row of the matrix $\mathbf X$ and ${\mathbf X}_{*,c}$ to the $c$-th column, being $R$ and $C$ the number of rows and columns of the input image $\mathbf X$. See also the Fig. 9.

$\mathsf {Rows\_DWT}(\mathbf {X})$: $\rightarrow \mathbf {X}$

$\mathsf {for}~r~\mathsf {in~range}(\mathbf {X}.\mathsf {shape}[0])$:
1. ${\mathbf X}[r,:]\leftarrow \mathsf {DWT}(\mathbf {X}[r,:])$

$\mathsf {Columns\_DWT}(\mathbf {X})$: $\rightarrow \mathbf {X}$

$\mathsf {for}~c~\mathsf {in~range}(\mathbf {X}.\mathsf {shape}[0])$:
1. ${\mathbf X}[:,c]\leftarrow \mathsf {DWT}(\mathbf {X}[:,c])$

$\mathsf {2D\_DWT}(\mathbf {X})$: $\rightarrow \mathbf {X}$

${\mathbf X}\leftarrow \mathsf {Rows\_DWT}(\mathbf {X})$
${\mathbf X}\leftarrow \mathsf {Columns\_DWT}(\mathbf {X})$

Figure 9: Decomposition generated by $\text {2D-DWT}^2(X)$.

In the Subfig. (a), the rows of the image has been transformed using the 1-levels (1D) DWT (this is the output of the Alg. 1.2). In the Subfig. (b), the columns of the previously rows-transformed image has been also transformed using the 1-levels DWT (this is the output of the Alg. 1.2) applied to the previous data. Subfig. (b) is also the output of the Alg. 1.2 applied to the initial image. In the Subfig. (c) the 1-levels 2D-DWT has been also applied to the LL$^1$ subband, generating a 2-levels 2D decomposition. In the Subfig. (d) a simplified representation of Subfig. (c) has been used, where the different spatial resolution levels are highlighted.

As it can be seen in the Fig 9, the obtained 2D decomposition is expressed by \begin {equation} \begin {bmatrix} \mathbf {ll} & \mathbf {hl} \\ \mathbf {lh} & \mathbf {hh} \end {bmatrix} = \text {2D-DWT}(X), \label {eq:2D-DWT} \end {equation} where $\mathbf l$ stands for low-pass filtering and $\mathbf {h}$ for high-pass filtering. Notice that ${\mathbf ll}={\mathbf ll}^1$, ${\mathbf lh}={\mathbf lh}^1$, ${\mathbf hl}={\mathbf hl}^1$, and ${\mathbf hh}={\mathbf hh}^1$.

Eq. \eqref{eq:2D-DWT} describes the $1$-levels (analysis) 2D-DWT. Replacing ${\mathbf ll}={\mathbf l}^{s+1}$, ${\mathbf X}={\mathbf l}^s$, and ${\mathbf hl}, {\mathbf lh}, {\mathbf hh}\}={\mathbf h}^{s+1}$ in the previous expression, the $S$-levels 2D-DWT of ${\mathbf l}^0={\mathbf X}$ ($\text {2D-DWT}^S({\mathbf l}^0)$) can be computed applying $S$-times \begin {equation} \{{\mathbf l}^{s+1}, {\mathbf h}^{s+1}\} = \text {2D-DWT}({\mathbf l}^s) \end {equation} to the low-frequency subband ${\mathbf l}^s$. As an example, the Fig. 10 shows the $3$-levels 2D-DWT of the image Lena.

Figure 10: Subfig. (a): the Lena image. Subfig. (b): $\text {2D-DWT}^3(\text {Lena})$’s decomposition. Subfig. (c): $\text {2D-DWT}^3(\text {Lena})$’s normalized decomposition (see the notebook DWT of Lena).

1.3 Multiresolution

1.3.1 Multiresolution

Similarly to the 1D case, a $\text {2D-DWT}^S(\cdot )$ provides $S+1$ spatial resolution levels. An example of this can be seen in the Subfig. 9-(d), where there are 3 possible resolutions. In the Subfig. 10 there are 4 resolutions.

1.3.2 Multiresolution

The $\text {MDWT}$ provides dyadic spatial multiresolution and full temporal multiresolution.

1.4 Quantization

To find the gains in the 2D case we can compute the energy of the signal generated by the inverse transform of the unitary impulse discrete 2D signal \begin {equation} \delta _{i,j}(x,y) = \left \{ \begin {array}{ll} 1 & \text {if $i=x$ and $j=y$}\\ 0 & \text {otherwise}. \end {array} \right . \end {equation}

Notice that (see the Fig. 10) the low-frequency subbands concentrate more of the energy (and the visual information). Therefore, given a target bit-rate, these subbands are going to contribute more the the quality of the reconstruction.

1.4.1 Quantization

In the case of biorthogonal transforms, and in absence of RD optimization, the quantization steps uses for each subband should be inversely proportional to the L$_2$ synthesis gain of the subbands² [2]. However, notice that in orthogonal DWTs all the subbands have the same gain.

1.4.2 Quantization

As usually, we can use \begin {equation} \Delta _0=\Delta _1=\dots =\Delta _{N-1}, \end {equation} where $\Delta _i$ is the quantization step used for the $i$-th frame of the sequence. However, notice that this quantization pattern does not necessaryly optimizes the RD tradeoff. The RD curves of each frame should be taken into consideration in order to perform a good rate-control.

2 References

[1] J. Kovačević, V.K. Goyal, and M. Vetterli. Fourier and Wavelet Signal Processing. http://www.fourierandwavelets.org/, 2013.

[2] M.W. Marcellin, M.A. Lepley, A. Bilgin, T.J. Flohr, T.T. Chinen, and J.H. Kasner. An overview of quantization in JPEG 2000. Signal Processing: Image Communication, 17(1):73–84, 2002.

[3] K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.

[4] M. Vetterli, J. Kovačević, and V.K. Goyal. Foundations of Signal Processing. Cambridge University Press, 2014.

¹Sharper transition bands.

²If a subband has a high gain, this means that its contribution to the reconstruction of the signal is going to be high, also. Therefore, it’s desirable to introduce a small quantization error in this subband, reducing the quantization step.