Improved compression efficiency (compared to JPEG).
Lossy to lossless compression.
Multiple resolution representation.
Progressive (quality) decoding.
Tiling.
Region-of-interest (ROI) coding.
Error resilience.
Random code-stream access and processing.
2 Intro [?]
Lossy & lossless: The lossy path has a better RD curve than the lossless
path.
Quality scalability: More decoded data, more quality.
Spatial scalability: More decoded data, more resolution.
ROI (Regions Of Interest) scalability:
3 The JPEG2000 algorithm
Color component DC level offset (optional).
Intercomponent decorrelation (optional).
Spatial decorrelation (2D-DWT).
Quantization (only in the lossy path).
ROI definition.
Entropy coding (tier-1 coding).
Bit-stream organization (tier-2 coding).
4 DC level offset (step 1/6)
Depends on the selected path. For each component:
Irreversible: normalize the samples \(s[{\mbox {\boldmath $n$}}]\) so that
\begin {equation} -\frac {1}{2}\le s[{\mbox {\boldmath $n$}}] \le \frac {1}{2}, \end {equation}
where \(s[{\mbox {\boldmath $n$}}]=s[x,y]\) is the sample at position \((x,y)\).
Reversible: subtract an offset from the samples \(s[{\mbox {\boldmath $n$}}]\) if they do not satisfy
\begin {equation} -2^{B-1}\le s[{\mbox {\boldmath $n$}}] < 2^{B-1}, \end {equation}
where \(B\) is the number of bits/component.
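A minimal sketch of both offsets, assuming unsigned \(B\)-bit input samples. The function name `dc_level_shift` and the normalization by \(2^B\) in the irreversible path are assumptions made for illustration; they are not taken from the standard's text.

```python
def dc_level_shift(samples, bits, reversible=True):
    """Center unsigned `bits`-bit samples on zero.

    samples: list of ints in [0, 2**bits).
    Reversible path: integers in [-2**(bits-1), 2**(bits-1)).
    Irreversible path: reals in [-1/2, 1/2).
    """
    offset = 1 << (bits - 1)             # 2^(B-1)
    if reversible:
        return [s - offset for s in samples]
    scale = float(1 << bits)             # 2^B (assumed normalization)
    return [(s - offset) / scale for s in samples]


shifted = dc_level_shift([0, 128, 255], bits=8)
normalized = dc_level_shift([0, 128, 255], bits=8, reversible=False)
```

For 8-bit samples, the reversible path maps the range \([0, 255]\) to \([-128, 127]\).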
\begin {equation} q_b = \text {sign}(y_b)\Big \lfloor \frac {\displaystyle |y_b|}{\displaystyle \Delta _b}\Big \rfloor \tag {J2KQuant} \end {equation}
where \(q_b\) is the quantized coefficient, \(y_b\in [-0.5,0.5]\) is a wavelet coefficient in the subband \(b\) and \(\Delta _b\) is the
quantizer step size for the subband \(b\). The value of \(q_b\) depends on \(y_b\) as shown in the
next figure (dead-zone scalar quantizer):
Reversible path
There is no quantization: \begin {equation} q_b = y_b \tag {J2KRanging} \end {equation}
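The two paths can be sketched as follows. `quantize` implements the dead-zone formula above. `dequantize` and its reconstruction offset `r` are assumptions added here only to show the decoder side; the source does not specify them.

```python
import math


def quantize(y, step):
    """Dead-zone scalar quantizer: q = sign(y) * floor(|y| / step)."""
    sign = (y > 0) - (y < 0)
    return sign * math.floor(abs(y) / step)


def dequantize(q, step, r=0.5):
    """Reconstruct inside the quantization interval (r=0.5 is the midpoint).

    The value of r is an assumption; decoders are free to choose it.
    """
    if q == 0:
        return 0.0
    sign = 1 if q > 0 else -1
    return sign * (abs(q) + r) * step


def quantize_reversible(y):
    """Reversible path: no quantization, q = y."""
    return y


q = quantize(0.3, step=0.125)       # 2
y_hat = dequantize(q, step=0.125)   # 0.3125
```

Every coefficient with \(|y_b| < \Delta _b\) is mapped to zero, so the zero bin is twice as wide as the others, which is where the name dead-zone comes from.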
8 ROI definition (step 5/6)
Obtained by prioritizing (multiplying by a number greater than one) those
\(q_b\) that define the ROI.
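A minimal sketch of this prioritization, assuming that the scaling factor is a power of two (so that the ROI coefficients occupy higher bit-planes and are reached earlier by a bit-plane coder). The function `scale_roi` and the boolean mask are hypothetical names used only for this example.

```python
def scale_roi(coeffs, roi_mask, shift):
    """Multiply the coefficients inside the ROI by 2**shift (shift >= 1).

    coeffs:   2D list of quantized coefficients q_b.
    roi_mask: 2D list of booleans, True where the coefficient belongs to the ROI.
    """
    factor = 1 << shift
    return [
        [q * factor if in_roi else q for q, in_roi in zip(row, mask_row)]
        for row, mask_row in zip(coeffs, roi_mask)
    ]


scaled = scale_roi([[1, -2], [3, 4]], [[True, False], [False, True]], shift=3)
```

The decoder must undo the scaling (divide the ROI coefficients by the same factor) before the inverse transform.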
9 Entropy encoding (step 6/6)
EBCOT (Embedded Block Coding with Optimal Truncation).
The coefficients are grouped into code-blocks (that have a typical size of \(32\times 32\) or \(64\times 64\))
and encoded bit-plane by bit-plane, using a context-based adaptive binary
arithmetic encoder (called MQ-coder).
Each bit-plane of each code-block is encoded in 3 passes:
Significance propagation pass: indicates whether the coefficients that
are expected to become significant (magnitude greater than or equal to \(2^p\), where \(p\)
is the index of the processed bit-plane) are in fact significant. When
a coefficient becomes significant, its sign is also encoded.
Magnitude refinement pass: encodes the corresponding bit of
the processed bit-plane for those coefficients that are already
significant.
Cleanup pass: the significance propagation pass only visits a subset
of the coefficients that can become significant. This pass encodes
the remaining ones.
The code-stream produced after each individual pass is an optimal code-stream
from the R/D point of view. In other words, if the code-stream is truncated
at any of these points, we are as close to the R/D curve as
possible.
Notice that there are a total of \begin {equation} 3P-2 \end {equation}
optimal truncation points in the code-stream of a code-block, where \(P\) is the
number of bit-planes in the DWT domain (the most significant bit-plane is encoded with the cleanup pass only).
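The bookkeeping behind the \(3P-2\) count can be sketched as below. The helper names are hypothetical, and no arithmetic coding is performed: the sketch only counts passes and finds the bit-plane at which each coefficient becomes significant.

```python
def num_bit_planes(coeffs):
    """P: number of bit-planes needed for the largest magnitude in the code-block."""
    return max(abs(q) for q in coeffs).bit_length()


def num_coding_passes(P):
    """Three passes per bit-plane, except the most significant one (cleanup only)."""
    return 3 * P - 2 if P > 0 else 0


def first_significant_plane(q):
    """Largest p such that |q| >= 2**p, or None if q is zero."""
    return abs(q).bit_length() - 1 if q != 0 else None


code_block = [5, -12, 0, 1]
P = num_bit_planes(code_block)                                # 4
passes = num_coding_passes(P)                                 # 10
planes = [first_significant_plane(q) for q in code_block]     # [2, 3, None, 0]
```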
PCRD-opt
In order to provide quality scalability, the code-streams of the code-blocks
should be reordered according to the contribution of each coding pass to the
increment of quality of the reconstruction of the whole image.
A JPEG 2000 encoder typically takes as input a set of \(Q\) bit-rates or a number \(Q\) of
quality layers.
The PCRD-opt determines which segments of each code-block-stream are going
to be part of each quality layer. Example:
Notice that PCRD-opt does not improve the RD curve in the sense of moving
it closer to the origin of coordinates (in the case of using the
RMSE, for example). Instead, PCRD-opt increases the number of operational RD points
of the codec.
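A simplified sketch of the idea behind the layer assignment, assuming that each code-block provides a list of cumulative (rate, distortion) pairs at its truncation points and that one Lagrangian threshold \(\lambda \) per quality layer is already given. A real encoder searches the thresholds that meet the \(Q\) target bit-rates; that search is omitted here, and all the names are hypothetical.

```python
def truncation_point(rd_points, lam):
    """Index of the truncation point that minimizes D + lam * R.

    rd_points: list of (rate, distortion) pairs, with increasing rate.
    """
    costs = [d + lam * r for r, d in rd_points]
    return costs.index(min(costs))


def build_layers(blocks, lambdas):
    """One list of per-code-block truncation indices for each quality layer.

    Thresholds are visited in decreasing order, so each layer extends the
    previous one (the truncation indices never decrease).
    """
    return [
        [truncation_point(rd, lam) for rd in blocks]
        for lam in sorted(lambdas, reverse=True)
    ]


blocks = [
    [(0, 100.0), (10, 40.0), (20, 25.0), (30, 20.0)],   # code-block A
    [(0, 50.0), (10, 45.0), (20, 20.0), (30, 18.0)],    # code-block B
]
layers = build_layers(blocks, lambdas=[4.0, 1.0, 0.1])
# layers == [[1, 0], [2, 2], [3, 3]]
```

In the first layer, code-block B contributes nothing, because its first coding pass reduces the distortion very little for its cost in bits.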
The precinct partition
Unfortunately, there is no single code-stream ordering that provides
both scalabilities: spatial and quality.
Therefore, when the data ordering in the code-stream does not match
the target scalability, the only solution is to access the code-stream
non-sequentially. For this reason, some extra data (overhead)
must be included in the code-stream (remember that the contribution,
in bits of code, of each code-block to the total quality can be different).
Finally, if \(Q\) is high, the amount of overhead could be counterproductive.
To mitigate this drawback, the code-blocks (and their code-streams) are
grouped into so-called precincts.
So, each “quality layer” of each precinct is stored in a packet, and there is
an index (or a length) for each packet in a JPEG 2000 code-stream.
Reduction of the distortion of a coding pass
The contribution of a coding pass to the quality of the reconstruction
(its distortion decrease) is determined by:
The weight of the bits of the coefficients that are encoded in the
coding pass.
The energy gain factor of the subband where the code-block is
located.
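Both factors can be illustrated with a rough sketch that measures the squared error removed when a whole bit-plane is decoded. It works at bit-plane granularity (not per coding pass), ignores midpoint reconstruction, and uses a hypothetical `gain` parameter for the energy gain factor of the subband.

```python
def truncate_to_plane(q, p):
    """Keep only the bits of |q| at planes >= p (what is known after plane p)."""
    sign = -1 if q < 0 else 1
    return sign * ((abs(q) >> p) << p)


def distortion(coeffs, p, gain=1.0):
    """Weighted squared error when only the bit-planes >= p have been decoded."""
    return gain * sum((q - truncate_to_plane(q, p)) ** 2 for q in coeffs)


def distortion_decrease(coeffs, p, gain=1.0):
    """Distortion removed by decoding bit-plane p."""
    return distortion(coeffs, p + 1, gain) - distortion(coeffs, p, gain)


code_block = [5, -12, 3]
decreases = [distortion_decrease(code_block, p) for p in (3, 2, 1, 0)]
# decreases == [128.0, 40.0, 8.0, 2.0]
```

The most significant bit-planes remove most of the distortion, and the same bits in a subband with a larger energy gain factor count proportionally more.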
Progressions of JPEG2000
In fact, a JPEG 2000 codec produces a packet for each precinct,
component, resolution level and quality layer.
Depending on the final packet ordering, we have one of the following
progressions:
LRCP or quality progression
Packets are ordered first by quality, then by resolution level, then by
component and finally, by precinct. Example:
RLCP or spatial progression
Packets are ordered first by resolution level, then by quality layer, then
by component and finally, by precinct. Example:
PCRL or sequential progression
Packets are ordered first by precinct, then by component, after that by
spatial resolution and finally, by quality layer. Example:
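The three orderings differ only in the nesting of four loops, which can be sketched as follows. The sketch assumes the same number of precincts at every resolution level (not true in general), and `packet_order` is a hypothetical helper.

```python
from itertools import product


def packet_order(progression, layers, resolutions, components, precincts):
    """Yield packet indices in the order given by `progression`.

    progression: string such as "LRCP", "RLCP" or "PCRL"; the first letter
    is the outermost (slowest varying) loop.
    """
    sizes = {"L": layers, "R": resolutions, "C": components, "P": precincts}
    for idx in product(*(range(sizes[key]) for key in progression)):
        yield dict(zip(progression, idx))


# (layer, resolution) pairs for 2 layers, 2 resolution levels,
# 1 component and 1 precinct.
lrcp = [(p["L"], p["R"]) for p in packet_order("LRCP", 2, 2, 1, 1)]
rlcp = [(p["L"], p["R"]) for p in packet_order("RLCP", 2, 2, 1, 1)]
# lrcp == [(0, 0), (0, 1), (1, 0), (1, 1)]
# rlcp == [(0, 0), (1, 0), (0, 1), (1, 1)]
```

With LRCP, truncating the code-stream after the first half gives the whole image at the first quality layer; with RLCP, it gives the lowest resolution at full quality.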
Rudimentary ROI definition at decoding time
The random access to the packet-stream gives the decoder the possibility of
defining a ROI.
The accuracy of the shape of the ROI depends on the precinct size(s). The
smaller the precincts, the better the precision.
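The effect of the precinct size on the ROI precision can be sketched as below. As a simplification, precincts are treated here as a regular grid in the image domain (in the codec they are defined per resolution level), and the function name is hypothetical.

```python
def precincts_for_roi(roi, precinct_w, precinct_h, image_w, image_h):
    """(column, row) indices of the precincts that intersect the ROI.

    roi: (x0, y0, x1, y1), with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, image_w), min(y1, image_h)
    if x0 >= x1 or y0 >= y1:
        return []                      # the ROI lies outside the image
    cols = range(x0 // precinct_w, (x1 - 1) // precinct_w + 1)
    rows = range(y0 // precinct_h, (y1 - 1) // precinct_h + 1)
    return [(c, r) for r in rows for c in cols]


roi = (100, 100, 200, 200)             # 100 x 100 = 10000 pixels requested
big = precincts_for_roi(roi, 128, 128, 512, 512)
small = precincts_for_roi(roi, 32, 32, 512, 512)
area_big = len(big) * 128 * 128        # 65536 pixels retrieved
area_small = len(small) * 32 * 32      # 16384 pixels retrieved
```

With \(128\times 128\) precincts the decoder retrieves more than six times the requested area; with \(32\times 32\) precincts, about 1.6 times.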
This feature is typically exploited in client/server architectures through
the JPIP (JPeg 2000 Interactive Protocol) [2].
Quality (in dB) of the reconstructions of “Lena” and “Cat” at the same bit-rate:
Image    Bit-rate (bpp)   JPEG (dB)    JPEG2000 (dB)
“Lena”   \(0.1\)          \(21.29\)    \(27.03\)
“Lena”   \(0.2\)          \(26.64\)    \(29.22\)
“Lena”   \(0.3\)          \(28.97\)    \(30.71\)
“Lena”   \(0.4\)          \(30.09\)    \(31.58\)
“Lena”   \(0.5\)          \(30.91\)    \(32.24\)
“Cat”    \(0.1\)          \(17.79\)    \(23.33\)
“Cat”    \(0.2\)          \(23.59\)    \(25.97\)
“Cat”    \(0.3\)          \(25.63\)    \(27.60\)
“Cat”    \(0.4\)          \(27.10\)    \(28.97\)
“Cat”    \(0.5\)          \(28.23\)    \(30.08\)
10 Motion JPEG 2000
As in JPEG, JPEG 2000 has an extension [1] to compress sequences of
images.
Each image is encoded independently.
Unlike JPEG, however, the code-streams can be variable bit-rate
(compression ratio selected by slope(s)) or constant bit-rate (compression
ratio selected by bit-rate(s)).
Finally, scalability can be used to recover a reduced-quality,
smaller-ROI, lower-resolution or gray-scale version of the original image.
In III... (or intra video) coding, the 2D block-DWT, the 2D DWT, or
any other spatial transform is applied to sequences of frames (images)
to exploit the spatial correlation. This is achieved by simply iterating the
spatial decorrelation as described in Algorithm 1 [4], where \(V\) is
the input sequence and \(S\) controls the number of SRLs (Spatial Resolution
Levels).
The synthesis transform is computed using Algorithm 2. Fig. 1 shows
an example of the decomposition generated for three frames \(V_0\), \(V_1\) and
\(V_2\).
Algorithm 1: III-coding(\(\mathbf {V}\) /* original video sequence */, \(S\) /* Number of extra
levels */) \(\rightarrow \) (\(\mathbf {O}\) /* transformed video sequence */)
Algorithm 2: III-decoding(\(\mathbf {O}\) /* transformed video sequence */, \(S\) /* Number of
extra levels */) \(\rightarrow \) (\(\mathbf {V}\) /* original video sequence */)
Download and compile the Kakadu software implementation for the JPEG
2000 standard.
Find the R/D curve for the image lena using one quality layer (repeat
the same experiment as in JPEG: compress and expand the image for
several bit-rates and compute the RMSE). Use both the reversible and
the irreversible paths.
Now, let’s take advantage of the scalability of JPEG 2000! Compress lena
without loss (using the reversible path) and one quality layer, and find the
R/D curve by truncating the code-stream. Repeat the experiment for different
numbers of quality layers. Which alternative is best?
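For the measurements, each point of the R/D curve needs a bit-rate and an RMSE. A minimal sketch of these computations, assuming that the original and the decoded images are already available as flat lists of samples (reading the image files and running the codec are left out, and the helper names are hypothetical):

```python
import math


def rmse(original, decoded):
    """Root mean square error between two equally sized sample lists."""
    n = len(original)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, decoded)) / n)


def psnr(original, decoded, peak=255):
    """PSNR in dB; infinite when the reconstruction is lossless."""
    e = rmse(original, decoded)
    return math.inf if e == 0 else 20 * math.log10(peak / e)


def bits_per_pixel(codestream_bytes, width, height):
    """Bit-rate of a code-stream of the given length, in bits/pixel."""
    return 8 * codestream_bytes / (width * height)


error = rmse([0, 0, 0, 0], [2, 2, 2, 2])     # 2.0
rate = bits_per_pixel(32768, 512, 512)       # 1.0
```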
[1] ISO. Information Technology - JPEG 2000 Image Coding System:
Motion JPEG 2000. ISO/IEC 15444-3:2007, May 2007.