The block diagram is
where
Prediction context |
Predictor | Prediction |
\(P_0\) | \(\hat {s}\leftarrow 0\) |
\(P_1\) | \(\hat {s}\leftarrow a\) |
\(P_2\) | \(\hat {s}\leftarrow b\) |
\(P_3\) | \(\hat {s}\leftarrow c\) |
\(P_4\) | \(\hat {s}\leftarrow a+b-c\) |
\(P_5\) | \(\hat {s}\leftarrow a+(b-c)/2\) |
\(P_6\) | \(\hat {s}\leftarrow b+(a-c)/2\) |
\(P_7\) | \(\hat {s}\leftarrow (a+b)/2\) |
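As an illustration, the eight predictors can be written as one small Python function. This is only a sketch under stated assumptions: it takes the usual T.81 prediction context (\(a\) is the sample to the left of the one being predicted, \(b\) the sample above it and \(c\) the sample above and to the left), it takes \(P_7\) as the mean of \(a\) and \(b\) as in T.81, and it implements the divisions by 2 as arithmetic right shifts. The name `predict` and the sample values are hypothetical.

```python
def predict(mode, a, b, c):
    """Return the prediction s_hat of predictor P_mode (mode in 0..7).

    Assumed context: a = left, b = above, c = above-left neighbour.
    """
    if mode == 0:
        return 0                      # no prediction
    if mode == 1:
        return a
    if mode == 2:
        return b
    if mode == 3:
        return c
    if mode == 4:
        return a + b - c              # plane through the three neighbours
    if mode == 5:
        return a + ((b - c) >> 1)
    if mode == 6:
        return b + ((a - c) >> 1)
    if mode == 7:
        return (a + b) >> 1
    raise ValueError("mode must be in 0..7")


# Prediction error (sample minus prediction) for each predictor,
# using made-up sample values.
s, a, b, c = 23, 10, 20, 5
errors = [s - predict(m, a, b, c) for m in range(8)]
print(errors)
```

With smooth image content, the predictors that combine several neighbours tend to give smaller errors, which is what makes the residual cheaper to entropy-code.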
If \(e>0\), then:
Else:
Category | Huffman code |
If \(x\ne 0\), then:
Else:
For an RGB image, the baseline algorithm consists of:
For each component (Y, Cb and Cr):
Luminance | Chrominance |
Scan the coefficients following the zig-zag pattern:
in order to (partially) sort the coefficients by magnitude. Notice that, after a given coefficient, all the remaining ones are zero. This situation is encoded using the EOB (End Of Block) special symbol.
Color interlacing enables the pipelined (I/O and CPU) reconstruction of the image, row by row.
With interlacing:
Without interlacing:
Let’s encode a block of a grayscale image (luminance of “lena”).
[Figure: the 8×8 block of pixels \(\Leftrightarrow \) its DCT coefficients; the DCT coefficients div the quantization matrix give the quantized coefficients:]
39 | -3 | 1 | 0 | 0 | 0 | 0 | 0 |
2 | -1 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | -1 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The coefficients of the matrix
39 | -3 | 1 | 0 | 0 | 0 | 0 | 0 |
2 | -1 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | -1 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
are visited using the zig-zag scan to find the EOB. The result is
39 | -3 | 2 | 1 | -1 | 1 | 0 | 0 | 0 | 0 | 0 | -1 | EOB |
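The scan above can be reproduced with a few lines of Python. The sketch below is not taken from the standard's text: the helper name `zigzag_indices` is hypothetical, and the EOB is represented simply by the string `"EOB"` placed after the last non-zero coefficient.

```python
def zigzag_indices(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        walk = reversed(rows) if s % 2 == 0 else rows
        order.extend((r, s - r) for r in walk)
    return order


# Quantized coefficients of the example block.
block = [
    [39, -3, 1, 0, 0, 0, 0, 0],
    [2, -1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, -1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

scan = [block[r][c] for r, c in zigzag_indices()]
# Index of the last non-zero coefficient (0 if the whole block is zero).
last = max((i for i, v in enumerate(scan) if v != 0), default=0)
trimmed = scan[:last + 1] + ["EOB"]
print(trimmed)
```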
Encoding of the AC coefficients. We encode the pairs <number-of-previous-zeros,non-zero-value> in two steps.
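Find the entry <number-of-previous-zeros,\(SSSS\)> in the table of AC codes proposed by JPEG, where \(SSSS\) is the category of the non-zero value (the number of bits needed to represent its magnitude):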
JPEG AC codes (Luminance)
Run/category | Total length | Base code |
0/0 | 4 | 1010 (=EOB) |
0/1 | 3 | 00 |
0/2 | 4 | 01 |
0/3 | 6 | 100 |
: | : | : |
15/10 | 26 | 1111 1111 1111 1110 |
and output the base code bits. In our example, the first pair is <0,-3>; the category of \(-3\) is \(SSSS=2\), so the entry is 0/2 and we output \(01_2\).
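The following Python sketch shows how the pairs and their categories could be derived from the AC coefficients of the example. It assumes the usual T.81 convention for the second step, which is not spelled out above: after the base code, \(SSSS\) extra bits are appended, holding the value itself if it is positive or the value plus \(2^{SSSS}-1\) if it is negative. The function names are hypothetical, and runs longer than 15 zeros (which need the special ZRL symbol) are not handled.

```python
def category(v):
    """SSSS: number of bits needed to represent |v| (0 for v == 0)."""
    return abs(v).bit_length()


def amplitude_bits(v):
    """The SSSS extra bits that follow the base code (assumed T.81 rule)."""
    s = category(v)
    if s == 0:
        return ""
    if v < 0:
        v += (1 << s) - 1             # negative values: add 2**SSSS - 1
    return format(v, "0{}b".format(s))


def run_length_pairs(ac):
    """Turn AC coefficients (zig-zag order, before the EOB) into (run, value) pairs."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1                  # count zeros preceding the next non-zero value
        else:
            pairs.append((run, v))
            run = 0
    return pairs


ac = [-3, 2, 1, -1, 1, 0, 0, 0, 0, 0, -1]     # example scan without DC and EOB
symbols = [(run, category(v), amplitude_bits(v)) for run, v in run_length_pairs(ac)]
for run, ssss, extra in symbols:
    print("{}/{} extra bits: {}".format(run, ssss, extra))
```

Each printed line gives the run/category entry to look up in the table, followed by the extra bits that would be appended to its base code.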
The whole bit-stream for our example is:
100101 | 0100 | 0110 | 001 | 000 | 001 | 11110100 | 1010 |
Finally, the block is encoded using only \(35\) bits. Therefore, the compression ratio is 15:1 approximately (\(0.55\) bits/pixel).
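The figures above can be checked with a short computation. The sketch assumes that the original block has 8×8 pixels stored with 8 bits per pixel; the bit fields are copied from the bit-stream of the example.

```python
# Bit fields of the example bit-stream (DC field, six AC fields, EOB).
fields = ["100101", "0100", "0110", "001", "000", "001", "11110100", "1010"]

total_bits = sum(len(f) for f in fields)
pixels = 8 * 8                       # one block
original_bits = 8 * pixels           # assumption: 8 bits per pixel

bpp = total_bits / pixels
ratio = original_bits / total_bits

print(total_bits)                    # bits used by the block
print(round(bpp, 2))                 # bits/pixel
print(round(ratio, 1))               # compression ratio (x:1)
```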
There are three possibilities:
Progressive transmission based on spectral selection:
Progressive transmission based on bit-plane selection:
Progressive transmission based on a mixture of the two previous progressions:
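The first two progressions can be illustrated on the zig-zag coefficients of the earlier example. This is a simplified sketch, not the exact scan syntax of the standard: the band limits and function names are hypothetical, and bit-plane selection is modelled by dropping the \(k\) least significant bits of each magnitude, which later scans would refine.

```python
# Example coefficients in zig-zag order (the remaining ones are zero).
coeffs = [39, -3, 2, 1, -1, 1, 0, 0, 0, 0, 0, -1]


def spectral_scans(zz, bands):
    """Spectral selection: each scan carries one band of zig-zag positions."""
    return [zz[lo:hi] for lo, hi in bands]


def drop_low_bits(zz, k):
    """Bit-plane selection, first scan: magnitudes without their k lowest bits."""
    return [(abs(v) >> k) * (1 if v >= 0 else -1) for v in zz]


bands = [(0, 1), (1, 6), (6, 12)]    # DC first, then low and higher frequencies
print(spectral_scans(coeffs, bands))
print(drop_low_bits(coeffs, 1))
```

A mixed progression would apply both ideas at once, sending a coarse version of a low-frequency band first and refining bits and bands in later scans.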
To create the pyramid we can use the following algorithm:
For each image of the Image Compression Corpus, build a table with the structure:
# bpp MSE
where bpp (bits per pixel) is the bit-rate obtained after compressing and decompressing the image with the command-line tools cjpeg and djpeg, and MSE is the mean squared error between the original and the decompressed image.
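The two columns of the table could be computed as in the following Python sketch. It is only an illustration: the function names are hypothetical, the size of the compressed file is assumed to be known in bytes, and the images are assumed to be available as flat sequences of sample values of equal length.

```python
def bits_per_pixel(compressed_bytes, width, height):
    """Bit-rate of a compressed image, in bits per pixel."""
    return 8 * compressed_bytes / (width * height)


def mse(original, decoded):
    """Mean squared error between two equally long sequences of samples."""
    if len(original) != len(decoded):
        raise ValueError("images must have the same number of samples")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


# Made-up numbers: a 512x512 image compressed to 16384 bytes,
# and four samples before and after decompression.
print(bits_per_pixel(16384, 512, 512))
print(mse([10, 20, 30, 40], [12, 20, 27, 40]))
```

Repeating the measurement for several quality settings of cjpeg gives one table row per setting, that is, one point of the rate-distortion curve of the image.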
[1] The Joint Photographic Experts Group (JPEG). Recommendation T.81: Digital Compression and Coding of Continuous-tone Still Images. International Telecommunication Union (ITU), September 1992.
[2] The Joint Photographic Experts Group (JPEG). FCD 14495, Lossless and Near-Lossless Coding of Continuous Tone Still Images (JPEG-LS). The International Standards Organization (ISO)/The International Telegraph and Telephone Consultative Committee (CCITT), July 1997.
[3] G. K. Wallace. The JPEG Still Picture Compression Standard. Communications of the ACM, 34(4):30-44, April 1991. Available at ftp://ftp.uu.net/graphics/jpeg/wallace.ps.Z.
1 It is possible to use either Huffman or arithmetic coding. However, the marginal gain of the latter (about 10%) and the patents behind it have made the Huffman version the most widely used one.