Symbol encoding

Vicente González Ruiz

December 31, 2019

Contents

1 How it works?
2 Bits, data and information
3 Entropy
4 Compression basics
 4.1 Encoding of a symbol
 4.2 Decoding of a symbol
  4.2.1 Tip
 4.3 Example
5 Probabilistic models
6 Shannon-Fano coding
7 Huffman coding
8 Arithmetic coding
9 Move-to-front transform
10 Context-base text predictive transform
11 Unary coding
12 Golomb-Rice coding
13 gzip
References

1 How it works?

2 Bits, data and information

3 Entropy

4 Compression basics

4.1 Encoding of a symbol

  1. While the decoder does not know the symbol:

    1. Assert something about the symbol that allows to the decoder to minimize the uncertainty of finding that symbol. This guess should have true or false with the same probability.
    2. Output a bit of code that says if the last guess is true or false.

4.2 Decoding of a symbol

  1. While the symbol is not known without uncertainty:

    1. Make the same guess that the encoder.
    2. Input a bit of code that represents the result of the last guess.

4.2.1 Tip

4.3 Example

5 Probabilistic models

6 Shannon-Fano coding

7 Huffman coding

8 Arithmetic coding

9 Move-to-front transform

10 Context-base text predictive transform

11 Unary coding

12 Golomb-Rice coding

13 gzip

References

[1]   John Cleary and Ian Witten. Data compression using adaptive coding and partial string matching. IEEE transactions on Communications, 32(4):396–402, 1984.

[2]   Peter Deutsch. DEFLATE compressed data format specification version 1.3. Technical report, 1996.

[3]   Peter Deutsch. GZIP file format specification version 4.3. Technical report, 1996.

[4]   Robert M. Fano. The transmission of information. Massachusetts Institute of Technology, Research Laboratory of Electronics, 1949.

[5]   Solomon Golomb. Run-length encodings (Corresp.). IEEE transactions on information theory, 12(3):399–401, 1966.

[6]   David A Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the Institute of Radio Engineers (IRE), 40(9):1098–1101, 1952.

[7]   Giovanni Manzini. An analysis of the Burrows—Wheeler transform. Journal of the ACM (JACM), 48(3):407–430, 2001.

[8]   Robert Rice and James Plaunt. Adaptive variable-length coding for efficient compression of spacecraft television data. IEEE Transactions on Communication Technology, 19(6):889–897, 1971.

[9]   Jorma Rissanen and Glen G. Langdon. Arithmetic coding. IBM Journal of research and development, 23(2):149–162, 1979.

[10]   Claude Elwood Shannon. A mathematical theory of communication. Bell system technical journal, 27(3):379–423, 1948.

[11]   Ian H Witten, Radford M Neal, and John G Cleary. Arithmetic coding for data compression. Communications of the ACM, 30(6):520–540, 1987.