MPEG Audio [1]

Vicente González Ruiz

January 1, 2020

Contents

1 Intro
2 Layer I
 2.1 Encoder
 2.2 Decoder
3 Loss information analysis
4 Layer II
5 Layer III
 5.1 Encoder
 5.2 Decoder
References

1 Intro

2 Layer I

2.1 Encoder

  1. Split s[n] into blocks of 12 × 32 = 384 samples. For each block:
    1. Analyze the block using a 32-band equally-spaced (analysis) filter bank, producing 12 coeffs/subband (the output of each filter is downsampled (subsampled, decimated) by a factor of 32). Thus, 384 time-domain samples are transformed into 32 subbands with 12 coeffs each. Notice that each coeff can be considered as a sample of its subband.
    2. Scale each block of 12 coeffs to ensure that the entire range of the selected quantizer will be used. Output the *scalefactor*.
    3. Using the FFT, compute the ATH for the block (considering the masking effects).
    4. Let R be the bit-rate selected by the user. While the generated bit-rate R' < R:
      1. Decrement the quantization step Δb for each subband b, proportionally to the ATH in b. Compute R'. The bit-rate is controlled by switching between quantizers with different numbers of bits.
    5. Output {Δb}, b = 1, ..., 32, and the quantization indexes.
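
The scaling and rate-control steps above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure of the standard. The names `scale`, `quantize` and `allocate_bits` are hypothetical, the per-subband thresholds (in dB) are taken as given instead of being computed with an FFT, and the noise is modeled with the usual rule of thumb of about 6.02 dB less quantization noise per extra bit.

```python
import math

GROUP = 12        # coeffs per subband and block
SUBBANDS = 32     # 12 * 32 = 384 samples per block

def scale(coeffs):
    """Return (scalefactor, coeffs normalized to [-1, 1])."""
    sf = max(abs(c) for c in coeffs) or 1.0   # avoid dividing by zero on silence
    return sf, [c / sf for c in coeffs]

def quantize(normalized, bits):
    """Uniform quantizer with 2**bits - 1 levels; bits == 0 sends nothing."""
    if bits == 0:
        return []
    half = 2 ** (bits - 1) - 1                # indexes lie in [-half, half]
    return [round(v * half) for v in normalized]

def allocate_bits(scalefactors, thresholds_db, budget_bits, max_bits=15):
    """Greedy rate loop: spend bits where the noise most exceeds the threshold."""
    bits = [0] * len(scalefactors)
    used = 0
    while True:
        best, best_nmr, best_cost = None, -math.inf, 0
        for b, (sf, thr) in enumerate(zip(scalefactors, thresholds_db)):
            # A 1-bit quantizer would have a single level, so jump from 0 to 2 bits.
            nxt = 2 if bits[b] == 0 else bits[b] + 1
            cost = GROUP * (nxt - bits[b])
            if nxt > max_bits or used + cost > budget_bits:
                continue
            # Assumed noise model: about 6.02 dB less noise per extra bit.
            noise_db = 20 * math.log10(max(sf, 1e-12)) - 6.02 * bits[b]
            nmr = noise_db - thr              # noise-to-threshold ratio in dB
            if nmr > best_nmr:
                best, best_nmr, best_cost = b, nmr, cost
        if best is None:                      # no refinement fits in the budget
            return bits, used
        bits[best] = 2 if bits[best] == 0 else bits[best] + 1
        used += best_cost

# Three subbands with very different energies and the same threshold.
bits, used = allocate_bits([1.0, 0.1, 0.001], [-40.0, -40.0, -40.0], budget_bits=120)
```

Each extra bit roughly halves the quantization step of a subband, so the loop raises the generated bit-rate until the budget is exhausted, favoring the subbands where the noise is most audible.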

2.2 Decoder

  1. For each input frame:
    1. "Dequantize" the coeffs of each subband.
    2. Descale the coeffs to their original dynamic range.
    3. Apply the 32-band synthesis filter bank.
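
The first two decoder steps can be illustrated with a round trip over one group of 12 coeffs. This is a minimal sketch, assuming a uniform quantizer with 2^bits - 1 levels (at least 2 bits). The helper names `encode_group` and `decode_group` are hypothetical, and the synthesis filter bank is omitted.

```python
def encode_group(coeffs, bits):
    """Scale a group of coeffs and quantize it with 2**bits - 1 levels."""
    sf = max(abs(c) for c in coeffs) or 1.0   # scalefactor of the group
    half = 2 ** (bits - 1) - 1                # indexes lie in [-half, half]
    return sf, [round(c / sf * half) for c in coeffs]

def decode_group(sf, indexes, bits):
    """'Dequantize' the indexes and descale them with the scalefactor."""
    half = 2 ** (bits - 1) - 1
    return [sf * i / half for i in indexes]

coeffs = [0.9, -0.4, 0.05, 0.0, -1.3, 0.7, 0.2, -0.2, 1.1, -0.6, 0.3, 0.01]
sf, indexes = encode_group(coeffs, bits=8)
decoded = decode_group(sf, indexes, bits=8)
max_error = max(abs(c - d) for c, d in zip(coeffs, decoded))
```

With this quantizer the reconstruction error of each coeff is at most half a quantization step, that is, `sf / (2 * half)`.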

3 Loss information analysis

4 Layer II

5 Layer III

5.1 Encoder

  1. Split s[n] into blocks of 36 × 32 = 1152 samples. For each block:
    1. Perform an FFT of the block to compute the ATH and the window sequence.
    2. Analyze the block using a 32-band equally-spaced (analysis) filter bank, producing 36 coeffs/subband.
    3. For each subband:
      1. Analyze transients. If detected, use a sequence of start/short*3/stop windows. Otherwise, use a long window.
      2. Compute the MDCT. The transform is applied to 36 (long), 30 (start/stop) or 12 (short) coeffs/subband and, because the MDCT outputs half as many coeffs as it takes in, this step produces 18 coeffs/subband (long), 15 coeffs/subband (start/stop) or 6 coeffs/subband (short).
      3. Apply scalefactors to optimize quantization.
    4. Distortion control loop: keep (as much as possible) the quantization error below the ATH.
      1. Rate control loop: Let R be the bit-rate selected by the user. While the generated bit-rate R' < R:
        1. Decrement the quantization step Δb for each subband b, proportionally to the ATH in b. Compute R' after encoding the quantizer indexes with (static) Huffman coding. As in previous layers, a quantizer is selected from a list of predefined logarithmic quantizers.
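
The MDCT step can be illustrated with a direct (slow) implementation of the transform. This is a sketch under the usual textbook definition of the MDCT, not the optimized transform of a real encoder. The function name `mdct` is hypothetical, and windowing is left out.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: a block of 2N inputs yields N coefficients."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

# 36 subband samples seen through a long window, 12 through a short one.
long_block = [math.sin(0.2 * t) for t in range(36)]
short_block = long_block[:12]
long_coeffs = mdct(long_block)     # 18 coefficients
short_coeffs = mdct(short_block)   # 6 coefficients
```

The transform halves the number of values per block. Since consecutive blocks overlap by 50%, the total number of coefficients still matches the number of subband samples.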

5.2 Decoder

  1. For each input frame:
    1. Decode the Huffman codes.
    2. “Dequantize” the coeffs of each subband.
    3. Descale the coeffs to their original dynamic range.
    4. Apply inverse MDCT.
    5. Apply the 32-band synthesis filter bank.
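
The inverse MDCT step relies on overlap-adding consecutive blocks to cancel the time-domain aliasing. The sketch below checks this numerically for long blocks of 36 samples. It assumes a sine window, which satisfies the Princen-Bradley condition, and a 2/N normalization in the inverse transform. The function names are hypothetical, and the window switching and the remaining stages of Layer III are not modeled.

```python
import math

def mdct(block):
    """Direct MDCT: 2N inputs -> N coefficients."""
    n = len(block) // 2
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n)) for k in range(n)]

def imdct(coeffs):
    """Direct inverse MDCT: N coefficients -> 2N aliased samples."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n)) for t in range(2 * n)]

def sine_window(length):
    """Sine window: w[t]**2 + w[t + length // 2]**2 == 1."""
    return [math.sin(math.pi * (t + 0.5) / length) for t in range(length)]

def analysis_synthesis(signal, n=18):
    """Windowed MDCT, inverse MDCT and overlap-add with a hop of n samples."""
    w = sine_window(2 * n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        block = [signal[start + t] * w[t] for t in range(2 * n)]
        rec = imdct(mdct(block))
        for t in range(2 * n):
            out[start + t] += rec[t] * w[t]   # synthesis window + overlap-add
    return out

signal = [math.sin(0.3 * t) for t in range(72)]
out = analysis_synthesis(signal)
# Samples 18..53 are covered by two blocks, so the aliasing cancels there.
error = max(abs(out[t] - signal[t]) for t in range(18, 54))
```

Only the samples covered by two overlapping blocks are reconstructed. The first and last 18 samples would need a preceding and a following block.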

References

[1]   Khalid Sayood. Introduction to data compression. Morgan Kaufmann, 2017.