Context-based Text Predictive transform
Vicente González Ruiz
December 31, 2019
1 Preamble
- The MTF uses a model where the next encoded symbol (which may have occurred only once) can get an index-code that is lower than the index-code of a symbol that has previously been found thousands of times :-(
- We can solve this problem if the positions of the symbols are determined by their probability. In other words, the list $L$ will be sorted by the frequency of occurrence of the symbols, as the sketch below illustrates.
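To see the problem, run a plain MTF over the alphabet $\{a,b\}$. This is a minimal sketch (the helper mtf_encode is mine, not part of the notes):

```python
def mtf_encode(message, alphabet):
    L = list(alphabet)          # step 1 of the MTF transform
    codes = []
    for s in message:
        c = L.index(s)          # index-code = current position of s
        codes.append(c)
        L.insert(0, L.pop(c))   # move the symbol to the front
    return codes

print(mtf_encode("aaaaabaa", "ab"))  # -> [0, 0, 0, 0, 0, 1, 1, 0]
```

After the lone b, the very frequent a pays index-code 1: the newcomer has jumped over it.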
2 0-order encoder
1. Step 1 of the MTF transform (initialize the list $L$ with every symbol of the alphabet), although now there is a counter for every node in the list.
2. While the input is not exhausted:
   (a) $s \leftarrow$ next input symbol.
   (b) $c \leftarrow$ position of $s$ in $L$ (the prediction error).
   (c) Output $c$.
   (d) Update the count of $s$ (the count of the node that stores $s$) and keep $L$ sorted.
2.1 Example
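A minimal sketch of these steps in Python (the notes do not fix an implementation; encode_0 and the stable-sort tie policy are mine):

```python
def encode_0(message, alphabet):
    # Step 1: the list of symbols, plus a counter for every node.
    L = list(alphabet)
    count = {s: 0 for s in alphabet}
    codes = []
    for s in message:                    # while the input is not exhausted
        c = L.index(s)                   # position of s in L (prediction error)
        codes.append(c)                  # output c
        count[s] += 1                    # update the count of s ...
        L.sort(key=lambda x: -count[x])  # ... and keep L sorted (stable)
    return codes

print(encode_0("aaaaabaa", "ab"))  # -> [0, 0, 0, 0, 0, 1, 0, 0]
```

Unlike the MTF, the single b does not displace a: the a right after it still gets index-code 0.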
3 0-order decoder
1. Step 1 of the encoder.
2. While the input is not exhausted:
   (a) $c \leftarrow$ next input code.
   (b) $s \leftarrow L[c]$.
   (c) Output $s$.
   (d) Step 2.(d) of the encoder (update the count of $s$ and keep $L$ sorted).
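Mirroring the encoder sketch above (decode_0 is again my name; it reuses the same stable-sort policy, which both sides must share to stay in sync):

```python
def decode_0(codes, alphabet):
    # Step 1 of the encoder.
    L = list(alphabet)
    count = {s: 0 for s in alphabet}
    out = []
    for c in codes:                      # while the input is not exhausted
        s = L[c]                         # s <- L[c]
        out.append(s)                    # output s
        count[s] += 1                    # step 2.(d) of the encoder
        L.sort(key=lambda x: -count[x])
    return "".join(out)

assert decode_0(encode_0("aaaaabaa", "ab"), "ab") == "aaaaabaa"
```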
3.1 Example
TO-DO
4 $k$-order encoder
1. Let $C_i$ be the context (the previous $i$ symbols) of an input symbol $s$, and $L(C_i)$ the list of previous symbols for that context. If $i>0$ the lists are initially empty; otherwise ($i=0$) the list is fully populated and the count of every node is $0$.
2. Let $k$ be the order of the prediction.
3. Let $E$ be a list of tested symbols. All symbols in $E$ must be different.
4. While the input is not exhausted:
   (a) $s \leftarrow$ the next input symbol.
   (b) $i \leftarrow k$ (except for the first symbol, where $i \leftarrow 0$).
   (c) While $s \notin L(C_i)$:
       i. $E \leftarrow \mathrm{reduce}(E + L(C_i))$. # reduce deletes the repeated nodes.
       ii. Update the count of $s$ in $L(C_i)$ (inserting $s$ if it was missing) and keep the list sorted.
       iii. $i \leftarrow i - 1$.
   (d) Let $c \leftarrow$ the position of $s$ in $L(C_i)$, counting only those symbols that are not in $E$.
   (e) $c \leftarrow c\ +$ the number of symbols of $E$. So, the decoder will know the length of the context where $s$ happens, and the same symbol is never counted twice.
   (f) Output $c$.
   (g) Update the count of $s$ in $L(C_i)$ and keep it sorted.
   (h) $E \leftarrow \emptyset$.
4.1 Example
5 $k$-order decoder
1. Steps 1, 2 and 3 of the encoder.
2. While the input is not exhausted:
   (a) $c \leftarrow$ the next input code.
   (b) $i \leftarrow k$ (except for the first symbol, where $i \leftarrow 0$).
   (c) While $c \geq$ the number of different symbols in $E + L(C_i)$:
       i. $E \leftarrow \mathrm{reduce}(E + L(C_i))$.
       ii. $i \leftarrow i - 1$.
   (d) $s \leftarrow$ the symbol at position $c\ -$ (the number of symbols of $E$) in $L(C_i)$, counting only those symbols that are not in $E$.
   (e) Output $s$.
   (f) Update the count of $s$ in $L(C_i)$ and keep it sorted.
   (g) While $i < k$:
       i. $i \leftarrow i + 1$.
       ii. Insert the symbol $s$ in $L(C_i)$ (as in step 4.(c).ii of the encoder).
   (h) $E \leftarrow \emptyset$.
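The notes leave the escape mechanics partly implicit, so the sketch below is one self-consistent reading that covers both the $k$-order encoder and this decoder: a miss at order $i$ falls back to order $i-1$, the emitted code is offset by the number of distinct tested symbols ($E$), and excluded symbols are never counted twice. All identifiers (Codec, _bump, past, ...) are mine, and $i \leftarrow \min(\cdot, k)$ generalizes the first-symbol rule to the first $k$ symbols.

```python
class Codec:
    """k-order predictive transform (one possible reading of the notes)."""

    def __init__(self, alphabet, k):
        self.k = k
        self.L = {(): list(alphabet)}                # order-0 list: fully populated
        self.count = {((), s): 0 for s in alphabet}  # every count is 0

    def _list(self, ctx):
        return self.L.setdefault(ctx, [])            # order>0 lists start empty

    def _bump(self, ctx, s):
        # Update the count of s in L(ctx), inserting s if it is missing,
        # and keep the list sorted by decreasing count (stable sort).
        L = self._list(ctx)
        if s not in L:
            L.append(s)
            self.count[(ctx, s)] = 0
        self.count[(ctx, s)] += 1
        L.sort(key=lambda x: -self.count[(ctx, x)])

    def encode(self, message):
        codes, past = [], []
        for s in message:
            i = min(len(past), self.k)        # i <- k (0 for the first symbol)
            E = []                            # list of tested symbols
            ctx = tuple(past[len(past) - i:])
            while s not in self._list(ctx):   # escape to a shorter context
                E += [x for x in self._list(ctx) if x not in E]  # reduce
                self._bump(ctx, s)            # the missed context learns s
                i -= 1
                ctx = tuple(past[len(past) - i:])
            visible = [x for x in self._list(ctx) if x not in E]
            codes.append(visible.index(s) + len(E))  # c <- c + |E|
            self._bump(ctx, s)
            past.append(s)
        return codes

    def decode(self, codes):
        out, past = [], []
        for c in codes:
            i = min(len(past), self.k)
            E = []
            ctx = tuple(past[len(past) - i:])
            # c too large for |E + L(C_i)| distinct symbols => shorter context.
            while c >= len(E) + len([x for x in self._list(ctx) if x not in E]):
                E += [x for x in self._list(ctx) if x not in E]
                i -= 1
                ctx = tuple(past[len(past) - i:])
            visible = [x for x in self._list(ctx) if x not in E]
            s = visible[c - len(E)]
            out.append(s)
            self._bump(ctx, s)                 # update the count of s
            while i < min(len(past), self.k):  # insert s in the escaped contexts
                i += 1
                self._bump(tuple(past[len(past) - i:]), s)
            past.append(s)
        return "".join(out)


codes = Codec("ab", k=1).encode("abababab")
assert Codec("ab", k=1).decode(codes) == "abababab"
```

With $k=1$ the codes for abababab are [0, 1, 0, 0, 0, 0, 0, 0]: every symbol after the second is predicted by its order-1 context and coded as a 0.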
5.1 Example
TO-DO