Coding for sources with memory:

A drawback of Huffman's code is that it requires knowledge of the symbol probabilities. For real-time applications, Huffman encoding becomes impractical because the source statistics are not always known a priori. A code that can exploit the statistical interdependence of the letters in the alphabet, along with their individual probabilities of occurrence, is the Lempel-Ziv algorithm. This algorithm belongs to the class of universal source coding algorithms.


The logic behind Lempel-Ziv universal coding is as follows:

* An arbitrary binary sequence can be compressed by coding a series of zeros and ones as some previously seen string (a prefix string) plus one new bit.

* The new string formed by such parsing becomes a potential prefix string for future strings.

These substrings are called phrases (subsequences). The phrases are listed in a dictionary, or code book, which stores the existing phrases and their locations. To encode a new phrase, we specify the location of the matching existing phrase in the code book and append the new letter.
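As an illustrative sketch (not part of the original algorithm statement), the parsing rule above can be written in Python. Seeding the code book with the single symbols 0 and 1 is an assumption that matches the worked example below, where positions 1 and 2 hold those symbols:

```python
def lz_parse(bits, seed=("0", "1")):
    """Greedy Lempel-Ziv parsing: each new phrase is the shortest
    substring not already in the code book, i.e. a known prefix
    string plus one new bit."""
    phrases = list(seed)          # code book seeded with the single symbols
    seen = set(seed)
    current = ""
    for b in bits:
        current += b
        if current not in seen:   # known prefix + one new bit -> new phrase
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases

print(lz_parse("000101110010100101"))
# ['0', '1', '00', '01', '011', '10', '010', '100', '101']
```

Each phrase enters the code book exactly once, so later phrases can point back to it as a prefix.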

Consider an example:

1. Determine the Lempel-Ziv code for the given sequence:

000101110010100101…

Answer:

The given sequence is parsed into the subsequences:

00, 01, 011, 10, 010, 100, 101

Depending on this a table can be formed as :


| Numeric position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Subsequence | 0 | 1 | 00 | 01 | 011 | 10 | 010 | 100 | 101 |
| Numerical representation | – | – | 11 | 12 | 42 | 21 | 41 | 61 | 62 |
| Binary coded sequence | – | – | 0010 | 0011 | 1001 | 0100 | 1000 | 1100 | 1101 |

Maximum number in the numerical representation = 6

6 = 110

Number of bits needed for each pointer = 3

i.e.,

0 = 000

1 = 001

2 = 010

3 = 011

4 = 100

5 = 101

6 = 110

The last symbol of each subsequence in the code book is an innovation symbol: the last bit of each uniform block of bits in the binary coded representation of the data stream is the innovation symbol for the particular subsequence under consideration. The remaining bits give the equivalent binary representation of the pointer to the root subsequence that matches the one in question except for the innovation bit.
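Continuing the earlier sketch under the same assumptions (code book seeded with 0 and 1 at positions 1 and 2), a hypothetical encoder that produces the binary coded sequence in the table: each later phrase becomes a fixed-width pointer to its prefix phrase followed by the one-bit innovation symbol, with the pointer width taken from the largest position referenced (here 6, needing 3 bits):

```python
def lz_encode(phrases):
    """Emit each phrase (beyond the two seed symbols) as a fixed-width
    binary pointer to its prefix phrase in the code book, followed by
    the one-bit innovation symbol."""
    position = {p: i + 1 for i, p in enumerate(phrases)}   # 1-based positions
    pairs = [(position[p[:-1]], p[-1]) for p in phrases[2:]]
    width = max(ptr for ptr, _ in pairs).bit_length()      # 6 -> 3 bits here
    return [format(ptr, f"0{width}b") + bit for ptr, bit in pairs]

phrases = ["0", "1", "00", "01", "011", "10", "010", "100", "101"]
print(lz_encode(phrases))
# ['0010', '0011', '1001', '0100', '1000', '1100', '1101']
```

The seven 4-bit codewords reproduce the last row of the table: 3-bit pointer plus innovation bit.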

