Saturday, 21 July 2012

Lempel Ziv Algorithm with Example

Coding for sources with memory:


                 A drawback of Huffman’s code is that it requires simple probabilities. For real time applications, Huffman encoding becomes impractical as a source statistics are not always known apriori. A code that might be more efficient to use statistical interdependence of the letters in the alphabet along with their individual probabilities of occurrences is the Lempel-ziv algorithm. This algorithm belongs to the class of universal source coding algorithms.


The logic behind Lempel-ziv universal coding is as follows:


* The compression of an arbitrary binary sequence is possible by coding a series of zeros and ones as some previous such string (prefix string) + one new bit.


* The new string formed by such parsing becomes a potential prefix, string for future strings.
   These variables are called phrases (sub sequences). The phrases are listed in a dictionary or code book which stores the existing phrases and their locations. In encoding a new phrase be specified the location of the existing phrase in the code book and append the new letter.


Consider an example:


1.     Determine the Lempel-ziv code for the given sequence:
000101110010100101…………..  ?
Answer:
                   The given sequence is:
                                 00, 01, 011, 10, 010, 100, 101
Depending on this a table can be formed as :

Numeric position
1
2
3
4
5
6
7
8
9
Sub sequence
0
1
00
01
011
10
010
100
101
Numerical representation


11
12
42
21
41
61
62
Binary coding sequence


0010
0011
1001
0100
1000
1100
1101



Maximum number in numerical representation = 6
6=110
No of bits = 3


i.e,.
0 = 000
1 = 001
2 = 0101
4 = 100
5 = 101
6 = 110


The last symbol of each sub sequence in the code book is an innovation sequence corresponding the last bit of each uniform blocks of bits in the binary encoded representation of the data stream represents innovation symbol for the particular subsequence under consideration. The remaining bits provide the equal binary representation of the pointer in the root subsequence that matches the one in question except for the innovation number 





No comments:

Post a Comment