Reference book: "Introduction to Data Compression" (fourth edition), p. 121
7. A sequence is encoded using the LZ77 algorithm. Given that C(a) = 1, C(b) = 2, C(r) = 3, and C(t) = 4, where b denotes a space, decode the following sequence of triples:
<0,0,3> <0,0,1> <0,0,4> <2,8,2> <3,1,2> <0,0,3> <6,4,4> <9,5,4>
Assume that the size of the window is 20 and the size of the look-ahead buffer is 10. Encode the decoded sequence and make sure that the same sequence of triples is obtained.
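Answer: From the problem statement, the window size is W = 20 and the look-ahead buffer holds 10 symbols, so the search buffer holds the S = 10 most recently processed symbols. Throughout, b denotes a space (code 2). A triple <o, l, c> means: go back o symbols, copy l symbols, then append the symbol whose code is c.
Decoding <0,0,3>:
No match; append r. The sequence is: r
Decoding <0,0,1>:
No match; append a. The sequence is: ra
Decoding <0,0,4>:
No match; append t. The sequence is: rat
Decoding <2,8,2>:
Go back 2 symbols to the a and copy 8 symbols. The copy runs past the end of the text decoded so far, so the pair at repeats four times:
Copy 2 symbols: rat | at
Copy 2 more symbols: rat | atat
Copy 2 more symbols: rat | atatat
Copy 2 more symbols: rat | atatatat
Then decode code 2 and append b. The sequence is: ratatatatatb
Decoding <3,1,2>:
Go back 3 symbols to the a (the 10th symbol) and copy 1 symbol: ratatatatatb | a
Then decode code 2 and append b. The sequence is: ratatatatatbab
Decoding <0,0,3>:
No match; append r. The sequence is: ratatatatatbabr
Decoding <6,4,4>:
Go back 6 symbols to the a (the 10th symbol) and copy 4 symbols: ratatatatatbabr | atba
Then decode code 4 and append t. The sequence is: ratatatatatbabratbat
Decoding <9,5,4>:
Go back 9 symbols to the b (the 12th symbol) and copy 5 symbols: ratatatatatbabratbat | babra
Then decode code 4 and append t. The sequence is: ratatatatatbabratbatbabrat
Decoding ends. The decoded sequence is ratatatatatbabratbatbabrat (26 symbols), which reads "ratatatatat a rat at a rat" once each b is written as a space.
The decoded sequence ratatatatatbabratbatbabrat is encoded as follows, with W = 20 and S = 10. Each step shows the search buffer to the left of | and the look-ahead buffer to the right.
(empty) | ratatatata
For r, there is no matching string.
Send <0,0,3>
r | atatatatat
For a, there is no matching string.
Send <0,0,1>
ra | tatatatatb
For t, there is no matching string.
Send <0,0,4>
rat | atatatatba
The longest match starts 2 symbols back at the a and extends into the look-ahead buffer for 8 symbols (atatatat). The next symbol is b.
Send <2,8,2>
tatatatatb | abratbatba
Only the single symbol a matches, because ab does not occur in the search buffer. The nearest a is 3 symbols back, and the next symbol is b.
Send <3,1,2>
tatatatbab | ratbatbabr
For r, there is no matching string in the search buffer.
Send <0,0,3>
atatatbabr | atbatbabra
The longest match is atba, starting 6 symbols back. The next symbol is t.
Send <6,4,4>
tbabratbat | babrat
The match babra starts 9 symbols back. The final t must be sent as the third element of the triple, so the match length is 5.
Send <9,5,4>
Encoding complete. The triples are identical to the ones given in the problem.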
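The decoding and encoding steps above can be checked with a short Python sketch. It is only an illustration under stated assumptions, not the textbook's implementation: the function names lz77_decode and lz77_encode are made up here, the letter b stands in for the space symbol, ties between equally long matches go to the smallest offset, and a match plus its trailing symbol must fit inside the look-ahead buffer.

```python
# Code table from the problem; "b" stands in for the space symbol.
CODES = {"a": 1, "b": 2, "r": 3, "t": 4}
SYMBOLS = {code: sym for sym, code in CODES.items()}


def lz77_decode(triples):
    """Decode a list of (offset, length, code) triples into a string."""
    out = []
    for offset, length, code in triples:
        start = len(out) - offset
        for i in range(length):
            # Copy one symbol at a time so the copy may overlap
            # symbols appended earlier in this same loop.
            out.append(out[start + i])
        out.append(SYMBOLS[code])
    return "".join(out)


def lz77_encode(text, window=20, lookahead=10):
    """Encode text into (offset, length, code) triples.

    Assumptions: search buffer = window - lookahead symbols,
    the smallest offset wins ties, and the last symbol of each
    triple is always sent explicitly.
    """
    search = window - lookahead
    triples = []
    pos = 0
    while pos < len(text):
        best_offset, best_length = 0, 0
        # Reserve one symbol for the code at the end of the triple.
        max_len = min(lookahead, len(text) - pos) - 1
        for offset in range(1, min(search, pos) + 1):
            length = 0
            while (length < max_len
                   and text[pos - offset + length] == text[pos + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = offset, length
        next_symbol = text[pos + best_length]
        triples.append((best_offset, best_length, CODES[next_symbol]))
        pos += best_length + 1
    return triples


triples = [(0, 0, 3), (0, 0, 1), (0, 0, 4), (2, 8, 2),
           (3, 1, 2), (0, 0, 3), (6, 4, 4), (9, 5, 4)]
decoded = lz77_decode(triples)
print(decoded)
print(lz77_encode(decoded) == triples)
```

Under these assumptions the round trip reproduces the eight triples from the problem. A different tie-breaking rule could legitimately pick another offset for <3,1,2>, since a also occurs 5, 7, and 9 symbols back.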
8. Given the following initial dictionary and received sequence, construct the LZW dictionary and decode the transmitted sequence. Received sequence: 4, 5, 3, 1, 2, 8, 2, 7, 9, 7, 4
Initial dictionary (b denotes a space):
Index | Item
1 | S
2 | b
3 | I
4 | T
5 | H
Solution: Decode using the initial dictionary and the received sequence (note: b denotes a space, which is item 2). Each time an index is decoded, a new dictionary item is formed from the previously decoded item followed by the first symbol of the current item.
The first index, 4, corresponds to T, so the 1st symbol of the output is T.
The second index, 5, corresponds to H, so the 2nd symbol is H. Joining T and H gives the pattern TH, which is not in the dictionary, so TH is added as item 6.
The third index, 3, corresponds to I, so the 3rd symbol is I. Joining H and I gives HI, which is not in the dictionary, so HI is added as item 7.
The fourth index, 1, corresponds to S, so the 4th symbol is S. Joining I and S gives IS, which is not in the dictionary, so IS is added as item 8.
The fifth index, 2, corresponds to b, so the 5th symbol is b. Joining S and b gives Sb, which is not in the dictionary, so Sb is added as item 9.
The sixth index, 8, corresponds to IS, so the 6th and 7th symbols are I, S. Joining b and I gives bI, which is not in the dictionary, so bI is added as item 10.
The seventh index, 2, corresponds to b, so the 8th symbol is b. Joining the previous item IS and b gives ISb, which is not in the dictionary, so ISb is added as item 11.
The eighth index, 7, corresponds to HI, so the 9th and 10th symbols are H, I. Joining b and H gives bH, which is not in the dictionary, so bH is added as item 12.
The ninth index, 9, corresponds to Sb, so the 11th and 12th symbols are S, b. Joining the previous item HI and S gives HIS, which is not in the dictionary, so HIS is added as item 13.
The tenth index, 7, corresponds to HI, so the 13th and 14th symbols are H, I. Joining the previous item Sb and H gives SbH, which is not in the dictionary, so SbH is added as item 14.
The eleventh index, 4, corresponds to T, so the 15th symbol is T. Joining the previous item HI and T gives HIT, which is not in the dictionary, so HIT is added as item 15.
In summary, the complete dictionary is:
Index | Item
1 | S
2 | b
3 | I
4 | T
5 | H
6 | TH
7 | HI
8 | IS
9 | Sb
10 | bI
11 | ISb
12 | bH
13 | HIS
14 | SbH
15 | HIT
Decoding the received sequence with this dictionary gives the original sequence THISbISbHISbHIT, that is, "THIS IS HIS HIT".
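The dictionary construction above can be reproduced with a minimal LZW decoder sketch in Python. The function name lzw_decode and the use of a plain dict keyed by index are assumptions made for illustration, and b again stands in for the space symbol. The branch for an index that is not yet in the dictionary is part of the general algorithm but is never triggered by this particular sequence.

```python
def lzw_decode(indices, initial):
    """Decode a list of LZW indices; return (text, final dictionary)."""
    dictionary = dict(initial)          # index -> string
    next_index = max(dictionary) + 1
    previous = dictionary[indices[0]]
    output = [previous]
    for index in indices[1:]:
        if index in dictionary:
            entry = dictionary[index]
        elif index == next_index:
            # The encoder used the item it had only just created.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid index {index}")
        # New item: previous item plus first symbol of the current item.
        dictionary[next_index] = previous + entry[0]
        next_index += 1
        output.append(entry)
        previous = entry
    return "".join(output), dictionary


# Initial dictionary from the problem; "b" stands in for the space.
initial = {1: "S", 2: "b", 3: "I", 4: "T", 5: "H"}
received = [4, 5, 3, 1, 2, 8, 2, 7, 9, 7, 4]

text, dictionary = lzw_decode(received, initial)
print(text)
for index in sorted(dictionary):
    print(index, "|", dictionary[index])
```

Running the sketch prints the decoded text followed by the 15 dictionary items, which can be compared row by row with the table above.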
Fourth assignment