Deferred Clear Code in the GIF-variant of LZW

Copyright 2010 Attila Tarpai (tarpai76 gmail)

This LZW decompressor (lzw.c) is part of the image decoder library, RAINBOW on BitBucket.


GIF uses a variant of the LZW compression. The basic operation (both for the LZW encoder and decoder) is that each new string is added to the dictionary. In GIF the dictionary is maximized to 4096 entries, numbered from #0 to #4095. Dictionary entries represent different length of strings of the original data, these repeating strings are the basis of compression in LZW. The first 2n entries are the base entries, eg. #0 to #255 for 8-bit coded values (n=8). The very first new entry is added at position #258, then #259 and so on.. because 2 special codes are used:

Reading the EOI code indicates end of data and decompression. CLR instructs the decoder to initialize and start adding entry #258 again with the original code size.

The decoder does only 2 things:

This happens at each code read: add the new entry.. which can be the new entry itself or some code less than that; then emit the previous string.

Note that this implementation also solves the code not in the dictionary yet problem.

When the dictionary is full (ie. entry #4095 has been added by the decoder) there are 2 possibilities:

The deferred operation is allowed in GIF, then the decoder has to stop adding new entries, and only emitting strings from the Dictionary until a CLR or EOI is received.


Here is a GIF image I've found on the web that utilizes Deferred Clear Code:

The Color Resolution is 4-bit: #16 is CLR, #17 is EOI and the start code size is 5. Before the Dictionary is full, decoding gets to this point and pixel position:

The next code is not a clear code but #1468, then #147, #2179 and so on until EOI is read and decompression ends. The rest of this particular GIF image is decoded in the DCC loop:

  if ( CLR ) goto clr
  if ( EOI ) return		

Sun Jun 17 18:23:47 UTC+0200 2018 © A. Tarpai