Deferred clear code in GIF

Copyright 2010 Attila Tarpai (tarpai76 gmail)

In the LZW decompressor used in the GIF decoder.

Intro

The basic operation (both for encoder and decoder) is that each new code is added to the dictionary. In GIF the dictionary is maximized to 4096 entries, numbered from #0 to #4095. Dictionary entries represent different length of strings of the original data, repeating strings are the basis of compression in LZW. The first 2n entries are the base entries, eg. #0 to #255 for 8-bit coded values (n=8). The very first new entry is added at position #258, then #259 and so on.. because 2 special codes are used:

EOI is obvious. The CLR instructs the decoder to initialize and start adding entry #258 again.

The decoder does only 2 things:

This happens at each code read: add the new entry.. then emit the string of the received code (which can be the new entry itself or some code less than that). Note that this implementation also solves the code not in the dictionary yet problem. When the dictionary is full (ie. entry #4095 has been added by the decoder) there are 2 possibilities:

The deferred operation can happen in GIF, then the decoder has to stop adding new entries, and just emitting until a CLR or EOI is received.

Example

This was a challenge to implement efficiently. Here is an image I've found on the web that used the deferred clear code:

These image files really help to discover errors in the code: my decoder somehow skipped a few pixels around code 4095; I forgot to emit it.. those lightgrey pixels in the red circle are unwritten pixels:

Anyway, my decoder finally handles the Deferred clear code in GIF correctly.

Implementation

ANSI C, something like this, simplified and ugly, because it's been taken out of its context:

if (idx==4096) {
      e:
       if ((code=ReadCode())==CLR) goto clr; 
       if (code==EOI) break;        
       dst=emit(d+code, dst);
      goto e;
}

When we've reached entry #4096, we behave deferred, and start reading the incoming codes in a loop (goto, so I can leave a while() on the outside):