The Deflate algorithm

Deflate is compressing data
Inflate is decompressing data

In the case of PNG decoding we Inflate the compressed data.

Inflate in PNG

There are 2 papers to read here:

PNG Image Data in IDAT chunks is in zlib stream format:

  0   1
+---+---+  +=====================+  +---+---+---+---+
|CMF|FLG|  |...compressed data...|  |    ADLER32    |
+---+---+  +=====================+  +---+---+---+---+

The compressed data part uses the DEFLATE COMPRESSION:

     +----------------------------------------------------------------------------+
     |010.......................EOD |001..................EOD |110............EOD |
     +----------------------------------------------------------------------------+
     |block                         |block                    |block 
     |header                        |header                   |header
                                                              |final

It consist of blocks with a 3-bit block header:

Block header:                  <EOD>
-------------                  -----
1 bit block final              Huffman coded symbol of '256'
2 bits block type              = end of data
       |
        - 00 uncompressed
        - 01 fixed Huffman
        - 10 dynamic Huffman 
        - 11 reserved

The most common, dynamic Huffman coded block starts with 3 Deflate style Huffman table definitions to read the compressed data. Deflate defines Huffman tables with an array of Huffman-code-bit-lengths values for every symbol of the alphabet. Zero lenght means the symbol doesn't occur in the compressed data.

Here is one block bit-by-bit (example from light_bulb.png):

|___________fixed lenght codes___________________|____________________________variable length Huffman codes_______________________________________|


[0][10][11101][10010][1111][010][110][000]...[001][01][01001][1110][010]...........[0001][010][1110][10010]....[10][......COMPRESSED DATA......EOD]
\_/\__/\_____/\_____/\____/\_____________________/\_______________________________/\______________________________/\_____________________________/
 f btype hlit  hdist  hclen     len-of-len           len-of-literals/ptr-lenghts           len-of-distance            
                                / \                       / \                              / \      
                               2   x                     x   x                            x   x     
                                  / \                   / \ / \                          / \ / \    
                                                       4  x x  x                        4  7 x  x   
                                                         /\ /\ /\                            /\ /\  

Inflate

    compressed data bits                                        decompressed data bytes 
      (Huffman codes)            
                                      inflate
011001100111010001110110000011    -------------->    .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 

The algo is rather simple:

Read a Huffman coded symbol of (0..285) and do one of 3 things:

     xx xx xx xx xx xx xx xx xx xx xx xx xx .. .. .. .. .. .. .. ..       BUFFER BEFORE
   
                              
                              
                              
 
1.   xx xx xx xx xx xx xx xx xx xx xx xx xx AA .. .. .. .. .. .. ..       0..255 (literal bytes)

2.   xx xx xx xx xx xx xx xx xx xx xx xx xx .. .. .. .. .. .. .. ..       256 (end)

3.   xx xx xx xx xx xx xx xx xx xx xx xx xx AA BB CC DD .. .. .. ..       257..285 (repeat codes)

Example repeat:

Repeat the byte sequence "AA BB CC DD" from buffer (called window).

This has 2 parameters, eg.:

xx   xx   xx   xx   xx   AA   BB   CC   DD   xx   xx   xx   ..   ..   ..   ..   ..   ..   ..   ..   ..   
                                                            ^
                                                            |
                                                           dst


                                   distance = 7
                         
                         --   --   --   --   --   --   --    
                        |                                |
xx   xx   xx   xx   xx   AA   BB   CC   DD   xx   xx   xx   ..   ..   ..   ..   ..   ..   ..   ..   ..   
                        |                 |                 ^
                         --   --   --   --                  |
                            length = 4                     dst
                              




xx   xx   xx   xx   xx   AA   BB   CC   DD   xx   xx   xx   AA   BB   CC   DD   ..   ..   ..   ..   ..  
                                                                                ^
                                                                                |
                                                                               dst