# The Deflate algorithm

Deflate is compressing data
Inflate is decompressing data

In the case of PNG decoding we Inflate the compressed data.

## Inflate in PNG

There are 2 papers to read here:

• RFC1950 - ZLIB Compressed Data Format
• RFC1951 - DEFLATE Compressed Data Format

PNG Image Data in IDAT chunks is in zlib stream format:

```  0   1
+---+---+  +=====================+  +---+---+---+---+
|CMF|FLG|  |...compressed data...|  |    ADLER32    |
+---+---+  +=====================+  +---+---+---+---+
```

The compressed data part uses the DEFLATE COMPRESSION:

```     +----------------------------------------------------------------------------+
|010.......................EOD |001..................EOD |110............EOD |
+----------------------------------------------------------------------------+
|block                         |block                    |block
|final
```

It consist of blocks with a 3-bit block header:

```Block header:                  <EOD>
-------------                  -----
1 bit block final              Huffman coded symbol of '256'
2 bits block type              = end of data
|
- 00 uncompressed
- 01 fixed Huffman
- 10 dynamic Huffman
- 11 reserved
```

The most common, dynamic Huffman coded block starts with 3 Deflate style Huffman table definitions to read the compressed data. Deflate defines Huffman tables with an array of Huffman-code-bit-lengths values for every symbol of the alphabet. Zero lenght means the symbol doesn't occur in the compressed data.

Here is one block bit-by-bit (example from light_bulb.png):

```|___________fixed lenght codes___________________|____________________________variable length Huffman codes_______________________________________|

[0][10][11101][10010][1111][010][110][000]...[001][01][01001][1110][010]...........[0001][010][1110][10010]....[10][......COMPRESSED DATA......EOD]
\_/\__/\_____/\_____/\____/\_____________________/\_______________________________/\______________________________/\_____________________________/
f btype hlit  hdist  hclen     len-of-len           len-of-literals/ptr-lenghts           len-of-distance
/ \                       / \                              / \
2   x                     x   x                            x   x
/ \                   / \ / \                          / \ / \
4  x x  x                        4  7 x  x
/\ /\ /\                            /\ /\
```
• f = 0: not the final block
• btype = 2: dynamic Huffman
• hlit = 29: there will be 256+1+29=286 symbols for 256 literal bytes, one for EOD and 29 repeat-n
• hdist = 29: there will be 1+29=30 symbols for repeat-d
• hclen = 15: 4+15=19, hlit and hdist symbols use up to max 15-bit variable code lenght (3 are special repeat codes: 16, 17 and 18)
• len-of-len: 19 x 3-bit code-length for HTD. We build a HT and read 316 (286+30) code-lenght using this mini-HT
• len-of-literals/ptr-lenghts: 286 Huffman coded values
• len-of-distance: 30 Huffman coded values
• build 2 HT to read COMPRESSED DATA:
• one for reading lit values
• one for reading dist values
• Compressed data consists of Huffman codes of
• literal bytes (0..255)
• EOD (256)
• repeat codes (257..285)

### Inflate

```    compressed data bits                                        decompressed data bytes
(Huffman codes)
inflate
011001100111010001110110000011    -------------->    .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
```

The algo is rather simple:

Read a Huffman coded symbol of (0..285) and do one of 3 things:

• add one byte to output (0..255)
• end (256)
• add a sequence of bytes to output from already decoded bytes (257..285)
```     xx xx xx xx xx xx xx xx xx xx xx xx xx .. .. .. .. .. .. .. ..       BUFFER BEFORE

1.   xx xx xx xx xx xx xx xx xx xx xx xx xx AA .. .. .. .. .. .. ..       0..255 (literal bytes)

2.   xx xx xx xx xx xx xx xx xx xx xx xx xx .. .. .. .. .. .. .. ..       256 (end)

3.   xx xx xx xx xx xx xx xx xx xx xx xx xx AA BB CC DD .. .. .. ..       257..285 (repeat codes)
```

Example repeat:

Repeat the byte sequence "AA BB CC DD" from buffer (called window).

This has 2 parameters, eg.:

• distance = 7
• length = 4
```xx   xx   xx   xx   xx   AA   BB   CC   DD   xx   xx   xx   ..   ..   ..   ..   ..   ..   ..   ..   ..
^
|
dst

distance = 7

--   --   --   --   --   --   --
|                                |
xx   xx   xx   xx   xx   AA   BB   CC   DD   xx   xx   xx   ..   ..   ..   ..   ..   ..   ..   ..   ..
|                 |                 ^
--   --   --   --                  |
length = 4                     dst

xx   xx   xx   xx   xx   AA   BB   CC   DD   xx   xx   xx   AA   BB   CC   DD   ..   ..   ..   ..   ..
^
|
dst
```