JPEG Interchange Format (JIF)

Annex B of ITU-T T.81 defines the byte stream format, i.e. the basic structure of JPEG files. Multi-byte values are big-endian (most significant byte first).

Simplified:

                 |                                             |
+-----+----+-----+----+-----+----------------------------------+-----+ 
| SOI | [] | SOF | [] | SOS |ECS.....RST.ECS......RST.ECS......| EOI | 
+-----+----+-----+----+-----+----------------------------------+-----+  
                 |                                             |
                                                               * any number of scans
                                  
SOI, EOI: start-of-image and end-of-image markers; the JPEG compressed image data lies between them
SOF: start of frame, a header defining the basic image parameters
SOS: start of scan, a header followed by the coefficient data for one image scan (an image has one or more scans)
ECS: entropy-coded segment (a sequence of entropy-coded MCUs)
RST: restart marker, optional (not used after the last ECS)

Before SOF and/or SOS, any number of the following markers (including none) may appear, in any order:

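-  HT (DHT): Huffman tables
-  QT (DQT): quantization tables
-  DRI: restart interval
-  DAC: arithmetic coding conditioning
-  COM: comment
-  APP (APPn): application-specific data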

Markers

[FF] FF xx

[FF] is an optional fill byte; any number of them may precede a marker. xx is the marker code (never 00 or FF).
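
The marker rule above can be sketched as a small segment walker. This is only an illustration, not a full parser: the name iter_segments is made up here, and the code assumes the usual T.81 layout in which every marker except SOI, EOI, RSTm and TEM is followed by a two-byte big-endian length that counts itself. It stops after the SOS header, because entropy-coded data follows.

```python
import struct

# Markers that stand alone, without a length field: TEM, SOI, EOI, RST0..RST7.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def iter_segments(data: bytes):
    """Yield (marker_code, payload) for each marker up to and including SOS."""
    pos = 0
    while pos < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xFF at offset {pos}")
        # Skip the FF prefix and any optional FF fill bytes before the marker code.
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1
        if pos >= len(data):
            raise ValueError("truncated marker")
        marker = data[pos]
        pos += 1
        if marker in STANDALONE:
            yield marker, b""
            continue
        # The length is big-endian and includes its own two bytes.
        (length,) = struct.unpack_from(">H", data, pos)
        yield marker, data[pos + 2 : pos + length]
        pos += length
        if marker == 0xDA:  # SOS: scan data follows, stop walking here
            return
```

Calling list(iter_segments(b"\xff\xd8\xff\xff\xdd\x00\x04\x00\x05")) yields the SOI marker with an empty payload and then the DRI marker with the payload 00 05; the extra FF before DD is skipped as a fill byte.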

Scan data

Huffman- or arithmetic-coded data. It is a bitstream made of one or more ECSs, with optional restart markers between them and stuffed zero bytes inside them. "Entropy coded segments are always followed by a marker."


.. .. .. .. .. FF 00 .. .. .. .. FF Dx .. .. .. .. FF 00 FF Dx .. .. .. .. .. .. .. FF xx  
                 |                 |                 |     |                          |
               stuffed            Restart        stuffed  Restart                   Marker
                zero              Marker          zero    Marker                 (end of ECS)
                byte                              byte  

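Stuffed zero byte: the encoder inserts 00 after every FF byte occurring in the ECS, so that entropy-coded data is never mistaken for a marker. The decoder discards the 00.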
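
A minimal sketch of how a decoder might split scan data into ECSs while removing the stuffed bytes. The function name split_scan and its return shape are assumptions made for this example; the input is taken to start right after the SOS header.

```python
def split_scan(data: bytes, pos: int = 0):
    """Split scan data into unstuffed ECS byte strings.

    Returns (segments, restart_numbers, end_pos); end_pos is the offset of
    the FF that immediately precedes the marker code ending the scan.
    """
    segments, restart_numbers = [], []
    current = bytearray()
    while pos < len(data):
        byte = data[pos]
        if byte != 0xFF:
            current.append(byte)
            pos += 1
            continue
        if pos + 1 >= len(data):
            raise ValueError("truncated scan data")
        nxt = data[pos + 1]
        if nxt == 0x00:
            # FF 00 stands for a single data byte FF; drop the stuffed zero.
            current.append(0xFF)
            pos += 2
        elif nxt == 0xFF:
            # Fill byte in front of a marker: skip it.
            pos += 1
        elif 0xD0 <= nxt <= 0xD7:
            # RSTm closes the current ECS and opens the next one.
            segments.append(bytes(current))
            current = bytearray()
            restart_numbers.append(nxt - 0xD0)
            pos += 2
        else:
            # Any other marker ends the last ECS of the scan.
            segments.append(bytes(current))
            return segments, restart_numbers, pos
    raise ValueError("scan data is not terminated by a marker")
```

For the byte sequence 11 FF 00 22 FF D0 33 FF 00 FF D1 44 FF D9 this gives three segments (11 FF 22, 33 FF, 44), the restart numbers 0 and 1, and an end offset of 12.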

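Markers are byte-aligned. If an ECS ends in the middle of a byte, the remaining bits are filled with 1-bits; if the padded byte then equals FF, a stuffed 00 follows it before the marker.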

                  ______RST________
         xxxxx111 11111111 ...Dx...
           A7        FF       Dx

                  ______RST________
xxxxx111 00000000 11111111 ...Dx...
  FF       00        FF       Dx
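
The two cases drawn above (a partial byte padded with 1-bits, and a padded byte that becomes FF and needs a stuffed zero) can be reproduced with a toy bit writer. BitWriter and its method names are hypothetical and exist only for this sketch; a real encoder would write Huffman codes through write().

```python
class BitWriter:
    """Toy MSB-first bit writer with JPEG byte stuffing and 1-bit padding."""

    def __init__(self):
        self.out = bytearray()
        self.acc = 0    # pending bits, most significant bit first
        self.nbits = 0  # number of pending bits (0..7)

    def _emit(self, byte: int) -> None:
        self.out.append(byte)
        if byte == 0xFF:
            self.out.append(0x00)  # stuffed zero byte after every data FF

    def write(self, value: int, length: int) -> None:
        """Append the lowest `length` bits of `value`."""
        for i in range(length - 1, -1, -1):
            self.acc = (self.acc << 1) | ((value >> i) & 1)
            self.nbits += 1
            if self.nbits == 8:
                self._emit(self.acc)
                self.acc, self.nbits = 0, 0

    def align(self) -> None:
        """Fill the last partial byte with 1-bits."""
        if self.nbits:
            pad = 8 - self.nbits
            self._emit((self.acc << pad) | ((1 << pad) - 1))
            self.acc, self.nbits = 0, 0

    def restart(self, m: int) -> None:
        """Byte-align, then write RSTm (FF D0 .. FF D7)."""
        self.align()
        self.out += bytes([0xFF, 0xD0 + (m & 7)])
```

Writing the five bits 10100 and then restart(0) produces A7 FF D0. Writing 11111 and then restart(1) produces FF 00 FF D1, because the padded byte is itself FF.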

Restart Markers

Restart markers are optional.

The idea: "The encoder outputs the restart markers, intermixed with the entropy-coded data at regular restart intervals to isolate entropy-coded data segments". "If the compressed image data is non-interleaved, the MCU is defined to be one data unit."

                  Ri
            Restart interval
          ___________________
         |                   |
.. .. RST MCU MCU MCU MCU MCU RST .. ..           <--- scan

The restart interval Ri is defined in the DRI segment and equals the number of MCUs coded before each restart marker. There is no restart marker at the end of the scan.

DRI: define restart interval marker

+---+---+ +---+---+ +---+---+ 
|  DRI  | |  Lr=4 | |   Ri  |   Ri: restart interval = number of MCUs (16 bits)
+---+---+ +---+---+ +---+---+
                                Ri=0 disables restart intervals for the following scans
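
As a small illustration of the layout above, the whole DRI segment is three big-endian 16-bit values. The helper name parse_dri is an assumption of this sketch; FF DD is the DRI marker code.

```python
import struct


def parse_dri(segment: bytes) -> int:
    """Return Ri from a 6-byte DRI segment: marker, Lr, Ri."""
    marker, lr, ri = struct.unpack(">HHH", segment[:6])
    if marker != 0xFFDD:
        raise ValueError("not a DRI marker")
    if lr != 4:
        raise ValueError(f"unexpected DRI length {lr}, expected 4")
    return ri  # 0 means restart intervals are disabled
```

For example, parse_dri(bytes.fromhex("ffdd00040005")) returns 5.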

Example Ri=5:

 _______________________ _______________________ _______________________ 
|                       |                       |                       |
 MCU MCU MCU MCU MCU RST MCU MCU MCU MCU MCU RST MCU MCU MCU MCU MCU RST MCU MCU MCU     <--- scan (Ri=5)

Example Ri=0 (default):

 MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU MCU                 <--- scan (Ri disabled or Ri=0)

RSTm: restart marker number m; it terminates a restart interval. m counts from 0 to 7 and then wraps around (FF D0 .. FF D7).

      Ri MCU            Ri MCU              Ri MCU              last
 ________________    _____________    ______________________   _______
|                |  |             |  |                      | |       |
.....ECS......RST0  .....ECS...RST1  .....ECS............RST2 ...ECS...          <--- scan restart is enabled
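
The layout of a scan with restart intervals can be computed from the MCU count and Ri. The sketch below uses a made-up helper, restart_layout, and relies on the restart number m cycling modulo 8; it lists for each ECS how many MCUs it holds and which RSTm follows it, with None after the last ECS.

```python
def restart_layout(total_mcus: int, ri: int):
    """Return a list of (mcu_count, rst_number_or_None), one entry per ECS."""
    if ri == 0:
        # Restart disabled: the whole scan is a single ECS.
        return [(total_mcus, None)]
    n_intervals = -(-total_mcus // ri)  # ceiling division
    layout = []
    for i in range(n_intervals):
        count = min(ri, total_mcus - i * ri)
        is_last = i == n_intervals - 1
        # No restart marker follows the last ECS of the scan.
        layout.append((count, None if is_last else i % 8))
    return layout
```

With 18 MCUs and Ri=5, as in the example above, the result is [(5, 0), (5, 1), (5, 2), (3, None)]; with Ri=0 it is [(18, None)].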