Intel HEX format

Intel HEX is one of the oldest file formats still in use, and even newcomers on the market adopt it. As a result, almost every development system and tool supports it.
Originally the Intel HEX format was designed for a 16 bit address range (64 KB). Later the file format was extended to accommodate larger files, first with a 20 bit address range (1 MB) and then with a 32 bit address range (4 GB).

Records

All data lines are called records and each record contains the following fields:

:ccaaaarrddss

:

Every line starts with a colon (Hex value $3A).

cc

The byte-count. A 2 digit value (1 byte), counting the actual data bytes in the record.

aaaa

The address field. A 4 digit (2 byte) number representing the first address to be used by this record.

rr

Record type. A 2 digit value (1 byte) indicating the record type. This is explained later in detail.

dd

The actual data of this record. There can be 0 to 255 data bytes per record (see cc).

ss

Checksum. A 2 digit (1 byte) checksum, chosen so that cc+aaH+aaL+rr+sum(dd)+ss=0 when only the lowest 8 bits of the sum are kept.

Record Begin

Every record begins with a colon (ASCII value $3A). Records contain only ASCII characters! No spaces or tabs are allowed in a record. In fact, apart from the 1st colon and the End Of Line (EOL), no other characters than 0..9 and A..F are allowed in a record. Interpretation of a record should be case insensitive, so it does not matter if you use a..f or A..F.
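The field layout and the character rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name split_record and the dictionary keys are made up for this example, and the checksum is only extracted, not verified.

```python
import string

def split_record(line):
    """Split one Intel HEX record into its fields (checksum is not verified here)."""
    line = line.rstrip("\r\n")  # drop the End Of Line characters only
    if not line.startswith(":"):
        raise ValueError("a record must start with a colon")
    digits = line[1:]
    # Only 0..9, a..f and A..F may follow the colon (no spaces or tabs).
    if any(ch not in string.hexdigits for ch in digits):
        raise ValueError("illegal character in record")
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    raw = bytes.fromhex(digits)  # accepts upper and lower case digits
    # cc + aaaa + rr + ss make 5 bytes; the rest must match the byte count.
    if len(raw) < 5 or len(raw) != raw[0] + 5:
        raise ValueError("record length does not match the byte count")
    return {
        "count": raw[0],
        "address": (raw[1] << 8) | raw[2],  # MSB first
        "type": raw[3],
        "data": raw[4:-1],
        "checksum": raw[-1],
    }
```

Because the hex digits are converted with bytes.fromhex, a record written with a..f is treated exactly like one written with A..F.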

Byte Count

The byte count cc counts the actual data bytes in the current record. Usually records have 16 or 32 data bytes, but any number between 0 and 255 is possible.
It is not recommended to put too many data bytes in a record, because a transmission error then costs more time to recover from. Also avoid records with only a few data bytes, because the fixed overhead (colon, byte count, address, record type and checksum) becomes too heavy in comparison to the payload.
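To illustrate the trade-off, here is a small Python sketch that splits a payload into data records of a chosen size. The names make_record and data_records and the default of 16 bytes per record are assumptions made for this example; the sketch also assumes that the whole payload fits within one 16 bit address range.

```python
def make_record(address, rtype, data=b""):
    """Build one record string from a 16 bit address, a record type and data bytes."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rtype]) + bytes(data)
    checksum = (-sum(body)) & 0xFF  # 2's complement of the 1 byte sum
    return ":" + (body + bytes([checksum])).hex().upper()

def data_records(start, payload, per_record=16):
    """Split payload into type '00' records of at most per_record data bytes."""
    return [
        make_record(start + i, 0x00, payload[i:i + per_record])
        for i in range(0, len(payload), per_record)
    ]

for record in data_records(0x0000, b"Example with an address gap"):
    print(record)
```

Apart from the EOL, every record carries 11 characters of fixed overhead (the colon plus 2 byte count, 4 address, 2 record type and 2 checksum digits), which is why very short records are inefficient.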

Address field

This is the address where the first data byte of the record should be stored. After storing that data byte, the address is incremented by 1 to point to the address for the next data byte of the record. And so on, until all data bytes are stored.
The address is represented by a 4 digit hex number (2 bytes), with the MSB first.
The order of addresses in the records of a file is not important. The file may also contain address gaps to skip a portion of unused memory.
Normally the address aaaa is used to store the first data byte of the record. However this only allows us to send files with a maximum size of 64 KB. Therefore Intel designed two extra record formats with which it is possible to pre-set an Extended Segment Address or an Upper Linear Base Address.
In case of an Extended Segment Address this segment is added to the address field of the record, like in Intel 16 bit processors, to obtain a 20 bit address. This enables us to send files with a total length of 1 MB. The Extended Segment Address is a 16 bit value that is pre-set by a special record type.
The formula to calculate the target address in case of Extended Segment mode is:

  target address = segment*16+aaaa

In case of Upper Linear Base Address mode the upper 16 bits of the 32 bit address are pre-set by a special record type. In this case the address space is expanded to 32 bits, which gives us a total range of 4 GB.
The formula to calculate the target address in case of Upper Linear Base Address mode is:

  target address = ulba*65536+aaaa
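Both formulas can be written down directly in Python. The function names segment_target and linear_target are made up for this sketch; the values in the usage lines are only sample inputs.

```python
def segment_target(segment, aaaa):
    """Extended Segment mode: target address = segment*16 + aaaa."""
    return segment * 16 + aaaa

def linear_target(ulba, aaaa):
    """Upper Linear Base Address mode: target address = ulba*65536 + aaaa."""
    return ulba * 65536 + aaaa

print(hex(segment_target(0x2BC0, 0x1234)))
print(hex(linear_target(0x2BC0, 0x1234)))
```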

Record Type

There are 6 record types defined:

'00' = Data Record
'01' = End Of File Record
'02' = Extended Segment Address Record
'03' = Start Segment Address Record
'04' = Extended Linear Address Record
'05' = Start Linear Address Record
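In a program the record types listed above can be represented by an enumeration. The class name RecordType and the member names are chosen for this sketch only.

```python
from enum import IntEnum

class RecordType(IntEnum):
    """The record types, keyed by the value of the rr field."""
    DATA = 0x00
    END_OF_FILE = 0x01
    EXTENDED_SEGMENT_ADDRESS = 0x02
    START_SEGMENT_ADDRESS = 0x03
    EXTENDED_LINEAR_ADDRESS = 0x04
    START_LINEAR_ADDRESS = 0x05

# Converting the two rr digits of a record; an unknown value raises ValueError.
print(RecordType(int("04", 16)).name)
```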

Type 00

Type '00' is the main record type. The real data are sent using this record type. The 1st data byte of the record is stored in the address specified by the address field of the record (plus the pre-set Segment or Linear Base Address). After that the address is incremented and the next data byte is stored on the next address. The address in the address field is 16 bits, so a rollover from $FFFF to $0000 can occur. This will not produce a carry into the next Segment or Linear Base Address, so addressing space is wrapped back!
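The rollover behaviour described above can be sketched as follows. The function name data_addresses is hypothetical, and base stands for the pre-set value that is already converted to an address (segment*16 or Linear Base Address*65536).

```python
def data_addresses(base, aaaa, count):
    """Return the target address of every data byte in a type '00' record.

    Only the 16 bit address field is incremented; it wraps from $FFFF
    to $0000 without a carry into the base, as described in the text.
    """
    return [base + ((aaaa + i) & 0xFFFF) for i in range(count)]

# Four data bytes starting at $FFFE in the segment at $7F000.
print([hex(a) for a in data_addresses(0x7F000, 0xFFFE, 4)])
```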

Type 01

Type '01' is the End Of File record. The receiver of the file will stop waiting for new records after receiving this record. The byte count of this record must always be $00 and the address field must always be $0000. Because the contents of this record type are fixed, the checksum field is always the same ($FF).

Type 02

These records are used to pre-set the Extended Segment Address. With this segment address it is possible to send files of up to 1 MB in length. The segment address is multiplied by 16 and then added to all subsequent address fields of type '00' records to obtain the effective address. By default the Extended Segment Address is $0000, until it is specified by a type '02' record. The address field of a type '02' record must be $0000. The byte count field will be $02 (the segment address consists of 2 bytes). The data field of the type '02' record contains the actual Extended Segment Address. Bits 3..0 of this Extended Segment Address always should be 0!

Type 03

These records don't contribute to file transfers. They are used to specify the start address for Intel processors, like the 8086. So if you upload a file to an Intel based development board, the starting address of the code can be specified with this record type. This starting address will be loaded into the CS and IP registers of the processor. For normal file transfers the type '03' records can be ignored. The byte count of a type '03' record is $04, because 4 data bytes will be sent. The address field remains $0000. The data field of a type '03' record contains 4 bytes: the first 2 bytes represent the value to be loaded into CS, the last 2 bytes are the value to be loaded into IP. Bytes are sent MSB first.

Type 04

Type '04' records are used to pre-set the Linear Base Address. This 16 bit Linear Base Address, specified in the data area, is used to obtain a full 32 bit address range when combined with the address field of type '00' records. With this LBA it is possible to send files of up to 4 GB in length. The Linear Base Address is used as the upper 16 bits in the 32 bit linear address space. The lower 16 bits will come from the address field of type '00' records. By default the Linear Base Address will be $0000, until specified by a type '04' record. The address field of a type '04' record must be $0000. The byte count field will be $02 (the LBA consists of 2 bytes). The data field of the type '04' record contains the actual 2 byte Linear Base Address. MSB is sent first.
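As an illustration, the following Python sketch builds the type '04' record that has to precede data located at a given 32 bit address. The function name extended_linear_record is an assumption for this example.

```python
def extended_linear_record(address32):
    """Build the type '04' record that pre-sets the upper 16 bits of address32."""
    lba = (address32 >> 16) & 0xFFFF
    # Byte count $02, address field $0000, record type $04, LBA with MSB first.
    body = bytes([0x02, 0x00, 0x00, 0x04, lba >> 8, lba & 0xFF])
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()

print(extended_linear_record(0x2BC01234))
```

The lower 16 bits of the address ($1234 in the usage line) do not appear in this record; they go into the address field of the type '00' records that follow.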

Type 05

These records don't contribute to file transfers. They are used to specify the start address for Intel processors, like the 80386. If you upload a file to an Intel based development board, the starting address of the code can be specified with a type '05' record. This starting address will be loaded into the EIP register of the processor. For normal file transfers the type '05' records can be ignored. The byte count of a type '05' record is $04, because 4 data bytes will be sent. The address field remains $0000. The data field of a type '05' record contains the 4 byte linear 32 bit starting address to be loaded into the EIP register of the processor.
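A tool that does want to report the start address could decode the data field of type '03' and type '05' records as sketched below. The function name decode_start and the returned dictionary keys are made up for this example, and the sample data bytes are arbitrary. The sketch reads the 4 byte EIP value MSB first, like the other multi-byte values in the format.

```python
def decode_start(rtype, data):
    """Decode the 4 data bytes of a start address record (type '03' or '05')."""
    if len(data) != 4:
        raise ValueError("start address records carry exactly 4 data bytes")
    if rtype == 0x03:
        # First 2 bytes go to CS, last 2 bytes go to IP, MSB first.
        return {"CS": int.from_bytes(data[:2], "big"),
                "IP": int.from_bytes(data[2:], "big")}
    if rtype == 0x05:
        # All 4 bytes form the 32 bit linear address for EIP.
        return {"EIP": int.from_bytes(data, "big")}
    raise ValueError("not a start address record")

print(decode_start(0x03, bytes.fromhex("12343800")))
print(decode_start(0x05, bytes.fromhex("000000CD")))
```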

Data or Offset field

This field contains 0 or more data bytes. The actual number of data bytes is indicated by the byte count in the beginning of the record. The data bytes are interpreted as real "payload" data in type '00' records. In the other record types that carry data, the bytes represent pre-set address values.

Checksum Field

This field is a one byte (2 hex digits) 2's complement checksum of the entire record. To create the checksum make a 1 byte sum from all fields of the record:

  byte count + both address bytes + record type + all data bytes.

Then take the 2's complement of this sum to create the final checksum. Checking the checksum at the receiver's end is simply done by adding all bytes together including the checksum itself, discarding all carries, and the result must be $00.
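Both directions, creating and checking the checksum, fit in a few lines of Python. The function names checksum and record_ok are assumptions for this sketch.

```python
def checksum(body):
    """2's complement of the 1 byte sum of all bytes before the checksum."""
    return (-sum(body)) & 0xFF

def record_ok(line):
    """Add all bytes including the checksum; the lowest 8 bits must be $00."""
    raw = bytes.fromhex(line.strip()[1:])  # skip the leading colon
    return (sum(raw) & 0xFF) == 0

# The End Of File record: byte count $00, address $0000, record type $01.
print(format(checksum(bytes.fromhex("00000001")), "02X"))
print(record_ok(":00000001FF"))
```

Masking with 0xFF discards all carries, which is what the 1 byte sum in the description amounts to.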

Examples

:10C00000576F77212044696420796F7520726561CC
:10C010006C6C7920676F207468726F756768206137
:10C020006C6C20746869732074726F75626C652023
:10C03000746F207265616420746869732073747210
:04C040007696E67397
:00000001FF

In the example above you can see a piece of code with normal 16 bit addressing. The first 4 lines have 16 bytes of data each, which can be seen by the byte count, the first byte of each line. The 5th line has only 4 bytes because the program is at its end there.
After the byte count on each line you can see the address where the 1st data byte of that line is to be stored. The start address of the file is $C000. Remember that the address order within a file is not important.
Then the record type is given. In each data record this identifier is $00. Only in the End Of File record, the last line, is this identifier $01. Note that the address of the last line is $0000 and that there are no data bytes in this last record.
In the data records, the data bytes follow the record type.
Finally you see the checksum as the last byte of every record. If you like you can add all bytes of each line together and the 8-bit result should be $00 every time.

:100000004578616D706C65207769746820616E2039
:0B0010006164647265737320676170A7
:101000004865726520697320612067617020696E90
:1010100020746865206D656D6F727920616C6C6FEE
:06102000636174696F6E4C
:00000001FF

Here you see an example with an address gap. The first part of the program starts at address $0000. After the second record the address jumps to $1000. All data at the addresses in between remain unchanged, or are undefined. It is even possible to fill in these "blanks" later, without destroying the code presented in this example. As you can see not all lines have the same number of data bytes, which is no problem.
BTW: In both examples so far no Extended Segment or Linear Base Address was defined, so these addresses are assumed to be $0000.
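The address gap becomes visible when the example is loaded into a sparse memory image. The sketch below uses a Python dictionary that only holds the addresses that were actually written; the name load_hex is hypothetical, and the loader handles only type '00' and '01' records, which is all this example needs.

```python
EXAMPLE = """\
:100000004578616D706C65207769746820616E2039
:0B0010006164647265737320676170A7
:101000004865726520697320612067617020696E90
:1010100020746865206D656D6F727920616C6C6FEE
:06102000636174696F6E4C
:00000001FF
"""

def load_hex(text):
    """Return {address: byte} for all data bytes in a file with 16 bit addressing."""
    memory = {}
    for line in text.splitlines():
        raw = bytes.fromhex(line[1:])  # skip the leading colon
        if sum(raw) & 0xFF:
            raise ValueError("checksum error in " + line)
        count, address, rtype = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
        if rtype == 0x01:  # End Of File: stop reading
            break
        if rtype == 0x00:
            for i, value in enumerate(raw[4:4 + count]):
                memory[(address + i) & 0xFFFF] = value
    return memory

memory = load_hex(EXAMPLE)
print(len(memory), "bytes loaded")
print(0x0800 in memory)  # an address inside the gap was never written
```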

:020000022BC011
:1012340054686973207061727420697320696E2028
:0D12440061206C6F77207365676D656E74B7
:020000027F007D
:1080000054686973207061727420697320696E20EE
:108010007468652068696768207365676D656E744C
:00000001FF

In this final example I show you a piece of code with Extended Segment records in it. The first record is one of them. Here the Extended Segment address is set to $2BC0, which means that $2BC00 is added to all subsequent address fields to obtain the target address of the data. E.g. the first data byte of the 2nd record is stored at location $2BC00+$1234=$2CE34.
In the 4th record a new Extended Segment address is specified, which means that from then on all address fields are added to $7F000 to obtain the target address.
The Extended Segment records in this example can be replaced by Linear Base Address records by changing the identifier '02' into '04' and adapting the corresponding checksums. When keeping all other values the same this would result in a target address for the first byte of the second record of $2BC00000+$1234=$2BC01234.
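The address calculation for this last example can be checked with a short Python sketch. The function name data_record_targets is made up for this illustration; it tracks the pre-set base from type '02' and type '04' records and reports the target address of the first data byte of every type '00' record. Checksums are not verified here to keep the sketch short.

```python
EXAMPLE = """\
:020000022BC011
:1012340054686973207061727420697320696E2028
:0D12440061206C6F77207365676D656E74B7
:020000027F007D
:1080000054686973207061727420697320696E20EE
:108010007468652068696768207365676D656E744C
:00000001FF
"""

def data_record_targets(text):
    """Return (target address, data bytes) for every data record in the file."""
    base = 0  # both pre-set addresses default to $0000
    targets = []
    for line in text.splitlines():
        raw = bytes.fromhex(line[1:])  # skip the leading colon
        count, aaaa, rtype = raw[0], int.from_bytes(raw[1:3], "big"), raw[3]
        data = raw[4:4 + count]
        if rtype == 0x02:    # Extended Segment Address
            base = int.from_bytes(data, "big") * 16
        elif rtype == 0x04:  # Extended Linear Address
            base = int.from_bytes(data, "big") * 65536
        elif rtype == 0x00:  # Data
            targets.append((base + aaaa, data))
        elif rtype == 0x01:  # End Of File
            break
    return targets

for address, data in data_record_targets(EXAMPLE):
    print(format(address, "05X"), data.decode("ascii"))
```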