8051 Introduction

        .cr     8051        To load this cross overlay

The 8051 is probably the most popular 8-bit micro controller ever. I find it hard to find a suitable reason for its popularity, for it is by far not the best 8-bit micro controller. Just take a brief look at the capabilities of e.g. the Z8 micro controller from Zilog or the 68HC11 from Motorola and you will agree with me. But the 8051 is here and it is here to stay.

One possible explanation for the 8051's popularity is the great number of manufacturers that make an impressive number of 8051 derivatives. Many different I/O features are integrated around the 8051 core to create a micro controller which needs only very little extra hardware to do most of the jobs you can think of.

If you take a closer look at what Intel has created here you will see a number of stupid design flaws.
One of the biggest disadvantages of the standard 8051 core is that there's only one 16 bit pointer register available. Moving a block of data is a very tedious job which takes far too much data moving overhead. Some manufacturers have implemented solutions to this problem by adding more pointer registers to the basic design.
Also the lack of a decent compare instruction is a big handicap during programming.
Or have you ever tried to shift a 4 byte number n bits to the left on an 8051? Moving the operands to the Accumulator and back takes 2/3 of the total time needed to shift the value one bit to the left!
Finally I would like to emphasize the fact that the oscillator frequency of the original devices was divided by 12! Therefore a 12MHz quartz results in an instruction clock of only 1MHz. No wonder that Intel always won the MHz race!

But I'm sure the 8051 still has some good things to offer, otherwise it wouldn't be as popular as it is today.

The power of the 8051 is its flexibility when it comes to peripheral expansion. New peripherals can be added without changing the core or the instruction set. Only a few Special Function Registers are needed to add new features to the standard core.

Programming Model

The programming model in the picture below shows the most important registers of the 8051 processor. I only include a little summary about the features of the 8051's programming model here. It is not my intention to make the original documentation obsolete, so please refer to the original documentation for further details. You can also find lots of useful information at www.8052.com.

8051 programming model

The Accumulator

The Accumulator is the most important register for 8 bit arithmetic operations. Its standard name is A, which is a reserved word.
The Accumulator can also be addressed via SFR address $E0. It is common practice to call this address ACC. Remember that the SB-Assembler does not automatically assign SFR names, so you'll have to declare ACC yourself. ACC is a bit addressable SFR.

The Program Status Word

The PSW register is addressable via SFR address $D0, which is also bit addressable. The SB-Assembler does not automatically assign the name PSW or any bit names it contains. All the names can be assigned using normal labels. A bit address is simply the sum of the base address $D0 and the bit number. E.g. the bit address of CY is $D0 + 7 = $D7

The PSW contains 8 system flags:

Bit 7   CY    Carry flag
Bit 6   ACY   Auxiliary Carry
Bit 5   F0    General purpose Flag
Bit 4   RS1   Register bank select Flag
Bit 3   RS0   Register bank select Flag
Bit 2   OV    Overflow Flag
Bit 1   F1    General purpose Flag
Bit 0   P     Parity Flag
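
The bit layout above can be checked with a few lines of Python. This is only an illustrative sketch and not part of the SB-Assembler package; the names PSW_BASE, PSW_BITS, psw_bit_address and selected_bank are made up for this example. It derives each flag's bit address as the base address $D0 plus the bit number, and reads the selected register bank from the two bank select bits.

```python
PSW_BASE = 0xD0  # SFR address of the PSW, as given in the text

# Flag names by bit number, copied from the table above.
PSW_BITS = {7: "CY", 6: "ACY", 5: "F0", 4: "RS1", 3: "RS0", 2: "OV", 1: "F1", 0: "P"}


def psw_bit_address(bit):
    """Bit address of a PSW flag: base address plus bit number."""
    if not 0 <= bit <= 7:
        raise ValueError("bit number must be 0..7")
    return PSW_BASE + bit


def selected_bank(psw):
    """Register bank (0..3) selected by bits 4 and 3 of a PSW value."""
    return (psw >> 3) & 0b11


# Print every flag name with its bit address in the $xx notation of this manual.
for bit in sorted(PSW_BITS, reverse=True):
    print(f"{PSW_BITS[bit]:<4} ${psw_bit_address(bit):02X}")
```

In an SB-Assembler source file the same addresses would simply be assigned to labels of your own choosing.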

Register R0 ... R7

8 registers assist the 8051 as intermediate registers during calculations. The registers are part of internal RAM memory.
Registers can be addressed with only 3 bits instead of the 8 bits needed to address one of the regular internal RAM addresses. There are a total of 4 banks of 8 registers to accomplish fast context switching during interrupts.
The register banks are mapped in the lower address range of the internal RAM.

Registers R0 and R1 can also be used as pointers during indirect addressing.

Stack Pointer

The Stack Pointer SP is addressable via SFR address $81. Again it's the programmer's responsibility to assign a name to that location because the SB-Assembler does not assign one by itself.
The 8-bit stack pointer contains the internal RAM address at which the last byte was pushed onto the stack. This is also the address of the next byte that will be popped. The SP is incremented before a PUSH, or decremented after a POP. Please note that the stack on an 8051 grows up, unlike with most other processors.
The stack pointer SP is initialized to $07 after reset, making internal RAM address $08 the first address of the stack.

A CALL to a subroutine or an interrupt will push the low byte of PC on the stack first. Please note that this adds to the endian model confusion, for return addresses are stored in little endian while the address field after an instruction is stored in big endian! Interrupts will only push the PC on the stack. The RET and RETI instructions will pop the high byte of the PC from the stack first. It's the programmer's responsibility to save all other affected resources before interrupt servicing and restore them when the interrupt is done.
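
The stack behaviour described above can be modelled in a few lines of Python. This is a toy sketch for illustration only; the class name Stack8051 and its methods are hypothetical, and it assumes 256 bytes of internal RAM. It shows SP being incremented before a byte is stored, decremented after a byte is read, and the low byte of the return address being pushed first.

```python
class Stack8051:
    """Toy model of the 8051 stack in internal RAM (illustration only)."""

    def __init__(self):
        self.ram = bytearray(256)  # assumption: 256 bytes of internal RAM
        self.sp = 0x07             # value of SP after reset

    def push(self, value):
        self.sp = (self.sp + 1) & 0xFF   # SP is incremented first
        self.ram[self.sp] = value & 0xFF

    def pop(self):
        value = self.ram[self.sp]
        self.sp = (self.sp - 1) & 0xFF   # SP is decremented afterwards
        return value

    def call(self, return_address):
        self.push(return_address & 0xFF)         # low byte of PC first
        self.push((return_address >> 8) & 0xFF)  # then the high byte

    def ret(self):
        high = self.pop()                        # high byte comes off first
        low = self.pop()
        return (high << 8) | low


stack = Stack8051()
stack.call(0x1234)
print(f"SP=${stack.sp:02X} RAM[$08]=${stack.ram[0x08]:02X} RAM[$09]=${stack.ram[0x09]:02X}")
print(f"Return address ${stack.ret():04X}")
```

After the call the low byte sits at $08 and the high byte at $09, which is the little endian ordering mentioned above.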

Data PoinTeR

The data pointer DPTR is used to indirectly address data in ROM or external RAM. Therefore the pointer must be 16 bits wide. The Data Pointer DPTR is the 8051's only user-accessible 16-bit register. It is split into two 8-bit registers in the SFR space as DPL (low order byte) and DPH (high order byte). The name DPTR is reserved and can be used by some instructions to identify a single 16-bit register (e.g. MOV DPTR,#data). The names DPL and DPH are not pre-defined and must be defined by the programmer like all other SFR registers.

The Program Counter

The program counter PC is normally incremented after fetching each instruction or operand byte during program execution. The only way you can change this behaviour is with the jump, call and return instructions. Also interrupts can change the program counter's value.
Unlike the other internal registers, the PC is not available as an SFR.

Timing

SB-Assembler Version 3 can show you the cycle times of each instruction when the TON list flag is switched on. All times given are the number of T states an instruction takes to execute. On the real 8051 a T state takes 12 oscillator clock cycles. Other derivatives may need fewer clock cycles per T state.
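
To turn a T state count from the listing into real time you need the oscillator frequency and the number of clocks per T state. The Python sketch below shows the arithmetic; the function name instruction_time_us and its parameters are made up for this example, and the default of 12 clocks per T state only applies to the original 8051.

```python
def instruction_time_us(t_states, osc_hz, clocks_per_t_state=12):
    """Execution time in microseconds for an instruction of t_states T states.

    clocks_per_t_state is 12 on the original 8051; pass another value for
    derivatives that need fewer clocks (check the derivative's data sheet).
    """
    return t_states * clocks_per_t_state * 1_000_000 / osc_hz


# A 1 T state instruction with a 12 MHz crystal.
print(instruction_time_us(1, 12_000_000))
# A 2 T state instruction with an 11.0592 MHz crystal.
print(instruction_time_us(2, 11_059_200))
```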

Reserved Words

The SB-Assembler 8051 cross overlay has a few reserved words. Reserved words are all fixed register and bit names. You had better avoid these reserved words when you assign your own labels. E.g. don't call your labels R0, or A or C.
If you do use the reserved words as label names you may expect unpredictable behaviour of the assembler sooner or later. Please note that the assembler will not warn you if you try to assign a label with a reserved name!
Reserved names can not be used in expressions, like label names can. An Undefined label error will be reported if you do try to use a reserved word in an expression because it is treated as a normal label in this case.

Here's the list of all reserved words:

A, DPTR, R0, R1, R2, R3, R4, R5, R6, R7, C

As opposed to other 8051 assemblers the SB-Assembler does not have a predefined set of SFR names. All SFR names should be assigned using normal labels. Please note that this only applies to SFR registers and not to registers like A, R0, DPTR or C. The SB-Assembler automatically uses SFR addressing when the value of the label is in the SFR address range ($80 to $FF).
Personally I see this as an advantage because you don't need a special dedicated assembler for every possible derivative of the 8051. You simply create a list of all the SFRs and their addresses using the .EQ directive. This way the SB-Assembler can be used for every possible 8051 derivative, now and in the future.
I have added two include files in the download package for the standard 8051 and 8052 processors. Simply include one of these files at the beginning of your program and you can use any of the standard SFR registers. You can easily alter one of those files to represent your own processor's features.

Special Features

Register addressing

The 8051 has 4 banks of 8 registers. These registers are simply a small portion of internal RAM that can be addressed with only 3 bits. Many instructions that use registers are only 1 byte long, resulting in compact code and fast execution time.

Selecting one of the 4 banks is done by setting or clearing the 2 bank select bits RS0 and RS1 in the PSW register. Registers are called R0 to R7 by default. The SB-Assembler allows you to give your registers more meaningful names by assigning labels to the appropriate RAM addresses. The SB-Assembler will automatically use register addressing mode whenever a label is used where register addressing is available.
This is even true for indirect registers @R0 and @R1. Indirect mode register translation is treated as a forced mode in Version 2, so no check is made to see if the register really belongs to the current register bank! In Version 3 a full range check is provided, so an Out of range error will be reported if the label can't be translated to a legal current index register.

Some 8051 instructions can use direct addressing mode, but can't use register addressing mode. It is also not possible to move the contents of one register directly to another, like in MOV R1,R2 .
But the SB-Assembler allows you to use register addressing mode in these cases anyway. Such commands are translated automatically to use direct addressing mode instead. The previous example is translated to MOV R1,$02 . In this example .RB was set to 0, otherwise the direct address would have been $0A for bank 1, or $12 for bank 2, or $1A for bank 3.

See the description of the .RB directive for more details.

Short addressing

The JMP and CALL instructions come in two different flavours on the 8051. AJMP and ACALL can be used with a short address field, while LJMP and LCALL are used with a full 16 bit address field.
Short addressing mode can be used when the upper 5 bits of the destination address are the same as the upper 5 bits of the address of the next instruction, i.e. the instruction following the AJMP or ACALL instruction.
You force the SB-Assembler to use short addressing mode if you use AJMP or ACALL. The assembler will report an Out of range error if the destination address is beyond the reach of a short address. It is the programmer's responsibility to use the appropriate addressing mode. You could always use LJMP and LCALL to be on the safe side, but that way you won't have the benefit of the shorter instructions.
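
The range rule for short addressing can be written down as a simple comparison of the upper 5 address bits. The Python sketch below illustrates the rule and the standard 2 byte AJMP encoding; it is not taken from the SB-Assembler itself, and the function names short_address_in_range and encode_ajmp are made up for this example.

```python
def short_address_in_range(instr_address, destination):
    """True if an AJMP/ACALL at instr_address can reach destination."""
    next_address = (instr_address + 2) & 0xFFFF  # AJMP and ACALL are 2 bytes long
    # The upper 5 bits must match, i.e. both lie in the same 2K block.
    return (destination & 0xF800) == (next_address & 0xF800)


def encode_ajmp(instr_address, destination):
    """Return the 2 opcode bytes of an AJMP, or raise if out of range."""
    if not short_address_in_range(instr_address, destination):
        raise ValueError("Out of range")
    # Address bits 10..8 go into the upper 3 bits of the first byte.
    return bytes([((destination >> 3) & 0xE0) | 0x01, destination & 0xFF])


# The next instruction is at $0800, so a jump forward to $0800 is fine...
print(short_address_in_range(0x07FE, 0x0800))
# ...but a jump back to $0700 crosses the 2K boundary.
print(short_address_in_range(0x07FE, 0x0700))
print(encode_ajmp(0x0100, 0x0345).hex())
```

Note the second case: the destination is close by, but because the comparison uses the address of the next instruction it is still out of reach.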

You may use the JMP or CALL instructions instead of the 8051 instructions AJMP or ACALL and let the SB-Assembler decide what addressing mode to use. The assembler will use the short addressing mode whenever it can. This is when the assembler can know for certain that the address is within range. Long addressing mode is used when the destination contains a forward referenced label because the destination address is still unknown at that time.
Please note that the original 8051 instruction JMP @A+DPTR still works, despite the new enhancements.

Bit addressing

The 8051 can be used as a boolean processor because it can do a few tricks on one-bit quantities. Bit addresses can be given in two different ways. One way is to use the sequential bit number. The other way is to use a RAM or SFR address followed by a dot and a bit number.

There are a total of 256 different bits that can be addressed, so the first notation requires one byte to identify the bit. Bit numbering starts with $00 at bit 0 of RAM address $20 and ranges up to $7F for bit 7 of RAM address $2F. Bit numbers $80 to $FF are bits coming from the bit-addressable SFR registers in a similar way.
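
The mapping from a sequential bit number back to its byte address and bit position can be sketched in Python as follows. The function name bit_number_to_location is hypothetical and only serves to illustrate the numbering described above.

```python
def bit_number_to_location(bit_number):
    """Return (byte address, bit) for a sequential bit number $00..$FF."""
    if not 0 <= bit_number <= 0xFF:
        raise ValueError("bit number must be $00..$FF")
    if bit_number < 0x80:
        # Bit numbers $00..$7F live in internal RAM $20..$2F, 8 bits per byte.
        return 0x20 + bit_number // 8, bit_number % 8
    # Bit numbers $80..$FF belong to the SFR whose address ends in 0 or 8.
    return bit_number & 0xF8, bit_number & 0x07


for number in (0x00, 0x7F, 0x80, 0xD7, 0xFF):
    address, bit = bit_number_to_location(number)
    print(f"${number:02X} -> ${address:02X}.{bit}")
```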

The second addressing mode uses the address of one of the sixteen bit-addressable internal RAM bytes or one of the sixteen bit-addressable SFRs followed by a dot and bit number. Only internal RAM addresses $20 to $2F and all SFRs ending in $x0 or $x8 are bit addressable.

Examples:

         MOV   C,$00                Bit number is addressed directly
         MOV   C,$20.0              Same result as previous line
         MOV   C,$7F                Bit number is addressed directly
         MOV   C,$2F.7              Same result as previous line
         MOV   C,$80                Use SFR bit number directly
         MOV   C,$80.0              Same result as previous line
         MOV   C,$FF                Use SFR bit number directly
         MOV   C,$F8.7              Same result as previous line
         MOV   C,LABEL.3            LABEL should evaluate to bit addressable memory
         MOV   C,SILLY.EXAMPLE.2    Label name may contain dots
         MOV   C,$20+1.2            Even arithmetic functions are allowed
         MOV   C,$21.2              Same result as previous line

The SB-Assembler Version 2 will report a Bad operand error if you try to use the .n bit addressing notation on illegal addresses. Version 3 reports a more appropriate Out of range error instead.

You may use the .n bit addressing notation whenever the assembler expects bit addressing mode. In all other situations you must be careful with the .n notation, for it will have no special meaning to the SB-Assembler.

The .n bit addressing notation will only work on lines containing 8051 instructions that expect bit addressing mode. It will not work on lines containing directives like .EQ or .DA !
Version 3 of the SB-Assembler will allow .n bit address notation for .EQ and = label declarations, but only while the 8051 cross overlay is loaded. The address given should evaluate to a legal bit address of course.

Refer to the regular 8051 documentation for a more detailed description of bit addressing.

Overlay Initialization

Three things are set while initializing the 8051 overlay every time it is loaded by the .CR directive.

  • The register pointer (the one that's changed by the .RB directive) is set to 0.
  • Big endian model is selected for .DA and .DL directives. This means that words or long words are stored with their high byte first.
  • The maximum program counter value is set to $FFFF.
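
The difference between the two endian models is easy to see in a short Python sketch. The helper name data_bytes is made up for this example; it only shows the byte order and says nothing about how the SB-Assembler implements its directives. The 4 byte size used for a long word is an assumption.

```python
def data_bytes(value, size, byteorder="big"):
    """Return the bytes of value in the order they would be stored in memory."""
    return value.to_bytes(size, byteorder)


# A 16-bit word, high byte first (the model selected by this overlay).
print(" ".join(f"{b:02X}" for b in data_bytes(0x1234, 2)))
# The same word in little endian order, for comparison.
print(" ".join(f"{b:02X}" for b in data_bytes(0x1234, 2, "little")))
# A long word, assuming 4 bytes, high byte first.
print(" ".join(f"{b:02X}" for b in data_bytes(0x12345678, 4)))
```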

A little word on the endian model I chose might be in order here. I tried to find out what Intel's decision was about the endian model when they developed the 8051.
JMP, CALL and DPTR values use the big endian model when written to memory, while the SFR registers DPL and DPH are stored with the low byte first. When you look at the ordering of the timer SFRs you see that they have completely mixed the high and low bytes of the different timers. And to make the confusion complete return addresses are stored in little endian model on the stack.
I think the way JMP, CALL and DPTR order their bytes is the most important one, so I decided to use the big endian model for the 8051.

.EQ    Changed Behaviour in Version 3

Syntax:

LABEL    .EQ  expression.n

Function:

The behaviour of the .EQ directive is slightly changed while the 8051 cross overlay is loaded. The expression may be followed by a bit number (from 0 to 7). When it does, the value should resolve into a legal bit value from 0 to 255.
When no bit number follows the expression the .EQ directive behaves as it normally does.

All of this is equally true for the = sign, which can replace the .EQ directive.

Explanation:

Legal bit addresses on the 8051 are direct RAM addresses $20 to $2F, and SFR addresses ending in 0 or 8. Official 8051 syntax allows you to take one of these registers and follow it with a dot and a bit number from 0 to 7. This finally results in a bit number from 0 to 255.
For direct RAM the formula from bit address to its bit number is (address - $20) * 8 + bit. The formula to translate an SFR address to its bit number is SFRaddress + bit.

Normally the .EQ directive doesn't support bit address translation. Now it will, as long as the 8051 cross overlay remains loaded.

Any other address than direct RAM $20 to $2F and SFR addresses ending in 0 or 8 will result in an Out of range error if the bit notation is used.
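
The two formulas and the range rule can be combined into one small Python function. This is only a sketch of the translation described above, not the assembler's actual code; the function name bit_address and the wording of its error messages are made up for this example.

```python
def bit_address(address, bit):
    """Translate the address.n notation into a bit number from 0 to 255."""
    if not 0 <= bit <= 7:
        raise ValueError("bit number must be 0..7")
    if 0x20 <= address <= 0x2F:
        # Bit addressable direct RAM.
        return (address - 0x20) * 8 + bit
    if 0x80 <= address <= 0xFF and address % 8 == 0:
        # SFR address ending in 0 or 8.
        return address + bit
    raise ValueError("Out of range: address is not bit addressable")


PSW = 0xD0
print(f"${bit_address(0x20, 0):02X}")        # $20.0
print(f"${bit_address(0x2F, 7):02X}")        # $2F.7
print(f"${bit_address(PSW, 7):02X}")         # PSW.7
print(f"${bit_address(PSW + 0x10, 2):02X}")  # PSW+$10.2
```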

Examples:

00D0-         PSW    .EQ  $D0        A bit addressable SFR (ending in 0)
0000-
0000-         FLAG1  .EQ  $20.0      Just a flag in bit addressable RAM
007F-         FLAG2  =    $2F.7      Just another flag
00D7-         CY     .EQ  PSW.7      Definition of the carry flag in PSW
00E2-         SILLY  .EQ  PSW+$10.2  Even expressions are allowed

.RB    Register Bank

Syntax:

        .RB   expression

Function:

The .RB directive is used to tell the SB-Assembler what the currently selected register bank is. Knowing this enables the SB-Assembler to select between register or direct addressing mode automatically. The SB-Assembler will use register addressing mode whenever the address and the instruction both allow it.

Explanation:

One of the features of the 8051 SB-Assembler is the ability to select the most economical addressing mode for internal RAM. You can assign any name to a RAM location within the Register address space and the SB-Assembler will use register addressing mode whenever it can. In order to do this the SB-Assembler must know which one of the 4 possible register banks is selected at the moment. You can use the .RB directive to tell the SB-Assembler what register bank is selected.
Please note that you only tell the SB-Assembler what register bank is supposed to be selected. It is the programmer's responsibility to actually set the RS1 and RS0 bits in the PSW correctly to make it really happen! The SB-Assembler has no way of checking this and is therefore unable to warn you about wrong settings. So the .RB directive is in no way a substitute for the bit toggling which is required to select the proper register bank on the processor!

The expression must evaluate to a value from 0 to 3 and may not contain forward referenced labels. Other values will result in an Out of range error.
By default (after loading or reloading the 8051 cross overlay) the selected register bank is 0. This is in sync with the 8051 itself, where register bank 0 is also the default selection after reset.

Please note that you can use a complete expression to indicate a direct address, even if it is translated into a register automatically. However it is not possible to use a register name in any expression, because register names are not real labels.
The SB-Assembler will use the direct addressing mode if the address expression contains a forward referenced label.

The selected register bank also affects the translation from register addressing mode to direct addressing mode for those instructions that do not support register addressing mode. The translation uses the calculation RB*8+Rn, which means that the selected register bank is multiplied by 8 and added to the source register number.
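
The calculation RB*8+Rn and its reverse can be illustrated in Python. The function names register_to_direct and direct_to_register are hypothetical; the sketch only mirrors the arithmetic described above and is not the assembler's own code.

```python
def register_to_direct(bank, rn):
    """Direct RAM address of register Rn in the given bank: RB*8+Rn."""
    if not 0 <= bank <= 3:
        raise ValueError("register bank must be 0..3")
    if not 0 <= rn <= 7:
        raise ValueError("register number must be 0..7")
    return bank * 8 + rn


def direct_to_register(bank, address):
    """Return n if address is Rn of the given bank, otherwise None."""
    if not 0 <= bank <= 3:
        raise ValueError("register bank must be 0..3")
    if bank * 8 <= address <= bank * 8 + 7:
        return address - bank * 8
    return None


# R2 in each of the 4 banks.
for bank in range(4):
    print(f"bank {bank}: R2 = ${register_to_direct(bank, 2):02X}")

# With bank 1 selected, $0A is R2 but $02 is not a register of that bank.
print(direct_to_register(1, 0x0A), direct_to_register(1, 0x02))
```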

You can force register addressing mode by placing the < symbol before the address. Or you can force direct addressing mode by placing the > symbol before the address. You can't override the assembler's default addressing mode this way for indirect addressing mode @Ri though.

Examples:

00D0-           PSW     .EQ    $D0          Define PSW
0000-                   .RB    0            Assume register bank 0 is in use
0000-53 D0 E7           ANL    PSW,#%1110.0111   Select register bank 0
0003-EA                 MOV    A,$02        Is automatically translated to R2
0004-EA                 MOV    A,R2         Is R2
0005-E5 0A              MOV    A,$0A        Can't be translated to any register
0007-
0007-                   .RB    1            Assume register bank 1 is in use
0007-43 D0 08           ORL    PSW,#%0000.1000   Select register bank 1
000A-53 D0 EF           ANL    PSW,#%1110.1111
000D-E5 02              MOV    A,$02        Can't be translated to any register
000F-EA                 MOV    A,R2         Is R2
0010-EA                 MOV    A,$0A        Is automatically translated to R2
0011-
0011-A9 0A              MOV    R1,R2        R2 is translated to direct address
0013-C0 0A              PUSH   R2           Normally not possible, R2 translated
0015-52 0A              ANL    R2,A         Normally not possible, R2 translated
0017-
0017-EA                 MOV    A,<$12       Force the use of register addressing
0018-E5 0A              MOV    A,>$0A       Force direct addressing

Please note that the assembler would still generate the same code when the ORL and ANL instructions at addresses 0007 and 000A were omitted. But in real life R2 would point to RAM address $02 in the instruction at address 0010, and not the intended address $0A!

The last 2 lines show an example of forced addressing modes. RAM address $12 is actually in register bank 2, while the SB-Assembler is set to bank 1 at this point. Placing a < symbol in front of the address forces the assembler to use register addressing.
The example in the last program line shows the opposite situation. Here the address $0A is within the selected register bank and without intervention the SB-Assembler would have selected register addressing mode. But we force direct addressing mode by placing a > in front of the address.

Differences From Other Assemblers

There are some differences between the SB-Assembler and other assemblers for the 8051 processors. These differences require you to adapt existing source files before they can be assembled by the SB-Assembler. This is not too difficult though and is the (small) price you have to pay for having a very universal cross assembler.

  • For the SB-Assembler JMP and CALL instructions are added to give a smart solution for selecting between AJMP and LJMP or ACALL and LCALL.
  • Automatic selection of the shortest addressing mode when RAM memory is within the currently selected register bank.
  • No SFR names are defined per default. It's the programmer's responsibility to declare SFR names as normal labels.
  • The obvious differences in notation of directives common to all SB-Assembler crosses.
  • Don't forget that the SB-Assembler does not allow spaces in or between operands. Only Version 3 will allow one space after each comma separating operands in the operand field.