Expressions

Operands often contain expressions, which are evaluated into a single value. Internally all values are 32 bit integers, no matter what the operand requires. An expression consists of at least one element. Every additional element must be separated from the previous one by an operator.
Please note that the SB-Assembler, like most microprocessors and microcontrollers, can only handle integer calculations.

The following operations can be used in expressions:

+    Addition
-    Subtraction or change sign
*    Multiplication
/    Division
\    Modulo (remainder after integer division)
&    Bitwise AND operation
^    Bitwise OR operation
|    Bitwise OR operation, same as ^ (only in Version 3)
!    Bitwise EXOR operation
<<   Bitwise shift left operation (as of Version 3.02)
>>   Bitwise shift right operation (as of Version 3.02)
=    Logical equality (a = b returns -1 if a is equal to b, or gives 0 if a is not equal to b)
<    Logical less than (a < b returns -1 if a is less than b, or gives 0 otherwise)
>    Logical greater than (a > b returns -1 if a is greater than b, or gives 0 otherwise)
<=   Logical less than or equal to (a <= b returns -1 if a is less than or equal to b, or gives 0 otherwise)
>=   Logical greater than or equal to (a >= b returns -1 if a is greater than or equal to b, or gives 0 otherwise)
<>   Logical unequal (a <> b returns -1 if a is not equal to b, or gives 0 otherwise)
!=   Logical unequal, the same as <> (only in Version 3)
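
The comparison operators return -1 rather than 1 for a true result. In 32 bit two's complement -1 has all bits set, so such a result can be fed straight into the bitwise operators. The Python sketch below only models that arithmetic; the names `MASK32` and `logical` are made up for this illustration and are not part of the SB-Assembler.

```python
MASK32 = 0xFFFFFFFF  # every value is kept in 32 bits

def logical(condition):
    """Return -1 for true and 0 for false, like the comparison operators."""
    return -1 if condition else 0

true_bits = logical(5 < 7) & MASK32    # view the result as a 32 bit pattern
false_bits = logical(5 > 7) & MASK32

print(hex(true_bits))    # 0xffffffff
print(hex(false_bits))   # 0x0

# Because true is all ones, ANDing with it keeps a value unchanged.
print(0x1234 & logical(3 == 3))  # 4660
```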

All expressions are evaluated from left to right. Multiplication and division get no priority over addition and subtraction, unlike in normal math. Parentheses cannot be used to change the priority in expressions, because they may serve a completely different purpose in some Crosses. I personally never ran into problems with the limitations of the math functions provided by the SB-Assembler. If you do, I think you should rethink your calculation. Remember: this is an assembler, not a spreadsheet program!

Overflows in expressions are ignored and the result is always truncated to 32 bits. The SB-Assembler makes no distinction between signed and unsigned integers. It's up to the program(mer) to interpret the results of calculations as either being signed or unsigned. You can, of course, use negative numbers. Entering -126 in an 8 bit number will result in the binary value %1000.0010, just as if you had used the positive value 130.
Expressions, like all other operands handled by the SB-Assembler, may not contain any spaces. The only exception is a space that follows a single or double quote delimiter to represent the ASCII space character.
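
The truncation described above can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not the assembler's own code; the helper name `truncate` is hypothetical.

```python
def truncate(value, bits=32):
    """Keep only the lowest `bits` bits of value (two's complement wrap-around)."""
    return value & ((1 << bits) - 1)

# -126 and 130 give the same 8 bit pattern, %1000.0010.
print(format(truncate(-126, 8), "08b"))  # 10000010
print(format(truncate(130, 8), "08b"))   # 10000010

# An overflowing multiplication silently wraps around at 32 bits.
print(hex(truncate(0x10000 * 0x10000)))  # 0x0
```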

Only Version 3.02.00 and later versions of the SB-Assembler support the bitwise shift operations << and >>.
Older versions of the SB-Assembler can use the multiplication or division operators to achieve the same results. Shifting 4 bits to the left is exactly the same as multiplying by 16, whereas shifting 5 bits to the right is the same as dividing by 32.
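
The equivalence is easy to check for non-negative values. The Python lines below are just a sanity check of the arithmetic, with an arbitrary sample value; they say nothing about how the assembler treats negative operands.

```python
value = 0x00001234   # arbitrary non-negative sample value

# Shifting left by 4 bits equals multiplying by 16 (2 to the power of 4).
left = value << 4
assert left == value * 16

# Shifting right by 5 bits equals integer division by 32 (2 to the power of 5).
right = value >> 5
assert right == value // 32

print(hex(left), hex(right))  # 0x12340 0x91
```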

As of Version 3.03.03 the SB-Assembler also supports the negate operator ~. This operator can only be used at the beginning of an expression and negates the final result of that expression. Every 0 of the result becomes a 1, and every 1 becomes a 0. It is therefore a bitwise invert operator.
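
Because ~ acts on the final result, it inverts the whole expression and not just the first element. A small Python model of that behaviour follows; the function name `invert` and the value given to `MASK` are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF

def invert(result):
    """Flip every bit of a 32 bit expression result."""
    return ~result & MASK32

MASK = 0x0000000F             # hypothetical label value
print(hex(invert(MASK)))      # 0xfffffff0

# The operator applies to the final result, so ~1+2 inverts 3, not 1.
print(hex(invert(1 + 2)))     # 0xfffffffc
```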

Some examples of expressions:

        MOV  DPTR,#INDEX*2+OFFSET   (index*2)+offset
        MOV  DPTR,#OFFSET+INDEX*2   (offset+index)*2
        .DA  ADDRESS/$100           Integer division
NEXT    .EQ  LABEL+2
        .DA  7/8*100                (7/8)*100=0   wrong result
        .DA  100*7/8                (100*7)/8=87  better result
        .DO  $>=$1000
N_MASK  .EQ  ~MASK                  Negate MASK = bitwise invert

Please note that the order of the elements in an expression is very important. The SB-Assembler evaluates all expressions from left to right. That is why 7/8*100 results in 0: 7/8 is 0.875, which the integer division truncates to 0, and 0*100 remains 0. You'll get a much better result by rewriting the expression as 100*7/8, which gives 700/8=87. The exact answer of 87.5 is still truncated to an integer though.
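
To see the left to right rule at work, here is a minimal Python sketch of such an evaluator. It is an illustration only, not the SB-Assembler's parser: it handles nothing but unsigned decimal elements and the four operators + - * /, and the function name `evaluate` is made up. Negative intermediate results are outside its scope.

```python
import re

MASK32 = 0xFFFFFFFF

def evaluate(expression):
    """Evaluate decimal numbers joined by + - * / strictly from left to right."""
    tokens = re.findall(r"\d+|[-+*/]", expression)
    result = int(tokens[0])
    # Walk over (operator, operand) pairs; no operator gets priority.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        number = int(operand)
        if op == "+":
            result += number
        elif op == "-":
            result -= number
        elif op == "*":
            result *= number
        else:
            result //= number   # integer division, the remainder is dropped
    return result & MASK32      # overflow is ignored, keep 32 bits

print(evaluate("7/8*100"))   # 0
print(evaluate("100*7/8"))   # 87
print(evaluate("2+3*4"))     # 20, not 14
```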

Elements Of An Expression

An element is:

  • A number in any of the available radixes.
  • A label's value.
  • The current location value.
  • The pass identifier.

Elements are internally represented by a 32 bit value. Every element may be preceded by one of the two sign characters + or -. The + sign is optional and does not change the polarity of an element at all.
The element is made negative when it is preceded by the - sign. The most negative number that fits in 32 bits is -2147483648. The assembler will accept larger negative numbers as well, but they end up as positive numbers, because their value is truncated to 32 bits.
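
The wrap-around at the signed limit can be checked with a short Python sketch. The helper `as_signed32` is hypothetical; it only shows how a 32 bit pattern reads when interpreted as a signed number.

```python
def as_signed32(value):
    """Truncate to 32 bits and read the result as a signed number."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

print(as_signed32(-2147483648))  # -2147483648, the most negative 32 bit value
print(as_signed32(-2147483649))  # 2147483647, wrapped around to positive
```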

Immediate Operand Prefixes

A total of 4 different prefixes are defined to indicate immediate operands.

The most common of the 4 prefixes is the # symbol. Most processor families use this prefix to specify immediate operands. The other 3 prefixes are not as common and may even be unique to the SB-Assembler. All 4 prefixes have in common that the immediate addressing mode will be used: the value following the prefix is used literally by the instruction.

Examples:

        MOV  A,#10
        ANL  A,#%1111.000

Some processor families don't use the immediate prefix at all. In those cases the use of the # prefix will be optional in the corresponding SB-Assembler Crosses.

# Use the least significant bits that fit the operand size used by the instruction. This means that only the least significant n bits of the 32 bit value are used if an n bit operand is required.
/ Has the same effect as if the # symbol was used, but the 32 bit value is shifted 8 bits to the right first. Bits falling out of the 32 bit value are discarded and 0 bits are shifted in from the left to fill the operand if needed.
= Has the same effect as if the # symbol was used, but the 32 bit value is shifted 16 bits to the right first. Bits falling out of the 32 bit value are discarded and 0 bits are shifted in from the left to fill the operand if needed.
\ Has the same effect as if the # symbol was used, but the 32 bit value is shifted 24 bits to the right first. Bits falling out of the 32 bit value are discarded and 0 bits are shifted in from the left to fill the operand if needed.

In Version 3 of the SB-Assembler the shifted values are sign extended. This means that instead of shifting in zeroes from the left, the most significant bit is repeated over the shift length. This way the sign of the value doesn't change.

Examples:

        .CR  8051

        MOV  A,#$12345678      Uses only lowest byte $78
        MOV  DPTR,#$12345678   Uses lowest 16 bits $5678

        MOV  A,/$12345678      Uses only 2nd lowest byte $56
        MOV  DPTR,/$12345678   Uses bytes 2 and 3 $3456

        MOV  A,=$12345678      Uses only 2nd highest byte $34
        MOV  DPTR,=$12345678   Uses highest 16 bits $1234

        MOV  A,\$12345678      Uses only highest byte $12
        MOV  DPTR,\$12345678   Uses highest byte as 16 bit value $0012
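
The table of results above can be reproduced with a small Python model. This is a sketch under stated assumptions, not the assembler's implementation: the names `SHIFTS` and `immediate` are invented, and the `sign_extend` flag is merely this sketch's way of switching to the Version 3 behaviour.

```python
SHIFTS = {"#": 0, "/": 8, "=": 16, "\\": 24}   # shift count per prefix

def immediate(prefix, value, operand_bits, sign_extend=False):
    """Return the operand bits selected by an immediate prefix."""
    value &= 0xFFFFFFFF
    if sign_extend and value & 0x80000000:
        value -= 0x100000000     # read as negative so >> repeats the sign bit
    shifted = value >> SHIFTS[prefix]
    return shifted & ((1 << operand_bits) - 1)

for prefix in SHIFTS:
    print(prefix,
          format(immediate(prefix, 0x12345678, 8), "02X"),
          format(immediate(prefix, 0x12345678, 16), "04X"))
# # 78 5678
# / 56 3456
# = 34 1234
# \ 12 0012
```

The difference between the two shift styles only shows when bit 31 is set. In this model `immediate("\\", 0x80000000, 16)` gives $0080 with zero fill and $FF80 with `sign_extend=True`.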

These prefixes are very convenient when you use the immediate addressing mode of the processor. But sometimes you want the same function for other addressing modes as well. You can't use the prefixes there, because they would change the addressing mode along with it. In those situations you can divide the value by $100, $10000 or $1000000 respectively.

Radixes

We humans are very familiar with base10, the so-called decimal radix. Computers prefer base2, the so-called binary radix. Binary numbers come in very handy for defining bit masks, but in other situations they are not very useful to us humans. Normal numbers are best represented in decimal, while addresses are better represented in base16, the so-called hexadecimal radix.

Many different notations exist in the world of assemblers to identify the different radixes. The SB-Assembler initially only used the Motorola notation where every radix, except base10, is preceded by a special identifier symbol.
Many other assemblers use the Intel notation however, where every number is started by a decimal digit and different radixes have a special character following the number.
As of Version 2.04 the SB-Assembler can handle both radix notations to make it more compatible with existing assemblers. Both notations may be mixed in any way you like, because the SB-Assembler can easily tell them apart. However, the Motorola notation will remain my favourite and you'll find almost all examples on this site using it.

As of Version 3 of the SB-Assembler the C-style notation can also be used. Hexadecimal numbers for instance can also be written as 0xABCD.

Remember that all numbers are internally stored as 32 bit values. Entering larger values will cause an Overflow error to be reported.

Decimal Numbers

If an element starts with a digit from 0 to 9 it is normally interpreted as a decimal number, unless the number is followed by a B, O, Q or H, in which case the Intel radix notation is used to evaluate it.
Some assemblers allow you to add an optional "D" after a decimal number to indicate that it is indeed a decimal number. Version 2 of the SB-Assembler doesn't allow this rarely used option. Version 3 of the SB-Assembler does allow the optional trailing D.
The maximum value of a decimal number is +4294967295 or -4294967296. Larger numbers will result in an Overflow error.
The parsing of a decimal number will continue until a non-numeric character or EOL is encountered.

Examples:

1234    -100    +65535

Hexadecimal Numbers

In the Motorola notation hexadecimal numbers are preceded by a $ symbol and may contain the digits 0 to 9 and A to F. The assembler stops interpreting the number when a non HEX digit is encountered.
With the Intel notation hexadecimal numbers must start with a decimal digit and may further contain the digits 0 to 9 and A to F. The HEX number is terminated and at the same time identified by the character H. In case your HEX number starts with a digit A to F you should precede it with an extra 0 to let it start with a decimal digit, otherwise it will be treated as a label name.
Version 3 of the SB-Assembler also understands the C-style notation, in which hexadecimal numbers are preceded by the 0x characters. Interpretation of the hex number is stopped when a non-hex character is encountered.
The maximum value of hexadecimal numbers is +$FFFFFFFF or -$100000000. Larger numbers will result in an Overflow error.
The SB-Assembler as a whole is case-insensitive, so it does not matter whether you use upper or lower case characters for the digits A to F or identifiers x or H.

Using the $ symbol without a legal HEX digit following it will result in the current program location being returned.

Examples:

$ABCD    -$1234    +$ffff0000      Motorola notation
0ABCDH   -1234H    +0ffff0000h     Intel notation
0xABCD   -0x1234   +0xffff0000     C-style notation (only Version 3)

Octal Numbers

In the Motorola notation octal numbers are preceded by an @ symbol and may contain the digits 0 to 7. The assembler stops interpreting the number when a non octal digit is encountered. At least one digit should follow the @ symbol otherwise a Value expected error is reported.
With the Intel notation octal numbers may contain the digits 0 to 7 and must be followed by the letter O or Q, which terminates the number and at the same time identifies the octal radix. Preferably use the letter Q, to avoid the confusion the letter O may cause with the digit 0.
The maximum value of a 32 bit octal number is +@37777777777 or -@40000000000. Larger numbers will result in an Overflow error.

Some Crosses may limit the use of octal numbers in situations where the @ symbol can represent indirect addressing mode. In most such cases it is sufficient to add the + sign in front of the @ symbol, like in: +@123, to avoid confusion with indirect addressing mode.

Neither Version 2 nor Version 3 uses C-style notation for octal numbers.

Examples:

@12345    -@1234    +@777000    Motorola notation
12345Q    -1234Q    +777000Q    Intel notation

Binary Numbers

In the Motorola notation binary numbers are preceded by a % symbol and may contain only the digits 0 and 1. The assembler stops interpreting the number when a non binary digit is encountered. At least one digit should follow the % symbol otherwise a Value expected error is reported.
With the Intel notation binary numbers may contain the digits 0 and 1 which must be terminated by the letter B which identifies the binary radix.
The C-style notation uses the 0b prefix as binary identifier, which only works in Version 3 of the SB-Assembler.

Only in the Motorola notation may you add dots to long binary numbers to make them easier to read for us humans. A long binary number is usually very hard to read, so you can insert dots between the bits at any position. Usually binary digits are grouped in blocks of 4 bits, which makes translation to HEX easy.
The SB-Assembler ignores these dots completely and has no difficulty whatsoever reading long binary numbers.

+%1111.1111.1111.1111.1111.1111.1111.1111 and -%1.0000.0000.0000.0000.0000.0000.0000.0000 are the maximum values of a 32 bit binary number. Larger numbers will result in an Overflow error.
Please note that the dotted notation is a lot easier to read than 11111111111111111111111111111111B. Remember though that the dotted notation only works for the Motorola notation of binary numbers, using the % prefix.

Examples:

%1010                  Motorola notation
%1111.0000
%10.111.000            Weird, but legally dotted
-%11
+%111
%1010.1111.0011.1000

1010B                  Intel notation
11110000B
10111000B
-11B
+111B
1010111100111000B

0b1010                 C-style notation (only Version 3)
0b11110000
-0b11
+0b111
0b1010111100111000
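
The three notations described in the radix sections above can be told apart by their first or last character. The Python sketch below shows one way to do that. It is an assumption-laden illustration, not the SB-Assembler's parser: the name `parse_number` is made up, signs are not handled, and no Overflow check is performed.

```python
def parse_number(text):
    """Convert one unsigned number in Motorola, Intel or C-style notation."""
    t = text.upper()                  # the notations are case-insensitive
    if t.startswith("$"):
        return int(t[1:], 16)
    if t.startswith("@"):
        return int(t[1:], 8)
    if t.startswith("%"):
        return int(t[1:].replace(".", ""), 2)   # dots are only separators
    if t.startswith("0X"):
        return int(t[2:], 16)
    if t.startswith("0B") and len(t) > 2 and set(t[2:]) <= {"0", "1"}:
        return int(t[2:], 2)
    # Intel notation: the last character selects the radix.
    suffixes = {"H": 16, "O": 8, "Q": 8, "B": 2, "D": 10}
    if t[-1] in suffixes:
        return int(t[:-1], suffixes[t[-1]])
    return int(t, 10)

samples = ("$ABCD", "0ABCDH", "0xABCD", "@777", "777Q",
           "%1111.0000", "11110000B", "0b1010", "1234")
for text in samples:
    print(text, "=", parse_number(text))
```

The first three samples all print 43981, the two octal samples 511 and the two long binary samples 240.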

ASCII Numbers

ASCII values are often used in operands and expressions. The SB-Assembler knows two different ASCII identifiers.

' The single quote preceding the ASCII character will return a 7 bit ASCII value of the character, all other bits remain 0. This is called positive ASCII.
" The double quote preceding the ASCII character will return an 8 bit ASCII value of the character, with the MSB of the byte being 1. This is called negative ASCII and can be used to signal special situations, like bold text or end of string.

The ASCII character is considered to be a string literal, which is case sensitive in the SB-Assembler. So the character 'a' is a different character than 'A'.

The value of the character following the single or double quote is used. After that character the same quote type may be used to fully enclose the ASCII character. This closing quote is optional, but it should be the same type as the opening quote if it's used. To avoid confusion or error messages it is recommended to fully enclose the ASCII space character, especially if the space is located at the end of the program line. Some editors will strip off all trailing spaces on a line, killing your space that should be there.

It is perfectly legal to write ''' or """ to use ASCII values of ' and " respectively.
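
The difference between the two quote types comes down to bit 7 of the character code. A short Python model, assuming plain ASCII input and using the invented helper names `positive_ascii` and `negative_ascii`:

```python
def positive_ascii(char):
    """Value of a single-quoted character: 7 bit ASCII code, MSB clear."""
    return ord(char) & 0x7F

def negative_ascii(char):
    """Value of a double-quoted character: the same code with bit 7 set."""
    return (ord(char) & 0x7F) | 0x80

print(hex(positive_ascii("A")), hex(negative_ascii("A")))  # 0x41 0xc1
print(hex(positive_ascii("a")), hex(negative_ascii("a")))  # 0x61 0xe1
print(hex(positive_ascii(" ")), hex(negative_ascii(" ")))  # 0x20 0xa0
```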

Examples:

'a'    'A'    'a     ' '     '''
"a"    "A"    "a     " "     """

The Current Location Value

The current location value holds the program counter value PC of the first byte of the current program line. It can be used for addressing relative to the current location of the program.

In versions of the SB-Assembler prior to Version 2 only the * symbol was used to represent the current location value. But since most assemblers use the $ symbol as current location value I decided to teach the SB-Assembler that same trick too. So it doesn't matter whether you use * or $ as current location value.

To us humans it may be a bit confusing if one of the current location values is used in expressions. Consider the following silly example:

***

This means: take the current location value (1st *), multiply it (2nd *) by the current location value (3rd *), which in effect raises the current location value to the power of 2. It could also have been written as $*$ .
If the SB-Assembler expects an element (a value) the * symbol is interpreted as the current location value. If it expects an operator the * symbol is interpreted as the multiplication symbol.
The $ symbol is interpreted as the current location value only if no hexadecimal digit follows it, otherwise it is interpreted as the hexadecimal identifier.
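
The element/operator rule can be sketched as a two-state loop. The Python below is a toy model only: it accepts nothing but * and $ as elements and * + - as operators, does not handle signs or hexadecimal digits after $, and the function name `evaluate` and parameter `pc` are hypothetical.

```python
def evaluate(expression, pc):
    """Evaluate an expression built from location symbols, left to right."""
    result = None     # no element seen yet
    pending = None    # operator waiting for its right-hand element
    for char in expression:
        expecting_element = result is None or pending is not None
        if expecting_element:
            # Here * and $ both stand for the current location value.
            if char not in "*$":
                raise ValueError("element expected, got " + char)
            if pending is None:
                result = pc
            elif pending == "*":
                result *= pc
            elif pending == "+":
                result += pc
            else:
                result -= pc
            pending = None
        else:
            # Here * can only be the multiplication operator.
            if char not in "*+-":
                raise ValueError("operator expected, got " + char)
            pending = char
    if result is None or pending is not None:
        raise ValueError("expression ends unexpectedly")
    return result & 0xFFFFFFFF

print(evaluate("***", 0x10))  # 256
print(evaluate("$*$", 0x10))  # 256
```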

The following program line is the shortest possible endless loop:

        JMP  *

It can also be written as:

        JMP  $

The Pass Identifier

The pass identifier can be used to determine whether the assembler is in pass 1 or pass 2 of the assembly process. The ? symbol is used as the pass identifier element. The pass identifier will be 0 during pass 1, and 1 during pass 2.

The pass identifier is only useful during debugging when used with conditional assembly functions.