Ace Assembler

Full title	Ace Assembler
Year of release	2017
Publisher	Alan Bleasby
Producer / Author(s)	Alan Bleasby
Memory	16k
Type	Utility
Cost :	0.00
Download	Ace Assembler [CRC32 A799D7C3] Distribution Permission Allowed | Group 1
Instructions

Aass: Ace Assembler
Copyright (c) 2017 Alan Bleasby


1. Introduction

Aass is a two-pass native assembler for the Jupiter Ace. Writing an
assembler for the Forth platform presents several issues. First,
entering assembler source into Forth words and subsequently editing it
poses a problem. Secondly, assemblers written in the Forth style don't
usually conform to the mnemonic standards.

Aass sidesteps these issues by providing an integrated line editor,
enabling you to type in the assembly code at any given point in user
RAM space. Lines in the source code can be added, removed, listed and
renumbered. The assembler can inform you where the assembly code
starts and show its size, so enabling you to BSAVE it to storage.  The
important point is that the assembler/editor does not create any Forth
words; it just deals in areas of RAM. It is invoked from Forth and
returns to it when you leave the program.

Aass is a block of machine code that sits at address b000 to
cd89. This allows the separate disassembler (Adiss) to lie at address
d000 to db72. See separate documentation for Adiss.

The assembly tools are not located at the top of the address space as
my own Ace clone (Arv1) has an extra 8K of ROM from e000-ffff.


2. Hexadecimal

All addresses in this documentation are given in hexadecimal unless
stated otherwise. It is recommended that you define a Forth word 'hex'
which sets the machine to base 16 i.e.

   : hex decimal 16 base c! ;

and work in hexadecimal whenever you're using the assembler/editor.
The assembler and editor display start addresses, end addresses and
length in hexadecimal so it is very convenient to be able to drop to
the command cursor and BSAVE from/to these locations using the given
hexadecimal directly.

It is also more convenient to invoke the assembler using its
hexadecimal address rather than the equivalent decimal one.
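
As a cross-check of the addresses quoted in this documentation, the
following Python sketch converts them to decimal. It is not part of
Aass, and the helper name hex_to_decimal is made up for illustration.

```python
# Hypothetical helper: decimal equivalent of a hexadecimal address string.
def hex_to_decimal(text):
    return int(text, 16)

# Addresses quoted in this documentation.
for name, addr in [("Aass start", "b000"),
                   ("Aass end", "cd89"),
                   ("Adiss start", "d000")]:
    print(name, addr, "=", hex_to_decimal(addr), "decimal")
```

The decimal equivalent of b000 is 45056, which is rather less memorable
than the hexadecimal form.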


3. Where to put assembly source code and compiled machine code.

The assembler can produce relocating code i.e. it can compile the
machine code at an address which may not be its final desired
location. This allows some flexibility in how you arrange your
assembly code and machine code blocks in RAM. If you have RAM from
e000-ffff then that is obviously a good space for the resulting
machine code, otherwise both assembly code and machine code need to
sit below a000 (but not too close to it - see the section on labels).

Obviously you don't want the assembly code and machine code to bump
into and overwrite each other. A reasonable compromise, for a new user
and assuming an otherwise empty Ace, would be to have your source
assembly code from address 6000 upwards and instruct the assembler to
produce the machine code above the end of Forth e.g. at address 4000.
If, on assembling, you see your machine code getting perilously
close to the source code then you can just save the assembly source to
tape and then reload it at a higher address. The assembly source is
fully relocatable.
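
The check you need to make by eye can be sketched in Python as below.
This is only an illustration: the function names are invented, the
source and code figures are sample values, and only the Aass range
b000-cd89 comes from this documentation.

```python
# Regions are (start, length) pairs.
# The Aass range b000-cd89 is quoted in the introduction.
AASS = (0xB000, 0xCD89 - 0xB000 + 1)

def last_address(region):
    """Last address occupied by a (start, length) region."""
    start, length = region
    return start + length - 1

def overlaps(a, b):
    """True if two (start, length) regions share at least one address."""
    return a[0] <= last_address(b) and b[0] <= last_address(a)

# Sample figures (assumptions, not output from a real session).
source = (0x6000, 0x071D)   # assembly source, as reported by 'M'
code = (0x4000, 0x010F)     # machine code, as reported after 'A'

print("source/code clash:", overlaps(source, code))
print("source/Aass clash:", overlaps(source, AASS))
```

Remember that the label table also needs room below the assembler, and
its size is not covered by this sketch.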


4. Invoking the assembler and the main menu

Assuming you're working in hexadecimal you invoke the assembler by typing:

  b000 call

You'll be presented with the following screen:

Aass v1.0: (c) 2017 AJB



N: New Code Address
C: Code Address
L: Line Editor
M: Memory Extent
A: Assemble
X: Exit


As you'd expect, you can press the letter 'x' to exit from the program.
The six options have the following functions.

a) N - New Code Address

The New Code Address is the location at which you want to start
typing your assembly source code using the editor. As suggested above,
a suitable address is 6000. Your source code will grow upwards in
memory as it's typed. After typing 'N' you'll get the prompt:

   Hex address:

Type in your desired address using the cursor line at the bottom of the
screen. Note that the assembler is making use of the Forth QUERY LINE
statements at this point. It can therefore be abused in lots of ways;
it really is best to just type an address.

After entering the address the main menu will be redisplayed.

b) C - Code Address

The code address is similar to the new code address. It is the
address of some existing assembler source in RAM. Typically this will
be source code that you've previously saved using BSAVE and subsequently
loaded into memory. Alternatively you might have exited the assembler
to do some Forth and then re-entered it to continue editing your
source code. After typing 'C' you'll get the same prompt as for 'N' i.e.

   Hex address:

Type in your desired address using the cursor line at the bottom of the
screen. Note that the assembler is making use of the Forth QUERY LINE
statements at this point. It can therefore be abused in lots of ways;
it really is best to just type an address.

N.B. You can type in any address at the prompt and it'll be accepted.
However, if the address doesn't point to valid assembly source then you'll
get the error message 'Unrecognised code' when you subsequently try to
select the editor, memory or assemble options. Valid code is recognised
by using a magic number that is unlikely to occur elsewhere.

If you accidentally use 'N' when you intended to use 'C' then all is
not lost. You can recover your source code by returning to Forth
and using the store word (!). If such corrupted code was at address 6000
you'd type:

  1 6002 !

You should then be able to re-enter the assembler and continue editing. The
only side effect will be that the first line number will have been set
to '1' after the above command.


c) L - Line Editor

This invokes the line editor. You must either have initialised the
start of some memory (New Code Address) or selected some existing
valid source code (Code Address) to activate the editor, otherwise
you'll get an 'Unrecognised code' error.

Pressing 'L' will take you to a blank screen with an editing cursor
(inverse e) at the bottom of the screen. See the Line Editor section
for documentation. It is enough to say here that typing: quit[ENTER]
will return you to the main menu.


d) M - Memory Extent

This will show you how much memory your assembly source code is
occupying. The assembler must be pointing to valid code set up using
'N' or 'C' (above) otherwise an 'Unrecognised code' error will result.
The information is displayed on screen as follows:

Start:  6000
Length: 071D

Pressing any key will return you to the main menu.
You can use the M command to monitor whether you're short of
memory. At any point you can exit from the assembler and BSAVE the
memory. For the above example you'd type:

  6000 071d bsave mysrc

You can then optionally reinvoke the assembler and re-enter the code
address to continue working.


e) A - Assemble

The assembler must have been set up pointing to valid source code using
either the 'N' or 'C' main menu options, otherwise an 'Unrecognised code'
error will result.

The assembler will produce the machine code and, if no error
results, display the start, end and length of all machine code blocks
produced (the assembler supports multiple sections). Example output
for a tiny single section is:

Start End   Length
5000  500D  000E

You can use the Forth BSAVE command to save any machine code blocks
to storage e.g.

  5000 000e bsave myprog


f) X - Exit

Typing 'X' or 'Q' will return you to Forth.
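
The relationship between the Start, End and Length figures reported by
the Assemble option, and the BSAVE line that follows from them, can be
sketched in Python. This assumes that End is the address of the last
byte of the block, as the sample figures for option e) suggest. The
function names are invented for illustration.

```python
def block_length(start, end):
    """Number of bytes in a block whose end address is inclusive."""
    return end - start + 1

def bsave_line(start, length, name):
    """Format the Forth line to type, assuming the Ace is in hex mode."""
    return "%04x %04x bsave %s" % (start, length, name)

length = block_length(0x5000, 0x500D)
print(bsave_line(0x5000, length, "myprog"))
```

For the sample block at 5000 this reproduces the BSAVE line given under
option e).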





5. The Line Editor


On entering the line editor you'll be presented with the edit cursor
(inverse e) at the bottom of a blank screen.

There are two kinds of thing you can enter at the command prompt:
   i) Source code lines
  ii) Editor commands

For both of these the following Ace editing keys work as expected.

  Shift-1 : Delete line
  Shift-5 : Move cursor left
  Shift-8 : Move cursor right
  Shift-0 : Delete character

N.B. If you enter any kind of invalid input (e.g. a source code line without
a line number or a misspelled command) then nothing will happen when you
type [ENTER]: you will need to correct the error using the editing keys.
Lines are a maximum length of 31 characters.

a) Source code lines: These have the form:
       LineNumber  mnemonic

   For example, the program below puts hex 1234 on the Forth stack when
   invoked using:  4000 call

       10 org $4000
       20 ld de,$1234
       30 rst $10
       40 jp iy   ; See assembler documentation

   After you press [ENTER], a valid line will be added to memory after
   being cleaned of unnecessary characters (e.g. excess spaces). Lines
   do not need to be entered in numerical sequence. You can add new
   lines between existing lines. Entering a line using an existing
   number will replace the original line. Entering a line number on
   its own will delete any corresponding line.

b) Editor commands
   All editor commands can be abbreviated as long as there is no conflict
   with another command.

   i) QUIT - Return to main menu. The minimum abbreviation is q.

   ii) RENU - Renumber the source code. This is useful for tidying
       the source code or for making space between lines. By default
       the command will renumber starting at line 10 and increment
       by 10 each time. The full syntax of the command is:

           RENU [increment] [starting line number]

       The minimum abbreviation is r.

    iii) LIST - list the source code

       LIST will list the entire source code. Scrolling will
       pause if the screen gets full. At that point pressing
       Q or X will terminate scrolling. Pressing any
       other key will continue the scrolling.

       LIST LineNumber (e.g. list 10) will list a single line number

       LIST -LineNumber will list the source from the start to that
                        line number (e.g. list -50)

       LIST LineNumber- will list the source from the given line
                        number to the end of the source code
                        (e.g. list 100-)

       LIST LineNumber-LineNumber will list from the first to the
                        second line number (e.g. list 20-120)

       The minimum abbreviation is l.
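
The behaviour of the editor described above can be modelled in a few
lines of Python. This is a sketch of the rules only, not of how Aass
stores lines in RAM. The function names and the use of a dictionary are
assumptions, and the space clean-up is simplified (it would also alter
spaces inside a text string).

```python
def enter(lines, text):
    """Apply one editor entry 'LineNumber [statement]' to the dict lines."""
    number, _, rest = text.strip().partition(" ")
    number = int(number)
    rest = " ".join(rest.split())      # collapse excess spaces (simplified)
    if rest:
        lines[number] = rest           # add a new line or replace an old one
    else:
        lines.pop(number, None)        # a bare line number deletes the line

def renumber(lines, increment=10, start=10):
    """RENU [increment] [start]: renumber, keeping the existing order."""
    ordered = [lines[n] for n in sorted(lines)]
    return {start + i * increment: text for i, text in enumerate(ordered)}

def list_range(lines, spec=""):
    """Line numbers chosen by LIST, LIST n, LIST -n, LIST n- or LIST n-m."""
    numbers = sorted(lines)
    if not spec:
        return numbers
    if "-" not in spec:
        return [n for n in numbers if n == int(spec)]
    first, _, last = spec.partition("-")
    low = int(first) if first else 0
    high = int(last) if last else max(numbers, default=0)
    return [n for n in numbers if low <= n <= high]

lines = {}
enter(lines, "20 ld de,$1234")
enter(lines, "10 org   $4000")         # out of sequence, with excess spaces
enter(lines, "15 nop")
enter(lines, "15")                     # delete line 15 again
for n in list_range(lines):
    print(n, lines[n])
```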


6. The Assembler

The assembler is invoked by pressing 'A' from the main menu. At the
end of a successful assembly the size and location of the code
generated is displayed on the screen.

Start End   Length
4000  410F  0110

One such report will be given for every separate block of machine code
that you have specified in your source code (see org/rorg below).

6.1 Opcodes

All the standard documented opcodes are recognised. A few undocumented
ones have been added i.e. 'in (c)' and 'out (c),0' plus all the 'sll'
opcodes. Use undocumented opcodes at your own risk; they are not
available on processors post-dating the Z80, though they should work
on most, if not all, Z80 CPUs. All other undocumented codes can be
entered using the 'byte' directive described below.

The assembler uses a non-standard syntax for the three register pair
jump instructions:

   jp hl     jp ix      jp iy        instead of
   jp (hl)   jp (ix)    jp (iy)

The standard syntax for these commands incorrectly describes
their true action as they all jump directly to the addresses in the
registers, not the addresses in the memory locations specified
by the registers. All other opcode mnemonics are standard.


6.2 Labels

Labels can be used to refer to memory locations or to hold
the results of expressions. There are two types of label,
namely global and local.

i)  Global labels. These must start with an alphabetical character
    (A-Z or a-z). Any alphanumeric characters can
    then form more of the name including the characters '.' and '_'.
    The label 'Mabel' is different from the label 'mabel' i.e. labels
    are case-sensitive.

    Global labels may be referenced from anywhere in your source code.

ii) Local labels. These are typically used within subroutines and
    allow you to reuse label names to refer to locations within a
    local scope.
    Local labels must start with a '.' character, after which they can
    use the same characters as global labels.

    Local labels are used in conjunction with the pseudo-op 'subr'.
    Each time 'subr' is used it increments a counter. The ASCII
    representation of that counter number is then made part of the
    local label name. There is an implied subr level of '0'
    at the start of assembly. The value is reset at the start of each pass.
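
The effect of 'subr' on local labels can be pictured with the Python
sketch below. The class name and, in particular, the way the counter is
combined with the label text (simply prepended here) are assumptions;
the internal layout used by Aass is not documented here.

```python
class LabelScope:
    """Tracks the 'subr' counter and qualifies local label names."""

    def __init__(self):
        self.counter = 0               # implied subr level 0 at start of a pass

    def subr(self):
        """The 'subr' pseudo-op: move on to the next local scope."""
        self.counter += 1

    def qualify(self, name):
        """Return the name under which a label would be stored."""
        if name.startswith("."):
            # Local label: make the counter part of the name (guessed layout).
            return str(self.counter) + name
        return name                    # global labels are stored unchanged

scope = LabelScope()
print(scope.qualify(".loop"))          # local label before any 'subr'
scope.subr()
print(scope.qualify(".loop"))          # same text, different stored label
print(scope.qualify("Main"))           # global label, unaffected
```

Because the counter restarts on each pass, a given local label resolves
to the same stored name on both passes.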

N.B. All labels are stored in a table which grows downwards in memory from
just underneath the assembler itself i.e. location afff. It is up to you
to avoid the label table smashing into your source code or generated
machine code.


6.3 Numbers

Numbers may refer to addresses or mathematical terms. The majority
of 2-byte numbers in the assembler are unsigned, given their use as
addresses. Minimal testing is done for overflows as some may be
valid. It's your responsibility to make sure they don't happen
inappropriately. When single byte values are expected then an error
will be reported if any corresponding high byte is not zero.

Whenever a number is required it can be written as one of hex, decimal,
binary or octal. Decimal numbers have no prefix character. For
other bases the prefix characters are:

  $ - Hexadecimal (e.g. $9afe)
  @ - Octal	  (e.g. @731)
  % - Binary      (e.g. %10011010)

The assembler is case-insensitive to hex numbers/addresses.
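
As an illustration of the prefix rules, this Python sketch converts the
four number formats to a value. It is not the code used by Aass, and the
name parse_number is invented.

```python
# Prefix character -> numeric base. No prefix means decimal.
PREFIXES = {"$": 16, "@": 8, "%": 2}

def parse_number(text):
    """Convert '$9afe', '@731', '%10011010' or '1234' to an integer."""
    base = PREFIXES.get(text[0])
    if base is None:
        return int(text, 10)
    # int() raises ValueError for digits that are invalid in the base,
    # which stands in for a 'Bad Number' error here.
    return int(text[1:], base)

for sample in ["$9afe", "$9AFE", "@731", "%10011010", "1234"]:
    print(sample, "=", parse_number(sample))
```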


6.4 Literals

Literal characters are enclosed in single quotes. For
example:

  ld a,'A'

will load the ASCII value of the literal A ($41) into the
accumulator.

The following characters cannot be used inside a literal specification:
      ; semi-colon
      : colon
      ' single quote
      " double quote


6.5 Expressions

An expression may be a single number, a literal, an address,
a label or any combination of them separated by operators.
The operators available in Aass are:

  + : addition
  - : subtraction
  * : multiplication
  / : division
  % : modulus
  & : bitwise AND
  | : bitwise OR
  ^ : bitwise EOR

Intermediate results during calculations are held as 32-bit but,
at most, the final result should be 16-bit.

As an example of operator use:

  ld a,Mylabel % $100

will load the accumulator with the least significant byte of the address
represented by Mylabel.


6.6 Operator precedence

All operators (+-*/%&|^) have equal precedence. They are evaluated
from left to right. Bracketed expressions are not allowed.
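
Strict left-to-right evaluation gives different answers from ordinary
arithmetic, so a small model may help. The Python sketch below is an
illustration only. It assumes unsigned integer division, a 32-bit mask
on intermediate results, and that a '%' met where a number is expected
is the binary prefix rather than the modulus operator. All names are
invented.

```python
import re

# An operand: hex, octal, binary or decimal number, 'c' literal, or label.
OPERAND = re.compile(
    r"\s*(\$[0-9A-Fa-f]+|@[0-7]+|%[01]+|[0-9]+|'.'|[A-Za-z.][A-Za-z0-9._]*)")
OPERATOR = re.compile(r"\s*([-+*/%&|^])")
MASK32 = 0xFFFFFFFF

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,
    "%": lambda a, b: a % b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}

def operand_value(token, labels):
    """Value of a number, character literal or label."""
    first = token[0]
    if first == "$":
        return int(token[1:], 16)
    if first == "@":
        return int(token[1:], 8)
    if first == "%":
        return int(token[1:], 2)
    if first == "'":
        return ord(token[1])
    if first.isdigit():
        return int(token)
    return labels[token]               # KeyError stands in for 'Missing Label'

def evaluate(expr, labels=None):
    """Evaluate left to right with no precedence and no brackets."""
    labels = labels or {}
    match = OPERAND.match(expr)
    if not match:
        raise ValueError("Bad Expression")
    result = operand_value(match.group(1), labels)
    pos = match.end()
    while expr[pos:].strip():
        op = OPERATOR.match(expr, pos)
        rhs = OPERAND.match(expr, op.end()) if op else None
        if rhs is None:
            raise ValueError("Bad Expression")
        value = operand_value(rhs.group(1), labels)
        result = OPS[op.group(1)](result, value) & MASK32
        pos = rhs.end()
    return result

print(evaluate("2+3*4"))               # (2+3)*4, not 2+(3*4)
print(hex(evaluate("Mylabel % $100", {"Mylabel": 0x4123})))
```

Under these rules 2+3*4 evaluates to 20. If you need the multiplication
done first, write it first: 3*4+2.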


6.7 Statements

All complete opcode mnemonics, pseudo operations and label definitions are
statements. Usually only one statement per line is given e.g.

    10 Loop
    20 ld a,(hl)
    30 cp $0d
    40 jr z,Exit
    50 add a,b
    60 ld (hl),a
    70 jr Loop
    80 Exit

However, the assembler allows multi-statement lines. Statements are
separated by colons. Sometimes these make code easier to read e.g.

    10 Loop: ld a,(hl)

    240 push hl:push de:push bc

    300 pop bc:pop de:pop hl

Comments may be added in place of statements by preceding them
with a semi-colon e.g.

    10 ld a,(BASE)   ; Load numeric base

Semi-colons and everything following them up to the end of the line are
ignored by the assembler.
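
How a line breaks down into statements can be sketched in Python as
follows. This is a simplified model with an invented function name. It
splits on every colon and semi-colon, which is consistent with those
characters not being allowed inside literals, but it makes no attempt
to handle them inside a text string.

```python
def split_statements(line):
    """Split a numbered source line into (line number, list of statements)."""
    number, _, rest = line.strip().partition(" ")
    code = rest.split(";", 1)[0]       # drop the comment, if any
    statements = [s.strip() for s in code.split(":")]
    return int(number), [s for s in statements if s]

print(split_statements("10 Loop: ld a,(hl)"))
print(split_statements("240 push hl:push de:push bc"))
print(split_statements("10 ld a,(BASE)   ; Load numeric base"))
```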


6.8 Relative addressing

Relative addressing is used by branch and djnz instructions. There are
two ways of specifying branch instructions:

       30 jr +$40
       90 jr MyLabel

The first allows you to specify the displacement explicitly. The
displacements must be in the range +$7f to -$80.

The second usage allows you to specify a 16 bit address, either
explicitly or, more commonly, by using a label. In both cases the
assembler will calculate the displacement for you.

Indexed addressing instructions using IX and IY can only use explicit
displacements e.g.

    10 ld (ix+$31),$77

Such displacements must be in the range +$7f to -$80.
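
When a target address or label is given, the displacement follows from
the standard Z80 rule that a relative jump is measured from the address
of the instruction after the 2-byte jr or djnz. The Python sketch below
shows the arithmetic. The function name is invented, and the error text
simply borrows the 'Out Of Range' message from the error list.

```python
def relative_displacement(instruction_address, target):
    """Displacement byte for a 2-byte jr/djnz at instruction_address."""
    disp = target - (instruction_address + 2)
    if not -0x80 <= disp <= 0x7F:
        raise ValueError("Out Of Range")
    return disp & 0xFF                 # two's complement byte

# A forward jump from 4005 to 400b and a backward jump from 4009 to 4003.
print(hex(relative_displacement(0x4005, 0x400B)))
print(hex(relative_displacement(0x4009, 0x4003)))
```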


6.9 Pseudo-ops

The following pseudo-ops are provided.

i)    org expr
      This specifies the start of the assembly address for the
      current code block. There should be at least one 'org' in the
      source code file. A maximum of 8 are allowed. Only blocks
      which produce at least 1 byte count towards this maximum
      with the exception of the last block which is always reported
      (giving an indication that the assembly has finished).
      If no org is given then a default assembly address of 0000
      is used. This ensures that RAM is not accidentally overwritten.

ii)   rorg expr
      This is the relocation address i.e. the location to which the
      generated code should eventually be moved and executed. The 'org'
      pseudo-op always makes the assembly address and the
      relocation address the same.

iii)  label: equ expr
      The equ pseudo op allows the value of an expression to be
      assigned to a label. For example:

      Wspace: equ $a000

      Allows the name/label 'Wspace' to be equivalent to specifying
      the number/address '$a000'.

      Although, like all pseudo-ops, equ ops can appear anywhere in the source
      they are usually put in a block at the start.

      Note that overwriting the value of an EQU label is allowed.

iv)   subr
      Increment the prefix value for local labels.

v)    byte expr,expr,expr,...
      Add one byte per expression at the current assembly position.

vi)   word expr,expr,expr,...
      Add one 16-bit word per expression at the current assembly
      position, stored low byte first (lsb,msb).

vii)  text "string"
      Add a string of characters at the current assembly position.

viii) block expr
      Allows a block of bytes to be skipped. This is useful for
      defining buffer areas etc.
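
The bytes produced by the byte, word and text pseudo-ops can be
pictured with this Python sketch. The function names are invented, the
text encoding is assumed to be plain ASCII, and the error wording for an
over-large byte value is a placeholder rather than the real Aass
message.

```python
def emit_bytes(values):
    """'byte' pseudo-op: one byte per expression value."""
    for v in values:
        if not 0 <= v <= 0xFF:
            raise ValueError("high byte is not zero")
    return bytes(values)

def emit_words(values):
    """'word' pseudo-op: two bytes per value, least significant first."""
    out = bytearray()
    for v in values:
        out += bytes([v & 0xFF, (v >> 8) & 0xFF])
    return bytes(out)

def emit_text(string):
    """'text' pseudo-op: one byte per character (ASCII assumed)."""
    return string.encode("ascii")

data = emit_text("Hello World") + emit_bytes([0x0D, 0x00])
print(len(data), "bytes:", data.hex())
print(emit_words([0x1234]).hex())      # low byte first
```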


7. Error messages

Assembly will stop at the first error found in the source code.
The line number containing the error and the error type will
be displayed. The statement containing the error will also
be shown.

Error types include:

  Bad Number
  Missing Label
  Bad Expression
  Bad Syntax
  Out Of Range   (displacements greater than +$7f or lower than -$80)
  Invalid Label
  Too Many Orgs


8. Example program

The obligatory Hello World program.

  10 org $4000
  20 ld hl,Hw
  30 Loop: ld a,(hl)
  40 or a
  50 jr z,Exit
  60 rst $08
  70 inc hl
  80 jr Loop
  90 Exit: jp iy
  100 Hw: text "Hello World"
  110 byte $0d,$00
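
To see where the labels in this program end up, the first pass of a
two-pass assembly can be imitated in Python. The sizes in the table are
the standard Z80 instruction lengths (jp iy is the 2-byte jp (iy)), and
the text string is counted at one byte per character. The function name
and table layout are invented, and the End figure assumes an inclusive
end address.

```python
# (label or None, statement, size in bytes) for each statement above.
PROGRAM = [
    (None,   "ld hl,Hw",            3),
    ("Loop", "ld a,(hl)",           1),
    (None,   "or a",                1),
    (None,   "jr z,Exit",           2),
    (None,   "rst $08",             1),
    (None,   "inc hl",              1),
    (None,   "jr Loop",             2),
    ("Exit", "jp iy",               2),
    ("Hw",   'text "Hello World"', 11),
    (None,   "byte $0d,$00",        2),
]

def first_pass(program, org):
    """Assign an address to every label; return (labels, end, length)."""
    labels, address = {}, org
    for label, _statement, size in program:
        if label:
            labels[label] = address
        address += size
    return labels, address - 1, address - org

labels, end, length = first_pass(PROGRAM, 0x4000)
for name, address in labels.items():
    print("%-5s %04X" % (name, address))
print("Start 4000  End %04X  Length %04X" % (end, length))
```

With every label known after the first pass, the second pass can fill
in the address for 'ld hl,Hw' and the displacements for the two jr
instructions.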


9. Disclaimer

THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.