Ace Assembler
Full title | Ace Assembler |
Year of release | 2017 |
Publisher | Alan Bleasby |
Producer / Author(s) | Alan Bleasby |
Memory | 16k |
Type | Utility |
Cost : | 0.00 |
Download | Ace Assembler [CRC32 A799D7C3] |
Distribution Permission | Allowed | Group 1 |
Instructions
Aass: Ace Assembler
Copyright (c) 2017 Alan Bleasby

1. Introduction

Aass is a two-pass native assembler for the Jupiter Ace.

Writing an assembler for the Forth platform presents several issues. First, entering assembler source into Forth words and subsequently editing it poses a problem. Secondly, assemblers written in the Forth style don't usually conform to the mnemonic standards.

Aass sidesteps these issues by providing an integrated line editor, enabling you to type in the assembly code at any given point in user RAM space. Lines in the source code can be added, removed, listed and renumbered. The assembler can inform you where the assembly code starts and show its size, so enabling you to BSAVE it to storage.

The important point is that the assembler/editor does not create any Forth words; it just deals in areas of RAM. It is invoked from Forth and returns to it when you leave the program.

Aass is a block of machine code that sits at address b000 to cd89. This allows the separate disassembler (Adiss) to lie at address d000 to db72. See separate documentation for Adiss. The assembly tools are not located at the top of the address space as my own Ace clone (Arv1) has an extra 8K of ROM from e000-ffff.

2. Hexadecimal

All addresses in this documentation are given in hexadecimal unless stated otherwise. It is recommended that you define a Forth word 'hex' which sets the machine to base 16 i.e.

    : hex decimal 16 base c! ;

and work in hexadecimal whenever you're using the assembler/editor.

The assembler and editor display start addresses, end addresses and length in hexadecimal so it is very convenient to be able to drop to the command cursor and BSAVE from/to these locations using the given hexadecimal directly. It is also more convenient to invoke the assembler using its hexadecimal address rather than the equivalent decimal one.

3. Where to put assembly source code and compiled machine code

The assembler can produce relocating code i.e.
it can compile the machine code at an address which may not be its final desired location. This allows some flexibility in how you arrange your assembly code and machine code blocks in RAM.

If you have RAM from e000-ffff then that is obviously a good space for the resulting machine code, otherwise both assembly code and machine code need to sit below a000 (but not too close to it - see the section on labels).

Obviously you don't want the assembly code and machine code to bump into and overwrite each other. A reasonable compromise, for a new user and assuming an otherwise empty Ace, would be to have your source assembly code from address 6000 upwards and instruct the assembler to produce the machine code above the end of Forth e.g. at address 4000. If, on assembling, you see your machine code getting perilously close to the source code then you can just save the assembly source to tape and then reload it at a higher address. The assembly source is fully relocatable.

4. Invoking the assembler and the main menu

Assuming you're working in hexadecimal you invoke the assembler by typing:

    b000 call

You'll be presented with the following screen:

    Aass v1.0: (c) 2017 AJB

    N: New Code Address
    C: Code Address
    L: Line Editor
    M: Memory Extent
    A: Assemble
    X: Exit

As you'd expect, you can press the letter 'x' to exit from the program. The six options have the following functions.

a) N - New Code Address

The New Code Address is the location at which you want to start typing your assembly source code using the editor. As suggested above, a suitable address is 6000. Your source code will grow upwards in memory as it's typed.

After typing 'N' you'll get the prompt:

    Hex address:

Type in your desired address using the cursor line at the bottom of the screen. Note that the assembler is making use of the Forth QUERY LINE statements at this point. It can therefore be abused in lots of ways; it really is best to just type an address.
After entering the address the main menu will be redisplayed.

b) C - Code Address

The code address is similar to the new code address. It is the address of some existing assembler source in RAM. Typically this will be source code that you've previously saved using BSAVE and subsequently loaded into memory. Alternatively you might have exited the assembler to do some Forth and then re-entered it to continue editing your source code.

After typing 'C' you'll get the same prompt as for 'N' i.e.

    Hex address:

Type in your desired address using the cursor line at the bottom of the screen. Note that the assembler is making use of the Forth QUERY LINE statements at this point. It can therefore be abused in lots of ways; it really is best to just type an address.

N.B. You can type in any address at the prompt and it'll be accepted. However, if the address doesn't point to valid assembly source then you'll get the error message 'Unrecognised code' when you subsequently try to select the editor, memory or assemble options. Valid code is recognised by using a magic number that is unlikely to occur elsewhere.

If you accidentally use 'N' when you intended to use 'C' then all is not lost. You can recover your source code by returning to Forth and using the store word (!). If such corrupted code was at address 6000 you'd type:

    1 6002 !

You should then be able to re-enter the assembler and continue editing. The only side effect will be that the first line number will have been set to '1' after the above command.

c) L - Line Editor

This invokes the line editor. You must either have initialised the start of some memory (New Code Address) or selected some existing valid source code (Code Address) to activate the editor, otherwise you'll get an 'Unrecognised code' error.

Pressing 'L' will take you to a blank screen with an editing cursor (inverse e) at the bottom of the screen. See the Line Editor section for documentation.
It is enough to say here that typing:

    quit[ENTER]

will return you to the main menu.

d) M - Memory Extent

This will show you how much memory your assembly source code is occupying. The assembler must be pointing to valid code set up using 'N' or 'C' (above) otherwise an 'Unrecognised code' error will result. The information is displayed on screen as follows:

    Start: 6000
    Length: 071D

Pressing any key will return you to the main menu.

You can use the M command to monitor whether you're short of memory. At any point you can exit from the assembler and BSAVE the memory. For the above example you'd type

    6000 071d bsave mysrc

You can then optionally reinvoke the assembler and re-enter the code address to continue working.

e) A - Assemble

The assembler must have been set up pointing to valid source code using either the 'N' or 'C' main menu options, otherwise an 'Unrecognised code' error will result.

The assembler will produce the machine code and, if no error results, display the start, end and length of every machine code block produced (the assembler supports multiple sections). Example output for a tiny single section is:

    Start End Length
    5000 500D 000E

You can use the Forth BSAVE command to save any machine code blocks to storage e.g.

    5000 000e bsave myprog

f) X - Exit

Typing 'X' or 'Q' will return you to Forth.

5. The Line Editor

On entering the line editor you'll be presented with the edit cursor (inverse e) at the bottom of a blank screen. There are two kinds of thing you can enter at the command prompt:

i) Source code lines
ii) Editor commands

For both of these the following Ace editing keys work as expected.

    Shift-1 : Delete line
    Shift-5 : Move cursor left
    Shift-8 : Move cursor right
    Shift-0 : Delete character

N.B. If you enter any kind of invalid input (e.g. a source code line without a line number or a misspelled command) then nothing will happen when you type [ENTER]: you will need to correct the error using the editing keys.
Lines are a maximum length of 31 characters.

a) Source code lines

These have the form:

    LineNumber mnemonic

For example, the program below puts hex 1234 on the Forth stack when invoked using:

    4000 call

    10 org $4000
    20 ld de,$1234
    30 rst $10
    40 jp iy ; See assembler documentation

When you press [ENTER], a valid line is added to memory after being cleaned of unnecessary characters (e.g. excess spaces).

Lines do not need to be entered in numerical sequence. You can add new lines between existing lines. Entering a line using an existing number will replace the original line. Entering a line number on its own will delete any corresponding line.

b) Editor commands

All editor commands can be abbreviated as long as there is no conflict with another command.

i) QUIT - Return to main menu.

The minimum abbreviation is q.

ii) RENU - Renumber the source code.

This is useful for tidying the source code or for making space between lines. By default the command will renumber starting at line 10 and increment by 10 each time. The full syntax of the command is:

    RENU [increment] [starting line number]

The minimum abbreviation is r.

iii) LIST - List the source code.

LIST will list the entire source code. Scrolling will pause if the screen gets full. At that point pressing Q or X will terminate scrolling. Pressing any other key will continue the scrolling.

    LIST LineNumber             lists a single line (e.g. list 10)
    LIST -LineNumber            lists the source from the start to that line number (e.g. list -50)
    LIST LineNumber-            lists the source from the given line number to the end of the source code (e.g. list 100-)
    LIST LineNumber-LineNumber  lists from the first to the second line number (e.g. list 20-120)

The minimum abbreviation is l.

6. The Assembler

The assembler is invoked by pressing 'A' from the main menu. At the end of a successful assembly the size and location of the code generated is displayed on the screen.
    Start End Length
    4000 410F 010F

One such report will be given for every separate block of machine code that you have specified in your source code (see org/rorg below).

6.1 Opcodes

All the standard documented opcodes are recognised. A few undocumented ones have been added i.e. 'in (c)' and 'out (c),0' plus all the 'sll' opcodes. Use undocumented opcodes at your own risk; they are not available on processors post-dating the Z80, though they should work on most, if not all, Z80 CPUs. All other undocumented codes can be entered using the 'byte' directive described below.

The assembler uses a non-standard syntax for the three register pair jump instructions:

    jp hl
    jp ix
    jp iy

instead of

    jp (hl)
    jp (ix)
    jp (iy)

The standard syntax for these commands incorrectly describes their true action as they all jump directly to the addresses in the registers, not the addresses in the memory locations specified by the registers.

All other opcode mnemonics are standard.

6.2 Labels

Labels can be used to refer to memory locations or to hold the results of expressions. There are two types of label, namely global and local.

i) Global labels.

These must start with an alphabetical character (A-Z or a-z). The rest of the name may contain any alphanumeric characters plus the characters '.' and '_'. Labels are case-sensitive i.e. the label 'Mabel' is different from the label 'mabel'. Global labels may be referenced from anywhere in your source code.

ii) Local labels.

These are typically used within subroutines and allow you to reuse label names to refer to locations within a local scope. Local labels must start with a '.' character, after which they can use the same characters as global labels.

Local labels are used in conjunction with the pseudo-op 'subr'. Each time 'subr' is used it increments a counter. The ASCII representation of that counter number is then made part of the local label name. There is an implied subr level of '0' at the start of assembly.
The value is reset at the start of each pass.

N.B. All labels are stored in a table which grows downwards in memory from just underneath the assembler itself i.e. location afff. It is up to you to avoid the label table smashing into your source code or generated machine code.

6.3 Numbers

Numbers may refer to addresses or mathematical terms. The majority of 2-byte numbers in the assembler are unsigned, given their use as addresses. Minimal testing is done for overflows as some may be valid. It's your responsibility to make sure they don't happen inappropriately. When single byte values are expected then an error will be reported if any corresponding high byte is not zero.

Whenever a number is required it can be written as one of hex, decimal, binary or octal. Decimal numbers have no prefix character. For other bases the prefix characters are:

    $ - Hexadecimal (e.g. $9afe)
    @ - Octal (e.g. @731)
    % - Binary (e.g. %10011010)

The assembler is case-insensitive to hex numbers/addresses.

6.4 Literals

Literal characters are enclosed in single quotes. For example:

    ld a,'A'

will load the ASCII value of the literal A ($41) into the accumulator.

The following characters cannot be used inside a literal specification:

    ; semi-colon
    : colon
    ' single quote
    " double quote

6.5 Expressions

An expression may be a single number, a literal, an address, a label or any combination of them separated by operators. The operators available in Aass are:

    + : addition
    - : subtraction
    * : multiplication
    / : division
    % : modulus
    & : bitwise AND
    | : bitwise OR
    ^ : bitwise EOR

Intermediate results during calculations are held as 32-bit but, at most, the final result should be 16-bit.

As an example of operator use:

    ld a,Mylabel % $100

will load the accumulator with the least significant byte of the address represented by Mylabel.

6.6 Operator precedence

All operators (+-*/%&|^) have equal precedence. They are evaluated from left to right. Bracketed expressions are not allowed.
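
The number prefixes, literals and strict left-to-right evaluation described in sections 6.3 to 6.6 can be modelled with a short Python sketch. This is an illustration only, not the assembler's actual code: the function name `evaluate` and the `labels` dictionary are hypothetical, and several details are assumptions (a '%' is read as a binary prefix where a value is expected and as modulus otherwise, division is integer division, intermediate results are masked to 32 bits, and the low 16 bits are returned). Local label prefixes and signed values are not handled.

```python
BASES = {"$": 16, "@": 8, "%": 2}  # prefix characters from section 6.3

OPS = {  # operators from section 6.5, all of equal precedence
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,  # assumption: integer division
    "%": lambda a, b: a % b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}


def evaluate(expr, labels=None):
    """Evaluate an expression strictly left to right (hypothetical helper)."""
    labels = labels or {}
    pos = 0
    n = len(expr)

    def skip_spaces():
        nonlocal pos
        while pos < n and expr[pos] == " ":
            pos += 1

    def read_word(extra):
        # Consume alphanumerics plus any characters listed in `extra`.
        nonlocal pos
        start = pos
        while pos < n and (expr[pos].isalnum() or expr[pos] in extra):
            pos += 1
        return expr[start:pos]

    def operand():
        nonlocal pos
        skip_spaces()
        if pos >= n:
            raise ValueError("Bad Expression")
        ch = expr[pos]
        if ch in BASES:  # $hex, @octal or %binary
            pos += 1
            return int(read_word(""), BASES[ch])
        if ch == "'":  # single-character literal such as 'A'
            if pos + 2 >= n or expr[pos + 2] != "'" or expr[pos + 1] in ";:'\"":
                raise ValueError("Bad Expression")
            value = ord(expr[pos + 1])
            pos += 3
            return value
        if ch.isdigit():  # decimal numbers have no prefix
            return int(read_word(""), 10)
        if ch.isalpha() or ch == ".":  # label names are case-sensitive
            name = read_word("._")
            if name not in labels:
                raise KeyError("Missing Label: " + name)
            return labels[name]
        raise ValueError("Bad Expression")

    value = operand()
    while True:
        skip_spaces()
        if pos >= n:
            break
        op = expr[pos]
        if op not in OPS:
            raise ValueError("Bad Expression")
        pos += 1
        # No precedence: apply each operator as soon as its operand is read.
        value = OPS[op](value, operand()) & 0xFFFFFFFF
    return value & 0xFFFF  # assumption: keep only the low 16 bits
```

Under these assumptions `evaluate("2+3*4")` gives 20 rather than the 14 that conventional precedence would produce, so terms need to be ordered with left-to-right evaluation in mind. Likewise `evaluate("Mylabel % $100", {"Mylabel": 0x4123})` gives $23, matching the low-byte example in section 6.5.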
6.7 Statements

All complete opcode mnemonics, pseudo operations and label definitions are statements. Usually only one statement per line is given e.g.

    10 Loop
    20 ld a,(hl)
    30 cp $0d
    40 jr z,Exit
    50 add a,b
    60 ld (hl),a
    70 jr Loop
    80 Exit

However, the assembler allows multi-statement lines. Statements are separated by colons. Sometimes these make code easier to read e.g.

    10 Loop: ld a,(hl)
    240 push hl:push de:push bc
    300 pop bc:pop de:pop hl

Comments may be added in place of statements by preceding them with a semi-colon e.g.

    10 ld a,(BASE) ; Load numeric base

Semi-colons and everything following them up to the end of the line are ignored by the assembler.

6.8 Relative addressing

Relative addressing is used by branch and djnz instructions. There are two ways of specifying branch instructions:

    30 jr +$40
    90 jr MyLabel

The first allows you to specify the displacement explicitly. The displacements must be in the range +$7f to -$80.

The second usage allows you to specify a 16-bit address, either explicitly or, more commonly, by using a label. In both cases the assembler will calculate the displacement for you.

Indexed addressing instructions using IX and IY can only use explicit displacements e.g.

    10 ld (ix+$31),$77

Such displacements must be in the range +$7f to -$80.

6.9 Pseudo-ops

The following pseudo-ops are provided.

i) org expr

This specifies the start of the assembly address for the current code block. There should be at least one 'org' in the source code file. A maximum of 8 are allowed. Only blocks which produce at least 1 byte count towards this maximum with the exception of the last block which is always reported (giving an indication that the assembly has finished).

If no org is given then a default assembly address of 0000 is used. This ensures that RAM is not accidentally overwritten.

ii) rorg expr

This is the relocation address i.e. the location to which the generated code should eventually be moved and executed.
The 'org' pseudo-op always makes the assembly address and the relocation address the same.

iii) label: equ expr

The equ pseudo-op allows the value of an expression to be assigned to a label. For example:

    Wspace: equ $a000

allows the name/label 'Wspace' to be equivalent to specifying the number/address '$a000'. Although, like all pseudo-ops, equ ops can appear anywhere in the source they are usually put in a block at the start. Note that overwriting the value of an EQU label is allowed.

iv) subr

Increment the prefix value for local labels.

v) byte expr,expr,expr,...

Add a single byte for each expression at the current assembly position.

vi) word expr,expr,expr,...

Add a 16-bit word for each expression at the current assembly position (lsb,msb).

vii) text "string"

Add a string of characters at the current assembly position.

viii) block expr

Allows a block of bytes to be skipped. This is useful for defining buffer areas etc.

7. Error messages

Assembly will stop at the first error found in the source code. The line number containing the error and the error type will be displayed. The statement containing the error will also be shown. Error types include:

    Bad Number
    Missing Label
    Bad Expression
    Bad Syntax
    Out Of Range (displacements greater than +$7f or lower than -$80)
    Invalid Label
    Too Many Orgs

8. Example program

The obligatory Hello World program.

    10 org $4000
    20 ld hl,Hw
    30 Loop: ld a,(hl)
    40 or a
    50 jr z,Exit
    60 rst $08
    70 inc hl
    80 jr Loop
    90 Exit: jp iy
    100 Hw: text "Hello World"
    110 byte $0d,$00

9. Disclaimer

THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.