Going Forth - part 3
Computing Today March 1982 page 48
Yet more vital details about this language in our
continuing series for the programmer
So far in this series, we have seen enough of FORTH to realize that it is more than a little unusual. I hope, however, that it is becoming clear that the language has considerable advantages in many circumstances and that it might well be worth getting used to its back-to-front way of doing things.
Last month, I explained how it was possible to extend the language to include almost any facility you might want. We went on from there to look at FORTH's basic conditional operators and the way that branching and looping structures are set up. This month, we will look at conditional loops in FORTH and the significance and use of the language's two stacks. I will also outline the language's assembler mode and introduce some new words.
Before we start, however, let me remind you that where necessary, FORTH words are enclosed in square brackets — [ ] — to make them stand out clearly. (Those familiar with FORTH may have been having a quiet chuckle at our use of quotes to indicate reserved words, the quote symbol itself being a reserved word! We are now substituting square brackets, and anything enclosed by these should be taken as being FORTH and not text. Ed.) In addition, wherever I show a dialogue with the computer, the computer's responses and prompts are underlined.
Last month we saw how to construct conditional branches (BASIC's IF...THEN...ELSE) and finite loops (FOR...NEXT in BASIC). While you can do an awful lot of programming with just these two structures, they are sometimes very limited and force you into rather involved bits of code. In particular, you may wish to loop an unknown number of times until an event takes place, or you may wish to loop while a set of circumstances is true (or false).
While these two conditional loops are very similar, they have the fundamental difference that, in the first case, the program will always go through the loop at least once while in the second case it is possible to omit the loop operations altogether. Examples of these types of loop are found in Pascal's REPEAT...UNTIL and WHILE...DO.
You will probably have guessed that FORTH also provides these structures. An example of the first type, in which a function must be repeated until a condition occurs, is the processing of a string of input items. BASIC has to resort to conditional GOTOs to handle this case, but FORTH provides:
BEGIN <operation> <condition> END
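END takes the truth value that <condition> leaves on top of the stack: if it is false (0), the program loops back to BEGIN; if it is true, the program picks up at the operation after END. Figure 1 is a flowchart for this function. Note that both <operation> and <condition> can be any suitable combination of words, including other loops, etc.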
As an example, let's define HEXPRINT to input decimal data from the keyboard and print its hexadecimal equivalent. The loop is to finish after inputting (and converting) decimal 100. The answer is:
: HEXPRINT BEGIN CR #IN DUP HEX .
DECIMAL 100 = END ;
This example also introduces three new FORTH words. The first, [#IN], is not actually a standard FORTH word, but is an MMSFORTH extension which prompts for, and accepts, a numerical input. HEX and DECIMAL demonstrate the language's ability to accept and print data in any number base. The system normally treats data as decimal, but type HEX and both incoming and output data are handled as hexadecimal. DECIMAL takes the system back to base 10. OCTAL, similarly, puts the system into a base 8 mode. In fact, virtually any number base can be used by putting a suitable value into the system variable BASE. Type:
23 BASE C! OK
and you are in base 23 (and the best of luck...). No matter what I/O base you choose, data is always stored in a two-byte signed binary format.
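As a rough illustration of working through BASE, here is a sketch in present-day standard Forth (as accepted by Gforth, for example) rather than MMSFORTH. The word name .BINARY is invented for this example. It also assumes that BASE is a full-cell variable, as it is in the current standard, so it is set with [!] rather than [C!].

```forth
\ Sketch in standard Forth (e.g. Gforth), not MMSFORTH.
\ .BINARY is a hypothetical helper name used only for this illustration.
DECIMAL
: .BINARY ( n -- )
   BASE @ SWAP            \ save the current base underneath n
   2 BASE !               \ switch to base 2 (BASE is a cell here, so ! not C!)
   .                      \ print n in binary
   BASE ! ;               \ restore the saved base

79 .BINARY                \ prints: 1001111
```

Saving and restoring BASE inside the word means the caller's number base is left as it was found.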
However, back to the example:
? 1 1
? 20 14
? 79 4F
? 100 64 OK
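For readers without MMSFORTH, the same loop can be sketched in present-day standard Forth, where the word UNTIL does the job of END. Since [#IN] is an MMSFORTH extension, this version assumes that the numbers have already been placed on the stack, with the terminating 100 deepest.

```forth
\ Sketch in standard Forth (e.g. Gforth), not MMSFORTH.
\ Assumption: input values are taken from the stack instead of #IN.
DECIMAL
: HEXPRINT ( 100 ... n2 n1 -- )
   BEGIN
      DUP HEX . DECIMAL   \ print the top item in hex, then return to base 10
      100 =               \ true once the 100 itself has been printed
   UNTIL ;                \ UNTIL loops back to BEGIN while the flag is false

100 79 20 1 HEXPRINT      \ prints: 1 14 4F 64
```

As in the dialogue above, the 100 is converted and printed before the loop ends, because the test comes after the operation.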
BEGIN...END is very useful, but has the weakness that <operation> is always performed at least once. Sometimes the first test can fail, meaning that there is no need for <operation> at all. MMSFORTH provides this option by:
WHILE <condition> PERFORM <operation> PEND
This structure tests <condition>, and executes <operation> if the TOS is '1'. The PEND then loops back to WHILE to repeat the test again; Fig. 2 shows the operation of the function. Notice another difference between this and BEGIN...END; <operation> is PERFORMed if <condition> is TRUE, but END repeats <operation> if <condition> is FALSE. You have to watch your test very carefully.
Although MMSFORTH uses WHILE...PERFORM...PEND, other versions of the language use functionally identical constructions such as BEGIN...WHILE...REPEAT or BEGIN...IF...WHILE.
Repeating the HEXPRINT example above using this construction is easy:
: HEXPRINT WHILE CR #IN DUP 100 <>
PERFORM HEX . DECIMAL
PEND DROP ;
This time we can have:
? 100 OK
without the conversions being performed.
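The test-first loop can be sketched in present-day standard Forth with BEGIN...WHILE...REPEAT, one of the equivalent constructions mentioned earlier. As before, this is not MMSFORTH code: the name HEXPRINT2 is invented here, and the numbers are assumed to be on the stack already rather than read with [#IN].

```forth
\ Sketch in standard Forth (e.g. Gforth), not MMSFORTH.
\ Assumption: input values are taken from the stack instead of #IN.
DECIMAL
: HEXPRINT2 ( 100 ... n2 n1 -- )
   BEGIN
      DUP 100 <>          \ test first: is the top item something other than 100?
   WHILE
      HEX . DECIMAL       \ the body runs only while the test is true
   REPEAT
   DROP ;                 \ discard the terminating 100

100 79 20 1 HEXPRINT2     \ prints: 1 14 4F
100 HEXPRINT2             \ prints nothing
```

Here the 100 is never converted, which matches the behaviour of the WHILE...PERFORM...PEND version.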
'Why two versions of the indefinite loop?', you say. Most of the time, either would be perfectly acceptable, but if data must be processed until an appropriate result emerges, use BEGIN...END. At other times, the input data controls the need for further processing; in such cases, the second construction is better. HEXPRINT is a case of this kind, so its second version is the more suitable.
Just like the looping and branching structures we looked at last month, BEGIN, END, WHILE, PERFORM and PEND are Defining Words that can only be used within colon definitions. If you try to use them in FORTH's immediate mode, you will get an error message.
I hope that it goes without my saying that all these conditional and looping structures can be nested pretty well as much as you like. Next month's article will contain a FORTH program called "HANOI" which gives examples.
Since two of FORTH's major advantages are its running speed and the very free access it gives to the computer's operations, there is very little need to use assembly language segments in FORTH programs.
Fig. 2. Another 'structured' construct available in FORTH is the WHILE...PERFORM...PEND function.
Nevertheless, sometimes you must set up accurate timing loops or control I/O devices directly. To meet this need, almost all FORTH systems include an appropriate assembler. In MMSFORTH the standard assembler is for the 8080, but if you pay extra you can get a Z80 version. The beautiful (?) thing about FORTH assemblers is that they are just as interactive as any other element of the language.
FORTH words are defined in assembler terms in a way that is very similar to a colon definition. For example, an MMSFORTH segment to white out the screen is:
CODE WHITEOUT DE PUSH HL PUSH
1024 DE LXI
15360 HL LXI
BEGIN 191 A MVI A M MOV HL INX
DE DCX D A MOV E ORA =0 END
HL POP DE POP NEXT
As soon as you ENTER, this is assembled and the new FORTH word WHITEOUT gives access to the code. You can try it out instantly by typing WHITEOUT, or you can incorporate the new word into any additional 'conventional' FORTH words.
I won't go into any more detail on FORTH assembly programming, but it is worth noticing three things about the definition above:
a. CODE..NEXT are the ASSEMBLER DEFINING WORDS corresponding to [:..;] brackets.
b. The language uses mainly Intel mnemonics, but in an RPN format.
c. The conditional jumps use nonstandard mnemonics and FORTH conditional tests.
Incidentally, a much easier definition of WHITEOUT is:
: WHITEOUT 191 15360 1024 FILL ;
Use Of The Stacks
We know by now that FORTH is a stack-based system. In fact, it actually uses two stacks. The most important one for the programmer is the PARAMETER STACK - that's where data goes and is operated on.
The second stack is the RETURN STACK and holds the return addresses, loop counters, etc that the system needs as it goes up and down the dictionary. In effect, it is equivalent to the single stack that assembly-language programmers will be familiar with. The advantage of using two stacks is that variables and control information are firmly separated, with a consequent reduction in the confusion-factor.
Although the return stack is normally transparent to the programmer, it is possible to push and pop data onto and off it. The [<R] word takes the top item off the parameter stack and moves it to the return stack, while [R>] moves a word in the opposite direction. In both cases, the word is removed from the source stack. When we studied the DO...LOOP last month we met [I], which copied the loop index on to the parameter stack. In fact, [I] can be used at any time and will put a copy of the top number on the return stack on to the parameter stack, without altering the return stack. If you like, [I] is equivalent to:
R> DUP <R
Normally, there is not a lot of call for [<R] and [R>], except to gain access to a number buried several layers down the parameter stack. For example, a word to print the fourth item on the stack would be:
: 4PRINT <R <R <R . R> R> R> ;
It would, however, be pretty clumsy programming to get yourself into this position.
Just as in assembly programming, you must be very careful how and when you move data to or from the return stack, since it is very easy to corrupt return addresses and so crash the system. As a general rule, if you take something off the return stack you must replace it within the same word and vice-versa.
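The same juggling can be sketched in present-day standard Forth, where the word that moves an item to the return stack is spelled >R (the article's [<R]) and R@ copies the top of the return stack without removing it. The name COPY-VIA-R is invented for this illustration. Note that each word below puts back everything it took, in line with the rule just given.

```forth
\ Sketch in standard Forth (e.g. Gforth): >R here does the job of the article's <R.
DECIMAL
: 4PRINT ( a b c d -- b c d )
   >R >R >R               \ park d, c and b on the return stack
   .                      \ print a, the original fourth item
   R> R> R> ;             \ bring b, c and d back in their original order

: COPY-VIA-R ( n -- n n ) \ hypothetical name; R@ copies without popping
   >R R@ R> ;

10 20 30 40 4PRINT        \ prints: 10   and leaves 20 30 40 on the stack
7 COPY-VIA-R . .          \ prints: 7 7
```

Both words leave the return stack exactly as they found it before they exit, so the return addresses underneath are never disturbed.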
We have now met many of the common 'standard' FORTH words, but there are a few more which can provide some very useful functions:
[1+] This word simply adds 1 to whatever number is TOS; it has a counterpart, [2+], which adds 2. Their advantages are that they take up slightly less memory and run slightly faster than the equivalent [1 +] and [2 +].
[2*] You will not be surprised to read that this word doubles the TOS. [8*], [16*] and [64*] act in much the same vein. Once again, they economise (slightly) on memory and run time, although you would be very hard-pressed if you needed such economies.
ABS This word simply converts the TOS to its absolute value (ie it makes negative numbers positive).
FILL This word is very useful for loading blocks of memory with any given value. It sets the <TOS> bytes of memory starting at <2OS> to the value that is third on stack. For example, to draw a narrow line along the top edge of the TRS-80 screen, you could use:
131 15360 64 FILL
This loads 131 into the first 64 bytes of the screen memory.
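Readers trying this on a present-day standard Forth should note that FILL there takes its arguments in a different order: address, then count, then value. The sketch below assumes such a system (Gforth, for example) and uses an invented 64-byte buffer, LINE-BUF, in place of the TRS-80 screen memory.

```forth
\ Sketch in standard Forth (e.g. Gforth), not MMSFORTH.
\ LINE-BUF is a hypothetical buffer standing in for the first screen line.
DECIMAL
CREATE LINE-BUF 64 ALLOT

LINE-BUF 64 131 FILL      \ standard order: address, count, value
LINE-BUF C@ .             \ first byte: prints 131
LINE-BUF 63 + C@ .        \ last byte:  prints 131
```

Getting the argument order wrong would overwrite the wrong block of memory, so it is worth checking your own system's glossary before using FILL.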
MAX The FORTH word, MAX, takes the top two numbers on the stack and replaces them with the larger; the smaller is DROPped. If the numbers have the same value, one is discarded.
MIN This word works much like MAX, except that it leaves the smaller (ie most negative) on the stack. Together, MAX and MIN provide an easy way of inputting a number and forcing it to lie in a given range:
: GETNO " INPUT 1-20" #IN 1 MAX
20 MIN ;
Figure 3 shows the working of GETNO if the actual input number is 25; you can see that it leaves 20 on the stack.
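The range-forcing part of GETNO can be tried on its own in any Forth that has MAX and MIN. The sketch below leaves out the MMSFORTH prompt and [#IN], and the name CLAMP-1-20 is invented for this illustration.

```forth
\ Sketch: the clamping step of GETNO, without the MMSFORTH input words.
\ CLAMP-1-20 is a hypothetical name.
DECIMAL
: CLAMP-1-20 ( n -- n' )
   1 MAX                  \ anything below 1 becomes 1
   20 MIN ;               \ anything above 20 becomes 20

25 CLAMP-1-20 .           \ prints: 20
-3 CLAMP-1-20 .           \ prints: 1
 7 CLAMP-1-20 .           \ prints: 7
```

It is MAX that enforces the lower limit and MIN the upper one, which is easy to get the wrong way round.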
MINUS MINUS simply changes the sign of the TOS. It effectively has the colon definition:
: MINUS -1 * ;
MOD If you want to divide 2OS by TOS and leave the remainder, then MOD is your word. For instance, to see if a number is divisible by 64:
: DIV64 DUP 64 MOD 0= IF " DIVISIBLE" THEN ;
The FORTH word ["], which must be followed by a space, prints the following characters on the terminal until the next ["] or the end of the line; in this case, it simply prints DIVISIBLE. Figure 4 shows the action of DIV64 when applied to a TOS value of 192.
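A version of DIV64 for present-day standard Forth is sketched below. It assumes a system such as Gforth, where strings are printed with the word [."] rather than ["] and where IF is closed by THEN.

```forth
\ Sketch in standard Forth (e.g. Gforth): ." replaces the article's " word.
DECIMAL
: DIV64 ( n -- n )
   DUP 64 MOD             \ remainder of n divided by 64; n itself is kept
   0= IF ." DIVISIBLE" THEN ;

192 DIV64 DROP            \ prints: DIVISIBLE
100 DIV64 DROP            \ prints nothing
```

The DUP means that the number under test is still on the stack afterwards, which is why each trial line ends with DROP.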
NOT NOT simply reverses the truth value of TOS, leaving '1' if the TOS was zero. It has exactly the same effect as [0=].
SPACE To output a space to the screen, use SPACE. On its own, that is not a lot of use, but its extension SPACES will output <TOS> spaces. The colon definition of SPACES is effectively:
: SPACES 0 DO SPACE LOOP ;
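One caution for anyone copying that definition into a present-day standard Forth: there, a DO whose limit and start are equal does not skip the loop, so a count of zero would misbehave. The sketch below uses ?DO, which does skip in that case. The name MY-SPACES is invented so that the system's own SPACES is not redefined.

```forth
\ Sketch in standard Forth (e.g. Gforth).
\ MY-SPACES is a hypothetical name, chosen to avoid redefining SPACES.
DECIMAL
: MY-SPACES ( n -- )
   0 MAX                  \ treat negative counts as zero
   0 ?DO SPACE LOOP ;     \ ?DO skips the loop when the count is zero

CHAR [ EMIT 3 MY-SPACES CHAR ] EMIT   \ prints: [   ]
CHAR [ EMIT 0 MY-SPACES CHAR ] EMIT   \ prints: []
```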
In next month's article, I will give a listing for a FORTH version of the classic computer task of solving the 'Towers of Hanoi' problem. Before then, however, let's take a look at some of the essential differences between programming in FORTH and 'conventional' high-level languages.
Variables. A key feature of FORTH programming is that variables are held in the stack and not in named locations. Nevertheless, life can get very complicated if you are trying to juggle more than four or five numbers on the stack and a few carefully chosen variables can make a program much easier to write and to follow.
However, good FORTH programs use relatively few variables and these are largely used to store the numbers not being manipulated at a given time. Furthermore, the variables actually saved by name will almost certainly be only those which hold the information which the program is manipulating, rather than all the other numbers needed to control the program.
Think about the BASIC programs you write. You will probably find that the majority of the variables are dummies, loop counters, buffers, etc; in FORTH programs, numbers in this class will generally exist only as transient data on the stack(s).
Techniques. Remember that FORTH is a structured programming language and that, to make a program easy to follow, word definitions should be kept short (seldom more than two or three lines long). When taken together, these two aspects of the language make it very difficult to avoid top-down programming, I am glad to say.
Fig. 4. The DIV64 operation as it runs through the stack.
Start to write a FORTH program with a single word which will represent the final program; define this word from a few other suitable words, eg:
: PROGRAM INIT BEGIN 10 0 DO CONVERT LOOP MORE? END ;
That defines a simple program which sets itself up (via INIT), converts something to something else 10 times and then goes to see (via MORE?) whether any more conversions are needed.
Having set up the outline of the program, you can go on to define the new words you used, eg:
: MORE? CR " ANOTHER RUN?" KEY 89 = NOT ;
What does that word do? First of all it throws a line; having done so, it asks a question. The next word, KEY, is another standard FORTH word; it puts the ASCII value of the next key to be pressed on top of the stack. The input value is compared with 'Y' (ASCII 89), and NOT reverses the truth test value so that a response of 'Y' leaves '0' on the stack to force a jump back to BEGIN.
Easy, isn't it? Obviously, you would also define INIT and CONVERT and any new words you might use in them, and so on.
Up to this point, you should have been working with pencil and paper, because your draft program will have its highest level word [PROGRAM] at the top, and its most fundamental words at the bottom. You must, however, enter the program from the bottom, because each word must have access to all the words it uses before it can be compiled itself. It is thus virtually impossible to write any practical FORTH program at the keyboard, which is probably not a bad thing since it forces you to think about what you want to do. The ease with which a working BASIC program can be written directly from the terminal is one of the reasons that there are so many badly-structured BASIC programs about!
With a large and/or complicated program, it is not necessary to prepare it all before you start to enter it to the system - sometimes, you won't even be able to. Because the language is so modular, it is easy to use dummy words in some places while you are developing other parts of the program. The dummies do not need to do any processing, but should show that they have been called and, if appropriate, input suitable data to make the rest of the program work. For instance, in the example above, we could use:
: INIT " INITIALISATION ROUTINE CALLED" CR ;
: CONVERT " RUNNING CONVERT NOW" CR ;
while we were making sure that MORE? worked. Having sorted MORE? we could, perhaps, go on to get CONVERT running properly, etc. A step-by-step approach like this makes program development much easier.
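Putting the pieces together, the whole skeleton might be entered as follows on a present-day standard Forth. This is a sketch under several assumptions: it uses [."] for printing, UNTIL in place of END, and [CHAR] Y with <> in place of 89 = NOT. The words are listed bottom-up, in the order in which they would have to be typed in.

```forth
\ Sketch in standard Forth (e.g. Gforth), not MMSFORTH.
\ Words are entered bottom-up: each must exist before it is used.
DECIMAL
: INIT    ( -- ) ." INITIALISATION ROUTINE CALLED" CR ;   \ dummy for now
: CONVERT ( -- ) ." RUNNING CONVERT NOW" CR ;             \ dummy for now

: MORE?   ( -- flag )
   CR ." ANOTHER RUN?"
   KEY [CHAR] Y <> ;      \ false for 'Y' (go round again), true otherwise

: PROGRAM ( -- )
   INIT
   BEGIN
      10 0 DO CONVERT LOOP
      MORE?
   UNTIL ;                \ leave the loop when MORE? returns true
```

Typing PROGRAM then shows the two dummies being called in the right order and lets MORE? be exercised on its own. Each dummy can later be replaced by the real definition and the words above it entered again.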
As you enter your program, FORTH's highly interactive nature lets you test each word in isolation. Because it is so easy to test every detail of the program, debugging time is usually very much shorter than it is with BASIC. On the other hand, effective FORTH programming demands a much more careful design approach than you can get away with in BASIC. On balance, once you are used to the language, it is much quicker to design and fully debug programs than it is with more traditional languages.
Remember then, that while it is always important to plan a program carefully before you sit down at the keyboard, the effort is particularly valuable in FORTH.
Layout. Any sensibly-sized program will normally occupy several screens of source code. However, because this code will be compiled, you do not have to concern yourself with RAM space and run-time as you do with BASIC. Indeed, the program will be much easier to follow if you:
a. Use meaningful word and variable names - 'A-PLAYERSCORE' is much more helpful than 'SA'.
b. Space out the definitions so that they are easy to follow. If appropriate, indent loops, etc to make the nested functions stand out clearly.
c. Use plenty of helpful comments, particularly where you are doing complex stack manipulation. The word [(] defines the start of a comment field, which is terminated by a carriage return or a [)]. If you must use brackets within the comment, then MMSFORTH allows you to start the field with [("] and end it with ["]. You can put comments anywhere in a screen, but it is a FORTH convention that the first line is always a comment briefly describing the function of that screen:
( SCREEN 80 1 OF 6 TOWERS OF HANOI 29/9/81)
: TASK ; ( DEFINE PROGRAM START IN DICTIONARY)
d. If you are using a disc-based system, then it is easy to modify any screen during program development. However, in a tape-based system like my version of MMSFORTH, you have to be a little more careful in your planning, since it is tedious to swap screens between tape and memory.
Because MMSFORTH holds two screens in memory at once, I find it easiest to develop a program in pairs of screens, arranging the definitions of groups of related words into the two screens in memory at any given time.
The listing in next month's article will demonstrate the application of these principles.