|Previous Page > Index of General Forth Information > Going Forth - Part 1|
Going Forth - part 1
Computing Today January 1982 page 45
We start a major new series on how to program in this
unique language which actually grows with your use of
it. Your programs will run faster than they do under
BASIC but it's not difficult to use.
T he Great Language Debate over the last couple of years has tended to focus on the relative advantages and disadvantages of BASIC and Pascal, Although the subject is fascinating, it becomes rather academic when, like me, you only have a 16K computer and no discs.
If you are in this position, there is no point in getting worked up over Pascal because you simply can't run any sensible version of it on your micro. Your choice is normally between BASIC, which is easy and interactive but s-l-o-w, and assembly language, which is difficult, not interactive, but very fast.
Recently, though, another possibility has had some publicity - FORTH. It is a very unusual language, originally developed in the USA for such things as the control of astronomical equipment. It is easy to write (not as easy as BASIC, but still easy), interactive and very, very fast. What's more, the compiler, operating system and fairly large programs can all fit together into a 16K cassette-based system.
This is the first of four articles which will introduce FORTH. In this part, we'll see just what FORTH is (and isn't) and examine some of its simpler characteristics.
I should mention at this point that all my specific comments, and the programming examples, are based on the FORTH system which I use. This is a particularly attractive version called MMSFORTH which runs in a 16K TRS-80 or Video Genie. is available from Miller Microcomputer Services, 61 Lake Shore Road, Natick, MA 01760, USA.
The easiest way to pay is with a Barclaycard (the Americans call them VISA cards) or Access Card (Mastercharge) - simply quote the card number and its expiry date. That's what I did, and I received my system 17 days (including Christmas!) after I posted my letter. How's that for service? I should point out that I have no connection with MMS beyond being a satisfied customer, and that other FORTH systems are available in the UK.
First of all though - why yet another language?
The power of FORTH comes from its unusual philosophy, which allows you to add new functions (called 'words') to the language in order to match it to whatever you want to do. In other words, it is a language that lets you define your own language. Read that again slowly!
FORTH programmers create words to do whatever jobs they need, but, once a word is in the language's dictionary, it can be used in any other program, and has the same status as all other FORTH words. In fact, FORTH has no way of distinguishing between the words that came with the original package and those which have been added to it.
An example might be useful here. Suppose you need to take a number and print it if, and only if, it is divisible by 3. Easy; type:
: 3PRINT DUP 3 MOD 0 = IF .
ELSE DROP THEN ;
Then, to test a number, you simply type:
<number> 3PRINT <ENTER>
and the program prints the number if it is divisible by 3: At this point don't get to worried about what the word actually does - all should soon become clear.
3PRINT looks rather like a subroutine, but it is not - it is a totally new FORTH function. Having defined it, you can then use SPRINT to define other FORTH words, and so on, ad (nearly) infinitum. Not only that, but you can use SPRINT both in a program, and in an immediate mode.
The Pros . . .
It is this unbelievably flexible approach that gives FORTH its power, and makes it particularly useful as a (I hate the word) hobbyist language. Let's take a closer look at some of its advantages.
Speed. FORTH programs run something like 10-20 times faster than equivalent BASIC programs. For reference, Table 1 shows the results I obtained when I ran the standard 'Kilobaud' benchmarks
on my Video Genie. The Table shows the times for floating point and integer BASIC, and for equivalent FORTH programs; as you can see, the increase in speed is dramatic. I could not use a floating-point arithmetic benchmark because my version of FORTH is integer-only (but it does handle them in tie range ± 2,147, 483,648).
Interactive. Although FORTH definitions, like the one for 3PRINT above, are actually compiled into the language when you enter them. The whole process is so fast that the language is just as interactive as BASIC. Once a word has been compiled into the FORTH 'dictionary' it can be used in an immediate mode, without further compilation. This makes program development much faster than with languages like Pascal that need a separate compilation cycle every time a change is made.
Structured. FORTH is inherently a structured language, with all the advantages this implies. For example, as far as I can tell, it is impossible to use the GOTO construction in FORTH. That is not to say that you cannot write bad FORTH programs. . .
Fig. 1. The MMSFORTH memory map as implemented on a Video Genie.
Computing Today January 1982 page 46
Transportable. Through the efforts of its developers, Forth Inc. (1) and the International FORTH users Group (IFUG), all the implementations of the language are very similar, and programs can easily be moved from one system to another. After all, when you can define a language to suit your requirements, dialect problems tend to disappear overnight.
Quick Development. Program development is very quick with FORTH - once you are used to the language, it is probably faster than writing in BASIC. One of the reasons [or this is that the more programs you have written, the more wards you have created which you can draw on for further programs.
Flexibility. Because of its user-defined nature, FORTH is enormously flexible, and its vocabulary can be matched precisely to a task's requirements.
However, nothing is perfect and, inevitably, FORTH has disadvantages. The main ones are:
Readability. FORTH programs tend to be very hard to read (but not to write). This is partly because you can do an awful lot with just a few words and partly because the language's syntax is, to say the least, unusual.
Integer Arithmetic. Most FORTH systems are integer-only. On the other hand, MMSFORTH supports double-precision integers (± 2,147,483,648) and its disc-based version has a floating-point option. Actually, how many programs you write are in, or could be in, integer arithmetic? Probably, like mine, a large proportion.
Unfamiliarity. FORTH's poor readability is partly a result of its unfamiliar approach. It works through a pair of user-accessible stacks, and uses Reverse Polish Notation (RPM). (Keep reading!). On the other hand, anyone who has used a Hewlett-Packard calculator will immediately be at home with the language, (or those following the Number Crunching articles in Computing Today, Ed).
Error-Trapping. FORTH gives the programmer an astonishing degree of freedom of access to the computer hardware; in this respect it is assembler-like.
Unfortunately, this approach tends to sacrifice some of the language's error-trapping possibilities. Grammar and vocabulary errors are spotted as well as with any other language hut if for example you are using an array, you must do your own checks to ensure that you are not going outside its bounds. The inevitable result of such an error would be a crashed system and, probably, a corrupted program. If you'd rather that the computer did all the checking for you, then read no further - FORTH is not for you.
Although FORTH has been implemented on computers of all types, and a micro version first appeared in 1976, it has only become widely used in the last two to three years, and packages are now available for most micro systems. The basic syntax is controlled to some extent by the International FORTH Standards Teams, although the anarchic nature of the language prevents this control from being more than a guiding hand.
At the core of any FORTH system is the DICTIONARY. This is a list of items, each of which defines a FORTH word. Since most words in the language are specified in terms of other words (the whole system is constructed from a handful of core words), a dictionary entry is essentially a list of pointers to other words. These words point to yet more words, and so on; a dictionary of this type is called a THREADED LIST.
Whenever a word is used, the FORTH INTERPRETER executes it by following the pointers. It may help if you imagine the execution of each word as being similar to going down a long line of nested subroutines, but don't carry this analogy too far. The secret of FORTH's extensibility is that, when new words are defined, they are COMPILED directly into the dictionary, which thus varies in size as the program is constructed. The operations system cannot tell the difference between the original and the new words, and the latter simply become part of the language. Fig.1. shows a simplified memory map of MMSFORTH, which is a typical system.
Unlike a purely interpreted language such as BASIC, the FORTH code that is executed when the program runs is held in its compiled form. However, unlike FORTRAN, the compiler does not produce machine-code; it produces
a dictionary entry which must then be interpreted. FORTH's operation is thus a mixture of compilation and interpretation, bringing together, amazingly, the advantages of both approaches.
Once a program has been written and compiled into the dictionary, the source code no longer exists within the system, other than on tape or disc. The code is written in BLOCKS, the block size varies according to the storage medium and the system, but FORTH actually works on groups of blocks called SCREENS. A screen is normally structured as 16 rows of 64 characters - coincidentally very convenient for a TRS-80. In MMSFORTH, blocks are actually the same size as screens, each holding 1024 characters:
Although FORTH's immediate mode by passes the block/screen format, programs are written and loaded by using the system's SCREEN BUFFERS (MMSFORTH has two. Implementations of the language generally use a virtual memory system to make sure that a modified or newly-written buffer is saved before it is overwritten by a new one. Cassette-based virtual memory is something to marvel at!
Before you can run a program, its appropriate screens are read in turn into the buffers, from where the LOAD word compiles them into the dictionary. With a disc-based system, the detail of the screens is hidden to some extent, although it is all too apparent when you use cassettes.
FORTH's most obvious feature, however, is its STACK. All data goes into, comes out of, or is processed in, the system via a Last-In First-Out (LIFO) stack. Before data can be operated on, it must be pushed onto the stack, which leads to the curious (to a BASIC programmer) effect of all words appearing after the data they manipulate.
This is called Reverse Polish Notation (RPN) and is very strange at first. Unless a word explicitly does something else, all the operations work on 16-bit data-words; don't get confused with a normal micro's stack operations, which often work on single bytes.
Actually, almost all computers use the stack/RPN approach when they run high-level languages, but the fact is normally hidden from you. In FORTH, however, it stands naked and unashamed. To be honest, this is one of the reasons that FORTH is so fast and undemanding of space - the programmer does the work that other languages' compilers or interpreters do.
Computing Today January 1982 page 47
The best way to get a proper understanding. of RPN is to look at a few examples arid try for yourself. It is truer of FORTH than of almost any other language that you must learn by experience.
Basic FORTH Words
FORTH words can be any size and can use any character available on the keyboard. If you really want to, you can define "!?6A/+ $" to mean something. Furthermore, spaces are vital; each word must be separated from the next one by at least one space.
At this point, I will define two conventions. Normally in this series the FORTH words will be quite distinct from the rest of the text, but whenever there might be some ambiguity, I will enclose them between double quotes " . ". Assuming that there are no typesetting errors, everything between the quotes is FORTH! The second point is that anything that the computer outputs will be underlined - what you type in is not. There is always an implicit ENTER before any output appears.
FORTH has the four arithmetic operators familiar to all programmers ( + — / *), and they have the usual meanings. You must remember, though, that they operate on the top two numbers in the stack, which must be put there BEFORE the operator is used. Thus you can have the sequence
4 5 1 + OK
The OK is the normal FORTH response which indicates that it has done what you told it to do (which May not he what you meant...).
What happened? A number on its own means "put this number on top of the stack", so first of all '4' was put on the stack, and then '5' was pushed on top of it. The "+" popped them both off, added them, and pushed the answer (9) back on to the stack (Fig.2.).
Multiplication is much the same:
4 5 0K
leaves 20 on the stack.
What about subtraction and division, where the sequence of events is vital? The rule is easy: the number on the top of the stack (the last one) is taken away from, or
divided into, the second from top item. The sequences:
12 4 - OK
12 4 / OK
leave 8 and 3 respectively on the stack.
All these operators take the top two items off the stack, replacing them with a single item, which is the result, and they all follow the pattern of Fig.2.
You can chain numbers and words together almost indefinitely before you hit ENTER. FORTH tackles them strictly from the left, putting numbers on to the stack and manipulating them with the words, and the language does not use a hierarchy of operators (eg "*" before "+").
For instance, suppose you wanted to calculate:
( (7+3*2) * 6+8) / 2
The FORTH sequence
7 3 2 * + 6 * 8 + 2 / OK
would do the job, and leave the answer 43 on the stack. Fig. 3. shows what is happening all the way through this sequence.
Easy, isn't it? Notice that using RPN means that you never need to bracket anything - the stack operations give an equivalent effect — which is characteristic of the notation, Here is another example. What is left on the stack after
3 7 2 1 8 * - 5 8 - / + OK
is finished? Try to work out the answer yourself before you look at Fig. 4.
So now we can put numbers on the stack and manipulate them once they are there. What we can't do yet is output them. The FORTH word "." (yes, a full-stop) is the answer; its job is to take the top number off the stack and print it on the VDU. You can have examples such as:
7 20 . 20 OK
7 20 . . 20 7 OK
2 10 + 3 / . 4 OK
2 3 . . . 3 2 STACK EMPTY!
Your can see that "." takes no notice of what has gone before — it blindly takes the top number and prints it, removing it from the stack in the process.
In the last example, we tried to print three numbers, but the stack only contained two. The third "." did not find anything and so generated the standard FORTH 'STACK EMPTY!' error message.
Fig. 4. A really involved operation, just try this in algebraic notation!
Computing Today January 1982 page 48
That last point leads to a fundamental rule of FORTH programming:
Always make sure that the stack contains enough numbers or data for your word to work on.
Equally, you must not leave unwanted numbers on the stack. Remember that the stack grows down to meet the dictionary growing up - if they ever meet, the system crashes! So:
Never leave unwanted numbers or data on the stack.
We can now do basic arithmetic on the stack but, to write effective programs, we also need to be able to manipulate the arrangement of the data on the stack directly. FORTH provides a number of words to meet this need.
DUP. DUP is used to make a copy of whatever (16-bit word) is on top of the stack, and push it on top of the original. This gives, for instance, an easy way of calculating a square:
16 DUP * . 256 OK
SWAP. One of the secrets of good FORTH programming is to do as much as possible on the stack, and to use the minimum number of named variables. SWAP is a word that makes this much easier by literally swapping the top two numbers on the stack:
10 6 - . 4 OK
10 6 SWAP - . -4 OK
OVER. We can go one better than SWAP by using OVER, which takes a copy of the second item, and puts it on top of the original top item. Thus we could calculate by:
(10-6) * 10
10 6 OVER SWOP - * . 44 OK
Fig.5. shows what is going on here.
ROT. The ROTate word takes things even further by rotating the top 3 items on the stack. The third item is removed, and pushed on top of the original top two items to become the new top-of-stack (TOS). Hence:
2 3 4 ROT - * . 6 OK
If you are confused at this point, Fig 6 may help. Later in this series, we'll see some examples of ROT in use.
DROP. It is sometimes useful to be able to 'lose' the top item on the stack. DROP does just that — it pops it off and forgets about it.
The examples above are really very trivial, and it may be hard to see the point of the stack operators. However, their value will become obvious when we start to define new words, and try to do more complex things with the language next month.
That, then, is a first look at FORTH. I have explained that it has many advantages over 'conventional' languages such as BASIC or even (sacrilege!) Pascal. It does, however, need a whole new way of thinking about problems, and you can see that it's not a very easy language to read. You sometimes have to spend a lot of time trying to work out what's on the stack when, and why.
In this month's article, we have only taken a look at FORTH's most fundamental operations: those which do arithmetic and those which manipulate the stack. All the little bits of code have been in the immediate mode type them in, press ENTER, and there is the answer.
Next month, we will start to look at the use of constants and variables in FORTH, and the definition of new FORTH words in order to extend the language. We will also look at the way that the language provides the fundamental structures of, any real program - decision making and looping.
Later in the series, we will go on to see what a typical FORTH program looks like by using the language to solve the classical computing problem of 'The Towers Of Hanoi'. The program will demonstrate just what is involved in writing a FORTH program, and how the programming techniques might be different from those used in, say, BASIC. The 'Towers' program will also give a convincing display of just how fast the language runs.
BENCHMARK BASIC BASIC FORTH
FOR Loop 2.6 1.8 0.165
Loop on comparison 11.3 9.5 0.435
Variable Arithmetic 26.6 27.2 3.1
Constant Arithmetic 27.8 27.0 3.2
Constant Arithmetic 31.2 30.0 3.3
Constant Arithmetic and 50.6 45.1 4.2
Constant Arithmetic with 78.1 69.3 5.6
Nested Loop and Array
Table 1. Timings for the FORTH Benchmarks in seconds