www.jupiter-ace.co.uk

Index of general Forth information > Byte August 1980 What is FORTH?


Byte August 1980 page 100
What Is FORTH?
A Tutorial Introduction

John S James
POB 348
Berkely CA 94701
FORTH is a programming language with a small but fast-growing and enthusiastic user community. Though easy to learn at a terminal, it is difficult to explain abstractly because it is so different from other languages. Even advocates do not agree why it is good or how it should be used.
FORTH was developed for control applications (using a computer to run other machinery), data bases, and general business. It is least useful for big number-crunching jobs (eg: writing a matrix inversion routine), although it can link to subroutine packages written in other languages to incorporate such functions. Unlike Pascal, FORTH gives the user complete access to the machine and does not try to guard the programmer against mistakes. But its modularity and other forms of error control allow ...

Acknowledgments and Availability
Listings 1 thru 7 were run on a FORTH system for the Apple II provided by Cap'n Software, POB 575, San Francisco CA 94101. The PDP-11 examples were run on a system written and distributed by the author. The 8080 example was provided by John Cassady of the Forth Interest Group, POB 1105, San Carlos CA 94070; a similar 8080 FORTH system is available from Forthright Enterprises, POB 50911, Palo Alto CA 94303. Other members of the Forth Interest Group contributed helpful suggestions. And of course we are indebted to the inventor of FORTH, Charles Moore of FORTH Inc, 2309 Pacific Coast Hwy, Hermosa Beach CA 90254, who started it all.

production of remarkably bug-free application programs-perhaps more than any other language in common use. The compiler uses much less memory than Pascal does, and its programs run about equally fast. FORTH is much more interactive than most conventional implementations of Pascal. FORTH is available on most common personal computers (eg: Apple, TRS-80) and all major microprocessors (eg: 8080, 6800, 6809, 6502, PACE, LSI-11, and 9900). An international FORTH Standards Team exists, and standard systems are virtually identical among all different machines.
This article will describe what it is like to program in FORTH. A group of annotated terminal sessions, shown in listings 1 thru 10, will provide more details on the language itself.


The Philosophy of FORTH
FORTH reduces the cost of a subroutine to very little, and the whole language is built on functions that are like subroutine calls. The programmer keeps defining new words (new functions) from old ones until, finally, one of them is the whole job. Most programmers keep each definition short, usually one to three lines not counting comments. The definitions are compiled as entered and are immediately ready to run.
Because FORTH definitions are short, all possible execution paths of the definition can be tested easily. Since most functions work exactly the same when executed as commands from the terminal or when used as components in
further definitions, they can be tested immediately from the terminal. And the functions are so general that there is no sharp distinction between program and data.
Since programmers define their own operations, special application libraries of FORTH words can be developed. The new routines are integrally part of the language, so they do not need any special calling sequences, and they are immediately ready to run. Even the original words supplied with the system (there are about one hundred of them), can be redefined if desired, adapting the language for special circumstances. Also, programmers can create their own data types or operation types (eg: their own kinds of arrays or other data structures, or new classes of operations). This flexibility allows unprecedented "customization" of a language to the requirements of a particular installation or application. The finished programs are easily modifiable when requirements change because they are composed of pre tested building blocks specially designed for that kind of program.


Stack and Postfix Notation
A smaller convenience of FORTH is that you do not have to do much coding when you start a new program. As soon as the system comes up, all your previous work is ready to go, just as if it were originally part of the language.
A feature that some people do not like is FORTH's use of a stack (explained below) and its postfix notation (also called reverse Polish

Byte August 1980 page 102

Most FORTH operations communicate only through a stack.

notation or RPN). In postfix notation (a system used on most Hewlett-Packard calculators), arithmetic formulas are written with the operations after their arguments, not between them. For example, "2+3" becomes { 2 3 + } in FORTH or other postfix systems; "(4 + 5)*(6 + 7)" becomes { 4 5 + 6 7 + * }. (See explanation below.) No parentheses are needed in postfix.
Some programmers do not like postfix, and they ask, "Why doesn't someone write an algebraic-to-postfix translator for FORTH? That would be easy to do." The reason is that postfix has benefits far more important than the compiler-writer's convenience. It greatly simplifies linkage to subroutines. With postfix, you do not need any CALL statement or argument list, or any formal parameters in the subroutine. While arithmetic-formula operations (add, subtract, etc) must take either one or two arguments and return exactly one result, postfix functions can have any number of arguments or results.
In FORTH, most operations communicate only through a stack. The stack, perhaps the most important data structure in programming, is used in almost all languages, but most languages hide it from the user. In FORTH, the user controls the stack directly.
A stack is a pile of numbers where the last ones put in are the first ones taken out; that is, you can only remove the number that is on top of the stack. It is like a stack of trays in a restaurant; trays are conveniently added and removed only at the top. (Unfortunately, computer-science texts do not agree on terminology, and a few call the top of the stack "the bottom.")
To see how a stack works in computation, consider the expression { 2 3 + } above. In FORTH, numbers are compiled as operations which place their

Figure 1: Evaluation of the postfix-notation expression, { 4 5 + 6 7 + * }.Numbers are pushed onto the stack at the top. Operators (here, + and *) pop the top two entries off the stack and push the result of that operation back on the stack. For example, the first plus sign (column 4) replaces the 4 and 5 on the stack with 9, the result of the addition operation.

values onto the stack. So when the 2 is executed, it is placed on top of the stack, which then looks as follows:
2
__
__

values where the dashes represent whatever data may have been on the stack before. Then after the 3 has been encountered, the stack becomes:

3
2
__
__

Then the + is executed. The 1-character word + takes two arguments from the stack (destroying them), performs the addition, and leaves the result on the stack. So the stack finally is:

5
__
__

The reader can verify that when the formula { 4 5 + 6 7 + * } is executed, the stack goes through the sequence shown in figure 1.
Now we can see why FORTH is not the best language for big number-crunching jobs. Numbers to be operated on must be moved to the stack in addition to whatever operations are to be carried out, and this extra movement slows FORTH down for this kind of computation. Once on the stack, however, arithmetic is fast (for example, single instruction execution for addition on some 16-bit machines, more for 8-bit machines). Also, FORTH can link the useful instructions of one routine and those of another in as little as one or two instruction executions (depending on machine architecture). This makes FORTH programs much faster than BASIC, usually ten times faster or more
(assuming an interactive BASIC, that is-FORTH is always interactive). But a good FORTRAN compiler's code may do number-crunching several times faster still.

Characteristics of FORTH Code
FORTH is a structured language (as is Pascal) in that it has no GOTO or statement labels in the language. Discussion of structured programming is outside the scope of this article, but its importance for program correctness and maintainability is recognized.
FORTH object code (ie: a compiled program) is extremely compact, even more so than machine language. The reason is that no matter how much work an operation performs, each invocation of it takes the same space in the object program-two bytes. The bigger the program, the greater the memory advantage, since the hierarchical structure of programs allows increasingly powerful and application-targeted operations to be built up. But FORTH has a relatively large run-time memory overhead, so small programs can take less total space in other languages.
[The reason that a FORTH call can be shorter than a normal machine-language subroutine call (usually three bytes) is that a FORTH program is interpreted by a FORTH interpreter (also part of the FORTH language) in much the same way that a BASIC program is interpreted by a BASIC interpreter. The "relatively large runtime memory overhead" mentioned above is the FORTH interpreter plus a core of FORTH words defined in machine language. When a FORTH program is very large, it saves enough memory in FORTH calls to make up for run-time memory overhead.... GW]
The complete FORTH system (itself largely written in FORTH) takes about 7 K bytes, and this whole system including the compiler is commonly left in memory as a run-

Byte August 1980 page 104
-time package. Therefore, 16 K bytes and a floppy disk for storing source programs are sufficient hardware for an excellent FORTH system (compare this with the memory requirements of Pascal, 48 K bytes or more). When compactness is especially important, as when programs are burned into read-only memory and embedded in machinery, FORTH's compiler, terminal handler, and operation names anything not needed to run can be stripped out of the application program, leaving a runtime package of about 800 bytes, instead
of the usual 7 K bytes.
FORTH programming is reentrant; this means that different users can share the same copy of a program in memory while running at the same time. FORTH easily handles multitasking, including multiple terminals used for program development. (At present, however, most of the low-cost systems on the market are still single-user.) FORTH is recursive, meaning that routines can invoke themselves.
Suppose you want to link your high-level-language program to a machine-language subroutine (eg:
you may be controlling a high-speed device and need the full speed of the computer to keep up). Many languages make this linkage difficult or impossible. In FORTH, however, it is very convenient. You can type in or load from disk a machine-language routine, using a FORTH assembler, and the new routine can be executed immediately. Listing 9 shows examples for PDP-11 and for 8080.
The word CODE invokes the FORTH assembler and begins the definition of a machine-language routine. Mnemonic instructions and address-mode symbols are understood by this assembler, and the whole power of FORTH is available for address arithmetic at assembly time. FORTH assemblers use postfix notation, so op codes come after their addresses, not before as in conventional assemblers.
The machine-language code is generated as the definition is being entered. The completed operation works just like any other FORTH word, so the user does not need to use any special calling sequence, or even need to know which operations are defined in code and which are not. (In fact, about fifty FORTH words are written in machine language-all other words in FORTH are ultimately defined in terms of these fifty words.)
The FORTH assembler allows structured conditionals and loops at the machine-code level; it can also assemble unstructured code if desired. Users can define their own macro-instructions, use custom-made data types, etc.
In other words, the FORTH assembler allows structured programming even in machine code, and it links the resulting machine-language subroutines into the system immediately. No separate assembly and linking-loader passes are needed, and the associated file management overhead is avoided.

Some More Advantages
FORTH programs are highly transportable between different computers. Any assembly-language routines used by the program must be rewritten, but most applications do not need any assembly, and very few need more than a handful of short, critical routines. When FORTH systems have been designed for compatibility, large applications can be moved among
Magazine Advert

Byte August 1980 page 106
machines, with little or no change. For example, it can be practical to down-load program development from a PDP-11 to a TRS-80 or an Apple II. It is even possible to write the software for a product before a hardware commitment is final.
Another advantage is that FORTH is a self-contained operating system. The 7 K bytes include terminal and disk handlers and a rudimentary file system. No other software is needed anywhere in the computer. Yet, if a monitor in read-only memory is available, FORTH can use it; and FORTH can run as a task under some other operating system (eg: CP/M) when that is wanted. FORTH can link together otherwise incompatible pieces of systems: software in read-only memory, operating systems, subroutine packages, and hardware. It provides a user interface that enables subroutine packages normally used by batch (ie: noninteractive) programs, mostly on older, larger computers, to be used interactively.
FORTH puts you in charge of your computer. You can understand everything happening
in your software or in any desired parts of it, and you can change it. This means no more "black box" systems that only the manufacturer's specialists can understand, no more dependence on someone else for upgrades, fixes, or documentation, and no more question of who is responsible if software does not work. The whole system is written in FORTH, right down to the bits-your application programs, the compiler, the operating system, the I/O drivers, etc. You do not have to learn some other language or be a systems specialist to modify it.

Disadvantages
Few FORTH systems used today have floating-point arithmetic. This is not a fault of the language; rather, it reflects its history in microcomputer control applications, where integer arithmetic is often needed for speed. Now there is more pressure for floating point, and it is becoming available.
A more fundamental limitation of FORTH is that it is not a typed language (unlike Pascal). For example, if an integer operation is
performed on a floating-point quantity, no message is printed either at compile time or at run time to warn of this error. (However, the user can add type checking and other error-preventing operations into any FORTH word.)
It may seem that unreliable code would result from the un typed nature of FORTH, but, in fact, FORTH code is remarkably solid and bug-free. The modularity and excellent testing environment aid error control; and type mismatches are less dangerous than most other mistakes because they are easy to detect.
Another criticism of FORTH is its lack of a directory file structure. Again, this is historical and is not a characteristic of the language, which can be developed to use any kind of files.
The traditional FORTH file system is primitive, but in practice it has worked very well. The entire disk (or disks) is a single virtual array of blocks numbered from 1, with the block size standardized at 1024 bytes regardless of physical disk sector size. The blocks (called screens because they can be displayed as sixteen 64-character lines on a terminal) are automatically buffered so that they are physically read and written only when necessary. A LOAD command will read a given screen and treat the information exactly as if it had been typed in a terminal session, thereby compiling source code or executing commands (depending on the contents of the screen). The LOAD instruction can be executed within a screen; in this way, a single LOAD command can control the compilation of large source programs.
This disk-based file system allows any part of the disk to be read or written with a single access. Load screens or data areas can be saved by name, and portions of the disk can be protected by redefining the names of a few input and output operations so that they check before writing and/or reading.
The disadvantage of this system is that there is no directory; when a new disk is inserted, the user or the program must know the block numbers for load screens and data files. Also, FORTH source programs are traditionally stored without tabs or truncation of blank lines, making white-space (ie: unused area on a
Magazine Advert

Byte August 1980 page 109
Magazine Advert
line) and space for comments costly in disk space and load time, discouraging good program layout. For these reasons, there is increasing interest in changing to a directory file system. Perhaps it will be written on top of the screen system currently in use.
The most important criticism of FORTH is that its source programs are difficult to read. Some of this impression results from unfamiliarity with a language different from others in common use. However, much of it results from its historical development in systems work and in read only memory based machine control, where very tight programming that sacrifices clarity for memory economy can be justified. Today's trend is strongly toward adequate commenting and design for readability.
FORTH benefits most from a new, different programming style; techniques blindly carried over from other environments can produce cumbersome results. Most FORTH programmers seldom use named variables; they use the stack instead so that the implicit commenting normally available through choice of variable names is only provided through comments and user-defined operation names. Single definitions that would have more than about three unrelated numbers on the stack at any one time are best split into two or more operations; most programmers learn to keep their definitions short.
FORTH enforces extreme modularity, so the decomposition of each task into component parts is critical. Top-down design is especially important. Large jobs should be written as application-oriented libraries of operations to make teamwork and maintenance easier. A much larger fraction of the total programming effort is spent on design, with less on coding and debugging. For these and other reasons, FORTH creates its own issues of style, which are only beginning to be explored.

A Taste of FORTH
FORTH is an interactive language best explained by example. Because of this, a series of listings (listings 1 thru 10) with fairly detailed explanations make up the rest of this article. In the listings that follow, underlining denotes user keyboard input.

Byte August 1980 page 110
FORTH uses punctuation in some of its words, which makes representing them in text a difficult problem. For example, one FORTH word is ("), which could be taken to mean one of several character combinations. (For your information, the word has three characters and is made from a left parenthesis followed by a double quote mark and a right parenthesis.)
To decrease the chance of confusion while trying not to clutter text unnecessarily, we will sparingly use braces, { }, to isolate the character string within as a FORTH word or phrase. (For example, the above word would be written { (") } .) Braces will be used only under the following situations:
when the material being quoted is
a phrase of FORTH words (eg: { 26 LOAD } or 1 3 5 ± )
with the FORTH words { .
(period), { , } (comma), { : } (colon), { ; } (semicolon), { ? } (question mark), { ! } (exclamation point), { ' } (single quote mark), and { " } (double quote mark)
with any word using the above
punctuation marks (eg: { $. } or { ." } ).

All other FORTH words will be set apart by a space on either side of the word. So, in this and other FORTH articles in this issue, braces will always signal a FORTH word or phrase. The braces are not part of the word or phrase, and FORTH words will never use braces within the body of a figure or listing.... GW

On the Necessity of Using Camera-Ready Copy
Examination of listings 1 thru 10 will reveal a variety of typefaces used. This variety is present because each listing was created by the printer of the system producing the listing. Such listings are called camera-ready copy, which means that we can reproduce them in BYTE without inadvertently adding the errors that creep in with the retyping of a listing. Contributors to BYTE and on Computing are strongly encouraged to submit camera-ready listings made with a fresh ribbon, since this helps us to improve the accuracy of the article.




Magazine Advert
Listing 1: FORTH as a calculator. FORTH is easy to approach because it can be used as a calculator. Here, the programmer has not defined any new operation but has used addition, multiplication, and print (the dot means print). These are three of about one hundred operations that are available when FORTH first comes up. Programming consists of defining new operations which can be custom designed for a particular task or a particular industry.
FORTH uses postfix (also called RPN or reverse Polish notation) arithmetic, which is best known from its use in Hewlett-Packard calculators. In postfix notation, the operations are written after their arguments, not between them. The text of this article shows how postfix notation works, using a data structure called the stack, and it explains the formulas in this example.
Postfix notation, which does not use parentheses, is more general than ordinary arithmetic notation. Its biggest advantage is that it greatly simplifies the writing and calling of subroutines.
In these examples, underlining indicates what the user has typed on the terminal. FORTH does not process the line until you type a carriage return. The OK prompt means that the system has completed its work and is ready for new input from the user.

2 3 + . 5 OK
4 5 + 6 7 + * . 117 OK
 

Listing 2: Changing number bases. FORTH can work in different number bases and can change any time, so it serves as an octal/hexadecimal/ binary/decimal calculator within the limits of 16-bit numbers (or 32 bits for double precision). The FORTH word HEX converts FORTH into a hexadecimal machine, and all numbers are printed in hexadecimal

Byte August 1980 page 112
Magazine Advert
hexadecimal until some other operation changes the base again. FORTH always begins a session in decimal radix.
The operations DECIMAL and HEX are built into the system; OCTAL , BINARY , and TRINARY (base 3) are not. So when OCTAL was first used, the error message { OCTAL ? } indicated an undefined word; that is, the system did not recognize the word OCTAL . In the next line, the user defined OCTAL (line 6). This example illustrates FORTH's extensibility; users can extend the language to include new operators.
Incidentally, the second error message { 12885 ? } in line 12 resulted because the system was in binary (from the line above), and, in binary, numbers must contain only the digits 0 and 1 , so 12885 was not recognized as a number. It was treated as a word, and, because there was no operation named 12885 , the error message was generated.
OCTAL and the other number-base operations work by giving a new value to BASE , a variable used by the system. Defining new operations is more fully explained in listing 3. The { ! } operation (store) is explained later.
Number bases only affect input and output. All internal computation


HEX OK
3BE8 C8 + . 3CB0 OK
25 2F * . 6CB OK
DECIMAL 1348 HEX . 544 OK
DECIMAL 1348 OCTAL . OCTAL ?
: OCTAL 8 BASE ! ; OK
DECIMAL 1348 OCTAL . 2504 OK
DECIMAL OK
: BINARY 2 BASE ! ; OK
: TRINARY 3 BASE ! ; OK
12885 BINARY . 11001001010101 OK
12885 BINARY . 12885 ?
DECIMAL 12885 BINARY . 11001001010101 OK
DECIMAL 12885 OCTAL . 31125 OK
DECIMAL 12885 HEX . 3255 OK DECIMAL 12885 TRINARY . -122200020 OK
DECIMAL -12885 TRINARY . -122200020 OK
DECIMAL OK
is in binary, so there is no speed penalty for using nondecimal numeric bases.



Figure 2: An explanation of the IF ... ELSE ... THEN construct. (See listing 4.) As shown in figure 2a, the portion of code executed depends on the value of the number on top of the stack when the word IF is encountered. If the call this number N and say that t he number has a boolean value of true if its numeric value is nonzero and false if 0, then figure 2b gives the equivalent construct to figure 2a in conventional flowchart notation. Here and in figures 3 thru 5, the dotted box indicates the boundaries of the construct (as oppose to values assumed to be on the stack).

Listing 3: Defining new operations. Here, a new operation CUBE is created. CUBE replaces whatever number is on top of the stack with the cube of that number. The statements within the parentheses are comments.
The colon, { : } begins a FORTH word definition; the word following it is the name being defined. Semicolon, { ; }, ends the definition.
The new word CUBE will first execute DUP, which duplicates the number on top of the stack, making a second copy. The second DUP leaves three copies. The first * causes the top two copies to be replaced by the square of the number; the next * computes the cube, and then all three copies of the original number are gone, leaving the cube of the number on top of the stack.
This colon definition shows one of several ways to create new words in FORTH. Most words that appear inside the definition are compiled and not executed immediately.
All words and numbers in FORTH are separated by one or more blanks (and/or carriage returns). FORTH operation names can be up to thirty-one characters long and can consist of letters, numbers, or any other characters. For example, an operation name could be a number, or it could be non printing characters only. In practice such names are rarely used, but they illustrate the flexibility that is available.

Byte August 1980 page 114
Magazine Advert
This listing shows CUBE being executed from the terminal. It can also be used as a component in further definitions. A fundamental property of FORTH is that operations defined by users are indistinguishable from those which were originally part of the system.


: CUBE ( N -> N. CUBE A NUMBER)
DUP DUP ( NOW THERE ARE THREE COPIES)
* * ( GET THE CUBE)
; OK
5 CUBE . 125 OK
-28 CUBE . -21952 OK
HEX 17 CUBE BINARY . DECIMAL 10111110000111 OK
 

Listing 4: Conditional branching. The IF ... THEN is for conditional execution. IF takes one argument off of the stack; this argument is interpreted as a boolean or truth value, with 0 meaning false and any nonzero value meaning true. If true, any statements between the IF and THEN are executed. In either case, execution continues after the THEN , which terminates the conditional. There is also an optional ELSE clause that is executed only if the argument is false. (See figure 2.)
Here, the true-clause contains only one word, MINUS , but it could contain almost any FORTH statements, including other conditionals and loops nested to any practical depth. These statements run fast because they are compiled into a form of object program called threaded code.
Incidentally, the FORTH word 0< returns a boolean value indicating whether its argument (the number on top of the stack) is less than zero. The DUP is necessary because 0< follows the FORTH convention that operations should destroy their arguments on the stack. MINUS reverses the sign of its argument (the top stack number).
Items in parentheses are comments. The comment "N - > N" in the first line is to show that this operation takes one number off of the stack and returns one number to it. Perhaps the most important information to put in the comments accompanying each new operation is what arguments it takes off of the stack and what results it returns to the stack.

: ABSOLUTE-VALUE ( N -> N. ABSOLUTE VALUE)
DUP 0< ( GET BOOLEAN, TRUE IF NEGATIVE)
IF MINUS THEN ( NEGATE THE NUMBER IF TRUE)
; OK
10 ABSOLUTE-VALUE . 10 OK
-5 ABSOLUTE-VALUE . 5 OK
 

Listing 5: The DO ... LOOP, a structured loop with a counting index. DO takes two arguments from the stack, the initial value of the index (on top) and the final value plus 1. (See figure 3.) These indices are written in reverse order from most other languages, making the loop terminating value (which is more often passed as an argument) more accessible on the stack.
CR simply performs a carriage return. In this example, the index values are literals (10 and 0), but they can also come from variables or from computations of any complexity; anything that gets the indices onto the stack is legitimate.
This listing also shows a timing benchmark; the word TIME-TEST does 30,000 empty loops. On an Apple II running FORTH, TIME-TEST executes in less than 4 seconds. In Apple Integer BASIC (which is a fast BASIC), 30,000 empty loops take 40 seconds.

: 10CUBES ( ->. PRINT A TABLE OF CUBES OF 0-9)
10 0 ( INDICES OF LOOP)
DO ( START LOOP)
CR I . I CUBE . ( PRINT A NUMBER AND ITS CUBE)
LOOP ( END OF LOOP)
; OK
10CUBES
0 0
1 1
2 8
3 27
4 64
5 125
6 216
7 343
8 512
9 729 OK
: TIME-TEST 30000 0 DO LOOP ; OK
TIME-TEST OK
 

Byte August 1980 page 116
Magazine Advert
Listing 6: The BEGIN ... UNTIL loop. This loop takes one argument, a truth value, usually computed within the loop, at the end. If it is false (0), control branches back to the corresponding BEGIN ; if the value is true (nonzero), the loop ends, and control transfers to the next word in the program. (See figure 4.)
Note that the test of the value on top of the stack occurs at the end of the body of the loop; this guarantees that the body of the loop will be executed at least once.
The word = removes the top two numbers from the stack and returns a truth value of 1 if they are equal, 0 otherwise. In this example, the index stays on the stack and is duplicated before each use. The DROP at the end throws away the top stack value; this prevents the used index from cluttering the stack.
The warning message "10CUBES ISN'T UNIQUE" notifies us that the same name has already been defined. The only penalty for reusing a name is that the former definition becomes inaccessible for the rest of the program. Therefore, you do not have to remember a list of reserved words in FORTH; if you do not know about a name or have forgotten about it, you probably were not planning to use it anyway. But, in case of a mistake, the bad definition can be deleted with a FORGET operation, or the source code can be changed on disk.
[Some versions of FORTH use BEGIN ... END instead of BEGIN ... UNTIL .... GW]

: 10CUBES ( ->. SAME, USING 'UNTIL' LOOP) 10CUBES ISN'T UNIQUE
0 ( INITIAL VALUE OF INDEX)
BEGIN ( STARE LOOP)
CR DUP . DUP CUBE . ( PRINT A # AND ITS CUBE)
1 + ( INCREMENT) DUP 10 = ( TEST FOR INDEX=10)
UNTIL ( END OF LOOP)
DROP ( THROW AWAY USED INDEX)
; OK
10CUBES
0 0
1 1
2 8
3 27
4 64
5 125
6 216
7 343
8 512
9 729 OK
 

Listing 7: The BEGIN ... WHILE ... REPEAT loop. This looping structure tests the value on top of the stack at the beginning of the loop; because of this, this loop can execute 0 times. REPEAT causes an unconditional branch back to BEGIN , and WHILE branches out of the loop (just beyond REPEAT) if the truth-value which it finds on top of the stack is false (ie: 0); see figure 5.
All of these looping and conditional branching structures can be nested within each other to any practical depth. Any mismatching can be detected at compile time. Most FORTH systems allow these structures only inside colon definitions; they cannot be executed directly from the terminal.
[Some versions of FORTH use: BEGIN ... IF ... WHILE or WHILE ... PERFORM ... PEND instead of BEGIN ... WHILE ... REPEAT .... GW]

: 10CUBES ( ->. SAME, USING 'WHILE' LOOP) 10CUBES ISN'T UNIQUE
0 ( INITIAL VALUE OF INDEX)
BEGIN
DUP 10 < ( LOOP TEST)
WHILE
CR DUP . DUP CUBE . ( PRINT A # AND ITS CUBE)
1 + ( INCREMENT)
REPEAT
DROP ( THROW AWAY USED INDEX)
; OK
10CUBES
0 0
1 1
2 8
3 27
4 64
5 125
6 216
7 343
8 512
9 729 OK
 

Byte August 1980 page 118
Figure 3: An explanation of the DO ... LOOP construct. As shown in figure 3a, the top number on the stack is taken to be the lower limit of the loop variable, I, and the next to-top number on the stack is the upper limit of the loop variable + 1. The body of the loop is shaded, and the loop variable is incremented and tested after the body of the loop is executed. Figure 3b gives the equivalent construct in conventional flowchart notation.
Figure 4: An explanation of the BEGIN ... UNTIL construct. As shown in figure 4a, the body of the loop (shaded) is repeated only if the value on top of the stack when the word UNTIL is reached is false. Figure 4b gives the equivalent construct in conventional flowchart notation.
 
Magazine Advert

Byte August 1980 page 120
Magazine Advert
Listing 8: An example of FORTH looping. A practical use of FORTH's structured looping is this terminal output handler. This example is for a PDP-11; an example for other computers would be similar. Address 177564 (octal) is the output status register of the console terminal; bit 7 of this address is set when the device is ready to receive a character. The ASCII code for the output character can then be placed in address 177566 (the data buffer register).
The FORTH word @ (pronounced fetch) does the work of PEEK in BASIC; it treats the number on top of the stack as an address and replaces it with the contents of that address word. AND does a "bitwise" boolean AND operation. So 177564 @ 200 AND } indicates true (nonzero) only if bit 7 of the status register is set. Until then, the BEGIN ... UNTIL loop does a waiting loop ending on the above condition. When the device is ready, the argument that was given to TERMINAL-OUT (the ASCII character to be written) is still on top of the stack. { ! } (pronounced store) stores the word that is second on the stack into the address that is on top of the stack; so { 177566 ! 1 transmits the character to the terminal data buffer register, from which it will be written onto the terminal by the hardware of the PDP-11 system.
The FORTH word ASCII-TEST was written to test the TERMINAL-OUT word. It transmits ASCII values for all of the printable character set.
Listing 9 shows the same device handler, only written in machine-language code with a FORTH assembler.

OK
OCTAL OK
: TERMINAL-OUT ( CHAR -) . TERMINAL OUTPUT HANDLER PDP-11)
BEGIN 177564 @ 200 RND UNTIL ( WAIT TILL PORT RERDY)
177566 ! ( TRANSMIT THE CHARACTER)
; OK
: ASCII-TEST ( -) . TEST HRNDLER - PRINT CHARACTER SET)
177 40 ( TRRNSMIT ASCII BLANK THROUGH '~')
DO I TERMINAL-OUT LOOP ( OUTPUT THE CHARACTERS)
; OK
DECIMAL OK
ASCII-TEST !"#$%&'()*+,-/0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_abcdefghijklmnopqrstuvwxyz{|}~ OK
 


Listing 9: FORTH words defined by machine-language subroutines, for PDP-11 and for 8080 processors. The operation TERMINAL-OUT-2 behaves exactly the same as TERMINAL-OUT defined in listing 8, but it is written in assembly language. FORTH assemblers use postfix notation, so address-mode symbols and operation codes (instruction mnemonics) follow their operands, unlike conventional assemblers. In the PDP-11 example (listing 9a), { 177564 200 # BIT, } in line 2 assembles a "bit test" instruction that does a logical AND between address 177564 and the literal 200 ( # indicates literal), setting condition codes. { UNTIL, } assembles a conditional branch back to the corresponding { BEGIN, }. The commas are part of the operation names, not punctuation. The word NE tells the { UNTIL, } what kind of conditional branch to assemble. There are also { IF, }...{ THEN, } and { IF, }...{ ELSE, }...{ THEN, } operations; all these code-level structures can be nested.
In the 8080 example (listing 9b), the machine-language subroutine sets up a call to the character-output routine in the North Star disk operating system. In contrast, the PDP-11 example outputs directly to the hardware without using any software outside of FORTH. Either approach could be used on either machine, of course, and each has its own advantages.
The word CODE , like { : } (colon, introduced in listing 3), creates a new definition in FORTH's dictionary for the word following it. CODE also sets the number base (to octal for PDP-11 and to hexadecimal for 8080), saving the original number base, which is later restored by { C; }. CODE also changes the vocabulary, which allows the same names to have different meanings in the assembler and in the rest of FORTH without confusion. Users can create their own vocabularies and sub vocabularies to keep different application libraries separate.
Many FORTH programmers never need to write machine-language subroutines, so they do not need to use an assembler. FORTH assemblers have an unfamiliar postfix notation, but they have the advantage of giving immediate feedback. You know right away whether an operation works, with no wait for assembly passes, linking passes, and file handling. This interactive assembly greatly speeds program development and allows more though testing

Byte August 1980 page 122
Magazine Advert
Collectively, { : } and CODE are called defining words because they are used to create new FORTH words. There are several other such functions in FORTH, and users can also define their own types of defining words, creating new data types or operation types; see listing 10.

CODE TERMINAL-OUT-2 ( Nur-> TERMINAL OUTPUT HANDLER, PDP-11) OK
BEGIN, 177564 200 # BIT, NE UNTIL, (WAIT TILL PORT READY) OK
S )+ 177566 MOV, ( POP FORTH STACK INTO DATA REGISTER) OK
NEXT, ( A 2-INSTRUCTION MACRO TO CONTINUE FORTH EXECUTION) OK
C; ( GET OUT OF THE FORTH ASSEMBLER) OK
OCTAL OK
: ASCII-TEST-2 ( -), PRINT ASCII CHARACTER SET)
177 40 ( ASCII BLANK THROUGH '~')
DO I TERMINAL-0UT-2 LOOP ( OUTPUT THE CHAPRETERS)
; OK
DECIMAL OK
ASCII-TEST-2 !"#$%&'()*+,-/0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_abcdefghijklmnopqrstuvwxyz{|}~ OK
ASCII-TEST
!"#$%&'()*+,-/0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_abcdefghijklmnopqrstuvwxyz{|}~ OK


0 CONSTANT DEV ( DEVICE NO FOR NORTHSTAR DOS ) OK
200D CONSTANT COUT ( NORTHSTAR DOS CHAR OUT JUMP POINT ) OK CODE TERMINAL-OUT-2 ( CHAR->. 8080 WITH NORTHSTAR DOS ) OK
H POP ( CHARACTER IS ON STACK, POP TO HL ) OK
B PUSH ( BC IS INSTRUCTION POINTER, SAVE IT ) OK
L B MOV ( DOS EXPECTS CHAR IN B REGISTER ) OK
DEV A MVI ( AND DEVICE NUMBER IN ACCUMULATOR ) OK
COUT CALL B POP NEXT JMP C; ( DO IT AND CONTINUE ) OK
 


Listing 10: User-defined data types. Because this example is longer, it was not typed in directly like the others, but was stored on disk with an editor (the editor session is not shown here). This example is contained in two disk screens, each of which is a virtual block of 1024 bytes (see text). The commands { 58 LIST } and { 59 LIST } print these screens. The line numbers (0 that 15) are not part of the program and are used only by the editor.
This example creates table-lookup sine and cosine routines for integer-degree arguments. The results are accurate enough for most graphics applications, making this situation an example of the versatility of FORTH, even without floating-point routines.
The definition of TABLE creates a new data type. When TABLE is executed, it creates a new table of numbers taken from the stack; the number on top of the stack tells how many items there are in the table. In this case, { 91 TABLE SINTABLE } creates a table called SINTABLE with ninety-one entries; these entries are the values of the sine of 0° thru 90° , multiplied by 10,000 so that they can be expressed as integers. SINTABLE gives the sine (scaled by 10,000) of 0° thru 90° degrees; SIN does the same, except that its argument can be any number of degrees (from -32,768 to 32,767).
Incidentally, few FORTH programs use as much depth of stack as this one. The system used for listings 1 thru 7 limits the stack depth in order to use "page 0" memory for speed, so this example would have to be modified to run on it.
The <BUILDS ... DOES> construct, which creates the new data type, is one of the most advanced concepts of FORTH. Briefly, the <BUILDS part is executed when SINTABLE is defined; that is, it creates the table. The DOES> part defines what happens when SINTABLE is executed. Once TABLE has been defined, any number of tables of varying length can be declared using the word. Similar definitions can create special-purpose arrays such as word, byte, or bit arrays, user-defined record structures or other data objects, or user-defined classes of operations. [An excellent explanation of the words <BUILDS and DOES> is given in Kim Harris' article "FORTH Extensibility," also in this issue....GW]

OK
58 LIST
SCR # 58
0 ( TRIG LOOKUP ROUTINES - WITH SINE *10000 TABLE)
1 : TABLE ( ... N -> . CREATE 'TABLE' DATA TYPE.)
2   <BUILDS 0 DO , LOOP ( COMPILE N ELEMENTS)
3   DOES> SWAP 2 * + @ ( EXECUTE TABLE LOOKUP)
4   ;

Byte August 1980 page 124
Magazine Advert
 5 10000 9998 9994 9986 9976 9962 9945 9925 9903 5877
 6  9848 9816 9781 9744 9703 9659 9613 9563 9511 9455
 7  91397 9336 9272 9205 9135 9063 8988 8910 8829 8746
 8  8660 8572 8480 8387 8290 8192 8090 7986 7880 7771
 9  7660 7547 7431 7314 7193 7071 6947 6820 6691 6561
10 6428 6293 6157 6018 5878 5736 5592 5446 5299 5150
11 5000 4848 4695 4540 4384 4226 4067 3907 3746 3584
12 3420 3256 3090 2924 2756 2588 2419 2250 2079 1908
13 1736 1564 1392 1219 1045 0872 0698 0523 0349 0175
14 0000 ( 91 ELEMENTS OF TABLE PLACED ON STACK)
15 91 TABLE SINTABLE ( RETURNS SINE, 0-90 DEGREES ONLY)
OK


SCR # 59
( SINE AND COSINE TABLE-LOOPUP ROUTINES)
 1    : S180 ( N-> N . RETURNS SINE, 0-180 DEGREES)
 2    DUP 90 > ( IF GREATER THAN 90 DEGREES,)
 3    IF 180 SWAP - ENDIF (SUBTRACT FROM 180)
 4    SINTABLE ( THEN TAKE SINE)
 5    ;
 6 : SIN ( N -> SINE, RETURN SINE OF ANY NUMBER OF DEGREES)
 7     360 MOD ( BRING WITHIN + OR - 360)
 8     DUP 0< IF 360 + ENDIF ( IF NEGATIVE, ADD 360)
 9     DUP 180 ( TEST IF GREATER THAN 180)
10   IF 180 - S180 MINUS ( IF SO, SUBTRACT 180, NEGATE SINE)
11   ELSE S180 ENDIF ( OTHERWISE, STRAIGHTFORWARD)
12 ;
13 : COS ( N COSINE.)
14     360 MOD ( PREVENT OVERFLOW NEAR 32,767)
15     90 SIN ; ( COSINE IS SINE WITH 90 DEGREES PHASE SHIFT)
OK


OK
58 LOAD 59 LOAD OK
0 SIN . 0 OK
0 COS . 10000 OK
90 SIN . 10000 OK
45 SIN . 7071 OK
1 SIN . 175 OK
361 SIN . 175 OK
1000 SIN . -9848 OK
10000 SIN . -9848 OK
10000 COS . 1736 OK
-25281 COS . 1564 OK
32767 SIN . 1219 OK
32767 COS . 9925 OK
-1 SIN . -175 OK
: SINSCRLE ( N DEGREES -> N . SCALE BY SINE)
    SIN 10000 */ ( MULTIPLY, THEN DIVIDE; 32 BITS INTERMEDIATE)
    ; OK
100 45 SINSCALE . 70 OK
10000 45 SINSCALE . 7071 OK
30000 -5 SINSCRLE . -2616 OK

Byte August 1980 page 126




Figure 5: An explanation of the BEGIN ... WHILE ... REPEAT construct. As shown in figure 5a, the FORTH words between BEGIN and WHILE perform operations that leave a truth value, N, on top of the stack. The value of N determines whether the body of the loop (the words between WHILE and REPEAT) is performed or not. The loop repeats until N evaluates to false (N=0). Figure 5b gives the equivalent construct in conventional flowchart notation.
7. Moore, C H, "FORTH: a New Way to Program a Computer," Astronomy and Astrophysics Supplement, 1974, number 15, pages 497 thru 511.
8. Moore, C H, and E D Rather, "The FORTH Program for Spectral Line Observing," Proceedings of the IEEE, September 1973, pages 1346 thru 1349.
9. Rather, E D, and L Brody, Unsig FORTH (2nd rev ed) FORTH Inc, Hermosa Beach CA, 1980.
10. Rather, E D, and C H Moore, "The FORTH Approach to Operating Systems," ACM 1976 Proceedings, Association for Computing Machinery, 1976.
11. Sachs, J, An Introduction to STOIC, Technical Report BMEC TR001, Harvard-MIT Program in Health Sciences and Technology, Cambridge MA, June 1976.
12. Stein, P, "The FORTH Dimension: Mini Language Has Many Faces.- Computer Decisions, November 1975, page 10.
13. Stevens, W R, A FORTH Primer, Kitt Peak National Observatory, Tucson AZ, 1979.
14. Taylor, A, "FORTH Becoming Hothouse for Developing Languages," Computerworld, July 30, 1979.
15. Taylor, A, "FORTH Setting Coding Trend?," Computerworld, August 13, 1979.
16. Taylor, A, "Trade Language Families Can Sprout from FORTH," Computerworld, August 27, 1979.
17. Wells, D C, "Interactive Image Analysis for Astronomers," Computer, August 1977, pages 30 thru 34.
SELECTED BIBLIOGRAPHY
1. Bartoldi, P, "Stepwise Development and Debugging Using a Small Well-Structured Interactive Language for Data Acquisition and Instrument Control," Proceedings of the International Symposium and Course on Mini-and Microcomputers and their Applications,Acta Press, Anaheim CA, 1976, pp 117-122.
2. Ewing, M S, The Caltech FORTH Manual, California Institute of Technology, Pasadena CA, 1978.
3. Forsley, L, URTH Tutorial, University of Rochester, Rochester NY.
4. Hicks, S M, "FORTH's Forte is Tighter Programming," Electronics, March 15, 1979, pages 114 thru 118.
5. James, J S, "FORTH for Micro computers," Dr Dobb's Journal of Computer Calisthenics & Orthodontia, May 1978; reprinted in SIGPLAN Notices (Special Interest Group on Programming Languages of the Association for Computing Machinery), October 1978.
6. Meinzer, K, "IPS, an Unorthodox High-Level Language," BYTE, January 1979, pages 146 thru 159.
Magazine Advert

Byte 1980 Cover page   Byte 1980 page 100    Byte 1980 page 102
Byte 1980 page 104   Byte 1980 page 106    Byte 1980 page 109
Byte 1980 page 110   Byte 1980 page 112    Byte 1980 page 114
Byte 1980 page 116   Byte 1980 page 118    Byte 1980 page 120
Byte 1980 page 122   Byte 1980 page 124    Byte 1980 page 126