***************** patternFORTH ****************** * * * * * pattern language extensions to * * IBM PC real-FORTH * * (c) 1988 B. J. Rodriguez * * * * * * * *   * * * * * * * * * * * ***** Further distribution must include the above notice. ***** ERROR MESSAGES 21 03 82 CRC Empty Stack Dictionary Full Has Wrong Addressing Mode Isn't Unique msg 5 Disk Range ? Full Stack Disk Error !!! Bytes (hex) Dictionary space available. Addressing Mode Too Deep Address Out Of Range Address Opcode Operand Opcode Operand Mode ASCII Invalid Register ?? Not in dictionary ?? real-FORTH for Super8 (c) 1988 T-Recursive Tech 12 Apr 1988 ERROR MESSAGES 21 03 82 CRC Compilation Only, Use in definitions Execution only Conditionals not paired Definitions not finished In protected dictonary Use only when loading Off Current Editing Screen Declare Vocabulary msg 25 msg 26 Print Wheel Error Paper Feed Error Carriage Movement Error Printer failure msg 31 ERROR MESSAGES ( 27 10 88 bjr 21:10 ) No room for string String stack empty String stack full Hash table full ( PATTERNFORTH load screen) ( 14 9 89 bjr 11:40 ) FORTH-83 : TASK ; .S DECIMAL 11 LOAD .( Null-capable WORD ) .S DECIMAL 12 14 THRU .( memory management ) .S DECIMAL 15 16 THRU .( extended memory operators ) .S DECIMAL 23 LOAD .( string descriptor ) .S DECIMAL 33 38 THRU .( multiple hash tables ) .S DECIMAL 29 31 THRU .( strings ) .S DECIMAL 32 LOAD .( testing utilities ) .S DECIMAL 42 47 THRU .( pattern matching ) .S DECIMAL 48 50 THRU .( pattern primitives ) .S DECIMAL 51 52 THRU .( string operators ) *OBS* ( String stack load screen) ( 30 10 88 bjr 1 DECIMAL 12 14 THRU \ memory management DECIMAL 15 16 THRU \ extended memory operators DECIMAL 18 21 THRU \ string stack ( Null-capable WORD for real-Forth) ( 1 4 89 bjr 10:44 ) ( requires Forth-83 compatibility to be loaded vvv ) : WORD ( c - a) BLK @ IF BLK @ BLOCK ELSE TIB THEN IN @ + 2DUP C@ = IF ( null) 2DROP 0 HERE ! 1 IN +! ELSE SWAP ENCLOSE HERE 34 BLANKS IN +! OVER - >R R HERE C! + HERE 1+ R> CMOVE THEN HERE ; ( Dynamic memory management) ( 31 3 89 bjr 12:13 ) ( for 8086 hosts) CS@ HEX 1000 + CONSTANT $SEG \ 64K segment for strings VARIABLE ROVER \ roving search pointer for ALLOC : ^ ( - pfa) ' >BODY ; CODE \ALLOC AX AX SUB, AX ROVER #) MOV, ^ $SEG #) DS MOV, FFFE # AX MOV, AX 0 #) MOV, AX FFFE #) MOV, CS AX MOV, AX DS MOV, NEXT \ALLOC ( Memory ALLOC) HEX ( 27 10 88 bjr 23:32 ) CODE ALLOC ( n - a 0 | 1) AX POP, 3 # AX ADD, FFFE # AX AND, ROVER #) BX MOV, ^ $SEG #) DS MOV, BEGIN, \ get adr,seg 0 [BX] DI MOV, 1 # DI TEST, 0= IF, \ this blk avail? AX DI CMP, U>= IF, \ and big enough? \ block is available - split \ tag new free blk? DI DX MOV, AX DX SUB, 0<> IF, 2 # DX SUB, DX 0 [BX+DI] MOV, AX DI MOV, DX 2 [BX+DI] MOV, THEN, AX DI MOV, AX INC, \ tag alloc'd blk AX 0 [BX+DI] MOV, AX 0 [BX] MOV, 2 # BX ADD, BX PUSH, AX AX SUB, AX PUSH, \ result CS AX MOV, AX DS MOV, \ update roving ptr DI BX ADD, BX ROVER #) MOV, NEXT THEN, THEN, \ block not available - try next FFFE # DI AND, 2 [BX+DI] BX LEA, CS: ROVER #) BX CMP, 0= UNTIL, CS AX MOV, AX DS MOV, 1 # AX MOV, AX PUSH, NEXT ( Memory RELEASE) HEX ( 27 10 88 bjr 19:41 ) CODE RELEASE ( a) BX POP, 2 # BX SUB, ^ $SEG #) DS MOV, 0 [BX] DI MOV, FFFE # DI AND, \ get length -2 [BX] AX MOV, 1 # AX TEST, 0= IF, \ consolidate preceding 2 # AX ADD, AX BX SUB, AX DI ADD, THEN, 2 [BX+DI] AX MOV, 1 # AX TEST, 0= IF, \ consolidt followin 2 # AX ADD, AX DI ADD, THEN, \ BX and DI now specify the consolidated memory block DI 0 [BX] MOV, DI 0 [BX+DI] MOV, \ tag free region \ we may have invalidated ROVER in this process CS AX MOV, AX DS MOV, BX ROVER #) MOV, NEXT ( Extended memory CMOVE @ !) ( 14 9 89 bjr 13:03 ) CODE ECMOVE ( sa sseg da dseg n) SI AX MOV, CLD, CX POP, ES POP, DI POP, DS POP, SI POP, \ ds:si -> es:di CX BX MOV, CX SHR, 0<> IF, REP, MOVS, THEN, \ move words 1 # BX AND, 0<> IF, .B MOVS, THEN, \ move odd byte if any AX SI MOV, CS AX MOV, AX DS MOV, AX ES MOV, NEXT CODE E@ ( a seg - n) DS POP, BX POP, 0 [BX] PUSH, CS AX MOV, AX DS MOV, NEXT CODE E! ( n a seg) DS POP, BX POP, 0 [BX] POP, CS AX MOV, AX DS MOV, NEXT CODE EC! ( n a seg) DS POP, BX POP, AX POP, AL 0 [BX] MOV, CS AX MOV, AX DS MOV, NEXT CODE EERASE ( a seg n) CX POP, ES POP, DI POP, AX AX SUB, CLD, REP, AL STOS, CS AX MOV, AX ES MOV, NEXT ( Extended memory -TEXT) ( 27 10 88 bjr 22:37 ) CODE E-TEXT ( a1 seg1 n1 a2 seg2 n2 - f) SI AX MOV, CLD, DX POP, ES POP, DI POP, BX POP, DS POP, SI POP, \ do text compare only for length of shorter string DX BX CMP, U< IF, BX CX MOV, ELSE, DX CX MOV, THEN, CX<>0 IF, REPZ, .B CMPS, THEN, \ compare ds:si - es:di 0= IF, DX BX CMP, THEN, \ text equal; compare bx - dx \ flags now set according to string1 - string2 0= IF, BX BX SUB, ELSE, U< IF, -1 # BX MOV, ELSE, 1 # BX MOV, THEN, THEN, BX PUSH, AX SI MOV, CS AX MOV, AX DS MOV, AX ES MOV, NEXT *OBS* ( String stack) DECIMAL ( 1 3 89 bjr 9:13 ) 0 VARIABLE $STACK 256 ALLOT \ area for stack of pointers HERE CONSTANT $SP0 \ top of $STACK 0 VARIABLE $SP \ string stack pointer : \$STACK $SP0 $SP ! ; : ?$STACK ( - f) $SP0 $SP @ U< 34 ?ERROR $SP @ $STACK U< 35 ?ERROR ; \ the following may be CODEd someday. : $PUSH ( a) ( - s) -2 $SP +! $SP @ ! ; : $TOP ( - a) ( s - s) $SP @ @ ; : $LEN ( - n) ( s - s) $TOP $SEG E@ ; : $2ND ( - a) ( s s - s s) $SP @ 2+ @ ; : $2LEN ( - n) ( s s - s s) $2ND $SEG E@ ; *OBS* ( String create and destroy) ( 1 3 89 bjr 9:14 ) : $MAKE ( n - n a) ( - s) DUP 2+ ALLOC 33 ?ERROR ( n a) DUP $PUSH ( adr on stack) 2DUP $SEG E! ( count in string) ; : $DROP ( ) ( s) $TOP RELEASE 2 $SP +! ; : $DUP ( ) ( s - s s) $TOP $SEG $LEN $MAKE $SEG ROT 2+ ECMOVE ; : $OVER ( ) ( s1 s2 - s1 s2 s1) $2ND $SEG $2LEN $MAKE $SEG ROT 2+ ECMOVE ; *OBS* ( String input and output) ( 1 3 89 bjr 9:14 ) : >$ ( a n) ( - s) $MAKE 2+ ( a n A) CS@ ROT ROT $SEG ROT ECMOVE ; : $> ( - a n) ( s - s) $TOP 2+ $SEG HERE CS@ $LEN ECMOVE HERE $LEN ; : $TYPE ( ) ( s - s) $> TYPE ; : ($") ( ) ( - s) R COUNT DUP 1+ R> + >R >$ ; : $" ( ) ( - s) ASCII " STATE @ IF COMPILE ($") WORD HERE C@ 1+ ALLOT ELSE WORD HERE COUNT >$ THEN ; *OBS* ( String stack operators) ( 1 3 89 bjr 9:14 ) : $SWAP $SP @ >R R @ R 2+ @ R ! R> 2+ ! ; : $ROT $SP @ >R R @ R 2+ @ R 4 + @ R ! R 4 + ! R> 2+ ! ; \ CODE these for greater efficiency. : $.S BASE @ HEX $SP0 $SP @ 2DUP = IF 2DROP ." Empty" ELSE DO I @ DUP CR 5 U.R SPACE 2+ $SEG HERE CS@ I @ $SEG E@ DUP >R ECMOVE HERE R> TYPE 2 +LOOP THEN BASE ! ; : $- ( - f) ( s s - s s) $2ND 2+ $SEG $2LEN $TOP 2+ $SEG $LEN E-TEXT ; ( String descriptor structure) ( 10 9 89 bjr 18:51 ) HEX $SEG 1000 + CONSTANT HSEG \ 64k segment for hash area : DESCR ( n) ( s - a seg) C@ + HSEG ; 0 DESCR $LINK \ link to next descriptor in list, in hash space 2 DESCR $ASCEN \ ascension value; list is in ascending order 4 DESCR $PTR \ pointer to text in string space 6 DESCR $LEN \ length of text, excluding appended null 8 DESCR tbd 0A DESCR $VALU \ value "stored" at this "address" 0C DESCR tbd2 \ ... double cell for value 0E DESCR $TABLE \ table associated with this string \ beware...some of these offsets are hard-coded in the assembly *OBS* ( Hash table memory) ( 14 9 89 bjr 14:45 ) DECIMAL 512 CONSTANT H0 \ addr of first list element 16 CONSTANT HSIZE \ size of hash list element 4064 CONSTANT #H \ # of hash elements available 0 VARIABLE HFREE \ head of free list : \HASH H0 0 DO 0 I HSEG E! 2 +LOOP \ zap heads H0 #H 1- 0 DO DUP HSIZE + DUP ROT HSEG E! LOOP \ link all 0 SWAP HSEG E! ( last el) H0 HFREE ! ; \ point to free lst \HASH *OBS* ( Hashing function) ( 14 9 89 bjr 14:45 ) CODE HASH ( a n - a n u c) AX AX SUB, DX DX SUB, DI POP, BX POP, BX PUSH, DI PUSH, DI CX MOV, CX SHR, CX<>0 IF, DO, \ do by words AX ROL, DL .B RCL, 0 [BX] AX XOR, 2 # BX ADD, LOOP, THEN, 1 # DI TEST, 0<> IF, \ do odd byte AX ROL, DL .B RCL, 0 [BX] AL .B XOR, THEN, AL DL XCHG, AX PUSH, DX PUSH, NEXT *OBS* ( Hash table search) ( 14 9 89 bjr 14:46 ) CODE HFIND ( a n u c - a n u a' f) ^ HSEG #) DS MOV, CLD, BX POP, AX POP, CX POP, DI POP, DI PUSH, CX PUSH, AX PUSH, SI PUSH, BX BX ADD, \ do while link is not null... BEGIN, BX DX MOV, 0 [BX] BX MOV, BX BX OR, 0<> WHILE, AX 2 [BX] CMP, 0= IF, \ list ascen - desired ascen CX 6 [BX] CMP, 0= IF, \ list length - desired length CX PUSH, DI PUSH, 4 [BX] SI MOV, 2 # SI ADD, CS: ^ $SEG #) DS MOV, AX AX CMP, REPZ, .B CMPS, DI POP, CX POP, 0= IF, \ text equal? SI POP, DX PUSH, AX AX SUB, AX PUSH, \ exit! CS AX MOV, AX DS MOV, NEXT THEN, ^ HSEG #) DS MOV, THEN, AX AX CMP, ( ascen = , text not, so force loop) THEN, U> UNTIL, ENDIF, ( WHILE exit) SI POP, DX PUSH, 1 # AX MOV, AX PUSH, CS AX MOV, AX DS MOV, NEXT *OBS* ( Hash list insertion) DECIMAL ( 14 9 89 bjr 14:46 ) CODE HINSERT ( u a' A n - A f) CX POP, DX POP, HFREE #) BX MOV, ^ HSEG #) DS MOV, BX BX OR, 0<> IF, 0 [BX] AX MOV, CS: AX HFREE #) MOV, \ head->next free DI POP, 0 [DI] AX MOV, AX 0 [BX] MOV, \ new el's link BX 0 [DI] MOV, \ link preceding el to new el 2 [BX] POP, ( ascen) CX 6 [BX] MOV, ( length) AX AX SUB, AX 8 [BX] MOV, AX 10 [BX] MOV, ( nulls) AX 12 [BX] MOV, AX 14 [BX] MOV, ( nulls) DX DI MOV, DI 4 [BX] MOV, ( string pointer) CS: ^ $SEG #) DS MOV, BX 0 [DI] MOV, ( back-pointer) ELSE, 4 # SP ADD, 1 # AX MOV, THEN, DX PUSH, AX PUSH, CS AX MOV, AX DS MOV, NEXT *OBS* ( Hash list deletion) ( 14 9 89 bjr 14:46 ) CODE HDELETE ( a') ^ HSEG #) DS MOV, BX POP, 0 [BX] DI MOV, 0 [DI] AX MOV, AX 0 [BX] MOV, \ link around CS: HFREE #) AX MOV, AX 0 [DI] MOV, \ this el -> free list CS AX MOV, AX DS MOV, DI HFREE #) MOV, \ head -> this el NEXT ( Create strings) DECIMAL ( 14 9 89 bjr 13:31 ) : (>$) ( a n - A n) DUP 3 + ALLOC 33 ?ERROR ( a n A) ROT >R 2DUP R> CS@ ( n A n A a sg) 2SWAP 2+ $SEG ROT ECMOVE ( n A) 2DUP 2+ + 0 SWAP $SEG EC! ( append a null) SWAP ; : >$ ( a n - s) HASH HFIND ( a n u a' f) OVER >R IF 2SWAP (>$) HINSERT IF RELEASE 36 ERROR ELSE DROP THEN ELSE 2DROP 2DROP THEN R> $LINK E@ ; : (") ( - s) R COUNT DUP 1+ R> + >R >$ ; : " ( - s) ASCII " STATE @ IF COMPILE (") WORD C@ 1+ ALLOT ELSE WORD COUNT >$ THEN ; IMMEDIATE ( Delete strings) ( 14 9 89 bjr 14:50 ) : \$ ( a n) HASH HFIND ( a n u a' f) IF ( not found - do nothing) 2DROP ELSE DUP $LINK E@ $PTR E@ RELEASE HDELETE DROP THEN 2DROP ; EXIT Memo: should be redefined to accept a string descriptor! ( Associative store and fetch) ( 10 9 89 bjr 16:36 ) : $@ ( s1 - s2) $VALU E@ ; : $! ( s2 s1) $VALU E! ; : } [ $CONTEXT @ ] LITERAL $CONTEXT ! ; : { ( s) $TABLE E@ $CONTEXT ! ; : $TEXT ( s - adr seg n) DUP $PTR E@ 2+ $SEG ROT $LEN E@ ; ;S Memo to self: need to initialize strings to point to null. ( Testing utilities) ( 10 9 89 bjr 17:57 ) : HDUMP HSEG SEG ! DUMP ; : SDUMP $SEG SEG ! DUMP ; : 0DUMP 0 SEG ! DUMP ; : DUMP CS@ SEG ! DUMP ; : AWAIT ." hit any key " KEY DROP CR ; : $TYPE ( s) $TEXT >R HERE CS@ R ECMOVE HERE R> TYPE ; ;S ( Multiple hash tables) ( 1 4 89 bjr 10:19 ) DECIMAL 16 CONSTANT H0 \ addr of first list element ( element 0 is specially reserved) 16 CONSTANT HSIZE \ size of hash list element 4095 CONSTANT #H \ # of hash elements available VARIABLE HFREE \ head of free list VARIABLE $CONTEXT \ addr of current hash table : \HASH H0 #H 1- 0 DO DUP HSIZE + DUP ROT HSEG E! LOOP \ link all 0 SWAP HSEG E! ( last el) H0 HFREE ! ; \ point to free lst \HASH ( Hashing function) ( 26 10 89 bjr 1:38 ) CODE HASH ( a n - a n u c) AX AX SUB, DX DX SUB, DI POP, BX POP, BX PUSH, DI PUSH, DI CX MOV, CX SHR, CX<>0 IF, DO, \ do by words AX ROL, DL .B RCL, 0 [BX] AX XOR, 2 # BX ADD, LOOP, THEN, 1 # DI TEST, 0<> IF, \ do odd byte AX ROL, DL .B RCL, 0 [BX] AL .B XOR, THEN, AL DL XCHG, AX PUSH, DX PUSH, NEXT \ : HASH 2DUP TYPE HASH 2DUP ." Hashed: " U. U. ; ( testing) CODE HFIND ( a n u c - a n u a' f) CLD, BX POP, BX AX MOV, BX SHR, BX SHR, BX SHR, HEX 0E # BX AND, $CONTEXT #) BX ADD, ^ HSEG #) DS MOV, 0 [BX] BX MOV, 0E # AX AND, AX BX ADD, ( addr of list head) DECIMAL ( Hash table search) ( 26 10 89 bjr 8:06 ) \ ...CODE HFIND ( a n u - a n u a' f | BX=>hash list head) AX POP, CX POP, DI POP, DI PUSH, CX PUSH, AX PUSH, SI PUSH, \ do while link is not null... BEGIN, BX DX MOV, 0 [BX] BX MOV, BX BX OR, 0<> WHILE, AX 2 [BX] CMP, 0= IF, \ list ascen - desired ascen CX 6 [BX] CMP, 0= IF, \ list length - desired length CX PUSH, DI PUSH, 4 [BX] SI MOV, 2 # SI ADD, CS: ^ $SEG #) DS MOV, AX AX CMP, REPZ, .B CMPS, DI POP, CX POP, 0= IF, \ text equal? SI POP, DX PUSH, AX AX SUB, AX PUSH, \ exit! CS AX MOV, AX DS MOV, NEXT THEN, CS: ^ HSEG #) DS MOV, THEN, AX AX CMP, ( ascen = , text not, so force loop) THEN, U> UNTIL, ENDIF, ( WHILE exit) SI POP, DX PUSH, 1 # AX MOV, AX PUSH, CS AX MOV, AX DS MOV, NEXT ( Hash list insertion) DECIMAL ( 1 11 88 bjr 9:04 ) CODE HINSERT ( u a' A n - A f) CX POP, DX POP, HFREE #) BX MOV, ^ HSEG #) DS MOV, BX BX OR, 0<> IF, 0 [BX] AX MOV, CS: AX HFREE #) MOV, \ head->next free DI POP, 0 [DI] AX MOV, AX 0 [BX] MOV, \ new el's link BX 0 [DI] MOV, \ link preceding el to new el 2 [BX] POP, ( ascen) CX 6 [BX] MOV, ( length) AX AX SUB, AX 8 [BX] MOV, AX 10 [BX] MOV, ( nulls) AX 12 [BX] MOV, AX 14 [BX] MOV, ( nulls) DX DI MOV, DI 4 [BX] MOV, ( string pointer) CS: ^ $SEG #) DS MOV, BX 0 [DI] MOV, ( back-pointer) ELSE, 4 # SP ADD, 1 # AX MOV, THEN, DX PUSH, AX PUSH, CS AX MOV, AX DS MOV, NEXT ( Hash list deletion) ( 30 10 88 bjr 16:36 ) CODE HDELETE ( a') ^ HSEG #) DS MOV, BX POP, 0 [BX] DI MOV, 0 [DI] AX MOV, AX 0 [BX] MOV, \ link around CS: HFREE #) AX MOV, AX 0 [DI] MOV, \ this el -> free list CS AX MOV, AX DS MOV, DI HFREE #) MOV, \ head -> this el NEXT ;S for future reference: CODE HRELEASE ( a) BX POP, HFREE #) AX MOV, ^ HSEG #) DS MOV, AX 0 [BX] MOV, \ this el -> free list CS AX MOV, AX DS MOV, BX HFREE #) MOV, \ head -> this el ( Hash TABLE definition) ( 14 9 89 bjr 13:12 ) CODE HALLOC ( - a f) HFREE #) BX MOV, BX BX OR, 0<> IF, ^ HSEG #) DS MOV, 0 [BX] AX MOV, \ head->next free CS: AX HFREE #) MOV, 0 # AX MOV, \ true flag ELSE, 1 # AX MOV, 0 # BX MOV, THEN, BX PUSH, AX PUSH, CS AX MOV, AX DS MOV, NEXT : ?HASH ( f) ABORT" Hash space full" ; : \TABLE ( a - a) 16 0 DO HALLOC ?HASH DUP HSEG 16 EERASE \ zap the heads OVER I + HSEG E! 2 +LOOP ; : TABLE ( s) HALLOC ?HASH \TABLE SWAP $TABLE E! ; : \HASH \HASH 0 \TABLE $CONTEXT ! ; \ create the 'root' hash \HASH \ table, usng hash element 0. ( Pattern matching logic) ( 26 10 89 bjr 10:07 ) VARIABLE CURSOR CREATE FALLOUT ] R> DROP 0 EXIT [ CODE ENTER< ( - a oip) ( R: ip - ip a) CURSOR #) AX MOV, \ build subpattern context RP DEC, RP DEC, AX 0 [RP] MOV, \ (ip stacked on entry) AX PUSH, \ build a "fallout" record FALLOUT # AX MOV, AX PUSH, NEXT \ for backtracking CODE (<<) ( R: ip a - ip a) CURSOR #) AX MOV, AX 0 [RP] MOV, \ save cursor posn in NEXT \ return stack "context" ( Pattern matching logic) ( 26 10 89 bjr 10:08 ) CODE (|) ( =t - a ip, =f - ) ( R: ip a - ip a) AX POP, AX AX OR, 0<> IF, \ if successful - 0 [RP] PUSH, \ build backtrack rec: cursor 2 [IP] AX LEA, AX PUSH, \ & IP for next alternative 0 [IP] IP ADD, \ branch to end of alternatvs ELSE, \ if failed - 0 [RP] AX MOV, AX CURSOR #) MOV, \ restore cursor IP INC, IP INC, \ and fall thru to next alt. THEN, NEXT CODE (>>) ( =t - , a ip =f - ) AX POP, AX AX OR, 0= IF, \ if failed - IP POP, CURSOR #) POP, \ pop a backtrack record THEN, NEXT ( Pattern matching logic) ( 26 10 89 bjr 10:08 ) CODE (FALLIN) ( a ip ra) ( R: - ip a) 4 # RP SUB, 2 [RP] POP, \ restore context: return adr, CURSOR #) AX MOV, AX 0 [RP] MOV, \ cursor (from backtrack) IP POP, CURSOR #) POP, NEXT \ pop a backtrack record CREATE FALLIN ] (FALLIN) [ CODE >EXIT ( a oip - T, xx - xx a iip T) ( R: ip a - ip) SP W MOV, FALLOUT # 0 [W] CMP, 0= IF, \ if no backtracks - AX POP, AX POP, \ discard fallout record ELSE, \ if any backtracks pushed - 2 [RP] PUSH, \ build fallin rec: return adr, 0 [RP] PUSH, \ cursor adr, FALLIN # AX MOV, AX PUSH, \ IP of fallin routine THEN, -1 # AX MOV, AX PUSH, \ push a success code RP INC, RP INC, NEXT \ discard context (; will return) ( Top level pattern control) ( 12 9 89 bjr 16:48 ) \ CREATE FAILURE ] R> DROP 0 EXIT [ ( - 0) ( R: ra) \ data stack is empty when failure backtrack is taken CODE MARK ( - ) ( R: ra - ra n) RP DEC, RP DEC, SP 0 [RP] MOV, \ push data stack posn \ CURSOR #) PUSH, \ build a "failure" record \ FAILURE # AX MOV, AX PUSH, NEXT \ for backtracking NEXT CODE RESTORE ( xx ... xx f - f) ( R: ra n - ra) AX POP, \ get the success flag 0 [RP] SP MOV, RP INC, RP INC, \ restore data stack posn AX PUSH, NEXT \ push the success flag \ -1 # AX MOV, AX PUSH, NEXT \ push "true" for success ( Pattern compiler directives) ( 10 9 89 bjr 20:29 ) : << ( - 20 20) COMPILE (<<) 20 DUP ; IMMEDIATE : | ( - a 21) COMPILE (|) HERE 0 , 21 ; IMMEDIATE : >> ( 20 20 ... a 21 - ) COMPILE (>>) BEGIN 21 = WHILE HERE OVER - SWAP ! REPEAT 20 - ABORT" Pattern imbalance" ; IMMEDIATE : PAT: [COMPILE] : COMPILE ENTER< ; IMMEDIATE : ;PAT COMPILE >EXIT [COMPILE] ; ; IMMEDIATE ( Top level pattern execution) ( 12 9 89 bjr 17:01 ) VARIABLE CURADR VARIABLE CURSEG VARIABLE EOS : SUBJECT ( s) $TEXT EOS ! CURSEG ! CURADR ! 0 CURSOR ! ; \ : SUBJECT 1 WORD COUNT EOS ! CURADR ! \ CS@ CURSEG ! 0 CURSOR ! ; ( testing version) : EVALUATE ( s a - f) SWAP SUBJECT MARK EXECUTE RESTORE ( x f) SWAP DROP ; EXIT the following is old and obsolete stuff................. : SUBJECT 1 WORD COUNT OVER CURSOR ! + EOS ! ; : LEFT? ( n - f) EOS @ CURSOR @ - > 0= ; : (=") R COUNT DUP 1+ R> + >R SWAP OVER LEFT? IF OVER CURSOR @ EDITOR -TEXT 0= ELSE DROP 0 THEN IF FORTH CURSOR +! -1 ELSE DROP 0 THEN ; ( a quick and dirty kluge -- needs cleaning up) ( Pattern primitives: MATCH) ( 10 9 89 bjr 21:26 ) CODE (MATCH) ( adr seg n - f) CX POP, ES POP, DI POP, IP PUSH, EOS #) AX MOV, CURSOR #) AX SUB, CX AX CMP, 0>= IF, ( length ok) CX DX MOV, CURADR #) SI MOV, CURSOR #) SI ADD, CURSEG #) DS MOV, CLD, REPZ, .B CMPS, ( returns Z if match, NZ if not) THEN, CS AX MOV, AX DS MOV, AX ES MOV, 0= IF, ( match) DX CURSOR #) ADD, -1 # AX MOV, ELSE, ( no match) AX AX SUB, THEN, IP POP, AX PUSH, NEXT : MATCH ( s - f) $TEXT (MATCH) ; ;S ( Pattern primitives: ANY) ( 10 9 89 bjr 21:30 ) CODE (ANY) ( adr seg n - f) CURADR #) DI MOV, CURSOR #) DI ADD, CURSEG #) DS MOV, 0 [DI] AL MOV, ( get char from subject) CX POP, ES POP, DI POP, ( ES:DI -> character set) CLD, REPNZ, .B SCAS, ( returns Z if found, NZ if not) CS AX MOV, AX DS MOV, AX ES MOV, 0= IF, ( found) CURSOR #) INC, -1 # AX MOV, ELSE, ( not found) AX AX SUB, THEN, AX PUSH, NEXT : ANY ( s - f) $TEXT (ANY) ; ;S ( Pattern primitives: LEN POS ARB) ( 12 9 89 bjr 17:07 ) : LEFT ( - n) EOS @ CURSOR @ - ; : LEN ( n - f) DUP LEFT > IF DROP 0 ELSE CURSOR +! 1 THEN ; : POS ( n - f) CURSOR @ = ; ( true if n = CURSOR) PAT: ARB << LEFT BEGIN DUP LEN | 1- DUP 0= [ 2SWAP ] UNTIL 0= >> ;PAT EXIT ( String operators: CONCAT SUBST) ( 14 9 89 bjr 14:39 ) : SIZE ( s - n) $LEN E@ ; : CONCAT ( s1 s2 - s3) SWAP $TEXT >R HERE CS@ R@ ECMOVE ( copy s1 to here) $TEXT HERE R + SWAP >R CS@ R@ ECMOVE ( copy s2 after s1) HERE R> R> + >$ ; ( make new string) : SUBST ( s1 ofs n - s2) >R OVER SIZE ( s1 ofs N) OVER - 0 MAX ( s1 ofs N-ofs) R> MIN ?DUP IF ( s1 ofs n') >R >R $TEXT ( adr seg N) DROP SWAP R> + SWAP ( adr+ofs seg) HERE CS@ R@ ECMOVE HERE R> ELSE 2DROP 0 0 ( null string) THEN >$ ; ( String operators: FILLED, $>N, N>$) ( 14 9 89 bjr 14:34 ) : FILLED ( n c - s) OVER HERE SWAP ROT FILL HERE SWAP >$ ; : $>N ( s - n) $TEXT >R HERE CS@ R@ ECMOVE ( copy string to here) 0 HERE R> + C! ( append a null terminator) 0. HERE 1- CONVERT 2DROP ; ( convert digits) : N>$ ( n - s) DUP ABS 0 <# #S ROT SIGN #> >$ ; ( program to count letter occurences) ( 26 10 89 bjr 1:50 ) VARIABLE STRING : >STRING ( c - s) STRING C! STRING 1 >$ ; " COUNTS" TABLE : SCRUB ASCII A 26 BOUNDS DO " COUNTS" { I >STRING } 0 SWAP $! LOOP ; : DISPLAY ASCII A 26 BOUNDS DO " COUNTS" { I >STRING } $@ ?DUP IF CR I EMIT ." OCCURS " . ." TIMES " THEN LOOP ; : ACCUMULATE ( scr#) SCRUB BLOCK 1024 BOUNDS DO " COUNTS" { I C@ >STRING } DUP $@ 1+ SWAP $! LOOP DISPLAY ; THE WORLD OF THE FUTURE WILL BE AN EVER MORE DEMANDING STRUGGLE AGAINST THE LIMITATIONS OF OUR INTELLIGENCE. - N. WEINER ( program to test algebraic expressions) ( 26 10 89 bjr 11:54 ) : RECURSE LATEST PFA CFA , ; IMMEDIATE ( not Forth-83) VARIABLE 'EXPR \ for deferred execution PAT: IDENTIFIER << " ABCDEFGHIJKLMNOPQRSTUVWXYZ" ANY >> ;PAT PAT: ELEMENT << IDENTIFIER | << " (" MATCH >> << 'EXPR @ EXECUTE >> << " )" MATCH >> -1 >> ;PAT PAT: TERM ?STACK << ELEMENT | << ELEMENT >> << " */" ANY >> << RECURSE >> -1 >> ;PAT PAT: EXPR ?STACK << TERM | << TERM >> << " +-" ANY >> << RECURSE >> -1 >> ;PAT ' EXPR 'EXPR ! PAT: STMT << EXPR >> << LEFT 0= >> ;PAT : CHECK ( s) [ ' STMT ] LITERAL EVALUATE . ; ( program to check for repeating groups) ( 26 10 89 bjr 11:21 ) ( per Griswold's Exercise 2.7) : RECURSE LATEST PFA CFA , ; IMMEDIATE ( not Forth-83) : GRAB ( ofs n) >R CURADR @ + CURSEG @ HERE CS@ R ECMOVE HERE R> >$ ; VARIABLE FRAGMENT PAT: FRAG << FRAGMENT @ MATCH >> ;PAT PAT: REPGRP << FRAG | << FRAG >> << RECURSE >> -1 >> ;PAT PAT: REPGRPLINE << FRAG >> << REPGRP >> << LEFT 0= >> ;PAT PAT: REPEATS << 1 BEGIN 0 OVER GRAB FRAGMENT ! REPGRPLINE | 1+ DUP EOS @ > [ 2SWAP ] UNTIL 0= >> ;PAT : CHECK ( s) [ ' REPEATS ] LITERAL EVALUATE . ; : .FRAG FRAGMENT @ $TYPE ; \ test 1: memory allocation ( 3 8 89 bjr 7:30 ) HEX \ALLOC .( initial allocated memory state ) 0 20 SDUMP FFF0 10 SDUMP AWAIT 5 ALLOC .( allocate 5 bytes: ) . DUP . 0 20 SDUMP FFF0 10 SDUMP AWAIT 8 ALLOC .( allocate 8 bytes: ) . . 0 20 SDUMP FFF0 10 SDUMP AWAIT RELEASE .( release the 5 bytes) 0 20 SDUMP FFF0 10 SDUMP AWAIT 5 ALLOC .( allocate 5 bytes: ) . . 0 30 SDUMP FFF0 10 SDUMP AWAIT FFF0 ALLOC .( try to allocate 64K bytes: ) . 0 30 SDUMP FFF0 10 SDUMP DECIMAL ;S \ test 2: hashing & string creation ( 10 9 89 bjr 18:49 ) HEX \HASH .( hash table initialized ) CR CR .( context: ) $CONTEXT ? 0 A0 HDUMP .( next free: ) HFREE ? CREATE BLAH 48 C, 45 C, 52 C, 45 C, 20 C, AWAIT BLAH 5 .( a string in memory: ) DUMP AWAIT BLAH 5 >$ .( created string: ) DUP . 0 40 SDUMP 90 40 HDUMP .( next free: ) HFREE ? AWAIT BLAH 5 >$ .( same string 'created' again: ) . 0 40 SDUMP 90 40 HDUMP .( next free: ) HFREE ? AWAIT " Testing" .( inline string 'Testing' : ) . 0 50 SDUMP 90 40 HDUMP .( next free: ) HFREE ? AWAIT " Testing" .( inline string again: ) . 0 50 SDUMP 90 40 HDUMP .( next free: ) HFREE ? AWAIT BLAH 5 \$ .( delete first string ) 0 50 SDUMP 90 40 HDUMP .( next free: ) HFREE ? DECIMAL ;S \ test 3: associative access ( 10 9 89 bjr 18:54 ) HEX .( New string contents: ) " Testing" $@ . AWAIT .( Save an integer in a string: ) 1 " One" $! 0 50 SDUMP 90 50 HDUMP AWAIT .( Get integer from string: ) " One" $@ . AWAIT .( Save a pointer in a string: ) HERE " One" $! 0 50 SDUMP 90 50 HDUMP AWAIT .( Get pointer from string: ) " One" $@ U. AWAIT .( Save a string in a string: ) " Springfield" " Illinois" $! CR .( Get string descriptor: ) " Illinois" $@ . CR .( Get string text: ) " Illinois" $@ $TYPE CR .( Pointers both ways ) " Illinois" " Springfield" $! CR .( Get text other direction ) " Springfield" $@ $TYPE SPACE AWAIT 30 30 SDUMP 90 50 HDUMP DECIMAL ;S \ test 4: TABLEs ( 3 8 89 bjr 8:07 ) .( Basic string's contents: ) " Illinois" $@ $TYPE CR .( Create a table: ) " Population" TABLE HEX 90 80 HDUMP .( next free: ) HFREE ? AWAIT CR .( Store into a table entry: ) 20 " Population" { " Illinois" } $! CR .( Fetch from a table entry: ) " Population" { " Illinois" } $@ . .( millions ) CR .( A new table: ) " Bigcity" TABLE " Chicago" " Bigcity" { " Illinois" } $! CR .( Show all entries are different: ) CR " Illinois" $@ $TYPE CR " Population" { " Illinois" } $@ $TYPE CR " Bigcity" { " Illinois" } $@ $TYPE \ test 5: pattern tests ( 3 8 89 bjr 8:29 ) \ Subpattern tests: leave history info on data stack on exit. : test1 enter< >exit ; ( empty pattern) : test2 enter< << 1 >> >exit ; ( true pattern) : test3 enter< << 1 | 1 >> >exit ; ( an potential backtrack) : test4 enter< << 0 | 0 | 1 >> >exit ; ( 2 discarded alt'ves) : test5 enter< << 1 | 1 >> << 1 | 1 >> >exit ; ( 2 backtracks) : test6 enter< ( illustrate trial sequence) << ." a" dpl off 1 | ." b" dpl on 1 >> << ." c" 0 | ." d" dpl @ >> >exit ; : test7 enter< << 0 >> >exit ; ( false pattern) : test8 enter< ( subpattern alternatives) << ." X" test7 | ." Y" test6 >> ." Z" >exit ; \ test 6: pattern tests ( 12 9 89 bjr 14:33 ) \ Main pattern tests: either succeed or fail; clean up stacks : testz enter< << 1 | 1 >> << 0 | 0 >> >exit ; : TESTA ENTER< ( show stack cleanup) << 1 | 1 >> << 1 | 0 >> >EXIT ; : TESTB ( simple text match) ENTER< << " FOOBAR" MATCH >> >EXIT ; : TESTC ENTER< ( text match w/ alt'ves) << " AB" MATCH | " CD" MATCH >> << " -" MATCH >> << " ab" MATCH | " cd" MATCH >> >EXIT ; EXIT Note that nothing must be R-stacked before ENTER<, thus PATTERN< ENTER< is no longer a valid construct. \ test 7: more pattern tests ( 10 9 89 bjr 20:40 ) \ALLOC \HASH .( String package initialized ) CR PAT: TESTQ << " foobar" MATCH >> ;PAT PAT: TESTR << ARB >> << " FOOBAR" MATCH >> << 2 LEN >> ;PAT PAT: TESTS << " P" MATCH >> << " AEIOU" ANY >> << 1 LEN >> ;PAT PAT: TESTT << " UP" MATCH | " DOWN" MATCH >> << " TOWN" MATCH | " SIDE" MATCH >> ;PAT PAT: TESTU << ARB >> << " FOO" MATCH >> << ARB >> << 6 POS >> << " BAR" MATCH >> ;PAT \ pattern primitive tests ( 10 9 89 bjr 21:46 ) \ALLOC \HASH .( strings init'd ) CR : MATCHES SUBJECT " foobar" MATCH . ; \ MATCHES subject : ANYS SUBJECT " abcdef" ANY . ; \ ANYS subject : LENS ( n) SUBJECT LEN . ; \ n LENS subject : POSS ( n) SUBJECT POS . ; \ n POSS subject -- true only if n=0 \ test for editor screens ( 10 9 89 bjr 22:15 ) : wasted do cr i 5 .r ." x" 1024 i / dup 3 .r i * 1024 swap - 6 .r ." wasted" loop ; \ test for POSTPONE ( 14 9 89 bjr 21:41 ) : ?DUP1 COMPILE DUP [COMPILE] IF COMPILE DUP [COMPILE] THEN ; IMMEDIATE : TEST1 0 ?DUP1 1 ?DUP1 ; : POSTPONE ( per John Hayes) BL WORD FIND DUP 0= ABORT" ?" 0< IF COMPILE COMPILE THEN , ; IMMEDIATE : ?DUP2 POSTPONE DUP POSTPONE IF POSTPONE DUP POSTPONE THEN ; IMMEDIATE : TEST2 0 ?DUP2 1 ?DUP2 ; : KOMPILE POSTPONE POSTPONE ; IMMEDIATE : [KOMPILE] POSTPONE POSTPONE ; IMMEDIATE : ?DUP3 KOMPILE DUP [KOMPILE] IF KOMPILE DUP [KOMPILE] THEN ; IMMEDIATE : TEST3 0 ?DUP3 1 ?DUP3 ; This screen extends real-Forth to make it look more like F83. WORD returns an address, and returns null strings as count=0. (The null string fix is not in F83.) These words provide a simple and efficient dynamic allocation of string memory. They perform collection of free space, but not compacting. $SEG holds the segment address of the string memory. A 64K region is reserved for string storage. We use 16-bit addressing for speed and simplicity; 64K is the maximum size, and gives us "circular" (wraparound) addressing at no cost. ROVER is the "roving" pointer where allocation searches begin. This approach is suggested by Knuth. \ALLOC initializes the memory allocation. A single, 64K free block is created at address zero, and ROVER points to it. ALLOC allocates a block of n bytes from the string space. If successful, it returns the address (in the string segment) and a false flag. If not, it returns a true flag. ALLOC always allocates even lengths; if odd, n is rounded up. Four bytes are added (internally) for tag fields. The address returned is that of the user's data area. ALLOC searches for the first block which is the desired size or larger. No attempt is made for the "best fit". The search begins just after the last allocation, as indicated by the roving pointer ROVER. ALLOC updates ROVER. If the block is too large, it is split into an allocated block of the right size, and a new, smaller free block. RELEASE returns a block of allocated memory to free space. It is given the address of the user's data area in the allocated block. If either the block preceding or the block following (or both) are free, they will be merged to form a single, larger free block. Since, anytime a block is released, adjacent free blocks are consolidated, it can be seen that we will never have two free blocks adjoining. Thus we always maximize the free space available (short of physically compacting the allocated blocks, that is). ECMOVE moves a block of data from anywhere to anywhere in 8086 memory. It works like CMOVE except that it expects "extended addresses" (address segment) instead of simple addresses. Note that it uses MOVSW for efficiency, with an extra MOVSB if an odd number of bytes is to be moved. E@ fetches a cell from an "extended address". E! stores a cell to an "extended address". E-TEXT performs a text compare between two strings, which may be anywhere in 8086 memory. It resembles -TEXT except that: a) it expects extended addresses for both strings, and b) it expects lengths for BOTH strings. E-TEXT returns 1 for string1>string2, -1 for string1$ moves text from program memory onto the string stack, given address and length of the text. A new string is created. $> moves text from the string stack to program memory, returning address and length. The string is undisturbed. Note that this implementation puts the text at HERE. $TYPE types the topmost string. The string is undisturbed. $" when executed, creates a string with the given literal text. It is state-smart and may be compiled or used interpretively. Note that $> and $TYPE are non-destructive, i.e., they do not consume the string. This is to allow programmer optimization; e.g. $DUP $TYPE is slower than a non-destructive $TYPE . You can define $TYPE $DROP if that's what you really want. $SWAP swaps the top two strings. $ROT rotates the top three strings. Actually, only the pointers need be swapped/rotated. This is the advantage of maintaining a pointer stack. $.S prints out, nondestructively, the contents of the string stack. $- performs a non-destructive compare of the top two strings. It returns 1, 0, or -1 depending on whether the second string is greater, equal or less than the top string, respectively. DESCR builds a word which, given a string descriptor, offsets to a particular field and returns its address and segment. In the current implementation, the string descriptor is merely an address in the hash space (HSEG segment.) In the current implementation, the text pointer is merely an address in the string space ($SEG segment). HSEG is the segment address of a 64K region for hash lists. The first 512 bytes are heads for 256 hash lists. #H is 64K - 512 bytes (for heads), divided by 16 bytes per element. It should be changed if any of the other parameters change. Should rewrite someday to derive this from the other stuff. HFREE points to the first in a list of available elements. \HASH clears the hash table by a) zeroing all the heads, b) linking all list elements into a single list, and c) setting HFREE to point to this free list. HASH given an address and count in program space, computes an 8-bit hash code c and a 16-bit "ascension" value u . The hash code is used to select which of 256 hash lists to search for the desired string. The hash lists are stored in order of increasing ascension value. This resolves collisions quickly - only rarely is a text comparison needed (when both hash and ascension match), and the list search ends when the ascension is exceeded (on average, after half the list is searched). See Waite, pp. 230-232. This is not a particularly good or mathematically sound hash function, just one that was easy to write. It may be quite poor by the usual standards for hash functions. It should be replaced some day! HFIND searches the hash table for the given string. a n are the address and length of the string in program space. u c are the ascension and hash code, as returned by HASH. HFIND returns f=0 for success, f=1 for failure. a' is the address of the preceding element in the list (which may be the list head). On success, this is the address of the link (!) to the desired element. The algorithm: start with head of the list (which is a link) While link not null, save adr of link and get link If ascensions equal, If lengths equal, If text equal, then return successful Loop if list ascension <= desired list ascension > desired, or null link: return unsuccessful HINSERT inserts a new element into a hash list, and initializes the element. It expects the address a' of the preceding element, as returned by HFIND. u is the ascension, n the string length, and A the address in string space. HINSERT does not allocate string space nor copy the string. HINSERT assumes that the string is not a duplicate. (Use HFIND and test for failure.) HINSERT returns f=0 if successful, f=1 if not. The only error condition is an empty free list, meaning that the hash table is full. HINSERT also stores a pointer back to this list element, in the string space. HDELETE removes an element from the hash list and returns it to the free list. It expects the address a' of the preceding element, as returned by HFIND. Nothing is returned. HFIND does not de-allocate string space. It assumes that there are no references to this hash entry still active. I.e., since the address of the hash entry (aka string descriptor) is used as the "handle" of the string, there had better not be any attempt to use this handle after the list element is returned to the free pool! HDELETE always succeeds. POSTNUL appends a null to the end of a counted string. (>$) given an address and count of text in program space, ALLOC string space for it and copies the text to string space. It returns the address A in string space, and the count. Two bytes at the start of the string are allocated for future (back-pointer) use. Error 33 if insufficient room. @LINK given the address a' of the preceding element, returns the address of a string descriptor. >$ given an address and count of text in program space, finds or creates it as a content-addressable string, returning its descriptor s. If not already defined, >H will allocate space and create a hash table entry. Error 36 if hash table full. When created, all ancillary data is set to zero. " when executed, returns the string descriptor for a string with the given text. The string is created if necessary. $" is state-smart and may be compiled or used interpretively. @$PTR given the string descriptor s, returns the address of its allocated memory in string space. (Note that the actual string text begins at @$PTR 2+ .) \$ (wipe-string) given an address and count of text in program space, deletes the string from string space and the hash table. If the string does not exist, no action is taken. $@ given a string descriptor s1, returns the descriptor s2 that it addresses. (Fetch associated string.) $! stores string s2 at "address" s1, i.e., cause s1 to point to s2. (Store associated string.) Note that what is being stored/fetched here are simple integers. (A string descriptor is a single cell value). Thus we may store a number, or an address, or any cell value "at" a string. } resets the string context to the 'root' hash table. { sets the string context to the hash table associated with string descriptor s . All searches/creations/deletions will use this table until changed with another { , or } . $TEXT given a string descriptor, returns address, segment, and length of its string text. (Note 2+ to skip reverse pointer) HSEG is the segment address of a 64K region for hash lists. The first 512 bytes are heads for 256 hash lists. #H is 64K - 512 bytes (for heads), divided by 16 bytes per element. It should be changed if any of the other parameters change. Should rewrite someday to derive this from the other stuff. HFREE points to the first in a list of available elements. \HASH clears the hash table by a) zeroing all the heads, b) linking all list elements into a single list, and c) setting HFREE to point to this free list. HASH given an address and count in program space, computes an 8-bit hash code c and a 16-bit "ascension" value u . The hash code is used to select which of 256 hash lists to search for the desired string. The hash lists are stored in order of increasing ascension value. This resolves collisions quickly - only rarely is a text comparison needed (when both hash and ascension match), and the list search ends when the ascension is exceeded (on average, after half the list is searched). See Waite, pp. 230-232. This is not a particularly good or mathematically sound hash function, just one that was easy to write. It may be quite poor by the usual standards for hash functions. It should be replaced some day! HFIND searches the hash table for the given string. a n are the address and length of the string in program space. u c are the ascension and hash code, as returned by HASH. HFIND returns f=0 for success, f=1 for failure. a' is the address of the preceding element in the list (which may be the list head). On success, this is the address of the link (!) to the desired element. The algorithm: start with head of the list (which is a link) While link not null, save adr of link and get link If ascensions equal, If lengths equal, If text equal, then return successful Loop if list ascension <= desired list ascension > desired, or null link: return unsuccessful HINSERT inserts a new element into a hash list, and initializes the element. It expects the address a' of the preceding element, as returned by HFIND. u is the ascension, n the string length, and A the address in string space. HINSERT does not allocate string space nor copy the string. HINSERT assumes that the string is not a duplicate. (Use HFIND and test for failure.) HINSERT returns f=0 if successful, f=1 if not. The only error condition is an empty free list, meaning that the hash table is full. HINSERT also stores a pointer back to this list element, in the string space. HDELETE removes an element from the hash list and returns it to the free list. It expects the address a' of the preceding element, as returned by HFIND. Nothing is returned. HFIND does not de-allocate string space. It assumes that there are no references to this hash entry still active. I.e., since the address of the hash entry (aka string descriptor) is used as the "handle" of the string, there had better not be any attempt to use this handle after the list element is returned to the free pool! HDELETE always succeeds. HALLOC allocates a hash list element (16 bytes), and returns its address in hash space. A false flag is returned if successful; true flag if hash space is full. \TABLE allocates and initializes hash list elements to build a hash table. Nine elements are required for a two-tier table: one contains pointers to the next eight, each of which contains pointers to eight lists for a total of 64 threads. \TABLE expects and returns the address of the first element; thus an HALLOC must be done before \TABLE. TABLE given a string descriptor, builds a hash table to be associated with that string. The 'root' hash table is stored at address zero in hash space. Important! this is be the init. value of $TABLE for new strings! Note the use of an explicit HALLOC to get the element at address zero; usu. a zero link means 'empty list' but we know better. CURSOR is the scan pointer for pattern matching. ENTER< builds the pattern context for a subpattern routine. The context, on the return stack, is Return Address (pushed on entry) and cursor position. ENTER< then puts a "fallout" backtrack record on the History stack. FALLOUT If a subpattern routine needs to back up to a calling routine, the "fallout" record will be executed first; \ IP is set to FALLOUT and the word (FALLOUT) is performed. \ This will remove the inner routine's context from Rstack. IP is set to FALLOUT; this discards the superfluous cursor info from the Rstack, and returns to the caller with a 0. (<<) indicates the beginning of alternatives; it saves the current cursor position in the Rstack context. This establishes the posn. where alternatives will be tried. Compiled by << . (|) indicates that an alternative follows. ON SUCCESS (true): puts an "alternative" backtrack record on the History stack. This consists of the "backup" cursor position (as found from Rstack) and the IP of the following alternative. Then execution branches past all alternatives to the code after (>>). (Any parameters for the next alt've can be put on data stack before (|). They will be on top if/when execution backtracks to the alternative.) ON FAILURE (false): the cursor position is restored from (and left on) Rstack. Execution falls thru to next alt've. (>>) indicates the end of alternatives. Compiled by >> . ON SUCCESS (true): fall thru. ON FAILURE (false): no alternatives remain; must backtrack. Pops the IP and cursor position from History stack; leaves Rstack alone. (Rstack is adjusted by FALLIN & FALLOUT.) >EXIT cleans up the pattern context for a subpattern routine. IF NO ALTERNATIVES WERE STACKED: discards the "fallout" backtracking record (since we'll never fall in to here). IF ANY ALTERNATIVES WERE STACKED: adds a "fallin" record to the History stack. IN EITHER CASE: discards the cursor from the Rstack; the Return Address context is consumed by the ; which follows. Also stacks a "true" success code. (If the subpattern failed, >EXIT is never performed & FALLOUT stacks a zero.) FALLIN If a "higher" routine needs to back up to an alternative in a subpattern, the "fallin" record will be executed first. IP is set to FALLIN and the word (FALLIN) is performed. This puts the appropriate context & return information back on the return stack, allowing the subpattern routine to execute and exit normally. PATTERN< indicates the start of the "main" pattern. The History (data) stack pointer is saved on the Rstack. This will be needed to clean up the stack if a match is found when there are still untried alternatives. PATTERN< also \ puts the "failure" record on the bottom of the History stack. FAILURE Before anything else is put on the History stack, a "failure" backtracking record is stacked. If the backtrack logic ever backs up to this record, it means that all alternatives have been exhausted. The data stack is empty; the saved stack pointer is dropped from Rstack and an immediate exit to the calling routine is performed. >ENDS indicates the end of the "main" pattern. If it is ever \ reached, a match was found; it scrubs any leftover History It is reached with a success code on the stack top, and perhaps some alternatives underneath. It scrubs the stack info by resetting the stack pointer, preserving the flag. After each alternative, a conditional branch is compiled to skip the rest. These words do the compile-time construction. << stacks a 20 for error check, and another to indicate to >> that no more branches need to be resolved. Compiles (<<) . | compiles the conditional branch (|) and an empty offset cell. Stacks the address of the cell, and a 21 flag. >> compiles (>>). Then proceeds to resolve all the | branches, looping until none are left on stack. Performs a limited error check by testing the 20 left by >> . We add a few words for "cosmetic" purposes... PAT: begins a subpattern definition; equiv. to : xxx ENTER< ;PAT ends a subpattern definition; equiv. to >EXIT ; These variables indicate the subject string of the pattern match CURADR is the address of the start of the string. CURSEG is the 8086 segment of the string. CURSOR (screen 42) is an offset into the string (0..N). EOS is the offset of the end of the string, i.e., its length. SUBJECT given a string s, sets the pattern matching pointers. SUBJECT (testing version) parses a line from the input stream, and sets the pattern matching pointers to it. EVALUATE is the top-level word to initiate pattern matching. It expects a subject string s, and the execution address a (i.e. cfa) of a pattern. It returns a success flag. EVALUATE is what makes a subpattern expression the "main" pattern, via PATTERN< and >ENDS; the difference being that the first success terminates and dumps all alternatives. (Note the fancy stack manipulation necessary so as not to disturb the stack balance between PATTERN< and >ENDS.) (MATCH) is the text matching code for MATCH. It expects the address, segment, and length of a text string, and compares this with the subject string at the current position (as indicated by CURADR, CURSEG, CURSOR, EOS). Note that it first checks that the rest of the subject is long enough! MATCH matches a given string in the current subject. Usage is " text" MATCH . Returns true if the subject contains that text at the current cursor position. (ANY) is the basic matching code for ANY. It expects the address, segment, and length of a text string, and searches this string for the character at the current position in the subject string (per CURADR, CURSEG, CURSOR). ANY matches a single character in the subject to any of a set of characters, that set determined by the contents of the given string s. Usage: " characters" ANY . Returns true if the character at the current cursor position is in the set. LEFT returns the number of characters remaining in the subject. LEN matches a string of length n having any contents. Essentially, this just checks that n bytes remain, and if so, advances the cursor over them. POS returns true if currently at position 'n'. The left end of the subject string is position zero. The right end is the value in EOS. Cursor position is not affected. ARB matches an arbitrary number of characters. It attempts to match the largest possible number of characters, beginning with the entire remaining string, and backtracking down to 0. This is an example of using Forth logic in a pattern. The BEGIN...UNTIL loops from LEFT down to zero, breaking out at | on every loop with success. Note that a DO..LOOP would not work here because it uses the return stack. These operators act on strings, often creating new strings. SIZE returns the length of a string s. CONCAT concatenates s1 and s2, returning the new string s3. s1 and s2 are unaffected. This word uses the "brute force" approach of copying both strings to the working area at HERE, and then creating a new string from the result. (Given only the primitive >$, this is the only option. A better approach would be to reserve the required amount of string space, and to build the concatenated string there directly. Some day..) SUBST returns the substring of s, starting at offset ofs for n characters. If fewer than n characters exist after ofs, the result is truncated accordingly. If ofs is past the end of string, a null string is returned. FILLED creates a string of length n filled with character c. Again, a brute force approach is used: build it at HERE, then create the string. $>N converts a text string s to a binary number n. Note that the Forth rules for numeric conversion are followed, i.e., conversion goes up to the first non-convertible character, so that " 1234XYZ" $>N will return 1234 . Also, signed and decimal numbers are not converted. N>$ converts a binary number n to a string s. This is a signed conversion with no leading or trailing spaces. This screen emulates Griswold's SNOBOL4 example 1.16. STRING is where strings of 1 character are assembled. >STRING given a character, creates a 1-byte string and returns its string descriptor. A table is defined with the string identifier COUNTS. SCRUB clears the COUNTS entries for identifiers "A" thru "Z". DISPLAY displays the COUNTS entries, likewise. ACCUMULATE given a block number, counts all the characters in that block, storing the count in character-named entries in the table COUNTS. This is the test data for the letter counting program. Type 54 LOAD to compile the program, then type 55 ACCUMULATE to count the letters in this screen. These words are a pattern matcher for simple algebraic expressions. They implement the BNF expressions := | + | - := | * | / := ( ) | := A | B | C | D ... X | Y | Z