ENGINE.F Parser/Compiler Engine Module

ENGINE.F is a top-down recursive parser with
backtracking and conditional compilation
capability. It is designed to the Extended Backus-Naur
Form (EBNF) standard, ISO/IEC 14977:1996(E).

ENGINE.F requires the EBNF 4.0 module Usefile.

Key Values:

The Success flag is the switch that controls the
conditional execution of all EBNF productions and
successive alternatives.

true value Success
' Success alias Success?

variable Uin    \ LineBuf scan pointer
Uin off         \ offset like >in

0 value Uline   \ current source line number


usource returns the char address of the next source
character to be scanned.
csource in ver 3.5

+uline advances Uline to the next source text line.
+blk in ver 3.5

-uline sets Uline back to a previously scanned source line.
-blk in ver 3.5

<bnf is the run-time for the start of an EBNF production.
If previous scans were successful, save the
line#, scan pointer and DP on the return stack.
Otherwise, backstep through the production chain
by dropping the return address.

bnf> is the run-time for the end of an EBNF production.
If the production scan was successful, discard
the saved scan parameters. Otherwise, restore the
scan parameters to their previous values which
were saved on the return stack.
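
The save/restore behaviour of <bnf and bnf> described above can be
sketched roughly as follows. This is an illustration only, not the
ENGINE.F source: it assumes one-cell return addresses and that a word
may reach under its own return address, which standard Forth does not
guarantee. DP is left out because restoring the dictionary pointer is
system-specific.

```forth
\ Hypothetical stand-ins for the key values described earlier.
true value Success
variable Uin   0 Uin !
0 value Uline

\ Start of a production.
: <bnf  ( -- )
   Success if
      r>                   \ our own return address
      Uline >r  Uin @ >r   \ save line# and scan pointer beneath it
      >r                   \ put the return address back
   else
      r> drop              \ drop it: the calling production is skipped
   then ;

\ End of a production.
: bnf>  ( -- )
   Success if
      r>  r> drop  r> drop  >r       \ success: discard the saved state
   else
      r>  r> Uin !  r> to Uline  >r  \ failure: restore Uin, then Uline
   then ;
```

The values are popped in the reverse of the order in which they were
pushed, so Uin is restored before Uline.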

An Extended BNF production, aka expression or syntax-rule,
consists of a meta-identifier, i.e. production-name,
a defining-symbol, a definitions-list, and a
terminator-symbol in this form:
meta-identifier = definitions-list ;

EBNF 4.0 uses the following form for productions in order
to integrate better with FORTH:
bnf= meta-identifier definitions-list ;bnf

EBNF 4.0 performs the lexical analysis of the source text
by parsing tokens from the source and comparing them to
pre-defined tokens. If the comparison succeeds, the scan
proceeds forward through the user source file. If it fails,
the parser back-tracks and attempts a lexical alternative.
This is how the EBNF parser/compiler implements top-down
recursive parsing with back-tracking.

@token returns the next source input character to be scanned for analysis.

+token advances the source scan pointer, Uin.

=token compares an EBNF token to source text. EBNF tokens are stored as
counted strings.
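
A minimal sketch of how these three words could fit together, using
only standard Forth words. The LineBuf layout, the stack effect of
+token, and the body of =token are assumptions made for illustration
and may differ from the actual ENGINE.F definitions.

```forth
\ Hypothetical stand-ins; the real ENGINE.F definitions may differ.
true value Success
variable Uin   0 Uin !
create LineBuf 80 chars allot      \ assumed one-line source buffer

: usource ( -- c-addr )  LineBuf Uin @ chars + ;
: @token  ( -- char )    usource c@ ;
: +token  ( n -- )       Uin +! ;  \ assumed to take a character count

\ Compare a counted-string token with the source text at the scan
\ pointer. On a match, advance Uin past the token; otherwise clear
\ Success. Nothing is scanned if Success is already false.
: =token  ( c-addr -- )
   Success 0= if  drop exit  then
   count dup >r               \ keep the token length
   usource over compare       \ 0 means the strings are equal
   0= if  r> +token  else  r> drop  false to Success  then ;

\ Usage: load a source line and scan one token.
s" LET" LineBuf swap move
create tok-L  1 c, char L c,       \ counted string "L"
tok-L =token                       \ matches, so Uin advances by 1
```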

Lexical parsing exceptions are handled by saving the
source line number, Uline, and the source offset
pointer, Uin, in SaveUline and SaveUin respectively.
SAVESCAN in ver 3.5

The EBNF standard treats tokens as EBNF productions.
For example:
vowel = "A" | "E" | "I" | "O" | "U" ;
EBNF 4.0 treats tokens as a special production.
" A" is= "A"
" E" is= "E"
" I" is= "I"
" O" is= "O"
" U" is= "U"
bnf= vowel "A" | "E" | "I" | "O" | "U" ;bnf
In this case, each element is defined as a
re-usable token and those elements are
combined to create the "vowel" production.

The EBNF Standard allows the use of double quotes
in place of single quotes to define tokens. EBNF 4.0
uses the double quote representation for two reasons:
(1) In Win32Forth, the form 'g' returns the ASCII code of
    the enclosed character, which interferes with EBNF tokens;
(2) EBNF 4.0 tokens can also be strings which are better
    represented by the double quote.
Single quotes were used in version 3.5 and are not
compatible with Win32Forth!

" is an EBNF form, or alias, of S"

is= defines an elemental token symbol and saves it as
a counted string. A token's run-time action is to
save the state of Uin and Uline and invoke the
analysis, =token.
Example:   " L" is= "L"
TOKEN in ver 3.5
Would have been 'L' in ver 3.5
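
One way such a defining word could be built is with CREATE and DOES>.
The sketch below is an assumption-laden illustration, not the ENGINE.F
source: the storage classes of SaveUin and SaveUline, the LineBuf
layout, and the body of =token are all hypothetical, and S" is used
directly in place of the " alias.

```forth
\ Hypothetical stand-ins; the real ENGINE.F definitions may differ.
true value Success
variable Uin       0 Uin !
0 value Uline
variable SaveUin   0 SaveUin !
0 value SaveUline
create LineBuf 80 chars allot      \ assumed one-line source buffer

: usource ( -- c-addr )  LineBuf Uin @ chars + ;

\ Assumed analysis word: match a counted string at the scan pointer.
: =token  ( c-addr -- )
   Success 0= if  drop exit  then
   count dup >r  usource over compare
   0= if  r> Uin +!  else  r> drop  false to Success  then ;

\ Define a named token and store its text as a counted string.
: is=  ( c-addr u "name" -- )
   create
      dup c,                       \ count byte
      here over chars allot        \ reserve room for the characters
      swap move                    \ copy the text into the dictionary
   does>  ( -- )                   \ run time: body address is on the stack
      Uin @ SaveUin !              \ save the scan state
      Uline to SaveUline
      =token ;                     \ compare the token with the source

\ Usage: define the token, load a source line, scan it.
s" L" is= "L"
s" LET" LineBuf swap move
"L"                                \ matches at offset 0
```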

The syntactic exception state flag, -Tflag, is used to
track the state of a series of syntactic exceptions.
Logically, a syntactic exception means "NOT Token".
Successive scans of a token need to be performed
without prematurely failing the alternative.

| separates the alternatives of an EBNF production definitions-list.

Note: An Empty-Sequence is a production or
Alternative-Sequence with no definitions list.
These always evaluate to Success = True.
The following example is always successful:
bnf= <L>  "L"  |  ;bnf