FIXME: Give lexemes as an extra argument to Parser?
FIXME: Rename second argument to parse method to "tokens"?
FIXME: Make start_token an optional argument to parse? (swap with
token list) and have it default to the first non-terminal?
A parser is created by
p = Parser {grammar}
and called with
result = p:parse (start_token, token_list[, from])
where start_token is the non-terminal at which to start parsing in
the grammar, token_list is a list of tokens of the form
{ty = "token_type", tok = "token_text"}
and from is the token in the list from which to start (the default
value is 1).
The output of the parser is a tree, each of whose
nodes is of the form:
{ty = symbol, node_1 = tree_1, node_2 = tree_2, ... [, list]}
where each node_i is a symbolic name, and list is the list of
trees returned if the corresponding token was a list token.
A grammar is a table of rules of the form
non-terminal = {production_1, production_2, ...}
plus a special item
lexemes = Set {"class_1", "class_2", ...}
Each production gives a form that a non-terminal may take. A
production has the form
production = {"token_1", "token_2", ...,
[action][,abstract]}
A production
* must not start with the non-terminal being defined (it must not
be left-recursive)
* must not be a prefix of a later production in the same
non-terminal
Each token may be
* a non-terminal, i.e. a token defined by the grammar
* an optional symbol is indicated by the suffix "_opt"
* a list is indicated by the suffix "_list", and may be
followed by "_
* a lexeme class
* a string to match literally
The parse tree for a literal string or lexeme class is the string
that was matched. The parse tree for a non-terminal is a table of
the form
{ty = "non_terminal_name", tree_1, tree_2, ...}
where the tree_i are the parse trees for the corresponding
terminals and non-terminals.
An action is of the form
action = function (tree, token, pos)
...
return tree_
end
It is passed the parse tree for the current node, the token list,
and the current position in the token list, and returns a new parse
tree.
An abstract syntax rule is of the form
name = {i_1, i_2, ...}
where i_1, i_2, ... are numbers. This results in a parse tree of
the form
{ty = "name"; tree_i_1, tree_i_2, ...}
If a production has no abstract syntax rule, the result is the
parse node for the current node.
Parser constructor (deals with abstract syntax rules)
Parser:parse: Parse a token list
start: the token at which to start
token: the list of tokens
returns
tree: parse tree