Parsing the def statement is a bit tricky because we use the same name for five different forms of definition, each with its own syntax. It is rather difficult to merge those five syntaxes into one LL(1) syntax. Instead we use look-ahead to pre-classify the def statement as a method definition (regular or generic), a constant definition, a variable definition, or a forward declaration.
The look-ahead remembers the tokens read from the source token stream, along with their source locations. Once the pre-classification is complete, the remembered tokens and source locations are returned to the token stream's push-back buffer so that the same tokens can be parsed again.
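The remember-then-push-back idea can be sketched in Python. This is only an illustration, not the Lunar lexer: the class name TokenStream, its methods, and the (spelling, line) token pairs are assumptions made for the example.

```python
from collections import deque


class TokenStream:
    """Token source with a push-back buffer (hypothetical, not the Lunar lexer)."""

    def __init__(self, tokens):
        self._source = iter(tokens)
        self._pushback = deque()

    def next(self):
        """Consume and return the next token, or None at end of input."""
        if self._pushback:
            return self._pushback.popleft()
        return next(self._source, None)

    def insert(self, tokens):
        """Return remembered tokens to the front of the stream, keeping their order."""
        self._pushback.extendleft(reversed(list(tokens)))


# Each token is a (spelling, line) pair so the source location travels with it.
stream = TokenStream([("x", 1), (":=", 1), ("1", 1)])
remembered = [stream.next(), stream.next()]   # look ahead two tokens
stream.insert(remembered)                     # give them back
replayed = [stream.next(), stream.next(), stream.next()]
```

After the push-back, the parser reads the same three tokens it would have seen had no look-ahead taken place.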
A constant definition starts with tokens matching the pattern
(name [ "(" stuff ")" ] | "[" stuff "]") "="
where stuff is any sequence of tokens containing balanced parentheses or brackets.
In other words, the left-hand side preceding = is either a name or a destructuring. If it's a destructuring, it's either a name followed by stuff in parentheses (a function call destructuring) or stuff in brackets (a list destructuring).
A variable definition starts with a name followed by :=.
A forward definition starts with a name, and the token after that name is none of =, :=, or (; instead it marks the end of the statement.
Anything else must be a method definition or a syntax error.
See Definitions for how the def statement is parsed once it has been pre-classified.
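The classification rules above can be sketched in Python over a list of token spellings for one statement (the tokens after def). This is a simplified analogy, not the actual parser: the function names, the is_name test, and the use of plain strings as tokens are all assumptions made for the example, and the end of the list stands in for the end of the statement.

```python
OPEN_TO_CLOSE = {"(": ")", "[": "]", "{": "}"}


def is_name(token):
    """Assumed test for an ordinary name: starts with a letter or underscore."""
    return token[0].isalpha() or token[0] == "_"


def skip_balanced(tokens, i):
    """tokens[i] is an opening bracket; return the index just past its
    matching closer, or None if the statement ends inside the brackets."""
    closer = OPEN_TO_CLOSE[tokens[i]]
    i += 1
    while i < len(tokens):
        if tokens[i] in OPEN_TO_CLOSE:
            i = skip_balanced(tokens, i)
            if i is None:
                return None
        elif tokens[i] == closer:
            return i + 1
        else:
            i += 1
    return None


def classify(tokens):
    """Pre-classify a def statement as constant, variable, forward, or method."""
    has_name = bool(tokens) and is_name(tokens[0])
    i = 1 if has_name else 0
    nxt = tokens[i] if i < len(tokens) else None
    # Destructuring: name "(" stuff ")" "="   or   "[" stuff "]" "="
    if nxt == ("(" if has_name else "["):
        end = skip_balanced(tokens, i)
        if end is not None and end < len(tokens) and tokens[end] == "=":
            return "constant"
    if has_name:
        if nxt == ":=":
            return "variable"
        if nxt == "=":
            return "constant"
        if nxt is None or nxt == ")":
            return "forward"
        if nxt != "(":
            raise SyntaxError("expected ) or end of statement")
    return "method"
```

For instance, under these assumptions classify(["x", ":=", "1"]) yields "variable", while a name followed by a parenthesized list that is not followed by = falls through to "method".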
A method_modifiers datum represents the modifiers and description that can be attached to a method. It can be parsed as the methodmodifiers syntactic construct. It could have been defined by
defclass method_modifiers(named: sealed boolean,
dominant boolean,
intrinsic boolean,
description string | false)
;; True if this method_modifiers object can be discarded
def empty?(mm method_modifiers)
not (mm.sealed or mm.dominant or mm.intrinsic or mm.description)
def parse_methodmodifiers(lexer, indentation, scope, required?)
block exit: return
def sealed := false
def dominant := false
def intrinsic := false
def description := false
while true
if match?(lexer, #sealed:) then sealed := true
else if match?(lexer, #dominant:) then dominant := true
else if match?(lexer, #intrinsic:) then intrinsic := true
else if match?(lexer, #description:)
description := parse_string(lexer, indentation, scope, true)
match?(lexer, #\,)
else if sealed or dominant or intrinsic or description
return(method_modifiers(sealed: sealed, dominant: dominant,
intrinsic: intrinsic, description: description))
else if required?
parse_error(lexer, "Required method modifiers not present")
else
return(false)
There is also a utility method to help with converting modifiers placed in front of a def statement into method_modifiers.
def add_statement_modifiers(mm method_modifiers, statement_modifiers set![name])
  if remove!(statement_modifiers, #sealed) then mm.sealed := true
  if remove!(statement_modifiers, #dominant) then mm.dominant := true
  if remove!(statement_modifiers, #intrinsic) then mm.intrinsic := true
  mm
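As a rough Python analogy of the modifiers datum and the utility method, the following sketch may help. The class and function names are hypothetical, None stands in for the false description, and "inline" is an invented extra statement modifier used only to show that unrelated modifiers stay in the set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MethodModifiers:
    sealed: bool = False
    dominant: bool = False
    intrinsic: bool = False
    description: Optional[str] = None   # None plays the role of false here

    def is_empty(self):
        """True if this object carries no information and can be discarded."""
        return not (self.sealed or self.dominant or self.intrinsic or self.description)


def add_statement_modifiers(mm, statement_modifiers):
    """Move sealed/dominant/intrinsic from the statement's modifier set into mm."""
    for flag in ("sealed", "dominant", "intrinsic"):
        if flag in statement_modifiers:
            statement_modifiers.remove(flag)
            setattr(mm, flag, True)
    return mm


mods = {"sealed", "inline"}          # "inline" is a made-up modifier for the example
mm = add_statement_modifiers(MethodModifiers(), mods)
```

Afterwards mm.sealed is set and the set of statement modifiers contains only the modifier that was not consumed.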
A formal_parameters datum represents the formal parameters of a method. It can be parsed as the formalparameters syntactic construct. It is variable but only the parser should modify it.
It could have been defined by
defclass formal_parameters(named:
required = [] list[formal_parameter_definition],
optional = [] list[formal_parameter_definition],
named = [] list[formal_parameter_definition],
rest = false formal_parameter_definition | false,
scope scope) ; scope where all parameters are visible
def parse_formalparameters(lexer, indentation, scope, required?)
block exit: return
def initial_name := false
if not required?
;; Check if formal parameters are present
initial_name := parse_name(lexer, indentation, scope, false)
if not initial_name and next(lexer) ~= #optional: and next(lexer) ~= #named:
return(false)
def result := formal_parameters(scope: scope)
def mode := #required
while true
match_newline?(lexer, indentation, true)
def selector = if mode = #named and next(lexer) in keyword then next!(lexer).name
if def parameter_name = initial_name or parse_name(lexer, indentation, scope, false)
initial_name := false
def default = if mode ~= #required
if match?(lexer, #=)
parse_expression(lexer, indentation, scope, true)
else
quotation(false)
def type = evaluate_type(parse_expression(lexer, indentation, scope, false), scope)
def selector2 = selector or mode = #named and name(parameter_name.spelling)
def formal = formal_parameter_definition(name: parameter_name,
type: type,
default: default,
scope_for_default: default and result.scope,
selector: selector2)
;; Scope for next formal parameter's default, and body, includes this formal parameter
def new_scope = formal_parameter_scope(result.scope)
add_definition(new_scope, parameter_name, formal)
result.scope := new_scope
;; Stash formal in result
if match?(lexer, #\...)
;; This is the rest parameter
if default then parse_error(lexer, "rest parameter cannot have a default")
if selector then parse_error(lexer, "rest parameter cannot have a selector keyword")
result.rest := formal
return(result)
else if mode = #required
result.required := list(result.required + [formal]...)
else if mode = #optional
result.optional := list(result.optional + [formal]...)
else ; if mode = #named
result.named := list(result.named + [formal]...)
;; See if there are more formal parameters
if not match?(lexer, #\,)
return(result)
else if match?(lexer, #\#)
;; #constant abbreviated syntax
if mode = #named then parse_error(lexer, "# parameter cannot be named or rest")
def tempname = name("temp", macro_context(scope))
def constant = next!(lexer) ; name or keyword or integer or character
def formal = formal_parameter_definition(name: tempname,
type: set(constant))
add_definition(result.scope, tempname, formal)
;; Stash formal in result
if match?(lexer, #\...)
parse_error(lexer, "# parameter cannot be named or rest")
else if mode = #required
result.required := list(result.required + [formal]...)
else if mode = #optional
result.optional := list(result.optional + [formal]...)
;; See if there are more formal parameters
if not match?(lexer, #\,)
return(result)
else if mode = #required and match?(lexer, #optional:)
mode := #optional
else if (mode = #required or mode = #optional) and match?(lexer, #named:)
mode := #named
else
;; An unrecognized token ends the parsing
return(result)
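The mode handling in parse_formalparameters (required, then optional, then named, with a rest parameter ending the list) can be illustrated with a much reduced Python sketch. It assumes tokens are plain strings, ignores types, defaults, selectors, scopes, and the # constant syntax, and the function name split_formals is invented for the example.

```python
def split_formals(tokens):
    """Sort parameter names into required/optional/named/rest by mode."""
    result = {"required": [], "optional": [], "named": [], "rest": None}
    mode = "required"
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "optional:" and mode == "required":
            mode = "optional"
            i += 1
        elif tok == "named:" and mode in ("required", "optional"):
            mode = "named"
            i += 1
        elif tok.isidentifier():
            i += 1
            if i < len(tokens) and tokens[i] == "...":
                result["rest"] = tok          # the rest parameter ends the list
                return result
            result[mode].append(tok)
            if i < len(tokens) and tokens[i] == ",":
                i += 1                        # more formal parameters follow
            else:
                return result
        else:
            return result                     # an unrecognized token ends the parsing
    return result


formals = split_formals(
    ["a", ",", "b", ",", "optional:", "c", ",", "named:", "d", ",", "more", "..."]
)
```

Note that the modes only move forward: once in named mode, an optional: keyword is treated as an unrecognized token and ends the parsing, as in the code above.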
The types in formal parameters, variable definitions, etc. are actual types (instances of the class type), not names of types.
A type declaration must be converted from an expression to an actual type datum at compile time. This could have been done by
def evaluate_type(expression expression | false, scope scope)
if not expression
;; Type is defaulted
everything
else if expression in name
def defn = lookup(scope, expression)
if defn in known_definition and defn.value in type
defn.value
else
error("$expression is not a defined type")
else if expression in call_expression
assert not (expression in spread_call_expression) ; ugh!
def actuals = mapf(evaluate_type(_, scope), expression.parameters)
def fcn = if expression.function in function then expression.function
else
def defn = lookup(scope, expression.function)
if defn in known_definition and defn.value in function
defn.value
else
error("Don't know how to call $(expression.function) at compile time")
def result = fcn(actuals...)
if result in type
result
else
error("$expression does not evaluate to a type")
else
error("Don't know yet how to evaluate $expression at compile time")
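The same three cases (defaulted type, name of a type, call of a type-valued function) can be sketched in Python. The representation is an assumption made for the example: a name is a string, a call is a tuple of function name followed by parameter expressions, the scope is a dictionary, and the Type class and the list constructor are hypothetical stand-ins.

```python
class Type:
    """Minimal stand-in for a type datum."""

    def __init__(self, name):
        self.name = name


EVERYTHING = Type("everything")


def evaluate_type(expression, scope):
    """Convert a type expression to a Type at 'compile time'."""
    if expression is None:
        return EVERYTHING                     # the type is defaulted
    if isinstance(expression, str):           # a name
        value = scope.get(expression)
        if isinstance(value, Type):
            return value
        raise TypeError(f"{expression} is not a defined type")
    if isinstance(expression, tuple):         # a call: (function_name, *parameters)
        function_name, *parameters = expression
        actuals = [evaluate_type(p, scope) for p in parameters]
        fcn = scope.get(function_name)
        if not callable(fcn):
            raise TypeError(f"Don't know how to call {function_name} at compile time")
        result = fcn(*actuals)
        if isinstance(result, Type):
            return result
        raise TypeError(f"{expression} does not evaluate to a type")
    raise TypeError(f"Don't know yet how to evaluate {expression} at compile time")


scope = {
    "integer": Type("integer"),
    "list": lambda element: Type(f"list[{element.name}]"),  # hypothetical constructor
}
list_of_integer = evaluate_type(("list", "integer"), scope)
```

As in the definition above, parameters of a call are themselves evaluated as types before the function is applied.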
A method_head datum represents everything about a method except its body. This is separated out because of the require statement. It can be parsed as the methodhead syntactic construct. It could have been defined by
defclass method_head (named:
name name,
modifiers method_modifiers,
formal_parameters formal_parameters,
result_type type)
def parse_methodhead(lexer, indentation, scope, required?)
block exit: return
if def function_name := parse_name(lexer, indentation, scope, false)
;; function(actual parameters) syntax
parse_methodhead_after_name(lexer, indentation, scope, function_name)
else
;; unary or binary operator syntax
def parameters = if match?(lexer, #\()
;; "(" name1 type_expression1 ")" operator ( "(" name2 type_expression2 ")" |
;; "#" constant )
def name1 = parse_name(lexer, indentation, scope, true)
def type1 = evaluate_type(parse_expression(lexer, indentation, scope, true),
scope)
match!(lexer, #\))
function_name := parse_operator(lexer, indentation, scope, true)
if def macro_expander = known_definition(scope, function_name).infix_macro_expander
def expansion = macro_expander(name1, lexer, indentation, scope,
function_name.context, true)
if expansion in method_head
;; Put the real left hand side into the formal parameters
if first(expansion.formal_parameters.required) eq name1
first(expansion.formal_parameters.required) :=
formal_parameter_definition(name: name1, type: type1)
if match?(lexer, #\=>) ;; [ "=>" result_type_expression ]
expansion.result_type :=
evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
return(expansion)
else if expansion in call_expression and
length(expansion.parameters) >= 1 and
first(expansion.parameters) eq name1
def operator_name = function_name
function_name := expansion.function
if function_name in bundle
function_name := function_name.name
[formal_parameter_definition(name: name1, type: type1),
[(if arg in name
formal_parameter_definition(name: arg, type: everything)
else if arg in quotation
formal_parameter_definition(
name: name("temp", macro_context(scope)),
type: set(arg.datum))
else
error("Don't know how to convert $expansion expansion of $operator_name operator into a method_head"))
for arg in rest(expansion.parameters)]...]
else
error("Don't know how to convert $expansion expansion of $function_name operator into a method_head")
else
def [name2, type2] = if match?(lexer, #\()
def values =
[parse_name(lexer, indentation, scope, true),
evaluate_type(parse_expression(lexer, indentation,
scope, true),
scope)]
match!(lexer, #\))
values
else if match?(lexer, #\#)
[name("temp", macro_context(scope)),
block
def datum := next!(lexer)
if datum in name
;; Strip hygienic context
datum := name(datum, false)
set(datum)]
else
wrong_token_error(lexer, "( or #")
[formal_parameter_definition(name: name1, type: type1),
formal_parameter_definition(name: name2, type: type2)]
else if function_name := parse_operator(lexer, indentation, scope, false)
;; operator "(" name type_expression ")"
match!(lexer, #\()
def name1 = parse_name(lexer, indentation, scope, true)
def type1 = evaluate_type(parse_expression(lexer, indentation, scope, true),
scope)
match!(lexer, #\))
[formal_parameter_definition(name: name1, type: type1)]
else if required?
wrong_token_error(lexer, "a name, an operator, or ( to start a method head")
else return(false)
def formals_scope = formal_parameter_scope(scope)
for formal in parameters
add_definition(formals_scope, formal.name, formal)
def parameters = formal_parameters(required: parameters, scope: formals_scope)
;; Adjust function_name and parameters if ":=" "(" name type ")" follows
def function_name = parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)
def result_type = if match?(lexer, #\=>) ;; [ "=>" result_type_expression ]
evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
else everything
return(method_head(name: function_name,
modifiers: method_modifiers(),
formal_parameters: parameters,
result_type: result_type))
;; Call this when the name has already been parsed
;; { name & "." }+ [ "[" generic_class_formalparameters "]" ] "(" methodmodifiers formalparameters ")"
;; [ ":=" "(" new_value_name type_expression ")" ]
;; [ "=>" result_type_expression ]
def parse_methodhead_after_name(lexer, indentation, scope, function_name_arg)
def function_name := function_name_arg
while match?(lexer, #\.)
function_name := name(parse_name(lexer, indentation, scope, true), lookup_module(function_name))
;;---TODO insert generic_class_formalparameters support here
match!(lexer, #\()
def modifiers = parse_methodmodifiers(lexer, indentation, scope, false)
def parameters = parse_formalparameters(lexer, indentation, scope, true)
match!(lexer, #\))
;; Adjust function_name and parameters if ":=" "(" name type ")" follows
function_name := parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)
def result_type = if match?(lexer, #\=>)
evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
else
everything
method_head(name: function_name,
modifiers: modifiers,
formal_parameters: parameters,
result_type: result_type)
def parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)
if match?(lexer, #\:=)
match!(lexer, #\()
def new_value_name = parse_name(lexer, indentation, scope, true)
def new_value_type = evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
match!(lexer, #\))
if parameters.optional or parameters.named or parameters.rest
parse_error(lexer, ":= can only be used with required formal parameters")
def formal = formal_parameter_definition(name: new_value_name, type: new_value_type)
parameters.required := list(parameters.required + [formal]...)
parameters.scope := formal_parameter_scope(parameters.scope)
add_definition(parameters.scope, new_value_name, formal)
name(function_name.spelling + ":=", function_name)
else function_name
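The effect of parse_methodhead_assignment on a method head can be shown with a small Python sketch. The dictionary layout and the function name apply_assignment_form are assumptions made for the example; names are plain strings and scopes are left out.

```python
def apply_assignment_form(function_name, formals, new_value):
    """Handle a trailing ':= (new_value type)': the new value becomes the last
    required parameter and the method is renamed by appending ':='."""
    if formals["optional"] or formals["named"] or formals["rest"]:
        raise SyntaxError(":= can only be used with required formal parameters")
    formals["required"] = formals["required"] + [new_value]
    return function_name + ":="


formals = {"required": ["p"], "optional": [], "named": [], "rest": None}
setter_name = apply_assignment_form("x", formals, "v")
```

So, under these assumptions, a head for x taking p with an assignment clause for v becomes a head for a method spelled x:= whose required parameters are p and v.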
The parsing of the def statement, with lookahead, could have been implemented by
defmacro def =>
block exit: return_macro_expansion
;; First step is pre-classification:
;;
;; A constant definition starts with tokens matching the pattern
;; (name [ "(" stuff ")" ] | "[" stuff "]") "="
;; where stuff is any sequence of tokens containing balanced parentheses or brackets.
;; In other words, the left-hand side preceding = is either a name or a destructuring.
;;
;; A variable definition starts with a name followed by :=.
;;
;; A forward definition starts with a name but the next token is none of =, :=, or (.
;;
;; Anything else must be a method definition or a syntax error.
def definition_type := #unknown
def lookahead_buffer = stack()
def initial_name = parse_name(lexer, indentation, scope, false)
;; Returns true if successful, false if statement ends inside brackets
def parse_balanced_stuff(initiator name)
block exit: return_from_parse_balanced_stuff
def terminator = if same_spelling?(initiator, #\[) then #\]
else if same_spelling?(initiator, #\{) then #\}
else #\)
push!(lookahead_buffer, initiator)
while def token = next!(lexer) ;; implies return_from_parse_balanced_stuff(false) at EOF
if token in newline and token.indentation <= indentation
push!(lookahead_buffer, token)
return_from_parse_balanced_stuff(false)
else if token in name and (same_spelling?(token, #\() or
same_spelling?(token, #\[) or
same_spelling?(token, #\{))
parse_balanced_stuff(token) or return_from_parse_balanced_stuff(false)
else
push!(lookahead_buffer, token)
if same_spelling?(token, terminator)
return_from_parse_balanced_stuff(true)
if same_spelling?(next(lexer), if initial_name then #\( else #\[) and
parse_balanced_stuff(next!(lexer)) and
same_spelling?(next(lexer), #=)
;; Lookahead matched, this is a destructuring constant definition
insert!(lexer, lookahead_buffer)
definition_type := #destructuring
else
;; This is something other than a destructuring constant definition
insert!(lexer, lookahead_buffer)
if initial_name and match?(lexer, #\:=)
definition_type := #variable
else if initial_name and match?(lexer, #=)
definition_type := #constant
else if initial_name and not same_spelling?(next(lexer), #\()
;; Forward declaration or name followed by garbage
if not next(lexer) or
next(lexer) in newline and next(lexer).indentation <= indentation or
same_spelling?(next(lexer), #\))
definition_type := #forward
else
wrong_token_error(lexer, ") or end of statement")
else
definition_type := #method
;; Pre-classification is done and initial_name is parsed, do the real parsing
def defn = case definition_type
#forward => known_definition(initial_name, bundle(name: initial_name))
#variable => def initial_value = parse_expression(lexer, indentation, scope, true)
def type = evaluate_type(parse_expression(lexer, indentation, scope, false), scope)
variable_definition(initial_name, type, initial_value)
#constant => def expr = parse_expression(lexer, indentation, scope, true)
if expr in number | character | string
known_definition(initial_name, expr)
else if expr in quotation
known_definition(initial_name, expr.datum)
else if expr in name and lookup(scope, expr) in known_definition
known_definition(initial_name, lookup(scope, expr).value)
else
constant_definition(initial_name, everything, expr)
;; #destructuring => TBD
#method => def head = if initial_name
parse_methodhead_after_name(lexer, indentation, scope, initial_name)
else
parse_methodhead(lexer, indentation, scope, true)
def body_scope = head.formal_parameters.scope
def function_name = head.name
def existing_definition = lookup1(scope, function_name)
add_statement_modifiers(head.modifiers, modifiers)
if existing_definition
if existing_definition in known_definition and
existing_definition.value in bundle
return_macro_expansion(call_expression(name("add_method", modules.lunar),
function_name,
method_expression(head,
parse_body(lexer, indentation,
body_scope, true))))
else parse_error(lexer, "Incompatible definitions for $function_name")
else
;; Not already defined directly in this scope, add the definition
def bundle = bundle(name: function_name)
def defn = known_definition(function_name, bundle)
def expr = expand_definition(scope, function_name, defn)
def new_scope = if scope in global_scope then scope
else if expr in name then scope
else
;; expand_definition created a new block_tail_scope
last(expr.expressions)
;; Reparent the body and formal parameter scopes on new_scope
;; so the method is inside its own scope, so it can be recursive
for s = body_scope then s.parent while s using return
if s.parent eq scope
s.parent := new_scope
return(false)
;; Parse the body and make an add_method call
def am = call_expression(name("add_method", modules.lunar),
function_name,
method_expression(head, parse_body(lexer, indentation,
body_scope, true)))
;; Plug it into the return value
return_macro_expansion(if expr in prog_expression
new_scope.expressions := new_scope.expressions + [am]
expr
else
am)
;; Not a method, just a simple definition
expand_definition(scope, initial_name, defn)
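The #constant case above decides whether the value is already known at compile time or must be computed at run time. A minimal Python sketch of that decision follows; the Name and Quotation classes, the tuple tags "known" and "constant", and the dictionary scope are all assumptions made for the example, and Python numbers and strings stand in for literal expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    spelling: str


@dataclass(frozen=True)
class Quotation:
    datum: object


def define_constant(expr, scope):
    """Return ("known", value) if the value is available at compile time,
    otherwise ("constant", expr) for a value computed at run time."""
    if isinstance(expr, (int, float, str)):       # literal number or string
        return ("known", expr)
    if isinstance(expr, Quotation):               # quoted datum
        return ("known", expr.datum)
    if isinstance(expr, Name) and scope.get(expr, ("", None))[0] == "known":
        return ("known", scope[expr][1])          # alias of a known definition
    return ("constant", expr)


scope = {Name("pi"): ("known", 3.14159), Name("n"): ("constant", Name("read_input"))}
alias = define_constant(Name("pi"), scope)
```

Here a definition whose right-hand side is the name of an already known definition simply shares that value, while anything else falls back to an ordinary constant definition.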