Parsing the def statement is a bit tricky because we use the same name for five different forms of definition, each with its own syntax. It is rather difficult to merge those five syntaxes into one LL(1) syntax. Instead we use look-ahead to pre-classify the def statement as a method definition (regular or generic), a constant definition, a variable definition, or a forward declaration.
The look-ahead remembers the tokens read from the source token stream, along with their source locations. Once the pre-classification is completed, the remembered tokens and source locations are returned to the token stream push back buffer so the same tokens can be parsed again.
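The remember-and-return mechanism can be illustrated with a minimal Python sketch. It is not Lunar and not the actual lexer: the class `TokenStream` and its methods `next` and `push_back` are hypothetical names chosen for illustration, and each token is assumed to be a simple (spelling, location) pair.

```python
from collections import deque


class TokenStream:
    """Hypothetical token stream with a push-back buffer (illustration only)."""

    def __init__(self, tokens):
        # Each token is assumed to be a (spelling, source_location) pair,
        # so the location travels with the token when it is pushed back.
        self._source = iter(tokens)
        self._pushback = deque()

    def next(self):
        """Consume and return the next token, or None at end of input."""
        if self._pushback:
            return self._pushback.popleft()
        return next(self._source, None)

    def push_back(self, tokens):
        """Return remembered tokens so they are read again in their original order."""
        # extendleft reverses its argument, so reverse first to keep the order.
        self._pushback.extendleft(reversed(tokens))


stream = TokenStream([("x", 1), (":=", 2), ("5", 3)])
remembered = [stream.next(), stream.next()]    # look ahead two tokens
stream.push_back(remembered)                   # undo the look-ahead
replayed = [stream.next() for _ in range(3)]   # same three tokens, same locations
```

Keeping the location inside each token means the second parse reports errors at the same source positions as the first read would have.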
A constant definition starts with tokens matching the pattern
    (name [ "(" stuff ")" ] | "[" stuff "]") "="
where stuff is any sequence of tokens containing balanced parentheses or brackets. In other words, the left-hand side preceding = is either a name or a destructuring. If it's a destructuring, it's either a name followed by stuff in parentheses (a function call destructuring) or it is stuff in brackets (a list destructuring).
A variable definition starts with a name followed by :=.
A forward declaration starts with a name, but the next token is none of =, :=, or ( and instead marks the end of the statement.
Anything else must be a method definition or a syntax error.
See Definitions for how the def statement is parsed once it has been pre-classified.
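The classification rules can be tried out with a small Python sketch. It is an illustration under stated assumptions, not the Lunar implementation: tokens are plain strings holding only the part of one statement that follows def, so the end of the list stands for the end of the statement, and indexing into the list replaces the look-ahead buffer. The names `classify_def`, `skip_balanced`, and `is_name`, and the regular expression used to recognize names, are hypothetical.

```python
import re

OPEN_TO_CLOSE = {"(": ")", "[": "]", "{": "}"}


def is_name(token):
    # Assumed, simplified notion of a name; the real lexer decides this.
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_?!]*", token) is not None


def skip_balanced(tokens, start):
    """Return the index just past the bracket group opening at tokens[start],
    or None if the statement ends inside the brackets."""
    stack = []
    for i in range(start, len(tokens)):
        tok = tokens[i]
        if tok in OPEN_TO_CLOSE:
            stack.append(OPEN_TO_CLOSE[tok])
        elif stack and tok == stack[-1]:
            stack.pop()
            if not stack:
                return i + 1
    return None


def classify_def(tokens):
    """Pre-classify the tokens that follow def, without consuming them."""
    has_name = bool(tokens) and is_name(tokens[0])
    pos = 1 if has_name else 0
    nxt = tokens[pos] if pos < len(tokens) else None

    # name "(" stuff ")" "="   or   "[" stuff "]" "="
    if nxt == ("(" if has_name else "["):
        end = skip_balanced(tokens, pos)
        if end is not None and end < len(tokens) and tokens[end] == "=":
            return "destructuring"

    if has_name and nxt == ":=":
        return "variable"
    if has_name and nxt == "=":
        return "constant"
    if has_name and nxt != "(":
        if nxt is None or nxt == ")":
            return "forward"
        raise SyntaxError("expected ) or end of statement")
    return "method"


kinds = [
    classify_def(["x", "=", "5"]),
    classify_def(["x", ":=", "5"]),
    classify_def(["f"]),
    classify_def(["f", "(", "a", "integer", ")", "a", "+", "1"]),
    classify_def(["point", "(", "x", ",", "y", ")", "=", "p"]),
]
```

The sketch reports a destructuring constant definition separately from a plain one; both are constant definitions in the sense of the rules above.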
A method_modifiers datum represents the modifiers and description that can be attached to a method. It can be parsed as the methodmodifiers syntactic construct. It could have been defined by
    defclass method_modifiers(named: sealed boolean, dominant boolean, intrinsic boolean,
                              description string | false)

    ;; True if this method_modifiers object can be discarded
    def empty?(mm method_modifiers)
      not (mm.sealed or mm.dominant or mm.intrinsic or mm.description)

    def parse_methodmodifiers(lexer, indentation, scope, required?)
      block exit: return
        def sealed := false
        def dominant := false
        def intrinsic := false
        def description := false
        while true
          if match?(lexer, #sealed:) then sealed := true
          else if match?(lexer, #dominant:) then dominant := true
          else if match?(lexer, #intrinsic:) then intrinsic := true
          else if match?(lexer, #description:)
            description := parse_string(lexer, indentation, scope, true)
            match?(lexer, #\,)
          else if sealed or dominant or intrinsic or description
            return(method_modifiers(sealed: sealed, dominant: dominant,
                                    intrinsic: intrinsic, description: description))
          else if required?
            parse_error(lexer, "Required method modifiers not present")
          else return(false)
There is also a utility method to help with converting modifiers placed in front of a def statement into method_modifiers.
    def add_statement_modifiers(mm method_modifiers, statement_modifiers set![name])
      if remove!(statement_modifiers, #sealed) then mm.sealed := true
      if remove!(statement_modifiers, #dominant) then mm.dominant := true
      if remove!(statement_modifiers, #intrinsic) then mm.intrinsic := true
      mm
A formal_parameters datum represents the formal parameters of a method. It can be parsed as the formalparameters syntactic construct. It is variable but only the parser should modify it.
It could have been defined by
    defclass formal_parameters(named:
        required = [] list[formal_parameter_definition],
        optional = [] list[formal_parameter_definition],
        named = [] list[formal_parameter_definition],
        rest = false formal_parameter_definition | false,
        scope scope)        ; scope where all parameters are visible

    def parse_formalparameters(lexer, indentation, scope, required?)
      block exit: return
        def initial_name := false
        if not required?
          ;; Check if formal parameters are present
          initial_name := parse_name(lexer, indentation, scope, false)
          if not initial_name and next(lexer) ~= #optional: and next(lexer) ~= #named:
            return(false)
        def result := formal_parameters(scope: scope)
        def mode := #required
        while true
          match_newline?(lexer, indentation, true)
          def selector = if mode = #named and next(lexer) in keyword then next!(lexer).name
          if def parameter_name = initial_name or parse_name(lexer, indentation, scope, false)
            initial_name := false
            def default = if mode ~= #required
                            if match?(lexer, #=)
                              parse_expression(lexer, indentation, scope, true)
                            else quotation(false)
            def type = evaluate_type(parse_expression(lexer, indentation, scope, false), scope)
            def selector2 = selector or mode = #named and name(parameter_name.spelling)
            def formal = formal_parameter_definition(name: parameter_name, type: type,
                                                     default: default,
                                                     scope_for_default: default and result.scope,
                                                     selector: selector2)
            ;; Scope for next formal parameter's default, and body, includes this formal parameter
            def new_scope = formal_parameter_scope(result.scope)
            add_definition(new_scope, parameter_name, formal)
            result.scope := new_scope

            ;; Stash formal in result
            if match?(lexer, #\...)
              ;; This is the rest parameter
              if default then parse_error(lexer, "rest parameter cannot have a default")
              if selector then parse_error(lexer, "rest parameter cannot have a selector keyword")
              result.rest := formal
              return(result)
            else if mode = #required
              result.required := list(result.required + [formal]...)
            else if mode = #optional
              result.optional := list(result.optional + [formal]...)
            else ; if mode = #named
              result.named := list(result.named + [formal]...)

            ;; See if there are more formal parameters
            if not match?(lexer, #\,)
              return(result)

          else if match?(lexer, #\#)
            ;; #constant abbreviated syntax
            if mode = #named then parse_error(lexer, "# parameter cannot be named or rest")
            def tempname = name("temp", macro_context(scope))
            def constant = next!(lexer)     ; name or keyword or integer or character
            def formal = formal_parameter_definition(name: tempname, type: set(constant))
            add_definition(result.scope, tempname, formal)

            ;; Stash formal in result
            if match?(lexer, #\...)
              parse_error(lexer, "# parameter cannot be named or rest")
            else if mode = #required
              result.required := list(result.required + [formal]...)
            else if mode = #optional
              result.optional := list(result.optional + [formal]...)

            ;; See if there are more formal parameters
            if not match?(lexer, #\,)
              return(result)

          else if mode = #required and match?(lexer, #optional:)
            mode := #optional
          else if (mode = #required or mode = #optional) and match?(lexer, #named:)
            mode := #named
          else
            ;; An unrecognized token ends the parsing
            return(result)
The types in formal parameters, variable definitions, etc. are actual types (instances of the class type), not names of types.
A type declaration must be converted from an expression to an actual type datum at compile time. This could have been done by
    def evaluate_type(expression expression | false, scope scope)
      if not expression
        ;; Type is defaulted
        everything
      else if expression in name
        def defn = lookup(scope, expression)
        if defn in known_definition and defn.value in type
          defn.value
        else error("$expression is not a defined type")
      else if expression in call_expression
        assert not (expression in spread_call_expression)   ; ugh!
        def actuals = mapf(evaluate_type(_, scope), expression.parameters)
        def fcn = if expression.function in function then expression.function
                  else
                    def defn = lookup(scope, expression.function)
                    if defn in known_definition and defn.value in function
                      defn.value
                    else error("Don't know how to call $(expression.function) at compile time")
        def result = fcn(actuals...)
        if result in type
          result
        else error("$expression does not evaluate to a type")
      else error("Don't know yet how to evaluate $expression at compile time")
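The idea that a type declaration is evaluated to a type datum at compile time can be exercised in a short Python sketch. It is only an analogy under simplifying assumptions: a scope is a plain dictionary, a type expression is either `None`, a name string, or a tuple of a function name followed by its parameter expressions, and the `Type` class and `EVERYTHING` constant are hypothetical stand-ins for Lunar's type objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Type:
    """Hypothetical stand-in for an actual type datum."""
    name: str


EVERYTHING = Type("everything")


def evaluate_type(expression, scope):
    """Turn a type expression into a Type, using only definitions known in scope."""
    if expression is None:
        # No declaration: the type defaults to everything.
        return EVERYTHING
    if isinstance(expression, str):
        value = scope.get(expression)
        if isinstance(value, Type):
            return value
        raise TypeError(f"{expression} is not a defined type")
    if isinstance(expression, tuple):
        function_name, *parameters = expression
        actuals = [evaluate_type(p, scope) for p in parameters]
        fcn = scope.get(function_name)
        if not callable(fcn):
            raise TypeError(f"Don't know how to call {function_name} at compile time")
        result = fcn(*actuals)
        if isinstance(result, Type):
            return result
        raise TypeError(f"{expression} does not evaluate to a type")
    raise TypeError(f"Don't know how to evaluate {expression} at compile time")


# A tiny scope: one known type and one type-constructing function.
scope = {
    "integer": Type("integer"),
    "list": lambda element: Type(f"list[{element.name}]"),
}
list_of_integer = evaluate_type(("list", "integer"), scope)
defaulted = evaluate_type(None, scope)
```

As in the definition above, only names whose values are already known at compile time can take part; anything else is reported as an error rather than deferred to run time.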
A method_head datum represents everything about a method except its body. This is separated out because of the require statement. It can be parsed as the methodhead syntactic construct. It could have been defined by
    defclass method_head (named: name name, modifiers method_modifiers,
                          formal_parameters formal_parameters, result_type type)

    def parse_methodhead(lexer, indentation, scope, required?)
      block exit: return
        if def function_name := parse_name(lexer, indentation, scope, false)
          ;; function(actual parameters) syntax
          parse_methodhead_after_name(lexer, indentation, scope, function_name)
        else
          ;; unary or binary operator syntax
          def parameters =
            if match?(lexer, #\()
              ;; "(" name1 type_expression1 ")" operator ( "(" name2 type_expression2 ")" |
              ;;                                           "#" constant )
              def name1 = parse_name(lexer, indentation, scope, true)
              def type1 = evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
              match!(lexer, #\))
              function_name := parse_operator(lexer, indentation, scope, true)
              if def macro_expander = known_definition(scope, function_name).infix_macro_expander
                def expansion = macro_expander(name1, lexer, indentation, scope,
                                               function_name.context, true)
                if expansion in method_head
                  ;; Put the real left hand side into the formal parameters
                  if first(expansion.formal_parameters.required) eq name1
                    first(expansion.formal_parameters.required) :=
                      formal_parameter_definition(name: name1, type: type1)
                  if match?(lexer, #\=>)
                    ;; [ "=>" result_type_expression ]
                    expansion.result_type :=
                      evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
                  return(expansion)
                else if expansion in call_expression and
                        length(expansion.parameters) >= 1 and
                        first(expansion.parameters) eq name1
                  def operator_name = function_name
                  function_name := expansion.function
                  if function_name in bundle
                    function_name := function_name.name
                  [formal_parameter_definition(name: name1, type: type1),
                   [(if arg in name
                       formal_parameter_definition(name: arg, type: everything)
                     else if arg in quotation
                       formal_parameter_definition(
                         name: name("temp", macro_context(scope)),
                         type: set(arg.datum))
                     else error("Don't know how to convert $expansion expansion of $operator_name operator into a method_head"))
                    for arg in rest(expansion.parameters)]...]
                else error("Don't know how to convert $expansion expansion of $function_name operator into a method_head")
              else
                def [name2, type2] =
                  if match?(lexer, #\()
                    def values = [parse_name(lexer, indentation, scope, true),
                                  evaluate_type(parse_expression(lexer, indentation, scope, true), scope)]
                    match!(lexer, #\))
                    values
                  else if match?(lexer, #\#)
                    [name("temp", macro_context(scope)),
                     block
                       def datum := next!(lexer)
                       if datum in name
                         ;; Strip hygienic context
                         datum := name(datum, false)
                       set(datum)]
                  else wrong_token_error(lexer, "( or #")
                [formal_parameter_definition(name: name1, type: type1),
                 formal_parameter_definition(name: name2, type: type2)]
            else if function_name := parse_operator(lexer, indentation, scope, false)
              ;; operator "(" name type_expression ")"
              match!(lexer, #\()
              def name1 = parse_name(lexer, indentation, scope, true)
              def type1 = evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
              match!(lexer, #\))
              [formal_parameter_definition(name: name1, type: type1)]
            else if required?
              wrong_token_error(lexer, "a name, an operator, or ( to start a method head")
            else return(false)
          def formals_scope = formal_parameter_scope(scope)
          for formal in parameters
            add_definition(formals_scope, formal.name, formal)
          def parameters = formal_parameters(required: parameters, scope: formals_scope)
          ;; Adjust function_name and parameters if ":=" "(" name type ")" follows
          def function_name = parse_methodhead_assignment(lexer, indentation, scope,
                                                          function_name, parameters)
          def result_type = if match?(lexer, #\=>)
                              ;; [ "=>" result_type_expression ]
                              evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
                            else everything
          return(method_head(name: function_name, modifiers: method_modifiers(),
                             formal_parameters: parameters, result_type: result_type))

    ;; Call this when the name has already been parsed
    ;; { name & "." }+ [ "[" generic_class_formalparameters "]" ] "(" methodmodifiers formalparameters ")"
    ;; [ ":=" "(" new_value_name type_expression ")" ]
    ;; [ "=>" result_type_expression ]
    def parse_methodhead_after_name(lexer, indentation, scope, function_name_arg)
      def function_name := function_name_arg
      while match?(lexer, #\.)
        function_name := name(parse_name(lexer, indentation, scope, true),
                              lookup_module(function_name))
      ;;---TODO insert generic_class_formalparameters support here
      match!(lexer, #\()
      def modifiers = parse_methodmodifiers(lexer, indentation, scope, false)
      def parameters = parse_formalparameters(lexer, indentation, scope, true)
      match!(lexer, #\))
      ;; Adjust function_name and parameters if ":=" "(" name type ")" follows
      function_name := parse_methodhead_assignment(lexer, indentation, scope,
                                                   function_name, parameters)
      def result_type = if match?(lexer, #\=>)
                          evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
                        else everything
      method_head(name: function_name, modifiers: modifiers,
                  formal_parameters: parameters, result_type: result_type)

    def parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)
      if match?(lexer, #\:=)
        match!(lexer, #\()
        def new_value_name = parse_name(lexer, indentation, scope, true)
        def new_value_type = evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
        match!(lexer, #\))
        if parameters.optional or parameters.named or parameters.rest
          parse_error(lexer, ":= can only be used with required formal parameters")
        def formal = formal_parameter_definition(name: new_value_name, type: new_value_type)
        parameters.required := list(parameters.required + [formal]...)
        parameters.scope := formal_parameter_scope(parameters.scope)
        add_definition(parameters.scope, new_value_name, formal)
        name(function_name.spelling + ":=", function_name)
      else function_name
The parsing of the def statement, with lookahead, could have been implemented by
    defmacro def =>
      block exit: return_macro_expansion
        ;; First step is pre-classification:
        ;;
        ;; A constant definition starts with tokens matching the pattern
        ;;    (name [ "(" stuff ")" ] | "[" stuff "]") "="
        ;; where stuff is any sequence of tokens containing balanced parentheses or brackets.
        ;; In other words, the left-hand side preceding = is either a name or a destructuring.
        ;;
        ;; A variable definition starts with a name followed by :=.
        ;;
        ;; A forward definition starts with a name but the next token is none of =, :=, or (.
        ;;
        ;; Anything else must be a method definition or a syntax error.
        def definition_type := #unknown
        def lookahead_buffer = stack()
        def initial_name = parse_name(lexer, indentation, scope, false)

        ;; Returns true if successful, false if statement ends inside brackets
        def parse_balanced_stuff(initiator name)
          block exit: return_from_parse_balanced_stuff
            def terminator = if same_spelling?(initiator, #\[) then #\]
                             else if same_spelling?(initiator, #\{) then #\}
                             else #\)
            push!(lookahead_buffer, initiator)
            while def token = next!(lexer)
              ;; implies return_from_parse_balanced_stuff(false) at EOF
              if token in newline and token.indentation <= indentation
                push!(lookahead_buffer, token)
                return_from_parse_balanced_stuff(false)
              else if token in name and (same_spelling?(token, #\() or
                                         same_spelling?(token, #\[) or
                                         same_spelling?(token, #\{))
                parse_balanced_stuff(token) or return_from_parse_balanced_stuff(false)
              else
                push!(lookahead_buffer, token)
                if same_spelling?(token, terminator)
                  return_from_parse_balanced_stuff(true)

        if same_spelling?(next(lexer), if initial_name then #\( else #\[) and
           parse_balanced_stuff(next!(lexer)) and
           same_spelling?(next(lexer), #=)
          ;; Lookahead matched, this is a destructuring constant definition
          insert!(lexer, lookahead_buffer)
          definition_type := #destructuring
        else
          ;; This is something other than a destructuring constant definition
          insert!(lexer, lookahead_buffer)
          if initial_name and match?(lexer, #\:=)
            definition_type := #variable
          else if initial_name and match?(lexer, #=)
            definition_type := #constant
          else if initial_name and not same_spelling?(next(lexer), #\()
            ;; Forward declaration or name followed by garbage
            if not next(lexer) or
               next(lexer) in newline and next(lexer).indentation <= indentation or
               same_spelling?(next(lexer), #\))
              definition_type := #forward
            else wrong_token_error(lexer, ") or end of statement")
          else definition_type := #method

        ;; Pre-classification is done and initial-name is parsed, do the real parsing
        def defn = case definition_type
          #forward =>
            known_definition(initial_name, bundle(name: initial_name))

          #variable =>
            def initial_value = parse_expression(lexer, indentation, scope, true)
            def type = evaluate_type(parse_expression(lexer, indentation, scope, false), scope)
            variable_definition(initial_name, type, initial_value)

          #constant =>
            def expr = parse_expression(lexer, indentation, scope, true)
            if expr in number | character | string
              known_definition(initial_name, expr)
            else if expr in quotation
              known_definition(initial_name, expr.datum)
            else if expr in name and lookup(scope, expr) in known_definition
              known_definition(initial_name, lookup(scope, expr).value)
            else constant_definition(initial_name, everything, expr)

          ;; #destructuring => TBD

          #method =>
            def head = if initial_name
                         parse_methodhead_after_name(lexer, indentation, scope, initial_name)
                       else parse_methodhead(lexer, indentation, scope, true)
            def body_scope = head.formal_parameters.scope
            def function_name = head.name
            def existing_definition = lookup1(scope, function_name)
            add_statement_modifiers(head.modifiers, modifiers)
            if existing_definition
              if existing_definition in known_definition and existing_definition.value in bundle
                return_macro_expansion(call_expression(name("add_method", modules.lunar),
                                                       function_name,
                                                       method_expression(head,
                                                         parse_body(lexer, indentation, body_scope, true))))
              else parse_error(lexer, "Incompatible definitions for $function_name")
            else
              ;; Not already defined directly in this scope, add the definition
              def bundle = bundle(name: function_name)
              def defn = known_definition(function_name, bundle)
              def expr = expand_definition(scope, function_name, defn)
              def new_scope = if scope in global_scope then scope
                              else if expr in name then scope
                              else ;; expand_definition created a new block_tail_scope
                                last(expr.expressions)
              ;; Reparent the body and formal parameter scopes on new_scope
              ;; so the method is inside its own scope, so it can be recursive
              for s = body_scope then s.parent while s using return
                if s.parent eq scope
                  s.parent := new_scope
                  return(false)
              ;; Parse the body and make an add_method call
              def am = call_expression(name("add_method", modules.lunar), function_name,
                                       method_expression(head,
                                         parse_body(lexer, indentation, body_scope, true)))
              ;; Plug it into the return value
              return_macro_expansion(if expr in prog_expression
                                       new_scope.expressions := new_scope.expressions + [am]
                                       expr
                                     else am)

        ;; Not a method, just a simple definition
        expand_definition(scope, initial_name, defn)