Lunar Programming Language

by David A. Moon
January 2017 - January 2018



Parsing the Def Statement

Parsing the def statement is a bit tricky because we use the same name for five different forms of definition, each with its own syntax. It is rather difficult to merge those five syntaxes into one LL(1) syntax. Instead we use look-ahead to pre-classify the def statement as a method definition (regular or generic), a constant definition, a variable definition, or a forward declaration.

The look-ahead remembers the tokens read from the source token stream, along with their source locations. Once the pre-classification is completed, the remembered tokens and source locations are returned to the token stream push back buffer so the same tokens can be parsed again.

A constant definition starts with tokens matching the pattern

(name [ "(" stuff ")" ] | "[" stuff "]") "="
where stuff is any sequence of tokens containing balanced parentheses or brackets. In other words, the left-hand side preceding = is either a name or a destructuring. If it's a destructuring, it's either a name followed by stuff in parentheses (a function call destructuring) or it is stuff in brackets (a list destructuring).

A variable definition starts with a name followed by :=.

A forward definition starts with a name but the next token is none of =, :=, or ( and marks the end of the statement.

Anything else must be a method definition or a syntax error.

See Definitions for how the def statement is parsed once it has been pre-classified.

Method Modifiers

A method_modifiers datum represents the modifiers and description that can be attached to a method. It can be parsed as the methodmodifiers syntactic construct. It could have been defined by

defclass method_modifiers(named: sealed      boolean,
                                 dominant    boolean,
                                 intrinsic   boolean,
                                 description string | false)

;; True if this method_modifiers object can be discarded
def empty?(mm method_modifiers)
  not (mm.sealed or mm.dominant or mm.intrinsic or mm.description)

def parse_methodmodifiers(lexer, indentation, scope, required?)
  block exit: return
    def sealed      := false
    def dominant    := false
    def intrinsic   := false
    def description := false
    while true
      if match?(lexer, #sealed:) then sealed := true
      else if match?(lexer, #dominant:) then dominant := true
      else if match?(lexer, #intrinsic:) then intrinsic := true
      else if match?(lexer, #description:)
        description := parse_string(lexer, indentation, scope, true)
        match?(lexer, #\,)
      else if sealed or dominant or intrinsic or description
        return(method_modifiers(sealed: sealed, dominant: dominant,
               intrinsic: intrinsic, description: description)
      else if required?
        parse_error(lexer, "Required method modifiers not present")
      else
        return(false)

There is also a utility method to help with converting modifiers placed in front of a def statement into method_modifiers.

def add_statement_modifiers(mm method_modifiers, statement_modifiers set![name])
  if remove!(statement_modifiers, #sealed)    then mm.sealed    := true
  if remove!(statement_modifiers, #dominant)  then mm.dominant  := true
  if remove!(statement_modifiers, #intrinsic) then mm.intrinsic := true
  mm

Parsing Formal Parameters

A formal_parameters datum represents the formal parameters of a method. It can be parsed as the formalparameters syntactic construct. It is variable but only the parser should modify it.

It could have been defined by

defclass formal_parameters(named:
  required = []                 list[formal_parameter_definition],
  optional = []                 list[formal_parameter_definition],
  named    = []                 list[formal_parameter_definition],
  rest     = false              formal_parameter_definition | false,
  scope                         scope)  ; scope where all parameters are visible

def parse_formalparameters(lexer, indentation, scope, required?)
  block exit: return
    def initial_name := false

    if not required?
      ;; Check if formal parameters are present
      initial_name := parse_name(lexer, indentation, scope, false)
      if not initial_name and next(lexer) ~= #optional: and next(lexer) ~= #named:
        return(false)

    def result := formal_parameters(scope: scope)
    def mode   := #required

    while true
      match_newline?(lexer, indentation, true)
      def selector = if mode = #named and next(lexer) in keyword then next!(lexer).name
      if def parameter_name = initial_name or parse_name(lexer, indentation, scope, false)
        initial_name := false
        def default  = if mode ~= #required
                         if match?(lexer, #=)
                           parse_expression(lexer, indentation, scope, true)
                         else
                           quotation(false)
        def type      = evaluate_type(parse_expression(lexer, indentation, scope, false), scope)
        def selector2 = selector or mode = #named and name(parameter_name.spelling)
        def formal    = formal_parameter_definition(name:              parameter_name,
                                                    type:              type,
                                                    default:           default,
                                                    scope_for_default: default and result.scope,
                                                    selector:          selector2)

        ;; Scope for next formal parameter's default, and body, includes this formal parameter
        def new_scope = formal_parameter_scope(result.scope)
        add_definition(new_scope, parameter_name, formal)
        result.scope := new_scope

        ;; Stash formal in result
        if match?(lexer, #\...)
          ;; This is the rest parameter
          if default  then parse_error(lexer, "rest parameter cannot have a default")
          if selector then parse_error(lexer, "rest parameter cannot have a selector keyword")
          result.rest := formal
          return(result)
        else if mode = #required
          result.required := list(result.required + [formal]...)
        else if mode = #optional
          result.optional := list(result.optional + [formal]...)
        else ; if mode = #named
          result.named := list(result.named + [formal]...)

        ;; See if there are more formal parameters
        if not match?(lexer, #\,)
          return(result)
      else if match?(lexer, #\#)
        ;; #constant abbreviated syntax
        if mode = #named then parse_error(lexer, "# parameter cannot be named or rest")
        def tempname = name("temp", macro_context(scope))
        def constant = next!(lexer)  ; name or keyword or integer or character
        def formal   = formal_parameter_definition(name: tempname,
                                                   type: set(constant))
        add_definition(result.scope, tempname, formal)

        ;; Stash formal in result
        if match?(lexer, #\...)
          parse_error(lexer, "# parameter cannot be named or rest")
        else if mode = #required
          result.required := list(result.required + [formal]...)
        else if mode = #optional
          result.optional := list(result.optional + [formal]...)

        ;; See if there are more formal parameters
        if not match?(lexer, #\,)
          return(result)
      else if mode = #required and match?(lexer, #optional:)
        mode := #optional
      else if (mode = #required or mode = #optional) and match?(lexer, #named:)
        mode := #named
      else
        ;; An unrecognized token ends the parsing
        return(result)

Type Declarations are Evaluated at Compile Time

The types in formal parameters, variable definitions, etc. are actual types (instances of the class type), not names of types.

A type declaration must be converted from an expression to an actual type datum at compile time. This could have been done by

def evaluate_type(expression expression | false, scope scope)
  if not expression
    ;; Type is defaulted
    everything
  else if expression in name
    def defn = lookup(scope, expression)
    if defn in known_definition and defn.value in type
      defn.value
    else
      error("$expression is not a defined type")
  else if expression in call_expression
    assert not (expression in spread_call_expression)  ; ugh!
    def actuals = mapf(evaluate_type(_, scope), expression.parameters)
    def fcn = if expression.function in function then expression.function
              else
                def defn = lookup(scope, expression.function)
                if defn in known_definition and defn.value in function
                  defn.value
                else
                  error("Don't know how to call $(expression.function) at compile time")
    def result = fcn(actuals...)
    if result in type
      result
    else
      error("$expression does not evaluate to a type")
  else
    error("Don't know yet how to evaluate $expression at compile time")

Parsing Method Heads

A method_head datum represents everything about a method except its body. This is separated out because of the require statement. It can be parsed as the methodhead syntactic construct. It could have been defined by

defclass method_head (named:
  name                          name,
  modifiers                     method_modifiers,
  formal_parameters             formal_parameters,
  result_type                   type)

def parse_methodhead(lexer, indentation, scope, required?)
  block exit: return
    if def function_name := parse_name(lexer, indentation, scope, false)
      ;; function(actual parameters) syntax
      parse_methodhead_after_name(lexer, indentation, scope, function_name)
    else
      ;; unary or binary operator syntax
      def parameters = if match?(lexer, #\()
                         ;; "(" name1 type_expression1 ")" operator ( "(" name2 type_expression2 ")" |
                         ;;                                           "#" constant )
                         def name1 = parse_name(lexer, indentation, scope, true)
                         def type1 = evaluate_type(parse_expression(lexer, indentation, scope, true),
                                                   scope)
                         match!(lexer, #\))
                         function_name := parse_operator(lexer, indentation, scope, true)
                         if def macro_expander = known_definition(scope, function_name).infix_macro_expander
                           def expansion = macro_expander(name1, lexer, indentation, scope,
                                                          function_name.context, true)
                           if expansion in method_head
                             ;; Put the real left hand side into the formal parameters
                             if first(expansion.formal_parameters.required) eq name1
                               first(expansion.formal_parameters.required) :=
                                 formal_parameter_definition(name: name1, type: type1)
                             if match?(lexer, #\=>)   ;; [ "=>" result_type_expression ]
                               expansion.result_type :=
                                 evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
                             return(expansion)
                           else if expansion in call_expression and
                                   length(expansion.parameters) >= 1 and
                                   first(expansion.parameters) eq name1
                                  def operator_name = function_name
                                  function_name := expansion.function
                                  if function_name in bundle
                                    function_name := function_name.name
                                  [formal_parameter_definition(name: name1, type: type1),
                                   [(if arg in name
                                       formal_parameter_definition(name: arg, type: everything)
                                     else if arg in quotation
                                       formal_parameter_definition(
                                         name: name("temp", macro_context(scope)),
                                         type: set(arg.datum))
                                     else
                                       error("Don't know how to convert $expansion expansion of $operator_name operator into a method_head"))
                                    for arg in rest(expansion.parameters)]...]
                           else
                             error("Don't know how to convert $expansion expansion of $function_name operator into a method_head"))
                         else
                           def [name2, type2] = if match?(lexer, #\()
                                                  def values =
                                                    [parse_name(lexer, indentation, scope, true),
                                                     evaluate_type(parse_expression(lexer, indentation,
                                                                                  scope, true),
                                                                  scope)]
                                                  match!(lexer, #\))
                                                  values
                                                else if match?(lexer, #\#)
                                                  [name("temp", macro_context(scope)),
                                                   block
                                                     def datum := next!(lexer)
                                                     if datum in name
                                                       ;; Strip hygienic context
                                                       datum := name(datum, false)
                                                     set(datum)]
                                                else
                                                  wrong_token_error(lexer, "( or #")
                           [formal_parameter_definition(name: name1, type: type1),
                            formal_parameter_definition(name: name2, type: type2)]
                       else if function_name := parse_operator(lexer, indentation, scope, false)
                         ;; operator "(" name type_expression ")"
                         match!(lexer, #\()
                         def name1 = parse_name(lexer, indentation, scope, true)
                         def type1 = evaluate_type(parse_expression(lexer, indentation, scope, true),
                                                   scope)
                         match!(lexer, #\))
                         [formal_parameter_definition(name: name1, type: type1)]
                       else if required?
                         wrong_token_error(lexer, "a name, an operator, or ( to start a method head")
                       else return(false)
      def formals_scope = formal_parameter_scope(scope)

      for formal in parameters
        add_definition(formals_scope, formal.name, formal)

      def parameters    = formal_parameters(required: parameters, scope: formals_scope)
      ;; Adjust function_name and parameters if ":=" "(" name type ")" follows
      def function_name = parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)
      def result_type   = if match?(lexer, #\=>)   ;; [ "=>" result_type_expression ]
                            evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
                          else everything

      return(method_head(name:              function_name,
                         modifiers:         method_modifiers(),
                         formal_parameters: parameters,
                         result_type:       result_type))

;; Call this when the name has already been parsed
;; { name & "." }+ [ "[" generic_class_formalparameters "]" ] "(" methodmodifiers formalparameters ")"
;; [ ":=" "(" new_value_name type_expression ")" ]
;; [ "=>" result_type_expression ]
def parse_methodhead_after_name(lexer, indentation, scope, function_name_arg)
  def function_name := function_name_arg
  while match?(lexer, #\.)
    function_name := name(parse_name(lexer, indentation, scope, true), lookup_module(function_name))

  ;;---TODO insert generic_class_formalparameters support here
  match!(lexer, #\()
  def modifiers  = parse_methodmodifiers(lexer, indentation, scope, false)
  def parameters = parse_formalparameters(lexer, indentation, scope, true)
  match!(lexer, #\))

  ;; Adjust function_name and parameters if ":=" "(" name type ")" follows
  function_name := parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)

  def result_type = if match?(lexer, #\=>)
                      evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
                    else
                      everything

  method_head(name:              function_name,
              modifiers:         modifiers,
              formal_parameters: parameters,
              result_type:       result_type)

def parse_methodhead_assignment(lexer, indentation, scope, function_name, parameters)
  if match?(lexer, #\:=)
    match!(lexer, #\()
    def new_value_name = parse_name(lexer, indentation, scope, true)
    def new_value_type = evaluate_type(parse_expression(lexer, indentation, scope, true), scope)
    match!(lexer, #\))
    if parameters.optional or parameters.named or parameters.rest
      parse_error(lexer, ":= can only be used with required formal parameters")
    def formal = formal_parameter_definition(name: new_value_name, type: new_value_type)
    parameters.required := list(parameters.required + [formal]...)
    parameters.scope    := formal_parameter_scope(parameters.scope)
    add_definition(parameters.scope, new_value_name, formal)
    name(function_name.spelling + ":=", function_name)
  else function_name

Parsing Def

The parsing of the def statement, with lookahead, could have been implemented by

defmacro def =>
  block exit: return_macro_expansion
    ;; First step is pre-classification:
    ;;
    ;;  A constant definition starts with tokens matching the pattern
    ;;    (name [ "(" stuff ")" ] | "[" stuff "]") "="
    ;; where stuff is any sequence of tokens containing balanced parentheses or brackets.
    ;; In other words, the left-hand side preceding = is either a name or a destructuring.
    ;;
    ;; A variable definition starts with a name followed by :=.
    ;;
    ;; A forward definition starts with a name but the next token is none of =, :=, or (.
    ;;
    ;; Anything else must be a method definition or a syntax error.
    def definition_type := #unknown
    def lookahead_buffer = stack()
    def initial_name     = parse_name(lexer, indentation, scope, false)

    ;; Returns true if successful, false if statement ends inside brackets
    def parse_balanced_stuff(initiator name)
      block exit: return_from_parse_balanced_stuff
        def terminator = if same_spelling?(initiator, #\[) then #\]
                         else if same_spelling?(initiator, #\{) then #\}
                         else #\)
        push!(lookahead_buffer, initiator)
        while def token = next!(lexer)  ;; implies return_from_parse_balanced_stuff(false) at EOF
          if token in newline and token.indentation <= indentation
            push!(lookahead_buffer, token)
            return_from_parse_balanced_stuff(false)
          else if token in name and (same_spelling?(token, #\() or
                                     same_spelling?(token, #\[) or
                                     same_spelling?(token, #\{))
            parse_balanced_stuff(token) or return_from_parse_balanced_stuff(false)
          else
            push!(lookahead_buffer, token)
            if same_spelling?(token, terminator)
              return_from_parse_balanced_stuff(true)

    if same_spelling?(next(lexer), if initial_name then #\( else #\[) and
       parse_balanced_stuff(next!(lexer)) and
       same_spelling?(next(lexer), #=)
      ;; Lookeahead matched, this is a destructuring constant definition
      insert!(lexer, lookahead_buffer)
      definition_type := #destructuring
    else
      ;; This is something other than a destructuring constant definition
      insert!(lexer, lookahead_buffer)
      if initial_name and match?(lexer, #\:=)
        definition_type := #variable
      else if initial_name and match?(lexer, #=)
        definition_type := #constant
      else if initial_name and not same_spelling?(next(lexer), #\()
        ;; Forward declaration or name followed by garbage
        if not next(lexer) or
           next(lexer) in newline and next(lexer).indentation <= indentation or
           same_spelling?(next(lexer), #\))
          definition_type := #forward
        else
          wrong_token_error(lexer, ") or end of statement")
      else
        definition_type := #method

    ;; Pre-classification is done and initial-name is parsed, do the real parsing
    def defn = case definition_type
      #forward  => known_definition(initial_name, bundle(name: initial_name))
      #variable => def initial_value = parse_expression(lexer, indentation, scope, true)
                   def type = evaluate_type(parse_expression(lexer, indentation, scope, false), scope)
                   variable_definition(initial_name, type, initial_value)
      #constant => def expr = parse_expression(lexer, indentation, scope, true)
                   if expr in number | character | string
                     known_definition(initial_name, expr)
                   else if expr in quotation
                     known_definition(initial_name, expr.datum)
                   else if expr in name and lookup(scope, expr) in known_definition
                     known_definition(initial_name, lookup(scope, expr).value)
                   else
                     constant_definition(initial_name, everything, expr)
      ;; #destructuring => TBD
      #method   => def head = if initial_name
                                parse_methodhead_after_name(lexer, indentation, scope, initial_name)
                              else
                                parse_methodhead(lexer, indentation, scope, true)
                   def body_scope = head.formal_parameters.scope
                   def function_name = head.name
                   def existing_definition = lookup1(scope, function_name)

                   add_statement_modifiers(head.modifiers, modifiers)

                   if existing_definition
                     if existing_definition in known_definition and
                        existing_definition.value in bundle
                       return_macro_expansion(call_expression(name("add_method", modules.lunar),
                                                              function_name,
                                                              method_expression(head,
                                                                parse_body(lexer, indentation,
                                                                           body_scope, true))))
                     else parse_error(lexer, "Incompatible definitions for $function_name")
                   else
                     ;; Not already defined directly in this scope, add the definition
                     def bundle    = bundle(name: function_name)
                     def defn      = known_definition(function_name, bundle)
                     def expr      = expand_definition(scope, function_name, defn)
                     def new_scope = if scope in global_scope then scope
                                     else if expr in name     then scope
                                     else
                                       ;; expand_definition created a new block_tail_scope
                                       last(expr.expressions)

                     ;; Reparent the body and formal parameter scopes on new_scope
                     ;; so the method is inside its own scope, so it can be recursive
                     for s = body_scope then s.parent while s using return
                       if s.parent eq scope
                         s.parent := new_scope
                         return(false)

                     ;; Parse the body and make an add_method call
                     def am = call_expression(name("add_method", modules.lunar),
                                              function_name,
                                              method_expression(head, parse_body(lexer, indentation,
                                                                                 body_scope, true)))
                     ;; Plug it into the return value
                     return_macro_expansion(if expr in prog_expression
                                              new_scope.expressions := new_scope.expressions + [am]
                                              expr
                                            else
                                              am)

    ;; Not a method, just a simple definition
    expand_definition(scope, initial_name, defn)


Previous page   Table of Contents   Next page



Creative Commons License
Lunar by David A. Moon is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Please inform me if you find this useful, or use any of the ideas embedded in it.
Comments and criticisms to dave underscore moon atsign alum dot mit dot edu.