Lunar Programming Language

by David A. Moon
January 2017 - January 2018



Expressions

Expressions are the fundamental unit of Lunar programs. An expression is a name, a literal, or an operator with its operand(s). The operands are themselves expressions (recursively) unless the operator has an idiosyncratic syntax. Every expression has a result, which is some datum. Where there is no meaningful result, by convention the result is false.

Some expressions do more than simply compute a result, so they are considered statements. Most statements start with a prefix operator.

Some statements define some kind of meaning for a name, so they are considered definitions. Definitions and all other statements are also expressions.

Bodies

A body is a series of one or more expressions to be executed consecutively. The result of the body is the result of the last expression. The scope of any definitions in the body is the rest of the body and does not extend outside the body. See Scope for full details of scopes.
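
The two rules above (the result is that of the last expression, and definitions stay inside the body) can be modeled with a short Python sketch. This is not Lunar code. The helpers run_body, define, and lookup are hypothetical names introduced only for illustration, and the sketch assumes that a definition's own result is false, which the text above does not state.

```python
from collections import ChainMap

def run_body(expressions, enclosing_scope):
    """Execute expressions in order and return the result of the last one."""
    # New definitions are stored in the fresh first mapping only, so they
    # are visible to later expressions but never leak into enclosing_scope.
    local_scope = ChainMap({}, enclosing_scope)
    result = False  # placeholder; a body always has at least one expression
    for expression in expressions:
        result = expression(local_scope)
    return result

def define(name, value):
    """Hypothetical definition; assumed here to have the result false."""
    def expression(scope):
        scope[name] = value
        return False
    return expression

def lookup(name):
    """Hypothetical name reference: its result is the name's value."""
    return lambda scope: scope[name]

outer = {"x": 1}
body_result = run_body([define("y", 10), lookup("x"), lookup("y")], outer)
```

Here body_result is the value of the last expression, and outer is left without any entry for y once the body has finished.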

If a body directly contains more than one expression, each expression must be on a separate line and all lines must have the same indentation. For example, you can write

if f(x) then g(x) else e(x)

and you can write

if f(x) then g(x)
else e(x)

and you can write

if f(x)
  g(x)
  h(x)
else e(x)

but you cannot write

if f(x) then g(x)
             h(x)
else e(x)

because g(x) is not on a separate line.

Bodies are ubiquitous in Lunar, appearing inside many statements.

A Lunar source file resembles a body, but in global scope. Any other body creates a local scope.

In a body, any top-level expression can be preceded by modifiers. A modifier is a keyword that is understood by the expression it precedes. For example, a class definition can be preceded by the abstract: modifier to indicate that the class cannot have any direct members; only its subclasses can have direct members. In a local scope, modifiers are only used with definitions.

In the top level of a Lunar source file, module-related modifiers can precede their idiosyncratic parameters. See Module-Related Modifiers.

Operators

Syntactically, an operator is a name whose definition is a bundle with operator nature. Operators control and direct parsing of source code.

An operator has arity which controls how it parses its operands. There are six possible arities:

Arity    Meaning                                                            Example
unary    one operand follows the operator                                   -x
binary   two operands precede and follow the operator                       a+b
ternary  like binary but equal-precedence ternary operators can be chained  x<y<=z
prefix   idiosyncratic syntax follows the operator                          if x<y then z
infix    expression precedes and idiosyncratic syntax follows the operator  x := f(x)
suffix   expression precedes the operator                                   x++ (not predefined)

An operator can have unary or prefix arity and also another arity that is neither unary nor prefix. For example, - is both unary and binary.

Operators with unary or binary arity, and operators with ternary arity when not chained, are syntax for invoking a function with the same name as the operator. The actual parameters are the one or two operands.

Chained ternary operators are syntax for a series of function invocations linked by the and operator. Thus x < y <= z is the same as \<(x, y) and \<=(y, z) except that y is executed only once, which matters if it has side effects. Because and is short-circuiting, z is executed either once or not at all.

The \ operator used here causes the following token to be parsed as an ordinary name rather than as an operator (the following token is said to be denatured). \ can also be used at the end of a line to insert a newline where one would not normally be allowed.
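
The evaluation order of a chained comparison such as x < y <= z can be sketched in Python. This is a model, not Lunar code. The helpers chain and traced are hypothetical, and operands are passed as zero-argument functions only so that the sketch can count how often each one is executed. The sketch returns a plain Boolean, which may differ from what Lunar's and operator produces.

```python
import operator

def chain(first, *links):
    """Evaluate first OP1 e1 OP2 e2 ... as a chained comparison.

    Each link is a (compare, thunk) pair. Every thunk is called at most
    once, and evaluation stops at the first comparison that is false.
    """
    left = first()
    for compare, thunk in links:
        right = thunk()          # executed once, then reused as the next left operand
        if not compare(left, right):
            return False         # short-circuit: later operands are never executed
        left = right
    return True

calls = []

def traced(name, value):
    """Return a thunk that records its name each time it is executed."""
    def thunk():
        calls.append(name)
        return value
    return thunk

# x < y <= z with x=1, y=5, z=5: every operand is executed exactly once.
result_true = chain(traced("x", 1),
                    (operator.lt, traced("y", 5)),
                    (operator.le, traced("z", 5)))
calls_true = list(calls)

# x < y <= z with x=9, y=5: the first comparison fails, so z is never executed.
calls.clear()
result_false = chain(traced("x", 9),
                     (operator.lt, traced("y", 5)),
                     (operator.le, traced("z", 5)))
calls_false = list(calls)
```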

A unary, binary, or ternary operator can stand by itself as an expression, with no operand preceding or following. The result is the operator name's value, a bundle.

An optional newline can be inserted after a binary or ternary operator and after many infix operators. This allows an expression to extend onto a continuation line. The indentation of the newline must be greater than the indentation of the line where the expression began.

Operators with prefix, infix, or suffix arity are macros which take over parsing of the tokens after the operator. Sometimes the idiosyncratic syntax is simply an expression, and the purpose of the macro is not to parse special syntax but to produce something other than a simple function invocation.

A suffix operator is actually an infix operator whose idiosyncratic syntax is empty.

An operator has precedence which controls the order of wrapping up operands and operators into function invocations. A numerically larger precedence means higher binding power. When an operand appears between two binary, ternary, infix, or suffix operators, the operator with higher precedence takes the operand. If the precedences are equal, the operator on the left takes the operand.

An operator has separate left precedence and right precedence. The right precedence of the operator on the left is compared to the left precedence of the operator on the right. Typically the left and right precedences are equal, but the right precedence can be lower than the left to make the operator right-associative.
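
The comparison of right and left precedences can be illustrated with a small precedence-climbing parser in Python. This is only a sketch under stated assumptions: it handles binary operators with single-name operands, the function names parse and parse_all are hypothetical, and it is not the parser Lunar actually uses. The precedence numbers are taken from the operator table in this section.

```python
# Operator name -> (left precedence, right precedence).
PRECEDENCE = {
    "+": (50, 50),
    "-": (50, 50),
    "*": (60, 60),
    "^": (71, 70),   # right precedence lower than left: right-associative
}

def parse(tokens, pos=0, right_prec_of_left=-1):
    """Parse binary operators only; return (tree, next position).

    right_prec_of_left is the right precedence of the operator to the
    left of the operand about to be read (-1 when there is none).
    """
    tree = tokens[pos]           # an operand is a single name in this sketch
    pos += 1
    while pos < len(tokens):
        op = tokens[pos]
        left_prec, right_prec = PRECEDENCE[op]
        # The operator on the right takes the operand only if its left
        # precedence is strictly higher; on a tie the operator on the left wins.
        if left_prec <= right_prec_of_left:
            break
        rhs, pos = parse(tokens, pos + 1, right_prec)
        tree = (op, tree, rhs)
    return tree, pos

def parse_all(source):
    """Parse a whitespace-separated expression into a nested tuple."""
    tree, _ = parse(source.split())
    return tree

left_assoc = parse_all("a - b - c")    # equal precedences: groups to the left
right_assoc = parse_all("a ^ b ^ c")   # 70 < 71: groups to the right
mixed = parse_all("a + b * c")         # * binds tighter than +
```

In a ^ b ^ c the operand b sits between two ^ operators. The right precedence of the left one (70) is lower than the left precedence of the right one (71), so the right operator takes b.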

No operators are built into the syntax, but the lunar module exports the following operators:

name  precedence  arity          methods or uses
+     50          unary, binary  numeric addition, sequence concatenation
-     50          unary, binary  numeric subtraction
*     60          binary         numeric multiplication, sequence replication
/     60          binary         numeric division
mod   60          binary         numeric modulus (remainder)
^     71,70       binary         numeric exponentiation
&     60          binary         bitwise and, set/type intersection
|     50          binary         bitwise or, set/type union
~     50          unary, binary  bitwise xor or complement
(     80          prefix, infix  grouping, function call
[     80          prefix, infix  subscripting, list/map display, comprehension
{     80          prefix, infix  set display, comprehension
.     90          infix          slot read
..    40          binary         inclusive range
..<   40          binary         inclusive-exclusive range
<..   40          binary         exclusive-inclusive range
<..<  40          binary         exclusive range
:=    80,0        infix          assignment
=     30          ternary        equality (applicable to all data)
~=    30          ternary        not =
<     30          ternary        numeric or sequence less than
<=    30          ternary        numeric or sequence less than or equal
>     30          ternary        numeric or sequence greater than
>=    30          ternary        numeric or sequence greater than or equal
eq    25          binary         same exact datum
<<    75          binary         integer left shift
>>    75          binary         integer right shift
#                 prefix         literal
`                 prefix         template
\                 prefix         denature
and   20          infix          short-circuiting logical and
or    10          infix          short-circuiting logical or
not               unary          logical complement
in    40          binary         set/type/sequence/map membership
as    0           binary         up-cast

TODO Unicode equivalents

The statement and definition prefix operators are described elsewhere.

Particles

A particle is a name that is recognized in a particular piece of syntax but is not defined as an operator. Its identity depends on its spelling, not on its definition.

The following particles are the principal ones used by macros exported by the lunar module:

$     interpolation
,     separates members of a list
)     matches (
]     matches [
}     matches {
...   sequence of parameters
then  consequent in an if statement
else  alternative in an if statement
=     constant as opposed to variable
:=    variable as opposed to constant
=>    end of pattern, end of case, result type

See Patterns for additional particles used in syntax patterns.

Grouping

Parentheses are used for grouping (or overriding operator precedence) in the usual way. This is not built into the language, but could have been defined by

defmacro \( expression ")" => expression

except that the actual implementation uses a special parser that defers currying to the containing expression. See Curried Functions.

Function Call

Infix parenthesis is the function call operator, as in most programming languages. See Actual Parameter List for the syntax of the actual parameters enclosed in parentheses.

The actual parameters can optionally be followed by the token ... which indicates that the result of the last actual parameter expression must be a sequence. Each member of that sequence becomes an actual parameter.

This could have been defined by

defoperator \(
  precedence: 80
  macro: function "(" actualparameters [ "..." ellipsis_flag ] ")" =>
    (if ellipsis_flag then spread_call_expression else call_expression)(
      function, actualparameters...)
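
The runtime effect of a trailing ... can be approximated in Python, where the * unpacking syntax plays a similar role. This is an analogy rather than Lunar's implementation. The helper names call and call_spreading_last are hypothetical, and they model the invocation itself, not the call_expression and spread_call_expression objects produced by the macro above.

```python
def call(function, *actuals):
    """Ordinary call: each actual parameter is passed unchanged."""
    return function(*actuals)

def call_spreading_last(function, *actuals):
    """Call in which the last actual parameter must be a sequence.

    Each member of that sequence becomes an actual parameter of its own.
    At least one actual parameter is required.
    """
    *leading, last = actuals
    return function(*leading, *last)

def total(*numbers):
    """Example callee: add up however many parameters it receives."""
    return sum(numbers)

plain = call(total, 1, 2, 3)                    # models total(1, 2, 3)
spread = call_spreading_last(total, 1, [2, 3])  # models total(1, [2, 3]...)
```

Both calls deliver the same three parameters to total, so plain and spread are equal.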

Actual Parameter List

An actual parameter list is a list of zero or more expressions, separated by commas. In the simplest case, the result of each expression is a parameter for a function invocation.

Each expression can optionally be preceded by a keyword, written as one token name:. This supplies an additional parameter for the function invocation. The parameter's value is the keyword's name, a name object with no hygienic context. Often this serves as a selector to select a named formal parameter, but that is not required.

An optional newline can be inserted after the opening left parenthesis or a comma. This allows an actual parameter list to extend onto a continuation line. The indentation of the newline must be greater than the indentation of the line where the parameter list began.

The parameters in a call_expression or spread_call_expression are the result of parsing an actual parameter list. Each keyword has been converted to a quotation of its name with no hygienic context.

This could have been defined by

;;; Handles keywords, returns list of expressions
;;; Caller must handle "..."
defsyntax actualparameters { [ ^^ ] [ keyword ] expression & "," }* =>
  for k in keyword, e in expression using collect
    if k then collect quotation(k.name)
    collect e

;;; Or if you prefer to write the parser by hand
def parse_actualparameters(lexer, indentation, scope, required?)
  block exit: return
    def result = stack()
    def loop()
      match_newline?(lexer, indentation, true)
      if next(lexer) in keyword
        push!(result, quotation(next!(lexer).name))
      if not def exp = parse_expression(lexer, indentation, scope, not empty?(result))
        return(false)
      push!(result, exp)
      if match?(lexer, #\,)
        loop()
    loop()
    list(result...)
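
The keyword handling in the definitions above can be mirrored in a few lines of Python. This sketch is not Lunar code. It assumes the list has already been parsed into (keyword, expression) pairs, represents expressions as plain strings, and uses a ("quotation", name) tuple as a stand-in for a quotation object. The function name flatten_actual_parameters is hypothetical.

```python
def flatten_actual_parameters(parsed):
    """Turn (keyword, expression) pairs into a flat parameter list.

    keyword is None when the expression has no keyword. A keyword
    contributes one extra parameter, its quoted name, placed immediately
    before the expression it precedes.
    """
    result = []
    for keyword, expression in parsed:
        if keyword is not None:
            result.append(("quotation", keyword))
        result.append(expression)
    return result

# Models the actual parameter list of  f(x, size: n + 1)
flattened = flatten_actual_parameters([(None, "x"), ("size", "n + 1")])
```

The flattened list has three entries: the expression x, the quoted name size, and the expression n + 1.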

Literals

A literal is an expression whose result is a datum specified directly in the source code of the program.

Numbers, characters, and strings have lexical syntax that allows them to be written directly as literals.

There are no Boolean literals, but the names false and true have constant definitions in the lunar module.

A literal name can be written using the # macro. The # is followed by one of three things: a non-punctuation name, \ and any name, or a string that is the spelling of the desired name. A literal name never has hygienic context.

Punctuation tokens other than \ following # are reserved for future expansion.





Lunar by David A. Moon is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Please inform me if you find this useful, or use any of the ideas embedded in it.
Comments and criticisms to dave underscore moon atsign alum dot mit dot edu.