Programming Language for Old Timers

by David A. Moon
February 2006 .. September 2008

Comments and criticisms to dave underscore moon atsign alum dot mit dot edu.


Newlines

In PLOT, a newline is "invisible" punctuation separating two tokens that are not on the same line. A newline can only appear where allowed by the grammar. For example, within an expression a newline can only appear after an infix operator. This permits newlines to be "statement" delimiters with no need to use something like semicolons for that purpose. Two expressions can be separated by only a newline without syntactic ambiguity, even when the second expression begins with a token that is both a prefix operator and an infix operator. An expression can be continued onto another line provided that the preceding line ends in an infix operator and the next line is indented more than the line where the expression started.
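The continuation rule at the end of the paragraph above can be illustrated with a small Python sketch. This is not PLOT's lexer: the operator set, the whitespace-based tokenization, and the function names are assumptions made only for this example.

```python
# Hypothetical set of infix operators; PLOT's real operator table is not shown here.
INFIX_OPERATORS = {"+", "-", "*", "/", "and", "or"}


def indentation(line):
    """Count the leading spaces of a line."""
    return len(line) - len(line.lstrip(" "))


def continues_expression(start_line, previous_line, next_line):
    """Return True if next_line continues the expression begun on start_line.

    The rule modeled: the preceding line must end in an infix operator and
    the next line must be indented more than the line where the expression
    started. Tokens are split on whitespace, which is a simplification.
    """
    words = previous_line.split()
    ends_in_infix = bool(words) and words[-1] in INFIX_OPERATORS
    return ends_in_infix and indentation(next_line) > indentation(start_line)
```

Under these assumptions, `continues_expression("a +", "a +", "    b")` is true, while a line `a` followed by `    - b` is not a continuation, because the first line does not end in an infix operator. This is how a token such as `-`, which is both a prefix and an infix operator, can start a new expression without ambiguity.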

The lexical syntax does not currently provide a way to hide a newline, e.g. by putting \ at the end of a line as in some other languages.

The indentation of the next token after a newline is significant. This allows nesting structure to be indicated by indentation rather than by any kind of explicit bracketing.

Newlines are always optional, but many constructs require them to be used consistently: either every place where a newline is allowed has a newline, or there are no newlines in the construct. Furthermore, every newline in such a construct must be followed by the same indentation. Requiring consistent indentation makes it easier to diagnose syntax errors that would otherwise produce a nesting structure different from what the programmer intended. Expressions are an exception to this rule: they allow inconsistent use of newlines, where only some infix operators are followed by a newline and the indentations can vary.

If one construct is nested inside another, indentation within the inner construct must never be less than indentation within the outer construct. Most constructs require indentation within them to be strictly greater than the indentation within the containing construct, but there are a few exceptions (e.g. the else clause of an if statement can be indented the same as the if).

In the above rules, a "construct" is a portion of a program parsed by one parse method. When a parse method is written using syntax patterns, the method parses exactly one construct. When a parse method is written directly in raw imperative form without using patterns, the parse method parses one or more constructs, inserting construct boundaries wherever it likes.

These rules imply that indentation greater than the current indentation indicates either a continuation line or the start of a nested construct such as a method body or a class definition body. Such a nested construct ends when indentation returns to its previous value.
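As a rough illustration of how nesting can be recovered from indentation alone, the following Python sketch tracks a stack of enclosing indentations. It is a simplification and not PLOT's parser: it treats every increase in indentation as the start of a nested construct, ignores continuation lines, and the function name is hypothetical.

```python
def nesting_depths(indents):
    """Map each line's indentation to a nesting depth.

    indents is a list with the indentation of each successive line.
    Simplifying assumption: every increase opens a nested construct
    (continuation lines are not modeled), and a construct ends when
    indentation drops back below its own level.
    """
    stack = [0]  # indentation of each enclosing construct; outermost is 0
    depths = []
    for indent in indents:
        # Close every construct whose indentation is deeper than this line.
        while indent < stack[-1]:
            stack.pop()
        # A deeper line opens a new nested construct.
        if indent > stack[-1]:
            stack.append(indent)
        depths.append(len(stack) - 1)
    return depths
```

For the indentations `[0, 2, 2, 4, 2, 0]` this sketch yields the depths `[0, 1, 1, 2, 1, 0]`: the nested construct opened at indentation 4 ends as soon as a line returns to indentation 2.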

All of the above rules are implemented through the parse-newline? function.

defun parse-newline?(tokens is token-stream,
                     min-indent is integer,
                   optional:
                     required-indent is integer or false,
                     match-next-token is name or keyword or number or
                                         string or character or false,
                     forbid-next-tokens is sequence or false)
      is integer or false
  ...

The result of parse-newline? is false if it failed to match, -1 if it matched but no newline is present, or the indentation of the next line if it matched a newline. The result can be fed back into the required-indent argument in the next call if consistent use of newlines is required.

The min-indent argument enforces the rule that indentation within an inner construct cannot be less than, and in most constructs cannot be equal to, indentation within the containing construct.

The required-indent argument enforces consistent use and indentation of newlines when not false.

The match-next-token argument allows for LL(2) parsing in situations like the else clause of an if statement where a newline is only consumed if a specific token follows. match-next-token can be a name, keyword, or literal.

The forbid-next-tokens argument also allows for LL(2) parsing. It is a sequence of names, keywords, or literals. The newline is only consumed if the next token is not in this sequence.

If parse-newline? does not see a newline and required-indent is false, it returns -1. If it does not see a newline and required-indent is not false, it returns false.

If parse-newline? sees a newline, the indentation is greater than or equal to min-indent, required-indent is specified, and the indentation is greater than required-indent, it signals an error reporting inconsistent indentation.

If parse-newline? sees a newline but the indentation is less than min-indent, or required-indent is specified and not equal to the indentation, it returns false.

If match-next-token is specified and not equal to the token after the newline, parse-newline? returns false.

If forbid-next-tokens is specified and the token after the newline is in that sequence, parse-newline? returns false.

Otherwise, parse-newline? consumes the newline. If match-next-token is specified it also consumes the token after the newline. It then returns the indentation of the newline.
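The rules above can be collected into one function. The following is a minimal Python sketch, not the actual PLOT implementation. Several things are assumptions made for illustration: the `TokenStream` and `Newline` classes, the exception name, and the use of `None` to stand in for PLOT's false (so that an indentation of 0 cannot be confused with failure). The text does not say how a result of -1 is treated when fed back as required-indent, so the sketch does not special-case it.

```python
class InconsistentIndentation(Exception):
    """Signaled when a newline is indented more than required_indent."""


class Newline:
    """Hypothetical newline marker carrying the next line's indentation."""

    def __init__(self, indent):
        self.indent = indent


class TokenStream:
    """Hypothetical stand-in for PLOT's token-stream: a list plus a cursor."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.items[i] if i < len(self.items) else None


def parse_newline(tokens, min_indent, required_indent=None,
                  match_next_token=None, forbid_next_tokens=None):
    """Return None (no match), -1 (matched, no newline), or the indentation."""
    item = tokens.peek()
    if not isinstance(item, Newline):
        # No newline: this matches only when no particular indentation is required.
        return -1 if required_indent is None else None
    indent = item.indent
    if (indent >= min_indent and required_indent is not None
            and indent > required_indent):
        raise InconsistentIndentation(
            f"expected indentation {required_indent}, found {indent}")
    if indent < min_indent:
        return None
    if required_indent is not None and indent != required_indent:
        return None
    following = tokens.peek(1)  # the token after the newline
    if match_next_token is not None and following != match_next_token:
        return None
    if forbid_next_tokens is not None and following in forbid_next_tokens:
        return None
    # Consume the newline, and the matched token if one was requested.
    tokens.pos += 2 if match_next_token is not None else 1
    return indent
```

For example, with the stream `[Newline(4), "else", "x"]`, the call `parse_newline(ts, 4, match_next_token="else")` returns 4 and advances past both the newline and `else`. With any other token after the newline, it returns `None` and consumes nothing.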

In the pattern language, a newline is parsed by one of the four line break patterns based on the ^ character:

^ matches a consistent, indented newline. min-indent is the current indentation + 1. required-indent is initially false, then is the result of the previous parse-newline? call in the same pattern.
^^ matches an inconsistent, indented newline. min-indent is the current indentation + 1. required-indent is false.
^= matches a consistent, non-indented newline. min-indent is the current indentation. required-indent is initially false, then is the result of the previous parse-newline? call in the same pattern.
^^= matches an inconsistent, non-indented newline. min-indent is the current indentation. required-indent is false.
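The four patterns differ only in the min-indent and required-indent arguments they pass to parse-newline?. The Python sketch below makes that mapping explicit. The function name and the use of `None` for PLOT's false are assumptions for illustration; `previous_result` stands for the result of the previous parse-newline? call in the same pattern, if there was one.

```python
LINE_BREAK_PATTERNS = ("^", "^^", "^=", "^^=")


def line_break_args(pattern, current_indent, previous_result=None):
    """Return (min_indent, required_indent) for a line break pattern.

    None plays the role of PLOT's false for required_indent.
    """
    if pattern not in LINE_BREAK_PATTERNS:
        raise ValueError(f"not a line break pattern: {pattern!r}")
    indented = not pattern.endswith("=")       # ^ and ^^ demand deeper indentation
    consistent = not pattern.startswith("^^")  # ^ and ^= demand consistent newlines
    min_indent = current_indent + 1 if indented else current_indent
    required_indent = previous_result if consistent else None
    return min_indent, required_indent
```

With a current indentation of 4, `line_break_args("^", 4)` gives `(5, None)` for the first line break in a pattern. If that call matched a newline indented by 6, the next `^` in the same pattern would use `line_break_args("^", 4, 6)`, which gives `(5, 6)`. The inconsistent patterns ignore the previous result, so `line_break_args("^^=", 4, 6)` gives `(4, None)`.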

The current indentation is an argument to a parse method. At the outermost level of the program it is zero.
