Programming Language for Old Timers

by David A. Moon
February 2006 .. September 2008

Comments and criticisms to dave underscore moon atsign alum dot mit dot edu.


Token Streams

A token-stream is a stream whose elements are tokens. In addition to implementing the stream protocol, it keeps track of the current indentation and of source locations in the form of source-file and line-number.

The token-stream pseudo-constructor function can be given an input-file, a string, or a sequence of tokens. Users could write their own subclasses of the abstract class token-stream and their own methods for the pseudo-constructor if they needed to.

The abstract subclass character-lexer includes all token-stream types that read from any source of characters and use the lexical syntax rules to construct tokens. It contains the code and tables that represent the built-in lexical syntax. The classes input-file-lexer and string-lexer are useful subclasses.

The subclass token-sequence-stream reads from a sequence of tokens; this is used when reparsing the result of macro expansion.
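The class hierarchy and the pseudo-constructor described above can be modeled roughly in Python. This is an illustrative sketch and not PLOT code: the Python class names mirror the PLOT names, functools.singledispatch stands in for PLOT's method dispatch on the argument type, and the constructor bodies are placeholders, since the real lexer internals are not described here.

```python
import io
from abc import ABC
from collections.abc import Sequence
from functools import singledispatch


class TokenStream(ABC):
    """Stand-in for the abstract class token-stream."""


class CharacterLexer(TokenStream):
    """Stand-in for character-lexer: builds tokens from characters."""


class StringLexer(CharacterLexer):
    def __init__(self, text):
        self.text = text  # placeholder: a real lexer would tokenize lazily


class InputFileLexer(CharacterLexer):
    def __init__(self, file):
        self.file = file  # placeholder: a real lexer would read on demand


class TokenSequenceStream(TokenStream):
    def __init__(self, tokens):
        self.tokens = list(tokens)


@singledispatch
def token_stream(source):
    """Pseudo-constructor: picks a concrete class from the source type."""
    raise TypeError(f"no token stream for {type(source).__name__}")


@token_stream.register(str)
def _(source):
    return StringLexer(source)


@token_stream.register(io.TextIOBase)
def _(source):
    return InputFileLexer(source)


@token_stream.register(Sequence)
def _(source):
    # A str is also a Sequence, but the more specific str method wins.
    return TokenSequenceStream(source)
```

A user-defined source type would get its own subclass of TokenStream and one more registered method, which corresponds to users adding their own methods for the pseudo-constructor.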

Single-token lookahead is performed by applying the next function to a token-stream. This returns the next token without advancing the stream, or returns false if the token-stream has reached the end of its input or has reached a parsing barrier. A parsing barrier is either a line break parsing barrier or an expression parsing barrier that enforces operator precedence and associativity.

Use the end? function to test if a token-stream has reached the end of its input or a parsing barrier.

The stream protocol leaves the result of the next function undefined once the end? function returns true. Token-streams are stricter: when end? returns true for a token-stream, next is guaranteed to return false. Thus false in place of a token is an end-of-file indicator.
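The lookahead contract can be sketched as a small Python model. This is not PLOT code. Tokens are modeled as plain strings, at_end stands in for end?, and the barrier parameter is a hypothetical simplification that places a parsing barrier at a fixed token index; how barriers are really established is not covered in this section.

```python
class TokenSequenceStream:
    """Toy model of a token stream over an in-memory sequence of tokens."""

    def __init__(self, tokens, barrier=None):
        self._tokens = list(tokens)
        self._pos = 0
        # Hypothetical: index of the first token behind a parsing barrier.
        self._barrier = barrier

    def at_end(self):
        """Model of end?: true at end of input or at a parsing barrier."""
        if self._pos >= len(self._tokens):
            return True
        return self._barrier is not None and self._pos >= self._barrier

    def next(self):
        """Return the next token without advancing, or False when at_end()."""
        return False if self.at_end() else self._tokens[self._pos]

    def advance(self):
        """Move past the next token; does nothing when at_end()."""
        if not self.at_end():
            self._pos += 1


# Read tokens until the end of input or a barrier is reached.
stream = TokenSequenceStream(["x", "+", "y"], barrier=2)
seen = []
while not stream.at_end():
    seen.append(stream.next())
    stream.advance()
# seen is now ["x", "+"]; "y" is behind the barrier, so next() returns False.
```

In this model a caller can loop on next alone, because a False result always coincides with at_end returning true.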

Use the advance function to move past a token, as with any stream.

In addition to the functions next and advance of the stream protocol, token streams implement next-after-newline and advance-after-newline, which allow peeking past a newline token to see the token that follows it. If the next token is not a newline, these behave the same as next and advance. The match functions described in the next section use these.
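The newline-skipping functions might behave as in the following Python sketch. It is not PLOT code and rests on several assumptions: the NEWLINE constant is a hypothetical stand-in for the newline token, at most one newline token appears in a row, and advance_after_newline is taken to move past both the newline and the token after it.

```python
NEWLINE = "\n"  # hypothetical stand-in for the newline token


class TokenStream:
    """Toy token stream; parsing barriers are ignored to keep this short."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def next(self):
        """Return the next token without advancing, or False at the end."""
        if self._pos < len(self._tokens):
            return self._tokens[self._pos]
        return False

    def advance(self):
        """Move past the next token, if there is one."""
        if self._pos < len(self._tokens):
            self._pos += 1

    def next_after_newline(self):
        """Peek at the token after a newline; otherwise the same as next()."""
        if self.next() == NEWLINE:
            after = self._pos + 1
            if after < len(self._tokens):
                return self._tokens[after]
            return False
        return self.next()

    def advance_after_newline(self):
        """Skip a newline if present, then move past the following token."""
        if self.next() == NEWLINE:
            self.advance()
        self.advance()
```

A parser could use this to check whether a statement continues on the following line, for example by testing for "else" after a line break, without consuming the newline when the test fails.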

Use the current-indentation function to get the current indentation of a token stream, which is 0 when it is constructed. Use the current-indentation:= function to set the current indentation of a token stream.

Token-streams implement the source-locator protocol.

defprotocol token-stream is stream, source-locator
  next-after-newline(x is token-stream, result: token)
  advance-after-newline(x is token-stream)
  current-indentation(x is token-stream, result: indentation is integer)
  current-indentation(x is token-stream) := new-indentation is integer

def token-stream(source is string, result: stream is string-lexer) ...
def token-stream(source is input-file, result: stream is input-file-lexer) ...
def token-stream(source is sequence, result: stream is token-sequence-stream) ...
