Programming Language for Old Timers

by David A. Moon
February 2006 .. September 2008

Comments and criticisms to dave underscore moon atsign alum dot mit dot edu.

Token Streams

A token-stream is a stream whose elements are tokens. In addition to implementing the stream protocol, it handles newlines and indentation and keeps track of the current source location in the form of source-file and line-number.

The token-stream pseudo-constructor function accepts an input-file, a string, or a sequence of tokens. Users can define their own subclasses of the abstract class token-stream, and their own methods for the pseudo-constructor, if they need to.

The abstract subclass character-lexer includes all token-stream types that read from any source of characters and use the lexical syntax rules to construct tokens. It contains the code and tables that represent the built-in lexical syntax. The classes input-file-lexer and string-lexer are useful subclasses.

The subclass token-sequence-stream reads from a sequence of tokens; this is used when reparsing the result of macro expansion.
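The class hierarchy and the pseudo-constructor described above can be sketched in Python. This is only an illustration under stated assumptions, not the PLOT implementation: the Python class names mirror the PLOT names, io.TextIOBase stands in for input-file, and whitespace splitting stands in for the built-in lexical syntax, which is not reproduced here.

```python
import io
from abc import ABC, abstractmethod
from collections.abc import Sequence


class TokenStream(ABC):
    """Stands in for the abstract class token-stream."""

    @abstractmethod
    def next(self):
        """Return the next token without advancing, or False at end of input."""


class CharacterLexer(TokenStream):
    """Stands in for character-lexer: builds tokens from a source of characters."""

    def __init__(self, text):
        # Placeholder lexical syntax (an assumption): whitespace-separated words.
        self._tokens = text.split()
        self._pos = 0

    def next(self):
        return self._tokens[self._pos] if self._pos < len(self._tokens) else False


class StringLexer(CharacterLexer):
    """Reads characters from a string."""


class InputFileLexer(CharacterLexer):
    """Reads characters from an open text file (assumed stand-in for input-file)."""

    def __init__(self, file):
        super().__init__(file.read())


class TokenSequenceStream(TokenStream):
    """Reads from an existing sequence of tokens, e.g. the result of macro expansion."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def next(self):
        return self._tokens[self._pos] if self._pos < len(self._tokens) else False


def token_stream(source):
    """Pseudo-constructor: choose a concrete class from the type of the source."""
    # str is tested before Sequence because a Python str is also a Sequence.
    if isinstance(source, str):
        return StringLexer(source)
    if isinstance(source, io.TextIOBase):
        return InputFileLexer(source)
    if isinstance(source, Sequence):
        return TokenSequenceStream(source)
    raise TypeError(f"no token stream for {type(source).__name__}")
```

In PLOT the choice of class is made by method dispatch on the argument type. The chain of isinstance tests is a Python substitute for that dispatch, and a user-defined source type would need an extra branch here rather than an extra method.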

Single-token lookahead is performed by applying the next function to a token-stream. This returns the next token without advancing the stream, or returns false if the token-stream has reached the end of its input.

Use the end? function to test if a token-stream has reached the end of its input.

The stream protocol leaves the result of next undefined once a stream has reached its end. Token-streams are stricter: whenever end? returns true for a token-stream, next is guaranteed to return false. Thus false as a token is an end-of-file indicator.

Use the advance function to move past a token, as with any stream.
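A consumer of a token-stream typically combines next, end? and advance in a loop. The following Python sketch shows that pattern on a stream backed by a list of tokens. The names is_end (for end?) and collect_tokens are hypothetical, and making advance a no-op at end of input is an assumption, since the text above does not say what advance does there.

```python
class TokenSequenceStream:
    """Minimal stand-in for a token-stream backed by a sequence of tokens."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def is_end(self):
        """Corresponds to end?: true once every token has been consumed."""
        return self._pos >= len(self._tokens)

    def next(self):
        """Single-token lookahead: does not advance; returns False at end of input."""
        return False if self.is_end() else self._tokens[self._pos]

    def advance(self):
        """Move past the current token (assumed to do nothing at end of input)."""
        if not self.is_end():
            self._pos += 1


def collect_tokens(tokens):
    """Drain a stream, using False from next as the end-of-file indicator."""
    seen = []
    while True:
        token = tokens.next()
        if token is False:  # identity test, so a token equal to 0 is not mistaken for the end
            break
        seen.append(token)
        tokens.advance()
    return seen
```

Because next is guaranteed to return false at the end, the loop needs no separate call to end?. A generic stream consumer would have to test end? before each call to next.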

In addition to the functions next and advance of the stream protocol, token streams implement newline-indentation and token-after-newline, which provide the special handling of newlines and indentation that parse-newline? needs. Parse functions should use parse-newline?, rather than advance, to move past newlines.

Token-streams implement the source-locator protocol.

defprotocol token-stream is stream, source-locator
  ;; If looking at a newline, return its indentation
  ;; If looking at any other token or end of stream, return false
  newline-indentation(tokens is token-stream) is integer or false

  ;; If looking at a newline, return the next token after the newline
  ;; Otherwise result is undefined
  token-after-newline(tokens is token-stream)

defun token-stream(source is string) is string-lexer ...
defun token-stream(source is input-file) is input-file-lexer ...
defun token-stream(source is sequence) is token-sequence-stream ...
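The behavior described by the comments on newline-indentation and token-after-newline can be illustrated with a small Python sketch. Several things here are assumptions rather than part of PLOT: the Newline class is a hypothetical token representation that carries an indentation, and raising an error where the protocol says the result is undefined is just one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Newline:
    """Hypothetical newline token carrying its indentation (an assumption)."""
    indentation: int


class TokenSequenceStream:
    """Minimal stand-in for a token-stream backed by a sequence of tokens."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def next(self):
        return self._tokens[self._pos] if self._pos < len(self._tokens) else False

    def advance(self):
        if self._pos < len(self._tokens):
            self._pos += 1

    def newline_indentation(self):
        """Indentation if looking at a newline, otherwise False.

        Callers should test the result with `is False`, because an
        indentation of 0 is a valid answer and 0 == False in Python.
        """
        token = self.next()
        return token.indentation if isinstance(token, Newline) else False

    def token_after_newline(self):
        """Token following the current newline, without advancing the stream."""
        # The protocol leaves the result undefined here; this sketch raises instead.
        if not isinstance(self.next(), Newline):
            raise ValueError("not looking at a newline")
        after = self._pos + 1
        return self._tokens[after] if after < len(self._tokens) else False
```

Neither function advances the stream, so a parse function can inspect the indentation and the token that follows a newline before deciding whether to move past it.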
