Programming Language for Old Timers


by David A. Moon
February 2006 .. September 2008

Comments and criticisms to dave underscore moon atsign alum dot mit dot edu.


Previous page   Table of Contents   Next page


Source Code Model

This section describes the abstract syntax tree form of source code, which is called P-Expressions. This is the form that results from parsing and is manipulated by macros. The textual form of source code, or surface syntax, is described later.

Source code is made out of P-expressions. A P-expression, often abbreviated to just expression, is the parsed form of a PLOT construct.

Names in P-expressions have not yet had their binding to a lexically visible definition resolved. This allows fragments of a complete source code model to be manipulated. However, it requires a scope to be supplied in order to fully understand an expression.

The following kinds of expressions exist in PLOT, i.e. are known to the compiler:

literal self-quoting literal datum (number, character, or string)
quotation quoted constant
name reference to a definition
invocation application of a function to arguments
special expression something executed in a special way

There is a separate class of special expression for each "special form" intrinsic to the compiler. The special expression classes and more detail on the object-oriented representation of source code are explained at the end of this section.

The textual form of expressions is greatly expanded beyond the text directly corresponding to this semantic model through the use of macros, which are explained later.

A definition is an association of a name and an object. A definition is either fixed, meaning it always associates to the same object throughout its lifetime, or assignable, meaning the association can be changed to a different object through assignment. Most definitions are fixed; it takes a special construct to create an assignable definition. A fixed definition can be constant, which means that the associated object is known at compile time.

A definition is created by executing the expansion of one of several macros whose names by convention start with def, or by binding a parameter to an argument when invoking a method, or by binding a pattern variable to an item that was parsed when executing a pattern. Parameters and pattern variables are always fixed definitions.

To understand definitions it is necessary to understand the concept of blocks. A block is a sequence of one or more expressions that constitutes a scope for definitions. Blocks can be created explicitly using the block macro, and also are implied in certain places: a method is enclosed in a block that includes its parameter list and its body. The iteration statements for, while, and until enclose the entire statement in a block.

Blocks can nest. An expression in an inner block is not a member of the outer block.

A definition is either a local definition or a global definition. A global definition is created by a definition that is outside of any block. All other definitions are local.

The scope of a local definition is the block where it is defined and all nested blocks that do not hide that definition with another definition with the same name. The scope of a global definition is all code outside of blocks that is in the same module or in another module that imports the definition and all nested blocks that do not hide that definition with another definition with the same name. In addition, a global definition can be accessed from outside its scope by using the name-in-module construct (the @ sign) which is explained later.

It is an error to define two definitions of the same name as members of the same block.

The concept of "the same name" is a bit tricky where macros are involved and the full details are explained later.

Note that the scope of a definition includes expressions that appear in its scope before the definition itself. In other words, the scope of a definition is the entire containing block, not just the rest of the containing block. Thus no special construct is needed to define recursive or mutually recursive functions and mutually referencing classes. However this necessitates two additional concepts:

A definition is accessible in each expression after the one that defines it, and also in methods that appear before the defining expression but are not called until afterwards. Sometimes a definition is only conditionally executed; in this case the definition is only accessible if it was actually executed on the path to the expression where it is used. The definition itself is created unconditionally, but the value of the definition is conditional. It is an error to access a definition where it is in scope but not accessible.

A definition is known in each expression after the one that defines it, provided the definition is constant. Operators and macros, and any definitions used by the bodies of macros or by methods that macro bodies call must be known where the operator or macro is referenced. A module referenced by the name-in-module construct must be a known definition. The known definition concept makes compilation possible.

Referencing a definition where it is in scope but not known, when the value of the definition is an operator or a macro and the reference is such that if the definition had been known the reference would have parsed differently, is an error. This error can be detected if the compiler keeps track of all names in each scope that have been unsuccessfully checked for known definitions and gives an error if a known definition to an operator or a macro is added to the scope later.

A similar error can occur if a known definition in an outer block is hidden by a definition that occurs later than the reference. The known definition in the outer block would have been used erroneously by the compiler. This too can be detected at the expense of keeping track of past known definition lookups.

To evaluate a sequence of expressions as a body without enclosing them in a block, use the do statement. Its body expressions become members of the same scope of which the do statement is a member. This is especially useful for macro expansions.

A module is a set of global definitions. A module is an object and can be given a name by definition, normally as a global definition in another module. This is only useful if the name of the module is a fixed definition and thus known. Some of the definitions in a module can be exported, and a module can augment its definitions by importing some or all of the exports of another module, possibly renaming them.

There is a module named PLOT that exports all the names defined in the standard language, including all the operators, macros, and standard functions. There is also a root module which contains definitions of names as other modules. Another important module is compiler, which exports among other things the names of the types and functions associated with the P-expression representation of source code. Many macros use definitions from the compiler module.

The expression that creates a local definition can be executed more than once, if it is inside a method or a loop. Each execution creates a new definition. The previous definition created by the previous execution is no longer in scope, unless any closures that reference it were created and still exist.

A name object in a P-expression is a reference to a definition, but unlike the case in Common Lisp there is no direct connection from the name object to the definition. The connection only exists in the context of a scope, which is either a module or a local scope comprising a block and its nested blocks. Hygienic macros aside, there is only one name object with a given spelling but there can be many definitions of that name in different scopes.

The object-oriented representation of source code is as follows. Note that all expression objects are immutable, at least in their public API.

module: @compiler@

;; The type expression includes all source code expressions
defprotocol expression

;; Self-quoting literal data are expressions
defprotocol literal is expression

add-protocol($number, $literal)
add-protocol($character, $literal)
add-protocol($string, $literal)

;; A name is an expression
add-protocol($name, $expression)

;; A quotation is a constant expression
defclass quotation is expression
  datum         ; the quoted object

;; The source-locator protocol remembers where an expression came from,
;; through parsing, macro expansion, and all other forms of code rewriting
defprotocol source-locator
  source-file(exp is source-locator) is string or false         ; file name
  source-line(exp is source-locator) is integer                 ; line number
  ;; With no arguments, source-file and source-line return the current source location,
  ;; which is dynamically bound by the parser
  source-file() is string or false                              ; file name
  source-line() is integer                                      ; line number

;; Base class for all expressions that can remember their source location
defclass compound-expression() is source-locator, expression
  source-file = source-file()
  source-line = source-line()

;; An invocation applies a function to arguments
defclass invocation(function is expression, rest: arguments is expression)
         is compound-expression
  function = function
  n-arguments = length(arguments)
  argument[this.n-arguments] = arguments

;; A spread-invocation is an invocation that contains one or more spread
;; arguments.  The spread-flags is a bit vector with 1 where spreading occurs.
defclass spread-invocation(spread-flags is integer, function is expression,
                           rest: arguments is expression)
         is invocation(function, arguments...)
  spread-flags = spread-flags

;; A conditional executes one of two expressions depending on the result of a test
defclass conditional is compound-expression
  test is expression
  consequent is expression
  alternate is expression

;; An assignment changes the value of a definition
defclass assignment is compound-expression
  lhs is expression
  rhs is expression

;; A collation evaluates a sequence of expressions in order and
;; returns the value of the last
defclass collation(rest: body is expression) is expression
  length = body.length
  body[this.length] = body

;; A scopation is a collation that also defines a scope
defclass scopation(scope, rest: body is expression)
         is collation(body...)
  scope = scope

;; An exitation defines an exit-function whose scope is the evaluation
;; of the body.  Calling the exit-function immediately returns its argument
;; as the value of the exitation and terminates evaluation of the body.
;; The name will be defined as an exit function in the scope.
defclass exitation is compound-expression
  scope is scope
  name is name
  body is expression

;; A sanitation implements the cleanup statement
defclass sanitation is compound-expression
  body is expression
  cleanup is expression

The following classes represent definitions. They have double duty: as a P-Expression, a definition represents a source code expression that creates a definition, typically a def statement. In the scope and module data structures, a name is mapped to a definition that represents what that name is defined to be.

;; The abstract base class for all types of definition
defclass definition is compound-expression : abstract

;; A constant-definition has a value known at compile time.
defclass constant-definition is definition
  name is name
  value is anything

;; A variable-definition does not have a value known at compile time,
;; but once set the value cannot be changed by assignment, unless
;; this is the assignable-definition subclass.
defclass variable-definition is definition
  name is name
  type-expression is expression
  value-expression is expression

defun assignable?(defn is definition) false

;; An assignable-definition is like variable-definition but the value
;; can be changed by assignment.
defclass assignable-definition is variable-definition

defun assignable?(defn is assignable-definition) true

;; A method-definition adds a method to a function
;; and defines the function if the name is not already defined
;; in the current scope or imported into the current scope
;; from another module.  The "method" is actually a functation
;; which will later be evaluated to get the run-time method.
;; This does not actually appear in a scope, it is only used as
;; a P-Expression.
defclass method-definition is definition
  name is name
  method is functation

;; A functation evaluates to a function containing one method
;; which is constructed from a name, parameter-list, and body.
;; A function has a collection of methods and also has a collection of functations
;; that represent methods that are being compiled but have not yet been loaded.
;; This lets the compiler know that such a method will exist at run time.
;; When a functation is evaluated, it is removed from the function's functations
;; and its value's method is added to the function's methods.
defclass functation(name is name, scope is scope,
                    parameter-list is parameter-list, body is expression)
         is compound-expression
  name = name
  scope = scope
  parameter-list = parameter-list
  body = body


Previous page   Table of Contents   Next page