Ed Lowry
7 Alder Way
Bedford Mass 01730
781 276-4098
eslowry@alum.mit.edu
users.rcn.com/eslowry
Revised December 1, 1999
Textbooks for technical subject matter express their content using precise mathematical formulas along with other, less formal kinds of representation, including natural language statements, metaphors, diagrams, and examples. It is a thesis of this paper that large amounts of potentially precise information are not expressed precisely but are diffused in less formal representations because of inadequacies in traditional mathematical notations. This creates a large, avoidable burden for the learner, who must distill the precise information out of the informal expressions and integrate it mentally without the benefit of effective symbolic tools. The main thesis of this paper is that the burden is large and can be substantially eliminated by using optimum information components in a representation method that applies to any technical subject matter.
Teachers of bricklaying need to understand the shape of bricks and how they fit together. Classroom teaching is largely devoted to showing students how to arrange pieces of information so educators need a sound answer to the question:
What is a reasonable structure for basic
pieces of information and how do they fit together?
Traditional representations of information tend to be built from components which are rarely subject to critical examination. There tends to be an assumption that the best choice for representations is highly dependent on the subject matter and that improvements are unlikely to be large or to have wide scope of applicability. The best available engineering analysis [1] of information structures at the level of their most basic components presents a very different view.
A basic finding of the analysis is that pressing the limits of expressive simplicity in formal language for non-trivial subject matter forces uniformity on the structure of the component data objects used to represent the subject matter of the language. The orderliness which appears when needless complexity is removed is analogous to the crystallization that develops in a liquid as heat is removed. The properties of formal language change significantly for the better across this "phase change". For maximum expressive simplicity all data objects will have a structure called "needles", by analogy with pine needles, which are pointed and organized in tree structures. The more important and simpler part of the analysis concludes that all data objects will have the same well-defined structure. There is nothing foggy about the structure of well-chosen data objects even though a foggy understanding of them may be widespread.
An offer of $20,000 for justification of alternative data object structures was posted on the above website and fairly widely advertised in February 1999. The response has tended to confirm that the information technology community has been deeply unaware of data object structure as a significant issue. This contrasts sharply with other technologies where basic structures have almost always received meticulous attention. The crystallization appears to have gone unnoticed because there has been little serious effort to press the limits of expressive simplicity. The resulting lack of appropriate expertise has been impeding assessment of the urgency for educators to be able to answer the above question clearly. A basic goal in writing this paper (and of the above offer) is to encourage anyone concerned about technical education to ask that question until they are satisfied that both they and the education community have an adequate answer.
Additional reasons for trying to understand the fine structure of information include:
Educational technology based on improvements in representation can be much less costly than improvements based on hardware, and may be much more effective. Focusing more on representational issues rather than hardware could do much to reduce the "digital divide".
Needle data objects make it possible to combine many of the advantages of formal language and natural language. The result can be used to improve communication in a wide range of technical subjects. Such language can be enduring because maximum simplicity of expression is being approached. There will be very little further simplification in a statement such as:
82 = count element where some isotope of it is stable
The language would usually be written or read. It would only be spoken in brief phrases.
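For comparison, the same query can be sketched in a conventional programming language. The Python below is only an illustration: the `Element` and `Isotope` classes and the three-element sample are assumptions made here, not part of the language described in this paper, and the sample is far too small to reproduce the figure of 82.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Isotope:
    mass_number: int
    stable: bool


@dataclass
class Element:
    name: str
    isotopes: List[Isotope] = field(default_factory=list)


# Tiny illustrative sample (hypothetical data set, not a full periodic table).
elements = [
    Element("hydrogen", [Isotope(1, True), Isotope(2, True), Isotope(3, False)]),
    Element("helium", [Isotope(3, True), Isotope(4, True)]),
    Element("technetium", [Isotope(98, False), Isotope(99, False)]),
]

# "count element where some isotope of it is stable"
stable_count = sum(
    1 for element in elements
    if any(isotope.stable for isotope in element.isotopes)
)
print(stable_count)  # prints 2 for this sample
```

Even in this short form the conventional version needs a generator, a nested `any`, and explicit loop variables, where the declarative statement reads close to an English sentence.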
Needles allow near maximum simplicity of description for any moderately rich technical subject matter. They interconnect with each other and are organized in hierarchies. Each needle points from its "parent" in the hierarchy to another object, possibly remote in the hierarchy or perhaps itself. A needle also has connections to immediate neighbors in the hierarchy, a possible next "sibling", and a possible first "child".
[Figure: Needles representing an age relationship]
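The connections described above (parent, target, next sibling, first child) can be sketched as a small data structure. The Python below is a minimal sketch under assumptions made here: the class name `Needle`, its field names, and the `alice`/`age` example are hypothetical, and the example is not a reconstruction of the original figure.

```python
class Needle:
    """One data object: a labelled pointer from its parent to a target."""

    def __init__(self, label, parent=None, target=None):
        self.label = label
        self.parent = parent        # owner in the hierarchy
        self.target = target        # object pointed to (may be remote, or itself)
        self.first_child = None     # possible first "child"
        self.next_sibling = None    # possible next "sibling"

    def add_child(self, label, target=None):
        """Append a new needle at the end of this needle's child chain."""
        child = Needle(label, parent=self, target=target)
        if self.first_child is None:
            self.first_child = child
        else:
            last = self.first_child
            while last.next_sibling is not None:
                last = last.next_sibling
            last.next_sibling = child
        return child

    def children(self):
        """Walk the sibling chain starting at the first child."""
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling


# Hypothetical example: a person whose "age" needle points at a number object.
root = Needle("root")
numbers = root.add_child("numbers")
thirty = numbers.add_child("30")
thirty.target = thirty                        # a needle may point to itself
people = root.add_child("people")
alice = people.add_child("alice")
age = alice.add_child("age", target=thirty)   # target is remote in the hierarchy

print([child.label for child in root.children()])      # ['numbers', 'people']
print(age.parent.label, age.label, age.target.label)   # alice age 30
```

Every object in the sketch, including the number, has the same four connections, which is the uniformity the analysis argues for.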
There are a few dozen cases where progressively optimizing a simple engineering design terminates with a sharply defined structural constraint rather than by design choice in a tradeoff. In each case the structural constraint eliminates a class of design deficiencies. For example, constraining wheels to be round eliminates vertical vibration, constraining pillars to be vertical eliminates shear forces, constraining mirrors to be flat eliminates image distortion, constraining the top and bottom of building blocks to be horizontal eliminates sliding forces. Such constraints often have no adverse side-effects which raise significant tradeoff issues. Most of these "irreducibility optima" [1] get broad and enduring acceptance. They tend to have large social and economic value. The needle data object appears to fit the pattern.
In most of these cases, even small deviations from the optimum structure are unacceptable when the engineering requirements are demanding. It is all right for a tent pole to slope away from vertical, but a pillar holding up a heavy building deviates little. Similarly, simple information can be safely represented in a variety of ways, but rich information structures are best represented using only needles. The possibility that current representations are unreasonable in the same sense that square wheels or doughnut-shaped bricks are unreasonable adds to the urgency of getting a good answer to the above question.
The optimum choice of data primitives can reduce the need for specialization in technical language semantics. Since all functions operate on data structures using a common primitive object, they can be easily merged into a common language semantics regardless of what subject matter is involved. While special purpose language features are needed, they can be much more easily expressed as superficial extensions to the more general purpose language.
Historically, one limitation on formal language generality has been a dichotomy between:
- STRUCTURALLY EXPRESSIVE languages which have rich data
structures but not powerful expression. (Ada, PL/1, etc
which use MANY primitive kinds of data object)
and
- FUNCTIONALLY EXPRESSIVE languages which allow many
operations on large data aggregates in a single
expression. (APL, Relational DB, etc, which use very
FEW kinds of primitive)
Use of just one kind of primitive, the needle, allows for functional expressiveness. Needles also allow a rich structural expressiveness. Limitations on either the structural or the functional expressiveness of the data structures reduce simplicity of expression and limit generality.
Unspecialized language reduces barriers to accessing unfamiliar technical knowledge by reducing need for preliminary language learning. Parts of the language could be learned in primary school and then used to support subsequent technical learning.
Needles can provide a natural standard for the data objects in technical language. After the underlying data structure issues are decided, a semantics of simple expressions for referencing and manipulating data substructures follows fairly naturally. Additional standardization could lead to a durable foundation for a universal language supporting technical literacy. Universality of capability, of course, is no assurance of universal acceptance.
Complex problems increasingly transcend disciplinary boundaries, and integration of the concepts involved can be facilitated by use of a common unspecialized language.
A stable set of functions used in programming languages includes:
arithmetic including comparisons
boolean operations
set operations
matrix operations.
They have broad application and stable definitions. The optimality of needles suggests that an expanded and more integrated set of functions can be defined which can have similar breadth of application and durability.
From formal languages which are at least somewhat unspecialized, it is possible to select groups of functions which operate on simple structures and which do not embody real world knowledge. They can form a language kernel which could be referred to as the "Shannon operators".
Such groups include:
So far, attempts to produce general purpose languages have not succeeded in incorporating all of these in a satisfactory way. The simplicities gained by using needles make it practical to do so. Almost all have been incorporated into KEEP, a predecessor of Shannon.
The stability and integration could make the operators a common tool for technical communication, both with people and machines.
Learning a basic core language for computer usage and system description can probably be made a once-in-a-lifetime effort. Simplicity of expression, simplicity of language, and generality of language can be critical in justifying the introduction of technology. Historically, educators have carefully controlled the complexity of learning environments, but current trends in educational technology endanger that control.
Type concepts are useful for providing diagnostics and abbreviations in the use of such functions. It is proposed that underlying definitions without type sensitivity be defined on the ground that those definitions would be more stable.
Computer hardware and software can enhance the usefulness of the language but computer assistance is NOT initially a requirement. Such language can serve to assist students in a variety of their basic needs:
The language could assist educators in:
Needles allow a declarative natural language style which can help build on previous learning. The total explicitness provides confidence that mysteries can be resolved. Automated analysis tools can speed the resolution. This can be particularly valuable when teachers are not readily available.
The need for metaphor and other informal expressions arises mostly during early stages of learning, when the student may be more disoriented. Later, for proficiency at a detailed level, there is a greater need for tightly integrated, comprehensive, precise statements whose unambiguous interpretation can be depended on. Metaphors introduce foreign baggage which can obscure the picture later on. At any stage of learning, students (especially mature students) may have questions (not necessarily articulated) for which less than precise answers are not satisfactory.
The optimum data object structure provides assurance that the underlying semantics of the core language can remain stable. Variations in superficial syntax and other extensions may be expected.
The earliest student exposure to the language could take the form of using it to manipulate toy environments using computers. Later it would be used to communicate well established mathematical and scientific ideas to the student. Learning to read the language is easier than learning to write it, so it can be used to explain before proficiency in writing it is developed. At a later stage the emphasis would shift to developing and testing models.
The structure of an environment tends to be more clearly expressed in the language than the procedural details of data manipulation within it. Algebraic problems may illustrate these effects. The form of algebraic expressions can be described fairly clearly in the language. The procedures are less clearly represented (at least at present) and are probably useful mainly for supporting clarifications of more intuitive descriptions.
For complex subject matter where coherence of discussion is difficult to maintain, the language can be used to describe the underlying structures of the subject matter. Doing so helps assure that there is a coherent subject of discussion and it can provide a framework on which to interpret informal discussion of the subject.
Most of the results of education are in the student's head, but they also include a body of notebooks and personal library which the student has learned to access easily and with confidence. Expressing such knowledge in the simplest way will enhance its adaptability to different learning and work environments.
Computer analysis of knowledge prerequisites implied in the declarations could help people get oriented in unfamiliar subject matter. They could then solve problems successfully in technical areas for which their background is limited by selecting only the information which is relevant to their immediate needs. Such analysis could also help orient students when they fall behind. Reducing the need for depth of commitment to specialized learning can encourage a wider variety of intellectual exploration by students.
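One way such prerequisite analysis might work is to treat each declared name as depending on the names its definition mentions, and then to collect everything reachable from the concept of interest. The Python below is a sketch under that assumption. The dependency table is written by hand here from the chemistry declarations given later in this paper, and the function name `prerequisites` is hypothetical.

```python
# Hand-written dependency table (an assumption for illustration): each name
# maps to the names that its declaration refers to.
depends_on = {
    "molecular_weight": ["component", "tally", "atomic_weight", "element"],
    "fraction": ["tally", "atomic_weight", "element", "molecular_weight", "compound"],
    "component": ["element", "compound", "tally"],
    "compound": ["component"],
    "atomic_weight": ["element"],
    "element": [],
    "tally": [],
}


def prerequisites(name, table):
    """Return every name reachable from `name`, excluding `name` itself."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for dep in table.get(current, []):
            if dep not in seen and dep != name:
                seen.add(dep)
                stack.append(dep)
    return seen


print(sorted(prerequisites("molecular_weight", depends_on)))
# ['atomic_weight', 'component', 'compound', 'element', 'tally']
```

The `seen` set keeps the walk finite even though `compound` and `component` refer to each other. A reader who only needs `atomic_weight` would be pointed at `element` alone.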
There are hints that unspecialized formal language can enhance creativity. An effort to translate physics into such language resulted in finding a clear picture [2] of electromagnetic fields hidden in Maxwell's equations. Clarification of the structure of data objects and their effect on simplicity of expression resulted from efforts to describe such language in itself.
[1] E. S. Lowry, Toward Perfect Information Microstructures at www.ultranet.com/~eslowry.
[2] E. S. Lowry, Physical Rev. pg 616, 1960, and Am. J. of Physics pg 871, 1963. For a brief description see The Electromagnetic Field in Space-time.
[3] E. S. Lowry, Proc. of ED-Media96, AACE, June 1996, pg 407 (a preliminary version of this paper).
The following sections describe some initial content from high school chemistry, accounting, and particle physics. In each case, substantial amounts of information that was only informally expressed in the original source material are presented in a precise way. Such descriptions need not and often cannot be executed by computer.
Elementary CHEMISTRY in SHANNON
[[declare chemistry domain
declare element list
has id(hydrogen, helium, lithium, ... )
has atomic_weight in number
has atomic_number in tally
declare atom set
has element
declare mass quantities
declare volume quantities
declare temperature quantities
declare molecule set
has compound
has set atom
declare compound set
has id(carbon_dioxide, water, molecular_oxygen, ozone, ...)
holds set component
has set portion converse
has molecular_weight in number :=sum for its component take
its tally * atomic_weight of its element
declare component sets
has element key
has compound converse
has tally
has fraction in number
:= its tally * atomic_weight of its element / molecular_weight
of its compound
declare portion set
mayhave compound
has state_of_matter
has mass
has molecule_count in tally
mayhave temperature
mayhave volume
declare state_of_matter set
has id(solid, liquid, gas)
declare transformation set
has set input in portion
has set output in portion
maybe decomposition := count(its input)=1 and count(its output)>1
/* gas law
certify some number satisfies every portion where gas satisfies
its pressure * its volume / its temperature = the number
certify decomposition where compound of its input is sulphur_dioxide
satisfies mass of its sulphur output = mass of its oxygen output ]]
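The derived attributes in the chemistry declarations (`molecular_weight`, `fraction`, and `decomposition`) can be checked with a short calculation. The Python below is a sketch, not a translation of the language: the dictionary layout and function signatures are assumptions made here, and the rounded atomic weights are supplied for illustration.

```python
# Rounded standard atomic weights (supplied here for illustration).
atomic_weight = {"hydrogen": 1.008, "oxygen": 15.999, "sulphur": 32.06}

# compound -> {element: tally}, mirroring "holds set component ... has tally".
compounds = {
    "water": {"hydrogen": 2, "oxygen": 1},
    "sulphur_dioxide": {"sulphur": 1, "oxygen": 2},
}


def molecular_weight(compound):
    """sum for its component take its tally * atomic_weight of its element"""
    return sum(
        tally * atomic_weight[element]
        for element, tally in compounds[compound].items()
    )


def fraction(compound, element):
    """its tally * atomic_weight of its element / molecular_weight of its compound"""
    tally = compounds[compound][element]
    return tally * atomic_weight[element] / molecular_weight(compound)


def is_decomposition(inputs, outputs):
    """count(its input)=1 and count(its output)>1"""
    return len(inputs) == 1 and len(outputs) > 1


print(round(molecular_weight("water"), 3))
print(round(fraction("sulphur_dioxide", "sulphur"), 3))
print(is_decomposition(["sulphur_dioxide"], ["sulphur", "molecular_oxygen"]))
```

With these rounded weights the sulphur fraction of sulphur dioxide comes out very close to one half, which is consistent with the final `certify` statement that the sulphur and oxygen outputs of its decomposition have roughly equal mass.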
ACCOUNTING
This summarizes some basic accounting concepts in SHANNON.
[[declare accounting domain
declare business_entity set
has name in string key generic
holds set ledger in account
declare account sets
has name in string key generic
mayhave business_entity
is_one_of (asset_acct, liability_acct, capital_acct)
is_one_of (curr_asset, fixed_asset) if(asset_acct)
is_one_of ( control_acct, subsidiary_ledger ) subtype
holds set subsidiary_ledger if(control_acct)
holds list acct_period
declare quarter list
has ordinal key
holds list journal in transaction
has set acct_period
declare acct_period lists
has quarter key
has account converse
has list entry_line := entry_line of transaction of its quarter
where account of the acct_period = account of the entry_line
has list debit in entry_line := its entry_line where dr
has list credit in entry_line := its entry_line where cr
has balance in dollar :=
sum (value of its debit)
- sum (value of its credit)
declare transaction lists
has ordinal key
has date
has quarter
has event in string
maybe adjusting
maybe closing
holds set entry_line which
(is_one_of ( dr, cr ) subtype
has value in number
has account)
]]
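The `balance` definition above (debits minus credits over the entry lines of one account in one quarter) can be illustrated with a small calculation. The Python below is a sketch under assumptions made here: the class layout, the `balance` signature, and the two sample transactions are hypothetical and are not taken from any source ledger.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntryLine:
    kind: str       # "dr" or "cr", as in is_one_of ( dr, cr )
    value: float
    account: str


@dataclass
class Transaction:
    ordinal: int
    quarter: int
    event: str
    entry_lines: List[EntryLine]


def balance(journal, account, quarter):
    """sum (value of its debit) - sum (value of its credit) for one acct_period."""
    lines = [
        line
        for transaction in journal
        if transaction.quarter == quarter
        for line in transaction.entry_lines
        if line.account == account
    ]
    debits = sum(line.value for line in lines if line.kind == "dr")
    credits = sum(line.value for line in lines if line.kind == "cr")
    return debits - credits


# Hypothetical journal for quarter 1.
journal = [
    Transaction(1, 1, "owner invests cash", [
        EntryLine("dr", 1000.0, "cash"),
        EntryLine("cr", 1000.0, "capital"),
    ]),
    Transaction(2, 1, "buy equipment", [
        EntryLine("dr", 300.0, "equipment"),
        EntryLine("cr", 300.0, "cash"),
    ]),
]

print(balance(journal, "cash", 1))     # 700.0
print(balance(journal, "capital", 1))  # -1000.0
```

Because every transaction here debits and credits equal amounts, the balances of all accounts in the quarter sum to zero, which is the usual double-entry check.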
PARTICLE PHYSICS in SHANNON
[[ declare particle_physics domain;
declare materiality set
has id(matter, anti_matter);
declare color set
has id(lepton, red, green, blue);
declare tronity set
has id(tron, trino)
has set flavor converse;
declare generation set
has id(first_generation, second_generation, third_generation)
has set flavor converse;
declare flavor set
has id( down, up, strange, charm, bottom, top)
has generation key := first_generation if down or up else
second_generation if strange or charm else
third_generation
has tronity key := tron if down or strange or bottom else
trino;
declare handedness set
has id(left, right);
declare mass quantities;
declare charge values additive
has electric in(for integer take it/3)
mayhave weak in(for integer take it/2)
has r_g in(for integer take it/2)
has g_b in(for integer take it/2)
has b_r in(for integer take it/2) := -r_g-g_b;
/* particle physics, continued
declare particle set
has generation
has tronity
has color
has materiality
has mass
has flavor := flavor(its generation, its tronity)
has handedness
is_one_of(neutrino, tau, muon, electron, quark)
:= quark if not lepton
else neutrino if trino
else tau if bottom
else muon if strange
else electron
has charge := create charge with (
electric: ( 0 if neutrino
else -1 if lepton
else 2/3 if trino
else -1/3 if tron )
*(1 if matter else -1),
weak: 0 if left and anti_matter
or right and matter
else 1/2 if tron xor anti_matter
else -1/2,
r_g: ( 0 if lepton or blue
else 1/2 if red else -1/2)
*(1 if matter else -1),
g_b: ( 0 if lepton or red
else 1/2 if green else -1/2)
*(1 if matter else -1) );
]]
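The conditional expressions for `electric`, `r_g`, `g_b`, and `b_r` can be traced by hand, and a short program makes the tracing easier to check. The Python below is a sketch that follows the declarations above. The function names and the use of plain strings for `color`, `tronity`, and `materiality` are assumptions made here.

```python
from fractions import Fraction


def sign(materiality):
    """(1 if matter else -1)"""
    return 1 if materiality == "matter" else -1


def electric_charge(color, tronity, materiality):
    """Electric charge following the 'electric:' expression above."""
    lepton = color == "lepton"
    trino = tronity == "trino"
    if lepton and trino:            # neutrino
        magnitude = Fraction(0)
    elif lepton:                    # tau, muon, or electron
        magnitude = Fraction(-1)
    elif trino:                     # up-type quark
        magnitude = Fraction(2, 3)
    else:                           # down-type quark
        magnitude = Fraction(-1, 3)
    return magnitude * sign(materiality)


def color_charges(color, materiality):
    """Return (r_g, g_b, b_r), with b_r := -r_g - g_b as declared."""
    half = Fraction(1, 2)
    if color in ("lepton", "blue"):
        r_g = Fraction(0)
    else:
        r_g = half if color == "red" else -half
    if color in ("lepton", "red"):
        g_b = Fraction(0)
    else:
        g_b = half if color == "green" else -half
    r_g *= sign(materiality)
    g_b *= sign(materiality)
    return r_g, g_b, -r_g - g_b


print(electric_charge("red", "trino", "matter"))        # 2/3
print(electric_charge("lepton", "tron", "matter"))      # -1
print(electric_charge("blue", "tron", "anti_matter"))   # 1/3
print(color_charges("red", "matter"))
```

Exact fractions are used rather than floating point so that thirds and halves compare equal without rounding error.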