Human Engineering of Programming Languages

The term human engineering is not widely used when talking about programming languages, but arguably it is the most important job of a language designer.  Closely related to the term ergonomics, human engineering is the act of ensuring that a design is fit for use by humans.  In the context of a programming language, that means the language should help prevent errors during the construction of a program, and should be designed to faithfully communicate the programmer's intended meaning both to human readers and to the automated tools that might analyze the program or convert it into a machine-executable form.

One well-known technique for preventing errors is to require (or at least allow) some degree of checked redundancy.  That is, the programmer states some things that could potentially be inferred from earlier (or later) parts of a communication, because the redundancy enables an automated check that the intent of the communication was correctly captured in the given representation.  From a pure syntax point of view, we can also try to maximize the Hamming distance between programs, measured as, say, the number of single-token replacements needed to move from one syntactically legal program to a different syntactically legal program.
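To make the token-replacement measure concrete, here is a minimal Python sketch. The function name `token_hamming_distance` and the hand-tokenized inputs are assumptions made for illustration; a real measurement would use the language's own tokenizer and would also have to confirm that both programs are syntactically legal.

```python
def token_hamming_distance(a, b):
    """Count the positions at which two equal-length token sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical, hand-tokenized expressions: one token replacement apart.
less_than = ["a", "<", "b"]
less_or_equal = ["a", "<=", "b"]
print(token_hamming_distance(less_than, less_or_equal))  # 1
```

A distance of 1 between two legal programs means a single slip can silently turn one into the other, which is the situation a designer would like to make rare.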

Examples of checked redundancy include the requirement in a statically typed language to declare the type of an object before manipulating it with various operations.  One could potentially infer the type from the operations performed (or, as in early Fortran, from the first letter of the object's name), but arguably, in many situations, that would both make mistakes in constructing the program more likely to slip through and render even a correct program harder for other humans to understand.  On the other hand, excessive redundancy can make the language painful to use, and can obscure the meaning through information overload.  Clearly the tradeoffs need to be identified and weighed, which is arguably the essence of any kind of engineering.
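The idea of checking a declaration against actual use can be sketched in a few lines of Python. This is only a toy under stated assumptions: the helper `mismatches` and the dictionaries it takes are hypothetical, and the check runs on values at run time, whereas a statically typed language performs the equivalent check before the program ever runs.

```python
def mismatches(declared, values):
    """Report every name whose value disagrees with its declared type."""
    problems = []
    for name, value in values.items():
        if name not in declared:
            problems.append(f"{name}: used without a declaration")
        elif not isinstance(value, declared[name]):
            expected = declared[name].__name__
            actual = type(value).__name__
            problems.append(f"{name}: declared {expected}, got {actual}")
    return problems


# The declarations are redundant with the values, and that is the point:
# the two can be compared, and a disagreement reveals a likely mistake.
declared = {"count": int, "label": str}
values = {"count": "3", "label": "total"}
print(mismatches(declared, values))  # ['count: declared int, got str']
```

Without the declarations, the string "3" would simply be accepted as the intended value of count, and the mistake would surface later, if at all.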

Examples of Hamming distance issues include cases where the difference between, say, "=" and "==" can alter the meaning completely, in a way that could escape detection by both the author and any reviewers.  The problem can be avoided by using symbols that are more easily distinguished, and that require more than a single-character change to go from one to the other.  For example, using ":=" for assignment and "==" for equality would increase the distance from simply adding one character to having to delete one character and add another, and at the same time would provide visually more distinctive symbols.  Further semantic requirements can be added so that the two symbols never appear in the same enclosing context, for example by requiring assignments to be performed only as stand-alone statements, while limiting equality to expression contexts.  But then again, distinctions between statement context and expression context can become burdensome when similar constructs, such as conditionals, are needed in both contexts.
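Python happens to follow the context rule described above for "=": assignment with "=" is a statement and is rejected where an expression is expected. The sketch below uses the built-in compile function to ask the parser whether a snippet is legal, without running it; the helper name `is_legal` is an assumption made for this illustration.

```python
def is_legal(source):
    """Return True if Python accepts the source text as a program."""
    try:
        compile(source, "<example>", "exec")  # parse and compile only, never run
        return True
    except SyntaxError:
        return False


print(is_legal("x = 5"))                # True: assignment as a stand-alone statement
print(is_legal("if x == 5:\n    pass"))  # True: equality in an expression context
print(is_legal("if x = 5:\n    pass"))   # False: "=" is not allowed in a condition
```

Because the one-character slip from "==" to "=" yields a program the parser refuses, the mistake is caught before it can change the program's meaning. Python 3.8 and later do offer assignment inside expressions, but through the separate ":=" operator.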

We will return to the topic of human engineering many times in future blog entries, because as stated at the top of this entry, human engineering may very well be the most important kind of engineering we as programming language designers can do.
