Re: Programming language and IDE design


          From comp.compilers


From: Martin Ward <>
Newsgroups: comp.compilers
Date: Thu, 7 Nov 2013 19:40:43 +0000
Organization: Compilers Central
References: 13-10-016 13-10-017
Keywords: design
Posted-Date: 07 Nov 2013 21:45:48 EST

BartC <> makes the point that the redundancy
provided by enforced consistency between grouping and indentation
will catch typos which would otherwise cause bugs in the code.
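As a hypothetical illustration of BartC's point (the function name and values are mine, not from the thread): in free-format C++ the indentation below suggests both statements are guarded by the `if`, but only the first is. A language that enforces consistency between grouping and indentation would reject this file at compile time instead of shipping the bug.

```cpp
#include <cassert>

// The second "sum += 1" looks guarded by the if, but always executes:
// grouping and indentation silently disagree.
int sum_if_positive(int x) {
    int sum = 0;
    if (x > 0)
        sum += x;
        sum += 1;   // indented as if guarded; actually unconditional
    return sum;
}
```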

Hans-Peter Diettrich <> claimed that batch
processing "leads to *better* code, by thinking *before* writing
instead of while debugging and bug fixing." Our esteemed moderator
also noted that "Keypunching often had this salutary effect."

Regarding automatic generation of code, two cases need to
be distinguished. A domain-specific language which is automatically
translated into executable code is fine, so long as it is the DSL code
which is maintained. This reduces the total amount of code the reader
has to read, while also making it more readable: a good DSL should lead
to code which is closer to the problem domain and is understandable
by an expert in that domain.
On the other hand, if the DSL is used to generate huge wads of code
which then form part of the system to be maintained: well, that is
the same as using a compiler to generate an assembler program,
then throwing away the source code and maintaining the assembler.

How many languages should be used in one program? As many as necessary,
if they are "little languages". If they are huge, baroque monstrosities
like PL/I or C++, then one such language is already one too many :-)

"Using PL/1 must be like flying a plane with 7000 buttons,
switches and handles to manipulate in the cockpit."
--E. W. Dijkstra, "The Humble Programmer".

"If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor, and when was
the last time you needed one?"--Tom Cargill, C++ Journal, Fall 1990.
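For the curious, here is a sketch of (most of) Cargill's construct, under my reading of the phrase: an abstract base class, inherited as a protected *virtual* base, whose destructor is pure virtual. (The "private destructor" part of the joke would not even compile here, since a derived destructor must be able to call the base destructor.) The class names and counter are mine, purely for illustration.

```cpp
#include <cassert>

static int live_objects = 0;

struct Base {
    Base() { ++live_objects; }
    virtual ~Base() = 0;            // pure virtual destructor...
};
Base::~Base() { --live_objects; }   // ...which still needs a definition

class Derived : protected virtual Base {   // protected virtual base
public:
    ~Derived() override = default;
};

int construct_and_destroy() {
    { Derived d; }          // Derived is concrete: it overrides ~Base
    return live_objects;    // back to zero once d is destroyed
}
```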

Hans-Peter describes the idea of recording variable types
everywhere as "another maintenance nightmare": but with proper
tool support it does not have to be. Even without tool support:
forcing the programmer who wants to change the type of a data
item to at least glance at all the modules which use that data
could be beneficial.

Regarding enforced indentation: with continuation lines it could
be possible to slightly relax the restrictions and allow two or more
indentation points to be valid.

George Neuner <> says "My preference is simply
for a common END keyword used by all constructs."
This is just as bad as using "}" to close all nesting: suppose I want
to insert a line of code at the end of a loop body which is nested in
some IF statements and itself contains some nested statements.
Where do I put it? (The top of the loop is a few pages earlier.)
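The maintenance problem can be sketched in C++ (a made-up example, with brace comments playing the role of distinct END keywords such as END IF and END FOR): without them, the loop body ends somewhere in a pile of identical closers.

```cpp
#include <cassert>
#include <vector>

// The comments after each closing brace name the construct they close,
// marking the one spot where a statement added at "the end of the loop
// body" actually belongs.
int count_in_range(const std::vector<int>& v, int lo, int hi) {
    int n = 0;
    for (int x : v) {
        if (x >= lo) {
            if (x <= hi) {
                ++n;
            }            // end if (x <= hi)
        }                // end if (x >= lo)
        // <- a statement for "the end of the loop body" goes here
    }                    // end for
    return n;
}
```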

Of course, you could still end up with a closing keyword pages away
from the test it belongs to: which is an argument for repeating the
test at the closing keyword.

> How exactly should a compiler "enforce consistency" of corresponding
> declarations in separate compilation units? How is the compiler even
> to know that they should be corresponding?

This point raises some important issues about languages and compilers:

(1) The language should be easy to parse: both for humans and computers.
Just because it is *possible* to invent a new parsing technique
which can unscramble almost impenetrable complexity and ambiguity,
does not mean it is a good idea to define such a language.
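C++ itself supplies a well-known illustration of point (1), the "most vexing parse" (the `Widget`/`Timer` names below are mine): a line that any human reads as constructing an object is, by the grammar, a function declaration.

```cpp
#include <cassert>

struct Timer {};
struct Widget {
    int state = 42;
    Widget() {}
    explicit Widget(Timer) {}
};

int demo_vexing_parse() {
    Widget w1(Timer());   // most vexing parse: declares a FUNCTION w1
                          // taking a function pointer, not an object!
    Widget w2{Timer{}};   // braces disambiguate: an actual Widget
    return w2.state;
}
```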

(2) Absolutely no behaviour should be "implementation dependent"
or "undefined". Every syntactically valid program should have
a single semantically valid meaning (even if that meaning
is "halt with an error message").
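A concrete instance of point (2), from C++ (the function names are mine): signed overflow is undefined, so the compiler may assume it never happens and fold the first test below to "true"; the program's behaviour then depends on the optimizer, not the source. A fully defined language would instead assign exactly one meaning: wraparound, a checked runtime error, or a bignum result.

```cpp
#include <cassert>
#include <limits>

// Undefined when x == INT_MAX; many optimizers fold this to "return true".
bool always_greater(int x) {
    return x + 1 > x;
}

// The well-defined alternative: perform the test without overflowing.
bool greater_checked(int x) {
    return x < std::numeric_limits<int>::max();
}
```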

(3) The language should be easy to analyse. Current program
analysis sometimes feels like a race between researchers developing
ever more sophisticated analysis techniques, and language
designers adding ever more impenetrable features to the language.
Currently, the language designers are winning: for example,
in C++ it is a non-computable problem just to determine which piece
of code will be executed at runtime for a given function call.
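Even the simplest case shows the problem (a made-up example; templates, overloading and SFINAE only make it worse): the call `p->sides()` below has no statically determined target, since which body runs depends on a runtime value.

```cpp
#include <cassert>
#include <memory>

struct Shape { virtual ~Shape() = default; virtual int sides() const = 0; };
struct Tri  : Shape { int sides() const override { return 3; } };
struct Quad : Shape { int sides() const override { return 4; } };

int sides_of(bool runtime_flag) {
    std::unique_ptr<Shape> p;
    if (runtime_flag) p = std::make_unique<Tri>();
    else              p = std::make_unique<Quad>();
    return p->sides();   // dispatch target chosen only at runtime
}
```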

(4) The language should not impose arbitrary limitations on the programmer.
An efficiently implemented integer data type should be available
which can hold integers of any size; similarly, a string data type
which can hold strings of any length. Hash tables should allow
any type of key and value, and so on.
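The fixed-width limitation in point (4) is easy to demonstrate in C++ (my example): arithmetic that is mathematically unremarkable silently wraps for unsigned types (and is undefined for signed ones), and a language-level bignum type is simply absent; one must reach for an external library.

```cpp
#include <cassert>
#include <cstdint>

// 200 + 100 is 300, but an 8-bit unsigned result wraps to 300 mod 256 = 44.
std::uint8_t wrapping_sum(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```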

One aim is to catch errors as early as possible: syntactic errors
can be caught at editing time (with a syntax-aware editor),
semantic errors can be caught at compile time, and as a last resort,
runtime errors are caught at runtime (and do not result in potentially
exploitable undefined behaviour!).

This means: redundancy (to catch errors at edit time), strong typing
(catch errors at compile time), no aliasing (so the behaviour
of a piece of code can be determined from the code alone),
no unrestricted pointers, no undefined behaviour (so runtime errors
can be caught at runtime).
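The aliasing point can be sketched in a few lines of C++ (a made-up function): the value returned cannot be determined from the function body alone, because it depends on whether the caller passes aliased pointers. That is exactly what defeats both local reasoning and optimization (the compiler must reload `*a` rather than assume it is still 1).

```cpp
#include <cassert>

int store_then_read(int* a, int* b) {
    *a = 1;
    *b = 2;
    return *a;   // 1 if a and b are distinct, but 2 if they alias
}
```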

As well as the security benefits, there are also significant
performance benefits from the above approach: if the language
is easy to parse, then the whole program can be parsed quickly:
which solves the problem of ensuring consistency between modules.
If the language is also easy to analyse, then the whole program
can be analysed and compiled as a unit: leading to many more
optimisation possibilities. Strong typing, together with the
elimination of aliasing and pointers also allows much deeper analysis
and optimisation. Memory is cheap and CPUs are powerful,
so whole-program analysis and optimisation should become the norm:
but this provides the greatest benefit when the language is
easy to analyse.

Hans-Peter Diettrich <> wrote:
> It turned out soon that C is the most ugly and pretentious language WRT
> to automatic analysis, that's why I continued research and
> implementation on just that language. Newer languages should be much
> easier to master automatically...

Newer languages like C++, for example? :-)

William Clodius <> wrote:
> Among workers whose primary task is not
> programing it can be very difficult to get a mutually consistent
> programming style unless their mutual inconsistencies, where it affects
> legibility, have significant consequences.

Indeed: a consistent coding style is much more important than anyone's
personal preference: in the sense that readability is improved when everyone
is forced to use the same style (even if some think that it is sub-optimal)
compared to the mix of styles that results when everyone is allowed
to write in their own way. Not to mention the wasted effort when
people keep on restructuring each other's code into their preferred style!


Dr Martin Ward, STRL Principal Lecturer and Reader in Software Engineering
Erdos number: 4
G.K.Chesterton web site:
Mirrors: and
