Re: Programming language and IDE design

George Neuner <gneuner2@comcast.net>
Fri, 08 Nov 2013 16:04:52 -0500

          From comp.compilers

Organization: A noiseless patient Spider
References: 13-10-016 13-10-017 13-11-003
Keywords: tools, design
Posted-Date: 08 Nov 2013 16:17:32 EST

On Thu, 7 Nov 2013 19:40:43 +0000, Martin Ward <martin@gkc.org.uk>
wrote:


>George Neuner <gneuner2@comcast.net> says
>
>> How exactly should a compiler "enforce consistency" of corresponding
>> declarations in separate compilation units? How is the compiler even
>> to know that they should be corresponding?
>
>This point raises some important issues about languages and compilers:
>
>(1) The language should be easy to parse: both for humans and computers.
>Just because it is *possible* to invent a new parsing technique
>which can unscramble almost impenetrable complexity and ambiguity,
>does not mean it is a good idea to define such a language.


The majority of successful languages are LL(k) for small k. It's hard
to get much simpler than that for parsing.
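
As a rough illustration of how little machinery an LL(1) grammar needs, here is a minimal recursive-descent sketch in Python. The toy grammar (integer sums and products with parentheses) and the names `tokenize`, `Parser` and `parse` are invented for this example and are not taken from any particular language.

```python
import re

# Hypothetical toy grammar, LL(1):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'

def tokenize(text):
    """Extract integer literals and single-character operators; other characters are skipped."""
    return re.findall(r"\d+|[-+*/()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # One token of lookahead is all this grammar ever needs.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            # Integer (floor) division keeps the example in the integers.
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)

def parse(text):
    """Parse and evaluate the whole input, rejecting trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Each grammar rule maps onto one method, and every decision is made by looking at a single upcoming token.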




>(2) Absolutely no behaviour should be "implementation dependent"
>or "undefined". Every syntactically valid program should have
>a single semantically valid meaning (even if that meaning
>is "halt with an error message").


It's impossible not to have implementation dependent behavior: e.g.,
program execution time is a behavior that can't be specified.


Limitations of hardware must be addressed when implementing high level
features - the runtime must compensate for hardware features that are
unavailable on some platforms or have differing semantics across
platforms: e.g., signed vs unsigned arithmetic, flag vs branch
compare, floating point support, etc.


Witness that there is not a single CPU that *fully* implements
IEEE-754 arithmetic (old or new spec), and every implementation
differs in what is lacking and in what needs to be corrected by
software to yield compliant results.
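
Even a high-level runtime exposes some of this variation. The sketch below, assuming CPython's standard `sys` module, collects a few properties that differ from platform to platform; the function name `platform_profile` and the dictionary keys are made up for the example.

```python
import sys

def platform_profile():
    """Collect a few properties that the platform, not the language, decides."""
    return {
        # Byte order of the host CPU: "little" or "big".
        "byteorder": sys.byteorder,
        # Width of the native signed size type (typically 32 or 64).
        "ssize_bits": sys.maxsize.bit_length() + 1,
        # Precision and range of the C double backing Python floats.
        "float_mantissa_bits": sys.float_info.mant_dig,
        "float_max": sys.float_info.max,
    }
```

A program that depends on any of these values behaves differently across machines, no matter how tightly the language itself is specified.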




>(3) The language should be easy to analyse. Current program
>analysis sometimes feels like a race between researchers developing
>ever more sophisticated analysis techniques, and language
>designers adding ever more impenetrable features to the language.


Agreed - they *should* be easy to analyze. The problem is that in
theory, theory and practice are the same - in practice they aren't.


Programs which do no I/O, use no (pseudo)random operations and are
completely constrained to their virtually constructed bubbles are
relatively easy to analyze. It is also the case that nearly all such
programs are completely useless.


Programs which interface with the messy, error-ridden, analog real
world, and which operate on dynamic data, are not easy to analyze
regardless of language.




>Currently, the language designers are winning: for example,
>in C++ it is a non-computable problem just to determine which piece
>of code will be executed at runtime for a given function call.


That's a straw man: it's a variant of the halting problem, which is
undecidable in any Turing-complete language.
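
The reduction can be sketched in a few lines. The example is in Python rather than C++, and `Fast`, `Slow`, `collatz_steps` and `dispatch` are invented names. Which `run` method executes depends on the result of an arbitrary computation, so predicting the call target for every input is as hard as predicting what that computation does, including whether it finishes.

```python
class Fast:
    def run(self):
        return "Fast.run"

class Slow:
    def run(self):
        return "Slow.run"

def collatz_steps(n):
    """Count Collatz iterations until n reaches 1 (termination for all n is unproven)."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

def pick_handler(n):
    # The receiver's class, and therefore the method called later,
    # is decided by the outcome of the loop above.
    return Fast() if collatz_steps(n) % 2 == 0 else Slow()

def dispatch(n):
    return pick_handler(n).run()
```

Replacing `collatz_steps` with any other program gives the same situation, which is why the problem is not specific to one language.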




>(4) The language should not impose arbitrary limitations on the programmer.
>An integer data type should be available and efficiently implemented
>which can hold any size of integers. Similarly, a string data type
>which can hold any size of string. Hash tables should allow
>any type of key and value, and so on.


You're ignoring that user-friendly features such as arbitrary
precision arithmetic impose other limits such as unpredictable
execution timing and memory use.
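
A small sketch of the memory side of that trade-off, assuming CPython, where `sys.getsizeof` reports the size of an integer object in bytes. The helper names `int_footprint` and `footprint_table` are made up for the example, and the exact byte counts vary by interpreter build.

```python
import sys

def int_footprint(bits):
    """Return the in-memory size, in bytes, of an integer with the given bit length."""
    value = 1 << (bits - 1)  # smallest positive integer needing exactly `bits` bits
    return sys.getsizeof(value)

def footprint_table(bit_lengths=(8, 64, 1024, 65536)):
    """Pair each bit length with the storage its integer occupies."""
    return [(bits, int_footprint(bits)) for bits in bit_lengths]
```

The storage grows with the magnitude of the value, so the cost of an addition or multiplication depends on the data rather than on the source code alone.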


Perhaps we should do away with "general purpose" languages altogether
and separate languages into "system" or "application" uses.


"Application" languages should be safe by default. However, even safe
languages probably shouldn't be fully insulated from hardware, else
people will simply abandon them for system languages. They should
have expert modes that allow safely getting closer to the hardware
when necessary. E.g., if the developer *knows* that the dynamic data
range makes it safe to use hardware integer types, she should not be
prevented from doing so.
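
One way such an opt-in might look, sketched in Python with the standard `array` module: type code `"q"` stores signed integers in fixed-width slots of at least 8 bytes, and an out-of-range store raises `OverflowError` instead of silently wrapping. The helper names `make_counters` and `bump` are hypothetical.

```python
from array import array

def make_counters(n):
    """Allocate n fixed-width signed integer slots ('q' is a C long long)."""
    return array("q", [0] * n)

def bump(counters, index, amount):
    """Add amount to one slot and return the new value."""
    # The sum is computed as an ordinary Python int; storing it back
    # raises OverflowError if it no longer fits the fixed-width slot.
    counters[index] += amount
    return counters[index]
```

The developer gets compact, predictable storage, and a wrong assumption about the data range fails loudly rather than producing undefined behaviour.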




>One aim is to catch errors as early as possible: syntactic errors
>can be caught at editing time (with a syntax-aware editor),
>semantic errors can be caught at compile time, and as a last resort,
>runtime errors are caught at runtime (and do not result in potentially
>exploitable undefined behaviour!).
>
>This means: redundancy (to catch errors at edit time), strong typing
>(catch errors at compile time), no aliasing (so the behaviour
>of a piece of code can be determined from the code alone),
>no unrestricted pointers, no undefined behaviour (so runtime errors
>can be caught at runtime).


Lotsa luck! Have you tried to write a non-trivial program that has
absolutely no (source level) aliasing?
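
A minimal sketch of how easily aliasing creeps in, using Python lists. The function `add_first_to_all` is invented for the example; its result depends on whether its two arguments happen to be the same object.

```python
def add_first_to_all(dst, src):
    """Add src[0] to every element of dst, in place."""
    for i in range(len(dst)):
        # If dst and src are the same list, src[0] changes after the
        # first iteration and the remaining additions use the new value.
        dst[i] += src[0]
    return dst

def demo():
    separate = add_first_to_all([1, 2, 3], [1, 2, 3])
    shared = [1, 2, 3]
    aliased = add_first_to_all(shared, shared)
    return separate, aliased
```

Nothing in the function body reveals which case applies, so the behaviour cannot be determined from that code alone unless the language forbids the aliased call outright.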




>As well as the security benefits, there are also significant
>performance benefits from the above approach: if the language
>is easy to parse, then the whole program can be parsed quickly:
>which solves the problem of ensuring consistency between modules.
>If the language is also easy to analyse, then the whole program
>can be analysed and compiled as a unit: leading to many more
>optimisation possibilities.


"Simple to analyze" does not necessarily equate to "simple to
optimize". Nor is "safe" synonymous with "simple".


It is true that whole program analysis can find more optimization
opportunities, but the possibilities ultimately are constrained by the
operational semantics of the language.


In general, "safe" languages require more complex implementations
which present fewer optimization possibilities than do "simple"
languages.


YMMV,
George

