Re: Use of punctuation in a language?

Ray Dillinger <bear@sonic.net>
11 Nov 2003 14:34:24 -0500

          From comp.compilers


Organization: Compilers Central
References: 03-10-129 03-11-024
Keywords: syntax, design
Posted-Date: 11 Nov 2003 14:34:24 EST

Joachim Durchholz wrote:
>
> Herbert wrote:
>
> > Does anyone have any comments on the use of punctuation in a
> > language, eg, compare the following two approaches?
> >
> > a = 3.4; b = 6.7;
> >
> > or
> >
> > a = 3.4 b = 6.7
> >
> > which is better, ease of reading for humans, issues regarding design
> > of compilers (eg the punctuation-less version requires
> > look-ahead?). Perhaps lack of punctuation is a bad language design?
>
> Language readability depends on many factors. Punctuation can help
> structure the code. You can get away without it, but only if there are
> other means that help structure the code.


Also, it really depends on your language. How do you want the
programmer to understand what he's dealing with? If it's a list of
statements and departures from straight-line control flow are rare,
then go with the imperative style and use a list of statements.
Statement-ending punctuation makes your life simpler when writing a
compiler and also helps keep the code clear.
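
As a rough illustration of the compiler-writing side, here is a
minimal sketch in Python for a toy language made of nothing but
`name = number` statements, like Herbert's example. The names
`tokenize`, `parse_assignments` and `require_semicolon` are made up
for this sketch. In a grammar this small both variants parse with one
token of lookahead; the terminator starts to earn its keep once
expressions can run on (is `a = b -c` one statement or two?) and when
you want an error message that points at the right spot.

```python
import re

# One token per match: a number, a name, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+\.\d+|\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(src):
    """Split source text into (kind, text) pairs: NUM, NAME or PUNCT."""
    tokens = []
    for num, name, punct in TOKEN.findall(src):
        if num:
            tokens.append(("NUM", num))
        elif name:
            tokens.append(("NAME", name))
        elif punct.strip():  # skip stray whitespace matched by '.'
            tokens.append(("PUNCT", punct))
    return tokens


def parse_assignments(src, require_semicolon):
    """Parse a run of `name = number` statements into (name, value) pairs."""
    tokens = tokenize(src)
    pos = 0
    result = []

    def expect(kind, text=None):
        nonlocal pos
        wanted = text or kind
        if pos >= len(tokens):
            raise SyntaxError(f"unexpected end of input, wanted {wanted}")
        tok_kind, tok_text = tokens[pos]
        if tok_kind != kind or (text is not None and tok_text != text):
            raise SyntaxError(f"token {pos}: wanted {wanted}, got {tok_text!r}")
        pos += 1
        return tok_text

    while pos < len(tokens):
        name = expect("NAME")
        expect("PUNCT", "=")
        value = float(expect("NUM"))
        if require_semicolon:
            expect("PUNCT", ";")  # the terminator pins down where a statement ends
        result.append((name, value))
    return result


with_semis = parse_assignments("a = 3.4; b = 6.7;", require_semicolon=True)
without_semis = parse_assignments("a = 3.4 b = 6.7", require_semicolon=False)
```

Both calls yield the same list of pairs. Feeding the punctuation-less
text to the semicolon version fails right after `3.4`, which is where
a human would say the mistake is.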


If you want the programmer to understand the program as a syntax tree
with structure and substructure apparent, then liberal use of
encapsulating syntax (like lisp with its parens) is indicated. This
makes the lexer/parser dead easy to write, and also sets you up to be
able to use macros that manipulate list structure directly, which is
why Lisps have more powerful macrology than any other family of
languages. In the extreme form of this you see in some Lisps, there's
exactly one kind of punctuation (parens) and it means exactly one
thing, which means the programmer has near-zero cognitive load to keep
track of syntax. With editor support, autoindentation is easy and you
never get confused about changed whitespace if someone cuts and pastes
code from one lexical level to another.
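
To show how little machinery the fully parenthesized style needs,
here is a minimal sketch of a reader in Python, plus one "macro" that
works directly on the resulting list structure. It is not taken from
any real Lisp: the names `tokenize_sexpr`, `read_sexpr`, `parse_sexpr`
and `expand_unless` are invented here, atoms are kept as plain
strings, and string literals, quoting and comments are ignored.

```python
def tokenize_sexpr(text):
    """Parens are the only punctuation, so padding them and splitting is enough."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_sexpr(tokens):
    """Consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_sexpr(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing paren
        return items
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return tok  # an atom, kept as a string


def parse_sexpr(text):
    return read_sexpr(tokenize_sexpr(text))


def expand_unless(form):
    """Rewrite (unless c body...) into (if (not c) (begin body...)), recursively."""
    if not isinstance(form, list):
        return form
    form = [expand_unless(sub) for sub in form]
    if len(form) >= 2 and form[0] == "unless":
        return ["if", ["not", form[1]], ["begin"] + form[2:]]
    return form


tree = parse_sexpr("(unless (ready) (wait) (retry))")
expanded = expand_unless(tree)
```

The whole front end is two short functions, and the macro is ordinary
list manipulation because the program text and the syntax tree have
the same shape.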


If you want the programmer to understand the program as sequential
things to put onto a stack, which continually operates with side
effects happening on its contents, then no punctuation at all is
needed (a la FORTH); there is, essentially, a single well-defined form
of semantics for every token, so separating one from another is
pointless and any subgrouping of them is completely arbitrary.
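
A minimal sketch of that model in Python, assuming a tiny Forth-like
language with integer literals and a handful of words. This is a toy
(the function `run` and the word table are made up here, and real
FORTH has defining words, control structures and more), but it shows
why no separators are needed: each token acts on the stack by itself.

```python
def run(program, stack=None):
    """Execute whitespace-separated tokens against a stack and return it."""
    stack = [] if stack is None else stack
    words = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
        "swap": lambda s: s.extend([s.pop(), s.pop()]),
        "drop": lambda s: s.pop(),
    }
    for token in program.split():
        if token in words:
            words[token](stack)       # a known word: run its stack effect
        else:
            stack.append(int(token))  # anything else must be an integer literal
    return stack


whole = run("2 3 + 4 *")
# Cutting the program at any token boundary changes nothing,
# as long as the pieces run in order against the same stack.
pieces = run("4 *", run("2 3 +"))
```

Here `whole` and `pieces` end up identical, which is the sense in
which any grouping of the tokens is arbitrary.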


If the programs in your language are mainly an interaction of objects,
then your punctuation should draw clear boundaries between objects and
the fundamental grammar should simply specify a sequence of
arbitrarily complex objects and the message passing protocol used by
each.


When you ask about punctuation in syntax, what you are really asking
is what kind of structure the programs have. What's the runtime
model you're proposing for the programmer to keep in his head? What
kinds of things need boundaries? What is the conceptual model of
what the program is? Is it a list of instructions, or a
mathematical specification, or a set of declarations of
implicitly-structured data, or a composition of matrix and array
functions, or a system of linear matrix equations with a common
solution, or a set of constraints to solve, or ....? And what is the
runtime doing when it runs a program? Is it an "agent" following a
list of ordered instructions, or is it an "evaluator" trying to
evaluate a function through recursive abstractions, or is it unifying
some set of declared rules over some set of declared inputs, or is it
a simple conceptual machine that just processes inputs in ways
that affect its store, or ...?


Each of these decisions implies different things about what syntax and
punctuation is appropriate to your language.


Also, who's trying to read it and who's trying to write it? Is it a
language where nonprogrammers are supposed to be able to figure out
what's going on? Where the main users are well-grounded in some solid
notation-heavy discipline like math or electrical engineering? Is this
for gurus and code-gods to do amazing stuff after guru meditation, or
is this for weekend webmonkeys to work on their home pages using a
simple, constrained language that's easy to understand and tries to
keep them from accidentally hanging themselves?


People from notation-heavy disciplines like math, ironically, will
sometimes rebel at the use of lots of different punctuation,
especially if they feel that it's unnecessary in the context of the
language. It actually *offends* some of them to write a semicolon at
the end of a statement, because semicolons don't appear when they're
writing equations and they don't seem to carry semantic information in
the program. But Lisp parens are fine with them, because they express
operator precedence. Basically, they get an attitude that every
symbol has to mean something "significant" and it has to mean exactly
the same thing, as much as it can, everywhere it appears.


> For example, you can get away without semicolons for statement
> separators if you use indentation and/or distinctive keywords.


I still think significant indentation is annoying. When code gets
moved from point A to point B, it frequently changes indentation
level. Significant indentation means I can't just let my editor
autoindent it for me.
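
To make the contrast concrete, here is a minimal sketch in Python of
the kind of autoindenter I mean, for a language whose blocks are
delimited by braces. The function name `reindent` and the `width`
parameter are invented for this example, and it assumes braces never
appear inside strings or comments. The point is that the existing
whitespace is thrown away and rebuilt from the delimiters alone,
which is exactly what cannot be done when the whitespace is the only
record of the structure.

```python
def reindent(source, width=4):
    """Recompute indentation from brace depth, ignoring existing whitespace."""
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        closes_first = line.startswith("}")
        if closes_first:
            # A line that opens with a closer is printed at the outer level.
            depth = max(depth - 1, 0)
        out.append(" " * (width * depth) + line)
        depth += line.count("{") - line.count("}")
        if closes_first:
            depth += 1  # the leading closer was already applied above
    return "\n".join(out)


pasted = "if (x) {\ny = 1;\n        if (z) {\n  w = 2;\n}\n}"
tidy = reindent(pasted)
```

However mangled the pasted block's indentation is, `tidy` comes out
with one level per enclosing brace pair.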


> In other words, what's "best" is partly a matter of commonly used
> tools, partly a matter of personal taste, partly a matter of ease of
> reading, and partly a matter of ease of parsing (the latter plays a
> role if the compiler reports a syntax error: complicated parsing
> algorithms make it harder to understand what sent the compiler into
> nirvana).


Yep. All these and more. You have to answer them for the language
in question.


Bear
