Re: Formal semantics of language semantics

"Mark" <whopkins@alpha2.csd.uwm.edu>
18 Oct 2002 23:39:03 -0400

          From comp.compilers

Related articles
[7 earlier articles]
Re: Formal semantics of language semantics whopkins@alpha2.csd.uwm.edu (Mark) (2002-09-29)
Re: Formal semantics of language semantics nmm1@cus.cam.ac.uk (Nick Maclaren) (2002-10-13)
Re: Formal semantics of language semantics haberg@matematik.su.se (Hans Aberg) (2002-10-13)
Re: Formal semantics of language semantics scgupta@solomons.cs.uwm.edu (Satish C. Gupta) (2002-10-13)
Re: Formal semantics of language semantics lex@cc.gatech.edu (Lex Spoon) (2002-10-13)
Re: Formal semantics of language semantics anw@merlot.uucp (Dr A. N. Walker) (2002-10-18)
Re: Formal semantics of language semantics whopkins@alpha2.csd.uwm.edu (Mark) (2002-10-18)
Re: Formal semantics of language semantics whopkins@alpha2.csd.uwm.edu (Mark) (2002-10-18)
Re: Formal semantics of language semantics nmm1@cus.cam.ac.uk (Nick Maclaren) (2002-10-20)
Re: Formal semantics of language semantics nmm1@cus.cam.ac.uk (Nick Maclaren) (2002-10-20)
Re: Formal semantics of language semantics merlot!anw@mailbox1.ucsd.edu (Dr A. N. Walker) (2002-10-25)
Re: Formal semantics of language semantics whopkins@alpha2.csd.uwm.edu (Mark) (2002-10-25)
Re: Formal semantics of language semantics whopkins@alpha2.csd.uwm.edu (Mark) (2002-10-25)
[3 later articles]
| List of all articles for this month |
From: "Mark" <whopkins@alpha2.csd.uwm.edu>
Newsgroups: comp.compilers
Date: 18 Oct 2002 23:39:03 -0400
Organization: University of Wisconsin - Milwaukee, Computing Services Division
References: 02-09-149 02-09-162 02-10-005
Keywords: semantics
Posted-Date: 18 Oct 2002 23:39:02 EDT

From Joe Hendrix:
> Are there any notations commonly used to define the semantics of a
> programming language? (Similar to how BNF defines the syntax).


The more basic issue is to first provide a purely SYNTATIC explication
of the standard Algol-like notation before proceeding. In particular,
the fixed point part of the control flow syntax should disappear
before the semantics sees anything, and the result is that the
underlying language will actually be nothing more than a concrete
syntax for an abstract syntax which, itself, is purely
expression-based and free any considerations related to control flow.


The biggest mistake made in programming language semantics is to
try and semanticize the phenomenon of control flow, which as seen
below, is entirely syntatic in nature.


So, what we're talking about is the necessity to factor out the extra
layer of PRE-SEMANTICS that invariably gets mangled in with the rest
of the semantics, muddying what should otherwise be a relatively clean
and straightforward definition.


You have to first reinvent the old 1960's Algol control flow notation
in 21st century form based on much more fundamental notions, before
going on. Most of what you would normally recognize as programming
language semantics is gone by the time this explication is carried out.


The general issue was touched on in comp.theory under the subject header
"do we need else-clause ?".


More is also written under
The Infinitary Lambda Calculus & Programming Languages
currently at www.csd.uwm.edu/~whopkins/functional/index.html and
in:
                    JOLT - The Purely Functional Imperative Programming Language
currently at
www.csd.uwm.edu/~whopkins/seqcd/jolt.txt
(which was also posted here recently).


------------


Ariel Burbaickij <Ariel.Burbaickij@t-online.de> writes:
>Immagine rather restricted language
>without do-while, without for, without
>switch-case, without elseif and surely
>gotos are considered harmful and are not present.


Actually: imagine a language without a while either, and instead of an
if-then[-else] it has only conditional expressions (A?B:C) (using
C-like notation).


To make it interesting, we'll also allow the language to
have infinitely large expressions, such as:


                    A? (A? (A? (A? (...): Q): Q): Q): Q


Of course, there are some rather obvious difficulties with trying to
directly transcribe such expressions in a computer, not the least of
which is its extreme redundancy.


So, we'll come up with some useful aids to help us render such
expressions without having to take an infinite amount of time doing
so.


First, an expression may be tagged and referred to elsewhere.
For instance
                                        x: 6; (5 + x)


will mean 5 + 6. A tag is an abbreviation for an expression. In the
expression above, the use of x in (5 + x) could be ambiguous. It
might be easy to confuse that x for an ordinary programming language
variable.


So, to distinguish tags from ordinary variables, we'll write each
use of a tag by prefixing it by some kind of marker that tell
us to refer to the thing tagged. We'll call this marker "goto".
So the expression above would be written:


                                      x: 6; (5 + goto x)


Tags can be used within the expressions so tagged. So, infinitely
big expressions can be rendered without having to take an infinite
amount of time doing so. Thus, the expression


                    A? (A? (A? (A? (...): Q): Q): Q): Q


would be rendered as


                                          x: A? goto x: Q.


Second, we'll allow for the use of contexts. The most specialized
context would be, for instance, the "wrapper" part of a lambda
expression:
                                              (lambda z.[ ]) 5.


When the context is filled with the expression z + 6, this would yield
the expression:
                                      (lambda z.(z + 6)) 5


To simplify matters a bit, we'll use specialized notation for a context
such as the above, rendering the context


                                          (lambda z.[ ]) T
as
                                              z = T, [ ].


A general context S[...] then takes an expression E to yield a (usually)
more complex expression S[E].


We'll adopt the following convention:


                  For any context S[], the expression
                                        x: A? S[goto x]: Q,


as a context applied to the expression Q, may be denoted as:


                                          (while (A) S[]) Q


So, this provides an equivalent notation that allows us to render
the infinitely large expression:


                                A? S[A? S[A? S[...]: Q]: Q]: Q


without having to take an infinite amount of time doing so.


Third, we'll allow the use of schemata to denote an even larger class
of infinite expressions. For example, the expression:


                      x = 1,
                      n > 0?
                            x = x * n,
                            n - 1 > 0?
                                  x = x * (n - 1),
                                  (n - 1) - 1 > 0?
                                        x = x * ((n - 1) - 1),
                                        ((n - 1) - 1) - 1 > 0? (...): x:
                                        x:
                                  x:
                          x


cannot be finitely represented within the tagging scheme just described.
But if we allow tags to have parameters (and call such parametrized
tags schemata), then we can render this expression like so:


                  P[n, Q]: (n > 0)? (x *= n, P[n-1, Q]): Q


                                  x = 1, P[n, x].


To make things easier, we will adopt the following convention:


                    (A) If P[a,...,z,Q] is a scheme, then the context
                            P[a,...,z,[]] may be written P[a,...,z],[]
                            Thus
                                      P[a,...,z,Q] <--> P[a,..,z],Q
                            both mean the same thing.
                    (B) A scheme may be defined with the last parameter
                            implicit. A standard name, "return" will be
                            used wherever it's necessary to make reference
                            to this parameter.


Thus, the scheme above could also be defined as:


                  P[n, Q]: (n > 0)? (x *= n, P[n-1], Q): Q
or as
                  P[n]: (n > 0)? (x *= n, P[n-1], return): return.


Also, if-then[-else] notation will be introduced by the following
conventions:


                The context A? S[]: [] may be denoted (if A S[])
                Thus
                                  (if A S) Q means A? S[Q]: Q


Similarly,
                The context A? S[]: T[] may be denoted (if A S else T).
                Thus,
                                (if A S else T) Q means A? S[Q]: T[Q].


With these conventions, the scheme above may be further rendered as:


                    P[n]: (if (n > 0) (x *= n, P[n-1], ) return


As a final convention, we'll also say that:


                    If the body of a scheme has the form S[return]
                    then the "return" may be removed and the body
                    written as S.


Thus, the expression rendered as


                  P[n, Q]: (n > 0)? (x *= n, P[n-1, Q]): Q


                                  x = 1, P[n, x].


can now be written as:


                              P[n]: if (n > 0) (x *= n, P[n-1], )


                              x = 1, P[n], x.


There, we're almost all done. To make things even more interesting,
we should endow our enhanced notation for infinitary expressions with
the full power of a turing machine. Then we could represent a much
larger class of infinitely-sized expressions -- namely those whose
expression-subexpression composition is recursively enumerable. For
this, we'd need to introduce the concept of syntatic variables, and
syntatic operations (such as assignment statements, function
definitions and so on). These variables would look and act just
like pointers do in an ordinary programming language. But that's
all going beyond where I wanted to take this brief discussion.


So, in actual fact, given the developments above, you can see
that we actually don't need ANY of the control flow structures ...
provided that we allow


        (a) Conditional expressions A? B: C
        (b) The ability to denote and use infinitely-sized expressions.


Everything else is then just a convenient abbreviation to make (a) and
(b) easier.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.