JOLT: infinitary expression-based imperative language w/o control flow

"Alfred Einstead" <whopkins@csd.uwm.edu>
13 Oct 2002 16:38:20 -0400

          From comp.compilers

Related articles
JOLT: infinitary expression-based imperative language w/o control flow whopkins@csd.uwm.edu (Alfred Einstead) (2002-10-13)
| List of all articles for this month |

From: "Alfred Einstead" <whopkins@csd.uwm.edu>
Newsgroups: comp.compilers
Date: 13 Oct 2002 16:38:20 -0400
Organization: http://groups.google.com/
Keywords: design, functional
Posted-Date: 13 Oct 2002 16:38:19 EDT

JOLT is a family of languages whose members all share the one common
feature. They are imperative programming languages which have no
explicit control flow of any kind: no loops, no goto's, no
subroutines, not even goto statements.


They are all purely expression-based using, as its core syntax, C's
expression syntax, with the addition of 2 facilities: (1) the ability
to label subexpressions and refer to them elsewhere, and (2) a
powerful macro facility. Both of these are provided to allow one to
more concisely specify the syntatic structure of an expression without
having to write it all out.


Different members of the family may add in more features (e.g., lambda
expressions, facilities for concurrency, non-determinism, etc.)


The language is a direct embodiment of a novel approach to programming
language design & verification which makes use of a creative synthesis
between the functional and imperative programming language paradigms.


The usual distinction is made between an abstract and concrete syntax.
But the novelty is that an expression's abstract syntax may be
infinite -- an infinite parse tree. A natural analogue of state
machine and control flow emerges from this, but the naive
interpretation along those lines is defied by counterexamples in JOLT
of expressions whose 'control flow' reading would be very unusual, to
say the least.


The concrete syntactic structure is that delineated by the actual
notation used to denote the expression. In general, the abstract
syntax of Jolt is the subset of the Jolt syntax consisting of
everything except the labelling and macro facilities.


In addition, the path expresson operator S* is introduced with the following
equivalences:


                            (S C? T)*, q <--> x: S C? T x: q
                                          S*, q <--> x: S x otherwise


noting that S* = (S 1?)* = (1? S)*. This allows rational infinite
expressions to be more directly represented and recovers a semblance of
the loop.


Two examples of JOLT expressions (whose abstract syntatic structures are
infinite) are:


                    a = 0, b = 1, x: a < 4? (a++, b *= 2, x): b


whose value is 16. This looks like and would be evaluated just like the
while loop it resembles. It can also be written more directly as:


                            a = 0, b = 1, (a < 4? a++, b *= 2)*, b


The second example, however:


                  a = 2, b = 3, x + y
                  (y: a = 0, b = 1, x: a < 4? (a++, b *= 2, x): b)


(whose value is 28) would defy any ordinary interpretation along the
lines of a control flow structure. But using the path expression
operator it can still be more directly written as:


                  a = 2, b = 3, x + (a = 0, b = 1, x)
                  (x: (a < 4? a++, b *= 2)*, b)


The importance of the distinction between abstract and concrete syntax
is that the language is defined rigorously in terms of an
evaluation/reduction model that operates on the abstract syntax of an
expression. Corresponding to this model, and systematically derived
from it, is an evaluation mechanism that operates directly on the
concrete syntax, which (in fact) leads directly to a Jolt interpreter.


The evaluation mechanism consists of the abstract evaluation model with the
addition of the infrastructure needed to traverse the structure delineated
by the labelling and syntatic templates. This is, in fact, how the 'imperative'
aspect of the language emerges. Corresponding to the system of labels
contained in an expression is a finite state control (with each state
corresponding to a label), and corresponding to the system of macros is a
'call-return' facility closely analogous to that found on a more traditional
processor.


A summary of the language's syntax is provided below in EBNF notation:


ABSTRACT SYNTAX:
E: Expressions
E -> E "," E
E -> L "=" E
E -> L bin"=" E
E -> inc L
E -> L inc
E -> L
E -> E "?" E ":" E
E -> E "||" E
E -> E "&&" E
E -> !E
E -> E rel E
E -> E bin E
E -> un E
E -> &L
E -> K


L: L-Expressions
L -> *E
L -> E "[" E "]"
L -> X


X: Variable name


Operators
rel: == != <= >= < >
bin: + - * / % ^ | & << >>
un: + - ~
inc: ++ --
K: number, string or char


BASIC CONCRETE SYNTAX:
Includes the abstract syntax above, plus the following:


Program -> D* E
D -> "define" P "(" [ Q ("," Q)* ] ")" E
D -> "proc" P "(" [ Q ("," Q)* ] ")" E
E -> [Q ":"] E (Q ":" E)*
E -> Q
E -> P "(" [E ("," E)*] ")"
E -> P "(" [E ("," E)*] ")" ";" E
E -> "(" E ")"
E -> E "*"


P: macro name
Q: subexpression label


The operator precedence is the same as that in C. Macro calls of the
form P ( ... ) have the same precedence as C functions, macro calls
of the form P ( ... ) have precedence one higher than comma expressions
and labelled expressions have the highest precedence of all.


FUNCTIONAL EXTENSION:
The following additions are provided in the functional extension of the
abstract syntax:


E -> "\" X ("," X)* ["."] E
E -> E "[" E ("," E)* "]"


In the concrete syntax, the second item has the same precedence as
arrays, and the first item as the highest precedence of all.


The semantics of function applications and arrays are formally unified. All
new arrays may be initialized to a function. The built-in constant "new",
defined as (\x.0), is provided as a standard initialization. Both arrays and
functions may be updated via component assignments, A[X1, ..., Xn] = B.


CONCURRENT EXTENSION:
The following addition is provided in the concurrent extension of the
abstract syntax.


E -> "{" E (";" E)* "}"


This is a structured constant, but its components are understood to be
evaluated in parallel.


Two built-in macros are also provided:


proc put(S, V) <... send value V to channel S... >
define get(S) <... wait for a value to appear on channel S, return it... >


The precedence of the bracketed expression is the same as the
parenthesized expression. The components of this expression are
explicitly stipulated to be evaluated concurrently.


NON-DETERMINISTIC EXTENSION:
This extension, which generalises the concurrent extension further, has yet to
be fully defined. But the key is that the conditional is generalized into a
non-deterministic branch:


                                                        { A; ...; Z }
with
                                                                  { }


used to denote failure. They are defined such that the following identities
are satisfied:


                                    { A; {} } = A = { {}; A } = { A }
                        { A; { B; C } } = { A; B; C } = { { A; B }; C }
                                                          {} B = {}
                                  A { B; C } D = { A B D; A C D }


-----------


SEMANTICS (provided for the BASIC CONCRETE SYNTAX):
      Since expressions in JOLT are built out of the same syntax as C, and
since C expressions may include embedded assignments, it is important to
keep an account of which assignments occur in an expression. This leads
to the concept of an 'assignment list', which is formally equated to
the better known concept of an 'environment'.
      Therefore, two partial functions are defined:


            [E C] = The environment corresponding to C in the environment E
            (E C) = The value of C in the environment E.


To avoid non-determinism, we adopt the sequencing conventions expressed by the
following equivalences:


                          a op b <--> u = a, v = b, u op v
                          op a <--> u = a, op u
                          l op b <--> p = &l, v = b, *p op b
                          l inc <--> p = &l, (*p) inc
                          inc l <--> p = &l, inc (*p)
                          a? b: c <--> u = a, (u? b: c)
                          a || b <--> a? 1: (b? 1: 0)
                          a && b <--> a? (b? 1: 0): 0
                          !a <--> a? 0: 1
                          l <--> *&l
                          &*a <--> a
                          &a[b] <--> a + b


where p, u and v denote otherwise unused variables, l an L-expression,
and a, b expressions.


The auxillary functions are defined so as to preserve these equivalences.
These definitions are provided below.


            [E (A, B)] = [[E A] B]
            (E (A, B)] = ([E A] B)


            [E (L = B)] = [E (&L, B)] v = (E (&L,B)),
            (E (L = B)] = (E (&L, B))
            [E (L op= B)] = [E (&L, B)] v = (E (L op B)),
            (E (L op= B)) = (E (L op B))
            [E (L inc)] = [E &L] v = (E (L op 1)),
            (E (L inc)) = (E L)
            [E (inc L)] = [E &L] v = (E (L op 1)),
            (E (inc L)] = (E (L op 1))
                  where (E &L) = &v
            [E L] = [E &L]
            (E L) = (E (&L, v)), if (E &L) = &v


            [E K] = E
            (E K) = K


            [E (A? B: C)] = [E (A, B)] if (E A) != 0
                                        = [E (A, C)] if (E A) == 0
            (E (A? B: C)) = (E (A, B)) if (E A) != 0
                                        = (E (A, C)) if (E A) == 0


            [E (A || B)] = [E (A? 1: (B? 1: 0))]
            (E (A || B)) = (E (A? 1: (B? 1: 0))]
            [E (A && B)] = [E (A? (B? 1: 0): 0)]
            (E (A && B)) = (E (A? (B? 1: 0): 0))
            [E !A] = [E (A? 0: 1)]
            (E !A) = (E (A? 0: 1))


            [E (A op B)] = [E (A, B)]
            (E (A op B)) = (E A) op (E (A, B))


            [E (op A)] = [E A]
            (E (op A)) = op (E A)


            [E &*P] = [E P]
            (E &*P) = (E P)
            [E &A[B]] = [E (A + B)]
            (E &A[B]] = (E (A + B))
            [E &X] = E
            (E &X) = &X


Note that the following operations are defined where a is an array:
                  &a[i] + j = &a[i + j] = j + &a[i]
                  &a[0] = a


The value of (E v), where v is a variable is defined inductively as follows:


                            (c) = c
                            (E v = b, c) = (E ((\v.c) b))


Where b and c are values, E an environment and v a variable. In the case
where v = a[n] is a structured variable, the lambda operator is defined
as the following:


                            (\a[n].c) b = (\a.c) (\x. (x == n)? b: a[x])


EVALUATOR:
      An abstract machine is defined with the following structure:


                                                          S, E, C, D


where S represents the state (S = a, b, c, A, B or C), E the current
environment, C the code and D the dump, similar to Landin's SECD machine.


      State a is for evaluations where only the effect on the environment is
needed, state b is for evaluations where only the value is needed, and
state c is where both are needed. The following inductive assertions
are not only preserved by the rules to be listed, but are in fact the
starting point for deriving those very rules.


                          a, E, C, D ==> A, [E C], _, D
                          b, E, C, D ==> B, _, (E C), D
                          c, E, C, D ==> C, [E C], (E C), D


In state A, the value C is not used and so is not listed in the
descriptions below. Similarly, in state B, the environment is not
used or listed.


RULES:
(0) Start
        C --> b, (), C, []
(1) End
        B, N, [] --> N
(2) Start State
      (a) A, B
              S, E, (A,B), D --> a, E, A, 0 S B: D
      (b) L = B
              S, E, L = B, D --> c, E, &L, 1 S = B: D (S = a, c)
              b, E, L = B, D --> b, E, &L, 0 b B: D
      (c) L op= B
              S, E, (L op=B), D --> c, E, &L, 1 S op B: D (S = a, c)
              b, E, (L op=B), D --> b, E, L op B, D
      (d) L inc
              S, E, L inc, D --> c, E, &L, 1 S inc: D (S = a, c)
              b, E, L inc, D --> b, E, L, D
      (e) inc L
              S, E, inc L, D --> c, E, &L, 1 S INC: D (S = a, c)
              b, E, inc L, D --> b, E, L op 1, D
      (f) L
              a, E, L, D --> a, E, &L, D
              S, E, L, D --> c, E, &L, 3 S: D (S = b, c)
      (g) &L
              S, E, &*P, D --> S, E, P, D
              S, E, &A[B], D --> S, E, A + B, D
              a, E, &X, D --> A, E, D
              b, E, &X, D --> B, &X, D
              c, E, &X, D --> C, E, &X, D
      (h) K
              a, E, K, D --> A, E, D
              b, E, K, D --> B, K, D
              c, E, K, D --> C, E, K, D
      (i) A? B: C
              S, E, A?B:C, D --> c, E, A, 4 S B C: D
      (j) A || B
              S, E, A||B, D --> c, E, A, 5 S B: D
      (k) A && B
              S, E, A&&B, D --> c, E, A, 6 S B: D
      (l) A op B
              a, E, A op B, D --> a, E, (A,B), D
              S, E, A op B, D --> c, E, A, 8 S op B: D (S = b, c)
      (m) un A
              a, E, un A, D --> a, E, A, D
              S, E, un A, D --> S, E, A, 9 un: D (S = b, c)


(3) Final State
      (0) A, E, 0 S B: D --> S, E, B, D
      (1) C, E, p, 1 S = B: D --> c, E, B, 2 S = p: D
              C, E, &v, 1 S op B: D --> c, E, B, 2 S op &v E(v):D
              C, E, &v, 1 a inc: D --> A, (E v = E(v) op 1,), D
              C, E, &v, 1 c inc: D --> C, (E v = E(v) op 1,), E(v), D
              C, E, &v, 1 a INC: D --> A, (E v = E(v) op 1,), D
              C, E, &v, 1 c INC: D --> C, (E v = E(v) op 1,), E(v) op 1, D
      (2) C, E, b, 2 a = &v: D --> A, (E v = b,), D
              C, E, b, 2 c = &v: D --> C, (E v = b,), b, D
              C, E, b, 2 a op &v V: D --> A, (E v = V op b), D
              C, E, b, 2 c op &v V: D --> C, (E v = V op b), V op b, D
      (3) C, E, &v, 3 b: D --> B, E(v), D
              C, E, &v, 3 c: D --> C, E, E(v), D
      (4) C, E, 0, 4 S B C: D --> C, E, C, D
              C, E, X, 4 S B C: D --> C, E, B, D (X != 0)
      (5) C, E, 0, 5 a B: D --> a, E, B, D
              C, E, 0, 5 S B: D --> S, E, B, 7:D (S = b, c)
              C, E, X, 5 a B: D --> A, E, D (X != 0)
              C, E, X, 5 b B: D --> B, 1, D (X != 0)
              C, E, X, 5 c B: D --> C, E, 1, D (X != 0)
      (6) C, E, 0, 6 a B: D --> A, E, D
              C, E, 0, 6 b B: D --> B, 0, D
              C, E, 0, 6 c B: D --> C, E, 0, D
              C, E, X, 6 a B: D --> a, E, B, D (X != 0)
              C, E, X, 6 S B: D --> S, E, B, 7:D (X != 0, S = b, c)
      (7) B, X, 7:D --> B, 1, D (X != 0)
              B, 0, 7:D --> B, 0, D
              C, E, X, 7:D --> C, E, 1, D (X != 0)
              C, E, 0, 7:D --> C, E, 0, D
      (8) C, E', a, 8 S op B: D --> S, E', B, 9 op a: D
      (9) B, b, 9 op a: D --> B, a op b, D
              C, E, b, 9 op a: D --> C, E, a op b, D
              B, a, 9 un: D --> B, un a, D
              C, E, a, 9 un: D --> C, E, un a, D


Pseudo-code for the abstract machine is listed below:
(0) Start (with term T)
              S = b, C = T; Empty Environment, Clear Dump
(2) Start State:
        Repeat the following, based on the current values of C and S,
        until reading the final state (S = A, B or C).
        Case C:
              A,B) Dump (0 S B)
                        S = a, C = A
              L=B) if (S == b) Dump (0 S B), C = &L
                              else Dump (1 S = B), S = c, C = &L
              L op=B) if (S == b) C = (L op B)
                              else Dump (1 S op B), S = c, C = &L
              L inc) if (S == b) C = L
                              else Dump (1 S inc), S = c, C = &L
              inc L) if (S == b) C = L op 1
                              else Dump (1 S INC), S = c, C = &L
              L) if (S != a) Dump (3 S), S = c;
                    C = &L;
              &L) L = X: S = toupper(S), goto End
                      L = *P: C = P
                      L = A[B]: C = A + B
              K) S = toupper(S), goto End


              A?B:C) Dump(4 S B C), S = c, C = A
              A||B) Dump(5 S B), S = c, C = A
              A&&B) Dump(6 S B), S = c, C = A


              A op B) if (S != a) Dump(8 S op B), S = c, C = A;
                              else C = (A, B)
              un A) if (S != a) Dump(9 un);
                              C = A
(3) Final State + (1) End
        Repeat the following, based on the current Dump until the state
        reverts to a, b or c or the Dump is emptied.
        If the Dump is Empty, return C as the final value
        else Case Top Of Dump)
              0 s B) S = s, C = B
              1 s = B) S = c, Dump(2 s = C), C = B
              1 s op B) S = c, Dump(2 s op p value(p)), C = B
              1 s inc) p = C must be a pointer.
                                if (s != a) C = value(p)
                                Assign p <- value(p) op 1
                                S = toupper(s)
              1 s INC) p = C must be a pointer
                                Assign p <- value(p) op 1
                                if (s != a) C = value(p)
                                S = toupper(s)
              2 s = p) Assign p <- C
                                S = toupper(s)
              2 s op p v) Assign p <- v op C
                                      if (s != a) C = v op C
                                      S = toupper(s)
              3 s) C = value(C), S = toupper(s)
              4 s b c) C = C? b: c, S = s
              5 s b) C = C? 1: b
                            if (C == 0)
                                  if (s != a) Dump(7)
                                  S = s;
                            else
                                  S = toupper(s)
              6 s b) C = C? b: 0
                            if (C != 0)
                                  if (s != a) Dump(7)
                                  S = s;
                            else
                                  S = toupper(s)
              7) C = C? 1: 0
              8 s op b) Dump(9 op C), C = b, S = s
              9 op a) C = (a op C)
              9 un) C = un C


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.