Re: specifying semantics, was Formatting of Language LRMs

Hans-Peter Diettrich <DrDiettrich1@aol.com>
Fri, 04 Jul 2014 08:51:24 +0200

          From comp.compilers


From: Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups: comp.compilers
Date: Fri, 04 Jul 2014 08:51:24 +0200
Organization: Compilers Central
References: 14-06-010 14-06-023 14-06-025 14-06-027 14-06-030 14-06-031 14-07-003 14-07-006
Keywords: syntax, semantics, comment
Posted-Date: 04 Jul 2014 10:25:30 EDT

Stefan Monnier wrote:
>> What is conventionally called "syntax" is an artifact of the compiler
>> technology (or maybe of specification technology, see below):
>
> I disagree here. It is no accident. Same as the split between
> lexical and syntactic analysis, the division between syntax and
> (static) semantics is an engineering issue: specifying a language is a
> fairly large amount of work, so you want to split it into simpler
> parts.


I don't know of any *general* formal description of lexical analysis
that would allow describing, e.g., Fortran and Algol (or C) lexing at the
same time. In an experiment with a scannerless parser, a problem with
whitespace popped up: whitespace is sometimes required to separate
keywords from identifiers and sometimes optional:


Fortran: D O I = ...
Pascal: if x ...
C: int i ...


In the Fortran example alone, neither a longest-match nor a
shortest-match lexer is a solution. Do we then have to exclude Fortran
(and C) as context-sensitive languages? But what would be the benefit of
a description formalism that is applicable only to new languages, whose
syntax and semantics can (must!) be described using just that formalism?
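
To make the Fortran case concrete, here is a minimal Python sketch. It
assumes fixed-form Fortran, where blanks outside character constants are
not significant, and it ignores character constants and Hollerith data
entirely. The function names (strip_blanks, longest_match,
is_do_statement) are made up for this illustration. The sketch shows that
a longest-match identifier rule returns the same first token for a DO
statement and for an assignment, and that one possible way to tell them
apart is to look ahead for a comma outside parentheses after the "=".
This is only an illustration of the context dependence, not a
description of how any particular compiler does it.

```python
import re


def strip_blanks(stmt):
    # Assumption: no character constants, so every blank can be dropped.
    return stmt.replace(" ", "").upper()


def longest_match(text):
    # Classic identifier rule: a letter followed by letters and digits.
    return re.match(r"[A-Z][A-Z0-9]*", text).group(0)


def is_do_statement(stmt):
    # Treat the statement as a DO loop only if it looks like
    # DO [label] var = ... and a comma occurs outside parentheses
    # somewhere after the "=".
    text = strip_blanks(stmt)
    m = re.match(r"DO(\d*)([A-Z][A-Z0-9]*)=", text)
    if not m:
        return False
    depth = 0
    for ch in text[m.end():]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False


loop = "DO 10 I = 1,10"      # DO loop over I
assign = "DO 10 I = 1.10"    # assignment to the variable DO10I

# Longest match sees the same identifier in both statements.
print(longest_match(strip_blanks(loop)), longest_match(strip_blanks(assign)))
# The lookahead separates the two cases.
print(is_do_statement(loop), is_do_statement(assign))
```

A shortest-match rule that always splits off "DO" would fail the other
way round, on the assignment. The decision depends on text far to the
right of the token being classified.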


In the scannerless parser experiment the grammar size exploded when the
handling of whitespace was added. The well-known "dangling else" problem,
too, can be solved only at the expense of a bigger grammar, which may
then still be readable by a compiler generator, but no longer by humans.
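
The growth for the "dangling else" case can be sketched in Python. The
two toy grammars below are the usual textbook formulation, not taken
from any particular language definition: the token "c" stands for an
arbitrary condition and "s" for an arbitrary simple statement, and the
helper count_parses is a hypothetical name invented here. It counts the
distinct derivations of a token list, under the assumption that the
grammar has no empty productions and no left recursion.

```python
from functools import lru_cache

# Three productions, but ambiguous.
AMBIGUOUS = {
    "S": [("if", "c", "then", "S"),
          ("if", "c", "then", "S", "else", "S"),
          ("s",)],
}

# Six productions: M is a "matched" statement (every if has an else),
# U is an "unmatched" one. An else always binds to the nearest if.
UNAMBIGUOUS = {
    "S": [("M",), ("U",)],
    "M": [("if", "c", "then", "M", "else", "M"),
          ("s",)],
    "U": [("if", "c", "then", "S"),
          ("if", "c", "then", "M", "else", "U")],
}


def count_parses(grammar, tokens, start="S"):
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        # Number of ways the symbol sequence derives tokens[i:j].
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        if head not in grammar:
            # Terminal: must match the next token exactly.
            if i < j and tokens[i] == head:
                return seq(rest, i + 1, j)
            return 0
        total = 0
        for k in range(i + 1, j + 1):
            # The nonterminal derives the non-empty slice tokens[i:k].
            left = sum(seq(rhs, i, k) for rhs in grammar[head])
            if left:
                total += left * seq(rest, k, j)
        return total

    return seq((start,), 0, len(tokens))


nested = "if c then if c then s else s".split()
print(count_parses(AMBIGUOUS, nested))     # number of parses, small grammar
print(count_parses(UNAMBIGUOUS, nested))   # number of parses, larger grammar
```

Even in this tiny case the unambiguous version doubles the number of
productions and needs two extra nonterminals whose only purpose is
bookkeeping.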




Such problems suggest to me that the design of a language starts with
the design of an appropriate description formalism. But which formalism
could be used to describe *that* formalism?


DoDi


[You can scan Fortran with longest match, but the things you're
matching for are very context dependent. I've done it. I agree that
it's unlikely we'll find a general model for different languages. -John]

