Re: language design for parsing, was C++ intermediate representation.

mefrill@yandex.ru (Vladimir)
18 May 2005 00:51:36 -0400

          From comp.compilers


From: mefrill@yandex.ru (Vladimir)
Newsgroups: comp.compilers
Date: 18 May 2005 00:51:36 -0400
Organization: http://groups.google.com
References: 05-05-114 05-05-130
Keywords: C++, parse, design
Posted-Date: 18 May 2005 00:51:36 EDT

Quinn Tyler Jackson <quinn-j@shaw.ca> wrote in message


> If all the world uses a hammer, what about when parsing encounters
> screws?
>
> If we settle for context-free engines, how are we to parse
> context-sensitive constructions with proven tools?
>
> Take pseudoknots for example:
>
> http://www.cs.brandeis.edu/~cs178/SearlsNature2002.pdf
>
> We can hack context-sensitivity into context-free technologies, but we
> can't prove with any degree of ease (in the formal sense) that those
> hacks are correct....


[skip]


I am a little confused. Have I understood correctly? It seems that we
are talking about two different things.


I am talking about programming language parsing, not about
manipulating formal grammars and not about proving correctness, only
about parsing. What are you talking about?


As is well known, there is a kind of dualism in any discussion of
programming language (PL) parsing. The nature of a PL is mixed and
derives from two main scientific areas: informatics and linguistics.
What is a programming language to the programmer? It is a set of
keywords, operators, functions, and algorithms composed from all these
elements. On the other hand, a formal language is a set of strings
composed of symbols from the language's alphabet. These two concepts
are brought together in compiler practice.


So, from one point of view, a programming language is
context-sensitive and, from another, it is context-free! How is that
possible? Because of this dualism. One part of a programming language
is described in terms of linguistics, the other in terms of
informatics. The first part is called the syntactic part and the
other the semantic part. The syntactic part is, naturally, described
by a CF grammar to make parsing easier. The semantic part is not
really semantic. The programmer takes many syntactic constructions out
of the PL as a formal language and puts them in the semantic module.
For example, in C++ we have to declare each variable name before using
it. As is well known, this makes the C++ grammar context-sensitive. To
avoid this, we take the requirement out of the formal language
definition by dividing syntax analysis into two parts, lexical and
syntactic, and recording the symbol names in a symbol table. So, at
the least, every modern programming language is a CS language. The
main problem is to decide which part of the language to put in the
"semantic" module and which part in the syntactic one. Tradition says:
restrict the syntactic part so that the grammar is LALR(1). But for
many complex languages this is not good: we have to do too much work
in the "semantic" module. Therefore, we should extend the syntactic
part of the language by adding the ability to parse ambiguous grammars
and (most importantly) the ability to parse any grammar, not only an
LALR(1) one.
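
The split between the syntactic part and the "semantic" module can be
sketched in a few lines. This is only an illustration under stated
assumptions: the toy language, its grammar, and the names tokenize,
parse, and check_declarations are hypothetical and are not taken from
any real C++ front end. The context-free parser accepts a use of an
undeclared name; the declare-before-use rule is enforced afterwards
with a symbol table.

```python
import re

# Hypothetical toy language, used only for illustration (not C++):
#   program := stmt*
#   stmt    := "int" NAME ";"  |  NAME "=" expr ";"
#   expr    := term ("+" term)*
#   term    := NAME | NUMBER


def tokenize(text):
    """Lexical part: split the text into (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"\d+|[A-Za-z_]\w*|\S", text):
        if lexeme.isdigit():
            tokens.append(("NUMBER", lexeme))
        elif lexeme == "int":
            tokens.append(("INT", lexeme))
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append(("NAME", lexeme))
        else:
            tokens.append((lexeme, lexeme))  # punctuation is its own kind
    return tokens


def parse(tokens):
    """Syntactic part: a context-free parse that ignores declarations."""
    pos = 0
    program = []

    def expect(kind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            raise SyntaxError(f"expected {kind} at token {pos}")
        value = tokens[pos][1]
        pos += 1
        return value

    def term():
        nonlocal pos
        if pos < len(tokens) and tokens[pos][0] in ("NAME", "NUMBER"):
            kind, value = tokens[pos]
            pos += 1
            return (kind, value)
        raise SyntaxError(f"expected NAME or NUMBER at token {pos}")

    while pos < len(tokens):
        if tokens[pos][0] == "INT":
            expect("INT")
            program.append(("declare", expect("NAME")))
            expect(";")
        else:
            target = expect("NAME")
            expect("=")
            operands = [term()]
            while pos < len(tokens) and tokens[pos][0] == "+":
                pos += 1
                operands.append(term())
            expect(";")
            program.append(("assign", target, operands))
    return program


def check_declarations(program):
    """'Semantic' part: enforce declare-before-use with a symbol table."""
    symbols = set()
    errors = []
    for stmt in program:
        if stmt[0] == "declare":
            symbols.add(stmt[1])
        else:
            _, target, operands = stmt
            used = [target] + [v for k, v in operands if k == "NAME"]
            for name in used:
                if name not in symbols:
                    errors.append(f"'{name}' used before declaration")
    return errors


source = "int x; x = y + 1; int y; y = x + 2;"
program = parse(tokenize(source))           # succeeds: the syntax is fine
errors = check_declarations(program)        # reports the early use of y
print(errors)
```

The parser here accepts every statement in the sample text, including
the one that uses y too early; only the symbol-table pass rejects it.
That is the sense in which a context-sensitive rule has been moved out
of the grammar and into the "semantic" module.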


Regards,
Vladimir.

