Re: Designing a language in which a class can define (not just overload) operators

vidar@hokstad.name (Vidar Hokstad)
4 Sep 2003 22:47:29 -0400

          From comp.compilers

Related articles
Designing a language in which a class can define (not just overload) news@chunk.com.au (Richard Cartwright) (2003-08-20)
Re: Designing a language in which a class can define (not just overloa joshualevy@yahoo.com (2003-08-23)
Re: Designing a language in which a class can define (not just overloa tmk@netvision.net.il (2003-08-23)
Re: Designing a language in which a class can define (not just overloa rkrayhawk@aol.com (2003-09-04)
Re: Designing a language in which a class can define (not just overloa vidar@hokstad.name (2003-09-04)
Re: Designing a language in which a class can define (not just overloa nmm1@cus.cam.ac.uk (2003-09-09)
Re: Designing a language in which a class can define (not just overloa derkgwen@HotPOP.com (Derk Gwen) (2003-09-09)
| List of all articles for this month |

From: vidar@hokstad.name (Vidar Hokstad)
Newsgroups: comp.compilers
Date: 4 Sep 2003 22:47:29 -0400
Organization: http://groups.google.com/
References: 03-08-062 03-08-083
Keywords: syntax, OOP
Posted-Date: 04 Sep 2003 22:47:28 EDT

tmk@netvision.net.il (Michael Tiomkin) wrote in message news:03-08-083...
> > [Write your parser with generic expression rules like this:
> >
> > expr: expr OP5 expr /* precedence 5 operator */
> >
> > Then when you define a new operator at level 5, have the lexer return
> > an OP5 for it with semantic info so the parser can figure out which
> > operator it was. I'm pretty sure this trick was used in extensible
> > languages before 1980. -John]
>
> Unfortunately, this will need 100 rules for 100 priorities.
> [It's true. But since a language with 100 priorities would be unusable,
> so what? -John]


My solution to this problem years ago was to make the parser entirely
dynamic. Instead of hardwiring rules to specific priorities, I defined
almost the entire language I was working on in terms of operators with
variable priorities. I started out with arithmetic expressions, and
then went a little bit overboard and made everything including control
structures expressions that were defined by:


    - The lexer symbol for the operator
    - The type (to make it flexible enough to handle constructs that
weren't "traditional" operators, I added postfix types with more than
one right operand, to handle if <condition> <block> for instance)
    - Priority
    - Whether this is a "paired" operator (I was treating "(" + ")", "{"
+ "}" etc. as operators as well)
    - Whether or not the operator should be present in the parse tree
(again "(" + ")", for instance, shouldn't in my case)
    - Associativity of the operator.


Parsing this requires a quite simple function. It is very easy to
implement recursively, and I believe the basic algorithm is presented
in the dragon book (I might be mistaken, though, haven't read it in
years), and is present in quite a few algorithm text books (look for
the typical arithmetic calculator examples). Extending it for more
"unusual" constructs, like I did requires a bit of thought, but not
that much work.


It's a very flexible approach, but in hindsight I woud caution anyone
against using it for a whole language unless you have a compelling
reason to - juggling priorities to get the right behaviour can get
quite "interesting", and quickly eat up any advantage the simplicity
of the algorithm gives you..


If I remember correctly, I ended up at around 40 priorities to get
everything working, so I'm not sure I agree completely with the
sentiment that 100 priorities would make a language unusable - it
depends on what subset of the language you apply a priority based
expression parser to. For instance, I believe you'd easily end up with
well above 100 priorities if you'd for some twisted reason want to
parse C++ in this manner.


Vidar Hokstad


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.