Re: Is This Expression Parsing Feasible?

torbenm@app-3.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Thu, 28 Jun 2007 11:58:38 +0200

          From comp.compilers

Related articles
Is This Expression Parsing Feasible? Milburn.Young@gmail.com (2007-06-27)
Re: Is This Expression Parsing Feasible? martin@gkc.org.uk (Martin Ward) (2007-06-28)
Re: Is This Expression Parsing Feasible? torbenm@app-3.diku.dk (2007-06-28)
Re: Is This Expression Parsing Feasible? milburn.young@gmail.com (Milburn Young) (2007-06-28)
Re: Is This Expression Parsing Feasible? jeffrey.kenton@comcast.net (Jeff Kenton) (2007-06-29)
Re: Is This Expression Parsing Feasible? jeffrey.kenton@comcast.net (Jeff Kenton) (2007-06-30)
| List of all articles for this month |

From: torbenm@app-3.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Newsgroups: comp.compilers
Date: Thu, 28 Jun 2007 11:58:38 +0200
Organization: Department of Computer Science, University of Copenhagen
References: 07-06-066
Keywords: parse
Posted-Date: 30 Jun 2007 09:48:24 EDT

Milburn.Young@gmail.com writes:


> I'm new to expression parsing (but not programming) and have an
> algorithm in mind that I would like critiqued by the community. To
> make an expression tree from an expression ending in a terminator
> (e.g. ';' or '\0'), the lexer returns a random access collection of
> tokens. The parser would scan through the entire list of tokens (left
> to right) noting the token(s) with the highest precedence and making
> sub-trees out of those tokens (on the subsequent pass) and the tokens
> immediately to the left and right of those tokens. This would be done
> until there are no tokens not a part of the expression tree. I
> believe the complexity of this algorithm would be O(p*n), where 'p' is
> the number of precedences and 'n' is the number of tokens.


> 1) What is the type of parsing called (if anything)?


I'm not sure this technique has a name, but it seems to be a variant
of operator precedence parsing
(http://en.wikipedia.org/wiki/Operator-precedence_parser,
http://epaperpress.com/oper/download/oper.pdf).


The paper above uses a two-dimensional table which cross-indexes two
operators to find an action, but there are simpler algorithms where
each operator is simply given two precedences: One for the left side
and one for the right side. An opening parenthesis would, for example
have high precedence on the left side and low precedence on the right
side, and vice versa for a closing parenthesis. Left-associative
operators (like /) would have a slightly higher precedence to the left
than to the right, so 2/3/4 would group like (2/3)/4. For expressions
with the four basic arithmetic operators (left-associative),
parentheses and right-associative bitwise operators (& and |), a table
could be:


    op | left | right
  ----+------+------
    ( | 10 | 1
    ) | 1 | 10
    * | 9 | 8
    / | 9 | 8
    + | 7 | 6
    - | 7 | 6
    & | 4 | 5
    | | 2 | 3


> 2) Is there any reason this technique would not be feasible?


Your method should certainly be feasible as you describe it, but much
slower than stack-based OPP parsing (which does only one pass over the
string) and not really any simpler.


Torben


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.