Re: Is This Expression Parsing Feasible?

"Milburn Young" <milburn.young@gmail.com>
Thu, 28 Jun 2007 21:45:03 -0500

          From comp.compilers

Related articles
Is This Expression Parsing Feasible? Milburn.Young@gmail.com (2007-06-27)
Re: Is This Expression Parsing Feasible? martin@gkc.org.uk (Martin Ward) (2007-06-28)
Re: Is This Expression Parsing Feasible? torbenm@app-3.diku.dk (2007-06-28)
Re: Is This Expression Parsing Feasible? milburn.young@gmail.com (Milburn Young) (2007-06-28)
Re: Is This Expression Parsing Feasible? jeffrey.kenton@comcast.net (Jeff Kenton) (2007-06-29)
Re: Is This Expression Parsing Feasible? jeffrey.kenton@comcast.net (Jeff Kenton) (2007-06-30)
| List of all articles for this month |

From: "Milburn Young" <milburn.young@gmail.com>
Newsgroups: comp.compilers
Date: Thu, 28 Jun 2007 21:45:03 -0500
Organization: Compilers Central
References: 07-06-066 <200706280937.53681.martin@gkc.org.uk>
Keywords: parse
Posted-Date: 30 Jun 2007 09:49:03 EDT

On 6/28/07, Martin Ward <martin@gkc.org.uk> wrote:
> On Wednesday 27 Jun 2007 19:37, Milburn.Young@gmail.com wrote:
> > The parser would scan through the entire list of tokens (left
> > to right) noting the token(s) with the highest precedence and making
> > sub-trees out of those tokens (on the subsequent pass) and the tokens
> > immediately to the left and right of those tokens. This would be done
> > until there are no tokens not a part of the expression tree. I
> > believe the complexity of this algorithm would be O(p*n), where 'p' is
> > the number of precedences and 'n' is the number of tokens.
>
> What about parentheses in the expression?
> I assume that your parser would have to scan sub-expressions
> with no parentheses first, replace these by subtrees,
> then start again scanning for the highest precidence tokens.
> So the complexity would be O(p*n^2) since the depth
> of parentheses nesting is proportional to n in the worst case.
>
> For example:
>
> ((...(((a0 + a1) + a2) + a3) ...) + an)




Good point. I could have shown parentheses in my example, but I was
afraid I would lose even more readers in my explanation (for the same
reason I indexed my passes starting with 1, not 0). I see parentheses
as a (paired) unary operator with the highest of precedences. I
believe grouping operators would produce the only recursion/stack in
my algorithm.


In your example,
> (((((a0 + a1) + a2) + a3)))
the first pass would produce (with little 'v's to indicate the paired
operator token and letters as recursive/stack indeces):
1:
  V V V V V v v v v v
[(] [(] [(] [(] [(] [a0] [+] [a1] [)] [+] [a2] [)] [+] [a3] [)] [)] [)]


2a:
{()}
  V V V V v v v v
[(] [(] [(] [(] [a0] [+] [a1] [)] [+] [a2] [)] [+] [a3] [)] [)]


3b:
      {()}
{()}
  V V V v v v
[(] [(] [(] [a0] [+] [a1] [)] [+] [a2] [)] [+] [a3] [)]


4c:
            {()}
      {()}
{()} [+] [a3]
  V V v v
[(] [(] [a0] [+] [a1] [)] [+] [a2] [)]


5d:
                  {()}
            {()}
      {()} [+] [a3]
{()} [+] [a2]
  V v
[(] [a0] [+] [a1] [)]


6e:
                        {()}
                  {()}
            {()} [+] [a3]
      {()} [+] [a2]
{()}
[a0] [+] [a1]


7:
                        {()}
                  {()}
            {()} [+] [a3]
      {()} [+] [a2]
{()}
            V
[a0] [+] [a1]


8 (neither child was an operator):
                                    {()}
                              {()}
                        {()} [+] [a3]
                  {()} [+] [a2]
            {()}
      {+}
{a0} {a1}


Now as the recursive calls return, the tokens that have not become
tree nodes are iteratively evaluated (with the same recursive/stack
operations on grouping). Also, the cases where node.child and
node.child.child are both grouping operators would be left alone for
the optimizer.


For brevity, I believe this results in a total of twelve (12)
iterative or recursive calls (because the two remaining additions
result in an iterative call to mark the highest precedence operators
and another to discover there are no more operators).
If you count the parentheses pairs as one "expression token", the
original expression had twelve (12) expression tokens (5 parenthese
pairs, 3 addition operators, and 4 terms/variables).


Actually, I think my conceptual algorithm may have a complexity of O(p*(n+1)).


Now, do you think this algorithm is feasible? If so, is there a
pre-existing name for it? I saw a post where someone said that
recursive descent was natural because that's how most people evaluated
expressions [in math class]. However, this First-Things-First
algorithm is how I evaluate complex expressions.


Regards,
Milburn Young


"When pride comes, then comes disgrace, but with humility comes wisdom."
bProverbs 11:2


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.