Fri, 21 May 2021 14:14:37 -0000 (UTC)

Related articles |
---|

About finding the start symbol of a grammar ecosta.tmp@gmail.com (Eduardo Costa) (2021-05-21) |

Re: About finding the start symbol of a grammar 563-365-8930@kylheku.com (Kaz Kylheku) (2021-05-21) |

Re: About finding the start symbol of a grammar DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2021-05-21) |

Re: About finding the start symbol of a grammar anton@mips.complang.tuwien.ac.at (2021-05-21) |

Re: About finding the start symbol of a grammar drikosev@gmail.com (Ev. Drikos) (2021-05-22) |

Re: About finding the start symbol of a grammar gah4@u.washington.edu (gah4) (2021-05-22) |

From: | Kaz Kylheku <563-365-8930@kylheku.com> |

Newsgroups: | comp.compilers |

Date: | Fri, 21 May 2021 14:14:37 -0000 (UTC) |

Organization: | A noiseless patient Spider |

References: | 21-05-015 |

Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="88660"; mail-complaints-to="abuse@iecc.com" |

Keywords: | parse, comment |

Posted-Date: | 21 May 2021 11:17:15 EDT |

On 2021-05-21, Eduardo Costa <ecosta.tmp@gmail.com> wrote:

*> Hey guys,*

*>*

*> I've been lately dealing with a parser generator for LL grammars, and since*

*> it's inception I've always been blindy assuming the first element read from*

*> within the input file is going to be the start symbol or starting rule.*

*>*

*> So I've been wondering all this time, just out of curiosity, if there exists a*

*> method or algorithm to find out the start symbol of a given grammar?*

*>*

*> I guess the answer is no.*

Surely, the start symbol of a context-free grammar is one which appears

only in the left hand side of a rule. If there is such a unique symbol,

it must be /the/ start symbol.

*> While there would exist grammars we could recursively check to find out which*

*> it's start symbol is (i.e.: it's the only rule that used the rest of them,*

*> where checking every other resulted in dangling rules that weren't even called*

*> in), there might be other grammars for which more than one rule yields full*

*> coverage (all of these obviously defining different languages) and so leading*

*> to ambiguity.*

Ambiguity doesn't imply there is no algorithm to find a start symbol,

but that the algorithm must be able to report situations like the

presence of multiple start symbols, or no start symbols.

On the face of it, this problem does not smell of undecidability, or

even NP completeness. Where do you suspect is the difficulty?

It seems like this is a fairly trivial property of a graph, type of

thing.

Whether rules are dangling is also a graph property: is the graph

connected.

*> I only contemplate a simple coverage test, even though other techniques could*

*> exist, again, all of them leading to a point where we couldn't ascertain if*

*> one or the other is what the user meant.*

But tha seems like an identifiable point where the algorithm can stop

and announce a decision. Then diagnostics can be issued.

--

TXR Programming Language: http://nongnu.org/txr

Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

[I have seen useful grammars where the start symbol also appears in the RHS of a rule.

Think of the standard expression grammar.

You pick the start symbol that gives you the language you want to parse. -John]

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.