From: | "Mr.E" <mr.waverlye@verizon.net> |
Newsgroups: | comp.compilers |
Date: | 11 Sep 2006 15:51:53 -0400 |
Organization: | Compilers Central |
References: | 06-09-02906-09-034 06-09-038 06-09-041 |
Keywords: | parse, Basic |
Posted-Date: | 11 Sep 2006 15:51:53 EDT |
Pascal Bourguignon wrote:
[snip]
>
> The main idea of the recursive descent parser, is that you can select
> the production rule from the first terminal.
>
> With these rules:
>
> start --> c1|c2|c3|c4|k1|k2|k3|l1|l2|d1|d2|d3
>
> c1 --> "compile"
> c2 --> "compile" "long" "if"
> c3 --> "compile" "xelse"
> c4 --> "compile" "end" "if"
> k1 --> "clear"
> k2 --> "clear" "local"
> k3 --> "clear" "local" "mode"
> l1 --> "local"
> l2 --> "local" "mode"
> d1 --> "def" "fn" <function name> [ "(" <paramater list> ")" ]
> d2 --> "def" "fn" <function name> "=" <expression>
> d3 --> "def" "fn" <function name> "USING" <functionPointer>
>
>
> from the parse-start function and the first terminal read (let's say
> it's "compile", you cannot know which rule to apply, which function to
> call.
>
> So you have to transform the grammar to make sure that for each non
> terminal the set of the first symbols derivable from that non terminal
> is disjoint from the set for the other non terminals (at least, for
> the other non terminals derivable from the same places).
>
> c --> "compile" c1|c2|c3|c4
> c1 -->
> c2 --> "long" "if"
> c3 --> "xelse"
> c4 --> "end" "if"
>
> k --> "clear" k1|kl
> k1 -->
> kl --> "local" kl1|kl2
> kl1 -->
> kl2 --> "mode"
>
> l --> "local" l1|l2
> l1 -->
> l2 --> "mode"
>
> d --> "def" "fn" <function name> d0|d1|d2|d3
> d0 -->
> d1 --> "(" <paramater list> ")"
> d2 --> "=" <expression>
> d3 --> "USING" <functionPointer>
>
>
>
> Then, for each production rule you can write a function that will know
> immediately from the current terminal symbol read (token) which
> function to call (what non-terminal production rule to invoke).
>
> parse-start:
> if token="compile" then c
>
> parse-c:
> accept "compile"
> if token="long" then c2
> else if token="xelse" then c3
> else if token="end" then c4
> else c1
>
> parse-d:
> accept "def"
> accept "fn"
> parse-function-name
> if token="(" then parse-d1
> else if token="=" then parse-d2
> else if token="USING" then parse-d3
> else parse-d0
>
> parse-d1:
> accept "("
> parse-paremter-list
> accept ")"
>
> ...
>
>
>
>
> But notice how this creates "artificial" non-terminals and their
> corresponding procedures.
>
As I've understood your parser, it is similar in design (initially) as
my scanner. I made the scanner do more of the work with recognizing
the compound keywords. I know all the combinations so the scanner
return a single token for the compound word reducing the tokens to a
terminal, simplifying the parser. As far as the parser is concerned
there are no compound keywords.
I see what you are saying about the parsing and will keep it in mind
while I'm re-drafting how I'm going to parse.
I believe I read a post from our moderator once that said that scanning
can be 20% of compilation time because the scanner has to touch every
character. My scanner is pretty efficient and should make life easier
on my parser and me.
W.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.