Re: Generating a simple hand-coded like recursive descent parser

Pascal Bourguignon <pjb@informatimago.com>
10 Sep 2006 23:43:38 -0400

          From comp.compilers

Related articles
[3 earlier articles]
Re: Generating a simple hand-coded like recursive descent parser oliverhunt@gmail.com (oliverhunt@gmail.com) (2006-09-09)
Re: Generating a simple hand-coded like recursive descent parser pjb@informatimago.com (Pascal Bourguignon) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser pjb@informatimago.com (Pascal Bourguignon) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser tommy.thorn@gmail.com (Tommy Thorn) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser ArarghMail609@Arargh.com (2006-09-11)
Re: Generating a simple hand-coded like recursive descent parser DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-09-11)
Re: Generating a simple hand-coded like recursive descent parser mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2006-09-11)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-11)
Re: Generating a simple hand-coded like recursive descent parser ArarghMail609@Arargh.com (2006-09-11)
[28 later articles]
| List of all articles for this month |

From: Pascal Bourguignon <pjb@informatimago.com>
Newsgroups: comp.compilers
Date: 10 Sep 2006 23:43:38 -0400
Organization: Informatimago
References: 06-09-029 06-09-034 06-09-038
Keywords: parse, Basic
Posted-Date: 10 Sep 2006 23:43:37 EDT

"Mr.E" <mr.waverlye@verizon.net> writes:


> oliverhunt@gmail.com wrote:
>> Mr.E wrote:
>> > There are approximately 600 keyword or keyword combinations which
>> > include compound words due to intrinsic functions.
>>
>> What do you mean by keyword combinations?
>
> Examples off the top of my head
>
> keyword or compound phase, usage
> "compile", directive on what compiler options will be used
> "compile long if", conditional compile block statement
> "compile xelse", what to do if the conditional compile expression fails
> "compile end if", end of conditially compiled block
> "clear", reinitialize global and main scoped variables for entire
> program
> "clear local", initialize variables belonging to function before use
> "clear local mode" , same as "clear local" but user global variable are
> inaccessible ( not visible to the function)
> "local", indicates start of local variables scope
> "local mode", start of local scope, procedure can not see global
> variables
> "def fn <function name> [( paramater list)]", prototype a function
> "def fn <function name> = expression", simple function
> "def fn <function name> USING functionPointer", virtual function
> prototype




The main idea of the recursive descent parser, is that you can select
the production rule from the first terminal.


With these rules:


start --> c1|c2|c3|c4|k1|k2|k3|l1|l2|d1|d2|d3


c1 --> "compile"
c2 --> "compile" "long" "if"
c3 --> "compile" "xelse"
c4 --> "compile" "end" "if"
k1 --> "clear"
k2 --> "clear" "local"
k3 --> "clear" "local" "mode"
l1 --> "local"
l2 --> "local" "mode"
d1 --> "def" "fn" <function name> [ "(" <paramater list> ")" ]
d2 --> "def" "fn" <function name> "=" <expression>
d3 --> "def" "fn" <function name> "USING" <functionPointer>




from the parse-start function and the first terminal read (let's say
it's "compile", you cannot know which rule to apply, which function to
call.


So you have to transform the grammar to make sure that for each non
terminal the set of the first symbols derivable from that non terminal
is disjoint from the set for the other non terminals (at least, for
the other non terminals derivable from the same places).


c --> "compile" c1|c2|c3|c4
c1 -->
c2 --> "long" "if"
c3 --> "xelse"
c4 --> "end" "if"


k --> "clear" k1|kl
k1 -->
kl --> "local" kl1|kl2
kl1 -->
kl2 --> "mode"


l --> "local" l1|l2
l1 -->
l2 --> "mode"


d --> "def" "fn" <function name> d0|d1|d2|d3
d0 -->
d1 --> "(" <paramater list> ")"
d2 --> "=" <expression>
d3 --> "USING" <functionPointer>






Then, for each production rule you can write a function that will know
immediately from the current terminal symbol read (token) which
function to call (what non-terminal production rule to invoke).


parse-start:
      if token="compile" then c


parse-c:
      accept "compile"
      if token="long" then c2
      else if token="xelse" then c3
      else if token="end" then c4
      else c1


parse-d:
        accept "def"
        accept "fn"
        parse-function-name
        if token="(" then parse-d1
        else if token="=" then parse-d2
        else if token="USING" then parse-d3
        else parse-d0


parse-d1:
        accept "("
        parse-paremter-list
        accept ")"


...








But notice how this creates "artificial" non-terminals and their
corresponding procedures.




My example Recursive Descent Parser Generator doesn't normalize the
grammar in such a way, so you'd have to make sure to write the grammar
such as the first set of each non-terminal group is disjoint.


It would be better to use a parser generator that did this grammar
normalization (there is a simple algorithm to do this normalization).


Then the procedures generated won't match 1-1 the production rules you
write.


Therefore I wouldn't bother with expecting a readable generated
parser. Table driven parsers are well known and work well and
efficiently.


--
__Pascal Bourguignon__ http://www.informatimago.com/


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.