From: | Pascal Bourguignon <pjb@informatimago.com> |
Newsgroups: | comp.compilers |
Date: | 10 Sep 2006 23:43:38 -0400 |
Organization: | Informatimago |
References: | 06-09-029 06-09-034 06-09-038 |
Keywords: | parse, Basic |
Posted-Date: | 10 Sep 2006 23:43:37 EDT |
"Mr.E" <mr.waverlye@verizon.net> writes:
> oliverhunt@gmail.com wrote:
>> Mr.E wrote:
>> > There are approximately 600 keyword or keyword combinations which
>> > include compound words due to intrinsic functions.
>>
>> What do you mean by keyword combinations?
>
> Examples off the top of my head
>
> keyword or compound phase, usage
> "compile", directive on what compiler options will be used
> "compile long if", conditional compile block statement
> "compile xelse", what to do if the conditional compile expression fails
> "compile end if", end of conditially compiled block
> "clear", reinitialize global and main scoped variables for entire
> program
> "clear local", initialize variables belonging to function before use
> "clear local mode" , same as "clear local" but user global variable are
> inaccessible ( not visible to the function)
> "local", indicates start of local variables scope
> "local mode", start of local scope, procedure can not see global
> variables
> "def fn <function name> [( paramater list)]", prototype a function
> "def fn <function name> = expression", simple function
> "def fn <function name> USING functionPointer", virtual function
> prototype
The main idea of the recursive descent parser, is that you can select
the production rule from the first terminal.
With these rules:
start --> c1|c2|c3|c4|k1|k2|k3|l1|l2|d1|d2|d3
c1 --> "compile"
c2 --> "compile" "long" "if"
c3 --> "compile" "xelse"
c4 --> "compile" "end" "if"
k1 --> "clear"
k2 --> "clear" "local"
k3 --> "clear" "local" "mode"
l1 --> "local"
l2 --> "local" "mode"
d1 --> "def" "fn" <function name> [ "(" <paramater list> ")" ]
d2 --> "def" "fn" <function name> "=" <expression>
d3 --> "def" "fn" <function name> "USING" <functionPointer>
from the parse-start function and the first terminal read (let's say
it's "compile", you cannot know which rule to apply, which function to
call.
So you have to transform the grammar to make sure that for each non
terminal the set of the first symbols derivable from that non terminal
is disjoint from the set for the other non terminals (at least, for
the other non terminals derivable from the same places).
c --> "compile" c1|c2|c3|c4
c1 -->
c2 --> "long" "if"
c3 --> "xelse"
c4 --> "end" "if"
k --> "clear" k1|kl
k1 -->
kl --> "local" kl1|kl2
kl1 -->
kl2 --> "mode"
l --> "local" l1|l2
l1 -->
l2 --> "mode"
d --> "def" "fn" <function name> d0|d1|d2|d3
d0 -->
d1 --> "(" <paramater list> ")"
d2 --> "=" <expression>
d3 --> "USING" <functionPointer>
Then, for each production rule you can write a function that will know
immediately from the current terminal symbol read (token) which
function to call (what non-terminal production rule to invoke).
parse-start:
if token="compile" then c
parse-c:
accept "compile"
if token="long" then c2
else if token="xelse" then c3
else if token="end" then c4
else c1
parse-d:
accept "def"
accept "fn"
parse-function-name
if token="(" then parse-d1
else if token="=" then parse-d2
else if token="USING" then parse-d3
else parse-d0
parse-d1:
accept "("
parse-paremter-list
accept ")"
...
But notice how this creates "artificial" non-terminals and their
corresponding procedures.
My example Recursive Descent Parser Generator doesn't normalize the
grammar in such a way, so you'd have to make sure to write the grammar
such as the first set of each non-terminal group is disjoint.
It would be better to use a parser generator that did this grammar
normalization (there is a simple algorithm to do this normalization).
Then the procedures generated won't match 1-1 the production rules you
write.
Therefore I wouldn't bother with expecting a readable generated
parser. Table driven parsers are well known and work well and
efficiently.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Return to the
comp.compilers page.
Search the
comp.compilers archives again.