Re: Generating a simple hand-coded like recursive descent parser

"oliverhunt@gmail.com" <oliverhunt@gmail.com>
9 Sep 2006 22:52:42 -0400

          From comp.compilers

Related articles
Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-08)
Re: Generating a simple hand-coded like recursive descent parser DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-09-09)
Re: Generating a simple hand-coded like recursive descent parser Juergen.Kahrs@vr-web.de (=?ISO-8859-1?Q?J=FCrgen_Kahrs?=) (2006-09-09)
Re: Generating a simple hand-coded like recursive descent parser oliverhunt@gmail.com (oliverhunt@gmail.com) (2006-09-09)
Re: Generating a simple hand-coded like recursive descent parser pjb@informatimago.com (Pascal Bourguignon) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser mr.waverlye@verizon.net (Mr.E) (2006-09-10)
Re: Generating a simple hand-coded like recursive descent parser pjb@informatimago.com (Pascal Bourguignon) (2006-09-10)
[34 later articles]
| List of all articles for this month |

From: "oliverhunt@gmail.com" <oliverhunt@gmail.com>
Newsgroups: comp.compilers
Date: 9 Sep 2006 22:52:42 -0400
Organization: Compilers Central
References: 06-09-029
Keywords: parse

Mr.E wrote:
> There are approximately 600 keyword or keyword combinations which
> include compound words due to intrinsic functions.


What do you mean by keyword combinations?


> Is there a parser generator that produces the equivalent of a
> hand-coded recursive descent parser? I'm looking for a generator
> that doesn't require an engine and doesn't use external libraries...
> just plain old C.


I've seen CoCo/R mentioned... ANTLR also produces recursive descent
(LL(*)) parsers, but converting its output to basic may be about the
least fun thing you could do...


>
> I can watch and debug recursive descent code because I can understand
> that. I cant imagine trying to debug a table driven parser or having
> to rewrite it in BASIC.


You could always write one by hand -- BASIC's grammar isn't overly
complex, and it could be a good learning experience :D


> Also are there any algorithms for AST building. Everything I've
> understand tells me that I really want to build an AST and do code
> generation from it versus trying to generate code as I go along. I
> thought I understood the process but I'm not there yet.


Well that one's up to you... you could make a completely abstract tree:
struct ASTNode {
    int isLeaf;
    void *leaf;
    struct {
        int numChildren;
        ASTNode *children;
    } branch;
}


In this you can just store the whole parse tree, alternatively you just
store the semantics of what is being parsed. In which case basically
all you are doing is building a tree that represents the important bits
of your grammar. eg (from http://en.wikipedia.org/wiki/Tiny_BASIC -- i
really don't know basic :D).
statement ::= PRINT expr-list
                                  IF expression relop expression THEN statement
                                  GOTO expression
                                  INPUT var-list
                                  LET var = expression
                                  GOSUB expression
                                  RETURN
                                  CLEAR
                                  LIST
                                  RUN
                                  END


for which you *might* make a pair of types:
enum statement_type {
    print_statement, if_statement, goto_statement, ..., end_statement
};


struct statement_s {
    statement_type type;
    union {
        struct {
            expression_list *expressions;
        } print_stat;
        struct {
            relop operator;
            expression *lhs, *rhs;
            struct statement_s *statement;
        } if_stat;
        struct {
            expression *target;
        } goto_statement;
        ...
    };
};




Now when you parse a statement you create a statement struct for the
appropriate branch, and fill in the appropriate bits. But note we
aren't storing syntax info -- all we have in the node is the actual
information we *need* to reconstruct the meaning.


Apologies for any issues in the above codei tend to use languages where
subtyping is an option :D


And other people here can probably explain this somewhat better than i
have :(


--Oliver


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.