Related articles |
---|
advice on lexing/parsing split (novice) sfinnie@sqf.hp.com (Scott Finnie) (1998-07-17) |
Re: advice on lexing/parsing split (novice) qjackson@wave.home.com (Quinn Tyler Jackson) (1998-07-20) |
From: | "Quinn Tyler Jackson" <qjackson@wave.home.com> |
Newsgroups: | comp.compilers |
Date: | 20 Jul 1998 17:09:42 -0400 |
Organization: | Compilers Central |
References: | 98-07-125 |
Keywords: | lex, parse |
>Please forgive what is probably a very basic question...
There are no basic questions.
>Our thoughts at present are
>
>1. Build a generic lexer capable of tokenising the input.
>2. Build / extend parsers to handle the specific grammars of each file.
>My questions are
>
>1. Is this a sensible approach to take?
>2. Assuming it is, any guidance on the level to pitch tokens at? The
>   two options we identified were
>   (a) tokens are block delimiters (entries in square brackets,
>       e.g. [entry]) and values (e.g. name-value pairs);
>   (b) tokens would be complete sections; i.e. the block delimiter
>       ([entry]) and all associated attribute values.
If your basic format is something like:
[foobar]
baz=quux
bar=DES broken
buz=see the NSA tremble
[somemorefoo]
somemorebaz=somemorequux
[... etc ...]
then you are probably looking at something like the following grammar:
// Visual Parse++ Grammar
%expression Main
'[ \t\r\n\f\b]+' %ignore; // whitespace
'\[[ a-zA-Z0-9_]+\]' TOKEN_BLOCK_HEADER, '[HEADER]';
'[ a-zA-Z0-9_]+' TOKEN_STRING, 'STRING';
'=' TOKEN_ASSIGNMENT, '=';
%production S
S S -> block_unit;
bu_full block_unit -> block block_unit;
bu_empty block_unit -> ;
block_simple block -> '[HEADER]' assign_list;
alist_simple assign_list -> list_item assign_list;
alist_empty assign_list -> ;
item_simple list_item -> 'STRING' '=' 'STRING';
A screen shot of the parse tree generated by the above grammar is available
at:
http://www.qtj.net/~quinn/pics/sfinnie_parse.gif
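If you would rather see the same split without a parser generator, here is a minimal hand-written sketch in Python. It is not Visual Parse++ output, only my rough translation of the grammar above: the lexer emits headers, strings and '=' (your option (a)), and the parser groups them into blocks. The names `TOKEN_SPEC`, `tokenize` and `parse` are made up for illustration. I am also assuming that a header must have literal square brackets around its name and that a string never starts with a space or runs across a line break.

```python
import re

# Token patterns, tried in order. The names loosely mirror the grammar's tokens.
# Assumption: a header is a bracketed name; a string cannot start with a space.
TOKEN_SPEC = [
    ("HEADER", r"\[[ A-Za-z0-9_]+\]"),
    ("ASSIGN", r"="),
    ("STRING", r"[A-Za-z0-9_][ A-Za-z0-9_]*"),
    ("SKIP", r"[ \t\r\n\f]+"),  # whitespace between tokens is ignored
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Yield (kind, value) pairs; whitespace tokens are dropped."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            # Trailing blanks inside a STRING (e.g. "bar " before '=') are trimmed.
            yield match.lastgroup, match.group().strip()


def parse(text):
    """Return {block name: {key: value}} for the whole input.

    block_unit  -> block block_unit | (empty)
    block       -> HEADER assign_list
    assign_list -> list_item assign_list | (empty)
    list_item   -> STRING '=' STRING
    """
    tokens = list(tokenize(text))
    blocks = {}
    i = 0
    while i < len(tokens):
        kind, value = tokens[i]
        if kind != "HEADER":
            raise SyntaxError(f"expected a [header], got {value!r}")
        section = blocks.setdefault(value[1:-1].strip(), {})
        i += 1
        while i < len(tokens) and tokens[i][0] == "STRING":
            item = tokens[i:i + 3]
            if [k for k, _ in item] != ["STRING", "ASSIGN", "STRING"]:
                raise SyntaxError(f"malformed assignment near {tokens[i][1]!r}")
            section[item[0][1]] = item[2][1]
            i += 3
    return blocks


SAMPLE = """[foobar]
baz=quux
bar=DES broken
buz=see the NSA tremble
[somemorefoo]
somemorebaz=somemorequux
"""

if __name__ == "__main__":
    for name, entries in parse(SAMPLE).items():
        print(name, entries)
```

Keeping the tokens small like this means the lexer can stay generic across your files, and only `parse` has to know what a block looks like in each one.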
Cheers,
Quinn