Re: Is lex/yacc the right tool for this problem

codeworker@free.fr (Cedric LEMAIRE)
29 Jan 2003 23:49:41 -0500

          From comp.compilers


From: codeworker@free.fr (Cedric LEMAIRE)
Newsgroups: comp.compilers
Date: 29 Jan 2003 23:49:41 -0500
Organization: http://groups.google.com/
References: 03-01-163
Keywords: lex, yacc
Posted-Date: 29 Jan 2003 23:49:41 EST

> I am required to extract certain fields of data when certain
> conditions are met.
> Typically a set of rules like
> 1.If type == 123 && (( amount < 35 ) || ( customer == unknown ) ) =>
> fetch amount, date, duration
> 2. If type == 234 && (( weight > 100 ) && ( height < 567 ) ) => fetch
> name, weight, height ...


> I thought I will essentially generate a program with the help of lex &
> yacc wherein all the rules will be given in a file rules.cfg with
> lines of the form ( for the above 2 rules )
> 1. Field( 1,3) == 123 && ((Integer(Field( 7,9)) < 35 ) || ( Strip (
> Field( 11,30 ) ) == "unknown" )) => Float( Field( 35, 40 )), Date(
> Field( 42, 50)), Time( Field( 55,60 ) )
> 2. Field( 1,3) == 123 && Integer( Field( 10, 13 ) ) > 100 && Integer(
> Field( 14, 16 ) ) < 567 => Field( 20, 40 ), Float( Field( 41,45)),
> Float(Field( 46,50))


Try 'CodeWorker' at "http://codeworker.free.fr".
Your E-BNF script "rules.gen" would look like:
--------------------------
file ::=
        #continue // a parsing error is raised if the rest of the sequence doesn't match
        => local iLineCounter = 0; // counter used to index the fetched lines
        [
                int3:type // read a 3-digit integer and put it into a local variable
                // add a subnode item to the parse tree 'project.lines', seen as
                // an array of lines;
                // 'handleLine<type>' is a template clause, whose instantiations
                // are defined below
                handleLine<type>(project.lines[iLineCounter])
                => increment(iLineCounter); // next line
        ]*
        #empty; // 'end of file' must be encountered


// instantiated template clause for type = "123";
// the clause expects a tree node to store fetched values
handleLine<"123">(currentLine : node) ::=
        #continue // the pattern-matching must succeed
        // the current line is passed to the rules
        [rule123_1(currentLine) | rule123_2(currentLine)];


// instantiate the template clause for all other types. The advantage of
// this approach is that if you forget a type, or if some types are added
// later, an error is thrown when the corresponding template clause isn't
// instantiated
...


rule123_1(currentLine : node) ::=
        [#readChar]3 // skip columns 4-6, which precede 'amount'
        int3:amount // columns 7-9
        [
                #check($amount < 35$)
                [#readChar]21 // skip columns 10-30, so that both alternatives stop at column 31
        |
                #readChar // skip column 10, which precedes 'customer'
                [#readChar]20:customer // columns 11-30
                => trim(customer); // assuming 'Strip()' means to ignore surrounding spaces
                #check(customer == "unknown")
        ]
        #continue // the antecedent of the rule is valid, so the consequent must match
        [#readChar]4 // skip columns 31-34, which precede the 'amount' value
        float6:currentLine.amountValue // columns 35-40; populates the parse tree
        #readChar // skip column 41, which precedes 'date'
        date:currentLine.date // populates the parse tree
        [#readChar]4 // skip columns 51-54, which precede 'duration'
        duration:currentLine.duration // populates the parse tree
        // some characters to bypass at the end of the line?
        ;


rule123_2(currentLine : node) ::= ...




//------------------------
// lexical clauses
//------------------------


// 3-digit integer
int3 ::= ['0'..'9']3;


// 6-char floating-point number. It looks complicated, but it is not: most
// of the code is there to enforce the 6-char constraint.
float6 ::=
        => local iLength = 0; // length of the float must be 6 exactly
        [['+' | '-'] => increment(iLength);]? // sign is optional
        [#check($iLength < 6$) '0'..'9' => increment(iLength);]*
        [#check($iLength < 6$) '.' => increment(iLength);]?
        [#check($iLength < 6$) '0'..'9' => increment(iLength);]*
        #check(iLength == 6); // must occupy 6 chars exactly


date ::= // your date format
duration ::= // your duration format
-------------------
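
For comparison, here is a minimal Python sketch of the same fixed-column
logic as rule 1. It is not CodeWorker output and not part of the script
above. The helper names (field, float6, rule123_1) and the returned
dictionary are only illustrative assumptions. Columns are 1-based and
inclusive, as in the Field(first, last) notation of the question, and the
date and duration are returned as raw text because their formats are not
specified.

```python
import re
from typing import Optional

# Optional sign, digits, optional '.', digits (same shape as the float6 clause).
FLOAT_CHARS = re.compile(r"[+-]?[0-9]*\.?[0-9]*")
INT3 = re.compile(r"[0-9]{3}")


def field(line: str, first: int, last: int) -> str:
    """Return columns first..last of line (1-based, inclusive)."""
    return line[first - 1:last]


def float6(text: str) -> Optional[float]:
    """Parse a float that occupies exactly 6 characters, else return None."""
    if len(text) != 6 or not FLOAT_CHARS.fullmatch(text):
        return None
    # 6 characters with at most one sign and one dot leave at least 4 digits.
    return float(text)


def rule123_1(line: str) -> Optional[dict]:
    """Return the fetched fields if rule 1 applies to line, else None."""
    if field(line, 1, 3) != "123":
        return None
    amount_text = field(line, 7, 9)
    if not INT3.fullmatch(amount_text):
        return None
    customer = field(line, 11, 30).strip()
    if not (int(amount_text) < 35 or customer == "unknown"):
        return None
    # The antecedent holds, so a malformed consequent is an error
    # (the role played by '#continue' in the script).
    value = float6(field(line, 35, 40))
    if value is None:
        raise ValueError("columns 35-40 do not hold a 6-char float")
    return {
        "amountValue": value,
        "date": field(line, 42, 50),      # raw text, format unspecified
        "duration": field(line, 55, 60),  # raw text, format unspecified
    }
```

Because every field is addressed by absolute columns here, the result does
not depend on which side of the '||' succeeded. The BNF version reads the
line sequentially, so it has to keep track of the cursor position itself.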


To use the extracted data, there are several possibilities:
        - you want to generate some files from these data -> try the
          'source-to-source translation' or the 'pattern' scripts of
          "CodeWorker",
        - you want to take the data back into a programming language:
                - a C++ binding is provided by 'CodeWorker' via external
                  functions (see the corresponding chapter of the
                  documentation),
                - "rules.gen" may be translated to C++ in "rules.cpp/h"
                  (option '-c++' on the command line).




Regards,


Cedric

