Is lex/yacc the right tool for this problem

sonyantony@hotmail.com (Sony Antony)
26 Jan 2003 16:32:59 -0500

          From comp.compilers


From: sonyantony@hotmail.com (Sony Antony)
Newsgroups: comp.compilers
Date: 26 Jan 2003 16:32:59 -0500
Organization: http://groups.google.com/
Keywords: parse, question
Posted-Date: 26 Jan 2003 16:32:59 EST

Hello:


I have a huge file with lines of the form
89234758979hfjhkjh39485893475398789576945349856789
hgdstfh3478567356h45g64674569468457694645u6ui68945
389478976984596875649864645987597954795879498657




(I just typed garbage on the keyboard, but the real data files are
similar: closely packed digits and letters without spaces,
representing different pieces of data such as name, time, date, amount, etc.)
Each of these lines is one data record. The first 3 characters give the
type of the line. For each type, the remaining positions hold
different fields packed together without spaces, in a layout
specific to that type. (None of the data is encrypted, though.)
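
To make the record layout concrete, here is a minimal Python sketch of splitting a line by its 3-character type code. The LAYOUTS table, the field names, and the column positions in it are hypothetical placeholders, not the real layouts.

```python
# Hypothetical layouts: type code -> {field name: (first column, last column)}.
# Columns are 1-based and inclusive; the real positions are domain specific.
LAYOUTS = {
    "123": {"amount": (7, 9), "customer": (11, 30)},
    "234": {"weight": (10, 13), "height": (14, 16)},
}


def split_record(line):
    """Return (type code, {name: raw text}), or (type code, None) for an unknown type."""
    kind = line[:3]
    layout = LAYOUTS.get(kind)
    if layout is None:
        return kind, None
    return kind, {name: line[a - 1:b] for name, (a, b) in layout.items()}
```

Every value comes back as raw text; converting it to a number or a date would be a separate step per field.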


I am required to extract certain fields of data when certain
conditions are met.
Typically a set of rules like
1. If type == 123 && (( amount < 35 ) || ( customer == unknown ) ) =>
fetch amount, date, duration
2. If type == 234 && (( weight > 100 ) && ( height < 567 ) ) => fetch
name, weight, height


(All the nouns used, like amount, weight, customer, etc., are domain
specific and need not concern us here. They are all positions in
the line that can be represented as Field( n, m ), which is the
string starting at column n and ending at column m of the line.)
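
As a small illustration, Field( n, m ) could be written in Python as below. This is a sketch that assumes columns are counted from 1 and that column m is included; the function name field and the sample line are made up for the example.

```python
def field(line: str, n: int, m: int) -> str:
    """Return columns n..m of line (assumed 1-based, inclusive)."""
    # Python slices are 0-based and exclude the end index, hence n - 1 and m.
    # A line shorter than m yields a truncated result rather than an error.
    return line[n - 1:m]


sample = "123abc035"
type_code = field(sample, 1, 3)   # the first 3 characters of the line
```
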

Initially I thought this would be a textbook case of a situation
where lex and yacc are the proper tools.


I thought I would generate a program with the help of lex and
yacc, in which all the rules are given in a file rules.cfg with
lines of the form (for the above 2 rules):
1. Field( 1,3) == 123 && ((Integer(Field( 7,9)) < 35 ) || ( Strip (
Field( 11,30 ) ) == "unknown" )) => Float( Field( 35, 40 )), Date(
Field( 42, 50)), Time( Field( 55,60 ) )
2. Field( 1,3) == 234 && Integer( Field( 10, 13 ) ) > 100 && Integer(
Field( 14, 16 ) ) < 567 => Field( 20, 40 ), Float( Field( 41,45)),
Float(Field( 46,50))


Essentially, for any line from the data file, when the condition
before a rule's "=>" is met, whatever is specified after the
"=>" of that rule is printed.

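The intended behaviour can be sketched without any parser generator by writing a rule directly as a pair of Python functions: a condition and a list of values to fetch. This is only an illustration. The names field, RULES, and matches are made up, rule 1 is hand-translated from the rules.cfg line above, and Date and Time are left as raw text because their formats are not specified.

```python
def field(line, n, m):
    """Columns n..m of line, assumed 1-based and inclusive."""
    return line[n - 1:m]


# Each rule is (condition, fetch). The single entry mirrors rule 1 above.
RULES = [
    (
        lambda ln: field(ln, 1, 3) == "123"
        and (int(field(ln, 7, 9)) < 35 or field(ln, 11, 30).strip() == "unknown"),
        lambda ln: [float(field(ln, 35, 40)), field(ln, 42, 50), field(ln, 55, 60)],
    ),
]


def matches(line, rules=RULES):
    """Yield the fetched values of every rule whose condition holds for line."""
    for condition, fetch in rules:
        if condition(line):
            yield fetch(line)
```

Looping over the data file and printing each result of matches(line) would give the output described above. The type test comes first in the condition, so the numeric conversions only run on lines of the matching type.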



I was wondering if this is the right kind of problem for lex and
yacc. If it is, how does one go about writing the grammar?
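
For comparison, one possible shape for such a grammar is sketched below as a hand-written recursive-descent parser in Python rather than as lex and yacc input. Each method carries its production in a comment, and those productions could serve as a starting point for yacc rules. Everything here is an assumption made for illustration: the names Parser, compile_rule, and apply_rule, the textual meaning of "==", the restriction to the operators ==, < and >, and Date and Time returning raw text.

```python
import operator
import re

# Tokens: operators, punctuation, integers, quoted strings, names.
TOKEN = re.compile(r'\s*(=>|&&|\|\||==|[<>(),]|\d+|"[^"]*"|[A-Za-z_]\w*)')
# Assumed semantics: every function receives the current line first.
# Date and Time return raw text because their formats are not specified.
FUNCS = {
    "Field": lambda line, n, m: line[n - 1:m],   # 1-based, inclusive columns
    "Integer": lambda line, s: int(s),
    "Float": lambda line, s: float(s),
    "Strip": lambda line, s: s.strip(),
    "Date": lambda line, s: s, "Time": lambda line, s: s,
}
# Assumption: "==" compares textually, so Field(1,3) == 123 can match "123".
OPS = {"==": lambda a, b: str(a) == str(b), "<": operator.lt, ">": operator.gt}

def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at column {pos + 1}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or expected not in (None, tok):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def exprs(self):               # exprs : expr (',' expr)*
        items = [self.expr()]
        while self.peek() == ",":
            self.take()
            items.append(self.expr())
        return items

    def rule(self):                # rule : cond '=>' exprs
        cond = self.cond()
        self.take("=>")
        outputs = self.exprs()
        if self.peek() is not None:
            raise SyntaxError(f"trailing input: {self.peek()!r}")
        return cond, outputs

    def cond(self):                # cond : conj ('||' conj)*
        return self.chain(self.conj, "||", lambda a, b: lambda ln: a(ln) or b(ln))

    def conj(self):                # conj : cmp ('&&' cmp)*
        return self.chain(self.cmp, "&&", lambda a, b: lambda ln: a(ln) and b(ln))

    def chain(self, operand, op, combine):
        left = operand()
        while self.peek() == op:
            self.take()
            left = combine(left, operand())
        return left

    def cmp(self):                 # cmp : '(' cond ')' | expr RELOP expr
        if self.peek() == "(":
            self.take()
            inner = self.cond()
            self.take(")")
            return inner
        left, op, right = self.expr(), self.take(), self.expr()
        if op not in OPS:
            raise SyntaxError(f"unknown comparison operator {op!r}")
        return lambda ln: OPS[op](left(ln), right(ln))

    def expr(self):                # expr : NUMBER | STRING | NAME '(' exprs ')'
        tok = self.take()
        if tok.isdigit():
            return lambda ln: int(tok)
        if tok.startswith('"'):
            return lambda ln: tok[1:-1]
        if tok not in FUNCS:
            raise SyntaxError(f"unknown function {tok!r}")
        self.take("(")
        args = self.exprs()
        self.take(")")
        return lambda ln: FUNCS[tok](ln, *(arg(ln) for arg in args))

def compile_rule(text):
    return Parser(tokenize(text)).rule()

def apply_rule(rule, line):
    cond, outputs = rule
    return [out(line) for out in outputs] if cond(line) else None
```

A rule line from rules.cfg, with its leading "1." removed, would go through compile_rule once, and apply_rule would then be called for every data line. Splitting the condition into cond, conj, and cmp makes "&&" bind tighter than "||", which is the usual precedence but is not stated in the rule format above.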


Thanks a lot
--sony

