From: arnold@skeeve.com (Aharon Robbins)
Newsgroups: comp.compilers
Date: 27 Jan 2003 23:26:23 -0500
Organization: Pioneer Consulting, Ltd.
References: 03-01-163
Keywords: parse
Posted-Date: 27 Jan 2003 23:26:23 EST

Sony Antony <sonyantony@hotmail.com> wrote:
>I have a huge file with lines of the form
>89234758979hfjhkjh39485893475398789576945349856789
>hgdstfh3478567356h45g64674569468457694645u6ui68945
>389478976984596875649864645987597954795879498657
>
>
>( I just typed garbage with the keyboard. But the real data files will
>be similar closely packed digits and alphabets without space,
>signifying different pieces of data like name, time,date, amount etc.)
>Each of these lines are data. The first 3 characters represent the
>type of the line. For each given type, the remaining positions are
>different types of data packed closely without space, in a way
>specific for that type. ( None of the data is encrypted though )
>
>I am required to extract certain fields of data when certain
>conditions are met.
>Typically a set of rules like
>1.If type == 123 && (( amount < 35 ) || ( customer == unknown ) ) =>
>fetch amount, date, duration
>2. If type == 234 && (( weight > 100 ) && ( height < 567 ) ) => fetch
>name, weight, height
You want something that will let you extract the columns into
variables and then do your logic test. You can do this with gawk and
its FIELDWIDTHS variable, something along these lines:
# this rule is run for each input line
{
    type = substr($0, 1, 3)    # first three chars
    type = type + 0            # make numeric
    if (type == 123) {
        extract1()
        logic1()
    } else if (type == 234) {
        extract2()
        logic2()
    } # etc...
}

function extract1()
{
    FIELDWIDTHS = "3 2 5 7"    # whatever
    $0 = $0                    # force $0 to be reparsed
    amount = $2
    customer = $3              # assign fields to variables for readability
    # ...
}

function logic1()
{
    if (amount == 42 && customer == "whatever")
        ...
}
....
Undoubtedly Perl, Python or Tcl could be used too. Lex & yacc are
likely to be overkill for this job. You could probably even do it in
C using some straightforward sscanf calls on your input line.
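
For comparison, here is a rough sketch of the same idea in Python, using
plain string slicing in place of FIELDWIDTHS. The column offsets in
LAYOUTS and the names extract() and select() are made up for
illustration; the real offsets depend on the actual record layouts,
which are not given in this thread. The two rules are the ones from the
original question.

```python
# Hypothetical layouts: field name -> (start, end) offsets within the line.
# These offsets are assumptions; substitute the real ones for each line type.
LAYOUTS = {
    "123": {"amount": (3, 8), "customer": (8, 15),
            "date": (15, 23), "duration": (23, 27)},
    "234": {"name": (3, 13), "weight": (13, 17), "height": (17, 21)},
}


def extract(line):
    """Slice a fixed-width line into a dict of fields, keyed on its 3-char type."""
    rec_type = line[:3]
    layout = LAYOUTS.get(rec_type)
    if layout is None:
        return None  # unknown line type
    fields = {name: line[a:b].strip() for name, (a, b) in layout.items()}
    fields["type"] = rec_type
    return fields


def select(fields):
    """Apply the two example rules; return the fetched fields, or None."""
    if fields["type"] == "123":
        if int(fields["amount"]) < 35 or fields["customer"] == "unknown":
            return {k: fields[k] for k in ("amount", "date", "duration")}
    elif fields["type"] == "234":
        if int(fields["weight"]) > 100 and int(fields["height"]) < 567:
            return {k: fields[k] for k in ("name", "weight", "height")}
    return None
```

To run it over a file, call extract() on each line with the trailing
newline stripped, skip lines where it returns None, and pass the rest to
select(). Keeping the layouts in one table means a new line type
needs only a new entry plus its rule.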
Arnold
--
Aharon (Arnold) Robbins --- Pioneer Consulting Ltd. arnold@skeeve.com
P.O. Box 354 Home Phone: +972 8 979-0381 Fax: +1 928 569 9018
Nof Ayalon Cell Phone: +972 51 297-545
D.N. Shimshon 99785 ISRAEL