You, too, can look at strings.

cl@lgc.com (Cameron Laird)
Wed, 20 Feb 91 15:02:04 GMT

          From comp.compilers

Newsgroups: comp.unix.questions,comp.unix.programmer,comp.compilers
From: cl@lgc.com (Cameron Laird)
Keywords: C, lex
Organization: Landmark Graphics Corp., Houston, Tx
References: <1991Feb12.144738.11530@lgc.com>
Date: Wed, 20 Feb 91 15:02:04 GMT

I asked for help extracting string constants from source code.
I summarize the responses I received:
1. My own approach was to write (approximately)
echo 's/"[^"]*$/"/
s/[^"]*"/"/' >/tmp/string_script
grep '".*"' | tee /tmp/string_list | \
sed -f /tmp/string_script | ...
rm /tmp/string_script
        as part of a filter. The filter does these things:
        a. puts a grep-listing (not egrep, not fgrep, but grep)
                of all lines with at least two "-s into /tmp/string_list,
                for my later convenience in examining the contexts where
                the strings occur; and
        b. copies what's left of those lines after throwing away
                everything before the first " and after the last " to
                stdout.
        This was something I knew how to write in a few minutes,
        and it works well enough, although it knows nothing about
        the syntax of C beyond looking for a pair of "-s.
2. various folks suggested combinations of
        {m,}xstr--available on uunet:bsd-sources/pgrm/{m,}xstr/*
                I thought this had possibilities, but didn't
                work with it much.
        cxref
                I didn't find any quick way to make this do
                something useful to me.
        strings--this was definitely not what I had in
                mind (I'm thinking about source code, and,
                as far as I'm concerned, strings is for working
                with object files), but I've invoked
                strings hundreds of times for other chores,
                and I'm happy to give it a bit of publicity.
3. a few folks wrote to say that perl could do it in
        one line; no one delivered such a line, but I didn't
        ask. Does perl remind anyone else of APL? That's not
        entirely a bad thing ...
4. comp.compilers publishes each month a list of sites that
        distribute lexical analyzers and such. I haven't checked this
        list. I also received the advice that, "At site
        primost.cs.wisc.edu (128.105.2.115) in directory
        /pub/comp.compilers are files called *grammar.Z
        They contain grammars for lex/yacc for c, c++ ftn
        and pascal. . . ."
5. a Swedish HPUX user reported that he relies on findstr,
        in the NLS (Natural Language Support) package that is part
        of HPUX.
6. William A. Hoffman posted the kind of lapidary answer I expected
        from the net: a couple dozen lines, definitive (in some sense),
        no-nonsense, functional, and a starting-point for yet more
        refinements (or arguments).


... string.lex
--------------------------------------------------------
string \"([^"\n]|\\["\n])*\"
%%
{string}        { printf("%s\n", yytext); return(1); }
\n              ;
.               ;
%%
main()
{
        int i;

        /* yylex() returns 1 for each string and 0 at end of input */
        while (i = yylex())
                ;
        return 0;
}

yywrap()
{
        return 1;       /* no further input files */
}
------------------------------------------------------------
to run just:
lex string.lex
cc lex.yy.c -o string
cat *.c | ./string


        The moderator noted that this deserved to be beefed up "... to
        handle character constants and comments ..."
7. One reader wrote that he'd send a finite-state machine which
        models C syntax as soon as he found his copy. I haven't heard
        from him since. I'll pass it along when it arrives.
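
        For readers who want something to poke at in the meantime, here
        is a minimal sketch in C of a hand-written state machine that
        pulls string literals out of source text while skipping
        character constants and comments. It is neither the machine
        promised in item 7 nor Hoffman's code. The names extract_strings,
        put, and the state labels are made up for illustration. It
        assumes the whole source is already in memory as a NUL-terminated
        string, and it ignores trigraphs and the preprocessor. It also
        skips // comments, which some newer compilers accept.

```c
#include <stddef.h>

/* Scanner states for a small hand-written finite-state machine. */
enum state { CODE, STRING, CHARLIT, BLOCK_COMMENT, LINE_COMMENT };

/* Append one character to out if room remains (space is kept for the NUL). */
static void put(char *out, size_t outsz, size_t *len, char c)
{
    if (*len + 1 < outsz)
        out[(*len)++] = c;
}

/*
 * Hypothetical helper: copy every string literal found in src into out,
 * one per line, quotes included. Returns the number of literals seen.
 */
size_t extract_strings(const char *src, char *out, size_t outsz)
{
    enum state st = CODE;
    size_t len = 0, count = 0;
    const char *p;

    for (p = src; *p != '\0'; p++) {
        switch (st) {
        case CODE:
            if (*p == '"') { st = STRING; put(out, outsz, &len, *p); }
            else if (*p == '\'') st = CHARLIT;
            else if (*p == '/' && p[1] == '*') { st = BLOCK_COMMENT; p++; }
            else if (*p == '/' && p[1] == '/') { st = LINE_COMMENT; p++; }
            break;
        case STRING:
            put(out, outsz, &len, *p);
            if (*p == '\\' && p[1] != '\0') {
                p++;                            /* escaped char, even a " */
                put(out, outsz, &len, *p);
            } else if (*p == '"') {
                put(out, outsz, &len, '\n');    /* literal is complete */
                count++;
                st = CODE;
            }
            break;
        case CHARLIT:
            if (*p == '\\' && p[1] != '\0') p++;    /* skip escaped char */
            else if (*p == '\'') st = CODE;
            break;
        case BLOCK_COMMENT:
            if (*p == '*' && p[1] == '/') { st = CODE; p++; }
            break;
        case LINE_COMMENT:
            if (*p == '\n') st = CODE;
            break;
        }
    }
    if (outsz > 0)
        out[len] = '\0';
    return count;
}
```

        Called on the text  puts("ok"); /* "no" */  with a large enough
        buffer, this should return 1 and leave "ok" followed by a newline
        in the buffer. The quoted word inside the comment is skipped.
        Unterminated strings are not diagnosed; the scanner simply keeps
        reading until the next unescaped ".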
My apologies to Henry Spencer for misremembering his name as "Harry".


Thanks, all.
--
Cameron Laird USA 713-579-4613
cl@lgc.com USA 713-996-8546
--

