syslog parser

"Michael B. Allen" <mballen@NOSPAM_erols.com>
19 Feb 2000 00:33:30 -0500

          From comp.compilers

From: "Michael B. Allen" <mballen@NOSPAM_erols.com>
Newsgroups: comp.compilers
Date: 19 Feb 2000 00:33:30 -0500
Organization: Compilers Central
Keywords: parse, question

Hello,


I thought it might be a cool idea to use flex/bison to parse all the
config files on my Linux system for interfacing with an XML parser
front-end and exporting them as objects through CORBA. So I ran out and
bought the O'Reilly book on lex and yacc, read through chapter 3, and
started writing a parser for syslog.conf. I'm a little confused,
however, and thought that you guys would probably have some good advice
for me.


The format of syslog.conf is like this:


# These lines
# are comments
*.info;one,two.priority;three.*;mail.none /var/log/messages


I could parse the selector list, the comments, and the filename, but
how do I incorporate all the parsing pieces into one parser? If I have
rules for the filter, how do I turn around and parse the filename? The
filename is actually an "action", meaning it could be a command with
hyphens and so forth. It should be interpreted as a generic text
string. Must I use start conditions? I guess I could use the fact that
the filter has no space, or that there is no ';' before the space,
etc., but how do I incorporate the comments? If I put code in for a
comment, the parser properly finds the comment but then reports a parse
error. Similarly, if I simply enter a single word like 'hello\n', I get
a parse error. Is the newline causing a problem? I understand how
reductions are triggered and how the parser uses a stack to parse the
tree, but I'm not sure I understand exactly how the rules hierarchy
works when parsing different types of sections, i.e. comments, sections
of plain text that could be anything, etc.


Should I have the lexer doing more of the work?


Here's my code. I would appreciate any help.


Thanks,
Michael B. Allen
mballen@erols.com


------------- LEXER


%{
/*
  * syslog.conf lexer
  *
  */
#include "syslog.conf.tab.h"
int yywrap( void );
%}


%%


[ \t] ;
#.* { printf( "found a comment\n" ); }
[a-z][a-zA-Z0-9]* { return WORD; }
\n { return 0; }
. { return yytext[0]; }


%%


int yywrap()
{
  return 1;
}


------------- PARSER


%{
/*
  * syslog.conf parser
  *
  */
#include <stdio.h>
%}
%token WORD


%%


selector_set: selector_set ';' selector
        | selector
        ;
selector: word '.' WORD { printf( "found a selector\n" ); }
        ;
word: word ',' WORD { printf( "found a word member\n" ); }
        | '*' { printf( "found a *\n" ); }
        | WORD { printf( "found a word\n" ); }
        ;


%%


int main( void )
{
        while( 1 ) {
                yyparse();
        }
}
int yyerror( char *s )
{
        fprintf( stderr, "%s\n", s );
        return 0;
}


------------- INPUT


# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console


# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none /var/log/messages


# The authpriv file has restricted access.
authpriv.* /var/log/secure


# Log all the mail messages in one place.
mail.* /var/log/maillog


# Everybody gets emergency messages, plus log them on another
# machine.
*.emerg *


# Save mail and news errors of level err and higher in a
# special file.
uucp,news.crit /var/log/spooler


------------- TEST RUN
[miallen@angus parsers]$ syslog.parser
one,two,three.book;four.magazine;five,six.*;*.journal
found a word
found a word member
found a word member
found a selector
found a word
found a selector
found a word
found a word member
parse error
parse error
found a *
found a selector

[Looks like you're pretty close. I'd make white space return a token,
and probably use a flex %x start state to lex the filename after white
space as a single token. I'm a great believer in using lexical hackery
to make the parser's job easier. -John]






