syslog parser

"Michael B. Allen" <>
19 Feb 2000 00:33:30 -0500

          From comp.compilers

From: "Michael B. Allen" <>
Newsgroups: comp.compilers
Date: 19 Feb 2000 00:33:30 -0500
Organization: Compilers Central
Keywords: parse, question


I thought it might be a cool idea to use flex/bison to parse all the
config files on my Linux system, so that they could interface with an
XML parser front end and be exported as objects through CORBA. So I ran
out and bought the O'Reilly book on lex and yacc, read through chapter
3, and started writing a parser for syslog.conf. I'm a little confused,
however, and thought that you guys would probably have some good advice
for me.

The format of syslog.conf is like this:

# These lines
# are comments
*.info;one,two.priority;three.*;mail.none /var/log/messages

I can parse the selector list, the comments, and the filename
individually, but how do I combine all of these pieces into one parser?
If I have rules for the filter, how do I turn around and parse the
filename? The filename is actually an "action", meaning it could be a
command with hyphens and so forth, so it should be interpreted as a
generic text string. Must I use start conditions? I guess I could use
the fact that the filter contains no spaces, or that there is no ';'
before the space, etc., but how do I incorporate the comments?

If I put code in for a comment, the parser properly finds the comment
but then reports a parse error. Similarly, if I simply enter a single
word like 'hello\n', I get a parse error. Is the newline causing a
problem? I understand how reductions are triggered and how the parser
uses a stack to parse the tree, but I'm not sure I understand exactly
how the rule hierarchy works when parsing different types of sections,
i.e. comments, sections of plain text that could be anything, etc.

Should I have the lexer doing more of the work?

Here's my code. I would appreciate any help.

Michael B. Allen

------------- LEXER

%{
/*
 * syslog.conf lexer
 */
#include <stdio.h>
#include ""
int yywrap( void );
%}

%%

[ \t]                   ;
#.*                     { printf( "found a comment\n" ); }
[a-z][a-zA-Z0-9]*       { return WORD; }
\n                      { return 0; }
.                       { return yytext[0]; }

%%

int yywrap( void )
{
  return 1;
}

------------- PARSER

%{
/*
 * syslog.conf parser
 */
#include <stdio.h>
int yylex( void );
void yyerror( char *s );
%}
%token WORD

%%

selector_set: selector_set ';' selector
            | selector
selector:     word '.' WORD { printf( "found a selector\n" ); }
word:         word ',' WORD { printf( "found a word member\n" ); }
            | '*'           { printf( "found a *\n" ); }
            | WORD          { printf( "found a word\n" ); }

%%

int main( void )
{
        while( 1 ) {
                yyparse();
        }
}

void yyerror( char *s )
{
        fprintf( stderr, "%s\n", s );
}

------------- INPUT

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none /var/log/messages

# The authpriv file has restricted access.
authpriv.* /var/log/secure

# Log all the mail messages in one place.
mail.* /var/log/maillog

# Everybody gets emergency messages, plus log them on another
# machine.
*.emerg *

# Save mail and news errors of level err and higher in a
# special file.
uucp,news.crit /var/log/spooler

------------- TEST RUN
[miallen@angus parsers]$ syslog.parser
found a word
found a word member
found a word member
found a selector
found a word
found a selector
found a word
found a word member
parse error
parse error
found a *
found a selector
[Looks like you're pretty close. I'd make white space return a token,
and probably use a flex %x start state to lex the filename after white
space as a single token. I'm a great believer in using lexical hackery
to make the parser's job easier. -John]
