Related articles:
parsing bibtex file with flex/bison bnrj.rudra@gmail.com (2013-03-04)
Re: parsing bibtex file with flex/bison drikosev@otenet.gr (Evangelos Drikos) (2013-03-06)
Re: parsing bibtex file with flex/bison bnrj.rudra@gmail.com (Rudra Banerjee) (2013-03-07)
Re: parsing bibtex file with flex/bison gah@ugcs.caltech.edu (glen herrmannsfeldt) (2013-03-08)
Re: parsing bibtex file with flex/bison bnrj.rudra@gmail.com (Rudra Banerjee) (2013-03-17)
Re: parsing bibtex file with flex/bison torsten.eichstaedt@FernUni-Hagen.de (Torsten Eichstädt) (2013-03-25)
From: bnrj.rudra@gmail.com
Newsgroups: comp.compilers
Date: Mon, 4 Mar 2013 15:48:56 -0800 (PST)
Organization: Compilers Central
Injection-Date: Mon, 04 Mar 2013 23:48:56 +0000
Keywords: lex, parse, design, comment
Posted-Date: 05 Mar 2013 00:13:53 EST
I want to parse a BibTeX file using flex/bison. A sample BibTeX file is:
@Book{a1,
author="amook",
Title="ASR",
Publisher="oxf",
Year="2010",
Add="UK",
Edition="1",
}
@Article{a2,
Author="Rudra Banerjee",
Title={FeNiMo},
Publisher={P{\"R}B},
Issue="12",
Page="36690",
Year="2011",
Add="UK",
Edition="1",
}
(A new key may start on the same line.)
I have written the following flex code:
%{
#include <stdio.h>
#include <stdlib.h>
%}
%{
char yylval;
int YEAR,i;
//char array_author[1000];
%}
%x author
%x title
%x pub
%x year
%%
@ printf("\nNEWENTRY\n");
[a-zA-Z][a-zA-Z0-9]* {printf("%s",yytext);
BEGIN(INITIAL);}
author= {BEGIN(author);}
<author>\"[a-zA-Z\/.]+\" {printf("%s",yytext);
BEGIN(INITIAL);}
title= {BEGIN(title);}
<title>\"[a-zA-Z\/.]+\" {printf("%s",yytext);
BEGIN(INITIAL);}
publisher= {BEGIN(pub);}
<pub>\"[a-zA-Z\/.]+\" {printf("%s",yytext);
BEGIN(INITIAL);}
[a-zA-Z0-9\/.-]+= printf("ENTRY TYPE ");
\" printf("QUOTE ");
\{ printf("LCB ");
\} printf(" RCB");
; printf("SEMICOLON ");
\n printf("\n");
%%
int main(){
yylex();
//char array_author[1000];
//printf("%d%s",&i,array_author[i]);
i++;
return 0;
}
While this picks up a few things, it does not pick up everything.
Can anyone kindly help me with this?
[My suggestion would be to do less in the lexer and more in the parser. In the
lexer, reasonable tokens might be '@' '{' '}' '=' ',' word
qstring (quoted string).
Then you could write bison rules like this:
clause: '@' word '{' word ',' attrlist '}' ;
attrlist: attr | attr ',' attrlist ;
attr: name '=' value ;
value: word | qstring | nestlist ;
nestlist: '{' list '}' ;
list: listitem | list listitem ;
listitem: word | qstring | nestlist ;
And so forth. This isn't exactly right, but it should get you going
in the right direction. The parser will recognize some invalid bibtex,
e.g., words that aren't attribute names, which it's easier to check in
semantic code rather than trying to stick laundry lists of keywords
into the parser. -John]
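To make the token set in the note above concrete, here is a minimal hand-written C sketch of a tokenizer that returns '@' '{' '}' '=' ',' word and qstring. It is a stand-in for illustration only, not flex output and not the moderator's code; in a flex/bison setup the same decisions would live in flex rules that return token codes to the parser. The names next_token, TOK_WORD, TOK_QSTRING and TOK_ERROR are hypothetical. Two assumptions go beyond the note: a backslash escape such as \" is kept inside a word, so that {P{\"R}B} does not open a quoted string, and a quoted string ends only at a double quote outside any braces.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Punctuation tokens are returned as their own character code, as is
   conventional in bison grammars; the other kinds start at 256. */
enum { TOK_EOF = 0, TOK_WORD = 256, TOK_QSTRING, TOK_ERROR };

/* Hypothetical helper: scan one token starting at *p, advance *p past it,
   and copy its text into buf (truncated to bufsize - 1 characters). */
int next_token(const char **p, char *buf, size_t bufsize)
{
    const char *s = *p;
    size_t n = 0;
    int kind;

    while (isspace((unsigned char)*s))          /* skip blanks and newlines */
        s++;

    if (*s == '\0') {
        kind = TOK_EOF;
    } else if (strchr("@{}=,", *s) != NULL) {   /* single-character tokens */
        kind = (unsigned char)*s;
        if (n + 1 < bufsize) buf[n++] = *s;
        s++;
    } else if (*s == '"') {                     /* qstring, quotes stripped */
        int depth = 0;                          /* brace nesting inside the string */
        s++;
        while (*s != '\0' && (*s != '"' || depth > 0)) {
            if (*s == '{') depth++;
            else if (*s == '}' && depth > 0) depth--;
            if (n + 1 < bufsize) buf[n++] = *s;
            s++;
        }
        if (*s == '"') { s++; kind = TOK_QSTRING; }
        else kind = TOK_ERROR;                  /* unterminated string */
    } else {                                    /* word: anything else */
        while (*s != '\0' && !isspace((unsigned char)*s)
               && strchr("@{}=,\"", *s) == NULL) {
            if (*s == '\\' && s[1] != '\0') {   /* assumption: keep \x escapes in the word */
                if (n + 1 < bufsize) buf[n++] = *s;
                s++;
            }
            if (n + 1 < bufsize) buf[n++] = *s;
            s++;
        }
        kind = TOK_WORD;
    }
    if (bufsize > 0) buf[n] = '\0';
    *p = s;
    return kind;
}
```

Calling next_token repeatedly on the first sample entry would yield '@', word, '{', word, ',' and then a word, '=', qstring, ',' group for each field. Note that nothing here knows about author, title or year: field names arrive as plain words and are matched case-insensitively, if at all, in the parser's semantic code.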