little lex question

"Bart Vandewoestyne" <Bart.Vandewoestyne@pandora.be>
12 Sep 2002 00:28:13 -0400

          From comp.compilers

From: "Bart Vandewoestyne" <Bart.Vandewoestyne@pandora.be>
Newsgroups: comp.compilers
Date: 12 Sep 2002 00:28:13 -0400
Organization: MyHome
Keywords: lex
Posted-Date: 12 Sep 2002 00:28:13 EDT

I'm trying to get the hang of lex by experimenting with a
bookmarks.html file exported from my Netscape 4.77 bookmark list, and I
have run into the following problem, which I don't know how to solve
yet...


I'm able to recognize everything except the text of a URL or
folder description. The problem is that when my pattern allows too
much, some of the HTML tags also get recognised as ordinary text, and
when it is too restrictive, it fails on text that contains characters
the pattern excludes.


As far as I can see, the main problem is that the description text of
a folder can contain too wide a variety of characters, including
some of the characters that occur in HTML tags or special tokens.


I think I might be able to solve this problem by using 'start
conditions', but I have not yet figured out how to do this in the most
efficient way. (Actually, I haven't figured out *any* way yet ;-)
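
To make the idea concrete: a start condition gives the scanner a current
mode, and the mode decides which rules apply. Below is a minimal sketch
of that idea in plain C rather than lex. It assumes a simplified input
in which text is whatever sits outside <...>. The function name
extract_text and the mode names are hypothetical and are not taken from
the source in the tarball.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scanner modes, playing the role of lex start conditions. */
enum mode { IN_TEXT, IN_TAG };

/*
 * Copy everything outside <...> from `in` to `out` (at most cap-1 bytes,
 * always NUL-terminated). Returns the number of bytes written.
 * Hypothetical helper: a '>' inside a quoted attribute value is not
 * handled, and entities such as &amp; are copied unchanged.
 */
size_t extract_text(const char *in, char *out, size_t cap)
{
    enum mode m = IN_TEXT;
    size_t n = 0;

    assert(in != NULL && out != NULL && cap > 0);

    for (; *in != '\0'; in++) {
        if (m == IN_TEXT) {
            if (*in == '<')
                m = IN_TAG;        /* comparable to BEGIN(TAG) in lex */
            else if (n + 1 < cap)
                out[n++] = *in;    /* in text mode any character is text */
        } else if (*in == '>') {
            m = IN_TEXT;           /* comparable to BEGIN(INITIAL) in lex */
        }
        /* every other character inside a tag is skipped */
    }
    out[n] = '\0';
    return n;
}
```

In text mode, characters such as '=' or '"' are copied as ordinary text
because the tag rules are simply not active there. In a lex
specification the same split would presumably be expressed with an
exclusive start condition (%x) and BEGIN, with separate rule sets for
the inside and the outside of a tag.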


Can somebody point me in the right direction? Source is located at
http://mc303.ulyssis.org/downloads/bookmarks.tar.gz


Thanks,
Bart


--
Ing. Bart Vandewoestyne Bart.Vandewoestyne@pandora.be
Hugo Verrieststraat 48 GSM: +32 (0)478 397 697
B-8550 Zwevegem http://users.pandora.be/vandewoestyne

