From: | "Randall Hyde" <randyhyde@earthlink.net> |
Newsgroups: | comp.compilers |
Date: | 12 Oct 2004 00:51:07 -0400 |
Organization: | EarthLink Inc. -- http://www.EarthLink.net |
References: | 04-10-069 |
Keywords: | lex |
Posted-Date: | 12 Oct 2004 00:51:07 EDT |
"Mark" <m_j_mather@yahoo.com.au> wrote in message
> I just can't seem to figure out how to invent a regular expression
> that will strip all HTML tags (except TABLE tags) out of a string and
> leave the rest of the text. When a TABLE tag is encountered I need to
> strip everything under it.
>
> This will strip all HTML out <[^>]*>
>
> But how do I make it also strip entire TABLE elements?
>
> Perhaps something like <table[^</table>]*</table>|<[^>]*>
>
> Thanks,
> Mark
> [That seems awfully complex for a single regex. -John]
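Indeed. In general, matching a paired element such as <table>...</table>
requires a context-free grammar, because tables can nest and a regular
expression cannot keep track of nesting depth. I don't understand the
OP's exact problem well enough to say whether a regex would suffice.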
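To make the distinction concrete, here is a minimal sketch in Python. It assumes the input is reasonably well-formed HTML. The names strip_flat, strip_nested and TableAwareStripper are made up for illustration. Note that [^</table>] in the proposed pattern is a character class: it excludes the individual characters <, /, t, a, b, l, e and >, not the string "</table>". A non-greedy match works only if tables never nest. Once they can nest, something has to count depth, which the small parser below does.

```python
import re
from html.parser import HTMLParser

# Regex-only variant: adequate only if TABLE elements never nest.
# (?is) makes the match case-insensitive and lets '.' match newlines.
FLAT = re.compile(r"(?is)<table\b[^>]*>.*?</table\s*>|<[^>]*>")


def strip_flat(html):
    """Remove whole (non-nested) tables, then every remaining tag."""
    return FLAT.sub("", html)


class TableAwareStripper(HTMLParser):
    """Keep text, drop every tag, and drop whole TABLE elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0    # current <table> nesting depth
        self.parts = []   # text fragments seen outside any table

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TABLE> is covered too.
        if tag == "table":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text is kept only while we are outside every table.
        if self.depth == 0:
            self.parts.append(data)


def strip_nested(html):
    """Same goal as strip_flat, but correct for nested tables."""
    parser = TableAwareStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

On input with a table inside a table, strip_flat stops at the first </table> and leaks the text that follows the inner table, while strip_nested drops the whole outer element. If the OP's tables are guaranteed not to nest, the regex variant may be enough.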
Cheers,
Randy Hyde