Re: Yacc grammar for HTML/XML/WML

Pierre Mai <>
20 Jul 1998 17:01:47 -0400

          From comp.compilers

Related articles
Yacc grammar for HTML/XML/WML (Terry Robinson) (1998-07-10)
Re: Yacc grammar for HTML/XML/WML (Quinn Tyler Jackson) (1998-07-11)
Re: Yacc grammar for HTML/XML/WML (1998-07-13)
Re: Yacc grammar for HTML/XML/WML (Pierre Mai) (1998-07-20)

From: Pierre Mai <>
Newsgroups: comp.compilers
Date: 20 Jul 1998 17:01:47 -0400
Organization: Technical University of Berlin, Germany
References: 98-07-112
Keywords: WWW, parse
X-PGP-Fingerprint: 17 2D 00 93 8B C8 57 57 A7 D7 CD E9 3A EA 6E 4C

(Marc Wachowitz) writes:

> Terry Robinson <> wrote:
> > Does anyone have a grammar for Yacc/Bison for a real mark-up language
> > like HTML or WML (XML needs a document type definition to define a
> > language - well normally) or know where one can be gotten ?
>
> Just in case "Yacc/Bison" is merely your assumption how a parser would
> be written, while the real problem is just to get some parser for these
> languages: As long as the text follows a DTD, you could use nsgmls or
> directly the underlying C++ interface of SP, James Clark's SGML parser:

One should also note that, at least for SGML, constructing a correct
parser is a decidedly non-trivial exercise. It is complicated by the
fact that the syntax and semantics of full SGML are not a good match
for most "conventional" parsing strategies and tools used in the
programming language community (white-space handling in particular
is likely to pose a problem for yacc/bison).

Parsing XML is probably an order of magnitude simpler (simplicity was
one of the design criteria for XML), but it is still not a very good
match for yacc/bison and similar tools.
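
To make the white-space remark a bit more concrete, here is a small
sketch in XML terms. It uses Python's standard xml.etree.ElementTree
module, which is purely an assumption of this illustration and not a
tool mentioned in this thread. It shows that white space inside a tag
is ignored, while white space between elements in mixed content is
data, so a lexer rule that simply discards white space everywhere (the
usual lex/yacc habit) would presumably lose information:

```python
import xml.etree.ElementTree as ET

# Mixed content: the single space between </b> and <i> is data, not layout.
doc = "<p>one <b>two</b> <i>three</i></p>"
root = ET.fromstring(doc)
b = root.find("b")

# ElementTree stores text that follows an element in that element's .tail.
print(repr(root.text))                 # character data before <b>
print(repr(b.tail))                    # white space between </b> and <i>
print(repr("".join(root.itertext())))  # all character data, in document order

# Inside a start tag, by contrast, white space before '>' carries no meaning.
same = ET.fromstring("<p   >x</p   >")
print(same.tag, repr(same.text))
```

Whether a given run of white space matters therefore depends on where
the parser currently is, which is awkward to express with a
context-free token stream.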

Overall, you are much, much better off using one of the many available
XML parsers, such as Ælfred (in Java), or nsgmls (C++), which also
handles full SGML, HyTime, and more.
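
As a rough idea of what "using an available parser" looks like in
practice, here is a minimal sketch of the event-driven style. It uses
Python's standard xml.sax module as a stand-in, not Ælfred or nsgmls;
the ElementCounter class and the sample WML-like document are
hypothetical and exist only for this illustration:

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags and collects character data."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.text = []

    def startElement(self, name, attrs):
        # Called by the parser once for every start tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        # Called for character data; may arrive in several chunks.
        self.text.append(content)


# Made-up sample input, loosely WML-shaped.
doc = b"<wml><card id='c1'><p>Hello</p><p>World</p></card></wml>"

handler = ElementCounter()
xml.sax.parseString(doc, handler)

print(handler.counts)
print("".join(handler.text))
```

The application only supplies callbacks; tokenizing, well-formedness
checking and entity handling stay inside the library, which is the
part that would otherwise have to be coaxed out of a yacc grammar.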

Ælfred is free for both commercial and non-commercial use, and COMES
WITH NO WARRANTEE. You can download a copy of version 1.0 (with
source code) from the following URL:

(Beware, this quote is somewhat old, so maybe terms of use or
availability have changed...)

Regs, Pierre.

-- 
Pierre Mai <>
