Re: Wanted: HTML lex spec/yacc grammar

jpc1@doc.ic.ac.uk (Justin Cormack)
17 Feb 1996 22:48:02 -0500

          From comp.compilers

From: jpc1@doc.ic.ac.uk (Justin Cormack)
Newsgroups: comp.compilers
Date: 17 Feb 1996 22:48:02 -0500
Organization: Dept. of Computing, Imperial College, University of London, UK.
Keywords: parse

"Paul D. Wilson" <pdw@intergate.net> writes:
|> I'm looking for a freely available HTML lex specification and or yacc
|> grammar.


Dan Connolly at w3.org has been working on this for a while: see
http://www.w3.org/pub/WWW/MarkUp/SGML/sgml-lex/sgml-lex


As HTML is an application of SGML, comp.text.sgml is a useful
newsgroup where the pitfalls of parsing HTML are often discussed (the
group is archived, but I can't remember the site offhand). HTML is
unfortunately rather hard to parse in practice, as usage is rather
sloppy, and browsers tend to be written to display stuff on a 'best
efforts' basis rather than rejecting incorrect HTML.
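
To illustrate what 'best efforts' handling looks like, here is a minimal
sketch using the html.parser module from the Python standard library. It
is not the sgml-lex specification mentioned above and is not a validating
SGML parser; the names TagLogger and tokenize are hypothetical, chosen
only for this example. It reports tags and text as it finds them and does
not complain about unclosed or mismatched elements.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record (kind, value) events; malformed nesting is not rejected."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, whatever case the input used.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only runs between tags.
        text = data.strip()
        if text:
            self.events.append(("text", text))


def tokenize(html):
    """Return the list of events seen in html, in document order."""
    parser = TagLogger()
    parser.feed(html)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # Sloppy input: unclosed <P> and <B>, and a </p> that closes the wrong thing.
    sloppy = "<P>one<p>two <B>bold</p>"
    for kind, value in tokenize(sloppy):
        print(kind, value)
```

The tokenizer only reports what it sees; deciding that the second <p>
implicitly closes the first, or that <B> was never closed, is left to
whatever sits on top of it. That is where a strict grammar and real-world
usage tend to part ways.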


Justin Cormack
j.cormack@ic.ac.uk
--

