Subject: | Re: Wanted: HTML lex spec/yacc grammar |
From: | jpc1@doc.ic.ac.uk (Justin Cormack) |
Newsgroups: | comp.compilers |
Date: | 17 Feb 1996 22:48:02 -0500 |
Organization: | Dept. of Computing, Imperial College, University of London, UK. |
Keywords: | parse |
"Paul D. Wilson" <pdw@intergate.net> writes:
|> I'm looking for a freely available HTML lex specification and or yacc
|> grammar.
Dan Connolly at w3.org has been working on this for a while; see
http://www.w3.org/pub/WWW/MarkUp/SGML/sgml-lex/sgml-lex

As HTML is an application of SGML, comp.text.sgml is a useful
newsgroup where the pitfalls of parsing HTML are often discussed (the
group is archived, but I can't remember the site offhand). Unfortunately,
HTML is rather hard to parse in practice because real-world usage is
sloppy, and browsers tend to be written to display documents on a
'best efforts' basis rather than rejecting incorrect HTML.
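
To illustrate what 'best efforts' means for a lexer, here is a minimal
sketch in Python of a tolerant tokenizer. It is not Dan Connolly's
sgml-lex and not a conforming SGML lexer; the TOKEN pattern and the
tokenize() function are hypothetical names made up for this example.
The idea is that every character of the input matches some rule, so
malformed markup degrades to text instead of causing a rejection.

```python
import re

# Hypothetical token pattern (an assumption for illustration only):
# a comment, a start/end tag whose closing '>' may be missing,
# or a run of text. A stray '<' falls through to the text rule.
TOKEN = re.compile(
    r"<!--(?P<comment>.*?)-->"
    r"|<(?P<close>/?)(?P<name>[A-Za-z][A-Za-z0-9]*)(?P<attrs>[^<>]*)>?"
    r"|(?P<text>[^<]+|<)",
    re.DOTALL,
)


def tokenize(html):
    """Yield (kind, value) pairs; never raises on malformed input."""
    for m in TOKEN.finditer(html):
        if m.group("comment") is not None:
            yield ("comment", m.group("comment"))
        elif m.group("name") is not None:
            # Tag names are case-insensitive in HTML, so normalise them.
            kind = "end" if m.group("close") else "start"
            yield (kind, m.group("name").lower())
        else:
            yield ("text", m.group("text"))


if __name__ == "__main__":
    # Note the unterminated end tag and the unclosed <B> element.
    for token in tokenize("<P>Hello <B>world</p"):
        print(token)
```

A strict lexer would have to reject the sample input; this one simply
reports the tokens it can recognise and leaves it to a later stage to
decide what to do about the unclosed element.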
Justin Cormack
j.cormack@ic.ac.uk