| Related articles |
|---|
| Parsing HTML : I would appreciate advice jim@aol.com (Jim) (2006-11-13) |
| Re: Parsing HTML : I would appreciate advice zingard@mcmaster.ca (Daniel Zingaro) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice JustinBl@osiristrading.com (excalibur2000) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice vidar.hokstad@gmail.com (Vidar Hokstad) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice Juergen.KahrsDELETETHIS@vr-web.de (Juergen Kahrs) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice JoachimPimiskern@web.de (Joachim Pimiskern) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice m.collado@fi.upm.es (Manuel Collado) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice ojh16@student.canterbury.ac.nz (Oliver Hunt) (2006-11-15) |
| Re: Parsing HTML : I would appreciate advice sorry@nospam.org (Tim Van Holder) (2006-11-18) |
| From: | Oliver Hunt <ojh16@student.canterbury.ac.nz> |
| Newsgroups: | comp.compilers |
| Date: | 15 Nov 2006 15:21:02 -0500 |
| Organization: | Compilers Central |
| References: | 06-11-059 06-11-072 |
| Keywords: | parse, practice |
| Posted-Date: | 15 Nov 2006 15:21:01 EST |
Just for the record, I'd like to point out that HTML is *not* XML. It
isn't just a matter of most websites not being well formed: the
HTML spec itself is not XML compliant.
For generated data you might be able to get away with an XML parser,
but there are a few HTML elements that aren't XML at all (elements such
as BR have no closing tag, and using one is actually invalid HTML). So
valid HTML can break any standard XML parser.
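
To make the point concrete, here is a minimal sketch in Python, using only the standard library. The sample markup and the helper names (`parses_as_xml`, `TagCollector`, `html_start_tags`) are made up for illustration and are not from any particular tool. It feeds a fragment with an unclosed `<br>` to an XML parser and to an HTML-aware tokenizer.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: legal HTML 4, because BR takes no end tag.
VALID_HTML = "<p>first line<br>second line</p>"
# The same fragment rewritten with XML-style self-closing syntax.
XML_STYLE = "<p>first line<br/>second line</p>"


def parses_as_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The unclosed <br> makes </p> look like a mismatched end tag.
        return False


class TagCollector(HTMLParser):
    """Records the start tags seen by Python's lenient HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Return the start tags in document order, without requiring well-formedness."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    print("XML parser accepts valid HTML:", parses_as_xml(VALID_HTML))
    print("XML parser accepts XML-style: ", parses_as_xml(XML_STYLE))
    print("HTML tokenizer start tags:    ", html_start_tags(VALID_HTML))
```

The XML parser rejects the first fragment and accepts the second, while the HTML tokenizer reads the first one without complaint. Note that `html.parser` is only a tokenizer and does not build a tree, so a real application would still need its own handling of elements that never get closed.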
--Oliver