Related articles |
---|
The remarkable similarities between Flex/Lex and XSLT costello@mitre.org (Roger L Costello) (2022-06-24) |
Re: The remarkable similarities between Flex/Lex and XSLT gah4@u.washington.edu (gah4) (2022-06-24) |
Re: The remarkable similarities between Flex/Lex and XSLT matt.timmermans@gmail.com (matt.ti...@gmail.com) (2022-06-25) |
From: | "matt.ti...@gmail.com" <matt.timmermans@gmail.com> |
Newsgroups: | comp.compilers |
Date: | Sat, 25 Jun 2022 09:20:54 -0700 (PDT) |
Organization: | Compilers Central |
References: | 22-06-073 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="79332"; mail-complaints-to="abuse@iecc.com" |
Keywords: | lex, history |
Posted-Date: | 25 Jun 2022 12:44:44 EDT |
In-Reply-To: | 22-06-073 |
On Friday, 24 June 2022 at 09:00:44 UTC-4, Roger L Costello wrote:
> Hi Folks,
>
> XSLT is a language for processing XML documents.
>
> There are remarkable similarities between Flex/Lex and XSLT. Lex was created
> 47 years ago, long before XSLT. One wonders if some members of the XSLT 1.0
> Working Group were Lex users and were influenced by its concepts?
It's not really about a single tool like Lex.
Before XML there was SGML, which XML was supposed to "simplify". SGML
included a schema language (DTD), which defines the hierarchical structure of a
document using regular expressions over elements. There was also a strange,
unnecessary constraint that forbade "ambiguity" in these expressions, which
*everybody* who wrote SGML software needed to understand, and so the idea of
applying formal language techniques to SGML was inevitable.
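
To make "regular expressions over elements" concrete, here is a minimal sketch in Python. The content model (title, para+, note?) and the element names are made up for illustration; they are not from any real DTD. The child element names are treated as tokens, and an ordinary regex engine checks the sequence.

```python
import re

# Hypothetical content model, written as a DTD would write it: (title, para+, note?)
# Each child element name becomes one token followed by a space.
CONTENT_MODEL = re.compile(r"(title )(para )+(note )?")

def children_match(child_names):
    """Return True if the sequence of child element names fits the model."""
    tokens = "".join(name + " " for name in child_names)
    return CONTENT_MODEL.fullmatch(tokens) is not None

print(children_match(["title", "para", "para"]))  # valid sequence
print(children_match(["para", "title"]))          # wrong order
```

Python's regex engine does not enforce the SGML ambiguity rule. As I understand that rule, a model such as ((a, b) | (a, c)) is rejected because on reading an "a" the parser cannot tell which branch it is in without lookahead, while (a, (b | c)) accepts the same sequences and is allowed.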
Long before XSLT, there were a variety of attempts to define languages that
would allow users to specify an automatic translation from SGML into printed
form. Many of these languages were context-free grammars at their core, with
translation rules as actions. This is called "syntax-directed translation"
and was a well-known concept long before that.
With SGML, though, the problem of syntax-directed translation is different
than it is in other contexts, and more difficult in many ways, because the
basic structures in the input are very easy to parse -- elements are delimited
after all -- but the input was a semantically marked up text and the output
was a published document that had to follow all the ambiguously-defined
stylistic rules that people use when they actually do typography. This meant
that complicated grammars, over *element trees* instead of linear text, and
lots of other ideas, needed to be applied. Lots of companies put a lot of
work into it.
So by the time XSLT came around, everyone on the committee was already familiar
with a lot of this history from SGML processing, which was based on a lot of
work rooted in the same formal language theory that goes into lexers and
parsers, and that is why some of XSLT looks a lot like Lex.
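
The family resemblance is easiest to see in a toy. The sketch below is a Lex-style table of pattern/action pairs, except that the patterns are element names and the input is an element tree rather than linear text. It is only an illustration in Python using the standard xml.etree.ElementTree module; the rule table, the element names, and the output conventions are all invented here, and real XSLT patterns and the old SGML translation languages were far richer than a lookup by tag name.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: element name (the "pattern") -> action.
# Each action receives the already-translated content of the element.
RULES = {
    "doc": lambda body: body,
    "title": lambda body: body.upper() + "\n",
    "para": lambda body: body + "\n",
    "em": lambda body: "*" + body + "*",
}

def translate(elem):
    """Translate an element's content first, then apply the element's rule."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(translate(child))
        parts.append(child.tail or "")
    # Elements with no rule pass their content through unchanged.
    action = RULES.get(elem.tag, lambda body: body)
    return action("".join(parts))

def translate_string(xml_text):
    """Parse an XML string and translate it with the rule table."""
    return translate(ET.fromstring(xml_text))

print(translate_string("<doc><title>Hi</title><para>a <em>b</em> c</para></doc>"))
```

Each rule fires when its pattern is recognized and emits output, which is the same shape as a Lex rule or an XSLT template; the difference is that recognition here happens on tree nodes instead of character sequences.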
Unfortunately, XSLT kind of sucks. When the standard was written, the problem
itself had not really been solved by industry in a really acceptable way (and
it still hasn't been!), and the W3C committee fell into the trap of trying to
innovate instead of codifying best practice.