|From:||Rock Brentwood <firstname.lastname@example.org>|
|Date:||Sun, 4 Apr 2021 14:08:30 -0500|
|Posted-Date:||04 Apr 2021 21:12:04 EDT|
[ This is a followup to a thread from 2007. ]
>I've looked high and low without success. Where can i find
>something resembling the BNF of Knuth's TeX typesetting syntax?
It's in the file weave.web, section 14.
The syntax for TeX was written as a context-sensitive translation
grammar, suitable for a streaming translator, rather than a
context-free grammar. It may be possible to convert it to one (either
directly or as a context-free enveloping grammar with semantic
constraints). That's a matter that may be worth looking into. But in
its present form, there is no tree-building required or involved: it
can stream. The distinction is analogous to that between SSDTs and
SDTs ([S]SDT = [simple-]syntax-directed translation). SSDTs can be
streamed, whereas SDTs require stacking or treeing values; in effect,
SDT = SSDT + value-stacking/treeing.
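To make the SSDT/SDT distinction concrete, here is a minimal sketch in Python. The toy grammar, the function names, and the choice of translations are assumptions made for illustration; none of it comes from weave.web. The first translator writes its output as it reads its input and keeps no translated values. The second reorders operands, so each sub-translation has to be held until its sibling is done.

```python
# Toy grammar (hypothetical, for illustration only):
#   expr -> term ('+' term)*
#   term -> SYMBOL | '(' expr ')'

def to_postfix(tokens, emit):
    """Simple SDT: output is emitted while input is consumed; no values are kept."""
    pos = 0

    def term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            expr()
            pos += 1              # skip the matching ')'
        else:
            emit(tok)             # operand goes straight to the output

    def expr():
        nonlocal pos
        term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            term()
            emit("+")             # operator follows its two operands

    expr()


def to_swapped_postfix(tokens):
    """General SDT: operands are swapped, so sub-translations must be stacked."""
    pos = 0

    def term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expr()
            pos += 1              # skip the matching ')'
            return value
        return [tok]

    def expr():
        nonlocal pos
        value = term()            # held until the right operand is translated
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            right = term()
            value = right + value + ["+"]
        return value

    return expr()


out = []
to_postfix(list("1+(2+3)"), out.append)
postfix = " ".join(out)                                   # "1 2 3 + +"
swapped = " ".join(to_swapped_postfix(list("1+(2+3)")))   # "3 2 + 1 +"
```

Both versions use the call stack to track nesting; the difference is that only the second one carries translated values around.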
TeX is written in Web which is essentially Pascal + hyper-linked
comments. It is also in C-Web on the main TeX distribution site, which
is C + hyper-linked comments. They *can* be converted directly to more
normal programs with the comments embedded. I did so in the local
versions of my older MiKTeX distribution, but haven't
regression-tested the result, since I haven't yet established a
working baseline.
The syntax is in the weave.web file, and is an essential part of it. In detail:
Section 14.1 describes the framework used for the syntax
Section 14.2 lists the "category codes" used
Section 14.3 lists additional post-processed lexical units used
Section 14.4 describes a processing layer from lexical units to "scraps"
Section 14.5 contains the productions for the context-sensitive grammar
Section 15 implements the parser. The most important routine is
translate_cases() (its name in the C-Web file), written as a master
"switch" statement (a "case" statement in Pascal) in section 15.7.
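As a rough picture of how a parser of this kind works on "scraps", here is a minimal sketch in Python. The category names, the rule table, and the function reduce_scraps() are hypothetical and chosen only for illustration; they are not the productions or routines in weave.web. The sketch also uses only plain pattern-to-category rules tried in order, where the real parser dispatches on the category at the current position and has context-sensitive productions.

```python
# Hypothetical categories and rules; NOT the productions listed in weave.web.
RULES = [
    (("math", "binop", "math"), "math"),
    (("open", "math", "close"), "math"),
    (("math", "semi"), "stmt"),
    (("stmt", "stmt"), "stmt"),
]

def reduce_scraps(scraps):
    """Rewrite category patterns in a list of (category, text) scraps in place."""
    scraps = list(scraps)
    pp = 0                                    # current parsing position
    while pp < len(scraps):
        for pattern, result in RULES:
            window = scraps[pp:pp + len(pattern)]
            if tuple(cat for cat, _ in window) == pattern:
                text = "".join(t for _, t in window)
                scraps[pp:pp + len(pattern)] = [(result, text)]
                pp = max(0, pp - 2)           # back up: an earlier rule may now apply
                break
        else:
            pp += 1                           # nothing matched here; move right
    return scraps

# Scraps for the input "(a+b);c;"
example = [
    ("open", "("), ("math", "a"), ("binop", "+"), ("math", "b"),
    ("close", ")"), ("semi", ";"), ("math", "c"), ("semi", ";"),
]
result = reduce_scraps(example)               # [("stmt", "(a+b);c;")]
```

No tree is built: each reduction replaces a run of scraps with a single scrap whose text is the concatenated translation, which is why the process can stream.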
By the way, there is a bug in the "open" case (its subcases are in
15.19), "math" subcase (its sub-subcases are in 15.20), "var_head"
sub-subcase: its "intro" sub-sub-subcase lists a transition to rule 31
instead of to rule 33. (I want my money, Knuth! :))
I believe it's possible to convert it all to a context-free grammar,
albeit with the possible inclusion of a few semantic constraints. Why
Knuth chose to write everything this way - as borderline obfuscated
code that cannot be scaled upwards or sideways or integrated in other
existing code - is beyond me. It is not maintainable, and it is
heavily laden with technical debt, notably its *critical* reliance on
the dated assumption that the future would be Pascal, along with all
the other assumptions and, more importantly, the now-unnecessary
restrictions that came out of that.
Much of the Web framework's conception and design was premised on the
assumed necessity of those restrictions. The whole thing can be done
on a much simpler foundation when remade in more up-to-date terms
(relatively speaking) *natively* in C or C++. Among other things,
there isn't a need for any Web-like framework. You can simply use
ordinary comments. I know, because I did so: I rewrote the entire set
of Web files in my local copy doing just that. Once a baseline is
established and the rewrite is regression-validated, I'll put a copy
up on GitHub.
A follow-up to the additional comments at the end of the article:
>Knuths TeX book is an abomination, describing lexing and parsing
>as mouth, gullet and stomach nonsense.
I know. It's literally a woven and tangled mess - both the book and the code.
>[Well, he invented most of what we know about parsing, he gets to
>explain it any way he wants. Chapters 7 and 8 describe the syntax
Discovery. Not invention. Mathematics is not invented, it is
discovered (and in this case: only a partial and incomplete discovery).
And that, too, is a complete tangle that we had to remake from bottom
up. Now, finally with recent publications [2-5] establishing the
foundations for the new algebraic framework ... along with another,
currently in submission, that may come out in 2021, for the remaking
alluded to in  of the 1963 algebraic formulation by Chomsky and
Schuetzenberger [1] that lies at the foundation of this all, we're now
finally in a position to refactor both the theory itself and
everything that's based on it or is an application of it; literally
remaking the entire stack from bottom up.
[1] Chomsky, N., Schuetzenberger, M.: "The algebraic theory of context
free languages". In: Braffort, P., Hirschberg, D. (eds.) Computer
Programming and Formal Systems, pp. 118–161. North-Holland, Amsterdam
(1963)
[2] H. Leiß et al: "C-dioids and μ-continuous Chomsky algebras". In:
Desharnais, J., et al. (eds.) RAMiCS 2018. LNCS, vol. 11194, pp.
21–36. Springer, Cham (2018)
[3] M. Hopkins et al: "Coequalizers and Tensor Products for Continuous
Idempotent Semirings". In: Desharnais, J., et al. (eds.) RAMiCS 2018.
LNCS, vol. 11194, pp. 37–52. Springer, Cham (2018)
[4] M. Hopkins: "The algebraic approach I: the algebraization of the
Chomsky hierarchy". In: Berghammer, R., Möller, B., Struth, G. (eds.)
RelMiCS 2008. LNCS, vol. 4988, pp. 155–172. Springer, Heidelberg
(2008)
[5] N.B.B. Grathwohl et al: "Infinitary axiomatization of the
equational theory of context-free languages". In: Baelde, D., Carayol,
A. (eds.) Fixed Points in Computer Science (FICS 2013). EPTCS, vol.
126, pp. 44–55 (2013)