Related articles |
---|
Who first invented dotted-item notation? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-05-22) |
Re: Who first invented dotted-item notation? 480-992-1380@kylheku.com (Kaz Kylheku) (2022-05-22) |
RE: Who first invented dotted-item notation? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-05-23) |
RE: Who first invented dotted-item notation? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-05-24) |
From: | Christopher F Clark <christopher.f.clark@compiler-resources.com> |
Newsgroups: | comp.compilers |
Date: | Mon, 23 May 2022 13:23:41 +0300 |
Organization: | Compilers Central |
References: | 22-05-046 22-05-047 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="22129"; mail-complaints-to="abuse@iecc.com" |
Keywords: | parse, history, comment |
Posted-Date: | 23 May 2022 10:39:10 EDT |
Let me explain a little further.
Dotted-items are useful in translating (and visualizing) state
machines as translations of regular expressions or grammar rules.
In Glushkov's construction one takes a regular expression /ab*(cb)+a/
and makes copies of it with a dot before each "symbol" (character you
can recognize) and one at the end. Thus, 6 dotted-items:
/.ab*(cb)+a/
/a.b*(cb)+a/
/ab*(.cb)+a/
/ab*(c.b)+a/
/ab*(cb)+.a/
/ab*(cb)+a./
Each state in a Glushkov NFA is a subset of the powerset of these
items: (see below)
state 0:
/.ab*(cb)+a/
state 1:
/a.b*(cb)+a/
/ab*(.cb)+a/
state 2:
/ab*(c.b)+a/
state 3:
/ab*(.cb)+a/
/ab*(cb)+.a/
state 4:
/ab*(cb)+a./
The algorithm explains how to calculate what the states and their
transitions are.
Yielding something like this:
state 0:
/.ab*(cb)+a/ -> state 1
state 1:
/a.b*(cb)+a/ -> state 1
/ab*(.cb)+a/ -> state 2
state 2:
/ab*(c.b)+a/ -> state 3
state 3:
/ab*(.cb)+a/ -> state 2
/ab*(cb)+.a/ -> state 4
state 4:
/ab*(cb)+a./ -> accept
Below: Actually, in a Gluskov NFA the 6 dotted-items might be the
states, and those subsets might be how you change it to a DFA (the
subset construction algorithm). In my mind, those things have gotten
merged, but that might be an error on my part.
Anyway, when building the LR(0) machine, you build similar-dotted
items (and subsets) to define the states of the LR(0) DFA. And, the
Earley construction adds some context information to the doffed-items.
It's how I think about state machines. It's how they are documented
in Yacc++ with the debug option. Yacc (and Bison) does something
similar using _ instead of dot if I recall correctly.
But, someone must have introduced the idea first. My question is who?
Did Backus use dotted-items? Or did Knuth or Glushkov invent them?
Or maybe Naur. Or someone else, maybe one unfamiliar to me? Maybe it
goes all the way back to Kleene. Is the question lost to time?
That's my question....
Again, thanks,
Chris
--
******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------
[BNF was descriptive, don't think Backus ever did anything with formal parsers.
I looked at Knuth's 1965 LR parsing paper which has a lot of notation but no
dots. -John]
Return to the
comp.compilers page.
Search the
comp.compilers archives again.