Related articles |
---|
Finite State Technologies. watson@RibbitSoft.com (Dr. Bruce Watson) (1997-10-26) |
From: | "Dr. Bruce Watson" <watson@RibbitSoft.com> |
Newsgroups: | comp.compilers |
Date: | 26 Oct 1997 22:08:24 -0500 |
Organization: | Compilers Central |
Keywords: | DFA |
X-Authentication-Warning: | Stratus.CAM.ORG: watson owned process doing -bs |
To all finite state machine fans,
Here are still more pointers to finite state machine (automata) literature and
research.
> The following URLs might be of use. The first three are more oriented
> towards fast parsing of natural languages, but they'll work fine for
> computer languages.
>
> http://www.xerox.fr/research/mltt/fst/home.html
> http://www.rxrc.xerox.com/research/mltt/fst/articles/jnle-97/rele.html
> http://www.sil.org/pckimmo/about_pc-kimmo.html
> http://www.let.rug.nl/~vannoord/FSA/fsa.html
> http://www.xerox.fr/research/mltt/fst/fsrefs.html
Those are some excellent references to (part of) the finite state literature,
though with emphasis on natural language processing. At Ribbit, we are also
doing finite-state technologies, in C++ and Java, of a more general nature (ie.
for use in computational linguistics, asynchronous circuit simulation,
indexing, protocol verification, etc.). As the architect of some of those
products, I maintain a personal research page at
http://www.RibbitSoft.com/research/watson/index.html
which contains a number of my own conference papers, tech. reports, etc., but
also pointers to some of the conference URLs. In particular, there have now
been two Workshops on Implementing Automata (Aug. 1996 and Sept. 1997), both
held in London, Ontario. The web sites for both workshops appear to still be
active, and they are
http://www.csd.uwo.ca/staff/drraymon/wia.html
http://www.csd.uwo.ca/faculty/syu/wia97.html
They were excellent workshops and the proceedings are worth checking out
(pointers to the proceedings are also on my page).
> Be warned that efficient finite-state machines are hard to produce (we
> have proprietary technology which makes the transducers small enough
> and fast enough to be useful; the optimizations allow lexicons with
> millions of words to fit into 200K-300K and run at speeds competitive
> with hash table lookup).
> Peter Ludemann +1.650.813.6806 (fax: +1.650.813.7400)
> Software Architect ludemann@inxight.com
I agree that it's a lot tougher than it first appears. There are many many
domain specific optimizations that you can use (and that we have developed) to
compress automata and speed their transitions, and unfortunately most of the
interesting ones remain proprietary (and thus a black art). I, for one, would
be interested in hearing from others working on automata, especially when it's
for a specific domain in which there may be tuning opportunities.
Best regards, Bruce.
------------------------------------------------------------------------------
Dr. Bruce W. Watson (Chief Architect) | E-mail: watson@RibbitSoft.com
Ribbit Software Systems Inc. | Web: www.RibbitSoft.com
| Phone: + 1 (250) 762-6591
* Text pattern recognition technology | FAX: + 1 (250) 762-6511
* Text transformation technology | Address: Box 24040, 297 Bernard
* High performance compiler/interpreter | Kelowna, B.C.
technology for Java/C/C++/embedded sys. | V1Y 9P9, Canada
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.