Related articles |
---|
looking for a better way (help with algorithm design) mcr+usenet@yuck.net (Michael Roach) (2001-09-25) |
Re: looking for a better way (help with algorithm design) mcr+usenet@yuck.net (Michael Roach) (2001-09-25) |
Re: looking for a better way (help with algorithm design) Georg.Bisseling@pallas.com (Georg Bisseling) (2001-09-29) |
From: | Georg Bisseling <Georg.Bisseling@pallas.com> |
Newsgroups: | comp.compilers |
Date: | 29 Sep 2001 10:59:51 -0400 |
Organization: | Pallas GmbH |
References: | 01-09-107 01-09-108 |
Keywords: | parse |
Posted-Date: | 29 Sep 2001 10:59:51 EDT |
Dear Michael,
I will propose a very obvious solution that you might did rule out
since you thought that it needs too much space.
For every word in the set S store the word and any valid abbreviation
in the same tree.
Each node in the tree will have a marker to flag if it is a word in S.
Say S = {nap, nat}
root
/ | \ \
a(PT) n(PT) p(P) t(T)
/| |
(P)p t(T) a(PT)
/ \
T P
Uppercase marks a word in S. Letters in paren- theses show pointers to
words in S (which are identifyed by the two corresponding end nodes)
This solution would not take up more space than storing the set S
alone, provided that a large S gets more and more similar to all
allowed abbreviations. This assumption should be valid for natural
languages and DNA sequences....
Of course it is very difficult to formalize this assumption. Maybe it
is wrong?
It is obvious how to examine a given abbreviation. Follow the tree
until you reach the end of the abbreviation (if not found it is no
abbreviation). For each pointer (if no pointer it is no abbreviation)
trace back to root to get the matching word in S.
ciao
Georg
Return to the
comp.compilers page.
Search the
comp.compilers archives again.