Related articles |
---|
Spell checking identifiers johann@myrkraverk.invalid (Johann 'Myrkraverk' Oskarsson) (2020-06-24) |
Re: Spell checking identifiers johann@myrkraverk.invalid (Johann 'Myrkraverk' Oskarsson) (2020-06-24) |
Re: Spell checking identifiers gah4@u.washington.edu (2020-06-23) |
Re: Spell checking identifiers derek@_NOSPAM_knosof.co.uk.invalid (Derek M. Jones) (2020-06-24) |
Re: Spell checking identifiers 937-053-0959@kylheku.com (Kaz Kylheku) (2020-06-24) |
Re: Spell checking identifiers tkoenig@netcologne.de (Thomas Koenig) (2020-06-24) |
Re: Spell checking identifiers gautier_niouzes@hotmail.com (2020-06-24) |
[6 later articles] |
From: | Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> |
Newsgroups: | comp.compilers |
Date: | Wed, 24 Jun 2020 01:38:11 +0800 |
Organization: | Easynews - www.easynews.com |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="10774"; mail-complaints-to="abuse@iecc.com" |
Keywords: | lex, errors, question, comment |
Posted-Date: | 23 Jun 2020 14:40:37 EDT |
Content-Language: | en-GB |
Dear c.compilers,
While experimenting with Rust, I came across this suggestion.
--> foo.rs:5:9
|
5 | return j; // the variable, not the type.
| ^ help: a local variable with a similar name exists: `i`
Here it is suggesting i where I typed j. This is the same problem as
spell checking identifiers with fuzzy matching, so apologies for a po-
tentially misleading subject.
So, without going through the source of rustc to find out, I'm curious
about what general techniques people use to make this work? In particu-
lar the Damerau–Levenshtein distance algorithm is not appropriate for
dictionary lookups, as far as I know.
I've come across a survey of fuzzy matching algorithms, some of which
work with dictionaries but I have no idea which data structures would
be appropriate in a compiler, nor do I know what criteria I'd use to
choose an appropriate algorithm from such a survey.
As an added bonus, the same technique can of course be used to spell
check identifiers against a natural language dictionary. But since
such a dictionary is more static than the list of identifiers in the
current source file, a precomputed database will work, and a more
expensive indexing method can be used. Is there an indexing method
that works for this, but would not be appropriate for fuzzy matching
against identifiers?
[Apologies for not responding to my other topic yet, I should be able
to reply soon.]
--
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk
[There's a vast amount of work on edit distance. My guess is they
use something like Levenshtein, but rather than use a constant
distance of 1 between different letters, the distance varies depending
on how different the letters look. -John]
Return to the
comp.compilers page.
Search the
comp.compilers archives again.