|Spell checking identifiers email@example.com (Johann 'Myrkraverk' Oskarsson) (2020-06-24)|
|From:||Johann 'Myrkraverk' Oskarsson <email@example.com>|
|Date:||Wed, 24 Jun 2020 01:38:11 +0800|
|Organization:||Easynews - www.easynews.com|
|Injection-Info:||gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="10774"; mail-complaints-to="firstname.lastname@example.org"|
|Keywords:||lex, errors, question, comment|
|Posted-Date:||23 Jun 2020 14:40:37 EDT|
While experimenting with Rust, I came across this suggestion.
    5 |     return j; // the variable, not the type.
      |            ^ help: a local variable with a similar name exists: `i`
Here it suggests `i` where I typed `j`. This is the same problem as
spell checking identifiers with fuzzy matching, so apologies for a
potentially misleading subject.
So, without going through the source of rustc to find out, I'm curious
what general techniques people use to make this work. In particular,
the Damerau–Levenshtein distance algorithm is not appropriate for
dictionary lookups, as far as I know.
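For reference, the naive approach the question is measured against can be sketched as a linear scan: compute an edit distance from the misspelled name to every name in scope and keep the closest one under a threshold. This is a minimal sketch, not rustc's actual method. It uses plain Levenshtein distance (no transpositions), and `suggest` and `max_dist` are hypothetical names chosen for illustration.

```rust
/// Levenshtein distance between two strings, counted in Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the first i-1 chars of `a`
    // and the first j chars of `b`; it starts as the row for i = 0.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // cur[0] = i: deleting the first i chars of `a`.
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let subst = prev[j - 1] + if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = subst.min(prev[j] + 1).min(cur[j - 1] + 1);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Hypothetical helper: the closest in-scope name within `max_dist` edits,
/// or None if nothing is close enough. Ties go to the earlier name.
fn suggest<'a>(typo: &str, in_scope: &[&'a str], max_dist: usize) -> Option<&'a str> {
    in_scope
        .iter()
        .map(|&name| (levenshtein(typo, name), name))
        .filter(|&(d, _)| d <= max_dist)
        .min_by_key(|&(d, _)| d)
        .map(|(_, name)| name)
}
```

Calling `suggest("j", &["i", "count"], 1)` yields `Some("i")`. The cost is one distance computation per candidate, which is the part that does not scale to a large dictionary without some kind of index.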
I've come across a survey of fuzzy matching algorithms, some of which
work with dictionaries, but I have no idea which data structures would
be appropriate in a compiler, nor do I know what criteria I'd use to
choose an appropriate algorithm from such a survey.
As an added bonus, the same technique can of course be used to spell
check identifiers against a natural language dictionary. But since
such a dictionary is more static than the list of identifiers in the
current source file, a precomputed database will work, and a more
expensive indexing method can be used. Is there an indexing method
that works for this, but would not be appropriate for fuzzy matching
identifiers in a compiler?
[Apologies for not responding to my other topic yet, I should be able
to reply soon.]
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk
[There's a vast amount of work on edit distance. My guess is they
use something like Levenshtein, but rather than use a constant
distance of 1 between different letters, the distance varies depending
on how different the letters look. -John]
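The moderator's guess can be illustrated with a weighted variant of Levenshtein distance in which substituting look-alike letters costs less than substituting unrelated ones. This is only a sketch under stated assumptions: the `LOOKALIKES` table and the cost values (1 for a look-alike substitution, 2 for any other substitution, insertion, or deletion) are invented for illustration and are not taken from rustc or any other compiler.

```rust
/// Hypothetical substitution cost: visually confusable pairs are cheap.
fn subst_cost(x: char, y: char) -> u32 {
    // Assumed table of look-alike pairs; a real one would be tuned.
    const LOOKALIKES: [(char, char); 5] =
        [('i', 'j'), ('l', '1'), ('O', '0'), ('m', 'n'), ('u', 'v')];
    if x == y {
        0
    } else if LOOKALIKES
        .iter()
        .any(|&(p, q)| (p, q) == (x, y) || (p, q) == (y, x))
    {
        1
    } else {
        2
    }
}

/// Levenshtein distance with per-pair substitution costs.
fn weighted_distance(a: &str, b: &str) -> u32 {
    const INDEL: u32 = 2; // assumed cost of one insertion or deletion
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // Row for the empty prefix of `a`: j insertions.
    let mut prev: Vec<u32> = (0..=b.len() as u32).map(|j| j * INDEL).collect();
    for i in 1..=a.len() {
        // cur[0]: deleting the first i chars of `a`.
        let mut cur = vec![i as u32 * INDEL; b.len() + 1];
        for j in 1..=b.len() {
            let subst = prev[j - 1] + subst_cost(a[i - 1], b[j - 1]);
            cur[j] = subst.min(prev[j] + INDEL).min(cur[j - 1] + INDEL);
        }
        prev = cur;
    }
    prev[b.len()]
}
```

With these assumed costs, `weighted_distance("j", "i")` is smaller than `weighted_distance("j", "k")`, so `i` would be ranked ahead of `k` as a suggestion for `j`.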