Re: Spell checking identifiers

gah4@u.washington.edu
Tue, 23 Jun 2020 16:51:38 -0700 (PDT)

          From comp.compilers


From: gah4@u.washington.edu
Newsgroups: comp.compilers
Date: Tue, 23 Jun 2020 16:51:38 -0700 (PDT)
Organization: Compilers Central
References: 20-06-010 20-06-011
Keywords: lex, errors
Posted-Date: 24 Jun 2020 14:45:22 EDT
In-Reply-To: 20-06-011

On Tuesday, June 23, 2020 at 12:59:35 PM UTC-7, Johann 'Myrkraverk' Oskarsson wrote:


(snip)


> This clang blog specifically mentions Levenshtein,


> http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#spell_checker


> and it looks like what people do is to go through the entire symbol
> table and compute it against the individual erroneous identifier.


> I thought that'd be a bit on the expensive side,


With either constant weighting or character-dependent weighting,
it is easy to do with dynamic programming. The time is then O(m n),
where m and n are the lengths of the two strings.
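
For concreteness, here is a minimal sketch of that dynamic program in
Python. The names edit_distance, sub_cost, and indel_cost are mine,
made up for illustration, and not taken from clang or any other
compiler. The default sub_cost gives constant weighting; passing your
own function gives character-dependent weighting.

```python
def edit_distance(a, b, sub_cost=lambda x, y: 0 if x == y else 1, indel_cost=1):
    """Weighted Levenshtein distance between strings a and b.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    sub_cost and indel_cost are illustrative parameters, not a real API.
    """
    # prev[j] = cost of turning the empty prefix of a into b[:j]
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [i * indel_cost]  # cost of turning a[:i] into the empty string
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + indel_cost,            # delete ca
                cur[j - 1] + indel_cost,         # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute ca by cb, or match
            ))
        prev = cur
    return prev[-1]


def confusable_cost(x, y):
    """Hypothetical weighting: make the easily confused pair '1'/'l' cheap."""
    if x == y:
        return 0
    return 0.5 if {x, y} == {"1", "l"} else 1
```

Plain Levenshtein counts a swapped pair of adjacent letters as two
edits; a transposition-aware variant would need one more case in the
min().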


It seems most obvious to check only variables that are in the appropriate
scope to be misspelled, but I suspect catching variables used out
of scope is also worth doing. Well, in the latter case, you could
hope that they at least spell them the same.


I think you should turn it off for one-character names, though,
even though I suspect those are more likely to be mistyped. Too many
false positives!
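
Putting the pieces together, a rough sketch of the lookup might look
like the following. Everything here is an assumption for illustration:
the function suggest, the flat lists standing in for a symbol table,
and the cutoff max_dist=2 are my own choices, not what clang does. It
scans the in-scope names first, then the out-of-scope ones, and skips
one-character names on both sides.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance, O(len(a) * len(b)), two-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete
                           cur[j - 1] + 1,             # insert
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def suggest(unknown, in_scope, out_of_scope=(), max_dist=2):
    """Return (name, is_in_scope) for the closest known name, or None.

    in_scope and out_of_scope are plain iterables of names (a stand-in
    for a real symbol table). max_dist is an arbitrary cutoff.
    """
    if len(unknown) < 2:
        return None  # one-character names: too many false positives
    best = None
    for rank, names in enumerate((in_scope, out_of_scope)):
        for name in names:
            if len(name) < 2:
                continue
            # The distance is at least the length difference, so skip early.
            if abs(len(name) - len(unknown)) > max_dist:
                continue
            d = edit_distance(unknown, name)
            if d <= max_dist:
                key = (d, rank, name)  # prefer closer, then in-scope
                if best is None or key < best:
                    best = key
    if best is None:
        return None
    return best[2], best[1] == 0
```

An exact match found only in the out-of-scope list (distance 0) is the
"spelled the same but used out of scope" case, which probably deserves
a different message than a misspelling.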


