[cfe-dev] Spell Correction Efficiency
David Tweed
david.tweed at gmail.com
Sun Nov 21 17:53:28 PST 2010
On Sun, Nov 21, 2010 at 7:58 PM, Matthieu Monrocq
<matthieu.monrocq at gmail.com> wrote
>> We had a discussion a month ago I think about the Spell Correction
>> algorithm which was used in CLang, notably for auto-completion, based on the
>> Levenstein distance.
I presume you've thought about whether suffix-tries and prefix-tries
would work for the case of typos on computer identifiers?
> - a good Trie implementation, or hints in how to implement one. A Trie seems
> to be the best index (quite compact yet ordered), however I only have a
> naive implementation at the moment.
I've got a hash-trie implementation (the datastructure that is used,
eg, to store the dictionary of hypenation exceptions in TeX) in C++
that's heavily designed for small alphabets and integer "payloads", so
it's unlikely to be directly suitable, but you can have a look at it
if you like. (Drop me a personal email.)
--
cheers, dave tweed__________________________
computer vision reasearcher: david.tweed at gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
More information about the cfe-dev
mailing list