How to Implement an Open-Source Spell Corrector in Your AI

In theory there are plenty of very good open-source dictionaries and programs, but in practice it is more complicated than that. Although Hunspell, an open-source spell checker with its own dictionary files, is used by LibreOffice and Chrome, code for using its data from your own application is not readily available. For one thing, most of these programs are written in C or C++, which is not what many modern web apps are coded in. For another, you have to make your program manipulate the dictionary data into the form you need.

Update

I found a simple program that can be installed on a server and used easily in PHP. Just note that the installation instructions need to be updated for newer Ubuntu releases with this command:

sudo apt install php-pspell

then

sudo service apache2 restart

Then I downloaded an Aspell dictionary and linked it as in this example:

<?php
// Open a link to the English dictionary.
$pspell_link = pspell_new("en");

// If the word is misspelled, list the suggested spellings.
if (!pspell_check($pspell_link, "testt")) {
    $suggestions = pspell_suggest($pspell_link, "testt");

    foreach ($suggestions as $suggestion) {
        echo "Possible spelling: $suggestion<br />";
    }
}
?>
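
The example above only lists suggestions. To turn it into a corrector, one option is to replace each misspelled word with its first suggestion. The following is a minimal sketch, not a definitive implementation: the function name correctText and its two callback parameters are hypothetical, and picking the first suggestion is an assumption that will sometimes choose the wrong word.

```php
<?php
// Hypothetical helper: replace every misspelled word in $text with its first suggestion.
// $isCorrect takes a word and returns bool; $suggest takes a word and returns an array of strings.
function correctText(string $text, callable $isCorrect, callable $suggest): string
{
    $result = preg_replace_callback(
        '/[\p{L}\']+/u', // runs of letters and apostrophes count as words
        function (array $match) use ($isCorrect, $suggest): string {
            $word = $match[0];
            if ($isCorrect($word)) {
                return $word;
            }
            $candidates = $suggest($word);
            // Keep the original word when there is nothing to suggest.
            return $candidates[0] ?? $word;
        },
        $text
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

// Usage with the pspell functions shown above (requires php-pspell and a dictionary):
// $link = pspell_new("en");
// echo correctText(
//     "a smal testt",
//     fn($w) => pspell_check($link, $w),
//     fn($w) => pspell_suggest($link, $w)
// );
```

Passing the checker in as callbacks keeps the function independent of pspell, so the same code could sit in front of any other dictionary backend.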

Building the MySQL Database

Yes, chances are that you will need to build your own SQL database from scratch. This could be done either by converting raw data into SQL tables or by migrating existing tables into a database that can be used by your program.
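
As a rough illustration of the first route, here is a sketch that loads the words from a Hunspell '.dic' file into a single table through PDO. It relies on the '.dic' layout, where the first line is a word count and each following line is a word optionally followed by a slash and affix flags. The function names, the table name words, the column size, the file path, and the credentials are all hypothetical. The sketch ignores the affix rules in the '.aff' file and escaped slashes inside words, so it stores base forms only.

```php
<?php
// Hypothetical helper: pull the bare word out of one line of a Hunspell .dic file.
function dicLineToWord(string $line): ?string
{
    $line = trim($line);
    // Skip blank lines and the numeric word-count line at the top of the file.
    if ($line === '' || ctype_digit($line)) {
        return null;
    }
    // Drop affix flags ("/S") and any morphological fields after whitespace.
    $word = preg_split('#[/\s]#', $line, 2)[0];
    return $word === '' ? null : $word;
}

// Hypothetical helper: insert each distinct word into a one-column table.
// Returns the number of words inserted.
function importWords(PDO $db, iterable $lines): int
{
    $db->exec('CREATE TABLE IF NOT EXISTS words (word VARCHAR(64) NOT NULL)');
    $insert = $db->prepare('INSERT INTO words (word) VALUES (?)');
    $seen = [];

    $db->beginTransaction();
    foreach ($lines as $line) {
        $word = dicLineToWord($line);
        if ($word === null || isset($seen[$word])) {
            continue; // not a word line, or already inserted
        }
        $seen[$word] = true;
        $insert->execute([$word]);
    }
    $db->commit();

    return count($seen);
}

// Usage (assumed path and placeholder credentials):
// $db = new PDO('mysql:host=localhost;dbname=spell;charset=utf8mb4', 'user', 'secret');
// echo importWords($db, new SplFileObject('/usr/share/hunspell/en_US.dic'));
```

Deduplicating in PHP rather than with a unique key avoids depending on the database collation, at the cost of holding the word list in memory during the import.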

Using It Natively

If your application runs on its own server, you might be able to use Hunspell as it is. You could install Hunspell on the local operating system and then access it from your app or website using a shell executor. In this example, PHP code looks up a word through the shell.

$x = shell_exec("echo workers | hunspell -d en_US -m");

That is the simple way to use the native code without any direct integration with the program.
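
For words that come from users, the shell call needs escaping, and the output needs parsing. The sketch below swaps the -m flag used above for Hunspell's -a pipe mode, which prints Ispell-style result lines: a line starting with * or + for a known word, & followed by a suggestion list for a misspelled word, and # when there are no suggestions. The function names and the returned array shape are hypothetical, and the word filter is a deliberately strict assumption.

```php
<?php
// Hypothetical helper: interpret the output of `hunspell -a` for a single word.
// Returns ['correct' => bool, 'suggestions' => string[]], or null if no result line was found.
function parseHunspellPipeOutput(string $output): ?array
{
    foreach (preg_split('/\R/', $output) as $line) {
        if ($line === '' || $line[0] === '@') {
            continue; // blank separator or the version banner
        }
        switch ($line[0]) {
            case '*': // found in the dictionary
            case '+': // found through an affix rule
            case '-': // found as a compound
                return ['correct' => true, 'suggestions' => []];
            case '&': // "& word count offset: sugg1, sugg2, ..."
                $parts = explode(': ', $line, 2);
                $list = isset($parts[1]) ? explode(', ', $parts[1]) : [];
                return ['correct' => false, 'suggestions' => $list];
            case '#': // misspelled, no suggestions
                return ['correct' => false, 'suggestions' => []];
        }
    }
    return null;
}

// Hypothetical helper: check one word with the locally installed hunspell binary.
function checkWithHunspell(string $word, string $dict = 'en_US'): ?array
{
    // Accept only letters, apostrophes and hyphens, starting with a letter,
    // so the input cannot be read as a pipe-mode command.
    if (!preg_match('/^\p{L}[\p{L}\'-]*$/u', $word)) {
        return null;
    }
    $cmd = 'echo ' . escapeshellarg($word)
         . ' | hunspell -d ' . escapeshellarg($dict) . ' -a';
    $output = shell_exec($cmd);

    return is_string($output) ? parseHunspellPipeOutput($output) : null;
}

// Usage (requires hunspell and the en_US dictionary on the server):
// print_r(checkWithHunspell('testt'));
```

Keeping the parser separate from the shell call means the parsing logic can be exercised without Hunspell installed.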

Using Existing Front-End Wrappers

I haven't found any existing front ends that are fit for production use or for customisation. PhpSpellChecker almost fits the bill as a standalone wrapper for Hunspell or other dictionaries. It works by using PHP code to read and search the raw '.dic' files. There are two basic problems with this program. First, it is inefficient, which could make it bulky and slow. Second, its licensing doesn't allow it to be used in production: when you run too many queries at once, it gives an error saying that a license is required.

Php-Speller is a wrapper that accesses the native program installed on the operating system. This requires local shell access, which wouldn't work well for SaaS such as a WordPress plugin.

Using the Refyn Method "spellcheck"

Just call the Refyn API with the method r=spellcheck.


© 2017 Refyn. All Rights Reserved. Designed by Yeshourun Rotstain