Go4Expert


pradeep 9May2012 12:10

Find Similar Sounding Words in PHP
 

How To Find Similar Sounding Words



Suppose we want a utility that finds words that sound alike. For example, stupid, stpid, stuuupid and sstuuupiid all have the same Soundex code, S313. Soundex is a phonetic algorithm that computes a code for each English word passed to it, so that words with a similar pronunciation receive the same code. In PHP, we can find similar sounding words with the built-in soundex function. This can be useful in spell checking and document checking.
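To make the spell-checking idea more concrete, here is a minimal sketch that groups a small word list by Soundex code and looks up candidates for a misspelled word. The function names build_soundex_index and suggest and the word list are made up for this illustration; only soundex itself is a PHP built-in.

Code: PHP

```php
<?php
// Hypothetical helper: group a word list by soundex key.
function build_soundex_index(array $words): array
{
    $index = [];
    foreach ($words as $word) {
        // Words that sound alike end up under the same key.
        $index[soundex($word)][] = $word;
    }
    return $index;
}

// Hypothetical helper: return the known words that share the input's key.
function suggest(array $index, string $input): array
{
    return $index[soundex($input)] ?? [];
}

// Illustrative word list, not a real dictionary.
$dictionary = ['stupid', 'hello', 'foobar', 'stipend', 'help'];
$index = build_soundex_index($dictionary);

print_r(suggest($index, 'stpid'));
```

With this word list, the lookup for 'stpid' returns only 'stupid'. A real spell checker would likely rank the candidates further, since a four-character code is a coarse match.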

Rules for Soundex



The first letter of the word becomes the first letter of the Soundex code. Cross out all vowels (A, E, I, O, U, Y) and the letters H and W that follow the initial letter, and replace the remaining consonants with numbers. If fewer than three numbers result, fill the remaining places with zeros; if more than three result, keep only the first three. The final Soundex code is the first letter of the word followed by three numbers (e.g. stupid is coded as S313).

Summing It Up Into Steps



Step 1: The first letter of the word becomes the first letter of the Soundex code.

Step 2: The consonants are replaced with numbers as below:

Code:

b, f, p, v become 1

c, g, j, k, q, s, x, z become 2

d, t become 3

l becomes 4

m, n become 5

r becomes 6

a, e, i, o, u, y, h, w are removed


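Step 3: If two or more adjacent letters in the original word have the same number, only the first is kept. This also applies to a letter that shares its number with the first letter of the word, which is why the second s in sstuuupiid is dropped.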

Step 4: Keep only the first three numbers. If fewer than three remain, add zeros to make up the three.
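The steps above can be sketched by hand as follows. This is only an illustration of the rules, not PHP's actual implementation: the function name simple_soundex is made up here, and the sketch assumes plain ASCII input. In real code the built-in soundex function is the better choice.

Code: PHP

```php
<?php
// Hypothetical helper written for illustration; prefer the built-in soundex().
function simple_soundex(string $word): string
{
    // Step 2 table: consonants mapped to their numbers.
    $codes = [
        'B' => '1', 'F' => '1', 'P' => '1', 'V' => '1',
        'C' => '2', 'G' => '2', 'J' => '2', 'K' => '2',
        'Q' => '2', 'S' => '2', 'X' => '2', 'Z' => '2',
        'D' => '3', 'T' => '3',
        'L' => '4',
        'M' => '5', 'N' => '5',
        'R' => '6',
    ];

    // Assumption: only ASCII letters matter; everything else is dropped.
    $letters = preg_replace('/[^A-Z]/', '', strtoupper($word));
    if ($letters === '') {
        return '';
    }

    // Step 1: the first letter is kept as it is.
    $result = $letters[0];
    $last = $codes[$letters[0]] ?? '';

    for ($i = 1, $n = strlen($letters); $i < $n && strlen($result) < 4; $i++) {
        $ch = $letters[$i];
        if ($ch === 'H' || $ch === 'W') {
            continue; // removed, and they do not separate equal numbers
        }
        $code = $codes[$ch] ?? ''; // vowels and Y have no number
        // Step 3: skip a letter whose number repeats the previous one.
        if ($code !== '' && $code !== $last) {
            $result .= $code;
        }
        $last = $code; // a vowel resets $last, so "d, vowel, d" gives two 3s
    }

    // Step 4: pad with zeros up to one letter plus three numbers.
    return str_pad($result, 4, '0');
}

print simple_soundex('stupid') . "\n";
```

The loop stops as soon as three numbers have been collected, which is how longer words are cut down to the four-character code.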

Description of Function:

soundex: Calculates the soundex key of a string

Syntax:

string soundex ( string $str )

Parameters: $str : input string

Return value: Returns the soundex key as a string.
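As a quick illustration of the function on its own, the sketch below prints the soundex key for a few spellings. The word list is chosen only for this example.

Code: PHP

```php
<?php
// Spellings picked for illustration; the last one drops the "t".
$variants = ['stupid', 'stpid', 'stuuupid', 'sstuuupiid', 'supid'];

// soundex() takes one string and returns its four-character key.
$keys = array_map('soundex', $variants);

foreach ($variants as $i => $word) {
    printf("%-12s => %s\n", $word, $keys[$i]);
}
```

The first four spellings produce S313, while supid produces S130 because the t is missing, so it would not count as a match.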

The following example compares the word 'stupid' against a list of words and reports which of them sound like it.

Code: PHP

<?php
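$word2find = 'stupid';

$words = array('stupid', 'stu and pid', 'hello', 'foobar', 'stpid', 'supid', 'stuuupid', 'sstuuupiiid');

// Compute the soundex key of the target word once, outside the loop.
$target_code = soundex($word2find);

// Iterate with foreach (each() is not available in PHP 8).
foreach ($words as $str) {

    // Two words "sound alike" when their soundex keys are equal.
    if (soundex($str) == $target_code) {
        print '"' . $word2find . '" sounds like ' . $str;
    }
    else {
        print '"' . $word2find . '" sounds not like ' . $str;
    }

    print "\n";

}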
?>


Output:
Code:

"stupid" sounds like stupid

"stupid" sounds not like stu and pid

"stupid" sounds not like hello

"stupid" sounds not like foobar

"stupid" sounds like stpid

"stupid" sounds not like supid

"stupid" sounds like stuuupid

"stupid" sounds like sstuuupiiid


References



Soundex at Wikipedia
soundex function at php.net

