Find Similar Sounding Words in PHP

Discussion in 'PHP' started by pradeep, May 9, 2012.


    How To Find Similar Sounding Words



    Suppose we want a utility that finds words that sound alike. For example, stupid, stpid, stuuupid and sstuuupiid all have the same Soundex code, S313. Soundex is a phonetic algorithm that computes a code for each English word passed to it, so that similar sounding words end up with the same code. In PHP we can find similar sounding words with the built-in soundex() function. This can be useful for spell checking and document checking.

    Rules for Soundex



    The first letter of the word becomes the first letter of the Soundex code. Cross out all vowels (A, E, I, O, U, Y) and the letters H and W that follow the initial letter, then replace the remaining consonants with digits. If fewer than three digits result, fill the remaining places with zeroes. The final Soundex code is the first letter of the word followed by three digits (e.g. stupid is coded as S313).

    Summing It Up Into Steps



    Step 1: The first letter of the word becomes the first letter of the Soundex code.

    Step 2: The consonants are replaced with numbers as below:

    Code:
    b, f, p, v become 1
    
    c, g, j, k, q, s, x, z become 2
    
    d, t become 3
    
    l becomes 4
    
    m, n become 5
    
    r becomes 6
    
    a, e, i, o, u, y, h, w are removed
    
    Step 3: If two adjacent letters have the same number, the second is removed.

    Step 4: If fewer than three digits remain, pad with zeros to make up the three digits; if more than three remain, keep only the first three.
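
    To make the steps concrete, here is a rough sketch of them in plain PHP. The function name simple_soundex() is made up for this illustration and is not part of PHP; the built-in soundex() is what you should use in real code. It assumes plain ASCII input, treats Y like a vowel as in the rules above, and assumes that H and W do not break up a run of equal codes.

```php
<?php
// Hypothetical helper for illustration only; PHP ships its own soundex().
function simple_soundex(string $word): string
{
    // Step 2 table: consonant => digit. Vowels, Y, H and W have no entry.
    $codes = [
        'B' => '1', 'F' => '1', 'P' => '1', 'V' => '1',
        'C' => '2', 'G' => '2', 'J' => '2', 'K' => '2',
        'Q' => '2', 'S' => '2', 'X' => '2', 'Z' => '2',
        'D' => '3', 'T' => '3',
        'L' => '4',
        'M' => '5', 'N' => '5',
        'R' => '6',
    ];

    // Keep ASCII letters only (assumption: spaces and digits are ignored).
    $letters = preg_replace('/[^A-Z]/', '', strtoupper($word));
    if ($letters === '') {
        return '';
    }

    // Step 1: the first letter starts the code.
    $result = $letters[0];
    // Remember its digit so an immediate repeat is not coded again.
    $last = $codes[$letters[0]] ?? '';

    for ($i = 1; $i < strlen($letters) && strlen($result) < 4; $i++) {
        $ch = $letters[$i];
        if ($ch === 'H' || $ch === 'W') {
            continue; // removed without resetting the previous digit
        }
        $code = $codes[$ch] ?? ''; // '' for vowels and Y
        // Step 3: skip a letter whose digit equals the previous one.
        if ($code !== '' && $code !== $last) {
            $result .= $code;
        }
        $last = $code;
    }

    // Step 4: pad with zeros up to one letter plus three digits.
    return str_pad($result, 4, '0');
}

print simple_soundex('stupid') . "\n"; // S313
print simple_soundex('supid') . "\n";  // S130
```

    For the words used in this post the sketch gives the same codes as the built-in function, but it has not been checked against every edge case.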

    Description of Function:

    soundex — Calculate the soundex key of a string

    Syntax:

    string soundex ( string $str )

    Parameters: $str : input string

    Return value: Returns the soundex key as a string.
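
    As a quick check of the function, the snippet below runs soundex() over the four spellings from the introduction. The variable names are just for this example.

```php
<?php
// Spellings taken from the introduction of this post.
$variants = ['stupid', 'stpid', 'stuuupid', 'sstuuupiid'];

// Apply the built-in soundex() to every word.
$codes = array_map('soundex', $variants);

// Collapse duplicates to count the distinct codes.
$unique = array_unique($codes);

print implode(', ', $codes) . "\n"; // S313, S313, S313, S313
print count($unique) . "\n";        // 1
```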

    The following example checks which words in a list sound like the word 'stupid'.

    PHP:
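    <?php
    // Word we want to match phonetically
    $word2find = 'stupid';

    $words = array('stupid', 'stu and pid', 'hello', 'foobar', 'stpid', 'supid', 'stuuupid', 'sstuuupiiid');

    // Compute the target code once, outside the loop
    $target_code = soundex($word2find);

    // each() was removed in PHP 8, so iterate with foreach instead
    foreach ($words as $str) {
        $soundex_code = soundex($str);

        if ($target_code == $soundex_code) {
            print '"' . $word2find . '" sounds like ' . $str;
        }
        else {
            print '"' . $word2find . '" sounds not like ' . $str;
        }

        print "\n";
    }
    ?>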
    Output:
    Code:
    "stupid" sounds like stupid
    
    "stupid" sounds not like stu and pid
    
    "stupid" sounds not like hello
    
    "stupid" sounds not like foobar
    
    "stupid" sounds like stpid
    
    "stupid" sounds not like supid
    
    "stupid" sounds like stuuupid
    
    "stupid" sounds like sstuuupiiid
    
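
    The same comparison can be turned into a small lookup for spell checking, which is the kind of use mentioned at the start of this post. This is only a sketch: suggest_by_sound() and the tiny word list are invented for the example, and a real spell checker would need a proper dictionary.

```php
<?php
// Hypothetical helper: return the dictionary words whose Soundex code
// matches the code of $input.
function suggest_by_sound(string $input, array $dictionary): array
{
    $target = soundex($input);

    $matches = array_filter($dictionary, function ($word) use ($target) {
        return soundex($word) === $target;
    });

    return array_values($matches); // reindex from 0
}

// Made-up word list for the example.
$dictionary = ['stupid', 'hello', 'foobar', 'stopped'];

$suggestions = suggest_by_sound('stpid', $dictionary);
print implode(', ', $suggestions) . "\n"; // stupid, stopped
```

    Note that 'stopped' is suggested too, because it also has the code S313. Soundex matching is coarse, so the results are best treated as candidates rather than exact corrections.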

    References



    Soundex at Wikipedia
    soundex function at php.net
     
