
Web Technologies-I



                                 The Soundex and Metaphone algorithms each yield a string that represents roughly how a
                                 word is pronounced in English. To see whether two strings are approximately equal with these
                                 algorithms, compare the strings the algorithms produce for them. You can compare Soundex
                                 values only to Soundex values and Metaphone values only to Metaphone values. The Metaphone
                                 algorithm is generally more accurate, as the following example demonstrates:
                                 $known = "Fred";
                                 $query = "Phred";
                                 if (soundex($known) == soundex($query)) {
                                     print "soundex: $known sounds like $query<br>";
                                 } else {
                                     print "soundex: $known does not sound like $query<br>";
                                 }
                                 if (metaphone($known) == metaphone($query)) {
                                     print "metaphone: $known sounds like $query<br>";
                                 } else {
                                     print "metaphone: $known does not sound like $query<br>";
                                 }
                                 soundex: Fred does not sound like Phred
                                 metaphone: Fred sounds like Phred
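
                                 To make the comparison concrete, the short sketch below prints the keys that the two
                                 functions produce for a pair of words. The helper name phonetic_keys( ) is hypothetical,
                                 introduced only for illustration; soundex( ) and metaphone( ) are the built-in PHP
                                 functions discussed above.

```php
<?php
// Hypothetical helper (not part of PHP): collect both phonetic keys for a word.
function phonetic_keys(string $word): array
{
    return [
        'soundex'   => soundex($word),
        'metaphone' => metaphone($word),
    ];
}

// Print the two keys for each word so they can be compared side by side.
foreach (['Fred', 'Phred'] as $word) {
    $keys = phonetic_keys($word);
    printf("%-6s soundex=%s metaphone=%s\n", $word, $keys['soundex'], $keys['metaphone']);
}
```

                                 Soundex always keeps the first letter of the word, so two words that begin with different
                                 letters never share a Soundex key, whereas Metaphone encodes a leading "Ph" as an "F" sound.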

                                 The similar_text( ) function returns the number of characters that its two string arguments have
                                 in common. The third argument, if present, is a variable in which to store the commonality as a
                                 percentage:
                                 $string_1 = "XYZ";
                                 $string_2 = "AXBYZ";
                                 $common = similar_text($string_1, $string_2, $percent);
                                 printf("They have %d chars in common (%.2f%%).", $common, $percent);
                                 They have 3 chars in common (75.00%).
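
                                 As a rough check on where the percentage comes from, the sketch below recomputes it by hand.
                                 In the current PHP implementation the percentage is twice the number of common characters
                                 divided by the combined length of the two strings. The sample strings and variable names are
                                 chosen only for illustration.

```php
<?php
$first  = "World";
$second = "Word";

// similar_text() returns the count and stores the percentage in $percent.
$common = similar_text($first, $second, $percent);

// Recompute the percentage from the count and the two string lengths.
$by_hand = $common * 2 * 100 / (strlen($first) + strlen($second));

printf("%d chars in common, %.2f%% (recomputed: %.2f%%)\n", $common, $percent, $by_hand);
```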
                                 The Levenshtein algorithm calculates the similarity of two strings based on how many characters
                                 you must add, substitute, or remove to make them the same. For instance, “cat” and “cot” have
                                 a Levenshtein distance of 1, because you need to change only one character (the “a” to an “o”)
                                 to make them the same:
                                 $similarity = levenshtein("cat", "cot"); // $similarity is 1
                                 This measure of similarity is generally quicker to calculate than that used by the similar_text( )
                                 function. Optionally, you can pass three values to the levenshtein( ) function to individually
                                 weight insertions, replacements, and deletions, for instance, to compare a word against a
                                 contraction.

                                 This example excessively weights insertions when comparing a string against its possible
                                 contraction, because contractions should never insert characters:
                                 echo levenshtein('would not', 'wouldn\'t', 500, 1, 1);
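
                                 A small sketch of how the cost arguments change the result. PHP takes the three costs in the
                                 order insertion, replacement, deletion; the strings and the cost of 5 are arbitrary values
                                 chosen for illustration.

```php
<?php
// Default costs: every insertion, replacement, or deletion counts as 1.
$plain = levenshtein("cat", "cats");             // one insertion

// Weighted costs, in the order insertion, replacement, deletion.
$grow   = levenshtein("cat", "cats", 5, 1, 1);   // the insertion now costs 5
$shrink = levenshtein("cats", "cat", 5, 1, 1);   // a deletion still costs 1

echo "$plain $grow $shrink\n";
```

                                 With unequal costs the distance is no longer symmetric, so the order of the two strings
                                 matters.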

                                 5.7 Manipulating and Searching Strings


                                 PHP has many functions to work with strings. The most commonly used functions for searching
                                 and modifying strings are those that use regular expressions to describe the string in question.
                                 The functions described in this section do not use regular expressions. They are faster than
                                 regular expressions, but they work only when you are looking for a fixed string.
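
                                 For instance, a fixed-string search needs no pattern syntax at all. The sketch below uses
                                 strpos( ), one of PHP's fixed-string search functions; the sample text is made up for
                                 illustration.

```php
<?php
$text = "The quick brown fox";

// strpos() returns the zero-based offset of the first match, or false if absent.
$found   = strpos($text, "quick");
$missing = strpos($text, "slow");

// Compare with !== false, because a match at offset 0 would also be "falsy".
if ($found !== false) {
    echo "Found at offset $found\n";
}
if ($missing === false) {
    echo "No match for 'slow'\n";
}
```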

                                 5.7.1 Substrings
                                 If you know where in a larger string the interesting data lies, you can copy it out with the substr( )
                                 function:
                                 $piece = substr(string, start [, length ]);
                                 The start argument is the position in string at which to begin copying, with 0 meaning the start of
                                 the string. The length argument is the number of characters to copy (the default is to copy until
                                 the end of the string).

                                        Example:

                                 $name = "Fred Flintstone";
                                 $fluff = substr($name, 6, 4);  // $fluff is "lint"
                                 $sound = substr($name, 11);    // $sound is "tone"
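
                                 A few more calls may help show how the two arguments interact. This is only a sketch; the
                                 negative start value in the last call is not described in the text above, but it is
                                 documented PHP behavior in which the position is counted back from the end of the string.

```php
<?php
$name = "Fred Flintstone";

$first = substr($name, 0, 4);   // start at offset 0, copy 4 characters
$rest  = substr($name, 5);      // no length: copy through the end of the string
$tail  = substr($name, -5);     // negative start counts back from the end

echo "$first | $rest | $tail\n";
```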





        110                               LOVELY PROFESSIONAL UNIVERSITY