Web Technologies-I
$found = ereg($pattern, "http://www.xxit.edu.au"); // false
Escaping Special Characters
We have already discussed the need to escape the special meaning of characters used as operators
in a regular expression. Whether a character needs escaping, however, depends on where it is
used. The special meaning of a character is escaped with a backslash, as in the
expression "2\+3", which matches the string "2+3". If the + is not escaped, the pattern matches
one or more occurrences of the character 2 followed by the character 3. Another way to write
this expression is to place the + in a list of characters, as in "2[+]3". Because + has no
special meaning inside a list, it does not need to be escaped in that context. Using character lists in this
way can improve readability. The following examples show how escaping is used and avoided:
// Need to escape '(' and ')'
$phone = "(03) 9429 5555";
$found = ereg("^\([0-9]{2,3}\)", $phone); // true
// No need to escape (*.+?)| within brackets
$special = "Special Characters are (, ), *, +, ?, |";
$found = ereg("[(*.+?)|]", $special); // true
// The backslash always needs to be quoted
$backSlash = 'The backslash \ character';
$found = ereg('^[a-zA-Z \\]*$', $backSlash); // true
// Do not need to escape the dot within brackets
$domain = "www.ora.com";
$found = ereg("[.]com", $domain); // true
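The escaping rules above can also be tried on current PHP versions. The following is a minimal sketch, not part of the original examples: it assumes the PCRE function preg_match() in place of ereg(), because the ereg family was removed in PHP 7.0. The "/" delimiters around each pattern and all variable names are illustrative assumptions.

```php
<?php
// Sketch only: preg_match() stands in for ereg(), which was removed in PHP 7.0.
// PCRE patterns need delimiters; "/" is used here.

// An escaped "+" is a literal plus sign.
$escaped = preg_match('/2\+3/', '2+3') === 1;          // true

// An unescaped "+" means "one or more of the preceding 2".
$repeated = preg_match('/^2+3$/', '2223') === 1;       // true
$literal  = preg_match('/^2+3$/', '2+3') === 1;        // false

// Inside a character list the "+" has no special meaning.
$inList = preg_match('/2[+]3/', '2+3') === 1;          // true

// Parentheses must be escaped outside a list.
$phone = preg_match('/^\([0-9]{2,3}\)/', '(03) 9429 5555') === 1;  // true

// preg_quote() escapes the special characters in a literal string.
$quoted = preg_quote('2+3');                           // 2\+3
```

When the text to match comes from a variable rather than a hand-written pattern, preg_quote() is usually the safer choice, because it escapes every special character rather than only the ones the author remembered.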
Another complication arises because a regular expression is passed as a string to
the regular expression functions. Strings in PHP also use the backslash character to escape
quotes and to encode tabs, newlines, and so on. Consider the following example, which matches
a backslash character:
// Single-quoted string containing a backslash
$backSlash = '\ backslash';
// Evaluates to true
$found = ereg("^\\\\ backslash", $backSlash);
The regular expression looks odd: to match a backslash, the pattern must escape the backslash
with a second backslash, and because the pattern is written as a double-quoted PHP string, each
of those two backslashes must itself be escaped, giving four in total.
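The two layers of escaping can be made visible by comparing the pattern as PHP source with the string the regular expression engine actually receives. This is a minimal sketch, assuming preg_match() in place of ereg() (removed in PHP 7.0); the variable names are illustrative only.

```php
<?php
// Sketch only: preg_match() stands in for ereg(), which was removed in PHP 7.0.

// Single quotes: a backslash followed by a space stays a literal backslash.
$subject = '\ backslash';
$subjectLength = strlen($subject);            // 11 characters

// Double-quoted pattern: PHP turns \\\\ into \\, and the regular
// expression engine then reads \\ as one literal backslash.
$doubleQuoted = "/^\\\\ backslash/";

// Single-quoted pattern: \\ is still a PHP escape for one backslash,
// so four backslashes are needed here as well.
$singleQuoted = '/^\\\\ backslash/';

// Both source forms produce the same pattern string.
$samePattern = ($doubleQuoted === $singleQuoted);         // true

$found = preg_match($doubleQuoted, $subject) === 1;       // true
```

Printing the pattern with echo before using it is a quick way to check how many backslashes survive PHP's own string parsing.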
Metacharacters
Metacharacters can also be used in regular expressions. For example, the tab character is
represented as \t and the newline character as \n. There are also shortcuts: \d means any
digit, and \s means any whitespace character. The following example returns true because the tab character,
\t, is contained in the $source string:
$source = "fast\tfood";
$result = ereg('\s', $source); // true
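The shortcuts \d, \s, and \t can be checked with the Perl-compatible functions. The sketch below assumes preg_match() rather than ereg() (removed in PHP 7.0), and its variable names are illustrative. It also shows how the choice of quotes decides whether \t in the subject is a real tab.

```php
<?php
// Sketch only: preg_match() stands in for ereg(), which was removed in PHP 7.0.

// Double quotes: PHP converts \t into a real tab character.
$source = "fast\tfood";

$hasWhitespace = preg_match('/\s/', $source) === 1;   // true: a tab is whitespace
$hasTab        = preg_match('/\t/', $source) === 1;   // true: PCRE understands \t
$hasDigit      = preg_match('/\d/', $source) === 1;   // false: no digits present

// Single quotes: PHP leaves \t alone, so the string holds a
// backslash followed by the letter t, and no tab at all.
$noTab = 'fast\tfood';
$singleQuotedHasTab = preg_match('/\t/', $noTab) === 1;   // false

// The two subjects differ in length by one character.
$lengths = [strlen($source), strlen($noTab)];             // [9, 10]
```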
120 LOVELY PROFESSIONAL UNIVERSITY