
Unit 5: Strings



            // true if the string contains at least one non-alphanumeric character
            $found = ereg("[^0-9a-zA-Z]", "123abc"); // false

            The ^ character loses its special meaning when it is placed anywhere other than the start of
            the characters enclosed in the brackets. For example, “[0-9^]” matches the characters 0 to 9 and the
            ^ character. Similarly, the - character can be matched literally by placing it at the start or the end
            of the list; for example, “[-123]” matches the characters -, 1, 2, or 3. The characters ^ and - have
            different meanings outside the [] character lists.
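
            The positional rules for ^ and - inside a character list can be checked with a short script.
            This is only a sketch: it assumes a PHP version where ereg() is no longer available (it was
            removed in PHP 7), so preg_match() is used instead with equivalent patterns. The helper name
            contains() and the test subjects are hypothetical and do not come from the text.

```php
<?php
// Sketch only: preg_match() stands in for ereg(), which was removed in PHP 7.
// contains() is a hypothetical helper, not a PHP built-in.
function contains(string $pattern, string $subject): bool
{
    // preg_match() returns 1 when the pattern is found, 0 when it is not.
    return preg_match($pattern, $subject) === 1;
}

$results = [
    // ^ first inside the brackets negates the list: any non-digit character
    'negated'  => contains('/[^0-9]/', '2^8'),
    // ^ anywhere else in the list is a literal caret
    'caret'    => contains('/[0-9^]/', '^'),
    'nocaret'  => contains('/[0-9^]/', 'abc'),
    // - at the start of the list is a literal hyphen, not a range
    'hyphen'   => contains('/[-123]/', 'a-b'),
    'nohyphen' => contains('/[-123]/', 'abc'),
];

var_dump($results);
```

            In this sketch only 'nocaret' and 'nohyphen' come out false, because “abc” contains neither a
            digit, a caret, nor a hyphen.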

            Anchors
            A regular expression can specify that a pattern occurs at the start or end of a subject string using
            anchors. The ^ anchors a pattern to the start, and the $ character anchors a pattern to the end of
            a string. For example, the expression ereg("^php", $var) matches strings that start with “php”
            but not others. The following code shows the operation of both anchors:
            $var = "to be or not to be";
            $match = ereg('^to', $var); // true
            $match = ereg('be$', $var); // true
            $match = ereg('^or', $var); // false

            The following illustrates the difference between the use of ^ as an anchor and the use of ^ in a
            character list:
            $var = "123467";
            // match strings that start with a digit
            $match = ereg("^[0-9]", $var); // true

            // match strings that contain any character other than a digit
            $match = ereg("[^0-9]", $var); // false

            Both start and end anchors can be used in a single regular expression to match a whole string.
            The following example illustrates this:


            // Must match "Yes" exactly
            $match = ereg('^Yes$', "Yes");     // true
            $match = ereg('^Yes$', "Yes sir"); // false
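
            Whole-string matching with both anchors can be wrapped in a small function. The following is a
            minimal sketch, assuming a PHP version without ereg() (removed in PHP 7), so it relies on
            preg_match(). The function name isExactly() is hypothetical. The D modifier is added because,
            in preg_match(), $ would otherwise also match just before a trailing newline.

```php
<?php
// Sketch only: preg_match() replaces ereg(), which is unavailable in PHP 7 and later.
// isExactly() is a hypothetical helper name.
function isExactly(string $word, string $subject): bool
{
    // preg_quote() escapes any regex metacharacters in $word.
    // ^ and $ anchor the pattern to both ends of the subject;
    // the D modifier makes $ match only at the very end of the subject.
    $pattern = '/^' . preg_quote($word, '/') . '$/D';
    return preg_match($pattern, $subject) === 1;
}

var_dump(isExactly('Yes', 'Yes'));     // bool(true)
var_dump(isExactly('Yes', 'Yes sir')); // bool(false)
var_dump(isExactly('Yes', "Yes\n"));   // bool(false)
```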

            Optional and Repeating Characters
            When a character in a regular expression is followed by the ? operator, that character matches zero
            or one time. In other words, ? marks something that is optional. A character followed by + matches
            one or more times, and a character followed by * matches zero or more times. Let’s look at concrete
            examples of these operators.
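
            The three operators can also be compared side by side. This sketch assumes preg_match() in
            place of ereg() (removed in PHP 7) and anchors every pattern at both ends so that the repetition
            count is tested exactly. The helper matchesWhole() and the pattern xy+z are hypothetical
            illustrations, not taken from the text.

```php
<?php
// Sketch only: preg_match() stands in for ereg(), which was removed in PHP 7.
// matchesWhole() and the pattern xy+z are hypothetical illustrations.
function matchesWhole(string $regex, string $subject): bool
{
    // Anchor both ends so the whole subject must fit the pattern.
    return preg_match('/^' . $regex . '$/D', $subject) === 1;
}

$checks = [
    'pe?p' => ['pp', 'pep', 'peep'],  // ? allows zero or one "e"
    'po*p' => ['pp', 'pop', 'poop'],  // * allows zero or more "o"
    'xy+z' => ['xz', 'xyz', 'xyyz'],  // + requires at least one "y"
];

$results = [];
foreach ($checks as $regex => $subjects) {
    foreach ($subjects as $subject) {
        $results[$regex][$subject] = matchesWhole($regex, $subject);
    }
}

print_r($results);
```

            Under these assumptions, only “peep” (two occurrences where ? allows at most one) and “xz”
            (no occurrence where + requires at least one) fail to match.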


            The ? operator allows zero or one occurrence of a character, so the expression:

            ereg("pe?p", $var)

            matches either “pep” or “pp”, but not the string “peep”. The * operator allows zero or more
            occurrences of the “o” in the expression:

            ereg("po*p", $var)

            and matches “pp”, “pop”, “poop”, “pooop”, and so on. Finally, the + operator allows one or more
            occurrences of “b” in the expression:

