
Unit 5: Strings




            ereg("(123)+", $var)
            The expression matches “123”, “123123”, “123123123”, and so on. Grouping characters allows
            complex patterns to be expressed, as in the following example that matches an alphabetic-only URL:
            // A simple, incomplete HTTP URL regular expression
            // that does not allow numbers
            $pattern = '^(http://)?[a-zA-Z]+(\.[a-zA-Z]+)+$';
            $found = ereg($pattern, "www.ora.com"); // true
            Figure 5.1 shows the parts of this complex regular expression and how they are interpreted.
            The regular expression assigned to $pattern includes both the start and end anchors, ^ and $, so
            the whole subject string, “www.ora.com”, must match the pattern. The pattern starts with the
            optional group of characters “http://”, as specified by “(http://)?”. This group does not match any
            part of the subject string in the example, but that does not rule out a match because the group is
            optional. Next, the “[a-zA-Z]+” pattern specifies one or more alphabetic characters, and this matches
            “www” from the subject string. The next pattern is the group “(\.[a-zA-Z]+)”. This group must
            start with a period (the wildcard meaning of . is escaped with the backslash) followed by one
            or more alphabetic characters. The group is followed by the + operator, so it
            must occur at least once in the subject and can repeat many times. In the example, the
            first occurrence is “.ora” and the second occurrence is “.com”.
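
The walk-through above can be checked with a short script. Because ereg() was removed in PHP 7, this sketch substitutes preg_match() from the PCRE extension, which requires delimiters around the pattern. The helper name isAlphaUrl() and every test string other than "www.ora.com" are assumptions made for illustration.

```php
<?php
// Sketch: the alphabetic-only URL pattern, rewritten for preg_match().
// '#' is used as the delimiter so the slashes in "http://" need no escaping.
function isAlphaUrl(string $subject): bool
{
    $pattern = '#^(http://)?[a-zA-Z]+(\.[a-zA-Z]+)+$#';
    return preg_match($pattern, $subject) === 1;
}

var_dump(isAlphaUrl('www.ora.com'));        // bool(true): optional "http://" is absent
var_dump(isAlphaUrl('http://www.ora.com')); // bool(true): optional group is present
var_dump(isAlphaUrl('www'));                // bool(false): the dotted group never occurs
var_dump(isAlphaUrl('www.ora2.com'));       // bool(false): a digit is outside [a-zA-Z]
```

The character class is written [a-zA-Z] in both places. A range ending in a lowercase z, such as A-z, would also admit the punctuation characters that sit between Z and a in ASCII.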
                                Figure 5.1: Regular Expression with Groups

            Groups can also define subpatterns when ereg() extracts values into an array.
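
As a sketch of how groups expose subpatterns: ereg() accepted an optional third argument that received the matched groups. Since ereg() was removed in PHP 7, this example uses preg_match(), whose third argument plays a similar role. The variable names are illustrative assumptions.

```php
<?php
// Sketch: read the groups of the URL pattern after a successful match.
// preg_match() stands in for ereg(), which is not available in PHP 7 and later.
$pattern = '#^(http://)?[a-zA-Z]+(\.[a-zA-Z]+)+$#';
$matches = [];
$found = preg_match($pattern, 'http://www.ora.com', $matches) === 1;

// $matches[0] is the whole match; $matches[1] and $matches[2] are the groups,
// numbered by the order of their opening parentheses.
echo $matches[0], "\n"; // http://www.ora.com
echo $matches[1], "\n"; // http://
echo $matches[2], "\n"; // .com (a repeated group keeps only its last repetition)
```

To capture every dotted part separately, the pattern would need one group per part, or the subject could be split on the period after it has been validated.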
            Alternative Patterns

            Alternatives in a pattern are specified with the | operator; for example, the pattern “cat|bat|rat”
            matches “cat”, “bat”, or “rat”. The | operator has the lowest precedence of the regular expression
            operators, treating the largest surrounding expressions as alternative patterns. To match “cat”,
            “bat”, or “rat” another way, the following expression can be used:




            $var = "bat";
            $found = ereg("(c|b|r)at", $var);  // true
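
To illustrate the low precedence of the | operator, the following sketch compares an ungrouped alternation with a grouped one when anchors are involved. It uses preg_match() in place of ereg(), which was removed in PHP 7. The helper patternMatches() and the sample strings are assumptions made for illustration.

```php
<?php
// Hypothetical helper: true when $pattern matches somewhere in $subject.
function patternMatches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Without parentheses, ^ binds only to "cat" and $ only to "rat",
// so the pattern reads as (^cat) or (bat) or (rat$).
var_dump(patternMatches('/^cat|bat|rat$/', 'combat zone'));   // bool(true): "bat" may appear anywhere

// With parentheses, both anchors apply to every alternative.
var_dump(patternMatches('/^(cat|bat|rat)$/', 'combat zone')); // bool(false)
var_dump(patternMatches('/^(c|b|r)at$/', 'bat'));             // bool(true)
```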
            Another example shows alternative endings to a pattern:
            // match some URL domains
            $pattern = '(com$|net$|gov$|edu$)';
            $found = ereg($pattern, "http://www.ora.com"); // true
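
The same test can be sketched with the end anchor factored out of the alternatives. This version uses preg_match() because ereg() was removed in PHP 7. The helper name hasKnownSuffix() and the sample strings other than "http://www.ora.com" are assumptions made for illustration.

```php
<?php
// Sketch: check whether a string ends in one of four domain suffixes.
function hasKnownSuffix(string $subject): bool
{
    // Same meaning as 'com$|net$|gov$|edu$', with the $ anchor written once.
    return preg_match('/(com|net|gov|edu)$/', $subject) === 1;
}

var_dump(hasKnownSuffix('http://www.ora.com'));     // bool(true)
var_dump(hasKnownSuffix('http://www.example.org')); // bool(false)
var_dump(hasKnownSuffix('telecom'));                // bool(true): no period is required
```

As the last call shows, the pattern only inspects the final characters of the subject. A stricter variant could require the preceding period, for example '/\.(com|net|gov|edu)$/'.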




                                             LOVELY PROFESSIONAL UNIVERSITY                                   119