next up previous contents index
Next: CASE Clause Up: Customizing Queries Previous: LIKE Comparison

  
Regular Expressions

 Regular expressions allow more powerful comparisons than LIKE  and NOT LIKE. Regular expression comparisons are a unique feature of POSTGRESQL. They are very common in Unix, such as in the Unix grep  command.9.4

Table [*] lists the regular expression operators , and Table [*]

 
Table: Regular expression operators
Comparison Operator 
regular expression ~
regular expression, case-insensitive ~*
not equal to regular expression !~
not equal to regular expression, case-insensitive !~*


lists the regular expression special characters.
 
Table: Regular expression special characters
Test Special Characters
start ^
end $
any single character .
set of characters [ccc]
set of characters not equal [^ccc]
range of characters [c-c]
range of characters not equal [^c-c]
zero or one of previous character ?
zero or multiple of previous characters *
one or multiple of previous characters +
OR operator  


Note that the caret (^) has a different meaning outside and inside square brackets ([ ]).

Although regular expressions are powerful, they can be complex to create. Table [*] shows some examples, and Figure [*] shows selected queries using regular expressions. For a description of each query, see the comment above it.

 
Table: Examples of regular expressions
Test Operation
begins with D ~ '^D'
contains D ~ 'D'
D in second position ~ '^.D'
begins with D and contains e ~ '^D.*e'
begins with D, contains e, and then f ~ '^D.*e.*f'
contains A, B, C, or D ~ '[A-D]' or  ~ '[ABCD]'
contains A or a ~* 'a' or  ~ '[Aa]'
does not contain D !~ 'D'
does not begin with D !~ '^D' or  ~ '^[^D]'
begins with D, with one optional leading space ~ '^ ?D'
begins with D , with optional leading spaces ~ '^ *D'
begins with D, with at least one leading space ~ '^ +D'
ends with G, with optional trailing spaces ~ 'G *$'


 

        
        test=> SELECT * FROM friend 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Dean            | Yeager               | Plymouth        | MA    |  24
         Dick            | Gleason              | Ocean City      | NJ    |  19
         Ned             | Millstone            | Cedar Creek     | MD    |  27
         Sandy           | Gleason              | Ocean City      | NJ    |  25
         Sandy           | Weber                | Boston          | MA    |  33
         Victor          | Tabor                | Williamsport    | PA    |  22
        (6 rows) 
         
        test=> -- firstname begins with 'S'
        test=> SELECT * FROM friend 
        test-> WHERE firstname ~ '^S' 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Sandy           | Gleason              | Ocean City      | NJ    |  25
         Sandy           | Weber                | Boston          | MA    |  33
        (2 rows) 
         
        test=> -- firstname has an e in the second position
        test=> SELECT * FROM friend 
        test-> WHERE firstname ~ '^.e' 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Dean            | Yeager               | Plymouth        | MA    |  24
         Ned             | Millstone            | Cedar Creek     | MD    |  27
        (2 rows) 
         
        test=> -- firstname contains b, B, c, or C
        test=> SELECT * FROM friend 
        test-> WHERE firstname ~* '[bc]' 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Dick            | Gleason              | Ocean City      | NJ    |  19
         Victor          | Tabor                | Williamsport    | PA    |  22
        (2 rows) 
         
        test=> -- firstname does not contain s or S
        test=> SELECT * FROM friend 
        test-> WHERE firstname !~* 's' 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Dean            | Yeager               | Plymouth        | MA    |  24
         Dick            | Gleason              | Ocean City      | NJ    |  19
         Ned             | Millstone            | Cedar Creek     | MD    |  27
         Victor          | Tabor                | Williamsport    | PA    |  22
        (4 rows)
        
 

Figure [*] shows two more complex regular expressions.  

        test=> -- firstname ends with n
        test=> SELECT * FROM friend 
        test-> WHERE firstname ~ 'n *$' 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Dean            | Yeager               | Plymouth        | MA    |  24
        (1 row) 
         
        test=> -- firstname contains a non-S character
        test=> SELECT * FROM friend 
        test-> WHERE firstname ~ '[^S]' 
        test-> ORDER BY firstname;
            firstname    |       lastname       |      city       | state | age 
        -----------------+----------------------+-----------------+-------+-----
         Dean            | Yeager               | Plymouth        | MA    |  24
         Dick            | Gleason              | Ocean City      | NJ    |  19
         Ned             | Millstone            | Cedar Creek     | MD    |  27
         Sandy           | Gleason              | Ocean City      | NJ    |  25
         Sandy           | Weber                | Boston          | MA    |  33
         Victor          | Tabor                | Williamsport    | PA    |  22
        (6 rows)
 

The first query demonstrates how to properly test for a trailing n.  Because CHAR() columns contain trailing spaces to fill the column, you must test for possible trailing spaces.  (See Section [*] for complete coverage on character data types.) The second query might seem surprising. Some might think that it returns rows that do not contain an S. Actually, it returns all rows that have any character that is not an S. For example, Sandy contains characters that are not S, such as a, n, d, and y, so that row is returned. The test would prevent rows containing only S's from being printed.

You can also test for the literal characters listed in Table [*]. Use of a backslash removes any special meaning from the character that follows it. For example, to test for a dollar sign, use \$. To test for an asterisk, use \*. To test for a literal backslash, use two backslashes (\\).

Because regular expressions are so powerful, creating them can be challenging. Try some queries on the friend table until you are comfortable with regular expression comparisons. 


next up previous contents index
Next: CASE Clause Up: Customizing Queries Previous: LIKE Comparison
Bruce Momjian
2001-05-09