Posix Character Classes - Cisco CS-MARS-20-K9 - Security MARS 20 User Manual

Appendix B
Regular Expression Reference
The minus (hyphen) character can be used to specify a range of characters in a character class. For
example, [d-m] matches any letter between d and m, inclusive. If a minus character is required in a class,
it must be escaped with a backslash or appear in a position where it cannot be interpreted as indicating
a range, typically as the first or last character in the class.
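The range and literal-hyphen rules above can be tried out with a short sketch. It uses Python's standard `re` module, which is not PCRE but treats these particular classes the same way; the constant names and the `matches_char` helper are invented for illustration.

```python
import re

# [d-m] is a range: any single lowercase letter from d to m, inclusive.
RANGE_D_TO_M = re.compile(r"[d-m]")

# Three ways to put a literal hyphen into a class that also holds "a" and "z":
HYPHEN_ESCAPED = re.compile(r"[a\-z]")  # escaped with a backslash
HYPHEN_FIRST = re.compile(r"[-az]")     # first character in the class
HYPHEN_LAST = re.compile(r"[az-]")      # last character in the class


def matches_char(pattern, ch):
    """Return True if the compiled class pattern matches the single character ch."""
    return pattern.fullmatch(ch) is not None
```

With these definitions, `matches_char(RANGE_D_TO_M, "m")` is true and `matches_char(RANGE_D_TO_M, "n")` is false, while each of the three hyphen patterns accepts "a", "z", and "-" but rejects "m".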
An unescaped "]" cannot be the end character of a range. A pattern such as [W-]46]
is interpreted as a class of two characters ("W" and "-") followed by the literal string "46]", so it
matches "W46]" or "-46]". However, if the "]" is escaped with a backslash, it is interpreted as the end of the
range, so [W-\]46] is interpreted as a class containing a range followed by two other characters. The octal
or hexadecimal representation of "]" can also be used to end a range.
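The difference between the two patterns is easier to see by listing what each one accepts. The following is a minimal sketch using Python's `re` module, assumed here to agree with PCRE on these cases; `class_members` is a hypothetical helper, not part of any library.

```python
import re

# Unescaped "]" closes the class early: [W-] is a two-character class
# ("W" and "-"), and "46]" is a literal string that must follow it.
CLOSED_EARLY = re.compile(r"[W-]46]")

# Escaped "]" ends the range W..], so the class holds W X Y Z [ \ ]
# plus the two other characters "4" and "6".
ESCAPED_END = re.compile(r"[W-\]46]")

# The same class with "]" written in hexadecimal (0x5d).
HEX_END = re.compile(r"[W-\x5d46]")


def class_members(pattern):
    """Return the ASCII characters matched by a one-character pattern."""
    return "".join(chr(c) for c in range(128) if pattern.fullmatch(chr(c)))
```

Here `CLOSED_EARLY` matches the four-character strings "W46]" and "-46]", whereas `ESCAPED_END` and `HEX_END` each match exactly one character drawn from the same nine-member class.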
Ranges operate in the collating sequence of character values. They can also be used for characters
specified numerically, for example [\000-\037]. In UTF-8 mode, ranges can include characters whose
values are greater than 255, for example [\x{100}-\x{2ff}].
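As a rough illustration of numerically specified ranges, the sketch below uses Python's `re` module. Two assumptions apply: Python accepts the same three-digit octal escapes inside a class, and it writes code points above 255 as `\uXXXX` rather than with the PCRE `\x{...}` form shown above. The names are invented for the example.

```python
import re

# Octal escapes: the ASCII control characters 0 through 31.
CONTROL_CHARS = re.compile(r"[\000-\037]")

# Code points above 255. Python spells these \uXXXX; the \x{...} form
# in the text is PCRE syntax and is not accepted by Python's re.
ABOVE_255 = re.compile(r"[\u0100-\u02ff]")


def in_class(pattern, ch):
    """Return True if the one-character pattern matches ch."""
    return pattern.fullmatch(ch) is not None
```

For instance, `in_class(CONTROL_CHARS, "\x1f")` is true and `in_class(CONTROL_CHARS, " ")` is false, because the space character has value 32 (octal 040).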
If a range that includes letters is used when caseless matching is set, it matches the letters in either case.
For example, [W-c] is equivalent to [][\\^_`wxyzabc], matched caselessly, and in non-UTF-8 mode, if
character tables for the "fr_FR" locale are in use, [\xc8-\xcb] matches accented E characters in both
cases. In UTF-8 mode, PCRE supports the concept of case for characters with values greater than 128
only when it is compiled with Unicode property support.
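The [W-c] equivalence can be checked by enumeration. This sketch uses Python's `re.IGNORECASE` flag as a stand-in for PCRE caseless matching and looks only at ASCII characters; locale tables and Unicode property support are not modeled, and `ascii_members` is a hypothetical helper.

```python
import re

# The range W..c covers W X Y Z [ \ ] ^ _ ` a b c.
CASED_RANGE = re.compile(r"[W-c]")

# Matched caselessly, the letters in the range are accepted in either case.
CASELESS_RANGE = re.compile(r"[W-c]", re.IGNORECASE)


def ascii_members(pattern):
    """Return the ASCII characters matched by a one-character pattern."""
    return "".join(chr(c) for c in range(128) if pattern.fullmatch(chr(c)))
```

Comparing `ascii_members(CASED_RANGE)` with `ascii_members(CASELESS_RANGE)` shows that the caseless version adds "A", "B", "C" and "w", "x", "y", "z" to the original thirteen characters.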
The character types \d, \D, \p, \P, \s, \S, \w, and \W may also appear in a character class, and add the
characters that they match to the class. For example, [\dABCDEF] matches any hexadecimal digit. A
circumflex can conveniently be used with the upper case character types to specify a more restricted set
of characters than the matching lower case type. For example, the class [^\W_] matches any letter or
digit, but not underscore.
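Both examples can be verified with a small sketch in Python's `re` module. The `re.ASCII` flag is an assumption made here to keep \d and \w restricted to ASCII, approximating matching outside UTF-8 mode; the names are invented for illustration.

```python
import re

# \d adds the decimal digits to the class; A-F are listed explicitly.
# Lowercase a-f are not included unless matching is caseless.
HEX_DIGIT_UPPER = re.compile(r"[\dABCDEF]", re.ASCII)

# Negating \W leaves the "word" characters; adding "_" to the negated
# set removes the underscore as well, leaving letters and digits.
LETTER_OR_DIGIT = re.compile(r"[^\W_]", re.ASCII)


def ascii_members(pattern):
    """Return the ASCII characters matched by a one-character pattern."""
    return "".join(chr(c) for c in range(128) if pattern.fullmatch(chr(c)))
```

Calling `ascii_members(LETTER_OR_DIGIT)` returns the ten digits followed by the upper case and lower case letters, with no underscore.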
The only metacharacters that are recognized in character classes are backslash, hyphen (only where it
can be interpreted as specifying a range), circumflex (only at the start), opening square bracket (only
when it can be interpreted as introducing a POSIX class name - see the next section), and the terminating
closing square bracket. However, escaping other non-alphanumeric characters does no harm.
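A brief sketch of these rules, again using Python's `re` module on the assumption that it agrees with PCRE for these simple classes. All names are hypothetical.

```python
import re

# Inside a class, characters such as . * + ? ( ) $ | are ordinary literals.
UNESCAPED = re.compile(r"[.*+?()$|]")

# Escaping those non-alphanumeric characters does no harm: same class.
ESCAPED = re.compile(r"[\.\*\+\?\(\)\$\|]")

# A circumflex negates the class only when it comes first.
NEGATED = re.compile(r"[^ab]")
CARET_LITERAL = re.compile(r"[ab^]")


def ascii_members(pattern):
    """Return the ASCII characters matched by a one-character pattern."""
    return "".join(chr(c) for c in range(128) if pattern.fullmatch(chr(c)))
```

`UNESCAPED` and `ESCAPED` accept the same eight characters, and `CARET_LITERAL` accepts "^", "a", and "b", while `NEGATED` accepts everything except "a" and "b".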

POSIX Character Classes

Perl supports the POSIX notation for character classes. This uses names enclosed by [: and :] within the
enclosing square brackets. PCRE also supports this notation. For example,
[01[:alpha:]%]
matches "0", "1", any alphabetic character, or "%". The supported class names are
alnum     letters and digits
alpha     letters
ascii     character codes 0 - 127
blank     space or tab only
cntrl     control characters
digit     decimal digits (same as \d)
graph     printing characters, excluding space
lower     lower case letters
print     printing characters, including space
punct     printing characters, excluding letters and digits
space     white space (not quite the same as \s)
upper     upper case letters
word      "word" characters (same as \w)
xdigit    hexadecimal digits
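Python's `re` module does not accept the [:name:] notation, so the sketch below emulates it by expanding each class name into an explicit ASCII set before compiling. The expansions and the `expand_posix_classes` helper are assumptions made for illustration; they are not how PCRE implements these classes, and locale-dependent behavior is not modeled.

```python
import re

# ASCII-only expansions of each class name, written as fragments that are
# valid inside a Python character class (assumed for illustration).
POSIX_CLASSES = {
    "alnum": "A-Za-z0-9",
    "alpha": "A-Za-z",
    "ascii": r"\x00-\x7f",
    "blank": r" \t",
    "cntrl": r"\x00-\x1f\x7f",
    "digit": "0-9",
    "graph": r"\x21-\x7e",
    "lower": "a-z",
    "print": r"\x20-\x7e",
    "punct": r"\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e",
    "space": r" \t\n\x0b\x0c\r",
    "upper": "A-Z",
    "word": "A-Za-z0-9_",
    "xdigit": "0-9A-Fa-f",
}


def expand_posix_classes(pattern):
    """Replace each [:name:] in pattern with its ASCII fragment.

    Hypothetical helper: it substitutes textually and does not check that
    [:name:] really sits inside an enclosing character class.
    """
    def replace(match):
        name = match.group(1)
        if name not in POSIX_CLASSES:
            raise ValueError(f"unknown POSIX class name: {name}")
        return POSIX_CLASSES[name]

    return re.sub(r"\[:([a-z]+):\]", replace, pattern)


# The example from the text, expanded to [01A-Za-z%] and compiled.
EXAMPLE = re.compile(expand_posix_classes("[01[:alpha:]%]"))
```

After expansion, `EXAMPLE` matches "0", "1", any ASCII letter, or "%", which mirrors the behavior described for [01[:alpha:]%].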
User Guide for Cisco Security MARS Local Controller, 78-17020-01


This manual is also suitable for:

Mars 20, Mars 50, Mars 100, Mars 200
