Cisco CS-MARS-20-K9 - Security MARS 20 User Manual page 527

Appendix B
Regular Expression Reference
In UTF-8 mode, quantifiers apply to UTF-8 characters rather than to individual bytes. Thus, for example,
\x{100}{2} matches two UTF-8 characters, each of which is represented by a two-byte sequence.
Similarly, when Unicode property support is available, \X{3} matches three Unicode extended
sequences, each of which may be several bytes long (and they may be of different lengths).
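The difference between quantifying a character and quantifying a byte can be sketched in Python. This is only an illustration under stated assumptions: Python's re module is not PCRE and has no UTF-8 mode switch, so a str pattern stands in for character-aware matching and a bytes pattern stands in for byte-oriented matching. The variable names are hypothetical.

```python
import re

# U+0100 twice; each character encodes to the two bytes C4 80 in UTF-8.
two_chars = "\u0100\u0100"

# Character-aware matching: {2} repeats the whole character U+0100.
char_match = re.fullmatch("\u0100{2}", two_chars)

# Byte-oriented matching: the same pattern text, but {2} now repeats
# only the last byte (0x80) of the encoded character.
byte_pattern = "\u0100".encode("utf-8") + b"{2}"  # b'\xc4\x80{2}'
byte_match_two_chars = re.fullmatch(byte_pattern, two_chars.encode("utf-8"))
byte_match_repeated_byte = re.fullmatch(byte_pattern, b"\xc4\x80\x80")

print(char_match is not None)                # True
print(byte_match_two_chars is not None)      # False
print(byte_match_repeated_byte is not None)  # True
```

In the byte-oriented case the pattern no longer matches the two-character string at all, which is the situation UTF-8 mode is meant to avoid.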
The quantifier {0} is permitted, causing the expression to behave as if the previous item and the
quantifier were not present.
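As a quick check of the {0} rule, the following sketch compares a pattern containing b{0} with the same pattern with that item removed. It uses Python's re module as a stand-in for PCRE, which is an assumption; the sample strings and variable names are hypothetical.

```python
import re

with_zero = re.compile(r"ab{0}c")  # b{0}: the item b repeated zero times
without_item = re.compile(r"ac")   # same pattern with b{0} removed

samples = ["ac", "abc", "aac", ""]

# For every sample, both patterns should agree on whether it matches.
results = [
    (s, with_zero.fullmatch(s) is not None, without_item.fullmatch(s) is not None)
    for s in samples
]

for s, zero_hit, plain_hit in results:
    print(repr(s), zero_hit, plain_hit)
```

Only "ac" matches either pattern; "abc" is rejected by both because b{0} does not allow a b to appear.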
For convenience (and historical compatibility) the three most common quantifiers have single-character
abbreviations:

    *    is equivalent to {0,}
    +    is equivalent to {1,}
    ?    is equivalent to {0,1}

It is possible to construct infinite loops by following a subpattern that can match no characters with a
quantifier that has no upper limit, for example:

    (a?)*

Earlier versions of Perl and PCRE used to give an error at compile time for such patterns. However,
because there are cases where this can be useful, such patterns are now accepted, but if any repetition of
the subpattern does in fact match no characters, the loop is forcibly broken.
By default, the quantifiers are "greedy", that is, they match as much as possible (up to the maximum
number of permitted times), without causing the rest of the pattern to fail. The classic example of where
this gives problems is in trying to match comments in C programs. These appear between /* and */ and
within the comment, individual * and / characters may appear. An attempt to match C comments by
applying the pattern

    /\*.*\*/

to the string

    /* first comment */ not comment /* second comment */

fails, because it matches the entire string owing to the greediness of the .* item.
However, if a quantifier is followed by a question mark, it ceases to be greedy, and instead matches the
minimum number of times possible, so the pattern

    /\*.*?\*/

does the right thing with the C comments. The meaning of the various quantifiers is not otherwise
changed, just the preferred number of matches. Do not confuse this use of question mark with its use as
a quantifier in its own right. Because it has two uses, it can sometimes appear doubled, as in

    \d??\d

which matches one digit by preference, but can match two if that is the only way the rest of the pattern
matches.
If the PCRE_UNGREEDY option is set (an option which is not available in Perl), the quantifiers are not
greedy by default, but individual ones can be made greedy by following them with a question mark. In
other words, it inverts the default behaviour.
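The greedy and non-greedy behaviour described for the C-comment example can be tried out with a short script. This is a minimal sketch that assumes Python's re module, which shares this Perl-style quantifier syntax but is not PCRE (it has no counterpart to PCRE_UNGREEDY, so that option is not shown). The variable names are hypothetical.

```python
import re

text = "/* first comment */ not comment /* second comment */"

# Greedy .* runs to the last */ in the string, so one match covers everything.
greedy = re.findall(r"/\*.*\*/", text)

# Non-greedy .*? stops at the first */ it can, giving one match per comment.
lazy = re.findall(r"/\*.*?\*/", text)

# \d??\d prefers one digit, but takes two when the whole string must match.
one_digit = re.match(r"\d??\d", "12").group()
two_digits = re.fullmatch(r"\d??\d", "12").group()

# A subpattern that can match nothing under an unbounded quantifier
# still terminates instead of looping forever.
empty_loop = re.fullmatch(r"(a?)*", "aaa")

print(greedy)
print(lazy)
print(one_digit, two_digits)
print(empty_loop is not None)
```

The greedy list holds a single item equal to the whole string, while the non-greedy list holds the two comments separately.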
User Guide for Cisco Security MARS Local Controller, 78-17020-01
Repetition, page B-13
