Oracle 5.0 Reference Manual page 802

10.1.7.5. Collation of Expressions
In the great majority of statements, it is obvious what collation MySQL uses to resolve a comparison
operation. For example, in the following cases, it should be clear that the collation is the collation of
column x:
SELECT x FROM T ORDER BY x;
SELECT x FROM T WHERE x = x;
SELECT DISTINCT x FROM T;
However, with multiple operands, there can be ambiguity. For example:
SELECT x FROM T WHERE x = 'Y';
Should the comparison use the collation of the column x, or of the string literal 'Y'? Both x and 'Y'
have collations, so which collation takes precedence?
Standard SQL resolves such questions using what used to be called "coercibility" rules. MySQL
assigns coercibility values as follows:
• An explicit COLLATE clause has a coercibility of 0. (Not coercible at all.)
• The concatenation of two strings with different collations has a coercibility of 1.
• The collation of a column or a stored routine parameter or local variable has a coercibility of 2.
• A "system constant" (the string returned by functions such as USER() or VERSION()) has a
coercibility of 3.
• The collation of a literal has a coercibility of 4.
• NULL or an expression that is derived from NULL has a coercibility of 5.
The preceding coercibility values are current as of MySQL 5.0.3. In MySQL 5.0 prior to 5.0.3, there
is no system constant or NULL coercibility. Functions such as USER() have a coercibility of 2
rather than 3, and literals have a coercibility of 3 rather than 4.
MySQL uses coercibility values with the following rules to resolve ambiguities:
• Use the collation with the lowest coercibility value.
• If both sides have the same coercibility, then:
  • If both sides are Unicode, or both sides are not Unicode, it is an error.
  • If one of the sides has a Unicode character set, and another side has a non-Unicode character set,
  the side with Unicode character set wins, and automatic character set conversion is applied to the
  non-Unicode side. For example, the following statement does not return an error:
  SELECT CONCAT(utf8_column, latin1_column) FROM t1;
  It returns a result that has a character set of utf8 and the same collation as utf8_column.
  Values of latin1_column are automatically converted to utf8 before concatenating.
  • For an operation with operands from the same character set but that mix a _bin collation and
  a _ci or _cs collation, the _bin collation is used. This is similar to how operations that mix
  nonbinary and binary strings evaluate the operands as binary strings, except that it is for collations
  rather than data types.
Although automatic conversion is not in the SQL standard, the SQL standard document does say that
every character set is (in terms of supported characters) a "subset" of Unicode. Because it is a
well-known principle that "what applies to a superset can apply to a subset," we believe that a
collation for Unicode can apply for comparisons with non-Unicode strings.
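The coercibility of a string expression can be inspected with the COERCIBILITY() function. The following statements are a minimal sketch, assuming a server at MySQL 5.0.3 or later and a connection where latin1_swedish_ci is a valid collation for the literal; the literal values are arbitrary and chosen only for illustration.

```sql
-- Explicit COLLATE clause: not coercible at all.
SELECT COERCIBILITY('A' COLLATE latin1_swedish_ci);   -- 0

-- System constant (string returned by a function such as VERSION()).
SELECT COERCIBILITY(VERSION());                        -- 3

-- Plain string literal.
SELECT COERCIBILITY('A');                              -- 4

-- NULL: compare the reported value with the list of coercibility values.
SELECT COERCIBILITY(NULL);

-- Two literals with equal coercibility, one Unicode and one not.
-- CHARSET() shows which character set the result of the operation has.
SELECT CHARSET(CONCAT(_utf8'a', _latin1'b'));
```

On servers older than 5.0.3 the reported values for system constants and literals differ, as described in the version note on coercibility values. To force a particular collation in an ambiguous comparison, adding an explicit COLLATE clause to one operand gives that operand the lowest coercibility value, so its collation is the one used.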

