Oracle 5.0 Reference Manual page 2920

MySQL 5.0 FAQ: MySQL Chinese, Japanese, and Korean Character Sets
CREATE PROCEDURE p_convert(ucs2_char CHAR(1) CHARACTER SET ucs2)
BEGIN
  CREATE TABLE tj
    (ucs2 CHAR(1) CHARACTER SET ucs2,
     utf8 CHAR(1) CHARACTER SET utf8,
     big5 CHAR(1) CHARACTER SET big5,
     cp932 CHAR(1) CHARACTER SET cp932,
     eucjpms CHAR(1) CHARACTER SET eucjpms,
     euckr CHAR(1) CHARACTER SET euckr,
     gb2312 CHAR(1) CHARACTER SET gb2312,
     gbk CHAR(1) CHARACTER SET gbk,
     sjis CHAR(1) CHARACTER SET sjis,
     ujis CHAR(1) CHARACTER SET ujis);
  INSERT INTO tj (ucs2) VALUES (ucs2_char);
  UPDATE tj SET utf8=ucs2,
                big5=ucs2,
                cp932=ucs2,
                eucjpms=ucs2,
                euckr=ucs2,
                gb2312=ucs2,
                gbk=ucs2,
                sjis=ucs2,
                ujis=ucs2;
  /* If there is a conversion problem, UPDATE will produce a warning. */
  SELECT HEX(ucs2) AS ucs2,
         HEX(utf8) AS utf8,
         HEX(big5) AS big5,
         HEX(cp932) AS cp932,
         HEX(eucjpms) AS eucjpms,
         HEX(euckr) AS euckr,
         HEX(gb2312) AS gb2312,
         HEX(gbk) AS gbk,
         HEX(sjis) AS sjis,
         HEX(ujis) AS ujis
  FROM tj;
  DROP TABLE tj;
END//
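The UPDATE statement in p_convert() depends on MySQL converting a value whenever it is
assigned to a column that uses a different character set. The same conversion can also be
requested explicitly with CONVERT(... USING ...), which may be handy for a quick check of a
single character without creating a table. The statement below is only a sketch of that idea,
not part of the original procedure: the column aliases are made up for illustration, and it
assumes that the utf8 and sjis character sets are available on the server.

```sql
-- Sketch only: convert one ucs2 character to two other character sets
-- and show the resulting bytes in hexadecimal.
-- 0x30DA is the ucs2 code point of KATAKANA LETTER PE; the _ucs2
-- introducer tells MySQL to treat the hexadecimal literal as a ucs2 string.
-- utf8_hex and sjis_hex are illustrative aliases, not names from the manual.
SELECT HEX(CONVERT(_ucs2 0x30DA USING utf8)) AS utf8_hex,
       HEX(CONVERT(_ucs2 0x30DA USING sjis)) AS sjis_hex;
-- A value of 3F (the question mark) in either column would mean that the
-- character has no equivalent in that target character set.
```

If the client delimiter is still //, as in the procedure listing, terminate the statement
with // instead of the semicolon.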
The input can be any single ucs2 character, or it can be the code point value (hexadecimal
representation) of that character. For example, from Unicode's list of ucs2 encodings and names
(http://www.unicode.org/Public/UNIDATA/UnicodeData.txt), we know that the Katakana character Pe
appears in all CJK character sets, and that its code point value is 0x30da. If we use this value as the
argument to p_convert(), the result is as shown here:
mysql> CALL p_convert(0x30da)//
+------+--------+------+-------+---------+-------+--------+------+------+------+
| ucs2 | utf8   | big5 | cp932 | eucjpms | euckr | gb2312 | gbk  | sjis | ujis |
+------+--------+------+-------+---------+-------+--------+------+------+------+
| 30DA | E3839A | C772 | 8379  | A5DA    | ABDA  | A5DA   | A5DA | 8379 | A5DA |
+------+--------+------+-------+---------+-------+--------+------+------+------+
1 row in set (0.04 sec)
Since none of the column values is 3F (that is, the question mark character, ?), we know that every
conversion worked.
B.11.14: Why do CJK strings sort incorrectly in Unicode? (I)
Sometimes people observe that the result of a utf8_unicode_ci or ucs2_unicode_ci search, or of an
ORDER BY sort is not what they think a native would expect. Although we never rule out the
possibility that there is a bug, we have found in the past that many people do not read correctly the
standard table of weights for the Unicode Collation Algorithm. MySQL uses the table found at
http://
