Adobe 38043740 - ColdFusion Standard - Mac Development Manual page 372

DEVELOPING COLDFUSION 9 APPLICATIONS
Developing CFML Applications
You use Japanese encodings, such as Shift-JIS, EUC-JP, and ISO-2022-JP, to represent Japanese text. These encodings
can vary slightly, but they include a common set of approximately 10,000 characters used in Japanese.
The following terms apply to character encodings:
SBCS  Single-byte character set; a character set encoded in one byte per character, such as ASCII or ISO 8859-1.
DBCS  Double-byte character set; a method of encoding a character set in no more than 2 bytes, such as Shift-JIS. Many
      character encoding schemes that are referred to as double-byte, including Shift-JIS, allow mixing of single-byte and
      double-byte encoded characters. Others, such as UCS-2, use 2 bytes for all characters.
MBCS  Multiple-byte character set; a character set encoded with a variable number of bytes per character, such as UTF-8.
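The difference between a single-byte encoding, a mixed double-byte encoding, and a fixed two-byte encoding can be checked by counting the bytes each character occupies. The following is a minimal sketch in Python, used here only because its standard library includes codecs for these encodings; the helper name bytes_per_char is hypothetical, and utf-16-be stands in for UCS-2 on the assumption that only Basic Multilingual Plane characters are used.

```python
def bytes_per_char(text, encoding):
    """Return the number of bytes each character of text occupies in encoding."""
    return [len(ch.encode(encoding)) for ch in text]


latin_sample = "A\u00e9"   # "A" followed by e-acute
mixed_sample = "A\u3042"   # "A" followed by hiragana A

# SBCS: every character is one byte.
print(bytes_per_char(latin_sample, "latin-1"))    # [1, 1]

# DBCS that mixes widths: ASCII stays one byte, kana takes two.
print(bytes_per_char(mixed_sample, "shift_jis"))  # [1, 2]

# Fixed two-byte encoding: every character is two bytes.
# Assumption: utf-16-be matches UCS-2 for these characters.
print(bytes_per_char(mixed_sample, "utf-16-be"))  # [2, 2]
```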
The following table lists some common character encodings; however, there are many additional character encodings
that browsers and web servers support:
Encoding              Type
ASCII                 SBCS
Latin-1 (ISO 8859-1)  SBCS
Shift_JIS             DBCS
EUC-KR                DBCS
UCS-2                 DBCS
UTF-8                 MBCS
The World Wide Web Consortium maintains a list of all character encodings supported by the Internet. You can find
this information at www.w3.org/International/O-charset.html.
Computers must often convert between character encodings. In particular, the character encodings most commonly
used on the Internet are not used by Java or Windows. Character sets used on the Internet are typically single-byte or
multiple-byte (including DBCS character sets that allow single-byte characters). These character sets are most efficient
for transmitting data, because each character takes up the minimum necessary number of bytes. Currently, Latin
characters are most frequently used on the web, and most character encodings used on the web represent those
characters in a single byte.
Computers, however, process data most efficiently if each character occupies the same number of bytes. Therefore,
Windows and Java both use double-byte encoding for internal processing.
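The trade-off described here, compact transmission versus uniform width for processing, can be illustrated with a short sketch. It is written in Python for illustration only; the helper char_at is hypothetical, and utf-16-be is used as a stand-in for a fixed two-byte encoding on the assumption that the text contains only Basic Multilingual Plane characters.

```python
text = "Hello"

# Transmission: Latin text in UTF-8 needs one byte per character.
utf8_bytes = text.encode("utf-8")
# Internal processing: a fixed-width form spends two bytes on every character.
utf16_bytes = text.encode("utf-16-be")

print(len(utf8_bytes), len(utf16_bytes))  # 5 10


def char_at(buf, index, width=2):
    """Return character number index from a fixed-width two-byte buffer.

    Because every character has the same width, its byte offset is
    simply index * width; no scan from the start of the buffer is needed.
    """
    start = index * width
    return buf[start:start + width].decode("utf-16-be")


print(char_at(utf16_bytes, 4))  # o
```

With a variable-width encoding, locating the fifth character would require walking the preceding bytes, because earlier characters may occupy different numbers of bytes.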
The Java Unicode character encoding
ColdFusion uses the Java Unicode Standard for representing character data internally. This standard corresponds to
UCS-2 encoding of the Unicode character set. The Unicode character set can represent many languages, including all
major European and Asian character sets. Therefore, ColdFusion can receive, store, process, and present text from all
languages supported by Unicode.
The Java Virtual Machine (JVM) that is used to process ColdFusion pages converts from the character encoding
used on a ColdFusion page or other source of information to UCS-2. The page or data encodings that ColdFusion
supports depend on the specific JVM, but include most encodings used on the web. Similarly, the JVM converts
between its internal UCS-2 representation and the character encoding used to send the response to the client.
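The two conversions described above, from the source encoding to an internal Unicode form and from that form to the response encoding, can be sketched as follows. This is an illustration in Python, not ColdFusion or JVM code; the function name transcode and the choice of Shift_JIS input and UTF-8 output are assumptions made for the example.

```python
def transcode(data, source_encoding, target_encoding):
    """Convert bytes from one character encoding to another.

    Step 1 decodes the incoming bytes into an internal Unicode string.
    Step 2 encodes that string in the encoding used for the response.
    """
    internal_text = data.decode(source_encoding)
    return internal_text.encode(target_encoding)


# Hiragana A as it would arrive in a Shift_JIS encoded page.
page_bytes = b"\x82\xa0"

response_bytes = transcode(page_bytes, "shift_jis", "utf-8")
print(response_bytes)  # b'\xe3\x81\x82'
```

The same character occupies two bytes in the Shift_JIS input and three bytes in the UTF-8 output, while the intermediate Unicode string is independent of either byte layout.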
Encoding              Description
ASCII                 7-bit encoding used by English and Indonesian Bahasa languages
Latin-1 (ISO 8859-1)  8-bit encoding used for many Western European languages
Shift_JIS             16-bit Japanese encoding
                      Note: Use an underscore character (_), not a hyphen (-), in the name in CFML attributes.
EUC-KR                16-bit Korean encoding
UCS-2                 Two-byte Unicode encoding
UTF-8                 Multibyte Unicode encoding. ASCII is 7-bit; non-ASCII characters used in European and many
                      Middle Eastern languages are two-byte; and most Asian characters are three-byte
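The byte widths given in the UTF-8 description can be confirmed with a few sample characters. The sketch below is in Python for illustration only; the helper name utf8_width and the particular sample characters are assumptions chosen for the example.

```python
def utf8_width(ch):
    """Return the number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "Latin A": "A",                 # ASCII
    "e-acute": "\u00e9",            # Western European
    "Hebrew alef": "\u05d0",        # Middle Eastern
    "Arabic beh": "\u0628",         # Middle Eastern
    "CJK ideograph": "\u65e5",      # Japanese/Chinese
    "Hangul syllable": "\ud55c",    # Korean
}

for name, ch in samples.items():
    print(name, utf8_width(ch))
# Latin A 1
# e-acute 2
# Hebrew alef 2
# Arabic beh 2
# CJK ideograph 3
# Hangul syllable 3
```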
Last updated 1/20/2012