About Character Encodings - MACROMEDIA COLDFUSION MX 61-DEVELOPING COLDFUSION MX Develop Manual

A locale identifies the exact language and cultural settings for a user. The locale controls how
dates, times, currencies, and numeric data are formatted for display. For example,
the locale English (US) determines that a currency value displays as:
$100,000.00
while a locale of Portuguese (Brazilian) displays the currency as:
R$ 100.000
To display date, time, currency, and numeric data correctly to your customers, you must
know the customer's locale. For more information on locales, see "Locales" on page 376.

About character encodings

A character encoding maps each character in a character set to a numeric value that can be
represented by a computer. These numbers can be represented by a single byte or multiple bytes.
For example, the ASCII encoding uses seven bits to represent the Latin alphabet, punctuation,
and control characters.
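As an illustration of the mapping from characters to numbers, the following minimal sketch uses Python rather than CFML. It assumes only Python's built-in "ascii" codec; the helper names ascii_codes and fits_in_seven_bits are hypothetical and exist only for this example.

```python
def ascii_codes(text):
    """Return the numeric value that the ASCII encoding assigns to each character."""
    encoded = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return list(encoded)


def fits_in_seven_bits(codes):
    """Check that every value is below 2**7, the limit of a 7-bit encoding."""
    return all(code < 128 for code in codes)


codes = ascii_codes("Hi!\n")
print(codes)                      # [72, 105, 33, 10]
print(fits_in_seven_bits(codes))  # True
```

The newline at the end of the sample string is one of the control characters that ASCII covers; a character outside the ASCII set, such as an accented letter, cannot be encoded and raises an error.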
You use Japanese encodings, such as Shift-JIS, EUC-JP, and ISO-2022-JP, to represent Japanese
text. These encodings can vary slightly, but they include a common set of approximately 10,000
characters used in Japanese.
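To see how the same Japanese text is stored differently under each of these encodings, consider the following sketch in Python (not CFML). The codec names "shift_jis", "euc_jp", and "iso2022_jp" are Python's spellings of the encodings named above, and encode_japanese is a hypothetical helper.

```python
JAPANESE_CODECS = ("shift_jis", "euc_jp", "iso2022_jp")


def encode_japanese(text):
    """Encode the same text with each Japanese codec and return the raw bytes."""
    return {codec: text.encode(codec) for codec in JAPANESE_CODECS}


sample = "\u65e5\u672c\u8a9e"  # the three kanji of the word for "Japanese language"
for codec, raw in encode_japanese(sample).items():
    # Each codec yields a different byte sequence for the same characters,
    # and each sequence decodes back to the original text.
    print(codec, raw.hex(), raw.decode(codec) == sample)
```

The byte sequences differ from one encoding to the next, so text must be decoded with the same encoding that was used to write it.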
The following terms apply to character encodings:
SBCS  Single-byte character set; a character set encoded in one byte per character, such as
      ASCII or ISO 8859-1.
DBCS  Double-byte character set; a method of encoding a character set in no more than two
      bytes, such as Shift-JIS. Many character encoding schemes that are referred to as double-byte,
      including Shift-JIS, allow mixing of single-byte and double-byte encoded characters. Others,
      such as UCS-2, use two bytes for all characters.
MBCS  Multiple-byte character set; a character set encoded with a variable number of bytes
      per character, such as UTF-8.
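The difference between these three terms can be checked by counting the bytes each character occupies. The following is a minimal sketch in Python (not CFML); byte_widths is a hypothetical helper. Python has no separate UCS-2 codec, so the sketch assumes "utf-16-be" as a stand-in, which produces the same two-byte values for the sample characters used here.

```python
def byte_widths(text, encoding):
    """Return the sorted, distinct byte counts used by the characters of text."""
    return sorted({len(ch.encode(encoding)) for ch in text})


latin_text = "caf\u00e9"          # Latin letters, one of them accented
mixed_text = "A\u65e5\u672c"      # one ASCII letter followed by two kanji

# SBCS: every character occupies exactly one byte.
print(byte_widths(latin_text, "latin-1"))    # [1]
# DBCS that mixes widths: ASCII letters take one byte, kanji take two.
print(byte_widths(mixed_text, "shift_jis"))  # [1, 2]
# Fixed-width two-byte encoding (stand-in for UCS-2, see the assumption above).
print(byte_widths(mixed_text, "utf-16-be"))  # [2]
# MBCS: the number of bytes varies from character to character.
print(byte_widths(mixed_text, "utf-8"))      # [1, 3]
```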
The following table lists some common character encodings; however, there are many additional
character encodings that browsers and web servers support:

Encoding              Type  Description
ASCII                 SBCS  7-bit encoding used by English and Indonesian Bahasa languages
Latin-1 (ISO 8859-1)  SBCS  8-bit encoding used for many Western European languages
Shift_JIS             DBCS  16-bit Japanese encoding (Note that you must use an underscore
                            character (_), not a hyphen (-), in the name in CFML attributes.)
EUC-KR                DBCS  16-bit Korean encoding
UCS-2                 DBCS  Two-byte Unicode encoding
UTF-8                 MBCS  Multibyte Unicode encoding. ASCII is 7-bit; non-ASCII characters used in
                            European and many Middle Eastern languages are two-byte; and most
                            Asian characters are three-byte

The World Wide Web Consortium maintains a list of all character encodings supported by the
Internet. You can find this information at www.w3.org/International/O-charset.html.
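The byte counts given for UTF-8 in the table can be spot-checked with a short sketch. It is written in Python (not CFML), the sample characters are an arbitrary selection, and utf8_length is a hypothetical helper.

```python
SAMPLES = [
    ("Latin letter A", "\u0041"),
    ("Latin letter e with acute", "\u00e9"),
    ("Hebrew letter shin", "\u05e9"),
    ("Arabic letter ain", "\u0639"),
    ("Japanese kanji for day", "\u65e5"),
    ("Korean hangul syllable han", "\ud55c"),
]


def utf8_length(ch):
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


for label, ch in SAMPLES:
    # ASCII takes one byte, the European and Middle Eastern samples take two,
    # and the Japanese and Korean samples take three.
    print(f"{label:28} {utf8_length(ch)} byte(s)")
```

The sample covers only a handful of characters, so it illustrates the pattern in the table rather than proving it for every script.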
