that are internationalized but in which the user has not enabled NLS, use this
character set. Note, however, that 7-bit ASCII is not sufficient even to span the
Latin-based alphabet used in many European languages.
For many Asian languages, character sets can contain several thousand
characters. This is more than can be encoded in a single 8-bit byte, the
conventional unit used to represent character data. For this and other
reasons, NLS character handling has the following characteristics:
• The 8th bit of a character byte is never stripped or modified.
• The 8th bit extends the code range from 128 to 256 values, which supports
languages that have additional characters, accented vowels, consonants with
special forms, and special symbols.
• Multi-byte coded character sets may be used for character sets that contain
more than 256 members.
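
The effect of stripping the 8th bit can be illustrated with a short sketch.
It is not taken from any HP library; ISO 8859-1 is used here only as a
convenient example of an 8-bit coded character set, and the helper names
strip_eighth_bit and is_seven_bit are hypothetical.

```python
# Illustrative only: ISO 8859-1 stands in for any 8-bit coded character set.
text = "café"
data = text.encode("latin-1")  # "é" is stored as the single byte 0xE9


def strip_eighth_bit(raw: bytes) -> bytes:
    """Clear the high bit of every byte, as a 7-bit-only program might."""
    return bytes(b & 0x7F for b in raw)


def is_seven_bit(raw: bytes) -> bool:
    """Return True if every byte fits in 7 bits (plain ASCII range)."""
    return all(b < 0x80 for b in raw)


stripped = strip_eighth_bit(data)

print(is_seven_bit(data))          # False: the data needs all 8 bits
print(data.decode("latin-1"))      # café
print(stripped.decode("latin-1"))  # cafi (0xE9 & 0x7F == 0x69, the letter "i")
```

A program that masks bytes in this way silently turns accented letters into
unrelated ASCII characters, which is why 8-bit data must pass through
unmodified.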
There are many implementations of non-ASCII character sets currently in use.
NLS permits users to define their own character sets and character properties.
However, Hewlett-Packard already supports coded character sets that
permit the processing of many Eastern and Western European, Middle Eastern,
and Asian languages.
Every HP-supported 8-bit coded character set is a superset of ASCII. The
HP-supported 8-bit coded character sets for Western European languages are
ROMAN8 and the standard ISO 8859-1. Hewlett-Packard also supports ISO
8859-2 and ISO 8859-5 for Eastern European languages, including Cyrillic.
Other 8-bit coded character sets are defined for other locales. For a listing,
please refer to Appendix E, "Languages and Codesets".
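
The superset property can be checked with a small sketch. It assumes the
codec tables shipped with Python's standard library (hp_roman8, latin_1,
iso8859_2, iso8859_5) as stand-ins for the HP-supported codesets; the
function name is_ascii_superset is hypothetical.

```python
# Codec names are those of Python's standard library, used as stand-ins
# for ROMAN8, ISO 8859-1, ISO 8859-2, and ISO 8859-5.
CODESETS = ["hp_roman8", "latin_1", "iso8859_2", "iso8859_5"]


def is_ascii_superset(codec: str) -> bool:
    """Return True if bytes 0..127 decode exactly as they do in ASCII."""
    ascii_bytes = bytes(range(128))
    return ascii_bytes.decode(codec) == ascii_bytes.decode("ascii")


for name in CODESETS:
    print(name, is_ascii_superset(name))

# The upper half (128..255) is where the codesets differ: the same
# character can have a different code value in each of them.
roman8_e_acute = "é".encode("hp_roman8")
latin1_e_acute = "é".encode("latin_1")
print(roman8_e_acute.hex(), latin1_e_acute.hex())
```

Because only the lower 128 codes are shared, 8-bit data must always be
interpreted with the codeset in which it was written.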
For alphabets of more than 256 characters, such as Kanji (Japanese ideographic
characters), multi-byte character codes are required. Hewlett-Packard has
defined a multi-byte character encoding scheme, HP-15, which uses two bytes
(16 bits) to represent a character. Four character sets are defined under
this scheme, one each for Traditional Chinese, Simplified Chinese, Korean,
and Japanese.
In addition, Hewlett-Packard supports the EUC (Extended UNIX Code) character
encoding scheme for characters of up to 2 bytes. This scheme is used for data
processing and storage.
For input and output, Hewlett-Packard uses a multi-byte character encoding
scheme called HP-16.
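
The way a program steps through mixed 1-byte and 2-byte data can be sketched
as follows. This is a minimal illustration under stated assumptions, not HP's
implementation: it uses Python's euc_jp codec as an example of EUC, assumes
that only ASCII and 2-byte characters occur (matching the 2-byte limit
described above), and the function name split_euc is hypothetical. HP-15 and
HP-16 are not modeled.

```python
def split_euc(raw: bytes) -> list:
    """Split an EUC byte string into one bytes object per character.

    Assumption: every character is either 1 byte (high bit clear, ASCII)
    or 2 bytes (lead byte has the high bit set). Longer EUC sequences
    are deliberately not handled in this sketch.
    """
    chars = []
    i = 0
    while i < len(raw):
        width = 1 if raw[i] < 0x80 else 2
        if i + width > len(raw):
            raise ValueError("truncated multi-byte character")
        chars.append(raw[i:i + width])
        i += width
    return chars


sample = "NLS 漢字".encode("euc_jp")
pieces = split_euc(sample)

print(len(sample))  # 8 bytes: four ASCII bytes plus two 2-byte characters
print(len(pieces))  # 6 characters
```

The byte count and the character count differ, so code that assumes one
byte per character (for example when truncating or indexing a string) can
split a 2-byte character in half.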
