Supported Character Sets - Multi-Tech MultiModem EDGE Reference Manual

1.5 Supported Character Sets

The ME supports two character sets: GSM 03.38 (7 bit, also referred to as GSM alphabet or SMS alphabet) and
UCS2 (16 bit, refer to ISO/IEC 10646). See AT+CSCS for information about selecting the character set. Character
tables can be found below.
Explanation of terms
• International Reference Alphabet (IRA)
IRA means that one byte is displayed as two characters in hexadecimal format. For example, the byte 0x36
(decimal 54) is displayed as "36" (two characters). IRA is used here to input 8-bit or 16-bit data via terminal
devices using text mode. This means only the characters 'A'..'F', 'a'..'f' and '0'..'9' are valid.
• Escape sequences
The escape sequence used within a text coded in the GSM default alphabet (0x1B) must be correctly
interpreted by the TE, both for character input and output. To the module, an escape sequence appears like
any other byte received or sent.
• Terminal Adapter (TA)
TA is used synonymously with Mobile Equipment (ME), which stands for the GSM module described here. It uses
the GSM default alphabet as its character set.
• Terminal Equipment (TE)
TE is the device connected to the TA via serial interface. In most cases TE is an ANSI/ASCII terminal that
does not fully support the GSM default alphabet, for example MS HyperTerminal.
• TE Character Set
The character set currently used by Terminal Equipment is selected with AT+CSCS.
• Data Coding Scheme (dcs)
DCS is part of a short message and is saved on the SIM. When writing a short message to the SIM in text
mode, the dcs stored with AT+CSMP is used and determines the coded character set.
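The IRA representation described in the list above can be sketched as follows. This is a minimal illustration, not part of the module's firmware or any Multi-Tech API: the helper names to_ira_hex and from_ira_hex are hypothetical, and the UCS2 line assumes that UCS2 data is entered as big-endian 16-bit values, which matches Python's "utf-16-be" codec for characters in the Basic Multilingual Plane.

```python
import string

# The only characters valid in IRA hex input: '0'-'9', 'a'-'f', 'A'-'F'.
HEX_DIGITS = set(string.hexdigits)

def to_ira_hex(data: bytes) -> str:
    """Render each byte as two hexadecimal characters (hypothetical helper)."""
    return data.hex().upper()

def from_ira_hex(text: str) -> bytes:
    """Parse a hex string back into bytes, rejecting any invalid character."""
    if len(text) % 2 != 0:
        raise ValueError("expected two hex characters per byte")
    bad = [ch for ch in text if ch not in HEX_DIGITS]
    if bad:
        raise ValueError(f"invalid characters: {bad!r}")
    return bytes.fromhex(text)

# The byte 0x36 (decimal 54) is displayed as the two characters "36".
print(to_ira_hex(bytes([0x36])))              # 36

# Assumption: a UCS2 character is entered as four hex characters (big-endian).
print(to_ira_hex("Ö".encode("utf-16-be")))    # 00D6
```

Validating the input before conversion reflects the rule that anything outside the hexadecimal digits is not valid in this representation.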
The behavior when encountering characters that are not valid in the supported alphabets is undefined.
Due to the constraints described below, it is recommended to prefer the UCS2 alphabet in any external application.
If the GSM alphabet is selected, all characters sent over the serial line (between TE and TA) are in the range from 0 to
127 (7-bit range). CAUTION: The ASCII alphabet (TE) is not the GSM alphabet (TA/ME)!
Several problems result from the use of the GSM alphabet with ASCII terminal equipment:
• The "@" character, with GSM alphabet value 0, is not printable by an ASCII terminal program (e.g., Microsoft©
HyperTerminal®).
• The "@" character, with GSM alphabet value 0, will terminate any C string, because 0 is defined as the C
string end tag. Therefore, the GSM Null character may cause problems at application level when using a
C function such as "strlen()". This can be avoided if it is represented by an escape sequence as shown in the
table below.
By the way, this may be the reason why even network providers often replace "@" with "@=*" in their SIM
application.
• Other characters of the GSM alphabet are misinterpreted by an ASCII terminal program. For example, GSM
"ö" (as in "Börse") is assumed to be "|" in ASCII, thus resulting in "B|rse". This is because the two alphabets
assign different characters to the same values, such as hex 7C or 00.
• In addition, decimal 17 and 19, which are used as XON/XOFF control characters when software flow control
is activated, are interpreted as normal characters in the GSM alphabet.
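The first three problems in the list above can be reproduced with a short sketch. It assumes a deliberately partial GSM table (GSM_SAMPLE, a hypothetical name) that contains only the values mentioned in this section and falls back to ASCII for everything else, which is a simplification of the full GSM 03.38 alphabet. The C string behavior is shown with Python's ctypes.c_char_p, whose value stops at the first zero byte just as "strlen()" would.

```python
import ctypes

# Partial GSM 03.38 table: only the values named in this section.
GSM_SAMPLE = {0x00: "@", 0x08: "ò", 0x5C: "Ö", 0x7C: "ö"}

def decode_gsm(data: bytes) -> str:
    """Decode bytes as GSM text; unlisted values fall back to ASCII (simplification)."""
    return "".join(GSM_SAMPLE.get(b, chr(b)) for b in data)

raw = b"B\x7Crse"
print(raw.decode("ascii"))    # B|rse  (what an ASCII terminal shows)
print(decode_gsm(raw))        # Börse  (what the GSM alphabet means)

# GSM "@" has value 0, which C treats as the end of the string.
gsm_text = b"user\x00example"                    # "user@example" in the GSM alphabet
as_c_string = ctypes.c_char_p(gsm_text).value    # stops at the first zero byte
print(len(gsm_text), len(as_c_string))           # 12 4
```

An application that handles GSM-coded data is therefore safer carrying an explicit length alongside the buffer than relying on zero termination.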
When you write characters that are coded differently in ASCII and GSM (e.g., Ä, Ö, Ü), you need to enter escape
sequences. Such a character is translated into the corresponding GSM character value and, when output later,
the GSM character value is presented. Any ASCII terminal will then show wrong responses.
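As an illustration of the escape sequences mentioned above, the sketch below builds the backslash form for a GSM character value and lists the bytes that an ASCII terminal would actually send for it. The function names esc_sequence and esc_hex are hypothetical, and the format (a backslash followed by two uppercase hex digits) is inferred from the entries of Table 1.5 rather than from a formal definition.

```python
def esc_sequence(gsm_value: int) -> str:
    """Build the backslash escape sequence for a GSM character value."""
    if not 0 <= gsm_value <= 0x7F:
        raise ValueError("GSM 7-bit values range from 0 to 127")
    return f"\\{gsm_value:02X}"

def esc_hex(gsm_value: int) -> str:
    """List the ASCII bytes of the escape sequence as space-separated hex."""
    encoded = esc_sequence(gsm_value).encode("ascii")
    return " ".join(f"{b:02X}" for b in encoded)

# The four GSM values used as examples in Table 1.5.
for value in (0x5C, 0x22, 0x08, 0x00):
    print(esc_sequence(value), "->", esc_hex(value))
# \5C -> 5C 35 43
# \22 -> 5C 32 32
# \08 -> 5C 30 38
# \00 -> 5C 30 30
```

Each escape sequence occupies three bytes on the serial line, one for the backslash and one for each hex digit.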
Table 1.5: Examples for character definitions depending on alphabet

GSM 03.38    GSM character    Corresponding      ASCII Esc    Hex Esc
character    hex. value       ASCII character    sequence     sequence
Ö            5C               \                  \5C          5C 35 43
"            22               "                  \22          5C 32 32
ò            08               BSP                \08          5C 30 38
@            00               NULL               \00          5C 30 30

CAUTION: Often, the editors of terminal programs do not recognize escape sequences. In this case, an escape
sequence will be handled as normal characters. The most common workaround to this problem is to write a
script which includes a decimal code instead of an escape sequence. This way you can write, for example, short
messages which may contain differently coded characters.

Multi-Tech Systems, Inc. AT Commands for EDGE Modems (S000371G)
Chapter 1 – Introduction
