When you send an SMS, the text is encoded using one of two character sets: GSM-7 or UCS-2. The encoding is selected automatically based on the characters in your message text, and it directly determines how many segments your message is split into — which affects the sms.segments value you receive in the Message Status API and webhooks.
A single SMS carries at most 140 bytes of user data. This is a fixed constraint of the GSM standards (GSM 03.40 defines the message format, GSM 03.38 defines the character sets) and has not changed since SMS was designed.
The character limit you see — 160 or 70 — is a direct consequence of how many characters fit into those 140 bytes depending on the encoding used:
- GSM-7: each character uses 7 bits → (140 × 8) / 7 = 160 characters
- UCS-2: each character uses 16 bits (2 bytes) → 140 / 2 = 70 characters
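The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the constant and function names are made up for this example and are not part of any API.

```python
SMS_PAYLOAD_BYTES = 140  # user-data size of one SMS, as stated above


def single_sms_capacity(bits_per_char: int) -> int:
    """Number of whole characters that fit into one 140-byte SMS."""
    return (SMS_PAYLOAD_BYTES * 8) // bits_per_char


print(single_sms_capacity(7))   # GSM-7: 160
print(single_sms_capacity(16))  # UCS-2: 70
```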
GSM-7 is the default encoding. It supports the standard Latin alphabet, digits, and a set of common symbols — 128 characters in total.
A single GSM-7 SMS can contain up to 160 characters.
When a message exceeds 160 characters, it is split into multiple segments. Each segment in a multi-part message can carry up to 153 characters — the remaining 7 characters per segment are used by a User Data Header (UDH), a metadata block that tells the recipient's device how to reassemble the parts in the correct order.
| Message length | Segments |
|---|---|
| 1–160 chars | 1 |
| 161–306 chars | 2 |
| 307–459 chars | 3 |
| 460–612 chars | 4 |
| … | … |
| Up to 1,600 chars | Up to 10 |
Formula: segments = ceil(length / 153) for messages longer than 160 characters.
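A minimal sketch of that formula in Python, assuming `length` is already the GSM-7 character count (with extension-table characters counted as 2). The function name `gsm7_segments` is hypothetical.

```python
import math

GSM7_SINGLE_LIMIT = 160  # characters in a single-part message
GSM7_PART_LIMIT = 153    # characters per part once a UDH is present


def gsm7_segments(length: int) -> int:
    """Segments needed for a GSM-7 message of the given length."""
    if length < 0:
        raise ValueError("length must not be negative")
    if length <= GSM7_SINGLE_LIMIT:
        return 1
    return math.ceil(length / GSM7_PART_LIMIT)


for n in (160, 161, 306, 307, 612):
    print(n, gsm7_segments(n))
```

The single-part case is handled separately because a message of 154 to 160 characters still fits in one segment, even though it exceeds the 153-character multi-part size.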
The basic GSM-7 alphabet includes:
- Uppercase and lowercase Latin letters (A–Z, a–z)
- Digits (0–9)
- Common punctuation: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @
- Special characters: £ ¥ è é ù ì ò Ç Ø ø Å å Δ Φ Γ Λ Ω Π Ψ Σ Θ Ξ ß É Æ æ Ä Ö Ñ Ü ä ö ñ ü à
- Space, newline, carriage return
Some characters belong to the GSM-7 extension table and count as 2 characters each, because they require an escape sequence:
{ } [ ] \ | ^ ~ €
If your message contains even one character outside the GSM-7 alphabet, the entire message is automatically re-encoded in UCS-2. This applies to accented characters not in the GSM-7 set, emojis, Arabic, Chinese, Cyrillic, and any other non-Latin script.
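The selection rule can be approximated client-side before sending. The sketch below is an assumption about how such a check might look, not the platform's actual implementation: the names `GSM7_BASIC`, `GSM7_EXTENSION`, `gsm7_length` and `pick_encoding` are illustrative, and the character set is transcribed from the GSM 03.38 default alphabet.

```python
from typing import Optional

# GSM 03.38 default alphabet (basic table, without the escape code).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# The nine printable extension-table characters listed above.
GSM7_EXTENSION = set("^{}\\[~]|€")


def gsm7_length(text: str) -> Optional[int]:
    """Length in GSM-7 characters, or None if the text needs UCS-2."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENSION:
            total += 2  # escape code + the character itself
        else:
            return None  # one foreign character switches the whole message
    return total


def pick_encoding(text: str) -> str:
    return "GSM-7" if gsm7_length(text) is not None else "UCS-2"


print(pick_encoding("Your code is 123456"))        # GSM-7
print(pick_encoding("Votre fenêtre est ouverte"))  # UCS-2
```

One caveat with this kind of check: an accented letter typed as a base letter plus a combining mark will not match the set, so normalizing the text to NFC first (for example with `unicodedata.normalize`) may be worthwhile.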
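UCS-2 is used when the message contains characters outside the GSM-7 alphabet. It encodes each character as 16 bits (2 bytes), which allows it to represent up to 65,536 characters, covering virtually all scripts in the Unicode Basic Multilingual Plane. Characters outside that plane, such as most emoji, are typically sent as a pair of 16-bit units and therefore count as 2 characters.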
The trade-off is a significantly reduced character limit: a single UCS-2 SMS can contain up to 70 characters.
For multi-part messages, each segment carries up to 67 characters (3 characters per segment are reserved for the User Data Header).
| Message length | Segments |
|---|---|
| 1–70 chars | 1 |
| 71–134 chars | 2 |
| 135–201 chars | 3 |
| 202–268 chars | 4 |
| … | … |
| Up to 700 chars | Up to 10 |
Formula: segments = ceil(length / 67) for messages longer than 70 characters.
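A sketch of the same calculation for UCS-2 text. It assumes that length is measured in 16-bit code units, so a character outside the Basic Multilingual Plane (most emoji) counts as 2; whether the platform counts exactly this way is an assumption, and the function names are hypothetical.

```python
import math

UCS2_SINGLE_LIMIT = 70  # characters in a single-part message
UCS2_PART_LIMIT = 67    # characters per part once a UDH is present


def ucs2_units(text: str) -> int:
    """Count 16-bit code units; non-BMP characters take two."""
    return len(text.encode("utf-16-be")) // 2


def ucs2_segments(text: str) -> int:
    """Segments needed when the text is sent as UCS-2."""
    units = ucs2_units(text)
    if units <= UCS2_SINGLE_LIMIT:
        return 1
    return math.ceil(units / UCS2_PART_LIMIT)


print(ucs2_segments("中" * 70))  # 1
print(ucs2_segments("中" * 71))  # 2
```

Real encoders may avoid splitting a two-unit character across parts, so the count can differ by one segment near a boundary.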
| | GSM-7 | UCS-2 |
|---|---|---|
| Bits per character | 7 | 16 |
| Supported characters | 128 (Latin + common symbols) | 65,536 (Unicode BMP) |
| Single SMS limit | 160 characters | 70 characters |
| Multi-part segment limit | 153 characters | 67 characters |
| Max concatenated length | ~1,600 characters | ~700 characters |
| Triggered by | Default | Any non-GSM-7 character in the text |
| Message | Encoding | Length | Segments |
|---|---|---|---|
| Your code is 123456 | GSM-7 | 19 chars | 1 |
| 160 × A | GSM-7 | 160 chars | 1 |
| 161 × A | GSM-7 | 161 chars | 2 |
| Votre fenêtre est ouverte | UCS-2 | 25 chars | 1 (ê is not in GSM-7) |
| Bâtiment B, salle 3 | UCS-2 | 19 chars | 1 (â triggers UCS-2) |
| 70 × 中 | UCS-2 | 70 chars | 1 |
| 71 × 中 | UCS-2 | 71 chars | 2 |
The € symbol is part of the GSM-7 extension table and counts as 2 characters, not 1. A message containing only € signs has an effective limit of 80 symbols per single SMS (160 ÷ 2), not 160.
Because UCS-2 applies to the entire message, a single non-GSM-7 character forces re-encoding of all the text. This can have a significant impact on segment count:
A message of 152 GSM-7 characters fits in 1 GSM-7 segment. Add a single emoji and the whole message is encoded in UCS-2: 153 characters at 67 per segment results in 3 segments, not 1.
The number of segments used for a delivered message is available in two places:
The sms.segments field is returned in the response body of all three message status endpoints:
```json
{
  "message": {
    "id": "011d9d6e-b5b9-4cb9-be13-2bc336a923ce",
    "channel": "SMS",
    "status": "DELIVERED",
    "sms": {
      "segments": 2
    }
  }
}
```

See Message Status API for the full response reference.
The same sms.segments field is included in every webhook status notification:
```json
{
  "type": "STATUS_UPDATE",
  "message": {
    "id": "b31b6607-9c55-48ba-b145-3f40b809d2d2",
    "channel": "SMS",
    "status": "DELIVERED",
    "sms": {
      "segments": 2
    }
  }
}
```

See Understanding webhook for the full payload reference.
sms.segments is only present when channel is SMS. It is not included for RCS messages.
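Because the field is absent for non-SMS channels, code that reads it should tolerate a missing `sms` object. The following is a minimal sketch that assumes the payload shape shown in the examples above; the helper name `segments_from_payload` is hypothetical.

```python
import json
from typing import Optional


def segments_from_payload(raw: str) -> Optional[int]:
    """Return message.sms.segments, or None when it is not present."""
    payload = json.loads(raw)
    message = payload.get("message") or {}
    if message.get("channel") != "SMS":
        return None  # e.g. RCS messages carry no sms.segments
    return (message.get("sms") or {}).get("segments")


example = """
{
  "type": "STATUS_UPDATE",
  "message": {
    "id": "b31b6607-9c55-48ba-b145-3f40b809d2d2",
    "channel": "SMS",
    "status": "DELIVERED",
    "sms": {"segments": 2}
  }
}
"""
print(segments_from_payload(example))  # 2
```

The same helper works for the Message Status API response body, since both payloads nest the value under `message.sms.segments`.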