
When you send an SMS, the text is encoded using one of two character sets: GSM-7 or UCS-2. The encoding is selected automatically based on the characters in your message text, and it directly determines how many segments your message is split into — which affects the sms.segments value you receive in the Message Status API and webhooks.

Diagram: Message text → "Any character outside GSM-7?" → No: Encoding GSM-7; Yes: Encoding UCS-2.


Why SMS has a character limit

A single SMS carries at most 140 bytes of user data. This is a fixed constraint of the GSM standard (the payload size is set in GSM 03.40; the character encodings are defined in GSM 03.38) and has not changed since SMS was designed.

The character limit you see — 160 or 70 — is a direct consequence of how many characters fit into those 140 bytes depending on the encoding used:

  • GSM-7: each character uses 7 bits → (140 × 8) / 7 = 160 characters
  • UCS-2: each character uses 16 bits (2 bytes) → 140 / 2 = 70 characters
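The arithmetic above can be checked with a short sketch. The function name `chars_per_sms` and the `header_bytes` parameter are illustrative, and the 6-byte header is an assumption about the usual concatenation header used for multi-part messages, not something this page specifies.

```python
PAYLOAD_BYTES = 140  # fixed SMS payload size stated above


def chars_per_sms(bits_per_char: int, header_bytes: int = 0) -> int:
    """Whole characters that fit in the payload left after an optional header."""
    usable_bits = (PAYLOAD_BYTES - header_bytes) * 8
    return usable_bits // bits_per_char


print(chars_per_sms(7))    # 160 (GSM-7)
print(chars_per_sms(16))   # 70 (UCS-2)

# Assumption: a 6-byte concatenation header, as used for multi-part messages.
print(chars_per_sms(7, header_bytes=6))   # 153
print(chars_per_sms(16, header_bytes=6))  # 67
```

Under that assumption, the same formula also yields the per-segment limits of 153 and 67 characters that apply to multi-part messages.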

GSM-7

GSM-7 is the default encoding. It supports the standard Latin alphabet, digits, and a set of common symbols — 128 characters in total.

A single GSM-7 SMS can contain up to 160 characters.

When a message exceeds 160 characters, it is split into multiple segments. Each segment in a multi-part message can carry up to 153 characters — the remaining 7 characters per segment are used by a User Data Header (UDH), a metadata block that tells the recipient's device how to reassemble the parts in the correct order.

| Message length | Segments |
| --- | --- |
| 1–160 chars | 1 |
| 161–306 chars | 2 |
| 307–459 chars | 3 |
| 460–612 chars | 4 |
| Up to 1,600 chars | Up to 10 |

Formula: segments = ceil(length / 153) for messages longer than 160 characters.
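As a minimal sketch, the formula and the table boundaries can be expressed as a small function. The name `gsm7_segments` is hypothetical, and `length` is assumed to be the GSM-7 length of the message, with any extension-table characters already counted as 2.

```python
import math

GSM7_SINGLE_LIMIT = 160     # one unsplit message
GSM7_MULTIPART_LIMIT = 153  # per segment once a UDH is needed


def gsm7_segments(length: int) -> int:
    """Segment count for a GSM-7 message of `length` characters (length >= 1)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if length <= GSM7_SINGLE_LIMIT:
        return 1
    return math.ceil(length / GSM7_MULTIPART_LIMIT)


# Print the segment count at each boundary listed in the table above.
for n in (160, 161, 306, 307, 459, 460, 612):
    print(n, gsm7_segments(n))
```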

GSM-7 character set

The basic GSM-7 alphabet includes:

  • Uppercase and lowercase Latin letters (A–Z, a–z)
  • Digits (0–9)
  • Common punctuation: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @
  • Special characters: £ ¥ è é ù ì ò Ç Ø ø Å å Δ Φ Γ Λ Ω Π Ψ Σ Θ Ξ ß É Æ æ Ä Ö Ñ Ü ä ö ñ ü à
  • Space, newline, carriage return

GSM-7 extended characters

Some characters belong to the GSM-7 extension table and count as 2 characters each, because they require an escape sequence:

{ } [ ] \ | ^ ~ €

If your message contains even one character that is in neither the basic GSM-7 alphabet nor the extension table, the entire message is automatically re-encoded in UCS-2. This applies to accented characters not in the GSM-7 set, emojis, Arabic, Chinese, Cyrillic, and any other non-Latin script.
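The selection rule can be approximated client-side. The sketch below is illustrative only: the character tables are transcribed from the GSM 03.38 default alphabet and its extension table, and it is an assumption that the platform applies exactly these tables. The names `detect_encoding` and `gsm7_length` are hypothetical.

```python
# GSM 03.38 default alphabet (the ESC code itself is not a printable character).
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters listed above; each needs an escape prefix.
GSM7_EXTENDED = frozenset("{}[]\\|^~€")


def detect_encoding(text: str) -> str:
    """Return "GSM-7" if every character is encodable in GSM-7, else "UCS-2"."""
    if all(c in GSM7_BASIC or c in GSM7_EXTENDED for c in text):
        return "GSM-7"
    return "UCS-2"


def gsm7_length(text: str) -> int:
    """GSM-7 length of the text, counting extension characters as 2."""
    return sum(2 if c in GSM7_EXTENDED else 1 for c in text)


print(detect_encoding("Your code is 123456"))        # GSM-7
print(detect_encoding("Votre fenêtre est ouverte"))  # UCS-2
print(gsm7_length("Price: 5€"))                      # 10
```

Running a check like this before sending makes it easier to spot a stray character that would switch the whole message to UCS-2.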


UCS-2

UCS-2 is used when the message contains characters outside the GSM-7 alphabet. It encodes each character as 16 bits (2 bytes), which allows it to represent up to 65,536 characters — covering virtually all scripts in the Unicode Basic Multilingual Plane.

The trade-off is a significantly reduced character limit: a single UCS-2 SMS can contain up to 70 characters.

For multi-part messages, each segment carries up to 67 characters (3 characters per segment are reserved for the User Data Header).

| Message length | Segments |
| --- | --- |
| 1–70 chars | 1 |
| 71–134 chars | 2 |
| 135–201 chars | 3 |
| 202–268 chars | 4 |
| Up to 700 chars | Up to 10 |

Formula: segments = ceil(length / 67) for messages longer than 70 characters.
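A minimal sketch of the UCS-2 formula follows. It measures length in 16-bit units by encoding the text as UTF-16, which is an assumption: characters outside the Basic Multilingual Plane (most emojis) then count as 2 units, and the platform's own counting may differ. The function names are hypothetical.

```python
import math

UCS2_SINGLE_LIMIT = 70     # one unsplit message
UCS2_MULTIPART_LIMIT = 67  # per segment once a UDH is needed


def ucs2_units(text: str) -> int:
    """Number of 16-bit units the text occupies (UTF-16 code units)."""
    return len(text.encode("utf-16-be")) // 2


def ucs2_segments(text: str) -> int:
    """Segment count for a message sent in UCS-2."""
    units = ucs2_units(text)
    if units <= UCS2_SINGLE_LIMIT:
        return 1
    return math.ceil(units / UCS2_MULTIPART_LIMIT)


print(ucs2_segments("я" * 70))  # 1
print(ucs2_segments("я" * 71))  # 2
print(ucs2_units("😀"))         # 2: outside the BMP, stored as a surrogate pair
```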


Comparison

| | GSM-7 | UCS-2 |
| --- | --- | --- |
| Bits per character | 7 | 16 |
| Supported characters | 128 (Latin + common symbols) | 65,536 (Unicode BMP) |
| Single SMS limit | 160 characters | 70 characters |
| Multi-part segment limit | 153 characters | 67 characters |
| Max concatenated length | ~1,600 characters | ~700 characters |
| Triggered by | Default | Any non-GSM-7 character in the text |

Practical examples

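| Message | Encoding | Length | Segments |
| --- | --- | --- | --- |
| Your code is 123456 | GSM-7 | 19 chars | 1 |
| 160 × A | GSM-7 | 160 chars | 1 |
| 161 × A | GSM-7 | 161 chars | 2 |
| Votre fenêtre est ouverte | UCS-2 | 25 chars | 1 (ê is not in GSM-7) |
| Bâtiment B, salle 3 | UCS-2 | 19 chars | 1 (â triggers UCS-2) |
| 70 × (non-GSM-7 character) | UCS-2 | 70 chars | 1 |
| 71 × (non-GSM-7 character) | UCS-2 | 71 chars | 2 |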

Each character in the GSM-7 extension table, such as €, counts as 2 characters, not 1. A message made up only of extension characters has an effective limit of 80 per single SMS (160 ÷ 2), not 160.


Common pitfalls

A single Unicode character can double your segment count

Because UCS-2 applies to the entire message, a single non-GSM-7 character forces re-encoding of all the text. This can have a significant impact on segment count:

Take a message of 152 GSM-7 characters plus 1 emoji, 153 characters in total. Without the emoji, text of that length would fit in 1 GSM-7 segment. Because of the emoji, the whole message is encoded in UCS-2: 153 characters at 67 per segment results in 3 segments instead of 1.
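The effect can be reproduced with the two sets of limits side by side. This is a sketch with a hypothetical helper, `segments`, that counts length the way the example above does (153 characters). If an emoji is counted as 2 UTF-16 units instead, the total becomes 154, which still needs 3 segments.

```python
import math


def segments(units: int, single_limit: int, multipart_limit: int) -> int:
    """Segment count for a message of `units` characters under the given limits."""
    if units <= single_limit:
        return 1
    return math.ceil(units / multipart_limit)


length = 153  # 152 GSM-7 characters plus one more character

as_gsm7 = segments(length, single_limit=160, multipart_limit=153)
as_ucs2 = segments(length, single_limit=70, multipart_limit=67)
print(as_gsm7, as_ucs2)  # 1 3
```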


Reading the segment count in the API

The number of segments used for a delivered message is available in two places:

Message Status API

The sms.segments field is returned in the response body of all three message status endpoints:

{
  "message": {
    "id": "011d9d6e-b5b9-4cb9-be13-2bc336a923ce",
    "channel": "SMS",
    "status": "DELIVERED",
    "sms": {
      "segments": 2
    }
  }
}

See Message Status API for the full response reference.

Webhook

The same sms.segments field is included in every webhook status notification:

{
  "type": "STATUS_UPDATE",
  "message": {
    "id": "b31b6607-9c55-48ba-b145-3f40b809d2d2",
    "channel": "SMS",
    "status": "DELIVERED",
    "sms": {
      "segments": 2
    }
  }
}

See Understanding webhook for the full payload reference.

sms.segments is only present when channel is SMS. It is not included for RCS messages.
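Since both payloads above nest the value under message.sms.segments, one reader function can serve the status response and the webhook alike. The sketch below assumes only the fields shown in the samples. The function name `sms_segments` is hypothetical, and returning None for non-SMS channels is a design choice, not documented behaviour.

```python
import json
from typing import Optional


def sms_segments(payload: dict) -> Optional[int]:
    """Return message.sms.segments, or None when the message is not an SMS."""
    message = payload.get("message", {})
    if message.get("channel") != "SMS":
        return None
    return message.get("sms", {}).get("segments")


# Sample webhook body copied from the example above.
webhook_body = json.loads("""
{
  "type": "STATUS_UPDATE",
  "message": {
    "id": "b31b6607-9c55-48ba-b145-3f40b809d2d2",
    "channel": "SMS",
    "status": "DELIVERED",
    "sms": {"segments": 2}
  }
}
""")

print(sms_segments(webhook_body))                     # 2
print(sms_segments({"message": {"channel": "RCS"}}))  # None
```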