String Length Calculator
Characters, bytes (UTF-8/UTF-16/ASCII), codepoints, graphemes, URL-encoded length, and Base64 length — all at once.
How to Use the String Length Calculator
- Paste your string into the input area. Metrics update instantly as you type.
- Summary mode — shows all key length metrics at a glance: characters, bytes (UTF-8, UTF-16, ASCII), codepoints, graphemes, URL-encoded length, and Base64 length.
- Encoding mode — adds a per-character breakdown table showing each character's codepoint and UTF-8 byte sequence.
- Compare mode — paste two strings side by side and see a difference table for all metrics.
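A per-character breakdown like the one Encoding mode describes can be approximated in a few lines of JavaScript. This is only a sketch: the function name `encodingBreakdown` and the row shape are hypothetical, and it emits one row per Unicode codepoint, which may differ from how the tool itself groups characters.

```javascript
// Build one row per code point: the character, its U+XXXX label,
// and the hex bytes of its UTF-8 encoding.
function encodingBreakdown(text) {
  const encoder = new TextEncoder();
  // Array.from on a string walks it by code point, not by UTF-16 code unit.
  return Array.from(text, (ch) => ({
    character: ch,
    codePoint: "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
    utf8: Array.from(encoder.encode(ch), (byte) =>
      byte.toString(16).toUpperCase().padStart(2, "0")
    ).join(" "),
  }));
}

// "A", e with acute accent (U+00E9), euro sign (U+20AC)
console.table(encodingBreakdown("A\u00E9\u20AC"));
// A -> U+0041, 41
// U+00E9 -> C3 A9
// U+20AC -> E2 82 AC
```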
Understanding String Length Metrics
Character Count vs. Grapheme Count
When people ask "how many characters is this?", they usually want the grapheme count: the number of symbols a human would count by pointing at each one. This is distinct from JavaScript's string.length, which counts UTF-16 code units, and from the Unicode codepoint count, which counts individual Unicode codepoints. A family emoji such as 👨‍👩‍👧 (man, woman, and girl joined by two zero-width joiners, U+200D) is 1 grapheme, 5 Unicode codepoints, and 8 UTF-16 code units. This tool shows all three counts so you know exactly what each API or database field will measure.
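The three counts can be reproduced in JavaScript. The sketch below assumes a runtime that provides Intl.Segmenter (recent browsers and Node.js 16 or later); the function name `lengthMetrics` is made up for illustration. The family emoji is written with escapes so that the zero-width joiners are visible in the source.

```javascript
// Report the three "lengths" of the same string.
function lengthMetrics(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    utf16CodeUnits: text.length,                            // what string.length reports
    codepoints: Array.from(text).length,                    // string iteration walks by code point
    graphemes: Array.from(segmenter.segment(text)).length,  // user-perceived characters
  };
}

// MAN + ZWJ + WOMAN + ZWJ + GIRL
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
console.log(lengthMetrics(family));
// { utf16CodeUnits: 8, codepoints: 5, graphemes: 1 }
```

Each of the three emoji lies outside the Basic Multilingual Plane and takes 2 code units, while each joiner takes 1, which gives 3 × 2 + 2 × 1 = 8 code units.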
UTF-8 Byte Count
UTF-8 is the standard encoding for web content, JSON, and most databases. It uses a variable number of bytes per codepoint: 1 byte for ASCII characters (U+0000–U+007F), 2 bytes for Latin letters with diacritics and scripts such as Greek, Cyrillic, Hebrew, and Arabic (U+0080–U+07FF), 3 bytes for most CJK (Chinese, Japanese, Korean) characters and common symbols (U+0800–U+FFFF), and 4 bytes for most emoji and other supplementary characters (U+10000 and above). The UTF-8 byte count is the number that matters when a VARCHAR or TEXT column, or any other field, has a limit expressed in bytes rather than characters.
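The byte ranges above translate directly into code. The following sketch computes the UTF-8 byte count from the codepoint ranges and cross-checks it against the built-in TextEncoder; the function names are illustrative only.

```javascript
// Bytes needed to encode a single code point in UTF-8.
function utf8BytesForCodePoint(codePoint) {
  if (codePoint <= 0x7f) return 1;    // ASCII
  if (codePoint <= 0x7ff) return 2;   // U+0080 to U+07FF
  if (codePoint <= 0xffff) return 3;  // rest of the Basic Multilingual Plane
  return 4;                           // supplementary planes (U+10000 and above)
}

// Sum the per-code-point sizes; for...of yields whole code points.
function utf8ByteLength(text) {
  let total = 0;
  for (const ch of text) {
    total += utf8BytesForCodePoint(ch.codePointAt(0));
  }
  return total;
}

// The built-in encoder gives the same number for well-formed strings.
function utf8ByteLengthBuiltIn(text) {
  return new TextEncoder().encode(text).length;
}

console.log(utf8ByteLength("a"));          // 1
console.log(utf8ByteLength("\u00E9"));     // 2 (e with acute accent)
console.log(utf8ByteLength("\u4E2D"));     // 3 (a CJK ideograph)
console.log(utf8ByteLength("\u{1F600}"));  // 4 (an emoji)
```

In practice the TextEncoder version is the simpler choice; the range-based version is shown only to make the rule explicit.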
UTF-16 Byte Count
UTF-16 is the internal string format used by JavaScript, Java, C#, and Windows APIs, including Windows file system APIs and .NET string handling. It uses 2 bytes (one code unit) for characters in the Basic Multilingual Plane and 4 bytes (a surrogate pair of two code units) for supplementary characters such as most emoji and historic scripts. Knowing the UTF-16 size is useful when working with Windows native APIs, Java's String.length(), or .NET's string.Length. Note that these APIs count UTF-16 code units, not bytes: the UTF-16 byte count is twice the code unit count.
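Because JavaScript strings are already sequences of UTF-16 code units, the UTF-16 byte count follows directly from string.length. A minimal sketch, with the hypothetical helper name `utf16Metrics` and no byte order mark included in the count:

```javascript
// UTF-16 size of a string: every code unit is 2 bytes.
function utf16Metrics(text) {
  let supplementary = 0;
  for (const ch of text) {
    // Code points above U+FFFF need a surrogate pair (2 code units, 4 bytes).
    if (ch.codePointAt(0) > 0xffff) supplementary++;
  }
  return {
    codeUnits: text.length,    // what String.length() / string.Length style APIs report
    bytes: text.length * 2,    // 2 bytes per code unit, BOM not counted
    supplementary,
  };
}

console.log(utf16Metrics("A\u{1F600}"));
// { codeUnits: 3, bytes: 6, supplementary: 1 }
```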
ASCII Byte Count
The ASCII-compatible byte count assumes the string will pass through a system that only supports 7-bit ASCII. Non-ASCII characters are counted as if they were percent-encoded: each byte of the character's UTF-8 encoding becomes a 3-byte %XX sequence. This approximates the length overhead of transmitting the string through protocols that require ASCII. True ASCII cannot represent non-ASCII characters at all, so this metric shows what the length would be if the string were URL-encoded (encodeURIComponent).
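The counting rule described here can be sketched as follows. This is an assumption about how such a metric might be computed, not the tool's actual code: ASCII bytes count as 1, and every byte of a non-ASCII character's UTF-8 encoding counts as 3. The name `asciiEscapedLength` is hypothetical. Note that encodeURIComponent additionally escapes some ASCII characters (a space, for example), so its result can be larger than this figure.

```javascript
// Length of the string if every non-ASCII UTF-8 byte were written as %XX.
// Assumption: ASCII bytes are left as they are and count as 1.
function asciiEscapedLength(text) {
  let total = 0;
  for (const byte of new TextEncoder().encode(text)) {
    total += byte < 0x80 ? 1 : 3; // %XX is three ASCII characters
  }
  return total;
}

console.log(asciiEscapedLength("cafe"));       // 4
console.log(asciiEscapedLength("caf\u00E9"));  // 9  (3 ASCII bytes + 2 escaped bytes)
console.log(asciiEscapedLength("\u{1F600}"));  // 12 (4 escaped bytes)
```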
URL-Encoded Length
URL encoding (percent-encoding) converts each reserved character, and each byte of a non-ASCII character's UTF-8 encoding, to a %XX sequence. The URL-encoded length is the length of the string after encodeURIComponent() is applied. This is the relevant length for query string parameters, path segments, and form data submitted via GET. The HTTP specification defines no maximum URL length, but about 2,000 characters is a widely used safe limit because some browsers, servers, and proxies reject longer URLs. If your URL-encoded string approaches 2,000 characters, consider using POST instead of GET.
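A short sketch of the measurement, using the standard encodeURIComponent function. The helper names and the `URL_SOFT_LIMIT` constant are illustrative; the 2,000 figure is the rule of thumb from the text, not a value from any specification. The check covers only the encoded value itself, not the rest of the URL.

```javascript
// Conservative rule-of-thumb limit, not a value from the HTTP specification.
const URL_SOFT_LIMIT = 2000;

// Length of the string after percent-encoding.
// Note: encodeURIComponent throws a URIError if the string contains a lone surrogate.
function urlEncodedLength(text) {
  return encodeURIComponent(text).length;
}

// True if the encoded value alone stays within the chosen limit.
function fitsUrlBudget(text, limit = URL_SOFT_LIMIT) {
  return urlEncodedLength(text) <= limit;
}

console.log(urlEncodedLength("a b"));        // 5  ("a%20b")
console.log(urlEncodedLength("caf\u00E9"));  // 9  ("caf%C3%A9")
console.log(urlEncodedLength("a&b=c"));      // 9  ("a%26b%3Dc")
console.log(fitsUrlBudget("x".repeat(2001))); // false
```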
Base64 Length
Base64 encodes binary data (or UTF-8 text) as ASCII characters, using 4 Base64 characters for every 3 bytes, with = padding to make the output length a multiple of 4. For padded output, the Base64 length of a string is exactly ceil(utf8_bytes / 3) * 4. Variants that omit padding, such as the base64url encoding used in JSON Web Tokens (JWT), can be up to 2 characters shorter. Base64 is commonly used in data URIs (embedding images in CSS), JWTs, and HTTP Basic Authentication headers. Knowing the Base64 length helps when you need to check if a Base64-encoded value will fit within a database column, cookie size limit, or API field length restriction.
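The length can be computed from the UTF-8 byte count alone, without performing the encoding. A minimal sketch; the function name `base64Length` and its `padded` option are illustrative, and the unpadded branch covers encoders that drop the trailing = characters.

```javascript
// Base64 output length for the UTF-8 bytes of a string.
function base64Length(text, { padded = true } = {}) {
  const byteCount = new TextEncoder().encode(text).length;
  return padded
    ? Math.ceil(byteCount / 3) * 4      // every 3-byte group (or remainder) becomes 4 characters
    : Math.ceil((byteCount * 4) / 3);   // same data without trailing "=" padding
}

console.log(base64Length("hello"));                     // 8 ("aGVsbG8=")
console.log(base64Length("hello", { padded: false }));  // 7 ("aGVsbG8")
console.log(base64Length("hi"));                        // 4 ("aGk=")
console.log(base64Length(""));                          // 0
```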
Compare Mode
Compare mode is useful for debugging encoding mismatches between two versions of the same string, checking whether a field was trimmed or modified by a system, or verifying that two strings that look identical are truly identical at the byte level. Paste both strings and the tool shows the difference in every length metric. If two strings look the same but have different byte counts, the usual causes are a hidden character (zero-width space, non-breaking space, or BOM) in one of them, or a difference in Unicode normalization (for example, a precomposed é versus e followed by a combining accent). Use the Zero-Width Character Detector or Whitespace Visualizer to find hidden characters.
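When two strings look identical but measure differently, a quick scan for invisible codepoints often finds the culprit. The sketch below is one possible approach, not the tool's implementation: the `INVISIBLE` table is a small, non-exhaustive sample and the function name is hypothetical.

```javascript
// A small, non-exhaustive table of characters that usually render as nothing
// or as an ordinary-looking space.
const INVISIBLE = new Map([
  [0x00a0, "NO-BREAK SPACE"],
  [0x200b, "ZERO WIDTH SPACE"],
  [0x200c, "ZERO WIDTH NON-JOINER"],
  [0x200d, "ZERO WIDTH JOINER"],
  [0xfeff, "BYTE ORDER MARK"],
]);

// List each hidden character with its UTF-16 offset in the string.
function findHiddenCharacters(text) {
  const found = [];
  let index = 0;
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    if (INVISIBLE.has(codePoint)) {
      found.push({
        index,
        codePoint: "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0"),
        name: INVISIBLE.get(codePoint),
      });
    }
    index += ch.length; // advance by 1 or 2 code units
  }
  return found;
}

console.log(findHiddenCharacters("hel\u200Blo"));
// [ { index: 3, codePoint: 'U+200B', name: 'ZERO WIDTH SPACE' } ]
console.log(findHiddenCharacters("hello")); // []
```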