Zero-Width Hidden Message Encoder

Hide a secret message inside visible text using invisible zero-width Unicode characters. Decode any text to reveal hidden messages.

What Are Zero-Width Characters?

Zero-width characters are Unicode code points that render with no visible glyph and take up no horizontal space in text. The most important ones are Zero Width Space (U+200B), Zero Width Non-Joiner (U+200C), Zero Width Joiner (U+200D), and Zero Width No-Break Space (U+FEFF, also the Byte Order Mark). These characters were designed for legitimate typographic purposes: ZWJ joins emoji components into a single glyph, and ZWNJ prevents unwanted ligatures or joining in scripts such as Persian and Devanagari. Their invisibility also makes them usable for information hiding.

Unicode Steganography

Text steganography using zero-width characters exploits the fact that these characters are invisible to readers but present in the underlying byte sequence. By mapping the binary representation of a secret message to sequences of zero-width characters and inserting them into ordinary text, a hidden message can be embedded in plain sight. The technique works only in media that preserve the exact Unicode code points. Email, social media posts, documents, and web pages often pass zero-width characters through unchanged, but some platforms strip or normalize them, which destroys the hidden message.
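The mapping described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not necessarily how this page's encoder is implemented. It uses the bit mapping given in the FAQ (bit 0 is U+200B, bit 1 is U+200C) and makes two assumptions of its own: the secret is converted to UTF-8 bytes, and the whole payload is placed after the first visible character of the carrier. The function names toZeroWidth and encode are hypothetical.

```javascript
// Bit-to-character mapping described in this page's FAQ.
const ZERO = "\u200B"; // Zero Width Space      -> bit 0
const ONE = "\u200C";  // Zero Width Non-Joiner -> bit 1

// Convert a secret string to a run of zero-width characters:
// 8 per UTF-8 byte, most significant bit first (UTF-8 is an assumption).
function toZeroWidth(secret) {
  const bytes = new TextEncoder().encode(secret);
  let out = "";
  for (const byte of bytes) {
    for (let bit = 7; bit >= 0; bit--) {
      out += (byte >> bit) & 1 ? ONE : ZERO;
    }
  }
  return out;
}

// Assumption: the whole payload goes after the first visible character.
// Spreading it between several characters would work equally well.
function encode(carrier, secret) {
  const chars = Array.from(carrier); // split by code point, not UTF-16 unit
  if (chars.length === 0) {
    throw new Error("carrier text must not be empty");
  }
  return chars[0] + toZeroWidth(secret) + chars.slice(1).join("");
}

const stego = encode("Hello, world", "Hi");
console.log(stego.length); // 28: 12 visible characters + 16 zero-width characters
```

Placing the payload after the first character keeps the sketch short. Because the payload is a single contiguous run, it is also easy to find, so a real tool might prefer to distribute it through the carrier.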

Legitimate Use Cases

The most practical legitimate application of zero-width steganography is document watermarking. A confidential document distributed to multiple recipients can contain a unique hidden message for each recipient. If the document is leaked as text, the watermark identifies the source. The technique suits any organization that needs to track document provenance without visible markings, such as a law firm or a financial institution. The hidden message can include a timestamp, recipient identifier, or any other provenance information. The watermark survives only if the leaked copy preserves the zero-width characters, so retyping the text or photographing it removes the mark.
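As a rough sketch of per-recipient watermarking, the example below hides a numeric recipient ID in each copy and recovers it from a leaked copy. Everything specific here is an assumption made for illustration: the 16-bit ID size, the placement after the first word, the function names watermark and identifyRecipient, and the sample IDs. It also assumes the original text contains no U+200B or U+200C characters of its own.

```javascript
const ZERO = "\u200B"; // bit 0
const ONE = "\u200C";  // bit 1
const ID_BITS = 16;    // assumption: recipient IDs fit in 16 bits (0..65535)

// Insert the recipient ID as 16 zero-width characters after the first word.
function watermark(text, recipientId) {
  if (!Number.isInteger(recipientId) || recipientId < 0 || recipientId >= 2 ** ID_BITS) {
    throw new RangeError("recipientId must be an integer in 0..65535");
  }
  let mark = "";
  for (let bit = ID_BITS - 1; bit >= 0; bit--) {
    mark += (recipientId >> bit) & 1 ? ONE : ZERO;
  }
  const firstSpace = text.indexOf(" ");
  const cut = firstSpace === -1 ? text.length : firstSpace;
  return text.slice(0, cut) + mark + text.slice(cut);
}

// Recover the ID from a leaked copy, or null if no complete mark is present.
function identifyRecipient(leaked) {
  const bits = Array.from(leaked).filter((ch) => ch === ZERO || ch === ONE);
  if (bits.length < ID_BITS) return null;
  let id = 0;
  for (const ch of bits.slice(0, ID_BITS)) {
    id = id * 2 + (ch === ONE ? 1 : 0);
  }
  return id;
}

const memo = "Quarterly results are confidential.";
const copyForAlice = watermark(memo, 1042); // hypothetical recipient IDs
const copyForBob = watermark(memo, 2077);
console.log(identifyRecipient(copyForAlice)); // 1042
console.log(identifyRecipient(copyForBob));   // 2077
console.log(identifyRecipient(memo));         // null
```

A fixed-width numeric ID keeps the mark short. A production scheme would probably repeat the mark in several places so that a partial excerpt still identifies the recipient.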

Security Implications

Zero-width characters pose several security risks beyond steganography. They can be used to bypass content filters and keyword detection systems, since the string "hello" and the same string with a Zero Width Space (U+200B) inserted after the "h" look identical but are different byte sequences. Phishing attacks can use zero-width characters to make malicious text pass filters while appearing legitimate to users. In code review, invisible characters can make malicious code look harmless to a human reviewer. The Trojan Source attack demonstrated how bidirectional Unicode control characters, which are also invisible, can make source code display in a different order from the one the compiler actually processes.
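The filter bypass is easy to reproduce. The sketch below compares a naive keyword check with one that strips the four zero-width characters discussed on this page before matching. The keyword list and the function names naiveFilter and hardenedFilter are hypothetical, and the set of stripped characters is an assumption that a real filter would likely need to extend.

```javascript
// Characters to strip before matching: the four discussed on this page.
// Assumption: a real filter would cover more (for example, bidi controls).
const INVISIBLE = /[\u200B\u200C\u200D\uFEFF]/g;

const blocked = ["hello"]; // hypothetical keyword list

// Plain substring match on the raw text.
function naiveFilter(text) {
  return blocked.some((word) => text.includes(word));
}

// Same match, but on a copy with zero-width characters removed.
function hardenedFilter(text) {
  return naiveFilter(text.replace(INVISIBLE, ""));
}

const evasive = "h\u200Bello"; // renders like "hello"
console.log(evasive === "hello");     // false
console.log(naiveFilter(evasive));    // false: the keyword is split in two
console.log(hardenedFilter(evasive)); // true
```

The stripping is applied only to the copy used for comparison. Removing ZWJ or ZWNJ from the text that is actually stored or displayed could damage emoji sequences and text in scripts that rely on them.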

Detecting Zero-Width Characters

To detect zero-width characters in text, use our Unicode Character Inspector, which shows every character including invisible ones. In JavaScript, you can test for the common zero-width characters with a regex: /[\u200B\u200C\u200D\uFEFF]/. For more thorough detection, check for any character in Unicode general category Cf (Format), which covers all four characters above as well as the bidirectional controls. Some text editors, including VS Code, can highlight invisible characters. On Unix, wc -c reports the byte count and wc -m the character count; either will be higher than the visible text suggests if zero-width characters are present, since each of the four characters above occupies 3 bytes in UTF-8.
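A category-based check can be written with a Unicode property escape, which JavaScript regular expressions support when the u flag is set. The sketch below lists every Cf character in a string along with its position. The function name findFormatChars and the shape of the returned objects are assumptions made for this example.

```javascript
// Report every format character (Unicode General_Category Cf) in a string.
function findFormatChars(text) {
  const hits = [];
  for (const match of text.matchAll(/\p{Cf}/gu)) {
    const cp = match[0].codePointAt(0);
    hits.push({
      index: match.index, // UTF-16 offset within the string
      codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
    });
  }
  return hits;
}

const sample = "h\u200Bello";
console.log(findFormatChars(sample)); // [ { index: 1, codePoint: 'U+200B' } ]
console.log(sample.length);                           // 6, although 5 characters are visible
console.log(new TextEncoder().encode(sample).length); // 8 bytes in UTF-8
```

Category Cf also contains characters with ordinary uses, such as the soft hyphen (U+00AD), so a hit is a reason to look closer rather than proof of a hidden message.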

Encoding Capacity and Limitations

The binary encoding used here represents each byte of the secret as 8 bits, requiring 8 zero-width characters per byte. For ASCII text (1 byte per character), a 100-character secret message requires 800 zero-width characters. For Unicode text with multi-byte characters, the requirement is higher. While zero-width characters are invisible, their presence can be inferred from an unusually large file size or character count: each one occupies 3 bytes in UTF-8, so those 800 characters add 2,400 bytes. More advanced steganographic schemes use more than two zero-width character types to carry more than one bit per character, or employ error-correcting codes for robustness.
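The arithmetic above can be checked with a small helper. This sketch assumes the secret is encoded as UTF-8 before being split into bits and that the text is stored as UTF-8, where U+200B and U+200C take 3 bytes each. The function name payloadCost is hypothetical.

```javascript
// Cost of hiding `secret` with one zero-width character per bit.
function payloadCost(secret) {
  // Assumption: the secret is converted to UTF-8 bytes first.
  const secretBytes = new TextEncoder().encode(secret).length;
  const zeroWidthChars = secretBytes * 8;
  // U+200B and U+200C each occupy 3 bytes in UTF-8.
  const addedUtf8Bytes = zeroWidthChars * 3;
  return { secretBytes, zeroWidthChars, addedUtf8Bytes };
}

console.log(payloadCost("a".repeat(100)));
// { secretBytes: 100, zeroWidthChars: 800, addedUtf8Bytes: 2400 }
console.log(payloadCost("\u00E9")); // "é" is 2 bytes in UTF-8
// { secretBytes: 2, zeroWidthChars: 16, addedUtf8Bytes: 48 }
```

Under these assumptions the hidden payload takes 24 times as many bytes as the secret itself, which is why a long secret in a short carrier is easy to spot from size alone.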

Frequently Asked Questions

What are zero-width characters?
Zero-width characters are Unicode code points with no visible width when rendered. Key ones include Zero Width Space (U+200B), Zero Width Non-Joiner (U+200C), Zero Width Joiner (U+200D), and Zero Width No-Break Space (U+FEFF). They are invisible to readers but present in the raw byte sequence of the text.

How does the encoding work?
Each character of the secret message is converted to 8-bit binary. Bit 0 maps to Zero Width Space (U+200B) and bit 1 maps to Zero Width Non-Joiner (U+200C). These invisible characters are inserted between characters of the visible carrier text. To decode, extract all ZWS/ZWNJ characters, group them into 8-bit chunks, and convert each chunk back to text.

Can hidden messages be detected?
Yes. Zero-width characters can be detected by examining the raw bytes or character count of the text, using a Unicode inspector, or with regex patterns matching Unicode category Cf. The text will contain more characters and bytes than its visible content accounts for. Security tools, document analysis software, and careful manual inspection can all reveal hidden zero-width characters.

What are the legitimate and malicious uses?
Legitimate uses include watermarking documents to identify the source of leaks, adding invisible provenance metadata to text, and security research. Malicious uses include bypassing content filters (since invisible characters alter string comparisons without being visible), embedding covert communication channels, and the Trojan Source attack on source code. This tool is for educational and legitimate purposes only.

Does hiding a message change the visible text?
No. The visible text appears completely unchanged. The hidden message is encoded entirely in zero-width characters inserted between visible characters. The only detectable differences are a longer byte sequence, a higher character count, and potential issues with text processing that does not expect non-printable characters.
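The decoding steps described in the FAQ (extract the ZWS/ZWNJ characters, group them into 8-bit chunks, convert back to text) can be sketched as follows. This is an illustration under stated assumptions rather than this page's actual decoder: it treats the recovered bytes as UTF-8, reads bits most significant first, and silently ignores a trailing partial byte. The function name decode is hypothetical.

```javascript
const ZERO = "\u200B"; // Zero Width Space      -> bit 0
const ONE = "\u200C";  // Zero Width Non-Joiner -> bit 1

function decode(text) {
  // Keep only the two characters that carry bits.
  const bits = Array.from(text).filter((ch) => ch === ZERO || ch === ONE);
  const byteCount = Math.floor(bits.length / 8); // ignore a trailing partial byte
  const bytes = new Uint8Array(byteCount);
  for (let i = 0; i < byteCount; i++) {
    let value = 0;
    for (let j = 0; j < 8; j++) {
      value = (value << 1) | (bits[i * 8 + j] === ONE ? 1 : 0);
    }
    bytes[i] = value;
  }
  // Assumption: the payload is UTF-8. Invalid sequences decode to U+FFFD.
  return new TextDecoder().decode(bytes);
}

// "Hi" is 0x48 0x69, that is 01001000 01101001.
const hidden = "0100100001101001".replace(/0/g, ZERO).replace(/1/g, ONE);
console.log(decode("H" + hidden + "ello")); // Hi
console.log(decode("plain text") === "");   // true: nothing hidden
```

Because the decoder collects every U+200B and U+200C in the input, a carrier that already uses those characters for typographic reasons would corrupt the result. A more careful scheme could add a marker or length prefix to delimit the payload.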