Zero-Width Character Detector
Find, highlight, and remove invisible zero-width Unicode characters — ZWS, ZWNJ, ZWJ, BOM, and more.
How to Use the Zero-Width Character Detector
- Paste your text into the input area on the left. Zero-width characters are invisible, so text containing them looks normal.
- Detect mode — the tool highlights every zero-width character with a red badge showing its Unicode code point, and lists all findings below the output with position and name.
- Remove mode — strips all zero-width characters and outputs clean text ready to copy or download.
- Insert mode — adds zero-width characters to your text for line-break control or document watermarking.
What Are Zero-Width Characters?
Zero-width characters are Unicode code points that occupy no horizontal space when rendered. They are invisible in virtually all fonts and rendering environments, which makes them useful for legitimate typography and equally convenient for malicious obfuscation. The most common zero-width characters are: Zero-Width Space (U+200B), which marks a permitted line-break point inside a long word; Zero-Width Non-Joiner (U+200C), which prevents adjacent characters from joining in Arabic and Indic scripts; Zero-Width Joiner (U+200D), which requests that adjacent characters join (it is also the glue in emoji sequences such as family emojis); Byte Order Mark (U+FEFF, also named Zero-Width No-Break Space); and Word Joiner (U+2060), which prevents a line break without adding space.
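A short Python sketch can make the "invisible but present" point concrete. It uses only the standard `unicodedata` module; the helper name `describe` and the sample strings are illustrative and not part of this tool.

```python
import unicodedata

# The five characters named above, written as escapes so they stay visible in source.
COMMON_ZERO_WIDTH = ["\u200b", "\u200c", "\u200d", "\ufeff", "\u2060"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"


plain = "password"
hidden = "pass\u200bword"  # renders the same as `plain` in most environments

print(plain == hidden)          # False: the strings differ by one code point
print(len(plain), len(hidden))  # 8 9

for ch in COMMON_ZERO_WIDTH:
    print(describe(ch))
```

The two strings look alike on screen, yet they compare unequal and differ in length, which is the property both the legitimate and the malicious uses rely on.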
Security Implications
Zero-width characters have become a significant security concern in several contexts. In phishing, they are inserted into URLs and link text so that an address looks identical to a legitimate one on screen while the underlying string is different, which can slip it past blocklists and exact-match checks. In social engineering and spam, they are inserted into names, keywords, and sensitive strings to bypass automated keyword filters and spam detectors. In leak tracking, they cut the other way: anyone who suspects a document carries a zero-width watermark can remove the characters and strip the fingerprint. This tool helps security researchers and content creators audit their text for these hidden characters.
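The filter-bypass case can be sketched in a few lines of Python. This is a simplified illustration, not how any particular spam filter works: `naive_filter`, `hardened_filter`, and the blocked-word list are hypothetical, and the character class covers only the five most common zero-width characters.

```python
import re

# Assumption: only the five most common zero-width characters are stripped here.
ZERO_WIDTH_RE = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")


def naive_filter(text: str, blocked: list[str]) -> bool:
    """Return True if any blocked word appears verbatim in the text."""
    lowered = text.lower()
    return any(word in lowered for word in blocked)


def hardened_filter(text: str, blocked: list[str]) -> bool:
    """Strip zero-width characters first, then apply the same check."""
    return naive_filter(ZERO_WIDTH_RE.sub("", text), blocked)


blocked_words = ["free", "prize"]
message = "Claim your fr\u200bee pri\u200dze now"  # looks like "free prize"

print(naive_filter(message, blocked_words))     # False: the hidden characters split the keywords
print(hardened_filter(message, blocked_words))  # True: keywords match after stripping
```

Stripping before matching is the usual countermeasure, with the caveat that the stripped copy should be used only for matching when the original text contains scripts or emoji that need these characters.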
Zero-Width Characters in Programming
Zero-width characters in source code are a particularly dangerous attack vector. A malicious contribution could insert zero-width characters into string literals, variable names (in languages that permit them), or comments so that the code looks correct during review but behaves differently than expected. For example, a Zero-Width Joiner hidden inside a string literal makes it compare unequal to the visually identical string without one, so the comparison never matches. Some IDEs and code editors do not highlight these characters by default, which makes them hard to spot during review. Run untrusted code contributions through a zero-width character detector before merging.
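As a rough sketch of what a pre-merge check might look like, the Python function below scans source text and reports the line and column of each suspicious character. The name `scan_source`, the `SUSPICIOUS` set, and the sample snippet are assumptions made for illustration; columns are counted in code points, starting at 1.

```python
# Assumption: a minimal set of characters to flag; extend it as needed.
SUSPICIOUS = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def scan_source(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for every suspicious character."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPICIOUS:
                findings.append((line_no, col, f"U+{ord(ch):04X}"))
    return findings


# A ZWJ hides inside the first string literal, so the comparison on line 2 can never succeed.
snippet = 'role = "ad\u200dmin"\nif role == "admin":\n    grant_access()\n'

print("ad\u200dmin" == "admin")  # False
print(scan_source(snippet))      # [(1, 11, 'U+200D')]
```

A check like this could run in a commit hook or CI job and fail the build when the list of findings is not empty.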
Insert Mode: Watermarking with Zero-Width Characters
The Insert mode allows you to embed zero-width characters into text — a technique used for document fingerprinting and leak detection. The idea is simple: different combinations of ZWS (U+200B) and ZWNJ (U+200C) can be used to encode binary data. By assigning different zero-width character patterns to different recipients of a sensitive document, the document's origin can be traced if it leaks, even after formatting changes or copy-paste operations that preserve the invisible characters. Note that sophisticated actors can detect and strip these watermarks with tools like this one, so zero-width watermarking is most effective against unsophisticated leakers.
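One way to picture the encoding is the Python sketch below. It is not necessarily the scheme this tool's Insert mode uses: it assumes ZWS stands for bit 0 and ZWNJ for bit 1, a fixed 16-bit recipient ID, and a payload placed right after the first visible character. The names `embed_id` and `extract_id` are hypothetical.

```python
from typing import Optional

ZWS, ZWNJ = "\u200b", "\u200c"  # assumed mapping: ZWS = 0, ZWNJ = 1


def embed_id(text: str, recipient_id: int, bits: int = 16) -> str:
    """Hide recipient_id as a run of zero-width characters after the first character."""
    if not 0 <= recipient_id < 2 ** bits:
        raise ValueError("recipient_id does not fit in the requested number of bits")
    pattern = format(recipient_id, f"0{bits}b")
    mark = "".join(ZWNJ if bit == "1" else ZWS for bit in pattern)
    return text[:1] + mark + text[1:]


def extract_id(text: str) -> Optional[int]:
    """Read back the ID, or return None when no watermark characters are present."""
    bits = "".join("1" if ch == ZWNJ else "0" for ch in text if ch in (ZWS, ZWNJ))
    return int(bits, 2) if bits else None


original = "Quarterly report"
marked = embed_id(original, 42)

print(marked == original)   # False, although both render the same
print(extract_id(marked))   # 42
print(extract_id(original)) # None
```

This sketch assumes the text contains no other ZWS or ZWNJ characters; a real scheme would need a delimiter or checksum to tell a watermark apart from characters that were already there.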
Detected Character Reference
This tool detects the following zero-width and invisible Unicode characters:
- U+200B Zero-Width Space
- U+200C Zero-Width Non-Joiner
- U+200D Zero-Width Joiner
- U+FEFF Byte Order Mark (BOM) / Zero-Width No-Break Space
- U+2060 Word Joiner
- U+2061 Function Application
- U+2062 Invisible Times
- U+2063 Invisible Separator
- U+2064 Invisible Plus
- U+00AD Soft Hyphen
- U+034F Combining Grapheme Joiner
- U+180E Mongolian Vowel Separator
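For readers who want to script the same check, here is a minimal Python sketch of the Detect and Remove behavior over the characters listed above. It is an approximation, not this tool's actual implementation: `detect` and `remove` are hypothetical names, and positions are 0-based code point offsets, which may differ from the positions the tool reports.

```python
ZERO_WIDTH_NAMES = {
    "\u200b": "Zero-Width Space",
    "\u200c": "Zero-Width Non-Joiner",
    "\u200d": "Zero-Width Joiner",
    "\ufeff": "Byte Order Mark / Zero-Width No-Break Space",
    "\u2060": "Word Joiner",
    "\u2061": "Function Application",
    "\u2062": "Invisible Times",
    "\u2063": "Invisible Separator",
    "\u2064": "Invisible Plus",
    "\u00ad": "Soft Hyphen",
    "\u034f": "Combining Grapheme Joiner",
    "\u180e": "Mongolian Vowel Separator",
}


def detect(text: str) -> list[tuple[int, str, str]]:
    """Return (position, code point, name) for each listed character found."""
    return [
        (i, f"U+{ord(ch):04X}", ZERO_WIDTH_NAMES[ch])
        for i, ch in enumerate(text)
        if ch in ZERO_WIDTH_NAMES
    ]


def remove(text: str) -> str:
    """Return the text with every listed character stripped."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH_NAMES)


sample = "\ufeffHel\u200blo\u00ad"
for position, code_point, name in detect(sample):
    print(position, code_point, name)
print(remove(sample))  # Hello
```

Removing everything is the blunt option: it also breaks emoji that rely on ZWJ and text in scripts that need ZWNJ or ZWJ, so detecting first and removing selectively is often the safer order.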
Common Sources of Zero-Width Characters
- Copy from Twitter/X — may add ZWS for line-break control in long words
- Arabic and Indic text — ZWNJ and ZWJ control character joining
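- Emoji sequences — family emojis, profession emojis, and a few flags such as the rainbow flag use ZWJ to combine base characters (country flags use pairs of regional indicator symbols instead)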
- UTF-8 BOM files — some Windows text editors add a BOM at file start
- Web scraping — scraped content can carry zero-width characters that the source page's markup or scripts inserted
- Malicious content — spam, phishing, and filter evasion deliberately insert ZWC
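The BOM case from the list above is the easiest one to handle in code. The Python sketch below shows the difference between the standard `utf-8` and `utf-8-sig` codecs; the function names and the sample bytes are illustrative assumptions.

```python
import codecs


def decode_keep_bom(raw: bytes) -> str:
    """Plain UTF-8 decoding: a leading BOM survives as U+FEFF."""
    return raw.decode("utf-8")


def decode_strip_bom(raw: bytes) -> str:
    """'utf-8-sig' drops one leading BOM if present and is harmless otherwise."""
    return raw.decode("utf-8-sig")


# Simulated file contents as a BOM-writing editor might save them.
raw = codecs.BOM_UTF8 + b"name,value\n"

print(repr(decode_keep_bom(raw)))   # '\ufeffname,value\n'
print(repr(decode_strip_bom(raw)))  # 'name,value\n'
```

The same codec name works with `open(path, encoding="utf-8-sig")`, which is often enough to stop a stray BOM from ending up in the first column name or the first key of a parsed file.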