Text String Obfuscator

Replace ASCII characters with Unicode homoglyphs, interleave zero-width spaces, or convert to HTML entities. Includes a decode mode to reverse obfuscation.

How to Use the String Obfuscator

  1. Choose a mode: Homoglyphs replaces characters with look-alikes, Zero-Width inserts invisible characters, HTML Entities converts characters to entity codes, and Decode strips zero-width characters and decodes HTML entities.
  2. Paste your text into the left input area. Results appear instantly.
  3. Copy or download the obfuscated output using the buttons above the result.

Obfuscation Methods Explained

Homoglyph Substitution

Homoglyph substitution replaces standard ASCII letters with Unicode characters that look visually identical but have different code points. The Latin letter "a" (U+0061) can be replaced with the Cyrillic "а" (U+0430), the Latin "e" with "е" (U+0435), the Latin "o" with "о" (U+043E), and so on. To the human eye, the text looks unchanged. But to a computer performing string comparison, the strings are completely different. This technique is used by security researchers to study homograph attacks on domain names (e.g., apple.com vs аpple.com using a Cyrillic "а") and by developers watermarking documents. A wide range of Latin, Greek, and Cyrillic homoglyphs are available for most common letters.
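
As a rough illustration of the substitution described above, here is a minimal Python sketch. The `HOMOGLYPHS` table and the `to_homoglyphs` function are hypothetical names chosen for this example, and the table covers only the three letters mentioned in the text; it is not the tool's actual mapping.

```python
# Hypothetical mapping: only the three Latin letters named in the text,
# each paired with its Cyrillic look-alike.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small letter a
    "e": "\u0435",  # Cyrillic small letter ie
    "o": "\u043e",  # Cyrillic small letter o
}


def to_homoglyphs(text: str) -> str:
    """Replace each mapped Latin letter with its look-alike; leave the rest alone."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


original = "apple"
disguised = to_homoglyphs(original)

# The two strings render alike but compare as different.
print(disguised == original)  # False
# Listing the code points shows which characters were swapped.
print([f"U+{ord(ch):04X}" for ch in disguised])
```

Only the first and last letters of "apple" are in the table, so the other three characters keep their original Latin code points.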

Zero-Width Character Injection

Zero-width characters occupy no visual space in rendered text but are real Unicode characters present in the string. This tool uses Zero-Width Space (U+200B), Zero-Width Non-Joiner (U+200C), and Zero-Width Joiner (U+200D). Inserting these characters between or around the visible characters leaves the text looking unchanged to a reader, but the underlying string no longer matches the original for string matchers, search engines, and plagiarism checkers that compare exact character sequences. This technique is used to watermark text against leaks: different recipients get different patterns of zero-width characters, allowing the source of a leak to be traced. It can also bypass simple keyword-matching filters and copy-paste detection on some platforms.
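
The effect on string matching can be shown with a short Python sketch. The function name `inject_zero_width` and the choice to insert a single Zero-Width Space between every pair of characters are assumptions made for illustration; the tool itself may place or mix the three characters differently.

```python
ZWSP = "\u200b"  # Zero-Width Space


def inject_zero_width(text: str, marker: str = ZWSP) -> str:
    """Insert one zero-width character between each pair of adjacent characters."""
    return marker.join(text)


visible = "secret"
hidden = inject_zero_width(visible)

# Five invisible characters were added between the six visible ones.
print(len(visible), len(hidden))  # 6 11
# A plain substring search for the original word no longer succeeds.
print(visible in hidden)  # False
```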

HTML Entity Encoding

HTML entity encoding converts each character to its numeric HTML entity form (e.g., "H" → "&#72;", "e" → "&#101;"). When a browser renders this HTML, it displays the original text. But simple scraper scripts that read raw HTML source and do not parse entities will see a garbled string of ampersands and semicolons. This is a classic technique for protecting email addresses on web pages from harvester bots, though many modern scrapers decode entities and are not stopped by it. It remains useful for obfuscating strings in HTML source, testing HTML parsers, and educational demonstrations of entity encoding.
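
A minimal Python sketch of this encoding follows. The function name `to_numeric_entities` is hypothetical, and the use of decimal (rather than hexadecimal) references is an assumption based on the examples above. The standard library's `html.unescape` stands in for what a browser does when it renders the entities.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


encoded = to_numeric_entities("He")
print(encoded)  # &#72;&#101;

# Parsing the entities recovers the original text, as a browser would.
print(html.unescape(encoded))  # He
```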

Decode Mode

The Decode mode strips all known zero-width characters (U+200B, U+200C, U+200D, U+FEFF, and similar) from the input, and decodes HTML numeric entities back to their characters. Note that homoglyphs cannot be automatically decoded because the substitution mapping is not always reversible — a Cyrillic "а" could be a homoglyph or it could be intentional Cyrillic text. Related encoding tools: HTML Entity Encoder, Base64 Encoder, Unicode Text Generator.
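
The Decode behavior described above might look roughly like the following Python sketch. The names `decode`, `ZERO_WIDTH_RE`, and `NUMERIC_ENTITY_RE` are hypothetical, and the set of stripped characters is limited to the four code points listed in the text, since the "and similar" characters are not enumerated.

```python
import re

# The four zero-width code points named in the text (assumed to be the full set here).
ZERO_WIDTH_RE = re.compile("[\u200b\u200c\u200d\ufeff]")
# Decimal (&#72;) or hexadecimal (&#x48;) numeric character references.
NUMERIC_ENTITY_RE = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def _entity_to_char(match: re.Match) -> str:
    hex_digits, dec_digits = match.groups()
    code = int(hex_digits, 16) if hex_digits else int(dec_digits)
    try:
        return chr(code)
    except (ValueError, OverflowError):
        # Not a valid code point: leave the entity text untouched.
        return match.group(0)


def decode(text: str) -> str:
    """Strip zero-width characters, then decode numeric HTML entities."""
    stripped = ZERO_WIDTH_RE.sub("", text)
    return NUMERIC_ENTITY_RE.sub(_entity_to_char, stripped)


print(decode("s\u200be\u200cc&#114;&#x65;t\ufeff"))  # secret
```

Homoglyphs pass through this sketch unchanged, which matches the limitation noted above: a Cyrillic "а" in the input is still a Cyrillic "а" in the output.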

Frequently Asked Questions

What is a homoglyph?
A homoglyph is a character that looks visually identical or very similar to another but has a different Unicode code point. For example, Cyrillic "а" (U+0430) looks identical to Latin "a" (U+0061). Homoglyphs are used in domain spoofing, text watermarking, and obfuscation.

What are zero-width characters?
Zero-width characters are Unicode code points that take up no visible space. Examples include Zero-Width Space (U+200B) and Zero-Width Non-Joiner (U+200C). They can be inserted between characters to break string matching without affecting visual appearance.

What is HTML entity encoding?
HTML entity encoding replaces characters with their numeric HTML entities (e.g., "A" becomes "&#65;"). Browsers render them correctly, but simple scrapers reading raw HTML may not decode them, which makes the technique useful for hiding email addresses from basic harvester bots.

Can obfuscated text be decoded?
Zero-width characters can be stripped by removing the known Unicode code points. HTML entities can be decoded by parsing them. Homoglyphs are harder to reverse automatically. This tool's Decode mode handles zero-width stripping and HTML entity decoding.

What is text obfuscation used for?
Common uses include watermarking documents to trace leaks (a unique zero-width pattern per recipient), protecting email addresses from spam harvesters, bypassing keyword filters, testing Unicode handling in applications, and creative text effects.