Regular Expressions Guide for Developers
Regular expressions (regex) are one of the most powerful and versatile tools in a developer's toolkit. They let you search, match, validate, and transform text using compact pattern definitions. Every major programming language supports them, and mastering even the basics will make you significantly more productive.
This guide covers regex fundamentals from the ground up: syntax building blocks, the most common real-world patterns, flags that change matching behavior, and performance tips to avoid costly mistakes. To test patterns as you learn, use our Regex Tester — it provides live matching, group highlighting, and explanation of your pattern.
What Is Regex and When Should You Use It?
A regular expression is a sequence of characters that defines a search pattern. When you apply that pattern to a string, the regex engine scans the text and reports matches. Regex is the right tool when you need to:
- Validate input — check if an email, URL, phone number, or date matches an expected format
- Search and extract — pull specific data from logs, HTML, CSV, or unstructured text
- Find and replace — transform text patterns across files (renaming variables, reformatting dates)
- Parse structured text — extract fields from log lines, URLs, or configuration values
Regex is not the right tool for parsing nested structures like HTML or JSON — use a proper parser for those. The classic advice: "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." Use regex for flat pattern matching, not complex parsing.
Basic Syntax: Character Classes, Quantifiers, and Anchors
Character Classes
Character classes match a single character from a defined set:
| Pattern | Matches | Example |
|---|---|---|
. |
Any character except newline | c.t matches "cat", "cot", "c9t" |
\d |
Any digit (0-9) | \d{3} matches "123", "456" |
\w |
Word character (a-z, A-Z, 0-9, _) | \w+ matches "hello", "user_1" |
\s |
Whitespace (space, tab, newline) | \s+ matches one or more spaces |
[abc] |
Any character in the set | [aeiou] matches any vowel |
[^abc] |
Any character NOT in the set | [^0-9] matches non-digits |
Uppercase versions negate: \D matches non-digits, \W matches non-word characters, \S matches non-whitespace.
Quantifiers
Quantifiers specify how many times the preceding element should match:
| Quantifier | Meaning | Example |
|---|---|---|
* |
0 or more | ab*c matches "ac", "abc", "abbc" |
+ |
1 or more | ab+c matches "abc", "abbc" but not "ac" |
? |
0 or 1 (optional) | colou?r matches "color" and "colour" |
{n} |
Exactly n times | \d{4} matches exactly 4 digits |
{n,m} |
Between n and m times | \d{2,4} matches 2-4 digits |
{n,} |
n or more times | \w{8,} matches 8+ word chars |
Anchors
Anchors match a position, not a character:
^— start of string (or line withmflag)$— end of string (or line withmflag)\b— word boundary (between a\wand\Wcharacter)
For example, ^\d{5}$ matches a string that is exactly a 5-digit ZIP code — nothing more, nothing less. Without the anchors, \d{5} would also match inside "The ZIP is 90210 for that area."
Common Regex Patterns You Will Actually Use
Here are battle-tested patterns for the most frequently needed validations:
Email Address
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Matches: user@example.com, first.last@company.co.uk. Note: for production email validation, also send a confirmation email — regex alone cannot verify deliverability.
URL
^https?:\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(\/[^\s]*)?$
Matches: https://example.com, http://docs.example.com/api/v2. For more permissive URL matching, consider the URL constructor in JavaScript which handles edge cases better.
Phone Number (US)
^(\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$
Matches: (555) 123-4567, 555.123.4567, +1-555-123-4567, 5551234567.
IPv4 Address
^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$
Matches valid IPv4 addresses where each octet is 0-255. This is more accurate than the simpler \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} which would also match invalid addresses like 999.999.999.999.
Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Matches ISO 8601 dates like 2026-03-29. Note: this validates format but not calendar validity (it will accept 2026-02-31). Use a date library for full validation.
Test any of these patterns with real data in our Regex Tester, or browse our Regex Library for more pre-built patterns.
Regex Flags: g, i, m, s
Flags modify how the regex engine interprets your pattern. In JavaScript, flags are placed after the closing slash: /pattern/flags.
g(global) — find all matches in the string, not just the first. Withoutg, the regex stops after the first match. Essential when usingString.matchAll()orString.replace()to replace all occurrences.i(case-insensitive) —/hello/imatches "Hello", "HELLO", and "hElLo". Use when case does not matter, such as matching HTML tags or user input.m(multiline) — makes^and$match the start and end of each line, not the entire string. Critical when processing multi-line text like log files or configuration.s(dotAll) — makes.match newline characters (\n) as well. By default,.matches any character except newlines. Useswhen your pattern needs to span multiple lines.
Flags can be combined: /pattern/gim applies global, case-insensitive, and multiline matching simultaneously.
Performance Tips: Avoiding Regex Pitfalls
Regex engines use backtracking, which means poorly written patterns can be exponentially slow on certain inputs. This is called catastrophic backtracking (or ReDoS — Regular Expression Denial of Service). Here is how to avoid it:
1. Avoid Nested Quantifiers
Patterns like (a+)+ or (a*)* create exponential backtracking. The engine tries every possible way to split the input between the inner and outer quantifiers. On the input "aaaaaaaaaaaaaab", this can take millions of steps.
Fix: Flatten to a+ — there is no reason for the nested group.
2. Use Non-Capturing Groups
If you do not need to extract the matched text, use (?:...) instead of (...). Capturing groups store the matched text in memory, which is unnecessary overhead when you only need grouping for alternation or quantification.
// Slower (capturing)
/(https?):\/\/(www\.)?(.+)/
// Faster (non-capturing where not needed)
/(?:https?):\/\/(?:www\.)?(.+)/
3. Be Specific, Not Greedy
Instead of .* (match everything), use a more specific character class. For example, to match a quoted string, "[^"]*" is much faster than ".*?" because it cannot backtrack past the closing quote.
4. Anchor When Possible
Adding ^ and $ anchors tells the engine exactly where to start and stop, eliminating unnecessary scanning. If you know the pattern must match the entire string, always anchor it.
5. Set Timeouts in Production
When running user-supplied regex (search forms, filters), always set a timeout. In JavaScript, use a Web Worker with a timeout. In Python, use the regex library's timeout parameter. Never let untrusted regex run unbounded.
Test Your Regex Patterns
Our Regex Tester lets you write patterns, see matches highlighted in real time, inspect capturing groups, and get plain-English explanations of what your pattern does. It supports all JavaScript regex flags and runs entirely in your browser — your patterns and test data never leave your device.