Regex Cheatsheet
Comprehensive regular expression reference with 80+ patterns. Search, copy, and test regex patterns across JavaScript, Python, and PHP flavors.
Understanding Regular Expressions
Regular expressions are a powerful pattern-matching language used across virtually every programming environment. Originally developed in the 1950s by mathematician Stephen Kleene, regex has become an indispensable tool for developers, data engineers, and system administrators who need to search, validate, or transform text efficiently.
At their core, regular expressions work by defining a search pattern using a combination of literal characters and special metacharacters. The regex engine then scans a target string from left to right, attempting to match the pattern at each position. Understanding how the engine processes patterns — including concepts like backtracking, greedy vs. lazy matching, and anchoring — is key to writing efficient expressions that perform well even on large inputs.
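The difference between greedy and lazy matching, and the effect of anchoring, can be seen in a minimal Python sketch. The sample string and variable names here are illustrative assumptions, not part of the cheatsheet itself.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* consumes as much as it can, then backtracks to the LAST '>'.
greedy = re.findall(r"<.*>", text)

# Lazy: .*? consumes as little as it can, stopping at the FIRST '>'.
lazy = re.findall(r"<.*?>", text)

# Anchoring: without re.MULTILINE, ^ can only succeed at position 0.
anchored_hit = re.search(r"^<b>", text)
anchored_miss = re.search(r"^<i>", text)

print(greedy)          # ['<b>bold</b> and <i>italic</i>']
print(lazy)            # ['<b>', '</b>', '<i>', '</i>']
print(anchored_miss)   # None
```

The greedy pattern returns the whole string as one match because the engine backtracks only as far as it must, which is also why greedy wildcards can be slow on long inputs with many candidate end points.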
Modern regex flavors in JavaScript, Python, and PHP share a common foundation but differ in feature support. JavaScript added lookbehind assertions and named capture groups, written (?<name>...), in ES2018, while Python uses a slightly different syntax for named groups ((?P<name>...)). PHP, built on the PCRE library, accepts both named-group spellings and supports advanced features like atomic groups and recursive patterns. JavaScript supports neither of those, and Python's built-in re module gained atomic groups only in version 3.11 and still has no recursion. This cheatsheet highlights these differences so you can write portable patterns across languages.
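As a small portability check, the sketch below tests whether Python's built-in re module accepts each named-group spelling. The helper name `compiles` is hypothetical and exists only for this illustration.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Python-style named group: accepted.
python_style = compiles(r"(?P<year>\d{4})")

# JavaScript-style named group: Python's re reads "(?<" as the start of a
# lookbehind and rejects what follows.
js_style = compiles(r"(?<year>\d{4})")

print(python_style, js_style)  # True False
```

A pattern copied from JavaScript therefore needs its named groups rewritten before it will compile in Python.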
Essential Regex Metacharacters
| Character | Name | Description | Example |
|---|---|---|---|
| . | Dot | Matches any single character except newline (by default) | a.c matches "abc", "a1c" |
| ^ | Caret | Matches the start of the string, or of each line in multiline mode | ^Hello matches "Hello world" |
| $ | Dollar | Matches the end of the string, or of each line in multiline mode | end$ matches "the end" |
| * | Asterisk | Matches zero or more of the preceding element | ab*c matches "ac", "abc", "abbc" |
| + | Plus | Matches one or more of the preceding element | ab+c matches "abc", "abbc" |
| ? | Question mark | Makes the preceding element optional (0 or 1) | colou?r matches "color", "colour" |
| \d | Digit shorthand | Matches any digit character (0-9; some flavors also match other Unicode digits) | \d{3} matches "123" |
| \w | Word shorthand | Matches word characters (letters, digits, underscore) | \w+ matches "hello_1" |
| [ ] | Character class | Matches any one character inside the brackets | [aeiou] matches any vowel |
| ( ) | Capture group | Groups patterns and captures the matched text | (\d+)-(\d+) captures both numbers |
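The "string or line" behavior of ^ and the newline exception for the dot both depend on flags. The following Python sketch shows the defaults next to re.MULTILINE and re.DOTALL. The sample text is an assumption chosen for illustration.

```python
import re

text = "Hello world\nHello again"

# By default ^ matches only at the very start of the string.
default_starts = re.findall(r"^Hello", text)

# With re.MULTILINE, ^ also matches right after each newline.
multiline_starts = re.findall(r"^Hello", text, re.MULTILINE)

# By default the dot does not match a newline...
dot_default = re.search(r"a.c", "a\nc")

# ...unless re.DOTALL is set.
dot_all = re.search(r"a.c", "a\nc", re.DOTALL)

print(default_starts)    # ['Hello']
print(multiline_starts)  # ['Hello', 'Hello']
print(dot_default)       # None
```

JavaScript and PHP expose the same two switches as the m and s pattern flags.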
Frequently Asked Questions
What are regular expressions?
Regular expressions (often shortened to regex or regexp) are sequences of characters that define search patterns. They are used in virtually every programming language — including JavaScript, Python, PHP, Java, Go, and Ruby — to find, match, validate, and manipulate text. Common applications include form input validation, log file parsing, search-and-replace operations across codebases, URL routing, and extracting structured data from unstructured text.
What is the difference between * and + in regex?
Both * and + are quantifiers, but they differ in their minimum match requirement. The asterisk (*) matches zero or more occurrences of the preceding element, meaning the element is entirely optional. The plus sign (+) matches one or more occurrences, requiring at least one match. For example, ab*c matches "ac" (zero b's), "abc", and "abbc", while ab+c only matches "abc" and "abbc" — it will not match "ac" because at least one "b" is required.
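The same comparison can be run directly. This short Python sketch uses re.fullmatch so that each candidate string must match in its entirety. The variable names are illustrative.

```python
import re

candidates = ["ac", "abc", "abbc"]

# ab*c: zero or more b's, so "ac" qualifies.
star_matches = [s for s in candidates if re.fullmatch(r"ab*c", s)]

# ab+c: at least one b is required, so "ac" is rejected.
plus_matches = [s for s in candidates if re.fullmatch(r"ab+c", s)]

print(star_matches)  # ['ac', 'abc', 'abbc']
print(plus_matches)  # ['abc', 'abbc']
```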
How do I match an email address with regex?
A widely used basic email pattern is [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}. This matches a local part containing letters, digits, and common special characters, followed by an @ symbol and a domain name with a TLD of at least two characters. Keep in mind that fully RFC 5322-compliant email validation with regex is extraordinarily complex. For production applications, it is generally better to use a simplified regex for format checking combined with a verification email to confirm the address is valid and accessible.
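A minimal sketch of that format check in Python follows. The function name `looks_like_email` is hypothetical. Using fullmatch means the whole input must fit the pattern, not just a substring of it.

```python
import re

# The basic pattern quoted above, compiled once for reuse.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def looks_like_email(value: str) -> bool:
    """Format check only; it does not prove the address exists."""
    return EMAIL_RE.fullmatch(value) is not None

print(looks_like_email("user.name+tag@example.com"))  # True
print(looks_like_email("user@localhost"))             # False (no dotted TLD)
print(looks_like_email("a@b.c"))                      # False (TLD too short)
print(looks_like_email("a@b..com"))                   # True (a known gap)
```

The last case shows why the pattern is only a first filter: it accepts a domain with consecutive dots, which a confirmation email would catch.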
What are capture groups?
Capture groups are portions of a regex pattern enclosed in parentheses () that isolate and extract specific parts of a match. When the pattern (\d{4})-(\d{2})-(\d{2}) matches the string "2024-12-31", its three capture groups hold "2024", "12", and "31" respectively. You can reference these captured values using backreferences (\1, \2) within the same pattern, or access them programmatically after matching. Named capture groups, written as (?<year>\d{4}) in JavaScript and PHP or (?P<year>\d{4}) in Python, make code more readable by letting you reference matches by name instead of number.
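The Python sketch below walks through numbered groups, named groups (in Python's (?P<name>...) spelling), a backreference inside a pattern, and group references in a replacement string. The sample strings are assumptions for illustration.

```python
import re

# Numbered groups: groups() returns them in order.
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Due 2024-12-31")
parts = m.groups()

# Named groups: access by name instead of position.
n = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", "2024-12-31")
year = n.group("year")

# Backreference in the pattern: \1 must repeat what group 1 captured.
repeated = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Group references in a replacement: reorder the date parts.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2024-12-31")

print(parts)      # ('2024', '12', '31')
print(year)       # 2024
print(repeated)   # is
print(reordered)  # 31/12/2024
```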
What is a lookahead in regex?
A lookahead is a zero-width assertion — it checks whether a pattern exists at a given position without actually consuming any characters in the string. A positive lookahead (?=...) succeeds if the enclosed pattern matches ahead, while a negative lookahead (?!...) succeeds if it does not. For example, \d(?=px) matches a digit only when followed by "px", but "px" itself is not included in the match. Lookaheads are especially useful for password validation (e.g., requiring at least one digit, one uppercase letter, and one special character) and for complex conditional matching scenarios.
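Both kinds of lookahead, plus a password check built from stacked positive lookaheads, can be sketched in Python as follows. The password rule (eight or more characters with a digit, an uppercase letter, and a non-alphanumeric character) and the name `is_strong` are assumptions made for this example, not a recommended policy.

```python
import re

sizes = "12px 3em 4px"

# Positive lookahead: a digit immediately followed by "px"; "px" is not consumed.
before_px = re.findall(r"\d(?=px)", sizes)

# Negative lookahead: a digit NOT immediately followed by "px".
not_before_px = re.findall(r"\d(?!px)", sizes)

# Each lookahead scans from the start without consuming input,
# so all three conditions are checked independently.
PASSWORD_RE = re.compile(r"(?=.*\d)(?=.*[A-Z])(?=.*[^A-Za-z0-9]).{8,}")

def is_strong(password: str) -> bool:
    return PASSWORD_RE.fullmatch(password) is not None

print(before_px)                # ['2', '4']
print(not_before_px)            # ['1', '3']
print(is_strong("Passw0rd!"))   # True
print(is_strong("password1!"))  # False (no uppercase letter)
```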