Regex Cheatsheet

Regular expressions — syntax, quantifiers, groups, lookahead & common patterns


Characters

.         # any character (except newline)
\d        # digit [0-9]
\D        # non-digit
\w        # word char [a-zA-Z0-9_]
\W        # non-word char
\s        # whitespace [ \t\n\r\f\v]
\S        # non-whitespace
\         # escape special char
\n        # newline
\t        # tab
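
A minimal sketch of the shorthand classes above, using Python's re module. The sample string is made up for illustration.

```python
import re

# Hypothetical sample text, chosen to contain digits, words, a tab and a newline.
text = "Order 42 shipped\tto Bob_7 on 2024-01-05.\n"

digits = re.findall(r"\d+", text)       # runs of digits
words = re.findall(r"\w+", text)        # runs of word characters (letters, digits, _)
spaces = re.findall(r"\s", text)        # individual whitespace characters
literal_dot = re.findall(r"\.", text)   # escaped dot matches only a literal "."

print(digits)       # ['42', '7', '2024', '01', '05']
print(words)        # ['Order', '42', 'shipped', 'to', 'Bob_7', 'on', '2024', '01', '05']
print(len(spaces))  # 7
print(literal_dot)  # ['.']
```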

Quantifiers

*         # 0 or more
+         # 1 or more
?         # 0 or 1 (optional)
{3}       # exactly 3
{2,5}     # 2 to 5
{2,}      # 2 or more
{,5}      # up to 5 (Python; JavaScript treats this literally, write {0,5})

# Greedy vs Lazy
.*        # greedy (match as much as possible)
.*?       # lazy (match as little as possible)
.+?       # lazy one or more
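
The greedy/lazy difference is easiest to see side by side. The sketch below uses Python's re module on a made-up HTML fragment, and also shows a bounded quantifier.

```python
import re

# Hypothetical input for illustration.
html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the LAST "</b>", so both tags end up in a single match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the FIRST "</b>", giving one match per tag.
lazy = re.findall(r"<b>.*?</b>", html)

# Bounded: whole numbers that are 2 to 4 digits long.
bounded = re.findall(r"\b\d{2,4}\b", "7 42 123 2024 31415")

# In Python, {,3} means "0 to 3"; other engines may read it differently.
up_to_three = re.fullmatch(r"a{,3}", "aa") is not None
too_many = re.fullmatch(r"a{,3}", "aaaa") is not None

print(greedy)   # ['<b>one</b> and <b>two</b>']
print(lazy)     # ['<b>one</b>', '<b>two</b>']
print(bounded)  # ['42', '123', '2024']
print(up_to_three, too_many)  # True False
```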

Anchors & Boundaries

^         # start of string (start of each line with the m flag)
$         # end of string (end of each line with the m flag)
\b        # word boundary
\B        # non-word boundary

# Examples
^Hello    # line starting with "Hello"
world$    # line ending with "world"
\bcat\b   # whole word "cat" (not "catch")
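
A short sketch of how the anchors behave in Python's re module, with and without the multiline flag. The sample text is invented for illustration.

```python
import re

# Hypothetical three-line input.
text = "Hello world\nhello again, world\ncatch the cat"

# Without MULTILINE, ^ only matches at the very start of the string.
start_only = re.findall(r"^[Hh]ello", text)

# With MULTILINE, ^ and $ also match at each line boundary.
every_line_start = re.findall(r"^[Hh]ello", text, re.MULTILINE)
line_ends = re.findall(r"world$", text, re.MULTILINE)

# Without MULTILINE, $ only matches at the end of the string ("...cat").
string_end = re.findall(r"world$", text)

# \b keeps "cat" from matching inside "catch".
whole_word = re.findall(r"\bcat\b", text)

print(start_only)        # ['Hello']
print(every_line_start)  # ['Hello', 'hello']
print(line_ends)         # ['world', 'world']
print(string_end)        # []
print(whole_word)        # ['cat']
```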

Groups & Alternation

(abc)     # capturing group
(?:abc)   # non-capturing group
a|b       # alternation (a OR b)
\1        # backreference to group 1

# Named groups
(?P<name>pattern)   # Python named group
(?<name>pattern)    # JS/C# named group

# Examples
(\d{3})-(\d{4})     # capture phone parts
(cat|dog)s?          # "cat", "cats", "dog", "dogs"
(\w+)\s+\1          # repeated word
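
The group examples above can be tried out as follows. This is a sketch using Python's re module; the group names area and line, and the sample strings, are assumptions made for illustration.

```python
import re

# Numbered capturing groups.
phone = re.search(r"(\d{3})-(\d{4})", "call 555-1234 today")
parts = phone.groups()

# Named groups (Python syntax); "area" and "line" are illustrative names.
named = re.search(r"(?P<area>\d{3})-(?P<line>\d{4})", "call 555-1234 today")
by_name = named.groupdict()

# Non-capturing group: findall returns whole matches, not group contents.
animals = re.findall(r"(?:cat|dog)s?", "cats and a dog")

# Backreference: \1 must repeat exactly what group 1 captured.
repeated = re.search(r"(\w+)\s+\1", "it was was fine")
repeated_word = repeated.group(1)

print(parts)          # ('555', '1234')
print(by_name)        # {'area': '555', 'line': '1234'}
print(animals)        # ['cats', 'dog']
print(repeated_word)  # was
```

Note that with a capturing group, as in (cat|dog)s?, re.findall returns only the group contents ('cat', 'dog'), which is why the sketch uses the non-capturing form.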

Character Classes

[abc]     # a, b, or c
[^abc]    # NOT a, b, or c
[a-z]     # lowercase letter
[A-Z]     # uppercase letter
[0-9]     # digit
[a-zA-Z]  # any letter
[a-zA-Z0-9_]  # same as \w

# POSIX classes (in some engines; only valid inside a bracket expression)
[[:alpha:]]   # letters
[[:digit:]]   # digits
[[:alnum:]]   # letters + digits
[[:space:]]   # whitespace
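
A quick sketch of bracket classes in Python. Python's re module does not support the POSIX class names, so the example sticks to explicit sets and ranges; the sample strings are made up.

```python
import re

vowels = re.findall(r"[aeiou]", "regex")       # any one of the listed characters
non_vowels = re.findall(r"[^aeiou]", "regex")  # ^ as first character negates the set

# A range followed by a quantifier: capitalized words.
capitalized = re.findall(r"[A-Z][a-z]+", "Ada Lovelace met Bob")

# Several ranges in one set: hexadecimal digits.
is_hex = re.fullmatch(r"[0-9a-fA-F]+", "1aF") is not None
not_hex = re.fullmatch(r"[0-9a-fA-F]+", "1aG") is not None

print(vowels)           # ['e', 'e']
print(non_vowels)       # ['r', 'g', 'x']
print(capitalized)      # ['Ada', 'Lovelace', 'Bob']
print(is_hex, not_hex)  # True False
```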

Flags / Modifiers

i    # case insensitive
g    # global (all matches; JavaScript only, in Python use re.findall / re.finditer)
m    # multiline (^ $ match line boundaries)
s    # dotall (. matches newline)
x    # verbose (ignore whitespace, allow comments; not available in JavaScript)
u    # unicode

# Usage (JavaScript)
/pattern/gi
# Usage (Python)
re.compile(r"pattern", re.IGNORECASE | re.MULTILINE)
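
The flags above map to constants in Python's re module: i is re.IGNORECASE, m is re.MULTILINE, s is re.DOTALL, x is re.VERBOSE. A minimal sketch on an invented input:

```python
import re

# Hypothetical three-line input.
text = "Start\nstart again\nEND"

# i + m: case-insensitive, and ^ matches at the start of every line.
line_starts = re.findall(r"^start", text, re.IGNORECASE | re.MULTILINE)

# Without s (DOTALL), "." stops at a newline, so nothing is found.
no_dotall = re.search(r"Start.*END", text)

# With s (DOTALL), "." also matches "\n".
with_dotall = re.search(r"Start.*END", text, re.DOTALL)

# x (VERBOSE): whitespace in the pattern is ignored and comments are allowed.
year_month = re.compile(
    r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
    """,
    re.VERBOSE,
)
ym = year_month.search("released 2024-01").groups()

print(line_starts)                  # ['Start', 'start']
print(no_dotall)                    # None
print(with_dotall.group() == text)  # True
print(ym)                           # ('2024', '01')
```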

Lookaround

# Lookahead (match if followed by)
foo(?=bar)      # "foo" followed by "bar"
foo(?!bar)      # "foo" NOT followed by "bar"

# Lookbehind (match if preceded by)
(?<=foo)bar     # "bar" preceded by "foo"
(?<!foo)bar     # "bar" NOT preceded by "foo"

# Examples
\d+(?=px)       # number before "px"
(?<=\$)\d+      # number after "$"
(?<!\d)\d{3}(?!\d)  # run of exactly 3 digits (not part of a longer number)
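
Lookarounds check the surrounding text without including it in the match. The sketch below runs the examples above with Python's re module; the input strings are made up.

```python
import re

# Positive lookahead: digits followed by "px"; "px" is not part of the match.
pixels = re.findall(r"\d+(?=px)", "width: 100px; height: 20em; margin: 8px")

# Negative lookahead: "foo" only when "bar" does not follow.
not_bar = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# Positive lookbehind: digits preceded by a dollar sign.
prices = re.findall(r"(?<=\$)\d+", "cost $45, qty 3, fee $7")

# Both negative lookarounds: digit runs of length exactly 3.
three_digits = re.findall(r"(?<!\d)\d{3}(?!\d)", "123 4567 89 012")

print(pixels)        # ['100', '8']
print(not_bar)       # [7]
print(prices)        # ['45', '7']
print(three_digits)  # ['123', '012']
```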

Common Patterns

# Email
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}

# URL
https?://[\w.-]+(/[\w./-]*)?

# IP Address (IPv4 shape only; does not check that each octet is 0-255)
\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

# Phone (US)
(\+1[-.])?\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}

# Date (YYYY-MM-DD; does not check days per month, e.g. accepts 02-30)
\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])

# HTML tag
<([a-z]+)([^<]*)(?:>(.*?)</\1>|/>)

# Password (8+ chars, upper, lower, digit)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$

# Remove HTML tags
<[^>]*>
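
A sketch of how some of these patterns might be used from Python. The constant names and the is_ipv4 helper are hypothetical, introduced here for illustration. The patterns check shape only, so the IPv4 helper adds a range check in plain Python.

```python
import re

# Patterns copied from the list above; the constant names are illustrative.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
IPV4 = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")
DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")
PASSWORD = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$")
TAG = re.compile(r"<[^>]*>")


def is_ipv4(candidate: str) -> bool:
    """Hypothetical helper: regex for the shape, then a 0-255 check per octet."""
    if not IPV4.fullmatch(candidate):
        return False
    return all(0 <= int(part) <= 255 for part in candidate.split("."))


email_ok = EMAIL.fullmatch("ada@example.com") is not None
email_bad = EMAIL.fullmatch("ada@example") is not None

ip_ok = is_ipv4("192.168.0.1")
ip_bad = is_ipv4("999.1.1.1")  # right shape, octet out of range

date_shape = DATE.fullmatch("2024-02-30") is not None  # shape passes, not a real date
date_bad = DATE.fullmatch("2024-13-01") is not None

strong = PASSWORD.match("Passw0rdX") is not None
weak = PASSWORD.match("password") is not None

plain = TAG.sub("", "<p>Hi <b>there</b></p>")

print(email_ok, email_bad)   # True False
print(ip_ok, ip_bad)         # True False
print(date_shape, date_bad)  # True False
print(strong, weak)          # True False
print(plain)                 # Hi there
```

For input that must be correct (real email delivery, real calendar dates, real HTML), a dedicated parser or validator is usually a safer choice than a regex alone.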