Regular expressions — syntax, quantifiers, groups, lookahead & common patterns
Reference
# Basics
. # any character (except newline)
\d # digit [0-9]
\D # non-digit
\w # word char [a-zA-Z0-9_]
\W # non-word char
\s # whitespace [ \t\n\r\f]
\S # non-whitespace
\ # escape special char
\n # newline
\t # tab
# Quantifiers
* # 0 or more
+ # 1 or more
? # 0 or 1 (optional)
{3} # exactly 3
{2,5} # 2 to 5
{2,} # 2 or more
{,5} # up to 5 (Python; not portable, e.g. JavaScript needs {0,5})
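A minimal sketch of how the quantifiers above behave, using Python's `re` module. The sample strings and the helper name `matching` are made up for illustration; `re.fullmatch` is used so that a pattern counts only when it covers the whole string.

```python
import re

# Hypothetical sample strings, chosen only to exercise each quantifier.
samples = ["", "a", "aa", "aaaaa", "aaaaaa"]

def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

star = matching(r"a*")         # zero or more: every sample, including ""
plus = matching(r"a+")         # one or more: everything except ""
optional = matching(r"a?")     # zero or one: "" and "a"
ranged = matching(r"a{2,5}")   # between 2 and 5 repetitions
at_least = matching(r"a{2,}")  # 2 or more repetitions
```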
# Greedy vs Lazy
.* # greedy (match as much as possible)
.*? # lazy (match as little as possible)
.+? # lazy one or more
# Anchors
^ # start of string/line
$ # end of string/line
\b # word boundary
\B # non-word boundary
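Whether `^` and `$` mean "string" or "line" depends on the multiline flag. The sketch below, in Python with an invented sample text, shows the difference and also how `\b` restricts a match to a whole word.

```python
import re

# Invented sample text: two lines starting with "Hello", one containing "cat".
text = "Hello world\nHello again\nconcatenate the cat"

# Without MULTILINE, ^ matches only at the very start of the string.
single = re.findall(r"^Hello", text)

# With MULTILINE, ^ also matches right after each newline.
multi = re.findall(r"^Hello", text, re.MULTILINE)

# \b keeps "cat" from matching inside "concatenate".
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]
```

Only the standalone word at the end of the text is reported in `whole_word`; the "cat" inside "concatenate" is skipped because it has word characters on both sides.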
# Examples
^Hello # line starting with "Hello"
world$ # line ending with "world"
\bcat\b # whole word "cat" (not "catch")
# Groups
(abc) # capturing group
(?:abc) # non-capturing group
a|b # alternation (a OR b)
\1 # backreference to group 1
# Named groups
(?P<name>pattern) # Python named group
(?<name>pattern) # JS/C# named group
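As a rough illustration of named groups and backreferences in Python: the group names `prefix` and `line` and the sample strings are assumptions made for this sketch, not part of any standard.

```python
import re

# Named groups: the names "prefix" and "line" are hypothetical.
phone = re.compile(r"(?P<prefix>\d{3})-(?P<line>\d{4})")
m = phone.search("call 555-1234 today")
parts = m.groupdict()  # maps each group name to the text it captured

# \1 refers back to whatever group 1 captured, so this finds a doubled word.
doubled = re.search(r"\b(\w+)\s+\1\b", "it was the the best")
repeated_word = doubled.group(1)
```

Named groups can also be read individually with `m.group("prefix")`, which tends to be easier to maintain than positional indexes when a pattern grows.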
# Examples
(\d{3})-(\d{4}) # capture phone parts
(cat|dog)s? # "cat", "cats", "dog", "dogs"
(\w+)\s+\1 # repeated word
# Character classes
[abc] # a, b, or c
[^abc] # NOT a, b, or c
[a-z] # lowercase letter
[A-Z] # uppercase letter
[0-9] # digit
[a-zA-Z] # any letter
[a-zA-Z0-9_] # same as \w
# POSIX classes (in some engines; valid only inside a bracket expression)
[[:alpha:]] # letters
[[:digit:]] # digits
[[:alnum:]] # letters + digits
[[:space:]] # whitespace
# Flags
i # case insensitive
g # global (all matches; JavaScript, in Python use re.findall/re.finditer)
m # multiline (^ $ match line boundaries)
s # dotall (. matches newline)
x # verbose (ignore whitespace, allow comments)
u # unicode
# Usage (JavaScript)
/pattern/gi
# Usage (Python)
re.compile(r"pattern", re.IGNORECASE | re.MULTILINE)
# Lookahead (match if followed by)
foo(?=bar) # "foo" followed by "bar"
foo(?!bar) # "foo" NOT followed by "bar"
# Lookbehind (match if preceded by)
(?<=foo)bar # "bar" preceded by "foo"
(?<!foo)bar # "bar" NOT preceded by "foo"
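Lookarounds check context without consuming it, so the surrounding text never appears in the match. A small Python sketch with invented input strings:

```python
import re

css = "width: 120px; height: 3em; margin: 8px"
# Lookahead: the digits must be followed by "px", but "px" is not consumed.
px_values = re.findall(r"\d+(?=px)", css)

prices = "cost $45, qty 12, fee $7"
# Lookbehind: the digits must be preceded by a literal "$".
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negative lookahead: "foo" not immediately followed by "bar".
negative = re.findall(r"foo(?!bar)\w*", "foobar foobaz food")
```

Note that Python's `re` module requires lookbehind patterns to have a fixed width, so `(?<=\$)` is fine but something like `(?<=\$+)` is rejected.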
# Examples
\d+(?=px) # number before "px"
(?<=\$)\d+ # number after "$"
(?<!\d)\d{3}(?!\d) # exactly 3 digits
# Email
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
# URL
https?://[\w.-]+(/[\w./-]*)?
# IP Address (IPv4 shape only; octets are not range-checked, so 999.999.999.999 matches)
\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
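The IP pattern checks only the dotted-quad shape. One way to tighten it, sketched here in Python, is to match the shape with the regex and then range-check each octet in code. The function name `is_ipv4` is hypothetical.

```python
import re

# Same shape as the pattern above, without \b because fullmatch anchors it.
IP_SHAPE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

def is_ipv4(text):
    """Return True if text is dotted-quad shaped and every octet is 0-255."""
    if not IP_SHAPE.fullmatch(text):
        return False
    return all(int(octet) <= 255 for octet in text.split("."))
```

For production use, Python's standard `ipaddress.ip_address` is usually a safer choice than a hand-rolled check; the sketch is only meant to show where the regex stops and ordinary code takes over.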
# Phone (US)
(\+1[-.])?\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}
# Date (YYYY-MM-DD)
\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])
# HTML tag
<([a-z]+)([^<]*)(?:>(.*?)</\1>|/>)
# Password (8+ chars, upper, lower, digit)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$
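The password pattern stacks three lookaheads at the start of the string, each scanning ahead for one required character kind, before `.{8,}` enforces the length. A minimal Python sketch; the function name `is_strong` and the sample passwords are assumptions.

```python
import re

PASSWORD = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$")

def is_strong(candidate):
    """Return True if candidate has 8+ chars with a lower, an upper and a digit."""
    return PASSWORD.match(candidate) is not None

ok = is_strong("Passw0rdX")        # has lower, upper, digit, 9 chars
no_upper = is_strong("password1")  # fails the (?=.*[A-Z]) lookahead
too_short = is_strong("Ab1")       # fails the .{8,} length requirement
```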
# Remove HTML tags
<[^>]*>
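The tag-stripping pattern is also a convenient place to see greedy and lazy matching side by side. The sketch below uses Python and an invented HTML snippet; regexes like this are fine for quick cleanup but are not a substitute for a real HTML parser.

```python
import re

html = "<p>Hello <b>world</b></p>"

# The negated class [^>]* stops at the first ">", so each tag is removed alone.
stripped = re.sub(r"<[^>]*>", "", html)

# Greedy .* runs to the LAST ">", swallowing the whole string in one match.
greedy = re.findall(r"<.*>", html)

# Lazy .*? stops at the FIRST ">", giving one match per tag.
lazy = re.findall(r"<.*?>", html)
```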