BRE vs. ERE
When users transition from modern programming languages (Python, JS) to CLI grep, they are often frustrated when standard regex patterns fail to work.
This is because grep defaults to Basic Regular Expressions (BRE), whereas most modern languages use a Perl-style syntax that is much closer to Extended Regular Expressions (ERE) or PCRE.
Basic Regular Expressions (BRE)
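In BRE mode (the default grep behavior), the characters ?, +, {, }, |, ( and ) are treated as literals unless they are escaped with a backslash. POSIX only guarantees the escaped forms \{ \} and \( \) in BRE; \?, \+, and \| are extensions that GNU grep supports, so they may not work in every grep implementation.
This leads to "Leaning Toothpick Syndrome": ugly, unreadable patterns filled with backslashes.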
Example: The OR Operator (|)
If you want to search for "cat" OR "dog":
# WRONG (BRE): This searches for the literal string "cat|dog"
grep "cat|dog" file.txt
# CORRECT (BRE, GNU grep): escape the pipe so it acts as the alternation operator
grep "cat\|dog" file.txt
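The difference is easiest to see on a small input. The following is a minimal sketch, assuming a hypothetical four-line sample (the `sample` function and the variable names are illustrative only). It also shows a portable alternative: passing one -e option per pattern, which avoids relying on the `\|` extension.

```shell
#!/bin/sh
# Hypothetical sample input: one line contains the literal text "cat|dog".
sample() {
    printf '%s\n' 'my cat sleeps' 'the dog barks' 'literal cat|dog text' 'a bird sings'
}

# Unescaped pipe in BRE: only the line containing the literal "cat|dog" is counted.
literal_hits=$(sample | grep -c 'cat|dog')

# Escaped pipe (an extension supported by GNU grep): lines containing "cat" or "dog".
gnu_hits=$(sample | grep -c 'cat\|dog')

# Portable alternative: one -e per pattern, no alternation operator required.
portable_hits=$(sample | grep -c -e 'cat' -e 'dog')

echo "literal=$literal_hits escaped=$gnu_hits portable=$portable_hits"
```

The -e form gives the same result as alternation for simple cases like this one, at the cost of not being usable inside a larger grouped expression.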
Example: Grouping and Quantifiers
# BRE: Extremely hard to read
grep "\(cat\|dog\)\+" file.txt
Extended Regular Expressions (ERE) and -E
To fix this, pass the -E flag to tell grep to use Extended Regular Expressions. (Historically, this was the separate egrep command, which is now deprecated in favor of grep -E.)
In ERE mode, meta-characters act as operators by default. If you actually want to match a literal + or |, you must escape it. This is much closer to the syntax of modern programming languages, although ERE still lacks some Perl-style features.
# ERE: Clean, readable OR operator
grep -E "cat|dog" file.txt
# ERE: Clean grouping and quantifiers
grep -E "(cat|dog)+" file.txt
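The reverse rule, escaping a meta-character to make it literal in ERE, can be sketched as follows. This is an illustration on hypothetical sample lines; the `sample` function and the variable names are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sample input with literal "+" and "|" characters.
sample() {
    printf '%s\n' '1+1=2' '11=2' 'a|b' 'ab'
}

# Unescaped +: "one or more 1s followed by a 1", so only "11=2" matches.
operator_plus=$(sample | grep -E '1+1')

# Escaped +: a literal plus sign, so only "1+1=2" matches.
literal_plus=$(sample | grep -E '1\+1')

# A bracket expression is another way to make a meta-character literal.
literal_pipe=$(sample | grep -E 'a[|]b')

echo "operator: $operator_plus"
echo "literal plus: $literal_plus"
echo "literal pipe: $literal_pipe"
```

Bracket expressions such as [|] or [+] read the same way in BRE and ERE, which makes them a convenient choice when a pattern may be reused in both modes.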
Prefer grep -E when writing regular expressions (egrep still exists on many systems but is deprecated). It is closer to modern regex engines and saves you from escaping every operator. For simple, literal string searches, plain grep works, and grep -F treats the entire pattern as a fixed string.
Perl Compatible Regular Expressions (PCRE) and -P
While ERE (-E) is a massive improvement, it still lacks some advanced features found in modern regex engines, such as lookaheads ((?=...)), lookbehinds ((?<=...)), and the \d shorthand for digits. (GNU grep accepts \w for word characters in BRE and ERE as an extension, but \d requires PCRE.)
GNU grep includes a -P flag to enable PCRE mode.
# Use PCRE to match digits (\d)
grep -P "\d{3}-\d{2}-\d{4}" file.txt
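Lookarounds are the other common reason to reach for -P. The sketch below assumes GNU grep built with PCRE support and checks for it first; the input line and the names `line`, `user`, `keys`, and `pcre_supported` are hypothetical.

```shell
#!/bin/sh
# Hypothetical input line for illustration.
line='user=alice id=42'

# Probe for -P support; grep exits with an error when PCRE is unavailable.
if printf '1\n' | grep -qP '\d' 2>/dev/null; then
    pcre_supported=yes
    # Lookbehind: match word characters only when preceded by "user=".
    # -o prints just the matched text rather than the whole line.
    user=$(printf '%s\n' "$line" | grep -oP '(?<=user=)\w+')
    # Lookahead: match word characters only when followed by "=".
    keys=$(printf '%s\n' "$line" | grep -oP '\w+(?==)')
else
    pcre_supported=no
    user=''
    keys=''
fi

echo "PCRE available: $pcre_supported"
echo "user: $user"
echo "keys: $keys"
```

Because the lookaround text is not part of the match, -o prints only the value or key itself, which is hard to achieve with -E alone.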
The -P flag is a GNU extension. It works with GNU grep built with PCRE support, as shipped on most Linux distributions (Ubuntu, Debian, RHEL), but it is not available in the BSD grep shipped with macOS or the BusyBox grep used by Alpine Linux. If you are writing a portable shell script for deployment, do not use -P. Use -E and POSIX character classes ([[:digit:]]) instead.
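As an illustration, the \d pattern shown earlier can be rewritten with -E and [[:digit:]]. This is a minimal sketch on hypothetical sample lines; the `sample` function and `portable_hits` are names made up for the example.

```shell
#!/bin/sh
# Hypothetical sample input: only the first line has the 3-2-4 digit shape.
sample() {
    printf '%s\n' 'id: 123-45-6789' 'phone: 555-0100' 'no digits here'
}

# ERE with POSIX character classes instead of the PCRE-only \d shorthand.
portable_hits=$(sample | grep -E '[[:digit:]]{3}-[[:digit:]]{2}-[[:digit:]]{4}')

echo "$portable_hits"
```

The pattern is longer than its PCRE counterpart, but it uses only ERE features, so it does not depend on -P being available.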