Summary: in this tutorial, you will learn how to use a C# regex lookahead to match a pattern only if it is followed by another pattern.
Introduction to the C# Regex Lookahead
The regex lookahead is a feature in regular expressions that allows you to match A only if it is followed by B. The syntax of the regex lookahead is as follows:
A(?=B)
The lookahead only checks that B follows A; B itself is not included in the match.
Suppose you have the following string:
"1 shark is 5 feet long"
You want to match the number 5, which is followed by a space and the literal string “feet”, but not the number 1. To do that, you can use a regex lookahead like this:
\d+(?=\s*feet)
This regular expression pattern matches one or more digits but only if they are followed by zero or more whitespace characters and the word “feet”.
Here’s the program:
using System.Text.RegularExpressions;
using static System.Console;

var text = "1 shark is 5 feet long";
// Match digits only when optional whitespace and "feet" follow them.
var pattern = @"\d+(?=\s*feet)";
var matches = Regex.Matches(text, pattern);

foreach (Match match in matches)
{
    WriteLine(match); // prints the matched digits only
}
Output:
5
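A lookahead is zero-width, so the text it checks is not part of the match. The following sketch prints the match's value together with its Index and Length to make this visible; the variable names are illustrative only.

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var text = "1 shark is 5 feet long";
var pattern = @"\d+(?=\s*feet)";

// Take the first (and only) match of the lookahead pattern.
Match first = Regex.Match(text, pattern);

// The lookahead checks " feet" but does not consume it,
// so the match covers the digit alone.
WriteLine($"Value='{first.Value}', Index={first.Index}, Length={first.Length}");
// Value='5', Index=11, Length=1
```

A Length of 1 indicates that the whitespace and the word “feet” were inspected but left out of the match.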
C# Regex multiple lookaheads
Regular expressions support multiple lookaheads with the following syntax:
A(?=B)(?=C)
In this syntax, we have two lookaheads. The regular expression A(?=B)(?=C) matches the pattern A only if both B and C match starting at the position right after A.
Here is how the regex engine processes the pattern:
- First, match the pattern A.
- Second, evaluate the first lookahead (?=B) at the position immediately after A. If it doesn’t match, the attempt fails and the engine backtracks or moves on to the next starting position.
- Third, evaluate the second lookahead (?=C) at the same position, immediately after A, because a lookahead does not consume any characters. If it doesn’t match, the attempt fails in the same way.
- Finally, if both tests pass, the regex engine returns the match for the pattern A.
In theory, you can have as many lookaheads as you want.
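Because every lookahead is checked from the same position, the second condition has to be written relative to the end of A, not relative to the end of B. A minimal sketch, reusing the shark sentence from the earlier example (the two patterns and the variable names are illustrative assumptions):

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var text = "1 shark is 5 feet long";

// Both lookaheads start right after the digits:
// (?=\s*feet) requires "feet" next, (?=.*long) requires "long" somewhere later.
var both = Regex.Matches(text, @"\d+(?=\s*feet)(?=.*long)");

// This variant assumes the second lookahead starts after "feet".
// It does not: it also starts right after the digits, so it never succeeds here.
var chained = Regex.Matches(text, @"\d+(?=\s*feet)(?=\s*long)");

WriteLine($"both: {both.Count} match(es)");       // both: 1 match(es)
WriteLine($"chained: {chained.Count} match(es)"); // chained: 0 match(es)
```

If the second condition really should start after B, one option is to put it inside a single lookahead, as in A(?=BC).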
Regex negative lookaheads
Negative lookahead negates a lookahead. It matches A only if it is not followed by B:
A(?!B)
Negative lookaheads are useful when you want to specify exclusions in your regular expression patterns. They help you find matches that do not have certain patterns following them.
The following example uses a negative lookahead in a regular expression to match a number that is not followed by the word “feet”:
using System.Text.RegularExpressions;
using static System.Console;

var text = "1 shark is 5 feet long";
// Match digits only when optional whitespace and "feet" do NOT follow them.
var pattern = @"\d+(?!\s*feet)";
var matches = Regex.Matches(text, pattern);

foreach (Match match in matches)
{
    WriteLine(match);
}
Output:
1
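One caveat worth checking in your own patterns: with multi-digit numbers, \d+ can backtrack and give up digits until the negative lookahead succeeds, so part of an excluded number may still match. The sketch below shows this and one possible way to guard against it by also rejecting a following digit; the sample text and the adjusted pattern are assumptions for illustration.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using static System.Console;

var text = "15 feet and 20 meters";

// Naive pattern: for "15", \d+ backtracks to "1", which is followed by "5"
// rather than "feet", so the negative lookahead passes and "1" is matched.
var naive = Regex.Matches(text, @"\d+(?!\s*feet)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Adjusted pattern: also reject a following digit, so a match cannot
// end in the middle of a number.
var guarded = Regex.Matches(text, @"\d+(?!\d|\s*feet)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

WriteLine(string.Join(", ", naive));   // 1, 20
WriteLine(string.Join(", ", guarded)); // 20
```

The single-digit example above is not affected, because \d+ has no shorter alternative to fall back on.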
Summary
- Use the lookahead A(?=B) in regular expressions to match A only if it is followed by B.
- Use the negative lookahead A(?!B) in regular expressions to match A only if it is not followed by B.