Summary: in this tutorial, you will learn about word boundaries in regular expressions to match a position between a word character and a non-word character.
Introduction to Regex word boundary
A word boundary is a position in a string where a word character (such as a letter, a digit, or an underscore) sits next to a non-word character or next to the start or end of the string.
The following three kinds of positions in a string are word boundaries:
- At the start of the string, if the first character is a word character. (criterion #1)
- At the end of the string, if the last character is a word character. (criterion #2)
- Between a word character and a non-word character. (criterion #3)
For example, the word boundary positions in the string “C# 11” are as follows:
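- Before the letter “C” (criterion #1).
- After the letter “C”, before the “#” symbol (criterion #3).
- After the space character between “C#” and “11” (criterion #3).
- After the last character “1” (criterion #2).
The position between the “#” symbol and the space is not a word boundary because both are non-word characters.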
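The positions listed above can be checked with a short program. This is a minimal sketch rather than part of the original tutorial: it matches the zero-width pattern \b against “C# 11” and collects the index of each match. The variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "C# 11";

// \b is zero-width, so every match is an empty string;
// only its Index (the position in the string) is of interest.
var positions = Regex.Matches(text, @"\b")
    .Cast<Match>()
    .Select(m => m.Index)
    .ToArray();

Console.WriteLine(string.Join(", ", positions)); // 0, 1, 3, 5
```

Index 0 is the position before “C”, index 1 is between “C” and “#”, index 3 is between the space and the first “1”, and index 5 is the end of the string.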
Regular expressions use the \b metacharacter to denote a word boundary. The \b metacharacter matches a position in the string, not a character, so it does not consume any input.
For example, the following pattern uses \b on both sides to match word only when it appears as a whole word:
\bword\b
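One practical detail when writing this pattern in C#: in a regular string literal, "\b" is the escape sequence for the backspace character, so the regex engine never receives the \b metacharacter. The sketch below, which uses a made-up sample sentence, compares a regular literal with a verbatim string and a doubled backslash.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample text, chosen so that "word" also occurs inside "password".
var text = "a word, not a password";

// In a regular C# string literal, "\b" is the backspace character (U+0008),
// so this pattern looks for a literal backspace on each side of "word".
var backspacePattern = "\bword\b";

// A verbatim string (@"...") or a doubled backslash keeps the backslash intact.
var verbatimPattern = @"\bword\b";
var escapedPattern = "\\bword\\b";

var backspaceResult = Regex.IsMatch(text, backspacePattern);
var verbatimResult = Regex.IsMatch(text, verbatimPattern);
var escapedResult = Regex.IsMatch(text, escapedPattern);

Console.WriteLine(backspaceResult); // False
Console.WriteLine(verbatimResult);  // True
Console.WriteLine(escapedResult);   // True
```

Verbatim strings are usually the more readable choice for regex patterns, since the pattern appears exactly as the regex engine will see it.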
The following example finds every occurrence of sea in a string:
using System.Text.RegularExpressions;
using static System.Console;
var text = "She loves the sea and seaside";
var pattern = "sea";
var matches = Regex.Matches(text, pattern);
foreach (var match in matches)
{
    WriteLine(match);
}
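It returns two matches, and both print as sea: the first is the standalone word sea, and the second is the sea at the start of seaside.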
To match only the whole word sea, you can use the word boundary \b like this:
using System.Text.RegularExpressions;
using static System.Console;
var text = "She loves the sea and seaside";
var pattern = @"\bsea\b";
var matches = Regex.Matches(text, pattern);
foreach (var match in matches)
{
    WriteLine(match);
}
Output:
sea
Now, it returns one match instead of two.
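To see exactly which occurrences each pattern picks up, it can help to print the match positions instead of the matched text. The following sketch is an illustration, not part of the original example: it compares the start indexes found by the two patterns and then uses Regex.Replace with the bounded pattern. The replacement word "ocean" is an arbitrary choice.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "She loves the sea and seaside";

// Start index of every match, without and with word boundaries.
var plain = Regex.Matches(text, "sea").Cast<Match>().Select(m => m.Index).ToArray();
var whole = Regex.Matches(text, @"\bsea\b").Cast<Match>().Select(m => m.Index).ToArray();

Console.WriteLine(string.Join(", ", plain)); // 14, 22
Console.WriteLine(string.Join(", ", whole)); // 14

// Replacing with the bounded pattern leaves "seaside" untouched.
var replaced = Regex.Replace(text, @"\bsea\b", "ocean");
Console.WriteLine(replaced); // She loves the ocean and seaside
```

The match at index 22 disappears with \bsea\b because the character after that sea is the letter s, so there is no word boundary at that position.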
Summary
- Use the \b metacharacter to match a word boundary in a string.
- Use \bword\b to match the whole word.