Summary: in this tutorial, you will learn how to use lazy quantifiers to find the smallest possible match in an input string.
Introduction to lazy quantifiers
Lazy quantifiers, also known as non-greedy quantifiers, are a feature in regular expressions that modify the behavior of quantifiers to match as little as possible. Starting from the first position where a match can begin, they consume the fewest characters that still allow the overall pattern to be satisfied.
By default, quantifiers in regular expressions are greedy, meaning they match as much as possible and give characters back only when the rest of the pattern would otherwise fail. Lazy quantifiers work in the opposite way: they start with the minimum number of repetitions and take more characters only when the rest of the pattern requires it.
Lazy quantifiers are denoted by appending a question mark (?) to the standard quantifiers.
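To make the difference concrete, here is a minimal sketch that runs a greedy pattern and its lazy counterpart against the same string. It assumes the sample input tag used in the example later in this tutorial; the variable names greedy and lazy are only illustrative.

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var html = """<input type="submit" values="Send">""";

// Greedy: .+ first runs to the end of the input, then backtracks
// to the LAST double quote so the pattern can still match.
var greedy = Regex.Match(html, "\".+\"").Value;

// Lazy: .+? takes one character at a time and stops at the
// FIRST closing double quote it reaches.
var lazy = Regex.Match(html, "\".+?\"").Value;

WriteLine(greedy); // "submit" values="Send"
WriteLine(lazy);   // "submit"
```

Both patterns start matching at the same opening quote; only the amount of text consumed by the quantifier differs.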
The following table shows the greedy quantifiers, lazy quantifiers, and the meanings of the lazy quantifiers:
Greedy Quantifiers | Lazy Quantifiers | Lazy Quantifier Meaning |
---|---|---|
* | *? | Match zero or more occurrences (as few as possible) |
+ | +? | Match one or more occurrences (as few as possible) |
? | ?? | Match zero or one occurrence (preferably zero) |
{n} | {n}? | Match exactly n occurrences (same result as {n}) |
{n,} | {n,}? | Match n or more occurrences (as few as possible) |
{n,m} | {n,m}? | Match between n and m occurrences (as few as possible) |
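The rows of the table can be checked with a short sketch like the one below. The input string "aaaa" and the First helper are assumptions chosen for illustration; each call returns the first match of a pattern so the greedy and lazy results can be compared side by side.

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var input = "aaaa";

// Hypothetical helper: returns the text of the first match of a pattern.
string First(string pattern) => Regex.Match(input, pattern).Value;

// Each row pairs a greedy quantifier with its lazy counterpart.
string[][] pairs =
{
    new[] { "a*", "a*?" },         // "aaaa" vs "" (zero occurrences)
    new[] { "a+", "a+?" },         // "aaaa" vs "a"
    new[] { "a?", "a??" },         // "a"    vs "" (prefers zero)
    new[] { "a{2}", "a{2}?" },     // "aa"   vs "aa" (no difference)
    new[] { "a{2,}", "a{2,}?" },   // "aaaa" vs "aa"
    new[] { "a{2,3}", "a{2,3}?" }, // "aaa"  vs "aa"
};

foreach (var pair in pairs)
{
    WriteLine($"{pair[0]} -> '{First(pair[0])}', {pair[1]} -> '{First(pair[1])}'");
}
```

Note that *? and ?? can produce an empty match, because zero occurrences already satisfy the pattern when nothing follows the quantifier.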
Lazy quantifiers example
The following example illustrates how to use a lazy quantifier to extract attribute values of an input tag:
using System.Text.RegularExpressions;
using static System.Console;
var html = """<input type="submit" values="Send">""";
var pattern = """
".+?"
""";
var matches = Regex.Matches(html, pattern);
foreach (var match in matches)
{
    WriteLine(match);
}
Output:
"submit"
"Send"
How it works.
- The program begins with a using directive that brings the Regex class from the System.Text.RegularExpressions namespace into scope.
- The using static System.Console; directive then allows calling the WriteLine method without explicitly specifying the Console class.
- The html variable holds a raw string literal (delimited by """) that contains an HTML input element with the attributes type="submit" and values="Send".
- The pattern variable is also a raw string literal, which avoids escaping the double quotes (") inside the regular expression. The pattern ".+?" matches each attribute value of the input tag, including the surrounding quotes. The lazy quantifier +? stops at the nearest closing quote, so each match is as small as possible.
- The Regex.Matches() method finds all matches of the pattern in the html string. It takes the html string and the pattern as arguments and returns a collection of Match objects representing the matches found.
- The foreach loop iterates over each Match object in the matches collection and calls the WriteLine method to print each match to the console.
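As a possible variation on the steps above, the sketch below wraps the lazy part of the pattern in a capturing group so that each attribute value can be read without its surrounding quotes. The capturing group and the values list are additions for illustration and are not part of the original program.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using static System.Console;

var html = """<input type="submit" values="Send">""";

// Same lazy pattern as before, with parentheses around the text between the quotes.
var pattern = """
"(.+?)"
""";

var values = new List<string>();
foreach (Match match in Regex.Matches(html, pattern))
{
    // Groups[0] is the whole match (with quotes); Groups[1] is the captured text.
    values.Add(match.Groups[1].Value);
}

WriteLine(string.Join(", ", values)); // submit, Send
```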
Summary
- A lazy quantifier in regular expressions matches as little as possible while still satisfying the pattern.