Groups and Backreferences in Regular Expressions
1. Capturing Groups
Capturing groups treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a pair of parentheses (). This lets you apply a quantifier to the entire group and later refer to the substring it matched.
Example:
Pattern: (abc)+
Matches: "abc", "abcabc", "abcabcabc"
Explanation: The group (abc) is matched one or more times. In most engines, a repeated group retains only the text of its last repetition.
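The behavior of the quantified group can be checked with a minimal sketch in Python's re module. Python is chosen here only for illustration, and the variable names are hypothetical; other engines should behave the same way for this pattern.

```python
import re

# (abc)+ repeats the whole group, not just the final "c".
pattern = re.compile(r"(abc)+")

match = pattern.fullmatch("abcabcabc")
whole = match.group(0)  # the entire matched text
last = match.group(1)   # text held by group 1 after the final repetition

print(whole)
print(last)

# Without the parentheses, + applies only to "c", so "abcabc" no longer matches.
no_group = re.fullmatch(r"abc+", "abcabc")
print(no_group)
```

Group 0 always refers to the whole match, while group 1 refers to the first pair of parentheses.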
2. Non-Capturing Groups
Non-capturing groups group multiple characters without capturing them for later use. They are created by placing ?: immediately after the opening parenthesis, as in (?:...). This is useful when you want to apply a quantifier to a group but don't need to refer to it later.
Example:
Pattern: (?:abc)+
Matches: "abc", "abcabc", "abcabcabc"
Explanation: The group (?:abc) is matched one or more times, but it is not captured for later reference.
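The difference between the two kinds of group shows up in what the match object exposes afterwards. The following is a small sketch using Python's re module (an assumption, since the text names no particular engine); the variable names are illustrative.

```python
import re

capturing = re.compile(r"(abc)+")
non_capturing = re.compile(r"(?:abc)+")

text = "abcabc"

# Both patterns match the same text...
cap_match = capturing.fullmatch(text)
noncap_match = non_capturing.fullmatch(text)

# ...but only the capturing version records a group.
cap_groups = cap_match.groups()        # tuple of captured substrings
noncap_groups = noncap_match.groups()  # empty tuple: nothing was captured

print(capturing.groups, cap_groups)
print(non_capturing.groups, noncap_groups)
```

Using (?:...) keeps group numbering stable when a pattern also contains groups you do want to reference.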
3. Named Capturing Groups
Named capturing groups let you assign a name to a capturing group, which makes it easier to refer to the matched substring later. They are created by placing ?<name> immediately after the opening parenthesis, as in (?<name>...). The exact syntax depends on the regex flavor; Python's re module, for example, writes (?P<name>...).
Example:
Pattern: (?<word>abc)
Matches: "abc"
Explanation: The group (?<word>abc) is matched and can be referred to by the name "word".
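As a rough illustration, here is how a named group might be read back in Python. Note the assumption: Python's re module spells the construct (?P<word>abc) rather than (?<word>abc), so the pattern below is the Python equivalent of the example, not the identical string. The sample input is made up.

```python
import re

# Python's re module uses (?P<name>...) for named groups.
pattern = re.compile(r"(?P<word>abc)")

match = pattern.search("xx abc yy")

by_name = match.group("word")  # look the group up by its name
by_number = match.group(1)     # named groups are still numbered
as_dict = match.groupdict()    # all named groups as a dictionary

print(by_name, by_number, as_dict)
```

A named group keeps its positional number, so existing numeric references continue to work after a name is added.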
4. Backreferences
Backreferences let you refer to the text matched by an earlier capturing group within the same regular expression. They are written as a backslash \ followed by the number of the capturing group, as in \1. For named capturing groups, you can use \k<name>.
Example:
Pattern: (\w)\1
Matches: "aa", "bb", "cc"
Explanation: The pattern (\w)\1 matches any word character followed immediately by the same character.
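A short sketch in Python shows that a backreference repeats the matched text, not the sub-pattern: "ab" consists of two word characters but is not matched. The sample string and variable names are invented for illustration, and the named form uses Python's (?P=name) spelling, which is that module's counterpart to \k<name>.

```python
import re

# Numbered backreference: a word character followed by the same character.
doubled = re.compile(r"(\w)\1")
found = [m.group(0) for m in doubled.finditer("aa ab bb book 11")]
print(found)

# Named backreference in Python's syntax: (?P=ch) refers back to (?P<ch>...).
named = re.compile(r"(?P<ch>\w)(?P=ch)")
same = named.fullmatch("zz") is not None
different = named.fullmatch("zy") is not None
print(same, different)
```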
5. Conditional Groups
Conditional groups let you choose between two patterns depending on whether an earlier capturing group took part in the match. They are written as (?(condition)true-pattern|false-pattern), where the condition is typically the number or name of a capturing group. Not every regex flavor supports them: PCRE, .NET, and Python do, while JavaScript and Java do not.
Example:
Pattern: (a)?(?(1)b|c)
Matches: "ab", "c"
Explanation: The pattern (a)?(?(1)b|c) matches "ab" when the optional group (a) participates in the match; otherwise it matches "c".
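The conditional can be exercised with a minimal Python sketch, assuming an engine that supports this construct (Python's re module does). The list of test strings is hypothetical, and fullmatch is used so that partial matches inside a longer string do not count.

```python
import re

# If group 1 matched "a", require "b"; otherwise require "c".
cond = re.compile(r"(a)?(?(1)b|c)")

candidates = ["ab", "c", "ac", "b", "a"]
results = {s: cond.fullmatch(s) is not None for s in candidates}

for text, ok in results.items():
    print(text, ok)
```

With search instead of fullmatch, a string such as "ac" would still produce a match on its trailing "c", because the engine can skip the optional group and start matching at a later position.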