Capturing Groups in Regular Expressions
1. What are Capturing Groups?
Capturing groups are a feature of regular expressions that lets you treat multiple characters as a single unit. You create one by placing the characters to be grouped inside a pair of parentheses (). A capturing group not only matches text but also captures the matched text for later use.
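To make "a single unit" concrete, the sketch below uses Python's standard re module (the choice of Python and the sample pattern (ab)+ are assumptions for illustration, not part of the original text). It shows that a quantifier placed after a group applies to the whole group rather than to one character.

```python
import re

# Without a group, + applies only to the preceding character "b",
# so "ab+" means "a" followed by one or more "b".
no_group = re.fullmatch(r"ab+", "abab")

# With a group, + applies to the whole unit "ab".
with_group = re.fullmatch(r"(ab)+", "abab")

print(no_group is None)       # the ungrouped pattern does not match "abab"
print(with_group.group(0))    # the grouped pattern matches the full string
```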
2. Basic Capturing Group
A basic capturing group is simply a sequence of characters enclosed in parentheses. For example, (abc) is a capturing group that matches the string "abc" and captures it for later reference.
Example:
Pattern: (hello)
Matches: "hello"
Explanation: The capturing group (hello) matches the string "hello" and captures it.
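As a minimal sketch of how the captured text can be read back, here is the same pattern in Python's re module (the language and the sample input string are assumptions made for illustration). Group 0 is the whole match and group 1 is the first capturing group.

```python
import re

# Search for the pattern anywhere in a sample sentence (hypothetical input).
match = re.search(r"(hello)", "say hello to everyone")

whole_match = match.group(0)   # text matched by the entire pattern
first_group = match.group(1)   # text captured by the first group

print(whole_match, first_group)
```

Because the group spans the entire pattern here, both values are the same string.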
3. Nested Capturing Groups
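You can nest capturing groups inside each other to create more complex patterns. Groups are numbered by the position of their opening parenthesis, counting from left to right, so an outer group always gets a lower number than the groups nested inside it. For example, (a(b)c) captures both "abc" and "b".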
Example:
Pattern: (a(b)c)
Matches: "abc"
Captures: Group 1: "abc", Group 2: "b"
Explanation: The outer group (a(b)c) captures "abc", and the inner group (b) captures "b".
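The numbering in the example above can be checked with a short Python sketch (Python's re module is an assumption here; other regex engines number groups the same way but expose them through different APIs).

```python
import re

match = re.fullmatch(r"(a(b)c)", "abc")

# groups() lists the captures in group-number order.
groups = match.groups()

# start(n) gives the index where group n begins in the input.
outer_start = match.start(1)   # group 1 opens first, at index 0
inner_start = match.start(2)   # group 2 opens second, at index 1

print(groups, outer_start, inner_start)
```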
4. Backreferences
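Capturing groups can be referenced later in the same regular expression using backreferences. A backreference is written as a backslash followed by the number of the capturing group, such as \1 or \2 (the notation \n is often used for this, where n stands for a digit, not the newline escape). A backreference matches the same text that the group captured, not the group's pattern. For example, (a)\1 matches "aa".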
Example:
Pattern: (a)\1
Matches: "aa"
Explanation: The capturing group (a) captures "a", and the backreference \1 matches another "a".
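The following Python sketch tries the pattern from the example and then a slightly more general one. The second pattern, ([a-z])\1, and the test word "letter" are hypothetical additions chosen to show that a backreference repeats the captured text rather than the pattern.

```python
import re

# The pattern from the example: "a" followed by whatever group 1 captured.
pattern = re.compile(r"(a)\1")
matches_aa = pattern.fullmatch("aa") is not None
matches_ab = pattern.fullmatch("ab") is not None

# Hypothetical extension: any lowercase letter followed by the same letter.
doubled = re.compile(r"([a-z])\1")
found = doubled.search("letter")
doubled_text = found.group(0) if found else None
no_double = doubled.search("abc")

print(matches_aa, matches_ab, doubled_text, no_double)
```

Note that ([a-z])\1 does not match "ab", even though both characters fit [a-z], because \1 must repeat the exact character that was captured.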
5. Non-Capturing Groups
Sometimes, you may want to group characters without capturing them. This can be done using non-capturing groups, which are specified by (?:...). For example, (?:abc) groups "abc" but does not capture it.
Example:
Pattern: (?:abc)
Matches: "abc"
Explanation: The non-capturing group (?:abc) matches "abc" but does not capture it for later use.
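A side-by-side comparison makes the difference visible. This is a minimal sketch in Python's re module (an assumed choice of language); both patterns match the same text, but only the capturing version stores a group.

```python
import re

capturing = re.fullmatch(r"(abc)", "abc")
non_capturing = re.fullmatch(r"(?:abc)", "abc")

# groups() returns one entry per capturing group in the pattern.
captured = capturing.groups()
not_captured = non_capturing.groups()

# The overall match (group 0) is still available in both cases.
overall = non_capturing.group(0)

print(captured, not_captured, overall)
```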