Python Regular Expressions
1. What are Regular Expressions?
- Regular Expressions (Regex) are patterns used to match and manipulate text.
- They are a powerful tool for searching, extracting, and replacing text based on specific patterns.
- Python provides the re module for working with regular expressions.
2. Basic Regex Syntax
1. Literal Characters
- Match exact characters in the text.
- Example: The regex cat matches the string "cat".
2. Metacharacters
- Special characters with specific meanings in regex:
- .: Matches any single character except a newline.
- ^: Matches the start of a string.
- $: Matches the end of a string.
- *: Matches 0 or more repetitions of the preceding element (a character, character class, or group).
- +: Matches 1 or more repetitions of the preceding element.
- ?: Matches 0 or 1 repetition of the preceding element.
- {m,n}: Matches between m and n repetitions of the preceding element.
- []: Matches any single character within the brackets.
- |: Acts as an OR operator.
- (): Groups patterns together.
Examples:
- a.b matches "aab" and "acb", but not "ab".
- ^abc matches "abc" at the start of a string.
- xyz$ matches "xyz" at the end of a string.
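The metacharacters above can be tried out directly with the re module. The following is a minimal sketch; the sample strings are made up for illustration.

```python
import re

# "." requires exactly one character between "a" and "b"
dot_hits = [s for s in ["aab", "acb", "ab"] if re.search(r"a.b", s)]

# Quantifiers apply to the element immediately before them
zero_or_more = re.findall(r"ab*", "a ab abbb")         # "b" repeated 0+ times
one_or_more = re.findall(r"ab+", "a ab abbb")          # "b" repeated 1+ times
optional_u = re.findall(r"colou?r", "color colour")    # "u" is optional
bounded = re.findall(r"a{2,3}", "a aa aaaa")           # 2 to 3 "a" characters

# A character class combined with alternation
alternation = re.findall(r"[ch]at|dog", "cat hat dog bat")

print(dot_hits, zero_or_more, one_or_more, optional_u, bounded, alternation)
```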
3. Special Sequences
- \d: Matches any digit (0-9).
- \D: Matches any non-digit.
- \w: Matches any word character (a-z, A-Z, 0-9, _).
- \W: Matches any non-word character.
- \s: Matches any whitespace character (space, tab, newline).
- \S: Matches any non-whitespace character.
- \b: Matches a word boundary.
- \B: Matches a non-word boundary.
Examples:
- \d{3} matches any 3 digits (e.g., "123").
- \w+ matches one or more word characters (e.g., "hello").
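As a quick check of the special sequences, here is a small sketch with invented sample text. Note the raw strings, which keep the backslashes intact.

```python
import re

text = "Call 123 or 456, hello world"

three_digits = re.findall(r"\d{3}", text)   # runs of exactly three digits
words = re.findall(r"\w+", text)            # runs of word characters
space_count = len(re.findall(r"\s", text))  # individual whitespace characters

# \b restricts the match to whole words: "concat" and "cats" are skipped
whole_words = re.findall(r"\bcat\b", "cat concat cats cat")

print(three_digits, words, space_count, whole_words)
```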
3. Using the re Module
1. re.match()
- Checks if the regex matches at the beginning of the string.
- Returns a match object if found, otherwise None.
Example:
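A minimal sketch of re.match(), using made-up input strings:

```python
import re

# The digits are at the very start, so match() succeeds
m = re.match(r"\d+", "123abc")
matched_text = m.group() if m else None

# The digits are not at the start, so match() returns None
no_match = re.match(r"\d+", "abc123")

print(matched_text, no_match)
```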
2. re.search()
- Searches the entire string for a match.
- Returns a match object if found, otherwise None.
Example:
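A minimal sketch of re.search(), using a made-up input string. Unlike re.match(), it finds the first match anywhere in the string.

```python
import re

m = re.search(r"\d+", "abc123def")

# Always check for None before using the match object
found = m.group() if m else None
span = m.span() if m else None   # (start, end) indices of the match

print(found, span)
```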
3. re.findall()
- Returns all non-overlapping matches of the regex in the string as a list.
Example:
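A minimal sketch of re.findall(), with illustrative input. When the pattern contains groups, the list holds the group contents (as tuples for multiple groups) rather than the full matches.

```python
import re

# No groups: each item is the full matched text
numbers = re.findall(r"\d+", "3 apples, 12 pears, 7 plums")

# Two groups: each item is a tuple of the captured parts
pairs = re.findall(r"(\w+)=(\d+)", "a=1, b=22")

print(numbers, pairs)
```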
4. re.finditer()
- Returns an iterator yielding match objects for all matches.
Example:
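A minimal sketch of re.finditer(), with illustrative input. It is useful when the position of each match is needed, not only its text.

```python
import re

text = "3 apples, 12 pears"

# Each item yielded by finditer() is a match object
positions = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]

print(positions)
```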
5. re.sub()
- Replaces all occurrences of the regex pattern in the string with a replacement string.
Example:
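A minimal sketch of re.sub(), with illustrative input. The optional count argument limits the number of replacements, and \1, \2 in the replacement refer back to captured groups.

```python
import re

text = "Room 101, floor 3"

all_replaced = re.sub(r"\d+", "#", text)             # replace every match
first_only = re.sub(r"\d+", "#", text, count=1)      # replace the first match only

# Swap two words using backreferences to the captured groups
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

print(all_replaced, first_only, swapped)
```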
6. re.split()
- Splits the string by the occurrences of the regex pattern.
Example:
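A minimal sketch of re.split(), with illustrative input. The pattern here treats a comma or semicolon, plus any following whitespace, as the separator.

```python
import re

parts = re.split(r"[,;]\s*", "a, b;c,  d")

# maxsplit limits how many splits are performed
limited = re.split(r"[,;]\s*", "a, b;c", maxsplit=1)

print(parts, limited)
```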
4. Regex Groups
- Use parentheses () to create groups in a regex.
- Groups allow you to extract specific parts of a match.
Example:
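A minimal sketch of numbered groups, assuming a date written as YYYY-MM-DD in an invented sentence:

```python
import re

m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Released on 2024-01-05")

whole = m.group(0)      # group 0 is always the entire match
year = m.group(1)       # groups are numbered from 1, left to right
all_parts = m.groups()  # tuple of all captured groups

print(whole, year, all_parts)
```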
5. Named Groups
- Assign names to groups using (?P<name>...) syntax.
Example:
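A minimal sketch of named groups. The group names user and domain and the sample address are chosen here for illustration, and the pattern is deliberately simplistic rather than a full email validator.

```python
import re

m = re.search(r"(?P<user>\w+)@(?P<domain>[\w.]+)", "Contact: alice@example.com")

user = m.group("user")      # access a group by name
domain = m.group("domain")
as_dict = m.groupdict()     # all named groups as a dictionary

print(user, domain, as_dict)
```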
6. Additional Examples
- Matching Names:
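One possible sketch, under the simplifying assumption that a name is two capitalized ASCII words separated by a space. Real names are far more varied, so this is illustrative only.

```python
import re

text = "Alice Smith met Bob Jones in Paris."

# Two capitalized words in a row; "Paris" alone does not qualify
names = re.findall(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", text)

print(names)
```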
- Extracting Phone Numbers:
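One possible sketch, assuming phone numbers are written in the NNN-NNN-NNNN format. Other formats would need a different pattern, and the numbers below are invented.

```python
import re

text = "Call 555-123-4567 or 555-987-6543 today."

# \b on both sides avoids matching inside a longer run of digits
phones = re.findall(r"\b\d{3}-\d{3}-\d{4}\b", text)

print(phones)
```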
- Replacing Text:
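Two illustrative replacements with re.sub(): collapsing repeated whitespace, and computing the replacement with a function. The sample strings are made up.

```python
import re

# Collapse any run of whitespace into a single space
cleaned = re.sub(r"\s+", " ", "too    many   spaces")

# The replacement may be a function that receives each match object
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "2 cats, 10 dogs")

print(cleaned, doubled)
```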
- Splitting Text:
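Two illustrative uses of re.split(): splitting on any whitespace, and keeping the separators by wrapping the pattern in a capturing group. The sample strings are made up.

```python
import re

# Split on any run of spaces or tabs
tokens = re.split(r"\s+", "split   on\tany whitespace")

# A capturing group in the pattern keeps the separators in the result
with_separators = re.split(r"(\d)", "a1b2c")

print(tokens, with_separators)
```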
7. Best Practices
- Use raw strings (r"...") for regex patterns to avoid escaping backslashes.
- Test regex patterns using tools like regex101.com.
- Use comments and verbose mode (re.VERBOSE) for complex regex patterns.
Example:
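A minimal sketch combining a raw string with re.VERBOSE. In verbose mode, whitespace in the pattern is ignored and # starts a comment, so the pattern can be laid out over several lines. The date format and group names are assumptions made for this illustration.

```python
import re

date_pattern = re.compile(r"""
    (?P<year>\d{4})    # four-digit year
    -                  # literal separator
    (?P<month>\d{2})   # two-digit month
    -                  # literal separator
    (?P<day>\d{2})     # two-digit day
""", re.VERBOSE)

m = date_pattern.search("Due 2024-01-05")
parts = m.groupdict() if m else {}

print(parts)
```

Compiling the pattern once with re.compile() also avoids repeating the flags at each call site.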