1. What are Regular Expressions?
- Regular Expressions (Regex) are patterns used to match and manipulate text.
- They are a powerful tool for searching, extracting, and replacing text based on specific patterns.
- Python provides the re module for working with regular expressions.
2. Basic Regex Syntax
1. Literal Characters
- Match exact characters in the text.
- Example: The regex cat matches the string "cat".
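A minimal sketch of a literal match, using re.search from the standard library. The sample strings are made up for illustration.

```python
import re

# A literal pattern matches exactly the characters it contains.
pattern = r"cat"

found = re.search(pattern, "The cat sat on the mat.")
missing = re.search(pattern, "The dog barked.")
# A literal also matches inside a longer word.
inside = re.search(pattern, "concatenate")

print(found.group())   # cat
print(missing)         # None
print(inside.span())   # (3, 6)
```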
2. Metacharacters
- Special characters with specific meanings in regex:
  - .: Matches any single character except a newline.
  - ^: Matches the start of a string.
  - $: Matches the end of a string.
  - *: Matches 0 or more repetitions of the preceding element (a character, class, or group).
  - +: Matches 1 or more repetitions of the preceding element.
  - ?: Matches 0 or 1 repetition of the preceding element.
  - {m,n}: Matches between m and n repetitions of the preceding element.
  - []: Matches any single character within the brackets.
  - |: Acts as an OR operator.
  - (): Groups patterns together.
- Examples:
  - a.b matches "aab" and "acb", but not "ab".
  - ^abc matches "abc" at the start of a string.
  - xyz$ matches "xyz" at the end of a string.
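The metacharacters above can be tried out with a short script. This is only a sketch: the sample strings are invented, and re.fullmatch (which requires the whole string to match) is used here for convenience even though it is not covered in this guide.

```python
import re

# "." stands for exactly one character, so "ab" is too short for a.b.
dot_results = [bool(re.fullmatch(r"a.b", s)) for s in ("aab", "acb", "ab")]
print(dot_results)  # [True, True, False]

# "^" and "$" anchor the pattern to the start and end of the string.
starts = bool(re.search(r"^abc", "abcdef"))
not_start = bool(re.search(r"^abc", "xabc"))
ends = bool(re.search(r"xyz$", "uvwxyz"))
print(starts, not_start, ends)  # True False True

# Quantifiers apply to the element immediately before them.
star = re.findall(r"ab*", "a ab abbb")             # ['a', 'ab', 'abbb']
plus = re.findall(r"ab+", "a ab abbb")             # ['ab', 'abbb']
optional = re.findall(r"colou?r", "color colour")  # ['color', 'colour']
ranged = re.findall(r"o{2,3}", "o oo ooo oooo")    # ['oo', 'ooo', 'ooo']

# Character classes, alternation, and grouping.
char_class = re.findall(r"[bc]at", "bat cat rat")  # ['bat', 'cat']
either = re.findall(r"cat|dog", "cat dog bird")    # ['cat', 'dog']
grouped = bool(re.fullmatch(r"(ab)+", "ababab"))   # True
```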
3. Special Sequences
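- \d: Matches any digit (0-9).
- \D: Matches any non-digit.
- \w: Matches any word character (a-z, A-Z, 0-9, _).
- \W: Matches any non-word character.
- \s: Matches any whitespace character (space, tab, newline).
- \S: Matches any non-whitespace character.
- \b: Matches a word boundary.
- \B: Matches a non-word boundary.
- For str patterns in Python 3, \d, \w, and \s also match non-ASCII (Unicode) digits, word characters, and whitespace unless the re.ASCII flag is set.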
- Examples:
  - \d{3} matches any 3 digits (e.g., "123").
  - \w+ matches one or more word characters (e.g., "hello").
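As a rough illustration of the special sequences, the sketch below runs each one against a small made-up string with re.findall.

```python
import re

text = "hello world 123"

three_digits = re.findall(r"\d{3}", text)  # ['123']
words = re.findall(r"\w+", text)           # ['hello', 'world', '123']
non_digits = re.findall(r"\D+", text)      # ['hello world ']
spaces = re.findall(r"\s", text)           # [' ', ' ']
non_spaces = re.findall(r"\S+", text)      # ['hello', 'world', '123']

# \b requires a word boundary; \B requires that there is none.
animals = "cat concat cats"
whole_word = re.findall(r"\bcat\b", animals)  # only the standalone "cat"
mid_word = re.findall(r"\Bcat", animals)      # only the "cat" inside "concat"

print(three_digits, words, len(whole_word), len(mid_word))
```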
3. Using the re Module
1. re.match()
- Checks if the regex matches at the beginning of the string.
- Returns a match object if found, otherwise None.
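A small sketch of re.match(), with invented sample strings, showing that a match is only reported when the pattern fits at position 0:

```python
import re

# The digits are at the very beginning, so re.match() succeeds.
at_start = re.match(r"\d+", "2024 was a good year")

# The digits appear later in the string, so re.match() returns None.
not_at_start = re.match(r"\d+", "The year 2024")

print(at_start.group())  # 2024
print(not_at_start)      # None
```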
2. re.search()
- Searches the entire string for a match.
- Returns a match object if found, otherwise None.
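A sketch of re.search() on a made-up sentence. Only the first match is returned, and checking the result against None before use avoids an AttributeError:

```python
import re

text = "The year 2024 follows 2023"

# re.search() scans the whole string and stops at the first match.
result = re.search(r"\d+", text)
if result:
    print(result.group(), result.span())  # 2024 (9, 13)

# No digits anywhere, so the result is None.
no_result = re.search(r"\d+", "no digits here")
print(no_result)  # None
```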
3. re.findall()
- Returns all non-overlapping matches of the regex in the string as a list.
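The following sketch, using an invented price list, shows what re.findall() returns with and without capturing groups, and what "non-overlapping" means in practice:

```python
import re

text = "Prices: 10 apples, 25 pears, 7 plums"

# Without groups, each list item is the full matched text.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['10', '25', '7']

# With several groups, each list item is a tuple of the groups.
pairs = re.findall(r"(\d+) (\w+)", text)
print(pairs)  # [('10', 'apples'), ('25', 'pears'), ('7', 'plums')]

# Matches do not overlap: scanning resumes after the previous match.
overlap = re.findall(r"aa", "aaaa")
print(overlap)  # ['aa', 'aa']
```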
4. re.finditer()
- Returns an iterator yielding match objects for all matches.
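A brief sketch of re.finditer() on a made-up string. Unlike re.findall(), each item is a match object, so positions are available as well as the matched text:

```python
import re

text = "cat bat rat"

# Collect the matched text and its (start, end) span for every match.
positions = [(m.group(), m.span()) for m in re.finditer(r"[a-z]at", text)]

for word, span in positions:
    print(word, span)
# cat (0, 3)
# bat (4, 7)
# rat (8, 11)
```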
5. re.sub()
- Replaces all occurrences of the regex pattern in the string with a replacement string.
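A sketch of re.sub() on an invented date string. It also shows the optional count argument and backreferences such as \1 in the replacement, which are part of the re module but not described above:

```python
import re

text = "2024-01-15 and 2023-12-31"

# Replace every digit with "#".
masked = re.sub(r"\d", "#", text)
print(masked)  # ####-##-## and ####-##-##

# count limits how many occurrences are replaced.
limited = re.sub(r"\d{4}", "YYYY", text, count=1)
print(limited)  # YYYY-01-15 and 2023-12-31

# \1, \2, \3 in the replacement refer to the captured groups.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
print(reordered)  # 15/01/2024 and 31/12/2023
```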
6. re.split()
- Splits the string by the occurrences of the regex pattern.
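A sketch of re.split() with made-up input, where the delimiter is a pattern instead of a fixed string. The maxsplit argument shown second is optional:

```python
import re

# Split on a comma or semicolon followed by any amount of whitespace.
fields = re.split(r"[,;]\s*", "apple, banana;cherry,  date")
print(fields)  # ['apple', 'banana', 'cherry', 'date']

# maxsplit caps the number of splits; the rest stays in the last item.
limited = re.split(r"\s+", "one two  three four", maxsplit=2)
print(limited)  # ['one', 'two', 'three four']
```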
4. Regex Groups
- Use parentheses () to create groups in a regex.
- Groups allow you to extract specific parts of a match.
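A sketch of group extraction, assuming a simplified email-like pattern chosen only for illustration (it is not a general email validator):

```python
import re

match = re.search(r"(\w+)@(\w+)\.com", "Contact: alice@example.com")

whole = match.group(0)   # the entire match
user = match.group(1)    # first group
domain = match.group(2)  # second group
both = match.groups()    # all groups as a tuple

print(whole)  # alice@example.com
print(user)   # alice
print(domain) # example
print(both)   # ('alice', 'example')
```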
5. Named Groups
- Assign names to groups using the (?P<name>...) syntax.
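A sketch of named groups, with year, month, and day as illustrative names on a made-up date string. Named groups can be read by name or collected with groupdict():

```python
import re

date_pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
match = re.search(date_pattern, "Released on 2024-01-15")

year = match.group("year")
parts = match.groupdict()

print(year)   # 2024
print(parts)  # {'year': '2024', 'month': '01', 'day': '15'}
```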
6. Additional Examples
- Matching names
- Extracting phone numbers
- Replacing text
- Splitting text
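The four cases listed above might be sketched as follows. The patterns and sample strings are assumptions made for illustration: "names" are taken to be two capitalized words, and phone numbers are assumed to use the 555-123-4567 format.

```python
import re

# Matching names (assumed form: two capitalized words).
attendees = "Attendees: Alice Smith, Bob Jones and carol white."
names = re.findall(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", attendees)
print(names)  # ['Alice Smith', 'Bob Jones']

# Extracting phone numbers (assumed format: 3-3-4 digits with dashes).
contact = "Call 555-123-4567 or 555-987-6543."
phones = re.findall(r"\d{3}-\d{3}-\d{4}", contact)
print(phones)  # ['555-123-4567', '555-987-6543']

# Replacing text: collapse runs of whitespace into one space.
cleaned = re.sub(r"\s+", " ", "too    many   spaces")
print(cleaned)  # too many spaces

# Splitting text on any run of non-word characters.
tokens = re.split(r"\W+", "one, two; three")
print(tokens)  # ['one', 'two', 'three']
```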
7. Best Practices
- Use raw strings (r"...") for regex patterns to avoid escaping backslashes.
- Test regex patterns using tools like regex101.com.
- Use comments and verbose mode (re.VERBOSE) for complex regex patterns.
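A sketch of the raw-string and verbose-mode advice. The phone pattern and its group names (area, prefix, line) are hypothetical and chosen only to give the comments something to describe:

```python
import re

# Without a raw string, every regex backslash has to be doubled.
same_pattern = "\\d+" == r"\d+"
print(same_pattern)  # True

# In verbose mode, whitespace and "#" comments in the pattern are ignored.
phone = re.compile(
    r"""
    (?P<area>\d{3})    # area code
    [-.\s]             # separator
    (?P<prefix>\d{3})  # prefix
    [-.\s]             # separator
    (?P<line>\d{4})    # line number
    """,
    re.VERBOSE,
)

match = phone.search("Call 555-123-4567 today")
print(match.group("area"), match.group("prefix"), match.group("line"))
# 555 123 4567
```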