
Python re Module Quick Cheatsheet.

Basic Functions
  1. re.compile(pattern, flags=0)

    • Compiles a regex pattern into a regex object.
    • Example:
      import re
      pattern = re.compile(r'\d+')
  2. re.search(pattern, string, flags=0)

    • Searches for the first occurrence of the pattern in the string. Returns a match object or None.
    • Example:
      match = re.search(r'\d+', 'The number is 42')
      if match:
          print(match.group())  # Output: 42
  3. re.match(pattern, string, flags=0)

    • Checks for a match only at the beginning of the string. Returns a match object or None.
    • Example:
      match = re.match(r'\d+', '42 is the answer')
      if match:
          print(match.group())  # Output: 42
  4. re.fullmatch(pattern, string, flags=0)

    • Checks if the entire string matches the pattern. Returns a match object or None.
    • Example:
      match = re.fullmatch(r'\d+', '42')
      if match:
          print(match.group())  # Output: 42
      
  5. re.findall(pattern, string, flags=0)

    • Returns a list of all non-overlapping matches in the string.
    • Example:
      numbers = re.findall(r'\d+', 'There are 42 apples and 17 oranges') 
      print(numbers) # Output: ['42', '17']
  6. re.finditer(pattern, string, flags=0)

    • Returns an iterator yielding match objects for all non-overlapping matches.
    • Example:
      for match in re.finditer(r'\d+', 'There are 42 apples and 17 oranges'):
          print(match.group())  # Output: 42, then 17 (one per line)
  7. re.split(pattern, string, maxsplit=0, flags=0)

    • Splits the string at each occurrence of the pattern. A match at the very end of the string leaves a trailing empty string in the result.
    • Example:
      parts = re.split(r'\W+', 'Words, separated by non-alphanumeric characters.')
      print(parts)  # Output: ['Words', 'separated', 'by', 'non', 'alphanumeric', 'characters', '']
      
  8. re.sub(pattern, repl, string, count=0, flags=0)

    • Replaces occurrences of the pattern with the replacement string.
    • Example:
      result = re.sub(r'\d+', '#', 'There are 42 apples and 17 oranges')
      print(result)  # Output: There are # apples and # oranges
      
  9. re.subn(pattern, repl, string, count=0, flags=0)

    • Same as re.sub but also returns the number of substitutions made.
    • Example:
      result, num_subs = re.subn(r'\d+', '#', 'There are 42 apples and 17 oranges')
      print(result, num_subs)  # Output: There are # apples and # oranges 2
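
The functions above are also available as methods on a compiled pattern, and the difference between search, match, and fullmatch is easiest to see on one input. The snippet below is a minimal sketch; the sample text and variable names are made up for illustration.

```python
import re

# Compile once, then reuse the pattern object's own methods.
number = re.compile(r'\d+')
text = 'Order 66 shipped 3 items'  # illustrative sample text

first = number.search(text)        # scans the whole string: finds '66'
at_start = number.match(text)      # anchored at position 0: None here
whole = number.fullmatch('66')     # must cover the entire string
all_numbers = number.findall(text)

# maxsplit caps the number of splits; count caps the number of substitutions.
head_tail = re.split(r'\s+', text, maxsplit=1)
masked_once = number.sub('#', text, count=1)

print(first.group())   # 66
print(at_start)        # None
print(all_numbers)     # ['66', '3']
print(head_tail)       # ['Order', '66 shipped 3 items']
print(masked_once)     # Order # shipped 3 items
```

Passing maxsplit and count by keyword keeps the calls readable and avoids confusing them with flags.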
      
Match Object Methods
  1. match.group([group1, ...])

    • Returns one or more subgroups of the match. With no arguments (or with 0), it returns the whole match.
    • Example:
      match = re.search(r'(\d+)', 'The number is 42')
      print(match.group())  # Output: 42
      print(match.group(1))  # Output: 42
      
  2. match.start([group])

    • Returns the start index of the match, or of the given group.
    • Example:
      match = re.search(r'\d+', 'The number is 42')
      print(match.start())  # Output: 14
      
  3. match.end([group])

    • Returns the end index of the match, or of the given group. The end index is exclusive: it points just past the last matched character.
    • Example:
      match = re.search(r'\d+', 'The number is 42')
      print(match.end())  # Output: 16
      
  4. match.span([group])

    • Returns a (start, end) tuple for the match, or for the given group.
    • Example:
      match = re.search(r'\d+', 'The number is 42')
      print(match.span())  # Output: (14, 16)
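
The optional group arguments in the signatures above only matter once a pattern has more than one group. Here is a small sketch with two groups; the sample string and variable names are assumptions chosen for illustration.

```python
import re

text = 'Pages 12-34 only'  # illustrative sample text
m = re.search(r'(\d+)-(\d+)', text)

whole = m.group()           # same as m.group(0): the full match
first = m.group(1)          # first parenthesized group
both = m.group(1, 2)        # several groups at once, returned as a tuple
second_start = m.start(2)   # positions can be asked per group
second_end = m.end(2)
first_span = m.span(1)

print(whole)                          # 12-34
print(both)                           # ('12', '34')
print(first_span)                     # (6, 8)
print(text[second_start:second_end])  # 34
```

Because the end index is exclusive, start and end can be used directly as slice bounds on the original string.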
      
Special Sequences
  • \d: Matches any decimal digit. For str patterns this includes Unicode digits; it is equivalent to [0-9] only with the re.ASCII flag.
  • \D: Matches any non-digit.
  • \s: Matches any whitespace character.
  • \S: Matches any non-whitespace character.
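  • \w: Matches any word character: letters, digits, and the underscore. For str patterns this includes Unicode letters; it is equivalent to [a-zA-Z0-9_] only with the re.ASCII flag.
  • \W: Matches any non-word character.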
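
To see the sequences side by side, the sketch below runs several of them over one sample string, then shows how the re.ASCII flag narrows \d for str patterns. The sample strings are made up for illustration.

```python
import re

sample = 'Room 101, floor_2\tEast'  # illustrative sample text

digits = re.findall(r'\d+', sample)   # runs of digits
words = re.findall(r'\w+', sample)    # underscore counts as a word character
chunks = re.findall(r'\S+', sample)   # runs of non-whitespace
gaps = re.findall(r'\W+', sample)     # runs of non-word characters

# '\u0664\u0662' is the number 42 written in Arabic-Indic digits.
mixed = '42 and \u0664\u0662'
unicode_digits = re.findall(r'\d+', mixed)
ascii_digits = re.findall(r'\d+', mixed, flags=re.ASCII)

print(digits)        # ['101', '2']
print(words)         # ['Room', '101', 'floor_2', 'East']
print(chunks)        # ['Room', '101,', 'floor_2', 'East']
print(gaps)          # [' ', ', ', '\t']
print(ascii_digits)  # ['42']
```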
Use Cases
  1. Extracting Dates

    date_pattern = re.compile(r'\d{2}/\d{2}/\d{4}')
    dates = date_pattern.findall('Dates: 01/01/2020, 12/12/2021')
    print(dates)  # Output: ['01/01/2020', '12/12/2021']
    
  2. Validating Email Addresses

    # Require at least one dot in the domain so 'invalid-email@com' is skipped.
    email_pattern = re.compile(r'[\w.-]+@[\w-]+(?:\.[\w-]+)+')
    emails = email_pattern.findall('Contact: test.email@example.com, invalid-email@com')
    print(emails)  # Output: ['test.email@example.com']
    
  3. Finding and Replacing Text

    text = 'The color is blue'
    updated_text = re.sub(r'blue', 'red', text)
    print(updated_text)  # Output: The color is red
    
  4. Splitting a String by Multiple Delimiters

    text = 'apple, orange; banana'
    fruits = re.split(r'[;,]\s*', text)
    print(fruits)  # Output: ['apple', 'orange', 'banana']
    
  5. Extracting All Capitalized Words

    text = 'This is a Sample Text with Capitalized Words'
    capitalized_words = re.findall(r'\b[A-Z][a-z]*\b', text)
    print(capitalized_words)  # Output: ['This', 'Sample', 'Text', 'Capitalized', 'Words']
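
Two practical notes on the use cases above, shown as a rough sketch. First, findall extracts matches from a longer text, while checking a single value is usually done with fullmatch. Second, once a pattern contains groups, findall returns tuples of the groups instead of whole matches. The helper name is_plausible_email is hypothetical, and the email pattern is a deliberately simplified assumption, not a full address validator.

```python
import re

# Simplified pattern (an assumption for illustration): local part, '@',
# then a domain that contains at least one dot.
EMAIL = re.compile(r'[\w.-]+@[\w-]+(?:\.[\w-]+)+')

def is_plausible_email(candidate):
    """Hypothetical helper: True if the whole string looks like an email."""
    return EMAIL.fullmatch(candidate) is not None

print(is_plausible_email('test.email@example.com'))  # True
print(is_plausible_email('invalid-email@com'))       # False
print(is_plausible_email('mail test@example.com'))   # False: extra text

# With capturing groups, findall yields one tuple of groups per match.
DATE = re.compile(r'(\d{2})/(\d{2})/(\d{4})')
date_parts = DATE.findall('Dates: 01/01/2020, 12/12/2021')
print(date_parts)  # [('01', '01', '2020'), ('12', '12', '2021')]
```

The date pattern only checks the shape of the text; it does not know which field is the day or the month, and it does not reject impossible dates.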
    

I hope this comes in handy whenever you need it 📚.
