Python Regular Expressions

A regular expression (also called a regex or regexp) is a sequence of characters that defines a search pattern. Such patterns are mainly used to search for and match text within strings. In Python, regular expressions are provided by the built-in re module.

Python match() function

Syntax: re.match(pattern, string, flags=0)

Parameters:

  • pattern: The regular expression to be matched.
  • string: The string to search. re.match looks for the pattern only at the beginning of this string.
  • flags: Optional modifiers, such as re.I (ignore case) or re.M (multi-line). Several flags can be combined using bitwise OR ( | ).

Example:

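import re
line = "Cats are smarter than dogs"
# re.M | re.I combines the multi-line and ignore-case flags
matchObj = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)
if matchObj:
    # group() is the whole match; group(1) and group(2) are the captured groups
    print("matchObj.group() : ", matchObj.group())
    print("matchObj.group(1): ", matchObj.group(1))
    print("matchObj.group(2): ", matchObj.group(2))
else:
    print("No match!!")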

Output
matchObj.group() :  Cats are smarter than dogs
matchObj.group(1):  Cats
matchObj.group(2):  smarter
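
Because re.match only checks the beginning of the string, the same kind of pattern can succeed or fail depending on where the text occurs. The following is a minimal sketch of that behaviour, reusing the sentence from the example above; the variable names are illustrative only.

```python
import re

line = "Cats are smarter than dogs"

# "cats" is at the start of the string, so match() succeeds;
# re.I makes the comparison case-insensitive.
at_start = re.match(r"cats", line, re.I)

# "dogs" occurs later in the string, so match() returns None.
not_at_start = re.match(r"dogs", line, re.I)

# Flags are combined with bitwise OR.
with_flags = re.match(r"cats", line, re.M | re.I)

print(at_start.group())      # Cats
print(not_at_start)          # None
print(with_flags.group())    # Cats
```

To find a pattern anywhere in the string rather than only at the start, re.search is usually the better fit.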

Python RegExp

In addition to match(), the re module provides the following functions:

Function    Description
findall     Returns a list containing all matches
search      Returns a Match object if there is a match anywhere in the string
split       Returns a list where the string has been split at each match
sub         Replaces one or more matches with a string

Python findall() function

The findall() function returns a list containing all matches, in the order in which they are found.

Example

import re
# Use a name other than "str" so the built-in str type is not shadowed
txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

Output: ['ai', 'ai']

If no matches are found, findall() returns an empty list.

Example

import re
txt = "The rain in Spain"
# "Portugal" does not occur in the text, so the result is an empty list
x = re.findall("Portugal", txt)
print(x)

Output: []
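
Since an empty list is falsy in Python, the result of findall() can be tested directly to distinguish "no match" from "one or more matches". The sketch below illustrates this; count_matches is a hypothetical helper written for this example, not part of the re module.

```python
import re

txt = "The rain in Spain"

def count_matches(pattern, text):
    """Return how many non-overlapping matches of pattern occur in text."""
    matches = re.findall(pattern, text)
    # An empty list is falsy, so "no match" can be tested directly.
    if not matches:
        return 0
    return len(matches)

print(count_matches(r"ai", txt))        # 2
print(count_matches(r"Portugal", txt))  # 0
```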

Python search() function

The search() function scans the string for a match and returns a Match object for the first occurrence. If there is more than one match, only the first is returned; if there is no match, it returns None.

Example

import re
txt = "The rain in Spain"
# A raw string (r"...") keeps the backslash in \s from being treated as a string escape
x = re.search(r"\s", txt)
print("The first white-space character is located in position:", x.start())

Output

The first white-space character is located in position: 3
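
When the pattern is not found, search() returns None, and calling start() on None raises an AttributeError. A small guard avoids this. The sketch below is one way to write it; first_position is a hypothetical helper name, and returning -1 for "not found" is simply a convention chosen for this example.

```python
import re

txt = "The rain in Spain"

def first_position(pattern, text):
    """Return the start index of the first match, or -1 if there is none."""
    found = re.search(pattern, text)
    # search() returns None when nothing matches, so check before calling start().
    if found is None:
        return -1
    return found.start()

print(first_position(r"\s", txt))   # 3
print(first_position(r"\d", txt))   # -1
```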

Python split() function

It returns a list where the string has been split at each match.

Example

import re
txt = "The rain in Spain"
# Split the text at every white-space character
x = re.split(r"\s", txt)
print(x)

Output: ['The', 'rain', 'in', 'Spain']

You can also limit the number of splits by specifying the maxsplit parameter.

Example

import re
txt = "The rain in Spain"
# Split only at the first white-space character
x = re.split(r"\s", txt, maxsplit=1)
print(x)

Output: ['The', 'rain in Spain']
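
To make the effect of maxsplit easier to see, the following sketch compares a limit of two splits with the default of no limit. Passing maxsplit by keyword is used here because it reads more clearly than a bare positional number; the variable names are illustrative.

```python
import re

txt = "The rain in Spain"

# maxsplit=2 performs at most two splits; the rest of the string is kept whole.
parts = re.split(r"\s", txt, maxsplit=2)
print(parts)       # ['The', 'rain', 'in Spain']

# maxsplit=0 (the default) means there is no limit on the number of splits.
all_parts = re.split(r"\s", txt, maxsplit=0)
print(all_parts)   # ['The', 'rain', 'in', 'Spain']
```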

Python sub() function

The sub() function replaces each match with the text of your choice and returns the resulting string.

Example

import re
txt = "The rain in Spain"
# Replace every white-space character with the digit 9
x = re.sub(r"\s", "9", txt)
print(x)

Output: The9rain9in9Spain
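
By default sub() replaces every match. Its optional count parameter limits how many replacements are made, which is a short way to replace "one or more" matches selectively. A minimal sketch, with illustrative variable names:

```python
import re

txt = "The rain in Spain"

# count=2 replaces only the first two white-space characters.
limited = re.sub(r"\s", "9", txt, count=2)
print(limited)   # The9rain9in Spain
```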

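The Match object returned by search() gives access to the matched text through its group() method. The following regular expression searches for a word that starts with an uppercase "S" and prints the part of the string that matched: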

Example

import re
txt = "The rain in Spain"
# \b is a word boundary; \w+ matches the remaining word characters
x = re.search(r"\bS\w+", txt)
print(x.group())

Output: Spain
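
Besides group(), a Match object also records where the match was found. The sketch below prints a few of these details for the same pattern; the name found is illustrative.

```python
import re

txt = "The rain in Spain"
found = re.search(r"\bS\w+", txt)

if found:
    print(found.group())   # Spain (the matched text)
    print(found.span())    # (12, 17) (start and end positions of the match)
    print(found.string)    # The rain in Spain (the string that was searched)
```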