In this post, we will cover the following topics:
- Why we use RE
- Finding text without RE
- Finding text with RE
- Creating Regex Object
- Matching Regex Object
- Review
Why Use Regular Expressions?
We usually search for text by pressing Ctrl + F and typing the word we want.
Regular expressions go further and let you specify a pattern of text to search for. Take a telephone number from the US or Canada as an example. These numbers follow a pattern: three digits, a hyphen, three digits, a hyphen, and four digits.
So a correctly formatted number is: 415-555-1234
But an incorrectly formatted one is: 4,155,551,234
What if we need to find every number that follows this pattern in a file, not just a single number? That is where regular expressions come in.
But is there a way to do this without using a regular expression?
Finding Text without Regular Expressions
Yes, there is.
You know the pattern: three digits, a hyphen, three digits, a hyphen, and four digits.
Ex.: 415-555-4242
Try it yourself first
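For comparison, here is a minimal sketch of what such a hand-written check might look like. The name isPhoneNumber() is the one this post refers to later; the body is only one possible way to write it, using nothing but basic string methods.

```python
def isPhoneNumber(text):
    # A valid number such as 415-555-4242 is exactly 12 characters long.
    if len(text) != 12:
        return False
    # First three characters must be digits.
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # Then a hyphen.
    if text[3] != '-':
        return False
    # Next three characters must be digits.
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # Then another hyphen.
    if text[7] != '-':
        return False
    # Last four characters must be digits.
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True


print(isPhoneNumber('415-555-4242'))   # True
print(isPhoneNumber('4,155,551,234'))  # False
```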
Isn't that very lengthy?
This is why we have regular expressions, which can reduce these 17-18 lines of code to 3-4 lines.
Finding Patterns with Regular Expressions
Regular expressions, called regexes for short, are descriptions of patterns of text. For example, \d in a regex stands for a digit character, i.e., any single numeral from 0 to 9.
So the regex \d\d\d-\d\d\d-\d\d\d\d is used by Python to match the same text that the previous isPhoneNumber() function did.
Regular expressions can be more sophisticated. For example, adding a 3 in curly brackets ({3}) after a pattern is like saying, "Match this pattern three times."
So we have:
\d{3}-\d{3}-\d{4}
Isn't that simple?
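To check that the long and the short pattern really describe the same text, here is a small sketch. It uses re.fullmatch() from Python's re module, which is not otherwise used in this post; the variable names long_pattern and short_pattern are made up for the illustration.

```python
import re

# The two spellings of the phone number pattern from the text.
long_pattern = r'\d\d\d-\d\d\d-\d\d\d\d'
short_pattern = r'\d{3}-\d{3}-\d{4}'

for candidate in ['415-555-1234', '4,155,551,234']:
    # fullmatch() succeeds only if the whole string fits the pattern.
    long_result = re.fullmatch(long_pattern, candidate) is not None
    short_result = re.fullmatch(short_pattern, candidate) is not None
    print(candidate, long_result, short_result)
```

Both patterns should agree on every input: they accept the hyphenated number and reject the comma-separated one.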
Creating Regex Objects
All the regex functions in Python are in the re module.
>>> import re
Passing a string value representing your regular expression to re.compile() returns a Regex pattern object (or simply, a Regex object).
>>> phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
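Here you see r' '. The r prefix makes the string inside the quotes a raw string, in which a backslash is kept as a literal character instead of starting an escape sequence. Without the r prefix, you would have to escape every backslash and write \\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d.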
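A quick sketch to convince yourself that the two spellings are the same string. The variable names raw_pattern and escaped_pattern are chosen just for this illustration.

```python
import re

# Raw string: backslashes are kept exactly as typed.
raw_pattern = r'\d\d\d-\d\d\d-\d\d\d\d'
# Ordinary string: every backslash has to be doubled.
escaped_pattern = '\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d'

print(raw_pattern == escaped_pattern)  # True

# Either spelling can be handed to re.compile().
phoneNumRegex = re.compile(escaped_pattern)
print(phoneNumRegex.search('415-555-4242') is not None)  # True
```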
Matching Regex Objects
search() method: searches the string it is passed for any match to the regex. It returns None if the pattern is not found in the string, and a Match object if it is found.
group() method: called on a Match object, it returns the actual matched text from the searched string.
Enter this in your interactive shell:
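A minimal sketch of such a session, written as a script so it can be run as-is. The sample sentences and the variable names mo and no_match are assumptions made for the example.

```python
import re

phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

# search() returns a Match object when the pattern is found.
mo = phoneNumRegex.search('My number is 415-555-4242.')
print('Phone number found: ' + mo.group())  # Phone number found: 415-555-4242

# search() returns None when the pattern is not found.
no_match = phoneNumRegex.search('No number here.')
print(no_match)  # None
```

Because search() can return None, calling group() on the result without checking first would raise an AttributeError when nothing matches.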
Let's Review
While there are several steps to using regular expressions in Python, each step is fairly simple.
- Import the regex module with import re.
- Create a Regex object with the re.compile() function. (Remember to use a raw string.)
- Pass the string you want to search into the Regex object's search() method. This returns a Match object.
- Call the Match object's group() method to return a string of the actual matched text.
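The four steps above can be put together in one small helper. This is only a sketch: the names PHONE_REGEX and find_phone_number() are hypothetical, and the None check is added so the helper does not fail when nothing matches.

```python
import re  # Step 1: import the regex module.

# Step 2: create a Regex object from a raw string.
PHONE_REGEX = re.compile(r'\d{3}-\d{3}-\d{4}')


def find_phone_number(text):
    # Step 3: search the text; the result is a Match object or None.
    match = PHONE_REGEX.search(text)
    if match is None:
        return None
    # Step 4: group() gives back the matched text as a string.
    return match.group()


print(find_phone_number('Call 415-555-1234 today'))  # 415-555-1234
print(find_phone_number('4,155,551,234'))            # None
```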
Here is the video explanation of the basics of regular expressions. Go check it out.
December 15, 2020
Tags: Python, Regular Expression