To parse or not to parse (a guide to Python’s Parse)

Are Regular Expressions too complex? Looking for a simpler alternative? Here’s a guide to the Parse library in Python.

We often employ Regular Expressions in Python to parse or extract information while working with data. However, this approach can result in cumbersome and excessively long expressions, making it challenging for any team member, including the author, to understand the code—especially after a couple of months.

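While regular expressions are incredibly useful, the Parse library is designed to alleviate the burden of parsing data. It works like the opposite of Python’s format(): you describe the text with a format-style template, and Parse pulls the values back out.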

How to install Parse

As with all Python libraries, installing Parse is straightforward:

$ pip3 install parse

Parsing a phone number

Let’s take a look at a practical example, first using a Regular Expression and then using the Parse library:

import re

phone_number = "0011239956213"
pattern = r'\b(\d{3})(\d{10})$'

match = re.search(pattern, phone_number)
if match:
    country_code = match.group(1)
    number = match.group(2)
    print("Country code:", country_code)
    print("Phone:", number)
else:
    print("Invalid phone number")

When we run this code, we will get the following result:

Country code: 001
Phone: 1239956213

This is great, but what if we change the number to (001)1239956213? The pattern expects 13 consecutive digits, so the parentheses make the match fail:

Invalid phone number
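
The regular expression can be made to accept both forms by allowing optional parentheses around the country code. The following is only a minimal sketch and not part of the original example: the helper name split_phone is hypothetical, and the pattern assumes a 3-digit country code followed by a 10-digit number.

```python
import re

# Assumed format: optional parentheses around a 3-digit country code,
# followed by exactly 10 digits.
PHONE_PATTERN = re.compile(r'^\(?(\d{3})\)?(\d{10})$')


def split_phone(phone_number):
    """Return (country_code, number), or None if the input does not match."""
    match = PHONE_PATTERN.search(phone_number)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_phone("0011239956213"))
print(split_phone("(001)1239956213"))
print(split_phone("12345"))
```

Both phone numbers now produce the same country code and number, but the pattern is already harder to read, and it would still accept unbalanced input such as "(0011239956213".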

Now, let’s try again but this time using Parse:

from parse import Parser

# {country_code} matches as little as possible;
# {number:10.10} matches exactly 10 characters (minimum 10, maximum 10)
parser = Parser("{country_code}{number:10.10},")
content = "0011239956213,"

results = parser.findall(content)

for result in results:
    print(f"Country code: {result['country_code']}")
    print(f"Phone: {result['number']}")

When we run this code, we will get the following result:

Country code: 001
Phone: 1239956213

Again, what if we change the number to (001)1239956213 (keeping the trailing comma)?

Country code: (001)
Phone: 1239956213

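It works fine. The one caveat is that we need to add an extra “,” as a terminator: findall() searches inside the text and {country_code} matches as little as possible, so without something marking the end of the number the split would land in the wrong place. That’s a small price to pay.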

Parsing text

Let’s say we have the following text:

Hello, my name is Blag and I'm a Senior Developer Advocate

And we want to extract the Name and Title.

This is how we would do it using Regular Expressions:

import re

string = "Hello, my name is Blag and I'm a Senior Developer Advocate"

pattern = r"my name is (\w+) and I'm a (.+)"

matches = re.search(pattern, string)

if matches:
    name = matches.group(1)
    title = matches.group(2)
    print("Name:", name)
    print("Title:", title)
else:
    print("No match found.")

The result would be:

Name: Blag
Title: Senior Developer Advocate

Now, using Parse, it would become:

from parse import *

string = "Hello, my name is Blag and I'm a Senior Developer Advocate"
parse_string = parse("Hello, my name is {name} and I'm a {title}", string)
print("Name:", parse_string["name"])
print("Title:", parse_string["title"])

Shorter and easier to manage.

Parse text from HTML

Sometimes, we need to extract information from a tag inside an HTML string.

Using Regular Expressions, we would do it like this:

import re

string = "<p><b>Python likes Regex</b></p>"
matches = re.findall(r"<b>(.*?)</b>", string)
for match in matches:
    print(match)

The result would be:

Python likes Regex

Now, that’s not so hard, but maybe we can make it even easier with Parse:

from parse import *

string = "<p><b>Python likes Regex</b></p>"
text = ''.join(r[0] for r in findall("<b>{}</b>", string))
print(text)

And the result would be the same, although the code is easier to understand:

Python likes Regex

Parse information from text

We have already parsed text, but this example goes a bit deeper by extracting several values of different types. Let’s start with Regular Expressions:

import re

string = "We need 5 examples of Regex in Python"

# Extract the count using regex
count = re.search(r'\b(\d+)\b', string).group(1)

# Extract the item using regex
item = re.search(r'\b of (\w+)\b', string).group(1)

# Extract the language using regex
language = re.search(r'in\s+(\w+)', string, re.IGNORECASE).group(1)

print("Count:", count)
print("Item:", item)
print("Language:", language)

This is the result:

Count: 5
Item: Regex
Language: Python
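
For a fairer comparison, the three separate searches can also be folded into a single pattern with named groups. This is a sketch of one possible alternative, not the approach used above:

```python
import re

text = "We need 5 examples of Regex in Python"

# One pattern with named groups instead of three separate searches
pattern = re.compile(
    r"We need (?P<count>\d+) examples of (?P<item>\w+) in (?P<language>\w+)"
)

match = pattern.search(text)
fields = match.groupdict() if match else {}
print(fields)
```

Note that every captured value is a string here, so the count would still need an explicit int() conversion.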

That’s not too complicated, but we can make a simpler version using Parse:

from parse import *

string = "We need 5 examples of Regex in Python"
parse_string = parse('We need {:d} examples of {:w} in {:w}', string)
print(f"Count: {parse_string[0]}")
print(f"Item: {parse_string[1]}")
print(f"Language: {parse_string[2]}")

The result would be the same:

Count: 5
Item: Regex
Language: Python

I hope you find this blog useful. Parse is a nice and useful library.

What’s next?

While this blog post was not about Nylas, you can still use Parse in your Nylas projects.

You can sign up for Nylas for free and start building!
