Email parsing with Ruby and ChatGPT


Email parsing is crucial as it enables us to extract essential information from an email without the need to read it manually. By utilizing Ruby and ChatGPT, we can construct an application that facilitates parsing information from any email in our mailbox.

By the way, this blog post has been proofread by ChatGPT.

Is your system ready?

If you already have the Nylas Ruby SDK installed and your Ruby environment configured, feel free to proceed with the blog.

Otherwise, I recommend reading the post How to Send Emails with the Nylas Ruby SDK, where the basic setup is clearly explained.


How do we handle email parsing before ChatGPT?

There’s no denying that ChatGPT is a powerful tool for a wide range of programming problems. However, ChatGPT hasn’t always been available, so how did we handle parsing before its advent? One option that comes to mind is Regular Expressions.

Consider the following message:

Hey! I've been asking you for things for my gym, but I completely forgot to send you my contact information. 

Cellphone: (102) 456-234-1934 
Address: 1010 Saddle Horn Ct, Augusta, Georgia. 

Looking forward to hearing from you soon. 

Sincerely, 

Terry "Hulk" Hogan. 
Hulk's Gym Owner.

Suppose we want to extract the key pieces of information, such as the phone number and the address. We could achieve this as follows:

email = <<~EMAIL
  Hey! I've been asking you for things for my gym, but I completely forgot to send you my contact information.

  Cellphone: (102) 456-234-1934
  Address: 1010 Saddle Horn Ct, Augusta, Georgia.

  Looking forward to hearing from you soon.

  Sincerely,

  Terry "Hulk" Hogan.
  Hulk's Gym Owner.
EMAIL

contact_info = {}

# Extract cellphone number
cellphone_regex = /Cellphone:\s*(\(\d{3}\)\s*\d{3}-\d{3}-\d{4})/
match = cellphone_regex.match(email)
contact_info[:cellphone] = match[1] if match

# Extract address
address_regex = /Address:\s*([\w\s\d]+),\s*([\w\s]+), \s*([\w\s]+)\./
match = address_regex.match(email)
contact_info[:address] = "#{match[1]}, #{match[2]}" if match

puts contact_info

When we run this, it prints the following:

{:cellphone=>"(102) 456-234-1934", :address=>"1010 Saddle Horn Ct, Augusta"}

We could certainly enhance it further and extract more information, but that would be time-consuming and specific to that particular example. If the email structure varies, our entire source code would need to change.

Therefore, while possible, using Regular Expressions for this type of application might not be the optimal approach, especially if we lack a method to ‘generate’ code for each unique example.
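To make that fragility concrete, here is a small sketch that reuses the two patterns from the example above on a reworded message. The extract_contact_info helper and the reworded email are made up for illustration.

```ruby
# Same patterns as in the example above
CELLPHONE_REGEX = /Cellphone:\s*(\(\d{3}\)\s*\d{3}-\d{3}-\d{4})/
ADDRESS_REGEX = /Address:\s*([\w\s\d]+),\s*([\w\s]+), \s*([\w\s]+)\./

# Hypothetical helper wrapping the extraction logic
def extract_contact_info(email)
  info = {}
  if (match = CELLPHONE_REGEX.match(email))
    info[:cellphone] = match[1]
  end
  if (match = ADDRESS_REGEX.match(email))
    info[:address] = "#{match[1]}, #{match[2]}"
  end
  info
end

# The same details, written without the "Cellphone:" and "Address:" labels
reworded_email = <<~EMAIL
  Hi! You can reach me on my mobile at 102-456-1934.
  I live at 1010 Saddle Horn Ct in Augusta, Georgia.
EMAIL

p extract_contact_info(reworded_email)
# → {}
```

Both patterns depend on the exact labels and punctuation of the first email, so a message that carries the same facts in different words yields nothing.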

What our application will look like

Our application will be kept simple, as our primary focus is on the parsing functionality:

ChatGPT Email Parsing Application

Our application will showcase the first five emails in our mailbox. Subsequently, we can select any of them and display the parsed information:

Parsed email

And what if we don’t select any email? The application will notify us accordingly:

No emails selected

Installing the Sinatra package

As we aim to develop a web application, our best choice is Sinatra, one of the most popular micro-frameworks in the Ruby world. We also install WEBrick so that Sinatra has a web server to run on:

$ gem install sinatra
$ gem install webrick

Also, we need to install the OpenAI Ruby gem:

$ gem install ruby-openai

Once installed, we’re ready to go.

Creating a ChatGPT Account

First, we need to have a ChatGPT account and then create our API keys.

After creating the key, be sure to store it securely, as it cannot be recovered later. A good place for it is your .env file.
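Since a missing key only shows up once the first API call fails, it can help to check the environment up front. The following is a minimal sketch, not part of the original project: the helper name missing_env_keys is made up for illustration, and the variable names are the ones the application code in this post reads from ENV.

```ruby
# Environment variables the application in this post expects to find.
REQUIRED_KEYS = %w[CLIENT_ID CLIENT_SECRET ACCESS_TOKEN OPENAI_TOKEN].freeze

# Hypothetical helper: returns the names of required keys that are unset or blank.
# `env` defaults to the real environment, but any hash-like object works.
def missing_env_keys(env = ENV, required = REQUIRED_KEYS)
  required.select { |key| env[key].to_s.strip.empty? }
end

missing = missing_env_keys
warn "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
```

Running a check like this after the .env file has been loaded gives a clear message instead of an authentication error further down the line.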

Creating the Ruby ChatGPT email parser project

Initially, let’s create a folder named Parse_Email, and within it, establish a folder named views.

Now, create a file named Parse_Email.rb with the following source code:

# frozen_string_literal: true

# Import your dependencies
require 'sinatra'
require 'dotenv/load'
require 'nylas'
require 'date'
require 'ruby/openai'
require 'json'

# Initialize your Nylas API client
nylas = Nylas::API.new(
    app_id: ENV["CLIENT_ID"],
    app_secret: ENV["CLIENT_SECRET"],
    access_token: ENV["ACCESS_TOKEN"]
)

# Configure our OpenAI client
OpenAI.configure do |config|
  config.access_token = ENV["OPENAI_TOKEN"]
end

# Create the actual OpenAI client
client = OpenAI::Client.new

# Select the first 5 emails from our mailbox
emails = nylas.messages.limit(5).where(in: 'CRM')

# Display the emails
get '/' do
    erb :main, layout: :layout, locals: {emails: emails}
end

post '/' do
    # Get the email id as a parameter
    if params[:email_id] != nil
        # Using the email id, retrieve the message and its body
        message = nylas.messages.find(params[:email_id])
        # Clean the body using some Regular Expressions:
        # drop embedded CSS and HTML tags, then collapse whitespace
        body = message.body.gsub(/\n/, " ").
            gsub(/<style>.+?<\/style>/, " ").
            gsub(/<("[^"]*"|'[^']*'|[^'">])*>/, " ").
            gsub(/\.email-content\{.+\}/, " ").
            gsub(/&nbsp;/, " ").
            gsub(/\s+/, " ").
            strip
    else
        body = ""
    end
    
    # Create the ChatGPT prompt
    text = <<~PROMPT
      You are an email parser receiving an email. You need to parse the information as json.
      You don't need to show the code, just return the result and call it Result.
      Also, return an array with the json keys. Here's the email: #{body}
    PROMPT
    # Call the ChatGPT client
    response = client.chat(
        parameters: {
            model: "gpt-4",
            messages: [{ role: "system", content: text }],
            temperature: 0
        })
    
    # Read the ChatGPT response
    full_response = response.dig("choices", 0, "message", "content").split("\n")
    response_line = ""
    # And make it a single line
    full_response.each do |line|
        response_line += line.strip
    end

    line_matches = ""
    # We want everything that goes inside brackets
    # and these are the parsed keys
    matches = response_line.scan(/\[(.*?)\]/)
    matches.flatten.each do |match|
        line_matches += match
    end

    # Remove the double quotes
    line_matches = line_matches.gsub('"', '')
    # Split into an array
    array = line_matches.split(',')
    # Make sure there are not extra spaces
    array.each do |line|
        line.strip!
    end
	
    # If we have an email id to analyze
    if params[:email_id] != nil
        # Get everything that goes inside curly brackets
        result = response_line.match(/\{[^}]*\}/)
        # And turn it into a JSON object (empty if no JSON was found)
        json_object = result ? JSON.parse(result[0]) : {}
    
        # This will hold the keys and results
        emails_array = []
    
        # Create a line with each key and result
        # Eg. Name: Blag
        #       Location: Ottawa
        array.each do |line|
            # Interpolation also copes with missing or non-string values
            emails_array.push("<b>#{line}</b>: #{json_object[line]}")
        end
    else
        # No email was selected...nothing to do here...
        emails_array = []
        emails_array.push("<b>No emails were selected</b>")    
    end
    
    # Call the parsed page and display the keys and results
    erb :parsed, layout: :layout, locals: {emails_array: emails_array}
end
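
The cleaning and response-handling steps can also be tried on their own, without calling Nylas or OpenAI. The following is a sketch under stated assumptions rather than the code used above: the helper names (clean_body, extract_result, format_lines) are made up, the sample reply is hypothetical since the exact shape of a model's answer is not guaranteed, and the keys are read from the parsed JSON instead of from the bracketed array.

```ruby
require 'json'
require 'erb'

# Strip markup from an HTML body so that only readable text reaches the model.
def clean_body(html)
  html.gsub(/\n/, ' ')
      .gsub(/<style[^>]*>.*?<\/style>/, ' ') # drop embedded CSS
      .gsub(/<[^>]+>/, ' ')                  # drop the remaining tags
      .gsub(/&nbsp;/, ' ')
      .gsub(/\s+/, ' ')                      # collapse runs of whitespace
      .strip
end

# Take everything from the first "{" to the last "}" and parse it as JSON.
# Returns an empty hash when nothing parseable is found.
def extract_result(reply)
  match = reply.match(/\{.*\}/m)
  return {} unless match

  JSON.parse(match[0])
rescue JSON::ParserError
  {}
end

# Build one display line per key, escaping text before it reaches the page.
def format_lines(parsed)
  parsed.map do |key, value|
    "<b>#{ERB::Util.html_escape(key)}</b>: #{ERB::Util.html_escape(value)}"
  end
end

# Hypothetical reply, shaped the way the prompt asks for
sample_reply = <<~REPLY
  Result:
  {
    "Name": "Terry Hogan",
    "Cellphone": "(102) 456-234-1934"
  }
  Keys: ["Name", "Cellphone"]
REPLY

puts clean_body("<p>Cellphone:&nbsp;(102) 456-234-1934</p>\n<p>Bye</p>")
puts format_lines(extract_result(sample_reply))
```

Escaping the values matters because they come from an email, which is untrusted input that ends up inside an HTML page.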

Inside the views folder, we need to create three files. Let’s start with layout.erb:

<html>
<head>
<script src="https://cdn.tailwindcss.com"></script>
<title>Email Parser</title>
</head>
<body>
<%= yield %>
</body>
</html>

Then main.erb:

<div class="bg-[#315acb] border-green-600 border-b p-4 m-4 rounded grid place-items-center">
<p class="text-4xl">Email Parser</p>
<br>
<form method="post">
	<% emails.each do |email| %>
		<input type="radio" name="email_id" id="email_<%= email.id %>" value="<%= email.id %>">
		<label for="email_<%= email.id %>" class="font-bold"><%= email.subject %></label><br>
	<% end %>
<br>	
<button type="submit" class="block bg-blue-500 hover:bg-blue-700 text-white text-lg mx-auto py-2 px-4 rounded-full">Submit</button>
</form>
</div>

Finally, parsed.erb:

<div class="bg-[#315acb] border-green-600 border-b p-4 m-4 rounded grid place-items-center">
	<% emails_array.each do |email| %>
		<p><%= email %></p>
	<% end %>
	<br>
	<a href="/" class="block bg-blue-500 hover:bg-blue-700 text-white text-lg mx-auto py-2 px-4 rounded-full"><b>Go back</b></a>
</div>
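
To see what a template like parsed.erb does with the emails_array local without starting Sinatra, Ruby's standard ERB class can render a trimmed-down copy of it. This is only a sketch: the template below keeps just the loop, and the two sample lines are made up.

```ruby
require 'erb'

# A trimmed-down copy of the loop in parsed.erb
template = <<~TEMPLATE
  <% emails_array.each do |email| %>
  <p><%= email %></p>
  <% end %>
TEMPLATE

# Sample lines in the "<b>key</b>: value" shape the application builds
sample_lines = [
  '<b>Name</b>: Terry Hogan',
  '<b>Cellphone</b>: (102) 456-234-1934'
]

# result_with_hash exposes each hash key as a local variable in the template,
# much like the `locals:` option passed to `erb` in the Sinatra routes
html = ERB.new(template).result_with_hash(emails_array: sample_lines)
puts html
```

Each entry ends up in its own paragraph, which is why the application only needs to hand the view an array of ready-made lines.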

And that’s it.

Running our Ruby and ChatGPT email parser application

To execute our application, simply type the following command in the terminal window:

$ ruby Parse_Email.rb

Our application will be active on port 4567 of localhost. Simply open your favourite browser and navigate to the following address:

http://localhost:4567

By combining Ruby, Nylas and ChatGPT, we were able to create an email parsing application in a few lines of code.

What’s next?

You can sign up for Nylas for free and start building!

If you want to learn more about Emails, go to our documentation.

Also, don’t miss the action: join our LiveStream, Coding with Nylas.
